Avi Das

Home for my code, thoughts and everything else.

Speaking at PyCon Canada 2015: I

High Level Takeaways

  • If you are giving a talk, too much content on slides means the audience is reading the slides instead of listening to you.
  • Design your talk expecting failure, and assume things like wifi will not work. An analogy would be the non-functional escalator still being usable as a staircase.
  • Show your talk to as many people as time allows. Every time I showed my talk to someone, I would find a new way to make the talk better.
  • It was amazing to hear from a 10 year old about his experience in coding. The barrier to entry to tech will keep falling in a noticeable way.
  • Teaching remains one of the best ways to learn, alongside building things.
  • Coding for expectability is often as important as any other consideration in a software project.
  • Science, data, web, systems and infrastructure were dominant themes at PyCon.

Getting to Toronto

  • It was really exciting to have my talk accepted at PyCon, since it was my first time speaking at a conference.
  • Getting through customs was as painless as it could have been.
  • Toronto was colder than Austin, big surprise! Reminded me of my time back on the East Coast.
  • T-mobile data roaming was a breeze to set up, and worked mostly well across different providers.
  • Toronto has several modes of public transportation for getting from downtown to the airport: buses, streetcars and the subway. That makes a city more interesting, although it makes day-to-day travelling more complicated. Then again, it does not take a whole lot to put public transportation in Austin to shame.
  • Asked a lady on the subway for directions. It soon turned into a great conversation with her and her husband about life in Toronto and their experiences in the US. For a big city, Toronto scores major points for having friendly people. Canadians have a reputation for being polite and helpful, and I would come to recognize it throughout my trip.
  • My AirBnB was in Kensington Market, close to the University of Toronto, where PyCon was taking place. It was a vibrant neighborhood, with bars, restaurants and transportation nearby. My room was no hotel room, but a bed was all I needed.

Saturday

  • Morning started with me feeling the stress of not having all my slides and examples ready. I wanted to take some time to reflect on the great feedback I got from my team, but there was little time left.
  • Adding to my anxiety was the wifi connection not working. Thankfully, some organizers helped me out. Once the certificate issues were resolved, it worked well for the remainder of the conference.
  • Continental breakfast consisted of an assortment of cottage cheese, granola/yogurt, muffins, bread and coffee. No complaints.
  • Talked to Dusty, a Facebook engineer working on the Facebook infrastructure in Portland. Having lived in Canada, he had a lot to share about his experience there.
  • Morning keynote explored the history of Python interpreters and went into benchmarks. Benchmark related conversations can get subjective, but the speaker did a good job avoiding that.
  • Talks on application security, Emmy-nominated CGI (!) and Docker deployment followed. The CGI talk offered a very different viewpoint on software problems. Being highly computation-intensive with long life cycles means the tradeoffs are very different from the usual SaaS app/consumer product.

Building Realtime Apps With React, Socket.io and Node.js

Update: Udemy has generously granted a free coupon for the readers of this blog for their React JS and Flux course. Use the code avidasreactjs and the first 50 readers will get free access to the course!

The importance of delivering realtime feedback to users is greater than ever. Gone are the days when chat or games were the only applications of realtime software. Whether in finance, advertising or education, having a realtime component in your web application will elevate the user experience.

Socket.io

From socket.io’s homepage, it is a library that enables real-time bidirectional event-based communication. It has two parts: a client-side library that runs in the browser and a server-side library for Node.js. In recent times, this has become the de facto way of doing realtime web applications in the Node.js world. Key reasons behind this have been the way it abstracts away the overhead of maintaining multiple protocols, while carrying over familiar primitives from Node streams and EventEmitter. Some of its other powerful features include being able to stream binary data, broadcast to multiple sockets and manage connected client data from the server.

Architecture

The WebSocket protocol (RFC 6455, with a corresponding W3C browser API) enables interactive communication between a browser and a server. It functions as an Upgrade request over HTTP/1.1. However, since not all legacy browsers and devices support WebSockets, its cross-platform abilities are limited.

Socket.io itself is a library to build realtime applications. It will try to upgrade to and use the WebSocket protocol if available. Socket.io depends on another library called Engine.io, which exposes a WebSocket-like API but provides fallbacks to other transport mechanisms such as XHR and JSONP polling. This enables application developers to write realtime codebases that are browser, device and transport implementation independent.

Getting started with Socket.io

This tutorial assumes that you have Node.js and npm installed on your system; the example also uses the express, socket.io and swig packages, all installable via npm.

In a directory, create two files called index.html and app.js. In your app.js file, add the following:

var app = require('express')();
var server = require('http').Server(app);
var swig = require('swig');
var path = require('path');

// view engine setup
app.engine('html', swig.renderFile);
app.set('view engine', 'html');
app.set('views', path.join(__dirname)); // look for templates in this directory

// server and routing
server.listen(8080);
app.get('/', function (req, res) {
  res.render('index');
});

We set up the view engine, tell Express where to find templates, and serve up a basic index page. If this part looks unfamiliar, please check out the Express docs. Now add the following to app.js:

var io = require('socket.io')(server);
// socket.io demo
io.on('connection', function (socket) {
  socket.emit('server event', { foo: 'bar' });
  socket.on('client event', function (data) {
    console.log(data);
  });
});

We create a new instance of Socket.io and pass in the HTTP server we created as a parameter. As the server listens, whenever a new client establishes a connection, we emit an event called 'server event' with the payload { foo: 'bar' }. The server also listens for 'client event' and logs the payload once it receives that event.

In your index.html file, add the following:

<script src="/socket.io/socket.io.js"></script>
<script>
  var socket = io();
  socket.on('server event', function (data) {
    console.log(data);
    socket.emit('client event', { socket: 'io' });
  });
</script>

This includes the client-side Socket.io library. After instantiating a new connection, it listens for 'server event'; when that event arrives, it logs the data and emits 'client event' with the payload { socket: 'io' }.

Run node app.js and open localhost:8080 in your browser. In the terminal you should see { socket: 'io' } and in the browser console you should see { foo: 'bar' } printed out. Congrats, you just built your first Socket.io app!

Useful Socket.io Concepts

Message sending/receiving

Socket.io allows you to emit and receive custom events. Besides the built-in connect, message and disconnect events, you can emit custom events with an associated payload. emit and broadcast are ways to send events, and on registers an event listener.
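
As a minimal sketch (the 'chat message' and 'ack' event names are invented for illustration), the server side might look like this:

// Assumes io was created as in the getting-started example above
io.on('connection', function (socket) {
  socket.on('chat message', function (msg) {
    // reply to just this client
    socket.emit('ack', { received: true });
    // relay the message to every other connected client
    socket.broadcast.emit('chat message', msg);
  });
});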

Server vs Client API

There are some common functions between the server and client sides, but it is worth looking into the docs and understanding what is possible on the server vs the client. Generally, the server side has many more features and capabilities, such as creating rooms and namespaces, but both sides can send and respond to events.

Rooms and Namespaces

Socket.io provides built-in abstractions for partitioning connected clients. Namespaces, identified by a path, can be connected to as follows:

var socket = io(); //connects to default namespace "/"
var admin = io("/admin"); //connects to the namespace specified by the path "/admin"

After a client connects with var admin = io('/admin'), we can send messages to the admin namespace only.

admin.emit("admin alert", "website traffic is up!"); //the event will only be sent to the clients who connected to the admin namespace

This enables role-based or other criteria-based distribution of Socket.io events/messages within your application.

Rooms provide a way to further divide up clients within individual namespaces. Clients within a namespace can join and leave a room. By default, a client is always connected to a room identified by the socket's id. Hence it is possible to send targeted messages to a connected client via socket.broadcast.to(<SOCKET.ID>).emit('test', 'message'). Rooms make more sense for particular themes, whereas namespaces fit well for user types/responsibilities.
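
A rough sketch of room usage on the server (the room name 'tesla-fans' is invented for illustration):

io.on('connection', function (socket) {
  socket.join('tesla-fans');                       // add this client to a room
  io.to('tesla-fans').emit('room news', 'hello');  // send to everyone in the room
  socket.broadcast.to('tesla-fans').emit('room news', 'someone joined'); // everyone in the room except this client
  socket.leave('tesla-fans');                      // remove the client from the room
});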

React and Socket.io

Now for the exciting part, integrating React.js and Socket.io into an application. React.js is a JavaScript UI library from Facebook. You can follow some of the initial docs to get started with React. This tutorial will not go into great detail on React.js terminology, so refer to the official documentation if any of the React syntax looks confusing.

The basic idea of the app is to have an HTML input and a label. When someone types something into the input box, it updates the label for everyone else who has a window open, except the person typing.

Client side code

Let’s start by changing your index.html to the following

<!DOCTYPE html>
<html>
  <head>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/react/0.13.2/react.min.js"></script>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/react/0.13.2/JSXTransformer.js"></script>
    <script src="/socket.io/socket.io.js"></script>
  </head>
  <body>
    <div id="mount-point"></div>
    <div id="label-mount-point"></div>
    <script type="text/jsx">
     /** @jsx React.DOM */
    var socket = io();

    var Input = React.createClass({
      _notifyServer: function(event){
        socket.emit('client event', { value: event.target.value });
      },
      render: function(){
        return (
          <div className="update-label">
            <input type="text" placeholder="Enter text" onChange={this._notifyServer}/>
          </div>
        );
      }
    });
    var Label = React.createClass({
      _onUpdateLabel: function(data) {
        this.setState({ serverValue: data.value });
      },
      getInitialState: function(){
        return { serverValue: '' };
      },
      render: function(){
        return (
          <div className="my-label">
            <h2>{this.state.serverValue}</h2>
          </div>
        )
      }
    });
    var input = React.render(<Input/>, document.getElementById('mount-point'));
    var label = React.render(<Label/>, document.getElementById('label-mount-point'));
    socket.on('update label', function (data) {
      label._onUpdateLabel(data);
    });
    </script>
  </body>
</html>
Server side

The server side of the codebase can mostly stay the same, except we broadcast ‘update label’ when ‘client event’ is received.

io.on('connection', function (socket) {
  socket.emit('server event', { foo: 'bar' });
  socket.on('client event', function (data) {
    socket.broadcast.emit('update label', data);
  });
});
Explanation

On the client side, two React components called Input and Label are created and mounted by calling React.render. Input renders an HTML input box which calls the _notifyServer method whenever someone types into the input field. The _notifyServer method then emits a Socket.io event called 'client event' with the value of the input box.

On the server side, when ‘client event’ is received with the data, the server calls socket.broadcast.emit and passes the data payload along. This means that all the connected clients except for the socket that generated ‘client event’ will receive the ‘update label’ event and the payload. This sends the message to everyone except for the person typing.

Back on the client side, the Label component consists of a div with an h2 element which is set to the serverValue state of the component. getInitialState sets the initial value to an empty string, so initially the Label is empty. When 'update label' is received, we call _onUpdateLabel on label, which is an instance of Label. It sets the serverValue state of the Label component to data.value. This invokes the render method of the Label component, which generates an h2 header with the updated value of serverValue.

Guide to Finding a Technical Cofounder

This has been happening at various meetups/hackathons/startup events often enough to warrant a blog post. The situation is generally a variant of this: someone has an idea they are really convinced is the next big thing, and the only thing stopping it from happening is building an app/website, which requires a technical cofounder. The person with the idea is not in a position to afford the costs of hiring a full-time/part-time developer, so an equity-sharing arrangement makes sense. Hackathons and tech meetups are where developers hang out, so approaching them there seems like a good way to find that cofounder.

There are a few problems with approaches like this. Software people who go to events like these get pitched a fair amount, sometimes repeatedly on the same ideas. Also, we can be a rather cynical bunch, often as a result of the kind of work that we do. This can result in you not finding that engineer/hacker to build your app during a hackathon. Or they might build it during the hackathon, but simply drop off afterwards.

It can get discouraging, especially if you are convinced about the idea and new to such events. Personally, I like idea people, especially because they bring in ideas from domains and problem spaces I would have no exposure to otherwise. Moreover, I also believe that cross-pollination of people from different groups is healthy and that more products coming into the world is a good thing. Therefore, I would like to jot down some helpful tips which can maximize your chances of finding a technical cofounder next time you are looking for one.

  1. Understand what motivates engineers: It’s important to understand what motivates engineers beyond just financial opportunity. If such an opportunity exists, you may be in pretty decent shape already and should really drill down on your exact plans for how the app would make money in the future. If you are less sure, there are still options. Can you prove that the app would have a broad user base? A great way to do this would be to show that you have tried unscalable ways of doing this already, be it door-to-door, personal know-how, competitor research, etc. Most ideas can be validated using non-technical approaches. Knowing your problem space well will not only help you build a business but also lend credibility when you are looking for a cofounder. Another thing that attracts engineers is interesting technical problems or cutting-edge tech, so if your app involves either, that is a positive. Good technical cofounders can be extremely self-motivated once they realize that they have a problem that is really worth spending time on.

  2. Manage expectations: It is best to present the idea and the opportunity and not expect immediate commitment. Generally people are busy, but if you have done your homework and can present the problem well, there is always a good chance. Not all engineers want the same thing, and a lot are perfectly happy working where they are. If you do not have a proven user base or revenue plan yet, it does involve a certain amount of risk-taking to get on that journey. As someone who wants to be a founder, you should seek technical cofounders with the same risk appetite as you.

Evaluating React.js and Flask

Update: Udemy has generously granted a free coupon for the readers of this blog for their React JS and Flux course. Use the code avidasreactjs and the first 50 readers will get free access to the course!

For a connoisseur of the web, front-end frameworks have been a fertile area of late. React.js from Facebook has arrived with much fanfare, and this post evaluates the key ideas behind React and digs into why you might be interested in it. Staying true to the single responsibility principle, React is a highly useful tool if you are doing web programming.

In this post, we will dive into building a frontend using React.js and a backend using the Python framework Flask. Flask is a minimalistic framework, and excellent when your backend becomes more and more of an API. Moreover, this facilitates a microservices architecture, where decoupling your app into small units of service can make it more maintainable and scalable.

We will cover some of the key ideas of React and Flask here, but it would be worth referring to the official documentation for React and Flask for getting started and understanding the philosophies of each framework.

Key Ideas of React

The core idea of React is that developers are better off leaving DOM manipulation to battle-tested framework code. Since the DOM has a tree structure, finding elements and manipulating them can require many traversals of a potentially very large tree. Instead, what you modify is a virtual DOM, and React runs its diffing algorithm to update the real DOM only where needed.

React

React itself is the UI library that will manage all the DOM updates as data changes. It takes the V of MVC, hence it can be used alongside other MVC frameworks such as Angular, Backbone or Meteor. It is quite easy to use React to manage specific areas of your application’s UI, rather than the entire app.

Virtual DOM

The virtual DOM is an abstraction layer between the nodes in the real DOM and the view code you are modifying. When React selectively renders subtrees of DOM nodes based upon state changes, it achieves the following (see the sketch after this list):

 1. Ensures that your DOM is always up to date with current state
 2. Reduces the need to re-render the DOM every time there is a change in state
 3. Updating only the individual components on state change ensures high performance
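
As a rough sketch (the Clock component and its markup are invented for illustration, and it assumes a mount-point div like the one used later in this post), a component whose state changes every second causes React to patch only the changed text node rather than rebuild the whole tree:

var Clock = React.createClass({
  getInitialState: function () {
    return { time: new Date().toLocaleTimeString() };
  },
  componentDidMount: function () {
    // Each setState produces a new virtual DOM tree; React diffs it against
    // the previous one and patches only the changed text node in the real DOM
    setInterval(function () {
      this.setState({ time: new Date().toLocaleTimeString() });
    }.bind(this), 1000);
  },
  render: function () {
    return React.createElement('p', null, this.state.time);
  }
});
React.render(React.createElement(Clock), document.getElementById('mount-point'));
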
JSX

JSX is a JavaScript syntax extension that brings in a familiar HTML/XML-like syntax for defining a tree structure with attributes. This is the syntax you use to declare your layout, and React updates the UI accordingly. It’s a bold approach, since developers are conditioned to keep layout code separate from JavaScript. We will explain more React terminology later as we dive into some code.
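
For example, here is a JSX expression and the rough JavaScript equivalent that the JSX transformer produces (variable names are made up):

// JSX
var greeting = <h1 className="greeting">Hello, world</h1>;

// is roughly equivalent to
var greetingEl = React.createElement('h1', { className: 'greeting' }, 'Hello, world');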

Key Ideas of Flask

Flask is a microframework, which means that it offers a short learning curve in exchange for fewer out-of-the-box features, compared to heavier frameworks such as Django or Rails. It gives developers more freedom to use their preferred tools and libraries. However, it does have a list of officially supported extensions which, when plugged in, provide a wide breadth of functionality for a standard web app. Extensions behave as if they were native Flask code.

We strongly recommend that you set up a virtualenv for this project, and you may also want to check out virtualenvwrapper for convenience. This is to provide your app with a sandboxed environment.
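
For example, a typical setup might look like this (the environment name venv is arbitrary):

pip install virtualenv
virtualenv venv
source venv/bin/activate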

Getting up and running with Flask

Let's first install Flask:

pip install Flask

# For viewing and reusing app dependencies
pip freeze > requirements.txt

Set up the following directory structure in your app.

├── README.md
├── app.py
├── requirements.txt
└── templates
    └── index.html

Modify your app.py code to include the following

from flask import Flask, render_template

app = Flask(__name__)

@app.route("/")
def index():
    return render_template('index.html')

if __name__ == "__main__":
    app.run()

We start by importing Flask and creating a new instance of a Flask application. In Flask, app.route is used to describe the behavior when users hit particular endpoints in the application. Here, when a user hits the index route, we render the index.html template. By default Flask uses the Jinja2 templating language, but you can use any other templating language; in fact, we will not be covering Jinja2 in this blog post. Finally, we tell Python to call the run method of the app when the script is run directly.

Let’s populate index.html with the following basic HTML boilerplate

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Flask React Tutorial</title>
</head>
<body>
     <div id="mount-point">
         <p>Hello world.</p>
     </div>
</body>
</html>

Now run the app with

python app.py
# * Running on http://127.0.0.1:5000/
# * Restarting with reloader

By default it runs on port 5000. Navigate to that URL and you should see the HTML page you just created. You are now up and running with Flask!

Integrate React

The easiest way to include React is to load it from a CDN. Let's update index.html to include React and port our existing HTML to React. index.html will now look like this:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Flask React Tutorial</title>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/react/0.13.2/react.min.js"></script>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/react/0.13.2/JSXTransformer.js">
</head>
<body>
     <div id="mount-point"></div>
</body>

  <script type="text/jsx">
     /*** @jsx React.DOM */
    var FirstComponent = React.createClass({
        render: function() {
            return (<p>Hello world.</p>);
        }
    });
    React.render(<FirstComponent />, document.getElementById('mount-point') );
     </script>

</html>

How Browserify Improves Client-side Development

For a more modular, maintainable Frontend

As Single Page Applications gain in popularity, the size of front-end codebases keeps growing rapidly. To keep these codebases maintainable, modularity becomes a priority. The easier it is to modularize code, the more incentive developers have to do so. Thanks to the ease of modularity with CommonJS, npm has seen explosive growth in published packages, which has helped the Node ecosystem greatly. Browserify brings that ease to client-side development by leveraging the CommonJS module system. When used with build tools such as Grunt or Gulp, you can write modular client-side code just like you would write your server-side Node code, and Browserify takes care of the bundling for you. There is much less excuse these days to make everything global and attach it to the window object!

Leveraging npm modules

[Graph: Package manager traction — rate of packages published to npm, Bower, PyPI and RubyGems]

The graph above is a big selling point when evaluating the value Browserify can bring to your client-side workflow. It compares the rate at which packages are published to different package managers: npm, Bower, PyPI and RubyGems. npm leads the pack easily. Recently, the jQuery plugin registry stopped accepting new plugins, with new packages being published on npm. Cordova recently announced the same change, moving plugins to npm. npm now hosts a much broader range of modules than just server-side Node.js modules, and Browserify can help you leverage these modules on the front-end. The flip side, as a module publisher, is that publishing on npm now gives you access to a much broader audience, since people might use the module in the browser, on custom hardware, etc.

How it works

In the CommonJS syntax, the “exports” object is the public API of a module and “require” can be used to include a module in your JavaScript file. Since browsers do not have require available, Browserify traverses the dependency tree of all the required modules and bundles the dependencies into one self-contained file that you can include with a script tag in the browser. Browserify is aware of package.json and of the order in which node_modules are resolved. Moreover, it supports built-in Node modules, e.g. path, and globals, e.g. Buffer, so you have access to those on the client side as well.
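
As a small sketch (the file names are invented for illustration), a CommonJS module and an entry point that requires it might look like:

// greet.js
module.exports = function (name) {
  return 'Hello, ' + name + '!';
};

// main.js
var greet = require('./greet');
document.body.textContent = greet('Browserify');

Running browserify main.js -o bundle.js then produces a single self-contained bundle.js that can be included with a script tag.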

Transforms

Core Browserify only bundles modules written in the CommonJS syntax, adhering to the single responsibility principle. However, there are other ways of modularizing client-side code, AMD and global variables being the two usual ones. Instead of handling every possible module style, Browserify exposes a transforms API so that plugins can be built which preprocess a file into JavaScript in CommonJS syntax, which Browserify can then consume. This means that you can write modular code just like in your Node codebases, regardless of what module system your dependencies adhere to. There are also a lot of people writing in languages that compile to JavaScript, such as CoffeeScript or TypeScript. To handle this, there are transforms available for AMD (deamdify), Bower modules (debowerify), globals (deglobalify), CoffeeScript (coffeeify), ES6/Harmony (es6ify), etc. A simple search for Browserify on GitHub or npm brings up thousands of modules and attests to the ecosystem around it. Delegating to transforms helps keep the footprint of Browserify small, while making it more extensible.
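
For instance (assuming the coffeeify transform has been installed from npm), a transform is applied with the -t flag:

npm install --save-dev coffeeify
browserify -t coffeeify main.coffee -o bundle.js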

Verifying X509 Certificate Chain of Trust in Python

Executing network spoofing and man-in-the-middle attacks has become easier than ever. This is more of an issue if a client runs an open server for you to send push notifications to, since the open port can be detected by methods such as port scanning. As such, it is important to sign data and ship the signature, plus metadata about how to verify the data against the signature, along with the data itself. This gives the client a way to verify that the data received is unaltered, from the correct sender and intended for the correct recipient. Python's pyOpenSSL has a handy method called verify for checking the authenticity of data.

OpenSSL.crypto.verify(certificate, signature, data, digest)

The problem then becomes how to provide the certificate while retaining the flexibility to update it without clients needing to modify their certificate stores every time. Providing a URL that can be used to download the cert gives that flexibility, but leaves the door open for the same kinds of attacks.

Therefore, clients will need to ensure that the downloaded certificate is trustworthy before using it to verify the authenticity of a message. The openssl command-line tool has a verify command that can be used to check a certificate against a chain of trusted certificates, going all the way back to the root CA. The built-in ssl module has create_default_context(), which can build a certificate chain while creating a new SSLContext, but it does not expose that functionality for ad hoc post-processing when you are not opening new connections.
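
On the command line, that check looks roughly like this (the file names are placeholders):

# trusted_chain.pem holds the intermediate and root certs you trust;
# downloaded.pem is the certificate fetched from the URL
openssl verify -CAfile trusted_chain.pem downloaded.pem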

pyopenssl provides some very handy abstractions for exactly this purpose:

  • X509Store: The chain of certificates you have chosen to trust, going back to the root Certificate Authority

  • X509StoreContext: Takes an X509Store and a new certificate, which you can then validate against your store by calling verify_certificate. It raises an exception if an intermediate or root CA is missing from the chain or the certificate is otherwise invalid.

The full example of verifying a downloaded certificate against a trust chain is given below

import requests
from OpenSSL import crypto

def _verify_certificate_chain(cert_url, trusted_certs):

    # Download the certificate from the url and load the certificate
    cert_str = requests.get(cert_url)
    certificate = crypto.load_certificate(crypto.FILETYPE_PEM, str(cert_str.text))

    #Create a certificate store and add your trusted certs
    try:
        store = crypto.X509Store()

        # Assuming the certificates are in PEM format in a trusted_certs list
        for _cert in trusted_certs:
            store.add_cert(_cert)

        # Create a certificate context using the store and the downloaded certificate
        store_ctx = crypto.X509StoreContext(store, certificate)

        # Verify the certificate, returns None if it can validate the certificate
        store_ctx.verify_certificate()

        return True

    except Exception as e:
        print(e)
        return False

This can be really useful for client libraries where you cannot rely on the system to provide the certificates, since you can ship your trust chain along with the library. There are other useful abstractions in pyOpenSSL for checks against the certificate: get_subject() provides information about the certificate such as the common name, has_expired() checks whether the certificate is within its validity window, and features such as blacklisting potentially compromised certificates are also possible. Thus pyOpenSSL is really handy when you need SSL abstractions beyond the standard library without shelling out to openssl via a subprocess.
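
For instance, a couple of those checks might look like this (a sketch, assuming certificate was loaded with crypto.load_certificate as in the example above):

# Inspect the subject of the downloaded certificate
subject = certificate.get_subject()
print(subject.CN)                 # common name, e.g. the hostname you expect

# Check that the certificate is still within its validity window
print(certificate.has_expired())  # False while the certificate is valid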

Nodeconf 2015: Unconf With the Right Intentions

Conferences can be a great way to get the creative juices flowing, meet people in the community and share stories and problems. They offer great opportunities to learn from core developers building the frameworks that your software depends on.

Nodeconf managed to achieve all this, in the rather unusual form of an unconference. An unconference means that the structure, events, presentations and talks are decided by the community rather than a committee. That does mean Nodeconf is not a conference for everyone. Understanding the format and structure of Nodeconf is important before you make the hike to Walnut Creek Ranch next year.

I thought I would distill the reasons why you might or might not be interested in attending Nodeconf, as well as how to get the most out of it. You might be interested in Nodeconf if you:

  1. Build for the web: For many attendees, Nodeconf would feel like living in the future, as a lot of attendees are deeply involved in making the decisions and tradeoffs that will shape the future of the web. The discussions around packaging and parceling front-end assets on npm (Modular UI) were really interesting, as was Isomorphic JS, which covered the challenges involved in writing identical client- and server-side code. The JavaScript landscape is a fast-evolving one, and Nodeconf offers fantastic perspective on how the decision-making can work.

  2. Publish on npm/GitHub: As someone who maintains projects on npm and GitHub, I found the discussions around distributing Node modules very insightful. Issues such as broadening adoption, getting contributors for GitHub modules and standards for publishing on npm came up, and maintainers of hugely popular modules shared their experiences. Picking a good module scope, having really good examples for beginners to start with, and publishing with concise yet searchable package descriptions were all emphasized.

Building Realtime User Monitoring and Targeting Platform With Node, Express and Socket.io

Being able to target users and send targeted notifications can be key to turning visitors into conversions and tightening your funnel. Offerings such as Mailchimp and Mixpanel provide ways to reach out to users, but in most cases you only get to do so in post-processing. However, there are situations where it would be really powerful to be able to track users as they navigate your website and send targeted notifications to them in real time.

Use Cases

Imagine that a buyer is looking for cars and is interested in vehicles of a particular model and brand. It is very likely that he/she will visit several sites to compare prices. If there are a few results the buyer has looked at already, there may be an item which fits the profile of this user. If you are able to prompt and reach out as the user is browsing through several results, it could make the difference between a sale and the user buying from a different site. This is particularly useful for high-price, high-option scenarios, e.g. real estate/car/electronics purchases. For use cases where the price is low or the options are fewer, e.g. a SaaS offering with 3 tiers, this level of fine-grained tracking may not be necessary. However, if you have a fledgling SaaS startup, you may want to do this in the spirit of doing things that don't scale.

Prerequisites

This article assumes that you have Node and npm installed on your system. It would also be useful to be familiar with Express.js, the de facto web framework on top of Node.js. Socket.io is a Node.js module that abstracts WebSocket, JSON polling and other protocols to enable bidirectional communication between connected parties. This article makes heavy use of Socket.io terminology, so it helps to be familiar with sending and receiving events, broadcasts, namespaces and rooms.

Install and run

Start by cloning the repo, installing the dependencies and running the app:

git clone git@github.com:avidas/socketio-monitoring.git
cd socketio-monitoring
npm install
npm start

By default this will start the server on port 8080. Navigate to localhost:8080/admin in a browser, e.g. Chrome. Now, in a different browser, e.g. Firefox, navigate to localhost:8080 and browse around. You will see that the admin page gets updated with the URL endpoints as you navigate your way through the website in Firefox. You can even send an alert to the user in Firefox by pressing the Send Offer button in Chrome!

Walkthrough

Let’s get into how this works. When an admin visits localhost:8080/admin, she joins a Socket.io namespace called adminchannel.

var adminchannel = io.of('/adminchannel');

When a new user visits a page, we get the Express session ID of the user by reading req.sessionID and pass it to the templating engine for rendering. The session ID ensures that we can identify a user across pages and browser tabs.

res.render('index', {a:req.sessionID});

The template sets the value of sessionID as a hidden input field on the page, with the id “user_session_id”.

<body>
<input type="hidden" id="user_session_id" value="<%= a %>" />
  <div id="device" style="font-size: 45px;">2015 Tesla Cars</div>
    <a href="/about">About</a>
  <br />
  <a href="/">Home</a>
</body>

After the page has loaded, it emits a pageChange Socket.io event. Accompanying the event are the URL endpoint of the current page and the sessionID.

  var userSID = document.getElementById('user_session_id').value;
  var socket = io();

  var userData = {
    page: currentURL,
    sid: userSID
  }
  socket.emit('pageChange', userData);

On the server side, when pageChange is received, a Socket.io event called alertAdmin is sent to the adminchannel namespace. This ensures that only the admins are alerted that the user with a particular session id and socket id has navigated to a different page. Since anyone with access to the /admin endpoint joins the adminchannel namespace, this easily scales to multiple admins.

  socket.on('pageChange', function(userData){
    userData.socketID = socket.id;
    userData.clientIDs = clientIDs;
    console.log('user with sid ' + userData.sid + ' and socket id ' + userData.socketID + ' changed page ' + userData.page);
    adminchannel.emit('alertAdmin', userData);
  });

When alertAdmin is received on the client side, the UI is updated so that the admins have a realtime dashboard of users navigating the site. This is done via jQuery, which appends each new page change to an HTML list as users navigate through the site.

  adminsocket.on('alertAdmin', function(userData){
    var panel = document.getElementById('panel');
    var val = " User with session id " + userData.sid + " and with socket id " + userData.socketID + " has navigated to " + userData.page;
    userDataGlob = userData;
    var list = $('<ul/>').appendTo('#panel');
    //Dynamic display of users interacting on your website
    $("#panel ul").append('<li> ' + val + ' <button type="button" class="offerClass" id="' + userData.socketID + '">Send Offer</button></li>');
  });

Now, the admin may choose to send a notification to a particular user. When the admin clicks the “Send Offer” button, a Socket.io event called adminMessage is sent to the default namespace on the server with the user-specific data.

  //Allow admin to send adminMessage
  $('body').on('click', '.offerClass', function () {
    socket.emit('adminMessage', userDataGlob);
  });

When adminMessage is received on the server side, we broadcast the message to the specific user. Since every client automatically joins a room identified by its socket id, we can send a notification to only that user by using socket.broadcast.to(userData.socketID), emitting an event called adminBroadcast with the data.

Here, you could also have chosen to broadcast a message to all users, or to a particular room which a subset of users had joined. Thus, you can fine-tune how you want to reach out to users.

  socket.on('adminMessage', function(userData) {
    socket.broadcast.to(userData.socketID).emit('adminBroadcast', userData);
  });

Finally, on the user's client side, when adminBroadcast is received the user is alerted with a notification. However, you can easily handle more complex use cases, such as dynamically updating page results or updating an ads section to show offers, by setting up event listeners.

  socket.on('adminBroadcast', function(userData){
    alert('Howdy there ' + userData.sid + ' ' + userData.socketID + ' ' + userData.page);
  })

There you have an end-to-end way for a set of admins to track a set of users on a website and send notifications. This system can be particularly valuable when the user's primary reason for visiting carries purchasing intent. E-commerce and SaaS platforms have recognized the importance of user segmentation and targeted outreach; this system enables you to minimize the latency of such outreach. On the plus side, you get to rely on fully open-source tools with broad user bases and support.

This particular example used URL endpoints as part of the data payload, but you can really stretch it to any user event. For example, you can easily track where the user's cursor is and send that information back in real time. One can imagine high-frequency trading firms using this technique in bots to track realtime user behavior, e.g. a user's cursor hovering over a buy button for a ticker, as information gathered for trades. How much you want to track and react to is an exercise in finding the boundary between being responsive and being creepy for users.
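
As a rough sketch (the cursorMove event name and the throttle interval are invented for illustration), the client side could reuse the same socket and session id from the walkthrough above:

// Client side: report cursor position at most every 250ms
var lastSent = 0;
document.addEventListener('mousemove', function (e) {
  var now = Date.now();
  if (now - lastSent > 250) {
    lastSent = now;
    socket.emit('cursorMove', { sid: userSID, x: e.clientX, y: e.clientY });
  }
});

The server could then relay cursorMove to the adminchannel namespace the same way pageChange is relayed.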

Props to my friend Shah for working with me on this. If you are doing some level of realtime tracking on your site, I would love to hear about it. Please feel free to send over any other feedback as well.