I decided to build a tool that would allow me to see how websites are related in a visual way. I’m still working on the concept but the tool is ready and working: WebVisualizer.

##Building blocks

To get this working I needed a few basic building blocks, some server-side and some client-side. Here is a quick run-through:

###Client graph rendering

I had some experience with the VivaGraph library, from the very talented Andrei Kashcha, so that was an obvious choice. It allows you to render very large graphs (tested up to 50,000+) using WebGL without any hassle really.

Using WebGL has some limitations. The main one is that it can only load images from the same domain as the page, unless the remote server explicitly allows cross-origin use through CORS headers. This means we’ll need a server proxy to load the favicons for each website.
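To make the idea concrete, here is a minimal sketch of how the client could turn a website into a same-origin image URL. The `/image` route, its `url` parameter, and the `proxiedFaviconUrl` helper are names made up for illustration, and it assumes the site serves its icon at the conventional `/favicon.ico` path.

```javascript
// Hypothetical helper: map a website to a same-origin URL that WebGL can load.
// The '/image' route and its 'url' parameter are assumed names, not the tool's real API.
function proxiedFaviconUrl(siteUrl) {
	// Most sites serve their icon at /favicon.ico; this is a convention, not a guarantee.
	var favicon = new URL('/favicon.ico', siteUrl).href;
	return '/image?url=' + encodeURIComponent(favicon);
}

console.log(proxiedFaviconUrl('https://example.com/some/page'));
// prints "/image?url=https%3A%2F%2Fexample.com%2Ffavicon.ico"
```

Because the resulting URL is relative, the browser requests it from the same server that served the page, so the same-origin restriction no longer applies.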

###Server

I needed to set up a server, both for the image proxy and to interact with the Search APIs (they all require private keys), so I decided to go with Node.js and deploy to Heroku through Travis.

For the tests I’m using mocha with should and supertest; together they make testing a web server extremely easy.

Testing the Image proxy (detailed below):

###Image proxy

There were some existing npm packages to proxy requests in Node.js, but I wanted something very simple. I set it up using Express and request.
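For illustration, here is a rough sketch of what such a proxy can look like. It is not the code used by the tool: to stay dependency-free it uses Node's built-in `http` and `https` modules in place of Express and request, and the `/image` route, the `url` parameter, and the function names are all assumptions.

```javascript
var http = require('http');
var https = require('https');

// Return a parsed URL for http(s) targets, or null for anything else.
function parseTarget(raw) {
	try {
		var target = new URL(raw);
		return (target.protocol === 'http:' || target.protocol === 'https:') ? target : null;
	} catch (e) {
		return null; // missing or malformed URL
	}
}

// Request handler: GET /image?url=<remote image> streams the remote image back.
function imageProxy(req, res) {
	var incoming = new URL(req.url, 'http://localhost');
	var target = incoming.pathname === '/image'
		? parseTarget(incoming.searchParams.get('url'))
		: null;

	if (!target) {
		res.writeHead(400, { 'Content-Type': 'text/plain' });
		return res.end('Expected /image?url=<http(s) image URL>');
	}

	var client = target.protocol === 'https:' ? https : http;
	client.get(target, function (upstream) {
		// Pass through the status and content type, then pipe the bytes unchanged.
		res.writeHead(upstream.statusCode, {
			'Content-Type': upstream.headers['content-type'] || 'application/octet-stream'
		});
		upstream.pipe(res);
	}).on('error', function (err) {
		// Only send an error status if nothing has been written yet.
		if (!res.headersSent) {
			res.writeHead(502, { 'Content-Type': 'text/plain' });
		}
		res.end(err.message);
	});
}

// To try it locally: http.createServer(imageProxy).listen(3000);
```

A proxy like this will fetch any http(s) URL it is given, so a public deployment would probably want to restrict which hosts or content types it accepts.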

###Search API

I was going to use the usual Google Search API, but it turns out it was deprecated some years ago and will be disappearing. They are replacing it with the Custom Search API, which unfortunately only allows 100 queries/day.

After searching around for a bit I found that Bing offered 5,000 queries/month, comfortably more than the roughly 3,000 a month that Google’s 100/day limit works out to, so I went with Bing for this one…

You can sign up for Bing’s Search API on the new Azure Marketplace. Once you have signed up, go to your profile to get your Access Key.

In my code the Access Key is read from an environment variable called bingAPIkey, which keeps it out of the source. When running locally use:

```shell
bingAPIkey=Your-Key node index.js
```

If you are deploying to Heroku as well, remember to set up the config variable in the Settings tab.

Serving Bing searches from Node.js was quite easy after that:

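```javascript
module.exports = function (app) {

	var bingAPIkey = process.env.bingAPIkey;

	if (!bingAPIkey) {
		console.error("[ERROR] No API key for Bing detected, please set the 'bingAPIkey' env variable.");
		return;
	}

	var rootUri = 'https://api.datamarket.azure.com/Bing/Search/v1',
		// Basic auth with the access key as both user name and password.
		// Buffer.from replaces the deprecated `new Buffer(...)` constructor.
		auth = Buffer.from([bingAPIkey, bingAPIkey].join(':')).toString('base64'),
		request = require('request').defaults({
			headers: {
				'Authorization': 'Basic ' + auth
			}
		});

	// Set up the service
	app.get('/search', function (req, res) {
		var service_op = req.query.service_op || 'Web',
			query = req.query.query;

		// Reject requests without a search term instead of sending 'undefined' to Bing
		if (typeof query !== 'string' || !query) {
			return res.status(400).send("Missing 'query' parameter");
		}

		var url = rootUri + '/' + encodeURIComponent(service_op);

		console.log("Bing Search: " + url + " -> " + query);

		request.get({
			url: url,
			qs: {
				$format: 'json',
				// Single quotes required; quotes inside the term are doubled (OData escaping)
				Query: "'" + query.replace(/'/g, "''") + "'"
			}
		}, function (err, response, body) {
			if (err) {
				return res.status(500).send(err.message);
			}
			if (response.statusCode !== 200) {
				return res.status(500).send(body);
			}

			// Guard against a non-JSON body so a bad response cannot crash the server
			var results;
			try {
				results = JSON.parse(body);
			} catch (e) {
				return res.status(500).send('Invalid JSON received from Bing');
			}
			res.send(results.d.results);
		});
	});
};
```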
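How the results feed the graph is not shown here. As one plausible step, the client could reduce the returned list to the set of distinct websites it mentions. The sketch below assumes each result carries a `Url` string field; the `uniqueHosts` helper and the sample data are made up for illustration.

```javascript
// Collect the unique hostnames from a list of web search results.
// Assumes each result has a `Url` string field; entries without a valid URL are skipped.
function uniqueHosts(results) {
	var hosts = new Set();
	results.forEach(function (result) {
		try {
			hosts.add(new URL(result.Url).hostname);
		} catch (e) {
			// Missing or malformed URL: ignore this result.
		}
	});
	return Array.from(hosts); // a Set keeps insertion order
}

// Made-up sample in the assumed shape, not real API output.
var sample = [
	{ Title: 'A', Url: 'https://example.com/a' },
	{ Title: 'B', Url: 'https://example.com/b' },
	{ Title: 'C', Url: 'https://example.org/' },
	{ Title: 'D' }
];

console.log(uniqueHosts(sample)); // [ 'example.com', 'example.org' ]
```

Each hostname could then become a node in the graph, with its favicon loaded through the image proxy.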

##Putting it all together

Once all that was working I built a simple UI to display everything. It’s still a work in progress, so I’ll keep adding to this post.

