
Update: Part 2 is now available

For the last two weeks, we’ve been using MongoDB for storing conversion data on our Codiqa beta signup landing page.  Basically, we track conversion from landing to signup both from ad campaigns we are running and from organic clicks.  Part of our goal with Codiqa is to make decisions that are informed with data, rather than relying on gut feelings.

Having built equivalent tracking systems using RDBMSs, I thought I’d share how using MongoDB differs.

Storing Data with MongoDB

With Mongo, there is no requirement to define a schema up front.  This means that if I need to start tracking a new field for, say, a referral id, I can simply include that field in the documents I send to MongoDB, and any future documents that use it will have it available. Contrast this with an RDBMS like MySQL or PostgreSQL: the live table has to be altered to add the new column.  For a database like MySQL, an ALTER TABLE can take anywhere from minutes to hours to days depending on the size of the table, to say nothing of the difficulty of keeping the production schema in sync with the new code base.
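As a quick sketch of what that looks like in practice (the field and collection names here are hypothetical, chosen to match the tracking example), adding a referral id is just a matter of including it in new documents:

```javascript
// Two landing documents destined for the same collection.
// No ALTER TABLE, no migration: the second simply carries an extra field.
var earlierDoc = { clickId: 'abc123', campaign: 'organic' };
var laterDoc   = { clickId: 'def456', campaign: 'organic', referralId: 'ref42' };

// In the mongo shell both insert without any schema change:
//   db.landing.insert(earlierDoc);
//   db.landing.insert(laterDoc);

// Older documents just won't have the field, which queries can handle:
//   db.landing.find({ referralId: { $exists: true } });
```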

Another benefit to using MongoDB is that unexpected data does not block the collection of that data.  In an RDBMS, if you declare a field to only be 32 characters long and then encounter perfectly valid data that is longer than that, the whole row can fail to insert.  MongoDB imposes no such per-field constraint: the document is stored as-is.
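To make that concrete (the column definition and value below are hypothetical):

```javascript
// In MySQL, a VARCHAR(32) column rejects (or, in non-strict mode, truncates)
// longer values:
//   CREATE TABLE landing (referrer VARCHAR(32));
//   INSERT INTO landing VALUES
//     ('http://a-perfectly-valid-but-long-referrer.example.com/path'); -- fails

// MongoDB has no per-field length limit, so the same value stores unchanged:
var doc = { referrer: 'http://a-perfectly-valid-but-long-referrer.example.com/path' };
//   db.landing.insert(doc);  // succeeds
```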

A document-oriented database like MongoDB lets you quickly track new data.  I find that for systems that rely heavily on data storage, the flexibility of MongoDB lets me spend less time on up-front data modeling.

Querying Data with MongoDB

Querying data with MongoDB is quite different from querying a SQL-based RDBMS.  Rather than one big query, a MongoDB query reads more like a script.

Here is the current script I’m using to track conversion for our beta signup landing page:

/*
A MongoDB query script. Example output:
760 landings
69 signups
Conversion rate: 9.078947368421053%
Campaign conversions:
organic: 8.042% - 53/659
betabuild1: 0% - 0/6
betanocodingbuilder: 37.500% - 6/16
betaonlydrag: 50.000% - 2/4
betaonlydragtool: 0% - 0/4
betamobmin: 11.321% - 6/53
*/
var signups = db['beta_interest'];
var landings = db['landing'];

var landingEntries = landings.find();
var signupEntries = signups.find();

var landingClick = {};
var signupClick = {};

var campaignLandingCount = {};
var campaignSignupCount = {};

landingEntries.forEach(function(obj) {
  landingClick[obj.clickId] = obj
  if(!campaignLandingCount.hasOwnProperty(obj.campaign)) {
    campaignLandingCount[obj.campaign] = 1;
  } else {
    campaignLandingCount[obj.campaign]++;
  }
});
signupEntries.forEach(function(obj) {
  signupClick[obj.clickId] = obj
  if(!campaignSignupCount.hasOwnProperty(obj.campaign)) {
    campaignSignupCount[obj.campaign] = 1;
  } else {
    campaignSignupCount[obj.campaign]++;
  }
});

var totalLanding = landingEntries.count();
var totalSignups = signupEntries.count();
print(totalLanding + ' landings');
print(totalSignups + ' signups');
print('Conversion rate: ' + (totalSignups / totalLanding) * 100 + '%');
print('Campaign conversions:');
for(var campaign in campaignLandingCount) {
  if(campaignSignupCount.hasOwnProperty(campaign)) {
    // Renamed from signups/landings to avoid shadowing the collection handles above
    var campaignSignups = campaignSignupCount[campaign];
    var campaignLandings = campaignLandingCount[campaign];
    var perc = (campaignSignups / campaignLandings) * 100;
    print('\t' + campaign + ': ' + perc.toFixed(3) + '%' + ' - ' + campaignSignups + '/' + campaignLandings);
  } else {
    print('\t' + campaign + ': 0% - 0/' + campaignLandingCount[campaign]);
  }
}

In this script, grouping and aggregation are done with plain hash maps.  As a programmer, this is more natural to me than SQL aggregations, especially when data is spread across multiple tables or database servers, or when I need data from columns that are not part of the grouping.
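The counting pattern in the script generalizes to a small helper (a sketch, not part of the original script; the function name is my own):

```javascript
// Tally documents by an arbitrary field -- the same hash-based grouping
// the script uses for per-campaign landing and signup counts.
function tallyBy(docs, field) {
  var counts = {};
  docs.forEach(function(doc) {
    var key = doc[field];
    counts[key] = (counts[key] || 0) + 1;
  });
  return counts;
}

// Works on any array of documents, e.g. cursor.toArray() in the mongo shell:
var sample = [
  { campaign: 'organic' },
  { campaign: 'organic' },
  { campaign: 'betamobmin' }
];
tallyBy(sample, 'campaign'); // { organic: 2, betamobmin: 1 }
```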

The script I ended up writing to track conversions is longer than an equivalent aggregation query in SQL.  However, having written more complex SQL queries, I was able to come up with the MongoDB query much more quickly.  I’m reading up on the MongoDB Map/Reduce docs to see how I could make this query more compact and efficient, and the MongoDB Aggregation docs have ideas for how to get equivalent GROUP BY behavior in MongoDB.
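For reference, here is roughly what the per-campaign landing count would look like as an aggregation pipeline -- a sketch based on the aggregation docs, not something the script above uses yet:

```javascript
// Count landings per campaign, equivalent to the hash-based tally in the script.
var pipeline = [
  { $group: { _id: '$campaign', landings: { $sum: 1 } } },
  { $sort: { landings: -1 } }
];

// In the mongo shell:
//   db.landing.aggregate(pipeline);
// yields documents like { _id: 'organic', landings: 659 }
```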

Conclusion

Stay tuned for future blog posts that chronicle our experiences using MongoDB to track the data we gather.  The script I wrote above is only useful as a quick check that everything is working.  In the future I will write about putting this data into a graphing library or a monitoring system like Geckoboard.

Max

Hi, I'm Max, Co-founder of Codiqa, the easiest way to build jQuery Mobile prototypes. I'd love to talk with you: follow me!
