theChrisWalker.net

“Lithium RAD Framework rocks! ”

says Chris

Mongo — and the coolness of Document-Oriented Databases


I have been following a PHP Rapid Application Development framework called Lithium with much interest (for other reasons, which I fully intend to blog about later), and its developers are the ones I owe for turning me on to document-oriented database systems.

So what are they and why are they so cool, and what/who the hell is “Mongo”?

Let’s deal with the what first. “Traditional” database systems are relational: they are characterised by strictly defined tables, strong relations between tables, and ACID compliance. In simple terms, that means you say:

“I want a table of users and each one will have a first name, a last name and an email address, and each of those will be a ‘text’ entry of a maximum of 255 characters”.

If at a later date you want to add more info, you have to go back and alter your table definition, give the old rows default values and fix any code that relied on the previous structure (not that you’d ever develop such a dependent application). The approach in a document-oriented database is that you say:

“I want a table of users”

Then you can give each one whatever data you want. You may have one user with a name and address, another with a name and phone number only, and a third with a phone number and address but no first name. All these records can co-exist! That’s pretty different, and I really didn’t know whether it was a good idea. So I played with MongoDB.
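To make that concrete, here is a rough sketch in plain JavaScript with no database involved. An array stands in for the collection, the three users and all their values are made up, and `withField` is a hypothetical helper, not anything from MongoDB. It only shows that differently shaped records can sit side by side and still be filtered.

```javascript
// Plain JavaScript stand-in for a "users" collection; all values are made up.
const users = [
  { firstName: "Ann", lastName: "Lee", address: "1 High Street" },
  { firstName: "Bob", lastName: "Ray", phone: "555-0100" },
  { phone: "555-0199", address: "2 Low Road" } // no name at all
];

// Hypothetical helper: keep only the records that actually have a given key.
function withField(records, key) {
  return records.filter(record => key in record);
}

// Users we could reach by phone, whatever other fields they carry.
const reachableByPhone = withField(users, "phone");
console.log(reachableByPhone);
```

No record had to be padded with default values to fit a shared structure; a missing field is simply absent.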

MongoDB is really cool. I installed it on my laptop easily and fired up the shell to play. The syntax is very JavaScript-like, because it is JavaScript! So we can do this:

MongoDB shell version: 1.3.3-
url: test
connecting to: test
type "help" for help
> db.myFirstMongoDB.save({key:"value", mixed:["one","two","three"], deep:{has:{another:"object"}}})
ObjectId("4b96b836ea70a136feb89c20")
> db.myFirstMongoDB.find()
{ "_id" : ObjectId("4b96b836ea70a136feb89c20"), "key" : "value",  "mixed" : [ "one", "two", "three" ],
   "deep" : { "has" : { "another" : "object" } } }
>

Notice two amazing things here.

  1. We never created the table! We simply call db.myFirstMongoDB... which creates the “collection” (as tables are called in the document-oriented world). We certainly never defined a schema!
  2. We passed an object with nested data, not just simple fields! We can put this to much better use in the next example.

So why might you want to do this? Consider a website with articles. Articles are written by a person, they are published on a date, and they have content, tags, comments, view counts, trackbacks, and a proprietary rating system. If this were a relational database, we would be looking at the following tables:

articles: holds the article, date, person_id, view counts, proprietary rating
people: holds the writers
comments: the comments, related by article_id
trackbacks: link back details
ratings: the info for the proprietary system.
tags: tag info
tags_articles: a join table linking tags to the articles

Joining all that info together is not only complex, but can also put a fair amount of load on the system. As the website owners ask for more info to be added, changes become more complicated and difficult. However, with Mongo I can hold the article in one collection like this:

[

{
  "published": "2010-03-09 21:21:00",
  "author": { "name": "Chris Walker", "profile": "/profiles/chris-walker" },
  "title": "My amazing Article",
  "content": "here's the text...",
  "tags": [ "wonder", "amazement", "mongo", "demo", "chris" ],
  "comments": [
    { "commenter": "Rosie", "comment": "What is this?", "when": "2010-03-09 22:02:00" },
    { "commenter": "Chris", "url": "http://thechriswalker.net/", "comment":"it's class, that's what!", "when": "2010-03-09 23:12:00" },
    { "commenter": "anonymous", "comment":"I am your father.", "when": "2010-03-10 01:25:20" }
  ],
  "rating": { "stars": 5, "tagline": "what a monster!" },
  "views": 1024,
  "trackbacks": [
    { "url": "http://some.blog/some/where", "quote": "the text for the link" },
    { "url": "http://some.other.blog/some/where/else", "quote": "the text they used to link here" }
  ]
}

]

So that might look like a more complex entity. It is, but you can search for values in sub-documents, you can filter results on the existence of certain keys, and you can pull all this info in one result. Your previous 7 tables and 8 joins just became 1 collection and no joins. Guess which is quicker?
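Here is a small sketch of those two ideas, again in plain JavaScript rather than against a real database. In Mongo itself you would express them as queries (dot notation such as "author.name" for sub-documents, and the $exists operator for key existence); the `getPath` helper and the second, unrated article below are invented purely for illustration.

```javascript
// Two cut-down article documents; the second one is invented for contrast.
const articles = [
  {
    title: "My amazing Article",
    author: { name: "Chris Walker", profile: "/profiles/chris-walker" },
    rating: { stars: 5, tagline: "what a monster!" }
  },
  {
    title: "An unrated article",
    author: { name: "Someone Else" }
  }
];

// Hypothetical helper: follow a dotted path such as "author.name" into a document.
// Returns undefined if any step along the path is missing.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value != null && typeof value === "object" ? value[key] : undefined),
    doc
  );
}

// Search on a value inside a sub-document.
const byChris = articles.filter(a => getPath(a, "author.name") === "Chris Walker");

// Filter on the existence of a key.
const rated = articles.filter(a => getPath(a, "rating") !== undefined);

console.log(byChris.map(a => a.title));
console.log(rated.map(a => a.title));
```

Either way, each match comes back as the whole document, author and rating included, with no second lookup.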

Say you want to pull articles with a certain tag, “balls”. In SQL you’d be doing this:

SELECT `article_id` FROM `tags_articles` `ta` JOIN `tags` `t` ON `ta`.`tag_id` = `t`.`id` WHERE `t`.`name` = 'balls'

And that would just get you the list of IDs; you’d still need to fetch the info for each article. In Mongo, with the collection we described above, you’d do:

db.articles.find({ tags: "balls"})

How much easier is that? And we’ve now got all the info, not just the IDs! I was just amazed at how easy that is!
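The reason that one-liner works is that an equality match against an array field succeeds when any element of the array equals the value. A rough plain-JavaScript imitation of that behaviour (the `findByTag` helper and the “Ball games” article are made up for the example) might look like this:

```javascript
// Stand-in for the articles collection; the second article is invented.
const taggedArticles = [
  { title: "My amazing Article", tags: ["wonder", "amazement", "mongo", "demo", "chris"], views: 1024 },
  { title: "Ball games", tags: ["balls", "demo"], views: 3 }
];

// Hypothetical helper imitating the tag query:
// a document matches when its tags array contains the requested tag.
function findByTag(docs, tag) {
  return docs.filter(doc => Array.isArray(doc.tags) && doc.tags.includes(tag));
}

// Each hit is the full document, so title, views and the rest are already there.
const hits = findByTag(taggedArticles, "balls");
console.log(hits);
```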

Of course, this simplicity comes at a price: you lose the relational elements, which makes tight relationships harder to work with. You also lose some ease in aggregating things like “tags”, but MapReduce support makes this very efficient anyway.
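To give a feel for the MapReduce idea applied to tags, here is a minimal sketch in plain JavaScript. It is not MongoDB's mapReduce call, just the same two steps done by hand over an in-memory array; the function names and the sample posts are assumptions for illustration.

```javascript
// Made-up sample documents; one has no tags at all.
const posts = [
  { title: "My amazing Article", tags: ["wonder", "mongo", "demo"] },
  { title: "Another one", tags: ["mongo", "demo"] },
  { title: "Untagged" }
];

// Map step: emit one [tag, 1] pair for every tag on a document.
function mapTags(doc) {
  return (doc.tags || []).map(tag => [tag, 1]);
}

// Reduce step: add up the values emitted for each tag.
function countTags(docs) {
  const counts = new Map();
  for (const [tag, n] of docs.flatMap(mapTags)) {
    counts.set(tag, (counts.get(tag) || 0) + n);
  }
  return counts;
}

const tagCounts = countTags(posts);
console.log(tagCounts);
```

In a relational setup the tags table would give you this list directly; here it has to be derived from the documents, which is the trade-off being described.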

In my next post I’ll talk about Lithium, Mongo and how fast you can build a full-blown web app with these tools.

Written by Chris

March 9th, 2010 at 9:44 pm

One Response to 'Mongo — and the coolness of Document-Oriented Databases'


  1. #1

    thanks, you just made the difference between relational db and document oriented db obvious and simple. I am working on a project where the developers who work in Ruby are going to use a document oriented db and it is good to have a clear picture of how this works.

    Catherine

    22 Jun 10 at 05:59
