Wednesday, July 15, 2009

No SQL, so what?

It's been a fortnight since Log Buffer rounded up the reaction to the nascent No SQL movement. But there is a lively thread still running on Oracle-L. The entire thread is worth reading, but I was particularly struck by something Nuno Souto wrote:
"Now: the simple fact here is that folks from Google, Facebook, Myspace, Ning etcetc, and what they do as far as IT goes, are absolutely and totally irrelevant to the VAST majority of enterprise business."
This is so true. For starters, there is no SLA for users of Google's search engine. If Google doesn't include a page because it hasn't been indexed yet, well that's just the way it is. Ditto if Google returns duplicate hits because the same page has been indexed in multiple places, or returns different results to different users because the indexes haven't been replicated across the entire estate. It doesn't really matter because Google's results are usually "good enough". Besides, it is jolly hard to spot missing hits or inconsistent results. Whereas in regular IT a similar casualness would undermine our users' faith in the system and lead to developers' heads being paraded on pikes.

It is also worth noting the major omission from the list of the usual suspects which are trotted out in these arguments: Amazon. Amazon's business model is most like regular enterprise IT - focused data retrieval, highly transactional, and with a premium on data integrity, security and performance. Consequently Amazon runs Oracle.

Why is there this widespread antipathy to SQL databases? It's not just because SQL is hard. I mean, Hibernate is complicated to understand and fiddly to implement. It goes beyond mere effort. Seth Godin wrote the following while discussing what qualities a computer game must possess in order to turn a customer into a die-hard fan:
"For World of Warcraft, [the learning curve is] huge. It's very difficult to spend just an hour or two. There's a chasm between encounter and enjoyable experience. Tetris was oriented in precisely the other way--everyone who tried it instantly became almost as smart as an expert."
I think this applies to development software too. Hibernate may be complex but it is couched in objects and Java and XML configuration files, so if you're already experienced in J2EE you already have an innate understanding of its fundamentals. You can become productive quite quickly. Many of the data storage tools presented at the NoSQL briefing come with APIs in Java, Python and similar development languages. In fact, ease of use for developers is a big play; Voldemort even celebrates its mockability (which is a reference to the Mock Objects school of test driven development and not a measurement of ridiculousness).

We are all aware of the cost of context switching in Oracle. Embedding SQL in unnecessary PL/SQL constructs is less performant than using set-based SQL statements. The NoSQL movement is addressing a similar problem: concept switching. It is easier for application developers to maintain their velocity if all their work uses the same languages, approach and indeed IDE.

It is obvious what the attraction is for developers. That does not make a NoSQL product suitable for any given business. Sure, if the application is primarily concerned with the storage, retrieval and emendation of documents it probably makes more sense to use a product like CouchDB than to try to shred the document into relational tables. But if the application is highly transactional and/or handles valuable data then something like MongoDB is definitely a bad fit. To be fair MongoDB does list the applications for which it is less suited.

It comes down to understanding what is appropriate for the project in hand. That is an assessment which really belongs to the users, because they are the people who know - or at least ought to know - the value of the data to the business. The Daily WTF recently published this cautionary tale showing the consequences, for a company and all its employees, of underestimating the value of its data and disrespecting the importance of adequate data storage.

9 Comments:

Blogger LewisC said...

Excellent post. I've never really understood the all or nothing approach anyway. Linux OR Windows, C OR Java, etc.

Best tool for the job.

Although hibernate sucks! ;-)

LewisC

15 July 2009 at 04:57:00 GMT-7  
Blogger dm said...

Amazon does have a "nosql" tool - Dynamo - which according to the Dynamo white paper is used quite a bit in their infrastructure. I assume they use relational databases too. That makes sense - use the right tool for the right job - and there are many jobs within the tech infrastructure of a company that size.

15 July 2009 at 06:58:00 GMT-7  
Blogger kevinglenny said...

Excellent post and I agree with your sentiments about developers wanting to use the tools that they are most productive with.

I would like to add another reason why there is a tide against the SQL market - the scalability problems that they present. The question of whether data stores can be plugged into existing infrastructures is discussed here

16 July 2009 at 09:25:00 GMT-7  
Blogger Noons said...

Just as an aside:

look at the description of memcache in the link you added and its further links.

In it we are told that database - note the lack of any qualification! - writers do block readers.

In other words: whoever wrote the memcache description has NEVER used Oracle, or they would know that its writers NEVER block readers!

But you can bet these guys have added Oracle into their list of dbs that "don't scale".

And folks wonder why I call these people total idiotic ignorants who are incapable of the most basic logic and informed argumentation?

17 July 2009 at 06:37:00 GMT-7  
Blogger Joel Garry said...

Twitter is inappropriate with Google Apps

Noons: you are a hero!

Now if we can just convince metalink not to use flash...

word: wareurro

17 July 2009 at 10:59:00 GMT-7  
OpenID dledwards said...

"On Radio Free Tooting, Andrew Clarke says, 'No SQL, so what?' taking as his keynote something Nuno Souto said: [...]"

Log Buffer #154

17 July 2009 at 11:26:00 GMT-7  
Blogger Sachin FromDev said...

Just stopped by from your stackoverflow thread answer for normalize or denormalize.

Thanks for sharing your thoughts. The recent social media companies have built a big hype about using NoSQL-type options. However, the majority of applications may not really need that.

16 August 2012 at 12:43:00 GMT-7  
