"Now: the simple fact here is that folks from Google, Facebook, Myspace, Ning etcetc, and what they do as far as IT goes, are absolutely and totally irrelevant to the VAST majority of enterprise business."
This is so true. For starters, there is no SLA for users of Google's search engine. If Google doesn't include a page because it hasn't been indexed yet, well that's just the way it is. Ditto if Google returns duplicate hits because the same page has been indexed in multiple places, or returns different results to different users because the indexes haven't been replicated across the entire estate. It doesn't really matter because Google's results are usually "good enough". Besides, it is jolly hard to spot missing hits or inconsistent results. Whereas in regular IT a similar casualness would undermine our users' faith in the system and lead to developers' heads being paraded on pikes.
It is also worth noting the major omission from the list of the usual suspects which are trotted out in these arguments: Amazon. Amazon's business model is most like regular enterprise IT - focused data retrieval, highly transactional, and with a premium on data integrity, security and performance. Consequently Amazon runs Oracle.
Why is there this widespread antipathy to SQL databases? It's not just because SQL is hard. I mean, Hibernate is complicated to understand and fiddly to implement. It goes beyond mere effort. Seth Godin wrote the following while discussing what qualities a computer game must possess in order to turn a customer into a die-hard fan:
"For World of Warcraft, [the learning curve is] huge. It's very difficult to spend just an hour or two. There's a chasm between encounter and enjoyable experience. Tetris was oriented in precisely the other way--everyone who tried it instantly became almost as smart as an expert."
I think this applies to development software too. Hibernate may be complex but it is couched in objects and Java and XML configuration files, so if you're already experienced in J2EE you already have an innate understanding of its fundamentals. You can become productive quite quickly. Many of the data storage tools present at the NoSQL briefing come with APIs in Java, Python and similar development languages. In fact, ease of use for developers is a big play; Voldemort even celebrates its mockability (which is a reference to the Mock Objects school of test driven development and not a measurement of ridiculousness).
We are all aware of the cost of context switching in Oracle. Embedding SQL in unnecessary PL/SQL constructs is less performant than using set-based SQL statements. The NoSQL movement is addressing a similar problem: concept switching. It is easier for application developers to maintain their velocity if all their work uses the same languages, approach and indeed IDE.
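For anyone who hasn't met the row-by-row versus set-based distinction, here is a minimal sketch of its shape. It uses Python's built-in sqlite3 module purely as a stand-in: SQLite has no PL/SQL engine, so this does not reproduce Oracle's context-switch cost, only the difference between issuing one statement per row and one statement for the whole set. The emp table and both function names are made up for illustration.

```python
import sqlite3


def make_db(n=1000):
    """Build a throwaway in-memory database with a hypothetical emp table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE emp (id INTEGER PRIMARY KEY, dept INTEGER, salary REAL)"
    )
    conn.executemany(
        "INSERT INTO emp (id, dept, salary) VALUES (?, ?, ?)",
        [(i, i % 10, 1000.0) for i in range(n)],
    )
    conn.commit()
    return conn


def raise_row_by_row(conn, dept, pct):
    """Procedural style: fetch the keys, then issue one UPDATE per row."""
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM emp WHERE dept = ?", (dept,)
    )]
    for emp_id in ids:
        conn.execute(
            "UPDATE emp SET salary = salary * (1 + ? / 100.0) WHERE id = ?",
            (pct, emp_id),
        )
    conn.commit()
    return len(ids)


def raise_set_based(conn, dept, pct):
    """Set-based style: one UPDATE describes the whole change."""
    cur = conn.execute(
        "UPDATE emp SET salary = salary * (1 + ? / 100.0) WHERE dept = ?",
        (pct, dept),
    )
    conn.commit()
    return cur.rowcount
```

Both functions leave the table in the same state. The first crosses the boundary between application code and the SQL engine once per row, the second crosses it once.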
It is obvious what the attraction is for developers. That does not make a NoSQL product suitable for any given business. Sure, if the application is primarily concerned with the storage, retrieval and emendation of documents it probably makes more sense to use a product like CouchDB than to try to shred the document into relational tables. But if the application is highly transactional and/or handles valuable data then something like MongoDB is definitely a bad fit. To be fair MongoDB does list the applications for which it is less suited.
It comes down to understanding what is appropriate for the project in hand. That is an assessment which really belongs to the users, because they are the people who know - or at least ought to know - the value of the data to the business. The Daily WTF recently published this cautionary tale showing the consequences for a company and all its employees which ensued from underestimating the value of its data and disrespecting the importance of adequate data storage.
9 comments:
Excellent post. I've never really understood the all or nothing approach anyway. Linux OR Windows, C OR Java, etc.
Best tool for the job.
Although hibernate sucks! ;-)
LewisC
Amazon does have a "nosql" tool - Dynamo - which according to the Dynamo white paper is used quite a bit in their infrastructure. I assume they use relational databases too. That makes sense - use the right tool for the right job - and there are many jobs within the tech infrastructure of a company that size.
Excellent post and I agree with your sentiments about developers wanting to use the tools that they are most productive with.
I would like to add another reason why there is a tide against the SQL market - the scalability problems that they present. The question of whether data stores can be plugged into existing infrastructures is discussed here.
Just as an aside:
look at the description of memcache in the link you added and its further links.
In it we are told that database - note the lack of any qualification! - writers do block readers.
In other words: whoever wrote the memcache description has NEVER used Oracle, or they would know that its writers NEVER block readers!
But you can bet these guys have added Oracle into their list of dbs that "don't scale".
And folks wonder why I call these people total idiotic ignorants who are incapable of the most basic logic and informed argumentation?
Twitter is inappropriate with Google Apps
Noons: you are a hero!
Now if we can just convince metalink not to use flash...
word: wareurro
"On Radio Free Tooting, Andrew Clarke says, 'No SQL, so what?' taking as his keynote something Nuno Souto said: [...]"
Log Buffer #154
just stopped by from your stackoverflow thread answer for normalize or denormalize.
Thanks for sharing your thoughts. The recent social media companies have built a big hype about using a no sql type of option. However, the majority of applications may not really need that.