Thursday, October 10, 2013

T-Shirt slogans

One of the Cloudera chaps at the Oracle Big Data meetup had a T-shirt with this cool slogan:
Data is the new bacon
Even as a vegetarian I can appreciate the humour. However, I think it has a corollary, which would also make a good T-shirt:
Metadata is the new Kevin Bacon
Because metadata is the thing which connects us all.

Oracle Big Data Meetup - 09-OCT-2013

The Oracle guys running the Big Data 4 the Enterprise Meetup are always apologetic about marketing. The novelty is quite amusing. They do this because most Big Data Meetups are full of brash young people from small start-ups who use cool open source software. They choose cool open source software partly because they're self-styled hackers who like being able to play with their software any way they choose. But mainly it is because the budgetary constraints of being a start-up mean they have to choose between a Clerkenwell office and Aeron chairs, or enterprise software licenses, and that's no choice at all.

But an Oracle Big Data meetup has a different constituency. We come from an enterprise background, we've all been using Oracle software for a long time and we know what to expect from an Oracle event. We're prepared to tolerate a certain amount of Oracle marketing because we want to hear the Oracle take on things, and we come prepared with our shields up. Apart from anything else, the Meetup sponsor is always cut some slack, in exchange for the beer'n'pizza.

Besides, the Oracle Big Data Appliance is quite an easy sell, certainly compared to the rest of the engineered systems. The Exa stack largely comprises machines which replace existing servers, whereas Big Data is a new requirement. Most Oracle shops probably don't have a pool of Linux/Java/Network hackers on hand to cobble together a parallel cluster of machines and configure them to run Hadoop. A pre-configured Exadoop appliance with Oracle's imprimatur is just what those organisations need. The thing is, it seems a bit cheeky to charge a six-figure sum for a box with a bunch of free software on it. No matter how good the box is. Particularly when it can be so hard to make the business case for a Big Data initiative.

Stephen Sheldon's presentation on Big Data Analytics As A Service addressed exactly this point. He works for Detica. They have stood up an Oracle BDA instance which they rent out for a couple of months to organisations who want to try a Big Data initiative. Detica provide a pool of data scientists and geeks to help out with the processing and analytics. At the end of the exercise the customer has a proven case showing whether Big Data can give them sufficient valuable insights into their business. This strikes me as a really neat idea, one which other companies will wish they had thought of first.

Ian Sharp (one of the apologetic Oracle guys) presented on Oracle's Advanced Analytics. The big idea here is R embedded in the database. This gives data scientists access to orders of magnitude more data than they're used to having on their desktop R instances. Quants working in FS organisations will most likely have an accident when they realise just how great an idea this is. Unfortunately, Oracle R Enterprise is part of the Advanced Analytics option, so probably only the big FS companies will go for it. But the Oracle R distro is still quite neat, and free.

Mark Sampson from Cloudera rounded off the evening with a talk on a new offering, Cloudera Search. This basically provides a mechanism for building a Google / Amazon style search facility over a Hadoop cluster. The magic here is that Apache Solr is integrated into the Hadoop architecture instead of running as a separate cluster, and it comes with a UI building tool. I spent five years on a project which basically did this with an Oracle RDBMS, hand-rolled ETL and XML generators, and lots of Java code plumbing an external search engine into the front-end. It was a great system, loved by its users and well worth the effort at the time. But I expect we could do the whole thing again in a couple of months with this tool set. Which is good news for the next wave of developers.

Some people regard attending technical meetups as a bit odd. I mean, giving up your free time to listen to a bunch of presentations on work matters? But if you find this stuff interesting you can't help yourself. And if you work with Oracle tech and are interested in data then this meetup is definitely worth a couple of hours of your free time.