Thursday, July 11, 2013

UKOUG Analytics Event: a semi-structured analysis

Yesterday's UKOUG Analytics event was a mixture of presentations about OBIEE and sessions on the frontiers of data analysis. I'm not going to cover everything, just dip into a few things which struck me during the day.

During the day somebody described dashboards as "Fisher Price activity centres for managers". Well, Neil Sellers showed a mobile BI app called RoamBI which is exactly that. Swipe that table, pinch that graph, twirl that pie chart! (No really, how have we survived so long with pie charts which can't be rotated?) The thing is so slick, it'll keep the boss amused for hours. Neil's theme on the importance of data visualization to convey a message or tell a story was picked up by Claudio Bastia and Nicola Sandol. Their presentation included a demo of IConsulting's Location Intelligence extension for OBIEE. The tool not only does impressive things with the display of geographic data, it also allows users to interact with the maps to refine queries and drill down into the data. This is visualization which definitely goes beyond the gimmick: it's an extremely powerful way of communicating complex data sets.

A couple of presentations quoted the statistic that 90% of our data was created in the last two years. This is a figure which has been bandied about, but I've never seen a citation which explains who calculated it and what method they used (although it's supposed to have originated at IBM). It probably comes from the same place as most other statistics (and project estimates). What is the "data" the figure measures? I'm sure in some areas of human endeavour (bioinformatics, say, or CERN) the amount of data they produce has gone metastatic. And obviously digital cameras, especially on phones, are now ubiquitous, so video and photographs account for a lot of the data growth. But are selfies, instagrammed burgers and cute kittens really data? Same with other content: how much of this data explosion is mirroring, retweets, quoting, spam and AdSense farms? Not to mention the smut. Anyway, that 90% was first cited in 2012; it's now 2013 and somebody needs to invent (or rather, derive) a new figure.

The day rounded off with a panel and a user presentation. Toby Price opened the Q&A by asking Oracle's Nick Whitehead: how does Hadoop fit into an Oracle estate? It's a good question. After all, Oracle has been able to handle unstructured data, i.e. text, since the introduction of ConText in 8.0 (albeit as a chargeable extra in those days). And there's nothing special about MapReduce: PL/SQL can do that. So what's the deal with Hadoop? Here's the impertinent answer to this pertinent question: Hadoop allows us to run massively parallel jobs without paying Oracle's per-processor licenses. Let's face it, not even Tony Stark could afford to run a thousand-core database.
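
To show how little magic there is in the MapReduce pattern itself, here is a minimal single-process sketch of a word count in Python. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are my own illustration, not Hadoop's or Oracle's API; what Hadoop brings to the party is running these phases across lots of machines, which this sketch makes no attempt to do.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group the emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into a single total."""
    return {
        key: reduce(lambda total, value: total + value, values)
        for key, values in grouped.items()
    }


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of strings."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle(pairs))
```

Calling `word_count(["big data big", "data"])` counts each word across both strings. The map step looks at one document at a time and the reduce step at one key at a time, which is what makes the pattern easy to parallelise.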

The closing session was a presentation from James Wyper & Dirk Shelley about upgrading the BI architecture at John Lewis Partnership. They described it as a war story, but actually it was a report from the front lines, because the implementation is not yet finished. James and Dirk covered the products - which ones worked as advertised, which ones gave them grief (integration was a particular source of grief). They also discussed their approach to the project, relating what they did well and what they would do differently with the advantage of hindsight. This sort of session is the best part of any user group: real users sharing their experiences with the community. We need more of them.

