Sramana Mitra: What is your estimate of the visualization side of the equation? What tools and technologies are you using in your work from the visualization tool kits?
Oliver Downs: That is a good question and also always a challenge: making sure that the findings you have are accessible to the type of customer you have. With our business, our sale is not to the CIO, it is to the CMO’s organization. We have been making use of some of the advances in dynamic JavaScript, with D3 as a good example. From an ad hoc perspective, we use some of the interactive tools that are available on Typhoon [Map]. Prototyping visualizations means getting them out there quickly, which allows us to get interesting feedback on them and then perhaps bring them into a more elegant and sophisticated-looking visual experience using D3.
SM: It sounds like your strategy on visualization is that you are developing the actual visualizations on top of native visualization technologies. You are not really using off-the-shelf engines like Tableau and so forth. Is that correct?
OD: We do make use of Tableau for some light visualization. But we find that some of the sophistication we have around behavior analytics, clustering, and behavioral sequences is still hard to represent in Tableau. That said, we represent a common use case for Tableau.
SM: Is there any visualization provider or vendor that is capable of representing those?
OD: That is a good question. If I say no, I will likely be proven wrong. I haven’t come across anyone, but that doesn’t mean there isn’t someone.
SM: Unfortunately, in this case you may be right. Because I have heard from several people that they are looking for better visualization tools and not finding them. Existing tools have severe limitations when it comes to the kinds of representations that you guys are trying to do in the big data world.
OD: Product visualizations with D3: It is an elegant framework and creates great experiences, but the engineering effort required to get that high-quality experience and output from it is still quite high.
SM: So you are saying you do a [representation] on a specific problem and a specific data set, and you can’t really productize it that easily.
OD: That is correct.
SM: What other interesting topic within this general domain would you like to discuss?
OD: We are still at an early stage in terms of machine learning and visualization. Big data for most companies today really still means storage, light querying, and perhaps aggregation. We have a huge ecosystem now, and a multi-billion dollar business emerging mainly around Hadoop-style storage and aggregation systems. But we have not seen advances in driving meaningful insights from a lot of this data.
SM: That is what I am interested in, and I have been talking to a lot of people who share that point of view. If you really understand how to drive insights out of big data and are able to apply machine learning algorithms to it, the storage part is not that critical. The storage is not the differentiating factor.
OD: That is right. The interesting lesson about storage has been that the things people had to do in the 1970s and 1980s, with early Unix systems, to efficiently manipulate files in the file system turned out to be the same things that work really well now, even though we are much less constrained in terms of hardware and storage capability. It is just that today we have a much larger amount of data compared to then.
SM: We are very much on the same page on that. I really enjoyed this conversation.
OD: Likewise, I very much enjoyed this conversation, too.
This segment is part 5 in the series: Thought Leaders in Big Data: Interview with Oliver Downs, SVP of Data Sciences, Globys