Robert Youngjohns is the senior vice president and general manager of HP Autonomy. Simply put, Autonomy helps organizations understand the meaning of information by applying big data technologies. Robert previously worked at Microsoft and has more than 30 years of experience in the technology industry. In this interview, Robert talks about Autonomy’s role within HP and provides an interesting proposal for entrepreneurs in this space.
Sramana Mitra: Robert, let’s start with setting some context about HP Autonomy. Of course, the readership knows very well about HP and the acquisition of Autonomy. Where are things right now?
Robert Youngjohns: I am a relative newcomer. I have been at Autonomy and HP for six months. Prior to that I was at Microsoft, and before that I had my own private software company in Silicon Valley. So, I have a pretty long history in the IT industry. Putting aside press releases about the acquisition cost, the way we look at it from an HP perspective is that we acquired some serious assets, which are broadly targeted at what people now think of as big data. They fit into three prime categories. The first category is “How do we solve big business problems that derive from large quantities of unstructured data?” I use the word “unstructured” there, but internally we describe it as “human information,” which I think describes it better. We are talking about everything from text to video to voice, which is something large corporations are accumulating at a massive rate right now. They all have the suspicion that there is something valuable in there, but they are not quite sure what, and they are not sure how to turn it into corporate value.
That is the sort of business problem we are trying to tackle. So, the first point of business is to go straight at that problem and use the software we have, which is predominantly based on a part called “Idol,” to help people derive value from and meaning out of all that unstructured data they have. It could be an incoming video feed. If you have 10,000 security cameras, you have to monitor the output of those cameras. This is a big data problem, because if you digitize that data, you are talking about petabytes of data, and then it is all being fed into a control center, where a few individuals are looking at it and trying to figure out if anything significant is happening. That is a problem you can automate. With our technology you can say, “That door shouldn’t have been open between 5 o’clock and 6 o’clock. But it was open, and the person who came through that door wasn’t someone we recognize in our facial database.” That is just an example of a human information–orientated big data problem.
Another example could be how to detect fraud in an insurance company, which is quite complicated. Lots of cases of fraud are caused by duplicate claims. How do you detect duplicate claims? People usually are smart enough not to use the same details twice, but how can you detect patterns in claims? You could say, “These two claims are likely to be the same claim. They use different names and phone numbers, but they spelled some words in a similarly curious way.” We have a team that is focused on that, and there are a whole bunch of application areas we can talk about.
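The duplicate-claim idea above can be illustrated with a toy sketch. It is not Autonomy's method and does not use any Idol API; the function names, the tiny vocabulary, and the sample claims are all assumptions made up for the example. It simply treats any word missing from a reference vocabulary as a spelling quirk and measures how many quirks two claims share.

```python
import re


def unusual_words(text, vocabulary):
    """Return the lowercase words in `text` that are missing from `vocabulary`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {token for token in tokens if token not in vocabulary}


def shared_quirk_score(claim_a, claim_b, vocabulary):
    """Jaccard overlap of the out-of-vocabulary words in two claims (0.0 to 1.0)."""
    quirks_a = unusual_words(claim_a, vocabulary)
    quirks_b = unusual_words(claim_b, vocabulary)
    if not quirks_a or not quirks_b:
        # No unusual spellings on one side means there is nothing to compare.
        return 0.0
    return len(quirks_a & quirks_b) / len(quirks_a | quirks_b)


# Hypothetical reference vocabulary; a real system would use a full dictionary.
vocabulary = {
    "my", "the", "was", "damaged", "in", "a", "on", "freeway", "suffered",
    "damage", "after", "near", "bridge", "vehicle", "collision",
}

# Hypothetical claim texts: the first two share the same two misspellings.
claim_a = "My vehical was damaged in a colision on the freeway"
claim_b = "The vehical suffered damage after a colision near the bridge"
claim_c = "My vehicle was damaged in a collision on the freeway"

print(shared_quirk_score(claim_a, claim_b, vocabulary))
print(shared_quirk_score(claim_a, claim_c, vocabulary))
```

In this sketch, names and other legitimate words absent from the vocabulary would also count as quirks, so a realistic version would need a much larger dictionary and some weighting by how rare each misspelling is.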
The second point deals with a specific corporate problem. You have all this information spilling around your organization, and you are increasingly under regulatory requirements on how you look after that data, how it is archived, how you discover it, how you delete it, etc. This is a big data problem. We keep email archives for some of the largest banks in the world. I met with one of them a while ago, and they now have over a trillion emails in their archive, and they are adding another 70 billion to 80 billion emails a month. What you need to do here is search that archive, and if something goes wrong you need to have tools to search through those one trillion emails and work out if somebody did the wrong thing at the wrong time or sent the wrong email, etc.
These are big data problems, and we try to wrap them up into something we call “machine-augmented information management and information governance.” You have such vast information in corporations, and simply managing it is turning out to be a huge issue.
The third thing we are working on here at Autonomy is slightly different from all this. But in a way it is another big data problem, and that is, “Where do we go with web content management platforms?” Most large corporate websites are integrated with data they need to look after, manage, maintain, and present. But more important, they are all struggling with a problem: “As users hit my website, how do I make sure that the information I present back to them is more interesting and more likely to get them to buy on my website?”
We are working on a big data problem here, which is how to do real-time analytics as a user is hitting our website: analyzing the context in which the user has come to the website, their previous buying history, what is happening at the moment in the social media feeds, and so on, so that we can then formulate the right offer for that visitor at that time. It is a disparate look at what big data is all about. It is founded on an engine that we developed a few years back called “Idol.” Then we built application families on top of that engine.
This segment is part 1 in the series: Thought Leaders in Big Data: Interview with Robert Youngjohns, SVP and GM at HP Autonomy