Thought Leaders in Big Data: Interview with Brian Bulkowski and Srini Srinivasan, Co-founders of Aerospike (Part 2)

Posted on Wednesday, Jun 19th 2013

Sramana Mitra: Real-time is one issue, and the ability to distribute the data between Flash and the hard disk effectively [is another].

Srini Srinivasan: It is actually Flash and main memory. We have a hybrid system based on DRAM and flash. We can also work with rotational disk, but this is old technology.

Brian Bulkowski: How we think about ourselves is that we are an in-memory solution. But we have to broaden the idea of in-memory to include Flash. Flash is silicon, it is chips, it is Moore’s law, just like DRAM is. And it is persistent as well. But for rotational disks, you apply very different algorithms. We have customers running pure in-memory and we have customers running data in flash as well. It is all about different kinds of indexes and database strategies around in-memory. Core algorithms have to change.

SS: And they leverage the way SSDs or flash work. SSDs have been improving every year since we released our product. That means more and more of our interactions can happen with the flash disks over time. What this means is we are able to run through hundreds of thousands of transactions per second and provide you with DRAM-like performance with flash. Now you have a high-scale server that you can run the way Google does, but with a much smaller footprint.

BB: What that has meant practically is that DRAM, even though prices are coming down, is still in the $30-$35 a gigabyte range. Buying a terabyte of it in a single machine costs about $50,000. Powering it costs another $100,000 a year. Flash’s economics are completely different. It is about $15,000 for a four-terabyte machine. Power consumption is negligible. It is back down to $1,000 per month.
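To put the quoted hardware prices side by side, here is a minimal sketch that converts them to a cost per gigabyte. The dollar figures are the ones given in the answer above; the use of 1,024 GB per terabyte and all variable names are assumptions made for illustration, and power costs are left out because the interview does not state them consistently.

```python
# Hardware figures as quoted in the interview (2013 prices).
GB_PER_TB = 1024  # assumption: binary terabytes

DRAM_MACHINE_COST = 50_000   # dollars, 1 TB of DRAM in a single machine
FLASH_MACHINE_COST = 15_000  # dollars, 4 TB flash machine


def cost_per_gb(machine_cost, terabytes):
    """Return the hardware cost in dollars per gigabyte of capacity."""
    return machine_cost / (terabytes * GB_PER_TB)


dram_per_gb = cost_per_gb(DRAM_MACHINE_COST, 1)
flash_per_gb = cost_per_gb(FLASH_MACHINE_COST, 4)
ratio = dram_per_gb / flash_per_gb

print(f"DRAM machine:  ${dram_per_gb:.2f} per GB")
print(f"Flash machine: ${flash_per_gb:.2f} per GB")
print(f"DRAM costs about {ratio:.1f}x more per GB")
```

On these numbers the DRAM machine works out to roughly $49 per gigabyte and the flash machine to under $4, a gap of about 13x in hardware cost alone.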

SM: You are saying that in this segment, which consists of highly real-time-oriented applications, the server is effectively a Flash server?

BB: We don’t call it that. We call it in-memory. And in-memory is DRAM in some cases and flash in others. But Flash has such great advantages right now in terms of price-performance.

SM: So my statement is accurate.

SS: It is. The innovation here is to harness that power to produce Internet scale and reliability. The innovations we have made in terms of distribution are unique. We published them at research conferences like RealDB. All of that brings you a very high-quality product that runs itself.

SM: Let’s talk about customers. You said already that advertising and media are where you are seeing the maximum uptake right now. E-commerce is probably another if there are a lot of recommendations going on.

BB: E-commerce has a lot of angles to it. Some e-commerce companies are doing deeper integration with their advertising partners. There is even an advertising angle straight from e-commerce. Some of them are doing dashboarding and want better real-time dashboarding. There is a real-time aspect there. Some of them are doing predictive recommendations. There is a lot of action in so many companies that focus on real-time, not in terms of milliseconds, but in terms of human interactions. “I was here a second ago, I hit refresh, what do I show now?”

SM: But the “real” real-time use case is advertising.

SS: Advertising is the first use case where our technology has dominated.

SM: Let’s talk a bit about whatever customers you are willing and able to discuss. Talk about two or three customers and what you are doing.

BB: Our first customer has been live for three years now. This is one of our largest customers, and it has been very gratifying to watch them grow over time. They have doubled repeatedly. It is a company called AppNexus out of New York. They are an advertising platform company for real-time bidding. When an impression has to be filled, they run auctions among many companies. The winning bid comes back and fills the ad in that particular moment.
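The auction step described above can be sketched in a few lines. This is a simplified illustration, not AppNexus's actual mechanism: the `Bid` fields, the optional floor price, and the highest-bid-wins rule are all assumptions, and real exchanges may use different pricing rules.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Bid:
    bidder: str       # hypothetical bidder name
    price_cpm: float  # bid price per thousand impressions
    ad_id: str        # creative to show if this bid wins


def run_auction(bids: List[Bid], floor_cpm: float = 0.0) -> Optional[Bid]:
    """Return the highest bid at or above the floor, or None if none qualify."""
    eligible = [b for b in bids if b.price_cpm >= floor_cpm]
    if not eligible:
        return None
    return max(eligible, key=lambda b: b.price_cpm)


bids = [
    Bid("bidder-a", 1.20, "ad-101"),
    Bid("bidder-b", 2.75, "ad-202"),
    Bid("bidder-c", 0.40, "ad-303"),
]
winner = run_auction(bids, floor_cpm=0.50)
print(winner.ad_id if winner else "no fill")
```

In a real-time bidding setting this whole round trip has to finish while the page is loading, which is why the lookups behind each bid need to be fast.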

SM: This is a highly real-time use case.

BB: It is a highly real-time use case, and they're aggregating load and impressions over many sites, a large percentage of the entire Internet. Their customers rely on AppNexus and their front edge to be up constantly. If they are down, a large percentage of the Internet isn’t seeing PSAs or average ads. What we do with them is provide a front-end component that stores user profile information and recent actions to allow things like re-targeting. When any ad request comes in, they need to look up models that were computed in Hadoop: what this person might like versus not like, what they did recently, etc. They do a database lookup in an Aerospike cluster. They have many terabytes of storage, they are tracking billions of users, and they are doing it in real time across multiple data centers. They have done so without any outage or downtime.
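The lookup flow described here, a profile store keyed by user that holds recent actions and precomputed model scores, can be sketched as follows. This is a rough illustration under stated assumptions: `ProfileStore` is a plain in-process dictionary standing in for a key-value cluster, not the Aerospike client API, and the record layout, function names, and selection rule are hypothetical.

```python
class ProfileStore:
    """Dict-backed stand-in for a distributed key-value store (assumption)."""

    def __init__(self):
        self._records = {}

    def put(self, user_id, record):
        self._records[user_id] = record

    def get(self, user_id):
        # Returns None when the user has no stored profile.
        return self._records.get(user_id)


def choose_ad(store, user_id, candidates, default_ad):
    """Pick an ad for a user from `candidates` (category -> ad id)."""
    profile = store.get(user_id)
    if profile is None:
        return default_ad
    # Re-targeting: prefer a category the user acted on recently.
    for category in profile["recent_actions"]:
        if category in candidates:
            return candidates[category]
    # Otherwise fall back to the best precomputed model score.
    scores = profile["model_scores"]
    best = max(candidates, key=lambda c: scores.get(c, 0.0))
    return candidates[best]


store = ProfileStore()
store.put("u1", {"recent_actions": ["shoes"],
                 "model_scores": {"travel": 0.9, "shoes": 0.2}})
store.put("u2", {"recent_actions": [],
                 "model_scores": {"travel": 0.9, "shoes": 0.2}})

ads = {"travel": "ad-travel", "shoes": "ad-shoes"}
print(choose_ad(store, "u1", ads, "ad-default"))
print(choose_ad(store, "u2", ads, "ad-default"))
print(choose_ad(store, "unknown", ads, "ad-default"))
```

The model scores would be written in bulk by an offline job (Hadoop, in the interview), while recent actions are updated as the user browses; the read path at ad-request time is a single key lookup.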

This segment is part 2 in the series: Thought Leaders in Big Data: Interview with Brian Bulkowski and Srini Srinivasan, Co-founders of Aerospike