Brian Bulkowski and Srini Srinivasan are co-founders of Aerospike, a real-time NoSQL database. Brian and Srini have more than 20 years of experience in the IT industry, having worked for companies like IBM, VerdiSoft, Copernicus, and iControl Networks. In this interview they talk about Aerospike’s real-time software solution and what makes it unique, and discuss opportunities for entrepreneurs to fill gaps in this space by building on top of their platform.
Sramana Mitra: Let’s start with some background on you as well as the company. Tell us what you do, what problem you solve, and who the customers are.
Srini Srinivasan: My background is a PhD in databases from Wisconsin. I worked at IBM and a few startups. Eventually, when I was working in the Internet industry, especially at Yahoo, I noticed a lot of Internet-scale problems being solved inside companies like Google, Facebook, or Yahoo using what we call real-time transactional databases. These kinds of databases were not available to anybody starting a new company. Everybody had to build their own real-time transactional database. The idea for the company came when I [first] met Brian. Then we met again while I was at Yahoo about four years ago. We discovered a mutual interest in taking the kind of technology used inside Internet companies and building it as a product for people outside them.
SM: And what were you doing when you reconnected?
Brian Bulkowski: I started this project about one year before Srini joined. We built out some fundamental technologies around this idea.
SM: So you were already on the same idea?
BB: Exactly. The funny story is that when Srini and I reconnected, he asked, “Brian, what are you doing with a database company? You are a networking and distributed systems guy.” Then I said, “That is the new revolution. That is where direct-attached storage, networking, and distributed systems need to merge with the database technologists in order to create the new revolution in databases.”
I was at one of these Internet-scale companies doing advertising work. They brought me in to lead a team on a real-time distributed recommendation system. What I saw was a company that was really struggling with operations at scale. They were using the typical technologies from five to ten years ago, which included MySQL with a lot of cache layers, and they were having a lot of trouble meeting their goals for their board and employees. A huge amount of resources was put into keeping their systems up, simply recording their data and trying to do these kinds of analyses. A lot of resources went into that front-edge component, where reliability is key, because we all expect a service on the Internet to be up. Yet with minds working on distributed systems and networking, combined with database expertise, we can build a highly reliable system that is also able to handle the peak loads the Internet can generate.
SM: We have looked at a bunch of different platform companies that are supporting the big data industry. There are a whole lot of companies that are talking about the scale issue. It sounds like, for you, reliability in handling peak loads is the key value proposition. Is that observation correct?
BB: That is where we started. There is so much benefit in doing operations with a very high level of predictability. One of our advisors, whom Srini worked with, said, “This is what we did with DB2, building modern storage layers, which is a primary key database, and then you can see where the business takes you.” With DB2 it took them from IMS into SQL. That was the correct interface at the time. Today we have a more polyglot interface. We have graph interfaces, SQL, MapReduce, streaming analytics, etc. Our goal is to build a core data engine and be able to attack new problems vertical after vertical. The first one we attacked has been advertising and media, because that has the highest read and write load. But we are positioned to be able to take on more and more of those data types.
SS: We have also leveraged the growth of SSDs, or flash memory. One of the prerequisites for a startup is applications that need this level of immediacy. The Internet has lots of those. Then you need innovations in networking, databases, and self-management. That is the kind of software we provide. But there is an even harder component to it. That is where flash memory comes in. Databases, over the years, have essentially been software written to work around the extremely long times it took to get data in and out of disks. With flash, those times became a lot closer to DRAM, so by rewriting the data platform you can get a 100x performance improvement. First of all, we reduce the cost of a deployment by 10 to 20 times. But that is not even the important thing. The important thing is you can solve higher-scale problems on smaller hardware. That is where big data comes in. Real-time big data with predictable performance is what we have made.
This segment is part 1 in the series: Thought Leaders in Big Data: Interview with Brian Bulkowski and Srini Srinivasan, Co-founders of Aerospike