Sramana Mitra: That brings me to my next question. What kinds of use cases were your founders seeing? Let’s get into some of those use cases where you as a company bring special value.
Billy Bosworth: I will give you one which is a pure Cassandra use case. This was a very early customer, Netflix. When they needed to change the viewer experience – when viewers were using Netflix streaming services – they turned to Apache Cassandra. They needed a technology that could do several things. First, it had to be able to scale in what is called a linear fashion. What that means is that capacity grows in proportion to the resources you add to the technology. In our case those resources are what we call nodes – you can think of a node as a machine. One of the challenges with older technologies, and even with some of the newer ones, is that when you add a machine to the cluster, you don’t know what kind of scaling you are going to get. In other words, it starts to top out, or you reach diminishing returns when you get to a certain number of nodes. Netflix didn’t want that. They wanted something predictable, dependable, and massively scalable, so they knew that whenever they added another machine, they would get exactly the additional computing power they had calculated. That is what they did.
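The contrast between linear scaling and diminishing returns can be made concrete with a small sketch. Everything here is illustrative: the per-node throughput figure and the `contention` penalty are assumed numbers, and the diminishing-returns formula is a simple hypothetical model, not a measurement of Cassandra or of any other database.

```python
def linear_capacity(nodes: int, per_node_ops: float) -> float:
    """Linear scaling: every node adds the same, predictable capacity."""
    return nodes * per_node_ops


def diminishing_capacity(nodes: int, per_node_ops: float,
                         contention: float = 0.05) -> float:
    """Hypothetical model of a system that tops out.

    Each extra node adds coordination overhead, so the cluster's total
    capacity grows more slowly than the node count.
    """
    return nodes * per_node_ops / (1 + contention * (nodes - 1))


if __name__ == "__main__":
    per_node = 1000.0  # assumed operations per second for one node
    for n in (1, 10, 50, 100):
        print(n, linear_capacity(n, per_node),
              round(diminishing_capacity(n, per_node)))
```

With the linear model, planning is simple arithmetic: doubling the nodes doubles the capacity. With the second model, the gain from each added node shrinks as the cluster grows, which is the unpredictability described above.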
Netflix is interesting because they run their entire streaming service in the Amazon cloud. They are a 100% cloud infrastructure, and they take our technology and run it in Amazon. But they run it across multiple Amazon high availability regions, which is point number two. They, like the vast majority of our customers, need continuous availability. Don’t think about 9-to-5 availability or fast recovery times. The goal is to be continuously available. To do that, you need protection in the way you build your architecture. Netflix could not rely on just a single Amazon high availability zone; they needed to go across multiple Amazon high availability zones so that if they lost one, the service would continue to run. That is another strength of our technology. It is called multi-data center, where you can take a single database cluster and span it across multiple data centers.
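The idea of surviving the loss of one zone can be sketched in a few lines. This is a toy model under stated assumptions, not Cassandra's replication logic: the zone names, the `place_replicas` and `can_serve` helpers, and the replica counts are all hypothetical.

```python
def place_replicas(zones: list[str], replicas_per_zone: int) -> dict[str, int]:
    """Put the same number of copies of the data in every zone."""
    return {zone: replicas_per_zone for zone in zones}


def available_replicas(placement: dict[str, int], failed: set[str]) -> int:
    """Count the copies that are still reachable after some zones fail."""
    return sum(count for zone, count in placement.items() if zone not in failed)


def can_serve(placement: dict[str, int], failed: set[str],
              required: int = 1) -> bool:
    """The service keeps running while enough copies remain reachable."""
    return available_replicas(placement, failed) >= required


if __name__ == "__main__":
    spread = place_replicas(["zone-a", "zone-b", "zone-c"], 1)
    single = place_replicas(["zone-a"], 3)
    print(can_serve(spread, {"zone-a"}))  # one zone lost, others still answer
    print(can_serve(single, {"zone-a"}))  # all copies were in the lost zone
```

The point of the comparison is that three copies in one zone give no protection against losing that zone, while one copy in each of three zones does.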
The third thing Netflix needed was to be up and running in multiple regions very quickly. One of the quotes they used was that they can be up and running in a new part of the world in 15 minutes, because now they have a fully distributed architecture ready to go. All they have to do is load it in a data center that is located close to the users in question.
The reason Netflix had these architecture demands was that they wanted to change the way people experience watching a video. For example, if you are watching something on Netflix through your Wii game console and you stop the show or movie, pick up your mobile device, head to the gym, and want to pick up where you left off, it knows where you were. How does it know that? It knows because in Cassandra they are constantly tracking where you are and what you are viewing. Then they want to know what movies you watched before and what movies they think you want to watch after, so they can make very good recommendations for you. It is not about the movie itself. Those bits are delivered through a content delivery service. It is about all the metadata about your viewing experience. This data is stored, retrieved, and analyzed in Cassandra. It is all about that real-time experience, capturing everything there is to know about it so that you as a viewer have a pleasant viewing experience and so that they can recommend to you what you might like next after you watch a certain class of videos.
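The "pick up where you left off" behavior comes down to keeping the latest playback position per viewer and title, readable from any device. The sketch below is an in-memory stand-in, not Netflix's schema or a Cassandra client: the `PlaybackStore` class, its field names, and the sample identifiers are all hypothetical. It keeps the write with the newest timestamp, which mirrors the last-write-wins rule Cassandra uses to resolve conflicting writes.

```python
class PlaybackStore:
    """Toy store of the latest playback position per (user, title)."""

    def __init__(self) -> None:
        # (user_id, title_id) -> (written_at, position_seconds, device)
        self._rows: dict[tuple[str, str], tuple[int, int, str]] = {}

    def record(self, user_id: str, title_id: str, position_s: int,
               written_at: int, device: str) -> None:
        """Save a position update; the newest timestamp wins."""
        key = (user_id, title_id)
        current = self._rows.get(key)
        if current is None or written_at >= current[0]:
            self._rows[key] = (written_at, position_s, device)

    def resume_position(self, user_id: str, title_id: str) -> int:
        """Return where to resume, or 0 if the title was never started."""
        row = self._rows.get((user_id, title_id))
        return row[1] if row else 0


if __name__ == "__main__":
    store = PlaybackStore()
    # The viewer stops a movie on the game console...
    store.record("user-1", "movie-42", position_s=1200, written_at=100,
                 device="console")
    # ...and a phone at the gym asks where to resume.
    print(store.resume_position("user-1", "movie-42"))
```

Because the position is keyed by viewer and title rather than by device, any device can read it back. Only this metadata lives in the store; the video itself is delivered separately, as described above.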
This segment is part 2 in the series: Thought Leaders in Big Data: Interview with Billy Bosworth, CEO of DataStax