John Price: Back to your question, there were no platforms out there to do this with. We had to start with the raw tools that were out there. Then we had to start with the data and focus on our core industry, automotive. The data is a mess. It’s all over the place. It’s very hard to get. When you get it, it’s dirty and incomplete. A lot of times, it’s just flat-out wrong.
We spent a decade not only aggregating but also normalizing all that data across all the various data platforms that are out there. We have a very robust ruleset to help complete data where it is incomplete. We’ve had to develop a lot of AI-based image technology so that, if you can’t figure out what’s on the car from the data, we look at the picture and try to derive it from there to complete the dataset. Then we have the next challenge. We have to compare one unique piece of inventory to another unique piece of inventory, but in a way that can be reasoned over objectively.
In order to do that, you have to have very advanced normalization of these disparate datasets. Then you need layers of analytics and insights that actually allow you to do legitimate comparisons. This, then, was the Big Data and AI part of our stack. We use all the traditional tools out there that you’re familiar with. That said, nothing was really designed for this particular problem.
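As a rough illustration of the normalization and rule-based completion described above, here is a minimal sketch in Python. It is not Vast's implementation: the alias table, the trim rule, and the field names (`make`, `model`, `trim`, `transmission`) are all hypothetical and chosen only for the example.

```python
from typing import Optional

# Hypothetical alias table: maps messy source spellings to one canonical make.
MAKE_ALIASES = {
    "chevy": "Chevrolet",
    "chevrolet": "Chevrolet",
    "vw": "Volkswagen",
    "volkswagen": "Volkswagen",
}

# Hypothetical completion rules: (make, model, trim) -> fields implied by the trim.
TRIM_RULES = {
    ("Chevrolet", "Malibu", "LT"): {"transmission": "automatic"},
}


def normalize_make(raw: Optional[str]) -> Optional[str]:
    """Return a canonical make name, or None if the source value is empty."""
    if raw is None:
        return None
    cleaned = raw.strip()
    if not cleaned:
        return None
    # Unknown makes fall back to title case rather than being dropped.
    return MAKE_ALIASES.get(cleaned.lower(), cleaned.title())


def normalize_listing(listing: dict) -> dict:
    """Normalize one listing and fill missing fields from the rule table."""
    out = dict(listing)  # leave the caller's record untouched
    out["make"] = normalize_make(listing.get("make"))
    implied = TRIM_RULES.get((out.get("make"), out.get("model"), out.get("trim")), {})
    for field, value in implied.items():
        # Rules only fill gaps; they never overwrite data the source supplied.
        if out.get(field) in (None, ""):
            out[field] = value
    return out


raw = {"make": " chevy ", "model": "Malibu", "trim": "LT", "transmission": ""}
print(normalize_listing(raw))
```

The design choice worth noting is that the rules only fill gaps. A value that arrived from the source is kept, so a wrong rule cannot silently replace data that was actually supplied.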
Above our AI stack, we then had to build all of our own instrumentation of those analytics and insights into the applications that we’re powering for our customers. We had to build a full stack from data to every interaction: natural language, chat, app, dealer website. We had to instrument all that with the data being refreshed every 15 minutes. To give you an idea of the scale of the data we’re talking about, we’ve seen over 200 million events over the last 10 years. When you think about that, it represents about 80% of all vehicles sold in the last 10 years.
On top of that, every vehicle has 1,200 features. We’re taking in about 7 to 10 million listings daily and putting them through that factory and stack that I was referring to. You take all the supply data and combine it with an equally impressive amount of demand data. We’ve seen over 500 million searches on all the automotive sites that we engage with. We have 7,000 dealer sites.
We take all of this, and it drives back through the training, through the machine learning, and down into the stack, and then the analytics bubble back up. The analytics in the stack provide very exciting, engaging content to help with big purchase decisions, whether you’re talking or typing in natural language, or looking for recommendations for vehicles similar to one you find interesting.
We refer to that problem as the Pandora of cars. We hold the exclusive patent on it. We call that our Similarities patent. Just like Pandora, we actually break every single vehicle down to its core features, its local market, and allow you to build back similar vehicles across that market that you would be interested in. This stack is extremely comprehensive. We use everything in open source. We use a lot of homegrown stuff.
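To make the idea of breaking vehicles down into features and rebuilding similar ones within a local market more concrete, here is a minimal sketch. It is not the patented Similarities method, whose details are not given here. It swaps in plain Jaccard similarity over feature sets, and the `Vehicle` type, its fields, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vehicle:
    vin: str              # hypothetical identifier
    market: str           # hypothetical local-market label
    features: frozenset   # hypothetical feature tags for the vehicle


def jaccard(a: frozenset, b: frozenset) -> float:
    """Shared features divided by all features seen on either vehicle."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def similar_vehicles(target: Vehicle, inventory: list, top_n: int = 3) -> list:
    """Rank same-market vehicles by feature overlap with the target."""
    candidates = [
        v for v in inventory
        if v.market == target.market and v.vin != target.vin
    ]
    scored = [(jaccard(target.features, v.features), v) for v in candidates]
    # Highest score first; ties are broken by VIN so the order is stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1].vin))
    return scored[:top_n]


target = Vehicle("T", "austin", frozenset({"awd", "sunroof", "leather"}))
inventory = [
    Vehicle("A", "austin", frozenset({"awd", "sunroof"})),
    Vehicle("B", "austin", frozenset({"awd"})),
    Vehicle("C", "dallas", frozenset({"awd", "sunroof", "leather"})),
]
for score, vehicle in similar_vehicles(target, inventory):
    print(vehicle.vin, round(score, 2))
```

Vehicle C matches the target exactly but sits in a different market, so it is filtered out before scoring. A production system working with 1,200 features per vehicle would presumably weight features rather than treat them all equally.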
We are in the cloud. Even AWS is not made for this problem as well as we would like it to be. We’re just pushing the boundary on the amount of data that we have. We’ve always been on the edge of what technology is capable of doing here. We’re continuing to push as hard as we can.
This segment is part 2 in the series: Thought Leaders in Artificial Intelligence: John Price, CEO of Vast