Sramana Mitra: Tell me a bit about the structure of these. As you were evolving from these pilots to actual deployments, it sounds like your platform was strengthening. Who was doing the application layer? Was a part of your team doing the application layer, or was the customer’s team doing it?
Anthony Scodary: Yes, that’s a good question. It was us. Evan, our CEO, has a product design background. We’ve always been very product focused. One problem in the machine learning space, then and now, is that it’s very technical to build your own state-of-the-art machine learning models. There is this expectation that you just wrap your model in an API, and you’ll provide value as a Lego set.
Productization and being able to focus have always been very important for startups, but they’re doubly important for machine learning, because there’s a very tight relationship between focusing your product and focusing what your model must do. That’s really important in machine learning, because you essentially reduce the variance of your data by restricting exactly what the product does. If all a model ever has to do is summarize call center conversations or identify entities in call center conversations, then you’ve really limited what the product needs to do. You also limit the variance in what the model needs to do, which means that, per parameter, you need less data to get to the same level of performance.
When it came to building out the application for Sift, we not only built the whole web app, but we also built a lot of supporting technology. We had to build our own telephony switch from scratch. We’re probably the only company in the world running neural networks on telephony switches that we built ourselves. The reason we did that is that we had very tight constraints on latency and on the quality of the media.
We also had to do all these integrations with telephony systems, like SBCs, ACDs, and PBXs, and build our own search back end. In general, a big part of how our analytics product, the Sift platform, is used is as a search index. A lot of our initial product development was in semantic search. When you’re trying to find all the calls where someone complains about a late bill, there are hundreds of thousands of ways they could phrase the same idea in just one utterance. So, semantic search was really important. One of the first pieces of core technology we invented at Gridspace was an efficient way of indexing and then retrieving things semantically.
A big part of what made the application distinct is a product we called Scanner, where you could type a query in natural language, and it would find things that were semantically related, by essentially indexing the sequence of embeddings and then using signal processing to measure the similarity between your query and a segment of a conversation. But we had to invent a lot of stuff and build the search index, right? That included building the entire information retrieval system, which is now used in our bot as well.
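To make the idea concrete, here is a minimal sketch of embedding-based semantic search over a conversation, in the spirit of what Scanner is described as doing. Everything here is an assumption for illustration: the `embed` function is a toy stand-in (hashed character trigrams projected to a small vector), not Gridspace’s actual model, and the real system compared queries against segments of the embedding sequence with signal-processing techniques, whereas this sketch simplifies to one cosine-similarity score per utterance.

```python
import numpy as np

rng = np.random.default_rng(0)
_proj = rng.normal(size=(256, 64))  # fixed random projection (toy embedding)

def embed(text: str) -> np.ndarray:
    """Toy embedding: count hashed character trigrams into 256 buckets,
    project to 64 dimensions, and L2-normalize."""
    counts = np.zeros(256)
    for i in range(len(text) - 2):
        counts[hash(text[i:i + 3]) % 256] += 1
    v = counts @ _proj
    return v / (np.linalg.norm(v) + 1e-9)

def index_conversation(utterances):
    """Precompute one embedding per utterance (the 'sequence of embeddings')."""
    return np.stack([embed(u) for u in utterances])

def search(index, query, top_k=2):
    """Score every utterance by cosine similarity to the query embedding
    and return the top_k (utterance_index, score) pairs."""
    q = embed(query)
    scores = index @ q  # rows are unit vectors, so dot product = cosine
    best = np.argsort(scores)[::-1][:top_k]
    return [(int(i), float(scores[i])) for i in best]
```

The point of the design is that retrieval never compares raw strings: a complaint about a late bill scores highly against any phrasing whose embedding lands nearby, which is what makes “hundreds of thousands of phrasings” tractable.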
Sramana Mitra: What you’re describing is essentially a toolkit for building sophisticated applications in a range of use cases within a domain, right? I’m seeing this model quite a bit in the AI space in particular. Palantir started with this kind of a model. They do mostly solutions. They have this kind of a toolkit and they do solutions on their own. Then they’re able to charge a lot of money for the full solution. They’re not selling a platform that customers build on. They are selling the whole solution.
There’s another very interesting company called Machinify. They have a machine learning platform, and they also do solution engineering in the health care space. They have exactly the same model as you’re describing. For a long time, the VCs would tell you that, “Oh, we only want to invest in products. We don’t want to invest in services.”
But I think today that thinking is shifting a little bit because if you have a good platform and if you have the toolkit with which to build really powerful solutions, especially in the AI machine learning space, you can charge a very big fee for your solutions. I think that far exceeds the monetization potential of pure platforms.
I think that shift is happening at the moment.
Anthony Scodary: Yes, I completely agree.
The big next step for Gridspace came about four years ago, when we had the best-in-the-world speech recognition model for long-form conversations in call centers, oftentimes with half the error rate of Microsoft and others who built such models, partly because we were specialized and had leading speech scientists from Stanford and MIT.
We then moved into launching our second product, Grace, which is our voice bot. It started with generative models. Four or five years ago, we were starting to get very powerful models using deep learning. In the speech synthesis space, there was a series of models: WaveNet from DeepMind; Tacotron from Google; and WaveGlow from NVIDIA, which was a variant of OpenAI’s model called Glow. These models made generative speech synthesis capable of sounding identical to a human.
We realized that we were already working with contact centers and had billions of minutes of speech data. We started prototyping with some of our customers: what would it sound like if we used state-of-the-art speech synthesis, but trained it on call center agents? We learned quickly how fast these models were improving. Obviously, now everyone realizes that generative models have gotten good very rapidly, but this was four or five years ago.
If you train on data of people speaking conversationally in that context, with all the imperfections, like the puff of their microphone and the transfer function of the room, then to a person listening, it unambiguously sounds like a real person, especially over a phone line.
We recognized an opportunity to compete with IVRs by building our own bot platform. We brought in a couple more people from Stanford who had been working on generative models. We ended up building a product called Grace. At this point, we’re investing a lot into Grace for our growth because, in general, the value proposition of analytics is limited to very large companies that have large data science teams and can invest in critical strategic tools. However, the value proposition of a machine that sounds identical to a person handling phone calls is pretty ubiquitous, from large telecoms all the way down to SMBs.
Sramana Mitra: Is that in the hands of customers right now?
Anthony Scodary: It is, yes.
Sramana Mitra: Okay, tell me how customers are deploying it.
Anthony Scodary: Grace is used in healthcare. Memorial Hermann Hospital System in Houston uses it. We’ve used it with health insurers, telecom companies, and financial services firms. For a lot of them, customer experience is critical, right? One, it’s a big part of their workforce. Two, they don’t want to be perceived by customers as trying to provide a worse customer experience. So, the bar we have set is that talking with Grace has to be as good as or better than talking with a human.
This segment is part 3 in the series : Bootstrapping First, then Raising Money to Build a $10M+ Generative AI Startup: Anthony Scodary, Co-Founder of Gridspace