Sramana Mitra: What are the results of these deployments?
Anthony Scodary: For instance, we have one customer that calls patients after they’re discharged from the hospital, and they call thousands of patients a day. Some people spend twenty minutes talking with Grace. We build all our own language and speech synthesis models, so we’re able to make a really unified, low-latency experience that sounds like a person.
The tools are designed so that the hospital or the insurer or the bank can manage their own assets. One thing we’ve learned from working in this space for a long time is that call centers are constantly training people. The industry has some of the highest attrition in the U.S., with average tenure around 18 months. Three million people work in contact centers in the US. The reason for the high attrition is that it’s a minimum-wage job that’s very hard. You get yelled at all day, and in general, you don’t get much support. As a result, they’re constantly training people. They constantly need to keep the call centers on the rails because they have fairly inexperienced people speaking on behalf of your company.
That industry has developed a lot of tricks and tools that we were able to borrow directly to develop a large language model (LLM) that specializes in call centers and operates on the voice channel. Generally, information in a call center lives in three places. One, it lives in documents such as training documents and knowledge bases. Two, you typically have some form of business process or script or flow that agents are supposed to follow, especially in scripted call centers. Three, a big part of how you train agents is through Y-cable sessions, where their boss plugs in with them and coaches them on what they should and should not say.
So when we built the tooling for Grace, our LLM-based voice system, we built three systems. One that we call playbooks, which is where the flow lives. Two is called knowledge bases, which is where your documents live and the third is coaching.
The idea is that the bot utilizes these tools just like a person uses theirs. The model, which is essentially the executive control of the dialogue system, is served a little computer: a website that’s the playbook, a website that’s a little search engine over the knowledge base, and, contextually, coaching sessions from similar situations in the past. That allows you to build a dialogue system that is hallucination resistant and able to follow a process somewhat similar to what a human agent has to do.
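The three tools described here, a playbook for the flow, a searchable knowledge base, and contextually retrieved coaching, can be thought of as a context builder that assembles what the model sees before each turn. The following is a minimal, hypothetical Python sketch; the class names, the toy keyword search, and the situation-tag scheme are illustrative assumptions, not Gridspace’s actual design.

```python
from dataclasses import dataclass, field

@dataclass
class Playbook:
    """Ordered steps the agent is supposed to follow (the flow)."""
    steps: list
    position: int = 0

    def current_step(self):
        return self.steps[self.position]

    def advance(self):
        if self.position < len(self.steps) - 1:
            self.position += 1

@dataclass
class KnowledgeBase:
    """Toy keyword search over reference documents (title -> text)."""
    documents: dict

    def search(self, query):
        terms = set(query.lower().split())
        scored = [
            (sum(t in text.lower() for t in terms), title)
            for title, text in self.documents.items()
        ]
        scored.sort(reverse=True)
        return [title for score, title in scored if score > 0]

@dataclass
class CoachingMemory:
    """Coaching notes keyed by situation tag, retrieved contextually."""
    notes: dict = field(default_factory=dict)

    def advice_for(self, tag):
        return self.notes.get(tag, "")

def build_context(playbook, kb, coaching, caller_utterance, situation_tag):
    """Assemble what the dialogue model conditions on for its next turn."""
    return {
        "step": playbook.current_step(),
        "documents": kb.search(caller_utterance),
        "coaching": coaching.advice_for(situation_tag),
    }
```

The point of the structure is the one made in the interview: the generator never answers from thin air; every turn is grounded in a playbook step, retrieved documents, and prior coaching.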
It means that if you are an existing call center, and you want to integrate this technology into your current process with your current training and QA, we provide the exact tools that you use for managing bots. Our knowledge base system was first built for our agent desktop, where we had live agents that are served documents contextual to what’s happening in their call. We just reapplied that to Grace and then served up documents to the bot instead.
We’ve obviously, like everyone else at this point, seen how powerful LLMs are, but we’ve been building specialized language models for call centers for several years now. Something like GPT-4 is trained on the Common Crawl; it can output text drawn from any part of that corpus and can generate sonnets or whatever. Our models, by contrast, are highly aligned to truthfully referencing documents or procedures and following them. They can still improvise, make small talk, and answer common-sense questions.
But in general, when we’re calling hospital patients, someone might say, “Hey, can I take all of my medicine today instead of over the next two weeks?” A model like GPT-4 could hallucinate an answer like, “Sure, you can take whatever medicine you want.” That’s a really serious problem, right? So building hallucination-resistant, highly aligned, highly specialized language models is a big part of our strategy.
A big part of the way we do that is we don’t just have one monolithic language model that you’re in the conversational loop with. We have a core dialogue system that is rule based. It uses several different specialized language models, or in some cases, classifiers and regression models. The NLG language model is obviously very large because it has to make general conversation, but some of the other models are much more specialized. That allows you to not just put guardrails on, like you might with a language model, but actually build a very reliable piece of machinery that is still capable of general conversation.
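The non-monolithic design described here, a rule-based executive that consults small specialized models and only hands safe turns to the large NLG model, could be sketched loosely as follows. Every function name and the routing logic are hypothetical stand-ins (simple keyword stubs in place of real classifiers), shown only to illustrate the gating pattern.

```python
def classify_intent(utterance):
    """Stand-in for a small, specialized intent classifier."""
    text = utterance.lower()
    if any(w in text for w in ("medicine", "dose", "prescription")):
        return "medical_advice"
    if any(w in text for w in ("bill", "payment", "charge")):
        return "billing"
    return "small_talk"

def is_high_risk(intent):
    """Stand-in for a risk model that gates free generation."""
    return intent == "medical_advice"

def nlg_respond(utterance):
    """Stand-in for the large, specialized NLG model."""
    return f"Happy to help with that: {utterance}"

def dialogue_turn(utterance):
    """Rule-based executive control: classify, gate, then generate."""
    intent = classify_intent(utterance)
    if is_high_risk(intent):
        # Never improvise on high-risk topics; fall back to a vetted script.
        return "Please follow the schedule your care team prescribed."
    if intent == "billing":
        return "Let me pull up your billing record."
    # Only reach the general-conversation model on safe ground.
    return nlg_respond(utterance)
```

The reliability comes from the rule-based outer loop: the large model is one component among several, not the thing steering the call.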
Sramana Mitra: And it’s much more useful. I’m not really that excited about general AI. I think general AI doesn’t really have much application as such. This kind of stuff has much more application in actually solving business problems. Writing sonnets is cute, but it doesn’t solve any problem.
Anthony Scodary: When you’re doing core research and language model development, that’s very exciting. It’s a really good way to check the potential for generalization.
Sramana Mitra: It’s an interesting problem. But what you are doing application-wise is far more interesting, and the fact that a constrained model keeps hallucinations to a minimum is much more important. Domain-specific solutions can be built with small, domain-specific models. There’s no need for general AI for that.
Anthony Scodary: For what it’s worth, our NLG model has billions of parameters. It’s a very big language model, but it’s highly specialized and trained only on call center calls. Even though our customers are very excited about what’s happening in generative AI, they want something safe. For the most part, there are three things we have to do on these call center calls: route calls, answer questions, and fill out forms. Having a general language model do that in an unconstrained manner just adds a massive amount of risk.
Sramana Mitra: In your case, there are call centers for different domains. You can limit the data set even further to a particular domain. A healthcare call center and a financial services call center are different models.
Anthony Scodary: Yes, we talked about this too. In January, CBS Sunday Morning did a piece about us. They did a great job of getting into what you really want to automate in call centers. It’s not about building something that can generate sonnets for you. Most of the time, people call call centers just to vent. So it is useful to have a machine that is friendly and capable of conversing but stays on track.
This segment is part 4 in the series : Bootstrapping First, then Raising Money to Build a $10M+ Generative AI Startup: Anthony Scodary, Co-Founder of Gridspace