Sramana Mitra: Can we talk about a few use cases of the kinds of companies you’ve invested in with that thesis?
Sailesh Ramakrishnan: Absolutely. I can point to one of each. In the tooling area, we are investors in a company called TALC. TALC tries to solve the problem of how you assess the quality of output coming from an AI system.
AI systems are supposed to replicate human responses for a wide variety of use cases, but they are also very convincing in the answers they provide, even when those answers are completely wrong. People call this hallucination, but beyond hallucinations, AI systems at this level can and will make causal errors in their reasoning. So you need a smarter way of evaluating the quality of the responses that come from an AI system. TALC has built a platform that provides that capability.
Today, the solution that companies like Google have adopted is human evaluation, because they believe you can't use an automated system to evaluate AI; you can only use humans. But that is not a very scalable solution. So TALC has come up with a hybrid approach where humans are still involved, but the vast majority of the evaluation is done through a process combining both humans and AI.
Sramana Mitra: What is an example of the kind of things TALC does?
Sailesh Ramakrishnan: Let's say you have deployed an AI system to guide a customer through a banking transaction process. What you have is the conversation between the AI system and the bank customer. At the end of it, the job of the quality evaluation system is to determine whether the customer was satisfied and whether they were taken to the right destination.
The simplest way of doing this is to ask, did the customer sign off saying they are happy? But that doesn't give you enough feedback to improve your algorithm. So the idea is to look at it step by step and ask, did the AI system make the right choice at every step of the process? The first thing TALC does is look at the conversation and generate waypoints, or milestones, where the AI system made choices, and for the most part it gets those waypoints correct. Then a human confirms that these were the right choices to make.
This hybrid approach gives you confidence that the result is very close to what a human would have concluded from the same process, so it is a valid evaluation of the conversation. The key is to understand that the AI system could have gotten sidetracked here or there, which can be okay because humans get sidetracked too. So you don't measure every step of the conversation; you only measure whether the right choices were made along the way to make the customer happy. That's one high-level example. There are other examples that are specific to different use cases.
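To make the waypoint idea concrete, here is a minimal, hypothetical sketch in Python of a hybrid evaluation loop: an automated pass extracts the decision points from a transcript, a human confirms only those points, and the score is computed over them. This is not TALC's actual system; all names, data structures, and the placeholder extraction step are invented for illustration.

```python
# Hypothetical sketch of hybrid (AI + human) conversation evaluation.
# NOT TALC's product; everything here is an illustrative assumption.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Waypoint:
    step: str                              # what the assistant decided at this point
    ai_judgment: bool                      # did the automated pass think this was the right choice?
    human_confirmed: Optional[bool] = None  # filled in by a human reviewer


def extract_waypoints(conversation: list) -> list:
    """Placeholder for an automated pass that finds the decision points
    (waypoints) in a transcript. Here we simply treat every assistant
    turn as a waypoint and mark it as plausibly correct."""
    return [
        Waypoint(step=turn["content"], ai_judgment=True)
        for turn in conversation
        if turn["role"] == "assistant"
    ]


def human_review(waypoints: list) -> None:
    """Humans confirm only the extracted waypoints instead of reading
    the full transcript, which is what makes the hybrid approach
    cheaper than pure human evaluation."""
    for wp in waypoints:
        answer = input(f"Was this the right choice? '{wp.step}' [y/n] ")
        wp.human_confirmed = answer.strip().lower() == "y"


def score(waypoints: list) -> float:
    """Fraction of waypoints where the right choice was made,
    ignoring harmless detours between them."""
    confirmed = [wp for wp in waypoints if wp.human_confirmed]
    return len(confirmed) / len(waypoints) if waypoints else 0.0


if __name__ == "__main__":
    convo = [
        {"role": "user", "content": "I want to dispute a charge."},
        {"role": "assistant", "content": "Routed customer to the disputes workflow."},
        {"role": "assistant", "content": "Collected the transaction date and amount."},
    ]
    wps = extract_waypoints(convo)
    human_review(wps)
    print(f"Evaluation score: {score(wps):.0%}")
```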
Sramana Mitra: Okay. Let's talk about a vertical use case, a vertical company that you've invested in.
Sailesh Ramakrishnan: The second use case, which has to do with verticals, is a company called Dili; the name is short for diligence. They offer a software system that helps investors perform diligence on deals. This is not just for investors like myself, but also for large investors making deals or transactions worth billions of dollars. Those transactions usually involve large amounts of information about the company. Let's say you are taking a public company private; the contracts, financial documents, and business relationships could fill 50-60 boxes of information. Large banks usually have 50-100 people poring over all these documents. These transactions usually take a year or two to consummate because of the vast amount of information that needs to be processed.
This is something that GenAI systems are actually good at. They can synthesize and summarize well. They can read through these documents and identify the salient points. So you can ask questions like, was there ever a quarter where the revenue dipped more than 20%? The answer may not be in a single document or a single graph, but because GenAI systems can synthesize information across sources, they can provide that kind of insight. The human can then do further due diligence on that point and ask, "Why did that happen?"
Dili lets you synthesize a vast amount of information so that it acts as an efficient co-pilot to the legal or financial teams doing deal diligence. This is not simply a matter of taking all this data and throwing it into ChatGPT for an answer, because the information has a lot of context and structure. You have to be able to understand things like spreadsheets, graphs, and the connections between a legal document and a financial document. So a lot of vertical knowledge is required, and a combination of legal and financial acumen is needed to do the job well.
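As a concrete, hypothetical illustration of the kind of cross-document question described above, the short Python sketch below checks invented quarterly revenue figures for a quarter-over-quarter drop of more than 20%. In a real deal, each figure might come from a different filing or spreadsheet, which is why the synthesis step matters; this is not Dili's implementation, and all numbers are made up.

```python
# Toy illustration (not Dili's product) of the diligence question
# "was there ever a quarter where revenue dipped more than 20%?".
# The quarterly figures are invented and would, in practice, be
# extracted from many separate documents.
quarterly_revenue = {   # quarter -> revenue in $M (hypothetical data)
    "2022-Q1": 120.0,
    "2022-Q2": 131.0,
    "2022-Q3": 98.0,    # roughly a 25% drop versus the prior quarter
    "2022-Q4": 104.0,
}


def quarters_with_dip(revenue: dict, threshold: float = 0.20) -> list:
    """Return quarters where revenue fell by more than `threshold`
    relative to the previous quarter."""
    quarters = sorted(revenue)
    dips = []
    for prev, curr in zip(quarters, quarters[1:]):
        change = (revenue[curr] - revenue[prev]) / revenue[prev]
        if change < -threshold:
            dips.append(curr)
    return dips


print(quarters_with_dip(quarterly_revenue))   # -> ['2022-Q3']
```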
This segment is part 4 in the series: 1Mby1M Virtual Accelerator AI Investor Forum: With Sailesh Ramakrishnan, Managing Partner at Rocketship.vc