
1Mby1M Virtual Accelerator AI Investor Forum: With Sailesh Ramakrishnan, Managing Partner at Rocketship.vc (Part 6)

Posted on Sunday, Sep 15th 2024

Sramana Mitra: I’ve been talking to various people about their experiments and experiences selling enterprise AI software in vertical AI, and I think this issue keeps coming up. To train an AI model, you need people who have deep domain knowledge in that workflow, in the kind of data and the kind of heuristics they apply as human beings to do whatever it is they’re doing. When you’re trying to automate a lot of what people do manually, it’s a very delicate balance to get those people to impart that knowledge into training an AI to automate their own functions. It has not been an easy journey for product people to design, enhance, and make these products more effective because of that tension.

Sailesh Ramakrishnan: Absolutely. The key insight that both TALC and Dili adopted is to show value pretty much out of the box and at least to convince the end user that there is something here that is worth a little bit more investment and time. If they spend a little bit more time, they will get more value because they’re already getting some value out of the box. Then if they spend a little bit more time helping this automated system, they will generate more value. That has been an important aspect to show that value upfront. It’s not easy to do in many other domains, but in at least these few examples that I’ve laid out, it has been possible.

But I think you’re absolutely right. It takes a little bit more convincing for a human expert to really impart a lot of that knowledge. Sometimes a lot of their knowledge is not easily expressible. It’s obviously in their heads: if you ask them questions, they can absolutely tell you what you need, but if you ask them to give guidance to an AI system, they don’t know where to start. So it’s also an interesting human-computer interaction challenge to extract some of this large wealth of knowledge that is in our human experts.

Sramana Mitra: I think the organization inside an enterprise that would have at least some of that knowledge institutionalized is the training organization. If you have a large group of people doing a certain function, the people who train them have at least some amount of institutionalized knowledge that could be codified to get an algorithm started. Then beyond that, the refining is where the trickiness of product management lies.

This tension runs all through the vertical AI problem. Unless you have somebody who’s a deep domain expert in that function, it’s not so easy to bridge that gap. So most of the people who are succeeding have resident domain experts in the particular domain in which the company is playing.

Sailesh Ramakrishnan: Absolutely. Another approach that AI researchers are now adopting comes from the realization that it’s very hard for human experts to train these systems directly, so they are creating AI systems that can learn by observing. You’re not necessarily having a conversation with the human as much as you’re observing all of the human’s reactions, actions, and questions, and trying to see if you can build a model.

The AI system can now build a model of the human’s expertise and understanding. If something doesn’t match, it can ask the human a question: “Hey, why did you make this choice here? Why did you press that button there? That doesn’t match the model.” So this learning by observing is becoming a big part of the new wave of learning systems today.

Sramana Mitra: That brings up hallucination. Of course, this is one of the pushbacks everybody’s getting, especially from mission-critical enterprise applications.

There is also talk now of small language models being more effective in enterprise applications than large language models because they hallucinate less, are more constrained, and so on. What are you seeing?

Sailesh Ramakrishnan: The challenge of hallucinations, at least from our perspective, is a challenge that comes from the first few crude, unconstrained models that were created, where we basically threw all of this data into the system and asked it to reason with it. We are now already three or four generations forward, and I think the vast majority of the hallucination problems are being resolved. Can we completely remove every possible hallucination path? I don’t think so yet. However, the more obvious failures, like taking two completely random facts, joining them together, and creating a brand new fact, are being reduced.

Smaller models have always been something we have personally believed are going to be the direction. We have also believed that smaller models focused on particular domain data sets do not need a broad reasoning system that knows everything from “the sky is blue” to “the earth is round.”

Sramana Mitra: What is the need? If you’re trying to solve problems within a specific domain, you need a small model which is completely trained in that domain. It’s cheaper to train and it’s cheaper to build. It’s a much better strategy to work with a small model that is constrained and not have these very large models that go haywire.

Sailesh Ramakrishnan: I agree, but the challenge people have been trying to balance is that at some point, you also need some common sense reasoning. That’s part of what large models provide. You’re right in the sense that you want the small models, but you want them to be not so small that they’re only constrained to knowledge within a particular domain; they should have some amount of that common sense knowledge as well. That balance of the right size of model, with both generic information and domain-specific information, allows for exactly the kind of impact you are suggesting: fewer hallucinations, but also a little bit of common sense to make that leap in reasoning.

Sramana Mitra: One direction this could all evolve into is that you have a kind of Platform-as-a-Service like OpenAI or any of the large models, and then on top of that, you develop a smaller model that is more constrained, takes advantage of the common sense learning of the bigger model, but really focuses on the domain-specific work.

Sailesh Ramakrishnan: In fact, that’s what I see most companies doing today. So a company that wants to build something doesn’t just completely build only on top of OpenAI. They use OpenAI for certain things, but then it’s part of the platform approach because they also don’t want their data to leave the confines of their company.

Every prompt is also an input to OpenAI. So you want to be careful about what you use OpenAI for and what you use your internal data sets for. Almost every company that I have spoken with is adopting this kind of approach where they have their own internal models that work on their proprietary data, and the generic reasoning is taken over by the large models out there.
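The hybrid pattern described here can be sketched in a few lines of code. This is a minimal, hypothetical illustration: `local_domain_model`, `external_llm`, and `CONFIDENTIAL_TERMS` are stand-in names, not any real company's API, and the routing rule (checking prompts for proprietary terms before letting them leave the company) is one simple way to implement the idea.

```python
# Hypothetical sketch of the hybrid approach: proprietary data stays with an
# internal domain model; generic reasoning goes to an external large model.
CONFIDENTIAL_TERMS = {"deal_pipeline", "customer_records", "internal_pricing"}

def local_domain_model(prompt: str) -> str:
    # Stand-in for a small model fine-tuned on the company's proprietary data.
    return f"[internal model] {prompt}"

def external_llm(prompt: str) -> str:
    # Stand-in for a call to a hosted large model (e.g., an OpenAI-style API).
    return f"[external model] {prompt}"

def route(prompt: str) -> str:
    # Every prompt sent outside is also an input to the provider, so anything
    # that references proprietary data is kept on the internal model.
    if any(term in prompt for term in CONFIDENTIAL_TERMS):
        return local_domain_model(prompt)
    return external_llm(prompt)
```

In practice the routing decision would be richer (classifiers, data-loss-prevention scanning, or per-team policy), but the shape is the same: a policy layer decides which model sees which prompt.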

Sramana Mitra: I think in the interest of time, I need to let you go. So thank you for the conversation. Nice to meet you.

This segment is part 6 in the series : 1Mby1M Virtual Accelerator AI Investor Forum: With Sailesh Ramakrishnan, Managing Partner at Rocketship.vc
