1Mby1M Virtual Accelerator AI Investor Forum: With Ashmeet Sidana, Chief Engineer at Engineering Capital (Part 3)

Posted on Wednesday, Jul 24th 2024

Sramana Mitra: So let me double-click on the second one. I will come to the picks and shovels in a moment, but my observation is that to build a vertical application on top of an LLM, you obviously need to train it on domain-specific data. Now, there is a benefit to constraining that model. You can tell me more technically how viable this is and how people are doing it. If you constrain the model to a small language model, the hallucination problem should go away, or at least become much more manageable. Is that a correct statement?

Ashmeet Sidana: I wouldn’t say necessarily a small language model, but yes, in general, if you constrain the set of acceptable output predictions, you reduce the amount of hallucination.

Sramana Mitra: Yes.

Ashmeet Sidana: So yes, your instincts are absolutely right. By having a narrower domain and a narrower vertical, you get better quality predictions and less hallucination.
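To make that intuition concrete, here is a minimal sketch of the simplest mechanical version of the idea: masking out every output the domain does not allow before renormalizing. The toy vocabulary, scores, and allowed set are illustrative assumptions, not any particular model’s interface.

```python
import math

def constrained_distribution(logits, vocab, allowed):
    """Zero the probability mass of any token outside the allowed set."""
    masked = [score if tok in allowed else float("-inf")
              for score, tok in zip(logits, vocab)]
    exps = [math.exp(score) for score in masked]  # math.exp(-inf) == 0.0
    z = sum(exps)
    return {tok: e / z for tok, e in zip(vocab, exps)}

# Toy vertical example: a medical-coding assistant that may only emit
# valid ICD-10 codes, however confident the raw model is in junk output.
vocab   = ["I10", "E11.9", "banana", "J45.909", "moon landing"]
logits  = [2.1, 1.7, 2.5, 0.9, 1.2]            # raw model scores
allowed = {"I10", "E11.9", "J45.909"}           # the constrained domain

print(constrained_distribution(logits, vocab, allowed))
# Off-domain outputs get exactly zero probability, so the model cannot
# hallucinate them; the narrower the allowed set, the less room for error.
```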

Sramana Mitra: Now, I’m going to ask you a question that is floating around in the industry, because general AI is getting so much publicity. People are focusing on the wrong thing, in my opinion. At least from an entrepreneurship perspective, general AI doesn’t really matter. Does it matter in your view?

Ashmeet Sidana: I think if it could be built, it would matter hugely, but I am very skeptical that it’s going to be built in the very near term. We need some more breakthroughs that have not happened yet. It is completely unpredictable when those breakthroughs will happen. It’s like trying to predict the next Nobel Prize.

Sramana Mitra: As far as I’m concerned, it doesn’t matter because there’s so much available right now with which so much can be done in the vertical AI mode. It’s almost like I don’t really care what happens to general AI.

Ashmeet Sidana: I’m in violent agreement. There is so much low-hanging fruit and opportunity right now that if you want to be a pragmatic entrepreneur and quickly start building a company, there’s tons of opportunity in that space.

Sramana Mitra: Yes, there’s tons of opportunity. You can build on top of one of these platforms that are getting ridiculous amounts of funding and following through with Platform-as-a-Service offerings. You don’t need to worry about training the whole model yourself.

Ashmeet Sidana: Exactly. You are benefiting from this massive investment of capital that has been made on your behalf to train this model, which you can now use practically for free.
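As a rough illustration of what building on top of such a platform looks like in practice, here is a minimal sketch using one hosted API, the OpenAI Python SDK. The model name, the prompts, and the contract-review vertical are all assumptions for illustration, not a recommendation of any particular stack.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The "vertical" lives in the system prompt and the surrounding workflow,
# not in model training: the entrepreneur supplies the domain constraints,
# and the platform supplies the pretraining paid for by someone else.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumption: any hosted chat model would do
    messages=[
        {"role": "system",
         "content": "You are a contract-review assistant. Answer only "
                    "questions about commercial lease clauses; otherwise "
                    "say you cannot help."},
        {"role": "user",
         "content": "Does this clause let the landlord raise rent mid-term?"},
    ],
)
print(response.choices[0].message.content)
```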

Sramana Mitra: Now on picks and shovels, what are you seeing and where are the gaps in the market? What would you like to see?

Ashmeet Sidana: Picks and shovels have traditionally been a very lucrative area for venture capital. The tradition, if you want to be a good investor, is always to invest in the picks and shovels, not in the gold itself. We have seen a tremendous amount of investment in that space recently, all the way from developer tools and infrastructure tools to orchestration and observability tools. I think that follows a more traditional path, and the nearest analogy is actually something that you brought up earlier: Force.com, which really started the whole PaaS movement and showed how applications can be quickly built and deployed on other people’s platforms. That is really where you are seeing a lot of companies develop. I continue to see that as a lucrative area. It can be a smaller market.

So you have to be careful. And when I say smaller, it’s all relative. In other words, it’s not a trillion-dollar market. It may be only $10 billion, which is of course still a gigantic business. So you have to be more careful about how you size it, how you plan your business, and how you grow it. But developer tools will certainly be built in this space.

The danger in the picks-and-shovels market is that whatever picks and shovels you build implicitly assume some sort of stack. They assume what type of AI you’re going to build. If that changes tomorrow, you could become irrelevant very quickly, because this is such a dynamic space and so many new developments are taking place. But those are the risks that every business takes. A new technology can always make a business obsolete. So you really have to stay on top of the state of the art.

Sramana Mitra: Now, there are obviously all kinds of points of view on general AI. Let’s say, for discussion’s sake, that general AI does become viable. Does general AI then eliminate the need for vertical AI, or is vertical AI still needed?

Ashmeet Sidana: So first, just to clarify, by general AI, I think you are referring to AGI or Artificial General Intelligence.

Sramana Mitra: Yes.

Ashmeet Sidana: The answer to your question will depend on the limitations of this AGI. What are the characteristics of the AGI that develops? It’s hard to imagine an AGI that could do anything and everything.

In other words, a fully general AGI is kind of hard to imagine. Think of ourselves as a form of intelligence: we obviously regard human beings as intelligent. Most people would define an AGI as something similar or equivalent to a human brain, a human mind, with the ability to reason, to learn new tasks, to be creative, and to come up with novel ideas.

But we still have limitations, right? We have a very limited memory span. Even then, if we want to build new ideas, we have to train ourselves for years, sometimes decades, to advance the state of the art. So it really depends on what type of AGI gets built. As I mentioned earlier, I’m skeptical. I don’t believe an LLM is a path to AGI.

So I don’t think just throwing more GPUs at it, building an ever bigger model, and training it for longer is going to give you an AGI. There are going to have to be some other breakthroughs from which AGI will come, and then it will depend on what the constraints of those breakthroughs are.

Sramana Mitra: The way I’ve been processing this AGI question is as follows. First and foremost, you and I are real human beings, not artificial ones. But if I decide now that I want to be a heart surgeon, that is not a reasonable path I can follow. Maybe if AGI is part of the equation, that AGI can be trained very rapidly in that skill set, but it still has to be trained in that skill set.

Can we assume that AGI, when it becomes a reality, is going to be an expert in every domain, able to solve every possible problem and perform every single function? Maybe one way of thinking about it is training AGI to become an expert in everything, but that’s an unlikely scenario. Here’s the rub, in terms of where I think I’m getting stuck in this thought experiment.

There are APIs and workflows that you have to navigate, right? So if I’m suddenly going to be this super-expert doctor with the help of AGI, I still need to be able to get into the medical systems, look at the medical records and medical research, and so on and so forth. So there is a vertical workflow. There is a vertical industry structure that we operate within.

Just because there’s a technology that is capable of learning outside of any one vertical space, are we going to do away with all of those infrastructures? There’s security, there’s privacy, you talked about HIPAA and compliance; there are all kinds of things that an industry is built with. Are we going to do away with all of that?

Ashmeet Sidana: Of course not. Your instincts are absolutely right, to the point that there is a subset of researchers who believe that the path to AGI relies on being embodied in a human-like device. In other words, they believe that part of the reason we are as intelligent as we are, and are able to learn and do all of these things, is that we have a particular shape and form when we are born as children, and we learn a lot from the environment we interact with, and thereby become more intelligent. That embodiment is intrinsic to our ability to achieve the form of intelligence that we have.

So they are actually building robots and physical devices and trying to embody AI in a physical body. They believe they’ll get faster learning from that method. We don’t know the answer; these are research topics. If you want to do a PhD in AI and AGI, that’s a great area to go work in, but it is not a path to entrepreneurship, in my opinion.

Entrepreneurship is about getting to a commercial product relatively quickly. For a traditional VC, it’s within one to two years. For a very patient VC, it may be three to five years, but that’s about the maximum time you have. Beyond that, you’re really out in the realm of science.

Sramana Mitra: What we say is, you go from zero to a hundred million dollars in five to seven years; otherwise, you don’t have a venture-scale company. That’s the reality of our business.

Now, as to the other thing that you brought up about human form: the humanoid form is not the most efficient from a robotics point of view. Robots cannot do simple things that human beings can do. Is it really worthwhile to invest in training those humanoid robots to do simple, silly things like making coffee? Hey, come on, it’s a waste of your time.

Ashmeet Sidana: It could be, but again, that’s why it’s a research topic. We don’t know. There may be some deep insight in training a humanoid to make coffee in an arbitrary kitchen. I think you’re referring to that famous test for AGI: can this machine walk into an arbitrary kitchen and make a cup of coffee for itself? Most human beings can do that very easily. It would be impossible for a machine to do that today. Maybe there’s some learning, maybe there’s some insight in that.

Perhaps we can put a bow on this whole discussion by recognizing that with the things we are so excited about, like AGI and ChatGPT and all of these LLMs, we still don’t know how they work. We still don’t know why they work. We don’t have any first-principles research that has solved that problem. We just know they work. We know they do certain things in a certain way, which we find useful. Okay, wonderful. But it’s just an empirical observation. There’s no magic; we simply haven’t figured out why they work.

So at some point, my hope and belief is that we will come to a first-principles definition of what intelligence is, and from that derive an AGI. Then maybe we can derive what you’re talking about: a general form of AGI which can know everything and do everything that an intelligence could ever do.

When I trained as an engineer, we studied the Nyquist and Shannon theorems, information theory, and some very fundamental principles and theorems that exist in our body of knowledge. None of that has been applied to AGI. None of that has been applied to intelligence. Surely there is some math over there. The simplest empirical observation is that right now I’m using roughly twenty watts to have this conversation; our brains consume on the order of twenty watts. And here we are using hundreds of megawatts to do much more.
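For a back-of-the-envelope sense of that gap, here is the arithmetic with an assumed round figure of 100 MW for a large AI data center; both numbers are rough illustrations, not measurements.

```python
brain_watts = 20                # rough power draw of a human brain
datacenter_watts = 100e6        # "hundreds of megawatts"; assume 100 MW

ratio = datacenter_watts / brain_watts
print(f"A 100 MW data center draws {ratio:,.0f}x the power of one brain.")
# -> 5,000,000x: five million brains' worth of power for one cluster.
```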

This segment is part 3 in the series : 1Mby1M Virtual Accelerator AI Investor Forum: With Ashmeet Sidana, Chief Engineer at Engineering Capital
