Thought Leaders in Artificial Intelligence: Steve Scott, CTO of Cray (Part 4)

Posted on Friday, Jul 6th 2018

Steve Scott: With the advent of GPU computing, deep neural nets became practical to the point where you could get good enough performance to really do useful things with them. GPU computing is the application of processors that were designed for the highly parallel task of painting triangles on the screen to render graphics in real time.

It turns out you can use all those parallel functional units for doing normal computation. GPUs were the first processors powerful enough, at the single-processor level, to do meaningful deep neural network work. We could previously do it on very large systems, but that limits you to the small number of people who have those large systems. GPUs were the first ones that could do it on a single desktop at a useful scale.
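A minimal sketch of that point, using PyTorch purely as one convenient way to hand the same arithmetic to a GPU's parallel units (the library and matrix sizes are illustrative assumptions, not details from the interview):

```python
# Minimal sketch: the same matrix multiply on CPU cores and on a GPU.
# PyTorch is used here only as a convenient way to reach the GPU; sizes are illustrative.
import torch

a = torch.randn(4096, 4096)
b = torch.randn(4096, 4096)

c_cpu = a @ b                          # runs on the CPU

if torch.cuda.is_available():
    a_gpu, b_gpu = a.cuda(), b.cuda()  # move the data to GPU memory
    c_gpu = a_gpu @ b_gpu              # same math, spread across thousands of GPU functional units
    torch.cuda.synchronize()           # wait for the asynchronous GPU kernel to finish
```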

Back when I was at Nvidia from 2011 to 2013, we started to do some deep neural networks on our GPUs, but we didn’t really understand how big of a market it would become. It wasn’t until a few years later that it really started to take off. Now it’s become so successful that it’s got the attention of lots and lots of startups. GPUs are really great because they do lots of work in parallel.

They have thousands of functional units on them that are very efficient when you have a parallel workload. Even so, they're not the most efficient things. If you just wanted to do deep learning, there are a lot of things on a GPU that you don't really need. You don't really need the graphics pipeline that's meant to do rasterization. You don't need the higher-precision floating point support.

Scientific processors, including GPUs, have 64-bit numbers in them that allow you to do very high-precision math. Deep learning workloads tend to only need about 16 bits. Nvidia has put functionality into their processors to really accelerate these small numbers. If that's all you want to do, you can actually build something that's even more efficient. There are now at least 20 different startup companies building deep learning accelerators.
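A minimal sketch of the 16-bit point, assuming PyTorch's mixed-precision autocast as the interface (the framework, model, and sizes are illustrative, not something Scott specifies):

```python
# Minimal sketch: running a deep-learning workload in 16-bit precision on a GPU.
# Requires a CUDA device; the model and sizes here are placeholders.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 10)).cuda()
x = torch.randn(32, 1024, device="cuda")

with torch.autocast(device_type="cuda", dtype=torch.float16):
    y = model(x)        # matrix multiplies execute in float16 on the fast low-precision units

print(y.dtype)          # torch.float16 -- no 64-bit math involved
```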

None of them have really hit the market yet. In the next year or two, they're going to start hitting the market. It's going to be really interesting to see what happens. They're likely to give GPUs a run for their money. AMD is now designing high-performance GPUs for deep learning as well. The market is big enough that it's worth having a lot of companies build specialized hardware.

Sramana Mitra: You haven’t mentioned Intel. I think Intel also falls in that category. They’re going to end up acquiring some of these companies to get the expertise and hit those markets, right?

Steve Scott: Absolutely. Intel purchased Nervana a couple of years ago for just that reason.

Sramana Mitra: How about other trends and open problems that you want to provide pointers to, that entrepreneurs could be looking at?

Steve Scott: In our world, the most interesting trend is trying to find ways to accelerate traditional simulations with deep learning. The problem that comes along with that is that scientific data tends to be different from the data we've been using for AI today. We're really good at doing image recognition and speech recognition.

In science and technology, we're doing simulations that have very different sorts of data. It's really unknown what the right model topology is: what's the right structure to use for these different sorts of problems we're facing? That's an interesting area with a lot of work yet to go in terms of figuring out new models for different types of data.
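One concrete shape this can take is a surrogate model: fit a network to input/output pairs from an expensive solver, then query the network instead of re-running the solver. The sketch below is illustrative only; the toy "simulator", topology, and hyperparameters are assumptions, not anything from the interview:

```python
# Minimal sketch: a neural-net surrogate for an expensive simulation.
# The "simulator" is a stand-in function; the network topology and sizes are guesses.
import torch
import torch.nn as nn

def expensive_simulation(params):          # placeholder for a real solver
    return torch.sin(params).sum(dim=1, keepdim=True)

params = torch.rand(10_000, 4)             # sampled simulation inputs
targets = expensive_simulation(params)     # run the solver once to build training data

surrogate = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)

for _ in range(200):                       # fit the surrogate to the solver's outputs
    opt.zero_grad()
    loss = nn.functional.mse_loss(surrogate(params), targets)
    loss.backward()
    opt.step()

# At inference time the surrogate answers almost instantly instead of re-running the solver.
```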

Sramana Mitra: That’s a very precise pointer for something to look at.

Steve Scott: Another problem is how you get the training data. It's no accident that deep neural nets really took off at Google, Microsoft, and Facebook, because these are the companies that have tremendous amounts of data. I mentioned earlier that you can think of the computation as being the engine and the data as the fuel. If you don't have a lot of good training data, you can't really train a neural net.

How do you create that training data? That's a really interesting problem for a lot of domains that don't naturally have it. One thing you can do in game playing is write programs to play games against each other. If you think about AlphaGo, they were able to play earlier versions of AlphaGo against each other to generate all the training data.

They originally only had data from human games, but then they were able to add more data through computing. In our world, where we're trying to do simulations, we can run simulations to create data and then train the neural nets on that. A lot of places don't really have the data. How do you obtain the data? How do you curate it? How do you avoid biases in the data? That's another really interesting question.
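A minimal sketch of the self-play idea, using the toy game Nim with random players (the game, policy, and labeling scheme are illustrative assumptions, not AlphaGo's actual method):

```python
# Minimal sketch: generating labeled training data by letting two programs play a game
# against each other. The game here is Nim with purely random players.
import random

def self_play_game(stones=15):
    """Play one game of Nim; return (state, player, did_this_player_win) rows."""
    history, player = [], 0
    while stones > 0:
        history.append((stones, player))
        stones -= random.randint(1, min(3, stones))   # random policy: take 1-3 stones
        if stones == 0:
            winner = player                           # the player who takes the last stone wins
        player = 1 - player
    return [(state, who, int(who == winner)) for state, who in history]

# Thousands of self-play games become labeled examples for a value network.
dataset = [row for _ in range(10_000) for row in self_play_game()]
print(len(dataset), dataset[:3])
```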

From an ethical perspective, there have been some cases where people have trained neural nets on existing data from humans or the internet. If that existing data had biases in it, then you end up training your deep neural net to be biased.

This segment is part 4 in the series : Thought Leaders in Artificial Intelligence: Steve Scott, CTO of Cray
