Sramana Mitra: Can you give an example?
Steve Scott: If you think about deep neural networks in particular, there’s training and there’s inference. Training is the learning part where you take a bunch of data and based on that, you train a model to be able to provide some function. Inference, of course, is using that model that has done the learning already to make decisions on new data. The inference problem is sort of a throughput problem.
Once you’ve got the model designed, you can run lots of data through the model and make decisions very quickly. The training problem itself takes a lot of compute and a lot of communication. This is the sort of thing that a Cray system is good at. It’s basically a high-performance computing problem because you have lots and lots of data and you have to feed all the data through the model.
You feed the data through the deep neural network in a forward way, then you calculate the error between what the model predicts and what your labeled training data says it’s supposed to predict. Then you use that to back propagate changes to the weights of your model. That algorithm itself is an HPC problem. You can apply anywhere from a single CPU or GPU up to hundreds or even thousands of GPUs to process all the data in parallel.
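Purely as an illustration of the loop Scott is describing, here is a minimal NumPy sketch of forward pass, error calculation against labeled data, and backpropagation of weight deltas. The toy data, layer sizes, and learning rate are illustrative assumptions, not anything from the interview or Cray-specific.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 8))        # labeled training data (inputs)
y = rng.normal(size=(256, 1))        # labels
W1 = rng.normal(size=(8, 16)) * 0.1  # model weights (one hidden layer)
W2 = rng.normal(size=(16, 1)) * 0.1
lr = 0.01                            # learning rate (illustrative)

for step in range(100):
    # Forward pass: feed the data through the network.
    h = np.tanh(X @ W1)
    pred = h @ W2
    # Error between what the model predicts and what the labels say.
    err = pred - y
    loss = np.mean(err ** 2)
    # Backpropagate the error into weight deltas.
    dpred = 2 * err / len(X)
    dW2 = h.T @ dpred
    dh = dpred @ W2.T * (1 - h ** 2)  # tanh derivative
    dW1 = X.T @ dh
    # Apply the deltas to the model weights.
    W1 -= lr * dW1
    W2 -= lr * dW2
```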
As part of that, the processors that you’re using in parallel to do the training need a lot of communication, because each processor processes a chunk of data and decides, “Based on my data, I want to update these weights by a certain delta.” They have to globally communicate all of those deltas so they can all update the model in lockstep and then continue to process.
There’s a lot of computation but also a lot of synchronization and communication. That’s the classic HPC problem. One of the things we do at Cray is build high-performance interconnects and software that’s particularly good at exchanging data at high rates and doing synchronization without much latency. That’s what makes a good supercomputer. It’s directly applicable to scaling an AI problem.
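To make that synchronization pattern concrete, here is a small NumPy sketch of data-parallel training: each simulated worker computes a weight delta from its own shard of the data, and an averaging step stands in for the global allreduce that would run over the interconnect on a real system (e.g. an MPI allreduce across GPUs). The worker count, linear model, and data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_workers = 4
X = rng.normal(size=(1024, 8))
y = rng.normal(size=(1024, 1))
W = rng.normal(size=(8, 1)) * 0.1    # shared model (linear, for brevity)
lr = 0.05

X_shards = np.array_split(X, n_workers)  # each worker holds one shard
y_shards = np.array_split(y, n_workers)

for step in range(50):
    # Each worker computes its local delta ("based on my data...") in parallel.
    local_grads = []
    for Xs, ys in zip(X_shards, y_shards):
        err = Xs @ W - ys
        local_grads.append(2 * Xs.T @ err / len(Xs))
    # The allreduce: globally combine every worker's delta into one.
    global_grad = np.mean(local_grads, axis=0)
    # Every worker applies the same update, keeping the model in lockstep.
    W -= lr * global_grad
```

The loop over shards here only simulates what hundreds or thousands of processors would do simultaneously; the latency of the combine step is exactly the synchronization cost the interconnect has to keep small.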
Sramana Mitra: You mentioned in the beginning that you made a small detour to Nvidia. That probably gives us good context to talk about the high-performance computing landscape and how it is shifting in this AI era. Could you talk about who’s doing what? Who are the players with the most advanced technology in this realm?
Steve Scott: Before you think about that, it’s useful to understand why AI didn’t take root until the past few years. Back when I was in school, we learned about deep neural networks. We’ve known about them for many decades, but they just hadn’t been terribly useful in practice. That’s because you need a tremendous amount of computational power.
In order to have a useful deep neural net, it needs to be fairly large which means lots of weights. It needs to have a lot of data to feed it. The high-performance compute really becomes the engine and the data becomes the fuel, if you will. Until a few years ago, we simply didn’t have enough computational power to do a good job at training large deep neural nets.
This segment is part 3 in the series: Thought Leaders in Artificial Intelligence: Steve Scott, CTO of Cray