This conversation takes a deep dive into the nascent ML Ops industry.
Sramana Mitra: Let’s start by introducing our audience to yourself as well as to Striveworks.
Jim Rebesco: I’m one of the co-founders at Striveworks. We’re an ML Ops company focused on where we see analytics going: a future where analytics are cheap and ubiquitous, and the scaffolding that supports them disappears. We provide that as a platform to our customers.
Sramana Mitra: Let’s double-click down one more level. The audience you are communicating with is very sophisticated and highly technical. You can be as technical as you need to be. I am a scientist from MIT. So, I will have no trouble following you. The best way would be to take some use cases to illustrate your point.
Jim Rebesco: Let me talk about how we think about the industry. Then we can talk about some use cases and the way we solve problems. One interesting challenge, or environmental aspect, of the ML Ops space is that it exists within the broader value chain of data-enabled decisions. Data-enabled businesses and organizations are sitting on a whole bunch of data and they’re saying, “I’d like to make a smarter, faster, and more informed decision based on that data.” This has obviously been a persistent challenge.
When the big data movement kicked off maybe a decade plus ago, the working hypothesis was that if you bring together all the data that an organization generates, the insights that you need to run your business will fall out of it. There were some good successes there, but it also fell short of the expectation that all you need to do is bring data together and a sufficiently rich query language will get you all the answers you need. That gap has driven the next wave of AI.
Data is critical but you should spend time and effort to build the models and analytics that will chunk that data down to human-interpretable components. If you break the value chain down, you’ve got data here. You need some analytics to process that data and a way to visualize that. That’s the big environment that we’re all living in.
The question then is what is ML Ops and why does it matter? Within that chain, models are living, breathing things. When you construct a model, you are making some very strong assumptions about the world. The fundamental one is that the data is stationary and doesn’t change. That’s an assumption that’s true enough to be useful. However, it’s not always true. At those points of change, things are often happening very fast, and that is when you actually want the most support for the decisions you make.
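Editor's illustration: the stationarity assumption described above can be made concrete with a minimal Python sketch. It is not Striveworks' method. The "model" is simply the historical mean, and the function names and price figures are hypothetical, invented for illustration only.

```python
from statistics import mean


def fit_mean_model(history):
    """'Train' the simplest possible model: always predict the historical mean."""
    return mean(history)


def mean_abs_error(prediction, observations):
    """Average absolute gap between a constant prediction and what was observed."""
    return mean(abs(x - prediction) for x in observations)


# Hypothetical price series (invented numbers, not real market data).
yesterday = [2.0, 2.1, 1.9, 2.0, 2.05, 1.95]  # the regime the model was fit on
tomorrow_same = [2.0, 1.98, 2.03, 2.07]       # the world did not change
tomorrow_shock = [3.1, 3.4, 3.3, 3.6]         # the world changed overnight

model = fit_mean_model(yesterday)
error_stable = mean_abs_error(model, tomorrow_same)
error_shock = mean_abs_error(model, tomorrow_shock)

print(f"error while data is stationary: {error_stable:.2f}")
print(f"error after the regime shift:   {error_shock:.2f}")
```

While tomorrow resembles yesterday the error stays small. Once the regime shifts, the same unchanged model is badly wrong, which is the situation the retrain-and-redeploy loop is meant to address.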
The world changes in times of crisis when tomorrow is nothing like yesterday. It’s usually the time you want some analytics to help you out. Within the ML Ops industry, that’s the fundamental problem we’re trying to solve. When tomorrow looks nothing like yesterday, how quickly can you retrain and reoptimize and bring that model back into production? That’s how we think about the problem.
Sramana Mitra: Why don’t we do a use case? Take an industry problem where there is some kind of yesterday’s data and some kind of tomorrow’s model. Walk us through one of these instances.
Jim Rebesco: My first company was a trading company. We traded stocks. I’ll give you an example. When you’re an energy producer or an airline that needs to purchase a lot of gasoline, it’s very important for you to accurately forecast, not just supply and demand, but prices in the energy space. I’m down here in Texas.
There are a lot of refineries here. As happened a few years ago, when a hurricane rolls through and shuts all those plants down in a very unanticipated way, it’s a very safe bet that all your statistical models are inaccurate. You can exercise good judgment there in two ways. You can look at the data coming into your models and say, “This is out of distribution.” You can also be just a human who’s watching CNN and saying, “It’s time to take this stuff offline.” The question then is how do we quickly gather data from the outside world and prepare that for data scientists to retrain the model and then redeploy it back into production?
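Editor's illustration: the two judgment calls described above, an automated "this is out of distribution" check and a human override, might be sketched as follows. This is a deliberately simple z-score test on the batch mean, one of many possible drift checks and not necessarily what Striveworks uses. The function names, the threshold of 3.0, and the sample numbers are all assumptions.

```python
from statistics import mean, stdev


def out_of_distribution(reference, incoming, threshold=3.0):
    """Flag a batch whose mean is more than `threshold` standard errors
    away from the mean of the reference (training-time) data."""
    ref_mean = mean(reference)
    ref_std = stdev(reference)
    if ref_std == 0:
        # Degenerate reference: any difference in the mean counts as a shift.
        return mean(incoming) != ref_mean
    std_error = ref_std / len(incoming) ** 0.5
    z_score = abs(mean(incoming) - ref_mean) / std_error
    return z_score > threshold


def should_take_offline(reference, incoming, human_override=False):
    """Take the model offline if either the statistical check fires
    or a person has decided the world has changed."""
    return human_override or out_of_distribution(reference, incoming)


# Hypothetical numbers, invented for illustration.
training_prices = [2.0, 2.1, 1.9, 2.0, 2.05, 1.95]
normal_day = [2.0, 1.98, 2.03, 2.07]
hurricane_day = [3.1, 3.4, 3.3, 3.6]

print(should_take_offline(training_prices, normal_day))
print(should_take_offline(training_prices, hurricane_day))
print(should_take_offline(training_prices, normal_day, human_override=True))
```

A mean-shift test only catches one kind of change. A production system would likely monitor several statistics and route a flagged model into the retraining workflow rather than only switching it off.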
This segment is part 1 in the series: Thought Leaders in Big Data: Jim Rebesco, CEO of Striveworks