Thought Leaders in Big Data: Jim Rebesco, CEO of Striveworks (Part 3)

Posted on Wednesday, Nov 30th 2022

Sramana Mitra: In the previous example that you gave about stock markets and hurricanes, how do you adjust? Let’s say you have a Black Swan event. How do you put that into the model? That’s an interesting use case.

Jim Rebesco: It’s really interesting. This double-clicks on the notion that the data that goes into a model needs to look like the data it’s going to see in production. In a Black Swan event, the core question is how quickly I can gather a new dataset from the outside world. Say there were X number of hurricanes in the past, and you now see a hurricane that looks like those. That resemblance could be true, or it might not be.

As a data scientist, you’re probably not the person who’s going to answer that, but a subject matter expert would be. Subject matter experts are going to be your best allies in determining what’s in sample and what’s out of sample.
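A minimal sketch, not from the interview, of what a first-pass "in sample or out of sample" check could look like. It flags features of a new observation that sit far from the historical data by z-score. The function name, the threshold of 3.0, and the hurricane numbers are all hypothetical, and a flag here is only a prompt to ask a subject matter expert, not an answer.

```python
from statistics import mean, stdev


def out_of_sample_features(training_rows, new_row, z_threshold=3.0):
    """Return the feature names in new_row that lie far from the training data.

    A feature is flagged when its z-score against the training rows exceeds
    z_threshold. This is a crude screen, not a substitute for expert review.
    """
    flagged = []
    for name, value in new_row.items():
        history = [row[name] for row in training_rows]
        mu = mean(history)
        sigma = stdev(history)  # sample standard deviation
        if sigma == 0:
            # Constant feature in training: any different value is unfamiliar.
            if value != mu:
                flagged.append(name)
            continue
        if abs(value - mu) / sigma > z_threshold:
            flagged.append(name)
    return flagged


# Made-up hurricane records, for illustration only.
past_hurricanes = [
    {"wind_mph": 95, "pressure_mb": 970},
    {"wind_mph": 110, "pressure_mb": 960},
    {"wind_mph": 120, "pressure_mb": 950},
    {"wind_mph": 105, "pressure_mb": 965},
]
typical = {"wind_mph": 108, "pressure_mb": 962}
extreme = {"wind_mph": 190, "pressure_mb": 880}

print(out_of_sample_features(past_hurricanes, typical))
print(out_of_sample_features(past_hurricanes, extreme))
```

A per-feature z-score misses unusual combinations of otherwise ordinary values, which is one more reason the final call belongs to someone who knows the domain.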

Sramana Mitra: When you have a real Black Swan event like a pandemic, no machine learning model can predict that.

Jim Rebesco: That’s right. We empirically saw that.

Sramana Mitra: Where are you seeing adoption?

Jim Rebesco: It’s scaling rapidly in the enterprise. These are the companies and organizations that have put dedicated data analytics teams together and are asking how to make those teams more effective. The trend is toward making this as automated a process as possible. How can we reduce the burden on organizations and move it to a more commoditized way of doing business?

Sramana Mitra: Where are the bulk of your customers? Are they more in enterprises or more in the AI companies that are building models and building stuff for other enterprises?

Jim Rebesco: It’s been very much the enterprise customers for us.

Sramana Mitra: What are the open problems in the ML Ops industry? I’m talking about the problems that are out there that are not being solved yet. What do you see from where you sit?

Jim Rebesco: First and foremost, ask yourself where the real intersection is between your ability, your passion, and a validated problem. The best way to do that is to ask: have you experienced this problem? Do a lot of people you know experience this problem? From there, you can wrap your hands around your ability to solve and contribute meaningfully to that space.

I’m a data scientist at heart. I feel very nervous when I talk outside of my domain. I walk and talk within the ML Ops space as a user. With that, where are those big pain points? One of them is very much in the notion of abstracting away all of the continued themes within the ML Ops experience. Metaphorically, how can I as a data scientist Alt + Tab between functions of options?

That’s something very interesting to me. Can we get to a point where training, deployment, monitoring, and retraining are no longer discrete processes, each one triggered by an email or a pager? Let’s talk about a use case. You’re presenting suggestions to the end user, and they’re swiping left or right. You just got an end user to label data for you. How is that being collected? How is that being turned into a training set in a way that doesn’t require intervention? How are errors monitored?

Those are the big unanswered questions in the ML space. It rolls up to the idea of how you can make this whole stack disappear and make it boring. Five years ago, it might have been an exotic question if I said I had an object store and I wanted to instantiate some sort of API so I could programmatically access it. Today, that’s the S3 protocol, which is the de facto standard. That level of commoditization is the next level to unlock in our industry.
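A minimal sketch, not from the interview, of the swipe use case described above: raw swipe events become labeled training examples without manual intervention, and malformed events are set aside so errors can be monitored. The `SwipeEvent` fields, the label mapping, and the "latest swipe wins" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwipeEvent:
    user_id: str
    item_id: str
    direction: str  # expected to be "left" or "right"


# Assumed mapping: a right swipe is a positive label, a left swipe a negative one.
LABELS = {"right": 1, "left": 0}


def build_training_set(events):
    """Turn swipe events into (user_id, item_id, label) examples.

    Events with an unknown direction are returned separately so they can be
    counted and monitored rather than silently dropped. If a user swipes on
    the same item more than once, the most recent swipe wins.
    """
    latest = {}
    rejected = []
    for event in events:
        if event.direction not in LABELS:
            rejected.append(event)
            continue
        latest[(event.user_id, event.item_id)] = LABELS[event.direction]
    examples = [(user, item, label) for (user, item), label in latest.items()]
    return examples, rejected


events = [
    SwipeEvent("u1", "a", "right"),
    SwipeEvent("u1", "b", "left"),
    SwipeEvent("u1", "a", "left"),  # the user changed their mind about item "a"
    SwipeEvent("u2", "a", "up"),    # malformed event
]
examples, rejected = build_training_set(events)
print(examples)
print(len(rejected), "event(s) rejected")
```

In a real pipeline the rejected count would feed a monitoring dashboard, and the examples would be written to versioned storage that a retraining job picks up on its own schedule.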

Sramana Mitra: Thank you for your time.

This segment is part 3 in the series: Thought Leaders in Big Data: Jim Rebesco, CEO of Striveworks