Gideon Mendels: If you think about the managers, they’re mostly looking for this visibility. If I use the software engineering analogy, as engineering managers we are able to look at everyone’s pull requests and commit history. We have full visibility on who’s working on what. A lot of these managers rely on self-reported progress.
With Comet, we provide them with this full visibility. As a manager, you can see every result of every experiment by every team member. The main goal there is that those managers can support their team members and guide them in understanding what’s working and what’s not working.
At the organization level, all this institutional knowledge is saved. It sounds obvious, but a lot of teams don’t have that. Machine learning engineers are sought after and they tend to switch projects and jobs. If you don’t have a system like Comet, everything they know goes with them.
Sramana Mitra: When you look at these contributors who are using your product, it sounds to me like there are two personas. There is the data scientist and there is the developer.
Gideon Mendels: With the level of maturity in the industry today, personas are very confusing in terms of titles. You have data scientists who are more business analysts. You have software engineers who build models. Then you’ve got machine learning engineers who are focused on deployment. Titles are challenging. I agree with you. We serve both the people training and building the model and the software engineers who typically come in when the teams think about deploying the models.
Sramana Mitra: What is the state of the industry in terms of machine learning practitioners today? Let’s separate out the two personas. What tools does a data scientist learn today to be a data scientist?
Gideon Mendels: Python, by far, is the most used programming language. That will be table stakes. Then you have these machine learning frameworks like PyTorch, Keras, TensorFlow. Those are higher level frameworks written in Python. Then there are a lot of open source tools that people use. That’s dependent on what problems you’re solving. Jupyter Notebook is a very common one. Pandas helps with data processing. Then you have tools like Comet. Eventually you do have to be familiar with most of these tools to be successful in the role.
Sramana Mitra: If you’re a developer, what else do you need on top?
Gideon Mendels: If you think about the persona that focuses on putting these models into production, there are a couple of frameworks that help serve models in production. Some people decide to build it from scratch based on typical web frameworks. There are tools that help make that easier. A lot of times, Kubernetes is the lower-level orchestration layer, with some type of web framework on top to serve HTTP. Then when you’re doing it at scale, you start to get into more interesting things.
Generally speaking, a model can only make a prediction on one data point at a time. There’s a question of how you batch them together. I wouldn’t say that there is one tool that if you learn that, you’ll be successful. It is dependent on what problem you’re trying to solve and what your software stack looks like today. You don’t typically do something from scratch now. You build on top of what you already have in place.
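To make the batching question above concrete, here is a minimal Python sketch of grouping incoming data points into fixed-size batches before handing them to a model. It is an illustration only, not how Comet or any particular serving framework does it: `make_batches`, `predict_all`, and `toy_model` are hypothetical names, and `toy_model` is a stand-in that simply doubles its inputs.

```python
from typing import Callable, Iterable, Iterator, List, Sequence


def make_batches(items: Iterable[float], batch_size: int) -> Iterator[List[float]]:
    """Group incoming data points into lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[float] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []  # start a fresh list so yielded batches stay independent
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch


def toy_model(batch: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for a real model: doubles each input."""
    return [2.0 * x for x in batch]


def predict_all(
    items: Iterable[float],
    model: Callable[[Sequence[float]], List[float]],
    batch_size: int,
) -> List[float]:
    """Run the model batch by batch and return predictions in input order."""
    predictions: List[float] = []
    for batch in make_batches(items, batch_size):
        predictions.extend(model(batch))
    return predictions


if __name__ == "__main__":
    print(predict_all([1.0, 2.0, 3.0, 4.0, 5.0], toy_model, batch_size=2))
```

In a real serving stack the batch size, and how long to wait for a batch to fill, would be tuned against latency requirements. This sketch ignores timing entirely and only shows the grouping step.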
This segment is part 3 in the series: Thought Leaders in Artificial Intelligence: Comet ML CEO Gideon Mendels