Sramana Mitra: What technique do you use? Since this is not big data, you cannot use machine learning. Do you use expert systems? How are you setting these things up?
David Talby: We do use machine learning and deep learning. We use a lot of deep learning and transfer learning. Now, we have our own built-in embeddings.
One of the things that we did in the last three months of 2020 was release three papers. On the medical named entity recognition benchmarks on Papers with Code, we hold the top spot on eight out of eleven. We are the most accurate.
Deep learning and transfer learning work. That’s where most of the state-of-the-art techniques are. On top of that, you need to build software engineering systems. There are two things we are proud of.
One, we apply the latest state-of-the-art techniques in this space. You have the open source, and you have hundreds of thousands of people who train custom models with us every day. The other thing that we are proud of is that we have real case studies with some of the world’s largest companies in this space that have taken systems to production.
As you know, there is a big gap between having a nice academic prototype versus an actual production system with real data. It’s not a data science challenge that closes the gap; it’s the software engineering challenge. You need to be able to train localized models. You need to be able to measure them.
Sramana Mitra: You also need access to the data.
David Talby: Exactly. Privacy compliance is a big thing. We have a whole team working on that. We have a person who does the compliance questionnaires to show everything that we do, because we get audited frequently. You have scaling issues as well.
In other areas of AI, these things are considered nice to have. In healthcare, you cannot go to production if you don’t have versioning or if you don’t have an audit trail. It’s not just per hospital that you need different models. Sometimes it’s per floor that you need different models. You also have multiple specialties and subspecialties.
If you cannot generate and support many models in production, that means you can’t deploy. The system will not work. The other is the feedback loop. The system needs to be able to learn. Feedback loops also allow you to learn from small data. You need to be able to learn from a handful of patients and not from 500,000 people who came to your website. That is the second part of the effort.
When we do projects and work with customers, there are always three parts: the data scientists, the software engineers, and the clinicians. You have all three within your project team.
This segment is part 5 in the series: Thought Leaders in Artificial Intelligence: David Talby, CTO of John Snow Labs