Sramana Mitra: Do you want to take a different example from a different vertical perhaps and illustrate more of your point of view?
Angela Zutavern: We can talk about healthcare. We partnered with the National Institutes of Health on reading MRI scans for heart health. Trained cardiologists had to spend 20 to 30 minutes looking at these heart scans on a computer and coming up with calculations that would let them know how the blood is flowing in a person’s heart.
We put the challenge out to the community to build algorithms that could automatically read those heart scans. What was interesting about this is it was a competition. The winners of the competition had absolutely no background in health or medicine whatsoever. One of them was from the finance industry. They just learned what they needed to know from a 20-minute tutorial from one of the cardiologists. Clearly, machines can help in understanding diagnosis and suggesting treatments.
Sramana Mitra: When you say you sent out invitations to a large number of data scientists, what does that mean? What is the procedure? Did you work with a platform like Kaggle?
Angela Zutavern: Exactly. We worked with Kaggle. This is the biggest competition that Kaggle runs every year, the annual Data Science Bowl. Each year, we choose a very challenging problem and we reach out to Kaggle’s community. They all volunteer to solve it and we have prize money.
Two years ago, we did the heart health example that I mentioned. This past year, we wrapped up a challenge around diagnosing lung cancer. For that, we were able to award $1 million in prizes through one of the foundations we worked with. The challenge there was to discover ways of diagnosing lung cancer that people just aren’t looking for.
Josh Sullivan: We just finished this two weeks ago. Anthony Goldbloom, Angela, and I created this annual data science competition where 10,000 people participated. There’s this technology in 2010 that could reduce lung cancer deaths by 20%. NIH stated it in a study. 225,000 people a year are diagnosed with lung cancer. If you can save 20% of those people, that’s 45,000 people. That’s amazing.
The problem is that it has a very high false positive rate. I’ve seen it as high as 50%. You don’t want to tell someone who doesn’t have cancer that they have cancer. It’s very costly and there’s the mental anguish. This technology is not used by doctors because it’s not good enough yet. We took high-resolution DICOM files of 200 people who had been told that they have lung cancer. We published them as open source for the first time ever.
We put it out there and 10,000 people worked for three months. The result was 10% better than anything we have today. We now work with NIH. One team had students. Another team from the Netherlands had professional Kagglers. I see them all the time. None of them had a medical background. It’s amazing.
We wrote about this in the book. Look outside of your industry to create algorithms. It’s good to have experts in your line of work, but you also want people who know, comparatively, very little about your domain. They can be very valuable in terms of how you formulate machine learning.
This segment is part 3 in the series: Thought Leaders in Artificial Intelligence: Josh Sullivan, SVP, and Angela Zutavern, VP of Data Sciences at Booz Allen Hamilton