Welcome to Episode 318, part of the continuing series called “Behind the Scenes of the NetApp Tech ONTAP Podcast.”
This week, David Talby, CTO of John Snow Labs (david@johnsnowlabs.com, https://www.linkedin.com/in/davidtalby/, @davidtalby) and NetApp Healthcare AI & Cloud Principal Esteban Rubens (esteban.rubens@netapp.com, @esteban_aihc) join us to discuss how natural language processing is revolutionizing the way doctors diagnose and treat patients.
- NetApp NLP landing page
- Spark NLP
- Spark NLP for Healthcare
- Spark OCR
- Real-world case studies
- Peer-reviewed papers
- Live demos & notebooks
Tech ONTAP Community
We also now have a presence on the NetApp Communities page. You can subscribe there to get emails when we have new episodes.
Finding the Podcast
You can find this week’s episode here:
You can also find the Tech ONTAP Podcast on:
I also recently got asked how to leverage RSS for the podcast. You can do that here:
http://feeds.soundcloud.com/users/soundcloud:users:164421460/sounds.rss
Transcription
The following transcript was generated using Google’s voice-to-text transcription service. As it is AI-generated, YMMV.
I’m here in the basement of my house. And with me today, I have some special guests to talk to us all about NLP. And we’ll talk about what NLP is in a second. I had to Google it myself, and I’m pretty sure I got the answer, but we’ll find out. So with me today is Esteban Rubens, our resident healthcare expert here at NetApp. Esteban, what do you do here at NetApp? And how do we reach you?
00:19 – 01:03
Hey Justin, I’m part of the healthcare team, Healthcare and Life Sciences, and I focus on AI and cloud in healthcare, which is providers, payers, and life science companies, so pharma and genomics, anything that involves patient care.
All right, and how do we reach you?
You can reach me at Esteban Rubens on LinkedIn. It’s Esteban, and my last name is R-U-B-E-N-S.
All right, excellent. Also with us today, David Talby is here. So David, what do you do? And how do I reach you?
I’m the chief technology officer at John Snow Labs. We’re the developers of the Spark NLP library and the Spark NLP for Healthcare library.
01:03 – 01:50
I’m available by email, david at johnsnowlabs.com, or also on LinkedIn, just linkedin.com/in/davidtalby, D-A-V-I-D-T-A-L-B-Y.
All right, excellent, we’ll include that in the blog notes as well. David, did this company start before or after Game of Thrones?
During Game of Thrones, yes. And actually, there was some point at the end of one season when it was unknown whether Jon Snow was alive or dead, and before the next one we actually saw a significant bump in traffic to our own website, because “Jon Snow” was such a popular query, which worked well. But really, the company is not named after the Game of Thrones character.
01:51 – 02:51
Okay, that’s interesting. It actually worked out really well for you.
Yes, exactly. We’re named after the father of epidemiology. Dr. John Snow was a physician in Victorian London and is considered one of the first people to actually apply data to improve public health. That’s the inspiration behind the company.
You learn something new every day here at the podcast. So that’s excellent, and we can put a link in the blog.
I’d just say, I always thought it was really cool that they have that cholera map of London where he was annotating where people were getting sick, and the numbers and the locations. And then they found out that it was related to water sources, and they shut those down and people didn’t get sick anymore. So it’s actually really an amazing story of data science in the nineteenth century, like the father of contact tracing, basically.
So yeah, it’s exactly like that. So there was the cholera outbreak in Soho, in London, in 1854, and what John Snow did
02:51 – 03:50
was really amazing, you know, amazingly forward-looking at the time. He said, we’ll build a map of all the people who died. And he literally did contact tracing by walking house to house and trying to find those contacts, so yes, to an extent, definitely. And one of the interesting insights was that there was one water well that seemed to connect all of those patients. And it was a very big insight at the time, because before the invention of the microscope it was not known that cholera is waterborne, and that was really the big insight. So that’s really one of the great, well-known early examples of using data to mine insights that are unexpected and actually useful to people.
Cool. So I covered in the beginning that we’re going to talk about something called NLP. And before we start talking about that specifically, let’s cover the acronym. What is that acronym? Let’s unpack that.
03:51 – 04:45
Natural language processing, so your research is correct. Yes. It’s a field of computer science that asks: how do we understand natural language, meaning human language as we speak it? Of course computers can read programming languages, but we’d like them to be able to read and answer questions, for example about articles, like the reading comprehension on the SAT, or, for example, to understand patient notes and say what’s the right next thing we should do with this patient, or to read a legal contract or legal documents and tell you what’s in the case, or, with a customer support use case, to say, here is the correct answer from this document. For those things, whether it’s text, audio, or video, you need software that understands all the nuances and the jargon and the different ways we as humans speak.
04:46 – 04:55
And that’s what natural language processing is all about.
Okay, so that said, how does this relate to healthcare? Like, where does this fit in?
04:56 – 05:56
Sure. So in healthcare, text is still where a lot of the clinically relevant content is documented. One thing we did when we implemented electronic health records in the industry is that instead of my medical record being in a paper file on a shelf in the doctor’s office, now it’s in a computer, but it’s still all text. So when I go for a visit, there will be a note describing what happened: why I came in, what the symptoms were, what the physical exam was, what the discussion was about, what the plan is. When someone goes to a hospital, we also have the admission notes, discharge notes, progress notes, everything that’s happened with the patient. And because of really just the complexity of healthcare, and the fact that each person is unique, and that there are quite a few specialties in healthcare, if you want to do clinical decision support, meaning what’s the right thing to do next for this person, or if you want to define patient cohorts,
05:56 – 06:55
or if you want to understand, you know, what the COVID vaccine does and what long COVID looks like, in many cases the only place that actually has the information you want is text. And, for example, if you look at oncology, very basic things like where the tumor is, what the staging of the tumor is, whether it is metastatic, those values are only available in free text. If you’re dealing with mental health or social determinants of health, and you want to know whether there is a family history, or whether there is substance abuse or violence or other things, that will only be in text. So healthcare is really one of those domains where you want to move away from how we do it today, which is that people sit and read things one by one. That’s how you do it today. If you want to know how many people probably had the flu in your world, you sit down
06:56 – 07:15
a person to read them manually, one by one. If you want to get away from that, you need software that can not only understand natural language as we speak it and write it, but also understand it specifically in a clinical context: understand the jargon and medical terminology, and especially the questions we need answers for.
07:17 – 08:16
Yeah, just to add one thing: there’s so much variation. To David’s point, every institution, even every person, does things slightly differently. It’s jargon within jargon. I come from medical imaging, and even there you have reports, and you would think that people would describe anatomical features, bones, organs, the same way, and there are always subtle differences in what they say to refer to the same thing. And then there’s also the issue of the subtext, or things that are really understood if you read it but may not be clear unless you are human. How do we extract that other information, you know, tone or mood, or anything related to that kind of messaging? So the problem is really, really complicated and hasn’t been tractable until fairly recently, which is why it’s so
08:16 – 09:16
exciting to have NLP that actually works, and to have companies like John Snow Labs focusing on it in healthcare.
So David, can you tell us a little bit more about John Snow Labs? I know we touched on it a bit with the origin story of John Snow, but what about the company itself? Can you give us the history there?
Definitely. We’re a software company. We started in 2015. We’ve always been focused on healthcare. We are most well known for the Spark NLP library. Spark NLP is an open source library which has been the most widely used NLP library in the enterprise. So outside research, when people go to production and need to scale, they usually go with Spark NLP. We’ve been releasing new versions every two weeks for over four years now. There are more than five thousand models in the models hub, supported and available to the community,
09:16 – 10:07
across languages, and it’s pretty widely deployed in production in basically every environment. On top of the open source project, the way we fund it is that we have two software products: we have Spark NLP for Healthcare and we have Spark OCR. Spark NLP for Healthcare gives the healthcare and pharma industries state-of-the-art healthcare NLP, meaning clinical and biomedical natural language processing. It’s by far the most widely used in the field, and we have a majority market share in healthcare and life science. Usually people come to us because really what we’ve done is we’ve taken the latest research in deep learning and transfer learning and made it specific to the healthcare domain.
10:08 – 11:08
But it’s a different technical challenge, with the micro-languages. There’s a lot of jargon that’s very, very specific to a subspecialty or even an organization, so you need to be able to do tuning. And even at the academic level there are separate conferences and workshops for biomedical NLP. Our job is to track all the new research and innovations that are coming, and when something actually improves the state of the art, our job is to give the industry the production-grade, scalable version of that. In quite a few cases we were also able to improve accuracy and establish new state-of-the-art results, and we have several papers, you know, on Papers with Code, where we currently have the best accuracy level that’s been achieved. And now we work with the majority of the larger companies, quite a few US healthcare providers, and software companies.
So I know that there’s a lot of other similar
11:08 – 12:08
processing of language that is available. So, for example, with podcast transcriptions, I’ll run this audio through Google’s audio transcription services and it’s hit or miss as far as the results. So I imagine that with this type of language processing with medical terminology, you have to train on data sets to learn these different languages. I know I have to do that with the podcast; I have to tell it that NetApp is NetApp and that Justin is Justin. So are you constantly training these? Is this an automated process? How is that happening on the back end with your software?
Yep, so you’re absolutely correct, you do need the ability to train and tune models. And one of the features is that we enable you to do that: not just to configure your own pipelines, but also to train your own models and to tune existing models, like transfer learning, to different contexts.
12:08 – 13:08
And in healthcare this is critical, for many reasons. One is, yes, just as you have to tell it that NetApp is NetApp: in healthcare, for example, the UMLS terminology just in English has over three million different medical terms. If you think about that, it’s several times the size of the dictionary of the English language. So there’s a lot of complexity, but on top of that, there’s a lot of jargon that people use. And really the way to think about languages in healthcare is that each specialty has its own language. And I mean a language in the human way: if you want to learn to read pathology reports, you probably need to invest as much time as you would if you wanted to learn Russian. It’s different words, different subjects, different grammar, different context, different slang that people use to convey different things. Even doctors: if you’re a dentist, you’re not going to be able to read a radiology report.
13:08 – 13:44
It’s a different language. So we need to really specialize. And then on top of that, people just write things differently and they mean different things. And as Esteban said, that holds even within the same specialty. You’d think a radiology report on a CT scan of the chest should be the same. But no, in different places people call organs different things, they diagnose in a different way, they call out different things. And then there are also things that are super specific to context. So, for example, one of the examples we’ve seen in the past was when someone writes, for example, “patient denies alcohol.”
13:45 – 14:17
What they really mean, as a doctor, is “I suspect alcohol abuse.” So this is a message to the next doctor: look, try not to give this person medication that conflicts with alcohol. So those are the kind of things where, if you’re doing question answering and want to see what’s the next best thing for this patient, what the risks are, what organs are maybe affected, you need to get the information correctly, which means that you really need to understand the domain and the context that you’re running in.
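To make the “patient denies alcohol” example concrete, here is a minimal sketch in Python. It is a toy, rule-based check and not how Spark NLP for Healthcare works; the cue lists, the status labels, and the function name `assertion_status` are assumptions made up for illustration. The point is only that “denies” should not be collapsed into a plain “absent.”

```python
import re

# Hypothetical cue lists, chosen for illustration only.
REPORTED_CUES = ("denies", "denied")
NEGATION_CUES = ("no", "without", "negative for")


def assertion_status(sentence: str, concept: str) -> str:
    """Return a coarse status for `concept` as mentioned in `sentence`."""
    text = sentence.lower()
    idx = text.find(concept.lower())
    if idx == -1:
        return "not_mentioned"

    # Only look at the words that come before the concept.
    before = text[:idx]

    def has_cue(cues):
        return any(re.search(r"\b" + re.escape(cue) + r"\b", before) for cue in cues)

    # "Denies" is kept separate from plain negation: it records what the
    # patient said, which a clinician may read as a hint rather than a fact.
    if has_cue(REPORTED_CUES):
        return "patient_denied"
    if has_cue(NEGATION_CUES):
        return "absent"
    return "present"


if __name__ == "__main__":
    for s in ("Patient denies alcohol use.", "No alcohol use.", "Reports alcohol use daily."):
        print(s, "->", assertion_status(s, "alcohol"))
```

A real clinical system would need trained models and far more context than a cue list can capture, which is the point being made in the conversation.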
14:19 – 15:17
Yeah, there are lots of other nuances there too, you know, accents, like Southern accents in America, or Northern accents in America, or you have different dialects of Spanish. Those all have to be factored into the transcriptions of the audio, and I imagine that healthcare audio and surveys would have the same problem there.
Yes. So one issue is, yes, a lot of the doctors, everywhere but also in the US, are not US-born and they have accents, definitely, and the accents vary. So if you’re dealing with, say, the call center industry, then usually, okay, if it’s my call center, I can have a model trained per accent: I can have my American English, Costa Rican English, Filipino English, Indian English, and the models pretty much work. With doctors, it’s really one by one, and that’s a lot of what you see in healthcare.
15:17 – 16:10
Dealing with more than one person, that happens a lot. But in healthcare, the problem is in essence worse than that, because what happens is this: say you have the 6th floor of a hospital, which is internal medicine A, and you have the 7th floor, which is internal medicine B. Those two floors are going to use different languages, or different jargon. Because what happens is you have a group of people, so you’re going to have, you know, six or seven doctors and 15 to 20 nurses. They work together day in, day out for five years, ten years. They’re going to build up their own language. So they’re going to agree among themselves: when I tell you to do test A, or please order tests A, B, C, what it means is that for this kind of patient, I’ve already ruled out two other things, because otherwise, you know, I would ask you for something else.
16:11 – 17:10
Right, so all of those things are implied things that people who worked together for a long time put together, which really means that you need to be super specific about how you train models and how you tune models over time.
So I know I focused a lot on audio transcriptions because that’s kind of what I’m familiar with. What other sorts of things does your service do? I mean, is it taking files of text? Is it taking images, or all of the above?
Yeah, so it’s all of the above, and one big thing that you need to do if you want to, for example, understand patients over time is to be able to take all the data you have about the patient, normalize it, and put it together. So some of the information is going to be digital text, for example notes that someone typed into an EHR, an electronic health record system. Some of the information is going to be from audio. It could be a doctor-patient conversation. It could be a doctor calling the pharma company and asking questions about a
17:10 – 17:23
potential side effect. It could be a patient calling a helpline, saying, you know, I’m having these symptoms, do I need to go to the ER? So that’s the audio. And then there are also a lot of scanned documents.
17:24 – 18:24
So there’s a lot of emailing PDFs around and even faxing in healthcare, especially for things like next-generation sequencing reports, lab reports, pathology reports. Basically every time they send you to another building, very often they’ll email. They used to just send you back with papers, and sometimes they still do, and then you need to scan them, or they email PDFs or fax them. So we deal with that. And then the images as well, so DICOM images, medical images, and for them you often need to extract information from the image itself, or you need to de-identify the image as part of de-identifying patients, so we deal with that as well. So basically, with Spark NLP for Healthcare we deal with clinical and medical text, and with Spark OCR we deal with the images and the visual documents.
So these data sources, I would imagine, are going to have to be stored somewhere. So where does John Snow Labs keep all this data? Or do they license out data sets from other people, or are they sharing and
18:24 – 18:26
collaborating across the industry?
18:28 – 19:28
Sure. So, a bit of everything. There are some data sets that we have built, because one of the challenges here is that a lot of sharing is either extremely difficult or sometimes even just illegal, for good privacy reasons. Some data sets are available publicly, shared across the industry. Some data sets we have built to implement specific models that were not available. And then there’s the customer data. And the way we operate is that we do not operate anything as a service. So if you need something like a medical conversation transcribed, you do not send it to us over the cloud. We install the software in your infrastructure, the software runs there, and the advantage to you is that you do not need to share anything with us. So you never send the data to any third party, everything runs within your own secure infrastructure, nothing gets shared, and the outputs are yours: you never share them with
19:28 – 20:28
anyone else, and we never see what happens. So from a privacy-by-design perspective, this works very well for this industry, because very often people are either unable or unwilling to share data for those purposes, either for privacy reasons or sometimes for other reasons. The other thing this enables you to do is run the software wherever the data is. Another thing you have with healthcare data is that you often have data residency regulations: if a patient is in the UK, the data has to physically stay in the UK, and the same for Canada, Singapore, Brazil. So we can easily deal with those issues as well.
So how about your cloud presence? Do you have a way to leverage the cloud? Are you already using it for scaling out compute and storage? What is the story there?
Yes, so
20:28 – 21:28
the software also runs, and has been running in production, on all the major cloud platforms, and we have quite a few live customers on AWS, on Google, on Azure, and on other platforms as well, and we’ve done a lot of work over the years to make that easier for people to do. Our software is called Spark NLP because it is very much based on Apache Spark, making it still the only natively distributable and parallelizable natural language processing library out there, and this works for training as well as for inference. What it means is that if, for example, you use a managed Spark infrastructure, like EMR, then if you need to you can easily scale your compute that way. Similarly, you can of course scale storage on any of the large cloud providers. If you work within an ML platform, like Azure ML Studio or SageMaker,
21:28 – 22:20
you can install the software locally and still work within your container. And really, in healthcare a lot of the privacy by design and security by design is key to making sure we can work within a sandbox. So you can do that. But then you can work with all the cloud tools today: you can scale training, you can scale inference. People usually scale today either on Kubernetes or on larger Spark clusters, and all the cloud providers provide scaling and auto-scaling tools to enable you to do that. So usually our focus is just to make sure that with each of the large cloud providers, on all of those kinds of platforms, we integrate natively.
22:22 – 22:33
So are your customers expected to have their own existing Spark clusters and Kubernetes deployments, or does the John Snow Labs portion act as a platform-as-a-service and provide all that up front for them?
22:34 – 22:39
No, so our clients do that, and that’s consistently been the choice.
22:41 – 22:58
Initially, we thought that we would need to do some kind of platform as a service, but I think what we’re seeing specifically in healthcare and life science is that, because of privacy and compliance reasons, people really want things to run under their own control, and this has actually been a differentiator for us,
22:59 – 23:24
especially against some of the public cloud NLP services: just the ability to say, yes, look, you can run this within your own controlled infrastructure, whether it’s in your AWS or Azure account, and we can help you scale there if you need to, but nothing ever gets sent out, nothing gets shared. That’s been a positive differentiator as we see it.
23:25 – 24:25
So does the software get installed on clients, or is it an agent, or does it run as a container within their Kubernetes pods?
So we can go either way. You can install it as a library, and then you can run it on Databricks, or you can run it within a local container. Most people, when they start, just start with that, which enables you, for example, to run things on your laptop. And that’s another thing that often comes up. Sometimes people ask, do I really need to have a Spark cluster or a full Kubernetes cluster? And the answer is no, definitely just run things locally on your laptop, and you debug and build that way until you’re ready to go to production and scale. And then you can take the same containers and scale them, or, you know, hey, I need a large batch or streaming job, and you can scale it on a Spark cluster.
And as far as the data sets, I mean, it can consume any sort of data based on what the client can consume, right? So if it’s an NFS mount or
24:26 – 25:16
a block device or whatever you provide, the software doesn’t care.
Yeah, exactly. So the software is a library, so whatever you can read into memory, we can read. What we have done is a lot of work to optimize for specific, and especially the more large-scale, infrastructure, which is what Spark provides. So if you have storage like that, what you want to be able to do is actually distribute the jobs and use parallelism when processing very large amounts of data. We’ve done optimizations to enable you to do that, and that’s really where a lot of the benefit from Spark comes along.
Right, and your Spark back end would probably be something like a cluster, or maybe a FlexGroup, from ONTAP.
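The idea of distributing a job over partitions of the data can be sketched in a few lines of plain Python. This is only a local stand-in using the standard library’s thread pool, not Spark and not Spark NLP; the helper names (`partition`, `count_mentions`, `distributed_count`) and the naive whole-word match are assumptions for illustration. It reuses the earlier “how many people had the flu” question as the per-partition task.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def partition(items, n_parts):
    """Split items into n_parts roughly equal, order-preserving chunks."""
    size, rem = divmod(len(items), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < rem else 0)
        parts.append(items[start:end])
        start = end
    return parts


def count_mentions(notes, term):
    """Count notes that contain `term` as a whole word (case-insensitive)."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return sum(1 for note in notes if pattern.search(note))


def distributed_count(notes, term, n_parts=4):
    """Map the count over partitions in parallel, then reduce by summing."""
    parts = partition(notes, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partial_counts = list(pool.map(lambda p: count_mentions(p, term), parts))
    return sum(partial_counts)


if __name__ == "__main__":
    notes = [
        "Patient has flu",
        "No complaints",
        "Flu suspected",
        "Fluid intake normal",
        "Flu shot given",
    ]
    print(distributed_count(notes, "flu", n_parts=2))
```

On a real cluster the partitions would live on different machines and the framework would handle scheduling and retries; the map-then-reduce shape is the part that carries over.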
25:18 – 25:33
Precisely. Exactly.
Cool. And what about interfaces like HL7? Do you go out and get data from some healthcare applications, or does the data have to be curated out of the applications for you to look at?
25:34 – 26:34
So we do it both ways. Usually we get asked to get the data as well. HL7 is usually limited more to the structured data, but we definitely do process HL7, for two reasons. There are some notes within HL7, there are a few notes fields, but more commonly, just because of the way EHRs are, in fact people put free text where they shouldn’t. So the field is structured, but they put instructions there. So even if you’re looking to, you know, de-identify data, often you need to actually look even in all of those fields. So that’s one thing we are seeing, but I would say usually we need to get data from elsewhere as well, because when you deal with NLP, usually you deal with things that are not covered by FHIR. If you look at FHIR or OMOP, those kinds of data models, they usually focus on the core data model. So, you know, what is
26:34 – 27:34
the patient, what is a clinical encounter, what is a procedure, what is a prescription, what is a lab result. With NLP, you often say, well, here’s a 100-page document, can you please read it and answer specific clinical questions? Or I have a next-generation sequencing report, and the result is a thirty-page PDF that’s printed in a way that’s really hard for a machine to read, and can you tell me what are the relevant biomarkers with actionable alterations for this specific patient? Similarly with images: if you deal with DICOM images, usually you have an ID and then you go to some other location and get the image from there. So right now I would say very often, when you want to deal with
27:35 – 28:34
anything that’s not structured, you start dealing with some custom stuff.
So how about use cases? We’ve talked about what NLP can do and what sort of data sources it needs. Where would we use it? Where would it be the most beneficial within the industry?
Sure. So we have customers among payers, providers, pharma companies, and lots of software companies in this space, so we are seeing a broad range of use cases. Let me just think about some of the popular ones we are seeing. One set of use cases has to do with reading patient records and putting together a timeline of the patient. That’s important if you want to really just summarize where a patient is, to enable, for example, in oncology, a tumor board: for this patient, this is where they are right now, these are the pros and cons. And doing any kind of
28:34 – 28:43
decision support, so what is the recommended thing to do with this patient, which in most cases you cannot do with just structured data today.
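As a rough illustration of the “patient timeline” use case, the sketch below merges findings that were already extracted from different kinds of sources into one date-ordered list. Everything here is hypothetical: the `Event` fields, the source labels, and the de-duplication rule are assumptions for the example, and the hard part that the conversation is about (extracting the findings from notes, scans, and audio in the first place) is assumed to have happened upstream.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    when: date      # date the finding refers to
    source: str     # hypothetical labels, e.g. "ehr_note", "scanned_pdf", "audio"
    finding: str    # normalized text of the extracted finding


def build_timeline(*sources):
    """Merge event lists into one date-ordered list, dropping repeats.

    Two events count as the same if they share a date and a finding
    (compared case-insensitively); the first one seen is kept.
    """
    seen, timeline = set(), []
    for events in sources:
        for ev in events:
            key = (ev.when, ev.finding.lower())
            if key not in seen:
                seen.add(key)
                timeline.append(ev)
    return sorted(timeline, key=lambda ev: ev.when)


if __name__ == "__main__":
    notes = [Event(date(2021, 3, 1), "ehr_note", "Biopsy ordered")]
    scans = [
        Event(date(2021, 2, 10), "scanned_pdf", "Mass seen on CT"),
        Event(date(2021, 3, 1), "scanned_pdf", "biopsy ordered"),
    ]
    for ev in build_timeline(notes, scans):
        print(ev.when, ev.source, ev.finding)
```

Keeping the `source` on each event means a reviewer can trace every line of the summary back to the document it came from.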
28:45 – 29:23
0
en-us
0.73
Matching patients to clinical trials is a big problem, because that’s a revenue source to hospitals, to pharma companies, and most importantly, you know, delays slow people down. Right now, most trials do get delayed. Matching patients to new research, the most relevant papers for this specific type of patient, with this very specific condition, age, and demographics, that I should know about as the doctor, is another use case we are seeing. And let’s see, other use cases are there around real-world data and real-world evidence.
29:24 – 30:13
0
en-us
0.79
Okay, and so, for example, let’s say you’re looking, for example, at long COVID, right? So you have this population of, you know, five, ten, twenty, you know, two hundred million people, and you say, okay, I want to see what happened to them clinically, and, you know, what’s interesting? And you’re not really sure what you’re looking for. And so, in those cases, especially in the US, that’s what you have, that’s the stuff that’s available. So if you’re looking for cases where people had symptoms, or we suspected something, or there was family history, right? Or the patient complained about something, but we didn’t quite know what to do with it. All of that is only going to be available in text. So having software automatically read those reports, right? So, you know, in oncology, it’s more about the radiology and pathology reports; in office visits, it’s more kind of the office visit notes themselves, sometimes lab reports.
30:14 – 31:13
0
en-us
0.77
That’s another common use case, within population health. One big question within population health is, okay, let’s say I have a program, right? So a program for, you know, pregnancy, undiagnosed diabetes, undiagnosed depression. Your first problem is, how do I find the patients I should reach out to, to try to get them into the program? And if you want to understand, for example, undiagnosed diabetes, it’s undiagnosed. It’s obviously not going to have, you know, a clinical code that’s attached to it, right? If you want people who are at risk, for example, for, you know, suicidal depression, it’s not going to be like most people have a prescription, right? Because you’re looking for unmanaged patients, you need to look for hints, right? So social determinants, substance abuse, previous depression, family history, those kinds of things, to see, you know, who may be a patient that can potentially benefit, which means that you need to read the notes, right? And really, the way people do this today, I mean, it seems very archaic, but really, people sit down and read patient notes
31:13 – 31:20
0
en-us
0.80
One by one, which takes months. And that means that, you know, a lot of things just never happen. And today we do have the capability to do this fairly well.
31:21 – 31:51
0
en-us
0.86
So that’s from the clinical side, and then there’s also a lot of work with NLP on the biomedical side. Biomedical research is, for example, being able to automatically read medical literature and extract the insights from it. So finding, you know, what’s been published recently about, you know, drug-drug interactions, about genes, gene variants and human phenotypes, Volkswagen Steamboat humans, a decent, you losers from different clinical trials.
31:52 – 32:51
0
en-us
0.89
A billion Astra. So, there’s a lot of work automatically reading the papers, automatically summarizing them, and automatically building knowledge graphs, so that you can do kind of multi-hop queries, right? You can say, this paper says that, you know, this protein has this, you know, biological mechanism; this other paper says that this biological mechanism is involved in this symptom of this disease; and this other paper tells us that this other drug seems to have a similar biological mechanism. So maybe you can use the other as well. So those are the kinds of things that, you know, drug discovery is seeing a lot of today as well. And so I would say, look, it’s, you know, a very broad type of use cases here. We are seeing fraud, waste, and abuse use cases, identity theft use cases, with a C or we both in use cases. I think we are right now really very early on, because really now, for the first time in history, we’ve taken all these notes that doctors have been trained to write about us and have
32:51 – 33:51
0
en-us
0.89
Think about this, and two things happened. First of all, for the first time in history, they’re actually digital, actually available in a computer, not, you know, sitting on the shelf on paper somewhere. And the second thing that happened was that deep learning and transfer learning happened to NLP, increasingly just in the past three, four years. The number of use cases that actually became possible, right, that moved from, you know, like saying, like translation, right, where you were twenty years ago. It’s also it’s also be visible to us. We’ve had similar leaps in accuracy. Right now, on more and more tasks, I can tell you, you know, we can give you the academic results: humans do as well as machines, right, on this specific task. De-identification is one of them, and there are also some other kind of super specific tasks where we can pull it off as well. So I would say it’s very early days. We are seeing a lot of things that people are trying, and I think we have easily a good, you know, two, three decades of, you know, progress and actually Port exercising thing.
33:52 – 34:05
0
en-us
0.75
So, how about how it’s applied to, you know, the modern stuff that’s going on with the pandemic? You know, where would the language processing pieces fall in there, or does it even have an application in that space?
34:06 – 34:49
0
en-us
0.88
Definitely. I mean, we’re definitely seeing some new applications with COVID, and, you know, I mean, it’s been an unfortunate accelerator, right, for many things, many technologies, NLP being one of them. One of the things we’ve seen really in all the industry surveys in the past two years is that in 2020, and this was, you know, post-COVID, and then again in 2021 in the fall, enterprise investment in NLP has gone up by thirty percent in the last kind of doing the surveys, and it’s really can start the security Nai, you know, pretty much the only consistently growing kind of investment. NLP is really one of those things that’s become one of those two technologies, at least in healthcare.
34:50 – 35:11
0
en-us
0.75
And so, some of the main use cases we’re seeing: one of them, a lot of the hospitals became really just deluged with patient questions, so a lot of use cases are around looking at questions from patients, like either from phone calls or from emails or from web forms. Well, either, can we automatically answer these questions?
35:13 – 36:13
0
en-us
0.89
Right, it would be the last about 12, or more importantly, can we automatically classify clinical emergencies, right? For example, one real use case we’ve done, we had with a large healthcare system here, was that they were receiving about 70,000 online, you know, messages, right, from different messaging channels, web forms, online messenger chats, all of that, a month. And the problem was that about 2% of them were actual clinical emergencies, right? Well, someone described thus Implement like you with this and look you should not be feeling away from. You should not be a right now. And you’d think that people would, you know, call 911, but sometimes they don’t. And the problem is, with 70,000 today, it was just impossible to manually read and filter everything, right, especially when ninety-eight, ninety-nine percent of them are not emergencies. So this is one case where, if you can have an automated model and find the ones that are either an emergency or likely to be an emergency and immediately forward
36:13 – 37:13
0
en-us
0.71
To clinicians, you can save people, right? And that’s something that works specifically with COVID, because a lot of people really moved from coming in to, you know, calling, or using messaging, using whatever form of communication was available. Another thing that COVID seriously accelerated was automatically summarizing and learning from medical research. So, one of the things that happened with COVID is that, I don’t remember the exact number, but hundreds of thousands of new academic papers about COVID were published in the last 18 months. And the point is that no one can read them, right? I mean, you know, it used to be the case, like, you know, that if you were a heart surgeon, every month you’d just sit down one weekend and read all the new papers about heart surgery that came out this month, right? Today, I think I saw one estimate a couple of weeks ago that said you need to spend twenty-six hours a day reading just to catch up on your own subspecialty, right, as a doctor, right? So you just don’t know. So being able to basically automate this
37:13 – 38:12
0
en-us
0.71
I’m supposed to reflect the papers exactly to me. What, you know, what is the general agreement in research, right? What are the key symptoms? What seems to work? What are the side effects? What are the probable mechanisms, the demographic impact? That’s another thing that’s in use very, very heavily. So that’s another thing that I think, definitely, will keep happening. We have a lot of really reading academic papers and summarizing them so that people can keep up. And the other big thing right now, if you’re looking at the vaccine, looking at long COVID, one of the really open questions, right, that’s, you know, one of the most urgent ones to kind of, you know, world science today, is really what happens, right? So, you know, what is long COVID, right? And what are the side effects, right? What are the side effects of the vaccine? What do we not know? What do we do about the things we haven’t tested in clinical trials, like, you know, children, people with cancer, people with pregnancies? And for that, what you want to do is you want to, you know, collect
38:13 – 39:00
0
en-us
0.79
That’s right. You want to look at medical records and what people reported over the past eighteen months and very quickly be able to come up with those insights. And a lot of it depends on being able to read the notes, right? Because a lot of it, you know, the patient complained about, you know, whatever, you know, nausea and inability to sleep, but it passed after three days. It’s just going to be a note. There’s not going to be anything else. So for a lot of those things, really, you’re only going to be able to pay attention and say, oh, look, this is important because this happened to, you know, eighty thousand people, you know, within this demographic, if you’re able to automatically read the notes and summarize things, right? Then then once you have selected, they’ve done all these two strands. So, where do you see NLP going in the next few years? Like, what sort of trends are you noticing, and where do you think it’ll end up?
39:02 – 40:00
0
en-us
0.75
I think it’s very early days right now. It’s very, very exciting. Right now we are still seeing really, I mean, you know, accuracy is improving literally on, you know, a quarterly basis, and we do a lot of re-implementing things, right? Implementing new and better ideas as they come along. If we look forward, I think there are some super interesting trends that are just around the corner. One of them is really more no-code and low-code solutions, right? So right now you still need to be a data scientist and have some experience with deep learning to be able to train your models. I think right now we have some to be a viral infection, and what you’re seeing right now is really the first time where really doctors are training and tuning models, right? So, you know, you ideally kind of say, okay, look, I want you to read, you know, automatically translate radiology reports and I understand there, you know other activities like root canals, you know, should we do something, you know, with this patient. You want a domain expert, like, you know, like a doctor, like a lawyer
40:00 – 41:00
0
en-us
0.77
Chris Events, like a financial analyst with financial disclosures, to automatically train and tune the model without coding, right? And be able to deploy it all the way. And I think that, you know, we’ll get there in, you know, the next few years. That’s one thing. Another big thing is moving to enable much more direct question answering, right? So right now it’s kind of natural language BI, but I think there’s a way to go to do much better in terms of accuracy and kind of being able to answer domain-specific questions. So that really sucks. You have to be machines, really as human explicit images of just give me, can give me all the time. Give me all the noisy data, right? I don’t care if the same address, if it’s scanned, if it’s half, if you have missing data, if it’s, you know, super local; just give it to me and just let me ask my questions about the patients, right? So, you know, how many patients with, you know, stage 3 of this kind of cancer had this specific side effect, right?
41:00 – 41:16
0
en-us
0.74
How many people on the trial, you know, showed this level of improvement, yes or no, compared to, like, other groups? And really you’ll be able to ask natural language questions like researchers do and get immediate answers. And I think we’re only now getting to a point where kind of all the underlying pieces are there to enable this.
41:17 – 42:14
0
en-us
0.74
Which is very, very exciting. And he was in progress, really want government with this. The third, I think, very important thing we’re seeing is, I think responsible AI is becoming more and more, you know, it’s moving from the talk stage to the reality stage. And that’s also something, I mean, where we’re seeing progress. We are doing a lot of work in that area around making sure that dealing with bias, building transparency, dealing with, you know, fairness, dealing with really just safety, right, which is really the first thing you care about in healthcare, really becomes much more ingrained in the data science development process, right? So first of all, it’s not an afterthought, but the other thing, the tools are there, the processes are there. I think, kind of, you know, we’re going to, you know, blink twice and it’s going to be one of the things that’s just going to be kind of ingrained best practices that people will just come to expect. So you think we’ll get to a point where we’ll have Alexa for healthcare?
42:15 – 43:15
0
en-us
0.76
Oh, we definitely will. Yes. I’d always add, though, one important thing. This is not going to replace doctors, right? I think that’s some, you know, kind of fairy tale stuff: yeah, we’re going to do away with doctors. I don’t see that happening at all. I think that the most useful tool in medicine for the past five thousand years is a face-to-face discussion between the doctor and the patient, right? And that’s going to remain the case, right, you know, even, you know, even installed like a human doctor, right? Because you have to talk to your patients, see what the problem is. Yeah, but I think definitely what’s going to happen is that both the patients and the doctors will have tools that basically bypass a lot of what’s very, very painful to both today, right? So we know doctors hate the EHRs, they hate the administrative work, right? They hate the documentation. They hate the quality reporting, the, you know, late hours, right? And then the regular
43:15 – 43:53
0
en-us
0.75
We you know be yesterday don’t think helps, you know, as the patient, I can’t read a lot of parts in my medical records, even if I have access to them, right? If I have a question about it, you know, I need to book an appointment and do that. That’s not really what I want. I think we will get to a point where, yes, all of that is going to be much, much nicer. I’ll be able to go and talk to my doctor and, you know, it will be transcribed. The doctor will get the recommendations. I will get the explanations. All the back-end kind of processing, reporting, regulatory and such will happen in the back end. And I think we will get there. You know, it’s going to take a few days, but I think that definitely is the direction.
43:54 – 44:50
0
en-us
0.78
Yeah, I don’t see this replacing doctors either. I mean, you still have bedside manner. There’s still the need for that personal touch. I think what it’ll eventually do is give doctors more time to be with their patients, because right now, I mean, they can spend a few minutes and then they’ve got to move on to the next one. So if you’re able to do less of that administrative work and have more help trying to find answers, you know, because you can’t remember everything, ultimately it’s going to be a win, I think, for everybody in the industry. Oh, definitely. Definitely. Look, and I think, as I said, there are two big wins here. One is assisted. Look, we have a problem with doctors burning out, disliking what they do, and spending less time with patients than they should, than they want, and than we need them to. That’s a big, big problem. And I think, yeah, as technologists we have a responsibility and duty to reduce the, you know, what it shows the position, but the other, even bigger thing is really just equal access to healthcare.
44:51 – 45:06
0
en-us
0.73
Right, because one thing, that’s not only in the US, it’s an even bigger problem outside the US: if you just look at the, you know, population growth, and you look at the number of doctors being trained, we’re not even close, and the gap is going to get bigger.
45:07 – 46:07
0
en-us
0.70
Right. So the problem is, you know, many people are not going to be able to see, you know, especially a specialist, right? So, you know, if you need a cardiologist, right, or a psychiatrist, you know, an oncologist or an orthopedist, the person may just not be there. So what do you need? You need to give the doctors that you do have superpowers, right? The doctor can, you know, scan the patient, right? And, you know, the patient will automatically be analyzed against a database of, like, literally all medical knowledge humanity has, right today at points. Best interests of the patient, right? And then you’re making the doctor really much more knowledgeable and able at that point to help, right? Well, of course, the doctor is a person; they need to explain to me what my problem is, you know, discuss my options, what I’d like to do, right? But we also have a responsibility to make sure that, you know, all the great work that’s happening, for example, in precision medicine, right?
46:07 – 46:45
0
en-us
0.69
Inventive spelling of prediction ophthalmology and holiday, can actually get to everyone in the world that needs them, and, you know, we’re just not going to get there with human doctors alone, right? So that’s another big, big thing that, let’s say, AI can do for us. I would imagine it would also help kind of tamp down the WebMD effect, right? Like, you know, WebMD is great. You can find a lot of information out there, but you can also misdiagnose yourself pretty easily. So I would imagine that with AI and machine learning, those misdiagnoses, if you’re trying to do self-care, are going to be a lot lower, because you’re going to have a higher probability of success because of the data that you’re feeding in.
46:46 – 47:45
0
en-us
0.83
Yes. Yeah, I think yes. So yeah, WebMD. Look, you know, before WebMD we had actual books, right? And that’s how you could self-diagnose, and most people would just not have access to them. Right now, with M DSP h e, I may have seemed to me. I Googled myself to say that. Okay, you know, she feels why you may die today or not. Right? But the problem is all I have. Yes, I have some stuff. So if I ever, you know, I think my pinky is broken, I could put in Google broken pinky and see what comes up, right? Well, really what I want to put in is my entire medical history, right? Here are my demographics. Here’s my, you know, my writer’s is on the other things. I have, I have a family needs to be replaced. I’m taking these medications and, you know, had this operation last month. Right now my pinky is, you know, broken, and then, you know, you’d want something smart to tell you: look, that’s probably, you know, a fairly well-known side effect of the medication, because, you know, you don’t remember, but we changed the dosage two weeks ago, right?
47:45 – 48:45
0
en-us
0.76
That’s a much more personalized thing. So that will help. You know, what you will still have a problem with is that, you know, none of this is going to replace the actual human doctor. So what you’ll have, you’ll have more educated patients, which is great, using more personalized tools, which is very, you know, but you’d still need them to get professional advice, especially when things get serious. So, yeah, you know, we’re not going to get away from that. No, absolutely. And I would hope we wouldn’t. I mean, nobody can know all these things, including the doctors. So having a place where we can store all this information and compare it against other results and other, you know, vital signs, that sort of thing, is going to be invaluable to getting more accurate results when we’re doing our diagnosis. Exactly. And also, one thing about healthcare is, in general, like, people don’t want to be educated, right? So, you know, people spend tons of time, you know, memorizing, you know, like sports
48:45 – 49:39
0
en-us
0.75
Statistics, right, looking at new clothes and brands, right, and memorizing, you know, movie lines, because that’s fun. And, you know, it’s socially, you know, nice, right? Nobody wants to sit down and say, let’s read about all the different kinds of cancers that can happen to me, right, and how the treatment would be, just so that I’m educated. Like, nobody does this, right? This is not how, you know, normal people spend their afternoon, right? Which means that when something happens, you have this, you know, serious gap of knowledge, right? Which is why, you know, people look, right, for, you know, experts, like with the Google experts. And I mean, that’s going to remain the case with healthcare, right? You’re like the city of really, you know, unpleasant things that can happen. And then, you know, when that happens, really, you know, you’re less than educated, right? And technology can help you with the, you know, with the education part.
49:40 – 50:40
0
en-us
0.75
It’s almost like it’s fulfilling the promise of what Watson was supposed to deliver. Was that a failure of math, the AI techniques that they were using, not enough compute? I guess everything that you’ve been talking about, summarizing papers and, you know, looking at all the available data, that’s what they were talking about, but they clearly didn’t deliver. So, where did they go wrong? I think, from what I’ve read, and, I mean, I don’t know the team intimately, but I’ve read a lot about it, I think a big piece of this was overpromising for what the technology was able to do at the time. And I think what we are seeing, I mean, we have, you know, we pride ourselves on having, you know, a lot of actual production deployments. Today, you know, when we talk to a customer and they say, for example, yes, we want to do this, you know, for example, automated, you know, like, is this a clinical emergency, yes or no, right? So, yep.
50:40 – 51:40
0
en-us
0.72
Really where you want to start is, okay, first of all, you know, we can give you state-of-the-art accuracy, but state-of-the-art accuracy doesn’t mean perfect, right? So you start with, okay, well, what’s the human workflow, right? How does the technology actually help you? How do you deal with issues of safety, right, and fairness and reliability? How do you deal, you know, with the case where the system is wrong, right? You need to put these together. Right now, I think really we are at an early stage. I think in every project, yeah, it’s kind of an education process with the customers, to say, okay, here’s how you’d want to put together the system, right? Here’s what the system came in in Buffalo and ultimate the way here, the things it may get wrong. And it’s a learning process. I think what IBM tried to do was to come and say they can sell all this future we described, you know, that’s maybe fifty years out or twenty years out, as if they already have it, and I think that’s one big failure. That’s one thing. The other thing, I think, and that’s something that really happened to me, really, I think it happens to everyone who gets into healthcare.
51:40 – 52:40
0
en-us
0.90
The saying in healthcare is, look, in the first ten years in healthcare, you’re a newbie, and you should, you know, you should know that. I think there was an underestimate of how technically hard healthcare is compared to other industries, and I think people outside healthcare often underestimate it. They assume that they know healthcare, and healthcare is specific; it has its own jargon, its own rules. I think people underestimate just how technically hard the problem really is. Just, you know, like, forget doing AI for good, right, and helping people; just from a technology perspective, I think it’s a hard problem. And for example, one of the things that happened, and I know that IBM, for example, they trained with the most important oncology, you know, physicians, like amazing physicians, right? They were very well connected. Then there was one case study in Denmark. And what happened in Denmark, I think, is that the oncologists there completely stopped using the system. They said, oh, we only agree with the system 30%
52:40 – 53:05
0
en-us
0.74
The bank, right? And the issue wasn’t that, you know, the system was giving fundamentally wrong recommendations. It was giving US recommendations, right? And when you do something like oncology, first of all, not everything is well covered by guidelines; people disagree about the guidelines. Okay, but more importantly, especially if you move, you know, between the US and Europe, people just don’t agree on whether the same treatment is the correct one.
53:07 – 54:04
0
en-us
0.80
Right, but those are just differences in values and in the different ways people teach medicine, right? And that’s also where a lot of the human element comes in, right? Someone has a, you know, it’s a very personal decision. Do you want the aggressive treatment? Maybe you want palliative treatment, right? Maybe you do not want any treatment at all, right? Those are, you know, it’s not a medical question, right? It’s a human values question, right? And the fact is that if you go even to two hospitals within the same city, they would treat you differently. Right, right. So this is why we have second opinions, right? We want to go to a doctor that actually, you know, thinks like us, right, understands that there are certain things that we want out of the treatment. Like, you know, fixing a human body is not like fixing a car. When you fix a car you can say, look, here’s how you replace the tire, you know, on this kind of Volvo, right? You know, there’s a checklist.
54:05 – 54:42
0
en-us
0.76
Humans do not work with a checklist; humans are individuals, right? And I think that was kind of, you know, that was a big gap, right? So, I think that, more than just overpromising and putting the marketing machine ahead of the, you know, product itself, I think there was some basic misunderstanding of healthcare, right? Of just the level of complexity, the level of, you know, the level of nuance you need to solve to have something that people would actually use. All right, David, sounds like you’ve given us a lot to think about in regards to NLP and how it’s being used in the healthcare industry. Again, if we want to reach you, how do we do that?
54:44 – 55:09
0
en-us
0.87
So, over email: David@JohnSnowLabs.com. Yes. I’m at Esteban Rubens on LinkedIn, Esteban Rubens. All right, excellent. Thanks so much for joining us today.