IBM Watson
Formal Metadata

Title: IBM Watson
Number of Parts: 19
License: CC Attribution 3.0 Germany: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/39644 (DOI)
Language: English
Transcript: English (auto-generated)
00:00
Firstly, my name is Justin Fessler. I've been working with Watson technology for about eight or nine years now, and I've been working with the federal government for about 10. The reason I say "contradict" is: search is great, but it's not comprehensive. It's actually an archaic, outdated way to find information. Watson is a little bit different. It's all about cognitive functionality in terms of findability, or, as Dolby said,
00:23
discovery of different types of information. So I'm gonna contradict myself as well as I go through this presentation, because we do have search capabilities. I'll talk a little bit about them, but I really wanna talk about context, and I wanna talk about cognitive abilities to find information quicker, easier, and not based on keywords. Because when you think of search,
00:41
you think of like the Google paradigm, where I go to Google and I put in a couple words and it finds thousands or hundreds of thousands of documents, and that really doesn't help. Those documents might contain a specific word, but it doesn't tell me what else is in the document and how it's related to the type of information that I'm looking for. So let me get started.
01:01
I'm gonna focus on the top three trends and challenges in AI today, what's coming, and what's actually available. Now, I wanna point to the bottom of the screen here: AI doesn't necessarily mean artificial intelligence. I know I'm an artificial intelligence strategist, but to me it means augmented intelligence. And the human ability to consume a lot of information is just lacking, right?
01:21
So AI is really the technology's ability to augment what the human can do to consume and understand a wide variety and a large volume of information a lot quicker and see what patterns exist in that information, okay? So, first challenge, information access. And I think my panelists did a really great job of encapsulating this.
01:40
How do we find information? How do we find relevant information? And how do we pull it all together in one place? That's really what WorldWideScience is meant to do: pull all the information relevant to my question, my query, in one place at one time. And yes, Watson does have this functionality; we actually call it a contextual 360-degree view. So just like you can perform a single search on WorldWideScience, you can perform a single search
02:01
in a Watson application: query, federate out to a wide variety of systems. Where we differentiate is the ability to ask a natural-language question. So, for example: which other researchers in the field of XYZ have done this? And you can ask that full question. And the benefit of Watson is to contextually understand
02:22
the intent of what's being asked. It's not based on keywords; it's based on the context of a query. So we can connect out to a wide variety of different systems, databases, and content management systems, through a wide variety of techniques: APIs, REST, JSON, web services, et cetera. And we actually have native connectors to,
02:41
I would say, two or three dozen repositories, whether they're structured databases through ODBC or JDBC, or content management systems through CMIS, which is an industry standard, et cetera. We can connect to a wide variety of systems and contextually pull it all together, okay? So this is challenge number one, which you guys are doing a really good job of solving.
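To make the federation idea concrete, here is a minimal sketch of fanning one natural-language question out to several repositories and merging the results into a single ranked list. The endpoint URLs and the JSON response shape are hypothetical placeholders, not real Watson or WorldWideScience APIs.

```python
# Minimal sketch: fan one natural-language question out to several
# repositories in parallel and merge into one ranked result list.
# The URLs and the {"hits": [{"title": ..., "score": ...}]} response
# shape are hypothetical placeholders.
import concurrent.futures
import requests

SOURCES = {
    "repo_a": "https://repo-a.example.org/api/search",
    "repo_b": "https://repo-b.example.org/api/search",
}

def query_source(name, url, question):
    """Send the same full question to one repository."""
    resp = requests.get(url, params={"q": question}, timeout=10)
    resp.raise_for_status()
    return [(name, hit) for hit in resp.json().get("hits", [])]

def federated_search(question):
    """Query every source concurrently; merge into one '360-degree' view."""
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = [pool.submit(query_source, n, u, question)
                   for n, u in SOURCES.items()]
        hits = [h for f in futures for h in f.result()]
    return sorted(hits, key=lambda pair: pair[1].get("score", 0), reverse=True)

# The whole question goes in, not keywords:
# federated_search("Which other researchers in the field of XYZ have done this?")
```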
03:01
Challenge number two gets into unstructured data. How do I extrapolate patterns, trends, and relationships that are just contained in freeform text? And again, how do I ask a natural-language query to identify those little bits of information that might be contained in a very large 20-, 30-, 50-page report that would normally require me
03:21
to do a Ctrl+F? And what if I don't find what I'm looking for on the first go-round? I do another Ctrl+F. And again, that's only contained within one document. Now, what we're providing is the ability to not just do a Ctrl+F, but to graphically show relationships, patterns, and connections within that natural language. This is really where Watson got its foundation:
03:41
on Jeopardy!, February 14th, 2011, Valentine's Day. So, not a lot of people know this, but as Alex Trebek read the clues to the contestants on Jeopardy!, Watson was fed a text message. So on the show, there was actually no active listening. We can do that today: we have speech-to-text capabilities, and we have text-to-speech capabilities, where the system can actually respond with a voice.
04:02
We have translation. We have a lot of different functionalities. We took this system and deconstructed it into a whole series of APIs, application programming interfaces; capabilities, in reality, okay? Now, there was a product called IBM Content Analytics at the time. It's now called Watson Explorer, which we'll talk about in a minute. This is IBM's flagship natural language processing engine,
04:23
so text analytics, not search, but understanding the linguistics around how people say things, okay? So in this application, Watson was fed a text message as the contestants were read a clue, and part of the grand challenge was to parse out that text message, understand what's really being asked in that clue,
04:42
have the system go query a very large, self-contained database of knowledge, find contextual responses, and then answer in the form of a question, contextually to that query. And we had to do all that in three seconds, as well as hit the plunger, so that was part of the grand challenge. And if you watched Jeopardy! in 2011, we smoked them,
05:01
and it was all because of this text analytic capability to understand how linguistics are being described in a clue and how linguistics are being described in a series of documents. And this system actually had hundreds of thousands of documents that were pulled from a variety of sources like Wikipedia, the World Wide Web, and Shakespearean literature; we had the entire Shakespeare collection
05:21
ingested into this application. But again, when you talk about Shakespeare, you talk about different idioms, different linguistic capabilities, and again, that also takes some translation to understand how English was written back in the 1600s versus how English is written today. So that's where contextual cognitive capabilities come in, to understand what's being described,
05:41
not just based off individual keywords. So we are really good at doing this: reading unstructured data and doing what I call attribution extraction, going into that information and understanding that there are entities, people, locations, and objects mentioned in it, and that those people, entities, and objects are linked to other people, entities, and objects,
06:02
and we have the ability to graphically show it, but also to show and push out results to analysts so they can make the best informed decisions based on how the text analytics is showing those connections. Now again, it's not based off keywords. In an application with Oak Ridge National Lab, we took this text analytics
06:20
and we ingested a few hundred scientific publications, and this is in the realm of materials science. I know it's a little bit tough to see up here, but in that little dropdown, if you come up here really quickly: we've actually gone and extracted authors, we've extracted organizations, we've extracted chemical compounds and substrates
06:43
that are contained in the natural language. Now, this is done through a few different techniques. First and foremost are what are called parsing rules. Parsing rules are the ability to create different mechanisms that say: when you recognize this mask, you know, big letter, little letter, number, big letter, little letter, number, right?
07:01
That might be a chemical compound, and we can go through hundreds of thousands of documents, automatically extract those compounds, and apply them to a compound field. That becomes a filter, and that gives the user a very quick way to slice and dice and find information contextually relevant to different characteristics of the metadata contained in that information.
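To illustrate the parsing-rule idea, here is a minimal sketch in Python. The regex is a simplified stand-in for the "big letter, little letter, number" mask; Watson Explorer's actual rule language is richer than a single regular expression.

```python
# Minimal sketch of a parsing rule: a character mask that flags likely
# chemical-compound tokens and copies them into a filterable field.
import re

# "Big letter, little letter, number" repeated at least twice:
# matches tokens like "TiO2", "NaCl", "C6H12O6". Note that all-caps
# acronyms such as "NASA" also match, which is why real parsing rules
# go beyond one regular expression.
COMPOUND_MASK = re.compile(r"\b(?:[A-Z][a-z]?\d*){2,}\b")

def extract_compounds(text):
    """Return candidate compound strings found in free-form text."""
    return sorted(set(COMPOUND_MASK.findall(text)))

doc = "The substrate was coated with TiO2 before the NaCl bath."
record = {"body": doc, "compound": extract_compounds(doc)}
print(record["compound"])  # ['NaCl', 'TiO2'] -> becomes a facet/filter field
```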
07:20
So not only are we searching, which yes, we're doing; we're also indexing, applying additional metadata based on relationships of content to each other, and then graphically showing them in this user interface, okay? So that's challenge number two, around unstructured information.
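As a sketch of the relationship-graph idea described above: extract entities from each document and connect the ones that co-occur, so the connections can be drawn for an analyst. The dictionary lookup here is a toy stand-in for Watson's actual annotators, and the entity names are invented examples.

```python
# Toy sketch: entities that co-occur in the same document get an edge,
# so an analyst can see the connections drawn as a graph.
import itertools
import networkx as nx

KNOWN_ENTITIES = {"Oak Ridge National Lab", "TiO2", "NaCl", "J. Smith"}

docs = [
    "J. Smith of Oak Ridge National Lab reported TiO2 coatings.",
    "TiO2 and NaCl interactions were studied by J. Smith.",
]

graph = nx.Graph()
for doc in docs:
    found = sorted(e for e in KNOWN_ENTITIES if e in doc)
    for a, b in itertools.combinations(found, 2):
        weight = graph.get_edge_data(a, b, {"weight": 0})["weight"]
        graph.add_edge(a, b, weight=weight + 1)  # repeat mentions strengthen the link

print(graph.edges(data=True))
```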
07:40
Challenge number three is: how do I then scale that? How do I take an artificially intelligent system, or an augmented intelligent system, and train that system to understand what I'm really talking about, right? And then get into predictive analytics, let me say. So here's an example of what I talked about earlier with those parsing rules. On the left-hand side is a document,
08:00
very unstructured in its formatting. On the right-hand side is how we've provided structure to that document. So I talked about parsing rules; here's an example. On the right-hand side, we have an arresting officer, and we know that officer is noted by a double-alpha, triple-numeric code. So every time I see that, I apply it to an arresting-officer field. So this is one way to initially do training: I've given it some examples,
08:22
and then the system will read and pull out those examples and associate them with different categories. This is not just search; this is text analytics, okay? So it's taking it a step further and showing those relationships, not based on Boolean, not based on just keywords. The system actually parses out that entire document and tells me what characteristics are associated with it
08:41
and what metadata is also associated with it. It's getting more holistic in how we're approaching a search application, okay? So I really like this slide, because in the legal space, there's a ton of jargon. In the scientific community, there's a ton of jargon. So now that we've got our unstructured data,
09:02
this wide corpus of unstructured data, normalized to some extent, where do we go from there? We wanna take that information, the historical information, and feed it into predictive models to get the what-if: where am I going if this happens or this occurs? But quite candidly, this is not an easy problem to solve.
09:22
I don't think there are many data scientists in here, are there? Right. So, I've been running data science for the past couple years. This is not easy. But the process of training a Watson application is fairly easy, and I'll get into that in a second. So that brings up supervised and unsupervised machine learning. Unsupervised machine learning
09:41
is actually somewhat related to clustering, which we heard about a couple minutes ago. But unsupervised machine learning is feeding a system information and letting the system tell me what is contained in that information. And then it kind of iterates on itself to say: oh, I found bits over here that might be related to bits over here; let's try to normalize them together and then push it all out to the end user.
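A minimal sketch of that unsupervised idea, using scikit-learn rather than any Watson component: cluster documents by their text alone and let the grouping surface what is related. The supervised counterpart would swap the clusterer for a classifier fit on human-labeled examples, plus the dictionaries and ontologies described next.

```python
# Minimal sketch of the unsupervised side: no labels, just let the
# system group documents and report what ended up together.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = [
    "muscle atrophy in microgravity",
    "bone and muscle loss aboard the ISS",
    "radiation exposure on long missions",
    "solar radiation shielding materials",
]

X = TfidfVectorizer(stop_words="english").fit_transform(docs)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
for label, doc in sorted(zip(labels, docs)):
    print(label, doc)  # related "bits" land in the same cluster
```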
10:00
And that's good, it works to some extent, but it doesn't necessarily take into account the human's ability to help train the system, to give it more context on what it's reading. And when you talk about supervised machine learning, this gets into feeding the system dictionaries and ontologies around the type of information that it's gonna be reading, so it has a better contextual understanding
10:20
of that type of information. So consider Watson, or any AI system that you might use, a five-year-old out of the box, okay? As that five-year-old grows up, it gleans more insight and more domain understanding of the scenarios and the environment it's working in. Same with Watson, right? You can feed it dictionaries, you can feed it ontologies,
10:40
you can feed it lexicons. The more information we can provide it, the better it's contextually able to understand and read a whole series of documents, so those relationships really get gleaned and surfaced. Now, there are a bunch of techniques, both from the unsupervised and the supervised perspective. And here we go: you'll see unsupervised learning, clustering, right? Supervised learning has a number
11:01
of different characteristics around classification and regression, and that can be broken down even further. I'm not gonna go into this, because we don't have enough time. I saw that there. But I do want to talk about use cases, and these are federal use cases; these are a number of my clients. I know this presentation is not exactly what's in all your decks today,
11:21
so I'll make this available, because I made some last-minute alterations. All right, we've got three NASA use cases that I'll talk about today. The first one is what we call Flight Operations Advisor. My team and I actually went out to United and Delta Airlines, and we went to their operations centers to see what they do on a daily basis.
11:40
It's kind of scary, actually. You can have one flight operator sitting at a desk like this one up here with eight screens. I know I can't look at eight screens at one time, but if something flashes on the bottom-left screen, they have to then reference the top-right screen and say: oh, what's going on here? And then they have to go to another screen, find a reference library of documents,
12:01
pull out one of the reference manuals, and do a Ctrl+F. You gotta be kidding me, right? If I'm on a flight, and I've got an operator down on the ground doing a Ctrl+F to try to figure out how to solve my problem, there's a big problem, right? So, contextually, we've given the operators the ability to leverage Watson,
12:21
ask a natural-language question, and say: we've got a scenario given these constraints; how do I solve that flight's problem? And it's going into reference libraries, it's going into streaming data sensors, combining all that information, and presenting the best result sets, not based on keywords, but based on the context of what's in the manuals and the maintenance records,
12:42
and providing that to the analysts, the operators, so they can make the best informed decision on how to reconcile that problem. That's use case number one. Use case number two actually takes that application and puts it into the cockpit. So, there was a scenario where a Boeing 717 was actually dolphining in the air,
13:02
and it took about 15 minutes for those pilots to diagnose what had actually occurred. I know that I can get motion sick; 15 minutes of going like this is awful, right? So, what we did was train this application on the real scenario. It turns out that part of the fuselage got iced
13:22
and wouldn't de-ice, which would not allow pressure to release from the aircraft. It took the pilots 15 minutes to figure that out. So, we trained Watson on the scenario, and regardless of the natural-language question we asked Watson, each time it immediately pointed back to: this is your root cause, this is how you need to solve the problem,
13:41
and here are all the relevant maintenance records and documents that you need to use to fix that problem. So, again, real time. That's use case number two. Use case number three, I think, is more relevant to this audience. It's what we call Aerospace Innovation Advisor. This is your traditional research assistant. So, in this scenario, NASA has been using
14:02
some of the Watson capabilities since about 2012 for human factors research. So, how does a human body perform under the constraints of our atmosphere? Well, it turns out, if you take our body and put it into space, we don't operate the same way. Muscles atrophy a lot quicker. You're exposed to a lot more radiation from the sun.
14:24
So, this is the ability to ingest scientific papers and literature and see what research is being done in the field, in the medical community, and how to then translate that into sending someone up to the ISS, the International Space Station, or actually on a long-duration space mission to Mars.
14:42
Taking that a step further, we also have a Watson robot at Johnson Space Center; if you wanna come up to me afterwards, I can show you a picture. The point of that is we wanna actually have a robot on the shuttle that we can send outside the shuttle to fix something, rather than an astronaut, so the astronaut doesn't get into harm's way.
15:00
That's use case number three. And finally, use case number four: our good friends at OSTI, thank you again, Brian and Lori, are actually using some of the Watson capabilities for audio indexing. We've talked about ScienceCinema here a little bit. What this does is translate audio and video files into text and make that searchable.
15:21
So, we have speech-to-text, like I said. We have translation: if it's coming in a foreign language, we can translate it as well. And if you can see, top right, I always get this question: how confident are you that Watson is translating this information accurately? These data points are actually provided by my colleagues right here.
15:40
97% accuracy, 37 minutes of duration indexed so far, and we can even find the silences in those audio and video files, right? The purpose being, we wanna allow as much of the information that's out there as possible to be searchable, and a lot of information is contained in these audio and video files that is also related to research documents, right?
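For reference, here is a hedged sketch of how such audio indexing can be driven through the Watson Speech to Text service with the ibm-watson Python SDK; the API key, service URL region, and audio file name are placeholders, and the response fields follow the service's documented JSON layout.

```python
# Hedged sketch: transcribe an audio file with Watson Speech to Text,
# keeping per-word confidence and timestamps (which also expose silences),
# so the text can be indexed for search. Credentials are placeholders.
from ibm_watson import SpeechToTextV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

stt = SpeechToTextV1(authenticator=IAMAuthenticator("YOUR_APIKEY"))
stt.set_service_url("https://api.us-south.speech-to-text.watson.cloud.ibm.com")

with open("lecture.mp3", "rb") as audio:
    result = stt.recognize(
        audio=audio,
        content_type="audio/mp3",
        timestamps=True,       # word-level start/end times
        word_confidence=True,  # per-word confidence scores
    ).get_result()

for chunk in result["results"]:
    best = chunk["alternatives"][0]
    print(best["confidence"], best["transcript"])  # index this text for search
```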
16:01
So, we wanna provide a comprehensive picture of all the information and even allow that simple search to be done, whether it's through a search engine or whether it's through Watson. Now, just to recap: AI, whether it be artificial or augmented, is not scary, all right?
16:22
It's meant to do certain things: find information, contextually relevant information, better and a lot quicker; consume not just the variety and veracity but all four Vs of information, variety, veracity, volume, and velocity, which the human ability cannot;
16:42
and allow, through the interpretation that comes from training, a better understanding of what information is out there and how it's related to me. Now, I actually did this presentation last week at a Navy conference, called AI Run Amok. So, in my opinion, it's not AI that's run amok, it's data, all right? There's so much data out there,
17:01
there's so much data that is not standardized together but is related, and it's humanly impossible to create those associations; that's why AI is needed to do this searching. Now, I always like to have a little bit of fun during these presentations, so this is in your presentation guides as well.
17:21
I hate saying this: go to Google and type in Chef Watson, Watson News Explorer, and Your Celebrity Match. These are Watson applications that are publicly available that you can all play with. All right, I personally like Your Celebrity Match, because if you have a Twitter handle, you can put in your handle. I usually use President Trump's, because I think it's crazy,
17:42
but what it does is compare that individual's tweets to celebrity tweets and see who you are most and least similar to, based on how you express yourself in those tweets. It's pretty fun. It's actually built off of what we call Personality Insights. So think Big Brother, but Personality Insights was an API that we trained
18:03
off of what we call the Big Five. Well, the Big Five is not what we call it; it's an industry standard in psychology. I don't remember all of them off the top of my head, but openness, extroversion, conscientiousness, and there's a couple others. And what the Big Five do is capture the personality characteristics
18:21
of how you express yourself in your text, in your speech. So these are a couple that you can play with, and I think that's all I have, so thank you.
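For completeness, a sketch of how Personality Insights was typically called from the ibm-watson Python SDK. The service has since been retired by IBM, so treat this as historical; the API key, service URL, and input text are placeholders.

```python
# Hedged, historical sketch of the (since-retired) Personality Insights
# service, which scored Big Five traits from text. Credentials are placeholders.
from ibm_watson import PersonalityInsightsV3
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

service = PersonalityInsightsV3(
    version="2017-10-13",
    authenticator=IAMAuthenticator("YOUR_APIKEY"),
)
service.set_service_url(
    "https://api.us-south.personality-insights.watson.cloud.ibm.com")

tweets = "Collected tweet text for one Twitter handle goes here..."
profile = service.profile(tweets, accept="application/json",
                          content_type="text/plain").get_result()

for trait in profile["personality"]:  # the Big Five, each with a percentile
    print(trait["name"], round(trait["percentile"], 2))
```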