Ontologies: technologies for domain modelling, knowledge re-purposing and knowledge sharing
Formal metadata
Title |
Series title |
Number of parts | 38
Author |
Contributors |
License | CC Attribution 3.0 Unported: You may use and modify the work or content for any legal purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided you credit the author/rights holder in the manner they specify.
Identifiers | 10.5446/62901 (DOI)
Publisher |
Year of publication |
Language |
Production location | Porto, Portugal
Content metadata
Subject area |
Genre |
Transcript: English (automatically generated)
00:00
Well, good morning. My name is Susanne Milan. I'm the head of the department of architecture of ESA. My mission is to present the second session. And the second session begins with
00:23
the presence of the key speaker of Hishtina Alibay. Hishtina Alibay also teaches in different small universities for the reasons in which she goes on to write a section on the performance of an engineer that's effectively our engineer for the whole world.
00:42
Then we have Fredrico Souffleier. And I'm very pleased to present Fredrico, because he was a student of our school. And it's nice and important to have students who continue to do research and maintain an interest in the subjects
01:02
that we view in our school. And then at the end, we have Katerina Milis. She has a master's from the Faculty of Architecture of Oporto, and she's doing PhD research in Lisbon. So at the end, I'm going to talk and then
01:21
to motivate the people. And now I present Hishtina Alibay. Thank you. First of all, thank you very much for inviting me. It's not common for someone from a faculty of engineering to be invited here.
01:41
So I hope I will bring something useful to you. I'm going to Hishtina Alibay, which is my research unit, by the way. So Hishtina Alibay has been involved in many areas of research, from communications to health
02:07
to power systems to sensor systems. And that's my research unit. I might talk to them.
02:22
My commission was to talk a bit about ontologies. It's not easy to do. I was talking with Tana just a minute ago, and the comment was, "I don't know what that is, an ontology." But maybe they're doing that as well, and probably are.
02:42
So I tried to put the whole thing a bit in context: start with why our main issue is complexity, and introduce some notion of computational thinking.
03:01
And then I come to ontologies, which are actually part of what is traditionally called knowledge representation in artificial intelligence, by the way, and to another sort of buzzword along the way, which I'll link in later. And I'll talk a bit about the area
03:22
where I'm applying ontologies, to give you an idea of things that are actually being done. To start with, I'll talk about complexity, because in computer science complexity is a big issue.
03:42
There is lots of hard theory in mathematics and computer science about computational complexity, because it's crucial to computing. What can we compute? What can we compute effectively? And I thought it was interesting that other people
04:05
in other areas were thinking that complexity is a big issue now. And as you see, we are surrounded by these buzzwords of big data, big data everywhere, right? So the issue of how we can deal with the complexity
04:25
of the world in all aspects and of all the data that is now being collected in amounts which we're not used to is part of the context
04:42
of what I'm talking about. And about complexity and computers: maybe everybody knows that the word computer was first used for people. The first computers were people, not machines.
05:02
And by the way, they were mostly women, I don't know. And so the first occurrence of the word computer in the New York Times, I think it was in 1892, was an ad looking for someone with competence in mathematics. So this is much later.
05:21
It's by the time of the war effort. And this is one of the first computers, the ENIAC, and two of the six women who were the first programmers of the computer. At the time, programming meant a lot of cables going one way
05:41
and another in big rooms. And by the way, they were not given credit when the computer was presented. So I like to tell the story. Well, computers brought about a lot of thinking about how we can manage this complexity of the world.
06:17
I like this notion that has been introduced
06:20
by someone some time ago, of computational thinking. Not computation as something done by computers, but computation as something that frames the way we think about problems and about the world.
06:41
And we see it nowadays, we see computing everywhere. And sometimes, in these areas, people just feel they are being pushed by the computer people, who are taking them away from their place. And that's not the real picture.
07:00
The real picture, at least as many people in computer science see it, is the need for everybody to take advantage of things that came from computing: the many, many notions that were developed for computation,
07:22
but which are very useful for framing the way we think about problems. And so you see it in all areas: now you hear about machine learning and data mining and things like that, which bring people from statistics together with people from computing. You see that in biology,
07:41
you can't do anything in biology without computing. And economics and so on and so on. And so computational thinking is a notion which defends that computing has changed the way that we think about problems.
08:03
And this is a very short paper by Jeannette Wing in the Communications of the ACM. And it's very interesting because it shows you this point of view of computing
08:20
as something that should be a basic skill for people in their everyday life. So it's not that if you want to compute you need to program; no, it's more that
08:40
you need to think in an abstract way. And then, as I'm talking today about things which are actually abstractions, I think that's a useful way of introducing my topic and that's why I brought this here. I think these things are interesting for people who may want to
09:01
have a look at how people from computer science see the other areas and the rest of the world. And my next buzzword, so to speak, is knowledge representation. Because ontologies and the semantic web, that's where ontologies fit,
09:20
are a natural continuation of an area which has been active since the 60s, which is artificial intelligence, which includes knowledge representation as a big issue. So, artificial intelligence is, so to speak,
09:40
an area, a sub-area of computer science which is concerned with how we can automate reasoning, how we can do automatically things that people are good at and machines used not to be. So, how we can play chess
10:01
by machines. By the way, they are very good right now. How we can translate language automatically. That's getting pretty good as well. All these activities depend heavily on how we represent our knowledge. So, if you want to translate, what are you going to do?
10:21
Build syntactic structures that capture the structure of discourse in one language and map them into structures in another language. Maybe. Maybe use lots of statistics and lots of pattern matching. That's also the trend in language translation. So, knowledge representation
10:42
is a traditional area in artificial intelligence, and that's what ontologies, and knowledge representation with ontologies, build upon. Of course, this is not new.
11:02
So, it goes back to what is normal and the knowledge representation goes back that far. So, knowledge representation in artificial intelligence typically involved
11:20
or can be separated into these three areas. Logic. Logic is typically the undisputed underlying representation for any knowledge.
11:40
You need to be able to do inference, and inference is naturally captured in logical reasoning. And you need rules of inference for your logic. And ontology. Ontology, that's where you decide
12:00
what you'll be reasoning about. So, ontology comes from philosophy, the part of philosophy concerned with the being, what is there to be analyzed and thought about. So, when we say I'm defining the ontology
12:21
for my problem, that means simply that I'm deciding what are the important concepts, how they relate to each other and then if you want to compute with that, then you have to represent that in a way that it can be manipulated, reasoned over and
12:41
that's what ontology is about. And so: logic, ontology, computation. We don't just want to decide on the beautiful concepts that we'll be dealing with and how they relate; we also want to infer, to work on those concepts, and
13:01
extract logical consequences. That's the goal of automatic reasoning, so to speak. And then there is the trend from knowledge representation
13:21
to the so-called semantic web. I put there just the goal of the semantic web as it is stated by the W3C, the World Wide Web Consortium, which manages the standards and recommendations for web
13:42
technologies. The expression semantic web was introduced in 2000 by Berners-Lee, the inventor of the web, as a so-called web of data. So the difference from the traditional web is that the traditional web
14:02
is there for people, so we find documents and pages and things like that. We navigate and we understand the contents, but on the semantic web, you want the contents to be understandable by machines. So, you need to make the semantics of the content
14:20
more explicit. That's what it's all about. And the semantic web is full of technologies, so somehow people tend to think that the semantic web is lots of technology, lots of engineering, but at the core, it's semantic. So, it's conceptual
14:41
and it's a way for people to capture knowledge in a way that can be shared and that can be used by very diverse applications. And these are the references to two papers in Scientific American:
15:01
the one that introduced the expression semantic web, and another one which covered the "in action" part, that is, the applications that were being built on the semantic web. And the
15:21
of course, the semantic web has lots of technologies, and if you want a slide full of acronyms, this is one. You find lots of versions of this slide. It has been introduced as well. This one has been imported
15:41
from a presentation. But what is essential here: these are all acronyms and technologies which are actually defined in documents which have been developed by committees and so on,
16:01
but the essential part here is that the goal of this stack of technologies is actually to make the applications there at the top
16:21
quite free from getting locked into vendor-dependent formats and standards. So the idea in the semantic web is to keep all this available to everybody to explore
16:41
and interlink. And I think one interesting part here is that it starts very low here on the gray part with things which are quite basic. What do you need if you want to make sense
17:01
of a file? First of all you need to understand the character codes in which the file is written, in which the bits in the file are laid out. So if you don't get to understand the characters in the file, nothing.
17:21
You can do nothing, right? So it starts there. It starts with what you have got to understand, and so there is a standard for that. There is Unicode, which allows you to use very different alphabets and character codings and still understand what's going on. And another essential part
17:41
is you need to give names to things. So that's where URIs come in, which are nowadays becoming IRIs, for internationalization, and which are identifiers for resources. So everybody is using URLs. Those are
18:01
locators on the web. URIs are a bit more ambitious. Things don't have to be on the web; they just have to have a way to be named, to be given identifiers. And URIs are good because they are distributed.
18:21
If I have my own domain, I generate as many identifiers as I need. I just define them as it pleases me. And in that way everybody can have an identifier which depends on their own domain and which does not collide
18:41
with anyone else's. So identification and understanding the character codes are there at the bottom, because if you don't have that you can't go any further. And then there is XML. Everybody knows a bit about what an XML
19:01
or an HTML document looks like. This is here for easy parsing. We want to make it easy to understand the parts of a file. So you already understand the characters. Now I need to understand that here is the beginning of a name
19:21
and here the name is over. So when I'm automatically looking at that file I can extract the name which is there. Or extract a block of text. Or extract an image. So that's another basic level.
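As a small illustration of these bottom layers, the sketch below first decodes bytes using a known character encoding and then parses the markup to pull out a name and a block of text. The element names (`work`, `name`, `place`) and the sample content are made up for this example; only Python's standard library is used.

```python
import xml.etree.ElementTree as ET

# Bytes as they might sit in a file. The encoding (UTF-8) is assumed to be known.
raw = "<work><name>La Joconde à Washington</name><place>Porto, Portugal</place></work>".encode("utf-8")

# Layer 1: turn bytes into characters using the agreed character encoding.
text = raw.decode("utf-8")

# Layer 2: parse the markup so the parts of the document can be located.
root = ET.fromstring(text)
name = root.find("name").text    # where the name begins and ends
place = root.find("place").text  # another delimited block of text

print(name)
print(place)
```

Decoding with the wrong encoding would garble the accented character before any parsing could start, which is why the character layer sits underneath the markup layer.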
19:41
And then you start with the actual semantic web stuff. So RDF is already a semantic web recommendation from the W3C. And OWL there, the ontology language, is another level.
20:01
Why do we want all those things? Well, we want to make queries on that. So I need a language to query: to make sense of my data and answer queries that someone has on that data. Is there anybody who is related by such-and-such
20:21
a relationship to this person I'm looking for? Is there anybody like that in the semantic web representation? So this is actually a technology stack, but underlying this technology stack there are all
20:41
the conceptual needs that we have: there are some requirements for our understanding of the world. And let me jump a bit to the RDF part here. So
21:01
RDF is a very simple model; this is from the W3C documentation. And it's very simple because very simple is good enough for capturing any knowledge that
21:21
you can capture automatically. So everything I can say between two entities (Bob there, Alice, the Mona Lisa) I can say with binary assertions. So "Bob is a person", "Bob is born on", "Bob is interested in":
21:41
those are just assertions and they constitute a sort of very basic way of making statements about the world. And these statements are quite easy to capture also formally. So this is
22:01
the graph. But this graph can be very easily turned into something well it's called triples. When you hear about semantic web you always hear about triples. The triple stores instead of databases. Triple stores are simply
22:21
stores where you have assertions in three parts. A subject, a predicate and an object. And both the subject and the predicate and the object they can have unique identifiers. Why is that interesting? Let me just show you.
22:41
Here, instead of the first graph, you have a graph with some of those triples. This one is just a date. But that node there has a URI. So it has an identifier.
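The triples described here can be sketched with plain Python tuples. This is a minimal in-memory illustration, not a real triple store: the `Literal` wrapper, the `match` helper, and the sample birth date are assumptions made for the example, and the URIs are illustrative ones in the style of the Bob and Alice example.

```python
from typing import NamedTuple

class Literal(NamedTuple):
    """A plain value such as a date, as opposed to a node identified by a URI."""
    value: str

# Namespaces: each vocabulary lives under its publisher's own domain.
EX = "http://example.org/"
FOAF = "http://xmlns.com/foaf/0.1/"
SCHEMA = "http://schema.org/"
WD = "http://www.wikidata.org/entity/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

BOB = EX + "bob#me"
ALICE = EX + "alice#me"

# Each assertion has three parts: (subject, predicate, object).
triples = {
    (BOB, RDF_TYPE, FOAF + "Person"),
    (BOB, FOAF + "knows", ALICE),
    (BOB, SCHEMA + "birthDate", Literal("1990-07-04")),  # illustrative date
    (BOB, FOAF + "topic_interest", WD + "Q12418"),
}

def match(store, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        (t for t in store
         if (s is None or t[0] == s)
         and (p is None or t[1] == p)
         and (o is None or t[2] == o)),
        key=repr,
    )

# "Is there anybody related to Bob by the 'knows' relationship?"
print(match(triples, s=BOB, p=FOAF + "knows"))
```

Because subjects, predicates, and non-literal objects are all URIs, a query only needs to compare identifiers; it does not need to understand the words in them.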
23:01
And each node which is not a simple literal, a simple value like a date, has an identifier. So for example this is a documentary and it's on Europeana. So its URI is on the Europeana domain. So that's the entity that's providing the identifier
23:21
for that resource. And it's not only the nodes. Also these relationships (birth date, time, topic of interest, title) come from vocabularies which can be published and made available
23:41
to others. That's where you come to the ontologies. That's exactly where you come to the ontologies. I want to decide which verbs I use to talk about things. And I can do that because I can define my own ontology and
24:01
the other interesting part is I can mix and match and use my ontology together with others. So for example there you have dcterms:title. The entity, for example, is identified by a Wikidata identifier;
24:21
it could be another one. And then you relate it with the name Mona Lisa with dcterms:title. dcterms is a very well known vocabulary. It's Dublin Core. It's used all over in libraries and other institutions because it's a very simple
24:41
vocabulary developed for the web by the way. And that's just the idea. Of course this graph is not the most interesting way for a computer to read but this graph can be rendered
25:01
very easily in textual formats which are very easy for a program to parse. But that's also a requirement: I can also read that. Right? Maybe it's not the
25:21
most interesting interface for that information, but I can read that and so can you. You can identify those pieces which start with Description and end with Description and which are parts of that graph. The most interesting thing is that the semantic web is not just for machines;
25:41
it is for both machines and people. So things are written in a way that's simple enough to be easily parsed by any machine but it's also something which is not locked in a very hermetic format. It's something you can read yourself. Like you do with
26:01
HTML. But HTML is just for display. It has no meaning for the components of the document. If you use a vocabulary that someone defined or that you defined yourself then you can provide an explicit meaning for your data.
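To make the point about a text format that both programs and people can read, here is a minimal sketch that writes triples as one-line statements and parses them back. It uses a simplified line-based notation in the style of N-Triples rather than the XML rendering discussed in the talk, and it ignores escaping, datatypes, and language tags. The helper names `to_line` and `from_line` are hypothetical.

```python
import re

DCTERMS = "http://purl.org/dc/terms/"      # Dublin Core terms vocabulary
WD = "http://www.wikidata.org/entity/"     # Wikidata identifiers

# Objects are tagged as either a URI or a plain literal.
triples = [
    (WD + "Q12418", DCTERMS + "title", ("literal", "Mona Lisa")),
    (WD + "Q12418", DCTERMS + "creator",
     ("uri", "http://dbpedia.org/resource/Leonardo_da_Vinci")),
]

def to_line(s, p, o):
    """Render one triple as a line of text: <s> <p> <o-or-"literal"> ."""
    kind, value = o
    obj = f"<{value}>" if kind == "uri" else f'"{value}"'
    return f"<{s}> <{p}> {obj} ."

LINE = re.compile(r'^<([^>]*)> <([^>]*)> (?:<([^>]*)>|"([^"]*)") \.$')

def from_line(line):
    """Parse a line produced by to_line back into a triple."""
    m = LINE.match(line)
    if m is None:
        raise ValueError(f"not a triple line: {line!r}")
    s, p, uri, lit = m.groups()
    o = ("uri", uri) if uri is not None else ("literal", lit)
    return (s, p, o)

text = "\n".join(to_line(*t) for t in triples)
print(text)
parsed = [from_line(line) for line in text.splitlines()]
```

The same two lines mix identifiers from three different publishers (Wikidata, Dublin Core, DBpedia), which is the mix-and-match idea in miniature.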
26:21
And that's the goal of ontologies. Now, instead of this, you can write all your ontologies like this, in XML. But when you define an ontology
26:41
you're lucky enough to have tools that help you define your ontology very easily. This is just an example of an ontology tool, which is called Protégé. It's being developed in
27:01
Stanford. For example, this is an ontology for travelling, so it talks about accommodations and destinations. And in that ontology you can define the kinds of objects
27:21
that exist in your world. That's ontology. And also what kind of relationships hold between those objects. So for example here you're talking about some family destination which is
27:41
a subclass of destination so I'm just browsing through the concepts in ontologies. Classes. Classes help us define kinds of objects. So there's a kind of object which is an accommodation and there's another kind of object
28:01
which is a destination. I define a family destination by saying that, well, a family destination has to have accommodations and activities and things like that. So you state a lot of what you know about those concepts.
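The class hierarchy described here can be sketched as a table of subclass assertions plus a small inference step that follows them. This is only an illustration under a single-parent assumption; apart from Destination, FamilyDestination, and Accommodation, the class names are invented, and a real ontology language allows far richer definitions.

```python
# Direct subclass assertions: child class -> parent class.
# Only Destination, FamilyDestination and Accommodation come from the talk;
# the other names are made up for the example.
SUBCLASS_OF = {
    "FamilyDestination": "Destination",
    "Beach": "Destination",
    "Hotel": "Accommodation",
    "LuxuryHotel": "Hotel",
}

def superclasses(cls):
    """All classes that cls is a direct or indirect subclass of."""
    found = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        found.append(cls)
    return found

def is_a(cls, ancestor):
    """True if every member of cls must also be a member of ancestor."""
    return cls == ancestor or ancestor in superclasses(cls)

print(superclasses("LuxuryHotel"))
print(is_a("FamilyDestination", "Destination"))
```

Nobody stated that a luxury hotel is an accommodation; that fact is a logical consequence of the two assertions that were made, which is the kind of inference the logic layer is there for.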
28:21
And then come the properties. And properties are things like hasAccommodation, hasActivity. So that relates, for example, a destination with an activity. In this destination you can go to the beach.
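A property that relates objects of two kinds can be sketched as a name with a declared subject kind and object kind. The property spellings `hasAccommodation` and `hasActivity`, the individuals, and the `fits` helper are assumptions for this example. Note also that the check below is a simple closed-world validation, which is a simplification: ontology languages typically use such declarations to infer new facts rather than to reject statements.

```python
# Property name -> (kind of subject it applies to, kind of object it points to).
PROPERTIES = {
    "hasAccommodation": ("Destination", "Accommodation"),
    "hasActivity": ("Destination", "Activity"),
}

# Hypothetical individuals and the class each one belongs to.
INDIVIDUALS = {
    "SunnyCove": "Destination",
    "CoveHotel": "Accommodation",
    "Surfing": "Activity",
}

def fits(subject, prop, obj):
    """Check that a statement uses the property between the declared kinds."""
    subject_kind, object_kind = PROPERTIES[prop]
    return (INDIVIDUALS[subject] == subject_kind
            and INDIVIDUALS[obj] == object_kind)

print(fits("SunnyCove", "hasActivity", "Surfing"))    # destination -> activity
print(fits("SunnyCove", "hasActivity", "CoveHotel"))  # wrong kind of object
```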
28:41
So you state things about the relationships between objects of several kinds. And for convenience there are also so-called data properties because they capture those simpler relationships where the object is
29:01
just a literal. So if you just want to state that someone has a birth date, you don't need to create a birth date as an object. It's just a value. So that's captured with data properties. And this is practically what there is to know about
29:21
designing an ontology. Anyone can take one of these tools and start designing an ontology for their domain. The interesting thing about ontologies is the way they can be combined. So ontologies operate on an open-world assumption. Which I'm finding it is.
29:41
This is a beautiful picture, which you can get to if you just search for linked open data; you'll find this graph. This graph is interesting because it shows how the world is
30:01
getting connected via these ontologies. They are in different colors because they are organized by domains. For example the pink here I think is life sciences and the green here is social networking.
30:21
There's a lot of that on the Semantic Web. The big node there is DBpedia, which is a project that has taken Wikipedia and transformed the information in Wikipedia into Semantic Web triples.
30:41
Knowledge which is easy to process by machine. What you see there is a wealth of knowledge that people are capturing in several areas and which is there to be picked. You can just look for things that
31:01
have already been defined by others and add to that. The idea of the Semantic Web is that you can build on existing knowledge. Building on existing knowledge is an important concept here. This graph gives you an idea of that. There are so many
31:21
things already there. These are just the big ones. To be on this graph you have to be highly connected and have a certain volume in your models. It gives you a picture of the
31:41
world. The "open" there should come with a caveat, because sometimes those data sets which are there are not really open. You have to ask someone for them or get permission. It's not all
32:01
completely open, but that's another issue. They are open enough to be found. I'm just mentioning you may think that I'm just always talking about Semantic Web. We are really doing things on the Semantic Web.
32:21
I work on research data management which is capturing information on data which is generated in research context. For example, the fracture mechanics that's something from our colleagues from mechanical engineering,
32:41
chemical chemistry, biodiversity that's a contact we have with people from CBO. The problem is people are generating in research lots of interesting data, but then it's very difficult for other people to get to that data
33:01
and reuse it and use it in their own research. Nowadays projects, funded projects are requiring you to deposit your data. Most people don't even know how to do that. What we are doing is building tools that can help people
33:21
prepare their data so that it will be ready to go to a data repository. More and more, people will get credit not just for writing the paper, but for publishing the data associated with the paper. In some areas you have to deposit the data, otherwise your paper won't even be accepted.
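As a rough illustration of what preparing data for deposit can involve, the sketch below checks a dataset description for a few descriptive fields before it is sent to a repository. This is not the speaker's system: the field names are a hypothetical minimum, loosely inspired by common descriptive elements such as title, creator, and date, and the record is invented.

```python
# Hypothetical minimum set of descriptive fields a repository might ask for.
REQUIRED_FIELDS = ("title", "creator", "date")

def missing_fields(record):
    """Return the required fields that are absent or empty in a record."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f) or "").strip()]

# Invented example: the creator was left blank by the researcher.
record = {"title": "Example dataset", "creator": "", "date": "2016-05-10"}

print(missing_fields(record))
```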
33:41
And we did this with ontologies. And what I want to show you here is those ontologies up there, Friend of a Friend, Dublin Core, CERIF: they are ontologies which were already there. So we just pick them. We use Dublin Core, we use Friend of a Friend
34:01
because they already define interesting verbs for us. And then of course if we want to describe very well an experiment made with fracture, wood fracture for example, from our students in mechanical engineering, we need specific
34:21
verbs to talk about the experiment. And so we define domain ontologies. And all of this works together in an application that helps people to prepare their data to be published. And what we always say about our system, which is all better, is that
34:41
it's ready to be thrown away. And that's how all applications should be. They should all be ready to go away. But you shouldn't throw away the baby with the bathwater, right? So we want to keep the ontologies, and we want to keep the data and the description that you made of that data. And the description of the data,
35:01
the metadata, that's the hard part. That's the part where people say, well, I have to sit and write down my metadata. And we are also working around that with our data. So if anyone here has interesting data that they want to publish, they should talk to me. I'm always picking people from
35:21
different areas. So this is the image of what we are doing. We can say that we work on digital preservation because our goal is to make that data generated in a research context to live beyond the computer
35:41
of the PhD student who collected it. That's a typical thing. When the PhD student goes away, the computer dissolves and everything goes to the garbage. What we are building, that's just an application, right? What I'm talking about here is
36:01
just our application of the ontologies. We use ontologies so that the intermediate platform can collect the data and the metadata and publish all that in a public repository. And then, well,