From Papers to Knowledge: Representing Scholarly Contributions in the Open Research Knowledge Graph (ORKG)
Formal Metadata
Title | From Papers to Knowledge: Representing Scholarly Contributions in the Open Research Knowledge Graph (ORKG)
Number of Parts | 3
Author | 0000-0001-5336-6899 (ORCID)
License | CC Attribution 4.0 International: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/56056 (DOI)
Transcript: English(auto-generated)
00:00
Sorry that I'm not so well prepared, because I prepared the slides about 19 minutes ago. But I'm quite happy to report a bit about the Open Research Knowledge Graph, and I am thankful to David that he already explained what a knowledge graph is, so I can skip it here. The next slide is coming up, and I want to start a little bit. Oh yeah, maybe some of us still
00:26
remember that we had a time when we mainly communicated on paper, when we had mail-order catalogs, maps, encyclopedias or phone books. But then there was a time when the digital transformation started and all these paper-based processes changed to become digital services, and in this way
00:44
our way of publishing and communicating in the world also changed profoundly. So, back to the Open Research Knowledge Graph: we are talking about research, and this leads us to the question of what happened in academia in terms of scholarly publishing and communication. When we take a look over the last centuries, we can see that we also moved a bit away from
01:06
the paper-based document of the 17th and the 19th century, but we are still, in a way, a little bit encapsulated in documents, because we now only digitize our knowledge in PDFs which we publish online, which is still not really machine-actionable and processable.
01:25
So the way we are currently doing research and communicating our knowledge is the same as if we scanned our maps into PDF files and used these PDFs to navigate instead of using our smartphone with the Google Maps navigation app. So this leads to the core idea of the Open
01:43
Research Knowledge Graph. We are on the way to digitalizing research in the way we communicate, and this should be more than digitization. So we need to digitalize the knowledge we keep in our publications, and we should no longer digitalize the document that contains this
02:01
knowledge. There are a lot of different challenges related to this problem: in the way we are communicating based on documents, we are suffering from the publication flood, with around 2.5 million publications every year. Experiments become less and less reproducible due to the large amount of knowledge. Even our peer reviews get worse
02:24
since the reviewers are no longer able to get an overview of the state of the art and keep it. And there are also other aspects, like the digitalization of science in general. We have monopolization by commercial actors in the publishing sector, but also these predatory
02:43
publishing journals which only publish to make money, which even intensifies the publication flood. So let us now take a look at one current example from the energy system modeling and analysis domain. As researchers, we often start by wanting to get information on a specific
03:01
topic, here for example the kinds of simulation scenarios that are used for energy simulation. What we do is start with Google Scholar, enter the search term and get a list of 1.5 million publications, from which we need to get the information out in some way. However, in the end we do not want a list of publications that we need to read; we want to answer
03:21
specific questions. And currently we are doing this in a rather old-fashioned way by using systematic literature reviews, where we manually extract the information, compare it, make our analysis, write a paper again and publish it out into the world. The detailed extracted information gets lost, and maybe someone who's interested in the same topic
03:44
will repeat the same kind of systematic literature review five years in the future. So we came to the point where we asked ourselves: wouldn't it be great if we could ask our computer the question and get the answer afterwards, and how can we achieve this goal? And here the Open Research Knowledge Graph comes into play, where we want to do exactly this. We want
04:04
to structure the scholarly knowledge we have in science in a knowledge graph, as the name of the platform may imply, and this kind of structuring as a graph gives us a lot more beneficial ways to help the computer extract this knowledge
04:24
and deliver answers, not only publications in which the answer could maybe be included. So the Open Research Knowledge Graph has several objectives, mainly focused on scientific content and not on the metadata of publications. We want to foster collaboration
04:40
between a lot of different people from different domains by making research FAIR according to the FAIR principles, that is findable, accessible, interoperable and reusable. And of course, with this idea we want, on the one hand, to provide an overview of the state of the art, so that you can see what is going on in different research domains regarding specific research problems,
05:00
and we want to tackle interdisciplinary challenges. So how are we doing this? On the left side you can see a PDF document of a paper, and on the right side you see the semantic description of this paper that we entered into the ORKG. This semantic description makes the information machine-actionable and FAIR, since, based on the UI we are using, we create the knowledge
05:24
graph in the background, so the information is stored in another way that is more accessible for the computer. With this we are able to give the computer a natural-language question and get back the answer that we want, but we can also leverage other aspects, like for example more
05:40
sophisticated analyses to create visualizations. So here I have one example of a comparison. These comparisons show the state of the art regarding a specific research problem, and in this comparison we have 25 different publications on studies of the energy supply in Germany which predict the energy generation for the year 2050.
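Editorial sketch: the idea of storing a paper's semantic description as a graph and deriving a comparison from it can be illustrated with plain subject-predicate-object triples. This is a minimal Python sketch under stated assumptions, not the ORKG data model or API. All identifiers, predicate names and values below are hypothetical placeholders and were not taken from the talk or from the ORKG.

```python
# Hypothetical triples (subject, predicate, object).
# Identifiers, predicates and values are placeholders, not ORKG data.
TRIPLES = [
    ("paper:A", "has_contribution", "contrib:A1"),
    ("contrib:A1", "research_problem", "energy supply in Germany"),
    ("contrib:A1", "target_year", 2050),
    ("contrib:A1", "publication_year", 2006),
    ("paper:B", "has_contribution", "contrib:B1"),
    ("contrib:B1", "research_problem", "energy supply in Germany"),
    ("contrib:B1", "target_year", 2050),
    ("contrib:B1", "publication_year", 2020),
]


def describe(subject, triples):
    """Collect all predicate/object pairs of one subject into a dict."""
    return {p: o for s, p, o in triples if s == subject}


def compare(problem, triples):
    """Build a comparison: one row per contribution addressing `problem`."""
    rows = {}
    for s, p, o in triples:
        if p == "research_problem" and o == problem:
            rows[s] = describe(s, triples)
    return rows


comparison = compare("energy supply in Germany", TRIPLES)
for contribution, properties in comparison.items():
    print(contribution, properties)
```

Because every statement is an explicit triple, the same data can answer questions that a list of PDFs cannot, such as "which contributions address this research problem?", without anyone re-reading the papers.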
06:07
And the benefit of making such comparisons is that you get an overview in a new way that is machine-actionable and FAIR, so that the computer can also get out the information that is presented in our comparisons. And the benefit is that these also contribute to your research
06:23
activities, since they can be found on Google Scholar, for example, and they can even be cited, since they can have a DOI. So besides these benefits, that you get an overview of these different papers with the ORKG and that the computer is able to answer this question, you can even leverage the ORKG as a back-end system, so it's a backbone for all new applications, since all
06:45
our data is accessible through the different interfaces that we provide. So, for example, we also have a SPARQL endpoint where we can run queries on our graph, and here I have an example: this question on the average energy generation per individual energy source over five
07:06
years. This was based on the comparison we saw previously, and these papers were published between 2006 and 2020. And then we came up with a question that no longer had anything to do with the reason why we created this comparison: to see if there is a change in the
07:22
amount of energy generation per energy source that we can observe over the time frames. This is important information for the energy system simulation area, since we need to see whether our expected values are similar to what one of the studies maybe predicts, so that we can
07:42
assess whether we are comparable. As a result of this query we first of all get a simple graph, a diagram, which shows the information about the average energy generation per energy source. When we now take a deeper look into the diagram, we can see that we have a strong change from 2006 to 2020 in the areas of photovoltaics
08:06
and onshore wind, so there's a larger amount of energy generation from these energy sources, which we should also consider in our model when we create such an energy simulation. So overall, as a short summary of this talk: what we want with the ORKG is to change
08:24
the way we are currently communicating in research, by getting away from this document-centered, encapsulated knowledge in PDF files and making it open for both computers and humans in a machine-readable form, so that scholarly communication also gets into the new
08:44
21st century. So in this way the ORKG tries to be the lighthouse in the publication flood for all you researchers, to get the knowledge you want in the end. With this I'm at the end of my talk. I thank you for your attention and I'm hopeful that there will be some questions. Thanks.
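Editorial sketch: the query described from 06:45 onwards (average energy generation per energy source over time frames) is essentially a group-and-average aggregation, which SPARQL expresses with GROUP BY and AVG. The following Python sketch shows only that aggregation logic on in-memory records. It assumes that the time frames are five-year windows starting in 2006; the record layout, the function names and all numbers are hypothetical placeholders, not values from the studies or from the ORKG endpoint.

```python
from collections import defaultdict
from statistics import mean

# (publication_year, energy_source, predicted_generation)
# PLACEHOLDER numbers for illustration only, not results from any study.
RECORDS = [
    (2006, "photovoltaics", 10.0),
    (2008, "photovoltaics", 20.0),
    (2018, "photovoltaics", 40.0),
    (2006, "onshore wind", 30.0),
    (2019, "onshore wind", 50.0),
    (2020, "onshore wind", 70.0),
]


def period_start(year, first_year=2006, width=5):
    """Map a year to the first year of its `width`-year window."""
    return first_year + ((year - first_year) // width) * width


def average_by_period_and_source(records):
    """Average the values per (time window, energy source) group."""
    groups = defaultdict(list)
    for year, source, value in records:
        groups[(period_start(year), source)].append(value)
    return {key: mean(values) for key, values in sorted(groups.items())}


averages = average_by_period_and_source(RECORDS)
for (start, source), value in averages.items():
    print(f"{start}-{start + 4}  {source}: {value}")
```

Plotting one line per energy source over the window start years would give a diagram of the kind described in the talk. Against a real endpoint, the grouping would be done by the query itself rather than in Python.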