Operational Research Literature as a Use Case for the Open Research Knowledge Graph

Formal Metadata

Title
Operational Research Literature as a Use Case for the Open Research Knowledge Graph
Title of Series
Number of Parts
31
Author
License
CC Attribution - ShareAlike 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
The Open Research Knowledge Graph (ORKG) provides machine-actionable access to scholarly literature that habitually is written in prose. Following the FAIR principles, the ORKG makes traditional, human-coded knowledge findable, accessible, interoperable, and reusable in a structured manner in accordance with the Linked Open Data paradigm. At the moment, in ORKG papers are described manually, but in the long run the semantic depth of the literature at scale needs automation. Operational Research is a suitable test case for this vision because the mathematical field and, hence, its publication habits are highly structured: A mundane problem is formulated as a mathematical model, solved or approximated numerically, and evaluated systematically. We study the existing literature with respect to the Assembly Line Balancing Problem and derive a semantic description in accordance with the ORKG. Eventually, selected papers are ingested to test the semantic description and refine it further.
Transcript: English (auto-generated)
Hi, we would like to present the Open Research Knowledge Graph and what its added value with respect to mathematics might look like. First of all, there'll be an introduction to the Open Research Knowledge Graph: its timeliness in the digital dark ages, its purpose, and a quick demo to get a tiny bit familiar with it.
Secondly, the assembly line balancing problem is discussed with respect to its suitability as an early use case to ingest curated data into the Open Research Knowledge Graph. This concerns not only the mathematical problem per se, but also its coverage in scholarly literature.
Then, the hands-on approach of adding selected papers leads to a proposed model for ALBP literature in the Open Research Knowledge Graph. Finally, we conclude with a summary and derive a brief outlook on how to improve
the procedure in order to yield best practices for indexing scholarly literature efficiently in the ORKG. As is often the case, the best way to understand an innovation is to understand where we're coming from. Since the invention of the printing press almost 600 years ago, the archive of scientific knowledge has been built upon paper.
Knowledge had to be physically disseminated. The only way to extract the data was reading and hopefully remembering. In the wake of the digital age, there are fundamentally different opportunities for the logistics of distributing knowledge.
Having changed the publishing process this much, we might question whether traditional articles are the most effective and efficient form for humans and machines alike to absorb knowledge.
Scholarly publishing nowadays is simply not optimized for selecting information efficiently: it is also a means to build a reputation, and in the humanities or the arts the form does not only convey new findings but is part of scientific innovation in itself. Yet it is not pie in the sky to think about structuring information in order to contextualize and compare across articles.
After all, this is what mankind has done intellectually for centuries. It stands to reason that computers might assist us here too and uncover where we might have been blinded by routine.
A rather dramatic example is the so-called crisis in theoretical physics. Provocative voices proclaimed theoretical physics to be stuck while preparing insanely laborious experiments. There is now a branch in physics working on computational theory development where a computer is fed with
a century's worth of knowledge and spits out a theory of life, the universe and everything in mere days. However, translating journals and books into machine-actionable information is not a trivial task. This is where the Open Research Knowledge Graph enters the stage.
The basic idea is to restructure prosaic scientific knowledge into a database model. This idea is not entirely new though. The DBpedia project transforms Wikipedia articles into linked open data in order to use the information in semantic web applications.
Structured information is extracted most notably from the infoboxes, tables or graphics. This enables us to formulate queries like, give me a list of the five economically fastest growing countries in the world. In a nutshell, the ORKG transfers this idea to scholarly knowledge.
Scholarly literature of mathematics is published in prose and yet is linguistically and formally highly structured, which makes it a suitable early use case for the Open Research Knowledge Graph.
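As a rough illustration of why structured facts make such a question answerable, the following minimal sketch ranks subjects by one property. It is not DBpedia's or the ORKG's actual interface: the `Triple` type, the `gdpGrowthPercent` predicate, the `top_by` helper and all values are hypothetical placeholders invented for this example.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """A single subject-predicate-object statement."""
    subject: str
    predicate: str
    obj: object


# Placeholder facts, invented purely for illustration (not real statistics).
TRIPLES = [
    Triple("Country A", "gdpGrowthPercent", 2.1),
    Triple("Country B", "gdpGrowthPercent", 7.4),
    Triple("Country C", "gdpGrowthPercent", 5.0),
    Triple("Country D", "gdpGrowthPercent", 6.3),
    Triple("Country E", "gdpGrowthPercent", 1.2),
    Triple("Country F", "gdpGrowthPercent", 8.8),
    Triple("Country G", "gdpGrowthPercent", 3.9),
    Triple("Country A", "capital", "City A"),
]


def top_by(triples: List[Triple], predicate: str, n: int) -> List[str]:
    """Return the n subjects with the largest value for the given predicate."""
    matching = [t for t in triples if t.predicate == predicate]
    ranked = sorted(matching, key=lambda t: t.obj, reverse=True)
    return [t.subject for t in ranked[:n]]


fastest_growing = top_by(TRIPLES, "gdpGrowthPercent", 5)
```

The same question asked of prose articles would require reading every article; here it reduces to a filter and a sort.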
However, considering the heterogeneity of mathematical sub-disciplines, we would need many more papers to illustrate structural patterns and comparability than if we restricted our use case to a specific topic. Also, we agreed on an exemplary domain with a high occurrence of review literature.
Operational research, and mathematical optimization in particular, fits that description nicely and has further advantageous properties. Firstly, it covers relatable real-world problems, making its literature more attractive for queries in the ORKG.
Secondly, translating the problem into OR language, that is, an objective function and its constrained region, is subject to strict conventions, so we already have structural elements here. Lastly, optimization comprises a large toolbox of methods to directly solve a problem or at least approximate a solution.
Quite often, the usual suspects are benchmarked against each other, explaining the abundance of review papers. Of course, optimization is much more intricate than is described here. Nonetheless, this is a crude overview of the problem-solving process and how it is structurally depicted in scholarly literature.
Still, we considered optimization as a whole too versatile. That's why we picked a specific and well-researched problem, the assembly line balancing problem. Of the listed reasons for this choice, the most important is the availability on arXiv, reassuring us that we act in accordance with copyright.
The amount of ALBP-related literature over the last decade might not seem very impressive at first, at least if we ignore a questionable result on Google Scholar. But we focused on freely accessible content and a highly specific structure.
Also, the number of papers is much higher since we placed the emphasis on the subject-specific databases. However, the ALBP is not solely a mathematical research subject but also an engineering and economics-related topic.
We organized the literature with Zotero. The collections on the left depict our topic finding from general optimization to the ALBP. Although this seems like a nice selection at first glance, many papers don't have a preprint, an open-access version or any electronic form.
Where available, we attached the PDF version, not necessarily to ingest the paper into the ORKG but to discover or confirm our hypotheses about structural patterns. If available on arXiv, a paper was tagged accordingly. Also, we recorded the sequence in which the papers were modeled.
The first paper for example took 15 minutes to ingest, the last one only 7. We used Zotero's notepad to document the process and observations, be it bugs, usability issues, time or questions.
In manufacturing, assembly line balancing aims at a continuous, evenly distributed and coordinated load of workstations under additional constraints, like manufacturing different products on the same assembly line or fluctuations of energy costs over the course of the day.
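To make the "objective function plus constraints" structure concrete, here is a hedged sketch of one common textbook-style formulation of the simplest variant (often called SALBP-1), which minimizes the number of workstations for a given cycle time. The notation is an assumption introduced for this illustration and is not taken from the talk: tasks $i = 1, \dots, n$ with processing times $t_i$, cycle time $c$, candidate stations $k = 1, \dots, m$, a set $P$ of precedence pairs $(i, j)$ meaning task $i$ must not be placed after task $j$, and binary variables $x_{ik}$ (task $i$ is assigned to station $k$) and $y_k$ (station $k$ is used).

```latex
\begin{align}
\min \quad & \sum_{k=1}^{m} y_k \\
\text{s.t.} \quad
  & \sum_{k=1}^{m} x_{ik} = 1 && \forall i \\
  & \sum_{i=1}^{n} t_i \, x_{ik} \le c \, y_k && \forall k \\
  & \sum_{k=1}^{m} k \, x_{ik} \le \sum_{k=1}^{m} k \, x_{jk} && \forall (i,j) \in P \\
  & x_{ik},\, y_k \in \{0, 1\} && \forall i, k
\end{align}
```

The first constraint assigns every task to exactly one station, the second keeps each used station within the cycle time, and the third enforces the precedence relation. Additional constraints such as mixed products or time-dependent energy costs would extend this basic model.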
Organizing assembly lines optimally is an NP-hard problem, like the traveling salesman problem. This indicates that we go to the well-equipped heuristics or approximation section of the toolbox in operational research.
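As an illustration of the heuristics section of that toolbox, the sketch below applies a simple greedy rule, in the spirit of the largest-candidate rule: open a station, repeatedly add the longest task that still fits and whose predecessors are already placed, then open the next station. The function name `balance_line` and the five-task instance are hypothetical and chosen only for this example; this is not an algorithm from the papers discussed in the talk.

```python
from typing import Dict, Iterable, List


def balance_line(
    task_times: Dict[str, int],
    predecessors: Dict[str, Iterable[str]],
    cycle_time: int,
) -> List[List[str]]:
    """Greedily assign tasks to stations; returns one task list per station."""
    if any(time > cycle_time for time in task_times.values()):
        raise ValueError("a task is longer than the cycle time")

    unassigned = set(task_times)
    assigned = set()
    stations: List[List[str]] = []

    while unassigned:
        station: List[str] = []
        remaining = cycle_time
        while True:
            # Tasks whose predecessors are placed and which still fit.
            candidates = [
                task for task in unassigned
                if set(predecessors.get(task, ())) <= assigned
                and task_times[task] <= remaining
            ]
            if not candidates:
                break
            # Longest task first; the name breaks ties deterministically.
            chosen = max(candidates, key=lambda task: (task_times[task], task))
            station.append(chosen)
            assigned.add(chosen)
            unassigned.remove(chosen)
            remaining -= task_times[chosen]
        if not station:
            raise ValueError("precedence relation is cyclic or names unknown tasks")
        stations.append(station)

    return stations


# Hypothetical toy instance: five tasks, cycle time 10.
times = {"a": 5, "b": 3, "c": 4, "d": 2, "e": 6}
preds = {"c": ["a"], "d": ["a", "b"], "e": ["c", "d"]}
stations = balance_line(times, preds, cycle_time=10)
```

On this toy instance the greedy rule opens three stations, although the grouping {a, b, d} and {c, e} would need only two. That gap is exactly why papers in this area benchmark heuristics against exact methods and against each other.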
Getting prepped for ingesting an ALBP-related paper into the Open Research Knowledge Graph, we derive sensible properties. First of all, the problem type, since many variants have been studied so far, as seen in the illustration here. Another property is the exemplary dataset that is used and which is commonly referenced by its author's name.
Most important, the algorithm that is applied needs to be documented. Preferably, we might distinguish heuristics, exact methods or even wild combinations in the algorithm zoo.
In practice, we also introduce computational specifications, like programming language and, a bit fuzzily, performance, which might contain either the asymptotic behavior or the measured duration. The arrows indicate hierarchical dependencies. The asterisks indicate properties that were newly introduced to the ORKG.
Now, let's enter this conference's contribution rudimentarily to the ORKG. Fortunately, we already know the DOI, so the formal metadata are extracted automatically.
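The properties named above can be sketched as a small data model. This is only an illustration of the described structure, not the ORKG's actual schema or API: the class names, the predicate labels and the example values (paper title, dataset name, timing) are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Triple = Tuple[str, str, object]


@dataclass
class Performance:
    """Either or both of the two notions of performance mentioned in the talk."""
    asymptotic: Optional[str] = None          # e.g. a big-O expression
    measured_seconds: Optional[float] = None  # e.g. a reported run time


@dataclass
class AlbpContribution:
    paper: str
    problem_type: str
    dataset: str
    algorithm: str
    algorithm_kind: str  # "heuristic", "exact" or "hybrid"
    programming_language: Optional[str] = None
    performance: Optional[Performance] = None

    def to_triples(self) -> List[Triple]:
        """Flatten the description into subject-predicate-object statements."""
        triples: List[Triple] = [
            (self.paper, "problem type", self.problem_type),
            (self.paper, "dataset", self.dataset),
            (self.paper, "algorithm", self.algorithm),
            (self.algorithm, "algorithm kind", self.algorithm_kind),
        ]
        if self.programming_language is not None:
            triples.append((self.paper, "programming language", self.programming_language))
        if self.performance is not None:
            # Nested node mirrors the hierarchical dependency of the property.
            node = f"{self.paper}/performance"
            triples.append((self.paper, "performance", node))
            if self.performance.asymptotic is not None:
                triples.append((node, "asymptotic behavior", self.performance.asymptotic))
            if self.performance.measured_seconds is not None:
                triples.append((node, "measured duration (s)", self.performance.measured_seconds))
        return triples


# Invented example values, for illustration only.
example = AlbpContribution(
    paper="Example paper",
    problem_type="SALBP-1",
    dataset="Example dataset",
    algorithm="simulated annealing",
    algorithm_kind="heuristic",
    programming_language="Python",
    performance=Performance(measured_seconds=12.5),
)
example_triples = example.to_triples()
```

Keeping performance as a nested node lets asymptotic behavior and measured duration coexist without forcing every paper to report both.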
Then, we assign a domain. So far, only one domain can be assigned at once. Now, we're ready to model the paper.
This is quite a creative process and seems never to be really finished. Just watch me think here. And this is what it eventually looks like.
Now, we are a tiny bit familiar with the Open Research Knowledge Graph by applying a use case we feel comfortable with. Where does that leave us? A major realization is that we are rarely familiar with both knowledge engineering and a paper's content, which is peer-reviewed research, after all.
Intellectually transferring a paper is quite tedious, although it gets a lot better with more experience. In addition to that, the ORKG's many modeling options grant a lot of freedom, but they can also backfire when comparable properties are modeled inconsistently.
In addition to the user-friendly interface, an easy-to-use guideline with minimum standards would help beginners or enthusiasts. Wikipedia is a perfect example of such editing workflows. A reviewer suggested integrating the MSC into the ORKG, which would already provide a skeletal structure.
In the long run, each MSC class could provide its own template. We might have to consider complexity issues here, because it makes ingesting an increasing number of published papers according to the indexing process of the ORKG extremely tedious, but these might turn out to be growing pains in hindsight.
Templates have been mentioned already. Our model is looking forward to becoming one of those one day. That would make life so much easier. We shall have a very quick glance at existing templates.
Templates are a beta feature in the ORKG that provides an adapted form for recurring content. The ORKG mainly provides templates for epidemiological terminology. A template consists of a description, for example of its associated use case,
its typical properties and information on their format, since they are considered concepts themselves. An ALBP template could provide the properties that have been introduced in the table earlier.
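One way to picture such a template is as a description plus a list of expected properties with their formats, against which a paper's entry can be checked. The sketch below is an assumption-laden illustration, not the ORKG's template mechanism: `PropertySpec`, `Template`, `check` and the property names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PropertySpec:
    name: str
    value_type: type
    required: bool = True


@dataclass(frozen=True)
class Template:
    label: str
    description: str
    properties: Tuple[PropertySpec, ...]


# Hypothetical template built from the properties discussed earlier.
ALBP_TEMPLATE = Template(
    label="ALBP contribution",
    description="Describes a paper that solves an assembly line balancing problem.",
    properties=(
        PropertySpec("problem type", str),
        PropertySpec("dataset", str),
        PropertySpec("algorithm", str),
        PropertySpec("programming language", str, required=False),
    ),
)


def check(template: Template, record: Dict[str, object]) -> List[str]:
    """Return a list of human-readable problems; empty means the record fits."""
    problems: List[str] = []
    for spec in template.properties:
        if spec.name not in record:
            if spec.required:
                problems.append(f"missing property: {spec.name}")
            continue
        if not isinstance(record[spec.name], spec.value_type):
            problems.append(
                f"wrong type for {spec.name}: expected {spec.value_type.__name__}"
            )
    known = {spec.name for spec in template.properties}
    for name in record:
        if name not in known:
            problems.append(f"unknown property: {name}")
    return problems


complete = {"problem type": "SALBP-1", "dataset": "Example dataset", "algorithm": "tabu search"}
incomplete = {"problem type": "SALBP-1", "dataset": 42, "solver": "custom"}
```

A check of this kind addresses the consistency concern raised earlier: two curators filling in the same template cannot silently model the same property under different names.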
The ORKG is a collaboratively growing organism. Even if papers could be imported automatically, there will be handiwork in the foreseeable future. That means approaching the semantic web geeks in our prospective fields and connecting with related projects.
In Germany, zbMATH and swMATH, or MathHub, are obvious partners. Thank you very much for the opportunity to present the Open Research Knowledge Graph. So, what are your suggestions?