Task Area 6, Synergies & Cross-Cutting Topics
Formal Metadata
Title |
| |
Title of Series | ||
Number of Parts | 5 | |
Author | ||
License | CC Attribution - NonCommercial - ShareAlike 3.0 Germany: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this | |
Identifiers | 10.5446/60203 (DOI) | |
Publisher | ||
Release Date | ||
Language | ||
Production Place | Hannover |
Content Metadata
Subject Area | |
Genre |
Transcript: English (auto-generated)
00:06
I will give you an overview of task area six, synergies and cross-cutting topics. So we already heard this morning a little bit about the NFDI and the cross-cutting topics and Base4NFDI and the sections, and I will pick up some pieces also concerning
00:22
DataCite formats, overarching search, and so on. Unfortunately, I have no pictures at all, but I would like to point out that a few members of the TIB team are sitting over there, and you have the possibility to get to know them during the poster session. We have Ralph Müller-Pfefferkorn here from TU Dresden and Michael Klix working
00:44
in the cross-cutting topics on AAI, and somewhere in the audience, we also have Thomas Hartmann from FIZ Karlsruhe for the legal stuff. So you will hopefully have the possibility to speak to them during these days. So task area six addresses issues and actions for a holistic use of the NFDI4Chem infrastructures
01:10
and services. So, as we have already seen or heard from Christoph, not everything is a service, maybe. It moderates and enforces the harmonization of the existing and new components and adds overarching
01:26
services on top. So we develop and maintain the terminology service, for example, and the search service, and we also contribute to ontology curation and ontology development as well.
01:42
And TA6 closely cooperates with task area two, task area three, and task area four on key issues like ontologies, metadata, standards, or the exemplification of data and the application of the corresponding services and APIs.
02:01
And last but not least, TA6 supports and contributes to the cross-cutting topics in the NFDI, which is quite a workload these days. And I would like to start with a short overview of our previous activities with regards to ontologies, terminologies, and the terminology service, and these activities are represented
02:28
by the label, or under the label, Ontologies4Chem. And at the beginning of the project, we did a survey on existing ontologies, terminologies and controlled vocabularies which are suitable for describing research activities and research
02:45
data management. And for this survey, we defined a set of criteria on how to select these ontologies, so that's very close to what you know from the FAIR principles. So the ontologies need to be indexed in one of the well-known ontology repositories
03:05
like the OLS, BioPortal, Ontobee, and so on, and be publicly available, ideally in a repository with issue tracking and versioning. And the ontology should have machine-actionable, hopefully, license information, preferably an
03:27
open license, and should also be documented openly so that you can read how to apply the ontology. And it should be reusable in a modular way and used by known projects, so this is
03:43
some kind of indication of the quality. So if there's something out there which is heavily used, then it should be good hopefully. And of course, we are especially looking for ontologies that are still actively maintained and curated.
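The selection criteria listed above can be read as a checklist. The following is a minimal sketch of such a checklist in Python, not the procedure the team actually used: all field names are hypothetical, and the split into required and merely preferred criteria is an assumption based on the word "ideally" in the talk.

```python
from dataclasses import dataclass


# Hypothetical field names; each one mirrors a criterion mentioned in the talk.
@dataclass
class OntologyCandidate:
    name: str
    indexed_in_known_repository: bool   # e.g. OLS, BioPortal, Ontobee
    publicly_available: bool
    has_license_information: bool       # preferably an open license
    openly_documented: bool
    modular_reuse: bool
    used_by_known_projects: bool
    actively_maintained: bool
    has_issue_tracking: bool = False    # "ideally" criterion
    has_versioning: bool = False        # "ideally" criterion


# Assumption: criteria introduced with "ideally" are treated as preferred only.
REQUIRED = (
    "indexed_in_known_repository",
    "publicly_available",
    "has_license_information",
    "openly_documented",
    "modular_reuse",
    "used_by_known_projects",
    "actively_maintained",
)
PREFERRED = ("has_issue_tracking", "has_versioning")


def evaluate(candidate: OntologyCandidate) -> dict:
    """Report which criteria a candidate misses and whether it passes."""
    missing_required = [c for c in REQUIRED if not getattr(candidate, c)]
    missing_preferred = [c for c in PREFERRED if not getattr(candidate, c)]
    return {
        "name": candidate.name,
        "accepted": not missing_required,
        "missing_required": missing_required,
        "missing_preferred": missing_preferred,
    }
```

A candidate that misses only preferred criteria is still accepted in this sketch; the report simply records what a curator might want to follow up on.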
04:01
And based on this set of criteria, we have so far identified 25 ontologies which are currently available through the terminology service of NFDI4Chem. And in September this year, so that was not that long ago, we organized our first
04:22
Ontologies4Chem workshop, and the overarching aim of this workshop was to bring together experts from the various ontology projects related to chemistry with domain experts and ontology experts, the software developers and the service providers of NFDI4Chem,
04:43
and also the other NFDI consortia that are somehow related to chemistry and chemistry data. And so this was a two-day workshop, and the focus of the first day was on reports and updates from the most relevant ontologies for chemistry, and we were very lucky that
05:04
all major ontologies were present, and their curators reported, so CHEMINF, ChEBI, [unclear], to name only a few, and that was a very nice update in this combination.
05:23
And as an additional perspective in this context, two guest consortia from the NFDI [names unclear] presented their plans and applications of ontologies with an overlap to what we do in NFDI4Chem, and then, to close the first day, we had
05:42
also an update on tools you use for pipelines or workflows, like ROBOT or the Ontology Development Kit, and on the terminology service from NFDI4Chem, so to see what is out there to work with ontologies. And then on the second day, we had a focus on
06:02
exchanging experiences and best practices on ontology development, so what we found out so far, on development and curation, as well as on how to actually do data annotation using ontologies. And as an outcome of the workshop, and that was quite interesting, the participants agreed
06:24
on the benefits of and the need for closer coordination on ontology development in the future, and with regard to quality measures, the need for more standardization of the development process was also identified.
06:42
And as one of the next steps, now for NFDI4Chem, we will investigate how ontology term definitions can be populated from the IUPAC Gold Book, so this is another activity we share with our partners from IUPAC, and that's something we will work on next.
07:04
And if you're interested, you can find a playlist, so the recordings from the workshop are available on our NFDI4Chem YouTube channel as well. So, with regard to the curation of ontologies, we started first with contributions to existing
07:22
ones, RXNO and CHMO, for example, from the Royal Society of Chemistry, and also CHEMINF a while ago, so those were our first steps. And recently, within NFDI4Chem, we identified some gaps in fully describing vibrational spectroscopy and the corresponding
07:44
data, and this is now our first approach to developing a new ontology as a joint effort of the main ontology experts within NFDI4Chem, including colleagues from the university,
08:03
from the Jena team, the IPHT in Jena, and the PTB and TIB, so this is the first step where we are going to develop something new, and for now, we are focusing on Raman spectroscopy. And following the modular design approach or paradigm, the ontology we are working on
08:27
imports terms from existing ones, so BFO, EFO, and a few others, and several terms from CHMO, the Chemical Methods Ontology, will be reused so as not to reinvent the wheel again and again.
08:47
And we recently also shifted the development process towards using TSV files, so that's something you can, as a domain expert, easily contribute to without using all the specialized
09:01
tools like Protégé or WebProtégé, which usually have some kind of a learning curve. And in the context of the terminology service, we identified that we need to improve the metadata of the ontologies, and that's something we experienced
09:24
recently, so currently we struggle with providing accurate information about releases, versions, and the last update of an ontology we provide via the terminology service, which is a requirement of the other services accessing the ontologies using our API,
09:45
for example. So the application of automated curation processes with tools like the Ontology Development Kit will hopefully improve the situation so that we can improve the quality of the metadata there, and we will continue our cooperation with the RSC in this
10:05
context, particularly on the use case, and jointly work on the migration of the curation process towards automated pipelines and workflows, which then help you to automatically update the metadata. And furthermore, we will evaluate how the Gold Book can be
10:26
used as a source of truth for the term definitions of ontologies and draft a recommendation for the community, as I mentioned before. Okay, so leaving the ontology development
10:43
and curation, we move on to the terminology service, so the platform providing ontologies and the repository for searching and browsing. And actually, our vision of the terminology service is not limited to only providing or giving access to ontologies, both for humans
11:05
and machines, of course. That is the initial purpose, but we see that the range of features of a terminology service can be much broader, especially if the service is embedded in the landscape of the NFDI. And not all features need to be offered directly in the terminology
11:26
service, but it can act as a broker to other services, so that you can link users with some data in their hands to a service to be used further. We have identified
11:41
three main features: first, the provisioning of the ontologies, the most obvious one, and second, the publishing or archiving of ontologies, which is often achieved by public services like GitHub or GitLab, but we see this as a future addition to the
12:01
terminology service as well. So a close integration of a Git service with a terminology service enables us to develop future features like cross-ontology issue tracking, for example, or discussion across ontologies, which are usually stored in separate repositories
12:22
in different places on GitHub. And so support for the curation process is the third sort of feature we are looking into, offering a platform for discussion across ontologies and simplifying the process of filing issues for term requests or any other topic.
12:43
That's something we're looking into. And for those actively involved in the maintenance of ontologies, the direct initialization of tools like WebProtégé out of the terminology service is of interest as well, so that you don't have to load files from different resources.
13:03
For the terminology service 2.0, the front-end development has now been separated from the back-end to pave the way for some of these new features beyond terminology lookup and provisioning, but we still have the possibility to stay in sync with the releases of the ontology
13:24
lookup service software by EBI. So that's the main idea behind that. And the new front-end is now being further developed with React. So, as mentioned before, there are 25 ontologies currently within the terminology service,
13:41
evaluated and selected based on our criteria. So that's the current content. And we have also recently improved the API documentation, especially to support the other NFDI4Chem services in their efforts to link to the terminology service and reuse ontology terms for data annotation. And for further details, we will have a poster on that,
14:07
but you can also test out the terminology service on one of our demo laptops. Regarding the overarching future development of the terminology service at TIB, we are also
14:25
seeking cooperation within the NFDI and with the other consortia. NFDI4Ing and NFDI4Health and also NFDI4Objects all rely on the OLS software by EBI for their terminology
14:41
services. And already now, both NFDI4Ing and NFDI4Chem are connected to our general TIB TS backend, which then provides domain-specific ontologies and services. And one can think of additional backend servers within the NFDI that can be operated
15:04
to also increase the reliability and stability of the whole service infrastructure. And here we see a strong opportunity for the consortia and also the communities to join forces. And in this regard, we are coordinating, together with the institutions
15:29
from NFDI4Health, a proposal for the terminology service as one of the NFDI base services in the corresponding working group of the section Metadata in the NFDI association. And
15:47
I will give you a little bit more context on that in a few slides. So that brings us to the search service, the idea of which was already mentioned or formulated
16:01
in a few of the questions. So the idea of the search service is to allow the community to search across the federation of repositories we're currently building up within NFDI4Chem, based on the indexed metadata, and then to direct them to the data repositories.
16:22
So the first version currently covers three repositories: the Chemotion repository, MassBank and RADAR4Chem. And on a service level, now running in production, this means that the repositories are synchronized every 24 hours so that we harvest and index the latest entries.
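The 24-hour synchronization cycle mentioned here could be decided per repository with a simple staleness check. This is only a sketch under stated assumptions: the function name, the repository keys and the idea of storing one last-harvest timestamp per repository are hypothetical and not taken from the actual search service.

```python
from datetime import datetime, timedelta, timezone

# The talk states that repositories are synchronized every 24 hours.
SYNC_INTERVAL = timedelta(hours=24)


def repositories_due(last_harvest: dict, now: datetime) -> list:
    """Return the names of repositories whose last harvest is at least
    24 hours old, or which have never been harvested (timestamp None)."""
    return sorted(
        name
        for name, harvested_at in last_harvest.items()
        if harvested_at is None or now - harvested_at >= SYNC_INTERVAL
    )


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Hypothetical harvest state for three repositories.
    state = {
        "repo_a": now - timedelta(hours=30),
        "repo_b": now - timedelta(hours=2),
        "repo_c": None,
    }
    print(repositories_due(state, now))
```

Keeping the check separate from the harvesting itself means that a newly connected repository only needs an entry in the state mapping to be picked up in the next cycle.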
16:44
And very soon, hopefully, other repositories like nmrXiv can be connected there as well, so that we extend the data space available for searching step by step. And this search service
17:02
can then also act as a central point of access or a hub to the NFDI, so speaking of interdisciplinary searches, or searches across disciplines which are very close to chemistry and somehow work with molecular data, but also to the European Open Science
17:21
Cloud. So that's basically the idea. So first the NFDI and then EOSC, there will be a direct link. That's something we have to look at in the future. And the search is actually a great example of the need for, or of how we need to, cooperate
17:42
and collaborate between the task areas. And here again we are in connection with task areas three, four and six, and the already mentioned task area 15 where we discuss this cross-area topic. So dealing with only three repositories at the moment, we already need to
18:05
consider different metadata schemas, formats and protocols for transferring data during the harvesting and indexing process. And so right now RADAR4Chem and Chemotion provide XML data via the OAI-PMH interface, and MassBank uses JSON-LD based on bioschemas.org, which is then
18:31
actually embedded in the web pages of the individual data sets and not so much available through an API. So that's one of the huge differences. So one is optimized for search
18:41
engines and the other one for API calls. And you can imagine that already this is quite a challenging constellation. Because of course we would like to use and provide domain specific metadata and the corresponding search features building on such metadata in our search.
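To illustrate the two delivery styles described above, the following sketch normalizes a DataCite-style XML record (as it might arrive via an API) and a JSON-LD block embedded in a web page into one common dictionary. It is an illustration under assumptions, not the harvester used by the search service: the function names, the target dictionary layout and the choice of the `identifier` and `name` properties on the JSON-LD side are hypothetical, and real records are considerably richer.

```python
import json
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Namespace of the DataCite metadata schema, kernel version 4.
DATACITE_NS = {"dc": "http://datacite.org/schema/kernel-4"}


class JsonLdExtractor(HTMLParser):
    """Collect every <script type="application/ld+json"> block of a page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.documents.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def record_from_datacite_xml(xml_text: str) -> dict:
    """Map a DataCite XML record to the (hypothetical) common layout."""
    root = ET.fromstring(xml_text)
    return {
        "identifier": root.findtext("dc:identifier", default="", namespaces=DATACITE_NS),
        "title": root.findtext("dc:titles/dc:title", default="", namespaces=DATACITE_NS),
        "source_format": "datacite-xml",
    }


def record_from_html_jsonld(html_text: str) -> dict:
    """Map the first embedded JSON-LD block of a page to the same layout."""
    parser = JsonLdExtractor()
    parser.feed(html_text)
    doc = parser.documents[0]
    return {
        "identifier": doc.get("identifier", ""),
        "title": doc.get("name", ""),
        "source_format": "json-ld",
    }
```

Both functions return the same keys, so an indexer could treat the records uniformly once they are extracted; the harder part, as the talk points out, is agreeing on where domain-specific metadata lives in each format.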
19:03
So something like a structure search or a substructure search. And as you can see here, with some of the metadata currently available in the system, we can also show some additional information on chemical structure in the overarching search. And comparing the different sources we
19:25
have, so as of today the metadata schema of RADAR unfortunately needs to be extended to accept chemistry metadata, and the Chemotion repository, closely linked to the Chemotion electronic lab notebook, can of course access a rich set of chemical metadata at this point. And Chemotion uses DataCite,
19:47
which is the most common schema used for research data management and research data repositories out there, but it still lacks the capability to cover domain-specific metadata like chemical structure information. And Bioschemas, as Stefan and his team have already shown us,
20:08
extending the original schema.org, on the other hand, can represent chemical substances and molecular entities, but there are still some gaps concerning measurement data, so there's room for improvement on all levels. And I would now like to show you some workarounds, so
20:27
as you can see here, the data model from Chemotion again, so with this set of relations between samples, reactions and different data sets, and how this is currently modeled or squeezed into
20:41
DataCite XML. So it works if the sender and the recipient know where to look, right. So the chemical structure information is extracted from the alternative title, so that's an agreement: Chemotion on the one side puts it there, and we as a search service know where to look,
21:03
so we can transport chemical information and extract it for the search again. And the role of the data set is extracted from the resource type, and the relations between samples and data sets are extracted from the related identifier, so it somehow works, right. But someone
21:23
else using this kind of DataCite XML might not know where to look and is not able to extract this chemical information. And here we have the corresponding metadata of the NMR data set, so from sample one, measurement data, and here you can see the IsPartOf relations, so
21:41
that's a possibility to connect different published data sets having a DOI and build up this network. But hopefully this short example demonstrates the urgent need for metadata standards and minimum information for chemical investigations, and actually we will approach the DataCite
22:05
metadata group to discuss an extension of the DataCite schema, at least for the chemical entities, right. So although they always say it's a general XML schema and they don't want such domain-specific metadata inside, you can still find the geolocation, so
22:22
there are some examples of domain-specific information. Because in the end we would like to support queries, for example for molecular entities, that return a comprehensive overview of the data available in our federation
22:41
of repositories, so that you can see: okay, here I can find NMR data, there's a mass spectrum available for this molecule. That's the idea behind it when the metadata is harmonized and also enriched with chemical information. Okay, so we are leaving NFDI4Chem for a moment, rising up to look at the NFDI and its
23:06
consortia and the cross-cutting activities. NFDI4Chem has contributed to the cross-cutting topics from the very beginning, so in 2019, I think, there was the first meeting on these cross-cutting
23:22
topics, and so we co-authored both the Berlin Declaration and also the Berlin-Leipzig Declaration on the cross-cutting topics, so this was of interest for us from the very beginning. And over the last years the cross-cutting topics have been consolidated into the sections of the NFDI association, moderating the joint efforts of the consortia towards the base
23:48
services available for all consortia. That's the idea behind that: we jointly develop services that can then hopefully be used by all consortia. And we have seen it this morning,
24:04
as Chris has shown us already: our commitment to the NFDI is also expressed by the leading roles we have in three out of four sections and within the sections. And you can see here the NFDI universe with the sections and the different working groups of the
24:21
sections representing the cross-cutting topics, so they have now been initiated to work on specific tasks, and many members of NFDI4Chem are also involved in these working groups. And this year, in August I think it was, this impressive commitment of all consortia to the
24:44
NFDI on these cross-cutting topics resulted in the Base4NFDI proposal, which was submitted to the DFG and which is signed and supported by all 19 consortia we have right now, but also by the hopefully remaining consortia coming in the third round. So this was quite a unique
25:10
process to agree on these cross-cutting topics and how to proceed in the development, and this was initiated of course again by a community-driven process, or will be,
25:27
concerning the proposals we will have for Base4NFDI services, which will then be moderated in the sections, where the consortia can bring in their requirements and recommendations, and Base4NFDI will then provide a framework for defining and developing
25:45
these services. And so the decision on funding will be announced, I think, in mid-November hopefully, and funding will hopefully then start at the beginning of '23. So coming back
26:05
now to the work program of NFDI4Chem: we are directly involved in and directly contributing to two of the mentioned cross-cutting topics, and first, through TU Dresden, so our colleagues here in
26:21
the front row, we are contributing to the task force AAI, so authentication and authorization infrastructure, concerning this whole thing about single sign-on and roles and rights. The task force was first responsible for collecting
26:40
requirements of the individual consortia to then develop an NFDI-wide solution for this identity and access management, which is now also the working group of the section. And only very, very short, as there will be a presentation within the federation of services: we were involved in the survey on AAI across all NFDI consortia, this
27:10
was at the end of '21, and so the majority of the NFDI services, and it's not a surprise, need authentication with regard to AAI to ensure a common user identity across several services,
27:24
so that's something we know from other infrastructure projects as well. And with regard to authorization, it is important to define these roles and rights, and it's also about trust. And of course previous developments and already established instances, and we heard about
27:44
this earlier, of community AAIs and the experience of the participants will then be part of this process and of a joint solution for the NFDI. And the task force has now merged into this new working group within the section Common Infrastructures,
28:03
and besides Michael also Stefan and Felix are involved to make sure that our interests are covered there in the current discussion and this identity and access management
28:20
is also defined as one of the most important issues for the NFDI concerning the cross-cutting topics, and therefore it's one of the featured topics of the Base4NFDI proposal, and so it's planned to be initiated as one of the first development processes for a base service. Okay, so the other cross-cutting topic, and this is
28:52
something where we can address our colleague Thomas Hartmann, is that we work further on a legally reliable framework of policies and guidelines to handle and to publish research data. And we
29:06
have seen earlier also that Franziska Boehm, a member of NFDI, is a spokesperson of the section ELSA, which is dealing with all the aspects of legal questions and copyright and ownership and so on, and we are really glad to have Thomas here today and also tomorrow, so whenever there's
29:28
some burning question you have about legal issues, then now is the time to address it. And a high-impact activity in the last months was the contribution to national research
29:42
data management law, or research data law, regulations and the European Data Act, coordinated by the section, where we contributed. So that's something that happens more and more, that policymakers and politicians are addressing the NFDI, but also the sections,
30:00
for statements, and that's a good opportunity to express our vision and opinion. And furthermore, in the context of these activities, several lectures and training courses have been organized to provide a better understanding of data law, so it's something which is always a mystery to me, and how it impacts the daily work of researchers,
30:26
and with the use case of research data management and the Chemotion ELN and the repository, an in-depth copyright analysis has already been made, resulting in several publications by the FIZ Karlsruhe team. Among further topics that have been addressed in the last
30:47
months, I would like to point out the discussion about the cooperation with industry and industrial partners, so that's actually one of the upcoming new sections that will be formed in the next months, so how to cooperate with industry concerning research data.
31:07
And this is something which entered the stage with the new consortia of the second round, so that we will see some additional activities there very soon. And closing the remarks on
31:22
the legal issues, I would also like to point out, and invite you to, the workshop Thomas is offering tomorrow afternoon at two on data ownership in research data management and, for example, all the questions you can think of dealing with your data and publishing data.
31:46
So, closing, I would like to point out, and this is basically coming back to communication, especially across the task areas, that we have this TA15 Rocket.Chat channel, which I was
32:01
expecting to already exist, but it didn't, so we had another one, but I made this one yesterday evening. We have the one for TA6 in general and the one for the ontologies discussion, and we have our monthly meetings both for TA6 and TA15. You are all invited to come there and discuss especially these cross-area topics on data, metadata, how to use technologies
32:26
and resources and so on. And that was my last slide, and thank you for your