Getting Connected With DataCite Metadata! (DataCite Metadata Working Group)
Formal Metadata
Title |
| |
Author | 0000-0003-3585-6733 (ORCID) | |
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor. | |
Identifiers | 10.5446/69882 (DOI) | |
Publisher | ||
Release Date | ||
Language |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
|
Transcript: English(auto-generated)
00:04
Are we seeing slides? Yes. Great. Okay, thanks to all, and thanks to both you and Mary for making this long set of meetings go so smoothly.
00:24
I really learned a lot and enjoyed the meetings. This one is the third Metadata Working Group presentation, with the emphasis on getting connected. DataCite has multiple roles, and today we're going to be talking about two of them.
00:47
One of them is identifying resources like those shown here as circles: mostly data sets and publications, but also of course funders, software, organizations and
01:01
people and many other things. The other important role, illustrated schematically here with a PID graph, is making connections between all those things. And it's really exciting to be working with so many people in DataCite, in the repositories, and in other places when these connections start to be made.
01:24
And of course the PIDs are the primary keys that make those connections possible, so we'll be talking about that today as well. We'll talk about identifying resources, people and organizations, and then making connections between those things. And then also the important
01:41
role that you all have in sharing your suggestions for new metadata elements, and not just new elements but ideas and use cases and things like that that come from the DataCite community. That's great. So first, describing resources.
02:01
DataCite currently holds about 24 million findable items, and the types of those things are given by the required field resourceTypeGeneral. Many of the values in the code list started in 2011 with the first version of the schema. And then in version
02:23
4.4, which came out in March of this year, we added 13 new types to that controlled vocabulary to better describe textual resources. As you can see in the pie chart on the left, Text is the second most common resource type in DataCite.
02:44
So having significantly more detail about what those text things are can be very helpful. Of course, making additions to the metadata schema is the first step, and the community adopting those changes is the more important one.
03:06
And the Metadata Working Group is excited to see that the types that were introduced just in March, essentially six months ago, are already being used quite a bit. Dissertation and JournalArticle are already used more than 16,000 times each.
03:23
Notice that the scale on this only goes up to 6,000, so if I showed the full scale for everything they would be going off the top of our monitor somewhere. Actually, Dissertation and JournalArticle are, I think, in the top 12 now. In just six months they are already in the top 12 of the resource types being used in DataCite, so that's super.
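As a rough illustration of the vocabulary change described here, the sketch below lists the 13 resourceTypeGeneral values added in schema 4.4 and tallies how often they appear in a handful of records. The sample records and the helper name `count_new_types` are invented for illustration; only the type names come from the schema.

```python
from collections import Counter

# The 13 resourceTypeGeneral values added in DataCite schema 4.4.
NEW_IN_4_4 = {
    "Book", "BookChapter", "ComputationalNotebook", "ConferencePaper",
    "ConferenceProceeding", "Dissertation", "Journal", "JournalArticle",
    "OutputManagementPlan", "PeerReview", "Preprint", "Report", "Standard",
}


def count_new_types(records):
    """Count records whose resourceTypeGeneral is one of the 4.4 additions."""
    types = (r.get("resourceTypeGeneral") for r in records)
    return Counter(t for t in types if t in NEW_IN_4_4)


# Hypothetical sample records, not real DataCite data.
sample = [
    {"resourceTypeGeneral": "Dissertation"},
    {"resourceTypeGeneral": "JournalArticle"},
    {"resourceTypeGeneral": "Text"},
    {"resourceTypeGeneral": "Dissertation"},
]
counts = count_new_types(sample)
print(counts)
```

Records typed with older values such as Text are simply ignored by the tally, which is one way to watch adoption of the new types separately from the rest.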
03:48
Of course, as someone who worked for quite a number of years on metadata standards, I'm super happy to see Standard added to this list. Those of you who write standards have to start registering them for DOIs so that
04:02
we can get some noticeable blue over on the right-hand side of this plot. We're going to have four polls today and this is the first one. The current resource types are shown here; you can see all the new ones and the old ones. The first poll question is: are there other resource types that you need?
04:28
Mary is going to help us. This is the menti.com code, 62960146; if you go there you can enter that code and give us your input.
04:45
Okay, so just to warn you that we decided that it would be okay to show the previous answers so we can compile everything. So you will see that you can vote here for new resource types that aren't appearing, or you can just add to what's already shown.
05:05
Mary, one question I had on this, because it seems like "presentation slides" is just sitting there and nothing's happened to it. Is that something that's sort of hanging out on the slide, or is it something that someone entered? I understand that to mean that the resource type "presentation slides" would be useful.
05:28
Would you agree? That was my interpretation. The size makes a difference as well. Doesn't the size mean it's been entered more than anything else?
05:45
Yes, I would note there are a couple of different spellings of that also represented in other places on this graph. Oh yeah, taking slides. Good one. I think the appearance of "protocol" on this list is really interesting. I mean, that's
06:05
a very interesting, sort of new kind of resource that's being shared, and getting DOIs for protocols would be really cool.
06:29
I'll stop sharing but I'll leave that open if anyone wants to add any more. You can go ahead and share again. Okay, we're back to the slides. Participants can now see my screen. Wonderful.
06:56
Okay, so that's for identifying resources and types of resources. Another important set of
07:02
identifiers is for the people that are involved in creating and making contributions to those resources. This is a list of the repositories, the DataCite members, that have the most records that include ORCIDs. There's been a lot of discussion of ORCIDs earlier today, and right now creator
07:28
identifiers, which are about 85% ORCIDs, although other identifiers are also there, occur in about 10% of DataCite metadata records, so we have a long way to go in
07:43
terms of increasing the number of people that are connected to DataCite resources, and there are a lot of ideas about ways to do that. Contributors are also, you know, very important to this
08:00
infrastructure of knowledge, and contributors can also have ORCIDs in DataCite metadata. Unfortunately, they're less common than ORCIDs for creators. So we have our work cut out for us there. Identifiers for organizations are also very important.
08:26
And the Research Organization Registry is a cooperative and open community (I guess it's not so recent anymore) that involves DataCite, Crossref, ORCID and others in identifiers for organizations.
08:45
And these are again, like the last slide, the 10 repositories with the most records with organizational identifiers. So these are really the DataCite members that are currently leading the adoption of these identifiers.
09:05
The first step, of course, towards having identifiers in your metadata is having affiliations. Right now about 23% of DataCite members include affiliations, so we also have a lot of work cut out for us in terms of adding affiliations, and then hopefully
09:24
identifiers for those organizations, to the metadata. Note that, like on one of the previous slides, the bar on the left for IPK.GBIS is broken here because it actually has over 208,000 organizational identifiers in the metadata.
09:48
And that's a big number, since on another slide we saw that the second leading repository has 37,000, so seeing 208,000 is really pretty amazing.
10:03
They actually have 208,373 metadata records, the same number of creator affiliation identifiers, and the same number of contributor affiliation identifiers, which is of course an interesting observation. But it becomes even more interesting
10:24
because they actually only have one identifier that occurs 208,373 times in all these records. So, as someone who looks at data a lot that was a very interesting and surprising observation for me, and it brought up the question in my mind, is it common for repositories
10:45
to only need a small number of organizational identifiers? So, this is the figure that we had before. I was trying to find a visualization, because I'm a visual learner at heart, that would help understand what the distributions of affiliations are in these repositories.
11:09
And so, I made a pie chart, a simple pie chart. And, of course, as we just said on the last slide, this repository IPK.GBIS has only one identifier in it. So in this case the pie chart is all one color.
11:27
It's maybe a boring pie, but that's what all blue means: one. The second leading repository here is Dryad. Most people know Dryad, a significant general repository for data associated with papers. And in Dryad's case, their picture is sort of the
11:53
opposite. In these pies, just because of the way that I got these data the maximum number of slices in the pie is 10.
12:02
Dryad actually has thousands of RORs, but the distribution of those is much more uniform across the top 10 and completely different than IPK.GBIS. And it turns out that if we look at the rest of these repositories, most of them look a lot
12:21
like IPK.GBIS and Dryad really sticks out. On some of these you can see little slivers of light maybe in the Old Dominion University there's a significant second piece. But the interesting thing here is that most of these leading organizations here have pies that are mostly blue, which means that they're able to
12:47
add identifiers to many of their metadata records with only a small number of organizations. So, it turns out that there are a lot of institutional or domain repositories in
13:03
DataCite, and many of those, several hundred of them, actually include a very small number of organizations. General repositories like Dryad may have a larger number of organizations, and a larger number of contributors in many cases, so finding affiliations and finding identifiers for those can be much more challenging.
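The pie charts described here can be approximated with a few lines of code. This is only a sketch under assumptions: the function name `top_slices` and the identifiers are placeholders, not real ROR IDs or DataCite data, and the shares are computed over the slices kept (at most 10), as in the pies.

```python
from collections import Counter


def top_slices(affiliation_ids, max_slices=10):
    """Return (identifier, share) pairs for the most common identifiers."""
    counts = Counter(affiliation_ids).most_common(max_slices)
    total = sum(n for _, n in counts)
    return [(ident, n / total) for ident, n in counts]


# Hypothetical identifiers standing in for affiliation identifiers.
institutional = ["ror:A"] * 1000                                  # one organization
general = ["ror:A"] * 120 + ["ror:B"] * 110 + ["ror:C"] * 100     # several

single = top_slices(institutional)
mixed = top_slices(general)
print(single)
print(mixed)
```

A repository whose list comes back with a single slice is the "all blue" case; a flatter list of shares corresponds to the more uniform picture described for a general repository.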
13:27
But the important part of this observation for me, knowing that there are a number of institutional and domain repositories in DataCite, is that those repositories can really be leaders, and have played an important role, in demonstrating the benefits of having these identifiers in their metadata: things like the
13:47
PID graph and other connections that help those institutions find people that are using their data, find people that are doing research that's related to those organizations. So, we can look to those
14:03
domain repositories for really helping us lead this migration towards increased identifiers in metadata. So, second poll question. Should, because affiliations are so important and they're, as I said, unfortunately
14:21
rather rare, they're in almost a quarter of DataCite metadata records. Some people think that those affiliations should be mandatory, so that we could take advantage of the connections across all of DataCite. So that is the second poll question.
14:46
Can you see it? Yes.
15:00
We're still getting maybe not a resounding answer of no to this question, which I can understand, because getting affiliation information can be difficult. In the published world, represented by Crossref and other identifier creators, affiliations are actually much more common than ORCIDs are. Many of you who live
15:28
in that world know that essentially all authors of papers have affiliations, and typically a small number of them have ORCIDs. So one of the advantages of trying to do retrospective affiliation
15:47
archaeology, as I call it, is that there are a lot of affiliations out there, which makes them easier in some ways than ORCIDs, although they also change as a function of time, so that's difficult. So filling in affiliations in historic collections is a challenge.
16:12
I'm going to stop that, but you can still vote. Great. Okay.
16:21
Okay. Next topic is connecting these resources, people, organizations and funders together. So this is a picture of a PID graph, and probably many of you have heard about the PID graph. The identifiers that we've been talking about, DOIs, ORCIDs, RORs, and other PIDs, are what enable the
16:53
entities, publications, researchers, funders, and data sets to be identified, and they act like primary keys in the database that connects these. The PID graph provides a picture of those.
17:04
In this case, in this picture it shows all of the resources and researchers associated with a single funder in the middle here. So the PID graph is a great tool for funders and other organizations to find work that's related,
17:23
that people that are related to those organizations are doing, either funders or universities or research institutes or things like that. And I think it's also going to be more and more of an important way to find data sets, to find publications, and use all of these connections to find them.
17:44
Different than, you know, Google text searches and things like that. So DataCite has two important tools for making these connections: first, related identifiers, and second, the new element, related items.
18:02
So the kinds of relationships in DataCite are shown by this picture, and as I already said, these are making connections. In DataCite, over 16 million metadata records include related resources.
18:21
Remember that there are roughly 24 million records all together, so more than half of them include related resources. And, of course, many of those include multiple related resources. The event data picture that will come up later shows about 48 million connections between things, so maybe the average number of related resources is three,
18:49
something like that, but obviously that's a pretty crude calculation. One of my favorite things: I mentioned earlier that I worked with a lot of scientific metadata standards that are more detailed than DataCite, which is really focused on identifying and connecting things.
19:08
And HasMetadata is a relation type that allows you to point from your DataCite discovery record, which you maybe find by following connections, to a more detailed metadata
19:25
record that would support things like the I, interoperability, and the R, reusability, in FAIR. DataCite focuses mostly on the F and the A, findability and accessibility, but the things that get pointed
19:41
to by HasMetadata are things that are really critical for the interoperability and reuse of data sets that are discovered. It was great to see earlier today a short presentation by Marco (I'm spacing on his last name here)
20:01
from FAO, the Food and Agriculture Organization of the UN. FAO actually is responsible for over a million of these 1.4 million HasMetadata examples. So they're relying on this to point from DataCite back to their more detailed metadata. Another organization
20:27
that does really well in doing that is ECMWF, a European weather forecasting group that includes HasMetadata links for all of their DOIs. So this is a great partitioning
20:45
of effort between discovery and connections in DataCite and detailed scientific metadata, typically in some other metadata dialect.
21:00
Excuse me, we're experts now at polls. This is the third poll. These are the relation types that currently exist in DataCite, and the question is: are there other relation types you need to make connections from your metadata? So, I would just invite everyone to take 30 seconds more to have a look at the existing
21:25
relation types that are listed there. And I will share. Right. Yeah, we have plenty of time. That came up in the last session that we had to go back and have a look at the slides.
21:43
Of course, one of the nice things about these is that there are pairs. So there is IsCitedBy and Cites, or IsSupplementTo and IsSupplementedBy. So you really only have to remember half of them. Okay, let's see what comes up here.
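The pairing of relation types mentioned here can be written down as a small lookup table. This is a minimal sketch covering only a few pairs from the DataCite relationType vocabulary; the names `PAIRS`, `INVERSE` and `inverse_relation` are made up for illustration.

```python
# A few reciprocal relationType pairs from the DataCite controlled list.
PAIRS = [
    ("IsCitedBy", "Cites"),
    ("IsSupplementTo", "IsSupplementedBy"),
    ("HasMetadata", "IsMetadataFor"),
    ("IsPartOf", "HasPart"),
]

# Build a two-way lookup so either member of a pair finds its partner.
INVERSE = {}
for left, right in PAIRS:
    INVERSE[left] = right
    INVERSE[right] = left


def inverse_relation(relation_type):
    """Return the reciprocal relation type, or None if it is not in the table."""
    return INVERSE.get(relation_type)


print(inverse_relation("Cites"))
print(inverse_relation("IsSupplementTo"))
```

Only one half of each pair has to be stored per record; the other direction can be looked up when the connection is followed from the other side.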
22:14
I think Susan is saying that the poll is showing the previous results, and that's intentional.
22:21
Yeah, this is the third time this poll has been given, so we wanted to get everybody into one. Whoops, "all the previous questions", Susan said. Sorry, Susan, I was only seeing part of your chat.
22:41
Yeah, there is a time lag when a new question is posted to replace the old one. This actually says relation types, not resource types: slightly different, but very similar. "Is output of", that's an interesting one. I like that one.
23:21
Okay, I'm gonna stop sharing again, but carry on voting if you want to. Great, thanks so much Mary. Okay so identifying things and connecting things. In a lot of situations there are, or in some situations there
23:43
are important resources that don't have identifiers. In some cases they're older papers or various things that just for some reason don't have an identifier. And in the most recent version of the schema,
24:01
4.4, we added something called related item that we can use for making the connections to those objects that don't have identifiers. And of course, because there is no identifier, you need more citation information in order to be able to find
24:21
those items. So related item includes standard citation kinds of elements like title, journal, date, pages, things like that. So the primary use case for those is for connecting to items that don't have identifiers.
24:45
Unfortunately this introduces into the schema two ways that you can make these relations, either using just an identifier or using an identifier and a more complete description, because the related item also includes an identifier.
25:02
And we did that because there were some use cases that the community brought up where they wanted to have an identifier and more detailed citation information. So we were trying to kill two use cases with one stone here. We wanted to warn you, though, that if you have a related item with a related identifier, those identifiers are
25:31
not yet picked up in the event data. So what that means is those connections made are not in the PID graph.
25:42
So if you have a related item that has an identifier, and you don't need that additional citation information, you know, stick with related identifier. Or, if you need that citation information, repeat the related identifier inside the related item and also by itself, so that those important connections are made.
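The advice above can be sketched as a small check. The dictionary below is a simplified, assumed JSON-style rendering of the schema 4.4 properties (relatedIdentifiers, relatedItems, relatedItemIdentifier); the DOIs are placeholders and the function name `missing_related_identifiers` is hypothetical.

```python
def missing_related_identifiers(metadata):
    """List relatedItem identifiers that are not repeated as relatedIdentifiers."""
    listed = {
        r["relatedIdentifier"].lower()
        for r in metadata.get("relatedIdentifiers", [])
    }
    missing = []
    for item in metadata.get("relatedItems", []):
        ident = item.get("relatedItemIdentifier", {}).get("relatedItemIdentifier")
        if ident and ident.lower() not in listed:
            missing.append(ident)
    return missing


# Hypothetical record; the DOIs are placeholders, not registered identifiers.
record = {
    "relatedIdentifiers": [
        {"relatedIdentifier": "10.1234/example-a",
         "relatedIdentifierType": "DOI", "relationType": "Cites"},
    ],
    "relatedItems": [
        {"relationType": "Cites", "relatedItemType": "JournalArticle",
         "relatedItemIdentifier": {"relatedItemIdentifier": "10.1234/example-a",
                                   "relatedItemIdentifierType": "DOI"},
         "titles": [{"title": "Example article A"}]},
        {"relationType": "Cites", "relatedItemType": "JournalArticle",
         "relatedItemIdentifier": {"relatedItemIdentifier": "10.1234/example-b",
                                   "relatedItemIdentifierType": "DOI"},
         "titles": [{"title": "Example article B"}]},
        # No identifier at all: the primary use case for relatedItem.
        {"relationType": "Cites", "relatedItemType": "JournalArticle",
         "titles": [{"title": "An older paper without an identifier"}]},
    ],
}
missing = missing_related_identifiers(record)
print(missing)
```

Any identifier reported by the check appears only inside a related item, so, per the warning above, it would not yet be picked up as a connection until it is also repeated as a related identifier.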
26:10
I mentioned event data in the last description. So event data is an interesting and emerging set of data being managed and created by
26:22
DataCite and Crossref and others, and it has links between publications and data, citations and software reuse. It has a number of different sources, including Crossref, DataCite and various social media kinds of things like tweets.
26:44
Whenever a DOI is mentioned in one of those sources, the source that it comes from and the DOI goes into the event data, and right now there's 93, almost 94 million events in event data, and this shows that
27:06
just over half of those are DataCite related identifiers, so this is where that 48 million number came from. But keeping track of the kinds of connections that are being made, and of course keeping track of the flow of social media
27:26
information and inputs is really an exciting thing in event data. There is some effort to recognize titles of things in addition to identifiers for things in the social media flow, so we wanted to mention event data because it's a very interesting
27:49
sort of new thing. It's been around for a few years, but many people are not as familiar with it as with the regular DataCite API.
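The rough arithmetic behind the figures quoted in this part of the talk can be laid out explicitly. The numbers below are the rounded values from the talk, so the results are only as crude as the speaker says; the variable names are illustrative.

```python
# Approximate figures quoted in the talk.
total_records = 24_000_000          # findable DataCite records
records_with_related = 16_000_000   # records that include related resources
datacite_connections = 48_000_000   # DataCite related identifiers in event data
total_events = 94_000_000           # all events in event data

share_with_related = records_with_related / total_records
avg_related_per_record = datacite_connections / records_with_related
datacite_share_of_events = datacite_connections / total_events

print(f"records with related resources: {share_with_related:.0%}")
print(f"average related resources per such record: {avg_related_per_record:.1f}")
print(f"DataCite share of events: {datacite_share_of_events:.0%}")
```

These ratios match the statements in the talk: more than half of the records include related resources, the average is about three per record, and just over half of the events are DataCite related identifiers.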
28:01
DMP connections. In the new version of the schema we added something called output management plans because we wanted to address the fact that research projects have more than just data as output, so data, software, presentations. Of course, publications were already covered, but the non-publication piece, many of us still
28:27
say DMP because we're creatures of habit, but in the new schema, this is really OMP. And this is an example, and actually I think this is the slide that has occurred the most times today, which I'm happy about
28:47
it. Actually my wife Erin produced this slide as part of FAIR Island, and I think it's really a great example. In the middle of this is a DMP, which has a DOI and is in DataCite, for the Moorea project; Moorea is an island in French Polynesia. When
29:12
we started looking at this DMP, it had a lot of names of people who are around here on the lower left, and then had one funder,
29:24
the Morea Foundation, and also the Gump South Pacific Research Station, but data sets and publications, nothing had been added. So this is a project that has been going on for a number of years, and actually the project has come to an end already, and
29:41
we, people that were working on the FAIR Island project, knew that this project had data sets and publications, and actually looked up DOIs for three data sets and eight publications that had come from this project. So this is the, you know, this is a
30:05
set of connections made through the new DMP kind of resource in DataCite to a number of people, funders and other organizations, and then also data sets up here and papers here. So this is really the
30:25
direction that a lot of people are thinking about going in terms of machine actionable DMPs and using these DMPs to make connections between things. Of course, the coolest thing about this is that
30:43
Kristian Garza at DataCite made a great Jupyter notebook for inputting a DOI, in this case the DOI for the data management plan, and seeing the connections that are
31:01
enabled in the metadata for that DOI. And of course, that notebook works for anything; it doesn't have to be a DMP. Erin actually found this quite easy to use, so when the slides come out, the URL is in there, or you can search for this title and put your own
31:26
DOIs in at the center and watch your connections grow. The fourth and last, thank you so much for your feedback so far on the first three. The fourth one is a little bit more complicated. Two questions.
31:44
We've talked a lot about metadata improvements in a number of these sessions today. We want to know what improvements are important for you and your users. And we'll see the DataCite roadmap in a minute; included in that roadmap is
32:04
a dashboard that has bibliometrics and other metadata characteristics that might evolve over time as we improve. So the second question we have is, which of those characteristics are you interested in tracking?
32:36
It's always those scary moments when you say, hmm, am I talking to anybody?
32:41
Ted, while we're waiting, there's a request for the URL for that Jupyter notebook. Can you drop that in the chat? Yeah, let me find that.
33:05
If it's the one I'm thinking of in the notes for the slides, I can find it if you like. Yeah. I thought it was in the notes for the slides. Don't see it there. Into the chat.
33:34
Okay. Okay, that is now in the chat.
33:47
Thank you.
34:01
I think this one is quite difficult and takes some thinking about it. So, as before, we'll leave it open and move on to the last one. This might also be an opportune moment to remind folks that the Q&A option is available if there are questions.
35:01
Okay, not getting much traction. Okay, one more bit, but I will stop sharing. There's still time to vote on that. We have a way to get more traction as time moves on. So, happy to show that. One thing that we announced, the metadata working group announced in March,
35:25
or maybe it was in May, after the release of 4.4, is an updated crosswalk between DataCite and Dublin Core. And the URL for that crosswalk is here. Everyone here is familiar with Dublin Core, another sort of lightweight, mostly discovery-oriented metadata dialect.
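To give a feel for what a crosswalk like that does, here is a minimal sketch in Python. It is not the working group's published crosswalk: the field names follow DataCite-style JSON, the element pairings are just the commonly assumed ones (creators to creator, titles to title, and so on), the publisher is assumed to be a plain string, and the sample record and its DOI are made up for illustration.

```python
# Illustrative only: a handful of commonly paired elements, not the
# working group's published DataCite to Dublin Core crosswalk.
def datacite_to_dublin_core(record):
    """Map a few DataCite-style JSON fields to Dublin Core element names."""
    dc = {}

    def put(element, values):
        # Keep only non-empty values; skip elements that end up empty.
        values = [str(v) for v in values if v]
        if values:
            dc[element] = values

    doi = record.get("doi")
    put("identifier", ["https://doi.org/" + doi] if doi else [])
    put("creator", [c.get("name") for c in record.get("creators", [])])
    put("title", [t.get("title") for t in record.get("titles", [])])
    put("publisher", [record.get("publisher")])  # assumed to be a string
    put("date", [record.get("publicationYear")])
    put("subject", [s.get("subject") for s in record.get("subjects", [])])
    put("type", [record.get("types", {}).get("resourceTypeGeneral")])
    return dc


# Hypothetical record; the DOI and all values are placeholders.
example = {
    "doi": "10.1234/example",
    "creators": [{"name": "Example, Author", "nameType": "Personal"}],
    "titles": [{"title": "An Example Dataset"}],
    "publisher": "Example Repository",
    "publicationYear": 2021,
    "types": {"resourceTypeGeneral": "Dataset"},
}

dublin_core = datacite_to_dublin_core(example)
```

Every Dublin Core element is kept as a list here because Dublin Core elements are repeatable, which keeps the mapping from DataCite's repeatable properties simple.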
35:53
And then new ideas. We mentioned several times that the DataCite community is, obviously, an important set of people who have things that they need to do and new ideas.
36:07
DataCite has recently made their roadmap available at this URL. You can get from the DataCite homepage to the roadmap or go there directly. It's a place where new ideas about new capabilities, new features, and also metadata changes can be input.
36:29
Those suggestions can be talked about during open hours, DataCite open hours, and currently stay up for up to two months.
36:40
Six suggestions so far have made it through the gauntlet and been validated using this process. And the metadata working group is, of course, watching the roadmap and participating and commenting on suggestions. A number of suggestions
37:02
already in the roadmap are of interest to the working group and more will come from there. So some of the poll questions, particularly the last two, needed some time and some thought, and the roadmap could be a way to provide more details on your answers to those questions
37:22
as time goes on and also to see what other people have, what other people suggest. The roadmap looks like this. This is the URL we were just looking at. There are four categories for these items. Under consideration, planned, in progress, and launched. So you can hopefully watch
37:46
your ideas or ideas that you like migrate along this tortuous path. If you have a new idea, there's a button here that's a little smaller than shown on the slide but still big and useful and you can use that to submit a new idea.
38:05
When you submit that idea, you press the submit button here and enter your email here so that DataCite can communicate back to you about this idea, and you'll actually get an email from the roadmap
38:23
asking if you're really a human being and if this was really your crazy idea and you say yes and then the idea goes up there. So this is a super exciting way to collect information and share information across
38:40
the community and also of course DataCite staff and members of the metadata working group and others. So that's the last slide. This is the current makeup of the metadata working group. We are looking for new metadata working group members and
39:01
there is a URL which will be in the slides. I can also track this one down if you're dying to get into the metadata working group right away. But that's, that's the end of the slides, and I guess we have some questions.
39:27
Q&A. So far it's very quiet. We have a lot of stuff going on in the chat. Hello Sheila.
39:42
Nice to see you. Thanks Ted. So, I went through all the messages, I think you answered all of them.
40:00
If you want to talk you can put your hand up as well. Raina Jenkins from Canada has a question. Sometimes an organization is the creator, definitely true. In that case the creator, there's a name type associated with the creator which is either personal or organizational.
40:23
If that name type is organizational, then the name identifier, which appears in that creator section of the metadata, would probably be expected to be a ROR or some other organizational identifier rather than an ORCID identifier
40:45
that you would use to identify that creator if that creator was a person. So, Raina is definitely right there: if the creator or the contributor is an organization, then ROR becomes a name identifier instead of an affiliation identifier.
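As a rough sketch of that rule in Python: the property names (`nameType`, `nameIdentifiers`, `nameIdentifierScheme`) follow DataCite-style JSON, but the checking function, the pairing table, and both identifier values are hypothetical placeholders for illustration, not part of any DataCite tool.

```python
# Assumed pairing, following the discussion: people are identified with
# ORCID iDs, organizations with ROR IDs. Other schemes are left alone.
PREFERRED_SCHEME = {"Personal": "ORCID", "Organizational": "ROR"}
MISMATCHED_SCHEME = {"Personal": "ROR", "Organizational": "ORCID"}


def check_name_identifiers(creator):
    """Return notes about identifiers that look mismatched with nameType."""
    notes = []
    name_type = creator.get("nameType")
    unexpected = MISMATCHED_SCHEME.get(name_type)
    for identifier in creator.get("nameIdentifiers", []):
        scheme = identifier.get("nameIdentifierScheme", "")
        if unexpected and scheme.upper() == unexpected:
            notes.append(
                f"{creator.get('name')}: nameType is {name_type} but the "
                f"identifier scheme is {scheme}; "
                f"{PREFERRED_SCHEME[name_type]} would be expected"
            )
    return notes


# Placeholder identifiers, not real ROR or ORCID records.
organization = {
    "name": "Example Research Organization",
    "nameType": "Organizational",
    "nameIdentifiers": [
        {"nameIdentifier": "https://ror.org/EXAMPLE",
         "nameIdentifierScheme": "ROR"}
    ],
}
mislabeled = {
    "name": "Example Research Organization",
    "nameType": "Organizational",
    "nameIdentifiers": [
        {"nameIdentifier": "https://orcid.org/0000-0000-0000-0000",
         "nameIdentifierScheme": "ORCID"}
    ],
}

notes_ok = check_name_identifiers(organization)
notes_bad = check_name_identifiers(mislabeled)
```

The check only flags the two obvious mismatches and stays silent about any other scheme, since "some other organizational identifier" is also acceptable.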
41:07
Actually, I was watching the chat at the time, and that came during the part where we were talking about affiliations. And some organizations actually do have affiliations. For example, there
41:20
are small labs that are located within larger organizations in my specialty. And so the larger organization, like the Jet Propulsion Lab wants to be mentioned as the affiliation of the instrument team that is the organization that produced a data set, for example, and that's fine. The metadata can handle those relationships very nicely. So if that exists, by all means use it.
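A small sketch of what that relationship might look like as a creator entry, again in Python with DataCite-style JSON property names. The helper function and the team name are hypothetical, and no identifiers are included because none were given in the talk.

```python
def organizational_creator(name, affiliation_names=()):
    """Build a creator entry for an organization, optionally affiliated."""
    creator = {"name": name, "nameType": "Organizational"}
    if affiliation_names:
        # Each affiliation is its own entry, so a team can list several.
        creator["affiliation"] = [{"name": a} for a in affiliation_names]
    return creator


# Hypothetical instrument team located within a larger organization.
team = organizational_creator(
    "Example Instrument Team", ["Jet Propulsion Laboratory"]
)
```

Here the instrument team is the creator and the larger organization appears only as its affiliation, which is the arrangement described above.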
41:43
The other point that was made with respect to affiliations is that perhaps it would be reasonable to have an "unaffiliated" value that could be used as a standard value to say this person is known to have no affiliations at this time. And of course, as usual,
42:00
you'd want some way to indicate the affiliation is not known, and probably not knowable, for things like my legacy data, which was created 50 to 100 years ago, and there's no way I can track down who was affiliated with what at that point in time. But at least people could stop looking.
42:23
There's a question in the Q&A. Is there a connection between the RDA PIDs for instrumentation working group and the DataCite metadata group? I happen to be the chair of that RDA group and a member of the metadata working group, so there is a connection.
42:46
Unfortunately I haven't had much time to spend on that one of my roles, but there's also been a lot of discussion between DataCite and members of the RDA PIDs group about moving forward.
43:04
And that is one of the items which is currently, I'm not sure if it's on the roadmap, but it's definitely in the list of items that the metadata working group is considering this year. And so the answer, I think, is a strong yes.
43:25
There are some groups already. NCAR is one that's using PhysicalObject for describing not only instruments but platforms like airplanes, and they're an atmospheric research group in Boulder.
43:41
So there are some people that are already using it without the instrument resource type. And of course, Marcus and Rolf and others are also experimenting already with it. And so yes, there's a lot of connection.
44:11
Okay, Mark. Mark. Mark. So many people that are in metadata land that I know and I haven't seen for so long because of this
44:23
weird situation that we're in, so it's really nice to see them coming and contributing to the DataCite community. Liz says there is a roadmap card for instruments. Great.
44:47
Okay, well, if, if there are no other questions, I guess we can go to recess early. I'm not actually sure so Rala knows if there's something after this. Yes, we have a product session after this. So you're welcome to join.
45:05
It will be a working session as well. Part of it will be working on which items everyone would like to push forward to next year
45:22
for DataCite services. Yes, so please do join the next session. And I would like to thank everyone for attending this session and also participating in the voting, because the information is really helpful for us to make decisions, because the metadata schema is for the community.
45:46
So we would like to implement the fields that you want to have on it. So it's really helpful. Thank you so much. So this is the end of the metadata session in the members. Thanks so much to Sarah and Mary for making this go so smoothly and to everybody for participating. So, I'll see you in metadata land.
46:10
Bye bye.