The Q-Codes: Metadata, Research data, and Desiderata, Oh My! Improving Access to Grey Literature
Formal Metadata
Number of Parts | 19
License | CC Attribution 3.0 Germany: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/39642 (DOI)
Transcript: English (auto-generated)
00:00
Here's the title and all the authors who helped with this research. As many of you know from this conference, it is difficult to find gray literature and its data, and one of the important things is adding metadata to both the research
00:28
and the data, and one of the ways to do this is with an indexing terminology. Next slide, please. At GL19, our friend Marc Jamoulle submitted a paper discussing one of the terminologies
00:45
that can be used especially for general practice/family medicine (GPFM), and he calls it 3CGP. The next slide should be slide number 3. 3CGP has two components. One is the International Classification of Primary Care, which covers the medical part
01:05
of any data that you're trying to index, and the other is the Q codes, which cover the nonclinical or, as he's now calling it, contextual information, and we'll discuss that a little bit more.
01:20
Slide 4, please. The Q codes, as I said, cover the contextual information, and they have eight top categories. One of them is patient's category, which includes age, gender issues, and abuse. Family doctor's issue is another one, and that covers communication,
01:44
clinical prevention, and medico-legal issues. Another top category is medical ethics, which obviously would cover such things as bioethics, professional ethics, and information ethics. Another one is planetary health, which covers environmental health, biological hazards,
02:03
and we must not forget nuclear hazards. Another one is patient issue, which covers patient safety, patient-centeredness, and quality of healthcare. Another one is research, and this covers research methods,
02:20
research tools, and epidemiology of primary care. Another top category is structure of practice, covering primary care setting, primary care provider, and practice relationships. And our last one is knowledge management, which covers teaching, training, and knowledge dissemination, which we all know is important.
02:43
Next slide, please. One of the things that I wanted to show you is that each one of these top categories comes in a hierarchical setup, and this shows one of them: QT, which is knowledge management; QT4, which is training; and QT43, which is trainers and supervisors.
03:06
It will become clear later on why this can be helpful. This is a simple hierarchy with a tree-like structure, and each top category has one of its own, so there are eight simple tree-like structures.
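The hierarchy described on this slide can be sketched in a few lines of Python. This is only an illustration: the three labels are the ones read out in the talk, while the rule that a broader code is obtained by dropping the last character, and the function names `broader` and `path_to_top`, are assumptions made for the example rather than anything defined by the Q codes themselves.

```python
# Labels are the three read out from the slide.
LABELS = {
    "QT": "knowledge management",
    "QT4": "training",
    "QT43": "trainers and supervisors",
}


def broader(code):
    """Return the parent code, or None for a two-letter top category.

    Assumption: a parent is the code minus its last character.
    """
    return code[:-1] if len(code) > 2 else None


def path_to_top(code):
    """List the code and all of its ancestors, narrowest first."""
    path = []
    while code is not None:
        path.append(code)
        code = broader(code)
    return path


# Walk from the top category down to the narrowest term.
for c in reversed(path_to_top("QT43")):
    print(c, "-", LABELS[c])
```

Because each code has exactly one parent under this rule, the structure is a simple tree, which matches the "one place in the hierarchy" idea discussed later in the talk.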
03:23
Slide six, please. When building a taxonomy, researchers like to use guidelines, and Jamoulle and colleagues attempted to follow the 12 guidelines proposed by Cimino in his 1998 paper called,
03:49
oh well, whatever it was called, "Desiderata for Controlled Medical Vocabularies in the Twenty-First Century." What Jim Cimino did was bring together 12 guidelines that researchers
04:06
wanted to see in a terminology and that they thought would make it a good terminology. Slide seven, please. Okay, the aim of my research, and that of all the people on this paper, is to evaluate
04:26
the Q codes against the 12 desiderata to see how well they meet them. What I did was use the current version of the Q codes and the desiderata from this paper that James Cimino wrote, and I examined the taxonomy for the presence or absence
04:48
of each of these desiderata. Slide eight, please. Here are all 12 of them, and I'm going to discuss them one at a time and go as quickly as I can while still making sense, because this is the bulk of what the paper is about.
05:04
The first desideratum is content, and what it says is that there needs to be a systematic, explicit, and reproducible method for adding content to a terminology, and there
05:20
are two ways to do this. One way is to add all the single-word terms covering the domain and then allow the users to combine them to form concepts. A second way is to add the concepts or terms as you encounter them in the various
05:44
data sources. When the Q codes were formed, conference abstracts were used, so we chose the second way. Desideratum two is concept orientation, and this says that a term must have one
06:06
meaning and no more than one, and this is true of the Q codes. The third desideratum is concept permanence. It requires that once a concept has been created, its meaning cannot be changed,
06:25
even if the concept's preferred name changes or the concept is marked as inactive, which happens when it becomes outdated or its name is no longer used.
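Concept permanence can be pictured as a record whose meaning is read-only while its name and status stay editable. The sketch below is an assumption-laden illustration, not how the Q codes are actually stored: the `Concept` class, its fields, the placeholder definition text, and the new name used in the example are all hypothetical.

```python
class Concept:
    """A concept whose code and meaning are fixed at creation time."""

    def __init__(self, code, definition, preferred_name):
        self._code = code
        self._definition = definition
        self.preferred_name = preferred_name
        self.active = True
        self.former_names = []

    @property
    def code(self):
        return self._code

    @property
    def definition(self):
        # Read-only: there is no setter, so assigning to it raises AttributeError.
        return self._definition

    def rename(self, new_name):
        """Change the preferred name but remember the old one."""
        self.former_names.append(self.preferred_name)
        self.preferred_name = new_name

    def retire(self):
        """Mark the concept inactive; it is kept, never deleted."""
        self.active = False


# Hypothetical example data.
c = Concept("QT4", "(placeholder definition text)", "training")
c.rename("professional training")
c.retire()
print(c.code, c.preferred_name, c.active, c.former_names)
```

Renaming and retiring change only the label and the status flag; the definition the concept was created with stays exactly as it was.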
06:41
This is already in practice, so the Q codes meet this desideratum. The fourth one is non-semantic concept identifiers.
07:01
You can use the term's name as the identifier or, as you saw earlier, a hierarchical code like QT, QT4, or QT43, which often tells you the placement of the term in the hierarchy. And as you saw, hierarchical codes are what the Q codes use.
07:24
The fifth desideratum is polyhierarchy, and there are two ways you can do this. You can use a simple hierarchy, where the term has only one place in the hierarchy,
07:40
or you can have a polyhierarchy, which many of you who use MeSH know it uses, where the term can appear in more than one place in the hierarchy. As Marc would say, what a mess. In some ways I agree with him, but it works.
08:03
In the Q codes he used a simple hierarchy, and one of the reasons is that it complements the International Classification of Primary Care. The sixth desideratum is formal definitions.
08:22
This desideratum is about the different relationships in which a concept participates with other concepts. As you could tell from the hierarchy, that's what happens: QT4 is a narrower term for QT, and it's a broader term for QT43.
08:48
The seventh desideratum is reject "not elsewhere classified." It discourages the use of a "not elsewhere classified" category,
09:01
which in our case we called a rag bag. You know, if you can't find a place for the term, why not just throw it in the darn rag bag? Maybe it'll eventually find a function or a place. But when you do that, it often causes problems in the future, so it's recommended that you don't.
09:22
In the Q codes, a rag bag is used only while terms are being examined for possible inclusion in the actual Q codes taxonomy. Otherwise, when a version is being submitted, the rag bag doesn't show up
09:45
because there's no reason for it to. Desideratum 8 is multiple granularities. You can have a single level, which means, well, let's make a mess and just list everything on one level. That's cumbersome, and it makes no sense after a while.
10:02
Or you can have multiple levels, like the Q codes have. That can help clarify and show which term is broader and which is narrower, and it makes it easier for those who are using the taxonomy to understand it. The ninth desideratum is multiple consistent views.
10:22
And this, in a sense... oh, I'm sorry, I'm not even... yeah, you're stuck on that one slide. That must be brilliant. Anyway, this usually involves a computer view. If you go to hetop.eu, that's HeTOP, a health terminology portal.
10:48
When you look at it and put in the Q codes, you can see this in action. If a vocabulary is used for multiple applications, then there is a need to provide multiple consistent views
11:04
of the vocabulary. There are three ways that you can do this. If a user needs to see only coarser-grained concepts, for instance diabetes mellitus, then finer-grained concepts such as insulin-dependent
11:20
type 2 diabetes mellitus can be hidden and treated as synonyms. Another way is to enable users to show or hide specific levels. This means that you can look at one level, such as QT4, and see what it looks like at that level, or you can climb up and down from that level,
11:44
as I showed you: you can go from QT4 up to QT, or from QT4 down to QT43. And if you sit on the QT4 level in HeTOP, you can see the term and its preferred
12:01
label, with links to definitions and also to abstracts that cover that term. So there are in fact multiple consistent views in which you can see the Q codes.
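The idea of showing or hiding levels, and of moving up and down from one code, can be illustrated with a small Python sketch. It is not a description of how the portal works: the depth rule (code length minus one), the three-code sample list, and the function names `view` and `around` are assumptions made for the example.

```python
# Sample codes from the talk; a real taxonomy would have many more.
CODES = ["QT", "QT4", "QT43"]


def depth(code):
    """Assumed depth rule: QT is level 1, QT4 level 2, QT43 level 3."""
    return len(code) - 1


def view(codes, max_depth):
    """A coarser view: hide every code deeper than max_depth."""
    return [c for c in codes if depth(c) <= max_depth]


def around(code, codes):
    """Return the broader code (or None) and the list of narrower codes."""
    parent = code[:-1] if len(code) > 2 else None
    children = [c for c in codes if c[:-1] == code]
    return parent, children


print(view(CODES, 2))        # codes visible when the third level is hidden
print(around("QT4", CODES))  # one step up and one step down from QT4
```

Every view here is computed from the same list of codes, so the coarse and fine views cannot contradict each other, which is the point of keeping the views consistent.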
12:26
Desideratum 10 is representing context, and it asserts that vocabularies should contain explicit information about how and when the concepts should or should not be used.
12:40
Unfortunately, although the Q codes have definitions, there's no information about how and when to use the terms. This is going to be worked on in the future, because it is important. You probably know an example of this: MeSH, for instance, shows annotations on how a term should be used.
13:04
The 11th desideratum is called graceful evolution. Terms change over time: some are added, and others are flagged because they're too old or archaic. These changes need to be tracked and logged.
13:23
At this time, different tools are being used, like databases, and we often work with ATLAS.ti to track all of these changes and requests for new terms. And last but not least, the 12th desideratum is recognize redundancies.
13:47
Since the Q codes are very young, having been around for no more than one to two years, right now we don't have any redundancies in them, and that's a good thing.
14:01
Two things help here: representing context, which is how and when to use terms, and the relationships between concepts, which
14:25
will often help you see when redundancies are going to show up. Since we are going to fix the problem with context, that is, when and how we should use the terms, that should help us, as will showing the relationships.
14:45
So we have shown, thank you, that the Q codes meet 11 of the 12 desiderata, which shows that some improvements can be made, which in turn will improve his 3CGP,
15:02
which he uses to index gray literature in GPFM. This will help us make gray literature available to different people as well as to those in general practice and family medicine, and that, in turn, will decrease the loss of information.
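The presence-or-absence evaluation summarized here can be written down as a simple tally. The sketch below is one reading of the talk, not the authors' actual scoring sheet: the only desideratum explicitly described as unmet is representing context, so the per-item verdicts for the other eleven are inferred from the stated total of 11 of 12.

```python
# The twelve desiderata in the order they were presented.
DESIDERATA = [
    "content",
    "concept orientation",
    "concept permanence",
    "non-semantic concept identifier",
    "polyhierarchy",
    "formal definitions",
    'reject "not elsewhere classified"',
    "multiple granularities",
    "multiple consistent views",
    "representing context",
    "graceful evolution",
    "recognize redundancies",
]

# Assumption: the single unmet item is the one the talk calls out.
UNMET = {"representing context"}

# True means "present in the taxonomy", False means "absent".
results = {name: name not in UNMET for name in DESIDERATA}
met = sum(results.values())

print(f"{met} of {len(results)} desiderata met")
for name, ok in results.items():
    if not ok:
        print("still to be addressed:", name)
```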
15:24
The last slide, please. Thank you very much, and if you have any questions, I'll do my best to answer them.