Floating classifications - Knowledge Organization Systems in past, present and future

Formal Metadata

Title
Floating classifications - Knowledge Organization Systems in past, present and future
Number of Parts
36
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Transcript: English (auto-generated)
Good morning. Welcome back to the virtual WikiCite conference. Please have a look at the website to check out the other sessions. They come from all over the world, and we are happy that so many people are participating in the conference. We have such a lively discussion. It's so many network tools.
For those who haven't joined yet, you can comment in the chat, on Twitter with the hashtag #WikiCite, or directly next to the YouTube video. I think it's in that direction. Or you can also post your questions in the Etherpad. Actually, we have a small problem today. My co-host, Jakob, has craftsmen at home disrupting his internet.
And he hopes he can join in about an hour or so. So we are going to start without him. And, yeah, we have a bilingual session today. The talk of Andrea Scharnhorst, which we are about to hear in about two minutes, will be in English, and the second part, with my co-host, if he shows up,
and Philipp Zumstein, will be in German. Jakob will speak about the mapping of authority data using the tool Cocoda. And Philipp will present the ingest of literature into Wikidata with the reference management program Zotero. So this is the second part later on.
Yeah, and we start with the first part, the talk of Andrea. She told me that you can also ask your questions in German if you like, or in Dutch. And you can ask your questions during her talk and interrupt her. So if you are in Germany or in the Netherlands, you can ask your questions in German or in Dutch.
You can also ask your questions in this Etherpad, on Twitter, or on YouTube, and you can ask them in German as well.
So, in English: Andrea will talk for about 30 minutes if you don't interrupt her. But you are allowed to interrupt her at any time. Yeah, so I can introduce Andrea now.
Andrea Scharnhorst is head of research at DANS, which stands for Data Archiving and Networked Services. And she's an active member of DARIAH, the research network for digital humanities. Originally, Andrea studied physics in Berlin at Humboldt University. And she also did her Ph.D. there.
And in her thesis, she developed an interest in the direction of scientometrics and the modeling of research dynamics using quantitative methods. And this is also where she continued to work afterwards. Many of her publications also include reflections on philosophical questions related to scientometrics and information science.
And so we are here for a kind of meta talk today: how to organize knowledge in the past, in the present and in the future. And I'm super curious about your talk, Andrea, on floating classifications. Thank you very much, Eva, for that very nice introduction.
If you can now switch to my slides, if you can arrange for that. I'm also super excited to be here. And I would like to thank you very much for the invitation and for the possibility to be here. And without further ado, I go to my slides.
So what I would like to do with you today is to guide you through my own learning process and a couple of projects I have been involved in over the last 10 years or so, which all have to do with knowledge organization.
And that map of the presentation will come back during my talk. But let me first elaborate a bit. So why knowledge organization, and what do I actually mean by knowledge organization? Knowledge organization systems such as classification schemes and ontologies,
but also any kind of categorization and formalization, is something we are constantly surrounded by. In our individual development, we gain insights. Our ability to recognize the world also has something to do with learning to categorize, to classify, to order.
And classification, of course, has also something to do with formalizations. And we live in a digital age, we live in the information age. So formalization is actually the key. It is the key instrument.
It is what stone was in the Stone Age or bronze was in the Bronze Age. The systems of how to order our knowledge are indispensable. They are behind every database. They are behind every knowledge graph we now build.
And at the same time, sometimes I think that they have also kind of faded away. They have become more hidden. It's almost as if we, the engineers, mankind, have become too good at building systems
which seamlessly navigate the users through the information ocean. So I heard Daniel's talk yesterday, and he talked a lot about tagging and tagging topics and how to use the data richness in the WikiCite project to construct an alternative bibliographic database.
And when I listened to him, I was reminded of an evening lecture series I have been attending for a couple of weeks, organized by the International Society for Knowledge Organization, where Sylvie Davis gives an introduction to indexing and subject headings and what it all means.
And for me, this is also kind of new territory. I'm not trained primarily in the information sciences. I learned that over the course of my professional career, and I'm still learning, and I'm still surprised how much we sometimes seem to have forgotten about those older debates.
What is the relationship between a word and a concept and a keyword? And how do you index? And how can machines help you, and where are human beings still needed? So this talk today is very much that kind of thing.
It tries to be a contribution to re-evoking a discourse about knowledge organization systems and bringing together the pioneers of today's technologies with those who have reflected on that in the past.
So if I talk about knowledge organization, then it has something to do with our own individual practice but also with our societal or group or community-built practices where we build knowledge organization systems as a kind of means or tools or instruments. And then there is also knowledge organization as a scientific field
which lives in this soft layer between computer science and information seeking behavior, searching, building of information systems and the like. So I tell you about, and that's a kind of disclaimer,
so I tell you about knowledge organization really not as an expert. Every time I sit in those lectures of the experts, I discover how much I actually don't know. And as Eva already mentioned, I started my career in physics, in statistical physics to be more precise,
and then I moved to the philosophy of science and then I moved further to quantitative studies of science and science and technology studies and now I'm working at an archive. So I'm more a traveler and I symbolize that by such a map of science as you can now see on the screen
which was actually made by Dick Klavans and Kevin Boyack and can be found in Katy Börner's Atlas of Science. So as a traveler between many fields in my career, what I was always in need of is a kind of quick overview of the other field I'm about to enter:
what is this about, how can I get a grip on that, how can I, on a very generic level, contextualize that and make myself familiar with it. And I think this has triggered my own personal motivation to look into knowledge organization systems
as a kind of guidance, as a kind of thing I can hold on to, a reference system I can organize myself around. But at the same time, as a traveler, you will not get the crisp and clear definitions from me
that you get from the experts in the fields. And I'm very much aware of that, so I give you at the end of the slides, which will also be on SlideShare later, I give you a lot of references and I really invite you to explore them. So once more, so why is the ordering of the knowledge so important
and why have I titled that slide Ordering of the Knowledge and Ordering of the World? Now, I spoke already about the ordering of our own knowledge and understanding and insight, and then knowledge organization would say we communicate to further develop these insights
and also share them with others and spread them, so contribute to the diffusion of knowledge. And we do this also by documentation, so this is where texts come in, this is where documents come in, be they books, be they websites or photographs or affiches or whatever kind of document you can imagine,
images for instance also. So information, we would say, is knowledge that has been shared by being communicated, and that is also documented in one way or another. And knowledge and information, as well as knowledge organization systems,
and that is an interesting aspect I would like you to keep in mind, are socially constructed representations of objective reality. The builders of large knowledge organization systems, for instance the builders of Wikipedia,
of course attempt to give you an insight into what is around us, but in the end it's the social interaction of those building them which also shapes a lot how it is built.
And I hope I can show you later on how this is important. So I already mentioned what has triggered my own interest in knowledge organization. And for this talk I decided to look back one revolution before us, one technological revolution before us.
I think we all agree that we are currently in the information revolution, with the advent of the internet, which has led to an explosion of knowledge
and to unforeseen possibilities of what you can do nowadays. So if you look one revolution back, then it's probably the industrialization around the beginning of the 20th century. And that came with a new image of what the role of libraries, public libraries and academic libraries, is.
And then you are back to the founders of universal or generic classification systems. Two of them are marked here. One is Melvil Dewey, who was president of the American Library Association
but is mostly known for the Dewey Decimal Classification, which is still widely used. It's used in a lot of libraries, so you will also find it in OCLC's products. And at the same time as he developed that, Paul Otlet, together with Henri La Fontaine in Belgium,
in the then young Belgian nation, was dreaming about knowledge and information systems which not only help you to order information and to navigate but also to improve the world.
So Paul Otlet is now known for the Universal Decimal Classification, but he was also very much engaged in peace processes, in the idea of building associations among countries to secure world peace.
So Dewey and Otlet are just two figures who are interesting to look at, also historically. And I will stick to Paul Otlet for the time being because I have to admit that I was really kind of,
yeah, how to say that, I was mesmerized when a colleague of mine, Charles van den Heuvel, had as a screen cover on his laptop this kind of photographs, and I said to him, what is this? And he said, oh, these are photos I have taken in the Mundaneum,
where the legacy of Paul Otlet is still kept today, and that's also something I really, really recommend: you should go there and have a look if you are at all interested in information and knowledge organization. It should be a must for everybody to go there. So I was mesmerized by that
because Otlet was a visionary who was also drawing his ideas. He was scribbling around a lot. He would even work on a visual encyclopedia on how we gain knowledge, how we document it, how we communicate it, how we should archive it.
So there are a lot, a lot of drawings which I still find very inspirational. And he also developed that into traveling exhibits, which reminded me a lot of those big old paper posters
you might remember, or you might not remember, which used to hang in school classrooms. But today we're going to look into the bibliographic side, into a knowledge organization system.
He also invented the so-called Universal Decimal Classification, and I borrowed a slide here from Aida Slavic, the current editor-in-chief of the UDC. And that was the system devised to order all the knowledge of the world, and it's organized in classes, as you would probably also expect, with the scientific fields
as the main ordering principles. And it was actually Richard Smiraglia who explained to me why we have that ordering, why Paul Otlet came up with this ordering.
So he starts with general principles like almost divine principles and then he goes to philosophy, religion and belief and then he goes to social sciences and then so he goes to more and more kind of material, concrete things.
So it is also not arbitrary what is class zero and what is class nine in this system. Just to make a remark on that. So the Universal Decimal Classification as it stands today is described as an analytico-synthetic, faceted classification.
And if you go to the UDC Summary, which is open, then you see the system the UDC represents today. And what is interesting is this: the main tables are what you would probably expect to see.
So these are kind of topical classifications. And then you have auxiliary tables and you have signs. And the common auxiliary signs as part of the auxiliary tables, they define kind of operations
you can do with those main tables, with those conceptual categorized kind of entries. And the common auxiliary numbers, which are another part of the auxiliary tables, they allow you to combine or to say that something is in a specific language
or something is from a specific place. So actually what the UDC is about is a language. And while it has been used in libraries to create subject headings, so to index objects, documents, books
in libraries and archives, the original vision of Paul Otlet was to apply it to everything. So not only books. Paul Otlet applied it to his own drawings.
They all have a UDC number, or most of them have a UDC number. So this is also something to keep in mind. And it has also been applied to Wikipedia articles, but I will come to that later. Actually, my dear host, what I also realized is that I cannot look at my slides
and look at the channel at the same time. So I actually don't see comments coming in. So either you need to interrupt me or I just go on babbling. So the Universal Decimal Classification, from a physics point of view, is a complex system.
So in practice, indexers have developed long, complex compounds, so-called compound UDC numbers, where each part has a specific meaning. And it's also a language you need to learn to apply.
So, for instance, this long string here starts with a class coming from the main table. And that class comes from the social sciences and in particular the subclass of cultural anthropology, the subclass of public life and the subclass of ceremonial something in this.
And then you get another part of the UDC, a so-called sub-grouping, where you have another part, the first part on the history of a place. And in that case, 92 stands for history, and what is in the brackets stands for the country, the Czech Republic. So you can break that down further and further.
And what it means is that it describes a document which says something about the celebrations of May 1st or May 8th, so related to the workers' movement in the Czechoslovak Republic.
So there's also, of course, a historical aspect to it, as you will immediately grasp. Now, if you are quantitatively interested, you will also immediately see that you can play around with this kind of string, but you also see that it's not easy to parse them.
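To illustrate why parsing is delicate, here is a minimal sketch of a tokenizer for compound UDC numbers. It is not the method used in the studies mentioned in the talk. The function name `tokenize_udc` and the sample string are made up for illustration (the string on the slide is not reproduced in the transcript), and the pattern only covers a few common notational conventions: main numbers, parenthesized place and form auxiliaries, quoted time auxiliaries, language auxiliaries starting with `=`, and the connecting signs `+`, `/`, `:` and `::`.

```python
import re

# Each alternative recognizes one syntactic kind of UDC token.
# The order matters: the more specific parenthesized forms come first.
TOKEN_RE = re.compile(r'''
    (?P<time>"[^"]*")            # time auxiliary in quotation marks
  | (?P<form>\(0[^)]*\))         # form auxiliary: parentheses starting with 0
  | (?P<ethnic>\(=[^)]*\))       # parentheses starting with an equals sign
  | (?P<place>\([1-9][^)]*\))    # place auxiliary: parentheses starting with 1-9
  | (?P<language>=[0-9.]+)       # language auxiliary
  | (?P<connector>::|[+/:])      # signs that combine numbers
  | (?P<bracket>[\[\]])          # square brackets for sub-grouping
  | (?P<main>[0-9][0-9.]*)       # number from the main tables
''', re.VERBOSE)


def tokenize_udc(notation):
    """Split a compound UDC number into (kind, text) pairs.

    Characters the simplified pattern does not know are returned
    one by one with the kind "unknown" instead of being dropped.
    """
    tokens = []
    pos = 0
    while pos < len(notation):
        match = TOKEN_RE.match(notation, pos)
        if match is None:
            tokens.append(("unknown", notation[pos]))
            pos += 1
            continue
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


# Hypothetical example string, not taken from the slide.
print(tokenize_udc('94(437)"19"'))
# [('main', '94'), ('place', '(437)'), ('time', '"19"')]
```

The same digits mean different things depending on the sign that precedes or encloses them, which is why the sketch classifies tokens by their delimiters first. Special auxiliaries and other notations are deliberately left out and would show up as `unknown`.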
So parsing a UDC number is something where you have to be really, really careful, because the numbers have a specific meaning depending on where they are, behind which sign they are, and in which combination they appear. Together with Richard Smiraglia and Cheng and Almila and Krzysztof,
we did a lot of investigations of UDC numbers in action, we call that, or UDC numbers in the wild. So you can analyze how long the UDC numbers are in the Master Reference File, that is, in the UDC system itself, where you can still look up and find the number, and how they are actually applied in the different libraries.
And you can also take the UDC numbers in a catalogue and get a fingerprint for a library from that analysis.
But what I found most intriguing was that the instantiations of the UDC we see, both in design and in practical application, are changing over time. So remember that I was seeking a kind of stable guiding reference system, also for myself,
and what I actually found is that those classification systems are by no means as stable as I naively thought. So they are also in flux, they also change. But they probably change less dramatically;
they operate on another level of temporal change than the actual content they are used to index and to make findable. Now let me kind of summarize how far we have come. So I started with a quest for ways to organize our knowledge.
And then I said, so maybe we can look into the past and see what is there and look into classification schemes as used in libraries. And then I discovered they are less stable than I naively thought. And they are also rather complex, both in their design and in their practical applications.
So maybe we should give up, maybe it's not worth going to the past, it's not appropriate, it's not practical for us. So that was one thought. But the quest to still see what we can do with things like the UDC in the current information environment we live in was one of the motivations for a project,
the so-called Knowledge Space Lab, in which Almila and Cheng and Krzysztof and myself worked together. And our primary research question was to look at whether the way knowledge is classified by folksonomies
is different from what we see when applying classifications in a traditional, library kind of way. And I wanted to do something on the UDC, and Krzysztof said, but these are all small numbers.
So we have like 60,000 terms in the Master Reference File, maybe it has now grown to 80,000. So why don't we look at something bigger? And then it was Jakob Voß giving a presentation, I think, in one of the UDC seminars, which actually triggered that whole project.
And I'm not sure Jakob knows about it. He spoke about indexing Wikipedia articles with UDC numbers. And then Krzysztof said, yeah, Wikipedia. So these are big numbers. He's a statistical physicist. So let's go for that.
So we were looking at the English Wikipedia, and we were interested in how main topic classifications are organized in Wikipedia. I also have to add another disclaimer. I'm a heavy user of Wikipedia, but I'm not a Wikipedian in the sense that I'm contributing a lot.
So I learned a lot about the functions of Wikipedia. And I also looked back at some of the pages we looked at at that time, and I discovered that a lot of things have changed again around categorization in Wikipedia,
which actually is an argument for my thesis, put forward at the beginning of this talk, that knowledge organization systems are socially constructed. So we looked into main topic classifications in Wikipedia. And if you go to the page now, then there is a box saying subcategories,
and you see there are 42 listed there. And actually, what we wanted to do is go over these subcategories in time. The problem is they are dynamically created. So they are always in the present. So you can't use this beautiful history feature.
At least this is what we understood. So what we did instead is we started on the Wikipedia dump of 2008. And then we extracted the Wikipedia pages from that dump and the category pages.
So the article pages and the category pages, and we extracted the links between them. And that was a big data mining exercise we did. And with those networks, we would actually try to understand what has been going on in the organization at that time in the English Wikipedia.
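As a rough illustration of this extraction step, the sketch below pulls category links out of page wikitext and groups pages by category. It is only a toy version under stated assumptions, not the project's pipeline: the function name `category_links` and the input format (a dict from page title to wikitext) are hypothetical, and a real dump would be streamed from its XML or SQL files rather than held in memory.

```python
import re
from collections import defaultdict

# Matches [[Category:Name]] and [[Category:Name|sort key]] in wikitext.
CATEGORY_LINK_RE = re.compile(
    r"\[\[\s*Category\s*:\s*([^\]|]+?)\s*(?:\|[^\]]*)?\]\]",
    re.IGNORECASE,
)


def category_links(pages):
    """Return a dict mapping each category name to the set of page titles
    that link to it.

    `pages` maps a page title to its wikitext. Category pages (titles
    starting with "Category:") can be included as well, which yields the
    category-to-category links.
    """
    members = defaultdict(set)
    for title, text in pages.items():
        for match in CATEGORY_LINK_RE.finditer(text):
            members[match.group(1)].add(title)
    return dict(members)


# Tiny made-up sample, not real dump content.
sample = {
    "Quantum mechanics": "... [[Category:Physics]] [[Category:Quantum mechanics|*]]",
    "Category:Quantum mechanics": "[[Category:Physics]]",
}
links = category_links(sample)
sizes = {category: len(titles) for category, titles in links.items()}
print(sizes)  # {'Physics': 2, 'Quantum mechanics': 1}
```

Counting the members of each category, as in `sizes`, gives a simple size measure per category that can be compared across dumps from different years.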
And we were not the only people doing this. There is a lot of work on the Wikipedia classifications. So what I wanted to show you are two slides. I don't have time to really go into the details of this work. But this is the category size, measured by the number of articles related to a category page.
And then you see that there's, of course, a tremendous growth. So even in these four years, you see in that graph. And you also see that categories kind of have different sizes over time. They seem to grow and they shrink.
And that has something to do with the social communicative cognitive processes beneath this kind of quantitative analysis. So yes, you can pick a category and you can categorize your Wikipedia page you work on.
But all the decisions, of course, are part of the editorial process, which is also going on in Wikipedia. So editorial processes did not only go on in the past, among the editors of a certain classification. It's just on another scale, and it is also qualitatively different, but it's still going on. And that leads to these changes.
And if you then follow the main categories which were around at the beginning, then you see that sometimes they grow bigger in being assigned to Wikipedia pages. Sometimes they fade away from that top level. Sometimes they disappear altogether. And sometimes they are just kind of rearranged.
What is also important to mention here is that the Wikipedia categories and pages form a fully connected graph. They don't form a hierarchy. So that's an important difference from the UDC.
And to be able to compare them both, we kind of forced them into a quasi-hierarchy. And that left us with a lot of nodes, a lot of pages, which we couldn't fully assign anymore. So we looked into the evolution of the Wikipedia categories. We found them even more in flux than what we had already seen with the UDC.
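One simple way to force a category graph into a quasi-hierarchy is a breadth-first search from the top category, where every node keeps only the parent through which it was first reached. The talk does not say how the project did it, so the sketch below is an assumption for illustration only; the function name `quasi_hierarchy` and the tiny example graph are hypothetical.

```python
from collections import deque


def quasi_hierarchy(subcategories, root):
    """Turn a category graph into a tree by breadth-first search.

    `subcategories` maps a category to its child categories and may
    contain cycles or several parents per node. Each reachable node is
    assigned the single parent through which it was first reached, so
    it ends up at its shortest distance from `root`. Nodes that cannot
    be reached from `root` are left out of the result.
    """
    parent_of = {root: None}
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        # Sorting makes the choice between equally short paths repeatable.
        for child in sorted(subcategories.get(node, ())):
            if child not in parent_of:
                parent_of[child] = node
                depth[child] = depth[node] + 1
                queue.append(child)
    return parent_of, depth


# Made-up graph with a cycle, a node with two parents, and an orphan.
graph = {
    "Main": ["Science", "Culture"],
    "Science": ["Physics"],
    "Culture": ["Physics", "Music"],
    "Physics": ["Science"],
    "Orphan": ["X"],
}
parents, depths = quasi_hierarchy(graph, "Main")
print(parents["Physics"], depths["Physics"])  # Culture 2
```

In this sketch, a node with several equally short paths is assigned to only one of them, and anything not reachable from the root stays unassigned, which mirrors the kind of leftover nodes mentioned above.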
And then we thought, okay, maybe we can compare them both and thereby get a better insight. So we did very simple mapping exercises. On a term basis, we mapped the Wikipedia categories, the highest ones, those one level below the topic classification.
We mapped them to the UDC master reference file. And at the end, we came up with a map you can find in the Atlas of Knowledge by Katy Börner and in the Places & Spaces exhibit,
which depicts the Wikipedia on the left side, categories and pages, and the UDC on the right side. In the case of the UDC, only the category system, not the books indexed with it.
But what you already see, because we color-coded. May I ask, can you make the picture a little bit bigger that we can have a look at? It would be super interesting to see what are in the bubbles. I'm not sure. I don't think so.
Oh, no. Okay, what a pity. Oh, but you can go to the science maps website. Maybe Lyon can look that up. The map is called emergence versus design. Bring up the link, and there people can click on it and make it bigger.
Maybe you can, you can add the link to the private chat here in StreamYard and we can bring it up to everybody else.
Oh, wait, wait, wait, wait. That's kind of, that's... I can't, would this be possible? I can see that it is interesting. I don't have a direct link because they changed their, they also changed the reference system. So wait, wait, wait.
So if somebody of you can look into this website and search for the map. Ah, yeah. Okay, I see. Great. So now you probably see my screen. Now you see this again. Okay, so let me go to the, back to the map.
So what you see in the map, which you can navigate through if you go to the Places & Spaces website, what you see already without being able to read any of the labels, is that the Wikipedia categories are more purple and blue.
And purple and blue are categories which either belong to classes like philosophy or organization of knowledge, so generic principles, or which belong to classes in the UDC which are more about cultural events. So we found at that time a lot of categories and also content related to music, to art, to our daily life.
And in that sense, the Wikipedia categories are actually, in their composition, much more similar to Otlet's first dream.
So when he designed the 10 classes, he was thinking about all of society. But when the UDC was applied, it was applied to academic and scientific literature, in libraries of that kind.
So it got much more differentiated in what is here the green area, and that is applied sciences, physics, biology. So you can already see that Wikipedia is about society as a whole, and the UDC as it is designed now is more suitable for annotating scientific knowledge.
We also found something which I think you will also find curious and interesting when you do term mapping. We found, for instance, radio. There was a lot going on about radio, and then, on a term basis, radio would appear in the UDC under physics.
So it's radio physics, but it's radio stations in the Wikipedia. So that also tells you that there is no one-to-one connection between terms and concepts, or terms and categories.
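The radio example can be reproduced with a very small term-matching sketch in Python. It assumes a naive mapping by shared words between a Wikipedia category label and a UDC caption; the class identifiers and captions below are hypothetical placeholders, not real UDC notation, and the actual mapping procedure of the project is not described in the talk.

```python
import re


def tokens(label):
    """Lowercase a label and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", label.lower()))


def term_matches(wiki_categories, udc_captions):
    """Map each Wikipedia category to every UDC class sharing at least one word.

    `udc_captions` maps a class identifier to its caption text.
    """
    matches = {}
    for category in wiki_categories:
        category_tokens = tokens(category)
        matches[category] = [
            code for code, caption in udc_captions.items()
            if category_tokens & tokens(caption)
        ]
    return matches


# Hypothetical placeholder classes, not real UDC numbers or captions.
udc_captions = {"CLASS-A": "Radio physics", "CLASS-B": "Music"}
matches = term_matches(["Radio stations"], udc_captions)
```

The match on the shared word "radio" links a category about broadcasters to a physics class, which is exactly why a close reading has to follow such a term-based mapping.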
You always have to do a close reading after you have done such a kind of distant reading analysis. What I still would like to see, and what I'm still dreaming about, is that we would be able to have those kinds of maps as guidance in the large collection spaces we deal with.
So what we have achieved with the OPACs is that we can go to basically any library in the world and look at their catalogues. Although sometimes I think libraries are giving up on catalogues. But the OPAC is a very flat kind of entry to a library.
If you go physically to a library, you already see from the building how big the collection is. And if you still used the card catalogues of the systematic catalogue, you would see from the furniture how many different categories, and how much content, they encompass.
And we kind of lost that. So we search in text, in a text line, and we have no clue against what we are searching. So there is very little context given to the user today.
Okay, so this is my Steckenpferd, my hobbyhorse; I could go on about that endlessly. Now let me wrap up. I think I'm also running out of time a bit, but there is not so much more coming, so I hope you still give me a bit of time. So let me wrap up what my takeaway message is from that excursion into the Wikipedia UDC comparison.
So we had looked at the classifications used in the bibliographic domain with the knowledge explosion during the Industrial Revolution, and then saw that they are rather complex.
They also went out of fashion a bit. So maybe we don't need them, was the question, and we can rely on folksonomies or grassroots, bottom-up built classifications. What we saw in the Wikipedia UDC comparison in the knowledge space project is that mapping out
classifications can actually help us to gain an overview about the content of the collections in which they are embedded. And that we can also deepen our understanding both of the content and the way
it is organized by mapping and comparing classifications and knowledge organization systems in both spheres. We also saw that the Wikipedia is structurally by no means very different from what Paul Otlet dreamt of. So it organizes knowledge also, and how it organizes knowledge is not so very much different
from how we thought about organizing knowledge, or how people thought about it a hundred years ago. But it has, of course, a totally new quality to it, due to the big scale on which we can collectively interact in the Wikipedia.
And still, it is also actually quite complex to do this. When I read through the current guidance in the Wikipedia, how to categorize and the recommendations, that reminded me a lot of the many textbooks on indexing I also looked at.
So why is this still important? Why can we not give up on all of that? Why should we stick to it? So why am I giving this lecture, actually?
Now let me go back to one of the titles I started with at the very beginning, Ordering Knowledge and Ordering the World. So Paul Otlet did not only want us to find things better, he also wanted to contribute to a better world.
And categories not only reflect how we think about things, they also represent world views. And by doing this, they act both as a reference guide, but also as a normalization in our minds.
So there is a power relation, that's what I wanted to say, in KOSs too. So what goes into a classification system is never arbitrary. It is always selected. And it can have very practical implications in our daily life. So think about which nation
you belong to. So what is written in your passport? Are you allowed to have two passports? There have been, I remember that there was a PhD or a book edited in the science and technology studies and they looked
into the databases which are used to judge if somebody who comes to Europe as a refugee is actually entitled to seek asylum. And there it sometimes depends very sensitively on the point in time at which you answer which kind of question with the right kind of answer.
And that brings you then to the one path, or if you make a mistake, it brings you to the other path. So it's not innocent, that's what I wanted to say. And this is
also something people in the field of knowledge organization have looked into in detail. And one of the people I would like to point you to is Joe Tennis, who is at the Information School, University of Washington, and who very early on looked with a microscope at those changes, both in classifications and in the application of classifications.
So that slide of Joe's, which I have borrowed from a presentation of his, tells you something about the term eugenics and how it appears in the Dewey Decimal Classification.
And those are all the quadrants and the crossed-out quadrants. They tell you what the possible classes are to put this term into in the DDC, and which of those classes have been discontinued. And on the x-axis you see the numbers of the main classes, of which there are also ten.
The dots in this come from an empirical study Joe and others did, where they looked into about 14 to 15k records from libraries around the world, and they looked into a specific MARC field in those records.
And so those dots actually represent decisions of indexers about where to put something related to eugenics, in which classes. So this kind of very granular analysis raises all kinds of questions about our
ability to index, our personal decisions, but also the system in which we operate. And when it comes to eugenics, you can see that over time it moved from a more biological kind of connotation to a more social connotation and back to a more biological connotation.
But it's not only eugenics you can look into. You can also think about discussions around feminism, discussions around gender, what kind of gender types we have and so on. I think you get the idea. So that brought Richard Smiraglia, when he came to the Netherlands and stayed with DANS,
together with Peter Doorn and Gerard Coen, to the idea: could we not classify classifications themselves? So should we not study classifications as an object of study? And that's actually my last kind of slide.
And Gerard started very practically by taking classifications from the shelf and asking: do we know something about the creator, the maintenance organization? Are they published? What versions do they have? And this kind of description is a
bit more detailed than what you will find in the registries, for instance in the BARTOC registry. But he also looked into this, and he started to collaborate with BARTOC. So he did that based on a number of Excel sheets and looked into all those details.
And we were quite surprised to see how much knowledge is actually missing on KOSs. And that's my last slide. I hope I have made clear that knowledge organization is kind of at the basis of how we operate nowadays.
And that means that knowledge organization systems are everywhere. But we should not think about knowledge organization systems as a bag of words or terms which are organized in a hierarchy, or in a list, or in a graph.
So we express concepts with words, but words are not identical with concepts. And the knowledge organization field has reflected on that over and over again.
So I would like to invite you to go to that field and to make yourself familiar with the basic thoughts in this field. So that's what I wanted to say. Thank you very much. And those are the references.
I think I'm now switching back. So clap, clap, clap. Thank you so much, Andrea. So we started in the past of information organization systems with Otlet and the decimal classification, went over to
the present with Wikipedia and its constant evolution, and came out now with the omnipresence of knowledge organization systems everywhere. I think we can. I hope this is a good synopsis of your talk.
Now we can go into the discussion, and maybe it is also possible to show the Etherpad, so I don't need to read all those comments. Maybe Jacob can bring it up. Yeah, we can start with the first question, or maybe the first comment.
So you said that information today is like the stone in the Stone Age. This was one phrase you said. So information is power. This is also what you said in the last part of your talk.
So what is more powerful? Is it shaping the order or classification of information, or owning that information, or does it go together? Is it the same? What would be a democratic way of information power?
I think it's a good question. It's a very good question. If you go to the history, you see that those in charge of designing these knowledge organization systems, of course, also gained power by doing this.
And this is what makes the Wikipedia again so appealing, that it is all open and transparent. I think the problem we have is probably, in my view, much simpler. It is that often, independently of who owns the knowledge organization system, we are not even aware that we operate under them.
So the consciousness that there is actually a classification operating at the back.
However you construct that, I think this is something which should be brought to the foreground. So there has been a lot of discussion about how to train people in information literacy, in particular in the library world. So how to make you see that the information you get, you have to check against the
sources. Now you could almost advocate that you also have to check against the knowledge organization systems in place. So add that to this kind of educational thread, I would say, and make people aware of it.
So I would also advocate that libraries should bring their systematic catalogs to the forefront again. That would also be a great way to do this. Even if you are a private company, or a private enterprise, you should make
visible under which kind of guiding principles you order and present knowledge to your customers. Would you say this information literacy would also include to question those systems, so not
only to make them transparent but also to oppose them, maybe, and change them? Yeah, if you can see it, you can also say something about it, you can formulate a stance. If you don't see it, and you only feel the results coming from those systems being in place,
it makes it much harder to develop a stance against it. But it is clear that we need them. So we cannot just drop them. Yeah, so that would also be nice. So let's make them visible and discuss them in an open discourse. Yeah, yes.
Okay. So the next comment is on the decimal classification. This order of all the knowledge in the world is very ambitious or arrogant. Can any classification claim to last forever? Order of knowledge must always be in evolution. Could an evolving
classification such as Wikipedia last forever? When is the concept completely outdated, or is everything basically the same? And these questions touch on something very deep, on a very deep philosophical stance. So it almost
touches on questions like: is there an objective reality that can be recognized? Are there eternal principles which guide us? Now, I think everything I have shown you has also shown that it is very much
in flux, right? It is very much in flux, but some of the things are kind of popping up again and again. So there seems to be also something which guides us generically in our social construction. So it's not arbitrary. It's not that every time you let somebody classify, they come up with a totally different classification.
So the knowledge organization people, if I understood them right, would say that classification and categorization develop against what they sometimes also call literary warrant.
So you have something documented there, and that material actually guides you in how you classify it. And this is very similar to how categorization and classification is also going on in the Wikipedia. So the life we live, our reality, shapes very much the needs, how we want to
have it ordered. And some categories or some concepts will vanish because we don't have them anymore in our lives. Or they change connotations. So I think yes, there is flux, but I would always argue that to
be able to see this flux, and to be able to destroy the existing orders, yeah, and to gain your freedom and to be creative.
You need to have a reference system first. So if you just start from scratch, and I'm not even sure that's possible, but I'm not a neuroscientist, yeah, but imagine you could just start from scratch.
Then where would you go? You need to start somewhere. So to do something new, to know that you do something new, you need to know what exists to some extent. And this is why I think, yeah, make those visible and then we can fight against them and can break them up. And
yes, Otlet was very visionary, but then a lot of those idealistic movements can be found prior to the First World War, where people thought, also in a naive way, that technological progress could bring them freedom. And you can see
this again when the internet emerged in the 90s; you have the same kind of philosophical and idealistic talks related to it. But technology is technology, and what we make with technology, that has a social connotation. That's not in the technology. That's in us.
Okay, from these very philosophical questions or comments, we get over to a more concrete one. So it's written: during yesterday's session, it was shown that the parliament documents from Sweden are uploaded and they start looking into main topics, etc.
Did you join the session yesterday? No, that was not a session I could join, no. Okay. The vision I see is find all parliament discussions in Europe about AI. And now the question is how should we do
this? Any standards? We have seen Europe work, but don't know if it's the way forward. Any good further readings in this area? Yes, I wasn't able to join that session. But I know that in different countries, I know
that also from the Netherlands, that they also made the transcripts of parliamentary debates available for people. So I would start a search in all those projects and I would look into what kind of ontologies they actually refer to. The digitization of material always brings a kind of new research question with it.
And parliamentary debates, I think, are now digitally available in quite some countries. So I would look for projects there and then look into which ontologies they use.
So I think it was Piek Vossen who actually did projects in the Netherlands on that, and Marieke van Erp. At least those are two researchers I know have looked into this. Okay. And the last comment is, I guess, a kind of critique. What about information order approaches from other parts of the world?
So actually, we only heard from Europe or the Western world. What can we learn about organizing information from Asia or Africa, for example?
When we were in the Wikipedia project, we had a Chinese master student working with us. So we also started to discuss, could we not look into Chinese classifications. And there I have to humbly admit that language is a barrier, of course. So we chose
the English encyclopedia because the team came from Turkey, Poland, Germany, the Netherlands and China. And that was a language we could all understand. So I have argued a lot against taking terms for
granted and equating them with concepts. But if you have no language at all, it is very hard to discuss classifications. So yes, I would love to see projects emerging, and maybe they are and I'm just not aware of them.
Because I also have to admit that we did that project, but I'm not working on that kind of topic anymore. But I think it would be worth doing this comparatively as well, and it would be very intriguing: classifications always say something about the society and culture in which they emerge.
Okay, I think there are no other questions, maybe from YouTube, but I haven't seen everything within the pad. If you don't want to add anything, Andrea. I think I have spoken long enough. Thank you very much for your time. Thank you so much, Andrea, for your talk and for the discussion.