
Managing Data in MediaWiki


Formal Metadata

Title
Managing Data in MediaWiki
Alternative Title
semantic mediawiki
Title of Series
Number of Parts
84
Author
Contributors
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Production Year
2012

Content Metadata

Subject Area
Genre
Abstract
FOSDEM (Free and Open Source Development European Meeting) is a European event centered around Free and Open Source software development. It is aimed at developers and all interested in the Free and Open Source news in the world. Its goals are to enable developers to meet and to promote the awareness and use of free and open source software.
Transcript: English (auto-generated)
OK, thank you. So I hope you all had a great FOSDEM so far. I'm Jeroen. I'm a MediaWiki developer. I got involved with MediaWiki in 2009 during the Google Summer of Code program and have stuck around since then.
I also got involved with the Semantic MediaWiki project, or SMW for short, which is a MediaWiki extension. Like MediaWiki, Semantic MediaWiki is written in PHP. It's completely open source, and it's released under the GPLv2.
Well, let's quickly do a show of hands. Who actually knows what MediaWiki is? Who has used it before? OK, who knows what Semantic MediaWiki is? And who has used it before?
OK, cool. So for those who don't know, in short, MediaWiki is the world's most popular wiki software. It's the thing that runs Wikipedia. I also work for the Wikimedia Foundation, which is the organization that's behind Wikipedia.
So MediaWiki allows you to collaborate on text documents, rich text, and all types of media. What Semantic MediaWiki does is
that it also allows you to collaborate on actual data. So how does it do this exactly? The basic idea is to assign property-value pairs to wiki pages. So for example, if you have an article about Brussels,
you could have a property population with a certain value. Or if you have FOSDEM, then it could have a start date, and so on. Why would you do this? Well, the great advantage is that if you have this data stored in a structured fashion rather than in a blob of wikitext,
you can obviously do queries against it and browse it in various ways, and, by extension, also share this data with other applications. I'll quickly run through a few examples to give you an idea of what a Semantic MediaWiki installation
looks like from a user's perspective. This is a website from a German university, which has a bunch of articles about people, events, and publications, amongst other things.
This is an example of a page about a person, which, in typical MediaWiki style, has an infobox with some information. One thing from Semantic MediaWiki you can see here is the little vCard link, which allows you to export the information about this person
in the vCard format. So in regular MediaWiki, the source of the page probably looks something like this. Well, this is just MediaWiki at work. There is no Semantic MediaWiki syntax to see here.
But one of the great things that Semantic MediaWiki, or rather an extension to it, allows is editing this via forms. So instead of the users having to bother with the wikitext, they can add and edit things via such interfaces, which can have all kinds of fancy things such as, well, validation and auto-completion
on existing values or inputs with date pickers and maps and whatnot. This is an example of Semantic MediaWiki pulling data that's located at other places in the wiki into a page.
So this page does not itself contain any of the information you see here; rather, this information has been pulled from individual wiki pages. Another example is the hackerspaces.org wiki, which has all hackerspaces on it, which have information
such as when they were founded, how many members they have, where they are located, et cetera. Because they have this geographical data associated with them, you can, in the wiki, pull all this data into a nice map, which you can then interact with and use to navigate to the pages.
You also have a hackerspace here in Brussels. If you don't know it yet, I recommend checking it out. It's a great place. My last example is the Open Energy Wiki.
This lists information about power plants and energy facilities in the United States. And well, this is a similar but more elaborate example of pulling data onto a map, but then, well,
having the data split up by usage. And you can get this data in many other kinds of formats, such as charts, graphs, calendars, whatnot. There are many more examples in a whole variety of fields.
The users of Semantic MediaWiki include governments, companies, all kinds of organizations, and even lots of hobbyists who just use it to, well, do their stuff, right? It's basically useful on any wiki that has data and infoboxes.
If you have a category with pages of a certain type, for example, buildings, people, events, whatever, then you can use Semantic MediaWiki to great effect. Whenever I see a wiki that has this kind of information on it
and that is not using Semantic MediaWiki, I ask myself the question, why not use it? It's that easy, and you get all the great benefits of being able to visualize and summarize your data, something you would otherwise need to do manually. So how does it work?
Well, this is again from a user perspective: how you would set it up in the wiki, not really how the code works. So to assign these property-value pairs to wiki pages,
we use something that's called extended link syntax, which consists of double square brackets enclosing the property name and your value, separated by two colons. So these are the earlier examples and how you would enter them in the wiki page. As you can see, there are, well,
the values can be of different data types. That's something interesting to note here. They can be numbers or dates. And in this case, this could be just text or, more likely, another wiki page, depending on how you define it.
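As a rough illustration of the annotation syntax just described, here is a minimal Python sketch that pulls `[[Property::value]]` pairs out of a piece of wikitext. This is not how Semantic MediaWiki itself parses pages (it is written in PHP); the function name, the regular expression, and the sample values are assumptions made up for this example.

```python
import re

# Matches [[Property::Value]] and the optional [[Property::Value|caption]] form.
# Ordinary links such as [[Category:Country]] have a single colon and are skipped.
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def extract_annotations(wikitext: str) -> dict[str, list[str]]:
    """Collect property-value pairs from a page's wikitext (hypothetical helper)."""
    pairs: dict[str, list[str]] = {}
    for match in ANNOTATION.finditer(wikitext):
        prop, value = match.group(1).strip(), match.group(2).strip()
        pairs.setdefault(prop, []).append(value)
    return pairs


# Illustrative page sources; the values are placeholders, not real figures.
brussels = "Brussels has [[Population::1000000]] inhabitants."
fosdem = "FOSDEM starts on [[Start date::2012-02-04|Saturday]]. [[Category:Event]]"

print(extract_annotations(brussels))
print(extract_annotations(fosdem))
```

The property names (Population, Start date) follow the examples from the talk; everything else is a placeholder.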
Properties in a wiki can be introduced as needed. So there is no need for some architect or whatever to set up a schema in advance. This really allows collaboratively building something and makes Semantic MediaWiki a great tool in cases where you do not know in advance
what exactly it's going to end up being. Properties also have their own wiki pages, on which you can enter documentation about the property, but also properties of the property, which can specify things such as, well,
what their data type is, and also restrictions, for example, saying that only a certain set of values is allowed or things like unit conversions. So here are some examples of properties you could have.
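To make the idea of typed properties with restrictions more concrete, here is a small Python sketch. It is only a model of the concept under stated assumptions: the `PropertyDef` class, its fields, the three type names, and the `Status` property are hypothetical and do not mirror Semantic MediaWiki's actual implementation.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class PropertyDef:
    """Hypothetical model of a property page: a data type plus optional restrictions."""
    name: str
    datatype: str  # "number", "date" or "page" in this sketch
    allowed_values: set[str] = field(default_factory=set)  # empty means unrestricted

    def parse(self, raw: str):
        """Validate a raw annotation value and convert it according to the data type."""
        if self.allowed_values and raw not in self.allowed_values:
            raise ValueError(f"{raw!r} is not an allowed value for {self.name}")
        if self.datatype == "number":
            return float(raw.replace(",", ""))
        if self.datatype == "date":
            return date.fromisoformat(raw)
        return raw  # "page" values stay as page titles


# Example properties; "Status" and its allowed values are invented for illustration.
population = PropertyDef("Population", "number")
start_date = PropertyDef("Start date", "date")
status = PropertyDef("Status", "page", {"Open", "Closed"})

print(population.parse("1,000,000"))
print(start_date.parse("2012-02-04"))
```

Because each property carries its own type, the same raw text can be stored as a number, a date, or a link to another page, which is what later makes type-specific displays and queries possible.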
These data types, the types of the properties, they can affect how properties are displayed. For example, you cannot display a property of type date on a map, and you cannot display geographical coordinates on a calendar.
They also influence how you can search and browse data. For example, if you have geographical coordinates, you can do queries that find all the geographical coordinates in a certain radius of this location and things like that.
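The radius search mentioned here can be sketched in a few lines of Python. This uses the standard haversine formula on an in-memory dictionary; it is an assumption about one reasonable way to do it, not a description of how Semantic MediaWiki evaluates such queries. The page titles and coordinates are approximate and serve only as sample data.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def distance_km(a: tuple[float, float], b: tuple[float, float]) -> float:
    """Great-circle distance between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def within_radius(pages: dict[str, tuple[float, float]],
                  centre: tuple[float, float],
                  radius_km: float) -> list[str]:
    """Return the titles of all pages whose coordinates lie within radius_km of centre."""
    return sorted(title for title, coords in pages.items()
                  if distance_km(coords, centre) <= radius_km)


# Approximate city-centre coordinates, used purely as sample data.
pages = {
    "Brussels": (50.85, 4.35),
    "Antwerp": (51.22, 4.40),
    "Berlin": (52.52, 13.40),
}

print(within_radius(pages, pages["Brussels"], 100))
```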
Usually, the existing data types are used; there is a basic set of data types. You see some examples here. But it is possible to actually extend this in a way by using records, which are composite types that make use of these base data types.
Perhaps the most used way to get the data you put in out in some other format is inline queries. We saw an example of this earlier. But interesting to note is this is definitely not
the only way you can get it out of Semantic MediaWiki. Well, it has a set of dedicated query interfaces and browsing interfaces. And there are many extensions that add on top of that and allow you to do additional things. And there are ways to export the data in formats like RDF or JSON or whatever.
So inline queries basically allow you to put some wikitext into a wiki page and then get results. These consist of three aspects: which pages should be displayed, which values, well, on these pages
should be displayed in your result, and how should all this be formatted. So an example query we might want to do is showing all the countries in Africa. For each of these countries, show their population and area and display this in a nice table.
So for this, we would need to provide some SMW query language, which limits the resulting set of pages to whatever you want, some printout statements, which say which properties you want, and then formatting parameters.
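The three parts of an inline query (conditions, printout statements, formatting parameters) can be put together as in the following Python sketch, which simply assembles the wikitext for the Africa example. The helper `build_ask` is hypothetical, and the exact spelling of the category and property names (Country, Located in, Population, Area) is assumed from the spoken description.

```python
def build_ask(conditions: list[str], printouts: list[str], **params: str) -> str:
    """Assemble an inline #ask query from its three parts (hypothetical helper)."""
    # 1. Conditions select the pages, written in the same link-like syntax as annotations.
    parts = [" ".join(f"[[{condition}]]" for condition in conditions)]
    # 2. Printout statements name the properties to show for each result.
    parts += [f"?{prop}" for prop in printouts]
    # 3. Formatting parameters control how the result is rendered.
    parts += [f"{key}={value}" for key, value in params.items()]
    return "{{#ask: " + " | ".join(parts) + " }}"


query = build_ask(
    ["Category:Country", "Located in::Africa"],
    ["Population", "Area"],
    format="table",
)
print(query)
```

Pasting the resulting text into a wiki page is what asks the wiki to render the matching pages with the requested properties in the chosen format.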
I'll quickly explain this. This is the ask parser function; this is the MediaWiki way of extending wikitext. This is the SMW query language, which specifies that the page needs to be in the category Country and that it needs to have the property located in with the value Africa.
Then this specifies that you want the population and the area, which are the properties you defined, and this is a formatting parameter that says you want it in a table. So you could get something like this. Nice to note here is that this is in another color. If you hover over it, you will actually get a pop-up
or a balloon showing the same value but then converted into other units. So the community, well, we know of some hundreds of sites
using Semantic MediaWiki, but it's hard to tell because MediaWiki does not have a way to really track usage of extensions or MediaWiki itself, for that matter. I suspect there are probably several thousand Semantic MediaWiki installs that are used to some extent.
There is worldwide usage. This is an estimate from two years ago of how many languages it is used in. A great thing about this is that it's actually internationalized, or localized, in 115 different languages. So if you have a wiki in some bizarre language, you can probably use Semantic MediaWiki.
We have public mailing lists, IRC, a bug tracker, and an active developer community. There are several dozen extensions that are actually built on top of Semantic MediaWiki. There are teams at companies that work on it and provide commercial support,
and it's integrated into the MediaWiki community itself. We have our bi-yearly conference, one in Europe and one in the United States. The next one will be in April in Carlsbad. So if you're from the States,
you might be interested in going there. And the next one in Europe will probably be in Germany in the fall of this year. So okay, I guess I'm out of time and out of slides, so thanks for your attention.
So any questions? Oh, that's a good question.
Why is it not live on Wikipedia? So this project has been there for five years already, and actually the initial goal was getting it onto Wikipedia, which has never happened. Perhaps the main reason is that it's a bit too expressive. It allows too many things that you do not want users to do on Wikipedia
because they would simply not scale or they would complicate things there too much. I mean, a lot of people already have problems with the complexity of wikitext, so we do not want to make it more complex. There is now a project called Wikidata
organized by Wikimedia Germany, of which the goal is getting similar functionality, although a bit more limited, onto Wikipedia. This will start in April. I'm on this team, so if you have any questions about this, I'll gladly talk to you about it.
For other questions, please see the speaker at the front. So thank you very much, and I have some Belgian chocolates.