KDE Itinerary
Formal Metadata
Number of Parts | 490
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/46963 (DOI)
Transcript: English (auto-generated)
00:05
So, the next presentation is by Volker, and he is going to talk about KDE Itinerary, so please give him a round of applause. Over to you.
00:21
Thank you. Yeah, so I'm going to talk about KDE Itinerary, or digital travel assistants. What do we mean when we say that? You probably know services like TripIt or some of the travel assistance features in
00:43
Google Now. They all more or less work the same, right? So they read your email, they find tickets, boarding passes, bookings, that kind of stuff in there. They put it into your calendar or they create a nice timeline and then guide you through
01:05
that and give you real-time updates on delays and that kind of stuff. All of that is available for free in the sense that you don't pay for it, at least not with money, but with your data.
01:22
And how bad is that? So, the first thing that comes to mind is like the data you directly leak to those services, right? So, your name, your birthday, your credit card number, your passport number, that kind of stuff.
01:42
And maybe you're okay with sharing that with those services, but the thing people think less about is kind of the indirectly leaked data. So if you and I travel to Brussels on the first weekend in February, that might be
02:06
pure coincidence, right? But I guess for many of us, that isn't the first time that has happened, right? So two or three times, same destination, same time, that is no longer a coincidence, right?
02:22
So if you have enough of that travel data, that actually tells you a lot about relations between people. So what you are interested in, who you work for, where your family lives, all that kind of stuff, right? So, at that point, giving your travel data to someone like Google is not just impacting
02:47
yourself, but everyone else as well. So not ideal. What do we do about that? So one approach might be, let's just not use those services.
03:05
And that works until you find yourself traveling in a foreign country where you don't speak any of the local languages, and then get introduced to their counterpart of, and then
03:21
you really want some assistance, right? So that brings us to another approach to deal with this. If you don't like the available options, let's build one on our own. So let's have a look at what this would take.
03:46
Turns out the problem is actually more about data rather than about code. There's also quite some code necessary, but a lot of building blocks already exist. The main challenge seems to be getting the various pieces of data we need.
04:05
And I grouped this into three general categories. First one is the personal data. So that is you booked a hotel or a train to a specific location.
04:23
That data usually comes to you in the form of emails or PDF documents, or you find it on a website. The second category of data is what I would call the static data. So that is information on where exactly is a specific airport or in which time zone is
04:47
that airport. I come from Berlin, so there airports are actually static. In other parts of the world, people apparently manage to build new airports. So static here refers to kind of the release cycle of the software, right?
05:07
So a few months at most. So that is data that is practical to ship for offline use. And the third category is what I would call dynamic data. So delay information, for example, or gate changes.
05:26
So extremely short-lived information that you need to query from some online service. So let's look at those three categories in a bit more detail.
05:42
Fortunately, we are not the first ones with those specific problems, right? Google built the same system, and they also needed to get their hands on that data. And the way Google usually solves this is they define a standard and encourage everyone
06:01
else to follow that. And that's what they did for the booking data. They came up with the schema.org data model, which they also use for the search engine, and that contains structured annotations that they expect in emails about your flight
06:22
or your train and so on. Depending on where you live, that seems to be present in about 30 to 50% of booking-related emails. So that's a good starting point. Then of course you have random unstructured emails, so just meant for human consumption
06:46
that we have to deal with. Then for flights, there are Apple Wallet passes as something that is fairly popular. And then there are barcodes.
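To make the structured case concrete, here is a minimal Python sketch of pulling schema.org annotations out of an HTML email, assuming they are embedded as JSON-LD in a `script` element. The sample email, the reservation values, and the function names are invented for illustration; this is not how the KDE Itinerary extractor itself is implemented.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_reservations(html):
    """Return all JSON-LD objects whose @type ends in 'Reservation'."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    reservations = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # ignore malformed annotations
        items = data if isinstance(data, list) else [data]
        for item in items:
            if isinstance(item, dict) and str(item.get("@type", "")).endswith("Reservation"):
                reservations.append(item)
    return reservations


# Invented sample email body; all values are placeholders.
SAMPLE_EMAIL = """
<html><body>
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "FlightReservation",
  "reservationNumber": "ABC123",
  "reservationFor": {
    "@type": "Flight",
    "flightNumber": "1234",
    "departureAirport": {"@type": "Airport", "iataCode": "TXL"},
    "arrivalAirport": {"@type": "Airport", "iataCode": "BRU"},
    "departureTime": "2020-01-31T09:30:00+01:00"
  }
}
</script>
<p>Thank you for booking with us.</p>
</body></html>
"""

if __name__ == "__main__":
    for reservation in extract_reservations(SAMPLE_EMAIL):
        flight = reservation["reservationFor"]
        print(flight["departureAirport"]["iataCode"], "->", flight["arrivalAirport"]["iataCode"])
```

Emails without such annotations return an empty list here, which is where the unstructured approaches come in.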
07:01
I could probably fill an entire talk with what you can extract from barcodes. They are a kind of structured information, but not meant for our purpose. So that's at least something to work with. For static data, we also have quite some stuff to work with.
07:24
Most prominently, Wikidata. So that is structured data of pretty much everything imaginable. And OpenStreetMap covering the local specific parts.
07:40
And also the time zone maps. Time zones are something really important for that use case. One area where we still have some gaps, I would say, is adding vendor-specific station identifiers to Wikidata. That is, for example, necessary to match barcode content.
08:03
And then the whole problem of indoor navigation and navigation to your specific seat on the train and that kind of information. And for dynamic data, again, Google did some groundwork there. They defined the GTFS standard as an interface for public transport operators to feed that
08:28
information to Google Maps. But many of them luckily publish that as open data. So we can consume that with free software as well.
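As a rough illustration of why GTFS is easy to consume with free software: a GTFS feed is an archive of plain CSV text files, one of which is `stops.txt`. The following Python sketch reads such a file; the sample stops and the `load_stops` helper are made up for this example.

```python
import csv
import io

# Invented sample in the shape of a GTFS stops.txt file.
SAMPLE_STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
S1,Example Central Station,50.8000,4.3000
S2,Example Airport,50.9000,4.5000
"""


def load_stops(stops_txt):
    """Parse stops.txt content into a dict keyed by stop_id."""
    reader = csv.DictReader(io.StringIO(stops_txt))
    return {
        row["stop_id"]: {
            "name": row["stop_name"],
            "lat": float(row["stop_lat"]),
            "lon": float(row["stop_lon"]),
        }
        for row in reader
    }


if __name__ == "__main__":
    for stop_id, stop in load_stops(SAMPLE_STOPS_TXT).items():
        print(stop_id, stop["name"], stop["lat"], stop["lon"])
```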
08:41
And there are two big free software implementations of such journey querying services based on top of GTFS. That's Navitia and OpenTripPlanner. So that's at least covering the train and bus part.
09:00
Then we have the Apple Wallet boarding passes again. They also have a built-in update API that's useful for gate changes, for example. But that has the disadvantage that it leaks user information. So not ideal. And then of course there are plenty of vendor-specific online APIs.
09:23
Many of them are unfortunately not really compatible with free software or open data requirements. So you have some terms and conditions that aren't really compatible with our use cases. Or you need API keys that you're not allowed to publish, right, so problems like this.
09:45
But that's the theory. So let's have a look at what we have actually built over the last two years. The first component is the KItinerary data extraction library.
10:01
So that implements the schema.org data model for flights, trains, buses, hotels, events. And I'm forgetting one: restaurant reservations, that's the other one. And it has an extraction system that can handle the structured data.
10:24
So if there are structured annotations in emails, it can consume that. And it has an unstructured extraction mechanism, both a generic one and support for vendor-specific scripts.
10:41
So for, I think, a bit more than 50 vendors where the generic approaches or the structured approaches don't work, you can write a small JavaScript that does the extraction for that specific vendor. And that's then a few lines of regular expressions or XPath queries. The output of that is then augmented by information we draw from Wikidata.
11:06
So that's filling in time zones or geo-coordinates for stations and airports. And that can consume basically all the data formats in which you might get these documents,
11:21
anything you find in an email, PDF, the Apple Wallet boarding passes and so on. If any of you has seen the Nextcloud talk from Jos earlier today, they showed the integration with the itinerary extraction that's using exactly that system.
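The two steps described above, vendor-specific extraction with regular expressions followed by augmentation with static data, can be sketched roughly like this. This is Python rather than the JavaScript the actual extractor scripts use, and the email wording, the pattern, the `STATIONS` table, and all function names are assumptions made up for illustration.

```python
import re

# Hypothetical static data, standing in for what would be drawn from Wikidata.
STATIONS = {
    "Berlin Hbf": {"timezone": "Europe/Berlin"},
    "Bruxelles-Midi": {"timezone": "Europe/Brussels"},
}

# Pattern for one imaginary vendor's confirmation text.
BOOKING_PATTERN = re.compile(
    r"Train\s+(?P<train>[A-Z]+\s?\d+)\s+from\s+(?P<origin>.+?)\s+to\s+(?P<destination>.+?)"
    r"\s+on\s+(?P<date>\d{4}-\d{2}-\d{2})\s+at\s+(?P<time>\d{2}:\d{2})"
)

SAMPLE_TEXT = (
    "Your booking: Train ICE 123 from Berlin Hbf to Bruxelles-Midi "
    "on 2020-01-31 at 07:46. Have a nice trip!"
)


def extract_train_booking(text):
    """Return the booking fields found in the text, or None if nothing matches."""
    match = BOOKING_PATTERN.search(text)
    return match.groupdict() if match else None


def augment(booking, stations=STATIONS):
    """Add time zones for known stations; unknown stations are left untouched."""
    result = dict(booking)
    origin = stations.get(booking["origin"])
    if origin:
        result["departure_timezone"] = origin["timezone"]
    destination = stations.get(booking["destination"])
    if destination:
        result["arrival_timezone"] = destination["timezone"]
    return result


if __name__ == "__main__":
    booking = extract_train_booking(SAMPLE_TEXT)
    if booking is not None:
        print(augment(booking))
```

Keeping the augmentation separate from the extraction means the same static data can fill in gaps regardless of which vendor script or structured source produced the booking.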
11:46
Another building block we created is the KPublicTransport library. So that's covering basically the dynamic data problem. It gives you an API for querying for locations, departures and arrivals at those locations
12:05
and journeys between locations. This can talk to free software services, Navitia and OpenTripPlanner, as well as to a few proprietary backends.
12:24
And we have about 50 or so configurations for different services that then use any of the backends to actually get the data from. And the library picks the right service for the location you're looking at and then
12:41
gives you the results for that. And of course we have the actual KDE Itinerary application. That's a mobile app giving you a timeline of your trip that automatically groups the various bits of your itinerary together.
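How the public transport library picks the right service for a location is not spelled out in the talk. One plausible approach, sketched here in Python purely as an assumption, is to give each service configuration a coverage bounding box and prefer the most specific one that contains the queried coordinate. The service names, boxes, and the smallest-area heuristic are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceConfig:
    name: str
    backend: str  # which backend implementation talks to this service
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float

    def covers(self, lat, lon):
        return self.min_lat <= lat <= self.max_lat and self.min_lon <= lon <= self.max_lon

    def area(self):
        return (self.max_lat - self.min_lat) * (self.max_lon - self.min_lon)


def pick_service(services, lat, lon):
    """Return the covering service with the smallest box, or None if none covers."""
    candidates = [s for s in services if s.covers(lat, lon)]
    if not candidates:
        return None
    return min(candidates, key=lambda s: s.area())


# Hypothetical configurations with rough, invented coverage boxes.
SERVICES = [
    ServiceConfig("example-europe", "navitia", 35.0, -10.0, 70.0, 30.0),
    ServiceConfig("example-belgium", "otp", 49.5, 2.5, 51.5, 6.4),
]

if __name__ == "__main__":
    print(pick_service(SERVICES, 50.85, 4.35))   # a point in Brussels
    print(pick_service(SERVICES, 52.52, 13.40))  # a point in Berlin
```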
13:01
So that's my FOSDEM trip, with a live weather report in between. I can show you the boarding passes. And it can pull delay information for trains or gate and platform changes.
13:21
If you miss your train, it allows you to find an alternative connection to get to your destination. Having all the data available, it provides you some statistics on how much you traveled in the last year. So if you're watching out for your CO2 impact, for example, that gives you trends
13:41
on whether you're improving over the years. One of my favorite features, because nobody else has it, is the power plug compatibility warning powered by Wikidata. So, I mean, if you're traveling to the UK, you probably know they have weird power plugs.
14:02
But some more normal countries like Switzerland or Italy, coming from Germany, also need an adapter. And I tend to forget that. So the app seeing that I travel to those countries or through those countries reminds me to bring the corresponding adapter.
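The idea behind such a warning can be sketched in a few lines of Python: compare the plug types of the home country with those of the destination and warn when some of them are not supported there. The table below is a simplified, hand-written placeholder rather than data pulled from Wikidata, and the `plug_warning` helper is hypothetical.

```python
# Simplified illustrative table: country code -> plug/socket type letters.
PLUG_TYPES = {
    "DE": {"C", "F"},
    "CH": {"C", "J"},
    "GB": {"G"},
}


def plug_warning(home, destination, plug_types=PLUG_TYPES):
    """Return a warning string if some home plug types are not used at the destination."""
    unsupported = plug_types[home] - plug_types[destination]
    if not unsupported:
        return None
    types = ", ".join(sorted(unsupported))
    return f"Plugs of type {types} from {home} may not fit sockets in {destination}."


if __name__ == "__main__":
    for country in ("DE", "CH", "GB"):
        print(country, plug_warning("DE", country))
```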
14:20
Something that is pretty new is an assistance feature to automatically fill the gaps in your itinerary. So I arrive at the main station in Brussels, I have my hotel somewhere, how do I get to the hotel? It now suggests taking metro number two to wherever I need to go.
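Finding those gaps is the first half of such a feature. A minimal Python sketch, assuming each itinerary element knows where it starts and ends: sort the elements by time and report every pair where one element ends at a different place than the next one begins. The `Element` type, the place names, and the times are invented; a real implementation would then run a journey query for each gap.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Element:
    title: str
    start_place: str
    end_place: str
    start: datetime
    end: datetime


def find_gaps(elements):
    """Return (from_place, to_place, earliest_departure) for each location gap."""
    ordered = sorted(elements, key=lambda e: e.start)
    gaps = []
    for previous, following in zip(ordered, ordered[1:]):
        if previous.end_place != following.start_place:
            gaps.append((previous.end_place, following.start_place, previous.end))
    return gaps


# Invented sample itinerary.
ITINERARY = [
    Element("Hotel stay", "Example Hotel", "Example Hotel",
            datetime(2020, 1, 31, 15, 0), datetime(2020, 2, 2, 11, 0)),
    Element("Train to Brussels", "Berlin Hbf", "Bruxelles-Midi",
            datetime(2020, 1, 31, 7, 46), datetime(2020, 1, 31, 14, 35)),
]

if __name__ == "__main__":
    for origin, destination, earliest in find_gaps(ITINERARY):
        print(f"Need a connection from {origin} to {destination} after {earliest}")
```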
14:44
So that's then actually moving from just managing data to actively supporting you on this. That's another brand new thing. That's the train layout display. So it shows me where on the platform I need to go to find my reserved seat.
15:07
Right. And then finally, how do we actually get the data into the app? For that we have plugins for a number of different email applications, starting with KMail, of course, because that's where we came from.
15:24
So that shows you a nice, easy-to-read banner on what it found in the email. Nextcloud, as you might have seen earlier today, does something similar. That was released two weeks ago.
15:41
The third one is Thunderbird. That isn't released yet, but it's basically going to have the same functionality available there as well. And last but not least, we have the browser plugin basically doing the same for websites you're looking at.
16:04
Right. I'm done almost exactly in time. One last bit. Forget about all the privacy stuff I said. We need test data to improve the data extraction, so please donate that. If you want to learn more about that, meet us in building K at the KDE stand or at
16:26
the Nextcloud stand around the corner. Thank you. To respect the time and the time delay we're having, if there are no burning questions,
16:40
my advice is please reach out to Volker and address the question in person and we can move on to the next presentation. Thank you so much.