I’ve got geodata – How do I get out there (on the web)?
Formal Metadata
Number of Parts | 295
License | CC Attribution 3.0 Germany: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/43500 (DOI)
FOSS4G Bucharest 2019, 84 / 295
Transcript: English (auto-generated)
00:07
Hi, everybody. I'll be talking the next 20 minutes or so about how to publish geodata on the internet. And I'm going to compare, oh, do I get a pointer?
00:28
I'm going to compare basically three methods that are out there. There's probably a few more, but I concentrated on three ways to publish geodata.
00:40
And the first one is obviously a web service. The second one is a direct database connection with PHP. And the last one is a flat file. And that's probably one of the least obvious ones. Oh, this is just a slide, because mainly I'm going to go through JavaScript code.
01:03
So I found these in my research to show you that JavaScript is actually one of the most sought-after languages on GitHub and Stack Overflow. So I thought that was quite interesting. The first one I'm going to present is a local flat file.
01:23
So all you have to do in your web page is point your script to a flat file that can be on a flash drive or on a local computer. And I always use Leaflet here to show it in the client.
01:42
So basically, all you need is a map variable. And then with two lines of code, you can load your features in Leaflet. That's quite easy. And I'm going to go through some of the advantages and disadvantages of a local flat file.
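As a rough sketch of what those two lines can look like: assuming the flat file is a plain script that assigns a GeoJSON FeatureCollection to a variable (the name `localFeatures` and the sample point are made up here), and assuming Leaflet 1.x is loaded on the page so that `L` is the Leaflet global, loading comes down to `L.geoJSON(...).addTo(map)`. The wrapper function `showFeatures` is hypothetical; `L` is passed in as a parameter only so the snippet can also be tried outside a browser.

```javascript
// Contents of the flat file (for example loaded with a <script> tag from a
// flash drive). Variable name and sample feature are illustrative assumptions.
const localFeatures = {
  type: "FeatureCollection",
  features: [
    {
      type: "Feature",
      properties: { name: "Bucharest" },
      geometry: { type: "Point", coordinates: [26.1025, 44.4268] } // [lon, lat]
    }
  ]
};

// Adds the features to an existing Leaflet map and returns the new layer.
// In a browser, pass the global `L` provided by Leaflet.
function showFeatures(L, map, data) {
  return L.geoJSON(data).addTo(map);
}

// Browser usage (assumes a <div id="map"> and Leaflet 1.x on the page):
//   const map = L.map("map").setView([44.4268, 26.1025], 6);
//   showFeatures(L, map, localFeatures);
```

Keeping the data in a script file rather than a `.json` file is one way to avoid the restrictions browsers place on reading local files from a page opened via `file://`.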
02:03
Actually, I really like this. And I think it's an underestimated option to publish geodata, because it's very, very portable. And it's plug and play, right? I mean, you don't need any installation. All you need is a web browser, and you can publish. You can plug it in somewhere, and then you
02:23
can have a look at it. Closer? Oh, OK. And you can have a look at it right in the browser, your geodata, and with JavaScript that is so rich in visualization now and animation. It's actually a really nice way to publish geodata.
02:42
And as I said, you don't need any desktop application. You don't need any server hosting, no maintenance, no security of any kind, because the flat file is separated from your database. So you don't have to worry about anything. You don't even talk to a server,
03:01
or you've got nothing to do with the server. So it's all happening locally. So that's why actually I really like this option. I've done quite a few applications like that. And if you need other libraries, then you can always load them through the URLs,
03:21
through a content delivery network, or you can locally host them. Also on a flash drive, no big deal. Of course, with the content delivery network, you're always up to date. You don't have to care about the latest version. But the disadvantage is that you cannot further develop code.
03:41
So if you want to do that, if you want to implement your own functions, then you have to go for the locally hosted library. What are the disadvantages of a flat file? There's quite a few, and actually pretty much what a relational database management
04:02
system gets credit for. Its speed, well, it probably depends on the amount of data that you have to load. After a certain amount, it will get really, really slow, because it doesn't have any query optimizer.
04:22
You don't have any index available. That's what a database management system all does for you. You don't have to worry about that. You cannot filter any data. So you can only load the whole file. And you would have to do the filtering within the client.
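To illustrate what filtering within the client means in practice, here is a minimal sketch. It assumes the whole flat file has already been loaded as a GeoJSON FeatureCollection; the function name `filterFeatures`, the property names, and the sample cities are assumptions for illustration only.

```javascript
// Keeps only the features whose property `key` equals `value`.
// Returns a new FeatureCollection; the input is not modified.
function filterFeatures(collection, key, value) {
  return {
    type: "FeatureCollection",
    features: collection.features.filter(
      (f) => f.properties && f.properties[key] === value
    )
  };
}

// Illustrative data: in practice this is the complete file, already downloaded.
const cities = {
  type: "FeatureCollection",
  features: [
    { type: "Feature", properties: { name: "Bucharest", country: "Romania" },
      geometry: { type: "Point", coordinates: [26.1025, 44.4268] } },
    { type: "Feature", properties: { name: "Cluj-Napoca", country: "Romania" },
      geometry: { type: "Point", coordinates: [23.6, 46.77] } },
    { type: "Feature", properties: { name: "Sofia", country: "Bulgaria" },
      geometry: { type: "Point", coordinates: [23.32, 42.7] } }
  ]
};

const romanianCities = filterFeatures(cities, "country", "Romania");
console.log(romanianCities.features.map((f) => f.properties.name));
```

Every feature still has to be transferred and scanned, which is why this approach degrades as the file grows.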
04:43
So if you have a lot of operations that the user has to do, if you want to change parameters or if you want to, well, if you have changing requirements, then this is probably not the best option. And of course, because it is a flat file, don't use that option if you have
05:03
to write back to a database, because all you can do is you can write to a flat file. But obviously, what you're missing there is you don't have any measures to look for data integrity or data consistency. So there's no way of observing that.
05:22
So you cannot normalize your data. You can have a lot of anomalies. And of course, transactions are not possible at all. So if you want to observe that your data will be written in a consistent state, then you cannot do that with a flat
05:43
file. And you will always have problems if you have many users, because you have no concurrency control. And obviously, there's no backup and recovery mechanism. And these are quite important disadvantages. So basically, flat file is really only
06:02
for read operations, and then maybe to publish as a base layer or something. From here, if you host your flat file on the server, and I've tested that with two methods, jQuery asynchronous,
06:21
it is also a local flat file. You just use the internet to retrieve it or to bring it to the client. And at the bottom, there's the XMLHttpRequest. So those are the two options you've got available. Or you take another library, of course. There's more than two options.
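A minimal sketch of the XMLHttpRequest variant, assuming the server returns a GeoJSON file at a URL you control. The function name `loadGeoJson` and its callback parameters are hypothetical; the `Xhr` parameter exists only so the function can be exercised with a stand-in outside a browser.

```javascript
// Requests a GeoJSON file and hands the parsed object to `onSuccess`.
// `Xhr` defaults to the browser's XMLHttpRequest.
function loadGeoJson(url, onSuccess, onError, Xhr) {
  const Ctor = Xhr || XMLHttpRequest;
  const request = new Ctor();
  request.open("GET", url);
  request.onload = function () {
    if (request.status < 200 || request.status >= 300) {
      onError(new Error("HTTP status " + request.status));
      return;
    }
    let data;
    try {
      data = JSON.parse(request.responseText);
    } catch (parseError) {
      onError(parseError);
      return;
    }
    onSuccess(data);
  };
  request.onerror = function () {
    onError(new Error("Network error while loading " + url));
  };
  request.send();
}

// Browser usage with Leaflet (URL is a placeholder):
//   loadGeoJson("data/cities.geojson",
//     (data) => L.geoJSON(data).addTo(map),
//     (err) => console.error(err));
// With jQuery, the comparable call would be $.getJSON(url, callback).
```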
06:46
That's also a local flat file, but this time from a server. And from the benchmark testing I've done, it's the most performant. So it's the quickest way to publish geodata.
07:01
But then again, my tests have been with 60, no, 70 features and 600 features. So I actually don't know where the cutoff limit is when it really gets slow, because when you probably have to load a lot of megabytes or gigabytes even,
07:23
then this is definitely not the best option for you to do. But for now, because if it's a very limited amount, you don't have any connection overhead to the database. It's just a flat file, and it can be loaded into the client.
07:40
And that's quite an advantage, I would say. And the pros and cons are very similar to a local flat file, but obviously you have to think about network traffic as well, and internet connectivity that you have to have in order for it to be on a server.
08:05
OK, option three. You don't have to use a web service to publish geodata on the internet, right? If you think about the database systems out there,
08:20
they are all spatially enabled these days. You have Oracle Spatial, you have Postgres with PostGIS, you have SQL Server, you have MySQL. Whatever it is, they all have spatial data types, so there's no need to use a web feature service or a web map service.
08:41
You can just write an SQL statement, and here it's ST_AsGeoJSON. And that will serve your data out in GeoJSON, which
09:01
is obviously the native language within web development, JSON, GeoJSON. And yeah, that's also quite a quick way to publish your data. Also from the benchmark tests, it's
09:21
faster than a web service, from what I can figure, which makes sense, because a web service is an additional software component on top of a database. So you have more software components. And if you can reduce that, then it will be faster.
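To make the direct-database idea concrete, here is a small sketch of the step between the query and the client. The talk uses PHP for this; the sketch uses JavaScript (for instance Node.js on the server) only to stay in one language. `ST_AsGeoJSON` is the PostGIS function that returns a geometry as GeoJSON text. The table and column names (`cities`, `name`, `geom`) and the helper `rowsToFeatureCollection` are assumptions, and no database driver is shown.

```javascript
// Query the server-side script might run against PostGIS.
// Table and column names are illustrative assumptions.
const citiesSql = "SELECT name, ST_AsGeoJSON(geom) AS geojson FROM cities";

// Turns result rows of that query into a GeoJSON FeatureCollection.
// Each row is expected to look like:
//   { name: "...", geojson: "<geometry as JSON text>" }
function rowsToFeatureCollection(rows) {
  return {
    type: "FeatureCollection",
    features: rows.map((row) => {
      // Everything except the geometry column becomes a feature property.
      const { geojson, ...properties } = row;
      return { type: "Feature", properties, geometry: JSON.parse(geojson) };
    })
  };
}

// Example with a hand-written row standing in for a database result:
const sampleRows = [
  { name: "Bucharest", geojson: '{"type":"Point","coordinates":[26.1025,44.4268]}' }
];
const sampleCollection = rowsToFeatureCollection(sampleRows);
```

The resulting object can be serialized with `JSON.stringify` and handed to the client as is.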
09:40
And especially if you fine-tune your requests, and I will come back to that in a minute, you have a lot of options to fine-tune your SQL statements within PHP. And that ranges from a persistent database connection.
10:04
A database connection is also an overhead, because you always have to connect to the database. So that will take some time. And if you use a persistent database connection, that will save you time, because you don't have to connect every time to the database. But you have to be cautious about that one,
10:20
because it will also lock your tables. So there will be a persistent lock on your table. And that's why transactions are not recommended for this option. So if you use a persistent database connection, I would be cautious how to use it. Also only for read operations.
10:41
And another important thing is to prepare SQL statements. Prepared SQL statements or parameterized SQL statements, they are called. Because what they do is they think about it like variables.
11:01
They create variables. And with those variables, they will write back into the database. And that has actually two advantages. One of them is the prepared SQL statements, they are only executed once.
11:21
And that's why they are way quicker than normal SQL statements. And they also run faster. But it's also a security thing, because prepared statements, they will prevent SQL injection. We're going to have a look at that in a second.
11:43
But also, of course, if you are tightly coupled to the database, so that means you have a good degree of freedom over your data. That means you cannot only request a GeoJSON or JSON. You can put together any format that you would like to have,
12:04
like HTML, arrays, objects. And if you want to do a really interactive data analysis and visualization, then this is probably the best method for you to get the data out of the database. And you have to think about that as well.
12:22
If you need access to a database schema or to a database object, this is the only way forward. Because if you want to do bulk loading or reporting, then you would have to use PHP or any server-side language. I mean Python, Perl, Ruby, Node.js, JSP, whatever
12:41
is out there. But if you need access to the proper database schema, then this is the only option that you will have available. What are the disadvantages? And well, that's also quite straightforward, because you actually allow access to the database system,
13:06
it's not as secure. And that's what I mentioned before, the SQL injection attacks that happen. I put an example together here. So this is only a string concatenation in PHP.
13:24
So that means you concatenate a string not using variables. And if someone puts in here country Romania, and then the capital Bucharest, and then you have a drop database statement afterwards,
13:40
then if you concatenate the string, it's three SQL statements. And they are all perfectly valid. And that would be a malicious attack on your database, right? So you have to think about it. And the only prevention you have available
14:03
is to use these ones here as parameters. And then this will not be a SQL statement. It will be part of the value section.
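The difference between the two approaches can be sketched without a database. The talk shows this in PHP; the version below is JavaScript and only builds the statements, it does not execute anything. Table name, column names, and the `{ text, values }` shape with `$1`/`$2` placeholders are assumptions modelled on common drivers, not a specific API.

```javascript
// UNSAFE: user input is pasted straight into the SQL text.
function buildUnsafeQuery(country, capital) {
  return "SELECT * FROM countries WHERE country = '" + country +
         "' AND capital = '" + capital + "'";
}

// SAFER: the SQL text contains only placeholders. The values travel
// separately and are treated as data, never parsed as SQL.
function buildParameterizedQuery(country, capital) {
  return {
    text: "SELECT * FROM countries WHERE country = $1 AND capital = $2",
    values: [country, capital]
  };
}

// Input crafted so that concatenation yields three valid statements.
const maliciousCapital = "Bucharest'; DROP DATABASE geodata; SELECT '1";

const unsafeSql = buildUnsafeQuery("Romania", maliciousCapital);
const safeQuery = buildParameterizedQuery("Romania", maliciousCapital);

console.log(unsafeSql.split(";").length);      // number of statements in the unsafe text
console.log(safeQuery.text.split(";").length); // the parameterized text stays one statement
```

With the parameterized form, the whole malicious string ends up as the value compared against `capital`, so the query simply finds no matching row.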
14:21
So I would never use PHP when you don't know your users or also with a high number of users that you really don't know, because they are much closer to your database. And they don't even have to penetrate a firewall or have to get access to your internal network.
14:45
So I would recommend only as a read-only option as well. It's not quite as friendly, multi-user friendly as a web service. I mean, a web service, of course,
15:03
that was invented because of the dissemination on the web. So it uses more resources as well on your CPU and your bandwidth. So if you do a mobile application,
15:22
then think about it, because then it's not a good option either, because it will deplete your battery pretty fast. So a web service is not as demanding on your resources on the client side.
15:41
So if you want to have a thin client, go for a web service. That's the better option. And the tight coupling, which means that there's no intermediate layer, right? You're going right into the database.
16:01
Means you have a lot of maintenance issues if you change your, well, if you program your client or your database, for that matter. So if you continuously develop your database or you do a database refactoring
16:21
or you change your schemas, if you develop your database quite fast, then this is also very difficult. So this means you will most certainly run into issues with backward compatibility,
16:40
and it's much harder to keep up with that. So web service is much more rigid, but because it is an additional layer, it's much easier to communicate. And with tight coupling, that's a big problem. You have the option to encapsulate further,
17:05
so to get away from the coupling with data access objects, but it will never be as loosely coupled as a web service.
17:20
And that brings us to the fourth option, right? This is the, I use GeoServer here. So this is the web feature service to get web features, right? That's the protocol to get access to your geodata.
17:45
And what you need to have in order to use a web feature service is a web map server. So you have the option. I've come up with five. So you have a GeoServer, that's the one I used.
18:03
You have the UMN MapServer, or I think it's only called MapServer these days. You have QGIS Server, you have deegree, or you have a proprietary system, which is ArcGIS Server. Does anyone know any others?
18:20
Another map server? Are there more around? No? So you have to choose between one of them. You have to do some installation, and then you can serve your WFS. And of course, that has many advantages.
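Once the server is running, requesting features is an HTTP call with a handful of standard parameters. The sketch below builds a WFS 2.0.0 GetFeature URL. The base URL and layer name are placeholders, the function name `buildGetFeatureUrl` is hypothetical, and `outputFormat=application/json` is the value GeoServer accepts for GeoJSON output; other servers may expect a different value.

```javascript
// Builds a WFS 2.0.0 GetFeature URL that asks for GeoJSON output.
function buildGetFeatureUrl(baseUrl, typeName, maxFeatures) {
  const params = new URLSearchParams({
    service: "WFS",
    version: "2.0.0",
    request: "GetFeature",
    typeNames: typeName,
    outputFormat: "application/json"
  });
  // `count` limits the number of returned features in WFS 2.0.0.
  if (maxFeatures !== undefined) {
    params.set("count", String(maxFeatures));
  }
  return baseUrl + "?" + params.toString();
}

// Placeholder endpoint and layer name:
const wfsUrl = buildGetFeatureUrl(
  "http://localhost:8080/geoserver/wfs",
  "workspace:cities",
  50
);
console.log(wfsUrl);
```

The returned URL can be fetched from the client in the same way as a flat file on a server, which makes it easy to compare the methods side by side.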
18:43
If you need metadata, or if you need to serve metadata, then that's definitely the only way to go. The Catalogue Service for the Web, or also the GetCapabilities document, right?
19:01
They also have metadata about which projections are available, which layers are available, who the author is, and so on, and so on, and so on. So that's the standard for interoperability. And it's actually quite a dominant standard,
19:26
and that can be used in so many ways. And that's why I put reusability down as well, because it's actually much easier for others to use your web feature service,
19:40
because the dissemination on the internet is much easier. Yeah, and the loose coupling means, of course, as I said earlier, it's an intermediate layer, so it separates your database from your client.
20:02
And that has the advantage that you can develop both tiers in completely different ways. So you're completely independent of each other. You don't have to worry about which operating system or which software to use on both sides.
20:22
And that obviously facilitates application or programming in many ways as well. It's very easy to scale, service-oriented architecture, so it can be reused. And it is definitely the way to go
20:42
if you have a lot of clients, and also if you don't know who your clients are. And if you have to cater for numerous platforms, that would also be my preferred choice. And it also has an integrated security, so it's an additional abstraction layer, right?
21:03
So you are not right into the database, and it's much, much harder to use malicious attacks to attack your database. What are the disadvantages?
21:22
Well, it's one of the slowest methods. You cannot tune it as much. It's very rigid, the way a web service is set up. And that's why the connections can be very costly. You always have to convert from and to. So that means if you use GML or if you use JSON,
21:41
it has to be converted. And if you have user interaction or user input, then you might have to validate your data to conform to business rules and to data integrity. So every transformation raises the need
22:00
to check your data again, or it will go wrong. Sooner or later. And of course, it's a stateless nature. The web service is stateless, so that means the connections are very brief, or yeah, they have no persistence, and they have no knowledge about each other.
22:20
So if you want to know more about your user, or if you know your user and you have to provide some information, you would have to use PHP. There's no persistent connection available for web feature services.
22:42
Yeah, and I mentioned that already. So think about that you have the largest software needs. It's an additional component, so you have overhead in programming, in maintenance, and you have the most investment needs with a web service.
23:00
There's also not 100% availability guaranteed, and it's a one-size-fits-all approach. So that means tight coupling may not always be the best communication between a server and a client.
23:24
Those are the last few pages already. So the recommendations in summary, what you can use or what you should use, or definitely think about what the purpose of your application is, how many users you're gonna have,
23:42
what your security needs are. And if we talk about security, then always think about SQL injection and how to prevent it and one of the best ways to do that is to prepare your statements. This is what I mentioned earlier.
24:00
This is a non-hazardous SQL statement because they are parameterized here, the country capital. So whatever the malicious attack writer wants to achieve will not happen because it's part of the value section, so he will not be able to get a hold of your database.
24:25
Of course, always think about the application firewalls that you have available and use them wisely. You can validate user input. That's also a very reasonable thing to do.
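Input validation can be as simple as an allow-list check before the value goes anywhere near a query. The sketch below is one possible rule for place names, not a rule from the talk: letters of any script, spaces, periods, apostrophes and hyphens, at most 60 characters. The function name `isValidPlaceName` is hypothetical.

```javascript
// Accepts a string that starts with a letter and otherwise contains only
// letters, spaces, periods, apostrophes and hyphens (1 to 60 characters).
function isValidPlaceName(input) {
  return typeof input === "string" &&
    /^\p{L}[\p{L} .'-]{0,59}$/u.test(input);
}

console.log(isValidPlaceName("Cluj-Napoca"));
console.log(isValidPlaceName("Bucharest'; DROP DATABASE geodata; SELECT '1"));
```

Because legitimate names can contain an apostrophe, a check like this complements prepared statements rather than replacing them.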
24:43
So no malicious attacks will be issued or no one will get into the database and think about hiding your information. So don't, well, turn off your database error messages. You can do that in PHP with the error reporting
25:02
and that will prevent any other users from seeing your database error messages, because they are quite important, right? They give away a lot of information from a database. And also, think about brute force attacks.
25:20
There's the web feature service in GeoServer. You can delay authentication. You can limit the number of login attempts and you can implement strong passwords. So that's also something that you really have to think about. And in PHP, it would be, well, you have to script that
25:41
but that's always a very wise thing to do. One minute here, I'm done. I always forget the welcome, no, the goodbye page. No, there's one more page. That's actually quite an important one, so sorry to withhold that from you.
26:02
But performance, if you think about performance, I've tested it as well in the, well, in the last few weeks. It's always good to pre-process your data as well as you can. So if you create your GeoJSON within PHP
26:22
and you do here, this is a three times nested select statement. That would consume a lot of resources and a lot of time. That's something that you really don't have to do because you can just use the SELECT INTO
26:40
and then put it in another table, right? SELECT INTO creates another table and puts this statement right in GeoJSON. So in PHP, you can just select the whole table, which is already prepared. And that makes a huge difference with time and that's something I can really recommend.
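The pre-processing step can be sketched as two statements: an expensive one that runs once and stores the finished GeoJSON in a new table, and a cheap one that runs per request. The SQL below is an assumption about what such a nested statement might look like in PostgreSQL with PostGIS; the slide itself is not reproduced here. Table and column names (`cities`, `cities_geojson`, `name`, `geom`) and both function names are hypothetical.

```javascript
// SQL identifiers cannot be sent as parameters, so they are checked against
// a conservative pattern before being placed into the statement text.
function assertIdentifier(name) {
  if (!/^[a-z_][a-z0-9_]*$/.test(name)) {
    throw new Error("Unexpected identifier: " + name);
  }
  return name;
}

// Run once, for example after each data update: builds the FeatureCollection
// and stores it in a new one-row table via SELECT ... INTO.
function buildPrecomputeSql(sourceTable, targetTable) {
  const source = assertIdentifier(sourceTable);
  const target = assertIdentifier(targetTable);
  return [
    "SELECT json_build_object(",
    "  'type', 'FeatureCollection',",
    "  'features', json_agg(json_build_object(",
    "    'type', 'Feature',",
    "    'geometry', ST_AsGeoJSON(geom)::json,",
    "    'properties', json_build_object('name', name)",
    "  ))",
    ") AS geojson",
    "INTO " + target,
    "FROM " + source
  ].join("\n");
}

// Run per request: a plain read of the prepared row.
function buildServeSql(targetTable) {
  return "SELECT geojson FROM " + assertIdentifier(targetTable);
}

console.log(buildPrecomputeSql("cities", "cities_geojson"));
console.log(buildServeSql("cities_geojson"));
```

In PostgreSQL, `SELECT ... INTO` fails if the target table already exists, so a refresh would need to drop or replace it first.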
27:05
I hope that was helpful to you. Thank you very much. Any questions?
27:21
The OGC standards, the way that I understand is actually a collection of standards, right? You have WFS and all the capabilities that you mentioned. Do you need all of these or you can actually select or how does it work? The OGC standards? Right, you have like, you have the web map service,
27:41
the web feature service, web processing service, web coverage service. They are all web services for disseminating geospatial data, right? Use one of them or, yeah, yeah, how does it actually? Yeah, yeah, because well, my purpose is to publish geodata, right?
28:01
If you, yeah, for vector data, you need the web feature service. For raster data, you need the web coverage service but I only concentrated on vector data here. So, and that's the only one that will, that you can get a hold of the true data, right?
28:21
Not like the WMS which only serves images. Of course, yeah, yeah. Web map service is really good for base maps or you can also get some basic information but you will never get the real geodata, right? The geodata you can only get with the web feature service
28:41
for vector data, for vector data. Because I was upstairs in the 21st floor, they also had a session on the new modern version
29:01
of the OGC APIs which, yeah, actually, so they're gonna get rid of a lot of XML, they're gonna simplify things, they're gonna make it more modern, more JSON based and also following some best practices in the web world. That's about time, isn't it? Especially, I mean, yeah, well,
29:20
some of these standards are cumbersome, right? Yeah, yeah, yeah, yeah. And if you think about like, especially styling these things when you use SLD, that's a nightmare, that's a pure nightmare. If you use GeoServer and you have to do styling, I would recommend not to use SLD, use CSS,
29:41
because that comes as an extension for GeoServer. And that's, you can reduce your code from 180. I teach my students that because I compare those two methods and the SLD document is 160 lines of code and CSS is four lines.
30:01
You're welcome. More questions? If no, then thank you for your attention and let's meet in the next sessions.