We're sorry but this page doesn't work properly without JavaScript enabled. Please enable it to continue.
Feedback

Django with PostgreSQL superpowers

00:00

Formal Metadata

Title
Django with PostgreSQL superpowers
Title of Series
Number of Parts
115
Author
Contributors
License
CC Attribution - NonCommercial - ShareAlike 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Every time we’re going to create a new project with Django we make assessments on its requirements to choose the best architecture, of which, the database is usually the core. Django is a database-agnostic web framework but natively supports only 4 Open Source databases: PostgreSQL, SQLite, MariaDB and MySQL. PostgreSQL has the richest feature set of any supported database and some of these features are natively supported directly in Django via its contrib module. In this talk we’ll see how to use to our advantage the features of PostgreSQL as a database in Django, its exclusive features present in its contrib module and also other superpowers that can be exploited through the use of third-party packages.
GoogolSoftware industryNetwork topologyEndliche ModelltheorieExtension (kinesiology)Table (information)Subject indexingField (computer science)Front and back endsBlogProjective planeOperator (mathematics)CryptographyFunctional (mathematics)Digital photographyDatabaseSpring (hydrology)Computer fileSheaf (mathematics)Social classPlatonic solidOverlay-NetzLatent heatSet (mathematics)NumberWeb 2.0Similarity (geometry)Core dumpQuery languageArray data structureSystem administratorWater vaporSound effectReal numberType theorySoftware testingMusical ensembleArithmetic meanDefault (computer science)PRINCE2Row (database)WebsiteLink (knot theory)Multiplication signTwitterGroup actionSoftware developerCustomer relationship managementPulse (signal processing)MappingInstance (computer science)Series (mathematics)Human migrationContent (media)2 (number)Relational databaseCycle (graph theory)Shared memorySelf-organizationProcess (computing)MethodenbankSoftwareSoftware frameworkStatement (computer science)CoalitionRevision controlData structure9 (number)Standard deviationMereologyData storage deviceoutputGUI widgetQuicksortStorage area networkRandomizationMultiplicationResultantAttribute grammarFile formatConfiguration spaceUniform resource locatorBand matrixData transmissionSpacetimeAdditionTransformation (genetics)Data typeVector spaceParameter (computer programming)CuboidConnected spaceOperating systemVortexVariable (mathematics)WordInformationSoftware engineeringForm (programming)Flow separationOpen sourcePhysical systemLibrary (computing)Source codeCASE <Informatik>String (computer science)Set (mathematics)Wrapper (data mining)ExpressionImplementationPresentation of a groupComputer programContext awarenessIntegrated development environmentSearch engine (computing)Range (statistics)Client (computing)Buffer overflowStack (abstract data type)Device driverEmailMetric systemAuthorizationRepresentation (politics)CodeElectronic mailing listMilitary baseTriangleQR codeGraphics tabletHierarchyAbstractionMeeting/Interview
Projective planeSoftware developerFront and back endsDatabaseComputer animationMeeting/Interview
DatabaseFront and back endsDatabaseMereologyCentralizer and normalizerFront and back endsSheaf (mathematics)Computer animation
Web 2.0DatabaseProjective planeSet (mathematics)
SoftwareSoftware developerDevice driverInstallation artClient (computing)Front and back endsDatabaseUniform resource locatorIntegrated development environmentConfiguration spaceConnected spaceOperator (mathematics)DatabaseProjective planeSoftware developerSubject indexingFront and back endsClient (computing)Computer fileSoftware industryRevision controlSheaf (mathematics)System administratorMultiplication signDevice driverVariable (mathematics)Uniform resource locatorParameter (computer programming)Software engineeringEmailCustomer relationship managementLibrary (computing)Integrated development environmentTwitterPRINCE2Computer animation
DatabaseEndliche ModelltheorieObject (grammar)Revision controlBlogData modelOperator (mathematics)Extension (kinesiology)Subject indexingE-textMessage passingAuthorizationMethodenbankSheaf (mathematics)Query languageGroup actionDatabaseEndliche ModelltheorieFunctional (mathematics)MethodenbankNumberOperator (mathematics)RandomizationField (computer science)Core dumpComputer fileProjective planeType theorySoftware testingMobile appComputer animation
Digital photographyDatabaseCore dumpMethodenbankFunctional (mathematics)Spring (hydrology)Meeting/Interview
Revision controlRandom numberMethodenbankHuman migrationEndliche ModelltheorieData modelBlogObject (grammar)Set (mathematics)MethodenbankOperator (mathematics)2 (number)CryptographyFunctional (mathematics)Relational databaseInstance (computer science)Extension (kinesiology)Revision controlRandomizationDatabaseField (computer science)Rule of inferenceHuman migrationData storage deviceTable (information)Endliche ModelltheorieDefault (computer science)Latent heatXMLComputer animation
WebsiteGroup actionProjective planeSoftware developerSpring (hydrology)ImplementationDigital photographyPresentation of a groupFunctional (mathematics)Meeting/Interview
E-textRegulärer Ausdruck <Textverarbeitung>Price indexObject (grammar)BlogEndliche ModelltheorieData modelMeta elementSubject indexingDigital filterWordWeb 2.0Front and back endsDatabaseForm (programming)Field (computer science)Endliche ModelltheorieFunctional (mathematics)Query languageType theorySubject indexingResultantRevision controlMethodenbankAdditionVector spaceCuboidSearch engine (computing)Computer animation
Digital photographySound effect
EmailField (computer science)Endliche ModelltheorieData modelBlogObject (grammar)Coma BerenicesoutputGUI widgetDefault (computer science)EmailEndliche ModelltheorieField (computer science)Military baseSimilarity (geometry)MultiplicationContent (media)AuthorizationSystem administratorArray data structureOperator (mathematics)Type theoryCodeComputer animation
Self-organizationSoftwareOpen sourceComputer programDigital photography
Civil engineeringTable (information)Data modelEndliche ModelltheorieObject (grammar)BefehlsprozessorBeer steinEmailDrop (liquid)Total S.A.MappingComputer fileBenchmarkProjective planeQuery languageTable (information)SpacetimeMiniDiscReduction of orderInsertion lossContent (media)Standard deviationData transmissionLimit (category theory)Customer relationship management2 (number)Endliche ModelltheorieRow (database)Default (computer science)Multiplication signStructural loadComputer animation
Set (mathematics)Digital photographyLecture/Conference
Network topologyMethodenbankSubject indexingEndliche ModelltheorieMeta elementBlogObject (grammar)ASCIIOrder (biology)AstrophysicsHierarchyEndliche ModelltheorieProcess (computing)Operator (mathematics)Subject indexingStatement (computer science)Network topologyField (computer science)Extension (kinesiology)Representation (politics)Query languageResultantTable (information)QuicksortComputer animation
MappingExtension (kinesiology)Series (mathematics)Digital photographyBlog
Extension (kinesiology)Front and back endsType theorySubject indexingTable (information)OracleWorld Wide Web ConsortiumFront and back endsDatabaseTable (information)Shared memoryFunctional (mathematics)Computer animation
Subject indexingExtension (kinesiology)Range (statistics)Source codeRange (statistics)Open sourcePresentation of a groupSource codeField (computer science)Extension (kinesiology)Subject indexingProjective planeString (computer science)Reading (process)Library (computing)Functional (mathematics)Stack (abstract data type)Software developerInformationMethodenbankComputer animation
Link (knot theory)Front and back endsTable (information)Shared memoryQR codeLecture/ConferenceMeeting/Interview
Transcript: English(auto-generated)
Welcome back, everyone. Hope you all are enjoying the EuroPython conference. I'm Anmol Sosteva, and I'm going to be the session host for the next two sessions. Let's welcome our next speaker, Paolo Melchiori, who is the CTO of a Pythonic software company
named 20Tap. Paolo is a seasoned Python backend developer and actively contributes to the Django project. Today, he's going to talk about using PostgreSQL as a database in Django. So, pretty nice. Where are you joining from, Paolo? Hi, I'm from Italy.
I'm in the central part of Italy, and close to the sea. Nice. How's the weather in Italy? Yeah, it's very hot now, so inside it's fresh. All right, awesome. Over to you, Paolo, we can start.
Thank you. So, hello, everyone, and I'm very happy to be here with you, even if remotely. In this talk, we'll see how to use some great feature of Postgres as a database in Django. In the database section of the Django documentation,
we can read that the Django attempts to support as many features as possible on all database backends. However, not all databases are alike. Django, per se, is a database agnostic web framework, but real-world projects based on it are not.
Postgres has the richest feature set of any Django-supported database, and we'll see in this talk how to use some of this superpower. Before moving on, it's important that I make this clear. So, now it seems clear to me, so now we can move on.
Jokes apart, I would like to underline that I'm not a database administrator. So, who am I? I'm Paolo Micciore, and I'm the CEO of 20Tab, a Bitonic software company for which I work remotely. I'm a software engineer and long-time Python backend developer.
After using Django for a few years, I became a contributor to the project. I also used Postgres as a database for all my Django projects, and now it's time to create one of them. As usual, to create a Django project, I use the latest Python 3 stable,
create and activate a virtual amp, in which then install the latest stable Django. Then, using the Django start project command, I created the basic file of our project. So, let's see what it takes to add Postgres to this newly created project.
I think many of you are familiar with this drawing from The Little Prince. This drawing is used as the header of the Twitter account of psychopg, a Postgres driver for Python. I think it represents its goal very well, Python with Postgres inside.
psychopg is the most used and advanced Postgres driver for Python. It implements the Python DB API 2.0 specification, and it's distributed under the LGPL license. The library was released 20 years ago,
and over the time has been constantly improved and kept aligned with Postgres. Version 3 is currently being developed. psychopg is a wrapper for libpq, the Postgres C client library. To install this package on a Debian-based system,
like Ubuntu, you can use the Hapity package manager. For most operating systems, the quickest way to install psychopg is using the package available in the Python package index. And now, let's see how to use psychopg in Django.
To use Postgres as a database in our Django project, we modified the settings, adding the psychopg-based database backend and the connection parameters of our Postgres database, which we may have locally or remotely.
If you embrace the 12-factor methodology, you can define a database URL variable in your environment. Depending on whatever you use, Django database URL directly or Django configuration,
your database's section should look like something like this. Let's now see our database in action. We'll use the example model defined in the making queries section of the Django documentation.
For our test queries, we only use another model and an entry model, both containing various type of fields, so we can search on. We can perform basic queries like this on our model,
but actually, we can run these queries using all other supported database as well. What we're really interested in is using Postgres-specific feature from Django. For the same reason, in 2014, Mark Tamil,
a Django core developer, started a crowdfunding campaign to develop a module to contain fields for a number of Postgres-specific data types. The campaign was a success, and the new module was merged in Django 1.8.
The module now contains Postgres-specific fields, indexes, function, extension, and so on. Over the year, important functions have been added, such as JSON field, full-text search, random UID, and operator classes. JSON fields have become usable also in the other supported database,
but only from Django 3.1, released last year, five years after being introduced in the Postgres module. So, to use all the features of the Django module, just add it in the installed apps in the section file of our project.
And now, let's get to know some features of this module better. Okay, I took this photo during the spring day after the DjangoCon Europe 2017 in Florence. In that day, I completed a pull request to add a database function
in the Postgres module for Django 2.0. I was helped by Mark Tamil, the original creator of this Postgres module, and by Marcus Alterman, a Django core developer, both in this photo. The database function I'm talking about is random UUID.
The random UUID database function returns a version four random UUID. It's contained in the pgCrypto module that provides cryptographic function for Postgres. It can be activated using the crypto extension migration operation. And from Postgres 13,
this function is included in core. To see the function in action, we'll add the UUID field in our entry model. This field uses the related Python module and only when used on Postgres, this store in a specific UUID data type.
The database will not generate it for you, so it's recommended to use default. But not that the UUID for callable is passed to default and not an instance of it. Using the Postgres function, you can update all the values in a model,
way faster than cycling over all the entry and generating a new value for the related Python function. I recently used this technique to set in few seconds UUIDs in a nearly one million rule tables.
Pretty impressive. I took this other photo during the spring day after EU Python 2017 in Rimini. I promoted a working group on Django and some developers joined me. That day we started the transition
of the Django project website search function from Elasticsearch to Postgres full text search. Since then, I've written an article and given more than one presentation on full text search with Django, so I skipped the implementation details.
The full text search support in the Postgres module is available as specific fields, expression, and function. If your Postgres version is recent enough, you can also use specific indexes for a search or web search style. Without any customization, we are able to perform a full text search
on a single field of the entry model. For example, we can search for a word in the plural form and have results in the singular form. This is a very convenient way to start using the Postgres full text search out of the box. But to speed up the full text search,
we can add a search vector for the entry model and use it to create a functional gene index on the same model. The functional index are an addition of Django 3.2, available for all Django database backend. But gene index is only available in Postgres backend.
After that, we can search for a word using a syntax similar to the one used by web search engines and have more accurate results. We can use these syntaxes using the search query
with the search type attribute. Furthermore, the SQL queries will be faster thanks to the gene index.
With this photo, we move virtually in Northern Europe, more precisely in Norway. I took this photo because I really liked the effect of these typical houses on the water, all similar to each other but repeated, like data in an array. The array fields make Postgres array types
available in Django. They are very convenient for storing arrays of similar data without creating a new model for them. You specify other Django model fields as bases. Also, its size can be defined and it can be even multi-dimensional.
For example, we can store multiple email in our outer model by defining an array of email using the email field as a base. We can then query our authors looking for an email.
The content of the field itself is represented as a list. The resulting SQL code uses all Postgres-specific operators for the array. Unfortunately, the default array widget in the Django admin is a simple input text
with comma-separated values. But using this Python package, you can represent in the Django admin the values as a multiple dynamically-headed input text. It's better. Okay, I took this photo in San Francisco.
We are now virtually moving in California because the package we are going to talk about is provided by the California Civic Data Coalition, an open-source network of journalists and computer programmers from news organizations across America. Django Postgres Copy is a Python package
to quickly import and export the limited data with Django support for Postgres copy command. The copy command moves data between tables and standard file. Copy to copies the content of a table to a file, and copy from copies data from a file to a table.
To more flexibility, Django Postgres Copy use a temporary table that are automatically dropped at the end of the session. To benchmark Postgres Copy, we use a file containing all the geographic name from the OpenStreetMap project. We create a new model that maps each columns
contained in the CSV file into a field. We have to replace the model default manager with the one from Postgres Copy. Here we use the file with all the geographical names of Italy,
and the file is more than 200 megabytes. To upload the file, we use the specific query set method to which we pass the path of the file. The loading speed is impressive. Almost one million record in just over three seconds.
Impressive. Under the hood, Django Postgres Copy executes several SQL statements, creates a temporary table based on the content of the CSV file. Upload the content of the file to the temporary table in just over two seconds.
It inserts the data of the temporary table into the table managed by the feature Django model, applying eventually some transformation, and finally drops the temporary table. To reduce disk space and transmission bandwidth,
we have compressed our file in a gzip format, reducing the size to a fifth. We can pass our compressed file directly to Postgres Copy without having to decompress it. Loading is done in a shorter time than before. I want to repeat it. Almost one million record in just over three seconds.
We are now virtually moving back in Italy with this photo that I took in a bird in Abruzzo, the region where I live. I'm showing this photo because now we are going to talk about tree.
Postgres L3, exactly. Django L3 is a tree extension to support hierarchical tree-like data in Django models using the native Postgres extension L3. It's a simpler and faster alternative
to implement materialized paths compared to the more used Django packages. The package has a path field and an abstract tree model. To add tree-like hierarchy to the entry model,
we add a path field inheriting from the tree model provided by Django L3. We also add the dedicated Postgres GST index on the same field to speed up the queries. This is a tree representation
of the example hierarchical structure that we have stored in the path field of our model. I took this example from the Postgres L3 documentation. We perform our hierarchical query to filter all the contained models of a particular path,
sort the result by the tree structure, and then take all the subpaths. The resulting SQL statement uses the L3 operator to filter the table and the GST index to speed up the operation and the sorting process.
Everything is really fast. With this photo, we are now virtually moving on the path of my last hike on the Italian Apennines. I already used this photo on my latest article series about maps with Django,
which you can read on my blog. But I want to briefly talk about the geographical extension of Postgres used by GeoDjango. Post.js is a Postgres extension, and it's also the best database backend for GeoDjango. It internally integrates spatial data
and adds spatial data types, indexes, and function. In this chart, I've synthesized the compatibility table of geographic backends supported by GeoDjango. In the GeoDjango official documentation, there are three compatibility tables,
one for spatial lookups, for database function, and for aggregate function. As you can see, Post.js is the only geographic backend that supports 100% of these features. If you are interested in using this feature,
you can read my article about it. I'll share the URL at the end of this talk. There are also many other Postgres-specific features that can be used directly in Django. For example, you can use a lot of indexes and aggregation function
that are only available in Postgres. You can also use the trigram extension to perform fast searching for similar string. There are also specific fields only available in Postgres, like range fields and case-insensitive text fields, and more.
Before saying goodbye, I want to share with you some tips. Based on my experience as a Postgres user with Django. The first one is to read the documentation in the Django website, because it's full of information about the Postgres module features. After that, read the details about this feature
in the Postgres website, because it helps you to understand how things work under the hood. Read also the source code of both projects on GitHub, because there is something you can learn only from the source code.
And at the end, search for questions on Stack Overflow, but try to answer questions by yourself instead of reading the answer. Last but not least, you can also study this talk, because it is released with the Creative Commons, with a Creative Commons license.
The Pyscho PG3 library is under active development, and you can use this contact to learn more about it, get involved, and also sponsor its development. The company I work for, 20tab, is one of the sponsors of this library. In 20tab, we have developed many Django projects
using Postgres. You can find out more about our open-source project and our Patonic work using these contacts. And finally, to find out more about my personal work with Django and Postgres, you can use all these my contacts. With this QR code, you can download directly
this presentation on my website. Thanks again for having me, and join the next talk in the conference. Ciao tutti. Thank you a lot, Paolo. It was a really informative and well-structured talk.
Yeah, one question. Yeah, so let me read it out for you. Can you please share the link to what you mentioned about Postgres? Yes, in the talk, on my website, you can find the link
to the official documentation in Django, where there is all the graphic, the table for support of a feature in Django backend and about Postgres. All right.
We are actually having a few more minutes, so if anybody has a question, please put it on the channel, metric channel, else Paolo will be happy to answer the questions in the breakout knee room. Yes, of course. Yeah, we're having a few people typing. Let's wait for a minute or two.
Yes, I'll share also the link to my talk in the chat, so if someone is interested, can read directly without using the QR code I shared before, so it's easier. Yeah, all right.
I think we are good. So thanks again, Paolo, for joining and sharing your knowledge. It was really a pleasure listening to you. Thanks for having me.
Yep, thanks. If anybody has any more questions, please feel free to reach out to Paolo over the breakout knee room. Yep, thanks. Bye. Bye.