Full-Text Search in Django with PostgreSQL

Video in TIB AV-Portal: Full-Text Search in Django with PostgreSQL

Formal Metadata

Full-Text Search in Django with PostgreSQL
Title of Series
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Release Date

Content Metadata

Subject Area
Full-Text Search in Django with PostgreSQL [EuroPython 2017 - Talk - 2017-07-12 - PythonAnywhere Room] [Rimini, Italy] After some experience implementing full-text search functionality with different systems, we decided to use PostgreSQL to implement full-text search in our next project, a website to search for shows, venues, bands and festivals. In the past I worked on two different projects, a mobile platform to sell and buy used items and a sport video sharing platform, where I used two of the currently best-known full-text search systems (Elasticsearch and Solr), but I had some synchronization and management problems. After that, in my company, we looked into the new Django support for the PostgreSQL full-text search implementation and decided to use it to avoid the problems I had had in the past. I’m going to start by speaking about full-text search in a general context and showing the problems I encountered implementing it in the past. Afterwards, I’m going to talk about the PostgreSQL functionality for implementing full-text search and present the django.contrib.postgres.search module, with step-by-step demonstrations of its functions using real-world data. Finally, I’m going to show how we use and test this functionality in our project, and which functionality we still lack for a complete implementation of full-text search. At the end, I want to present my conclusions about our solution and explore some new features that will be present in the next versions of Django and PostgreSQL.
Hello everyone. This is my first English talk [unclear], so excuse me for reading some notes. Anyway, I am happy to be here. We are here to find out more about full-text search in Django with PostgreSQL. My name
is Paolo Melchiorre, I am Italian and I'm a computer science engineer. I have been a Python developer for more than 10 years and a Django developer for about 5 years. At present I am working remotely at 20tab as a senior software engineer. I'm not a database administrator, but I'm a loyal user of PostgreSQL in my projects. With
this talk I want to show you how we have used Django full-text search and PostgreSQL in a real project. The reason why we used Django and PostgreSQL as the basis for the search is that we preferred to implement full-text search without any external tools. These are the main topics of
this presentation: full-text search in general, existing solutions for full-text search, full-text search support in PostgreSQL, Django support for full-text search, the project concertiaroma.com, what's next in full-text search, my conclusions, and any questions after the talk. Full-text search derives from the need to do document search, for example to find documents that contain a specific word and its variations: whether a document contains "house" or "houses", it would be the same for the search. Some examples of everyday use of full-text search are search engines, e-commerce search, image search and so on. This is
a list of some features that we can find in an advanced full-text search solution and that we can use in our projects: stemming, ranking, stop-word removal, multiple-language support, accent support, indexing and phrase search.
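To make two of the listed features concrete, here is a minimal sketch of stop-word removal and stemming in plain Python. It is only an illustration under stated assumptions: the stop-word list and the naive_stem, normalize and matches helpers are hypothetical, and the plural stripping is far cruder than the language-aware stemmers a real full-text search engine uses.

```python
import re

# Hypothetical, tiny stop-word list; real engines ship per-language dictionaries.
STOP_WORDS = {"a", "an", "the", "in", "of", "and", "or", "is"}


def naive_stem(word):
    """Strip a trailing plural "s" (a toy rule, not a real stemmer)."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def normalize(text):
    """Lowercase, split into words, drop stop words, stem what is left."""
    words = re.findall(r"[a-z]+", text.lower())
    return {naive_stem(word) for word in words if word not in STOP_WORDS}


def matches(query, document):
    """True when every normalized query term appears in the document."""
    return normalize(query) <= normalize(document)


if __name__ == "__main__":
    print(normalize("The houses in the city"))
    print(matches("house", "The houses in the city"))
```

With this normalization a query for "house" and a document containing "houses" reduce to the same term, which is the behaviour described in the talk.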
Elasticsearch and Solr are the two solutions for full-text search that I have tested. There are others, but these are the only ones that I've used in my professional projects. They are both Lucene-based and written in Java. Snap Market was a
startup where I worked in the past, which produced a mobile phone application to sell and buy used items. In this project I used Elasticsearch, which had already been set up on the system, but we had some difficulties managing and synchronizing it. We had to apply some patches to the Java plug-in used for compound words in German, and I didn't particularly enjoy it.
Another project was GoalScout.com, a website dedicated to showing sport videos uploaded by public users; it has about 25 thousand videos. The use of Solr for full-text search in this project was a customer choice. We always had some problems synchronizing the data, and in the end we preferred doing all the writes on PostgreSQL and all the reads on Apache Solr. The
solutions that I've spoken about are full-featured, and there are many online resources regarding them: documentation, articles and frequently asked questions. But I found some problems in synchronization, and I always had to use a driver to connect to them; it is a bottleneck between Django and the search engine. In some cases I had to fix the code of the driver. Personally, I am more of a dev than an ops, so I don't like to be forced to integrate many systems; I prefer to solve a problem by writing Python code. PostgreSQL has supported full-text search since 2008. Internally it uses tsvector and tsquery data types to process the data to search, and it has some indexes that can be used to speed up the search: GIN and GiST. PostgreSQL added phrase search in 2016. OK, before, we defined the
full-text search area using the concept of document. The document is the generic concept used in full-text search, the unit on which the search is executed; in a database a document can be a field in a table, or the aggregation of more fields in one table or in different tables. The module django.contrib.postgres contains the support for full-text search. This support has been present in the module since version 1.10. It has
BRIN and GIN indexes, which were added in version 1.11; the GIN index is very useful to speed up full-text search. Using this full-text search in Django is more developer-friendly for me. OK, let's look
at the search functions, starting from the models presented in the official Django documentation. We have Blog and Author classes connected through the Entry class. These are the basic searches that we can use on models in Django: using the field lookup contains in the first case, and in the second case its case-insensitive version, icontains. In order to get more results, we can activate the unaccent extension of PostgreSQL, so we can use the unaccent lookup, as shown, to search without worrying about accented characters. It is mostly useful for non-English languages. On bigger tables, though, query execution with this extension is slow. Another
extension is trigram. By activating the trigram extension in PostgreSQL we can use trigram similarity. A trigram is a group of three consecutive characters taken from a string; we can evaluate the similarity of two strings by the number of trigrams they share. It is useful when an exact match is not enough. This is the basic search lookup of Django and PostgreSQL: we can execute a real full-text search on a field; this is a really simple example. We can use SearchVector to search on more fields, on the same object or on related objects, like in this case. When we pass text to the full-text search through a SearchQuery, we can apply the operations of stemming and stop-word removal on the user's text, and on these queries we can apply basic logical operations. We can also use the PostgreSQL rank to create a score of the relation to the searched text, and we can use it to filter and to sort. We can set up the search vector to execute stemming and stop-word removal for a specific language, and we can also take the language from a field. It is possible to set up the search
to give different weights to various fields, and we can use these values in the search for filtering and ordering. We can also decide to add a SearchVectorField to the model to speed up the search: it is very fast, but we have to update this field manually, for example using Django signals or PostgreSQL triggers. OK, this is the project we are working on: concertiaroma.com is a website to insert and to search for shows, festivals, bands and venues in the city of Rome. At the moment the website has the following numbers, but they are growing, and the website has been online since 2014.
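The trigram similarity described earlier in this part of the talk can be sketched in a few lines of plain Python. This is only an approximation written for illustration: it follows the padding rule documented for the PostgreSQL pg_trgm extension (two spaces before each word, one after) and scores two strings as shared trigrams divided by all distinct trigrams. The function names trigrams and similarity are chosen here for the example, and real pg_trgm handles details that this sketch ignores.

```python
import re


def trigrams(text):
    """Return the set of trigrams of a string, padding each word like pg_trgm."""
    result = set()
    for word in re.findall(r"[^\W_]+", text.lower()):
        padded = "  " + word + " "
        result.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return result


def similarity(first, second):
    """Shared trigrams divided by all distinct trigrams of the two strings."""
    first_set, second_set = trigrams(first), trigrams(second)
    if not first_set or not second_set:
        return 0.0
    return len(first_set & second_set) / len(first_set | second_set)


if __name__ == "__main__":
    print(sorted(trigrams("cat")))
    print(round(similarity("word", "two words"), 6))
```

A score close to 1 means the two strings share almost all their trigrams, so a small typo in a band name still produces a high similarity, while unrelated strings score close to 0.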
This is version 2, the old version of the website. It was developed some years ago with Django 1.7 and it runs on Python 2.7. The data was managed by PostgreSQL version 9.1 and the search was performed only using SQL LIKE syntax [unclear]. This is version 3, the new version, recently released. It was developed with Django version 1.11 and it runs on Python 3.6. The data is managed by PostgreSQL version 9.6 and the search uses its full-text search engine. Let's look at
an example of a manager defined for the Band class. It defines a search method that contains the full-text search logic. It is more complex than the previous examples. To better understand the
mechanism, we can take into consideration an example of a test for the Band manager. In the test setup we define the example data that we will use afterwards to test the search: bands and the musical genres that we assign to each band. To search on the bands, we simply call the search method giving it a search text, and we get back a list of values for two fields, nickname and rank: the nickname is stored in the band table, while the rank is generated by our search method at search time. In this example we compare the search results with a list of lists, where we define pairs composed of the band's nickname and the numerical value that is the search rank, or in other words the numerical value that defines the importance of that result. We have seen a
simplified use of the core features of Django and PostgreSQL full-text search. Both of these software projects are getting better in this field, and these are some of the features that may be available in the near future. OK, in conclusion, the following are the
conditions we evaluated to implement this solution: not having any extra dependencies, not doing too complex searches, managing all the components easily, not needing to synchronize data, PostgreSQL being already available in the stack, and having a Python-only environment. These are the resources that I used to prepare this talk and to develop the search function that I have shown you and that we use on the website. I would like to thank
20tab, the company I work for, for giving me this opportunity, and Marc Tamlyn, the original django.contrib.postgres developer, for sharing his work with everyone. Thanks, everyone, for the time that you have spent listening to me. This presentation is released under a Creative Commons license; you can download it from my Speaker Deck account afterwards. If anyone here has any questions, please wait until after the presentation, and if you want you can contact me here. Thank you. So thanks, Paolo, for the presentation. Yeah, a really interesting
talk. I was wondering, because in one of the slides, if you go a few slides back, you were showing the rank. I'm not going to ask how exactly you calculate the rank, because I can imagine that's proprietary information [unclear], but I'm wondering, oh sorry, I'm wondering how much control we actually have over how the rank is
generated. Can you, for example,
state that you want to rank higher the words that are longer than 5 characters, or that appear earlier in the sentence [unclear]? Sorry, for me it is difficult to understand your question clearly; can you repeat it? Yeah. OK, let's take an example: [unclear] I am searching for [unclear]; will that be ranked higher if it is the first word rather than the second word in the sentence? So, if the word I am searching for is the first word, I want to rank it higher. Is that kind of control possible? Yes, you can control this and other functionality. The
function you need, I think, is the weights of the search. You can specify the weights for the fields. In PostgreSQL you can specify four weight labels, A, B, C and D, and you can change the numbers related to these weights; by configuring these numbers you can have a different final weight for the results. The results can be ranked with two types of ranking in PostgreSQL, ts_rank and ts_rank_cd, which have different logic: the first one counts the frequency of the words in your documents, and the second one has a more complex logic but is more flexible [unclear]. We worked a lot on this to get the ranks and the settings right in very fine detail. The resulting search [unclear] synchronizing like in the other solutions, but I think it is good enough. Can you give us some more information about the difficulties you encountered in synchronizing the data between the database and the search engine, like Solr or Elasticsearch? Thank you. In both projects I used a [unclear] that managed to synchronize the PostgreSQL database and the Elasticsearch or Solr engine, and we encountered a lot of timing problems. Sometimes the user sets a flag, for example sets a flag on some content to delete it, and then you have to wait for the data to go from the database to the Elasticsearch engine, every time the user does a single operation, and especially when the volume of your data grows. You can have a very big machine to put it on, for example, but this was a very small project, so we didn't have a lot of big machines for that. With PostgreSQL we don't have this problem at all, because the data and the full-text search
are using the same database. That was the last question for Paolo, so thanks again, Paolo, for his interesting presentation. Thank you.
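The answer about weights in the question session can be illustrated with a toy scoring function. This is a sketch under stated assumptions, not the formula PostgreSQL uses for ts_rank or ts_rank_cd: it only shows how the labels A to D map to numbers and how changing those numbers changes the final order. The DEFAULT_WEIGHTS values are the defaults PostgreSQL documents for the four labels; the toy_rank function and the band data are hypothetical.

```python
import re

# Default numbers PostgreSQL documents for the weight labels D, C, B and A.
DEFAULT_WEIGHTS = {"A": 1.0, "B": 0.4, "C": 0.2, "D": 0.1}


def toy_rank(query, weighted_fields, weights=DEFAULT_WEIGHTS):
    """Sum, over all fields, the field weight times the number of query hits.

    weighted_fields is an iterable of (text, label) pairs, label in "ABCD".
    """
    terms = set(re.findall(r"\w+", query.lower()))
    score = 0.0
    for text, label in weighted_fields:
        words = re.findall(r"\w+", text.lower())
        hits = sum(1 for word in words if word in terms)
        score += weights[label] * hits
    return score


if __name__ == "__main__":
    # Hypothetical bands: the name gets label A, the description label B.
    first_band = [("Jazz Trio", "A"), ("a trio from Rome", "B")]
    second_band = [("Roma Quartet", "A"), ("they play jazz and blues", "B")]
    print(toy_rank("jazz", first_band), toy_rank("jazz", second_band))
```

With the default numbers a hit in the name outranks a hit in the description; passing a different weights mapping, for example one where B is larger than A, reverses the order. In Django the same two knobs are the weight argument of SearchVector and the weights argument of SearchRank.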