We're sorry but this page doesn't work properly without JavaScript enabled. Please enable it to continue.
Feedback

#bbuzz: A Journey to Write a New Lucene PostingsFormat

Formale Metadaten

Titel
#bbuzz: A Journey to Write a New Lucene PostingsFormat
Serientitel
Anzahl der Teile
48
Autor
Lizenz
CC-Namensnennung 3.0 Unported:
Sie dürfen das Werk bzw. den Inhalt zu jedem legalen Zweck nutzen, verändern und in unveränderter oder veränderter Form vervielfältigen, verbreiten und öffentlich zugänglich machen, sofern Sie den Namen des Autors/Rechteinhabers in der von ihm festgelegten Weise nennen.
Identifikatoren
Herausgeber
Erscheinungsjahr
Sprache

Inhaltliche Metadaten

Fachgebiet
Genre
Abstract
Some hard technical challenges are better solved by changing the foundations. We had a use case of searching many fields with strong constraints on memory and performance. We needed a massive number of fields to support field level security at scale and open the path to machine learned ranking models. A custom PostingsFormat allowed for a solution with greater efficiencies than our prior solution. We developed a new Lucene PostingsFormat called UniformSplit, we deployed it at a very large scale, and we open-sourced it. We learned a lot during the journey, especially about micro-benchmarking, java memory consumption, compact data representation and high performance Lucene indices. This presentation is a good medium to share what we learned with step backwards, the learnings on the Lucene mechanisms, the tips and the pitfalls we encountered. And as we continued the development, we will share the latest works and production measures.