From Natural Language to Structured Solr Queries using LLMs

Plain Schwarz

Petreti, Ilaria; Ruggero, Anna

Formal Metadata

Title

From Natural Language to Structured Solr Queries using LLMs

Title of Series

Berlin Buzzwords 2024

Number of Parts

Author

Petreti, Ilaria

Ruggero, Anna

Contributors

N. N. (Moderation)

License

CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Identifiers

10.5446/70229 (DOI)

Publisher

Plain Schwarz

Release Date

2024

Language

English

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

This talk draws on experimental work to enable AI applications with Solr. One important use case is using AI to make data more accessible and discoverable: while user experience (UX) techniques, lexical search improvements, and data harmonization can take organizations to a good level of accessibility, a structural (or “cognitive”) gap remains between the needs of data users and the constraints of data producers. That is where AI, and most importantly Natural Language Processing and Large Language Model (LLM) techniques, could make a difference. A natural language, conversational engine could facilitate access to and use of the data by leveraging the semantics of any data source. The objective of the presentation is to propose a technical approach and a way forward to achieve this goal. The key concept is to let users express their search queries in natural language, which the LLM then enriches, interprets, and translates into structured queries based on the Solr index's metadata. This approach leverages the LLM's ability to understand the nuances of natural language and the structure of documents within Apache Solr. The LLM acts as an automatic intermediary agent, offering users a transparent experience and potentially uncovering relevant documents that conventional search methods might overlook. The presentation will include the results of this experimental work, lessons learned, best practices, and the scope of future work that should improve the approach and make it production-ready.
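
As an illustration of the key concept only, the translation step might be sketched as below. This is not the speakers' implementation: the schema, the prompt wording, the JSON shape of the model's answer, and the names build_prompt, to_solr_params, and fake_llm are all assumptions made for this example, and the model call is replaced by a canned stand-in. Only the q and fq request parameters and the range syntax are standard Solr.

```python
import json
from urllib.parse import urlencode

# Hypothetical index metadata (field name -> Solr field type); an assumption
# for this sketch, not the schema used in the talk.
SCHEMA = {"title": "text_general", "author": "string", "year": "pint"}


def build_prompt(question, schema):
    """Describe the index fields to the LLM and ask for a JSON query plan."""
    fields = "\n".join(f"- {name} ({ftype})" for name, ftype in schema.items())
    return (
        "Translate the question into JSON with keys 'q' (string) and "
        "'fq' (list of 'field:value' filters). Use only these fields:\n"
        f"{fields}\nQuestion: {question}"
    )


def to_solr_params(llm_output, schema):
    """Validate the LLM's JSON answer and turn it into Solr request parameters."""
    plan = json.loads(llm_output)
    filters = []
    for fq in plan.get("fq", []):
        field = fq.split(":", 1)[0]
        if field in schema:  # drop filters on fields the index does not have
            filters.append(fq)
    # Fall back to a match-all query if the model returned no main query.
    params = [("q", plan.get("q") or "*:*")]
    params += [("fq", fq) for fq in filters]
    return params


def fake_llm(prompt):
    """Stand-in for a real model call (assumption); returns a canned answer."""
    return json.dumps({"q": "title:solr", "fq": ["year:[2020 TO *]", "color:red"]})


prompt = build_prompt("Solr talks since 2020", SCHEMA)
params = to_solr_params(fake_llm(prompt), SCHEMA)
query_string = urlencode(params)  # ready to append to a /select request URL
```

Checking the model's output against the index metadata before it reaches Solr is one plausible way to keep the LLM from querying fields that do not exist; here the filter on the unknown field color is silently dropped.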