Semantic vs keyword search as context for GPT

Plain Schwarz

Golubenco, Tudor

Formal Metadata

Title

Series Title

Berlin Buzzwords 2023

Number of Parts

Author

Golubenco, Tudor

Contributors

N. N. (Moderator)

License

CC Attribution 3.0 Unported:
You may use and modify the work or its content for any legal purpose, and copy, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they have specified.

Identifiers

10.5446/66608 (DOI)

Publisher

Plain Schwarz

Year of Publication

2023

Language

English

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

OpenAI's ChatGPT has taken the world by storm, and people want to offer the same kind of chatbot experience on their own data. Such a bot can answer questions based on your documentation or knowledge base. This can be done with the OpenAI API by providing the model with the right context, extracted from your data. You can do this in two steps:

* the search step: perform a search to select the documentation pages that are likely to contain the answer.
* the GPT step: provide these pages as context with a prompt like "With this context: .... answer this question: ...".

For the search step, semantic search is often used, because it makes use of the LLM capabilities. However, we have found that in practice keyword search (e.g. BM25-based) has some advantages when it comes to tuning the search step, and it tends to be more "explainable".
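The two steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the speaker's implementation: the function names (`tokenize`, `bm25_scores`, `search`, `build_prompt`), the sample `DOCS`, and the parameter values `k1=1.5` and `b=0.75` are all assumptions chosen for the example. The scoring is a plain Okapi BM25 variant written with the standard library only, and the GPT step stops at building the prompt string; sending it to the OpenAI API is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with a basic Okapi BM25 formula.

    k1 and b are common default values, assumed here for illustration.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    query_terms = set(tokenize(query))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def search(query, docs, top_k=2):
    """Search step: return up to top_k documents with a non-zero score."""
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in ranked[:top_k] if scores[i] > 0]


def build_prompt(question, pages):
    """GPT step: wrap the selected pages and the question in one prompt."""
    context = "\n\n".join(pages)
    return f"With this context:\n{context}\n\nanswer this question: {question}"


# Hypothetical knowledge base, made up for this example.
DOCS = [
    "To reset your password, open Settings and choose Reset password.",
    "Billing invoices are emailed on the first day of each month.",
    "API keys can be rotated from the Developer page.",
]

question = "How do I reset my password?"
pages = search(question, DOCS)
prompt = build_prompt(question, pages)
print(prompt)
```

Because each score is a sum of per-term contributions, it is possible to inspect which query terms matched a page and how much each one contributed, which is one way to read the "explainable" point made above.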