Advanced Retrieval-Augmented Generation Techniques

Zitieren

Zugehöriges Material

Plain Schwarz

Hasan, Zain

Formale Metadaten

Titel

Advanced Retrieval-Augmented Generation Techniques

Serientitel

Berlin Buzzwords 2024

Anzahl der Teile

Autor

Hasan, Zain

Mitwirkende

N. N. (Moderation)

Lizenz

CC-Namensnennung 3.0 Unported:
Sie dürfen das Werk bzw. den Inhalt zu jedem legalen Zweck nutzen, verändern und in unveränderter oder veränderter Form vervielfältigen, verbreiten und öffentlich zugänglich machen, sofern Sie den Namen des Autors/Rechteinhabers in der von ihm festgelegten Weise nennen.

Identifikatoren

10.5446/70267 (DOI)

Herausgeber

Plain Schwarz

Erscheinungsjahr

2024

Sprache

Englisch

Inhaltliche Metadaten

Fachgebiet

Informatik

Genre

Konferenz/Talk

Abstract

Chatbots are becoming increasingly popular for interacting with users, providing information, entertainment, and assistance. However, building chatbots that can handle diverse and complex user queries is still a challenging task. One of the main difficulties is finding relevant and reliable information from large and noisy data sources. In this talk, I will present some of the latest advances in retrieval-augmented generation(RAG) techniques, which combine the strengths of both retrieval-based and generative approaches for chatbot development. Retrieval-based methods can leverage existing text documents to provide informative and coherent responses, while generative methods can produce novel and engaging conversations personalized to the user. I will cover the following topics: 1. Hybrid search with vector databases: How to use both keyword-based and semantic-based search methods to retrieve relevant documents from large-scale vector databases. 2. Query generation using LLMs: How to use large language models to generate natural and effective queries for document retrieval, based on the user input and the dialogue history. 3. Automatically excluding irrelevant search results: How to use various filtering and ranking techniques based on vector distance to exclude irrelevant search results. 4. Re-ranking: How to dynamically re-rank retrieved documents to further improve context relevance. 5. Chunking Techniques: How to use text segmentation and summarization methods to chunk long documents into shorter and more relevant passages. I will demonstrate the effectiveness of these advanced techniques in the RAG workflow. I will also discuss the challenges and limitations of these techniques and the future directions for research and development.