Prototype to Production for RAG applications

Zitieren

Swiss Python Summit Association

Chung, Isaac

Formale Metadaten

Titel

Prototype to Production for RAG applications

Serientitel

Swiss Python Summit 2024 (SPS24)

Anzahl der Teile

Autor

Chung, Isaac

Mitwirkende

N. N. (Moderation)

Lizenz

CC-Namensnennung 4.0 International:
Sie dürfen das Werk bzw. den Inhalt zu jedem legalen Zweck nutzen, verändern und in unveränderter oder veränderter Form vervielfältigen, verbreiten und öffentlich zugänglich machen, sofern Sie den Namen des Autors/Rechteinhabers in der von ihm festgelegten Weise nennen.

Identifikatoren

10.5446/69792 (DOI)

Herausgeber

Swiss Python Summit Association

Erscheinungsjahr

2024

Sprache

Englisch

Inhaltliche Metadaten

Fachgebiet

Informatik

Genre

Konferenz/Talk

Abstract

Talk recorded at the Swiss Python Summit on October 18th, 2024. Licensed as Creative Commons Attribution 4.0 International. --------- Abstract: Retrieval Augmented Generation (RAG) has been used to mitigate hallucination issues from LLMs and rapidly provide LLMs with external knowledge that were not part of the pre-training data. While tutorials offer convenient ways to build POCs quickly, transitioning these prototypes to production environments often catches us off-guard with unforeseen challenges. This talk takes a deeper dive into the topics that are often missing from cookbooks and tutorials yet are crucial in scaling your RAG prototype to production. Our discussion will use real examples to help you better understand some of the best practices in production RAG for observability, security, scalability, and fault tolerance. --------------------- About the speaker(s): I am currently a Staff Data Scientist at Wrike, where I work on enabling new generative AI features in production. I also help maintain MTEB and organize PyData Tallinn in my spare time. My background is in Aerospace Engineering and Machine Learning and I hold undergraduate and graduate degrees from the University of Toronto.