It is estimated that by 2025, 80% of all data will be unstructured. Meanwhile, modern advances in large language models and generative AI have catalyzed the adoption of vector embeddings across nearly every industry. Vector search is growing in popularity, and thanks to these advances, the need for efficient and scalable semantic search is increasingly evident. GPUs have become synonymous with AI over the past decade, but even more intriguing is the advancement of software that can leverage GPUs to accelerate more general-purpose data processing workloads like vector search.
cuVS, from NVIDIA [1], is a CUDA-based library containing state-of-the-art implementations of several algorithms for approximate nearest neighbor search and clustering on the GPU. Apache Lucene [2] is an open-source search library at the core of popular search engines and platforms such as Elasticsearch, OpenSearch, Apache Solr, and MongoDB Atlas Search.
This talk has two main parts: 1) an introduction to NVIDIA's cuVS library: its history, the types of approximate nearest neighbor search algorithms it implements and how they compare, the novel graph-based CAGRA algorithm, and the cuVS roadmap; and 2) the integration of cuVS into Apache Lucene to power GPU-accelerated vector search: the motivations, challenges, and roadmap of this integration, along with potential future directions for turbo-charging Lucene on the GPU.
We will share benchmarks and lessons learned along the way, and we hope this will spark a new trend in which GPUs are used to accelerate other compute-heavy search, analytics, and database tasks.
[1] cuVS (GitHub: rapidsai/cuvs), formerly part of RAFT (GitHub: rapidsai/raft)
[2] Apache Lucene (GitHub: apache/lucene)