Drawing on numerous references to the current literature, we will explain how we designed our new system and solved the challenges we encountered on both the ML and the engineering side (a data pipeline for encoding documents, a live service for encoding queries, and integration with the search engine), and we will share insights from analyzing the impact. Our system is based on OpenSearch, but the lessons can be applied to other search engines as well.
To be more specific, the presentation will cover:
- Status and Shortcomings of our old Search
- Introduction to Hybrid Search
  - general setup
  - recommendations from the literature
- Machine Learning
  - model decision (quality vs. latency)
  - fine-tuning and offline evaluation (in particular: using Paid Search / SEM data if you have little historical search performance data of your own)
- Architecture and Implementation (with special consideration of latency)
  - pipeline for encoding documents and indexing the resulting vectors (PySpark)
  - service for live encoding of queries (Python)
  - implementing hybrid search within OpenSearch (including the important extraction of filter values from the query, and ranking scores)
- Learnings and Next Steps
  - observations from our A/B test
  - challenge of cut-off decisions
  - realistic training / evaluation data
  - filter value extraction from query vs. semantic search
  - combining search with auto-complete
  - impact of the call to action in the search bar
  - other use cases to which we successfully apply such semantic vector approaches
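
To make the idea of hybrid ranking scores a little more concrete, here is a minimal sketch of one common way to combine lexical and semantic results: min-max normalize each score list, then take a weighted average. This is a generic illustration and not necessarily the approach presented in the talk. The function names `min_max` and `hybrid_rank`, the `weight` parameter, and the example scores are all hypothetical.

```python
def min_max(scores):
    """Rescale a dict of doc_id -> raw score into the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so treat every document as a top hit.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(lexical, semantic, weight=0.5):
    """Combine lexical (e.g. BM25) and semantic (vector similarity) scores.

    `weight` is the share given to the semantic score (an assumed,
    illustrative parameter). A document missing from one result list
    contributes 0.0 for that list.
    """
    lex = min_max(lexical)
    sem = min_max(semantic)
    combined = {
        doc: (1 - weight) * lex.get(doc, 0.0) + weight * sem.get(doc, 0.0)
        for doc in set(lex) | set(sem)
    }
    # Highest combined score first; ties are broken by document id.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up example scores, for illustration only.
lexical_hits = {"a": 10.0, "b": 5.0, "c": 0.0}
semantic_hits = {"b": 0.9, "c": 0.5, "d": 0.1}
ranked = hybrid_rank(lexical_hits, semantic_hits)
print([doc for doc, _ in ranked])
```

Normalizing first matters because lexical scores are unbounded while similarity scores usually lie in a fixed range, so the raw values are not directly comparable. Where to cut off the combined list is a separate decision, which is the cut-off challenge listed above.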