Pre-trained transformers have revolutionized search. However, off-the-shelf transformers of a fixed model size perform poorly on out-of-domain data, and while larger models generalize better, strict latency and cost requirements limit the size of production models. In this talk, we will demonstrate how small transformer models can be fine-tuned on specific domains, even in the absence of labelled data, using the technique of synthetic query generation. As part of this work, we release a fine-tuned 1.5B-parameter query generation model that, given a document, generates multiple questions the document answers. These synthetic query-document pairs are then used to fine-tune a small retrieval model. We combine the fine-tuned model with OpenSearch lexical search tools and benchmark the combination, demonstrating a state-of-the-art zero-shot nDCG@10 boost of 14.30% over BM25 across a benchmark of 10 public test datasets. We share lessons learned from training and using large language models for query generation, and discuss open questions around representation anisotropy, keyword filtering, and the index sizes of dense models. Attendees will leave with an understanding of the process used to fine-tune small transformer models and combine them with lexical search, along with step-by-step guidance for improving search accuracy using open-source tools.
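To make the query-generation step concrete, here is a minimal sketch using the Hugging Face transformers library with a public doc2query-style checkpoint. The checkpoint name and the helper function are illustrative stand-ins, not the 1.5B-parameter model released with this talk; any seq2seq query generator with the same interface would slot in the same way.

```python
# Minimal sketch of synthetic query generation, assuming a doc2query-style
# seq2seq model. The checkpoint below is a public stand-in, not the talk's
# released 1.5B-parameter model.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL = "doc2query/msmarco-t5-base-v1"  # illustrative stand-in checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL)

def generate_queries(document: str, num_queries: int = 5) -> list[str]:
    """Generate synthetic queries that the given document answers."""
    inputs = tokenizer(
        document, truncation=True, max_length=512, return_tensors="pt"
    )
    outputs = model.generate(
        **inputs,
        max_length=64,
        do_sample=True,          # sampling yields diverse queries
        top_k=25,
        num_return_sequences=num_queries,
    )
    return [tokenizer.decode(o, skip_special_tokens=True) for o in outputs]

doc = "OpenSearch supports both lexical (BM25) and dense vector retrieval."
for query in generate_queries(doc):
    # Each (query, doc) pair becomes a training example for the retriever.
    print(query)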
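```

Each generated (query, document) pair can then serve as a positive example when fine-tuning the small retrieval model, and at query time the retriever's dense scores can be combined with OpenSearch's BM25 scores, for example via score blending, to produce the hybrid results benchmarked in the talk.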