Deep learning for search has become a hot topic in recent years: it enables search based on semantics, search based on visual similarity, and even cross-modality search. Promising as it is, it is non-trivial to drop a deep neural net into your system and expect it to work out of the box. In fact, in most cases it doesn't. The reasons can be summarised under three pillars: task shift, domain shift, and knowledge shift. Firstly, most deep learning models are trained to minimise a classification, regression, or segmentation loss rather than a search loss. Secondly, the dataset the model was trained on can be quite different from the data you are working with. Last but not least, we have observed a notable knowledge gap between search engineers and machine learning engineers.

In this talk, we would like to gently guide the audience into the world of neural search and explain the motivation behind model tuning. We will then discuss the algorithmic frameworks behind model fine-tuning, such as deep metric learning, contrastive learning, and self-supervised learning. Last but not least, we will talk about the infrastructure behind a mature training service and how we can scale it up. We believe this topic will be interesting for the Berlin Buzzwords audience, since it covers several of the conference tags: search, data science, and scale.

After this 40-minute talk, the audience is expected to understand:
1. What neural search is and why it is important.
2. The algorithms for improving pre-trained neural nets for single-modality and cross-modality search.
3. Our tech stack for scaling the training platform up.