We're sorry but this page doesn't work properly without JavaScript enabled. Please enable it to continue.
Feedback

Bootstrapping an End-to-End Natural Language Interface for Databases

Formale Metadaten

Titel
Bootstrapping an End-to-End Natural Language Interface for Databases
Serientitel
Anzahl der Teile
155
Autor
Lizenz
CC-Namensnennung 3.0 Deutschland:
Sie dürfen das Werk bzw. den Inhalt zu jedem legalen Zweck nutzen, verändern und in unveränderter oder veränderter Form vervielfältigen, verbreiten und öffentlich zugänglich machen, sofern Sie den Namen des Autors/Rechteinhabers in der von ihm festgelegten Weise nennen.
Identifikatoren
Herausgeber
Erscheinungsjahr
Sprache

Inhaltliche Metadaten

Fachgebiet
Genre
Abstract
The ability to extract insights from data is critical for decision making. Intuitive natural language interfaces to databases provide non-technical users with an effective way to formulate complex questions and information needs efficiently and effectively. A recent trend in the area of Natural Language Interfaces for Databases (NLIDBs) has been the use of neural machine translation models to synthesize executable Structured Query Language (SQL) queries from natural language utterances. The main bottleneck in this type of approach is the acquisition of examples for training the model. Recent work has assumed access to a rich manually-curated training set for a given target database. However, this assumption ignores the large manual overhead required to curate the training set for any new database. As a result, NLIDB systems that can simply ‘plug in’ to any new database and perform effectively for naive users have yet to make their way into commercial products. Here we present DBPal, an end-to-end NLIDB framework in which a neural translation model is trained for any new database schema with minimal manual overhead. In addition to being the first off-the-shelf, neural machine translationbased system of its kind, the contributions of our project are 1) its use of a synthetic training set generation pipeline used to bootstrap a translation model without requiring manually curated data, and 2) its use of state-of-the-art multi-task and cross-domain learning techniques that increases the robustness of the translation model towards unseen linguistic phenomena in new domains. In experiments we show that our system can achieve competitive performance on the recently released benchmarks for nl-to-sql translation. Through ablation experiments we show the benefit of using cross-domain learning techniques on the performance of the system. In a user study we show that DBPal outperforms a well-known rule-based NLIDB and performs comparably to an approach using a similar neural model that relies on manually curated data.