
SpeakQL: Towards Speech-driven Multimodal Querying

Formal Metadata

Title
SpeakQL: Towards Speech-driven Multimodal Querying
Number of Parts
155
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata
Abstract
Speech-based inputs have become popular in many applications on constrained devices such as smartphones and tablets, and even in personal conversational assistants such as Siri, Alexa, and Cortana. Inspired by this recent success of speech-driven interfaces, in this work we consider an important fundamental question: How should one design a speech-driven system to query structured data? Recent works have studied new querying modalities such as visual [4, 8], touch-based [3, 7], and natural language interfaces (NLIs) [5, 6], especially for constrained querying environments such as tablets, smartphones, and conversational assistants. The commands given by the user are then translated into the Structured Query Language (SQL). But conspicuous by its absence is a speech-driven interface for regular SQL or other structured querying. One might wonder: Why dictate structured queries and not just use NLIs or visual interfaces? From a practical standpoint, many users in the C-suite, enterprise, Web, and other domains are already familiar with SQL (even if only a subset of it) and use it routinely. A spoken SQL interface could help them speed up query specification, especially in constrained settings such as smartphones and tablets, where typing SQL would be painful. More fundamentally, there is a trade-off inherent in any query interface, as illustrated in Figure 1(A).
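
To make the setting concrete, the sketch below illustrates one possible shape of such a pipeline: an off-the-shelf ASR engine produces a raw transcript of the spoken query, and a post-processing step normalizes it into a SQL string by mapping spoken operator phrases to SQL tokens and snapping misrecognized keywords to the closest entry in a small SQL vocabulary. The keyword table, the edit-distance heuristic, and the example transcript are illustrative assumptions for this sketch, not the system described in the talk.

# A minimal, hypothetical sketch of a speech-to-SQL pipeline: an ASR transcript
# of a dictated query is normalized into a best-effort SQL string. The spoken
# phrase table and the correction heuristic are assumptions for illustration.
import difflib

# Spoken forms an ASR transcript might contain, mapped to SQL tokens.
SPOKEN_TO_SQL = {
    "select": "SELECT", "from": "FROM", "where": "WHERE",
    "greater than": ">", "less than": "<", "equals": "=",
    "and": "AND", "or": "OR", "star": "*",
}

SQL_KEYWORDS = ["SELECT", "FROM", "WHERE", "GROUP BY", "ORDER BY"]


def snap_to_keyword(token: str, cutoff: float = 0.8) -> str:
    """Snap an ASR-garbled token (e.g. 'selekt') to the closest SQL keyword."""
    match = difflib.get_close_matches(token.upper(), SQL_KEYWORDS, n=1, cutoff=cutoff)
    return match[0] if match else token


def transcript_to_sql(transcript: str) -> str:
    """Turn a raw spoken-query transcript into a best-effort SQL string."""
    text = transcript.lower()
    # Replace multi-word spoken operators first ("greater than" -> ">").
    for spoken, sql in sorted(SPOKEN_TO_SQL.items(), key=lambda kv: -len(kv[0])):
        text = text.replace(spoken, sql)
    # Snap remaining near-miss alphabetic tokens to SQL keywords; identifiers
    # that are far from any keyword are left untouched.
    tokens = [snap_to_keyword(t) if t.isalpha() else t for t in text.split()]
    return " ".join(tokens)


if __name__ == "__main__":
    # Suppose the ASR engine heard "selekt" instead of "select".
    print(transcript_to_sql("selekt name from employees where salary greater than 50000"))
    # -> SELECT name FROM employees WHERE salary > 50000

Keeping keyword correction separate from identifier handling is what leaves table and column names, which lie outside the SQL vocabulary, untouched even when the keywords themselves are misrecognized.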