Multidimensional Text

River Valley TV

Plaice, John

Formal Metadata

Title

Multidimensional Text

Title of Series

The annual conference of the TeX Users Group (TUG 2008)

Part Number

Number of Parts

Author

Plaice, John

License

CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Identifiers

10.5446/30793 (DOI)

Publisher

River Valley TV

Release Date

2012

Language

English

Production Place

Cork, Ireland

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

The Unicode model of text makes a clear distinction between character and glyph, and in so doing, paradoxically, creates the impression that the ultimate representation for text is some form of abstraction from its visual presentation. However, the level of abstraction for different languages encoded “naturally” in Unicode is quite different. We propose instead that text be encoded as sequences of context-tagged indices into arbitrary indexed structures, including not just character sets such as Unicode, but also dictionaries of words or compound words. Furthermore, the elements of a sequence need not all come from the same indexed structure. Our approach allows natural solutions for a wide range of problems, including the creation of documents that can be printed using several alternate spellings, the automatic generation of error messages with arguments, and the correct generation of nouns or adjectives with number, case or gender markers or of verb conjugations.
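
The encoding proposed in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the names `Item`, `WORDS`, `render_item` and `render`, the two structures ("unicode" and "words"), and the dimensions "spelling" and "number" are all assumptions made for illustration. The sketch only shows one sequence mixing indices into two different indexed structures, with context tags selecting the surface form.

```python
from dataclasses import dataclass

# Hypothetical word dictionary: index -> {(dimension, value): surface form}.
# The key None marks a default form that does not depend on context.
WORDS = {
    0: {None: "the"},
    1: {("spelling", "US"): "color", ("spelling", "UK"): "colour"},
    2: {("number", "sg"): "is", ("number", "pl"): "are"},
}


@dataclass(frozen=True)
class Item:
    """One element of a text: an index into a named structure, plus tags."""
    structure: str      # "unicode" or "words" (assumed structure names)
    index: int          # code point, or key into WORDS
    tags: tuple = ()    # context tags, e.g. (("number", "pl"),)


def render_item(item, context):
    """Turn one item into a string under the given document context."""
    if item.structure == "unicode":
        return chr(item.index)
    # Tags attached to the item override the document-wide context.
    merged = {**context, **dict(item.tags)}
    forms = WORDS[item.index]
    for key, form in forms.items():
        if key is not None and merged.get(key[0]) == key[1]:
            return form
    return forms[None]  # raises KeyError if no form matches and no default exists


def render(text, context):
    """Render a whole sequence of items under one document context."""
    return "".join(render_item(item, context) for item in text)


SPACE = Item("unicode", 0x20)
TEXT = [
    Item("words", 0), SPACE,
    Item("words", 1), SPACE,
    Item("words", 2, (("number", "sg"),)),
]

print(render(TEXT, {"spelling": "UK"}))
print(render(TEXT, {"spelling": "US"}))
```

The same stored sequence is rendered with either spelling by changing only the document context, which corresponds to the alternate-spellings use case mentioned in the abstract. Agreement in number, case or gender would be handled in this sketch by adding further dimensions to the tags.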