2016-11-25

Grammatical Framework: Formalizing the Grammars of the World


source: GoogleTechTalks, September 15, 2016
A Google TechTalk, 9/7/2016, presented by Professor Aarne Ranta, University of Gothenburg.
Speaker's errata:
4:57: “sixteen forms” should be “twenty-six”
19:32: “more than 2000 members” should be “200” as on the slide

ABSTRACT: GF (Grammatical Framework) is a grammar formalism that was first released at Xerox Research in 1998 and later became an open-source collaborative project. GF is thus at least a decade younger than the major grammar formalisms (LFG, HPSG, TAG, CCG) and has grown up in an era when computational linguistics is dominated by statistical methods rather than grammars. Its background is in fact quite different from that of the major grammar formalisms, as its roots are in theorem provers and compiler construction rather than theoretical linguistics.
The original mission of GF was to make it easy to implement multilingual controlled language systems, where a semantic interlingua serves as a hub between multiple languages. In such a system, translation works by parsing the source language into the interlingua and then generating the target language from it. Unlike in many other interlingual systems, the interlingua is not fixed but can easily be changed, e.g. to adapt to application domains. Thus GF has been used to implement software specification systems, spoken dialogue systems, mathematical teaching tools, tourist phrasebooks, and many other applications, in which up to 30 parallel languages are involved.
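
The parse-then-generate idea can be illustrated with a minimal Python sketch. This is not GF code and does not use any GF API: the tree shape ("Is", kind, quality), the names KINDS, QUALITIES, CONCRETE, linearize, parse and translate, and the two tiny lexicons are all hypothetical, chosen only to show an interlingua acting as a hub. Parsing is done here by brute-force enumeration of all trees, which only works because the toy domain is finite.

```python
from itertools import product

# Hypothetical toy interlingua: every tree has the shape ("Is", kind, quality).
KINDS = ["Wine", "Cheese"]
QUALITIES = ["Good", "Fresh"]

# One "concrete syntax" per language: a sentence template plus a lexicon.
# (Illustrative data only, not taken from any real GF grammar.)
CONCRETE = {
    "eng": {
        "template": "this {kind} is {quality}",
        "words": {"Wine": "wine", "Cheese": "cheese",
                  "Good": "good", "Fresh": "fresh"},
    },
    "fre": {
        "template": "ce {kind} est {quality}",
        "words": {"Wine": "vin", "Cheese": "fromage",
                  "Good": "bon", "Fresh": "frais"},
    },
}

def linearize(tree, lang):
    """Generation: turn an interlingua tree into a sentence of `lang`."""
    _, kind, quality = tree
    concrete = CONCRETE[lang]
    return concrete["template"].format(
        kind=concrete["words"][kind],
        quality=concrete["words"][quality],
    )

def parse(sentence, lang):
    """Parsing: find every tree whose linearization in `lang` is `sentence`."""
    trees = (("Is", kind, quality) for kind, quality in product(KINDS, QUALITIES))
    return [tree for tree in trees if linearize(tree, lang) == sentence]

def translate(sentence, source, target):
    """Translation = parse into the interlingua, then generate the target."""
    return [linearize(tree, target) for tree in parse(sentence, source)]

print(translate("this cheese is fresh", "eng", "fre"))
```

Running the sketch prints ['ce fromage est frais']. In this toy setup, adding a language means adding one entry to CONCRETE, and adapting to a new domain means changing KINDS and QUALITIES, which loosely mirrors the point that the interlingua itself can be changed.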
In recent years, GF has also scaled up to wide-coverage parsing and translation, resulting for instance in the mobile app GF Offline Translator. While not quite as good in open-domain tasks as state-of-the-art statistical systems, the GF translator has some advantages: compact size (15 languages available offline in 30 megabytes), inspectability (via syntax trees and other grammatical information), and domain-adaptability. The traditional weakness of grammars, their labour intensiveness, is relieved by software techniques that make the development of grammars in GF orders of magnitude faster than with traditional methods.
Another emerging usage of GF is dependency parsing. The booming initiative of Universal Dependencies (UD) has turned out to be very similar to the interlingua used in the wide-coverage GF translator, so that GF trees can be automatically converted to UD trees. Since GF trees support generation in addition to parsing, the mapping makes it possible to bootstrap UD treebanks for new languages. More generally, the use of UD data in combination with GF grammars suggests a way to build hybrid systems that combine data-driven UD parsing with the precise semantic analysis and generation of GF.
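
One way to picture a tree-to-dependency conversion is the following Python sketch. It is an assumption-laden illustration, not the actual GF-to-UD converter: the function names (PredVP, ComplV2, DetCN and the lexical items), the LABELS table, and the convert helper are hypothetical, and the sketch assumes that the order of children in the tree matches the surface word order, which need not hold in a real grammar. The idea shown is that each syntactic function declares which argument is the head and which dependency label the other arguments receive.

```python
# Hypothetical labelling table: for each function, one role per argument.
# Exactly one argument is marked "head"; the others get a dependency label.
LABELS = {
    "PredVP": ["nsubj", "head"],   # subject + verb phrase
    "ComplV2": ["head", "obj"],    # verb + object
    "DetCN": ["det", "head"],      # determiner + noun
}

def convert(tree, labels):
    """Return (words, edges); edges are (dependent, label, head), 1-based.

    A tree node is (function, [children]); a leaf is (function, word).
    Assumes child order equals surface word order (a simplification).
    """
    words = []
    edges = []

    def visit(node):
        fun, body = node
        if isinstance(body, str):      # leaf: record the word, return its position
            words.append(body)
            return len(words)
        heads = [visit(child) for child in body]
        # Unary functions pass their only child up as the head.
        roles = labels[fun] if len(body) > 1 else ["head"]
        head = heads[roles.index("head")]
        for position, role in zip(heads, roles):
            if role != "head":
                edges.append((position, role, head))
        return head

    root = visit(tree)
    edges.append((root, "root", 0))    # position 0 stands for the artificial root
    return words, sorted(edges)

tree = ("PredVP", [
    ("DetCN", [("the_Det", "the"), ("cat_N", "cat")]),
    ("ComplV2", [
        ("see_V2", "sees"),
        ("DetCN", [("the_Det", "the"), ("dog_N", "dog")]),
    ]),
])

words, edges = convert(tree, LABELS)
for dependent, label, head in edges:
    print(dependent, words[dependent - 1], label, head)
```

For the example tree, the verb "sees" becomes the root, "cat" its nsubj, "dog" its obj, and each "the" a det of the following noun.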

ABOUT THE SPEAKER: Aarne Ranta is Professor of Computer Science at the University of Gothenburg. He defended his PhD at the University of Helsinki in 1990. After seven years as Junior Fellow of the Academy of Finland, he worked at Xerox Research Centre Europe in Grenoble from 1997 to 1999, starting the development of Grammatical Framework (GF), after which he joined the Department of Computing Science of Chalmers University of Technology and University of Gothenburg. Ranta’s research interests have covered type theory, functional programming, compiler construction, and, as his main field, computational linguistics. He has followed the mission to formalize the grammars of the world and make them available for computer applications. In this work, he has been helped by 10 PhD graduates and by a community of over 200 GF contributors. Ranta is currently on partial leave from the university to work for the start-up company Digital Grammars AB, which develops reliable language technology for producers of information.
