Multilingual Natural Language Processing Applications: From Theory to Practice

Multilingual Natural Language Processing Applications is the first comprehensive single-source guide to building robust and accurate multilingual NLP systems. Edited by two leading experts, it integrates cutting-edge advances with practical solutions drawn from extensive field experience.

This book will be invaluable for all engineers, software developers, researchers, and graduate students who want to process large quantities of text in multiple languages, in any environment: government, corporate, or academic.

About the Author: Daniel M.

More linguistically justified methods for good, coherent, consistent annotation schemes are needed. There has been almost no study of the applicability of methods across languages or attempts to identify language-independent features that can be exploited in NLP systems across languages.

Data-driven techniques are often language-independent, and once again, systematic analysis of what works in a multilingual environment is required. Data annotation, on the other hand, is largely language-dependent, but has to be produced in a standardized way in order to enable both system improvement and evaluation. The standardization of annotation formats (as in, for instance, the European EAGLES effort) and international collaboration are crucial here. Each of the four major application areas should be stimulated to face its particular challenges.

Machine Translation (Chapter 4) should pursue coverage and robustness by putting more statistics into symbolic approaches and should pursue higher quality by putting more linguistics into statistical approaches.

Information Retrieval (Chapter 2) should focus on multilinguality, which will require new statistical methods of simplifying traditional symbolic semantics. Text Summarization and Information Extraction (Chapter 3) should attack the problems of query analysis and sentence analysis in order to pinpoint specific regions in the text in which specific nuances of meaning are covered, mono- and multilingually, by merging their respective primarily statistical and primarily symbolic techniques.

Natural language processing research is at a crossroads: both symbolic and statistical approaches have been explored in depth, and the strengths and limitations of each are beginning to be well understood.

We have the data to feed the development of lexicons, term banks, and other knowledge sources, and we have the data to perform large-scale study of statistical properties of both written and spoken language in actual use. Coupled with this is the urgent need to develop reliable and robust methods for retrieval, extraction, summarization, and generation, due in large part to the information explosion engendered by the development of the Internet.

We have the tools and methods, yet we remain far from a solid understanding of, or a general solution to, the problem. What is needed is a concerted and coordinated effort by researchers across the spectrum of relevant disciplines, representing the international community, to come together and shape the bits and pieces into a coherent set of methods, resources, and tools. As noted above, this involves, in large part, a systematic and painstaking effort to gain a deep understanding of the contributing factors and elements, from both a linguistic and a computational perspective.

However this may best be accomplished, one thing is clear: it demands conscious effort. The current emphasis on the development of applications may or may not naturally engender the sort of work that is necessary, but progress will certainly be enhanced with the appropriate recognition and support.