Human Language Technologies
Human Language Technologies (HLT) comprise a number of areas of
research and development that focus on the use of technology to
facilitate communication in a multilingual information society. Human
language technologies are areas of activity in departments of the
European Commission that were formerly grouped under the heading
Language Engineering (Gupta & Schulze 2011: Section 1.1).
[72]
The parts of HLT that are of greatest interest to the language teacher
are Natural Language Processing (NLP), especially parsing, and the
areas of speech synthesis and speech recognition.
Speech synthesis has improved immeasurably in recent years. It is
often used in electronic dictionaries to enable learners to find out how
words are pronounced. At word level, speech synthesis is quite
effective, the artificial voice often closely resembling a human voice.
At phrase level and sentence level, however, there are often problems of
intonation, resulting in speech production that sounds unnatural even
though it may be intelligible. Speech synthesis as embodied in
Text To Speech (TTS) applications is invaluable as a tool for unsighted
or partially sighted people. Gupta & Schulze (2010: Section 4.1) list
several examples of speech synthesis applications.
[72]
Speech recognition is less advanced than speech synthesis. It has
been used in a number of CALL programs, in which it is usually described
as Automatic Speech Recognition (ASR). ASR is not easy to implement.
Ehsani & Knodt (1998) summarise the core problem as follows:
"Complex cognitive processes account for the human ability to
associate acoustic signals with meanings and intentions. For a computer,
on the other hand, speech is essentially a series of digital values.
However, despite these differences, the core problem of speech
recognition is the same for both humans and machines: namely, of finding
the best match between a given speech sound and its corresponding word
string. Automatic speech recognition technology attempts to simulate and
optimize this process computationally."
[73]
Programs embodying ASR normally provide a native speaker model that
the learner is requested to imitate, but the matching process is not
100% reliable and may result in a learner's perfectly intelligible
attempt to pronounce a word or phrase being rejected (Davies 2010:
Section 3.4.6 and Section 3.4.7).
[40]
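The "best match" idea, and the way a strict acceptance threshold can reject an intelligible attempt, can be illustrated with a toy Python sketch. It rests on assumptions made purely for illustration: the sources cited above do not say which matching technique CALL programs use, so dynamic time warping over short numeric sequences stands in here for the far richer acoustic features and statistical models of real ASR systems. All names and values (`dtw_distance`, `best_match`, `accept`, the stored models, the thresholds) are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # repeat the current frame of b
                cost[i][j - 1],      # repeat the current frame of a
                cost[i - 1][j - 1],  # advance both sequences
            )
    return cost[n][m]


def best_match(utterance, models):
    """Return (word, distance) for the stored model closest to the utterance."""
    scored = ((word, dtw_distance(utterance, feats)) for word, feats in models.items())
    return min(scored, key=lambda pair: pair[1])


def accept(utterance, model, threshold):
    """Accept the attempt only if it lies within the threshold of the model."""
    return dtw_distance(utterance, model) <= threshold


# Hypothetical "native speaker" models: made-up numbers, not real acoustics.
models = {
    "ship": [1.0, 3.0, 2.0],
    "sheep": [1.0, 5.0, 5.0, 2.0],
}

# A slower, slightly different attempt at "ship".
learner = [1.0, 1.0, 3.5, 2.0]

word, distance = best_match(learner, models)
print(word, distance)
print(accept(learner, models["ship"], threshold=0.25))
print(accept(learner, models["ship"], threshold=1.0))
```

In this toy setting the learner's attempt is closest to the intended word, yet it is still rejected under the stricter of the two thresholds. Loosening the threshold accepts it, at the cost of also accepting poorer attempts.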