2—SOFTWARE
ISD-SR3000
Voice Solutions in Silicon
2.2 RECOGNITION ENGINE
The ISD-SR3000 uses a segmented triphone recognition process. The sampled speech utterance
is split into distinct phonetic sounds (phonemes), the smallest units of speech. Because
phonemes vary in both sound and duration, the processor must determine the boundaries between
them. The ISD-SR3000 uses Hidden Markov Models to hypothesize boundaries between sounds and
to form a probabilistic model for each possible combination.
The outputs are then classified by matching the phonetic sounds against the stored phoneme
models. The acoustic models for the phonemes are gathered from a large sample of speakers,
allowing for wide variation across accents, dialects, and gender. This lets the recognizer
associate the sound segments with a number of possible phonemes, enabling recognition when
words are pronounced differently.
The phonemes are then matched to vocabulary words or phrases using a search routine. The
set of phonemes is compared to the vocabulary models for the active topics, and the recognized
word is returned. If the phonemes do not match any of the active vocabulary words, nothing is
returned. The ISD-SR3000 does not return a score with the word; it either recognizes a word,
or it does not.
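
The search step described above can be sketched roughly as follows. This is a minimal illustration, not the ISD-SR3000's actual search routine: the `VOCABULARY` table, the topic names, the phoneme symbols, and the `recognize` function are all hypothetical, and exact sequence comparison stands in for the probabilistic matching the device performs. What it does preserve is the behavior stated in the text: only active topics are searched, and the result is either a word or nothing, with no score.

```python
from typing import Iterable, Optional

# Hypothetical vocabulary: topic name -> {word: phoneme sequence}.
# Topic names and phoneme symbols are illustrative assumptions only.
VOCABULARY = {
    "global": {
        "cancel": ("k", "ae", "n", "s", "ah", "l"),
        "help": ("hh", "eh", "l", "p"),
    },
    "confirm": {
        "yes": ("y", "eh", "s"),
        "no": ("n", "ow"),
    },
}


def recognize(phonemes: Iterable[str], active_topics: Iterable[str]) -> Optional[str]:
    """Return the word whose model matches, or None if nothing matches.

    Only words in the active topics are considered, and no score is
    returned: the outcome is a word or nothing.
    """
    target = tuple(phonemes)
    for topic in active_topics:
        for word, model in VOCABULARY.get(topic, {}).items():
            if model == target:
                return word
    return None
```

With only the hypothetical `global` topic active, the phonemes for "yes" return `None`, because the word is outside the active vocabulary.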
2.2.1 TYPES OF RECOGNITION
The ISD-SR3000 is capable of both speaker-independent and speaker-defined recognition.
The recognition engine is continuous, allowing for multiple-word commands and connected
digits. However, there must be recognized silence before and after each valid utterance. The
length of the silence is programmed into the host controller, and may be as short as 100 ms.
The commands and digits are speaker-independent, with models constructed from a large corpus
of speakers. The speaker-defined voicetags and commands are partially speaker-dependent;
however, they are constructed by creating acoustic models “on-the-fly” from the phoneme
base. This means that only one training pass is required to enter a voicetag, and recognition
is possible with some variation in the way the name is spoken. The first pass is used to
create the phoneme model, and a second pass is used for recognition confirmation.
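
The silence requirement can be illustrated with a simple endpointing sketch. It assumes speech has already been reduced to per-frame speech/silence flags at a fixed frame length; the function name `find_utterances`, the 10 ms frame size, and the flag representation are assumptions made for this example and are not taken from the ISD-SR3000 documentation. An utterance is accepted only when it is bounded on both sides by at least the configured silence length, and shorter pauses are treated as part of the same utterance.

```python
import math
from typing import List, Sequence, Tuple


def find_utterances(
    frame_is_speech: Sequence[bool],
    frame_ms: int = 10,          # assumed frame length
    min_silence_ms: int = 100,   # silence length programmed by the host
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices, end exclusive, of valid utterances."""
    needed = math.ceil(min_silence_ms / frame_ms)
    utterances = []
    start = None         # first speech frame of the current candidate
    last_speech = None   # most recent speech frame in the candidate
    silence_run = 0      # consecutive silent frames seen so far
    leading_ok = False   # did enough silence precede the candidate?

    for i, speech in enumerate(frame_is_speech):
        if speech:
            if start is None:
                start = i
                leading_ok = silence_run >= needed
            last_speech = i
            silence_run = 0
        else:
            silence_run += 1
            # Enough trailing silence closes the current candidate.
            if start is not None and silence_run == needed:
                if leading_ok:
                    utterances.append((start, last_speech + 1))
                start = None
    return utterances
```

In this sketch a 30 ms pause between two words does not split the utterance when the programmed silence is 100 ms, which loosely mirrors how connected digits can be spoken as one command.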
2.2.2 GRAMMAR
A grammar is used to define the structure of the commands. The ISD-SR3000 is designed to
work with multiple topics or a finite-state grammar. This type of grammar is designed to limit
perplexity (the number of possible branches during recognition) by pre-defining the number of
allowable words at a given state. For example, a prompt that requires a “yes” or “no” response
has a perplexity of two. Greater perplexities increase the chance of substitution errors.
During recognition, a limited number of topics are active. Topics are groups of words that are
active at a given time. For example, in a voice dialing application, digit topics are active
after the user issues the “dial” command. No other topics are open (except global topics
such as “cancel” or “help”), so the recognizer is only trying to recognize digits. Combining
this type of grammar with active topics inherently increases recognition accuracy.
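
The relationship between grammar states, active topics, and perplexity can be sketched as a small lookup table. The state names, topic names, and helper functions below are hypothetical and chosen only to mirror the voice dialing example; the device's actual grammar format is not shown in this section. Perplexity is computed here simply as the number of words allowed in a state.

```python
# Hypothetical topics: groups of words that can be active together.
TOPICS = {
    "commands": {"dial"},
    "digits": {"zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"},
    "confirm": {"yes", "no"},
}

# Global words stay open in every state.
GLOBAL_WORDS = {"cancel", "help"}

# Hypothetical finite-state grammar: state -> topics active in that state.
STATE_TOPICS = {
    "idle": ["commands"],
    "dialing": ["digits"],
    "confirming": ["confirm"],
}


def active_words(state: str, include_global: bool = True) -> set:
    """Return every word the recognizer may match in the given state."""
    words = set()
    for topic in STATE_TOPICS[state]:
        words |= TOPICS[topic]
    if include_global:
        words |= GLOBAL_WORDS
    return words


def perplexity(state: str, include_global: bool = True) -> int:
    """Number of possible branches (allowable words) in the given state."""
    return len(active_words(state, include_global))
```

Here `perplexity("confirming", include_global=False)` is 2, matching the "yes"/"no" example, and "dial" is not among the active words in the `dialing` state, so it cannot be substituted for a digit.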