This is a brief summary of the major changes between recent major revisions of Julius. The details of the changes in each release are listed in the file Release.txt, which is included in the distribution package.
Support for Plug-in extension
Support for multi-stream AM
Support for MSD-HMM
Support for CVN and VTLN (-cvn, -vtln)
Added output compatibility option (-fallback1pass)
On Linux, the default audio API has changed from OSS to ALSA.
On Linux, audio API can be changed at run time: -input alsa, oss, esd
Fixed bugs: -multigramout, environment variable expansion in jconf file, -record, and others.
Added option -usepower to use power instead of magnitude in MFCC computation.
This document.
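The difference that the -usepower option refers to can be illustrated with a minimal sketch. It assumes that "magnitude" and "power" mean the per-bin spectral values |X[k]| and |X[k]|^2 of an analysis frame. The function names, the naive DFT, and the 16-sample test frame are illustrative only and are not taken from the Julius source.

```python
import cmath
import math


def dft(frame):
    """Naive DFT of a real frame, returning bins 0..N/2 (illustrative only)."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]


def spectrum(frame, use_power=False):
    """Return the power spectrum if use_power is set, else the magnitude spectrum."""
    bins = dft(frame)
    if use_power:
        return [abs(c) ** 2 for c in bins]  # |X[k]|^2
    return [abs(c) for c in bins]           # |X[k]|


# Hypothetical test frame: a sine wave that falls exactly on bin 2 of a 16-point DFT.
frame = [math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
magnitude = spectrum(frame)
power = spectrum(frame, use_power=True)
print(round(magnitude[2], 6), round(power[2], 6))  # 8.0 64.0
```

Every power value is the square of the corresponding magnitude value, so the choice changes the dynamic range of what is passed on to the later MFCC stages rather than which bins carry energy.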
Compatibility issues:
Julian was merged into Julius. Usage is unchanged: just replace julian with julius.
Word graph output is now a run-time option (-lattice).
Short-pause segmentation is now a run-time option (-spsegment). Also, the pause model list can be specified with the -pausemodels option.
Multi-path mode is integrated: Julius automatically switches to multi-path mode when the AM requires it.
Module mode extended: new outputs like <STARTRECOG> and <ENDRECOG>, new commands like GRAMINFO, and many commands for manipulating each recognition process in multi-model recognition.
Dictionary now allows the output string in the second column to be omitted. When omitted, Julius uses the LM entry string (first column) as the output string. This is the same format as HTK.
Dictionary allows double quotes to be used to quote the LM string.
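The two dictionary changes above can be sketched as a small parser. This is a simplified illustration, not Julius's actual dictionary reader. It assumes that the optional output string is written in square brackets and that the remaining fields are the phone sequence. The function name and the sample entries are hypothetical.

```python
def parse_dict_line(line):
    """Split one dictionary line into (lm_entry, output, phones).

    Simplified illustration: the output string is assumed to be enclosed in
    square brackets, and everything after it is treated as the phone sequence.
    """
    rest = line.strip()

    # First column: the LM entry, optionally wrapped in double quotes.
    if rest.startswith('"'):
        end = rest.index('"', 1)
        lm_entry = rest[1:end]
        rest = rest[end + 1:].lstrip()
    else:
        parts = rest.split(None, 1)
        lm_entry = parts[0]
        rest = parts[1] if len(parts) > 1 else ""

    # Second column: optional output string; fall back to the LM entry.
    if rest.startswith("["):
        end = rest.index("]")
        output = rest[1:end]
        rest = rest[end + 1:]
    else:
        output = lm_entry

    return lm_entry, output, rest.split()


print(parse_dict_line("hello [HELLO] hh ax l ow"))  # explicit output string
print(parse_dict_line("hello hh ax l ow"))          # output omitted
print(parse_dict_line('"it\'s" ih t s'))            # quoted LM entry
```

In the second and third calls the returned output string equals the LM entry, which is the fallback behavior described above.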
New features:
Multi-model recognition (-AM, -LM, -SR, -AM_GMM, -inactive)
Output each recognition result to a separate file (-outfile)
Log to file instead of stdout, or stop log (-logfile / -nolog)
Allow environment variable in jconf file ("$VARNAME")
Down sampling from 48kHz to 16kHz (-48)
Environment variable to set delay time in adin device: LATENCY_MSEC
Environment variable to specify capture device name in ALSA: ALSADEV
Rejection based on average power (-powerthres, --enable-power-reject)
GMM-based VAD (--enable-gmm-vad, -gmmmargin, -gmmup, -gmmdown)
Decoder-based VAD (--enable-decoder-vad, -spdelay)
Can specify list of silence models for short-pause segmentation decision (-pausemodels)
Support N-gram longer than 4-gram
Support recognition with forward only or backward only N-gram.
Initial support for user-defined LM
Support isolated word recognition using only dictionary (-w, -wlist, -wsil)
Confusion network output (-confnet)
Speed up by 20% to 40%, greatly reduced memory access, many fixes on Windows.
Grammar tools added: dfa_minimize, dfa_determinize, and another tool, slf2dfa, available on the Web.
Extended support for MFCC extraction: full parameter settings, MAP-CMN and online energy coef.
Can read MFCC parameter setting from HTK Config file, and can embed the parameters into binary HMM file.
Input rejection based on GMM.
Word lattice output.
Multi-grammar recognition: -multigramout, -gram, -gramlist
Character set conversion on output: -charconv
Change input audio device via environment variable "AUDIODEV"
Now use integrated zlib library to expand gzipped files.
Integrate all variants of Julius (Linux / Windows / Multi-path ...) into one source tree, and support for compilation with MinGW.
Almost full Doxygen documentation of the source code.