The Quantitative Imaging Files




AbbotDemo: Continuous Speech Recognition Available by FTP




AbbotDemo is a near-real-time, speaker-independent continuous speech
recognition system for American- and British-accented English.  The
vocabulary size of this demonstration system is 5,000 words.

The system uses a hybrid recurrent network/hidden Markov model
acoustic model and a trigram language model.  The recurrent network
(which contains around 100,000 weights) was trained as a phone
probability estimator using back-propagation through time.
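To illustrate what a trigram language model computes, here is a minimal
sketch in Python.  It is not AbbotDemo's language model: the function
names, the sentence padding symbols and the add-k smoothing are all
assumptions made for the example, since the announcement does not say
how the trigram probabilities were estimated.

```python
from collections import Counter


def train_trigram(sentences):
    """Count trigrams and their two-word contexts.

    `sentences` is an iterable of word lists.  The "<s>" and "</s>"
    padding symbols are an assumption of this sketch.
    """
    trigrams, contexts, vocab = Counter(), Counter(), set()
    for words in sentences:
        padded = ["<s>", "<s>"] + list(words) + ["</s>"]
        vocab.update(padded)
        for i in range(2, len(padded)):
            context = (padded[i - 2], padded[i - 1])
            trigrams[context + (padded[i],)] += 1
            contexts[context] += 1
    return trigrams, contexts, vocab


def trigram_prob(model, w1, w2, w3, k=1.0):
    """Estimate P(w3 | w1, w2) with add-k smoothing (assumed, not Abbot's)."""
    trigrams, contexts, vocab = model
    return (trigrams[(w1, w2, w3)] + k) / (contexts[(w1, w2)] + k * len(vocab))
```

A recogniser would combine such word probabilities with the acoustic
model's scores when ranking candidate word sequences; a real 5,000-word
system would need far more training text and a better smoothing scheme
than this toy version.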

It is available by FTP from svr-ftp.eng.cam.ac.uk in the directory
/pub/comp.speech/binaries.  The file AbbotDemo.README gives more
information on this system, and the remaining files provide the
executables for various flavours of UNIX (Linux, SunOS4, HP-UX, IRIX).
A 16-bit soundcard and a reasonable microphone are required.  The Linux
version is also available by FTP from sunsite.unc.edu (and mirrors) in
the directory /pub/Linux/apps/sound/speech.  Sorry, but at this stage no
sources and only limited documentation are provided.
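For anyone who would rather script the download than use an FTP client
by hand, the following Python sketch fetches one file by anonymous FTP.
The helper names parse_ftp_url and fetch_file are made up for this
example, anonymous login is assumed, and the servers named above may no
longer be reachable.

```python
from ftplib import FTP
from urllib.parse import urlparse


def parse_ftp_url(url):
    """Split an ftp:// URL into (host, directory)."""
    parts = urlparse(url)
    if parts.scheme != "ftp" or not parts.hostname:
        raise ValueError("not an ftp:// URL: %r" % (url,))
    # Drop the trailing slash so the path can be passed to cwd().
    return parts.hostname, parts.path.rstrip("/") or "/"


def fetch_file(url, filename, dest_path, timeout=30):
    """Download `filename` from the FTP directory `url` into `dest_path`.

    Assumes the server accepts anonymous logins.
    """
    host, directory = parse_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()  # anonymous login
        ftp.cwd(directory)
        with open(dest_path, "wb") as out:
            ftp.retrbinary("RETR " + filename, out.write)
```

Calling fetch_file with the first directory URL above and the file name
AbbotDemo.README would, if the server is still up, save the README
locally.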

Although the task domain is focused on noise-free read speech from a
North American business newspaper (e.g. the Wall Street Journal), we hope
that this system provides a fair representation of the state of the art
in large vocabulary speech recognition and that it will encourage the
creation of novel applications.

Tony Robinson (Cambridge University)
Mike Hochberg (Cambridge University)
Steve Renals  (Sheffield University)
and many, many more.

AbbotDemo URLs:
ftp://svr-ftp.eng.cam.ac.uk/pub/comp.speech/binaries/
ftp://sunsite.unc.edu/pub/Linux/apps/sound/speech/



Maintained by Bob Duin, e-mail: duin@ph.tn.tudelft.nl

Last update: February 17, 2004
