- IVR (interactive voice response) systems are those in which a caller hears a menu of options and is then
asked to press a key on their telephone keypad to select one, e.g. "Press 1 for your balance, press 2 to pay a bill...".
This is a less natural interaction than speech recognition, but we are happy to provide applications
that use it. In particular, it can be useful as a fallback mechanism where there is too much background noise, or where you
don't want the caller to say information such as their password out loud.
- Multi Modality. Multi-modal applications are those in which the user interacts in more than one mode.
An example would be Blackjack on the internet. The user can call in using a PC phone (for example,
Skype) as well as logging on to a web site. They can then interact via keyboard and mouse or by speaking. We have
capability in this area and welcome any requirements for applications requiring multi-modality.
- Dictation Systems. A dictation system is one in which a computer attempts to produce a written
transcription of a user's speech. To get good accuracy, users must spend considerable time and effort
training the system. We do not offer dictation systems. Our speech recognition technology, on the other hand,
is intended to work for any caller with no training.
- Voice Recognition or Speaker Recognition, as opposed to Speech Recognition, is an alternative technology
in which the computer tries to recognise who is speaking, rather than what they are saying. This can
be used when logging on to a speech recognition application, for example, but there are questions regarding
its reliability, so we tend to recommend other approaches.
- DTMF or "dual-tone multi-frequency" is the signal generated when you press a key
on your telephone keypad during an IVR interaction.
- ASR is short for "automatic speech recognition" (or "automatic speech recogniser"). Most speech recognition
software packages are a combination of ASR and TTS.
- TTS. A "text to speech" engine allows text to be read out loud in a
synthesized voice. We use recorded human speech where possible, but may use TTS when reading out unpredictable text, such
as that in an email, or for speed of turnaround in our demonstrators.
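
To make the IVR and DTMF entries above more concrete, here is a minimal sketch in Python. It assumes the standard keypad tone assignment (four row frequencies and four column frequencies, one of each per key). The menu contents and the names `MENU`, `dtmf_pair` and `route_keypress` are hypothetical, chosen only for illustration; they do not describe any particular product.

```python
# Standard DTMF keypad layout: each key sends one row tone plus one column tone.
ROW_HZ = (697, 770, 852, 941)        # low-frequency group, one per keypad row
COL_HZ = (1209, 1336, 1477, 1633)    # high-frequency group, one per keypad column
KEYPAD = ("123A", "456B", "789C", "*0#D")

# Hypothetical IVR menu: which action each key selects.
MENU = {"1": "balance", "2": "pay_bill"}


def dtmf_pair(key):
    """Return the (row_hz, col_hz) tone pair sent when `key` is pressed."""
    if len(key) != 1:
        raise ValueError("expected a single keypad character")
    for row_index, row in enumerate(KEYPAD):
        col_index = row.find(key)
        if col_index != -1:
            return ROW_HZ[row_index], COL_HZ[col_index]
    raise ValueError(f"not a DTMF key: {key!r}")


def route_keypress(key):
    """Map a detected key to a menu action; unknown keys replay the menu."""
    return MENU.get(key, "repeat_menu")
```

For example, `dtmf_pair("5")` gives the pair 770 Hz and 1336 Hz, and `route_keypress("2")` selects the hypothetical "pay_bill" action. A real system would also need to detect the tones in the incoming audio, which this sketch leaves out.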