Status: General Availability
Platform:
The RecoMadeEasy® AudioVisual Recognition (AV) System
is an award-winning engine developed entirely by Recognition
Technologies, Inc., capable of conducting speaker recognition,
face recognition, and speech recognition. It
currently runs on Linux, Mac, and Windows operating systems. The speaker
identification and verification (SIV) system is fully integrated with our
IVR system, which is
compatible with
Dialogic®
telephony T1 and E1 cards as well as their analog cards.
It may also be run in a stand-alone environment independent of our
IVR system in a telephony or non-telephony setting.
This is a state-of-the-art language-independent and
text-independent speaker recognition system which has been developed
to work in different environments. Large-Scale and Small-Scale
versions of this speaker identification and speaker verification
(SIV) engine have been developed over many years of research to work
in telephony as well as stand-alone environments. This speaker
biometric engine may be customized to fit your exact needs, including
special modifications to fit the operating environment in which
your related applications run. Our staff has been actively
involved in defining speaker recognition (speaker biometric)
standards in the VoiceXML and ANSI communities by providing
detailed consultation to the VoiceXML and M1 committees involved
in defining the speaker verification and identification standards.
Capabilities
The RecoMadeEasy® SIV system operates in six different
modalities:
- Speaker Identification (Open-Set and Closed-Set)
The speaker enrolls his voice with the system. The system trains for
this and other speakers' voices. Once the speaker returns, the system
only has to listen to the speaker and will be able to identify the
speaker's voice among the trained voices it has in the database. The
identification process returns an ID for the speaker. There are two
different identification approaches. The simpler one is called
Closed-Set Identification in which case the ID of the closest voice in
the database is returned. In this case, if the speaker is not in the
database, the returned ID may be wrong, since the closest
voice in the database is always picked. The more sophisticated (but harder)
approach is called Open-Set Identification, where the speaker is either
tagged with an ID from the database or, if the speaker has not been
enrolled in the database, rejected as not enrolled.
Our SIV engine supports both Open-Set and Closed-Set approaches.
- Speaker Verification
In this modality, again, the speaker has to enroll his voice. Once the
enrollment process is done (recording of about 30 seconds of speech and
obtaining a positive ID of the speaker), the speaker is added to the
database. When the speaker returns, he makes a claim of his identity.
He will also speak for a few seconds and the speaker's voice is matched
against the database. His identity is either authenticated or he is
rejected as an impostor. It is important to note that there are two
possible sources of error: (1) false acceptance and (2) false rejection.
A false acceptance error happens when an impostor is mistakenly
authenticated; a false rejection error happens when a legitimate speaker
is mistakenly rejected. The false acceptance rate is the number to
minimize in more security-conscious applications. There is a trade-off
between false acceptance and false rejection: reducing the false
acceptance rate makes the security tighter, which naturally increases
the number of false rejections. False rejections become annoying to
legitimate users if they are not limited.
- Speaker Classification and Event Detection
This modality of the engine may be used to classify speakers into
groups such as male, female, and child speakers. Language detection
may also be viewed as classification. Age group and many other
categories may also be used to perform speaker classification. This may
also be used to classify or detect events such as beeps, speech, horn,
auto noise, background noise, etc.
- Speaker Detection
This would be the case where a speaker is already enrolled in the
database and we would be trying to find the speaker among recordings or
in a live conversation.
- Speaker Tracking
In this case, a speaker's voice is tracked through the conversation, and
the tracking makes sure the speaker stays on the line.
- Speaker Segmentation
This would be used to segment the speech between two or more speakers in
a conversation.
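The difference between Closed-Set Identification, Open-Set Identification,
and Speaker Verification, and the trade-off between false acceptance and
false rejection, can be illustrated with a small sketch. This is not the
RecoMadeEasy® API. It assumes, purely for illustration, that each voice has
already been reduced to a short numeric vector and that two voices are
compared with cosine similarity; a production engine uses far richer
statistical models. All function names, speaker IDs, vectors, and scores
below are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed voice models)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify_closed_set(probe, enrolled):
    """Always return the ID of the closest enrolled voice, however poor the match."""
    return max(enrolled, key=lambda sid: cosine(probe, enrolled[sid]))

def identify_open_set(probe, enrolled, threshold):
    """Return the closest ID, or None (not enrolled) if the best score is too low."""
    best = identify_closed_set(probe, enrolled)
    return best if cosine(probe, enrolled[best]) >= threshold else None

def verify(probe, claimed_id, enrolled, threshold):
    """Accept or reject an identity claim by scoring against the claimed voice only."""
    return cosine(probe, enrolled[claimed_id]) >= threshold

def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false acceptance rate, false rejection rate) at a given threshold."""
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    return far, frr

# Hypothetical enrolled voices and trial scores, invented for this example.
enrolled = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
stranger = [0.6, 0.8]           # a speaker who never enrolled
genuine = [0.95, 0.90, 0.70, 0.85]   # scores of true speakers against their own models
impostor = [0.80, 0.40, 0.60, 0.30]  # scores of impostors against claimed models

if __name__ == "__main__":
    print("closed-set:", identify_closed_set(stranger, enrolled))
    print("open-set:  ", identify_open_set(stranger, enrolled, threshold=0.9))
    for t in (0.5, 0.75, 0.9):
        far, frr = error_rates(genuine, impostor, t)
        print(f"threshold={t}: FAR={far:.2f} FRR={frr:.2f}")
```

In this toy data, the closed-set call tags the unenrolled speaker with the
closest enrolled ID, while the open-set call rejects the same speaker as
not enrolled. Raising the threshold from 0.5 to 0.9 lowers the false
acceptance rate from one half to zero and raises the false rejection rate
from zero to one half, which is the trade-off described under Speaker
Verification.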
The Engine May be Used in the Following Ways
- Standalone engine which may be run through the use of
command lines and system calls.
- Standalone engine which may be used through a very simple
C++ SDK and API. This would be most useful for integrating
the engine into current products and IVR systems.
- As a module of our RecoMadeEasy® IVR system.
- As a web service using our servers.
- As a web service using your own servers.
Supported Audio Interface
The following interfaces are natively supported. However, the speaker
recognition engine may be used with any audio interface as long as
the audio is passed to the engine through third-party software such
as your own IVR system or recording program. The engine may be used
in many different scenarios such as a web service, C++ API, and
command-line interface.
- All Dialogic JCT cards (T1 and Analog)
- Microphone devices
- Audio File Access
Supported Operating Systems
The speaker recognition engine is available for the following
operating systems. The C++ SDK, command-line interface, and web
services may be used on any of the following systems:
Microsoft Windows
Apple Macintosh
Linux (both 32-bit and 64-bit versions are supported)
- CentOS 6.3 Linux (New)
- CentOS 6.2 Linux
- CentOS 5.7 Linux
- CentOS 5.6 Linux
- CentOS 5.4 Linux
- Fedora 20 Linux (New)
- Fedora 19 Linux
- Fedora 18 Linux
- Fedora 17 Linux
- Fedora 16 Linux
- Fedora 15 Linux
- Fedora 14 Linux
- Fedora 13 Linux
- Fedora 12 Linux
- Fedora 11 Linux
- Fedora 10 Linux
- Fedora 9 Linux
- Fedora 8 Linux
- Fedora 7 Linux
- Fedora 6 Linux
- Fedora Core 5 Linux
- Fedora Core 4 Linux
- Fedora Core 3 Linux
- Fedora Core 2 Linux
- Fedora Core Linux
- N.B.: May be made available for other Unix-Like systems upon request
Supported Operating Systems -- Telephony
If you are interested in running the speaker recognition engine natively
as a module inside our IVR system using a telephone interface, then
only the following operating systems are supported, because the
Linux version of the Dialogic drivers supports only these
systems. However, if you have your own IVR system, the extended
list of operating systems under "Supported Operating Systems" above
applies to your system.
- CentOS 5.7 Linux (New)
- Fedora Core 5 Linux
- Fedora Core Linux
An evaluation account for the hosted version of
the RecoMadeEasy® AudioVisual
Recognition software may be made available to interested
organizations.
For further information please contact us at 1-800-215-0841 inside the
U.S. or +1-914-997-5676 from any other country. Alternatively, you may
send an email to
info@recotechnologies.com.