Glossary

Name	Definition
ANI	ANI (Automatic Number Identification) is a service that provides the receiver of a telephone call with the number of the calling phone. The method of providing this information is determined by the service provider. The information can be useful for accounting and billing purposes, or to route the call to a specific contact center group.
Confidence Score	An estimate of the probability that the result returned by the speech engine matches what a speaker said. The higher the score, the more likely the engine's result is correct.
CIF	CIF (Common Intermediate Format), also known as FCIF (Full Common Intermediate Format), is a format used to standardize the horizontal and vertical resolutions, in pixels, of YCbCr sequences in video signals. It defines a 352 × 288 pixel frame and is commonly used in video teleconferencing systems. It was first proposed in the H.261 standard.
DID	DID (Direct Inward Dialing) is a feature that allows callers to reach a PBX extension directly, without an operator’s assistance.
DNIS	DNIS (Dialed Number Identification Service) data identifies which telephone number was dialed. A PBX often receives calls on the same port that were dialed to different 800 or 900 numbers, and the DNIS data contains the dialed number so that the PBX can track the call.
DTMF	DTMF (Dual-tone Multi-frequency) refers to the tones produced by pressing keys on a telephone. DTMF, also called Touch-Tone, is often used as a way of sending data to IVRs. DTMF assigns each key a unique pair of frequencies, one from a low group and one from a high group, so that the key can easily be identified by a monitoring microprocessor. The detected tone pair is then translated into a usable analog or digital signal.
Grammar	A grammar (in the context of speech recognition) is a file that contains a list of words and phrases to be recognized by a speech application. Grammars may also contain bits of programming logic to aid the application. All of the active grammar words make up the vocabulary.
IVR	IVR (Interactive Voice Response) is an automated system that allows callers to interact with a computer using a telephone (or VoIP). An IVR may use speech recognition, DTMF, or a combination of the two.
PSTN	PSTN (Public Switched Telephone Network) is the circuit-switched telephone network operated by telephone carriers. In network diagrams it is usually drawn as the “cloud.” The term is also used generically for any non-dedicated service, such as dial-up analog service.
MRCP	Media Resource Control Protocol (MRCP) is a communication protocol used by speech servers to provide various services (such as speech recognition and speech synthesis) to their clients. MRCP relies on another protocol, such as Real Time Streaming Protocol (RTSP) or Session Initiation Protocol (SIP) for establishing a control session and audio streams between the client and the server.
PBX	A private branch exchange (PBX) is a telephone exchange that serves a particular business or office, as opposed to one that a common carrier or telephone company operates for many businesses or for the general public.
QCIF	QCIF means “Quarter CIF”. To have one fourth of the area, as “quarter” implies, the height and width of the CIF frame are each halved, giving 176 × 144 pixels.
RTMP	Real Time Messaging Protocol (RTMP) was initially a proprietary protocol developed by Macromedia for streaming audio, video and data over the Internet, between a Flash player and a server. Macromedia is now owned by Adobe, which has released an incomplete version of the specification of the protocol for public use.
RTSP	Real Time Streaming Protocol (RTSP) is a network control protocol designed for use in entertainment and communications systems to control streaming media servers. The protocol is used for establishing and controlling media sessions between end points. Clients of media servers issue VCR-like commands, such as play and pause, to facilitate real-time control of playback of media files from the server.
Speech Server	A Speech Server is a piece of software that runs the speech application. It follows the logic of the application, collects spoken audio, passes the audio to the speech engine, and passes the recognition results back to the application.
SIP	The Session Initiation Protocol (SIP) is a signaling communications protocol, widely used for controlling multimedia communication sessions such as voice and video calls over Internet Protocol (IP) networks.
TDM	Time-division multiplexing (TDM) is a method of transmitting and receiving independent signals over a common signal path by means of synchronized switches at each end of the transmission line, so that each signal appears on the line for only a fraction of the time, in an alternating pattern. This form of signal multiplexing was developed in telecommunications for telegraphy systems in the late 1800s, but found its most common application in digital telephony in the second half of the 20th century.
ASR	ASR (Automatic Speech Recognition) is the process by which a computer speech engine recognizes human speech.
TTS	TTS (Text-to-Speech) software converts text into voice output using speech synthesis engines.
Vocabulary	The total list of words a speech engine compares an utterance against. The vocabulary is made up of all the words in all active grammars. The utterance is the word or phrase spoken by the caller.
VoiceXML	Voice Extensible Markup Language (VoiceXML, also written VXML) is a markup language for coding voice applications, designed with many of the same architectural components as HTML. VoiceXML platforms connect to a combination of speech recognition engines, text-to-speech synthesis, video, telephony interfaces and VoiceXML Interpreter software to process the call.
VoIP	VoIP is an acronym for Voice over IP. This technology enables voice to be delivered using the Internet Protocol. In general, this means sending voice information in digital form in discrete packets rather than over the traditional circuit-committed protocols of the public switched telephone network.
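
Example: DTMF key lookup. The DTMF entry above says that each key produces its own tone so that it can be identified. As a minimal sketch of that idea, the Python below maps each key of the standard 4 × 4 keypad to its pair of frequencies and back again. The frequency values are the standard DTMF low and high groups; the names ROW_HZ, COL_HZ, KEYPAD, dtmf_pair and key_for_pair are hypothetical and chosen only for this illustration. It does not generate or detect audio.

```python
# Standard DTMF frequency groups in Hz: one low-group (row) tone and
# one high-group (column) tone sound together for each key.
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477, 1633)

# Keypad layout, one string per row; columns line up with COL_HZ.
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_pair(key):
    """Return the (low_hz, high_hz) pair for a single keypad character."""
    if len(key) != 1:
        raise ValueError("expected a single key, got %r" % (key,))
    for row_hz, keys in zip(ROW_HZ, KEYPAD):
        col = keys.find(key)
        if col != -1:
            return row_hz, COL_HZ[col]
    raise ValueError("not a DTMF key: %r" % (key,))


def key_for_pair(low_hz, high_hz):
    """Return the keypad character identified by a (low, high) tone pair."""
    try:
        row = ROW_HZ.index(low_hz)
        col = COL_HZ.index(high_hz)
    except ValueError:
        raise ValueError("not a DTMF tone pair: %r" % ((low_hz, high_hz),))
    return KEYPAD[row][col]
```

For instance, dtmf_pair("5") looks up row "456B" and the second column, and key_for_pair reverses that lookup, which is roughly what a receiving IVR does once it has detected the two tones.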