Name | Definition |
---|---|
ANI | ANI (Automatic Number Identification) is a service that provides the receiver of a telephone call with the number of the calling phone. The method of providing this information is determined by the service provider. ANI data is useful for accounting and billing purposes, or for routing a call to a specific contact center group. |
Confidence Score | An estimate, returned by the speech engine, of the probability that its recognition result matches what the speaker said. The higher the score, the more likely the engine's result is correct. |
CIF | CIF (Common Intermediate Format), also known as FCIF (Full Common Intermediate Format), is a format that standardizes the horizontal and vertical resolutions, in pixels, of YCbCr sequences in video signals. It defines a frame of 352 × 288 pixels and is commonly used in video teleconferencing systems. It was first proposed in the H.261 standard. |
DID | DID (Direct Inward Dialing) is a feature that allows callers to reach a PBX extension directly, without an operator’s assistance. |
DNIS | DNIS (Dialed Number Identification Service) data identifies which telephone number was dialed. A PBX often receives calls on the same port that were dialed to different 800 or 900 numbers, and the DNIS data contains the dialed number so that the PBX can track the call. |
DTMF | DTMF (Dual-Tone Multi-Frequency) signaling refers to the tones produced by pressing keys on a telephone. DTMF, also called Touch-Tone, is often used as a way of sending data to IVRs. DTMF assigns a unique pair of frequencies (one low, one high) to each key so that it can easily be identified by a monitoring microprocessor. The detected tone is then translated into a usable analog or digital signal. |
Grammar | A grammar (in the context of speech recognition) is a file that contains a list of words and phrases to be recognized by a speech application. Grammars may also contain bits of programming logic to aid the application. All of the active grammar words make up the vocabulary. |
IVR | IVR (Interactive Voice Response) is an automated system that allows callers to interact with a computer using a telephone (or VoIP). An IVR may use speech recognition, DTMF, or a combination of the two. |
PSTN | PSTN (Public Switched Telephone Network) is the worldwide circuit-switched telephone network, usually drawn as the “cloud” in network diagrams. The term is also used generically for any non-dedicated service, such as dial-up analog service. |
MRCP | Media Resource Control Protocol (MRCP) is a communication protocol used by speech servers to provide various services (such as speech recognition and speech synthesis) to their clients. MRCP relies on another protocol, such as Real Time Streaming Protocol (RTSP) or Session Initiation Protocol (SIP), to establish a control session and audio streams between the client and the server. |
PBX | A private branch exchange (PBX) is a telephone exchange that serves a particular business or office, as opposed to one that a common carrier or telephone company operates for many businesses or for the general public. |
QCIF | QCIF (Quarter CIF) is a video format with one fourth the frame area of CIF. As “quarter” implies, the height and width of the CIF frame are each halved, giving 176 × 144 pixels. |
RTMP | Real Time Messaging Protocol (RTMP) was initially a proprietary protocol developed by Macromedia for streaming audio, video and data over the Internet, between a Flash player and a server. Macromedia is now owned by Adobe, which has released an incomplete version of the specification of the protocol for public use. |
RTSP | Real Time Streaming Protocol (RTSP) is a network control protocol designed for use in entertainment and communications systems to control streaming media servers. The protocol is used for establishing and controlling media sessions between end points. Clients of media servers issue VCR-like commands, such as play and pause, to facilitate real-time control of playback of media files from the server. |
Speech Server | A Speech Server is a piece of software that runs the speech application. It follows the logic of the application, collects spoken audio, passes the audio to the speech engine, and passes the recognition results back to the application. |
SIP | The Session Initiation Protocol (SIP) is a signaling communications protocol, widely used for controlling multimedia communication sessions such as voice and video calls over Internet Protocol (IP) networks. |
TDM | Time-division multiplexing (TDM) is a method of transmitting and receiving independent signals over a common signal path by means of synchronized switches at each end of the transmission line, so that each signal appears on the line only a fraction of the time, in an alternating pattern. This form of signal multiplexing was developed in telecommunications for telegraphy systems in the late 1800s, but found its most common application in digital telephony in the second half of the 20th century. |
ASR | ASR (Automatic Speech Recognition) is the process by which a computer speech engine recognizes human speech. |
TTS | TTS (Text-to-Speech) software converts text into voice output using speech synthesis engines. |
Vocabulary | The complete list of words that a speech engine compares an utterance against. The vocabulary is made up of all the words in all active grammars. The utterance is the word or phrase spoken by the caller. |
VoiceXML | Voice Extensible Markup Language (VoiceXML, or VXML) is a markup language for writing voice applications, with many of the same architectural components as HTML. VoiceXML platforms connect to a combination of speech recognition engines, text-to-speech synthesis, video, telephony interfaces and VoiceXML Interpreter software to process the call. |
VoIP | VoIP (Voice over IP) is a technology that delivers voice using the Internet Protocol. In general, this means sending voice information in digital form in discrete packets rather than over the traditional circuit-committed protocols of the public switched telephone network. |
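
To make the DTMF entry in the table above more concrete, the following minimal sketch maps a telephone key to its tone frequencies and synthesizes the tone. The frequency grid is the standard DTMF assignment. The names `dtmf_frequencies` and `dtmf_samples`, and the default 8000 Hz sample rate and 0.1 s duration, are assumptions chosen for illustration, not part of any particular IVR product.

```python
import math

# Standard DTMF grid (Hz): every key in a row shares a low tone,
# every key in a column shares a high tone.
LOW_FREQS = (697, 770, 852, 941)
HIGH_FREQS = (1209, 1336, 1477, 1633)
KEYPAD = (
    "123A",
    "456B",
    "789C",
    "*0#D",
)


def dtmf_frequencies(key):
    """Return the (low, high) frequency pair in Hz for one DTMF key."""
    key = str(key).upper()
    if len(key) != 1:
        raise ValueError("expected a single key, got %r" % key)
    for row, keys in enumerate(KEYPAD):
        col = keys.find(key)
        if col != -1:
            return LOW_FREQS[row], HIGH_FREQS[col]
    raise ValueError("not a DTMF key: %r" % key)


def dtmf_samples(key, duration_s=0.1, sample_rate=8000):
    """Synthesize a key press as the sum of its two sine tones.

    Returns a list of floats in the range [-1.0, 1.0].
    The duration and sample rate defaults are illustrative assumptions.
    """
    low, high = dtmf_frequencies(key)
    count = round(duration_s * sample_rate)
    samples = []
    for n in range(count):
        t = n / sample_rate
        value = 0.5 * (math.sin(2 * math.pi * low * t)
                       + math.sin(2 * math.pi * high * t))
        samples.append(value)
    return samples


if __name__ == "__main__":
    print(dtmf_frequencies("5"))  # (770, 1336)
```

A detector on the receiving side works in the opposite direction: it looks for exactly one strong low tone and one strong high tone, then looks up the key at that row and column.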