Say
The say command sends synthesized speech to the remote party. The text provided can be either plain text or use SSML tags. Zerpia supports a large number of speech vendors out of the box (see list below), and you can add others via the custom speech API.
{
  "verb": "say",
  "text": "hi there!",
  "synthesizer": {
    "vendor": "google",
    "language": "en-US"
  }
}
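Because the text may contain SSML, a prompt can carry pauses or other markup. The following is a minimal sketch, assuming the selected vendor accepts the standard SSML speak and break elements; support for individual tags varies by vendor, so treat the markup as illustrative.

```json
{
  "verb": "say",
  "text": "<speak>Thanks for calling.<break time='500ms'/>Please stay on the line.</speak>",
  "synthesizer": {
    "vendor": "google",
    "language": "en-US"
  }
}
```

The attribute inside the SSML uses single quotes so that the markup does not need escaping inside the JSON string.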
You can use the following attributes with the say command:
Option | Description | Required |
---|---|---|
text | Text to speak; may contain SSML tags. | Yes |
synthesizer.vendor | Speech vendor to use (see list below, along with any others you add via the custom speech API). | No |
synthesizer.language | Language code to use. | No |
synthesizer.fallbackVendor | Fallback speech vendor to use (see list below, along with any others you add via the custom speech API). | No |
synthesizer.fallbackLanguage | Fallback language code to use. | No |
synthesizer.gender | (Google only) MALE, FEMALE, or NEUTRAL. | No |
synthesizer.voice | Voice to use. The available voices differ by vendor (for example, AWS and Google have different voice lists). Defaults to the application setting, if provided. | No |
loop | Number of times to repeat the text; 0 means repeat forever. Defaults to 1. | No |
earlyMedia | If true and the call has not yet been answered, play the audio without answering the call. Defaults to false. | No |
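
As an illustration of how these options combine, the sketch below sets a fallback vendor, repeats the prompt twice, and plays it as early media before the call is answered. The vendor identifier "aws" is an assumption here (only "google" appears in the example earlier on this page), so check which identifiers your account actually accepts.

```json
{
  "verb": "say",
  "text": "Please hold while we connect your call.",
  "synthesizer": {
    "vendor": "google",
    "language": "en-US",
    "gender": "FEMALE",
    "fallbackVendor": "aws",
    "fallbackLanguage": "en-US"
  },
  "loop": 2,
  "earlyMedia": true
}
```

Note that loop and earlyMedia sit at the top level of the verb, while the vendor, language, gender, and fallback settings are nested under synthesizer, matching the option names in the table.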
Text-to-Speech Vendors
Zerpia natively supports the following text-to-speech services:
- AWS
- Azure
- Deepgram
- ElevenLabs
- Google
- IBM
- Nuance
- NVIDIA
- WellSaid
- Whisper
Note: In addition to the hosted Microsoft (Azure) service, Microsoft supports on-premises and private link options for deploying its speech service.