Say

The say command sends synthesized speech to the remote party. The text can be plain text or can include SSML tags. Zerpia supports a large number of speech vendors out of the box (see list below), and you can add others via the custom speech API.

```json
{
  "verb": "say",
  "text": "hi there!",
  "synthesizer": {
    "vendor": "google",
    "language": "en-US"
  }
}
```
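
Because `text` may contain SSML, the same verb can carry markup for pauses or spelled-out characters. The following is a minimal sketch rather than a definitive payload: the `<speak>`, `<say-as>`, and `<break>` elements are standard SSML, but which tags are honored depends on the speech vendor, and wrapping the text in `<speak>` is an assumption here. The confirmation code is made up for illustration. Double quotes inside the SSML must be escaped because the markup sits inside a JSON string.

```json
{
  "verb": "say",
  "text": "<speak>Your confirmation code is <say-as interpret-as=\"characters\">A7K2</say-as>.<break time=\"500ms\"/>Goodbye.</speak>",
  "synthesizer": {
    "vendor": "google",
    "language": "en-US"
  }
}
```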

You can use the following attributes with the say command:

Option	Description	Required
`text`	Text to speak; may contain SSML tags.	Yes
`synthesizer.vendor`	Speech vendor to use (see list below, along with any others you add via the custom speech API).	No
`synthesizer.language`	Language code to use.	No
`synthesizer.fallbackVendor`	Fallback speech vendor to use (see list below, along with any others you add via the custom speech API).	No
`synthesizer.fallbackLanguage`	Fallback language code to use.	No
`synthesizer.gender`	(Google only) `MALE`, `FEMALE`, or `NEUTRAL`.	No
`synthesizer.voice`	Voice to use. The list of available voices depends on the vendor (for example, AWS and Google offer different voices). Defaults to the application setting, if provided.	No
`loop`	Number of times to repeat the text; 0 means repeat forever. Defaults to 1.	No
`earlyMedia`	If `true` and the call has not yet been answered, play the audio without answering the call. Defaults to `false`.	No
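
The options in the table above can be combined in a single say command. The sketch below plays a prompt twice as early media and names a fallback vendor in case the primary one fails. It assumes that `"aws"` is the vendor identifier for AWS; only `"google"` appears in the example earlier on this page, so check the identifier against your own account before relying on it.

```json
{
  "verb": "say",
  "text": "Please hold while we connect your call.",
  "synthesizer": {
    "vendor": "google",
    "language": "en-US",
    "gender": "FEMALE",
    "fallbackVendor": "aws",
    "fallbackLanguage": "en-US"
  },
  "loop": 2,
  "earlyMedia": true
}
```

With `earlyMedia` set to `true`, the prompt is played without answering the call if the call has not yet been answered. `gender` is included because the primary vendor here is Google; per the table, it applies to Google only.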

—

Text-to-Speech Vendors

Zerpia natively supports the following text-to-speech services:

AWS
Azure
Deepgram
ElevenLabs
Google
IBM
Nuance
NVIDIA
WellSaid
Whisper

Note: In addition to the hosted Microsoft (Azure) speech service, Microsoft supports on-prem and private link options for deploying the speech service.
