Quotas and limits in SpeechKit

Yandex SpeechKit is subject to two types of constraints:

  • Quotas are organizational constraints that can be changed by technical support on request.
  • Limits are technical constraints due to the Yandex Cloud architecture. Limits cannot be changed.

If you need more resources, contact support and specify which quotas you want increased and by how much.

Quotas

Type of limit                                               Value

Streaming speech recognition
  Requests per second                                       40
Synchronous recognition
  Requests per second                                       20
Asynchronous recognition
  Recognition requests per hour                             500
  Operation status check requests via API v2 per hour       2,500
  Operation status check requests via API v3 per second     5
  Billable hours of audio per day [1]                       10,000
Queries accessing an LLM
  Concurrent queries accessing generative text models       2
Speech synthesis
  Requests per second                                       40

[1] The first recognition request triggers the start of the time count.
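
A client can avoid exceeding a per-second quota by throttling its own requests. The following is a minimal sketch of a sliding-window limiter, not part of SpeechKit itself: the class name `SlidingWindowLimiter`, its parameters, and the constant `SYNC_RECOGNITION_RPS` are hypothetical, and only the value 20 comes from the quota table above. It makes no network calls; the idea is to call `acquire()` before each request.

```python
import time
from collections import deque

# Value taken from the quota table (synchronous recognition, requests per second).
SYNC_RECOGNITION_RPS = 20


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any window of `period` seconds.

    Hypothetical helper for illustration; `clock` and `sleep` are injectable
    so the behavior can be checked without real waiting.
    """

    def __init__(self, max_calls, period=1.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # timestamps of calls still inside the window

    def acquire(self):
        """Block until one more call fits in the window, then record it."""
        while True:
            now = self._clock()
            # Drop timestamps that have left the window.
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self.period - (now - self._calls[0]))


# Usage (assumed pattern): create one limiter per quota and call
# limiter.acquire() immediately before sending each request.
limiter = SlidingWindowLimiter(SYNC_RECOGNITION_RPS)
```

A limiter like this only smooths traffic from a single process. If several clients share the same quota, their combined rate can still exceed it, so handling of rejected requests is still needed.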

Limits

Type of limit                                                         Value

Streaming speech recognition
  Maximum duration of transmitted audio for entire session            5 minutes
  Maximum size of transmitted audio data                              10 MB
  Maximum number of audio channels                                    1
Synchronous recognition
  Maximum file size                                                   1 MB
  Maximum duration of audio                                           30 seconds
  Maximum number of audio channels                                    1
Asynchronous recognition
  Maximum file size when uploading to a bucket                        1 GB
  Maximum file size when uploading in an API v3 request body          60 MB
  Maximum duration of audio                                           4 hours
  Period for storing recognition results on the server                3 days
Speech synthesis
  Minimum duration of a pattern for synthesis                         1 second
  Maximum request size for the API v1                                 5,000 characters
  Maximum request size for the API v3                                 250 characters and 24 seconds
  Maximum request size for the API v3 in unsafe mode                  5,000 characters
  Maximum request size for the API v3 in streaming mode               5,000 characters
Queries accessing an LLM
  Number of instructions for the generative text model per session    16
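
Because limits cannot be raised, it can help to check audio on the client before sending it. The sketch below tests a file's properties against the synchronous recognition limits from the table above. The names `AudioInfo` and `sync_limit_violations` are hypothetical, the caller is assumed to already know the file's size, duration, and channel count, and "1 MB" is read conservatively as 1,000,000 bytes, which is an assumption rather than a documented definition.

```python
from dataclasses import dataclass

# Values copied from the "Synchronous recognition" rows of the limits table.
MAX_FILE_SIZE_BYTES = 1_000_000  # "1 MB"; assumed to mean 10**6 bytes
MAX_DURATION_SECONDS = 30
MAX_CHANNELS = 1


@dataclass
class AudioInfo:
    """Properties of an audio file, gathered by the caller (hypothetical type)."""
    size_bytes: int
    duration_seconds: float
    channels: int


def sync_limit_violations(audio):
    """Return a list of human-readable problems; an empty list means the audio fits."""
    problems = []
    if audio.size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append(
            f"file size {audio.size_bytes} B exceeds {MAX_FILE_SIZE_BYTES} B"
        )
    if audio.duration_seconds > MAX_DURATION_SECONDS:
        problems.append(
            f"duration {audio.duration_seconds} s exceeds {MAX_DURATION_SECONDS} s"
        )
    if audio.channels > MAX_CHANNELS:
        problems.append(
            f"{audio.channels} channels exceeds the maximum of {MAX_CHANNELS}"
        )
    return problems
```

Audio that fails such a check may still fit the asynchronous recognition limits listed in the same table, which allow larger files and longer recordings.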