Quotas and limits in SpeechKit

Yandex SpeechKit is subject to two types of restrictions:

Quotas are organizational constraints that can be changed by technical support on request.
Limits are technical constraints due to the Yandex Cloud architecture. Limits cannot be changed.

If you need more resources, contact support and specify which quotas you want increased and by how much.

Quotas

Type of limit	Value
Streaming speech recognition
Requests per second	40
Synchronous recognition
Requests per second	20
Asynchronous recognition
Recognition requests per hour	500
Operation status check requests via API v2 per hour	2,500
Operation status check requests via API v3 per second	5
Billable hours of audio per day¹	10,000
Requests querying LLMs
Concurrent requests querying generative text models	2
Speech synthesis
Requests per second	40

¹ The day is counted from the moment of the first recognition request.
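
A client can avoid hitting the per-second quotas by pacing its own requests. The sketch below is one possible approach, not part of SpeechKit: `RateLimiter`, `QUOTAS_RPS`, and the key names are hypothetical, and the numbers are copied from the table above. It spaces calls at least 1/rate seconds apart, which only approximates however the service counts requests on its side.

```python
import threading
import time


class RateLimiter:
    """Spaces calls at least 1/rate seconds apart (hypothetical helper)."""

    def __init__(self, rate, clock=time.monotonic, sleep=time.sleep):
        self._interval = 1.0 / rate
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_slot = clock()

    def acquire(self):
        """Reserve the next free slot, wait for it, and return the delay."""
        with self._lock:
            now = self._clock()
            slot = max(now, self._next_slot)
            self._next_slot = slot + self._interval
        delay = slot - now
        if delay > 0:
            # Sleep outside the lock so other threads can reserve later slots.
            self._sleep(delay)
        return delay


# Quota values from the table above; update them if support raises a quota.
QUOTAS_RPS = {
    "streaming_recognition": 40,
    "sync_recognition": 20,
    "async_status_check_v3": 5,
    "synthesis": 40,
}

limiters = {name: RateLimiter(rps) for name, rps in QUOTAS_RPS.items()}
```

Calling `limiters["sync_recognition"].acquire()` before each synchronous recognition request keeps a single process near 20 requests per second. Several processes sharing one quota would need a shared limiter, which this sketch does not cover.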

Limits

Type of limit	Value
Streaming speech recognition
Maximum duration of transmitted audio for the entire session	5 minutes
Maximum size of transmitted audio data	10 MB
Maximum number of audio channels	1
Synchronous recognition
Maximum file size	1 MB
Maximum duration of audio	30 seconds
Maximum number of audio channels	1
Asynchronous recognition
Maximum file size when uploading to a bucket	1 GB
Maximum file size when uploading in an API v3 request body	60 MB
Maximum duration of audio	4 hours
Period for storing recognition results on the server	3 days
Speech synthesis
Minimum duration of a pattern for synthesis	1 second
Maximum request size for the API v1	5,000 characters
Maximum request size for the API v3	250 characters and 24 seconds
Maximum request size for the API v3 in unsafe mode	5,000 characters
Maximum request size for the API v3 in streaming mode	5,000 characters
Requests querying LLMs
Number of instructions for the generative text model per session	16
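
The recognition limits above can be checked before sending audio. The sketch below suggests a recognition mode that fits a given file. The function names and the returned labels are hypothetical, the values are copied from the table, and reading "MB" and "GB" as decimal units is an assumption chosen because it is the stricter interpretation.

```python
import os
import wave

# Assumption: decimal units, the stricter reading of the table's "MB"/"GB".
MB = 10**6
GB = 10**9


def describe_wav(path):
    """Return (size_bytes, duration_seconds, channels) for a WAV file."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        duration_s = wav.getnframes() / wav.getframerate()
    return os.path.getsize(path), duration_s, channels


def pick_recognition_mode(size_bytes, duration_s, channels):
    """Return a label for the first mode whose limits the audio fits, or None."""
    mono = channels == 1
    # Synchronous recognition: 1 MB, 30 seconds, 1 channel.
    if mono and size_bytes <= 1 * MB and duration_s <= 30:
        return "synchronous"
    # Streaming recognition: 10 MB and 5 minutes per session, 1 channel.
    if mono and size_bytes <= 10 * MB and duration_s <= 5 * 60:
        return "streaming"
    # Asynchronous recognition: 4 hours; 60 MB in an API v3 request body,
    # 1 GB when the file is uploaded to a bucket.
    if duration_s <= 4 * 3600:
        if size_bytes <= 60 * MB:
            return "asynchronous-inline"
        if size_bytes <= 1 * GB:
            return "asynchronous-bucket"
    return None
```

For example, `pick_recognition_mode(*describe_wav("call.wav"))` returns `None` for a five-hour recording, which would have to be split before recognition. The check covers only the limits listed in the table, not audio format or encoding requirements.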