Synchronous recognition API

With the synchronous recognition API, you can transcribe prepared audio files with the following characteristics:

  • Maximum file size: 1 MB
  • Maximum duration: 30 seconds
  • Maximum number of audio channels: 1
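
The limits above can be checked locally before a request is sent. The following is a minimal sketch for raw LPCM audio, not part of the API. It assumes 16-bit mono samples and that 1 MB means 1,048,576 bytes; the helper name check_lpcm is hypothetical.

```python
MAX_BYTES = 1024 * 1024  # assumption: 1 MB is counted as 1,048,576 bytes
MAX_SECONDS = 30


def check_lpcm(audio: bytes, sample_rate_hertz: int = 48000, sample_width: int = 2) -> list:
    """Return a list of limit violations for raw mono LPCM audio (empty if none)."""
    problems = []
    if len(audio) > MAX_BYTES:
        problems.append(f"file size {len(audio)} bytes exceeds {MAX_BYTES}")
    # Mono audio: one sample of sample_width bytes per tick (16-bit is an assumption).
    duration = len(audio) / (sample_rate_hertz * sample_width)
    if duration > MAX_SECONDS:
        problems.append(f"duration {duration:.2f} s exceeds {MAX_SECONDS} s")
    return problems
```

Under these assumptions, the size limit is reached first at 48 kHz (about 10.9 seconds of audio), while at 16 kHz and 8 kHz the 30-second limit is reached first.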

The synchronous recognition service is located at stt.api.cloud.yandex.net/speech/v1/stt:recognize

Query parameters

Each parameter is listed with its type, followed by its description.

lang string
  Recognition language.
  See the model description for acceptable values. The default value is ru-RU (Russian).

topic string
  Language model to use for recognition.
  The more closely the model matches the audio, the better the recognition result. You can only specify one model per request.
  Acceptable values depend on the selected language. The default value is general.

profanityFilter boolean
  Controls the profanity filter in recognized speech.
  Valid values:
  • false (default): Profanities are not excluded from the recognition results.
  • true: Profanities are excluded from the recognition results.

rawResults boolean
  Controls how numbers are written: true spells them out as words, false (default) writes them as digits.

format string
  Audio format.
  Acceptable values:

sampleRateHertz string
  Audio sampling frequency.
  Used if format equals lpcm. Valid values:
  • 48000 (default): 48 kHz.
  • 16000: 16 kHz.
  • 8000: 8 kHz.

folderId string
  ID of the folder you have access to. Required for authorization with a user account (see the Authentication with the SpeechKit API resource). Do not use this field if you make a request on behalf of a service account.
  The maximum string length is 50 characters.

Request body parameters

The request body has to contain the binary content of an audio file.
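
As an illustration, a request could be assembled as follows. This is a sketch under stated assumptions rather than an official client: the HTTPS scheme, the POST method, and the Authorization: Bearer header are assumptions not stated on this page, and build_recognize_request is a hypothetical helper name. The function only builds the request and does not send it.

```python
import urllib.parse
import urllib.request

# Assumption: the endpoint from this page, served over HTTPS.
STT_URL = "https://stt.api.cloud.yandex.net/speech/v1/stt:recognize"


def build_recognize_request(audio: bytes, iam_token: str, **params) -> urllib.request.Request:
    """Build (but do not send) a synchronous recognition request."""
    # Booleans are written as lowercase true/false, matching the parameter list.
    query = urllib.parse.urlencode(
        {k: str(v).lower() if isinstance(v, bool) else v for k, v in params.items()}
    )
    url = f"{STT_URL}?{query}" if query else STT_URL
    return urllib.request.Request(
        url,
        data=audio,  # the request body is the binary content of the audio file
        headers={"Authorization": f"Bearer {iam_token}"},  # assumed auth scheme
        method="POST",  # assumed HTTP method
    )
```

For example, build_recognize_request(audio, token, lang="ru-RU", format="lpcm", sampleRateHertz=16000) places the query parameters in the URL and the audio bytes in the body. The resulting object can then be passed to urllib.request.urlopen.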

Response

The recognized text is returned in the result field of the response.

{
  "result": <recognized_text>
}

For more information about the response format and codes, see Response status codes.
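
Reading the recognized text from a response body might look like the sketch below. It assumes the body is a JSON object as shown above; extract_result is a hypothetical helper name, and raising ValueError when the field is missing is a design choice of this example, not documented API behavior.

```python
import json


def extract_result(body: bytes) -> str:
    """Return the recognized text from a JSON response body."""
    payload = json.loads(body)
    if "result" not in payload:
        # The body does not have the shape described above.
        raise ValueError(f"no 'result' field in response: {payload!r}")
    return payload["result"]
```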

Use cases