Supported languages and recognition models

A recognition model is a model trained to recognize speech in a particular language. The models are trained on datasets generated by Yandex services and applications. This allows us to keep improving the speech recognition quality.

The general model is the main supported model for each recognition type. It recognizes speech on any topic in a given language, including short and long utterances, names, addresses, dates, and numbers.

Version tags

Three versions of the general model may be available at the same time. Select the one you need by tag:

general: Main version.
general:rc: Release candidate version available for testing.
general:deprecated: Previous version.
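
As a minimal sketch of how these tags could be selected in client code, the helper below maps a version label to the corresponding tag string. The function name `general_model_tag` and the labels `main`, `rc`, and `deprecated` are hypothetical and used only for illustration; only the resulting tag strings come from the list above.

```python
# Hypothetical helper: maps a version label to a general model tag.
# Only the tag strings themselves come from the list above.
_GENERAL_TAGS = {
    "main": "general",
    "rc": "general:rc",
    "deprecated": "general:deprecated",
}


def general_model_tag(version: str = "main") -> str:
    """Return the general model tag for the given version label."""
    try:
        return _GENERAL_TAGS[version.strip().lower()]
    except KeyError:
        raise ValueError(f"unknown version label: {version!r}") from None
```

Keeping the mapping in one place means a switch from the release candidate back to the main version is a single-argument change rather than a string edit scattered across the code.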

Note

Support for a general:deprecated version ends as new models are released. After the general tag version is updated, SpeechKit guarantees two weeks of support for the previous version. For the list of updates, see the Yandex SpeechKit release notes: Speech recognition.

You can also use the deferred-general tag for asynchronous recognition. Learn more about asynchronous recognition modes.

Supported recognition languages

Use a recognition language code from the table below. All code values are case-insensitive.

Code	Language
`auto`	Automatic language recognition
`de-DE`	German
`en-US`	English
`es-ES`	Spanish
`fi-FI`	Finnish
`fr-FR`	French
`he-IL`	Hebrew
`it-IT`	Italian
`kk-KZ`	Kazakh
`nl-NL`	Dutch
`pl-PL`	Polish
`pt-PT`	Portuguese
`pt-BR`	Brazilian Portuguese
`ru-RU`	Russian (default)
`sv-SE`	Swedish
`tr-TR`	Turkish
`uz-UZ`	Uzbek (Latin script)
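
Because the codes are case-insensitive, client code may want to normalize user input before building a request. The sketch below shows one way to do that. `normalize_language_code` is a hypothetical helper, not part of SpeechKit, and the set of codes is copied from the table above, so it would need updating if the table changes.

```python
# Codes copied from the table above; the helper itself is hypothetical.
SUPPORTED_CODES = (
    "auto", "de-DE", "en-US", "es-ES", "fi-FI", "fr-FR", "he-IL", "it-IT",
    "kk-KZ", "nl-NL", "pl-PL", "pt-PT", "pt-BR", "ru-RU", "sv-SE", "tr-TR",
    "uz-UZ",
)

# Lookup from the lowercase form to the spelling used in the table.
_CANONICAL = {code.lower(): code for code in SUPPORTED_CODES}


def normalize_language_code(code: str) -> str:
    """Return the table spelling of a supported code, ignoring case."""
    key = code.strip().lower()
    if key not in _CANONICAL:
        raise ValueError(f"unsupported language code: {code!r}")
    return _CANONICAL[key]
```

For example, `normalize_language_code("EN-us")` returns `"en-US"`, while an unknown code raises `ValueError` before any request is sent.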

Automatic language detection

Note

Language detection and language labels are available only in API v3.

SpeechKit automatically detects the language in each sentence during speech recognition.

To use automatic language detection, specify auto in the language_code field of the LanguageRestrictionOptions message:

Python 3

language_restriction=stt_pb2.LanguageRestrictionOptions(
    restriction_type=stt_pb2.LanguageRestrictionOptions.WHITELIST,
    language_code=['auto']
)

Along with the recognition results, the service returns language labels, each containing a language code and the probability that the language was detected correctly:

language_code: "ru-RU" probability: 0.91582357883453369
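
To illustrate how a client might act on such labels, the sketch below picks the most probable label and falls back to a default when confidence is low. The `LanguageLabel` dataclass is a stand-in that mirrors the two fields shown above, not the actual response type; the 0.5 threshold and the `ru-RU` fallback are arbitrary assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class LanguageLabel:
    """Stand-in for a returned label; mirrors the two fields shown above."""
    language_code: str
    probability: float


def pick_language(labels: Iterable[LanguageLabel],
                  threshold: float = 0.5,
                  fallback: str = "ru-RU") -> str:
    """Return the most probable language code, or the fallback if unsure."""
    # default=None covers the case where no labels were returned at all.
    best = max(labels, key=lambda label: label.probability, default=None)
    if best is None or best.probability < threshold:
        return fallback
    return best.language_code
```

A threshold like this is a design choice for the caller: a stricter value trades more fallbacks for fewer misdetected languages.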

If a sentence contains words in different languages, the language may be detected incorrectly. To make recognition more accurate, replace auto with a list of expected languages as a hint for the model. For example:

Python 3

...
    language_code=['en-US', 'es-ES', 'fr-FR']
...

The language is detected for each phrase. If a sentence has phrases in different languages, all of them will most likely be transcribed in the same language.

Examples

Text in audio	Transcript
Xiaomi is a Chinese brand	shumi is a chinese brand
Привет is hi in Russian	privet is hi in russian
Men koʻchada sayr qilishni va muzqaymoq isteʼmol qilishni yaxshi koʻraman, I like to take a walk outside and have some ice cream	Men koʻchada sayr qilishni va muzqaymoq isteʼmol qilishni yaxshi koʻraman, I like to take a walk outside and have some ice cream
