Common instance models
Yandex Cloud AI Studio provides access to large generative models from different vendors. If out-of-the-box models are not enough, you can fine-tune some of them for more accurate responses. All roles required for working with the models are listed in Access management in Yandex Cloud AI Studio.
In a common instance, model resources are shared among all Yandex Cloud users, so response times may increase under heavy load. Other users have no access to the context of your exchanges with the model: even with logging enabled, requests are stored in anonymized form with potentially sensitive information masked. Nevertheless, we recommend disabling data logging whenever you use our models to process sensitive information.
Common instance models are subject to the update rules described in Model lifecycle. Fine-tuned models share usage quotas with their base models.
| Model and URI | Context | Available APIs |
|---------------|---------|----------------|
| Alice AI LLM | 32k | Text generation APIs, OpenAI-compatible APIs |
| YandexGPT Pro 5.1 | 32k | Text generation APIs, OpenAI-compatible APIs |
| YandexGPT Pro 5 | 32k | Text generation APIs, OpenAI-compatible APIs |
| YandexGPT Lite 5 | 32k | Text generation APIs, OpenAI-compatible APIs |
| DeepSeek V3.2 | 128k | OpenAI-compatible APIs |
| Qwen3 235B | 256k | OpenAI-compatible APIs |
| gpt-oss-120b | 128k | OpenAI-compatible APIs |
| gpt-oss-20b | 128k | OpenAI-compatible APIs |
| Fine-tuned YandexGPT Lite | 32k | Text generation APIs, OpenAI-compatible APIs |
| Qwen3.5 35B | 256k | OpenAI-compatible APIs |
| Gemma 3 27B | 128k | OpenAI-compatible APIs |
| YandexART | 500 characters | Image generation APIs |
| Realtime | 32k | Realtime API |
Gemma 3 27B works with Base64-encoded images. The model can process images of any aspect ratio thanks to its adaptive algorithm that scales the image to 896 pixels on the longer side while preserving important visual details. Each image uses 256 context tokens.
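Since the model accepts Base64-encoded images, a request message can be assembled as below. This is a minimal sketch: the `image_url` content-part layout follows the OpenAI chat format, and the `image/jpeg` MIME type in the data URL is an assumption to match to your actual image format.

```python
import base64

def build_image_message(image_bytes: bytes, prompt: str) -> dict:
    """Assemble an OpenAI-style chat message with a Base64-encoded image."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                # MIME type is an assumption; match it to your image format.
                "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
            },
        ],
    }

msg = build_image_message(b"\xff\xd8\xff", "Describe this image.")
```

Remember that each attached image consumes 256 tokens of the model's 128k context.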
Model lifecycle
Each base instance model has its own URI that uniquely identifies the model family and version. The URI stays with the model until the model is decommissioned. If major changes are made, a new model version is published separately and gets its own URI.
AI Studio announces the decommissioning of model versions in advance through release notes, the user community, and direct mailings, so you have time to update your products. Until the end of the support period, the version operates normally; after that, all requests sent to the obsolete URI return a 400 Bad Request error. There is no automatic switching between versions.
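Model URIs follow the `gpt://<folder_ID>/<model>/<version>` pattern, so pinning a model can be reduced to a small helper. A sketch; the folder ID and the `5.1` version tag below are hypothetical, so check the model card for the real values:

```python
def model_uri(folder_id: str, model: str, version: str = "latest") -> str:
    """Compose a model URI of the form gpt://<folder_ID>/<model>/<version>."""
    return f"gpt://{folder_id}/{model}/{version}"

# Pinning an explicit version shields you from behavior changes when
# /latest is re-pointed to a newer release.
pinned = model_uri("b1gexample", "yandexgpt", "5.1")  # "5.1" is a hypothetical tag
floating = model_uri("b1gexample", "yandexgpt")       # resolves via /latest
```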
To keep applications that rely on a legacy model running, change the model's URI in your code and follow these steps:
- Review your prompt and adjust it so that the new model performs up to your expectations.
- If the quotas for your legacy model were increased, request the necessary quota increases for the new model.
- If you used a fine-tuned model, fine-tune the new model version again.
Plan your migration to a new model well in advance: testing the new model version and fine-tuning it again may take time.
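During a migration window, an application can try the legacy URI first and switch to the new one once the legacy URI starts returning 400 Bad Request. A client-agnostic sketch; the `is_decommissioned` predicate is hypothetical and would inspect your HTTP client's error for status 400:

```python
from typing import Any, Callable

def call_with_fallback(
    call: Callable[[str], Any],
    legacy_uri: str,
    new_uri: str,
    is_decommissioned: Callable[[Exception], bool],
) -> Any:
    """Try the legacy model URI; if it was decommissioned, retry with the new one."""
    try:
        return call(legacy_uri)
    except Exception as err:
        if is_decommissioned(err):  # e.g. your HTTP client reports status 400
            return call(new_uri)
        raise
```

This only bridges the gap: the steps above (prompt review, quotas, re-tuning) still need to happen before the support period ends.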
All models are covered by the SLA.
YandexGPT Pro 5 and YandexGPT Pro 5.1 models remain available at gpt://<folder_ID>/yandexgpt/latest and gpt://<folder_ID>/yandexgpt/rc, respectively, until the end of support; however, we recommend using their explicit URIs.
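As an illustration of the explicit-URI recommendation, a chat completion request body for the OpenAI-compatible API can be built as follows. This is a sketch: `yandexgpt/5.1` is a hypothetical explicit version path, and the mapping to the `openai` client assumes its `base_url` points at the AI Studio OpenAI-compatible endpoint.

```python
def build_chat_request(folder_id: str, model_path: str, user_text: str) -> dict:
    """Build the body of an OpenAI-compatible chat completion request."""
    return {
        "model": f"gpt://{folder_id}/{model_path}",  # explicit URI rather than /latest
        "messages": [{"role": "user", "content": user_text}],
    }

# "yandexgpt/5.1" is a hypothetical explicit version path; check the
# model card for the real one.
req = build_chat_request("b1gexample", "yandexgpt/5.1", "Hello!")
# With the official openai client, this maps to:
#   client.chat.completions.create(**req)
```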