Common instance models

Yandex Cloud AI Studio provides access to large generative models from different vendors. If out-of-the-box models are not enough, you can fine-tune some of them for more accurate responses. All roles required for working with the models are listed in Access management in Yandex Cloud AI Studio.

In a common instance, model resources are available to all Yandex Cloud users and shared between them, so model response time may increase under heavy workloads. Other users have no access to the context of your exchanges with the model: even with logging enabled, requests are stored anonymized and potentially sensitive information is masked. However, we recommend disabling data logging whenever you use the models to process sensitive information.
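
As a minimal sketch of the recommendation above, logging can be switched off per request by adding a header. The header name `x-data-logging-enabled` and the `Api-Key` authorization scheme are assumptions here and should be confirmed against the AI Studio documentation. `build_headers` is a hypothetical helper, not part of any SDK.

```python
def build_headers(api_key: str, log_data: bool = False) -> dict[str, str]:
    """Return HTTP headers for a model request.

    Logging is opted out of by default, so sensitive prompts are not
    stored unless the caller explicitly passes log_data=True.
    """
    # Assumed authorization scheme; check the AI Studio docs.
    headers = {"Authorization": f"Api-Key {api_key}"}
    if not log_data:
        # Assumed header name for disabling request logging.
        headers["x-data-logging-enabled"] = "false"
    return headers
```

Defaulting to "logging off" makes the safe behavior the one you get without thinking about it.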

Common instance models are subject to the update rules described in Model lifecycle. Fine-tuned models share usage quotas with their base models.

| Model | URI | Context | Available APIs |
|---|---|---|---|
| Alice AI LLM | gpt://<folder_ID>/aliceai-llm | 32k (32,768) | Text generation APIs, OpenAI-compatible APIs |
| YandexGPT Pro 5.1 | gpt://<folder_ID>/yandexgpt-5.1 | 32k (32,768) | Text generation APIs, OpenAI-compatible APIs |
| YandexGPT Pro 5 | gpt://<folder_ID>/yandexgpt-5-pro | 32k (32,768) | Text generation APIs, OpenAI-compatible APIs |
| YandexGPT Lite 5 | gpt://<folder_ID>/yandexgpt-5-lite | 32k (32,768) | Text generation APIs, OpenAI-compatible APIs |
| DeepSeek V3.2 | gpt://<folder_ID>/deepseek-v32 | 128k (131,072) | OpenAI-compatible APIs |
| Qwen3 235B | gpt://<folder_ID>/qwen3-235b-a22b-fp8 | 256k (262,144) | OpenAI-compatible APIs |
| gpt-oss-120b | gpt://<folder_ID>/gpt-oss-120b | 128k (131,072) | OpenAI-compatible APIs |
| gpt-oss-20b | gpt://<folder_ID>/gpt-oss-20b | 128k (131,072) | OpenAI-compatible APIs |
| Fine-tuned YandexGPT Lite | gpt://<folder_ID>/yandexgpt-lite/latest@<suffix> | 32k (32,768) | Text generation APIs, OpenAI-compatible APIs |
| Qwen3.5 35B | gpt://<folder_ID>/qwen3.5-35b-a3b-fp8 | 256k (262,144) | OpenAI-compatible APIs |
| Gemma 3 27B (Gemma Terms of Use; the model is available until May 15, 2026) | gpt://<folder_ID>/gemma-3-27b-it | 128k (131,072) | OpenAI-compatible APIs |
| YandexART | art://<folder_ID>/yandex-art-2.0 | 500 characters | Image generation APIs |
| Realtime | gpt://<folder_ID>/speech-realtime-250923 | 32k (32,768) | Realtime API |

Gemma 3 27B works with Base64-encoded images. The model can process images of any aspect ratio thanks to its adaptive algorithm that scales the image to 896 pixels on the longer side while preserving important visual details. Each image uses 256 context tokens.
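
The figures above can be turned into a small sketch for preparing an image and budgeting context. The message shape (`image_url` with a Base64 data URL) follows the common OpenAI-style convention and is an assumption about what the endpoint accepts. The rounding in `scaled_size` is illustrative arithmetic only, since the model's actual preprocessing is not specified here. All function names are hypothetical.

```python
import base64

GEMMA_TOKENS_PER_IMAGE = 256  # per the text above
GEMMA_LONGER_SIDE_PX = 896    # per the text above


def scaled_size(width: int, height: int) -> tuple[int, int]:
    """Approximate size after scaling the longer side to 896 px."""
    factor = GEMMA_LONGER_SIDE_PX / max(width, height)
    return round(width * factor), round(height * factor)


def image_part(image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Wrap raw image bytes as a Base64 data URL in an OpenAI-style
    content part (assumed request format)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def image_token_cost(image_count: int) -> int:
    """Context tokens consumed by the images alone."""
    return image_count * GEMMA_TOKENS_PER_IMAGE
```

For example, three images take 768 tokens of the 131,072-token context before any text is counted.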

Model lifecycle

Each common instance model has its own URI, which uniquely identifies the model family and version. The URI stays with the model until the model is decommissioned. If major changes are made, a new version of the model is published separately and gets a URI of its own.
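
Since the URI is the only identifier of a model version, it can help to build and validate it in one place. The sketch below assumes the `<scheme>://<folder_ID>/<model>` shape seen in the table above. `ModelUri`, `build_model_uri`, and `parse_model_uri` are hypothetical helpers, and `b1gexample` is a placeholder folder ID.

```python
from typing import NamedTuple


class ModelUri(NamedTuple):
    scheme: str     # "gpt" or "art" in the table above
    folder_id: str  # your Yandex Cloud folder ID
    model: str      # e.g. "yandexgpt-5.1"


def build_model_uri(folder_id: str, model: str, scheme: str = "gpt") -> str:
    """Assemble a model URI from its parts."""
    return f"{scheme}://{folder_id}/{model}"


def parse_model_uri(uri: str) -> ModelUri:
    """Split a model URI into scheme, folder ID, and model name."""
    scheme, sep, rest = uri.partition("://")
    folder_id, slash, model = rest.partition("/")
    if not (sep and slash and scheme and folder_id and model):
        raise ValueError(f"not a model URI: {uri!r}")
    return ModelUri(scheme, folder_id, model)


# Usage with a placeholder folder ID:
uri = build_model_uri("b1gexample", "yandexgpt-5.1")
```

The model part keeps everything after the folder ID, so a fine-tuned name that contains a slash and a suffix stays intact.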

AI Studio informs you in advance about the decommissioning of model versions through release notes, the user community, and direct mail, so that you can update your products in time. Until the end of the support period, the version operates normally. After that, all requests sent to the obsolete URI return a 400 Bad Request error. There is no automatic switching between versions.
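
Because nothing switches versions for you, it may be worth surfacing this failure explicitly in client code. The following is a sketch under the assumption that your HTTP client exposes a numeric status code. A 400 response can have other causes too, so the message only points at decommissioning as one possibility. `ModelUnavailableError` and `check_response` are hypothetical names.

```python
class ModelUnavailableError(RuntimeError):
    """Raised when the service answers 400 Bad Request."""


def check_response(status_code: int, model_uri: str) -> None:
    """Raise a descriptive error on a 400 response; do nothing otherwise."""
    if status_code == 400:
        raise ModelUnavailableError(
            f"400 Bad Request for {model_uri}: the request may be malformed, "
            "or this model version may have been decommissioned. "
            "Requests are not redirected to a newer version automatically."
        )
```

Calling `check_response` right after each request puts the model URI into logs and alerts, which shortens the time to diagnose a missed migration.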

To keep apps that use a legacy model working, change the model's URI in your code and follow these steps:

  • Check your prompt and adjust it so that the new model performs as expected.
  • If the quotas for your legacy model were increased, request the same increase for the new one.
  • If you used a fine-tuned model, repeat the fine-tuning on the new model version.

Plan your migration to a new model well in advance because it may take time to test your new model version and fine-tune it again.

All models are covered by the SLA.

YandexGPT Pro 5 and YandexGPT Pro 5.1 models remain available at gpt://<folder_ID>/yandexgpt/latest and gpt://<folder_ID>/yandexgpt/rc, respectively, until the end of support; however, we recommend using their explicit URIs.
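
A short sketch of moving from the alias URIs to the explicit ones: the mapping below is taken from the sentence above and the model table, while `to_explicit_uri` is a hypothetical helper and `b1gexample` is a placeholder folder ID.

```python
# Alias model names and their explicit equivalents.
EXPLICIT_MODEL_NAMES = {
    "yandexgpt/latest": "yandexgpt-5-pro",  # YandexGPT Pro 5
    "yandexgpt/rc": "yandexgpt-5.1",        # YandexGPT Pro 5.1
}


def to_explicit_uri(uri: str) -> str:
    """Replace an alias model name with its explicit one.

    URIs that are not aliases, or that do not look like model URIs,
    are returned unchanged.
    """
    scheme, sep, rest = uri.partition("://")
    folder_id, slash, model = rest.partition("/")
    explicit = EXPLICIT_MODEL_NAMES.get(model)
    if not (sep and slash) or explicit is None:
        return uri
    return f"{scheme}://{folder_id}/{explicit}"


# Usage with a placeholder folder ID:
migrated = to_explicit_uri("gpt://b1gexample/yandexgpt/latest")
```

Keeping the model URI in a single configuration value makes a later version change a one-line edit.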

Use cases