For scenarios for search, analysis, and summarizing information based on corporate data. Used in assistants, internal portals, analytics, and document management.

Model Gallery
A catalog of models for any task: from generating text and images to working with unstructured data. Here are models for working with text, images, and voice.
A wide range of models for different cases
Yandex’s cutting-edge developments and current open-source solutions.
Out-of-the-box inference and scaling
No need to purchase or rent hardware, DevOps teams, or complex operations.
Common interfaces and APIs
A unified way of working with models from different manufacturers, and OpenAI compatibility.
Enterprise data security
Access control, environment isolation, and enterprise compliance.
DeepSeek V4 Flash in Model Gallery
Plan actions, keep context in long chains of reasoning, and invoke tools without losing logic. It works confidently with code: generating, analyzing, and refactoring, understanding dependencies, and performing multi-step tasks.

A variety of models for your needs
Choose the right model for a specific use case, without experimentation or unnecessary complexity.

Yandex models
Open-source models

DeepSeek V4 Flash New
A model for code and agent scenarios and complex reasoning

Qwen3.6 35B
A multimodal model that works with text and images

Qwen3 235B
A model for agent scenarios and complex instructions

GPT OSS 20B и 120B
Language models for text generation, reasoning, and applied tasks
Test models right now
Playground is an interactive environment for experimenting with models. Select a model, enter your request, and see how the model responds in real time.
Models of all modalities are available: voice, images, text.

Common interfaces and standards
Use familiar SDKs and frameworks — LangChain, LlamaIndex, LangGraph — with minimal code changes. Model Gallery is compatible with the OpenAI interface and supports core developer tools.
Secure work with corporate data
Model Gallery is focused on the safe and predictable use of models, rather than on fine-tuning on customer data.
Data is not used to train models.
Accesses are controlled at the platform level.
Logging can be disabled.
Dedicated network accesses and connections between local infrastructure and the platform are available.

Pricing policy
Usage costs for models from Model Gallery depends on the model’s operating mode, the number of incoming and outgoing tokens, and the tools used. The token count in the same text may vary from one model to another.

Options for accessing models
Model Gallery supports various inference modes, for tasks of any complexity and volume.

Instant responses
For interactive scenarios like chatbots, voice interfaces, and assistants. The models respond in real time, providing live interaction without delay.

Batch processing
For tasks with large amounts of data: mass text generation, document analysis, updates to knowledge bases. Requests are processed asynchronously and scalable in response to changing loads.

Dedicated inference
For deploying models outside a shared pool. It is suitable for specific scenarios and models not available on basic instances. Optimal for tasks requiring predictable resources and load management.
Other useful services

A service for receiving Yandex search database responses in XML or HTML format. It helps organize search on a site, a group of sites, or the internet, and track the position of sites for search queries.

A computer vision service for recognizing text in images and PDF files. It supports 45+ languages and detects them automatically.

A service for integrating Yandex Translator algorithms into applications or web projects for end users. It supports 100+ languages and can translate individual words and entire texts.
Start working with Model Gallery
Try launching the first model — test responses in the console, connect the API, or train them according to the specifics of your business.

