Video Generation API is now live!

Models

Explore the active model market,from a local OpenRouter snapshot.

This page reads from a local JSON snapshot synced from OpenRouter, so the catalog stays fast, indexable, and stable. Use it to browse current model coverage by provider, modality, reasoning support, context window, and pricing metadata.

Reset

Results

Showing 48 of 683 matching models

Snapshot source: OpenRouter. Synced April 21, 2026 at 8:00 AM. Page 2 of 15.

This route is built from local JSON so the catalog stays stable for browsing and SEO. If you need a specific model on ImaRouter, treat this page as a discovery reference and then contact the team for availability.

TextReasoning

OpenAI

OpenAI: GPT-5 Codex

GPT-5-Codex is a specialized version of GPT-5 optimized for software engineering and coding workflows. It is designed for both interactive development sessions and long, independent execution of complex engineering tasks. The model supports building projects from scratch, feature development, debugging, large-scale refactoring, and code review. Compared to GPT-5, Codex is more steerable, adheres closely to developer instructions, and produces cleaner, higher-quality code outputs. Reasoning effort can be adjusted with the `reasoning.effort` parameter. Read the [docs here](https://openrouter.ai/docs/use-cases/reasoning-tokens#reasoning-effort-level) Codex integrates into developer environments including the CLI, IDE extensions, GitHub, and cloud tasks. It adapts reasoning effort dynamically—providing fast responses for small tasks while sustaining extended multi-hour runs for large projects. The model is trained to perform structured code reviews, catching critical flaws by reasoning over dependencies and validating behavior against tests. It also supports multimodal inputs such as images or screenshots for UI development and integrates tool use for search, dependency installation, and environment setup. Codex is intended specifically for agentic coding applications.

TextImage

Context

400K

Group

GPT

Pricing preview

Input Price: $1.25 /M tokens

Output Price: $10 /M tokens

Slug

openai/gpt-5-codex

Text

OpenAI

OpenAI: GPT-4o Audio

The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows the model to detect nuances within audio recordings and add depth to generated user experiences. Audio outputs are currently not supported. Audio tokens are priced at $40 per million input and $80 per million output audio tokens.

TextAudio

Context

128K

Group

GPT

Pricing preview

Input Price: $2.5 /M tokens

Output Price: $10 /M tokens

Slug

openai/gpt-4o-audio-preview

Text

OpenAI

OpenAI: GPT-5 Chat

GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conversations for enterprise applications.

TextFileImage

Context

128K

Group

GPT

Pricing preview

Input Price: $1.25 /M tokens

Output Price: $10 /M tokens

Slug

openai/gpt-5-chat

TextReasoning

Azure

OpenAI: GPT-5

GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and accuracy in high-stakes use cases. It supports test-time routing features and advanced prompt understanding, including user-specified intent like "think hard about this." Improvements include reductions in hallucination, sycophancy, and better performance in coding, writing, and health-related tasks.

TextImageFile

Context

400K

Group

GPT

Pricing preview

Input Price: $1.25 /M tokens

Output Price: $10 /M tokens

Slug

openai/gpt-5

TextReasoning

OpenAI

OpenAI: GPT-5 Mini

GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It provides the same instruction-following and safety-tuning benefits as GPT-5, but with reduced latency and cost. GPT-5 Mini is the successor to OpenAI's o4-mini model.

TextImageFile

Context

400K

Group

GPT

Pricing preview

Input Price: $0.25 /M tokens

Output Price: $2 /M tokens

Slug

openai/gpt-5-mini

TextReasoning

Azure

OpenAI: GPT-5 Nano

GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments. While limited in reasoning depth compared to its larger counterparts, it retains key instruction-following and safety features. It is the successor to GPT-4.1-nano and offers a lightweight option for cost-sensitive or real-time applications.

TextImageFile

Context

400K

Group

GPT

Pricing preview

Input Price: $0.05 /M tokens

Output Price: $0.4 /M tokens

Slug

openai/gpt-5-nano

TextReasoning

OpenAI

OpenAI: o3 Pro

The o-series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o3-pro model uses more compute to think harder and provide consistently better answers. Note that BYOK is required for this model. Set up here: https://openrouter.ai/settings/integrations

TextFileImage

Context

200K

Group

GPT

Pricing preview

Input Price: $20 /M tokens

Output Price: $80 /M tokens

Slug

openai/o3-pro

TextReasoning

Unknown provider

OpenAI: Codex Mini

codex-mini-latest is a fine-tuned version of o4-mini specifically for use in Codex CLI. For direct use in the API, we recommend starting with gpt-4.1.

TextImage

Context

200K

Group

GPT

Pricing preview

No display pricing published in the current snapshot.

Slug

openai/codex-mini

TextReasoning

OpenAI

OpenAI: o4 Mini High

OpenAI o4-mini-high is the same model as [o4-mini](/openai/o4-mini) with reasoning_effort set to high. OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-efficient performance while retaining strong multimodal and agentic capabilities. It supports tool use and demonstrates competitive reasoning and coding performance across benchmarks like AIME (99.5% with Python) and SWE-bench, outperforming its predecessor o3-mini and even approaching o3 in some domains. Despite its smaller size, o4-mini exhibits high accuracy in STEM tasks, visual problem solving (e.g., MathVista, MMMU), and code editing. It is especially well-suited for high-throughput scenarios where latency or cost is critical. Thanks to its efficient architecture and refined reinforcement learning training, o4-mini can chain tools, generate structured outputs, and solve multi-step tasks with minimal delay—often in under a minute.

TextImageFile

Context

200K

Group

GPT

Pricing preview

Input Price: $1.1 /M tokens

Output Price: $4.4 /M tokens

Slug

openai/o4-mini-high

TextReasoning

OpenAI

OpenAI: o3

o3 is a well-rounded and powerful model across domains. It sets a new standard for math, science, coding, and visual reasoning tasks. It also excels at technical writing and instruction-following. Use it to think through multi-step problems that involve analysis across text, code, and images.

TextImageFile

Context

200K

Group

GPT

Pricing preview

Input Price: $2 /M tokens

Output Price: $8 /M tokens

Slug

openai/o3

TextReasoning

OpenAI

OpenAI: o4 Mini

OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-efficient performance while retaining strong multimodal and agentic capabilities. It supports tool use and demonstrates competitive reasoning and coding performance across benchmarks like AIME (99.5% with Python) and SWE-bench, outperforming its predecessor o3-mini and even approaching o3 in some domains. Despite its smaller size, o4-mini exhibits high accuracy in STEM tasks, visual problem solving (e.g., MathVista, MMMU), and code editing. It is especially well-suited for high-throughput scenarios where latency or cost is critical. Thanks to its efficient architecture and refined reinforcement learning training, o4-mini can chain tools, generate structured outputs, and solve multi-step tasks with minimal delay—often in under a minute.

TextImageFile

Context

200K

Group

GPT

Pricing preview

Input Price: $1.1 /M tokens

Output Price: $4.4 /M tokens

Slug

openai/o4-mini

Text

Azure

OpenAI: GPT-4.1

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.

TextImageFile

Context

1M

Group

GPT

Pricing preview

Input Price: $2 /M tokens

Output Price: $8 /M tokens

Slug

openai/gpt-4.1

Text

OpenAI

OpenAI: GPT-4.1 Mini

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard instruction evals, 35.8% on MultiChallenge, and 84.1% on IFEval. Mini also shows strong coding ability (e.g., 31.6% on Aider’s polyglot diff benchmark) and vision understanding, making it suitable for interactive applications with tight performance constraints.

TextImageFile

Context

1M

Group

GPT

Pricing preview

Input Price: $0.4 /M tokens

Output Price: $1.6 /M tokens

Slug

openai/gpt-4.1-mini

Text

OpenAI

OpenAI: GPT-4.1 Nano

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million token context window, and scores 80.1% on MMLU, 50.3% on GPQA, and 9.8% on Aider polyglot coding – even higher than GPT‑4o mini. It’s ideal for tasks like classification or autocompletion.

TextImageFile

Context

1M

Group

GPT

Pricing preview

Input Price: $0.1 /M tokens

Output Price: $0.4 /M tokens

Slug

openai/gpt-4.1-nano

TextReasoning

OpenAI

OpenAI: o1-pro

The o1 series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o1-pro model uses more compute to think harder and provide consistently better answers.

TextImageFile

Context

200K

Group

GPT

Pricing preview

Input Price: $15 /M tokens

Output Price: $6 /M tokens

Slug

openai/o1-pro

Text

OpenAI

OpenAI: GPT-4o-mini Search Preview

GPT-4o mini Search Preview is a specialized model for web search in Chat Completions. It is trained to understand and execute web search queries.

Text

Context

128K

Group

GPT

Pricing preview

Input Price: $0.15 /M tokens

Output Price: $0.6 /M tokens

Slug

openai/gpt-4o-mini-search-preview

Text

OpenAIAdapter

OpenAI: GPT-4o Search Preview

GPT-4o Search Previewis a specialized model for web search in Chat Completions. It is trained to understand and execute web search queries.

Text

Context

128K

Group

GPT

Pricing preview

Input Price: $2.5 /M tokens

Output Price: $10 /M tokens

Slug

openai/gpt-4o-search-preview

Text

Unknown provider

OpenAI: GPT-4.5 (Preview)

GPT-4.5 (Preview) is a research preview of OpenAI’s latest language model, designed to advance capabilities in reasoning, creativity, and multi-turn conversation. It builds on previous iterations with improvements in world knowledge, contextual coherence, and the ability to follow user intent more effectively. The model demonstrates enhanced performance in tasks that require open-ended thinking, problem-solving, and communication. Early testing suggests it is better at generating nuanced responses, maintaining long-context coherence, and reducing hallucinations compared to earlier versions. This research preview is intended to help evaluate GPT-4.5’s strengths and limitations in real-world use cases as OpenAI continues to refine and develop future models. Read more at the [blog post here.](https://openai.com/index/introducing-gpt-4-5/)

TextImage

Context

128K

Group

GPT

Pricing preview

No display pricing published in the current snapshot.

Slug

openai/gpt-4.5-preview

TextReasoning

OpenAI

OpenAI: o3 Mini High

OpenAI o3-mini-high is the same model as [o3-mini](/openai/o3-mini) with reasoning_effort set to high. o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particularly excelling in science, mathematics, and coding. The model features three adjustable reasoning effort levels and supports key developer capabilities including function calling, structured outputs, and streaming, though it does not include vision processing capabilities. The model demonstrates significant improvements over its predecessor, with expert testers preferring its responses 56% of the time and noting a 39% reduction in major errors on complex questions. With medium reasoning effort settings, o3-mini matches the performance of the larger o1 model on challenging reasoning evaluations like AIME and GPQA, while maintaining lower latency and cost.

TextFile

Context

200K

Group

GPT

Pricing preview

Input Price: $1.1 /M tokens

Output Price: $4.4 /M tokens

Slug

openai/o3-mini-high

TextReasoning

OpenAI

OpenAI: o3 Mini

OpenAI o3-mini is a cost-efficient language model optimized for STEM reasoning tasks, particularly excelling in science, mathematics, and coding. This model supports the `reasoning_effort` parameter, which can be set to "high", "medium", or "low" to control the thinking time of the model. The default is "medium". OpenRouter also offers the model slug `openai/o3-mini-high` to default the parameter to "high". The model features three adjustable reasoning effort levels and supports key developer capabilities including function calling, structured outputs, and streaming, though it does not include vision processing capabilities. The model demonstrates significant improvements over its predecessor, with expert testers preferring its responses 56% of the time and noting a 39% reduction in major errors on complex questions. With medium reasoning effort settings, o3-mini matches the performance of the larger o1 model on challenging reasoning evaluations like AIME and GPQA, while maintaining lower latency and cost.

TextFile

Context

200K

Group

GPT

Pricing preview

Input Price: $1.1 /M tokens

Output Price: $4.4 /M tokens

Slug

openai/o3-mini

TextReasoning

OpenAI

OpenAI: o1

The latest and strongest model family from OpenAI, o1 is designed to spend more time thinking before responding. The o1 model series is trained with large-scale reinforcement learning to reason using chain of thought. The o1 models are optimized for math, science, programming, and other STEM-related tasks. They consistently exhibit PhD-level accuracy on benchmarks in physics, chemistry, and biology. Learn more in the [launch announcement](https://openai.com/o1).

TextImageFile

Context

200K

Group

GPT

Pricing preview

Input Price: $15 /M tokens

Output Price: $60 /M tokens

Slug

openai/o1

Text

OpenAI

OpenAI: GPT-4o (2024-11-20)

The 2024-11-20 version of GPT-4o offers a leveled-up creative writing ability with more natural, engaging, and tailored writing to improve relevance & readability. It’s also better at working with uploaded files, providing deeper insights & more thorough responses. GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of [GPT-4 Turbo](/models/openai/gpt-4-turbo) while being twice as fast and 50% more cost-effective. GPT-4o also offers improved performance in processing non-English languages and enhanced visual capabilities.

TextImageFile

Context

128K

Group

GPT

Pricing preview

Input Price: $2.5 /M tokens

Output Price: $10 /M tokens

Slug

openai/gpt-4o-2024-11-20

TextReasoning

Google Vertex

Anthropic: Claude Opus 4.1

Claude Opus 4.1 is an updated version of Anthropic’s flagship model, offering improved performance in coding, reasoning, and agentic tasks. It achieves 74.5% on SWE-bench Verified and shows notable gains in multi-file code refactoring, debugging precision, and detail-oriented reasoning. The model supports extended thinking up to 64K tokens and is optimized for tasks involving research, data analysis, and tool-assisted reasoning.

TextImageFile

Context

200K

Group

Claude

Pricing preview

Input Price: $15 /M tokens

Output Price: $75 /M tokens

Slug

anthropic/claude-opus-4.1

TextReasoning

Google Vertex (Global)

Anthropic: Claude Sonnet 4

Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with improved precision and controllability. Achieving state-of-the-art performance on SWE-bench (72.7%), Sonnet 4 balances capability and computational efficiency, making it suitable for a broad range of applications from routine coding tasks to complex software development projects. Key enhancements include improved autonomous codebase navigation, reduced error rates in agent-driven workflows, and increased reliability in following intricate instructions. Sonnet 4 is optimized for practical everyday use, providing advanced reasoning capabilities while maintaining efficiency and responsiveness in diverse internal and external scenarios. Read more at the [blog post here](https://www.anthropic.com/news/claude-4)

TextImageFile

Context

1M

Group

Claude

Pricing preview

Input Price: $3 /M tokens

Output Price: $15 /M tokens

Slug

anthropic/claude-sonnet-4

TextReasoning

Amazon Bedrock

Anthropic: Claude 3.7 Sonnet

Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. It introduces a hybrid reasoning approach, allowing users to choose between rapid responses and extended, step-by-step processing for complex tasks. The model demonstrates notable improvements in coding, particularly in front-end development and full-stack updates, and excels in agentic workflows, where it can autonomously navigate multi-step processes. Claude 3.7 Sonnet maintains performance parity with its predecessor in standard mode while offering an extended reasoning mode for enhanced accuracy in math, coding, and instruction-following tasks. Read more at the [blog post here](https://www.anthropic.com/news/claude-3-7-sonnet)

TextImageFile

Context

200K

Group

Claude

Pricing preview

Input Price: $3 /M tokens

Output Price: $15 /M tokens

Slug

anthropic/claude-3.7-sonnet

TextReasoning

Google Vertex

Anthropic: Claude 3.7 Sonnet (thinking)

Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. It introduces a hybrid reasoning approach, allowing users to choose between rapid responses and extended, step-by-step processing for complex tasks. The model demonstrates notable improvements in coding, particularly in front-end development and full-stack updates, and excels in agentic workflows, where it can autonomously navigate multi-step processes. Claude 3.7 Sonnet maintains performance parity with its predecessor in standard mode while offering an extended reasoning mode for enhanced accuracy in math, coding, and instruction-following tasks. Read more at the [blog post here](https://www.anthropic.com/news/claude-3-7-sonnet)

TextImageFile

Context

200K

Group

Claude

Pricing preview

Input Price: $3 /M tokens

Output Price: $15 /M tokens

Slug

anthropic/claude-3.7-sonnet

TextReasoning

Alibaba Cloud Int.

Qwen: Qwen3.6 Plus

Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts routing, enabling strong scalability and high-performance inference. Compared to the 3.5 series, it delivers major gains in agentic coding, front-end development, and overall reasoning, with a significantly improved “vibe coding” experience. The model excels at complex tasks such as 3D scenes, games, and repository-level problem solving, achieving a 78.8 score on SWE-bench Verified. It represents a substantial leap in both pure-text and multimodal capabilities, performing at the level of leading state-of-the-art models.

TextImageVideo

Context

1M

Group

Qwen3

Pricing preview

Input Price: $0.325 /M tokens

Output Price: $1.95 /M tokens

Slug

qwen/qwen3.6-plus

TextReasoning

io.net

Z.ai: GLM 5.1

GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. Unlike previous models built around minute-level interactions, GLM-5.1 can work independently and continuously on a single task for more than 8 hours, autonomously planning, executing, and improving itself throughout the process, ultimately delivering complete, engineering-grade results.

Text

Context

202.8K

Group

Other

Pricing preview

Input Price: $0.698 /M tokens

Output Price: $4.4 /M tokens

Slug

z-ai/glm-5.1

TextReasoning

Xiaomi

Xiaomi: MiMo-V2-Pro

MiMo-V2-Pro is Xiaomi's flagship foundation model, featuring over 1T total parameters and a 1M context length, deeply optimized for agentic scenarios. It is highly adaptable to general agent frameworks like OpenClaw. It ranks among the global top tier in the standard PinchBench and ClawBench benchmarks, with perceived performance approaching that of Opus 4.6. MiMo-V2-Pro is designed to serve as the brain of agent systems, orchestrating complex workflows, driving production engineering tasks, and delivering results reliably.

Text

Context

1M

Group

Other

Pricing preview

Input Price: $1 /M tokens

Output Price: $3 /M tokens

Slug

xiaomi/mimo-v2-pro

TextReasoning

Unknown provider

xAI: Grok 4.20 Beta

Grok 4.20 Beta is xAI's newest flagship model with industry-leading speed and agentic tool calling capabilities. It combines the lowest hallucination rate on the market with strict prompt adherance, delivering consistently precise and truthful responses. Reasoning can be enabled/disabled using the `reasoning` `enabled` parameter in the API. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#controlling-reasoning-tokens)

TextImageFile

Context

2M

Group

Grok

Pricing preview

No display pricing published in the current snapshot.

Slug

x-ai/grok-4.20-beta

TextReasoning

xAI

xAI: Grok 4.20

Grok 4.20 is xAI's newest flagship model with industry-leading speed and agentic tool calling capabilities. It combines the lowest hallucination rate on the market with strict prompt adherance, delivering consistently precise and truthful responses. Reasoning can be enabled/disabled using the `reasoning` `enabled` parameter in the API. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#controlling-reasoning-tokens)

TextImageFile

Context

2M

Group

Grok

Pricing preview

Input Price: $2 /M tokens

Output Price: $6 /M tokens

Slug

x-ai/grok-4.20

TextReasoning

Parasail

Arcee AI: Trinity Large Thinking

Trinity Large Thinking is a powerful open source reasoning model from the team at Arcee AI. It shows strong performance in PinchBench, agentic workloads, and reasoning tasks. Launch video: https://youtu.be/Gc82AXLa0Rg?si=4RLn6WBz33qT--B7

Text

Context

262.1K

Group

Other

Pricing preview

Input Price: $0.22 /M tokens

Output Price: $0.85 /M tokens

Slug

arcee-ai/trinity-large-thinking

Rerank

Cohere

Cohere: Rerank 4 Fast

Cohere's AI search foundation model for enhancing the relevance of information surfaced within search and RAG systems. Features a 32K context window, multilingual support across 100+ languages, no data pre-processing required, and high performance with lowest latency.

RerankText

Context

32.8K

Group

Cohere

Pricing preview

Search units: $0.002 per search

Slug

cohere/rerank-4-fast

Rerank

Cohere

Cohere: Rerank 4 Pro

Cohere's AI search foundation model for enhancing the relevance of information surfaced within search and RAG systems. Features a 32K context window, multilingual support across 100+ languages, no data pre-processing required, and state of the art performance with low latency.

RerankText

Context

32.8K

Group

Cohere

Pricing preview

Search units: $0.0025 per search

Slug

cohere/rerank-4-pro

Rerank

Cohere

Cohere: Rerank v3.5

Rerank v3.5 is designed to reorder search results for improved relevance. It supports multi-aspect and semi-structured data reranking over 100+ languages. Ideal for refining results from semantic or keyword search pipelines.

RerankText

Context

4.1K

Group

Cohere

Pricing preview

Search units: $0.001 per search

Slug

cohere/rerank-v3.5

TextReasoning

Google Vertex

Anthropic: Claude Sonnet 4.5

Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflows. It delivers state-of-the-art performance on coding benchmarks such as SWE-bench Verified, with improvements across system design, code security, and specification adherence. The model is designed for extended autonomous operation, maintaining task continuity across sessions and providing fact-based progress tracking. Sonnet 4.5 also introduces stronger agentic capabilities, including improved tool orchestration, speculative parallel execution, and more efficient context and memory management. With enhanced context tracking and awareness of token usage across tool calls, it is particularly well-suited for multi-context and long-running workflows. Use cases span software engineering, cybersecurity, financial analysis, research agents, and other domains requiring sustained reasoning and tool use.

TextImageFile

Context

1M

Group

Claude

Pricing preview

Input Price: $3 /M tokens

Output Price: $15 /M tokens

Slug

anthropic/claude-sonnet-4.5

TextReasoning

Google AI Studio

Google: Gemma 4 26B A4B (free)

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B quality at a fraction of the compute cost. Supports multimodal input including text, images, and video (up to 60s at 1fps). Features a 256K token context window, native function calling, configurable thinking/reasoning mode, and structured output support. Released under Apache 2.0.

TextImageVideo

Context

262.1K

Group

Gemma

Pricing preview

Input Price: $0 /M tokens

Output Price: $0 /M tokens

Slug

google/gemma-4-26b-a4b-it

TextReasoning

DeepInfra

Google: Gemma 4 26B A4B

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B activate per token during inference — delivering near-31B quality at a fraction of the compute cost. Supports multimodal input including text, images, and video (up to 60s at 1fps). Features a 256K token context window, native function calling, configurable thinking/reasoning mode, and structured output support. Released under Apache 2.0.

TextImageVideo

Context

262.1K

Group

Gemma

Pricing preview

Input Price: $0.08 /M tokens

Output Price: $0.35 /M tokens

Slug

google/gemma-4-26b-a4b-it

TextReasoning

Google AI Studio

Google: Gemma 4 31B (free)

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window, configurable thinking/reasoning mode, native function calling, and multilingual support across 140+ languages. Strong on coding, reasoning, and document understanding tasks. Apache 2.0 license.

TextImageVideo

Context

262.1K

Group

Gemma

Pricing preview

Input Price: $0 /M tokens

Output Price: $0 /M tokens

Slug

google/gemma-4-31b-it

TextReasoning

DeepInfra

Google: Gemma 4 31B

Gemma 4 31B Instruct is Google DeepMind's 30.7B dense multimodal model supporting text and image input with text output. Features a 256K token context window, configurable thinking/reasoning mode, native function calling, and multilingual support across 140+ languages. Strong on coding, reasoning, and document understanding tasks. Apache 2.0 license.

TextImageVideo

Context

262.1K

Group

Gemma

Pricing preview

Input Price: $0.13 /M tokens

Output Price: $0.38 /M tokens

Slug

google/gemma-4-31b-it

TextReasoning

Z.ai

Z.ai: GLM 5V Turbo

GLM-5V-Turbo is Z.ai’s first native multimodal agent foundation model, built for vision-based coding and agent-driven tasks. It natively handles image, video, and text inputs, excels at long-horizon planning, complex coding, and task execution, and works seamlessly with agents to complete the full loop of “perceive → plan → execute“.

TextImageVideo

Context

202.8K

Group

Other

Pricing preview

Input Price: $1.2 /M tokens

Output Price: $4 /M tokens

Slug

z-ai/glm-5v-turbo

TextReasoning

Reka AI

Reka Edge

Reka Edge is an extremely efficient 7B multimodal vision-language model that accepts image/video+text inputs and generates text outputs. This model is optimized specifically to deliver industry-leading performance in image understanding, video analysis, object detection, and agentic tool-use.

TextImageVideo

Context

16.4K

Group

Other

Pricing preview

Input Price: $0.1 /M tokens

Output Price: $0.1 /M tokens

Slug

rekaai/reka-edge

TextReasoning

Reka AI

Reka Flash 3

Reka Flash 3 is a general-purpose, instruction-tuned large language model with 21 billion parameters, developed by Reka. It excels at general chat, coding tasks, instruction-following, and function calling. Featuring a 32K context length and optimized through reinforcement learning (RLOO), it provides competitive performance comparable to proprietary models within a smaller parameter footprint. Ideal for low-latency, local, or on-device deployments, Reka Flash 3 is compact, supports efficient quantization (down to 11GB at 4-bit precision), and employs explicit reasoning tags ("<reasoning>") to indicate its internal thought process. Reka Flash 3 is primarily an English model with limited multilingual understanding capabilities. The model weights are released under the Apache 2.0 license.

Text

Context

65.5K

Group

Other

Pricing preview

Input Price: $0.1 /M tokens

Output Price: $0.2 /M tokens

Slug

rekaai/reka-flash-3

TextReasoning

Unknown provider

xAI: Grok 4.20 Multi-Agent Beta

Grok 4.20 Multi-Agent Beta is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research, coordinate tool use, and synthesize information across complex tasks. Reasoning effort behavior: - low / medium: 4 agents - high / xhigh: 16 agents

TextImageFile

Context

2M

Group

Grok

Pricing preview

No display pricing published in the current snapshot.

Slug

x-ai/grok-4.20-multi-agent-beta

TextReasoning

xAI

xAI: Grok 4.20 Multi-Agent

Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents operate in parallel to conduct deep research, coordinate tool use, and synthesize information across complex tasks. Reasoning effort behavior: - low / medium: 4 agents - high / xhigh: 16 agents

TextImageFile

Context

2M

Group

Grok

Pricing preview

Input Price: $2 /M tokens

Output Price: $6 /M tokens

Slug

x-ai/grok-4.20-multi-agent

TextReasoning

xAI

xAI: Grok 4

Grok 4 is xAI's latest reasoning model with a 256k context window. It supports parallel tool calling, structured outputs, and both image and text inputs. Note that reasoning is not exposed, reasoning cannot be disabled, and the reasoning effort cannot be specified. Pricing increases once the total tokens in a given request is greater than 128k tokens. See more details on the [xAI docs](https://docs.x.ai/docs/models/grok-4-0709)

TextImageFile

Context

256K

Group

Grok

Pricing preview

Input Price: $3 /M tokens

Output Price: $15 /M tokens

Slug

x-ai/grok-4

TextReasoning

xAI

xAI: Grok 4 Fast

Grok 4 Fast is xAI's latest multimodal model with SOTA cost-efficiency and a 2M token context window. It comes in two flavors: non-reasoning and reasoning. Read more about the model on xAI's [news post](http://x.ai/news/grok-4-fast). Reasoning can be enabled/disabled using the `reasoning` `enabled` parameter in the API. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#controlling-reasoning-tokens)

TextImageFile

Context

2M

Group

Grok

Pricing preview

Input Price: $0.2 /M tokens

Output Price: $0.5 /M tokens

Slug

x-ai/grok-4-fast

TextReasoning

xAI

xAI: Grok 4.1 Fast

Grok 4.1 Fast is xAI's best agentic tool calling model that shines in real-world use cases like customer support and deep research. 2M context window. Reasoning can be enabled/disabled using the `reasoning` `enabled` parameter in the API. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#controlling-reasoning-tokens)

TextImageFile

Context

2M

Group

Grok

Pricing preview

Input Price: $0.2 /M tokens

Output Price: $0.5 /M tokens

Slug

x-ai/grok-4.1-fast

Page 2 of 15

Need a model request?

Use the market snapshot for discovery, then ask ImaRouter for rollout.

If a model matters for your product, send the slug, expected traffic, target region, and latency expectations. The team can confirm support status, onboarding priority, or a migration path to an equivalent route on ImaRouter.

Contact

support@imarouter.com

Best for model availability questions, onboarding priority, routing strategy, and enterprise rollout planning.

Models | ImaRouter