Video Generation API is now live!

Models

Explore the active model market,from a local OpenRouter snapshot.

This page reads from a local JSON snapshot synced from OpenRouter, so the catalog stays fast, indexable, and stable. Use it to browse current model coverage by provider, modality, reasoning support, context window, and pricing metadata.

Reset

Results

Showing 48 of 683 matching models

Snapshot source: OpenRouter. Synced April 21, 2026 at 8:00 AM. Page 13 of 15.

This route is built from local JSON so the catalog stays stable for browsing and SEO. If you need a specific model on ImaRouter, treat this page as a discovery reference and then contact the team for availability.

Image

Black Forest Labs

Black Forest Labs: FLUX.2 Max

FLUX.2 [max] is the new top-tier image model from Black Forest Labs, pushing image quality, prompt understanding, and editing consistency to the highest level yet. Pricing is as follows, [per the docs](https://bfl.ai/pricing?category=flux.2): Input: We charge $0.03 for each megapixel on the input (i.e. reference images for editing) Output: The first generated megapixel is charged $0.07. Each subsequent megapixel is charged $0.03.

ImageText

Context

46.9K

Group

Other

Pricing preview

Output Image: $0.07 per megapixel

Slug

black-forest-labs/flux.2-max

Image

Seed

ByteDance Seed: Seedream 4.5

Seedream 4.5 is the latest in-house image generation model developed by ByteDance. Compared with Seedream 4.0, it delivers comprehensive improvements, especially in editing consistency, including better preservation of subject details, lighting, and color tone. It also enhances portrait refinement and small-text rendering. The model’s multi-image composition capabilities have been significantly strengthened, and both reasoning performance and visual aesthetics continue to advance, enabling more accurate and artistically expressive image generation. Pricing is $0.04 per output image, regardless of size.

ImageText

Context

4.1K

Group

Other

Pricing preview

Image Output: $0.04 per image

Slug

bytedance-seed/seedream-4.5

Image

Sourceful

Sourceful: Riverflow V2 Fast Preview

Riverflow V2 Fast Preview is the fastest variant of Sourceful's Riverflow V2 preview lineup. This preview version exceeds the performance of Riverflow 1 Family and is Sourceful's first unified text-to-image and image-to-image model family. Pricing is $0.03 per output image, regardless of size. Sourceful imposes a 4.5MB request size limit, therefore it is highly recommended to pass image URLs instead of Base64 data.

ImageText

Context

8.2K

Group

Other

Pricing preview

Image Output: $0.03 per image

Slug

sourceful/riverflow-v2-fast-preview

Image

Sourceful

Sourceful: Riverflow V2 Standard Preview

Riverflow V2 Standard Preview is the standard variant of Sourceful's Riverflow V2 preview lineup. This preview version exceeds the performance of Riverflow 1 Family and is Sourceful's first unified text-to-image and image-to-image model family. Pricing is $0.035 per output image, regardless of size. Sourceful imposes a 4.5MB request size limit, therefore it is highly recommended to pass image URLs instead of Base64 data.

ImageText

Context

8.2K

Group

Other

Pricing preview

Image Output: $0.035 per image

Slug

sourceful/riverflow-v2-standard-preview

Image

Sourceful

Sourceful: Riverflow V2 Max Preview

Riverflow V2 Max Preview is the most powerful variant of Sourceful's Riverflow V2 preview lineup. This preview version exceeds the performance of Riverflow 1 Family and is Sourceful's first unified text-to-image and image-to-image model family. Pricing is $0.075 per output image, regardless of size. Sourceful imposes a 4.5MB request size limit, therefore it is highly recommended to pass image URLs instead of Base64 data.

ImageText

Context

8.2K

Group

Other

Pricing preview

Image Output: $0.075 per image

Slug

sourceful/riverflow-v2-max-preview

Image

Sourceful

Sourceful: Riverflow V2 Fast

Riverflow V2 Fast is the fastest variant of Sourceful's Riverflow 2.0 lineup, best for production deployments and latency-critical workflows. The Riverflow 2.0 series represents SOTA performance on image generation and editing tasks, using an integrated reasoning model to boost reliability and tackle complex challenges. Pricing is $0.02 per 1K output image and $0.04 per 2K output image. Does not support 4K image output. Additional features: - Custom font rendering via font_inputs ($0.03/font, max 2) - Image enhancement via super_resolution_references ($0.20/reference, max 4) See the image generation docs for details: https://openrouter.ai/docs/features/multimodal/image-generation Note: Sourceful imposes a 4.5MB request size limit, therefore it is highly recommended to pass image URLs instead of Base64 data.

ImageText

Context

8.2K

Group

Other

Pricing preview

Image Output: $0.02 per image

Font Input: $0.03 per font

Slug

sourceful/riverflow-v2-fast

Image

Sourceful

Sourceful: Riverflow V2 Pro

Riverflow V2 Pro is the most powerful variant of Sourceful's Riverflow 2.0 lineup, best for top-tier control and perfect text rendering. The Riverflow 2.0 series represents SOTA performance on image generation and editing tasks, using an integrated reasoning model to boost reliability and tackle complex challenges. Pricing is $0.15 per 1K/2K output image and $0.33 per 4K output image. Additional features: - Custom font rendering via font_inputs ($0.03/font, max 2) - Image enhancement via super_resolution_references ($0.20/reference, max 4) See the image generation docs for details: https://openrouter.ai/docs/features/multimodal/image-generation Note: Sourceful imposes a 4.5MB request size limit, therefore it is highly recommended to pass image URLs instead of Base64 data.

ImageText

Context

8.2K

Group

Other

Pricing preview

Image Output: $0.15 per image

Font Input: $0.03 per font

Slug

sourceful/riverflow-v2-pro

Text

Unknown provider

AllenAI: Molmo2 8B

Molmo2-8B is an open vision-language model developed by the Allen Institute for AI (Ai2) as part of the Molmo2 family, supporting image, video, and multi-image understanding and grounding. It is based on Qwen3-8B and uses SigLIP 2 as its vision backbone, outperforming other open-weight, open-data models on short videos, counting, and captioning, while remaining competitive on long-video tasks.

TextImageVideo

Context

36.9K

Group

Other

Pricing preview

No display pricing published in the current snapshot.

Slug

allenai/molmo-2-8b

Text

Ionstream

Qwen: Qwen3 Coder Next

Qwen3-Coder-Next is an open-weight causal language model optimized for coding agents and local development workflows. It uses a sparse MoE design with 80B total parameters and only 3B activated per token, delivering performance comparable to models with 10 to 20x higher active compute, which makes it well suited for cost-sensitive, always-on agent deployment. The model is trained with a strong agentic focus and performs reliably on long-horizon coding tasks, complex tool usage, and recovery from execution failures. With a native 256k context window, it integrates cleanly into real-world CLI and IDE environments and adapts well to common agent scaffolds used by modern coding tools. The model operates exclusively in non-thinking mode and does not emit <think> blocks, simplifying integration for production coding agents.

Text

Context

262.1K

Group

Qwen

Pricing preview

Input Price: $0.15 /M tokens

Output Price: $0.8 /M tokens

Slug

qwen/qwen3-coder-next

TextReasoning

Unknown provider

Free Models Router

The simplest way to get free inference. openrouter/free is a router that selects free models at random from the models available on OpenRouter. The router smartly filters for models that support features needed for your request such as image understanding, tool calling, structured outputs and more.

TextImage

Context

200K

Group

Router

Pricing preview

No display pricing published in the current snapshot.

Slug

openrouter/free

TextReasoning

SiliconFlow

StepFun: Step 3.5 Flash

Step 3.5 Flash is StepFun's most capable open-source foundation model. Built on a sparse Mixture of Experts (MoE) architecture, it selectively activates only 11B of its 196B parameters per token. It is a reasoning model that is incredibly speed efficient even at long contexts.

Text

Context

262.1K

Group

Other

Pricing preview

Input Price: $0.1 /M tokens

Output Price: $0.3 /M tokens

Slug

stepfun/step-3.5-flash

Text

Unknown provider

Auto Router

Your prompt will be processed by a meta-model and routed to one of dozens of models (see below), optimizing for the best possible output. To see which model was used, visit [Activity](/activity), or read the `model` attribute of the response. Your response will be priced at the same rate as the routed model. Learn more, including how to customize the models for routing, in our [docs](/docs/guides/routing/routers/auto-router). Requests will be routed to the following models: - [anthropic/claude-haiku-4.5](/anthropic/claude-haiku-4.5) - [anthropic/claude-opus-4.6](/anthropic/claude-opus-4.6) - [anthropic/claude-sonnet-4.5](/anthropic/claude-sonnet-4.5) - [anthropic/claude-sonnet-4.6](/anthropic/claude-sonnet-4.6) - [deepseek/deepseek-r1](/deepseek/deepseek-r1) - [google/gemini-2.5-flash-lite](/google/gemini-2.5-flash-lite) - [google/gemini-3-flash-preview](/google/gemini-3-flash-preview) - [google/gemini-3-pro-preview](/google/gemini-3-pro-preview) - [google/gemini-3.1-pro-preview](/google/gemini-3.1-pro-preview) - [meta-llama/llama-3.3-70b-instruct](/meta-llama/llama-3.3-70b-instruct) - [minimax/minimax-m2.5](/minimax/minimax-m2.5) - [mistralai/codestral-2508](/mistralai/codestral-2508) - [mistralai/mistral-7b-instruct-v0.1](/mistralai/mistral-7b-instruct-v0.1) - [mistralai/mistral-large](/mistralai/mistral-large) - [mistralai/mistral-medium-3.1](/mistralai/mistral-medium-3.1) - [mistralai/mistral-small-3.2-24b-instruct-2506](/mistralai/mistral-small-3.2-24b-instruct-2506) - [moonshotai/kimi-k2-thinking](/moonshotai/kimi-k2-thinking) - [openai/gpt-5](/openai/gpt-5) - [openai/gpt-5-mini](/openai/gpt-5-mini) - [openai/gpt-5-nano](/openai/gpt-5-nano) - [openai/gpt-5.1](/openai/gpt-5.1) - [openai/gpt-5.2](/openai/gpt-5.2) - [openai/gpt-5.2-pro](/openai/gpt-5.2-pro) - [openai/gpt-5.3-chat](/openai/gpt-5.3-chat) - [openai/gpt-oss-120b](/openai/gpt-oss-120b) - [perplexity/sonar](/perplexity/sonar) - [qwen/qwen3-235b-a22b](/qwen/qwen3-235b-a22b) - [x-ai/grok-3](/x-ai/grok-3) - [x-ai/grok-3-mini](/x-ai/grok-3-mini) - [x-ai/grok-4](/x-ai/grok-4) - [x-ai/grok-4.1-fast](/x-ai/grok-4.1-fast) - [z-ai/glm-5](/z-ai/glm-5)

TextImageAudioFileVideo

Context

2M

Group

Router

Pricing preview

No display pricing published in the current snapshot.

Slug

openrouter/auto

TextReasoning

Unknown provider

Google: Gemini 3 Pro Preview

Gemini 3 Pro is Google’s flagship frontier model for high-precision multimodal reasoning, combining strong performance across text, image, video, audio, and code with a 1M-token context window. Reasoning Details must be preserved when using multi-turn tool calling, see our docs here: https://openrouter.ai/docs/use-cases/reasoning-tokens#preserving-reasoning-blocks. It delivers state-of-the-art benchmark results in general reasoning, STEM problem solving, factual QA, and multimodal understanding, including leading scores on LMArena, GPQA Diamond, MathArena Apex, MMMU-Pro, and Video-MMMU. Interactions emphasize depth and interpretability: the model is designed to infer intent with minimal prompting and produce direct, insight-focused responses. Built for advanced development and agentic workflows, Gemini 3 Pro provides robust tool-calling, long-horizon planning stability, and strong zero-shot generation for complex UI, visualization, and coding tasks. It excels at agentic coding (SWE-Bench Verified, Terminal-Bench 2.0), multimodal analysis, and structured long-form tasks such as research synthesis, planning, and interactive learning experiences. Suitable applications include autonomous agents, coding assistants, multimodal analytics, scientific reasoning, and high-context information processing.

TextImageFileAudioVideo

Context

1M

Group

Gemini

Pricing preview

No display pricing published in the current snapshot.

Slug

google/gemini-3-pro-preview

Text

Mistral

Mistral: Devstral 2 2512

Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in agentic coding. It is a 123B-parameter dense transformer model supporting a 256K context window. Devstral 2 supports exploring codebases and orchestrating changes across multiple files while maintaining architecture-level context. It tracks framework dependencies, detects failures, and retries with corrections—solving challenges like bug fixing and modernizing legacy systems. The model can be fine-tuned to prioritize specific languages or optimize for large enterprise codebases. It is available under a modified MIT license.

Text

Context

262.1K

Group

Mistral

Pricing preview

Input Price: $0.4 /M tokens

Output Price: $2 /M tokens

Slug

mistralai/devstral-2512

Text

Amazon Bedrock

Writer: Palmyra X5

Palmyra X5 is Writer's most advanced model, purpose-built for building and scaling AI agents across the enterprise. It delivers industry-leading speed and efficiency on context windows up to 1 million tokens, powered by a novel transformer architecture and hybrid attention mechanisms. This enables faster inference and expanded memory for processing large volumes of enterprise data, critical for scaling AI agents.

Text

Context

1M

Group

Other

Pricing preview

Input Price: $0.6 /M tokens

Output Price: $6 /M tokens

Slug

writer/palmyra-x5

TextReasoning

Chutes

Xiaomi: MiMo-V2-Flash

MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a Mixture-of-Experts model with 309B total parameters and 15B active parameters, adopting hybrid attention architecture. MiMo-V2-Flash supports a hybrid-thinking toggle and a 256K context window, and excels at reasoning, coding, and agent scenarios. On SWE-bench Verified and SWE-bench Multilingual, MiMo-V2-Flash ranks as the top #1 open-source model globally, delivering performance comparable to Claude Sonnet 4.5 while costing only about 3.5% as much. Users can control the reasoning behaviour with the `reasoning` `enabled` boolean. [Learn more in our docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#enable-reasoning-with-default-config).

Text

Context

262.1K

Group

Other

Pricing preview

Input Price: $0.09 /M tokens

Output Price: $0.29 /M tokens

Slug

xiaomi/mimo-v2-flash

Text

Unknown provider

LiquidAI: LFM2-2.6B

LFM2 is a new generation of hybrid models developed by Liquid AI, specifically designed for edge AI and on-device deployment. It sets a new standard in terms of quality, speed, and memory efficiency.

Text

Context

32.8K

Group

Other

Pricing preview

No display pricing published in the current snapshot.

Slug

liquid/lfm-2.2-6b

Text

Unknown provider

LiquidAI: LFM2-8B-A1B

LFM2-8B-A1B is an efficient on-device Mixture-of-Experts (MoE) model from Liquid AI’s LFM2 family, built for fast, high-quality inference on edge hardware. It uses 8.3B total parameters with only ~1.5B active per token, delivering strong performance while keeping compute and memory usage low—making it ideal for phones, tablets, and laptops.

Text

Context

8.2K

Group

Other

Pricing preview

No display pricing published in the current snapshot.

Slug

liquid/lfm2-8b-a1b

Text

Liquid

LiquidAI: LFM2.5-1.2B-Instruct (free)

LFM2.5-1.2B-Instruct is a compact, high-performance instruction-tuned model built for fast on-device AI. It delivers strong chat quality in a 1.2B parameter footprint, with efficient edge inference and broad runtime support.

Text

Context

32.8K

Group

Other

Pricing preview

Input Price: $0 /M tokens

Output Price: $0 /M tokens

Slug

liquid/lfm-2.5-1.2b-instruct

TextReasoning

Liquid

LiquidAI: LFM2.5-1.2B-Thinking (free)

LFM2.5-1.2B-Thinking is a lightweight reasoning-focused model optimized for agentic tasks, data extraction, and RAG—while still running comfortably on edge devices. It supports long context (up to 32K tokens) and is designed to provide higher-quality “thinking” responses in a small 1.2B model.

Text

Context

32.8K

Group

Other

Pricing preview

Input Price: $0 /M tokens

Output Price: $0 /M tokens

Slug

liquid/lfm-2.5-1.2b-thinking

Text

SiliconFlow

Nex AGI: DeepSeek V3.1 Nex N1

DeepSeek V3.1 Nex-N1 is the flagship release of the Nex-N1 series — a post-trained model designed to highlight agent autonomy, tool use, and real-world productivity. Nex-N1 demonstrates competitive performance across all evaluation scenarios, showing particularly strong results in practical coding and HTML generation tasks.

Text

Context

131.1K

Group

DeepSeek

Pricing preview

Input Price: $0.135 /M tokens

Output Price: $0.5 /M tokens

Slug

nex-agi/deepseek-v3.1-nex-n1

TextReasoning

AtlasCloud

MiniMax: MiniMax M2.1

MiniMax-M2.1 is a lightweight, state-of-the-art large language model optimized for coding, agentic workflows, and modern application development. With only 10 billion activated parameters, it delivers a major jump in real-world capability while maintaining exceptional latency, scalability, and cost efficiency. Compared to its predecessor, M2.1 delivers cleaner, more concise outputs and faster perceived response times. It shows leading multilingual coding performance across major systems and application languages, achieving 49.4% on Multi-SWE-Bench and 72.5% on SWE-Bench Multilingual, and serves as a versatile agent “brain” for IDEs, coding tools, and general-purpose assistance. To avoid degrading this model's performance, MiniMax highly recommends preserving reasoning between turns. Learn more about using reasoning_details to pass back reasoning in our [docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#preserving-reasoning-blocks).

Text

Context

196.6K

Group

Other

Pricing preview

Input Price: $0.29 /M tokens

Output Price: $0.95 /M tokens

Slug

minimax/minimax-m2.1

TextReasoning

Unknown provider

AllenAI: Olmo 3.1 32B Think

Olmo 3.1 32B Think is a large-scale, 32-billion-parameter model designed for deep reasoning, complex multi-step logic, and advanced instruction following. Building on the Olmo 3 series, version 3.1 delivers refined reasoning behavior and stronger performance across demanding evaluations and nuanced conversational tasks. Developed by Ai2 under the Apache 2.0 license, Olmo 3.1 32B Think continues the Olmo initiative’s commitment to openness, providing full transparency across model weights, code, and training methodology.

Text

Context

65.5K

Group

Other

Pricing preview

No display pricing published in the current snapshot.

Slug

allenai/olmo-3.1-32b-think

TextReasoning

Clarifai

Arcee AI: Trinity Mini

Trinity Mini is a 26B-parameter (3B active) sparse mixture-of-experts language model featuring 128 experts with 8 active per token. Engineered for efficient reasoning over long contexts (131k) with robust function calling and multi-step agent workflows.

Text

Context

131.1K

Group

Other

Pricing preview

Input Price: $0.045 /M tokens

Output Price: $0.15 /M tokens

Slug

arcee-ai/trinity-mini

TextReasoning

AtlasCloud

DeepSeek: DeepSeek V3.2 Speciale

DeepSeek-V3.2-Speciale is a high-compute variant of DeepSeek-V3.2 optimized for maximum reasoning and agentic performance. It builds on DeepSeek Sparse Attention (DSA) for efficient long-context processing, then scales post-training reinforcement learning to push capability beyond the base model. Reported evaluations place Speciale ahead of GPT-5 on difficult reasoning workloads, with proficiency comparable to Gemini-3.0-Pro, while retaining strong coding and tool-use reliability. Like V3.2, it benefits from a large-scale agentic task synthesis pipeline that improves compliance and generalization in interactive environments.

Text

Context

163.8K

Group

DeepSeek

Pricing preview

Input Price: $0.4 /M tokens

Output Price: $1.2 /M tokens

Slug

deepseek/deepseek-v3.2-speciale

TextReasoning

Nebius Token Factory

Prime Intellect: INTELLECT-3

INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (12B active) post-trained from GLM-4.5-Air-Base using supervised fine-tuning (SFT) followed by large-scale reinforcement learning (RL). It offers state-of-the-art performance for its size across math, code, science, and general reasoning, consistently outperforming many larger frontier models. Designed for strong multi-step problem solving, it maintains high accuracy on structured tasks while remaining efficient at inference thanks to its MoE architecture.

Text

Context

131.1K

Group

Other

Pricing preview

Input Price: $0.2 /M tokens

Output Price: $1.1 /M tokens

Slug

prime-intellect/intellect-3

TextReasoning

Unknown provider

TNG: R1T Chimera

TNG-R1T-Chimera is an experimental LLM with a faible for creative storytelling and character interaction. It is a derivate of the original TNG/DeepSeek-R1T-Chimera released in April 2025 and is available exclusively via Chutes and OpenRouter. Characteristics and improvements include: We think that it has a creative and pleasant personality. It has a preliminary EQ-Bench3 value of about 1305. It is quite a bit more intelligent than the original, albeit a slightly slower. It is much more think-token consistent, i.e. reasoning and answer blocks are properly delineated. Tool calling is much improved. TNG Tech, the model authors, ask that users follow the careful guidelines that Microsoft has created for their "MAI-DS-R1" DeepSeek-based model. These guidelines are available on Hugging Face (https://huggingface.co/microsoft/MAI-DS-R1).

Text

Context

163.8K

Group

Other

Pricing preview

No display pricing published in the current snapshot.

Slug

tngtech/tng-r1t-chimera

TextReasoning

Parasail

AllenAI: Olmo 3 32B Think

Olmo 3 32B Think is a large-scale, 32-billion-parameter model purpose-built for deep reasoning, complex logic chains and advanced instruction-following scenarios. Its capacity enables strong performance on demanding evaluation tasks and highly nuanced conversational reasoning. Developed by Ai2 under the Apache 2.0 license, Olmo 3 32B Think embodies the Olmo initiative’s commitment to openness, offering full transparency across weights, code and training methodology.

Text

Context

65.5K

Group

Other

Pricing preview

Input Price: $0.15 /M tokens

Output Price: $0.5 /M tokens

Slug

allenai/olmo-3-32b-think

TextReasoning

Unknown provider

AllenAI: Olmo 3 7B Think

Olmo 3 7B Think is a research-oriented language model in the Olmo family designed for advanced reasoning and instruction-driven tasks. It excels at multi-step problem solving, logical inference, and maintaining coherent conversational context. Developed by Ai2 under the Apache 2.0 license, Olmo 3 7B Think supports transparent, fully open experimentation and provides a lightweight yet capable foundation for academic research and practical NLP workflows.

Text

Context

65.5K

Group

Other

Pricing preview

No display pricing published in the current snapshot.

Slug

allenai/olmo-3-7b-think

ImageReasoning

Google AI Studio

Google: Nano Banana Pro (Gemini 3 Pro Image Preview)

Nano Banana Pro is Google’s most advanced image-generation and editing model, built on Gemini 3 Pro. It extends the original Nano Banana with significantly improved multimodal reasoning, real-world grounding, and high-fidelity visual synthesis. The model generates context-rich graphics, from infographics and diagrams to cinematic composites, and can incorporate real-time information via Search grounding. It offers industry-leading text rendering in images (including long passages and multilingual layouts), consistent multi-image blending, and accurate identity preservation across up to five subjects. Nano Banana Pro adds fine-grained creative controls such as localized edits, lighting and focus adjustments, camera transformations, and support for 2K/4K outputs and flexible aspect ratios. It is designed for professional-grade design, product visualization, storyboarding, and complex multi-element compositions while remaining efficient for general image creation workflows.

ImageText

Context

65.5K

Group

Gemini

Pricing preview

Input Price: $2 /M tokens

Output Price: $12 /M tokens

Slug

google/gemini-3-pro-image-preview

TextReasoning

NovitaAI

MoonshotAI: Kimi K2 Thinking

Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. Built on the trillion-parameter Mixture-of-Experts (MoE) architecture introduced in Kimi K2, it activates 32 billion parameters per forward pass and supports 256 k-token context windows. The model is optimized for persistent step-by-step thought, dynamic tool invocation, and complex reasoning workflows that span hundreds of turns. It interleaves step-by-step reasoning with tool use, enabling autonomous research, coding, and writing that can persist for hundreds of sequential actions without drift. It sets new open-source benchmarks on HLE, BrowseComp, SWE-Multilingual, and LiveCodeBench, while maintaining stable multi-agent behavior through 200–300 tool calls. Built on a large-scale MoE architecture with MuonClip optimization, it combines strong reasoning depth with high inference efficiency for demanding agentic and analytical tasks.

Text

Context

262.1K

Group

Other

Pricing preview

Input Price: $0.6 /M tokens

Output Price: $2.5 /M tokens

Slug

moonshotai/kimi-k2-thinking

TextReasoning

Perplexity

Perplexity: Sonar Pro Search

Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is Perplexity's most advanced agentic search system. It is designed for deeper reasoning and analysis. Pricing is based on tokens plus $18 per thousand requests. This model powers the Pro Search mode on the Perplexity platform. Sonar Pro Search adds autonomous, multi-step reasoning to Sonar Pro. So, instead of just one query + synthesis, it plans and executes entire research workflows using tools.

TextImage

Context

200K

Group

Other

Pricing preview

Input Price: $3 /M tokens

Output Price: $15 /M tokens

Slug

perplexity/sonar-pro-search

TextReasoning

Groq

OpenAI: gpt-oss-safeguard-20b

gpt-oss-safeguard-20b is a safety reasoning model from OpenAI built upon gpt-oss-20b. This open-weight, 21B-parameter Mixture-of-Experts (MoE) model offers lower latency for safety tasks like content classification, LLM filtering, and trust & safety labeling. Learn more about this model in OpenAI's gpt-oss-safeguard [user guide](https://cookbook.openai.com/articles/gpt-oss-safeguard-guide).

Text

Context

131.1K

Group

GPT

Pricing preview

Input Price: $0.075 /M tokens

Output Price: $0.3 /M tokens

Slug

openai/gpt-oss-safeguard-20b

TextReasoning

AtlasCloud

MiniMax: MiniMax M2

MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end coding and agentic workflows. With 10 billion activated parameters (230 billion total), it delivers near-frontier intelligence across general reasoning, tool use, and multi-step task execution while maintaining low latency and deployment efficiency. The model excels in code generation, multi-file editing, compile-run-fix loops, and test-validated repair, showing strong results on SWE-Bench Verified, Multi-SWE-Bench, and Terminal-Bench. It also performs competitively in agentic evaluations such as BrowseComp and GAIA, effectively handling long-horizon planning, retrieval, and recovery from execution errors. Benchmarked by [Artificial Analysis](https://artificialanalysis.ai/models/minimax-m2), MiniMax-M2 ranks among the top open-source models for composite intelligence, spanning mathematics, science, and instruction-following. Its small activation footprint enables fast inference, high concurrency, and improved unit economics, making it well-suited for large-scale agents, developer assistants, and reasoning-driven applications that require responsiveness and cost efficiency. To avoid degrading this model's performance, MiniMax highly recommends preserving reasoning between turns. Learn more about using reasoning_details to pass back reasoning in our [docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#preserving-reasoning-blocks).

Text

Context

196.6K

Group

Other

Pricing preview

Input Price: $0.255 /M tokens

Output Price: $1 /M tokens

Slug

minimax/minimax-m2

TextReasoning

Alibaba Cloud Int.

Qwen: Qwen3 VL 8B Thinking

Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, designed for advanced visual and textual reasoning across complex scenes, documents, and temporal sequences. It integrates enhanced multimodal alignment and long-context processing (native 256K, expandable to 1M tokens) for tasks such as scientific visual analysis, causal inference, and mathematical reasoning over image or video inputs. Compared to the Instruct edition, the Thinking version introduces deeper visual-language fusion and deliberate reasoning pathways that improve performance on long-chain logic tasks, STEM problem-solving, and multi-step video understanding. It achieves stronger temporal grounding via Interleaved-MRoPE and timestamp-aware embeddings, while maintaining robust OCR, multilingual comprehension, and text generation on par with large text-only LLMs.

TextImage

Context

131.1K

Group

Qwen3

Pricing preview

Input Price: $0.117 /M tokens

Output Price: $1.36 /M tokens

Slug

qwen/qwen3-vl-8b-thinking

Text

DeepInfra

AllenAI: Olmo 3.1 32B Instruct

Olmo 3.1 32B Instruct is a large-scale, 32-billion-parameter instruction-tuned language model engineered for high-performance conversational AI, multi-turn dialogue, and practical instruction following. As part of the Olmo 3.1 family, this variant emphasizes responsiveness to complex user directions and robust chat interactions while retaining strong capabilities on reasoning and coding benchmarks. Developed by Ai2 under the Apache 2.0 license, Olmo 3.1 32B Instruct reflects the Olmo initiative’s commitment to openness and transparency.

Text

Context

65.5K

Group

Other

Pricing preview

Input Price: $0.2 /M tokens

Output Price: $0.6 /M tokens

Slug

allenai/olmo-3.1-32b-instruct

TextReasoning

Seed

ByteDance Seed: Seed 1.6 Flash

Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed, supporting both text and visual understanding. It features a 256k context window and can generate outputs of up to 16k tokens.

TextImageVideo

Context

262.1K

Group

Other

Pricing preview

Input Price: $0.075 /M tokens

Output Price: $0.3 /M tokens

Slug

bytedance-seed/seed-1.6-flash

TextReasoning

Seed

ByteDance Seed: Seed 1.6

Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It incorporates multimodal capabilities and adaptive deep thinking with a 256K context window.

TextImageVideo

Context

262.1K

Group

Other

Pricing preview

Input Price: $0.25 /M tokens

Output Price: $2 /M tokens

Slug

bytedance-seed/seed-1.6

TextReasoning

Google AI Studio

Google: Gemini 3 Flash Preview

Gemini 3 Flash Preview is a high speed, high value thinking model designed for agentic workflows, multi turn chat, and coding assistance. It delivers near Pro level reasoning and tool use performance with substantially lower latency than larger Gemini variants, making it well suited for interactive development, long running agent loops, and collaborative coding tasks. Compared to Gemini 2.5 Flash, it provides broad quality improvements across reasoning, multimodal understanding, and reliability. The model supports a 1M token context window and multimodal inputs including text, images, audio, video, and PDFs, with text output. It includes configurable reasoning via thinking levels (minimal, low, medium, high), structured output, tool use, and automatic context caching. Gemini 3 Flash Preview is optimized for users who want strong reasoning and agentic behavior without the cost or latency of full scale frontier models.

TextImageFileAudioVideo

Context

1M

Group

Gemini

Pricing preview

Input Price: $0.5 /M tokens

Output Price: $3 /M tokens

Slug

google/gemini-3-flash-preview

Text

Mistral

Mistral: Mistral Small Creative

Mistral Small Creative is an experimental small model designed for creative writing, narrative generation, roleplay and character-driven dialogue, general-purpose instruction following, and conversational agents.

Text

Context

32.8K

Group

Mistral

Pricing preview

Input Price: $0.1 /M tokens

Output Price: $0.3 /M tokens

Slug

mistralai/mistral-small-creative

Text

Unknown provider

Body Builder (beta)

Transform your natural language requests into structured OpenRouter API request objects. Describe what you want to accomplish with AI models, and Body Builder will construct the appropriate API calls. Example: "count to 10 using gemini and opus." This is useful for creating multi-model requests, custom model routers, or programmatic generation of API calls from human descriptions. **BETA NOTICE**: Body Builder is in beta, and currently free. Pricing and functionality may change in the future.

Text

Context

128K

Group

Router

Pricing preview

No display pricing published in the current snapshot.

Slug

openrouter/bodybuilder

Text

Mistral

Mistral: Ministral 3 14B 2512

The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to its larger Mistral Small 3.2 24B counterpart. A powerful and efficient language model with vision capabilities.

TextImage

Context

262.1K

Group

Mistral

Pricing preview

Input Price: $0.2 /M tokens

Output Price: $0.2 /M tokens

Slug

mistralai/ministral-14b-2512

Text

Mistral

Mistral: Ministral 3 8B 2512

A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capabilities.

TextImage

Context

262.1K

Group

Mistral

Pricing preview

Input Price: $0.15 /M tokens

Output Price: $0.15 /M tokens

Slug

mistralai/ministral-8b-2512

Text

Mistral

Mistral: Ministral 3 3B 2512

The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision capabilities.

TextImage

Context

131.1K

Group

Mistral

Pricing preview

Input Price: $0.1 /M tokens

Output Price: $0.1 /M tokens

Slug

mistralai/ministral-3b-2512

Text

Relace

Relace: Relace Search

The relace-search model uses 4-12 `view_file` and `grep` tools in parallel to explore a codebase and return relevant files to the user request. In contrast to RAG, relace-search performs agentic multi-step reasoning to produce highly precise results 4x faster than any frontier model. It's designed to serve as a subagent that passes its findings to an "oracle" coding agent, who orchestrates/performs the rest of the coding task. To use relace-search you need to build an appropriate agent harness, and parse the response for relevant information to hand off to the oracle. Read more about it in the [Relace documentation](https://docs.relace.ai/docs/fast-agentic-search/agent).

Text

Context

256K

Group

Other

Pricing preview

Input Price: $1 /M tokens

Output Price: $3 /M tokens

Slug

relace/relace-search

TextReasoning

Together

EssentialAI: Rnj 1 Instruct

Rnj-1 is an 8B-parameter, dense, open-weight model family developed by Essential AI and trained from scratch with a focus on programming, math, and scientific reasoning. The model demonstrates strong performance across multiple programming languages, tool-use workflows, and agentic execution environments (e.g., mini-SWE-agent).

Text

Context

32.8K

Group

Other

Pricing preview

Input Price: $0.15 /M tokens

Output Price: $0.15 /M tokens

Slug

essentialai/rnj-1-instruct

TextReasoning

Amazon Bedrock

Anthropic: Claude Haiku 4.5

Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-frontier intelligence at a fraction of the cost and latency of larger Claude models. Matching Claude Sonnet 4’s performance across reasoning, coding, and computer-use tasks, Haiku 4.5 brings frontier-level capability to real-time and high-volume applications. It introduces extended thinking to the Haiku line; enabling controllable reasoning depth, summarized or interleaved thought output, and tool-assisted workflows with full support for coding, bash, web search, and computer-use tools. Scoring >73% on SWE-bench Verified, Haiku 4.5 ranks among the world’s best coding models while maintaining exceptional responsiveness for sub-agents, parallelized execution, and scaled deployment.

TextImage

Context

200K

Group

Claude

Pricing preview

Input Price: $1 /M tokens

Output Price: $5 /M tokens

Slug

anthropic/claude-haiku-4.5

TextReasoning

Amazon Bedrock

Amazon: Nova 2 Lite

Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can process text, images, and videos to generate text. Nova 2 Lite demonstrates standout capabilities in processing documents, extracting information from videos, generating code, providing accurate grounded answers, and automating multi-step agentic workflows.

TextImageVideoFile

Context

1M

Group

Nova

Pricing preview

Input Price: $0.3 /M tokens

Output Price: $2.5 /M tokens

Slug

amazon/nova-2-lite-v1

Page 13 of 15

Need a model request?

Use the market snapshot for discovery, then ask ImaRouter for rollout.

If a model matters for your product, send the slug, expected traffic, target region, and latency expectations. The team can confirm support status, onboarding priority, or a migration path to an equivalent route on ImaRouter.

Contact

support@imarouter.com

Best for model availability questions, onboarding priority, routing strategy, and enterprise rollout planning.

Models | ImaRouter