
Models

Explore the active model market, served from a local OpenRouter snapshot.

This page reads from a local JSON snapshot synced from OpenRouter, so the catalog stays fast, indexable, and stable. Use it to browse current model coverage by provider, modality, reasoning support, context window, and pricing metadata.
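The browse-and-filter flow described above can be sketched in a few lines. This is a minimal illustration only: the entry schema (`slug`, `group`, `context`, `pricing`) is an assumption for the example, and the real snapshot layout may differ.

```python
# Minimal sketch of filtering a local model snapshot.
# The field names below are illustrative assumptions, not the real schema.
snapshot = [
    {"slug": "deepseek/deepseek-r1", "group": "DeepSeek",
     "context": 64_000, "pricing": {"input": 0.7, "output": 2.5}},
    {"slug": "mistralai/mistral-small-24b-instruct-2501", "group": "Mistral",
     "context": 32_800, "pricing": {"input": 0.05, "output": 0.08}},
]

def filter_models(entries, group=None, min_context=0):
    """Return entries matching an optional group and a minimum context window."""
    return [
        e for e in entries
        if (group is None or e["group"] == group) and e["context"] >= min_context
    ]

print([e["slug"] for e in filter_models(snapshot, group="Mistral")])
# → ['mistralai/mistral-small-24b-instruct-2501']
```

Because the data is a plain local list rather than a live API, this kind of filtering stays fast and deterministic for indexing.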


Results

Showing 48 of 683 matching models

Snapshot source: OpenRouter. Synced April 21, 2026 at 8:00 AM. Page 7 of 15.

This route is built from local JSON so the catalog stays stable for browsing and SEO. If you need a specific model on ImaRouter, treat this page as a discovery reference and then contact the team for availability.

Text

DeepInfra

Mistral: Mistral Small 3

Mistral Small 3 is a 24B-parameter language model optimized for low-latency performance across common AI tasks. Released under the Apache 2.0 license, it features both pre-trained and instruction-tuned versions designed for efficient local deployment. The model achieves 81% accuracy on the MMLU benchmark and performs competitively with larger models like Llama 3.3 70B and Qwen 32B, while operating at three times the speed on equivalent hardware. [Read the blog post about the model here.](https://mistral.ai/news/mistral-small-3/)

Text

Context

32.8K

Group

Mistral

Pricing preview

Input Price: $0.05 /M tokens

Output Price: $0.08 /M tokens

Slug

mistralai/mistral-small-24b-instruct-2501
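The per-million-token prices in each pricing preview translate into a per-request cost as follows. The token counts are hypothetical and the helper name is ours for illustration, not part of any API.

```python
def request_cost_usd(input_tokens, output_tokens,
                     input_price_per_m, output_price_per_m):
    """Estimate the USD cost of one request from per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Mistral Small 3 above: $0.05/M input, $0.08/M output.
# A hypothetical request with 2,000 input tokens and 500 output tokens:
print(f"${request_cost_usd(2_000, 500, 0.05, 0.08):.6f}")  # → $0.000140
```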

Text, Reasoning

NextBit

DeepSeek: R1 Distill Qwen 32B

DeepSeek R1 Distill Qwen 32B is a distilled large language model based on [Qwen 2.5 32B](https://huggingface.co/Qwen/Qwen2.5-32B), using outputs from [DeepSeek R1](/deepseek/deepseek-r1). It outperforms OpenAI's o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.

Other benchmark results include:

- AIME 2024 pass@1: 72.6
- MATH-500 pass@1: 94.3
- CodeForces Rating: 1691

The model leverages fine-tuning from DeepSeek R1's outputs, enabling competitive performance comparable to larger frontier models.

Text

Context

32.8K

Group

Qwen

Pricing preview

Input Price: $0.29 /M tokens

Output Price: $0.29 /M tokens

Slug

deepseek/deepseek-r1-distill-qwen-32b

Text, Reasoning

Unknown provider

DeepSeek: R1 Distill Qwen 14B

DeepSeek R1 Distill Qwen 14B is a distilled large language model based on [Qwen 2.5 14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B), using outputs from [DeepSeek R1](/deepseek/deepseek-r1). It outperforms OpenAI's o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.

Other benchmark results include:

- AIME 2024 pass@1: 69.7
- MATH-500 pass@1: 93.9
- CodeForces Rating: 1481

The model leverages fine-tuning from DeepSeek R1's outputs, enabling competitive performance comparable to larger frontier models.

Text

Context

131.1K

Group

Qwen

Pricing preview

No display pricing published in the current snapshot.

Slug

deepseek/deepseek-r1-distill-qwen-14b

Text

Unknown provider

Liquid: LFM 7B

LFM-7B is a new best-in-class language model designed for exceptional chat capabilities, including in languages like Arabic and Japanese. Powered by the Liquid Foundation Model (LFM) architecture, it exhibits unique features like a low memory footprint and fast inference speed. Liquid positions LFM-7B as the world's best-in-class multilingual language model in English, Arabic, and Japanese. See the [launch announcement](https://www.liquid.ai/lfm-7b) for benchmarks and more info.

Text

Context

32.8K

Group

Other

Pricing preview

No display pricing published in the current snapshot.

Slug

liquid/lfm-7b

Text

Unknown provider

Liquid: LFM 3B

Liquid's LFM 3B delivers incredible performance for its size. It places first among 3B-parameter transformers, hybrids, and RNN models. It is also on par with Phi-3.5-mini on multiple benchmarks while being 18.4% smaller. LFM-3B is the ideal choice for mobile and other edge text-based applications. See the [launch announcement](https://www.liquid.ai/liquid-foundation-models) for benchmarks and more info.

Text

Context

32.8K

Group

Other

Pricing preview

No display pricing published in the current snapshot.

Slug

liquid/lfm-3b

Text, Reasoning

DeepInfra

DeepSeek: R1 Distill Llama 70B

DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.3-70b-instruct), using outputs from [DeepSeek R1](/deepseek/deepseek-r1). The model combines advanced distillation techniques to achieve high performance across multiple benchmarks, including:

- AIME 2024 pass@1: 70.0
- MATH-500 pass@1: 94.5
- CodeForces Rating: 1633

The model leverages fine-tuning from DeepSeek R1's outputs, enabling competitive performance comparable to larger frontier models.

Text

Context

131.1K

Group

Llama3

Pricing preview

Input Price: $0.7 /M tokens

Output Price: $0.8 /M tokens

Slug

deepseek/deepseek-r1-distill-llama-70b

Text, Reasoning

NovitaAI

DeepSeek: R1

DeepSeek R1 is here: Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass. Fully open-source model & [technical report](https://api-docs.deepseek.com/news/news250120). MIT licensed: Distill & commercialize freely!

Text

Context

64K

Group

DeepSeek

Pricing preview

Input Price: $0.7 /M tokens

Output Price: $2.5 /M tokens

Slug

deepseek/deepseek-r1

Text

MiniMax

MiniMax: MiniMax-01

MiniMax-01 combines MiniMax-Text-01 for text generation and MiniMax-VL-01 for image understanding. It has 456 billion parameters, with 45.9 billion parameters activated per inference, and can handle a context of up to 4 million tokens. The text model adopts a hybrid architecture that combines Lightning Attention, Softmax Attention, and Mixture-of-Experts (MoE). The image model adopts the “ViT-MLP-LLM” framework and is trained on top of the text model. To read more about the release, see: https://www.minimaxi.com/en/news/minimax-01-series-2

Text, Image

Context

1M

Group

Other

Pricing preview

Input Price: $0.2 /M tokens

Output Price: $1.1 /M tokens

Slug

minimax/minimax-01

Text

Unknown provider

Mistral: Codestral 2501

[Mistral](/mistralai)'s cutting-edge language model for coding. Codestral specializes in low-latency, high-frequency tasks such as fill-in-the-middle (FIM), code correction and test generation. Learn more on their blog post: https://mistral.ai/news/codestral-2501/

Text

Context

256K

Group

Mistral

Pricing preview

No display pricing published in the current snapshot.

Slug

mistralai/codestral-2501

Text

NextBit

Microsoft: Phi 4

[Microsoft Research](/microsoft) Phi-4 is designed to perform well in complex reasoning tasks and can operate efficiently in situations with limited memory or where quick responses are needed. At 14 billion parameters, it was trained on a mix of high-quality synthetic datasets, data from curated websites, and academic materials. It has undergone careful improvement to follow instructions accurately and maintain strong safety standards. It works best with English language inputs. For more information, please see [Phi-4 Technical Report](https://arxiv.org/pdf/2412.08905)

Text

Context

16.4K

Group

Other

Pricing preview

Input Price: $0.065 /M tokens

Output Price: $0.14 /M tokens

Slug

microsoft/phi-4

Text

Infermatic

Sao10K: Llama 3.1 70B Hanami x1

This is [Sao10K](/sao10k)'s experimental model built on [Euryale v2.2](/sao10k/l3.1-euryale-70b).

Text

Context

16K

Group

Llama3

Pricing preview

Input Price: $3 /M tokens

Output Price: $3 /M tokens

Slug

sao10k/l3.1-70b-hanami-x1

Text

DeepInfra

DeepSeek: DeepSeek V3

DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction following and coding abilities of the previous versions. Pre-trained on nearly 15 trillion tokens, the reported evaluations reveal that the model outperforms other open-source models and rivals leading closed-source models. For model details, please visit [the DeepSeek-V3 repo](https://github.com/deepseek-ai/DeepSeek-V3) for more information, or see the [launch announcement](https://api-docs.deepseek.com/news/news1226).

Text

Context

163.8K

Group

DeepSeek

Pricing preview

Input Price: $0.32 /M tokens

Output Price: $0.89 /M tokens

Slug

deepseek/deepseek-chat

Text

NextBit

Sao10K: Llama 3.3 Euryale 70B

Euryale L3.3 70B is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k). It is the successor of [Euryale L3 70B v2.2](/models/sao10k/l3-euryale-70b).

Text

Context

131.1K

Group

Llama3

Pricing preview

Input Price: $0.65 /M tokens

Output Price: $0.75 /M tokens

Slug

sao10k/l3.3-euryale-70b

Text

Unknown provider

Inflatebot: Mag Mell R1 12B

Mag Mell is a merge of pre-trained language models created using mergekit, based on [Mistral Nemo](/mistralai/mistral-nemo). Intended as a general-purpose "Best of Nemo" model, it combines the strengths of several other models for roleplay, storytelling, and other fictional, creative use cases. Mag Mell is composed of three intermediate parts:

- Hero (RP, trope coverage)
- Monk (intelligence, groundedness)
- Deity (prose, flair)

Text

Context

32K

Group

Mistral

Pricing preview

No display pricing published in the current snapshot.

Slug

inflatebot/mn-mag-mell-r1

Text

Unknown provider

EVA Llama 3.33 70B

EVA Llama 3.33 70B is a roleplay and storywriting specialist model. It is a full-parameter finetune of [Llama-3.3-70B-Instruct](https://openrouter.ai/meta-llama/llama-3.3-70b-instruct) on a mixture of synthetic and natural data. It uses the Celeste 70B 0.1 data mixture, greatly expanding it to improve the versatility, creativity, and "flavor" of the resulting model. This model was built with Llama by Meta.

Text

Context

16.4K

Group

Llama3

Pricing preview

No display pricing published in the current snapshot.

Slug

eva-unit-01/eva-llama-3.33-70b

Text

Unknown provider

xAI: Grok 2 Vision 1212

Grok 2 Vision 1212 advances image-based AI with stronger visual comprehension, refined instruction-following, and multilingual support. From object recognition to style analysis, it empowers developers to build more intuitive, visually aware applications. Its enhanced steerability and reasoning establish a robust foundation for next-generation image solutions. To read more about this model, check out [xAI's announcement](https://x.ai/blog/grok-1212).

Text, Image

Context

32.8K

Group

Grok

Pricing preview

No display pricing published in the current snapshot.

Slug

x-ai/grok-2-vision-1212

Text

Unknown provider

xAI: Grok 2 1212

Grok 2 1212 introduces significant enhancements to accuracy, instruction adherence, and multilingual support, making it a powerful and flexible choice for developers seeking a highly steerable, intelligent model.

Text

Context

131.1K

Group

Grok

Pricing preview

No display pricing published in the current snapshot.

Slug

x-ai/grok-2-1212

Text

Cohere

Cohere: Command R7B (12-2024)

Command R7B (12-2024) is a small, fast update of the Command R+ model, delivered in December 2024. It excels at RAG, tool use, agents, and similar tasks requiring complex reasoning and multiple steps. Use of this model is subject to Cohere's [Usage Policy](https://docs.cohere.com/docs/usage-policy) and [SaaS Agreement](https://cohere.com/saas-agreement).

Text

Context

128K

Group

Cohere

Pricing preview

Input Price: $0.0375 /M tokens

Output Price: $0.15 /M tokens

Slug

cohere/command-r7b-12-2024

Text

Unknown provider

Google: Gemini 2.0 Flash Experimental

Gemini Flash 2.0 offers a significantly faster time to first token (TTFT) compared to [Gemini Flash 1.5](/google/gemini-flash-1.5), while maintaining quality on par with larger models like [Gemini Pro 1.5](/google/gemini-pro-1.5). It introduces notable enhancements in multimodal understanding, coding capabilities, complex instruction following, and function calling. These advancements come together to deliver more seamless and robust agentic experiences.

Text, Image

Context

1M

Group

Gemini

Pricing preview

No display pricing published in the current snapshot.

Slug

google/gemini-2.0-flash-exp

Text

Venice

Meta: Llama 3.3 70B Instruct (free)

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B (text in/text out). The Llama 3.3 instruction tuned text only model is optimized for multilingual dialogue use cases and outperforms many of the available open source and closed chat models on common industry benchmarks. Supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. [Model Card](https://github.com/meta-llama/llama-models/blob/main/models/llama3_3/MODEL_CARD.md)

Text

Context

65.5K

Group

Llama3

Pricing preview

Input Price: $0 /M tokens

Output Price: $0 /M tokens

Slug

meta-llama/llama-3.3-70b-instruct

Text

Inceptron

Meta: Llama 3.3 70B Instruct

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B (text in/text out). The Llama 3.3 instruction tuned text only model is optimized for multilingual dialogue use cases and outperforms many of the available open source and closed chat models on common industry benchmarks. Supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. [Model Card](https://github.com/meta-llama/llama-models/blob/main/models/llama3_3/MODEL_CARD.md)

Text

Context

131.1K

Group

Llama3

Pricing preview

Input Price: $0.12 /M tokens

Output Price: $0.38 /M tokens

Slug

meta-llama/llama-3.3-70b-instruct

Text

Amazon Bedrock

Amazon: Nova Lite 1.0

Amazon Nova Lite 1.0 is a very low-cost multimodal model from Amazon, focused on fast processing of image, video, and text inputs to generate text output. Amazon Nova Lite can handle real-time customer interactions, document analysis, and visual question-answering tasks with high accuracy. With an input context of 300K tokens, it can analyze multiple images or up to 30 minutes of video in a single input.

Text, Image

Context

300K

Group

Nova

Pricing preview

Input Price: $0.06 /M tokens

Output Price: $0.24 /M tokens

Slug

amazon/nova-lite-v1

Text

Amazon Bedrock

Amazon: Nova Micro 1.0

Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency responses in the Amazon Nova family of models at a very low cost. With a context length of 128K tokens and optimized for speed and cost, Amazon Nova Micro excels at tasks such as text summarization, translation, content classification, interactive chat, and brainstorming. It has simple mathematical reasoning and coding abilities.

Text

Context

128K

Group

Nova

Pricing preview

Input Price: $0.035 /M tokens

Output Price: $0.14 /M tokens

Slug

amazon/nova-micro-v1

Text

Amazon Bedrock

Amazon: Nova Pro 1.0

Amazon Nova Pro 1.0 is a capable multimodal model from Amazon focused on providing a combination of accuracy, speed, and cost for a wide range of tasks. As of December 2024, it achieves state-of-the-art performance on key benchmarks including visual question answering (TextVQA) and video understanding (VATEX). Amazon Nova Pro demonstrates strong capabilities in processing both visual and textual information and at analyzing financial documents. **NOTE**: Video input is not supported at this time.

Text, Image

Context

300K

Group

Nova

Pricing preview

Input Price: $0.8 /M tokens

Output Price: $3.2 /M tokens

Slug

amazon/nova-pro-v1

Text, Reasoning

Unknown provider

Qwen: QwQ 32B Preview

QwQ-32B-Preview is an experimental research model focused on AI reasoning capabilities developed by the Qwen Team. As a preview release, it demonstrates promising analytical abilities while having several important limitations:

1. **Language Mixing and Code-Switching**: The model may mix languages or switch between them unexpectedly, affecting response clarity.
2. **Recursive Reasoning Loops**: The model may enter circular reasoning patterns, leading to lengthy responses without a conclusive answer.
3. **Safety and Ethical Considerations**: The model requires enhanced safety measures to ensure reliable and secure performance, and users should exercise caution when deploying it.
4. **Performance and Benchmark Limitations**: The model excels in math and coding but has room for improvement in other areas, such as common sense reasoning and nuanced language understanding.

Text

Context

32.8K

Group

Qwen

Pricing preview

No display pricing published in the current snapshot.

Slug

qwen/qwq-32b-preview

Text

Unknown provider

Google: Gemini Experimental 1121

Experimental release (November 21st, 2024) of Gemini.

Text, Image

Context

41K

Group

Gemini

Pricing preview

No display pricing published in the current snapshot.

Slug

google/gemini-exp-1121

Text

Unknown provider

EVA Qwen2.5 72B

EVA Qwen2.5 72B is a roleplay and storywriting specialist model. It's a full-parameter finetune of Qwen2.5-72B on a mixture of synthetic and natural data. It uses the Celeste 70B 0.1 data mixture, greatly expanding it to improve the versatility, creativity, and "flavor" of the resulting model.

Text

Context

32K

Group

Qwen

Pricing preview

No display pricing published in the current snapshot.

Slug

eva-unit-01/eva-qwen-2.5-72b

Text

Mistral

Mistral Large 2411

Mistral Large 2 2411 is an update of [Mistral Large 2](/mistralai/mistral-large) released together with [Pixtral Large 2411](/mistralai/pixtral-large-2411). It provides a significant upgrade over the previous [Mistral Large 24.07](/mistralai/mistral-large-2407), with notable improvements in long context understanding, a new system prompt, and more accurate function calling.

Text

Context

131.1K

Group

Mistral

Pricing preview

Input Price: $2 /M tokens

Output Price: $6 /M tokens

Slug

mistralai/mistral-large-2411

Text

Mistral

Mistral Large 2407

This is Mistral AI's flagship model, Mistral Large 2 (version mistral-large-2407). It's a proprietary weights-available model and excels at reasoning, code, JSON, chat, and more. Read the launch announcement [here](https://mistral.ai/news/mistral-large-2407/). It supports dozens of languages including French, German, Spanish, Italian, Portuguese, Arabic, Hindi, Russian, Chinese, Japanese, and Korean, along with 80+ coding languages including Python, Java, C, C++, JavaScript, and Bash. Its long context window allows precise information recall from large documents.

Text

Context

131.1K

Group

Mistral

Pricing preview

Input Price: $2 /M tokens

Output Price: $6 /M tokens

Slug

mistralai/mistral-large-2407

Text

Mistral

Mistral: Pixtral Large 2411

Pixtral Large is a 124B parameter, open-weight, multimodal model built on top of [Mistral Large 2](/mistralai/mistral-large-2411). The model is able to understand documents, charts and natural images. The model is available under the Mistral Research License (MRL) for research and educational use, and the Mistral Commercial License for experimentation, testing, and production for commercial purposes.

Text, Image

Context

131.1K

Group

Mistral

Pricing preview

Input Price: $2 /M tokens

Output Price: $6 /M tokens

Slug

mistralai/pixtral-large-2411

Text

Unknown provider

xAI: Grok Vision Beta

Grok Vision Beta is xAI's experimental language model with vision capability.

Text, Image

Context

8.2K

Group

Grok

Pricing preview

No display pricing published in the current snapshot.

Slug

x-ai/grok-vision-beta

Text

Unknown provider

Google: Gemini Experimental 1114

Gemini 11-14 (2024) experimental model features "quality" improvements.

Text, Image

Context

41K

Group

Gemini

Pricing preview

No display pricing published in the current snapshot.

Slug

google/gemini-exp-1114

Text

Unknown provider

Infermatic: Mistral Nemo Inferor 12B

Inferor 12B is a merge of top roleplay models, specializing in immersive narratives and storytelling. It was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method with [anthracite-org/magnum-v4-12b](https://openrouter.ai/anthracite-org/magnum-v4-72b) as a base.

Text

Context

32K

Group

Mistral

Pricing preview

No display pricing published in the current snapshot.

Slug

infermatic/mn-inferor-12b

Text

Cloudflare

Qwen2.5 Coder 32B Instruct

Qwen2.5-Coder is the latest series of code-specific Qwen large language models (formerly known as CodeQwen). Qwen2.5-Coder brings the following improvements upon CodeQwen1.5:

- Significant improvements in **code generation**, **code reasoning**, and **code fixing**.
- A more comprehensive foundation for real-world applications such as **Code Agents**, enhancing coding capabilities while maintaining its strengths in mathematics and general competencies.

To read more about its evaluation results, check out [Qwen 2.5 Coder's blog](https://qwenlm.github.io/blog/qwen2.5-coder-family/).

Text

Context

32.8K

Group

Qwen

Pricing preview

Input Price: $0.66 /M tokens

Output Price: $1 /M tokens

Slug

qwen/qwen-2.5-coder-32b-instruct

Text

Unknown provider

SorcererLM 8x22B

SorcererLM is an advanced RP and storytelling model, built as a low-rank 16-bit LoRA fine-tune of [WizardLM-2 8x22B](/microsoft/wizardlm-2-8x22b).

- Advanced reasoning and emotional intelligence for engaging and immersive interactions
- Vivid writing capabilities enriched with spatial and contextual awareness
- Enhanced narrative depth, promoting creative and dynamic storytelling

Text

Context

16K

Group

Mistral

Pricing preview

No display pricing published in the current snapshot.

Slug

raifle/sorcererlm-8x22b

Text

Unknown provider

EVA Qwen2.5 32B

EVA Qwen2.5 32B is a roleplay and storywriting specialist model. It's a full-parameter finetune of Qwen2.5-32B on a mixture of synthetic and natural data. It uses the Celeste 70B 0.1 data mixture, greatly expanding it to improve the versatility, creativity, and "flavor" of the resulting model.

Text

Context

32K

Group

Qwen

Pricing preview

No display pricing published in the current snapshot.

Slug

eva-unit-01/eva-qwen-2.5-32b

Text

NextBit

TheDrummer: UnslopNemo 12B

UnslopNemo v4.1 is the latest addition from the creator of Rocinante, designed for adventure writing and role-play scenarios.

Text

Context

32.8K

Group

Mistral

Pricing preview

Input Price: $0.4 /M tokens

Output Price: $0.4 /M tokens

Slug

thedrummer/unslopnemo-12b

Text

Amazon Bedrock (US-WEST)

Anthropic: Claude 3.5 Haiku

Claude 3.5 Haiku offers enhanced capabilities in speed, coding accuracy, and tool use. Engineered to excel in real-time applications, it delivers quick response times that are essential for dynamic tasks such as chat interactions and immediate coding suggestions. This makes it highly suitable for environments that demand both speed and precision, such as software development, customer service bots, and data management systems. This model currently points to [Claude 3.5 Haiku (2024-10-22)](/anthropic/claude-3-5-haiku-20241022).

Text, Image

Context

200K

Group

Claude

Pricing preview

Input Price: $0.8 /M tokens

Output Price: $4 /M tokens

Slug

anthropic/claude-3.5-haiku

Text

Unknown provider

Anthropic: Claude 3.5 Haiku (2024-10-22)

Claude 3.5 Haiku features enhancements across all skill sets including coding, tool use, and reasoning. As the fastest model in the Anthropic lineup, it offers rapid response times suitable for applications that require high interactivity and low latency, such as user-facing chatbots and on-the-fly code completions. It also excels in specialized tasks like data extraction and real-time content moderation, making it a versatile tool for a broad range of industries. It does not support image inputs. See the launch announcement and benchmark results [here](https://www.anthropic.com/news/3-5-models-and-computer-use)

Text, Image, File

Context

200K

Group

Claude

Pricing preview

No display pricing published in the current snapshot.

Slug

anthropic/claude-3.5-haiku-20241022

Text

Unknown provider

NeverSleep: Lumimaid v0.2 70B

Lumimaid v0.2 70B is a finetune of [Llama 3.1 70B](/meta-llama/llama-3.1-70b-instruct) with a "HUGE step up dataset wise" compared to Lumimaid v0.1; sloppy chat outputs were purged from the dataset. Usage of this model is subject to [Meta's Acceptable Use Policy](https://llama.meta.com/llama3/use-policy/).

Text

Context

131.1K

Group

Llama3

Pricing preview

No display pricing published in the current snapshot.

Slug

neversleep/llama-3.1-lumimaid-70b

Text

Mancer

Magnum v4 72B

This is a series of models designed to replicate the prose quality of the Claude 3 models, specifically [Claude 3.5 Sonnet](https://openrouter.ai/anthropic/claude-3.5-sonnet) and [Claude 3 Opus](https://openrouter.ai/anthropic/claude-3-opus). The model is fine-tuned on top of [Qwen2.5 72B](https://openrouter.ai/qwen/qwen-2.5-72b-instruct).

Text

Context

16.4K

Group

Qwen

Pricing preview

Input Price: $3 /M tokens

Output Price: $5 /M tokens

Slug

anthracite-org/magnum-v4-72b

Text

Unknown provider

xAI: Grok Beta

Grok Beta is xAI's experimental language model with state-of-the-art reasoning capabilities, best for complex and multi-step use cases. It is the successor of [Grok 2](https://x.ai/blog/grok-2) with enhanced context length.

Text

Context

131.1K

Group

Grok

Pricing preview

No display pricing published in the current snapshot.

Slug

x-ai/grok-beta

Text

Unknown provider

Mistral: Ministral 8B

Ministral 8B is an 8B parameter model featuring a unique interleaved sliding-window attention pattern for faster, memory-efficient inference. Designed for edge use cases, it supports up to 128k context length and excels in knowledge and reasoning tasks. It outperforms peers in the sub-10B category, making it perfect for low-latency, privacy-first applications.

Text

Context

128K

Group

Mistral

Pricing preview

No display pricing published in the current snapshot.

Slug

mistralai/ministral-8b

Text

Unknown provider

Mistral: Ministral 3B

Ministral 3B is a 3B parameter model optimized for on-device and edge computing. It excels in knowledge, commonsense reasoning, and function-calling, outperforming larger models like Mistral 7B on most benchmarks. Supporting up to 128k context length, it’s ideal for orchestrating agentic workflows and specialist tasks with efficient inference.

Text

Context

128K

Group

Mistral

Pricing preview

No display pricing published in the current snapshot.

Slug

mistralai/ministral-3b

Text

Phala

Qwen: Qwen2.5 7B Instruct

Qwen2.5 7B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements upon Qwen2:

- Significantly more knowledge and greatly improved capabilities in coding and mathematics, thanks to specialized expert models in these domains.
- Significant improvements in instruction following, generating long texts (over 8K tokens), understanding structured data (e.g., tables), and generating structured outputs, especially JSON. More resilient to the diversity of system prompts, enhancing role-play implementation and condition-setting for chatbots.
- Long-context support up to 128K tokens, with generation of up to 8K tokens.
- Multilingual support for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.

Usage of this model is subject to the [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).

Text

Context

32.8K

Group

Qwen

Pricing preview

Input Price: $0.04 /M tokens

Output Price: $0.1 /M tokens

Slug

qwen/qwen-2.5-7b-instruct

Text

DeepInfra

NVIDIA: Llama 3.1 Nemotron 70B Instruct

NVIDIA's Llama 3.1 Nemotron 70B is a language model designed for generating precise and useful responses. Leveraging [Llama 3.1 70B](/models/meta-llama/llama-3.1-70b-instruct) architecture and Reinforcement Learning from Human Feedback (RLHF), it excels in automatic alignment benchmarks. This model is tailored for applications requiring high accuracy in helpfulness and response generation, suitable for diverse user queries across multiple domains. Usage of this model is subject to [Meta's Acceptable Use Policy](https://www.llama.com/llama3/use-policy/).

Text

Context

131.1K

Group

Llama3

Pricing preview

Input Price: $1.2 /M tokens

Output Price: $1.2 /M tokens

Slug

nvidia/llama-3.1-nemotron-70b-instruct

Text

Unknown provider

xAI: Grok 2

Grok 2 is xAI's frontier language model with state-of-the-art reasoning capabilities, best for complex and multi-step use cases. To use a faster version, see [Grok 2 Mini](/x-ai/grok-2-mini). For more information, see the [launch announcement](https://x.ai/blog/grok-2).

Text

Context

32.8K

Group

Grok

Pricing preview

No display pricing published in the current snapshot.

Slug

x-ai/grok-2

Text

Unknown provider

xAI: Grok 2 mini

Grok 2 Mini is xAI's fast, lightweight language model that offers a balance between speed and answer quality. To use the stronger model, see [Grok Beta](/x-ai/grok-beta). For more information, see the [launch announcement](https://x.ai/blog/grok-2).

Text

Context

32.8K

Group

Grok

Pricing preview

No display pricing published in the current snapshot.

Slug

x-ai/grok-2-mini


Need a model request?

Use the market snapshot for discovery, then ask ImaRouter for rollout.

If a model matters for your product, send the slug, expected traffic, target region, and latency expectations. The team can confirm support status, onboarding priority, or a migration path to an equivalent route on ImaRouter.
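A request along those lines might be structured as follows. Every field name here is illustrative only, not a documented ImaRouter API; the point is simply to gather the four details the team needs in one place.

```python
# Hypothetical model-availability request; all field names are illustrative.
model_request = {
    "slug": "deepseek/deepseek-r1",       # from a catalog card on this page
    "expected_traffic": "2M tokens/day",  # rough volume estimate
    "target_region": "eu-west",           # where latency matters most
    "latency_expectation_ms": 1500,       # e.g. a p95 response-time target
}

print(", ".join(model_request))
# → slug, expected_traffic, target_region, latency_expectation_ms
```

Sending these four details up front lets the team confirm support status or propose an equivalent route in a single pass.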

Contact

support@imarouter.com

Best for model availability questions, onboarding priority, routing strategy, and enterprise rollout planning.
