Google AI Studio
Google: Gemini Embedding 2 Preview
Gemini Embedding 2 Preview is Google's first multimodal embedding model, mapping text, images, video, audio, and PDFs into a unified vector space for semantic search and retrieval-augmented generation (RAG). It supports input context up to 8,192 tokens and flexible output dimensions from 128 to 3,072 (recommended: 768, 1,536, or 3,072). It is designed for cross-modal similarity: you can embed a text query and retrieve the most relevant images, or vice versa. This makes it well-suited for multimodal search, recommendation, and document understanding pipelines.
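Because text and images share one vector space, cross-modal retrieval comes down to comparing vectors. The sketch below ranks image embeddings against a text query embedding by cosine similarity. It makes no API calls: the 4-dimensional vectors, the file names, and the helper names `cosine_similarity` and `rank_by_similarity` are hypothetical stand-ins chosen for illustration, and real embeddings from the model would have 128 to 3,072 dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, items):
    """Return (item_id, score) pairs sorted from most to least similar."""
    scored = [
        (item_id, cosine_similarity(query_vec, vec))
        for item_id, vec in items.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings; in practice these would come from the model.
text_query_vec = [0.9, 0.1, 0.0, 0.2]
image_vecs = {
    "beach.jpg": [0.8, 0.2, 0.1, 0.3],
    "invoice.png": [0.0, 0.9, 0.4, 0.0],
    "forest.jpg": [0.3, 0.1, 0.9, 0.1],
}

ranking = rank_by_similarity(text_query_vec, image_vecs)
for item_id, score in ranking:
    print(f"{item_id}: {score:.3f}")
```

The same function works in the other direction (an image vector as the query, text vectors as the candidates), since nothing in the comparison depends on which modality produced a vector. All vectors being compared must use the same output dimension.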
Context
8.2K
Group
Gemini
Pricing preview
Text Input: $0.20 /M tokens
Image Input: $0.45 /M tokens
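As a rough budgeting aid, the preview rates above can be turned into a cost estimate. This is a minimal sketch that assumes billing is strictly linear in token count at the listed rates; the function name `estimate_cost_usd` and the example token counts are hypothetical, and actual billing may differ.

```python
# Preview rates from the listing, in USD per 1,000,000 input tokens.
TEXT_USD_PER_M_TOKENS = 0.20
IMAGE_USD_PER_M_TOKENS = 0.45


def estimate_cost_usd(text_tokens, image_tokens):
    """Estimate input cost in USD, assuming linear per-token pricing."""
    if text_tokens < 0 or image_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        text_tokens * TEXT_USD_PER_M_TOKENS
        + image_tokens * IMAGE_USD_PER_M_TOKENS
    )
    return total / 1_000_000


# Example: 5M text tokens plus 2M image tokens (hypothetical workload).
print(f"${estimate_cost_usd(5_000_000, 2_000_000):.2f}")
```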
Slug
google/gemini-embedding-2-preview