Google dropped Gemini Embedding 2 yesterday.
Text. Images. Audio. Video. One unified vector space. Everyone's saying multi-vector search is dead.
Here's what they're missing:
Gemini Embedding 2 is brilliant when your data tells the same story across formats.
→ A product video where frames, voiceover, and captions all mean the same thing? One model, done.
But most real retrieval problems aren't one story.
A biometric system:
→ Face. Fingerprint. Iris. Voiceprint.
Same person. Completely different semantic spaces. Each trait gets its own specialized encoder, so the vectors aren't comparable. You can't collapse them into one.
A coding assistant:
→ Fuzzy semantic search for "that deployment bug last week"
→ Exact keyword match for a --config flag or a file path
These two need to stay separate. Merging them makes both worse.
This is exactly why Milvus supports multiple vector fields in a single collection — dense, sparse, different dims, different metrics — all queryable in parallel, one ranked list back to you.
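What "one ranked list back" can look like, roughly: a minimal sketch of reciprocal rank fusion (RRF), one common way to merge results from separate searches. This is not Milvus's implementation. The ticket IDs and hit lists are made up for illustration, and k=60 is just a conventional default.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of ids into a single list, best first."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            # An id earns more credit the higher it ranks in each list.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is stable.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results for one query, searched against two separate fields.
dense_hits = ["ticket_17", "ticket_42", "ticket_08"]  # fuzzy semantic search
keyword_hits = ["ticket_42", "ticket_99"]             # exact match on "--config"

fused = reciprocal_rank_fusion([dense_hits, keyword_hits])
print(fused)
```

The item both searches agree on rises to the top, and neither search had to be squeezed into the other's vector space to get there.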
The real split:
→ Same thing, different formats → Gemini Embedding 2
→ Different things, same entity → multi-vector retrieval
Two tools. Two problems. Both still have a job.
———
Follow @milvusio, created by @zilliz_universe, for everything related to unstructured data.