Multimodal AI

Google Gemini 1.5 Pro's Native Multimodality: 1-Hour Video Analysis in a Single API Call

Google's Gemini 1.5 Pro accepts video as a native input, processing up to 1 hour of footage sampled at 1 fps within a single context window. Early benchmarks show 87% accuracy on relationship extraction from technical diagrams and 94% accuracy on chart data extraction from financial reports, significantly outperforming models that rely on bolt-on vision components.
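
To make the "1 hour at 1 fps in a single context window" figure concrete, the rough arithmetic can be sketched as below. The function name `video_frame_budget` is hypothetical, and the default of 258 tokens per frame is an assumption taken for illustration rather than a figure from this report; the actual per-frame cost depends on the model version and media settings, and audio tokens are ignored here.

```python
def video_frame_budget(duration_s: int, fps: float = 1.0,
                       tokens_per_frame: int = 258) -> tuple[int, int]:
    """Estimate sampled frames and visual tokens for a video.

    tokens_per_frame is an assumed illustrative value, not a number
    stated in the source report.
    """
    frames = int(duration_s * fps)       # frames sampled from the video
    tokens = frames * tokens_per_frame   # visual tokens only, no audio
    return frames, tokens


if __name__ == "__main__":
    frames, tokens = video_frame_budget(duration_s=3600)  # 1 hour
    print(f"{frames} frames, about {tokens} visual tokens")
```

Under that assumption, an hour of video yields 3,600 frames and a visual token count in the high hundreds of thousands, which is why a very long context window is needed to hold the whole video at once.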

May 18, 2025. Source: Google DeepMind
Tags: Google, Gemini, multimodal, video-understanding, vision-AI