extremetech.com

Google Announces Gemini 2.0 to Kickstart an Era of 'AI Agents'

Time sure flies when you're being bombarded with AI models in every technology product in the world. It has been just a year since Google revealed Gemini 1.0, and it's celebrating by announcing the latest iteration, Gemini 2.0. Like the current v1.5 models, Google plans to have multiple flavors of Gemini 2.0, but it's starting with the smaller, more efficient Gemini 2.0 Flash, and you can try it now.

Gemini 2.0 was built for what Google insists upon calling the "agentic era," which will be marked by AI models (agents) that can act autonomously on your behalf. Gemini 2.0 will power some of Google's previously teased AI projects, but it's starting relatively modestly with Gemini 2.0 Flash. While Google is making some big promises for what Gemini 2.0 will do, its current implementation is optimized for chats.

Demis Hassabis of Google's DeepMind AI division says Gemini 2.0 Flash outperforms the larger Gemini 1.5 Pro on multiple metrics. At the same time, it's twice as fast. That means you won't have to wait as long for text to appear, and Google saves some computing resources. It supports multimodal input as well as output. That means you can feed it images, video, or audio like the 1.5 models, but it can also output multiple data types. Currently, that's limited to text and images (via Imagen 3), but it could include more data types later.

In our limited testing, Gemini 2.0 Flash does appear to outdo Gemini 1.5 Pro in some queries. For example, Gemini 1.5 still often fails the "strawberry" test, insisting there are only one or two letter Rs in the word. Gemini 2.0 Flash gets it right the first time. If you want to take the new model for a spin, all you need to do is select it in the Gemini app or website.

This announcement confirms Google is going all-in with AI agents. According to Google, the agentic era will finally unleash the power of AI. Rather than have a back-and-forth chat with a robot, you can instruct the robot to perform a complex task or ask it contextual questions. Google first showed off Project Astra at I/O this year, but it's built a new version on Gemini 2.0. Astra analyzes live video from your phone camera, allowing you to ask questions and get feedback on what it sees. Google says the new version is faster, has more reliable memory, and works in more languages. We'll have to take Google's word about that, as Astra has not expanded beyond a small group of "trusted testers."

Google also says that Gemini 2.0 will power Project Mariner, which Google has also teased. Mariner is a computer use agent, capable of browsing the web and collecting data on your behalf. In the demo above, the user asks Gemini to look up contact info for a list of companies. The agent searches Google, clicks around websites, and then lists the requested data. That's potentially quite useful if it works. Google claims Mariner has topped the charts in the WebVoyager benchmark, which tests an agent's ability to perform real-world tasks.

That's all off in the future, though. For now, you'll have to make do with Gemini 2.0 Flash. Even if Google's lofty agent ambitions take longer to come to fruition, the new model is a positive update. You can access Gemini 2.0 Flash even without a Gemini subscription, making the basic version of Google's AI marginally more useful. At least it can spell "strawberry."

Read full news in source page