arstechnica.com

Gemini is an increasingly good chatbot, but it’s still a bad assistant

Google's generative AI is not ready to serve as your virtual assistant.

Gemini AI Android app assistant

Credit: Ryan Whitwam

Google announced its intention to unify its generative AI efforts under the Gemini brand at the tail end of 2023, and it's been full steam ahead ever since. In 2025, Google Assistant is being phased out and replaced with Gemini. As Google, Amazon, and others move toward a world in which all virtual assistants are based on generative AI, it's reasonable to consider whether this is actually a good idea. Despite promises of "smarter" AI and ever-increasing token limits, these robots still have a fundamental flaw that may make them bad assistants: They lie.

They don't set out to lie, of course, because they don't know what a "lie" is. These systems build an output one token at a time, typically sampling each plausible next token from a probability distribution rather than always taking the single likeliest option. Because of this, generative AI is non-deterministic in practice: you can't predict the exact output, and even running the same prompt multiple times will offer varying responses.
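To make the sampling idea concrete, here is a minimal toy sketch. It is not how Gemini or any production model is implemented: the four-item vocabulary, the scores, and the function names are all invented for illustration. It only shows why picking the top-scoring token gives the same answer every time, while sampling from the scores does not.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities that sum to 1."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_next_token(vocab, logits):
    """Always return the highest-scoring token (repeatable)."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return vocab[best]


def sample_next_token(vocab, logits, temperature=1.0, rng=random):
    """Draw one token at random, weighted by its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]


# Hypothetical vocabulary and made-up scores, for illustration only.
VOCAB = ["9400", "9205", "1Z99", "sorry"]
LOGITS = [2.0, 1.8, 0.5, 0.1]

if __name__ == "__main__":
    print("greedy: ", [greedy_next_token(VOCAB, LOGITS) for _ in range(5)])
    print("sampled:", [sample_next_token(VOCAB, LOGITS) for _ in range(5)])
```

The greedy list repeats one token, while the sampled list can differ from run to run. Raising `temperature` flattens the probabilities and makes the less likely tokens show up more often.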

This can look impressively like thinking sometimes, but it also leads to frequent hallucinations. That's why the iPhone said Luigi Mangione was dead and Google told people to put glue on pizza. GenAI proponents like Google and Apple have been trying to curb the chaos of confabulations, but this may always be a problem because of the nature of the underlying technology.

Even if a generative AI assistant is right most of the time—and we are reaching that point—an occasional hallucination can still screw up your day. And yet, Google has moved with relentless efficiency to wedge generative AI into every one of its products, which is why we're all watching Assistant wither and die in favor of Gemini.

Trust but verify

I'm not deeply committed to Assistant—it's missing a lot of functionality, and sometimes the bugs can be so frustrating that I regret even engaging with it. However, I'm almost certain I'll miss it once it's gone. Assistant is great at basic things like setting timers and sending messages, and it does so without a fuss. These are things that Gemini, in spite of all its cloud-based processing muscle, still screws up. For anything more complex or important, Gemini is worse than inefficient—it's untrustworthy.

Google leverages the theoretical power of generative AI to give Gemini access to data across multiple apps. When it works, this can be very handy. For example, you can ask Gemini to check your email for a specific message, extract data, and pass it to another app. I was excited about this functionality at first, but in practice, it makes me miss the way Assistant would just fail without wasting my time.

Gemini icon macro

Credit: Ryan Whitwam

I was reminded of this issue recently when I asked Gemini to dig up a shipment tracking number from an email—something I do fairly often. It appeared to work just fine, with the robot citing the correct email and spitting out a long string of numbers. I didn't realize anything was amiss until I tried to look up the tracking number. It didn't work in Google's search-based tracker, and going to the US Postal Service website yielded an error.

That's when it dawned on me: The tracking number wasn't a tracking number; it was a confabulation. It was a believable one, too. The number was about the right length, and like all USPS tracking numbers, it started with a nine. I could have looked up the tracking number myself in a fraction of the time it took to root out Gemini's mistake, which is very, very frustrating. Gemini appeared confident that it had completed the task I had given it, but getting mad at the chatbot wouldn't do any good—it can't understand my anger any more than it can understand the nature of my original query.

At this point, I would kill for Assistant's "Sorry, I don't understand."

This is just one of many similar incidents I've had with Gemini over the last year—I can't count how many times Gemini has added calendar events to the wrong day or put incorrect data in a note. In fairness, Gemini usually gets these tasks right, but its mechanical imagination wanders often enough that its utility as an assistant is suspect. Assistant just couldn't do a lot of things, but it didn't waste my time acting like it could. Gemini is more insidious, claiming to have solved my problem when, in fact, it's sending me down a rabbit hole to fix its mistakes. If a human assistant operated like this, I would have to conclude they were incompetent or openly malicious.

Like all generative AI firms, Google includes a disclaimer on Gemini that it can make mistakes and users should double-check its work. If I'm using Gemini for anything even remotely important, you can bet I'm scrutinizing what it does. Maybe that's viable for some tasks, but at that point, I might as well do things myself.
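For extraction tasks like the tracking-number lookup, one narrow piece of that double-checking can be automated: confirming that the value the chatbot returned actually appears in the source text. The sketch below is a hypothetical illustration, not a Gemini feature or API. The function names, the sample email, and the tracking numbers are all made up.

```python
import re


def normalize(text: str) -> str:
    """Keep only letters and digits, uppercased, so spacing and dashes don't matter."""
    return re.sub(r"[^0-9A-Za-z]", "", text).upper()


def appears_in_source(extracted: str, source_text: str) -> bool:
    """Return True only if the extracted value occurs in the source text."""
    needle = normalize(extracted)
    # An empty needle would match anything, so reject it explicitly.
    return bool(needle) and needle in normalize(source_text)


# Made-up email body and made-up tracking numbers, for illustration only.
EMAIL = "Your package has shipped. Tracking number: 9400 1234 5678 9012 3456 78"

if __name__ == "__main__":
    print(appears_in_source("9400123456789012345678", EMAIL))  # copied from the email
    print(appears_in_source("9400123456789012345687", EMAIL))  # plausible but invented
```

A check like this only catches values that were never in the source. It can't tell whether the model pulled the right value from the right email, and because it strips all punctuation and spacing, it could in principle match digits that span unrelated parts of the text.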

Google has a case to make at I/O

When Google was all-in with Assistant, it created an expansive toolkit for developers to build integrations and conversational experiments with the system. Now, devs are starting from scratch in the Gemini era as Google pushes to sunset Assistant by the end of the year. Google placed a big bet on Gemini improving, and everyone will be looking for evidence of that improvement at Google I/O.

Extreme close-up of Google Assistant text.

Credit: Google

This is, of course, not a surprise. Right from the start, Gemini has been pushing Assistant to the side. When Google released the Gemini app for Android, you couldn't even install it without disabling Assistant on your phone.

Google's annual I/O event is ostensibly for developers, and many of the attendees, both virtual and physical, will remember when Assistant was core to Google's strategy. Many of them probably spent time working with the Assistant dev tools, but Google will be pitching Gemini even harder this year.

Anyone who's plugged into Google's platform can attest that the road to Gemini has been a long one. Google couldn't wait to make it everyone's assistant, even though it lacked features compared to Google Assistant. Google has been moving fast as it tries to catch up to OpenAI, releasing new Gemini models so quickly that it can be hard to keep up. Some of them, like the new experimental version of 2.5 Pro, are beginning to show noticeable vibe improvements. But is Gemini trustworthy enough to manage my calendar or email? Not yet.
