OpenAI is dominating the AI scene right now after having cleverly taken over the holidays with its “12 Days” event. Each weekday brings us a new livestream where OpenAI announces a new ChatGPT feature or another product update, leaving little room for competing AI companies to shine. But then there’s Google, OpenAI’s biggest AI rival, which has found equally clever ways to compete for attention.
Just last week, Google announced the big Gemini 2.0 upgrade and its first AI agents. If that wasn’t enough to make us forget about ChatGPT for at least a day, Google decided to also unveil the Android AR platform that will power XR devices with AI at the core. In the process, Google previewed its unnamed Gemini-powered AR smart glasses.
A few days later, Google came out with another new AI product: an image generator called Google Whisk. It isn’t like your regular AI image generator, and that’s probably what makes it so fun. Instead of typing a prompt for Gemini to create a specific AI image, you upload images and have Whisk create new scenes based on them.
It’s not a full product, as Whisk is currently only available as a Google Labs demo. It’s also restricted to the US market. But it looks awesome nonetheless.
Google has a few powerful AI image generators at its disposal. Some are available in Google Photos, and some were introduced with the Pixel 9 phones. I’ve often criticized Google’s AI photo editing tools, especially the ones that shipped with the Pixel 9 phones, as they allow anyone to easily manipulate reality and turn it into something fake.
The company was in such a rush to show its progress in AI that it launched those features without first deploying safeguards. Those came later.
A Whisk image (right) based on what Gemini understood from the three uploaded images on the left. Image source: Google
Whisk isn’t like that. It’s not meant to create lifelike images that can be used for dubious activities. It’s a fun way to develop quick AI images using photos you already have as inspiration. Whisk will not tell you to write a detailed prompt for an AI-generated image. Instead, it’ll ask you to upload three images: One for the subject, one for the scene, and one for the style. Gemini will then analyze those images, craft its own prompt based on them, and pass that on to Google’s Imagen 3 image generation tool.
Google said in a blog post that the process “captures your subject’s essence, not an exact replica.”
However, you might not like what Gemini thought you wanted out of your images. If that’s the case, you can add a text prompt so the AI can Whisk up something new that’s more in line with what you envisioned.
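For the curious, the flow described above can be sketched in a few lines of Python. This is only a rough illustration of the idea, not Google’s actual implementation: the `WhiskInputs` class, the `compose_prompt` function, and the wording of the combined prompt are all hypothetical, and the sketch assumes the three images have already been turned into short text descriptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WhiskInputs:
    """Hypothetical container for text descriptions of the three uploaded images."""
    subject: str
    scene: str
    style: str


def compose_prompt(inputs: WhiskInputs, refinement: Optional[str] = None) -> str:
    """Merge the three descriptions into one text prompt for an image generator.

    The template below is an assumption made for illustration only.
    """
    prompt = (
        f"{inputs.subject}, set in {inputs.scene}, "
        f"rendered in the style of {inputs.style}"
    )
    # An optional text prompt from the user is appended to steer the result.
    if refinement:
        prompt += f". {refinement.strip()}"
    return prompt


if __name__ == "__main__":
    inputs = WhiskInputs("a ginger cat", "a snowy forest", "a felt plush toy")
    print(compose_prompt(inputs))
    print(compose_prompt(inputs, "Make it nighttime"))
```

In the real product, the descriptions would come from Gemini and the final prompt would go to Imagen 3. The sketch only shows how a subject, a scene, a style, and an optional refinement could be combined into a single prompt.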
Google also notes that Whisk is a “new type of creative tool,” rather than a traditional image editor. “We built it for rapid visual exploration, not pixel-perfect edits. It’s about exploring ideas in new and creative ways, allowing you to work through dozens of options and download the ones you love,” Google said.
Some Redditors who tested the feature found that Whisk can create lifelike subjects, such as a cat.
It also seems to me like Whisk is the perfect tool to enlist your help in training the AI without Google telling you that you’re training the AI. Think about it: you’re giving Google your photos, and then Gemini looks at them to see what it can understand. It then pieces together three pics to create one image that isn’t perfect. The text prompt you use to refine the image is actually a feedback tool for Gemini.
In a world where AI firms are running out of data to train their models, experiments like Whisk, which can easily go viral, can come in handy. On that note, Google doesn’t say what happens with your Whisk interactions. What happens to the photos you upload to Whisk? What happens with the “chat” with Gemini? We don’t know.
You can try Whisk by signing up for it at Google Labs, as long as you’re based in the US. The new AI image generator isn’t available in international markets.