Google I/O Focuses On AI Reasoning, Vision, Audio - And Planning Ahead

by Laurie Sullivan @lauriesullivan, May 14, 2024

Google announced a wealth of AI-related news and features Tuesday for developers at its I/O conference, but none will have more impact for advertisers and marketers than AI's ability to see, hear, respond, and think steps ahead while working across multiple types of media.

The news comes one day after OpenAI and Microsoft announced similar technology, and the impact on the industry will be far-reaching. For advertising and marketing, the technology should give brands a new vision for their campaigns, literally, from search to connected television (CTV). And it will become more important to rethink content on websites -- not just text, but audio and video.

“AI agents will undoubtedly transform the ad industry,” Jacob Bourne, a technology analyst at eMarketer, wrote in an email to MediaDailyNews. “We can expect to see AI-powered ads that can adapt in real-time to user interactions by offering similar content in the different modalities of text, images, audio, and video.”

He explained how the agent’s conversational abilities could help guide consumers through shopping experiences that are both data-driven and personalized.

Computer vision capabilities will play a significant role in search, and those that can analyze images and provide recommendations to consumers will open new possibilities for brands to deliver ads with immediate relevance.

Those who opted into Search Generative Experience (SGE) via Search Labs are familiar with the AI overview feature, which populates AI insights at the top of search results.

SGE, renamed AI Overviews today, will become available to everyone in the U.S. starting today, made possible by a new Gemini model customized for Google Search.

Gemini uses video understanding capabilities from Project Astra, a project from Google DeepMind intended to reshape the future of AI assistants. The universal AI agent understands the complex world and takes in images and voice to understand what it sees.

AI vision must have the ability to take in images and remember what it sees. Whittling the response time down to provide a true conversational experience is complicated. It will change the way search-engine optimization marketers think about the type of content and descriptions on websites.

To demonstrate AI vision, a Google employee held up a camera phone in a prerecorded demonstration and said, “tell me when you see something that makes sound.” The camera phone panned across the room, scanning the image of a speaker. “I see a speaker, which makes sound,” the AI said. The Google employee asked, “what is that part of the speaker called?”

The conversation continued between the Google employee and the AI agent, covering a variety of topics related to objects and elements in the room and environment such as crayons, code on a computer screen, and the neighborhood after the employee held the phone by the window, so the agent could identify the surroundings.

In upcoming months, Gemini Advanced will offer a new planning feature that helps users plan meals, vacations and more. With this experience, Google says Gemini Advanced could create an itinerary through a multi-step prompt in a search query.

“My family and I are going to Amsterdam for weeklong vacation in May. My daughter loves tulips, so where can we see them growing live? Can you also find a hotel close to these areas, and include any Gmail correspondence I have about this trip?”

