9 Important Announcements at Google I/O 2024
Google I/O 2024 officially took place on Wednesday morning WIB (May 15, 2024). In the keynote session, the internet search giant announced a number of innovations and projects that will shape the future of technology.
Unlike previous years, Google I/O 2024 focused on the artificial intelligence (AI) technology the company is developing. One of the most highlighted topics was Gemini.
"Google is fully in the Gemini era," said Sundar Pichai, SEO Google and Alphabet when opening the Google I/O 2024 keynote session which was held in a hybrid manner.
Since being announced at Google I/O 2023, Gemini has continued to evolve. Two months ago, Google introduced Gemini 1.5 Pro, which can process 1 million tokens in a single query.
The company, headquartered in Mountain View, is working to put Gemini in the hands of even more users. Today, more than 1.5 million developers use Gemini models to build next-generation AI applications.
"We've also brought Gemini's breakthrough capabilities across our products, in a powerful way. We'll showcase examples today in Search, Photos, Workspace, Android and more," said Sundar Pinchai.
There were 9 important announcements made during the Google I/O 2024 keynote session, namely:
1. Gemini
Google announced Gemini 1.5 Flash, which offers high speed and cost efficiency as an alternative to Gemini 1.5 Pro while remaining highly capable. Meanwhile, Gemini 1.5 Pro has been improved to deliver higher-quality responses in areas such as translation, reasoning, and programming.
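For developers, the difference between the two models mostly comes down to speed and cost versus response quality. Below is a minimal sketch of how that choice might look with Google's google-generativeai Python SDK; the API key placeholder and the prompts are illustrative assumptions, not something shown at the keynote.

```python
# Minimal sketch: calling Gemini 1.5 Flash and Gemini 1.5 Pro through the
# google-generativeai SDK. The API key and prompts are placeholders.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # key from Google AI Studio (assumed setup)

# Gemini 1.5 Flash: the faster, cheaper option for high-volume or latency-sensitive tasks.
flash = genai.GenerativeModel("gemini-1.5-flash")
print(flash.generate_content("Summarize the key points of Google I/O 2024 in two sentences.").text)

# Gemini 1.5 Pro: the higher-quality option for translation, reasoning, and coding.
pro = genai.GenerativeModel("gemini-1.5-pro")
print(pro.generate_content("Translate 'artificial intelligence is everywhere' into French.").text)
```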
Google also announced a 1-million-token context window for Gemini Advanced, allowing consumers to get AI assistance with large documents such as PDFs up to 1,500 pages long or batches of 100 emails. Google says it is previewing a 2-million-token context window for Gemini 1.5 Pro and Gemini 1.5 Flash to developers via a waitlist in Google AI Studio.
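To picture how a developer would actually use that long context, here is a hedged sketch that uploads a large PDF through the SDK's File API and then asks Gemini 1.5 Pro about it. The file name and the question are our own placeholders, not examples from Google.

```python
# Hedged sketch: long-context question answering over a large PDF with
# Gemini 1.5 Pro via the google-generativeai File API. File name and
# question are placeholders.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Upload the document once; the service returns a file handle we can reference.
report = genai.upload_file("annual_report.pdf")

model = genai.GenerativeModel("gemini-1.5-pro")

# The long context window is what lets the model take the whole document
# plus the question in a single request.
response = model.generate_content([report, "List the main risks mentioned in this report."])
print(response.text)
```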
No less interesting, Google announced Gemini Nano with Multimodality. This model is designed to run on smartphones and has been extended to understand images, text, and spoken language.
As for Gemma, the open model family also received a big upgrade with the launch of Gemma 2, which is optimized for TPUs and GPUs and comes with 27B parameters. On top of that, Google announced PaliGemma, a vision-language model added to the Gemma model family.
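Because Gemma models ship as open weights, they can also be run locally or on a developer's own infrastructure. The sketch below loads a Gemma 2 checkpoint with Hugging Face transformers; the model identifier "google/gemma-2-27b-it" is our assumption based on Google's earlier Gemma naming, and a 27B-parameter model needs substantial GPU memory.

```python
# Rough sketch: running a Gemma 2 checkpoint locally with Hugging Face
# transformers. The model id below is an assumed identifier; check the
# Hugging Face hub for the official one.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-27b-it"  # assumed name; 27B weights need a large GPU
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # half precision to reduce memory use
    device_map="auto",            # spread the model across available devices
)

inputs = tokenizer("Explain in one sentence what a TPU is.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```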
2. Project Astra
Google DeepMind revealed Project Astra, which is expected to change the future of AI assistants with video understanding capabilities. Project Astra aims to develop a universal AI agent that can help in everyday life.
During the demonstration, this research model showed off its capabilities by identifying objects that produce sound, coming up with a creative alliteration, explaining code shown on a monitor, and finding items the user had misplaced.
Project Astra also shows its potential in wearable devices, such as smart glasses, where it can analyze diagrams, suggest improvements, and generate intelligent responses to visual stimuli. In the future, Gemini will use the video understanding capabilities of Project Astra to shape the future of AI assistants.
3. AI integration in Google Search
AI will be integrated into almost all Google products, from the long-standing Search to Android 15. US users in particular can now see AI Overviews in their search results, no longer limited to Search Labs.
Users will also be able to customize their AI Overviews with options to simplify the language or break the answer down in more detail. This can be especially useful if the user is new to a topic, or is trying to simplify something to satisfy a child's curiosity.
Google promises AI Overviews will help answer increasingly complex questions. For example, maybe you're looking for a new yoga or Pilates studio, and you want one that's popular with locals, conveniently located for your commute, and also offers discounts for new members.
4. Imagen 3
This model produces Google's highest-quality images yet, with more detail and fewer artifacts, to help create more realistic pictures. Imagen 3 has enhanced natural language capabilities to better understand user prompts and the intent behind them.
The model tackles one of the biggest challenges for AI image generators, namely rendering text, and Google claims Imagen 3 is the best at it. Imagen 3 is not widely available yet; it is in private preview within ImageFX for select creators. The model will come to Vertex AI soon, and people can register to join the waitlist.
5. Veo (text-to-video generator)
Veo can produce high quality 1080p resolution videos with a duration of more than one minute. The model can better understand natural language to produce videos that better represent the user's vision, according to Google. Veo also understands cinematic terms like “timelapse” to produce videos in a variety of styles and gives users more control over the final result.
6. SynthID
In the era of generative AI, many companies are focusing on making their AI models multimodal. To keep its AI labeling tools up to date, Google is now expanding SynthID, its technology for watermarking AI-generated images, to two new modalities: text and video. In addition, Google will apply a SynthID watermark to videos produced by Veo.
7. Ask Photos
If you've ever spent hours scrolling through your photo library looking for a particular image, Google is offering an AI solution to the problem. Using Gemini, users can type conversational prompts in Google Photos to find the images they are looking for.
This feature is called Ask Photos. Google announced that the feature will roll out later this summer with more capabilities in the future.
In the example Google provided, the user wanted to see their daughter's progress as a swimmer over time, so they asked Google Photos that question, and it automatically packaged the highlights for them.
8. AI on Android
Circle to Search, which previously could only perform Google searches by circling images, videos and text on their phone screen, can now "help students with their homework." Google says the feature will work with topics ranging from math to physics, and will eventually be able to process complex problems like symbolic formulas, diagrams, and more.
Gemini will also replace Google Assistant, becoming the default AI assistant on Android phones and accessible by long pressing the power button. Google says Gemini will be implemented across a variety of services and applications, providing multimodal support when requested.
Gemini Nano's multimodal capabilities will also be utilized through Android's TalkBack feature, providing more descriptive responses for users who are blind or visually impaired.
Gemini Nano can also listen for and detect suspicious conversation patterns during phone calls and notify users with prompts such as "Hang up & continue" or "End call". This feature is promised to be available by the end of this year.
9. Google Workspace
With all the Gemini updates, Google Workspace is increasingly integrated with AI. For starters, the Gemini side panels of Gmail, Docs, Drive, Slides, and Sheets will be upgraded to Gemini 1.5 Pro.
Gmail for mobile now has three useful new features: summaries, Gmail Q&A, and Contextual Smart Replies. The Summarize feature does exactly what its name suggests -- it uses Gemini to summarize a thread of emails. This feature will be available to users starting this month.
Gmail's Q&A feature allows users to chat with Gemini about the context of their emails within the Gmail mobile app. For example, in the demo, users asked Gemini to compare roof repair offers based on price and availability. Gemini then pulls information from several different inboxes and displays it to the user.
Finally, the Help Me Write feature in Gmail and Docs is getting support for Spanish and Portuguese, which will come to desktop in the coming weeks.