
Google’s new AI “Astra” is the future of AI assistants – JARVIS for mankind

Google Astra AI

Google has unveiled a new AI model named “Astra”, powered by Gemini, which is expected to be the future of AI assistants and to enhance the way people interact with AI.

Google held its I/O developer conference on 14th May, where it unveiled a couple of AI models, of which Astra was very well received. Google has been exploring ways to deliver more value to its users by using artificial intelligence to its fullest. One such way is to blend technology into people’s lives, so that users can access it easily and seamlessly in real time.

In that respect, Astra is expected to deliver on this goal and is said to be the future of AI assistants, more or less like Iron Man’s JARVIS for mankind.

What is Google Astra?

Google Astra is a “universal AI agent that is helpful in everyday life”, according to DeepMind’s Astra page. But that description needs some unpacking.

Astra is a multimodal AI model developed by Google’s DeepMind to serve people as a useful AI assistant for a range of activities, from getting information to clarifying their doubts about the real world.

Built on Google’s Gemini, Astra is an AI assistant capable of processing multimodal (text, picture, audio and video) information, understanding the context you’re in and responding naturally in conversation.

In the demo shown by Google during the conference, a user asked the AI to identify a part of a speaker, find her missing glasses, review code and more. Astra answered all of these in real time and in a very conversational way, like an encyclopedic assistant.

Google thinks the best way to connect more with people is via smartphones and smart glasses. And Astra, being a multimodal assistant, will be highly beneficial to users if it can work seamlessly in real time on phones and glasses.

“It also needs to be proactive, teachable and personal, so users can talk to it naturally and without lag or delay,” says DeepMind’s site.



Demo of Astra

Google let developers try a demo of Astra, presenting them with four use cases (storyteller, Pictionary, alliteration and free-form). The fourth, free-form, lets you do anything you want with the AI.

These use cases are self-explanatory, and they are things existing generative-AI models already do well. The real difference lies in Astra’s multimodality: the depth, speed and adaptability with which it relates the real world to the context of the question, which users described as ‘unbelievably impressive’. The storyteller and Pictionary capabilities will mostly benefit children, students and people looking for entertainment in their spare time.

Astra can explain drawings, describe what a specific part of a device can do, solve maths problems, review code, recognize drawings of landmarks, memorize a sequence of objects, interpret drawings from literature and more.

For example, when a pepper was placed in front of Astra’s camera feed and the AI was asked to create an alliteration, it came up with “Perhaps polished peppers pose peacefully,” said a reviewer from ZDNet.

The real challenge for Google was to make existing generative AI understand multimodal information, and to get the response time down to something conversational, in ‘real time’.

Google Astra is still in the early prototype phase and only represents one way you might want to interact with a system like Gemini. The DeepMind team is still researching how best to bring multimodal models together and how to balance ultra-huge general models with smaller and more focused ones. This means Google won’t stop with Astra, and a lot more is coming, as announced at Google I/O.


(For more interesting and informative stories on technology and innovation, keep reading The Inner Detail).

