Gems, Imagen 3 and Gemini Live

At I/O 2024, Google presented new functions for its Gemini AI platform, which are now gradually becoming available. With Gems, a system of topic-based AI assistants, and Imagen 3, an advanced image generator, Google is showing how artificial intelligence can be used in a personalised and ethically responsible way. In this article, we take a closer look at the new functions, their areas of application and the associated challenges.

Gems: Focused AI chatbots for individual needs

Gems are customised versions of the Gemini AI assistant. They let users create experts for specific topics or tasks, tailored precisely to their needs. You define these personalised chatbots yourself, and they remember what you have already discussed and in what context, which lets them support you with recurring tasks. The concept is broadly similar to OpenAI's GPTs and Anthropic's Projects.

Gems are available to Gemini users on the Advanced, Business or Enterprise plans. Users can define specifications such as goals, rules of behaviour and intended uses. There are also a few ready-made Gems to get you started, for example a learning aid, creative partner, career counsellor, writing assistant or coding partner. These ready-made Gems can be used directly and make it easier to get familiar with the topic-based AI assistants.

Imagen 3: Advanced AI-supported image generation

Imagen 3 marks the return of Google's AI image generator, which can now (again) create images of people, albeit with a few limitations. Image quality and variety have improved considerably, and the AI can now create images in different styles, from completely realistic depictions to artistic interpretations. This function is available in all languages supported by Google, and the generated images are marked with Google's SynthID watermarking technology, so that AI-generated content can be identified as such.

It is worth noting that Google has reintroduced the generation of images of people after earlier problems with this feature, although only in English for the time being. The free version of Gemini can also access general image generation, but without the extended functions for images of people.

In earlier versions, for example, the AI depicted historical figures incorrectly. With Imagen 3, Google wants to correct such inaccuracies. Depictions of famous people or minors, as well as excessively violent or otherwise inappropriate content, remain prohibited.

Gemini Live: The future of real-time AI interactions

Even though Gemini Live is currently only available in English and as part of a paid subscription, Google has overtaken the ChatGPT equivalent, which is stuck in alpha testing, and put pressure on OpenAI in the race to offer real-time voice interaction with AI. With Gemini Live, responses should become even more dynamic and contextualised in the future.

In addition, the boundary between machine and human communication is further blurred. This function could be particularly helpful when it comes to responding quickly to enquiries, for example in customer service or in learning environments.

Pricing and availability

Most of the new functions are currently only available with a paid subscription: Gemini Advanced costs USD 20 per month, while the Enterprise version costs USD 30.

Conclusion: Personalised AI for everyday life and beyond

The new functions on the Gemini platform show how Google wants to make the use of AI accessible to a broad user base. Gems and Imagen 3 offer customised support in a wide range of scenarios and show what is already possible with artificial intelligence today. Google is taking a careful approach here, in which innovation and responsibility go hand in hand. Protective measures such as SynthID, which makes AI-generated content identifiable, are particularly important.

With these developments, Google is showing that the future of AI lies in customised, responsible yet powerful solutions that meet the needs of users and go beyond the limits of traditional applications. It remains to be seen which models will prove most useful in which situations.
