ChatGPT gets screen sharing and real-time video analysis, rivaling Gemini 2.0

OpenAI has finally added long-awaited video and screen-sharing capabilities to its advanced voice mode, letting users interact with the chatbot across multiple modalities.

Both capabilities are now available in the iOS and Android mobile apps for ChatGPT Team, Plus and Pro users, and will roll out to ChatGPT Enterprise and Edu subscribers in January. However, users in the EU, Switzerland, Iceland, Norway and Liechtenstein won’t be able to access advanced voice mode.

OpenAI first teased the feature in May, when the company unveiled GPT-4o and discussed ChatGPT learning to “watch” a game and explain what’s happening. Advanced voice mode was rolled out to users in September.

Users can start a video session via a new button on the advanced voice mode screen.

OpenAI’s video mode feels like a FaceTime call, because ChatGPT responds in real time to what users show on camera. It can see the user’s surroundings, identify objects and even remember people who introduce themselves. In an OpenAI demo during the company’s “12 Days of Shipmas” event, ChatGPT used the video feature to help brew coffee: it recognized the coffee paraphernalia, instructed when to put in a filter and critiqued the result.

It is also very similar to Google’s recently announced Project Astra, in which users can open a video chat and Gemini 2.0 will answer questions about what it sees, such as identifying a sculpture on a London street. In many ways, these features are more advanced versions of what AI devices like the Humane AI Pin and the Rabbit r1 were marketed to do: have an AI voice assistant answer questions about what it sees in a video.

Sharing a screen 

The new screen-sharing feature brings ChatGPT out of the app and into the realm of the browser. 

To start a screen share, users tap a three-dot menu that lets them navigate out of the ChatGPT app. They can then open apps on their phones and ask ChatGPT questions about what it sees. In the demo, OpenAI researchers triggered screen share, then opened the Messages app to ask ChatGPT for help responding to a photo sent via text message.

The screen-sharing feature in advanced voice mode, however, bears similarities to recently released features from Microsoft and Google.

Last week, Microsoft released a preview version of Copilot Vision, which lets Pro subscribers open a Copilot chat while browsing a webpage. Copilot Vision can look at photos on a store’s website or even help play the map-guessing game GeoGuessr. Google’s Project Astra can read browser content in the same way.

Both Google and OpenAI released their screen-sharing AI chat features on phones to target consumers who may use ChatGPT or Gemini on the go. But these types of features could also signal a way for enterprises to collaborate more with AI agents, since the agent can see what a person is looking at onscreen. They could be a precursor to computer-use models, like Anthropic’s Computer Use, where the AI model not only looks at a screen but actively opens tabs and programs for the user.

Ho ho ho, ask Santa a question 

In a bid for levity, OpenAI also rolled out “Santa Mode” in advanced voice mode. The new preset voice sounds much like the jolly old man in a red suit.

Unlike the new video features, which are restricted to specific subscription tiers, “Santa Mode” is available until early January to anyone with access to advanced voice mode on the mobile apps, the web version of ChatGPT, and the Windows and macOS apps.

Chats with Santa, though, will not be saved in chat history and will not affect ChatGPT’s memory. 

Even OpenAI is feeling the Christmas spirit.


