Nvidia, a GPU manufacturer, has released a demo of a new AI-powered system that can generate a video conferencing feed from a still image.
In December 2020, the company announced Vid2Vid Cameo, a deep learning model that uses generative adversarial networks (GANs) to animate 2D images from live video input. The model was trained on a dataset of 180,000 videos. The same image can be reused for future meetings.
According to an article from TechRadar, Vid2Vid Cameo requires both a source image (a real photo or an avatar) and a live webcam feed. During the video conference, the system maps the user’s real-time motions and expressions onto the image they provided, which acts as a talking head.
In a blog post, Nvidia detailed how this will let someone attend a meeting with unkempt hair or without business attire and still appear on screen with a professional appearance based on the image they provided.
It also reorients the person so that they appear to be speaking directly into the camera rather than looking at one individual in the meeting, and the user can adjust the camera angle if needed. Vid2Vid Cameo will also address the issue of video feeds with poor resolution.
The system uses video compression techniques to cut bandwidth requirements by almost tenfold, which should let meetings run more efficiently with less worry about connection quality. Instead of sending large video streams between meeting attendees, it sends only audio data and facial movement information, which is then synthesized into video on the receiver’s side.
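To give a rough sense of why sending facial movement data is so much lighter than sending video, here is a minimal Python sketch that compares the size of one packed set of facial keypoints with one uncompressed video frame. The keypoint count, the three coordinates per keypoint, the 32-bit float encoding, and the 256x256 frame size are all assumptions chosen for illustration; they are not Nvidia's figures, and the real system's wire format is not described in the source.

```python
import struct

# Hypothetical values for illustration only; not taken from Nvidia.
NUM_KEYPOINTS = 20          # assumed number of facial keypoints per frame
FRAME_WIDTH = 256           # assumed frame width in pixels
FRAME_HEIGHT = 256          # assumed frame height in pixels
CHANNELS = 3                # 8-bit RGB


def pack_keypoints(keypoints):
    """Serialize (x, y, z) keypoints as little-endian 32-bit floats."""
    flat = [coord for point in keypoints for coord in point]
    return struct.pack(f"<{len(flat)}f", *flat)


def unpack_keypoints(payload):
    """Inverse of pack_keypoints: rebuild the list of (x, y, z) tuples."""
    count = len(payload) // 4
    flat = struct.unpack(f"<{count}f", payload)
    return [tuple(flat[i:i + 3]) for i in range(0, count, 3)]


def raw_frame_size(width, height, channels):
    """Bytes in one uncompressed frame with 8 bits per channel."""
    return width * height * channels


# A placeholder face pose: every keypoint at the origin.
keypoints = [(0.0, 0.0, 0.0)] * NUM_KEYPOINTS
payload = pack_keypoints(keypoints)

keypoint_bytes = len(payload)
frame_bytes = raw_frame_size(FRAME_WIDTH, FRAME_HEIGHT, CHANNELS)

print(f"keypoint payload per frame: {keypoint_bytes} bytes")
print(f"raw frame: {frame_bytes} bytes")
```

The comparison here is against a raw frame, so the gap it shows is far larger than the savings over an already compressed video stream that the article describes. It is meant only to show the shape of the idea: the sender transmits a small, fixed-size description of the face, and the receiver reconstructs the picture.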
“Many people have limited internet bandwidth, but still want to have a smooth video call with friends and family,” said Ming-Yu Liu, a researcher at Nvidia and co-author of the project, in the blog post, as reported by TechRadar.
Liu also pointed out that many groups besides remote workers could benefit from the product, including video game, animation, and photo editing developers.
The system’s capabilities are set to be packaged with the Nvidia Maxine SDK, a free platform that helps developers optimize video and live streaming feeds using a number of AI models.