AI Firm Synthesia Unveils Digital Video Avatars—Here’s How They Work
Have you ever wanted a digital twin? Or dreaded having to re-record a series of videos? In a move that sounds like a sci-fi storyline, London-based synthetic media developer Synthesia has launched a service that lets customers create digital video representations of themselves for use on social media and in marketing campaigns.
Synthesia first announced the launch of its Personal Avatars during a livestream on Wednesday. These avatars, Synthesia claims, can be generated using as little as two minutes of video from a webcam or mobile phone—although I discovered that creating them takes significantly longer.
The company says its Personal AI Avatars could be used to generate employee training videos, product explainers, and sales and marketing materials, or to engage with customers. The benefit to businesses is that all of these videos can be created with the same AI avatar, without bringing the actor or employee back for a new in-person recording session.
There are limitations, however, which I’ll get into.
Preventing misuse
Synthesia recognizes that the ability to create high-fidelity replicas of real people could be abused for malicious or deceptive purposes. Synthesia told Decrypt it keeps its technology from being used to create AI-generated deepfakes by following three principles: consent, control, and collaboration.
“We will never create an AI avatar without someone’s clear consent,” Alexandru Voica, head of corporate affairs and policy at Synthesia, said. “Our platform provides a secure environment for users, ensuring their data is safe, they are in control of their avatars, and misuse is minimized through content moderation at the point of creation.”
To Voica’s point, before the avatar generation begins, the user is asked to consent to the collection, use, storage, and disclosure of their video and audio recordings by Synthesia Limited and its vendors “to authenticate the personal avatar submission.”
Voica said Synthesia also works with industry peers, policymakers, and others to develop best practices for the responsible use of AI.
“Non-consensual deepfakes are the biggest source of harmful content online,” Voica said. “Because Synthesia avatars cannot be made without the explicit consent of the human they represent, we’re not in the business of non-consensual deepfakes, which significantly limits the potential for abuse of our platform and of Personal Avatars specifically.”
When asked if there are personal avatars or avatars in general that Synthesia will not allow, Voica said the company uses both advanced technological filters and human content moderation to make sure Synthesia is not being used to facilitate the creation of inappropriate or harmful content.
“When someone attempts to make a video, that content will be put through our content moderation workflow before it is generated,” Voica told Decrypt. “If it is found to violate our policies, the video never gets created. Repeat offenders or serious violations can also lead to their account being disabled.”
Creating your avatar
To get started, users must create a Synthesia account. Personal avatars are available on the “Starter,” “Creator,” and “Enterprise” tiers. With an annual subscription, Starter and Creator accounts cost $18 and $59 a month, respectively. Businesses seeking enterprise licensing must contact Synthesia to determine pricing.
The process worked better on my MacBook Pro than on my Windows 11 PC, both using the Brave browser. When you are ready, Synthesia will ask if you want to record directly from the platform or upload a video. I chose to record.
According to Synthesia, a personal avatar is created using an advanced form of looping technology called auto alignment, which determines when an avatar is speaking and makes its body movements more responsive. Languages available to personal avatars include English, German, French, Spanish, Arabic, Croatian, Filipino, Greek, Hindi, Italian, Romanian, Russian, Turkish, and Ukrainian.
Before recording, Synthesia recommends using a quiet and well-lit environment, pausing between paragraphs, not covering your face, using natural body language, being positive, and smiling.
From my experience with the tool, I would add two recommendations. Keep the camera far enough away that you appear smaller in the frame. And if you decide to upload a video instead of using the recorder, use a microphone for better audio quality.
Synthesia will give you a script to read, which will take roughly three minutes to recite. According to the avatar generator, one to five minutes of audio and video are needed to complete the process.
The process was relatively simple for something that could have a major impact on a company or content creator’s brand.
While uploading, recording, and reading the script took less than five minutes, Synthesia said generating the Personal Avatar itself can take up to 24 hours. My first video took about 10 hours to create. After that, each subsequent video generated from a new script I provided took about five minutes.
Here is my Personal Avatar from Synthesia. What do you think?
The size of the finished video can be changed to fit the needs of the platform it will be uploaded to, whether it’s YouTube, Instagram, or TikTok.
However, you cannot change your background, outfit, or appearance without generating a new Personal Avatar, which—as noted above—could take up to a day.
While the Personal Avatars are very impressive, it was weird to see an animated version of myself, an instance of the “uncanny valley.” The uncanny valley is the effect in which a near-photorealistic depiction of a human comes close to the real thing without quite matching it, which makes its appearance disconcerting.
Even so, these video avatars could pass for real if the viewer does not scrutinize them carefully.
Edited by Ryan Ozawa.