OpenAI unveils new text-to-video generator Sora

A still image from a video generated by OpenAI's Sora, as seen on the company's website. Photo courtesy of OpenAI

Feb. 18 (UPI) -- OpenAI, the company behind ChatGPT, is rolling out its text-to-video model, which will generate videos up to a minute long based only on text input.

The product, Sora, is currently in early testing with a select group of users and artists and comes on the heels of Make-a-Video, developed by Facebook parent Meta.


Meta's product was unveiled in October 2022 but has not been released to the public. Other text-to-video generators are currently on the market, but Sora, if extended to a wide release, would be among the first major consumer-facing AI video products.

"We're teaching AI to understand and simulate the physical world in motion, with the goal of training models that help people solve problems that require real-world interaction," OpenAI said in a statement on the Sora website. The company has also released a white paper on its development.

OpenAI is currently seeking feedback from a handful of visual artists, designers and filmmakers on how Sora can be most useful to creative professionals.

The company will use feedback from early users to inform the public about what Sora is capable of and what AI capabilities are on the horizon.


"Sora is able to generate complex scenes with multiple characters, specific types of motion, and accurate details of the subject and background," the company said. "The model understands not only what the user has asked for in the prompt, but also how those things exist in the physical world."

OpenAI said Sora has a deep understanding of language, enabling it to interpret prompts and generate characters that express lifelike emotions. The company also acknowledged the model's current weaknesses, including difficulty simulating the physics of a complex scene and understanding specific instances of cause and effect.

"For example, a person might take a bite out of a cookie, but afterward, the cookie may not have a bite mark," OpenAI said.

"The model may also confuse spatial details of a prompt, for example, mixing up left and right, and may struggle with precise descriptions of events that take place over time, like following a specific camera trajectory."

The company said it is making safety a priority in the new system, guarding against misinformation, bias and hateful content, challenges that other AI developers have also confronted.
