Meet Sora, OpenAI's new text-to-video model that creates stunning HD videos based on text prompts

It's not a question of "can we do it," but "should we do it."

Rafly Gilang

Tech Reporter

Updated on November 18, 2024

Key notes

OpenAI has just announced a new text-to-video AI model, Sora, and it looks impressive yet scary at the same time.
People are now concerned about whether this model will take over their jobs.
The model will carry C2PA metadata once implemented in an OpenAI product in the future.

OpenAI has just announced a new text-to-video AI model, Sora, and it looks impressive. The premise is simple yet remarkable: you type out a text prompt, as detailed as you want, and the AI model comes back with a highly detailed video up to 60 seconds long.

Take a look at some of what Sora can produce:

Prompt: “Several giant wooly mammoths approach treading through a snowy meadow, their long wooly fur lightly blows in the wind as they walk, snow covered trees and dramatic snow capped mountains in the distance, mid afternoon light with wispy clouds and a sun high in the distance… pic.twitter.com/Um5CWI18nS
— OpenAI (@OpenAI) February 15, 2024

Prompt: “A movie trailer featuring the adventures of the 30 year old space man wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colors.” pic.twitter.com/0JzpwPUGPB
— OpenAI (@OpenAI) February 15, 2024

Prompt: “A gorgeously rendered papercraft world of a coral reef, rife with colorful fish and sea creatures.” pic.twitter.com/gzEE8SwP81
— OpenAI (@OpenAI) February 15, 2024

Prompt: “Animated scene features a close-up of a short fluffy monster kneeling beside a melting red candle. the art style is 3d and realistic, with a focus on lighting and texture. the mood of the painting is one of wonder and curiosity, as the monster gazes at the flame with… pic.twitter.com/aLMgJPI0y6
— OpenAI (@OpenAI) February 15, 2024

That’s honestly scary and impressive at the same time. The public reaction has been a mix of awe and alarm, especially given OpenAI’s legal disputes with news organizations over the alleged use of their articles to train its models without consent.

We also need to talk about the jobs that could be replaced. Even OpenAI boss Sam Altman, who was once ousted from the position, has said that the pace of AI research has been advancing way too fast and that the amount of adaptation humankind needs to make is alarming.

This model builds upon past DALL-E and GPT research, using DALL-E 3’s recaptioning method to generate highly descriptive captions for the visual training data. However, it still struggles to simulate complex scenes realistically, to understand cause-and-effect relationships, and to keep the spatial details of a prompt straight.

When it is implemented in a product in the future, be it ChatGPT, a new OpenAI offering, or Microsoft's Copilot, the model will carry C2PA metadata, similar to what Microsoft has been doing with Image Creator from Designer. Text and image checks guard OpenAI products against harmful content such as violence, hate speech, and IP infringement.

“We are also granting access to a number of visual artists, designers, and filmmakers to gain feedback on how to advance the model to be most helpful for creative professionals,” says OpenAI.

Rafly is a reporter with years of journalistic experience covering technology, business, society, and culture. He currently reports on Microsoft-related products, tech, and AI for MSPowerUser.
