1 minute of OpenAI's Sora video may take over an hour to generate


Key notes

  • OpenAI’s Sora generates videos from text prompts, but rendering times are lengthy.
  • Discussions suggest it can take over an hour to generate a minute of video.
  • Users expressed concerns about the scalability and practicality of iterative workflows.

OpenAI’s recently unveiled AI model, Sora, generates realistic-looking videos from text prompts. However, discussions suggest that generating a single minute of video using Sora can take over an hour.

While the exact time frame remains unclear, a post on Reddit indicates significant rendering times compared to traditional video creation methods. It’s important to note that these discussions are based on limited information, as researchers have primarily showcased pre-selected examples without allowing public access to custom prompts. The longest demonstrated video was only 17 seconds long.

Several perspectives have emerged regarding these rendering times. Some users commented on the impracticality of lengthy rendering times, particularly when iterating through multiple prompts:

If you’re going to need to try multiple prompts, that’s going to be a big problem.

Others speculated on the potential reasons behind the long rendering times, with one user referencing comments from OpenAI’s CEO, Sam Altman, regarding significant funding needs:

I can see why he (Sam Altman) wants 7 trillion now.

Comparisons were also drawn to another recently announced AI model, Google’s Gemini 1.5, highlighting the competitive landscape.

Finally, some users attempted to contextualize the rendering times by comparing them to traditional animation:

That’s pretty reasonable. 90 hours for a 90-minute movie. When you calculate the savings for all the typical animation labor, that’s not crazy at all. But that assumes zero reshoots, 100% accuracy, zero hallucinations, and 100% character transfer through most of the movie.

One of the most noticeable things is that the time itself is not surprising, but the absence of a full 1-minute example suggests that after 20 seconds it might begin to hallucinate a lot, a known issue for LLMs.

Overall, the information regarding Sora’s rendering times underscores the ongoing development stage of the technology. While the potential for AI-generated video creation is evident, addressing efficiency and scalability remains crucial for wider adoption.

As Sora continues to evolve, it will be interesting to see how these challenges are tackled and how the technology shapes the future of video creation.

More about the topics: OpenAI, Sora