Chapter 3. AI-Powered Video
This chapter asks how AI can produce videos. Could we soon just ask a model to make anything we imagine? If not, what structural limitations stand in the way?
A fundamental part of creating videos is editing. To understand how an AI can make videos, we need to spare a thought for this process.
Why You Should Care About Editing
Although we all habitually consume the language of video editing, its processes are inconspicuous, so we may miss the opportunities it presents.
At the most basic level, editing transforms raw video inputs into a new video output with a different temporal and visual structure. Being able to do this automatically can unlock the use cases discussed in the previous chapters. Editing is necessary because the outcomes we seek cannot be defined as a 1:1 temporal mapping between video input and output.
In other words, we’re looking to reassemble the temporal structure of the video and its storyline, not just modify an aspect of its visual content or audio track, as is the case with various popular AI-based enhancement tools such as beauty filters, face replacement deepfakes, resolution upscalers, or voice modifiers.
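To make the distinction concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not an implementation from any real editing tool: frames are stand-in strings, and the names Cut, apply_filter, and apply_edit are hypothetical. A filter touches every frame but keeps the timeline intact, while an edit selects and reorders ranges of frames from several sources.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Frame = str  # hypothetical stand-in for real image data


@dataclass(frozen=True)
class Cut:
    """One entry in a simple edit list (hypothetical structure)."""
    source: int  # index into the list of input videos
    start: int   # first frame to use (inclusive)
    end: int     # frame to stop before (exclusive)


def apply_filter(video: Sequence[Frame], fn: Callable[[Frame], Frame]) -> List[Frame]:
    # Enhancement-style processing: output frame i depends only on
    # input frame i, so the temporal mapping is 1:1.
    return [fn(frame) for frame in video]


def apply_edit(sources: Sequence[Sequence[Frame]], cuts: Sequence[Cut]) -> List[Frame]:
    # Editing: the output timeline is assembled from chosen ranges,
    # in any order, drawn from any of the inputs.
    timeline: List[Frame] = []
    for cut in cuts:
        timeline.extend(sources[cut.source][cut.start:cut.end])
    return timeline


camera_a = [f"a{i}" for i in range(6)]
camera_b = [f"b{i}" for i in range(6)]

# Same length and order as the input; only the content of each frame changes.
filtered = apply_filter(camera_a, str.upper)

# Different length and order; frames from two sources are interleaved.
edited = apply_edit(
    [camera_a, camera_b],
    [Cut(source=1, start=4, end=6), Cut(source=0, start=0, end=2), Cut(source=1, start=0, end=1)],
)
```

In this sketch, filtered always has exactly as many frames as camera_a, while the length and ordering of edited are determined entirely by the list of cuts. Choosing those cuts is the part an automatic editor would have to decide.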
But wait. At the rate AI is progressing, could we simply hold off until it is good enough to conjure properly edited movies seemingly out of thin air, as we see happening with stories and images today? OpenAI has introduced its Sora model, which promises to create convincing full-motion ...