The Future Of AI In Video (According To Our Team)
Get caught up on the first part of this series (written completely by AI) here.
No AI tools were used in the creation of this blog. Much like myself, they get self-conscious talking about themselves.
For the last six to 12 months, the one guaranteed topic in every conversation with my peers has been Artificial Intelligence (AI), and whether or not it will replace us all. AI is an extremely hot topic, not only in the video industry but across the entire world.
From creators to small business owners, AI has quickly proven it can be helpful to anyone in some way or another. As PEG's Production Manager (and mostly as a nerd), I have been fascinated with AI news, and I have played with every AI tool or platform I can get my hands on. Because of this, I will likely be writing several AI-focused articles throughout the next year as more and more tools become available to us.
To start my AI blogs off, I want to look back at a blog we did in April of last year titled The Future Of AI In Video (According To AI) and quickly revisit some of the predictions the AI made and where those predictions stand today.
(Keep in mind this will not be all-encompassing; I will mostly speak to functionalities specific to our workflows, as that is what I keep my eye on. This means what I cover will largely be in the Adobe or standalone AI app environments. At PEG, we do not spend much time in DaVinci Resolve outside of color correction, and we do not use Final Cut, so any tools specific to those won’t be mentioned.)
Automated Editing
For me, one buzzword in our industry over the last 10 years has always proven to be a bit overhyped: automated. I have always been excited by the concept of automation in our editing platforms, but up until about 2-3 years ago, those features mostly fell short of being the game-changing time savers we all hoped for.
The first prediction the AI made was about “automated editing.” When I first saw this I immediately disagreed with the title, but when you read the explanation the AI gave, it was really just talking about streamlined, efficient editing. It predicted that tools would exist to analyze footage and help select shots, locate scene changes in a single media file and automatically add transitions, automatically apply speed adjustments to clips, and so on.
These things do not equate to a fully automated video edit, but the AI did predict something pretty awesome for editors and creators: a ton of automated AI tools to enhance our workflows. In one way or another, every bullet it listed as a future capability now exists, along with many, many more AI-powered tools (which I will dive deeper into in a future blog). What’s more, all of the things it listed were implemented directly inside Adobe software, powered by their AI engine, Adobe Sensei.
The most exciting thing about this prediction is that even the AI was not able to anticipate the number of tools and capabilities that have been added to our workflows in under a year's time. Tools for our editors are what I keep my eye on the most, and it is almost difficult to keep up with the amazing things emerging in our industry.
Personalized Content
This specific prediction is less noticeable for me, and I think it was low-hanging fruit to some degree. It speaks to targeted advertising and curated user experiences, and on that front we were already using machine learning to optimize advertising and experiences before we were using the term AI. It was pretty safe to assume those would improve as quickly and efficiently as AI engines improve, and I think that has definitely been the case in the marketing world. So I guess, good job AI! You guessed it!
Interactive Elements
One odd thing the AI threw into this category was “Interactive Elements.” It predicted that AI would be used to generate elements within videos such as quizzes, polls, games, etc. And while I am certain some video platforms are using AI to generate polls, I haven’t seen much else in terms of interactive AI implementations during videos.
I think there is potential here for far more than the AI mentioned, such as video review tools that offer suggestions similar to what I or our Creative Director would give, or that catch flash frames or cropping issues. To me, that idea is exciting and somewhat falls in line with the idea of AI studying a video and automatically creating something based on it. So, I am curious to see where things go on this front.
Choose Your Story
Lastly, it made an odd prediction that hasn’t really come true at all yet, but again could be cool. It anticipated AI curating “choose your story” style videos similar to the movie Bandersnatch, which let viewers direct the main character to take certain actions. This, to me, was the first major miss by the AI, but maybe when I check back in a year we will have all sorts of amazing AI “choose your story” videos to play around with. Time will tell.
Real-Time Video Analysis
Anything specific to live events sparks my interest, and in this category the AI predicted several tools that would do all sorts of amazing things in the live events world. From identifying key moments, such as a touchdown or goal, to automatically cutting a show, to deeper social media integration, it had a lot of big ideas for AI in live events.
Unfortunately, while these things sounded very cool, a lot of them did not come to fruition in a noticeable way this past year. Certain elements of these bullets exist in all sorts of other video-based media, but the only one I see having a big effect on actual live broadcasts right now is live AI transcription, and to a much smaller extent, I have to assume that certain social media feeds on streams are monitored via AI. If it were me, though, I would always have a human element approving any crowdsourced media (you never know what could happen, and we never want to show something inappropriate and end up trending for all the wrong reasons).
The action/object tracking idea has existed for a while in gimbals and PTZ cameras, and these tools are becoming more and more reliable in our systems. I should also note it doesn’t just affect cameras that can move, but also cameras deciding what to autofocus on. This is another example of a tool that we never fully trusted in the past but that is only getting more and more reliable.
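If you are curious what even the non-AI version of subject tracking looks like, here is a minimal sketch using OpenCV’s classical CSRT tracker (the clip name is made up, and this is an illustration of the principle, not the tooling our cameras or PTZ systems actually run). You draw a box around a subject, the tracker tries to hold onto it frame by frame, and a PTZ or autofocus system then turns that box into movement or focus decisions.

```python
import cv2  # requires opencv-contrib-python for the CSRT tracker

# Open a clip and grab the first frame (hypothetical file name).
cap = cv2.VideoCapture("stage_wide.mp4")
ok, frame = cap.read()

# Draw a box around the subject to follow.
box = cv2.selectROI("Pick a subject", frame, fromCenter=False)

# CSRT is a classical (non-AI) tracker; modern camera systems use far
# more robust, learned models, but the job is the same.
tracker = cv2.TrackerCSRT_create()
tracker.init(frame, box)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    found, box = tracker.update(frame)
    if found:
        x, y, w, h = [int(v) for v in box]
        # A PTZ head or autofocus system would turn this box into pan/tilt
        # or focus decisions; here we just draw it to watch the track hold.
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("Tracking", frame)
    if cv2.waitKey(1) == 27:  # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```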
Improved Visual Effects
This category was the one I was most excited to talk about, since behind live events, animation is my favorite part of my job. In its prediction the AI hit on so many things I was excited to think about, and what’s even more exciting: these things are happening (for the most part).
Object Removal/Insertion
Object removal and insertion can be tedious in video depending on the scene and needs. It can take a lot of manual keyframing and masking/compositing to make it look natural. Sometimes, spending all of that time to remove a trash can you forgot to move out of the frame in a dolly shot really gets to you. So having tools to assist with this process is amazing.
I will say that no integrated Adobe tool (yet) lets you select an object in a video and just remove it outright. (There are whispers of this, though.) But even just Adobe’s generative fill is fantastic for creating a patch to track into your footage, and it saves a lot of time over manually creating a patch in Photoshop upfront.
There are also standalone AI tools that offer this exact service if you are really struggling or are on a tight timeline. Apps like Hitpaw can do this, but reliability is still pretty low on complex shots with any motion.
So, if you already have the skill set to do this type of patching/compositing work, there are certainly AI tools to make you even more efficient. If you don’t have this skill set and are hoping for a tool to do it for you, I think we are closer than ever, but not quite there yet.
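For anyone who wants to see the basic idea behind patch creation, here is a tiny sketch using OpenCV’s classical inpainting. It is nowhere near as convincing as generative fill on real footage, and the file name, mask, and coordinates are purely hypothetical, but the underlying request is the same: fill this masked region using its surroundings.

```python
import cv2
import numpy as np

# Load a single frame exported from the edit (hypothetical file name).
frame = cv2.imread("dolly_shot_frame.png")

# Build a mask over the object we want gone -- say, a rough rectangle
# around that trash can we forgot to move out of frame.
mask = np.zeros(frame.shape[:2], dtype=np.uint8)
mask[620:780, 1100:1260] = 255  # made-up coordinates for illustration

# Classical inpainting fills the masked area from surrounding pixels;
# AI generative fill does the same job far more convincingly.
patched = cv2.inpaint(frame, mask, 5, cv2.INPAINT_TELEA)

# The patched frame can then be tracked back into the shot.
cv2.imwrite("dolly_shot_frame_patched.png", patched)
```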
Image/Video Restoration
The last thing in this category I really wanted to talk about is Image/Video Restoration. To me, this has been the coolest and most noticeable upgrade to our workflows yet. Working with nonprofits and companies that have been around for decades, the quality of media we receive is not always at a high level. Since the AI prediction, so many tools for media quality improvement have been released. These tools can do any number of things revolving around resolution, frames per second, clarity and stabilization.
I can take old standard-definition footage and generate pixels between pixels to increase the resolution to four times the original size or more. I can take 30p footage and generate frames between frames, allowing me to slow it down in post without any choppiness or stuttering. I can denoise and sharpen footage and images to a level I never thought possible without severe quality loss. This specific prediction not only came true, it blew my socks off, and it is still improving every day.
With all of the things I just listed, I have still only scratched the surface of these tools and their capabilities. I will be writing an entire upcoming blog on just this AI resource, with some fun examples.
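In the meantime, as a rough illustration of the “frames between frames” idea, here is a minimal sketch that calls ffmpeg’s classical motion-interpolation filter from Python (the file names are assumed, and this is not the AI tool we actually use, which produces far fewer artifacts). Interpolating 30p footage up to 120 fps and then conforming it to a 30 fps timeline gives you 4x slow motion without duplicated frames.

```python
import subprocess

# Motion-interpolate 30p footage up to 120 fps using ffmpeg's minterpolate
# filter (mi_mode=mci is motion-compensated interpolation). Conforming the
# result to a 30 fps timeline plays back as 4x slow motion.
subprocess.run([
    "ffmpeg",
    "-i", "broll_30p.mp4",                      # hypothetical source clip
    "-vf", "minterpolate=fps=120:mi_mode=mci",  # generate frames between frames
    "broll_120p.mp4",
], check=True)
```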
Automated Captioning
To summarize this category, the AI prediction absolutely nailed this one.
This is another major benefit and change AI has brought to our industry. Machine-generated captions and transcripts have existed for quite some time, but, like everything else, AI has given these features significantly better results and usability. Captions and transcripts in any language are available in a single click, though as a professional I still have to recommend you proofread them if they will be viewed by your client or the public.
My favorite change, and I guess my suggested editing hack if you aren’t already doing this, is no longer subclipping 45-minute interviews. Now we can run an automatic transcription and cut a story entirely via text, then check the shots for quality and swap them around if needed. This has been a massive time saver and has let us make this type of edit more affordable by cutting review time for clients. It also allows us to quickly provide a transcript to the client so they can curate their own rough outline based on their expertise and informed knowledge of their needs and vision.
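If you are curious what this looks like outside the Adobe ecosystem, here is a minimal sketch using OpenAI’s open-source Whisper model to turn an interview into a timestamped transcript (the file name is hypothetical; inside Premiere, the built-in speech-to-text handles the equivalent in a click). The timestamped segments are what make the cut-by-text workflow possible.

```python
import whisper  # pip install openai-whisper

# Load a small model and transcribe the interview (hypothetical file name).
model = whisper.load_model("base")
result = model.transcribe("interview_cam_a.mp4")

# Print each segment with rough in/out times so a story can be
# paper-cut from text before anyone scrubs through the footage.
for seg in result["segments"]:
    print(f"{seg['start']:7.1f}s - {seg['end']:7.1f}s  {seg['text'].strip()}")
```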
Because I have been noting weird predictions, the only strange point in this category to me was AI customizing the actual style of the captions based on the user. I’m not saying it couldn’t happen, just that it is an odd prediction in a world where most platforms don’t even offer the ability to customize captions beyond size, and even that is limited.
Final (Human) Thoughts
Overall the AI did a pretty good job predicting the future of AI in video! This was interesting to look back on, re-evaluate and talk about.
AI is exciting, but I do think the word automated is still being thrown around a bit too heavily. It does a great job, but nothing it does is perfect; it creates amazing efficiency, but it doesn’t do the whole job for us. It can create captions, but you still need to proofread them. It can automate a video cut based on footage, but how likely is it that we don’t make any tweaks to that?
To me, AI represents not automation, but incredible efficiency and time savings in an industry where time is everything. I am still just as enthralled with it as I was a year ago, and I cannot wait to see how it changes our processes next.
Keep an eye out for my next AI blog post!