Meta Announces Text-To-Video AI Generator Make-A-Video

This is part of my live-learning series! I will be updating this post as I continue through my journey. I apologize for any grammatical errors or incoherent thoughts. This is a practice that helps me share valuable things without falling apart under the pressure of perfection.

Episode Summary

– Like DALL-E, Midjourney, and Stable Diffusion, but for video
– Other engines will move toward text-to-video too
– The main open-source alternative is CogVideo, from researchers at Tsinghua University and the Beijing Academy of Artificial Intelligence (BAAI)
– People are already modifying image generation outputs to create videos
– Meta shares that creating videos is much harder than creating images
– Clips are no longer than five seconds and have no sound
– However, given the quality of Meta’s technical team and the level of interest, results will improve rapidly
– There continue to be ethical and copyright challenges with generative AI
– The current output resolution is only 64 × 64 pixels at 16 frames
– The examples are then upscaled to 768 × 768 pixels by a separate AI model (see the sketch after this list)
– Meta’s Make-A-Video page is nearly identical to OpenAI’s DALL-E website, which is hilarious
– You can sign up to share your interest in “Make-A-Video”
– The implication for the metaverse is that Meta is continuing to build toward it
– Make-A-Video uses publicly available data sets, which is supposed to add transparency
– No one can currently access the model, and only limited examples have been shown
– Mark Zuckerberg is fired up about it, calling it “amazing progress”
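To make the resolution numbers above concrete, here is a minimal sketch of that upscaling step. Meta reportedly uses a separate learned super-resolution model for this; the plain resize below (Pillow and NumPy, with random frames standing in for a generated clip) is only an assumption-laden stand-in to show the shape of the data moving from 16 frames at 64 × 64 up to 768 × 768.

```python
# A minimal sketch of the upscaling step, NOT Meta's actual model:
# Make-A-Video reportedly generates 16 frames at 64x64, and a separate
# AI model upsamples them to 768x768. Plain interpolation here is a
# simple stand-in for that learned super-resolution step.
import numpy as np
from PIL import Image

def upscale_clip(frames, size=(768, 768)):
    """Resize every low-res frame of a clip to the target resolution."""
    upscaled = []
    for frame in frames:  # each frame: (64, 64, 3) uint8 array
        img = Image.fromarray(frame)
        upscaled.append(np.asarray(img.resize(size)))  # bicubic by default
    return upscaled

# 16 random frames standing in for a generated 64x64 clip
clip = [np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8) for _ in range(16)]
hi_res = upscale_clip(clip)
print(len(hi_res), hi_res[0].shape)  # -> 16 (768, 768, 3)
```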

You can also learn how to search for and find Stable Diffusion prompts using Lexica! And, if you don’t yet have access, find out how to get beta access to Stable Diffusion by Stability AI.

YouTube Video

Resources

Make-A-Video
DALL·E 2
Introducing Make-A-Video: An AI system that generates videos from text
Text to video: Meta has created a new generator based on artificial intelligence • Mezha.Media
Make-A-Video Interest Submission
Meta one-ups competitors, introducing an AI text-to-video tool
Meta’s new text-to-video AI generator is like DALL-E for video – The Verge
Beijing Academy of Artificial Intelligence (BAAI) | Institution outputs | Nature Index
Tsinghua University
Zuckerberg Post on Make-A-Video
[2205.15868] CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
Text to Video Samples – CogVideo by Tsinghua University – YouTube
Meta Announces Text-to-Video AI Generator | PetaPixel
Meta unveils an AI that generates video based on text prompts | MIT Technology Review
Meta is using AI to generate videos from just a few words | CNN Business
Make-A-Video.pdf
What is Unsupervised Learning? | IBM
Meta’s new Make-a-Video AI can generate quick movie clips from text prompts | Engadget
Make-A-Video: Meta AI’s New Model For Text-To-Video Generation – Weights & Biases
Meta’s Got a New AI Tool that Enables Users to Make Freaky Mish-Mash Videos | Social Media Today
Meta Jumps into Synthetic Media With Text-to-Video AI Generator ‘Make-A-Video’ – Voicebot.ai
THUDM/CogVideo: Text-to-video generation.
Midjourney

Affiliates

Shure MV7

Shure SM7B

Hashtags

make a video,make-a-video,meta,meta ai,text to video generation,text to image generation,generative ai,video generation,text-to-video,DALLE,OpenAi,openai,open ai,Stability Ai,stability ai,stable diffusion,midjourney,Midjourney,CogVideo,cogvideo,mark zuckerberg,image generation,dall e 2,text to video converter,text to video maker

Automated Transcription

OK, hello, hello, Tyler Bryden here. I hope everything’s going well. I’ve been talking a lot about AI and generative AI lately, and it feels like every day there’s a massive breakthrough in this technology. DALL-E, Midjourney, Stability AI’s Stable Diffusion, all these incredible innovations that let us take large models, type in a short text prompt, and generate these incredible things. Fucking crazy. And here we are now with Meta coming into the picture, and it makes sense that they’re coming into the picture; they don’t like to miss out on things. With text-to-image, or, I screwed it up already, text-to-video generation. Forget text-to-image generation, that’s a thing of the past that doesn’t matter anymore. Or Make-A-Video. OK, that’s maybe a little bit of an exaggeration, but this is super interesting. Anyway, let me pull something up, and I have to say right away the thing that made me laugh.

The Facebook page, the Meta page, whatever, I’m going to switch between the names all the time, is basically the exact same as the DALL-E page, even down to how they display the actual examples. You might see some similarities here. Let me pull up the OpenAI page so we can see just how similar these two pages are. I’m guessing there’s some intention there, even if it’s just Meta copying things as usual. There we go.

Exact same, and then even the exact same comparison section. So, Meta continuing its ways. But whatever is to be said about that, this is a huge innovation and something people have wanted to see ever since text-to-image generation arrived: this idea that video is possible. A few years ago this didn’t seem possible, and now it’s not that far off. There are some things to consider here, though.

The actual output they’re generating right now is only 64 pixels by 64 pixels at 16 frames. So, not exactly high-quality video. And the final output we see here has actually been run through a different AI model to get it to 768 by 768 pixels. There’s no sound, and the clips are under five seconds. But at the same time, the text prompts are right there, showing exactly what was typed in, and the outputs are still pretty stunning.

I’m just clicking through them now; sorry for anyone who’s listening to this and not watching. There’s surreal, there’s realistic, there’s stylized, and, like, “a young couple walking in heavy rain”; this is truly the output it’s producing. It’s pretty mind-blowing. So I’m just stopping there and thinking for a second: a horse drinking water.

This is not a real horse. I mean, you can see a couple of oddities in it as it renders and the frames go by, but this is going to change super quickly, and the quality is going to get better. There’s an exponential increase in the innovation here, and Meta obviously has an incredibly strong technical team that can execute on this.

Add in the overwhelming interest in this technology, and it all accumulates to make this a huge focus for many of these companies. Google is doing work here; we haven’t heard much from Microsoft, and I don’t even know if Amazon has something going on, I’ll have to dive a little deeper into that. But OpenAI has obviously spurred this along a lot, and now the open-source alternatives have arrived. I’m just clicking through to show a couple of stylized versions.

It’s that same process we’ve seen in image generation, like the oil painting style, now being applied to video generation, and again, some pretty stunning results. Previously, although text-to-image generation was becoming relatively widely available, video was not. The only other real option we have is one called CogVideo. It was made by researchers from Tsinghua University and the Beijing Academy of Artificial Intelligence. I have all these links in the resources here, so you can check those out if you’re interested. CogVideo was the open-source version that allowed you to do this, and it’s super cool, super powerful; it has stirred a lot of this imagination and innovation. I’ve got the actual GitHub link here where you can see some of the outputs.

I also have a YouTube example; I’ll turn off the volume, but you can see a couple of outputs there. Much more stylized, and, I don’t think the word is rasterization, but definitely not the quality that Facebook, sorry, Meta, has already shown us with what they’re doing. But in the end, it’s open source and super valuable, super powerful. I do have the GitHub repository here if you’re interested in exploring it, because right now Meta has not opened access to anyone.

So this is not open; it’s super early. They have a research paper on it, and what I have seen is that if you’re interested, you can sign up: there is actually a form here. Let’s see, there we go. It’s so funny that a company like Meta is using a Google Form for this; it shows the MVP stage this is at. First name, last name, email, country, check all that apply, and then a couple of other questions about what you’re interested in doing. So I’m guessing this form is getting some heat on it right now.

I’ll be signing up for this. I was lucky enough to get early access to DALL-E, and it would be super cool if I could get access to Make-A-Video soon. And “Make-A-Video”: a very uncreative name, I guess, but it states exactly what it does, and I think that’s important in its own regard. Beyond that, I’ve just got a bunch of different articles here, plus the research paper itself, from reading and doing a bit of research about this announcement. Taking all that information together, a couple of points are worth synthesizing. First, people are already modifying image generation outputs to produce video. I don’t have an example in this video, I apologize.

But you would take an image, generate multiple frames, and then blend those images into what becomes a final video. There are some super fascinating examples; if you go on Twitter, you can find them. So this has already been done; the need is there, the desire is there for people exploring generative AI.
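As a rough illustration of that DIY approach (and only an illustration, not anyone’s actual pipeline), here is a minimal sketch that cross-fades between a few generated stills with OpenCV to produce a short clip. The file names are hypothetical placeholders, and the stills are assumed to share one resolution.

```python
# A minimal sketch of the DIY approach described above: blend a sequence
# of generated still images into a short video by cross-fading between
# consecutive frames. File names and parameters are hypothetical.
import cv2
import numpy as np

def crossfade_video(image_paths, out_path="clip.mp4", fps=24, blend_steps=12):
    stills = [cv2.imread(p) for p in image_paths]  # assumes same size
    h, w = stills[0].shape[:2]
    fourcc = cv2.VideoWriter_fourcc(*"mp4v")
    writer = cv2.VideoWriter(out_path, fourcc, fps, (w, h))
    for a, b in zip(stills, stills[1:]):
        for t in np.linspace(0.0, 1.0, blend_steps):
            # Weighted blend of two consecutive stills fakes the motion
            writer.write(cv2.addWeighted(a, 1.0 - t, b, t, 0.0))
    writer.release()

# Hypothetical stills, e.g. successive outputs from an image generator
crossfade_video(["frame_00.png", "frame_01.png", "frame_02.png"])
```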

Zuckerberg himself talks about this. Let me pull it up, making sure I’m not opening anything sensitive, as I always do. Here we go. He’s still posting on Facebook, like a good little man, and there are lots of good comments around it. Super excited; I mean, I would be excited too if I did this. He also talks about how this is a much harder problem than what OpenAI has done with DALL-E and image generation, because the model has to predict how the scene changes over time, and I believe that’s true. I think it’s hilarious that he’s taking these shots. And you can see, even with people’s general skepticism of Facebook (there’s some bias here, since these are people commenting on Facebook itself), a pretty overwhelmingly positive response and pure amazement. Now, there’s another side to this, though.

There are concerns, and concerns are of course relevant with anything involving Meta, but this applies to generative AI in general: is generative AI copyright infringement? Whose IP is it? Where does the training data originally come from? Facebook itself says it is scraping publicly available data sets. But people originally created the content in those data sets, and is any intellectual property assigned for that? And as these images get more and more realistic, the question becomes: are these videos original?

Are they genuine? Are they true, accurate videos, or are they made up? Our ability to differentiate as humans may get more and more difficult. So this idea of information versus disinformation seems like a real problem. And then, more generally, we’ve seen problems with Stable Diffusion because there was no real filter put on it, compared to OpenAI, where if you typed in a text prompt with violence or hate or whatever, it would not allow you to proceed. Open-source systems let you do essentially whatever you want, and pornographic and violent images have already been created. So there are definitely some risks. But to me, this is going to happen no matter what: people are too excited about it, and the creative potential is too large. And overall, I think there’s just so much ambition from these companies.

It definitely seems like there’s a bit of an arms race in this generative AI space, where big companies for which this was not the main focus have turned their eye toward it because the interest and enthusiasm are so significant. So this is just another day with a massive release of something that is truly, I think, mind-blowing and world-changing. Maybe you’re looking at this and saying, oh, it’s sort of cool, but I think this ends with people taking a book script and developing a movie off it within seconds, or, with Meta obviously invested in the metaverse, interactive worlds created through prompts that are completely personalized to you. There are so many consequences to this. And we shouldn’t just look at what it looks like today: look at it six months from now, a year from now. There are going to be massive breakthroughs, and the more people use it, the more they debug and solve these problems, and the process just continues to build and accumulate.

And the question I ask myself, because I’m just so fascinated by this, is: how do I get more involved beyond just the creation process, with my own company here and my own personal life? I don’t know exactly what this is, but it seems like one of those opportunities of a lifetime to participate and contribute in, and that’s why I like to create and share these videos. If you’re interested in this, I hope you get some insight and perspective from it. This was specifically a video about Meta announcing its text-to-video AI generator, but I’ve also talked about DALL-E and Midjourney and Stable Diffusion and all these incredible technologies, and the problems with them, so I hope you check out some of those.

One I recently published was about Getty and Shutterstock banning images created by these engines. A lot of the world is going to change with this. So, I’m Tyler Bryden, and I’m going to continue doing this. If you liked this: like, comment, subscribe, share, all those good things. And if you’re just interested in general, please feel free to send me a note; I would love to chat. I’ve learned so much from the comments on these videos and gotten a lot of great feedback on how to improve them, but also things like “oh, you didn’t know this,” or “check out this link,” or “check out this reference,” or people even answering each other’s questions in the comments.

We’re all helping each other figure this out, because we’re piecing together a puzzle with a bunch of different companies and developers and technologies that are all accumulating at once, and I could not help myself but share this today. So thank you for checking this out. Hope you have a wonderful rest of your day. This is Tyler Bryden, talking about Meta announcing text-to-video AI generation with their new system Make-A-Video, and I think we’re going to see a lot more of this soon. Goodbye.

 
