A First Look at Stable Diffusion Image Generation by Stability AI

This is part of my live-learning series! I will be updating this post as I continue through my journey. I apologize for any grammatical errors or incoherent thoughts. This is a practice to help me share things that are valuable without falling apart from the pressure of perfection.

https://tylerbryden.com/podcast-player/60448/a-first-look-at-stable-diffusion-image-generation-by-stability-ai.mp3

Episode Summary

– Stability AI is an organization focused on AI by the people, for the people
– Their developer communities have over 20,000 members
– They are developing cutting-edge open AI models for image, language, audio, video, 3D and biology
– Stable Diffusion is a text-to-image model that aims to empower billions of people to create stunning art within seconds
– It is currently open to researchers through a beta sign-up form
– The model weights are hosted by Hugging Face
– Stable Diffusion runs on under 10 GB of VRAM on consumer GPUs, generating images at 512×512 pixels
– People will be able to run this at home!
– The model was trained on the LAION Aesthetics dataset, a subset of about 120 million image-text pairs drawn from the full LAION 5B dataset, which contains nearly 6 billion image-text pairs
– They have been testing the model at scale with over 10,000 beta testers, who are creating 1.7 million images a day
– They are preparing for a public release soon
– They are also preparing to release synthetic datasets
– Stable Diffusion is trending on Google, with new art, searches and videos being made
– Emad Mostaque is the founder of Stability AI
– It is pitched as a model to rival DALL·E 2 with fewer restrictions
– It is less expensive than DALL·E 2
– They are releasing the code and weights for the public good
– I also need to dive into Disco Diffusion
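
As a quick sanity check on the figures in the summary above, the short Python sketch below works out what they imply. The inputs are the rounded numbers quoted in the list (I use 6 billion for "nearly 6 billion" and 10,000 for "over 10,000"), so treat the results as rough estimates rather than official statistics.

```python
# Rounded figures quoted in the episode summary; not official statistics.
AESTHETICS_PAIRS = 120_000_000    # LAION Aesthetics subset
FULL_SET_PAIRS = 6_000_000_000    # "nearly 6 billion" pairs in LAION 5B, rounded up
BETA_TESTERS = 10_000             # "over 10,000" beta testers, rounded down
IMAGES_PER_DAY = 1_700_000        # images generated per day during the beta
IMAGE_SIDE = 512                  # output resolution is 512x512


def summarize() -> dict:
    """Derive a few rough ratios from the quoted figures."""
    return {
        "subset_share_percent": 100 * AESTHETICS_PAIRS / FULL_SET_PAIRS,
        "images_per_tester_per_day": IMAGES_PER_DAY / BETA_TESTERS,
        "pixels_per_image": IMAGE_SIDE * IMAGE_SIDE,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.1f}")
```

With these inputs, the training subset comes out at roughly 2% of the full dataset, each beta tester averages at most about 170 images a day, and every generated image has 262,144 pixels.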

Stable Diffusion comes from Stability AI, an independent multimodal AI company. You can use the Stable Diffusion API, explore the vibrant developer communities around it, and find several Stability AI Google Colab notebooks and modified versions of the code.

You can also learn how to search for and find Stable Diffusion prompts using Lexica! And if you don’t yet have access, find out how to get beta access to Stable Diffusion by Stability AI.
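
For readers who want a sense of what running the model at home could look like once they have access to the weights, here is a minimal Python sketch. It assumes Hugging Face's `diffusers` library and PyTorch are installed, that an NVIDIA GPU is available, and that the weights are published under the model ID `CompVis/stable-diffusion-v1-4`. None of those details come from the episode, so check the model page on Hugging Face for the current instructions. The prompt and the output filename are made up for illustration.

```python
# Sketch only: assumes `pip install diffusers transformers torch` and a CUDA GPU.
# MODEL_ID is an assumption; the episode only says the weights live on Hugging Face.
MODEL_ID = "CompVis/stable-diffusion-v1-4"


def generation_settings(prompt: str) -> dict:
    """Build the arguments for one 512x512 text-to-image request."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {"prompt": prompt, "height": 512, "width": 512}


def generate_image(prompt: str, out_path: str = "output.png") -> str:
    """Download the weights (first run only) and save one generated image."""
    # Imported here so the settings helper works without the heavy dependencies.
    import torch
    from diffusers import StableDiffusionPipeline

    # Half precision keeps memory use down on consumer GPUs.
    pipe = StableDiffusionPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.float16)
    pipe = pipe.to("cuda")
    image = pipe(**generation_settings(prompt)).images[0]
    image.save(out_path)
    return out_path


if __name__ == "__main__":
    # Only prints the request settings; call generate_image() on a GPU machine.
    print(generation_settings("a lighthouse at sunset, oil painting"))
```

Keeping the heavy imports inside `generate_image` means the settings helper can be tried on any machine, while the actual generation step only runs where the GPU and libraries are present.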

Resources

stable diffusion – Explore – Google Trends
Beta Signup Form — Stability.Ai
CompVis/stable-diffusion · Hugging Face
Stability.Ai
GitHub – pesser/stable-diffusion
Stable Diffusion launch announcement — Stability.Ai
Research and Academia — Stability.Ai
Stability AI: About | LinkedIn
GitHub – CompVis/stable-diffusion
Text to Image AI BACKLASH – Should AI be Regulated? – Stable Diffusion’s Open Source Power – YouTube
The Man behind Stable Diffusion – YouTube
Stable Diffusion: A Model To Rival DALL·E 2 With Fewer Restrictions – Weights & Biases
Stable Diffusion: New And FREE Text-To-Image AI Tool | by Jim Clyde Monge | CodeX | Aug, 2022 | Medium
Stable Diffusion, the slick generative AI tool, just launched and went live on GitHub – CDM Create Digital Music
Stable Diffusion 🎨 – News, Art, Updates (@StableDiffusion) / Twitter
stable diffusion – Twitter Search / Twitter
#stablediffusion – Twitter Search / Twitter
Emad (@EMostaque) / Twitter
StableDiffusion
Stable Diffusion Interactive Google Colab Notebook for Image Generation : StableDiffusion
Stable Diffusion : deepdream
Rivers Have Wings (@RiversHaveWings) / Twitter
Stable Diffusion launch announcement | Hacker News
Replicate – Run open-source machine learning models with a cloud API
The latest in Machine Learning | Papers With Code
[2204.08583] VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance
GitHub – nerdyrodent/VQGAN-CLIP: Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.
Ai generative art tools
Disco Diffusion v5.6 [Now with portrait_generator_v001] – Colaboratory
GitHub – alembics/disco-diffusion
nightmareai/disco-diffusion – Generate images using a variety of techniques – Powered by Discoart – Replicate

Hashtags

stable diffusion,dall-e 2,ai art,dall e 2,disco diffusion,dall-e 2 ai,ai art generator,dall-e 2 openai,openai,dall e,stable diffusion ai,dall-e 2 demo,text to image,stability ai,emad mostaque,stable diffusion beta,dall-e 2 access,openai dalle-2

YouTube Video

Automated Transcription

OK, Tyler Bryden here. I hope everything’s going well. I have been deep down the rabbit hole of image generation and large language models, and we are all starting to come across DALL·E and Midjourney. But there’s something else that has been rising lately, and we can see this reflected in Google Trends, and that is Stable Diffusion. So in this video I want to ask: what is Stable Diffusion, this image generation system? What is Stability AI? How is this all coming together? How is this different? Hopefully I’ll give you some insights along the way, as well as inform myself. I’m diving deeper into this, but I’m still early in this journey, just as I was with Midjourney a few weeks or even months ago. It’s all blurring, when someone first introduced it to me. But I’ve got some resources, I’ve got a lot of links (you can see my tab game is crazy right now), and hopefully we get to some pretty fascinating stuff along the way.

So I’ll click in. Just a couple of things here: we’re starting to see some related queries, and obviously Midjourney is a breakout; these are all breakout queries. We’re also seeing Disco Diffusion, which I want to dive deeper into at some point. But let’s stay focused here, on Stability AI. They are in beta access, with public access opening up soon. I don’t have the exact date for when this is happening, but they’re preparing for it. They’re also preparing, and this is what I’m interested in, some synthetic datasets around some of the outputs that they’ve been learning from.

So right away, if you are interested in this, which I’m sure you are if you’re checking out this video, you do have a beta sign-up form. Take a minute to check this out and hopefully submit. One quick thing: I was lucky enough to get into DALL·E early. Well, not even early; I applied late, but I submitted some of my social media profiles and some of the work that I had done in this space, and I was lucky to get approved quickly. So if you can, write a nice message, show why you’re doing it, share some content that you are creating around image generation and all this good stuff, and maybe you’ll be more likely to get streamlined into access here. I can’t make that promise, but I’ve seen some very sad people with DALL·E there.

So I want to tap into, first of all, what is Stability AI? To be honest, there’s a little bit of confusion here on my side. I mean, it’s an amazing overall mission: AI by the people, for the people. They’re designing and implementing solutions using collective intelligence and augmented technology. Apparently they have a developer community of over 20,000 members and are developing open AI. “Open AI”: shots fired a little bit, because it seems like they’re focused on open source and they’re actually releasing the code, the weights and access to the models that they’ve built, in a much richer way than these other systems, which are now being scrutinized for their lack of transparency and the black box nature of their technology.

And this is where I get a little bit confused. I see all these pieces, and I’ve heard of a bunch of these, all sort of separately, so I’m going to do a bit of a deeper dive to understand how they’re connected. But for this case, it seems like they’ve got an incredible team full of technical talent who are focused on bringing cutting-edge open AI models to the world. What we’re focused on specifically in this video is Stable Diffusion, which, if I jump into the announcement, is a text-to-image model. Ideally what they’re hoping, and I believe this is probably what will happen, is to empower billions of people to create stunning art within seconds.

They’re open to researchers at this current moment, and what’s interesting is that another huge leader in the space is actually the one hosting this. So again, all these links will be here. Once you do get access, if you do get through this and get that nice invite, you can then go to Hugging Face, and this is where you can actually run the tech. It looks like people are now working on Google Colabs and other ways to run this technology, but for now the model weights are hosted by Hugging Face, and that opens up a lot of access and democratization to this technology, which I think people are very excited about. Hugging Face has a great brand and is doing some amazing stuff in this space.

There’s a lot in this article; again, it’s going to be linked. A couple of things that were really interesting to me: it runs on under 10 gigabytes of VRAM on consumer GPUs, and it generates images at 512 by 512 pixels, which is not the biggest image. That’s not even matching what we want for people posting on Instagram, backgrounds and things like that, but it’s still pretty good quality, better than a lot of the DALL·E results and other things that we’ve seen. And the fact that you can run this locally is a pretty incredible thing. So people will be able to run this at home. That was some of the excitement that I’ve seen in the comments below, on threads and videos and everything.

To this date, they’re testing with over 10,000 beta testers, who are creating 1.7 million images a day. So this is wildly popular and already being adopted, and just like the huge explosion we’ve seen with DALL·E, with DALL·E mini, with Midjourney, and with many others like Wombo, this is not going away. People are overjoyed at the potential of this and are now trying to figure out which system works best and which one has the best price point. Those are some of the challenges that I think are now arising with OpenAI and DALL·E: people are doing direct comparisons of, of course, the quality of output, but also the actual cost. How much refinement, or how many prompts, do I need to get to the high-quality output that I’m looking for? If that’s too expensive, it will create barriers, especially because a lot of this is very creative, personal work and not necessarily direct business use cases.

That’s why I think we’ve seen the rise of prompt design and prompt engineering. I’ve shared on PromptBase, I’ve shared on the DALL·E prompt book. There’s lots more coming around refining these prompts and understanding what we should put in to get the output that we want. It seems like systems like this will allow us to do even more to modify the output, without the constraints of, say, 400-character prompts, which gives us more control and more ability to get the final image, the final output, that we want.

So there’s a lot in this, all over the place. I’ve got links. I’m going to follow these guys, because I haven’t yet. We’ve also got a GitHub Stable Diffusion piece that talks about it: basically, it covers the comparison, then does step-by-step walkthroughs of how to understand it and how it compares with other platforms, plus a bunch of other stuff which I unfortunately haven’t been able to dive deep enough into yet. But as I do...

What’s really interesting too is the videos that are being published on it. MattVidPro, with 128K subscribers, is talking a lot about this sort of image generation. You can see this one wasn’t as big, but a couple of the other videos he’s done have actually been pretty significant in size, and I’m excited to see more and more people walking through these and sharing them. I’m personally fired up about this.

Now I’ve got to make the connection here with another video that I’m sourcing. This is the guy who did, or talked about, GPT-4chan, the 4chan GPT, and the dangerous precedents that it set and its ability to do some pretty dangerous things. So check that video out if you haven’t. But he also has an interview with Emad Mostaque (and I apologize if I pronounced that improperly), the founder of Stability AI. It seems like there are a lot of people who look up to him, because he is so focused on building open source models and technology for people, not just with image generation but with all sorts of technology. I think that’s a very admirable thing, especially in the world that we live in.

So, just to follow him: I do have his Twitter profile up here, so if you’re interested in following him you can check that out. He seems relatively active: he’s retweeting, he’s asking for feedback, so if you have feedback, please give it to him. I’ve also got the Stable Diffusion page here as well, so we’ve got the website and everything there.

Then I took a look to see what is actually coming out when I look at hashtags or mentions around this, and from what I’m seeing it is incredible, incredible art, and it’s coming back fairly quickly at pretty good resolution, 512 by 512. It seems like every day people just get better at generating these images. Someone had a conversation about this, talking about how good this is and how, as an artist, he’s both scared and also incredibly excited at the potential, and asked people to iterate on some of his images. So you can see here, talking about the momentum: I see that momentum just based on this. This is absolutely insane. And I apologize for that.

I can see I’ve had to reset an email. I’ve got Stable Diffusion here on Reddit, so there’s now a subreddit dedicated to it. I expect that to keep getting more and more popular. You can see the density of posts as the hours go by, and it continues to increase. We’ve got lots of comments here. If I refresh this page (sorry, there we go), you can see people asking questions and asking for tutorials. All my Reddit pages need to be refreshed.

Then we’ve got some Twitter followers and Twitter creators who are using this system and getting lots of heat, as well as a Y Combinator Hacker News thread on it. People there are talking about DALL·E being too expensive, having challenges with OpenAI in general, and the way that they’re limiting output. So I think there is flak against some of these systems, which originally got a lot of love. And then people who love open source create competition, good competition, and that drives more innovation. I think people overall agree that this is a great thing: if open source versions come out and are comparable, or at even a similar quality, we will be able to force OpenAI to maybe make changes to their business model, maybe make pricing less expensive, and all those good things that come with innovation, competition and different people working on the same technology.

So this is just a fresh look, “what is Stable Diffusion image generation by Stability AI?”: me asking this question and walking through a couple of things and some resources for you to dive deeper into. One of the other things that I do want to tap into, and I’ll put some links at the end, is Disco Diffusion, trying to understand the difference between them. It seems like more and more tools and versions of this are coming live, and I’m incredibly excited. I’m following along, I’m participating, and I’m very excited for this to come onto my radar with what they’re doing at Stability AI with Stable Diffusion. I’ll be submitting my application here, hopefully get in, and maybe I’ll be able to share some insights along the way from what I learn.

But this was a quick look at this. I hope you enjoyed it. If you have any thoughts, any insights, anything that I missed, anything that I got wrong, please feel encouraged to drop a comment below. Like, comment, subscribe. If you are enjoying me talking about this, it gives me the motivation to continue. And if you have questions or want me to explore things deeper, please let me know. I’m more than happy to. I absolutely love this stuff.

I have been Tyler Bryden, checking out Stability AI’s Stable Diffusion image generation. We have an incredible world ahead of us with this kind of technology, and I’m glad to see more and more of it coming and glad to see it getting more popular. Of course there are risks, but overall I think we have huge human ingenuity and creativity at our fingertips right now, and this is going to continue to emerge. A very, very exciting time. Thank you very much. Have a great rest of your day. Bye-bye.
