The Hollywood Insider Dall-E and Text Based AI Art Generators

Photo: Dall-E

The Age of Artificial Intelligence

Artificial intelligence has come a long way this past decade, and we’ve seen some pretty incredible accomplishments in the past few years, especially with regard to content made available to the general public. From online text generators that allow people to take turns writing stories with an AI, to text-to-speech applications that are trained on the voices of both well-known people and fictional characters, those who seek to kill time online have some new and pretty entertaining options at their disposal. But over the past few months, another form of artificially intelligent entertainment has risen to mainstream attention: text-based art generators.

Text-based art generators are exactly what they sound like: online AI programs that run on text prompts, allowing users to enter text inputs that the AI will attempt to turn into a picture. Dall-E (named after both Salvador Dali and the Pixar film ‘Wall-E’) in particular stands out: while there are numerous text-to-image generators out there, this yet-to-go-fully-public model is by far the most powerful one yet. Created by the research laboratory OpenAI, the original Dall-E uses a version of GPT-3 to generate images from text prompts. The second and latest model, simply known as Dall-E 2, was announced earlier this year; it improves on its predecessor and gives even more impressive results.

Dall-E – How it Works in Layman’s Terms

Dall-E makes use of a neural network that was trained on a large number of images and their text descriptions. From there, it’s able to understand the relationships between those images and the words that describe them. For example, when fed properly labeled images of cats, it learns what a cat looks like and how it should be depicted. This goes for any animals and objects the neural network was trained on. Therefore, if one were to enter a text input like “cat holding a pencil” or “cat wearing a fez hat,” Dall-E would be able to generate an image of a cat holding a pencil or wearing a fez hat, as it knows from the data it’s been fed what those individual things are, and how they would work together.
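
For the curious, here is a tiny Python sketch of that idea. To be clear, this is only a toy analogy and not how Dall-E actually works under the hood: the real model learns from pixels with a large neural network, whereas this example just intersects hand-written feature tags. The training pairs, the CONCEPTS list, and the function names are all made up for illustration.

```python
# Toy analogy only: hypothetical captions paired with hand-written "visual
# features". A real model learns from image pixels, not from tags like these.
TRAINING_PAIRS = [
    ("a cat sitting on a sofa", {"whiskers", "pointed ears", "cushions"}),
    ("a cat sleeping", {"whiskers", "pointed ears"}),
    ("a red fez hat on a table", {"red felt", "tassel", "table legs"}),
    ("a man wearing a fez hat", {"red felt", "tassel", "moustache"}),
]

# Hypothetical list of the words this toy treats as concepts worth learning.
CONCEPTS = ("cat", "fez")


def learn_concepts(pairs):
    """For each concept, keep only the features shared by every image
    whose caption mentions that concept."""
    learned = {}
    for caption, features in pairs:
        for word in caption.split():
            if word in CONCEPTS:
                if word in learned:
                    learned[word] = learned[word] & features
                else:
                    learned[word] = set(features)
    return learned


def compose(prompt, learned):
    """Combine the learned features of every known concept in the prompt."""
    combined = set()
    for word in prompt.lower().split():
        combined |= learned.get(word, set())
    return combined


learned = learn_concepts(TRAINING_PAIRS)
result = compose("Cat wearing a fez hat", learned)
print(sorted(result))
```

Even though no training caption ever mentions a cat and a fez together, the prompt still pulls in what the toy learned about each one separately, which is the same intuition behind Dall-E combining things it has only seen apart.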

And it doesn’t even stop there; it’s also possible to input what style you want your images in, so if you want that fez-wearing cat to look like a watercolor painting, you got it. If you want it to look photorealistic, Dall-E’s got you covered. The AI can even make changes to existing images, adding elements into them while incorporating appropriate use of shadows and reflections, meaning that the cat can wear its funny little hat wherever you want it to!

For a number of reasons, Dall-E isn’t something that’s quite ready for use by the general public. I’ll admit that I don’t have the clearest understanding of the technology, but I completely get why a tool this powerful isn’t just something that anyone can use. Dall-E requires a lot of energy to generate the kind of images that it produces, and making it public would overwork the servers that power it. And while the model has its limits on what it can produce, and how realistic what it produces looks, there’s no doubt that some people would take its creative freedom a bit too far. There is, however, a very long waitlist for people seriously interested in giving it a try. Not only that, but technically there is currently a way for the general public to try their hand at generating images via text prompts. Think of it as a low-budget alternative…

Earlier this year, a particular text-to-image generator blew up in popularity. Originally named “Dall-E Mini,” this generator isn’t actually associated with the real Dall-E, and as such changed its name to “Craiyon” (get it?) to distance itself from OpenAI’s creation. However, the original name did fit pretty well, as not only does this generator function as a simplified version of the real deal, but its results are generally pretty accurate to what you type in. There’s a similar level of versatility as well; Craiyon has been trained on a surprising number of characters and people.

While less powerful than the full Dall-E, it can be used by the general public with ease, though at a few costs. The images aren’t of the same quality as the ones that the actual Dall-E model can create, and the same level of customization is absent. However, Craiyon could very well be considered the next best thing, as it still manages to yield pretty crazy results, and a great many people have had all sorts of wacky fun with it, typing up the most random and bizarre prompts they can think of and sharing what the bot generates online. When this mini model first gained mainstream attention, so many people began using it that it began to have trouble loading (further exemplifying the problems the actual Dall-E would probably face if it went public). Luckily, there are more servers now, and the online app now works with ease.

The Practicality of Text-Based Art

Aside from simple entertainment, there are quite a lot of practical uses for text-based art generators like Dall-E. If you’re not an artist, but have ideas in your head that you’d like to convey, these generators could potentially work really well in your favor. Do you need art for something like a school presentation or a mood board? Or maybe you’d like to create concept art of some sort? Perhaps you’re designing a location for a video game, and said location has a bunch of individual paintings on the walls? The possibilities are endless! Also keep in mind that this technology is still in its early stages; who knows what it could bring as it evolves over time? Images that are even more realistic? An even wider variety of customization options? What will art generators be like by the end of this decade?

What About Actual Artists?

Now that I’ve gone on this long about how incredible the world of AI-generated art is, it’s time to answer the question that I’ve posed in the title of this article. As impressive as Dall-E is, I honestly still think we’ve got a pretty decent way to go before real artists have anything to worry about. Some of what I said in the section above may seem kind of blasphemous, but rest assured that I’m actually an artist myself, and as an artist, I can say that no matter how good art generated from text prompts may look, it’ll never be a suitable replacement for what real artists have to offer. It’s not always about how “good” any particular piece looks; art is something that artists put their own thoughts, feelings, and experiences into, making each work of theirs true to themselves. If you’ve got the skills, you’ve got the freedom to create anything you want, how you want, without letting artificial intelligence fill in the blanks for you. Dall-E can create some really impressive steampunk renditions of the houses from SpongeBob SquarePants, but what if you wanted to make your own steampunk SpongeBob house, using a design that you yourself imagined? As corny as it might sound, there’s a certain level of soul that man-made art contains, and it’s a level of soul that you just don’t get in AI.

The Future Is Wild

It really can’t be overstated just how impressive art generators like Dall-E 2 are; honestly, this article barely does justice to some of the insane creations people have made with it. It’s something that really needs to be seen to be believed. I honestly feel kind of blessed to not only be living in a day and age where technology has become so advanced, but to be old enough to appreciate it. I may be of the firm belief that human-made art is something that can never be replaced, but I still have to admire the level of artificial talent Dall-E possesses, and give major props to OpenAI for developing it. I look forward to seeing what they’ve got in store for the future, and what kind of advanced AI technology they’ll present to us next. Here’s to another decade of progress!

By Austin Oguri 

