Generating images based on text prompts

What can the technology do?

AI can generate photorealistic images, artwork and scenes from text input, create artistic versions of original images, and insert objects into existing images while taking lighting and shadows into account. The text input, usually a short description of the desired image, is often called a text prompt.

This type of AI is sometimes referred to as an AI image generator. Midjourney provides one that's much more accessible to everyday users than OpenAI's DALL-E.

Midjourney images look a lot more like art. It can create stylized and sometimes hauntingly realistic renderings, and it excels at environments, especially fantasy and dystopian sci-fi scenes with dramatic lighting that look like rendered concept art from a video game. It’s also quite good at making surreal pastiche that mimics different art styles.

What are its limitations?

Some AI image generators cannot insert objects into existing images, and Midjourney is one of them: you can't manipulate an existing image. Its output is also rarely photorealistic; almost none of its images meet that standard.

Unlike with DALL-E, more realistic images don't seem possible with Midjourney. In one test, a prompt for a realistic rendering of a MacBook Pro on a white background got the texture right, but nothing else was close.

So if you're looking for more realistic images, the ability to use reference images, or an easy way to manipulate existing images, you'll need to wait for Midjourney to add those features or for DALL-E to become more accessible.

How are the risks and dangers managed?

OpenAI has attempted to mitigate some of the training biases inherent to its AI model. Midjourney, by contrast, hasn’t published any information about what datasets and methods were used to train its AI tool, but it’s likely that it was partially trained on images from ArtStation.

Midjourney doesn’t seem to have many explicit content protections aside from automatically blocking certain keywords. The “Content and Moderation” section of Midjourney’s user guide instructs users to “not create images or use text prompts that are inherently disrespectful, aggressive, or otherwise abusive,” and to “avoid making visually shocking or disturbing content” including adult content and gore. The rules also forbid content that “can be viewed as racist, homophobic, disturbing, or in some way derogatory to a community,” including “offensive images of celebrities or public figures.” It’s unclear how or how well any of this will be enforced, but with such impressive results, the Midjourney project will likely be something to watch as companies start precariously navigating a path forward for image-generating AI.

How do you use it?

The interface to most of these AI models is not user-friendly. Midjourney generates images through a Discord bot, and paying subscribers also have access to a web app. Users type their prompts directly into the chat interface and receive messages from the bot that show their generations rendering in real time. They can then upscale and enhance an image from each set of generations, or create more variations from the same prompt.

Is it free to use?

A “free trial” period only gives each user a limited number of generations before the bot prompts them to buy a subscription. For non-commercial use, the cheapest plan allows 200 images for $10 per month, while the premium tier allows unlimited generations for $30 per month. Midjourney discourages people from minting NFTs by requiring that anyone using generated images in “anything related to blockchain technologies” pay a 20% royalty on any revenue over $20,000 per month.
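To make those numbers concrete, here is a rough Python sketch of the arithmetic. It uses only the figures quoted above; the function names are hypothetical, and two points are assumptions rather than anything Midjourney has confirmed here: that the $10 plan is a hard cap of 200 images, and that the 20% royalty applies only to the part of monthly revenue above $20,000.

```python
from decimal import Decimal

# Figures quoted in the article above.
BASIC_MONTHLY_USD = Decimal("10")
BASIC_IMAGE_CAP = 200
PREMIUM_MONTHLY_USD = Decimal("30")
ROYALTY_RATE = Decimal("0.20")
ROYALTY_THRESHOLD_USD = Decimal("20000")


def basic_cost_per_image(images_used: int) -> Decimal:
    """Effective price per image on the $10 plan for a given monthly usage."""
    if not 0 < images_used <= BASIC_IMAGE_CAP:
        raise ValueError("the $10 plan covers between 1 and 200 images")
    return BASIC_MONTHLY_USD / images_used


def cheapest_plan(images_needed: int) -> str:
    """Pick a plan, ASSUMING the 200-image allowance is a hard cap."""
    return "basic" if images_needed <= BASIC_IMAGE_CAP else "premium"


def blockchain_royalty(monthly_revenue_usd) -> Decimal:
    """Royalty owed, ASSUMING 20% applies only to revenue above $20,000."""
    revenue = Decimal(monthly_revenue_usd)
    excess = max(Decimal("0"), revenue - ROYALTY_THRESHOLD_USD)
    return excess * ROYALTY_RATE


if __name__ == "__main__":
    print(basic_cost_per_image(200))   # 0.05
    print(cheapest_plan(201))          # premium
    print(blockchain_royalty(25000))   # 1000.00
```

Under these assumptions, a user who uses the full allowance pays five cents per image on the cheapest plan, and anyone who needs more than 200 images a month is better served by the $30 tier.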
