From prompt to picture: Proven tips for better ChatGPT images
Image generation is one of the best-known and most common uses of AI services like ChatGPT. It is also an area surrounded by controversy: photographers, artists and filmmakers are upset that OpenAI and other companies have “trained” their AI models on their copyrighted works.
In this guide, I gather tips on what you can do and how to get better results.
Create images from text
The most obvious thing you can use ChatGPT to do with images is to generate something completely new. Just give prompts like “create a picture of two rabbits playing in a meadow” or “make a photorealistic picture of a woman sitting in front of a computer drinking coffee out of a cup that says PC for Everyone” and an image will pop up that you can download and use.
In many cases the images ChatGPT generates are really good, or at least good enough to use, but sometimes they don’t match what you asked for, or have obvious errors. Classic mistakes like too many fingers or an extra hand in a group photo are rarer today, but errors that are too obvious to ignore still occur regularly.
When this happens, you can either continue in the same chat and ask ChatGPT to tweak the image until the result is better, or start over with a new or modified prompt. Which approach works best varies, so you will simply have to try both.
Images generated by ChatGPT
In my experience, small adjustments rarely improve the result enough to be worth the effort, and so-called prompt engineering, where you try out different formulations, is far from an exact science. Even an adjustment that is logically very small, like adding “one of the rabbits has a pink collar” or “she’s holding the cup in her left hand” to the examples above, can lead to a totally different result, or to exactly what you hoped for.
5 tips for better images with ChatGPT
Describe what you are looking for
Have a picture in your head of what you want? Describe it as if you were telling someone who can’t see what you see. “A girl with brown hair and pale skin sitting at a piano in an old house with old-fashioned furnishings” is better than “a girl sitting and playing the piano” if that’s exactly how you want the image. If you don’t have a clear picture yourself, ChatGPT has nothing to go on. Sure, it can be interesting now and then to see what happens when the AI is given freer rein, but if you’re after something in particular, you need to actually say what it is.
Avoid overly detailed descriptions
A detailed description is important, but it can also be too much. If you write a whole A4 page with an extremely detailed description, there is a high probability that ChatGPT will lose the thread and produce something unusable. Include the essentials but let the AI fill in the rest.
“Metadata”
Do you want a wide image or a square one? Should it look like a photo or a painting? Should the colors be saturated or faded? How much of the image should the main subject take up? Should the light be warm or cold, sharp or soft? Tell ChatGPT how the picture should be made, not just what it should contain.
Try it again
Didn’t get it quite right? Ask ChatGPT to try again, or ask it to create some different suggestions. Change the prompt and see if it gives better results. Try a more detailed description – or vice versa, a simpler one, if your original prompt was already very detailed.
Start from a sketch
If you can draw a simple sketch that shows the basic composition and content of the image you’re after, you can ask ChatGPT to turn it into a finished image in any style. How well this works varies widely. Common problems include illogical composition, facial expressions that don’t match the sketch, and most of all, people looking in the wrong direction.
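The first four tips can be combined into a small routine for writing prompts. The following is only a sketch in Python: the function names, the prompt wording and the 150-word limit are my own assumptions for illustration, not anything ChatGPT requires. It builds a prompt from what the image should contain and how it should be made, warns when the text is getting long, and produces a few style variants to try.

```python
def build_prompt(subject, details=(), image_format=None, style=None,
                 colors=None, lighting=None):
    """Assemble an image prompt from what to show and how to render it.

    Hypothetical helper: the sentence templates are illustrative only.
    """
    sentences = [f"Create an image of {subject}."]
    if details:
        # Essentials only; let the AI fill in the rest.
        sentences.append("Include: " + "; ".join(details) + ".")
    rendering = []  # the "metadata": how the picture should be made
    if image_format:
        rendering.append(f"{image_format} format")
    if style:
        rendering.append(f"{style} style")
    if colors:
        rendering.append(f"{colors} colors")
    if lighting:
        rendering.append(f"{lighting} light")
    if rendering:
        sentences.append("Rendering: " + ", ".join(rendering) + ".")
    return " ".join(sentences)


def is_too_long(prompt, max_words=150):
    """Flag overly detailed prompts. The 150-word limit is an arbitrary assumption."""
    return len(prompt.split()) > max_words


def style_variants(subject, styles, **options):
    """Build one prompt per style, for the "try it again" tip.

    Pass everything except style through **options.
    """
    return [build_prompt(subject, style=s, **options) for s in styles]


if __name__ == "__main__":
    prompt = build_prompt(
        "a girl with brown hair and pale skin sitting at a piano",
        details=["an old house", "old-fashioned furnishings"],
        image_format="wide",
        style="photorealistic",
        colors="faded",
        lighting="warm, soft",
    )
    print(prompt)
    print("Too long?", is_too_long(prompt))
    for variant in style_variants("two rabbits playing in a meadow",
                                  ["watercolour painting", "photorealistic"]):
        print(variant)
```

You paste the resulting text into ChatGPT as usual. The point is simply to keep content and rendering choices separate, so you can change one at a time when you try again.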
Editing and improving your own images
In addition to generating brand new images, you can use ChatGPT to edit existing images. It’s important to note that this isn’t really editing in the usual sense of the word. Every time you ask it to make a change, it generates the whole image again. The algorithm just works in such a way that most of the new image will be identical to the original.
Once ChatGPT has created an image, you can click on it to open the chatbot’s editing interface. There’s really only one tool here, plus buttons for undo and redo. Click on the edit tool and your mouse pointer will turn into a big circle when you hover it over the image. Click and drag to paint an area that marks the part of the image where you want ChatGPT to make the changes you then ask for.
This could be things like removing a distracting object, changing the details of something (like changing the print on a jumper visible in the image) or adding something new.
If you want to make more general changes, you can do so directly in the prompt without selecting anything. “Remove background” often works well in its simplicity, but other changes may need a bit more detailed descriptions.
Anders Lundberg
Sometimes ChatGPT makes more changes than you asked for. In that case, you can try telling it explicitly not to change anything else. For example: “Change the color of the umbrella to red. Do not make any other adjustments or changes to the image.”
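If you make many edits, it can be handy to add that closing sentence automatically. This is a minimal sketch in Python. The function name is hypothetical, and the fixed sentence is simply the one from the example above.

```python
LOCK_SENTENCE = "Do not make any other adjustments or changes to the image."


def edit_instruction(change, lock_rest=True):
    """Turn a requested change into an edit prompt (hypothetical helper).

    With lock_rest=True, the prompt also asks that nothing else be changed.
    """
    text = change.strip()
    if not text.endswith("."):
        text += "."
    if lock_rest:
        text += " " + LOCK_SENTENCE
    return text


if __name__ == "__main__":
    print(edit_instruction("Change the color of the umbrella to red"))
```

There is no guarantee that ChatGPT follows the instruction, so compare the result with the original before you use it.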
“Zoom, enhance”
A very common trope in films is that a detail is needed from a blurry photo or a still from surveillance footage, and all a “computer person” needs to do is zoom in on the image and click an Enhance button. Sometimes a high-resolution version showing the necessary details pops up instantly, but sometimes the story requires it to take time, and then the computer can keep thinking for a long while. Often the image is revealed one part at a time, and the tension is unbearable as pixel after pixel appears on the screen.
This is science fiction, of course. Information that doesn’t exist can’t be ‘recreated’ no matter how advanced an algorithm or how powerful a computer. But with AI, it can be faked.
Any feature that removes distracting objects or people and fills in the background uses machine learning of some kind, whether it’s called AI or not. Older techniques like Photoshop’s content-aware fill use simpler algorithms while some newer ones use the same algorithms that AI chatbots do when generating new images.
Generated by ChatGPT
Enlarging an image works in a similar way, but instead of guessing what fits in one large gap, the algorithm guesses how to fill in many small gaps so that the image becomes sharper (shows more detail). Since some of the information is already there, the risk of the AI coming up with something completely wrong is much lower. If you can already see what a sign says in a low-resolution image and the AI just makes the text clearer, it hasn’t lied, although it can’t be said to have recreated lost information.
The result will never be identical to what it would have been if the image had simply been taken at higher resolution or better sharpness, but in most cases that difference is an academic question – what matters is whether you can use the image at the size you want without it looking blurry.
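To see what “filling in many small gaps” means, it helps to look at the simplest classic method, bilinear interpolation. This is not the technique ChatGPT or other AI upscalers use. It is a deliberately simple stand-in, sketched here in plain Python on a tiny grayscale image, and the function name is my own. Every new pixel is calculated as a weighted average of the nearest original pixels.

```python
def upscale_bilinear(pixels, factor):
    """Enlarge a grayscale image (a list of rows of numbers) by an integer factor.

    Each output pixel is a weighted average of up to four neighbouring
    source pixels, so no value outside the original range is ever produced.
    """
    src_h, src_w = len(pixels), len(pixels[0])
    dst_h, dst_w = src_h * factor, src_w * factor
    out = []
    for y in range(dst_h):
        # Map the output row back to a fractional row position in the source.
        sy = y * (src_h - 1) / (dst_h - 1) if dst_h > 1 else 0.0
        y0 = int(sy)
        y1 = min(y0 + 1, src_h - 1)
        fy = sy - y0
        row = []
        for x in range(dst_w):
            sx = x * (src_w - 1) / (dst_w - 1) if dst_w > 1 else 0.0
            x0 = int(sx)
            x1 = min(x0 + 1, src_w - 1)
            fx = sx - x0
            # Blend horizontally on the two nearest rows, then vertically.
            top = pixels[y0][x0] * (1 - fx) + pixels[y0][x1] * fx
            bottom = pixels[y1][x0] * (1 - fx) + pixels[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


if __name__ == "__main__":
    small = [[0, 100],
             [100, 200]]
    for row in upscale_bilinear(small, 2):
        print([round(value, 1) for value in row])
```

Averaging like this gives a larger but soft image, because it can only blend what is already there. AI-based upscalers instead use models trained on large numbers of images to guess plausible detail, which is why their results look sharper and also why they can occasionally get things wrong.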
Another thing you can try is to ask ChatGPT to sharpen a blurred image, for example a photo where the camera missed focus. This can work really well if the image is only slightly blurry, but if it’s very blurry, ChatGPT guesses wildly and the person in the photo can end up looking like someone else entirely.
Apply a certain style to images
ChatGPT has become known for being good at a particular kind of editing: turning a photo or other image into a picture with a particular style. You’ve probably seen examples of the trend of asking ChatGPT to make images in Studio Ghibli style, that is, in a cartoon style similar to films directed by Hayao Miyazaki. It’s very good at it, but be aware that the creators you make it mimic have in most cases been sharply critical of the practice. Some have sued OpenAI for copyright infringement.
Less controversial is asking ChatGPT to change the image to a style that is not that of any individual artist, for example “turn this photo into a watercolour painting”, or asking for a style that belongs to a long-dead artist like Rembrandt.
Photo: Anders Lundberg, painting generated by ChatGPT
You can also upload an existing image to have as a reference and ask ChatGPT to remake other uploaded images to match the style of that image.
A trick you can try if this does not give satisfactory results is to upload the example in a new chat instead and ask ChatGPT to “generate a description of the image that could be used to ask ChatGPT to apply the same style to another image”. Paste the results into the chat where you have uploaded the image(s) you want to change the style of.
Gallery
In the top left column of ChatGPT, under New Chat and Search Chats, you will find the Gallery feature. It’s a repository for all the images you’ve generated with ChatGPT (technically only with the GPT-4o model, not images generated with the older DALL-E model).
It makes it easier to find specific images you have generated, so that you can, for example, continue working or look up how you wrote the prompt at the time. Click on an image and then on Open in chat in the top right corner to go to the thread where the image was generated.
Generate video with Sora
In addition to image generation, OpenAI has developed algorithms that can generate video. These are offered as a separate service called Sora, with its own website and app. Sora is not embedded in ChatGPT, mainly because the service requires a more advanced user interface and OpenAI wants to keep ChatGPT’s interface simple.
Sora is exciting and can create scarily realistic videos. Going through everything you might find useful about video generation would take up more space than I have in this guide, but you can start from the same basic tips as for image generation. Beyond that, my tip is to simply play around with the service. Keep in mind that you can create a maximum of 15 ten-second clips a day unless you have an expensive Pro subscription.
Projects and GPTs
Just like with text, you can use projects to keep all of your chats organised and add files and instructions to accompany any new chats in that project. This is ideal if, for example, you’re using ChatGPT to create image resources for a website or anything else where you want to stick to a consistent style.
If you pay for a Plus subscription, you can also use the GPT feature to create customised versions of the chatbot, not to mention accessing GPTs created by other users, such as GPTs built for upscaling images.
AI-generated images and copyright
If you let ChatGPT or another AI service generate images for you, you have no copyright on them. It doesn’t matter how detailed your description was or how much you fiddled with the prompt. This means that others can copy ‘your’ images and use them, without asking you and without you being able to do anything about it. It is also illegal to claim that you own the copyright to an AI-generated image.
However, if you take an AI-generated image and make major changes to it using a program such as Photoshop, it can become a “work of authorship”, which gives you the copyright to it. The same applies if you paint an image that the AI has generated – then it is your painting that you have the copyright to, not the generated original.
The US Library of Congress has a good guide to AI and copyright, which also warns of the risk of an AI infringing someone else’s copyright. If you’re just using the images for personal use, there’s a low risk of you being sued, for example by Studio Ghibli if you’ve made a portrait of yourself “Ghibli-style”, but for those running a business, it’s more important to be careful.