Random Thoughts and a Few Valid Points
A Writer Uses AI Image Generation: This is His Story
Random Thought:
Every good blog needs to start with an insightful quote or two (or a few!) that the author uses to set up the rest of the piece. It’s like a little stretch before a workout.
Here are mine.
“A writer is an artist who uses words instead of paint.” – Unknown
“Every writer is a painter of the mind.” – John Keats
“A writer is a person who can make you see the world through their eyes.” – Unknown
Random Thought on that Random Thought:
First off, wise of Keats to make sure his quote got noticed. Don’t let that gold go to waste! Promote yourself to immortality. But let’s be honest: Back in those days, you had about 10 writers across the world, and not a lot had been written or said to that point anyway, so it was easy to be the first to have a thought-provoking quote and get credit for it.
“I had eggs for breakfast.” – Voltaire.
The other two quotes, the “unknowns” and whoever was behind them, they all missed a chance at having a quotable legacy. The first looks like an initial stab from Keats before he settled on the second. (Talking to himself: “Not so wordy, John. C’mon man.”) The third sounds like someone who’d gotten one too many rejection letters.
Random Thought on the Random Thought on the first Random Thought:
This “unknown” attribution has me pondering if I should change my last name to the hyphenate “Unknown-Anonymous” and see if I can get my share of back-dated royalties. Yeah, that “Frankie and Johnnie” poem? It has my name on it. Where’s my check?
Anyway, let’s get back on the rails. This intro might be fun, but it’s also distracting. Yet it also indicates this was human-generated.
And that’s what we’re here to discuss. AI image creation. The quotes––yes, they have a purpose! I almost forgot! ––are a segue to the idea that a writer, a person full of confidence in words (save for the 23.75 hours of the day they feel like a hack), doesn’t experience the same assuredness with pictures. I’ll use 400 words to describe a circle before I dare draw one.
See? Not bad for a fifth try.
So, to start: I am a writer.
That’s less a prideful remark or guilty admission and more a clarifying setup to provide context for this topic: Using AI visual tools to create graphics and images. Sure, designers, art directors and artists have ingrained skills to visualize and create graphics and images through InDesign, Photoshop and those other programs the bosses don’t let us writers have on our computers.
But writers live in words, and they also live in Word®. Which hates images. Copy and paste an image into a Word doc and you’d think you dropped 40 Mentos into a bottle of Coke. What a mess. So, tools that enable us writers to create images on our own with a well-crafted prompt and strong concept make us feel, well, if not like Picasso, maybe like a kind of Thomas Kinkade. (“I’m going to call this one, ‘Cottage with flowers.’”)
[I am aware that, with all this digression, this is feeling more and more like a podcast (“This blog is brought to you by Salesforce.”). I hear you. Let’s get to this AI topic. For real.]
I’m here to share my experiences with four of the top/most popular image-generating tools and provide ideas on how best to use them, and for what purpose.
Time to get to it.
First out of the gate …
DALL-E: My First
The original tool. Well, for me it was. While AI was a thing for years, OpenAI’s ChatGPT feels like it was there first, and was the initial AI tool I used, so I have an affinity for it. Thanks for that, Sam. I mean Mr. Altman.
My brain when I’m trying to sleep.
Quick Random Thought:
DALL-E was there when I was a struggling artist (image creator), with me through the early days of uncertainty of this AI ride. Like MacKenzie Scott. But I’ll never leave you for something flashier. You’re just too good, nice and generous.
The quick take:
I enjoy working with DALL-E. It feels like a friend willing to help. They have some issues, but don’t we all? While DALL-E has gotten better over the last couple of years, they still have trouble with faces and words. [But with the recent launch of GPT-4o in ChatGPT, which I’m starting to explore, a lot of those issues are much improved.]
- First, the melting, doughy faces in perpetual screams mean that if I do need to show people, it’s from behind, in darkness, or one at a time. However, while groups look like the medical staff in that old Twilight Zone episode “Eye of the Beholder,” individuals look better. There’s less risk in having DALL-E show you one person rather than a few or a crowd.
- DALL-E has a way with words, by not having a way with words. Which has been a common issue with AI image generation. You may have a great concept for an image, you provide a detailed prompt, and ask that a sign above read, “DOG.” And you get a perfect image with a sign that says “diggaroo&@.” So close!
Well, um, you’re close enough.
- But every once in a while, DALL-E shocks everyone with an unblemished opening paragraph to War & Peace, like Clay Aiken at his American Idol audition. So, you never know. DALL-E is getting better with words, though it’s hit and miss and still comes across like a spelling bee for toddlers.
Quick tip:
I’ve learned to be very specific with the prompt, put the word I want to see in “ALL CAPS” and quotation marks, like Grandma on Facebook, and make it clear this is a word to be shown rather than a descriptor of the scene.
Random Thought:
For any AI generation, the overall experience of waiting for the image to form … it’s a real sense of: “waiting for the mechanic to tell you what they found.” Sometimes it nails it exactly. Sometimes it enhances your concept and makes it even better. Sometimes you get that Cristiano Ronaldo statue.
Hey, maybe Cristiano doesn’t have a face for sculpture.
Remember:
With AI imaging, editing is trial and error that requires a deft touch. Example: You get almost what you need, but you’d really like to add a newspaper to the ornate desk already present. And you get that newspaper, but for some reason, the desk becomes a dining room table. Whoa, enough with the ad-libs, DALL-E. Just the newspaper, please. But changing back, reverting, is a tricky business. So consider this ...
...Quick editing tip: Fixing one thing can break something else. I’ve found the best trick is to say, “Thank you! This is great!” Then start over with the original prompt, revised to include the newspaper. Then I’m more likely to get what I need.
What to use AI imaging for, Part 1:
- Creating images for a PowerPoint deck that you might not otherwise have included, or that you would have found through a Google search. AI imaging can get you more specificity; the image used is one no one else has.
- Building a storyboard for a video concept you’d like to sell in, or you’re imagining. Build out a series of images in AI and you get a sense of the style and story you want to tell. In a concept phase, it’s a quick way to get started and gauge whether the concept is valid and feasible to produce.
Random Thought: Be polite, no matter what!
I have no data that shows being a jerk or off-putting to AI affects the output vs. being polite, but I’m not taking any chances. And in the case of DALL-E, they’re nice right back. (“Hello! Your concept sounds amazing! Let’s get right to it!”). I want to be nice to them! I know I’m getting compliments from a machine that doesn’t know me at all, but I love it. Twenty years ago, a co-worker said I was a “handsome bloke,” and I’m still living off that. And those stories about AI becoming sentient and treating polite people better as they choose their servants, I think they’re absurd, but I’m not taking any chances. Have you heard about Roko’s basilisk? Yikes.
Playing the Field with Midjourney
I started using Midjourney for AI image generation at about the same time as DALL-E. Man, what a different experience! I felt like I’d walked into a foam party wearing a Members Only jacket. The record screeches to a halt like when Reggie Hammond entered a honky-tonk bar in “48 Hrs.” I don’t belong here.
The quick take:
Midjourney definitely feels like it’s for the artists who want to amplify their already stellar visual ideas. But with the right touch, anyone can benefit from it.
- The graphics created through Midjourney are stunning and original and run the gamut of styles. Oil painting, photo-real images, beautiful color arrangements. And they can look stunningly real.
May the force be with ... all of you at the next Bridgerton ball!
- The party line atmosphere (I latched onto a “newbies” group and started giving prompts) allows you to see what others are working on too.
- But it’s a queue, so you wait for others to get their results first before yours appear. And theirs are stunning, one beautiful output after another that fosters my imposter syndrome exponentially because they can see what I’m doing too.
Random thought within a bullet point:
I feel like I’m in an art museum with people who know what they’re talking about.
Them: “The structure of this one conveys the ebbing of life into the darkest depths even as the mind finally, ironically, tragically achieves enlightenment.”
Me: “The colors are pretty.”
- The editing is trial and error. You get four images, and can adjust them all, or pick a fave and it provides new variations of that one, with an updated prompt.
I had early success with Midjourney despite my anxiety over being a writer in a designer’s world. While it responds better to more concise and designer-friendly prompts (lighting, angles, artistic styles), it humored my more descriptive prompts to give me great output, and then freelanced a bit with styles that I hadn’t thought of. It provides styles (comic book, watercolor, hyperrealism) that may expand your thinking. “I hadn’t thought of this.” Midjourney is more likely to build upon a concept rather than just build it.
This was a super surprise when I prompted “a person with blue hair standing out in a crowd.” The comic feel and slight bird’s-eye view was a nice Midjourney flex.
Quick tip:
Midjourney has a more complicated user interface for a writer. It’s great for designers and is super for ongoing trial, error, and subsequent fixes. Use the party-line UX to your advantage. See what others are prompting and borrow the terminology they use. Plus, when Midjourney gives you something you weren’t expecting, don’t dismiss it. It may be closer to what you want.
What to use AI imaging for, Part 2:
- Creating images to place in an email newsletter or blog.
- Providing a concept for a layout to come later, to better sell it in with more specificity.
- Developing a kind of mood board for anything like a trade show booth, a new or updated brand, and ad concepts.
Meta AI: Slow your roll
The “look, we got an image generator too” tool from Meta started strong for me, but it’s not quite the same as it was. Kind of like Facebook itself.
The quick take:
I liked how Meta AI provided four options like Midjourney, and two more than DALL-E. I liked how it would start to form an image as I wrote the prompt, like guessing the “Wheel of Fortune” puzzle when the only letter showing is T. I liked the beautiful, backlit Norman Rockwell-like images it would generate.
Not bad for a writer, huh?
But the bloom is somewhat off the rose.
- The four images don’t confirm quantity equals quality. Each image kind of looks and feels the same. Midjourney does a better job providing four distinctive variations of a theme, IMO.
- The “build-as-you-prompt” action becomes distracting and affects the “prompt-zone” I’m in while I write it. It’s like a kid pulling on your arm, “Hey! Look at what I did! LOOK!” Don’t rush me, Meta. I know you’re brilliant and are excited to help, but let me complete this thought first, because I don’t want you guessing wrong and embarrassing both of us.
“Imagine a st__”
Image of stop sign appears.
“Um, I was looking for a steel drum band in Jamaica playing for tourists in a tiki hut. But appreciate you jumping the gun there.”
Quick tip:
It’s OK to play the field! Even if you don’t love a certain image-generating tool, it’s wise to use a few of them and not get locked into one. I like to provide the same/similar prompts to a couple of tools at once and have them unwittingly compete against each other.
Random thought: Strong concepts matter
AI image-generating tools are great. I’ve learned that you can prompt one, and it’s more than happy to give you something unique and impressive. But I’ve also learned it’s crucial to figure out a really cool concept first (use your own brain!) to create something wholly original. The pieces, parts, nuggets and details that you can imagine are what AI can develop. The story you tell in a prompt makes a huge difference. It’s why some AI generations seem so much better than others. It’s not the tool. It’s the person who wields it.
What to use AI imaging for, Part 3:
- Ad templates in the concepting phase
- As a pre-photoshoot cut sheet, such as showing items or people to be photographed. AI imaging can provide the angle you want, lighting, backdrops, etc. This way you can provide a more accurate list to the photographer.
Imagen … All the people, looking normal.
Google’s Gemini AI Imagen 3 (Can we get more modifiers of software ownership in here? It’s the Skechers LinkedIn Tostitos Ford Bowl brought to you by Snickers, presented by Murray’s Bail Bonds) is one I’ve been diving into lately and really like.
The quick take:
I liked the output and ease of prompting. And the results look “realer” than real life sometimes. But sometimes perfect is too much. Like with other AI image-generating tools, I had to ask for diversity to get it.
“Have you met Cary Grant, my less attractive older brother?”
- Its people look more realistic, and the imagery is cleaner. The faces aren’t sloughing off their skulls. We’ve cured Melting Stitch Face!
- Its guidelines are a little more restrictive about making things look too much like a known quantity. Google is definitely concerned you may create something that’s copyrighted or known, so it errs on the side of caution.
- The changes with each edit can be more than you want. Like with DALL-E, asking for a specific item to adjust in an image may provide a complete redo. You might get a different person, different background and/or a different foreground, when all you wanted was to make the can of pop a bottle instead.
- If I had a concept but needed help visually articulating it, I could converse with Imagen 3 and it would figure it out. Like the idea of a company going through a hiring freeze as shown below.
“Sir, our assets have been frozen. Get it? Frozen ... anyhoo, I just called to say someone tipped over Martin again.”
Quick tip:
On prompts, I tend to start with the big picture and work my way to the details. Describe the purpose and what the overall image idea is. Then I move to details in order of importance. Setting. Number of elements. Some items to show. Types of apparel. That sort of thing. I’ve found you don’t need to include every element in your prompt (you can if you’d like), but providing a couple of specific and leading elements is enough for the tool to fill in the gaps with designs of its own.
What to use AI imaging for, Part 4:
- Getting yourself unstuck on a concept. The tools are good for brainstorming and discussing half-formed ideas.
- Creating an image that will be the inspiration for a piece you’re trying to write. For me, an image can be the spark that ends writer’s block. A simple prompt (“Show me a team of IT people solving a problem.”) can deliver an image that gets me typing and “in the zone” on a related topic (“Skilling up IT teams.”).
Concluding Thought
AI image-generating tools help close the gap on the playing field of creation. A writer or anyone sitting at a desk can achieve visual looks much better than he or she could have created with traditional tools. This has been my experience. But I know I can’t achieve the level a professional designer or art director can achieve, with or without AI tools. The most important thing I’ve found is the originality of the concept that I ask AI image-generating tools to create. A better concept, more detail, and a more original idea: those are your foundations for making everything else with AI image generation easier, and they will lead to better results.
And you can quote me on that.
“A better concept, more detail, and a more original idea: those are your foundations for making everything else with AI image generation easier, and they will lead to better results.”
- Mike Unknown-Anonymous
Written by Mike Lawrence, Creative Director