AI Image Generation


A little bit of fun using AI...

The title graphic of this article was generated by AI using the prompt: "a factory set in the 1800's making images using AI" and two iterations.

I have previously discussed the prompt engineering now required in the age of generative AI: The Art of Prompt Engineering: Unleashing the Power of AI with Refined Queries.

So how have I been using it? I have tried various text-based models such as ChatGPT and Google Bard, but have also moved into generative images. I will not cover the text-based ones today, but they can be used to derive base texts, review existing texts, etc., or, my favourite, to reword something for you.

For image generation I have tried tools such as DALL-E and Midjourney with some success, but Microsoft's recently launched Designer tool is where I have spent most of my time and got what I believe are great results for my use case. Midjourney can generate more lifelike images, but I don't need that for my current application, and the output from Designer is still good (as seen in the picture of a lion below). Thanks to Microsoft's OpenAI investment, Designer is backed by DALL-E.

prompt: "a photo realistic picture of a lion running through long grass"

One of the first things I used it for was to show my 9-year-old daughter the power of AI and where the future of technology is going. To help her understand it, we used her prompts, which after a few iterations resulted in this picture.

The prompt that we used for this was: "a 3d illustration of a gorilla riding a horse through 1800's London whilst eating a pink doughnut with sprinkles on it"

The outcome was a good representation of her expectations.

I have been going through my website replacing the graphics in prior articles. The old graphics were sometimes of poor quality: they were either made by myself through screenshots of something I had done in PowerPoint, or they were rights-free graphics that were low quality, contained watermarks, or did not quite match the topic accurately. And the images almost always varied in size.

Now, through generative AI, I have been able to generate high-quality images very quickly that match the topic descriptions accurately, often combining several themes into one image.

Here is an example from a recent article, where I wanted to overlay several themes into one picture.

I used this image as the header graphic for my article: Understanding Nokia's Struggles in the US Mobile Market. Getting to this output required many iterations. I began with a basic prompt and gradually added or removed detail until the output conveyed the message I wanted, then started refining a chosen iteration. To be frank, I did not note down the exact final prompt, but it was something along these lines: "a 3d illustration showing a person holding their head in despair over the failure to make it big in the USA consumer market. The person's outfit needs to demonstrate USA. Set the entire image against the skyline of New York"

I also used generative AI to systematically update old articles, to make my site look more current and to keep the theme consistent throughout, as opposed to the random images I had used to date. Here are a few examples of the changes I made; sometimes the prompts were as simple as the headlines from the articles, with a few iteration requests on the same prompt.

Here are a few showing the before and after images from articles.

SWAT Teams vs. Film Crews, how they both use Bricolage

prompt: "Members of a Swat team at the ready Vs a Film Crew"


Strategy: RBV, Inside-Out, Outside-In 

prompt: "an image of white piece of paper with the following words included: Strategy, Planning, Goals. The word strategy must be in bold and larger than the others and highlighted by a yellow highlighter"


VAST Networks Shortlisted for 3 International Awards

This is a good comparison of a Shutterstock image vs AI Generated. 

prompt: "a Wi-Fi hotspot sign in a restaurant window"

I have found that guardrails are present. For instance, I can generate an image with a prompt including "iPhone" or "Nokia", but I am not able to use the prompt "Steve Jobs". Similarly, when iterating the SWAT image above, including the word "assault" resulted in an error.

It also seems to protect images from duplication: even if the exact same prompt is reused, it treats the reuse as a new iteration rather than reproducing the original image.

It does not always get it right, though, a phenomenon sometimes known in AI as hallucination. Below, two images resulting from the same prompt came out distinctly different.

prompt: "a photo realistic picture of a gorilla dressed as a clown eating cotton candy while riding simba from the lion king"

Overall, I am finding generative image AI a great way not only to make my website look better, but also to improve work documents and presentations. Whereas before I, like many, would do a Google image search to find an image to match my content, or make a request to marketing that came with a time delay and/or associated cost, I can now generate a high-quality image file using prompts similar to what I would have used as my query in a Google search.

So, one final test: using the same text as both a query in Google Images and a prompt in Microsoft Designer, let's see what we get. Here is the first result from each.

Prompt/Query: "a google image search"

Result:

Screenshot of the Google result


Microsoft Designer Image Creator