
A Glimpse of AI Generated Art

Photo by Growtika / Unsplash

While preparing the material for my book review post Rebuilding, When Relationships End, I had the idea of using the OpenAI API to generate images from the quotes I planned to use in the post. In total I requested 13 images, one for each of 13 quotes or paragraphs, using OpenAI's DALL・E3. 12 were generated successfully, although I ended up using only 4 of them in the book review post. The other 8 are also very interesting, but I don't think they are suitable for the blog (you will see why later).

This simple, fun exercise gave me a glimpse into AI generated art, and here is what I found:

1. Easy to get started

It only takes a few steps to install the library for the OpenAI API. The OpenAI tutorials are very easy to follow, and you don't need programming knowledge to use them, as OpenAI provides example code to get you started.

To use the OpenAI API, you pay a small fee each time you trigger a request. The DALL・E3 image model cost me $0.08 per request; the exact price depends on the image size and quality you choose.
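To give a feel for how little code is involved, here is a minimal Python sketch of a single image request. It assumes the official `openai` Python package (version 1 or later) and an API key stored in the `OPENAI_API_KEY` environment variable. The helper names `build_request`, `estimate_cost` and `generate_image_url` are my own inventions, not part of the library, and the price is simply the figure quoted above, so check the current pricing before relying on it.

```python
PRICE_PER_REQUEST_USD = 0.08  # figure quoted in this post; an assumption, not a fixed rate


def build_request(quote: str) -> dict:
    """Return the keyword arguments for one DALL-E 3 image request."""
    return {
        "model": "dall-e-3",
        "prompt": quote,
        "n": 1,  # DALL-E 3 generates one image per request
    }


def estimate_cost(num_requests: int, price: float = PRICE_PER_REQUEST_USD) -> float:
    """Estimate the total cost in USD, assuming every request is billed."""
    return round(num_requests * price, 2)


def generate_image_url(quote: str) -> str:
    """Send one quote to the API and return the URL of the generated image."""
    from openai import OpenAI  # imported here so the helpers above work without it

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.images.generate(**build_request(quote))
    return response.data[0].url
```

Calling `generate_image_url("some quote")` sends a real, billable request, while `estimate_cost(13)` only does the arithmetic for 13 requests at the quoted price.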

2. More literal than abstract

The 13 quotes or paragraphs served as the "prompts" for OpenAI DALL・E3. As you may have read in the post Rebuilding, When Relationships End, most of them are about feelings, mentalities and spiritualities, which are rather abstract. As of March 2024, DALL・E3 doesn't seem to respond well to sentences and paragraphs that are not tied to concrete objects, nor does it seem able to integrate several related or progressive sentences into one abstract idea. Most of the generated images look like literal translations of the paragraphs' key words, pieced together.

OpenAI DALL・E3 generated this comic strip based on the texts: "The power struggle in the relationship is diminished when: Each person learns to talk about feelings. Each person starts using I-messages instead of you-messages. Each person takes ownership of unresolved problems. Each person looks at the other person as a relationship teacher. Each person works at learning more about herself or himself, instead of projecting the hurt and blame upon the other person."

Some images hilariously express quite the opposite of the text, because the AI focuses only on illustrating the key words rather than the whole passage. You can see this in the following examples:

OpenAI DALL・E3 generated this comic strip based on the texts: "Many people marry for the wrong reasons, among them (1) to overcome loneliness; (2) to escape an unhappy parental home; (3) because they think that everybody is expected to marry; (4) because only “losers” who can’t find someone to marry stay single; (5) out of a need to parent, or be parented by, another person; (6) because they got pregnant; and (7) because “we fell in love.”"
OpenAI DALL・E3 generated this comic strip based on the texts: "Your inner critic is SMALLER than you are, and you can be BIGGER than it is. Consciously make a decision to start listening to that part. Acknowledge the voice, and it will eventually start softening the words it uses. When your critic has finished speaking each time, you may respond with a simple “Thank you.”"

3. Concerning gender and race ratios

Despite the gender-neutral tone of the prompt texts, of the 12 successfully generated images, 6 have a male central figure whereas only 4 have a female central figure. The other 2 depict heterosexual couples. As a female blogger looking to use AI generated images to enhance what I wanted to express, I certainly didn't feel these heavily male-centred images were representative of either my expression or the gender make-up of my audience. This was one of the reasons why I couldn't use many of the images I paid for in the post. To give OpenAI the benefit of the doubt, it is possible that the gender bias I'm seeing is an artefact of the small sample size in my experiment. Nevertheless, I would hope that the world's leading AI service could live up to the gender equality its customers expect of it.
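For anyone who wants the ratios as percentages, here is a tiny Python sketch of the tally. The counts are the ones given above; the variable names are just for illustration.

```python
from collections import Counter

# Central figures counted across the 12 successfully generated images.
central_figures = Counter({"male": 6, "female": 4, "heterosexual couple": 2})

total = sum(central_figures.values())

# Share of each category as a whole-number percentage of all images.
shares = {
    label: round(100 * count / total)
    for label, count in central_figures.items()
}

for label, share in shares.items():
    print(f"{label}: {share}%")
```

With only 12 images, a single image moves a share by about 8 percentage points, which is why the sample size caveat matters.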

It becomes more concerning when you look at the racial make-up of the figures in the OpenAI generated images. AI technologies in 2024 still seem to struggle to find the sweet spot between failing to represent all racial groups and over-compensating for an underrepresented one. 😅

Because these technologies are trained on the gender-biased and race-biased data we already have, the biases will only be amplified as the AI boom continues unless the engineers get the representation right.

4. Narrow representation of the spirituality

Funnily enough, OpenAI DALL・E3 seems to narrowly link text about feelings, mentalities and spiritualities to religion, especially in the female-centred images, even though the texts make no mention of religion. Churches, crosses, and saris are the recurring elements in those images.

5. Oops, safety system.

I fed 13 paragraphs of text to OpenAI DALL・E3, of which only 12 returned an image. The one that got rejected, according to OpenAI, may contain text that is not allowed by their safety system. Here is the text:

Ask yourself: Were you and your partner friends? Did you confide in each other? What interests did you share? Hobbies? Attitudes toward life? Politics? Religion? Children? Were your goals for yourself, for each other, and for the relationship similar/compatible? Did you agree on methods for solving problems between you (not necessarily the solutions, but the methods)? When you got angry with each other, did you deal with it directly, hide it, or try to hurt each other? Did you share friendships? Did you go out together socially? Did you share responsibilities for earning money and household chores in a mutually agreed upon way? Did you make at least major decisions jointly? Did you allow each other time alone? Did you trust each other? Was the relationship important enough for each of you to make some personal sacrifices for it when necessary?

Maybe the "safety system" is just a cover for DALL・E3 not being able to deal with long and complex prompts? Lol.
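If you send a batch of prompts like I did, it helps to let one rejection pass without stopping the rest. Below is a minimal Python sketch of such a loop. The function `generate_all` and its parameters are hypothetical names of mine; `generate` stands for whatever function you use to request one image, and `rejection_error` for the exception type it raises when a prompt is refused.

```python
from typing import Callable, Dict, List, Tuple, Type


def generate_all(
    prompts: List[str],
    generate: Callable[[str], str],
    rejection_error: Type[Exception],
) -> Tuple[Dict[int, str], Dict[int, str]]:
    """Try every prompt; return (results, rejections), both keyed by prompt index."""
    results: Dict[int, str] = {}
    rejections: Dict[int, str] = {}
    for index, prompt in enumerate(prompts):
        try:
            results[index] = generate(prompt)
        except rejection_error as error:
            # Keep the error message so the rejected prompt can be reviewed later.
            rejections[index] = str(error)
    return results, rejections
```

With the official `openai` Python package, my understanding is that a refused prompt surfaces as `openai.BadRequestError`, so that would be the type to pass as `rejection_error`. Treat that as an assumption and check the error you actually receive.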

6. And the award goes to...

Despite the awkwardness of most of OpenAI DALL・E3's outputs, I was very impressed by one image. It keeps the content at an abstract level. With bold, rich, high-contrast colours and smooth, curving flows, it embeds the key words "creative selves" and "spiritual well-being" in the generated art very well.

If you are interested in more AI generated images, check out other interesting OpenAI DALL・E artwork here.