Image Generation & Learning Art

Essential Visual Attributes

Feb 27, 2024

Recently I have been dabbling in metaphysics, as I am interested in building a more detailed, first-principles view of what intelligence, knowledge, ethics and the like are. And of course I have also been dabbling more in image generation, given the vast improvement I have seen and following on from the last issue on differentiating AI-generated images from non-AI-generated ones.

As I played around with image generation software, I started to realize something. Image generation AI can actually teach us what the essential visual attributes are: the features that, once we see them in part of an image, make us recognize that it comes from something we are familiar with. Our eyes tend to pick out patterns that we recognize very quickly. I used Bing Image Generator with the following prompt:

“A single monster that is a mix of Chinese dragon, rabbit, frog and duck.”

Setting aside the generator's failure to understand “a single monster”, you can quickly start to notice attributes of the animals named in the prompt.

Chinese Dragon: As seen in the dragon head, the horns on the head, the long beard and the claws.

Rabbit: Especially the long ears.

Frog: The entity’s posture, green color and the beady reptilian eyes (bottom picture on the left).

Duck: The beak and webbed feet are there (bottom picture).

I ran the above experiment with Bing Image Generator. I have tried other combinations as well, and sadly Bing Image Generator is a bit “lazy” in my opinion, as it tends to “paint over” the entity it recognises, at least based on a few tries.

Conclusion

I feel image generators are very good “teachers” for artists, especially up-and-coming ones: they can help artists become more perceptive and understand which basic visual attributes register quickly in the mind and trigger a full image of a certain entity.

This can help artists level up and create art that communicates better with its audience.

What are your thoughts on it? Please share in the comments below. :)

Eric Sandosham

You made some interesting points, Koo. One of the hardest things to encode is language, because of its compression (i.e. a few words carry a lot of meaning). We no longer write or talk literally. As English becomes the world's default language, these 'shortcuts' in word usage will accelerate. Gen AI image generators can teach us how to reconstitute these shortcuts into their original meaning, and the prompts themselves can be studied by AI to learn about decompression.
