Oscraps

Google Gemini - AI images from verbal prompts

hoodsmom

Guess Who!
I posted this LO in the gallery and thought I'd share more details about how I made the frogs. First off, be sure to read the section under "photo credit," which tells a little more about Google Gemini; I'm not repeating that in this post. What I especially liked about Google Gemini is that you don't have to buy credits to use the tool, as you do with other web-based AI image-generating software. If you are a privacy freak, note what I said about Google keeping the info for three days, and consider creating a Google profile not associated with your "real" Google profile.

I was going to extract images of origami frogs that I've folded and photographed, but if I hadn't already done the photography, it would have been extra work to get out the models and photograph them against a plain background for extraction. And I wanted the frogs to have a sardonic expression - so I'd have had to find a model that allowed for facial expression (and even if I'd found one to fold, my shaping skills aren't that advanced). Enter AI, which solved a number of problems, including the copyright issues.

Using Gemini is pretty easy. FWIW, the frog in the Gemini result below looks a lot like many other frogs I generated the day before using a different Google profile. So my take on this is that Gemini is good with the technical bits, but not so good with the creative part. And unlike working with a human illustrator, you can't go back to Gemini and say, "No, you turned the frog to its left, not its right," or "I want the facial expression to look **really** egotistical and dismissive, like a politician one can't stand." You just keep hitting the regenerate button until you get something that appeals to you.
instruction 2.jpg

And here's how I "colored" the extracted frogs:
instruction 1.jpg
 
Here are two other AI tools I've tried that offer a reasonable number of free images:

Canva (canva.com) - they now own Affinity. Apparently pretty popular. The AI tools are part of a larger suite for doing graphics projects. I find the interface complicated and confusing; all I want to do is generate an image or two, not make a big project, so it seems like a lot of work for a single image. One hint - resizing is a "pro" feature, so when you choose your initial "project" size, be sure it's big enough for digi-scrapping work. I think the only serious constraint on "free" accounts is the 5GB storage limit. Canva partners with AI generators. One, Mojo AI (express option only), allows 10 free images per day. To use the others more than a few times, you have to purchase credits (not sure if you also have to upgrade your Canva account). Instructions and my experiment with the origami frog prompt follow:

Instruction 3.jpg
MojoAIExamples.jpg
Mojo AI has "styles," and I tried several - but I didn't get much variation once it decided it understood my prompt.

Clipdrop Reimagine (clipdrop.co) - you upload an image and the software generates weird variations. If you want really weird, start with an already weird image and keep clicking the button. You can't go back to earlier results, so be sure to download things you like right away.

I used this before in this LO - the starting image was a LO (also in the gallery) I had created myself. My comment was that AI obviously didn't "get" digiscrapping when it tried to reimagine my entire LO.

Here is the experiment with the Google Gemini frog (bottom left) - I got some very interesting stuff, but it didn't have the look I was going for, at least not on the first go-round. The images were pretty far away from "origami," but I might have been able to use one with the correct facial expression, because "origami" was less important to me than "sardonic."
Instruction 4a.jpg
 
Wow!
You did a great job here explaining how you made this frog. It's an art to tell the AI what to do to make it look the way you want it to. I'm experimenting with several programs and I don't always get what I want.
I hadn't heard of Gemini and Canva before. I work with Bing Image Creator and Adobe Firefly.
Maybe we can share prompts here so we can all learn to get even better results.

Thank you!
 
@Su_Sanne - Thanks for telling me about Bing Image Creator (which relies on Dall-E).

Here are my experiments. Interesting that none of the AI engines can reliably determine what a creature's right and left are. (It'd be bad news if the AI engines were surgeons or even interventional radiologists.)
Row 1: Dall-E doesn't understand "sardonic," but gets "origami."
Rows 2-3: I really liked the second image in each row. You might see them again. The images are more "low-poly" than "origami," but I love the expressions.
I think Dall-E did the best job of the lot at coming up with different interpretations of the same or a similar prompt, and it was the most likely to pick up on the request for a transparent background.

BingDallE.jpg
 