Google wired Nano Banana into the chat interface's generateContent, not the image API's predict. That's counterintuitive if you're using RubyLLM, which trains you to think in terms of actions like paint instead of chat.
Once you know that quirk, it's straightforward. The only caveat: you need the latest trunk or v1.9+, because that's where we taught RubyLLM to unpack inline file data from chat responses.
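If you're on Bundler, pinning that floor in your Gemfile is all the setup you need (a sketch; adjust the constraint to your own policy):

```ruby
# Gemfile — v1.9 is where chat responses learned to carry attachments
gem "ruby_llm", "~> 1.9"
```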
Wire It Up
chat = RubyLLM
.chat(model: "gemini-2.5-flash-image")
.with_temperature(1.0) # optional, but you like creativity, right?
.with_params(generationConfig: { responseModalities: ["image"] }) # also optional, if you prefer the model to return only images
response = chat.ask "your prompt", with: ["all.png", "the.jpg", "attachments.png", "you.png", "want.jpg"]
image_io = response.content[:attachments].first.source
That StringIO holds the generated image. Stream it to S3, attach it to Active Storage, or keep it in memory for a downstream processor.
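Dumping that StringIO to disk yourself, or piping it into any writable IO, is a one-liner. A minimal sketch, with fake bytes standing in for the real attachment source:

```ruby
require "stringio"

# Stand-in for response.content[:attachments].first.source —
# fake PNG-ish bytes purely for illustration
image_io = StringIO.new("\x89PNG\r\nfake image bytes")

# Copy the in-memory image to any writable IO without buffering it twice
File.open("nano-banana.png", "wb") { |f| IO.copy_stream(image_io, f) }
```

The same IO slots straight into Active Storage: `record.image.attach(io: image_io, filename: "nano-banana.png", content_type: "image/png")`.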
Want a file?
response.content[:attachments].first.save "nano-banana.png"
That’s it. Chat endpoint, one call. Ship the image feature and go enjoy the rest of your day.