Nano Banana Image Generation Interface Document

1. Interface Basic Information

Model Name: gemini-2.5-flash-image (Nano Banana)

Base URL: https://llm.ai-nebula.com/v1/images/generations

Authentication Method: Bearer Token

Auth Token: Bearer sk-xxxxxxxxxx

Core Capabilities:

✅ Text-to-Image (Generate images from pure text descriptions)

✅ Image-to-Image (Generate new images from a single image + text)

✅ Multi-Image-to-Image (Generate new images by fusing multiple images)

✅ Multi-turn Conversational Image Generation (Continuous modification with context)

Supported Aspect Ratios: 1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9

Supported Image Formats: PNG, JPEG, JPG, WEBP

Image Size Limit: Max 7MB

1.1 Core Parameters

Parameter Name	Type	Required	Description	Example Value
model	string	Yes	Model Name	gemini-2.5-flash-image
prompt	string	No*	Text prompt	"A cute orange kitten"
contents	array	No*	Multimodal content (supports conversational context and image-to-image)	See examples
response_format	string	No	Response format: b64_json	b64_json
size	string	No	Image aspect ratio (ratio or pixels)	"16:9" or "1792x1024"

*Note: Select either prompt or contents; at least one must be provided.

Image Input Requirements:

Supported Formats: PNG, JPEG, JPG, WEBP

Max Size: 7MB

Input Methods: Supports both URL and Base64 formats

URL Format: "image": "https://example.com/image.jpg"

Base64 Format: "image": "data:image/png;base64,iVBORw0..."

1.2 Aspect Ratio Settings

Set the image aspect ratio via the size parameter, supporting two formats:

Format 1: Direct Ratio (Recommended)

{
 "size": "16:9"
}

Format 2: Pixel Dimensions (Automatically converted to corresponding ratio)

{
 "size": "1792x1024" // Automatically converted to 16:9
}

For supported pixel dimensions and their corresponding ratios, please refer to the "1.3 Supported Aspect Ratios and Corresponding Pixel Dimensions" table below.

1.3 Supported Aspect Ratios and Corresponding Pixel Dimensions

Aspect Ratio	Pixel Dimensions	Usage Scenarios
1:1	1024x1024	Square images, avatars, social media
3:2	1536x1024	Standard photography ratio
2:3	1024x1536	Vertical posters, phone wallpapers
3:4	1536x2048	Vertical photos
4:3	2048x1536	Traditional display ratio
4:5	1024x1280	Instagram vertical
5:4	1280x1024	Traditional display ratio
9:16	1024x1792	Mobile vertical screen, short video covers
16:9	1792x1024	Widescreen, video covers, desktop wallpapers
21:9	1024x2176	Ultrawide screen

2. Simple Text-to-Image Functionality

2.1 Basic Text-to-Image (Default 1:1 Ratio)

Generate an image of "A cute orange kitten sitting in a garden":

2.2 Text-to-Image with Aspect Ratio (16:9 Widescreen)

Generate a landscape image with a 16:9 widescreen ratio:

2.3 Vertical Text-to-Image (9:16 Mobile Screen)

Generate a poster suitable for a mobile vertical screen:

2.4 Using Pixel Dimensions to Specify Ratio

You can also use pixel dimensions directly; the system will automatically convert them to the corresponding ratio:

3. Image-to-Image Functionality

3.1 Basic Image-to-Image (Default Ratio)

Generate a new image based on a base image. Supports two image input methods:

Method 1: URL address (System automatically downloads the image)

Method 2: Base64 encoding (Must include data URI prefix)

Example 1: Using Base64 Input

Example 2: Using URL Input

3.2 Image-to-Image with Aspect Ratio (21:9 Ultrawide)

Generate a 21:9 ultrawide image (Supports both Base64 and URL inputs):

Using URL Input (Recommended)

Using Base64 Input

3.3 Multi-Image-to-Image (Multi-Image Fusion)

Nano Banana supports inputting multiple images simultaneously. The model analyzes all images to generate a new one. Suitable for:

Style Fusion: Applying the style of one image to another.

Element Combination: Extracting elements from multiple images to combine them.

Contrast/Reference: Providing multiple reference images to help the model understand your needs.

Scene Mixing: Blending features of multiple scenes.

Example 1: Style Transfer (2-Image Fusion)

Apply the style of one image to the content of another:

Example 2: Element Combination (3-Image Fusion)

Extract different elements from multiple images to combine them:

Example 3: Product Design Reference (Multi-Image + Detailed Description)

Provide multiple reference images to generate a product design meeting specific requirements:

Example 4: Mixing URL and Base64 Input

You can flexibly combine URL and Base64 input methods:

Tips for Multi-Image Generation:

Supports inputting 2-5 images simultaneously.

You can mix URL and Base64 formats.

Clearly explain the role of each image and how to fuse them in the prompt.

The order of images affects the result; place important images first.

Each image must meet the format and size limits (PNG/JPEG/JPG/WEBP, max 7MB).

3.4 Conversational Image Generation

Nano Banana supports multi-turn conversational image generation, allowing you to continue modifying based on previously generated images:

4. Response Handling

4.1 Response Format

A successful response (Status Code 200) will return JSON containing image data.

When response_format is set to b64_json, the image data is in the data[].b64_json field.

Note: Nano Banana only supports b64_json format; url format is not supported.

4.2 Success Response Example

{
  "code": 200,
  "msg": "Operation successful",
  "data": {
    "data": [
      {
        "url": "",
        "b64_json": "iVBORw0KGgoAAAANSUhEUgAABAAAAAQA[base64 data truncated]",
        "revised_prompt": ""
      }
    ],
    "created": 1757320007
  }
}

4.3 Saving Base64 Image Data (Command Line Example)

4.4 Error Handling

If the request fails, an error message will be returned:

{
  "code": 400,
  "msg": "Invalid parameter: Unsupported aspect ratio",
  "data": null
}

Common Error Codes:

400: Parameter Error (e.g., incorrect model name, invalid aspect ratio format).

401: Authentication Failed (Invalid or expired API key).

429: Rate Limit Exceeded (Requesting too frequently).

500: Internal Server Error.

5. Best Practices

5.1 Aspect Ratio Suggestions

Social Media:

Instagram Post: 1:1 or 4:5

Instagram Story: 9:16

Twitter/X: 16:9

Facebook Cover: 21:9

Design Usage:

Website Banner: 16:9 or 21:9

Poster: 2:3 or 9:16

Product Image: 1:1 or 4:3

Mobile Wallpaper: 9:16

Video Related:

YouTube Thumbnail: 16:9

Short Video Cover: 9:16

Widescreen Video: 21:9

5.2 Prompt Optimization Suggestions

Text-to-Image Prompt Tips

Specify Ratio Needs: Indicate composition direction in the prompt.

Horizontal: Use "Horizontal composition", "Widescreen view".

Vertical: Use "Vertical composition", "Vertical view".

Consider Image Layout:

16:9/21:9: Suitable for containing more horizontal elements (e.g., landscapes, panoramas).

9:16: Suitable for containing vertical elements (e.g., portraits, buildings).

1:1: Suitable for centered symmetric composition.

High-Quality Keywords:

Add keywords like "High quality", "HD", "Professional photography".

Specify style: "Realistic style", "Oil painting style", "Anime style", etc.

Multi-Image-to-Image Prompt Tips

Clarify Image Roles:

✅ Good: "Apply the oil painting style of the first image to the landscape content of the second image."

❌ Bad: "Mix these images."

Specify Fusion Method:

Style Transfer: "Transform the second image using the style of the first image."

Element Extraction: "Use the sky from the first image, the building from the second image, and the foreground from the third image."

Reference Design: "Design a new product referencing the color scheme, lines, and layout of these images."

Describe Needs in Detail:

Explain what to keep and what to change.

Specify the desired style of the final effect.

Give composition suggestions if necessary.

Example Comparison:

Effect	Prompt Example
❌ Blurry	"Mix these two images"
✅ Clear	"Apply the watercolor style of the first image to the cityscape of the second image, keeping the architectural details but re-rendering with soft colors and brushstrokes."
❌ Blurry	"Combine these images into one"
✅ Clear	"Create a product poster: adopt the minimalist color scheme of the first image, the minimalist lines of the second image, and the white space layout style of the third image."

5.3 Image Input Best Practices

Choose the Right Input Method:

Image is online: Use URL to reduce data transmission.

Image is local: Use Base64 to avoid uploading to a temporary server.

Need privacy protection: Use Base64, do not pass through third-party URLs.

Image Quality Suggestions:

Recommended Resolution: 1024x1024 or higher.

File Size: Max 7MB.

Format Selection: PNG (High Quality), JPG (Smaller Size), WEBP (Best Balance).

Multi-Image Input Tips (Multi-Image-to-Image):

Count Control: Supports 2-5 images; 2-3 images are recommended for best results.

Sorting Order: Arrange in order of importance; place the most important image first.

Clear Description: Clearly state the role of each image and how to fuse them in the prompt.

Application Scenarios:

Style Transfer: Transform one image with another's style.

Element Combination: Combine different elements from multiple images into a new one.

Product Design: Design a new product referencing features of multiple images.

Scene Fusion: Blend characteristics of multiple scenes to create a new environment.

Practical Tips:

Explicitly refer to "the first image", "the second image" to avoid confusion.

Explain specific elements you want to keep or extract.

You can mix URL and Base64 input methods.

5.4 Performance Optimization Suggestions

Batch Generation: If you need to generate multiple images, use concurrent requests to improve efficiency.

Caching Strategy: For requests with identical parameters, it is recommended to cache on the client side.

Async Processing: For non-real-time needs, use asynchronous processing mechanisms.

Image Pre-processing: For large images, compress them to a reasonable size before transmission.

6. Frequently Asked Questions (FAQ)

Q1: How to maintain the same aspect ratio in a conversation?

A: In the contents conversation array, include the size parameter with every request. The system will apply the specified aspect ratio to the current request.

Q2: What are the requirements for using URL images?

The URL must be a publicly accessible HTTP/HTTPS address.

Supported formats: PNG, JPEG, JPG, WEBP.

File size: Max 7MB.

The system automatically downloads and converts it to Base64 format to pass to the model.

Q3: What are the requirements for Base64 image format?

Must include the complete data URI prefix, e.g., data:image/png;base64,iVBORw0....

Supported formats: image/png, image/jpeg, image/webp.

File size: Max 7MB (before encoding).

Ensure Base64 data is correctly encoded.

Q4: Which aspect ratios does Nano Banana support?

A: Nano Banana (gemini-2.5-flash-image) supports all 10 aspect ratios listed in the documentation: 1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9.

Q5: What are the actual pixel dimensions of the generated images?

A: The actual pixel dimensions are determined by the model and will generally result in high-quality images based on the specified aspect ratio. Different aspect ratios may have different pixel dimensions, but will maintain the specified proportional relationship.

Q6: Can I upload multiple images at the same time (Multi-Image-to-Image)?

A: Yes! Nano Banana supports inputting 2-5 images simultaneously for fusion generation:

Style Transfer: Apply one image's style to another.

Element Combination: Extract different elements from multiple images.

Product Design: Generate new designs referencing multiple images.

Scene Fusion: Blend features of multiple scenes.

Usage: Add multiple image objects in the contents[].parts array and explicitly state how to process these images in the text. See section 3.3 for multi-image examples.

Best Practices:

Provide clear textual instructions telling the model how to use each image.

Image order matters; place the most important image first.

Every image must meet the format and size requirements (PNG/JPEG/JPG/WEBP, max 7MB).

Document Version: v2.1
Update Time: 2025-11-05
Model: Nano Banana (gemini-2.5-flash-image)
Technical Support: https://llm.ai-nebula.com

Quick Reference

Model Parameters Cheat Sheet

Parameter	Value
Model Name	gemini-2.5-flash-image
Supported Formats	PNG, JPEG, JPG, WEBP
Max Size	7MB (per image)
Supported Aspect Ratios	1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
Image Input	URL or Base64
Multi-Image Input	Supports inputting 2-5 images simultaneously
Response Format	b64_json

Core Functions Cheat Sheet

Function	Description	Example Section
Text-to-Image	Generate images from pure text descriptions	2.1-2.4
Image-to-Image	Generate new images from single image + text	3.1-3.2
Multi-Image-to-Image	Generate from fusing 2-5 images	3.3
Conversational Generation	Continuous image modification in multi-turn chat	3.4

Common Aspect Ratios Cheat Sheet

Ratio	Pixels	Scenario
1:1	1024x1024	Social media, avatars
16:9	1792x1024	Video covers, landscape wallpapers
9:16	1024x1792	Short videos, phone wallpapers
21:9	1024x2176	Ultrawide panoramas

Multi-Image-to-Image Application Scenarios

Scenario	Image Count	Prompt Example
Style Transfer	2 images	"Apply the oil painting style of the first image to the content of the second image"
Element Combination	2-3 images	"Use the sky from image 1 + building from image 2 + plants from image 3"
Product Design	3-4 images	"Design a coffee cup referencing these images: color from 1st + lines from 2nd + handle from 3rd"
Scene Fusion	2 images	"Fuse the characteristics of these two scenes to create a new environment"

Nano Banana Image Generation Interface Document

1. Interface Basic Information#

1.1 Core Parameters#

1.2 Aspect Ratio Settings#

Format 1: Direct Ratio (Recommended)#

Format 2: Pixel Dimensions (Automatically converted to corresponding ratio)#

1.3 Supported Aspect Ratios and Corresponding Pixel Dimensions#

2. Simple Text-to-Image Functionality#

2.1 Basic Text-to-Image (Default 1:1 Ratio)#

2.2 Text-to-Image with Aspect Ratio (16:9 Widescreen)#

2.3 Vertical Text-to-Image (9:16 Mobile Screen)#

2.4 Using Pixel Dimensions to Specify Ratio#

3. Image-to-Image Functionality#

3.1 Basic Image-to-Image (Default Ratio)#

Example 1: Using Base64 Input#

Example 2: Using URL Input#

3.2 Image-to-Image with Aspect Ratio (21:9 Ultrawide)#

Using URL Input (Recommended)#

Using Base64 Input#

3.3 Multi-Image-to-Image (Multi-Image Fusion)#

Example 1: Style Transfer (2-Image Fusion)#

Example 2: Element Combination (3-Image Fusion)#

Example 3: Product Design Reference (Multi-Image + Detailed Description)#

Example 4: Mixing URL and Base64 Input#

3.4 Conversational Image Generation#

4. Response Handling#

4.1 Response Format#

4.2 Success Response Example#

4.3 Saving Base64 Image Data (Command Line Example)#

4.4 Error Handling#

5. Best Practices#

5.1 Aspect Ratio Suggestions#

5.2 Prompt Optimization Suggestions#

Text-to-Image Prompt Tips#

Multi-Image-to-Image Prompt Tips#

5.3 Image Input Best Practices#

5.4 Performance Optimization Suggestions#

6. Frequently Asked Questions (FAQ)#

Q1: How to maintain the same aspect ratio in a conversation?#

Q2: What are the requirements for using URL images?#

Q3: What are the requirements for Base64 image format?#

Q4: Which aspect ratios does Nano Banana support?#

Q5: What are the actual pixel dimensions of the generated images?#

Q6: Can I upload multiple images at the same time (Multi-Image-to-Image)?#

Quick Reference#

Model Parameters Cheat Sheet#

Core Functions Cheat Sheet#

Common Aspect Ratios Cheat Sheet#

Multi-Image-to-Image Application Scenarios#