1. Image
Nebula-API操作文档
🇺🇸English
  • 🇨🇳中文
  • 🇺🇸English
  • Chat
    • General Text Dialogue Interface Document
    • Tongyi Qianwen General Dialogue Interface Document
    • DeepSeek General Dialogue Document
    • GPT Chat General Dialogue Document
    • Grok Model (xAI) General Dialogue Interface Document
  • Image
    • General Image Generation Interface Document
    • Nano Banana Image Generation Interface Document
    • Tongyi Qianwen Text to Image Model Interface Document
    • Tongyi Qianwen Image Editing Model Interface Document
  • Video
    • Sora-2 interface document
    • Alibaba Wanxiang Wan2.5 Tu Sheng Video Interface Document
    • Google Veo Video Model Interface Document
    • General Video Generation Interface Document
  • AI App
    • Cherry Studio Integration Guide
    • LangChain Development Framework Integration Guide
    • Cursor Code Editor Integration Guide
    • Claude Code and other client integration guidelines
    • Cline (VS Code) AI Programming Assistant Integration Guide
    • Immersive Translation Integration Guide
  • Real time conversation
    • Realtime real-time conversation document
  1. Image

Nano Banana Image Generation Interface Document

1. Interface Basic Information#

Model Name: gemini-2.5-flash-image (Nano Banana)
Base URL: https://llm.ai-nebula.com/v1/images/generations
Authentication Method: Bearer Token
Auth Token: Bearer sk-xxxxxxxxxx
Core Capabilities:
✅ Text-to-Image (Generate images from pure text descriptions)
✅ Image-to-Image (Generate new images from a single image + text)
✅ Multi-Image-to-Image (Generate new images by fusing multiple images)
✅ Multi-turn Conversational Image Generation (Continuous modification with context)
Supported Aspect Ratios: 1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
Supported Image Formats: PNG, JPEG, JPG, WEBP
Image Size Limit: Max 7MB

1.1 Core Parameters#

Parameter NameTypeRequiredDescriptionExample Value
modelstringYesModel Namegemini-2.5-flash-image
promptstringNo*Text prompt"A cute orange kitten"
contentsarrayNo*Multimodal content (supports conversational context and image-to-image)See examples
response_formatstringNoResponse format: b64_jsonb64_json
sizestringNoImage aspect ratio (ratio or pixels)"16:9" or "1792x1024"
*Note: Select either prompt or contents; at least one must be provided.
Image Input Requirements:
Supported Formats: PNG, JPEG, JPG, WEBP
Max Size: 7MB
Input Methods: Supports both URL and Base64 formats
URL Format: "image": "https://example.com/image.jpg"
Base64 Format: "image": "data:image/png;base64,iVBORw0..."

1.2 Aspect Ratio Settings#

Set the image aspect ratio via the size parameter, supporting two formats:

Format 1: Direct Ratio (Recommended)#

{
 "size": "16:9"
}

Format 2: Pixel Dimensions (Automatically converted to corresponding ratio)#

{
 "size": "1792x1024" // Automatically converted to 16:9
}
For supported pixel dimensions and their corresponding ratios, please refer to the "1.3 Supported Aspect Ratios and Corresponding Pixel Dimensions" table below.

1.3 Supported Aspect Ratios and Corresponding Pixel Dimensions#

Aspect RatioPixel DimensionsUsage Scenarios
1:11024x1024Square images, avatars, social media
3:21536x1024Standard photography ratio
2:31024x1536Vertical posters, phone wallpapers
3:41536x2048Vertical photos
4:32048x1536Traditional display ratio
4:51024x1280Instagram vertical
5:41280x1024Traditional display ratio
9:161024x1792Mobile vertical screen, short video covers
16:91792x1024Widescreen, video covers, desktop wallpapers
21:91024x2176Ultrawide screen

2. Simple Text-to-Image Functionality#

2.1 Basic Text-to-Image (Default 1:1 Ratio)#

Generate an image of "A cute orange kitten sitting in a garden":

2.2 Text-to-Image with Aspect Ratio (16:9 Widescreen)#

Generate a landscape image with a 16:9 widescreen ratio:

2.3 Vertical Text-to-Image (9:16 Mobile Screen)#

Generate a poster suitable for a mobile vertical screen:

2.4 Using Pixel Dimensions to Specify Ratio#

You can also use pixel dimensions directly; the system will automatically convert them to the corresponding ratio:

3. Image-to-Image Functionality#

3.1 Basic Image-to-Image (Default Ratio)#

Generate a new image based on a base image. Supports two image input methods:
Method 1: URL address (System automatically downloads the image)
Method 2: Base64 encoding (Must include data URI prefix)

Example 1: Using Base64 Input#

Example 2: Using URL Input#

3.2 Image-to-Image with Aspect Ratio (21:9 Ultrawide)#

Generate a 21:9 ultrawide image (Supports both Base64 and URL inputs):

Using URL Input (Recommended)#

Using Base64 Input#

3.3 Multi-Image-to-Image (Multi-Image Fusion)#

Nano Banana supports inputting multiple images simultaneously. The model analyzes all images to generate a new one. Suitable for:
Style Fusion: Applying the style of one image to another.
Element Combination: Extracting elements from multiple images to combine them.
Contrast/Reference: Providing multiple reference images to help the model understand your needs.
Scene Mixing: Blending features of multiple scenes.

Example 1: Style Transfer (2-Image Fusion)#

Apply the style of one image to the content of another:

Example 2: Element Combination (3-Image Fusion)#

Extract different elements from multiple images to combine them:

Example 3: Product Design Reference (Multi-Image + Detailed Description)#

Provide multiple reference images to generate a product design meeting specific requirements:

Example 4: Mixing URL and Base64 Input#

You can flexibly combine URL and Base64 input methods:
Tips for Multi-Image Generation:
Supports inputting 2-5 images simultaneously.
You can mix URL and Base64 formats.
Clearly explain the role of each image and how to fuse them in the prompt.
The order of images affects the result; place important images first.
Each image must meet the format and size limits (PNG/JPEG/JPG/WEBP, max 7MB).

3.4 Conversational Image Generation#

Nano Banana supports multi-turn conversational image generation, allowing you to continue modifying based on previously generated images:

4. Response Handling#

4.1 Response Format#

1.
A successful response (Status Code 200) will return JSON containing image data.
2.
When response_format is set to b64_json, the image data is in the data[].b64_json field.
3.
Note: Nano Banana only supports b64_json format; url format is not supported.

4.2 Success Response Example#

{
  "code": 200,
  "msg": "Operation successful",
  "data": {
    "data": [
      {
        "url": "",
        "b64_json": "iVBORw0KGgoAAAANSUhEUgAABAAAAAQA[base64 data truncated]",
        "revised_prompt": ""
      }
    ],
    "created": 1757320007
  }
}

4.3 Saving Base64 Image Data (Command Line Example)#

4.4 Error Handling#

If the request fails, an error message will be returned:
{
  "code": 400,
  "msg": "Invalid parameter: Unsupported aspect ratio",
  "data": null
}
Common Error Codes:
400: Parameter Error (e.g., incorrect model name, invalid aspect ratio format).
401: Authentication Failed (Invalid or expired API key).
429: Rate Limit Exceeded (Requesting too frequently).
500: Internal Server Error.

5. Best Practices#

5.1 Aspect Ratio Suggestions#

Social Media:
Instagram Post: 1:1 or 4:5
Instagram Story: 9:16
Twitter/X: 16:9
Facebook Cover: 21:9
Design Usage:
Website Banner: 16:9 or 21:9
Poster: 2:3 or 9:16
Product Image: 1:1 or 4:3
Mobile Wallpaper: 9:16
Video Related:
YouTube Thumbnail: 16:9
Short Video Cover: 9:16
Widescreen Video: 21:9

5.2 Prompt Optimization Suggestions#

Text-to-Image Prompt Tips#

1.
Specify Ratio Needs: Indicate composition direction in the prompt.
Horizontal: Use "Horizontal composition", "Widescreen view".
Vertical: Use "Vertical composition", "Vertical view".
2.
Consider Image Layout:
16:9/21:9: Suitable for containing more horizontal elements (e.g., landscapes, panoramas).
9:16: Suitable for containing vertical elements (e.g., portraits, buildings).
1:1: Suitable for centered symmetric composition.
3.
High-Quality Keywords:
Add keywords like "High quality", "HD", "Professional photography".
Specify style: "Realistic style", "Oil painting style", "Anime style", etc.

Multi-Image-to-Image Prompt Tips#

1.
Clarify Image Roles:
✅ Good: "Apply the oil painting style of the first image to the landscape content of the second image."
❌ Bad: "Mix these images."
2.
Specify Fusion Method:
Style Transfer: "Transform the second image using the style of the first image."
Element Extraction: "Use the sky from the first image, the building from the second image, and the foreground from the third image."
Reference Design: "Design a new product referencing the color scheme, lines, and layout of these images."
3.
Describe Needs in Detail:
Explain what to keep and what to change.
Specify the desired style of the final effect.
Give composition suggestions if necessary.
Example Comparison:
EffectPrompt Example
❌ Blurry"Mix these two images"
✅ Clear"Apply the watercolor style of the first image to the cityscape of the second image, keeping the architectural details but re-rendering with soft colors and brushstrokes."
❌ Blurry"Combine these images into one"
✅ Clear"Create a product poster: adopt the minimalist color scheme of the first image, the minimalist lines of the second image, and the white space layout style of the third image."

5.3 Image Input Best Practices#

1.
Choose the Right Input Method:
Image is online: Use URL to reduce data transmission.
Image is local: Use Base64 to avoid uploading to a temporary server.
Need privacy protection: Use Base64, do not pass through third-party URLs.
2.
Image Quality Suggestions:
Recommended Resolution: 1024x1024 or higher.
File Size: Max 7MB.
Format Selection: PNG (High Quality), JPG (Smaller Size), WEBP (Best Balance).
3.
Multi-Image Input Tips (Multi-Image-to-Image):
Count Control: Supports 2-5 images; 2-3 images are recommended for best results.
Sorting Order: Arrange in order of importance; place the most important image first.
Clear Description: Clearly state the role of each image and how to fuse them in the prompt.
Application Scenarios:
Style Transfer: Transform one image with another's style.
Element Combination: Combine different elements from multiple images into a new one.
Product Design: Design a new product referencing features of multiple images.
Scene Fusion: Blend characteristics of multiple scenes to create a new environment.
Practical Tips:
Explicitly refer to "the first image", "the second image" to avoid confusion.
Explain specific elements you want to keep or extract.
You can mix URL and Base64 input methods.

5.4 Performance Optimization Suggestions#

1.
Batch Generation: If you need to generate multiple images, use concurrent requests to improve efficiency.
2.
Caching Strategy: For requests with identical parameters, it is recommended to cache on the client side.
3.
Async Processing: For non-real-time needs, use asynchronous processing mechanisms.
4.
Image Pre-processing: For large images, compress them to a reasonable size before transmission.

6. Frequently Asked Questions (FAQ)#

Q1: How to maintain the same aspect ratio in a conversation?#

A: In the contents conversation array, include the size parameter with every request. The system will apply the specified aspect ratio to the current request.

Q2: What are the requirements for using URL images?#

A:
The URL must be a publicly accessible HTTP/HTTPS address.
Supported formats: PNG, JPEG, JPG, WEBP.
File size: Max 7MB.
The system automatically downloads and converts it to Base64 format to pass to the model.

Q3: What are the requirements for Base64 image format?#

A:
Must include the complete data URI prefix, e.g., data:image/png;base64,iVBORw0....
Supported formats: image/png, image/jpeg, image/webp.
File size: Max 7MB (before encoding).
Ensure Base64 data is correctly encoded.

Q4: Which aspect ratios does Nano Banana support?#

A: Nano Banana (gemini-2.5-flash-image) supports all 10 aspect ratios listed in the documentation: 1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9.

Q5: What are the actual pixel dimensions of the generated images?#

A: The actual pixel dimensions are determined by the model and will generally result in high-quality images based on the specified aspect ratio. Different aspect ratios may have different pixel dimensions, but will maintain the specified proportional relationship.

Q6: Can I upload multiple images at the same time (Multi-Image-to-Image)?#

A: Yes! Nano Banana supports inputting 2-5 images simultaneously for fusion generation:
Style Transfer: Apply one image's style to another.
Element Combination: Extract different elements from multiple images.
Product Design: Generate new designs referencing multiple images.
Scene Fusion: Blend features of multiple scenes.
Usage: Add multiple image objects in the contents[].parts array and explicitly state how to process these images in the text. See section 3.3 for multi-image examples.
Best Practices:
Provide clear textual instructions telling the model how to use each image.
Image order matters; place the most important image first.
Every image must meet the format and size requirements (PNG/JPEG/JPG/WEBP, max 7MB).

Document Version: v2.1
Update Time: 2025-11-05
Model: Nano Banana (gemini-2.5-flash-image)
Technical Support: https://llm.ai-nebula.com

Quick Reference#

Model Parameters Cheat Sheet#

ParameterValue
Model Namegemini-2.5-flash-image
Supported FormatsPNG, JPEG, JPG, WEBP
Max Size7MB (per image)
Supported Aspect Ratios1:1, 3:2, 2:3, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
Image InputURL or Base64
Multi-Image InputSupports inputting 2-5 images simultaneously
Response Formatb64_json

Core Functions Cheat Sheet#

FunctionDescriptionExample Section
Text-to-ImageGenerate images from pure text descriptions2.1-2.4
Image-to-ImageGenerate new images from single image + text3.1-3.2
Multi-Image-to-ImageGenerate from fusing 2-5 images3.3
Conversational GenerationContinuous image modification in multi-turn chat3.4

Common Aspect Ratios Cheat Sheet#

RatioPixelsScenario
1:11024x1024Social media, avatars
16:91792x1024Video covers, landscape wallpapers
9:161024x1792Short videos, phone wallpapers
21:91024x2176Ultrawide panoramas

Multi-Image-to-Image Application Scenarios#

ScenarioImage CountPrompt Example
Style Transfer2 images"Apply the oil painting style of the first image to the content of the second image"
Element Combination2-3 images"Use the sky from image 1 + building from image 2 + plants from image 3"
Product Design3-4 images"Design a coffee cup referencing these images: color from 1st + lines from 2nd + handle from 3rd"
Scene Fusion2 images"Fuse the characteristics of these two scenes to create a new environment"
修改于 2025-12-04 07:49:10
上一页
General Image Generation Interface Document
下一页
Tongyi Qianwen Text to Image Model Interface Document
Built with