Nebula-API Operation Documentation

GPT Chat General Dialogue Document

Overview#

This document describes how to invoke standard ChatGPT (OpenAI Chat Completions) capabilities via Nebula's OpenAI-compatible interface, including minimal examples, streaming, tool calling, and structured output highlights.

Basic Information#

| Item | Content |
| --- | --- |
| Base URL | https://llm.ai-nebula.com/v1/chat/completions |
| Authentication Method | API Key (Token) |
| Request Headers | Authorization: Bearer sk-xxxx, Content-Type: application/json |

Supported Models (Examples)#

gpt-4o, gpt-4.1, gpt-4o-mini, gpt-3.5-turbo, etc. (subject to routing configuration)

API Interface#

1. Minimal Example (Non-streaming)#
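A minimal non-streaming call can be sketched with only the Python standard library. The API key is a placeholder, and gpt-4o-mini is just one of the supported models listed above:

```python
import json
import urllib.request

API_KEY = "sk-xxxx"  # placeholder: replace with your Nebula API key
URL = "https://llm.ai-nebula.com/v1/chat/completions"

def build_payload(user_text: str) -> dict:
    """Build a minimal Chat Completions request body."""
    return {
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": user_text}],
    }

def chat(payload: dict) -> dict:
    """POST the payload and return the parsed OpenAI-style response."""
    req = urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# With a valid key, the reply text lives at:
# chat(build_payload("Hello!"))["choices"][0]["message"]["content"]
```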

2. Streaming SSE Example#
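For streaming, the same endpoint is called with stream=true and the body of the response is read as Server-Sent Events. A sketch of the SSE line parsing (model name and key remain placeholders):

```python
import json
import urllib.request

API_KEY = "sk-xxxx"  # placeholder
URL = "https://llm.ai-nebula.com/v1/chat/completions"

def build_stream_payload(user_text: str) -> dict:
    """Chat Completions body with streaming enabled."""
    return {
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": user_text}],
        "stream": True,
        # If the channel supports it, report usage inside the stream too:
        "stream_options": {"include_usage": True},
    }

def parse_sse_line(raw: bytes):
    """Extract the delta text from one SSE line, or None for
    non-content lines (comments, [DONE], usage-only chunks)."""
    line = raw.decode("utf-8").strip()
    if not line.startswith("data: ") or line == "data: [DONE]":
        return None
    chunk = json.loads(line[len("data: "):])
    choices = chunk.get("choices") or []
    if not choices:
        return None  # e.g. a final chunk that carries only usage
    return choices[0].get("delta", {}).get("content")

# Streaming request sketch (requires a valid key):
# req = urllib.request.Request(
#     URL,
#     data=json.dumps(build_stream_payload("Hi")).encode("utf-8"),
#     headers={"Authorization": f"Bearer {API_KEY}",
#              "Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     for raw in resp:
#         text = parse_sse_line(raw)
#         if text:
#             print(text, end="", flush=True)
```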

3. Tool Calling (Functions / Tools)#

Complete Tool Calling Process (Two Stages)#

1. Phase 1: the model returns tool_calls (content is usually null, finish_reason=tool_calls). Execute the corresponding function on your server based on tool_calls[*].function.name / arguments.
2. Phase 2: pass each tool's execution result back to the model as a role:"tool" message and continue the completion; the continuation can be non-streaming or streaming.

Notes:
- The tool_call_id must match the one returned in Phase 1.
- If tool execution fails, return readable error information or a degraded result so that the subsequent completion is not blocked.
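The two stages can be sketched as message construction. The get_weather tool below is purely illustrative; tool_call is one element of choices[0].message.tool_calls from the Phase 1 response:

```python
import json

# Hypothetical tool schema; "get_weather" is an illustrative name.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def phase1_payload(user_text: str) -> dict:
    """Phase 1: send the user message together with the tool schema."""
    return {
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": user_text}],
        "tools": TOOLS,
    }

def phase2_messages(messages: list, tool_call: dict, result: dict) -> list:
    """Phase 2: append the assistant tool_calls turn plus the tool result,
    echoing the Phase 1 id back as tool_call_id."""
    return messages + [
        {"role": "assistant", "content": None, "tool_calls": [tool_call]},
        {
            "role": "tool",
            "tool_call_id": tool_call["id"],  # must match Phase 1
            "content": json.dumps(result),
        },
    ]
```

The Phase 2 message list is sent back to the same endpoint (with or without stream=true) to obtain the final answer.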

4. Structured Output (response_format/json_schema)#
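A request body for strict JSON output might look like the sketch below; the book_info schema is an illustrative assumption, not part of the API:

```python
def structured_payload(user_text: str) -> dict:
    """Request schema-validated JSON via response_format: json_schema."""
    return {
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": user_text}],
        "temperature": 0,   # low temperature improves schema adherence
        "max_tokens": 512,
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "book_info",  # illustrative schema name
                "strict": True,
                "schema": {
                    "type": "object",
                    "properties": {
                        "title": {"type": "string"},
                        "year": {"type": "integer"},
                    },
                    "required": ["title", "year"],
                    "additionalProperties": False,
                },
            },
        },
    }
```

With strict mode, the model's reply in choices[0].message.content is a JSON string that conforms to the schema and can be parsed directly.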

5. File Input#

In the examples below, a PDF file can be supplied in either of two ways:
1. File URL: reference the PDF via an external URL.
2. Base64-encoded file: send the file content as Base64-encoded input.

Usage Notes
- File size limits: multiple files may be uploaded; each file must not exceed 50 MB, and the total size of all files in a single API request is also limited to 50 MB.
- Supported models: only models that accept text and image inputs, such as gpt-4o, gpt-4o-mini, or o1, can take PDF files as input.
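The Base64 variant can be sketched as follows, using the OpenAI-style "file" content part with a data URL; the filename and question are placeholders:

```python
import base64

def file_payload(pdf_bytes: bytes, question: str) -> dict:
    """Attach a PDF as a Base64 data URL alongside a text question."""
    b64 = base64.b64encode(pdf_bytes).decode("ascii")
    return {
        "model": "gpt-4o",
        "messages": [{
            "role": "user",
            "content": [
                {
                    "type": "file",
                    "file": {
                        "filename": "document.pdf",  # placeholder name
                        "file_data": f"data:application/pdf;base64,{b64}",
                    },
                },
                {"type": "text", "text": question},
            ],
        }],
    }
```

For the URL variant, the file part carries a link to the externally hosted PDF instead of the inline file_data.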

Response and Usage#

Non-streaming: Returns standard OpenAI structure at once, including choices, usage.
Streaming: Returns SSE chunks; usage aggregation may be attached at the end. If the channel supports stream_options.include_usage=true, chunks may contain real-time usage.

Frequently Asked Questions (FAQ)#

1. How can the stability of structured outputs be improved?
Use response_format: json_schema with a strict JSON Schema; if necessary, also lower temperature and set max_tokens.
2. How should tool execution be handled?
Read tool_calls from the incremental chunks, execute the function on the server, and pass the result back to the model as a tool message.
3. Is reproducible output (seed) supported?
If the channel supports it, seed can be used; implementations vary across vendors, so it is recommended to enable it only for workflows that require reproducibility.

Best Practices#

On the frontend, parse the event stream and render tokens incrementally.
Under strict JSON mode, turn off or lower temperature.
Implement timeout and retry mechanisms for tool calls so that a slow or failed tool does not block the model response.
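A retry wrapper for tool execution can be sketched as below; on final failure it returns a readable error payload (as recommended above) instead of raising, so the completion is not blocked. The attempt counts and backoff are illustrative defaults:

```python
import time

def run_tool(fn, *args, attempts=3, backoff_s=0.5):
    """Run a tool with retries and linear backoff; on final failure,
    return a degraded result carrying a readable error message."""
    for attempt in range(1, attempts + 1):
        try:
            return {"ok": True, "result": fn(*args)}
        except Exception as exc:  # report any tool failure downstream
            if attempt == attempts:
                return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
            time.sleep(backoff_s * attempt)  # wait longer each retry
```

The returned dict (success or degraded) is what gets serialized into the role:"tool" message, so the model always receives something to continue from.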

About "Deep Thinking / Reasoning Process"#

Standard ChatGPT-series models (such as gpt-4o, gpt-4.1, gpt-3.5-turbo) do not expose a visible chain-of-thought; passing enable_thinking in the request body has no effect.
If you need models with reasoning capabilities and reasoning usage statistics (such as o1, o3, o4-mini, etc.), use the Responses API (/v1/responses) instead.
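A minimal Responses API body can be sketched as follows; the o4-mini model name is a placeholder, and the /v1/responses path under the same base host is assumed from the Base URL pattern above:

```python
def responses_payload(user_text: str) -> dict:
    """Minimal body for POST /v1/responses (reasoning models)."""
    return {
        "model": "o4-mini",  # placeholder reasoning model
        "input": user_text,
    }
```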
Last modified: 2025-12-04 07:49:06