Nebula-API Operation Documentation

DeepSeek General Dialogue Document

Overview#

This document describes how to call the DeepSeek conversational model via the Nebula API's OpenAI-compatible interface.

Basic Information#

Base URL: https://llm.ai-nebula.com/v1/chat/completions
Authentication: API Key (Bearer token)
Request Headers: Authorization: Bearer sk-xxxx; Content-Type: application/json

Supported Models (Examples)#

deepseek-v3-1-250821
Other DeepSeek series models (subject to routing configuration)

API Endpoints#

1. Minimal Example (Non-streaming)#
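A minimal non-streaming call can be sketched with just the Python standard library. The endpoint and model name come from this document; the `sk-xxxx` key is a placeholder you must replace, and the helper names (`build_payload`, `chat`) are illustrative, not part of any SDK.

```python
import json
import urllib.request

API_URL = "https://llm.ai-nebula.com/v1/chat/completions"
API_KEY = "sk-xxxx"  # placeholder: replace with your Nebula API key

def build_payload(user_message, model="deepseek-v3-1-250821"):
    """Build a minimal OpenAI-compatible chat completion body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "stream": False,
    }

def chat(user_message):
    """POST the request and return the assistant's reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(user_message)).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + API_KEY,
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Usage (requires a valid key): print(chat("Hello!"))
```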

2. Streaming SSE Example#
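With `stream: true` the response arrives as SSE lines of the form `data: {...}` terminated by `data: [DONE]`. The sketch below, again assuming the standard endpoint and a placeholder key, parses each line into a text delta; the `parse_sse_line` / `stream_chat` names are illustrative.

```python
import json
import urllib.request

API_URL = "https://llm.ai-nebula.com/v1/chat/completions"
API_KEY = "sk-xxxx"  # placeholder: replace with your Nebula API key

def parse_sse_line(line):
    """Return the delta text in one SSE line, or None for comments,
    the [DONE] sentinel, and chunks without content (e.g. usage-only)."""
    line = line.strip()
    if not line.startswith("data:"):
        return None
    data = line[len("data:"):].strip()
    if data == "[DONE]":
        return None
    chunk = json.loads(data)
    choices = chunk.get("choices") or []
    if not choices:
        return None  # e.g. a trailing usage-only chunk
    return choices[0].get("delta", {}).get("content")

def stream_chat(user_message):
    """Stream the reply and print deltas as they arrive."""
    payload = {
        "model": "deepseek-v3-1-250821",
        "messages": [{"role": "user", "content": user_message}],
        "stream": True,
    }
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + API_KEY,
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        for raw in resp:  # urllib yields the response line by line
            text = parse_sse_line(raw.decode("utf-8"))
            if text:
                print(text, end="", flush=True)
```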

3. Common Parameters#

Sampling and Control: temperature, top_p, max_tokens, stop
Structured Output: response_format/json_schema
Tool Calling: tools/tool_choice (follows OpenAI compatible format)
Note: DeepSeek may support additional or differentiated fields depending on the channel; Nebula attempts to pass these through and standardize them in the compatibility layer. It is recommended to use only the common fields, or to consult the channel support list.
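A hypothetical request body combining the common fields above might look like the following; the schema name and content are made-up examples, and the exact `json_schema` shape is the OpenAI-compatible form, subject to channel support.

```python
# Illustrative body combining sampling controls with structured output.
payload = {
    "model": "deepseek-v3-1-250821",
    "messages": [
        {"role": "user", "content": "Name the capital of France as JSON."}
    ],
    # sampling and control
    "temperature": 0.2,   # lower for more deterministic output
    "top_p": 0.9,
    "max_tokens": 256,
    "stop": ["\n\n"],
    # structured output (OpenAI-compatible json_schema form)
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "capital_answer",  # hypothetical schema name
            "schema": {
                "type": "object",
                "properties": {"capital": {"type": "string"}},
                "required": ["capital"],
            },
        },
    },
}
```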

4. Tool Calling (Functions / Tools)#

Complete Tool Calling Process (Two Stages)#

1. Stage 1: The model returns tool_calls (content is usually null, finish_reason=tool_calls). Execute the corresponding function on your server based on tool_calls[*].function.name/arguments.
2. Stage 2: Pass the tool execution result back to the model as a role:"tool" message and continue the completion. The Stage 2 continuation request can be either non-streaming or streaming.
Note:
tool_call_id must be consistent with the return from Stage 1.
When tool execution fails, readable error messages or degraded results should be returned to avoid blocking subsequent completion.
DeepSeek's support for tool calling may vary by model version. It is recommended to confirm channel support before use.
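The two stages above can be sketched as follows. The `get_weather` tool and its schema are illustrative assumptions, not part of the Nebula API, and the builder functions are hypothetical helpers; the essential point is that each Stage 2 `role:"tool"` message reuses the `tool_call_id` returned in Stage 1.

```python
import json

# Hypothetical tool definition in the OpenAI-compatible format.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def build_stage1_request(user_message):
    """Stage 1: send the user message plus tool definitions."""
    return {
        "model": "deepseek-v3-1-250821",
        "messages": [{"role": "user", "content": user_message}],
        "tools": TOOLS,
    }

def build_stage2_messages(messages, assistant_message, results):
    """Stage 2: append the assistant's tool_calls message, then one
    role:"tool" message per call, reusing each original tool_call_id."""
    out = list(messages) + [assistant_message]
    for call in assistant_message["tool_calls"]:
        name = call["function"]["name"]
        out.append({
            "role": "tool",
            "tool_call_id": call["id"],  # must match the Stage 1 return
            "content": json.dumps(results[name]),
        })
    return out
```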

5. Thinking Capability (Thinking)#

DeepSeek supports enabling/disabling thinking capability with the thinking field. Disabled by default:
"thinking": {
  "type": "disabled"   // default behavior: thinking capability disabled
  // "type": "enabled" // enable thinking capability
}
In OpenAI-compatible requests, you can pass the thinking field directly at the top level:
Explanation:
Different models/versions may output thinking capability differently (e.g., whether to return an explicit reasoning field or only reflect it in the content structure).
If you need to see the thinking process intuitively in the terminal and the channel returns streaming, you can combine it with stream: true for a better interactive experience.
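A request carrying the top-level thinking field might be built as below; the helper name is illustrative, and streaming is enabled per the suggestion above so reasoning and content arrive incrementally.

```python
# Sketch: pass the top-level thinking field in an OpenAI-compatible body.
def build_thinking_payload(prompt, enable_thinking=False):
    """Build a streaming request with thinking enabled or disabled."""
    return {
        "model": "deepseek-v3-1-250821",
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,  # watch reasoning/content arrive incrementally
        "thinking": {"type": "enabled" if enable_thinking else "disabled"},
    }
```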

Response and Usage#

Non-streaming: Returns choices and usage all at once.
Streaming: SSE chunks return, possibly containing usage at the end; if the channel supports stream_options.include_usage=true, real-time usage may be returned within chunks.
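If the channel honors `stream_options.include_usage`, usage can be read from the streamed chunks; a simple approach is to keep the last non-empty usage object seen. Both helper names below are illustrative.

```python
# Sketch: request per-chunk usage, then keep the last usage object seen.
# Assumes the channel supports stream_options.include_usage.
def with_usage(payload):
    """Return a streaming copy of the payload that asks for usage."""
    return {**payload, "stream": True,
            "stream_options": {"include_usage": True}}

def last_usage(chunks):
    """Return the final non-empty usage object from parsed SSE chunks."""
    usage = None
    for chunk in chunks:
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage
```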

Frequently Asked Questions (FAQ)#

1. Is it compatible with OpenAI?
Yes, it uses the OpenAI Chat Completions format; a few extension fields may not take effect, subject to channel support.
2. Is structured output supported?
Yes, response_format: json_schema is supported; for complex schemas, lowering temperature is recommended to improve consistency.
3. Are Chain-of-Thought / search switches supported?
This depends on the channel and model version; for exclusive capabilities, contact the administrator for activation or pass the parameters through (if the channel supports it).
4. Is tool calling supported?
Yes, the OpenAI-compatible tool calling format is supported; the level of support may vary by model version and channel.

Best Practices#

Use streaming to improve time-to-first-token and interactive experience.
Lower temperature and control max_tokens for strictly structured output.
Implement fault tolerance, retry, and timeout control for tool calling results.
Modified on 2025-12-04 07:49:04