The model supports deep thinking (enable_thinking), search, and speech recognition.

| Item | Content |
|---|---|
| Base URL | https://llm.ai-nebula.com/v1/chat/completions |
| Authentication | API Key (Token) |
| Request Headers | Authorization: Bearer sk-xxxx, Content-Type: application/json |
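Putting the table together, a request is a standard OpenAI-style chat-completions POST. Below is a minimal sketch in Python; the API key and message are placeholders, and `build_request` is a hypothetical helper for illustration, not part of the service:

```python
import json

BASE_URL = "https://llm.ai-nebula.com/v1/chat/completions"

def build_request(api_key: str, messages: list, **params) -> tuple:
    """Assemble URL, headers, and JSON body for the chat-completions endpoint.

    Model-specific options (enable_thinking, temperature, ...) go into the
    `parameters` object, not at the top level of the body.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",  # API key (token) auth
        "Content-Type": "application/json",
    }
    body = {
        "model": "qwen3-omni-flash",
        "messages": messages,
        "stream": True,  # streaming is required for deep thinking
        "parameters": {"enable_thinking": True, **params},
    }
    return BASE_URL, headers, json.dumps(body)

url, headers, body = build_request(
    "sk-xxxx", [{"role": "user", "content": "Hello"}]
)
```

Any HTTP client (curl, requests, httpx) can then send the assembled request as-is.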
For the complete parameter list, see the official documentation (Alibaba Cloud Bailian supported parameters): https://bailian.console.aliyun.com/?tab=api#/api/?type=model&url=2712576

## Deep Thinking (qwen3-omni-flash)

- Deep thinking requires both `enable_thinking: true` and `stream: true`.
- If `enable_thinking: true` is sent with `stream: false`, the system automatically disables deep thinking to avoid upstream errors.
- Optional: `"nebula_thinking_to_content": true` (only affects downstream display; it is not passed upstream and does not affect billing). The reasoning is then wrapped in `<think>...</think>` and appears in `content` together with the normal content, which suits terminals or SDKs that only display `content`.

## Supported Parameters

Pass the following inside the `parameters` object:

- `enable_thinking`, `incremental_output`, `search_options`, `enable_search`
- `asr_options`
- `temperature`, `top_p`, `top_k`, `seed`, `stop`, `max_tokens`
- `presence_penalty`, `frequency_penalty`, etc. (refer to the official documentation)
- `response_format` (`text` / `json_object` / `json_schema`), `json_schema`

## Usage Statistics

Streaming responses include a `usage` object at the end; the upstream usually does not provide `reasoning_tokens` details, so this value may be 0 even when deep thinking is enabled. As noted above, if `enable_thinking: true` is sent without streaming, deep thinking is automatically disabled to avoid upstream errors.

Example of the final streaming chunk:

```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion.chunk",
  "created": 1762153960,
  "model": "qwen3-omni-flash",
  "choices": [ ... ],
  "usage": {
    "prompt_tokens": 53,
    "completion_tokens": 2123,
    "total_tokens": 2176,
    "completion_tokens_details": {
      "reasoning_tokens": 0
    }
  }
}
```

## FAQ

- **How do I see the thinking process?** Use `stream: true`; if the client does not display `reasoning_content`, you can add `nebula_thinking_to_content: true` to inline the reasoning into `content`.
- **Why is `reasoning_tokens` 0?** The upstream usually does not report reasoning-token details, so the value can be 0 even when deep thinking ran.
- **What if a request errors with deep thinking enabled?** Use `stream: true` or remove `enable_thinking`.

## Tips

- Tune `top_p` / `top_k` / `temperature` and combine them with `incremental_output` to improve the interactive experience.
- Place model parameters inside the `parameters` object; parameters placed at the top level will be automatically corrected, but passing them according to the specification is recommended.
- Monitor token consumption via the returned `usage`.
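To illustrate consuming a streamed response, here is a minimal sketch that separates `reasoning_content` from `content` deltas and picks up the `usage` object from the final chunk. The chunk shapes follow the OpenAI-compatible format shown above; `parse_stream` and the demo chunks are illustrative, not part of the service:

```python
import json

def parse_stream(sse_lines):
    """Accumulate a streamed chat completion.

    `sse_lines` is an iterable of SSE `data:` payloads (JSON strings).
    Returns (reasoning, content, usage); reasoning is non-empty only when
    the upstream emits `reasoning_content` deltas.
    """
    reasoning, content, usage = [], [], None
    for line in sse_lines:
        if line.strip() == "[DONE]":
            break
        chunk = json.loads(line)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            reasoning.append(delta.get("reasoning_content") or "")
            content.append(delta.get("content") or "")
        if chunk.get("usage"):  # the final chunk carries token statistics
            usage = chunk["usage"]
    return "".join(reasoning), "".join(content), usage

# Demo with synthetic chunks mimicking the format above.
demo = [
    '{"choices":[{"delta":{"reasoning_content":"Let me think."}}]}',
    '{"choices":[{"delta":{"content":"Hello!"}}]}',
    '{"choices":[],"usage":{"total_tokens":2176,'
    '"completion_tokens_details":{"reasoning_tokens":0}}}',
    "[DONE]",
]
reasoning, content, usage = parse_stream(demo)
```

A real client would feed this the `data:` lines of the HTTP event stream instead of the synthetic list.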