| Item | Content |
|---|---|
| Endpoint | https://llm.ai-nebula.com/v1/chat/completions |
| Authentication Method | API Key (Token) |
| Request Headers | Authorization: Bearer sk-xxxx, Content-Type: application/json |
| Supported Models | gpt-4o, gpt-4.1, gpt-4o-mini, gpt-3.5-turbo, etc. (subject to routing configuration) |

### Tool Calling (Function Calling)

- Phase 1: the model returns `tool_calls` (`content` is usually null, `finish_reason=tool_calls`). Execute the corresponding function on your server based on `tool_calls[*].function.name` and `arguments`.
- Phase 2: pass the result back as a `role:"tool"` message and continue the completion (streaming is supported). The `tool_call_id` must match the one returned in Phase 1.

### Streaming

- Streamed chunks carry incremental `choices` deltas; a final `usage` aggregation may be attached at the end. If the channel supports `stream_options.include_usage=true`, chunks may contain real-time usage.
- For tool calling over a stream, aggregate `tool_calls` from the incremental chunks, execute the function on the server, and pass the result back to the model as a tool message.

### Structured Output

- Use `response_format: json_schema` and provide a strict JSON Schema; if necessary, combine this with lowering `temperature` and setting `max_tokens`.

### Reproducibility

- `seed` can be used; implementation may vary across vendors, so it is recommended to enable it only for workflows that require reproducibility. Also consider lowering `temperature` under strict JSON mode.

### Reasoning Models

- Standard chat models (gpt-4o, gpt-4.1, gpt-3.5-turbo) do not provide visible chain-of-thought output; passing `enable_thinking` in the request body will not take effect.
- For reasoning models (o1, o3, o4-mini, etc.), please use the Responses API (`/v1/responses`).
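The two-phase tool-calling flow above can be sketched as follows. This is a minimal illustration, not a complete client: the endpoint URL comes from the table above, while `get_weather`, the sample Phase 1 payload, and the `call_abc123` id are hypothetical stand-ins for whatever the model actually returns.

```python
import json

# Endpoint from the table above (shown for context; no request is made here).
ENDPOINT = "https://llm.ai-nebula.com/v1/chat/completions"

# Hypothetical local implementation of a function the model may call.
def get_weather(city: str) -> str:
    return json.dumps({"city": city, "temp_c": 21})

LOCAL_TOOLS = {"get_weather": get_weather}

def build_tool_messages(assistant_message: dict) -> list:
    """Execute each tool call locally and build the role:"tool" messages
    for the Phase 2 request. tool_call_id must echo the id from Phase 1."""
    tool_messages = []
    for call in assistant_message.get("tool_calls") or []:
        name = call["function"]["name"]
        args = json.loads(call["function"]["arguments"])
        result = LOCAL_TOOLS[name](**args)
        tool_messages.append({
            "role": "tool",
            "tool_call_id": call["id"],  # must match the id from Phase 1
            "content": result,
        })
    return tool_messages

# Hypothetical Phase 1 assistant message
# (content is null, finish_reason would be "tool_calls").
phase1 = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {"name": "get_weather",
                     "arguments": "{\"city\": \"Berlin\"}"},
    }],
}

followup = build_tool_messages(phase1)
```

In Phase 2 you would append both the Phase 1 assistant message and the messages in `followup` to the conversation and POST it back to the endpoint.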
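Aggregating `tool_calls` from incremental chunks, as described in the streaming section, can be sketched like this. It assumes the OpenAI-style delta format, where each chunk's `delta.tool_calls` entries carry an `index`, an optional `id`/`function.name` (usually only in the first fragment), and incremental `function.arguments` string fragments; the sample stream is a hypothetical illustration.

```python
def aggregate_tool_calls(chunks: list) -> list:
    """Merge streamed tool_call deltas into complete calls, keyed by index."""
    calls = {}
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        for tc in delta.get("tool_calls") or []:
            slot = calls.setdefault(tc["index"],
                                    {"id": "", "name": "", "arguments": ""})
            if tc.get("id"):
                slot["id"] = tc["id"]
            fn = tc.get("function", {})
            if fn.get("name"):
                slot["name"] = fn["name"]
            # Argument JSON arrives as string fragments; concatenate them.
            slot["arguments"] += fn.get("arguments") or ""
    return [calls[i] for i in sorted(calls)]

# Hypothetical stream: the arguments JSON is split across three chunks.
stream = [
    {"choices": [{"delta": {"tool_calls": [
        {"index": 0, "id": "call_1",
         "function": {"name": "get_weather", "arguments": ""}}]}}]},
    {"choices": [{"delta": {"tool_calls": [
        {"index": 0, "function": {"arguments": "{\"city\": "}}]}}]},
    {"choices": [{"delta": {"tool_calls": [
        {"index": 0, "function": {"arguments": "\"Oslo\"}"}}]}}]},
]

merged = aggregate_tool_calls(stream)
```

Once `finish_reason=tool_calls` arrives, the merged `arguments` strings can be parsed as JSON, the functions executed, and the results passed back as tool messages.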
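A request body combining the structured-output and reproducibility settings above might look like the following sketch. The `city_extraction` schema is a hypothetical example; the parameter values shown (temperature, max_tokens, seed) are illustrative choices, not required ones.

```python
# Sketch of a request body for strict JSON output: response_format with a
# json_schema, plus a low temperature, a max_tokens cap, and an optional seed.
request_body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Extract the city name."}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "city_extraction",   # hypothetical schema name
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
                "additionalProperties": False,
            },
        },
    },
    "temperature": 0,   # lower temperature under strict JSON mode
    "max_tokens": 256,
    "seed": 42,         # optional; vendor behavior may vary
}
```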