The gpt-realtime and gpt-realtime-mini models are available. The client establishes a persistent connection via WebSocket and interacts through an event stream: it sends session settings, conversation messages, and generation requests, and receives incremental text/audio and usage statistics.

| Item | Content |
|---|---|
| Base URL | wss://llm.ai-nebula.com |
| Endpoint | /v1/realtime?model={model} |
| Authentication | Authorization: Bearer sk-xxxx |
| Protocol | WebSocket (JSON Event Stream) |
| Supported Models | gpt-realtime, gpt-realtime-mini |
| Audio Format | Input and output are both PCM16 mono, 24000 Hz sample rate (if audio is enabled) |
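The connection parameters in the table above can be assembled programmatically. A minimal sketch (Python, standard library only; the API key is a placeholder — any WebSocket client library can then open the connection with the returned URL and headers):

```python
# Build the Realtime WebSocket handshake URL and headers,
# following the base URL, endpoint, and auth scheme from the table above.

BASE_URL = "wss://llm.ai-nebula.com"

def build_handshake(model: str, api_key: str) -> tuple[str, dict]:
    """Return (url, headers) for the /v1/realtime endpoint."""
    if model not in ("gpt-realtime", "gpt-realtime-mini"):
        raise ValueError(f"unsupported model: {model}")
    url = f"{BASE_URL}/v1/realtime?model={model}"
    headers = {"Authorization": f"Bearer {api_key}"}
    return url, headers

url, headers = build_handshake("gpt-realtime", "sk-xxxx")
print(url)  # wss://llm.ai-nebula.com/v1/realtime?model=gpt-realtime
```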
**Client → server events**

- `session.update`: Set/update session configuration (modalities, system instructions, voice, etc.)
- `conversation.item.create`: Send a conversation message (input_text or input_audio)
- `input_audio_buffer.append` + `input_audio_buffer.commit`: Stream audio push
- `response.create`: Request to generate a response

**Server → client events**

- `session.created` / `session.updated`: Session is ready or has been updated
- `response.created`: Generation has started
- `response.text.delta` / `response.text.done`: Text delta and completion
- `response.audio_transcript.delta` / `response.audio_transcript.done`: Audio transcription delta and completion
- `response.audio.delta` / `response.audio.done`: Audio delta and completion
- `response.done`: Turn finished, includes usage statistics
- `error`: Error event

**Connection flow**

1. Connect to `wss://llm.ai-nebula.com/v1/realtime?model=gpt-realtime` (or `gpt-realtime-mini`), carrying the `Authorization` header.
2. Send `session.update` to configure the session (can be updated multiple times).
3. Send messages with `conversation.item.create`.
4. Send `response.create` to trigger generation and receive incremental events.

Example `session.update`:

```json
{
"event_id": "evt_001",
"type": "session.update",
"session": {
"modalities": ["text", "audio"], // Supports "text" / "audio" (one or both)
"instructions": "You are a friendly assistant",
"voice": "alloy", // TTS voice, optional
"temperature": 0.8,
"input_audio_format": "pcm16",
"output_audio_format": "pcm16",
"input_audio_transcription": { "model": "whisper-1" }
}
}
```
"event_id": "evt_002",
"type": "conversation.item.create",
"item": {
"id": "item_01",
"type": "message",
"role": "user",
"content": [
{ "type": "input_text", "text": "Hello, please briefly introduce yourself." }
]
}
}{ "event_id": "evt_003", "type": "response.create" }{ "type": "session.created", "session": { "id": "sess_xxx" } }
{ "type": "response.created", "response": { "id": "resp_xxx" } }
{ "type": "response.text.delta", "delta": "Hello! I am" }
{ "type": "response.text.delta", "delta": " Nebula's realtime assistant." }
{ "type": "response.done",
"response": {
"usage": {
"total_tokens": 123,
"input_tokens": 45,
"output_tokens": 78
}
}
}
```

To send audio in a single message, include an `input_audio` part in `content`. The audio must be base64-encoded first (PCM16, 24 kHz, mono).
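Producing that base64 payload from raw samples is straightforward; a sketch (Python, standard library only; the generated sine tone is just a stand-in for real microphone data):

```python
import base64
import math
import struct

SAMPLE_RATE = 24000  # required format: PCM16 mono @ 24 kHz

def pcm16_to_base64(samples: list[int]) -> str:
    """Pack signed 16-bit samples little-endian, then base64-encode."""
    raw = struct.pack(f"<{len(samples)}h", *samples)
    return base64.b64encode(raw).decode("ascii")

def audio_message(samples: list[int]) -> dict:
    """Build a conversation.item.create event carrying input_audio."""
    return {
        "type": "conversation.item.create",
        "item": {
            "type": "message",
            "role": "user",
            "content": [
                {"type": "input_audio", "audio": pcm16_to_base64(samples)}
            ],
        },
    }

# 100 ms of a 440 Hz tone as placeholder audio
tone = [int(32767 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE))
        for n in range(SAMPLE_RATE // 10)]
event = audio_message(tone)
```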
"type": "conversation.item.create",
"item": {
"type": "message",
"role": "user",
"content": [
{ "type": "input_audio", "audio": "<base64-of-pcm16>" }
]
}
}
```

For streaming input:

1. Send `input_audio_buffer.append` multiple times; the `audio` field contains the base64 chunk.
2. Send `input_audio_buffer.commit` to let the server generate the conversation item.
3. Send `response.create` to get the response.

```json
{ "type": "input_audio_buffer.append", "audio": "<chunk-1-base64>" }
{ "type": "input_audio_buffer.append", "audio": "<chunk-2-base64>" }
{ "type": "input_audio_buffer.commit", "item": { "type": "message", "role": "user" } }
{ "type": "response.create" }Authorization header is valid and has the sk- prefix.model only supports gpt-realtime / gpt-realtime-mini.response.create has been sent, or if the connection is still alive./api/sync/system/realtimeuser_id and internally forwards the request to /v1/realtime.wss://llm.ai-nebula.com/api/sync/system/realtimeAuthorization: <system_access_token> (No Bearer prefix required)user_id (Required, int): The ID of the actual user to be charged.model (Required, string): gpt-realtime or gpt-realtime-mini.group (Optional, string): Specify a group; otherwise, the user's default group is used.user_id in the query parameters./v1/realtime. The event protocol and return stream are identical.user_id must exist and not be disabled; the model must be a configured realtime model.group if routing switching is needed; otherwise, the user's default group is used./v1/realtime (e.g., session.update, conversation.item.create, response.create).