Build custom chat applications using the Comput3 Network API with OpenAI-compatible endpoints.
Quick Start
Make Your First Request
Send a simple chat completion request:

curl -X POST "https://api.comput3.ai/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "hermes4:70b",
    "messages": [
      {"role": "user", "content": "Hello, world!"}
    ]
  }'
Handle the Response
Process the JSON response:

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "hermes4:70b",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}
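In Python, the useful fields can be pulled straight out of the parsed response. A minimal sketch, assuming data holds the decoded JSON dict shown above:

# `data` is assumed to be the parsed JSON response from above
reply = data["choices"][0]["message"]["content"]
finish_reason = data["choices"][0]["finish_reason"]
total_tokens = data["usage"]["total_tokens"]

print(f"Assistant: {reply}")
print(f"Finish reason: {finish_reason}, tokens used: {total_tokens}")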
API Endpoints
Chat Completions
Endpoint: POST /v1/chat/completions
Create a chat completion with conversation context.
model (string, required): The model to use for the completion. Available models:
hermes4:70b - Hermes 4 model (70B parameters) for advanced reasoning
hermes4:405b - Largest Hermes 4 model (405B parameters) for complex tasks
deepseek-v3.1 - Latest DeepSeek model for coding and general tasks
kimi-k2 - Kimi K2 model for general conversation
qwen3-coder:480b - Massive Qwen3 Coder model for advanced coding tasks
qwen3-max - Large-scale reasoning and analysis
grok-code-fast-1 - Fast coding assistance
claude-sonnet-4 - Creative writing and analysis
messages (array, required): Array of message objects representing the conversation history. Each message object contains:
role - The role of the message author: system, user, or assistant
content - The content of the message
name - Optional name for the message author
temperature (number, optional): Controls randomness. Range: 0.0 to 2.0
max_tokens (integer, optional): Maximum number of tokens to generate
stream (boolean, optional): Whether to stream partial message deltas
stop (string or array, optional): Sequences where the API will stop generating tokens
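Putting these together, a request that sets several of the optional parameters might look like the following sketch. The values are illustrative, and client is the SDK client configured in the examples below:

response = client.chat.completions.create(
    model="hermes4:70b",
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize the benefits of caching."},
    ],
    temperature=0.3,   # lower values give more deterministic output
    max_tokens=200,    # cap on generated tokens
    stream=False,      # set True to receive partial deltas
    stop=["\n\n"],     # stop generating at the first blank line
)
print(response.choices[0].message.content)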
SDK Examples
Python
Point the official OpenAI SDK at the Comput3 base URL:
import openai

client = openai.OpenAI(
    api_key="YOUR_COMPUT3_API_KEY",
    base_url="https://api.comput3.ai/v1",
)

response = client.chat.completions.create(
    model="hermes4:70b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing"},
    ],
    temperature=0.7,
    max_tokens=500,
)

print(response.choices[0].message.content)
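If you would rather avoid the SDK, the same call can be made with the requests library. A minimal sketch based on the curl example above, with no retries or error handling:

import requests

resp = requests.post(
    "https://api.comput3.ai/v1/chat/completions",
    headers={
        "Authorization": "Bearer YOUR_COMPUT3_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "model": "hermes4:70b",
        "messages": [{"role": "user", "content": "Explain quantum computing"}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])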
JavaScript/Node.js
The same pattern works with the OpenAI SDK for Node.js:
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_COMPUT3_API_KEY',
  baseURL: 'https://api.comput3.ai/v1'
});

const response = await client.chat.completions.create({
  model: 'hermes4:70b',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Explain blockchain technology' }
  ],
  temperature: 0.7,
  max_tokens: 500
});

console.log(response.choices[0].message.content);
Streaming Responses
Enable real-time response streaming for better user experience:
import openai

client = openai.OpenAI(
    api_key="YOUR_COMPUT3_API_KEY",
    base_url="https://api.comput3.ai/v1",
)

stream = client.chat.completions.create(
    model="hermes4:70b",
    messages=[{"role": "user", "content": "Write a short story"}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
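To keep the complete reply (for logging or conversation history) while still streaming, a variant of the loop above can accumulate the deltas as they arrive:

chunks = []
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta is not None:
        chunks.append(delta)
        print(delta, end="", flush=True)

full_reply = "".join(chunks)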
Advanced Features
Function Calling
Enable the model to call external functions:
functions = [
    {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name"
                }
            },
            "required": ["location"]
        }
    }
]

response = client.chat.completions.create(
    model="hermes4:70b",
    messages=[
        {"role": "user", "content": "What's the weather in New York?"}
    ],
    functions=functions,
    function_call="auto"
)
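The model does not run the function itself; when it decides to call one, the response message carries a function_call with a name and JSON-encoded arguments. A sketch of the full round trip using the legacy functions format shown above, assuming you supply your own get_weather(location) implementation:

import json

message = response.choices[0].message

if message.function_call is not None:
    # Parse the arguments the model produced
    args = json.loads(message.function_call.arguments)
    result = get_weather(args["location"])  # your own implementation

    # Send the function result back so the model can answer in natural language
    followup = client.chat.completions.create(
        model="hermes4:70b",
        messages=[
            {"role": "user", "content": "What's the weather in New York?"},
            {"role": "assistant", "content": None, "function_call": {
                "name": message.function_call.name,
                "arguments": message.function_call.arguments,
            }},
            {"role": "function", "name": "get_weather",
             "content": json.dumps(result)},
        ],
    )
    print(followup.choices[0].message.content)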
Conversation Memory
Maintain conversation context across multiple requests:
class ChatSession:
    def __init__(self, system_prompt=None):
        self.messages = []
        if system_prompt:
            self.messages.append({"role": "system", "content": system_prompt})

    def send_message(self, content):
        self.messages.append({"role": "user", "content": content})
        response = client.chat.completions.create(
            model="hermes4:70b",
            messages=self.messages,
        )
        assistant_message = response.choices[0].message.content
        self.messages.append({"role": "assistant", "content": assistant_message})
        return assistant_message

# Usage
chat = ChatSession("You are a helpful programming assistant.")
response1 = chat.send_message("How do I create a Python list?")
response2 = chat.send_message("Can you show me an example?")
Error Handling
Implement robust error handling for production applications:
import time

import openai
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_COMPUT3_API_KEY",
    base_url="https://api.comput3.ai/v1",
)

def chat_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="hermes4:70b",
                messages=messages,
            )
            return response.choices[0].message.content
        except openai.RateLimitError:
            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)  # Exponential backoff
                continue
            raise
        except openai.APIError as e:
            print(f"API error: {e}")
            raise
        except Exception as e:
            print(f"Unexpected error: {e}")
            raise
Rate Limiting and Optimization
Managing Rate Limits
Implement a queue system for high-volume applications:

import asyncio
from asyncio import Semaphore

from openai import AsyncOpenAI

class RateLimitedClient:
    def __init__(self, max_concurrent=5):
        self.semaphore = Semaphore(max_concurrent)
        # AsyncOpenAI is required so the create() call can be awaited
        self.client = AsyncOpenAI(
            api_key="YOUR_COMPUT3_API_KEY",
            base_url="https://api.comput3.ai/v1",
        )

    async def chat_completion(self, messages):
        async with self.semaphore:
            response = await self.client.chat.completions.create(
                model="hermes4:70b",
                messages=messages,
            )
            return response.choices[0].message.content
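A usage sketch that fans out several prompts concurrently while the semaphore caps in-flight requests (the prompts are illustrative):

async def main():
    limited = RateLimitedClient(max_concurrent=5)
    prompts = ["Explain DNS", "Explain TLS", "Explain HTTP/2"]
    tasks = [
        limited.chat_completion([{"role": "user", "content": p}])
        for p in prompts
    ]
    for answer in await asyncio.gather(*tasks):
        print(answer[:80])

asyncio.run(main())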
Optimize token usage to reduce costs:

def optimize_messages(messages, max_tokens=4000):
    """Truncate conversation history to fit within token limits."""
    # Rough estimate: ~1.3 tokens per whitespace-separated word
    total_tokens = sum(len(msg["content"].split()) * 1.3 for msg in messages)
    while total_tokens > max_tokens and len(messages) > 2:
        # Drop the oldest message, but never the system message
        if messages[0]["role"] == "system":
            messages.pop(1)
        else:
            messages.pop(0)
        total_tokens = sum(len(msg["content"].split()) * 1.3 for msg in messages)
    return messages
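For example, trimming a ChatSession history before each request (copying first, since optimize_messages mutates the list it is given):

trimmed = optimize_messages(list(chat.messages), max_tokens=4000)
response = client.chat.completions.create(
    model="hermes4:70b",
    messages=trimmed,
)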
Best Practices
Security
Store API keys as environment variables (see the sketch after this list)
Use HTTPS for all requests
Implement proper authentication
Validate and sanitize user inputs
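A minimal sketch of loading the key from the environment instead of hard-coding it; the variable name COMPUT3_API_KEY is an assumption, not a fixed convention:

import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["COMPUT3_API_KEY"],  # hypothetical variable name
    base_url="https://api.comput3.ai/v1",
)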
Performance
Use appropriate models for each task
Implement response caching (see the sketch after this list)
Use streaming for long responses
Monitor token usage and costs
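One simple approach to response caching is an in-memory dictionary keyed on the model and conversation. A sketch, suitable only for a single process and for deterministic, low-temperature requests:

import hashlib
import json

_cache = {}

def cached_completion(messages, model="hermes4:70b"):
    # Identical (model, messages) pairs reuse the stored reply
    key = hashlib.sha256(
        json.dumps([model, messages], sort_keys=True).encode()
    ).hexdigest()
    if key not in _cache:
        response = client.chat.completions.create(
            model=model,
            messages=messages,
            temperature=0,  # near-deterministic output makes caching meaningful
        )
        _cache[key] = response.choices[0].message.content
    return _cache[key]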
Error Handling
Implement retry logic with exponential backoff
Handle rate limiting gracefully
Log errors for debugging
Provide fallback responses
User Experience
Show loading states during API calls
Implement typing indicators
Cache frequent responses
Provide offline functionality where possible