Build custom chat applications using the Comput3 Network API with OpenAI-compatible endpoints.

Quick Start

1. Get API Key

Obtain your API key from the Comput3 Dashboard.
2. Make Your First Request

Send a simple chat completion request:
curl -X POST "https://api.comput3.ai/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "hermes4:70b",
    "messages": [
      {"role": "user", "content": "Hello, world!"}
    ]
  }'
3. Handle the Response

Process the JSON response:
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "hermes4:70b",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}
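
If you are calling the endpoint without an SDK, a minimal Python sketch using the requests library (an assumption; any HTTP client works) shows how to send the same request and extract the reply:

import os
import requests

# Assumes your key is stored in the COMPUT3_API_KEY environment variable
response = requests.post(
    "https://api.comput3.ai/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {os.environ['COMPUT3_API_KEY']}",
        "Content-Type": "application/json",
    },
    json={
        "model": "hermes4:70b",
        "messages": [{"role": "user", "content": "Hello, world!"}],
    },
)
response.raise_for_status()
data = response.json()

print(data["choices"][0]["message"]["content"])
print(f"Tokens used: {data['usage']['total_tokens']}")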

API Endpoints

Chat Completions

Endpoint: POST /v1/chat/completions

Create a chat completion with conversation context.

model (string, required)
The model to use for completion. Available models:
  • hermes4:70b - Hermes 4 model (70B parameters) for advanced reasoning
  • hermes4:405b - Largest Hermes 4 model (405B parameters) for complex tasks
  • deepseek-v3.1 - Latest DeepSeek model for coding and general tasks
  • kimi-k2 - Kimi K2 model for general conversation
  • qwen3-coder:480b - Massive Qwen3 Coder model for advanced coding tasks
  • qwen3-max - Large-scale reasoning and analysis
  • grok-code-fast-1 - Fast coding assistance
  • claude-sonnet-4 - Creative writing and analysis

messages (array, required)
Array of message objects representing the conversation history.

temperature (number, default: 1)
Controls randomness. Range: 0.0 to 2.0.

max_tokens (integer, optional)
Maximum number of tokens to generate.

stream (boolean, default: false)
Whether to stream partial message deltas.

stop (string or array, optional)
Sequences where the API will stop generating tokens.
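
Putting these parameters together, a minimal Python sketch of a request that lowers randomness, caps the reply length, and adds a custom stop sequence (the prompt and stop value here are only illustrative):

import openai

client = openai.OpenAI(
    api_key="YOUR_COMPUT3_API_KEY",
    base_url="https://api.comput3.ai/v1",
)

response = client.chat.completions.create(
    model="hermes4:70b",
    messages=[{"role": "user", "content": "List three sorting algorithms."}],
    temperature=0.2,   # low randomness for a factual answer
    max_tokens=200,    # cap the reply length
    stop=["\n\n"],     # illustrative stop sequence
)
print(response.choices[0].message.content)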

SDK Examples

Python

import openai

client = openai.OpenAI(
    api_key="YOUR_COMPUT3_API_KEY",
    base_url="https://api.comput3.ai/v1"
)

response = client.chat.completions.create(
    model="hermes4:70b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing"}
    ],
    temperature=0.7,
    max_tokens=500
)

print(response.choices[0].message.content)

JavaScript/Node.js

import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_COMPUT3_API_KEY',
  baseURL: 'https://api.comput3.ai/v1'
});

const response = await client.chat.completions.create({
  model: 'hermes4:70b',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Explain blockchain technology' }
  ],
  temperature: 0.7,
  max_tokens: 500
});

console.log(response.choices[0].message.content);

Streaming Responses

Enable real-time response streaming for a better user experience:
import openai

client = openai.OpenAI(
    api_key="YOUR_COMPUT3_API_KEY",
    base_url="https://api.comput3.ai/v1"
)

stream = client.chat.completions.create(
    model="hermes4:70b",
    messages=[{"role": "user", "content": "Write a short story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
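
If you also need the full reply once streaming finishes (for logging or conversation history), one option is to accumulate the deltas as they arrive; this sketch replaces the loop above:

full_reply = []
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta is not None:
        print(delta, end="", flush=True)  # live output
        full_reply.append(delta)          # keep for later use

complete_text = "".join(full_reply)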

Advanced Features

Function Calling

Enable the model to call external functions:
functions = [
    {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name"
                }
            },
            "required": ["location"]
        }
    }
]

response = client.chat.completions.create(
    model="hermes4:70b",
    messages=[
        {"role": "user", "content": "What's the weather in New York?"}
    ],
    functions=functions,
    function_call="auto"
)
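
When the model decides to call the function, the reply carries a function_call instead of plain text. A minimal sketch of handling it (the local get_weather implementation is hypothetical and yours to supply):

import json

message = response.choices[0].message

if message.function_call is not None:
    # Parse the JSON arguments the model produced
    args = json.loads(message.function_call.arguments)
    result = get_weather(args["location"])  # hypothetical local function

    # Send the result back so the model can compose its final answer
    followup = client.chat.completions.create(
        model="hermes4:70b",
        messages=[
            {"role": "user", "content": "What's the weather in New York?"},
            message,
            {"role": "function", "name": "get_weather",
             "content": json.dumps(result)},
        ],
    )
    print(followup.choices[0].message.content)
else:
    print(message.content)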

Conversation Memory

Maintain conversation context across multiple requests:
class ChatSession:
    def __init__(self, system_prompt=None):
        self.messages = []
        if system_prompt:
            self.messages.append({"role": "system", "content": system_prompt})
    
    def send_message(self, content):
        self.messages.append({"role": "user", "content": content})
        
        response = client.chat.completions.create(
            model="hermes4:70b",
            messages=self.messages
        )
        
        assistant_message = response.choices[0].message.content
        self.messages.append({"role": "assistant", "content": assistant_message})
        
        return assistant_message

# Usage
chat = ChatSession("You are a helpful programming assistant.")
response1 = chat.send_message("How do I create a Python list?")
response2 = chat.send_message("Can you show me an example?")

Error Handling

Implement robust error handling for production applications:
import openai
from openai import OpenAI
import time

client = OpenAI(
    api_key="YOUR_COMPUT3_API_KEY",
    base_url="https://api.comput3.ai/v1"
)

def chat_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="hermes4:70b",
                messages=messages
            )
            return response.choices[0].message.content
            
        except openai.RateLimitError:
            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)  # Exponential backoff
                continue
            raise
            
        except openai.APIError as e:
            print(f"API error: {e}")
            raise
            
        except Exception as e:
            print(f"Unexpected error: {e}")
            raise
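
Usage is then a drop-in replacement for a direct call:

reply = chat_with_retry([{"role": "user", "content": "Summarize REST in one sentence."}])
print(reply)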

Rate Limiting and Optimization

Managing Rate Limits

Cap concurrent requests with a semaphore for high-volume applications:
import asyncio
from asyncio import Semaphore

from openai import AsyncOpenAI

class RateLimitedClient:
    def __init__(self, max_concurrent=5):
        # Semaphore caps how many requests are in flight at once
        self.semaphore = Semaphore(max_concurrent)
        # Use the async client so completions can be awaited
        self.client = AsyncOpenAI(
            api_key="YOUR_COMPUT3_API_KEY",
            base_url="https://api.comput3.ai/v1"
        )

    async def chat_completion(self, messages):
        async with self.semaphore:
            response = await self.client.chat.completions.create(
                model="hermes4:70b",
                messages=messages
            )
            return response.choices[0].message.content
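
For example, to fan out several prompts while respecting the concurrency cap (the prompts are placeholders):

async def main():
    limited = RateLimitedClient(max_concurrent=5)
    prompts = ["Explain DNS.", "Explain TCP.", "Explain HTTP."]
    replies = await asyncio.gather(
        *[limited.chat_completion([{"role": "user", "content": p}]) for p in prompts]
    )
    for reply in replies:
        print(reply)

asyncio.run(main())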

Token Optimization

Optimize token usage to reduce costs:
def optimize_messages(messages, max_tokens=4000):
    """Truncate conversation history to fit within token limits."""
    # Rough estimate: ~1.3 tokens per whitespace-separated word
    total_tokens = sum(len(msg["content"].split()) * 1.3 for msg in messages)

    while total_tokens > max_tokens and len(messages) > 2:
        # Remove the oldest message, but keep the system message if present
        if messages[0]["role"] == "system":
            messages.pop(1)
        else:
            messages.pop(0)
        total_tokens = sum(len(msg["content"].split()) * 1.3 for msg in messages)

    return messages

Best Practices

Security

  • Store API keys as environment variables (see the sketch after this list)
  • Use HTTPS for all requests
  • Implement proper authentication
  • Validate and sanitize user inputs
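
For the first point, a minimal sketch of loading the key from an environment variable (the name COMPUT3_API_KEY is a convention used in this guide, not something the API requires):

import os

import openai

client = openai.OpenAI(
    api_key=os.environ["COMPUT3_API_KEY"],  # fails fast if the variable is unset
    base_url="https://api.comput3.ai/v1",
)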

Performance

  • Use appropriate models for each task
  • Implement response caching (see the sketch after this list)
  • Use streaming for long responses
  • Monitor token usage and costs
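
As an illustration of response caching, a minimal in-memory cache keyed on the exact request payload (a sketch only; production code would add eviction and persistence, and client is configured as in the SDK examples above):

import hashlib
import json

_cache = {}

def cached_chat(messages, model="hermes4:70b"):
    # Key on the exact model + messages payload so identical requests hit the cache
    key = hashlib.sha256(
        json.dumps({"model": model, "messages": messages}, sort_keys=True).encode()
    ).hexdigest()
    if key not in _cache:
        response = client.chat.completions.create(model=model, messages=messages)
        _cache[key] = response.choices[0].message.content
    return _cache[key]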

Error Handling

  • Implement retry logic with exponential backoff
  • Handle rate limiting gracefully
  • Log errors for debugging
  • Provide fallback responses

User Experience

  • Show loading states during API calls
  • Implement typing indicators
  • Cache frequent responses
  • Provide offline functionality where possible