Cerebras

Use Cerebras Wafer Scale Engine for ultra-fast Llama inference with Nexus.

Cerebras provides ultra-fast inference on its Wafer Scale Engine hardware and exposes an OpenAI-compatible API.
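Because the API is OpenAI-compatible, you can talk to it with a plain chat-completions request even without the provider wrapper. The sketch below only builds the request and leaves sending to the caller; the `/chat/completions` path follows the OpenAI convention and is an assumption here, not something this page documents.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

// buildChatRequest constructs an OpenAI-style chat-completions request
// against the Cerebras base URL. It does not send the request.
func buildChatRequest(apiKey, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(map[string]any{
		"model": model,
		"messages": []map[string]string{
			{"role": "user", "content": prompt},
		},
	})
	if err != nil {
		return nil, err
	}
	// /chat/completions is the standard OpenAI-compatible path (assumption).
	req, err := http.NewRequest(http.MethodPost,
		"https://api.cerebras.ai/v1/chat/completions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := buildChatRequest(os.Getenv("CEREBRAS_API_KEY"), "llama3.1-8b", "Hello")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.URL.String())
}
```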

Installation

import "github.com/xraph/nexus/providers/cerebras"

Quick Start

provider := cerebras.New(os.Getenv("CEREBRAS_API_KEY"))

gw := nexus.New(
    nexus.WithProvider(provider),
)

Options

| Option | Description |
| --- | --- |
| `cerebras.WithBaseURL(url)` | Override the API base URL (default: `https://api.cerebras.ai/v1`) |

Capabilities

| Capability | Supported |
| --- | --- |
| Chat | Yes |
| Streaming | Yes |
| Embeddings | No |
| Vision | No |
| Tools | No |
| Thinking | No |

Models

| Model | Context | Max Output | Input Price | Output Price |
| --- | --- | --- | --- | --- |
| llama3.1-8b | 8,192 | 4,096 | $0.10/M | $0.10/M |
| llama3.1-70b | 8,192 | 4,096 | $0.60/M | $0.60/M |
