Build, route, verify.
Three install commands, one API key, and a routed call in under three minutes. BCU meter definitions are public. The SDK and API references are still in draft; until they are published, the meter table and pricing examples below are the source of truth.
Quickstart
Install the CLI or SDK, set your provider keys, and send a call. The drop-in client mirrors the OpenAI Chat Completions shape, so existing code paths usually work without changes.
First call
from bytevion import Bytevion

# Authenticate with your Bytevion API key.
client = Bytevion(api_key="bv-...")

# Same call shape as OpenAI Chat Completions.
response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Explain recursion"}],
    optimize=True,
)
print(response.choices[0].message.content)
BCU meter reference
A Byte Compute Unit (BCU) is a normalized unit of platform work. Every invoice itemizes BCUs across the meters below so customers can audit exactly what they are paying for. List price is $0.010 per BCU; enterprise committed pricing ranges from $0.004 to $0.007 per BCU.
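As a rough illustration of the pricing arithmetic above, the sketch below multiplies a BCU count by a per-BCU rate. Only the rates come from the text; the function name `bcu_cost` and the 12,500 BCU usage figure are hypothetical, and this is not Bytevion's billing code.

```python
from decimal import Decimal

# Rates quoted in the text above.
LIST_RATE = Decimal("0.010")                               # list price per BCU
ENTERPRISE_RATES = (Decimal("0.004"), Decimal("0.007"))    # committed pricing range


def bcu_cost(bcus: int, rate: Decimal = LIST_RATE) -> Decimal:
    """Return the dollar cost of `bcus` Byte Compute Units at `rate` per BCU."""
    return Decimal(bcus) * rate


# Hypothetical usage figure, chosen only to illustrate the arithmetic.
monthly_bcus = 12_500

list_cost = bcu_cost(monthly_bcus)
low, high = (bcu_cost(monthly_bcus, rate) for rate in ENTERPRISE_RATES)

print(f"list: ${list_cost:.2f}")
print(f"enterprise committed: ${low:.2f} to ${high:.2f}")
```

Decimal is used instead of float so that invoice-style totals do not pick up binary rounding error.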
Pricing examples
Three worked examples at the list rate ($0.010 per BCU). The Cheap passthrough row is intentional: Bytevion does not claim savings on already-optimized workloads, and negative savings figures are preserved in invoice-grade ledgers as useful routing evidence.
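One way to picture how a passthrough workload ends up with a negative savings figure is sketched below. The formula (baseline spend minus routed provider spend plus BCU fees), the function name `net_savings`, and all dollar and BCU amounts are assumptions for illustration; the source does not specify how the ledger computes savings.

```python
from decimal import Decimal

LIST_RATE = Decimal("0.010")  # list price per BCU, from the text


def net_savings(baseline_cost: Decimal, routed_provider_cost: Decimal, bcus: int) -> Decimal:
    """Signed savings under an assumed formula.

    savings = baseline spend - (routed provider spend + BCU fees)

    The result is deliberately not clamped at zero, so a loss stays visible.
    """
    bcu_fees = Decimal(bcus) * LIST_RATE
    return baseline_cost - (routed_provider_cost + bcu_fees)


# Hypothetical workload where routing lowers provider spend.
routed = net_savings(Decimal("100.00"), Decimal("60.00"), bcus=1000)

# Hypothetical workload already on a cheap model: provider spend is
# unchanged, so the BCU fees push the figure below zero.
passthrough = net_savings(Decimal("4.00"), Decimal("4.00"), bcus=30)

for label, value in (("routed", routed), ("cheap passthrough", passthrough)):
    print(f"{label}: {value:+.2f}")
```

Keeping the sign rather than reporting zero matches the stated intent of preserving negative figures as routing evidence.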
Provider matrix
Bytevion supports 11 provider adapters out of the box, with BYOK and managed billing modes. Text, vision, document, and audio inputs are routed through the same gateway.
API reference
The full SDK and API reference is still being drafted. The OpenAI-shaped Chat Completions endpoint is the most stable surface and is documented inline in the SDK. If you are integrating today and want preview docs ahead of the public release, send a request to namaste@bytevion.com and we will share them.
Deployment modes
- in-process · drop the SDK into your app. Routing and cache run locally; telemetry streams to the API portal.
- gateway · run the FastAPI Docker gateway as a sidecar or shared service. Single endpoint, no app changes.
- k8s / vpc · private deployment in your own VPC with data residency, private caches, and custom runtime constraints. Enterprise only.