LLM Gateway

Managed LLM access included with exe.dev

The LLM Gateway is exe.dev's managed access to Anthropic, OpenAI, and
Fireworks models. Your subscription includes a monthly token allocation, and
you can purchase additional tokens at [https://exe.dev/user](https://exe.dev/user).

See the [full list of supported models](/llm-gateway-models) ([JSON](/llm-gateway-models.json)).

New accounts get a default [LLM integration](/docs/integrations-llm) named
`llm`, attached to `auto:all`. That integration exposes the gateway at
`https://llm.int.exe.xyz` inside attached VMs, with no provider API keys stored
on the VM.

The gateway is only for exe.dev-managed model access. If you want to use your
own provider API key or a ChatGPT subscription, configure those as provider
sources on an [LLM integration](/docs/integrations-llm); those sources are not
part of the gateway or its token allocation.

Use the [LLM Integration guide](/docs/integrations-llm) to configure provider
sources, connect a ChatGPT subscription, attach the integration to VMs, or use
the integration with Shelley, Codex, Claude Code, and curl.

## Shelley

[Shelley](/docs/shelley/intro) automatically discovers attached LLM
integrations through the `reflection` integration. On new accounts, Shelley
sees the default `llm` integration and shows its gateway-backed models in the
`Model:` picker without custom model setup.

If Shelley does not show the gateway models, see
[Use with Shelley](/docs/integrations-llm#use-with-shelley).

## Low-level gateway endpoint

The default LLM integration is the preferred interface. The lower-level
gateway endpoint is also available inside exe.dev VMs at:

```
http://169.254.169.254/gateway/llm/<provider>
```

Replace `<provider>` with `anthropic`, `openai`, or `fireworks`, then append
the provider's own API path (for example, `/v1/messages` for Anthropic). No API
keys are necessary; the gateway authenticates the VM.

```
$ curl -s http://169.254.169.254/gateway/llm/anthropic/v1/messages \
    -H "content-type: application/json" \
    -H "anthropic-version: 2023-06-01" \
    -d '{
      "model": "claude-sonnet-4-6",
      "max_tokens": 256,
      "messages": [{"role": "user", "content": "Hello!"}]
    }'
```
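
If you call the low-level endpoint from scripts, a small wrapper can keep the provider name and path in one place. The following is a minimal sketch, not part of exe.dev's tooling: `GATEWAY_BASE`, `gateway_url`, and `gateway_post` are hypothetical names introduced here for illustration, and the only facts taken from this page are the base address and the three provider names.

```shell
# Base address of the low-level gateway, as documented above.
GATEWAY_BASE="http://169.254.169.254/gateway/llm"

# gateway_url PROVIDER PATH
# Hypothetical helper: prints the gateway URL for a supported provider and
# fails for anything other than anthropic, openai, or fireworks.
gateway_url() {
  case "$1" in
    anthropic|openai|fireworks)
      printf '%s/%s%s\n' "$GATEWAY_BASE" "$1" "$2"
      ;;
    *)
      echo "unsupported provider: $1" >&2
      return 1
      ;;
  esac
}

# gateway_post PROVIDER PATH JSON_BODY [extra curl arguments...]
# Hypothetical helper: POSTs a JSON body through the gateway. No API key
# header is sent, because the gateway authenticates the VM.
gateway_post() {
  url=$(gateway_url "$1" "$2") || return 1
  body=$3
  shift 3
  curl -sS "$url" -H "content-type: application/json" "$@" -d "$body"
}
```

With these defined, `gateway_post anthropic /v1/messages "$body" -H "anthropic-version: 2023-06-01"` should send the same request as the curl example above, where `$body` holds the JSON payload. Rejecting unknown provider names up front turns a typo into a clear local error instead of a failed HTTP request.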