For the complete documentation index, see llms.txt. This page is also available as Markdown.

Generic (OpenAI-compatible)

Use this backend for any provider exposing an OpenAI-compatible Chat Completions API.

Requirements

  • Provider base URL.

  • Model identifier.

  • Optional API key/headers depending on provider.

Setup

  1. Get provider URL and model name.

  2. Configure fields in AI Backend settings tab.

  3. Validate with Test connection.

Configuration

Setting
Value

Preferred Backend

Generic (OpenAI-compatible)

Base URL

provider URL

Model

provider model id

API Key (Bearer)

optional

Extra Headers

optional (Header: value)

Timeout (seconds)

increase for heavy prompts

URL Behavior

Final endpoint resolution:

  • Base URL ends with /vN -> append /chat/completions.

  • Base URL already ends with /chat/completions -> use as-is.

  • Otherwise -> append /v1/chat/completions.

Examples:

Headers example:

Output Token Limits

The extension sets max_tokens automatically per request type to ensure complete responses:

Request Type

max_tokens

Chat

4096

Scanner (single request)

2048

Scanner (batch analysis)

4096

Payload generation

1024

Troubleshooting

  • 401/403: verify auth credentials and headers.

  • 404: verify provider supports chat completions at resolved path.

  • Timeout: increase timeout or choose smaller/faster model.

Retry Behavior

If a request fails due to a transient network error, the extension retries automatically up to 6 attempts using a bounded stepped backoff (500/1000/1500/2000/3000/4000 ms). A circuit breaker opens after 5 consecutive failures and resets after 30 s. Each retry is logged in the AI Request Logger. See Backends Overview → Retry Behavior.

Last updated