NVIDIA NIM
NVIDIA NIM (integrate.api.nvidia.com) hosts a range of open and proprietary models behind an OpenAI-compatible chat-completions interface. The extension targets /v1/chat/completions with the configured bearer token.
Requirements
An NVIDIA Developer account and an API key (starts with
nvapi-…).Network access to
integrate.api.nvidia.com.
Setup
Sign up at build.nvidia.com and generate an API key.
Pick a model (for example
moonshotai/kimi-k2.5).Configure the backend in the AI Backend settings tab.
Configuration
Preferred Backend
NVIDIA NIM
Select backend.
Base URL
https://integrate.api.nvidia.com
NVIDIA-hosted endpoint; override only when targeting a self-hosted NIM.
Model
(empty)
Model identifier, e.g. moonshotai/kimi-k2.5.
API Key
(empty)
Your nvapi-… token. Sent as Authorization: Bearer ….
Extra Headers
(empty)
Optional extra Header: value lines if a gateway requires them.
Timeout
120
Request timeout in seconds.
A working baseline:
Privacy Considerations
NVIDIA NIM is a cloud backend. The same privacy guidance as other cloud providers applies:
Keep privacy mode at
STRICTorBALANCED(the default) for real targets.Review the context preview dialog before sending auto-captured traffic.
Review the Privacy Modes page for redaction patterns.
Output Token Limits
The extension sets max_tokens automatically per request type:
Request Type
max_tokens
Chat
4096
Scanner (single request)
2048
Scanner (batch analysis)
4096
Payload generation
1024
Troubleshooting
401 Unauthorized: verify the API key is a validnvapi-…token and not expired.404 Not Foundon the model: confirm the model ID exactly matches NVIDIA's catalog.Slow first token: NIM models are shared; cold starts are expected.
Extra headers: only add them if your organization routes requests through a gateway.
Retry Behavior
Transient network failures trigger automatic retries (max 6 attempts) with the standard bounded stepped backoff (500 / 1000 / 1500 / 2000 / 3000 / 4000 ms). Each retry is recorded in the AI Request Logger as a RETRY activity.
Related Pages
Last updated
