remote::nvidia

Description

NVIDIA inference provider for accessing NVIDIA NIM models and AI services.

Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `bool` | No | False | Whether to refresh models periodically from the provider |
| `api_key` | `SecretStr \| None` | No | | Authentication credential for the provider |
| `network` | `NetworkConfig \| None` | No | | Network configuration including TLS, proxy, and timeout settings. |
| `network.tls` | `TLSConfig \| None` | No | | TLS/SSL configuration for secure connections. |
| `network.tls.verify` | `bool \| Path` | No | True | Whether to verify TLS certificates. Can be a boolean or a path to a CA certificate file. |
| `network.tls.min_version` | `Literal[TLSv1.2, TLSv1.3] \| None` | No | | Minimum TLS version to use. Defaults to system default if not specified. |
| `network.tls.ciphers` | `list[str] \| None` | No | | List of allowed cipher suites (e.g., ['ECDHE+AESGCM', 'DHE+AESGCM']). |
| `network.tls.client_cert` | `Path \| None` | No | | Path to client certificate file for mTLS authentication. |
| `network.tls.client_key` | `Path \| None` | No | | Path to client private key file for mTLS authentication. |
| `network.proxy` | `ProxyConfig \| None` | No | | Proxy configuration for HTTP connections. |
| `network.proxy.url` | `HttpUrl \| None` | No | | Single proxy URL for all connections (e.g., 'http://proxy.example.com:8080'). |
| `network.proxy.http` | `HttpUrl \| None` | No | | Proxy URL for HTTP connections. |
| `network.proxy.https` | `HttpUrl \| None` | No | | Proxy URL for HTTPS connections. |
| `network.proxy.cacert` | `Path \| None` | No | | Path to CA certificate file for verifying the proxy's certificate. Required for proxies in interception mode. |
| `network.proxy.no_proxy` | `list[str] \| None` | No | | List of hosts that should bypass the proxy (e.g., ['localhost', '127.0.0.1', '.internal.corp']). |
| `network.timeout` | `float \| TimeoutConfig \| None` | No | | Timeout configuration. Can be a float (for both connect and read) or a TimeoutConfig object with separate connect and read timeouts. |
| `network.timeout.connect` | `float \| None` | No | | Connection timeout in seconds. |
| `network.timeout.read` | `float \| None` | No | | Read timeout in seconds. |
| `network.headers` | `dict[str, str] \| None` | No | | Additional HTTP headers to include in all requests. |
| `base_url` | `HttpUrl \| None` | No | https://integrate.api.nvidia.com/v1 | A base url for accessing the NVIDIA NIM |
| `timeout` | `int` | No | 60 | Timeout for the HTTP requests |
| `rerank_model_to_url` | `dict[str, str]` | No | `{'nv-rerank-qa-mistral-4b:1': 'https://ai.api.nvidia.com/v1/retrieval/nvidia/reranking', 'nvidia/nv-rerankqa-mistral-4b-v3': 'https://ai.api.nvidia.com/v1/retrieval/nvidia/nv-rerankqa-mistral-4b-v3/reranking', 'nvidia/llama-3.2-nv-rerankqa-1b-v2': 'https://ai.api.nvidia.com/v1/retrieval/nvidia/llama-3_2-nv-rerankqa-1b-v2/reranking'}` | Mapping of rerank model identifiers to their API endpoints. |
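
The dotted `network.*` field names above suggest a nested mapping in the provider config. The fragment below is a minimal sketch under that assumption, not a verified configuration: the CA bundle path and the extra header are hypothetical, while the proxy URL and `no_proxy` hosts reuse the examples given in the field descriptions.

```yaml
base_url: ${env.NVIDIA_BASE_URL:=https://integrate.api.nvidia.com/v1}
api_key: ${env.NVIDIA_API_KEY:=}
network:
  tls:
    # Hypothetical path to a CA certificate file; `verify` also accepts true/false.
    verify: /etc/ssl/certs/corp-ca.pem
    min_version: TLSv1.2
  proxy:
    # Single proxy URL used for all connections.
    url: http://proxy.example.com:8080
    # Hosts that should bypass the proxy.
    no_proxy:
      - localhost
      - 127.0.0.1
      - .internal.corp
  timeout:
    # Separate connect and read timeouts, in seconds.
    connect: 5.0
    read: 60.0
  headers:
    # Hypothetical extra header sent with every request.
    X-Request-Source: llama-stack
```

If a single value is enough, `network.timeout` can instead be given as one float, which the field description says applies to both connect and read.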

Sample Configuration

```yaml
base_url: ${env.NVIDIA_BASE_URL:=https://integrate.api.nvidia.com/v1}
api_key: ${env.NVIDIA_API_KEY:=}
```
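
The sample sets only `base_url` and `api_key`. As an illustrative sketch, the optional top-level fields from the Configuration table could be added alongside them. Placing these keys at the same level as `base_url` is an assumption based on the field names, and the single model listed is simply one of the rerank identifiers that appears in the `rerank_model_to_url` default.

```yaml
base_url: ${env.NVIDIA_BASE_URL:=https://integrate.api.nvidia.com/v1}
api_key: ${env.NVIDIA_API_KEY:=}
# Register only this model; when the field is omitted (None), all models are allowed.
allowed_models:
  - nvidia/llama-3.2-nv-rerankqa-1b-v2
# Matches the documented default: no periodic model refresh.
refresh_models: false
# Matches the documented default HTTP request timeout.
timeout: 60
```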