remote::vllm
Description
Remote vLLM inference provider for connecting to vLLM servers.
Configuration
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `url` | | No | | The URL for the vLLM model serving endpoint |
| `max_tokens` | | No | 4096 | Maximum number of tokens to generate. |
| `api_token` | | No | fake | The API token |
| `tls_verify` | | No | True | Whether to verify TLS certificates. Can be a boolean or a path to a CA certificate file. |
| | | No | False | Whether to refresh models periodically |
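Because the TLS verification setting accepts either a boolean or a path to a CA certificate file, a client has to branch on the value's type. The following is a minimal sketch of one way to do that with Python's standard `ssl` module. The helper names `parse_tls_verify` and `build_ssl_context` are hypothetical and are not part of the provider; the set of strings treated as booleans is also an assumption.

```python
import ssl
from typing import Union


def parse_tls_verify(raw: str) -> Union[bool, str]:
    """Hypothetical helper: map a string setting to a bool or a CA file path."""
    lowered = raw.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    # Anything else is assumed to be a path to a CA certificate file.
    return raw.strip()


def build_ssl_context(tls_verify: Union[bool, str]) -> ssl.SSLContext:
    """Hypothetical helper: build an SSL context matching the setting."""
    if tls_verify is True:
        # Verify against the system's default trust store.
        return ssl.create_default_context()
    if tls_verify is False:
        # Disable verification; check_hostname must be cleared first.
        context = ssl.create_default_context()
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
        return context
    # A string is treated as a CA bundle path; this raises if the file is missing.
    return ssl.create_default_context(cafile=tls_verify)
```

Disabling verification is only reasonable for local testing; pointing the setting at a CA file keeps verification on for servers that use a private certificate authority.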
Sample Configuration
url: ${env.VLLM_URL:=}
max_tokens: ${env.VLLM_MAX_TOKENS:=4096}
api_token: ${env.VLLM_API_TOKEN:=fake}
tls_verify: ${env.VLLM_TLS_VERIFY:=true}
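Each value in the sample uses the `${env.NAME:=default}` form, which reads an environment variable and falls back to the text after `:=`. To illustrate how such placeholders resolve, here is a minimal sketch; it is not the stack's actual resolver. The function name `resolve_env_defaults` is hypothetical, and it assumes that only an unset variable triggers the default.

```python
import re
from typing import Mapping

# Matches ${env.NAME:=default}; the default may be empty.
_PLACEHOLDER = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*):=([^}]*)\}")


def resolve_env_defaults(text: str, env: Mapping[str, str]) -> str:
    """Hypothetical helper: replace each placeholder with the variable's
    value from `env`, or with the inline default when it is unset."""

    def _substitute(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        value = env.get(name)
        return default if value is None else value

    return _PLACEHOLDER.sub(_substitute, text)


sample = "max_tokens: ${env.VLLM_MAX_TOKENS:=4096}"
print(resolve_env_defaults(sample, {}))
print(resolve_env_defaults(sample, {"VLLM_MAX_TOKENS": "8192"}))
```

Passing `os.environ` as `env` would resolve against the real process environment. Note that `url` has an empty default in the sample, so it resolves to an empty value unless `VLLM_URL` is set.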