inline::meta-reference
Description
Meta’s reference implementation of inference with support for various model formats and optimization techniques.
Configuration
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
|  |  | No |  |  |
|  |  | No |  |  |
|  |  | No | 4096 |  |
|  |  | No | 1 |  |
|  |  | No |  |  |
|  |  | No | True |  |
|  |  | No |  |  |
|  |  | No |  |  |
Sample Configuration
model: Llama3.2-3B-Instruct
checkpoint_dir: ${env.CHECKPOINT_DIR:=null}
quantization:
  type: ${env.QUANTIZATION_TYPE:=bf16}
model_parallel_size: ${env.MODEL_PARALLEL_SIZE:=0}
max_batch_size: ${env.MAX_BATCH_SIZE:=1}
max_seq_len: ${env.MAX_SEQ_LEN:=4096}
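
The sample configuration uses `${env.NAME:=default}` placeholders. The Python sketch below shows one way such a placeholder could be resolved, assuming `:=` means "use the environment variable if it is set, otherwise fall back to the default". The helper `resolve_placeholders` is hypothetical and written only for illustration; it is not the loader this provider uses. It also treats an empty but set variable as set, which may differ from the real behavior.

```python
import os
import re

# Matches placeholders of the form ${env.NAME:=default}; the default may be empty.
_PLACEHOLDER = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*):=([^}]*)\}")


def resolve_placeholders(text, environ=None):
    """Replace each ${env.NAME:=default} with the variable's value or its default.

    Hypothetical helper: the substitution rule is an assumption based on the
    placeholder syntax, not taken from the provider's source code.
    """
    env = os.environ if environ is None else environ

    def _substitute(match):
        name, default = match.group(1), match.group(2)
        # Fall back to the inline default when the variable is not set.
        return env.get(name, default)

    return _PLACEHOLDER.sub(_substitute, text)


sample = "max_seq_len: ${env.MAX_SEQ_LEN:=4096}"
print(resolve_placeholders(sample, environ={}))                       # max_seq_len: 4096
print(resolve_placeholders(sample, environ={"MAX_SEQ_LEN": "8192"}))  # max_seq_len: 8192
```

Under this reading, leaving `MAX_SEQ_LEN` unset in the environment yields the default of 4096 shown in the sample, and exporting it before startup overrides that value without editing the file.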