llms#
LLM objects help run an LLM on prompts. All LLMs derive from the LLM base class.
Tip
Instead of calling run() directly, use a step that takes an LLM as an args argument, such as Prompt or FewShotPrompt.
Efficient Generation Techniques#
Throughput#
DataDreamer provides efficient generation through a variety of techniques that can optimize throughput.
Adaptive Batch Sizing
For locally running LLMs, the maximum batch size that the LLM can handle before running out of memory depends on the amount of memory available on your system and on the maximum sequence length of the batch of inputs passed to the LLM, since longer inputs require more memory. Over many iterations, DataDreamer automatically learns the maximum batch size that the LLM can handle for a given sequence length and adaptively adjusts the batch size to maximize throughput.
The batch_size argument passed to the run() method is the largest batch size that will ever be used. DataDreamer will try to find the largest batch size, less than or equal to batch_size, that the LLM can handle. If a batch size is too large, DataDreamer automatically catches the out-of-memory error, reduces the batch size, and remembers the result for future iterations.
To disable adaptive batch sizing, pass adaptive_batch_size=False to the run() method.
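The behavior described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not DataDreamer's implementation: the AdaptiveBatcher class, the halving strategy, the use of character length as a stand-in for sequence length, and the fake_generate function that raises MemoryError are all hypothetical.

```python
from typing import Callable, Dict, List


class AdaptiveBatcher:
    """Toy model of adaptive batch sizing (hypothetical, not DataDreamer's code)."""

    def __init__(self, batch_size: int) -> None:
        self.batch_size = batch_size  # upper bound, like run(batch_size=...)
        # Maps a sequence length to the batch size adopted after a failure.
        self.learned: Dict[int, int] = {}

    def _limit_for(self, seq_len: int) -> int:
        # A limit learned for shorter inputs also applies to longer ones.
        limits = [size for length, size in self.learned.items() if length <= seq_len]
        return min([self.batch_size, *limits])

    def run(
        self, prompts: List[str], generate: Callable[[List[str]], List[str]]
    ) -> List[str]:
        outputs: List[str] = []
        start = 0
        while start < len(prompts):
            size = self._limit_for(len(prompts[start]))
            while True:
                batch = prompts[start : start + size]
                try:
                    outputs.extend(generate(batch))
                    break
                except MemoryError:
                    if size == 1:
                        raise  # even a single prompt does not fit
                    seq_len = max(len(p) for p in batch)
                    size = max(1, size // 2)  # assumption: halve on failure
                    self.learned[seq_len] = size
            start += len(batch)
        return outputs


def fake_generate(batch: List[str]) -> List[str]:
    # Simulated model: memory use grows with batch size times longest input.
    if len(batch) * max(len(p) for p in batch) > 40:
        raise MemoryError("simulated out-of-memory")
    return [p.upper() for p in batch]


batcher = AdaptiveBatcher(batch_size=8)
prompts = ["x" * 10] * 20
outputs = batcher.run(prompts, fake_generate)
```

In this simulation the first batch of 8 fails, the batcher falls back to 4, and every later batch starts at 4 without hitting the error again.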
Batch Scheduling
To minimize the padding processed by the LLM, DataDreamer attempts to schedule batches so that all sequences in a batch have similar lengths.
To do this, DataDreamer reads a large buffer of prompts, sorts the prompts by length, and then schedules batches of size batch_size from the sorted prompts.
To manually control the size of the buffer, pass a batch_scheduler_buffer_size to the run() method.
To disable batch scheduling, set batch_scheduler_buffer_size equal to batch_size.
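As a rough illustration of the scheduling idea, the sketch below sorts each buffer of prompts by length before cutting it into batches. The schedule_batches and padding helpers are hypothetical names, character length stands in for token length, and restoring the original output order (which a real scheduler would need to do) is not shown.

```python
from typing import Iterator, List


def schedule_batches(
    prompts: List[str], batch_size: int, buffer_size: int
) -> Iterator[List[str]]:
    """Yield batches of similar-length prompts (hypothetical sketch)."""
    for start in range(0, len(prompts), buffer_size):
        # Sort one buffer at a time so prompts of similar length are adjacent.
        buffer = sorted(prompts[start : start + buffer_size], key=len)
        for i in range(0, len(buffer), batch_size):
            yield buffer[i : i + batch_size]


def padding(batch: List[str]) -> int:
    """Count the padding needed to bring every prompt up to the longest one."""
    longest = max(len(p) for p in batch)
    return sum(longest - len(p) for p in batch)


prompts = ["a" * n for n in (1, 9, 2, 8, 3, 7, 4, 6)]

# A buffer that covers all prompts lets the scheduler group similar lengths.
scheduled = list(schedule_batches(prompts, batch_size=2, buffer_size=8))
# A buffer equal to batch_size leaves the batch composition unchanged.
unscheduled = list(schedule_batches(prompts, batch_size=2, buffer_size=2))

scheduled_padding = sum(padding(b) for b in scheduled)
unscheduled_padding = sum(padding(b) for b in unscheduled)
```

With these toy inputs, the scheduled batches need far less padding than the unscheduled ones, which is the effect the buffer is meant to achieve.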
Robustness & Retries
For API-based LLMs, DataDreamer will attempt to retry failed requests. This can be disabled via retry_on_fail=False.
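The page does not describe the retry policy itself, so the following is only a generic sketch of what retrying a failed request can look like. The call_with_retries helper, the max_attempts and base_delay parameters, the exponential backoff, and the choice of ConnectionError are all assumptions made for illustration; only the retry_on_fail flag name comes from the text above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    request: Callable[[], T],
    retry_on_fail: bool = True,
    max_attempts: int = 5,
    base_delay: float = 0.5,
) -> T:
    """Call request(), retrying transient failures (hypothetical sketch)."""
    attempt = 0
    while True:
        attempt += 1
        try:
            return request()
        except ConnectionError:
            # Give up immediately when retries are disabled or exhausted.
            if not retry_on_fail or attempt >= max_attempts:
                raise
            # Assumption: wait longer after each failed attempt.
            time.sleep(base_delay * 2 ** (attempt - 1))


calls = {"count": 0}


def flaky_request() -> str:
    # Simulated API call that fails twice before succeeding.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated transient API failure")
    return "ok"


result = call_with_retries(flaky_request, base_delay=0.0)
```

Passing retry_on_fail=False to this helper makes the first failure propagate to the caller instead of being retried.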
Running on Multiple GPUs#
See the Running Models on Multiple GPUs page.
Quantization#
See the Quantization page.
Caching#
LLMs internally perform caching to disk, so if you run the same prompt with the same generation settings multiple times, the LLM will only run the prompt once and then cache the results for future runs.
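The idea of keying a disk cache on both the prompt and the generation settings can be sketched as follows. This is a minimal illustration, not DataDreamer's cache format: the cached_generate helper, the SHA-256 key, the one-JSON-file-per-entry layout, and the fake_model function are all hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable, List


def cached_generate(
    cache_dir: Path, prompt: str, settings: dict, generate: Callable[[str], str]
) -> str:
    """Return a cached output when the prompt and settings were seen before."""
    # The key covers the prompt and every generation setting.
    key_material = json.dumps({"prompt": prompt, "settings": settings}, sort_keys=True)
    key = hashlib.sha256(key_material.encode("utf-8")).hexdigest()
    cache_file = cache_dir / f"{key}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8"))["output"]
    output = generate(prompt)
    cache_file.write_text(json.dumps({"output": output}), encoding="utf-8")
    return output


model_calls: List[str] = []


def fake_model(prompt: str) -> str:
    # Stand-in for an expensive LLM call; records each real invocation.
    model_calls.append(prompt)
    return prompt[::-1]


cache_dir = Path(tempfile.mkdtemp())
settings = {"temperature": 1.0, "max_new_tokens": 16}
first = cached_generate(cache_dir, "hello", settings, fake_model)
second = cached_generate(cache_dir, "hello", settings, fake_model)  # cache hit
third = cached_generate(
    cache_dir, "hello", {**settings, "temperature": 0.5}, fake_model
)  # different settings, so the model runs again
```

The second call is served from disk, while changing any generation setting produces a different key and triggers a new run.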
- class datadreamer.llms.LLM(cache_folder_path=None)[source]#
Bases:
_Cachable
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- abstract get_max_context_length(max_new_tokens)[source]#
Gets the maximum context length for the model. When max_new_tokens is greater than 0, the maximum number of tokens that can be used for the prompt context is returned.
- format_prompt(max_new_tokens=None, beg_instruction=None, in_context_examples=None, end_instruction=None, sep='\\n', min_in_context_examples=None, max_in_context_examples=None)[source]#
Formats a prompt for the LLM given instructions and in-context examples.
Prompt Format
The final prompt will be constructed as follows:
beg_instruction sep in_context_example_1 sep in_context_example_2 sep ... sep in_context_example_n sep end_instruction
If beg_instruction, in_context_examples, or end_instruction is None, it will not be included in the prompt.
If all of the in_context_examples will not fit in the prompt (accounting for the max_new_tokens that may be generated), the prompt will be constructed with as many in-context examples as will fit.
If min_in_context_examples and max_in_context_examples are set, those constraints will be enforced.
- Parameters:
max_new_tokens (Optional[int], default: None) – The maximum number of tokens that can be generated.
beg_instruction (Optional[str], default: None) – The instruction at the beginning of the prompt.
in_context_examples (Optional[list[str]], default: None) – The in-context examples to include in the prompt.
end_instruction (Optional[str], default: None) – The instruction at the end of the prompt.
sep (default: '\\n') – The separator to use between the instructions and in-context examples.
min_in_context_examples (Optional[int], default: None) – The minimum number of in-context examples to include in the prompt.
max_in_context_examples (Optional[int], default: None) – The maximum number of in-context examples to include in the prompt.
- Return type:
str
- Returns:
The formatted prompt.
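The prompt format described for format_prompt() can be approximated with the standalone sketch below. It is not the library's implementation: build_prompt, count_tokens, and max_context_length are hypothetical names, tokens are approximated by whitespace-separated words, and both the choice to drop examples from the end of the list and the ValueError raised when the constraints cannot be met are assumptions.

```python
from typing import Callable, List, Optional


def build_prompt(
    count_tokens: Callable[[str], int],
    max_context_length: int,
    max_new_tokens: int = 0,
    beg_instruction: Optional[str] = None,
    in_context_examples: Optional[List[str]] = None,
    end_instruction: Optional[str] = None,
    sep: str = "\n",
    min_in_context_examples: Optional[int] = None,
    max_in_context_examples: Optional[int] = None,
) -> str:
    """Join instructions and as many examples as fit (hypothetical sketch)."""
    examples = list(in_context_examples or [])
    if max_in_context_examples is not None:
        examples = examples[:max_in_context_examples]
    # Reserve room for the tokens that may be generated.
    budget = max_context_length - max_new_tokens

    def assemble(selected: List[str]) -> str:
        parts = [beg_instruction, *selected, end_instruction]
        return sep.join(part for part in parts if part is not None)

    # Assumption: drop examples from the end until the prompt fits.
    while examples and count_tokens(assemble(examples)) > budget:
        examples.pop()
    if len(examples) < (min_in_context_examples or 0):
        raise ValueError("Not enough in-context examples fit in the prompt.")
    if count_tokens(assemble(examples)) > budget:
        raise ValueError("The instructions alone do not fit in the prompt.")
    return assemble(examples)


def count_words(text: str) -> int:
    # Crude stand-in for a tokenizer: one token per whitespace-separated word.
    return len(text.split())


prompt = build_prompt(
    count_words,
    max_context_length=12,
    max_new_tokens=4,
    beg_instruction="Translate to French:",
    in_context_examples=["cat -> chat", "dog -> chien", "bird -> oiseau"],
    end_instruction="horse ->",
)
```

Here the budget is 8 words, so only the first example fits between the two instructions. Requesting min_in_context_examples=2 with the same budget would raise the error in this sketch.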
- abstract run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.OpenAI(model_name, system_prompt=None, organization=None, api_key=None, base_url=None, api_version=None, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases:
LLM
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.OpenAIAssistant(model_name, system_prompt=None, tools=None, organization=None, api_key=None, base_url=None, api_version=None, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases:
OpenAI
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- class datadreamer.llms.HFTransformers(model_name, chat_prompt_template=AUTO, system_prompt=AUTO, revision=None, trust_remote_code=False, device=None, device_map=None, dtype=None, quantization_config=None, adapter_name=None, adapter_kwargs=None, cache_folder_path=None, **kwargs)[source]#
Bases:
LLM
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=True, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.CTransformers(model_name, model_type=None, model_file=None, max_context_length=None, chat_prompt_template=AUTO, system_prompt=AUTO, revision=None, threads=None, gpu_layers=0, cache_folder_path=None, **kwargs)[source]#
Bases:
HFTransformers
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.VLLM(model_name, chat_prompt_template=AUTO, system_prompt=AUTO, revision=None, trust_remote_code=False, device=None, dtype=None, quantization=None, swap_space=1, cache_folder_path=None, **kwargs)[source]#
Bases:
HFTransformers
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.Petals(model_name, chat_prompt_template=AUTO, system_prompt=AUTO, revision=None, trust_remote_code=False, device=None, dtype=None, adapter_name=None, cache_folder_path=None, **kwargs)[source]#
Bases:
HFTransformers
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=True, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.HFAPIEndpoint(endpoint, model_name, chat_prompt_template=AUTO, system_prompt=AUTO, token=None, revision=None, trust_remote_code=False, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases:
HFTransformers
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.Together(model_name, chat_prompt_template=AUTO, system_prompt=AUTO, api_key=None, max_context_length=None, tokenizer_model_name=None, tokenizer_revision=None, tokenizer_trust_remote_code=False, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases:
LLMAPI
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.MistralAI(model_name, api_key=None, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases:
LLMAPI
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.Anthropic(model_name, api_key=None, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases:
LiteLLM
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.Cohere(model_name, api_key=None, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases:
LiteLLM
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.AI21(model_name, api_key=None, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases:
LiteLLM
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.Bedrock(model_name, aws_access_key_id=None, aws_secret_access_key=None, aws_region_name=None, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases:
LiteLLM
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.GoogleAIStudio(model_name, api_key=None, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases:
LiteLLM
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.VertexAI(model_name, vertex_project=None, vertex_location=None, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases:
LiteLLM
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.ParallelLLM(*llms)[source]#
Bases:
_ParallelCachable, LLM
Creates an LLM that will run multiple LLMs in parallel. See the Running Models in Parallel page for more details.
- Parameters:
*llms (LLM) – The LLMs to run in parallel.