llms#
LLM objects help run an LLM on prompts. All LLMs derive from the
LLM base class.
Tip
Instead of using run() directly, use a
step that takes an LLM as an args
argument, such as Prompt or
FewShotPrompt.
Efficient Generation Techniques#
Throughput#
DataDreamer provides efficient generation through a variety of techniques that can optimize throughput.
Adaptive Batch Sizing
For locally running LLMs, the maximum batch size that the LLM can handle before running out of memory depends on the amount of memory available on your system, but also on the maximum sequence length of the batch of inputs passed to the LLM, since longer inputs require more memory. Over many iterations, DataDreamer will automatically learn the maximum batch size that the LLM can handle for a given sequence length and will adaptively adjust the batch size to maximize throughput.
The maximum batch size that will ever be used is the batch_size argument passed
to the run() method. DataDreamer will try to find the largest batch
size that the LLM can handle that is less than or equal to batch_size. If
a batch size is too large, DataDreamer will automatically catch the out-of-memory
error, reduce the batch size, and remember the reduction for future iterations.
To disable adaptive batch sizing, you can pass adaptive_batch_size=False to the
run() method.
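The adaptive strategy can be sketched in plain Python. This is an illustrative toy, not DataDreamer's implementation: `generate_batch` is a hypothetical stand-in for an LLM call that runs out of memory on batches larger than 4, and the loop halves the batch size on failure and keeps the learned size for later batches.

```python
# Toy sketch of adaptive batch sizing (not DataDreamer's actual code).
class OutOfMemoryError(Exception):
    pass

def generate_batch(prompts):
    # Hypothetical LLM call: pretend batches larger than 4 exhaust memory.
    if len(prompts) > 4:
        raise OutOfMemoryError()
    return [f"output for: {p}" for p in prompts]

def run_with_adaptive_batch_size(prompts, batch_size):
    learned_batch_size = batch_size  # largest size known (so far) to fit
    outputs = []
    i = 0
    while i < len(prompts):
        batch = prompts[i : i + learned_batch_size]
        try:
            outputs.extend(generate_batch(batch))
            i += len(batch)
        except OutOfMemoryError:
            # Halve the batch size and retry; keep the reduction for
            # all future batches in this run.
            learned_batch_size = max(1, learned_batch_size // 2)
    return outputs, learned_batch_size

outputs, learned = run_with_adaptive_batch_size(
    [f"p{i}" for i in range(10)], batch_size=10
)
```

Here the run starts at the requested batch_size of 10, backs off twice after simulated out-of-memory errors, and settles on a batch size that fits.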
Batch Scheduling
To minimize the amount of padding the LLM has to process, DataDreamer will attempt to schedule batches so that all sequences in a batch have similar lengths.
To do this, DataDreamer reads a large buffer of prompts, sorts the prompts by
length, and then schedules batches of size batch_size from the sorted prompts.
To manually control the size of the buffer, you can pass a
batch_scheduler_buffer_size to the run() method.
To disable batch scheduling, you can set batch_scheduler_buffer_size equal to
batch_size.
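The scheduling idea can be sketched as follows. This is an illustrative sketch of the sort-by-length approach described above, not DataDreamer's actual scheduler:

```python
# Toy sketch of length-sorted batch scheduling: read a buffer of prompts,
# sort by length, then emit batches of batch_size from the sorted buffer.
def schedule_batches(prompts, batch_size, batch_scheduler_buffer_size):
    for start in range(0, len(prompts), batch_scheduler_buffer_size):
        buffer = sorted(prompts[start : start + batch_scheduler_buffer_size], key=len)
        for b in range(0, len(buffer), batch_size):
            yield buffer[b : b + batch_size]

batches = list(schedule_batches(
    ["a", "dddd", "bb", "ccc", "ffffff", "eeeee"],
    batch_size=2,
    batch_scheduler_buffer_size=6,
))
# Each batch now groups prompts of similar length, so little padding is needed.
```

Note that when the buffer size equals batch_size, each buffer is exactly one batch, so sorting within a buffer has no cross-batch effect; this is why setting batch_scheduler_buffer_size equal to batch_size disables scheduling.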
Robustness & Retries
For API-based LLMs, DataDreamer will attempt to retry failed requests. This can
be disabled via retry_on_fail=False.
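A retry loop of this kind typically looks like the sketch below. This is a generic exponential-backoff pattern, not DataDreamer's internals; `flaky_request` is a hypothetical API call that fails twice before succeeding.

```python
import time

# Generic retry helper with exponential backoff (illustrative only).
# Disabling retries (cf. retry_on_fail=False) corresponds to calling
# request_fn once and letting any error propagate.
def call_with_retries(request_fn, max_retries=3, base_delay=1.0):
    for attempt in range(max_retries + 1):
        try:
            return request_fn()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the original error
            time.sleep(base_delay * 2 ** attempt)  # wait longer each attempt

attempts = {"n": 0}
def flaky_request():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient API error")
    return "ok"

result = call_with_retries(flaky_request, base_delay=0.01)
```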
Running on Multiple GPUs#
See the Running Models on Multiple GPUs page.
Quantization#
See the Quantization page.
Caching#
LLMs internally perform caching to disk, so if you run the same prompt with the same generation settings multiple times, the LLM will only run the prompt once and then cache the results for future runs.
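The caching behavior can be pictured as a disk-backed lookup keyed on the prompt plus its generation settings. The sketch below is illustrative; DataDreamer's actual on-disk cache format may differ. `fake_llm` is a hypothetical stand-in for a real generation call.

```python
import hashlib
import json
import sqlite3

# Illustrative result cache keyed on (prompt, generation settings).
def cache_key(prompt, **generation_settings):
    payload = json.dumps({"prompt": prompt, **generation_settings}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

class DiskCache:
    def __init__(self, path=":memory:"):  # a real cache would use a file path
        self.db = sqlite3.connect(path)
        self.db.execute("CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value TEXT)")

    def get_or_generate(self, prompt, generate_fn, **settings):
        key = cache_key(prompt, **settings)
        row = self.db.execute("SELECT value FROM cache WHERE key = ?", (key,)).fetchone()
        if row:
            return row[0]  # cache hit: the LLM is not run again
        value = generate_fn(prompt)
        self.db.execute("INSERT INTO cache VALUES (?, ?)", (key, value))
        self.db.commit()
        return value

calls = {"n": 0}
def fake_llm(prompt):
    calls["n"] += 1
    return prompt.upper()

cache = DiskCache()
first = cache.get_or_generate("hello", fake_llm, temperature=1.0)
second = cache.get_or_generate("hello", fake_llm, temperature=1.0)  # cache hit
```

Because the settings are part of the key, re-running the same prompt with a different temperature (for example) would trigger a fresh generation rather than a cache hit.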
- class datadreamer.llms.LLM(cache_folder_path=None)[source]#
Bases: _Cachable
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- abstract get_max_context_length(max_new_tokens)[source]#
Gets the maximum context length for the model. When max_new_tokens is greater than 0, the maximum number of tokens that can be used for the prompt context is returned.
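The documented contract amounts to subtracting the reserved generation budget from the model's context window. A minimal sketch, assuming a hypothetical 4096-token context window (the real value depends on the concrete model):

```python
# Illustrative sketch of get_max_context_length's documented behavior:
# tokens available for the prompt = context window - tokens reserved
# for generation. The 4096-token window is a hypothetical example value.
def get_max_context_length(max_new_tokens, model_context_window=4096):
    return model_context_window - max_new_tokens
```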
- format_prompt(max_new_tokens=None, beg_instruction=None, in_context_examples=None, end_instruction=None, sep='\\n', min_in_context_examples=None, max_in_context_examples=None)[source]#
Formats a prompt for the LLM given instructions and in-context examples.
Prompt Format
The final prompt will be constructed as follows:
beg_instruction sep in_context_example_1 sep in_context_example_2 sep ... sep in_context_example_n sep end_instruction
If
beg_instruction,in_context_examples, andend_instructionareNone, they will not be included in the prompt.If all of the
in_context_exampleswill not fit in the prompt (accounting for the possiblemax_new_tokensthat may be generated) the prompt will be constructed with as many in-context examples that will fit.If
min_in_context_examplesandmax_in_context_examplesare set, those constraints will be enforced.- Parameters:
max_new_tokens (Optional[int], default: None) – The maximum number of tokens that can be generated.
beg_instruction (Optional[str], default: None) – The instruction at the beginning of the prompt.
in_context_examples (Optional[list[str]], default: None) – The in-context examples to include in the prompt.
end_instruction (Optional[str], default: None) – The instruction at the end of the prompt.
sep (default: '\\n') – The separator to use between the instructions and in-context examples.
min_in_context_examples (Optional[int], default: None) – The minimum number of in-context examples to include in the prompt.
max_in_context_examples (Optional[int], default: None) – The maximum number of in-context examples to include in the prompt.
- Return type:
str
- Returns:
The formatted prompt.
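The prompt layout described above can be sketched in a few lines. This is an illustrative reimplementation of only the concatenation logic; the real format_prompt additionally truncates in-context examples to fit the context window, which is omitted here for brevity.

```python
# Illustrative sketch of the documented prompt layout:
# beg_instruction sep example_1 sep ... sep example_n sep end_instruction
def format_prompt(beg_instruction=None, in_context_examples=None,
                  end_instruction=None, sep="\n"):
    parts = []
    if beg_instruction is not None:
        parts.append(beg_instruction)
    parts.extend(in_context_examples or [])  # None sections are simply skipped
    if end_instruction is not None:
        parts.append(end_instruction)
    return sep.join(parts)

prompt = format_prompt(
    beg_instruction="Classify the sentiment:",
    in_context_examples=["I loved it => positive", "Awful. => negative"],
    end_instruction="It was fine. =>",
)
```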
- abstract run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.OpenAI(model_name, system_prompt=None, organization=None, api_key=None, base_url=None, api_version=None, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases: LLM
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.OpenAIAssistant(model_name, system_prompt=None, tools=None, organization=None, api_key=None, base_url=None, api_version=None, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases: OpenAI
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- class datadreamer.llms.HFTransformers(model_name, chat_prompt_template=AUTO, system_prompt=AUTO, revision=None, trust_remote_code=False, device=None, device_map=None, dtype=None, quantization_config=None, adapter_name=None, adapter_kwargs=None, cache_folder_path=None, **kwargs)[source]#
Bases: LLM
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=True, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.CTransformers(model_name, model_type=None, model_file=None, max_context_length=None, chat_prompt_template=AUTO, system_prompt=AUTO, revision=None, threads=None, gpu_layers=0, cache_folder_path=None, **kwargs)[source]#
Bases: HFTransformers
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.VLLM(model_name, chat_prompt_template=AUTO, system_prompt=AUTO, revision=None, trust_remote_code=False, device=None, dtype=None, quantization=None, swap_space=1, cache_folder_path=None, **kwargs)[source]#
Bases: HFTransformers
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.Petals(model_name, chat_prompt_template=AUTO, system_prompt=AUTO, revision=None, trust_remote_code=False, device=None, dtype=None, adapter_name=None, cache_folder_path=None, **kwargs)[source]#
Bases: HFTransformers
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=True, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.HFAPIEndpoint(endpoint, model_name, chat_prompt_template=AUTO, system_prompt=AUTO, token=None, revision=None, trust_remote_code=False, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases: HFTransformers
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.Together(model_name, chat_prompt_template=AUTO, system_prompt=AUTO, api_key=None, max_context_length=None, tokenizer_model_name=None, tokenizer_revision=None, tokenizer_trust_remote_code=False, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases: LLMAPI
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.MistralAI(model_name, api_key=None, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases: LLMAPI
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.Anthropic(model_name, api_key=None, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases: LiteLLM
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.Cohere(model_name, api_key=None, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases: LiteLLM
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.AI21(model_name, api_key=None, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases: LiteLLM
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.Bedrock(model_name, aws_access_key_id=None, aws_secret_access_key=None, aws_region_name=None, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases: LiteLLM
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.GoogleAIStudio(model_name, api_key=None, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases: LiteLLM
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.VertexAI(model_name, vertex_project=None, vertex_location=None, retry_on_fail=True, cache_folder_path=None, **kwargs)[source]#
Bases: LiteLLM
Base class for all LLMs.
- Parameters:
cache_folder_path (Optional[str], default: None) – The path to the cache folder. If None, the default cache folder for the DataDreamer session will be used.
- run(prompts, max_new_tokens=None, temperature=1.0, top_p=0.0, n=1, stop=None, repetition_penalty=None, logit_bias=None, batch_size=10, batch_scheduler_buffer_size=None, adaptive_batch_size=False, seed=None, progress_interval=60, force=False, cache_only=False, verbose=None, log_level=None, total_num_prompts=None, return_generator=False, **kwargs)[source]#
- class datadreamer.llms.ParallelLLM(*llms)[source]#
Bases: _ParallelCachable, LLM
Creates an LLM that will run multiple LLMs in parallel. See running models in parallel for more details.
- Parameters:
*llms (LLM) – The LLMs to run in parallel.