datasets#
OutputDataset and OutputIterableDataset dataset objects are returned as outputs from Step objects under the output attribute.
Tip
You never need to construct a dataset object yourself. They are returned as output from Step objects. If you need to convert in-memory Python data or data in files to a DataDreamer dataset object, see the DataSource steps available in datadreamer.steps.
Accessing Columns#
To access a column on a dataset object, use the __getitem__ operator like so: step.output['column_name']. This returns an OutputDatasetColumn or OutputIterableDatasetColumn column object that can be passed as an input to the inputs argument of a Step.
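The column-access pattern described above can be sketched in plain Python. This is only an illustration of the shape of the interaction, not DataDreamer's implementation: ToyColumn, ToyDataset, ToySource, and ToyUppercase are hypothetical stand-ins invented for this sketch, and none of them are part of the datadreamer package.

```python
from dataclasses import dataclass


@dataclass
class ToyColumn:
    """Hypothetical stand-in for a column object (e.g. OutputDatasetColumn)."""
    step_name: str
    column_name: str
    values: list


@dataclass
class ToyDataset:
    """Hypothetical stand-in for a dataset object (e.g. OutputDataset)."""
    step_name: str
    columns: dict

    def __getitem__(self, column_name):
        # Indexing by column name yields a column object, not a bare list.
        if column_name not in self.columns:
            raise KeyError(column_name)
        return ToyColumn(self.step_name, column_name, self.columns[column_name])


class ToySource:
    """Hypothetical step that wraps in-memory data and exposes `output`."""

    def __init__(self, name, data):
        self.name = name
        self.output = ToyDataset(name, dict(data))


class ToyUppercase:
    """Hypothetical step that consumes a column through `inputs`."""

    def __init__(self, name, inputs):
        self.name = name
        text = inputs["text"]  # a ToyColumn produced by another step
        self.output = ToyDataset(name, {"text": [v.upper() for v in text.values]})


source = ToySource("source", {"question": ["what is a step?", "what is a column?"]})

# Select one column from the first step's output ...
question_column = source.output["question"]

# ... and hand it to the next step through its `inputs` argument.
upper = ToyUppercase("upper", inputs={"text": question_column})

print(type(question_column).__name__)
print(upper.output["text"].values)
```

The point of the sketch is that indexing the output gives back an object that still knows which step and column it came from, which is what lets it be wired into the inputs of a later step.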
- class datadreamer.datasets.OutputDataset(step, dataset, pickled=False)[source]#
Bases: OutputDatasetMixin
- class datadreamer.datasets.OutputDatasetColumn(step, dataset, pickled=False)[source]#
Bases: OutputDatasetColumnMixin, OutputDataset
- class datadreamer.datasets.OutputIterableDataset(step, dataset, pickled=False, total_num_rows=None)[source]#
Bases: OutputDatasetMixin
- property dataset: IterableDataset[source]#
The underlying Hugging Face IterableDataset.
- class datadreamer.datasets.OutputIterableDatasetColumn(step, dataset, pickled=False, total_num_rows=None)[source]#
Bases: OutputDatasetColumnMixin, OutputIterableDataset