datasets
Step objects return their results as OutputDataset or
OutputIterableDataset objects, available under the step's
output attribute.
Tip
You never need to construct a dataset object yourself. They are returned as
output from
Step objects. If you need to convert in-memory Python
data or data in files to a DataDreamer dataset object, see the
DataSource steps
available in datadreamer.steps.
Accessing Columns
To access a column on a dataset object, index it with the __getitem__ operator,
like so: step.output['column_name']. This returns an OutputDatasetColumn
or OutputIterableDatasetColumn column object that can be passed as an input
to the inputs argument of a Step.
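The access pattern above can be sketched in plain Python. This is only an illustrative model, not DataDreamer's implementation: ToyStep, ToyDataset, and ToyColumn are hypothetical stand-ins for Step, OutputDataset, and OutputDatasetColumn, and the "question" and "prompts" names are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyColumn:
    """Hypothetical stand-in for a column object (e.g. OutputDatasetColumn)."""
    name: str
    values: list


@dataclass
class ToyDataset:
    """Hypothetical stand-in for a dataset object (e.g. OutputDataset)."""
    columns: dict

    def __getitem__(self, column_name: str) -> ToyColumn:
        # Indexing by column name returns a column object, not a plain list.
        return ToyColumn(column_name, self.columns[column_name])


@dataclass
class ToyStep:
    """Hypothetical stand-in for a Step: takes `inputs`, exposes `output`."""
    name: str
    inputs: Optional[dict] = None
    output: Optional[ToyDataset] = None


# A first step whose output dataset has a single "question" column.
first = ToyStep(
    "first",
    output=ToyDataset({"question": ["What is 2 + 2?", "Name a prime number."]}),
)

# Select the column with square brackets, as in step.output['column_name'].
question_column = first.output["question"]

# Pass the column object to the next step through its inputs argument.
second = ToyStep("second", inputs={"prompts": question_column})
```

The point of the sketch is that indexing yields a column object that still knows which column it refers to, which is why it can be handed directly to another step rather than copied out as raw values.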
- class datadreamer.datasets.OutputDataset(step, dataset, pickled=False)
  Bases: OutputDatasetMixin
- class datadreamer.datasets.OutputDatasetColumn(step, dataset, pickled=False)
  Bases: OutputDatasetColumnMixin, OutputDataset
- class datadreamer.datasets.OutputIterableDataset(step, dataset, pickled=False, total_num_rows=None)
  Bases: OutputDatasetMixin
  - property dataset: IterableDataset
    The underlying Hugging Face IterableDataset.
- class datadreamer.datasets.OutputIterableDatasetColumn(step, dataset, pickled=False, total_num_rows=None)
  Bases: OutputDatasetColumnMixin, OutputIterableDataset