datasets

Step objects return their results as OutputDataset or OutputIterableDataset objects, available under the step's output attribute.

To access a column on a dataset object, index it by column name with the __getitem__ operator, for example step.output['column_name']. This returns an OutputDatasetColumn or OutputIterableDatasetColumn object that can be passed to the inputs argument of a Step.
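The pattern described above can be sketched in plain Python. This is a minimal illustration, not DataDreamer's implementation: ToyStep, ToyDataset, and ToyColumn are hypothetical stand-ins for Step, OutputDataset, and OutputDatasetColumn, and the toy only handles string keys. It shows how indexing an output by column name yields a column object that remembers its source step and can be wired into another step's inputs.

```python
from typing import Any, Dict, List, Optional


class ToyColumn:
    """Hypothetical stand-in for OutputDatasetColumn: one column plus its source step."""

    def __init__(self, step: "ToyStep", name: str, values: List[Any]) -> None:
        self.step = step
        self.column_names = [name]
        self.values = values


class ToyDataset:
    """Hypothetical stand-in for OutputDataset (string keys only)."""

    def __init__(self, step: "ToyStep", data: Dict[str, List[Any]]) -> None:
        self.step = step
        self._data = data

    @property
    def column_names(self) -> List[str]:
        return list(self._data)

    @property
    def num_rows(self) -> int:
        # All columns are assumed to have the same length.
        return len(next(iter(self._data.values()), []))

    def __getitem__(self, key: str) -> ToyColumn:
        # Indexing by column name returns a column object bound to the same step.
        return ToyColumn(self.step, key, self._data[key])


class ToyStep:
    """Hypothetical stand-in for Step: takes column objects as inputs, exposes .output."""

    def __init__(
        self,
        name: str,
        inputs: Optional[Dict[str, ToyColumn]] = None,
        data: Optional[Dict[str, List[Any]]] = None,
    ) -> None:
        self.name = name
        self.inputs = inputs or {}
        if data is None:
            # This toy step simply copies its input columns to its output.
            data = {key: list(col.values) for key, col in self.inputs.items()}
        self.output = ToyDataset(self, data)


source = ToyStep("source", data={"question": ["2+2?", "3+3?"], "answer": ["4", "6"]})
questions = source.output["question"]  # a column object, not a plain list
downstream = ToyStep("downstream", inputs={"prompts": questions})

print(questions.step.name)
print(downstream.output.column_names, downstream.output.num_rows)
```

Because the column object carries a reference to the step that produced it, a downstream step can trace where each input came from without the caller passing that information separately.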

Tip

You never need to construct a dataset object yourself. They are returned as output from Step objects. If you need to convert in-memory Python data or data in files to a DataDreamer dataset object, see the DataSource steps available in datadreamer.steps.

class datadreamer.datasets.OutputDataset(step, dataset, pickled=False)

Bases: OutputDatasetMixin

property dataset: Dataset
property num_rows: int
__getitem__(key)
Return type:

Any

property column_names: list[str]
property num_columns: int
property step: Step

class datadreamer.datasets.OutputDatasetColumn(step, dataset, pickled=False)

Bases: OutputDatasetColumnMixin, OutputDataset

class datadreamer.datasets.OutputIterableDataset(step, dataset, pickled=False, total_num_rows=None)

Bases: OutputDatasetMixin

property dataset: IterableDataset
property num_rows: None | int
__getitem__(key)
Return type:

Any
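
Since num_rows on an OutputIterableDataset is typed None | int, calling code should be prepared for a missing row count (presumably when no total_num_rows was supplied; that reading is an assumption based on the signature above). A minimal sketch of handling both cases, using a hypothetical helper format_row_count that is not part of DataDreamer:

```python
from typing import Optional


def format_row_count(num_rows: Optional[int]) -> str:
    """Describe a row count that may be unknown (hypothetical helper)."""
    # Compare against None explicitly: 0 is a valid, known row count.
    if num_rows is None:
        return "unknown number of rows"
    return f"{num_rows} rows"


print(format_row_count(None))
print(format_row_count(1000))
```

Checking `is None` rather than relying on truthiness keeps an empty dataset (0 rows) distinct from one whose length is unknown.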

class datadreamer.datasets.OutputIterableDatasetColumn(step, dataset, pickled=False, total_num_rows=None)

Bases: OutputDatasetColumnMixin, OutputIterableDataset