Port RASampler functionality to iterable datasets #6018

Open
@pmeier

Our current classification references use the "Repeated Augment Sampler" (RASampler) from #5051:

class RASampler(torch.utils.data.Sampler):

After the revamp we will have iterable-style rather than map-style datasets, so samplers will no longer be supported.

Given that the RASampler increases accuracy, we need to support the same functionality going forward. It can probably be achieved by appending a custom RepeatedAugmentIterDataPipe to the dataset graph, but we need to make sure it interacts correctly with shuffling and sharding.
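To make the ordering concern concrete, here is a minimal sketch of the idea in plain Python, not an actual datapipe. The function name `repeated_augment_shard` and its parameters are hypothetical, and it assumes the source has already been shuffled with the same seed on every rank. Each sample is repeated first and the repeated stream is then sharded with a stride, so the copies of one sample tend to land on different ranks and receive different random augmentations. Truncating the epoch back to its original length, as the sampler does, is left out.

```python
import itertools
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def repeated_augment_shard(
    source: Iterable[T],
    *,
    num_repeats: int = 3,
    rank: int = 0,
    world_size: int = 1,
) -> Iterator[T]:
    """Hypothetical helper: repeat each sample, then shard by stride.

    Assumes `source` yields samples in the same (already shuffled) order
    on every rank. Shuffling after this step or sharding before it would
    change which copies each rank sees.
    """
    # Repeat every sample in place: a, a, a, b, b, b, ...
    repeated = (sample for sample in source for _ in range(num_repeats))
    # Strided sharding: rank r takes positions r, r + world_size, ...
    return itertools.islice(repeated, rank, None, world_size)


if __name__ == "__main__":
    # Two ranks, four samples, three repeats each.
    for rank in range(2):
        shard = list(
            repeated_augment_shard(range(4), num_repeats=3, rank=rank, world_size=2)
        )
        print(rank, shard)
```

With two ranks, the three copies of each sample are split between the ranks rather than all going to one. A real RepeatedAugmentIterDataPipe would need the same guarantee from whatever shuffling and sharding steps sit around it in the graph.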

cc @pmeier @YosuaMichael @datumbox @vfdev-5 @bjuncek
