Description
Our current classification references use the "Repeated Augment Sampler" (RASampler) from #5051.
After the revamp, datasets will be iterable-style rather than map-style, so samplers are no longer supported.
Given that the RASampler increases accuracy, we need to support the same functionality going forward. It can probably be achieved by appending a custom RepeatedAugmentIterDataPipe to the dataset graph, but we need to make sure it works correctly with the shuffling and sharding.
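
To make the ordering concern concrete, here is a rough sketch of what such a datapipe might do. It is written as a plain Python iterable so it runs without any dependencies; the class name `RepeatedAugment` and its parameters (`num_repeats`, `rank`, `world_size`) are hypothetical and not an existing API. It assumes the upstream source is already shuffled with the same seed on every rank, and it does not cap the number of samples per epoch.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


class RepeatedAugment(Iterable[T]):
    """Hypothetical stand-in for a RepeatedAugmentIterDataPipe.

    Repeats every upstream sample `num_repeats` times and then keeps only
    the repeats that belong to this rank (round-robin sharding).
    """

    def __init__(
        self,
        source: Iterable[T],
        num_repeats: int = 3,
        rank: int = 0,
        world_size: int = 1,
    ) -> None:
        # Assumption: `source` yields the same (shuffled) order on all ranks.
        self.source = source
        self.num_repeats = num_repeats
        self.rank = rank
        self.world_size = world_size

    def __iter__(self) -> Iterator[T]:
        position = 0  # index into the repeated stream, shared by all ranks
        for sample in self.source:
            for _ in range(self.num_repeats):
                # Shard AFTER repeating, so copies of one sample can be
                # spread across ranks instead of staying on a single rank.
                if position % self.world_size == self.rank:
                    yield sample
                position += 1


if __name__ == "__main__":
    for rank in range(2):
        print(rank, list(RepeatedAugment(range(4), num_repeats=3, rank=rank, world_size=2)))
```

The point of the sketch is the order of operations: shuffle first, then repeat, then shard. If sharding happened before the repeat, all copies of a sample would end up on the same rank. The random augmentation itself would be applied further down the graph, once per yielded copy.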