Raise an error if Kinetics400 dataset is empty #2903

Closed
@vfdev-5

🚀 Feature

Currently, constructing a dataset on an empty folder raises no error; it would be nice to raise one at construction time. Instead, the problem only surfaces later, e.g. calling len() on the dataset fails with an unrelated IndexError:

import torchvision
torchvision.__version__
> '0.8.0a0+e280f61'
dataset = torchvision.datasets.Kinetics400("/tmp/", frames_per_clip=10, step_between_clips=1, frame_rate=15)
len(dataset)
> Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/torchvision/torchvision/datasets/kinetics.py", line 70, in __len__
    return self.video_clips.num_clips()
  File "/torchvision/torchvision/datasets/video_utils.py", line 247, in num_clips
    return self.cumulative_sizes[-1]
IndexError: list index out of range
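
The IndexError above comes from taking the last element of an empty list. As a rough illustration only, here is a minimal stand-in, assuming a hypothetical `ClipIndex` class that mimics just the `cumulative_sizes[-1]` lookup shown in the traceback (it is not torchvision's `VideoClips`), together with a guarded variant that reports zero clips instead:

```python
from itertools import accumulate


class ClipIndex:
    """Hypothetical stand-in; mimics only the cumulative_sizes lookup from the traceback."""

    def __init__(self, clips_per_video):
        # Running total of clips, one entry per video; empty if there are no videos.
        self.cumulative_sizes = list(accumulate(clips_per_video))

    def num_clips_unguarded(self):
        # Same expression as in the traceback: raises IndexError on an empty list.
        return self.cumulative_sizes[-1]

    def num_clips(self):
        # Guarded variant: an empty index simply has zero clips.
        return self.cumulative_sizes[-1] if self.cumulative_sizes else 0
```

With a guard like this, `len(dataset)` could return 0, although raising a clear error at construction time is still what this issue asks for.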

Additional context

Because no error is raised for an empty dataset, the reference video_classification example fails inside the random sampler with a misleading error:

  File "/vision/torchvision/datasets/samplers/clip_sampler.py", line 175, in __iter__
    idxs_ = torch.cat(idxs)
RuntimeError: There were no tensor arguments to this function (e.g., you passed an empty list of Tensors), but no fallback function is registered for schema aten::_cat.  This usually means that this function requires a non-empty list of Tensors.  Available functions are [CPU, CUDA, QuantizedCPU, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].
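
A sketch of the kind of early check being requested, under stated assumptions: `find_videos` and `VIDEO_EXTENSIONS` are hypothetical names made up for illustration, not part of torchvision, and the extension list is an assumption. The idea is to fail while collecting files, with a message that names the folder, rather than later inside the sampler:

```python
import os
import tempfile

# Assumption: which extensions count as videos; adjust as needed.
VIDEO_EXTENSIONS = (".avi", ".mp4")


def find_videos(root, extensions=VIDEO_EXTENSIONS):
    """Collect video file paths under root, raising early if none are found."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.lower().endswith(extensions):
                paths.append(os.path.join(dirpath, name))
    if not paths:
        # Fail here, with the folder in the message, instead of deep in a sampler.
        raise RuntimeError(
            f"Found 0 video files in {root!r}. "
            f"Supported extensions: {', '.join(extensions)}"
        )
    return paths


# Usage: an empty folder fails immediately with a readable message.
with tempfile.TemporaryDirectory() as empty_dir:
    try:
        find_videos(empty_dir)
    except RuntimeError as err:
        print(err)
```

A dataset constructor could run a check like this before building its clip index, so that both `len(dataset)` and the sampler never see an empty dataset.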

cc @pmeier @bjuncek
