Fix bug in rsync's delete_missing #1596

Merged
merged 1 commit into from
May 3, 2024

GabrielBianconi
Contributor

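rsync crashes with an IndexError when delete_missing=True and there are no files to delete: fs.rm is called with an empty list, and GenericFileSystem._rm then indexes urls[0].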


IndexError                                Traceback (most recent call last)
Cell In[7], line 2
      1 s3_path = "s3://tensorzero-etl-online/data/organization_id=af98d0df-163b-4024-a7fd-7405c86215a0/workflow_id=98ecbc65-e853-4974-86e8-e9a7a9ba07d9/inference/function_id=0e1dc437-3bbe-49ee-9a63-36254a274c08/"
----> 2 read_from_s3_with_caching(s3_path)

Cell In[5], line 13
     10 print(dest)
     12 print(f"rsync...\n\t{source}\n\t->\n\t{dest})")
---> 13 fsspec.generic.rsync(s3_path, dest, delete_missing=True)
     15 print("Loading Parquet files...")
     17 parquet_paths = glob.glob(
     18     "**/*.parquet", root_dir=dest, recursive=True
     19 )

File /usr/local/lib/python3.10/dist-packages/fsspec/generic.py:139, in rsync(source, destination, delete_missing, source_field, dest_field, update_cond, inst_kwargs, fs, **kwargs)
    137 logger.debug(f"{len(to_delete)} files to delete")
    138 if delete_missing:
--> 139     fs.rm(to_delete)

File /usr/local/lib/python3.10/dist-packages/fsspec/asyn.py:118, in sync_wrapper.<locals>.wrapper(*args, **kwargs)
    115 @functools.wraps(func)
    116 def wrapper(*args, **kwargs):
    117     self = obj or args[0]
--> 118     return sync(self.loop, func, *args, **kwargs)

File /usr/local/lib/python3.10/dist-packages/fsspec/asyn.py:103, in sync(loop, func, timeout, *args, **kwargs)
    101     raise FSTimeoutError from return_result
    102 elif isinstance(return_result, BaseException):
--> 103     raise return_result
    104 else:
    105     return return_result

File /usr/local/lib/python3.10/dist-packages/fsspec/asyn.py:56, in _runner(event, coro, result, timeout)
     54     coro = asyncio.wait_for(coro, timeout=timeout)
     55 try:
---> 56     result[0] = await coro
     57 except Exception as ex:
     58     result[0] = ex

File /usr/local/lib/python3.10/dist-packages/fsspec/generic.py:256, in GenericFileSystem._rm(self, url, **kwargs)
    254 if isinstance(urls, str):
    255     urls = [urls]
--> 256 fs = _resolve_fs(urls[0], self.method)
    257 if fs.async_impl:
    258     await fs._rm(urls, **kwargs)

IndexError: list index out of range
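
The failure mode in the traceback can be reproduced without fsspec or S3. The following is a minimal sketch, not the merged change itself: `resolve_first` is a hypothetical stand-in for the `urls[0]` lookup in `GenericFileSystem._rm`, and `remove_missing` is a hypothetical helper showing one plausible guard, namely skipping the removal call when the list of files to delete is empty.

```python
def resolve_first(urls):
    """Hypothetical stand-in for the failing step: pick a filesystem from the first URL."""
    if isinstance(urls, str):
        urls = [urls]
    return urls[0]  # raises IndexError when urls is an empty list


def remove_missing(to_delete, delete_missing=True, rm=resolve_first):
    """Hypothetical guard: call rm only when there is something to delete.

    Returns the number of paths passed to rm.
    """
    if delete_missing and to_delete:
        rm(to_delete)
        return len(to_delete)
    return 0


if __name__ == "__main__":
    try:
        resolve_first([])  # unguarded call with nothing to delete
    except IndexError as exc:
        print(f"unguarded: IndexError: {exc}")

    print("guarded, empty list:", remove_missing([]))
    print("guarded, two paths:", remove_missing(["a.parquet", "b.parquet"]))
```

Guarding at the call site keeps an empty delete list a no-op; an alternative under the same assumptions would be an early return inside the removal routine itself.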

martindurant merged commit ecf7f86 into fsspec:master on May 3, 2024