Add support for data download during backtest and refactor of catalog #2652

faysou · 2025-05-19T21:20:21Z

Pull Request

NautilusTrader prioritizes correctness and reliability; please follow existing patterns for validation and testing.

Summary

Add a download function to the backtest node so that data can be downloaded and saved to a catalog easily.
This PR builds on previous PRs (see their comments for more details on each of them).

To reduce maintenance, I will only rebase this PR until it is merged.

Related Issues/PRs

#2574
#2594

Type of change

New feature (non-breaking)

Release notes

I added a concise entry to RELEASES.md that follows the existing conventions (when applicable)

Testing

Tested on the included example.

CLAassistant · 2025-05-23T18:39:58Z

All committers have signed the CLA.

cjdsellers

I still need to wrap my head around the changes to the node, looking great so far though!

cjdsellers · 2025-06-10T11:56:58Z

nautilus_trader/live/data_client.py

            if success_msg:
-                self._log.info(success_msg, success_color)
+                self._log.info(f"{success_msg} (Color: {success_color})")


Did we mean to change passing the LogColor enum here?
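
To make the concern concrete, here is a minimal sketch of the behavioral difference between the two calls. `StubLogger` and this two-member `LogColor` are hypothetical stand-ins written for illustration, not the real NautilusTrader classes; the sketch only assumes that `info` accepts an optional color argument, as the removed line suggests.

```python
from enum import Enum


class LogColor(Enum):
    # Stand-in enum for illustration only (not the real LogColor)
    NORMAL = 0
    GREEN = 1


class StubLogger:
    """Hypothetical logger that records (message, color) pairs."""

    def __init__(self) -> None:
        self.records: list[tuple[str, LogColor]] = []

    def info(self, message: str, color: LogColor = LogColor.NORMAL) -> None:
        self.records.append((message, color))


log = StubLogger()
success_msg = "Connected"
success_color = LogColor.GREEN

# Removed line: the color is passed to the logger as an argument
log.info(success_msg, success_color)

# Added line: the color is interpolated into the message text,
# so the logger falls back to its default color
log.info(f"{success_msg} (Color: {success_color})")
```

In this sketch the first record keeps `LogColor.GREEN`, while the second carries the default color and a longer message string.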

cjdsellers · 2025-06-10T11:58:47Z

nautilus_trader/persistence/catalog/parquet.py

-        kwargs : Any
-            Additional keyword arguments to be passed to the `write_chunk` method.
+        start : int, optional
+            The start timestamp for the data chunk.


Is it Unix nanoseconds? (We should document that; the param names are fine.)

Also, should it be inferred from the data ([0] and [-1]) to prevent mismatches? (There might be a reason for this that I haven't read yet.)

[edit] Is it because we're carrying the original start and end metadata for the query which produced the data? We could probably provide a little more detail about the behavior in the docstring, which we'll carry over to Rust, because when I saw start and end I immediately thought of the potential for mismatches.

cjdsellers · 2025-06-10T12:00:20Z

nautilus_trader/persistence/catalog/parquet.py

+        self.fs.mkdirs(directory, exist_ok=True)
+
+        if isinstance(data[0], Instrument):
+            # When writing an instrument for a given instrument_id, we don't want duplicates


We'll eventually want to change this to handling a series of instrument versions, to reflect changes to the definition over time. (Understood that this preserves existing functionality)

cjdsellers · 2025-06-10T12:07:16Z

nautilus_trader/persistence/catalog/parquet.py

    ) -> None:
        """
-        Consolidate several parquet files into a single file with data sorted in
-        ascending chronological order.
+        Reset the filenames of parquet files for a specific data class and instrument


Great docs here 👌

cjdsellers · 2025-06-10T12:11:16Z

nautilus_trader/persistence/funcs.py

@@ -34,14 +35,11 @@ def class_to_filename(cls: type) -> str:
    return name


-def urisafe_instrument_id(instrument_id: InstrumentId | str) -> str:
+def urisafe_instrument_id(instrument_id: InstrumentId | BarType | str) -> str:


Should we choose a different param name if it could also be a bar type?
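
For illustration, a sketch of what a more neutral name could look like. `urisafe_identifier` is a hypothetical name, and the body assumes the helper only needs to strip `/` from the string form of the identifier; neither is confirmed by the diff shown here.

```python
def urisafe_identifier(identifier: object) -> str:
    """Return a path-safe string for an instrument ID, bar type, or plain str.

    Assumption: removing "/" is sufficient to make the value path-safe.
    """
    return str(identifier).replace("/", "")


# Works the same for an instrument ID string and a bar type string
instrument_path = urisafe_identifier("BTC/USDT.BINANCE")
bar_type_path = urisafe_identifier("BTC/USDT.BINANCE-1-MINUTE-LAST-EXTERNAL")
```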

cjdsellers · 2025-06-10T12:14:15Z

nautilus_trader/data/messages.pyx

@@ -107,6 +107,8 @@ cdef class SubscribeData(DataCommand):
    ----------
    data_type : type
        The data type for the subscription.
+    instrument_id : InstrumentId


I'm a little on the fence about making instrument_id optional and adding it to the more general subscribe custom data message. Does it make downstream logic in the catalog easier?

faysou force-pushed the download_data branch 5 times, most recently from 94bee45 to d3ae70b Compare May 23, 2025 18:39

faysou force-pushed the download_data branch 11 times, most recently from 2ac3ff8 to 8d5b794 Compare May 29, 2025 09:48

faysou force-pushed the download_data branch 8 times, most recently from 161fd38 to 1e28b9b Compare June 1, 2025 13:20

faysou mentioned this pull request Jun 1, 2025

Migrate refactor of catalog to rust #2681

Open

3 tasks

faysou force-pushed the download_data branch 4 times, most recently from d5b6a85 to 6f8617e Compare June 4, 2025 11:18

faysou changed the title ~~Add download_data function to BacktestNode~~ Add support for data download during backtest and refactor of catalog Jun 4, 2025

faysou force-pushed the download_data branch 6 times, most recently from 8083561 to be93e2b Compare June 8, 2025 21:59

Add support for data download during backtest and refactor of catalog

ca3ff99

faysou force-pushed the download_data branch from be93e2b to ca3ff99 Compare June 9, 2025 08:38

cjdsellers reviewed Jun 10, 2025
