RFC: Change the default behavior from "load all data at once" to streaming in the high-level API
Context
I propose changing the default behavior from "load all data at once" to streaming for high-level API backtests. chunk_size=65536 seems to work well.
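For concreteness, a minimal sketch of what the proposal might look like from the user's side of the high-level BacktestNode API. The placement and name of the chunking parameter (shown here as chunk_size on BacktestRunConfig), the catalog layout, and the venue settings are assumptions for illustration, not the confirmed API:

```python
# Sketch only: the chunking parameter's name/placement is an assumption,
# not the confirmed NautilusTrader API.
from nautilus_trader.backtest.node import BacktestNode
from nautilus_trader.config import (
    BacktestDataConfig,
    BacktestRunConfig,
    BacktestVenueConfig,
)

data = BacktestDataConfig(
    catalog_path="./catalog",  # assumed local Parquet data catalog
    data_cls="nautilus_trader.model.data:QuoteTick",
    instrument_id="EUR/USD.SIM",
)

venue = BacktestVenueConfig(
    name="SIM",
    oms_type="NETTING",
    account_type="MARGIN",
    base_currency="USD",
    starting_balances=["1_000_000 USD"],
)

run = BacktestRunConfig(
    venues=[venue],
    data=[data],
    chunk_size=65_536,  # proposed default; today streaming is opt-in (assumed parameter)
)

node = BacktestNode(configs=[run])
results = node.run()
```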
Considerations
This is almost entirely positive for users processing large amounts of data at once, with almost no performance hit for users who process smaller amounts of data or have enough RAM available for the backtest.
I also propose making this change while NT is still in beta, since it is a breaking API change.
From a performance perspective, what you outline here makes sense.
The reason we don't set a chunk_size by default for streaming is that this constrains the data types which can be used to the built-in ones available through Rust and the DataFusion backend:
OrderBookDelta
OrderBookDepth10
QuoteTick
TradeTick
Bar
There are users who need other custom data types to be available, and streaming is a memory optimization which can be discovered as a user gains experience with the platform and reads the documentation.
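To make that distinction concrete, here is an illustrative custom data type of the kind referred to above. The class name and the base-class contract (ts_event/ts_init properties on nautilus_trader.core.data.Data) are assumptions for illustration only:

```python
# Illustrative only: a user-defined data type outside the built-in set above.
# Such types currently have to go through the "load all at once" path, since
# the Rust/DataFusion streaming backend only handles the built-in types.
from nautilus_trader.core.data import Data


class NewsEvent(Data):  # hypothetical custom type, not part of NautilusTrader
    def __init__(self, headline: str, sentiment: float, ts_event: int, ts_init: int) -> None:
        self._headline = headline
        self._sentiment = sentiment
        self._ts_event = ts_event
        self._ts_init = ts_init

    @property
    def ts_event(self) -> int:  # UNIX nanoseconds when the event occurred
        return self._ts_event

    @property
    def ts_init(self) -> int:  # UNIX nanoseconds when the object was created
        return self._ts_init
```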
I think that changing the default has only marginal benefits and we're better off leaving things as they are until the Rust port is complete, at which point all data types including custom data will be available - and it would definitely make sense to stream by default for the reasons you mention.