Change of default behavior from "load all data at once" to streaming in high-level API #2565

stastnypremysl · 2025-04-27T22:06:38Z

RFC: Change of default behavior from "load all data at once" to streaming in high-level API

Context

I propose changing the default behavior from "load all data at once" to streaming for high-level API backtests. chunk_size=65536 seems to work well.
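To illustrate the difference in general terms, here is a minimal sketch using only the Python standard library. The names `load_all` and `stream_chunks` are hypothetical and are not part of NT; only the `chunk_size` value of 65536 comes from the proposal above. It shows the idea of bounding memory by chunk size, not how the backtest node actually reads data.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def load_all(records: Iterable[int]) -> List[int]:
    """Materialize every record in memory before processing (hypothetical helper)."""
    return list(records)


def stream_chunks(records: Iterable[int], chunk_size: int = 65536) -> Iterator[List[int]]:
    """Yield successive lists of at most chunk_size records (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    iterator = iter(records)
    while True:
        # Only one chunk is held in memory at a time.
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    total = 0
    largest_chunk = 0
    for chunk in stream_chunks(range(200_000)):
        total += sum(chunk)
        largest_chunk = max(largest_chunk, len(chunk))
    # Both approaches see the same data; streaming never holds more than one chunk.
    print(total == sum(load_all(range(200_000))), largest_chunk)
```

With streaming, peak memory is bounded by the chunk size rather than by the size of the whole data set, which is why large runs benefit while small runs see little difference.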

Considerations

There are almost only positives for users processing large amounts of data at once, and almost no performance hit for users who process smaller amounts of data or have enough RAM available for the backtest.

I also propose making this change while NT is still in beta, since it is a breaking API change for NT.

cjdsellers · 2025-05-02T00:01:28Z

Hi @stastnypremysl

From a performance perspective, what you outline here makes sense.

The reason we don't set a chunk_size by default for streaming is that this then constrains the data types which can be used to the built-in ones available through Rust and the DataFusion backend:

OrderBookDelta
OrderBookDepth10
QuoteTick
TradeTick
Bar

There are users who need other custom data types to be available, and streaming is a memory optimization which can be discovered as a user gains experience with the platform and reads the documentation.

I think that changing the default has only marginal benefits and we're better off leaving things as they are until the Rust port is complete, at which point all data types including custom data will be available - and it would definitely make sense to stream by default for the reasons you mention.
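The trade-off described above could be expressed as a small guard on the user side. This is only a sketch under stated assumptions: `STREAMABLE_TYPES` and `pick_chunk_size` are hypothetical names, the type names are plain strings copied from the list above rather than the actual NT classes, and returning `None` is assumed here to mean "do not stream".

```python
from typing import Iterable, Optional

# Built-in data types listed in the comment above as usable with streaming.
STREAMABLE_TYPES = frozenset(
    {"OrderBookDelta", "OrderBookDepth10", "QuoteTick", "TradeTick", "Bar"}
)


def pick_chunk_size(data_type_names: Iterable[str], preferred: int = 65536) -> Optional[int]:
    """Return a chunk size only if every requested type can be streamed.

    Hypothetical helper: returns None (load everything at once) as soon as
    any custom data type is requested.
    """
    names = set(data_type_names)
    if names <= STREAMABLE_TYPES:
        return preferred
    return None
```

For example, `pick_chunk_size(["QuoteTick", "Bar"])` yields the preferred chunk size, while adding a custom type name to the list yields `None`.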

stastnypremysl · 2025-05-02T12:21:47Z

Hi, @cjdsellers.

Thanks for your answer.

After reading your argument, I fully agree.

stastnypremysl added the RFC (a request for comment) label on Apr 27, 2025
