-
Notifications
You must be signed in to change notification settings - Fork 1k
Refine Rust catalog file operations #2454
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
dbda861
to
e7ef66d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many thanks for all this work 👌
ONE_MIN_BID = BarSpecification(1, BarAggregation.MINUTE, PriceType.BID) | ||
AUDUSD_1_MIN_BID = BarType(AUDUSD_SIM, ONE_MIN_BID) | ||
|
||
bar1 = Bar( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The TestDataStubs
could be useful here for stub bars:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need some py03 types instead of standard ones here, I could add a similar stubs file for pyo3 though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, in that case the way you've done it is the easiest path for now then.
@@ -3480,6 +3486,48 @@ class PostgresCacheDatabase: | |||
def update_order(self, order: object) -> None: ... | |||
def update_account(self, account: Account) -> None: ... | |||
|
|||
|
|||
class PyParquetDataCatalogV2: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to prefix with Py
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I'll do that, I used an LLM for this and didn't notice
} | ||
} | ||
let mut combined = Vec::with_capacity(existing_batches.len() + batches.len()); | ||
let batches: Vec<RecordBatch> = batches.to_vec(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice simplification.
self.inner | ||
.write_to_parquet(data, None, None, None, write_mode) | ||
.map(|path| path.to_string_lossy().to_string()) | ||
.map_err(|e| PyIOError::new_err(format!("Failed to write bars: {}", e))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can standardize on format strings here, rather than interpolation of e
outside the string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I'll do that
@@ -294,42 +288,47 @@ impl ParquetDataCatalog { | |||
start: Option<UnixNanos>, | |||
end: Option<UnixNanos>, | |||
where_clause: Option<&str>, | |||
) -> datafusion::error::Result<QueryResult> | |||
) -> Result<QueryResult> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should leave all non built-in Result
types qualified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was to make it compatible with the calling function that uses anyhow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try again to see if it works when keeping the data fusion result in the signature
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to keep the anyhow result as standard, the type of the error is not lost in the anyhow result, it's a box dyn error that can be converted back to whatever is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the actual result type changed then thats fine, which is where using anyhow::Result
is clearer.
I've updated things above. It should be ready after the CI finishes. |
Pull Request
Refine rust catalog file operations
Made it work in python
Type of change
How has this change been tested?
Added python tests