-
-
Notifications
You must be signed in to change notification settings - Fork 18.7k
ENH: Add check for character limmit #57103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 6 commits
20e357a
74e302f
74f4f81
d864912
babdf29
4768089
65284bb
41cad2b
5470464
2e79c61
91a4bee
62b8b54
bd6a975
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1326,6 +1326,13 @@ def _value_with_fmt( | |
fmt = "0" | ||
else: | ||
val = str(val) | ||
# GH#56954 | ||
if len(val) > 32767: | ||
warnings.warn( | ||
luke396 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
"String value too long, truncated to 32767 characters", | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This can potentially come from other data types, "String" may end up being confusing if that's the case. I think a more generic "Cell contents too long" is better. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I have a quick question regarding this: should we also explore validating other data types, like float or int? The current code seems to focus solely on validating the length for specific data types. Do we need to extend these checks to other types mentioned above? Alternatively, have we conducted similar checks elsewhere, or is there a best practice that should be adhered to in this context? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I think Excel's limitation is specifically for strings.
When you say elsewhere, are you thinking of outside the excel code? I do not know of any formal best practice. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I believe I grasp your perspective. I want to ensure we deliberate on whether we should also account for the length of other data types. For now, let's concentrate on checking the length specifically for strings in this pull request; that should suffice. |
||
UserWarning, | ||
stacklevel=find_stack_level(), | ||
) | ||
|
||
return val, fmt | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1385,6 +1385,17 @@ def test_to_excel_empty_frame(self, engine, ext): | |
expected = DataFrame() | ||
tm.assert_frame_equal(result, expected) | ||
|
||
def test_to_excel_raising_warning_when_cell_character_exceed_limit( | ||
self, path, engine | ||
): | ||
# GH#56954 | ||
df = DataFrame({"A": ["a" * 32768]}) | ||
msg = "String value too long, truncated to 32767 characters" | ||
with tm.assert_produces_warning( | ||
UserWarning, match=msg, raise_on_extra_warnings=False | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What extra warning is raised here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. From commit babdf29, the check https://github.com/pandas-dev/pandas/actions/runs/7678882560/job/20929191559 rise and check https://github.com/pandas-dev/pandas/actions/runs/7678882560/job/20929192274 rasie
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Gotcha. Yeah since these are external warnings this is fine for now |
||
): | ||
df.to_excel(path, engine=engine) | ||
|
||
|
||
class TestExcelWriterEngineTests: | ||
@pytest.mark.parametrize( | ||
|
Uh oh!
There was an error while loading. Please reload this page.