BUG: read_csv interpreting NA value as comment when NA contains comment string #38392
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,6 +8,7 @@ | |
import csv | ||
from io import BytesIO, StringIO | ||
|
||
import numpy as np | ||
import pytest | ||
|
||
from pandas.errors import ParserError | ||
|
@@ -314,3 +315,26 @@ def test_malformed_skipfooter(python_parser_only): | |
msg = "Expected 3 fields in line 4, saw 5" | ||
with pytest.raises(ParserError, match=msg): | ||
parser.read_csv(StringIO(data), header=1, comment="#", skipfooter=1) | ||
|
||
|
||
def test_comment_char_in_default_value(python_parser_only): | ||
# GH#34002 | ||
from io import StringIO | ||
|
||
|
||
data = ( | ||
"# this is a comment\n" | ||
"1,2,3,4\n" | ||
"1,2,3,4#inline comment\n" | ||
"1,2#,3,4\n" | ||
"1,2,#N/A,4\n" | ||
Review comment: Would it be better for the first row (not the commented one) to look like "col_1,col_2,col_3,col4\n", to avoid confusion with the values below it? Author reply: Good point, done. |
||
) | ||
result = python_parser_only.read_csv(StringIO(data), comment="#", na_values="#N/A") | ||
expected = DataFrame( | ||
{ | ||
"1": [1] * 3, | ||
"2": [2] * 3, | ||
"3": [3.0, np.nan, np.nan], | ||
"4": [4.0, np.nan, 4.0], | ||
} | ||
) | ||
tm.assert_frame_equal(result, expected) |
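
The expected frame in this test encodes one rule: a field that exactly matches an entry in `na_values` is kept as a missing-value token even though it contains the comment character, while any other `#` cuts off the rest of the line. The sketch below illustrates that rule in plain Python. It is not the pandas parser code; `strip_comment` and its parameters are hypothetical names chosen for this illustration, and it assumes the line has already been split on commas.

```python
def strip_comment(fields, comment="#", na_values=frozenset({"#N/A"})):
    """Drop everything from the comment character on, but keep NA tokens whole.

    Hypothetical helper for illustration only; not part of pandas.
    """
    kept = []
    for field in fields:
        # Fields without the comment character, or fields that are exactly
        # an NA token, pass through unchanged.
        if comment not in field or field in na_values:
            kept.append(field)
            continue
        # Otherwise keep the text before the comment character (if any)
        # and discard the remainder of the line.
        head = field[: field.find(comment)]
        if head:
            kept.append(head)
        break
    return kept


# The three data rows from the test above, already past the header.
rows = [
    "1,2,3,4#inline comment",
    "1,2#,3,4",
    "1,2,#N/A,4",
]
parsed = [strip_comment(row.split(",")) for row in rows]
for fields in parsed:
    print(fields)
```

In this sketch the second row is cut to two fields and the third row keeps `#N/A` in place. That lines up with the expected frame: the short row is padded with NaN in columns "3" and "4", and the `#N/A` token becomes NaN in column "3" while the trailing 4 survives.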