Skip to content

Lint comparison to "NA" and "NULL" #2130

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Bisaloo opened this issue Sep 9, 2023 · 6 comments
Closed

Lint comparison to "NA" and "NULL" #2130

Bisaloo opened this issue Sep 9, 2023 · 6 comments

Comments

@Bisaloo
Copy link
Collaborator

Bisaloo commented Sep 9, 2023

A pattern I sometimes see to "fix" the fact that comparison to NA and NULL don't return TRUE/FALSE is to convert them as character.

There are two situations. The first one sort of works but it's still a code smell and the second doesn't work at all.

First case: comparison with %in%

This code works as expected I guess but we should still direct users toward better alternatives such as anyNA():

test <- list(
  a = c(1, 2, 3),
  b = NULL,
  c = NA
)

"NA" %in% test
#> [1] TRUE
"NULL" %in% test
#> [1] TRUE

Created on 2023-09-09 with reprex v2.0.2

The last two lines should produce lints.

Second case: comparison with == / !=

This doesn't work at all but users might miss it because they are not testing it with cases where their variable is NA or NULL:

my_is_na <- function(x) x == "NA"

my_is_na("test")
#> [1] FALSE
my_is_na(NA)
#> [1] NA

my_is_null <- function(x) x == "NULL"

my_is_null("test")
#> [1] FALSE
my_is_null(NULL)
#> logical(0)

Created on 2023-09-09 with reprex v2.0.2

New linter

This could either be a new linter in the category "common mistakes", especially for the second case which doesn't work at all like expected.
Alternatively, the code with NA could be part of the equals_na_linter(). Not sure about the code with NULL in this scenario.

I expect these cases to be relatively simple so I'm happy to try and submit a PR for this.

@MEO265
Copy link
Contributor

MEO265 commented Sep 9, 2023

When working with databases it can often be necessary to test for "NULL" as a string, so linting "NULL" %in% test would be inconvenient.

@Bisaloo
Copy link
Collaborator Author

Bisaloo commented Sep 9, 2023

This linter, like many others, would be not be enabled by default. That would hopefully avoid problems like the situation you're describing.

@MichaelChirico
Copy link
Collaborator

I'm afraid the second case is out of scope for {lintr}... my guess would be comparison to the actual strings "NA" or "NULL" is the correct/intended behavior far more often than it's a mistake. If you encounter this enough in your code base I'd encourage writing a custom linter.

@MichaelChirico
Copy link
Collaborator

ditto for the first case as strings. a case like NA %in% x is covered by #2113, so I'll close.

@Bisaloo
Copy link
Collaborator Author

Bisaloo commented Sep 12, 2023

While I can easily understand how false positives could be generated, I confirm that all cases I have seen were incorrect usage. So I followed your suggestion. I went ahead and added this as a custom linter in our internal development package: epiverse-trace/etdev#7

For future reference in case anyone else is interested in this linter:

na_null_strings_linter <- function() {

  # Code mostly adapted from lintr::equals_na_linter() (MIT license)

  # nolint next: undesirable_operator_linter
  na_null_strings <- lintr:::xp_text_in_table(c(
    glue::double_quote(c("NA", "NULL")),
    glue::single_quote(c("NA", "NULL"))
  ))

  xpath <- glue::glue("
    //STR_CONST[{ na_null_strings }]
      /parent::expr
      /parent::expr[EQ or NE]
    |
    //SPECIAL[
      text() = '%in%' and
      following-sibling::expr/STR_CONST[{ na_null_strings }]
    ]
      /parent::expr
  ")

  lintr::Linter(function(source_expression) {
    if (!lintr::is_lint_level(source_expression, "expression")) {
      return(list())
    }

    xml <- source_expression$xml_parsed_content

    bad_expr <- xml2::xml_find_all(xml, xpath)

    lintr::xml_nodes_to_lints(
      bad_expr,
      source_expression,
      lint_message = paste(
        "Are you sure you want to compare to *the string* 'NA' or 'NULL'?",
        "To compare to the special values NA or NULL,",
        "please use `is.null()` or `is.na()` instead."
      ),
      type = "warning"
    )
  })
}

I initially tried to use the new make_linter_from_xpath() helper but the single / double quote issue was a bit annoying so I opted to write the function myself and use xp_text_in_table() for readability.

Any chance xp_text_in_table() could be exported by lintr, the same way other helpers (e.g., expect_lint()) are? I can open a new issue for this if you wish.

@MichaelChirico
Copy link
Collaborator

Any chance xp_text_in_table() could be exported by lintr, the same way other helpers (e.g., expect_lint()) are? I can open a new issue for this if you wish.

Yes, please file an issue. I have a copy of it in our package too... better to export. Side note: lintr::get_r_string() will be more robust for matching string values like R"{NULL}" (rare/completely absent though that may be).

Note that you can still use make_linter_from_xpath():

na_null_strings_linter <- local({
  # nolint next: undesirable_operator_linter
  na_null_strings <- lintr:::xp_text_in_table(c(
    glue::double_quote(c("NA", "NULL")),
    glue::single_quote(c("NA", "NULL"))
  ))

  make_linter_from_xpath(
    xpath = glue::glue("
      //STR_CONST[{ na_null_strings }]
        /parent::expr
        /parent::expr[EQ or NE]
      |
      //SPECIAL[
        text() = '%in%' and
        following-sibling::expr/STR_CONST[{ na_null_strings }]
      ]
        /parent::expr
    "),
    lint_message = paste(
      "Are you sure you want to compare to *the string* 'NA' or 'NULL'?",
      "To compare to the special values NA or NULL,",
      "please use `is.null()` or `is.na()` instead."
    )
  )
})

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants