Skip to content

Bats tests don't test for shell code correctness #74

Open
@h3xx

Description

@h3xx

The tests in tests/*.bats don't test for the correctness or usability of the shell code generated.

All of the tests merely compare the output against a cached copy of what it assumes the output should be.

This leads to inflexible tests, and more effort to make trivial changes.

An example of a better test methodology might be:

  • Ensure the script output is the same across all versions of PHP
    • I.e. Run in PHP 8.2, then ensure it matches the output from running in PHP 7.4, then ensure it matches PHP 8.0, etc.
  • Run shellcheck against the generated output
Excerpt of my original comment in #73 (~43 lines)

It is a little concerning that the test that failed depends on a byte-for-byte comparison against a copy of the code it tests.

I recall an article I read a long time ago about a dev who built a unit test that contained an exact copy of the code being tested, a particularly tricky algorithm. The dev said they were okay with updating it twice (once in the source and once in the unit test) stating it was worth 100% code coverage.

Reasons not to do this:

1. It doesn't actually test the code for correctness.

Since it's just a simple text diff, it doesn't matter how the code works, just that it's the same.

Furthermore, if there's a bug in both the source and the copy, the test won't show it - in fact, it'll pass! Instead of testing for the presence of bugs, this test actually presents a barrier to fixing the bug since it must now be fixed in multiple places.

(I'm sure there are other tests that test for correctness, but my point is that this one doesn't.)

2. It deliberately violates DRY (Don't Repeat Yourself).

The code has to exist in two places, and must be updated in both places.

I suppose this could be fixed by copying the source file into the cache before starting, but that IMO would be equally silly.

3. It's extra development overhead.

It's an extra step to making changes, and can be confusing to new developers (it confused me).

It can also muddy the intent of commits in the git history; it may be harder to grok the change intended by the dev, if the change is made n times instead of just one. It's the same reason I avoid committing auto-generated files alongside the files they were generated from - just having one source of truth is best.

Originally posted by @h3xx in #73 (comment)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions