Skip to content

Add tests for edn/read-string#910

Open
brandoncorrea wants to merge 6 commits into
jank-lang:mainfrom
brandoncorrea:edn/read-string
Open

Add tests for edn/read-string#910
brandoncorrea wants to merge 6 commits into
jank-lang:mainfrom
brandoncorrea:edn/read-string

Conversation

@brandoncorrea

@brandoncorrea brandoncorrea commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

This PR closes #908.

These tests attempt to cover all behaviors according to edn-format.

Many of these tests were inspired by ClojureScript's reader tests.

Additional tests were added to flesh out all behaviors of the EDN spec, as well as some edge cases, negative scenarios, dialect-specific divergences, and some EDN extensions that some dialects implement and others do not (reader-conditionalized where applicable).

Babashka, JVM, CLR, and CLJS have all been tested and pass locally. Phel and Basilisp have been left to fail where apparent gaps exist.

@jeaye

jeaye commented Jun 20, 2026

Copy link
Copy Markdown
Member

Thanks, Brandon!

@dgr What are your thoughts on the approach here?

@dgr dgr left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this is great. I would be careful about conditionalizing for other dialects. In some cases, some of these things seem like bugs and it's appropriate for the test to fail. Then leave it for the dialect authors to make the decision of what they want to do. If you conditionalize everything, it gives the dialect authors a false sense that everything is fine because the test passes and they don't review it.

Comment thread test/clojure/edn_test/read_string.cljc Outdated
Comment thread test/clojure/edn_test/read_string.cljc Outdated
@dgr

dgr commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator

@dgr What are your thoughts on the approach here?

I love this! Overall, this is quite well done and comprehensive.

Comment thread test/clojure/edn_test/read_string.cljc Outdated

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a comprehensive, cross-dialect test suite for clojure.edn/read-string to close #908 and to codify expected EDN-reader behavior (including dialect divergences and a few non-EDN extensions).

Changes:

  • Introduces a new .cljc test namespace covering edn/read-string across strings, numbers, collections, tags, comments/discard, and options.
  • Adds dialect-conditional assertions for known behavioral differences (JVM/CLR/CLJS/Babashka/Basilisp/Phel).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread test/clojure/edn_test/read_string.cljc Outdated
Comment thread test/clojure/edn_test/read_string.cljc Outdated
I believe this was what I originally intended to do, similar to the cases immediately above it.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@jeaye

jeaye commented Jun 23, 2026

Copy link
Copy Markdown
Member

@dgr What are your thoughts on the approach here?

I love this! Overall, this is quite well done and comprehensive.

I agree, the tests are comprehensive. I also appreciate Brandon's work in getting these written. I want to convey caution before we continue, though. Are we testing the right thing? This repo is a test suite for core functions, to ensure they exist and behave correctly.

I would imagine that the edn/read-string should just be checking that we can read strings. Instead, we are thoroughly testing the underlying EDN reader. Let's scale this forward: when we add read-json, will we then implement and own a comprehensive JSON reader (and writer) testing suite? This is way beyond the scope of the repo.

As an example of how we don't do this, in other cases.

  1. We don't test the underlying runtime, to ensure that calling fns with incorrect arg counts throws
  2. We don't test the underlying compiler, to ensure that forms like let, throw, and fn behave correctly
  3. We don't test the underlying runtime, to ensure that all possible code given to eval can be evaluated (see here)
  4. We don't test the underlying runtime, to ensure that threads created by future are actually running in parallel

Overall, we don't test the underlying compiler+runtime. This is a unit testing repo for core functions. If we continue down this path, this repo will grow to contain:

  1. Comprehensive unit tests for core functions
  2. Comprehensive EDN reader/writer tests (clojure.edn)
  3. Comprehensive JSON reader/writer tests (clojure.data.json)
  4. Comprehensive XML reader/writer tests (clojure.data.xml)
  5. Comprehensive CSV reader/writer tests (clojure.data.csv)
  6. And so on...

Keep in mind that we have a policy against LLM-generated code. Outside of generating this test code, we cannot reasonably expand the scope of this repo to contain all of these tests. I think this would be biting off far more than we can chew.

Brandon asked if clojure.edn/read-string should be allowed in this repo and we said yes. However, the tests I had in mind for it were very much along the lines of the eval tests, not a comprehensive test of the entire reader. I should have been more clear about that, out of respect for Brandon. In order to not have helpful folks feel like their time is wasted, this is something we need to be very clear on.

@brandoncorrea

Copy link
Copy Markdown
Contributor Author

@jeaye I'd be happy to scale these tests back a bit. I was originally hesitant to create a 900 LOC PR, but went for it since each test seemed like it carried its weight.

I think we could certainly remove the non-edn extension tests... since those are not part of the EDN spec. That should cut the size of the tests significantly by maybe 100 lines.

  • Octal Escapes
  • Octal, Hex, Radix Integers
  • Ratios
  • Namespaced Maps
  • Metadata

Many of the other tests I think we could go without altogether or collapse to just a handful of cases. A few candidates I see just skimming through:

  • Collapse some BigDecimal and Exponent cases
  • Overflows could probably be removed
  • Collapse Collections + Key Uniqueness cases
  • Many of the Whitespace tests throughout could get collapsed

@jeaye

jeaye commented Jun 23, 2026

Copy link
Copy Markdown
Member

@jeaye I'd be happy to scale these tests back a bit. I was originally hesitant to create a 900 LOC PR, but went for it since each test seemed like it carried its weight.

Yeah, this is 30x larger than the median test file in the repo (~30 lines). That is generally an indicator that EITHER an LLM was used and there's serious over-testing OR we're being way too specific about something.

For clojure.edn/read-string, I think the main things to test will be along these lines:

  • Empty input => gives nil
  • Single valid input => gives value
  • Multiple valid inputs => gives first value?
  • Invalid input => throws
  • Unsafe reader macros are not evaluated (edn/read-string is supposed to be safe for untrusted inputs)

Then we can test each of the options to ensure their behavior works, but each of those only needs to test the happy path and any edge cases we can figure.

@dgr

dgr commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator

I think it's quite reasonable to test edn/read-string to determine if it complies with the EDN behavior. To see why this is the case, think about the opposite. If end/read-string read a simple form but failed on the bulk of the valid EDN strings that we threw at it, wouldn't people immediately say, "This test is a joke and doesn't really test anything." Yes, there are other cases where we're not exhaustive. Perhaps we should be testing more in those cases. To the extent that we're testing anything in this test suite, I think we should do a good job of it. Some of the things that we don't test are because it's extremely difficult to test them (e.g., that a future is actually running in parallel). But to the extent that simple is tests can be written to validate behavior, we should try to validate more rather than less, up to the point of, but not including, completely over-testing and attempting to validate bug-for-bug behavior in out-of-bounds corner cases. So, I could see us doing something similar with JSON or XML reading.

That said, perhaps this is a progressive thing, where we shoot for a minimum simple test and if somebody wants to go further, more power to them. Perhaps we don't require such great tests as @brandoncorrea delivered, but we gratefully accept them if somebody wants to write them.

Finally, I think this also goes to the point I made to you privately on Slack, @jeaye. In these non-core libraries, if they are based mostly on Clojure code that will be reused in multiple implementations and the behavior is what it is as long as core is implemented correctly, then we don't need exhaustive tests. But exhaustive tests are useful in detecting issues with a library that relies on a lot of native platform code that must be reimplemented to port the library. In that case, the exhaustive tests validate those portions of the lib that needed to be rewritten.

@jeaye jeaye left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, Brandon!

Comment thread test/clojure/edn_test/read_string.cljc
@jeaye

jeaye commented Jun 24, 2026

Copy link
Copy Markdown
Member

Following up publicly, after speaking privately with folks. We're going to move forward to merge Brandon's hard work, which intentionally grows the scope of the clojure-test-suite to now handle testing parsers for data formats. We're going to trim down some of the over-testing prior to merging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

clojure.edn/read-string

4 participants