reject out-of-range integers in JsonReader::ParseInt#16240
Conversation
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. |
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces range checking in JsonReader::ParseInt to prevent silent wrapping of integer values that do not fit the target IntType when parsing with nlohmann::json. It also adds a corresponding unit test to verify that out-of-range integer values are rejected. The reviewer noted that the current implementation unconditionally rejects parsing any signed integer representation into an unsigned IntType (even if the value is positive), and suggested a more robust range-checking logic that allows non-negative signed integers to be parsed into unsigned types.
| if (value.is_number_unsigned()) { | ||
| if (value.get<uint64_t>() > | ||
| static_cast<uint64_t>(std::numeric_limits<IntType>::max())) { | ||
| reader.Fail("Integer value out of range: " + value.dump()); | ||
| return 0; | ||
| } | ||
| } else if (!std::numeric_limits<IntType>::is_signed) { | ||
| reader.Fail("Integer value out of range: " + value.dump()); | ||
| return 0; | ||
| } else if (value.get<int64_t>() < | ||
| static_cast<int64_t>( | ||
| std::numeric_limits<IntType>::min()) || | ||
| value.get<int64_t>() > | ||
| static_cast<int64_t>( | ||
| std::numeric_limits<IntType>::max())) { | ||
| reader.Fail("Integer value out of range: " + value.dump()); | ||
| return 0; | ||
| } |
There was a problem hiding this comment.
The current implementation unconditionally rejects parsing any signed integer representation into an unsigned IntType (e.g., parsing a positive integer stored as a signed int64_t into uint32_t). In nlohmann::json, integers can be stored as signed int64_t even if they are positive (for example, if constructed from a signed variable or initializer list).
Instead of failing immediately when !std::numeric_limits<IntType>::is_signed is true on the signed branch, we should allow non-negative values and check if they fit within the range of the unsigned IntType.
if (value.is_number_unsigned()) {
if (value.get<uint64_t>() >
static_cast<uint64_t>(std::numeric_limits<IntType>::max())) {
reader.Fail("Integer value out of range: " + value.dump());
return 0;
}
} else {
int64_t val = value.get<int64_t>();
if (std::numeric_limits<IntType>::is_signed) {
if (val < static_cast<int64_t>(std::numeric_limits<IntType>::min()) ||
val > static_cast<int64_t>(std::numeric_limits<IntType>::max())) {
reader.Fail("Integer value out of range: " + value.dump());
return 0;
}
} else {
if (val < 0 ||
static_cast<uint64_t>(val) >
static_cast<uint64_t>(std::numeric_limits<IntType>::max())) {
reader.Fail("Integer value out of range: " + value.dump());
return 0;
}
}
}There was a problem hiding this comment.
Good catch. Reworked the signed branch so a non-negative value stored as int64_t decodes when it fits an unsigned IntType (e.g. version/totalDocuments as uint32_t), instead of rejecting every signed representation. Negatives into unsigned and genuinely out-of-range values still fail. Pushed in 5e2228f.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request adds range validation for integers parsed from JSON in JsonReader::ParseInt to prevent silent wrapping of out-of-range values, and includes a corresponding unit test. The reviewer suggested refactoring the validation logic to consolidate duplicated error-handling calls, which helps reduce template code bloat and improves maintainability.
| if (value.is_number_unsigned()) { | ||
| if (value.get<uint64_t>() > | ||
| static_cast<uint64_t>(std::numeric_limits<IntType>::max())) { | ||
| reader.Fail("Integer value out of range: " + value.dump()); | ||
| return 0; | ||
| } | ||
| } else { | ||
| // A non-negative value may still be stored as a signed `int64_t`, so | ||
| // accept it when it fits an unsigned `IntType` rather than rejecting | ||
| // every signed representation outright. | ||
| const int64_t val = value.get<int64_t>(); | ||
| if (std::numeric_limits<IntType>::is_signed) { | ||
| if (val < static_cast<int64_t>(std::numeric_limits<IntType>::min()) || | ||
| val > static_cast<int64_t>(std::numeric_limits<IntType>::max())) { | ||
| reader.Fail("Integer value out of range: " + value.dump()); | ||
| return 0; | ||
| } | ||
| } else if (val < 0 || | ||
| static_cast<uint64_t>(val) > | ||
| static_cast<uint64_t>( | ||
| std::numeric_limits<IntType>::max())) { | ||
| reader.Fail("Integer value out of range: " + value.dump()); | ||
| return 0; | ||
| } | ||
| } |
There was a problem hiding this comment.
The current implementation duplicates the error handling logic (reader.Fail(...) and return 0;) three times across different branches. Since ParseInt is a template function, this duplication is repeated for every instantiation of the template (e.g., for int32_t, uint32_t, int64_t, uint64_t), leading to unnecessary template code bloat and larger binary sizes.
We can refactor this by using a boolean flag out_of_range to track the validation status and perform a single error handling call at the end. This improves maintainability and reduces the compiled code size.
bool out_of_range = false;
if (value.is_number_unsigned()) {
if (value.get<uint64_t>() >
static_cast<uint64_t>(std::numeric_limits<IntType>::max())) {
out_of_range = true;
}
} else {
// A non-negative value may still be stored as a signed int64_t, so
// accept it when it fits an unsigned IntType rather than rejecting
// every signed representation outright.
const int64_t val = value.get<int64_t>();
if (std::numeric_limits<IntType>::is_signed) {
if (val < static_cast<int64_t>(std::numeric_limits<IntType>::min()) ||
val > static_cast<int64_t>(std::numeric_limits<IntType>::max())) {
out_of_range = true;
}
} else if (val < 0 ||
static_cast<uint64_t>(val) >
static_cast<uint64_t>(
std::numeric_limits<IntType>::max())) {
out_of_range = true;
}
}
if (out_of_range) {
reader.Fail("Integer value out of range: " + value.dump());
return 0;
}There was a problem hiding this comment.
Code Review
This pull request introduces range checking to JsonReader::ParseInt to prevent silent wrapping of out-of-range integer values parsed from JSON, and adds a corresponding unit test. The reviewer noted that the current bounds-checking logic could trigger compiler warnings (such as -Wconstant-conversion or -Woverflow) during template instantiation when IntType is unsigned, and provided a refactored implementation to safely calculate limits and simplify the code.
| if (value.is_number_unsigned()) { | ||
| if (value.get<uint64_t>() > | ||
| static_cast<uint64_t>(std::numeric_limits<IntType>::max())) { | ||
| reader.Fail("Integer value out of range: " + value.dump()); | ||
| return 0; | ||
| } | ||
| } else { | ||
| // A non-negative value may still be stored as a signed `int64_t`, so | ||
| // accept it when it fits an unsigned `IntType` rather than rejecting | ||
| // every signed representation outright. | ||
| const int64_t val = value.get<int64_t>(); | ||
| if (std::numeric_limits<IntType>::is_signed) { | ||
| if (val < static_cast<int64_t>(std::numeric_limits<IntType>::min()) || | ||
| val > static_cast<int64_t>(std::numeric_limits<IntType>::max())) { | ||
| reader.Fail("Integer value out of range: " + value.dump()); | ||
| return 0; | ||
| } | ||
| } else if (val < 0 || | ||
| static_cast<uint64_t>(val) > | ||
| static_cast<uint64_t>( | ||
| std::numeric_limits<IntType>::max())) { | ||
| reader.Fail("Integer value out of range: " + value.dump()); | ||
| return 0; | ||
| } | ||
| } |
There was a problem hiding this comment.
The current implementation of bounds checking can trigger compiler warnings (such as -Wconstant-conversion or -Woverflow) during template instantiation when IntType is unsigned. This happens because the compiler still compiles the signed branch where std::numeric_limits<IntType>::max() (which is uint64_t::max()) is cast to int64_t, resulting in an out-of-range constant conversion.
We can avoid these warnings and simplify the code by calculating the limits safely as int64_t and uint64_t based on the signedness of IntType, and then performing warning-free comparisons. This also eliminates the duplicated error-handling logic.
bool out_of_range = false;
if (value.is_number_unsigned()) {
out_of_range = value.get<uint64_t>() >
static_cast<uint64_t>(std::numeric_limits<IntType>::max());
} else {
// A non-negative value may still be stored as a signed int64_t, so
// accept it when it fits an unsigned IntType rather than rejecting
// every signed representation outright.
const int64_t val = value.get<int64_t>();
const int64_t min_limit =
std::numeric_limits<IntType>::is_signed
? static_cast<int64_t>(std::numeric_limits<IntType>::min())
: 0;
const uint64_t max_limit =
static_cast<uint64_t>(std::numeric_limits<IntType>::max());
if (std::numeric_limits<IntType>::is_signed) {
out_of_range = val < min_limit ||
(val > 0 && static_cast<uint64_t>(val) > max_limit);
} else {
out_of_range = val < 0 || static_cast<uint64_t>(val) > max_limit;
}
}
if (out_of_range) {
reader.Fail("Integer value out of range: " + value.dump());
return 0;
}
Repro: load a bundle whose
integerValueor timestampnanosis a raw (unquoted) JSON number outside the target type's range, e.g. ananosof5294967295.Cause:
JsonReader::ParseIntreturnsvalue.get<IntType>()on theis_number_integer()path with no bounds check, so5294967295wraps to999999999and slips past the downstreamnanos < 1e9validation; the string branch already rejects the same input viaabsl::SimpleAtoi, so the two encodings disagree.Fix: bounds-check the numeric branch against
IntTypebefore the cast so out-of-range numbers fail to decode, matching the string branch.