add get_bytes #5
    if rpv.len() > 0 && rpv[0] == b'~' {
        // convert to bool
        rpv = &rpv[1..];
        if value.bool() {
            tvalue.slice = "true";
            tvalue.info = INFO_TRUE;
        } else {
            tvalue.slice = "false";
            tvalue.info = INFO_FALSE;
        }
        value = &tvalue;
    }
    }
no changes happened, only code formatting from my IDE
|
The PR is simple enough for me to take a look. I think that your […] For example:

    let json = b"{\"hello\":\"w\xFFld\"}";
    let r = get_bytes(json, "hello").json().to_owned();
    let s = String::from_utf8(r.as_bytes().to_vec()).unwrap();

This will panic with an […] This is because the input […]

In your signature:

    pub fn get_bytes<'a>(json: &'a [u8], path: &'a str) -> Value<'a>

The return […] https://github.com/deankarn/gjson.rs/blob/get-bytes/src/util.rs#L11-L21 You can see that […]

The fix could possibly be one of the following: […]

If at all possible I would rather not expose […] |
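
The failure mode in the example above can be reproduced with the standard library alone. The following is a minimal sketch, not code from this PR: `checked_str` and `owned_conversion_fails` are hypothetical helpers, not part of gjson. The byte `0xFF` never occurs in valid UTF-8, so the checked conversions report an error rather than producing a string.

```rust
use std::str;

/// Hypothetical helper (not part of gjson): borrow the bytes as `&str`,
/// returning `None` instead of panicking when they are not valid UTF-8.
fn checked_str(bytes: &[u8]) -> Option<&str> {
    str::from_utf8(bytes).ok()
}

/// Hypothetical helper: reports whether the owned conversion used in the
/// example above (`String::from_utf8(..)`) would return an error, which is
/// what makes the trailing `.unwrap()` panic.
fn owned_conversion_fails(bytes: &[u8]) -> bool {
    String::from_utf8(bytes.to_vec()).is_err()
}
```

Calling `checked_str` on the `w\xFFld` input from the example yields `None`, while the same document with `world` in place of `w\xFFld` converts cleanly.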
|
@tidwall as with your original get function, this one also includes the same warning explicitly stating this fact: https://github.com/tidwall/gjson.rs/pull/5/files#diff-b1a35a68f14e696205874893c07fd24fdb88882b47c23cc0e0c80a30c7d53759R1060 In my case I already know that the bytes in question are valid UTF-8. If I want to read out two fields, it would be very inefficient to check validity multiple times. I think the warning alone should be more than sufficient, and that always checking for validity should not be forced upon an end user who has the option to do so beforehand.
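
The "validate once, read many fields" argument can be sketched roughly as follows. This is an illustration under stated assumptions, not gjson code: `lookup` is a hypothetical stand-in for a `&str`-based getter that only understands flat objects with unescaped string values, and `read_two` is likewise made up for the example.

```rust
use std::str::{self, Utf8Error};

/// Hypothetical stand-in for a `&str`-based lookup (NOT the gjson API).
/// It only handles flat objects with unescaped string values and exists
/// purely to keep this sketch runnable without external crates.
fn lookup<'a>(json: &'a str, key: &str) -> Option<&'a str> {
    let needle = format!("\"{}\":\"", key);
    let start = json.find(&needle)? + needle.len();
    let end = json[start..].find('"')? + start;
    Some(&json[start..end])
}

/// Validate the buffer a single time, then read several fields from the
/// resulting `&str` without paying for validation again.
fn read_two<'a>(
    raw: &'a [u8],
    first: &str,
    second: &str,
) -> Result<(Option<&'a str>, Option<&'a str>), Utf8Error> {
    let json = str::from_utf8(raw)?; // one validation pass over the input
    Ok((lookup(json, first), lookup(json, second)))
}
```

With this shape the validation cost is paid once per buffer rather than once per field, which is the trade-off being discussed.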
|
|
If it is 100% required though I’d opt for marking the get_bytes function as unsafe as it seems a reasonable tradeoff. I’ll try to make that change within the next few hours:) |
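
For context, marking a byte-based entry point as `unsafe` might look something like the sketch below. The names `as_str_unchecked` and `as_str_checked` are hypothetical and this is not the signature that ended up in gjson; it only illustrates how an `unsafe fn` documents its contract and how a caller can uphold it.

```rust
use std::str;

/// Hypothetical unchecked conversion (not the actual gjson function).
///
/// # Safety
/// `json` must be valid UTF-8. Passing anything else violates the safety
/// contract of `str::from_utf8_unchecked`.
unsafe fn as_str_unchecked(json: &[u8]) -> &str {
    // SAFETY: the caller promises `json` is valid UTF-8 (see above).
    unsafe { str::from_utf8_unchecked(json) }
}

/// Hypothetical safe wrapper: validates first, so the unchecked call that
/// follows is sound.
fn as_str_checked(json: &[u8]) -> Option<&str> {
    if str::from_utf8(json).is_ok() {
        // SAFETY: validity was verified on the line above.
        Some(unsafe { as_str_unchecked(json) })
    } else {
        None
    }
}
```

A caller who has already validated the buffer can call the `unsafe` function directly inside an `unsafe` block and skip the second check; everyone else can stay on the safe wrapper.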
|
Sounds good. FYI, I made some minor changes to the trunk following the merge. |
|
Yes I saw @tidwall makes sense for this PR 👍 |
Exposes get_bytes(json: bytes, path: str, default=...) -> Value, mirroring the gjson.rs get_bytes API (tidwall/gjson.rs#5). The Python binding validates UTF-8 and raises ValueError on invalid input rather than using the unsafe from_utf8_unchecked path, keeping the binding safe for Python callers. https://claude.ai/code/session_01Ab8vgRMuBcE3fxEgTcfu6w
This is a simplified version of #4 which only adds the new get_bytes function for when one has a &[u8] instead of a &str, to avoid the need for a type conversion to a string.
@tidwall I'm hoping you might be able to merge and cut a release for this small change and then have all the time you need to review the other PR. The new get_bytes function is really the only change I require at this time; the rest is minor optimizations. I did fork this repo; however, I am not able to release my crate without publishing the fork as a separate crate, which is undesirable.
If not, please let me know and I'll instead copy json into my codebase for the time being.