Accepting illegal UTF-8 in strings #201

@lindig

utop # let str =  "\"Foo\xc0\xafBar\"";;
val str : string = "\"FooÀ¯Bar\""

utop # Yojson.Basic.from_string str;;
- : Yojson.Basic.t = `String "FooÀ¯Bar"

utop # String.is_valid_utf_8 str;;
- : bool = false

I believe the string above is not legal UTF-8, yet the parser accepts it: the bytes 0xC0 0xAF are an overlong encoding of "/", which UTF-8 forbids. Since JSON is typically read from an external source, it would be best to detect illegal UTF-8 during parsing.
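
Until the parser rejects such input itself, a caller could validate the raw bytes before parsing. The following is only a sketch of that idea, not how Yojson works or should work internally. It uses the standard library alone and assumes OCaml 4.14 or later (for `String.get_utf_8_uchar` and the `Uchar.utf_decode_*` functions). The names `first_invalid_utf_8` and `check_utf_8` are hypothetical helpers made up for this illustration.

```ocaml
(* Sketch only: assumes OCaml >= 4.14. Both function names below are
   hypothetical helpers, not part of Yojson or the standard library. *)

(* Return the byte offset of the first invalid UTF-8 sequence, if any. *)
let first_invalid_utf_8 (s : string) : int option =
  let len = String.length s in
  let rec loop i =
    if i >= len then None
    else
      let d = String.get_utf_8_uchar s i in
      if Uchar.utf_decode_is_valid d
      then loop (i + Uchar.utf_decode_length d)  (* skip the decoded scalar *)
      else Some i
  in
  loop 0

(* Reject the input before it reaches a parser, reporting where it breaks. *)
let check_utf_8 (s : string) : (string, string) result =
  match first_invalid_utf_8 s with
  | None -> Ok s
  | Some i -> Error (Printf.sprintf "invalid UTF-8 at byte offset %d" i)

let () =
  match check_utf_8 "\"Foo\xc0\xafBar\"" with
  | Ok _ -> print_endline "valid UTF-8"
  | Error msg -> print_endline msg
```

An `Ok` result could then be passed on to `Yojson.Basic.from_string`. This pre-pass costs an extra scan of the input, which is why doing the check inside the parser would be preferable.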
