A case of parser backtracking not being supported #138

@ForNeVeR

Disclaimer

Once again, mind that I have never had any formal education in language parsing, so I may ask for strange or unrealistic things. Please feel free to correct me on anything I say.

Describe the bug

Let's consider this EBNF sample, an excerpt of the actual C17 grammar with everything unrelated stripped.

function_definition: declaration_specifiers declarator
declaration_specifiers: type_specifier declaration_specifiers?
declarator: direct_declarator
direct_declarator: Identifier
type_specifier: 'int'
type_specifier: Identifier

So, a function_definition boils down to a sequence of type_specifiers followed by a single direct_declarator (which is an Identifier). So far, so good.

This synthetic sample should be accepted by a parser for that grammar:

int main

It has one type_specifier, namely int, followed by a declarator, which is a direct_declarator, which is the Identifier main.

Unfortunately, I wasn't able to get Yoakke to parse this sample.

Here's my program:

using Yoakke.SynKit.C.Syntax;
using Yoakke.SynKit.Lexer;
using Yoakke.SynKit.Parser.Attributes;

namespace Foo;

public record FunctionDefinition(List<IDeclarationSpecifier> Specifiers, Declarator Declarator);

public interface IDeclarationSpecifier
{
}

public record Declarator(DirectDeclarator DirectDeclarator);

public record DirectDeclarator(string Text);

public record TypeSpecifier(string Name) : IDeclarationSpecifier;

[Parser(typeof(CTokenType))]
public partial class CParser
{
    [Rule("function_definition: declaration_specifiers declarator")]
    private static FunctionDefinition MakeFunctionDefinition(
        List<IDeclarationSpecifier> specifiers,
        Declarator declarator) => new(specifiers, declarator);

    [Rule("declaration_specifiers: type_specifier declaration_specifiers?")]
    private static List<IDeclarationSpecifier> MakeDeclarationSpecifiers(
        IDeclarationSpecifier typeSpecifier,
        List<IDeclarationSpecifier>? rest) =>
        rest?.Prepend(typeSpecifier).ToList() ?? new List<IDeclarationSpecifier> { typeSpecifier };
    
    [Rule("declarator: direct_declarator")]
    private static Declarator MakeDeclarator(DirectDeclarator directDeclarator) =>
        new(directDeclarator);

    [Rule("direct_declarator: Identifier")]
    private static DirectDeclarator MakeDirectDeclarator(IToken identifier) =>
        new DirectDeclarator(identifier.Text);
    
    [Rule("type_specifier: 'int'")]
    [Rule("type_specifier: Identifier")]
    private static TypeSpecifier MakeSimpleTypeSpecifier(IToken specifier) => new(specifier.Text);
}


public class Program
{
    public static void Main(string[] args)
    {
        var parser = new CParser(new CLexer("int main"));
        Console.WriteLine(parser.ParseFunctionDefinition().Ok);
    }
}

For convenience, here's also a .csproj:

<Project Sdk="Microsoft.NET.Sdk">

    <PropertyGroup>
        <OutputType>Exe</OutputType>
        <TargetFramework>net6.0</TargetFramework>
        <ImplicitUsings>enable</ImplicitUsings>
        <Nullable>enable</Nullable>
    </PropertyGroup>

    <ItemGroup>
      <PackageReference Include="Yoakke.SynKit.C.Syntax" Version="2022.1.24-2.29.33-nightly" />
      <PackageReference Include="Yoakke.SynKit.Parser" Version="2022.1.24-2.29.33-nightly" />
      <PackageReference Include="Yoakke.SynKit.Parser.Generator" Version="2022.1.24-2.29.33-nightly" />
    </ItemGroup>

</Project>

I expect that this program should print the following:

Yoakke.SynKit.Parser.ParseOk`1[Foo.FunctionDefinition]

Instead, it prints this:

Unhandled exception. System.InvalidCastException: Unable to cast object of type 'Yoakke.SynKit.Parser.ParseError' to type 'Yoakke.SynKit.Parser.ParseOk`1[Foo.FunctionDefinition]'.
   at Yoakke.SynKit.Parser.ParseResult`1.get_Ok()
   at Foo.Program.Main(String[] args) in T:\Temp\ConsoleApp4\ConsoleApp4\Program.cs:line 52

Analysis

I have read the generated code thoroughly, and I believe the following is happening.

Yoakke eagerly consumes the declaration_specifiers collection, taking both tokens, int and main, as its items (main matches type_specifier: Identifier). Then it tries to parse a declarator but cannot, because no tokens are left.

It is then unable to drop the last item from declaration_specifiers and retry the declarator, though that would be the winning strategy here.
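To make the difference concrete, here is a minimal stand-alone sketch of the two strategies for this grammar. It does not use Yoakke and is not taken from the generated code; the class and method names (BacktrackingSketch, ParseGreedy, ParseWithBacktracking) are hypothetical, tokens are modelled as plain strings, and I assume the whole input must be consumed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy model of the grammar above; it does not use Yoakke.
public static class BacktrackingSketch
{
    // type_specifier: 'int' | Identifier
    // In this toy model every non-empty token qualifies.
    private static bool IsTypeSpecifier(string token) => token.Length > 0;

    // direct_declarator: Identifier (anything except the keyword 'int').
    private static bool IsIdentifier(string token) =>
        token.Length > 0 && token != "int";

    // Greedy strategy: declaration_specifiers takes every token it can
    // and never gives one back, so nothing is left for the declarator.
    public static bool ParseGreedy(IReadOnlyList<string> tokens)
    {
        var position = 0;
        while (position < tokens.Count && IsTypeSpecifier(tokens[position]))
        {
            position++;
        }

        if (position == 0)
        {
            return false; // at least one type_specifier is required
        }

        // declarator: exactly one Identifier must remain.
        return position + 1 == tokens.Count && IsIdentifier(tokens[position]);
    }

    // Backtracking strategy: try the longest run of specifiers first,
    // then drop the last specifier and retry the declarator.
    public static bool ParseWithBacktracking(IReadOnlyList<string> tokens)
    {
        var longest = 0;
        while (longest < tokens.Count && IsTypeSpecifier(tokens[longest]))
        {
            longest++;
        }

        for (var count = longest; count >= 1; count--)
        {
            // The length check comes first, so tokens[count] is always in range.
            if (count + 1 == tokens.Count && IsIdentifier(tokens[count]))
            {
                return true;
            }
        }

        return false;
    }

    public static void Main()
    {
        var tokens = new[] { "int", "main" };
        Console.WriteLine(ParseGreedy(tokens));           // False
        Console.WriteLine(ParseWithBacktracking(tokens)); // True
    }
}
```

The greedy version fails on int main for the reason described above, while the backtracking version succeeds once it gives main back to the declarator.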

Unfortunately, I don't know a workaround for this problem, so I would appreciate any feedback. If there's any hacky/ugly way to fix this, I'd love to hear it. Obviously, I would love to hear if there's an elegant solution to this problem, too :)

Environment

  • OS: Windows 10
  • .NET version: .NET 6
  • Library version: 2022.1.24-2.29.33-nightly

Metadata

  • Labels: bug
