
Implement TreeProcessor in Morphir.IR.Pipeline #375

Description

@DamianReeves

GitHub Issue: Implement TreeProcessor in Morphir.IR.Pipeline

Type: Feature
Labels: feature, file-architecture, priority-p1, pipeline
Priority: P1 (Enhances VFileTree support)
Milestone: v1.0.0
Estimated Effort: 3-5 days
Project: Morphir.IR.Pipeline (existing project - add new file)


Description

Implement TreeProcessor - pipeline support for processing VFileTree structures in the Morphir.IR.Pipeline project. This enables multi-file transformations while maintaining the hierarchical structure and aggregating diagnostics across all files.

Related Design: Unified File Architecture


Context

What Exists Today

VFile (in Morphir.IR.Pipeline/File.fs):

  • Virtual file with content, path, diagnostics, metadata
  • Used by existing MorphirProcessor for single-file operations

MorphirProcessor (in Morphir.IR.Pipeline/Processor.fs):

  • Processes single VFile through transformation pipeline
  • Supports computation expression syntax

What's Missing

Multi-file pipeline support:

  • No way to process VFileTree (hierarchical multi-file structures)
  • No aggregation of diagnostics across files
  • No directory-level transformations

Acceptance Criteria

Core Types (Add to Morphir.IR.Pipeline/TreeProcessor.fs)

Type Definitions:

  • Define TreeProcessor record type
  • Define computation expression builder TreePipelineBuilder
  • XML doc comments on all types

TreeProcessor Type

Fields:

  • ProcessTree: VFileTree -> Result<VFileTree, VFileTree> - Process entire tree
  • ProcessFile: VFile -> Result<VFile, VFile> - Process individual files (for leaf operations)
  • Name: string option - Processor name (for diagnostics)

Module Functions

Creation Functions (TreeProcessor module):

  • empty: TreeProcessor - Create empty processor (identity)
  • fromFileProcessor: string option -> (VFile -> Result<VFile, VFile>) -> TreeProcessor - Lift a (optionally named) file processor to a tree processor
  • fromMorphirProcessor: MorphirProcessor -> TreeProcessor - Convert existing processor

Composition Functions:

  • compose: TreeProcessor -> TreeProcessor -> TreeProcessor - Sequential composition
  • parallel: TreeProcessor list -> TreeProcessor - Parallel processing (independent transformations)
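
The short-circuit behaviour of sequential composition can be sketched on file processors alone. This is a minimal illustration, not the project's code: `VFile` here is a simplified stand-in record, and `FileProcessor`, `tag` and `reject` are hypothetical names introduced only for the example.

```fsharp
// Stand-in type for illustration only; the real VFile lives in Morphir.IR.Pipeline.
type VFile = { Path: string; Messages: string list }

type FileProcessor = VFile -> Result<VFile, VFile>

// Sequential composition: the second step only runs if the first one succeeded.
let compose (first: FileProcessor) (second: FileProcessor) : FileProcessor =
    fun file -> first file |> Result.bind second

// Hypothetical step that succeeds and records a label on the file.
let tag (label: string) : FileProcessor =
    fun file -> Ok { file with Messages = file.Messages @ [ label ] }

// Hypothetical step that always fails and records the reason on the file.
let reject (reason: string) : FileProcessor =
    fun file -> Error { file with Messages = file.Messages @ [ reason ] }

let input = { Path = "a.fs"; Messages = [] }

let bothRan = compose (tag "parsed") (tag "validated") input
let stoppedEarly = compose (reject "parse failed") (tag "validated") input

printfn "%A" bothRan
printfn "%A" stoppedEarly
```

In the second case the "validated" step never runs, so the error value carries only the message recorded by the failing step.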

Execution Functions:

  • run: TreeProcessor -> VFileTree -> Result<VFileTree, VFileTree> - Execute processor on tree
  • runOnFiles: TreeProcessor -> VFile list -> Result<VFile list, VFile list> - Execute on file list

Diagnostics Functions:

  • collectDiagnostics: VFileTree -> Map<string, VMessage list> - Aggregate diagnostics by file
  • hasErrors: VFileTree -> bool - Check if any file has errors
  • summarize: VFileTree -> ProcessorSummary - Get processing summary
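
A rough sketch of how diagnostics could be aggregated over a nested tree follows. The record and union shapes are simplified stand-ins (the real `VFile`, `VMessage` and `VFileTree` are defined elsewhere), and the `Severity` string field is an assumption made for the example.

```fsharp
// Stand-in types for illustration; the real definitions belong to Morphir.IR.Pipeline.
type VMessage = { Severity: string; Text: string }
type VFile = { Path: string option; Messages: VMessage list }

type VFileTree = { Name: string; Content: TreeEntry list }
and TreeEntry =
    | File of VFile
    | Directory of VFileTree

// Flatten the hierarchy into a plain list of files.
let rec allFiles (tree: VFileTree) : VFile list =
    tree.Content
    |> List.collect (function
        | File file -> [ file ]
        | Directory subtree -> allFiles subtree)

// Group messages by path; files without a path share the "unknown" key.
let collectDiagnostics (tree: VFileTree) : Map<string, VMessage list> =
    allFiles tree
    |> List.groupBy (fun file -> file.Path |> Option.defaultValue "unknown")
    |> List.map (fun (path, files) -> path, files |> List.collect (fun file -> file.Messages))
    |> Map.ofList

// A tree has errors if any file carries a message with severity "error".
let hasErrors (tree: VFileTree) : bool =
    allFiles tree
    |> List.exists (fun file -> file.Messages |> List.exists (fun m -> m.Severity = "error"))

let nested =
    { Name = "nested"
      Content = [ File { Path = Some "src/nested/B.fs"; Messages = [ { Severity = "error"; Text = "type mismatch" } ] } ] }

let sample =
    { Name = "src"
      Content =
        [ File { Path = Some "src/A.fs"; Messages = [ { Severity = "warning"; Text = "unused value" } ] }
          Directory nested ] }

let diagnostics = collectDiagnostics sample
printfn "files with diagnostics: %d, has errors: %b" (Map.count diagnostics) (hasErrors sample)
```

Grouping before building the map means two files that end up under the same key keep all of their messages instead of the later one replacing the earlier one.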

Computation Expression Builder

Custom Operations:

  • Yield - Create empty processor
  • parseTree - Parse tree structure
  • transformTree - Transform entire tree
  • mapFiles - Apply transformation to each file
  • filterFiles - Filter files based on predicate
  • aggregateDiagnostics - Collect diagnostics
  • validateTree - Validate tree structure
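
For readers new to custom operations, the mechanics can be shown with a toy builder: `Yield` seeds the state, and each operation receives the state built so far as its first argument. The `Pipeline` record below is a hypothetical stand-in that only records step names; it is not the proposed `TreeProcessor`.

```fsharp
// Hypothetical stand-in: a pipeline is just the ordered list of step names it would run.
type Pipeline = { Steps: string list }

type PipelineBuilder() =
    // Seeds the state that the first custom operation receives.
    member _.Yield(_) = { Steps = [] }

    [<CustomOperation("parseTree")>]
    member _.ParseTree(state: Pipeline, parser: string) =
        { state with Steps = state.Steps @ [ "parse:" + parser ] }

    [<CustomOperation("mapFiles")>]
    member _.MapFiles(state: Pipeline, mapper: string) =
        { state with Steps = state.Steps @ [ "map:" + mapper ] }

    [<CustomOperation("aggregateDiagnostics")>]
    member _.AggregateDiagnostics(state: Pipeline) =
        { state with Steps = state.Steps @ [ "aggregate" ] }

let pipeline = PipelineBuilder()

// Each line is a custom operation; the builder threads the state through them in order.
let described =
    pipeline {
        parseTree "fsharp"
        mapFiles "toIR"
        aggregateDiagnostics
    }

printfn "%A" described.Steps
```

Because every operation appends to the state it is given, the order of lines inside the expression is the order of the recorded steps.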

ProcessorSummary Type

Fields:

  • TotalFiles: int - Total files processed
  • SuccessCount: int - Files processed successfully
  • ErrorCount: int - Files with errors
  • WarningCount: int - Total warnings
  • ProcessingTime: TimeSpan option - Time taken (optional)
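
One possible way to populate the optional ProcessingTime field is to wrap the processing call in a stopwatch. The sketch below assumes a hypothetical `timed` helper and a simplified `summarize` that takes counts directly; neither is part of the proposed API.

```fsharp
open System
open System.Diagnostics

type ProcessorSummary =
    { TotalFiles: int
      SuccessCount: int
      ErrorCount: int
      WarningCount: int
      ProcessingTime: TimeSpan option }

// Hypothetical helper: runs `work` and returns its result together with the elapsed time.
let timed (work: unit -> 'a) : 'a * TimeSpan =
    let stopwatch = Stopwatch.StartNew()
    let result = work ()
    stopwatch.Stop()
    result, stopwatch.Elapsed

// Simplified stand-in for TreeProcessor.summarize: the counts are passed in directly.
let summarize (totalFiles: int) (errorCount: int) (warningCount: int) : ProcessorSummary =
    { TotalFiles = totalFiles
      SuccessCount = totalFiles - errorCount
      ErrorCount = errorCount
      WarningCount = warningCount
      ProcessingTime = None }

let summary, elapsed = timed (fun () -> summarize 10 2 3)
let timedSummary = { summary with ProcessingTime = Some elapsed }

printfn "%d of %d files succeeded" timedSummary.SuccessCount timedSummary.TotalFiles
```

Keeping the timing outside `summarize` leaves the summary itself a pure function of the tree, which is why the field is optional.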

Implementation Tasks

1. Create TreeProcessor.fs

# Add new file to existing Morphir.IR.Pipeline project
touch src/Morphir.IR.Pipeline/TreeProcessor.fs
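# Update Morphir.IR.Pipeline.fsproj to include TreeProcessor.fs AFTER FileTree.fs and Processor.fs
# (F# compiles files in order, and TreeProcessor.fs references MorphirProcessor from Processor.fs)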

2. Define Core Types

namespace Morphir.IR.Pipeline

open System

/// <summary>
/// Summary of tree processing results.
/// </summary>
type ProcessorSummary = {
    TotalFiles: int
    SuccessCount: int
    ErrorCount: int
    WarningCount: int
    ProcessingTime: TimeSpan option
}

/// <summary>
/// Processor that operates on VFileTree (multi-file projects).
/// Supports both tree-level and file-level transformations.
/// </summary>
type TreeProcessor = {
    /// <summary>Process entire tree</summary>
    ProcessTree: VFileTree -> Result<VFileTree, VFileTree>

    /// <summary>Process individual file (for leaf operations)</summary>
    ProcessFile: VFile -> Result<VFile, VFile>

    /// <summary>Processor name (for diagnostics)</summary>
    Name: string option
}

3. Implement Module Functions

[<RequireQualifiedAccess>]
module TreeProcessor =
    /// Create empty (identity) processor
    let empty: TreeProcessor = {
        ProcessTree = Ok
        ProcessFile = Ok
        Name = None
    }

    /// Create processor from file processor (applies to each file)
    let fromFileProcessor (name: string option) (proc: VFile -> Result<VFile, VFile>): TreeProcessor =
        {
            ProcessTree = fun tree ->
                // Apply processor to each file in tree, recursing into directories
                let rec processTree (t: VFileTree): Result<VFileTree, VFileTree> =
                    let processedContent =
                        t.Content
                        |> List.map (function
                            | File file ->
                                proc file |> Result.map File |> Result.mapError File
                            | Directory subtree ->
                                processTree subtree |> Result.map Directory |> Result.mapError Directory)

                    // Keep every entry, processed or failed, so that the error tree
                    // still carries the diagnostics of the files that succeeded
                    let content = processedContent |> List.map (function | Ok c | Error c -> c)

                    let anyFailed =
                        processedContent |> List.exists (function | Error _ -> true | Ok _ -> false)

                    if anyFailed then
                        Error { t with Content = content }
                    else
                        Ok { t with Content = content }

                processTree tree

            ProcessFile = proc
            Name = name
        }

    /// Convert MorphirProcessor to TreeProcessor
    let fromMorphirProcessor (processor: MorphirProcessor): TreeProcessor =
        fromFileProcessor processor.Name processor.Process

    /// Compose two processors sequentially
    let compose (first: TreeProcessor) (second: TreeProcessor): TreeProcessor =
        {
            ProcessTree = fun tree ->
                first.ProcessTree tree
                |> Result.bind second.ProcessTree

            ProcessFile = fun file ->
                first.ProcessFile file
                |> Result.bind second.ProcessFile

            Name =
                match first.Name, second.Name with
                | Some n1, Some n2 -> Some $"{n1} >> {n2}"
                | Some n, None | None, Some n -> Some n
                | None, None -> None
        }

    /// Run several independent processors against the same input (all must succeed).
    /// Processors are evaluated one after another, not concurrently. On success the
    /// last processor's result is returned; an empty list behaves like the identity.
    /// `parallel` is reserved for future use by F#, hence the double backticks.
    let ``parallel`` (processors: TreeProcessor list): TreeProcessor =
        // First error wins; otherwise the last result; otherwise the untouched input
        let combine input results =
            let firstError = results |> List.tryFind (function | Error _ -> true | Ok _ -> false)
            match firstError, List.tryLast results with
            | Some error, _ -> error
            | None, Some last -> last
            | None, None -> Ok input

        {
            ProcessTree = fun tree ->
                processors |> List.map (fun p -> p.ProcessTree tree) |> combine tree

            ProcessFile = fun file ->
                processors |> List.map (fun p -> p.ProcessFile file) |> combine file

            Name = Some "Parallel processors"
        }

    /// Execute processor on tree
    let run (processor: TreeProcessor) (tree: VFileTree): Result<VFileTree, VFileTree> =
        processor.ProcessTree tree

    /// Execute processor on file list
    let runOnFiles (processor: TreeProcessor) (files: VFile list): Result<VFile list, VFile list> =
        let results = files |> List.map processor.ProcessFile

        let errors = results |> List.choose (function | Error e -> Some e | _ -> None)

        if errors.IsEmpty then
            Ok (results |> List.choose (function | Ok f -> Some f | _ -> None))
        else
            Error errors

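    /// Collect diagnostics from all files in tree, keyed by file path
    let collectDiagnostics (tree: VFileTree): Map<string, VMessage list> =
        tree
        |> VFileTree.allFiles
        // Group first so files sharing a key (e.g. several path-less files) do not overwrite each other
        |> List.groupBy (fun file -> file.Path |> Option.defaultValue "unknown")
        |> List.map (fun (path, files) -> path, files |> List.collect (fun file -> file.Messages))
        |> Map.ofList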

    /// Check if tree has errors
    let hasErrors (tree: VFileTree): bool =
        VFileTree.hasErrors tree

    /// Summarize processing results
    let summarize (tree: VFileTree): ProcessorSummary =
        let stats = VFileTree.statistics tree
        {
            TotalFiles = stats.TotalFiles
            SuccessCount = stats.TotalFiles - stats.ErrorCount
            ErrorCount = stats.ErrorCount
            WarningCount = stats.WarningCount
            ProcessingTime = None
        }

4. Implement Computation Expression Builder

/// <summary>
/// Computation expression builder for tree pipelines.
/// Enables treePipeline { ... } syntax for multi-file processing.
/// </summary>
type TreePipelineBuilder() =
    member _.Yield(_) = TreeProcessor.empty

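    [<CustomOperation("parseTree")>]
    member _.ParseTree(proc: TreeProcessor, parser: VFileTree -> Result<VFileTree, VFileTree>) =
        // Compose rather than overwrite, so steps added before parseTree are not silently dropped
        TreeProcessor.compose proc { TreeProcessor.empty with ProcessTree = parser }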

    [<CustomOperation("transformTree")>]
    member _.TransformTree(proc: TreeProcessor, transformer: TreeProcessor) =
        TreeProcessor.compose proc transformer

    [<CustomOperation("mapFiles")>]
    member _.MapFiles(proc: TreeProcessor, mapper: VFile -> Result<VFile, VFile>) =
        let fileProc = TreeProcessor.fromFileProcessor None mapper
        TreeProcessor.compose proc fileProc

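    [<CustomOperation("filterFiles")>]
    member _.FilterFiles(proc: TreeProcessor, predicate: VFile -> bool) =
        // Drop files that fail the predicate instead of failing the whole pipeline;
        // directories are kept and filtered recursively
        let rec filterTree (t: VFileTree): VFileTree =
            let kept =
                t.Content
                |> List.choose (function
                    | File file -> if predicate file then Some (File file) else None
                    | Directory subtree -> Some (Directory (filterTree subtree)))
            { t with Content = kept }

        let filterProc =
            { TreeProcessor.empty with
                ProcessTree = filterTree >> Ok
                Name = Some "FilterFiles" }

        TreeProcessor.compose proc filterProc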

    [<CustomOperation("aggregateDiagnostics")>]
    member _.AggregateDiagnostics(proc: TreeProcessor) =
        { proc with
            ProcessTree = fun tree ->
                match proc.ProcessTree tree with
                | Ok resultTree ->
                    // Log summary
                    let summary = TreeProcessor.summarize resultTree
                    printfn "Processed %d files (%d errors, %d warnings)"
                        summary.TotalFiles summary.ErrorCount summary.WarningCount
                    Ok resultTree
                | Error errorTree ->
                    Error errorTree
        }

    // validateTree is listed in the acceptance criteria but its signature is not specified;
    // the validator shape below is an assumption
    [<CustomOperation("validateTree")>]
    member _.ValidateTree(proc: TreeProcessor, validator: VFileTree -> Result<VFileTree, VFileTree>) =
        TreeProcessor.compose proc { TreeProcessor.empty with ProcessTree = validator }

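/// <summary>
/// Pipeline builder for tree processing.
/// </summary>
[<AutoOpen>]
module TreePipeline =
    // Values cannot be declared directly in a namespace, so the builder
    // instance lives in an auto-opened module
    let treePipeline = TreePipelineBuilder()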

5. Write Tests

Create tests/Morphir.IR.Pipeline.Tests/TreeProcessorTests.fs:

module Morphir.IR.Pipeline.Tests.TreeProcessorTests

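open TUnit.Core
open Morphir.IR.Pipeline
// NOTE: the `should equal` assertions below are FsUnit-style and are not provided by the
// opens above; also open whichever assertion module the test project uses.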

[<Test>]
let ``Empty processor returns tree unchanged`` () =
    let tree = VFileTree.empty
    let result = TreeProcessor.run TreeProcessor.empty tree

    match result with
    | Ok resultTree -> resultTree |> should equal tree
    | Error _ -> failwith "Should not error"

[<Test>]
let ``fromFileProcessor applies to all files`` () =
    let file1 = VFile.create "file1.fs" "content1"
    let file2 = VFile.create "file2.fs" "content2"
    let tree = VFileTree.fromFiles [file1; file2]

    // Processor that adds metadata
    let addMetadata file =
        Ok (VFile.setData "processed" true file)

    let processor = TreeProcessor.fromFileProcessor (Some "AddMetadata") addMetadata

    match TreeProcessor.run processor tree with
    | Ok resultTree ->
        let files = VFileTree.allFiles resultTree
        files |> List.forall (fun f -> VFile.getData "processed" f = Some (box true))
        |> should equal true
    | Error _ -> failwith "Should not error"

[<Test>]
let ``compose chains processors`` () =
    let file = VFile.create "test.fs" "content"
    let tree = VFileTree.fromFiles [file]

    let proc1 = TreeProcessor.fromFileProcessor (Some "Proc1") (fun f ->
        Ok (VFile.setData "step1" true f))

    let proc2 = TreeProcessor.fromFileProcessor (Some "Proc2") (fun f ->
        Ok (VFile.setData "step2" true f))

    let composed = TreeProcessor.compose proc1 proc2

    match TreeProcessor.run composed tree with
    | Ok resultTree ->
        let files = VFileTree.allFiles resultTree
        files |> List.head |> VFile.getData "step1" |> should equal (Some (box true))
        files |> List.head |> VFile.getData "step2" |> should equal (Some (box true))
    | Error _ -> failwith "Should not error"

[<Test>]
let ``collectDiagnostics aggregates messages`` () =
    let fileWithError =
        VFile.create "error.fs" "content"
        |> VFile.error "Test error" None

    let fileWithWarning =
        VFile.create "warning.fs" "content"
        |> VFile.warn "Test warning" None

    let tree = VFileTree.fromFiles [fileWithError; fileWithWarning]
    let diagnostics = TreeProcessor.collectDiagnostics tree

    diagnostics |> Map.count |> should equal 2
    diagnostics |> Map.containsKey "error.fs" |> should equal true
    diagnostics |> Map.containsKey "warning.fs" |> should equal true

[<Test>]
let ``summarize provides processing statistics`` () =
    let fileWithError =
        VFile.create "error.fs" "content"
        |> VFile.error "Error" None

    let fileOk = VFile.create "ok.fs" "content"

    let tree = VFileTree.fromFiles [fileWithError; fileOk]
    let summary = TreeProcessor.summarize tree

    summary.TotalFiles |> should equal 2
    summary.ErrorCount |> should equal 1

[<Test>]
let ``treePipeline computation expression works`` () =
    let file = VFile.create "test.fs" "content"
    let tree = VFileTree.fromFiles [file]

    let pipeline = treePipeline {
        mapFiles (fun f -> Ok (VFile.setData "processed" true f))
        aggregateDiagnostics
    }

    match TreeProcessor.run pipeline tree with
    | Ok resultTree ->
        let files = VFileTree.allFiles resultTree
        files |> List.head |> VFile.getData "processed" |> should equal (Some (box true))
    | Error _ -> failwith "Should not error"

6. Update Project File

Add to src/Morphir.IR.Pipeline/Morphir.IR.Pipeline.fsproj:

<ItemGroup>
  <Compile Include="File.fs" />
  <Compile Include="FileTree.fs" />
  <Compile Include="Processor.fs" />
  <Compile Include="TreeProcessor.fs" />  <!-- NEW: must follow Processor.fs, since it references MorphirProcessor -->
  <!-- ... other files ... -->
</ItemGroup>

7. Documentation

  • Add XML doc comments to all public types and functions
  • Create usage examples in unified-file-architecture.md
  • Document integration with MorphirProcessor
  • Document computation expression syntax

Usage Examples

Example 1: Simple File Transformation

open System
open Morphir.IR.Pipeline

// Create processor that adds metadata to all files
let addTimestamp =
    TreeProcessor.fromFileProcessor (Some "AddTimestamp") (fun file ->
        Ok (VFile.setData "timestamp" DateTime.UtcNow file))

// Run on tree
let tree = VFileTree.fromFiles [file1; file2]
match TreeProcessor.run addTimestamp tree with
| Ok resultTree -> printfn "Success!"
| Error errorTree -> printfn "Errors: %A" (TreeProcessor.collectDiagnostics errorTree)

Example 2: Pipeline Syntax

let pipeline = treePipeline {
    // Parse F# files
    parseTree FSharpFrontend.parse

    // Transform to IR
    mapFiles (fun file ->
        // Each file contains F# AST, transform to IR
        IRMapper.mapFile file)

    // Validate IR
    mapFiles IRValidator.validate

    // Aggregate diagnostics
    aggregateDiagnostics
}

let result = TreeProcessor.run pipeline myTree

Example 3: Composition

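// Compose multiple processors in execution order: parse, then transform, then validate.
// (Piping into TreeProcessor.compose would reverse the order, because the piped
// value becomes the second argument.)
let fullPipeline =
    [ parseProcessor; transformProcessor; validateProcessor ]
    |> List.reduce TreeProcessor.compose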

let result = TreeProcessor.run fullPipeline tree

Success Criteria

  • All types defined and compile successfully
  • All module functions implemented
  • All tests pass (≥80% coverage)
  • Can process VFileTree with file-level transformations
  • Can compose processors
  • Computation expression syntax works
  • Documentation complete
  • Integrates with existing MorphirProcessor

Dependencies

  • VFile (exists in Morphir.IR.Pipeline/File.fs)
  • VFileTree (new - see related issue)
  • MorphirProcessor (exists in Morphir.IR.Pipeline/Processor.fs)

Blocks

  • F# Frontend multi-file parsing (enhanced by this)
  • F# Backend tree generation (enhanced by this)

Related Documents


Ready for Implementation: This issue provides complete context and code examples.

Estimated Effort: 3-5 developer-days

Priority: P1 (Enhances VFileTree functionality, not blocking but highly valuable)
