RISC-V Pipeline Diagram (from `ALL/ALL.ipynb`)

This file translates the end-to-end notebook workflow into an advisor-friendly diagram.

1) High-level pipeline (recommended for slides)

flowchart LR
    %% Inputs
    A1[RISC-V Unified DB\nall_instructions.golden.adoc]
    A2[RISC-V Opcodes\nextensions/*]
    A3["RISC-V Specifications\n3_SPECIFICATION/2_SPLIT\n(929 chunks)"]

    %% Instruction branch
    subgraph I[Instruction Rule Synthesis]
      I1[1_INSTRUCTION_ALL.txt\nBuild full instruction metadata]
      I2[2_INSTRUCTION_SOME.txt\nFilter target instructions]
      I3[3_INSTRUCTION_BOOLEAN.txt\nLLM chunk matching]
      I4[4_INSTRUCTION_RULE.txt\nLLM rule extraction]
      I5[5_INSTRUCTION_FINAL.txt]
      I6[6_INSTRUCTION_TOTAL.txt\nAppend/update cumulative rules]
      I7[7_INSTRUCTION_SUMMARY.txt]
    end

    %% Register branch
    subgraph R[Register Rule Synthesis]
      R1[1_REGISTER_ALL.txt\nBuild full register metadata]
      R2[2_REGISTER_SOME.txt\nFilter target registers]
      R3[3_REGISTER_BOOLEAN.txt\nLLM chunk matching]
      R4[4_REGISTER_RULE.txt\nLLM rule extraction]
      R5[5_REGISTER_EXTENSION.txt]
      R6[6_REGISTER_FIELD.txt]
      R7[7_REGISTER_FINAL.txt]
      R8[8_REGISTER_TOTAL.txt\nAppend/update cumulative rules]
      R9[9_REGISTER_SUMMARY.txt]
    end

    %% Test generation & repair
    subgraph T[Test-case Generation + Repair]
      T1[1_TEST_CASE_INPUT.txt\nPrompt-ready constraints]
      T2[2_TEST_CASE_RAW.txt]
      T3[3_TEST_CASE_REVISED_*.txt\nStructured/numbered blocks]
      T4[4_REPAIR_TEST_CASE_OUTPUT.txt\nLLM+execution repair loop]
      T5[4_REPAIR_TEST_CASE_OUTPUT_2.txt\nRegister/semantic patching]
      T6[REPAIR_TEST_CASE_OUTPUT_EXTENSION.txt\nFiltered by extension headers]
    end

    %% Emulator validation
    subgraph Q[QEMU/xv6 Validation]
      Q1[1_QEMU_XV6_INPUT.txt]
      Q2[2_QEMU_XV6_RUN.txt\nAssemble/run + register deltas]
      Q3[3_QEMU_XV6_CHECK.txt\nPass/fail checks]
    end

    A1 --> I1
    A2 --> I1
    A3 --> I3
    I1 --> I2 --> I3 --> I4 --> I5 --> I6 --> I7

    A2 --> R1
    A3 --> R3
    R1 --> R2 --> R3 --> R4 --> R5 --> R6 --> R7 --> R8 --> R9

    I6 --> T1
    R8 --> T1
    T1 --> T2 --> T3 --> T4 --> T5 --> T6

    T5 --> Q1
    Q1 --> Q2 --> Q3

2) Mapping-focused view (close to your current sketch)

flowchart LR
    S[RISC-V Specifications] --> SC["Split into specification chunks (Python)"] --> CH["Specification chunks (929)"]

    O[RISC-V Opcodes] --> MX["Metadata extraction (Python)"]
    U[RISC-V Unified DB] --> MX

    MX --> IM[Instruction metadata]
    MX --> RM[Register metadata]

    CH --> ML1["Matching (LLM)"]
    IM --> ML1
    ML1 --> M1[Instruction-to-spec chunk mapping]

    CH --> ML2["Matching (LLM)"]
    RM --> ML2
    ML2 --> M2[Register-to-spec chunk mapping]

    M1 --> IR[Instruction rules]
    M2 --> RR[Register rules]

    IR --> TG[Test-case generation]
    RR --> TG
    TG --> RP[Repair loop]
    RP --> QV[QEMU/xv6 validation]
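
Mermaid's flowchart grammar treats `(` and `)` as shape delimiters, so a square-bracket label that contains them (for example `(Python)` or `(929)`) generally has to be wrapped in double quotes to parse. The sketch below is one possible way to flag such labels before re-rendering. The function name `find_unquoted_paren_labels` and the regular expression are illustrative assumptions, not part of the notebook, and the check is a heuristic rather than a full Mermaid parser.

```python
import re

# Matches NODE[label] where the label is not double-quoted and contains ( or ).
# Heuristic only: it does not understand every Mermaid shape or nested brackets.
_UNQUOTED_PAREN = re.compile(r'\b(\w+)\[([^\]"]*[()][^\]"]*)\]')


def find_unquoted_paren_labels(mermaid_text):
    """Return (line_number, node_id, label) for each suspicious label."""
    hits = []
    for lineno, line in enumerate(mermaid_text.splitlines(), start=1):
        if line.strip().startswith('%%'):  # skip Mermaid comment lines
            continue
        for node_id, label in _UNQUOTED_PAREN.findall(line):
            hits.append((lineno, node_id, label))
    return hits


if __name__ == '__main__':
    sample = '\n'.join([
        'flowchart LR',
        '    CH[Specification chunks (929)] --> ML1["Matching (LLM)"]',
        '    IM[Instruction metadata] --> ML1',
    ])
    for lineno, node_id, label in find_unquoted_paren_labels(sample):
        print(f'line {lineno}: quote the label of {node_id}: {label}')
```

Running it on the diagram text before opening the GitHub preview gives a quick list of labels to quote; an empty result does not guarantee that the diagram parses.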

3) Suggested figure caption (for thesis/advisor)

Figure X. End-to-end pipeline for deriving instruction/register rules from RISC-V sources, grounding them to specification chunks with LLM-based matching, generating and repairing test cases, and validating behavior in QEMU/xv6.

4) How to update this diagram when the pipeline changes

If your new system changed file names/stages, update the diagram with this quick workflow:

1. Open and run ALL/ALL.ipynb from top to bottom so the latest pipeline artifacts are regenerated.
2. Compare the current outputs under the following folders with the artifact names in the diagrams:
   - ALL/4_INSTRUCTION/1_OUTPUT/
   - ALL/5_REGISTER/1_OUTPUT/
   - ALL/6_TEST_CASE/
   - ALL/TEXT/
3. In this file, update the Mermaid node labels to match the new artifact names/stages.
4. Keep the same 4 logical groups (Inputs, Rule Synthesis, Test/Repair, QEMU Validation) unless your architecture changed.
5. Re-render Mermaid (GitHub preview or Mermaid Live Editor) and verify there are no parse errors.
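
The comparison in the steps above amounts to a set difference between the `.txt` artifact names that appear in the Mermaid labels and the files actually present in the output folders. A minimal sketch follows, assuming that artifacts are `.txt` files sitting directly inside those folders and that wildcard labels such as `3_TEST_CASE_REVISED_*.txt` should be treated as glob patterns. The helper names `diagram_artifacts`, `list_disk_artifacts`, and `compare_artifacts` are hypothetical.

```python
import re
from fnmatch import fnmatch
from pathlib import Path

# Output folders named in the update workflow (relative to the repo root).
OUTPUT_DIRS = [
    'ALL/4_INSTRUCTION/1_OUTPUT',
    'ALL/5_REGISTER/1_OUTPUT',
    'ALL/6_TEST_CASE',
    'ALL/TEXT',
]


def diagram_artifacts(mermaid_text):
    """Collect the *.txt names (wildcards allowed) that appear in the diagram.

    Heuristic: assumes each file name directly follows the opening bracket.
    """
    return set(re.findall(r'[\w*]+\.txt', mermaid_text))


def list_disk_artifacts(dirs=OUTPUT_DIRS):
    """Names of *.txt files found directly inside the folders that exist."""
    return {f.name for d in dirs if Path(d).is_dir() for f in Path(d).glob('*.txt')}


def compare_artifacts(mermaid_text, disk_names):
    """Return (labels with no matching file, files with no matching label)."""
    labels = diagram_artifacts(mermaid_text)
    stale = {p for p in labels if not any(fnmatch(n, p) for n in disk_names)}
    unlabeled = {n for n in disk_names if not any(fnmatch(n, p) for p in labels)}
    return sorted(stale), sorted(unlabeled)


if __name__ == '__main__':
    # Small inline demo; pass list_disk_artifacts() and this file's text for real use.
    sample = 'I1[1_INSTRUCTION_ALL.txt] --> T3[3_TEST_CASE_REVISED_*.txt]'
    on_disk = {'3_TEST_CASE_REVISED_1.txt', 'NEW_STAGE.txt'}
    stale, unlabeled = compare_artifacts(sample, on_disk)
    print('in diagram but not on disk:', stale)
    print('on disk but not in diagram:', unlabeled)
```

Names reported as "in diagram but not on disk" are candidates for renaming or removal in the Mermaid source, while names reported as "on disk but not in diagram" may be new stages that need a node.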

Minimal command checklist

# from repo root
python - <<'PY'
import json
from pathlib import Path

# Load the notebook as plain JSON (no Jupyter dependency needed).
nb = json.loads(Path('ALL/ALL.ipynb').read_text(encoding='utf-8'))
cells = nb['cells']
print('code cells:', sum(1 for c in cells if c['cell_type'] == 'code'))

# Print the leading comment of each code cell; these serve as stage labels.
for i, c in enumerate(cells):
    if c['cell_type'] != 'code':
        continue
    lines = ''.join(c['source']).splitlines()
    if lines and lines[0].strip().startswith('#'):
        print(f'{i:02d}: {lines[0].strip()}')
PY

Use that list as the source-of-truth stage order, then rename the Mermaid nodes accordingly.
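
Once the stage order is known, the node and edge lines for one branch can be generated rather than edited by hand. The following is a rough sketch, assuming each stage is given as an (artifact name, optional description) pair. `mermaid_chain` and its `prefix` parameter are hypothetical names. Labels follow this file's `\n` convention for line breaks and are always double-quoted so that characters such as parentheses stay parseable.

```python
def mermaid_chain(prefix, stages, indent='    '):
    """Build Mermaid node lines plus one edge line for an ordered stage list.

    prefix: node-ID prefix, e.g. 'I' yields I1, I2, ...
    stages: list of (artifact_name, description) pairs; description may be ''.
    """
    lines = []
    ids = []
    for n, (name, desc) in enumerate(stages, start=1):
        node_id = f'{prefix}{n}'
        # A literal backslash-n separates file name and description.
        label = f'{name}\\n{desc}' if desc else name
        # Mermaid's entity code for a double quote inside a quoted label.
        label = label.replace('"', '#quot;')
        lines.append(f'{indent}{node_id}["{label}"]')
        ids.append(node_id)
    if len(ids) > 1:
        lines.append(indent + ' --> '.join(ids))
    return '\n'.join(lines)


if __name__ == '__main__':
    print(mermaid_chain('Q', [
        ('1_QEMU_XV6_INPUT.txt', ''),
        ('2_QEMU_XV6_RUN.txt', 'Assemble/run + register deltas'),
        ('3_QEMU_XV6_CHECK.txt', 'Pass/fail checks'),
    ]))
```

The printed lines can be pasted into the matching `subgraph` block; cross-branch edges such as `I6 --> T1` still need to be maintained manually.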

