diff --git a/.agents/skills/haskell-pro/SKILL.md b/.agents/skills/haskell-pro/SKILL.md new file mode 100644 index 0000000..60c224f --- /dev/null +++ b/.agents/skills/haskell-pro/SKILL.md @@ -0,0 +1,60 @@ +--- +name: haskell-pro +description: "Expert Haskell engineer specializing in advanced type systems, pure" +risk: safe +source: community +date_added: "2026-02-27" +--- + +## Use this skill when + +- Working on haskell pro tasks or workflows +- Needing guidance, best practices, or checklists for haskell pro + +## Do not use this skill when + +- The task is unrelated to haskell pro +- You need a different domain or tool outside this scope + +## Instructions + +- Clarify goals, constraints, and required inputs. +- Apply relevant best practices and validate outcomes. +- Provide actionable steps and verification. +- If detailed examples are required, open `resources/implementation-playbook.md`. + +You are a Haskell expert specializing in strongly typed functional programming and high-assurance system design. + +## Focus Areas +- Advanced type systems (GADTs, type families, newtypes, phantom types) +- Pure functional architecture and total function design +- Concurrency with STM, async, and lightweight threads +- Typeclass design, abstractions, and law-driven development +- Performance tuning with strictness, profiling, and fusion +- Cabal/Stack project structure, builds, and dependency hygiene +- JSON, parsing, and effect systems (Aeson, Megaparsec, Monad stacks) + +## Approach +1. Use expressive types, newtypes, and invariants to model domain logic +2. Prefer pure functions and isolate IO to explicit boundaries +3. Recommend safe, total alternatives to partial functions +4. Use typeclasses and algebraic design only when they add clarity +5. Keep modules small, explicit, and easy to reason about +6. Suggest language extensions sparingly and explain their purpose +7. Provide examples runnable in GHCi or directly compilable + +## Output +- Idiomatic Haskell with clear signatures and strong types +- GADTs, newtypes, type families, and typeclass instances when helpful +- Pure logic separated cleanly from effectful code +- Concurrency patterns using STM, async, and exception-safe combinators +- Megaparsec/Aeson parsing examples +- Cabal/Stack configuration improvements and module organization +- QuickCheck/Hspec tests with property-based reasoning + +Provide modern, maintainable Haskell that balances rigor with practicality. + +## Limitations +- Use this skill only when the task clearly matches the scope described above. +- Do not treat the output as a substitute for environment-specific validation, testing, or expert review. +- Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are missing. diff --git a/.agents/skills/htcc-ci-test-suite-coverage/SKILL.md b/.agents/skills/htcc-ci-test-suite-coverage/SKILL.md new file mode 100644 index 0000000..f9fa311 --- /dev/null +++ b/.agents/skills/htcc-ci-test-suite-coverage/SKILL.md @@ -0,0 +1,57 @@ +--- +name: htcc-ci-test-suite-coverage +description: "Keep htcc CI workflows aligned with the test runner command matrix, especially explicit self/components/subp/docker lists." +risk: safe +source: project +date_added: "2026-05-18" +--- + +## Use this skill when + +- Editing `.github/workflows/main.yml`, `.travis.yml`, Docker CI scripts, or the htcc test command selector. +- Adding or renaming test commands such as `self`, `components`, `subp`, or `docker`. +- Review feedback says a suite runs by default locally but is skipped in CI because CI enumerates commands explicitly. + +## Do not use this skill when + +- The task is unrelated to CI/test command coverage. + +## Required checks + +1. Search all CI command lists for explicit `stack test --test-arguments ...` invocations. +2. Compare those lists to `commandsToRun` / command-selection behavior in the test runner. +3. If a workflow lists `self` and `subp` explicitly, ensure `components` is also listed when component tests are required. +4. Do not assume default command selection applies to CI steps that enumerate commands. + +## Known CI locations + +- `.github/workflows/main.yml`: GitHub Actions compiler tests. +- `.travis.yml`: Travis compiler tests. +- Docker-specific commands may remain separate: `stack test --test-arguments docker` and clean variants. +- The Docker clean test command is `stack test --test-arguments docker --test-arguments --clean`. +- README examples are documentation, not CI enforcement, unless the task explicitly asks to update docs. + +## Verification + +- Run the newly added command locally, for example: + +```bash +stack test --test-arguments components +``` + +- Run `git diff --check`. +- Parse changed YAML files. If `yaml-ls` is unavailable, use: + +```bash +ruby -e 'require "yaml"; YAML.load_file(".github/workflows/main.yml"); YAML.load_file(".travis.yml"); puts "yaml ok"' +``` + +- Full `stack test` is optional for CI-only command-list changes unless the test runner itself changed. +- Run Docker commands only when Docker is available locally. If unavailable, report them as unavailable and rely on CI definition inspection/YAML parsing rather than claiming local Docker success. + +## Review focus + +- Every explicit CI compiler-test list includes the required suite. +- Added commands are in the same stage as related compiler tests. +- No secrets, permissions, or checkout credentials are broadened. +- Docker commands are not mixed into non-Docker stages. diff --git a/.agents/skills/htcc-ci-test-suite-coverage/agents/openai.yaml b/.agents/skills/htcc-ci-test-suite-coverage/agents/openai.yaml new file mode 100644 index 0000000..fa8b167 --- /dev/null +++ b/.agents/skills/htcc-ci-test-suite-coverage/agents/openai.yaml @@ -0,0 +1,4 @@ +interface: + display_name: "HTCC CI Test Suite Coverage" + short_description: "Keep htcc CI test suites aligned" + default_prompt: "Use $htcc-ci-test-suite-coverage to keep htcc CI test command lists aligned with the test runner." diff --git a/.agents/skills/htcc-output-permission-workflow/SKILL.md b/.agents/skills/htcc-output-permission-workflow/SKILL.md new file mode 100644 index 0000000..e7fb7c3 --- /dev/null +++ b/.agents/skills/htcc-output-permission-workflow/SKILL.md @@ -0,0 +1,75 @@ +--- +name: htcc-output-permission-workflow +description: "Work safely on htcc output replacement, fallback copy, file mode preservation, rollback, symlink, and hard-link behavior." +risk: safe +source: project +date_added: "2026-05-18" +--- + +## Use this skill when + +- Editing `src/Htcc/Output.hs` or code that writes `-o` outputs, visualization outputs, staging files, backups, or direct replacement paths. +- Review feedback mentions `setFileMode`, owner/group/other permissions, read-only directories, fallback copies, stale output, symlinks, hard links, or rollback. + +## Do not use this skill when + +- The output path is not touched and the task is only parser/codegen logic. + +## Key invariants + +- Existing output content must not be lost on failed replacement. +- Existing output mode should be preserved unless the mode strategy intentionally changes executable bits. +- Direct fallback must not preemptively `chmod` when a write/copy could already succeed through group or other permissions. +- Hard-linked outputs must remain protected by `ensureInPlaceReplacementSafe`. +- Symlink behavior must match existing resolution rules; do not add broad symlink rewrites while fixing permission bugs. + +## Test patterns + +- Component tests belong in `test/Tests/ComponentsTests/AsmOutput.hs` near existing fallback/permission tests. +- Use custom `copyReplacementOutput` functions to assert ordering, mode at copy time, partial-write rollback, or failure surfacing. +- Use temp files from `openTempFile`, close handles before mode changes, and cleanup with `catchIOError`-guarded removal. +- Test file modes with `fileMode <$> getFileStatus` and `intersectFileModes replacedMode 0o777`. + +## Manual QA patterns + +- For fallback replacement, create a read-only directory so staging in the output directory fails and direct replacement is exercised. +- Verify: + - command exit status, + - stdout/stderr, + - output file contains expected assembly such as `.global main`, + - stale output is preserved when failure is expected, + - file mode is restored when mode behavior is the contract. + +Example shape: + +```bash +tmpdir=$(mktemp -d /tmp/htcc-output-fallback.XXXXXX) +target="$tmpdir/out.s" +cleanup() { + chmod u+rw "$target" 2>/dev/null || true + chmod u+rwx "$tmpdir" 2>/dev/null || true + rm -rf -- "$tmpdir" +} +trap cleanup EXIT +printf 'stale output\n' > "$target" +chmod 444 "$target" +chmod 555 "$tmpdir" +printf 'int main(void) { return 0; }\n' | stack exec htcc -- -o "$target" /dev/stdin +``` + +Always restore permissions before cleanup; use a `trap` so interrupted QA does not leave read-only temp paths behind. + +## Verification + +- `stylish-haskell -i src/Htcc/Output.hs test/Tests/ComponentsTests/AsmOutput.hs` +- `lsp_diagnostics` on both files when the diagnostics tool is available. If unavailable, state that and substitute `stack build` plus the relevant test command. +- `stack test --test-arguments components`. +- Full `stack test` when output replacement affects subprocess behavior. +- `stack build`. +- Manual permission/fallback QA. + +## Review focus + +- Ordering of `copy`, `chmod`, `restore`, and `rollback`. +- PermissionError-only fallback; non-permission errors should not be silently retried as permission problems. +- Data loss during partial replacement and restoration failure reporting. diff --git a/.agents/skills/htcc-output-permission-workflow/agents/openai.yaml b/.agents/skills/htcc-output-permission-workflow/agents/openai.yaml new file mode 100644 index 0000000..c4f44ad --- /dev/null +++ b/.agents/skills/htcc-output-permission-workflow/agents/openai.yaml @@ -0,0 +1,4 @@ +interface: + display_name: "HTCC Output Permission Workflow" + short_description: "Guide htcc output permission fixes" + default_prompt: "Use $htcc-output-permission-workflow to handle htcc output replacement and permission regressions safely." diff --git a/.agents/skills/htcc-parser-regression-workflow/SKILL.md b/.agents/skills/htcc-parser-regression-workflow/SKILL.md new file mode 100644 index 0000000..44b963c --- /dev/null +++ b/.agents/skills/htcc-parser-regression-workflow/SKILL.md @@ -0,0 +1,59 @@ +--- +name: htcc-parser-regression-workflow +description: "Fix htcc parser regressions with focused component tests, malformed-input QA, and nontermination safeguards." +risk: safe +source: project +date_added: "2026-05-18" +--- + +## Use this skill when + +- Editing `src/Htcc/Parser/**` or parser-facing code in `app/Main.hs`. +- Review feedback mentions parse acceptance, parse errors, `ATEmpty`, `manyTill`, EOF, `Megaparsec`, scope restoration, declarations, initializers, or implicit calls. +- A malformed C input can hang, parse successfully by mistake, or produce the wrong diagnostic. + +## Do not use this skill when + +- The change is pure codegen/output behavior after parsing has already succeeded. + +## Investigation steps + +1. Read the exact parser function and the nearest combinators it composes with. +2. Search for zero-width success in recursive or repeated contexts: `M.many`, `M.manyTill`, `M.option`, `ATEmpty`, `M.eof`, `M.try`. +3. Identify whether the parser must fail, consume input, or produce `ATEmpty` intentionally. +4. Check existing component tests in `test/Tests/ComponentsTests/Parser/Combinators.hs` before adding new helpers. + +## Regression test placement + +- Put parser unit regressions in `test/Tests/ComponentsTests/Parser/Combinators.hs` when direct `parseProgram`, `parseAssignExpr`, or helper assertions can reproduce the bug. +- Put CLI parser regressions in subprocess tests only when stderr/exit-code behavior is the relevant contract. +- Prefer adding cases under the existing `Parser.Program.*` group matching the construct, such as `function-call`, `scalar-initializer`, or `function-pointer-arithmetic`. +- For call argument lists, trailing commas such as `f(1,)` are malformed. For initializer lists, check existing accepted C-like trailing-comma behavior before turning a trailing comma into a rejection. +- For initializer EOF regressions, consider both file-scope and local-scope shapes such as `int g =`, `int g = {1`, and `int main(){ int x =`. + +## Nontermination safeguards + +- Any bug involving EOF, repeated combinators, or `manyTill` must have a timeout-guarded manual QA command, for example: + +```bash +printf 'int main(){ return f(' | timeout 5s stack exec htcc -- /dev/stdin +``` + +- The expected outcome for malformed input is non-zero exit before the timeout, not success. +- Use `timeout` or `gtimeout`, whichever exists on the host, and report the command used. +- Do not fix nontermination by accepting malformed input as `ATEmpty`. + +## Verification + +- `stylish-haskell -i test/Tests/ComponentsTests/Parser/Combinators.hs` +- `lsp_diagnostics` on edited Haskell files when the diagnostics tool is available. If unavailable, state that and substitute `stack build` plus the relevant test command. +- `stack test --test-arguments components`. +- Full `stack test` for parser-wide behavior or when the change affects common expression/statement parsing. +- `stack build`. +- Manual CLI QA for malformed inputs when user-visible parser behavior or hangs are involved. + +## Review focus + +- Does the fix remove the bad acceptance path without breaking intended empty statements or empty `for` sections? +- Do the tests cover both minimal malformed shape and comma/trailing-argument shape where applicable? +- Does the parser fail with a parse error rather than looping or producing a misleading successful AST? diff --git a/.agents/skills/htcc-parser-regression-workflow/agents/openai.yaml b/.agents/skills/htcc-parser-regression-workflow/agents/openai.yaml new file mode 100644 index 0000000..3758709 --- /dev/null +++ b/.agents/skills/htcc-parser-regression-workflow/agents/openai.yaml @@ -0,0 +1,4 @@ +interface: + display_name: "HTCC Parser Regression Workflow" + short_description: "Fix htcc parser regressions safely" + default_prompt: "Use $htcc-parser-regression-workflow to fix htcc parser regressions with focused tests and malformed-input QA." diff --git a/.agents/skills/htcc-review-comment-response/SKILL.md b/.agents/skills/htcc-review-comment-response/SKILL.md new file mode 100644 index 0000000..93c4243 --- /dev/null +++ b/.agents/skills/htcc-review-comment-response/SKILL.md @@ -0,0 +1,68 @@ +--- +name: htcc-review-comment-response +description: "Handle htcc review comments end-to-end: reproduce, judge validity, apply minimal fixes, verify, review, and prepare commit/push." +risk: safe +source: project +date_added: "2026-05-18" +--- + +## Use this skill when + +- Addressing GitHub, Codex, or reviewer comments against this htcc repository. +- The comment points at parser, CLI, CI, codegen, output replacement, Docker, or test behavior. +- You need to decide whether a review finding is valid before changing code. + +## Do not use this skill when + +- The task is only to explain a comment without changing or verifying anything. +- The task is a generic Haskell question unrelated to htcc review feedback. + +## Core workflow + +1. Read the exact reviewed file and nearby code before judging the comment. +2. Reproduce or reason from the current diff; do not assume the reviewer is right. +3. If valid, make the smallest behavior-preserving fix that addresses exactly the comment. +4. Add or extend a regression test that fails on the reviewed bug shape. +5. Run the htcc verification matrix for the touched area. +6. Launch independent review agents after implementation: + - at least two general/code reviewers for behavior and tests, + - one security/operational reviewer. +7. Wait 180-300 seconds when reviewer results are a prerequisite. Treat `No agents completed yet` as pending, not failure, and do not mark review complete while required reviewers are still unfinished. +8. If a reviewer returns empty output, retry or report that reviewer as unavailable; do not silently replace a required review with an unrelated fallback. +9. If all review findings are resolved and the user explicitly asks for a commit, commit with `--no-gpg-sign` and follow the current repo/user rules for trailers. Do not add co-author or other trailers unless the current repo/user rules or the user explicitly require them. Push only after explicit request. + +## GitHub review threads + +- When a GitHub PR URL or unresolved review threads are part of the task, use `gh` or an available GitHub connector to fetch PR metadata, inline review context, and current diff before editing. +- If thread resolution state matters, prefer a GitHub connector or `gh` GraphQL query that exposes unresolved/resolved thread state; flat review comments alone are not enough. +- Stop before commit or push when the user asks to fix comments without publishing local changes. + +## Validity checklist + +- Does the comment cite reachable code on this branch? +- Does the described bad behavior still exist after earlier commits? +- Can a focused test or timeout-guarded command distinguish old and fixed behavior? +- Is the requested behavior consistent with existing htcc patterns? +- Would the fix broaden scope beyond the review comment? If yes, reduce it. + +## Verification requirements + +- Always run `stylish-haskell -i` for edited Haskell files. +- Always run `lsp_diagnostics` on edited Haskell files when the diagnostics tool is available. If unavailable, state that and substitute `stack build` plus the relevant test command. +- Always run the smallest relevant `stack test --test-arguments ...` command, then full `stack test` when parser/core behavior changed. +- Always run a real CLI/manual QA command when the change affects the executable, output files, CI commands, or malformed-input behavior. +- Report any unavailable verifier explicitly, with the substitute used. + +## Common htcc review patterns + +- Parser nontermination or malformed input: use `timeout` or `gtimeout` around `stack exec htcc -- /dev/stdin` and add component parser tests. +- CLI diagnostics: assert stderr text, exit code, stdout absence, and absence of Haskell exception internals such as `HasCallStack`. +- Output replacement and permission handling: test file modes, stale-output preservation, rollback, symlink/hard-link safety, and manual read-only directory scenarios. +- CI test coverage: check every explicit CI command list; defaults in `commandsToRun` do not affect workflows that enumerate commands. + +## Output format + +- State whether each review comment was valid. +- Summarize the minimal fix and regression test. +- List verification commands and outcomes. +- List reviewer outcomes and any declined findings with reasons. diff --git a/.agents/skills/htcc-review-comment-response/agents/openai.yaml b/.agents/skills/htcc-review-comment-response/agents/openai.yaml new file mode 100644 index 0000000..747fbce --- /dev/null +++ b/.agents/skills/htcc-review-comment-response/agents/openai.yaml @@ -0,0 +1,4 @@ +interface: + display_name: "HTCC Review Comment Response" + short_description: "Resolve htcc review comments safely" + default_prompt: "Use $htcc-review-comment-response to validate and address htcc review comments end to end." diff --git a/.agents/skills/htcc-verification-matrix/SKILL.md b/.agents/skills/htcc-verification-matrix/SKILL.md new file mode 100644 index 0000000..cf40b9b --- /dev/null +++ b/.agents/skills/htcc-verification-matrix/SKILL.md @@ -0,0 +1,55 @@ +--- +name: htcc-verification-matrix +description: "Choose and run the correct htcc verification commands for parser, CLI, CI, output permission, Docker, and documentation changes." +risk: safe +source: project +date_added: "2026-05-18" +--- + +## Use this skill when + +- You edited htcc code, tests, CI, scripts, examples, or documentation and need to know what to run. +- You are preparing a final report after implementation. + +## Do not use this skill when + +- No files changed and the user only asked for discussion. + +## Universal checks + +- Haskell edits: `stylish-haskell -i `. +- Haskell edits: `lsp_diagnostics` on each changed `.hs` file when the diagnostics tool is available. If unavailable, state that and substitute `stack build` plus the relevant tests. +- Any edit: `git diff --check`. +- YAML edits: parse `.yml` / `.yaml` files. If `yaml-ls` is unavailable, use `ruby -e 'require "yaml"; YAML.load_file("path")'` or equivalent. +- Before final answer: `git status --short` and report uncommitted files. + +## Change-type matrix + +| Change type | Required tests | Manual QA | +|---|---|---| +| Parser combinators / AST typing | `stack test --test-arguments components`; full `stack test` for core parser behavior | malformed input through `stack exec htcc -- /dev/stdin`, often with `timeout` | +| CLI diagnostics / app/Main.hs | relevant subprocess tests or full `stack test` | run `stack exec htcc -- ...`, capture exit code, stdout, stderr | +| Output replacement / permissions | `stack test --test-arguments components`; full `stack test` if behavior is shared | create temp files/dirs with modes, run `htcc -o`, verify output and modes | +| CI workflow or Travis | `stack test --test-arguments `; YAML parse | inspect `git diff` and ensure every explicit list includes intended command | +| Docker command behavior | `stack test --test-arguments docker` and `stack test --test-arguments docker --test-arguments --clean` when Docker is available | do not fake Docker success; report unavailable Docker explicitly | +| Examples | `cd example && make && cd dist && ./knapsack && ./merge_sorting_linked_list && ./shuffle_and_sort`; `cd example && make docker && make clean_docker` only when Docker is available | execute generated examples when requested or affected | +| Documentation only | no Haskell formatter unless code changed; run referenced commands when they are the contract and feasible | check links/commands touched by the edit when practical | + +## Evidence standard + +- A passing unit test is not enough when user-visible CLI behavior changed; run the CLI. +- A parser test is not enough for hang bugs; include timeout-guarded manual QA. +- Use `timeout` or `gtimeout`, whichever exists on the host, and report the command used. +- A successful build is not enough for CI edits; parse YAML and run the newly added command locally when possible. +- When Docker is unavailable, substitute Docker checks with non-Docker example/build tests where relevant plus command/diff inspection, then report Docker behavior as residual risk. +- If a tool is missing, state the exact missing tool and the substitute check. + +## Reporting template + +```text +Changed: +Verification: +Manual QA: +Review: +Remaining: +``` diff --git a/.agents/skills/htcc-verification-matrix/agents/openai.yaml b/.agents/skills/htcc-verification-matrix/agents/openai.yaml new file mode 100644 index 0000000..2cad75f --- /dev/null +++ b/.agents/skills/htcc-verification-matrix/agents/openai.yaml @@ -0,0 +1,4 @@ +interface: + display_name: "HTCC Verification Matrix" + short_description: "Choose htcc verification commands" + default_prompt: "Use $htcc-verification-matrix to choose and report the right htcc verification commands." diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml index a0753df..317c9d2 100644 --- a/.github/workflows/main.yml +++ b/.github/workflows/main.yml @@ -1,65 +1,71 @@ name: CI + on: push: branches: - master - develop + pull_request: + workflow_dispatch: + +permissions: + contents: read jobs: - test: - name: htcc test - runs-on: ${{ matrix.os }} - strategy: - matrix: - os: - - ubuntu-18.04 - - ubuntu-20.04 - # - windows-latest - # - macos-latest - ghc: ["8.6.5"] - cabal: ["3.0"] - steps: - - uses: actions/checkout@v2 - - name: Setup Haskell to ${{ matrix.os }} (GHC ${{ matrix.ghc }}) - uses: actions/setup-haskell@v1.1.4 - with: - enable-stack: true - stack-setup-ghc: true - ghc-version: ${{ matrix.ghc }} - cabal-version: ${{ matrix.cabal }} - - name: Cache - uses: actions/cache@v2 - id: stack-cache - with: - path: ~/.stack - key: stack-v3-${{ runner.os }}-${{ hashFiles('stack.yaml') }} - - name: Setup Dhall to yaml - if: runner.os == 'Linux' - run: .travis/install-dhall-to-yaml.sh - - name: Add path - run: echo "$HOME/.local/bin" >> $GITHUB_PATH - - name: Install dependencies to ${{ matrix.os }} - run: stack build --only-dependencies - - name: Build htcc to ${{ matrix.os }} - run: stack install - - name: Test htcc on ${{ matrix.os }} - if: runner.os == 'Linux' - run: | - stack test --test-arguments self - stack test --test-arguments subp - - name: Compile and execute example C codes on ${{ matrix.os }} - if: runner.os == 'Linux' - run: | - make - cd dist - ./knapsack && ./merge_sorting_linked_list && ./shuffle_and_sort - working-directory: ./example - - name: Test htcc on ${{ matrix.os }} with Docker - run: | - stack test --test-arguments docker - stack test --test-arguments docker --test-arguments --clean - - name: Compile and execute example C codes on ${{ matrix.os }} with Docker - run: | - make docker - make clean_docker - working-directory: ./example + test: + name: htcc test + runs-on: ${{ matrix.os }} + strategy: + matrix: + include: + - os: ubuntu-24.04 + ghc: "9.10.3" + cabal: "3.12.1.0" + steps: + # Pinned actions/checkout v6 + - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd + with: + persist-credentials: false + - name: Setup Haskell to ${{ matrix.os }} (GHC ${{ matrix.ghc }}) + # Pinned haskell-actions/setup v2 + uses: haskell-actions/setup@cd0d9bdd65b20557f41bea4dbe43d0b5fbbfe553 + with: + enable-stack: true + stack-setup-ghc: true + ghc-version: ${{ matrix.ghc }} + cabal-version: ${{ matrix.cabal }} + - name: Cache + # Pinned actions/cache v5 + uses: actions/cache@27d5ce7f107fe9357f9df03efb73ab90386fccae + id: stack-cache + with: + path: ~/.stack + key: stack-v4-${{ runner.os }}-${{ hashFiles('stack.yaml', 'stack.yaml.lock') }} + - name: Setup Dhall to yaml + run: .travis/install-dhall-to-yaml.sh + - name: Add path + run: echo "$HOME/.local/bin" >> "$GITHUB_PATH" + - name: Install dependencies to ${{ matrix.os }} + run: stack build --only-dependencies + - name: Build htcc to ${{ matrix.os }} + run: stack install + - name: Test htcc on ${{ matrix.os }} + run: | + stack test --test-arguments self + stack test --test-arguments components + stack test --test-arguments subp + - name: Compile and execute example C codes on ${{ matrix.os }} + run: | + make + cd dist + ./knapsack && ./merge_sorting_linked_list && ./shuffle_and_sort + working-directory: ./example + - name: Test htcc on ${{ matrix.os }} with Docker + run: | + stack test --test-arguments docker + stack test --test-arguments docker --test-arguments --clean + - name: Compile and execute example C codes on ${{ matrix.os }} with Docker + run: | + make docker + make clean_docker + working-directory: ./example diff --git a/.gitignore b/.gitignore index fd0c8a8..a4c5141 100644 --- a/.gitignore +++ b/.gitignore @@ -1,4 +1,34 @@ .stack-work +.stack-root*/ +.cabal-local/ +.cabal-home/ +.cabal-store/ +.cabal-sandbox/ +.codex-home/ +.home/ +.serena/ +.sisyphus/ +.claude/ +dist-newstyle/ +dist-newstyle-review*/ +.cabal-review/ +.stack-review/ +dist-newstyle-sandbox/ +dist-newstyle-sandbox-check/ +dist-newstyle-local/ +dist-review*/ +.home-local/ +htcc-test-noexec-like-cwd-debug.*/ +a.out +.DS_Store +*.hi +*.hi-boot +*.o +*.o-boot +*.dyn_hi +*.dyn_o +tmp* +.tmp* *.swp bench_report.html docs/ diff --git a/.travis.yml b/.travis.yml index 6e0bc8d..a071996 100644 --- a/.travis.yml +++ b/.travis.yml @@ -8,7 +8,6 @@ cache: apt: true directories: - "$HOME/.stack/" - - "$HOME/.local/bin/" - ".stack-work/" addons: apt: @@ -29,14 +28,11 @@ before_install: - | if [ "$TRAVIS_OS_NAME" == "linux" ] || [ "$TRAVIS_OS_NAME" == "osx" ]; then mkdir -p ~/.local/bin + rm -f ~/.local/bin/bash export PATH=$HOME/.local/bin:$PATH ./.travis/install-stack.sh ./.travis/install-dhall-to-yaml.sh fi - echo -e "Host github.com\n\tStrictHostKeyChecking no\nIdentityFile ~/.ssh/deploy.key\n" >> ~/.ssh/config - openssl aes-256-cbc -pass "pass:$SERVER_KEY" -pbkdf2 -in .travis/deploy_key.enc -d -a -out deploy.key - mv deploy.key ~/.ssh/ - chmod 600 ~/.ssh/deploy.key git config --global user.email "falgon53@yahoo.co.jp" git config --global user.name "falgon" git config --global core.autocrlf "input" @@ -55,6 +51,7 @@ jobs: os: linux script: - stack test --test-arguments self + - stack test --test-arguments components - stack test --test-arguments subp - stage: Run compiling and executing example C codes os: linux @@ -73,21 +70,41 @@ jobs: os: linux services: docker script: - - stack test --test-arguments docker - - stack test --test-arguments docker --test-arguments --clean + - docker-compose --version + - DOCKER_COMPOSE=docker-compose stack test --test-arguments docker + - DOCKER_COMPOSE=docker-compose stack test --test-arguments docker --test-arguments --clean - stage: Run compiling and executing example C codes on Docker os: linux services: docker script: - - cd example && make docker && make clean_docker - - if: branch = master + - docker-compose --version + - cd example && make DOCKER_COMPOSE=docker-compose docker && make DOCKER_COMPOSE=docker-compose clean_docker + - if: branch = master AND type = push stage: Deploy gh-pages os: linux script: - - stack clean - - stack haddock --fast - - ghp-import -m "by Travis CI (JOB $TRAVIS_JOB_NUMBER)" -n docs - - git push -qf git@github.com:falgon/htcc.git gh-pages + - test "$TRAVIS_PULL_REQUEST" = "false" + - test "$TRAVIS_BRANCH" = "master" + - | + mkdir -p ~/.ssh + printf '%s\n' \ + 'github.com ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIOMqqnkVzrm0SdG6UOoqKLsabgH5C9okWi0dh2l9GKJl' \ + 'github.com ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBEmKSENjQEezOmxkZMy7opKgwFB9nkt5YRrYMjNuG5N87uRgg6CLrbo5wAdT/y6v0mKV0U2w0WZ2YB/++Tpockg=' \ + 'github.com ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCj7ndNxQowgcQnjshcLrqPEiiphnt+VTTvDP6mHBL9j1aNUkY4Ue1gvwnGLVlOhGeYrnZaMgRK6+PKCUXaDbC7qtbW8gIkhL7aGCsOr/C56SJMy/BCZfxd1nWzAOxSDPgVsmerOBYfNqltV9/hWCqBywINIR+5dIg6JTJ72pcEpEjcYgXkE2YEFXV1JHnsKgbLWNlhScqb2UmyRkQyytRLtL+38TGxkxCflmO+5Z8CSSNY7GidjMIZ7Q4zMjA2n1nGrlTDkzwDCsw+wqFPGQA179cnfGWOWRVruj16z6XyvxvjJwbz0wQZ75XK5tKSb7FNyeIEs4TT4jk+S4dhPeAUC5y+bDYirYgM4GC7uEnztnZyaVWQ7B381AK4Qdrwt51ZqExKbQpTUNn+EjqoTwvqNj4kqx5QUCI0ThS/YkOxJCXmPUWZbhjpCg56i+2aB6CmK2JGhn57K5mj0MNdBXA4/WnwH6XoPWJzK5Nyu2zB3nAZp+S5hpQs+p1vN1/wsjk=' \ + > ~/.ssh/known_hosts + printf '%s\n' \ + 'Host github.com' \ + ' StrictHostKeyChecking yes' \ + ' UserKnownHostsFile ~/.ssh/known_hosts' \ + ' IdentityFile ~/.ssh/deploy.key' \ + >> ~/.ssh/config + - openssl aes-256-cbc -pass "pass:$SERVER_KEY" -pbkdf2 -in .travis/deploy_key.enc -d -a -out deploy.key + - mv deploy.key ~/.ssh/ + - chmod 600 ~/.ssh/deploy.key + - stack clean + - stack haddock --fast + - ghp-import -m "by Travis CI (JOB $TRAVIS_JOB_NUMBER)" -n docs + - git push -qf git@github.com:falgon/htcc.git gh-pages branches: except: - gh-pages diff --git a/.travis/install-dhall-to-yaml.sh b/.travis/install-dhall-to-yaml.sh index d0fed6a..40ae60b 100755 --- a/.travis/install-dhall-to-yaml.sh +++ b/.travis/install-dhall-to-yaml.sh @@ -1,30 +1,72 @@ #!/bin/bash -set -eux - -travis_retry() { - cmd=$* - $cmd || (sleep 2 && $cmd) || (sleep 10 && $cmd) -} - -DALL_BIN_URL=https://github.com/dhall-lang/dhall-haskell/releases/download -TAG_VERSION=1.34.0 -DHALL_TO_JSON_VERSION=1.7.1 - -fetch_dhall-to-yaml_osx() { - curl -sL $DALL_BIN_URL/$TAG_VERSION/dhall-json-$DHALL_TO_JSON_VERSION-x86_64-macos.tar.bz2 |\ - tar xjv -C ~/.local/bin --strip-components=1 -} - -fetch_dhall-to-yaml_linux() { - curl -sL $DALL_BIN_URL/$TAG_VERSION/dhall-json-$DHALL_TO_JSON_VERSION-x86_64-linux.tar.bz2 |\ - tar xjv -C ~/.local/bin --strip-components=2 -} - -mkdir -p ~/.local/bin; -if [ "$(uname)" = "Darwin" ]; then - travis_retry fetch_dhall-to-yaml_osx +set -euo pipefail + +dhall_release_version="1.42.2" +dhall_json_version="1.7.12" + +mkdir -p "${HOME}/.local/bin" + +case "$(uname -s):$(uname -m)" in + Linux:x86_64 | Linux:amd64) + platform="x86_64-linux" + archive_sha256="acbada5e29ecc9b6a723c3f390beb76b9db26df81546d1f472415a2f387bc457" + ;; + Darwin:x86_64) + platform="x86_64-darwin" + archive_sha256="f6b0bc2f120e5ade2c4c789555237cb4a0b4611fb2455f2a16a3bde4a441e589" + ;; + Darwin:arm64 | Darwin:aarch64) + platform="aarch64-darwin" + archive_sha256="761048afa225dc9978b9fb742cc9d4feee104f2656aefe37b6a6f157862b77dd" + ;; + *) + echo "unsupported platform for dhall-to-yaml: $(uname -s):$(uname -m)" >&2 + exit 1 + ;; +esac + +archive="dhall-json-${dhall_json_version}-${platform}.tar.bz2" +url="https://github.com/dhall-lang/dhall-haskell/releases/download/${dhall_release_version}/${archive}" +tmpdir="$(mktemp -d)" +trap 'rm -rf "${tmpdir}"' EXIT + +curl --fail --location --silent --show-error "${url}" --output "${tmpdir}/${archive}" +if command -v sha256sum >/dev/null 2>&1; then + actual_sha256="$(sha256sum "${tmpdir}/${archive}" | awk '{ print $1 }')" else - travis_retry fetch_dhall-to-yaml_linux + actual_sha256="$(shasum -a 256 "${tmpdir}/${archive}" | awk '{ print $1 }')" +fi +if [[ "${actual_sha256}" != "${archive_sha256}" ]]; then + echo "checksum mismatch for ${archive}" >&2 + echo "expected: ${archive_sha256}" >&2 + echo "actual: ${actual_sha256}" >&2 + exit 1 +fi + +expected_entries="$(printf '%s\n' \ + "bin/dhall-to-json" \ + "bin/dhall-to-yaml" \ + "bin/json-to-dhall")" +actual_entries="$(tar -tjf "${tmpdir}/${archive}")" +if [[ "${actual_entries}" != "${expected_entries}" ]]; then + echo "unexpected entries in ${archive}" >&2 + echo "${actual_entries}" >&2 + exit 1 +fi + +if ! tar -tvjf "${tmpdir}/${archive}" | awk ' + BEGIN { + ok["bin/dhall-to-json"] = 1 + ok["bin/dhall-to-yaml"] = 1 + ok["bin/json-to-dhall"] = 1 + } + substr($1, 1, 1) != "-" || !($NF in ok) { exit 1 } +'; then + echo "unsafe entries in ${archive}" >&2 + exit 1 fi +tar -xjf "${tmpdir}/${archive}" -C "${tmpdir}" bin/dhall-to-yaml +install -m 0755 "${tmpdir}/bin/dhall-to-yaml" "${HOME}/.local/bin/dhall-to-yaml" +"${HOME}/.local/bin/dhall-to-yaml" --version diff --git a/.travis/install-stack.sh b/.travis/install-stack.sh index 9395701..a6dd44a 100755 --- a/.travis/install-stack.sh +++ b/.travis/install-stack.sh @@ -1,25 +1,37 @@ #!/bin/bash -set -eux +set -euo pipefail travis_retry() { - cmd=$* - $cmd || (sleep 2 && $cmd) || (sleep 10 && $cmd) + "$@" || (sleep 2 && "$@") || (sleep 10 && "$@") } -fetch_stack_osx() { - curl -skL https://www.stackage.org/stack/osx-x86_64 | tar xz --strip-components=1 --include '*/stack' -C ~/.local/bin; -} +STACK_VERSION=3.9.3 +STACK_RELEASE_URL=https://github.com/commercialhaskell/stack/releases/download/v${STACK_VERSION} -fetch_stack_linux() { - curl -sL https://www.stackage.org/stack/linux-x86_64 | tar xz --wildcards --strip-components=1 -C ~/.local/bin '*/stack'; -} +case "$(uname)" in + Darwin) + STACK_PLATFORM=osx-x86_64 + STACK_SHA256=e9139c80dc0e2d5df6bef45d40e20c8ac9ca2dbd1bb31358782fa5446d0df38c + ;; + Linux) + STACK_PLATFORM=linux-x86_64 + STACK_SHA256=bc45cf6e1d00910348dcceb510a469056f6d9c63997230f901c22cf997598b05 + ;; + *) + echo "Unsupported platform: $(uname)" >&2 + exit 1 + ;; +esac + +STACK_ARCHIVE=stack-${STACK_VERSION}-${STACK_PLATFORM}.tar.gz +TMP_DIR=$(mktemp -d) +trap 'rm -rf "${TMP_DIR}"' EXIT -mkdir -p ~/.local/bin; -if [ "$(uname)" = "Darwin" ]; then - travis_retry fetch_stack_osx -else - travis_retry fetch_stack_linux -fi +mkdir -p "${HOME}/.local/bin" +travis_retry curl -fsSLo "${TMP_DIR}/${STACK_ARCHIVE}" "${STACK_RELEASE_URL}/${STACK_ARCHIVE}" +printf '%s %s\n' "${STACK_SHA256}" "${TMP_DIR}/${STACK_ARCHIVE}" | shasum -a 256 -c - +tar xzf "${TMP_DIR}/${STACK_ARCHIVE}" -C "${TMP_DIR}" +install -m 0755 "${TMP_DIR}/stack-${STACK_VERSION}-${STACK_PLATFORM}/stack" "${HOME}/.local/bin/stack" -travis_retry stack --no-terminal setup; +travis_retry stack --no-terminal setup diff --git a/README.md b/README.md index f502fb1..c7c9fc5 100644 --- a/README.md +++ b/README.md @@ -58,9 +58,14 @@ $ gcc -no-pie t.s -o out For one liner: ```sh -$ echo 'int printf(); int main() { printf("hello world!\n"); }' | stack exec htcc -- /dev/stdin | gcc -xassembler -no-pie -o out - +$ echo 'int printf(); int main() { printf("hello world!\n"); }' | stack exec htcc -- /dev/stdin | gcc -x assembler -no-pie -o out - ``` +When using `-r`/`--run-asm`, htcc uses the driver selected by +`$HTCC_ASSEMBLER` for both assembly and linking. +On hosts where `gcc` points to clang, set `HTCC_ASSEMBLER` to a +GNU-compatible compiler driver before running htcc. + ## AST diagram generation htcc has the ability to visualize ASTs built from loaded C code. @@ -164,8 +169,9 @@ The implementation description is available in [here](https://falgon.github.io/h ## Specification and Requirements htcc outputs x86_64 assembly according to System V ABI [[2]](#cite2) and -[GCC 7.4.0](https://gcc.gnu.org/onlinedocs/7.4.0/) is used for assemble. -Perhaps a newer version of GCC will work, but not checked currently. +[GCC 7.4.0](https://gcc.gnu.org/onlinedocs/7.4.0/) is used for assemble. +Perhaps a newer version of GCC will work, but not checked currently. When +assembling via `-r`, choose the driver with `HTCC_ASSEMBLER`. ## About emoji of commit messages diff --git a/app/Main.hs b/app/Main.hs index 3b34fbf..e40d438 100644 --- a/app/Main.hs +++ b/app/Main.hs @@ -1,101 +1,6781 @@ -{-# LANGUAGE OverloadedStrings #-} +{-# LANGUAGE LambdaCase, TemplateHaskell, TupleSections #-} module Main where -import Control.Conditional (ifM) -import Data.Bool (bool) -import Data.List.Split (splitOn) -import Data.Maybe (isJust, isNothing) -import qualified Data.Text.IO as T -import Data.Tuple.Extra (both, dupe, fst3) -import Diagrams.TwoD.Size (mkSizeSpec2D) -import Options.Applicative -import System.Directory (doesFileExist) -import System.Exit (exitFailure) -import Text.PrettyPrint.ANSI.Leijen (char, linebreak, text, - (<+>)) -import Text.Read (readMaybe) - -import Htcc.Asm (InputCCode, casm, - execAST) -import Htcc.Parser (ASTs) -import Htcc.Parser.ConstructionData.Scope.Var (GlobalVars, Literals) -import Htcc.Utils (errTxtDoc, locTxtDoc, - putDocLnErr, - putStrLnErr) -import Htcc.Visualizer (visualize) - -data Options = Options - { visualizeAST :: Bool - , resolution :: String - , inputFName :: FilePath - , outputFName :: FilePath - , supressWarn :: Bool - } deriving Show - -visualizeASTP :: Parser Bool -visualizeASTP = switch $ mconcat - [ long "visualize-ast" - , help "Visualize an AST built from source code" +import Control.Applicative ((<|>)) +import Control.Concurrent (forkIO, + threadDelay, + yield) +import Control.Concurrent.MVar (MVar, + newEmptyMVar, + putMVar, takeMVar) +import Control.Concurrent.STM (STM, TVar, + atomically, check, + newTVarIO, orElse, + readTVar, + writeTVar) +import Control.Exception (SomeException, + bracket, evaluate, + finally, throwIO, + try) +import Control.Monad (foldM, forM_, + unless, when, + (>=>)) +import Data.Bifunctor (first) +import Data.Bits (Bits (shiftL, shiftR, (.&.), (.|.)), + xor) +import Data.Bool (bool) +import qualified Data.ByteString as B +import qualified Data.ByteString.Char8 as BC +import Data.Char (digitToInt, + isAlpha, + isAlphaNum, + isDigit, + isHexDigit, + isSpace, toLower) +import Data.Either (fromRight) +import Data.Functor (($>), (<&>)) +import Data.IORef (modifyIORef', + newIORef, + readIORef, + writeIORef) +import Data.List (intercalate, + isInfixOf, + isPrefixOf, + mapAccumL, sortOn, + stripPrefix) +import qualified Data.List as List +import Data.List.NonEmpty (NonEmpty (..)) +import Data.Maybe (catMaybes, + fromMaybe, isJust, + mapMaybe) +import qualified Data.Sequence as SQ +import qualified Data.Text.IO as T +import Data.Version (showVersion) +import Data.Word (Word64, Word8) +import Language.Haskell.TH.Syntax (addDependentFile, + lift, runIO) +import qualified Options.Applicative as OA +import qualified Paths_htcc as P + +import qualified Data.Map.Strict as Map +import qualified Data.Set as Set +import qualified Data.Text as T +import Data.Void +import Diagrams.Prelude (V2) +import Diagrams.Size (SizeSpec) +import Diagrams.TwoD.Size (mkSizeSpec2D) +import GHC.Conc (threadWaitReadSTM) +import GHC.IO.Device (SeekMode (AbsoluteSeek)) +import GHC.IO.Exception (IOErrorType (IllegalOperation, InappropriateType, InvalidArgument, NoSuchThing, PermissionDenied, ResourceExhausted, ResourceVanished)) +import GHC.IO.Handle (hDuplicate) +import Htcc.Asm (casmNormalized', + prepareAsmInput, + prepareVisualizableInput) +import qualified Htcc.Asm.Intrinsic.Structure.Internal as SI +import qualified Htcc.CRules.Types as CT +import qualified Htcc.MegaparsecCompat as M +import Htcc.Output (ReplacementOutputMode (..), + creationMaskedOutputMode, + resolveReplacementOutputPath, + stagedOutputMode, + temporaryWritableMode, + withReplacementOutputPath, + withReplacementOutputPathAndResolvedPath) +import Htcc.Parser (ASTs, ATKind (..), + ATKindFor (..), + ATree (..)) +import Htcc.Parser.Combinators (parser, runParser) +import qualified Htcc.Parser.Combinators.ParserType as PT +import Htcc.Parser.ConstructionData.Core (Warnings) +import qualified Htcc.Parser.ConstructionData.Scope.Function as PF +import Htcc.Parser.ConstructionData.Scope.Var (GVar (..), + GVarInitData (..), + GVarInitWith (..), + GlobalVars, + Literal (..), + Literals) +import Htcc.Utils +import Htcc.Visualizer (validateVisualizationOutputPath, + writeVisualization) +import Htcc.WarningSuppression (CompilerOutputChunk, + CompilerWarningFilterDecision (..), + IncrementalCompilerWarningFilter, + emptyIncrementalCompilerWarningFilter, + feedIncrementalCompilerWarningFilter, + filterCompilerOutputChunks, + finalCompilerOutputChunk, + finalizeIncrementalCompilerWarningFilter, + incompleteCompilerOutputNeedsMoreInputForWarningSuppression, + newlineByte, + normalizeCompilerOutputLine, + splitCompleteCompilerOutputChunks) +import Numeric.Natural (Natural) +import System.Directory (canonicalizePath, + doesDirectoryExist, + doesFileExist, + executable, + getCurrentDirectory, + getPermissions, + getTemporaryDirectory, + listDirectory, + makeAbsolute, + removeFile) +import System.Environment (getEnvironment, + lookupEnv) +import System.Exit (ExitCode (..), + exitFailure) +import System.FilePath (isAbsolute, + normalise, + searchPathSeparator, + takeFileName, + ()) +import System.Info (os) +import qualified System.IO as IO +import System.IO (Handle, + IOMode (ReadMode, WriteMode), + hClose, hFlush, + hPutStr, + hSetBinaryMode, + openTempFile, + stderr, stdout, + withFile) +import System.IO.Error (catchIOError, + ioeGetErrorString, + ioeGetErrorType, + isDoesNotExistError, + isEOFError) +import System.Posix.Files (FileStatus, + deviceID, fileID, + fileMode, + fileSize, + getFdStatus, + getFileStatus, + getSymbolicLinkStatus, + groupExecuteMode, + intersectFileModes, + isCharacterDevice, + isRegularFile, + linkCount, + modificationTimeHiRes, + otherExecuteMode, + ownerExecuteMode, + ownerReadMode, + setFdMode, + setFdSize, + setFileMode, + statusChangeTimeHiRes, + unionFileModes) +import System.Posix.IO (FdOption (NonBlockingRead), + OpenFileFlags (..), + OpenMode (ReadOnly, WriteOnly), + closeFd, + defaultFileFlags, + fdSeek, + handleToFd, + openFd, + setFdOption) +import qualified System.Posix.IO.ByteString as PB +import System.Posix.Signals (Signal, + nullSignal, + sigKILL, sigTERM, + signalProcessGroup) +import System.Posix.Types (Fd, FileMode, + ProcessGroupID) +import System.Process (CreateProcess (..), + ProcessHandle, + StdStream (CreatePipe, Inherit), + getPid, proc, + readCreateProcessWithExitCode, + readProcessWithExitCode, + waitForProcess, + withCreateProcess) +import System.Timeout (timeout) +import qualified Text.Parsec as Parsec +import Text.Read (readMaybe) + +data Opts = Opts + { optIsRunAsm :: !Bool + , optIsVerbose :: !Bool + , optVisualizeAst :: !Bool + , optImgResolution :: !(Maybe String) + , optSuppressWarns :: !Bool + , optOutput :: Maybe FilePath + , optInput :: [FilePath] + } deriving (Read, Show) + +output :: OA.Parser (Maybe String) +output = OA.optional $ outputOption <|> legacyOutputOption + where + outputHelp = "Place the output into (legacy alias: --out)" + outputOption = OA.strOption $ mconcat [ + OA.metavar "" + , OA.long "output" + , OA.short 'o' + , OA.help outputHelp + ] + legacyOutputOption = OA.strOption $ mconcat [ + OA.metavar "" + , OA.long "out" + , OA.internal + ] + +visualizeAst :: OA.Parser Bool +visualizeAst = OA.switch $ mconcat [ + OA.long "visualize-ast" + , OA.help "Visualize ASTs instead of emitting assembly" + ] + +imgResolution :: OA.Parser (Maybe String) +imgResolution = OA.optional $ OA.strOption $ mconcat [ + OA.metavar "RESOLUTION" + , OA.long "img-resolution" + , OA.help "Specify the output size for --visualize-ast as WIDTHxHEIGHT (default: 640x480)" + ] + +suppressWarns :: OA.Parser Bool +suppressWarns = canonicalFlag <|> legacyFlag <|> pure False + where + canonicalFlag = OA.flag' True $ mconcat + [ OA.long "suppress-warns" + , OA.short 'w' + , OA.help "Disable all warning messages (legacy alias: --supress-warns)" + ] + legacyFlag = OA.flag' True $ mconcat + [ OA.long "supress-warns" + , OA.internal + ] + +input :: OA.Parser [String] +input = OA.some $ OA.strArgument $ mconcat [ + OA.metavar "file..." + , OA.help "Input source files" + ] + +isRunAsm :: OA.Parser Bool +isRunAsm = OA.switch $ mconcat [ + OA.long "run-asm" + , OA.short 'r' + , OA.help "Generates executable binaries via the driver selected by $HTCC_ASSEMBLER" + ] + +isVerbose :: OA.Parser Bool +isVerbose = OA.switch $ mconcat [ + OA.long "verbose" + , OA.short 'v' + , OA.help "Show the programs invoked by the compiler" + ] + +programOptions :: OA.Parser Opts +programOptions = Opts + <$> isRunAsm + <*> isVerbose + <*> visualizeAst + <*> imgResolution + <*> suppressWarns + <*> output + <*> input + +versionOption :: OA.Parser (a -> a) +versionOption = OA.infoOption vopt $ mconcat [ + OA.long "version" + , OA.help "Show compiler version information" + ] + where + vopt = concat [ + "The C Language Compiler htcc " + , showVersion P.version + , "\ncommit hash: " + , gitHashValue + ] + +gitHashValue :: String +gitHashValue = $(do + let trim = reverse . dropWhile isSpace . reverse + readGit args = + catchIOError + (do + (exitCode, stdoutOut, _) <- readProcessWithExitCode "git" args "" + pure $ case exitCode of + ExitSuccess -> Just $ trim stdoutOut + ExitFailure _ -> Nothing + ) + (const $ pure Nothing) + addGitDependency path = do + exists <- runIO $ doesFileExist path + when exists $ addDependentFile path + + gitDir <- runIO $ readGit ["rev-parse", "--git-dir"] + case gitDir of + Just dir -> do + let headPath = dir "HEAD" + addGitDependency headPath + addGitDependency $ dir "packed-refs" + headRef <- runIO $ + catchIOError + (stripPrefix "ref: " . trim <$> readFile headPath) + (const $ pure Nothing) + maybe (pure ()) (addGitDependency . (dir )) headRef + Nothing -> pure () + + lift . fromMaybe "unknown" =<< runIO (readGit ["rev-parse", "HEAD"]) + ) + +optsParser :: OA.ParserInfo Opts +optsParser = OA.info (OA.helper <*> versionOption <*> programOptions) $ mconcat [ + OA.fullDesc + , OA.progDesc $ "The C Language Compiler htcc " ++ showVersion P.version + ] + +ignoreIOException :: IO () -> IO () +ignoreIOException = flip catchIOError $ const $ pure () + +nonEmptyEnv :: Maybe String -> Maybe String +nonEmptyEnv (Just s) | all isSpace s = Nothing +nonEmptyEnv x = x + +data ShellWordSpan = ShellWordSpan + { shellWordSpanText :: String + , shellWordSpanAllowsPosixExpansion :: Bool + , shellWordSpanAllowsPosixFieldSplitting :: Bool + } + +data ParsedShellWord = ParsedShellWord + { parsedShellWordSpans :: [ShellWordSpan] + , parsedShellWordPreservesEmptyField :: Bool + } + +newtype CompilerEnvOverrideSpec = CompilerEnvOverrideSpec + { compilerEnvOverrideSpans :: [ShellWordSpan] + } + +data PosixShellContext + = PosixShellLiteralContext + | PosixShellDoubleQuotedContext + | PosixShellUnquotedContext + deriving (Eq) + +data ShellWordChar = ShellWordChar + { shellWordCharText :: Char + , shellWordCharContext :: PosixShellContext + } + +shellWordText :: ParsedShellWord -> String +shellWordText = + concatMap shellWordSpanText . parsedShellWordSpans + +environmentAssignmentName :: ParsedShellWord -> Maybe String +environmentAssignmentName parsedWord = + case span isAssignmentNameChar $ shellWordCharsFromSpans $ parsedShellWordSpans parsedWord of + (firstChar:remainingNameChars, equalsChar:_) + | startsLikeIdentifier (shellWordCharText firstChar) + && all (isEnvironmentVariableNameChar . shellWordCharText) remainingNameChars + && shellWordCharContext equalsChar == PosixShellUnquotedContext + && shellWordCharText equalsChar == '=' -> + Just $ map shellWordCharText (firstChar : remainingNameChars) + _ -> + Nothing + where + isAssignmentNameChar wordChar = + shellWordCharContext wordChar == PosixShellUnquotedContext + && isEnvironmentVariableNameChar (shellWordCharText wordChar) + +isEnvironmentAssignmentWord :: ParsedShellWord -> Bool +isEnvironmentAssignmentWord = + isJust . environmentAssignmentName + +literalShellWordSpan :: String -> ShellWordSpan +literalShellWordSpan text = + ShellWordSpan + { shellWordSpanText = text + , shellWordSpanAllowsPosixExpansion = False + , shellWordSpanAllowsPosixFieldSplitting = False + } + +doubleQuotedExpandableShellWordSpan :: String -> ShellWordSpan +doubleQuotedExpandableShellWordSpan text = + ShellWordSpan + { shellWordSpanText = text + , shellWordSpanAllowsPosixExpansion = True + , shellWordSpanAllowsPosixFieldSplitting = False + } + +unquotedExpandableShellWordSpan :: String -> ShellWordSpan +unquotedExpandableShellWordSpan text = + ShellWordSpan + { shellWordSpanText = text + , shellWordSpanAllowsPosixExpansion = True + , shellWordSpanAllowsPosixFieldSplitting = True + } + +coalesceShellWordSpans :: [ShellWordSpan] -> [ShellWordSpan] +coalesceShellWordSpans = + foldr + (\currentSpan accumulatedSpans -> + case accumulatedSpans of + nextSpan : remainingSpans + | shellWordSpanAllowsPosixExpansion currentSpan + == shellWordSpanAllowsPosixExpansion nextSpan + && shellWordSpanAllowsPosixFieldSplitting currentSpan + == shellWordSpanAllowsPosixFieldSplitting nextSpan -> + nextSpan + { shellWordSpanText = + shellWordSpanText currentSpan <> shellWordSpanText nextSpan + } + : remainingSpans + _ -> + currentSpan : accumulatedSpans + ) + [] + +dropShellWordSpanChars :: Int -> [ShellWordSpan] -> [ShellWordSpan] +dropShellWordSpanChars charsToDrop spans + | charsToDrop <= 0 = spans + | otherwise = + case spans of + [] -> + [] + currentSpan : remainingSpans -> + let spanText = shellWordSpanText currentSpan + spanLength = length spanText + in if charsToDrop < spanLength + then + currentSpan + { shellWordSpanText = drop charsToDrop spanText + } + : remainingSpans + else + dropShellWordSpanChars (charsToDrop - spanLength) remainingSpans + +shellWordsWithContext :: String -> Either String [ParsedShellWord] +shellWordsWithContext commandLine = + either (Left . show) Right $ Parsec.parse shellParser "" commandLine + where + shellParser = skipSpaces *> Parsec.sepEndBy word spaces <* Parsec.eof + skipSpaces = Parsec.skipMany $ Parsec.satisfy isSpace + spaces = Parsec.skipMany1 $ Parsec.satisfy isSpace + word = buildParsedShellWord <$> Parsec.many1 chunk + chunk + | os == "mingw32" = + Parsec.choice [doubleQuoted, caretEscaped, bare] + | otherwise = + Parsec.choice [singleQuoted, doubleQuoted, escaped, bare] + singleQuoted = do + text <- Parsec.char '\'' *> Parsec.manyTill Parsec.anyChar (Parsec.char '\'') + pure (True, [literalShellWordSpan text]) + doubleQuoted = + fmap ((,) True . coalesceShellWordSpans . concat) $ + Parsec.char '"' *> Parsec.manyTill doubleChunk (Parsec.char '"') + doubleChunk + | os == "mingw32" = + Parsec.choice + [ doubleEscaped + , doublePercentEscaped + , doublePercentChar + , doubleCaretEscaped + , doubleBare + ] + | otherwise = + Parsec.choice [doubleEscaped, doubleBare] + doubleBare = + pure . doubleQuotedExpandableShellWordSpan + <$> Parsec.many1 (Parsec.satisfy isDoubleBareChar) + doubleEscaped = do + _ <- Parsec.char '\\' + c <- Parsec.anyChar + pure . pure . literalShellWordSpan $ case c of + '\\' -> "\\" + '"' -> "\"" + '`' -> "`" + '$' -> "$" + '\n' -> "" + _ -> ['\\', c] + doublePercentEscaped = do + _ <- Parsec.try $ Parsec.string "%%" + pure [literalShellWordSpan "%"] + doublePercentChar = do + _ <- Parsec.char '%' + pure [doubleQuotedExpandableShellWordSpan "%"] + doubleCaretEscaped = do + _ <- Parsec.char '^' + c <- Parsec.anyChar + pure [literalShellWordSpan [c]] + escaped = do + _ <- Parsec.char '\\' + c <- Parsec.anyChar + pure (False, [literalShellWordSpan $ case c of + '\n' -> "" + _ -> [c] + ]) + caretEscaped = do + _ <- Parsec.char '^' + c <- Parsec.anyChar + pure (False, [literalShellWordSpan [c]]) + bare = do + text <- Parsec.many1 (Parsec.satisfy isBareNonBackslashChar) + pure (False, [unquotedExpandableShellWordSpan text]) + + buildParsedShellWord chunks = + ParsedShellWord + { parsedShellWordSpans = + coalesceShellWordSpans $ concatMap snd chunks + , parsedShellWordPreservesEmptyField = + any fst chunks + } + + isBareNonBackslashChar c = + not (isSpace c) + && c /= '"' + && (os /= "mingw32" || c /= '^') + && (os == "mingw32" || c /= '\'') + && (os == "mingw32" || c /= '\\') + + isDoubleBareChar c = + c /= '"' + && c /= '\\' + && (os /= "mingw32" || c /= '%') + && (os /= "mingw32" || c /= '^') + +shellWords :: String -> Either String [String] +shellWords = + fmap (map shellWordText) . shellWordsWithContext + +data CompilerCommand = CompilerCommand + { compilerEnvOverrides :: [(String, String)] + , compilerEnvOverrideSpecs :: Maybe [CompilerEnvOverrideSpec] + , compilerBaseEnvironment :: Map.Map String String + , compilerExecutable :: FilePath + , compilerArguments :: [String] + } + +resolveCompilerCommand :: String -> IO CompilerCommand +resolveCompilerCommand = + resolveCompilerCommandIn Nothing + +resolveCompilerCommandIn :: Maybe FilePath -> String -> IO CompilerCommand +resolveCompilerCommandIn maybeWorkingDir compiler = do + parsedParts <- case shellWordsWithContext compiler of + Left parseErr -> ioError . userError $ + "failed to parse compiler command " <> show compiler <> ": " <> parseErr + Right [] -> ioError . userError $ + "empty compiler command: " <> show compiler + Right xs -> pure xs + let (envAssignmentWords, initialCompilerWords) = + span isEnvironmentAssignmentWord parsedParts + splitAssignments = + map splitEnvironmentAssignment envAssignmentWords + initialEnvOverrides = + map fst splitAssignments + initialEnvOverrideSpecs = + map snd splitAssignments + baseEnvironment <- baseProcessEnvironment maybeWorkingDir + let expandedEnvOverrides = + environmentFromList $ + expandEnvironmentOverridesWithBaseEnvironment + baseEnvironment + initialEnvOverrides + (Just initialEnvOverrideSpecs) + compilerWordExpansionEnvironment + | os == "mingw32" = + Map.union expandedEnvOverrides baseEnvironment + | otherwise = + baseEnvironment + compilerResolutionEnvironment = + Map.union expandedEnvOverrides baseEnvironment + compilerParts = + concatMap + (expandParsedShellWordIntoArguments compilerWordExpansionEnvironment) + initialCompilerWords + envOverrides = initialEnvOverrides + envOverrideSpecs = initialEnvOverrideSpecs + when (null compilerParts) . ioError . userError $ + "empty compiler command: " <> show compiler + resolvedPrefix <- + findExecutablePrefix + maybeWorkingDir + compilerResolutionEnvironment + (hasExplicitSearchPathOverride envOverrides) + compilerParts + pure $ + case resolvedPrefix of + Just (compilerLen, resolvedCompiler) -> + CompilerCommand + { compilerEnvOverrides = envOverrides + , compilerEnvOverrideSpecs = Just envOverrideSpecs + , compilerBaseEnvironment = baseEnvironment + , compilerExecutable = resolvedCompiler + , compilerArguments = drop compilerLen compilerParts + } + Nothing -> + CompilerCommand + { compilerEnvOverrides = envOverrides + , compilerEnvOverrideSpecs = Just envOverrideSpecs + , compilerBaseEnvironment = baseEnvironment + , compilerExecutable = head compilerParts + , compilerArguments = tail compilerParts + } + where + splitEnvironmentAssignment word = + case environmentAssignmentName word of + Just name -> + ( (name, value) + , CompilerEnvOverrideSpec $ + dropShellWordSpanChars (length name + 1) (parsedShellWordSpans word) + ) + where + value = drop (length name + 1) (shellWordText word) + Nothing -> + error "internal compiler error" + + hasExplicitSearchPathOverride = + any + ((== environmentNameKey "PATH") . environmentNameKey . fst) + + findExecutablePrefix _ _ _ [] = pure Nothing + findExecutablePrefix maybeWorkingDir' envOverrides' explicitSearchPathOverride (cmd:_) = do + resolved <- + resolveExecutableCommand + maybeWorkingDir' + envOverrides' + explicitSearchPathOverride + cmd + pure $ fmap (1,) resolved + + resolveExecutableCommand maybeWorkingDir' envOverrides' explicitSearchPathOverride cmd = do + if hasExplicitPath cmd + then localExecutablePath maybeWorkingDir' cmd + else + findExecutableInSearchPath + maybeWorkingDir' + envOverrides' + explicitSearchPathOverride + cmd + + findExecutableInSearchPath maybeWorkingDir' envOverrides' explicitSearchPathOverride cmd = do + pathValue <- maybe + (fromMaybe "" <$> lookupEnv "PATH") + pure + (environmentLookup "PATH" envOverrides') + maybeResolvedFromPath <- firstResolved $ + map (localExecutablePath maybeWorkingDir' . searchPathCommand cmd) $ + searchPathEntries pathValue + case maybeResolvedFromPath of + Just resolvedFromPath -> + pure $ Just resolvedFromPath + Nothing + | explicitSearchPathOverride -> + pure Nothing + | otherwise -> + localExecutablePath maybeWorkingDir' cmd + + searchPathCommand cmd "" = cmd + searchPathCommand cmd dir = dir cmd + + searchPathEntries pathValue = case break (== searchPathSeparator) pathValue of + (dir, []) -> [dir] + (dir, _:remain) -> dir : searchPathEntries remain + + localExecutablePath maybeWorkingDir' cmd = do + let candidatePath = + normalise $ + case maybeWorkingDir' of + Just workingDir + | not (isAbsolute cmd) -> + workingDir cmd + _ -> + cmd + isLocalFile <- doesFileExist candidatePath + isLocalExec <- if isLocalFile then executable <$> getPermissions candidatePath else pure False + pure $ + if isLocalExec + then Just $ normalizeLocalExecutablePath maybeWorkingDir' cmd candidatePath + else Nothing + + hasExplicitPath = any (`elem` ['/', '\\']) + + firstResolved [] = pure Nothing + firstResolved (resolvePath : resolvePaths) = do + resolved <- resolvePath + maybe (firstResolved resolvePaths) (pure . Just) resolved + + normalizeLocalExecutablePath maybeWorkingDir' cmd candidatePath + | isJust maybeWorkingDir' && not (isAbsolute cmd) = candidatePath + | hasExplicitPath cmd = cmd + | otherwise = "./" <> cmd + +compilerInvocationArgs :: CompilerCommand -> [String] -> [String] +compilerInvocationArgs compiler extraArgs = + compilerArguments compiler <> extraArgs + +expandEnvironmentOverrides + :: Maybe FilePath + -> [(String, String)] + -> Maybe [CompilerEnvOverrideSpec] + -> IO [(String, String)] +expandEnvironmentOverrides maybeWorkingDir overrides maybeOverrideSpecs = do + baseEnvironment <- baseProcessEnvironment maybeWorkingDir + pure $ + expandEnvironmentOverridesWithBaseEnvironment + baseEnvironment + overrides + maybeOverrideSpecs + +expandEnvironmentOverridesWithBaseEnvironment + :: Map.Map String String + -> [(String, String)] + -> Maybe [CompilerEnvOverrideSpec] + -> [(String, String)] +expandEnvironmentOverridesWithBaseEnvironment baseEnvironment overrides maybeOverrideSpecs = + reverse . snd $ + List.foldl' + expandOverride + (baseEnvironment, []) + (zipOverrideSpecs overrides maybeOverrideSpecs) + where + expandOverride (expansionEnvironment, expandedOverrides) ((name, value), maybeOverrideSpec) = + let expandedValue = maybe + (expandEnvironmentValue expansionEnvironment value) + (expandEnvironmentValueWithShellSpans expansionEnvironment) + maybeOverrideSpec + in ( environmentInsert name expandedValue expansionEnvironment + , (name, expandedValue) : expandedOverrides + ) + +baseProcessEnvironment :: Maybe FilePath -> IO (Map.Map String String) +baseProcessEnvironment maybeWorkingDir = do + baseEnvironment <- environmentFromList <$> getEnvironment + case maybeWorkingDir of + Just workingDir -> + pure $ environmentInsert "PWD" workingDir baseEnvironment + Nothing -> + case environmentLookup "PWD" baseEnvironment of + Just _ -> + pure baseEnvironment + Nothing -> do + workingDir <- getCurrentDirectory + pure $ environmentInsert "PWD" workingDir baseEnvironment + +environmentNameKey :: String -> String +environmentNameKey name + | os == "mingw32" = map toLower name + | otherwise = name + +environmentFromList :: [(String, String)] -> Map.Map String String +environmentFromList = + Map.fromList . map (first environmentNameKey) + +environmentInsert :: String -> String -> Map.Map String String -> Map.Map String String +environmentInsert = Map.insert . environmentNameKey + +environmentLookup :: String -> Map.Map String String -> Maybe String +environmentLookup name = + Map.lookup (environmentNameKey name) + +expandEnvironmentValue :: Map.Map String String -> String -> String +expandEnvironmentValue expansionEnvironment = + expandForHost + where + expandForHost + | os == "mingw32" = + expandWindowsEnvironmentVariables expansionEnvironment '!' + . expandWindowsEnvironmentVariables expansionEnvironment '%' + | otherwise = + expandPosixEnvironmentVariables expansionEnvironment + +expandParsedShellWord :: Map.Map String String -> ParsedShellWord -> String +expandParsedShellWord expansionEnvironment = + expandShellWordSpans expansionEnvironment . parsedShellWordSpans + +expandParsedShellWordIntoArguments :: Map.Map String String -> ParsedShellWord -> [String] +expandParsedShellWordIntoArguments expansionEnvironment parsedWord + | os == "mingw32" = + let expandedWord = + expandParsedShellWord expansionEnvironment parsedWord + fields = + if parsedShellWordAllowsWindowsRetokenization parsedWord + then + case shellWordsWithContext expandedWord of + Right expandedWords -> + map shellWordText expandedWords + Left _ -> + [expandedWord] + else + [expandedWord] + in if null fields && parsedShellWordPreservesEmptyField parsedWord + then [""] + else fields + | otherwise = + let fields = + splitExpandedShellWord $ + expandShellWordChars + PosixTildeExpansionForShellWord + expansionEnvironment + (shellWordCharsFromSpans $ parsedShellWordSpans parsedWord) + in if null fields && parsedShellWordPreservesEmptyField parsedWord + then [""] + else fields + +parsedShellWordAllowsWindowsRetokenization :: ParsedShellWord -> Bool +parsedShellWordAllowsWindowsRetokenization = + all shellWordSpanAllowsPosixFieldSplitting . parsedShellWordSpans + +expandShellWordSpans :: Map.Map String String -> [ShellWordSpan] -> String +expandShellWordSpans expansionEnvironment spans + | os == "mingw32" = + concatMap + (expandWindowsShellWordSpan expansionEnvironment) + spans + | otherwise = + concatMap expandedShellWordFragmentText $ + expandShellWordChars + PosixTildeExpansionForShellWord + expansionEnvironment + (shellWordCharsFromSpans spans) + +expandWindowsShellWordSpan :: Map.Map String String -> ShellWordSpan -> String +expandWindowsShellWordSpan expansionEnvironment wordSpan + | shellWordSpanAllowsPosixExpansion wordSpan = + expandEnvironmentValue + expansionEnvironment + (shellWordSpanText wordSpan) + | otherwise = + shellWordSpanText wordSpan + +expandEnvironmentValueWithShellSpans :: Map.Map String String -> CompilerEnvOverrideSpec -> String +expandEnvironmentValueWithShellSpans expansionEnvironment overrideSpec + | os == "mingw32" = + concatMap + (expandWindowsShellWordSpan expansionEnvironment) + (compilerEnvOverrideSpans overrideSpec) + | otherwise = + concatMap expandedShellWordFragmentText $ + expandShellWordChars + PosixTildeExpansionForAssignmentValue + expansionEnvironment + (shellWordCharsFromSpans $ compilerEnvOverrideSpans overrideSpec) + +data ExpandedShellWordFragment = ExpandedShellWordFragment + { expandedShellWordFragmentText :: String + , expandedShellWordFragmentAllowsPosixFieldSplitting :: Bool + } + +data PosixTildeExpansionMode + = PosixTildeExpansionForShellWord + | PosixTildeExpansionForAssignmentValue + deriving (Eq) + +shellWordSpanContext :: ShellWordSpan -> PosixShellContext +shellWordSpanContext wordSpan + | shellWordSpanAllowsPosixExpansion wordSpan = + if shellWordSpanAllowsPosixFieldSplitting wordSpan + then PosixShellUnquotedContext + else PosixShellDoubleQuotedContext + | otherwise = + PosixShellLiteralContext + +shellWordCharsFromSpans :: [ShellWordSpan] -> [ShellWordChar] +shellWordCharsFromSpans = + concatMap $ \wordSpan -> + shellWordCharsFromTextWithContext + (shellWordSpanContext wordSpan) + (shellWordSpanText wordSpan) + +shellWordCharsFromTextWithContext :: PosixShellContext -> String -> [ShellWordChar] +shellWordCharsFromTextWithContext wordContext = + map (\c -> ShellWordChar {shellWordCharText = c, shellWordCharContext = wordContext}) + +shellContextAllowsPosixExpansion :: PosixShellContext -> Bool +shellContextAllowsPosixExpansion context = + context /= PosixShellLiteralContext + +shellContextAllowsPosixFieldSplitting :: PosixShellContext -> Bool +shellContextAllowsPosixFieldSplitting context = + context == PosixShellUnquotedContext + +coalesceExpandedShellWordFragments :: [ExpandedShellWordFragment] -> [ExpandedShellWordFragment] +coalesceExpandedShellWordFragments = + foldr + (\currentFragment accumulatedFragments -> + case accumulatedFragments of + nextFragment : remainingFragments + | expandedShellWordFragmentAllowsPosixFieldSplitting currentFragment + == expandedShellWordFragmentAllowsPosixFieldSplitting nextFragment -> + nextFragment + { expandedShellWordFragmentText = + expandedShellWordFragmentText currentFragment + <> expandedShellWordFragmentText nextFragment + } + : remainingFragments + _ -> + currentFragment : accumulatedFragments + ) + [] + +expandShellWordChars + :: PosixTildeExpansionMode + -> Map.Map String String + -> [ShellWordChar] + -> [ExpandedShellWordFragment] +expandShellWordChars tildeExpansionMode expansionEnvironment = + coalesceExpandedShellWordFragments . go True + where + go _ [] = [] + go tildePrefixAllowed (currentChar:remainingChars) + | Just tildeFragment <- + expandPosixTildePrefix + tildeExpansionMode + expansionEnvironment + tildePrefixAllowed + currentChar + remainingChars = + tildeFragment : go False remainingChars + | shellWordCharText currentChar == '$' + && shellContextAllowsPosixExpansion (shellWordCharContext currentChar) = + case + parsePosixParameterExpansionInShellWord + (shellWordCharContext currentChar) + remainingChars + of + Just (parameterExpansion, trailingChars) -> + expandPosixParameterExpansionInShellWord + tildeExpansionMode + expansionEnvironment + parameterExpansion + <> go False trailingChars + Nothing -> + literalFragment currentChar + : go (tildePrefixContinuesAfterChar tildeExpansionMode currentChar) remainingChars + | otherwise = + literalFragment currentChar + : go (tildePrefixContinuesAfterChar tildeExpansionMode currentChar) remainingChars + + literalFragment wordChar = + ExpandedShellWordFragment + { expandedShellWordFragmentText = [shellWordCharText wordChar] + , expandedShellWordFragmentAllowsPosixFieldSplitting = + shellContextAllowsPosixFieldSplitting (shellWordCharContext wordChar) + } + + expandPosixTildePrefix expansionMode expansionEnvironment' tildePrefixAllowed' wordChar trailingChars + | not tildePrefixAllowed' = Nothing + | shellWordCharContext wordChar /= PosixShellUnquotedContext = Nothing + | shellWordCharText wordChar /= '~' = Nothing + | not (posixTildePrefixTerminatedBy expansionMode trailingChars) = Nothing + | otherwise = + Just $ + ExpandedShellWordFragment + { expandedShellWordFragmentText = + fromMaybe "~" $ environmentLookup "HOME" expansionEnvironment' + , expandedShellWordFragmentAllowsPosixFieldSplitting = False + } + + posixTildePrefixTerminatedBy _ [] = True + posixTildePrefixTerminatedBy expansionMode (nextChar:_) + | shellWordCharContext nextChar /= PosixShellUnquotedContext = False + | shellWordCharText nextChar == '/' = True + | otherwise = + expansionMode == PosixTildeExpansionForAssignmentValue + && shellWordCharText nextChar == ':' + + tildePrefixContinuesAfterChar expansionMode wordChar = + expansionMode == PosixTildeExpansionForAssignmentValue + && shellWordCharContext wordChar == PosixShellUnquotedContext + && shellWordCharText wordChar == ':' + +splitExpandedShellWord :: [ExpandedShellWordFragment] -> [String] +splitExpandedShellWord = + reverse . finalizeSplitState . List.foldl' splitFragment ([], []) + where + splitFragment splitState fragment = + List.foldl' + (splitCharacter $ expandedShellWordFragmentAllowsPosixFieldSplitting fragment) + splitState + (expandedShellWordFragmentText fragment) + + splitCharacter allowsFieldSplitting (completedFields, currentFieldReversed) c + | allowsFieldSplitting && isSpace c = + finalizeCurrentField (completedFields, currentFieldReversed) + | otherwise = + (completedFields, c : currentFieldReversed) + + finalizeSplitState = + fst . finalizeCurrentField + + finalizeCurrentField (completedFields, currentFieldReversed) + | null currentFieldReversed = + (completedFields, []) + | otherwise = + (reverse currentFieldReversed : completedFields, []) + +expandPosixEnvironmentVariables :: Map.Map String String -> String -> String +expandPosixEnvironmentVariables expansionEnvironment = go + where + go [] = [] + go ('$':xs) = + case parsePosixParameterExpansion xs of + Just (parameterExpansion, remaining) -> + expandPosixParameterExpansion expansionEnvironment parameterExpansion + <> go remaining + Nothing -> + '$' : go xs + go (x:xs) = + x : go xs + +data PosixParameterExpansionWordMode + = PosixParameterUseDefaultWord Bool + | PosixParameterUseAlternativeWord Bool + +data PosixParameterExpansion + = PosixSimpleParameterExpansion PosixShellContext String + | PosixBracedParameterExpansion + PosixShellContext + String + (Maybe (PosixParameterExpansionWordMode, [ShellWordChar])) + +parsePosixParameterExpansionInShellWord + :: PosixShellContext + -> [ShellWordChar] + -> Maybe (PosixParameterExpansion, [ShellWordChar]) +parsePosixParameterExpansionInShellWord expansionContext chars = + case chars of + wordChar : remainingChars + | shellWordCharText wordChar == '{' -> + parseBracedPosixParameterExpansionInShellWord expansionContext remainingChars + | isEnvironmentVariableName name -> + Just + ( PosixSimpleParameterExpansion expansionContext name + , trailingChars + ) + where + (nameSuffixChars, trailingChars) = + span (isEnvironmentVariableNameChar . shellWordCharText) remainingChars + name = + shellWordCharText wordChar : map shellWordCharText nameSuffixChars + _ -> + Nothing + +parseBracedPosixParameterExpansionInShellWord + :: PosixShellContext + -> [ShellWordChar] + -> Maybe (PosixParameterExpansion, [ShellWordChar]) +parseBracedPosixParameterExpansionInShellWord expansionContext chars = do + (name, remainingChars) <- parseEnvironmentVariableNamePrefixInShellWord chars + case remainingChars of + wordChar : trailingChars + | shellWordCharText wordChar == '}' + && shellWordCharContext wordChar == expansionContext -> + Just + ( PosixBracedParameterExpansion expansionContext name Nothing + , trailingChars + ) + colonChar : operatorChar : trailingChars + | shellWordCharText colonChar == ':' + && shellWordCharContext colonChar == expansionContext + && shellWordCharContext operatorChar == expansionContext + && shellWordCharText operatorChar `elem` ['-', '+'] -> do + (wordChars, restChars) <- + takePosixParameterExpansionWordInShellWord expansionContext trailingChars + Just + ( PosixBracedParameterExpansion + expansionContext + name + ( Just + ( posixParameterExpansionWordMode + True + (shellWordCharText operatorChar) + , wordChars + ) + ) + , restChars + ) + operatorChar : trailingChars + | shellWordCharContext operatorChar == expansionContext + && shellWordCharText operatorChar `elem` ['-', '+'] -> do + (wordChars, restChars) <- + takePosixParameterExpansionWordInShellWord expansionContext trailingChars + Just + ( PosixBracedParameterExpansion + expansionContext + name + ( Just + ( posixParameterExpansionWordMode + False + (shellWordCharText operatorChar) + , wordChars + ) + ) + , restChars + ) + _ -> + Nothing + +parseEnvironmentVariableNamePrefixInShellWord + :: [ShellWordChar] + -> Maybe (String, [ShellWordChar]) +parseEnvironmentVariableNamePrefixInShellWord chars = + case chars of + wordChar : remainingChars + | isEnvironmentVariableName name -> + Just (name, trailingChars) + where + (nameSuffixChars, trailingChars) = + span (isEnvironmentVariableNameChar . shellWordCharText) remainingChars + name = + shellWordCharText wordChar : map shellWordCharText nameSuffixChars + _ -> + Nothing + +takePosixParameterExpansionWordInShellWord + :: PosixShellContext + -> [ShellWordChar] + -> Maybe ([ShellWordChar], [ShellWordChar]) +takePosixParameterExpansionWordInShellWord expansionContext = + go [] [] + where + go _ _ [] = + Nothing + go nestedExpansionContexts accumulatedChars (currentChar:remainingChars) + | Just (nestedContext, trailingChars) <- + parseNestedExpansionStart currentChar remainingChars = + let nextChar = head remainingChars + in go + (nestedContext : nestedExpansionContexts) + (nextChar : currentChar : accumulatedChars) + trailingChars + | shellWordCharText currentChar == '}' = + case nestedExpansionContexts of + nestedContext : remainingContexts + | shellWordCharContext currentChar == nestedContext -> + go remainingContexts (currentChar : accumulatedChars) remainingChars + [] + | shellWordCharContext currentChar == expansionContext -> + Just (reverse accumulatedChars, remainingChars) + _ -> + go nestedExpansionContexts (currentChar : accumulatedChars) remainingChars + | otherwise = + go nestedExpansionContexts (currentChar : accumulatedChars) remainingChars + + parseNestedExpansionStart currentChar remainingChars = + case remainingChars of + nextChar : trailingChars + | shellWordCharText currentChar == '$' + && shellContextAllowsPosixExpansion (shellWordCharContext currentChar) + && shellWordCharContext nextChar == shellWordCharContext currentChar + && shellWordCharText nextChar == '{' -> + Just (shellWordCharContext currentChar, trailingChars) + _ -> + Nothing + +parsePosixParameterExpansion :: String -> Maybe (PosixParameterExpansion, String) +parsePosixParameterExpansion ('{':xs) = + parseBracedPosixParameterExpansion xs +parsePosixParameterExpansion (x:xs) + | isEnvironmentVariableName name = + Just (PosixSimpleParameterExpansion PosixShellUnquotedContext name, remaining) + where + (nameSuffix, remaining) = span isEnvironmentVariableNameChar xs + name = x : nameSuffix +parsePosixParameterExpansion _ = + Nothing + +parseBracedPosixParameterExpansion :: String -> Maybe (PosixParameterExpansion, String) +parseBracedPosixParameterExpansion xs = do + (name, remaining) <- parseEnvironmentVariableNamePrefix xs + case remaining of + '}':rest -> + Just + ( PosixBracedParameterExpansion + PosixShellUnquotedContext + name + Nothing + , rest + ) + ':':op:rest + | op `elem` ['-', '+'] -> do + (word, trailing) <- takePosixParameterExpansionWord rest + Just + ( PosixBracedParameterExpansion + PosixShellUnquotedContext + name + ( Just + ( posixParameterExpansionWordMode True op + , shellWordCharsFromTextWithContext PosixShellUnquotedContext word + ) + ) + , trailing + ) + op:rest + | op `elem` ['-', '+'] -> do + (word, trailing) <- takePosixParameterExpansionWord rest + Just + ( PosixBracedParameterExpansion + PosixShellUnquotedContext + name + ( Just + ( posixParameterExpansionWordMode False op + , shellWordCharsFromTextWithContext PosixShellUnquotedContext word + ) + ) + , trailing + ) + _ -> + Nothing + +parseEnvironmentVariableNamePrefix :: String -> Maybe (String, String) +parseEnvironmentVariableNamePrefix (x:xs) + | isEnvironmentVariableName name = + Just (name, remaining) + where + (nameSuffix, remaining) = span isEnvironmentVariableNameChar xs + name = x : nameSuffix +parseEnvironmentVariableNamePrefix _ = + Nothing + +takePosixParameterExpansionWord :: String -> Maybe (String, String) +takePosixParameterExpansionWord = + go (0 :: Int) [] + where + go _ _ [] = + Nothing + go nested acc ('$':'{':xs) = + go (nested + 1) ('{' : '$' : acc) xs + go 0 acc ('}':xs) = + Just (reverse acc, xs) + go nested acc ('}':xs) = + go (nested - 1) ('}' : acc) xs + go nested acc (x:xs) = + go nested (x : acc) xs + +posixParameterExpansionWordMode :: Bool -> Char -> PosixParameterExpansionWordMode +posixParameterExpansionWordMode colonSensitive operator = + case operator of + '-' -> + PosixParameterUseDefaultWord colonSensitive + '+' -> + PosixParameterUseAlternativeWord colonSensitive + _ -> + error "internal compiler error" + +expandPosixParameterExpansion :: Map.Map String String -> PosixParameterExpansion -> String +expandPosixParameterExpansion expansionEnvironment parameterExpansion = + concatMap expandedShellWordFragmentText $ + expandPosixParameterExpansionInShellWord + PosixTildeExpansionForShellWord + expansionEnvironment + parameterExpansion + +expandPosixParameterExpansionInShellWord + :: PosixTildeExpansionMode + -> Map.Map String String + -> PosixParameterExpansion + -> [ExpandedShellWordFragment] +expandPosixParameterExpansionInShellWord + tildeExpansionMode + expansionEnvironment + parameterExpansion = + case parameterExpansion of + PosixSimpleParameterExpansion expansionContext name -> + renderExpandedValue expansionContext $ + fromMaybe "" (environmentLookup name expansionEnvironment) + PosixBracedParameterExpansion expansionContext name Nothing -> + renderExpandedValue expansionContext $ + fromMaybe "" (environmentLookup name expansionEnvironment) + PosixBracedParameterExpansion expansionContext name (Just (wordMode, wordChars)) -> + let maybeValue = environmentLookup name expansionEnvironment + isSet = isJust maybeValue + isSetAndNonEmpty = maybe False (not . null) maybeValue + expandedWord = + expandShellWordChars + tildeExpansionMode + expansionEnvironment + wordChars + in case wordMode of + PosixParameterUseDefaultWord colonSensitive + | posixParameterShouldUseDefaultWord colonSensitive isSet isSetAndNonEmpty -> + expandedWord + | otherwise -> + renderExpandedValue expansionContext $ fromMaybe "" maybeValue + PosixParameterUseAlternativeWord colonSensitive + | posixParameterShouldUseAlternativeWord colonSensitive isSet isSetAndNonEmpty -> + expandedWord + | otherwise -> + renderExpandedValue expansionContext "" + where + renderExpandedValue expansionContext value = + [ ExpandedShellWordFragment + { expandedShellWordFragmentText = value + , expandedShellWordFragmentAllowsPosixFieldSplitting = + shellContextAllowsPosixFieldSplitting expansionContext + } + ] + +posixParameterShouldUseDefaultWord :: Bool -> Bool -> Bool -> Bool +posixParameterShouldUseDefaultWord colonSensitive isSet isSetAndNonEmpty + | colonSensitive = + not isSetAndNonEmpty + | otherwise = + not isSet + +posixParameterShouldUseAlternativeWord :: Bool -> Bool -> Bool -> Bool +posixParameterShouldUseAlternativeWord colonSensitive isSet isSetAndNonEmpty + | colonSensitive = + isSetAndNonEmpty + | otherwise = + isSet + +expandWindowsEnvironmentVariables :: Map.Map String String -> Char -> String -> String +expandWindowsEnvironmentVariables expansionEnvironment delimiter = go + where + go [] = [] + go (x:xs) + | x /= delimiter = + x : go xs + | otherwise = + case break (== delimiter) xs of + (name, _:rest) + | isWindowsEnvironmentVariableName name -> + fromMaybe "" (environmentLookup name expansionEnvironment) <> go rest + _ -> + delimiter : go xs + +isEnvironmentVariableName :: String -> Bool +isEnvironmentVariableName [] = False +isEnvironmentVariableName (x:xs) = + startsLikeIdentifier x && all isEnvironmentVariableNameChar xs + +startsLikeIdentifier :: Char -> Bool +startsLikeIdentifier c = + isAlpha c || c == '_' + +isEnvironmentVariableNameChar :: Char -> Bool +isEnvironmentVariableNameChar c = + isAlphaNum c || c == '_' + +isWindowsEnvironmentVariableName :: String -> Bool +isWindowsEnvironmentVariableName name = + not (null name) && all isWindowsEnvironmentVariableNameChar name + +isWindowsEnvironmentVariableNameChar :: Char -> Bool +isWindowsEnvironmentVariableNameChar c = + not (isSpace c) && c /= '"' && c /= '%' && c /= '!' + +compilerProcessEnv :: CompilerCommand -> IO (Maybe [(String, String)]) +compilerProcessEnv compiler + | null (compilerEnvOverrides compiler) = pure Nothing + | otherwise = do + let expandedOverrides = + expandEnvironmentOverridesWithBaseEnvironment + (compilerBaseEnvironment compiler) + (compilerEnvOverrides compiler) + (compilerEnvOverrideSpecs compiler) + pure . Just . Map.toList $ + Map.union + (environmentFromList expandedOverrides) + (compilerBaseEnvironment compiler) + +showCompilerCommandForUser :: CompilerCommand -> [String] -> String +showCompilerCommandForUser = + renderCompilerCommandForUserHost os + +renderCompilerCommandForUserHost :: String -> CompilerCommand -> [String] -> String +renderCompilerCommandForUserHost hostOs compiler extraArgs + | hostOs == "mingw32" = + intercalate " && " $ + ["setlocal EnableDelayedExpansion" | windowsCommandNeedsDelayedExpansion] + <> map renderWindowsEnvOverride (compilerEnvOverrides compiler) + <> [renderCommandWords] + | otherwise = + unwords $ + renderPosixEnvAssignments + <> quotedCommandWords + where + quoteWord = shellQuoteForHost hostOs + renderPosixEnvAssignments = + zipOverrideSpecs (compilerEnvOverrides compiler) (compilerEnvOverrideSpecs compiler) + <&> uncurry renderPosixEnvAssignment + commandWords = + compilerExecutable compiler : compilerInvocationArgs compiler extraArgs + windowsCommandNeedsDelayedExpansion = + not (null (compilerEnvOverrides compiler)) + || any ('!' `elem`) commandWords + quotedCommandWords = + map quoteWord commandWords + renderCommandWords = + unwords quotedCommandWords + renderPosixEnvAssignment (name, value) maybeOverrideSpec = + name + <> "=" + <> maybe + (quoteWord value) + renderPosixEnvOverrideSpec + maybeOverrideSpec + renderWindowsEnvOverride (name, value) = + "set " <> cmdExeSetAssignmentQuote (name <> "=" <> value) + renderPosixEnvOverrideSpec (CompilerEnvOverrideSpec spans) = + let wordChars = shellWordCharsFromSpans spans + in maybe + (renderPosixShellWordChars wordChars) + (renderUniformPosixShellWordChars wordChars) + (uniformShellWordContext wordChars) + +zipOverrideSpecs + :: [(String, String)] + -> Maybe [CompilerEnvOverrideSpec] + -> [((String, String), Maybe CompilerEnvOverrideSpec)] +zipOverrideSpecs overrides maybeOverrideSpecs = + case maybeOverrideSpecs of + Just overrideSpecs + | length overrideSpecs == length overrides -> + zip overrides $ map Just overrideSpecs + _ -> + map (, Nothing) overrides + +shellQuoteForHost :: String -> String -> String +shellQuoteForHost hostOs + | hostOs == "mingw32" = cmdExeQuote + | otherwise = shellQuote + +cmdExeQuote :: String -> String +cmdExeQuote = + quoteWindowsCommandWord . escapeWindowsDelayedExpansion . escapeWindowsPercentExpansion + +cmdExeSetAssignmentQuote :: String -> String +cmdExeSetAssignmentQuote = + (\value -> "\"" <> value <> "\"") + . concatMap escapeWindowsSetAssignmentQuote + . escapeWindowsSetAssignmentExpansion + +escapeWindowsSetAssignmentQuote :: Char -> String +escapeWindowsSetAssignmentQuote c + | c == '"' = "\"\"" + | otherwise = [c] + +escapeWindowsPercentExpansion :: String -> String +escapeWindowsPercentExpansion = + concatMap $ \c -> + if c == '%' + then "%%" + else [c] + +escapeWindowsDelayedExpansion :: String -> String +escapeWindowsDelayedExpansion = + concatMap $ \c -> + if c == '!' + then "^!" + else [c] + +escapeWindowsSetAssignmentExpansion :: String -> String +escapeWindowsSetAssignmentExpansion [] = [] +escapeWindowsSetAssignmentExpansion ('%':xs) = + case break (== '%') xs of + (name, '%':rest) + | isWindowsEnvironmentVariableName name -> + "!" <> name <> "!" <> escapeWindowsSetAssignmentExpansion rest + _ -> + "%%" <> escapeWindowsSetAssignmentExpansion xs +escapeWindowsSetAssignmentExpansion ('!':xs) = + "^!" <> escapeWindowsSetAssignmentExpansion xs +escapeWindowsSetAssignmentExpansion (x:xs) = + x : escapeWindowsSetAssignmentExpansion xs + +quoteWindowsCommandWord :: String -> String +quoteWindowsCommandWord word = + "\"" <> go word <> "\"" + where + go [] = [] + go xs = + let (backslashes, rest) = span (== '\\') xs + escapedBackslashes n = replicate n '\\' + in case rest of + [] -> + escapedBackslashes (2 * length backslashes) + '"':ys -> + escapedBackslashes (2 * length backslashes + 1) + <> "\"" + <> go ys + c:ys -> + backslashes <> [c] <> go ys + +uniformShellWordContext :: [ShellWordChar] -> Maybe PosixShellContext +uniformShellWordContext [] = Nothing +uniformShellWordContext (firstChar:remainingChars) + | all ((== shellWordCharContext firstChar) . shellWordCharContext) remainingChars = + Just $ shellWordCharContext firstChar + | otherwise = + Nothing + +renderUniformPosixShellWordChars :: [ShellWordChar] -> PosixShellContext -> String +renderUniformPosixShellWordChars wordChars context = + case context of + PosixShellLiteralContext -> + shellQuote wordText + PosixShellDoubleQuotedContext -> + "\"" <> renderPosixDoubleQuotedExpandableText wordText <> "\"" + PosixShellUnquotedContext -> + renderPosixShellWordChars wordChars + where + wordText = map shellWordCharText wordChars + +renderPosixDoubleQuotedExpandableText :: String -> String +renderPosixDoubleQuotedExpandableText [] = [] +renderPosixDoubleQuotedExpandableText ('$':xs) = + case parsePosixParameterExpansion xs of + Just (parameterExpansion, remaining) -> + renderPosixParameterExpansion parameterExpansion + <> renderPosixDoubleQuotedExpandableText remaining + Nothing -> + "\\$" <> renderPosixDoubleQuotedExpandableText xs +renderPosixDoubleQuotedExpandableText (x:xs) = + escapePosixDoubleQuotedChar x <> renderPosixDoubleQuotedExpandableText xs + +renderPosixShellWordChars :: [ShellWordChar] -> String +renderPosixShellWordChars [] = shellQuote "" +renderPosixShellWordChars chars = + renderNonEmptyPosixShellWordChars chars + +renderPosixParameterExpansionWord :: [ShellWordChar] -> String +renderPosixParameterExpansionWord [] = [] +renderPosixParameterExpansionWord chars = + renderNonEmptyPosixShellWordChars chars + +renderNonEmptyPosixShellWordChars :: [ShellWordChar] -> String +renderNonEmptyPosixShellWordChars [] = [] +renderNonEmptyPosixShellWordChars chars@(currentChar:_) + | shellWordCharContext currentChar == PosixShellLiteralContext = + let (segment, remainingChars) = + span + ((== PosixShellLiteralContext) . shellWordCharContext) + chars + in shellQuote (map shellWordCharText segment) + <> renderNonEmptyPosixShellWordChars remainingChars + | shellWordCharContext currentChar == PosixShellDoubleQuotedContext = + "\"" <> renderDoubleQuotedShellWordChars chars + | otherwise = + renderUnquotedShellWordChars chars + +renderDoubleQuotedShellWordChars :: [ShellWordChar] -> String +renderDoubleQuotedShellWordChars [] = "\"" +renderDoubleQuotedShellWordChars chars@(currentChar:remainingChars) + | shellWordCharContext currentChar /= PosixShellDoubleQuotedContext = + "\"" <> renderNonEmptyPosixShellWordChars chars + | shellWordCharText currentChar == '$' = + case + parsePosixParameterExpansionInShellWord + PosixShellDoubleQuotedContext + remainingChars + of + Just (parameterExpansion, trailingChars) -> + renderPosixParameterExpansion parameterExpansion + <> renderDoubleQuotedShellWordChars trailingChars + Nothing -> + "\\$" <> renderDoubleQuotedShellWordChars remainingChars + | shellWordCharText currentChar == '`' = + "\\`" <> renderDoubleQuotedShellWordChars remainingChars + | otherwise = + escapePosixDoubleQuotedChar (shellWordCharText currentChar) + <> renderDoubleQuotedShellWordChars remainingChars + +renderUnquotedShellWordChars :: [ShellWordChar] -> String +renderUnquotedShellWordChars [] = [] +renderUnquotedShellWordChars chars@(currentChar:remainingChars) + | shellWordCharContext currentChar /= PosixShellUnquotedContext = + renderNonEmptyPosixShellWordChars chars + | shellWordCharText currentChar == '$' = + case + parsePosixParameterExpansionInShellWord + PosixShellUnquotedContext + remainingChars + of + Just (parameterExpansion, trailingChars) -> + renderPosixParameterExpansion parameterExpansion + <> renderUnquotedShellWordChars trailingChars + Nothing -> + "\\$" <> renderUnquotedShellWordChars remainingChars + | shellWordCharText currentChar == '`' = + "\\`" <> renderUnquotedShellWordChars remainingChars + | otherwise = + escapePosixUnquotedChar (shellWordCharText currentChar) + <> renderUnquotedShellWordChars remainingChars + +escapePosixUnquotedChar :: Char -> String +escapePosixUnquotedChar c + | isSpace c || c `elem` ['\\', '"', '\'', '#', ';', '&', '|', '<', '>', '*', '?', '[', ']', '(', ')', '{', '}'] = + ['\\', c] + | otherwise = + [c] + +renderPosixParameterExpansion :: PosixParameterExpansion -> String +renderPosixParameterExpansion parameterExpansion = + case parameterExpansion of + PosixSimpleParameterExpansion _ name -> + '$' : name + PosixBracedParameterExpansion _ name Nothing -> + "${" <> name <> "}" + PosixBracedParameterExpansion _ name (Just (wordMode, word)) -> + "${" + <> name + <> renderPosixParameterExpansionWordMode wordMode + <> renderPosixParameterExpansionWord word + <> "}" + +renderPosixParameterExpansionWordMode :: PosixParameterExpansionWordMode -> String +renderPosixParameterExpansionWordMode wordMode = + case wordMode of + PosixParameterUseDefaultWord colonSensitive -> + bool "-" ":-" colonSensitive + PosixParameterUseAlternativeWord colonSensitive -> + bool "+" ":+" colonSensitive + +escapePosixDoubleQuotedChar :: Char -> String +escapePosixDoubleQuotedChar c + | c == '\\' = + "\\\\" + | c == '"' = + "\\\"" + | c == '$' = + "\\$" + | c == '`' = + "\\`" + | otherwise = + [c] + +shellQuote :: String -> String +shellQuote word = + "'" <> concatMap escapeShellQuoteChar word <> "'" + where + escapeShellQuoteChar '\'' + = "'\"'\"'" + escapeShellQuoteChar c + = [c] + +readCompilerProcessWithExitCode :: CompilerCommand -> [String] -> IO (ExitCode, String, String) +readCompilerProcessWithExitCode compiler extraArgs = do + processEnv <- compilerProcessEnv compiler + readCreateProcessWithExitCode + (proc (compilerExecutable compiler) (compilerInvocationArgs compiler extraArgs)) + { env = processEnv + } + "" + +data CompilerOutputStream + = CompilerStdout + | CompilerStderr + deriving (Eq, Ord, Show) + +data CapturedCompilerOutputChunk = CapturedCompilerOutputChunk + { capturedCompilerOutputStream :: CompilerOutputStream + , capturedCompilerOutputIndex :: Int + , capturedCompilerOutputChunk :: CompilerOutputChunk + } + +type CapturedCompilerOutputKey = (CompilerOutputStream, Int) + +data SuppressibleCapturedCompilerOutputChunk = SuppressibleCapturedCompilerOutputChunk + { suppressibleCapturedCompilerOutputKeys :: [CapturedCompilerOutputKey] + , suppressibleCapturedCompilerOutputChunk :: CompilerOutputChunk + } + +data CapturedCompilerOutputDecision + = RetainCapturedCompilerOutput + | SuppressCapturedCompilerOutput + deriving (Eq) + +data IncrementalStreamWarningSuppressionState = IncrementalStreamWarningSuppressionState + { incrementalStreamWarningPendingChunk :: Maybe ([CapturedCompilerOutputKey], [B.ByteString]) + , incrementalStreamWarningChunkFilter :: IncrementalCompilerWarningFilter SuppressibleCapturedCompilerOutputChunk + } + +data IncrementalCompilerWarningSuppressionState = IncrementalCompilerWarningSuppressionState + { incrementalCompilerWarningStdoutState :: IncrementalStreamWarningSuppressionState + , incrementalCompilerWarningStderrState :: IncrementalStreamWarningSuppressionState + , incrementalCompilerWarningChunkFilter :: IncrementalCompilerWarningFilter SuppressibleCapturedCompilerOutputChunk + , incrementalCompilerWarningPending :: [CapturedCompilerOutputChunk] + , incrementalCompilerWarningDecisions :: Map.Map CapturedCompilerOutputKey CapturedCompilerOutputDecision + } + +readCompilerProcessWithExitCodeChunks + :: StdStream + -> CompilerCommand + -> [String] + -> IO (ExitCode, [CapturedCompilerOutputChunk]) +readCompilerProcessWithExitCodeChunks = + readCompilerProcessWithExitCodeChunksUntil (\_ _ -> pure False) + +readCompilerProcessWithExitCodeChunksUntil + :: (IO [CapturedCompilerOutputChunk] -> CompilerPostExitReadiness) + -> StdStream + -> CompilerCommand + -> [String] + -> IO (ExitCode, [CapturedCompilerOutputChunk]) +readCompilerProcessWithExitCodeChunksUntil postExitDrainSatisfied stdinStream compiler extraArgs = do + capturedChunksRef <- newIORef [] + let readCapturedChunks = + reverse <$> readIORef capturedChunksRef + (exitCode, ()) <- + foldCompilerProcessWithExitCodeChunksUntil + (postExitDrainSatisfied readCapturedChunks) + stdinStream + compiler + extraArgs + () + (\() capturedChunk -> modifyIORef' capturedChunksRef (capturedChunk :) $> ()) + capturedChunks <- readCapturedChunks + pure (exitCode, capturedChunks) + +foldCompilerProcessWithExitCodeChunks + :: StdStream + -> CompilerCommand + -> [String] + -> a + -> (a -> CapturedCompilerOutputChunk -> IO a) + -> IO (ExitCode, a) +foldCompilerProcessWithExitCodeChunks = + foldCompilerProcessWithExitCodeChunksUntil compilerPostExitNotReady + +foldCompilerProcessWithExitCodeChunksUntil + :: CompilerPostExitReadiness + -> StdStream + -> CompilerCommand + -> [String] + -> a + -> (a -> CapturedCompilerOutputChunk -> IO a) + -> IO (ExitCode, a) +foldCompilerProcessWithExitCodeChunksUntil postExitDrainSatisfied stdinStream compiler extraArgs initialAcc accumulateChunk = do + processEnv <- compilerProcessEnv compiler + withCreateProcess + (proc (compilerExecutable compiler) (compilerInvocationArgs compiler extraArgs)) + { env = processEnv + , std_in = stdinStream + , std_out = CreatePipe + , std_err = CreatePipe + , create_group = True + , close_fds = True + } $ \maybeInputHandle maybeStdoutHandle maybeStderrHandle processHandle -> do + maybe (pure ()) hClose maybeInputHandle + stdoutHandle <- requireCapturedHandle "stdout" maybeStdoutHandle + stderrHandle <- requireCapturedHandle "stderr" maybeStderrHandle + hSetBinaryMode stdoutHandle True + hSetBinaryMode stderrHandle True + processExitVar <- newTVarIO False + failurePostExitDrainPendingVar <- newTVarIO False + forcePostExitDrainCompleteVar <- newTVarIO False + let postExitDrainSatisfied' processGroupId = do + (failureDrainPending, forceComplete) <- + atomically $ + (,) + <$> readTVar failurePostExitDrainPendingVar + <*> readTVar forcePostExitDrainCompleteVar + if failureDrainPending + then pure forceComplete + else + postExitDrainSatisfied processGroupId + processGroupId <- compilerProcessGroupIdForHandle processHandle + capturedChunksVar <- newEmptyMVar + _ <- forkIO $ + putMVar capturedChunksVar =<< try + ( foldCompilerOutputChunks + processGroupId + postExitDrainSatisfied' + processExitVar + stdoutHandle + stderrHandle + initialAcc + accumulateChunk + ) + exitCode <- waitForProcess processHandle + case exitCode of + ExitSuccess -> + atomically $ writeTVar processExitVar True + ExitFailure _ -> do + atomically $ writeTVar failurePostExitDrainPendingVar True + _ <- forkIO $ do + threadDelay compilerProcessFailureOutputDrainGraceMicros + atomically $ do + writeTVar forcePostExitDrainCompleteVar True + writeTVar processExitVar True + pure () + capturedChunksResult <- try $ do + when (exitCode == ExitSuccess) $ + waitForCompilerProcessPostExitCompletion + processGroupId + postExitDrainSatisfied' + capturedChunks <- takeCapturedResultAfterProcessExit capturedChunksVar + case exitCode of + ExitSuccess -> pure () + ExitFailure _ -> terminateCompilerProcessGroup processGroupId + pure capturedChunks + capturedChunks <- case capturedChunksResult of + Right capturedChunks' -> + pure capturedChunks' + Left capturedException -> do + terminateCompilerProcessGroup processGroupId + throwIO (capturedException :: SomeException) + pure (exitCode, capturedChunks) + +readCompilerProcessWithExitCodeBytes :: CompilerCommand -> [String] -> IO (ExitCode, B.ByteString, B.ByteString) +readCompilerProcessWithExitCodeBytes = + readCompilerProcessWithExitCodeBytesUntil (\_ _ -> pure False) + +readCompilerProcessWithExitCodeBytesUntil + :: (IO [CapturedCompilerOutputChunk] -> CompilerPostExitReadiness) + -> CompilerCommand + -> [String] + -> IO (ExitCode, B.ByteString, B.ByteString) +readCompilerProcessWithExitCodeBytesUntil postExitDrainSatisfied compiler extraArgs = do + (exitCode, capturedChunks) <- + readCompilerProcessWithExitCodeChunksUntil + postExitDrainSatisfied + CreatePipe + compiler + extraArgs + pure + ( exitCode + , compilerOutputBytesForStream CompilerStdout capturedChunks + , compilerOutputBytesForStream CompilerStderr capturedChunks + ) + +requireCapturedHandle :: String -> Maybe Handle -> IO Handle +requireCapturedHandle handleName = + maybe + ( ioError . userError $ + "failed to capture compiler " <> handleName <> " output" + ) + pure + +compilerProcessGroupIdForHandle :: ProcessHandle -> IO (Maybe ProcessGroupID) +compilerProcessGroupIdForHandle = + fmap (fmap fromIntegral) . getPid + +compilerProcessGroupAlive :: Maybe ProcessGroupID -> IO Bool +compilerProcessGroupAlive = + maybe + (pure False) + ( \processGroup -> + catchIOError + (signalProcessGroup nullSignal processGroup $> True) + ( \ioErr -> + case ioeGetErrorType ioErr of + NoSuchThing -> pure False + PermissionDenied -> pure True + _ -> ioError ioErr + ) + ) + +terminateCompilerProcessGroup :: Maybe ProcessGroupID -> IO () +terminateCompilerProcessGroup processGroupId = do + processGroupStillAlive <- compilerProcessGroupAliveForTermination processGroupId + when processGroupStillAlive $ do + signalCompilerProcessGroupForTermination sigTERM processGroupId + waitForCompilerProcessGroupExitAfterTermination + processGroupId + compilerProcessFailureTerminationGraceMicros + processGroupStillAlive' <- compilerProcessGroupAliveForTermination processGroupId + when processGroupStillAlive' $ + signalCompilerProcessGroupForTermination sigKILL processGroupId + +compilerProcessGroupAliveForTermination :: Maybe ProcessGroupID -> IO Bool +compilerProcessGroupAliveForTermination processGroupId = + catchIOError + (compilerProcessGroupAlive processGroupId) + (const $ pure False) + +signalCompilerProcessGroupForTermination :: Signal -> Maybe ProcessGroupID -> IO () +signalCompilerProcessGroupForTermination signal = + maybe + (pure ()) + (ignoreIOException . signalProcessGroup signal) + +waitForCompilerProcessGroupExitAfterTermination :: Maybe ProcessGroupID -> Int -> IO () +waitForCompilerProcessGroupExitAfterTermination processGroupId timeoutMicros = + timeout timeoutMicros go $> () + where + go = do + processGroupStillAlive <- compilerProcessGroupAliveForTermination processGroupId + when processGroupStillAlive $ do + threadDelay compilerOutputDrainAfterExitPollMicros + go + +type CompilerPostExitReadiness = Maybe ProcessGroupID -> IO Bool + +compilerPostExitNotReady :: CompilerPostExitReadiness +compilerPostExitNotReady _ = pure False + +waitForCompilerProcessPostExitCompletion :: Maybe ProcessGroupID -> CompilerPostExitReadiness -> IO () +waitForCompilerProcessPostExitCompletion processGroupId postExitSatisfied = + timeout compilerProcessPostExitCompletionTimeoutMicros go >>= \case + Just () -> pure () + Nothing -> + ioError . userError $ + "compiler wrapper did not finish delayed output after exit" + where + go = do + postExitCompleted <- postExitSatisfied processGroupId + processGroupStillAlive <- compilerProcessGroupAlive processGroupId + unless (postExitCompleted || not processGroupStillAlive) $ do + threadDelay compilerOutputDrainAfterExitPollMicros + go + +capturedCompilerTargetLineAvailableAfterExit :: IO [CapturedCompilerOutputChunk] -> CompilerPostExitReadiness +capturedCompilerTargetLineAvailableAfterExit readCapturedChunks _ = + fmap + ( any isCompleteTargetLine + . completeStdoutLines + . compilerOutputBytesForStream CompilerStdout + ) + readCapturedChunks + where + completeStdoutLines bytes + | B.null bytes = [] + | B.last bytes == newlineByte = BC.lines bytes + | otherwise = + case BC.lines bytes of + [] -> [] + lines' -> init lines' + + isCompleteTargetLine line = + let trimmedLine = trimProbeLine line + in not (B.null trimmedLine) + && BC.any (== '-') trimmedLine + && not (BC.any isSpace trimmedLine) + + trimProbeLine = + BC.reverse . BC.dropWhile isSpace . BC.reverse . BC.dropWhile isSpace + +waitForCompilerProcessGroupQuiescenceAfterExit :: CompilerPostExitReadiness +-- Fallback for compiler invocations that do not have a more specific readiness +-- signal than process-group quiescence. +waitForCompilerProcessGroupQuiescenceAfterExit = compilerPostExitNotReady + +stabilizePostExitPredicate :: IO Bool -> IO CompilerPostExitReadiness +stabilizePostExitPredicate isReady = do + wasReadyRef <- newIORef False + pure $ \_ -> do + ready <- isReady + wasReady <- readIORef wasReadyRef + writeIORef wasReadyRef ready + pure (ready && wasReady) + +stabilizePostExitFingerprint :: Eq a => IO (Maybe a) -> IO CompilerPostExitReadiness +stabilizePostExitFingerprint readFingerprint = do + previousFingerprintRef <- newIORef Nothing + pure $ \_ -> do + fingerprint <- readFingerprint + previousFingerprint <- readIORef previousFingerprintRef + writeIORef previousFingerprintRef fingerprint + pure $ + case fingerprint of + Just _ -> previousFingerprint == fingerprint + Nothing -> False + +data CompilerObjectOutputFingerprint = CompilerObjectOutputFingerprint + !Integer + !Integer + !Integer + !String + !String + !CompilerObjectContentDigest + deriving (Eq) + +compilerObjectOutputFingerprint + :: FileStatus -> Word64 -> CompilerObjectContentDigest -> CompilerObjectOutputFingerprint +compilerObjectOutputFingerprint status outputSize = + CompilerObjectOutputFingerprint + (fromIntegral $ deviceID status) + (fromIntegral $ fileID status) + (fromIntegral outputSize) + (show $ modificationTimeHiRes status) + (show $ statusChangeTimeHiRes status) + +compilerObjectOutputFingerprintAfterExit :: FilePath -> IO (Maybe CompilerObjectOutputFingerprint) +compilerObjectOutputFingerprintAfterExit path = + catchIOError + (bracket openObjectOutput closeFd readObjectOutputFingerprint) + handleObjectOutputReadinessError + where + openObjectOutput = + openFd path ReadOnly defaultFileFlags {nofollow = True, cloexec = True, nonBlock = True} + + readObjectOutputFingerprint fd = do + statusBefore <- getFdStatus fd + let maybeOutputSize = fileStatusSizeWord64 statusBefore + case maybeOutputSize of + Just outputSize + | isRegularFile statusBefore + && outputSize > fromInteger minimumStableCompilerObjectOutputBytes -> do + maybeContentFingerprint <- + relocatableElfObjectContentFingerprint fd outputSize + statusAfter <- getFdStatus fd + pure $ + if compilerObjectOutputStatusFingerprint statusBefore + == compilerObjectOutputStatusFingerprint statusAfter + && isRegularFile statusAfter + then + compilerObjectOutputFingerprint + statusBefore + outputSize + <$> maybeContentFingerprint + else Nothing + _ -> pure Nothing + +compilerProbeObjectTargetFingerprintAfterExit :: FilePath -> IO (Maybe CompilerObjectOutputFingerprint) +compilerProbeObjectTargetFingerprintAfterExit path = + catchIOError + (bracket openObjectOutput closeFd readObjectOutputFingerprint) + handleObjectOutputReadinessError + where + openObjectOutput = + openFd path ReadOnly defaultFileFlags {nofollow = True, cloexec = True, nonBlock = True} + + readObjectOutputFingerprint fd = do + statusBefore <- getFdStatus fd + let maybeOutputSize = fileStatusSizeWord64 statusBefore + case maybeOutputSize of + Just outputSize + | isRegularFile statusBefore + && outputSize >= minimumProbeObjectTargetBytes -> do + let targetByteCount = + min outputSize maximumProbeObjectTargetBytes + maybeTargetBytes <- + readCompilerObjectBytesAt fd 0 targetByteCount + statusAfter <- getFdStatus fd + let maybeContentFingerprint = do + targetBytes <- maybeTargetBytes + if probeObjectTargetBytesCanBeClassified targetBytes + then + digestCompilerObjectLoadedBytes + emptyCompilerObjectContentDigest + 0 + targetByteCount + targetBytes + else Nothing + pure $ + if compilerObjectOutputStatusFingerprint statusBefore + == compilerObjectOutputStatusFingerprint statusAfter + && isRegularFile statusAfter + then + compilerObjectOutputFingerprint + statusBefore + outputSize + <$> maybeContentFingerprint + else Nothing + _ -> pure Nothing + +probeObjectTargetBytesCanBeClassified :: B.ByteString -> Bool +probeObjectTargetBytesCanBeClassified bytes = + B.length bytes >= 4 + && (B.take 4 bytes /= elfMagic || B.length bytes >= minimumElfProbeObjectTargetBytes) + +handleObjectOutputReadinessError :: IOError -> IO (Maybe a) +handleObjectOutputReadinessError ioErr + | objectOutputReadinessCanRetry ioErr = pure Nothing + | otherwise = ioError ioErr + +objectOutputReadinessCanRetry :: IOError -> Bool +objectOutputReadinessCanRetry ioErr = + isDoesNotExistError ioErr + || isEOFError ioErr + || ioeGetErrorType ioErr + `elem` [ IllegalOperation + , InappropriateType + , InvalidArgument + , NoSuchThing + , PermissionDenied + , ResourceVanished + ] + +compilerObjectOutputStatusFingerprint + :: FileStatus -> (Integer, Integer, Integer, String, String) +compilerObjectOutputStatusFingerprint status = + ( fromIntegral $ deviceID status + , fromIntegral $ fileID status + , fromIntegral $ fileSize status + , show $ modificationTimeHiRes status + , show $ statusChangeTimeHiRes status + ) + +fileStatusSizeWord64 :: FileStatus -> Maybe Word64 +fileStatusSizeWord64 status = + let size = fromIntegral (fileSize status) :: Integer + in if size >= 0 && size <= fromIntegral (maxBound :: Word64) + then Just $ fromInteger size + else Nothing + +withCompilerObjectSnapshot :: FilePath -> (FilePath -> IO a) -> IO a +withCompilerObjectSnapshot sourcePath action = do + tmpDir <- getTemporaryDirectory + (snapshotPath, snapshotHandle) <- openTempFile tmpDir "htcc-object-snapshot-.o" + setFileMode snapshotPath temporaryWritableMode + hSetBinaryMode snapshotHandle True + finally + (snapshotCompilerObjectOutput sourcePath snapshotHandle *> action snapshotPath) + ( ignoreIOException (hClose snapshotHandle) + *> ignoreIOException (removeFile snapshotPath) + ) + +snapshotCompilerObjectOutput :: FilePath -> Handle -> IO () +snapshotCompilerObjectOutput sourcePath snapshotHandle = + timeout compilerProcessPostExitCompletionTimeoutMicros go >>= \case + Just () -> pure () + Nothing -> + ioError . userError $ + "compiler wrapper did not finish stable object output snapshot after exit" + where + go = do + snapshotComplete <- + trySnapshotCompilerObjectOutput sourcePath snapshotHandle + unless snapshotComplete $ do + threadDelay compilerOutputDrainAfterExitPollMicros + go + +trySnapshotCompilerObjectOutput :: FilePath -> Handle -> IO Bool +trySnapshotCompilerObjectOutput sourcePath snapshotHandle = + catchIOError + (bracket openObjectOutput closeFd copyObjectOutput) + handleSnapshotError + where + openObjectOutput = + openFd sourcePath ReadOnly defaultFileFlags {nofollow = True, cloexec = True, nonBlock = True} + + copyObjectOutput fd = do + statusBefore <- getFdStatus fd + let maybeOutputSize = fileStatusSizeWord64 statusBefore + case maybeOutputSize of + Just outputSize + | isRegularFile statusBefore + && outputSize > fromInteger minimumStableCompilerObjectOutputBytes + && outputSize <= maximumStableCompilerObjectOutputBytes -> do + maybeContentFingerprintBefore <- + relocatableElfObjectContentFingerprint fd outputSize + case maybeContentFingerprintBefore of + Just contentFingerprintBefore -> do + copied <- copyObjectOutputBytes fd outputSize + statusAfter <- getFdStatus fd + maybeContentFingerprintAfter <- + if copied + then relocatableElfObjectContentFingerprint fd outputSize + else pure Nothing + pure $ + copied + && compilerObjectOutputStatusFingerprint statusBefore + == compilerObjectOutputStatusFingerprint statusAfter + && maybeContentFingerprintAfter + == Just contentFingerprintBefore + Nothing -> pure False + _ -> pure False + + copyObjectOutputBytes fd outputSize = + do + IO.hSeek snapshotHandle IO.AbsoluteSeek 0 + IO.hSetFileSize snapshotHandle 0 + _ <- fdSeek fd AbsoluteSeek 0 + copied <- copyCompilerObjectSnapshotBytes fd snapshotHandle outputSize + hFlush snapshotHandle + pure copied + + handleSnapshotError ioErr + | objectOutputSnapshotCanRetry ioErr = pure False + | otherwise = ioError ioErr + + objectOutputSnapshotCanRetry ioErr = + isDoesNotExistError ioErr + || isEOFError ioErr + || ioeGetErrorType ioErr + `elem` [ IllegalOperation + , InappropriateType + , InvalidArgument + , NoSuchThing + , PermissionDenied + , ResourceVanished + ] + +copyCompilerObjectSnapshotBytes :: Fd -> Handle -> Word64 -> IO Bool +copyCompilerObjectSnapshotBytes fd snapshotHandle = + go + where + go remainingBytes + | remainingBytes == 0 = pure True + | otherwise = + case word64ToInt chunkSize of + Just chunkSizeInt -> do + bytes <- readCompilerObjectExactBytes fd chunkSizeInt + if B.length bytes == chunkSizeInt + then do + B.hPut snapshotHandle bytes + go $ remainingBytes - chunkSize + else pure False + Nothing -> pure False + where + chunkSize = min remainingBytes compilerObjectDigestChunkBytes + +stabilizeCompilerObjectOutputAfterExit :: FilePath -> IO CompilerPostExitReadiness +stabilizeCompilerObjectOutputAfterExit = + stabilizeCompilerObjectOutputFingerprintAfterExit + . compilerObjectOutputFingerprintAfterExit + +stabilizeCompilerProbeObjectTargetAfterExit :: FilePath -> IO CompilerPostExitReadiness +stabilizeCompilerProbeObjectTargetAfterExit = + stabilizeCompilerObjectOutputFingerprintAfterExit + . compilerProbeObjectTargetFingerprintAfterExit + +stabilizeCompilerObjectOutputFingerprintAfterExit + :: IO (Maybe CompilerObjectOutputFingerprint) -> IO CompilerPostExitReadiness +stabilizeCompilerObjectOutputFingerprintAfterExit readFingerprint = do + previousFingerprintRef <- newIORef Nothing + pure $ \processGroupId -> do + fingerprint <- readFingerprint + previousFingerprint <- readIORef previousFingerprintRef + writeIORef previousFingerprintRef fingerprint + case fingerprint of + Just fingerprint' + | previousFingerprint == fingerprint -> + compilerProcessGroupHasNoWritableObjectHandles + processGroupId + fingerprint' + _ -> pure False + +compilerProcessGroupHasNoWritableObjectHandles + :: Maybe ProcessGroupID -> CompilerObjectOutputFingerprint -> IO Bool +compilerProcessGroupHasNoWritableObjectHandles Nothing _ = pure False +compilerProcessGroupHasNoWritableObjectHandles (Just processGroupId) fingerprint = do + maybeProcessIds <- compilerProcessGroupMemberIds processGroupId + case maybeProcessIds of + Just [] -> pure False + Just processIds -> + not <$> anyM (compilerProcessHasWritableObjectHandle fingerprint) processIds + Nothing -> pure False + +compilerProcessGroupMemberIds :: ProcessGroupID -> IO (Maybe [Integer]) +compilerProcessGroupMemberIds processGroupId = do + procResult <- compilerProcessGroupMemberIdsViaProc processGroupId + case procResult of + Just processIds -> pure $ Just processIds + Nothing -> compilerProcessGroupMemberIdsViaPs processGroupId + +compilerProcessGroupMemberIdsViaProc :: ProcessGroupID -> IO (Maybe [Integer]) +compilerProcessGroupMemberIdsViaProc processGroupId = do + procAvailable <- doesDirectoryExist "/proc" + if not procAvailable + then pure Nothing + else + catchIOError + ( do + procEntries <- listDirectory "/proc" + Just . reverse + <$> foldM + collectProcessGroupMember + [] + (mapMaybe readMaybe procEntries) + ) + (const $ pure Nothing) + where + targetProcessGroupId = fromIntegral processGroupId + + collectProcessGroupMember processIds processId = do + maybeProcessGroupId <- procProcessGroupId processId + pure $ + if maybeProcessGroupId == Just targetProcessGroupId + then processId : processIds + else processIds + +compilerProcessGroupMemberIdsViaPs :: ProcessGroupID -> IO (Maybe [Integer]) +compilerProcessGroupMemberIdsViaPs processGroupId = + catchIOError + ( do + psPath <- trustedSystemHelperPath "ps" + (exitCode, stdoutText, _) <- + maybe + (pure (ExitFailure 127, "", "")) + (\path -> readProcessWithExitCode path ["-axo", "pid=,pgid="] "") + psPath + pure $ + case exitCode of + ExitSuccess -> + let processIds = + parseProcessGroupMemberIds + (fromIntegral processGroupId) + stdoutText + in if null processIds + then Nothing + else Just processIds + ExitFailure _ -> Nothing + ) + (const $ pure Nothing) + +procProcessGroupId :: Integer -> IO (Maybe Integer) +procProcessGroupId processId = + catchIOError + (parseProcStatProcessGroupId <$> readFile ("/proc" show processId "stat")) + ( \ioErr -> + if ioeGetErrorType ioErr == NoSuchThing + then pure Nothing + else ioError ioErr + ) + +parseProcStatProcessGroupId :: String -> Maybe Integer +parseProcStatProcessGroupId statText = + case break (== ')') $ reverse statText of + (reversedRest, ')':_) -> + let rest = reverse reversedRest + in case words rest of + _state:_parentProcessId:processGroupId:_ -> readMaybe processGroupId + _ -> Nothing + _ -> Nothing + +parseProcessGroupMemberIds :: Integer -> String -> [Integer] +parseProcessGroupMemberIds targetProcessGroupId = + mapMaybe parseProcessLine . lines + where + parseProcessLine line = + case words line of + pidText:processGroupText:_ + | readMaybe processGroupText == Just targetProcessGroupId -> + readMaybe pidText + _ -> Nothing + +compilerProcessHasWritableObjectHandle + :: CompilerObjectOutputFingerprint -> Integer -> IO Bool +compilerProcessHasWritableObjectHandle fingerprint processId = do + maybeProcResult <- compilerProcessHasWritableObjectHandleViaProc fingerprint processId + case maybeProcResult of + Just hasWritableHandle -> pure hasWritableHandle + Nothing -> + fromMaybe True + <$> compilerProcessHasWritableObjectHandleViaLsof fingerprint processId + +compilerProcessHasWritableObjectHandleViaProc + :: CompilerObjectOutputFingerprint -> Integer -> IO (Maybe Bool) +compilerProcessHasWritableObjectHandleViaProc fingerprint processId = do + procAvailable <- doesDirectoryExist "/proc" + if not procAvailable + then pure Nothing + else do + let processDir = "/proc" show processId + let processFdDir = "/proc" show processId "fd" + processDirExists <- doesDirectoryExist processDir + processFdDirExists <- doesDirectoryExist processFdDir + if not processFdDirExists + then pure $ Just processDirExists + else + catchIOError + (classifyProcFdObjectHandleChecks =<< mapM (procFdCanWriteObject fingerprint processId) =<< listDirectory processFdDir) + (const $ pure Nothing) + +classifyProcFdObjectHandleChecks :: [Maybe Bool] -> IO (Maybe Bool) +classifyProcFdObjectHandleChecks results + | Just True `elem` results = pure $ Just True + | Nothing `elem` results = pure Nothing + | otherwise = pure $ Just False + +procFdCanWriteObject :: CompilerObjectOutputFingerprint -> Integer -> FilePath -> IO (Maybe Bool) +procFdCanWriteObject fingerprint processId fdName + | not (all isDigit fdName) = pure $ Just False + | otherwise = + catchIOError + ( do + let fdPath = "/proc" show processId "fd" fdName + fdStatus <- getFileStatus fdPath + if compilerObjectFingerprintMatchesStatus fingerprint fdStatus + then do + let fdInfoPath = + "/proc" show processId "fdinfo" fdName + maybeFlags <- readProcFdOpenFlags fdInfoPath + pure $ compilerObjectOpenFlagsCanWrite <$> maybeFlags + else pure $ Just False + ) + ( \ioErr -> + if ioeGetErrorType ioErr == NoSuchThing + then pure $ Just False + else pure Nothing + ) + +readProcFdOpenFlags :: FilePath -> IO (Maybe Integer) +readProcFdOpenFlags fdInfoPath = + catchIOError + ( do + fdInfo <- readFile fdInfoPath + _ <- evaluate $ length fdInfo + pure $ do + flagsLine <- findProcFdFlagsLine $ lines fdInfo + parseOctalInteger flagsLine + ) + (const $ pure Nothing) + +findProcFdFlagsLine :: [String] -> Maybe String +findProcFdFlagsLine [] = Nothing +findProcFdFlagsLine (line:rest) = + case words line of + "flags:":flagsText:_ -> Just flagsText + _ -> findProcFdFlagsLine rest + +compilerObjectOpenFlagsCanWrite :: Integer -> Bool +compilerObjectOpenFlagsCanWrite flags = + flags .&. 3 /= 0 + +compilerProcessHasWritableObjectHandleViaLsof + :: CompilerObjectOutputFingerprint -> Integer -> IO (Maybe Bool) +compilerProcessHasWritableObjectHandleViaLsof fingerprint processId = + catchIOError + ( do + lsofPath <- trustedSystemHelperPath "lsof" + (exitCode, stdoutText, _) <- + maybe + (pure (ExitFailure 127, "", "")) + ( \path -> + readProcessWithExitCode + path + ["-nP", "-F", "fDina", "-p", show processId] + "" + ) + lsofPath + case exitCode of + ExitSuccess -> + pure . Just $ + any + (lsofFdRecordCanWriteObject fingerprint) + (parseLsofFdRecords stdoutText) + ExitFailure _ -> pure Nothing + ) + (const $ pure Nothing) + +trustedSystemHelperPath :: FilePath -> IO (Maybe FilePath) +trustedSystemHelperPath helperName = + firstM doesFileExist $ + map ( helperName) trustedSystemHelperDirectories + where + firstM _ [] = pure Nothing + firstM predicate (candidate:candidates) = do + matched <- predicate candidate + if matched + then Just <$> canonicalizePath candidate + else firstM predicate candidates + +trustedSystemHelperDirectories :: [FilePath] +trustedSystemHelperDirectories = + [ "/usr/bin" + , "/bin" + , "/usr/sbin" + , "/sbin" ] -resolutionP :: Parser String -resolutionP = strOption $ mconcat - [ metavar "RESOLUTION" - , long "img-resolution" - , help "Specify the resolution of the AST graph to be generated" - , value "640x480" - , showDefaultWith id +data LsofFdRecord = LsofFdRecord + { lsofFdAccess :: Maybe String + , lsofFdDevice :: Maybe Integer + , lsofFdInode :: Maybe Integer + } + +emptyLsofFdRecord :: LsofFdRecord +emptyLsofFdRecord = + LsofFdRecord + { lsofFdAccess = Nothing + , lsofFdDevice = Nothing + , lsofFdInode = Nothing + } + +parseLsofFdRecords :: String -> [LsofFdRecord] +parseLsofFdRecords = + reverse . flushCurrent . List.foldl' step ([], Nothing) . lines + where + flushCurrent (records, maybeRecord) = + case maybeRecord of + Just record -> record : records + Nothing -> records + + step (records, maybeRecord) line = + case line of + 'f':_ -> + ( flushCurrent (records, maybeRecord) + , Just emptyLsofFdRecord + ) + 'a':access -> + ( records + , (\record -> record {lsofFdAccess = Just access}) + <$> maybeRecord + ) + 'D':deviceText -> + ( records + , (\record -> record {lsofFdDevice = parseUnsignedIntegerAutoBase deviceText}) + <$> maybeRecord + ) + 'i':inodeText -> + ( records + , (\record -> record {lsofFdInode = parseUnsignedIntegerAutoBase inodeText}) + <$> maybeRecord + ) + _ -> (records, maybeRecord) + +lsofFdRecordCanWriteObject :: CompilerObjectOutputFingerprint -> LsofFdRecord -> Bool +lsofFdRecordCanWriteObject fingerprint record = + case (lsofFdDevice record, lsofFdInode record) of + (Just deviceId, Just inode) + | compilerObjectFingerprintMatchesId fingerprint deviceId inode -> + maybe True lsofAccessCanWrite $ lsofFdAccess record + _ -> False + +lsofAccessCanWrite :: String -> Bool +lsofAccessCanWrite access = + 'w' `elem` access || 'u' `elem` access + +compilerObjectFingerprintMatchesStatus + :: CompilerObjectOutputFingerprint -> FileStatus -> Bool +compilerObjectFingerprintMatchesStatus fingerprint status = + compilerObjectFingerprintMatchesId + fingerprint + (fromIntegral $ deviceID status) + (fromIntegral $ fileID status) + +compilerObjectFingerprintMatchesId + :: CompilerObjectOutputFingerprint -> Integer -> Integer -> Bool +compilerObjectFingerprintMatchesId (CompilerObjectOutputFingerprint deviceId inode _ _ _ _) deviceId' inode' = + deviceId == deviceId' && inode == inode' + +parseOctalInteger :: String -> Maybe Integer +parseOctalInteger digits + | null digits || any (`notElem` ['0' .. '7']) digits = Nothing + | otherwise = + Just $ List.foldl' (\acc digit -> acc * 8 + fromIntegral (digitToInt digit)) 0 digits + +parseUnsignedIntegerAutoBase :: String -> Maybe Integer +parseUnsignedIntegerAutoBase value = + case stripPrefix "0x" value <|> stripPrefix "0X" value of + Just hexDigits -> parseHexInteger hexDigits + Nothing + | not (null value) && all isDigit value -> readMaybe value + | otherwise -> Nothing + +parseHexInteger :: String -> Maybe Integer +parseHexInteger digits + | null digits || not (all isHexDigit digits) = Nothing + | otherwise = + Just $ List.foldl' (\acc digit -> acc * 16 + fromIntegral (digitToInt digit)) 0 digits + +foldCompilerOutputChunks + :: Maybe ProcessGroupID + -> CompilerPostExitReadiness + -> TVar Bool + -> Handle + -> Handle + -> a + -> (a -> CapturedCompilerOutputChunk -> IO a) + -> IO a +foldCompilerOutputChunks processGroupId postExitDrainSatisfied processExitVar stdoutHandle stderrHandle initialAcc accumulateChunk = do + stdoutFd <- handleToFd stdoutHandle + stderrFd <- handleToFd stderrHandle + -- Keep captured pipes nonblocking from the start so readiness races or + -- HUP-only wakeups cannot strand us in a blocking read while leaked child + -- writers still hold the pipe open. + setFdOption stdoutFd NonBlockingRead True + setFdOption stderrFd NonBlockingRead True + (stdoutReady, closeStdoutWait) <- threadWaitReadSTM stdoutFd + (stderrReady, closeStderrWait) <- threadWaitReadSTM stderrFd + let cleanup = + ignoreIOException closeStdoutWait + *> ignoreIOException closeStderrWait + *> ignoreIOException (closeFd stdoutFd) + *> ignoreIOException (closeFd stderrFd) + flip finally cleanup $ + go + initialAcc + [] + [] + B.empty + B.empty + True + True + 0 + 0 + CompilerStdout + stdoutReady + stderrReady + stdoutFd + stderrFd + where + go acc stdoutPendingChunks stderrPendingChunks stdoutTrailingBytes stderrTrailingBytes stdoutOpen stderrOpen stdoutIndex stderrIndex preferredStream stdoutReady stderrReady stdoutFd stderrFd + | not stdoutOpen + && not stderrOpen + && null stdoutPendingChunks + && null stderrPendingChunks = + pure acc + | otherwise = do + shouldReadOtherStreamBeforePending <- + shouldReadOtherOutputStreamBeforePending + stdoutOpen + stderrOpen + stdoutPendingChunks + stderrPendingChunks + stdoutReady + stderrReady + case + ( shouldReadOtherStreamBeforePending + , nextPendingOutputStream + preferredStream + stdoutPendingChunks + stderrPendingChunks + ) + of + (False, Just outputStream) -> + do + (acc', stdoutPendingChunks', stderrPendingChunks', stdoutIndex', stderrIndex') <- + emitPendingCapturedCompilerOutputChunk + outputStream + acc + stdoutPendingChunks + stderrPendingChunks + stdoutIndex + stderrIndex + let preferredStream' + | stdoutOpen && stderrOpen = flipCompilerOutputStream outputStream + | otherwise = preferredStream + yield + go + acc' + stdoutPendingChunks' + stderrPendingChunks' + stdoutTrailingBytes + stderrTrailingBytes + stdoutOpen + stderrOpen + stdoutIndex' + stderrIndex' + preferredStream' + stdoutReady + stderrReady + stdoutFd + stderrFd + _ -> do + maybeOutputStream <- + waitForNextOutputStreamOrExit + processExitVar + preferredStream + stdoutOpen + stderrOpen + stdoutReady + stderrReady + case maybeOutputStream of + Nothing -> do + ( acc' + , stdoutPendingChunks' + , stderrPendingChunks' + , stdoutIndex' + , stderrIndex' + ) <- + drainCompilerOutputAfterExit + acc + stdoutPendingChunks + stderrPendingChunks + stdoutTrailingBytes + stderrTrailingBytes + stdoutOpen + stderrOpen + stdoutIndex + stderrIndex + preferredStream + stdoutReady + stderrReady + stdoutFd + stderrFd + go + acc' + stdoutPendingChunks' + stderrPendingChunks' + B.empty + B.empty + False + False + stdoutIndex' + stderrIndex' + preferredStream + stdoutReady + stderrReady + stdoutFd + stderrFd + Just outputStream -> do + (acc', stdoutTrailingBytes', stderrTrailingBytes', stdoutIndex', stderrIndex') <- + flushPendingTrailingOutputChunkBefore + outputStream + acc + stdoutTrailingBytes + stderrTrailingBytes + stdoutIndex + stderrIndex + let (outputFd, trailingBytes) = + case outputStream of + CompilerStdout -> + (stdoutFd, stdoutTrailingBytes') + CompilerStderr -> + (stderrFd, stderrTrailingBytes') + maybeReadResult <- readCompilerOutputByte outputFd + case maybeReadResult of + Nothing -> case outputStream of + CompilerStdout -> + let (stdoutPendingChunks', stdoutIndex'') = + queuePendingCapturedCompilerOutputChunks + CompilerStdout + stdoutIndex' + (finalCompilerOutputChunk stdoutTrailingBytes') + stdoutPendingChunks + in go + acc' + stdoutPendingChunks' + stderrPendingChunks + B.empty + stderrTrailingBytes' + False + stderrOpen + stdoutIndex'' + stderrIndex' + preferredStream + stdoutReady + stderrReady + stdoutFd + stderrFd + CompilerStderr -> + let (stderrPendingChunks', stderrIndex'') = + queuePendingCapturedCompilerOutputChunks + CompilerStderr + stderrIndex' + (finalCompilerOutputChunk stderrTrailingBytes') + stderrPendingChunks + in go + acc' + stdoutPendingChunks + stderrPendingChunks' + stdoutTrailingBytes' + B.empty + stdoutOpen + False + stdoutIndex' + stderrIndex'' + preferredStream + stdoutReady + stderrReady + stdoutFd + stderrFd + Just CompilerOutputReadEOF -> case outputStream of + CompilerStdout -> + let (stdoutPendingChunks', stdoutIndex'') = + queuePendingCapturedCompilerOutputChunks + CompilerStdout + stdoutIndex' + (finalCompilerOutputChunk stdoutTrailingBytes') + stdoutPendingChunks + in go + acc' + stdoutPendingChunks' + stderrPendingChunks + B.empty + stderrTrailingBytes' + False + stderrOpen + stdoutIndex'' + stderrIndex' + preferredStream + stdoutReady + stderrReady + stdoutFd + stderrFd + CompilerStderr -> + let (stderrPendingChunks', stderrIndex'') = + queuePendingCapturedCompilerOutputChunks + CompilerStderr + stderrIndex' + (finalCompilerOutputChunk stderrTrailingBytes') + stderrPendingChunks + in go + acc' + stdoutPendingChunks + stderrPendingChunks' + stdoutTrailingBytes' + B.empty + stdoutOpen + False + stdoutIndex' + stderrIndex'' + preferredStream + stdoutReady + stderrReady + stdoutFd + stderrFd + Just CompilerOutputReadWouldBlock -> do + yield + go + acc' + stdoutPendingChunks + stderrPendingChunks + stdoutTrailingBytes' + stderrTrailingBytes' + stdoutOpen + stderrOpen + stdoutIndex' + stderrIndex' + preferredStream + stdoutReady + stderrReady + stdoutFd + stderrFd + Just (CompilerOutputReadBytes bytes) -> do + let (completedChunks, remainingTrailingBytes) = + splitCompleteCompilerOutputChunks (trailingBytes <> bytes) + readyChunks = + completedChunks + <> finalCompilerOutputChunk remainingTrailingBytes + readyChunkCount = length readyChunks + ( acc'' + , stdoutPendingChunks' + , stderrPendingChunks' + , stdoutIndex'' + , stderrIndex'' + ) <- + captureCompletedCompilerOutputChunks + outputStream + readyChunks + acc' + stdoutPendingChunks + stderrPendingChunks + stdoutIndex' + stderrIndex' + let preferredStream' + | stdoutOpen && stderrOpen = flipCompilerOutputStream outputStream + | otherwise = preferredStream + when (readyChunkCount > 0) yield + go + acc'' + stdoutPendingChunks' + stderrPendingChunks' + B.empty + B.empty + stdoutOpen + stderrOpen + stdoutIndex'' + stderrIndex'' + preferredStream' + stdoutReady + stderrReady + stdoutFd + stderrFd + + nextPendingOutputStream preferredStream stdoutPendingChunks stderrPendingChunks = + case preferredStream of + CompilerStdout + | not (null stdoutPendingChunks) -> Just CompilerStdout + | not (null stderrPendingChunks) -> Just CompilerStderr + CompilerStderr + | not (null stderrPendingChunks) -> Just CompilerStderr + | not (null stdoutPendingChunks) -> Just CompilerStdout + _ -> + Nothing + + shouldReadOtherOutputStreamBeforePending stdoutOpen stderrOpen stdoutPendingChunks stderrPendingChunks stdoutReady stderrReady = + case (stdoutOpen, stderrOpen, stdoutPendingChunks, stderrPendingChunks) of + (True, True, _ : _, []) -> + outputStreamReadyNow stderrReady + (True, True, [], _ : _) -> + outputStreamReadyNow stdoutReady + _ -> + pure False + + outputStreamReadyNow ready = + atomically $ (ready >> pure True) `orElse` pure False + + captureCompletedCompilerOutputChunks outputStream completedChunks acc stdoutPendingChunks stderrPendingChunks stdoutIndex stderrIndex = + case outputStream of + CompilerStdout -> + let completedChunkCount = length completedChunks + in do + (acc', stdoutPendingChunks') <- + captureCompletedCompilerOutputChunksForStream + CompilerStdout + stdoutIndex + completedChunks + acc + stdoutPendingChunks + pure + ( acc' + , stdoutPendingChunks' + , stderrPendingChunks + , stdoutIndex + completedChunkCount + , stderrIndex + ) + CompilerStderr -> + let completedChunkCount = length completedChunks + in do + (acc', stderrPendingChunks') <- + captureCompletedCompilerOutputChunksForStream + CompilerStderr + stderrIndex + completedChunks + acc + stderrPendingChunks + pure + ( acc' + , stdoutPendingChunks + , stderrPendingChunks' + , stdoutIndex + , stderrIndex + completedChunkCount + ) + + captureCompletedCompilerOutputChunksForStream outputStream startIndex completedChunks acc pendingChunks = + case buildCapturedCompilerOutputChunks outputStream startIndex completedChunks of + nextChunk:remainingChunks + | null pendingChunks -> + do + acc' <- accumulateChunk acc nextChunk + pure (acc', remainingChunks) + | otherwise -> + pure (acc, pendingChunks <> (nextChunk : remainingChunks)) + [] -> + pure (acc, pendingChunks) + + queuePendingCapturedCompilerOutputChunks outputStream startIndex outputChunks pendingChunks = + let indexedChunks = + buildCapturedCompilerOutputChunks outputStream startIndex outputChunks + in ( pendingChunks <> indexedChunks + , startIndex + length indexedChunks + ) + + emitPendingCapturedCompilerOutputChunk outputStream acc stdoutPendingChunks stderrPendingChunks stdoutIndex stderrIndex = + case outputStream of + CompilerStdout -> + case stdoutPendingChunks of + nextChunk:remainingChunks -> + do + acc' <- accumulateChunk acc nextChunk + pure + ( acc' + , remainingChunks + , stderrPendingChunks + , stdoutIndex + , stderrIndex + ) + [] -> + pure (acc, [], stderrPendingChunks, stdoutIndex, stderrIndex) + CompilerStderr -> + case stderrPendingChunks of + nextChunk:remainingChunks -> + do + acc' <- accumulateChunk acc nextChunk + pure + ( acc' + , stdoutPendingChunks + , remainingChunks + , stdoutIndex + , stderrIndex + ) + [] -> + pure (acc, stdoutPendingChunks, [], stdoutIndex, stderrIndex) + + flushPendingTrailingOutputChunkBefore outputStream acc stdoutTrailingBytes stderrTrailingBytes stdoutIndex stderrIndex = + case outputStream of + CompilerStdout -> do + acc' <- emitTrailingOutputChunk accumulateChunk CompilerStderr stderrIndex stderrTrailingBytes acc + pure + ( acc' + , stdoutTrailingBytes + , B.empty + , stdoutIndex + , stderrIndex + pendingTrailingOutputChunkCount stderrTrailingBytes + ) + CompilerStderr -> do + acc' <- emitTrailingOutputChunk accumulateChunk CompilerStdout stdoutIndex stdoutTrailingBytes acc + pure + ( acc' + , B.empty + , stderrTrailingBytes + , stdoutIndex + pendingTrailingOutputChunkCount stdoutTrailingBytes + , stderrIndex + ) + + waitForNextOutputStreamOrExit + :: TVar Bool + -> CompilerOutputStream + -> Bool + -> Bool + -> STM () + -> STM () + -> IO (Maybe CompilerOutputStream) + waitForNextOutputStreamOrExit processExitVar' preferredStream stdoutOpen stderrOpen stdoutReady stderrReady = + atomically $ + waitForProcessExit processExitVar' + `orElse` waitForNextOutputStreamStm + preferredStream + stdoutOpen + stderrOpen + stdoutReady + stderrReady + + waitForNextOutputStreamStm + :: CompilerOutputStream + -> Bool + -> Bool + -> STM () + -> STM () + -> STM (Maybe CompilerOutputStream) + waitForNextOutputStreamStm preferredStream stdoutOpen stderrOpen stdoutReady stderrReady + | stdoutOpen && stderrOpen = + case preferredStream of + CompilerStdout -> + (stdoutReady >> pure (Just CompilerStdout)) + `orElse` (stderrReady >> pure (Just CompilerStderr)) + CompilerStderr -> + (stderrReady >> pure (Just CompilerStderr)) + `orElse` (stdoutReady >> pure (Just CompilerStdout)) + | stdoutOpen = + stdoutReady >> pure (Just CompilerStdout) + | stderrOpen = + stderrReady >> pure (Just CompilerStderr) + | otherwise = + pure Nothing + + waitForProcessExit :: TVar Bool -> STM (Maybe CompilerOutputStream) + waitForProcessExit processExitVar' = do + processExited <- readTVar processExitVar' + check processExited + pure Nothing + + -- Keep draining until the wrapper's meaningful side effects are ready. + -- Once the caller-specific completion condition is satisfied, stop + -- waiting for EOF from inherited pipe holders and finish with the bytes + -- that were already observed. + drainCompilerOutputAfterExit + acc + stdoutPendingChunks + stderrPendingChunks + stdoutTrailingBytes + stderrTrailingBytes + stdoutOpen + stderrOpen + stdoutIndex + stderrIndex + preferredStream + stdoutReady + stderrReady + stdoutFd + stderrFd = do + when stdoutOpen $ setFdOption stdoutFd NonBlockingRead True + when stderrOpen $ setFdOption stderrFd NonBlockingRead True + drainCapturedCompilerOutputAfterExit + acc + stdoutPendingChunks + stderrPendingChunks + stdoutTrailingBytes + stderrTrailingBytes + stdoutOpen + stderrOpen + stdoutIndex + stderrIndex + preferredStream + stdoutReady + stderrReady + stdoutFd + stderrFd + + drainCapturedCompilerOutputAfterExit + acc + stdoutPendingChunks + stderrPendingChunks + stdoutTrailingBytes + stderrTrailingBytes + stdoutOpen + stderrOpen + stdoutIndex + stderrIndex + preferredStream + stdoutReady + stderrReady + stdoutFd + stderrFd + | not stdoutOpen && not stderrOpen = + finishDrainedCompilerOutputAfterExit + acc + stdoutPendingChunks + stderrPendingChunks + stdoutTrailingBytes + stderrTrailingBytes + stdoutIndex + stderrIndex + | otherwise = + case nextDrainOutputStream preferredStream stdoutOpen stderrOpen of + Nothing -> + finishDrainedCompilerOutputAfterExit + acc + stdoutPendingChunks + stderrPendingChunks + stdoutTrailingBytes + stderrTrailingBytes + stdoutIndex + stderrIndex + Just outputStream -> do + (acc', stdoutTrailingBytes', stderrTrailingBytes', stdoutIndex', stderrIndex') <- + flushPendingTrailingOutputChunkBefore + outputStream + acc + stdoutTrailingBytes + stderrTrailingBytes + stdoutIndex + stderrIndex + let (outputFd, trailingBytes) = + case outputStream of + CompilerStdout -> + (stdoutFd, stdoutTrailingBytes') + CompilerStderr -> + (stderrFd, stderrTrailingBytes') + maybeBytes <- readDrainedCompilerOutputByte outputFd + case maybeBytes of + Nothing -> + closeDrainedCompilerOutputStreamAfterExit + outputStream + acc' + stdoutPendingChunks + stderrPendingChunks + stdoutTrailingBytes' + stderrTrailingBytes' + stdoutOpen + stderrOpen + stdoutIndex' + stderrIndex' + preferredStream + stdoutReady + stderrReady + stdoutFd + stderrFd + Just CompilerOutputReadEOF -> + closeDrainedCompilerOutputStreamAfterExit + outputStream + acc' + stdoutPendingChunks + stderrPendingChunks + stdoutTrailingBytes' + stderrTrailingBytes' + stdoutOpen + stderrOpen + stdoutIndex' + stderrIndex' + preferredStream + stdoutReady + stderrReady + stdoutFd + stderrFd + Just CompilerOutputReadWouldBlock -> do + postExitSatisfied <- postExitDrainSatisfied processGroupId + processGroupStillAlive <- compilerProcessGroupAlive processGroupId + if postExitSatisfied || not processGroupStillAlive + then + closeDrainedCompilerOutputStreamAfterExit + outputStream + acc' + stdoutPendingChunks + stderrPendingChunks + stdoutTrailingBytes' + stderrTrailingBytes' + stdoutOpen + stderrOpen + stdoutIndex' + stderrIndex' + preferredStream + stdoutReady + stderrReady + stdoutFd + stderrFd + else do + maybeOutputStream <- + waitForDrainedCompilerOutputStreamAfterExit + preferredStream + stdoutOpen + stderrOpen + stdoutReady + stderrReady + case maybeOutputStream of + Just preferredStream' -> + drainCapturedCompilerOutputAfterExit + acc' + stdoutPendingChunks + stderrPendingChunks + stdoutTrailingBytes' + stderrTrailingBytes' + stdoutOpen + stderrOpen + stdoutIndex' + stderrIndex' + preferredStream' + stdoutReady + stderrReady + stdoutFd + stderrFd + Nothing -> + drainCapturedCompilerOutputAfterExit + acc' + stdoutPendingChunks + stderrPendingChunks + stdoutTrailingBytes' + stderrTrailingBytes' + stdoutOpen + stderrOpen + stdoutIndex' + stderrIndex' + preferredStream + stdoutReady + stderrReady + stdoutFd + stderrFd + Just (CompilerOutputReadBytes bytes) -> do + postExitSatisfied <- postExitDrainSatisfied processGroupId + bytes' <- + if postExitSatisfied + then readForcedDrainedCompilerOutputBytes outputFd bytes + else pure bytes + let (completedChunks, remainingTrailingBytes) = + splitCompleteCompilerOutputChunks (trailingBytes <> bytes') + readyChunks = + completedChunks + <> finalCompilerOutputChunk remainingTrailingBytes + readyChunkCount = length readyChunks + ( acc'' + , stdoutPendingChunks' + , stderrPendingChunks' + , stdoutIndex'' + , stderrIndex'' + ) <- + captureCompletedCompilerOutputChunks + outputStream + readyChunks + acc' + stdoutPendingChunks + stderrPendingChunks + stdoutIndex' + stderrIndex' + let preferredStream' + | stdoutOpen && stderrOpen = flipCompilerOutputStream outputStream + | otherwise = preferredStream + if postExitSatisfied + then + closeDrainedCompilerOutputStreamAfterExit + outputStream + acc'' + stdoutPendingChunks' + stderrPendingChunks' + B.empty + B.empty + stdoutOpen + stderrOpen + stdoutIndex'' + stderrIndex'' + preferredStream' + stdoutReady + stderrReady + stdoutFd + stderrFd + else + if readyChunkCount > 0 + then + drainCapturedCompilerOutputAfterExit + acc'' + stdoutPendingChunks' + stderrPendingChunks' + B.empty + B.empty + stdoutOpen + stderrOpen + stdoutIndex'' + stderrIndex'' + preferredStream' + stdoutReady + stderrReady + stdoutFd + stderrFd + else + closeDrainedCompilerOutputStreamAfterExit + outputStream + acc'' + stdoutPendingChunks' + stderrPendingChunks' + B.empty + B.empty + stdoutOpen + stderrOpen + stdoutIndex'' + stderrIndex'' + preferredStream' + stdoutReady + stderrReady + stdoutFd + stderrFd + + finishDrainedCompilerOutputAfterExit + acc + stdoutPendingChunks + stderrPendingChunks + stdoutTrailingBytes + stderrTrailingBytes + stdoutIndex + stderrIndex = + let (stdoutPendingChunks', stdoutIndex') = + queuePendingCapturedCompilerOutputChunks + CompilerStdout + stdoutIndex + (finalCompilerOutputChunk stdoutTrailingBytes) + stdoutPendingChunks + (stderrPendingChunks', stderrIndex') = + queuePendingCapturedCompilerOutputChunks + CompilerStderr + stderrIndex + (finalCompilerOutputChunk stderrTrailingBytes) + stderrPendingChunks + in pure + ( acc + , stdoutPendingChunks' + , stderrPendingChunks' + , stdoutIndex' + , stderrIndex' + ) + + closeDrainedCompilerOutputStreamAfterExit + outputStream + acc + stdoutPendingChunks + stderrPendingChunks + stdoutTrailingBytes + stderrTrailingBytes + stdoutOpen + stderrOpen + stdoutIndex + stderrIndex + preferredStream + stdoutReady + stderrReady + stdoutFd + stderrFd = + case outputStream of + CompilerStdout -> + let (stdoutPendingChunks', stdoutIndex') = + queuePendingCapturedCompilerOutputChunks + CompilerStdout + stdoutIndex + (finalCompilerOutputChunk stdoutTrailingBytes) + stdoutPendingChunks + in drainCapturedCompilerOutputAfterExit + acc + stdoutPendingChunks' + stderrPendingChunks + B.empty + stderrTrailingBytes + False + stderrOpen + stdoutIndex' + stderrIndex + preferredStream + stdoutReady + stderrReady + stdoutFd + stderrFd + CompilerStderr -> + let (stderrPendingChunks', stderrIndex') = + queuePendingCapturedCompilerOutputChunks + CompilerStderr + stderrIndex + (finalCompilerOutputChunk stderrTrailingBytes) + stderrPendingChunks + in drainCapturedCompilerOutputAfterExit + acc + stdoutPendingChunks + stderrPendingChunks' + stdoutTrailingBytes + B.empty + stdoutOpen + False + stdoutIndex + stderrIndex' + preferredStream + stdoutReady + stderrReady + stdoutFd + stderrFd + + nextDrainOutputStream preferredStream stdoutOpen stderrOpen = + case preferredStream of + CompilerStdout + | stdoutOpen -> Just CompilerStdout + | stderrOpen -> Just CompilerStderr + CompilerStderr + | stderrOpen -> Just CompilerStderr + | stdoutOpen -> Just CompilerStdout + _ -> + Nothing + + waitForDrainedCompilerOutputStreamAfterExit + :: CompilerOutputStream + -> Bool + -> Bool + -> STM () + -> STM () + -> IO (Maybe CompilerOutputStream) + waitForDrainedCompilerOutputStreamAfterExit preferredStream stdoutOpen stderrOpen stdoutReady stderrReady = + fromMaybe Nothing + <$> timeout + compilerOutputDrainAfterExitPollMicros + (atomically $ waitForNextOutputStreamStm preferredStream stdoutOpen stderrOpen stdoutReady stderrReady) + + readCompilerOutputByte fd = + catchIOError + ( do + bytes <- PB.fdRead fd (fromIntegral compilerOutputReadChunkSize) + pure . Just $ + if B.null bytes + then CompilerOutputReadEOF + else CompilerOutputReadBytes bytes + ) + ( \ioErr -> + if isEOFError ioErr + then pure Nothing + else + if ioeGetErrorType ioErr == ResourceExhausted + then pure $ Just CompilerOutputReadWouldBlock + else ioError ioErr + ) + + readDrainedCompilerOutputByte fd = + catchIOError + ( do + bytes <- PB.fdRead fd (fromIntegral compilerOutputReadChunkSize) + pure . Just $ + if B.null bytes + then CompilerOutputReadEOF + else CompilerOutputReadBytes bytes + ) + ( \ioErr -> + if isEOFError ioErr + then pure Nothing + else + if ioeGetErrorType ioErr == ResourceExhausted + then pure $ Just CompilerOutputReadWouldBlock + else ioError ioErr + ) + + readForcedDrainedCompilerOutputBytes fd initialBytes = + readLoop (B.length initialBytes) [initialBytes] + where + readLoop byteCount chunks + | byteCount >= compilerOutputForcedDrainMaxBytes = + pure $ B.concat $ reverse chunks + | otherwise = + readDrainedCompilerOutputByte fd >>= \case + Just (CompilerOutputReadBytes bytes) -> + readLoop (byteCount + B.length bytes) (bytes : chunks) + _ -> + pure $ B.concat $ reverse chunks + +compilerOutputReadChunkSize :: Int +compilerOutputReadChunkSize = 4096 + +compilerOutputDrainAfterExitPollMicros :: Int +compilerOutputDrainAfterExitPollMicros = 50000 + +compilerProcessFailureOutputDrainGraceMicros :: Int +compilerProcessFailureOutputDrainGraceMicros = 500000 + +compilerOutputForcedDrainMaxBytes :: Int +compilerOutputForcedDrainMaxBytes = 65536 + +compilerProcessFailureTerminationGraceMicros :: Int +compilerProcessFailureTerminationGraceMicros = 200000 + +compilerProcessPostExitCompletionTimeoutMicros :: Int +compilerProcessPostExitCompletionTimeoutMicros = 3000000 + +minimumStableCompilerObjectOutputBytes :: Integer +minimumStableCompilerObjectOutputBytes = 20 + +minimumProbeObjectTargetBytes :: Word64 +minimumProbeObjectTargetBytes = 4 + +minimumElfProbeObjectTargetBytes :: Int +minimumElfProbeObjectTargetBytes = 20 + +data CompilerOutputReadResult + = CompilerOutputReadEOF + | CompilerOutputReadWouldBlock + | CompilerOutputReadBytes !B.ByteString + +pendingTrailingOutputChunkCount :: B.ByteString -> Int +pendingTrailingOutputChunkCount trailingBytes + | B.null trailingBytes = + 0 + | otherwise = + 1 + +buildCapturedCompilerOutputChunks + :: CompilerOutputStream + -> Int + -> [CompilerOutputChunk] + -> [CapturedCompilerOutputChunk] +buildCapturedCompilerOutputChunks outputStream startIndex = + zipWith mkCapturedChunk [startIndex ..] + where + mkCapturedChunk outputIndex outputChunk = + CapturedCompilerOutputChunk + { capturedCompilerOutputStream = outputStream + , capturedCompilerOutputIndex = outputIndex + , capturedCompilerOutputChunk = outputChunk + } + +emitTrailingOutputChunk + :: (a -> CapturedCompilerOutputChunk -> IO a) + -> CompilerOutputStream + -> Int + -> B.ByteString + -> a + -> IO a +emitTrailingOutputChunk accumulateChunk outputStream nextIndex trailingBytes acc = + foldM + accumulateChunk + acc + (buildCapturedCompilerOutputChunks outputStream nextIndex (finalCompilerOutputChunk trailingBytes)) + +flipCompilerOutputStream :: CompilerOutputStream -> CompilerOutputStream +flipCompilerOutputStream = \case + CompilerStdout -> CompilerStderr + CompilerStderr -> CompilerStdout + +emptyIncrementalStreamWarningSuppressionState :: IncrementalStreamWarningSuppressionState +emptyIncrementalStreamWarningSuppressionState = + IncrementalStreamWarningSuppressionState + { incrementalStreamWarningPendingChunk = Nothing + , incrementalStreamWarningChunkFilter = emptyIncrementalCompilerWarningFilter + } + +emptyIncrementalCompilerWarningSuppressionState :: IncrementalCompilerWarningSuppressionState +emptyIncrementalCompilerWarningSuppressionState = + IncrementalCompilerWarningSuppressionState + { incrementalCompilerWarningStdoutState = emptyIncrementalStreamWarningSuppressionState + , incrementalCompilerWarningStderrState = emptyIncrementalStreamWarningSuppressionState + , incrementalCompilerWarningChunkFilter = emptyIncrementalCompilerWarningFilter + , incrementalCompilerWarningPending = [] + , incrementalCompilerWarningDecisions = Map.empty + } + +processIncrementalCompilerWarningSuppressionChunk + :: Handle + -> Handle + -> IncrementalCompilerWarningSuppressionState + -> CapturedCompilerOutputChunk + -> IO IncrementalCompilerWarningSuppressionState +processIncrementalCompilerWarningSuppressionChunk stdoutHandle stderrHandle suppressionState capturedChunk = + flushIncrementalCompilerWarningSuppressionState stdoutHandle stderrHandle $ + applyCompilerWarningFilterDecisions decisions suppressionState'' + where + outputStream = capturedCompilerOutputStream capturedChunk + streamState = + streamWarningSuppressionState outputStream suppressionState + (streamState', maybeCompletedChunk, localDecisions) = + feedIncrementalStreamWarningSuppressionState streamState capturedChunk + suppressionState' = + setStreamWarningSuppressionState outputStream streamState' $ + suppressionState + { incrementalCompilerWarningPending = + incrementalCompilerWarningPending suppressionState <> [capturedChunk] + } + (chunkFilter', completedChunkDecisions) = + case maybeCompletedChunk of + Nothing -> + (incrementalCompilerWarningChunkFilter suppressionState', []) + Just completedChunk -> + feedIncrementalCompilerWarningFilter + (fst . suppressibleCapturedCompilerOutputChunk) + (snd . suppressibleCapturedCompilerOutputChunk) + (incrementalCompilerWarningChunkFilter suppressionState') + [completedChunk] + suppressionState'' = + suppressionState' + { incrementalCompilerWarningChunkFilter = chunkFilter' + } + decisions = + localDecisions <> completedChunkDecisions + +finalizeIncrementalCompilerWarningSuppression + :: Handle + -> Handle + -> IncrementalCompilerWarningSuppressionState + -> IO IncrementalCompilerWarningSuppressionState +finalizeIncrementalCompilerWarningSuppression stdoutHandle stderrHandle suppressionState = + flushIncrementalCompilerWarningSuppressionState stdoutHandle stderrHandle $ + applyCompilerWarningFilterDecisions finalDecisions suppressionState' + where + pendingPartialChunks = + orderPendingSuppressibleChunks + (incrementalCompilerWarningPending suppressionState) + (mapMaybe streamPendingSuppressibleCapturedCompilerOutputChunk + [ incrementalCompilerWarningStdoutState suppressionState + , incrementalCompilerWarningStderrState suppressionState + ] + ) + (chunkFilter', pendingChunkDecisions) = + List.foldl' + feedPendingSuppressibleChunk + ( incrementalCompilerWarningChunkFilter suppressionState + , [] + ) + pendingPartialChunks + finalDecisions = + pendingChunkDecisions + <> finalizeIncrementalCompilerWarningFilter + (fst . suppressibleCapturedCompilerOutputChunk) + (snd . suppressibleCapturedCompilerOutputChunk) + chunkFilter' + suppressionState' = + suppressionState + { incrementalCompilerWarningStdoutState = emptyIncrementalStreamWarningSuppressionState + , incrementalCompilerWarningStderrState = emptyIncrementalStreamWarningSuppressionState + , incrementalCompilerWarningChunkFilter = chunkFilter' + } + feedPendingSuppressibleChunk (chunkFilter, accumulatedDecisions) suppressibleChunk = + let (nextChunkFilter, chunkDecisions) = + feedIncrementalCompilerWarningFilter + (fst . suppressibleCapturedCompilerOutputChunk) + (snd . suppressibleCapturedCompilerOutputChunk) + chunkFilter + [suppressibleChunk] + in (nextChunkFilter, accumulatedDecisions <> chunkDecisions) + +feedIncrementalStreamWarningSuppressionState + :: IncrementalStreamWarningSuppressionState + -> CapturedCompilerOutputChunk + -> ( IncrementalStreamWarningSuppressionState + , Maybe SuppressibleCapturedCompilerOutputChunk + , [CompilerWarningFilterDecision SuppressibleCapturedCompilerOutputChunk] + ) +feedIncrementalStreamWarningSuppressionState streamState capturedChunk = + let pendingChunk = + appendCapturedCompilerOutputChunk + (incrementalStreamWarningPendingChunk streamState) + capturedChunk + suppressibleChunk = + buildSuppressibleCapturedCompilerOutputChunk pendingChunk + in if capturedCompilerOutputChunkEndsLine capturedChunk + then + ( streamState + { incrementalStreamWarningPendingChunk = Nothing + } + , Just suppressibleChunk + , [] + ) + else + if incompleteCompilerOutputNeedsMoreInputForWarningSuppression + (snd $ suppressibleCapturedCompilerOutputChunk suppressibleChunk) + then + -- Keep only warning-like incomplete chunks buffered so + -- safe interactive output still reaches the terminal + -- before the wrapped tool emits a newline or exits. + ( streamState + { incrementalStreamWarningPendingChunk = Just pendingChunk + } + , Nothing + , [] + ) + else + ( streamState + { incrementalStreamWarningPendingChunk = Nothing + } + , Just suppressibleChunk + , [] + ) + +streamPendingSuppressibleCapturedCompilerOutputChunk + :: IncrementalStreamWarningSuppressionState + -> Maybe SuppressibleCapturedCompilerOutputChunk +streamPendingSuppressibleCapturedCompilerOutputChunk streamState = + buildSuppressibleCapturedCompilerOutputChunk + <$> incrementalStreamWarningPendingChunk streamState + +orderPendingSuppressibleChunks + :: [CapturedCompilerOutputChunk] + -> [SuppressibleCapturedCompilerOutputChunk] + -> [SuppressibleCapturedCompilerOutputChunk] +orderPendingSuppressibleChunks pendingChunks = + sortOn earliestPendingChunkPosition + where + pendingChunkPositions = + Map.fromList $ + zip (map capturedCompilerOutputKey pendingChunks) [0 :: Int ..] + earliestPendingChunkPosition suppressibleChunk = + minimum $ + map + (\outputKey -> Map.findWithDefault maxBound outputKey pendingChunkPositions) + (suppressibleCapturedCompilerOutputKeys suppressibleChunk) + +streamWarningSuppressionState + :: CompilerOutputStream + -> IncrementalCompilerWarningSuppressionState + -> IncrementalStreamWarningSuppressionState +streamWarningSuppressionState outputStream suppressionState = + case outputStream of + CompilerStdout -> + incrementalCompilerWarningStdoutState suppressionState + CompilerStderr -> + incrementalCompilerWarningStderrState suppressionState + +setStreamWarningSuppressionState + :: CompilerOutputStream + -> IncrementalStreamWarningSuppressionState + -> IncrementalCompilerWarningSuppressionState + -> IncrementalCompilerWarningSuppressionState +setStreamWarningSuppressionState outputStream streamState suppressionState = + case outputStream of + CompilerStdout -> + suppressionState + { incrementalCompilerWarningStdoutState = streamState + } + CompilerStderr -> + suppressionState + { incrementalCompilerWarningStderrState = streamState + } + +applyCompilerWarningFilterDecisions + :: [CompilerWarningFilterDecision SuppressibleCapturedCompilerOutputChunk] + -> IncrementalCompilerWarningSuppressionState + -> IncrementalCompilerWarningSuppressionState +applyCompilerWarningFilterDecisions decisions suppressionState = + suppressionState + { incrementalCompilerWarningDecisions = + List.foldl' + applyCompilerWarningFilterDecision + (incrementalCompilerWarningDecisions suppressionState) + decisions + } + +applyCompilerWarningFilterDecision + :: Map.Map CapturedCompilerOutputKey CapturedCompilerOutputDecision + -> CompilerWarningFilterDecision SuppressibleCapturedCompilerOutputChunk + -> Map.Map CapturedCompilerOutputKey CapturedCompilerOutputDecision +applyCompilerWarningFilterDecision decisionMap decision = + List.foldl' + (\decisionMap' outputKey -> Map.insert outputKey capturedDecision decisionMap') + decisionMap + outputKeys + where + (capturedDecision, suppressibleChunk) = + case decision of + RetainCompilerWarningFilterChunk chunk -> + (RetainCapturedCompilerOutput, chunk) + SuppressCompilerWarningFilterChunk chunk -> + (SuppressCapturedCompilerOutput, chunk) + outputKeys = + suppressibleCapturedCompilerOutputKeys suppressibleChunk + +flushIncrementalCompilerWarningSuppressionState + :: Handle + -> Handle + -> IncrementalCompilerWarningSuppressionState + -> IO IncrementalCompilerWarningSuppressionState +flushIncrementalCompilerWarningSuppressionState stdoutHandle stderrHandle suppressionState = + go + (incrementalCompilerWarningPending suppressionState) + (incrementalCompilerWarningDecisions suppressionState) + [] + where + go [] decisionMap reversedPendingChunks = + pure $ + suppressionState + { incrementalCompilerWarningPending = reverse reversedPendingChunks + , incrementalCompilerWarningDecisions = decisionMap + } + go (capturedChunk:remainingChunks) decisionMap reversedPendingChunks = + case Map.lookup (capturedCompilerOutputKey capturedChunk) decisionMap of + Nothing -> + go remainingChunks decisionMap (capturedChunk : reversedPendingChunks) + Just capturedDecision -> do + when (capturedDecision == RetainCapturedCompilerOutput) $ + replayCapturedCompilerOutputBytes + (capturedCompilerOutputDestinationHandle stdoutHandle stderrHandle capturedChunk) + (fst $ capturedCompilerOutputChunk capturedChunk) + go + remainingChunks + (Map.delete (capturedCompilerOutputKey capturedChunk) decisionMap) + reversedPendingChunks + +capturedCompilerOutputDestinationHandle :: Handle -> Handle -> CapturedCompilerOutputChunk -> Handle +capturedCompilerOutputDestinationHandle stdoutHandle stderrHandle capturedChunk = + case capturedCompilerOutputStream capturedChunk of + CompilerStdout -> stdoutHandle + CompilerStderr -> stderrHandle + +replayCapturedCompilerOutputChunk :: Handle -> Handle -> () -> CapturedCompilerOutputChunk -> IO () +replayCapturedCompilerOutputChunk stdoutHandle stderrHandle () capturedChunk = + replayCapturedCompilerOutputBytes + (capturedCompilerOutputDestinationHandle stdoutHandle stderrHandle capturedChunk) + (fst $ capturedCompilerOutputChunk capturedChunk) + +takeCapturedResult :: MVar (Either SomeException a) -> IO a +takeCapturedResult outputVar = + takeMVar outputVar >>= either throwIO pure + +takeCapturedResultAfterProcessExit :: MVar (Either SomeException a) -> IO a +takeCapturedResultAfterProcessExit outputVar = + timeout compilerProcessPostExitCompletionTimeoutMicros (takeCapturedResult outputVar) >>= \case + Just result -> pure result + Nothing -> + ioError . userError $ + "compiler wrapper did not finish delayed output after exit" + +compilerOutputBytesForStream + :: CompilerOutputStream + -> [CapturedCompilerOutputChunk] + -> B.ByteString +compilerOutputBytesForStream outputStream = + B.concat + . map (fst . capturedCompilerOutputChunk) + . filter ((== outputStream) . capturedCompilerOutputStream) + +capturedCompilerOutputKey + :: CapturedCompilerOutputChunk + -> CapturedCompilerOutputKey +capturedCompilerOutputKey capturedChunk = + ( capturedCompilerOutputStream capturedChunk + , capturedCompilerOutputIndex capturedChunk + ) + +appendCapturedCompilerOutputChunk + :: Maybe ([CapturedCompilerOutputKey], [B.ByteString]) + -> CapturedCompilerOutputChunk + -> ([CapturedCompilerOutputKey], [B.ByteString]) +appendCapturedCompilerOutputChunk maybePending capturedChunk = + let (pendingKeys, pendingBytes) = fromMaybe ([], []) maybePending + (chunkBytes, _) = capturedCompilerOutputChunk capturedChunk + in ( capturedCompilerOutputKey capturedChunk : pendingKeys + , chunkBytes : pendingBytes + ) + +buildSuppressibleCapturedCompilerOutputChunk + :: ([CapturedCompilerOutputKey], [B.ByteString]) + -> SuppressibleCapturedCompilerOutputChunk +buildSuppressibleCapturedCompilerOutputChunk (reversedKeys, reversedBytes) = + let chunkBytes = B.concat $ reverse reversedBytes + in SuppressibleCapturedCompilerOutputChunk + { suppressibleCapturedCompilerOutputKeys = reverse reversedKeys + , suppressibleCapturedCompilerOutputChunk = + (chunkBytes, normalizeCompilerOutputLine chunkBytes) + } + +capturedCompilerOutputChunkEndsLine :: CapturedCompilerOutputChunk -> Bool +capturedCompilerOutputChunkEndsLine capturedChunk = + let (chunkBytes, _) = capturedCompilerOutputChunk capturedChunk + in not (B.null chunkBytes) && B.last chunkBytes == newlineByte + +groupCapturedCompilerOutputChunksForWarningSuppression + :: [CapturedCompilerOutputChunk] + -> [SuppressibleCapturedCompilerOutputChunk] +groupCapturedCompilerOutputChunksForWarningSuppression capturedChunks = + orderPendingSuppressibleChunks capturedChunks (completedChunks <> pendingChunks) + where + (incrementalStdoutPendingChunk, incrementalStderrPendingChunk, completedChunks) = + List.foldl' + step + (Nothing, Nothing, []) + capturedChunks + pendingChunks = + mapMaybe + (fmap buildSuppressibleCapturedCompilerOutputChunk) + [ incrementalStdoutPendingChunk + , incrementalStderrPendingChunk + ] + + step (stdoutPendingChunk, stderrPendingChunk, accumulatedChunks) capturedChunk = + let pendingChunk = + appendCapturedCompilerOutputChunk + (case capturedCompilerOutputStream capturedChunk of + CompilerStdout -> stdoutPendingChunk + CompilerStderr -> stderrPendingChunk + ) + capturedChunk + completedChunk = + [ buildSuppressibleCapturedCompilerOutputChunk pendingChunk + | capturedCompilerOutputChunkEndsLine capturedChunk + ] + in case capturedCompilerOutputStream capturedChunk of + CompilerStdout -> + ( if null completedChunk then Just pendingChunk else Nothing + , stderrPendingChunk + , accumulatedChunks <> completedChunk + ) + CompilerStderr -> + ( stdoutPendingChunk + , if null completedChunk then Just pendingChunk else Nothing + , accumulatedChunks <> completedChunk + ) + +compilerOutputBytesForSuppressibleChunks + :: CompilerOutputStream + -> [SuppressibleCapturedCompilerOutputChunk] + -> B.ByteString +compilerOutputBytesForSuppressibleChunks outputStream = + B.concat + . map (fst . suppressibleCapturedCompilerOutputChunk) + . filter + ( \capturedChunk -> + maybe False ((== outputStream) . fst) $ + case suppressibleCapturedCompilerOutputKeys capturedChunk of + outputKey:_ -> Just outputKey + [] -> Nothing + ) + +readCompilerProcessWithExitCodeProbeSuppressingWarnings + :: CompilerCommand + -> [String] + -> IO (ExitCode, String, String) +readCompilerProcessWithExitCodeProbeSuppressingWarnings = + readCompilerProcessWithExitCodeProbeSuppressingWarningsUntil (\_ _ -> pure False) + +readCompilerProcessWithExitCodeProbeSuppressingWarningsUntil + :: (IO [CapturedCompilerOutputChunk] -> CompilerPostExitReadiness) + -> CompilerCommand + -> [String] + -> IO (ExitCode, String, String) +readCompilerProcessWithExitCodeProbeSuppressingWarningsUntil postExitDrainSatisfied compiler extraArgs = do + (exitCode, capturedChunks) <- + readCompilerProcessWithExitCodeChunksUntil + postExitDrainSatisfied + CreatePipe + compiler + extraArgs + let retainedChunks = + filterCompilerOutputChunks + (fst . suppressibleCapturedCompilerOutputChunk) + (snd . suppressibleCapturedCompilerOutputChunk) + (groupCapturedCompilerOutputChunksForWarningSuppression capturedChunks) + pure + ( exitCode + , BC.unpack $ + compilerOutputBytesForSuppressibleChunks CompilerStdout retainedChunks + , BC.unpack $ + compilerOutputBytesForSuppressibleChunks CompilerStderr retainedChunks + ) + +readCompilerProcessWithExitCodeProbe + :: Bool + -> CompilerCommand + -> [String] + -> IO (ExitCode, String, String) +readCompilerProcessWithExitCodeProbe = + readCompilerProcessWithExitCodeProbeUntil (\_ _ -> pure False) + +readCompilerProcessWithExitCodeProbeUntil + :: (IO [CapturedCompilerOutputChunk] -> CompilerPostExitReadiness) + -> Bool + -> CompilerCommand + -> [String] + -> IO (ExitCode, String, String) +readCompilerProcessWithExitCodeProbeUntil postExitDrainSatisfied suppressWarnsOutput compiler extraArgs + | suppressWarnsOutput = + readCompilerProcessWithExitCodeProbeSuppressingWarningsUntil + postExitDrainSatisfied + compiler + extraArgs + | otherwise = do + (exitCode, stdoutBytes, stderrBytes) <- + readCompilerProcessWithExitCodeBytesUntil + postExitDrainSatisfied + compiler + extraArgs + pure (exitCode, BC.unpack stdoutBytes, BC.unpack stderrBytes) + +callCompilerProcess :: Bool -> CompilerCommand -> [String] -> IO () +callCompilerProcess = + callCompilerProcessUntil compilerPostExitNotReady + +callCompilerProcessUntil :: CompilerPostExitReadiness -> Bool -> CompilerCommand -> [String] -> IO () +callCompilerProcessUntil postExitDrainSatisfied suppressWarnsOutput compiler extraArgs + | suppressWarnsOutput = do + (exitCode, _) <- + bracket (hDuplicate stdout) hClose $ \stdoutHandle -> + bracket (hDuplicate stderr) hClose $ \stderrHandle -> do + hSetBinaryMode stdoutHandle True + hSetBinaryMode stderrHandle True + (exitCode', suppressionState') <- + foldCompilerProcessWithExitCodeChunksUntil + postExitDrainSatisfied + Inherit + compiler + extraArgs + emptyIncrementalCompilerWarningSuppressionState + (processIncrementalCompilerWarningSuppressionChunk stdoutHandle stderrHandle) + finalizedSuppressionState <- + finalizeIncrementalCompilerWarningSuppression + stdoutHandle + stderrHandle + suppressionState' + pure (exitCode', finalizedSuppressionState) + handleCompilerProcessExit compiler extraArgs exitCode + | otherwise = do + (exitCode, _) <- + bracket (hDuplicate stdout) hClose $ \stdoutHandle -> + bracket (hDuplicate stderr) hClose $ \stderrHandle -> do + hSetBinaryMode stdoutHandle True + hSetBinaryMode stderrHandle True + foldCompilerProcessWithExitCodeChunksUntil + postExitDrainSatisfied + Inherit + compiler + extraArgs + () + (replayCapturedCompilerOutputChunk stdoutHandle stderrHandle) + handleCompilerProcessExit compiler extraArgs exitCode + +handleCompilerProcessExit :: CompilerCommand -> [String] -> ExitCode -> IO () +handleCompilerProcessExit compiler extraArgs = \case + ExitSuccess -> pure () + exitCode -> + ioError . userError $ + showCompilerCommandForUser compiler extraArgs + <> " failed with " + <> show exitCode + +replayCapturedCompilerOutputBytes :: Handle -> B.ByteString -> IO () +replayCapturedCompilerOutputBytes destination bytes + | B.null bytes = pure () + | otherwise = do + B.hPut destination bytes + hFlush destination + +withReadableFile :: FilePath -> FileMode -> IO a -> IO a +withReadableFile path originalMode action + | intersectFileModes originalMode ownerReadMode /= 0 = action + | otherwise = do + setFileMode path readableMode + action `finally` setFileMode path originalMode + where + readableMode = originalMode `unionFileModes` ownerReadMode + +shouldValidateRunnableLinkedOutput :: FilePath -> IO Bool +shouldValidateRunnableLinkedOutput path = + catchIOError + (do + status <- getSymbolicLinkStatus path + if isRegularFile status + then pure True + else + if isAllowedDirectRunnableLinkedOutput path status + then pure False + else rejectUnsupportedRunnableLinkedOutputPath path + ) + (\ioErr -> if isDoesNotExistError ioErr then pure True else ioError ioErr) + +isAllowedDirectRunnableLinkedOutput :: FilePath -> FileStatus -> Bool +isAllowedDirectRunnableLinkedOutput path status = + isCharacterDevice status && normalise path == "/dev/null" + +rejectUnsupportedRunnableLinkedOutputPath :: FilePath -> IO a +rejectUnsupportedRunnableLinkedOutputPath path = + ioError . userError $ + "unsupported -r output path type: " <> path + +validateRunnableLinkedOutput :: FilePath -> Maybe String -> IO Bool +validateRunnableLinkedOutput path maybeProbeMarker = + catchIOError + (do + status <- getSymbolicLinkStatus path + if isRegularFile status + then do + let originalMode = fileMode status + hasExecuteBits = + intersectFileModes originalMode executableFileMode /= 0 + if not hasExecuteBits + then pure False + else do + withReadableFile path originalMode $ + validateReadableRunnableLinkedOutput path maybeProbeMarker + else pure False + ) + ( \ioErr -> + if linkedOutputValidationCanFailClosed ioErr + then pure False + else ioError ioErr + ) + +linkedOutputValidationCanFailClosed :: IOError -> Bool +linkedOutputValidationCanFailClosed ioErr = + isDoesNotExistError ioErr + || isEOFError ioErr + || ioeGetErrorType ioErr + `elem` [ IllegalOperation + , InappropriateType + , InvalidArgument + , NoSuchThing + , PermissionDenied + , ResourceExhausted + , ResourceVanished + ] + +validateReadableRunnableLinkedOutput :: FilePath -> Maybe String -> IO Bool +validateReadableRunnableLinkedOutput path maybeProbeMarker = + bracket openLinkedOutput closeFd readLinkedOutput + where + openLinkedOutput = + openFd path ReadOnly defaultFileFlags {nofollow = True, cloexec = True, nonBlock = True} + + readLinkedOutput fd = + isJust <$> readStableRunnableLinkedOutputBytes fd maybeProbeMarker + +copyValidatedRunnableLinkedOutput :: FilePath -> FilePath -> Maybe String -> IO Bool +copyValidatedRunnableLinkedOutput sourcePath destinationPath maybeProbeMarker = do + maybeSnapshot <- readValidatedRunnableLinkedOutputSnapshot sourcePath maybeProbeMarker + case maybeSnapshot of + Just (bytes, sourceMode) -> writeLinkedOutputBytes destinationPath sourceMode bytes + Nothing -> pure False + +readValidatedRunnableLinkedOutputSnapshot :: FilePath -> Maybe String -> IO (Maybe (B.ByteString, FileMode)) +readValidatedRunnableLinkedOutputSnapshot sourcePath maybeProbeMarker = + catchIOError + (bracket openLinkedOutput closeFd readLinkedOutput) + ( \ioErr -> + if linkedOutputValidationCanFailClosed ioErr + then pure Nothing + else ioError ioErr + ) + where + openLinkedOutput = + openFd sourcePath ReadOnly defaultFileFlags {nofollow = True, cloexec = True, nonBlock = True} + + readLinkedOutput fd = + readStableRunnableLinkedOutputBytes fd maybeProbeMarker + +ensureRunnableLinkedOutputCopied :: FilePath -> FilePath -> Maybe String -> IO Bool +ensureRunnableLinkedOutputCopied sourcePath destinationPath maybeProbeMarker = do + copied <- copyValidatedRunnableLinkedOutput sourcePath destinationPath maybeProbeMarker + if copied + then validateRunnableLinkedOutput destinationPath maybeProbeMarker + else pure False + +readStableRunnableLinkedOutputBytes :: Fd -> Maybe String -> IO (Maybe (B.ByteString, FileMode)) +readStableRunnableLinkedOutputBytes fd maybeProbeMarker = do + statusBefore <- getFdStatus fd + let maybeOutputSize = fileStatusSizeWord64 statusBefore + hasExecuteBits = + intersectFileModes (fileMode statusBefore) executableFileMode /= 0 + case maybeOutputSize of + Just outputSize + | isRegularFile statusBefore + && hasExecuteBits + && outputSize <= maximumRunnableLinkedOutputValidationBytes -> do + maybeBytes <- readCompilerObjectBytesAt fd 0 outputSize + statusAfter <- getFdStatus fd + pure $ do + bytes <- maybeBytes + if compilerObjectOutputStatusFingerprint statusBefore + == compilerObjectOutputStatusFingerprint statusAfter + && validRunnableLinkedOutputBytes maybeProbeMarker bytes + then Just (bytes, fileMode statusBefore) + else Nothing + _ -> pure Nothing + +validRunnableLinkedOutputBytes :: Maybe String -> B.ByteString -> Bool +validRunnableLinkedOutputBytes maybeProbeMarker bytes = + let markerPresent = maybe + True + (`probeMarkerPresent` bytes) + maybeProbeMarker + linkedOutputOk = + maybe + False + hasRunnableLinkedElfProgramHeadersAndInterpreter + (parseLinkedOutputElf bytes) + in markerPresent && linkedOutputOk + +writeLinkedOutputBytes :: FilePath -> FileMode -> B.ByteString -> IO Bool +writeLinkedOutputBytes destinationPath sourceMode bytes = + bracket openLinkedOutputDestination closeFd $ \destinationFd -> do + destinationStatus <- getFdStatus destinationFd + if isRegularFile destinationStatus && linkCount destinationStatus == 1 + then do + setFdSize destinationFd 0 + _ <- fdSeek destinationFd AbsoluteSeek 0 + written <- writeLinkedOutputFdBytes destinationFd bytes + when written $ + setFdMode destinationFd $ + fileMode destinationStatus + `unionFileModes` intersectFileModes sourceMode executableFileMode + pure written + else pure False + where + openLinkedOutputDestination = + openFd + destinationPath + WriteOnly + defaultFileFlags {nofollow = True, cloexec = True, nonBlock = True} + +writeLinkedOutputFdBytes :: Fd -> B.ByteString -> IO Bool +writeLinkedOutputFdBytes destinationFd = go + where + go remainingBytes + | B.null remainingBytes = pure True + | otherwise = do + writtenBytes <- PB.fdWrite destinationFd remainingBytes + let writtenByteCount = fromIntegral writtenBytes + if writtenByteCount <= 0 + || writtenByteCount > B.length remainingBytes + then pure False + else go $ B.drop writtenByteCount remainingBytes + +withRunnableLinkOutputPath :: Bool -> FilePath -> (FilePath -> IO a) -> IO a +withRunnableLinkOutputPath False outputPath action = action outputPath +withRunnableLinkOutputPath True outputPath action = do + tmpDir <- getTemporaryDirectory + (linkOutputPath, linkOutputHandle) <- openTempFile tmpDir "htcc-link-output-.out" + hSetBinaryMode linkOutputHandle True + finally + (do + outputMode <- fileMode <$> getFileStatus outputPath + setFileMode linkOutputPath outputMode + hClose linkOutputHandle + action linkOutputPath) + ( ignoreIOException (hClose linkOutputHandle) + *> ignoreIOException (removeFile linkOutputPath) + ) + +probeMarkerPresent :: String -> B.ByteString -> Bool +probeMarkerPresent probeMarker = + B.isInfixOf $ B.pack $ map (fromIntegral . fromEnum) probeMarker + +data LinkedOutputElf = LinkedOutputElf + { linkedOutputElfBytes :: B.ByteString + , linkedOutputElfDataEncoding :: Word8 + , linkedOutputElfOsAbi :: Word8 + , linkedOutputElfType :: Int + , linkedOutputElfFileSize :: Word64 + , linkedOutputElfEntryPoint :: Word64 + , linkedOutputElfProgramHeaderOffset :: Word64 + , linkedOutputElfProgramHeaderEntrySize :: Int + , linkedOutputElfProgramHeaderCount :: Int + } + +data RelocatableElfObjectHeader = RelocatableElfObjectHeader + { relocatableElfData :: !Word8 + , relocatableElfProgramHeaderOffset :: !Word64 + , relocatableElfSectionHeaderOffset :: !Word64 + , relocatableElfProgramHeaderEntrySize :: !Int + , relocatableElfProgramHeaderCount :: !Word64 + , relocatableElfSectionHeaderEntrySize :: !Int + , relocatableElfSectionHeaderCount :: !Word64 + , relocatableElfSectionHeaderStringIndex :: !Word64 + } + +data RelocatableElfSectionHeader = RelocatableElfSectionHeader + { relocatableElfSectionType :: !Int + , relocatableElfSectionOffset :: !Word64 + , relocatableElfSectionSize :: !Word64 + , relocatableElfSectionLink :: !Word64 + , relocatableElfSectionInfo :: !Word64 + } + +data CompilerObjectContentDigest = CompilerObjectContentDigest + !Word64 + !Word64 + !Word64 + deriving (Eq) + +relocatableElfObjectContentFingerprint :: Fd -> Word64 -> IO (Maybe CompilerObjectContentDigest) +relocatableElfObjectContentFingerprint fd objectFileSize = do + maybeHeaderBytes <- readCompilerObjectBytesAt fd 0 $ fromIntegral elfHeaderSize + case maybeHeaderBytes >>= parseRelocatableElfObjectHeader of + Nothing -> pure Nothing + Just header -> do + maybeSectionHeader0 <- readRelocatableElfSectionHeader0 fd objectFileSize header + let maybeSectionCount = + resolvedRelocatableElfSectionHeaderCount + header + maybeSectionHeader0 + maybeProgramHeaderCount = + resolvedRelocatableElfProgramHeaderCount + header + maybeSectionHeader0 + maybeSectionHeaderStringIndex = + resolvedRelocatableElfSectionHeaderStringIndex + header + maybeSectionHeader0 + case (maybeHeaderBytes, maybeSectionCount, maybeProgramHeaderCount, maybeSectionHeaderStringIndex) of + (Just headerBytes, Just sectionCount, Just programHeaderCount, Just sectionHeaderStringIndex) + | relocatableElfObjectTablesFit + objectFileSize + header + sectionCount + programHeaderCount + sectionHeaderStringIndex + && relocatableElfObjectReadinessLimitsFit + objectFileSize + sectionCount + programHeaderCount -> do + let initialDigest = + digestCompilerObjectLoadedBytes + emptyCompilerObjectContentDigest + 0 + (fromIntegral elfHeaderSize) + headerBytes + maybeProgramHeaderDigest <- + case initialDigest of + Just digest + | programHeaderCount == 0 -> pure $ Just digest + | otherwise -> + case tableSizeWord64 + (relocatableElfProgramHeaderEntrySize header) + programHeaderCount of + Just programHeaderTableSize -> + digestCompilerObjectRangeAt + fd + digest + (relocatableElfProgramHeaderOffset header) + programHeaderTableSize + Nothing -> pure Nothing + Nothing -> pure Nothing + case maybeProgramHeaderDigest of + Just digest -> + digestRelocatableElfSectionTableAndContents + fd + objectFileSize + header + sectionCount + digest + Nothing -> pure Nothing + _ -> pure Nothing + +parseRelocatableElfObjectHeader :: B.ByteString -> Maybe RelocatableElfObjectHeader +parseRelocatableElfObjectHeader bytes + | B.length bytes /= elfHeaderSize = Nothing + | B.take 4 bytes /= elfMagic = Nothing + | elfClass /= elfClass64Bit = Nothing + | elfData `notElem` [elfDataLittleEndian, elfDataBigEndian] = Nothing + | elfIdentVersion /= elfCurrentVersion = Nothing + | elfType /= elfTypeRelocatable = Nothing + | elfMachine /= elfMachineX86_64 = Nothing + | elfVersion /= fromIntegral elfCurrentVersion = Nothing + | elfHeaderByteSize /= elfHeaderSize = Nothing + | otherwise = + Just RelocatableElfObjectHeader + { relocatableElfData = elfData + , relocatableElfProgramHeaderOffset = elfProgramHeaderOffset + , relocatableElfSectionHeaderOffset = elfSectionHeaderOffset + , relocatableElfProgramHeaderEntrySize = elfProgramHeaderEntrySize + , relocatableElfProgramHeaderCount = elfProgramHeaderCount + , relocatableElfSectionHeaderEntrySize = elfSectionHeaderEntrySize + , relocatableElfSectionHeaderCount = elfSectionHeaderCount + , relocatableElfSectionHeaderStringIndex = elfSectionHeaderStringIndex + } + where + elfClass = B.index bytes 4 + elfData = B.index bytes 5 + elfIdentVersion = B.index bytes 6 + elfType = decodeElfHalfWord elfData (B.index bytes 16) (B.index bytes 17) :: Int + elfMachine = decodeElfHalfWord elfData (B.index bytes 18) (B.index bytes 19) :: Int + elfVersion = + decodeElfWord32 + elfData + [ B.index bytes 20 + , B.index bytes 21 + , B.index bytes 22 + , B.index bytes 23 + ] :: + Int + elfProgramHeaderOffset = + decodeElfWord64 + elfData + [ B.index bytes 32 + , B.index bytes 33 + , B.index bytes 34 + , B.index bytes 35 + , B.index bytes 36 + , B.index bytes 37 + , B.index bytes 38 + , B.index bytes 39 + ] + elfSectionHeaderOffset = + decodeElfWord64 + elfData + [ B.index bytes 40 + , B.index bytes 41 + , B.index bytes 42 + , B.index bytes 43 + , B.index bytes 44 + , B.index bytes 45 + , B.index bytes 46 + , B.index bytes 47 + ] + elfHeaderByteSize = decodeElfHalfWord elfData (B.index bytes 52) (B.index bytes 53) :: Int + elfProgramHeaderEntrySize = + decodeElfHalfWord elfData (B.index bytes 54) (B.index bytes 55) :: Int + elfProgramHeaderCount = + decodeElfHalfWord elfData (B.index bytes 56) (B.index bytes 57) :: Word64 + elfSectionHeaderEntrySize = + decodeElfHalfWord elfData (B.index bytes 58) (B.index bytes 59) :: Int + elfSectionHeaderCount = + decodeElfHalfWord elfData (B.index bytes 60) (B.index bytes 61) :: Word64 + elfSectionHeaderStringIndex = + decodeElfHalfWord elfData (B.index bytes 62) (B.index bytes 63) :: Word64 + +readRelocatableElfSectionHeader0 + :: Fd + -> Word64 + -> RelocatableElfObjectHeader + -> IO (Maybe RelocatableElfSectionHeader) +readRelocatableElfSectionHeader0 fd objectFileSize header + | relocatableElfSectionHeaderEntrySize header < elfSectionHeaderSize = pure Nothing + | not $ + rangeWithinFile + (relocatableElfSectionHeaderOffset header) + (fromIntegral elfSectionHeaderSize) + objectFileSize = pure Nothing + | otherwise = do + maybeSectionHeaderBytes <- + readCompilerObjectBytesAt + fd + (relocatableElfSectionHeaderOffset header) + (fromIntegral elfSectionHeaderSize) + pure $ maybeSectionHeaderBytes >>= parseRelocatableElfSectionHeader (relocatableElfData header) + +resolvedRelocatableElfSectionHeaderCount + :: RelocatableElfObjectHeader -> Maybe RelocatableElfSectionHeader -> Maybe Word64 +resolvedRelocatableElfSectionHeaderCount header maybeSectionHeader0 = + case relocatableElfSectionHeaderCount header of + 0 -> relocatableElfSectionSize <$> maybeSectionHeader0 + sectionCount -> Just sectionCount + +resolvedRelocatableElfProgramHeaderCount + :: RelocatableElfObjectHeader -> Maybe RelocatableElfSectionHeader -> Maybe Word64 +resolvedRelocatableElfProgramHeaderCount header maybeSectionHeader0 = + case relocatableElfProgramHeaderCount header of + count + | count == elfProgramHeaderCountExtended -> + relocatableElfSectionInfo <$> maybeSectionHeader0 + | otherwise -> Just count + +resolvedRelocatableElfSectionHeaderStringIndex + :: RelocatableElfObjectHeader -> Maybe RelocatableElfSectionHeader -> Maybe Word64 +resolvedRelocatableElfSectionHeaderStringIndex header maybeSectionHeader0 = + case relocatableElfSectionHeaderStringIndex header of + index + | index == elfSectionHeaderIndexExtended -> do + sectionHeader0 <- maybeSectionHeader0 + Just $ relocatableElfSectionLink sectionHeader0 + | otherwise -> Just index + +relocatableElfObjectTablesFit + :: Word64 + -> RelocatableElfObjectHeader + -> Word64 + -> Word64 + -> Word64 + -> Bool +relocatableElfObjectTablesFit objectFileSize header sectionCount programHeaderCount sectionHeaderStringIndex = + sectionCount > 0 + && programHeaderTableFits + && sectionHeaderTableFits + && sectionHeaderStringIndexValid + where + programHeaderTableFits = + programHeaderCount == 0 + || relocatableElfProgramHeaderEntrySize header >= elfProgramHeaderSize + && maybe + False + ( \programHeaderTableSize -> + rangeWithinFile + (relocatableElfProgramHeaderOffset header) + programHeaderTableSize + objectFileSize + ) + (tableSizeWord64 (relocatableElfProgramHeaderEntrySize header) programHeaderCount) + sectionHeaderTableFits = + relocatableElfSectionHeaderEntrySize header >= elfSectionHeaderSize + && maybe + False + ( \sectionHeaderTableSize -> + rangeWithinFile + (relocatableElfSectionHeaderOffset header) + sectionHeaderTableSize + objectFileSize + ) + (tableSizeWord64 (relocatableElfSectionHeaderEntrySize header) sectionCount) + sectionHeaderStringIndexValid = + sectionHeaderStringIndex == elfSectionHeaderIndexUndefined + || sectionHeaderStringIndex < sectionCount + +tableSizeWord64 :: Int -> Word64 -> Maybe Word64 +tableSizeWord64 entrySize entryCount = + let tableSize = fromIntegral entrySize * fromIntegral entryCount :: Integer + in if entrySize >= 0 && tableSize <= fromIntegral (maxBound :: Word64) + then Just $ fromInteger tableSize + else Nothing + +relocatableElfObjectReadinessLimitsFit :: Word64 -> Word64 -> Word64 -> Bool +relocatableElfObjectReadinessLimitsFit objectFileSize sectionCount programHeaderCount = + objectFileSize <= maximumStableCompilerObjectOutputBytes + && sectionCount <= maximumStableCompilerObjectSectionHeaderCount + && programHeaderCount <= maximumStableCompilerObjectProgramHeaderCount + +tableEntryOffsetWord64 :: Word64 -> Int -> Word64 -> Maybe Word64 +tableEntryOffsetWord64 tableOffset entrySize entryIndex = + let entryOffset = + fromIntegral tableOffset + + fromIntegral entrySize * fromIntegral entryIndex :: + Integer + in if entrySize >= 0 && entryOffset <= fromIntegral (maxBound :: Word64) + then Just $ fromInteger entryOffset + else Nothing + +digestRelocatableElfSectionTableAndContents + :: Fd + -> Word64 + -> RelocatableElfObjectHeader + -> Word64 + -> CompilerObjectContentDigest + -> IO (Maybe CompilerObjectContentDigest) +digestRelocatableElfSectionTableAndContents fd objectFileSize header sectionCount = + go 0 + where + go sectionIndex digest + | sectionIndex >= sectionCount = pure $ Just digest + | otherwise = do + maybeDigest <- + digestRelocatableElfSectionHeaderEntry + fd + objectFileSize + header + sectionIndex + digest + maybe (pure Nothing) (go $ sectionIndex + 1) maybeDigest + +digestRelocatableElfSectionHeaderEntry + :: Fd + -> Word64 + -> RelocatableElfObjectHeader + -> Word64 + -> CompilerObjectContentDigest + -> IO (Maybe CompilerObjectContentDigest) +digestRelocatableElfSectionHeaderEntry fd objectFileSize header sectionIndex digest = + case tableEntryOffsetWord64 + (relocatableElfSectionHeaderOffset header) + (relocatableElfSectionHeaderEntrySize header) + sectionIndex of + Just sectionHeaderOffset -> do + let sectionHeaderEntrySize = + fromIntegral $ relocatableElfSectionHeaderEntrySize header + if compilerObjectDigestRangeFits digest sectionHeaderEntrySize + then do + maybeSectionHeaderBytes <- + readCompilerObjectBytesAt fd sectionHeaderOffset sectionHeaderEntrySize + case maybeSectionHeaderBytes of + Just sectionHeaderBytes -> do + let maybeEntryDigest = + digestCompilerObjectLoadedBytes + digest + sectionHeaderOffset + sectionHeaderEntrySize + sectionHeaderBytes + maybeSectionHeader = + parseRelocatableElfSectionHeader + (relocatableElfData header) + (B.take elfSectionHeaderSize sectionHeaderBytes) + case (maybeEntryDigest, maybeSectionHeader) of + (Just entryDigest, Just sectionHeader) -> + digestRelocatableElfSectionBytes + fd + objectFileSize + sectionIndex + sectionHeader + entryDigest + _ -> pure Nothing + Nothing -> pure Nothing + else pure Nothing + Nothing -> pure Nothing + +digestRelocatableElfSectionBytes + :: Fd + -> Word64 + -> Word64 + -> RelocatableElfSectionHeader + -> CompilerObjectContentDigest + -> IO (Maybe CompilerObjectContentDigest) +digestRelocatableElfSectionBytes fd objectFileSize sectionIndex sectionHeader digest + | sectionIndex == elfSectionHeaderIndexUndefined = + pure $ + if relocatableElfSectionType sectionHeader == elfSectionTypeNull + then Just digest + else Nothing + | relocatableElfSectionType sectionHeader == elfSectionTypeNoBits = + pure $ Just digest + | relocatableElfSectionSize sectionHeader == 0 = + pure $ Just digest + | not $ + rangeWithinFile + (relocatableElfSectionOffset sectionHeader) + (relocatableElfSectionSize sectionHeader) + objectFileSize = pure Nothing + | otherwise = + digestCompilerObjectRangeAt + fd + digest + (relocatableElfSectionOffset sectionHeader) + (relocatableElfSectionSize sectionHeader) + +parseRelocatableElfSectionHeader :: Word8 -> B.ByteString -> Maybe RelocatableElfSectionHeader +parseRelocatableElfSectionHeader elfData bytes + | B.length bytes /= elfSectionHeaderSize = Nothing + | otherwise = + Just RelocatableElfSectionHeader + { relocatableElfSectionType = + decodeElfWord32 + elfData + [ B.index bytes 4 + , B.index bytes 5 + , B.index bytes 6 + , B.index bytes 7 + ] + , relocatableElfSectionOffset = + decodeElfWord64 + elfData + [ B.index bytes 24 + , B.index bytes 25 + , B.index bytes 26 + , B.index bytes 27 + , B.index bytes 28 + , B.index bytes 29 + , B.index bytes 30 + , B.index bytes 31 + ] + , relocatableElfSectionSize = + decodeElfWord64 + elfData + [ B.index bytes 32 + , B.index bytes 33 + , B.index bytes 34 + , B.index bytes 35 + , B.index bytes 36 + , B.index bytes 37 + , B.index bytes 38 + , B.index bytes 39 + ] + , relocatableElfSectionLink = + decodeElfWord32 + elfData + [ B.index bytes 40 + , B.index bytes 41 + , B.index bytes 42 + , B.index bytes 43 + ] + , relocatableElfSectionInfo = + decodeElfWord32 + elfData + [ B.index bytes 44 + , B.index bytes 45 + , B.index bytes 46 + , B.index bytes 47 + ] + } + +emptyCompilerObjectContentDigest :: CompilerObjectContentDigest +emptyCompilerObjectContentDigest = + CompilerObjectContentDigest + 0 + compilerObjectDigestFnvOffset + compilerObjectDigestMixSeed + +digestCompilerObjectLoadedBytes + :: CompilerObjectContentDigest + -> Word64 + -> Word64 + -> B.ByteString + -> Maybe CompilerObjectContentDigest +digestCompilerObjectLoadedBytes digest offset byteCount bytes + | fromIntegral (B.length bytes) == byteCount + && compilerObjectDigestRangeFits digest byteCount = + Just $ + digestCompilerObjectBytes + (digestCompilerObjectRangeDescriptor digest offset byteCount) + bytes + | otherwise = Nothing + +digestCompilerObjectRangeAt + :: Fd + -> CompilerObjectContentDigest + -> Word64 + -> Word64 + -> IO (Maybe CompilerObjectContentDigest) +digestCompilerObjectRangeAt fd digest offset byteCount = + if compilerObjectDigestRangeFits digest byteCount + then + case word64ToInt offset of + Just offsetInt -> do + _ <- fdSeek fd AbsoluteSeek $ fromIntegral offsetInt + go + (digestCompilerObjectRangeDescriptor digest offset byteCount) + byteCount + Nothing -> pure Nothing + else pure Nothing + where + go currentDigest remainingBytes + | remainingBytes == 0 = pure $ Just currentDigest + | otherwise = do + let chunkSize = + min remainingBytes compilerObjectDigestChunkBytes + case word64ToInt chunkSize of + Just chunkSizeInt -> do + bytes <- readCompilerObjectExactBytes fd chunkSizeInt + if B.length bytes == chunkSizeInt + then + go + (digestCompilerObjectBytes currentDigest bytes) + (remainingBytes - chunkSize) + else pure Nothing + Nothing -> pure Nothing + +digestCompilerObjectRangeDescriptor + :: CompilerObjectContentDigest -> Word64 -> Word64 -> CompilerObjectContentDigest +digestCompilerObjectRangeDescriptor digest offset = + digestCompilerObjectWord64 + (digestCompilerObjectWord64 + (digestCompilerObjectBytes digest $ B.singleton compilerObjectDigestRangeMarker) + offset) + +compilerObjectDigestRangeFits :: CompilerObjectContentDigest -> Word64 -> Bool +compilerObjectDigestRangeFits digest byteCount = + requiredDigestBytes + <= fromIntegral maximumStableCompilerObjectDigestBytes + where + requiredDigestBytes = + fromIntegral (compilerObjectDigestByteCount digest) + + fromIntegral compilerObjectDigestRangeDescriptorBytes + + fromIntegral byteCount :: + Integer + +compilerObjectDigestByteCount :: CompilerObjectContentDigest -> Word64 +compilerObjectDigestByteCount (CompilerObjectContentDigest byteCount _ _) = + byteCount + +digestCompilerObjectWord64 :: CompilerObjectContentDigest -> Word64 -> CompilerObjectContentDigest +digestCompilerObjectWord64 digest value = + digestCompilerObjectBytes + digest + $ B.pack + [ fromIntegral $ value `shiftR` 56 + , fromIntegral $ value `shiftR` 48 + , fromIntegral $ value `shiftR` 40 + , fromIntegral $ value `shiftR` 32 + , fromIntegral $ value `shiftR` 24 + , fromIntegral $ value `shiftR` 16 + , fromIntegral $ value `shiftR` 8 + , fromIntegral value + ] + +digestCompilerObjectBytes :: CompilerObjectContentDigest -> B.ByteString -> CompilerObjectContentDigest +digestCompilerObjectBytes = + B.foldl' digestCompilerObjectByte + +digestCompilerObjectByte :: CompilerObjectContentDigest -> Word8 -> CompilerObjectContentDigest +digestCompilerObjectByte (CompilerObjectContentDigest byteCount fnvHash mixHash) byte = + let byteValue = fromIntegral byte + in CompilerObjectContentDigest + (byteCount + 1) + ((fnvHash `xor` byteValue) * compilerObjectDigestFnvPrime) + ((mixHash + byteValue + compilerObjectDigestMixIncrement) + * compilerObjectDigestMixPrime) + +readCompilerObjectBytesAt :: Fd -> Word64 -> Word64 -> IO (Maybe B.ByteString) +readCompilerObjectBytesAt fd offset byteCount = do + case (word64ToInt offset, word64ToInt byteCount) of + (Just offsetInt, Just byteCountInt) -> do + _ <- fdSeek fd AbsoluteSeek $ fromIntegral offsetInt + bytes <- readCompilerObjectExactBytes fd byteCountInt + pure $ + if B.length bytes == byteCountInt + then Just bytes + else Nothing + _ -> pure Nothing + +readCompilerObjectExactBytes :: Fd -> Int -> IO B.ByteString +readCompilerObjectExactBytes fd = go [] + where + go acc remainingBytes + | remainingBytes <= 0 = pure $ B.concat $ reverse acc + | otherwise = do + bytes <- PB.fdRead fd $ fromIntegral remainingBytes + if B.null bytes + then pure $ B.concat $ reverse acc + else go (bytes : acc) (remainingBytes - B.length bytes) + +word64ToInt :: Word64 -> Maybe Int +word64ToInt value + | value <= fromIntegral (maxBound :: Int) = Just $ fromIntegral value + | otherwise = Nothing + +looksRunnableLinkedOutput :: B.ByteString -> Bool +looksRunnableLinkedOutput bytes = + maybe False hasRunnableLinkedElfProgramHeadersAndInterpreter $ + parseLinkedOutputElf bytes + +linkedOutputElfHasStandaloneInterpreterDynamicSection :: LinkedOutputElf -> Bool +linkedOutputElfHasStandaloneInterpreterDynamicSection elf = + any hasStandaloneInterpreterDynamicSection [0 .. linkedOutputElfProgramHeaderCount elf - 1] + where + hasStandaloneInterpreterDynamicSection headerIndex = + let headerOffset = linkedOutputElfProgramHeaderEntryOffset elf headerIndex + in linkedOutputElfHasValidProgramHeaderBounds elf headerOffset + && linkedOutputElfProgramHeaderType elf headerOffset == elfProgramHeaderTypeDynamic + && maybe + False + (uncurry $ linkedOutputElfDynamicEntriesDescribeStandaloneInterpreter elf) + (linkedOutputElfProgramHeaderFileRange elf headerOffset) + +linkedOutputElfDynamicEntriesDescribeStandaloneInterpreter + :: LinkedOutputElf + -> Word64 + -> Word64 + -> Bool +linkedOutputElfDynamicEntriesDescribeStandaloneInterpreter elf dynamicOffset dynamicSize + | dynamicSize < fromIntegral elfDynamicEntrySize = False + | dynamicSize `mod` fromIntegral elfDynamicEntrySize /= 0 = False + | otherwise = + let dynamicEnd = dynamicOffset + dynamicSize + go entryOffset + | entryOffset >= dynamicEnd = False + | otherwise = + let entryTag = linkedOutputElfDynamicEntryTag elf entryOffset + in entryTag == elfDynamicTagNull + || entryTag /= elfDynamicTagNeeded + && go (entryOffset + fromIntegral elfDynamicEntrySize) + in go dynamicOffset + +parseLinkedOutputElf :: B.ByteString -> Maybe LinkedOutputElf +parseLinkedOutputElf bytes + | B.length bytes < elfHeaderSize = Nothing + | B.take 4 bytes /= elfMagic = Nothing + | elfClass /= elfClass64Bit = Nothing + | elfData `notElem` [elfDataLittleEndian, elfDataBigEndian] = Nothing + | elfIdentVersion /= elfCurrentVersion = Nothing + | elfMachine /= elfMachineX86_64 = Nothing + | elfType `notElem` [elfTypeExecutable, elfTypeSharedObject] = Nothing + | elfVersion /= fromIntegral elfCurrentVersion = Nothing + | elfHeaderByteSize /= elfHeaderSize = Nothing + | elfProgramHeaderEntrySize < elfProgramHeaderSize = Nothing + | elfProgramHeaderCount == 0 = Nothing + | not (rangeWithinFile elfProgramHeaderOffset elfProgramHeaderTableSize linkedFileSize) = Nothing + | otherwise = + Just LinkedOutputElf + { linkedOutputElfBytes = bytes + , linkedOutputElfDataEncoding = elfData + , linkedOutputElfOsAbi = elfOsAbi + , linkedOutputElfType = elfType + , linkedOutputElfFileSize = linkedFileSize + , linkedOutputElfEntryPoint = elfEntryPoint + , linkedOutputElfProgramHeaderOffset = elfProgramHeaderOffset + , linkedOutputElfProgramHeaderEntrySize = elfProgramHeaderEntrySize + , linkedOutputElfProgramHeaderCount = elfProgramHeaderCount + } + where + linkedFileSize = fromIntegral $ B.length bytes + elfClass = B.index bytes 4 + elfData = B.index bytes 5 + elfIdentVersion = B.index bytes 6 + elfOsAbi = B.index bytes 7 + elfType = decodeElfHalfWord elfData (B.index bytes 16) (B.index bytes 17) :: Int + elfMachine = decodeElfHalfWord elfData (B.index bytes 18) (B.index bytes 19) :: Int + elfVersion = + decodeElfWord32 + elfData + [ B.index bytes 20 + , B.index bytes 21 + , B.index bytes 22 + , B.index bytes 23 + ] :: + Int + elfEntryPoint = + decodeElfWord64 + elfData + [ B.index bytes 24 + , B.index bytes 25 + , B.index bytes 26 + , B.index bytes 27 + , B.index bytes 28 + , B.index bytes 29 + , B.index bytes 30 + , B.index bytes 31 + ] + elfProgramHeaderOffset = + decodeElfWord64 + elfData + [ B.index bytes 32 + , B.index bytes 33 + , B.index bytes 34 + , B.index bytes 35 + , B.index bytes 36 + , B.index bytes 37 + , B.index bytes 38 + , B.index bytes 39 + ] + elfHeaderByteSize = decodeElfHalfWord elfData (B.index bytes 52) (B.index bytes 53) :: Int + elfProgramHeaderEntrySize = + decodeElfHalfWord elfData (B.index bytes 54) (B.index bytes 55) :: Int + elfProgramHeaderCount = + decodeElfHalfWord elfData (B.index bytes 56) (B.index bytes 57) :: Int + elfProgramHeaderTableSize = + fromIntegral elfProgramHeaderEntrySize * fromIntegral elfProgramHeaderCount + +hasRunnableLinkedElfProgramHeadersAndInterpreter :: LinkedOutputElf -> Bool +hasRunnableLinkedElfProgramHeadersAndInterpreter elf = + linkedOutputElfHasRunnableInterpreterLayout elf + && any (linkedOutputElfHasRunnableProgramHeader elf) [0 .. linkedOutputElfProgramHeaderCount elf - 1] + +linkedOutputElfHasRunnableInterpreterLayout :: LinkedOutputElf -> Bool +linkedOutputElfHasRunnableInterpreterLayout elf = + case linkedOutputElfInterpreterPath elf of + Just (Just _) -> + True + Just Nothing -> + ( linkedOutputElfType elf == elfTypeExecutable + && not (linkedOutputElfHasDynamicProgramHeader elf) + ) + || linkedOutputElfHasStandaloneStaticPieLayout elf + Nothing -> + False + +linkedOutputElfHasDynamicProgramHeader :: LinkedOutputElf -> Bool +linkedOutputElfHasDynamicProgramHeader elf = + any hasDynamicProgramHeader [0 .. linkedOutputElfProgramHeaderCount elf - 1] + where + hasDynamicProgramHeader headerIndex = + let headerOffset = linkedOutputElfProgramHeaderEntryOffset elf headerIndex + in linkedOutputElfHasValidProgramHeaderBounds elf headerOffset + && linkedOutputElfProgramHeaderType elf headerOffset == elfProgramHeaderTypeDynamic + +linkedOutputElfInterpreterPath :: LinkedOutputElf -> Maybe (Maybe FilePath) +linkedOutputElfInterpreterPath elf = + case filter isInterpreterProgramHeader [0 .. linkedOutputElfProgramHeaderCount elf - 1] of + [] -> + Just Nothing + [headerIndex] -> do + let headerOffset = linkedOutputElfProgramHeaderEntryOffset elf headerIndex + interpreterOffset = linkedOutputElfProgramHeaderFileOffset elf headerOffset + interpreterSize = linkedOutputElfProgramHeaderFileSize elf headerOffset + if not (linkedOutputElfHasValidProgramHeaderBounds elf headerOffset) + || interpreterSize <= 1 + || interpreterSize > linkedOutputElfProgramHeaderMemorySize elf headerOffset + || not (rangeWithinFile interpreterOffset interpreterSize (linkedOutputElfFileSize elf)) + then Nothing + else do + interpreterBytes <- linkedOutputElfNullTerminatedBytes elf interpreterOffset interpreterSize + let interpreterPath = BC.unpack interpreterBytes + if null interpreterPath || head interpreterPath /= '/' + then Nothing + else Just $ Just interpreterPath + _ -> + Nothing + where + isInterpreterProgramHeader headerIndex = + let headerOffset = linkedOutputElfProgramHeaderEntryOffset elf headerIndex + in linkedOutputElfHasValidProgramHeaderBounds elf headerOffset + && linkedOutputElfProgramHeaderType elf headerOffset == elfProgramHeaderTypeInterp + +linkedOutputElfNullTerminatedBytes + :: LinkedOutputElf + -> Word64 + -> Word64 + -> Maybe B.ByteString +linkedOutputElfNullTerminatedBytes elf start size + | size == 0 = Nothing + | otherwise = + let rawBytes = + B.take (fromIntegral size) $ + B.drop (fromIntegral start) $ + linkedOutputElfBytes elf + in case B.unsnoc rawBytes of + Just (payloadBytes, trailingByte) + | trailingByte == 0 && not (B.null payloadBytes) && B.all (/= 0) payloadBytes -> + Just payloadBytes + _ -> + Nothing + +linkedOutputElfHasStaticPieDynamicFlags :: LinkedOutputElf -> Bool +linkedOutputElfHasStaticPieDynamicFlags elf = + linkedOutputElfType elf == elfTypeSharedObject + && any (linkedOutputElfProgramHeaderHasStaticPieFlag elf) [0 .. linkedOutputElfProgramHeaderCount elf - 1] + +linkedOutputElfHasStandaloneStaticPieLayout :: LinkedOutputElf -> Bool +linkedOutputElfHasStandaloneStaticPieLayout elf = + linkedOutputElfHasStaticPieDynamicFlags elf + && linkedOutputElfHasStandaloneInterpreterDynamicSection elf + +linkedOutputElfProgramHeaderHasStaticPieFlag :: LinkedOutputElf -> Int -> Bool +linkedOutputElfProgramHeaderHasStaticPieFlag elf headerIndex = + let headerOffset = linkedOutputElfProgramHeaderEntryOffset elf headerIndex + in linkedOutputElfHasValidProgramHeaderBounds elf headerOffset + && linkedOutputElfProgramHeaderType elf headerOffset == elfProgramHeaderTypeDynamic + && maybe + False + (uncurry $ linkedOutputElfDynamicEntriesContainStaticPieFlag elf) + (linkedOutputElfProgramHeaderFileRange elf headerOffset) + +linkedOutputElfDynamicEntriesContainStaticPieFlag :: LinkedOutputElf -> Word64 -> Word64 -> Bool +linkedOutputElfDynamicEntriesContainStaticPieFlag elf dynamicOffset dynamicSize + | dynamicSize < fromIntegral elfDynamicEntrySize = False + | dynamicSize `mod` fromIntegral elfDynamicEntrySize /= 0 = False + | otherwise = + let dynamicEnd = dynamicOffset + dynamicSize + go entryOffset + | entryOffset >= dynamicEnd = False + | otherwise = + let entryTag = linkedOutputElfDynamicEntryTag elf entryOffset + entryValue = linkedOutputElfDynamicEntryValue elf entryOffset + in entryTag /= elfDynamicTagNull + && ( ( entryTag == elfDynamicTagFlags1 + && entryValue .&. elfDynamicFlag1Pie /= 0 + ) + || go (entryOffset + fromIntegral elfDynamicEntrySize) + ) + in go dynamicOffset + +linkedOutputElfHasRunnableProgramHeader :: LinkedOutputElf -> Int -> Bool +linkedOutputElfHasRunnableProgramHeader elf headerIndex = + let headerOffset = linkedOutputElfProgramHeaderEntryOffset elf headerIndex + in linkedOutputElfHasValidProgramHeaderBounds elf headerOffset + && linkedOutputElfProgramHeaderType elf headerOffset == elfProgramHeaderTypeLoad + && linkedOutputElfProgramHeaderFileSize elf headerOffset > 0 + && linkedOutputElfProgramHeaderFileSize elf headerOffset + <= linkedOutputElfProgramHeaderMemorySize elf headerOffset + && rangeWithinFile + (linkedOutputElfProgramHeaderFileOffset elf headerOffset) + (linkedOutputElfProgramHeaderFileSize elf headerOffset) + (linkedOutputElfFileSize elf) + && linkedOutputElfProgramHeaderContainsEntryPoint elf headerOffset + && linkedOutputElfProgramHeaderFlags elf headerOffset .&. elfProgramHeaderFlagExecute /= 0 + +linkedOutputElfHasValidProgramHeaderBounds :: LinkedOutputElf -> Word64 -> Bool +linkedOutputElfHasValidProgramHeaderBounds elf headerOffset = + rangeWithinFile headerOffset (fromIntegral elfProgramHeaderSize) (linkedOutputElfFileSize elf) + +linkedOutputElfProgramHeaderFileRange :: LinkedOutputElf -> Word64 -> Maybe (Word64, Word64) +linkedOutputElfProgramHeaderFileRange elf headerOffset = + let fileOffset = linkedOutputElfProgramHeaderFileOffset elf headerOffset + segmentFileSize = linkedOutputElfProgramHeaderFileSize elf headerOffset + memorySize = linkedOutputElfProgramHeaderMemorySize elf headerOffset + in if segmentFileSize == 0 + || segmentFileSize > memorySize + || not (rangeWithinFile fileOffset segmentFileSize (linkedOutputElfFileSize elf)) + then Nothing + else Just (fileOffset, segmentFileSize) + +linkedOutputElfProgramHeaderEntryOffset :: LinkedOutputElf -> Int -> Word64 +linkedOutputElfProgramHeaderEntryOffset elf headerIndex = + linkedOutputElfProgramHeaderOffset elf + + fromIntegral headerIndex * fromIntegral (linkedOutputElfProgramHeaderEntrySize elf) + +linkedOutputElfProgramHeaderType :: LinkedOutputElf -> Word64 -> Int +linkedOutputElfProgramHeaderType elf headerOffset = + decodeElfWord32 + (linkedOutputElfDataEncoding elf) + [ linkedOutputElfByteAt elf headerOffset 0 + , linkedOutputElfByteAt elf headerOffset 1 + , linkedOutputElfByteAt elf headerOffset 2 + , linkedOutputElfByteAt elf headerOffset 3 + ] :: + Int + +linkedOutputElfProgramHeaderFlags :: LinkedOutputElf -> Word64 -> Int +linkedOutputElfProgramHeaderFlags elf headerOffset = + decodeElfWord32 + (linkedOutputElfDataEncoding elf) + [ linkedOutputElfByteAt elf headerOffset 4 + , linkedOutputElfByteAt elf headerOffset 5 + , linkedOutputElfByteAt elf headerOffset 6 + , linkedOutputElfByteAt elf headerOffset 7 + ] :: + Int + +linkedOutputElfProgramHeaderFileOffset :: LinkedOutputElf -> Word64 -> Word64 +linkedOutputElfProgramHeaderFileOffset elf headerOffset = + decodeElfWord64 + (linkedOutputElfDataEncoding elf) + [ linkedOutputElfByteAt elf headerOffset 8 + , linkedOutputElfByteAt elf headerOffset 9 + , linkedOutputElfByteAt elf headerOffset 10 + , linkedOutputElfByteAt elf headerOffset 11 + , linkedOutputElfByteAt elf headerOffset 12 + , linkedOutputElfByteAt elf headerOffset 13 + , linkedOutputElfByteAt elf headerOffset 14 + , linkedOutputElfByteAt elf headerOffset 15 + ] + +linkedOutputElfProgramHeaderVirtualAddress :: LinkedOutputElf -> Word64 -> Word64 +linkedOutputElfProgramHeaderVirtualAddress elf headerOffset = + decodeElfWord64 + (linkedOutputElfDataEncoding elf) + [ linkedOutputElfByteAt elf headerOffset 16 + , linkedOutputElfByteAt elf headerOffset 17 + , linkedOutputElfByteAt elf headerOffset 18 + , linkedOutputElfByteAt elf headerOffset 19 + , linkedOutputElfByteAt elf headerOffset 20 + , linkedOutputElfByteAt elf headerOffset 21 + , linkedOutputElfByteAt elf headerOffset 22 + , linkedOutputElfByteAt elf headerOffset 23 + ] + +linkedOutputElfProgramHeaderFileSize :: LinkedOutputElf -> Word64 -> Word64 +linkedOutputElfProgramHeaderFileSize elf headerOffset = + decodeElfWord64 + (linkedOutputElfDataEncoding elf) + [ linkedOutputElfByteAt elf headerOffset 32 + , linkedOutputElfByteAt elf headerOffset 33 + , linkedOutputElfByteAt elf headerOffset 34 + , linkedOutputElfByteAt elf headerOffset 35 + , linkedOutputElfByteAt elf headerOffset 36 + , linkedOutputElfByteAt elf headerOffset 37 + , linkedOutputElfByteAt elf headerOffset 38 + , linkedOutputElfByteAt elf headerOffset 39 + ] + +linkedOutputElfProgramHeaderMemorySize :: LinkedOutputElf -> Word64 -> Word64 +linkedOutputElfProgramHeaderMemorySize elf headerOffset = + decodeElfWord64 + (linkedOutputElfDataEncoding elf) + [ linkedOutputElfByteAt elf headerOffset 40 + , linkedOutputElfByteAt elf headerOffset 41 + , linkedOutputElfByteAt elf headerOffset 42 + , linkedOutputElfByteAt elf headerOffset 43 + , linkedOutputElfByteAt elf headerOffset 44 + , linkedOutputElfByteAt elf headerOffset 45 + , linkedOutputElfByteAt elf headerOffset 46 + , linkedOutputElfByteAt elf headerOffset 47 + ] + +linkedOutputElfDynamicEntryTag :: LinkedOutputElf -> Word64 -> Word64 +linkedOutputElfDynamicEntryTag elf entryOffset = + decodeElfWord64 + (linkedOutputElfDataEncoding elf) + [ linkedOutputElfByteAt elf entryOffset 0 + , linkedOutputElfByteAt elf entryOffset 1 + , linkedOutputElfByteAt elf entryOffset 2 + , linkedOutputElfByteAt elf entryOffset 3 + , linkedOutputElfByteAt elf entryOffset 4 + , linkedOutputElfByteAt elf entryOffset 5 + , linkedOutputElfByteAt elf entryOffset 6 + , linkedOutputElfByteAt elf entryOffset 7 + ] + +linkedOutputElfDynamicEntryValue :: LinkedOutputElf -> Word64 -> Word64 +linkedOutputElfDynamicEntryValue elf entryOffset = + decodeElfWord64 + (linkedOutputElfDataEncoding elf) + [ linkedOutputElfByteAt elf entryOffset 8 + , linkedOutputElfByteAt elf entryOffset 9 + , linkedOutputElfByteAt elf entryOffset 10 + , linkedOutputElfByteAt elf entryOffset 11 + , linkedOutputElfByteAt elf entryOffset 12 + , linkedOutputElfByteAt elf entryOffset 13 + , linkedOutputElfByteAt elf entryOffset 14 + , linkedOutputElfByteAt elf entryOffset 15 + ] + +linkedOutputElfProgramHeaderContainsEntryPoint :: LinkedOutputElf -> Word64 -> Bool +linkedOutputElfProgramHeaderContainsEntryPoint elf headerOffset = + rangeContainsPoint + (linkedOutputElfProgramHeaderVirtualAddress elf headerOffset) + (linkedOutputElfProgramHeaderMemorySize elf headerOffset) + (linkedOutputElfEntryPoint elf) + +linkedOutputElfByteAt :: LinkedOutputElf -> Word64 -> Int -> Word8 +linkedOutputElfByteAt elf headerOffset relativeOffset = + B.index + (linkedOutputElfBytes elf) + (fromIntegral $ headerOffset + fromIntegral relativeOffset) + +decodeElfHalfWord :: (Bits a, Num a) => Word8 -> Word8 -> Word8 -> a +decodeElfHalfWord elfData byte18 byte19 + = decodeElfUnsigned elfData [byte18, byte19] + +decodeElfWord32 :: (Bits a, Num a) => Word8 -> [Word8] -> a +decodeElfWord32 = decodeElfUnsigned + +decodeElfWord64 :: (Bits a, Num a) => Word8 -> [Word8] -> a +decodeElfWord64 = decodeElfUnsigned + +decodeElfUnsigned :: (Bits a, Num a) => Word8 -> [Word8] -> a +decodeElfUnsigned elfData = + List.foldl' + (\acc nextByte -> acc `shiftL` 8 .|. fromIntegral nextByte) + 0 + . orderedBytes + where + orderedBytes + | elfData == elfDataBigEndian = id + | otherwise = reverse + +rangeWithinFile :: Word64 -> Word64 -> Word64 -> Bool +rangeWithinFile start size totalFileSize = + start <= totalFileSize && size <= totalFileSize - start + +rangeContainsPoint :: Word64 -> Word64 -> Word64 -> Bool +rangeContainsPoint start size point = + size > 0 && point >= start && point - start < size + +elfMagic :: B.ByteString +elfMagic = B.pack [0x7f, 0x45, 0x4c, 0x46] + +elfClass64Bit :: Word8 +elfClass64Bit = 2 + +elfCurrentVersion :: Word8 +elfCurrentVersion = 1 + +elfDataLittleEndian :: Word8 +elfDataLittleEndian = 1 + +elfDataBigEndian :: Word8 +elfDataBigEndian = 2 + +elfOsAbiSystemV :: Word8 +elfOsAbiSystemV = 0 + +elfOsAbiLinux :: Word8 +elfOsAbiLinux = 3 + +elfOsAbiFreeBsd :: Word8 +elfOsAbiFreeBsd = 9 + +elfTypeExecutable :: Int +elfTypeExecutable = 2 + +elfTypeSharedObject :: Int +elfTypeSharedObject = 3 + +elfTypeRelocatable :: Int +elfTypeRelocatable = 1 + +elfMachineX86_64 :: Int +elfMachineX86_64 = 62 + +elfHeaderSize :: Int +elfHeaderSize = 64 + +elfSectionHeaderSize :: Int +elfSectionHeaderSize = 64 + +elfSectionTypeNull :: Int +elfSectionTypeNull = 0 + +elfSectionTypeNoBits :: Int +elfSectionTypeNoBits = 8 + +elfSectionHeaderIndexUndefined :: Word64 +elfSectionHeaderIndexUndefined = 0 + +elfSectionHeaderIndexExtended :: Word64 +elfSectionHeaderIndexExtended = 0xffff + +elfProgramHeaderCountExtended :: Word64 +elfProgramHeaderCountExtended = 0xffff + +elfProgramHeaderSize :: Int +elfProgramHeaderSize = 56 + +compilerObjectDigestChunkBytes :: Word64 +compilerObjectDigestChunkBytes = 64 * 1024 + +maximumStableCompilerObjectOutputBytes :: Word64 +maximumStableCompilerObjectOutputBytes = 512 * 1024 * 1024 + +maximumStableCompilerObjectDigestBytes :: Word64 +maximumStableCompilerObjectDigestBytes = 128 * 1024 * 1024 + +maximumRunnableLinkedOutputValidationBytes :: Word64 +maximumRunnableLinkedOutputValidationBytes = 128 * 1024 * 1024 + +maximumStableCompilerObjectSectionHeaderCount :: Word64 +maximumStableCompilerObjectSectionHeaderCount = 8192 + +maximumStableCompilerObjectProgramHeaderCount :: Word64 +maximumStableCompilerObjectProgramHeaderCount = 1024 + +maximumProbeObjectTargetBytes :: Word64 +maximumProbeObjectTargetBytes = 64 + +compilerObjectDigestRangeDescriptorBytes :: Word64 +compilerObjectDigestRangeDescriptorBytes = 17 + +compilerObjectDigestFnvOffset :: Word64 +compilerObjectDigestFnvOffset = 14695981039346656037 + +compilerObjectDigestFnvPrime :: Word64 +compilerObjectDigestFnvPrime = 1099511628211 + +compilerObjectDigestMixSeed :: Word64 +compilerObjectDigestMixSeed = 7809847782465536322 + +compilerObjectDigestMixPrime :: Word64 +compilerObjectDigestMixPrime = 14029467366897019727 + +compilerObjectDigestMixIncrement :: Word64 +compilerObjectDigestMixIncrement = 11400714819323198485 + +compilerObjectDigestRangeMarker :: Word8 +compilerObjectDigestRangeMarker = 0xff + +elfProgramHeaderTypeLoad :: Int +elfProgramHeaderTypeLoad = 1 + +elfProgramHeaderTypeDynamic :: Int +elfProgramHeaderTypeDynamic = 2 + +elfProgramHeaderTypeInterp :: Int +elfProgramHeaderTypeInterp = 3 + +elfProgramHeaderFlagExecute :: Int +elfProgramHeaderFlagExecute = 0x1 + +elfDynamicEntrySize :: Int +elfDynamicEntrySize = 16 + +elfDynamicTagNull :: Word64 +elfDynamicTagNull = 0 + +elfDynamicTagNeeded :: Word64 +elfDynamicTagNeeded = 1 + +elfDynamicTagFlags1 :: Word64 +elfDynamicTagFlags1 = 0x6ffffffb + +elfDynamicFlag1Pie :: Word64 +elfDynamicFlag1Pie = 0x08000000 + +markerSectionAsm :: String -> String -> String +markerSectionAsm label marker = unlines + [ ".section .rodata" + , label <> ":" + , " .asciz \"" <> escapeAsmString marker <> "\"" + , ".text" ] -inputFNameP :: Parser FilePath -inputFNameP = strArgument $ mconcat - [ metavar "file" - , action "file" - , help "Specify the input file name" +x86_64ElfRunnableLinkedOutputMarkerAsm :: String -> String +x86_64ElfRunnableLinkedOutputMarkerAsm runnableOutputMarker = + unlines [".intel_syntax noprefix"] + <> markerSectionAsm ".L.htcc_runnable_output_marker_payload" runnableOutputMarker + <> unlines + [ ".section .init_array,\"aw\",@init_array" + , " .quad .L.htcc_runnable_output_marker_ctor" + , ".text" + , ".L.htcc_runnable_output_marker_ctor:" + , " lea rax, [rip + .L.htcc_runnable_output_marker_payload]" + , " ret" + , ".section .note.GNU-stack,\"\",@progbits" + ] + +makeRunnableLinkedOutputMarker :: FilePath -> FilePath -> FilePath -> String +makeRunnableLinkedOutputMarker asmPath objPath markerObjPath = + "htcc-output-marker:" + <> takeFileName asmPath + <> ":" + <> takeFileName objPath + <> ":" + <> takeFileName markerObjPath + +x86_64ElfProbeAsm :: String -> String +x86_64ElfProbeAsm probeMarker = + unlines [".intel_syntax noprefix"] + <> markerSectionAsm "htcc_probe_marker" probeMarker + <> unlines + [ ".global main" + , "main:" + , " lea rdx, [rip + htcc_probe_marker]" + , " xor eax, eax" + , " ret" + , ".section .note.GNU-stack,\"\",@progbits" + ] + +escapeAsmString :: String -> String +escapeAsmString = concatMap $ \case + '"' -> "\\\"" + '\\' -> "\\\\" + c -> [c] + +asmCompiler :: Bool -> IO CompilerCommand +asmCompiler suppressWarnsOutput = do + htccAssembler <- nonEmptyEnv <$> lookupEnv "HTCC_ASSEMBLER" + compilerSpec <- resolveCompilerCommand $ fromMaybe "gcc" htccAssembler + ensureX86_64ElfCompiler suppressWarnsOutput compilerSpec + pure compilerSpec + +data CompilerProbeFailure + = CompilerAssemblyProbeFailure + | CompilerLinkProbeFailure + +ensureX86_64ElfCompiler :: Bool -> CompilerCommand -> IO () +ensureX86_64ElfCompiler suppressWarnsOutput compilerSpec = do + detectedTargets <- probeCompilerTargets compilerSpec + probeResult <- probeCompilerInvocation compilerSpec + case probeResult of + Right target + | isX86_64ElfTarget target -> pure () + | otherwise -> rejectTarget target + Left CompilerAssemblyProbeFailure -> + case filter isX86_64ElfTarget detectedTargets of + _ : _ -> rejectAssemblyProbeFailure + [] -> rejectDriverSelection + Left CompilerLinkProbeFailure -> + rejectLinkProbeFailure + where + rejectTarget target = + ioError . userError $ + "HTCC_ASSEMBLER must target x86_64-ELF for -r (detected " <> target <> ")" + + rejectAssemblyProbeFailure = + ioError . userError $ + "HTCC_ASSEMBLER passed target metadata probes but failed an x86_64-ELF assembly probe for -r" + + rejectDriverSelection = + ioError . userError $ + "failed to determine an x86_64-ELF target from HTCC_ASSEMBLER; choose a compiler driver that defaults to x86_64-ELF for -r" + + rejectLinkProbeFailure = + ioError . userError $ + "HTCC_ASSEMBLER assembled an x86_64-ELF object but failed a link probe for -r; choose a compiler driver that supports both assembly and linking for -r" + + wrapCompilerProbeIOError compilerSpec' args action = + catchIOError + action + ( \ioErr -> + ioError . userError $ + "failed to start HTCC_ASSEMBLER probe " + <> showCompilerCommandForUser compilerSpec' args + <> ": " + <> ioeGetErrorString ioErr + ) + + probeCompilerTargets compilerSpec' = + catMaybes <$> mapM (probeCompilerTarget compilerSpec') + [ "-dumpmachine" + , "-print-target-triple" + ] + + probeCompilerTarget compilerSpec' probeArg = do + (exitCode, stdout', _) <- + wrapCompilerProbeIOError compilerSpec' [probeArg] $ + readCompilerProcessWithExitCodeProbeUntil + capturedCompilerTargetLineAvailableAfterExit + suppressWarnsOutput + compilerSpec' + [probeArg] + pure $ case exitCode of + ExitSuccess -> nonEmptyTrimmed stdout' + ExitFailure _ -> Nothing + + probeCompilerInvocation compilerSpec' = + withProbeFile "htcc-probe-.s" $ \asmPath asmHandle -> do + withProbeFile "htcc-probe-.o" $ \objPath objHandle -> do + withProbeFile "htcc-probe-marker-.s" $ \markerAsmPath markerAsmHandle -> do + withProbeFile "htcc-probe-marker-.o" $ \markerObjPath markerObjHandle -> do + let probeMarker = + makeRunnableLinkedOutputMarker asmPath objPath markerObjPath + hPutStr asmHandle $ x86_64ElfProbeAsm probeMarker + hClose asmHandle + hPutStr + markerAsmHandle + (x86_64ElfRunnableLinkedOutputMarkerAsm probeMarker) + hClose markerAsmHandle + setFileMode objPath temporaryWritableMode + hClose objHandle + setFileMode markerObjPath temporaryWritableMode + hClose markerObjHandle + let assembleArgs = asmAssembleArgs objPath asmPath + markerAssembleArgs = + asmAssembleArgs markerObjPath markerAsmPath + assemblePostExitDrainSatisfied <- + stabilizeCompilerProbeObjectTargetAfterExit objPath + probeProcessResult <- + probeCommandExitCode + compilerSpec' + assembleArgs + assemblePostExitDrainSatisfied + case probeProcessResult of + Just ExitSuccess -> do + probeTarget <- detectProbeObjectTarget objPath + case probeTarget of + Just target + | isX86_64ElfTarget target -> + withCompilerObjectSnapshot objPath $ \objSnapshotPath -> do + markerAssemblePostExitDrainSatisfied <- + stabilizeCompilerObjectOutputAfterExit + markerObjPath + markerProbeProcessResult <- + probeCommandExitCode + compilerSpec' + markerAssembleArgs + markerAssemblePostExitDrainSatisfied + case markerProbeProcessResult of + Just ExitSuccess -> + withCompilerObjectSnapshot + markerObjPath + $ \markerObjSnapshotPath -> do + linkSucceeded <- + probeCompilerLink + compilerSpec' + objSnapshotPath + markerObjSnapshotPath + probeMarker + pure $ + if linkSucceeded + then Right target + else Left CompilerLinkProbeFailure + _ -> + pure $ Left CompilerLinkProbeFailure + | otherwise -> pure $ Right target + Nothing -> + pure $ Left CompilerAssemblyProbeFailure + _ -> + pure $ Left CompilerAssemblyProbeFailure + + probeCompilerLink compilerSpec' objPath markerObjPath probeMarker = + withProbeFile "htcc-probe-.out" $ \outputPath outputHandle -> do + creationMode <- creationMaskedOutputMode + setFileMode outputPath $ + stagedOutputMode PreserveReplacementOutputModeKeepingExecutableBits creationMode + hClose outputHandle + let linkArgs = asmRunnableLinkArgs outputPath objPath markerObjPath + linkPostExitDrainSatisfied <- + stabilizePostExitPredicate $ + validateRunnableLinkedOutput outputPath (Just probeMarker) + probeProcessResult <- + probeCommandExitCode + compilerSpec' + linkArgs + linkPostExitDrainSatisfied + case probeProcessResult of + Just ExitSuccess -> + validateRunnableLinkedOutput outputPath (Just probeMarker) + _ -> pure False + + probeCommandExitCode compilerSpec' args postExitDrainSatisfied = + Just . (\(exitCode, _, _) -> exitCode) + <$> wrapCompilerProbeIOError compilerSpec' args + ( readCompilerProcessWithExitCodeProbeUntil + (const postExitDrainSatisfied) + suppressWarnsOutput + compilerSpec' + args + ) + + withProbeFile prefix action = do + tmpDir <- getTemporaryDirectory + (path, handle) <- openTempFile tmpDir prefix + setFileMode path temporaryWritableMode + finally + (action path handle) + ( ignoreIOException (hClose handle) + *> ignoreIOException (removeFile path) + ) + + nonEmptyTrimmed outputText = + case trim outputText of + "" -> Nothing + xs -> Just xs + + detectProbeObjectTarget path = + catchIOError + (bracket openProbeObject closeFd readProbeObjectTarget) + (const $ pure Nothing) + where + openProbeObject = + openFd path ReadOnly defaultFileFlags {nofollow = True, cloexec = True, nonBlock = True} + + readProbeObjectTarget fd = do + status <- getFdStatus fd + let maybeOutputSize = fileStatusSizeWord64 status + case maybeOutputSize of + Just outputSize + | isRegularFile status -> do + maybeBytes <- + readCompilerObjectBytesAt fd 0 (min outputSize maximumProbeObjectTargetBytes) + case maybeBytes of + Just bytes -> describeProbeObject fd outputSize bytes + Nothing -> pure Nothing + _ -> pure Nothing + + describeProbeObject fd outputSize bytes + | B.length bytes < 4 = pure Nothing + | B.take 4 bytes /= elfMagic = pure $ Just "non-ELF object file" + | B.length bytes < 20 = pure $ Just "truncated ELF object file" + | isX86_64RelocatableElfProbeObject = + if outputSize < fromIntegral elfHeaderSize + then pure $ Just invalidX86_64RelocatableElfProbeObject + else do + maybeFingerprint <- + catchIOError + (relocatableElfObjectContentFingerprint fd outputSize) + (const $ pure Nothing) + pure . Just $ + if isJust maybeFingerprint + then "x86_64-unknown-elf object" + else invalidX86_64RelocatableElfProbeObject + | otherwise = + pure . Just $ + case (elfClass == elfClass64Bit, elfMachine == elfMachineX86_64, elfType == elfTypeRelocatable) of + (True, True, False) -> "non-relocatable x86_64-ELF file" + (_, _, True) -> "ELF object file" + _ -> "non-relocatable ELF file" + where + elfClass = B.index bytes 4 + elfData = B.index bytes 5 + elfType = decodeElfHalfWord elfData (B.index bytes 16) (B.index bytes 17) :: Int + elfMachine = decodeElfHalfWord elfData (B.index bytes 18) (B.index bytes 19) :: Int + isX86_64RelocatableElfProbeObject = + elfClass == elfClass64Bit + && elfMachine == elfMachineX86_64 + && elfType == elfTypeRelocatable + invalidX86_64RelocatableElfProbeObject = + "invalid x86_64-ELF relocatable object file" + + trim = reverse . dropWhile isSpace . reverse . dropWhile isSpace + +isX86_64ElfTarget :: String -> Bool +isX86_64ElfTarget target = + let normalizedTarget = map toLower target + in isX86_64Target normalizedTarget && not (isKnownNonElfTarget normalizedTarget) + where + isX86_64Target normalizedTarget = + "x86_64" `isPrefixOf` normalizedTarget || "amd64" `isPrefixOf` normalizedTarget + + -- Most x86_64 Unix triples are ELF; reject only known Mach-O/PE families. + isKnownNonElfTarget normalizedTarget = + any (`isInfixOf` normalizedTarget) + [ "apple" + , "cygwin" + , "darwin" + , "mingw" + , "msvc" + , "windows" + ] + +asmOutputPath :: Opts -> FilePath +asmOutputPath = fromMaybe "a.out" . optOutput + +asmAssembleArgs :: FilePath -> FilePath -> [String] +asmAssembleArgs objPath asmPath = + [ "-x" + , "assembler" + , "-c" + , "-o" + , objPath + , asmPath ] -outputFNameP :: Parser FilePath -outputFNameP = strOption $ mconcat - [ metavar "file" - , short 'o' - , long "out" - , help "Specify the output destination file name, supported only svg" - , value "./out.svg" - , showDefaultWith id +asmLinkArgs :: FilePath -> [String] -> [String] +asmLinkArgs outputPath objPaths = + [ "-no-pie" + , "-o" + , outputPath ] + <> objPaths + +asmRunnableLinkArgs :: FilePath -> FilePath -> FilePath -> [String] +asmRunnableLinkArgs outputPath objPath markerObjPath = + asmLinkArgs outputPath [objPath, markerObjPath] + +normalizeComparablePath :: FilePath -> IO FilePath +normalizeComparablePath path = do + exists <- doesFileExist path + normalise <$> if exists then canonicalizePath path else makeAbsolute path + +fileIdentity :: FilePath -> IO (Maybe (FilePath, FilePath)) +fileIdentity path = do + exists <- doesFileExist path + if exists + then do + status <- getFileStatus path + pure . Just $ + ( show $ deviceID status + , show $ fileID status + ) + else pure Nothing + +sameFileAs :: FilePath -> FilePath -> IO Bool +sameFileAs lhs rhs = do + normalizedLhs <- normalizeComparablePath lhs + normalizedRhs <- normalizeComparablePath rhs + if normalizedLhs == normalizedRhs + then pure True + else do + lhsIdentity <- fileIdentity lhs + rhsIdentity <- fileIdentity rhs + pure $ fromMaybe False $ (==) <$> lhsIdentity <*> rhsIdentity + +existingInputAliasesPath :: FilePath -> FilePath -> IO Bool +existingInputAliasesPath outputPath inputPath = do + inputExists <- doesFileExist inputPath + if inputExists + then sameFileAs outputPath inputPath + else pure False + +runAsmOutputAliasesInput :: Opts -> IO Bool +runAsmOutputAliasesInput opts = do + anyM (existingInputAliasesPath $ asmOutputPath opts) $ optInput opts -supressWarnP :: Parser Bool -supressWarnP = switch $ mconcat - [ short 'w' - , long "supress-warns" - , help "Disable all warning messages" +plainOutputAliasesInput :: Opts -> IO Bool +plainOutputAliasesInput opts = maybe + (pure False) + (\path -> anyM (existingInputAliasesPath path) $ optInput opts) + (optOutput opts) + +anyM :: Monad m => (a -> m Bool) -> [a] -> m Bool +anyM _ [] = pure False +anyM p (x : xs) = do + matched <- p x + if matched then pure True else anyM p xs + +executableFileMode :: FileMode +executableFileMode = foldr1 unionFileModes + [ ownerExecuteMode + , groupExecuteMode + , otherExecuteMode ] -optionsP :: Parser Options -optionsP = (<*>) helper $ - Options <$> visualizeASTP <*> resolutionP <*> inputFNameP <*> outputFNameP <*> supressWarnP +defaultVisualizeResolution :: (Double, Double) +defaultVisualizeResolution = (640, 480) + +visualizeOutputPath :: Opts -> FilePath +visualizeOutputPath = fromMaybe "./out.svg" . optOutput + +parseImageResolution :: String -> Maybe (Double, Double) +parseImageResolution inputValue = case break (== 'x') inputValue of + (width, 'x' : height) -> case (readMaybe width, readMaybe height) of + (Just parsedWidth, Just parsedHeight) + | isPositiveFiniteResolution parsedWidth && isPositiveFiniteResolution parsedHeight -> + Just (parsedWidth, parsedHeight) + _ -> + Nothing + _ -> Nothing + +isPositiveFiniteResolution :: Double -> Bool +isPositiveFiniteResolution value = + value > 0 && not (isInfinite value || isNaN value) + +visualizeSizeSpec :: Opts -> IO (SizeSpec V2 Double) +visualizeSizeSpec opts = case optImgResolution opts of + Just resolutionValue -> case parseImageResolution resolutionValue of + Just (width, height) -> + pure $ mkSizeSpec2D (Just width) (Just height) + Nothing -> do + unless (optSuppressWarns opts) $ + hPutStr stderr "warning: the specified resolution is invalid, so using default resolution.\n" + let (width, height) = defaultVisualizeResolution + pure $ mkSizeSpec2D (Just width) (Just height) + Nothing -> do + let (width, height) = defaultVisualizeResolution + pure $ mkSizeSpec2D (Just width) (Just height) + +emitWarningsIfEnabled :: Foldable f => Opts -> f (M.ParseErrorBundle T.Text Void) -> IO () +emitWarningsIfEnabled opts warnings = + unless (optSuppressWarns opts) $ + emitWarnings warnings + +validateOpts :: Opts -> IO () +validateOpts opts + | optVisualizeAst opts && optIsRunAsm opts = + hPutStr stderr "--visualize-ast cannot be combined with -r\n" *> exitFailure + | optVisualizeAst opts && length (optInput opts) /= 1 = + hPutStr stderr "--visualize-ast expects exactly one input file\n" *> exitFailure + | optVisualizeAst opts = do + resolvedVisualizeOutputPath <- resolveReplacementOutputPath $ visualizeOutputPath opts + either + (\msg -> hPutStr stderr (msg <> "\n") *> exitFailure) + pure + (validateVisualizationOutputPath resolvedVisualizeOutputPath) + outputAliasesInput <- anyM (existingInputAliasesPath $ visualizeOutputPath opts) $ optInput opts + when outputAliasesInput $ + hPutStr stderr ("--visualize-ast output path must not overwrite an input file: " <> visualizeOutputPath opts <> "\n") + *> exitFailure + | isJust (optImgResolution opts) = + hPutStr stderr "--img-resolution requires --visualize-ast\n" *> exitFailure + | length (optInput opts) > 1 && optIsRunAsm opts = + hPutStr stderr "multiple input files are not supported with -r\n" *> exitFailure + | optIsRunAsm opts = do + outputAliasesInput <- runAsmOutputAliasesInput opts + when outputAliasesInput $ + hPutStr stderr ("-r output path must not overwrite an input file: " <> asmOutputPath opts <> "\n") + *> exitFailure + | otherwise = do + outputAliasesInput <- plainOutputAliasesInput opts + when outputAliasesInput $ + hPutStr stderr ("-o output path must not overwrite an input file: " <> fromMaybe "" (optOutput opts) <> "\n") + *> exitFailure + +type ParsedInput = + ( ASTs Integer + , GlobalVars Integer + , GlobalVars Integer + , Literals Integer + , PF.Functions Integer + , PF.Functions Integer + ) +type ParsedInputWithWarnings = (Warnings, ParsedInput) + +emitWarnings :: Foldable f => f (M.ParseErrorBundle T.Text Void) -> IO () +emitWarnings = + mapM_ (hPutStr stderr . M.errorBundlePretty) + +collectedWarnings :: Foldable f => f ParsedInputWithWarnings -> Warnings +collectedWarnings = + foldMap fst + +collectedWarningsInInputOrder :: [ParsedInputWithWarnings] -> Warnings +collectedWarningsInInputOrder = + collectedWarnings . reverse + +implicitFunctionWarningName :: M.ParseErrorBundle T.Text Void -> Maybe T.Text +implicitFunctionWarningName M.ParseErrorBundle { M.bundleErrors = bundledError :| [] } = do + msg <- case bundledError of + M.FancyError _ fancyErrors -> case Set.toList fancyErrors of + [M.ErrorFail errMsg] -> + Just $ T.pack errMsg + _ -> + Nothing + _ -> + Nothing + T.stripPrefix (T.pack "warning: the function '") msg + >>= T.stripSuffix (T.pack "' is not declared.") +implicitFunctionWarningName _ = Nothing + +originatingInputDeclaresFunction :: ParsedInput -> T.Text -> Bool +originatingInputDeclaresFunction (_, _, _, _, _, funcs) name = + maybe False (not . PF.fnImplicit) $ + Map.lookup name funcs + +implicitFunctionResolvedAfterMerge :: ParsedInput -> T.Text -> Bool +implicitFunctionResolvedAfterMerge (_, _, _, _, funcs, _) name = + maybe + False + (\func -> not (CT.isSCStatic $ PF.fntype func) && not (PF.fnImplicit func)) + $ Map.lookup name funcs + +shouldEmitMergedWarning :: ParsedInput -> ParsedInput -> M.ParseErrorBundle T.Text Void -> Bool +shouldEmitMergedWarning originatingInput mergedInput warning = + maybe + True + ( \name -> + originatingInputDeclaresFunction originatingInput name + || not (implicitFunctionResolvedAfterMerge mergedInput name) + ) + (implicitFunctionWarningName warning) + +literalLabelPrefix :: T.Text +literalLabelPrefix = T.pack ".L.data." + +literalLabel :: Natural -> T.Text +literalLabel n = literalLabelPrefix <> tshow n + +shiftLiteralLabelName :: Natural -> T.Text -> T.Text +shiftLiteralLabelName offset name = maybe name (literalLabel . (+ offset)) literalIndex + where + literalIndex = T.stripPrefix literalLabelPrefix name >>= readMaybe . T.unpack + +shiftLiteralLabelsInGVar :: Natural -> GVar Integer -> GVar Integer +shiftLiteralLabelsInGVar offset gvar = gvar + { initWith = case initWith gvar of + GVarInitWithOG ref -> GVarInitWithOG $ shiftLiteralLabelName offset ref + GVarInitWithData dats -> GVarInitWithData $ map shiftLiteralLabelsInGVarData dats + GVarInitWithAST ast -> GVarInitWithAST $ shiftLiteralLabelsInATree offset ast + other -> other + } + where + shiftLiteralLabelsInGVarData dat = case dat of + GVarInitReloc sz ref addend -> GVarInitReloc sz (shiftLiteralLabelName offset ref) addend + other -> other + +shiftLiteralLabelsInATKindFor :: Natural -> ATKindFor Integer -> ATKindFor Integer +shiftLiteralLabelsInATKindFor offset kind = case kind of + ATForkw -> ATForkw + ATForInit at -> ATForInit $ shiftLiteralLabelsInATree offset at + ATForCond at -> ATForCond $ shiftLiteralLabelsInATree offset at + ATForIncr at -> ATForIncr $ shiftLiteralLabelsInATree offset at + ATForStmt at -> ATForStmt $ shiftLiteralLabelsInATree offset at + +shiftLiteralLabelsInATKind :: Natural -> ATKind Integer -> ATKind Integer +shiftLiteralLabelsInATKind offset kind = case kind of + ATConditional cond tr fl -> + ATConditional + (shiftLiteralLabelsInATree offset cond) + (shiftLiteralLabelsInATree offset tr) + (shiftLiteralLabelsInATree offset fl) + ATSwitch cond cases -> + ATSwitch + (shiftLiteralLabelsInATree offset cond) + (map (shiftLiteralLabelsInATree offset) cases) + ATFor kinds -> + ATFor $ map (shiftLiteralLabelsInATKindFor offset) kinds + ATBlock ats -> + ATBlock $ map (shiftLiteralLabelsInATree offset) ats + ATStmtExpr ats -> + ATStmtExpr $ map (shiftLiteralLabelsInATree offset) ats + ATNull at -> + ATNull $ shiftLiteralLabelsInATree offset at + ATDefFunc name args -> + ATDefFunc name $ map (shiftLiteralLabelsInATree offset) <$> args + ATCallFunc name args -> + ATCallFunc name $ map (shiftLiteralLabelsInATree offset) <$> args + ATCallPtr args -> + ATCallPtr $ map (shiftLiteralLabelsInATree offset) <$> args + ATGVar ty name -> + ATGVar ty $ shiftLiteralLabelName offset name + _ -> + kind + +shiftLiteralLabelsInATree :: Natural -> ATree Integer -> ATree Integer +shiftLiteralLabelsInATree _ ATEmpty = ATEmpty +shiftLiteralLabelsInATree offset (ATNode kind ty lhs rhs) = + ATNode + (shiftLiteralLabelsInATKind offset kind) + ty + (shiftLiteralLabelsInATree offset lhs) + (shiftLiteralLabelsInATree offset rhs) + +shiftLiteralLabels :: Natural -> ParsedInput -> ParsedInput +shiftLiteralLabels offset (asts, gvars, mergeGVars, lits, funcs, mergeFuncs) = + ( map (shiftLiteralLabelsInATree offset) asts + , Map.map (shiftLiteralLabelsInGVar offset) gvars + , Map.map (shiftLiteralLabelsInGVar offset) mergeGVars + , map (\lit -> lit { ln = ln lit + offset }) lits + , funcs + , mergeFuncs + ) + +namespaceInternalSymbol :: Natural -> T.Text -> T.Text +namespaceInternalSymbol inputIndex name = + namespaceInternalSymbolPrefix inputIndex <> name + +namespaceInternalSymbolPrefix :: Natural -> T.Text +namespaceInternalSymbolPrefix inputIndex = + T.pack ".L.internal." + <> tshow inputIndex + <> T.pack "." + +denamespaceInternalSymbol :: Natural -> T.Text -> T.Text +denamespaceInternalSymbol inputIndex name = + fromMaybe name $ + T.stripPrefix (namespaceInternalSymbolPrefix inputIndex) name + +renameInputSymbol :: Map.Map T.Text T.Text -> T.Text -> T.Text +renameInputSymbol renames name = Map.findWithDefault name name renames + +data InternalSymbolRenames = InternalSymbolRenames + { functionSymbolRenames :: Map.Map T.Text T.Text + , objectSymbolRenames :: Map.Map T.Text T.Text + } + +renameFunctionSymbol :: InternalSymbolRenames -> T.Text -> T.Text +renameFunctionSymbol renames = renameInputSymbol $ functionSymbolRenames renames + +renameObjectSymbol :: InternalSymbolRenames -> T.Text -> T.Text +renameObjectSymbol renames = renameInputSymbol $ objectSymbolRenames renames + +renameKnownInternalSymbol :: InternalSymbolRenames -> T.Text -> T.Text +renameKnownInternalSymbol renames name = + renameFunctionSymbol + renames + (renameObjectSymbol renames name) + +renameInternalSymbolsInGVar :: InternalSymbolRenames -> GVar Integer -> GVar Integer +renameInternalSymbolsInGVar renames gvar = gvar + { initWith = case initWith gvar of + GVarInitWithOG ref -> GVarInitWithOG $ renameKnownInternalSymbol renames ref + GVarInitWithData dats -> GVarInitWithData $ map renameInternalSymbolsInGVarData dats + GVarInitWithAST ast -> GVarInitWithAST $ renameInternalSymbolsInATree renames ast + other -> other + } + where + renameInternalSymbolsInGVarData dat = case dat of + GVarInitReloc sz ref addend -> GVarInitReloc sz (renameKnownInternalSymbol renames ref) addend + other -> other + +renameInternalSymbolsInATKind :: InternalSymbolRenames -> ATKind Integer -> ATKind Integer +renameInternalSymbolsInATKind renames kind = case kind of + ATConditional cond tr fl -> + ATConditional + (renameInternalSymbolsInATree renames cond) + (renameInternalSymbolsInATree renames tr) + (renameInternalSymbolsInATree renames fl) + ATSwitch cond cases -> + ATSwitch + (renameInternalSymbolsInATree renames cond) + (map (renameInternalSymbolsInATree renames) cases) + ATFor kinds -> + ATFor $ map (renameInternalSymbolsInATKindFor renames) kinds + ATBlock ats -> + ATBlock $ map (renameInternalSymbolsInATree renames) ats + ATStmtExpr ats -> + ATStmtExpr $ map (renameInternalSymbolsInATree renames) ats + ATNull at -> + ATNull $ renameInternalSymbolsInATree renames at + ATDefFunc name args -> + ATDefFunc + (renameFunctionSymbol renames name) + (map (renameInternalSymbolsInATree renames) <$> args) + ATCallFunc name args -> + ATCallFunc + (renameFunctionSymbol renames name) + (map (renameInternalSymbolsInATree renames) <$> args) + ATCallPtr args -> + ATCallPtr $ map (renameInternalSymbolsInATree renames) <$> args + ATFuncPtr name -> + ATFuncPtr $ renameFunctionSymbol renames name + ATGVar ty name -> + ATGVar ty $ renameObjectSymbol renames name + _ -> + kind + +renameInternalSymbolsInATKindFor :: InternalSymbolRenames -> ATKindFor Integer -> ATKindFor Integer +renameInternalSymbolsInATKindFor renames kind = case kind of + ATForkw -> ATForkw + ATForInit at -> ATForInit $ renameInternalSymbolsInATree renames at + ATForCond at -> ATForCond $ renameInternalSymbolsInATree renames at + ATForIncr at -> ATForIncr $ renameInternalSymbolsInATree renames at + ATForStmt at -> ATForStmt $ renameInternalSymbolsInATree renames at + +renameInternalSymbolsInATree :: InternalSymbolRenames -> ATree Integer -> ATree Integer +renameInternalSymbolsInATree _ ATEmpty = ATEmpty +renameInternalSymbolsInATree renames (ATNode kind ty lhs rhs) = + ATNode + (renameInternalSymbolsInATKind renames kind) + ty + (renameInternalSymbolsInATree renames lhs) + (renameInternalSymbolsInATree renames rhs) + +internalSymbolRenames :: Natural -> ParsedInput -> InternalSymbolRenames +internalSymbolRenames inputIndex (_, gvars, _, _, funcs, _) = + InternalSymbolRenames + { functionSymbolRenames = Map.fromList + [ (name, namespaceInternalSymbol inputIndex name) + | (name, func) <- Map.toList funcs + , CT.isSCStatic (PF.fntype func) + ] + , objectSymbolRenames = Map.fromList + [ (name, namespaceInternalSymbol inputIndex name) + | (name, gvar) <- Map.toList gvars + , CT.isSCStatic (gvtype gvar) + ] + } + +renameInternalSymbols :: Natural -> ParsedInput -> ParsedInput +renameInternalSymbols inputIndex (asts, gvars, mergeGVars, lits, funcs, mergeFuncs) = + ( map (renameInternalSymbolsInATree renames) asts + , Map.mapKeys (renameObjectSymbol renames) $ Map.map (renameInternalSymbolsInGVar renames) gvars + , Map.mapKeys (renameObjectSymbol renames) $ Map.map (renameInternalSymbolsInGVar renames) mergeGVars + , lits + , Map.mapKeys (renameFunctionSymbol renames) funcs + , Map.mapKeys (renameFunctionSymbol renames) mergeFuncs + ) + where + renames = internalSymbolRenames inputIndex (asts, gvars, mergeGVars, lits, funcs, mergeFuncs) + +shiftLiteralLabelsInInputs :: [ParsedInput] -> [ParsedInput] +shiftLiteralLabelsInInputs parsedInputs = snd $ mapAccumL step (0, 0) parsedInputs + where + shouldNamespaceInternalSymbols = length parsedInputs > 1 + step (inputIndex, offset) parsed@(_, _, _, lits, _, _) = + let renamed = bool parsed (renameInternalSymbols inputIndex parsed) shouldNamespaceInternalSymbols + shifted = shiftLiteralLabels offset renamed + in ((succ inputIndex, offset + fromIntegral (length lits)), shifted) + +data ExternalSymbol + = ExternalFunction (PF.Function Integer) Bool + | ExternalGlobal (GVar Integer) + | ExternalImplicitFunction + +type ExternalFunctionInfo = (PF.Function Integer, Bool) +type ExternalSymbolOrigin = Int +type TaggedExternalSymbol = (ExternalSymbolOrigin, ExternalSymbol) +type StaticSymbolKey = (ExternalSymbolOrigin, T.Text) +type StaticSymbols = Map.Map StaticSymbolKey ExternalSymbol + +duplicateExternalSymbolError :: T.Text -> String +duplicateExternalSymbolError name = + T.unpack $ T.pack "multiple external definitions in multi-input -o mode: " <> name + +conflictingExternalDeclarationError :: T.Text -> GVar Integer -> GVar Integer -> String +conflictingExternalDeclarationError name lhs rhs = + T.unpack $ mconcat + [ T.pack "conflicting external declarations in multi-input -o mode: " + , name + , T.pack " (" + , tshow $ CT.toTypeKind $ gvtype lhs + , T.pack " vs " + , tshow $ CT.toTypeKind $ gvtype rhs + , T.pack ")" + ] + +conflictingExternalFunctionDeclarationError :: T.Text -> PF.Function Integer -> PF.Function Integer -> String +conflictingExternalFunctionDeclarationError name lhs rhs = + T.unpack $ mconcat + [ T.pack "conflicting external function declarations in multi-input -o mode: " + , name + , T.pack " (" + , tshow $ CT.toTypeKind $ PF.fntype lhs + , T.pack " vs " + , tshow $ CT.toTypeKind $ PF.fntype rhs + , T.pack ")" + ] + +isTentativeExternalGlobal :: GVar Integer -> Bool +isTentativeExternalGlobal gvar = + not (CT.isSCStatic $ gvtype gvar) + && case initWith gvar of + GVarInitWithZero -> True + _ -> False + +isExternOnlyExternalGlobal :: GVar Integer -> Bool +isExternOnlyExternalGlobal gvar = + not (CT.isSCStatic $ gvtype gvar) + && case initWith gvar of + GVarInitWithExternDecl -> True + _ -> False + +mergeCrossInputTypeKinds :: CT.TypeKind Integer -> CT.TypeKind Integer -> Maybe (CT.TypeKind Integer) +mergeCrossInputTypeKinds lhs rhs = + CT.mergeCompatibleTypeKinds + (normalizeCrossInputTypeScopes lhs) + (normalizeCrossInputTypeScopes rhs) + where + -- `ScopeId` is allocated per parser run, so it cannot participate in + -- cross-input compatibility checks. + normalizeCrossInputTypeScopes = \case + CT.CTSigned ty -> + CT.CTSigned $ normalizeCrossInputTypeScopes ty + CT.CTShort ty -> + CT.CTShort $ normalizeCrossInputTypeScopes ty + CT.CTLong ty -> + CT.CTLong $ normalizeCrossInputTypeScopes ty + CT.CTPtr ty -> + CT.CTPtr $ normalizeCrossInputTypeScopes ty + CT.CTArray len ty -> + CT.CTArray len $ normalizeCrossInputTypeScopes ty + CT.CTFunc ret params -> + CT.CTFunc + (normalizeCrossInputTypeScopes ret) + (map (first normalizeCrossInputTypeScopes) params) + CT.CTIncomplete incompleteTy -> + CT.CTIncomplete $ case incompleteTy of + CT.IncompleteArray elemTy -> + CT.IncompleteArray $ normalizeCrossInputTypeScopes elemTy + CT.IncompleteStruct tag _ -> + CT.IncompleteStruct tag (CT.ScopeId 0) + CT.CTStruct members -> + CT.CTStruct $ fmap normalizeCrossInputStructMember members + CT.CTNamedStruct tag _ members -> + CT.CTNamedStruct tag (CT.ScopeId 0) $ fmap normalizeCrossInputStructMember members + CT.CTEnum underlyingTy members -> + CT.CTEnum (normalizeCrossInputTypeScopes underlyingTy) members + ty -> + ty + + normalizeCrossInputStructMember member = + member { CT.smType = normalizeCrossInputTypeScopes $ CT.smType member } + +mergeExternalGlobalTypes + :: CT.StorageClass Integer + -> CT.StorageClass Integer + -> Maybe (CT.StorageClass Integer) +mergeExternalGlobalTypes (CT.SCAuto lhs) (CT.SCAuto rhs) = + CT.SCAuto <$> mergeCrossInputTypeKinds lhs rhs +mergeExternalGlobalTypes (CT.SCStatic lhs) (CT.SCStatic rhs) = + CT.SCStatic <$> mergeCrossInputTypeKinds lhs rhs +mergeExternalGlobalTypes (CT.SCRegister lhs) (CT.SCRegister rhs) = + CT.SCRegister <$> mergeCrossInputTypeKinds lhs rhs +mergeExternalGlobalTypes (CT.SCUndef lhs) (CT.SCUndef rhs) = + CT.SCUndef <$> mergeCrossInputTypeKinds lhs rhs +mergeExternalGlobalTypes _ _ = Nothing + +mergeExternalGlobals :: T.Text -> GVar Integer -> GVar Integer -> Either String (GVar Integer) +mergeExternalGlobals name lhs rhs = case mergeExternalGlobalTypes (gvtype lhs) (gvtype rhs) of + Nothing -> + Left $ conflictingExternalDeclarationError name lhs rhs + Just mergedType + | isExternOnlyExternalGlobal lhs && isExternOnlyExternalGlobal rhs -> + Right $ lhs { gvtype = mergedType } + | isExternOnlyExternalGlobal lhs -> + Right $ rhs { gvtype = mergedType } + | isExternOnlyExternalGlobal rhs -> + Right $ lhs { gvtype = mergedType } + | isTentativeExternalGlobal lhs && isTentativeExternalGlobal rhs -> + Right $ lhs { gvtype = mergedType } + | isTentativeExternalGlobal lhs -> + Right $ rhs { gvtype = mergedType } + | isTentativeExternalGlobal rhs -> + Right $ lhs { gvtype = mergedType } + | otherwise -> + Left $ duplicateExternalSymbolError name + +mergeExternalFunctions :: T.Text -> ExternalFunctionInfo -> ExternalFunctionInfo -> Either String ExternalFunctionInfo +mergeExternalFunctions name (lhs, lhsDefined) (rhs, rhsDefined) + | lhsDefined && rhsDefined = + Left $ duplicateExternalSymbolError name + | otherwise = case mergeExternalFunctionTypes preferred fallback of + Nothing -> + Left $ conflictingExternalFunctionDeclarationError name lhs rhs + Just merged -> + Right (merged, lhsDefined || rhsDefined) + where + (preferred, fallback) + | rhsDefined = (rhs, lhs) + | otherwise = (lhs, rhs) + +mergeExternalFunctionTypes + :: PF.Function Integer + -> PF.Function Integer + -> Maybe (PF.Function Integer) +mergeExternalFunctionTypes preferred fallback = do + mergedType <- mergeExternalGlobalTypes (PF.fntype preferred) (PF.fntype fallback) + pure $ + preferred + { PF.fntype = mergedType + , PF.fnDefined = PF.fnDefined preferred || PF.fnDefined fallback + , PF.fnImplicit = PF.fnImplicit preferred && PF.fnImplicit fallback + } + +implicitExternalFunction :: PF.Function Integer +implicitExternalFunction = + PF.Function + { PF.fntype = CT.SCAuto $ CT.CTFunc CT.CTInt [] + , PF.fnDefined = False + , PF.fnImplicit = True + , PF.fnNestDepth = 0 + } + +definedFunctions :: ParsedInput -> Set.Set T.Text +definedFunctions (asts, _, _, _, _, _) = + Set.fromList + [ name + | ATNode (ATDefFunc name _) _ _ _ <- asts + ] + +implicitFunctionCalls :: ParsedInput -> Set.Set T.Text +implicitFunctionCalls (asts, _, _, _, _, mergeFuncs) = + foldMap implicitFunctionCallsInATree asts + `Set.difference` Set.fromList (Map.keys mergeFuncs) + +implicitFunctionCallsInATKindFor :: ATKindFor Integer -> Set.Set T.Text +implicitFunctionCallsInATKindFor kind = case kind of + ATForkw -> Set.empty + ATForInit at -> implicitFunctionCallsInATree at + ATForCond at -> implicitFunctionCallsInATree at + ATForIncr at -> implicitFunctionCallsInATree at + ATForStmt at -> implicitFunctionCallsInATree at + +implicitFunctionCallsInATKind :: ATKind Integer -> Set.Set T.Text +implicitFunctionCallsInATKind kind = case kind of + ATConditional cond tr fl -> + implicitFunctionCallsInATree cond + <> implicitFunctionCallsInATree tr + <> implicitFunctionCallsInATree fl + ATSwitch cond cases -> + implicitFunctionCallsInATree cond + <> foldMap implicitFunctionCallsInATree cases + ATFor kinds -> + foldMap implicitFunctionCallsInATKindFor kinds + ATBlock ats -> + foldMap implicitFunctionCallsInATree ats + ATStmtExpr ats -> + foldMap implicitFunctionCallsInATree ats + ATNull at -> + implicitFunctionCallsInATree at + ATDefFunc _ args -> + foldMap implicitFunctionCallsInATree $ fromMaybe [] args + ATCallFunc name args -> + Set.insert name $ foldMap implicitFunctionCallsInATree $ fromMaybe [] args + ATCallPtr args -> + foldMap implicitFunctionCallsInATree $ fromMaybe [] args + _ -> + Set.empty + +implicitFunctionCallsInATree :: ATree Integer -> Set.Set T.Text +implicitFunctionCallsInATree ATEmpty = Set.empty +implicitFunctionCallsInATree (ATNode kind _ lhs rhs) = + implicitFunctionCallsInATKind kind + <> implicitFunctionCallsInATree lhs + <> implicitFunctionCallsInATree rhs + +mergeOutputInputs :: [ParsedInput] -> Either String ParsedInput +mergeOutputInputs = + mergePreparedInputs prepareAsmInput + +mergeVisualizableInputs :: [ParsedInput] -> Either String ParsedInput +mergeVisualizableInputs = + mergePreparedInputs prepareVisualizableInput + +mergePreparedInputs + :: (PF.Functions Integer -> ASTs Integer -> GlobalVars Integer -> Either String (ASTs Integer, GlobalVars Integer)) + -> [ParsedInput] + -> Either String ParsedInput +mergePreparedInputs prepareMergedInput = + mergeParsedInputs finalize + where + finalize (asts, gvars, lits, funcs, _, _) = do + (preparedAsts, preparedGVars) <- prepareMergedInput (fmap fst funcs) asts gvars + let visibleFuncs = fmap fst funcs + pure (preparedAsts, preparedGVars, preparedGVars, lits, visibleFuncs, visibleFuncs) + +mergeParsedInputs + :: (( ASTs Integer + , GlobalVars Integer + , Literals Integer + , Map.Map T.Text (PF.Function Integer, Bool) + , Map.Map T.Text TaggedExternalSymbol + , StaticSymbols + ) + -> Either String ParsedInput + ) + -> [ParsedInput] + -> Either String ParsedInput +mergeParsedInputs finalize parsedInputs = + foldM mergeInput ([], Map.empty, [], Map.empty, Map.empty, Map.empty) (zip [0 :: Int ..] parsedInputs) >>= finalize + where + mergeInput (astsAcc, gvarsAcc, litsAcc, funcsAcc, symbolsAcc, staticSymbolsAcc) (inputIndex, (asts, visibleGVars, mergeGVars, lits, visibleFuncs, mergeFuncs)) = do + let parsedInput = (asts, visibleGVars, mergeGVars, lits, visibleFuncs, mergeFuncs) + actualDefinitions = definedFunctions parsedInput + symbolsAcc' <- foldM (registerImplicitFunction inputIndex staticSymbolsAcc) symbolsAcc $ Set.toList $ implicitFunctionCalls parsedInput + (symbolsAcc'', staticSymbolsAcc', funcsAcc') <- + foldM + (registerFunction inputIndex actualDefinitions visibleFuncs) + (symbolsAcc', staticSymbolsAcc, funcsAcc) + $ Map.toList mergeFuncs + (symbolsAcc''', staticSymbolsAcc'', gvarsAcc') <- + foldM + (registerGlobal inputIndex visibleGVars) + (symbolsAcc'', staticSymbolsAcc', gvarsAcc) + $ Map.toList mergeGVars + pure + ( astsAcc <> asts + , gvarsAcc' + , litsAcc <> lits + , funcsAcc' + , symbolsAcc''' + , staticSymbolsAcc'' + ) + + registerImplicitFunction origin staticSymbols symbols name = do + rejectStaticSymbolConflict origin name ExternalImplicitFunction staticSymbols + case Map.lookup name symbols of + Nothing -> + pure $ insertSymbol origin name ExternalImplicitFunction symbols + Just (existingOrigin, existingSymbol) + | existingOrigin == origin -> + (\merged -> insertSymbol origin name merged symbols) + <$> mergeSameOriginExternalSymbol name existingSymbol ExternalImplicitFunction + | otherwise -> case existingSymbol of + ExternalGlobal _ -> + Left $ duplicateExternalSymbolError name + ExternalFunction existing existingHasBody -> + mergeExternalFunctions name (existing, existingHasBody) (implicitExternalFunction, False) + $> symbols + _ -> + pure symbols + + registerFunction origin actualDefinitions visibleFuncs (symbols, staticSymbols, funcsAcc) (name, func) + | CT.isSCStatic (PF.fntype func) = do + rejectExternalSymbolConflict origin semanticName newSymbol symbols + staticSymbols' <- registerStaticSymbol origin semanticName newSymbol staticSymbols + pure + ( symbols + , staticSymbols' + , insertVisibleFunction name (func, hasBody) visibleFuncs funcsAcc + ) + | otherwise = do + rejectStaticSymbolConflict origin semanticName newSymbol staticSymbols + case Map.lookup semanticName symbols of + Nothing -> + pure + ( insertSymbol origin semanticName newSymbol symbols + , staticSymbols + , insertVisibleFunction name (func, hasBody) visibleFuncs funcsAcc + ) + Just (existingOrigin, existingSymbol) + | existingOrigin == origin -> + case mergeSameOriginExternalSymbol name existingSymbol (ExternalFunction func hasBody) of + Left mergeErr -> + Left mergeErr + Right (ExternalFunction mergedFunc mergedHasBody) -> + pure + ( insertSymbol origin semanticName (ExternalFunction mergedFunc mergedHasBody) symbols + , staticSymbols + , insertVisibleFunction name (mergedFunc, mergedHasBody) visibleFuncs funcsAcc + ) + Right _ -> + Left "internal compiler error: unexpected same-input symbol merge result" + | otherwise -> case existingSymbol of + ExternalImplicitFunction -> + mergeExternalFunctions name (implicitExternalFunction, False) (func, hasBody) >> + pure + ( insertSymbol origin semanticName newSymbol symbols + , staticSymbols + , insertVisibleFunction name (func, hasBody) visibleFuncs funcsAcc + ) + ExternalFunction existing existingHasBody -> do + merged <- mergeExternalFunctions name (existing, existingHasBody) (func, hasBody) + pure + ( insertSymbol origin semanticName (uncurry ExternalFunction merged) symbols + , staticSymbols + , insertVisibleFunction name merged visibleFuncs funcsAcc + ) + ExternalGlobal _ -> + Left $ duplicateExternalSymbolError name + where + hasBody = Set.member name actualDefinitions + semanticName = emittedSymbolName origin (CT.isSCStatic $ PF.fntype func) name + newSymbol = ExternalFunction func hasBody + + registerGlobal origin visibleGVars (symbols, staticSymbols, gvarsAcc) (name, gvar) + | CT.isSCStatic (gvtype gvar) = do + rejectExternalSymbolConflict origin semanticName newSymbol symbols + staticSymbols' <- registerStaticSymbol origin semanticName newSymbol staticSymbols + pure + ( symbols + , staticSymbols' + , insertVisibleGlobal name visibleGVars gvarsAcc + ) + | otherwise = do + rejectStaticSymbolConflict origin semanticName newSymbol staticSymbols + case Map.lookup semanticName symbols of + Nothing -> + pure + ( insertSymbol origin semanticName newSymbol symbols + , staticSymbols + , insertVisibleGlobal name visibleGVars gvarsAcc + ) + Just (existingOrigin, existingSymbol) + | existingOrigin == origin -> + case mergeSameOriginExternalSymbol name existingSymbol (ExternalGlobal gvar) of + Left mergeErr -> + Left mergeErr + Right (ExternalGlobal mergedGVar) -> + pure + ( insertSymbol origin semanticName (ExternalGlobal mergedGVar) symbols + , staticSymbols + , insertVisibleGlobal name visibleGVars gvarsAcc + ) + Right _ -> + Left "internal compiler error: unexpected same-input symbol merge result" + | otherwise -> case existingSymbol of + ExternalFunction _ _ -> + Left $ duplicateExternalSymbolError name + ExternalImplicitFunction -> + Left $ duplicateExternalSymbolError name + ExternalGlobal existing -> do + merged <- mergeExternalGlobals name existing gvar + pure + ( insertSymbol origin semanticName (ExternalGlobal merged) symbols + , staticSymbols + , insertVisibleGlobal name visibleGVars gvarsAcc + ) + where + semanticName = emittedSymbolName origin (CT.isSCStatic $ gvtype gvar) name + newSymbol = ExternalGlobal gvar + + insertGlobal name = Map.insertWith (preserveMergedGlobalType name) name + insertFunction = Map.insertWith preserveMergedFunctionType + insertVisibleGlobal name visibleGVars gvarsAcc = + maybe gvarsAcc (\gvar -> insertGlobal name gvar gvarsAcc) $ + Map.lookup name visibleGVars + insertVisibleFunction name func visibleFuncs funcsAcc = + maybe funcsAcc (\visibleFunc -> insertFunction name (visibleFunc, snd func) funcsAcc) $ + Map.lookup name visibleFuncs + insertSymbol origin name symbol = Map.insert name (origin, symbol) + insertStaticSymbol origin name = Map.insert (origin, name) + + emittedSymbolName origin isInternal name + | isInternal = + denamespaceInternalSymbol (fromIntegral origin) name + | otherwise = + name + + rejectExternalSymbolConflict origin name newSymbol symbols = case Map.lookup name symbols of + Just (existingOrigin, existingSymbol) + | existingOrigin == origin -> + mergeSameOriginExternalSymbol name existingSymbol newSymbol $> () + _ -> + pure () + + rejectStaticSymbolConflict origin name newSymbol staticSymbols = case Map.lookup (origin, name) staticSymbols of + Just existingSymbol -> + mergeSameOriginExternalSymbol name existingSymbol newSymbol $> () + Nothing -> + pure () + + registerStaticSymbol origin name newSymbol staticSymbols = case Map.lookup (origin, name) staticSymbols of + Just existingSymbol -> + (\merged -> insertStaticSymbol origin name merged staticSymbols) + <$> mergeSameOriginExternalSymbol name existingSymbol newSymbol + Nothing -> + pure $ insertStaticSymbol origin name newSymbol staticSymbols + + mergeSameOriginExternalSymbol name existingSymbol newSymbol = case (existingSymbol, newSymbol) of + (ExternalImplicitFunction, ExternalImplicitFunction) -> + pure ExternalImplicitFunction + (ExternalImplicitFunction, ExternalFunction func hasBody) -> + pure $ ExternalFunction func hasBody + (ExternalFunction func hasBody, ExternalImplicitFunction) -> + pure $ ExternalFunction func hasBody + (ExternalFunction existing existingHasBody, ExternalFunction func hasBody) -> + uncurry ExternalFunction + <$> mergeExternalFunctions name (existing, existingHasBody) (func, hasBody) + (ExternalGlobal existing, ExternalGlobal gvar) -> + ExternalGlobal <$> mergeExternalGlobals name existing gvar + _ -> + Left $ duplicateExternalSymbolError name -parseResolution :: (Num a, Read a) => String -> (Maybe a, Maybe a) -parseResolution xs = let rs = splitOn "x" xs in if length rs /= 2 then dupe Nothing else - let rs' = map readMaybe rs in if any isNothing rs' then dupe Nothing else (head rs', rs' !! 1) + preserveMergedFunctionType new old = + ( fromMaybe (fst new) $ mergeExternalFunctionTypes (fst new) (fst old) + , snd new || snd old + ) + preserveMergedGlobalType name new old = + fromRight old $ + mergeExternalGlobals name old new -execVisualize :: Show i => Options -> ASTs i -> IO () -execVisualize ops ast = let rlt = parseResolution $ resolution ops in do - rs <- if uncurry (&&) (both isJust rlt) then return rlt else - (Just 640, Just 480) <$ putStrLnErr "warning: the specified resolution is invalid, so using default resolution." - visualize ast (uncurry mkSizeSpec2D rs) (outputFName ops) +runAsm :: Maybe Handle -> Opts -> SI.Asm SI.AsmCodeCtx Integer a -> IO a +runAsm outputHandle opts asm + | optIsRunAsm opts = do + resolvedOutputPath <- resolveReplacementOutputPath $ asmOutputPath opts + shouldValidateOutput <- shouldValidateRunnableLinkedOutput resolvedOutputPath + snd <$> + withReplacementOutputPathAndResolvedPath PreserveReplacementOutputModeKeepingExecutableBits (asmOutputPath opts) (\tmpOutputPath -> do + compilerSpec <- asmCompiler $ optSuppressWarns opts + tmpDir <- getTemporaryDirectory + (asmPath, tmpHandle) <- openTempFile tmpDir "htcc-.s" + finally + ( do + (objPath, objHandle) <- openTempFile tmpDir "htcc-.o" + let cleanupObj = + ignoreIOException (hClose objHandle) + *> ignoreIOException (removeFile objPath) + finally + ( do + (markerAsmPath, markerAsmHandle) <- openTempFile tmpDir "htcc-marker-.s" + let cleanupMarkerAsm = + ignoreIOException (hClose markerAsmHandle) + *> ignoreIOException (removeFile markerAsmPath) + finally + ( do + (markerObjPath, markerObjHandle) <- openTempFile tmpDir "htcc-marker-.o" + let cleanupMarkerObj = + ignoreIOException (hClose markerObjHandle) + *> ignoreIOException (removeFile markerObjPath) + finally + ( do + let runnableOutputMarker = + makeRunnableLinkedOutputMarker + asmPath + objPath + markerObjPath + hPutStr + markerAsmHandle + (x86_64ElfRunnableLinkedOutputMarkerAsm runnableOutputMarker) + hClose markerAsmHandle + setFileMode markerObjPath temporaryWritableMode + hClose markerObjHandle + let markerAssembleArgs = asmAssembleArgs markerObjPath markerAsmPath + when (optIsVerbose opts) $ + hPutStr stderr $ + showCompilerCommandForUser compilerSpec markerAssembleArgs + <> "\n" + -- Snapshot paths are passed to the linker as ordinary pathnames. + -- Keep them unpublished until the producing wrapper group is gone. + callCompilerProcessUntil + waitForCompilerProcessGroupQuiescenceAfterExit + (optSuppressWarns opts) + compilerSpec + markerAssembleArgs + withCompilerObjectSnapshot markerObjPath $ \markerObjSnapshotPath -> do + setFileMode objPath temporaryWritableMode + hClose objHandle + let assembleArgs = asmAssembleArgs objPath asmPath + result' <- SI.runAsmWithHandle tmpHandle asm + hClose tmpHandle + when (optIsVerbose opts) $ + hPutStr stderr $ + showCompilerCommandForUser compilerSpec assembleArgs + <> "\n" + -- Avoid exposing object snapshot pathnames while delayed + -- assembler helpers can still replace them. + callCompilerProcessUntil + waitForCompilerProcessGroupQuiescenceAfterExit + (optSuppressWarns opts) + compilerSpec + assembleArgs + withCompilerObjectSnapshot objPath $ \objSnapshotPath -> do + withRunnableLinkOutputPath shouldValidateOutput tmpOutputPath $ \linkOutputPath -> do + let linkArgs = + asmRunnableLinkArgs + linkOutputPath + objSnapshotPath + markerObjSnapshotPath + when (optIsVerbose opts) $ + hPutStr stderr $ + showCompilerCommandForUser compilerSpec linkArgs + <> "\n" + -- Copy to the replacement staging path only after + -- the linker wrapper group can no longer mutate it. + callCompilerProcessUntil + waitForCompilerProcessGroupQuiescenceAfterExit + (optSuppressWarns opts) + compilerSpec + linkArgs + when shouldValidateOutput $ do + linkedOutputOk <- + ensureRunnableLinkedOutputCopied + linkOutputPath + tmpOutputPath + (Just runnableOutputMarker) + unless linkedOutputOk $ + ioError . userError $ + "HTCC_ASSEMBLER produced a non-runnable final output for -r: " + <> asmOutputPath opts + pure result' + ) + cleanupMarkerObj + ) + cleanupMarkerAsm + ) + cleanupObj + ) + ( ignoreIOException (hClose tmpHandle) + *> ignoreIOException (removeFile asmPath) + ) + ) + | otherwise = maybe + (SI.runAsm asm) + (`SI.runAsmWithHandle` asm) + outputHandle main :: IO () main = do - ops <- execParser $ info optionsP fullDesc - ifM (not <$> doesFileExist (inputFName ops)) (notFould (inputFName ops) >> exitFailure) $ - T.readFile (inputFName ops) >>= execAST' (supressWarn ops) (inputFName ops) >>= maybe (return ()) (bool casm (execVisualize ops . fst3) (visualizeAST ops)) - where - execAST' :: Bool -> FilePath -> InputCCode -> IO (Maybe (ASTs Integer, GlobalVars Integer, Literals Integer)) - execAST' = execAST - notFould fpath = putDocLnErr $ - locTxtDoc "htcc:" <+> - errTxtDoc "error:" <+> - text fpath <> char ':' <+> - text "no such file" <> linebreak <> - text "compilation terminated." + opts <- OA.execParser optsParser + validateOpts opts + let allowSameInputExternalCollisions = + not (optIsRunAsm opts) && length (optInput opts) > 1 + emitWarnings' = + emitWarningsIfEnabled opts + parseInputRawEitherSingleInput fname txt = + case runParser parser fname txt + :: Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs Integer, GlobalVars Integer, Literals Integer, PF.Functions Integer) of + Left x -> Left x + Right (warns, asts, gvars, lits, funcs) -> + Right (warns, (asts, gvars, gvars, lits, funcs, funcs)) + parseInputRawEitherAllowingExternalCollisions fname txt = + case PT.runParserAllowSameInputExternalCollisionsDetailed parser fname txt of + Left x -> Left x + Right (warns, asts, gvars, mergeGVars, lits, funcs, mergeFuncs) -> + Right (warns, (asts, gvars, mergeGVars, lits, funcs, mergeFuncs)) + parseInputRawEither fname txt = + if allowSameInputExternalCollisions + then parseInputRawEitherAllowingExternalCollisions fname txt + else parseInputRawEitherSingleInput fname txt + parseInputRaw fname txt = + either + (\x -> hPutStr stderr (M.errorBundlePretty x) *> exitFailure) + pure + (parseInputRawEither fname txt) + readParsedInputRaw fname = + readInput fname >>= uncurry parseInputRaw + readParsedInput fname = do + (warns, parsedInput) <- readParsedInputRaw fname + case mergeParsedInputsEither [parsedInput] of + Left msg -> do + emitWarnings' warns + hPutStr stderr (msg <> "\n") + exitFailure + Right mergedInput -> do + emitWarnings' warns + pure mergedInput + readVisualizableInput fname = do + (warns, parsedInput) <- readParsedInputRaw fname + case mergeVisualizableInputsEither [parsedInput] of + Left msg -> do + emitWarnings' warns + hPutStr stderr (msg <> "\n") + exitFailure + Right mergedInput -> do + emitWarnings' warns + pure mergedInput + readMergedInputRaw parsedInputs [] = + pure $ reverse parsedInputs + readMergedInputRaw parsedInputs (fname:fnames) = + catchIOError + ( do + txt <- readInputContents fname + case parseInputRawEither fname txt of + Left parseErr -> do + emitWarnings' $ collectedWarningsInInputOrder parsedInputs + hPutStr stderr (M.errorBundlePretty parseErr) + exitFailure + Right (warns, parsedInput) -> + readMergedInputRaw + ((warns, parsedInput) : parsedInputs) + fnames + ) + (\ioErr -> do + emitWarnings' $ collectedWarningsInInputOrder parsedInputs + hPutStr stderr $ formatInputReadError fname ioErr + exitFailure + ) + mergeParsedInputsEither parsedInputs = + mergeOutputInputs $ shiftLiteralLabelsInInputs parsedInputs + mergeVisualizableInputsEither parsedInputs = + mergeVisualizableInputs $ shiftLiteralLabelsInInputs parsedInputs + readMergedInput = do + parsedInputsWithWarnings <- readMergedInputRaw [] (optInput opts) + let parsedInputs = map snd parsedInputsWithWarnings + case mergeParsedInputsEither parsedInputs of + Left msg -> do + emitWarnings' $ collectedWarnings parsedInputsWithWarnings + hPutStr stderr (msg <> "\n") + exitFailure + Right parsedInput -> do + emitWarnings' + (SQ.fromList + [ warning + | (warnings, originatingInput) <- parsedInputsWithWarnings + , warning <- foldMap pure warnings + , shouldEmitMergedWarning originatingInput parsedInput warning + ] + ) + pure parsedInput + runParsed outputHandle (asts, gvars, _, lits, _, _) = + runAsm outputHandle opts $ casmNormalized' asts gvars lits + formatInputReadError fname ioErr + | isDoesNotExistError ioErr = fname <> ": no such file or directory\n" + | otherwise = fname <> ": " <> ioeGetErrorString ioErr <> "\n" + readInputContents fname = withFile fname ReadMode $ \h -> do + txt' <- T.hGetContents h + _ <- evaluate $ T.foldl' (\n _ -> succ n) (0 :: Int) txt' + pure txt' + readInput fname = do + txt <- catchIOError + (readInputContents fname) + (\ioErr -> hPutStr stderr (formatInputReadError fname ioErr) *> exitFailure) + pure (fname, txt) + runVisualize fname = do + (asts, _, _, _, _, _) <- readVisualizableInput fname + sizeSpec <- visualizeSizeSpec opts + writeVisualization asts sizeSpec (visualizeOutputPath opts) + if optVisualizeAst opts + then case optInput opts of + [fname] -> runVisualize fname + _ -> hPutStr stderr "internal compiler error: invalid visualize inputs\n" *> exitFailure + else if optIsRunAsm opts + then forM_ (optInput opts) $ + readParsedInput >=> runParsed Nothing + else maybe + (case optInput opts of + [_] -> + forM_ (optInput opts) $ + readParsedInput >=> runParsed Nothing + _ -> + readMergedInput >>= runParsed Nothing + ) + (\path -> case optInput opts of + [fname] -> do + parsedInput <- readParsedInput fname + withReplacementOutputPath PreserveReplacementOutputMode path $ \tmpOutputPath -> + withFile tmpOutputPath WriteMode $ \h -> + runParsed (Just h) parsedInput + _ -> do + mergedInput <- readMergedInput + withReplacementOutputPath PreserveReplacementOutputMode path $ \tmpOutputPath -> + withFile tmpOutputPath WriteMode $ \h -> + runParsed (Just h) mergedInput + ) + (optOutput opts) diff --git a/bench/Criterion.hs b/bench/Criterion.hs index 2fb7274..7190309 100644 --- a/bench/Criterion.hs +++ b/bench/Criterion.hs @@ -1,19 +1,31 @@ module Main (main) where -import Criterion.Main (bench, bgroup, defaultConfig, defaultMainWith, - nf, whnf) -import Criterion.Types (reportFile) +import Criterion.Main (bench, bgroup, + defaultConfig, + defaultMainWith, + nf, whnf) +import Criterion.Types (reportFile) -import Data.Either (fromRight) -import qualified Data.Text as T +import qualified Data.Text as T +import Data.Void (Void) -import Htcc.Parser (parse) -import qualified Htcc.Tokenizer as HT -import Htcc.Utils (tshow) +import qualified Htcc.MegaparsecCompat as M +import Htcc.Parser.AST (ASTs) +import Htcc.Parser.Combinators (parser, + runParser) +import Htcc.Parser.ConstructionData.Core (Warnings) +import qualified Htcc.Parser.ConstructionData.Scope.Function as PF +import Htcc.Parser.ConstructionData.Scope.ManagedScope (ASTError) +import qualified Htcc.Parser.ConstructionData.Scope.Var as PV +import qualified Htcc.Tokenizer as HT +import Htcc.Utils (tshow) -tknize :: T.Text -> Either (HT.TokenLCNums Int, T.Text) [HT.TokenLC Int] +tknize :: T.Text -> Either (ASTError Int) [HT.TokenLC Int] tknize = HT.tokenize +parseProgram :: T.Text -> Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs Integer, PV.GlobalVars Integer, PV.Literals Integer, PF.Functions Integer) +parseProgram = runParser parser "" + data CCodes = ReturningZero | StrLiteral @@ -43,8 +55,8 @@ main = defaultMainWith (defaultConfig { reportFile = Just "./bench_report.html" , bench "Calculate fibonacci" $ nf tknize $ tshow CalculateFibonacci ] , bgroup "parse tokens (whnf)" - [ bench "ReturningZero" $ whnf parse $ fromRight [] $ tknize $ tshow ReturningZero - , bench "StrLiteral" $ whnf parse $ fromRight [] $ tknize $ tshow StrLiteral - , bench "Calculate fibonacci" $ whnf parse $ fromRight [] $ tknize $ tshow CalculateFibonacci + [ bench "ReturningZero" $ whnf parseProgram $ tshow ReturningZero + , bench "StrLiteral" $ whnf parseProgram $ tshow StrLiteral + , bench "Calculate fibonacci" $ whnf parseProgram $ tshow CalculateFibonacci ] ] diff --git a/cabal.project b/cabal.project new file mode 100644 index 0000000..7ef286b --- /dev/null +++ b/cabal.project @@ -0,0 +1,2 @@ +packages: . +tests: True diff --git a/docker/example.dhall b/docker/example.dhall index 72ae872..77f7908 100644 --- a/docker/example.dhall +++ b/docker/example.dhall @@ -1,8 +1,10 @@ let types = - https://raw.githubusercontent.com/falgon/dhall-docker-compose/master/compose/v3/types.dhall + https://raw.githubusercontent.com/falgon/dhall-docker-compose/2e37f35926a9306278f0fb41328ea6d3fed016ee/compose/v3/types.dhall + sha256:df692daa4e2ec76fdf6eb873fdc703e995b74f45e63ed6b283b6cd4c3ac74a58 let defaults = - https://raw.githubusercontent.com/falgon/dhall-docker-compose/master/compose/v3/defaults.dhall + https://raw.githubusercontent.com/falgon/dhall-docker-compose/2e37f35926a9306278f0fb41328ea6d3fed016ee/compose/v3/defaults.dhall + sha256:205439311c2f3e48b8de6737322af5305966cbca5ceb34374699c4dda141da91 let htccService = defaults.Service diff --git a/docker/scripts/test.sh b/docker/scripts/test.sh new file mode 100755 index 0000000..3de1f6d --- /dev/null +++ b/docker/scripts/test.sh @@ -0,0 +1,25 @@ +#!/bin/bash + +set -euo pipefail + +work_dir=${HTCC_WORK_DIR:-/htcc_work} +asm_list=$(mktemp) +count=0 +trap 'rm -f "$asm_list"' EXIT + +find "$work_dir" -name "*.s" -type f -print0 > "$asm_list" + +while IFS= read -r -d '' fname; do + count=$((count + 1)) + out_dir=${fname%/*} + base=${fname##*/} + out_file="$out_dir/${base%.s}.o" + gcc -xassembler -no-pie -o "$out_file" "$fname" + echo ">>>>> $fname" + "$out_file" +done < "$asm_list" + +if [ "$count" -eq 0 ]; then + echo "no assembly files found in $work_dir" >&2 + exit 1 +fi diff --git a/docker/test.dhall b/docker/test.dhall index ba1029a..42a4a45 100644 --- a/docker/test.dhall +++ b/docker/test.dhall @@ -1,17 +1,18 @@ let types = - https://raw.githubusercontent.com/falgon/dhall-docker-compose/master/compose/v3/types.dhall + https://raw.githubusercontent.com/falgon/dhall-docker-compose/2e37f35926a9306278f0fb41328ea6d3fed016ee/compose/v3/types.dhall + sha256:df692daa4e2ec76fdf6eb873fdc703e995b74f45e63ed6b283b6cd4c3ac74a58 let defaults = - https://raw.githubusercontent.com/falgon/dhall-docker-compose/master/compose/v3/defaults.dhall + https://raw.githubusercontent.com/falgon/dhall-docker-compose/2e37f35926a9306278f0fb41328ea6d3fed016ee/compose/v3/defaults.dhall + sha256:205439311c2f3e48b8de6737322af5305966cbca5ceb34374699c4dda141da91 let htccService = defaults.Service ⫽ { image = Some "roki/htcc_test:1.0.0" , command = Some - ( types.StringOrList.String - "/bin/sh -c 'gcc -no-pie -o spec /htcc_work/spec.s && ./spec'" - ) - , volumes = Some [ "/tmp/htcc:/htcc_work" ] + (types.StringOrList.String "/bin/bash /htcc_work/scripts/test.sh") + , volumes = Some + [ "/tmp/htcc:/htcc_work", "./docker/scripts:/htcc_work/scripts" ] , build = Some ( types.Build.Object { context = "." diff --git a/example/Makefile b/example/Makefile index 4668407..f22945d 100644 --- a/example/Makefile +++ b/example/Makefile @@ -1,5 +1,7 @@ SHELL=/bin/bash DIST_DIR=dist +DHALL_TO_YAML ?= dhall-to-yaml +DOCKER_COMPOSE ?= docker compose define compile mkdir -p $(DIST_DIR) @@ -9,7 +11,7 @@ endef define compile_docker_and_run mkdir -p /tmp/htcc stack exec htcc -- $1 > /tmp/htcc/spec.s - dhall-to-yaml < ../docker/example.dhall | docker-compose -f - up + $(DHALL_TO_YAML) --file ../docker/example.dhall | $(DOCKER_COMPOSE) -f - up rm -r /tmp/htcc endef @@ -42,6 +44,6 @@ clean: $(RM) -r $(DIST_DIR) clean_docker: - dhall-to-yaml < ../docker/example.dhall | docker-compose -f - down --rmi all + $(DHALL_TO_YAML) --file ../docker/example.dhall | $(DOCKER_COMPOSE) -f - down --rmi all .PHONY: knapsack shuffle_and_sort merge_sort_linked_list lifegame docker_knapsack docker_shuffle_and_sort docker_merge_sort_linked_list docker clean diff --git a/example/shuffle_and_sort.c b/example/shuffle_and_sort.c index 421e7e6..584bc1c 100644 --- a/example/shuffle_and_sort.c +++ b/example/shuffle_and_sort.c @@ -7,7 +7,7 @@ typedef long time_t; -void srand(unsigned); +void srand(int); int rand(void); time_t time(time_t*); int printf(); @@ -53,22 +53,27 @@ int* med3(int* a, int* b, int* c) return max(min(a, b), min(max(a, b), c)); } -int quick_sort(int* first, int* last) +void quick_sort(int* first, int* last) { - if (first == last) return; - swap(first, med3(first, first + ((last - first) >> 1), last)); + int size = last - first; + if (size < 2) return; - int* l = first; - int* r = last - 1; + int pivot = *med3(first, first + (size >> 1), last - 1); + int l = 0; + int r = size - 1; - while (l < r) { - for (; *r > *first; --r); - for (; *l <= *first && l < r; ++l); - swap(l, r); + while (l <= r) { + for (; *(first + l) < pivot; ++l); + for (; *(first + r) > pivot; --r); + if (l <= r) { + swap(first + l, first + r); + ++l; + --r; + } } - swap(first, l); - quick_sort(first, l); - quick_sort(l + 1, last); + + quick_sort(first, first + (r + 1)); + quick_sort(first + l, last); } int main() @@ -76,7 +81,7 @@ int main() int ar[10]; int size = sizeof ar / sizeof *ar; - iota(ar, ar + size); + iota(ar, ar + size, 0); shuffle(ar, ar + size); printf("Before sorting: "); diff --git a/htcc.cabal b/htcc.cabal index f1a5855..2faaccb 100644 --- a/htcc.cabal +++ b/htcc.cabal @@ -1,10 +1,10 @@ -cabal-version: 1.12 +cabal-version: 2.0 --- This file has been generated from package.yaml by hpack version 0.33.0. +-- This file has been generated from package.yaml by hpack version 0.38.3. -- -- see: https://github.com/sol/hpack -- --- hash: 22bacaccc5bc8617817eb1c982386f1b7ce9a77e1928a9f47181e1dcd7a27d2a +-- hash: 6898310968dcf001c66a93150d55c5e9f04e053e15793d6120855f7a339dca4b name: htcc version: 0.0.0.1 @@ -63,14 +63,30 @@ library Htcc.CRules.Types.CType Htcc.CRules.Types.StorageClass Htcc.CRules.Types.TypeKind + Htcc.MegaparsecCompat + Htcc.Output + Htcc.WarningSuppression Htcc.Parser Htcc.Parser.AST Htcc.Parser.AST.Core Htcc.Parser.AST.DeduceKind Htcc.Parser.AST.Type - Htcc.Parser.AST.Var - Htcc.Parser.AST.Var.Init - Htcc.Parser.ConstructionData + Htcc.Parser.Combinators + Htcc.Parser.Combinators.BasicOperator + Htcc.Parser.Combinators.ConstExpr + Htcc.Parser.Combinators.Core + Htcc.Parser.Combinators.Decl + Htcc.Parser.Combinators.Decl.Declarator + Htcc.Parser.Combinators.Decl.Spec + Htcc.Parser.Combinators.GNUExtensions + Htcc.Parser.Combinators.Keywords + Htcc.Parser.Combinators.ParserType + Htcc.Parser.Combinators.Program + Htcc.Parser.Combinators.Type + Htcc.Parser.Combinators.Type.Core + Htcc.Parser.Combinators.Type.Utils + Htcc.Parser.Combinators.Utils + Htcc.Parser.Combinators.Var Htcc.Parser.ConstructionData.Core Htcc.Parser.ConstructionData.Scope Htcc.Parser.ConstructionData.Scope.Enumerator @@ -80,14 +96,6 @@ library Htcc.Parser.ConstructionData.Scope.Typedef Htcc.Parser.ConstructionData.Scope.Utils Htcc.Parser.ConstructionData.Scope.Var - Htcc.Parser.Parsing - Htcc.Parser.Parsing.Core - Htcc.Parser.Parsing.Global - Htcc.Parser.Parsing.Global.Function - Htcc.Parser.Parsing.Global.Var - Htcc.Parser.Parsing.StmtExpr - Htcc.Parser.Parsing.Type - Htcc.Parser.Parsing.Typedef Htcc.Parser.Utils Htcc.Parser.Utils.Core Htcc.Tokenizer @@ -105,6 +113,8 @@ library Htcc.Visualizer.Core other-modules: Paths_htcc + autogen-modules: + Paths_htcc hs-source-dirs: src build-depends: @@ -117,26 +127,37 @@ library , diagrams-contrib , diagrams-lib , diagrams-svg + , directory , extra + , filepath + , megaparsec , monad-finally , monad-loops , mono-traversable , mtl , natural-transformation , optparse-applicative + , parsec + , parser-combinators + , prettyprinter + , prettyprinter-ansi-terminal , safe , split , text , transformers + , unix + , utf8-string default-language: Haskell2010 executable htcc main-is: Main.hs other-modules: Paths_htcc + autogen-modules: + Paths_htcc hs-source-dirs: app - ghc-options: -threaded -rtsopts -with-rtsopts=-N -Wall -Werror -O2 + ghc-options: -threaded -rtsopts -with-rtsopts=-N -Wall -O2 build-depends: ansi-wl-pprint , base >=4.7 && <5 @@ -149,29 +170,50 @@ executable htcc , diagrams-svg , directory , extra + , filepath , htcc + , megaparsec , monad-finally , monad-loops , mono-traversable , mtl , natural-transformation , optparse-applicative + , parsec + , parser-combinators + , prettyprinter + , prettyprinter-ansi-terminal + , process , safe , split + , stm + , template-haskell , text , transformers + , unix + , utf8-string default-language: Haskell2010 test-suite htcc-test type: exitcode-stdio-1.0 main-is: Spec.hs + build-tool-depends: + htcc:htcc other-modules: + Tests.CommandSelection + Tests.ComponentsTests + Tests.ComponentsTests.AsmOutput + Tests.ComponentsTests.CommandSelection + Tests.ComponentsTests.Parser.Combinators Tests.SubProcTests - Tests.Test1 - Tests.Test2 - Tests.Test3 + Tests.SubProcTests.AsmOutput + Tests.SubProcTests.LinkFuncRet + Tests.SubProcTests.LinkFuncStdOut + Tests.SubProcTests.StatementEqual Tests.Utils Paths_htcc + autogen-modules: + Paths_htcc hs-source-dirs: test ghc-options: -threaded -rtsopts -with-rtsopts=-N -Wall -O2 @@ -183,8 +225,6 @@ test-suite htcc-test , cond , containers , deepseq - , dhall-json - , dhall-yaml , diagrams-contrib , diagrams-lib , diagrams-svg @@ -196,12 +236,17 @@ test-suite htcc-test , hspec-contrib , hspec-core , htcc + , megaparsec , monad-finally , monad-loops , mono-traversable , mtl , natural-transformation , optparse-applicative + , parsec + , parser-combinators + , prettyprinter + , prettyprinter-ansi-terminal , process , safe , split @@ -209,6 +254,7 @@ test-suite htcc-test , time , transformers , turtle + , unix , utf8-string default-language: Haskell2010 @@ -217,6 +263,8 @@ benchmark criterion main-is: bench/Criterion.hs other-modules: Paths_htcc + autogen-modules: + Paths_htcc ghc-options: -O2 build-depends: ansi-wl-pprint @@ -229,16 +277,25 @@ benchmark criterion , diagrams-contrib , diagrams-lib , diagrams-svg + , directory , extra + , filepath , htcc + , megaparsec , monad-finally , monad-loops , mono-traversable , mtl , natural-transformation , optparse-applicative + , parsec + , parser-combinators + , prettyprinter + , prettyprinter-ansi-terminal , safe , split , text , transformers + , unix + , utf8-string default-language: Haskell2010 diff --git a/package.yaml b/package.yaml index c659094..bbc521a 100644 --- a/package.yaml +++ b/package.yaml @@ -36,7 +36,9 @@ dependencies: - containers - bytestring - deepseq +- directory - safe +- filepath - mtl - monad-finally - mono-traversable @@ -46,29 +48,132 @@ dependencies: - diagrams-lib - natural-transformation - optparse-applicative +- megaparsec +- parsec +- parser-combinators +- prettyprinter +- prettyprinter-ansi-terminal +- unix +- utf8-string library: source-dirs: src - + generated-other-modules: + - Paths_htcc + exposed-modules: + - Htcc.Asm + - Htcc.Asm.Generate + - Htcc.Asm.Generate.Core + - Htcc.Asm.Intrinsic + - Htcc.Asm.Intrinsic.Operand + - Htcc.Asm.Intrinsic.Register + - Htcc.Asm.Intrinsic.Structure + - Htcc.Asm.Intrinsic.Structure.Internal + - Htcc.Asm.Intrinsic.Structure.Section.Data + - Htcc.Asm.Intrinsic.Structure.Section.Text + - Htcc.Asm.Intrinsic.Structure.Section.Text.Directive + - Htcc.Asm.Intrinsic.Structure.Section.Text.Instruction + - Htcc.Asm.Intrinsic.Structure.Section.Text.Operations + - Htcc.CRules + - Htcc.CRules.Char + - Htcc.CRules.LexicalElements + - Htcc.CRules.Preprocessor + - Htcc.CRules.Preprocessor.Core + - Htcc.CRules.Preprocessor.Punctuators + - Htcc.CRules.Types + - Htcc.CRules.Types.CType + - Htcc.CRules.Types.StorageClass + - Htcc.CRules.Types.TypeKind + - Htcc.MegaparsecCompat + - Htcc.Output + - Htcc.WarningSuppression + - Htcc.Parser + - Htcc.Parser.AST + - Htcc.Parser.AST.Core + - Htcc.Parser.AST.DeduceKind + - Htcc.Parser.AST.Type + - Htcc.Parser.Combinators + - Htcc.Parser.Combinators.BasicOperator + - Htcc.Parser.Combinators.ConstExpr + - Htcc.Parser.Combinators.Core + - Htcc.Parser.Combinators.Decl + - Htcc.Parser.Combinators.Decl.Declarator + - Htcc.Parser.Combinators.Decl.Spec + - Htcc.Parser.Combinators.GNUExtensions + - Htcc.Parser.Combinators.Keywords + - Htcc.Parser.Combinators.ParserType + - Htcc.Parser.Combinators.Program + - Htcc.Parser.Combinators.Type + - Htcc.Parser.Combinators.Type.Core + - Htcc.Parser.Combinators.Type.Utils + - Htcc.Parser.Combinators.Utils + - Htcc.Parser.Combinators.Var + - Htcc.Parser.ConstructionData.Core + - Htcc.Parser.ConstructionData.Scope + - Htcc.Parser.ConstructionData.Scope.Enumerator + - Htcc.Parser.ConstructionData.Scope.Function + - Htcc.Parser.ConstructionData.Scope.ManagedScope + - Htcc.Parser.ConstructionData.Scope.Tag + - Htcc.Parser.ConstructionData.Scope.Typedef + - Htcc.Parser.ConstructionData.Scope.Utils + - Htcc.Parser.ConstructionData.Scope.Var + - Htcc.Parser.Utils + - Htcc.Parser.Utils.Core + - Htcc.Tokenizer + - Htcc.Tokenizer.Core + - Htcc.Tokenizer.Token + - Htcc.Utils + - Htcc.Utils.Bool + - Htcc.Utils.CompilationState + - Htcc.Utils.List + - Htcc.Utils.NaturalTransformations + - Htcc.Utils.Print + - Htcc.Utils.Text + - Htcc.Utils.Tuple + - Htcc.Visualizer + - Htcc.Visualizer.Core executables: htcc: main: Main.hs source-dirs: app + generated-other-modules: + - Paths_htcc ghc-options: - -threaded - -rtsopts - -with-rtsopts=-N - -Wall - - -Werror + #- -Werror - -O2 dependencies: - htcc - directory + - filepath + - process + - stm + - template-haskell + - unix tests: htcc-test: main: Spec.hs source-dirs: test + generated-other-modules: + - Paths_htcc + build-tools: + - htcc:htcc + other-modules: + - Tests.CommandSelection + - Tests.ComponentsTests + - Tests.ComponentsTests.AsmOutput + - Tests.ComponentsTests.CommandSelection + - Tests.ComponentsTests.Parser.Combinators + - Tests.SubProcTests + - Tests.SubProcTests.AsmOutput + - Tests.SubProcTests.LinkFuncRet + - Tests.SubProcTests.LinkFuncStdOut + - Tests.SubProcTests.StatementEqual + - Tests.Utils ghc-options: - -threaded - -rtsopts @@ -87,14 +192,15 @@ tests: - hspec-core - hspec-contrib - filepath - - dhall-json - - dhall-yaml - process + - unix - utf8-string benchmarks: criterion: main: bench/Criterion.hs + generated-other-modules: + - Paths_htcc ghc-options: - -O2 dependencies: diff --git a/src/Htcc/Asm.hs b/src/Htcc/Asm.hs index 702e1b5..0992042 100644 --- a/src/Htcc/Asm.hs +++ b/src/Htcc/Asm.hs @@ -15,16 +15,17 @@ module Htcc.Asm ( casm ) where -import Data.Tuple.Extra (uncurry3) - +import Data.Bits (Bits) import Htcc.Asm.Generate import qualified Htcc.Asm.Intrinsic.Operand as O import qualified Htcc.Asm.Intrinsic.Structure.Internal as SI import qualified Htcc.Asm.Intrinsic.Structure.Section.Text.Instruction as TI import Htcc.Parser (ASTs) +import qualified Htcc.Parser.ConstructionData.Scope.Function as PF import Htcc.Parser.ConstructionData.Scope.Var (GlobalVars, Literals) +import Htcc.Utils (uncurry4) -- | Generate full assembly code from string of C source code -casm :: (O.IsOperand i, TI.UnaryInstruction i, TI.BinaryInstruction i, Integral i) => (ASTs i, GlobalVars i, Literals i) -> IO () -casm = SI.runAsm . uncurry3 casm' +casm :: (Bits i, Read i, Show i, Ord i, O.IsOperand i, TI.UnaryInstruction i, TI.BinaryInstruction i, Integral i) => (ASTs i, GlobalVars i, Literals i, PF.Functions i) -> IO () +casm = SI.runAsm . uncurry4 casm' diff --git a/src/Htcc/Asm/Generate.hs b/src/Htcc/Asm/Generate.hs index d73a5b4..a1d0b69 100644 --- a/src/Htcc/Asm/Generate.hs +++ b/src/Htcc/Asm/Generate.hs @@ -1,3 +1,4 @@ +{-# LANGUAGE LambdaCase #-} {-| Module : Htcc.Asm.Generate Description : The modules of intrinsic (x86_64) assembly @@ -9,104 +10,1002 @@ Portability : POSIX The executable module for compilation -} -{-# LANGUAGE OverloadedStrings #-} module Htcc.Asm.Generate ( InputCCode, + normalizeAsmInput, + prepareAsmInput, + prepareVisualizableInput, -- * Generator casm', - buildAST, - execAST + casmNormalized' ) where -import Control.Monad (unless, (>=>)) -import Data.Bits (Bits) -import Data.Foldable (toList) -import qualified Data.Sequence as S -import qualified Data.Text as T -import System.Exit (exitFailure) -import Text.PrettyPrint.ANSI.Leijen (Doc, blue, - bold, char, - empty, - magenta, red, - text, (<+>)) - -import Htcc.Parser (ASTResult, - ASTs, parse) -import Htcc.Parser.ConstructionData.Scope.ManagedScope (ASTError) -import Htcc.Parser.ConstructionData.Scope.Var (GlobalVars, - Literals) -import qualified Htcc.Tokenizer as HT - +import Control.Applicative ((<|>)) +import Control.Monad (when) +import Data.Bits (Bits) +import Data.Foldable (for_, traverse_) +import qualified Data.Map.Strict as M +import Data.Maybe (fromMaybe, isJust) +import qualified Data.Text as T import Htcc.Asm.Generate.Core import Htcc.Asm.Intrinsic.Operand -import qualified Htcc.Asm.Intrinsic.Structure as SI -import qualified Htcc.Asm.Intrinsic.Structure.Section.Text as IT - -import Htcc.Utils (dropFst4, - putDocErr, - putDocLnErr, - putStrErr, - putStrLnErr, - toInts, tshow) +import qualified Htcc.Asm.Intrinsic.Structure as SI +import qualified Htcc.Asm.Intrinsic.Structure.Section.Text as IT +import qualified Htcc.CRules.Types as CT +import Htcc.Parser (ASTs, ATKind (..), + ATKindFor (..), + ATree (..)) +import Htcc.Parser.Combinators.Program (convertCallArgsWith, + foldGlobalInitWith, + isInvalidAggregateValueConversion) +import Htcc.Parser.Combinators.Utils (conditionalResultType, + containsEscapingStmtExprControlFlow, + decayExprType, + isInvalidFunctionPointerValue, + isInvalidObjectPointerValue, + requiresUnsupportedNonAddressableArrayDecay) +import Htcc.Parser.ConstructionData.Core (hasIncompleteObjectType) +import qualified Htcc.Parser.ConstructionData.Scope.Function as PF +import Htcc.Parser.ConstructionData.Scope.Var (GVar (..), + GVarInitWith (..), + GlobalVars, + Literals, + materializeTentativeIncompleteArray) -- | input string, C source code type InputCCode = T.Text -data MessageType = ErrorMessage | WarningMessage - deriving (Eq, Ord, Enum, Bounded) - -instance Show MessageType where - show ErrorMessage = "error" - show WarningMessage = "warning" - -{-# INLINE messageColor #-} -messageColor :: MessageType -> Doc -> Doc -messageColor ErrorMessage = red -messageColor WarningMessage = magenta - -{-# INLINE repSpace #-} -repSpace :: Integral i => i -> MessageType -> IO () -repSpace i mest = do - mapM_ (putStrErr . T.pack . flip replicate ' ' . pred) $ toInts i - putDocErr $ messageColor mest $ char '^' - -{-# INLINE format #-} -format :: T.Text -> Int -> InputCCode -> IO () -format errMesPre e xs = do - putDocErr $ blue (text $ T.unpack errMesPre) <+> blue (char '|') <+> empty - putStrLnErr (T.lines xs !! max 0 (fromIntegral e)) - putStrErr $ T.replicate (T.length errMesPre) " " - putDocErr $ empty <+> blue (char '|') <+> empty - -parsedMessage :: (Integral i, Show i) => MessageType -> FilePath -> InputCCode -> ASTError i -> IO () -parsedMessage mest fpath xs (s, (i, etk)) = do - putDocLnErr $ - bold (text fpath) <> bold (char ':') <> - bold (text (show i)) <> bold (char ':') <+> - messageColor mest (text $ show mest) <> messageColor mest (char ':') <+> - text (T.unpack s) - format (T.replicate 4 " " <> tshow (HT.tkLn i)) (pred $ fromIntegral $ HT.tkLn i) xs - repSpace (HT.tkCn i) mest - putDocLnErr $ messageColor mest (text $ replicate (pred $ HT.length etk) '~') - --- | the function to output error message -parsedErrExit :: (Integral i, Show i) => FilePath -> InputCCode -> ASTError i -> IO () -parsedErrExit fpath ccode err = parsedMessage ErrorMessage fpath ccode err >> exitFailure - --- | the function to output warning message -parsedWarn :: (Integral i, Show i) => FilePath -> InputCCode -> S.Seq (ASTError i) -> IO () -parsedWarn fpath xs warns = mapM_ (parsedMessage WarningMessage fpath xs) (toList warns) +normalizeGlobalInitializers :: (Integral i, Bits i, Read i, Show i, Ord i) => GlobalVars i -> Either String (GlobalVars i) +normalizeGlobalInitializers = M.traverseWithKey resolveGlobalInit + where + resolveGlobalInit _ gvar = case initWith gvar of + GVarInitWithAST ast -> + (\ginit -> gvar { initWith = ginit }) + <$> foldGlobalInitWith (gvtype gvar) ast + _ -> + Right gvar --- | Executor that receives information about the constructed AST, --- global variables, and literals and composes assembly code -casm' :: (Integral e, Show e, Integral i, IsOperand i, IT.UnaryInstruction i, IT.BinaryInstruction i) => ASTs i -> GlobalVars i -> Literals i -> SI.Asm SI.AsmCodeCtx e () -casm' atl gvars lits = dataSection gvars lits >> textSection atl +mergedGlobalType :: Eq i => Maybe (GlobalVars i) -> T.Text -> CT.StorageClass i -> CT.StorageClass i +mergedGlobalType maybeGVars name currentTy = + maybe currentTy mergeDeclaredGlobalType $ + maybeGVars >>= M.lookup name + where + mergeDeclaredGlobalType gvar = + let declaredTy = gvtype gvar + in maybe declaredTy + (\mergedTy -> CT.mapTypeKind (const mergedTy) declaredTy) + ( CT.mergeCompatibleTypeKinds + (CT.toTypeKind declaredTy) + (CT.toTypeKind currentTy) + <|> CT.mergeCompatibleTypeKinds + (CT.toTypeKind currentTy) + (CT.toTypeKind declaredTy) + ) + +retypeResolvedGlobalRefs :: Eq i => GlobalVars i -> ASTs i -> ASTs i +retypeResolvedGlobalRefs = map . retypeResolvedGlobalRefsInATree + where + retypeResolvedGlobalRefsInATKindFor gvars kind = case kind of + ATForkw -> ATForkw + ATForInit at -> ATForInit $ retypeResolvedGlobalRefsInATree gvars at + ATForCond at -> ATForCond $ retypeResolvedGlobalRefsInATree gvars at + ATForIncr at -> ATForIncr $ retypeResolvedGlobalRefsInATree gvars at + ATForStmt at -> ATForStmt $ retypeResolvedGlobalRefsInATree gvars at + + retypeResolvedGlobalRefsInATKind gvars kind = case kind of + ATConditional cond tr fl -> + ATConditional + (retypeResolvedGlobalRefsInATree gvars cond) + (retypeResolvedGlobalRefsInATree gvars tr) + (retypeResolvedGlobalRefsInATree gvars fl) + ATSwitch cond cases -> + ATSwitch + (retypeResolvedGlobalRefsInATree gvars cond) + (map (retypeResolvedGlobalRefsInATree gvars) cases) + ATFor kinds -> + ATFor $ map (retypeResolvedGlobalRefsInATKindFor gvars) kinds + ATBlock ats -> + ATBlock $ map (retypeResolvedGlobalRefsInATree gvars) ats + ATStmtExpr ats -> + ATStmtExpr $ map (retypeResolvedGlobalRefsInATree gvars) ats + ATNull at -> + ATNull $ retypeResolvedGlobalRefsInATree gvars at + ATDefFunc name args -> + ATDefFunc name $ map (retypeResolvedGlobalRefsInATree gvars) <$> args + ATCallFunc name args -> + ATCallFunc name $ map (retypeResolvedGlobalRefsInATree gvars) <$> args + ATCallPtr args -> + ATCallPtr $ map (retypeResolvedGlobalRefsInATree gvars) <$> args + ATGVar ty name -> + ATGVar (mergedGlobalType (Just gvars) name ty) name + _ -> + kind + + retypeResolvedGlobalRefsInATree _ ATEmpty = ATEmpty + retypeResolvedGlobalRefsInATree gvars (ATNode kind ty lhs rhs) = + ATNode + retypedKind + retypedTy + (retypeResolvedGlobalRefsInATree gvars lhs) + (retypeResolvedGlobalRefsInATree gvars rhs) + where + retypedKind = retypeResolvedGlobalRefsInATKind gvars kind + retypedTy = case retypedKind of + ATGVar resolvedTy _ -> resolvedTy + _ -> ty + +normalizeAsmInput :: (Integral i, Bits i, Read i, Show i, Ord i) + => ASTs i + -> GlobalVars i + -> Either String (ASTs i, GlobalVars i) +normalizeAsmInput atl gvars = do + normalizedGVars <- normalizeGlobalInitializers gvars + pure (retypeResolvedGlobalRefs normalizedGVars atl, normalizedGVars) + +data MergedRevalidationMode + = StrictMergedRevalidation + | GlobalInitializerMergedRevalidation + | VisualizableMergedRevalidation + +mergedCallableSignature :: Ord i => CT.StorageClass i -> Maybe (CT.StorageClass i, Maybe [CT.StorageClass i]) +mergedCallableSignature ty = case CT.toTypeKind ty of + CT.CTFunc retTy params -> + Just (CT.SCAuto retTy, explicitFunctionParamTypes params) + CT.CTPtr (CT.CTFunc retTy params) -> + Just (CT.SCAuto retTy, explicitFunctionParamTypes params) + _ -> + Nothing + where + explicitFunctionParamTypes [] = Nothing + explicitFunctionParamTypes [(CT.CTVoid, Nothing)] = Just [] + explicitFunctionParamTypes params = + Just $ map (CT.SCAuto . canonicalizeFunctionParamType . fst) params + + canonicalizeFunctionParamType = \case + CT.CTArray _ elemTy -> CT.CTPtr elemTy + CT.CTIncomplete (CT.IncompleteArray elemTy) -> CT.CTPtr elemTy + CT.CTFunc retTy params -> CT.CTPtr $ CT.CTFunc retTy params + other -> other + +mergedFunctionParamBindings :: Ord i => CT.StorageClass i -> Maybe [ATree i] -> M.Map i (CT.StorageClass i) +mergedFunctionParamBindings ty maybeArgs = case (maybeArgs, mergedParamTypes ty) of + (Just args, Just paramTys) + | length args == length paramTys + , Just offsets <- traverse paramOffset args -> + M.fromList $ zip offsets paramTys + _ -> + M.empty + where + mergedParamTypes resolvedTy = case mergedCallableSignature resolvedTy of + Just (_, formalParamTys) -> formalParamTys + Nothing -> Nothing + + paramOffset (ATNode (ATLVar _ offset) _ _ _) = Just offset + paramOffset _ = Nothing + +mergeCompatibleStorageClass :: Eq i => CT.StorageClass i -> CT.StorageClass i -> CT.StorageClass i +mergeCompatibleStorageClass preferred current = + maybe preferred (\mergedTy -> CT.mapTypeKind (const mergedTy) preferred) $ + CT.mergeCompatibleTypeKinds preferredTy currentTy + <|> CT.mergeCompatibleTypeKinds currentTy preferredTy + where + preferredTy = CT.toTypeKind preferred + currentTy = CT.toTypeKind current + +mergeCompatibleParamType :: Eq i => CT.TypeKind i -> CT.StorageClass i -> CT.TypeKind i +mergeCompatibleParamType preferred current = + fromMaybe preferred $ + CT.mergeCompatibleTypeKinds preferred currentTy + <|> CT.mergeCompatibleTypeKinds currentTy preferred + where + currentTy = CT.toTypeKind current + +mergedFunctionType :: Eq i => PF.Functions i -> T.Text -> CT.StorageClass i -> CT.StorageClass i +mergedFunctionType funcs name fallback = + maybe fallback (\fn -> mergeCompatibleStorageClass (PF.fntype fn) fallback) $ M.lookup name funcs + +functionReturnType :: CT.StorageClass i -> Maybe (CT.StorageClass i) +functionReturnType ty = case CT.toTypeKind ty of + CT.CTFunc retTy _ -> Just $ CT.SCAuto retTy + _ -> Nothing + +isUnsupportedByValueAggregateType :: Ord i => CT.StorageClass i -> Bool +isUnsupportedByValueAggregateType ty = + CT.isCTStruct ty && CT.sizeof ty > 8 + +unsupportedByValueFunctionReturnType :: Ord i => CT.StorageClass i -> Bool +unsupportedByValueFunctionReturnType ty = case CT.toTypeKind ty of + CT.CTFunc retTy _ -> + isUnsupportedByValueAggregateType $ CT.SCAuto retTy + _ -> + False + +functionTypeWithMergedCall :: Eq i => CT.StorageClass i -> CT.StorageClass i -> [ATree i] -> CT.StorageClass i +functionTypeWithMergedCall fnTy currentReturnTy currentArgs = case CT.toTypeKind fnTy of + CT.CTFunc retTy params -> + CT.mapTypeKind (const $ CT.CTFunc (mergedReturnTy retTy) (mergedParams params)) fnTy + _ -> + fnTy + where + currentRetTy = CT.toTypeKind currentReturnTy + mergedReturnTy retTy = + fromMaybe retTy $ + CT.mergeCompatibleTypeKinds retTy currentRetTy + <|> CT.mergeCompatibleTypeKinds currentRetTy retTy + + mergedParams params + | length params == length currentArgs = + zipWith mergedParam params currentArgs + | otherwise = + params + + mergedParam (paramTy, ident) arg = + (mergeCompatibleParamType paramTy (atype arg), ident) + +derefMergedObjectType :: Ord i => CT.StorageClass i -> Maybe (CT.StorageClass i) +derefMergedObjectType ty = case CT.toTypeKind ty of + CT.CTArray n (CT.CTIncomplete (CT.IncompleteArray elemTy)) -> + Just $ CT.mapTypeKind (const $ CT.CTArray n elemTy) ty + _ -> + CT.deref ty + +objectPointerRetyped :: Ord i => ATree i -> ATree i -> ATree i -> ATree i -> Bool +objectPointerRetyped originalLhs originalRhs lhs rhs = + atype lhs /= atype originalLhs + || decayExprType (atype rhs) /= decayExprType (atype originalRhs) + +invalidAssignmentOperands :: (Ord i, Bits i, Integral i) => ATKind i -> ATree i -> ATree i -> ATree i -> ATree i -> Bool +invalidAssignmentOperands kind originalLhs originalRhs lhs rhs = case kind of + ATAssign -> + isInvalidFunctionPointerValue (atype lhs) rhs + || isInvalidAggregateValueConversion (atype lhs) rhs + || ( objectPointerRetyped originalLhs originalRhs lhs rhs + && isInvalidObjectPointerValue (atype lhs) rhs + ) + _ -> + isInvalidCompoundAssignmentOperands kind lhs rhs + +isInvalidCompoundAssignmentOperands :: Ord i => ATKind i -> ATree i -> ATree i -> Bool +isInvalidCompoundAssignmentOperands kind lhs rhs = case kind of + ATAddPtrAssign -> + not $ + isPointerArithmeticOperandType (atype lhs) + && isIntegerOperandType (atype rhs) + ATSubPtrAssign -> + not $ + isPointerArithmeticOperandType (atype lhs) + && isIntegerOperandType (atype rhs) + ATAddAssign -> + invalidIntegerOperands lhs rhs + ATSubAssign -> + invalidIntegerOperands lhs rhs + ATMulAssign -> + invalidIntegerOperands lhs rhs + ATDivAssign -> + invalidIntegerOperands lhs rhs + ATAndAssign -> + invalidIntegerOperands lhs rhs + ATOrAssign -> + invalidIntegerOperands lhs rhs + ATXorAssign -> + invalidIntegerOperands lhs rhs + ATShlAssign -> + invalidIntegerOperands lhs rhs + ATShrAssign -> + invalidIntegerOperands lhs rhs + _ -> + False + +invalidRefreshedOperandUse :: (Ord i, Bits i, Integral i) => ATKind i -> CT.StorageClass i -> ATree i -> ATree i -> Maybe String +invalidRefreshedOperandUse kind nodeTy lhs rhs = case kind of + ATIf + | invalidConditionOperand lhs -> Just "invalid condition type" + ATWhile + | invalidConditionOperand lhs -> Just "invalid condition type" + ATSwitch cond _ + | invalidConditionOperand cond -> Just "invalid condition type" + ATFor kinds -> + foldr ((<|>) . invalidForOperandUse) Nothing kinds + ATConditional cond ATEmpty el -> + invalidConditionalOperands cond cond el + ATConditional cond th el -> + invalidConditionalOperands cond th el + ATCast -> + invalidCastOperands nodeTy lhs + ATPreInc + | invalidScalarOperand lhs -> Just "invalid operands" + ATPreDec + | invalidScalarOperand lhs -> Just "invalid operands" + ATPostInc + | invalidScalarOperand lhs -> Just "invalid operands" + ATPostDec + | invalidScalarOperand lhs -> Just "invalid operands" + ATNot + | invalidScalarOperand lhs -> Just "invalid operands" + ATBitNot + | invalidIntegerOperand lhs -> Just "invalid operands" + ATAdd + | invalidArithmeticOperands lhs rhs -> Just "invalid operands" + ATSub + | invalidArithmeticOperands lhs rhs -> Just "invalid operands" + ATAddPtr + | not (isPointerArithmeticOperandType (atype lhs) && isIntegerOperandType (atype rhs)) -> Just "invalid operands" + ATSubPtr + | not (isPointerArithmeticOperandType (atype lhs) && isIntegerOperandType (atype rhs)) -> Just "invalid operands" + ATPtrDis + | not (isPointerArithmeticOperandType (atype lhs) && isPointerArithmeticOperandType (atype rhs)) -> Just "invalid operands" + ATMul + | invalidIntegerOperands lhs rhs -> Just "invalid operands" + ATDiv + | invalidIntegerOperands lhs rhs -> Just "invalid operands" + ATMod + | invalidIntegerOperands lhs rhs -> Just "invalid operands" + ATShl + | invalidIntegerOperands lhs rhs -> Just "invalid operands" + ATShr + | invalidIntegerOperands lhs rhs -> Just "invalid operands" + ATLT + | invalidScalarOperands lhs rhs -> Just "invalid operands" + ATLEQ + | invalidScalarOperands lhs rhs -> Just "invalid operands" + ATGT + | invalidScalarOperands lhs rhs -> Just "invalid operands" + ATGEQ + | invalidScalarOperands lhs rhs -> Just "invalid operands" + ATEQ + | invalidScalarOperands lhs rhs -> Just "invalid operands" + ATNEQ + | invalidScalarOperands lhs rhs -> Just "invalid operands" + ATLAnd + | invalidScalarOperands lhs rhs -> Just "invalid operands" + ATLOr + | invalidScalarOperands lhs rhs -> Just "invalid operands" + ATAnd + | invalidIntegerOperands lhs rhs -> Just "invalid operands" + ATOr + | invalidIntegerOperands lhs rhs -> Just "invalid operands" + ATXor + | invalidIntegerOperands lhs rhs -> Just "invalid operands" + _ -> + Nothing + +invalidForOperandUse :: (Ord i, Bits i, Integral i) => ATKindFor i -> Maybe String +invalidForOperandUse = \case + ATForCond cond + | cond /= ATEmpty && invalidConditionOperand cond -> Just "invalid condition type" + _ -> + Nothing + +invalidConditionalOperands :: (Ord i, Bits i, Integral i) => ATree i -> ATree i -> ATree i -> Maybe String +invalidConditionalOperands cond lhs rhs + | invalidConditionOperand cond = Just "invalid condition type" + | otherwise = maybe (Just "invalid operands") (const Nothing) $ conditionalResultType lhs rhs + +invalidCastOperands :: Eq i => CT.StorageClass i -> ATree i -> Maybe String +invalidCastOperands targetTy operand + | not (isVoidType targetTy) + && isAggregateType (atype operand) + && (not (isAggregateType targetTy) || isInvalidAggregateValueConversion targetTy operand) = + Just "invalid cast operand" + | otherwise = Nothing + +invalidConditionOperand :: Ord i => ATree i -> Bool +invalidConditionOperand = invalidScalarOperand + +invalidScalarOperand :: Ord i => ATree i -> Bool +invalidScalarOperand = + not . isScalarOperandType . atype + +invalidScalarOperands :: Ord i => ATree i -> ATree i -> Bool +invalidScalarOperands lhs rhs = + invalidScalarOperand lhs || invalidScalarOperand rhs + +invalidIntegerOperand :: Ord i => ATree i -> Bool +invalidIntegerOperand = + not . isIntegerOperandType . atype + +invalidIntegerOperands :: Ord i => ATree i -> ATree i -> Bool +invalidIntegerOperands lhs rhs = + invalidIntegerOperand lhs || invalidIntegerOperand rhs + +invalidArithmeticOperands :: Ord i => ATree i -> ATree i -> Bool +invalidArithmeticOperands lhs rhs = + not $ + isArithmeticOperandType (atype lhs) + && isArithmeticOperandType (atype rhs) + +isScalarOperandType :: Ord i => CT.StorageClass i -> Bool +isScalarOperandType ty = + CT.isIntegral decayedTy || isPointerType decayedTy + where + decayedTy = decayExprType ty + +isIntegerOperandType :: Ord i => CT.StorageClass i -> Bool +isIntegerOperandType = + CT.isIntegral . decayExprType --- | Build AST from string of C source code -buildAST :: (Integral i, Read i, Show i, Bits i) => InputCCode -> ASTResult i -buildAST = HT.tokenize >=> parse +isArithmeticOperandType :: Ord i => CT.StorageClass i -> Bool +isArithmeticOperandType = + CT.isFundamental . decayExprType --- | Print warning or error message if building AST from string of C source code has some problems -execAST :: (Integral i, Read i, Show i, Bits i) => Bool -> FilePath -> InputCCode -> IO (Maybe (ASTs i, GlobalVars i, Literals i)) -execAST supWarns fpath ccode = flip (either ((<$) Nothing . parsedErrExit fpath ccode)) (buildAST ccode) $ \xs@(warns, _, _, _) -> - Just (dropFst4 xs) <$ unless supWarns (parsedWarn fpath ccode warns) +isPointerArithmeticOperandType :: Ord i => CT.StorageClass i -> Bool +isPointerArithmeticOperandType ty = + maybe False (not . isFunctionType) $ CT.deref ty + +isPointerType :: CT.StorageClass i -> Bool +isPointerType ty = case CT.toTypeKind ty of + CT.CTPtr _ -> True + _ -> False + +isFunctionType :: CT.StorageClass i -> Bool +isFunctionType ty = case CT.toTypeKind ty of + CT.CTFunc _ _ -> True + _ -> False + +isAggregateType :: CT.StorageClass i -> Bool +isAggregateType ty = + CT.isCTStruct ty || CT.isIncompleteStruct ty + +isVoidType :: CT.StorageClass i -> Bool +isVoidType ty = case CT.toTypeKind ty of + CT.CTVoid -> True + _ -> False + +invalidIncompletePointerArithmetic :: Ord i => ATKind i -> ATree i -> ATree i -> Bool +invalidIncompletePointerArithmetic kind lhs rhs = case kind of + ATAddPtr -> + hasIncompletePointerTarget lhs + ATSubPtr -> + hasIncompletePointerTarget lhs + ATPtrDis -> + hasIncompletePointerTarget lhs || hasIncompletePointerTarget rhs + ATAddPtrAssign -> + hasIncompletePointerTarget lhs + ATSubPtrAssign -> + hasIncompletePointerTarget lhs + ATPreInc -> + hasIncompletePointerTarget lhs + ATPreDec -> + hasIncompletePointerTarget lhs + ATPostInc -> + hasIncompletePointerTarget lhs + ATPostDec -> + hasIncompletePointerTarget lhs + _ -> + False + where + hasIncompletePointerTarget expr = + maybe False hasIncompleteObjectType $ CT.deref (atype expr) + +invalidIncompleteMemOp :: Ord i => ATKind i -> ATree i -> Bool +invalidIncompleteMemOp kind lhs = case kind of + ATSizeof -> + hasIncompleteObjectType $ atype lhs + ATAlignof -> + hasIncompleteObjectType $ atype lhs + _ -> + False + +invalidReturnValue :: (Ord i, Bits i, Integral i) => Maybe (CT.StorageClass i) -> ATKind i -> ATree i -> Bool +invalidReturnValue currentReturnTy kind returnedExpr = case kind of + ATReturn -> + maybe False + (\returnTy -> + returnedExpr /= ATEmpty + && ( isInvalidFunctionPointerValue returnTy returnedExpr + || isInvalidObjectPointerValue returnTy returnedExpr + || isInvalidAggregateValueConversion returnTy returnedExpr + ) + ) + currentReturnTy + _ -> + False + +invalidAddressOfOperand :: Ord i => Bool -> ATree i -> Bool +invalidAddressOfOperand valueChecks operand = + not (isAddressableUnaryOperand operand) + && ( valueChecks + || not (isUnevaluatedRvalueArrayElementAddressOperand operand) + ) + +isAddressableUnaryOperand :: Ord i => ATree i -> Bool +isAddressableUnaryOperand (ATNode kind _ lhs _) = case kind of + ATLVar _ _ -> True + ATGVar _ _ -> True + ATFuncPtr _ -> True + ATMemberAcc _ -> isAddressableLvalueExpr lhs + ATDeref -> isAddressableDerefOperand lhs + _ -> False +isAddressableUnaryOperand _ = False + +isAddressableLvalueExpr :: Ord i => ATree i -> Bool +isAddressableLvalueExpr (ATNode kind _ lhs _) = case kind of + ATLVar _ _ -> True + ATGVar _ _ -> True + ATMemberAcc _ -> isAddressableLvalueExpr lhs + ATDeref -> isAddressableDerefOperand lhs + _ -> False +isAddressableLvalueExpr _ = False + +isAddressableDerefOperand :: Ord i => ATree i -> Bool +isAddressableDerefOperand (ATNode ATAddPtr _ arrayExpr _) + | CT.isArray (atype arrayExpr) = + isAddressableLvalueExpr arrayExpr +isAddressableDerefOperand (ATNode ATSubPtr _ arrayExpr _) + | CT.isArray (atype arrayExpr) = + isAddressableLvalueExpr arrayExpr +isAddressableDerefOperand operand + | CT.isArray (atype operand) = + isAddressableLvalueExpr operand +isAddressableDerefOperand operand = isJust $ CT.deref (atype operand) + +isUnevaluatedRvalueArrayElementAddressOperand :: Ord i => ATree i -> Bool +isUnevaluatedRvalueArrayElementAddressOperand (ATNode ATDeref _ (ATNode ATAddPtr _ arrayExpr _) _) + | CT.isArray (atype arrayExpr) = + not $ isAddressableLvalueExpr arrayExpr +isUnevaluatedRvalueArrayElementAddressOperand (ATNode ATDeref _ (ATNode ATSubPtr _ arrayExpr _) _) + | CT.isArray (atype arrayExpr) = + not $ isAddressableLvalueExpr arrayExpr +isUnevaluatedRvalueArrayElementAddressOperand (ATNode ATDeref _ arrayExpr _) + | CT.isArray (atype arrayExpr) = + not $ isAddressableLvalueExpr arrayExpr +isUnevaluatedRvalueArrayElementAddressOperand (ATNode (ATMemberAcc _) _ lhs _) = + isUnevaluatedRvalueArrayElementAddressOperand lhs +isUnevaluatedRvalueArrayElementAddressOperand _ = False + +unsupportedValueUseContext :: ATKind i -> Bool +unsupportedValueUseContext = \case + ATExprStmt -> True + ATReturn -> True + ATIf -> True + ATWhile -> True + ATSwitch _ _ -> True + ATFor _ -> True + ATAssign -> True + ATAddAssign -> True + ATSubAssign -> True + ATMulAssign -> True + ATDivAssign -> True + ATAddPtrAssign -> True + ATSubPtrAssign -> True + ATAndAssign -> True + ATOrAssign -> True + ATXorAssign -> True + ATShlAssign -> True + ATShrAssign -> True + ATComma -> True + ATConditional {} -> True + ATCast -> True + ATDeref -> True + ATPreInc -> True + ATPreDec -> True + ATPostInc -> True + ATPostDec -> True + ATNot -> True + ATBitNot -> True + ATAdd -> True + ATSub -> True + ATMul -> True + ATDiv -> True + ATMod -> True + ATShl -> True + ATShr -> True + ATLT -> True + ATLEQ -> True + ATGT -> True + ATGEQ -> True + ATEQ -> True + ATNEQ -> True + ATLAnd -> True + ATLOr -> True + ATAnd -> True + ATOr -> True + ATXor -> True + _ -> False + +refreshMergedValueTypes :: (Ord i, Bits i, Integral i) => PF.Functions i -> Maybe (GlobalVars i) -> ATree i -> ATree i +refreshMergedValueTypes funcs maybeGVars = go + where + lastMaybe [] = Nothing + lastMaybe xs = Just $ last xs + + refreshKindFor = \case + ATForkw -> + ATForkw + ATForInit at -> + ATForInit $ go at + ATForCond at -> + ATForCond $ go at + ATForIncr at -> + ATForIncr $ go at + ATForStmt at -> + ATForStmt $ go at + + refreshKind = \case + ATConditional cond tr fl -> + ATConditional (go cond) (go tr) (go fl) + ATSwitch cond cases -> + ATSwitch (go cond) (map go cases) + ATFor kinds -> + ATFor $ map refreshKindFor kinds + ATBlock ats -> + ATBlock $ map go ats + ATStmtExpr ats -> + ATStmtExpr $ map go ats + ATNull at -> + ATNull $ go at + ATDefFunc name args -> + ATDefFunc name $ map go <$> args + ATCallFunc name args -> + ATCallFunc name $ map go <$> args + ATCallPtr args -> + ATCallPtr $ map go <$> args + ATGVar ty name -> + ATGVar (mergedGlobalType maybeGVars name ty) name + other -> + other + + go ATEmpty = ATEmpty + go (ATNode kind ty lhs rhs) = + ATNode kind' ty' lhs' rhs' + where + kind' = refreshKind kind + lhs' = go lhs + rhs' = go rhs + ty' = case kind' of + ATConditional cond ATEmpty el -> + fromMaybe ty $ conditionalResultType cond el + ATConditional _ tr fl -> + fromMaybe ty $ conditionalResultType tr fl + ATDefFunc name _ -> + mergedFunctionType funcs name ty + ATGVar resolvedTy _ -> + resolvedTy + ATFuncPtr name -> + mergedFunctionType funcs name ty + ATAddr -> + CT.mapTypeKind CT.CTPtr $ atype lhs' + ATDeref -> + fromMaybe ty $ derefMergedObjectType $ atype lhs' + ATAddPtr -> + decayExprType $ atype lhs' + ATSubPtr -> + decayExprType $ atype lhs' + ATAssign -> + atype lhs' + ATAddPtrAssign -> + atype lhs' + ATSubPtrAssign -> + atype lhs' + ATComma -> + decayExprType $ atype rhs' + ATStmtExpr ats -> + maybe ty (decayExprType . atype) $ lastMaybe ats + _ -> + ty + + +revalidateMergedFunctionTree + :: (Ord i, Bits i, Integral i) + => MergedRevalidationMode + -> PF.Functions i + -> Maybe (GlobalVars i) + -> ATree i + -> Either String (ATree i) +revalidateMergedFunctionTree mode funcs maybeGVars = revalidateTree True True M.empty Nothing + where + validateDeferredCodegenChecks = case mode of + StrictMergedRevalidation -> True + GlobalInitializerMergedRevalidation -> True + VisualizableMergedRevalidation -> False + + validateAssignmentChecks = case mode of + StrictMergedRevalidation -> True + GlobalInitializerMergedRevalidation -> False + VisualizableMergedRevalidation -> False + + validateDeferredValueChecks valueChecks = + validateDeferredCodegenChecks && valueChecks + + lastMaybe [] = Nothing + lastMaybe xs = Just $ last xs + + revalidateKindFor valueChecks paramTys currentReturnTy = \case + ATForkw -> + Right ATForkw + ATForInit at -> + ATForInit <$> revalidateTree valueChecks True paramTys currentReturnTy at + ATForCond at -> + ATForCond <$> revalidateTree valueChecks True paramTys currentReturnTy at + ATForIncr at -> + ATForIncr <$> revalidateTree valueChecks True paramTys currentReturnTy at + ATForStmt at -> + ATForStmt <$> revalidateTree valueChecks True paramTys currentReturnTy at + + revalidateKind valueChecks paramTys currentReturnTy = \case + ATConditional cond tr fl -> + ATConditional + <$> revalidateTree valueChecks True paramTys currentReturnTy cond + <*> revalidateTree valueChecks True paramTys currentReturnTy tr + <*> revalidateTree valueChecks True paramTys currentReturnTy fl + ATSwitch cond cases -> + ATSwitch + <$> revalidateTree valueChecks True paramTys currentReturnTy cond + <*> traverse (revalidateTree valueChecks True paramTys currentReturnTy) cases + ATFor kinds -> + ATFor <$> traverse (revalidateKindFor valueChecks paramTys currentReturnTy) kinds + ATBlock ats -> + ATBlock <$> traverse (revalidateTree valueChecks True paramTys currentReturnTy) ats + ATStmtExpr ats -> + ATStmtExpr <$> traverse (revalidateTree valueChecks True paramTys currentReturnTy) ats + ATNull at -> + ATNull <$> revalidateTree valueChecks True paramTys currentReturnTy at + other -> + Right other + + revalidateTree _ _ _ _ ATEmpty = Right ATEmpty + revalidateTree valueChecks _ currentParamTys currentReturnTy (ATNode (ATDefFunc name args) ty lhs rhs) = do + let resolvedFnTy = mergedFunctionType funcs name ty + nestedReturnTy = functionReturnType resolvedFnTy + nestedParamTys = mergedFunctionParamBindings resolvedFnTy args + when (validateDeferredCodegenChecks && unsupportedByValueFunctionReturnType resolvedFnTy) $ + Left "unsupported by-value function return type" + args' <- traverse (traverse $ revalidateTree True True nestedParamTys nestedReturnTy) args + lhs' <- revalidateTree True True nestedParamTys nestedReturnTy lhs + rhs' <- revalidateTree valueChecks True currentParamTys currentReturnTy rhs + pure $ ATNode (ATDefFunc name args') resolvedFnTy lhs' rhs' + revalidateTree valueChecks _ currentParamTys currentReturnTy (ATNode (ATLVar ty offset) _ lhs rhs) = do + lhs' <- revalidateTree valueChecks True currentParamTys currentReturnTy lhs + rhs' <- revalidateTree valueChecks True currentParamTys currentReturnTy rhs + let resolvedTy = fromMaybe ty $ M.lookup offset currentParamTys + pure $ ATNode (ATLVar resolvedTy offset) resolvedTy lhs' rhs' + revalidateTree valueChecks _ currentParamTys currentReturnTy (ATNode (ATCallFunc name args) ty lhs rhs) = do + lhs' <- revalidateTree valueChecks True currentParamTys currentReturnTy lhs + rhs' <- revalidateTree valueChecks True currentParamTys currentReturnTy rhs + args' <- traverse (traverse $ revalidateTree valueChecks True currentParamTys currentReturnTy) args + case M.lookup name funcs of + Nothing -> + pure $ ATNode (ATCallFunc name args') ty lhs' rhs' + Just fn -> do + let resolvedFnTy = functionTypeWithMergedCall (PF.fntype fn) ty $ fromMaybe [] args' + (callTy, formalParamTys) <- maybe + (Left "internal compiler error: merged function lookup returned non-callable type") + Right + (mergedCallableSignature resolvedFnTy) + params <- convertCallArgsWith (validateDeferredValueChecks valueChecks) formalParamTys $ fromMaybe [] args' + let params' = if null params then Nothing else Just params + pure $ ATNode (ATCallFunc name params') callTy lhs' rhs' + revalidateTree valueChecks _ currentParamTys currentReturnTy (ATNode (ATCallPtr args) _ lhs rhs) = do + lhs' <- revalidateTree valueChecks True currentParamTys currentReturnTy lhs + rhs' <- revalidateTree valueChecks True currentParamTys currentReturnTy rhs + args' <- traverse (traverse $ revalidateTree valueChecks True currentParamTys currentReturnTy) args + (callTy, formalParamTys) <- maybe + (Left "called object is not a function or function pointer") + Right + (mergedCallableSignature $ atype lhs') + params <- convertCallArgsWith (validateDeferredValueChecks valueChecks) formalParamTys $ fromMaybe [] args' + let params' = if null params then Nothing else Just params + when (validateDeferredValueChecks valueChecks && isJust params' && containsEscapingStmtExprControlFlow lhs') $ + Left "unsupported control flow in function call callee" + pure $ ATNode (ATCallPtr params') callTy lhs' rhs' + revalidateTree valueChecks currentValueChecks currentParamTys currentReturnTy (ATNode kind ty lhs rhs) = do + let kindValueChecks = case kind of + ATSizeof -> + False + ATAlignof -> + False + _ -> + valueChecks + lhsCurrentValueChecks = case kind of + ATAddr -> + False + ATMemberAcc _ -> + False + _ -> + True + kind' <- revalidateKind kindValueChecks currentParamTys currentReturnTy kind + let nestedReturnTy = case kind' of + ATDefFunc name _ -> + functionReturnType $ mergedFunctionType funcs name ty + _ -> + currentReturnTy + lhs' <- revalidateTree kindValueChecks lhsCurrentValueChecks currentParamTys nestedReturnTy lhs + rhs' <- revalidateTree valueChecks True currentParamTys currentReturnTy rhs + let (kind'', ty') = refreshKindAndType kind' ty lhs' rhs' + when (validateAssignmentChecks && invalidAssignmentOperands kind'' lhs rhs lhs' rhs') $ + Left "invalid operands to assignment" + when validateDeferredCodegenChecks $ + for_ (invalidRefreshedOperandUse kind'' ty' lhs' rhs') Left + when (validateDeferredCodegenChecks && invalidIncompletePointerArithmetic kind'' lhs' rhs') $ + Left "invalid use of pointer to incomplete type" + when (validateDeferredCodegenChecks && invalidIncompleteMemOp kind'' lhs') $ + Left $ case kind'' of + ATSizeof -> "invalid application of 'sizeof' to incomplete type" + ATAlignof -> "invalid application of '_Alignof' to incomplete type" + _ -> "internal compiler error: unexpected incomplete memory operator" + when (validateDeferredValueChecks kindValueChecks && invalidReturnValue currentReturnTy kind'' lhs') $ + Left "invalid return type" + when (validateDeferredCodegenChecks && invalidAddressOfOperandKind kindValueChecks kind'' lhs') $ + Left "lvalue required as unary '&' operand" + let refreshed = ATNode kind'' ty' lhs' rhs' + when + ( validateDeferredValueChecks valueChecks + && currentValueChecks + && unsupportedValueUseContext kind'' + && requiresUnsupportedNonAddressableArrayDecay refreshed + ) $ + Left "unsupported non-addressable array member decay" + pure refreshed + + invalidAddressOfOperandKind valueChecks = \case + ATAddr -> + invalidAddressOfOperand valueChecks + _ -> + const False + + refreshKindAndType kind currentTy lhs rhs = case kind of + ATConditional cond ATEmpty el -> + ( kind + , fromMaybe currentTy $ conditionalResultType cond el + ) + ATConditional _ th el -> + ( kind + , fromMaybe currentTy $ conditionalResultType th el + ) + ATDefFunc name _ -> + (kind, mergedFunctionType funcs name currentTy) + ATGVar _ name -> + let resolvedTy = mergedGlobalType maybeGVars name currentTy + in (ATGVar resolvedTy name, resolvedTy) + ATFuncPtr name -> + let resolvedTy = mergedFunctionType funcs name currentTy + in (ATFuncPtr name, resolvedTy) + ATAddr -> + (ATAddr, CT.mapTypeKind CT.CTPtr $ atype lhs) + ATDeref -> + (ATDeref, fromMaybe currentTy $ derefMergedObjectType $ atype lhs) + ATPreInc -> + (ATPreInc, atype lhs) + ATPreDec -> + (ATPreDec, atype lhs) + ATPostInc -> + (ATPostInc, atype lhs) + ATPostDec -> + (ATPostDec, atype lhs) + ATAddPtr -> + (ATAddPtr, decayExprType $ atype lhs) + ATSubPtr -> + (ATSubPtr, decayExprType $ atype lhs) + ATPtrDis -> + (ATPtrDis, CT.SCAuto $ CT.CTLong CT.CTInt) + ATAssign -> + (ATAssign, atype lhs) + ATAddPtrAssign -> + (ATAddPtrAssign, atype lhs) + ATSubPtrAssign -> + (ATSubPtrAssign, atype lhs) + ATComma -> + (ATComma, decayExprType $ atype rhs) + ATStmtExpr ats -> + (kind, maybe currentTy (decayExprType . atype) $ lastMaybe ats) + _ -> + (kind, currentTy) + +revalidateMergedFunctionCalls + :: (Ord i, Bits i, Integral i) + => PF.Functions i + -> GlobalVars i + -> ASTs i + -> Either String (ASTs i) +revalidateMergedFunctionCalls = + revalidateMergedFunctionCallsWithMode StrictMergedRevalidation + +revalidateMergedFunctionCallsWithMode + :: (Ord i, Bits i, Integral i) + => MergedRevalidationMode + -> PF.Functions i + -> GlobalVars i + -> ASTs i + -> Either String (ASTs i) +revalidateMergedFunctionCallsWithMode mode funcs gvars = + traverse $ revalidateMergedFunctionTree mode funcs (Just gvars) + +revalidateMergedGlobalInitializers + :: (Ord i, Bits i, Integral i) + => PF.Functions i + -> GlobalVars i + -> Either String (GlobalVars i) +revalidateMergedGlobalInitializers funcs gvars = + traverse revalidateGVar gvars + where + revalidateGVar gvar = case initWith gvar of + GVarInitWithAST ast -> do + ast' <- revalidateMergedFunctionTree GlobalInitializerMergedRevalidation funcs (Just gvars) ast + validateGlobalInitializerAssignments ast ast' + pure $ gvar { initWith = GVarInitWithAST ast' } + _ -> + Right gvar + + validateGlobalInitializerAssignments ATEmpty ATEmpty = Right () + validateGlobalInitializerAssignments + (ATNode originalKind _ originalLhs originalRhs) + (ATNode kind _ lhs rhs) = do + when (invalidIncompletePointerArithmetic kind lhs rhs) $ + Left "invalid use of pointer to incomplete type" + when (invalidIncompleteMemOp kind lhs) $ + Left $ case kind of + ATSizeof -> "invalid application of 'sizeof' to incomplete type" + ATAlignof -> "invalid application of '_Alignof' to incomplete type" + _ -> "internal compiler error: unexpected incomplete memory operator" + case (originalKind, kind) of + (ATBlock originalAts, ATBlock ats) + | length originalAts == length ats -> + traverse_ (uncurry validateGlobalInitializerAssignments) $ zip originalAts ats + | otherwise -> + Left "internal compiler error: global initializer shape changed during revalidation" + (ATExprStmt, ATExprStmt) -> + validateGlobalInitializerAssignments originalLhs lhs + (_, ATAssign) -> do + when (invalidAssignmentOperands kind originalLhs originalRhs lhs rhs) $ + Left "invalid initializer for scalar object" + validateGlobalInitializerAssignments originalLhs lhs + *> validateGlobalInitializerAssignments originalRhs rhs + _ -> + validateGlobalInitializerAssignments originalLhs lhs + *> validateGlobalInitializerAssignments originalRhs rhs + validateGlobalInitializerAssignments _ _ = + Left "internal compiler error: global initializer shape changed during revalidation" + +prepareAsmInput + :: (Integral i, Bits i, Read i, Show i, Ord i) + => PF.Functions i + -> ASTs i + -> GlobalVars i + -> Either String (ASTs i, GlobalVars i) +prepareAsmInput funcs asts gvars = do + let materializedGVars = M.map materializeTentativeIncompleteArray gvars + revalidatedGVars <- revalidateMergedGlobalInitializers funcs materializedGVars + (normalizedAsts, normalizedGVars) <- normalizeAsmInput asts revalidatedGVars + revalidatedAsts <- revalidateMergedFunctionCalls funcs normalizedGVars normalizedAsts + pure (revalidatedAsts, normalizedGVars) + +prepareVisualizableInput + :: (Integral i, Bits i, Read i, Show i, Ord i) + => PF.Functions i + -> ASTs i + -> GlobalVars i + -> Either String (ASTs i, GlobalVars i) +prepareVisualizableInput funcs asts gvars = do + let materializedGVars = M.map materializeTentativeIncompleteArray gvars + retypedAsts = retypeResolvedGlobalRefs materializedGVars asts + revalidatedAsts <- + revalidateMergedFunctionCallsWithMode + VisualizableMergedRevalidation + funcs + materializedGVars + retypedAsts + pure (revalidatedAsts, materializedGVars) + +casmNormalized' :: (Show e, Show i, Integral e, Integral i, Ord i, IsOperand i, IT.UnaryInstruction i, IT.BinaryInstruction i) + => ASTs i + -> GlobalVars i + -> Literals i + -> SI.Asm SI.AsmCodeCtx e () +casmNormalized' atl gvars lits = + dataSection materializedGVars lits >> textSection (retypeResolvedGlobalRefs materializedGVars atl) + where + materializedGVars = M.map materializeTentativeIncompleteArray gvars + +-- | Executor that receives information about the constructed AST, +-- global variables, and literals and composes assembly code +casm' :: (Bits i, Read i, Show e, Show i, Integral e, Integral i, Ord i, IsOperand i, IT.UnaryInstruction i, IT.BinaryInstruction i) + => ASTs i + -> GlobalVars i + -> Literals i + -> PF.Functions i + -> SI.Asm SI.AsmCodeCtx e () +casm' atl gvars lits funcs = + case prepareAsmInput funcs atl gvars of + Left err -> SI.errCtx $ T.pack err + Right (normalizedAsts, normalizedGVars) -> + casmNormalized' normalizedAsts normalizedGVars lits diff --git a/src/Htcc/Asm/Generate/Core.hs b/src/Htcc/Asm/Generate/Core.hs index c453a12..22327da 100644 --- a/src/Htcc/Asm/Generate/Core.hs +++ b/src/Htcc/Asm/Generate/Core.hs @@ -9,23 +9,33 @@ Portability : POSIX The modules of intrinsic (x86_64) assembly -} -{-# LANGUAGE OverloadedStrings, ScopedTypeVariables #-} +{-# LANGUAGE BangPatterns, OverloadedStrings, ScopedTypeVariables #-} module Htcc.Asm.Generate.Core ( dataSection, textSection, ) where -import Control.Monad (forM_, unless, when, zipWithM_) +import Control.Monad (forM_, unless, when, + zipWithM_) import Control.Monad.Finally (MonadFinally (..)) -import Prelude hiding (truncate) +import Data.Bifunctor (bimap) import Data.Int (Int32) import Data.IORef (readIORef) -import Data.List (find) +import Data.List (find, foldl') import qualified Data.Map as M import Data.Maybe (fromJust, isJust) +import qualified Data.Set as S import qualified Data.Text as T import qualified Data.Text.IO as T +import Prelude hiding (truncate) +import Data.Tuple.Extra (dupe) +import Htcc.Asm.Intrinsic.Operand +import Htcc.Asm.Intrinsic.Register +import qualified Htcc.Asm.Intrinsic.Structure as SI +import qualified Htcc.Asm.Intrinsic.Structure.Section.Data as ID +import qualified Htcc.Asm.Intrinsic.Structure.Section.Text as IT +import qualified Htcc.CRules.Types as CR import Htcc.Parser (ATKind (..), ATree (..), fromATKindFor, @@ -33,46 +43,92 @@ import Htcc.Parser (ATKind (..), isATForIncr, isATForInit, isATForStmt, - isComplexAssign, - stackSize) + isComplexAssign) import Htcc.Parser.ConstructionData.Scope.Var as PV -import Htcc.Asm.Intrinsic.Operand -import Htcc.Asm.Intrinsic.Register -import qualified Htcc.Asm.Intrinsic.Structure as SI -import qualified Htcc.Asm.Intrinsic.Structure.Section.Data as ID -import qualified Htcc.Asm.Intrinsic.Structure.Section.Text as IT -import Htcc.Utils (err, maybe', splitAtLen, tshow) -import qualified Htcc.CRules.Types as CR +import Htcc.Utils (err, maybe', + splitAtLen, + toNatural, tshow) +import Numeric.Natural + +stackSize :: (Show i, Integral i) => ATree i -> Natural +stackSize (ATNode (ATDefFunc _ args) _ body _) = let ms = f body $ maybe S.empty (foldr (\(ATNode (ATLVar t x) _ _ _) acc -> S.insert (t, x) acc) S.empty) args in + if S.size ms == 1 then toNatural $ flip CR.alignas 8 $ toInteger $ CR.sizeof $ fst $ head (S.toList ms) else toNatural $ flip CR.alignas 8 $ uncurry (+) $ + bimap (toInteger . CR.sizeof . fst) (fromIntegral . snd) $ + dupe $ + foldl' (\acc x -> if snd acc < snd x then x else acc) (CR.SCUndef CR.CTUndef, 0) $ + S.toList ms + where + f ATEmpty !s = s + f (ATNode (ATCallFunc _ (Just arg)) t l r) !s = f (ATNode (ATBlock arg) t l r) s + f (ATNode (ATCallPtr (Just arg)) t l r) !s = f (ATNode (ATBlock arg) t l r) s + f (ATNode (ATLVar t x) _ l r) !s = let i = S.insert (t, x) s in f l i `S.union` f r i + f (ATNode (ATBlock xs) _ l r) !s = let i = foldr (S.union . (`f` s)) s xs in f l i `S.union` f r i + f (ATNode (ATStmtExpr xs) t l r) !s = f (ATNode (ATBlock xs) t l r) s + f (ATNode (ATFor xs) _ l r) !s = let i = foldr (S.union . flip f s . fromATKindFor) S.empty xs in f l i `S.union` f r i + f (ATNode (ATNull x) _ _ _) !s = f x s + f (ATNode _ _ l r) !s = f l s `S.union` f r s +stackSize _ = 0 {-# INLINE prologue #-} prologue :: Integral i => i -> SI.Asm IT.TextLabelCtx e () prologue ss = IT.push rbp >> IT.mov rbp rsp >> IT.sub rsp (fromIntegral ss :: Integer) {-# INLINE epilogue #-} -epilogue :: SI.Asm IT.TextLabelCtx e () -epilogue = retLabel *> IT.leave *> IT.ret +epilogue :: Ord i => CR.StorageClass i -> SI.Asm IT.TextLabelCtx e () +epilogue ty = retLabel *> when (returnsBool ty) normalizeBoolRax *> IT.leave *> IT.ret where retLabel = SI.Asm $ \x -> do cf <- readIORef (SI.curFn x) unless (isJust cf) $ err "stray epilogue" - T.putStrLn $ ".L.return." <> fromJust cf <> ":" + T.hPutStrLn (SI.outHandle x) $ ".L.return." <> fromJust cf <> ":" + + returnsBool sc = case CR.toTypeKind sc of + CR.CTFunc retTy _ -> retTy == CR.CTBool + _ -> False + +normalizeBoolRax :: SI.Asm IT.TextLabelCtx e () +normalizeBoolRax = normalizeBoolWordRax + +normalizeBoolWordRax :: SI.Asm IT.TextLabelCtx e () +normalizeBoolWordRax = IT.cmp rax (0 :: Int) *> IT.setne al *> IT.movzb rax al + +normalizeBoolAbiRax :: SI.Asm IT.TextLabelCtx e () +normalizeBoolAbiRax = IT.cmp al (0 :: Int) *> IT.setne al *> IT.movzb rax al + +truncateRax :: Ord i => CR.StorageClass i -> SI.Asm IT.TextLabelCtx e () +truncateRax t + | CR.sizeof t == 1 = IT.movsx rax al + | CR.sizeof t == 2 = IT.movsx rax ax + | CR.sizeof t == 4 = IT.movsxd rax eax + | otherwise = return () + +normalizeCallResultRax :: Ord i => CR.StorageClass i -> SI.Asm IT.TextLabelCtx e () +normalizeCallResultRax t + | CR.toTypeKind t == CR.CTBool = normalizeBoolAbiRax + | needsAbiTruncation (CR.toTypeKind t) && CR.sizeof t < 8 = truncateRax t + | otherwise = return () + where + needsAbiTruncation ty = case ty of + CR.CTChar -> True + CR.CTInt -> True + CR.CTEnum _ _ -> True + CR.CTSigned x -> needsAbiTruncation x + CR.CTShort x -> needsAbiTruncation x + CR.CTLong x -> needsAbiTruncation x + _ -> False truncate :: Ord i => CR.StorageClass i -> SI.Asm IT.TextLabelCtx e () truncate ty = do IT.pop rax - when (CR.toTypeKind ty == CR.CTBool) $ IT.cmp rax (0 :: Int) *> IT.setne al - truncate' ty + if CR.toTypeKind ty == CR.CTBool + then normalizeBoolRax + else truncateRax ty IT.push rax - where - truncate' t - | CR.sizeof t == 1 = IT.movsx rax al - | CR.sizeof t == 2 = IT.movsx rax ax - | CR.sizeof t == 4 = IT.movsxd rax eax - | otherwise = return () genAddr :: (Integral e, Show e, IsOperand i, Integral i, Ord i, IT.UnaryInstruction i, IT.BinaryInstruction i) => ATree i -> SI.Asm IT.TextLabelCtx e () genAddr (ATNode (ATLVar _ v) _ _ _) = IT.lea rax (Ref $ rbp `osub` v) >> IT.push rax genAddr (ATNode (ATGVar _ n) _ _ _) = IT.push (IT.Offset n) +genAddr (ATNode (ATFuncPtr n) _ _ _) = IT.push (IT.Offset n) genAddr (ATNode ATDeref _ lhs _) = genStmt lhs genAddr (ATNode (ATMemberAcc m) _ lhs _) = do genAddr lhs @@ -81,6 +137,44 @@ genAddr (ATNode (ATMemberAcc m) _ lhs _) = do IT.push rax genAddr _ = SI.errCtx "lvalue required as left operand of assignment" +isAddressableExpr :: ATree i -> Bool +isAddressableExpr (ATNode kind _ lhs _) = case kind of + ATLVar _ _ -> True + ATGVar _ _ -> True + ATFuncPtr _ -> True + ATMemberAcc _ -> isAddressableExpr lhs + ATDeref -> True + _ -> False +isAddressableExpr _ = False + +rvalueSubobjectBaseOffset :: ATree i -> Maybe (ATree i, Natural) +rvalueSubobjectBaseOffset (ATNode (ATMemberAcc member) _ lhs _) + | isAddressableExpr lhs = Nothing + | otherwise = case rvalueSubobjectBaseOffset lhs of + Just (base, offset) -> Just (base, offset + CR.smOffset member) + Nothing -> Just (lhs, CR.smOffset member) +rvalueSubobjectBaseOffset _ = Nothing + +rvalueArrayElementPointerChain :: ATree i -> Maybe (ATree i, Natural, [RvalueArrayIndexStep i]) +rvalueArrayElementPointerChain ptr = do + (arrayExpr, idxs) <- pointerIndexChain ptr + if CR.isArray (atype arrayExpr) + then case rvalueSubobjectBaseOffset arrayExpr of + Just (base, offset) -> Just (base, offset, idxs) + Nothing -> Nothing + else Nothing + where + pointerIndexChain (ATNode ATAddPtr _ arrayExpr idx) = + appendIndex RvalueArrayIndexAdd arrayExpr idx + pointerIndexChain (ATNode ATSubPtr _ arrayExpr idx) = + appendIndex RvalueArrayIndexSub arrayExpr idx + pointerIndexChain _ = + Nothing + + appendIndex direction arrayExpr idx = case pointerIndexChain arrayExpr of + Just (root, idxs) -> Just (root, idxs <> [(direction, idx)]) + Nothing -> Just (arrayExpr, [(direction, idx)]) + genLVal :: (Integral e, Show e, IsOperand i, Integral i, Ord i, IT.UnaryInstruction i, IT.BinaryInstruction i) => ATree i -> SI.Asm IT.TextLabelCtx e () genLVal xs@(ATNode _ t _ _) | CR.isCTArray t = SI.errCtx "lvalue required as left operand of assignment" @@ -89,18 +183,232 @@ genLVal _ = SI.errCtx "internal compiler error: genLVal catch ATEmpty" load :: Ord i => CR.StorageClass i -> SI.Asm IT.TextLabelCtx e () load t + | isAggregateType t = loadPackedObject t | CR.sizeof t == 1 = IT.pop rax >> IT.movsx rax (IT.byte IT.Ptr (Ref rax)) >> IT.push rax | CR.sizeof t == 2 = IT.pop rax >> IT.movsx rax (IT.word IT.Ptr (Ref rax)) >> IT.push rax | CR.sizeof t == 4 = IT.pop rax >> IT.movsxd rax (IT.dword IT.Ptr (Ref rax)) >> IT.push rax | otherwise = IT.pop rax >> IT.mov rax (Ref rax) >> IT.push rax +isAggregateType :: CR.StorageClass i -> Bool +isAggregateType ty = CR.isArray ty || CR.isCTStruct ty + +objectChunks :: Natural -> [(Natural, Natural)] +objectChunks = go 0 + where + go _ 0 = [] + go offset remaining + | remaining >= 4 = (offset, 4) : go (offset + 4) (remaining - 4) + | remaining >= 2 = (offset, 2) : go (offset + 2) (remaining - 2) + | otherwise = [(offset, 1)] + +loadPackedObject :: Ord i => CR.StorageClass i -> SI.Asm IT.TextLabelCtx e () +loadPackedObject ty + | objectSize <= 8 = do + IT.pop rdx + IT.mov rax (0 :: Int) + mapM_ loadChunk (objectChunks objectSize) + IT.push rax + | otherwise = + SI.errCtx "unsupported aggregate value load" + where + objectSize = CR.sizeof ty + + loadChunk (offset, width) = do + IT.mov rcx (0 :: Int) + IT.lea rsi (refAt rdx offset) + case width of + 4 -> IT.mov ecx (IT.dword IT.Ptr $ Ref rsi) + 2 -> IT.mov cx (IT.word IT.Ptr $ Ref rsi) + 1 -> IT.mov cl (IT.byte IT.Ptr $ Ref rsi) + _ -> SI.errCtx "internal compiler error: invalid packed object load width" + unless (offset == 0) $ + IT.shl rcx (fromIntegral (offset * 8) :: Int) + IT.or rax rcx + +storePackedObject :: Ord i => CR.StorageClass i -> SI.Asm IT.TextLabelCtx e () +storePackedObject ty + | objectSize <= 8 = do + IT.pop rdi + IT.pop rax + mapM_ storeChunk (objectChunks objectSize) + IT.push rdi + | otherwise = + SI.errCtx "unsupported aggregate value store" + where + objectSize = CR.sizeof ty + + storeChunk (offset, width) = do + IT.mov rdx rdi + unless (offset == 0) $ + IT.sar rdx (fromIntegral (offset * 8) :: Int) + IT.lea rsi (refAt rax offset) + case width of + 4 -> IT.mov (IT.dword IT.Ptr $ Ref rsi) edx + 2 -> IT.mov (IT.word IT.Ptr $ Ref rsi) dx + 1 -> IT.mov (IT.byte IT.Ptr $ Ref rsi) dl + _ -> SI.errCtx "internal compiler error: invalid packed object store width" + +storePackedRegisterObject + :: Integral e + => Natural + -> Register + -> (Natural -> Ref Operand) + -> SI.Asm IT.TextLabelCtx e () +storePackedRegisterObject objectSize srcReg destAt + | objectSize <= 8 = do + IT.mov packedValueReg srcReg + mapM_ storeChunk (objectChunks objectSize) + | otherwise = + SI.errCtx "internal compiler error: unsupported aggregate parameter width" + where + packedValueReg = rn 10 + chunkReg = rn 11 + chunkRegD = rnd 11 + chunkRegW = rnw 11 + chunkRegB = rnb 11 + + storeChunk (offset, width) = do + IT.mov chunkReg packedValueReg + unless (offset == 0) $ + IT.sar chunkReg (fromIntegral (offset * 8) :: Int) + IT.lea rax (destAt offset) + case width of + 4 -> IT.mov (IT.dword IT.Ptr $ Ref rax) chunkRegD + 2 -> IT.mov (IT.word IT.Ptr $ Ref rax) chunkRegW + 1 -> IT.mov (IT.byte IT.Ptr $ Ref rax) chunkRegB + _ -> SI.errCtx "internal compiler error: invalid packed object parameter width" + +refAt :: IsOperand a => a -> Natural -> Ref Operand +refAt base offset = Ref $ base `oadd` (fromIntegral offset :: Integer) + +isScalarLoadType :: CR.StorageClass i -> Bool +isScalarLoadType ty = + not (isAggregateType ty) + && case CR.toTypeKind ty of + CR.CTFunc _ _ -> False + _ -> True + +data RvalueArrayIndexDirection + = RvalueArrayIndexAdd + | RvalueArrayIndexSub + +type RvalueArrayIndexStep i = (RvalueArrayIndexDirection, ATree i) + +withSmallRvalueObject + :: (Integral e, Show e, IsOperand i, Integral i, Ord i, IT.UnaryInstruction i, IT.BinaryInstruction i) + => ATree i + -> (Register -> SI.Asm IT.TextLabelCtx e ()) + -> SI.Asm IT.TextLabelCtx e () +withSmallRvalueObject base useBase + | CR.sizeof (atype base) <= 8 = do + genStmt base + IT.pop rax + IT.push rbx + IT.sub rsp (8 :: Int) + IT.mov (Ref rsp) rax + IT.mov rbx rsp + useBase rbx + IT.pop rax + IT.add rsp (8 :: Int) + IT.pop rbx + IT.push rax + | otherwise = + SI.errCtx "unsupported non-addressable struct member access" + +loadRvalueSubobject + :: (Integral e, Show e, IsOperand i, Integral i, Ord i, IT.UnaryInstruction i, IT.BinaryInstruction i) + => CR.StorageClass i + -> ATree i + -> Natural + -> SI.Asm IT.TextLabelCtx e () +loadRvalueSubobject ty base offset + | not (isScalarLoadType ty) = + SI.errCtx "unsupported non-addressable aggregate member access" + | accessEnd <= CR.sizeof (atype base) = + withSmallRvalueObject base $ \baseReg -> loadFromBaseOffset baseReg offset ty + | otherwise = + SI.errCtx "unsupported non-addressable struct member access" + where + accessEnd = offset + CR.sizeof ty + +loadRvalueArrayElement + :: (Integral e, Show e, IsOperand i, Integral i, Ord i, IT.UnaryInstruction i, IT.BinaryInstruction i) + => CR.StorageClass i + -> ATree i + -> Natural + -> [RvalueArrayIndexStep i] + -> SI.Asm IT.TextLabelCtx e () +loadRvalueArrayElement elemTy base offset idxs + | not (isScalarLoadType elemTy) = + SI.errCtx "unsupported non-addressable aggregate array element access" + | CR.sizeof (atype base) > 8 = + SI.errCtx "unsupported non-addressable struct member access" + | otherwise = do + genRvalueArrayIndex idxs + withSmallRvalueObject base $ \baseReg -> do + IT.mov rax (refAt rsp 16) + IT.imul rax elemSize + IT.add rax baseReg + IT.add rax offset' + IT.push rax + load elemTy + IT.pop rax + IT.add rsp (8 :: Int) + IT.push rax + where + elemSize = fromIntegral (CR.sizeof elemTy) :: Int + offset' = fromIntegral offset :: Integer + +genRvalueArrayIndex + :: (Integral e, Show e, IsOperand i, Integral i, Ord i, IT.UnaryInstruction i, IT.BinaryInstruction i) + => [RvalueArrayIndexStep i] + -> SI.Asm IT.TextLabelCtx e () +genRvalueArrayIndex [] = IT.push (0 :: Int) +genRvalueArrayIndex ((direction, idx):idxs) = do + genFirstIndex direction idx + forM_ idxs $ \(direction', idx') -> do + genStmt idx' + combineIndex direction' + where + genFirstIndex RvalueArrayIndexAdd idx' = + genStmt idx' + genFirstIndex RvalueArrayIndexSub idx' = do + IT.push (0 :: Int) + genStmt idx' + combineIndex RvalueArrayIndexSub + + combineIndex direction' = do + IT.pop rdi + IT.pop rax + case direction' of + RvalueArrayIndexAdd -> IT.add rax rdi + RvalueArrayIndexSub -> IT.sub rax rdi + IT.push rax + +loadFromBaseOffset :: (Integral e, Ord i, IsOperand i, IT.BinaryInstruction i) => Register -> Natural -> CR.StorageClass i -> SI.Asm IT.TextLabelCtx e () +loadFromBaseOffset baseReg offset ty = do + IT.lea rax (Ref $ baseReg `oadd` offset') + IT.push rax + load ty + where + offset' = fromIntegral offset :: Integer + +nonLoadableDerefType :: CR.StorageClass i -> Bool +nonLoadableDerefType ty = + CR.isArray ty + || case CR.toTypeKind ty of + CR.CTFunc _ _ -> True + _ -> False + store :: Ord i => CR.StorageClass i -> SI.Asm IT.TextLabelCtx e () -store t = do - IT.pop rdi - IT.pop rax - when (CR.toTypeKind t == CR.CTBool) $ IT.cmp rdi (0 :: Int) *> IT.setne dil *> IT.movzb rdi dil - IT.mov (Ref rax) storeReg - IT.push rdi +store t + | isAggregateType t = storePackedObject t + | otherwise = do + IT.pop rdi + IT.pop rax + when (CR.toTypeKind t == CR.CTBool) $ IT.cmp rdi (0 :: Int) *> IT.setne dil *> IT.movzb rdi dil + IT.mov (Ref rax) storeReg + IT.push rdi where storeReg | CR.sizeof t == 1 = dil @@ -114,25 +422,138 @@ increment t = IT.pop rax >> IT.add rax (maybe 1 CR.sizeof $ CR.deref t) >> IT.pu decrement :: Ord i => CR.StorageClass i -> SI.Asm IT.TextLabelCtx e () decrement t = IT.pop rax >> IT.sub rax (maybe 1 CR.sizeof $ CR.deref t) >> IT.push rax -genStmt :: (Show e, Integral e, Show i, Integral i, Ord i, IsOperand i, IT.UnaryInstruction i, IT.BinaryInstruction i) => ATree i -> SI.Asm IT.TextLabelCtx e () -genStmt (ATNode (ATCallFunc x Nothing) _ _ _) = IT.call x >> IT.push rax -genStmt (ATNode (ATCallFunc x (Just args)) t _ _) = let (n', toReg, _) = splitAtLen 6 args in do - mapM_ genStmt toReg - mapM_ IT.pop $ popRegs n' +genCallTarget :: (Show e, Integral e, Show i, Integral i, Ord i, IsOperand i, IT.UnaryInstruction i, IT.BinaryInstruction i) => ATree i -> SI.Asm IT.TextLabelCtx e () +genCallTarget callee + | isFunctionDesignator callee = genAddr callee + | otherwise = genStmt callee + where + isFunctionDesignator expr = case CR.toTypeKind (atype expr) of + CR.CTFunc _ _ -> True + _ -> False + +callAligned + :: (Show e, Integral e) + => Int + -> SI.Asm IT.TextLabelCtx e () + -> SI.Asm IT.TextLabelCtx e () + -> SI.Asm IT.TextLabelCtx e () +callAligned nStackArgs restore invoke = do + let invokeAndCleanup = do + restore + invoke + cleanupStackArgs nStackArgs n <- IT.incrLbl IT.mov rax rsp + when (odd nStackArgs) $ + IT.sub rax (8 :: Int) IT.and rax (0x0f :: Int) IT.jnz $ IT.ref "call" n - IT.mov rax (0 :: Int) - IT.call x + invokeAndCleanup IT.jmp $ IT.refEnd n IT.label "call" n IT.sub rsp (8 :: Int) - IT.mov rax (0 :: Int) - IT.call x + invokeAndCleanup IT.add rsp (8 :: Int) IT.end n - when (CR.toTypeKind t == CR.CTBool) $ IT.movzb rax al + +invokeIndirect :: (Show e, Integral e) => SI.Asm IT.TextLabelCtx e () +invokeIndirect = do + IT.mov rax (0 :: Int) + IT.call "r11" + +prepareCallArgs + :: (Show e, Integral e, Show i, Integral i, Ord i, IsOperand i, IT.UnaryInstruction i, IT.BinaryInstruction i) + => [ATree i] + -> SI.Asm IT.TextLabelCtx e () + -> SI.Asm IT.TextLabelCtx e () +prepareCallArgs args invoke = do + let (nReg, _, stackArgs) = splitAtLen 6 args + nArgs = nReg + length stackArgs + slotRef base idx = Ref $ base `oadd` (8 * idx :: Int) + storeValue base idx expr = do + genStmt expr + IT.pop rdx + IT.mov (slotRef base idx) rdx + pushSavedStackArg base idx = do + IT.mov rdx (slotRef base idx) + IT.push rdx + restoreArgs base = do + IT.mov rax base + mapM_ (pushSavedStackArg rax) $ reverse [nReg .. pred nArgs] + zipWithM_ (\reg idx -> IT.mov reg (slotRef rax idx)) (reverse $ popRegs nReg) [0 .. pred nReg] + if nArgs == 0 + then callAligned 0 (pure ()) invoke + else do + IT.push rbx + IT.sub rsp (8 * nArgs) + IT.mov rbx rsp + zipWithM_ (storeValue rbx) [0..] args + IT.mov (rn 10) rbx + IT.add rsp (8 * nArgs) + IT.pop rbx + callAligned (length stackArgs) (restoreArgs $ rn 10) invoke + +prepareIndirectCall + :: (Show e, Integral e, Show i, Integral i, Ord i, IsOperand i, IT.UnaryInstruction i, IT.BinaryInstruction i) + => ATree i + -> [ATree i] + -> SI.Asm IT.TextLabelCtx e () + -> SI.Asm IT.TextLabelCtx e () +prepareIndirectCall callee args invoke = do + let (nReg, _, stackArgs) = splitAtLen 6 args + nArgs = nReg + length stackArgs + calleeSlot = nArgs + nSlots = succ nArgs + slotRef base idx = Ref $ base `oadd` (8 * idx :: Int) + storeValue base idx expr = do + genStmt expr + IT.pop rdx + IT.mov (slotRef base idx) rdx + pushSavedStackArg base idx = do + IT.mov rdx (slotRef base idx) + IT.push rdx + restoreArgs base = do + IT.mov rax base + IT.mov (rn 11) (slotRef rax calleeSlot) + mapM_ (pushSavedStackArg rax) $ reverse [nReg .. pred nArgs] + zipWithM_ (\reg idx -> IT.mov reg (slotRef rax idx)) (reverse $ popRegs nReg) [0 .. pred nReg] + IT.push rbx + IT.sub rsp (8 * nSlots) + IT.mov rbx rsp + genCallTarget callee + IT.pop rdx + IT.mov (slotRef rbx calleeSlot) rdx + zipWithM_ (storeValue rbx) [0..] args + IT.mov (rn 10) rbx + IT.add rsp (8 * nSlots) + IT.pop rbx + callAligned (length stackArgs) (restoreArgs $ rn 10) invoke + +cleanupStackArgs :: Integral e => Int -> SI.Asm IT.TextLabelCtx e () +cleanupStackArgs n = + when (n > 0) $ + IT.add rsp (8 * n) + +genStmt :: (Show e, Integral e, Show i, Integral i, Ord i, IsOperand i, IT.UnaryInstruction i, IT.BinaryInstruction i) => ATree i -> SI.Asm IT.TextLabelCtx e () +genStmt (ATNode (ATCallFunc x Nothing) t _ _) = do + callAligned 0 (pure ()) $ IT.mov rax (0 :: Int) >> IT.call x + normalizeCallResultRax t + IT.push rax +genStmt (ATNode (ATCallPtr Nothing) t callee _) = do + genCallTarget callee + IT.pop (rn 11) + callAligned 0 (pure ()) invokeIndirect + normalizeCallResultRax t + IT.push rax +genStmt (ATNode (ATCallFunc x (Just args)) t _ _) = do + prepareCallArgs args $ do + IT.mov rax (0 :: Int) + IT.call x + normalizeCallResultRax t + IT.push rax +genStmt (ATNode (ATCallPtr (Just args)) t callee _) = do + prepareIndirectCall callee args invokeIndirect + normalizeCallResultRax t IT.push rax genStmt (ATNode (ATBlock stmt) _ _ _) = mapM_ genStmt stmt genStmt (ATNode (ATStmtExpr stmt) _ _ _) = mapM_ genStmt stmt @@ -143,17 +564,31 @@ genStmt (ATNode (ATLabel ident) _ _ _) = IT.gotoLabel ident genStmt (ATNode (ATFor exps) _ _ _) = IT.bracketBrkCnt $ do n <- IT.incrLbl IT.applyCnt >> IT.applyBrk - maybe (return ()) (genStmt . fromATKindFor) $ find isATForInit exps + maybe (return ()) genStmt $ nonEmptyForClause isATForInit IT.begin n - maybe (return ()) (genStmt . fromATKindFor) $ find isATForCond exps - IT.pop rax - IT.cmp rax (0 :: Int) - IT.je $ IT.refBreak n - maybe (return ()) (genStmt . fromATKindFor) $ find isATForStmt exps + maybe + (return ()) + ( \cond -> do + genStmt cond + IT.pop rax + IT.cmp rax (0 :: Int) + IT.je $ IT.refBreak n + ) + $ nonEmptyForClause isATForCond + maybe (return ()) genStmt $ nonEmptyForClause isATForStmt IT.continue n - maybe (return ()) (genStmt . fromATKindFor) $ find isATForIncr exps + maybe (return ()) genStmt $ nonEmptyForClause isATForIncr IT.jmp $ IT.refBegin n IT.break n + where + nonEmptyForClause predicate = + fromATKindFor + <$> find + (\kind -> predicate kind && case fromATKindFor kind of + ATEmpty -> False + _ -> True + ) + exps genStmt (ATNode ATWhile _ lhs rhs) = IT.bracketBrkCnt $ do n <- IT.incrLbl IT.applyCnt >> IT.applyBrk @@ -202,6 +637,8 @@ genStmt (ATNode ATReturn _ lhs _) = do IT.pop rax IT.jmp IT.refReturn genStmt (ATNode ATCast t lhs _) = genStmt lhs >> truncate t +genStmt (ATNode ATSizeof _ lhs _) = IT.push (fromIntegral (CR.sizeof $ atype lhs) :: Integer) +genStmt (ATNode ATAlignof _ lhs _) = IT.push (fromIntegral (CR.alignof $ atype lhs) :: Integer) genStmt (ATNode ATExprStmt _ lhs _) = genStmt lhs >> IT.add rsp (8 :: Int) genStmt (ATNode ATBitNot _ lhs _) = do genStmt lhs @@ -276,9 +713,12 @@ genStmt (ATNode ATPostDec t lhs _) = do decrement t store t increment t -genStmt (ATNode ATComma _ lhs rhs) = genStmt lhs >> genStmt rhs +genStmt (ATNode ATComma _ lhs rhs) = genStmt lhs >> IT.add rsp (8 :: Int) >> genStmt rhs genStmt (ATNode ATAddr _ lhs _) = genAddr lhs -genStmt (ATNode ATDeref t lhs _) = genStmt lhs >> unless (CR.isCTArray t) (load t) +genStmt (ATNode ATDeref t ptr _) + | Just (base, offset, idxs) <- rvalueArrayElementPointerChain ptr = + loadRvalueArrayElement t base offset idxs +genStmt (ATNode ATDeref t lhs _) = genStmt lhs >> unless (nonLoadableDerefType t) (load t) genStmt (ATNode ATNot _ lhs _) = do genStmt lhs IT.pop rax @@ -289,9 +729,14 @@ genStmt (ATNode ATNot _ lhs _) = do genStmt (ATNode (ATNum x) _ _ _) | x <= fromIntegral (maxBound :: Int32) = IT.push x | otherwise = IT.movabs rax x >> IT.push rax +genStmt n@(ATNode (ATFuncPtr _) _ _ _) = genAddr n genStmt n@(ATNode (ATLVar _ _) t _ _) = genAddr n >> unless (CR.isCTArray t) (load t) genStmt n@(ATNode (ATGVar _ _) t _ _) = genAddr n >> unless (CR.isCTArray t) (load t) -genStmt n@(ATNode (ATMemberAcc _) t _ _) = genAddr n >> unless (CR.isCTArray t) (load t) +genStmt n@(ATNode (ATMemberAcc member) t lhs _) + | isAddressableExpr lhs = genAddr n >> unless (CR.isCTArray t) (load t) + | otherwise = case rvalueSubobjectBaseOffset n of + Just (base, offset) -> loadRvalueSubobject t base offset + Nothing -> loadRvalueSubobject t lhs (CR.smOffset member) genStmt (ATNode ATAssign t lhs rhs) = genLVal lhs >> genStmt rhs >> store t genStmt (ATNode (ATNull _) _ _ _) = return () genStmt (ATNode kd ty lhs rhs) @@ -350,16 +795,69 @@ genStmt (ATNode kd ty lhs rhs) _ -> SI.errCtx "internal compiler error: asm code generator should not reach here (binOp). Maybe abstract tree is broken it cause (bug)." genStmt _ = return () +spillRegisterParam + :: (Integral e, Ord i, IsOperand i, IT.BinaryInstruction i) + => ATree i + -> [Register] + -> SI.Asm IT.TextLabelCtx e () +spillRegisterParam (ATNode (ATLVar t o) _ _ _) regs + | CR.toTypeKind t == CR.CTBool = + maybe + (SI.errCtx "internal compiler error: there is no full-width register for a _Bool parameter") + (\fullReg -> IT.mov rax fullReg >> normalizeBoolAbiRax >> IT.mov (Ref $ rbp `osub` o) al) + (find ((== 8) . byteWidth) regs) + | isAggregateType t = + maybe + (SI.errCtx "internal compiler error: there is no full-width register for an aggregate parameter") + (\fullReg -> storePackedRegisterObject (CR.sizeof t) fullReg localSlotAt) + (find ((== 8) . byteWidth) regs) + | otherwise = + maybe + (SI.errCtx "internal compiler error: there is no register that fits the specified size") + (IT.mov (Ref $ rbp `osub` o)) + (find ((== CR.sizeof t) . byteWidth) regs) + where + localSlotAt offset = Ref $ (rbp `osub` o) `oadd` (fromIntegral offset :: Integer) +spillRegisterParam _ _ = + SI.errCtx "internal compiler error: expected local variable parameter slot" + +spillStackParam + :: (Integral e, Ord i, IsOperand i, IT.BinaryInstruction i) + => Integer + -> ATree i + -> SI.Asm IT.TextLabelCtx e () +spillStackParam callerOffset (ATNode (ATLVar t o) _ _ _) + | isAggregateType t = + loadCallerSlot >> storePackedRegisterObject (CR.sizeof t) rax localSlotAt + | otherwise = case CR.sizeof t of + 1 + | CR.toTypeKind t == CR.CTBool -> + loadCallerSlot >> normalizeBoolAbiRax >> IT.mov localSlot al + | otherwise -> + loadCallerSlot >> IT.mov localSlot al + 2 -> loadCallerSlot >> IT.mov localSlot ax + 4 -> loadCallerSlot >> IT.mov localSlot eax + 8 -> loadCallerSlot >> IT.mov localSlot rax + _ -> SI.errCtx "internal compiler error: unsupported stack-passed parameter width" + where + localSlot = Ref $ rbp `osub` o + localSlotAt offset = Ref $ (rbp `osub` o) `oadd` (fromIntegral offset :: Integer) + callerSlot = Ref $ rbp `oadd` callerOffset + loadCallerSlot = IT.mov rax callerSlot +spillStackParam _ _ = + SI.errCtx "internal compiler error: expected local variable parameter slot" + textSection' :: (Integral e, Show e, Integral i, IsOperand i, IT.UnaryInstruction i, IT.BinaryInstruction i) => ATree i -> SI.Asm IT.TextSectionCtx e () textSection' lc@(ATNode (ATDefFunc fn margs) ty st _) = do unless (CR.isSCStatic ty) $ IT.global fn IT.fn fn $ do prologue (stackSize lc) - when (isJust margs) $ flip (`zipWithM_` fromJust margs) argRegs $ \(ATNode (ATLVar t o) _ _ _) reg -> - maybe (SI.errCtx "internal compiler error: there is no register that fits the specified size") - (IT.mov (Ref $ rbp `osub` o)) $ find ((== CR.sizeof t) . byteWidth) reg + when (isJust margs) $ do + let (regArgs, stackArgs) = splitAt 6 $ fromJust margs + zipWithM_ spillRegisterParam regArgs argRegs + zipWithM_ spillStackParam [16, 24 ..] stackArgs genStmt st - epilogue + epilogue ty textSection' ATEmpty = return () textSection' _ = SI.errCtx "internal compiler error: all abstract tree should start from some functions" @@ -367,10 +865,25 @@ textSection' _ = SI.errCtx "internal compiler error: all abstract tree should st dataSection :: (Show i, Ord i, Num i) => M.Map T.Text (GVar i) -> [Literal i] -> SI.Asm SI.AsmCodeCtx e () dataSection gvars lits = ID.dAta $ do forM_ lits $ \(Literal _ n cnt) -> ID.label (".L.data." <> tshow n) $ ID.byte cnt - forM_ (M.toList gvars) $ \(var, GVar t ginit) -> case ginit of + forM_ (M.toList gvars) $ \(var, GVar t ginit _) -> case ginit of PV.GVarInitWithZero -> ID.label var $ ID.zero (CR.sizeof t) + PV.GVarInitWithExternDecl -> pure () PV.GVarInitWithOG ref -> ID.label var $ ID.quad ref PV.GVarInitWithVal val -> ID.label var $ ID.sbyte (CR.sizeof t) val + PV.GVarInitWithData ds -> ID.label var $ mapM_ emitInitData ds + PV.GVarInitWithAST _ -> SI.errCtx "internal compiler error: unresolved global initializer AST" + where + emitInitData dat = case dat of + PV.GVarInitZeroBytes sz -> ID.zero sz + PV.GVarInitBytes sz val -> ID.sbyte sz val + PV.GVarInitReloc sz ref addend + | sz == 8 -> ID.quad $ formatReloc ref addend + | otherwise -> SI.errCtx "internal compiler error: unsupported relocation width in global initializer" + + formatReloc ref addend + | addend == 0 = ref + | addend > 0 = ref <> "+" <> tshow addend + | otherwise = ref <> tshow addend -- | text section of assembly code textSection :: (Integral e, Show e, IsOperand i, Integral i, Show i, IT.UnaryInstruction i, IT.BinaryInstruction i) => [ATree i] -> SI.Asm SI.AsmCodeCtx e () diff --git a/src/Htcc/Asm/Intrinsic/Structure/Internal.hs b/src/Htcc/Asm/Intrinsic/Structure/Internal.hs index b1aec2c..61ade17 100644 --- a/src/Htcc/Asm/Intrinsic/Structure/Internal.hs +++ b/src/Htcc/Asm/Intrinsic/Structure/Internal.hs @@ -16,6 +16,7 @@ module Htcc.Asm.Intrinsic.Structure.Internal ( AsmCodeCtx, unCtx, runAsm, + runAsmWithHandle, putStrWithIndent, putStrLnWithIndent, errCtx, @@ -28,17 +29,19 @@ import Control.Monad.Finally (MonadFinally (..)) import Data.IORef (IORef, newIORef, writeIORef) import qualified Data.Text as T import qualified Data.Text.IO as T +import System.IO (Handle, stdout) import Htcc.Utils (err) -- | Counter and label information used when generating assembly code data AsmInfo e = AsmInfo { - inLabel :: Bool, -- ^ the flag that indicates whether it is inside the label. If True, indent by single tab, - lblCnt :: IORef e, -- ^ the label counter - brkCnt :: IORef (Maybe e), -- ^ the @break@ label counter - cntCnt :: IORef (Maybe e), -- ^ the @continue@ label counter - curFn :: IORef (Maybe T.Text) -- ^ the function being processed + inLabel :: Bool, -- ^ the flag that indicates whether it is inside the label. If True, indent by single tab, + outHandle :: Handle, -- ^ output destination for generated assembly + lblCnt :: IORef e, -- ^ the label counter + brkCnt :: IORef (Maybe e), -- ^ the @break@ label counter + cntCnt :: IORef (Maybe e), -- ^ the @continue@ label counter + curFn :: IORef (Maybe T.Text) -- ^ the function being processed } -- | A monad that represents the context of the assembly code @@ -80,21 +83,27 @@ unCtx = Asm . unAsm -- | the executor that outputs assembly code runAsm :: (Num e, Enum e) => Asm AsmCodeCtx e a -> IO a -runAsm asm = do - putStrLn ".intel_syntax noprefix" +runAsm = runAsmWithHandle stdout + +-- | the executor that outputs assembly code to the specified handle +runAsmWithHandle :: (Num e, Enum e) => Handle -> Asm AsmCodeCtx e a -> IO a +runAsmWithHandle h asm = do + T.hPutStrLn h ".intel_syntax noprefix" c <- newIORef 0 brk <- newIORef Nothing cnt <- newIORef Nothing fn <- newIORef Nothing - unAsm asm (AsmInfo False c brk cnt fn) + result <- unAsm asm (AsmInfo False h c brk cnt fn) + T.hPutStrLn h ".section .note.GNU-stack,\"\",@progbits" + pure result -- | print a string with indentation, output is broken on a new line putStrLnWithIndent :: T.Text -> Asm ctx e () -putStrLnWithIndent s = Asm $ \x -> T.putStrLn $ if inLabel x then '\t' `T.cons` s else s +putStrLnWithIndent s = Asm $ \x -> T.hPutStrLn (outHandle x) $ if inLabel x then '\t' `T.cons` s else s -- | print a string with indentation putStrWithIndent :: T.Text -> Asm ctx e () -putStrWithIndent s = Asm $ \x -> T.putStr $ if inLabel x then '\t' `T.cons` s else s +putStrWithIndent s = Asm $ \x -> T.hPutStr (outHandle x) $ if inLabel x then '\t' `T.cons` s else s -- | The error context. -- when this is executed, diff --git a/src/Htcc/Asm/Intrinsic/Structure/Section/Text/Directive.hs b/src/Htcc/Asm/Intrinsic/Structure/Section/Text/Directive.hs index 966e1bd..03fd5ee 100644 --- a/src/Htcc/Asm/Intrinsic/Structure/Section/Text/Directive.hs +++ b/src/Htcc/Asm/Intrinsic/Structure/Section/Text/Directive.hs @@ -83,14 +83,14 @@ label :: (Show i, Show e) => T.Text -> i -> C.Asm TextLabelCtx e () label lbl n = C.Asm $ \x -> do cf <- readIORef $ C.curFn x unless (isJust cf) $ err "stray label" - T.putStrLn $ ".L." <> lbl <> "." <> fromJust cf <> "." <> tshow n <> ":" + T.hPutStrLn (C.outHandle x) $ ".L." <> lbl <> "." <> fromJust cf <> "." <> tshow n <> ":" -- | goto label gotoLabel :: T.Text -> C.Asm TextLabelCtx e () gotoLabel ident = C.Asm $ \x -> do cf <- readIORef $ C.curFn x unless (isJust cf) $ err "stray goto label" - T.putStrLn $ ".L.label." <> fromJust cf <> "." <> ident <> ":" + T.hPutStrLn (C.outHandle x) $ ".L.label." <> fromJust cf <> "." <> ident <> ":" -- | begin label begin :: (Show e, Show i) => i -> C.Asm TextLabelCtx e () @@ -109,7 +109,7 @@ cAse :: (Show e, Show i) => i -> C.Asm TextLabelCtx e () cAse n = C.Asm $ \x -> do cf <- readIORef $ C.curFn x unless (isJust cf) $ err "stray case" - T.putStrLn $ ".L.case." <> fromJust cf <> "." <> tshow n <> ":" + T.hPutStrLn (C.outHandle x) $ ".L.case." <> fromJust cf <> "." <> tshow n <> ":" -- | break label break :: (Show e, Show i) => i -> C.Asm TextLabelCtx e () @@ -124,7 +124,7 @@ refReturn :: Show e => C.Asm TargetLabelCtx e () refReturn = C.Asm $ \x -> do cf <- readIORef (C.curFn x) unless (isJust cf) $ err "stray label" - T.putStrLn $ ".L.return." <> fromJust cf + T.hPutStrLn (C.outHandle x) $ ".L.return." <> fromJust cf refCnt :: Show e => (C.AsmInfo a -> IORef (Maybe e)) -> T.Text -> C.Asm ctx a () refCnt f mes = C.Asm $ \x -> do @@ -132,7 +132,7 @@ refCnt f mes = C.Asm $ \x -> do unless (isJust cf) $ err $ "stray " <> mes n <- readIORef (f x) unless (isJust n) $ err $ "stray " <> mes - T.putStrLn $ ".L." <> mes <> "." <> fromJust cf <> "." <> tshow (fromJust n) + T.hPutStrLn (C.outHandle x) $ ".L." <> mes <> "." <> fromJust cf <> "." <> tshow (fromJust n) -- | reference for break label refBreak :: (Show e, Show i) => i -> C.Asm TargetLabelCtx e () @@ -155,7 +155,7 @@ refGoto :: T.Text -> C.Asm TargetLabelCtx e () refGoto ident = C.Asm $ \x -> do cf <- readIORef (C.curFn x) unless (isJust cf) $ err "stray label" - T.putStrLn $ ".L.label." <> fromJust cf <> "." <> ident + T.hPutStrLn (C.outHandle x) $ ".L.label." <> fromJust cf <> "." <> ident -- | reference to begin label refBegin :: (Show e, Show i) => i -> C.Asm TargetLabelCtx e () @@ -174,7 +174,7 @@ ref :: (Show e, Show i) => T.Text -> i -> C.Asm TargetLabelCtx e () ref lbl n = C.Asm $ \x -> do cf <- readIORef (C.curFn x) unless (isJust cf) $ err "stray label" - T.putStrLn $ ".L." <> lbl <> "." <> fromJust cf <> "." <> tshow n + T.hPutStrLn (C.outHandle x) $ ".L." <> lbl <> "." <> fromJust cf <> "." <> tshow n -- | generate cases and return abstract tree makeCases :: (Show e, Enum e, Integral e, Show i, Num i) => [ATree i] -> C.Asm TextLabelCtx e [ATree i] @@ -184,12 +184,12 @@ makeCases cases = C.Asm $ \x -> do (ATNode (ATCase _ cn) t lhs rhs) -> do modifyIORef (C.lblCnt x) succ n' <- readIORef (C.lblCnt x) - T.putStrLn $ "\tcmp rax, " <> tshow cn - T.putStrLn $ "\tje .L.case." <> fromJust cf <> "." <> tshow n' + T.hPutStrLn (C.outHandle x) $ "\tcmp rax, " <> tshow cn + T.hPutStrLn (C.outHandle x) $ "\tje .L.case." <> fromJust cf <> "." <> tshow n' return $ ATNode (ATCase (fromIntegral n') cn) t lhs rhs (ATNode (ATDefault _) t lhs rhs) -> do modifyIORef (C.lblCnt x) succ n' <- readIORef (C.lblCnt x) - T.putStrLn $ "\tjmp .L.case." <> fromJust cf <> "." <> tshow n' + T.hPutStrLn (C.outHandle x) $ "\tjmp .L.case." <> fromJust cf <> "." <> tshow n' return $ ATNode (ATDefault $ fromIntegral n') t lhs rhs at -> return at diff --git a/src/Htcc/Asm/Intrinsic/Structure/Section/Text/Instruction.hs b/src/Htcc/Asm/Intrinsic/Structure/Section/Text/Instruction.hs index 30e6a3a..e976cb1 100644 --- a/src/Htcc/Asm/Intrinsic/Structure/Section/Text/Instruction.hs +++ b/src/Htcc/Asm/Intrinsic/Structure/Section/Text/Instruction.hs @@ -217,5 +217,4 @@ jnz asm = I.putStrWithIndent "jnz " *> I.unCtx asm -- | @call@ instruction call :: T.Text -> I.Asm TextLabelCtx e () -call = intelSyntaxUnary "call" - +call arg = I.putStrLnWithIndent $ "call " <> arg diff --git a/src/Htcc/CRules/Char.hs b/src/Htcc/CRules/Char.hs index 7ab786f..6ff00b7 100644 --- a/src/Htcc/CRules/Char.hs +++ b/src/Htcc/CRules/Char.hs @@ -1,4 +1,4 @@ -{-# LANGUAGE FlexibleInstances, TypeSynonymInstances #-} +{-# LANGUAGE FlexibleInstances #-} {-| Module : Htcc.CRules.Char Description : Characters rules of C language diff --git a/src/Htcc/CRules/Types/StorageClass.hs b/src/Htcc/CRules/Types/StorageClass.hs index cb4f787..8ac28ee 100644 --- a/src/Htcc/CRules/Types/StorageClass.hs +++ b/src/Htcc/CRules/Types/StorageClass.hs @@ -12,14 +12,16 @@ The storage-class of C language {-# LANGUAGE DeriveGeneric #-} module Htcc.CRules.Types.StorageClass ( -- * StorageClass data type and class - StorageClass (..), - StorageClassBase (..) + StorageClass (..) + , StorageClassBase (..) + , wrapCTFunc ) where import Control.DeepSeq (NFData (..)) -import Data.Tuple.Extra (first, second) +import Data.Bifunctor (bimap, first) import GHC.Generics (Generic) +import qualified Data.Text as T import Htcc.CRules.Types.CType import Htcc.CRules.Types.TypeKind @@ -81,8 +83,8 @@ instance Ord i => CType (StorageClass i) where alignof = alignof . toTypeKind deref x = picksc x <$> deref (toTypeKind x) ctorPtr n = mapTypeKind (ctorPtr n) - dctorPtr x = first (picksc x) $ second (\f y -> picksc y $ f $ toTypeKind y) $ dctorPtr $ toTypeKind x - dctorArray x = first (picksc x) $ second (\f y -> picksc y $ f $ toTypeKind y) $ dctorArray $ toTypeKind x + dctorPtr x = bimap (picksc x) (\f y -> picksc y $ f $ toTypeKind y) $ dctorPtr $ toTypeKind x + dctorArray x = bimap (picksc x) (\f y -> picksc y $ f $ toTypeKind y) $ dctorArray $ toTypeKind x removeAllExtents = mapTypeKind removeAllExtents conversion x y = SCAuto $ conversion (toTypeKind x) (toTypeKind y) implicitInt = mapTypeKind implicitInt @@ -94,6 +96,9 @@ instance TypeKindBase StorageClass where {-# INLINE isArray #-} isArray = isArray . toTypeKind + {-# INLINE isIntegral #-} + isIntegral = isIntegral . toTypeKind + {-# INLINE isCTStruct #-} isCTStruct = isCTStruct . toTypeKind @@ -134,3 +139,6 @@ instance StorageClassBase StorageClass where isSCStatic _ = False instance NFData i => NFData (StorageClass i) + +wrapCTFunc :: StorageClass i -> [(StorageClass i, Maybe T.Text)] -> StorageClass i +wrapCTFunc ty params = picksc ty $ CTFunc (toTypeKind ty) $ map (first fromsc) params diff --git a/src/Htcc/CRules/Types/TypeKind.hs b/src/Htcc/CRules/Types/TypeKind.hs index 2263f95..271c17c 100644 --- a/src/Htcc/CRules/Types/TypeKind.hs +++ b/src/Htcc/CRules/Types/TypeKind.hs @@ -9,9 +9,10 @@ Portability : POSIX The types of C language -} -{-# LANGUAGE BangPatterns, DeriveGeneric #-} +{-# LANGUAGE BangPatterns, DeriveGeneric, LambdaCase, TupleSections #-} module Htcc.CRules.Types.TypeKind ( -- * TypeKind data type + ScopeId (..), StructMember (..), TypeKind (..), Incomplete (..), @@ -23,16 +24,21 @@ module Htcc.CRules.Types.TypeKind ( -- * Utilities of C type alignas, Desg (..), + integerPromotedTypeKind, + mergeCompatibleTypeKinds, + mergeTentativeArrayTypeKinds, accessibleIndices, ) where import Control.Applicative ((<|>)) import Control.DeepSeq (NFData (..)) +import Data.Bifunctor (bimap) import Data.Bits (Bits (..), complement, (.&.)) import Data.Foldable (Foldable (..)) import Data.List (find, intercalate, maximumBy) import Data.List.Split (chunksOf) import qualified Data.Map as M +import Data.Maybe (isJust) import qualified Data.Text as T import Data.Tree (Tree (..)) import Data.Tuple.Extra (first, second) @@ -45,6 +51,11 @@ import Htcc.CRules.Types.CType import Htcc.Utils (dropFst3, dropSnd3, lor, maybe', spanLen, toInteger, toNatural) +newtype ScopeId = ScopeId Natural + deriving (Eq, Ord, Show, Generic) + +instance NFData ScopeId + -- | Class to a type based on `TypeKind`. class TypeKindBase a where -- | `isCTArray` returns `True` when the given argument is `Htcc.CRules.Types.Core.CTArray`. @@ -53,6 +64,8 @@ class TypeKindBase a where -- | `isArray` return `True` when the given argument is `Htcc.CRules.Types.Core.CTArray` or `IncompleteArray` -- Otherwise, returns `False` isArray :: a i -> Bool + -- | `isIntegral` return `True` when the given argument is `Htcc.CRules.Types.Core.CTInt`, `Htcc.CRules.Types.Core.CTShort` or `Htcc.CRules.Types.Core.CTLong` + isIntegral :: a i -> Bool -- | `isCTStruct` returns `True` when the given argument is `Htcc.CRules.Types.Core.CTStruct`. -- Otherwise, returns `False` isCTStruct :: a i -> Bool @@ -107,24 +120,24 @@ class IncompleteBase a where -- | The type representing an incomplete type data Incomplete i = IncompleteArray (TypeKind i) -- ^ incomplete array, it has a base type. - | IncompleteStruct T.Text -- ^ incomplete struct, it has a tag name. + | IncompleteStruct T.Text ScopeId -- ^ incomplete struct, it has a tag name and binding scope. deriving (Eq, Generic) instance IncompleteBase Incomplete where isIncompleteArray (IncompleteArray _) = True isIncompleteArray _ = False - isIncompleteStruct (IncompleteStruct _) = True - isIncompleteStruct _ = False - fromIncompleteStruct (IncompleteStruct t) = Just t - fromIncompleteStruct _ = Nothing + isIncompleteStruct (IncompleteStruct _ _) = True + isIncompleteStruct _ = False + fromIncompleteStruct (IncompleteStruct t _) = Just t + fromIncompleteStruct _ = Nothing fromIncompleteArray (IncompleteArray t) = Just t fromIncompleteArray _ = Nothing isValidIncomplete (IncompleteArray t) = isFundamental t isValidIncomplete _ = True instance Show i => Show (Incomplete i) where - show (IncompleteArray t) = show t ++ "[]" - show (IncompleteStruct t) = T.unpack t + show (IncompleteArray t) = show t ++ "[]" + show (IncompleteStruct t _) = T.unpack t instance NFData i => NFData (Incomplete i) @@ -136,17 +149,27 @@ data TypeKind i = CTInt -- ^ The type @int@ as C language | CTLong (TypeKind i) -- ^ The type @long@ as C language | CTBool -- ^ The type @_Bool@ as C language | CTVoid -- ^ The type @void@ as C language + | CTFunc (TypeKind i) [(TypeKind i, Maybe T.Text)] -- ^ The type of function as C language | CTPtr (TypeKind i) -- ^ The pointer type of `TypeKind` | CTArray Natural (TypeKind i) -- ^ The array type | CTEnum (TypeKind i) (M.Map T.Text i) -- ^ The enum, has its underlying type and a map | CTStruct (M.Map T.Text (StructMember i)) -- ^ The struct, has its members and their names. + | CTNamedStruct T.Text ScopeId (M.Map T.Text (StructMember i)) -- ^ A tagged struct definition. | CTIncomplete (Incomplete i) -- ^ The incomplete type. | CTUndef -- ^ Undefined type deriving Generic {-# INLINE fundamental #-} fundamental :: [TypeKind i] -fundamental = [CTChar, CTInt, CTShort CTUndef, CTLong CTUndef, CTSigned CTUndef] +fundamental = [CTChar, CTInt, CTBool, CTShort CTUndef, CTLong CTUndef, CTSigned CTUndef, CTVoid] + +{-# INLINE anonymousStructTagPrefix #-} +anonymousStructTagPrefix :: T.Text +anonymousStructTagPrefix = T.pack ".anonymous.struct." + +{-# INLINE isAnonymousStructTag #-} +isAnonymousStructTag :: T.Text -> Bool +isAnonymousStructTag = T.isPrefixOf anonymousStructTagPrefix {-# INLINE isLongShortable #-} isLongShortable :: TypeKind i -> Bool @@ -191,18 +214,22 @@ removeAllQualified (CTShort x) = removeAllQualified x removeAllQualified (CTSigned x) = removeAllQualified x removeAllQualified x = x +type QualifierFingerprint = (Bool, Int, Int, Int) + {-# INLINE combTable #-} -combTable :: TypeKind i -> Maybe Int -combTable CTChar = Just 1 -combTable CTInt = Just $ shiftL 1 1 -combTable CTBool = Just $ shiftL 1 2 -combTable CTVoid = Just $ shiftL 1 3 -combTable CTUndef = Just $ shiftL 1 4 -combTable (CTPtr x) = (shiftL 1 5 .|.) <$> combTable x -combTable (CTSigned x) = (shiftL 1 6 .|.) <$> combTable x -combTable (CTLong x) = (shiftL 1 7 .|.) <$> combTable x -combTable (CTShort x) = (shiftL 1 8 .|.) <$> combTable x -combTable _ = Nothing +combTable :: TypeKind i -> Maybe QualifierFingerprint +combTable = go False 0 0 + where + go signed longs shorts = \case + CTChar -> Just (signed, longs, shorts, 1) + CTInt -> Just (signed, longs, shorts, 2) + CTBool -> Just (signed, longs, shorts, 3) + CTVoid -> Just (signed, longs, shorts, 4) + CTUndef -> Just (signed, longs, shorts, 5) + CTSigned x -> go True longs shorts x + CTLong x -> go signed (succ longs) shorts x + CTShort x -> go signed longs (succ shorts) x + _ -> Nothing {-# INLINE arSizes #-} arSizes :: (Num i, Enum i) => TypeKind i -> (i, [[i]]) @@ -229,7 +256,7 @@ instance (Enum i, Integral i) => Enum (Desg i) where -- | If the given argument is `CTArray`, it returns a list of accessible indexes of the array. -- Othrewise returns empty list. accessibleIndices :: Integral i => TypeKind i -> [[Desg i]] -accessibleIndices = uncurry (concatMap . chunksOf) . first fromIntegral . second (concatMap (map (iNode' id) . iNode id) . arIndices') . arSizes +accessibleIndices = uncurry (concatMap . chunksOf) . bimap fromIntegral (concatMap (map (iNode' id) . iNode id) . arIndices') . arSizes where arIndices' [] = [] arIndices' (x:xs) = map (flip ($) (arIndices' xs) . Node) x @@ -246,9 +273,12 @@ instance Eq i => Eq (TypeKind i) where (==) CTChar CTChar = True (==) CTBool CTBool = True (==) CTVoid CTVoid = True + (==) (CTFunc lty lparams) (CTFunc rty rparams) = lty == rty && map fst lparams == map fst rparams (==) (CTEnum ut1 m1) (CTEnum ut2 m2) = ut1 == ut2 && m1 == m2 (==) (CTArray v1 t1) (CTArray v2 t2) = v1 == v2 && t1 == t2 (==) (CTStruct m1) (CTStruct m2) = m1 == m2 + (==) (CTNamedStruct tag1 depth1 m1) (CTNamedStruct tag2 depth2 m2) = + tag1 == tag2 && depth1 == depth2 && m1 == m2 (==) CTUndef CTUndef = True (==) (CTPtr t1) (CTPtr t2) = t1 == t2 (==) (CTIncomplete t1) (CTIncomplete t2) = t1 == t2 @@ -268,10 +298,18 @@ instance Show i => Show (TypeKind i) where show (CTLong t) = "long " ++ show t show CTBool = "_Bool" show CTVoid = "void" + show (CTFunc ty param) = show ty ++ "(" ++ intercalate ", " (map show param) ++ ")" show (CTPtr x) = show x ++ "*" show (CTArray v t) = show t ++ "[" ++ show v ++ "]" show (CTEnum _ m) = "enum { " ++ intercalate ", " (map T.unpack $ M.keys m) ++ " }" show (CTStruct m) = "struct { " ++ concatMap (\(v, inf) -> show (smType inf) ++ " " ++ T.unpack v ++ "; ") (M.toList m) ++ "}" + show (CTNamedStruct tag _ m) + | isAnonymousStructTag tag = + "struct { " ++ concatMap (\(v, inf) -> show (smType inf) ++ " " ++ T.unpack v ++ "; ") (M.toList m) ++ "}" + show (CTNamedStruct tag _ m) = + "struct " ++ T.unpack tag ++ " { " + ++ concatMap (\(v, inf) -> show (smType inf) ++ " " ++ T.unpack v ++ "; ") (M.toList m) + ++ "}" show (CTIncomplete t) = show t show CTUndef = "undefined" @@ -312,6 +350,7 @@ instance Ord i => CType (TypeKind i) where sizeof (CTLong x) = sizeof x sizeof CTBool = 1 sizeof CTVoid = 1 -- Non standard + sizeof (CTFunc _ _) = 1 -- Non standard sizeof (CTPtr _) = 8 sizeof (CTArray v t) = v * sizeof t sizeof (CTEnum t _) = sizeof t @@ -319,6 +358,10 @@ instance Ord i => CType (TypeKind i) where | M.null m = 1 | otherwise = let sn = maximumBy (flip (.) smOffset . compare . smOffset) $ M.elems m in toNatural $ alignas (toInteger $ smOffset sn + sizeof (smType sn)) (toInteger $ alignof t) + sizeof t@(CTNamedStruct _ _ m) + | M.null m = 1 + | otherwise = let sn = maximumBy (flip (.) smOffset . compare . smOffset) $ M.elems m in + toNatural $ alignas (toInteger $ smOffset sn + sizeof (smType sn)) (toInteger $ alignof t) sizeof CTUndef = 0 sizeof (CTIncomplete _) = 0 sizeof _ = error "sizeof: sould not reach here" @@ -333,6 +376,7 @@ instance Ord i => CType (TypeKind i) where alignof (CTLong x) = alignof x alignof CTBool = 1 alignof CTVoid = 1 -- Non standard + alignof (CTFunc _ _) = 1 -- Non standard alignof (CTPtr _) = 8 alignof (CTArray _ t) = alignof $ removeAllExtents t alignof (CTEnum t _) = alignof t @@ -343,16 +387,19 @@ instance Ord i => CType (TypeKind i) where alignof (CTStruct m) | M.null m = 1 | otherwise = maximum $ map (alignof . smType) $ M.elems m + alignof (CTNamedStruct _ _ m) + | M.null m = 1 + | otherwise = maximum $ map (alignof . smType) $ M.elems m alignof CTUndef = 0 alignof _ = error "alignof: sould not reach here" deref (CTPtr x) = Just x deref ct@(CTArray _ _) = Just $ f ct where - f (CTArray n c@(CTArray _ _)) = CTArray n (f c) - f (CTArray _ t) = t - f t = t - deref (CTIncomplete (IncompleteArray (CTArray _ _))) = Nothing + f (CTArray _ t@(CTIncomplete (IncompleteArray _))) = t + f (CTArray n c@(CTArray _ _)) = CTArray n (f c) + f (CTArray _ t) = t + f t = t deref (CTIncomplete (IncompleteArray t)) = Just t deref _ = Nothing @@ -369,8 +416,11 @@ instance Ord i => CType (TypeKind i) where removeAllExtents x = x conversion l r - | l == r = l - | otherwise = max l r + | l' == r' = l' + | otherwise = max l' r' + where + l' = integerPromotedTypeKind l + r' = integerPromotedTypeKind r {-# INLINE implicitInt #-} implicitInt (CTLong x) = CTLong $ implicitInt x @@ -388,9 +438,20 @@ instance TypeKindBase TypeKind where {-# INLINE isArray #-} isArray = lor [isCTArray, isIncompleteArray] + {-# INLINE isIntegral #-} + isIntegral CTChar = True + isIntegral CTInt = True + isIntegral CTBool = True + isIntegral (CTEnum _ _) = True + isIntegral (CTSigned x) = isIntegral x + isIntegral (CTLong x) = isIntegral x + isIntegral (CTShort x) = isIntegral x + isIntegral _ = False + {-# INLINE isCTStruct #-} - isCTStruct (CTStruct _) = True - isCTStruct _ = False + isCTStruct (CTStruct _) = True + isCTStruct CTNamedStruct {} = True + isCTStruct _ = False {-# INLINE isCTUndef #-} isCTUndef CTUndef = True @@ -450,5 +511,224 @@ alignas !n !aval = pred (n + aval) .&. complement (pred aval) -- | `lookupMember` search the specified member by its name from `CTStruct`. lookupMember :: T.Text -> TypeKind i -> Maybe (StructMember i) -lookupMember t (CTStruct m) = M.lookup t m -lookupMember _ _ = Nothing +lookupMember t (CTStruct m) = M.lookup t m +lookupMember t (CTNamedStruct _ _ m) = M.lookup t m +lookupMember _ _ = Nothing + +typeKindStructurallyEqual :: Eq i => TypeKind i -> TypeKind i -> Bool +typeKindStructurallyEqual CTInt CTInt = True +typeKindStructurallyEqual CTChar CTChar = True +typeKindStructurallyEqual CTBool CTBool = True +typeKindStructurallyEqual CTVoid CTVoid = True +typeKindStructurallyEqual CTUndef CTUndef = True +typeKindStructurallyEqual (CTFunc lhsRet lhsParams) (CTFunc rhsRet rhsParams) = + typeKindStructurallyEqual lhsRet rhsRet + && functionParamTypesStructurallyEqual lhsParams rhsParams +typeKindStructurallyEqual (CTEnum lhsTy lhsMembers) (CTEnum rhsTy rhsMembers) = + typeKindStructurallyEqual lhsTy rhsTy && lhsMembers == rhsMembers +typeKindStructurallyEqual (CTArray lhsLen lhsTy) (CTArray rhsLen rhsTy) = + lhsLen == rhsLen && typeKindStructurallyEqual lhsTy rhsTy +typeKindStructurallyEqual (CTStruct lhsMembers) (CTStruct rhsMembers) = + structMembersStructurallyEqual lhsMembers rhsMembers +typeKindStructurallyEqual (CTNamedStruct lhsTag lhsDepth lhsMembers) (CTNamedStruct rhsTag rhsDepth rhsMembers) = + lhsTag == rhsTag + && lhsDepth == rhsDepth + && structMembersStructurallyEqual lhsMembers rhsMembers +typeKindStructurallyEqual (CTPtr lhsTy) (CTPtr rhsTy) = + typeKindStructurallyEqual lhsTy rhsTy +typeKindStructurallyEqual (CTIncomplete lhs) (CTIncomplete rhs) = + incompleteTypesStructurallyEqual lhs rhs +typeKindStructurallyEqual lhs rhs + | isQualifier lhs || isQualifier rhs = maybe' False (combTable lhs) $ \lh -> + maybe' False (combTable rhs) $ \rh -> lh == rh + | otherwise = False + +functionParamTypesStructurallyEqual :: Eq i => [(TypeKind i, Maybe T.Text)] -> [(TypeKind i, Maybe T.Text)] -> Bool +functionParamTypesStructurallyEqual lhs rhs = + length lhs == length rhs + && and (zipWith (typeKindStructurallyEqual `on` fst) lhs rhs) + where + on f g x y = f (g x) (g y) + +structMembersStructurallyEqual + :: Eq i + => M.Map T.Text (StructMember i) + -> M.Map T.Text (StructMember i) + -> Bool +structMembersStructurallyEqual lhs rhs + | M.keysSet lhs /= M.keysSet rhs = False + | otherwise = + all + (\(name, lhsMember) -> maybe False (structMemberStructurallyEqual lhsMember) $ M.lookup name rhs) + (M.toList lhs) + +structMemberStructurallyEqual :: Eq i => StructMember i -> StructMember i -> Bool +structMemberStructurallyEqual lhs rhs = + smOffset lhs == smOffset rhs + && typeKindStructurallyEqual (smType lhs) (smType rhs) + +incompleteTypesStructurallyEqual :: Eq i => Incomplete i -> Incomplete i -> Bool +incompleteTypesStructurallyEqual (IncompleteArray lhsTy) (IncompleteArray rhsTy) = + typeKindStructurallyEqual lhsTy rhsTy +incompleteTypesStructurallyEqual (IncompleteStruct lhsTag lhsDepth) (IncompleteStruct rhsTag rhsDepth) = + lhsTag == rhsTag && lhsDepth == rhsDepth +incompleteTypesStructurallyEqual _ _ = False + +mergeCompatibleTypeKinds :: Eq i => TypeKind i -> TypeKind i -> Maybe (TypeKind i) +mergeCompatibleTypeKinds = mergeCompatibleTypeKinds' True + +mergeCompatibleTypeKinds' :: Eq i => Bool -> TypeKind i -> TypeKind i -> Maybe (TypeKind i) +mergeCompatibleTypeKinds' _ lhs rhs + | typeKindStructurallyEqual lhs rhs = Just rhs + | equivalentSignedIntegerSynonym lhs rhs = Just rhs +mergeCompatibleTypeKinds' _ (CTSigned lhs) (CTSigned rhs) = + CTSigned <$> mergeCompatibleTypeKinds' False lhs rhs +mergeCompatibleTypeKinds' _ (CTShort lhs) (CTShort rhs) = + CTShort <$> mergeCompatibleTypeKinds' False lhs rhs +mergeCompatibleTypeKinds' _ (CTLong lhs) (CTLong rhs) = + CTLong <$> mergeCompatibleTypeKinds' False lhs rhs +mergeCompatibleTypeKinds' _ (CTPtr lhs) (CTPtr rhs) = + CTPtr <$> mergeCompatibleTypeKinds' False lhs rhs +mergeCompatibleTypeKinds' allowExtentInference lhs@(CTArray _ _) rhs@(CTArray _ _) = + mergeTentativeArrayTypeKinds' allowExtentInference lhs rhs +mergeCompatibleTypeKinds' _ (CTFunc lhsRet lhsParams) (CTFunc rhsRet rhsParams) = do + retTy <- mergeCompatibleTypeKinds' False lhsRet rhsRet + params <- mergeCompatibleFunctionParamLists lhsParams rhsParams + pure $ CTFunc retTy params +mergeCompatibleTypeKinds' allowExtentInference lhs@(CTIncomplete (IncompleteArray _)) rhs@(CTIncomplete (IncompleteArray _)) = + mergeTentativeArrayTypeKinds' allowExtentInference lhs rhs +mergeCompatibleTypeKinds' _ (CTIncomplete (IncompleteStruct lhsTag lhsDepth)) (CTIncomplete (IncompleteStruct rhsTag rhsDepth)) + | lhsTag == rhsTag + , lhsDepth == rhsDepth = + Just $ CTIncomplete $ IncompleteStruct lhsTag lhsDepth +mergeCompatibleTypeKinds' _ (CTIncomplete (IncompleteStruct lhsTag lhsDepth)) rhs@(CTNamedStruct rhsTag rhsDepth _) + | lhsTag == rhsTag + , lhsDepth == rhsDepth = + Just rhs +mergeCompatibleTypeKinds' _ lhs@(CTNamedStruct lhsTag lhsDepth _) (CTIncomplete (IncompleteStruct rhsTag rhsDepth)) + | lhsTag == rhsTag + , lhsDepth == rhsDepth = + Just lhs +mergeCompatibleTypeKinds' _ (CTEnum lhsTy lhsMembers) (CTEnum rhsTy rhsMembers) + | lhsMembers == rhsMembers = + CTEnum <$> mergeCompatibleTypeKinds' False lhsTy rhsTy <*> pure lhsMembers +mergeCompatibleTypeKinds' _ (CTStruct lhsMembers) (CTStruct rhsMembers) = + CTStruct <$> mergeCompatibleStructMembers lhsMembers rhsMembers +mergeCompatibleTypeKinds' _ (CTNamedStruct lhsTag lhsDepth lhsMembers) (CTNamedStruct rhsTag rhsDepth rhsMembers) + | lhsTag == rhsTag + , lhsDepth == rhsDepth = + CTNamedStruct lhsTag lhsDepth <$> mergeCompatibleStructMembers lhsMembers rhsMembers +mergeCompatibleTypeKinds' allowExtentInference lhs rhs = + mergeTentativeArrayTypeKinds' allowExtentInference lhs rhs + +equivalentSignedIntegerSynonym :: Eq i => TypeKind i -> TypeKind i -> Bool +equivalentSignedIntegerSynonym lhs rhs = case (canonicalSignedIntegerType lhs, canonicalSignedIntegerType rhs) of + (Just lhsTy, Just rhsTy) -> lhsTy == rhsTy + _ -> False + +canonicalSignedIntegerType :: TypeKind i -> Maybe (TypeKind i) +canonicalSignedIntegerType CTInt = Just CTInt +canonicalSignedIntegerType CTUndef = Just CTInt +canonicalSignedIntegerType (CTSigned x) = canonicalSignedIntegerType x +canonicalSignedIntegerType (CTShort x) = CTShort <$> canonicalSignedIntegerType x +canonicalSignedIntegerType (CTLong x) = CTLong <$> canonicalSignedIntegerType x +canonicalSignedIntegerType _ = Nothing + +mergeCompatibleStructMembers + :: Eq i + => M.Map T.Text (StructMember i) + -> M.Map T.Text (StructMember i) + -> Maybe (M.Map T.Text (StructMember i)) +mergeCompatibleStructMembers lhsMembers rhsMembers + | M.keysSet lhsMembers /= M.keysSet rhsMembers = Nothing + | otherwise = M.traverseWithKey mergeMember lhsMembers + where + mergeMember name lhsMember = do + rhsMember <- M.lookup name rhsMembers + if smOffset lhsMember /= smOffset rhsMember + then Nothing + else + (\mergedTy -> rhsMember { smType = mergedTy }) + <$> mergeCompatibleTypeKinds' False (smType lhsMember) (smType rhsMember) + +-- | Merge array types for tentative declarations. +-- Only the outermost missing extent may be inferred; all inner extents and rank +-- must already match exactly. +mergeTentativeArrayTypeKinds :: Eq i => TypeKind i -> TypeKind i -> Maybe (TypeKind i) +mergeTentativeArrayTypeKinds lhs rhs + | typeKindStructurallyEqual lhs rhs = Just rhs + | equivalentSignedIntegerSynonym lhs rhs = Just rhs + | otherwise = mergeTentativeArrayTypeKinds' True lhs rhs + +mergeTentativeArrayTypeKinds' :: Eq i => Bool -> TypeKind i -> TypeKind i -> Maybe (TypeKind i) +mergeTentativeArrayTypeKinds' allowExtentInference (CTArray lhsLen lhsInner) (CTArray rhsLen rhsInner) + | lhsLen == rhsLen = + CTArray lhsLen <$> mergeCompatibleTypeKinds' allowExtentInference lhsInner rhsInner +mergeTentativeArrayTypeKinds' allowExtentInference (CTIncomplete (IncompleteArray lhsElemTy)) (CTIncomplete (IncompleteArray rhsElemTy)) = + CTIncomplete . IncompleteArray <$> mergeCompatibleTypeKinds' allowExtentInference lhsElemTy rhsElemTy +mergeTentativeArrayTypeKinds' True (CTIncomplete (IncompleteArray lhsElemTy)) (CTArray rhsLen rhsInnerTy) = + CTArray rhsLen <$> mergeCompatibleTypeKinds' False lhsElemTy rhsInnerTy +mergeTentativeArrayTypeKinds' True (CTArray lhsLen lhsInnerTy) (CTIncomplete (IncompleteArray rhsElemTy)) = + CTArray lhsLen <$> mergeCompatibleTypeKinds' False lhsInnerTy rhsElemTy +mergeTentativeArrayTypeKinds' _ _ _ = Nothing + +mergeCompatibleFunctionParamLists :: Eq i => [(TypeKind i, Maybe T.Text)] -> [(TypeKind i, Maybe T.Text)] -> Maybe [(TypeKind i, Maybe T.Text)] +mergeCompatibleFunctionParamLists lhsParams rhsParams + | isUnspecifiedParamList lhsParams + && oldStyleCompatibleWithPrototype rhsParamKinds = + Just rhsParams + | isUnspecifiedParamList rhsParams + && oldStyleCompatibleWithPrototype lhsParamKinds = + Just lhsParams + | length lhsParamKinds == length rhsParamKinds = + mapM mergeCompatibleFunctionParam $ zip lhsParams rhsParams + | otherwise = Nothing + where + lhsParamKinds = normalizedFunctionParamKinds lhsParams + rhsParamKinds = normalizedFunctionParamKinds rhsParams + + isUnspecifiedParamList [] = True + isUnspecifiedParamList _ = False + + oldStyleCompatibleWithPrototype = + all (isJust . (\ty -> mergeCompatibleTypeKinds ty $ defaultPromotedFunctionParamType ty)) + mergeCompatibleFunctionParam ((lhsTy, lhsName), (rhsTy, rhsName)) = + (, rhsName <|> lhsName) + <$> ( mergeCompatibleTypeKinds lhsTy rhsTy + <|> mergeCompatibleTypeKinds + (canonicalizeFunctionParamType lhsTy) + (canonicalizeFunctionParamType rhsTy) + ) + +normalizedFunctionParamKinds :: [(TypeKind i, Maybe T.Text)] -> [TypeKind i] +normalizedFunctionParamKinds [(CTVoid, Nothing)] = [] +normalizedFunctionParamKinds params = + map (canonicalizeFunctionParamType . fst) params + +canonicalizeFunctionParamType :: TypeKind i -> TypeKind i +canonicalizeFunctionParamType (CTArray _ elemTy) = CTPtr elemTy +canonicalizeFunctionParamType (CTIncomplete (IncompleteArray elemTy)) = CTPtr elemTy +canonicalizeFunctionParamType (CTFunc retTy params) = CTPtr (CTFunc retTy params) +canonicalizeFunctionParamType ty = ty + +defaultPromotedFunctionParamType :: TypeKind i -> TypeKind i +defaultPromotedFunctionParamType = integerPromotedTypeKind + +integerPromotedTypeKind :: TypeKind i -> TypeKind i +integerPromotedTypeKind = \case + CTChar -> CTInt + CTBool -> CTInt + CTEnum _ _ -> CTInt + CTShort _ -> CTInt + CTSigned ty + | isPromotableQualifiedType ty -> CTInt + | otherwise -> CTSigned ty + ty -> ty + +isPromotableQualifiedType :: TypeKind i -> Bool +isPromotableQualifiedType = \case + CTChar -> True + CTBool -> True + CTEnum _ _ -> True + CTShort _ -> True + _ -> False diff --git a/src/Htcc/MegaparsecCompat.hs b/src/Htcc/MegaparsecCompat.hs new file mode 100644 index 0000000..104a8ef --- /dev/null +++ b/src/Htcc/MegaparsecCompat.hs @@ -0,0 +1,72 @@ +module Htcc.MegaparsecCompat ( + module Text.Megaparsec, + errorBundlePretty +) where + +import Data.Foldable (toList) +import Data.List (intercalate) +import qualified Data.Text as T +import qualified Text.Megaparsec as M +import Text.Megaparsec hiding (errorBundlePretty) +import qualified Text.Megaparsec.Pos as MP + +errorBundlePretty :: M.ShowErrorComponent e => M.ParseErrorBundle T.Text e -> String +errorBundlePretty bundle = + intercalate "\n\n" (renderError <$> toList (M.bundleErrors bundle)) <> "\n" + where + basePosState = M.bundlePosState bundle + + renderError err = + renderAt (errorSourcePos err) $ lines $ M.parseErrorTextPretty err + + errorSourcePos err + | M.errorOffset err <= M.pstateOffset basePosState = + M.pstateSourcePos basePosState + | otherwise = + M.pstateSourcePos $ snd $ M.reachOffset (M.errorOffset err) basePosState + + renderAt pos msgs = + intercalate "\n" $ + [ renderLoc pos ] + <> maybe [] (\(srcLn, caretCol) -> [srcLn, replicate (pred caretCol) ' ' <> "^"]) (sourceLineAt pos) + <> msgs + + renderLoc pos = + intercalate + ":" + [ MP.sourceName pos + , show $ MP.unPos $ MP.sourceLine pos + , show $ MP.unPos $ MP.sourceColumn pos + ] + + sourceLineAt pos = + let lineNo = MP.unPos (MP.sourceLine pos) - 1 + inputLines = T.splitOn (T.singleton '\n') $ M.pstateInput basePosState + in if lineNo < 0 + then Nothing + else case drop lineNo inputLines of + srcLn : _ -> Just $ truncateSourceLine srcLn (MP.unPos $ MP.sourceColumn pos) + [] -> Nothing + + truncateSourceLine srcLine caretCol + | T.length srcLine <= maxSourceLineWidth = + (T.unpack srcLine, caretCol) + | caretCol <= edgeContextWidth = + (T.unpack (T.take (maxSourceLineWidth - suffixWidth) srcLine) <> truncationSuffix, caretCol) + | otherwise = + let startCol = max 1 (caretCol - innerContextWidth) + shown = T.take (maxSourceLineWidth - prefixWidth - suffixWidth) $ T.drop (pred startCol) srcLine + suffix = + if T.length srcLine > pred startCol + T.length shown + then truncationSuffix + else "" + adjustedCaretCol = prefixWidth + caretCol - startCol + 1 + in (truncationPrefix <> T.unpack shown <> suffix, adjustedCaretCol) + where + maxSourceLineWidth = 160 + edgeContextWidth = 120 + innerContextWidth = 80 + truncationPrefix = "... " + truncationSuffix = " ..." + prefixWidth = length truncationPrefix + suffixWidth = length truncationSuffix diff --git a/src/Htcc/Output.hs b/src/Htcc/Output.hs new file mode 100644 index 0000000..fa41332 --- /dev/null +++ b/src/Htcc/Output.hs @@ -0,0 +1,388 @@ +module Htcc.Output ( + ReplacementOutputMode (..), + creationMaskedOutputMode, + replaceExistingOutputFromPathWith, + resolveReplacementOutputPath, + stagedOutputMode, + temporaryWritableMode, + withReplacementOutputPathAndResolvedPath, + withReplacementOutputPath, +) where + +import Control.Exception (SomeException, catch, displayException, + finally, onException, throwIO) +import Control.Monad (when) +import Data.Bits (complement) +import qualified Data.ByteString as B +import System.Directory (getTemporaryDirectory, makeAbsolute, + removeDirectory, removeFile, renameFile) +import System.FilePath (isRelative, normalise, takeDirectory, + takeFileName, ()) +import System.IO (IOMode (ReadMode, WriteMode), hClose, + openTempFile, withBinaryFile) +import System.IO.Error (catchIOError, isDoesNotExistError, + isPermissionError) +import System.Posix.Files (fileMode, getFileStatus, + getSymbolicLinkStatus, groupExecuteMode, + groupReadMode, groupWriteMode, + intersectFileModes, isRegularFile, + isSymbolicLink, linkCount, + otherExecuteMode, otherReadMode, + otherWriteMode, ownerExecuteMode, + ownerReadMode, ownerWriteMode, + readSymbolicLink, setFileMode, + setGroupIDMode, setUserIDMode, + unionFileModes) +import System.Posix.IO (closeFd, createFile) +import System.Posix.Temp (mkdtemp) +import System.Posix.Types (FileMode) + +data ReplacementOutputMode + = PreserveReplacementOutputMode + | PreserveReplacementOutputModeKeepingExecutableBits + +ignoreIOException :: IO () -> IO () +ignoreIOException = flip catchIOError $ const $ pure () + +defaultOutputFileMode :: FileMode +defaultOutputFileMode = foldr1 unionFileModes + [ ownerReadMode + , ownerWriteMode + , groupReadMode + , groupWriteMode + , otherReadMode + , otherWriteMode + ] + +temporaryWritableMode :: FileMode +temporaryWritableMode = ownerReadMode `unionFileModes` ownerWriteMode + +privateTemporaryDirectoryMode :: FileMode +privateTemporaryDirectoryMode = foldr1 unionFileModes + [ ownerReadMode + , ownerWriteMode + , ownerExecuteMode + ] + +privateTemporaryDirectoryModePreserving :: FileMode -> FileMode +privateTemporaryDirectoryModePreserving inheritedMode = + privateTemporaryDirectoryMode + `unionFileModes` intersectFileModes inheritedMode setGroupIDMode + +creationMaskedOutputMode :: IO FileMode +creationMaskedOutputMode = do + tmpDir <- getTemporaryDirectory + probeDir <- mkdtemp (tmpDir "htcc-output-modeXXXXXX") + let probePath = probeDir "mask-probe" + cleanup = + ignoreIOException (removeFile probePath) + *> ignoreIOException (removeDirectory probeDir) + finally + ( do + setFileMode probeDir privateTemporaryDirectoryMode + probeFd <- createFile probePath defaultOutputFileMode + finally + ( do + probeMode <- fileMode <$> getFileStatus probePath + pure $ intersectFileModes probeMode defaultOutputFileMode + ) + (closeFd probeFd) + ) + cleanup + +resolveReplacementOutputPath :: FilePath -> IO FilePath +resolveReplacementOutputPath = go [] + where + go seen path = do + pathKey <- normalise <$> makeAbsolute path + when (pathKey `elem` seen) $ + ioError . userError $ "cyclic symbolic output path: " <> path + maybeStatus <- catchIOError + (Just <$> getSymbolicLinkStatus path) + (\ioErr -> if isDoesNotExistError ioErr then pure Nothing else ioError ioErr) + case maybeStatus of + Just status | isSymbolicLink status -> do + target <- readSymbolicLink path + let nextPath = normalise $ + if isRelative target + then takeDirectory path target + else target + go (pathKey : seen) nextPath + _ -> pure path + +existingOutputMode :: FilePath -> IO (Maybe FileMode) +existingOutputMode path = catchIOError + (Just . fileMode <$> getFileStatus path) + (\ioErr -> if isDoesNotExistError ioErr then pure Nothing else ioError ioErr) + +shouldReplaceOutputPath :: FilePath -> IO Bool +shouldReplaceOutputPath path = catchIOError + (isRegularFile <$> getFileStatus path) + (\ioErr -> if isDoesNotExistError ioErr then pure True else ioError ioErr) + +executableFileMode :: FileMode +executableFileMode = foldr1 unionFileModes + [ ownerExecuteMode + , groupExecuteMode + , otherExecuteMode + ] + +specialFileMode :: FileMode +specialFileMode = foldr1 unionFileModes + [ setUserIDMode + , setGroupIDMode + , 0o1000 + ] + +clearSpecialFileMode :: FileMode -> FileMode +clearSpecialFileMode mode = + intersectFileModes mode $ complement specialFileMode + +preservedExecuteMode :: FileMode -> FileMode +preservedExecuteMode mode = + intersectFileModes mode executableFileMode + +minimalRunnableExecuteMode :: FileMode -> FileMode +minimalRunnableExecuteMode mode + | preservedExecuteMode mode /= 0 = ownerExecuteMode + | otherwise = 0 + +replacementExecutableMode :: FileMode -> FileMode -> FileMode +replacementExecutableMode existingMode currentMode = + preservedExecuteMode existingMode + `unionFileModes` minimalRunnableExecuteMode replacementMode + where + replacementMode = existingMode `unionFileModes` currentMode + +stagedOutputMode :: ReplacementOutputMode -> FileMode -> FileMode +stagedOutputMode PreserveReplacementOutputMode _ = + temporaryWritableMode +stagedOutputMode PreserveReplacementOutputModeKeepingExecutableBits baseMode = + temporaryWritableMode `unionFileModes` stagedExecuteMode + where + stagedExecuteMode + | existingExecuteMode /= 0 = existingExecuteMode + | otherwise = ownerExecuteMode + existingExecuteMode = preservedExecuteMode baseMode + +updatedOutputMode :: ReplacementOutputMode -> FileMode -> FileMode -> FileMode +updatedOutputMode PreserveReplacementOutputMode existingMode _ = + clearSpecialFileMode existingMode +updatedOutputMode PreserveReplacementOutputModeKeepingExecutableBits existingMode currentMode = + clearSpecialFileMode existingMode + `unionFileModes` replacementExecutableMode existingMode currentMode + +freshOutputMode :: ReplacementOutputMode -> FileMode -> FileMode -> FileMode +freshOutputMode PreserveReplacementOutputMode baseMode _ = + clearSpecialFileMode baseMode +freshOutputMode PreserveReplacementOutputModeKeepingExecutableBits baseMode currentMode = + clearSpecialFileMode baseMode + `unionFileModes` preservedExecuteMode currentMode + `unionFileModes` minimalRunnableExecuteMode currentMode + +copyFileContents :: FilePath -> FilePath -> IO () +copyFileContents src dst = + withBinaryFile src ReadMode $ \srcHandle -> + withBinaryFile dst WriteMode $ \dstHandle -> + let go = do + chunk <- B.hGetSome srcHandle 32768 + if B.null chunk + then pure () + else B.hPut dstHandle chunk *> go + in go + +withReadableSource :: FilePath -> FileMode -> IO a -> IO a +withReadableSource path originalMode action + | intersectFileModes originalMode ownerReadMode /= 0 = action + | otherwise = do + setFileMode path readableMode + action `finally` setFileMode path originalMode + where + readableMode = originalMode `unionFileModes` ownerReadMode + +withWritableDestination :: FilePath -> FileMode -> IO a -> IO a +withWritableDestination path originalMode action + | intersectFileModes originalMode ownerWriteMode /= 0 = action + | otherwise = action `catchIOError` \ioErr -> + if isPermissionError ioErr + then do + setFileMode path writableMode + action `finally` setFileMode path originalMode + else ioError ioErr + where + writableMode = originalMode `unionFileModes` ownerWriteMode + +copyExistingOutputToBackup :: FilePath -> FileMode -> FilePath -> IO () +copyExistingOutputToBackup resolvedOutputPath baseMode backupPath = + copyFileContents resolvedOutputPath backupPath `catchIOError` \ioErr -> + if isPermissionError ioErr + then withReadableSource resolvedOutputPath baseMode $ + copyFileContents resolvedOutputPath backupPath + else ioError ioErr + +ensureInPlaceReplacementSafe :: FilePath -> IO () +ensureInPlaceReplacementSafe resolvedOutputPath = do + existingLinkCount <- linkCount <$> getFileStatus resolvedOutputPath + when (existingLinkCount > 1) $ + ioError . userError $ + "refusing to replace hard-linked output in place: " <> resolvedOutputPath + +replaceExistingOutputFromPathWith + :: (FilePath -> FilePath -> IO ()) + -> ReplacementOutputMode + -> FilePath + -> FileMode + -> FileMode + -> FilePath + -> IO () +replaceExistingOutputFromPathWith copyReplacementOutput modeStrategy resolvedOutputPath baseMode currentMode stagedOutputPath = do + ensureInPlaceReplacementSafe resolvedOutputPath + tmpDir <- getTemporaryDirectory + let backupTemplate = takeFileName resolvedOutputPath <> ".htcc-backup-" + (backupPath, backupHandle) <- openTempFile tmpDir backupTemplate + hClose backupHandle + let restoreOutput = do + withWritableDestination resolvedOutputPath baseMode $ + copyFileContents backupPath resolvedOutputPath + setFileMode resolvedOutputPath baseMode + cleanupBackup = + ignoreIOException (hClose backupHandle) + *> ignoreIOException (removeFile backupPath) + replaceOutput = do + withWritableDestination resolvedOutputPath baseMode $ + withReadableSource stagedOutputPath currentMode $ + copyReplacementOutput stagedOutputPath resolvedOutputPath + setFileMode resolvedOutputPath $ + updatedOutputMode modeStrategy baseMode currentMode + handleFailure :: SomeException -> IO () + handleFailure exc = do + restoreOutput `catch` rethrowWithRestoreFailure exc + throwIO exc + rethrowWithRestoreFailure :: SomeException -> SomeException -> IO () + rethrowWithRestoreFailure replacementExc restoreExc = + ioError . userError $ + "failed to restore original output after replacement failure (" + <> displayException replacementExc + <> "): " + <> displayException restoreExc + finally + ( do + copyExistingOutputToBackup resolvedOutputPath baseMode backupPath + replaceOutput `catch` handleFailure + ) + cleanupBackup + +replaceExistingOutputFromPath :: ReplacementOutputMode -> FilePath -> FileMode -> FileMode -> FilePath -> IO () +replaceExistingOutputFromPath = + replaceExistingOutputFromPathWith copyFileContents + +withDirectReplacementOutputPath :: ReplacementOutputMode -> FilePath -> FileMode -> (FilePath -> IO a) -> IO a +withDirectReplacementOutputPath modeStrategy resolvedOutputPath baseMode action = do + tmpDir <- getTemporaryDirectory + let outputTemplate = takeFileName resolvedOutputPath <> ".htcc-" + (tmpOutputPath, tmpOutputHandle) <- openTempFile tmpDir outputTemplate + finally + ( do + setFileMode tmpOutputPath $ stagedOutputMode modeStrategy baseMode + hClose tmpOutputHandle + result <- action tmpOutputPath + currentMode <- fileMode <$> getFileStatus tmpOutputPath + replaceExistingOutputFromPath modeStrategy resolvedOutputPath baseMode currentMode tmpOutputPath + pure result + ) + ( ignoreIOException (hClose tmpOutputHandle) + *> ignoreIOException (removeFile tmpOutputPath) + ) + +withFreshOutputPath :: ReplacementOutputMode -> FilePath -> (FilePath -> IO a) -> IO a +withFreshOutputPath modeStrategy resolvedOutputPath action = do + let outputDir = takeDirectory resolvedOutputPath + outputBaseName = takeFileName resolvedOutputPath + outputDirTemplate = outputBaseName <> ".htcc-XXXXXX" + tmpOutputDir <- mkdtemp (outputDir outputDirTemplate) + flip onException (ignoreIOException $ removeDirectory tmpOutputDir) $ do + inheritedMode <- fileMode <$> getFileStatus tmpOutputDir + setFileMode tmpOutputDir $ privateTemporaryDirectoryModePreserving inheritedMode + let tmpOutputPath = tmpOutputDir outputBaseName + cleanup = + ignoreIOException (removeFile tmpOutputPath) + *> ignoreIOException (removeDirectory tmpOutputDir) + finally + ( do + tmpOutputFd <- createFile tmpOutputPath defaultOutputFileMode + closeFd tmpOutputFd + baseMode <- intersectFileModes defaultOutputFileMode . fileMode <$> getFileStatus tmpOutputPath + setFileMode tmpOutputPath $ stagedOutputMode modeStrategy baseMode + result <- action tmpOutputPath + currentMode <- fileMode <$> getFileStatus tmpOutputPath + setFileMode tmpOutputPath $ freshOutputMode modeStrategy baseMode currentMode + renameFile tmpOutputPath resolvedOutputPath + pure result + ) + cleanup + +withReplacementOutputPathAndResolvedPath + :: ReplacementOutputMode + -> FilePath + -> (FilePath -> IO a) + -> IO (FilePath, a) +withReplacementOutputPathAndResolvedPath modeStrategy outputPath action = do + resolvedOutputPath <- resolveReplacementOutputPath outputPath + shouldReplace <- shouldReplaceOutputPath resolvedOutputPath + if shouldReplace + then do + existingMode <- existingOutputMode resolvedOutputPath + case existingMode of + Nothing -> + do + result <- withFreshOutputPath modeStrategy resolvedOutputPath action + pure (resolvedOutputPath, result) + Just baseMode -> do + let outputDir = takeDirectory resolvedOutputPath + outputTemplate = takeFileName resolvedOutputPath <> ".htcc-" + fallbackToDirect ioErr + | isPermissionError ioErr = + do + result <- withDirectReplacementOutputPath modeStrategy resolvedOutputPath baseMode action + pure (resolvedOutputPath, result) + | otherwise = + ioError ioErr + openStagedOutput = + do + (tmpOutputPath, tmpOutputHandle) <- openTempFile outputDir outputTemplate + let cleanup = + ignoreIOException (hClose tmpOutputHandle) + *> ignoreIOException (removeFile tmpOutputPath) + flip onException cleanup $ do + setFileMode tmpOutputPath $ stagedOutputMode modeStrategy baseMode + hClose tmpOutputHandle + pure tmpOutputPath + publishStagedOutput tmpOutputPath stagedMode = + renameFile tmpOutputPath resolvedOutputPath `catchIOError` \ioErr -> + if isPermissionError ioErr + then replaceExistingOutputFromPath modeStrategy resolvedOutputPath baseMode stagedMode tmpOutputPath + else ioError ioErr + stagingResult <- catchIOError + (Right <$> openStagedOutput) + (fmap Left . fallbackToDirect) + case stagingResult of + Left fallbackResult -> + pure fallbackResult + Right tmpOutputPath -> + finally + ( do + result <- action tmpOutputPath + currentMode <- fileMode <$> getFileStatus tmpOutputPath + let finalMode = updatedOutputMode modeStrategy baseMode currentMode + setFileMode tmpOutputPath finalMode + publishStagedOutput tmpOutputPath finalMode + pure (resolvedOutputPath, result) + ) + (ignoreIOException (removeFile tmpOutputPath)) + else do + result <- action resolvedOutputPath + pure (resolvedOutputPath, result) + +withReplacementOutputPath :: ReplacementOutputMode -> FilePath -> (FilePath -> IO a) -> IO a +withReplacementOutputPath modeStrategy outputPath action = + snd <$> withReplacementOutputPathAndResolvedPath modeStrategy outputPath action diff --git a/src/Htcc/Parser.hs b/src/Htcc/Parser.hs index 970f480..5204b44 100644 --- a/src/Htcc/Parser.hs +++ b/src/Htcc/Parser.hs @@ -10,9 +10,7 @@ Portability : POSIX Parsing and constructing AST from string -} module Htcc.Parser ( - module Htcc.Parser.AST, - module Htcc.Parser.Parsing + module Htcc.Parser.AST ) where import Htcc.Parser.AST -import Htcc.Parser.Parsing diff --git a/src/Htcc/Parser/AST.hs b/src/Htcc/Parser/AST.hs index 6d6ee15..febf3d3 100644 --- a/src/Htcc/Parser/AST.hs +++ b/src/Htcc/Parser/AST.hs @@ -13,10 +13,8 @@ module Htcc.Parser.AST ( module Htcc.Parser.AST.Core, module Htcc.Parser.AST.Type, module Htcc.Parser.AST.DeduceKind, - module Htcc.Parser.AST.Var ) where import Htcc.Parser.AST.Core import Htcc.Parser.AST.DeduceKind import Htcc.Parser.AST.Type -import Htcc.Parser.AST.Var diff --git a/src/Htcc/Parser/AST/Core.hs b/src/Htcc/Parser/AST/Core.hs index d0a6175..5e48e7e 100644 --- a/src/Htcc/Parser/AST/Core.hs +++ b/src/Htcc/Parser/AST/Core.hs @@ -49,7 +49,7 @@ data ATKindFor a = ATForkw -- ^ The @for@ keyword | ATForCond (ATree a) -- ^ The conditional section of @for@ statement | ATForIncr (ATree a) -- ^ The incremental section of @for@ statement | ATForStmt (ATree a) -- ^ The statement section of @for@ statement - deriving Show + deriving (Eq, Show) {-# INLINE isATForInit #-} -- | An utility of `ATForInit`. When an argument is `ATForInit`, return `True` otherwise `False` @@ -130,6 +130,8 @@ data ATKind a = ATAdd -- ^ \(x+y\): @x + y@ | ATConditional (ATree a) (ATree a) (ATree a) -- ^ conditional operator: @a ? x : y;@. It has three AST (cond, then and else) | ATComma -- ^ comma operator: @x,b@ | ATCast -- ^ the cast operation: @(type) x@ + | ATSizeof -- ^ the @sizeof@ operator for expression operands. + | ATAlignof -- ^ the @_Alignof@ operator for expression operands. | ATMemberAcc (CT.StructMember a) -- ^ accessing the member of the @struct@ | ATReturn -- ^ the @return@ keyword | ATIf -- ^ the @if@ keyword @@ -147,11 +149,13 @@ data ATKind a = ATAdd -- ^ \(x+y\): @x + y@ | ATLVar (CT.StorageClass a) a -- ^ the local variable. It has a type information (as `CT.StorageClass`) and an offset value | ATGVar (CT.StorageClass a) T.Text -- ^ the global variable. It has a type information (as `CT.StorageClass`) and an name | ATDefFunc T.Text (Maybe [ATree a]) -- ^ the function definition - | ATCallFunc T.Text (Maybe [ATree a]) -- ^ the function call. It has a offset value and arguments (`Maybe`) + | ATCallFunc T.Text (Maybe [ATree a]) -- ^ the direct function call. It has the target function name and arguments (`Maybe`) + | ATCallPtr (Maybe [ATree a]) -- ^ the indirect function call. The callee expression is stored in the lhs of `ATNode`. + | ATFuncPtr T.Text -- ^ the function designator / pointer. It has the target function name. | ATExprStmt -- ^ the expression of a statement | ATStmtExpr [ATree a] -- ^ the statement of a expression (GNU extension) | ATNull (ATree a) -- ^ indicates nothing to do - deriving Show + deriving (Eq, Show) {-# INLINE fromATVar #-} -- | Take its type when it is ATIVar or ATIVar. @@ -192,7 +196,7 @@ data ATree a = ATEmpty -- ^ The empty node atL :: ATree a, -- ^ The left hand side abstract tree atR :: ATree a -- ^ The right hand side abstract tree } -- ^ `ATKind` representing the kind of node and the two branches `ATree` it has - deriving Show + deriving (Eq, Show) -- | A class whose type can be converted to ATree class Treealizable a where diff --git a/src/Htcc/Parser/AST/DeduceKind.hs b/src/Htcc/Parser/AST/DeduceKind.hs index 08e31ea..b2fc908 100644 --- a/src/Htcc/Parser/AST/DeduceKind.hs +++ b/src/Htcc/Parser/AST/DeduceKind.hs @@ -24,8 +24,8 @@ import Htcc.Parser.AST.Core (ATKind (..), ATree (..)) addKind :: (Eq i, Ord i, Show i) => ATree i -> ATree i -> Maybe (ATree i) addKind lhs rhs | all (CT.isFundamental . atype) [lhs, rhs] = Just $ ATNode ATAdd (CT.conversion (atype lhs) (atype rhs)) lhs rhs - | isJust (CT.deref $ atype lhs) && CT.isFundamental (atype rhs) = Just $ ATNode ATAddPtr (atype lhs) lhs rhs - | CT.isFundamental (atype lhs) && isJust (CT.deref $ atype rhs) = Just $ ATNode ATAddPtr (atype rhs) rhs lhs + | isJust (CT.deref $ atype lhs) && CT.isFundamental (atype rhs) = Just $ ATNode ATAddPtr (pointerOperandType lhs) lhs rhs + | CT.isFundamental (atype lhs) && isJust (CT.deref $ atype rhs) = Just $ ATNode ATAddPtr (pointerOperandType rhs) rhs lhs | otherwise = Nothing {-# INLINE subKind #-} @@ -33,6 +33,17 @@ addKind lhs rhs subKind :: (Eq i, Ord i) => ATree i -> ATree i -> Maybe (ATree i) subKind lhs rhs | all (CT.isFundamental . atype) [lhs, rhs] = Just $ ATNode ATSub (CT.conversion (atype lhs) (atype rhs)) lhs rhs - | isJust (CT.deref $ atype lhs) && CT.isFundamental (atype rhs) = Just $ ATNode ATSubPtr (atype lhs) lhs rhs - | all (isJust . CT.deref . atype) [lhs, rhs] = Just $ ATNode ATPtrDis (atype lhs) lhs rhs + | isJust (CT.deref $ atype lhs) && CT.isFundamental (atype rhs) = Just $ ATNode ATSubPtr (pointerOperandType lhs) lhs rhs + | all (isJust . CT.deref . atype) [lhs, rhs] = Just $ ATNode ATPtrDis (CT.SCAuto $ CT.CTLong CT.CTInt) lhs rhs | otherwise = Nothing + +pointerOperandType :: Ord i => ATree i -> CT.StorageClass i +pointerOperandType expr = case CT.toTypeKind ty of + CT.CTArray _ _ -> + maybe ty (CT.mapTypeKind CT.CTPtr) $ CT.deref ty + CT.CTIncomplete (CT.IncompleteArray elemTy) -> + CT.mapTypeKind (const $ CT.CTPtr elemTy) ty + _ -> + ty + where + ty = atype expr diff --git a/src/Htcc/Parser/AST/Type.hs b/src/Htcc/Parser/AST/Type.hs index 5abaf0c..e5149d7 100644 --- a/src/Htcc/Parser/AST/Type.hs +++ b/src/Htcc/Parser/AST/Type.hs @@ -17,13 +17,14 @@ module Htcc.Parser.AST.Type ( ASTState ) where -import Htcc.Parser.AST.Core (ATree (..)) -import Htcc.Parser.ConstructionData.Core (ConstructionData, - Warnings) -import Htcc.Parser.ConstructionData.Scope.ManagedScope (ASTError) -import qualified Htcc.Parser.ConstructionData.Scope.Var as PV -import qualified Htcc.Tokenizer as HT -import Htcc.Utils.CompilationState (CompilationState) +import Htcc.Parser.AST.Core (ATree (..)) +import {-# SOURCE #-} Htcc.Parser.ConstructionData.Core (ConstructionData, + Warnings) +import qualified Htcc.Parser.ConstructionData.Scope.Function as PF +import Htcc.Parser.ConstructionData.Scope.ManagedScope (ASTError) +import qualified Htcc.Parser.ConstructionData.Scope.Var as PV +import qualified Htcc.Tokenizer as HT +import Htcc.Utils.CompilationState (CompilationState) -- | The type to be used when the AST construction is successful type ASTSuccess i = ([HT.TokenLC i], ATree i, ConstructionData i) @@ -34,8 +35,9 @@ type ASTConstruction i = Either (ASTError i) (ASTSuccess i) -- | The type of AST list type ASTs i = [ATree i] --- | A type that represents the result after AST construction. Quadraple of warning list, constructed abstract syntax tree list, global variable map, literal list. -type ASTResult i = Either (ASTError i) (Warnings i, ASTs i, PV.GlobalVars i, PV.Literals i) +-- | A type that represents the result after AST construction. Quintuple of warning list, +-- constructed abstract syntax tree list, global variable map, literal list, and function map. +type ASTResult i = Either (ASTError i) (Warnings, ASTs i, PV.GlobalVars i, PV.Literals i, PF.Functions i) -- | The type synonym of ASTState type ASTState i r = CompilationState (ConstructionData i) [HT.TokenLC i] i r diff --git a/src/Htcc/Parser/AST/Var.hs b/src/Htcc/Parser/AST/Var.hs deleted file mode 100644 index ba98174..0000000 --- a/src/Htcc/Parser/AST/Var.hs +++ /dev/null @@ -1,16 +0,0 @@ -{-| -Module : Htcc.Parser.AST.Var -Description : Data types and type synonyms used during AST construction -Copyright : (c) roki, 2019 -License : MIT -Maintainer : falgon53@yahoo.co.jp -Stability : experimental -Portability : POSIX - -Data types and type synonyms used during AST construction --} -module Htcc.Parser.AST.Var ( - module Htcc.Parser.AST.Var.Init -) where - -import Htcc.Parser.AST.Var.Init diff --git a/src/Htcc/Parser/AST/Var/Init.hs b/src/Htcc/Parser/AST/Var/Init.hs deleted file mode 100644 index b4855fe..0000000 --- a/src/Htcc/Parser/AST/Var/Init.hs +++ /dev/null @@ -1,182 +0,0 @@ -{-# LANGUAGE BangPatterns, OverloadedStrings, TupleSections #-} -{-| -Module : Htcc.Parser.AST.Var.Init -Description : The C languge parser and AST constructor -Copyright : (c) roki, 2019 -License : MIT -Maintainer : falgon53@yahoo.co.jp -Stability : experimental -Portability : POSIX - -The C languge parser and AST constructor --} -module Htcc.Parser.AST.Var.Init ( - Assign, - validAssign, - varInit -) where - -import Control.Conditional (ifM) -import Control.Monad (forM) -import Control.Monad.Fix (fix) -import Control.Monad.State (gets, put) -import Control.Monad.Trans (lift) -import Control.Monad.Trans.State (evalStateT) -import Data.Bits (Bits) -import qualified Data.ByteString as B -import Data.Foldable (Foldable (..)) -import Data.List (isPrefixOf, - sortBy) -import qualified Data.Map.Strict as M -import Data.Maybe (fromJust, - fromMaybe, - isNothing) -import qualified Data.Sequence as SQ -import Data.Tuple.Extra (dupe, first, - second, snd3) -import Prelude hiding - (toInteger) -import Safe (headMay) - -import qualified Htcc.CRules.Types as CT -import Htcc.Parser.AST.Core (ATKind (..), - ATree (..), - Treealizable (..), - atAssign, - atExprStmt, - atMemberAcc, - atNumLit, - atUnary) -import Htcc.Parser.AST.DeduceKind -import Htcc.Parser.AST.Type (ASTConstruction, - ASTState) -import Htcc.Parser.ConstructionData -import Htcc.Parser.ConstructionData.Scope.ManagedScope (ASTError) -import Htcc.Parser.ConstructionData.Scope.Utils (internalCE) -import Htcc.Parser.Utils -import qualified Htcc.Tokenizer as HT -import Htcc.Utils (dropSnd3, - fou4, fst4, - maybeToRight, - second3, snd4, - swap, thd4, - tshow) -import Htcc.Utils.CompilationState (isSatisfied, - itemCWhen) - -{-# INLINE validAssign #-} --- | Check for valid substitutions -validAssign :: Eq i => HT.TokenLC i -> ATree i -> Either (ASTError i) (ATree i) -validAssign errPlaceholder x@(ATNode _ t _ _) - | CT.toTypeKind t == CT.CTVoid = Left ("void value not ignored as it ought to be", errPlaceholder) - | otherwise = Right x -validAssign errPlaceholder _ = Left ("Expected to assign", errPlaceholder) - -desgNode :: (Num i, Ord i, Show i) => HT.TokenLC i -> ATree i -> [CT.Desg i]-> ConstructionData i -> Either (ASTError i) (ATree i) -desgNode ident rhs desg sc = fmap (atExprStmt . flip atAssign rhs) $ flip (`foldr` ntRightLVarTree) desg $ \idx acc -> case idx of - CT.DesgIdx idx' -> do - at <- acc - nd <- ntRightInvalidInitList $ addKind at $ atNumLit idx' - flip (atUnary ATDeref) nd <$> ntRightInvalidInitList (CT.deref (atype nd)) - CT.DesgMem mem -> atMemberAcc mem <$> acc - where - ntRightInvalidInitList = maybeToRight ("invalid initializer-list", HT.emptyToken) - ntRightLVarTree = treealize <$> maybeToRight (internalCE, HT.emptyToken) (lookupLVar (tshow $ snd ident) sc) - -initZero :: (Num i, Ord i, Show i, Enum i) => CT.TypeKind i -> HT.TokenLC i -> [CT.Desg i] -> ConstructionData i -> Either (ASTError i) [ATree i] -initZero (CT.CTArray n t) ident desg sc = fmap concat $ forM [0..fromIntegral (pred n)] $ flip (initZero t ident) sc . (:desg) . CT.DesgIdx -initZero _ ident desg sc = (:[]) <$> desgNode ident (atNumLit 0) desg sc - --- | needs parameters for Assign -type Assign i = [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i - --- Designator initialization processing loop --- Returns the consumed token list, the constructed tree, ConstructionData and the number of elements specified in designator -initLoop :: (Bits i, Integral i, Read i, Show i) => - Assign i -> CT.StorageClass i -> HT.TokenLC i -> [CT.Desg i] -> SQ.Seq (ATree i) -> HT.TokenLC i -> ASTState i ([HT.TokenLC i], SQ.Seq (ATree i), ConstructionData i, i) -initLoop callback t' ident desg ai c = do - rs <- initLoop' ai - itemCWhen const ((==HT.TKReserved "}") . snd) >>= maybe (lift $ Left ("expected '}' token for '{'", c)) (const $ retCur (snd4 rs) (fou4 rs)) - where - isEnd = uncurry (||) . first (isPrefixOf [HT.TKReserved "}"]) . second (isPrefixOf [HT.TKReserved ",", HT.TKReserved "}"]) . dupe . map snd - - retCur :: SQ.Seq (ATree i) -> i -> ASTState i ([HT.TokenLC i], SQ.Seq (ATree i), ConstructionData i, i) - retCur ai' n = uncurry (,ai',,n) <$> gets swap - - initLoop' ai' = case CT.toTypeKind t' of - CT.CTArray _ _ -> ($ (0, ai')) . fix $ \f (!idx, rl) -> do - rs <- uncurry (desgInit callback ident rl (CT.DesgIdx idx:desg) $ fromJust $ CT.deref t') <$> gets swap - flip (either (lift . Left)) rs $ \rs' -> do - put (swap $ dropSnd3 rs') - ifM ((||) <$> isSatisfied isEnd <*> (isNothing <$> itemCWhen const ((==HT.TKReserved ",") . snd))) (retCur (snd3 rs') $ succ idx) $ f (succ idx, snd3 rs') - CT.CTStruct mems -> ($ (M.elems mems, ai', 0)) . fix $ \f (mems', rl, len) -> if null mems' then retCur ai' len else do - rs <- uncurry (desgInit callback ident rl (CT.DesgMem (head mems'):desg) $ CT.SCAuto $ CT.smType (head mems')) <$> gets swap - flip (either (lift . Left)) rs $ \rs' -> do - put (swap $ dropSnd3 rs') - ifM ((||) <$> isSatisfied isEnd <*> (isNothing <$> itemCWhen const ((==HT.TKReserved ",") . snd))) (retCur (snd3 rs') $ succ len) $ f (tail mems', snd3 rs', succ len) - _ -> lift $ Left (internalCE, HT.emptyToken) - --- For initializer-list. --- For example, the declaration @int x[2][2] = { { 1, 2 }, { 3, 4 } };@ is converted to @x[2][2]; x[0][0] = 1; x[0][1] = 2; x[1][0] = 3; x[1][1] = 4;@. -desgInit :: (Bits i, Integral i, Read i, Show i) => - Assign i -> HT.TokenLC i -> SQ.Seq (ATree i) -> [CT.Desg i] -> CT.StorageClass i -> [HT.TokenLC i] -> ConstructionData i -> - Either (ASTError i) ([HT.TokenLC i], SQ.Seq (ATree i), ConstructionData i) -desgInit callback ident ai desg t' xs' scp - -- initializer-string - | CT.isArray t' && maybe False ((==CT.CTChar) . CT.toTypeKind) (CT.deref t') && maybe False (HT.isTKString . snd) (headMay xs') = if CT.isIncompleteArray t' then - case snd (head xs') of - (HT.TKString s) -> let newt = arTypeFromLen (B.length s) in addLVar newt ident scp >>= desgInit callback ident ai desg newt xs' . snd - _ -> Left (internalCE, HT.emptyToken) -- should not reach here - else case (snd (head xs'), CT.toTypeKind t') of - (HT.TKString s, CT.CTArray n _) -> let s' = s `B.append` B.pack (replicate (fromIntegral n - pred (B.length s)) $ toEnum 0) in - fmap ((tail xs',, if fromIntegral n < pred (B.length s) then pushWarn "initializer-string for char array is too long" (head xs') scp else scp) . - (ai SQ.><) . SQ.fromList) $ mapM (flip id scp . uncurry (desgNode ident)) $ zipWith (flip (.) (++desg) . (,) . atNumLit . fromIntegral) (B.unpack s') $ - sortBy (flip (.) reverse . compare . reverse) $ CT.accessibleIndices $ CT.toTypeKind t' - _ -> Left (internalCE, HT.emptyToken) -- should not reach here - -- Non-string initializer-list - | CT.isArray t' = case xs' of -- incomplete dattara takeExps de kazeru - -- Zero initialization - (_, HT.TKReserved "{"):(_, HT.TKReserved "}"):ds -> fmap ((ds,, scp) . (ai SQ.><) . SQ.fromList) $ - mapM (flip (desgNode ident $ atNumLit 0) scp . (++desg)) $ CT.accessibleIndices $ CT.toTypeKind t' - -- The specified initializer-list of initialization elements - c@(_, HT.TKReserved "{"):ds - | CT.isIncompleteArray t' -> toComplete (c:ds) >>= \newt -> addLVar newt ident scp >>= desgInit callback ident ai desg newt xs' . snd - | otherwise -> case CT.toTypeKind t' of - CT.CTArray n bt -> do - rs <- evalStateT (initLoop callback t' ident desg ai c) (scp, ds) - zeroResult rs $ forM [fromIntegral (fou4 rs)..pred $ fromIntegral n] $ \idx -> initZero bt ident (CT.DesgIdx idx:desg) (thd4 rs) - _ -> Left (internalCE, HT.emptyToken) - _ -> Left ("expected { initializer-list } or { initializer-list , }", if not (null xs') then head xs' else HT.emptyToken) - -- struct initializer - | CT.isCTStruct t' = case (xs', CT.toTypeKind t') of - ((_, HT.TKReserved "{"):(_, HT.TKReserved "}"):ds, CT.CTStruct mems) -> fmap ((ds,,scp) . (ai SQ.><) . SQ.fromList . concat) $ forM (M.elems mems) $ \mem -> - initZero (CT.smType mem) ident (CT.DesgMem mem:desg) scp - (c@(_, HT.TKReserved "{"):ds, CT.CTStruct mems) -> do - rs <- evalStateT (initLoop callback t' ident desg ai c) (scp, ds) - zeroResult rs $ forM (drop (fromIntegral $ fou4 rs) (M.elems mems)) $ \mem -> initZero (CT.smType mem) ident (CT.DesgMem mem:desg) (thd4 rs) - _ -> Left ("expected { initializer-list } or { initializer-list , }", if not (null xs') then head xs' else HT.emptyToken) - -- For a element - | otherwise = callback xs' ATEmpty scp >>= \(ds, at, scp''') -> (ds,,scp''') . (ai SQ.|>) <$> desgNode ident at desg scp''' - where - {-# INLINE zeroResult #-} - zeroResult rs = fmap ((fst4 rs,,thd4 rs) . (ai SQ.><) . (snd4 rs SQ.><) . SQ.fromList . concat) - - {-# INLINE toComplete #-} - toComplete ds' = (>>=) - (maybeToRight ("expected { initializer-list } or { initializer-list , }", if not (null xs') then head xs' else HT.emptyToken) (takeBrace "{" "}" ds')) $ - either (Left . ("expected { initializer-list } or { initializer-list , }",)) $ \(br, _) -> arTypeFromLen . length <$> - maybeToRight (internalCE, HT.emptyToken) (takeExps $ [(HT.TokenLCNums 0 0, HT.TKReserved "(")] ++ init (tail br) ++ [(HT.TokenLCNums 0 0, HT.TKReserved ")")]) - - {-# INLINE arTypeFromLen #-} - arTypeFromLen len = snd (CT.dctorArray t') $ CT.mapTypeKind (CT.CTArray (fromIntegral len) . fromJust . CT.fromIncompleteArray) t' - -varInit' :: (Read i, Show i, Integral i, Bits i) => Assign i -> CT.StorageClass i -> HT.TokenLC i -> [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -varInit' callback t ident xs lat scp' - | CT.isArray t || CT.isCTStruct t = second3 (\st -> ATNode (ATBlock $ toList st) (CT.SCUndef CT.CTUndef) ATEmpty ATEmpty) <$> desgInit callback ident SQ.empty [] t xs scp' - | otherwise = do - (ert, erat, ervar) <- callback xs ATEmpty scp' - flip fmap (validAssign (HT.altEmptyToken ert) erat) $ \erat' -> (ert, atExprStmt (ATNode ATAssign (atype lat) lat erat'), ervar) - --- | Initializing local variables -varInit :: (Read i, Show i, Integral i, Bits i) => Assign i -> CT.StorageClass i -> HT.TokenLC i -> [HT.TokenLC i] -> ConstructionData i -> ASTConstruction i -varInit callback t ident token scp = addLVar (fromMaybe t $ incomplete t scp) ident scp >>= uncurry (varInit' callback t ident token) diff --git a/src/Htcc/Parser/Combinators.hs b/src/Htcc/Parser/Combinators.hs new file mode 100644 index 0000000..69c907f --- /dev/null +++ b/src/Htcc/Parser/Combinators.hs @@ -0,0 +1,18 @@ +{-| +Module : Htcc.Parser.Combinators +Description : C language lexer +Copyright : (c) roki, 2020~ +License : MIT +Maintainer : falgon53@yahoo.co.jp +Stability : experimental +Portability : POSIX + +C language lexer +-} +module Htcc.Parser.Combinators ( + module Htcc.Parser.Combinators.Core + , module Htcc.Parser.Combinators.Program +) where + +import Htcc.Parser.Combinators.Core +import Htcc.Parser.Combinators.Program diff --git a/src/Htcc/Parser/Combinators/BasicOperator.hs b/src/Htcc/Parser/Combinators/BasicOperator.hs new file mode 100644 index 0000000..45dee35 --- /dev/null +++ b/src/Htcc/Parser/Combinators/BasicOperator.hs @@ -0,0 +1,126 @@ +{-| +Module : Htcc.Parser.Combinators.BasicOperator +Description : C language parser Combinators +Copyright : (c) roki, 2020~ +License : MIT +Maintainer : falgon53@yahoo.co.jp +Stability : experimental +Portability : POSIX + +C language parser Combinators +-} +module Htcc.Parser.Combinators.BasicOperator ( + binaryOperator + , binOpBool + , binOpCon + , binOpIntOnly + , notFollowedOp +) where + +import Control.Applicative (Alternative (..)) +import Control.Monad.Combinators (choice) +import Control.Monad.Fix (fix) +import Data.Bits (Bits (..)) +import Htcc.CRules.Types as CT +import Htcc.Parser.AST.Core (ATKind (..), ATree (..)) +import Htcc.Parser.Combinators.Core +import Htcc.Utils (lor) +import qualified Text.Megaparsec as M + +-- | A parser combinator that builds a parser for basic binary operators. +-- This is useful for syntax such as: +-- \\[X::=X'\left("\text{op}_1"\ X'\ \mid\ "\text{op}_2"\ X'\ \mid\cdots\right)\ast\\] +binaryOperator :: + Parser i (ATree i) + -> [(Parser i a, ATree i -> ATree i -> Parser i (ATree i))] + -> Parser i (ATree i) +binaryOperator p opndMs = do + m <- p + flip fix m $ \f nd -> + M.option nd $ choice [M.try (opM >> p) >>= ndM nd >>= f | (opM, ndM) <- opndMs] + +isFunctionType :: CT.StorageClass i -> Bool +isFunctionType ty = case CT.toTypeKind ty of + CT.CTFunc _ _ -> True + _ -> False + +binOpBool :: (Monad m, Ord i, Bits i, Show i) + => ATKind i + -> ATree i + -> ATree i + -> m (ATree i) +binOpBool k lhs rhs = return $ ATNode k (CT.SCAuto CT.CTBool) lhs rhs + +binOpCon :: (MonadFail m, Ord i, Bits i, Show i) + => ATKind i + -> ATree i + -> ATree i + -> m (ATree i) +binOpCon k lhs rhs + | isFunctionOperand lhs || isFunctionOperand rhs = fail $ mconcat + [ "invalid operands of types '" + , show (atype lhs) + , "' and '" + , show (atype rhs) + , "' to binary '" + , show k + , "'" + ] + | otherwise = pure $ ATNode k (CT.conversion (atype lhs) (atype rhs)) lhs rhs + where + isFunctionOperand expr = + isFunctionType (atype expr) || carriesUncastFunctionDesignatorValue expr + + carriesUncastFunctionDesignatorValue expr = case expr of + ATNode (ATFuncPtr _) _ _ _ -> + True + ATNode ATAddr _ inner _ -> + carriesUncastFunctionDesignatorValue inner + ATNode ATCast ty inner _ + | CT.isIntegral ty -> + False + | otherwise -> + carriesUncastFunctionDesignatorValue inner + ATNode (ATNull inner) _ _ _ -> + carriesUncastFunctionDesignatorValue inner + ATNode ATExprStmt _ inner _ -> + carriesUncastFunctionDesignatorValue inner + ATNode ATComma _ _ rhs -> + carriesUncastFunctionDesignatorValue rhs + ATNode (ATConditional cond ATEmpty el) _ _ _ -> + any carriesUncastFunctionDesignatorValue [cond, el] + ATNode (ATConditional _ th el) _ _ _ -> + any carriesUncastFunctionDesignatorValue [th, el] + ATNode (ATStmtExpr stmts) _ _ _ -> + maybe False carriesUncastFunctionDesignatorValue (lastMaybe stmts) + _ -> + False + + lastMaybe [] = Nothing + lastMaybe xs = Just $ last xs + +binOpIntOnly :: (Monad m, MonadFail m, Alternative m, Ord i, Bits i, Show i) + => ATKind i + -> ATree i + -> ATree i + -> m (ATree i) +binOpIntOnly k lhs rhs + | lor [CT.isIntegral, (CT.CTBool==) . CT.toTypeKind] (atype lhs) && + lor [CT.isIntegral, (CT.CTBool ==) . CT.toTypeKind] (atype rhs) = + return $ ATNode k (resultTy k (atype lhs) (atype rhs)) lhs rhs + | otherwise = fail $ mconcat + [ "invalid operands of types '" + , show (atype lhs) + , "' and '" + , show (atype rhs) + , "' to binary '" + , show k + , "'" + ] + where + resultTy ATShl lhsTy _ = CT.SCAuto $ CT.integerPromotedTypeKind $ CT.toTypeKind lhsTy + resultTy ATShr lhsTy _ = CT.SCAuto $ CT.integerPromotedTypeKind $ CT.toTypeKind lhsTy + resultTy _ lhsTy rhsTy = CT.conversion lhsTy rhsTy + +notFollowedOp :: Parser i a -> Parser i b -> Parser i a +notFollowedOp op nop = M.try $ lexeme $ op `notFollowedBy` nop diff --git a/src/Htcc/Parser/Combinators/ConstExpr.hs b/src/Htcc/Parser/Combinators/ConstExpr.hs new file mode 100644 index 0000000..b0a3c78 --- /dev/null +++ b/src/Htcc/Parser/Combinators/ConstExpr.hs @@ -0,0 +1,101 @@ +{-| +Module : Htcc.Parser.Combinators.ConstExpr +Description : C language parser Combinators +Copyright : (c) roki, 2020~ +License : MIT +Maintainer : falgon53@yahoo.co.jp +Stability : experimental +Portability : POSIX + +C language parser Combinators +-} +{-# LANGUAGE OverloadedStrings #-} +module Htcc.Parser.Combinators.ConstExpr ( + evalConstexpr +) where +import Data.Bits (Bits (..)) +import Data.Bool (bool) +import qualified Htcc.CRules.Types as CT +import Htcc.Parser.AST.Core (ATKind (..), + ATree (..)) +import Htcc.Parser.Combinators.Core +import {-# SOURCE #-} Htcc.Parser.Combinators.Program (conditional) +import Htcc.Parser.Combinators.Utils (applyConstexprCast, + isConstexprArithmeticCastType) +import Htcc.Parser.ConstructionData.Core (hasIncompleteObjectType) + +evalConstexpr :: (Bits i, Integral i, Show i, Read i) => Parser i i +evalConstexpr = conditional >>= constantExp' + where + fromBool = fromIntegral . fromEnum :: Num i => Bool -> i + toBool x | x == 0 = False | otherwise = True + + constantExp' (ATNode k ty lhs rhs) = case k of + ATAdd -> binop (+) + ATSub -> binop (-) + ATMul -> binop (*) + ATDiv -> nonZeroBinop quot + ATMod -> nonZeroBinop rem + ATAnd -> binop (.&.) + ATXor -> binop xor + ATOr -> binop (.|.) + ATShl -> shiftBinop shiftL + ATShr -> shiftBinop shiftR + ATEQ -> binop ((.) fromBool . (==)) + ATNEQ -> binop ((.) fromBool . (/=)) + ATLT -> binop ((.) fromBool . (<)) + ATGT -> binop ((.) fromBool . (>)) + ATLEQ -> binop ((.) fromBool . (<=)) + ATGEQ -> binop ((.) fromBool . (>=)) + ATConditional cn th el -> constantExp' cn + >>= bool (constantExp' el) (constantExp' trueExpr) . toBool + where + trueExpr = case th of + ATEmpty -> cn + _ -> th + ATComma -> fail "The expression is not constant-expression" + ATNot -> fromIntegral . fromEnum . not . toBool <$> constantExp' lhs + ATBitNot -> complement <$> constantExp' lhs + ATLAnd -> constantExp' lhs >>= logicalAnd + ATLOr -> constantExp' lhs >>= logicalOr + ATSizeof -> memOp "sizeof" CT.sizeof lhs + ATAlignof -> memOp "_Alignof" CT.alignof lhs + ATCast + | isConstexprArithmeticCastType ty -> applyConstexprCast ty <$> constantExp' lhs + | otherwise -> fail "The expression is not constant-expression" + ATNum v -> pure v + _ -> fail "The expression is not constant-expression" + where + binop f = constantExp' lhs + >>= \lhs' -> fromIntegral . f lhs' <$> constantExp' rhs + shiftBinop f = + constantExp' lhs >>= \lhs' -> + constantExp' rhs >>= \rhs' -> + case shiftCount rhs' of + Nothing -> fail "The expression is not constant-expression" + Just count' -> pure $ f lhs' count' + shiftCount n + | n < 0 = Nothing + | toInteger n >= shiftWidth = Nothing + | toInteger n > toInteger (maxBound :: Int) = Nothing + | otherwise = Just $ fromIntegral n + where + shiftWidth = toInteger (CT.sizeof ty) * 8 + logicalAnd lhs' + | not (toBool lhs') = pure $ fromBool False + | otherwise = fromBool . toBool <$> constantExp' rhs + logicalOr lhs' + | toBool lhs' = pure $ fromBool True + | otherwise = fromBool . toBool <$> constantExp' rhs + nonZeroBinop f = + constantExp' lhs >>= \lhs' -> + constantExp' rhs >>= \rhs' -> + if rhs' == 0 + then fail "The expression is not constant-expression" + else pure $ fromIntegral $ f lhs' rhs' + memOp opName op expr + | hasIncompleteObjectType (atype expr) = + fail $ "invalid application of '" <> opName <> "' to incomplete type" + | otherwise = + pure $ fromIntegral $ op $ atype expr + constantExp' ATEmpty = fail "The expression is not constant-expression" diff --git a/src/Htcc/Parser/Combinators/Core.hs b/src/Htcc/Parser/Combinators/Core.hs new file mode 100644 index 0000000..29b978f --- /dev/null +++ b/src/Htcc/Parser/Combinators/Core.hs @@ -0,0 +1,384 @@ +{-| +Module : Htcc.Parser.Combinators.Core +Description : C language lexer +Copyright : (c) roki, 2020~ +License : MIT +Maintainer : falgon53@yahoo.co.jp +Stability : experimental +Portability : POSIX + +C language lexer +-} +{-# LANGUAGE FlexibleContexts, LambdaCase, OverloadedStrings #-} +module Htcc.Parser.Combinators.Core ( + runParser + , ConstructionDataState + , Parser + , spaceConsumer + , lexeme + , symbol + , charLiteral + , stringLiteral + , decimal + , hexadecimal + , octal + , natural + , integer + , angles + , parens + , braces + , brackets + , identifier + , semi + , comma + , colon + , lnot + , sharp + , ampersand + , lparen + , rparen + , lbrace + , rbrace + , langle + , rangle + , lbracket + , rbracket + , star + , period + , slash + , equal + , question + , hat + , tilda + , vertical + , percent + , notFollowedBy +) where + +import Htcc.Parser.Combinators.ParserType + +import Control.Applicative (Alternative (..), optional) +import Control.Monad (void) +import Control.Monad.Combinators (between) +import qualified Data.ByteString as B +import Data.Char (chr, digitToInt, isAlpha, + isHexDigit, isOctDigit, + ord) +import Data.Functor (($>)) +import qualified Data.Text as T +import qualified Data.Text.Encoding as TE +import qualified Htcc.CRules as CR +import Htcc.Utils (lor) +import qualified Text.Megaparsec as M +import qualified Text.Megaparsec.Char as MC +import qualified Text.Megaparsec.Char.Lexer as ML +import qualified Text.Megaparsec.Pos as MP + +spaceConsumer :: (Monad m, Ord e) => M.ParsecT e T.Text m () +spaceConsumer = ML.space spaceOrInclude lineComment (ML.skipBlockComment "/*" "*/") + where + spaceOrInclude = skipIncludeLine <|> whiteSpaceChar + lineComment = ML.skipLineComment "//" + skipIncludeLine = do + pos <- M.getSourcePos + if MP.unPos (MP.sourceColumn pos) == 1 + then M.try includeLine + else M.empty + includeLine = do + horizontalSpace + void $ MC.char '#' + horizontalSpace + void $ MC.string "include" + M.notFollowedBy $ M.satisfy CR.isValidChar + void $ M.takeWhileP Nothing (/= '\n') + horizontalSpace = void $ M.takeWhileP Nothing (`elem` (" \t\r\f\v" :: String)) + whiteSpaceChar = void $ M.satisfy (`elem` (" \t\r\f\v\n" :: String)) + +lexeme :: (Monad m, Ord e) => M.ParsecT e T.Text m a -> M.ParsecT e T.Text m a +lexeme = ML.lexeme spaceConsumer + +symbol :: (Monad m, Ord e) => T.Text -> M.ParsecT e T.Text m T.Text +symbol = ML.symbol spaceConsumer + +maxHexEscapeDigits :: Int +maxHexEscapeDigits = 4 + +maxBinaryLiteralDigits :: Int +maxBinaryLiteralDigits = 128 + +charLiteral :: (Monad m, Ord e, Num i) => M.ParsecT e T.Text m i +charLiteral = do + void $ MC.char '\'' + first <- byteCharBody + rest <- charLiteralTail 3 + charConstantValue (first : rest) <$ spaceConsumer + where + charBody = escapedByte <|> M.noneOf ['\\', '\'', '\n', '\r'] + escapedByte = + MC.char '\\' + *> (hexEscape + <|> octalEscape + <|> simpleEscape + <|> invalidEscape + ) + simpleEscape = M.choice + [ '\a' <$ MC.char 'a' + , '\b' <$ MC.char 'b' + , '\t' <$ MC.char 't' + , '\n' <$ MC.char 'n' + , '\v' <$ MC.char 'v' + , '\f' <$ MC.char 'f' + , '\r' <$ MC.char 'r' + , '\ESC' <$ MC.char 'e' + , '\\' <$ MC.char '\\' + , '\'' <$ MC.char '\'' + , '"' <$ MC.char '"' + , '?' <$ MC.char '?' + ] + hexEscape = do + void $ MC.char 'x' + digits <- hexDigits + byteFromDigits 16 digits + hexDigits = do + first <- M.satisfy isHexDigit + collectHexDigits 1 [first] + collectHexDigits count revDigits = + optional (M.lookAhead $ M.satisfy isHexDigit) >>= \case + Nothing -> pure $ reverse revDigits + Just _ + | count >= maxHexEscapeDigits -> + fail "character constant escape exceeds byte width" + | otherwise -> do + c <- M.satisfy isHexDigit + let count' = succ count + revDigits' = c : revDigits + count' `seq` revDigits' `seq` collectHexDigits count' revDigits' + octalEscape = do + first <- M.satisfy isOctDigit + second <- optional $ M.satisfy isOctDigit + third <- optional $ M.satisfy isOctDigit + byteFromDigits 8 $ first : maybe [] (\d -> d : maybe [] pure third) second + byteFromDigits base digits = + let n = foldl (\acc c -> acc * base + digitToInt c) 0 digits + in if n <= 0xff + then pure $ chr n + else fail "character constant escape exceeds byte width" + invalidEscape = + M.anySingle >>= \c -> fail ("invalid escape sequence \\" <> [c]) + byteCharBody = do + c <- charBody + if ord c <= 0xff + then pure c + else fail "character constant escape exceeds byte width" + charLiteralTail remaining = + M.choice + [ [] <$ MC.char '\'' + , if remaining == 0 + then charBody *> fail "multi-character constant is too long" + else (:) <$> byteCharBody <*> charLiteralTail (pred remaining) + ] + charConstantValue = foldl (\acc c -> acc * 256 + fromIntegral (ord c)) 0 + +stringLiteral :: (Monad m, Ord e) => M.ParsecT e T.Text m B.ByteString +stringLiteral = do + void $ MC.char '\"' + chunks <- M.manyTill stringByteChunk (MC.char '\"') + spaceConsumer + pure $ B.concat chunks `B.snoc` 0 + where + stringByteChunk = escapedByte <|> rawCharBytes + rawCharBytes = TE.encodeUtf8 . T.singleton <$> M.noneOf ['\\', '"', '\n', '\r'] + escapedByte = + MC.char '\\' + *> (hexEscape + <|> octalEscape + <|> simpleEscape + <|> invalidEscape + ) + simpleEscape = M.choice + [ byte '\a' <$ MC.char 'a' + , byte '\b' <$ MC.char 'b' + , byte '\t' <$ MC.char 't' + , byte '\n' <$ MC.char 'n' + , byte '\v' <$ MC.char 'v' + , byte '\f' <$ MC.char 'f' + , byte '\r' <$ MC.char 'r' + , byte '\ESC' <$ MC.char 'e' + , byte '\\' <$ MC.char '\\' + , byte '\'' <$ MC.char '\'' + , byte '"' <$ MC.char '"' + , byte '?' <$ MC.char '?' + ] + hexEscape = do + void $ MC.char 'x' + digits <- hexDigits + byteFromDigits "character code point out of range" 16 digits + octalEscape = do + first <- M.satisfy isOctDigit + second <- optional $ M.satisfy isOctDigit + third <- optional $ M.satisfy isOctDigit + byteFromDigits "character code point out of range" 8 $ first : maybe [] (\d -> d : maybe [] pure third) second + hexDigits = do + first <- M.satisfy isHexDigit + collectHexDigits 1 [first] + collectHexDigits count revDigits = + optional (M.lookAhead $ M.satisfy isHexDigit) >>= \case + Nothing -> pure $ reverse revDigits + Just _ + | count >= maxHexEscapeDigits -> + fail "character code point out of range" + | otherwise -> do + c <- M.satisfy isHexDigit + let count' = succ count + revDigits' = c : revDigits + count' `seq` revDigits' `seq` collectHexDigits count' revDigits' + byteFromDigits errMsg base digits = + let n = foldl (\acc c -> acc * base + digitToInt c) 0 digits + in if n <= 0xff + then pure $ B.singleton $ fromIntegral n + else fail errMsg + byte = B.singleton . fromIntegral . ord + invalidEscape = + M.anySingle >>= \c -> fail ("invalid escape sequence \\" <> [c]) + +hexadecimal, binary, octal, decimal, natural, integer :: (Monad m, Ord e, Num i) => M.ParsecT e T.Text m i +hexadecimal = MC.char '0' >> MC.char' 'x' >> ML.hexadecimal +binary = do + void $ MC.char '0' + void $ MC.char' 'b' + first <- binaryDigit + consumeBinaryDigits 1 $ binaryDigitValue first + where + binaryDigit = MC.char '0' <|> MC.char '1' + binaryDigitValue = fromIntegral . digitToInt + consumeBinaryDigits count acc = + M.option Nothing (Just <$> M.lookAhead binaryDigit) >>= \case + Nothing -> pure acc + Just _ + | count >= maxBinaryLiteralDigits -> + fail "binary integer literal is too long" + | otherwise -> do + c <- binaryDigit + let count' = succ count + acc' = acc * 2 + binaryDigitValue c + count' `seq` acc' `seq` consumeBinaryDigits count' acc' +octal = MC.char '0' >> ML.octal +decimal = ML.decimal +natural = M.try (lexeme hexadecimal) <|> M.try (lexeme binary) <|> M.try (lexeme octal) <|> lexeme decimal +integer = ML.signed spaceConsumer natural <|> natural + +parens, braces, angles, brackets :: (Monad m, Ord e) => M.ParsecT e T.Text m a -> M.ParsecT e T.Text m a +parens = between lparen rparen +braces = between lbrace rbrace +angles = between langle rangle +brackets = between lbracket rbracket + +identifier, + semi, + comma, + colon, + lnot, + sharp, + ampersand, + lparen, + rparen, + lbrace, + rbrace, + langle, + rangle, + lbracket, + rbracket, + star, + period, + slash, + equal, + question, + hat, + tilda, + vertical, + percent :: (Monad m, Ord e) => M.ParsecT e T.Text m T.Text +identifier = M.try $ do + ident <- + mappend + <$> M.takeWhile1P (Just "valid identifier") (lor [isAlpha, (=='_')]) + <*> M.takeWhileP (Just "valid identifier") CR.isValidChar + if ident `elem` reservedKeywords + then fail $ "reserved keyword '" <> T.unpack ident <> "' cannot be used as identifier" + else spaceConsumer $> ident + +reservedKeywords :: [T.Text] +reservedKeywords = + [ "auto" + , "break" + , "case" + , "char" + , "const" + , "continue" + , "default" + , "do" + , "double" + , "else" + , "enum" + , "extern" + , "float" + , "for" + , "goto" + , "if" + , "inline" + , "int" + , "long" + , "register" + , "restrict" + , "return" + , "short" + , "signed" + , "sizeof" + , "static" + , "struct" + , "switch" + , "typedef" + , "union" + , "unsigned" + , "void" + , "volatile" + , "while" + , "_Alignas" + , "_Alignof" + , "_Atomic" + , "_Bool" + , "_Complex" + , "_Generic" + , "_Imaginary" + , "_Noreturn" + , "_Static_assert" + , "_Thread_local" + ] +semi = symbol ";" +comma = symbol "," +colon = symbol ":" +lnot = symbol "!" +sharp = symbol "#" +ampersand = symbol "&" +lparen = symbol "(" +rparen = symbol ")" +lbrace = symbol "{" +rbrace = symbol "}" +langle = symbol "<" +rangle = symbol ">" +lbracket = symbol "[" +rbracket = symbol "]" +star = symbol "*" +period = symbol "." +slash = symbol "/" +equal = symbol "=" +question = symbol "?" +hat = symbol "^" +tilda = symbol "~" +vertical = symbol "|" +percent = symbol "%" + +notFollowedBy :: (Monad m, Ord e) + => M.ParsecT e T.Text m a + -> M.ParsecT e T.Text m b + -> M.ParsecT e T.Text m a +notFollowedBy k p = lexeme (k <* M.notFollowedBy p) diff --git a/src/Htcc/Parser/Combinators/Decl.hs b/src/Htcc/Parser/Combinators/Decl.hs new file mode 100644 index 0000000..57b2685 --- /dev/null +++ b/src/Htcc/Parser/Combinators/Decl.hs @@ -0,0 +1,18 @@ +{-| +Module : Htcc.Parser.Combinators.Decl +Description : C language parser Combinators +Copyright : (c) roki, 2020~ +License : MIT +Maintainer : falgon53@yahoo.co.jp +Stability : experimental +Portability : POSIX + +C language parser Combinators +-} +module Htcc.Parser.Combinators.Decl ( + module Htcc.Parser.Combinators.Decl.Spec + , module Htcc.Parser.Combinators.Decl.Declarator +) where + +import Htcc.Parser.Combinators.Decl.Declarator +import Htcc.Parser.Combinators.Decl.Spec diff --git a/src/Htcc/Parser/Combinators/Decl/Declarator.hs b/src/Htcc/Parser/Combinators/Decl/Declarator.hs new file mode 100644 index 0000000..301960a --- /dev/null +++ b/src/Htcc/Parser/Combinators/Decl/Declarator.hs @@ -0,0 +1,95 @@ +{-| +Module : Htcc.Parser.Combinators.Decl.Declarator +Description : C language parser Combinators +Copyright : (c) roki, 2020~ +License : MIT +Maintainer : falgon53@yahoo.co.jp +Stability : experimental +Portability : POSIX + +C language parser Combinators +-} +{-# LANGUAGE TupleSections #-} +module Htcc.Parser.Combinators.Decl.Declarator ( + declarator + , absDeclarator +) where + +import Control.Monad.Fix (fix) +import Data.Bits (Bits (..)) +import Data.Maybe (fromMaybe) +import qualified Data.Text as T +import Data.Tuple.Extra (uncurry3) +import qualified Htcc.CRules.Types as CT +import Htcc.Parser.Combinators.Core +import Htcc.Parser.Combinators.Decl.Spec (declspec) +import Htcc.Parser.Combinators.Keywords (kAuto, kRegister, kStatic) +import Htcc.Parser.Combinators.Type.Core (typeSuffix) +import Htcc.Parser.Combinators.Type.Utils +import Htcc.Utils (dropFst3, swap) +import qualified Text.Megaparsec as M + +declarator :: (Integral i, Show i, Read i, Bits i) + => CT.StorageClass i + -> Parser i (CT.StorageClass i, Maybe T.Text) +declarator ty = do + ty' <- starsToPtr ty + M.option (ty', Nothing) $ M.choice + [ nested ty' + , swap <$> ((,) <$> M.option Nothing (Just <$> identifier) <*> typeSuffix ty') + ] + where + nested ty' = fmap (swap . dropFst3) + $ ($ (id, ty')) + $ fix $ \f (fn, ty'') -> do + ptrf <- (fn .) <$> starsToPtrCtor + M.choice + [ parens (f (ptrf, ty'')) >>= uncurry3 nested' + , (ptrf,,) . Just <$> identifier <*> M.option ty'' (typeSuffix ty'') + ] + where + nested' ptrf ident t = + M.option + (id, ident, rebuildTy $ ptrf baseTy) + ((id, ident,) . rebuildTy . ptrf <$> typeSuffix baseTy) + where + (baseTyKind, rebuildTyKind) = dctorDirectSuffix $ CT.toTypeKind t + baseTy = CT.mapTypeKind (const baseTyKind) t + rebuildTy ty''' = CT.mapTypeKind (const $ rebuildTyKind $ CT.toTypeKind ty''') t + + dctorDirectSuffix arrayTy@(CT.CTArray n ty''') = + let (baseTyKind, rebuildTyKind) = dctorDirectSuffix ty''' + rebuild newTy = CT.CTArray n $ rebuildTyKind newTy + in (baseTyKind, \newTy -> fromMaybe (rebuild newTy) (CT.concatCTArray arrayTy newTy)) + dctorDirectSuffix (CT.CTIncomplete (CT.IncompleteArray ty''')) = + let (baseTyKind, rebuildTyKind) = dctorDirectSuffix ty''' + in (baseTyKind, CT.CTIncomplete . CT.IncompleteArray . rebuildTyKind) + dctorDirectSuffix (CT.CTFunc retTy params) = + let (baseTyKind, rebuildTyKind) = dctorDirectSuffix retTy + in (baseTyKind, (`CT.CTFunc` params) . rebuildTyKind) + dctorDirectSuffix ty''' = (ty''', id) + +absDeclarator :: (Integral i, Show i, Read i, Bits i) => Parser i (CT.StorageClass i) +absDeclarator = do + ty <- typeNameSpec + ty' <- starsToPtr ty + M.choice + [ M.try $ typeSuffix ty' + , snd <$> absDeclType' id ty' + ] + where + typeNameSpec = + M.choice + [ kStatic *> storageClassError + , kRegister *> storageClassError + , kAuto *> storageClassError + , declspec + ] + storageClassError = + fail "storage-class specifier is not allowed" + + absDeclType' fn ty = do + cpfn <- starsToPtrCtor + M.option (cpfn, ty) $ do + (cpfn', ty') <- parens $ absDeclType' (fn . cpfn) ty + M.option (id, cpfn' ty') ((id,) . cpfn' <$> typeSuffix ty') diff --git a/src/Htcc/Parser/Combinators/Decl/Declarator.hs-boot b/src/Htcc/Parser/Combinators/Decl/Declarator.hs-boot new file mode 100644 index 0000000..eb576bc --- /dev/null +++ b/src/Htcc/Parser/Combinators/Decl/Declarator.hs-boot @@ -0,0 +1,21 @@ +{-| +Module : Htcc.Parser.Combinators.Decl.Declarator +Description : C language parser Combinators +Copyright : (c) roki, 2020~ +License : MIT +Maintainer : falgon53@yahoo.co.jp +Stability : experimental +Portability : POSIX + +C language parser Combinators +-} +module Htcc.Parser.Combinators.Decl.Declarator where + +import Data.Bits (Bits (..)) +import qualified Data.Text as T +import qualified Htcc.CRules.Types as CT +import Htcc.Parser.Combinators.Core (Parser) + +declarator :: (Integral i, Show i, Read i, Bits i) + => CT.StorageClass i + -> Parser i (CT.StorageClass i, Maybe T.Text) diff --git a/src/Htcc/Parser/Combinators/Decl/Spec.hs b/src/Htcc/Parser/Combinators/Decl/Spec.hs new file mode 100644 index 0000000..9eb9099 --- /dev/null +++ b/src/Htcc/Parser/Combinators/Decl/Spec.hs @@ -0,0 +1,388 @@ +{-| +Module : Htcc.Parser.Combinators.Decl.Spec +Description : C language parser Combinators +Copyright : (c) roki, 2020~ +License : MIT +Maintainer : falgon53@yahoo.co.jp +Stability : experimental +Portability : POSIX + +C language parser Combinators +-} +{-# LANGUAGE FlexibleContexts, LambdaCase #-} +module Htcc.Parser.Combinators.Decl.Spec ( + DeclStorage (..) + , declarationSpec + , declspec +) where + +import Control.Applicative (some, + (<|>)) +import Control.Monad (void, + when) +import Control.Monad.State (get, + gets, + put) +import Data.Bits (Bits) +import Data.Functor (($>), + (<&>)) +import qualified Data.Map.Strict as MP +import Data.Maybe (catMaybes) +import qualified Data.Text as T +import qualified Htcc.CRules.Types as CT +import Htcc.Parser.Combinators.ConstExpr (evalConstexpr) +import Htcc.Parser.Combinators.Core +import {-# SOURCE #-} Htcc.Parser.Combinators.Decl.Declarator (declarator) +import Htcc.Parser.Combinators.Keywords +import Htcc.Parser.Combinators.Type.Utils +import Htcc.Parser.Combinators.Utils (captureFunctionParamScopes) +import Htcc.Parser.ConstructionData.Core (ConstructionData (scope), + addEnumerator, + addTag, + hasIncompleteObjectType, + lookupEnumerator, + lookupFunction, + lookupGVar, + lookupLVar, + lookupTag, + lookupTypedef, + normalizeCompletedStorageClass) +import Htcc.Parser.ConstructionData.Scope (Scoped (curNestDepth, curScopeId)) +import qualified Htcc.Parser.ConstructionData.Scope.Enumerator as PSE +import qualified Htcc.Parser.ConstructionData.Scope.Function as PSF +import qualified Htcc.Parser.ConstructionData.Scope.Tag as PST +import qualified Htcc.Parser.ConstructionData.Scope.Typedef as PT +import qualified Htcc.Parser.ConstructionData.Scope.Var as PSV +import qualified Htcc.Tokenizer.Token as HT +import qualified Htcc.Utils as U +import Numeric.Natural (Natural) +import qualified Text.Megaparsec as M + +data DeclStorage + = OrdinaryDecl + | TypedefDecl + | ExternDecl + | AutoDecl + deriving (Eq, Show) + +declspec', + declspec :: (Ord i, Bits i, Show i, Read i, Integral i) => Parser i (CT.StorageClass i) + +declarationSpec :: (Ord i, Bits i, Show i, Read i, Integral i) => Parser i (DeclStorage, CT.StorageClass i) +declarationSpec = + leadingExternOrTypedef + <|> ordinaryWithLeadingStorage + <|> ordinaryOrTrailingExternOrTypedef + where + leadingExternOrTypedef = do + storage <- externOrTypedefStorage + ty <- declspecNoStorage + pure (storage, ty) + + ordinaryWithLeadingStorage = + explicitAuto <|> explicitStaticOrRegister + where + explicitAuto = do + void kAuto + ty <- declspecNoStorage + pure (AutoDecl, ty) + explicitStaticOrRegister = do + void $ M.lookAhead $ M.choice [kStatic, kRegister] + ty <- declspec + pure (OrdinaryDecl, ty) + + ordinaryOrTrailingExternOrTypedef = do + ty <- declspecNoStorage + storage <- M.option OrdinaryDecl externOrTypedefStorage + pure (storage, ty) + + externOrTypedefStorage = + M.choice + [ TypedefDecl <$ M.try kTypedef + , ExternDecl <$ M.try kExtern + ] + +declspec' = M.choice + [ kStatic *> (CT.SCStatic . CT.toTypeKind <$> declspecNoStorage) + , kRegister *> (CT.SCRegister . CT.toTypeKind <$> declspecNoStorage) + , kAuto *> declspecNoStorage + , declspecNoStorage + ] + +declspecNoStorage :: (Ord i, Bits i, Show i, Read i, Integral i) => Parser i (CT.StorageClass i) +declspecNoStorage = + M.choice + [ structSpecifier + , enumSpecifier + , M.try typedefSpecifier + , basicTypeSpecifier + ] + +basicTypeSpecifier :: (Show i, Read i, Integral i) => Parser i (CT.StorageClass i) +basicTypeSpecifier = do + specifiers <- some $ M.choice $ map M.try kBasicTypes + validateBasicTypeSpecifiers specifiers + pure $ CT.SCAuto $ toBasicTypeKind specifiers + where + toBasicTypeKind specifiers + | has "void" specifiers = CT.CTVoid + | has "_Bool" specifiers = CT.CTBool + | has "char" specifiers && has "signed" specifiers = CT.CTSigned CT.CTChar + | has "char" specifiers = CT.CTChar + | count "long" specifiers == 2 = CT.CTLong $ CT.CTLong CT.CTInt + | count "long" specifiers == 1 = CT.CTLong CT.CTInt + | has "short" specifiers = CT.CTShort CT.CTInt + | otherwise = CT.CTInt + + validateBasicTypeSpecifiers specifiers + | any has ["double", "float", "unsigned", "_Complex", "_Imaginary"] = invalidCombination specifiers + | count "long" > 2 = invalidCombination specifiers + | count "signed" > 1 = invalidCombination specifiers + | count "short" > 1 = invalidCombination specifiers + | count "short" > 0 && count "long" > 0 = invalidCombination specifiers + | any ((> 1) . count) ["char", "int", "_Bool", "void"] = invalidCombination specifiers + | has "void" && length specifiers > 1 = invalidCombination specifiers + | has "_Bool" && length specifiers > 1 = invalidCombination specifiers + | has "char" && any has ["int", "long", "short"] = invalidCombination specifiers + | baseSpecifierCount > 1 = invalidCombination specifiers + | otherwise = pure () + where + count keyword = + length $ filter (== T.pack keyword) specifiers + has keyword = + count keyword > 0 + baseSpecifierCount = + length $ filter (`elem` map T.pack ["char", "int", "_Bool", "void"]) specifiers + + count keyword = + length . filter (== T.pack keyword) + + has keyword = + (> 0) . count keyword + + invalidCombination specifiers = + fail $ + "invalid type specifier combination '" + <> T.unpack (T.unwords specifiers) + <> "'" + +structSpecifier :: (Ord i, Bits i, Show i, Read i, Integral i) => Parser i (CT.StorageClass i) +structSpecifier = do + void kStruct + mtag <- M.option Nothing $ Just <$> M.try identifier + ifMtaggedStruct mtag + where + ifMtaggedStruct mtag = + (M.lookAhead lbrace *> defineStruct mtag) + <|> resolveStructTag mtag + + defineStruct Nothing = do + anonymousTag <- mkAnonymousStructTag + members <- braces structMembers + scopeId <- gets $ curScopeId . scope + pure $ CT.SCAuto $ CT.CTNamedStruct anonymousTag scopeId members + defineStruct (Just tag) = do + scopeId <- gets $ curScopeId . scope + registerTagType PST.StructTag (CT.SCAuto $ CT.CTIncomplete $ CT.IncompleteStruct tag scopeId) tag + members <- braces structMembers + let ty = CT.SCAuto $ CT.CTNamedStruct tag scopeId members + registerTagType PST.StructTag ty tag + pure ty + + resolveStructTag Nothing = + fail "expected identifier or '{' after 'struct'" + resolveStructTag (Just tag) = + do + standaloneForwardDecl <- M.option False $ True <$ M.lookAhead semi + scp <- get + let depth = curNestDepth $ scope scp + scopeId = curScopeId $ scope scp + incompleteStructTy = + CT.SCAuto $ CT.CTIncomplete $ CT.IncompleteStruct tag scopeId + maybe + (registerIncompleteStructTag tag $> incompleteStructTy) + (ensureStructTag depth scopeId standaloneForwardDecl) + (lookupTag tag scp) + where + ensureStructTag depth scopeId standaloneForwardDecl tagInfo + | standaloneForwardDecl && PST.stNestDepth tagInfo < depth = + registerIncompleteStructTag tag + $> CT.SCAuto (CT.CTIncomplete $ CT.IncompleteStruct tag scopeId) + | PST.stKind tagInfo == PST.StructTag = pure $ PST.sttype tagInfo + | otherwise = fail $ "use of 'struct " <> T.unpack tag <> "' with wrong tag type" + + structMembers = go (0 :: Natural) MP.empty + where + go offset acc = + (M.lookAhead rbrace $> acc) + <|> do + (nextOffset, name, mem) <- structMember offset + when (MP.member name acc) $ + fail $ "duplicate member '" <> T.unpack name <> "'" + go nextOffset $ MP.insert name mem acc + + structMember offset = do + rejectMemberStorageClass + memberBaseTy <- declspecNoStorage + ((memberTy, mident), _) <- captureFunctionParamScopes $ declarator memberBaseTy + ident <- maybe + (fail "expected member name or ';' after declaration specifiers") + pure + mident + resolvedTy <- gets (`normalizeCompletedStorageClass` memberTy) + when (hasIncompleteObjectType resolvedTy) $ + fail "declaration with incomplete type" + when (isVoidObjectType resolvedTy) $ + fail $ "variable or field '" <> T.unpack ident <> "' declarated void" + when (isFunctionMemberType resolvedTy) $ + fail $ "field '" <> T.unpack ident <> "' declared as a function" + void semi + let memberOffset = + U.toNatural $ + CT.alignas + (U.toInteger offset) + (fromIntegral $ CT.alignof resolvedTy) + memberInfo = + CT.StructMember + (CT.toTypeKind resolvedTy) + memberOffset + pure + ( memberOffset + fromIntegral (CT.sizeof resolvedTy) + , ident + , memberInfo + ) + + rejectMemberStorageClass = + M.choice + [ kAuto *> fail "invalid storage-class specifier" + , kStatic *> fail "invalid storage-class specifier" + , kRegister *> fail "invalid storage-class specifier" + , kExtern *> fail "invalid storage-class specifier" + , kTypedef *> fail "invalid storage-class specifier" + , pure () + ] + + mkAnonymousStructTag = do + pos <- M.getSourcePos + pure $ T.pack ".anonymous.struct." <> U.tshow pos + +enumSpecifier :: (Ord i, Bits i, Show i, Read i, Integral i) => Parser i (CT.StorageClass i) +enumSpecifier = do + void kEnum + mtag <- M.option Nothing $ Just <$> M.try identifier + (M.lookAhead lbrace *> defineEnum mtag) + <|> resolveEnumTag mtag + where + defineEnum Nothing = CT.SCAuto . CT.CTEnum CT.CTInt <$> braces enumMembers + defineEnum (Just tag) = do + members <- braces enumMembers + let ty = CT.SCAuto $ CT.CTEnum CT.CTInt members + registerTagType PST.EnumTag ty tag + pure ty + + resolveEnumTag Nothing = + fail "expected identifier or '{' after 'enum'" + resolveEnumTag (Just tag) = + gets (lookupTag tag) + >>= maybe + (fail $ "storage size of '" <> T.unpack tag <> "' isn't known") + ensureEnumTag + where + ensureEnumTag tagInfo + | PST.stKind tagInfo == PST.EnumTag = pure $ PST.sttype tagInfo + | otherwise = fail $ "use of 'enum " <> T.unpack tag <> "' with wrong tag type" + + enumMembers = go 0 MP.empty + where + enumTy = CT.SCAuto CT.CTInt + + go nextVal acc = + (M.lookAhead rbrace *> + if MP.null acc + then fail "use of empty enum" + else pure acc) + <|> do + ident <- identifier + val <- M.option nextVal $ equal *> evalConstexpr + registerEnumeratorValue enumTy ident val + hasComma <- M.option False (True <$ comma) + let acc' = MP.insert ident val acc + nextVal' = succ val + if hasComma then + (M.lookAhead rbrace $> acc') <|> go nextVal' acc' + else + pure acc' + +typedefSpecifier :: (Ord i, Bits i, Show i, Read i, Integral i) => Parser i (CT.StorageClass i) +typedefSpecifier = + M.try $ + identifier >>= \ident -> do + mTypedef <- gets $ lookupTypedef ident + ordinaryDepth <- visibleOrdinaryIdentifierDepth ident + case mTypedef of + Just td + | maybe False (>= PT.tdNestDepth td) ordinaryDepth -> + fail $ "'" <> T.unpack ident <> "' is not a type or also a typedef identifier" + | otherwise -> + pure $ PT.tdtype td + Nothing -> + fail $ "'" <> T.unpack ident <> "' is not a type or also a typedef identifier" + +declspec = declspec' >>= starsToPtr + +isVoidObjectType :: CT.StorageClass i -> Bool +isVoidObjectType = go . CT.toTypeKind + where + go = \case + CT.CTLong innerTy -> go innerTy + CT.CTShort innerTy -> go innerTy + CT.CTSigned innerTy -> go innerTy + CT.CTArray _ innerTy -> go innerTy + CT.CTIncomplete (CT.IncompleteArray innerTy) -> go innerTy + CT.CTVoid -> True + _ -> False + +isFunctionMemberType :: CT.StorageClass i -> Bool +isFunctionMemberType = go . CT.toTypeKind + where + go = \case + CT.CTLong innerTy -> go innerTy + CT.CTShort innerTy -> go innerTy + CT.CTSigned innerTy -> go innerTy + CT.CTFunc _ _ -> True + _ -> False + +visibleOrdinaryIdentifierDepth :: T.Text -> Parser i (Maybe Natural) +visibleOrdinaryIdentifierDepth ident = + gets $ \scp -> + let depths = + catMaybes + [ PSV.nestDepth <$> lookupLVar ident scp + , PSE.enNestDepth <$> lookupEnumerator ident scp + , PSV.gvNestDepth <$> lookupGVar ident scp + , PSF.fnNestDepth <$> lookupFunction ident scp + ] + in + if null depths then Nothing else Just $ maximum depths + +tmpTKIdent :: Num i => T.Text -> HT.TokenLC i +tmpTKIdent ident = (HT.TokenLCNums 1 1, HT.TKIdent ident) + +registerTagType :: Num i => PST.TagKind -> CT.StorageClass i -> T.Text -> Parser i () +registerTagType kind ty ident = do + scp <- get + case addTag kind ty (tmpTKIdent ident) scp of + Left err -> fail $ T.unpack $ fst err + Right scp' -> put scp' + +registerIncompleteStructTag :: Num i => T.Text -> Parser i () +registerIncompleteStructTag ident = do + scopeId <- gets $ curScopeId . scope + registerTagType PST.StructTag (CT.SCAuto $ CT.CTIncomplete $ CT.IncompleteStruct ident scopeId) ident + +registerEnumeratorValue :: Num i => CT.StorageClass i -> T.Text -> i -> Parser i () +registerEnumeratorValue ty ident val = do + scp <- get + case addEnumerator ty (tmpTKIdent ident) val scp of + Left err -> fail $ T.unpack $ fst err + Right scp' -> put scp' diff --git a/src/Htcc/Parser/Combinators/GNUExtensions.hs b/src/Htcc/Parser/Combinators/GNUExtensions.hs new file mode 100644 index 0000000..8c8e233 --- /dev/null +++ b/src/Htcc/Parser/Combinators/GNUExtensions.hs @@ -0,0 +1,53 @@ +{-| +Module : Htcc.Parser.Combinators.GNUExtensions +Description : Combinators of GNU extensions +Copyright : (c) roki, 2020~ +License : MIT +Maintainer : falgon53@yahoo.co.jp +Stability : experimental +Portability : POSIX + +Combinators of GNU extensions +-} +{-# LANGUAGE OverloadedStrings #-} +module Htcc.Parser.Combinators.GNUExtensions ( + condOmitted + , stmtExpr +) where + +import Control.Monad (unless, when) +import Control.Monad.State (gets) +import Data.Bits (Bits) +import Htcc.Parser.AST.Core (ATKind (..), + ATree (..), + atConditional, + atNoLeaf) +import Htcc.Parser.Combinators.Core +import {-# SOURCE #-} Htcc.Parser.Combinators.Program (compoundStmt, + conditional) +import Htcc.Parser.Combinators.Utils (conditionalResultType, + decayExprType, + maybeToParser, + requiresUnsupportedNonAddressableArrayDecay) +import Htcc.Parser.ConstructionData.Core (ConstructionData (suppressUnsupportedValueChecks)) +import qualified Text.Megaparsec as M + + +-- Conditionals with Omitted Operands, see also: https://gcc.gnu.org/onlinedocs/gcc/Conditionals.html +condOmitted :: (Ord i, Bits i, Read i, Show i, Integral i) => ATree i -> Parser i (ATree i) +condOmitted nd = M.try (symbol "?" *> symbol ":") *> do + el <- conditional + ty <- maybeToParser "invalid operands" $ conditionalResultType nd el + unsupportedChecksSuppressed <- gets suppressUnsupportedValueChecks + unless unsupportedChecksSuppressed $ + when (requiresUnsupportedNonAddressableArrayDecay nd || requiresUnsupportedNonAddressableArrayDecay el) $ + fail "unsupported non-addressable array member decay" + pure $ atConditional ty nd ATEmpty el + +-- Statements and Declarations in Expressions, see also: https://gcc.gnu.org/onlinedocs/gcc/Statement-Exprs.html +stmtExpr :: (Ord i, Bits i, Read i, Show i, Integral i) => Parser i (ATree i) +stmtExpr = do + k <- parens compoundStmt + if null k then fail "void value not ignored as it ought to be" else case last k of + (ATNode ATExprStmt _ n _) -> pure $ atNoLeaf (ATStmtExpr $ init k <> [n]) (decayExprType $ atype n) + _ -> fail "void value not ignored as it ought to be" diff --git a/src/Htcc/Parser/Combinators/Keywords.hs b/src/Htcc/Parser/Combinators/Keywords.hs new file mode 100644 index 0000000..910bb0b --- /dev/null +++ b/src/Htcc/Parser/Combinators/Keywords.hs @@ -0,0 +1,100 @@ +{-| +Module : Htcc.Parser.Combinators.Keywords +Description : C language lexer +Copyright : (c) roki, 2020~ +License : MIT +Maintainer : falgon53@yahoo.co.jp +Stability : experimental +Portability : POSIX + +C language lexer +-} +{-# LANGUAGE OverloadedStrings #-} +module Htcc.Parser.Combinators.Keywords ( + kAuto, kBreak, kCase, kChar, kConst, kContinue, + kDefault, kDo, kDouble, kElse, kEnum, kExtern, + kFloat, kFor, kGoto, kIf, kInline, kInt, + kLong, kRegister, kRestrict, kReturn, kShort, kSigned, kSizeof, + kStatic, kStruct, kSwitch, kTypedef, kUnion, kUnsigned, kVoid, + kVolatile, kWhile, kAlignas, kAlignof, kAtomic, kBool, kComplex, + kGeneric, kImaginary, kNoreturn, kStaticAssert, kThreadLocal, + kBasicTypes +) where + +import qualified Data.Text as T +import qualified Htcc.CRules as CR +import Htcc.Parser.Combinators.Core +import qualified Text.Megaparsec as M +import qualified Text.Megaparsec.Char as MC + +pKeyword :: (Monad m, Ord e) => T.Text -> M.ParsecT e T.Text m T.Text +pKeyword = flip notFollowedBy (M.takeWhile1P (Just "valid Keyword") CR.isValidChar) . MC.string + +kAuto, kBreak, kCase, kChar, kConst, kContinue, + kDefault, kDo, kDouble, kElse, kEnum, kExtern, + kFloat, kFor, kGoto, kIf, kInline, kInt, + kLong, kRegister, kRestrict, kReturn, kShort, kSigned, + kSizeof, kStatic, kStruct, kSwitch, kTypedef, kUnion, + kUnsigned, kVoid, kVolatile, kWhile, kAlignas, kAlignof, + kAtomic, kBool, kComplex, kGeneric, kImaginary, kNoreturn, + kStaticAssert, kThreadLocal :: (Monad m, Ord e) => M.ParsecT e T.Text m T.Text +kAuto = pKeyword "auto" +kBreak = pKeyword "break" +kCase = pKeyword "case" +kChar = pKeyword "char" +kConst = pKeyword "const" +kContinue = pKeyword "continue" +kDefault = pKeyword "default" +kDo = pKeyword "do" +kDouble = pKeyword "double" +kElse = pKeyword "else" +kEnum = pKeyword "enum" +kExtern = pKeyword "extern" +kFloat = pKeyword "float" +kFor = pKeyword "for" +kGoto = pKeyword "goto" +kIf = pKeyword "if" +kInline = pKeyword "inline" +kInt = pKeyword "int" +kLong = pKeyword "long" +kRegister = pKeyword "register" +kRestrict = pKeyword "restrict" +kReturn = pKeyword "return" +kShort = pKeyword "short" +kSigned = pKeyword "signed" +kSizeof = pKeyword "sizeof" +kStatic = pKeyword "static" +kStruct = pKeyword "struct" +kSwitch = pKeyword "switch" +kTypedef = pKeyword "typedef" +kUnion = pKeyword "union" +kUnsigned = pKeyword "unsigned" +kVoid = pKeyword "void" +kVolatile = pKeyword "volatile" +kWhile = pKeyword "while" +kAlignas = pKeyword "_Alignas" +kAlignof = pKeyword "_Alignof" +kAtomic = pKeyword "_Atomic" +kBool = pKeyword "_Bool" +kComplex = pKeyword "_Complex" +kGeneric = pKeyword "_Generic" +kImaginary = pKeyword "_Imaginary" +kNoreturn = pKeyword "_Noreturn" +kStaticAssert = pKeyword "_Static_assert" +kThreadLocal = pKeyword "_Thread_local" + +kBasicTypes :: (Monad m, Ord e) => [M.ParsecT e T.Text m T.Text] +kBasicTypes = [ + kChar + , kDouble + , kFloat + , kInt + , kLong + , kShort + , kSigned + , kUnsigned + , kVoid + , kBool + , kComplex + , kImaginary + ] diff --git a/src/Htcc/Parser/Combinators/ParserType.hs b/src/Htcc/Parser/Combinators/ParserType.hs new file mode 100644 index 0000000..f25198b --- /dev/null +++ b/src/Htcc/Parser/Combinators/ParserType.hs @@ -0,0 +1,117 @@ +{-| +Module : Htcc.Parser.Combinators.ParserType +Description : C language parser type +Copyright : (c) roki, 2020~ +License : MIT +Maintainer : falgon53@yahoo.co.jp +Stability : experimental +Portability : POSIX + +C language parser type +-} +module Htcc.Parser.Combinators.ParserType ( + runParser + , runParserAllowSameInputExternalCollisions + , runParserAllowSameInputExternalCollisionsDetailed + , ConstructionDataState + , Parser +) where + +import Control.Monad.Trans.State.Lazy (StateT, + runStateT) +import Data.Functor.Identity +import qualified Data.Text as T +import Data.Void +import Htcc.Parser.AST.Type (ASTs) +import {-# SOURCE #-} Htcc.Parser.ConstructionData.Core (ConstructionData (..), + Warnings, + initConstructionData) +import qualified Htcc.Parser.ConstructionData.Scope as PS +import qualified Htcc.Parser.ConstructionData.Scope.Function as PF +import qualified Htcc.Parser.ConstructionData.Scope.Var as PSV +import qualified Text.Megaparsec as M + +type ConstructionDataState i = StateT (ConstructionData i) Identity +type Parser i = M.ParsecT Void T.Text (ConstructionDataState i) + +runParser :: + Parser i (ASTs i) + -> FilePath + -> T.Text + -> Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs i, PSV.GlobalVars i, PSV.Literals i, PF.Functions i) +runParser = runParserWithMode False + +runParserAllowSameInputExternalCollisions :: + Parser i (ASTs i) + -> FilePath + -> T.Text + -> Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs i, PSV.GlobalVars i, PSV.Literals i, PF.Functions i) +runParserAllowSameInputExternalCollisions = runParserWithMode True + +runParserAllowSameInputExternalCollisionsDetailed :: + Parser i (ASTs i) + -> FilePath + -> T.Text + -> Either + (M.ParseErrorBundle T.Text Void) + ( Warnings + , ASTs i + , PSV.GlobalVars i + , PSV.GlobalVars i + , PSV.Literals i + , PF.Functions i + , PF.Functions i + ) +runParserAllowSameInputExternalCollisionsDetailed = + runParserWithModeDetailed True + +runParserWithMode :: + Bool + -> Parser i (ASTs i) + -> FilePath + -> T.Text + -> Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs i, PSV.GlobalVars i, PSV.Literals i, PF.Functions i) +runParserWithMode allowSameInputExternalCollisionsMode p fp input = + (\(warns', asts, gvars, _, lits, funcs, _) -> (warns', asts, gvars, lits, funcs)) + <$> runParserWithModeDetailed allowSameInputExternalCollisionsMode p fp input + +runParserWithModeDetailed :: + Bool + -> Parser i (ASTs i) + -> FilePath + -> T.Text + -> Either + (M.ParseErrorBundle T.Text Void) + ( Warnings + , ASTs i + , PSV.GlobalVars i + , PSV.GlobalVars i + , PSV.Literals i + , PF.Functions i + , PF.Functions i + ) +runParserWithModeDetailed allowSameInputExternalCollisionsMode p fp input = + attachDetailedResult <$> fst result + where + finalScope = scope $ snd result + attachDetailedResult asts = + ( warns (snd result) + , asts + , visibleGlobals finalScope + , mergeGlobals finalScope + , PSV.literals (PS.vars finalScope) + , PS.functions finalScope + , mergeFunctions finalScope + ) + visibleGlobals scp = PSV.globals $ PS.vars scp + mergeGlobals scp + | allowSameInputExternalCollisionsMode = PSV.externalGlobals $ PS.vars scp + | otherwise = visibleGlobals scp + mergeFunctions scp + | allowSameInputExternalCollisionsMode = PS.externalFunctions scp + | otherwise = PS.functions scp + result = + runIdentity $ + runStateT + (M.runParserT p fp input) + (initConstructionData { allowSameInputExternalCollisions = allowSameInputExternalCollisionsMode }) diff --git a/src/Htcc/Parser/Combinators/ParserType.hs-boot b/src/Htcc/Parser/Combinators/ParserType.hs-boot new file mode 100644 index 0000000..bb37a07 --- /dev/null +++ b/src/Htcc/Parser/Combinators/ParserType.hs-boot @@ -0,0 +1,43 @@ +{-# LANGUAGE FlexibleContexts, OverloadedStrings, RankNTypes, TupleSections #-} +module Htcc.Parser.Combinators.ParserType where + +import Control.Monad.Trans.State.Lazy (StateT (..)) +import Data.Functor.Identity +import qualified Data.Text as T +import Data.Void +import Htcc.Parser.AST.Type (ASTs) +import {-# SOURCE #-} Htcc.Parser.ConstructionData.Core (ConstructionData, + Warnings) +import qualified Htcc.Parser.ConstructionData.Scope.Function as PF +import qualified Htcc.Parser.ConstructionData.Scope.Var as PSV +import qualified Text.Megaparsec as M + +type ConstructionDataState i = StateT (ConstructionData i) Identity +type Parser i = M.ParsecT Void T.Text (ConstructionDataState i) + +runParser :: + Parser i (ASTs i) + -> FilePath + -> T.Text + -> Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs i, PSV.GlobalVars i, PSV.Literals i, PF.Functions i) + +runParserAllowSameInputExternalCollisions :: + Parser i (ASTs i) + -> FilePath + -> T.Text + -> Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs i, PSV.GlobalVars i, PSV.Literals i, PF.Functions i) + +runParserAllowSameInputExternalCollisionsDetailed :: + Parser i (ASTs i) + -> FilePath + -> T.Text + -> Either + (M.ParseErrorBundle T.Text Void) + ( Warnings + , ASTs i + , PSV.GlobalVars i + , PSV.GlobalVars i + , PSV.Literals i + , PF.Functions i + , PF.Functions i + ) diff --git a/src/Htcc/Parser/Combinators/Program.hs b/src/Htcc/Parser/Combinators/Program.hs new file mode 100644 index 0000000..4efbef5 --- /dev/null +++ b/src/Htcc/Parser/Combinators/Program.hs @@ -0,0 +1,2056 @@ +{-| +Module : Htcc.Parser.Combinators.Program +Description : C language lexer +Copyright : (c) roki, 2020~ +License : MIT +Maintainer : falgon53@yahoo.co.jp +Stability : experimental +Portability : POSIX + +C language Program parser +-} +{-# LANGUAGE FlexibleContexts, LambdaCase, OverloadedStrings #-} +module Htcc.Parser.Combinators.Program ( + parser + , assign + , conditional + , compoundStmt + , convertCallArgs + , convertCallArgsWith + , foldGlobalInitWith + , isInvalidAggregateValueConversion +) where + +import Control.Monad (unless, void, + when, zipWithM, + (>=>)) +import Control.Monad.Combinators (choice) +import Control.Monad.Extra (ifM) +import Control.Monad.State (get, gets, modify) +import Control.Monad.Trans (MonadTrans (..)) +import Control.Monad.Trans.Maybe (MaybeT (..), + runMaybeT) +import Data.Bits (Bits, bit, + complement, + shiftL, shiftR, + xor, (.&.), (.|.)) +import Data.Functor (($>), (<&>)) +import Data.List (find, sortBy) +import Data.Maybe (fromJust, + fromMaybe, isJust, + listToMaybe) +import Data.Ord (comparing) +import qualified Data.Text as T +import Data.Tuple.Extra (dupe, first, + second) +import qualified Htcc.CRules.Types as CT +import Htcc.Parser.AST (Treealizable (..), + addKind, + isEmptyReturn, + isNonEmptyReturn, + subKind) +import Htcc.Parser.AST.Core (ATKind (..), + ATKindFor (..), + ATree (..), + atBlock, atBreak, + atCase, atCast, + atConditional, + atContinue, + atDefFunc, + atDefault, atElse, + atExprStmt, atFor, + atGVar, atGoto, + atIf, atLVar, + atLabel, + atMemberAcc, + atNoLeaf, atNull, + atNumLit, + atReturn, + atSwitch, atUnary, + atWhile, + fromATKindFor) +import Htcc.Parser.AST.Type (ASTs) +import Htcc.Parser.Combinators.BasicOperator +import Htcc.Parser.Combinators.ConstExpr (evalConstexpr) +import Htcc.Parser.Combinators.Core +import Htcc.Parser.Combinators.Decl (DeclStorage (..), + absDeclarator, + declarationSpec, + declarator, + declspec) +import qualified Htcc.Parser.Combinators.GNUExtensions as GNU +import Htcc.Parser.Combinators.Keywords +import Htcc.Parser.Combinators.Utils (bracket, + captureFunctionParamScopes, + conditionalResultType, + containsEscapingStmtExprControlFlow, + decayExprType, + getPosState, + hasInvalidStmtExprControlFlow, + isInvalidAggregateValueConversion, + isInvalidFunctionPointerInitializer, + isInvalidFunctionPointerValue, + isInvalidObjectPointerValue, + maybeToParser, + registerFunc, + registerGVar, + registerGVarWith, + registerLVar, + registerStringLiteral, + registerTypedef, + requiresUnsupportedNonAddressableArrayDecay) +import Htcc.Parser.Combinators.Var (varInit) +import Htcc.Parser.ConstructionData.Core (ConstructionData (scope, suppressUnsupportedValueChecks), + FunctionParamScope (..), + fallBack, + hasIncompleteObjectType, + incomplete, + isSwitchStmt, + lookupFunction, + lookupGVar, + lookupLVar, + lookupVar, + normalizeCompletedStorageClass, + pushWarn, + resetLocal, + succNest) +import Htcc.Parser.ConstructionData.Scope (LookupVarResult (..), + Scoped (curNestDepth, curScopeId, enumerators, functions, structs)) +import qualified Htcc.Parser.ConstructionData.Scope.Function as PSF +import qualified Htcc.Parser.ConstructionData.Scope.Var as PV +import Numeric.Natural (Natural) +import qualified Text.Megaparsec as M +import qualified Text.Megaparsec.Char as MC + +import Text.Megaparsec.Debug (dbg) + +parser, program :: (Ord i, Integral i, Bits i, Read i, Show i) => Parser i (ASTs i) +parser = do + asts <- spaceConsumer *> program <* M.eof + rejectUnsupportedCompletedFunctionReturnTypes + pure asts +program = M.many global + +requireCompleteObjectType + :: (Ord i, Bits i, Read i, Show i, Integral i) + => String + -> CT.StorageClass i + -> Parser i (CT.StorageClass i) +requireCompleteObjectType err ty = do + resolvedTy <- gets (`normalizeCompletedStorageClass` ty) + if hasIncompleteObjectType resolvedTy + then fail err + else pure resolvedTy + +isVoidObjectType :: CT.StorageClass i -> Bool +isVoidObjectType = go . CT.toTypeKind + where + go = \case + CT.CTLong innerTy -> go innerTy + CT.CTShort innerTy -> go innerTy + CT.CTSigned innerTy -> go innerTy + CT.CTArray _ innerTy -> go innerTy + CT.CTIncomplete (CT.IncompleteArray innerTy) -> go innerTy + CT.CTVoid -> True + _ -> False + +requireNonVoidObjectType :: String -> CT.StorageClass i -> Parser i (CT.StorageClass i) +requireNonVoidObjectType err ty + | isVoidObjectType ty = fail err + | otherwise = pure ty + +requireSupportedByValueType :: Ord i => String -> CT.StorageClass i -> Parser i (CT.StorageClass i) +requireSupportedByValueType err ty + | isUnsupportedByValueAggregateType ty = fail err + | otherwise = pure ty + +isUnsupportedByValueAggregateType :: Ord i => CT.StorageClass i -> Bool +isUnsupportedByValueAggregateType ty = + CT.isCTStruct ty && CT.sizeof ty > 8 + +rejectUnsupportedCompletedFunctionReturnTypes :: Ord i => Parser i () +rejectUnsupportedCompletedFunctionReturnTypes = do + hasUnsupportedReturn <- gets $ \cd -> + any (definedFunctionReturnsUnsupported cd) (functions $ scope cd) + when hasUnsupportedReturn $ + fail "unsupported by-value function return type" + where + definedFunctionReturnsUnsupported cd fn = + PSF.fnDefined fn + && functionReturnsUnsupported + (normalizeCompletedStorageClass cd $ PSF.fntype fn) + + functionReturnsUnsupported fnTy = case CT.toTypeKind fnTy of + CT.CTFunc retTy _ -> + isUnsupportedByValueAggregateType $ CT.SCAuto retTy + _ -> + False + +functionParamDecls :: Eq i => CT.StorageClass i -> Parser i [(CT.StorageClass i, Maybe T.Text)] +functionParamDecls ty = case CT.toTypeKind ty of + CT.CTFunc _ params -> pure + [ (CT.SCAuto $ canonicalizeFunctionParamType paramTy, ident) + | (paramTy, ident) <- params + , paramTy /= CT.CTVoid + ] + _ -> fail "expected function parameters" + +unnamedFunctionParamIdent :: Int -> T.Text +unnamedFunctionParamIdent idx = + "$htcc_unnamed_param_" <> T.pack (show idx) + +requireInitializedObjectType + :: (Ord i, Bits i, Read i, Show i, Integral i) + => String + -> CT.StorageClass i + -> CT.StorageClass i + -> Parser i (CT.StorageClass i) +requireInitializedObjectType err baseTy ty = do + resolvedBaseTy <- gets (`normalizeCompletedStorageClass` baseTy) + resolvedTy <- gets (`normalizeCompletedStorageClass` ty) + if hasInitializerCompletableArrayType resolvedTy + && not (addsOuterArrayLayerBeforeExistingOmittedBound resolvedBaseTy resolvedTy) + then pure resolvedTy + else + if hasIncompleteObjectType resolvedTy + then fail err + else pure resolvedTy + +requireExternDeclObjectType + :: (Ord i, Bits i, Read i, Show i, Integral i) + => String + -> CT.StorageClass i + -> Parser i (CT.StorageClass i) +requireExternDeclObjectType err ty = do + resolvedTy <- gets (`normalizeCompletedStorageClass` ty) + if hasInvalidExternArrayElementType resolvedTy + then fail err + else pure resolvedTy + where + hasInvalidExternArrayElementType = goTop . CT.toTypeKind + + goTop = \case + CT.CTArray _ innerTy -> + hasIncompleteArrayElementType innerTy + CT.CTIncomplete (CT.IncompleteArray elemTy) -> + hasIncompleteArrayElementType elemTy + _ -> + False + + hasIncompleteArrayElementType = \case + CT.CTArray _ innerTy -> + hasIncompleteArrayElementType innerTy + CT.CTIncomplete _ -> + True + _ -> + False + +requireTypedefDeclType + :: (Ord i, Bits i, Read i, Show i, Integral i) + => String + -> CT.StorageClass i + -> Parser i (CT.StorageClass i) +requireTypedefDeclType err ty = do + resolvedTy <- gets (`normalizeCompletedStorageClass` ty) + if hasInvalidTypedefArrayElementType resolvedTy + then fail err + else pure resolvedTy + where + hasInvalidTypedefArrayElementType = goTop . CT.toTypeKind + + goTop = \case + CT.CTArray _ innerTy -> + hasInvalidArrayElementType innerTy + CT.CTIncomplete (CT.IncompleteArray elemTy) -> + hasInvalidArrayElementType elemTy + _ -> + False + + hasInvalidArrayElementType = \case + CT.CTArray _ innerTy -> + hasInvalidArrayElementType innerTy + CT.CTIncomplete _ -> + True + innerTy -> + isVoidTypeKind innerTy + + isVoidTypeKind = \case + CT.CTLong innerTy -> + isVoidTypeKind innerTy + CT.CTShort innerTy -> + isVoidTypeKind innerTy + CT.CTSigned innerTy -> + isVoidTypeKind innerTy + CT.CTVoid -> + True + _ -> + False + +resolveDerefObjectType + :: (Ord i, Bits i, Read i, Show i, Integral i) + => String + -> CT.StorageClass i + -> Parser i (CT.StorageClass i) +resolveDerefObjectType err ty + | isTopLevelOmittedBoundArrayType ty = pure ty + | otherwise = gets (incomplete ty) >>= maybeToParser err + +decayIncompleteArrayExpr :: Ord i => ATree i -> ATree i +decayIncompleteArrayExpr expr + | isTopLevelOmittedBoundArrayType (atype expr) = atCast (decayExprType $ atype expr) expr + | otherwise = expr + +isTopLevelOmittedBoundArrayType :: CT.StorageClass i -> Bool +isTopLevelOmittedBoundArrayType ty = case CT.toTypeKind ty of + CT.CTIncomplete (CT.IncompleteArray _) -> True + _ -> False + +hasInitializerCompletableArrayType :: CT.StorageClass i -> Bool +hasInitializerCompletableArrayType = go . CT.toTypeKind + where + go = \case + CT.CTLong innerTy -> go innerTy + CT.CTShort innerTy -> go innerTy + CT.CTSigned innerTy -> go innerTy + CT.CTEnum baseTy _ -> go baseTy + CT.CTArray _ innerTy -> + go innerTy + CT.CTIncomplete (CT.IncompleteArray elemTy) -> + isCompleteObjectTypeKind elemTy + _ -> + False + + isCompleteObjectTypeKind = \case + CT.CTLong innerTy -> isCompleteObjectTypeKind innerTy + CT.CTShort innerTy -> isCompleteObjectTypeKind innerTy + CT.CTSigned innerTy -> isCompleteObjectTypeKind innerTy + CT.CTEnum baseTy _ -> isCompleteObjectTypeKind baseTy + CT.CTArray _ innerTy -> + isCompleteObjectTypeKind innerTy + CT.CTIncomplete _ -> + False + CT.CTVoid -> + False + CT.CTFunc _ _ -> + False + tyKind -> + not $ hasIncompleteObjectType $ CT.SCAuto tyKind + +addsOuterArrayLayerBeforeExistingOmittedBound :: CT.StorageClass i -> CT.StorageClass i -> Bool +addsOuterArrayLayerBeforeExistingOmittedBound baseTy declaredTy + | not (hasInitializerCompletableArrayType baseTy) = + False + | otherwise = + case + ( outerArrayDepthBeforeFirstIncomplete baseTy + , outerArrayDepthBeforeFirstIncomplete declaredTy + ) of + (Just baseDepth, Just declaredDepth) -> + declaredDepth > baseDepth + _ -> + False + +outerArrayDepthBeforeFirstIncomplete :: CT.StorageClass i -> Maybe Int +outerArrayDepthBeforeFirstIncomplete = go 0 . CT.toTypeKind + where + go depth = \case + CT.CTLong innerTy -> go depth innerTy + CT.CTShort innerTy -> go depth innerTy + CT.CTSigned innerTy -> go depth innerTy + CT.CTEnum baseTy _ -> go depth baseTy + CT.CTArray _ innerTy -> + go (succ depth) innerTy + CT.CTIncomplete (CT.IncompleteArray _) -> + Just depth + _ -> + Nothing + +canonicalizeCompletableOmittedArrayType :: CT.StorageClass i -> CT.StorageClass i +canonicalizeCompletableOmittedArrayType ty = case CT.toTypeKind ty of + CT.CTIncomplete (CT.IncompleteArray elemTy) -> + let (baseTy, rebuild) = peelArrays elemTy + in CT.mapTypeKind (const $ rebuild $ CT.CTIncomplete $ CT.IncompleteArray baseTy) ty + _ -> + ty + where + peelArrays (CT.CTArray n innerTy) = + let (baseTy, rebuild) = peelArrays innerTy + in (baseTy, CT.CTArray n . rebuild) + peelArrays baseTy = (baseTy, id) + +normalizeGlobalDeclType :: CT.StorageClass i -> CT.StorageClass i +normalizeGlobalDeclType ty + | hasInitializerCompletableArrayType ty = + canonicalizeCompletableOmittedArrayType ty + | otherwise = + ty + +isValidTentativeFileScopeArrayType :: CT.StorageClass i -> Bool +isValidTentativeFileScopeArrayType = go . CT.toTypeKind + where + go = \case + CT.CTArray _ innerTy -> + go innerTy + CT.CTIncomplete (CT.IncompleteArray elemTy) -> + isCompleteArrayElementType elemTy + _ -> + False + + isCompleteArrayElementType = \case + CT.CTIncomplete _ -> False + CT.CTArray _ innerTy -> isCompleteArrayElementType innerTy + _ -> True + +derefObjectType :: Ord i => CT.StorageClass i -> Maybe (CT.StorageClass i) +derefObjectType ty = case CT.toTypeKind ty of + CT.CTArray n (CT.CTIncomplete (CT.IncompleteArray elemTy)) -> + Just $ CT.mapTypeKind (const $ CT.CTArray n elemTy) ty + _ -> + CT.deref ty + +callableSignature :: CT.StorageClass i -> Maybe (CT.StorageClass i, Maybe [CT.StorageClass i]) +callableSignature ty = case CT.toTypeKind ty of + CT.CTFunc retTy params -> + Just (CT.SCAuto retTy, explicitFunctionParamTypes params) + CT.CTPtr (CT.CTFunc retTy params) -> + Just (CT.SCAuto retTy, explicitFunctionParamTypes params) + _ -> + Nothing + +explicitFunctionParamTypes :: [(CT.TypeKind i, Maybe T.Text)] -> Maybe [CT.StorageClass i] +explicitFunctionParamTypes [] = Nothing +explicitFunctionParamTypes [(CT.CTVoid, Nothing)] = Just [] +explicitFunctionParamTypes params = + Just $ map (CT.SCAuto . canonicalizeFunctionParamType . fst) params + +canonicalizeFunctionParamType :: CT.TypeKind i -> CT.TypeKind i +canonicalizeFunctionParamType (CT.CTArray _ elemTy) = CT.CTPtr elemTy +canonicalizeFunctionParamType (CT.CTIncomplete (CT.IncompleteArray elemTy)) = CT.CTPtr elemTy +canonicalizeFunctionParamType (CT.CTFunc retTy params) = CT.CTPtr $ CT.CTFunc retTy params +canonicalizeFunctionParamType ty = ty + +applyCallArgConversions :: (Ord i, Bits i, Integral i) => Maybe [CT.StorageClass i] -> [ATree i] -> Parser i [ATree i] +applyCallArgConversions paramTys args = do + shouldValidateUnsupported <- gets (not . suppressUnsupportedValueChecks) + either fail pure $ convertCallArgsWith shouldValidateUnsupported paramTys args + +convertCallArgs :: (Ord i, Bits i, Integral i) => Maybe [CT.StorageClass i] -> [ATree i] -> Either String [ATree i] +convertCallArgs = convertCallArgsWith True + +convertCallArgsWith :: (Ord i, Bits i, Integral i) => Bool -> Maybe [CT.StorageClass i] -> [ATree i] -> Either String [ATree i] +convertCallArgsWith validateUnsupported Nothing args = do + mapM_ (validateDeferredCallArg validateUnsupported) args + Right $ map defaultPromotedCallArg args +convertCallArgsWith validateUnsupported (Just paramTys) args + | actualArgCount < expectedArgCount = Left "too few arguments to function call" + | actualArgCount > expectedArgCount = Left "too many arguments to function call" + | otherwise = zipWithM convertTypedCallArg paramTys args + where + actualArgCount = length args + expectedArgCount = length paramTys + + convertTypedCallArg paramTy arg = do + validateDeferredCallArg validateUnsupported arg + if isInvalidFunctionPointerValue paramTy arg + || isInvalidObjectPointerValue paramTy arg + || isInvalidAggregateValueArgument paramTy arg + then + Left "invalid argument type to function call" + else Right $ atCast paramTy arg + +isInvalidAggregateValueArgument :: Eq i => CT.StorageClass i -> ATree i -> Bool +isInvalidAggregateValueArgument = isInvalidAggregateValueConversion + +validateDeferredCallArg :: (Ord i, Bits i, Integral i) => Bool -> ATree i -> Either String () +validateDeferredCallArg shouldValidate arg + | not shouldValidate = + Right () + | containsEscapingStmtExprControlFlow arg = + Left "unsupported control flow in function call argument" + | requiresUnsupportedNonAddressableArrayDecay arg = + Left "unsupported non-addressable array member decay" + | otherwise = + Right () + +defaultPromotedCallArg :: Ord i => ATree i -> ATree i +defaultPromotedCallArg = castExprType defaultPromotedCallArgType + +defaultPromotedCallArgType :: Ord i => CT.StorageClass i -> CT.StorageClass i +defaultPromotedCallArgType ty = CT.mapTypeKind (const promotedTy) decayedTy + where + decayedTy = decayExprType ty + promotedTy = CT.integerPromotedTypeKind $ CT.toTypeKind decayedTy + +integerPromotedExpr :: Eq i => ATree i -> ATree i +integerPromotedExpr = castExprType integerPromotedExprType + +integerPromotedExprType :: CT.StorageClass i -> CT.StorageClass i +integerPromotedExprType ty = CT.mapTypeKind (const promotedTy) ty + where + promotedTy = CT.integerPromotedTypeKind $ CT.toTypeKind ty + +castExprType :: Eq i => (CT.StorageClass i -> CT.StorageClass i) -> ATree i -> ATree i +castExprType f expr + | promotedTy == atype expr = expr + | otherwise = atCast promotedTy expr + where + promotedTy = f $ atype expr + +isFunctionType :: CT.StorageClass i -> Bool +isFunctionType ty = case CT.toTypeKind ty of + CT.CTFunc _ _ -> True + _ -> False + +isPointerType :: CT.StorageClass i -> Bool +isPointerType ty = case CT.toTypeKind ty of + CT.CTPtr _ -> True + _ -> False + +isScalarOperandType :: Ord i => CT.StorageClass i -> Bool +isScalarOperandType ty = + CT.isIntegral decayedTy || isPointerType decayedTy + where + decayedTy = decayExprType ty + +isIntegerOperandType :: Ord i => CT.StorageClass i -> Bool +isIntegerOperandType = + CT.isIntegral . decayExprType + +isModifiableLvalueType :: CT.StorageClass i -> Bool +isModifiableLvalueType ty = + not (CT.isCTArray ty) && not (isFunctionType ty) + +requireNonFunctionOperand + :: String + -> ATree i + -> Parser i (ATree i) +requireNonFunctionOperand opName expr + | isFunctionType (atype expr) = fail $ "invalid application of '" <> opName <> "' to function type" + | otherwise = pure expr + +requireScalarOperand + :: Ord i + => String + -> ATree i + -> Parser i (ATree i) +requireScalarOperand err expr + | isScalarOperandType (atype expr) = pure expr + | otherwise = fail err + +requireIntegerOperand + :: Ord i + => String + -> ATree i + -> Parser i (ATree i) +requireIntegerOperand err expr + | isIntegerOperandType (atype expr) = pure expr + | otherwise = fail err + +requirePointerArithmeticTarget + :: (Ord i, Bits i, Read i, Show i, Integral i) + => ATree i + -> Parser i () +requirePointerArithmeticTarget expr = case CT.deref (atype expr) of + Just ty + | isFunctionType ty -> + fail "invalid operands" + | otherwise -> + void $ requireCompleteObjectType "invalid use of pointer to incomplete type" ty + Nothing -> + pure () + +requirePointerArithmeticTargetAllowDeferred + :: (Ord i, Bits i, Read i, Show i, Integral i) + => ATree i + -> Parser i () +requirePointerArithmeticTargetAllowDeferred expr = case CT.deref (atype expr) of + Just ty + | isFunctionType ty -> + requirePointerArithmeticTarget expr + | CT.isCTIncomplete ty && isDeferredIncompletePointerArithmeticExpr expr -> + pure () + | otherwise -> + requirePointerArithmeticTarget expr + Nothing -> + requirePointerArithmeticTarget expr + +isDeferredIncompleteObjectExpr :: ATree i -> Bool +isDeferredIncompleteObjectExpr = \case + ATNode (ATLVar _ _) ty _ _ -> + not $ CT.isIncompleteArray ty + ATNode (ATGVar _ _) ty _ _ -> + not $ CT.isIncompleteArray ty + ATNode (ATMemberAcc _) ty _ _ -> + not $ CT.isIncompleteArray ty + _ -> + True + +isDeferredIncompletePointerArithmeticExpr :: ATree i -> Bool +isDeferredIncompletePointerArithmeticExpr = \case + ATNode ATAddr _ _ _ -> False + _ -> True + +resolveMemOperandType + :: (Ord i, Bits i, Read i, Show i, Integral i) + => String + -> ATree i + -> Parser i (CT.StorageClass i) +resolveMemOperandType err expr = do + resolvedTy <- gets (incomplete $ atype expr) + case resolvedTy of + Just ty + | CT.isIncompleteArray ty -> + if isDeferredIncompleteObjectExpr expr then pure (atype expr) else fail err + | otherwise -> + pure ty + Nothing -> + if isDeferredIncompleteObjectExpr expr then pure (atype expr) else fail err + +isModifiableLvalueExpr :: ATree i -> Bool +isModifiableLvalueExpr (ATNode kind ty lhs _) + | not $ isModifiableLvalueType ty = False + | otherwise = case kind of + ATLVar _ _ -> True + ATGVar _ _ -> True + ATMemberAcc _ -> isModifiableLvalueExpr lhs + ATDeref -> isAddressableDerefOperand lhs + _ -> False +isModifiableLvalueExpr _ = False + +isAddressableLvalueExpr :: ATree i -> Bool +isAddressableLvalueExpr (ATNode kind _ lhs _) = case kind of + ATLVar _ _ -> True + ATGVar _ _ -> True + ATMemberAcc _ -> isAddressableLvalueExpr lhs + ATDeref -> isAddressableDerefOperand lhs + _ -> False +isAddressableLvalueExpr _ = False + +isAddressableDerefOperand :: ATree i -> Bool +isAddressableDerefOperand ptr + | Just arrayExpr <- pointerIndexRootArrayOperand ptr = + isAddressableLvalueExpr arrayExpr + | CT.isArray (atype ptr) = + isAddressableLvalueExpr ptr +isAddressableDerefOperand _ = True + +pointerIndexOperands :: ATree i -> Maybe (ATree i, ATree i) +pointerIndexOperands (ATNode ATAddPtr _ arrayExpr idx) = Just (arrayExpr, idx) +pointerIndexOperands (ATNode ATSubPtr _ arrayExpr idx) = Just (arrayExpr, idx) +pointerIndexOperands _ = Nothing + +pointerIndexRootArrayOperand :: ATree i -> Maybe (ATree i) +pointerIndexRootArrayOperand ptr = do + root <- pointerIndexRootOperand ptr + if CT.isArray (atype root) then Just root else Nothing + where + pointerIndexRootOperand expr = do + (arrayExpr, _) <- pointerIndexOperands expr + case pointerIndexRootOperand arrayExpr of + Just root -> Just root + Nothing -> Just arrayExpr + +isAddressableUnaryOperand :: ATree i -> Bool +isAddressableUnaryOperand (ATNode kind _ lhs _) = case kind of + ATLVar _ _ -> True + ATGVar _ _ -> True + ATFuncPtr _ -> True + ATMemberAcc _ -> isAddressableLvalueExpr lhs + ATDeref -> isAddressableDerefOperand lhs + _ -> False +isAddressableUnaryOperand _ = False + +isUnevaluatedRvalueArrayElementLvalue :: ATree i -> Bool +isUnevaluatedRvalueArrayElementLvalue (ATNode ATDeref _ ptr _) + | Just arrayExpr <- pointerIndexRootArrayOperand ptr = + not $ isAddressableLvalueExpr arrayExpr + | CT.isArray (atype ptr) = + not $ isAddressableLvalueExpr ptr +isUnevaluatedRvalueArrayElementLvalue (ATNode (ATMemberAcc _) _ lhs _) = + isUnevaluatedRvalueArrayElementLvalue lhs +isUnevaluatedRvalueArrayElementLvalue _ = False + +rejectUnsupportedNonAddressableArrayDecay :: Ord i => ATree i -> Parser i () +rejectUnsupportedNonAddressableArrayDecay expr = do + shouldValidateUnsupported <- gets (not . suppressUnsupportedValueChecks) + when (shouldValidateUnsupported && requiresUnsupportedNonAddressableArrayDecay expr) $ + fail "unsupported non-addressable array member decay" + +rejectNonScalarCondition :: Ord i => ATree i -> Parser i () +rejectNonScalarCondition expr + | isScalarConditionType (atype expr) = pure () + | otherwise = fail "invalid condition type" + +isScalarConditionType :: Ord i => CT.StorageClass i -> Bool +isScalarConditionType ty = + CT.isIntegral decayedTy || case CT.toTypeKind decayedTy of + CT.CTPtr _ -> True + _ -> False + where + decayedTy = decayExprType ty + +withSuppressedUnsupportedValueChecks :: Parser i a -> Parser i a +withSuppressedUnsupportedValueChecks = + bracket + (gets suppressUnsupportedValueChecks <* modify (\cd -> cd { suppressUnsupportedValueChecks = True })) + (\restore -> modify (\cd -> cd { suppressUnsupportedValueChecks = restore })) + . const + +rejectingBinOp + :: Ord i + => (ATree i -> ATree i -> Parser i (ATree i)) + -> ATree i + -> ATree i + -> Parser i (ATree i) +rejectingBinOp op lhs rhs = do + rejectUnsupportedNonAddressableArrayDecay lhs + rejectUnsupportedNonAddressableArrayDecay rhs + op lhs rhs + +rejectingScalarBinOp + :: Ord i + => (ATree i -> ATree i -> Parser i (ATree i)) + -> ATree i + -> ATree i + -> Parser i (ATree i) +rejectingScalarBinOp op lhs rhs = do + void $ requireScalarOperand "invalid operands" lhs + void $ requireScalarOperand "invalid operands" rhs + rejectingBinOp op lhs rhs + +requireModifiableLvalue + :: String + -> ATree i + -> Parser i (ATree i) +requireModifiableLvalue err expr + | isModifiableLvalueExpr expr = pure expr + | otherwise = do + unsupportedChecksSuppressed <- gets suppressUnsupportedValueChecks + if unsupportedChecksSuppressed + && isModifiableLvalueType (atype expr) + && isUnevaluatedRvalueArrayElementLvalue expr + then pure expr + else fail err + +global, + stmt, + expr, + assign, + conditional, + logicalOr, + logicalAnd, + bitwiseOr, + bitwiseXor, + bitwiseAnd, + equality, + relational, + shift, + add, + term, + cast, + unary, + factor :: (Ord i, Bits i, Read i, Show i, Integral i) => Parser i (ATree i) + +global = do + pos <- getPosState + rejectInvalidFileScopeStorageClass + (declStorage, ty) <- declarationSpec + case declStorage of + TypedefDecl -> + globalDecl declStorage ty pos + _ -> + choice + [ ATEmpty <$ semi + , globalDecl declStorage ty pos + ] + where + rejectInvalidFileScopeStorageClass = + M.lookAhead $ + choice + [ kAuto *> fail "storage-class specifier is not allowed at file scope" + , kRegister *> fail "storage-class specifier is not allowed at file scope" + , pure () + ] + + globalDecl declStorage ty pos = captureFunctionParamScopes (declarator ty) >>= \case + ((_, Nothing), _) -> fail $ + if declStorage == TypedefDecl + then "typedef name omitted, expected unqualified-id" + else "variable name omitted, expected unqualified-id" + ((ty', Just ident), paramScopes) + | declStorage == TypedefDecl -> + typedefDecl ty' ident + | isFunctionType ty' -> modify resetLocal + *> choice + [ declaration ty' ident + , definition ty' ident pos paramScopes + ] + ((ty', Just ident), _) -> + requireNonVoidObjectType "variable declared void" ty' + *> gvarDecl declStorage ty ty' ident + + isFunctionType ty' = case CT.toTypeKind ty' of + CT.CTFunc _ _ -> True + _ -> False + + declaration ty ident = do + resolvedTy <- gets (`normalizeCompletedStorageClass` ty) + semi *> registerFunc False False resolvedTy ident $> ATEmpty + + typedefDecl ty ident = do + resolvedTy <- requireTypedefDeclType "typedef declaration has invalid array element type" ty + semi *> registerTypedef resolvedTy ident $> ATEmpty + + definition ty ident pos paramScopes = do + resolvedTy <- gets (`normalizeCompletedStorageClass` ty) + registerFunc True False resolvedTy ident + bracket get (modify . fallBack) $ const $ do + paramScope <- maybe + (fail "internal compiler error: missing function parameter scope") + pure + -- The function body's scope is the earliest prototype scope created + -- while parsing the declarator. Nested parameter prototypes and + -- trailing function suffixes are created later and must not leak. + (listToMaybe $ sortBy (comparing fpsScopeId) paramScopes) + params <- registerFunctionParams paramScope resolvedTy + functionBody >>= fromValidFunc resolvedTy params + where + registerFunctionParams paramScope fnTy = + enterFunctionScope paramScope + *> (mapM registerParam . zip [0 :: Int ..] =<< functionParamDecls fnTy) + where + registerParam (idx, (paramTy, mIdent)) = do + resolvedParamTy <- + requireCompleteObjectType + "declaration of variable with incomplete type" + paramTy + >>= requireSupportedByValueType + "unsupported by-value function parameter type" + registerLVar + resolvedParamTy + (fromMaybe (unnamedFunctionParamIdent idx) mIdent) + + enterFunctionScope paramScope = + modify $ \cd -> + cd + { scope = + (scope cd) + { curNestDepth = succ $ curNestDepth $ scope cd + , curScopeId = fpsScopeId paramScope + , structs = fpsTags paramScope + , enumerators = fpsEnumerators paramScope + } + } + + functionBody = atBlock <$> braces (M.many stmt) + + fromValidFunc fnTy params' st@(ATNode (ATBlock block) _ _ _) = do + when (hasInvalidStmtExprControlFlow st) $ + fail "unsupported control flow in statement expression" + case CT.toTypeKind fnTy of + CT.CTFunc retTy _ -> do + when (isUnsupportedByValueAggregateType (CT.SCAuto retTy)) $ + fail "unsupported by-value function return type" + if retTy == CT.CTVoid then + if isJust (find isNonEmptyReturn block) then + fail $ mconcat + [ "the return type of function '" + , T.unpack ident + , "' is void, but the statement returns a value" + ] + else do + when (hasInvalidAggregateReturnValue (CT.SCAuto retTy) st) $ + fail "invalid return type" + pure $ atDefFunc ident (if null params' then Nothing else Just params') fnTy st + else do + when (hasInvalidAggregateReturnValue (CT.SCAuto retTy) st) $ + fail "invalid return type" + when (isJust (find isEmptyReturn block)) $ + pushWarn pos $ mconcat + [ "the return type of function '" + , T.unpack ident + , "' is " + , show retTy + , ", but the statement returns no value" + ] + pure $ atDefFunc ident (if null params' then Nothing else Just params') fnTy st + _ -> + fail "internal compiler error" + + gvarDecl declStorage baseTy ty ident = choice + [ nonInit declStorage ty ident + , withInit baseTy ty ident + ] + nonInit declStorage ty ident + | declStorage == ExternDecl = + semi + *> (requireExternDeclObjectType "declaration of variable with incomplete type" ty + >>= \resolvedTy -> registerGVarWith (normalizeGlobalDeclType resolvedTy) ident PV.GVarInitWithExternDecl + ) + $> ATEmpty + | CT.isIncompleteArray ty && isValidTentativeFileScopeArrayType ty = + semi *> registerGVar (normalizeGlobalDeclType ty) ident $> ATEmpty + | CT.isIncompleteArray ty = + fail "defining global variables with a incomplete type" + | otherwise = + semi + >> requireCompleteObjectType "defining global variables with a incomplete type" ty + >>= flip (registerGVar . normalizeGlobalDeclType) ident + >> pure ATEmpty + + withInit baseTy ty ident = do + resolvedTy <- requireInitializedObjectType "defining global variables with a incomplete type" baseTy ty + void equal + (ty', initWith) <- parseGlobalVarInit resolvedTy ident + registerGVarWith (normalizeGlobalDeclType ty') ident initWith <* semi + +parseGlobalVarInit :: (Ord i, Bits i, Read i, Show i, Integral i) + => CT.StorageClass i + -> T.Text + -> Parser i (CT.StorageClass i, PV.GVarInitWith i) +parseGlobalVarInit ty ident = + bracket get (modify . fallBack) $ const $ do + ensureTargetGlobalVisible ty ident + let tempIdent = ".L.global.init." <> ident + ast <- varInit assign ty tempIdent + rejectIncompleteGlobalSelfReference ty ident ast + ty' <- maybeToParser "defining global variables with a incomplete type" + =<< gets (fmap PV.lvtype . lookupLVar tempIdent) + void $ either fail pure $ foldGlobalInitWith ty' ast + pure (ty', PV.GVarInitWithAST ast) + where + ensureTargetGlobalVisible declaredTy name = + void $ registerGVar (normalizeGlobalDeclType declaredTy) name + + rejectIncompleteGlobalSelfReference declaredTy name ast + | CT.isIncompleteArray declaredTy && containsGlobalRef name ast = + fail "invalid initializer in global variable" + | otherwise = + pure () + +containsGlobalRef :: T.Text -> ATree i -> Bool +containsGlobalRef name = go + where + go ATEmpty = False + go (ATNode kind _ lhs rhs) = + goKind kind || go lhs || go rhs + + goKind = \case + ATConditional cond tr fl -> + any go [cond, tr, fl] + ATSwitch cond cases -> + go cond || any go cases + ATFor kinds -> + any (go . fromATKindFor) kinds + ATBlock ats -> + any go ats + ATStmtExpr ats -> + any go ats + ATNull at -> + go at + ATDefFunc _ args -> + maybe False (any go) args + ATCallFunc _ args -> + maybe False (any go) args + ATCallPtr args -> + maybe False (any go) args + ATGVar _ ref -> + ref == name + _ -> + False + +hasInvalidAggregateReturnValue :: Eq i => CT.StorageClass i -> ATree i -> Bool +hasInvalidAggregateReturnValue returnTy = go + where + go ATEmpty = False + go (ATNode ATReturn _ ATEmpty _) = False + go (ATNode ATReturn _ returnedExpr _) = + isInvalidAggregateValueConversion returnTy returnedExpr + || go returnedExpr + go (ATNode ATSizeof _ _ _) = False + go (ATNode ATAlignof _ _ _) = False + go (ATNode kind _ lhs rhs) = + goKind kind || go lhs || go rhs + + goKind = \case + ATConditional cond tr fl -> + any go [cond, tr, fl] + ATSwitch cond cases -> + go cond || any go cases + ATFor kinds -> + any (go . fromATKindFor) kinds + ATBlock ats -> + any go ats + ATStmtExpr ats -> + any go ats + ATNull at -> + go at + ATDefFunc _ args -> + maybe False (any go) args + ATCallFunc _ args -> + maybe False (any go) args + ATCallPtr args -> + maybe False (any go) args + _ -> + False + +foldGlobalInitWith :: (Integral i, Bits i, Read i, Show i, Ord i) + => CT.StorageClass i + -> ATree i + -> Either String (PV.GVarInitWith i) +foldGlobalInitWith ty ast = do + entries <- globalInitEntries ast + PV.GVarInitWithData <$> finalizeGlobalInitData (CT.sizeof ty) entries + +globalInitEntries :: (Integral i, Bits i, Read i, Show i, Ord i) + => ATree i + -> Either String [(Natural, PV.GVarInitData i)] +globalInitEntries ATEmpty = Right [] +globalInitEntries (ATNode (ATBlock stmts) _ _ _) = concat <$> mapM globalInitEntries stmts +globalInitEntries (ATNode ATExprStmt _ expr _) = globalInitEntries expr +globalInitEntries (ATNode ATAssign _ lhs rhs) = do + offset <- maybe (Left "invalid initializer in global variable") Right $ globalInitByteOffset lhs + dat <- globalInitDatum (atype lhs) rhs + Right [(offset, dat)] +globalInitEntries _ = Left "invalid initializer in global variable" + +globalInitDatum :: (Integral i, Bits i, Read i, Show i, Ord i) + => CT.StorageClass i + -> ATree i + -> Either String (PV.GVarInitData i) +globalInitDatum ty rhs + | rejectsIncompatibleScalarInitializer = + Left "invalid initializer for scalar object" + | otherwise = globalInitReloc ty rhs >>= \case + Just (ref, addend) + | isRelocInitializerType ty -> pure $ PV.GVarInitReloc size ref addend + | otherwise -> Left "invalid initializer for scalar object" + Nothing -> do + val <- evalGlobalInitConstexpr ty rhs + Right $ + if val == 0 + then PV.GVarInitZeroBytes size + else PV.GVarInitBytes size val + where + size = CT.sizeof ty + rejectsIncompatibleScalarInitializer = + isInvalidFunctionPointerInitializer ty rhs + || isInvalidObjectPointerValue ty rhs + isRelocInitializerType = \case + CT.SCAuto (CT.CTPtr _) -> True + CT.SCRegister (CT.CTPtr _) -> True + CT.SCStatic (CT.CTPtr _) -> True + _ -> False + isFunctionPointerType sc = case CT.toTypeKind sc of + CT.CTPtr (CT.CTFunc _ _) -> True + _ -> False + +globalInitReloc :: (Integral i, Bits i, Read i, Show i) + => CT.StorageClass i + -> ATree i + -> Either String (Maybe (T.Text, Integer)) +globalInitReloc targetTy = \case + ATNode ATCast ty inner _ + | isPointerType ty + && (not (isFunctionPointerType targetTy) || isFunctionPointerType ty) -> + globalInitReloc targetTy inner + | otherwise -> Right Nothing + ATNode (ATNull inner) _ _ _ -> + globalInitReloc targetTy inner + ATNode ATExprStmt _ inner _ -> + globalInitReloc targetTy inner + ATNode (ATConditional cond ATEmpty el) _ _ _ -> + globalInitConditionalReloc cond cond el + ATNode (ATConditional cond th el) _ _ _ -> + globalInitConditionalReloc cond th el + ATNode (ATFuncPtr name) _ _ _ + | isFunctionPointerType targetTy -> Right $ Just (name, 0) + | otherwise -> Right Nothing + ast@(ATNode (ATGVar ty _) _ _ _) + | CT.isArray ty -> globalInitLvalueReloc targetTy ast + ast@(ATNode ATDeref ty _ _) + | CT.isArray ty -> globalInitLvalueReloc targetTy ast + ast@(ATNode (ATMemberAcc mem) _ _ _) + | CT.isArray (CT.smType mem) -> globalInitLvalueReloc targetTy ast + ATNode ATAddr _ inner _ -> globalInitLvalueReloc targetTy inner + ATNode ATAddPtr _ lhs rhs -> globalInitRelocWithAddend (+) lhs rhs + ATNode ATSubPtr _ lhs rhs -> globalInitRelocWithAddend (-) lhs rhs + _ -> Right Nothing + where + isPointerType ty = case CT.toTypeKind ty of + CT.CTPtr _ -> True + _ -> False + + isFunctionPointerType ty = case CT.toTypeKind ty of + CT.CTPtr (CT.CTFunc _ _) -> True + _ -> False + + offsetFromIndex lhs idx = + fromIntegral idx * maybe 0 (fromIntegral . CT.sizeof) (CT.deref $ atype lhs) + + globalInitConditionalReloc cond th el = case evalGlobalInitConditionMaybe cond of + Just True -> globalInitReloc targetTy th + Just False -> + globalInitReloc targetTy el + Nothing -> + Right Nothing + + globalInitRelocWithAddend op lhs rhs = + globalInitReloc targetTy lhs >>= \case + Just (name, addend) -> + Right $ (\idx -> Just (name, op addend (offsetFromIndex lhs idx))) =<< evalConstexprMaybe rhs + Nothing -> Right Nothing + + evalConstexprMaybe expr = either (const Nothing) Just $ evalConstexprTree expr + +globalInitLvalueReloc :: (Integral i, Bits i, Read i, Show i) + => CT.StorageClass i + -> ATree i + -> Either String (Maybe (T.Text, Integer)) +globalInitLvalueReloc targetTy = \case + ATNode (ATFuncPtr name) _ _ _ + | isFunctionPointerType targetTy -> Right $ Just (name, 0) + | otherwise -> Right Nothing + ATNode (ATGVar _ name) _ _ _ + | isFunctionPointerType targetTy -> Right Nothing + | otherwise -> Right $ Just (name, 0) + ATNode (ATMemberAcc mem) _ lhs _ -> do + globalInitLvalueReloc targetTy lhs <&> fmap (second (+ fromIntegral (CT.smOffset mem))) + ATNode ATDeref _ ptr _ -> globalInitReloc targetTy ptr + ATNode ATCast _ lhs _ -> globalInitLvalueReloc targetTy lhs + _ -> Right Nothing + where + isFunctionPointerType ty = case CT.toTypeKind ty of + CT.CTPtr (CT.CTFunc _ _) -> True + _ -> False + +applyConstexprCast :: (Bits i, Integral i) => CT.StorageClass i -> i -> i +applyConstexprCast ty val + | CT.toTypeKind ty == CT.CTBool = fromIntegral $ fromEnum $ val /= 0 + | otherwise = truncateToWidth (CT.sizeof ty) val + where + truncateToWidth sz x + | sz == 0 = 0 + | otherwise = fromInteger $ signExtend width $ toInteger x + where + width = fromIntegral $ sz * 8 + + signExtend width x = + if width <= 0 + then 0 + else + let modulus = bit width :: Integer + mask = pred modulus + truncated = x .&. mask + signBit = bit (pred width) :: Integer + in + if truncated .&. signBit == 0 + then truncated + else truncated - modulus + +evalGlobalInitConstexpr :: (Bits i, Integral i, Show i, Read i) => CT.StorageClass i -> ATree i -> Either String i +evalGlobalInitConstexpr ty rhs = + case CT.toTypeKind ty of + CT.CTPtr _ -> evalPointerNullGlobalInit rhs + _ -> applyConstexprCast ty <$> evalConstexprTree rhs + where + evalPointerNullGlobalInit ast = + evalPointerNullConstexpr ast >>= rejectUnlessZero + + rejectUnlessZero val + | val == 0 = pure 0 + | otherwise = Left "initializer element is not constant" + +evalGlobalInitConditionMaybe :: (Bits i, Integral i, Show i, Read i) => ATree i -> Maybe Bool +evalGlobalInitConditionMaybe = either (const Nothing) Just . evalGlobalInitCondition + +evalGlobalInitCondition :: (Bits i, Integral i, Show i, Read i) => ATree i -> Either String Bool +evalGlobalInitCondition ast = + case either (const Nothing) (Just . (/= 0)) $ evalConstexprTree ast of + Just truthy -> + pure truthy + Nothing -> case evalPointerNullConstexpr ast of + Right 0 -> + pure False + _ -> + maybe (Left "initializer element is not constant") pure $ + evalGlobalInitAddressCondition ast + +evalGlobalInitAddressCondition :: (Bits i, Integral i, Show i, Read i) => ATree i -> Maybe Bool +evalGlobalInitAddressCondition = \case + ATNode ATCast ty inner _ + | isPointerType ty -> + evalGlobalInitAddressCondition inner + ATNode (ATNull inner) _ _ _ -> + evalGlobalInitAddressCondition inner + ATNode ATExprStmt _ inner _ -> + evalGlobalInitAddressCondition inner + ATNode (ATConditional cond ATEmpty el) _ _ _ -> + evalGlobalInitConditionMaybe cond >>= \case + True -> + pure True + False -> + evalGlobalInitConditionMaybe el + ATNode (ATConditional cond th el) _ _ _ -> + evalGlobalInitConditionMaybe cond >>= \case + True -> + evalGlobalInitConditionMaybe th + False -> + evalGlobalInitConditionMaybe el + ATNode (ATFuncPtr _) _ _ _ -> + Just True + ast@(ATNode (ATGVar ty _) _ _ _) + | CT.isArray ty -> + globalInitLvalueCondition ast + ast@(ATNode ATDeref ty _ _) + | CT.isArray ty -> + globalInitLvalueCondition ast + ast@(ATNode (ATMemberAcc mem) _ _ _) + | CT.isArray (CT.smType mem) -> + globalInitLvalueCondition ast + ATNode ATAddr _ inner _ -> + globalInitLvalueCondition inner + ATNode ATAddPtr _ lhs rhs -> + globalInitAddressAdditiveCondition lhs rhs + ATNode ATSubPtr _ lhs rhs -> + globalInitAddressAdditiveCondition lhs rhs + _ -> + Nothing + where + isPointerType ty = case CT.toTypeKind ty of + CT.CTPtr _ -> True + _ -> False + + globalInitAddressAdditiveCondition lhs rhs = + globalInitAddressConditionBase lhs >> evalConstexprMaybe rhs >> pure True + + globalInitAddressConditionBase = \case + ATNode ATCast ty inner _ + | isPointerType ty -> + globalInitAddressConditionBase inner + ATNode (ATNull inner) _ _ _ -> + globalInitAddressConditionBase inner + ATNode ATExprStmt _ inner _ -> + globalInitAddressConditionBase inner + ATNode (ATConditional cond ATEmpty el) _ _ _ -> + evalGlobalInitConditionMaybe cond >>= \case + True -> + globalInitAddressConditionBase cond + False -> + globalInitAddressConditionBase el + ATNode (ATConditional cond th el) _ _ _ -> + evalGlobalInitConditionMaybe cond >>= \case + True -> + globalInitAddressConditionBase th + False -> + globalInitAddressConditionBase el + ATNode (ATFuncPtr _) _ _ _ -> + Just () + ast@(ATNode (ATGVar ty _) _ _ _) + | CT.isArray ty -> + globalInitLvalueConditionBase ast + ast@(ATNode ATDeref ty _ _) + | CT.isArray ty -> + globalInitLvalueConditionBase ast + ast@(ATNode (ATMemberAcc mem) _ _ _) + | CT.isArray (CT.smType mem) -> + globalInitLvalueConditionBase ast + ATNode ATAddr _ inner _ -> + globalInitLvalueConditionBase inner + ATNode ATAddPtr _ lhs rhs -> + globalInitAddressConditionBase lhs >> evalConstexprMaybe rhs >> pure () + ATNode ATSubPtr _ lhs rhs -> + globalInitAddressConditionBase lhs >> evalConstexprMaybe rhs >> pure () + _ -> + Nothing + + globalInitLvalueCondition = fmap (const True) . globalInitLvalueConditionBase + + globalInitLvalueConditionBase = \case + ATNode (ATFuncPtr _) _ _ _ -> + Just () + ATNode (ATGVar _ _) _ _ _ -> + Just () + ATNode (ATMemberAcc _) _ lhs _ -> + globalInitLvalueConditionBase lhs + ATNode ATDeref _ ptr _ -> + globalInitAddressConditionBase ptr + ATNode ATCast _ lhs _ -> + globalInitLvalueConditionBase lhs + _ -> + Nothing + + evalConstexprMaybe expr = either (const Nothing) Just $ evalConstexprTree expr + +evalPointerNullConstexpr :: (Bits i, Integral i, Show i, Read i) => ATree i -> Either String i +evalPointerNullConstexpr (ATNode ATCast castTy inner _) + | isPointerStorageClass castTy = + applyConstexprCast castTy <$> evalPointerNullConstexpr inner +evalPointerNullConstexpr ast = + evalConstexprTree ast + +isPointerStorageClass :: CT.StorageClass i -> Bool +isPointerStorageClass ty = case CT.toTypeKind ty of + CT.CTPtr _ -> True + _ -> False + +evalConstexprTree :: (Bits i, Integral i, Show i, Read i) => ATree i -> Either String i +evalConstexprTree = \case + ATNode k ty lhs rhs -> case k of + ATAdd -> binop (+) + ATSub -> binop (-) + ATMul -> binop (*) + ATDiv -> nonZeroBinop quot + ATMod -> nonZeroBinop rem + ATAnd -> binop (.&.) + ATXor -> binop xor + ATOr -> binop (.|.) + ATShl -> shiftBinop shiftL + ATShr -> shiftBinop shiftR + ATEQ -> binop (fromBool .: (==)) + ATNEQ -> binop (fromBool .: (/=)) + ATLT -> binop (fromBool .: (<)) + ATGT -> binop (fromBool .: (>)) + ATLEQ -> binop (fromBool .: (<=)) + ATGEQ -> binop (fromBool .: (>=)) + ATConditional cn th el -> + evalGlobalInitCondition cn >>= \cond -> + if cond + then evalConstexprTree $ + case th of + ATEmpty -> cn + _ -> th + else evalConstexprTree el + ATNot -> fromBool . not <$> evalGlobalInitCondition lhs + ATBitNot -> complement <$> evalConstexprTree lhs + ATLAnd -> evalGlobalInitCondition lhs >>= logicalAnd + ATLOr -> evalGlobalInitCondition lhs >>= logicalOr + ATSizeof -> memOp "sizeof" CT.sizeof lhs + ATAlignof -> memOp "_Alignof" CT.alignof lhs + ATCast + | isConstexprArithmeticCastType ty -> applyConstexprCast ty <$> evalConstexprTree lhs + | otherwise -> Left "initializer element is not constant" + ATNum v -> pure v + _ -> Left "initializer element is not constant" + where + binop f = evalConstexprTree lhs >>= \lhs' -> f lhs' <$> evalConstexprTree rhs + shiftBinop f = + evalConstexprTree lhs >>= \lhs' -> + evalConstexprTree rhs >>= \rhs' -> + case shiftCount rhs' of + Nothing -> Left "initializer element is not constant" + Just count' -> pure $ f lhs' count' + shiftCount n + | n < 0 = Nothing + | toInteger n >= shiftWidth = Nothing + | toInteger n > toInteger (maxBound :: Int) = Nothing + | otherwise = Just $ fromIntegral n + where + shiftWidth = toInteger (CT.sizeof ty) * 8 + logicalAnd lhs' + | not lhs' = pure $ fromBool False + | otherwise = fromBool <$> evalGlobalInitCondition rhs + logicalOr lhs' + | lhs' = pure $ fromBool True + | otherwise = fromBool <$> evalGlobalInitCondition rhs + nonZeroBinop f = + evalConstexprTree lhs >>= \lhs' -> + evalConstexprTree rhs >>= \rhs' -> + if rhs' == 0 + then Left "initializer element is not constant" + else pure (f lhs' rhs') + memOp opName op expr + | hasIncompleteObjectType (atype expr) = + Left $ "invalid application of '" <> opName <> "' to incomplete type" + | otherwise = + pure $ fromIntegral $ op $ atype expr + fromBool = fromIntegral . fromEnum + (.:) f g x y = f (g x y) + _ -> Left "initializer element is not constant" + +isConstexprArithmeticCastType :: CT.StorageClass i -> Bool +isConstexprArithmeticCastType = \case + CT.SCAuto ty -> go ty + CT.SCRegister ty -> go ty + CT.SCStatic ty -> go ty + CT.SCUndef ty -> go ty + where + go = \case + CT.CTInt -> True + CT.CTChar -> True + CT.CTBool -> True + CT.CTEnum _ _ -> True + CT.CTSigned CT.CTUndef -> True + CT.CTShort CT.CTUndef -> True + CT.CTLong CT.CTUndef -> True + CT.CTSigned ty -> go ty + CT.CTShort ty -> go ty + CT.CTLong ty -> go ty + _ -> False + +finalizeGlobalInitData :: Natural -> [(Natural, PV.GVarInitData i)] -> Either String [PV.GVarInitData i] +finalizeGlobalInitData totalBytes entries = mergeGlobalInitData <$> go 0 sorted + where + sorted = sortBy (comparing fst) entries + + go offset [] = + pure [PV.GVarInitZeroBytes $ totalBytes - offset | offset < totalBytes] + go offset ((nextOffset, dat):rest) + | nextOffset < offset = Left "internal compiler error: overlapping global initializer" + | otherwise = do + suffix <- go (nextOffset + globalInitDataSize dat) rest + pure $ + zeroGap offset nextOffset <> [dat] <> suffix + + zeroGap cur nxt + | cur < nxt = [PV.GVarInitZeroBytes $ nxt - cur] + | otherwise = [] + +globalInitDataSize :: PV.GVarInitData i -> Natural +globalInitDataSize = \case + PV.GVarInitZeroBytes sz -> sz + PV.GVarInitBytes sz _ -> sz + PV.GVarInitReloc sz _ _ -> sz + +mergeGlobalInitData :: [PV.GVarInitData i] -> [PV.GVarInitData i] +mergeGlobalInitData = foldr step [] + where + step (PV.GVarInitZeroBytes sz) (PV.GVarInitZeroBytes sz' : rest) = + PV.GVarInitZeroBytes (sz + sz') : rest + step dat acc = dat : acc + +globalInitByteOffset :: Integral i => ATree i -> Maybe Natural +globalInitByteOffset = \case + ATNode (ATLVar _ _) _ _ _ -> Just 0 + ATNode (ATMemberAcc mem) _ lhs _ -> (+ CT.smOffset mem) <$> globalInitByteOffset lhs + ATNode ATDeref _ ptr _ -> globalInitByteOffset ptr + ATNode ATAddr _ lhs _ -> globalInitByteOffset lhs + ATNode ATCast _ lhs _ -> globalInitByteOffset lhs + ATNode ATAddPtr _ lhs (ATNode (ATNum idx) _ _ _) -> + (+ offsetFromIndex lhs idx) <$> globalInitByteOffset lhs + _ -> Nothing + where + offsetFromIndex lhs idx = fromIntegral idx * maybe 0 CT.sizeof (CT.deref $ atype lhs) + +compoundStmt :: (Ord i, Bits i, Read i, Show i, Integral i) => Parser i [ATree i] +compoundStmt = bracket get (modify . fallBack) $ const $ + braces (modify succNest *> M.many stmt) + +stmt = choice + [ returnStmt + , ifStmt + , whileStmt + , forStmt + , breakStmt + , continueStmt + , switchStmt + , caseStmt + , defaultStmt + , gotoStmt + , labelStmt + , atBlock <$> compoundStmt + , lvarStmt + , exprStmt + , ATEmpty <$ semi + ] + where + returnStmt = choice + [ atReturn (CT.SCUndef CT.CTUndef) ATEmpty <$ M.try (kReturn *> semi) + , do + ret <- M.try kReturn *> expr + rejectUnsupportedNonAddressableArrayDecay ret + atReturn (CT.SCUndef CT.CTUndef) ret <$ semi + ] + + exprStmt = do + nd <- expr + rejectUnsupportedNonAddressableArrayDecay nd + atExprStmt nd <$ semi + + ifStmt = do + cond <- M.try kIf >> parens expr + rejectNonScalarCondition cond + rejectUnsupportedNonAddressableArrayDecay cond + r <- atIf cond <$> stmt + M.option ATEmpty (M.try kElse >> stmt) <&> \case + ATEmpty -> r + nd -> atElse r nd + + whileStmt = do + cond <- M.try kWhile >> parens expr + rejectNonScalarCondition cond + rejectUnsupportedNonAddressableArrayDecay cond + atWhile cond <$> stmt + + forStmt = (>>) (M.try kFor) $ bracket get (modify . fallBack) $ const $ do + es <- parens $ do + modify succNest + initSect <- ATForInit + <$> choice [ATEmpty <$ semi, M.try exprStmt, lvarStmt] + condSect <- ATForCond + <$> choice [ATEmpty <$ semi, checkedCondition <* semi] + incrSect <- ATForIncr + <$> M.option ATEmpty exprStmtNoSemi + pure [initSect, condSect, incrSect] + atFor (es <> [ATForStmt ATEmpty]) <$ semi + M.<|> atFor . (es <>) . (:[]) . ATForStmt <$> stmt + where + checkedCondition = do + nd <- expr + rejectNonScalarCondition nd + rejectUnsupportedNonAddressableArrayDecay nd + pure nd + + checkedExpr = do + nd <- expr + rejectUnsupportedNonAddressableArrayDecay nd + pure nd + + exprStmtNoSemi = + atExprStmt <$> checkedExpr + + breakStmt = atBreak <$ (M.try kBreak *> semi) + + continueStmt = atContinue <$ (M.try kContinue *> semi) + + switchStmt = do + cond <- M.try kSwitch *> parens expr + rejectNonScalarCondition cond + rejectUnsupportedNonAddressableArrayDecay cond + bracket (putSwitchState True) (const $ putSwitchState False) (const stmt) + >>= \case + ATNode (ATBlock ats) ty _ _ -> pure $ atSwitch cond ats ty + _ -> fail "expected compound statement after the token ')'" + where + putSwitchState b = modify $ \scp -> scp { isSwitchStmt = b } + + caseStmt = M.try kCase + *> ifM (gets isSwitchStmt) + ((atCase 0 <$> evalConstexpr <* colon) <*> stmt) + (fail "stray 'case'") + + defaultStmt = (M.try kDefault <* colon) + *> ifM (gets isSwitchStmt) + (atDefault 0 <$> stmt) + (fail "stray 'default'") + + gotoStmt = atGoto <$> (M.try kGoto *> identifier <* semi) + + labelStmt = atLabel <$> M.try (identifier <* colon) + + lvarStmt = do + (declStorage, ty) <- M.try declarationSpec + case declStorage of + TypedefDecl -> + declLVar declStorage ty + _ -> + M.choice + [ standaloneDecl + , declLVar declStorage ty + ] + where + standaloneDecl = ATEmpty <$ semi + + declLVar declStorage ty = captureFunctionParamScopes (declarator ty) >>= \case + ((_, Nothing), _) -> fail $ + if declStorage == TypedefDecl + then "typedef name omitted, expected unqualified-id" + else "variable name omitted, expected unqualified-id" + ((ty', Just ident), _) -> + case declStorage of + TypedefDecl -> + typedefDecl ty' ident + ExternDecl + | isFunctionType ty' -> + do + resolvedTy <- gets (`normalizeCompletedStorageClass` ty') + semi *> registerFunc False False resolvedTy ident $> ATEmpty + | otherwise -> + requireNonVoidObjectType "variable declared void" ty' + *> externDecl ty' ident + OrdinaryDecl -> + if isFunctionType ty' + then blockScopeFunctionDecl ty' ident + else + ordinaryObjectDecl ty ty' ident + AutoDecl -> + if isFunctionType ty' + then fail "invalid storage-class specifier for block-scope function declaration" + else ordinaryObjectDecl ty ty' ident + + nonInit ty ident = + requireCompleteObjectType "declaration of variable with incomplete type" ty + >>= \resolvedTy -> + semi *> registerLVar resolvedTy ident <&> atNull + withInit baseTy ty ident = do + resolvedTy <- + requireInitializedObjectType "declaration of variable with incomplete type" baseTy ty + equal *> varInit assign resolvedTy ident <* semi + ordinaryObjectDecl baseTy ty ident = + requireNonVoidObjectType "variable declared void" ty + *> M.choice + [ nonInit ty ident + , withInit baseTy ty ident + ] + blockScopeFunctionDecl ty ident = do + rejectInvalidBlockScopeFunctionStorage ty + resolvedTy <- gets (`normalizeCompletedStorageClass` ty) + semi *> registerFunc False False resolvedTy ident $> ATEmpty + rejectInvalidBlockScopeFunctionStorage = \case + CT.SCStatic _ -> + fail "invalid storage-class specifier for block-scope function declaration" + CT.SCRegister _ -> + fail "invalid storage-class specifier for block-scope function declaration" + _ -> + pure () + typedefDecl ty ident = do + resolvedTy <- requireTypedefDeclType "typedef declaration has invalid array element type" ty + semi *> registerTypedef resolvedTy ident $> ATEmpty + externDecl ty ident = + M.choice + [ equal *> fail "initializer is not allowed in block scope extern declaration" + , semi + *> ( requireExternDeclObjectType + "declaration of variable with incomplete type" + ty + >>= \resolvedTy -> + registerGVarWith + (normalizeGlobalDeclType resolvedTy) + ident + PV.GVarInitWithExternDecl + ) + $> ATEmpty + ] + +expr = assign >>= go + where + go lhs = M.option lhs $ do + void comma + rhs <- assign + rejectUnsupportedNonAddressableArrayDecay lhs + rejectUnsupportedNonAddressableArrayDecay rhs + go $ ATNode ATComma (decayExprType $ atype rhs) lhs rhs + +assign = do + nd <- conditional + M.option nd $ choice $ map (`id` nd) + [ assignOp ATAssign "=" + , assignOp ATMulAssign "*=" + , assignOp ATDivAssign "/=" + , assignOp ATAndAssign "&=" + , assignOp ATOrAssign "|=" + , assignOp ATXorAssign "^=" + , assignOp ATShlAssign "<<=" + , assignOp ATShrAssign ">>=" + , assignOp (maybe ATAddAssign (const ATAddPtrAssign) $ CT.deref (atype nd)) "+=" + , assignOp (maybe ATSubAssign (const ATSubPtrAssign) $ CT.deref (atype nd)) "-=" + ] + where + assignOp k s nd = symbol s *> do + lhs <- requireModifiableLvalue "lvalue required as left operand of assignment" nd + requireCompletePointerArithmetic k lhs + rhs <- assign + requireCompatibleAssignmentOperands k lhs rhs + rejectUnsupportedNonAddressableArrayDecay rhs + pure $ ATNode k (atype lhs) lhs rhs + + requireCompletePointerArithmetic kind expr = case kind of + ATAddPtrAssign -> + requirePointerArithmeticTargetAllowDeferred expr + ATSubPtrAssign -> + requirePointerArithmeticTargetAllowDeferred expr + _ -> + pure () + + requireCompatibleAssignmentOperands kind lhs rhs + | isInvalidFunctionPointerValue (atype lhs) rhs = + fail "invalid operands to assignment" + | kind == ATAssign && isInvalidObjectPointerValue (atype lhs) rhs = + fail "invalid operands to assignment" + | kind == ATAssign && isInvalidAggregateValueConversion (atype lhs) rhs = + fail "invalid operands to assignment" + | kind /= ATAssign && isInvalidCompoundAssignmentOperands kind lhs rhs = + fail "invalid operands to assignment" + | otherwise = + pure () + + isInvalidCompoundAssignmentOperands kind lhs rhs = + not $ case kind of + ATAddPtrAssign -> + isPointerType (decayExprType $ atype lhs) + && isIntegerOperandType (atype rhs) + ATSubPtrAssign -> + isPointerType (decayExprType $ atype lhs) + && isIntegerOperandType (atype rhs) + ATAddAssign -> + integerOperands + ATSubAssign -> + integerOperands + ATMulAssign -> + integerOperands + ATDivAssign -> + integerOperands + ATAndAssign -> + integerOperands + ATOrAssign -> + integerOperands + ATXorAssign -> + integerOperands + ATShlAssign -> + integerOperands + ATShrAssign -> + integerOperands + _ -> + False + where + integerOperands = + isIntegerOperandType (atype lhs) + && isIntegerOperandType (atype rhs) + +conditional = do + nd <- logicalOr + ifM + (M.option False (True <$ M.lookAhead question)) + (rejectNonScalarCondition nd >> (GNU.condOmitted nd M.<|> condOp nd)) + $ pure nd + where + condOp nd = do + rejectUnsupportedNonAddressableArrayDecay nd + th <- question *> expr <* colon + el <- conditional + ty <- maybeToParser "invalid operands" $ conditionalResultType th el + rejectUnsupportedNonAddressableArrayDecay th + rejectUnsupportedNonAddressableArrayDecay el + pure $ atConditional ty nd th el + +logicalOr = binaryOperator logicalAnd [(symbol "||", rejectingScalarBinOp $ binOpBool ATLOr)] + +logicalAnd = binaryOperator bitwiseOr [(symbol "&&", rejectingScalarBinOp $ binOpBool ATLAnd)] + +bitwiseOr = binaryOperator bitwiseXor [(vertical, rejectingBinOp $ binOpIntOnly ATOr)] + +bitwiseXor = binaryOperator bitwiseAnd [(hat, rejectingBinOp $ binOpIntOnly ATXor)] + +bitwiseAnd = binaryOperator equality [(MC.char '&' `notFollowedOp` MC.char '&', rejectingBinOp $ binOpIntOnly ATAnd)] + +equality = binaryOperator relational + [ (symbol "==", rejectingScalarBinOp $ binOpBool ATEQ) + , (symbol "!=", rejectingScalarBinOp $ binOpBool ATNEQ) + ] + +relational = binaryOperator shift + [ (symbol "<=", rejectingScalarBinOp $ binOpBool ATLEQ) + , (langle, rejectingScalarBinOp $ binOpBool ATLT) + , (symbol ">=", rejectingScalarBinOp $ binOpBool ATGEQ) + , (rangle, rejectingScalarBinOp $ binOpBool ATGT) + ] + +shift = binaryOperator add + [ (symbol "<<", rejectingBinOp $ binOpIntOnly ATShl) + , (symbol ">>", rejectingBinOp $ binOpIntOnly ATShr) + ] + +add = binaryOperator term + [ (symbol "+", pointerAwareBinaryOp addKind) + , (symbol "-", pointerAwareBinaryOp subKind) + ] + where + pointerAwareBinaryOp mk l r = do + node <- maybeToParser "invalid operands" $ mk (decayIncompleteArrayExpr l) (decayIncompleteArrayExpr r) + requireCompletePointerArithmeticNode node + rejectUnsupportedImmediateValueUse node + pure node + + rejectUnsupportedImmediateValueUse = \case + ATNode ATAddPtr _ _ _ -> pure () + ATNode ATSubPtr _ _ _ -> pure () + node -> rejectUnsupportedNonAddressableArrayDecay node + + requireCompletePointerArithmeticNode = \case + ATNode ATAddPtr _ ptr _ -> + requirePointerArithmeticTargetAllowDeferred ptr + ATNode ATSubPtr _ ptr _ -> + requirePointerArithmeticTargetAllowDeferred ptr + ATNode ATPtrDis _ lhs rhs -> + requirePointerArithmeticTargetAllowDeferred lhs + >> requirePointerArithmeticTargetAllowDeferred rhs + _ -> pure () + +term = binaryOperator cast + [ (star, rejectingBinOp $ binOpIntOnly ATMul) + , (slash, rejectingBinOp $ binOpIntOnly ATDiv) + , (percent, rejectingBinOp $ binOpIntOnly ATMod) + ] + +cast = choice + [ do + ty <- M.try (parens absDeclarator) + operand <- cast + rejectAggregateCast ty operand + rejectUnsupportedNonAddressableArrayDecay operand + pure $ atCast ty operand + , unary + ] + +rejectAggregateCast :: CT.StorageClass i -> ATree i -> Parser i () +rejectAggregateCast ty operand + | isAggregateType ty = fail "invalid cast type" + | not (isVoidType ty) && isAggregateType (atype operand) = fail "invalid cast operand" + | otherwise = pure () + where + isAggregateType aggregateTy = + CT.isCTStruct aggregateTy || CT.isIncompleteStruct aggregateTy + + isVoidType voidTy = case CT.toTypeKind voidTy of + CT.CTVoid -> True + _ -> False + +unary = choice + [ symbol "++" *> checkedIncDecOperand "lvalue required as increment operand" <&> \n -> ATNode ATPreInc (atype n) n ATEmpty + , symbol "--" *> checkedIncDecOperand "lvalue required as decrement operand" <&> \n -> ATNode ATPreDec (atype n) n ATEmpty + , symbol "+" *> checkedUnaryIntegerOperand "+" <&> integerPromotedExpr + , symbol "-" *> checkedUnaryIntegerOperand "-" <&> \n -> + let promoted = integerPromotedExpr n + in ATNode ATSub (atype promoted) (atNumLit 0) promoted + , lnot *> checkedUnaryScalarOperand <&> flip (ATNode ATNot (CT.SCAuto CT.CTBool)) ATEmpty + , tilda *> checkedUnaryIntegerOperand "~" <&> \n -> + let promoted = integerPromotedExpr n + in ATNode ATBitNot (atype promoted) promoted ATEmpty + , addr + , star *> checkedUnaryDerefOperand >>= deref' + , factor' + ] + where + checkedUnaryDecayOperand = do + n <- unary + rejectUnsupportedNonAddressableArrayDecay n + pure n + + checkedUnaryDerefOperand = + unary + + checkedUnaryScalarOperand = + checkedUnaryDecayOperand >>= requireScalarOperand "invalid operands" + + checkedUnaryIntegerOperand op = + checkedUnaryDecayOperand + >>= requireNonFunctionOperand op + >>= requireIntegerOperand "invalid operands" + + checkedIncDecOperand err = + unary + >>= requireModifiableLvalue err + >>= requireScalarOperand "invalid operands" + + addr = do + n <- MC.char '&' `notFollowedOp` MC.char '&' >> withSuppressedUnsupportedValueChecks unary + unsupportedChecksSuppressed <- gets suppressUnsupportedValueChecks + let + canTakeAddress = + isAddressableUnaryOperand n + || ( unsupportedChecksSuppressed + && isUnevaluatedRvalueArrayElementLvalue n + ) + unless canTakeAddress $ + fail "lvalue required as unary '&' operand" + let node = atUnary ATAddr (CT.mapTypeKind CT.CTPtr $ atype n) n + rejectUnsupportedNonAddressableArrayDecay node + pure node + + factor' = factor >>= allAcc + where + allAcc fac = M.option fac $ choice + [ callAcc fac + , idxAcc fac + , memberAcc fac + , ptrMemberAcc fac + , postInc fac + , postDec fac + ] + + callAcc fac = do + rawParams <- callArgs + (callTy, formalParamTys) <- maybe + (fail "called object is not a function or function pointer") + pure + (callableSignature $ atype fac) + params <- applyCallArgConversions formalParamTys rawParams + let + params' = if null params then Nothing else Just params + shouldValidateUnsupported <- gets (not . suppressUnsupportedValueChecks) + when (shouldValidateUnsupported && isJust params' && containsEscapingStmtExprControlFlow fac) $ + fail "unsupported control flow in function call callee" + allAcc =<< case fac of + ATNode (ATFuncPtr name) _ _ _ -> + pure $ atNoLeaf (ATCallFunc name params') callTy + _ -> + pure $ ATNode (ATCallPtr params') callTy fac ATEmpty + + idxAcc fac = do + idx <- brackets expr + kt <- maybeToParser "invalid operands" (addKind (decayIncompleteArrayExpr fac) idx) + ty <- maybeToParser "subscripted value is neither array nor pointer nor vector" $ derefObjectType $ atype kt + ty' <- resolveDerefObjectType "incomplete value dereference" ty + allAcc $ atUnary ATDeref ty' kt + + memberAcc fac = do + member <- period *> identifier + structTy <- resolveMemOperandType "invalid use of incomplete type" fac + member' <- lookupStructMember member structTy + allAcc $ atMemberAcc member' (withType structTy fac) + + ptrMemberAcc fac = do + member <- M.try (symbol "->") *> identifier + rejectUnsupportedNonAddressableArrayDecay fac + structTy <- maybeToParser "invalid type argument of '->'" $ derefObjectType $ atype fac + structTy' <- resolveDerefObjectType "invalid use of pointer to incomplete type" structTy + member' <- lookupStructMember member structTy' + allAcc $ atMemberAcc member' (atUnary ATDeref structTy' fac) + + postInc fac = do + _ <- symbol "++" + fac' <- + requireModifiableLvalue "lvalue required as increment operand" fac + >>= requireScalarOperand "invalid operands" + allAcc $ atUnary ATPostInc (atype fac') fac' + + postDec fac = do + _ <- symbol "--" + fac' <- + requireModifiableLvalue "lvalue required as decrement operand" fac + >>= requireScalarOperand "invalid operands" + allAcc $ atUnary ATPostDec (atype fac') fac' + + lookupStructMember member ty = + maybeToParser + ("no member named '" <> T.unpack member <> "'") + (CT.lookupMember member $ CT.toTypeKind ty) + + withType ty (ATNode kind _ lhs rhs) = ATNode kind ty lhs rhs + withType _ ATEmpty = ATEmpty + + deref' = runMaybeT . deref'' >=> maybe M.empty pure + where + deref'' n + | isFunctionType (atype n) = do + lift $ rejectUnsupportedNonAddressableArrayDecay n + pure n + deref'' n = do + ty <- MaybeT $ pure (derefObjectType $ atype n) + case CT.toTypeKind ty of + CT.CTVoid -> lift $ fail "void value not ignored as it ought to be" + _ -> do + ty' <- lift $ resolveDerefObjectType "incomplete value dereference" ty + pure $ atUnary ATDeref ty' (derefOperand n) + + derefOperand n + | isPointerArithmeticNode n = + n + | CT.isArray (atype n) = + ATNode ATAddPtr (atype n) n (atNumLit 0) + | otherwise = + n + + isPointerArithmeticNode (ATNode ATAddPtr _ _ _) = True + isPointerArithmeticNode (ATNode ATSubPtr _ _ _) = True + isPointerArithmeticNode _ = False + +callArgs :: (Ord i, Bits i, Read i, Show i, Integral i) => Parser i [ATree i] +callArgs = + lparen *> choice + [ [] <$ rparen + , (:) <$> arg <*> M.many (comma *> arg) <* rparen + ] + where + arg = M.notFollowedBy rparen *> assign + +factor = choice + [ atNumLit <$> natural + , atNumLit <$> charLiteral + , sizeof + , alignof + , strLiteral + , identifier' + , parensExprOrStmt + ] + where + parensExprOrStmt = do + isStmtExpr <- M.option False $ True <$ M.lookAhead (M.try (lparen *> lbrace)) + if isStmtExpr then GNU.stmtExpr else parens expr + + memOp p deferredKind op opS = p *> choice + [ memOpType + , memOpUnary + ] + where + memOpType = M.try (parens absDeclarator) + >>= requireCompleteObjectType ("invalid application of '" <> opS <> "' to incomplete type") + <&> atNumLit . fromIntegral . op + + memOpUnary = do + u <- withSuppressedUnsupportedValueChecks unary >>= requireNonFunctionOperand opS + if CT.isCTUndef (atype u) then + fail $ opS <> " must be an expression or type" + else + do + uTy <- resolveMemOperandType ("invalid application of '" <> opS <> "' to incomplete type") u + let u' = case u of + ATNode kind _ lhs rhs -> ATNode kind uTy lhs rhs + ATEmpty -> ATEmpty + pure $ atUnary deferredKind (CT.SCAuto $ CT.CTLong CT.CTInt) u' + + sizeof = memOp kSizeof ATSizeof CT.sizeof "sizeof" + alignof = memOp kAlignof ATAlignof CT.alignof "_Alignof" + + strLiteral = stringLiteral >>= registerStringLiteral + + identifier' = do + pos <- getPosState + ident <- identifier + gets (lookupVar ident) >>= \case + FoundGVar gvar -> do + let declaredTy = PV.gvtype gvar + resolvedTy <- gets (`normalizeCompletedStorageClass` declaredTy) + pure $ + atGVar + resolvedTy + ident + FoundLVar sct -> + gets (\cd -> normalizeCompletedStorageClass cd (PV.lvtype sct)) + >>= \resolvedTy -> return $ atLVar resolvedTy (PV.rbpOffset sct) + FoundEnum sct -> + return $ treealize sct + FoundFunc sct -> + gets (\cd -> normalizeCompletedStorageClass cd (PSF.fntype sct)) + >>= \resolvedTy -> return $ atNoLeaf (ATFuncPtr ident) resolvedTy + FoundTypedef _ -> + fail $ "'" <> T.unpack ident <> "' is a typedef name, not an expression" + NotFound -> + M.try (fnCall ident pos) + M.<|> fail ("The '" <> T.unpack ident <> "' is not defined identifier") + where + fnCall ident pos = do + rawParams <- callArgs + gets (lookupFunction ident) >>= \case + -- TODO: set warning message + -- TODO: Infer the return type of a function + Nothing -> + do + params <- applyCallArgConversions Nothing rawParams + let params' = if null params then Nothing else Just params + implicitFnTy = CT.SCAuto $ CT.CTFunc CT.CTInt [] + shadowingGlobal <- gets (isJust . lookupGVar ident) + unless shadowingGlobal $ + registerFunc False True implicitFnTy ident + pushWarn pos ("the function '" <> T.unpack ident <> "' is not declared.") + pure $ atNoLeaf (ATCallFunc ident params') (CT.SCAuto CT.CTInt) + Just fn -> do + resolvedFnTy <- gets (\cd -> normalizeCompletedStorageClass cd (PSF.fntype fn)) + (callTy, formalParamTys) <- maybe + (fail "internal compiler error: function lookup returned non-callable type") + pure + (callableSignature resolvedFnTy) + params <- applyCallArgConversions formalParamTys rawParams + let + params' = if null params then Nothing else Just params + pure $ atNoLeaf (ATCallFunc ident params') callTy diff --git a/src/Htcc/Parser/Combinators/Program.hs-boot b/src/Htcc/Parser/Combinators/Program.hs-boot new file mode 100644 index 0000000..b02f140 --- /dev/null +++ b/src/Htcc/Parser/Combinators/Program.hs-boot @@ -0,0 +1,9 @@ +module Htcc.Parser.Combinators.Program where + +import Data.Bits (Bits) +import Htcc.Parser.AST (ATree) +import Htcc.Parser.Combinators.Core (Parser) + +conditional :: (Ord i, Bits i, Read i, Show i, Integral i) => Parser i (ATree i) + +compoundStmt :: (Ord i, Bits i, Read i, Show i, Integral i) => Parser i [ATree i] diff --git a/src/Htcc/Parser/Combinators/Type.hs b/src/Htcc/Parser/Combinators/Type.hs new file mode 100644 index 0000000..33560ba --- /dev/null +++ b/src/Htcc/Parser/Combinators/Type.hs @@ -0,0 +1,16 @@ +{-| +Module : Htcc.Parser.Combinators.Type +Description : C language parser Combinators +Copyright : (c) roki, 2020~ +License : MIT +Maintainer : falgon53@yahoo.co.jp +Stability : experimental +Portability : POSIX + +C language parser Combinators +-} +module Htcc.Parser.Combinators.Type ( + module Htcc.Parser.Combinators.Type.Core +) where + +import Htcc.Parser.Combinators.Type.Core diff --git a/src/Htcc/Parser/Combinators/Type/Core.hs b/src/Htcc/Parser/Combinators/Type/Core.hs new file mode 100644 index 0000000..214088d --- /dev/null +++ b/src/Htcc/Parser/Combinators/Type/Core.hs @@ -0,0 +1,206 @@ +{-| +Module : Htcc.Parser.Combinators.Type.Core +Description : C language parser Combinators +Copyright : (c) roki, 2020~ +License : MIT +Maintainer : falgon53@yahoo.co.jp +Stability : experimental +Portability : POSIX + +C language parser Combinators +-} +{-# LANGUAGE FlexibleContexts, LambdaCase, OverloadedStrings #-} +module Htcc.Parser.Combinators.Type.Core ( + typeSuffix + -- * Helper functions + , toNamedParams +) where +import Control.Applicative ((<|>)) +import Control.Monad (mfilter, + void, + when) +import Control.Monad.Combinators (choice) +import Control.Monad.State (get, + gets, + modify, + put) +import Control.Monad.Trans (MonadTrans (..)) +import Control.Monad.Trans.Maybe (MaybeT (..), + runMaybeT) +import Data.Bifunctor (bimap, + first) +import Data.Bits (Bits (..)) +import Data.Functor ((<&>)) +import Data.Maybe (fromJust, + isJust) +import qualified Data.Text as T +import Data.Tuple.Extra (dupe) +import qualified Htcc.CRules.Types as CT +import Htcc.Parser.Combinators.ConstExpr (evalConstexpr) +import Htcc.Parser.Combinators.Core +import {-# SOURCE #-} Htcc.Parser.Combinators.Decl.Declarator +import Htcc.Parser.Combinators.Decl.Spec (declspec) +import Htcc.Parser.Combinators.Keywords (kVoid) +import Htcc.Parser.Combinators.Utils (registerLVar) +import Htcc.Parser.ConstructionData.Core (ConstructionData (functionParamScopes, scope, tagHistory), + FunctionParamScope (..), + incomplete, + succNest) +import Htcc.Parser.ConstructionData.Scope (Scoped (curScopeId, enumerators, nextScopeId, structs)) +import Htcc.Utils (toNatural) +import qualified Text.Megaparsec as M + +arraySuffix :: (Show i, Read i, Bits i, Integral i) + => CT.StorageClass i + -> Parser i (CT.StorageClass i) +arraySuffix ty = lbracket *> choice + [ nonConstantExp + , withConstantExp + ] + where + failWithTypeMaybe ty' = maybe (fail $ show ty') pure + + withConstantExp = do + len <- evalConstexpr + when (len < 0) $ + fail "array bound is negative" + void rbracket + let arty = flip id ty . CT.mapTypeKind . CT.CTArray $ toNatural len + M.option Nothing (Just <$> arraySuffix ty) + >>= \case + Nothing -> pure arty + Just ty' + | CT.isIncompleteArray ty' -> + failWithTypeMaybe ty' Nothing + Just ty' -> + runMaybeT (mfilter CT.isValidIncomplete $ MaybeT $ pure $ CT.concatCTArray arty ty') + >>= failWithTypeMaybe ty' + + nonConstantExp = let mtIncomplete ty' = MaybeT $ lift $ gets $ incomplete ty' in + rbracket + *> M.option Nothing (Just <$> arraySuffix ty) + >>= \case + Nothing -> + runMaybeT (CT.mapTypeKind (CT.CTIncomplete . CT.IncompleteArray) <$> mtIncomplete ty) + >>= failWithTypeMaybe ty + Just ty' -> + runMaybeT (multiple <$> mtIncomplete ty') + >>= failWithTypeMaybe ty' + where + multiple = CT.mapTypeKind $ + uncurry ((.) fromJust . CT.concatCTArray) + . first (CT.CTIncomplete . CT.IncompleteArray . CT.removeAllExtents) + . dupe + +funcParams :: (Show i, Read i, Integral i, Bits i) + => CT.StorageClass i + -> Parser i (CT.StorageClass i) +funcParams ty = lparen *> do + pre <- get + modify succNest + params <- scopedParams + post <- get + let paramScope = scope post + carry = + FunctionParamScope + { fpsScopeId = curScopeId paramScope + , fpsTags = structs paramScope + , fpsEnumerators = enumerators paramScope + } + restoredScope = (scope pre) { nextScopeId = nextScopeId paramScope } + put $ + pre + { scope = restoredScope + , tagHistory = tagHistory post + , functionParamScopes = carry : functionParamScopes post + } + pure $ CT.wrapCTFunc ty params + where + scopedParams = + choice + [ M.try $ [(CT.SCAuto CT.CTVoid, Nothing)] <$ (kVoid *> rparen) + , [] <$ rparen + , withParams + ] + + withParams = do + firstParam <- declIdentFuncParam + restParams <- M.many (comma *> declIdentFuncParam) + void rparen + validateVoidParams $ firstParam : restParams + + declIdentFuncParam = do + ty' <- M.try declspec + M.choice + [ M.try $ (ty', Nothing) <$ M.lookAhead (comma <|> rparen) + , declarator ty' >>= \(t, mIdent) -> do + rejectVoidArrayParam t + let paramTy = narrowPtr t + case mIdent of + Nothing -> + pure () + Just ident -> + void $ registerLVar paramTy ident + pure (paramTy, mIdent) + ] + where + rejectVoidArrayParam paramTy + | containsVoidArrayType $ CT.toTypeKind paramTy = + fail "parameter declared as array of void" + | otherwise = pure () + + containsVoidArrayType = \case + CT.CTPtr ty'' -> + containsVoidArrayType ty'' + CT.CTArray _ ty'' -> + hasVoidArrayElement ty'' + CT.CTIncomplete (CT.IncompleteArray ty'') -> + hasVoidArrayElement ty'' + _ -> + False + + hasVoidArrayElement = \case + CT.CTArray _ ty'' -> + hasVoidArrayElement ty'' + CT.CTIncomplete (CT.IncompleteArray ty'') -> + hasVoidArrayElement ty'' + CT.CTVoid -> + True + _ -> + False + + narrowPtr ty' + | CT.isCTArray ty' = maybe ty' (CT.mapTypeKind CT.CTPtr) $ CT.deref ty' + | CT.isIncompleteArray ty' = flip CT.mapTypeKind ty' $ + \(CT.CTIncomplete (CT.IncompleteArray ty'')) -> CT.CTPtr ty'' + | otherwise = ty' + + validateVoidParams params + | isSingleUnnamedVoid params = pure params + | any (isVoidParam . fst) params = fail "parameter declared void" + | otherwise = pure params + + isSingleUnnamedVoid [(paramTy, Nothing)] = isVoidParam paramTy + isSingleUnnamedVoid _ = False + + isVoidParam = (CT.CTVoid ==) . CT.toTypeKind + +toNamedParams :: (Show i, Read i, Integral i, Bits i) + => CT.StorageClass i + -> Parser i [(CT.StorageClass i, T.Text)] +toNamedParams ty = case CT.toTypeKind ty of + (CT.CTFunc _ params) -> pure + [ bimap CT.SCAuto fromJust p + | p <- params + , fst p /= CT.CTVoid + , isJust $ snd p + ] + _ -> fail "expected function parameters" + +typeSuffix :: (Show i, Read i, Bits i, Integral i) + => CT.StorageClass i + -> Parser i (CT.StorageClass i) +typeSuffix ty = M.option ty $ choice + [ arraySuffix ty + , funcParams ty + ] diff --git a/src/Htcc/Parser/Combinators/Type/Utils.hs b/src/Htcc/Parser/Combinators/Type/Utils.hs new file mode 100644 index 0000000..b565faf --- /dev/null +++ b/src/Htcc/Parser/Combinators/Type/Utils.hs @@ -0,0 +1,26 @@ +{-| +Module : Htcc.Parser.Combinators.Type.Utils +Description : C language parser Combinators +Copyright : (c) roki, 2020~ +License : MIT +Maintainer : falgon53@yahoo.co.jp +Stability : experimental +Portability : POSIX + +C language parser Combinators +-} +module Htcc.Parser.Combinators.Type.Utils ( + starsToPtrCtor + , starsToPtr +) where + +import qualified Htcc.CRules.Types as CT +import Htcc.Parser.Combinators.Core (Parser, star) +import Htcc.Utils (toNatural) +import qualified Text.Megaparsec as M + +starsToPtrCtor :: Ord i => Parser i (CT.StorageClass i -> CT.StorageClass i) +starsToPtrCtor = CT.ctorPtr . toNatural . length <$> M.many star + +starsToPtr :: Ord i => CT.StorageClass i -> Parser i (CT.StorageClass i) +starsToPtr ty = starsToPtrCtor <*> pure ty diff --git a/src/Htcc/Parser/Combinators/Utils.hs b/src/Htcc/Parser/Combinators/Utils.hs new file mode 100644 index 0000000..b3739d5 --- /dev/null +++ b/src/Htcc/Parser/Combinators/Utils.hs @@ -0,0 +1,1103 @@ +{-| +Module : Htcc.Parser.Combinators.Utils +Description : C language parser Combinators +Copyright : (c) roki, 2020~ +License : MIT +Maintainer : falgon53@yahoo.co.jp +Stability : experimental +Portability : POSIX + +C language parser Combinators +-} +{-# LANGUAGE LambdaCase, Rank2Types, TypeOperators #-} +module Htcc.Parser.Combinators.Utils ( + maybeToParser + , registerLVar + , registerGVar + , registerGVarWith + , registerStringLiteral + , registerFunc + , registerTypedef + , decayExprType + , conditionalResultType + , isNullPointerConstant + , functionDesignatorSourcePointerType + , carriesFunctionDesignatorValue + , containsEscapingStmtExprControlFlow + , hasInvalidStmtExprControlFlow + , requiresUnsupportedNonAddressableArrayDecay + , isInvalidAggregateValueConversion + , isInvalidObjectPointerValue + , isInvalidFunctionPointerValue + , isInvalidFunctionPointerInitializer + , captureFunctionParamScopes + , bracket + , getPosState + , isConstexprArithmeticCastType + , applyConstexprCast +) where +import Control.Applicative ((<|>)) +import Control.Monad.State (gets, modify, + put) +import Control.Natural (type (~>)) +import Data.Bits (Bits (..)) +import qualified Data.ByteString as B +import Data.Maybe (isJust) +import qualified Data.Set as S +import qualified Data.Text as T +import qualified Htcc.CRules.Types as CT +import Htcc.Parser.AST.Core (ATKind (..), + ATKindFor (..), + ATree (..), + fromATKindFor) +import Htcc.Parser.Combinators.Core +import Htcc.Parser.ConstructionData.Core (ConstructionData (functionParamScopes), + FunctionParamScope, + addFunction, + addGVar, + addGVarWith, + addLVar, + addLiteral, + addTypedef, + hasIncompleteObjectType) +import Htcc.Parser.ConstructionData.Scope.ManagedScope (ASTError) +import Htcc.Parser.ConstructionData.Scope.Var (GVarInitWith) +import qualified Htcc.Tokenizer.Token as HT +import qualified Text.Megaparsec as M + +maybeToParser :: String -> Maybe ~> Parser i +maybeToParser s = maybe (fail s) pure + +type PureAdder i = CT.StorageClass i + -> HT.TokenLC i + -> ConstructionData i + -> Either (ASTError i) (ATree i, ConstructionData i) + +registerVar :: (Bits i, Integral i) + => PureAdder i + -> CT.StorageClass i + -> T.Text + -> Parser i (ATree i) +registerVar adder ty ident = gets (adder ty (tmpTKIdent ident)) + >>= \case + Right (lat, scp') -> lat <$ put scp' + Left err -> fail $ T.unpack $ fst err + +registerLVar :: (Bits i, Integral i) + => CT.StorageClass i + -> T.Text + -> Parser i (ATree i) +registerLVar = registerVar addLVar + +registerStringLiteral :: (Bits i, Integral i) + => B.ByteString + -> Parser i (ATree i) +registerStringLiteral s = gets (addLiteral ty (HT.TokenLCNums 1 1, HT.TKString bytes)) + >>= \case + Right (n, scp) -> n <$ put scp + Left err -> fail $ T.unpack $ fst err + where + bytes = s + ty = CT.SCAuto $ CT.CTArray (fromIntegral $ B.length bytes) CT.CTChar + +registerGVar :: (Ord i, Bits i, Integral i) + => CT.StorageClass i + -> T.Text + -> Parser i (ATree i) +registerGVar = registerVar addGVar + +registerGVarWith :: (Ord i, Bits i, Integral i) + => CT.StorageClass i + -> T.Text + -> GVarInitWith i + -> Parser i (ATree i) +registerGVarWith ty ident to = gets (addGVarWith ty (tmpTKIdent ident) to) + >>= \case + Right (_, scp) -> ATEmpty <$ put scp + Left err -> fail $ T.unpack $ fst err + +registerFunc :: (Bits i, Integral i) + => Bool + -> Bool + -> CT.StorageClass i + -> T.Text + -> Parser i () +registerFunc isDefined isImplicit ty ident = gets (addFunction isDefined isImplicit ty (tmpTKIdent ident)) + >>= \case + Right scp -> put scp + Left err -> fail $ T.unpack $ fst err + +registerTypedef :: (Eq i, Num i) + => CT.StorageClass i + -> T.Text + -> Parser i () +registerTypedef ty ident = gets (addTypedef ty (tmpTKIdent ident)) + >>= \case + Right scp -> put scp + Left err -> fail $ T.unpack $ fst err + +captureFunctionParamScopes :: Parser i a -> Parser i (a, [FunctionParamScope i]) +captureFunctionParamScopes parser = do + initialDepth <- gets (length . functionParamScopes) + result <- parser + scopes <- gets functionParamScopes + let newCount = length scopes - initialDepth + (newScopes, remainingScopes) = splitAt newCount scopes + modify $ \cd -> cd { functionParamScopes = remainingScopes } + pure (result, newScopes) + +decayExprType :: Ord i => CT.StorageClass i -> CT.StorageClass i +decayExprType ty = case CT.toTypeKind ty of + CT.CTArray _ _ -> decayArrayType ty + CT.CTIncomplete (CT.IncompleteArray elemTy) -> CT.mapTypeKind (const $ CT.CTPtr elemTy) ty + CT.CTFunc _ _ -> CT.mapTypeKind CT.CTPtr ty + _ -> ty + where + decayArrayType arrTy = maybe arrTy (CT.mapTypeKind CT.CTPtr) $ CT.deref arrTy + +conditionalResultType :: (Ord i, Bits i, Integral i) => ATree i -> ATree i -> Maybe (CT.StorageClass i) +conditionalResultType lhs rhs + | isPointerType lhsTy || isPointerType rhsTy = + nullPointerConditionalType lhsTy rhs + <|> nullPointerConditionalType rhsTy lhs + <|> mergePointerConditionalType lhsTy rhsTy + | isAggregateType lhsTy || isAggregateType rhsTy = + mergeCompatibleStorageClasses lhsTy rhsTy + <|> mergeCompatibleStorageClasses rhsTy lhsTy + | otherwise = + Just $ CT.conversion lhsTy rhsTy + where + lhsTy = decayExprType $ atype lhs + rhsTy = decayExprType $ atype rhs + + isAggregateType ty = + CT.isCTStruct ty || CT.isIncompleteStruct ty + + mergePointerConditionalType lTy rTy + | isPointerType lTy && isPointerType rTy = + voidObjectPointerConditionalType lTy rTy + <|> voidObjectPointerConditionalType rTy lTy + <|> mergeCompatibleStorageClasses lTy rTy + <|> mergeCompatibleStorageClasses rTy lTy + | otherwise = + Nothing + + voidObjectPointerConditionalType voidTy objectTy + | isVoidPointerType voidTy && isObjectPointerType objectTy = + Just voidTy + | otherwise = + Nothing + + nullPointerConditionalType pointerTy expr + | isPointerType pointerTy && isConditionalNullPointerConstantFor pointerTy expr = + Just pointerTy + | otherwise = + Nothing + + isPointerType ty = case CT.toTypeKind ty of + CT.CTPtr _ -> True + _ -> False + + isFunctionPointerType ty = case CT.toTypeKind ty of + CT.CTPtr (CT.CTFunc _ _) -> True + _ -> False + + isVoidPointerType ty = case CT.toTypeKind ty of + CT.CTPtr CT.CTVoid -> True + _ -> False + + isObjectPointerType ty = case CT.toTypeKind ty of + CT.CTPtr (CT.CTFunc _ _) -> False + CT.CTPtr _ -> True + _ -> False + + mergeCompatibleStorageClasses lTy rTy = + CT.SCAuto <$> CT.mergeCompatibleTypeKinds (CT.toTypeKind lTy) (CT.toTypeKind rTy) + + isConditionalNullPointerConstantFor pointerTy expr = + isZeroIntegerNullPointerConstant expr + || (isObjectPointerType pointerTy && isTypedNullPointerConstant isObjectPointerType expr) + || (isFunctionPointerType pointerTy && isTypedNullPointerConstant isFunctionNullPointerCastType expr) + + isFunctionNullPointerCastType ty = + isVoidPointerType ty || isFunctionPointerType ty + + isZeroIntegerNullPointerConstant = \case + ATNode (ATNull inner) _ _ _ -> + isZeroIntegerNullPointerConstant inner + ATNode ATExprStmt _ inner _ -> + isZeroIntegerNullPointerConstant inner + expr -> + isZeroIntegerConstexpr expr + + isTypedNullPointerConstant matchesTy = \case + ATNode (ATNull inner) _ _ _ -> + isTypedNullPointerConstant matchesTy inner + ATNode ATExprStmt _ inner _ -> + isTypedNullPointerConstant matchesTy inner + ATNode ATCast ty inner _ + | matchesTy ty -> + isZeroIntegerNullPointerConstant inner + _ -> + False + + isZeroIntegerConstexpr expr = + either (const False) (== 0) $ evalIntegerConstexprTree expr + +isNullPointerConstant :: (Bits i, Integral i) => ATree i -> Bool +isNullPointerConstant = \case + ATNode (ATNull inner) _ _ _ -> + isNullPointerConstant inner + ATNode ATExprStmt _ inner _ -> + isNullPointerConstant inner + ATNode ATCast ty inner _ + | isVoidPointerType ty -> + isZeroIntegerConstexpr inner + expr -> + isZeroIntegerConstexpr expr + where + isVoidPointerType ty = case CT.toTypeKind ty of + CT.CTPtr CT.CTVoid -> True + _ -> False + + isZeroIntegerConstexpr expr = + either (const False) (== 0) $ evalIntegerConstexprTree expr + +functionDesignatorSourcePointerType :: Ord i => ATree i -> Maybe (CT.StorageClass i) +functionDesignatorSourcePointerType at@(ATNode _ ty _ _) + | carriesFunctionDesignatorValue at = functionPointerValueType at + | otherwise = Nothing +functionDesignatorSourcePointerType _ = Nothing + +functionPointerValueType :: Ord i => ATree i -> Maybe (CT.StorageClass i) +functionPointerValueType (ATNode _ ty _ _) = case CT.toTypeKind ty of + CT.CTFunc _ _ -> Just $ decayExprType ty + CT.CTPtr (CT.CTFunc _ _) -> Just ty + _ -> Nothing +functionPointerValueType _ = Nothing + +isVoidPointerType :: CT.StorageClass i -> Bool +isVoidPointerType ty = case CT.toTypeKind ty of + CT.CTPtr CT.CTVoid -> True + _ -> False + +isObjectPointerType :: CT.StorageClass i -> Bool +isObjectPointerType ty = case CT.toTypeKind ty of + CT.CTPtr (CT.CTFunc _ _) -> False + CT.CTPtr _ -> True + _ -> False + +objectPointerTypesCompatible :: Ord i => CT.StorageClass i -> CT.StorageClass i -> Bool +objectPointerTypesCompatible targetTy sourceTy = + isVoidPointerType targetTy + || isVoidPointerType sourceTy + || compatibleKinds (CT.toTypeKind targetTy) (CT.toTypeKind sourceTy) + || compatibleKinds (CT.toTypeKind sourceTy) (CT.toTypeKind targetTy) + where + compatibleKinds lhs rhs = + isJust $ CT.mergeCompatibleTypeKinds lhs rhs + +isInvalidObjectPointerValue :: (Ord i, Bits i, Integral i) => CT.StorageClass i -> ATree i -> Bool +isInvalidObjectPointerValue targetTy expr + | not (isObjectPointerType targetTy) = False + | isNullPointerConstant expr = False + | otherwise = not $ any (maybe False isCompatibleObjectPointerSource) sourceTypes + where + sourceTypes = + [Just $ decayExprType $ atype expr] + + isCompatibleObjectPointerSource sourceTy = case CT.toTypeKind sourceTy of + CT.CTPtr (CT.CTFunc _ _) -> + False + CT.CTPtr _ -> + objectPointerTypesCompatible targetTy sourceTy + _ -> + False + +isInvalidAggregateValueConversion :: Eq i => CT.StorageClass i -> ATree i -> Bool +isInvalidAggregateValueConversion targetTy expr = + (isStructLikeValueType targetTy || isStructLikeValueType exprTy) + && not (compatibleKinds (CT.toTypeKind targetTy) (CT.toTypeKind exprTy)) + && not (compatibleKinds (CT.toTypeKind exprTy) (CT.toTypeKind targetTy)) + where + exprTy = atype expr + + isStructLikeValueType ty = + CT.isCTStruct ty || CT.isIncompleteStruct ty + + compatibleKinds lhs rhs = + isJust $ CT.mergeCompatibleTypeKinds lhs rhs + +isInvalidFunctionPointerValue :: (Ord i, Bits i, Integral i) => CT.StorageClass i -> ATree i -> Bool +isInvalidFunctionPointerValue targetTy at + | hasInvalidFunctionPointerCast at = isFunctionPointerType targetTy + | otherwise = case inferredFunctionPointerValueType at of + Just sourcePtrTy -> + not $ isCompatibleFunctionPointerType targetTy sourcePtrTy + Nothing -> + isFunctionPointerType targetTy && not (isNullPointerConstant at) + where + isFunctionPointerType ty = case CT.toTypeKind ty of + CT.CTPtr (CT.CTFunc _ _) -> True + _ -> False + + isVoidPointerType ty = case CT.toTypeKind ty of + CT.CTPtr CT.CTVoid -> True + _ -> False + + isNullPointerCastType ty = isVoidPointerType ty || isFunctionPointerType ty + + isNullPointerConstant = \case + ATNode (ATNull inner) _ _ _ -> + isNullPointerConstant inner + ATNode ATExprStmt _ inner _ -> + isNullPointerConstant inner + ATNode ATCast ty inner _ + | isNullPointerCastType ty -> + isZeroIntegerConstexpr inner + expr -> + isZeroIntegerConstexpr expr + + isZeroIntegerConstexpr expr = + either (const False) (== 0) $ evalIntegerConstexprTree expr + + inferredFunctionPointerValueType expr = + if isNullPointerConstant expr + then Nothing + else case functionPointerValueType expr of + Just ty -> + Just ty + Nothing -> case expr of + ATNode (ATNull inner) _ _ _ -> + inferredFunctionPointerValueType inner + ATNode ATExprStmt _ inner _ -> + inferredFunctionPointerValueType inner + ATNode ATComma _ _ rhs -> + inferredFunctionPointerValueType rhs + ATNode (ATConditional cond ATEmpty el) _ _ _ -> + conditionalFunctionPointerValueType cond el + ATNode (ATConditional _ th el) _ _ _ -> + conditionalFunctionPointerValueType th el + ATNode (ATStmtExpr stmts) _ _ _ -> + inferredFunctionPointerValueType =<< lastMaybe stmts + _ -> + Nothing + + conditionalFunctionPointerValueType lhs rhs = case (inferredFunctionPointerValueType lhs, inferredFunctionPointerValueType rhs) of + (Just lhsTy, Just rhsTy) + | functionPointerTypesCompatible lhsTy rhsTy -> + Just lhsTy + (Just lhsTy, Nothing) + | isNullPointerConstant rhs -> + Just lhsTy + (Nothing, Just rhsTy) + | isNullPointerConstant lhs -> + Just rhsTy + _ -> + Nothing + + functionPointerTypesCompatible lhsTy rhsTy = + isCompatibleFunctionPointerType lhsTy rhsTy + || isCompatibleFunctionPointerType rhsTy lhsTy + + isCompatibleFunctionPointerType targetTy' sourcePtrTy = + isFunctionPointerType targetTy' + && + ( isJust + (CT.mergeCompatibleTypeKinds + (CT.toTypeKind targetTy') + (CT.toTypeKind sourcePtrTy) + ) + || oldStyleFunctionPointerCompatible + (CT.toTypeKind targetTy') + (CT.toTypeKind sourcePtrTy) + ) + + oldStyleFunctionPointerCompatible + (CT.CTPtr (CT.CTFunc targetRet targetParams)) + (CT.CTPtr (CT.CTFunc sourceRet sourceParams)) = + isJust (CT.mergeCompatibleTypeKinds targetRet sourceRet) + && + (oldStyleParamListCompatible targetParams sourceParams + || oldStyleParamListCompatible sourceParams targetParams) + oldStyleFunctionPointerCompatible _ _ = False + + oldStyleParamListCompatible [] params = + all oldStyleCallCompatibleParamType $ normalizedFunctionParamKinds params + oldStyleParamListCompatible _ _ = False + + normalizedFunctionParamKinds [(CT.CTVoid, Nothing)] = [] + normalizedFunctionParamKinds params = + map (normalizeFunctionParamType . fst) params + + normalizeFunctionParamType = \case + CT.CTArray _ elemTy -> CT.CTPtr elemTy + CT.CTIncomplete (CT.IncompleteArray elemTy) -> CT.CTPtr elemTy + CT.CTFunc retTy params -> CT.CTPtr $ CT.CTFunc retTy params + ty -> ty + + oldStyleCallCompatibleParamType ty = + isJust (CT.mergeCompatibleTypeKinds ty promotedTy) + where + promotedTy = CT.integerPromotedTypeKind ty + + hasInvalidFunctionPointerCast = \case + ATNode ATCast ty inner _ + | carriesFunctionDesignatorValue inner -> + not (isFunctionPointerType ty) || hasInvalidFunctionPointerCast inner + ATNode ATAddr _ inner _ -> + hasInvalidFunctionPointerCast inner + ATNode (ATNull inner) _ _ _ -> + hasInvalidFunctionPointerCast inner + ATNode ATExprStmt _ inner _ -> + hasInvalidFunctionPointerCast inner + ATNode ATComma _ _ rhs -> + hasInvalidFunctionPointerCast rhs + ATNode (ATConditional cond ATEmpty el) _ _ _ -> + any hasInvalidFunctionPointerCast [cond, el] + ATNode (ATConditional _ th el) _ _ _ -> + any hasInvalidFunctionPointerCast [th, el] + ATNode (ATStmtExpr stmts) _ _ _ -> + maybe False hasInvalidFunctionPointerCast (lastMaybe stmts) + _ -> + False + + lastMaybe [] = Nothing + lastMaybe xs = Just $ last xs + +evalIntegerConstexprTree :: (Bits i, Integral i) => ATree i -> Either String i +evalIntegerConstexprTree = \case + ATNode k ty lhs rhs -> case k of + ATAdd -> binop (+) + ATSub -> binop (-) + ATMul -> binop (*) + ATDiv -> nonZeroBinop quot + ATMod -> nonZeroBinop rem + ATAnd -> binop (.&.) + ATXor -> binop xor + ATOr -> binop (.|.) + ATShl -> shiftBinop shiftL + ATShr -> shiftBinop shiftR + ATEQ -> binop (fromBool .: (==)) + ATNEQ -> binop (fromBool .: (/=)) + ATLT -> binop (fromBool .: (<)) + ATGT -> binop (fromBool .: (>)) + ATLEQ -> binop (fromBool .: (<=)) + ATGEQ -> binop (fromBool .: (>=)) + ATConditional cond th el -> + evalIntegerConstexprTree cond >>= \cond' -> + if cond' == 0 + then evalIntegerConstexprTree el + else evalIntegerConstexprTree $ + case th of + ATEmpty -> cond + _ -> th + ATNot -> fromBool . (== 0) <$> evalIntegerConstexprTree lhs + ATBitNot -> complement <$> evalIntegerConstexprTree lhs + ATLAnd -> evalIntegerConstexprTree lhs >>= logicalAnd + ATLOr -> evalIntegerConstexprTree lhs >>= logicalOr + ATSizeof -> memOp "sizeof" CT.sizeof lhs + ATAlignof -> memOp "_Alignof" CT.alignof lhs + ATCast + | isConstexprArithmeticCastType ty -> applyConstexprCast ty <$> evalIntegerConstexprTree lhs + | otherwise -> Left "not an integer constant expression" + ATNum v -> pure v + _ -> Left "not an integer constant expression" + where + binop f = evalIntegerConstexprTree lhs >>= \lhs' -> f lhs' <$> evalIntegerConstexprTree rhs + shiftBinop f = + evalIntegerConstexprTree lhs >>= \lhs' -> + evalIntegerConstexprTree rhs >>= \rhs' -> + case shiftCount rhs' of + Nothing -> Left "not an integer constant expression" + Just count' -> pure $ f lhs' count' + shiftCount n + | n < 0 = Nothing + | toInteger n >= shiftWidth = Nothing + | toInteger n > toInteger (maxBound :: Int) = Nothing + | otherwise = Just $ fromIntegral n + where + shiftWidth = toInteger (CT.sizeof ty) * 8 + logicalAnd lhs' + | lhs' == 0 = pure 0 + | otherwise = fromBool . (/= 0) <$> evalIntegerConstexprTree rhs + logicalOr lhs' + | lhs' /= 0 = pure 1 + | otherwise = fromBool . (/= 0) <$> evalIntegerConstexprTree rhs + nonZeroBinop f = + evalIntegerConstexprTree lhs >>= \lhs' -> + evalIntegerConstexprTree rhs >>= \rhs' -> + if rhs' == 0 + then Left "not an integer constant expression" + else pure (f lhs' rhs') + memOp opName op expr + | hasIncompleteObjectType (atype expr) = + Left $ "invalid application of '" <> opName <> "' to incomplete type" + | otherwise = + pure $ fromIntegral $ op $ atype expr + fromBool = fromIntegral . fromEnum + (.:) f g x y = f (g x y) + _ -> Left "not an integer constant expression" + +containsEscapingStmtExprControlFlow :: Ord i => ATree i -> Bool +containsEscapingStmtExprControlFlow = containsStmtExprEscapingControlFlow True + +containsNonReturnEscapingStmtExprControlFlow :: Ord i => ATree i -> Bool +containsNonReturnEscapingStmtExprControlFlow = containsStmtExprEscapingControlFlow False + +containsStmtExprEscapingControlFlow :: Ord i => Bool -> ATree i -> Bool +containsStmtExprEscapingControlFlow returnEscapes = exprContains + where + exprContains ATEmpty = False + exprContains (ATNode ATSizeof _ _ _) = False + exprContains (ATNode ATAlignof _ _ _) = False + exprContains (ATNode (ATStmtExpr stmts) _ lhs rhs) = + stmtExprContainsEscape stmts + || exprContains lhs + || exprContains rhs + exprContains (ATNode kind _ lhs rhs) = + kindExprContains kind + || exprContains lhs + || exprContains rhs + + stmtExprContainsEscape stmts = + any (stmtContainsEscape (localLabels stmts) 0 0) stmts + + stmtContainsEscape _ _ _ ATEmpty = False + stmtContainsEscape _ _ _ (ATNode ATSizeof _ _ _) = False + stmtContainsEscape _ _ _ (ATNode ATAlignof _ _ _) = False + stmtContainsEscape labels breakDepth continueDepth (ATNode ATWhile _ cond body) = + stmtContainsEscape labels breakDepth continueDepth cond + || stmtContainsEscape labels (succ breakDepth) (succ continueDepth) body + stmtContainsEscape labels breakDepth continueDepth (ATNode kind _ lhs rhs) = + escapingKind labels breakDepth continueDepth kind + || kindStmtContainsEscape labels breakDepth continueDepth kind + || stmtContainsEscape labels breakDepth continueDepth lhs + || stmtContainsEscape labels breakDepth continueDepth rhs + + escapingKind labels breakDepth continueDepth = \case + ATBreak -> + breakDepth == 0 + ATContinue -> + continueDepth == 0 + ATGoto ident -> + not $ ident `S.member` labels + ATReturn -> + returnEscapes + _ -> + False + + kindStmtContainsEscape labels breakDepth continueDepth = \case + ATConditional cond tr fl -> + any (stmtContainsEscape labels breakDepth continueDepth) [cond, tr, fl] + ATSwitch cond cases -> + stmtContainsEscape labels breakDepth continueDepth cond + || any (stmtContainsEscape labels (succ breakDepth) continueDepth) cases + ATFor kinds -> + any (forContainsEscape labels (succ breakDepth) (succ continueDepth)) kinds + ATBlock stmts -> + any (stmtContainsEscape labels breakDepth continueDepth) stmts + ATStmtExpr stmts -> + stmtExprContainsEscape stmts + ATNull at -> + stmtContainsEscape labels breakDepth continueDepth at + ATDefFunc _ maybeArgs -> + maybe False (any exprContains) maybeArgs + ATCallFunc _ maybeArgs -> + maybe False (any exprContains) maybeArgs + ATCallPtr maybeArgs -> + maybe False (any exprContains) maybeArgs + _ -> + False + + forContainsEscape labels breakDepth continueDepth = \case + ATForkw -> + False + ATForInit at -> + stmtContainsEscape labels breakDepth continueDepth at + ATForCond at -> + stmtContainsEscape labels breakDepth continueDepth at + ATForIncr at -> + stmtContainsEscape labels breakDepth continueDepth at + ATForStmt at -> + stmtContainsEscape labels breakDepth continueDepth at + + kindExprContains = \case + ATConditional cond tr fl -> + any exprContains [cond, tr, fl] + ATSwitch cond cases -> + exprContains cond || any exprContains cases + ATFor kinds -> + any forExprContains kinds + ATBlock stmts -> + any exprContains stmts + ATStmtExpr stmts -> + stmtExprContainsEscape stmts + ATNull at -> + exprContains at + ATDefFunc _ maybeArgs -> + maybe False (any exprContains) maybeArgs + ATCallFunc _ maybeArgs -> + maybe False (any exprContains) maybeArgs + ATCallPtr maybeArgs -> + maybe False (any exprContains) maybeArgs + _ -> + False + + forExprContains = \case + ATForkw -> + False + ATForInit at -> + exprContains at + ATForCond at -> + exprContains at + ATForIncr at -> + exprContains at + ATForStmt at -> + exprContains at + + localLabels = + S.unions . map labelsInTree + + labelsInTree ATEmpty = S.empty + labelsInTree (ATNode kind _ lhs rhs) = case kind of + ATStmtExpr _ -> + S.empty + ATCallFunc _ _ -> + S.empty + ATCallPtr _ -> + S.empty + _ -> + labelsInKind kind + <> labelsInTree lhs + <> labelsInTree rhs + + labelsInKind = \case + ATLabel ident -> + S.singleton ident + ATConditional cond tr fl -> + S.unions $ map labelsInTree [cond, tr, fl] + ATSwitch cond cases -> + S.unions $ labelsInTree cond : map labelsInTree cases + ATFor kinds -> + S.unions $ map labelsInFor kinds + ATBlock stmts -> + S.unions $ map labelsInTree stmts + ATNull at -> + labelsInTree at + _ -> + S.empty + + labelsInFor = \case + ATForkw -> + S.empty + ATForInit at -> + labelsInTree at + ATForCond at -> + labelsInTree at + ATForIncr at -> + labelsInTree at + ATForStmt at -> + labelsInTree at + +hasInvalidStmtExprControlFlow :: Ord i => ATree i -> Bool +hasInvalidStmtExprControlFlow ast = + containsStmtExprEscapingControlFlow False ast + || hasGotoIntoStmtExpr ast + || hasSwitchLabelIntoStmtExpr ast + where + hasGotoIntoStmtExpr tree = + any jumpsIntoStmtExpr gotos + where + (_, labels, gotos) = collect True Nothing 0 tree + + jumpsIntoStmtExpr (ident, currentScope, isEmittedGoto) = + isEmittedGoto + && any + ( \case + (label, Just labelScope, _) -> + label == ident && Just labelScope /= currentScope + _ -> + False + ) + labels + + collect isEmitted scope nextId ATEmpty = (nextId, [], []) + collect isEmitted scope nextId (ATNode ATSizeof _ lhs _) = + collect False scope nextId lhs + collect isEmitted scope nextId (ATNode ATAlignof _ lhs _) = + collect False scope nextId lhs + collect isEmitted scope nextId (ATNode kind _ lhs rhs) = + let + (nextId', labels, gotos) = collectKind isEmitted scope nextId kind + (nextId'', lhsLabels, lhsGotos) = collect isEmitted scope nextId' lhs + (nextId''', rhsLabels, rhsGotos) = collect isEmitted scope nextId'' rhs + in + ( nextId''' + , labels <> lhsLabels <> rhsLabels + , gotos <> lhsGotos <> rhsGotos + ) + + collectKind isEmitted scope nextId = \case + ATLabel ident -> + (nextId, [(ident, scope, isEmitted)], []) + ATGoto ident -> + (nextId, [], [(ident, scope, isEmitted)]) + ATConditional cond th el -> + collectList isEmitted scope nextId [cond, th, el] + ATSwitch cond cases -> + collectList isEmitted scope nextId (cond : cases) + ATFor clauses -> + collectList isEmitted scope nextId $ map fromATKindFor clauses + ATBlock stmts -> + collectList isEmitted scope nextId stmts + ATStmtExpr stmts -> + collectList isEmitted (Just nextId) (succ nextId) stmts + ATNull at -> + collect isEmitted scope nextId at + ATDefFunc _ maybeArgs -> + maybe (nextId, [], []) (collectList isEmitted scope nextId) maybeArgs + ATCallFunc _ maybeArgs -> + maybe (nextId, [], []) (collectList isEmitted scope nextId) maybeArgs + ATCallPtr maybeArgs -> + maybe (nextId, [], []) (collectList isEmitted scope nextId) maybeArgs + _ -> + (nextId, [], []) + + collectList isEmitted scope nextStart = + foldl + ( \(nextId, labels, gotos) expr -> + let + (nextId', labels', gotos') = collect isEmitted scope nextId expr + in + (nextId', labels <> labels', gotos <> gotos') + ) + (nextStart, [], []) + + hasSwitchLabelIntoStmtExpr tree = + invalid + where + (_, invalid) = walk Nothing Nothing 0 tree + + walk _ _ nextId ATEmpty = (nextId, False) + walk _ _ nextId (ATNode ATSizeof _ _ _) = (nextId, False) + walk _ _ nextId (ATNode ATAlignof _ _ _) = (nextId, False) + walk stmtExprScope switchStmtExprScope nextId (ATNode kind _ lhs rhs) = + let + (nextId', invalidKind) = walkKind stmtExprScope switchStmtExprScope nextId kind + (nextId'', invalidLhs) = walk stmtExprScope switchStmtExprScope nextId' lhs + (nextId''', invalidRhs) = walk stmtExprScope switchStmtExprScope nextId'' rhs + in + (nextId''', invalidKind || invalidLhs || invalidRhs) + + walkKind stmtExprScope switchStmtExprScope nextId = \case + ATCase _ _ -> + (nextId, stmtExprScope /= switchStmtExprScope) + ATDefault _ -> + (nextId, stmtExprScope /= switchStmtExprScope) + ATConditional cond th el -> + walkList stmtExprScope switchStmtExprScope nextId [cond, th, el] + ATSwitch cond cases -> + let + (nextId', invalidCond) = walk stmtExprScope switchStmtExprScope nextId cond + (nextId'', invalidCases) = walkList stmtExprScope stmtExprScope nextId' cases + in + (nextId'', invalidCond || invalidCases) + ATFor clauses -> + walkList stmtExprScope switchStmtExprScope nextId $ map fromATKindFor clauses + ATBlock stmts -> + walkList stmtExprScope switchStmtExprScope nextId stmts + ATStmtExpr stmts -> + walkList (Just nextId) switchStmtExprScope (succ nextId) stmts + ATNull at -> + walk stmtExprScope switchStmtExprScope nextId at + ATDefFunc _ maybeArgs -> + maybe (nextId, False) (walkList stmtExprScope switchStmtExprScope nextId) maybeArgs + ATCallFunc _ maybeArgs -> + maybe (nextId, False) (walkList stmtExprScope switchStmtExprScope nextId) maybeArgs + ATCallPtr maybeArgs -> + maybe (nextId, False) (walkList stmtExprScope switchStmtExprScope nextId) maybeArgs + _ -> + (nextId, False) + + walkList stmtExprScope switchStmtExprScope nextStart = + foldl + ( \(nextId, invalid) expr -> + let + (nextId', invalid') = walk stmtExprScope switchStmtExprScope nextId expr + in + (nextId', invalid || invalid') + ) + (nextStart, False) + +isConstexprArithmeticCastType :: CT.StorageClass i -> Bool +isConstexprArithmeticCastType = \case + CT.SCAuto ty -> go ty + CT.SCRegister ty -> go ty + CT.SCStatic ty -> go ty + CT.SCUndef ty -> go ty + where + go = \case + CT.CTInt -> True + CT.CTChar -> True + CT.CTBool -> True + CT.CTEnum _ _ -> True + CT.CTSigned CT.CTUndef -> True + CT.CTShort CT.CTUndef -> True + CT.CTLong CT.CTUndef -> True + CT.CTSigned ty -> go ty + CT.CTShort ty -> go ty + CT.CTLong ty -> go ty + _ -> False + +applyConstexprCast :: (Bits i, Integral i) => CT.StorageClass i -> i -> i +applyConstexprCast ty val + | CT.toTypeKind ty == CT.CTBool = fromIntegral $ fromEnum $ val /= 0 + | otherwise = truncateToWidth (CT.sizeof ty) val + where + truncateToWidth sz x + | sz == 0 = 0 + | otherwise = fromInteger $ signExtend width $ toInteger x + where + width = fromIntegral $ sz * 8 + + signExtend width x = + if width <= 0 + then 0 + else + let modulus = bit width :: Integer + mask = pred modulus + truncated = x .&. mask + signBit = bit (pred width) :: Integer + in + if truncated .&. signBit == 0 + then truncated + else truncated - modulus + +isInvalidFunctionPointerInitializer :: (Ord i, Bits i, Integral i) => CT.StorageClass i -> ATree i -> Bool +isInvalidFunctionPointerInitializer = isInvalidFunctionPointerValue + +carriesFunctionDesignatorValue :: ATree i -> Bool +carriesFunctionDesignatorValue = \case + ATNode (ATFuncPtr _) _ _ _ -> + True + ATNode ATAddr _ inner _ -> + carriesFunctionDesignatorValue inner + ATNode ATCast _ inner _ -> + carriesFunctionDesignatorValue inner + ATNode (ATNull inner) _ _ _ -> + carriesFunctionDesignatorValue inner + ATNode ATExprStmt _ inner _ -> + carriesFunctionDesignatorValue inner + ATNode ATComma _ _ rhs -> + carriesFunctionDesignatorValue rhs + ATNode (ATConditional cond ATEmpty el) _ _ _ -> + any carriesFunctionDesignatorValue [cond, el] + ATNode (ATConditional _ th el) _ _ _ -> + any carriesFunctionDesignatorValue [th, el] + ATNode (ATStmtExpr stmts) _ _ _ -> + maybe False carriesFunctionDesignatorValue (lastMaybe stmts) + _ -> + False + where + lastMaybe [] = Nothing + lastMaybe xs = Just $ last xs + +requiresUnsupportedNonAddressableArrayDecay :: Ord i => ATree i -> Bool +requiresUnsupportedNonAddressableArrayDecay = requiresUnsupportedValueRead + where + requiresUnsupportedValueRead = \case + n@(ATNode (ATMemberAcc _) ty lhs _) + | isNonAddressableAggregateArrayElementAccess lhs -> + True + | Just (base, offset) <- rvalueSubobjectBaseOffset n -> + not $ + isSupportedSmallRvalueBase base + && isSupportedRvalueScalarType ty + && offset + CT.sizeof ty <= CT.sizeof (atype base) + && not (requiresUnsupportedValueRead base) + | otherwise -> + isUnsupportedStructValueType ty + || requiresUnsupportedAddressComputation lhs + ATNode ATDeref ty ptr _ + | Just (arrayExpr, idx) <- pointerIndexOperands ptr + , Just (base, _) <- rvalueSubobjectBaseOffset arrayExpr -> + requiresUnsupportedRvalueArrayElementRead ty base [idx] + ATNode ATDeref ty ptr _ + | Just (arrayExpr, idxs) <- nonAddressableArrayMemberPointerIndexes ptr + , Just (base, _) <- rvalueSubobjectBaseOffset arrayExpr -> + requiresUnsupportedRvalueArrayElementRead ty base idxs + ATNode ATDeref ty lhs _ -> + isUnsupportedStructValueType ty + || requiresUnsupportedValueRead lhs + ATNode ATSizeof _ _ _ -> + False + ATNode ATAlignof _ _ _ -> + False + ATNode ATAddr _ lhs _ -> + requiresUnsupportedAddressComputation lhs + ATNode (ATLVar _ _) ty _ _ -> + isUnsupportedStructValueType ty + ATNode (ATGVar _ _) ty _ _ -> + isUnsupportedStructValueType ty + ATNode (ATCallFunc _ args) ty _ _ -> + isUnsupportedStructValueType ty + || maybe False (any requiresUnsupportedValueRead) args + ATNode (ATCallPtr args) ty callee _ -> + isUnsupportedStructValueType ty + || requiresUnsupportedValueRead callee + || maybe False (any requiresUnsupportedValueRead) args + ATNode ATAssign ty lhs rhs -> + isUnsupportedStructValueType ty + || requiresUnsupportedAddressComputation lhs + || requiresUnsupportedValueRead rhs + ATNode ATComma _ lhs rhs -> + requiresUnsupportedValueRead lhs + || requiresUnsupportedValueRead rhs + ATNode ATCast ty lhs _ -> + isUnsupportedStructValueType ty + || requiresUnsupportedValueRead lhs + ATNode (ATConditional cond ATEmpty el) _ _ _ -> + requiresUnsupportedValueRead cond + || requiresUnsupportedValueRead el + ATNode (ATConditional cond th el) _ _ _ -> + requiresUnsupportedValueRead cond + || requiresUnsupportedValueRead th + || requiresUnsupportedValueRead el + ATNode (ATNull _) _ _ _ -> + False + ATNode ATExprStmt _ inner _ -> + requiresUnsupportedValueRead inner + ATNode (ATStmtExpr stmts) _ _ _ -> + maybe False requiresUnsupportedValueRead (lastMaybe stmts) + ATNode kind _ lhs rhs -> + requiresUnsupportedNonAddressableArrayDecayKind kind + || requiresUnsupportedValueRead lhs + || requiresUnsupportedValueRead rhs + _ -> + False + + requiresUnsupportedAddressComputation = \case + ATNode (ATMemberAcc _) _ lhs _ -> + requiresUnsupportedAddressComputation lhs + ATNode ATDeref _ lhs _ -> + requiresUnsupportedValueRead lhs + ATNode ATExprStmt _ inner _ -> + requiresUnsupportedAddressComputation inner + _ -> + False + + requiresUnsupportedNonAddressableArrayDecayKind = \case + ATConditional cond th el -> + any requiresUnsupportedValueRead [cond, th, el] + ATSwitch cond cases -> + requiresUnsupportedValueRead cond + || any requiresUnsupportedValueRead cases + ATFor clauses -> + any requiresUnsupportedNonAddressableArrayDecayFor clauses + ATBlock stmts -> + any requiresUnsupportedValueRead stmts + ATStmtExpr stmts -> + any requiresUnsupportedValueRead stmts + ATDefFunc _ args -> + maybe False (any requiresUnsupportedValueRead) args + ATCallFunc _ args -> + maybe False (any requiresUnsupportedValueRead) args + ATCallPtr args -> + maybe False (any requiresUnsupportedValueRead) args + ATNull _ -> + False + _ -> + False + + pointerIndexOperands (ATNode ATAddPtr _ arrayExpr idx) = Just (arrayExpr, idx) + pointerIndexOperands (ATNode ATSubPtr _ arrayExpr idx) = Just (arrayExpr, idx) + pointerIndexOperands _ = Nothing + + requiresUnsupportedRvalueArrayElementRead ty base idxs = + containsNonReturnEscapingStmtExprControlFlow base + || not (isSupportedSmallRvalueBase base) + || not (isSupportedRvalueScalarType ty) + || requiresUnsupportedValueRead base + || any requiresUnsupportedValueRead idxs + + nonAddressableArrayMemberPointerIndexes (ATNode ATAddPtr _ lhs idx) + | isNonAddressableArrayMemberExpr lhs = Just (lhs, [idx]) + | otherwise = do + (arrayExpr, idxs) <- nonAddressableArrayMemberPointerIndexes lhs + pure (arrayExpr, idxs <> [idx]) + nonAddressableArrayMemberPointerIndexes (ATNode ATSubPtr _ lhs idx) + | isNonAddressableArrayMemberExpr lhs = Just (lhs, [idx]) + | otherwise = do + (arrayExpr, idxs) <- nonAddressableArrayMemberPointerIndexes lhs + pure (arrayExpr, idxs <> [idx]) + nonAddressableArrayMemberPointerIndexes expr + | isNonAddressableArrayMemberExpr expr = Just (expr, []) + | otherwise = Nothing + + isNonAddressableArrayMemberExpr (ATNode (ATMemberAcc _) ty lhs _) = + CT.isArray ty && not (isAddressableExpr lhs) + isNonAddressableArrayMemberExpr _ = False + + isNonAddressableAggregateArrayElementAccess (ATNode ATDeref ty ptr _) + | Just (arrayExpr, _) <- pointerIndexOperands ptr = + CT.isCTStruct ty && isNonAddressableArrayMemberExpr arrayExpr + isNonAddressableAggregateArrayElementAccess _ = False + + rvalueSubobjectBaseOffset (ATNode (ATMemberAcc member) _ lhs _) + | isAddressableExpr lhs = Nothing + | otherwise = case rvalueSubobjectBaseOffset lhs of + Just (base, offset) -> Just (base, offset + CT.smOffset member) + Nothing -> Just (lhs, CT.smOffset member) + rvalueSubobjectBaseOffset _ = Nothing + + isSupportedSmallRvalueBase base = CT.sizeof (atype base) <= 8 + + isUnsupportedStructValueType ty = + CT.isIncompleteStruct ty + || (CT.isCTStruct ty && CT.sizeof ty > 8) + + isSupportedRvalueScalarType ty = + not (CT.isArray ty) + && not (CT.isCTStruct ty) + && case CT.toTypeKind ty of + CT.CTFunc _ _ -> False + _ -> True + + requiresUnsupportedNonAddressableArrayDecayFor = \case + ATForInit expr -> requiresUnsupportedValueRead expr + ATForCond expr -> requiresUnsupportedValueRead expr + ATForIncr expr -> requiresUnsupportedValueRead expr + ATForStmt expr -> requiresUnsupportedValueRead expr + ATForkw -> False + + isAddressableExpr (ATNode kind _ lhs _) = case kind of + ATLVar _ _ -> True + ATGVar _ _ -> True + ATMemberAcc _ -> isAddressableExpr lhs + ATDeref -> isAddressableDerefOperand lhs + _ -> False + isAddressableExpr _ = False + + isAddressableDerefOperand ptr + | Just (arrayExpr, _) <- pointerIndexOperands ptr + , CT.isArray (atype arrayExpr) = + isAddressableExpr arrayExpr + isAddressableDerefOperand _ = True + + lastMaybe [] = Nothing + lastMaybe xs = Just $ last xs + +bracket :: Parser i a -> (a -> Parser i b) -> (a -> Parser i c) -> Parser i c +bracket beg end m = do + b <- beg + M.withRecovery (\err -> end b *> M.parseError err) (m b) <* end b + +tmpTKIdent :: Num i => T.Text -> HT.TokenLC i +tmpTKIdent ident = (HT.TokenLCNums 1 1, HT.TKIdent ident) + +getPosState :: Parser i (M.PosState T.Text) +getPosState = do + statePos <- M.statePosState <$> M.getParserState + srcPos <- M.getSourcePos + pure $ statePos { M.pstateSourcePos = srcPos } diff --git a/src/Htcc/Parser/Combinators/Var.hs b/src/Htcc/Parser/Combinators/Var.hs new file mode 100644 index 0000000..10e138d --- /dev/null +++ b/src/Htcc/Parser/Combinators/Var.hs @@ -0,0 +1,715 @@ +{-| +Module : Htcc.Parser.Combinators.Var +Description : C language parser Combinators +Copyright : (c) roki, 2020~ +License : MIT +Maintainer : falgon53@yahoo.co.jp +Stability : experimental +Portability : POSIX + +C language parser Combinators +-} +{-# LANGUAGE OverloadedStrings #-} +module Htcc.Parser.Combinators.Var ( + varInit +) where +import Control.Monad (foldM, forM, unless, + void, when, (>=>)) +import Control.Monad.Extra (andM) +import Control.Monad.Fix (fix) +import Control.Monad.Trans (MonadTrans (..)) +import Control.Monad.Trans.Reader (ReaderT (..), asks, + runReaderT) +import Control.Monad.Trans.State (get, gets, put) +import Data.Bits (Bits) +import Data.Bool (bool) +import qualified Data.ByteString as B +import Data.Foldable (toList) +import Data.Functor ((<&>)) +import Data.List (sortBy) +import qualified Data.Map as MP +import Data.Maybe (fromJust, fromMaybe) +import qualified Data.Sequence as SQ +import qualified Data.Set as S +import qualified Data.Text as T +import Data.Tuple.Extra (second) +import qualified Htcc.CRules.Types as CT +import Htcc.Parser.AST (ATKind (..), + ATree (..), addKind, + atAssign, atBlock, + atCast, atExprStmt, + atMemberAcc, atNumLit, + atUnary, treealize) +import Htcc.Parser.Combinators.Core +import Htcc.Parser.Combinators.Utils (bracket, + isInvalidAggregateValueConversion, + isInvalidFunctionPointerInitializer, + isInvalidObjectPointerValue, + maybeToParser, + registerLVar, + requiresUnsupportedNonAddressableArrayDecay) +import Htcc.Parser.ConstructionData.Core (ConstructionData (suppressUnsupportedValueChecks), + incomplete, lookupLVar) +import Htcc.Parser.ConstructionData.Scope.Var (Var (vtype)) +import Htcc.Utils (tshow) +import Numeric.Natural (Natural) +import qualified Text.Megaparsec as M + +type DesignatorParser i r = ReaderT (T.Text, Parser i (ATree i)) (Parser i) r + +runDesignator :: (SQ.Seq (ATree i) -> SQ.Seq (CT.Desg i) -> DesignatorParser i r) + -> T.Text + -> Parser i (ATree i) + -> Parser i r +runDesignator p ident assignParser = runReaderT (p SQ.empty SQ.empty) (ident, assignParser) + +validateScalarInitializer :: (Ord i, Bits i, Integral i) => CT.StorageClass i -> ATree i -> DesignatorParser i (ATree i) +validateScalarInitializer targetTy at@(ATNode _ ty _ _) = do + unsupportedChecksSuppressed <- lift $ lift $ gets suppressUnsupportedValueChecks + if isVoidExpressionType ty then + fail "void value not ignored as it ought to be" + else if isInvalidFunctionInitializer || isInvalidObjectInitializer || isInvalidAggregateInitializer then + fail "invalid initializer for scalar object" + else if not unsupportedChecksSuppressed && requiresUnsupportedNonAddressableArrayDecay at then + fail "unsupported non-addressable array member decay" + else + pure at + where + isVoidExpressionType = isVoidTypeKind . CT.toTypeKind + + isVoidTypeKind ty' = case ty' of + CT.CTVoid -> True + CT.CTShort ty'' -> isVoidTypeKind ty'' + CT.CTLong ty'' -> isVoidTypeKind ty'' + CT.CTSigned ty'' -> isVoidTypeKind ty'' + _ -> False + + isInvalidFunctionInitializer = + isInvalidFunctionPointerInitializer targetTy at + + isInvalidObjectInitializer = + isInvalidObjectPointerValue targetTy at + + isInvalidAggregateInitializer = + isInvalidAggregateValueConversion targetTy at +validateScalarInitializer _ _ = fail "expected to assign" + +withDesignatorCheckpoint :: DesignatorParser i a -> DesignatorParser i a +withDesignatorCheckpoint p = ReaderT $ \ctx -> + bracket M.getParserState M.setParserState $ const $ + bracket (lift get) (lift . put) $ const $ + runReaderT p ctx + +tryDesignator :: DesignatorParser i a -> DesignatorParser i a +tryDesignator p = ReaderT $ \ctx -> do + constructionData <- lift get + M.try $ M.withRecovery + (\err -> lift (put constructionData) *> M.parseError err) + (runReaderT p ctx) + +arrayElementType :: Ord i => CT.StorageClass i -> CT.StorageClass i +arrayElementType ty = CT.mapTypeKind (const elemTy) ty + where + elemTy = case CT.deref ty of + Just ty' -> CT.toTypeKind ty' + Nothing -> case CT.toTypeKind ty of + CT.CTIncomplete (CT.IncompleteArray ty') -> ty' + _ -> error "internal compiler error" + +fixedArrayLength :: Ord i => CT.StorageClass i -> Maybe Natural +fixedArrayLength ty + | not (CT.isArray ty) = Nothing + | elemBytes == 0 = Nothing + | totalBytes == 0 = Nothing + | otherwise = Just $ totalBytes `div` elemBytes + where + totalBytes = CT.sizeof ty + elemBytes = CT.sizeof $ arrayElementType ty + +inferredArrayBoundElementType :: Ord i => CT.StorageClass i -> CT.StorageClass i +inferredArrayBoundElementType ty = fromMaybe (arrayElementType ty) logicalElemTy + where + (baseTy, rebuild) = CT.dctorArray $ CT.toTypeKind ty + logicalElemTy = + (\elemTy -> CT.mapTypeKind (const $ rebuild elemTy) ty) + <$> CT.fromIncompleteArray baseTy + +fixedCharArrayStringFits :: Integral i => i -> B.ByteString -> Bool +fixedCharArrayStringFits len s = toInteger (B.length s) <= toInteger len + 1 + +isCharArrayType :: Ord i => CT.StorageClass i -> Bool +isCharArrayType ty = + CT.isArray ty && maybe False isCharType (CT.deref ty) + where + isCharType ty' = case CT.toTypeKind ty' of + CT.CTChar -> True + _ -> False + +lookInitializerStringFor :: Ord i => CT.StorageClass i -> Parser i () +lookInitializerStringFor ty = bool M.empty (pure ()) =<< andM + [ pure $ isCharArrayType ty + , M.option False (True <$ M.lookAhead stringLiteral) + ] + +failCommitted :: String -> DesignatorParser i a +failCommitted msg = + lift $ M.fancyFailure $ S.singleton $ M.ErrorFail msg + +data ArrayBoundInference i + = InferArrayBoundLength Int + | InferArrayBoundType (CT.StorageClass i) + +inferArrayBoundFromInitializer :: (Integral i, Bits i, Read i, Show i, Ord i) + => CT.StorageClass i + -> DesignatorParser i (ArrayBoundInference i) +inferArrayBoundFromInitializer ty = + withDesignatorCheckpoint $ inferArrayBoundFromInitializer' ty + +inferArrayBoundFromInitializer' :: (Integral i, Bits i, Read i, Show i, Ord i) + => CT.StorageClass i + -> DesignatorParser i (ArrayBoundInference i) +inferArrayBoundFromInitializer' ty = do + void $ lift lbrace + bracedStringInitializerLength M.<|> inferElementCount + where + elemTy = inferredArrayBoundElementType ty + emptyInitializerError = "cannot initialize incomplete array with an empty initializer list" + + inferElementCount = do + len <- countElements 0 + lift rbrace + pure $ InferArrayBoundLength len + + bracedStringInitializerLength + | isCharArrayType ty = do + lift $ lookInitializerStringFor ty + len <- B.length <$> lift stringLiteral + void $ lift $ M.option () (void comma) + void $ lift rbrace + pure $ InferArrayBoundLength len + | otherwise = M.empty + + countElements acc = M.choice + [ do + void $ lift (M.lookAhead rbrace) + if acc == 0 + then fail emptyInitializerError + else pure acc + , do + skipInitializer True elemTy + continue <- continueBracedAggregate + bool + (pure $ succ acc) + (countElements $ succ acc) + continue + ] + +lookAheadTrailingCommaBeforeRbrace :: DesignatorParser i Bool +lookAheadTrailingCommaBeforeRbrace = + lift $ M.option False $ True <$ M.try (M.lookAhead (comma *> rbrace)) + +continueBracedAggregate :: DesignatorParser i Bool +continueBracedAggregate = M.choice + [ False <$ lift (M.lookAhead rbrace) + , False <$ lift (M.try (comma *> M.lookAhead rbrace)) + , True <$ lift comma + ] + +continueBraceElidedAggregate :: DesignatorParser i Bool +continueBraceElidedAggregate = do + trailingOuterComma <- lookAheadTrailingCommaBeforeRbrace + if trailingOuterComma + then pure False + else M.choice + [ True <$ lift comma + , pure False + ] + +skipInitializer :: (Integral i, Bits i, Read i, Show i, Ord i) + => Bool + -> CT.StorageClass i + -> DesignatorParser i () +skipInitializer allowStructBraceElision ty = M.choice + [ lift lookInitializerString *> void (lift stringLiteral) + , lift lookInitializerList *> leadingBraceInitializer + , aggregateCopyFallback + , braceElidedAggregateInit + , rejectScalarFallback *> void (asks snd >>= lift >>= validateScalarInitializer ty) + ] + where + lookInitializerString = lookInitializerStringFor ty + lookInitializerList = bool M.empty (pure ()) =<< M.option False (True <$ M.lookAhead lbrace) + leadingBraceInitializer = skipInitializerList ty + aggregateCopyFallback + | CT.isCTStruct ty = tryDesignator $ void (asks snd >>= lift >>= validateScalarInitializer ty) + | otherwise = M.empty + braceElidedAggregateInit + | allowStructBraceElision = case CT.toTypeKind ty of + CT.CTArray _ _ -> + rejectAggregateExpr "expected '{' to initialize an array" *> skipArrayNoBraces ty + CT.CTStruct mems -> + rejectAggregateExpr "expected '{' to initialize a struct" *> skipStructNoBraces (orderedStructMembers mems) + CT.CTNamedStruct _ _ mems -> + rejectAggregateExpr "expected '{' to initialize a struct" *> skipStructNoBraces (orderedStructMembers mems) + _ -> + M.empty + | otherwise = M.empty + rejectScalarFallback = rejectScalarFallbackFor ty + +skipInitializerList :: (Integral i, Bits i, Read i, Show i, Ord i) + => CT.StorageClass i + -> DesignatorParser i () +skipInitializerList ty = do + void $ lift lbrace + case CT.toTypeKind ty of + CT.CTArray _ _ -> skipBracedInitializerString ty M.<|> (skipArrayList ty <* lift rbrace) + CT.CTStruct mems -> skipStructList (orderedStructMembers mems) <* lift rbrace + CT.CTNamedStruct _ _ mems -> skipStructList (orderedStructMembers mems) <* lift rbrace + _ -> do + skipInitializer False ty + void $ lift $ M.option () (void comma) + void $ lift rbrace + where + skipBracedInitializerString aty = tryDesignator $ do + lift $ lookInitializerStringFor aty + void $ lift stringLiteral + void $ lift $ M.option () (void comma) + void $ lift rbrace + + skipArrayList aty = M.choice + [ void $ lift (M.lookAhead rbrace) + , do + skipInitializer True (arrayElementType aty) + continue <- continueBracedAggregate + bool + (pure ()) + (skipArrayList aty) + continue + ] + + skipStructList [] = pure () + skipStructList (mem:rest) = M.choice + [ void $ lift (M.lookAhead rbrace) + , do + skipInitializer True (CT.SCAuto $ CT.smType mem) + continue <- continueBracedAggregate + bool + (pure ()) + (skipStructList rest) + continue + ] + +skipArrayNoBraces :: (Integral i, Bits i, Read i, Show i, Ord i) + => CT.StorageClass i + -> DesignatorParser i () +skipArrayNoBraces ty = case fixedArrayLength ty of + Just n -> skipArrayLoop (fromIntegral n) 0 + Nothing -> fail "internal compiler error" + where + elemTy = arrayElementType ty + + skipArrayLoop len idx + | idx >= len = pure () + | otherwise = do + skipInitializer True elemTy + if succ idx == len + then pure () + else do + continue <- continueBraceElidedAggregate + bool + (pure ()) + (skipArrayLoop len (succ idx)) + continue + +skipStructNoBraces :: (Integral i, Bits i, Read i, Show i, Ord i) + => [CT.StructMember i] + -> DesignatorParser i () +skipStructNoBraces [] = pure () +skipStructNoBraces (mem:rest) = do + skipInitializer True (CT.SCAuto $ CT.smType mem) + case rest of + [] -> pure () + _ -> do + continue <- continueBraceElidedAggregate + bool + (pure ()) + (skipStructNoBraces rest) + continue + +desgNode :: (Num i, Ord i, Show i) + => ATree i + -> SQ.Seq (CT.Desg i) + -> DesignatorParser i (ATree i) +desgNode nd desg = atExprStmt . flip atAssign nd <$> desgLVal desg + +desgLVal :: (Num i, Ord i, Show i) + => SQ.Seq (CT.Desg i) + -> DesignatorParser i (ATree i) +desgLVal desg = flip (`foldr` facc) desg $ \idx acc -> case idx of + CT.DesgIdx idx' -> do + nd' <- maybeToParser' . (`addKind` atNumLit idx') =<< acc + flip (atUnary ATDeref) nd' <$> maybeToParser' (CT.deref (atype nd')) + CT.DesgMem mem -> atMemberAcc mem <$> acc + where + facc = asks fst + >>= lift . lift . gets . lookupLVar + >>= maybeToParser' + <&> treealize + maybeToParser' = lift . maybeToParser "invalid initializer-list" + +zeroFillByteOffsets :: (Integral i, Ord i, Show i) + => SQ.Seq (CT.Desg i) + -> [i] + -> DesignatorParser i (SQ.Seq (ATree i)) +zeroFillByteOffsets desg offsets = do + base <- desgLVal desg + let bytePtrTy = CT.SCAuto $ CT.CTPtr CT.CTChar + baseAddrTy = CT.mapTypeKind CT.CTPtr (atype base) + byteBase = atCast bytePtrTy $ atUnary ATAddr baseAddrTy base + fmap SQ.fromList $ forM offsets $ \offset -> do + bytePtr <- case offset of + 0 -> pure byteBase + _ -> lift + $ maybeToParser "invalid initializer-list" + $ addKind byteBase (atNumLit offset) + pure $ atExprStmt $ atAssign (atUnary ATDeref (CT.SCAuto CT.CTChar) bytePtr) (atNumLit 0) + +zeroFillObject :: (Integral i, Ord i, Show i) + => CT.StorageClass i + -> SQ.Seq (CT.Desg i) + -> DesignatorParser i (SQ.Seq (ATree i)) +zeroFillObject ty desg = zeroFillByteOffsets desg [0 .. pred totalBytes] + where + totalBytes = fromIntegral $ CT.sizeof ty + +zeroFillRemainingStructBytes :: (Integral i, Ord i, Show i) + => CT.StorageClass i + -> [CT.StructMember i] + -> SQ.Seq (CT.Desg i) + -> DesignatorParser i (SQ.Seq (ATree i)) +zeroFillRemainingStructBytes ty explicitMembers desg = + zeroFillByteOffsets desg $ filter (`notElem` explicitOffsets) [0 .. pred totalBytes] + where + totalBytes = fromIntegral $ CT.sizeof ty + explicitOffsets = concatMap memberByteOffsets explicitMembers + memberByteOffsets mem + | memberBytes == 0 = [] + | otherwise = [start .. start + pred memberBytes] + where + start = fromIntegral $ CT.smOffset mem + memberBytes = fromIntegral $ CT.sizeof $ CT.smType mem + +initLoop :: (Integral i, Bits i, Read i, Show i, Ord i) + => CT.StorageClass i + -> SQ.Seq (ATree i) + -> SQ.Seq (CT.Desg i) + -> DesignatorParser i (SQ.Seq (ATree i), i) +initLoop ty ai desg = second fromIntegral <$> initLoop' ai <* lift rbrace + where + initLoop' ai' = case fixedArrayLength ty of + Just n -> fix (\f (idx, rl) -> do + let arrayLen = fromIntegral n + when (idx >= arrayLen) $ failCommitted "excess elements in array initializer" + rs <- desgInit True elemTy rl (CT.DesgIdx (fromIntegral idx) SQ.<| desg) + continue <- continueBracedAggregate + bool + (pure (rs, succ idx)) + (f (succ idx, rs)) + continue + ) (0 :: Natural, ai') + Nothing -> fail "internal compiler error" + elemTy = arrayElementType ty + +initZero :: (Integral i, Ord i, Show i) + => CT.TypeKind i + -> SQ.Seq (CT.Desg i) + -> DesignatorParser i (SQ.Seq (ATree i)) +initZero (CT.CTArray n ty) desg = + foldM + (\acc idx -> (SQ.>< acc) <$> initZero ty (CT.DesgIdx idx SQ.<| desg)) + SQ.empty + [0..fromIntegral (pred n)] +initZero t@(CT.CTStruct _) desg = zeroFillObject (CT.SCAuto t) desg +initZero t@CT.CTNamedStruct {} desg = zeroFillObject (CT.SCAuto t) desg +initZero _ desg = SQ.singleton <$> desgNode (atNumLit 0) desg + +orderedStructMembers :: MP.Map T.Text (CT.StructMember i) -> [CT.StructMember i] +orderedStructMembers = sortBy (\x y -> compare (CT.smOffset x) (CT.smOffset y)) . MP.elems + +structStorageClassFromMembers :: [CT.StructMember i] -> CT.StorageClass i +structStorageClassFromMembers mems = + CT.SCAuto $ CT.CTStruct $ MP.fromList $ zipWith (\idx mem -> (tshow idx, mem)) [(0 :: Int)..] mems + +isAggregateType :: CT.StorageClass i -> Bool +isAggregateType ty = CT.isArray ty || CT.isCTStruct ty + +peekAssignType :: DesignatorParser i (Maybe (CT.StorageClass i)) +peekAssignType = do + assignParser <- asks snd + constructionData <- lift $ lift get + observed <- lift $ M.option Nothing $ Just <$> M.try (M.lookAhead assignParser) + lift $ lift $ put constructionData + pure $ atype <$> observed + +aggregateInitializerBraceError :: CT.StorageClass i -> Maybe String +aggregateInitializerBraceError ty + | CT.isCTStruct ty = Just "expected '{' to initialize a struct" + | CT.isArray ty = Just "expected '{' to initialize an array" + | otherwise = Nothing + +rejectAggregateExpr :: String -> DesignatorParser i () +rejectAggregateExpr msg = do + startsWithString <- lift $ M.option False (True <$ M.lookAhead stringLiteral) + unless startsWithString $ do + isAggregateExpr <- maybe False isAggregateType <$> peekAssignType + bool (pure ()) (fail msg) isAggregateExpr + +rejectScalarFallbackFor :: CT.StorageClass i -> DesignatorParser i () +rejectScalarFallbackFor ty = + maybe + rejectInvalidScalarAggregate + fail + (aggregateInitializerBraceError ty) + where + rejectInvalidScalarAggregate = do + invalidAggregate <- maybe False (scalarAggregateNeedsBraces ty) <$> peekAssignType + when invalidAggregate $ fail "invalid initializer for scalar object" + + scalarAggregateNeedsBraces target rhs = case CT.toTypeKind target of + CT.CTPtr _ -> CT.isCTStruct rhs + _ -> isAggregateType rhs + +arType :: Integral i => CT.StorageClass i -> i -> CT.StorageClass i +arType ty len = case CT.toTypeKind ty of + CT.CTIncomplete (CT.IncompleteArray innerTy) -> + CT.mapTypeKind (const $ inferredOuterArrayType innerTy len) ty + _ -> snd (CT.dctorArray ty) $ + CT.mapTypeKind (CT.CTArray (fromIntegral len) . fromJust . CT.fromIncompleteArray) ty + +inferredOuterArrayType :: Integral i => CT.TypeKind i -> i -> CT.TypeKind i +inferredOuterArrayType innerTy len = case innerTy of + CT.CTArray _ _ -> + fromMaybe fallback $ + CT.concatCTArray + (CT.makeCTArray [fromIntegral len] $ CT.removeAllExtents innerTy) + innerTy + _ -> fallback + where + fallback = CT.CTArray (fromIntegral len) innerTy + +registerInferredArrayBound :: (Integral i, Bits i, Read i, Show i, Ord i) + => CT.StorageClass i + -> ArrayBoundInference i + -> DesignatorParser i (CT.StorageClass i) +registerInferredArrayBound ty inferred = do + ident <- asks fst + let applyInference target = case inferred of + InferArrayBoundLength len -> arType target (fromIntegral len) + InferArrayBoundType ty' -> ty' + newt = applyInference ty + currentTy <- lift $ lift $ gets (fmap vtype . lookupLVar ident) + case currentTy of + Just currentTy' + | CT.isIncompleteArray currentTy' -> + void $ lift $ registerLVar (applyInference currentTy') ident + _ -> pure () + pure newt + +initializerString :: (Integral i, Bits i, Read i, Show i, Ord i) + => Bool + -> CT.StorageClass i + -> SQ.Seq (ATree i) + -> SQ.Seq (CT.Desg i) + -> DesignatorParser i (SQ.Seq (ATree i)) +initializerString allowStructBraceElision ty ai desg + | CT.isIncompleteArray ty = do + len <- lift $ bracket M.getParserState M.setParserState (const $ B.length <$> stringLiteral) + newt <- registerInferredArrayBound ty $ InferArrayBoundLength len + desgInit allowStructBraceElision newt ai desg + | otherwise = case CT.toTypeKind ty of + CT.CTArray n _ -> do + s <- lift stringLiteral + unless (fixedCharArrayStringFits n s) $ + failCommitted "initializer-string for array of chars is too long" + let s' = B.unpack $ s <> B.replicate (fromIntegral n - pred (B.length s)) 0 + inds = sortBy (flip (.) reverse . compare . reverse) $ CT.accessibleIndices $ CT.toTypeKind ty + fmap ((ai SQ.><) . SQ.fromList) + $ mapM (uncurry desgNode) + $ zipWith (flip (.) ((SQ.>< desg) . SQ.fromList) . (,) . atNumLit . fromIntegral) s' inds + _ -> fail "internal compiler error" + +bracedInitializerString :: (Integral i, Bits i, Read i, Show i, Ord i) + => Bool + -> CT.StorageClass i + -> SQ.Seq (ATree i) + -> SQ.Seq (CT.Desg i) + -> DesignatorParser i (SQ.Seq (ATree i)) +bracedInitializerString allowStructBraceElision ty ai desg = do + lift $ lookInitializerStringFor ty + rs <- initializerString allowStructBraceElision ty ai desg + void $ lift $ M.option () (void comma) + void $ lift rbrace + pure rs + +initializerList :: (Integral i, Bits i, Read i, Show i, Ord i) + => CT.StorageClass i + -> SQ.Seq (ATree i) + -> SQ.Seq (CT.Desg i) + -> DesignatorParser i (SQ.Seq (ATree i)) +initializerList ty ai desg = M.choice + [ allZeroInit + , withInitElements + ] + where + allZeroInit + | (CT.isArray ty && not (CT.isIncompleteArray ty)) || CT.isCTStruct ty = do + void $ lift $ M.try (lbrace *> rbrace) + (ai SQ.><) <$> initZero (CT.toTypeKind ty) desg + | otherwise = M.empty + + withInitElements + | CT.isIncompleteArray ty = do + inferred <- inferArrayBoundFromInitializer ty + newt <- registerInferredArrayBound ty inferred + desgInit False newt ai desg + | otherwise = do + void $ lift lbrace + case CT.toTypeKind ty of + CT.CTArray _ _ -> + bracedInitializerString False ty ai desg M.<|> do + (ast, idx) <- initLoop ty ai desg + (ast SQ.><) + <$> foldM + (\acc idx' -> (SQ.>< acc) <$> initZero (CT.toTypeKind elemTy) (CT.DesgIdx idx' SQ.<| desg)) + SQ.empty + [fromIntegral idx..pred (fromIntegral arrayLen)] + CT.CTStruct mems -> do + (ast, explicitMems, _) <- initStructLoop (orderedStructMembers mems) ai + (ast SQ.><) <$> zeroFillRemainingStructBytes ty explicitMems desg + CT.CTNamedStruct _ _ mems -> do + (ast, explicitMems, _) <- initStructLoop (orderedStructMembers mems) ai + (ast SQ.><) <$> zeroFillRemainingStructBytes ty explicitMems desg + _ -> do + rs <- desgInit False ty ai desg + void $ lift $ M.option () (void comma) + void $ lift rbrace + pure rs + where + elemTy = arrayElementType ty + arrayLen = fromMaybe 0 $ fixedArrayLength ty + initStructLoop mems ai' = initStructLoop' [] mems ai' <* lift rbrace + + initStructLoop' explicit [] ai' = pure (ai', explicit, []) + initStructLoop' explicit (mem:rest) ai' = do + rs <- desgInit True (CT.SCAuto $ CT.smType mem) ai' (CT.DesgMem mem SQ.<| desg) + let explicit' = mem : explicit + continue <- continueBracedAggregate + bool + (pure (rs, explicit', rest)) + (initStructLoop' explicit' rest rs) + continue + +initializerStructNoBraces :: (Integral i, Bits i, Read i, Show i, Ord i) + => [CT.StructMember i] + -> SQ.Seq (ATree i) + -> SQ.Seq (CT.Desg i) + -> DesignatorParser i (SQ.Seq (ATree i)) +initializerStructNoBraces mems ai desg = do + let structTy = structStorageClassFromMembers mems + (ast, explicitMems, _) <- initStructLoop [] mems ai + (ast SQ.><) <$> zeroFillRemainingStructBytes structTy explicitMems desg + where + initStructLoop explicit [] ai' = pure (ai', explicit, []) + initStructLoop explicit (mem:rest) ai' = do + rs <- desgInit True (CT.SCAuto $ CT.smType mem) ai' (CT.DesgMem mem SQ.<| desg) + let explicit' = mem : explicit + case rest of + [] -> pure (rs, explicit', []) + _ -> do + continue <- continueBraceElidedAggregate + bool + (pure (rs, explicit', rest)) + (initStructLoop explicit' rest rs) + continue + +initializerArrayNoBraces :: (Integral i, Bits i, Read i, Show i, Ord i) + => CT.StorageClass i + -> SQ.Seq (ATree i) + -> SQ.Seq (CT.Desg i) + -> DesignatorParser i (SQ.Seq (ATree i)) +initializerArrayNoBraces ty ai desg = case fixedArrayLength ty of + Just n -> do + let len = fromIntegral n + (ast, idx) <- initArrayLoop len bt 0 ai + (ast SQ.><) + <$> foldM + (\acc idx' -> (SQ.>< acc) <$> initZero bt (CT.DesgIdx idx' SQ.<| desg)) + SQ.empty + [fromIntegral idx..pred (fromIntegral n)] + Nothing -> fail "internal compiler error" + where + bt = CT.toTypeKind $ arrayElementType ty + initArrayLoop n bt idx ai' + | idx >= n = pure (ai', idx) + | otherwise = do + rs <- desgInit True (CT.SCAuto bt) ai' (CT.DesgIdx idx SQ.<| desg) + if idx == pred n + then pure (rs, succ idx) + else do + continue <- continueBraceElidedAggregate + bool + (pure (rs, succ idx)) + (initArrayLoop n bt (succ idx) rs) + continue + +desgInit :: (Integral i, Bits i, Read i, Show i, Ord i) + => Bool + -> CT.StorageClass i + -> SQ.Seq (ATree i) + -> SQ.Seq (CT.Desg i) + -> DesignatorParser i (SQ.Seq (ATree i)) +desgInit allowStructBraceElision ty ai desg = M.choice + [ lift (lookInitializerStringFor ty) *> initializerString allowStructBraceElision ty ai desg + , lift lookInitializerList *> leadingBraceInitializer + , aggregateCopyFallback + , braceElidedAggregateInit + , rejectScalarFallback *> scalarFallback + ] + where + lookInitializerList = bool M.empty (pure ()) =<< M.option False (True <$ M.lookAhead lbrace) + leadingBraceInitializer = initializerList ty ai desg + aggregateCopyFallback + | CT.isCTStruct ty = tryDesignator scalarFallback + | otherwise = M.empty + braceElidedAggregateInit + | allowStructBraceElision = case CT.toTypeKind ty of + CT.CTArray _ _ -> + rejectAggregateExpr "expected '{' to initialize an array" *> initializerArrayNoBraces ty ai desg + CT.CTStruct mems -> + rejectAggregateExpr "expected '{' to initialize a struct" *> initializerStructNoBraces (orderedStructMembers mems) ai desg + CT.CTNamedStruct _ _ mems -> + rejectAggregateExpr "expected '{' to initialize a struct" *> initializerStructNoBraces (orderedStructMembers mems) ai desg + _ -> + M.empty + | otherwise = M.empty + rejectScalarFallback = rejectScalarFallbackFor ty + scalarFallback = do + rhs <- asks snd >>= lift >>= validateScalarInitializer ty + (ai SQ.|>) <$> desgNode rhs desg + +varInit' :: (Integral i, Bits i, Read i, Show i, Ord i) + => Parser i (ATree i) + -> CT.StorageClass i + -> T.Text + -> ATree i + -> Parser i (ATree i) +varInit' p ty ident _ = atBlock . toList <$> runDesignator (desgInit False ty) ident p + +varInit :: (Integral i, Bits i, Read i, Show i, Ord i) + => Parser i (ATree i) + -> CT.StorageClass i + -> T.Text + -> Parser i (ATree i) +varInit p ty ident = lift (gets $ fromMaybe ty . incomplete ty) + >>= flip registerLVar ident + >>= varInit' p ty ident diff --git a/src/Htcc/Parser/ConstructionData.hs b/src/Htcc/Parser/ConstructionData.hs deleted file mode 100644 index 687d9eb..0000000 --- a/src/Htcc/Parser/ConstructionData.hs +++ /dev/null @@ -1,16 +0,0 @@ -{-| -Module : Htcc.Parser.ConstructionData.hs -Description : Data types and type synonyms used during AST construction -Copyright : (c) roki, 2019 -License : MIT -Maintainer : falgon53@yahoo.co.jp -Stability : experimental -Portability : POSIX - -Data types and type synonyms used during AST construction --} -module Htcc.Parser.ConstructionData ( - module Htcc.Parser.ConstructionData.Core -) where - -import Htcc.Parser.ConstructionData.Core diff --git a/src/Htcc/Parser/ConstructionData/Core.hs b/src/Htcc/Parser/ConstructionData/Core.hs index 407afb7..a82a6fe 100644 --- a/src/Htcc/Parser/ConstructionData/Core.hs +++ b/src/Htcc/Parser/ConstructionData/Core.hs @@ -9,9 +9,11 @@ Portability : POSIX Data types and type synonyms used during AST construction -} +{-# LANGUAGE LambdaCase #-} module Htcc.Parser.ConstructionData.Core ( -- * Main type ConstructionData (..), + FunctionParamScope (..), Warnings, -- * Adding funcitons addLVar, @@ -36,38 +38,57 @@ module Htcc.Parser.ConstructionData.Core ( initConstructionData, resetLocal, pushWarn, - incomplete + incomplete, + normalizeCompletedStorageClass, + hasIncompleteObjectType ) where -import Data.Bits (Bits (..)) -import Data.Maybe (fromJust) -import qualified Data.Sequence as S -import qualified Data.Text as T -import Data.Tuple.Extra (second) - -import qualified Htcc.CRules.Types as CT -import Htcc.Parser.AST.Core (ATree (..)) -import Htcc.Parser.ConstructionData.Scope (LookupVarResult (..)) -import qualified Htcc.Parser.ConstructionData.Scope as AS -import qualified Htcc.Parser.ConstructionData.Scope.Enumerator as SE -import qualified Htcc.Parser.ConstructionData.Scope.Function as PF -import Htcc.Parser.ConstructionData.Scope.ManagedScope (ASTError) -import qualified Htcc.Parser.ConstructionData.Scope.Tag as PS -import qualified Htcc.Parser.ConstructionData.Scope.Typedef as PT -import qualified Htcc.Parser.ConstructionData.Scope.Var as PV -import Htcc.Tokenizer.Token (TokenLC) -import qualified Htcc.Tokenizer.Token as HT +import Data.Bits (Bits (..)) +import qualified Data.Sequence as SQ +import qualified Data.Text as T +import Data.Tuple.Extra (second) +import Numeric.Natural (Natural) + +import qualified Htcc.CRules.Types as CT +import Htcc.Parser.AST.Core (ATree (..)) +import Htcc.Parser.ConstructionData.Scope (LookupVarResult (..)) +import qualified Htcc.Parser.ConstructionData.Scope as AS +import qualified Htcc.Parser.ConstructionData.Scope.Enumerator as SE +import qualified Htcc.Parser.ConstructionData.Scope.Function as PF +import Htcc.Parser.ConstructionData.Scope.ManagedScope (ASTError) +import qualified Htcc.Parser.ConstructionData.Scope.Tag as PS +import qualified Htcc.Parser.ConstructionData.Scope.Typedef as PT +import qualified Htcc.Parser.ConstructionData.Scope.Var as PV +import qualified Htcc.Tokenizer.Token as HT + +import Control.Monad.State (modify) +import Data.List.NonEmpty (NonEmpty (..)) +import qualified Data.Set as S +import Data.Void +import {-# SOURCE #-} Htcc.Parser.Combinators.ParserType +import qualified Text.Megaparsec as M -- | The warning messages type -type Warnings i = S.Seq (T.Text, TokenLC i) +type Warnings = SQ.Seq (M.ParseErrorBundle T.Text Void) + +data FunctionParamScope i = FunctionParamScope + { + fpsScopeId :: CT.ScopeId, + fpsTags :: PS.Tags i, + fpsEnumerators :: SE.Enumerators i + } deriving Show -- | `ConstructionData` is a set of "things" used during the construction of the AST. -- Contains error messages and scope information. data ConstructionData i = ConstructionData -- ^ The constructor of ConstructionData { - warns :: Warnings i, -- ^ The warning messages - scope :: AS.Scoped i, -- ^ Scope type - isSwitchStmt :: Bool -- ^ When the statement is @switch@, this flag will be `True`, otherwise will be `False`. + warns :: Warnings, -- ^ The warning messages + scope :: AS.Scoped i, -- ^ Scope type + tagHistory :: PS.TagHistory i, -- ^ Historical tag bindings used for deferred struct completion. + functionParamScopes :: [FunctionParamScope i], -- ^ Deferred outer function parameter scopes captured while parsing declarators. + isSwitchStmt :: Bool, -- ^ When the statement is @switch@, this flag will be `True`, otherwise will be `False`. + suppressUnsupportedValueChecks :: Bool, -- ^ Skip codegen-only value checks while parsing unevaluated operands. + allowSameInputExternalCollisions :: Bool -- ^ When `True`, same-input globals and function declarations may coexist so multi-input `-o` merge can resolve them. } deriving Show {-# INLINE applyScope #-} @@ -90,11 +111,22 @@ addLVar = addVar AS.addLVar -- -- >>> second (\x -> y { scope = x }) <$> Htcc.Parser.AST.Scope.addGVar ty tkn (scope x) addGVar :: (Integral i, Bits i) => CT.StorageClass i -> HT.TokenLC i -> ConstructionData i -> Either (ASTError i) (ATree i, ConstructionData i) -addGVar = addVar AS.addGVar +addGVar ty tkn cd = + addVar + (if allowSameInputExternalCollisions cd then AS.addGVarAllowFunctionConflict else AS.addGVar) + ty + tkn + cd -- | Shortcut to function `Htcc.Parser.AST.Scope.addGVarWith` for variable @x@ of tye `ConstructionData`. addGVarWith :: (Integral i, Bits i) => CT.StorageClass i -> HT.TokenLC i -> PV.GVarInitWith i -> ConstructionData i -> Either (ASTError i) (ATree i, ConstructionData i) -addGVarWith ty tkn iw cd = applyScope cd <$> AS.addGVarWith ty tkn iw (scope cd) +addGVarWith ty tkn iw cd = + applyScope cd <$> + (if allowSameInputExternalCollisions cd then AS.addGVarWithAllowFunctionConflict else AS.addGVarWith) + ty + tkn + iw + (scope cd) -- | Shortcut to function `Htcc.Parser.AST.Scope.addLiteral` for variable @x@ of type `ConstructionData`. -- This function is equivalent to @@ -175,8 +207,18 @@ lookupEnumerator = lookupFromScope AS.lookupEnumerator -- This function is equivalent to -- -- >>> (\y -> x { scope = y }) <$> Htcc.Parser.AST.Scope.addTag ty tkn (scope x) -addTag :: Num i => CT.StorageClass i -> HT.TokenLC i -> ConstructionData i -> Either (ASTError i) (ConstructionData i) -addTag ty tkn cd = (\x -> cd { scope = x }) <$> AS.addTag ty tkn (scope cd) +addTag :: Num i => PS.TagKind -> CT.StorageClass i -> HT.TokenLC i -> ConstructionData i -> Either (ASTError i) (ConstructionData i) +addTag kind ty tkn cd = do + scp <- AS.addTag kind ty tkn (scope cd) + pure $ + cd + { scope = scp + , tagHistory = case tkn of + (_, HT.TKIdent ident) -> + PS.remember (AS.curScopeId $ scope cd) (AS.curNestDepth $ scope cd) kind ty ident (tagHistory cd) + _ -> + tagHistory cd + } -- | Shortcut to function `Htcc.Parser.AST.Scope.addTypedef` for variable @x@ of type `ConstructionData`. -- This function is equivalent to @@ -189,8 +231,15 @@ addTypedef ty tkn cd = (\x -> cd { scope = x }) <$> AS.addTypedef ty tkn (scope -- This function is equivalent to -- -- >>> (\y -> x { scope = y }) <$> Htcc.Parser.AST.Scope.addFunction ty tkn (scope x) -addFunction :: Num i => Bool -> CT.StorageClass i -> HT.TokenLC i -> ConstructionData i -> Either (ASTError i) (ConstructionData i) -addFunction fd ty tkn cd = (\x -> cd { scope = x }) <$> AS.addFunction fd ty tkn (scope cd) +addFunction :: (Eq i, Num i) => Bool -> Bool -> CT.StorageClass i -> HT.TokenLC i -> ConstructionData i -> Either (ASTError i) (ConstructionData i) +addFunction fd isImplicit ty tkn cd = + (\x -> cd { scope = x }) <$> + (if allowSameInputExternalCollisions cd then AS.addFunctionAllowGlobalConflict else AS.addFunction) + fd + isImplicit + ty + tkn + (scope cd) -- | Shortcut to function `Htcc.Parser.AST.Scope.addEnumerator` for variable @x@ of type `ConstructionData`. -- This function is equivalent to @@ -202,7 +251,7 @@ addEnumerator ty tkn n cd = (\x -> cd { scope = x }) <$> AS.addEnumerator ty tkn -- | Shortcut to the initial state of `ConstructionData`. {-# INLINE initConstructionData #-} initConstructionData :: ConstructionData i -initConstructionData = ConstructionData S.empty AS.initScope False +initConstructionData = ConstructionData SQ.empty AS.initScope PS.emptyTagHistory [] False False False -- | Shortcut to function `Htcc.Parser.AST.Scope.resetLocal` for variable @x@ of type `ConstructionData`. -- This function is equivalent to @@ -212,14 +261,100 @@ resetLocal :: ConstructionData i -> ConstructionData i resetLocal cd = cd { scope = AS.resetLocal (scope cd) } -- | Function to add warning text. -pushWarn :: T.Text -> TokenLC i -> ConstructionData i -> ConstructionData i -pushWarn t tkn cd = cd { warns = warns cd S.|> (t, tkn) } +pushWarn :: M.PosState T.Text -> String -> Parser i () +pushWarn posState warnMsg = do + let peb = M.ParseErrorBundle { + M.bundleErrors = M.FancyError 0 (S.singleton $ M.ErrorFail $ "warning: " <> warnMsg) :| [] + , M.bundlePosState = posState + } + modify (\s -> s { warns = warns s SQ.|> peb }) -- | Returns `Nothing` if incomplete, otherwise `Htcc.CRules.Types.StorageClass`. {-# INLINE incomplete #-} incomplete :: CT.StorageClass i -> ConstructionData i -> Maybe (CT.StorageClass i) incomplete ty scp | not (CT.isCTIncomplete ty) = Just ty - | CT.isIncompleteStruct ty = (>>=) (lookupTag (fromJust $ CT.fromIncompleteStruct ty) scp) $ \tag -> - if CT.isCTIncomplete (PS.sttype tag) then Nothing else Just (PS.sttype tag) - | otherwise = Nothing + | otherwise = case CT.toTypeKind ty of + CT.CTIncomplete (CT.IncompleteStruct tag scopeId) -> + completedStructTagType scp tag scopeId + _ -> + Nothing + +completedStructTagType :: ConstructionData i -> T.Text -> CT.ScopeId -> Maybe (CT.StorageClass i) +completedStructTagType cd tag scopeId = + case PS.lookupAtScope tag scopeId (tagHistory cd) of + Just tagInfo + | PS.stKind tagInfo == PS.StructTag + , not (CT.isCTIncomplete $ PS.sttype tagInfo) -> + Just $ PS.sttype tagInfo + _ -> + Nothing + +normalizeCompletedStorageClass :: ConstructionData i -> CT.StorageClass i -> CT.StorageClass i +normalizeCompletedStorageClass cd = + CT.mapTypeKind (normalizeCompletedTypeKind S.empty) + where + normalizeCompletedTypeKind seen = \case + CT.CTPtr innerTy -> + CT.CTPtr $ normalizeCompletedTypeKind seen innerTy + CT.CTArray n innerTy -> + CT.CTArray n $ normalizeCompletedTypeKind seen innerTy + CT.CTFunc retTy params -> + CT.CTFunc + (normalizeCompletedTypeKind seen retTy) + (map (secondParam seen) params) + CT.CTEnum baseTy members -> + CT.CTEnum (normalizeCompletedTypeKind seen baseTy) members + CT.CTStruct members -> + CT.CTStruct $ fmap (normalizeStructMember seen) members + CT.CTNamedStruct tag scopeId members -> + CT.CTNamedStruct tag scopeId $ + fmap (normalizeStructMemberForTag seen (tag, scopeId)) members + CT.CTIncomplete (CT.IncompleteArray innerTy) -> + CT.CTIncomplete $ + CT.IncompleteArray $ normalizeCompletedTypeKind seen innerTy + CT.CTIncomplete (CT.IncompleteStruct tag scopeId) -> + case completedStructTagType cd tag scopeId of + Just completedTy + | S.member (tag, scopeId) seen -> + CT.toTypeKind completedTy + | otherwise -> + normalizeCompletedTypeKind + (S.insert (tag, scopeId) seen) + (CT.toTypeKind completedTy) + Nothing -> + CT.CTIncomplete $ CT.IncompleteStruct tag scopeId + tyKind -> + tyKind + + secondParam seen' (paramTy, ident) = + (normalizeCompletedTypeKind seen' paramTy, ident) + + normalizeStructMember seen member = + member { + CT.smType = normalizeCompletedTypeKind seen (CT.smType member) + } + + normalizeStructMemberForTag seen tagKey member = + member { + CT.smType = + normalizeCompletedTypeKind + (S.insert tagKey seen) + (CT.smType member) + } + +hasIncompleteObjectType :: CT.StorageClass i -> Bool +hasIncompleteObjectType = go . CT.toTypeKind + where + go = \case + CT.CTLong innerTy -> go innerTy + CT.CTShort innerTy -> go innerTy + CT.CTSigned innerTy -> go innerTy + CT.CTArray _ innerTy -> go innerTy + CT.CTEnum baseTy _ -> go baseTy + CT.CTStruct members -> + any (go . CT.smType) members + CT.CTNamedStruct _ _ members -> + any (go . CT.smType) members + CT.CTIncomplete _ -> True + _ -> False diff --git a/src/Htcc/Parser/ConstructionData/Core.hs-boot b/src/Htcc/Parser/ConstructionData/Core.hs-boot new file mode 100644 index 0000000..c8d3507 --- /dev/null +++ b/src/Htcc/Parser/ConstructionData/Core.hs-boot @@ -0,0 +1,32 @@ +module Htcc.Parser.ConstructionData.Core where + +import qualified Data.Sequence as SQ +import qualified Data.Text as T +import Data.Void +import qualified Htcc.CRules.Types as CT +import qualified Htcc.Parser.ConstructionData.Scope as AS +import qualified Htcc.Parser.ConstructionData.Scope.Enumerator as SE +import qualified Htcc.Parser.ConstructionData.Scope.Tag as PS +import qualified Text.Megaparsec as M + +type Warnings = SQ.Seq (M.ParseErrorBundle T.Text Void) + +data FunctionParamScope i = FunctionParamScope + { + fpsScopeId :: CT.ScopeId, + fpsTags :: PS.Tags i, + fpsEnumerators :: SE.Enumerators i + } + +data ConstructionData i = ConstructionData + { + warns :: Warnings, + scope :: AS.Scoped i, + tagHistory :: PS.TagHistory i, + functionParamScopes :: [FunctionParamScope i], + isSwitchStmt :: Bool, + suppressUnsupportedValueChecks :: Bool, + allowSameInputExternalCollisions :: Bool + } + +initConstructionData :: ConstructionData i diff --git a/src/Htcc/Parser/ConstructionData/Scope.hs b/src/Htcc/Parser/ConstructionData/Scope.hs index 944133c..5d01cff 100644 --- a/src/Htcc/Parser/ConstructionData/Scope.hs +++ b/src/Htcc/Parser/ConstructionData/Scope.hs @@ -9,7 +9,7 @@ Portability : POSIX The Data type of variables and its utilities used in parsing -} -{-# LANGUAGE DeriveGeneric #-} +{-# LANGUAGE DeriveGeneric, OverloadedStrings #-} module Htcc.Parser.ConstructionData.Scope ( -- * The types Scoped (..), @@ -17,11 +17,14 @@ module Htcc.Parser.ConstructionData.Scope ( -- * Operations for scope addLVar, addGVar, + addGVarAllowFunctionConflict, addGVarWith, + addGVarWithAllowFunctionConflict, addLiteral, addTag, addTypedef, addFunction, + addFunctionAllowGlobalConflict, addEnumerator, succNest, fallBack, @@ -38,6 +41,8 @@ module Htcc.Parser.ConstructionData.Scope ( import Control.DeepSeq (NFData (..)) import Data.Bits (Bits (..)) +import Data.Maybe (fromMaybe, + isJust) import qualified Data.Text as T import Data.Tuple.Extra (second) import GHC.Generics (Generic (..), @@ -57,12 +62,15 @@ import qualified Htcc.Tokenizer.Token as HT -- | The data type of a struct tag data Scoped i = Scoped -- ^ The constructor of a struct tag { - curNestDepth :: !Natural, -- ^ The nest depth of the parsing process - vars :: PV.Vars i, -- ^ scoped all identifiers of variables (local variables, global variables and literals) visible during processing - structs :: PS.Tags i, -- ^ scoped all struct tags - typedefs :: PT.Typedefs i, -- ^ scoped all typedefs - functions :: PF.Functions i, -- ^ scoped all identifires of functions - enumerators :: SE.Enumerators i -- ^ scoped all identifiers of enumerators + curNestDepth :: !Natural, -- ^ The nest depth of the parsing process + curScopeId :: !CT.ScopeId, -- ^ The unique identity of the current scope + nextScopeId :: !CT.ScopeId, -- ^ The next unique scope identity to allocate + vars :: PV.Vars i, -- ^ scoped all identifiers of variables (local variables, global variables and literals) visible during processing + structs :: PS.Tags i, -- ^ scoped all struct tags + typedefs :: PT.Typedefs i, -- ^ scoped all typedefs + functions :: PF.Functions i, -- ^ scoped all identifires of functions + externalFunctions :: PF.Functions i, -- ^ translation-unit function declarations used for compatibility checks across scopes + enumerators :: SE.Enumerators i -- ^ scoped all identifiers of enumerators } deriving (Show, Generic, Generic1) instance NFData i => NFData (Scoped i) @@ -71,6 +79,8 @@ instance NFData i => NFData (Scoped i) data LookupVarResult i = FoundGVar (PV.GVar i) -- ^ A type constructor indicating that a global variable has been found | FoundLVar (PV.LVar i) -- ^ A type constructor indicating that a local variable has been found | FoundEnum (SE.Enumerator i) -- ^ A type constructor indicating that a enumerator has been found + | FoundFunc (PF.Function i) -- ^ A type constructor indicating that a function has been found + | FoundTypedef (PT.Typedef i) -- ^ A type constructor indicating that a typedef has been found in the ordinary identifier namespace | NotFound -- ^ A type constructor indicating that it was not found deriving (Show, Eq) @@ -78,6 +88,81 @@ data LookupVarResult i = FoundGVar (PV.GVar i) -- ^ A type constructor indicati applyVars :: Scoped i -> (a, PV.Vars i) -> (a, Scoped i) applyVars sc = second (\x -> sc { vars = x }) +{-# INLINE identifierFromToken #-} +identifierFromToken :: HT.TokenLC i -> Maybe T.Text +identifierFromToken (_, HT.TKIdent ident) = Just ident +identifierFromToken _ = Nothing + +{-# INLINE sameScopeLVar #-} +sameScopeLVar :: T.Text -> Natural -> Scoped i -> Bool +sameScopeLVar ident depth = maybe False ((== depth) . PV.nestDepth) . lookupLVar ident + +{-# INLINE sameScopeTypedef #-} +sameScopeTypedef :: T.Text -> Natural -> Scoped i -> Bool +sameScopeTypedef ident depth = maybe False ((== depth) . PT.tdNestDepth) . lookupTypedef ident + +{-# INLINE sameScopeEnumerator #-} +sameScopeEnumerator :: T.Text -> Natural -> Scoped i -> Bool +sameScopeEnumerator ident depth = maybe False ((== depth) . SE.enNestDepth) . lookupEnumerator ident + +{-# INLINE sameScopeGVar #-} +sameScopeGVar :: T.Text -> Natural -> Scoped i -> Bool +sameScopeGVar ident depth = maybe False ((== depth) . PV.gvNestDepth) . lookupGVar ident + +{-# INLINE sameScopeFunction #-} +sameScopeFunction :: T.Text -> Natural -> Scoped i -> Bool +sameScopeFunction ident depth = maybe False ((== depth) . PF.fnNestDepth) . lookupFunction ident + +{-# INLINE rejectLocalOrdinaryNameConflict #-} +rejectLocalOrdinaryNameConflict :: HT.TokenLC i -> Scoped i -> Either (SM.ASTError i) () +rejectLocalOrdinaryNameConflict tkn sc = case identifierFromToken tkn of + Just ident + | any ($ sc) + [ sameScopeGVar ident depth + , sameScopeTypedef ident depth + , sameScopeEnumerator ident depth + , sameScopeFunction ident depth + ] -> + Left ("redeclaration of '" <> ident <> "' with no linkage", tkn) + _ -> + Right () + where + depth = curNestDepth sc + +{-# INLINE lookupExternalFunction #-} +lookupExternalFunction :: T.Text -> Scoped i -> Maybe (PF.Function i) +lookupExternalFunction ident = SM.lookup ident . externalFunctions + +{-# INLINE lookupExternalGVar #-} +lookupExternalGVar :: T.Text -> Scoped i -> Maybe (PV.GVar i) +lookupExternalGVar ident = SM.lookup ident . PV.externalGlobals . vars + +{-# INLINE rejectObjectNameConflict #-} +rejectObjectNameConflict :: HT.TokenLC i -> Scoped i -> Either (SM.ASTError i) () +rejectObjectNameConflict tkn sc = case identifierFromToken tkn of + Just ident + | depth == 0 + , isJust (lookupExternalFunction ident sc) -> + Left ("redeclaration of '" <> ident <> "' with no linkage", tkn) + | depth == 0 + && any ($ sc) [sameScopeTypedef ident 0, sameScopeEnumerator ident 0] -> + Left ("conflicting types for '" <> ident <> "'", tkn) + | depth /= 0 + && isJust (lookupFunction ident sc) -> + Left ("redeclaration of '" <> ident <> "' with no linkage", tkn) + | depth /= 0 + && any ($ sc) + [ sameScopeLVar ident depth + , sameScopeTypedef ident depth + , sameScopeEnumerator ident depth + , sameScopeFunction ident depth + ] -> + Left ("redeclaration of '" <> ident <> "' with no linkage", tkn) + _ -> + Right () + where + depth = curNestDepth sc + {-# INLINE addVar #-} addVar :: (Integral i, Bits i) => (CT.StorageClass i -> HT.TokenLC i -> PV.Vars i -> Either (T.Text, HT.TokenLC i) (ATree i, PV.Vars i)) -> CT.StorageClass i -> HT.TokenLC i -> Scoped i -> Either (SM.ASTError i) (ATree i, Scoped i) addVar f ty tkn sc = applyVars sc <$> f ty tkn (vars sc) @@ -85,17 +170,71 @@ addVar f ty tkn sc = applyVars sc <$> f ty tkn (vars sc) -- | `addLVar` has a scoped type argument and is the same function as `PV.addLVar` internally. {-# INLINE addLVar #-} addLVar :: (Integral i, Bits i) => CT.StorageClass i -> HT.TokenLC i -> Scoped i -> Either (SM.ASTError i) (ATree i, Scoped i) -addLVar ty tkn scp = addVar (PV.addLVar $ curNestDepth scp) ty tkn scp +addLVar ty tkn scp = + rejectLocalOrdinaryNameConflict tkn scp *> addVar (PV.addLVar $ curNestDepth scp) ty tkn scp -- | `addGVar` has a scoped type argument and is the same function as `PV.addGVar` internally. {-# INLINE addGVar #-} addGVar :: (Integral i, Bits i) => CT.StorageClass i -> HT.TokenLC i -> Scoped i -> Either (SM.ASTError i) (ATree i, Scoped i) -addGVar = addVar PV.addGVar +addGVar ty tkn sc = rejectObjectNameConflict tkn sc *> addVar (PV.addGVar $ curNestDepth sc) ty tkn sc + +{-# INLINE addGVarAllowFunctionConflict #-} +addGVarAllowFunctionConflict :: (Integral i, Bits i) => CT.StorageClass i -> HT.TokenLC i -> Scoped i -> Either (SM.ASTError i) (ATree i, Scoped i) +addGVarAllowFunctionConflict ty tkn sc = + rejectNonFunctionObjectNameConflict tkn sc *> addVar (PV.addGVar $ curNestDepth sc) ty tkn sc + where + rejectNonFunctionObjectNameConflict tkn' sc' = case identifierFromToken tkn' of + Just ident + | depth == 0 + && any ($ sc') [sameScopeTypedef ident 0, sameScopeEnumerator ident 0] -> + Left ("conflicting types for '" <> ident <> "'", tkn') + | depth /= 0 + && isJust (lookupFunction ident sc') -> + Left ("redeclaration of '" <> ident <> "' with no linkage", tkn') + | depth /= 0 + && any ($ sc') + [ sameScopeLVar ident depth + , sameScopeTypedef ident depth + , sameScopeEnumerator ident depth + , sameScopeFunction ident depth + ] -> + Left ("redeclaration of '" <> ident <> "' with no linkage", tkn') + _ -> + Right () + where + depth = curNestDepth sc' -- | `addGVarWith` has a scoped type argument and is the same function as `PV.addLiteral` internally. {-# INLINE addGVarWith #-} addGVarWith :: (Integral i, Bits i) => CT.StorageClass i -> HT.TokenLC i -> PV.GVarInitWith i -> Scoped i -> Either (SM.ASTError i) (ATree i, Scoped i) -addGVarWith ty tkn iw sc = applyVars sc <$> PV.addGVarWith ty tkn iw (vars sc) +addGVarWith ty tkn iw sc = + rejectObjectNameConflict tkn sc *> (applyVars sc <$> PV.addGVarWith (curNestDepth sc) ty tkn iw (vars sc)) + +{-# INLINE addGVarWithAllowFunctionConflict #-} +addGVarWithAllowFunctionConflict :: (Integral i, Bits i) => CT.StorageClass i -> HT.TokenLC i -> PV.GVarInitWith i -> Scoped i -> Either (SM.ASTError i) (ATree i, Scoped i) +addGVarWithAllowFunctionConflict ty tkn iw sc = + rejectNonFunctionObjectNameConflict tkn sc *> (applyVars sc <$> PV.addGVarWith (curNestDepth sc) ty tkn iw (vars sc)) + where + rejectNonFunctionObjectNameConflict tkn' sc' = case identifierFromToken tkn' of + Just ident + | depth == 0 + && any ($ sc') [sameScopeTypedef ident 0, sameScopeEnumerator ident 0] -> + Left ("conflicting types for '" <> ident <> "'", tkn') + | depth /= 0 + && isJust (lookupFunction ident sc') -> + Left ("redeclaration of '" <> ident <> "' with no linkage", tkn') + | depth /= 0 + && any ($ sc') + [ sameScopeLVar ident depth + , sameScopeTypedef ident depth + , sameScopeEnumerator ident depth + , sameScopeFunction ident depth + ] -> + Left ("redeclaration of '" <> ident <> "' with no linkage", tkn') + _ -> + Right () + where + depth = curNestDepth sc' -- | `addLiteral` has a scoped type argument and is the same function as `PV.addLiteral` internally. {-# INLINE addLiteral #-} @@ -105,7 +244,13 @@ addLiteral = addVar PV.addLiteral -- | `succNest` has a scoped type argument and is the same function as `PV.succNest` internally. {-# INLINE succNest #-} succNest :: Scoped i -> Scoped i -succNest sc = sc { curNestDepth = succ $ curNestDepth sc } +succNest sc = sc + { curNestDepth = succ $ curNestDepth sc + , curScopeId = nextScopeId sc + , nextScopeId = succScopeId $ nextScopeId sc + } + where + succScopeId (CT.ScopeId scopeId) = CT.ScopeId $ succ scopeId -- | `fallBack` has a scoped type argument and is the same function as `PV.fallBack` internally. {-# INLINE fallBack #-} @@ -113,10 +258,12 @@ fallBack :: Scoped i -> Scoped i -> Scoped i fallBack pre post = pre { vars = PV.fallBack (vars pre) (vars post), - structs = SM.fallBack (structs pre) (structs post), + structs = PS.fallBackTags (structs pre) (structs post), typedefs = SM.fallBack (typedefs pre) (typedefs post), functions = SM.fallBack (functions pre) (functions post), - enumerators = SM.fallBack (enumerators pre) (enumerators post) + externalFunctions = externalFunctions post, + enumerators = SM.fallBack (enumerators pre) (enumerators post), + nextScopeId = nextScopeId post } {-# INLINE lookupVar' #-} @@ -136,16 +283,31 @@ lookupGVar = lookupVar' PV.lookupGVar -- | `lookupVar` has a scoped type argument and is the same function as `PV.lookupVar` internally. {-# INLINE lookupVar #-} lookupVar :: T.Text -> Scoped i -> LookupVarResult i -lookupVar ident scp = case lookupLVar ident scp of - Just local -> FoundLVar local - _ -> case lookupEnumerator ident scp of - Just enum -> FoundEnum enum - _ -> maybe NotFound FoundGVar $ lookupGVar ident scp +lookupVar ident scp = + maybe NotFound snd $ + selectDeepest + [ (\local -> (PV.nestDepth local, FoundLVar local)) <$> lookupLVar ident scp + , (\enum -> (SE.enNestDepth enum, FoundEnum enum)) <$> lookupEnumerator ident scp + , (\gvar -> (PV.gvNestDepth gvar, FoundGVar gvar)) <$> lookupGVar ident scp + , (\fn -> (PF.fnNestDepth fn, FoundFunc fn)) <$> lookupFunction ident scp + , (\td -> (PT.tdNestDepth td, FoundTypedef td)) <$> lookupTypedef ident scp + ] + where + selectDeepest = + foldl + (\best candidate -> case (best, candidate) of + (Nothing, x) -> x + (x, Nothing) -> x + (Just (bestDepth, _), Just (candidateDepth, _)) + | candidateDepth > bestDepth -> candidate + | otherwise -> best + ) + Nothing -- | `lookupTag` has a scoped type argument and is the same function as `PS.lookupTag` internally. {-# INLINE lookupTag #-} lookupTag :: T.Text -> Scoped i -> Maybe (PS.Tag i) -lookupTag t sc = SM.lookup t $ structs sc +lookupTag t sc = PS.lookupVisible t $ structs sc -- | `lookupTypedef` has a scoped type argument and is the same function as `PT.lookupTypedef` internally. {-# INLINE lookupTypedef #-} @@ -164,28 +326,102 @@ lookupEnumerator t sc = SM.lookup t $ enumerators sc -- | `addTag` has a scoped type argument and is the same function as `PS.add` internally. {-# INLINE addTag #-} -addTag :: Num i => CT.StorageClass i -> HT.TokenLC i -> Scoped i -> Either (SM.ASTError i) (Scoped i) -addTag ty tkn sc = (\x -> sc { structs = x }) <$> PS.add (curNestDepth sc) ty tkn (structs sc) +addTag :: Num i => PS.TagKind -> CT.StorageClass i -> HT.TokenLC i -> Scoped i -> Either (SM.ASTError i) (Scoped i) +addTag kind ty tkn sc = + (\x -> sc { structs = x }) + <$> PS.add (curNestDepth sc) (curScopeId sc) kind ty tkn (structs sc) -- | `addTypedef` has a scoped type argument and is the same function as `PT.add` internally. {-# INLINE addTypedef #-} addTypedef :: (Eq i, Num i) => CT.StorageClass i -> HT.TokenLC i -> Scoped i -> Either (SM.ASTError i) (Scoped i) -addTypedef ty tkn sc = (\x -> sc { typedefs = x }) <$> PT.add (curNestDepth sc) ty tkn (typedefs sc) +addTypedef ty tkn sc = + rejectTypedefNameConflict tkn sc *> ((\x -> sc { typedefs = x }) <$> PT.add (curNestDepth sc) ty tkn (typedefs sc)) + where + rejectTypedefNameConflict tkn' sc' = case identifierFromToken tkn' of + Just ident + | depth == 0 + && (isJust (lookupGVar ident sc') || isJust (lookupFunction ident sc') || sameScopeEnumerator ident 0 sc') -> + Left ("conflicting types for '" <> ident <> "'", tkn') + | depth /= 0 + && (sameScopeLVar ident depth sc' || sameScopeGVar ident depth sc' || sameScopeEnumerator ident depth sc' || sameScopeFunction ident depth sc') -> + Left ("conflicting types for '" <> ident <> "'", tkn') + _ -> + Right () + where + depth = curNestDepth sc' -- | `addFunction` has a scoped type argument and is the same function as `PT.add` internally. {-# INLINE addFunction #-} -addFunction :: Num i => Bool -> CT.StorageClass i -> HT.TokenLC i -> Scoped i -> Either (SM.ASTError i) (Scoped i) -addFunction fd ty tkn sc = (\x -> sc { functions = x }) <$> PF.add fd ty tkn (functions sc) +addFunction :: (Eq i, Num i) => Bool -> Bool -> CT.StorageClass i -> HT.TokenLC i -> Scoped i -> Either (SM.ASTError i) (Scoped i) +addFunction fd isImplicit ty tkn sc = do + rejectFunctionDeclNameConflict tkn sc + visibleFunctions <- PF.add (curNestDepth sc) fd isImplicit ty tkn (functions sc) + declaredFunctions <- PF.add 0 fd isImplicit ty tkn (externalFunctions sc) + pure sc { functions = visibleFunctions, externalFunctions = declaredFunctions } + where + rejectFunctionDeclNameConflict tkn' sc' = case identifierFromToken tkn' of + Just ident + | depth == 0 + && (isJust (lookupExternalGVar ident sc') || sameScopeTypedef ident 0 sc' || sameScopeEnumerator ident 0 sc') -> + Left ("conflicting types for '" <> ident <> "'", tkn') + | depth /= 0 + && isJust (lookupGVar ident sc') -> + Left ("conflicting types for '" <> ident <> "'", tkn') + | depth /= 0 + && (sameScopeLVar ident depth sc' || sameScopeGVar ident depth sc' || sameScopeTypedef ident depth sc' || sameScopeEnumerator ident depth sc') -> + Left ("conflicting types for '" <> ident <> "'", tkn') + _ -> + Right () + where + depth = curNestDepth sc' + +{-# INLINE addFunctionAllowGlobalConflict #-} +addFunctionAllowGlobalConflict :: (Eq i, Num i) => Bool -> Bool -> CT.StorageClass i -> HT.TokenLC i -> Scoped i -> Either (SM.ASTError i) (Scoped i) +addFunctionAllowGlobalConflict fd isImplicit ty tkn sc = do + rejectFunctionDeclNameConflict tkn sc + visibleFunctions <- PF.add (curNestDepth sc) fd isImplicit ty tkn (functions sc) + declaredFunctions <- PF.add 0 fd isImplicit ty tkn (externalFunctions sc) + pure sc { functions = visibleFunctions, externalFunctions = declaredFunctions } + where + rejectFunctionDeclNameConflict tkn' sc' = case identifierFromToken tkn' of + Just ident + | depth == 0 + && (sameScopeTypedef ident 0 sc' || sameScopeEnumerator ident 0 sc') -> + Left ("conflicting types for '" <> ident <> "'", tkn') + | depth /= 0 + && isJust (lookupGVar ident sc') -> + Left ("conflicting types for '" <> ident <> "'", tkn') + | depth /= 0 + && (sameScopeLVar ident depth sc' || sameScopeGVar ident depth sc' || sameScopeTypedef ident depth sc' || sameScopeEnumerator ident depth sc') -> + Left ("conflicting types for '" <> ident <> "'", tkn') + _ -> + Right () + where + depth = curNestDepth sc' -- | `addEnumerator` has a scoped type argument and is the same function as `SE.add` internally. {-# INLINE addEnumerator #-} addEnumerator :: Num i => CT.StorageClass i -> HT.TokenLC i -> i -> Scoped i -> Either (SM.ASTError i) (Scoped i) -addEnumerator ty tkn val sc = (\x -> sc { enumerators = x }) <$> SE.add ty tkn val (enumerators sc) +addEnumerator ty tkn val sc = + rejectEnumeratorNameConflict tkn sc *> ((\x -> sc { enumerators = x }) <$> SE.add (curNestDepth sc) ty tkn val (enumerators sc)) + where + rejectEnumeratorNameConflict tkn' sc' = case identifierFromToken tkn' of + Just ident + | depth == 0 + && (isJust (lookupGVar ident sc') || isJust (lookupFunction ident sc') || sameScopeTypedef ident 0 sc') -> + Left ("redeclaration of enumerator '" <> ident <> "'", tkn') + | depth /= 0 + && (sameScopeLVar ident depth sc' || sameScopeGVar ident depth sc' || sameScopeTypedef ident depth sc' || sameScopeFunction ident depth sc') -> + Left ("redeclaration of enumerator '" <> ident <> "'", tkn') + _ -> + Right () + where + depth = curNestDepth sc' {-# INLINE initScope #-} -- | Helper function representing an empty scoped data initScope :: Scoped i -initScope = Scoped 0 PV.initVars SM.initial SM.initial SM.initial SM.initial +initScope = Scoped 0 (CT.ScopeId 0) (CT.ScopeId 1) PV.initVars SM.initial SM.initial SM.initial SM.initial SM.initial {-# INLINE resetLocal #-} -- | `resetLocal` has a scoped type argument and is the same function as `PV.resetLocal` internally. diff --git a/src/Htcc/Parser/ConstructionData/Scope/Enumerator.hs b/src/Htcc/Parser/ConstructionData/Scope/Enumerator.hs index 9894ed9..d1ca810 100644 --- a/src/Htcc/Parser/ConstructionData/Scope/Enumerator.hs +++ b/src/Htcc/Parser/ConstructionData/Scope/Enumerator.hs @@ -20,6 +20,7 @@ import Control.DeepSeq (NFData (..)) import qualified Data.Map as M import qualified Data.Text as T import GHC.Generics (Generic (..)) +import Numeric.Natural (Natural) import qualified Htcc.CRules.Types as CT import Htcc.Parser.AST.Core (Treealizable (..), @@ -32,13 +33,14 @@ import qualified Htcc.Tokenizer.Token as HT data Enumerator i = Enumerator { enVal :: i, -- ^ The value of enumerator - enUnderlying :: CT.StorageClass i -- ^ The underlying type of this enumerator + enUnderlying :: CT.StorageClass i, -- ^ The underlying type of this enumerator + enNestDepth :: !Natural -- ^ The nest depth of this enumerator } deriving (Eq, Ord, Show, Generic) instance NFData i => NFData (Enumerator i) instance Treealizable Enumerator where - treealize (Enumerator val _) = atNumLit val + treealize (Enumerator val _ _) = atNumLit val instance ManagedScope (Enumerator i) where lookup = M.lookup @@ -53,8 +55,12 @@ type Enumerators i = M.Map T.Text (Enumerator i) -- return an error message and its location as a pair. -- Otherwise, add a new tag to `Enumerators` and return it. -- If the token does not indicate an identifier, an error indicating internal compiler error is returned. -add :: Num i => CT.StorageClass i -> HT.TokenLC i -> i -> Enumerators i -> Either (ASTError i) (Enumerators i) -add t cur@(_, HT.TKIdent ident) val sts = case M.lookup ident sts of - Just _ -> Left ("redeclaration of enumerator '" <> ident <> "'", cur) -- ODR - Nothing -> Right $ M.insert ident (Enumerator val t) sts -add _ _ _ _ = Left (internalCE, (HT.TokenLCNums 0 0, HT.TKEmpty)) +add :: Num i => Natural -> CT.StorageClass i -> HT.TokenLC i -> i -> Enumerators i -> Either (ASTError i) (Enumerators i) +add cnd t cur@(_, HT.TKIdent ident) val sts = case M.lookup ident sts of + Just foundedEnum + | enNestDepth foundedEnum /= cnd -> enat + | otherwise -> Left ("redeclaration of enumerator '" <> ident <> "'", cur) -- ODR + Nothing -> enat + where + enat = Right $ M.insert ident (Enumerator val t cnd) sts +add _ _ _ _ _ = Left (internalCE, (HT.TokenLCNums 0 0, HT.TKEmpty)) diff --git a/src/Htcc/Parser/ConstructionData/Scope/Function.hs b/src/Htcc/Parser/ConstructionData/Scope/Function.hs index 6a2ed36..c9a4bac 100644 --- a/src/Htcc/Parser/ConstructionData/Scope/Function.hs +++ b/src/Htcc/Parser/ConstructionData/Scope/Function.hs @@ -20,6 +20,7 @@ import Control.DeepSeq (NFData (..)) import qualified Data.Map as M import qualified Data.Text as T import GHC.Generics (Generic (..)) +import Numeric.Natural (Natural) import qualified Htcc.CRules.Types as CT import Htcc.Parser.ConstructionData.Scope.ManagedScope @@ -29,15 +30,17 @@ import qualified Htcc.Tokenizer.Token as HT -- | The data type of a typedef tag data Function a = Function -- ^ The contypedefor of a typedef tag { - fntype :: CT.StorageClass a, -- ^ The type of this typedef - fnDefined :: Bool -- ^ If the function is defined, it will be `True`, otherwise will be `False`. + fntype :: CT.StorageClass a, -- ^ The type of this typedef + fnDefined :: Bool, -- ^ If the function is defined, it will be `True`, otherwise will be `False`. + fnImplicit :: Bool, -- ^ `True` only when the function only exists because of an implicit declaration synthesized from a call site. + fnNestDepth :: !Natural -- ^ The nest depth of this function declaration. } deriving (Eq, Ord, Show, Generic) instance NFData a => NFData (Function a) instance ManagedScope (Function i) where lookup = M.lookup - fallBack = flip const + fallBack pre post = M.union pre $ M.filter ((== 0) . fnNestDepth) post initial = M.empty -- | The typedefs data typedefs @@ -48,11 +51,52 @@ type Functions i = M.Map T.Text (Function i) -- return an error message and its location as a pair. -- Otherwise, add a new tag to `Functions` and return it. -- If the token does not indicate an identifier, an error indicating internal compiler error is returned. -add :: Num i => Bool -> CT.StorageClass i -> HT.TokenLC i -> Functions i -> Either (ASTError i) (Functions i) -add df t cur@(_, HT.TKIdent ident) sts = case M.lookup ident sts of - Just foundFunc - | not (fnDefined foundFunc) -> Right $ M.insert ident (Function t True) sts - | otherwise -> Left ("conflicting types for '" <> ident <> "'", cur) -- ODR - Nothing -> Right $ M.insert ident (Function t df) sts -add _ _ _ _ = Left (internalCE, (HT.TokenLCNums 0 0, HT.TKEmpty)) +add :: (Eq i, Num i) => Natural -> Bool -> Bool -> CT.StorageClass i -> HT.TokenLC i -> Functions i -> Either (ASTError i) (Functions i) +add cnd df isImplicit t cur@(_, HT.TKIdent ident) sts = case M.lookup ident sts of + Just foundFunc -> + case mergeFunctionTypes (fntype foundFunc) t of + Nothing -> + Left ("conflicting types for '" <> ident <> "'", cur) + Just mergedType + | fnNestDepth foundFunc == cnd && fnDefined foundFunc && df -> + Left ("conflicting types for '" <> ident <> "'", cur) + | otherwise -> + Right $ + M.insert + ident + Function + { fntype = mergedType + , fnDefined = fnDefined foundFunc || df + , fnImplicit = fnImplicit foundFunc && isImplicit + , fnNestDepth = storedDepth + } + sts + Nothing -> + Right $ + M.insert + ident + Function + { fntype = t + , fnDefined = df + , fnImplicit = isImplicit + , fnNestDepth = storedDepth + } + sts + where + storedDepth + | isImplicit = 0 + | otherwise = cnd +add _ _ _ _ _ _ = Left (internalCE, (HT.TokenLCNums 0 0, HT.TKEmpty)) +mergeFunctionTypes :: Eq i => CT.StorageClass i -> CT.StorageClass i -> Maybe (CT.StorageClass i) +mergeFunctionTypes (CT.SCAuto lhs) (CT.SCAuto rhs) = + CT.SCAuto <$> CT.mergeCompatibleTypeKinds lhs rhs +mergeFunctionTypes (CT.SCStatic lhs) (CT.SCStatic rhs) = + CT.SCStatic <$> CT.mergeCompatibleTypeKinds lhs rhs +mergeFunctionTypes (CT.SCStatic lhs) (CT.SCAuto rhs) = + CT.SCStatic <$> CT.mergeCompatibleTypeKinds lhs rhs +mergeFunctionTypes (CT.SCRegister lhs) (CT.SCRegister rhs) = + CT.SCRegister <$> CT.mergeCompatibleTypeKinds lhs rhs +mergeFunctionTypes (CT.SCUndef lhs) (CT.SCUndef rhs) = + CT.SCUndef <$> CT.mergeCompatibleTypeKinds lhs rhs +mergeFunctionTypes _ _ = Nothing diff --git a/src/Htcc/Parser/ConstructionData/Scope/Tag.hs b/src/Htcc/Parser/ConstructionData/Scope/Tag.hs index b571bab..0730156 100644 --- a/src/Htcc/Parser/ConstructionData/Scope/Tag.hs +++ b/src/Htcc/Parser/ConstructionData/Scope/Tag.hs @@ -9,11 +9,18 @@ Portability : POSIX The Data type of variables and its utilities used in parsing -} -{-# LANGUAGE DeriveGeneric, OverloadedStrings #-} +{-# LANGUAGE DeriveGeneric, LambdaCase, OverloadedStrings #-} module Htcc.Parser.ConstructionData.Scope.Tag ( + TagKind (..), Tag (..), Tags, + TagHistory, + emptyTagHistory, add + , fallBackTags + , lookupVisible + , lookupAtScope + , remember ) where import Control.DeepSeq (NFData (..)) @@ -27,10 +34,19 @@ import Htcc.Parser.ConstructionData.Scope.ManagedScope import Htcc.Parser.ConstructionData.Scope.Utils (internalCE) import qualified Htcc.Tokenizer.Token as HT +data TagKind + = StructTag + | EnumTag + deriving (Eq, Ord, Show, Generic) + +instance NFData TagKind + -- | The data type of a tag data Tag i = Tag -- ^ The constructor of a tag { sttype :: CT.StorageClass i, -- ^ The type of this tag + stKind :: TagKind, -- ^ The kind of this tag. + stScopeId :: !CT.ScopeId, -- ^ The scope identity of this tag. stNestDepth :: !Natural -- ^ The nest depth of this tag } deriving (Eq, Ord, Show, Generic) @@ -44,18 +60,50 @@ instance ManagedScope (Tag i) where -- | The `Tags` data type type Tags i = M.Map T.Text (Tag i) +-- | Historical tag bindings keyed by tag name and scope identity. +type TagHistory i = M.Map (T.Text, CT.ScopeId) (Tag i) + +{-# INLINE emptyTagHistory #-} +emptyTagHistory :: TagHistory i +emptyTagHistory = M.empty + +{-# INLINE fallBackTags #-} +fallBackTags :: Tags i -> Tags i -> Tags i +fallBackTags = const + +{-# INLINE lookupVisible #-} +lookupVisible :: T.Text -> Tags i -> Maybe (Tag i) +lookupVisible = M.lookup + +{-# INLINE lookupAtScope #-} +lookupAtScope :: T.Text -> CT.ScopeId -> TagHistory i -> Maybe (Tag i) +lookupAtScope ident scopeId = + M.lookup (ident, scopeId) + +{-# INLINE remember #-} +remember :: CT.ScopeId -> Natural -> TagKind -> CT.StorageClass i -> T.Text -> TagHistory i -> TagHistory i +remember scopeId depth kind ty ident = + M.insert (ident, scopeId) (Tag ty kind scopeId depth) + -- | Given the current nesting number, type, identifier token, and `Tags`, if the specified identifier already exists in the same scope, -- return an error message and its location as a pair. -- Otherwise, add a new tag to `Tags` and return it. -- If the token does not indicate an identifier, an error indicating internal compiler error is returned. -add :: Num i => Natural -> CT.StorageClass i -> HT.TokenLC i -> Tags i -> Either (ASTError i) (Tags i) -add cnd t cur@(_, HT.TKIdent ident) sts = case M.lookup ident sts of +add :: Num i => Natural -> CT.ScopeId -> TagKind -> CT.StorageClass i -> HT.TokenLC i -> Tags i -> Either (ASTError i) (Tags i) +add cnd scopeId kind t cur@(_, HT.TKIdent ident) sts = case M.lookup ident sts of Just foundedTag | stNestDepth foundedTag /= cnd -> stnat + | stKind foundedTag /= kind -> Left (tagRedefinitionMsg kind ident, cur) | CT.isCTIncomplete (sttype foundedTag) -> stnat - | otherwise -> Left ("redefinition of 'struct " <> ident <> "'", cur) -- ODR + | otherwise -> Left (tagRedefinitionMsg kind ident, cur) -- ODR Nothing -> stnat where - stnat = Right $ M.insert ident (Tag t cnd) sts -add _ _ _ _ = Left (internalCE, (HT.TokenLCNums 0 0, HT.TKEmpty)) + stnat = Right $ M.insert ident (Tag t kind scopeId cnd) sts + + tagRedefinitionMsg kind' ident' = + "redefinition of '" <> tagKindText kind' <> " " <> ident' <> "'" + tagKindText = \case + StructTag -> "struct" + EnumTag -> "enum" +add _ _ _ _ _ _ = Left (internalCE, (HT.TokenLCNums 0 0, HT.TKEmpty)) diff --git a/src/Htcc/Parser/ConstructionData/Scope/Var.hs b/src/Htcc/Parser/ConstructionData/Scope/Var.hs index d116e92..74d90c9 100644 --- a/src/Htcc/Parser/ConstructionData/Scope/Var.hs +++ b/src/Htcc/Parser/ConstructionData/Scope/Var.hs @@ -15,6 +15,7 @@ module Htcc.Parser.ConstructionData.Scope.Var ( Var (..), -- * The data type SomeVars, + GVarInitData (..), GVarInitWith (..), GVar (..), LVar (..), @@ -32,6 +33,7 @@ module Htcc.Parser.ConstructionData.Scope.Var ( addGVar, addLiteral, -- * Utilities + materializeTentativeIncompleteArray, initVars, resetLocal, fallBack @@ -41,6 +43,7 @@ import Control.DeepSeq (NFData (..)) import Data.Bits (Bits (..)) import qualified Data.ByteString as B import qualified Data.Map.Strict as M +import Data.Maybe (fromMaybe) import qualified Data.Text as T import GHC.Generics (Generic, Generic1) @@ -63,19 +66,67 @@ class Var a where vtype :: a i -> CT.StorageClass i -- | The informations type about initial value of the global variable -data GVarInitWith i = GVarInitWithZero | GVarInitWithOG T.Text | GVarInitWithVal i +data GVarInitData i + = GVarInitZeroBytes Natural + | GVarInitBytes Natural i + | GVarInitReloc Natural T.Text Integer deriving (Eq, Ord, Show, Generic) -instance NFData i => NFData (GVarInitWith i) +instance NFData i => NFData (GVarInitData i) + +-- | The informations type about initial value of the global variable +data GVarInitWith i + = GVarInitWithZero + | GVarInitWithExternDecl + | GVarInitWithOG T.Text + | GVarInitWithVal i + | GVarInitWithData [GVarInitData i] + | GVarInitWithAST (ATree i) + deriving (Eq, Show, Generic) + +instance NFData i => NFData (GVarInitWith i) where + rnf GVarInitWithZero = () + rnf GVarInitWithExternDecl = () + rnf (GVarInitWithOG ref) = rnf ref + rnf (GVarInitWithVal val) = rnf val + rnf (GVarInitWithData ds) = rnf ds + rnf (GVarInitWithAST ast) = ast `seq` () -- | The data type of the global variable data GVar i = GVar -- ^ The constructor of the global variable { - gvtype :: CT.StorageClass i, -- ^ The type of the global variable - initWith :: GVarInitWith i -- ^ The informations about initial value of the global variable - } deriving (Eq, Ord, Show, Generic) + gvtype :: CT.StorageClass i, -- ^ The type of the global variable + initWith :: GVarInitWith i, -- ^ The informations about initial value of the global variable + gvNestDepth :: !Natural -- ^ The nest depth where this declaration is currently visible. + } deriving (Eq, Show, Generic) + +instance NFData i => NFData (GVar i) where + rnf (GVar ty iw depth) = rnf ty `seq` rnf iw `seq` rnf depth -instance NFData i => NFData (GVar i) +materializeTentativeIncompleteArray :: Ord i => GVar i -> GVar i +materializeTentativeIncompleteArray gvar = case initWith gvar of + GVarInitWithZero -> + gvar { gvtype = CT.mapTypeKind materializeTentativeArrayType $ gvtype gvar } + _ -> + gvar + where + materializeTentativeArrayType = go + where + go (CT.CTArray n innerTy) = CT.CTArray n $ go innerTy + go (CT.CTIncomplete (CT.IncompleteArray elemTy)) = + materializeIncompleteArray elemTy + go ty = ty + + materializeIncompleteArray elemTy + | CT.isCTArray elemTy = + fromMaybe fallback $ + CT.concatCTArray + (CT.makeCTArray [1] $ CT.removeAllExtents elemTy) + elemTy + | otherwise = + fallback + where + fallback = CT.CTArray 1 elemTy instance Var GVar where vtype = gvtype @@ -135,9 +186,10 @@ type Literals a = [Literal a] -- | The data type of local variables data Vars a = Vars -- ^ The constructor of variables { - globals :: GlobalVars a, -- ^ The global variables - locals :: LocalVars a, -- ^ The local variables - literals :: Literals a -- ^ Literals + globals :: GlobalVars a, -- ^ The global variables visible in the current scope + externalGlobals :: GlobalVars a, -- ^ Global declarations remembered for translation-unit compatibility checks + locals :: LocalVars a, -- ^ The local variables + literals :: Literals a -- ^ Literals } deriving (Show, Generic, Generic1) instance NFData a => NFData (Vars a) @@ -145,7 +197,7 @@ instance NFData a => NFData (Vars a) {-# INLINE initVars #-} -- | Helper function representing an empty variables initVars :: Vars a -initVars = Vars SM.initial SM.initial [] +initVars = Vars SM.initial SM.initial SM.initial [] {-# INLINE resetLocal #-} -- | `resetLocal` initialize the local variable list for `Vars` @@ -176,7 +228,11 @@ maximumOffset m {-# INLINE fallBack #-} -- | Organize variable list state after scoping fallBack :: Vars a -> Vars a -> Vars a -fallBack pre post = pre { literals = literals post } +fallBack pre post = + pre + { externalGlobals = externalGlobals post + , literals = literals post + } -- | If the specified token is `HT.TKIdent` and the local variable does not exist in the list, `addLVar` adds a new local variable to the list, -- constructs a pair with the node representing the variable, wraps it in `Right` and return it. Otherwise, returns an error message and token pair wrapped in `Left`. @@ -196,17 +252,73 @@ addLVar _ _ _ _ = Left (internalCE, HT.emptyToken) -- | If the specified token is `HT.TKIdent` and the global variable does not exist in the list, `addLVar` adds a new global variable to the list, -- constructs a pair with the node representing the variable, wraps it in `Right` and return it. Otherwise, returns an error message and token pair wrapped in `Left`. -addGVarWith :: Num i => CT.StorageClass i -> HT.TokenLC i -> GVarInitWith i -> Vars i -> Either (SM.ASTError i) (ATree i, Vars i) -addGVarWith t cur@(_, HT.TKIdent ident) iw vars = flip (flip maybe $ const $ Left ("redeclaration of '" <> ident <> "' with no linkage", cur)) (lookupGVar ident vars) $ -- ODR - Right (atGVar (gvtype gvar) ident, vars { globals = M.insert ident gvar $ globals vars }) +addGVarWith :: (Ord i, Num i) => Natural -> CT.StorageClass i -> HT.TokenLC i -> GVarInitWith i -> Vars i -> Either (SM.ASTError i) (ATree i, Vars i) +addGVarWith cnd t cur@(_, HT.TKIdent ident) iw vars = do + (visibleGVar, visibleGlobals') <- mergeIntoMap cnd (globals vars) + (_, externalGlobals') <- mergeIntoMap 0 (externalGlobals vars) + pure + ( atGVar (gvtype visibleGVar) ident + , vars + { globals = visibleGlobals' + , externalGlobals = externalGlobals' + } + ) where - gvar = GVar t iw -addGVarWith _ _ _ _ = Left (internalCE, (HT.TokenLCNums 0 0, HT.TKEmpty)) + mergeIntoMap storedDepth sts = case M.lookup ident sts of + Nothing -> + let new = newGVar storedDepth + in Right (new, M.insert ident new sts) + Just existing -> + (\merged -> (merged, M.insert ident merged sts)) <$> mergeGVar storedDepth existing (newGVar storedDepth) + + newGVar = GVar t iw + + mergeGVar storedDepth lhs rhs = case mergeGVarTypes lhs rhs of + Nothing -> Left ("redeclaration of '" <> ident <> "' with no linkage", cur) + Just mergedType + | isExternDecl lhs && isExternDecl rhs -> + Right $ lhs { gvtype = mergedType, gvNestDepth = storedDepth } + | isExternDecl lhs -> + Right $ rhs { gvtype = mergedType, gvNestDepth = storedDepth } + | isExternDecl rhs -> + Right $ lhs { gvtype = mergedType, gvNestDepth = storedDepth } + | isTentativeGVar lhs && isTentativeGVar rhs -> + Right $ lhs { gvtype = mergedType, gvNestDepth = storedDepth } + | isTentativeGVar lhs -> + Right $ rhs { gvtype = mergedType, gvNestDepth = storedDepth } + | isTentativeGVar rhs -> + Right $ lhs { gvtype = mergedType, gvNestDepth = storedDepth } + | otherwise -> + Left ("redeclaration of '" <> ident <> "' with no linkage", cur) + + isTentativeGVar gvar = case initWith gvar of + GVarInitWithZero -> True + _ -> False + + isExternDecl gvar = case initWith gvar of + GVarInitWithExternDecl -> True + _ -> False + + mergeGVarTypes lhs rhs = case (gvtype lhs, gvtype rhs) of + (CT.SCAuto lhsTy, CT.SCAuto rhsTy) -> + CT.SCAuto <$> CT.mergeCompatibleTypeKinds lhsTy rhsTy + (CT.SCStatic lhsTy, CT.SCStatic rhsTy) -> + CT.SCStatic <$> CT.mergeCompatibleTypeKinds lhsTy rhsTy + (CT.SCStatic lhsTy, CT.SCAuto rhsTy) + | isExternDecl rhs -> + CT.SCStatic <$> CT.mergeCompatibleTypeKinds lhsTy rhsTy + (CT.SCRegister lhsTy, CT.SCRegister rhsTy) -> + CT.SCRegister <$> CT.mergeCompatibleTypeKinds lhsTy rhsTy + (CT.SCUndef lhsTy, CT.SCUndef rhsTy) -> + CT.SCUndef <$> CT.mergeCompatibleTypeKinds lhsTy rhsTy + _ -> + Nothing +addGVarWith _ _ _ _ _ = Left (internalCE, (HT.TokenLCNums 0 0, HT.TKEmpty)) -- | If the specified token is `HT.TKIdent` and the global variable does not exist in the list, `addLVar` adds a new global variable that will be initialized by zero to the list, -- constructs a pair with the node representing the variable, wraps it in `Right` and return it. Otherwise, returns an error message and token pair wrapped in `Left`. -addGVar :: Num i => CT.StorageClass i -> HT.TokenLC i -> Vars i -> Either (SM.ASTError i) (ATree i, Vars i) -addGVar t ident = addGVarWith t ident GVarInitWithZero +addGVar :: (Ord i, Num i) => Natural -> CT.StorageClass i -> HT.TokenLC i -> Vars i -> Either (SM.ASTError i) (ATree i, Vars i) +addGVar cnd t ident = addGVarWith cnd t ident GVarInitWithZero -- | If the specified token is `HT.TKString`, `addLiteral` adds a new literal to the list, -- constructs a pair with the node representing the variable, wraps it in `Right` and return it. Otherwise, returns an error message and token pair wrapped in `Left`. diff --git a/src/Htcc/Parser/Parsing.hs b/src/Htcc/Parser/Parsing.hs deleted file mode 100644 index 059cb3f..0000000 --- a/src/Htcc/Parser/Parsing.hs +++ /dev/null @@ -1,16 +0,0 @@ -{-| -Module : Htcc.Parser.Parsing -Description : The main routines for parsing -Copyright : (c) roki, 2019 -License : MIT -Maintainer : falgon53@yahoo.co.jp -Stability : experimental -Portability : POSIX - -The main routines for parsing --} -module Htcc.Parser.Parsing ( - module Htcc.Parser.Parsing.Core -) where - -import Htcc.Parser.Parsing.Core diff --git a/src/Htcc/Parser/Parsing/Core.hs b/src/Htcc/Parser/Parsing/Core.hs deleted file mode 100644 index 47aa884..0000000 --- a/src/Htcc/Parser/Parsing/Core.hs +++ /dev/null @@ -1,435 +0,0 @@ -{-# LANGUAGE BangPatterns, LambdaCase, OverloadedStrings, ScopedTypeVariables, - TupleSections #-} -{-| -Module : Htcc.Parser.Parsing.Core -Description : The C languge parser and AST constructor -Copyright : (c) roki, 2019 -License : MIT -Maintainer : falgon53@yahoo.co.jp -Stability : experimental -Portability : POSIX - -The C languge parser and AST constructor --} -module Htcc.Parser.Parsing.Core ( - -- * Recursive descent implementation functions - program, - globalDef, - stmt, - inners, - logicalOr, - logicalAnd, - bitwiseOr, - bitwiseXor, - bitwiseAnd, - shift, - add, - term, - cast, - unary, - factor, - relational, - equality, - conditional, - assign, - expr, - -- * Parser - parse, - -- * Types and synonyms - ASTs, - ASTSuccess, - ASTConstruction, - ASTResult, - -- * Utilities - stackSize -) where - -import Control.Monad (forM) -import Control.Monad.Loops (unfoldrM) -import Control.Monad.ST (runST) -import Data.Bits hiding (shift) -import qualified Data.ByteString as B -import Data.Either (isLeft, lefts, - rights) -import Data.Foldable (Foldable (..)) -import Data.List (find) -import Data.Maybe (fromJust, - fromMaybe) -import qualified Data.Set as S -import Data.STRef (newSTRef, - readSTRef, - writeSTRef) -import qualified Data.Text as T -import Data.Tuple.Extra (dupe, first, - second, snd3, - uncurry3) -import Numeric.Natural -import Prelude hiding - (toInteger) - -import qualified Htcc.CRules.Types as CT -import Htcc.Parser.AST -import Htcc.Parser.ConstructionData -import Htcc.Parser.ConstructionData.Scope (LookupVarResult (..), - Scoped (..)) -import qualified Htcc.Parser.ConstructionData.Scope.Function as PSF -import Htcc.Parser.ConstructionData.Scope.ManagedScope (ASTError) -import Htcc.Parser.ConstructionData.Scope.Utils (internalCE) -import qualified Htcc.Parser.ConstructionData.Scope.Var as PV -import Htcc.Parser.Parsing.Global -import Htcc.Parser.Parsing.StmtExpr -import Htcc.Parser.Parsing.Type -import Htcc.Parser.Parsing.Typedef -import Htcc.Parser.Utils -import qualified Htcc.Tokenizer as HT -import Htcc.Utils (first3, - first4, - maybe', - maybeToRight, - second3, - third3, - toInteger, - toNatural, - tshow) - -{-# INLINE varDecl #-} -varDecl :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ConstructionData i -> ASTConstruction i -varDecl tk scp = takeType tk scp >>= validDecl (HT.altEmptyToken tk) >>= varDecl' - where - varDecl' (_, Nothing, (_, HT.TKReserved ";"):ds, scp') = Right (ds, ATEmpty, scp') - varDecl' (t, Just ident, (_, HT.TKReserved ";"):ds, scp') = maybeToRight ("declaration with incomplete type", ident) (incomplete t scp) >>= \t' -> - addLVar t' ident scp' >>= \(lat, scp'') -> Right (ds, atNull lat, scp'') - varDecl' (t, Just ident, (_, HT.TKReserved "="):ds, scp') = (>>=) (varInit assign t ident ds scp') $ \case - ((_, HT.TKReserved ";"):ds', at, sc) -> Right (ds', at, sc) - _ -> Left ("expected ';' token, the subject iteration statement starts here:", head tk) - varDecl' (_, _, ds, _) = Left $ if null ds then ("expected unqualified-id", head tk) else ("expected unqualified-id before '" <> tshow (snd (head ds)) <> T.singleton '\'', head ds) - validDecl _ tnt@(t, Just ident, _, scp') = maybe' (Right tnt) (incomplete t scp') $ \t' -> if CT.toTypeKind t == CT.CTVoid then - Left ("variable or field '" <> tshow (snd ident) <> "' declarated void", ident) else Right $ first4 (const t') tnt - validDecl errPlaceholder tnt@(t, _, _, scp') = maybe' (Right tnt) (incomplete t scp') $ \t' -> if CT.toTypeKind t == CT.CTVoid then - Left ("declarations of type void is invalid in this context", errPlaceholder) else Right $ first4 (const t') tnt - --- | `program` indicates \(\eqref{eq:eigth}\) among the comments of `inners`. -program :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ConstructionData i -> Either (ASTError i) (ASTs i, ConstructionData i) -program [] !scp = Right ([], scp) -program xs !scp = either Left (\(ys, atn, !scp') -> first (atn:) <$> program ys scp') $ globalDef xs ATEmpty scp - --- | `stmt` indicates \(\eqref{eq:nineth}\) among the comments of `inners`. -stmt :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -stmt ((_, HT.TKReturn):(_, HT.TKReserved ";"):xs) _ scp = Right (xs, atReturn (CT.SCUndef CT.CTUndef) ATEmpty, scp) -- for @return;@ -stmt (cur@(_, HT.TKReturn):xs) atn !scp = (>>=) (expr xs atn scp) $ \(ert, erat, erscp) -> case ert of -- for @return@ - (_, HT.TKReserved ";"):ys -> Right (ys, atReturn (CT.SCUndef CT.CTUndef) erat, erscp) - ert' -> Left $ expectedMessage ";" cur ert' -stmt (cur@(_, HT.TKIf):(_, HT.TKReserved "("):xs) atn !scp = (>>=) (expr xs atn scp) $ \(ert, erat, erscp) -> case ert of -- for @if@ - (_, HT.TKReserved ")"):ys -> (>>=) (stmt ys erat erscp) $ \x -> case second3 (atIf erat) x of - ((_, HT.TKElse):zs, eerat, eerscp) -> second3 (atElse eerat) <$> stmt zs eerat eerscp -- for @else@ - zs -> Right zs - ert' -> Left $ expectedMessage ")" cur ert' -stmt (cur@(_, HT.TKWhile):(_, HT.TKReserved "("):xs) atn !scp = (>>=) (expr xs atn scp) $ \(ert, erat, erscp) -> case ert of -- for @while@ - (_, HT.TKReserved ")"):ys -> second3 (atWhile erat) <$> stmt ys erat erscp - ert' -> Left $ expectedMessage ")" cur ert' -stmt xxs@(cur@(_, HT.TKFor):(_, HT.TKReserved "("):_) _ !scp = (>>=) (maybeToRight (internalCE, cur) (takeBrace "(" ")" (tail xxs))) $ -- for @for@ - either (Left . ("expected ')' token. The subject iteration statement starts here:",)) $ \(forSt, ds) -> (>>=) (initSect (tail (init forSt)) $ succNest scp) $ \(fxs, finit, fscp') -> - (>>=) (condSect fxs fscp') $ \(fxs', fcond, fscp'') -> (>>=) (incrSect fxs' fscp'') $ \case - ([], fincr, fscp''') -> - let fnd = filter (\x' -> case fromATKindFor x' of ATEmpty -> False; x'' -> not $ isEmptyExprStmt x'') [ATForInit finit, ATForCond fcond, ATForIncr fincr] - mkk = maybe (ATForCond (atNumLit 1) : fnd) (const fnd) $ find isATForCond fnd in case ds of - ((_, HT.TKReserved ";"):ys) -> Right (ys, atFor mkk, fallBack scp fscp''') - _ -> third3 (fallBack scp) . second3 (atFor . (mkk ++) . (:[]) . ATForStmt) <$> stmt ds ATEmpty fscp''' - _ -> Left ("unexpected end of for statement", cur) - where - initSect [] _ = Left ("the iteration statement for must be `for (expression_opt; expression_opt; expression_opt) statement`. See section 6.8.5.", cur) - initSect ((_, HT.TKReserved ";"):ds) fsc = Right (ds, ATEmpty, fsc) - initSect forSect fsc - | isTypeName (head forSect) fsc = varDecl forSect fsc - | otherwise = (>>=) (expr forSect ATEmpty fsc) $ \(x, y, z) -> case x of - (_, HT.TKReserved ";"):ds -> Right (ds, atExprStmt y, z) - _ -> if null x then Left ("expected ';' token", HT.emptyToken) else Left ("expected ';' token after '" <> tshow (snd $ head x) <> "'", head x) - condSect [] _ = Left ("the iteration statement for must be `for (expression_opt; expression_opt; expression_opt) statement`. See section 6.8.5.", cur) - condSect ((_, HT.TKReserved ";"):ds) fsc = Right (ds, ATEmpty, fsc) - condSect forSect fsc = (>>=) (expr forSect ATEmpty fsc) $ \case - ((_, HT.TKReserved ";"):ds, y, z) -> Right (ds, y, z) - (x, _, _) -> if null x then Left ("expected ';' token", HT.emptyToken) else Left ("expected ';' token after '" <> tshow (snd $ head x) <> "'", head x) - incrSect [] fsc = Right ([], ATEmpty, fsc) - incrSect forSect fsc = second3 atExprStmt <$> expr forSect ATEmpty fsc -stmt xxs@(cur@(_, HT.TKReserved "{"):_) _ !scp = (>>=) (maybeToRight (internalCE, cur) (takeBrace "{" "}" xxs)) $ -- for compound statement - either (Left . ("the compound statement is not closed",)) $ \(sctk, ds) -> runST $ do - eri <- newSTRef Nothing - v <- newSTRef $ succNest scp - mk <- flip unfoldrM (init $ tail sctk) $ \ert -> if null ert then return Nothing else do - erscp <- readSTRef v - either (\err -> Nothing <$ writeSTRef eri (Just err)) (\(ert', erat', erscp') -> Just (erat', ert') <$ writeSTRef v erscp') $ stmt ert ATEmpty erscp - (>>=) (readSTRef eri) $ flip maybe (return . Left) $ Right . (ds, atBlock mk,) . fallBack scp <$> readSTRef v -stmt ((_, HT.TKReserved ";"):xs) atn !scp = Right (xs, atn, scp) -- for only @;@ -stmt (cur@(_, HT.TKBreak):xs) _ scp = case xs of -- for @break@ - (_, HT.TKReserved ";"):ds -> Right (ds, atBreak, scp) - _ -> Left ("expected ';' token after 'break' token", cur) -stmt (cur@(_, HT.TKContinue):xs) _ scp = case xs of -- for @continue@ - (_, HT.TKReserved ";"):ds -> Right (ds, atContinue, scp) - _ -> Left ("expected ';' token after 'continue' token", cur) -stmt (cur@(_, HT.TKSwitch):xs) atn scp = case xs of -- for @switch@ - (_, HT.TKReserved "("):xs' -> (>>=) (expr xs' atn scp) $ \case - (cur1@(_, HT.TKReserved ")"):xs'', cond, scp') -> - (>>=) (stmt xs'' ATEmpty (scp' { isSwitchStmt = True })) $ \case - (xs''', ATNode (ATBlock ats) t _ _, scp'') -> Right (xs''', atSwitch cond ats t, scp'' { isSwitchStmt = False }) - _ -> Left ("expected compound statement after the token ')'", cur1) - (xs'', _, _) -> Left $ if not (null xs'') then ("expected token ')' before '" <> tshow (snd $ head xs') <> "' token", head xs') else ("expected ')' token", HT.emptyToken) - _ -> Left ("expected token '(' after the token 'switch'", cur) -stmt (cur@(_, HT.TKCase):xs) atn scp -- for @case@ - | isSwitchStmt scp = flip (either (Left . fromMaybe ("expected constant expression after 'case' token", cur))) (constantExp xs scp) $ \case - ((_, HT.TKReserved ":"):ds, val) -> second3 (atCase 0 val) <$> stmt ds atn scp - (ds, _) -> Left $ if not (null ds) then ("expected ':' token before '" <> tshow (snd $ head ds) <> "'", head ds) else ("expected ':' token", head ds) - | otherwise = Left ("stray 'case'", cur) -stmt (cur@(_, HT.TKDefault):(_, HT.TKReserved ":"):xs) atn scp -- for @default@ - | isSwitchStmt scp = second3 (atDefault 0) <$> stmt xs atn scp - | otherwise = Left ("stray 'default'", cur) -stmt (cur@(_, HT.TKGoto):xs) _ scp = case xs of -- for @goto@ - (_, HT.TKIdent ident):(_, HT.TKReserved ";"):ds -> Right (ds, atGoto ident, scp) - (_, HT.TKIdent ident):_ -> Left ("expected ';' token after the identifier '" <> ident <> "'", cur) - _ -> Left ("expected identifier after the 'goto' token", cur) -stmt ((_, HT.TKIdent ident):(_, HT.TKReserved ":"):xs) _ scp = Right (xs, atLabel ident, scp) -- for local label -stmt xs@((_, HT.TKTypedef):_) _ scp = typedef xs scp -- for local @typedef@ -stmt tk atn !scp - | not (null tk) && isTypeName (head tk) scp = varDecl tk scp -- for a local variable declaration - | otherwise = (>>=) (expr tk atn scp) $ \(ert, erat, erscp) -> case ert of -- for stmt; - (_, HT.TKReserved ";"):ys -> Right (ys, atExprStmt erat, erscp) - ert' -> Left $ expectedMessage ";" (if null tk then HT.emptyToken else last tk) ert' - -{-# INLINE expr #-} --- | \({\rm expr} = {\rm assign}\left("," {\rm assign}\right)\ast\) -expr :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -expr tk at cd = assign tk at cd >>= uncurry3 f - where - f ((_, HT.TKReserved ","):xs) at' cd' = assign xs at' cd' >>= uncurry3 f . second3 (\x -> atComma (atype x) (atExprStmt at') x) - f tk' at' cd' = Right (tk', at', cd') - --- | `assign` indicates \(\eqref{eq:seventh}\) among the comments of `inners`. -assign :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -assign xs atn scp = (>>=) (conditional xs atn scp) $ \(ert, erat, erscp) -> case ert of - (_, HT.TKReserved "="):ys -> nextNode ATAssign ys erat erscp - (_, HT.TKReserved "*="):ys -> nextNode ATMulAssign ys erat erscp - (_, HT.TKReserved "/="):ys -> nextNode ATDivAssign ys erat erscp - (_, HT.TKReserved "&="):ys -> nextNode ATAndAssign ys erat erscp - (_, HT.TKReserved "|="):ys -> nextNode ATOrAssign ys erat erscp - (_, HT.TKReserved "^="):ys -> nextNode ATXorAssign ys erat erscp - (_, HT.TKReserved "<<="):ys -> nextNode ATShlAssign ys erat erscp - (_, HT.TKReserved ">>="):ys -> nextNode ATShrAssign ys erat erscp - (_, HT.TKReserved "+="):ys -> nextNode (maybe ATAddAssign (const ATAddPtrAssign) $ CT.deref (atype erat)) ys erat erscp - (_, HT.TKReserved "-="):ys -> nextNode (maybe ATSubAssign (const ATSubPtrAssign) $ CT.deref (atype erat)) ys erat erscp - _ -> Right (ert, erat, erscp) - where - nextNode atk ys erat erscp = (>>=) (assign ys erat erscp) $ \(zs, erat', erscp') -> - (>>=) (validAssign (if not (null zs) then head zs else if not (null ys) then head ys else if not (null xs) then head xs else HT.emptyToken) erat') $ \erat'' -> - Right (zs, ATNode atk (atype erat) erat erat'', erscp') - --- | `conditional` indicates \(\eqref{eq:seventeenth}\) among the comments of `inners`. -conditional :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -conditional xs atn scp = (>>=) (logicalOr xs atn scp) $ \(ert, cond, erscp) -> case ert of - -- GNU extension (Conditionals with Omitted Operands, see also: https://gcc.gnu.org/onlinedocs/gcc/Conditionals.html) - (_, HT.TKReserved "?"):(_, HT.TKReserved ":"):ds -> second3 (atConditional (atype cond) cond ATEmpty) <$> conditional ds cond erscp - cur@(_, HT.TKReserved "?"):ds -> (>>=) (expr ds cond erscp) $ \(ert', thn, erscp') -> case ert' of - (_, HT.TKReserved ":"):ds' -> second3 (atConditional (atype thn) cond thn) <$> conditional ds' thn erscp' - ds' -> if null ds' then Left ("expected ':' token for this '?'", cur) else Left ("expected ':' before '" <> tshow (snd (head ds')) <> "' token", head ds') - _ -> Right (ert, cond, erscp) - --- | `inners` is a general function for creating `equality`, `relational`, `add` and `term` in the following syntax (EBNF) of \({\rm LL}\left(k\right)\) where \(k\in\mathbb{N}\). --- --- \[ --- \begin{eqnarray} --- {\rm program} &=& \text{global-def*}\label{eq:eigth}\tag{1}\\ --- {\rm stmt} &=& \begin{array}{l} --- {\rm expr}?\ {\rm ";"}\\ --- \mid\ {\rm "\{"\ stmt}^\ast\ {\rm "\}"}\\ --- \mid\ {\rm "return"}\ {\rm expr}\ ";"\\ --- \mid\ "{\rm if}"\ "("\ {\rm expr}\ ")"\ {\rm stmt}\ ("{\rm else}"\ {\rm stmt})?\\ --- \mid\ {\rm "while"\ "("\ expr\ ")"\ stmt}\\ --- \mid\ {\rm "for"\ "("\ expr?\ ";" expr?\ ";"\ expr?\ ")"\ stmt? ";"} --- \end{array}\label{eq:nineth}\tag{2}\\ --- {\rm expr} &=& {\rm assign}\\ --- {\rm assign} &=& {\rm conditional} \left(\left("="\ \mid\ "+="\ \mid\ "-="\ \mid\ "*="\ \mid\ "/="\right)\ {\rm assign}\right)?\label{eq:seventh}\tag{3}\\ --- {\rm conditional} &=& {\rm logicalOr} \left("?"\ {\rm expr}\ ":"\ {\rm conditional}\right)?\label{eq:seventeenth}\tag{4}\\ --- {\rm logicalOr} &=& {\rm logicalAnd}\ \left("||"\ {\rm logicalAnd}\right)^\ast\label{eq:fifteenth}\tag{5}\\ --- {\rm logicalAnd} &=& {\rm bitwiseOr}\ \left("|"\ {\rm bitwiseOr}\right)^\ast\label{eq:sixteenth}\tag{6}\\ --- {\rm bitwiseOr} &=& {\rm bitwiseXor}\ \left("|"\ {\rm bitwiseXor}\right)^\ast\label{eq:tenth}\tag{7}\\ --- {\rm bitwiseXor} &=& {\rm bitwiseAnd}\ \left("\hat{}"\ {\rm bitwiseAnd}\right)^\ast\label{eq:eleventh}\tag{8}\\ --- {\rm bitwiseAnd} &=& {\rm equality}\ \left("\&"\ {\rm equality}\right)^\ast\label{eq:twelveth}\tag{9}\\ --- {\rm equality} &=& {\rm relational}\ \left("=="\ {\rm relational}\ \mid\ "!="\ {\rm relational}\right)^\ast\label{eq:fifth}\tag{10}\\ --- {\rm relational} &=& {\rm shift}\ \left("\lt"\ {\rm shift}\mid\ "\lt ="\ {\rm shift}\mid\ "\gt"\ {\rm shift}\mid\ "\gt ="\ {\rm shift}\right)^\ast\label{eq:sixth}\tag{11}\\ --- {\rm shift} &=& {\rm add}\ \left("\lt\lt"\ {\rm add}\mid\ "\gt\gt"\ {\rm add}\right)^\ast\label{eq:thirteenth}\tag{12}\\ --- {\rm add} &=& {\rm term}\ \left("+"\ {\rm term}\ \mid\ "-"\ {\rm term}\right)^\ast\label{eq:first}\tag{13} \\ --- {\rm term} &=& {\rm factor}\ \left("\ast"\ {\rm factor}\ \mid\ "/"\ {\rm factor}\right)^\ast\label{eq:second}\tag{14} \\ --- {\rm cast} &=& "(" {\rm type-name} ")"\ {\rm cast}\ \mid\ {\rm unary}\label{eq:fourteenth}\tag{15} \\ --- {\rm unary} &=& \left(\text{"+"}\ \mid\ \text{"-"}\ \mid\ \text{"*"}\ \mid\ \text{"&"}\ \mid\ \text{"!"}\ \mid\ \text{"-"}\right)\text{?}\ \text{cast}\ \mid\ \left(\text{"++"}\ \mid\ \text{"--"}\right)\ \text{unary}\ \mid\ \text{primary} \left(\text{"["}\ \text{expr}\ \text{"]"}\ \mid\ \text{"."}\ \text{ident}\ \mid\ \text{"->"}\ \text{ident}\ \mid\ \text{"++"}\ \mid\ \text{"--"}\right)\ast\label{eq:fourth}\tag{16} \\ --- {\rm factor} &=& {\rm num} \mid\ {\rm ident}\ \left({\rm "(" \left(expr\ \left(\left(","\ expr\right)^\ast\right)?\right)? ")"}\right)?\ \mid\ "(" {\rm expr} ")"\ \mid \text{string}\ \mid\ \text{"sizeof"}\ \text{"("}\ \text{type}\ \text{")"}\ \mid\ \text{"sizeof"}\ \text{unary}\ \mid\ \text{stmt-expr}\label{eq:third}\tag{17} --- \end{eqnarray} --- \] -inners :: ([HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i) -> [(T.Text, ATKind i)] -> [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -inners _ _ [] atn scp = Right ([], atn, scp) -inners f cs xs atn scp = either Left (uncurry3 (inners' f cs)) $ f xs atn scp - where - inners' _ _ [] at ars = Right ([], at, ars) - inners' g ds ys at ars = maybe' (Right (ys, at, ars)) (find (\(c, _) -> case snd (head ys) of HT.TKReserved cc -> cc == c; _ -> False) ds) $ \(_, k) -> - either Left (uncurry3 id . first3 (inners' f cs) . second3 (ATNode k (CT.SCAuto CT.CTInt) at)) $ g (tail ys) at ars - --- | `logicalOr` indicates \(\eqref{eq:fifteenth}\) among the comments of `inners`. -logicalOr :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -logicalOr = inners logicalAnd [("||", ATLOr)] - --- | `logicalAnd` indicates \(\eqref{eq:sixteenth}\) among the comments of `inners`. -logicalAnd :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -logicalAnd = inners bitwiseOr [("&&", ATLAnd)] - --- | `bitwiseOr` indicates \(\eqref{eq:tenth}\) among the comments of `inners`. -bitwiseOr :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -bitwiseOr = inners bitwiseXor [("|", ATOr)] - --- | `bitwiseXor` indicates \(\eqref{eq:eleventh}\) amont the comments of `inners`. -bitwiseXor :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -bitwiseXor = inners bitwiseAnd [("^", ATXor)] - --- | `bitwiseAnd` indicates \(\eqref{eq:twelveth}\) among the comments of `inners`. -bitwiseAnd :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -bitwiseAnd = inners equality [("&", ATAnd)] - --- | `equality` indicates \(\eqref{eq:fifth}\) among the comments of `inners`. --- This is equivalent to the following code: --- --- --- > equality :: [HT.TokenLC i] -> ATree i -> [LVar i] -> Either (ASTError i) ([HT.TokenLC i], ATree i) --- > equality xs atn scp = (>>=) (relational xs atn scp) $ uncurry3 equality' --- > where --- > equality' ((_, HT.TKReserved "=="):ys) era ars = either Left (uncurry3 id . first3 equality' . second3 (ATNode ATEQ era)) $ relational ys era ars --- > equality' ((_, HT.TKReserved "!="):ys) era ars = either Left (uncurry3 id . first3 equality' . second3 (ATNode ATNEQ era)) $ relational ys era ars --- > equality' ert era ars = Right (ert, era, ars) -equality :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -equality = inners relational [("==", ATEQ), ("!=", ATNEQ)] - --- | `relational` indicates \(\eqref{eq:sixth}\) among the comments of `inners`. -relational :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -relational = inners shift [("<", ATLT), ("<=", ATLEQ), (">", ATGT), (">=", ATGEQ)] - --- | `shift` indicates \(\eqref{eq:thirteenth}\\) among the comments of `inners`. -shift :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -shift = inners add [("<<", ATShl), (">>", ATShr)] - --- | `add` indicates \(\eqref{eq:first}\) among the comments of `inners`. -add :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -add xs atn scp = (>>=) (term xs atn scp) $ uncurry3 add' - where - add' (cur@(_, HT.TKReserved "+"):ys) era ars = (>>=) (term ys era ars) $ \zz -> - maybeToRight ("invalid operands", cur) (addKind era $ snd3 zz) >>= \nat -> uncurry3 id $ first3 add' $ second3 (const nat) zz - add' (cur@(_, HT.TKReserved "-"):ys) era ars = (>>=) (term ys era ars) $ \zz -> - maybeToRight ("invalid operands", cur) (subKind era $ snd3 zz) >>= \nat -> uncurry3 id $ first3 add' $ second3 (const nat) zz - add' ert erat ars = Right (ert, erat, ars) - --- | `term` indicates \(\eqref{eq:second}\) amont the comments of `inners`. -term :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -term = inners cast [("*", ATMul), ("/", ATDiv), ("%", ATMod)] - --- | `cast` indicates \(\eqref{eq:fourteenth}\) amont the comments of `inners`. -cast :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -cast (cur@(_, HT.TKReserved "("):xs) at scp = flip (either (const $ unary (cur:xs) at scp)) (takeTypeName xs scp) $ \case - (t, (_, HT.TKReserved ")"):xs') -> second3 (atCast t) <$> cast xs' at scp - _ -> Left ("The token ')' corresponding to '(' is expected", cur) -cast xs at scp = unary xs at scp - --- | `unary` indicates \(\eqref{eq:fourth}\) amount the comments of `inners`. -unary :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -unary ((_, HT.TKReserved "+"):xs) at scp = cast xs at scp -unary ((_, HT.TKReserved "-"):xs) at scp = second3 (ATNode ATSub (CT.SCAuto CT.CTInt) (atNumLit 0)) <$> cast xs at scp -unary ((_, HT.TKReserved "!"):xs) at scp = second3 (flip (ATNode ATNot $ CT.SCAuto CT.CTInt) ATEmpty) <$> cast xs at scp -unary ((_, HT.TKReserved "~"):xs) at scp = second3 (flip (ATNode ATBitNot $ CT.SCAuto CT.CTInt) ATEmpty) <$> cast xs at scp -unary ((_, HT.TKReserved "&"):xs) at scp = flip fmap (cast xs at scp) $ second3 $ \x -> let ty = if CT.isCTArray (atype x) then fromJust $ CT.deref (atype x) else atype x in - atUnary ATAddr (CT.mapTypeKind CT.CTPtr ty) x -unary (cur@(_, HT.TKReserved "*"):xs) at !scp = (>>=) (cast xs at scp) $ \(ert, erat, erscp) -> - maybeToRight ("invalid pointer dereference", cur) (CT.deref $ atype erat) >>= \y -> case CT.toTypeKind y of - CT.CTVoid -> Left ("void value not ignored as it ought to be", cur) - _ -> (\ty' -> (ert, atUnary ATDeref ty' erat, erscp)) <$> maybeToRight ("incomplete value dereference", cur) (incomplete y scp) -unary ((_, HT.TKReserved "++"):xs) at scp = second3 (\x -> ATNode ATPreInc (atype x) x ATEmpty) <$> unary xs at scp -unary ((_, HT.TKReserved "--"):xs) at scp = second3 (\x -> ATNode ATPreDec (atype x) x ATEmpty) <$> unary xs at scp -unary xs at scp = either Left (uncurry3 f) $ factor xs at scp - where - f (cur@(_, HT.TKReserved "["):xs') erat !erscp = (>>=) (expr xs' erat erscp) $ \(ert', erat', erscp') -> case ert' of - (_, HT.TKReserved "]"):xs'' -> maybeToRight ("invalid operands", cur) (addKind erat erat') >>= \erat'' -> - maybeToRight ("subscripted value is neither array nor pointer nor vector", HT.altEmptyToken xs) (CT.deref $ atype erat'') >>= \t -> - maybeToRight ("incomplete value dereference", cur) (incomplete t erscp') >>= \t' -> f xs'' (atUnary ATDeref t' erat'') erscp' - _ -> Left $ if null ert' then ("expected expression after '[' token", cur) else ("expected expression before '" <> tshow (snd (head ert')) <> "' token", head ert') - f (cur@(_, HT.TKReserved "."):xs') erat !erscp - | CT.isCTStruct (atype erat) || CT.isIncompleteStruct (atype erat) = if null xs' then Left ("expected identifier at end of input", cur) else case head xs' of - (_, HT.TKIdent ident) -> maybeToRight ("incomplete type '" <> tshow (atype erat) <> "'", cur) (incomplete (atype erat) erscp) >>= \t -> - maybeToRight ("no such member", cur) (CT.lookupMember ident (CT.toTypeKind t)) >>= \mem -> - f (tail xs') (atMemberAcc mem erat) erscp - _ -> Left ("expected identifier after '.' token", cur) - | otherwise = Left ("request for a member in something not a structure or union", cur) - f (cur@(_, HT.TKReserved "->"):xs') erat !erscp - | maybe False CT.isCTStruct (CT.deref $ atype erat) || maybe False CT.isIncompleteStruct (CT.deref $ atype erat) = if null xs' then Left ("expected identifier at end of input", cur) else - case head xs' of - (_, HT.TKIdent ident) -> maybeToRight ("incomplete type '" <> tshow (atype erat) <> "'", cur) (incomplete (fromJust (CT.deref $ atype erat)) erscp) >>= \t -> - maybeToRight ("no such member", cur) (CT.lookupMember ident (CT.toTypeKind t)) >>= \mem -> - f (tail xs') (atMemberAcc mem (atUnary ATDeref (CT.SCAuto $ CT.smType mem) erat)) erscp - _ -> Left ("expected identifier after '->' token", cur) - | otherwise = Left ("invalid type argument of '->'" <> if CT.isCTUndef (atype erat) then "" else " (have '" <> tshow (atype erat) <> "')", cur) - f ((_, HT.TKReserved "++"):xs') erat !erscp = f xs' (atUnary ATPostInc (atype erat) erat) erscp - f ((_, HT.TKReserved "--"):xs') erat !erscp = f xs' (atUnary ATPostDec (atype erat) erat) erscp - f ert erat !erscp = Right (ert, erat, erscp) - --- | `factor` indicates \(\eqref{eq:third}\) amount the comments of `inners`. -factor :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -factor [] atn !scp = Right ([], atn, scp) -factor tk@((_, HT.TKReserved "("):((_, HT.TKReserved "{"):_)) at !scp = stmtExpr tk at scp -factor (cur@(_, HT.TKReserved "("):xs) atn !scp = (>>=) (expr xs atn scp) $ \(ert, erat, erscp) -> case ert of -- for (expr) - (_, HT.TKReserved ")"):ys -> Right (ys, erat, erscp) - ert' -> Left $ expectedMessage ")" cur ert' -factor ((_, HT.TKNum n):xs) _ !scp = Right (xs, atNumLit n, scp) -- for numbers -factor (cur@(_, HT.TKIdent v):(_, HT.TKReserved "("):(_, HT.TKReserved ")"):xs) _ !scp = case lookupFunction v scp of -- for no arguments function call - Nothing -> Right (xs, atNoLeaf (ATCallFunc v Nothing) (CT.SCAuto CT.CTInt), pushWarn ("the function '" <> v <> "' is not declared.") cur scp) - Just fn -> Right (xs, atNoLeaf (ATCallFunc v Nothing) (PSF.fntype fn), scp) -factor (cur1@(_, HT.TKIdent v):cur2@(_, HT.TKReserved "("):xs) _ scp = (>>=) (maybeToRight (internalCE, cur1) (takeBrace "(" ")" (cur2:xs))) $ -- for some argumets function call - either (Left . ("invalid function call",)) $ \(fsec, ds) -> case lookupFunction v scp of - Nothing -> f fsec ds (pushWarn ("The function '" <> tshow (snd cur1) <> "' is not declared.") cur1 scp) $ CT.SCAuto CT.CTInt - Just fn -> f fsec ds scp (PSF.fntype fn) - where - f fsec ds scp' t = maybeToRight ("invalid function call", cur1) (takeExps fsec) >>= \exps -> runST $ do - mk <- newSTRef scp' - expl <- forM exps $ \etk -> readSTRef mk >>= either (return . Left) (\(_, erat, ervar) -> Right erat <$ writeSTRef mk ervar) . expr etk ATEmpty - if any isLeft expl then return $ Left $ head $ lefts expl else do - scp'' <- readSTRef mk - return $ Right (ds, atNoLeaf (ATCallFunc v (Just $ rights expl)) t, scp'') -factor (cur0@(_, HT.TKSizeof):cur@(_, HT.TKReserved "("):xs) atn scp = case takeTypeName xs scp of - Left _ -> second3 (atNumLit . fromIntegral . CT.sizeof . atype) <$> unary (cur:xs) atn scp -- for `sizeof(variable)` - Right (t, (_, HT.TKReserved ")"):ds) -> (ds, , scp) . atNumLit . fromIntegral . CT.sizeof <$> - maybeToRight ("invalid application of 'sizeof' to incomplete type '" <> tshow (CT.toTypeKind t) <> "'", cur0) (incomplete t scp) - Right _ -> Left ("The token ')' corresponding to '(' is expected", cur) -factor ((_, HT.TKSizeof):xs) atn !scp = second3 (atNumLit . fromIntegral . CT.sizeof . atype) <$> unary xs atn scp -- for `sizeof variable` -- TODO: the type of sizeof must be @size_t@ -factor (cur@(_, HT.TKAlignof):xs) atn !scp = (>>=) (unary xs atn scp) $ \(ert, erat, erscp) -> - if CT.isCTUndef (atype erat) then Left ("_Alignof must be an expression or type", cur) else Right (ert, atNumLit $ fromIntegral $ CT.alignof $ atype erat, erscp) -- Note: Using alignof for expressions is a non-standard feature of C11 -factor (cur@(_, HT.TKString slit):xs) _ !scp = uncurry (xs,,) <$> addLiteral (CT.SCAuto $ CT.CTArray (fromIntegral $ B.length slit) CT.CTChar) cur scp -- for literals -factor (cur@(_, HT.TKIdent ident):xs) _ !scp = case lookupVar ident scp of - FoundGVar (PV.GVar t _) -> Right (xs, atGVar t ident, scp) -- for declared global variable - FoundLVar sct -> Right (xs, treealize sct, scp) -- for declared local variable - FoundEnum sct -> Right (xs, treealize sct, scp) -- for declared enumerator - NotFound -> Left ("The '" <> ident <> "' is not defined variable", cur) -factor ert _ _ = Left (if null ert then "unexpected token in program" else "unexpected token '" <> tshow (snd (head ert)) <> "' in program", HT.altEmptyToken ert) - -{-# INLINE parse #-} --- | Constructs the abstract syntax tree based on the list of token strings. --- if construction fails, `parse` returns the error message and the token at the error location. --- Otherwise, `parse` returns a list of abstract syntax trees, a set of global variables, and a list of literals. -parse :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ASTResult i -parse = fmap (\(ast, sc) -> (warns sc, ast, PV.globals $ vars $ scope sc, PV.literals $ vars $ scope sc)) . flip program initConstructionData - --- | `stackSize` returns the stack size of variable per function. -stackSize :: (Show i, Integral i) => ATree i -> Natural -stackSize (ATNode (ATDefFunc _ args) _ body _) = let ms = f body $ maybe S.empty (foldr (\(ATNode (ATLVar t x) _ _ _) acc -> S.insert (t, x) acc) S.empty) args in - if S.size ms == 1 then toNatural $ flip CT.alignas 8 $ toInteger $ CT.sizeof $ fst $ head (S.toList ms) else toNatural $ flip CT.alignas 8 $ uncurry (+) $ - first (toInteger . CT.sizeof . fst) $ second (fromIntegral . snd) $ dupe $ foldl' (\acc x -> if snd acc < snd x then x else acc) (CT.SCUndef CT.CTUndef, 0) $ S.toList ms - where - f ATEmpty !s = s - f (ATNode (ATCallFunc _ (Just arg)) t l r) !s = f (ATNode (ATBlock arg) t l r) s - f (ATNode (ATLVar t x) _ l r) !s = let i = S.insert (t, x) s in f l i `S.union` f r i - f (ATNode (ATBlock xs) _ l r) !s = let i = foldr (S.union . (`f` s)) s xs in f l i `S.union` f r i - f (ATNode (ATStmtExpr xs) t l r) !s = f (ATNode (ATBlock xs) t l r) s - f (ATNode (ATFor xs) _ l r) !s = let i = foldr (S.union . flip f s . fromATKindFor) S.empty xs in f l i `S.union` f r i - f (ATNode (ATNull x) _ _ _) !s = f x s - f (ATNode _ _ l r) !s = f l s `S.union` f r s -stackSize _ = 0 - diff --git a/src/Htcc/Parser/Parsing/Core.hs-boot b/src/Htcc/Parser/Parsing/Core.hs-boot deleted file mode 100644 index a56fa51..0000000 --- a/src/Htcc/Parser/Parsing/Core.hs-boot +++ /dev/null @@ -1,12 +0,0 @@ -module Htcc.Parser.Parsing.Core where - -import Data.Bits (Bits) -import Htcc.Tokenizer (TokenLC) -import Htcc.Parser.ConstructionData (ConstructionData) -import Htcc.Parser.AST (ATree, ASTConstruction) -import qualified Htcc.Tokenizer as HT - -stmt :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i - -conditional :: (Show i, Read i, Integral i, Bits i) => [TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i - diff --git a/src/Htcc/Parser/Parsing/Global.hs b/src/Htcc/Parser/Parsing/Global.hs deleted file mode 100644 index 702e32d..0000000 --- a/src/Htcc/Parser/Parsing/Global.hs +++ /dev/null @@ -1,40 +0,0 @@ -{-# LANGUAGE BangPatterns, LambdaCase, OverloadedStrings #-} -{-| -Module : Htcc.Parser.Parsing.Global -Description : The C languge parser and AST constructor -Copyright : (c) roki, 2019 -License : MIT -Maintainer : falgon53@yahoo.co.jp -Stability : experimental -Portability : POSIX - -The module of the globals --} -module Htcc.Parser.Parsing.Global ( - globalDef -) where - -import Data.Bits - -import Htcc.Parser.AST -import Htcc.Parser.ConstructionData -import Htcc.Parser.Parsing.Global.Function -import Htcc.Parser.Parsing.Global.Var -import Htcc.Parser.Parsing.Type (takeType) -import Htcc.Parser.Parsing.Typedef -import qualified Htcc.Tokenizer as HT -import Htcc.Utils (uncurry4) - --- | `globalDef` parses global definitions (include functions and global variables) --- \[ --- \text{global-def}=\left(\text{global-var}\ \mid\ \text{function}\right)\text{*} --- \] -globalDef :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -globalDef (cur@(_, HT.TKReserved "register"):_) _ _ = Left ("illegal storage class on file-scoped identifier", cur) -globalDef (cur@(_, HT.TKReserved "auto"):_) _ _ = Left ("illegal storage class on file-scoped identifier", cur) -globalDef xs@((_, HT.TKTypedef):_) _ sc = typedef xs sc -- for global @typedef@ -globalDef tks at !va = (>>=) (takeType tks va) $ \case - (_, Nothing, (_, HT.TKReserved ";"):ds', scp) -> Right (ds', ATEmpty, scp) -- e.g., @int;@ is legal in C11 (See N1570/section 6.7 Declarations) - (funcType, ident@(Just (_, HT.TKIdent _)), tk@((_, HT.TKReserved "("):_), !sc) -> function funcType ident tk at sc - p@(_, Just (_, HT.TKIdent _), _, _) -> uncurry4 var p - _ -> Left ("invalid definition of global identifier", HT.altEmptyToken tks) diff --git a/src/Htcc/Parser/Parsing/Global.hs-boot b/src/Htcc/Parser/Parsing/Global.hs-boot deleted file mode 100644 index 2c26ff2..0000000 --- a/src/Htcc/Parser/Parsing/Global.hs-boot +++ /dev/null @@ -1,8 +0,0 @@ -module Htcc.Parser.Parsing.Global where - -import Data.Bits -import Htcc.Parser.AST -import Htcc.Parser.ConstructionData -import qualified Htcc.Tokenizer as HT - -globalDef :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i diff --git a/src/Htcc/Parser/Parsing/Global/Function.hs b/src/Htcc/Parser/Parsing/Global/Function.hs deleted file mode 100644 index 7980e49..0000000 --- a/src/Htcc/Parser/Parsing/Global/Function.hs +++ /dev/null @@ -1,80 +0,0 @@ -{-# LANGUAGE BangPatterns, LambdaCase, OverloadedStrings, ScopedTypeVariables, - TupleSections #-} -{-| -Module : Htcc.Parser.Parsing.Global.Function -Description : The C languge parser and AST constructor -Copyright : (c) roki, 2019 -License : MIT -Maintainer : falgon53@yahoo.co.jp -Stability : experimental -Portability : POSIX - -The function declaration --} -module Htcc.Parser.Parsing.Global.Function ( - function -) where - -import Control.Monad.Loops (unfoldrM) -import Control.Monad.ST (runST) -import Data.Bits hiding (shift) -import Data.List (find) -import Data.List.Split (linesBy) -import Data.Maybe (fromMaybe, isJust) -import Data.STRef (newSTRef, readSTRef, - writeSTRef) -import Prelude hiding (toInteger) - -import qualified Htcc.CRules.Types as CT -import Htcc.Parser.AST -import Htcc.Parser.ConstructionData -import Htcc.Parser.ConstructionData.Scope.Utils (internalCE) -import {-# SOURCE #-} Htcc.Parser.Parsing.Core (stmt) -import {-# SOURCE #-} Htcc.Parser.Parsing.Global (globalDef) -import Htcc.Parser.Parsing.Type -import Htcc.Parser.Utils -import qualified Htcc.Tokenizer as HT -import Htcc.Utils (maybe', maybeToRight, - tshow) - --- | --- \[ --- \begin{array}{ccc} --- \text{function}&=&\text{pre-type}\ \text{declaration}\ \text{"("}\ \text{params?}\ \text{")"}\ \left(\text{"\{"}\ \text{stmt*}\ \text{"\}"}\ \mid\ \text{";"}\right)\\ --- \text{params}&=&\text{params}\left(\text{","}\ \text{param}\right)\text{*}\\ --- \text{param}&=&\text{pre-type}\ \text{declaration}\ \text{array-decl-suffix} --- \end{array} --- \] -function :: (Show i, Read i, Integral i, Bits i) => CT.StorageClass i -> Maybe (HT.TokenLC i) -> [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -function funcType (Just cur@(_, HT.TKIdent fname)) tk@((_, HT.TKReserved "("):_) at !sc = let scp = resetLocal sc in - (>>=) (maybeToRight (internalCE, cur) (takeBrace "(" ")" $ tail (cur:tk))) $ - either (Left . ("invalid function declaration/definition",)) $ \(fndec, st) -> case st of - ((_, HT.TKReserved ";"):ds'') -> addFunction False funcType cur scp >>= globalDef ds'' at -- for a function declaration -- TODO: read types of parameters and register them - ((_, HT.TKReserved "{"):_) -> (>>=) (addFunction True funcType cur scp) $ \scp' -> checkErr fndec scp' $ \args -> runST $ do -- for a function definition - eri <- newSTRef Nothing - v <- newSTRef scp' - mk <- flip unfoldrM args $ \args' -> if null args' then return Nothing else let arg = head args' in do - -- About @t'@: - -- An array of type T is equivalent to a pointer of type T in the context of function parameters. - m <- flip fmap (readSTRef v) $ \scp'' -> let (t, mident, _, _) = arg; t' = fromMaybe t $ aboutArray t in case mident of - Nothing -> Left ("anonymouse variable is not implemented yet", cur) -- TODO - Just ident -> addLVar t' ident scp'' - flip (either ((<$) Nothing . writeSTRef eri . Just)) m $ \(vat, scp'') -> Just (vat, tail args') <$ writeSTRef v scp'' - (>>=) (readSTRef eri) $ flip maybe (return . Left) $ flip fmap (readSTRef v) $ \v' -> (>>=) (stmt st at v') $ \case -- Forbid void to return a value in a return type function. - (ert, erat@(ATNode (ATBlock block) _ _ _), erscp) - | CT.toTypeKind funcType == CT.CTVoid -> if isJust (find isNonEmptyReturn block) then - Left ("The return type of function '" <> fname <> "' is void, but the statement returns a value", cur) else - Right (ert, atDefFunc fname (if null mk then Nothing else Just mk) funcType erat, erscp) - | otherwise -> let fnode = atDefFunc fname (if null mk then Nothing else Just mk) funcType erat in - maybe' (Right (ert, fnode, erscp)) (find isEmptyReturn block) $ const $ - Right (ert, fnode, pushWarn ("The return type of function '" <> fname <> "' is " <> tshow (CT.toTypeKind funcType) <> ", but the statement returns no value") cur erscp) - _ -> Left (internalCE, HT.emptyToken) - _ -> stmt tk at scp - where - checkErr ar !scp' f = let ar' = init $ tail ar in if not (null ar') && snd (head ar') == HT.TKReserved "," then Left ("unexpected ',' token", head ar') else - let args = linesBy ((==HT.TKReserved ",") . snd) ar' in mapM (`takeType` scp') args >>= f - aboutArray t - | CT.isCTArray t = CT.mapTypeKind CT.CTPtr <$> CT.deref t - | CT.isIncompleteArray t = Just $ CT.mapTypeKind (\(CT.CTIncomplete (CT.IncompleteArray t')) -> CT.CTPtr t') t - | otherwise = Nothing -function _ _ xs _ _ = Left (internalCE, HT.altEmptyToken xs) diff --git a/src/Htcc/Parser/Parsing/Global/Var.hs b/src/Htcc/Parser/Parsing/Global/Var.hs deleted file mode 100644 index ae419a2..0000000 --- a/src/Htcc/Parser/Parsing/Global/Var.hs +++ /dev/null @@ -1,68 +0,0 @@ -{-# LANGUAGE BangPatterns, OverloadedStrings, ScopedTypeVariables, - TupleSections #-} -{-| -Module : Htcc.Parser.Parsing.Global.Var -Description : The C languge parser and AST constructor -Copyright : (c) roki, 2019 -License : MIT -Maintainer : falgon53@yahoo.co.jp -Stability : experimental -Portability : POSIX - -The Global variable declaration --} -module Htcc.Parser.Parsing.Global.Var ( - var -) where - -import Data.Bits hiding (shift) -import Prelude hiding - (toInteger) - -import qualified Htcc.CRules.Types as CT -import Htcc.Parser.AST -import Htcc.Parser.ConstructionData -import Htcc.Parser.ConstructionData.Scope.ManagedScope (ASTError) -import Htcc.Parser.ConstructionData.Scope.Utils (internalCE) -import qualified Htcc.Parser.ConstructionData.Scope.Var as PV -import {-# SOURCE #-} Htcc.Parser.Parsing.Core (conditional) -import Htcc.Parser.Parsing.Type -import qualified Htcc.Tokenizer as HT -import Htcc.Utils (maybeToRight, - tshow) - -gvarInit :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> CT.StorageClass i -> HT.TokenLC i -> ConstructionData i -> Either (ASTError i) ([HT.TokenLC i], ConstructionData i) -gvarInit xs ty ident sc = do - (ds, ast, sc') <- conditional xs ATEmpty sc - case (atkind ast, atkind (atL ast)) of - (ATAddr, ATGVar _ name) -> (ds,) . snd <$> gvarInitWithOG ty ident name sc' - (ATAddr, _) -> Left ("invalid initializer in global variable", HT.altEmptyToken ds) - (ATGVar t name, _) - | CT.isCTArray t -> (ds,) . snd <$> gvarInitWithOG ty ident name sc' - | otherwise -> gvarInitWithVal ds sc' - _ -> gvarInitWithVal ds sc' - where - gvarInitWithOG ty' from to = addGVarWith ty' from (PV.GVarInitWithOG to) - gvarInitWithVal ds sc' = do - (ds', cval) <- either (maybe (Left ("initializer element is not constant", HT.altEmptyToken ds)) Left) Right $ constantExp xs sc' - (ds',) . snd <$> addGVarWith ty ident (PV.GVarInitWithVal cval) sc' - --- | \[ --- \text{global-var} = \text{pre-type}\ \text{declaration}\ \text{array-decl-suffix}\ \text{";"} --- \] -var :: (Show i, Read i, Integral i, Bits i) => CT.StorageClass i -> Maybe (HT.TokenLC i) -> [HT.TokenLC i] -> ConstructionData i -> ASTConstruction i -var ty (Just cur@(_, HT.TKIdent _)) xs !scp = case xs of - (_, HT.TKReserved "="):ds -> do -- for initializing - ty' <- maybeToRight ("defining global variables with a incomplete type", cur) (incomplete ty scp) - (ds', nsc) <- gvarInit ds ty' cur scp - case ds' of - (_, HT.TKReserved ";"):ds'' -> return (ds'', ATEmpty, nsc) - _ -> Left $ if null ds' then - ("expected ';' token after '" <> tshow (snd cur) <> "' token", HT.altEmptyToken ds') else - ("expected ';' token" <> (if null ds' then "" else " before '" <> tshow (snd $ head ds') <> "' token"), HT.altEmptyToken ds') - (_, HT.TKReserved ";"):ds -> do -- for non initializing - ty' <- maybeToRight ("defining global variables with a incomplete type", cur) (incomplete ty scp) - (ds, ATEmpty,) . snd <$> addGVar ty' cur scp - _ -> Left ("expected ';' token after '" <> tshow (snd cur) <> "' token", cur) -var _ _ xs _ = Left (internalCE, HT.altEmptyToken xs) - diff --git a/src/Htcc/Parser/Parsing/StmtExpr.hs b/src/Htcc/Parser/Parsing/StmtExpr.hs deleted file mode 100644 index 14317b3..0000000 --- a/src/Htcc/Parser/Parsing/StmtExpr.hs +++ /dev/null @@ -1,54 +0,0 @@ -{-# LANGUAGE BangPatterns, LambdaCase, OverloadedStrings, ScopedTypeVariables, - TupleSections #-} -{-| -Module : Htcc.Parser.Parsing.StmtExpr -Description : The C languge parser and AST constructor -Copyright : (c) roki, 2019 -License : MIT -Maintainer : falgon53@yahoo.co.jp -Stability : experimental -Portability : POSIX - -The module of the statement expression (GNU extension: ) --} -module Htcc.Parser.Parsing.StmtExpr ( - stmtExpr -) where - -import Control.Monad (when) -import Control.Monad.Loops (unfoldrM) -import Control.Monad.ST (runST) -import Data.Bits hiding (shift) -import Data.STRef (newSTRef, readSTRef, - writeSTRef) -import Prelude hiding (toInteger) - -import Htcc.Parser.AST -import Htcc.Parser.ConstructionData -import Htcc.Parser.ConstructionData.Scope.Utils (internalCE) -import {-# SOURCE #-} Htcc.Parser.Parsing.Core (stmt) -import Htcc.Parser.Utils -import qualified Htcc.Tokenizer as HT -import Htcc.Utils (maybeToRight, tshow) - --- | statement expression (GNU extension: ) --- \[\text{stmt-expr}=\text{"("}\ \text{"\{"}\ \text{stmt}\ \text{stmt*}\ \text{"\}"}\ \text{")"}\] -stmtExpr :: (Show i, Read i, Integral i, Bits i) => [HT.TokenLC i] -> ATree i -> ConstructionData i -> ASTConstruction i -stmtExpr ((_, HT.TKReserved "("):xs@((_, HT.TKReserved "{"):_)) _ !scp = (>>=) (maybeToRight (internalCE, head xs) (takeBrace "{" "}" xs)) $ - either (Left . ("the statement expression is not closed",)) $ \(sctk, ds) -> case ds of - (_, HT.TKReserved ")"):ds' -> runST $ do - eri <- newSTRef Nothing - v <- newSTRef $ succNest scp - lastA <- newSTRef ATEmpty - mk <- flip unfoldrM (init $ tail sctk) $ \ert -> if null ert then return Nothing else do - erscp <- readSTRef v - flip (either $ \err -> Nothing <$ writeSTRef eri (Just err)) (stmt ert ATEmpty erscp) $ \(ert', erat', erscp') -> - Just (erat', ert') <$ (writeSTRef v erscp' >> when (case erat' of ATEmpty -> False; _ -> True) (writeSTRef lastA erat')) - (>>=) (readSTRef eri) $ flip maybe (return . Left) $ do - v' <- readSTRef v - flip fmap (readSTRef lastA) $ \case - (ATNode ATExprStmt _ lhs _) -> Right (ds', atNoLeaf (ATStmtExpr (init mk ++ [lhs])) (atype lhs), fallBack scp v') - _ -> Left ("void value not ignored as it ought to be. the statement expression starts here:", head xs) - _ -> Left $ if null sctk then ("expected ')' token. the statement expression starts here: ", head xs) else - ("expected ')' token after '" <> tshow (snd $ last sctk) <> "' token", last sctk) -stmtExpr xs _ _ = Left (internalCE, HT.altEmptyToken xs) diff --git a/src/Htcc/Parser/Parsing/Type.hs b/src/Htcc/Parser/Parsing/Type.hs deleted file mode 100644 index bced49a..0000000 --- a/src/Htcc/Parser/Parsing/Type.hs +++ /dev/null @@ -1,278 +0,0 @@ -{-# LANGUAGE BangPatterns, LambdaCase, OverloadedStrings, ScopedTypeVariables, - TupleSections #-} -{-| -Module : Htcc.Parser.Parsing.Type -Description : The C languge parser and AST constructor -Copyright : (c) roki, 2019 -License : MIT -Maintainer : falgon53@yahoo.co.jp -Stability : experimental -Portability : POSIX - -The module of the Type parsing --} -module Htcc.Parser.Parsing.Type ( - -- * Constant - ConstantResult, - constantExp, - -- * Utilities - isTypeName, - -- * Structure and Enum - takeStructFields, - takeEnumFiels, - -- * Declarations - arrayDeclSuffix, - absDeclaration, - declaration, - -- * Type - takePreType, - takeType, - takeTypeName -) where - -import Data.Bits hiding (shift) -import Data.Bool (bool) -import qualified Data.Map.Strict as M -import Data.Maybe (fromJust, - fromMaybe, - isJust) -import qualified Data.Text as T -import Data.Tuple.Extra (dupe, first, - uncurry3) -import Prelude hiding - (toInteger) - -import qualified Htcc.CRules.Types as CT -import Htcc.Parser.AST -import Htcc.Parser.ConstructionData -import Htcc.Parser.ConstructionData.Scope.ManagedScope (ASTError) -import qualified Htcc.Parser.ConstructionData.Scope.Tag as PST -import qualified Htcc.Parser.ConstructionData.Scope.Typedef as PSD -import Htcc.Parser.ConstructionData.Scope.Utils (internalCE) -import {-# SOURCE #-} Htcc.Parser.Parsing.Core (conditional) -import Htcc.Parser.Utils -import qualified Htcc.Tokenizer as HT -import Htcc.Utils (dropFst3, - dropFst4, - dropSnd3, - first3, - maybe', - maybeToRight, - spanLen, - toInteger, - toNatural, - tshow) - --- | \[ --- \begin{array}{ccc} --- \text{struct-decl}&=&\text{"struct"}\ \text{ident?}\ \left(\text{"\{"}\ \text{struct-member}\ \text{"\}"}\right)\text{?}\\ --- \text{struct-member}&=&\text{pre-type}\ \text{declaration}\ \text{array-decl-suffix}\ \text{";"} --- \end{array} --- \] -takeStructFields :: (Integral i, Show i, Read i, Bits i) => [HT.TokenLC i] -> ConstructionData i -> Either (ASTError i) (M.Map T.Text (CT.StructMember i), ConstructionData i) -takeStructFields tk sc = takeStructFields' tk sc 0 - where - takeStructFields' [] scp' _ = Right (M.empty, scp') - takeStructFields' fs scp' !n = (>>=) (takeType fs scp' >>= validDecl (HT.altEmptyToken tk)) $ \case - (ty@(CT.SCAuto _), Just (_, HT.TKIdent ident), (_, HT.TKReserved ";"):ds, scp'') -> let ofs = toNatural $ CT.alignas (toInteger n) $ toInteger $ CT.alignof ty in - first (M.insert ident (CT.StructMember (CT.toTypeKind ty) ofs)) <$> takeStructFields' ds scp'' (ofs + fromIntegral (CT.sizeof ty)) - (_, Just _, _, _) -> Left ("invalid storage-class specifier", head fs) - _ -> Left ("expected member name or ';' after declaration specifiers", HT.altEmptyToken fs) - validDecl _ (t, Just ident, tks, scp) = maybeToRight ("declaration with incomplete type", ident) (incomplete t scp) >>= \t' -> if CT.toTypeKind t == CT.CTVoid then - Left ("variable or field '" <> tshow (snd ident) <> "' declarated void", ident) else Right (t', Just ident, tks, scp) - validDecl errPlaceholder (t, noth, tks, scp) = maybeToRight ("declaration with incomplete type", errPlaceholder) (incomplete t scp) >>= \t' -> if CT.toTypeKind t == CT.CTVoid then - Left ("declarations of type void is invalid in this context", errPlaceholder) else Right (t', noth, tks, scp) - --- | \[ --- \begin{array}{ccc} --- \text{enum-specifier}&=&\text{"enum"}\ \text{ident}\ \mid\ \text{"enum"}\ \text{ident?}\ \text{"\{"}\ \text{enum-list?}\ \text{"\}"}\\ --- \text{enum-list}&=&\text{enum-elem}\ \left(\text{","}\ \text{enum-elem}\right)\ast\ \text{","?}\\ --- \text{enum-elem}&=&\text{ident}\ \left(\text{"="}\ \text{const-expr}\right)\text{?} --- \end{array} --- \] -takeEnumFiels :: (Integral i, Show i, Read i, Bits i) => CT.StorageClass i -> [HT.TokenLC i] -> ConstructionData i -> Either (ASTError i) (M.Map T.Text i, ConstructionData i) -takeEnumFiels = takeEnumFiels' 0 - where - takeEnumFiels' !n ty [cur@(_, HT.TKIdent ident)] scp = (M.singleton ident n,) <$> addEnumerator ty cur n scp - takeEnumFiels' !n ty (cur@(_, HT.TKIdent ident):(_, HT.TKReserved ","):xs) scp = (>>=) (takeEnumFiels' (succ n) ty xs scp) $ \(m, scp') -> - (M.insert ident n m,) <$> addEnumerator ty cur n scp' - takeEnumFiels' _ ty (cur@(_, HT.TKIdent ident):(_, HT.TKReserved "="):xs) scp = case constantExp xs scp of - Left (Just err) -> Left err - Left Nothing -> Left ("The enumerator value for '" <> tshow (snd cur) <> "' is not an integer constant", cur) - Right ((_, HT.TKReserved ","):ds, val) -> (>>=) (takeEnumFiels' (succ val) ty ds scp) $ \(m, scp') -> - (M.insert ident val m,) <$> addEnumerator ty cur val scp' - Right (ds, val) -> (>>=) (takeEnumFiels' (succ val) ty ds scp) $ \(m, scp') -> - (M.insert ident val m,) <$> addEnumerator ty cur val scp' - takeEnumFiels' _ _ ds _ = let lst = if null ds then HT.emptyToken else last ds in - Left ("expected enum identifier_opt { enumerator-list } or enum identifier_opt { enumerator-list , }", lst) - -{-# INLINE takeCtorPtr #-} -takeCtorPtr :: Integral i => [HT.TokenLC i] -> (CT.StorageClass i -> CT.StorageClass i, [HT.TokenLC i]) -takeCtorPtr = first (CT.ctorPtr . toNatural) . dropSnd3 . spanLen ((==HT.TKReserved "*") . snd) - --- | It is obtained by parsing the front part of the type from the token string. --- e.g. @int (*)[4]@ applied to this function yields @int@. --- --- \[\begin{array}{ccc} --- \text{pre-type}&=&\text{builtin-type}\ \mid\ \text{struct-decl}\ \mid\ \text{typedef-name}\ \mid\ \text{enum-specifier}\\ --- \text{builtin-type}&=&\text{"void"}\ \mid\ \text{"_Bool"}\ \mid\ \text{"char"}\ \mid\ \text{"short"}\ \mid\ \text{"int"}\ \mid\ \text{"long"}\ \mid\ \text{"long "long"} --- \end{array} --- \] -takePreType :: (Integral i, Show i, Read i, Bits i) => [HT.TokenLC i] -> ConstructionData i -> Either (ASTError i) (CT.StorageClass i, [HT.TokenLC i], ConstructionData i) -takePreType ((_, HT.TKType ty1):y@(iy, HT.TKType ty2):xs) scp = maybeToRight (T.singleton '\'' <> tshow ty1 <> " " <> tshow ty2 <> "' is invalid.", y) (CT.qualify ty1 ty2) >>= \ty -> -- for a complex type - takePreType ((iy, HT.TKType ty):xs) scp -takePreType ((_, HT.TKType ty):xs) scp = Right (CT.SCAuto $ CT.toTypeKind $ CT.implicitInt ty, xs, scp) -- for fundamental type -takePreType ((_, HT.TKStruct):cur@(_, HT.TKReserved "{"):xs) scp = maybeToRight (internalCE, cur) (takeBrace "{" "}" (cur:xs)) >>= -- for @struct@ definition - either (Left . ("expected '}' token to match this '{'",)) (\(field, ds) -> uncurry (,ds,) . first (CT.SCAuto . CT.CTStruct) <$> takeStructFields (tail $ init field) scp) -takePreType ((_, HT.TKStruct):cur1@(_, HT.TKIdent ident):cur2@(_, HT.TKReserved "{"):xs) scp = (>>=) (maybeToRight (internalCE, cur1) (takeBrace "{" "}" (cur2:xs))) $ -- for @struct@ definition with tag - either (Left . ("expected '}' token to match this '{'",)) $ \(field, ds) -> (>>=) (addTag (CT.SCAuto $ CT.CTIncomplete $ CT.IncompleteStruct ident) cur1 scp) $ \scp' -> - (>>=) (takeStructFields (tail $ init field) scp') $ \(mem, scp'') -> let ty = CT.SCAuto $ CT.CTStruct mem in addTag ty cur1 scp'' >>= Right . (ty, ds,) -takePreType ((_, HT.TKStruct):cur1@(_, HT.TKIdent ident):xs) scp = case lookupTag ident scp of -- for variable declaration with @struct@ tag - Nothing -> let ty = CT.SCAuto $ CT.CTIncomplete $ CT.IncompleteStruct ident in (>>=) (addTag ty cur1 scp) $ \scp' -> Right (ty, xs, scp') - Just ty -> Right (PST.sttype ty, xs, scp) -takePreType (cur@(_, HT.TKIdent ident):xs) scp = (, xs, scp) . PSD.tdtype <$> maybeToRight (T.singleton '\'' <> tshow (snd cur) <> "' is not a type or also a typedef identifier", cur) (lookupTypedef ident scp) -- for declaration variable with @typedef@ -takePreType ((_, HT.TKEnum):cur@(_, HT.TKReserved "{"):xs) scp = (>>=) (maybeToRight (internalCE, cur) (takeBrace "{" "}" (cur:xs))) $ -- for @enum@ - either (Left . ("expected '}' token to match this '{'",)) $ \(field, ds) -> uncurry (,ds,) . first (CT.SCAuto . CT.CTEnum CT.CTInt) <$> takeEnumFiels (CT.SCAuto CT.CTInt) (tail $ init field) scp -takePreType ((_, HT.TKEnum):cur1@(_, HT.TKIdent _):cur2@(_, HT.TKReserved "{"):xs) scp = (>>=) (maybeToRight (internalCE, cur1) (takeBrace "{" "}" (cur2:xs))) $ -- for @enum@ with tag - either (Left . ("expected '}' token to match this '{'",)) $ \(field, ds) -> (>>=) (takeEnumFiels (CT.SCAuto CT.CTInt) (tail $ init field) scp) $ \(mem, scp') -> let ty = CT.SCAuto $ CT.CTEnum CT.CTInt mem in - addTag ty cur1 scp' >>= Right . (ty, ds,) -takePreType ((_, HT.TKEnum):cur1@(_, HT.TKIdent ident):xs) scp = (, xs, scp) . PST.sttype <$> maybeToRight ("storage size of '" <> ident <> "' isn't known", cur1) (lookupTag ident scp) -- declaration for @enum@ -takePreType ((_, HT.TKReserved _):cur@(_, HT.TKReserved _):_) _ = Left ("cannot combine with previous '" <> tshow (snd cur) <> "' declaration specifier", cur) -takePreType ((_, HT.TKReserved "static"):xs) scp = first3 (CT.SCStatic . CT.toTypeKind) <$> takePreType xs scp -takePreType ((_, HT.TKReserved "register"):xs) scp = first3 (CT.SCRegister . CT.toTypeKind) <$> takePreType xs scp -takePreType ((_, HT.TKReserved "auto"):xs) scp = takePreType xs scp -takePreType (x:_) _ = Left ("ISO C forbids declaration with no type", x) -takePreType _ _ = Left ("ISO C forbids declaration with no type", HT.emptyToken) - -{-# INLINE declaration #-} --- | \[ --- \text{declaration} = \text{"*"*}\ \left(\text{"("}\ \text{declaration}\ \text{")"}\ \mid\ \text{ident}\right)\ \text{array-decl-suffix} --- \] -declaration :: (Integral i, Bits i, Show i, Read i) => CT.StorageClass i -> [HT.TokenLC i] -> ConstructionData i -> Either (ASTError i) (CT.StorageClass i, Maybe (HT.TokenLC i), [HT.TokenLC i]) -declaration ty xs scp = case takeCtorPtr xs of - (fn, xs'@((_, HT.TKReserved "("):_)) -> dropFst4 <$> declaration' id (fn ty) xs' scp - (fn, ident@(_, HT.TKIdent _):ds') -> case arrayDeclSuffix (fn ty) ds' scp of - Nothing -> Right (fn ty, Just ident, ds') - Just rs -> uncurry (,Just ident,) <$> rs - (fn, es) -> Right (fn ty, Nothing, es) - where - declaration' fn ty' xs' scp' = case takeCtorPtr xs' of - (ptrf, cur@(_, HT.TKReserved "("):ds') -> (>>=) (declaration' (fn . ptrf) ty' ds' scp') $ \case - (ptrf', ty'', ident, (_, HT.TKReserved ")"):ds'') -> case arrayDeclSuffix ty'' ds'' scp' of - Nothing -> Right (id, ptrf' ty', ident, ds'') - Just rs -> uncurry (id,,ident,) . first ptrf' <$> rs - _ -> Left ("expected ')' token for this '('", cur) - (ptrf, ident@(_, HT.TKIdent _):ds') -> case arrayDeclSuffix ty' ds' scp' of - Nothing -> Right (ptrf, ty', Just ident, ds') - Just rs -> uncurry (ptrf,,Just ident,) <$> rs - _ -> Left ("expected some identifier", HT.emptyToken) - --- | `takeType` returns a pair of type (including pointer and array type) and the remaining tokens wrapped in --- `Just` only if the token starts with `HT.TKType`, `HT.TKStruct` or identifier that is declarated by @typedef@. --- Otherwise `Nothing` is returned. --- --- \[ --- \text{type}=\text{pre-type}\ \text{declaration} --- \] -takeType :: (Integral i, Show i, Read i, Bits i) => [HT.TokenLC i] -> ConstructionData i -> Either (ASTError i) (CT.StorageClass i, Maybe (HT.TokenLC i), [HT.TokenLC i], ConstructionData i) -takeType tk scp = takePreType tk scp >>= (\(x, y, z) -> uncurry3 (,,, z) <$> declaration x y z) - - --- | `absDeclaration` parses abstract type declarations: --- --- \[ --- \text{abs-declaration} = \text{"*"*}\ \left(\text{"("}\ \text{abs-declaration}\ \text{")"}\right)\text{?}\ \text{array-decl-suffix} --- \] -absDeclaration :: (Integral i, Bits i, Show i, Read i) => CT.StorageClass i -> [HT.TokenLC i] -> ConstructionData i -> Either (ASTError i) (CT.StorageClass i, [HT.TokenLC i]) -absDeclaration ty xs scp = case takeCtorPtr xs of - (fn, xs'@((_, HT.TKReserved "("):_)) -> dropFst3 <$> absDeclarator' id (fn ty) xs' scp - (fn, ds) -> fromMaybe (Right (fn ty, ds)) $ arrayDeclSuffix (fn ty) ds scp - where - absDeclarator' fn ty' xs' scp' = case takeCtorPtr xs' of - (ptrf, cur@(_, HT.TKReserved "("):ds') -> (>>=) (absDeclarator' (fn . ptrf) ty' ds' scp') $ \case - (ptrf', ty'', (_, HT.TKReserved ")"):ds'') -> maybe (Right (id, ptrf' ty'', ds'')) (fmap (uncurry (id,,) . first ptrf')) $ arrayDeclSuffix ty'' ds'' scp' - _ -> Left ("expected ')' token for this '('", cur) - (p, ds) -> Right (p, ty', ds) - --- | `takeTypeName` is used to parse type names used for sizeof etc. Version without `takeType`s identifier. -takeTypeName :: (Integral i, Show i, Read i, Bits i) => [HT.TokenLC i] -> ConstructionData i -> Either (ASTError i) (CT.StorageClass i, [HT.TokenLC i]) -takeTypeName tk scp = (>>=) (takePreType tk scp) $ \(x, y, z) -> if CT.isSCStatic x then Left ("storage-class specifier is not allowed", head tk) else absDeclaration x y z -- ! - --- | @HT.TKReserved "[", n, HT.TKReserved "]"@ from the beginning of the token sequence. --- `arrayDeclSuffix` constructs an array type of the given type @t@ based on --- the token sequence if \(k\leq 1\), wraps it in `Right` and `Just` and returns it with the rest of the token sequence. --- If the token @HT.TKReserved "["@ exists at the beginning of the token sequence, --- but the subsequent token sequence is invalid as an array declaration in C programming language, --- an error mesage and the token at the error location are returned wrapped in --- `Left` and `Just`. When \(k=0\), `Nothing` is returned. --- --- \[ --- \text{array-decl-suffix}=\left(\text{"["}\ \text{const-expr?}\ \text{"]"}\ \text{array-decl-suffix}\right)\text{?} --- \] -arrayDeclSuffix :: forall i. (Integral i, Bits i, Show i, Read i) => CT.StorageClass i -> [HT.TokenLC i] -> ConstructionData i -> Maybe (Either (ASTError i) (CT.StorageClass i, [HT.TokenLC i])) -arrayDeclSuffix t (cur@(_, HT.TKReserved "["):(_, HT.TKReserved "]"):xs) scp = case arrayDeclSuffix t xs scp of - Nothing -> Just ((,xs) . CT.mapTypeKind (CT.CTIncomplete . CT.IncompleteArray) <$> maybeToRight (errSt t) (incomplete t scp)) - Just rs -> Just . (>>=) rs $ \(t', ds) -> (,ds) . CT.mapTypeKind (uncurry ((.) fromJust . CT.concatCTArray) . first (CT.CTIncomplete . CT.IncompleteArray . CT.removeAllExtents) . dupe) <$> - maybeToRight (errSt t') (incomplete t' scp) - where - errSt t' = ("array type has incomplete element type '" <> tshow t' <> "'", cur) -arrayDeclSuffix t (cur@(_, HT.TKReserved "["):xs) scp = case constantExp xs scp of - Left (Just err) -> Just $ Left err - Left Nothing -> Just $ Left $ if null xs then ("The expression is not constant-expression", cur) else ("The expression '" <> tshow (snd $ head xs) <> "' is not constant-expression", head xs) - Right ((_, HT.TKReserved "]"):ds, val) -> Just $ maybe' (Right (CT.mapTypeKind (CT.CTArray (toNatural val)) t, ds)) (arrayDeclSuffix t ds scp) $ - either Left $ \(t', ds') -> maybe' (errSt t') (CT.concatCTArray (CT.mapTypeKind (CT.CTArray (toNatural val)) t) t') $ \ty -> if CT.isValidIncomplete ty then Right (ty, ds') else errSt t' - _ -> Just $ Left ("expected storage size after '[' token", cur) - where - errSt t' = Left ("array type has incomplete element type '" <> tshow t' <> "'", cur) -arrayDeclSuffix _ _ _ = Nothing - -{-# INLINE isTypeName #-} --- | `isTypeName` returns @True@ if the token is a type name, @False@ otherwise. -isTypeName :: HT.TokenLC i -> ConstructionData i -> Bool -isTypeName (_, HT.TKType _) _ = True -isTypeName (_, HT.TKStruct) _ = True -isTypeName (_, HT.TKEnum) _ = True -isTypeName (_, HT.TKReserved "static") _ = True -isTypeName (_, HT.TKReserved "auto") _ = True -isTypeName (_, HT.TKReserved "register") _ = True -isTypeName (_, HT.TKIdent ident) scp = isJust $ lookupTypedef ident scp -isTypeName _ _ = False - --- | The `Just` represents an error during construction of the syntax tree, and the `Nothing` represents no valid constant expression. -type ConstantResult i = Maybe (ASTError i) - --- | `constantExp` evaluates to a constant expression from token list. -constantExp :: forall i. (Bits i, Integral i, Show i, Read i) => [HT.TokenLC i] -> ConstructionData i -> Either (ConstantResult i) ([HT.TokenLC i], i) -constantExp tk scp = flip (either (Left . Just)) (conditional tk ATEmpty scp) $ \(ds, at, _) -> - maybe (Left Nothing) (Right . (ds, )) $ evalConstantExp at - where - evalConstantExp :: ATree i -> Maybe i - evalConstantExp (ATNode k _ lhs rhs) = let fromBool = fromIntegral . fromEnum :: Bool -> i in case k of - ATAdd -> binop (+) - ATSub -> binop (-) - ATMul -> binop (*) - ATDiv -> binop div - ATAnd -> binop (.&.) - ATXor -> binop xor - ATOr -> binop (.|.) - ATShl -> binop (flip (.) fromIntegral . shiftL) - ATShr -> binop (flip (.) fromIntegral . shiftR) - ATEQ -> binop ((.) fromBool . (==)) - ATNEQ -> binop ((.) fromBool . (/=)) - ATLT -> binop ((.) fromBool . (<)) - ATGT -> binop ((.) fromBool . (>)) - ATLEQ -> binop ((.) fromBool . (<=)) - ATGEQ -> binop ((.) fromBool . (>=)) - ATConditional cn th el -> evalConstantExp cn >>= bool (evalConstantExp el) (evalConstantExp th) . castBool - ATComma -> evalConstantExp rhs - ATNot -> fromIntegral . fromEnum . not . castBool <$> evalConstantExp lhs - ATBitNot -> complement <$> evalConstantExp lhs - ATLAnd -> binop ((.) fromBool . flip (.) castBool . (&&) . castBool) - ATLOr -> binop ((.) fromBool . flip (.) castBool . (||) . castBool) - ATNum v -> Just v - _ -> Nothing - where - binop f = (>>=) (evalConstantExp lhs) $ \lhs' -> fromIntegral . f lhs' <$> evalConstantExp rhs - castBool x | x == 0 = False | otherwise = True - evalConstantExp ATEmpty = Nothing - diff --git a/src/Htcc/Parser/Parsing/Typedef.hs b/src/Htcc/Parser/Parsing/Typedef.hs deleted file mode 100644 index e9c3e05..0000000 --- a/src/Htcc/Parser/Parsing/Typedef.hs +++ /dev/null @@ -1,44 +0,0 @@ -{-# LANGUAGE BangPatterns, OverloadedStrings, ScopedTypeVariables, - TupleSections #-} -{-| -Module : Htcc.Parser.Parsing.Typedef -Description : The C languge parser and AST constructor -Copyright : (c) roki, 2019 -License : MIT -Maintainer : falgon53@yahoo.co.jp -Stability : experimental -Portability : POSIX - -Perspective on @typedef@ declaration --} -module Htcc.Parser.Parsing.Typedef ( - typedef -) where - -import Data.Bits (Bits) - -import Htcc.Parser.AST -import Htcc.Parser.ConstructionData -import Htcc.Parser.ConstructionData.Scope.ManagedScope (ASTError) -import Htcc.Parser.ConstructionData.Scope.Utils (internalCE) -import Htcc.Parser.Parsing.Type -import qualified Htcc.Tokenizer as HT -import Htcc.Utils (maybeToRight, - tshow) - --- | Perform type definition from token string starting from @typedef@ token. --- \[\text{typedef-name}=\text{ident}\] -typedef :: (Integral i, Show i, Read i, Bits i) => [(HT.TokenLCNums i, HT.Token i)] -> ConstructionData i -> Either (ASTError i) ([HT.TokenLC i], ATree a, ConstructionData i) -typedef ((_, HT.TKTypedef):cur@(_, HT.TKReserved _):_) _ = Left ("storage-class specifier is not allowed in this context", cur) -typedef (cur@(_, HT.TKTypedef):xs) !scp = case takeType xs scp of - Left er -> Left er - Right (ty, Just ident, ds, scp') -> case ds of - (_, HT.TKReserved ";"):ds' -> do - ty' <- maybeToRight ("incomplete type typedef", ident) (incomplete ty scp') - (ds', ATEmpty,) <$> addTypedef ty' ident scp' - _ -> Left ("expected ';' token after '" <> tshow (snd ident) <> "'", ident) - Right (_, Nothing, ds, scp') -> case ds of - (_, HT.TKReserved ";"):ds' -> Right (ds', ATEmpty, pushWarn "useless type name in empty declaration" cur scp') - _ -> Left $ if not (null ds) then ("expected ';' token after '" <> tshow (snd $ head ds) <> "'", head ds) else ("expected ';' token", HT.emptyToken) -typedef _ _ = Left (internalCE, HT.emptyToken) - diff --git a/src/Htcc/Parser/Utils/Core.hs b/src/Htcc/Parser/Utils/Core.hs index ac270f8..ae10c22 100644 --- a/src/Htcc/Parser/Utils/Core.hs +++ b/src/Htcc/Parser/Utils/Core.hs @@ -49,7 +49,7 @@ takeBrace leftb rightb xxs@((_, HT.TKReserved y):_) | otherwise = g l r xs' where g = (.) (fmap (first ((p, HT.TKReserved x):)) .) . f - f !l !r ((p, x):xs') = first ((:) (p, x)) <$> f l r xs' + f !l !r ((p, x):xs') = first ((p, x):) <$> f l r xs' takeBrace _ _ _ = Nothing -- | Get an argument from list of `Htcc.Tokenizer.Token` (e.g: Given the token of @f(g(a, b)), 42@, return the token of @f(g(a, b))@). @@ -74,6 +74,5 @@ takeExps :: Eq i => [HT.TokenLC i] -> Maybe [[HT.TokenLC i]] takeExps ((_, HT.TKReserved "("):xs) = maybe' Nothing (lastInit ((==HT.TKReserved ")") . snd) xs) $ fmap (filter (not . null)) . f where f [] = Just [] - f args = maybe Nothing (\(ex, ds) -> (ex:) <$> f ds) $ readFn args + f args = readFn args >>= \(ex, ds) -> (ex:) <$> f ds takeExps _ = Nothing - diff --git a/src/Htcc/Tokenizer/Core.hs b/src/Htcc/Tokenizer/Core.hs index 2d98385..fb15fd7 100644 --- a/src/Htcc/Tokenizer/Core.hs +++ b/src/Htcc/Tokenizer/Core.hs @@ -9,7 +9,7 @@ Portability : POSIX The tokenizer -} -{-# LANGUAGE LambdaCase, MultiWayIf, OverloadedStrings, TupleSections #-} +{-# LANGUAGE LambdaCase, OverloadedStrings, TupleSections #-} module Htcc.Tokenizer.Core ( -- * Tokenizer tokenize' @@ -17,6 +17,7 @@ module Htcc.Tokenizer.Core ( import Control.Applicative (Alternative (..)) import Control.Conditional (ifM) +import Control.Monad (replicateM_) import Control.Monad.Extra (firstJustM) import Control.Monad.State import Data.Char (digitToInt, @@ -136,11 +137,12 @@ charLit = do txt <- gets snd maybe' (lift $ Left ("invalid char literal in program", (lc, TKReserved "\'"))) (spanCharLiteral txt) $ \(lit, ds) -> -- Adding 3 means to add a single character literal and two @"@ - if | T.length lit == 1 -> Just (TKNum (fromIntegral $ ord $ T.head lit)) <$ put (lc { tkCn = 3 + tkCn lc }, ds) + case T.length lit of + 1 -> Just (TKNum (fromIntegral $ ord $ T.head lit)) <$ put (lc { tkCn = 3 + tkCn lc }, ds) -- For multi-character constants. -- The standard states that this is an implementation definition. -- Here it follows the implementation definitions of GCC and Clang. - | otherwise -> Just (TKNum $ fst $ head $ readHex $ foldr (\x acc -> showHex (ord x) "" <> acc) [] $ T.unpack lit) <$ + _ -> Just (TKNum $ fst $ head $ readHex $ foldr (\x acc -> showHex (ord x) "" <> acc) [] $ T.unpack lit) <$ put (lc { tkCn = 2 + fromIntegral (T.length lit) + tkCn lc }, ds) operators :: (Enum i, Num i) => Tokenizer i (Maybe (Token i)) diff --git a/src/Htcc/Tokenizer/Token.hs b/src/Htcc/Tokenizer/Token.hs index 87cc1b3..cd22c51 100644 --- a/src/Htcc/Tokenizer/Token.hs +++ b/src/Htcc/Tokenizer/Token.hs @@ -17,6 +17,7 @@ module Htcc.Tokenizer.Token ( TokenLC, Token (..), -- * Utilities for accessing to token data + keywordsTokens, length, emptyToken, isTKNum, @@ -36,6 +37,7 @@ module Htcc.Tokenizer.Token ( import Control.DeepSeq (NFData (..), NFData1 (..)) +import Data.Bifunctor (bimap) import qualified Data.ByteString as B import Data.Char (chr, isDigit, ord) import Data.List (find) @@ -111,7 +113,7 @@ instance Show i => Show (Token i) where instance Read i => Read (Token i) where readsPrec _ xxs@(x:xs) | isDigit x = [first (TKNum . (read :: String -> i) . (x:)) $ dropFst3 $ spanLen isDigit xs] - | x == '\"' = [maybe' (error "No parse: string literal was not closed") (spanStrLiteral $ T.pack xs) $ first (TKString . T.encodeUtf8 . flip T.append "\0") . second T.unpack] + | x == '\"' = [maybe' (error "No parse: string literal was not closed") (spanStrLiteral $ T.pack xs) $ bimap (TKString . T.encodeUtf8 . flip T.append "\0") T.unpack] | P.length xxs > 2 && T.pack (take 3 xxs) `elem` CR.strOps3 = [(TKReserved $ T.pack $ take 3 xxs, drop 3 xxs)] | not (null xs) && T.pack (take 2 xxs) `elem` CR.strOps2 = [(TKReserved $ T.pack $ take 2 xxs, drop 2 xxs)] | x `elem` CR.charOps = [(TKReserved (T.singleton x), xs)] @@ -145,36 +147,39 @@ length (TKString s) = B.length s length (TKMacro m t) = CP.length m + T.length t length TKEmpty = 0 +keywordsTokens :: [Token i] +keywordsTokens = [ + TKReturn + , TKWhile + , TKIf + , TKSwitch + , TKCase + , TKDefault + , TKElse + , TKFor + , TKBreak + , TKContinue + , TKEnum + , TKStruct + , TKSizeof + , TKGoto + , TKAlignof + , TKTypedef + , TKType $ CR.SCUndef CR.CTInt + , TKType $ CR.SCUndef CR.CTChar + , TKType $ CR.SCUndef $ CR.CTSigned CR.CTUndef + , TKType $ CR.SCUndef $ CR.CTShort CR.CTUndef + , TKType $ CR.SCUndef $ CR.CTLong CR.CTUndef + , TKType $ CR.SCUndef CR.CTVoid + , TKType $ CR.SCUndef CR.CTBool + , TKType $ CR.SCAuto CR.CTUndef + , TKType $ CR.SCStatic CR.CTUndef + , TKType $ CR.SCRegister CR.CTUndef + ] + -- | Lookup keyword from `T.Text`. If the specified `T.Text` is not keyword as C language, `lookupKeyword` returns `Nothing`. lookupKeyword :: forall i. (Show i) => T.Text -> Maybe (Token i) -lookupKeyword s = find ((==) s . tshow) [ - TKReturn, - TKWhile, - TKIf, - TKSwitch, - TKCase, - TKDefault, - TKElse, - TKFor, - TKBreak, - TKContinue, - TKEnum, - TKStruct, - TKSizeof, - TKGoto, - TKAlignof, - TKTypedef, - TKType $ CR.SCUndef CR.CTInt, - TKType $ CR.SCUndef CR.CTChar, - TKType $ CR.SCUndef $ CR.CTSigned CR.CTUndef, - TKType $ CR.SCUndef $ CR.CTShort CR.CTUndef, - TKType $ CR.SCUndef $ CR.CTLong CR.CTUndef, - TKType $ CR.SCUndef CR.CTVoid, - TKType $ CR.SCUndef CR.CTBool, - TKReserved $ T.pack $ show (CR.SCAuto CR.CTUndef :: CR.StorageClass i), - TKReserved $ T.pack $ show (CR.SCStatic CR.CTUndef :: CR.StorageClass i), - TKReserved $ T.pack $ show (CR.SCRegister CR.CTUndef :: CR.StorageClass i) - ] +lookupKeyword s = find ((==) s . tshow) keywordsTokens -- | `TokenLCNums` is data structure for storing the line number and character number of each token data TokenLCNums i = TokenLCNums -- ^ The constructor of `TokenLCNums` diff --git a/src/Htcc/Utils.hs b/src/Htcc/Utils.hs index 914dbfa..052bbcb 100644 --- a/src/Htcc/Utils.hs +++ b/src/Htcc/Utils.hs @@ -9,8 +9,7 @@ Portability : POSIX General-purpose utilities -} -{-# LANGUAGE BangPatterns, Rank2Types, ScopedTypeVariables, TupleSections, - TypeOperators #-} +{-# LANGUAGE BangPatterns, TupleSections #-} module Htcc.Utils ( -- * Extra functions for lists module Htcc.Utils.List, diff --git a/src/Htcc/Utils/CompilationState.hs b/src/Htcc/Utils/CompilationState.hs index c0b5976..7e744ce 100644 --- a/src/Htcc/Utils/CompilationState.hs +++ b/src/Htcc/Utils/CompilationState.hs @@ -25,9 +25,11 @@ import Control.Monad (replicateM) import Control.Monad.Loops (unfoldM) import Control.Monad.State (StateT, get, gets, put) +import Data.Bifunctor (bimap) import Data.Bool (bool) import Data.Maybe (catMaybes) -import Data.MonoTraversable (Element, MonoFoldable (..), +import Data.MonoTraversable (Element, + MonoFoldable (..), headMay) import qualified Data.Sequences as S import Data.Tuple.Extra (first, second) @@ -55,7 +57,7 @@ itemsP n = do -- Defines information updates by providing a function that -- accepts the current information and one item to be consumed and returns the information itemC :: S.IsSequence mono => (cd -> Element mono -> cd) -> CompilationState cd mono i (Maybe (Element mono)) -itemC f = itemP >>= maybe (return Nothing) (\itp -> Just itp <$ (get >>= put . first (`f` itp) . second S.tailEx)) +itemC f = itemP >>= maybe (return Nothing) (\itp -> Just itp <$ (get >>= put . bimap (`f` itp) S.tailEx)) {-# INLINE itemsC #-} -- | `itemsC` consumes at items from input data. @@ -89,4 +91,3 @@ itemsCWhen cf f = fmap S.pack $ unfoldM $ itemCWhen cf f -- | `isSatisfied` returns `True` if the input data satisfies the condition of given unary function, otherwise returns `False`. isSatisfied :: (mono -> Bool) -> CompilationState cd mono i Bool isSatisfied f = gets (f . snd) - diff --git a/src/Htcc/Utils/Print.hs b/src/Htcc/Utils/Print.hs index 32cce81..6737bc4 100644 --- a/src/Htcc/Utils/Print.hs +++ b/src/Htcc/Utils/Print.hs @@ -18,20 +18,21 @@ module Htcc.Utils.Print ( warnCharDoc, locTxtDoc, locCharDoc, ) where -import qualified Data.Text as T -import qualified Data.Text.IO as T -import Prelude hiding (toInteger) -import System.Exit (exitFailure) -import System.IO (stderr) -import Text.PrettyPrint.ANSI.Leijen (Doc, bold, char, hPutDoc, - linebreak, magenta, putDoc, red, - text) +import qualified Data.Text as T +import qualified Data.Text.IO as T +import Prelude hiding (toInteger) +import qualified Prettyprinter as PP +import Prettyprinter.Render.Terminal (AnsiStyle, Color (Magenta, Red), + bold, color, hPutDoc, putDoc) +import System.Exit (exitFailure) +import System.IO (stderr) + +type Doc = PP.Doc AnsiStyle {-# INLINE putDocLn #-} --- | Execute `Text.PrettyPrint.ANSI.Leijen.putDoc` by applying `Text.PrettyPrint.ANSI.Leijen.linebreak` --- to `Text.PrettyPrint.ANSI.Leijen.<>` at the end of given `Text.PrettyPrint.ANSI.Leijen.Doc` +-- | Execute `putDoc` by appending a trailing line break to the given `Doc`. putDocLn :: Doc -> IO () -putDocLn = putDoc . flip (<>) linebreak +putDocLn = putDoc . flip (<>) PP.hardline {-# INLINE putDocErr #-} -- | The shortcut of @hPutDoc stderr@ @@ -39,46 +40,41 @@ putDocErr :: Doc -> IO () putDocErr = hPutDoc stderr {-# INLINE putDocLnErr #-} --- | Execute `putDocErr` by applying `Text.PrettyPrint.ANSI.Leijen.linebreak` --- to `Text.PrettyPrint.ANSI.Leijen.<>` at the end of given `Text.PrettyPrint.ANSI.Leijen.Doc` +-- | Execute `putDocErr` by appending a trailing line break to the given `Doc`. putDocLnErr :: Doc -> IO () -putDocLnErr = putDocErr . flip (<>) linebreak +putDocLnErr = putDocErr . flip (<>) PP.hardline {-# INLINE errTxtDoc #-} --- | The `Text.PrettyPrint.ANSI.Leijen.Doc` used to output an error message (`String`), --- it is shortcut of @red . text@ +-- | Doc used to output an error message (`String`). errTxtDoc :: String -> Doc -errTxtDoc = red . text +errTxtDoc = PP.annotate (color Red) . PP.pretty {-# INLINE errCharDoc #-} --- | The `Text.PrettyPrint.ANSI.Leijen.Doc` used to output an error message (`Char`), --- it is shortcut of @red. char@ +-- | Doc used to output an error message (`Char`). errCharDoc :: Char -> Doc -errCharDoc = red . char +errCharDoc = PP.annotate (color Red) . PP.pretty {-# INLINE warnTxtDoc #-} --- | The `Text.PrettyPrint.ANSI.Leijen.Doc` used to output an warning message (`String`), --- it is shortcut of @magenta . text@ +-- | Doc used to output a warning message (`String`). warnTxtDoc :: String -> Doc -warnTxtDoc = magenta . text +warnTxtDoc = PP.annotate (color Magenta) . PP.pretty {-# INLINE warnCharDoc #-} --- | The `Text.PrettyPrint.ANSI.Leijen.Doc` used to output an warning message (`Char`), --- it is shortcut of @magenta . char@ +-- | Doc used to output a warning message (`Char`). warnCharDoc :: Char -> Doc -warnCharDoc = magenta . char +warnCharDoc = PP.annotate (color Magenta) . PP.pretty {-# INLINE locTxtDoc #-} -- | Doc used to output a message (`String`) about the location, such as the file name and its location, --- it is shortcut of @bold . text@ +-- it is shortcut of @annotate bold . pretty@ locTxtDoc :: String -> Doc -locTxtDoc = bold . text +locTxtDoc = PP.annotate bold . PP.pretty {-# INLINE locCharDoc #-} -- | Doc used to output a message (`Char`) about the location, such as the file name and its location, --- it is shortcut of @bold . char@ +-- it is shortcut of @annotate bold . pretty@ locCharDoc :: Char -> Doc -locCharDoc = bold . char +locCharDoc = PP.annotate bold . PP.pretty -- | Standard error output shortcut (with new line). putStrLnErr :: T.Text -> IO () diff --git a/src/Htcc/Utils/Tuple.hs b/src/Htcc/Utils/Tuple.hs index 5dcf67d..58a3776 100644 --- a/src/Htcc/Utils/Tuple.hs +++ b/src/Htcc/Utils/Tuple.hs @@ -37,12 +37,13 @@ module Htcc.Utils.Tuple ( uncurry4, ) where +import Data.Bifunctor (bimap) import Data.Tuple.Extra (dupe, first, second) {-# INLINE swap #-} -- | Swap a first element and second element swap :: (a, b) -> (b, a) -swap = first snd . second fst . dupe +swap = bimap snd fst . dupe {-# INLINE first3 #-} -- | Update the first component of triple. diff --git a/src/Htcc/Visualizer.hs b/src/Htcc/Visualizer.hs index 39a3546..f3178d5 100644 --- a/src/Htcc/Visualizer.hs +++ b/src/Htcc/Visualizer.hs @@ -10,7 +10,39 @@ Portability : POSIX Build AST from C source code -} module Htcc.Visualizer ( + validateVisualizationOutputPath, + writeVisualization, module Htcc.Visualizer.Core ) where +import Data.Char (toLower) +import Diagrams.Prelude (V2) +import Diagrams.Size (SizeSpec) +import Htcc.Output (ReplacementOutputMode (..), + resolveReplacementOutputPath, + withReplacementOutputPath) +import Htcc.Parser (ASTs) import Htcc.Visualizer.Core +import System.FilePath (takeExtension) + +validateVisualizationOutputPath :: FilePath -> Either String () +validateVisualizationOutputPath outputPath + | map toLower (takeExtension outputPath) == ".svg" = + Right () + | otherwise = + Left $ + "AST visualization output path must use the .svg extension: " + <> outputPath + +writeVisualization :: Show i => ASTs i -> SizeSpec V2 Double -> FilePath -> IO () +writeVisualization asts sizeSpec outputPath = do + resolvedOutputPath <- resolveReplacementOutputPath outputPath + either (ioError . userError) pure $ + validateVisualizationOutputPath resolvedOutputPath + if null renderableAsts + then ioError $ userError "There is nothing to describe" + else + withReplacementOutputPath PreserveReplacementOutputMode outputPath $ + visualize renderableAsts sizeSpec + where + renderableAsts = filter hasRenderableTree asts diff --git a/src/Htcc/Visualizer/Core.hs b/src/Htcc/Visualizer/Core.hs index 9ab88cb..b44b616 100644 --- a/src/Htcc/Visualizer/Core.hs +++ b/src/Htcc/Visualizer/Core.hs @@ -11,6 +11,8 @@ Build AST from C source code -} {-# LANGUAGE FlexibleContexts, OverloadedStrings #-} module Htcc.Visualizer.Core ( + hasRenderableTree, + mkWidth, visualize ) where @@ -27,77 +29,118 @@ import Htcc.Parser.AST.Core (ATKind (..), ATree (..), fromATKindFor) import Htcc.Utils (putStrLnErr) +hasRenderableTree :: Show i => ATree i -> Bool +hasRenderableTree = not . null . encodeForest + +normalizeRenderableTree :: ATree i -> ATree i +normalizeRenderableTree ATEmpty = ATEmpty +normalizeRenderableTree (ATNode (ATNull inner) _ _ _) = normalizeRenderableTree inner +normalizeRenderableTree (ATNode ATExprStmt _ lhs _) = normalizeRenderableTree lhs +normalizeRenderableTree tree = tree + +encodeForest :: Show i => ATree i -> [Tree String] +encodeForest tree = case normalizeRenderableTree tree of + ATEmpty -> + [] + normalizedTree -> + [encodeTree normalizedTree] + +encodeChildren :: Show i => [ATree i] -> [Tree String] +encodeChildren = concatMap encodeForest + +encodeChildrenPreservingEmpty :: Show i => [ATree i] -> [Tree String] +encodeChildrenPreservingEmpty = + map (encodeTree . normalizeRenderableTree) + -- | the function to convert `ATree` to `Data.Tree` encodeTree :: Show i => ATree i -> Tree String encodeTree ATEmpty = Node "Null" [] -encodeTree (ATNode ATAdd _ l r) = Node "+" [encodeTree l, encodeTree r] -encodeTree (ATNode ATAddPtr _ l r) = Node "+" [encodeTree l, encodeTree r] -encodeTree (ATNode ATSub _ l r) = Node "-" [encodeTree l, encodeTree r] -encodeTree (ATNode ATSubPtr _ l r) = Node "-" [encodeTree l, encodeTree r] -encodeTree (ATNode ATPtrDis _ l r) = Node "-" [encodeTree l, encodeTree r] -encodeTree (ATNode ATMul _ l r) = Node "*" [encodeTree l, encodeTree r] -encodeTree (ATNode ATDiv _ l r) = Node "/" [encodeTree l, encodeTree r] -encodeTree (ATNode ATMod _ l r) = Node "%" [encodeTree l, encodeTree r] -encodeTree (ATNode ATAddAssign _ l r) = Node "+=" [encodeTree l, encodeTree r] -encodeTree (ATNode ATSubAssign _ l r) = Node "-=" [encodeTree l, encodeTree r] -encodeTree (ATNode ATMulAssign _ l r) = Node "*=" [encodeTree l, encodeTree r] -encodeTree (ATNode ATDivAssign _ l r) = Node "/=" [encodeTree l, encodeTree r] -encodeTree (ATNode ATAddPtrAssign _ l r) = Node "+=" [encodeTree l, encodeTree r] -encodeTree (ATNode ATSubPtrAssign _ l r) = Node "-=" [encodeTree l, encodeTree r] -encodeTree (ATNode ATLAnd _ l r) = Node "&&" [encodeTree l, encodeTree r] -encodeTree (ATNode ATLOr _ l r) = Node "||" [encodeTree l, encodeTree r] -encodeTree (ATNode ATAnd _ l r) = Node "&" [encodeTree l, encodeTree r] -encodeTree (ATNode ATAndAssign _ l r) = Node "&=" [encodeTree l, encodeTree r] -encodeTree (ATNode ATOr _ l r) = Node "|" [encodeTree l, encodeTree r] -encodeTree (ATNode ATOrAssign _ l r) = Node "|=" [encodeTree l, encodeTree r] -encodeTree (ATNode ATXor _ l r) = Node "^" [encodeTree l, encodeTree r] -encodeTree (ATNode ATXorAssign _ l r) = Node "^=" [encodeTree l, encodeTree r] -encodeTree (ATNode ATBitNot _ l r) = Node "~" [encodeTree l, encodeTree r] -encodeTree (ATNode ATShl _ l r) = Node "<<" [encodeTree l, encodeTree r] -encodeTree (ATNode ATShlAssign _ l r) = Node "<<=" [encodeTree l, encodeTree r] -encodeTree (ATNode ATShr _ l r) = Node ">>" [encodeTree l, encodeTree r] -encodeTree (ATNode ATShrAssign _ l r) = Node ">>=" [encodeTree l, encodeTree r] -encodeTree (ATNode ATLT _ l r) = Node "<" [encodeTree l, encodeTree r] -encodeTree (ATNode ATLEQ _ l r) = Node "<=" [encodeTree l, encodeTree r] -encodeTree (ATNode ATGT _ l r) = Node ">" [encodeTree l, encodeTree r] -encodeTree (ATNode ATGEQ _ l r) = Node ">=" [encodeTree l, encodeTree r] -encodeTree (ATNode ATEQ _ l r) = Node "==" [encodeTree l, encodeTree r] -encodeTree (ATNode ATNEQ _ l r) = Node "!=" [encodeTree l, encodeTree r] -encodeTree (ATNode ATNot _ l _) = Node "!" [encodeTree l] -encodeTree (ATNode ATAddr _ l _) = Node "&" [encodeTree l] -encodeTree (ATNode ATDeref _ l _) = Node "*" [encodeTree l] -encodeTree (ATNode ATAssign _ l r) = Node "=" [encodeTree l, encodeTree r] -encodeTree (ATNode ATPreInc _ l r) = Node "++ (pre)" [encodeTree l, encodeTree r] -encodeTree (ATNode ATPreDec _ l r) = Node "-- (pre)" [encodeTree l, encodeTree r] -encodeTree (ATNode ATPostInc _ l r) = Node "++ (post)" [encodeTree l, encodeTree r] -encodeTree (ATNode ATPostDec _ l r) = Node "-- (post)" [encodeTree l, encodeTree r] -encodeTree (ATNode (ATNum n) t l r) = Node (show n ++ " (" ++ show (CT.toTypeKind t) ++ ")") [encodeTree l, encodeTree r] -encodeTree (ATNode (ATConditional a b c) _ _ _) = Node "?:" [encodeTree a, encodeTree b, encodeTree c] -encodeTree (ATNode ATComma _ l r) = Node "," [encodeTree l, encodeTree r] -encodeTree (ATNode ATCast t l _) = Node ("(" ++ show (CT.toTypeKind t) ++ ")\n(type cast)") [encodeTree l] -encodeTree (ATNode (ATMemberAcc _) _ l r) = Node "." [encodeTree l, encodeTree r] -encodeTree (ATNode ATReturn _ l r) = Node "return" [encodeTree l, encodeTree r] -encodeTree (ATNode ATIf _ l r) = Node "if" [encodeTree l, encodeTree r] -encodeTree (ATNode ATElse _ l r) = Node "else" [encodeTree l, encodeTree r] -encodeTree (ATNode (ATSwitch th xs) _ l r) = Node "switch" $ encodeTree th : map encodeTree xs ++ [encodeTree l, encodeTree r] -encodeTree (ATNode (ATCase _ v) _ l r) = Node ("case " ++ show v) [encodeTree l, encodeTree r] -encodeTree (ATNode (ATDefault _) _ l r) = Node "default" [encodeTree l, encodeTree r] -encodeTree (ATNode ATWhile _ l r) = Node "while" [encodeTree l, encodeTree r] -encodeTree (ATNode (ATFor atf) _ l r) = Node "for" $ map (encodeTree . fromATKindFor) atf ++ [encodeTree l, encodeTree r] -encodeTree (ATNode ATBreak _ l r) = Node "break" [encodeTree l, encodeTree r] -encodeTree (ATNode ATContinue _ l r) = Node "continue" [encodeTree l, encodeTree r] -encodeTree (ATNode (ATGoto lbl) _ l r) = Node ("goto " ++ T.unpack lbl) [encodeTree l, encodeTree r] -encodeTree (ATNode (ATLabel lbl) _ l r) = Node (":" ++ T.unpack lbl) [encodeTree l, encodeTree r] -encodeTree (ATNode (ATBlock xs) _ _ _) = Node "{}" $ map encodeTree xs -encodeTree (ATNode (ATLVar t o) _ l r) = Node (show t ++ " lvar" ++ show o) [encodeTree l, encodeTree r] -encodeTree (ATNode (ATGVar t n) _ l r) = Node (show t ++ " " ++ T.unpack n) [encodeTree l, encodeTree r] -encodeTree (ATNode (ATDefFunc fname Nothing) t lhs _) = Node (show (CT.toTypeKind t) ++ " " ++ T.unpack fname ++ "()") [encodeTree lhs] -encodeTree (ATNode (ATDefFunc fname (Just args)) t lhs _) = Node (show (CT.toTypeKind t) ++ " " ++ T.unpack fname ++ "(some arguments)") $ map encodeTree args ++ [encodeTree lhs] -encodeTree (ATNode (ATCallFunc fname Nothing) _ lhs rhs) = Node (T.unpack fname ++ "()") [encodeTree lhs, encodeTree rhs] -encodeTree (ATNode (ATCallFunc fname (Just args)) _ lhs rhs) = Node (T.unpack fname ++ "(some arguments)") $ map encodeTree args ++ [encodeTree lhs, encodeTree rhs] +encodeTree (ATNode ATAdd _ l r) = Node "+" $ encodeChildren [l, r] +encodeTree (ATNode ATAddPtr _ l r) = Node "+" $ encodeChildren [l, r] +encodeTree (ATNode ATSub _ l r) = Node "-" $ encodeChildren [l, r] +encodeTree (ATNode ATSubPtr _ l r) = Node "-" $ encodeChildren [l, r] +encodeTree (ATNode ATPtrDis _ l r) = Node "-" $ encodeChildren [l, r] +encodeTree (ATNode ATMul _ l r) = Node "*" $ encodeChildren [l, r] +encodeTree (ATNode ATDiv _ l r) = Node "/" $ encodeChildren [l, r] +encodeTree (ATNode ATMod _ l r) = Node "%" $ encodeChildren [l, r] +encodeTree (ATNode ATAddAssign _ l r) = Node "+=" $ encodeChildren [l, r] +encodeTree (ATNode ATSubAssign _ l r) = Node "-=" $ encodeChildren [l, r] +encodeTree (ATNode ATMulAssign _ l r) = Node "*=" $ encodeChildren [l, r] +encodeTree (ATNode ATDivAssign _ l r) = Node "/=" $ encodeChildren [l, r] +encodeTree (ATNode ATAddPtrAssign _ l r) = Node "+=" $ encodeChildren [l, r] +encodeTree (ATNode ATSubPtrAssign _ l r) = Node "-=" $ encodeChildren [l, r] +encodeTree (ATNode ATLAnd _ l r) = Node "&&" $ encodeChildren [l, r] +encodeTree (ATNode ATLOr _ l r) = Node "||" $ encodeChildren [l, r] +encodeTree (ATNode ATAnd _ l r) = Node "&" $ encodeChildren [l, r] +encodeTree (ATNode ATAndAssign _ l r) = Node "&=" $ encodeChildren [l, r] +encodeTree (ATNode ATOr _ l r) = Node "|" $ encodeChildren [l, r] +encodeTree (ATNode ATOrAssign _ l r) = Node "|=" $ encodeChildren [l, r] +encodeTree (ATNode ATXor _ l r) = Node "^" $ encodeChildren [l, r] +encodeTree (ATNode ATXorAssign _ l r) = Node "^=" $ encodeChildren [l, r] +encodeTree (ATNode ATBitNot _ l r) = Node "~" $ encodeChildren [l, r] +encodeTree (ATNode ATShl _ l r) = Node "<<" $ encodeChildren [l, r] +encodeTree (ATNode ATShlAssign _ l r) = Node "<<=" $ encodeChildren [l, r] +encodeTree (ATNode ATShr _ l r) = Node ">>" $ encodeChildren [l, r] +encodeTree (ATNode ATShrAssign _ l r) = Node ">>=" $ encodeChildren [l, r] +encodeTree (ATNode ATLT _ l r) = Node "<" $ encodeChildren [l, r] +encodeTree (ATNode ATLEQ _ l r) = Node "<=" $ encodeChildren [l, r] +encodeTree (ATNode ATGT _ l r) = Node ">" $ encodeChildren [l, r] +encodeTree (ATNode ATGEQ _ l r) = Node ">=" $ encodeChildren [l, r] +encodeTree (ATNode ATEQ _ l r) = Node "==" $ encodeChildren [l, r] +encodeTree (ATNode ATNEQ _ l r) = Node "!=" $ encodeChildren [l, r] +encodeTree (ATNode ATNot _ l _) = Node "!" $ encodeChildren [l] +encodeTree (ATNode ATSizeof _ l _) = Node "sizeof" $ encodeChildren [l] +encodeTree (ATNode ATAlignof _ l _) = Node "_Alignof" $ encodeChildren [l] +encodeTree (ATNode ATAddr _ l _) = Node "&" $ encodeChildren [l] +encodeTree (ATNode ATDeref _ l _) = Node "*" $ encodeChildren [l] +encodeTree (ATNode ATAssign _ l r) = Node "=" $ encodeChildren [l, r] +encodeTree (ATNode ATPreInc _ l r) = Node "++ (pre)" $ encodeChildren [l, r] +encodeTree (ATNode ATPreDec _ l r) = Node "-- (pre)" $ encodeChildren [l, r] +encodeTree (ATNode ATPostInc _ l r) = Node "++ (post)" $ encodeChildren [l, r] +encodeTree (ATNode ATPostDec _ l r) = Node "-- (post)" $ encodeChildren [l, r] +encodeTree (ATNode (ATNum n) t l r) = Node (show n ++ " (" ++ show (CT.toTypeKind t) ++ ")") $ encodeChildren [l, r] +encodeTree (ATNode (ATConditional a b c) _ _ _) = Node "?:" $ encodeChildrenPreservingEmpty [a, b, c] +encodeTree (ATNode ATComma _ l r) = Node "," $ encodeChildren [l, r] +encodeTree (ATNode ATCast t l _) = Node ("(" ++ show (CT.toTypeKind t) ++ ")\n(type cast)") $ encodeChildren [l] +encodeTree (ATNode (ATMemberAcc _) _ l r) = Node "." $ encodeChildren [l, r] +encodeTree (ATNode ATReturn _ l r) = Node "return" $ encodeChildren [l, r] +encodeTree (ATNode ATIf _ l r) = Node "if" $ encodeChildrenPreservingEmpty [l, r] +encodeTree (ATNode ATElse _ l r) = Node "else" $ encodeChildrenPreservingEmpty [l, r] +encodeTree (ATNode (ATSwitch th xs) _ l r) = + Node "switch" $ + encodeChildren [th] + <> encodeChildrenPreservingEmpty xs + <> if null xs + then encodeChildrenPreservingEmpty [l, r] + else encodeChildren [l, r] +encodeTree (ATNode (ATCase _ v) _ l r) = + Node ("case " ++ show v) $ encodeChildrenPreservingEmpty [l] <> encodeChildren [r] +encodeTree (ATNode (ATDefault _) _ l r) = + Node "default" $ encodeChildrenPreservingEmpty [l] <> encodeChildren [r] +encodeTree (ATNode ATWhile _ l r) = Node "while" $ encodeChildrenPreservingEmpty [l, r] +encodeTree (ATNode (ATFor atf) _ l r) = + Node "for" $ + encodeChildrenPreservingEmpty (map fromATKindFor atf) + <> encodeChildren [l, r] +encodeTree (ATNode ATBreak _ l r) = Node "break" $ encodeChildren [l, r] +encodeTree (ATNode ATContinue _ l r) = Node "continue" $ encodeChildren [l, r] +encodeTree (ATNode (ATGoto lbl) _ l r) = Node ("goto " ++ T.unpack lbl) $ encodeChildren [l, r] +encodeTree (ATNode (ATLabel lbl) _ l r) = Node (":" ++ T.unpack lbl) $ encodeChildren [l, r] +encodeTree (ATNode (ATBlock xs) _ _ _) = Node "{}" $ encodeChildrenPreservingEmpty xs +encodeTree (ATNode (ATLVar t o) _ l r) = Node (show t ++ " lvar" ++ show o) $ encodeChildren [l, r] +encodeTree (ATNode (ATGVar t n) _ l r) = Node (show t ++ " " ++ T.unpack n) $ encodeChildren [l, r] +encodeTree (ATNode (ATFuncPtr name) _ _ _) = Node ("funcptr " ++ T.unpack name) [] +encodeTree (ATNode (ATDefFunc fname Nothing) t lhs _) = Node (show (CT.toTypeKind t) ++ " " ++ T.unpack fname ++ "()") $ encodeChildren [lhs] +encodeTree (ATNode (ATDefFunc fname (Just args)) t lhs _) = Node (show (CT.toTypeKind t) ++ " " ++ T.unpack fname ++ "(some arguments)") $ encodeChildren (args <> [lhs]) +encodeTree (ATNode (ATCallFunc fname Nothing) _ lhs rhs) = Node (T.unpack fname ++ "()") $ encodeChildren [lhs, rhs] +encodeTree (ATNode (ATCallFunc fname (Just args)) _ lhs rhs) = Node (T.unpack fname ++ "(some arguments)") $ encodeChildren (args <> [lhs, rhs]) +encodeTree (ATNode (ATCallPtr Nothing) _ lhs rhs) = Node "(*)(...)" $ encodeChildren [lhs, rhs] +encodeTree (ATNode (ATCallPtr (Just args)) _ lhs rhs) = Node "(*)(some arguments)" $ encodeChildren (lhs : args <> [rhs]) encodeTree (ATNode ATExprStmt _ lhs _) = encodeTree lhs -encodeTree (ATNode (ATStmtExpr exps) _ lhs rhs) = Node "({})" $ map encodeTree exps ++ [encodeTree lhs, encodeTree rhs] +encodeTree (ATNode (ATStmtExpr exps) _ lhs rhs) = + Node "({})" $ encodeChildrenPreservingEmpty exps <> encodeChildren [lhs, rhs] encodeTree (ATNode (ATNull _) _ _ _) = Node "" [] +-- TODO: handle ATFunc renderNTree :: Tree String -> QDiagram SVG V2 Double Any renderNTree nt = renderTree @@ -109,6 +152,9 @@ renderNTree nt = renderTree -- | Build AST from C source code visualize :: Show i => ASTs i -> SizeSpec V2 Double -> FilePath -> IO () -visualize ast ss fpath = let et = map encodeTree ast in if not (null et) then - renderPretty fpath ss (foldr ((|||) . renderNTree) (renderNTree $ head et) $ tail et) else +visualize ast ss fpath = case concatMap encodeForest ast of + [] -> putStrLnErr "There is nothing to describe" + firstTree:otherTrees -> + renderPretty fpath ss $ + foldr ((|||) . renderNTree) (renderNTree firstTree) otherTrees diff --git a/src/Htcc/WarningSuppression.hs b/src/Htcc/WarningSuppression.hs new file mode 100644 index 0000000..bcd5e93 --- /dev/null +++ b/src/Htcc/WarningSuppression.hs @@ -0,0 +1,734 @@ +module Htcc.WarningSuppression + ( CompilerOutputChunk + , CompilerWarningFilterDecision (..) + , dropCompilerWarningOutput + , emptyIncrementalCompilerWarningFilter + , feedIncrementalCompilerWarningFilter + , finalCompilerOutputChunk + , finalizeIncrementalCompilerWarningFilter + , filterCompilerOutputChunks + , incompleteCompilerOutputNeedsMoreInputForWarningSuppression + , IncrementalCompilerWarningFilter + , newlineByte + , normalizeCompilerOutputLine + , splitCompleteCompilerOutputChunks + , splitCompilerOutputChunks + ) where + +import qualified Data.ByteString as B +import qualified Data.ByteString.Char8 as BC +import Data.Char (isDigit, isSpace, toLower) +import qualified Data.Map.Strict as Map +import Data.Word (Word8) + +type CompilerOutputChunk = (B.ByteString, B.ByteString) + +data WarningSuppressionState + = NotSuppressingWarnings + | SuppressingWarning + | AwaitingWarningSummary + +data CompilerWarningFilterDecision chunk + = RetainCompilerWarningFilterChunk chunk + | SuppressCompilerWarningFilterChunk chunk + +data IncrementalCompilerWarningFilter chunk = IncrementalCompilerWarningFilter + { incrementalCompilerWarningLeadInChunks :: [chunk] + , incrementalCompilerWarningState :: WarningSuppressionState + , incrementalCompilerWarningDeferred :: [chunk] + } + +data LookaheadStatus + = LookaheadMatched + | LookaheadNoMatch + | LookaheadNeedsMoreInput + deriving (Eq) + +emptyIncrementalCompilerWarningFilter :: IncrementalCompilerWarningFilter chunk +emptyIncrementalCompilerWarningFilter = + IncrementalCompilerWarningFilter + { incrementalCompilerWarningLeadInChunks = [] + , incrementalCompilerWarningState = NotSuppressingWarnings + , incrementalCompilerWarningDeferred = [] + } + +feedIncrementalCompilerWarningFilter + :: (chunk -> B.ByteString) + -> (chunk -> B.ByteString) + -> IncrementalCompilerWarningFilter chunk + -> [chunk] + -> (IncrementalCompilerWarningFilter chunk, [CompilerWarningFilterDecision chunk]) +feedIncrementalCompilerWarningFilter chunkBytes chunkLine warningFilter newChunks = + let (pendingLeadIn, warningState, deferredChunks, decisions) = + stepIncrementalCompilerWarningFilter + False + chunkBytes + chunkLine + (incrementalCompilerWarningLeadInChunks warningFilter) + (incrementalCompilerWarningState warningFilter) + (incrementalCompilerWarningDeferred warningFilter <> newChunks) + in ( IncrementalCompilerWarningFilter + { incrementalCompilerWarningLeadInChunks = pendingLeadIn + , incrementalCompilerWarningState = warningState + , incrementalCompilerWarningDeferred = deferredChunks + } + , decisions + ) + +finalizeIncrementalCompilerWarningFilter + :: (chunk -> B.ByteString) + -> (chunk -> B.ByteString) + -> IncrementalCompilerWarningFilter chunk + -> [CompilerWarningFilterDecision chunk] +finalizeIncrementalCompilerWarningFilter chunkBytes chunkLine warningFilter = + let (_, _, _, decisions) = + stepIncrementalCompilerWarningFilter + True + chunkBytes + chunkLine + (incrementalCompilerWarningLeadInChunks warningFilter) + (incrementalCompilerWarningState warningFilter) + (incrementalCompilerWarningDeferred warningFilter) + in decisions + +stepIncrementalCompilerWarningFilter + :: Bool + -> (chunk -> B.ByteString) + -> (chunk -> B.ByteString) + -> [chunk] + -> WarningSuppressionState + -> [chunk] + -> ([chunk], WarningSuppressionState, [chunk], [CompilerWarningFilterDecision chunk]) +stepIncrementalCompilerWarningFilter inputComplete _chunkBytes chunkLine = + go + where + go pendingWarningLeadIn SuppressingWarning [] = + if inputComplete + then + ( [] + , NotSuppressingWarnings + , [] + , suppressCompilerWarningFilterChunks pendingWarningLeadIn + ) + else + ( pendingWarningLeadIn + , SuppressingWarning + , [] + , [] + ) + go pendingWarningLeadIn warningState [] = + if inputComplete + then + ( [] + , NotSuppressingWarnings + , [] + , retainCompilerWarningFilterChunks pendingWarningLeadIn + ) + else + ( pendingWarningLeadIn + , warningState + , [] + , [] + ) + go pendingWarningLeadIn SuppressingWarning chunks@(chunk:rest) = + let line = chunkLine chunk + in if isCompilerWarningSummaryLine line + then + let (pendingLeadIn', warningState', deferredChunks', decisions) = + go pendingWarningLeadIn SuppressingWarning rest + in ( pendingLeadIn' + , warningState' + , deferredChunks' + , SuppressCompilerWarningFilterChunk chunk : decisions + ) + else case suppressedCompilerDiagnosticStatus line rest of + LookaheadMatched -> + let (pendingLeadIn', warningState', deferredChunks', decisions) = + go pendingWarningLeadIn SuppressingWarning rest + in ( pendingLeadIn' + , warningState' + , deferredChunks' + , SuppressCompilerWarningFilterChunk chunk : decisions + ) + LookaheadNeedsMoreInput -> + ( pendingWarningLeadIn + , SuppressingWarning + , chunks + , [] + ) + LookaheadNoMatch -> + let retainPendingLeadIn = + isCompilerRetainedDiagnosticChunk line + nextState + | retainPendingLeadIn = NotSuppressingWarnings + | otherwise = AwaitingWarningSummary + (pendingLeadIn', warningState', deferredChunks', decisions) = + go [] nextState rest + in ( pendingLeadIn' + , warningState' + , deferredChunks' + , ( if retainPendingLeadIn + then retainCompilerWarningFilterChunks pendingWarningLeadIn + else suppressCompilerWarningFilterChunks pendingWarningLeadIn + ) + <> (RetainCompilerWarningFilterChunk chunk : decisions) + ) + go pendingWarningLeadIn NotSuppressingWarnings chunks = + goNonSuppressing False pendingWarningLeadIn chunks + go pendingWarningLeadIn AwaitingWarningSummary chunks = + goNonSuppressing True pendingWarningLeadIn chunks + + goNonSuppressing suppressTrailingSummary pendingWarningLeadIn chunks@(chunk:rest) = + let line = chunkLine chunk + nextState = NotSuppressingWarnings + in if suppressTrailingSummary && isCompilerWarningSummaryLine line + then + let (pendingLeadIn', warningState', deferredChunks', decisions) = + go [] NotSuppressingWarnings rest + in ( pendingLeadIn' + , warningState' + , deferredChunks' + , retainCompilerWarningFilterChunks pendingWarningLeadIn + <> (SuppressCompilerWarningFilterChunk chunk : decisions) + ) + else case compilerWarningChunkStatus line rest of + LookaheadMatched -> + let (pendingLeadIn', warningState', deferredChunks', decisions) = + go pendingWarningLeadIn SuppressingWarning rest + in ( pendingLeadIn' + , warningState' + , deferredChunks' + , SuppressCompilerWarningFilterChunk chunk : decisions + ) + LookaheadNeedsMoreInput -> + ( pendingWarningLeadIn + , nextState + , chunks + , [] + ) + LookaheadNoMatch + | isCompilerWarningLeadInChunk line -> + go + (pendingWarningLeadIn <> [chunk]) + nextState + rest + | not (null pendingWarningLeadIn) -> + case suppressedCompilerDiagnosticStatus line rest of + LookaheadMatched -> + go + (pendingWarningLeadIn <> [chunk]) + nextState + rest + LookaheadNeedsMoreInput -> + ( pendingWarningLeadIn + , nextState + , chunks + , [] + ) + LookaheadNoMatch -> + let (pendingLeadIn', warningState', deferredChunks', decisions) = + go [] nextState rest + in ( pendingLeadIn' + , warningState' + , deferredChunks' + , retainCompilerWarningFilterChunks pendingWarningLeadIn + <> (RetainCompilerWarningFilterChunk chunk : decisions) + ) + | otherwise -> + let (pendingLeadIn', warningState', deferredChunks', decisions) = + go [] nextState rest + in ( pendingLeadIn' + , warningState' + , deferredChunks' + , RetainCompilerWarningFilterChunk chunk : decisions + ) + goNonSuppressing suppressTrailingSummary pendingWarningLeadIn [] = + go pendingWarningLeadIn nextState [] + where + nextState + | suppressTrailingSummary = AwaitingWarningSummary + | otherwise = NotSuppressingWarnings + + retainCompilerWarningFilterChunks = + map RetainCompilerWarningFilterChunk + + suppressCompilerWarningFilterChunks = + map SuppressCompilerWarningFilterChunk + + isCompilerWarningLeadInChunk line = + isCompilerWarningPreambleLine line + || isAttachedCompilerWarningNoteLine line + + compilerWarningChunkStatus line rest + | not (isCompilerWarningLine line) = + LookaheadNoMatch + | not (isBareCompilerWarningCaptureChunk line) = + LookaheadMatched + | otherwise = + invertLookaheadStatus $ + compilerDiagnosticSnippetContinuationStatus rest + + suppressedCompilerDiagnosticStatus line rest = + case suppressedNonNoteCompilerDiagnosticStatus line rest of + LookaheadMatched -> + LookaheadMatched + LookaheadNeedsMoreInput -> + LookaheadNeedsMoreInput + LookaheadNoMatch -> + suppressedCompilerWarningNoteStatus line rest + + suppressedNonNoteCompilerDiagnosticStatus line rest + | isCompilerWarningLine line = + compilerWarningLineSuppressionStatus line rest + | isCompilerWarningSummaryLine line = + LookaheadMatched + | isCompilerWarningContinuationLine line = + LookaheadMatched + | isCompilerDiagnosticAnnotationLine line = + LookaheadMatched + | isGenericCompilerDiagnosticSourceSnippetLine line = + compilerDiagnosticSnippetContinuationStatus rest + | otherwise = + LookaheadNoMatch + where + compilerWarningLineSuppressionStatus warningLine remaining + | not (isBareCompilerWarningCaptureChunk warningLine) = + LookaheadMatched + | otherwise = + compilerDiagnosticSnippetContinuationStatus remaining + + suppressedCompilerWarningNoteStatus line rest + | not (isCompilerWarningNoteLine line) = + LookaheadNoMatch + | isAttachedCompilerWarningNoteLine line = + LookaheadMatched + | otherwise = + nextChunkContinuesSuppressedWarningStatus rest + + nextChunkContinuesSuppressedWarningStatus [] = + incompleteLookaheadNoMatch + nextChunkContinuesSuppressedWarningStatus (next:remaining) = + suppressedNonNoteCompilerDiagnosticStatus (chunkLine next) remaining + + compilerDiagnosticSnippetContinuationStatus [] = + incompleteLookaheadNoMatch + compilerDiagnosticSnippetContinuationStatus (next:remaining) + | isCompilerDiagnosticAnnotationLine nextLine = + LookaheadMatched + | isGenericCompilerDiagnosticSourceSnippetLine nextLine = + compilerDiagnosticSnippetContinuationStatus remaining + | otherwise = + LookaheadNoMatch + where + nextLine = chunkLine next + + incompleteLookaheadNoMatch + | inputComplete = + LookaheadNoMatch + | otherwise = + LookaheadNeedsMoreInput + + invertLookaheadStatus lookaheadStatus = + case lookaheadStatus of + LookaheadMatched -> LookaheadNoMatch + LookaheadNoMatch -> LookaheadMatched + LookaheadNeedsMoreInput -> LookaheadNeedsMoreInput + + isCompilerWarningLine = + isCompilerWarningCaptureChunk + + isCompilerWarningPreambleLine = + isCompilerWarningPreambleCaptureChunk + + isCompilerWarningNoteLine = + isCompilerWarningNoteCaptureChunk + + isCompilerWarningSummaryLine = + isCompilerWarningSummaryCaptureChunk + + isCompilerWarningContinuationLine = + isCompilerWarningContinuationCaptureChunk + + isCompilerRetainedDiagnosticChunk = + isCompilerErrorCaptureChunk + + isAttachedCompilerWarningNoteLine line = + case BC.breakSubstring noteNeedle (BC.map toLower line) of + (prefix, suffix) -> + not (B.null suffix) + && BC.count ':' prefix >= 2 + && BC.any isDigit prefix + + isGenericCompilerDiagnosticSourceSnippetLine line = + let trimmed = BC.dropWhile isSpace line + in not (B.null trimmed) + && not (isCompilerWarningPreambleLine line) + && not (isCompilerWarningLine line) + && not (isCompilerWarningSummaryLine line) + && not (isCompilerWarningNoteLine line) + && not (isCompilerDiagnosticAnnotationLine line) + + isCompilerDiagnosticAnnotationLine = + isCompilerDiagnosticAnnotationCaptureChunk + + noteNeedle = BC.pack "note:" + +dropCompilerWarningOutput :: B.ByteString -> B.ByteString +dropCompilerWarningOutput = + B.concat . map fst . filterCompilerOutputChunks fst snd . splitCompilerOutputChunks + +incompleteCompilerOutputNeedsMoreInputForWarningSuppression :: B.ByteString -> Bool +incompleteCompilerOutputNeedsMoreInputForWarningSuppression line = + BC.all isSpace line + || isPotentialBareWarningPrefix loweredLine + || isPotentialLocatedWarningPrefix loweredLine + || isPotentialBareWarningLeadInPrefix loweredLine + || isPotentialLocatedWarningLeadInPrefix loweredLine + || isPotentialWarningSummaryPrefix loweredLine + || isPotentialDiagnosticAnnotationPrefix line + where + loweredLine = + BC.map toLower $ BC.dropWhile isSpace line + + isPotentialBareWarningPrefix current = + any (current `BC.isPrefixOf`) warningPrefixes + + isPotentialLocatedWarningPrefix current = + case lastColonSeparatedSuffix current of + Just (headerPrefix, suffix) -> + looksLikeLocatedDiagnosticPrefix headerPrefix + && any (suffix `BC.isPrefixOf`) warningPrefixes + Nothing -> + False + + isPotentialBareWarningLeadInPrefix current = + any (current `BC.isPrefixOf`) bareWarningLeadInPrefixes + + isPotentialLocatedWarningLeadInPrefix current = + case lastColonSeparatedSuffix current of + Just (headerPrefix, suffix) -> + looksLikeLocatedDiagnosticPrefix headerPrefix + && any (suffix `BC.isPrefixOf`) locatedWarningLeadInPrefixes + Nothing -> + False + + isPotentialWarningSummaryPrefix current = + case BC.span isDigit current of + (countPrefix, rest) + | not (B.null countPrefix) -> + let summaryPrefix = BC.dropWhile isSpace rest + in B.null summaryPrefix + || any (summaryPrefix `BC.isPrefixOf`) warningSummaryPrefixes + _ -> + False + + isPotentialDiagnosticAnnotationPrefix rawLine = + let trimmedLine = BC.dropWhile isSpace rawLine + in any (trimmedLine `BC.isPrefixOf`) diagnosticAnnotationPrefixes + + lastColonSeparatedSuffix current = + case B.elemIndices (fromIntegral $ fromEnum ':') current of + [] -> + Nothing + colonIndices -> + let colonIndex = last colonIndices + in Just + ( B.take colonIndex current + , BC.dropWhile isSpace $ B.drop (colonIndex + 1) current + ) + + looksLikeLocatedDiagnosticPrefix headerPrefix = + looksLikeCompilerDiagnosticHeaderPrefix headerPrefix + && BC.any isLocatedDiagnosticHeaderPrefixChar headerPrefix + + isLocatedDiagnosticHeaderPrefixChar c = + isDigit c || c `elem` ("./\\{}()-_" :: String) + + warningPrefixes = + [ BC.pack "warning:" + , BC.pack "note:" + ] + + bareWarningLeadInPrefixes = + [ BC.pack "in file included from " + , BC.pack "from " + ] + + locatedWarningLeadInPrefixes = + [ BC.pack "in function " + , BC.pack "assembler messages:" + ] + + warningSummaryPrefixes = + [ BC.pack "warning" + , BC.pack "warnings" + , BC.pack "warning generated" + , BC.pack "warning generated." + , BC.pack "warnings generated" + , BC.pack "warnings generated." + , BC.pack "warning emitted" + , BC.pack "warning emitted." + , BC.pack "warnings emitted" + , BC.pack "warnings emitted." + ] + + diagnosticAnnotationPrefixes = + [ BC.pack "^" + , BC.pack "|" + ] + +filterCompilerOutputChunks + :: (chunk -> B.ByteString) + -> (chunk -> B.ByteString) + -> [chunk] + -> [chunk] +filterCompilerOutputChunks chunkBytes chunkLine chunks = + map snd $ + filter + (\(chunkIndex, _) -> Map.findWithDefault True chunkIndex decisionMap) + indexedChunks + where + indexedChunks = zip [0 :: Int ..] chunks + (warningFilter, initialDecisions) = + feedIncrementalCompilerWarningFilter + (chunkBytes . snd) + (chunkLine . snd) + emptyIncrementalCompilerWarningFilter + indexedChunks + decisions = + initialDecisions + <> finalizeIncrementalCompilerWarningFilter + (chunkBytes . snd) + (chunkLine . snd) + warningFilter + decisionMap = + Map.fromList $ + map compilerWarningFilterDecisionEntry decisions + + compilerWarningFilterDecisionEntry decision = + case decision of + RetainCompilerWarningFilterChunk (chunkIndex, _) -> + (chunkIndex, True) + SuppressCompilerWarningFilterChunk (chunkIndex, _) -> + (chunkIndex, False) + +isCompilerWarningCaptureChunk :: B.ByteString -> Bool +isCompilerWarningCaptureChunk line = + isBareCompilerDiagnosticCaptureChunk warningNeedles line + || isLocatedCompilerDiagnosticCaptureChunk warningNeedles normalizedLine + where + normalizedLine = normalizeCompilerDiagnosticCaptureChunk line + warningNeedles = [BC.pack "warning:"] + +isBareCompilerWarningCaptureChunk :: B.ByteString -> Bool +isBareCompilerWarningCaptureChunk = + isBareCompilerDiagnosticCaptureChunk [warningNeedle] + where + warningNeedle = BC.pack "warning:" + +isCompilerErrorCaptureChunk :: B.ByteString -> Bool +isCompilerErrorCaptureChunk line = + isBareCompilerDiagnosticCaptureChunk errorNeedles line + || isLocatedCompilerDiagnosticCaptureChunk errorNeedles normalizedLine + where + normalizedLine = normalizeCompilerDiagnosticCaptureChunk line + errorNeedles = + [ BC.pack "error:" + , BC.pack "fatal error:" + ] + +isBareCompilerDiagnosticCaptureChunk :: [B.ByteString] -> B.ByteString -> Bool +isBareCompilerDiagnosticCaptureChunk needles line = + any (`BC.isPrefixOf` normalizedLine) needles + where + normalizedLine = normalizeCompilerDiagnosticCaptureChunk line + +isLocatedCompilerDiagnosticCaptureChunk + :: [B.ByteString] + -> B.ByteString + -> Bool +isLocatedCompilerDiagnosticCaptureChunk needles normalized = + any locatedDiagnosticNeedleMatches needles + where + locatedDiagnosticNeedleMatches diagnosticNeedle = + case BC.breakSubstring diagnosticNeedle normalized of + (prefix, suffix) + | not (B.null suffix) -> + let trimmedPrefix = BC.dropWhileEnd isSpace prefix + in not (B.null trimmedPrefix) + && BC.last trimmedPrefix == ':' + && looksLikeCompilerDiagnosticHeaderPrefix + (BC.init trimmedPrefix) + _ -> + False + +normalizeCompilerDiagnosticCaptureChunk :: B.ByteString -> B.ByteString +normalizeCompilerDiagnosticCaptureChunk = + BC.dropWhile isSpace . BC.map toLower + +looksLikeCompilerDiagnosticHeaderPrefix :: B.ByteString -> Bool +looksLikeCompilerDiagnosticHeaderPrefix prefix = + not (B.null prefix) + && not (BC.any isClearlyNonDiagnosticHeaderPunctuation prefix) + +isClearlyNonDiagnosticHeaderPunctuation :: Char -> Bool +isClearlyNonDiagnosticHeaderPunctuation c = + c `elem` ("\";" :: String) + +isCompilerWarningPreambleCaptureChunk :: B.ByteString -> Bool +isCompilerWarningPreambleCaptureChunk line = + let normalizedLine = BC.map toLower $ BC.dropWhile isSpace line + in includedFromPrefix `BC.isPrefixOf` normalizedLine + || fromPrefix `BC.isPrefixOf` normalizedLine + || inFunctionNeedle `BC.isInfixOf` normalizedLine + || assemblerMessagesSuffix `BC.isSuffixOf` normalizedLine + where + includedFromPrefix = BC.pack "in file included from " + fromPrefix = BC.pack "from " + inFunctionNeedle = BC.pack ": in function " + assemblerMessagesSuffix = BC.pack ": assembler messages:" + +isCompilerWarningSummaryCaptureChunk :: B.ByteString -> Bool +isCompilerWarningSummaryCaptureChunk line = + case BC.words (BC.map toLower line) of + [count, warningWord, trailer] -> + BC.all isDigit count + && warningWord `elem` [warningWordSingular, warningWordPlural] + && trailer `elem` [generatedWord, generatedWordPeriod, emittedWord, emittedWordPeriod] + _ -> + False + where + warningWordSingular = BC.pack "warning" + warningWordPlural = BC.pack "warnings" + generatedWord = BC.pack "generated" + generatedWordPeriod = BC.pack "generated." + emittedWord = BC.pack "emitted" + emittedWordPeriod = BC.pack "emitted." + +isCompilerWarningNoteCaptureChunk :: B.ByteString -> Bool +isCompilerWarningNoteCaptureChunk line = + noteNeedle `BC.isInfixOf` BC.map toLower line + where + noteNeedle = BC.pack "note:" + +isCompilerWarningContinuationCaptureChunk :: B.ByteString -> Bool +isCompilerWarningContinuationCaptureChunk = + BC.all isSpace + +isCompilerDiagnosticAnnotationCaptureChunk :: B.ByteString -> Bool +isCompilerDiagnosticAnnotationCaptureChunk line = + isCompilerDiagnosticCaretCaptureChunk line || isCompilerDiagnosticPipeCaptureChunk line + +isCompilerDiagnosticCaretCaptureChunk :: B.ByteString -> Bool +isCompilerDiagnosticCaretCaptureChunk line = + let trimmed = BC.dropWhile isSpace line + in not (B.null trimmed) + && BC.all (`elem` ("^~|" :: String)) trimmed + +isCompilerDiagnosticPipeCaptureChunk :: B.ByteString -> Bool +isCompilerDiagnosticPipeCaptureChunk line = + let trimmed = BC.dropWhile isSpace line + (prefixDigits, rest) = BC.span isDigit trimmed + pipePrefix = BC.pack "|" + spacedPipePrefix = BC.pack " |" + in pipePrefix `BC.isPrefixOf` trimmed + || ( not (B.null prefixDigits) + && (pipePrefix `BC.isPrefixOf` rest + || spacedPipePrefix `BC.isPrefixOf` rest + ) + ) + +splitCompilerOutputChunks :: B.ByteString -> [CompilerOutputChunk] +splitCompilerOutputChunks bytes = + completedChunks <> finalCompilerOutputChunk trailingBytes + where + (completedChunks, trailingBytes) = splitCompleteCompilerOutputChunks bytes + +splitCompleteCompilerOutputChunks + :: B.ByteString + -> ([CompilerOutputChunk], B.ByteString) +splitCompleteCompilerOutputChunks bytes + | B.null bytes = + ([], B.empty) + | otherwise = + let (line, rest) = B.break (== newlineByte) bytes + in case B.uncons rest of + Just (_, remaining) -> + let rawLine = line `B.snoc` newlineByte + normalizedLine = normalizeCompilerOutputLine line + (remainingChunks, trailingBytes) = + splitCompleteCompilerOutputChunks remaining + in ((rawLine, normalizedLine) : remainingChunks, trailingBytes) + Nothing -> + ([], line) + +finalCompilerOutputChunk :: B.ByteString -> [CompilerOutputChunk] +finalCompilerOutputChunk bytes + | B.null bytes = [] + | otherwise = [(bytes, normalizeCompilerOutputLine bytes)] + +normalizeCompilerOutputLine :: B.ByteString -> B.ByteString +normalizeCompilerOutputLine line = + stripAnsiEscapeSequences $ + B.dropWhileEnd isTrailingLineEndingByte line + where + isTrailingLineEndingByte byte = + byte == carriageReturnByte || byte == newlineByte + +stripAnsiEscapeSequences :: B.ByteString -> B.ByteString +stripAnsiEscapeSequences bytes = + case B.uncons bytes of + Just (escapeByte, rest) + | escapeByte == ansiEscapeByte -> + stripAnsiEscapeSequence rest + Just (byte, rest) -> + B.cons byte (stripAnsiEscapeSequences rest) + Nothing -> + B.empty + +stripAnsiEscapeSequence :: B.ByteString -> B.ByteString +stripAnsiEscapeSequence bytes = + case B.uncons bytes of + Just (openBracketByte, rest) + | openBracketByte == ansiControlSequenceIntroducerByte -> + stripAnsiControlSequence rest + Just (_, rest) -> + stripAnsiEscapeSequences rest + Nothing -> + B.empty + +stripAnsiControlSequence :: B.ByteString -> B.ByteString +stripAnsiControlSequence bytes = + case B.uncons bytes of + Just (byte, rest) + | isAnsiControlSequenceParameterByte byte + || isAnsiControlSequenceIntermediateByte byte -> + stripAnsiControlSequence rest + | isAnsiControlSequenceFinalByte byte -> + stripAnsiEscapeSequences rest + _ -> + B.empty + +isAnsiControlSequenceParameterByte :: Word8 -> Bool +isAnsiControlSequenceParameterByte byte = + byte >= 0x30 && byte <= 0x3f + +isAnsiControlSequenceIntermediateByte :: Word8 -> Bool +isAnsiControlSequenceIntermediateByte byte = + byte >= 0x20 && byte <= 0x2f + +isAnsiControlSequenceFinalByte :: Word8 -> Bool +isAnsiControlSequenceFinalByte byte = + byte >= 0x40 && byte <= 0x7e + +carriageReturnByte :: Word8 +carriageReturnByte = 13 + +newlineByte :: Word8 +newlineByte = 10 + +ansiEscapeByte :: Word8 +ansiEscapeByte = 27 + +ansiControlSequenceIntroducerByte :: Word8 +ansiControlSequenceIntroducerByte = 91 diff --git a/stack.yaml b/stack.yaml index 874850d..19270f3 100644 --- a/stack.yaml +++ b/stack.yaml @@ -1,4 +1,4 @@ -resolver: lts-16.13 +resolver: lts-24.38 # User packages to be built. # Various formats can be used as shown in the example below. # @@ -18,11 +18,10 @@ packages: # using the same syntax as the packages field. # (e.g., acme-missiles-0.3) extra-deps: -- monad-finally-0.1.2@sha256:7f2c860c39d0a00908d83ddaf9cd232d09c19934381b011ed361335715b4e52e -- monad-abort-fd-0.7@sha256:dc917e7ee2ec0b4f20d6e1cc323bef03adf5b2067619b6e7f4f324a50ae6e870,1340 +- monad-finally-0.1.2.1@sha256:a5da0f790185af394ae30e53dae9af388b7bec1babf9240304e13de51d5925e7,1637 +- monad-abort-fd-0.7.0.1@sha256:a13c744a6c1f85e72adeac6348c6e1044c7a19123f65fafa41336a98923db521,1490 - transformers-abort-0.6.0.3@sha256:34de32cc6e852df10ad57df34e46404f841c6b0123526b7fd942c455f62a7a31,1236 - # Override default flag values for local packages and extra-deps # flags: {} diff --git a/stack.yaml.lock b/stack.yaml.lock index b372830..6cfc8b6 100644 --- a/stack.yaml.lock +++ b/stack.yaml.lock @@ -1,33 +1,33 @@ # This file was autogenerated by Stack. # You should not edit this file by hand. # For more information, please see the documentation at: -# https://docs.haskellstack.org/en/stable/lock_files +# https://docs.haskellstack.org/en/stable/topics/lock_files packages: - completed: - hackage: monad-finally-0.1.2@sha256:7f2c860c39d0a00908d83ddaf9cd232d09c19934381b011ed361335715b4e52e,1487 + hackage: monad-finally-0.1.2.1@sha256:a5da0f790185af394ae30e53dae9af388b7bec1babf9240304e13de51d5925e7,1637 pantry-tree: + sha256: 17f2358ba23fe2d6ae4c7dc9702b921c0bce3d694985c7dc60b5ebd22bf42277 size: 347 - sha256: 24d7ac7af5b7d8c23bfe3eced368c3fe9ae2de5766522605ef963cd2f466dce1 original: - hackage: monad-finally-0.1.2@sha256:7f2c860c39d0a00908d83ddaf9cd232d09c19934381b011ed361335715b4e52e + hackage: monad-finally-0.1.2.1@sha256:a5da0f790185af394ae30e53dae9af388b7bec1babf9240304e13de51d5925e7,1637 - completed: - hackage: monad-abort-fd-0.7@sha256:dc917e7ee2ec0b4f20d6e1cc323bef03adf5b2067619b6e7f4f324a50ae6e870,1340 + hackage: monad-abort-fd-0.7.0.1@sha256:a13c744a6c1f85e72adeac6348c6e1044c7a19123f65fafa41336a98923db521,1490 pantry-tree: + sha256: 73df69ba0ace97907f7c7553df55b0ff17bfffe071e300db6c350a8af2f2fc2b size: 486 - sha256: 8aae8657f16cd20d32307b6e815ab3b53bcd6a35e770dbe96064a3b6a5a06b6d original: - hackage: monad-abort-fd-0.7@sha256:dc917e7ee2ec0b4f20d6e1cc323bef03adf5b2067619b6e7f4f324a50ae6e870,1340 + hackage: monad-abort-fd-0.7.0.1@sha256:a13c744a6c1f85e72adeac6348c6e1044c7a19123f65fafa41336a98923db521,1490 - completed: hackage: transformers-abort-0.6.0.3@sha256:34de32cc6e852df10ad57df34e46404f841c6b0123526b7fd942c455f62a7a31,1236 pantry-tree: - size: 357 sha256: eeafa773cab79ad314b113aaa6e182e4a353bc1843b759241c1e761ebc1f453e + size: 357 original: hackage: transformers-abort-0.6.0.3@sha256:34de32cc6e852df10ad57df34e46404f841c6b0123526b7fd942c455f62a7a31,1236 snapshots: - completed: - size: 532381 - url: https://raw.githubusercontent.com/commercialhaskell/stackage-snapshots/master/lts/16/13.yaml - sha256: 6ee17f7996e5bc75ae4406250841f1362ad4196418a4d90a0615ff4f26ac98df - original: lts-16.13 + sha256: abc790b571e0c70e929db74b329e3c18d7e76a6e173e8bdf94f1ba20770d4c24 + size: 728990 + url: https://raw.githubusercontent.com/commercialhaskell/stackage-snapshots/master/lts/24/38.yaml + original: lts-24.38 diff --git a/test/Spec.hs b/test/Spec.hs index a531222..42f4fdf 100644 --- a/test/Spec.hs +++ b/test/Spec.hs @@ -1,35 +1,48 @@ {-# LANGUAGE OverloadedStrings #-} module Main where -import Codec.Binary.UTF8.String (decodeString) -import Control.Exception (finally) -import qualified Data.ByteString.Char8 as B -import qualified Data.Text as T -import qualified Data.Text.IO as T -import Dhall.JSON (omitNull) -import Dhall.Yaml (Options (..), defaultOptions, - dhallToYaml) -import qualified Options.Applicative as OA -import System.Directory (createDirectoryIfMissing) -import System.FilePath (()) -import System.Process (readCreateProcess, shell) -import qualified Tests.SubProcTests as SubProcTests +import Control.Exception (bracket, finally, try) +import Control.Monad (foldM, when) +import Control.Monad.Extra (partitionM) +import Control.Monad.Trans (lift) +import Control.Monad.Trans.State (StateT, evalStateT, gets, modify, + put) +import Data.List (isSuffixOf) +import Data.Maybe (fromMaybe) +import qualified Data.Text as T +import qualified Data.Text.IO as T +import Htcc.Utils (tshow) +import qualified Options.Applicative as OA +import System.Directory (createDirectoryIfMissing, + doesDirectoryExist, listDirectory) +import System.Environment (lookupEnv, setEnv, unsetEnv) +import System.Exit (ExitCode (..), exitFailure, + exitWith) +import System.FilePath (()) +import System.IO (hFlush, hPutStr, stderr, stdout) +import System.IO.Error (isDoesNotExistError) +import System.Process (proc, readCreateProcessWithExitCode) +import Tests.CommandSelection (Command (..), autoHtccBinOverride, + collectCommandExitCodes, + commandsToRun, + needsHtccCommandOverride, + resolveCommand) +import qualified Tests.ComponentsTests as ComponentsTests +import qualified Tests.SubProcTests as SubProcTests import Tests.Utils workDir :: FilePath workDir = "/tmp" "htcc" -specPath :: FilePath -specPath = workDir "spec.s" +asmDir :: FilePath +asmDir = workDir "asm" dockerComposePath :: FilePath -dockerComposePath = "./docker" "test.dhall" - -data Command = WithSubProc | WithDocker | WithSelf +dockerComposePath = "." "docker" "test.dhall" data Opts = Opts { optClean :: !Bool - , optCmd :: !Command + , optCmd :: !(Maybe Command) } subProcCmd :: OA.Mod OA.CommandFields Command @@ -44,6 +57,10 @@ selfCmd :: OA.Mod OA.CommandFields Command selfCmd = OA.command "self" $ OA.info (pure WithSelf) $ OA.progDesc "run the test using htcc's processing power" +componentsCmd :: OA.Mod OA.CommandFields Command +componentsCmd = OA.command "components" $ + OA.info (pure WithComponents) $ OA.progDesc "run unit tests of components" + cleanOpt :: OA.Parser Bool cleanOpt = OA.switch $ mconcat [ OA.long "clean" @@ -53,43 +70,166 @@ cleanOpt = OA.switch $ mconcat [ programOptions :: OA.Parser Opts programOptions = Opts <$> cleanOpt - <*> OA.hsubparser (mconcat [ - subProcCmd - , dockerCmd - , selfCmd - ]) + <*> OA.optional (OA.hsubparser (mconcat [ + subProcCmd + , dockerCmd + , selfCmd + , componentsCmd + ])) optsParser :: OA.ParserInfo Opts optsParser = OA.info (OA.helper <*> programOptions) $ mconcat [ OA.fullDesc - , OA.progDesc $ "The htcc unit tester" + , OA.progDesc "The htcc unit tester" ] -genTestAsm :: IO () -genTestAsm = do - createDirectoryIfMissing False workDir - execErrFin $ "stack exec htcc -- " <> T.pack testCoreFile <> " > " <> T.pack specPath +genTestAsm' :: StateT Int IO [T.Text] +genTestAsm' = lift (createDirectoryIfMissing False workDir *> createDirectoryIfMissing False asmDir) + *> go [] ("." "test" "Tests" "csrc" "self") + where + go s fname = do + names <- lift $ map (fname ) <$> listDirectory fname + (dirPaths, filePaths) <- lift $ partitionM doesDirectoryExist names + foldM (\fs f -> if ".c" `isSuffixOf` f then (:fs) <$> mkBin (T.pack f) else pure fs) s filePaths + >>= flip (foldM go) dirPaths + + mkBin fname = do + outAsmName <- gets (\n -> T.pack (asmDir "spec") <> tshow n <> ".s") + lift $ + htccCommand >>= \htccCmd -> + T.putStr ("[compiling] " <> fname) + *> hFlush stdout + *> execErrFin (htccCmd <> " " <> fname <> " > " <> outAsmName) + *> T.putStrLn (" -> " <> outAsmName) + outAsmName <$ modify succ + +genTestAsm :: IO [T.Text] +genTestAsm = evalStateT genTestAsm' 0 + +genTestBins' :: StateT Int IO [T.Text] +genTestBins' = (genTestAsm' <* put 0) >>= mapM f where - testCoreFile = "./test" "Tests" "csrc" "test_core.c" + f fname = do + binName <- gets (\n -> T.pack (workDir "spec") <> tshow n <> ".out") + asmCmd <- lift $ assemblerCommand + [ "-x" + , "assembler" + , "-no-pie" + , "-o" + , T.unpack binName + , T.unpack fname + ] + lift $ + T.putStr ("[assembling] " <> fname) + *> hFlush stdout + *> execErrFin asmCmd + *> T.putStrLn (" -> " <> binName) + binName <$ modify succ -createProcessDhallDocker :: FilePath -> String -> IO () -createProcessDhallDocker fp cmd = T.readFile fp - >>= dhallToYaml (defaultOptions { explain = True, omission = omitNull }) (Just fp) - >>= readCreateProcess (shell $ "docker-compose -f - " <> cmd) . decodeString . B.unpack - >>= putStrLn +genTestBins :: IO [T.Text] +genTestBins = evalStateT genTestBins' 0 + +createProcessDhallDocker :: FilePath -> [String] -> IO () +createProcessDhallDocker fp cmd = do + dockerCompose <- dockerComposeCommand + dockerInput <- renderDhallYaml fp + (dockerExitCode, dockerStdout, dockerStderr) <- + readCreateProcessWithExitCode + (uncurry proc $ dockerComposeArgs dockerCompose cmd) + dockerInput + putStr dockerStdout + hFlush stdout + hPutStr stderr dockerStderr + when (dockerExitCode /= ExitSuccess) $ + exitWith dockerExitCode + where + dockerComposeCommand = + maybe (pure ["docker", "compose"]) parseDockerComposeCommand =<< lookupEnv "DOCKER_COMPOSE" + + parseDockerComposeCommand "docker compose" = pure ["docker", "compose"] + parseDockerComposeCommand "docker-compose" = pure ["docker-compose"] + parseDockerComposeCommand value = + fail $ "unsupported DOCKER_COMPOSE value: " <> value + + dockerComposeArgs [] composeArgs = + dockerComposeArgs ["docker", "compose"] composeArgs + dockerComposeArgs (exe:args) composeArgs = + (exe, args <> ["-f", "-"] <> composeArgs) + +renderDhallYaml :: FilePath -> IO String +renderDhallYaml fp = do + dhallToYaml <- fromMaybe "dhall-to-yaml" <$> lookupEnv "DHALL_TO_YAML" + result <- try $ readCreateProcessWithExitCode + (proc dhallToYaml ["--explain", "--file", fp]) + "" + case result of + Left err + | isDoesNotExistError err -> do + hPutStr stderr $ unlines [ + "dhall-to-yaml executable not found: " <> dhallToYaml + , "Run .travis/install-dhall-to-yaml.sh and ensure $HOME/.local/bin is on PATH, or set DHALL_TO_YAML." + ] + exitFailure + | otherwise -> ioError err + Right (dhallExitCode, dhallStdout, dhallStderr) -> do + hPutStr stderr dhallStderr + when (dhallExitCode /= ExitSuccess) $ + exitWith dhallExitCode + pure dhallStdout + +runDhallDocker :: [String] -> IO () +runDhallDocker = createProcessDhallDocker dockerComposePath main :: IO () main = do opts <- OA.execParser optsParser - case optCmd opts of - WithSubProc -> SubProcTests.exec - WithDocker -> let runDhallDocker = createProcessDhallDocker dockerComposePath in - if optClean opts then - runDhallDocker "down --rmi all" - else - flip finally (clean [workDir]) $ - genTestAsm >> runDhallDocker "up --build" - WithSelf -> flip finally (clean [workDir, "spec"]) $ do - genTestAsm - execErrFin $ "gcc -no-pie -o spec " <> T.pack specPath - execErrFin "./spec" + command <- resolveCommand (optClean opts) (optCmd opts) + let commands = commandsToRun (optCmd opts) command + autoCompilerCommand <- + if needsHtccCommandOverride (optCmd opts) command + then autoHtccBinOverride + else pure Nothing + exitCodes <- collectCommandExitCodes $ map (runCommand opts autoCompilerCommand) commands + when (any (/= ExitSuccess) exitCodes) exitFailure + +runCommand :: Opts -> Maybe T.Text -> Command -> IO () +runCommand opts autoCompilerCommand command = case command of + WithSubProc -> + maybe + SubProcTests.exec + (\compilerCommand -> + withEnvOverride "HTCC_BIN" (T.unpack compilerCommand) SubProcTests.exec + ) + autoCompilerCommand + WithDocker + | optClean opts -> runDhallDocker ["down", "--rmi", "all"] + | otherwise -> + ( clean [workDir] + *> genTestAsm + *> runDhallDocker ["up", "--build", "--exit-code-from", "htcc"] + ) `finally` clean [workDir] + WithSelf -> + maybe + runGeneratedSelfTests + (\compilerCommand -> + withEnvOverride + "HTCC_BIN" + (T.unpack compilerCommand) + runGeneratedSelfTests + ) + autoCompilerCommand + WithComponents -> ComponentsTests.exec + where + runGeneratedSelfTests = + (genTestBins >>= mapM_ execErrFin) `finally` clean [workDir] + +withEnvOverride :: String -> String -> IO a -> IO a +withEnvOverride name value = + bracket + (do + oldValue <- lookupEnv name + setEnv name value + pure oldValue + ) + (maybe (unsetEnv name) (setEnv name)) + . const diff --git a/test/Tests/CommandSelection.hs b/test/Tests/CommandSelection.hs new file mode 100644 index 0000000..fe277a5 --- /dev/null +++ b/test/Tests/CommandSelection.hs @@ -0,0 +1,916 @@ +module Tests.CommandSelection ( + Command (..), + autoHtccBinOverride, + autoHtccBinOverrideFor, + autoHtccBinOverrideForHost, + autoHtccBinOverrideWith, + autoHtccCommand, + autoHtccCommandFor, + autoHtccCommandForHost, + assemblerCommandAvailableInDirectoryWith, + assemblerCommandAvailableWithDirectories, + assemblerCommandAvailableWith, + assemblerCommandAvailableWithTempDirectory, + collectCommandExitCodes, + compilerCommandAvailable, + compilerCommandAvailableWithDirectories, + compilerCommandAvailableWith, + commandsToRun, + defaultCommand, + defaultCommandFor, + defaultCommandWithProbes, + shouldValidateRunnableLinkedOutput, + validateRunnableLinkedOutput, + looksRunnableLinkedOutput, + needsHtccCommandOverride, + needsSubProcCompilerOverride, + probeAvailableDirectories, + resolveCommand, + resolveCommandWith, + resolveCommandWithDefault +) where + +import Control.Exception (AsyncException, IOException, + SomeException, displayException, + finally, fromException, throwIO, try) +import Data.Bits (Bits (shiftL, (.&.), (.|.))) +import qualified Data.ByteString as B +import qualified Data.ByteString.Char8 as BC +import Data.Char (toLower) +import Data.Either (fromRight) +import Data.Functor (($>)) +import Data.List (foldl', isInfixOf, isPrefixOf) +import Data.Maybe (fromMaybe, isJust) +import qualified Data.Text as T +import Data.Word (Word64, Word8) +import System.Directory (getCurrentDirectory, + getTemporaryDirectory, removeFile) +import System.Exit (ExitCode (ExitFailure, ExitSuccess)) +import System.FilePath (takeFileName) +import System.Info (arch, os) +import System.IO (Handle, hClose, hPutStr, hPutStrLn, + openTempFile, stderr) +import System.IO.Error (catchIOError, isDoesNotExistError) +import System.Posix.Files (fileMode, getSymbolicLinkStatus, + groupExecuteMode, intersectFileModes, + isRegularFile, otherExecuteMode, + ownerExecuteMode, ownerReadMode, + setFileMode, unionFileModes) +import System.Posix.Types (FileMode) +import Tests.Utils (CompilerCommand, absoluteHtccCommand, + absoluteHtccCommandWith, + assemblerCompilerCommand, + htccCommandFor, htccCommandForHost, + probeCompilerShellCommandAvailableInDirectory, + readCompilerProcessWithExitCodeIn) + +data Command = WithSubProc | WithDocker | WithSelf | WithComponents + deriving (Eq, Show) + +commandsToRun :: Maybe Command -> Command -> [Command] +commandsToRun Nothing WithSubProc = [WithComponents, WithSubProc] +commandsToRun _ command = [command] + +needsSubProcCompilerOverride :: Maybe Command -> Command -> Bool +needsSubProcCompilerOverride maybeCommand command = + WithSubProc `elem` commandsToRun maybeCommand command + +needsHtccCommandOverride :: Maybe Command -> Command -> Bool +needsHtccCommandOverride maybeCommand command = + any (`elem` commandsToRun maybeCommand command) [WithSubProc, WithSelf] + +defaultCommandFor :: String -> String -> Bool -> Bool -> Command +defaultCommandFor hostOs hostArch compilerAvailable assemblerAvailable + | supportsSubProcDefault hostOs hostArch + && compilerAvailable + && assemblerAvailable = + WithSubProc + | otherwise = + WithComponents + +defaultCommandWithProbes :: String -> String -> IO Bool -> IO Bool -> IO Command +defaultCommandWithProbes hostOs hostArch compilerProbe assemblerProbe + | supportsSubProcDefault hostOs hostArch = do + compilerAvailable <- compilerProbe + defaultCommandFor hostOs hostArch compilerAvailable + <$> if compilerAvailable then assemblerProbe else pure False + | otherwise = + pure $ defaultCommandFor hostOs hostArch False False + +supportsSubProcDefault :: String -> String -> Bool +supportsSubProcDefault hostOs hostArch = + hostOs == "linux" && hostArch == "x86_64" + +resolveCommandWithDefault :: Command -> Bool -> Maybe Command -> Command +resolveCommandWithDefault autoCommand clean maybeCommand = case maybeCommand of + Just command -> + command + Nothing + | clean -> + WithDocker + | otherwise -> + autoCommand + +defaultCommand :: IO Command +defaultCommand = + defaultCommandWithProbes os arch compilerCommandAvailable assemblerCommandAvailable + +collectCommandExitCodes :: [IO ()] -> IO [ExitCode] +collectCommandExitCodes = + mapM $ \action -> do + result <- try action :: IO (Either SomeException ()) + case result of + Right () -> + pure ExitSuccess + Left err -> + if isJust (fromException err :: Maybe AsyncException) + then throwIO err + else + hPutStrLn stderr (displayException err) + $> fromMaybe (ExitFailure 1) (fromException err) + +resolveCommandWith :: IO Command -> Bool -> Maybe Command -> IO Command +resolveCommandWith autoCommand clean maybeCommand = case maybeCommand of + Just command -> + pure command + Nothing + | clean -> + pure WithDocker + | otherwise -> + autoCommand + +resolveCommand :: Bool -> Maybe Command -> IO Command +resolveCommand = + resolveCommandWith defaultCommand + +autoHtccCommandFor :: Maybe String -> Maybe FilePath -> T.Text +autoHtccCommandFor = + htccCommandFor + +autoHtccCommandForHost :: String -> Maybe String -> Maybe FilePath -> T.Text +autoHtccCommandForHost = + htccCommandForHost + +autoHtccCommand :: IO T.Text +autoHtccCommand = absoluteHtccCommand + +autoHtccBinOverrideFor :: Maybe String -> Maybe FilePath -> Maybe T.Text +autoHtccBinOverrideFor = + autoHtccBinOverrideForHost os + +autoHtccBinOverrideForHost :: String -> Maybe String -> Maybe FilePath -> Maybe T.Text +autoHtccBinOverrideForHost hostOs maybeCompilerCommand maybeRepoBuiltCompilerPath = + Just $ htccCommandForHost hostOs maybeCompilerCommand maybeRepoBuiltCompilerPath + +autoHtccBinOverrideWith :: Maybe String -> Maybe FilePath -> Maybe FilePath -> IO (Maybe T.Text) +autoHtccBinOverrideWith maybeCompilerCommand maybeRepoRoot maybeRepoBuiltCompilerPath = + Just <$> absoluteHtccCommandWith + maybeCompilerCommand + maybeRepoRoot + maybeRepoBuiltCompilerPath + +autoHtccBinOverride :: IO (Maybe T.Text) +autoHtccBinOverride = + Just <$> absoluteHtccCommand + +compilerCommandAvailable :: IO Bool +compilerCommandAvailable = + compilerCommandAvailableWith autoHtccCommand + +compilerCommandAvailableWithDirectories :: [IO FilePath] -> IO T.Text -> IO Bool +compilerCommandAvailableWithDirectories directoryProviders getCompilerCommand = + probeAvailableDirectories + directoryProviders + (`compilerCommandAvailableInDirectoryWith` getCompilerCommand) + where + compilerCommandAvailableInDirectoryWith workingDir getCompilerCommand' = do + result <- try $ do + compilerCommand <- getCompilerCommand' + probeCompilerShellCommandAvailableInDirectory workingDir compilerCommand + pure $ fromRight False (result :: Either IOException Bool) + +compilerCommandAvailableWith :: IO T.Text -> IO Bool +compilerCommandAvailableWith = + compilerCommandAvailableWithDirectories + [ getTemporaryDirectory + , getCurrentDirectory + ] + +assemblerCommandAvailable :: IO Bool +assemblerCommandAvailable = do + result <- try (assemblerCommandAvailableWith $ const assemblerCompilerCommand) :: IO (Either IOException Bool) + pure $ fromRight False result + +assemblerCommandAvailableWithDirectories :: [IO FilePath] -> ([String] -> IO CompilerCommand) -> IO Bool +assemblerCommandAvailableWithDirectories directoryProviders buildAssemblerCommand = + probeAvailableDirectories + directoryProviders + (`assemblerCommandAvailableInDirectoryWith` buildAssemblerCommand) + +assemblerCommandAvailableWith :: ([String] -> IO CompilerCommand) -> IO Bool +assemblerCommandAvailableWith = + assemblerCommandAvailableWithDirectories + [ getTemporaryDirectory + , getCurrentDirectory + ] + +assemblerCommandAvailableWithTempDirectory :: IO FilePath -> ([String] -> IO CompilerCommand) -> IO Bool +assemblerCommandAvailableWithTempDirectory getTempDirectory buildAssemblerCommand = do + tempDir <- getTempDirectory + assemblerCommandAvailableInDirectoryWith tempDir buildAssemblerCommand + +probeAvailableDirectories :: [IO FilePath] -> (FilePath -> IO Bool) -> IO Bool +probeAvailableDirectories directoryProviders probe = + go [] False True directoryProviders + where + go _ sawDirectory allAvailable [] = + pure $ sawDirectory && allAvailable + go seenDirectories sawDirectory allAvailable (nextDirectory : remainingDirectoryProviders) = do + nextDirectoryResult <- try nextDirectory :: IO (Either IOException FilePath) + case nextDirectoryResult of + Left _ -> + go seenDirectories sawDirectory False remainingDirectoryProviders + Right directory -> + if directory `elem` seenDirectories + then go seenDirectories sawDirectory allAvailable remainingDirectoryProviders + else do + probeResult <- try (probe directory) :: IO (Either IOException Bool) + case probeResult of + Left _ -> + go (directory : seenDirectories) True False remainingDirectoryProviders + Right available -> + go + (directory : seenDirectories) + True + (allAvailable && available) + remainingDirectoryProviders + +assemblerCommandAvailableInDirectoryWith :: FilePath -> ([String] -> IO CompilerCommand) -> IO Bool +assemblerCommandAvailableInDirectoryWith workingDir = + probeAssemblerCommandWith (pure workingDir) + +probeAssemblerCommandWith :: IO FilePath -> ([String] -> IO CompilerCommand) -> IO Bool +probeAssemblerCommandWith getWorkingDirectory buildAssemblerCommand = do + workingDir <- getWorkingDirectory + withProbeFile (pure workingDir) "htcc-test-probe-.s" $ \asmPath asmHandle -> do + withProbeFile (pure workingDir) "htcc-test-probe-.o" $ \objPath objHandle -> do + withProbeFile (pure workingDir) "htcc-test-probe-.out" $ \outputPath outputHandle -> do + let probeMarker = makeProbeMarker asmPath objPath + hPutStr asmHandle $ assemblerProbeAsm probeMarker + hClose asmHandle + hClose objHandle + hClose outputHandle + ignoreIOException $ removeFile outputPath + assembleResult <- readAssemblerExitCodeWith workingDir buildAssemblerCommand + [ "-x" + , "assembler" + , "-c" + , "-o" + , objPath + , asmPath + ] + case assembleResult of + Just ExitSuccess -> do + probeTarget <- detectProbeObjectTarget objPath + case probeTarget of + Just target + | isX86_64ElfTarget target -> do + linkResult <- readAssemblerExitCodeWith workingDir buildAssemblerCommand + [ "-no-pie" + , "-o" + , outputPath + , objPath + ] + case linkResult of + Just ExitSuccess -> + validateRunnableLinkedOutput outputPath (Just probeMarker) + _ -> pure False + | otherwise -> + pure False + Nothing -> + pure False + _ -> + pure False + +readAssemblerExitCodeWith :: FilePath -> ([String] -> IO CompilerCommand) -> [String] -> IO (Maybe ExitCode) +readAssemblerExitCodeWith workingDir buildAssemblerCommand args = do + compiler <- buildAssemblerCommand args + catchIOError + (do + (exitCode, _, _) <- readCompilerProcessWithExitCodeIn (Just workingDir) compiler args + pure $ Just exitCode + ) + (const $ pure Nothing) + +detectProbeObjectTarget :: FilePath -> IO (Maybe String) +detectProbeObjectTarget path = + catchIOError + (do + status <- getSymbolicLinkStatus path + if isRegularFile status + then describeProbeObject <$> B.readFile path + else pure Nothing + ) + (const $ pure Nothing) + +describeProbeObject :: B.ByteString -> Maybe String +describeProbeObject bytes + | B.length bytes < 4 = Nothing + | B.take 4 bytes /= elfMagic = Just "non-ELF object file" + | B.length bytes < 20 = Just "truncated ELF object file" + | otherwise = + Just $ + case (elfClass == 2, elfMachine == 62, elfType == 1) of + (True, True, True) -> "x86_64-unknown-elf object" + (True, True, False) -> "non-relocatable x86_64-ELF file" + (_, _, True) -> "ELF object file" + _ -> "non-relocatable ELF file" + where + elfClass = B.index bytes 4 + elfData = B.index bytes 5 + elfType = decodeElfHalfWord elfData (B.index bytes 16) (B.index bytes 17) :: Int + elfMachine = decodeElfHalfWord elfData (B.index bytes 18) (B.index bytes 19) :: Int + +validateRunnableLinkedOutput :: FilePath -> Maybe String -> IO Bool +validateRunnableLinkedOutput path maybeProbeMarker = + catchIOError + (do + status <- getSymbolicLinkStatus path + if isRegularFile status + then do + let originalMode = fileMode status + hasExecuteBits = + intersectFileModes originalMode executableFileMode /= 0 + if not hasExecuteBits + then pure False + else do + withReadableFile path originalMode $ do + bytes <- B.readFile path + let markerPresent = maybe + True + (`probeMarkerPresent` bytes) + maybeProbeMarker + linkedOutputOk = + maybe + False + hasRunnableLinkedElfProgramHeadersAndInterpreter + (parseLinkedOutputElf bytes) + pure $ markerPresent && linkedOutputOk + else pure False + ) + (const $ pure False) + +shouldValidateRunnableLinkedOutput :: FilePath -> IO Bool +shouldValidateRunnableLinkedOutput path = + catchIOError + (isRegularFile <$> getSymbolicLinkStatus path) + (\ioErr -> if isDoesNotExistError ioErr then pure True else ioError ioErr) + +withReadableFile :: FilePath -> FileMode -> IO a -> IO a +withReadableFile path originalMode action + | intersectFileModes originalMode ownerReadMode /= 0 = action + | otherwise = do + setFileMode path readableMode + action `finally` setFileMode path originalMode + where + readableMode = originalMode `unionFileModes` ownerReadMode + +probeMarkerPresent :: String -> B.ByteString -> Bool +probeMarkerPresent probeMarker = + B.isInfixOf $ B.pack $ map (fromIntegral . fromEnum) probeMarker + +data LinkedOutputElf = LinkedOutputElf + { linkedOutputElfBytes :: B.ByteString + , linkedOutputElfDataEncoding :: Word8 + , linkedOutputElfOsAbi :: Word8 + , linkedOutputElfType :: Int + , linkedOutputElfFileSize :: Word64 + , linkedOutputElfEntryPoint :: Word64 + , linkedOutputElfProgramHeaderOffset :: Word64 + , linkedOutputElfProgramHeaderEntrySize :: Int + , linkedOutputElfProgramHeaderCount :: Int + } + +looksRunnableLinkedOutput :: B.ByteString -> Bool +looksRunnableLinkedOutput bytes = + maybe False hasRunnableLinkedElfProgramHeadersAndInterpreter $ + parseLinkedOutputElf bytes + +linkedOutputElfHasStandaloneInterpreterDynamicSection :: LinkedOutputElf -> Bool +linkedOutputElfHasStandaloneInterpreterDynamicSection elf = + any hasStandaloneInterpreterDynamicSection [0 .. linkedOutputElfProgramHeaderCount elf - 1] + where + hasStandaloneInterpreterDynamicSection headerIndex = + let headerOffset = linkedOutputElfProgramHeaderEntryOffset elf headerIndex + in linkedOutputElfHasValidProgramHeaderBounds elf headerOffset + && linkedOutputElfProgramHeaderType elf headerOffset == elfProgramHeaderTypeDynamic + && maybe + False + (uncurry $ linkedOutputElfDynamicEntriesDescribeStandaloneInterpreter elf) + (linkedOutputElfProgramHeaderFileRange elf headerOffset) + +linkedOutputElfDynamicEntriesDescribeStandaloneInterpreter + :: LinkedOutputElf + -> Word64 + -> Word64 + -> Bool +linkedOutputElfDynamicEntriesDescribeStandaloneInterpreter elf dynamicOffset dynamicSize + | dynamicSize < fromIntegral elfDynamicEntrySize = False + | dynamicSize `mod` fromIntegral elfDynamicEntrySize /= 0 = False + | otherwise = + let dynamicEnd = dynamicOffset + dynamicSize + go entryOffset + | entryOffset >= dynamicEnd = False + | otherwise = + let entryTag = linkedOutputElfDynamicEntryTag elf entryOffset + in entryTag == elfDynamicTagNull + || ( entryTag /= elfDynamicTagNeeded + && go (entryOffset + fromIntegral elfDynamicEntrySize) + ) + in go dynamicOffset + +parseLinkedOutputElf :: B.ByteString -> Maybe LinkedOutputElf +parseLinkedOutputElf bytes + | B.length bytes < elfHeaderSize = Nothing + | B.take 4 bytes /= elfMagic = Nothing + | elfClass /= elfClass64Bit = Nothing + | elfData `notElem` [elfDataLittleEndian, elfDataBigEndian] = Nothing + | elfIdentVersion /= elfCurrentVersion = Nothing + | elfMachine /= elfMachineX86_64 = Nothing + | elfType `notElem` [elfTypeExecutable, elfTypeSharedObject] = Nothing + | elfVersion /= fromIntegral elfCurrentVersion = Nothing + | elfHeaderByteSize /= elfHeaderSize = Nothing + | elfProgramHeaderEntrySize < elfProgramHeaderSize = Nothing + | elfProgramHeaderCount == 0 = Nothing + | not (rangeWithinFile elfProgramHeaderOffset elfProgramHeaderTableSize fileSize) = Nothing + | otherwise = + Just LinkedOutputElf + { linkedOutputElfBytes = bytes + , linkedOutputElfDataEncoding = elfData + , linkedOutputElfOsAbi = elfOsAbi + , linkedOutputElfType = elfType + , linkedOutputElfFileSize = fileSize + , linkedOutputElfEntryPoint = elfEntryPoint + , linkedOutputElfProgramHeaderOffset = elfProgramHeaderOffset + , linkedOutputElfProgramHeaderEntrySize = elfProgramHeaderEntrySize + , linkedOutputElfProgramHeaderCount = elfProgramHeaderCount + } + where + fileSize = fromIntegral $ B.length bytes + elfClass = B.index bytes 4 + elfData = B.index bytes 5 + elfIdentVersion = B.index bytes 6 + elfOsAbi = B.index bytes 7 + elfType = decodeElfHalfWord elfData (B.index bytes 16) (B.index bytes 17) :: Int + elfMachine = decodeElfHalfWord elfData (B.index bytes 18) (B.index bytes 19) :: Int + elfVersion = + decodeElfWord32 + elfData + [ B.index bytes 20 + , B.index bytes 21 + , B.index bytes 22 + , B.index bytes 23 + ] :: + Int + elfEntryPoint = + decodeElfWord64 + elfData + [ B.index bytes 24 + , B.index bytes 25 + , B.index bytes 26 + , B.index bytes 27 + , B.index bytes 28 + , B.index bytes 29 + , B.index bytes 30 + , B.index bytes 31 + ] + elfProgramHeaderOffset = + decodeElfWord64 + elfData + [ B.index bytes 32 + , B.index bytes 33 + , B.index bytes 34 + , B.index bytes 35 + , B.index bytes 36 + , B.index bytes 37 + , B.index bytes 38 + , B.index bytes 39 + ] + elfHeaderByteSize = decodeElfHalfWord elfData (B.index bytes 52) (B.index bytes 53) :: Int + elfProgramHeaderEntrySize = + decodeElfHalfWord elfData (B.index bytes 54) (B.index bytes 55) :: Int + elfProgramHeaderCount = + decodeElfHalfWord elfData (B.index bytes 56) (B.index bytes 57) :: Int + elfProgramHeaderTableSize = + fromIntegral elfProgramHeaderEntrySize * fromIntegral elfProgramHeaderCount + +hasRunnableLinkedElfProgramHeadersAndInterpreter :: LinkedOutputElf -> Bool +hasRunnableLinkedElfProgramHeadersAndInterpreter elf = + linkedOutputElfHasRunnableInterpreterLayout elf + && any (linkedOutputElfHasRunnableProgramHeader elf) [0 .. linkedOutputElfProgramHeaderCount elf - 1] + +linkedOutputElfHasRunnableInterpreterLayout :: LinkedOutputElf -> Bool +linkedOutputElfHasRunnableInterpreterLayout elf = + case linkedOutputElfInterpreterPath elf of + Just (Just _) -> + True + Just Nothing -> + ( linkedOutputElfType elf == elfTypeExecutable + && not (linkedOutputElfHasDynamicProgramHeader elf) + ) + || linkedOutputElfHasStandaloneStaticPieLayout elf + Nothing -> + False + +linkedOutputElfHasDynamicProgramHeader :: LinkedOutputElf -> Bool +linkedOutputElfHasDynamicProgramHeader elf = + any hasDynamicProgramHeader [0 .. linkedOutputElfProgramHeaderCount elf - 1] + where + hasDynamicProgramHeader headerIndex = + let headerOffset = linkedOutputElfProgramHeaderEntryOffset elf headerIndex + in linkedOutputElfHasValidProgramHeaderBounds elf headerOffset + && linkedOutputElfProgramHeaderType elf headerOffset == elfProgramHeaderTypeDynamic + +linkedOutputElfInterpreterPath :: LinkedOutputElf -> Maybe (Maybe FilePath) +linkedOutputElfInterpreterPath elf = + case filter isInterpreterProgramHeader [0 .. linkedOutputElfProgramHeaderCount elf - 1] of + [] -> + Just Nothing + [headerIndex] -> do + let headerOffset = linkedOutputElfProgramHeaderEntryOffset elf headerIndex + interpreterOffset = linkedOutputElfProgramHeaderFileOffset elf headerOffset + interpreterSize = linkedOutputElfProgramHeaderFileSize elf headerOffset + if not (linkedOutputElfHasValidProgramHeaderBounds elf headerOffset) + || interpreterSize <= 1 + || interpreterSize > linkedOutputElfProgramHeaderMemorySize elf headerOffset + || not (rangeWithinFile interpreterOffset interpreterSize (linkedOutputElfFileSize elf)) + then Nothing + else do + interpreterBytes <- linkedOutputElfNullTerminatedBytes elf interpreterOffset interpreterSize + let interpreterPath = BC.unpack interpreterBytes + if null interpreterPath || head interpreterPath /= '/' + then Nothing + else Just $ Just interpreterPath + _ -> + Nothing + where + isInterpreterProgramHeader headerIndex = + let headerOffset = linkedOutputElfProgramHeaderEntryOffset elf headerIndex + in linkedOutputElfHasValidProgramHeaderBounds elf headerOffset + && linkedOutputElfProgramHeaderType elf headerOffset == elfProgramHeaderTypeInterp + +linkedOutputElfNullTerminatedBytes + :: LinkedOutputElf + -> Word64 + -> Word64 + -> Maybe B.ByteString +linkedOutputElfNullTerminatedBytes elf start size + | size == 0 = Nothing + | otherwise = + let rawBytes = + B.take (fromIntegral size) $ + B.drop (fromIntegral start) $ + linkedOutputElfBytes elf + in case B.unsnoc rawBytes of + Just (payloadBytes, trailingByte) + | trailingByte == 0 && not (B.null payloadBytes) && B.all (/= 0) payloadBytes -> + Just payloadBytes + _ -> + Nothing + +linkedOutputElfHasStaticPieDynamicFlags :: LinkedOutputElf -> Bool +linkedOutputElfHasStaticPieDynamicFlags elf = + linkedOutputElfType elf == elfTypeSharedObject + && any (linkedOutputElfProgramHeaderHasStaticPieFlag elf) [0 .. linkedOutputElfProgramHeaderCount elf - 1] + +linkedOutputElfHasStandaloneStaticPieLayout :: LinkedOutputElf -> Bool +linkedOutputElfHasStandaloneStaticPieLayout elf = + linkedOutputElfHasStaticPieDynamicFlags elf + && linkedOutputElfHasStandaloneInterpreterDynamicSection elf + +linkedOutputElfProgramHeaderHasStaticPieFlag :: LinkedOutputElf -> Int -> Bool +linkedOutputElfProgramHeaderHasStaticPieFlag elf headerIndex = + let headerOffset = linkedOutputElfProgramHeaderEntryOffset elf headerIndex + in linkedOutputElfHasValidProgramHeaderBounds elf headerOffset + && linkedOutputElfProgramHeaderType elf headerOffset == elfProgramHeaderTypeDynamic + && maybe + False + (uncurry $ linkedOutputElfDynamicEntriesContainStaticPieFlag elf) + (linkedOutputElfProgramHeaderFileRange elf headerOffset) + +linkedOutputElfDynamicEntriesContainStaticPieFlag :: LinkedOutputElf -> Word64 -> Word64 -> Bool +linkedOutputElfDynamicEntriesContainStaticPieFlag elf dynamicOffset dynamicSize + | dynamicSize < fromIntegral elfDynamicEntrySize = False + | dynamicSize `mod` fromIntegral elfDynamicEntrySize /= 0 = False + | otherwise = + let dynamicEnd = dynamicOffset + dynamicSize + go entryOffset + | entryOffset >= dynamicEnd = False + | otherwise = + let entryTag = linkedOutputElfDynamicEntryTag elf entryOffset + entryValue = linkedOutputElfDynamicEntryValue elf entryOffset + in entryTag /= elfDynamicTagNull + && ( ( entryTag == elfDynamicTagFlags1 + && entryValue .&. elfDynamicFlag1Pie /= 0 + ) + || go (entryOffset + fromIntegral elfDynamicEntrySize) + ) + in go dynamicOffset + +linkedOutputElfHasRunnableProgramHeader :: LinkedOutputElf -> Int -> Bool +linkedOutputElfHasRunnableProgramHeader elf headerIndex = + let headerOffset = linkedOutputElfProgramHeaderEntryOffset elf headerIndex + in linkedOutputElfHasValidProgramHeaderBounds elf headerOffset + && linkedOutputElfProgramHeaderType elf headerOffset == elfProgramHeaderTypeLoad + && linkedOutputElfProgramHeaderFileSize elf headerOffset > 0 + && linkedOutputElfProgramHeaderFileSize elf headerOffset + <= linkedOutputElfProgramHeaderMemorySize elf headerOffset + && rangeWithinFile + (linkedOutputElfProgramHeaderFileOffset elf headerOffset) + (linkedOutputElfProgramHeaderFileSize elf headerOffset) + (linkedOutputElfFileSize elf) + && linkedOutputElfProgramHeaderContainsEntryPoint elf headerOffset + && linkedOutputElfProgramHeaderFlags elf headerOffset .&. elfProgramHeaderFlagExecute /= 0 + +linkedOutputElfHasValidProgramHeaderBounds :: LinkedOutputElf -> Word64 -> Bool +linkedOutputElfHasValidProgramHeaderBounds elf headerOffset = + rangeWithinFile headerOffset (fromIntegral elfProgramHeaderSize) (linkedOutputElfFileSize elf) + +linkedOutputElfProgramHeaderFileRange :: LinkedOutputElf -> Word64 -> Maybe (Word64, Word64) +linkedOutputElfProgramHeaderFileRange elf headerOffset = + let fileOffset = linkedOutputElfProgramHeaderFileOffset elf headerOffset + fileSize = linkedOutputElfProgramHeaderFileSize elf headerOffset + memorySize = linkedOutputElfProgramHeaderMemorySize elf headerOffset + in if fileSize == 0 + || fileSize > memorySize + || not (rangeWithinFile fileOffset fileSize (linkedOutputElfFileSize elf)) + then Nothing + else Just (fileOffset, fileSize) + +linkedOutputElfProgramHeaderEntryOffset :: LinkedOutputElf -> Int -> Word64 +linkedOutputElfProgramHeaderEntryOffset elf headerIndex = + linkedOutputElfProgramHeaderOffset elf + + fromIntegral headerIndex * fromIntegral (linkedOutputElfProgramHeaderEntrySize elf) + +linkedOutputElfProgramHeaderType :: LinkedOutputElf -> Word64 -> Int +linkedOutputElfProgramHeaderType elf headerOffset = + decodeElfWord32 + (linkedOutputElfDataEncoding elf) + [ linkedOutputElfByteAt elf headerOffset 0 + , linkedOutputElfByteAt elf headerOffset 1 + , linkedOutputElfByteAt elf headerOffset 2 + , linkedOutputElfByteAt elf headerOffset 3 + ] :: + Int + +linkedOutputElfProgramHeaderFlags :: LinkedOutputElf -> Word64 -> Int +linkedOutputElfProgramHeaderFlags elf headerOffset = + decodeElfWord32 + (linkedOutputElfDataEncoding elf) + [ linkedOutputElfByteAt elf headerOffset 4 + , linkedOutputElfByteAt elf headerOffset 5 + , linkedOutputElfByteAt elf headerOffset 6 + , linkedOutputElfByteAt elf headerOffset 7 + ] :: + Int + +linkedOutputElfProgramHeaderFileOffset :: LinkedOutputElf -> Word64 -> Word64 +linkedOutputElfProgramHeaderFileOffset elf headerOffset = + decodeElfWord64 + (linkedOutputElfDataEncoding elf) + [ linkedOutputElfByteAt elf headerOffset 8 + , linkedOutputElfByteAt elf headerOffset 9 + , linkedOutputElfByteAt elf headerOffset 10 + , linkedOutputElfByteAt elf headerOffset 11 + , linkedOutputElfByteAt elf headerOffset 12 + , linkedOutputElfByteAt elf headerOffset 13 + , linkedOutputElfByteAt elf headerOffset 14 + , linkedOutputElfByteAt elf headerOffset 15 + ] + +linkedOutputElfProgramHeaderVirtualAddress :: LinkedOutputElf -> Word64 -> Word64 +linkedOutputElfProgramHeaderVirtualAddress elf headerOffset = + decodeElfWord64 + (linkedOutputElfDataEncoding elf) + [ linkedOutputElfByteAt elf headerOffset 16 + , linkedOutputElfByteAt elf headerOffset 17 + , linkedOutputElfByteAt elf headerOffset 18 + , linkedOutputElfByteAt elf headerOffset 19 + , linkedOutputElfByteAt elf headerOffset 20 + , linkedOutputElfByteAt elf headerOffset 21 + , linkedOutputElfByteAt elf headerOffset 22 + , linkedOutputElfByteAt elf headerOffset 23 + ] + +linkedOutputElfProgramHeaderFileSize :: LinkedOutputElf -> Word64 -> Word64 +linkedOutputElfProgramHeaderFileSize elf headerOffset = + decodeElfWord64 + (linkedOutputElfDataEncoding elf) + [ linkedOutputElfByteAt elf headerOffset 32 + , linkedOutputElfByteAt elf headerOffset 33 + , linkedOutputElfByteAt elf headerOffset 34 + , linkedOutputElfByteAt elf headerOffset 35 + , linkedOutputElfByteAt elf headerOffset 36 + , linkedOutputElfByteAt elf headerOffset 37 + , linkedOutputElfByteAt elf headerOffset 38 + , linkedOutputElfByteAt elf headerOffset 39 + ] + +linkedOutputElfProgramHeaderMemorySize :: LinkedOutputElf -> Word64 -> Word64 +linkedOutputElfProgramHeaderMemorySize elf headerOffset = + decodeElfWord64 + (linkedOutputElfDataEncoding elf) + [ linkedOutputElfByteAt elf headerOffset 40 + , linkedOutputElfByteAt elf headerOffset 41 + , linkedOutputElfByteAt elf headerOffset 42 + , linkedOutputElfByteAt elf headerOffset 43 + , linkedOutputElfByteAt elf headerOffset 44 + , linkedOutputElfByteAt elf headerOffset 45 + , linkedOutputElfByteAt elf headerOffset 46 + , linkedOutputElfByteAt elf headerOffset 47 + ] + +linkedOutputElfDynamicEntryTag :: LinkedOutputElf -> Word64 -> Word64 +linkedOutputElfDynamicEntryTag elf entryOffset = + decodeElfWord64 + (linkedOutputElfDataEncoding elf) + [ linkedOutputElfByteAt elf entryOffset 0 + , linkedOutputElfByteAt elf entryOffset 1 + , linkedOutputElfByteAt elf entryOffset 2 + , linkedOutputElfByteAt elf entryOffset 3 + , linkedOutputElfByteAt elf entryOffset 4 + , linkedOutputElfByteAt elf entryOffset 5 + , linkedOutputElfByteAt elf entryOffset 6 + , linkedOutputElfByteAt elf entryOffset 7 + ] + +linkedOutputElfDynamicEntryValue :: LinkedOutputElf -> Word64 -> Word64 +linkedOutputElfDynamicEntryValue elf entryOffset = + decodeElfWord64 + (linkedOutputElfDataEncoding elf) + [ linkedOutputElfByteAt elf entryOffset 8 + , linkedOutputElfByteAt elf entryOffset 9 + , linkedOutputElfByteAt elf entryOffset 10 + , linkedOutputElfByteAt elf entryOffset 11 + , linkedOutputElfByteAt elf entryOffset 12 + , linkedOutputElfByteAt elf entryOffset 13 + , linkedOutputElfByteAt elf entryOffset 14 + , linkedOutputElfByteAt elf entryOffset 15 + ] + +linkedOutputElfProgramHeaderContainsEntryPoint :: LinkedOutputElf -> Word64 -> Bool +linkedOutputElfProgramHeaderContainsEntryPoint elf headerOffset = + rangeContainsPoint + (linkedOutputElfProgramHeaderVirtualAddress elf headerOffset) + (linkedOutputElfProgramHeaderMemorySize elf headerOffset) + (linkedOutputElfEntryPoint elf) + +linkedOutputElfByteAt :: LinkedOutputElf -> Word64 -> Int -> Word8 +linkedOutputElfByteAt elf headerOffset relativeOffset = + B.index + (linkedOutputElfBytes elf) + (fromIntegral $ headerOffset + fromIntegral relativeOffset) + +decodeElfHalfWord :: (Bits a, Num a) => Word8 -> Word8 -> Word8 -> a +decodeElfHalfWord elfData byte18 byte19 + = decodeElfUnsigned elfData [byte18, byte19] + +decodeElfWord32 :: (Bits a, Num a) => Word8 -> [Word8] -> a +decodeElfWord32 = decodeElfUnsigned + +decodeElfWord64 :: (Bits a, Num a) => Word8 -> [Word8] -> a +decodeElfWord64 = decodeElfUnsigned + +decodeElfUnsigned :: (Bits a, Num a) => Word8 -> [Word8] -> a +decodeElfUnsigned elfData = + foldl' + (\acc nextByte -> acc `shiftL` 8 .|. fromIntegral nextByte) + 0 + . orderedBytes + where + orderedBytes + | elfData == elfDataBigEndian = id + | otherwise = reverse + +rangeWithinFile :: Word64 -> Word64 -> Word64 -> Bool +rangeWithinFile start size fileSize = + start <= fileSize && size <= fileSize - start + +rangeContainsPoint :: Word64 -> Word64 -> Word64 -> Bool +rangeContainsPoint start size point = + size > 0 && point >= start && point - start < size + +executableFileMode :: FileMode +executableFileMode = foldr1 unionFileModes + [ ownerExecuteMode + , groupExecuteMode + , otherExecuteMode + ] + +elfMagic :: B.ByteString +elfMagic = B.pack [0x7f, 0x45, 0x4c, 0x46] + +elfClass64Bit :: Word8 +elfClass64Bit = 2 + +elfCurrentVersion :: Word8 +elfCurrentVersion = 1 + +elfDataLittleEndian :: Word8 +elfDataLittleEndian = 1 + +elfDataBigEndian :: Word8 +elfDataBigEndian = 2 + +elfTypeExecutable :: Int +elfTypeExecutable = 2 + +elfTypeSharedObject :: Int +elfTypeSharedObject = 3 + +elfMachineX86_64 :: Int +elfMachineX86_64 = 62 + +elfHeaderSize :: Int +elfHeaderSize = 64 + +elfProgramHeaderSize :: Int +elfProgramHeaderSize = 56 + +elfProgramHeaderTypeLoad :: Int +elfProgramHeaderTypeLoad = 1 + +elfProgramHeaderTypeDynamic :: Int +elfProgramHeaderTypeDynamic = 2 + +elfProgramHeaderTypeInterp :: Int +elfProgramHeaderTypeInterp = 3 + +elfProgramHeaderFlagExecute :: Int +elfProgramHeaderFlagExecute = 0x1 + +elfDynamicEntrySize :: Int +elfDynamicEntrySize = 16 + +elfDynamicTagNull :: Word64 +elfDynamicTagNull = 0 + +elfDynamicTagNeeded :: Word64 +elfDynamicTagNeeded = 1 + +elfDynamicTagFlags1 :: Word64 +elfDynamicTagFlags1 = 0x6ffffffb + +elfDynamicFlag1Pie :: Word64 +elfDynamicFlag1Pie = 0x08000000 + +isX86_64ElfTarget :: String -> Bool +isX86_64ElfTarget target = + let normalizedTarget = map toLower target + in isX86_64Target normalizedTarget && not (isKnownNonElfTarget normalizedTarget) + where + isX86_64Target normalizedTarget = + "x86_64" `isPrefixOf` normalizedTarget || "amd64" `isPrefixOf` normalizedTarget + + isKnownNonElfTarget normalizedTarget = + any (`isInfixOf` normalizedTarget) + [ "apple" + , "cygwin" + , "darwin" + , "mingw" + , "msvc" + , "windows" + ] + +withProbeFile :: IO FilePath -> String -> (FilePath -> Handle -> IO a) -> IO a +withProbeFile getWorkingDirectory prefix action = do + workingDir <- getWorkingDirectory + (path, handle) <- openTempFile workingDir prefix + finally + (action path handle) + ( ignoreIOException (hClose handle) + *> ignoreIOException (removeFile path) + ) + +makeProbeMarker :: FilePath -> FilePath -> String +makeProbeMarker asmPath objPath = + "htcc-probe-marker:" <> takeFileName asmPath <> ":" <> takeFileName objPath + +assemblerProbeAsm :: String -> String +assemblerProbeAsm probeMarker = unlines + [ ".intel_syntax noprefix" + , ".section .rodata" + , "htcc_test_probe_marker:" + , " .ascii " <> show probeMarker + , " .byte 0" + , ".text" + , ".globl main" + , "main:" + , " lea rdx, [rip + htcc_test_probe_marker]" + , " xor eax, eax" + , " ret" + ] + +ignoreIOException :: IO () -> IO () +ignoreIOException action = + catchIOError action $ const (pure ()) diff --git a/test/Tests/ComponentsTests.hs b/test/Tests/ComponentsTests.hs new file mode 100644 index 0000000..13524bd --- /dev/null +++ b/test/Tests/ComponentsTests.hs @@ -0,0 +1,17 @@ +module Tests.ComponentsTests ( + exec +) where + +import Tests.Utils hiding (exec) +-- import Test.HUnit (Test (..)) +import qualified Tests.ComponentsTests.AsmOutput as AsmOutput +import qualified Tests.ComponentsTests.CommandSelection as CommandSelection +import Tests.ComponentsTests.Parser.Combinators as PC + +exec :: IO () +exec = runTestsSequential $ + TestList [ + AsmOutput.test + , CommandSelection.test + , PC.test + ] diff --git a/test/Tests/ComponentsTests/AsmOutput.hs b/test/Tests/ComponentsTests/AsmOutput.hs new file mode 100644 index 0000000..e3e624f --- /dev/null +++ b/test/Tests/ComponentsTests/AsmOutput.hs @@ -0,0 +1,3444 @@ +{-# LANGUAGE OverloadedStrings #-} +module Tests.ComponentsTests.AsmOutput ( + test +) where + +import Control.Exception (IOException, + finally, try) +import Control.Monad (when) +import qualified Data.ByteString as B +import qualified Data.ByteString.Char8 as BC +import Data.Either (isLeft) +import Data.IORef (modifyIORef', + newIORef, + readIORef) +import qualified Data.Map.Strict as Map +import qualified Data.Text as T +import qualified Data.Text.IO as T +import Data.Void (Void) +import Htcc.Asm (casm', + normalizeAsmInput, + prepareAsmInput, + prepareVisualizableInput) +import qualified Htcc.Asm.Intrinsic.Structure.Internal as SI +import qualified Htcc.CRules.Types as CT +import qualified Htcc.MegaparsecCompat as M +import Htcc.Output (ReplacementOutputMode (..), + creationMaskedOutputMode, + replaceExistingOutputFromPathWith, + withReplacementOutputPath, + withReplacementOutputPathAndResolvedPath) +import Htcc.Parser (ASTs, ATKind (..), + ATKindFor (..), + ATree (..)) +import Htcc.Parser.Combinators (parser, runParser) +import Htcc.Parser.ConstructionData.Core (Warnings) +import qualified Htcc.Parser.ConstructionData.Scope.Function as PF +import Htcc.Parser.ConstructionData.Scope.Var (GVar (..), + GVarInitWith (..), + GlobalVars, + Literals, + materializeTentativeIncompleteArray) +import Htcc.Visualizer (mkWidth, + visualize, + writeVisualization) +import Htcc.WarningSuppression (CompilerWarningFilterDecision (..), + dropCompilerWarningOutput, + emptyIncrementalCompilerWarningFilter, + feedIncrementalCompilerWarningFilter, + finalizeIncrementalCompilerWarningFilter) +import qualified Htcc.WarningSuppression as WS +import System.Directory (createDirectory, + doesDirectoryExist, + getTemporaryDirectory, + removeDirectory, + removeDirectoryRecursive, + removeFile) +import System.FilePath (takeDirectory, + ()) +import System.IO (IOMode (ReadMode, WriteMode), + hClose, + openTempFile, + withBinaryFile) +import System.IO.Error (catchIOError, + mkIOError, + permissionErrorType) +import System.Posix.Files (createLink, + createSymbolicLink, + fileMode, + getFileStatus, + intersectFileModes, + otherWriteMode, + ownerExecuteMode, + ownerReadMode, + ownerWriteMode, + setFileCreationMask, + setFileMode, + setGroupIDMode, + unionFileModes) +import System.Posix.IO (closeFd, + createFile) +import System.Posix.Temp (mkdtemp) +import System.Posix.Types (FileMode) +import Test.HUnit (Test (..), + assertBool, + assertEqual, + assertFailure) + +parseAsmSource :: T.Text -> IO (ASTs Integer, GlobalVars Integer, Literals Integer, PF.Functions Integer) +parseAsmSource source = + case runParser parser "" source + :: Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs Integer, GlobalVars Integer, Literals Integer, PF.Functions Integer) of + Left err -> + assertFailure (M.errorBundlePretty err) + Right (_, asts, gvars, lits, funcs) -> + pure (asts, gvars, lits, funcs) + +renderAsm :: T.Text -> IO T.Text +renderAsm = renderAsmWith id + +renderAsmWith :: (GlobalVars Integer -> GlobalVars Integer) -> T.Text -> IO T.Text +renderAsmWith transformGVars source = do + tmpDir <- getTemporaryDirectory + (path, h) <- openTempFile tmpDir "htcc-components-asm.s" + flip finally (ignoreIOException (hClose h) >> ignoreIOException (removeFile path)) $ do + (asts, gvars, lits, funcs) <- parseAsmSource source + SI.runAsmWithHandle h (casm' asts (transformGVars gvars) lits funcs :: SI.Asm SI.AsmCodeCtx Integer ()) + hClose h + T.readFile path + where + ignoreIOException = flip catchIOError $ const $ pure () + +renderVisualization :: T.Text -> IO T.Text +renderVisualization source = do + tmpDir <- getTemporaryDirectory + (path, h) <- openTempFile tmpDir "htcc-components-visualizer.svg" + flip finally (ignoreIOException (hClose h) >> ignoreIOException (removeFile path)) $ do + hClose h + (asts, _, _, _) <- parseAsmSource source + visualize asts (mkWidth 200) path + T.readFile path + where + ignoreIOException = flip catchIOError $ const $ pure () + +renderVisualizationFromAsts :: ASTs Integer -> IO T.Text +renderVisualizationFromAsts asts = do + tmpDir <- getTemporaryDirectory + (path, h) <- openTempFile tmpDir "htcc-components-visualizer-ast.svg" + flip finally (ignoreIOException (hClose h) >> ignoreIOException (removeFile path)) $ do + hClose h + visualize asts (mkWidth 200) path + T.readFile path + where + ignoreIOException = flip catchIOError $ const $ pure () + +withVisualizerSymlinkPaths :: String -> String -> (FilePath -> FilePath -> IO a) -> IO a +withVisualizerSymlinkPaths targetName aliasName action = do + tmpDir <- getTemporaryDirectory + probeDir <- mkdtemp (tmpDir "htcc-components-visualizer-symlinkXXXXXX") + let targetPath = probeDir targetName + aliasPath = probeDir aliasName + cleanup = + ignoreIOException (removeFile aliasPath) + >> ignoreIOException (removeFile targetPath) + >> ignoreIOException (removeDirectory probeDir) + flip finally cleanup $ + action targetPath aliasPath + where + ignoreIOException = flip catchIOError $ const $ pure () +assertContains :: String -> [T.Text] -> T.Text -> IO () +assertContains label needles haystack = + assertBool label $ all (`T.isInfixOf` haystack) needles + +assertContainsInOrder :: String -> [T.Text] -> T.Text -> IO () +assertContainsInOrder label needles haystack = + assertBool label $ go needles (T.lines haystack) + where + go [] _ = True + go _ [] = False + go remaining@(needle:rest) (line:lines') + | needle `T.isInfixOf` line = go rest lines' + | otherwise = go remaining lines' + +assertOccursBefore :: String -> T.Text -> T.Text -> T.Text -> IO () +assertOccursBefore label first second haystack = + assertBool label $ + case (T.breakOn first haystack, T.breakOn second haystack) of + ((_, restFirst), (_, restSecond)) + | T.null restFirst || T.null restSecond -> False + | otherwise -> T.length restFirst > T.length restSecond + +stringLiteralByteEncodingTest :: Test +stringLiteralByteEncodingTest = TestLabel "Asm.Output.string-literal-byte-encoding" $ TestCase $ do + asm <- renderAsm "int main(void) { return \"\\xff\"[1]; }" + assertContains + "string hex escapes should emit C bytes, not UTF-8 bytes" + [ ".byte 255, 0" + ] + asm + assertBool + "string hex escapes should not be UTF-8 encoded" + (not $ ".byte 195, 191, 0" `T.isInfixOf` asm) + utf8Asm <- renderAsm "int main(void) { return \"é\"[0]; }" + assertContains + "raw non-ASCII string characters should keep UTF-8 source bytes" + [ ".byte 195, 169, 0" + ] + utf8Asm + +replacementFailurePreservesWriteOnlyOutputTest :: ReplacementOutputMode -> T.Text -> FileMode -> Test +replacementFailurePreservesWriteOnlyOutputTest modeStrategy label targetMode = TestLabel (T.unpack label) $ TestCase $ do + tmpDir <- getTemporaryDirectory + (targetPath, targetHandle) <- openTempFile tmpDir "htcc-output-target" + (stagedPath, stagedHandle) <- openTempFile tmpDir "htcc-output-staged" + let cleanup = + ignoreIOException (hClose targetHandle) + >> ignoreIOException (hClose stagedHandle) + >> ignoreIOException (removeFile targetPath) + >> ignoreIOException (removeFile stagedPath) + failDuringReplacement src dst + | src == stagedPath = + withBinaryFile src ReadMode $ \srcHandle -> + withBinaryFile dst WriteMode $ \dstHandle -> do + chunk <- B.hGetSome srcHandle 4 + B.hPut dstHandle chunk + ioError $ userError "simulated replacement failure" + | otherwise = + assertFailure "unexpected replacement source path" + flip finally cleanup $ do + hClose targetHandle + hClose stagedHandle + T.writeFile targetPath staleOutput + setFileMode targetPath targetMode + T.writeFile stagedPath replacementOutput + stagedMode <- fileMode <$> getFileStatus stagedPath + result <- try + (replaceExistingOutputFromPathWith failDuringReplacement modeStrategy targetPath targetMode stagedMode stagedPath) + :: IO (Either IOException ()) + case result of + Left _ -> pure () + Right _ -> assertFailure "replacement should fail after partially overwriting the target" + restoredMode <- fileMode <$> getFileStatus targetPath + setFileMode targetPath $ restoredMode `unionFileModes` ownerReadMode + restoredOutput <- T.readFile targetPath + assertEqual "restored file mode" targetMode $ intersectFileModes restoredMode ownerPermissionMask + assertEqual "restored file contents" staleOutput restoredOutput + where + ownerPermissionMask = + ownerReadMode `unionFileModes` ownerWriteMode `unionFileModes` ownerExecuteMode + staleOutput = "stale output\n" + replacementOutput = "replacement output\n" + ignoreIOException = flip catchIOError $ const $ pure () + +extractFunctionSection :: T.Text -> T.Text -> T.Text +extractFunctionSection name asm = + fst $ + T.breakOn "\n.global " $ + snd $ + T.breakOn ("\n" <> name <> ":\n") ("\n" <> asm) + +returnLabelTest :: Test +returnLabelTest = TestLabel "Asm.Output.return" $ TestCase $ do + asm <- renderAsm "int main() { return 0; }" + assertContains + "return labels are emitted into the requested handle" + [ ".intel_syntax noprefix" + , "jmp .L.return.main" + , ".L.return.main:" + ] + asm + +controlFlowLabelTest :: Test +controlFlowLabelTest = TestLabel "Asm.Output.control-flow" $ TestCase $ do + asm <- renderAsm "int main() { int x; x = 0; while (x < 2) { if (x == 1) goto done; x = x + 1; continue; } done: switch (x) { case 1: return 0; default: return 1; } }" + assertContains + "control-flow labels and references stay on the requested handle" + [ ".L.continue.main." + , ".L.break.main." + , ".L.label.main.done:" + , "jmp .L.label.main.done" + , ".L.case.main." + , "je .L.case.main." + ] + asm + +callArgumentGotoLabelTest :: Test +callArgumentGotoLabelTest = TestLabel "Asm.Output.call-argument-goto-label" $ TestCase $ do + asm <- renderAsm "int f(int, int, char*); int main(void) { return f(3, ({ int i = 0; goto a; a: ++i; b: ++i; c: ++i; i; }), \"x\"); }" + assertEqual + "call argument code should not duplicate goto label definitions" + [1, 1, 1] + (map (\ident -> T.count (".L.label.main." <> ident <> ":") asm) ["a", "b", "c"]) + +globalInitializerCastTest :: Test +globalInitializerCastTest = TestLabel "Asm.Output.global-initializer-cast" $ TestCase $ do + asm <- renderAsm "int g = (char)0x1234; int h = (char)0xff; int main() { return g == 52 && h == -1; }" + assertContains + "global initializer casts are folded before emitting data bytes" + [ "g:" + , ".4byte 52" + , "h:" + , ".4byte -1" + ] + asm + assertBool + "global initializer should not retain the uncast value" + (not $ any (`T.isInfixOf` asm) [".4byte 4660", ".4byte 255"]) + +globalInitializerNullPointerCastTest :: Test +globalInitializerNullPointerCastTest = TestLabel "Asm.Output.global-initializer-null-pointer-cast" $ TestCase $ do + asm <- renderAsm "char *p = (char*)0; int main(void) { return p == 0; }" + assertContains + "file-scope null pointer casts are emitted as zero-initialized pointer storage" + [ "p:" + , ".zero 8" + ] + asm + +globalInitializerNestedNullPointerCastTest :: Test +globalInitializerNestedNullPointerCastTest = TestLabel "Asm.Output.global-initializer-nested-null-pointer-cast" $ TestCase $ do + asm <- renderAsm "int *p = (int*)(void*)0; int main(void) { return p == 0; }" + assertContains + "nested file-scope null pointer casts are emitted as zero-initialized pointer storage" + [ "p:" + , ".zero 8" + ] + asm + +globalInitializerFunctionNullPointerCastTest :: Test +globalInitializerFunctionNullPointerCastTest = TestLabel "Asm.Output.global-initializer-function-null-pointer-cast" $ TestCase $ do + asm <- renderAsm "int (*fp)(void) = (int (*)(void))0; int main(void) { return fp == 0; }" + assertContains + "file-scope function-pointer null casts are emitted as zero-initialized pointer storage" + [ "fp:" + , ".zero 8" + ] + asm + +globalInitializerWideCastTruncationTest :: Test +globalInitializerWideCastTruncationTest = TestLabel "Asm.Output.global-initializer-wide-cast-truncation" $ TestCase $ do + asm <- renderAsm "long g = (long)0x10000000000000000; long h = (long)0x10000000000000001; char *p = (char*)0x10000000000000000; int main(void) { return g == 0 && h == 1 && p == 0; }" + assertBool + "8-byte global initializer casts are truncated before data emission" + (all (`T.isInfixOf` asm) + [ "g:\n\t.zero 8" + , "h:\n\t.8byte 1" + , "p:\n\t.zero 8" + ] + ) + assertBool + "8-byte global initializer casts should not leak untruncated literals" + (not $ any (`T.isInfixOf` asm) + [ ".8byte 18446744073709551616" + , ".8byte 18446744073709551617" + ] + ) + +tentativeIncompleteArrayTest :: Test +tentativeIncompleteArrayTest = TestLabel "Asm.Output.tentative-incomplete-array" $ TestCase $ do + asm <- renderAsm "int x[]; int main() { x[0] = 1; return x[0]; }" + assertContains + "tentative incomplete arrays are materialized as one element before data emission" + [ "x:" + , ".zero 4" + ] + asm + +tentativeIncompleteArrayDecayRetypeFallbackTest :: Test +tentativeIncompleteArrayDecayRetypeFallbackTest = TestLabel "Asm.Output.tentative-incomplete-array-decay-retype-fallback" $ TestCase $ do + asm <- renderAsm "int x[]; int *f(void) { return x; } int main(void) { return 0; }" + let fSection = extractFunctionSection "f" asm + assertContainsInOrder + "tentative arrays without a later completing declaration are retyped for decay-only codegen after fallback materialization" + [ "f:" + , "push offset x" + , "pop rax" + , "jmp .L.return.f" + ] + fSection + assertBool + "fallback-sized tentative-array decay sites should not load the first element as a scalar" + (not $ "movsxd rax, dword ptr [rax]" `T.isInfixOf` fSection) + +tentativeIncompleteArraySizeofFallbackTest :: Test +tentativeIncompleteArraySizeofFallbackTest = TestLabel "Asm.Output.tentative-incomplete-array-sizeof-fallback" $ TestCase $ do + let incompleteTy :: CT.StorageClass Integer + incompleteTy = CT.SCAuto $ CT.CTIncomplete (CT.IncompleteArray CT.CTInt) + materializedTy :: CT.StorageClass Integer + materializedTy = CT.SCAuto $ CT.CTArray 1 CT.CTInt + gvars :: GlobalVars Integer + gvars = Map.fromList [("x", GVar incompleteTy GVarInitWithZero 0)] + sizeofExpr = + ATNode + ATSizeof + (CT.SCAuto CT.CTInt) + (ATNode (ATGVar incompleteTy "x") incompleteTy ATEmpty ATEmpty) + ATEmpty + case prepareAsmInput Map.empty [sizeofExpr] gvars of + Left err -> + assertFailure err + Right ([ATNode ATSizeof _ (ATNode (ATGVar resolvedTy _) _ _ _) _], preparedGVars) -> do + assertEqual + "tentative incomplete arrays should materialize before revalidating sizeof" + materializedTy + resolvedTy + assertEqual + "prepareAsmInput should materialize tentative incomplete arrays in global storage too" + (Just materializedTy) + (gvtype <$> Map.lookup "x" preparedGVars) + Right _ -> + assertFailure "internal test error: prepareAsmInput returned an unexpected AST shape" + +staticTentativeIncompleteArrayUseSiteRejectedTest :: Test +staticTentativeIncompleteArrayUseSiteRejectedTest = TestLabel "Asm.Output.static-tentative-incomplete-array-use-site-rejected" $ TestCase $ + assertBool + "static tentative incomplete arrays should remain incomplete at expression use sites" + (isLeft + (runParser parser "" + "static int x[]; int main(void) { return sizeof x / sizeof x[0]; }" + :: Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs Integer, GlobalVars Integer, Literals Integer, PF.Functions Integer) + ) + ) + +nestedTentativeIncompleteArrayUseSiteRejectedTest :: Test +nestedTentativeIncompleteArrayUseSiteRejectedTest = TestLabel "Asm.Output.nested-tentative-incomplete-array-use-site-rejected" $ TestCase $ + assertBool + "address arithmetic on tentative incomplete arrays should be rejected before data emission" + (isLeft + (runParser parser "" + "int x[][4]; int main(void) { return ((char*)(&x + 1)) - ((char*)&x); }" + :: Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs Integer, GlobalVars Integer, Literals Integer, PF.Functions Integer) + ) + ) + +tentativeNestedArrayExtentInferenceTest :: Test +tentativeNestedArrayExtentInferenceTest = TestLabel "Asm.Output.tentative-nested-array-extent-inference" $ TestCase $ do + asm <- renderAsm "int x[][4]; int x[2][4]; int main(void) { return ((char*)(&x + 1)) - ((char*)&x); }" + assertContains + "same-translation-unit tentative nested arrays infer the missing outer extent from a later redeclaration" + [ "x:" + , ".zero 32" + , "imul rdi, 32" + ] + asm + +tentativeArrayEarlierFunctionDecayRetypeTest :: Test +tentativeArrayEarlierFunctionDecayRetypeTest = TestLabel "Asm.Output.tentative-array-earlier-function-decay-retype" $ TestCase $ do + asm <- renderAsm "int x[]; int *f(void) { return x; } int x[4]; int main(void) { return 0; }" + let fSection = extractFunctionSection "f" asm + assertContainsInOrder + "same-translation-unit tentative arrays retype earlier decay-only uses before codegen" + [ "f:" + , "push offset x" + , "pop rax" + , "jmp .L.return.f" + ] + fSection + assertBool + "retyped tentative-array decay sites should not load the first element as a scalar" + (not $ "movsxd rax, dword ptr [rax]" `T.isInfixOf` fSection) + +tentativeNestedArrayOuterExtentMergeUnitTest :: Test +tentativeNestedArrayOuterExtentMergeUnitTest = TestLabel "Asm.Output.tentative-nested-array-merge-unit" $ TestCase $ do + let incompleteTy :: CT.TypeKind Integer + incompleteTy = CT.CTIncomplete (CT.IncompleteArray (CT.CTArray 4 CT.CTInt)) + completeTy :: CT.TypeKind Integer + completeTy = CT.CTArray 2 (CT.CTArray 4 CT.CTInt) + assertEqual + "tentative nested-array merging should infer only the missing outer extent" + (Just completeTy) + (CT.mergeTentativeArrayTypeKinds incompleteTy completeTy) + assertEqual + "tentative nested-array merging should be symmetric for the inferred outer extent" + (Just completeTy) + (CT.mergeTentativeArrayTypeKinds completeTy incompleteTy) + +compatiblePointerArrayExtentCompletionMergeUnitTest :: Test +compatiblePointerArrayExtentCompletionMergeUnitTest = TestLabel "Asm.Output.compatible-pointer-array-extent-completion-merge-unit" $ TestCase $ do + let incompletePtrTy :: CT.TypeKind Integer + incompletePtrTy = CT.CTPtr $ CT.CTIncomplete (CT.IncompleteArray CT.CTInt) + completePtrTy :: CT.TypeKind Integer + completePtrTy = CT.CTPtr $ CT.CTArray 4 CT.CTInt + assertEqual + "pointer-compatible type merging should reject inferring an omitted array bound through a pointee" + Nothing + (CT.mergeCompatibleTypeKinds incompletePtrTy completePtrTy) + assertEqual + "pointer-compatible type merging should reject pointee-bound inference symmetrically" + Nothing + (CT.mergeCompatibleTypeKinds completePtrTy incompletePtrTy) + +incompatiblePointerArrayExtentConflictMergeUnitTest :: Test +incompatiblePointerArrayExtentConflictMergeUnitTest = TestLabel "Asm.Output.incompatible-pointer-array-extent-conflict-merge-unit" $ TestCase $ do + let lhsPtrTy :: CT.TypeKind Integer + lhsPtrTy = CT.CTPtr $ CT.CTArray 3 CT.CTInt + rhsPtrTy :: CT.TypeKind Integer + rhsPtrTy = CT.CTPtr $ CT.CTArray 4 CT.CTInt + assertEqual + "pointer-compatible type merging should reject conflicting pointee array bounds" + Nothing + (CT.mergeCompatibleTypeKinds lhsPtrTy rhsPtrTy) + assertEqual + "pointer-compatible type merging should reject conflicting pointee array bounds symmetrically" + Nothing + (CT.mergeCompatibleTypeKinds rhsPtrTy lhsPtrTy) + +compatibleFunctionParamRefinementMergeUnitTest :: Test +compatibleFunctionParamRefinementMergeUnitTest = TestLabel "Asm.Output.compatible-function-param-refinement-merge-unit" $ TestCase $ do + let incompleteFnTy :: CT.TypeKind Integer + incompleteFnTy = + CT.CTFunc + CT.CTInt + [ (CT.CTPtr $ CT.CTIncomplete $ CT.IncompleteArray CT.CTInt, Nothing) + ] + refinedFnTy :: CT.TypeKind Integer + refinedFnTy = + CT.CTFunc + CT.CTInt + [ (CT.CTPtr $ CT.CTArray 4 CT.CTInt, Nothing) + ] + assertEqual + "function-compatible type merging should reject omitted pointee bounds completed through pointer-to-array parameters" + Nothing + (CT.mergeCompatibleTypeKinds incompleteFnTy refinedFnTy) + assertEqual + "function-compatible type merging should reject pointer-to-array parameter bound inference symmetrically" + Nothing + (CT.mergeCompatibleTypeKinds refinedFnTy incompleteFnTy) + +incompatibleFunctionParamArrayExtentConflictMergeUnitTest :: Test +incompatibleFunctionParamArrayExtentConflictMergeUnitTest = TestLabel "Asm.Output.incompatible-function-param-array-extent-conflict-merge-unit" $ TestCase $ do + let lhsFnTy :: CT.TypeKind Integer + lhsFnTy = + CT.CTFunc + CT.CTInt + [ (CT.CTPtr $ CT.CTArray 3 CT.CTInt, Nothing) + ] + rhsFnTy :: CT.TypeKind Integer + rhsFnTy = + CT.CTFunc + CT.CTInt + [ (CT.CTPtr $ CT.CTArray 4 CT.CTInt, Nothing) + ] + assertEqual + "function-compatible type merging should reject conflicting pointer-to-array parameter bounds" + Nothing + (CT.mergeCompatibleTypeKinds lhsFnTy rhsFnTy) + assertEqual + "function-compatible type merging should reject conflicting pointer-to-array parameter bounds symmetrically" + Nothing + (CT.mergeCompatibleTypeKinds rhsFnTy lhsFnTy) + +compatibleTaggedStructCompletionMergeUnitTest :: Test +compatibleTaggedStructCompletionMergeUnitTest = TestLabel "Asm.Output.compatible-tagged-struct-completion-merge-unit" $ TestCase $ do + let members :: Map.Map T.Text (CT.StructMember Integer) + members = Map.fromList + [ ("value", CT.StructMember CT.CTInt 0) + ] + incompletePtrTy :: CT.TypeKind Integer + incompletePtrTy = CT.CTPtr $ CT.CTIncomplete $ CT.IncompleteStruct "Foo" (CT.ScopeId 0) + completePtrTy :: CT.TypeKind Integer + completePtrTy = CT.CTPtr $ CT.CTNamedStruct "Foo" (CT.ScopeId 0) members + assertEqual + "pointer-compatible type merging should accept completion of a tagged opaque struct declaration" + (Just completePtrTy) + (CT.mergeCompatibleTypeKinds incompletePtrTy completePtrTy) + assertEqual + "pointer-compatible type merging should stay symmetric when the tagged struct definition appears first" + (Just completePtrTy) + (CT.mergeCompatibleTypeKinds completePtrTy incompletePtrTy) + +incompatibleTaggedStructAliasMergeUnitTest :: Test +incompatibleTaggedStructAliasMergeUnitTest = TestLabel "Asm.Output.incompatible-tagged-struct-alias-merge-unit" $ TestCase $ do + let members :: Map.Map T.Text (CT.StructMember Integer) + members = Map.fromList + [ ("value", CT.StructMember CT.CTInt 0) + ] + fooPtrTy :: CT.TypeKind Integer + fooPtrTy = CT.CTPtr $ CT.CTNamedStruct "Foo" (CT.ScopeId 0) members + barPtrTy :: CT.TypeKind Integer + barPtrTy = CT.CTPtr $ CT.CTNamedStruct "Bar" (CT.ScopeId 0) members + assertEqual + "pointer-compatible type merging should reject tagged structs that only match structurally" + Nothing + (CT.mergeCompatibleTypeKinds fooPtrTy barPtrTy) + assertEqual + "pointer-compatible type merging should reject structurally identical tagged structs symmetrically" + Nothing + (CT.mergeCompatibleTypeKinds barPtrTy fooPtrTy) + +compatibleAnonymousStructMergeUnitTest :: Test +compatibleAnonymousStructMergeUnitTest = TestLabel "Asm.Output.compatible-anonymous-struct-merge-unit" $ TestCase $ do + let lhsMembers :: Map.Map T.Text (CT.StructMember Integer) + lhsMembers = Map.fromList + [ ("value", CT.StructMember CT.CTInt 0) + ] + rhsMembers :: Map.Map T.Text (CT.StructMember Integer) + rhsMembers = Map.fromList + [ ("value", CT.StructMember CT.CTInt 0) + ] + lhsPtrTy :: CT.TypeKind Integer + lhsPtrTy = CT.CTPtr $ CT.CTStruct lhsMembers + rhsPtrTy :: CT.TypeKind Integer + rhsPtrTy = CT.CTPtr $ CT.CTStruct rhsMembers + assertEqual + "pointer-compatible type merging should accept anonymous structs that match structurally" + (Just rhsPtrTy) + (CT.mergeCompatibleTypeKinds lhsPtrTy rhsPtrTy) + assertEqual + "pointer-compatible type merging should accept structurally identical anonymous structs symmetrically" + (Just rhsPtrTy) + (CT.mergeCompatibleTypeKinds rhsPtrTy lhsPtrTy) + +compatibleNamedStructAnonymousMemberMergeUnitTest :: Test +compatibleNamedStructAnonymousMemberMergeUnitTest = TestLabel "Asm.Output.compatible-named-struct-anonymous-member-merge-unit" $ TestCase $ do + let lhsAnonMembers :: Map.Map T.Text (CT.StructMember Integer) + lhsAnonMembers = Map.fromList + [ ("value", CT.StructMember CT.CTInt 0) + ] + rhsAnonMembers :: Map.Map T.Text (CT.StructMember Integer) + rhsAnonMembers = Map.fromList + [ ("value", CT.StructMember CT.CTInt 0) + ] + lhsOuterMembers :: Map.Map T.Text (CT.StructMember Integer) + lhsOuterMembers = Map.fromList + [ ("anon", CT.StructMember (CT.CTStruct lhsAnonMembers) 0) + ] + rhsOuterMembers :: Map.Map T.Text (CT.StructMember Integer) + rhsOuterMembers = Map.fromList + [ ("anon", CT.StructMember (CT.CTStruct rhsAnonMembers) 0) + ] + lhsTy :: CT.TypeKind Integer + lhsTy = CT.CTNamedStruct "Outer" (CT.ScopeId 0) lhsOuterMembers + rhsTy :: CT.TypeKind Integer + rhsTy = CT.CTNamedStruct "Outer" (CT.ScopeId 0) rhsOuterMembers + assertEqual + "struct-compatible type merging should accept named structs whose anonymous member structs match structurally" + (Just rhsTy) + (CT.mergeCompatibleTypeKinds lhsTy rhsTy) + assertEqual + "struct-compatible type merging should accept nested anonymous member structs symmetrically" + (Just rhsTy) + (CT.mergeCompatibleTypeKinds rhsTy lhsTy) + +tentativeNestedArrayMaterializationUnitTest :: Test +tentativeNestedArrayMaterializationUnitTest = TestLabel "Asm.Output.tentative-nested-array-materialization-unit" $ TestCase $ do + let gvar :: GVar Integer + gvar = + GVar + { gvtype = CT.SCAuto $ CT.CTIncomplete (CT.IncompleteArray (CT.CTArray 4 CT.CTInt)) + , initWith = GVarInitWithZero + , gvNestDepth = 0 + } + expectedTy :: CT.StorageClass Integer + expectedTy = CT.SCAuto $ CT.CTArray 4 (CT.CTArray 1 CT.CTInt) + assertEqual + "tentative nested-array globals should materialize one outer row before codegen" + expectedTy + (gvtype $ materializeTentativeIncompleteArray gvar) + +tentativeArrayUseSiteRejectedTest :: Test +tentativeArrayUseSiteRejectedTest = TestLabel "Asm.Output.tentative-array-use-site-rejected" $ TestCase $ + assertBool + "later global completions must not retroactively legitimize earlier sizeof uses" + (isLeft + (runParser parser "" + "int x[]; int main(void) { return sizeof x; } int x[4];" + :: Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs Integer, GlobalVars Integer, Literals Integer, PF.Functions Integer) + ) + ) + +tentativeArrayAddressUseSiteRejectedTest :: Test +tentativeArrayAddressUseSiteRejectedTest = TestLabel "Asm.Output.tentative-array-address-use-site-rejected" $ TestCase $ + assertBool + "later global completions must not retroactively legitimize earlier address arithmetic" + (isLeft + (runParser parser "" + "int x[]; int main(void) { return ((char*)(&x + 1)) - ((char*)&x); } int x[4];" + :: Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs Integer, GlobalVars Integer, Literals Integer, PF.Functions Integer) + ) + ) + +globalInitializerRelocAddendTest :: Test +globalInitializerRelocAddendTest = TestLabel "Asm.Output.global-initializer-reloc-addend" $ TestCase $ do + asm <- renderAsm "int a[4]; int *p = a + (1 + 1); char *end = (char*)(&a + 1); char *q = \"ab\" + 1; void *self = &self; int main() { return p != 0 && end != 0 && q[0] == 'b' && self != 0; }" + assertContains + "global pointer initializers retain compatible self-references and folded symbol addends in emitted relocations" + [ "a:" + , ".zero 16" + , "p:" + , ".quad a+8" + , "end:" + , ".quad a+16" + , "q:" + , ".quad .L.data.0+1" + , "self:" + , ".quad self" + ] + asm + +globalInitializerArraySubobjectRelocTest :: Test +globalInitializerArraySubobjectRelocTest = TestLabel "Asm.Output.global-initializer-array-subobject-reloc" $ TestCase $ do + asm <- renderAsm "int x[2][4]; int *row0 = x[0]; int *first = &x[0][0]; int *row1 = x[1]; int main(void) { return row0 == &x[0][0] && first == &x[0][0] && row1 == &x[1][0]; }" + assertContains + "global pointer initializers fold array subobjects reached through dereferences into relocations" + [ "x:" + , ".zero 32" + , "row0:" + , ".quad x" + , "first:" + , ".quad x" + , "row1:" + , ".quad x+16" + ] + asm + +globalInitializerTentativeNestedArrayFallbackRelocTest :: Test +globalInitializerTentativeNestedArrayFallbackRelocTest = TestLabel "Asm.Output.global-initializer-tentative-nested-array-fallback-reloc" $ TestCase $ do + asm <- renderAsm "int x[][4]; char *p = (char*)(x + 1); int main(void) { return p == ((char*)x) + 16; }" + assertContains + "tentative nested arrays should materialize before folding global initializer relocations" + [ "x:" + , ".zero 16" + , "p:" + , ".quad x+16" + ] + asm + +globalInitializerTentativeArrayUseSiteRejectedTest :: Test +globalInitializerTentativeArrayUseSiteRejectedTest = TestLabel "Asm.Output.global-initializer-tentative-array-use-site-rejected" $ TestCase $ + assertBool + "earlier global initializers must not be retyped from later tentative-array completions" + (isLeft + (runParser parser "" + "int x[]; int y = sizeof x; char *p = (char*)(&x + 1); int x[4]; int main(void) { return y == 16 && p == ((char*)&x) + 16; }" + :: Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs Integer, GlobalVars Integer, Literals Integer, PF.Functions Integer) + ) + ) + +globalInitializerSelfRetypeSizeofTest :: Test +globalInitializerSelfRetypeSizeofTest = TestLabel "Asm.Output.global-initializer-self-retype-sizeof" $ TestCase $ do + asm <- renderAsm "int x[]; int x[4] = { sizeof x }; int main(void) { return x[0] == 16; }" + assertContains + "completed array definitions retype self-referential sizeof expressions before folding the defining initializer" + [ "x:" + , ".4byte 16" + , ".zero 12" + ] + asm + +globalInitializerSelfRetypeAddressTest :: Test +globalInitializerSelfRetypeAddressTest = TestLabel "Asm.Output.global-initializer-self-retype-address" $ TestCase $ do + asm <- renderAsm "char *x[]; char *x[4] = { (char*)(&x + 1) }; int main(void) { return x[0] == ((char*)&x) + 32; }" + assertContains + "completed array definitions retype self-referential address arithmetic before folding the defining initializer" + [ "x:" + , ".quad x+32" + , ".zero 24" + ] + asm + +globalInitializerFunctionRelocTest :: Test +globalInitializerFunctionRelocTest = TestLabel "Asm.Output.global-initializer-function-reloc" $ TestCase $ do + asm <- renderAsm "int foo(void) { return 1; } int (*fp)(void) = foo; int (*fq)(void) = &foo; int main(void) { return fp != 0 && fq != 0; }" + assertContains + "file-scope function-pointer initializers emit relocations for bare and addressed function designators" + [ "foo:" + , "fp:" + , ".quad foo" + , "fq:" + , ".quad foo" + ] + asm + +bareFunctionDesignatorDerefCallTest :: Test +bareFunctionDesignatorDerefCallTest = TestLabel "Asm.Output.bare-function-designator-deref-call" $ TestCase $ do + asm <- renderAsm "int foo(void) { return 1; } int main(void) { return (*foo)(); }" + assertContains + "dereferencing a bare function designator remains callable" + [ "foo:" + , "main:" + , "call foo" + ] + asm + +functionPointerArrayZeroInitializerTest :: Test +functionPointerArrayZeroInitializerTest = TestLabel "Asm.Output.function-pointer-array-zero-initializer" $ TestCase $ do + asm <- renderAsm "int (*fps[2])(void) = { 0, 0 }; int main(void) { return fps[0] != 0 || fps[1] != 0; }" + assertContains + "brace initializers keep scanning all zeroed function-pointer array elements" + [ "fps:" + , ".zero 16" + ] + asm + +functionPointerArrayFunctionInitializerTest :: Test +functionPointerArrayFunctionInitializerTest = TestLabel "Asm.Output.function-pointer-array-function-initializer" $ TestCase $ do + asm <- renderAsm "int foo(void) { return 1; } int (*fps[2])(void) = { foo, &foo }; int main(void) { return fps[0]() != 1 || fps[1]() != 1; }" + assertContains + "brace initializers keep scanning all function-pointer array elements" + [ "foo:" + , "fps:" + ] + asm + assertEqual + "each function-pointer initializer should emit its own relocation" + 2 + (T.count ".quad foo" asm) + +omittedBoundArrayPointerDerefDecayTest :: Test +omittedBoundArrayPointerDerefDecayTest = TestLabel "Asm.Output.omitted-bound-array-pointer-deref-decay" $ TestCase $ do + asm <- renderAsm "int *f(int (*p)[]) { return *p; } int main(void) { int x[4]; int (*p)[] = (int (*)[])&x; return f(p) != x; }" + let fSection = extractFunctionSection "f" asm + assertEqual + "dereferencing a pointer-to-omitted-bound-array should keep the array lvalue and decay it without an extra scalar load" + 1 + (T.count "mov rax, [rax]" fSection) + +indirectFunctionPointerCallTest :: Test +indirectFunctionPointerCallTest = TestLabel "Asm.Output.indirect-function-pointer-call" $ TestCase $ do + asm <- renderAsm "int foo(void) { return 1; } int main(void) { int (*fp)(void) = foo; return fp(); }" + assertContains + "callable variables are lowered as indirect calls through the loaded function pointer" + [ "call r11" ] + asm + assertBool + "callable variables must not be emitted as implicit direct symbol calls" + (not $ "call \"fp\"" `T.isInfixOf` asm) + +indirectFunctionPointerCallAlignmentTest :: Test +indirectFunctionPointerCallAlignmentTest = TestLabel "Asm.Output.indirect-function-pointer-call-alignment" $ TestCase $ do + asm <- renderAsm "int foo(void) { return 1; } int main(void) { int (*fp)(void); fp = foo; return fp(); }" + assertContainsInOrder + "zero-argument indirect calls load the callee once and keep a padded fallback path" + [ "pop r11" + , "mov rax, rsp" + , "and rax, 15" + , "jnz .L.call." + , "mov rax, 0" + , "call r11" + , ".L.call." + , "sub rsp, 8" + , "mov rax, 0" + , "call r11" + , "add rsp, 8" + ] + asm + +directBoolFunctionCallNormalizationTest :: Test +directBoolFunctionCallNormalizationTest = TestLabel "Asm.Output.direct-bool-function-call-normalization" $ TestCase $ do + asm <- renderAsm "_Bool foo(void) { return 2; } int main(void) { return foo(); }" + assertContainsInOrder + "zero-argument direct _Bool calls normalize the ABI-defined low byte before pushing the result" + [ "call foo" + , "cmp al, 0" + , "setne al" + , "movzb rax, al" + , "push rax" + ] + asm + +indirectBoolFunctionPointerCallNormalizationTest :: Test +indirectBoolFunctionPointerCallNormalizationTest = TestLabel "Asm.Output.indirect-bool-function-pointer-call-normalization" $ TestCase $ do + asm <- renderAsm "_Bool foo(void) { return 2; } int main(void) { _Bool (*fp)(void); fp = foo; return fp(); }" + assertContainsInOrder + "zero-argument indirect _Bool calls normalize the ABI-defined low byte before pushing the result" + [ "call r11" + , "cmp al, 0" + , "setne al" + , "movzb rax, al" + , "push rax" + ] + asm + +directIntegralFunctionCallNormalizationTest :: Test +directIntegralFunctionCallNormalizationTest = TestLabel "Asm.Output.direct-integral-function-call-normalization" $ TestCase $ do + asm <- renderAsm "char ret_char(void) { return -1; } short ret_short(void) { return -1; } int ret_int(void) { return -1; } int main(void) { return ret_char() == -1 && ret_short() == -1 && ret_int() == -1; }" + let mainSection = extractFunctionSection "main" asm + assertContainsInOrder + "direct char/short/int calls truncate the ABI return register before the result is consumed" + [ "call ret_char" + , "movsx rax, al" + , "push rax" + , "call ret_short" + , "movsx rax, ax" + , "push rax" + , "call ret_int" + , "movsxd rax, eax" + , "push rax" + ] + mainSection + +blockScopeExternObjectShadowsOuterLocalAsmTest :: Test +blockScopeExternObjectShadowsOuterLocalAsmTest = TestLabel "Asm.Output.block-scope-extern-object-shadows-outer-local" $ TestCase $ do + asm <- renderAsm "int foo = 2; int main(void) { int foo = 1; { extern int foo; return foo; } }" + let mainSection = extractFunctionSection "main" asm + assertContains + "block-scope extern object references the global symbol instead of the outer local" + [ "push offset foo" + ] + mainSection + +blockScopeExternObjectShadowsEnumeratorAsmTest :: Test +blockScopeExternObjectShadowsEnumeratorAsmTest = TestLabel "Asm.Output.block-scope-extern-object-shadows-enumerator" $ TestCase $ do + asm <- renderAsm "enum E { A = 5 }; int f(void) { extern int A; return A; }" + let fSection = extractFunctionSection "f" asm + assertContains + "block-scope extern object resolves to a global symbol load instead of folding the enumerator" + [ "push offset A" + ] + fSection + assertBool + "block-scope extern object should not fold to the outer enumerator constant" + (not $ "push 5" `T.isInfixOf` fSection) + +blockScopeExternFunctionShadowsEnumeratorAsmTest :: Test +blockScopeExternFunctionShadowsEnumeratorAsmTest = TestLabel "Asm.Output.block-scope-extern-function-shadows-enumerator" $ TestCase $ do + asm <- renderAsm "enum E { foo = 1 }; int f(void) { extern int foo(void); return foo(); }" + let fSection = extractFunctionSection "f" asm + assertContains + "block-scope extern prototype resolves to a function call instead of the outer enumerator" + [ "call foo" + ] + fSection + +blockScopeOrdinaryFunctionPrototypeAsmTest :: Test +blockScopeOrdinaryFunctionPrototypeAsmTest = TestLabel "Asm.Output.block-scope-ordinary-function-prototype" $ TestCase $ do + asm <- renderAsm "int main(void) { int foo(void); return foo(); }" + let mainSection = extractFunctionSection "main" asm + assertContains + "block-scope ordinary function prototypes resolve to direct external calls" + [ "call foo" + ] + mainSection + assertBool + "block-scope function prototypes must not be lowered as indirect calls through local stack slots" + (not $ "call r11" `T.isInfixOf` mainSection) + +blockScopeExternStaticFunctionAsmTest :: Test +blockScopeExternStaticFunctionAsmTest = TestLabel "Asm.Output.block-scope-extern-static-function" $ TestCase $ do + asm <- renderAsm "static int foo(void) { return 3; } int main(void) { extern int foo(void); return foo(); }" + let mainSection = extractFunctionSection "main" asm + assertContains + "block-scope extern prototypes inherit visible static function linkage" + [ "call foo" + ] + mainSection + +blockScopeExternStaticObjectAsmTest :: Test +blockScopeExternStaticObjectAsmTest = TestLabel "Asm.Output.block-scope-extern-static-object" $ TestCase $ do + asm <- renderAsm "static int x = 4; int main(void) { extern int x; return x; }" + let mainSection = extractFunctionSection "main" asm + assertContains + "block-scope extern objects inherit visible static object linkage" + [ "push offset x" + ] + mainSection + +indirectIntegralFunctionPointerCallNormalizationTest :: Test +indirectIntegralFunctionPointerCallNormalizationTest = TestLabel "Asm.Output.indirect-integral-function-pointer-call-normalization" $ TestCase $ do + asm <- renderAsm "char ret_char(void) { return -1; } short ret_short(void) { return -1; } int ret_int(void) { return -1; } int main(void) { char (*char_fp)(void); short (*short_fp)(void); int (*int_fp)(void); char_fp = ret_char; short_fp = ret_short; int_fp = ret_int; return char_fp() == -1 && short_fp() == -1 && int_fp() == -1; }" + let mainSection = extractFunctionSection "main" asm + assertContainsInOrder + "indirect char/short/int calls truncate the ABI return register before the result is consumed" + [ "call r11" + , "movsx rax, al" + , "push rax" + , "call r11" + , "movsx rax, ax" + , "push rax" + , "call r11" + , "movsxd rax, eax" + , "push rax" + ] + mainSection + +boolFunctionReturnNormalizationTest :: Test +boolFunctionReturnNormalizationTest = TestLabel "Asm.Output.bool-function-return-normalization" $ TestCase $ do + asm <- renderAsm "_Bool foo(void) { return 256; } int main(void) { return foo(); }" + assertContainsInOrder + "bool function epilogues normalize the full return register at the shared return label" + [ ".L.return.foo:" + , "cmp rax, 0" + , "setne al" + , "movzb rax, al" + , "leave" + , "ret" + ] + asm + +directBoolFunctionArgNormalizationTest :: Test +directBoolFunctionArgNormalizationTest = TestLabel "Asm.Output.direct-bool-function-arg-normalization" $ TestCase $ do + asm <- renderAsm "int takes_bool(_Bool x) { return x; } int main(void) { return takes_bool(256); }" + let mainSection = extractFunctionSection "main" asm + assertContainsInOrder + "direct calls cast integer arguments to _Bool before materializing the argument register from the scratch slot" + [ "main:" + , "push 256" + , "cmp rax, 0" + , "setne al" + , "movzb rax, al" + , "mov [rbx+0], rdx" + , "mov rdi, [rax+0]" + , "call takes_bool" + ] + mainSection + +directOldStyleBoolFunctionArgPromotionTest :: Test +directOldStyleBoolFunctionArgPromotionTest = TestLabel "Asm.Output.direct-old-style-bool-function-arg-promotion" $ TestCase $ do + asm <- renderAsm "int takes_bool(); int main(void) { return takes_bool(256); }" + let mainSection = extractFunctionSection "main" asm + assertContainsInOrder + "old-style direct calls pass the promoted integer argument through the scratch slot without _Bool normalization" + [ "main:" + , "push 256" + , "mov [rbx+0], rdx" + , "mov rdi, [rax+0]" + , "call takes_bool" + ] + mainSection + assertBool + "old-style direct calls must not normalize the argument to _Bool at the call site" + (not $ any (`T.isInfixOf` mainSection) ["cmp rax, 0", "setne al", "movzb rax, al"]) + +indirectBoolFunctionPointerArgNormalizationTest :: Test +indirectBoolFunctionPointerArgNormalizationTest = TestLabel "Asm.Output.indirect-bool-function-pointer-arg-normalization" $ TestCase $ do + asm <- renderAsm "int takes_bool(_Bool x) { return x; } int main(void) { int (*fp)(_Bool); fp = takes_bool; return fp(256); }" + let mainSection = extractFunctionSection "main" asm + assertContainsInOrder + "indirect calls cast integer arguments to _Bool before materializing the argument and callee from scratch slots" + [ "main:" + , "push 256" + , "cmp rax, 0" + , "setne al" + , "movzb rax, al" + , "mov [rbx+0], rdx" + , "mov rdi, [rax+0]" + , "mov r11, [rax+8]" + , "mov rax, 0" + , "call r11" + ] + mainSection + +indirectOldStyleBoolFunctionPointerPromotionConflictTest :: Test +indirectOldStyleBoolFunctionPointerPromotionConflictTest = TestLabel "Asm.Output.indirect-old-style-bool-function-pointer-promotion-conflict" $ TestCase $ do + assertBool + "old-style function pointers must reject _Bool parameters that only match after default promotions" + (isLeft + ( runParser parser "" + "int takes_bool(_Bool x) { return x; } int main(void) { int (*fp)(); fp = takes_bool; return fp(256); }" + :: Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs Integer, GlobalVars Integer, Literals Integer, PF.Functions Integer) + ) + ) + +indirectFunctionPointerArgAlignmentTest :: Test +indirectFunctionPointerArgAlignmentTest = TestLabel "Asm.Output.indirect-function-pointer-arg-alignment" $ TestCase $ do + asm <- renderAsm "int inc(int x) { return x + 1; } int main(void) { int (*fp)(int); fp = inc; return fp(41); }" + assertContainsInOrder + "indirect calls with register arguments probe rsp alignment before restoring the argument and callee from scratch slots" + [ "mov rax, rsp" + , "and rax, 15" + , "jnz .L.call." + , "mov r11, [rax+8]" + , "mov rdi, [rax+0]" + , "mov rax, 0" + , "call r11" + ] + asm + +directFunctionStackArgAlignmentTest :: Test +directFunctionStackArgAlignmentTest = TestLabel "Asm.Output.direct-function-stack-arg-alignment" $ TestCase $ do + asm <- renderAsm "long sum7(long a, long b, long c, long d, long e, long f, long g) { return a + b + c + d + e + f + g; } int main(void) { return sum7(1, 2, 3, 4, 5, 6, 7) - 28; }" + assertContainsInOrder + "stack-passed direct calls probe the final call alignment before preparing args and keep any padding ahead of outgoing stack arguments" + [ "mov rax, rsp" + , "sub rax, 8" + , "and rax, 15" + , "jnz .L.call." + , "mov rdx, [rax+48]" + , "push rdx" + , "mov rdx, [rax+16]" + , "call sum7" + , ".L.call." + , "sub rsp, 8" + , "mov rdx, [rax+48]" + , "push rdx" + , "mov rdx, [rax+16]" + , "call sum7" + , "add rsp, 8" + , "add rsp, 8" + ] + asm + +directFunctionMultipleStackArgRestoreTest :: Test +directFunctionMultipleStackArgRestoreTest = TestLabel "Asm.Output.direct-function-multiple-stack-arg-restore" $ TestCase $ do + asm <- renderAsm "long pick9(long a, long b, long c, long d, long e, long f, long g, long h, long i) { return g * 100 + h * 10 + i; } int main(void) { return pick9(1, 2, 3, 4, 5, 6, 7, 8, 9) - 789; }" + let mainSection = extractFunctionSection "main" asm + assertContainsInOrder + "direct calls restore multiple stack arguments through a register before pushing, so padding pushes cannot overwrite unread slots" + [ "mov rdx, [rax+64]" + , "push rdx" + , "mov rdx, [rax+56]" + , "push rdx" + , "mov rdx, [rax+48]" + , "push rdx" + , "mov rdx, [rax+16]" + , "call pick9" + ] + mainSection + assertBool + "direct calls should not push stack arguments directly from the scratch area" + (not $ "push [rax+" `T.isInfixOf` mainSection) + +directFunctionLateStackArgCallOrderTest :: Test +directFunctionLateStackArgCallOrderTest = TestLabel "Asm.Output.direct-function-late-stack-arg-call-order" $ TestCase $ do + asm <- renderAsm "long g(void) { return 7; } long sum8(long a, long b, long c, long d, long e, long f, long g_, long h) { return g_ + h; } int main(void) { return sum8(1, 2, 3, 4, 5, 6, g(), 8) - 15; }" + let mainSection = extractFunctionSection "main" asm + assertOccursBefore + "later stack arguments should not be pushed before evaluating an earlier stack argument that contains a nested call" + "call g" + "push 8" + mainSection + +stackPassedParameterSpillTest :: Test +stackPassedParameterSpillTest = TestLabel "Asm.Output.stack-passed-parameter-spill" $ TestCase $ do + asm <- renderAsm "long last7(long a, long b, long c, long d, long e, long f, long g) { return g; } int main(void) { return last7(1, 2, 3, 4, 5, 6, 7) - 7; }" + assertContainsInOrder + "function prologues spill stack-passed parameters into their local slots before use" + [ "last7:" + , "mov rax, [rbp+16]" + , "mov [rbp-56], rax" + ] + asm + +stackPassedBoolParameterSpillTest :: Test +stackPassedBoolParameterSpillTest = TestLabel "Asm.Output.stack-passed-bool-parameter-spill" $ TestCase $ do + asm <- renderAsm "int bool7(int a, int b, int c, int d, int e, int f, _Bool g) { return g; } int main(void) { return bool7(1, 2, 3, 4, 5, 6, 256); }" + assertContainsInOrder + "stack-passed _Bool parameters normalize the ABI-defined low byte before spilling into the local slot" + [ "bool7:" + , "mov rax, [rbp+16]" + , "cmp al, 0" + , "setne al" + , "movzb rax, al" + , "mov [rbp-" + ] + asm + +unnamedFunctionParameterSpillTest :: Test +unnamedFunctionParameterSpillTest = TestLabel "Asm.Output.unnamed-function-parameter-spill" $ TestCase $ do + asm <- renderAsm "struct S { char a; char b; char c; }; int f(struct S, int x) { return x; } int main(void) { struct S s; s.a = 1; s.b = 2; s.c = 3; return f(s, 42) - 42; }" + let fSection = extractFunctionSection "f" asm + assertContainsInOrder + "unnamed function parameters should still occupy their ABI argument slots before later named parameters are spilled" + [ "f:" + , "mov r10, rdi" + , "mov [rbp-8], esi" + , "lea rax, [rbp-8]" + ] + fSection + assertBool + "a named parameter after an unnamed aggregate should not be spilled from the aggregate's register" + (not $ "mov [rbp-8], edi" `T.isInfixOf` fSection) + +unnamedFunctionAdjustedParameterSpillTest :: Test +unnamedFunctionAdjustedParameterSpillTest = TestLabel "Asm.Output.unnamed-function-adjusted-parameter-spill" $ TestCase $ do + asm <- renderAsm "typedef int F(int); int f(F) { return 0; }" + let fSection = extractFunctionSection "f" asm + assertContainsInOrder + "unnamed function-typed parameters should be registered as pointer-sized local slots" + [ "f:" + , "mov [rbp-8], rdi" + ] + fSection + assertBool + "unnamed function-typed parameters should not use the raw function type's one-byte slot" + (not $ "mov [rbp-1], rdi" `T.isInfixOf` fSection) + +writeOnlyFallbackReplacementRestoreTest :: Test +writeOnlyFallbackReplacementRestoreTest = + replacementFailurePreservesWriteOnlyOutputTest + PreserveReplacementOutputMode + "Asm.Output.write-only-fallback-replacement-restore" + ownerWriteMode + +writeOnlyExecutableFallbackReplacementRestoreTest :: Test +writeOnlyExecutableFallbackReplacementRestoreTest = + replacementFailurePreservesWriteOnlyOutputTest + PreserveReplacementOutputModeKeepingExecutableBits + "Asm.Output.write-only-executable-fallback-replacement-restore" + (ownerWriteMode `unionFileModes` ownerExecuteMode) + +unreadableStagedFallbackReplacementTest :: Test +unreadableStagedFallbackReplacementTest = + TestLabel "Asm.Output.unreadable-staged-fallback-replacement" $ TestCase $ do + tmpDir <- getTemporaryDirectory + (targetPath, targetHandle) <- openTempFile tmpDir "htcc-output-target" + (stagedPath, stagedHandle) <- openTempFile tmpDir "htcc-output-staged" + let cleanup = + ignoreIOException (hClose targetHandle) + >> ignoreIOException (hClose stagedHandle) + >> ignoreIOException (removeFile targetPath) + >> ignoreIOException (removeFile stagedPath) + existingMode = 0o555 + stagedMode = 0o055 + replacementOutput = "#!/bin/sh\nexit 0\n" + copyReplacementOutput src dst = + B.readFile src >>= B.writeFile dst + flip finally cleanup $ do + hClose targetHandle + hClose stagedHandle + T.writeFile targetPath "#!/bin/sh\nexit 99\n" + setFileMode targetPath existingMode + T.writeFile stagedPath replacementOutput + setFileMode stagedPath stagedMode + replaceExistingOutputFromPathWith + copyReplacementOutput + PreserveReplacementOutputModeKeepingExecutableBits + targetPath + existingMode + stagedMode + stagedPath + replacedMode <- fileMode <$> getFileStatus targetPath + setFileMode targetPath $ replacedMode `unionFileModes` ownerReadMode + replacedOutput <- T.readFile targetPath + assertEqual + "fallback replacement should temporarily restore owner read on unreadable staged outputs" + existingMode + (intersectFileModes replacedMode 0o777) + assertEqual "fallback replacement should copy the staged output" replacementOutput replacedOutput + where + ignoreIOException = flip catchIOError $ const $ pure () + +otherWritableFallbackReplacementSkipsPreemptiveChmodTest :: Test +otherWritableFallbackReplacementSkipsPreemptiveChmodTest = + TestLabel "Asm.Output.other-writable-fallback-replacement-skips-preemptive-chmod" $ TestCase $ do + tmpDir <- getTemporaryDirectory + (targetPath, targetHandle) <- openTempFile tmpDir "htcc-output-target" + (stagedPath, stagedHandle) <- openTempFile tmpDir "htcc-output-staged" + let cleanup = + ignoreIOException (hClose targetHandle) + >> ignoreIOException (hClose stagedHandle) + >> ignoreIOException (removeFile targetPath) + >> ignoreIOException (removeFile stagedPath) + existingMode = otherWriteMode + stagedMode = ownerReadMode + replacementOutput = "replacement output\n" + copyReplacementOutput src dst = do + modeBeforeCopy <- fileMode <$> getFileStatus dst + assertEqual + "fallback replacement should attempt copy before adding owner write" + existingMode + (intersectFileModes modeBeforeCopy 0o777) + setFileMode dst $ modeBeforeCopy `unionFileModes` ownerWriteMode + B.readFile src >>= B.writeFile dst + flip finally cleanup $ do + hClose targetHandle + hClose stagedHandle + T.writeFile targetPath "stale output\n" + setFileMode targetPath existingMode + T.writeFile stagedPath replacementOutput + setFileMode stagedPath stagedMode + replaceExistingOutputFromPathWith + copyReplacementOutput + PreserveReplacementOutputMode + targetPath + existingMode + stagedMode + stagedPath + replacedMode <- fileMode <$> getFileStatus targetPath + setFileMode targetPath $ replacedMode `unionFileModes` ownerReadMode + replacedOutput <- T.readFile targetPath + assertEqual + "fallback replacement should restore the original mode" + existingMode + (intersectFileModes replacedMode 0o777) + assertEqual "fallback replacement should copy the staged output" replacementOutput replacedOutput + where + ignoreIOException = flip catchIOError $ const $ pure () + +rollbackFailureSurfacedTest :: Test +rollbackFailureSurfacedTest = + TestLabel "Asm.Output.rollback-failure-surfaced" $ TestCase $ do + tmpDir <- getTemporaryDirectory + (targetPath, targetHandle) <- openTempFile tmpDir "htcc-output-target" + (stagedPath, stagedHandle) <- openTempFile tmpDir "htcc-output-staged" + let cleanup = + ignoreIOException (hClose targetHandle) + >> ignoreIOException (hClose stagedHandle) + >> ignoreIOException (removeFile stagedPath) + >> ignoreIOException (removeFile targetPath) + >> ignoreIOException (removeDirectory targetPath) + existingMode = ownerReadMode `unionFileModes` ownerWriteMode + stagedMode = ownerReadMode `unionFileModes` ownerWriteMode + failDuringReplacement src dst + | src == stagedPath = do + withBinaryFile src ReadMode $ \srcHandle -> + withBinaryFile dst WriteMode $ \dstHandle -> do + chunk <- B.hGetSome srcHandle 4 + B.hPut dstHandle chunk + removeFile dst + createDirectory dst + ioError $ userError "simulated replacement failure" + | otherwise = + assertFailure "unexpected replacement source path" + flip finally cleanup $ do + hClose targetHandle + hClose stagedHandle + T.writeFile targetPath "stale output\n" + setFileMode targetPath existingMode + T.writeFile stagedPath "replacement output\n" + result <- try + (replaceExistingOutputFromPathWith + failDuringReplacement + PreserveReplacementOutputMode + targetPath + existingMode + stagedMode + stagedPath + ) + :: IO (Either IOException ()) + case result of + Left ioErr -> do + let errText = T.pack $ show ioErr + assertBool + "rollback failures should be surfaced to the caller" + ("failed to restore original output after replacement failure" `T.isInfixOf` errText) + assertBool + "the surfaced error should retain the original replacement failure" + ("simulated replacement failure" `T.isInfixOf` errText) + Right _ -> + assertFailure "replacement should fail when both replacement and rollback fail" + targetIsDirectory <- doesDirectoryExist targetPath + assertBool + "the failed rollback fixture should leave the destination in its mutated state" + targetIsDirectory + where + ignoreIOException = flip catchIOError $ const $ pure () + +executableOnlyFallbackReplacementPreservesModeTest :: Test +executableOnlyFallbackReplacementPreservesModeTest = + TestLabel "Asm.Output.executable-only-fallback-replacement-preserves-mode" $ TestCase $ do + tmpDir <- getTemporaryDirectory + (targetPath, targetHandle) <- openTempFile tmpDir "htcc-output-target" + (stagedPath, stagedHandle) <- openTempFile tmpDir "htcc-output-staged" + let cleanup = + ignoreIOException (hClose targetHandle) + >> ignoreIOException (hClose stagedHandle) + >> ignoreIOException (removeFile targetPath) + >> ignoreIOException (removeFile stagedPath) + existingMode = 0o555 + currentMode = 0o755 + replacementOutput = "#!/bin/sh\nexit 0\n" + copyReplacementOutput src dst = + B.readFile src >>= B.writeFile dst + flip finally cleanup $ do + hClose targetHandle + hClose stagedHandle + T.writeFile targetPath "#!/bin/sh\nexit 99\n" + setFileMode targetPath existingMode + T.writeFile stagedPath replacementOutput + setFileMode stagedPath currentMode + replaceExistingOutputFromPathWith + copyReplacementOutput + PreserveReplacementOutputModeKeepingExecutableBits + targetPath + existingMode + currentMode + stagedPath + replacedMode <- fileMode <$> getFileStatus targetPath + replacedOutput <- T.readFile targetPath + assertEqual + "fallback replacement should temporarily make executable-only outputs writable and then restore the original mode" + existingMode + (intersectFileModes replacedMode 0o777) + assertEqual "fallback replacement should copy the staged output" replacementOutput replacedOutput + where + ignoreIOException = flip catchIOError $ const $ pure () + +replacementExecutableBitsIgnoreReadBitsTest :: Test +replacementExecutableBitsIgnoreReadBitsTest = + TestLabel "Asm.Output.replacement-executable-bits-ignore-read-bits" $ TestCase $ do + tmpDir <- getTemporaryDirectory + (targetPath, targetHandle) <- openTempFile tmpDir "htcc-output-target" + (stagedPath, stagedHandle) <- openTempFile tmpDir "htcc-output-staged" + let cleanup = + ignoreIOException (hClose targetHandle) + >> ignoreIOException (hClose stagedHandle) + >> ignoreIOException (removeFile targetPath) + >> ignoreIOException (removeFile stagedPath) + existingMode = 0o640 + currentMode = 0o755 + expectedMode = 0o740 + copyReplacementOutput src dst = + B.readFile src >>= B.writeFile dst + flip finally cleanup $ do + hClose targetHandle + hClose stagedHandle + T.writeFile targetPath "stale output\n" + setFileMode targetPath existingMode + T.writeFile stagedPath "#!/bin/sh\nexit 0\n" + replaceExistingOutputFromPathWith + copyReplacementOutput + PreserveReplacementOutputModeKeepingExecutableBits + targetPath + existingMode + currentMode + stagedPath + replacedMode <- fileMode <$> getFileStatus targetPath + replacedOutput <- T.readFile targetPath + assertEqual + "replacement should preserve the prior permission mask and add only owner execute" + expectedMode + (intersectFileModes replacedMode 0o777) + assertEqual "replacement should copy the staged output" "#!/bin/sh\nexit 0\n" replacedOutput + where + ignoreIOException = flip catchIOError $ const $ pure () + +replacementExecutableBitsRestoreOwnerExecuteTest :: Test +replacementExecutableBitsRestoreOwnerExecuteTest = + TestLabel "Asm.Output.replacement-executable-bits-restore-owner-execute" $ TestCase $ do + tmpDir <- getTemporaryDirectory + (targetPath, targetHandle) <- openTempFile tmpDir "htcc-output-target" + (stagedPath, stagedHandle) <- openTempFile tmpDir "htcc-output-staged" + let cleanup = + ignoreIOException (hClose targetHandle) + >> ignoreIOException (hClose stagedHandle) + >> ignoreIOException (removeFile targetPath) + >> ignoreIOException (removeFile stagedPath) + existingMode = 0o055 + currentMode = 0o755 + expectedMode = 0o155 + replacementOutput = "#!/bin/sh\nexit 0\n" + copyReplacementOutput src dst = + B.readFile src >>= B.writeFile dst + flip finally cleanup $ do + hClose targetHandle + hClose stagedHandle + T.writeFile targetPath "#!/bin/sh\nexit 99\n" + setFileMode targetPath existingMode + T.writeFile stagedPath replacementOutput + setFileMode stagedPath currentMode + replaceExistingOutputFromPathWith + copyReplacementOutput + PreserveReplacementOutputModeKeepingExecutableBits + targetPath + existingMode + currentMode + stagedPath + replacedMode <- fileMode <$> getFileStatus targetPath + setFileMode targetPath $ replacedMode `unionFileModes` ownerReadMode + replacedOutput <- T.readFile targetPath + assertEqual + "replacement should restore owner execute when the prior execute mask only covered group/other" + expectedMode + (intersectFileModes replacedMode 0o777) + assertEqual "replacement should copy the staged output" replacementOutput replacedOutput + where + ignoreIOException = flip catchIOError $ const $ pure () + +freshExecutableReplacementPreservesExecuteBitsTest :: Test +freshExecutableReplacementPreservesExecuteBitsTest = + TestLabel "Asm.Output.fresh-executable-replacement-preserves-execute-bits" $ TestCase $ do + tmpDir <- getTemporaryDirectory + creationMode <- creationMaskedOutputMode + (targetPath, targetHandle) <- openTempFile tmpDir "htcc-output-target" + let cleanup = + ignoreIOException (hClose targetHandle) + >> ignoreIOException (removeFile targetPath) + expectedMode = intersectFileModes creationMode 0o777 `unionFileModes` 0o111 + flip finally cleanup $ do + hClose targetHandle + removeFile targetPath + withReplacementOutputPath PreserveReplacementOutputModeKeepingExecutableBits targetPath $ \tmpOutputPath -> do + T.writeFile tmpOutputPath "#!/bin/sh\nexit 0\n" + setFileMode tmpOutputPath 0o755 + replacedMode <- fileMode <$> getFileStatus targetPath + replacedOutput <- T.readFile targetPath + assertEqual + "fresh executable replacements should preserve all execute bits emitted by the linker" + expectedMode + (intersectFileModes replacedMode 0o777) + assertEqual "fresh replacement should write the staged output" "#!/bin/sh\nexit 0\n" replacedOutput + where + ignoreIOException = flip catchIOError $ const $ pure () + +freshExecutableReplacementRestoresOwnerExecuteTest :: Test +freshExecutableReplacementRestoresOwnerExecuteTest = + TestLabel "Asm.Output.fresh-executable-replacement-restores-owner-execute" $ TestCase $ do + tmpDir <- getTemporaryDirectory + (targetPath, targetHandle) <- openTempFile tmpDir "htcc-output-target" + let cleanup = + ignoreIOException (hClose targetHandle) + >> ignoreIOException (removeFile targetPath) + currentMode = 0o055 + flip finally cleanup $ do + hClose targetHandle + removeFile targetPath + withReplacementOutputPath PreserveReplacementOutputModeKeepingExecutableBits targetPath $ \tmpOutputPath -> do + T.writeFile tmpOutputPath "#!/bin/sh\nexit 0\n" + setFileMode tmpOutputPath currentMode + replacedMode <- fileMode <$> getFileStatus targetPath + replacedOutput <- T.readFile targetPath + assertBool + "fresh executable replacements should restore owner execute when the linker only leaves group/other execute" + (intersectFileModes replacedMode ownerExecuteMode /= 0) + assertEqual + "fresh executable replacements should preserve the linker-provided group/other execute mask" + currentMode + (intersectFileModes replacedMode currentMode) + assertEqual "fresh replacement should write the staged output" "#!/bin/sh\nexit 0\n" replacedOutput + where + ignoreIOException = flip catchIOError $ const $ pure () + +freshReplacementRestrictiveUmaskCreatesWritableStagingTest :: Test +freshReplacementRestrictiveUmaskCreatesWritableStagingTest = + TestLabel "Asm.Output.fresh-replacement-restrictive-umask-creates-writable-staging" $ TestCase $ do + tmpDir <- getTemporaryDirectory + (targetPath, targetHandle) <- openTempFile tmpDir "htcc-output-target" + let cleanup = + ignoreIOException (hClose targetHandle) + >> ignoreIOException (removeFile targetPath) + replacementOutput = "#!/bin/sh\nexit 0\n" + flip finally cleanup $ do + hClose targetHandle + removeFile targetPath + originalMask <- setFileCreationMask 0o222 + flip finally (setFileCreationMask originalMask) $ + withReplacementOutputPath PreserveReplacementOutputModeKeepingExecutableBits targetPath $ \tmpOutputPath -> do + T.writeFile tmpOutputPath replacementOutput + setFileMode tmpOutputPath 0o755 + replacedOutput <- T.readFile targetPath + assertEqual + "fresh replacement staging should remain writable under restrictive umask" + replacementOutput + replacedOutput + where + ignoreIOException = flip catchIOError $ const $ pure () + +freshReplacementPreservesInheritedSetgidStagingDirectoryTest :: Test +freshReplacementPreservesInheritedSetgidStagingDirectoryTest = + TestLabel "Asm.Output.fresh-replacement-preserves-inherited-setgid-staging-directory" $ TestCase $ do + tmpDir <- getTemporaryDirectory + targetDir <- mkdtemp (tmpDir "htcc-output-setgid-XXXXXX") + let targetPath = targetDir "target" + cleanup = + ignoreIOException (removeFile targetPath) + >> ignoreIOException (removeDirectoryRecursive targetDir) + privateDirectoryMode = + foldr1 unionFileModes + [ ownerReadMode + , ownerWriteMode + , ownerExecuteMode + ] + expectedOutput = "setgid\n" + flip finally cleanup $ do + setFileMode targetDir $ privateDirectoryMode `unionFileModes` setGroupIDMode + let controlDir = targetDir "control" + createDirectory controlDir + controlMode <- fileMode <$> getFileStatus controlDir + removeDirectory controlDir + let childrenInheritSetgid = intersectFileModes controlMode setGroupIDMode /= 0 + withReplacementOutputPath PreserveReplacementOutputMode targetPath $ \tmpOutputPath -> do + stagingMode <- fileMode <$> getFileStatus (takeDirectory tmpOutputPath) + when childrenInheritSetgid $ + assertBool + "fresh replacement staging directory should preserve inherited setgid" + (intersectFileModes stagingMode setGroupIDMode /= 0) + T.writeFile tmpOutputPath expectedOutput + replacedOutput <- T.readFile targetPath + assertEqual "fresh replacement should write the staged output" expectedOutput replacedOutput + where + ignoreIOException = flip catchIOError $ const $ pure () + +creationMaskedOutputModeMatchesActualCreationTest :: Test +creationMaskedOutputModeMatchesActualCreationTest = + TestLabel "Asm.Output.creation-masked-output-mode-matches-actual-creation" $ TestCase $ do + tmpDir <- getTemporaryDirectory + probeDir <- mkdtemp (tmpDir <> "/htcc-output-modeXXXXXX") + let probePath = probeDir <> "/mask-probe" + cleanup = + ignoreIOException (removeFile probePath) + >> ignoreIOException (removeDirectory probeDir) + flip finally cleanup $ do + expectedMode <- creationMaskedOutputMode + probeFd <- createFile probePath 0o666 + finally + ( do + actualMode <- intersectFileModes 0o666 . fileMode <$> getFileStatus probePath + assertEqual + "creationMaskedOutputMode should match the mode that POSIX file creation actually receives" + expectedMode + actualMode + ) + (closeFd probeFd) + where + ignoreIOException = flip catchIOError $ const $ pure () + +hardLinkedFallbackReplacementRejectedTest :: Test +hardLinkedFallbackReplacementRejectedTest = + TestLabel "Asm.Output.hard-linked-fallback-replacement-rejected" $ TestCase $ do + tmpDir <- getTemporaryDirectory + (targetPath, targetHandle) <- openTempFile tmpDir "htcc-output-target" + (stagedPath, stagedHandle) <- openTempFile tmpDir "htcc-output-staged" + let aliasPath = targetPath <> ".alias" + cleanup = + ignoreIOException (hClose targetHandle) + >> ignoreIOException (hClose stagedHandle) + >> ignoreIOException (removeFile aliasPath) + >> ignoreIOException (removeFile targetPath) + >> ignoreIOException (removeFile stagedPath) + existingMode = 0o644 + currentMode = 0o644 + copyReplacementOutput src dst = + B.readFile src >>= B.writeFile dst + flip finally cleanup $ do + hClose targetHandle + hClose stagedHandle + T.writeFile targetPath "stale output\n" + createLink targetPath aliasPath + setFileMode targetPath existingMode + T.writeFile stagedPath "replacement output\n" + result <- try + (replaceExistingOutputFromPathWith + copyReplacementOutput + PreserveReplacementOutputMode + targetPath + existingMode + currentMode + stagedPath + ) + :: IO (Either IOException ()) + case result of + Left _ -> pure () + Right _ -> assertFailure "fallback replacement should reject hard-linked outputs" + targetContents <- T.readFile targetPath + aliasContents <- T.readFile aliasPath + assertEqual "target should remain unchanged" "stale output\n" targetContents + assertEqual "hard-linked alias should remain unchanged" "stale output\n" aliasContents + where + ignoreIOException = flip catchIOError $ const $ pure () + +hardLinkedRenameReplacementPreservesAliasTest :: Test +hardLinkedRenameReplacementPreservesAliasTest = + TestLabel "Asm.Output.hard-linked-rename-replacement-preserves-alias" $ TestCase $ do + tmpDir <- getTemporaryDirectory + (targetPath, targetHandle) <- openTempFile tmpDir "htcc-output-target" + let aliasPath = targetPath <> ".alias" + cleanup = + ignoreIOException (hClose targetHandle) + >> ignoreIOException (removeFile aliasPath) + >> ignoreIOException (removeFile targetPath) + staleOutput = "stale output\n" + replacementOutput = "replacement output\n" + flip finally cleanup $ do + hClose targetHandle + T.writeFile targetPath staleOutput + createLink targetPath aliasPath + result <- try + (withReplacementOutputPath PreserveReplacementOutputMode targetPath $ \tmpOutputPath -> + T.writeFile tmpOutputPath replacementOutput + ) + :: IO (Either IOException ()) + case result of + Left ioErr -> + assertFailure $ + "rename replacement should permit hard-linked outputs: " <> show ioErr + Right _ -> + pure () + targetContents <- T.readFile targetPath + aliasContents <- T.readFile aliasPath + assertEqual "target should receive the replacement output" replacementOutput targetContents + assertEqual "hard-linked alias should retain the previous inode contents" staleOutput aliasContents + where + ignoreIOException = flip catchIOError $ const $ pure () + +withReplacementOutputPathAndResolvedPathDirectFallbackTest :: Test +withReplacementOutputPathAndResolvedPathDirectFallbackTest = + TestLabel "Asm.Output.with-replacement-output-path-and-resolved-path-direct-fallback" $ TestCase $ do + tmpDir <- getTemporaryDirectory + targetDir <- mkdtemp (tmpDir "htcc-output-direct-fallbackXXXXXX") + let targetPath = targetDir "htcc-output-target" + replacementOutput = "#!/bin/sh\nexit 0\n" + cleanup = do + ignoreIOException $ setFileMode targetDir 0o755 + ignoreIOException $ removeFile targetPath + ignoreIOException $ removeDirectoryRecursive targetDir + flip finally cleanup $ do + T.writeFile targetPath "#!/bin/sh\nexit 99\n" + setFileMode targetPath 0o644 + setFileMode targetDir 0o555 + (finalPath, stagedPath) <- + withReplacementOutputPathAndResolvedPath PreserveReplacementOutputModeKeepingExecutableBits targetPath $ \tmpOutputPath -> do + T.writeFile tmpOutputPath replacementOutput + setFileMode tmpOutputPath 0o755 + pure tmpOutputPath + replacedOutput <- T.readFile targetPath + assertEqual + "direct fallback should report the final destination path rather than the temporary staging path" + targetPath + finalPath + assertBool + "direct fallback should stage the replacement away from the target directory" + (stagedPath /= targetPath && takeDirectory stagedPath /= targetDir) + assertEqual "direct fallback should replace the target output" replacementOutput replacedOutput + where + ignoreIOException = flip catchIOError $ const $ pure () + +withReplacementOutputPathActionPermissionFailureNotRetriedTest :: Test +withReplacementOutputPathActionPermissionFailureNotRetriedTest = + TestLabel "Asm.Output.with-replacement-output-path-action-permission-failure-not-retried" $ TestCase $ do + tmpDir <- getTemporaryDirectory + (targetPath, targetHandle) <- openTempFile tmpDir "htcc-output-target" + actionCalls <- newIORef (0 :: Int) + let staleOutput = "stale output\n" + cleanup = + ignoreIOException (hClose targetHandle) + >> ignoreIOException (removeFile targetPath) + failingAction tmpOutputPath = do + modifyIORef' actionCalls succ + ioError $ + mkIOError + permissionErrorType + "simulated action permission failure" + Nothing + (Just tmpOutputPath) + flip finally cleanup $ do + hClose targetHandle + T.writeFile targetPath staleOutput + result <- try + (withReplacementOutputPathAndResolvedPath PreserveReplacementOutputMode targetPath $ \tmpOutputPath -> + failingAction tmpOutputPath + ) + :: IO (Either IOException (FilePath, ())) + case result of + Left _ -> + pure () + Right _ -> + assertFailure "action PermissionDenied should be returned without direct fallback retry" + callCount <- readIORef actionCalls + targetOutput <- T.readFile targetPath + assertEqual "action PermissionDenied should not retry the action through direct fallback" 1 callCount + assertEqual "failed action should preserve the existing output" staleOutput targetOutput + where + ignoreIOException = flip catchIOError $ const $ pure () + +withReplacementOutputPathPublishPermissionFailureNotRetriedTest :: Test +withReplacementOutputPathPublishPermissionFailureNotRetriedTest = + TestLabel "Asm.Output.with-replacement-output-path-publish-permission-failure-not-retried" $ TestCase $ do + tmpDir <- getTemporaryDirectory + targetDir <- mkdtemp (tmpDir "htcc-output-publish-fallbackXXXXXX") + actionCalls <- newIORef (0 :: Int) + let targetPath = targetDir "htcc-output-target" + staleOutput = "stale output\n" + replacementOutput = "replacement output\n" + targetMode = ownerWriteMode + cleanup = do + ignoreIOException $ setFileMode targetDir 0o755 + ignoreIOException $ setFileMode targetPath 0o644 + ignoreIOException $ removeFile targetPath + ignoreIOException $ removeDirectoryRecursive targetDir + flip finally cleanup $ do + T.writeFile targetPath staleOutput + setFileMode targetPath targetMode + result <- try + (withReplacementOutputPathAndResolvedPath PreserveReplacementOutputMode targetPath $ \tmpOutputPath -> do + modifyIORef' actionCalls succ + T.writeFile tmpOutputPath replacementOutput + setFileMode tmpOutputPath 0o644 + setFileMode targetDir 0o555 + ) + :: IO (Either IOException (FilePath, ())) + case result of + Left ioErr -> + assertFailure $ + "publish PermissionDenied should reuse the staged artifact without rerunning the action: " + <> show ioErr + Right (finalPath, ()) -> + assertEqual "publish fallback should still report the target output path" targetPath finalPath + callCount <- readIORef actionCalls + setFileMode targetDir 0o755 + replacedMode <- fileMode <$> getFileStatus targetPath + setFileMode targetPath $ replacedMode `unionFileModes` ownerReadMode + targetOutput <- T.readFile targetPath + assertEqual "publish PermissionDenied should not retry the action through direct fallback" 1 callCount + assertEqual + "publish fallback should preserve the unreadable existing output mode" + targetMode + (intersectFileModes replacedMode 0o777) + assertEqual "publish fallback should copy the staged output" replacementOutput targetOutput + where + ignoreIOException = flip catchIOError $ const $ pure () + +suppressWarnsRunAsmPreservesDirectiveLikePostWarningOutputTest :: Test +suppressWarnsRunAsmPreservesDirectiveLikePostWarningOutputTest = + TestLabel "Asm.Output.suppress-warns-run-asm-preserves-directive-like-post-warning-output" $ TestCase $ + let filteredStderr = + dropCompilerWarningOutput $ + BC.unlines + [ "warning: this warning should be suppressed" + , ".section keep" + , "# generated by fake HTCC_ASSEMBLER wrapper" + , "1 warning generated." + ] + expectedStderr = + BC.unlines + [ ".section keep" + , "# generated by fake HTCC_ASSEMBLER wrapper" + , "1 warning generated." + ] + in do + assertEqual + "summary-shaped output should be preserved once unrelated post-warning output intervenes" + expectedStderr + filteredStderr + +suppressWarnsRunAsmPreservesLeadInForRetainedErrorTest :: Test +suppressWarnsRunAsmPreservesLeadInForRetainedErrorTest = + TestLabel "Asm.Output.suppress-warns-run-asm-preserves-lead-in-for-retained-error" $ TestCase $ + let filteredStderr = + dropCompilerWarningOutput $ + BC.unlines + [ "In file included from fake-header.h:1:" + , " from fake-source.c:2:" + , "warning: this warning should be suppressed" + , "1 warning generated." + , "error: fake HTCC_ASSEMBLER failure" + ] + expectedStderr = + BC.unlines + [ "In file included from fake-header.h:1:" + , " from fake-source.c:2:" + , "error: fake HTCC_ASSEMBLER failure" + ] + in do + assertEqual + "lead-in lines should stay attached when a retained error follows the suppressed warning" + expectedStderr + filteredStderr + assertBool "suppressed warning text should be removed" $ + not $ + "warning: this warning should be suppressed" `BC.isInfixOf` filteredStderr + +suppressWarnsRunAsmPreservesWarningLabelErrorSnippetTest :: Test +suppressWarnsRunAsmPreservesWarningLabelErrorSnippetTest = + TestLabel "Asm.Output.suppress-warns-run-asm-preserves-warning-label-error-snippet" $ TestCase $ + let filteredStderr = + dropCompilerWarningOutput $ + BC.unlines + [ "warning: this warning should be suppressed" + , "{standard input}:1:1: error: fake HTCC_ASSEMBLER failure" + , "warning: return 1;" + , "^~~~~~~~~~~~~~~~~" + ] + in do + assertBool "standalone warnings should still be suppressed" $ + not $ + "warning: this warning should be suppressed" `BC.isInfixOf` filteredStderr + assertBool "the real error should be preserved" $ + "error: fake HTCC_ASSEMBLER failure" `BC.isInfixOf` filteredStderr + assertBool "warning-label snippets should not be suppressed" $ + "warning: return 1;" `BC.isInfixOf` filteredStderr + assertBool "the snippet caret should remain attached" $ + "^~~~~~~~~~~~~~~~~" `BC.isInfixOf` filteredStderr + +suppressWarnsRunAsmSuppressesLocatedWarningsFromParenthesizedPathsTest :: Test +suppressWarnsRunAsmSuppressesLocatedWarningsFromParenthesizedPathsTest = + TestLabel "Asm.Output.suppress-warns-run-asm-suppresses-located-warnings-from-parenthesized-paths" $ TestCase $ + let filteredStderr = + dropCompilerWarningOutput $ + BC.unlines + [ "/tmp/a(b)=c/x.c:1:1: warning: this warning should be suppressed" + , "1 warning generated." + , ".section keep" + ] + expectedStderr = + BC.unlines + [ ".section keep" + ] + in assertEqual + "located warnings should still be suppressed when the diagnostic path contains parentheses or equals signs" + expectedStderr + filteredStderr + +incrementalWarningFilterRetainsStandalonePartialOutputTest :: Test +incrementalWarningFilterRetainsStandalonePartialOutputTest = + TestLabel "Asm.Output.incremental-warning-filter-retains-standalone-partial-output" $ TestCase $ + let promptChunk = ("stdout: wrapper prompt", "stdout: wrapper prompt") + (_, decisions) = + feedIncrementalCompilerWarningFilter + fst + snd + emptyIncrementalCompilerWarningFilter + [promptChunk] + in case decisions of + [RetainCompilerWarningFilterChunk retainedChunk] -> + assertEqual + "standalone partial output should be retained immediately" + promptChunk + retainedChunk + _ -> + assertFailure $ + "expected a retained prompt chunk, got " <> show (length decisions) <> " decisions" + +incrementalWarningFilterDefersPartialWarningPrefixTest :: Test +incrementalWarningFilterDefersPartialWarningPrefixTest = + TestLabel "Asm.Output.incremental-warning-filter-defers-partial-warning-prefix" $ TestCase $ + let warningChunk = ("warning: fake HTCC_ASSEMBLER warning", "warning: fake HTCC_ASSEMBLER warning") + (warningFilter, decisions) = + feedIncrementalCompilerWarningFilter + fst + snd + emptyIncrementalCompilerWarningFilter + [warningChunk] + finalDecisions = + finalizeIncrementalCompilerWarningFilter + fst + snd + warningFilter + in do + assertBool + "partial warning prefixes should remain deferred until more input or finalization" + (null decisions) + case finalDecisions of + [SuppressCompilerWarningFilterChunk suppressedChunk] -> + assertEqual + "finalization should still suppress the deferred warning chunk" + warningChunk + suppressedChunk + _ -> + assertFailure $ + "expected a suppressed warning chunk after finalization, got " + <> show (length finalDecisions) + <> " decisions" + +incompleteWarningSuppressionFlushesRetainedPromptPrefixTest :: Test +incompleteWarningSuppressionFlushesRetainedPromptPrefixTest = + TestLabel "Asm.Output.incomplete-warning-suppression-flushes-retained-prompt-prefix" $ TestCase $ + assertBool + "non-diagnostic interactive prompt text should not stay buffered waiting for a newline" + (not $ WS.incompleteCompilerOutputNeedsMoreInputForWarningSuppression "stdout: wrapper prompt") + +incompleteWarningSuppressionKeepsLocatedWarningPrefixBufferedTest :: Test +incompleteWarningSuppressionKeepsLocatedWarningPrefixBufferedTest = + TestLabel "Asm.Output.incomplete-warning-suppression-keeps-located-warning-prefix-buffered" $ TestCase $ + assertBool + "partial located warning prefixes still need buffering so later bytes can be suppressed as one diagnostic" + (WS.incompleteCompilerOutputNeedsMoreInputForWarningSuppression "{standard input}:1:1: warn") + +indirectFunctionPointerStackArgAlignmentTest :: Test +indirectFunctionPointerStackArgAlignmentTest = TestLabel "Asm.Output.indirect-function-pointer-stack-arg-alignment" $ TestCase $ do + asm <- renderAsm "long sum7(long a, long b, long c, long d, long e, long f, long g) { return a + b + c + d + e + f + g; } int main(void) { long (*fp)(long, long, long, long, long, long, long); fp = sum7; return fp(1, 2, 3, 4, 5, 6, 7) - 28; }" + assertContainsInOrder + "stack-passed indirect calls probe the final call alignment before preparing args and keep any padding ahead of outgoing stack arguments" + [ "mov rax, rsp" + , "sub rax, 8" + , "and rax, 15" + , "jnz .L.call." + , "mov r11, [rax+56]" + , "mov rdx, [rax+48]" + , "push rdx" + , "mov rdx, [rax+16]" + , "mov rax, 0" + , "call r11" + , "add rsp, 8" + , ".L.call." + , "sub rsp, 8" + , "mov r11, [rax+56]" + , "mov rdx, [rax+48]" + , "push rdx" + , "mov rdx, [rax+16]" + , "mov rax, 0" + , "call r11" + , "add rsp, 8" + , "add rsp, 8" + ] + asm + +indirectFunctionMultipleStackArgRestoreTest :: Test +indirectFunctionMultipleStackArgRestoreTest = TestLabel "Asm.Output.indirect-function-multiple-stack-arg-restore" $ TestCase $ do + asm <- renderAsm "long pick9(long a, long b, long c, long d, long e, long f, long g, long h, long i) { return g * 100 + h * 10 + i; } int main(void) { long (*fp)(long, long, long, long, long, long, long, long, long); fp = pick9; return fp(1, 2, 3, 4, 5, 6, 7, 8, 9) - 789; }" + let mainSection = extractFunctionSection "main" asm + assertContainsInOrder + "indirect calls restore multiple stack arguments through a register after loading the callee" + [ "mov r11, [rax+72]" + , "mov rdx, [rax+64]" + , "push rdx" + , "mov rdx, [rax+56]" + , "push rdx" + , "mov rdx, [rax+48]" + , "push rdx" + , "mov rdx, [rax+16]" + , "mov rax, 0" + , "call r11" + ] + mainSection + assertBool + "indirect calls should not push stack arguments directly from the scratch area" + (not $ "push [rax+" `T.isInfixOf` mainSection) + +indirectFunctionLateStackArgCallOrderTest :: Test +indirectFunctionLateStackArgCallOrderTest = TestLabel "Asm.Output.indirect-function-late-stack-arg-call-order" $ TestCase $ do + asm <- renderAsm "long g(void) { return 7; } long sum8(long a, long b, long c, long d, long e, long f, long g_, long h) { return g_ + h; } int main(void) { long (*fp)(long, long, long, long, long, long, long, long); fp = sum8; return fp(1, 2, 3, 4, 5, 6, g(), 8) - 15; }" + let mainSection = extractFunctionSection "main" asm + assertOccursBefore + "indirect-call preparation should evaluate nested stack-argument calls before pushing later stack arguments" + "call g" + "push 8" + mainSection + +functionPointerGlobalObjectAddressRejectedTest :: Test +functionPointerGlobalObjectAddressRejectedTest = TestLabel "Asm.Output.function-pointer-global-object-address-rejected" $ TestCase $ + assertBool + "function-pointer global initializers should reject object addresses, even when cast to a function-pointer type" + (all rejected + [ "int g; int (*fp)(void) = &g;" + , "int g; int (*fp)(void) = (int (*)(void))&g;" + ] + ) + where + rejected source = + isLeft + (runParser parser "" source + :: Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs Integer, GlobalVars Integer, Literals Integer, PF.Functions Integer) + ) + +objectPointerGlobalAddressMismatchRejectedTest :: Test +objectPointerGlobalAddressMismatchRejectedTest = TestLabel "Asm.Output.object-pointer-global-address-mismatch-rejected" $ TestCase $ + assertBool + "object-pointer global initializers should reject already-known incompatible object addresses" + (all rejected + [ "int x[4]; int (*p)[5] = &x;" + , "int x[4]; char *p = &x;" + ] + ) + where + rejected source = + isLeft + (runParser parser "" source + :: Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs Integer, GlobalVars Integer, Literals Integer, PF.Functions Integer) + ) + +globalInitializerGnuConditionalTest :: Test +globalInitializerGnuConditionalTest = TestLabel "Asm.Output.global-initializer-gnu-conditional" $ TestCase $ do + asm <- renderAsm "int g = 1 ?: 2; int h = 0 ?: 2; int i = 42 ?: 7; int main(void) { return g == 1 && h == 2 && i == 42; }" + assertBool + "GNU omitted-middle conditionals fold by reusing the condition value on the true branch" + (all (`T.isInfixOf` asm) + [ "g:\n\t.4byte 1" + , "h:\n\t.4byte 2" + , "i:\n\t.4byte 42" + ] + ) + +globalInitializerConditionalFunctionDecaySizeofTest :: Test +globalInitializerConditionalFunctionDecaySizeofTest = TestLabel "Asm.Output.global-initializer-conditional-function-decay-sizeof" $ TestCase $ do + asm <- renderAsm "int f(void); int g = sizeof(1 ? f : 0); int main(void) { return g == 8; }" + assertContains + "conditional expressions preserve their decayed result type during normalization" + [ "g:" + , ".4byte 8" + ] + asm + +globalInitializerConditionalRelocTest :: Test +globalInitializerConditionalRelocTest = TestLabel "Asm.Output.global-initializer-conditional-reloc" $ TestCase $ do + asm <- renderAsm "int x; int f(void) { return 0; } int *p = 1 ? &x : 0; int (*fp)(void) = 1 ? f : 0;" + assertContains + "file-scope conditional initializers preserve wrapped object and function relocations" + [ "x:" + , ".zero 4" + , "p:" + , ".quad x" + , "fp:" + , ".quad f" + ] + asm + +globalInitializerAddressConditionBoolTest :: Test +globalInitializerAddressConditionBoolTest = TestLabel "Asm.Output.global-initializer-address-condition-bool" $ TestCase $ do + asm <- renderAsm "int x; int g = &x && 1; int h = &x ? 1 : 0; int main(void) { return g == 1 && h == 1; }" + assertContains + "file-scope scalar initializers fold address constants in boolean contexts" + [ "x:" + , ".zero 4" + , "g:" + , ".4byte 1" + , "h:" + , ".4byte 1" + ] + asm + +globalInitializerAddressConditionRelocTest :: Test +globalInitializerAddressConditionRelocTest = TestLabel "Asm.Output.global-initializer-address-condition-reloc" $ TestCase $ do + asm <- renderAsm "int x; int f(void) { return 0; } int *p = &x ? &x : 0; int (*fp)(void) = f ? f : 0;" + assertContains + "file-scope pointer initializers fold address-constant conditions before selecting relocations" + [ "x:" + , ".zero 4" + , "p:" + , ".quad x" + , "fp:" + , ".quad f" + ] + asm + +commaFunctionDesignatorCallDecayTest :: Test +commaFunctionDesignatorCallDecayTest = TestLabel "Asm.Output.comma-function-designator-call-decay" $ TestCase $ do + asm <- renderAsm "int foo(void) { return 42; } int main(void) { return (0, foo)(); }" + assertContains + "comma expressions decay bare function designators before indirect-call lowering" + [ "foo:" + , "main:" + ] + asm + +commaAssignmentDiscardsLhsTest :: Test +commaAssignmentDiscardsLhsTest = TestLabel "Asm.Output.comma-assignment-discards-lhs" $ TestCase $ do + asm <- renderAsm "int main(void) { int x; x = (1, 2); return x; }" + assertContainsInOrder + "comma expressions discard the lhs result before feeding assignments" + [ "push 1" + , "add rsp, 8" + , "push 2" + ] + asm + +structMemberAccessCodegenTest :: Test +structMemberAccessCodegenTest = TestLabel "Asm.Output.struct-member-access" $ TestCase $ do + directAsm <- renderAsm "int main(void) { struct S { int a; int b; } x; x.b = 2; return x.b; }" + pointerAsm <- renderAsm "int main(void) { struct S { int a; int b; } x; struct S *p = &x; p->b = 42; return x.b; }" + sizeofRvalueAsm <- renderAsm "struct S { int a; int b; }; struct S make(void); int main(void) { return sizeof(make().b); }" + callRvalueAsm <- renderAsm "struct S { int a; int b; }; struct S make(void) { struct S x; x.a = 3; x.b = 7; return x; } int main(void) { return make().b; }" + assignRvalueAsm <- renderAsm "int main(void) { struct S { int a; int b; } x; struct S y; y.a = 3; y.b = 7; return (x = y).b; }" + arrayRvalueAsm <- renderAsm "struct S { int a[2]; }; struct S make(void) { struct S x; x.a[0] = 3; x.a[1] = 7; return x; } int main(void) { return make().a[1]; }" + arrayRvalueDerefAsm <- renderAsm "struct S { int a[2]; }; struct S make(void) { struct S x; x.a[0] = 3; x.a[1] = 7; return x; } int main(void) { return *make().a; }" + arrayRvalueDerefAddAsm <- renderAsm "struct S { int a[2]; }; struct S make(void) { struct S x; x.a[0] = 3; x.a[1] = 7; return x; } int main(void) { return *(make().a + 1); }" + arrayRvaluePostfixAddAsm <- renderAsm "struct S { int a[2]; }; struct S make(void) { struct S x; x.a[0] = 3; x.a[1] = 7; return x; } int main(void) { return (make().a + 1)[0]; }" + arrayRvalueDerefSubAsm <- renderAsm "struct S { int a[2]; }; struct S make(void) { struct S x; x.a[0] = 3; x.a[1] = 7; return x; } int main(void) { return *(make().a - 0); }" + assignArrayRvalueAsm <- renderAsm "int main(void) { struct S { int a[2]; } x; struct S y; y.a[0] = 3; y.a[1] = 7; return (x = y).a[1]; }" + stmtExprReturnArrayRvalueAsm <- renderAsm "struct S { int a[2]; }; struct S make(void); int main(void) { return ({ return 5; make(); }).a[0]; }" + nestedRvalueAsm <- renderAsm "struct T { char a; char b; char c; }; struct S { char pad; struct T t; }; struct S make(void) { struct S x; x.pad = 1; x.t.a = 2; x.t.b = 3; x.t.c = 4; return x; } int main(void) { return make().t.a; }" + threeByteRvalueAsm <- renderAsm "struct S { char a; char b; char c; }; struct S make(void) { struct S x; x.a = 1; x.b = 2; x.c = 3; return x; } int main(void) { return make().b; }" + threeByteAssignRvalueAsm <- renderAsm "int main(void) { struct S { char a; char b; char c; } x; struct S y; y.a = 1; y.b = 2; y.c = 3; return (x = y).b; }" + threeByteRegisterArgAsm <- renderAsm "struct S { char a; char b; char c; }; int sink(struct S s) { return s.b; } int main(void) { struct S x; x.a = 1; x.b = 2; x.c = 3; return sink(x); }" + threeByteStackArgAsm <- renderAsm "struct S { char a; char b; char c; }; int sink(int a, int b, int c, int d, int e, int f, struct S s) { return s.b; } int main(void) { struct S x; x.a = 1; x.b = 2; x.c = 3; return sink(1, 2, 3, 4, 5, 6, x); }" + threeByteFirstRegisterArgWithTrailingScalarsAsm <- renderAsm "struct S { char a; char b; char c; }; int sink(struct S s, int x, int y) { return s.b + x + y; } int main(void) { struct S x; x.a = 1; x.b = 2; x.c = 3; return sink(x, 4, 5); }" + threeByteSecondRegisterArgWithTrailingScalarAsm <- renderAsm "struct S { char a; char b; char c; }; int sink(int x, struct S s, int y) { return s.b + x + y; } int main(void) { struct S x; x.a = 1; x.b = 2; x.c = 3; return sink(4, x, 5); }" + sizeofLargeByValueCallAsm <- renderAsm "struct S { int a; int b; int c; }; int sink(struct S x); int main(void) { struct S x; return sizeof(sink(x)); }" + largeStructControlFlowDeclAsm <- renderAsm "int main(void) { if (1) { struct S { int a; int b; int c; } x; } while (0) { struct T { int a; int b; int c; } y; } for (; 0; ) { struct U { int a; int b; int c; } z; } return 0; }" + largePointerMemberAsm <- renderAsm "struct S { int a; int b; int c; }; struct S *p; int main(void) { return p->c; }" + explicitDerefLargePointerMemberAsm <- renderAsm "struct S { int a; int b; int c; }; struct S *p; int main(void) { return (*p).c; }" + largeAddrDerefAsm <- renderAsm "struct S { int a; int b; int c; }; struct S *p; int main(void) { return (int)&*p; }" + let + rejectsParser source = + isLeft + ( runParser parser "" source + :: Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs Integer, GlobalVars Integer, Literals Integer, PF.Functions Integer) + ) + firstRegisterArgWithTrailingScalarsSection = + extractFunctionSection "sink" threeByteFirstRegisterArgWithTrailingScalarsAsm + secondRegisterArgWithTrailingScalarSection = + extractFunctionSection "sink" threeByteSecondRegisterArgWithTrailingScalarAsm + stmtExprReturnArrayRvalueMainSection = + extractFunctionSection "main" stmtExprReturnArrayRvalueAsm + clobbersAbiArgScratch segment = + any (`T.isInfixOf` segment) ["mov rdx, rax", "lea rsi"] + assertContainsInOrder + "direct member access should lower through the member-address codegen path" + [ "lea rax" + , "add rax, 4" + , "push 2" + , "mov [rax], edi" + ] + directAsm + assertContainsInOrder + "pointer member access should lower through the member-address codegen path" + [ "mov rax, [rax]" + , "add rax, 4" + , "push 42" + , "mov [rax], edi" + ] + pointerAsm + assertContainsInOrder + "large struct pointer member access should revalidate as address calculation, not a struct value load" + [ "main:" + , "mov rax, [rax]" + , "add rax, 8" + , "movsxd rax, dword ptr [rax]" + ] + largePointerMemberAsm + assertContainsInOrder + "explicit dereference member access should revalidate as address calculation, not a struct value load" + [ "main:" + , "mov rax, [rax]" + , "add rax, 8" + , "movsxd rax, dword ptr [rax]" + ] + explicitDerefLargePointerMemberAsm + assertContainsInOrder + "address-of dereference should revalidate as address calculation, not a struct value load" + [ "main:" + , "mov rax, [rax]" + , "push rax" + , "jmp .L.return.main" + ] + largeAddrDerefAsm + assertContainsInOrder + "sizeof member access on struct rvalues should use only the member type" + [ "main:" + , "push 4" + , "jmp .L.return.main" + ] + sizeofRvalueAsm + assertContainsInOrder + "member access on function-call struct rvalues should read from spilled return bytes" + [ "call make" + , "push rbx" + , "mov rbx, rsp" + , "lea rax, [rbx+4]" + , "movsxd rax, dword ptr [rax]" + , "add rsp, 8" + , "pop rbx" + ] + callRvalueAsm + assertContainsInOrder + "member access on assignment struct rvalues should read from spilled assignment bytes" + [ "push rbx" + , "mov rbx, rsp" + , "lea rax, [rbx+4]" + , "movsxd rax, dword ptr [rax]" + , "add rsp, 8" + , "pop rbx" + , "jmp .L.return.main" + ] + assignRvalueAsm + assertContainsInOrder + "array member access on function-call struct rvalues should index into spilled return bytes" + [ "push 1" + , "call make" + , "mov rax, [rsp+16]" + , "imul rax, 4" + , "add rax, rbx" + , "movsxd rax, dword ptr [rax]" + , "add rsp, 8" + ] + arrayRvalueAsm + assertContainsInOrder + "unary dereference of array members on function-call struct rvalues should read the first spilled element" + [ "push 0" + , "call make" + , "mov rax, [rsp+16]" + , "imul rax, 4" + , "add rax, rbx" + , "movsxd rax, dword ptr [rax]" + , "add rsp, 8" + ] + arrayRvalueDerefAsm + assertContainsInOrder + "unary dereference of array-member pointer arithmetic on function-call struct rvalues should index into spilled return bytes" + [ "push 1" + , "call make" + , "mov rax, [rsp+16]" + , "imul rax, 4" + , "add rax, rbx" + , "movsxd rax, dword ptr [rax]" + , "add rsp, 8" + ] + arrayRvalueDerefAddAsm + assertContainsInOrder + "postfix subscript of array-member pointer arithmetic on function-call struct rvalues should index into spilled return bytes" + [ "push 1" + , "push 0" + , "add rax, rdi" + , "call make" + , "mov rax, [rsp+16]" + , "imul rax, 4" + , "add rax, rbx" + , "movsxd rax, dword ptr [rax]" + , "add rsp, 8" + ] + arrayRvaluePostfixAddAsm + assertContainsInOrder + "unary dereference of array-member pointer subtraction on function-call struct rvalues should index into spilled return bytes" + [ "push 0" + , "push 0" + , "sub rax, rdi" + , "call make" + , "mov rax, [rsp+16]" + , "imul rax, 4" + , "add rax, rbx" + , "movsxd rax, dword ptr [rax]" + , "add rsp, 8" + ] + arrayRvalueDerefSubAsm + assertContainsInOrder + "array member access on assignment struct rvalues should index into spilled assignment bytes" + [ "push 1" + , "mov rax, [rsp+16]" + , "imul rax, 4" + , "add rax, rbx" + , "movsxd rax, dword ptr [rax]" + , "add rsp, 8" + , "jmp .L.return.main" + ] + assignArrayRvalueAsm + assertContainsInOrder + "array member access should allow statement-expression returns in rvalue bases" + [ "main:" + , "push 0" + , "push 5" + , "jmp .L.return.main" + ] + stmtExprReturnArrayRvalueMainSection + assertContainsInOrder + "nested aggregate member access on struct rvalues should load only the final scalar member" + [ "call make" + , "lea rax, [rbx+1]" + , "movsx rax, byte ptr [rax]" + , "add rsp, 8" + ] + nestedRvalueAsm + assertBool + "nested aggregate member access should not issue an 8-byte load from the subobject address" + (not $ "mov rax, [rbx+1]" `T.isInfixOf` nestedRvalueAsm) + assertContainsInOrder + "three-byte struct returns should be packed without an 8-byte object load" + [ "make:" + , "mov cx, word ptr [rsi]" + , "lea rsi, [rdx+2]" + , "mov cl, byte ptr [rsi]" + , "shl rcx, 16" + , "main:" + , "lea rax, [rbx+1]" + , "movsx rax, byte ptr [rax]" + ] + threeByteRvalueAsm + assertBool + "three-byte struct returns should not issue an 8-byte local object load" + (not $ "mov rax, [rax]" `T.isInfixOf` threeByteRvalueAsm) + assertContainsInOrder + "three-byte struct assignments should store only object bytes" + [ "mov word ptr [rsi], dx" + , "sar rdx, 16" + , "lea rsi, [rax+2]" + , "mov byte ptr [rsi], dl" + , "lea rax, [rbx+1]" + , "movsx rax, byte ptr [rax]" + ] + threeByteAssignRvalueAsm + assertContainsInOrder + "three-byte register-passed struct parameters should spill only object bytes" + [ "sink:" + , "mov r10, rdi" + , "mov r11, r10" + , "mov word ptr [rax], r11w" + , "sar r11, 16" + , "mov byte ptr [rax], r11b" + , "lea rax" + , "add rax, 1" + , "movsx rax, byte ptr [rax]" + ] + threeByteRegisterArgAsm + assertContainsInOrder + "three-byte stack-passed struct parameters should spill only object bytes" + [ "sink:" + , "mov rax, [rbp+16]" + , "mov r10, rax" + , "mov r11, r10" + , "mov word ptr [rax], r11w" + , "sar r11, 16" + , "mov byte ptr [rax], r11b" + , "lea rax" + , "add rax, 1" + , "movsx rax, byte ptr [rax]" + ] + threeByteStackArgAsm + assertContainsInOrder + "aggregate parameter spills should preserve following integer argument registers" + [ "sink:" + , "mov r10, rdi" + , "mov r11, r10" + , "mov word ptr [rax], r11w" + , "sar r11, 16" + , "mov byte ptr [rax], r11b" + , "mov [rbp-8], esi" + , "mov [rbp-12], edx" + ] + threeByteFirstRegisterArgWithTrailingScalarsAsm + assertContainsInOrder + "aggregate parameter spills should preserve following registers when the aggregate is not first" + [ "sink:" + , "mov [rbp-4], edi" + , "mov r10, rsi" + , "mov r11, r10" + , "mov word ptr [rax], r11w" + , "sar r11, 16" + , "mov byte ptr [rax], r11b" + , "mov [rbp-12], edx" + ] + threeByteSecondRegisterArgWithTrailingScalarAsm + assertBool + "aggregate parameter spill should not use rdx/rsi as scratch before scalar args are saved" + (not $ clobbersAbiArgScratch firstRegisterArgWithTrailingScalarsSection) + assertBool + "aggregate parameter spill should not use rdx/rsi as scratch for a later register aggregate" + (not $ clobbersAbiArgScratch secondRegisterArgWithTrailingScalarSection) + assertContainsInOrder + "sizeof call arguments should not re-enable evaluated backend-only aggregate checks" + [ "main:" + , "push 4" + , "jmp .L.return.main" + ] + sizeofLargeByValueCallAsm + assertContainsInOrder + "large struct declarations in control-flow bodies should not be revalidated as value reads" + [ "main:" + , "push 0" + , "jmp .L.return.main" + ] + largeStructControlFlowDeclAsm + assertBool + "escaping statement-expression control flow in rvalue array member indexes should be rejected before codegen" + (rejectsParser "struct S { int a[2]; }; struct S make(void) { struct S x; x.a[0] = 3; x.a[1] = 7; return x; } int main(void) { for (;;) { make().a[({ continue; 0; })]; } return 0; }") + assertBool + "chained rvalue array member bases on large structs should be rejected before codegen" + (rejectsParser "struct S { int pad; int a[2]; }; struct S make(void); int main(void) { return (make().a + 1)[0]; }") + assertBool + "nested rvalue array member bases on large structs should be rejected before codegen" + (rejectsParser "struct S { int pad; int a[2]; }; struct S make(void); int main(void) { return *((make().a + 1) + 0); }") + +globalInitializerStmtExprArrayDecaySizeofTest :: Test +globalInitializerStmtExprArrayDecaySizeofTest = TestLabel "Asm.Output.global-initializer-stmt-expr-array-decay-sizeof" $ TestCase $ do + asm <- renderAsm "int x[4]; int y = sizeof(({ x; })); int main(void) { return y == 8; }" + assertContains + "statement expressions preserve their decayed result type during normalization" + [ "x:" + , ".zero 16" + , "y:" + , ".4byte 8" + ] + asm + +globalInitializerModRemainderTest :: Test +globalInitializerModRemainderTest = TestLabel "Asm.Output.global-initializer-mod-remainder" $ TestCase $ do + asm <- renderAsm "int g = -5 % 2; int h = 5 % -2; int main(void) { return g == -1 && h == 1; }" + assertBool + "global initializer modulo folding follows C remainder semantics" + (all (`T.isInfixOf` asm) + [ "g:\n\t.4byte -1" + , "h:\n\t.4byte 1" + ] + ) + +globalInitializerDivTruncationTest :: Test +globalInitializerDivTruncationTest = TestLabel "Asm.Output.global-initializer-div-truncation" $ TestCase $ do + asm <- renderAsm "int g = -5 / 2; int h = 5 / -2; char *p = \"abc\" + (-5 / 2 + 3); int main(void) { return g == -2 && h == -2 && p[0] == 'b'; }" + assertBool + "global initializer division folding truncates toward zero for integers and reloc addends" + (all (`T.isInfixOf` asm) + [ "g:\n\t.4byte -2" + , "h:\n\t.4byte -2" + , "p:\n\t.quad .L.data.0+1" + ] + ) + +normalizeAsmInputPreservesOperatorTypesTest :: Test +normalizeAsmInputPreservesOperatorTypesTest = TestLabel "Asm.Output.normalize-input-preserves-operator-types" $ TestCase $ do + let intTy :: CT.StorageClass Integer + intTy = CT.SCAuto CT.CTInt + lit n = ATNode (ATNum n) intTy ATEmpty ATEmpty + lessNode = ATNode ATLT intTy (lit 1) (lit 2) + bitNode = ATNode ATAnd intTy (lit 1) (lit 2) + case normalizeAsmInput [lessNode, bitNode] Map.empty of + Left err -> + assertFailure err + Right ([normalizedLess, normalizedBit], _) -> do + assertEqual + "normalization should preserve existing comparison-node types when no rewritten global changes them" + intTy + (atype normalizedLess) + assertEqual + "normalization should preserve existing bitwise-node types when no rewritten global changes them" + intTy + (atype normalizedBit) + Right _ -> + assertFailure "internal test error: normalizeAsmInput returned an unexpected AST shape" + +assertPrepareAsmInputError :: String -> T.Text -> String -> IO () +assertPrepareAsmInputError label source expected = do + (asts, gvars, _, funcs) <- parseAsmSource source + case prepareAsmInput funcs asts gvars of + Left err -> + assertEqual label expected err + Right _ -> + assertFailure $ label <> ": expected asm input preparation failure" + +assertPrepareAsmInputOk :: String -> T.Text -> IO () +assertPrepareAsmInputOk label source = do + (asts, gvars, _, funcs) <- parseAsmSource source + case prepareAsmInput funcs asts gvars of + Left err -> + assertFailure $ label <> ": unexpected asm input preparation failure: " <> err + Right _ -> + pure () + +assertPrepareVisualizableInputError :: String -> T.Text -> String -> IO () +assertPrepareVisualizableInputError label source expected = do + (asts, gvars, _, funcs) <- parseAsmSource source + case prepareVisualizableInput funcs asts gvars of + Left err -> + assertEqual label expected err + Right _ -> + assertFailure $ label <> ": expected visualizable input preparation failure" + +assertPrepareAstInputError :: String -> PF.Functions Integer -> ASTs Integer -> String -> IO () +assertPrepareAstInputError label funcs asts expected = + case prepareAsmInput funcs asts Map.empty of + Left err -> + assertEqual label expected err + Right _ -> + assertFailure $ label <> ": expected asm input preparation failure" + +functionCallRefinementRevalidationTest :: Test +functionCallRefinementRevalidationTest = TestLabel "Asm.Output.function-call-refinement-revalidation" $ TestCase $ + assertPrepareAsmInputError + "asm input preparation should revalidate refined function calls before codegen" + "int f(); int main(void) { return f(1); } int f(void) { return 1; }" + "too many arguments to function call" + +completedStructParamRevalidationTest :: Test +completedStructParamRevalidationTest = TestLabel "Asm.Output.completed-struct-param-revalidation" $ TestCase $ + assertPrepareAsmInputOk + "asm input preparation should keep completed by-value struct parameters during call revalidation" + "struct S; int sink(struct S); struct S { int a; }; int main(void) { struct S x; return sink(x); }" + +noArgIndirectCalleeControlFlowRevalidationTest :: Test +noArgIndirectCalleeControlFlowRevalidationTest = TestLabel "Asm.Output.no-arg-indirect-callee-control-flow-revalidation" $ TestCase $ + assertPrepareAsmInputOk + "asm input preparation should allow statement-expression return control flow in no-argument indirect callees" + "int sink(void) { return 1; } int main(void) { int (*fp)(void) = sink; ({ return 2; fp; })(); return 0; }" + +completedStructReturnRevalidationTest :: Test +completedStructReturnRevalidationTest = TestLabel "Asm.Output.completed-struct-return-revalidation" $ TestCase $ do + assertPrepareAsmInputOk + "asm input preparation should preserve completed direct-call struct returns" + "struct S foo(void); struct S { int a; }; int main(void) { foo(); return 0; }" + assertPrepareAsmInputOk + "asm input preparation should preserve completed direct-call struct returns for member access" + "struct S foo(void); struct S { int a; }; int main(void) { return foo().a; }" + +deferredLargeStructFunctionValueRevalidationTest :: Test +deferredLargeStructFunctionValueRevalidationTest = TestLabel "Asm.Output.deferred-large-struct-function-value-revalidation" $ TestCase $ + assertBool + "function-returned incomplete struct values should be rejected before codegen" + (isLeft + (runParser parser "" + "struct S f(); int main(void) { f(); return 0; } struct S { int a; int b; int c; };" + :: Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs Integer, GlobalVars Integer, Literals Integer, PF.Functions Integer) + ) + ) + +deferredLargeStructGlobalValueRevalidationTest :: Test +deferredLargeStructGlobalValueRevalidationTest = TestLabel "Asm.Output.deferred-large-struct-global-value-revalidation" $ TestCase $ + assertBool + "incomplete struct global values should be rejected before codegen" + (isLeft + (runParser parser "" + "extern struct S x; int main(void) { x; return 0; } struct S { int a; int b; int c; };" + :: Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs Integer, GlobalVars Integer, Literals Integer, PF.Functions Integer) + ) + ) + +addressContextCallArgumentRevalidationTest :: Test +addressContextCallArgumentRevalidationTest = TestLabel "Asm.Output.address-context-call-argument-revalidation" $ TestCase $ do + let + intTy = CT.SCAuto CT.CTInt + intArrayKind = CT.CTArray 2 CT.CTInt + largeStructKind = + CT.CTStruct $ + Map.fromList + [ ("a", CT.StructMember CT.CTInt 0) + , ("b", CT.StructMember CT.CTInt 4) + , ("c", CT.StructMember CT.CTInt 8) + ] + arrayStructKind = + CT.CTStruct $ + Map.fromList + [ ("a", CT.StructMember intArrayKind 0) + ] + largeStructTy = CT.SCAuto largeStructKind + largeStructPtrTy = CT.SCAuto $ CT.CTPtr largeStructKind + arrayStructTy = CT.SCAuto arrayStructKind + intArrayTy = CT.SCAuto intArrayKind + intPtrTy = CT.SCAuto $ CT.CTPtr CT.CTInt + num n = ATNode (ATNum n) intTy ATEmpty ATEmpty + largeArg = ATNode (ATGVar largeStructTy "x") largeStructTy ATEmpty ATEmpty + cMember = CT.StructMember CT.CTInt 8 + arrayMember = CT.StructMember intArrayKind 0 + function ty = + PF.Function + { PF.fntype = ty + , PF.fnDefined = False + , PF.fnImplicit = False + , PF.fnNestDepth = 0 + } + funcs = + Map.fromList + [ ("getMemberBase", function $ CT.SCAuto $ CT.CTFunc (CT.CTPtr largeStructKind) [(largeStructKind, Nothing)]) + , ("getAddressBase", function $ CT.SCAuto $ CT.CTFunc (CT.CTPtr CT.CTInt) [(largeStructKind, Nothing)]) + , ("makeLargeStruct", function $ CT.SCAuto $ CT.CTFunc largeStructKind []) + , ("makeArrayStruct", function $ CT.SCAuto $ CT.CTFunc arrayStructKind []) + ] + memberBaseCall = ATNode (ATCallFunc "getMemberBase" (Just [largeArg])) largeStructPtrTy ATEmpty ATEmpty + memberExpr = + ATNode + (ATMemberAcc cMember) + intTy + (ATNode ATDeref largeStructTy memberBaseCall ATEmpty) + ATEmpty + addressBaseCall = ATNode (ATCallFunc "getAddressBase" (Just [largeArg])) intPtrTy ATEmpty ATEmpty + addressExpr = + ATNode + ATAddr + intPtrTy + (ATNode ATDeref intTy addressBaseCall ATEmpty) + ATEmpty + scalarMemberAddressExpr = + ATNode + ATAddr + intPtrTy + (ATNode (ATMemberAcc cMember) intTy (ATNode (ATCallFunc "makeLargeStruct" Nothing) largeStructTy ATEmpty ATEmpty) ATEmpty) + ATEmpty + arrayMemberExpr = + ATNode + (ATMemberAcc arrayMember) + intArrayTy + (ATNode (ATCallFunc "makeArrayStruct" Nothing) arrayStructTy ATEmpty ATEmpty) + ATEmpty + arrayMemberAddressExpr = + ATNode + ATAddr + (CT.SCAuto $ CT.CTPtr intArrayKind) + arrayMemberExpr + ATEmpty + pointerArithmeticExpr = + ATNode + ATAddr + intPtrTy + (ATNode ATDeref intTy (ATNode ATAddPtr intPtrTy arrayMemberExpr (num 1)) ATEmpty) + ATEmpty + nonPointerDerefExpr = + ATNode + ATAddr + intPtrTy + (ATNode ATDeref intTy (num 1) ATEmpty) + ATEmpty + assertPrepareAstInputError + "member-access address context should still validate evaluated call arguments" + funcs + [ATNode ATReturn intTy memberExpr ATEmpty] + "unsupported non-addressable array member decay" + assertPrepareAstInputError + "address-of context should still validate evaluated call arguments" + funcs + [ATNode ATExprStmt intPtrTy addressExpr ATEmpty] + "unsupported non-addressable array member decay" + assertPrepareAstInputError + "address-of dereference should still validate pointer-arithmetic operands" + funcs + [ATNode ATExprStmt intPtrTy pointerArithmeticExpr ATEmpty] + "lvalue required as unary '&' operand" + assertPrepareAstInputError + "address-of direct scalar member access should reject struct rvalue bases" + funcs + [ATNode ATExprStmt intPtrTy scalarMemberAddressExpr ATEmpty] + "lvalue required as unary '&' operand" + assertPrepareAstInputError + "address-of direct array member access should reject struct rvalue bases" + funcs + [ATNode ATExprStmt intPtrTy arrayMemberAddressExpr ATEmpty] + "lvalue required as unary '&' operand" + assertPrepareAstInputError + "address-of dereference should reject non-dereferenceable operands" + funcs + [ATNode ATExprStmt intPtrTy nonPointerDerefExpr ATEmpty] + "lvalue required as unary '&' operand" + +unevaluatedAddressOfRvalueArrayElementRevalidationTest :: Test +unevaluatedAddressOfRvalueArrayElementRevalidationTest = TestLabel "Asm.Output.unevaluated-address-of-rvalue-array-element-revalidation" $ TestCase $ do + assertPrepareAsmInputOk + "sizeof address-of array member element on struct rvalues should remain unevaluated" + "struct S { int a[2]; }; struct S make(void); int main(void) { return sizeof(&make().a[1]); }" + assertPrepareAsmInputOk + "sizeof address-of dereferenced array member on struct rvalues should remain unevaluated" + "struct S { int a[2]; }; struct S make(void); int main(void) { return sizeof(&*make().a); }" + assertPrepareAsmInputOk + "_Alignof address-of array member element on struct rvalues should remain unevaluated" + "struct S { int a[2]; }; struct S make(void); int main(void) { return _Alignof(&make().a[1]); }" + assertPrepareAsmInputOk + "_Alignof address-of dereferenced array member on struct rvalues should remain unevaluated" + "struct S { int a[2]; }; struct S make(void); int main(void) { return _Alignof(&*make().a); }" + +unevaluatedAggregateReturnRevalidationTest :: Test +unevaluatedAggregateReturnRevalidationTest = TestLabel "Asm.Output.unevaluated-aggregate-return-revalidation" $ TestCase $ do + assertPrepareAsmInputOk + "sizeof statement-expression aggregate returns should remain unevaluated during revalidation" + "struct S { int a; }; int main(void) { return sizeof(({ struct S s; return s; 0; })); }" + assertPrepareAsmInputOk + "_Alignof statement-expression aggregate returns should remain unevaluated during revalidation" + "struct S { int a; }; int main(void) { return _Alignof(({ struct S s; return s; 0; })); }" + +objectPointerAssignmentRefinementRevalidationTest :: Test +objectPointerAssignmentRefinementRevalidationTest = TestLabel "Asm.Output.object-pointer-assignment-refinement-revalidation" $ TestCase $ + assertPrepareAsmInputError + "asm input preparation should reject object-pointer assignments after later tentative-array completion" + "int x[]; int main(void) { int (*p)[]; p = &x; return 0; } int x[4];" + "invalid operands to assignment" + +globalInitializerFunctionPointerRefinementRevalidationTest :: Test +globalInitializerFunctionPointerRefinementRevalidationTest = TestLabel "Asm.Output.global-initializer-function-pointer-refinement-revalidation" $ TestCase $ + assertPrepareAsmInputError + "asm input preparation should reject file-scope function-pointer initializers after later function refinement" + "int f(); int (*p)(void) = f; int f(int x) { return x; }" + "invalid initializer for scalar object" + +globalInitializerObjectPointerRefinementRevalidationTest :: Test +globalInitializerObjectPointerRefinementRevalidationTest = TestLabel "Asm.Output.global-initializer-object-pointer-refinement-revalidation" $ TestCase $ + assertPrepareAsmInputError + "asm input preparation should reject file-scope object-pointer initializers after later tentative-array completion" + "int x[]; int (*p)[] = &x; int main(void) { return 0; } int x[4];" + "invalid initializer for scalar object" + +globalInitializerIncompleteSizeofRevalidationTest :: Test +globalInitializerIncompleteSizeofRevalidationTest = TestLabel "Asm.Output.global-initializer-incomplete-sizeof-revalidation" $ TestCase $ do + let longTy :: CT.StorageClass Integer + longTy = CT.SCAuto $ CT.CTLong CT.CTInt + incompleteArrayTy :: CT.StorageClass Integer + incompleteArrayTy = CT.SCAuto $ CT.CTIncomplete (CT.IncompleteArray CT.CTInt) + arrayPointerTy :: CT.StorageClass Integer + arrayPointerTy = CT.SCAuto $ CT.CTPtr (CT.CTIncomplete (CT.IncompleteArray CT.CTInt)) + lhs = + ATNode + (ATLVar longTy 0) + longTy + ATEmpty + ATEmpty + rhs = + ATNode + ATSizeof + longTy + (ATNode + ATDeref + incompleteArrayTy + (ATNode (ATGVar arrayPointerTy "p") arrayPointerTy ATEmpty ATEmpty) + ATEmpty + ) + ATEmpty + initAst = + ATNode + (ATBlock [ATNode ATExprStmt longTy (ATNode ATAssign longTy lhs rhs) ATEmpty]) + longTy + ATEmpty + ATEmpty + gvars :: GlobalVars Integer + gvars = + Map.fromList + [ ("p", GVar arrayPointerTy GVarInitWithZero 0) + , ("n", GVar longTy (GVarInitWithAST initAst) 0) + ] + case prepareAsmInput Map.empty [] gvars of + Left err -> + assertEqual + "asm input preparation should reject deferred sizeof in global initializers when the operand stays incomplete" + "invalid application of 'sizeof' to incomplete type" + err + Right _ -> + assertFailure "expected asm input preparation failure" + +globalInitializerCallRefinementRevalidationTest :: Test +globalInitializerCallRefinementRevalidationTest = TestLabel "Asm.Output.global-initializer-call-refinement-revalidation" $ TestCase $ + assertPrepareAsmInputError + "asm input preparation should revalidate calls in file-scope initializer ASTs after later function refinement" + "struct S { int a; }; extern struct S g; int sink(); int x = sizeof(sink(g)); struct T { int b; }; int sink(struct T);" + "invalid argument type to function call" + +globalInitializerArrayMemberPointerArithmeticSizeofTest :: Test +globalInitializerArrayMemberPointerArithmeticSizeofTest = TestLabel "Asm.Output.global-initializer-array-member-pointer-arithmetic-sizeof" $ TestCase $ do + asm <- renderAsm "struct S { int a[3]; } s; int y = sizeof(s.a + 1); int main(void) { return y == 8; }" + assertContains + "file-scope sizeof should preserve the decayed pointer type for array-member pointer arithmetic" + [ "y:" + , ".4byte 8" + ] + asm + assertBool + "file-scope sizeof should not fold array-member pointer arithmetic as the array object size" + (not $ "y:\n\t.4byte 12" `T.isInfixOf` asm) + +functionPointerReturnRefinementRevalidationTest :: Test +functionPointerReturnRefinementRevalidationTest = TestLabel "Asm.Output.function-pointer-return-refinement-revalidation" $ TestCase $ + assertPrepareAsmInputError + "asm input preparation should reject function-pointer return expressions after later function refinement" + "int f(); int (*g(void))(void) { return f; } int f(int x) { return x; }" + "invalid return type" + +objectPointerReturnRefinementRevalidationTest :: Test +objectPointerReturnRefinementRevalidationTest = TestLabel "Asm.Output.object-pointer-return-refinement-revalidation" $ TestCase $ + assertPrepareAsmInputError + "asm input preparation should reject object-pointer return expressions after later tentative-array completion" + "int a[]; int (*f(void))[3] { return &a; } int a[4];" + "invalid return type" + +aggregateCompatibilityRevalidationTest :: Test +aggregateCompatibilityRevalidationTest = TestLabel "Asm.Output.aggregate-compatibility-revalidation" $ TestCase $ do + let + intTy = CT.SCAuto CT.CTInt + structAKind = + CT.CTStruct $ + Map.fromList + [ ("a", CT.StructMember CT.CTInt 0) + ] + structBKind = + CT.CTStruct $ + Map.fromList + [ ("b", CT.StructMember CT.CTInt 0) + ] + structATy = CT.SCAuto structAKind + structBTy = CT.SCAuto structBKind + structAFnTy = CT.SCAuto $ CT.CTFunc structAKind [] + intFnTy = CT.SCAuto $ CT.CTFunc CT.CTInt [] + boolTy = CT.SCAuto CT.CTBool + undefTy = CT.SCUndef CT.CTUndef + num n = ATNode (ATNum n) intTy ATEmpty ATEmpty + lvar ty offset = ATNode (ATLVar ty offset) ty ATEmpty ATEmpty + block stmts = ATNode (ATBlock stmts) undefTy ATEmpty ATEmpty + defFunc name fnTy stmts = ATNode (ATDefFunc name Nothing) fnTy (block stmts) ATEmpty + exprStmt expr = ATNode ATExprStmt undefTy expr ATEmpty + function ty = + PF.Function + { PF.fntype = ty + , PF.fnDefined = False + , PF.fnImplicit = False + , PF.fnNestDepth = 0 + } + refinedAggregateFuncs = Map.singleton "makeStruct" $ function structAFnTy + staleStructCall = ATNode (ATCallFunc "makeStruct" Nothing) intTy ATEmpty ATEmpty + incompatibleAssign = + ATNode + ATExprStmt + undefTy + (ATNode ATAssign structATy (lvar structATy 0) (lvar structBTy 8)) + ATEmpty + incompatibleStructReturn = + ATNode ATReturn structATy (lvar structBTy 8) ATEmpty + incompatibleScalarReturn = + ATNode ATReturn intTy (lvar structBTy 8) ATEmpty + invalidScalarAdd = + ATNode ATAdd intTy staleStructCall (num 1) + invalidScalarEquality = + ATNode ATEQ boolTy staleStructCall staleStructCall + invalidCompoundAssign = + ATNode ATAddAssign intTy (lvar intTy 0) staleStructCall + returnZero = + ATNode ATReturn intTy (num 0) ATEmpty + assertPrepareAstInputError + "asm input preparation should reject incompatible small-struct assignments" + Map.empty + [defFunc "assignBad" structAFnTy [incompatibleAssign, ATNode ATReturn structATy (lvar structATy 0) ATEmpty]] + "invalid operands to assignment" + assertPrepareAstInputError + "asm input preparation should reject incompatible small-struct returns" + Map.empty + [defFunc "returnBad" structAFnTy [incompatibleStructReturn]] + "invalid return type" + assertPrepareAstInputError + "asm input preparation should reject small-struct returns from scalar functions" + Map.empty + [defFunc "scalarReturnBad" intFnTy [incompatibleScalarReturn]] + "invalid return type" + assertPrepareAstInputError + "asm input preparation should reject refreshed small-struct arithmetic operands" + refinedAggregateFuncs + [defFunc "addOperandBad" intFnTy [exprStmt invalidScalarAdd, returnZero]] + "invalid operands" + assertPrepareAstInputError + "asm input preparation should reject refreshed small-struct scalar operands" + refinedAggregateFuncs + [defFunc "equalityOperandBad" intFnTy [exprStmt invalidScalarEquality, returnZero]] + "invalid operands" + assertPrepareAstInputError + "asm input preparation should reject refreshed small-struct compound assignment operands" + refinedAggregateFuncs + [defFunc "compoundOperandBad" intFnTy [exprStmt invalidCompoundAssign, returnZero]] + "invalid operands to assignment" + +pointerIncDecRefinementRevalidationTest :: Test +pointerIncDecRefinementRevalidationTest = TestLabel "Asm.Output.pointer-inc-dec-refinement-revalidation" $ TestCase $ + assertBool + "same-input pointer ++/-- should reject pointer-to-array redeclarations that refine pointee bounds" + (isLeft + (runParser parser "" + "int (*p)[]; int main(void) { ++p; p++; --p; p--; return 0; } int (*p)[4];" + :: Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs Integer, GlobalVars Integer, Literals Integer, PF.Functions Integer) + ) + ) + +pointerAddSubAssignRefinementRevalidationTest :: Test +pointerAddSubAssignRefinementRevalidationTest = TestLabel "Asm.Output.pointer-add-sub-assign-refinement-revalidation" $ TestCase $ + assertBool + "same-input pointer compound assignments should reject pointer-to-array redeclarations that refine pointee bounds" + (isLeft + (runParser parser "" + "int (*p)[]; int main(void) { p += 1; p -= 1; return 0; } int (*p)[4];" + :: Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs Integer, GlobalVars Integer, Literals Integer, PF.Functions Integer) + ) + ) + +pointerIncDecIncompleteRevalidationFailureTest :: Test +pointerIncDecIncompleteRevalidationFailureTest = TestLabel "Asm.Output.pointer-inc-dec-incomplete-revalidation-failure" $ TestCase $ + assertPrepareAsmInputError + "asm input preparation should reject deferred ++/-- on pointers that stay incomplete" + "int (*f(void))[]; int main(void) { int (*p)[] = f(); ++p; return 0; }" + "invalid use of pointer to incomplete type" + +pointerAddSubAssignIncompleteRevalidationFailureTest :: Test +pointerAddSubAssignIncompleteRevalidationFailureTest = TestLabel "Asm.Output.pointer-add-sub-assign-incomplete-revalidation-failure" $ TestCase $ + assertPrepareAsmInputError + "asm input preparation should reject deferred +=/-= on pointers that stay incomplete" + "int (*f(void))[]; int main(void) { int (*p)[] = f(); p += 1; return 0; }" + "invalid use of pointer to incomplete type" + +visualizableInputAcceptsAsmNormalizationFailureTest :: Test +visualizableInputAcceptsAsmNormalizationFailureTest = + TestLabel "Visualizer.prepare-input-accepts-asm-normalization-failure" $ TestCase $ do + let source = + "int (*f(void))[]; int main(void) { int (*p)[] = f(); ++p; return 0; }" + (asts, gvars, _, funcs) <- parseAsmSource source + case prepareVisualizableInput funcs asts gvars of + Left err -> + assertFailure $ + "visualizable input preparation should keep parseable incomplete-pointer arithmetic renderable: " + <> err + Right (preparedAsts, _) -> do + svg <- renderVisualizationFromAsts preparedAsts + assertBool + "visualizable input preparation should still produce a renderable AST" + ("" + "int x[] = { sizeof x, 0 };" + :: Either (M.ParseErrorBundle T.Text Void) (Warnings, ASTs Integer, GlobalVars Integer, Literals Integer, PF.Functions Integer) + ) + ) + +visualizerSizeofExprTest :: Test +visualizerSizeofExprTest = TestLabel "Visualizer.sizeof-expr" $ TestCase $ do + svg <- renderVisualization "int main(void) { int x; return sizeof x; }" + assertBool + "visualizer renders expression-form sizeof nodes" + ("sizeof" `T.isInfixOf` svg) + +visualizerFunctionDesignatorTest :: Test +visualizerFunctionDesignatorTest = TestLabel "Visualizer.function-designator" $ TestCase $ do + svg <- renderVisualization "int foo(void) { return 1; } int (*fp)(void) = &foo;" + assertBool + "visualizer renders bare function designators emitted as ATFuncPtr nodes" + ("