Skip to content

feat: Add timing and QUIET mode to Makefile#121

Merged
angerman merged 1 commit into
stable-ghc-9.14from
feature/timing
Feb 28, 2026
Merged

feat: Add timing and QUIET mode to Makefile#121
angerman merged 1 commit into
stable-ghc-9.14from
feature/timing

Conversation

@angerman

@angerman angerman commented Dec 3, 2025

Copy link
Copy Markdown

Summary

  • Add timing information for build phases (cabal, stage1, stage2, bindist, test)
  • Add QUIET=1 mode to suppress output unless there's an error
  • Show timing summary at end of build (even on failure)
  • Save timing data to _build/timing/summary.txt
  • Enable QUIET=1 in devx CI workflow

Usage

make                  # Normal build with timing summary at end
make QUIET=1          # Quiet build - output only on error
make timing-summary   # Show timing summary only

Example Output

+------------------+---------------+--------+
| Phase            | Duration      | Status |
+------------------+---------------+--------+
| cabal            |        2m 15s | OK     |
| stage1           |       45m 10s | OK     |
| stage2           |     1h 24m 5s | OK     |
| bindist          |        1m 12s | OK     |
+------------------+---------------+--------+
| TOTAL            |    2h 12m 42s |        |
+------------------+---------------+--------+

Test plan

  • Run make and verify timing summary appears at end
  • Run make QUIET=1 and verify output is suppressed
  • Verify logs are stored in _build/logs/
  • Verify timing data is saved to _build/timing/summary.txt

@angerman

angerman commented Dec 3, 2025

Copy link
Copy Markdown
Author

This should only be merged after #117 is merged. But before #118 is merged. So that we can get the timing comparison between non, and minimal stage1.

@angerman

Copy link
Copy Markdown
Author

Yikes, so many conflicts.

@angerman angerman force-pushed the feature/timing branch 3 times, most recently from c94ded3 to c3550a8 Compare February 14, 2026 08:35
@angerman angerman requested a review from Copilot February 26, 2026 00:44

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds build-phase timing instrumentation and a QUIET mode, plus optional CPU/memory metrics collection and plotting, and wires quiet/timing/metrics into CI.

Changes:

  • Introduces timing collection and a timing-summary target for phase duration/status reporting.
  • Adds QUIET=1 and metrics collection/plotting helpers (bash + Python) for CI observability.
  • Updates CI workflow to run builds/tests in quiet mode and upload logs/timing/metrics artifacts.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
mk/timing-summary.sh Generates a timing table and writes _build/timing/summary.txt from phase timestamp/status files.
mk/run-phase.sh Adds a wrapper to time phases and capture logs in quiet mode.
mk/collect-metrics.sh Background CPU/memory sampler writing _build/metrics/metrics.csv.
mk/plot-metrics.py Produces SVG charts overlaying metrics with phase timing regions.
Makefile Adds QUIET/metrics/timing variables, timing macros, and make targets for summaries/metrics.
.github/workflows/ci.yml Enables QUIET mode in CI, collects metrics, uploads artifacts, and writes a summary.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread Makefile
Comment on lines +238 to +242
ifeq ($(QUIET),1)
LOG = @:
LOG_GROUP_START = @:
LOG_GROUP_END = @:
endif

Copilot AI Feb 26, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QUIET=1 currently only disables the LOG* helper macros, but it won’t suppress normal recipe command echoing (most recipe lines are not @-prefixed). This doesn’t meet the stated goal of 'suppress output unless there’s an error'. Consider enabling Make’s silent mode when QUIET=1 (e.g., set MAKEFLAGS += --silent / .SILENT: under the conditional), and then explicitly echo/print only for error reporting.

Copilot uses AI. Check for mistakes.
Comment thread Makefile
Comment on lines 682 to +693
$(STAGE2_CABAL_BUILD) rts
$(call PHASE_END_OK,stage2.rts)

$(call PHASE_START,stage2.executables)
$(call LOG,Building executables $(STAGE2_EXECUTABLES))
$(STAGE2_CABAL_BUILD) $(addprefix exe:,$(STAGE2_EXECUTABLES))
$(call PHASE_END_OK,stage2.executables)

$(call PHASE_START,stage2.libraries)
$(call LOG,Building libraries $(filter-out rts%,$(STAGE2_LIBRARIES)))
$(STAGE2_CABAL_BUILD) $(filter-out rts%,$(STAGE2_LIBRARIES))
$(call PHASE_END_OK,stage2.libraries)

Copilot AI Feb 26, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Phase timing/status files are only finalized on success (PHASE_END_OK). If the build command fails, .end and .status won’t be recorded for that phase, and mk/timing-summary.sh will omit the phase entirely (it discovers only phases with both .start and .end). To support 'timing summary at end (even on failure)', wrap phase commands so failures also write .end and a failure status (e.g., a PHASE_END_FAIL macro, or a common wrapper that records end+status in both branches).

Suggested change
$(STAGE2_CABAL_BUILD) rts
$(call PHASE_END_OK,stage2.rts)
$(call PHASE_START,stage2.executables)
$(call LOG,Building executables $(STAGE2_EXECUTABLES))
$(STAGE2_CABAL_BUILD) $(addprefix exe:,$(STAGE2_EXECUTABLES))
$(call PHASE_END_OK,stage2.executables)
$(call PHASE_START,stage2.libraries)
$(call LOG,Building libraries $(filter-out rts%,$(STAGE2_LIBRARIES)))
$(STAGE2_CABAL_BUILD) $(filter-out rts%,$(STAGE2_LIBRARIES))
$(call PHASE_END_OK,stage2.libraries)
set +e; \
$(STAGE2_CABAL_BUILD) rts; status=$$?; \
if [ $$status -eq 0 ]; then \
$(call PHASE_END_OK,stage2.rts); \
else \
$(call PHASE_END_FAIL,stage2.rts); \
fi; \
exit $$status
$(call PHASE_START,stage2.executables)
$(call LOG,Building executables $(STAGE2_EXECUTABLES))
set +e; \
$(STAGE2_CABAL_BUILD) $(addprefix exe:,$(STAGE2_EXECUTABLES)); status=$$?; \
if [ $$status -eq 0 ]; then \
$(call PHASE_END_OK,stage2.executables); \
else \
$(call PHASE_END_FAIL,stage2.executables); \
fi; \
exit $$status
$(call PHASE_START,stage2.libraries)
$(call LOG,Building libraries $(filter-out rts%,$(STAGE2_LIBRARIES)))
set +e; \
$(STAGE2_CABAL_BUILD) $(filter-out rts%,$(STAGE2_LIBRARIES)); status=$$?; \
if [ $$status -eq 0 ]; then \
$(call PHASE_END_OK,stage2.libraries); \
else \
$(call PHASE_END_FAIL,stage2.libraries); \
fi; \
exit $$status

Copilot uses AI. Check for mistakes.
Comment thread .github/workflows/ci.yml
Comment on lines +76 to +79
- name: Display build timings
if: ${{ !cancelled() }}
shell: devx {0}
run: make timing-summary

Copilot AI Feb 26, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In GitHub Actions, steps after a failing step are skipped unless they use if: always() .... With the current condition, the 'Display build timings' step will not run when the build fails, contradicting the PR goal of showing timing summaries even on failure. Use an always() guard (optionally combined with !cancelled()) for timing display and also for stopping metrics collection / uploading diagnostics that should run after failures.

Copilot uses AI. Check for mistakes.
Comment thread mk/run-phase.sh
# Record start time
START_TIME=$(date +%s)
echo "$START_TIME" > "$TIMING_DIR/$PHASE.start"
echo ">>> Building $PHASE..."

Copilot AI Feb 26, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script’s contract says quiet mode should 'suppress output unless error', but it always prints >>> Building ... regardless of QUIET. Gate this message on QUIET!=1 (or send it to the log in quiet mode) so quiet mode is actually quiet unless there’s a failure.

Suggested change
echo ">>> Building $PHASE..."
if [[ "$QUIET" != "1" ]]; then
echo ">>> Building $PHASE..."
fi

Copilot uses AI. Check for mistakes.
Comment thread mk/timing-summary.sh Outdated
local parent=$1
for sp in "${sub_phases[@]}"; do
# Match: parent.anything (no further dots)
if [[ "$sp" == "$parent".* ]]; then

Copilot AI Feb 26, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says sub-phases match parent.anything (no further dots), but the glob \"$parent\".* will also include nested names like parent.a.b. Either adjust the match to enforce a single-dot sub-phase (e.g., exclude \"$parent\".*.*) or update the comment to reflect the actual behavior.

Suggested change
if [[ "$sp" == "$parent".* ]]; then
if [[ "$sp" == "$parent".* && "$sp" != "$parent".*.* ]]; then

Copilot uses AI. Check for mistakes.
Comment thread mk/plot-metrics.py
Comment on lines +176 to +179
elif phase_name.startswith('stage3-'):
# Assign colors from palette based on platform name hash
idx = hash(phase_name) % len(stage3_palette)
color = stage3_palette[idx]

Copilot AI Feb 26, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using Python’s built-in hash() makes stage3 color assignment non-deterministic across runs (hash randomization), which makes comparing plots over time harder. Use a stable hash (e.g., hashlib.sha256(phase_name.encode()).digest() to derive idx) so the same platform always gets the same color.

Copilot uses AI. Check for mistakes.
Build instrumentation:
- QUIET=1 suppresses LOG/LOG_GROUP output for cleaner CI logs
- PHASE_START/PHASE_END_OK/PHASE_END_FAIL macros record timestamps
- Sub-phase granularity for stage1, stage2, and stage3 phases
- mk/run-phase.sh: standalone wrapper combining timing + quiet mode
- mk/timing-summary.sh: auto-discovers phases, displays ASCII table
  with indented sub-phases and TOTAL row
- .DEFAULT_GOAL := all to prevent timing-summary as default target

CI metrics collection and visualization:
- mk/collect-metrics.sh: background CPU/memory sampler (cross-platform)
- mk/plot-metrics.py: generates dual-axis SVG plots with phase overlays
- CI workflow: collect metrics, generate plots, upload to R2
- CI workflow: QUIET=1 for cleaner logs, timing tables in step summary
@angerman angerman merged commit 9e6780e into stable-ghc-9.14 Feb 28, 2026
33 checks passed
@angerman angerman deleted the feature/timing branch February 28, 2026 17:22
angerman added a commit that referenced this pull request Mar 1, 2026
Two issues from the QUIET mode CI run (#121):

1. **Timing overwrites:** The `test: stage2` dependency triggers the PHONY
   stage2 recipe again. Since everything is already built, the re-run
   completes in seconds, overwriting the real stage2 timings (16m 31s)
   with near-zero durations (3m 36s). Fix: guard PHASE_START/PHASE_END
   macros to skip writing if the phase already completed successfully
   (status=0). Failed phases can still be retried.

2. **Metrics plots:** The "Generate Metrics plots" CI step used
   `nix-shell -p ...` which is not available in the devx environment.
   Fix: replace `nix-shell` with `nix build --print-out-paths` to get
   the store path, then invoke the binary directly. Applied to both the
   matplotlib plot generation and the awscli R2 upload steps.
angerman added a commit that referenced this pull request Mar 1, 2026
Two issues from the QUIET mode CI run (#121):

1. **Timing overwrites:** The `test: stage2` dependency triggers the PHONY
   stage2 recipe again. Since everything is already built, the re-run
   completes in seconds, overwriting the real stage2 timings (16m 31s)
   with near-zero durations (3m 36s). Fix: guard PHASE_START/PHASE_END
   macros to skip writing if the phase already completed successfully
   (status=0). Failed phases can still be retried.

2. **Metrics plots:** The "Generate Metrics plots" CI step used
   `nix-shell -p ...` which is not available in the devx environment.
   Fix: replace `nix-shell` with `nix build --print-out-paths` to get
   the store path, then invoke the binary directly. Applied to both the
   matplotlib plot generation and the awscli R2 upload steps.

@hasufell hasufell left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a lot of LLM generated stuff.

Comment thread mk/collect-metrics.sh
@@ -0,0 +1,221 @@
#!/usr/bin/env bash

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't more portable than /bin/bash. /usr/bin/env isn't guaranteed by posix and we're now potentially executing a non-system bash.

I would suggest to write posix scripts, unless we really need bash features. It's also hard to know what bash version is on the users system, e.g. mac sometimes ships very old bash versions that lack features of modern bash.

Comment thread mk/collect-metrics.sh
echo "0"
fi
;;
*)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is windows and FreeBSD?

Comment thread mk/collect-metrics.sh
#!/usr/bin/env bash
# collect-metrics.sh - Collect CPU and memory metrics during build
#
# Usage: collect-metrics.sh start METRICS_DIR [INTERVAL]

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Usage: collect-metrics.sh start METRICS_DIR [INTERVAL]
# Usage: collect-metrics.sh start [METRICS_DIR] [INTERVAL]

METRICS_DIR appears to be optional, since it has a default.

Comment thread mk/collect-metrics.sh
OS="$(uname -s)"

# State file for CPU delta calculation
CPU_STATE_FILE=""

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this variable set in run_collector, but not here?

Comment thread mk/collect-metrics.sh
# Calculates delta between samples for accurate instantaneous usage
get_cpu_usage() {
case "$OS" in
Darwin)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For long case expressions, it's better to factor the code out into get_cpu_usage_darwin() etc. and then have a simple case expression that disambiguates by platform.

Comment thread mk/timing-summary.sh
done

# Define display order for top-level phases
ordered_top_level() {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make all the used variables p, t etc. local.

Comment thread mk/timing-summary.sh
Comment on lines +108 to +113
for t in "${top_level[@]}"; do
local found=0
for o in "${ordered[@]}"; do
[[ "$t" == "$o" ]] && found=1 && break
done
[[ $found -eq 0 ]] && ordered+=("$t")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've already written a function like this:

# @FUNCTION: array_contains
# @USAGE: <arr_ref> <val>
# @DESCRIPTION:
# Checks whether the array reference contains the given value.
# @RETURN: 0 if value exists, 1 otherwise
array_contains() {
    local -n arr=$1
    local val=$2
    shift 2
    if [[ " ${arr[*]} " =~ [[:space:]]${val}[[:space:]] ]]; then
        return 0
    else
        return 1
    fi
}

You pass it the array reference.

Comment thread mk/timing-summary.sh

# Get sub-phases for a top-level phase, sorted alphabetically
sub_phases_of() {
local parent=$1

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check function argument before use.

Comment thread mk/timing-summary.sh
sub_phases_of() {
local parent=$1
for sp in "${sub_phases[@]}"; do
# Match: parent.suffix (single level only, no nested dots)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Comment thread mk/timing-summary.sh
printf "| %-${COL_PHASE}s | %13s | |\n" "TOTAL" "$total_str"
echo "$sep"

# Save summary to file (top-level phases only, for machine consumption)

@hasufell hasufell Mar 2, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code below also saves sub-phases, so the comment seems wrong.

angerman added a commit that referenced this pull request Mar 3, 2026
collect-metrics.sh:
- Factor case branches into named helpers (get_cpu_usage_darwin,
  get_cpu_usage_linux, get_memory_usage_darwin, get_memory_usage_linux)
  with thin dispatchers for the cross-platform functions
- Initialize CPU_STATE_FILE at declaration instead of in run_collector()
- Replace all `|| echo <default>` fallbacks with `exit 1` + error on
  stderr — silent zeros hide real failures in CI
- Document /proc/stat guard (minimal containers without procfs)
- Document SIGTERM choice in stop command
- Fix usage message: METRICS_DIR is optional (brackets)
- Add bash dependency note after shebang

plot-metrics.py:
- Add 5-stage data pipeline overview comment (COLLECT → READ → SELECT
  → FILTER → PLOT)
- Add create_plot() docstring covering dual y-axes, phase shading,
  label alternation, and all arguments
- Add inline comments explaining effective cores calculation, color
  assignment logic, and alternating label positions
angerman added a commit that referenced this pull request Mar 3, 2026
collect-metrics.sh:
- Factor case branches into named helpers (get_cpu_usage_darwin,
  get_cpu_usage_linux, get_memory_usage_darwin, get_memory_usage_linux)
  with thin dispatchers for the cross-platform functions
- Initialize CPU_STATE_FILE at declaration instead of in run_collector()
- Replace all `|| echo <default>` fallbacks with `exit 1` + error on
  stderr — silent zeros hide real failures in CI
- Document /proc/stat guard (minimal containers without procfs)
- Document SIGTERM choice in stop command
- Fix usage message: METRICS_DIR is optional (brackets)
- Add bash dependency note after shebang

plot-metrics.py:
- Add 5-stage data pipeline overview comment (COLLECT → READ → SELECT
  → FILTER → PLOT)
- Add create_plot() docstring covering dual y-axes, phase shading,
  label alternation, and all arguments
- Add inline comments explaining effective cores calculation, color
  assignment logic, and alternating label positions
angerman added a commit that referenced this pull request Mar 3, 2026
collect-metrics.sh:
- Factor case branches into named helpers (get_cpu_usage_darwin,
  get_cpu_usage_linux, get_memory_usage_darwin, get_memory_usage_linux)
  with thin dispatchers for the cross-platform functions
- Initialize CPU_STATE_FILE at declaration instead of in run_collector()
- Replace all `|| echo <default>` fallbacks with `exit 1` + error on
  stderr — silent zeros hide real failures in CI
- Document /proc/stat guard (minimal containers without procfs)
- Document SIGTERM choice in stop command
- Fix usage message: METRICS_DIR is optional (brackets)
- Add bash dependency note after shebang

plot-metrics.py:
- Add 5-stage data pipeline overview comment (COLLECT → READ → SELECT
  → FILTER → PLOT)
- Add create_plot() docstring covering dual y-axes, phase shading,
  label alternation, and all arguments
- Add inline comments explaining effective cores calculation, color
  assignment logic, and alternating label positions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants