Implement automatic hashrate anomaly detection and ASIC recovery #1655

Open
cniweb wants to merge 7 commits into bitaxeorg:master from cniweb:master

cniweb commented Apr 10, 2026

This pull request introduces an automatic hashrate anomaly detection and recovery mechanism to the hashrate_monitor_task. The main goal is to improve system robustness by monitoring for abnormal hashrate drops or spikes and automatically reinitializing the ASICs if anomalies persist, thus reducing the need for manual intervention or full system reboots. The detection thresholds are dynamically computed based on hardware configuration to minimize false positives.

Hashrate anomaly detection and recovery:

  • Added a new check_hashrate_anomaly function that monitors for low or high hashrate anomalies and triggers ASIC live recovery if anomalies persist for a configurable number of consecutive polls. The function is documented in hashrate_monitor_task.h and called from the main monitoring loop before updating the highest observed hashrate. [1] [2] [3]
  • Introduced state variables and thresholds for anomaly detection, including dynamic calculation of the lower threshold based on ASIC and hash domain configuration. [1] [2] [3]

Integration and robustness improvements:

  • Ensured that after ASIC recovery, measurements are reset to avoid reporting spikes from stale counters, and UART buffers are flushed to clear stale data.
  • Added necessary includes for asic_init.h and driver/uart.h to support the new recovery logic.
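The "consecutive polls" rule described above can be sketched as a small counter. This is only an illustration, not the code in the PR: `ANOMALY_CONSECUTIVE_THRESHOLD` is named in the commits and the value 3 comes from the commit message, but `anomaly_state_t`, `anomaly_poll`, the absolute `low_limit`/`high_limit` parameters, and the choice to clear the counter while mining is paused are assumptions made for the example.

```c
#include <stdbool.h>

/* Name and value come from the PR text ("3 consecutive polls");
 * everything else in this sketch is hypothetical. */
#define ANOMALY_CONSECUTIVE_THRESHOLD 3

typedef struct {
    int consecutive_anomaly_count; /* anomalous polls seen in a row */
} anomaly_state_t;

/* Returns true when this poll should trigger a recovery attempt.
 * low_limit and high_limit are absolute hashrates in the same unit
 * as current. */
static bool anomaly_poll(anomaly_state_t *st, float current,
                         float low_limit, float high_limit,
                         bool mining_paused)
{
    /* Assumption: a paused miner is never anomalous and clears the streak. */
    if (mining_paused) {
        st->consecutive_anomaly_count = 0;
        return false;
    }

    bool anomalous = (current < low_limit) || (current > high_limit);
    if (!anomalous) {
        st->consecutive_anomaly_count = 0; /* any normal poll breaks the streak */
        return false;
    }

    st->consecutive_anomaly_count++;
    if (st->consecutive_anomaly_count >= ANOMALY_CONSECUTIVE_THRESHOLD) {
        st->consecutive_anomaly_count = 0; /* start fresh after recovery */
        return true;
    }
    return false;
}
```

With a zero-initialised `anomaly_state_t`, two low polls return false and the third returns true; a single normal poll in between resets the count.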

Copilot AI and others added 4 commits April 10, 2026 20:28
Integrates TCH-specific stability improvements: when hashrate drops below
a dynamic threshold for 3 consecutive polls, the ASICs are automatically
reinitialized using live recovery mode (ASIC_INIT_RECOVERY) without a
full system reboot. This prevents sustained hashrate loss caused by
TPS546 or chip domain failures, particularly beneficial for multi-ASIC
configurations like Bitaxe 800 (Gamma Turbo) and SupraHex.

The lower threshold is computed dynamically based on expected hashrate,
ASIC count, and hash domain count. Recovery is skipped when mining is
paused. Measurements are cleared after recovery to prevent hashrate
spikes from stale counters.

Inspired by TinyChipHub ESP-Miner-TCH stability improvements.

Agent-Logs-Url: https://github.com/cniweb/ESP-Miner/sessions/a360443a-9b9c-42db-9b77-f65bed100cbe

Co-authored-by: cniweb <2334906+cniweb@users.noreply.github.com>

- Separate low/high anomaly detection: low hashrate requires previous
  highest (prevents ramp-up false positives), high spikes detected
  independently
- Add ANOMALY_CONSECUTIVE_THRESHOLD and RECOVERY_STABILIZATION_DELAY_MS
  named constants
- Add detailed comment explaining the 2.0x domain contribution margin
  in threshold calculation

Agent-Logs-Url: https://github.com/cniweb/ESP-Miner/sessions/a360443a-9b9c-42db-9b77-f65bed100cbe

Co-authored-by: cniweb <2334906+cniweb@users.noreply.github.com>

- Rename low_hashrate_count to consecutive_anomaly_count since it tracks
  both low and high anomalies
- Move check_hashrate_anomaly() before highest_hashrate update so the
  low-anomaly guard (current < highest) correctly ignores ramp-up
- Clarify fallback threshold comment

Agent-Logs-Url: https://github.com/cniweb/ESP-Miner/sessions/a360443a-9b9c-42db-9b77-f65bed100cbe

Co-authored-by: cniweb <2334906+cniweb@users.noreply.github.com>

Add automatic hashrate anomaly detection and ASIC recovery (TCH stability improvements)
Collaborator

0xf0xx0 left a comment

claude :/

Comment thread main/tasks/hashrate_monitor_task.c Outdated
Comment on lines +273 to +277
// Track highest observed hashrate (after anomaly check)
if (current_hashrate > highest_hashrate) {
    highest_hashrate = current_hashrate;
    ESP_LOGI(TAG, "New highest hashrate: %.3f Gh/s", highest_hashrate);
}
Collaborator

why? if we're looking for abnormal hashrates we shouldn't be tracking the highest, and this isn't used anywhere

Comment thread main/tasks/hashrate_monitor_task.c Outdated
Comment on lines +216 to +227
// Compute dynamic lower hashrate threshold based on hardware configuration.
// The formula detects when hashrate drops by more than twice the contribution
// of a single hash domain on a single ASIC. Multiplying by 2.0 provides a
// margin so that losing one domain triggers detection, while normal variance
// (which is less than one full domain) does not cause false positives.
float expected_hr = GLOBAL_STATE->POWER_MANAGEMENT_MODULE.expected_hashrate;
if (expected_hr > 0.0f && asic_count > 0 && hash_domains > 0) {
    float per_domain_contribution = expected_hr / asic_count / hash_domains;
    lower_threshold_hashrate_pct = 1.0f - (per_domain_contribution * 2.0f / expected_hr);
    ESP_LOGI(TAG, "Hashrate anomaly lower threshold: %.0f%% of expected", lower_threshold_hashrate_pct * 100.0f);
}

Collaborator

why not a simpler 75% threshold? the more chips you add, the higher the threshold becomes: the gamma turbo's threshold would be 0.875 and a nerdoctaxe's would be 0.992. the gamma's threshold alone is 0.5, meaning this code actually needs 2 domains to go down before it triggers the reset.
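
To make the arithmetic in the quoted diff easier to check: `expected_hr` cancels out, so the threshold depends only on the product of ASIC count and hash domain count. The helper below is a hypothetical restatement (the name `lower_threshold_pct` and the negative sentinel are not from the PR), assuming every domain contributes equally.

```c
/* Same arithmetic as the quoted diff, with expected_hr cancelled:
 *   1 - (expected_hr / asics / domains * 2) / expected_hr
 *     = 1 - 2 / (asics * domains)
 * Hypothetical helper; returns a negative sentinel for invalid input
 * so the caller can keep whatever fallback threshold it already has. */
static float lower_threshold_pct(int asic_count, int hash_domains)
{
    if (asic_count <= 0 || hash_domains <= 0) {
        return -1.0f;
    }
    return 1.0f - 2.0f / (float)(asic_count * hash_domains);
}
```

A product of 4 (for example one ASIC with four domains, an illustrative count) gives 0.5, which matches the figure quoted for the gamma. Under the equal-contribution assumption, losing one of four domains leaves 75% of expected hashrate, which is still above 0.5, so a single lost domain would not trip the check in that configuration.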

Comment thread main/tasks/hashrate_monitor_task.c Outdated
    // Reset measurements to avoid hashrate spike from stale counters
    hashrate_monitor_reset_measurements(GLOBAL_STATE);
} else {
    ESP_LOGE(TAG, "ASIC recovery failed - chip count 0");
Collaborator

this should be a full reboot, instead of leaving the user in a worse state than before
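
One way to read this suggestion is as a two-way decision on the chip count reported after the recovery attempt. The sketch below is hypothetical (`recovery_action_t` and `after_recovery` are illustrative names, not from the PR) and keeps the decision separate from the side effects so it can be compiled off-target.

```c
#include <stdint.h>

/* Hypothetical outcome of a live recovery attempt. */
typedef enum {
    RECOVERY_OK,     /* chips detected again: keep running */
    RECOVERY_REBOOT  /* no chips detected: fall back to a full reboot */
} recovery_action_t;

/* Decide what to do next from the number of ASICs found after
 * re-initialisation. */
static recovery_action_t after_recovery(uint8_t chip_count)
{
    return (chip_count > 0) ? RECOVERY_OK : RECOVERY_REBOOT;
}
```

On the device, the `RECOVERY_OK` branch would presumably call `hashrate_monitor_reset_measurements(GLOBAL_STATE)` as the quoted diff already does, and the `RECOVERY_REBOOT` branch could log the error and call ESP-IDF's `esp_restart()`.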

Comment thread main/tasks/hashrate_monitor_task.h Outdated
Copilot AI and others added 3 commits April 11, 2026 21:32
…implify warmup logic, clean up comments

Agent-Logs-Url: https://github.com/cniweb/ESP-Miner/sessions/0ee049d8-a3ca-4b05-8d8c-6c13556aa397

Co-authored-by: cniweb <2334906+cniweb@users.noreply.github.com>
Fix hashrate anomaly detection: simplify thresholds, add reboot fallback
Collaborator

mutatrum commented May 4, 2026

This is overly complex. Every case of diminished hashrate happens because one or more domains of an ASIC drop out, e.g. the hashcounter register stops changing. This can be detected without any heuristics.

The problem, however, is that this is almost always a sign of a crappy power supply. If the device is powered by a decent power supply, this never happens, so simply restarting the ASIC hides the problem. Maybe we should first add a notification message when one or more of the domains drop out, so the user can try to adjust frequency or voltage to remedy the issue.
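
A heuristic-free check along these lines could compare two snapshots of the per-domain hash counters taken on consecutive polls. The sketch below is an assumption-laden illustration: how the counters are read from the ASIC, their width, and the name `stalled_domains` are all hypothetical and not taken from the PR or the firmware.

```c
#include <stddef.h>
#include <stdint.h>

/* Compare per-domain hash counter snapshots from two consecutive polls
 * and return a bitmask of domains whose counter did not move.
 * Bit i set means domain i looks stalled. Only the first 32 domains
 * are examined. */
static uint32_t stalled_domains(const uint32_t *prev, const uint32_t *curr,
                                size_t domain_count)
{
    uint32_t mask = 0;
    for (size_t i = 0; i < domain_count && i < 32; i++) {
        if (curr[i] == prev[i]) {
            mask |= (uint32_t)1 << i;
        }
    }
    return mask;
}
```

A non-zero mask could drive the user notification proposed above instead of, or before, any automatic ASIC restart.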
