Enhance scoring engine with context-aware risk evaluation and multi-package scanning by standwlkdljea · Pull Request #18 · Sohimaster/traur

standwlkdljea · 2026-06-13T14:00:10Z

Summary

Overhauls the scoring engine from a simple weighted-signal model to a context-aware 9-step pipeline, adds multi-package scanning, fixes an AUR comment HTML parsing bug, and introduces NPM package inspection.

Changes

Bug fix: AUR comment HTML regex

The regex for extracting comments from AUR package pages used a loose <div[^>]*\bclass="article-content"[^>]*> pattern that failed to match when id="comment-N-content" appeared before class="article-content" in the HTML. The parser now uses two targeted regexes, one for comment dates (<h4 class="comment-header">) and one for comment bodies (<div id="comment-N-content" class="article-content">), and pairs the results by numeric comment ID.

Multi-package scanning

traur scan now accepts multiple package names as arguments:

traur scan pkg1 pkg2 pkg3

Context-aware scoring pipeline (replaces simple weighted average)

The old scoring engine applied a flat weighted average across signal categories. The new pipeline has 9 sequential stages:

| Step | What it does |
| --- | --- |
| 1. Community gate | Time-aware AUR comment threat evaluation |
| 2. Critical gate | Signals that alone force Malicious (trust 0) |
| 3. Override gate | High-severity signals force Malicious with max-risk |
| 4. Weighted risk | Composite score (15% Metadata / 45% PKGBUILD / 25% Behavioral / 15% Temporal) |
| 5. Maintainer trust | Multiplier based on account age, package count, takeover recency |
| 6. Popularity penalty | +15 risk for zero-vote packages, +5 for low-traffic |
| 7. Orphan + diff boost | Orphan takeover combined with new suspicious diff → risk ≥ 95 |
| 8. NPM risk | Suspicious install scripts, new maintainers, dead repos → up to 30 extra risk points |
| 9. Clamp & tier | 5 tiers: Trusted (81–100), OK (61–80), Sketchy (41–60), Suspicious (21–40), Malicious (0–20) |
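
Steps 4 and 9 of the table can be sketched as follows. This is only an illustration under stated assumptions: the function and type names are hypothetical, each category score is taken to be a 0–100 risk value, and trust is assumed to be `100 - risk`, which the description implies but does not state.

```rust
/// Tier names and trust ranges as listed in step 9.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Tier {
    Trusted,    // trust 81..=100
    Ok,         // trust 61..=80
    Sketchy,    // trust 41..=60
    Suspicious, // trust 21..=40
    Malicious,  // trust 0..=20
}

/// Step 4: composite risk from four category scores (each assumed 0..=100),
/// weighted 15% / 45% / 25% / 15%. Integer math is used, so the result
/// is truncated toward zero.
pub fn weighted_risk(metadata: u32, pkgbuild: u32, behavioral: u32, temporal: u32) -> u32 {
    let sum = 15 * metadata.min(100)
        + 45 * pkgbuild.min(100)
        + 25 * behavioral.min(100)
        + 15 * temporal.min(100);
    sum / 100
}

/// Step 9: clamp risk to 0..=100, convert it to trust (assumption:
/// trust = 100 - risk), then map trust onto the five tiers.
pub fn tier_for_risk(risk: u32) -> Tier {
    let trust = 100 - risk.min(100);
    match trust {
        81..=100 => Tier::Trusted,
        61..=80 => Tier::Ok,
        41..=60 => Tier::Sketchy,
        21..=40 => Tier::Suspicious,
        _ => Tier::Malicious,
    }
}
```

The gates in steps 1–3 and the adjustments in steps 5–8 would run around these two functions; they are left out here because the description does not give their exact formulas.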

Time-aware AUR comment threat evaluation

Comments mentioning "malware", "backdoor", etc. are now evaluated with time-awareness and popularity context:

High-popularity repos (≥3 votes or ≥0.01 popularity):
- < 7 days old → Malicious override
- 7–60 days → degraded to a 20pt non-override signal
- > 60 days → ignored entirely
Low-popularity repos:
- Mitigation/follow-up comments after the warning → degraded signal
- No mitigation + > 60 days old → always fires (orphaned concern)

Mitigation phrases (e.g. "patched", "fixed", "not compromised", "different package", "false positive") in newer comments automatically downgrade the threat. This prevents stale warnings from permanently labeling recovered packages as malicious.
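
The rules above can be sketched as a small decision function. This is a hedged illustration rather than the PR's code: the enum, function names, and signature are hypothetical, the boundaries at exactly 7 and 60 days are a guess, and two cases the description leaves open are filled with assumptions (a mitigated warning on a high-popularity package is degraded, and a recent unmitigated warning on a low-popularity package fires).

```rust
/// Outcome of evaluating one warning comment (variant names are hypothetical).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum CommentVerdict {
    MaliciousOverride,
    Degraded,
    Ignored,
    Fires,
}

/// Mitigation phrases quoted in the description; the real list may be longer.
const MITIGATION_PHRASES: [&str; 5] = [
    "patched",
    "fixed",
    "not compromised",
    "different package",
    "false positive",
];

/// True if a comment contains any mitigation phrase (case-insensitive).
/// Plain substring matching is crude: "unfixed" would also match "fixed".
pub fn is_mitigation(comment: &str) -> bool {
    let lower = comment.to_lowercase();
    MITIGATION_PHRASES.iter().any(|p| lower.contains(p))
}

/// Evaluate a warning comment of the given age. `mitigated` means a newer
/// comment matched `is_mitigation`.
pub fn evaluate_warning(age_days: u32, votes: u32, popularity: f64, mitigated: bool) -> CommentVerdict {
    let high_popularity = votes >= 3 || popularity >= 0.01;
    if high_popularity {
        if age_days > 60 {
            CommentVerdict::Ignored
        } else if age_days >= 7 || mitigated {
            // Assumption: mitigation also degrades a fresh warning here.
            CommentVerdict::Degraded
        } else {
            CommentVerdict::MaliciousOverride
        }
    } else if mitigated {
        CommentVerdict::Degraded
    } else {
        // Assumption: unmitigated warnings fire at any age on
        // low-popularity packages, not only after 60 days.
        CommentVerdict::Fires
    }
}
```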

NPM registry inspection

When a PKGBUILD references an npm package (via npm install, npm i -g, or registry.npmjs.org URLs), traur now fetches the package metadata from the npm registry and inspects:

- Install scripts (preinstall/install/postinstall) for suspicious commands (eval, exec, curl, wget, base64, child_process)
- Maintainer account age and package count
- GitHub repository existence, stars, and commit freshness
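
A minimal sketch of the first check, assuming plain substring matching. The struct and function names here are hypothetical (this is not the PR's `NpmScripts`), and the real implementation may use regexes and fetch the scripts from the registry; only the marker strings and command names are taken from the description above.

```rust
/// Lifecycle scripts taken from a package's npm metadata.
/// Hypothetical struct for illustration only.
#[derive(Default)]
pub struct InstallScripts {
    pub preinstall: Option<String>,
    pub install: Option<String>,
    pub postinstall: Option<String>,
}

/// Markers that suggest a PKGBUILD pulls in an npm package.
const NPM_MARKERS: [&str; 3] = ["npm install", "npm i -g", "registry.npmjs.org"];

/// Commands the description lists as suspicious inside install scripts.
const SUSPICIOUS_TOKENS: [&str; 6] = ["eval", "exec", "curl", "wget", "base64", "child_process"];

/// True if the PKGBUILD text contains any npm marker.
pub fn references_npm(pkgbuild: &str) -> bool {
    NPM_MARKERS.iter().any(|m| pkgbuild.contains(m))
}

/// Returns the suspicious tokens found in any lifecycle script, in the
/// order of `SUSPICIOUS_TOKENS`. Substring matching over-reports
/// (for example "exec" also matches "execute").
pub fn suspicious_tokens(scripts: &InstallScripts) -> Vec<&'static str> {
    let bodies = [&scripts.preinstall, &scripts.install, &scripts.postinstall];
    SUSPICIOUS_TOKENS
        .iter()
        .copied()
        .filter(|token| {
            bodies
                .iter()
                .any(|body| body.as_deref().map_or(false, |s| s.contains(*token)))
        })
        .collect()
}
```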

New detection patterns: P-NPM-OBFUSCATED-EXEC (critical, 95pts), P-NPM-SUSPICIOUS-SCRIPT (50pts), P-NPM-ATOMIC-LOCKFILE (60pts).

Broader diff detection

Diff analysis now checks all added lines against any high-severity pattern (≥60pts) — not just network code. This catches malicious .install files, npm lockfile drops, and other non-network attack vectors.
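
The idea can be sketched as below, assuming a unified diff as input. The `Pattern` struct, the function name, and the use of substring needles in place of the real regexes from `data/patterns.toml` are all simplifications made for this example; only the 60-point threshold comes from the description.

```rust
/// A detection pattern reduced to what this sketch needs.
/// `needle` is a plain substring standing in for the real regex.
pub struct Pattern {
    pub id: &'static str,
    pub needle: &'static str,
    pub points: u32,
}

/// Threshold from the description: patterns worth 60 points or more.
const HIGH_SEVERITY_THRESHOLD: u32 = 60;

/// Returns the ids of high-severity patterns that match at least one
/// added line of a unified diff. Lines starting with "+++" are treated
/// as file headers and skipped, which is a simplification.
pub fn high_severity_hits(diff: &str, patterns: &[Pattern]) -> Vec<&'static str> {
    let added: Vec<&str> = diff
        .lines()
        .filter(|l| l.starts_with('+') && !l.starts_with("+++"))
        .map(|l| &l[1..])
        .collect();

    patterns
        .iter()
        .filter(|p| p.points >= HIGH_SEVERITY_THRESHOLD)
        .filter(|p| added.iter().any(|line| line.contains(p.needle)))
        .map(|p| p.id)
        .collect()
}
```

Because every added line is checked, a hit no longer depends on the line containing network code; removed and context lines are still ignored.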

Files changed

| File | Change |
| --- | --- |
| `src/shared/aur_comments.rs` | Fix HTML regex, add date parsing (`CommentEntry` with timestamp) |
| `src/shared/scoring.rs` | Complete rewrite: 9-step pipeline, `ScoreInput`, maintainer trust, NPM risk, time-aware comment eval, tunable constants |
| `src/shared/models.rs` | New `CommentEntry`, `MaintainerInfo`, `NpmPackageInfo`, `NpmScripts` structs; new `PackageContext` fields |
| `src/shared/npm.rs` | New: npm registry fetch + GitHub stats |
| `src/shared/patterns.rs` | `is_critical` field on patterns; `load_high_severity_diff_patterns()` |
| `src/shared/signal_registry.rs` | All signals get `is_critical` field |
| `src/coordinator.rs` | `compute_context_meta()`, multi-package loop, time-aware verdict application |
| `src/main.rs` | Multi-argument scan |
| `src/features/aur_comments_analysis/mod.rs` | Adapted for `CommentEntry`, made keywords `pub` |
| All 14 feature files | `is_critical` field on `Signal` constructors, new `PackageContext` fields |
| `data/patterns.toml` | 3 new NPM patterns |
| `tests/output_tests.rs` | Updated for new `is_critical` field |
| `README.md` | Updated with pipeline docs and multi-package usage |

Testing

464 unit + integration tests pass (0 failures). Includes:

13 time-aware comment threat tests covering all rule combinations


Implement multiple security analysis enhancements:
- Add new P-INSTALL-SUID detection rule for chmod SUID/SGID bits in install scripts, and increase point values for existing SUID privilege escalation rules
- Overhaul NPM dependency legitimacy scoring with a 4-component weighted model covering botting risk, documentation quality, takeover anomaly, and burner account age
- Add fetching of GitHub repo metadata (closed issues count, README size) to improve NPM risk calculation accuracy
- Improve bin source verification to split domain mismatch signals: full domain mismatches (50 points) and trusted CDN subdomain mismatches (10 points)
- Update coordinator logic to dynamically adjust signal severity for NPM suspicious scripts and maintainer changes based on analysis results
- Add comprehensive unit tests for all new and updated functionality

Adds a new security check to identify npm packages that claim a GitHub repository not matching their own package's name:
- Add `repo_spoofed` boolean field to `NpmPackageInfo` struct
- Implement GitHub API calls to fetch and parse a repo's root package.json content
- Add lenient name matching logic to handle monorepos, scoped packages, and common variations
- Update suspicion scoring to treat spoofed repos as critical risk, maxing out the score
- Expand NPM install regex to support bun commands and update related comments
- Add comprehensive test coverage for all new helper functions and logic
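
One way the lenient name comparison could look is sketched below. This is an assumption-heavy illustration, not the commit's logic: both function names are hypothetical, only scope stripping, case folding, and separator differences are handled, and monorepo handling (which the commit mentions) is not attempted.

```rust
/// Strip an npm scope ("@scope/name" -> "name") and lowercase the rest.
fn base_name(npm_name: &str) -> String {
    let unscoped = npm_name.rsplit('/').next().unwrap_or(npm_name);
    unscoped.to_lowercase()
}

/// Lenient comparison between the published package name and the `name`
/// field found in the repository's root package.json. Names match if they
/// are equal after dropping scopes and case, or after additionally
/// dropping '-', '_' and '.' separators.
pub fn names_match(published: &str, repo_package_name: &str) -> bool {
    let a = base_name(published);
    let b = base_name(repo_package_name);
    let squash = |s: &str| -> String {
        s.chars()
            .filter(|c| !matches!(*c, '-' | '_' | '.'))
            .collect()
    };
    a == b || squash(&a) == squash(&b)
}
```

Under this sketch, a `repo_spoofed`-style flag would be set only when the repository's package.json could be fetched and `names_match` returned false; an unreachable repository is a separate case.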

peter1599 · 2026-06-14T21:39:06Z

Hi.

I've been using your PR for a few hours now.

I found a strange... bug?

Bulk.rs seems to fail? I added some debugging output to check:

I tried URL-encoding the name because I first thought that was the problem, but that didn't solve it either.

The package is notepad++.

Edit: Definitely broken.

So from this I'm assuming it's trying to use batch mode, even though the first screenshot I showed only had notepad++ as an AUR package, and it still tried to batch "scan".

There is definitely something broken in batch scan too.

Short summary:

- It tried to batch "scan" a single AUR package and failed for some unknown reason.
- On multiple packages it tried to batch "scan" and still failed.

nicolasdanelon · 2026-06-15T00:32:27Z

+
+[[pkgbuild_analysis]]
+id = "P-NPM-SUSPICIOUS-SCRIPT"
+pattern = '(npm|yarn|npx)\s+(run\s+)?(postinstall|preinstall|install)'


what about pnpm?

standwlkdljea added 5 commits June 13, 2026 12:16

Major architectural enhancement to the core scoring engine, introducing context-aware risk evaluation, external NPM package analysis, and multi-package scanning capabilities.

73d6b54

Time-aware AUR Comment Security Evaluation

8b5fb1f

update readme

74b7f04

standwlkdljea force-pushed the main branch from 7d06869 to b01bd24 Compare June 14, 2026 15:57

standwlkdljea wants to merge 5 commits into Sohimaster:main from standwlkdljea:main
