feat(codex): include archived sessions #1176
Load Codex usage from both sessions and archived_sessions when a CODEX_HOME entry is detected as a Codex home. Direct JSONL directories still load as before for saved codex exec output. Deduplicate active and archived files by relative JSONL path before parsing so copied archived sessions do not double count. The aggregate streaming path and event loader now share the same file discovery behavior. Update Codex docs to describe archived session coverage and active-session precedence.
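The de-duplication rule described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: `session_dirs_for_home` and `dedupe_by_relative_path` are hypothetical names, and the sketch assumes file discovery has already produced a list of JSONL paths per session directory. Listing `sessions/` before `archived_sessions/` is what gives the active copy precedence.

```rust
use std::collections::HashSet;
use std::path::{Path, PathBuf};

/// Hypothetical helper: the two session directories read for a Codex home.
/// Order matters: `sessions/` comes first so its copy wins during de-duplication.
fn session_dirs_for_home(codex_home: &Path) -> Vec<PathBuf> {
    vec![
        codex_home.join("sessions"),
        codex_home.join("archived_sessions"),
    ]
}

/// Keep the first file seen for each path relative to its session directory.
/// `files_by_dir` pairs each session directory with the JSONL files found under it.
fn dedupe_by_relative_path(files_by_dir: &[(PathBuf, Vec<PathBuf>)]) -> Vec<PathBuf> {
    let mut seen: HashSet<PathBuf> = HashSet::new();
    let mut kept = Vec::new();
    for (dir, files) in files_by_dir {
        for file in files {
            // Fall back to the full path if the file is not under `dir`.
            let relative = file
                .strip_prefix(dir)
                .unwrap_or(file.as_path())
                .to_path_buf();
            // `insert` returns false when the relative path was already seen.
            if seen.insert(relative) {
                kept.push(file.clone());
            }
        }
    }
    kept
}
```

Because the key is the whole relative path rather than the basename, two files named `a.jsonl` in different nested directories are both kept, while an archived copy at the same relative path as an active file is skipped.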
@coderabbitai @cubic-dev-ai please review this PR.
This PR was auto-closed. Only contributors approved with […] Maintainers review auto-closed issues and reopen worthwhile ones. Issues that do not meet the quality bar in CONTRIBUTING.md may not be reopened or receive a reply. If a maintainer replies […] See CONTRIBUTING.md.
📝 Walkthrough
Extends Codex discovery and loading to support multiple CODEX_HOME roots and both […]
Changes: Codex Multi-Directory Session Loading
@yashau I have started the AI code review. It will take a few minutes to complete.
✅ Actions performed: Full review triggered.
Opened the requested contribution issue first: #1177. This closed PR can serve as the ready implementation reference if maintainers approve/reopen.
2 issues found across 5 files
P1 (rust/crates/ccusage/src/adapter/codex/paths.rs, line 58): Deduplication key is global across all CODEX_HOME roots, causing valid files from different homes to be incorrectly dropped when they share the same relative path.
<file context>
@@ -31,3 +44,72 @@ pub(super) fn codex_home_paths() -> Result<Vec<PathBuf>> {
+pub(super) fn collect_deduped_codex_usage_files(
+ sessions_dirs: &[PathBuf],
+) -> Vec<(PathBuf, Vec<PathBuf>)> {
+ let mut seen = FxHashSet::default();
+ let mut grouped_files = Vec::new();
+ for sessions_dir in sessions_dirs {
</file context>
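One possible way to address the P1 finding is to scope the `seen` set to a single Codex home, so identical relative paths under different homes are both counted. The sketch below is an assumption about how that could look, not the fix that was applied: `HomeListing` and `dedupe_per_home` are hypothetical names, and the input is assumed to be grouped by home already.

```rust
use std::collections::HashSet;
use std::path::PathBuf;

/// Hypothetical input shape: one entry per CODEX_HOME root, holding the
/// (session directory, discovered files) pairs for that root only.
type HomeListing = Vec<(PathBuf, Vec<PathBuf>)>;

/// De-duplicate by relative path within each home, never across homes.
fn dedupe_per_home(homes: &[HomeListing]) -> Vec<PathBuf> {
    let mut kept = Vec::new();
    for listing in homes {
        // A fresh set per home keeps the de-duplication key local to that home.
        let mut seen: HashSet<PathBuf> = HashSet::new();
        for (dir, files) in listing {
            for file in files {
                let relative = file
                    .strip_prefix(dir)
                    .unwrap_or(file.as_path())
                    .to_path_buf();
                if seen.insert(relative) {
                    kept.push(file.clone());
                }
            }
        }
    }
    kept
}
```

An equivalent design would keep one global set and include a home index in the key; resetting the set per home just avoids carrying that extra field.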
P2 (rust/crates/ccusage/src/adapter/codex/paths.rs, line 72): Non-UTF-8 path components are silently discarded in deduplication keys, which can cause false duplicate detection and undercounted usage.
<file context>
@@ -31,3 +44,72 @@ pub(super) fn codex_home_paths() -> Result<Vec<PathBuf>> {
+ grouped_files
+}
+
+fn codex_relative_session_path(sessions_dir: &Path, path: &Path) -> String {
+ path.strip_prefix(sessions_dir)
+ .unwrap_or(path)
</file context>
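The excerpt above does not show how the `String` key is assembled, so the following is only a sketch of one way to avoid the P2 problem: return the relative path as a `PathBuf` instead of converting components to UTF-8 strings. `relative_session_key` is a hypothetical name. `PathBuf` implements `Hash` and `Eq`, so it can be used directly as a set key without any lossy conversion.

```rust
use std::path::{Path, PathBuf};

/// Build the de-duplication key as a `PathBuf` so no component is dropped,
/// even when a component is not valid UTF-8.
fn relative_session_key(sessions_dir: &Path, path: &Path) -> PathBuf {
    // If `path` is not under `sessions_dir`, use the full path as the key.
    path.strip_prefix(sessions_dir)
        .unwrap_or(path)
        .to_path_buf()
}
```

With this shape the `seen` set would hold `PathBuf` values rather than `String` values; whether that fits the rest of the adapter is not something the excerpt confirms.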
@yashau looks reasonable. thank you for this pr.
Summary
Codex usage discovery now includes archived sessions by default for Codex homes. When CODEX_HOME points at a Codex home such as ~/.codex, the Rust adapter reads both sessions/ and archived_sessions/ so focused and unified Codex reports include archived conversation history.

This ports the behavior proposed in #849 to the current Rust implementation after the TypeScript adapter was retired.
What Changed
- […] sessions/ and archived_sessions/.
- […] codex exec --json output.
- […] sessions/ copy wins.

Why
Archived Codex sessions are still part of a user's local usage history, but the Rust adapter only read active sessions. That undercounted token and cost totals for users with archived conversations. This change makes the default behavior match what users expect from ccusage codex, while preserving explicit/direct JSONL-directory support.

Implementation Notes
The key decision is to de-duplicate at the file discovery layer by relative JSONL path before parsing. That prevents double counting a session copied from sessions/ to archived_sessions/, while still allowing distinct nested paths with the same basename to be counted separately.

The aggregate path matters because normal table output can stream/aggregate without first building a full event vector. This PR updates that path as well as the event loader so --json, table output, and focused reports all see the same file set.

Testing
- cargo test --manifest-path rust/Cargo.toml --workspace codex -- --nocapture
- cargo check --manifest-path rust/Cargo.toml --workspace
- git diff --check

I also ran cargo test --manifest-path rust/Cargo.toml --workspace. It reached 191 passing tests and failed two existing timezone-sensitive tests unrelated to this Codex change on this Windows machine:
- commands::tests::builds_statusline_today_filter_from_timezone
- tests::formats_dates_with_timezone

Both failures reproduce individually and show named timezone conversion behaving like UTC locally.
AI-assisted: This code was written with assistance from AI.
Summary by cubic
Include archived Codex sessions by default to fix undercounted usage and align ccusage reports with real totals. Direct JSONL directories still work the same.
- […] sessions/ and archived_sessions/ when CODEX_HOME points to a Codex home.
- […] sessions/ copy wins if both exist.
- […] codex exec --json directories unchanged.

Written for commit d29832d. Summary will update on new commits.