Skip to content

Add Skills宝 to Chinese guide#56

Open
wangzaiwang-hub wants to merge 109 commits into
daymade:mainfrom
wangzaiwang-hub:add-skillsbao-link-v2
Open

Add Skills宝 to Chinese guide#56
wangzaiwang-hub wants to merge 109 commits into
daymade:mainfrom
wangzaiwang-hub:add-skillsbao-link-v2

Conversation

@wangzaiwang-hub

Copy link
Copy Markdown

Adds Skills宝 as a Chinese discovery entry for users looking for more skills.

Skills宝: https://skilery.com

daymade and others added 30 commits March 7, 2026 14:54
…ents

## New Skill: continue-claude-work (v1.1.0)
- Recover actionable context from local `.claude` session artifacts
- Compact-boundary-aware extraction (reads Claude's own compaction summaries)
- Subagent workflow recovery (reports completed vs interrupted subagents)
- Session end reason detection (clean exit, interrupted, error cascade, abandoned)
- Size-adaptive strategy for small/large sessions
- Noise filtering (skips 37-53% of session lines)
- Self-session exclusion, stale index fallback, MEMORY.md integration
- Bundled Python script (no external dependencies)
- Security scan passed, argument-hint added

## Skill Updates
- **skill-creator** (v1.5.0): Complete rewrite with evaluation framework
  - Added agents/ (analyzer, comparator, grader)
  - Added eval-viewer/ (generate_review.py, viewer.html)
  - Added scripts/ (run_eval, aggregate_benchmark, improve_description, run_loop)
  - Added references/schemas.md (eval/benchmark schemas)
  - Expanded SKILL.md with inline vs fork guidance, progressive disclosure patterns
  - Enhanced package_skill.py and quick_validate.py

- **transcript-fixer** (v1.2.0): CLI improvements and test coverage
  - Enhanced argument_parser.py and commands.py
  - Added correction_service.py improvements
  - Added test_correction_service.py

- **tunnel-doctor** (v1.4.0): Quick diagnostic script
  - Added scripts/quick_diagnose.py
  - Enhanced SKILL.md with 5-layer conflict model

- **pdf-creator** (v1.1.0): Auto DYLD_LIBRARY_PATH + rendering fixes
  - Auto-detect and set DYLD_LIBRARY_PATH for weasyprint
  - Fixed list rendering and CSS improvements

- **github-contributor** (v1.0.3): Enhanced project evaluation
  - Added evidence-loop, redaction, and merge-ready PR guidance

## Documentation
- Updated marketplace.json (v1.38.0, 42 skills)
- Updated CHANGELOG.md with v1.38.0 entry
- Updated CLAUDE.md (skill count, marketplace version, daymade#42 description)
- Updated README.md (badges, skill section daymade#42, use case, requirements)
- Updated README.zh-CN.md (badges, skill section daymade#42, use case, requirements)
- Fixed absolute paths in continue-claude-work/references/file_structure.md

## Validation
- All skills passed quick_validate.py
- continue-claude-work passed security_scan.py
- marketplace.json validated (valid JSON)
- Cross-checked version consistency across all docs
- Add _ensure_list_spacing() to handle lists without blank lines before them
- Modify _md_to_html() to preprocess markdown content via stdin
- Add automated test suite (scripts/tests/test_list_rendering.py)
- Fix: Lists without preceding blank lines now render correctly
- Original markdown files remain unmodified (preprocessing in memory only)

Root cause: Pandoc requires blank lines before lists per CommonMark spec.
Without preprocessing, lists following paragraphs render as plain text.

Tested scenarios:
✅ Lists with blank lines (normal case)
✅ Lists without blank lines (critical fix)
✅ Ordered lists without blank lines
✅ Original file integrity preserved

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…on workflow

- Add 10-point high-quality PR formula based on real-world success cases
- Add investigation phase workflow (post to issue before PR)
- Add git history tracing techniques (git log, git blame)
- Add evidence-loop pattern (reproduce → trace → link → post)
- Add high-quality PR case study reference
- Update PR checklist with investigation steps
- Emphasize separation of concerns (detailed analysis in issue, fix summary in PR)

Key principles:
- Deep investigation before coding
- Minimal, surgical fixes
- Professional communication
- No internal/irrelevant details in PR

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ripts

New scripts:
- fix_transcript_timestamps.py: Repair malformed timestamps (HH:MM:SS format)
- split_transcript_sections.py: Split transcript by keywords and rebase timestamps
- Automated tests for both scripts

Features:
- Timestamp validation and repair (handle missing colons, invalid ranges)
- Section splitting with custom names
- Rebase timestamps to 00:00:00 for each section
- Preserve speaker format and content integrity
- In-place editing with backup

Documentation updates:
- Add usage examples to SKILL.md
- Clarify dictionary iteration workflow (save stable patterns only)
- Update workflow guides with new script references
- Add script parameter documentation

Use cases:
- Fix ASR output with broken timestamps
- Split long meetings into focused sections
- Prepare sections for independent processing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add skill to fix broken line wrapping in Claude Code exported .txt files.
Reconstructs tables, paragraphs, paths, and tool calls that were hard-wrapped
at fixed column widths.

Features:
- State-machine parser with next-line look-ahead
- Handles 10 content types (user prompts, Claude responses, tables, tool calls, etc.)
- Pangu spacing for CJK/ASCII mixed text
- 53 automated validation checks
- Safety: never modifies original files, verifies marker counts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds dependency detection before skill creation starts, preventing
mid-workflow failures (e.g., gitleaks missing at packaging, PyYAML
missing at validation). Documents correct script invocation via
python3 -m syntax and auto-installation commands.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…losure

Move verbose sections to references/ files, keeping concise pointers in
CLAUDE.md. Zero content loss — all documentation preserved in reference
files that Claude loads on demand.

Moved to references/:
- plugin-architecture.md (296 lines) — architecture docs
- plugin-troubleshooting.md (441 lines) — installation debugging
- new-skill-guide.md (241 lines) — detailed templates/checklists
- promotion-policy.md (60 lines) — third-party request policy
- youtube-downloader/references/internal-sop.md — yt-dlp SOP

Also fixed: Available Skills daymade#36-42 indentation, deduplicated 4x
versioning sections into one, removed stale notes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- add scrapling-skill with validated CLI workflow, diagnostics, packaging, and docs integration
- fix skill-creator package_skill.py so direct script invocation works from repo root
- fix continue-claude-work extract_resume_context.py typing compatibility for local python3
- bump marketplace to 1.39.0 and updated skill versions
Replace real Zhipu GLM API key with fake placeholder in mask_secret()
and SecretStr docstring examples. The real key was exposed in this
PUBLIC repo.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ndings

transcript-fixer:
- Add common_words.py safety system (blocks common Chinese words from dictionary)
- Add --audit command to scan existing dictionary for risky rules
- Add --force flag to override safety checks explicitly
- Fix substring corruption (产线数据→产线束据, 现金流→现现金流)
- Unified position-aware replacement with _already_corrected() check
- 69 tests covering all production false positive scenarios

tunnel-doctor:
- Add Step 5A: Tailscale SSH proxy silent failure on WSL
- Add Step 5B: App Store vs Standalone Tailscale on macOS
- Add Go net/http NO_PROXY CIDR incompatibility warning
- Add utun interface identification (MTU 1280=Tailscale, 4064=Shadowrocket)
- Fix "Four→Five Conflict Layers" inconsistency in reference doc
- Add complete working Shadowrocket config reference

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Claude IS the AI — when running inside Claude Code, use Claude's own
language understanding for Stage 2 corrections instead of calling an
external API. No API key needed by default.

New capabilities in native mode:
- Intelligent paragraph breaks at logical topic transitions
- Filler word reduction (excessive repetition removal)
- Interactive review with confidence-level tables
- Context-aware judgment using full document context

API mode (GLM) remains available for batch/automation use cases.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…AskUserQuestion

New skill: asr-transcribe-to-text (v1.0.0)
- Transcribe audio/video via configurable ASR endpoint (Qwen3-ASR default)
- Persistent config in CLAUDE_PLUGIN_DATA (endpoint, model, proxy bypass)
- Single-request-first strategy (empirically proven: 55min in one request)
- Fallback overlap-merge script for very long audio (18min chunks, 2min overlap)
- AskUserQuestion at config init, health check failure, and output verification

skill-creator optimization (v1.5.1 → v1.6.0)
- Add AskUserQuestion best practices section (Re-ground/Simplify/Recommend/Options)
- Inject structured decision points at 8 key workflow stages
- Inspired by gstack's atomic question pattern

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Rename skill to better reflect its purpose (document-to-markdown conversion)
- Update SKILL.md name, description, and trigger keywords
- Add benchmark reference (2026-03-22)
- Update marketplace.json entry (name, skills path, version 2.0.0)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…gnosis

Real-world findings from debugging docker build failures on macOS with
OrbStack + Shadowrocket:

- Add docker pull vs docker build vs docker run proxy path distinction table
- Add 2G-1: --network host workaround for OrbStack transparent proxy broken by TUN
- Rewrite 2G-2: use host.internal (not 127.0.0.1) for OrbStack Docker proxy
- Add 2G-4: container healthcheck failure from lowercase http_proxy env var leak
- Add 3 new symptom entries to Step 1 diagnostic index
- Add smoking gun diagnosis: wget showing "127.0.0.1: Connection refused"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… full rename cleanup

- Add CJK bold spacing fix: insert spaces around **bold** spans containing
  CJK characters for correct rendering (handles emoji adjacency, already-spaced)
- Add JSON pretty-print: auto-format JSON code blocks with 2-space indent
- Add 31 unit tests covering all post-processing functions
- Fix pandoc simple table detection (1-space column gaps)
- Fix image path double-nesting when --assets-dir ends with 'media'
- Rename all markdown-tools references across 15 files (README, QUICKSTART,
  marketplace.json, CLAUDE.md, meeting-minutes-taker, GitHub templates)
- Add 5-tool benchmark report (Docling/MarkItDown/Pandoc/Mammoth/ours)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… to 2.1.0

Sync description with actual capabilities: CJK bold spacing, JSON pretty-print,
simple table support, 31 tests, benchmark score. Add cjk/chinese keywords.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…kill draft

- pdf-creator v1.2.0: theme system (default/warm-terra), dual backend
  (weasyprint/chrome auto-detect), argparse CLI, extracted CSS to themes/
- terraform-skill: operational traps from real deployments (provisioner
  timing, DNS duplication, multi-env isolation, pre-deploy validation)
- asr-transcribe-to-text: add security scan marker

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…iew Team

- Correct source accessibility: distinguish circular verification (forbidden)
  from exclusive information advantage (encouraged)
- Add Counter-Review Team with 5 specialized agents (claim-validator,
  source-diversity-checker, recency-validator, contradiction-finder,
  counter-review-coordinator)
- Add Enterprise Research Mode: 6-dimension data collection framework
  with SWOT, competitive barrier, and risk matrix analysis
- Update version to 2.4.0
- Add comprehensive reference docs:
  - source_accessibility_policy.md
  - V6_1_improvements.md
  - counter_review_team_guide.md
  - enterprise_analysis_frameworks.md
  - enterprise_quality_checklist.md
  - enterprise_research_methodology.md
  - quality_gates.md
  - report_template_v6.md
  - research_notes_format.md
  - subagent_prompt.md

Based on "深度推理" case study methodology lessons learned.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ample

Replace \"深度推理(上海)科技有限公司\" with \"字节跳动子公司\"
as the case study example to avoid exposing user's own company info.

Also update .gitignore to exclude:
- deep-research-output/ (contains sensitive research data)
- recovered_deep_research/
- .opencli/
- douban-skill/ (work-in-progress)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…dology

New skill: douban-skill
- Full export of Douban (豆瓣) book/movie/music/game collections via Frodo API
- RSS incremental sync for daily updates
- Python stdlib only, zero dependencies, cross-platform (macOS/Windows/Linux)
- Documented 7 failed approaches (PoW anti-scraping) and why Frodo API is the only working solution
- Pre-flight user validation, KeyboardInterrupt handling, pagination bug fix

skill-creator enhancements:
- Add development methodology reference (8-phase process with prior art research,
  counter review, and real failure case studies)
- Sync upstream changes: improve_description.py now uses `claude -p` instead of
  Anthropic SDK (no ANTHROPIC_API_KEY needed), remove stale "extended thinking" ref
- Add "Updating an existing skill" guidance to Claude.ai and Cowork sections
- Restore test case heuristic guidance for objective vs subjective skills

README updates:
- Document fork advantages vs upstream with quality comparison table (65 vs 42)
- Bilingual (EN + ZH-CN) with consistent content

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…UDE.md)

Prevents sensitive data (user paths, phone numbers, personal IDs) from
entering git history. Born from redacting 6 historical commits.

- .gitleaks.toml: custom rules for absolute paths, phone numbers, usernames
- .githooks/pre-commit: dual-layer scan (gitleaks + regex fallback)
- CLAUDE.md: updated Privacy section documenting the defense system

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nciple, add date handling rules

- SKILL.md length driven by information density, not line count
- Factual dates (release dates, "last verified") should be kept — they help readers judge freshness
- Conditional date logic ("before X use old API") should be avoided

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… recovery

Previously, recover_content.py saved all files flat in the output directory,
causing files with the same name (e.g., src/utils.py and tests/utils.py) to
overwrite each other.

Now the script preserves the original directory structure, creating subdirectories
as needed within the output directory.

- Bump version: 1.0.0 → 1.0.1

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace hardcoded user paths that triggered gitleaks PII detection:
- /Users/username/ → ~/
- /Users/user/ → ~/
- -Users-username- → -Users-<username>- (normalized paths)

Also fix the sed example to use <home> placeholder instead of
regex pattern that would match actual usernames.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Update SKILL.md and workflow_examples.md to reflect the new behavior
of recover_content.py which now preserves original directory structure:

- SKILL.md: Add 'preserving the original directory structure' note
- SKILL.md: Update verification examples to use find command and
  show subdirectory paths (e.g., ./recovered_content/src/components/)
- workflow_examples.md: Update diff example to account for nested paths

Version bump: 1.0.1 → 1.0.2

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
asr-transcribe-to-text:
- Add local MLX transcription path (macOS Apple Silicon, 15-27x realtime)
- Add bundled script transcribe_local_mlx.py with max_tokens=200000
- Add local_mlx_guide.md with benchmarks and truncation trap docs
- Auto-detect platform and recommend local vs remote mode
- Fix audio extraction format (MP3 → WAV 16kHz mono PCM)
- Add Step 5: recommend transcript-fixer after transcription

transcript-fixer:
- Optimize SKILL.md from 289 → 153 lines (best practices compliance)
- Move FALSE_POSITIVE_RISKS (40 lines) to references/false_positive_guide.md
- Move Example Session to references/example_session.md
- Improve description for better triggering (226 → 580 chars)
- Add handoff to meeting-minutes-taker

skill-creator:
- Add "Pipeline Handoff" pattern to Skill Writing Guide
- Add pipeline check reminder in Step 4 (Edit the Skill)

Pipeline handoffs added to 8 skills forming 6 chains:
- youtube-downloader → asr-transcribe-to-text → transcript-fixer → meeting-minutes-taker → pdf/ppt-creator
- deep-research → fact-checker → pdf/ppt-creator
- doc-to-markdown → docs-cleaner / fact-checker
- claude-code-history-files-finder → continue-claude-work

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
SKILL.md: rewritten following Anthropic best practices
- Concise (233 lines, down from 347)
- Critical VHS parser limitations section (base64 workaround)
- Advanced patterns: self-bootstrap, output filtering, frame verification
- Better description for skill triggering

New files:
- references/advanced_patterns.md: production patterns from dbskill project
- assets/templates/self-bootstrap.tape: self-cleaning demo template

auto_generate_demo.py: new flags
- --bootstrap: hidden setup commands (self-cleaning state)
- --filter: regex pattern to filter noisy output
- --speed: post-processing speed multiplier (gifsicle)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
daymade and others added 28 commits April 12, 2026 07:23
- Fix PyInstaller not found: use venv activation instead of uv run
- Create assets/ directory with .gitkeep for Docker build context
- Separate Unix/Windows build steps for proper shell handling

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Round 65 实现:竞品差距分析识别出数据可视化是核心差距,商业SaaS(新榜/西瓜数据)
有但我们没有的功能。

## 新增文件

1. scripts/analytics.py - 数据分析引擎
   - 核心指标计算(文章数/阅读量/点赞数/平均WCI)
   - 趋势分析(7/30/90天)
   - 排行榜算法
   - 发布时间分布统计
   - 内容分类占比
   - 质量评分分布

2. dashboard/main.py - FastAPI 后端服务
   - RESTful API for 图表数据
   - WebSocket 实时更新支持
   - CORS 配置
   - 静态文件服务

3. dashboard/static/index.html - Dashboard 前端
   - Chart.js 图表库
   - Tailwind CSS 样式
   - 响应式布局
   - 实时数据交互

## Dashboard 功能

### 核心指标面板
- 文章总数、总阅读量、总点赞数
- 平均阅读量/点赞数/WCI
- 渐变色彩卡片设计

### 图表可视化
- 阅读量趋势(7/30/90天)
- 发布时间分布(24小时热力图)
- 内容分类占比(饼图)
- 质量评分分布(柱状图)

### 排行榜
- Top 10 高阅读量文章
- Top 10 高WCI文章
- 支持按阅读量/点赞量/WCI排序

### 账号统计
- 多账号对比表格
- 文章数/阅读量/WCI一览

## CLI 更新
- w dashboard - 启动可视化仪表盘服务
- 自动打开浏览器
- 可配置端口和绑定地址

## 文档更新
- SKILL.md 添加 v2.3.0 更新日志
- 竞品对比表添加"数据仪表盘"行
- 核心差异化列表新增第38条

## 竞品对比优势
| 功能 | 新榜/西瓜数据 | 我们的方案 |
|------|--------------|-----------|
| 数据仪表盘 | ✅ 商业SaaS | ✅ 开源免费 |
| 排行榜 | ✅ 付费功能 | ✅ 免费使用 |
| 趋势分析 | ✅ 付费功能 | ✅ 免费使用 |

版本: 3.23.0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Round 66 实现:竞品差距分析识别出AI智能分析能力是核心差距。
竞品(新榜/西瓜数据)无此功能,用户需求强烈。

## 新增文件

1. scripts/ai_analyzer.py - AI智能分析引擎 (500+ lines)
   - 情感分析:正面/负面/中性 + 置信度评分
   - 关键词提取:TF-IDF + LLM混合算法
   - 智能摘要:自动生成文章摘要和关键要点
   - 实体识别:人名/公司/产品/地点提取
   - 多LLM支持:Ollama本地模型 + OpenAI/DeepSeek API
   - 规则回退:当LLM不可用时使用基于规则的分析

## 核心特性

### 多模型支持
- 优先本地Ollama (qwen2.5:7b/llama3.2) - 保护隐私
- 回退OpenAI API - 高质量分析
- 回退DeepSeek API - 国产模型选择
- 规则基础分析 - 最后的可靠保障

### AI分析功能
| 功能 | 描述 |
|------|------|
| 情感分析 | positive/negative/neutral + 置信度(0-1) |
| 关键词提取 | 10个关键词 + 重要性评分 |
| 智能摘要 | 200-500字摘要 + 关键要点列表 |
| 实体识别 | PERSON/ORG/PRODUCT/LOCATION |
| 预计阅读时间 | 基于字数计算 |

### CLI 命令
- `w analyze article --url <URL>` - 单篇文章AI分析
- `w analyze batch --limit 100` - 批量分析
- `w analyze stats` - 情感分布统计
- `w analyze keywords` - 关键词云

### 数据持久化
- SQLite ai_analysis 表存储分析结果
- 避免重复分析
- 支持强制重新分析

## 技术亮点

1. **智能降级策略**: LLM → API → 规则,确保服务可用
2. **JSON模式解析**: 结构化输出,便于处理
3. **缓存机制**: 避免重复调用LLM,节省成本
4. **中文优化**: 针对中文微信公众号内容优化

## 竞品对比优势

| 功能 | 新榜 | 西瓜数据 | 我们的 v3.24.0 |
|------|------|---------|---------------|
| AI情感分析 | ❌ | ❌ | ✅ **独有** |
| AI关键词提取 | ❌ | ❌ | ✅ **独有** |
| AI智能摘要 | ❌ | ❌ | ✅ **独有** |
| 多LLM支持 | ❌ | ❌ | ✅ **独有** |

当前差异化计数:42个"唯一支持"功能

版本: 3.24.0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Round 67 实现:竞品差距分析识别出语义搜索是核心功能空白。
所有竞品(新榜/西瓜数据/wcplusPro)都只支持关键词搜索,无语义理解能力。

## 新增文件

1. scripts/vector_store.py - 向量存储引擎 (400+ lines)
   - 支持 sqlite-vss (SQLite Vector Similarity Search)
   - 备选方案使用 numpy + 余弦相似度计算
   - Embedding多提供商支持:
     - Ollama nomic-embed-text (本地,768维)
     - OpenAI text-embedding-ada-002 (API,1536维)
     - TF-IDF词哈希 (规则备选)
   - 智能降级: sqlite-vss → numpy备选 → 规则embedding

2. scripts/semantic_search.py - 语义搜索引擎 (350+ lines)
   - 语义搜索: 理解查询意图,返回语义相关结果
   - 相似文章推荐: 基于内容向量k-NN检索
   - 内容聚类: 自动发现文章主题群
   - 混合搜索: 结合关键词过滤和语义相似度

## 核心特性

### 向量化支持
| 提供商 | 模型 | 维度 | 特点 |
|--------|------|------|------|
| Ollama | nomic-embed-text | 768 | 本地运行,保护隐私 |
| OpenAI | text-embedding-ada-002 | 1536 | 高质量,需API Key |
| 规则 | TF-IDF哈希 | 768 | 无需外部依赖 |

### 语义搜索能力
- 自然语言查询理解
- 余弦相似度排序
- 相似度分数显示
- 关键词高亮
- 公众号/时间范围过滤

### CLI 命令
```bash
# 索引文章到向量库
w semantic index --limit 1000

# 语义搜索
w search "人工智能发展趋势" --semantic
w semantic search -q "大模型技术突破"

# 相似文章推荐
w semantic similar --id <article_id>

# 内容聚类
w semantic cluster --clusters 5

# 向量库统计
w semantic stats
```

## 技术亮点

1. **智能降级链**: sqlite-vss → numpy备选 → 规则embedding
2. **多维度向量**: 支持768/1536等不同维度模型
3. **元数据过滤**: 支持按公众号、时间等条件过滤
4. **批量索引**: 支持大量文章批量向量化
5. **持久化存储**: SQLite存储,无需额外数据库

## 竞品对比优势

| 功能 | 新榜 | 西瓜数据 | wcplusPro | 我们的 v3.25.0 |
|------|------|---------|-----------|---------------|
| 关键词搜索 | ✅ | ✅ | ✅ | ✅ |
| 语义搜索 | ❌ | ❌ | ❌ | ✅ **独有** |
| 相似文章推荐 | ❌ | ❌ | ❌ | ✅ **独有** |
| 向量存储 | ❌ | ❌ | ❌ | ✅ **独有** |

当前差异化计数:45个"唯一支持"功能

版本: 3.25.0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- IFTTT风格的触发-动作系统
- 触发器: 新文章/热度阈值/关键词匹配/定时触发
- 动作执行器: Webhook/邮件/企业微信/飞书/本地脚本
- RESTful API + WebSocket 实时通知
- w workflow CLI命令组
- v3.26.0, 46个唯一支持特性
- 团队协作系统: 多用户、RBAC权限(admin/member/viewer)、共享工作区
- 文章标注系统: 标签、评论、高亮、共享收藏夹
- 邀请码机制: 通过邀请码加入团队
- 第三方集成: Notion/语雀/Airtable 自动同步
- w team CLI命令组: create/list/join/members/collections/stats
- w sync CLI命令组: add/list/test
- v3.27.0, 48个唯一支持特性
- 多格式导出引擎: Excel/PDF/Word/Markdown/JSON/CSV
  - Excel: 样式美化、多sheet、统计
  - PDF: 中文支持、分页优化
  - Word: 格式保持、目录生成
  - 导出模板系统
- 高级筛选系统: 多条件组合(AND/OR)、筛选模板
- 批量操作引擎: 批量导出/编辑/同步/删除、任务队列
- w export/filter/batch CLI命令组
- v3.28.0, 51个唯一支持特性
- Chrome/Firefox Manifest V3 扩展
- Content Script: 页面内容提取、文章数据解析
- Background Script: 右键菜单、快捷键、自动下载
- Popup UI: 格式选择(Markdown/HTML/JSON)、进度显示
- 快捷键: Ctrl+Shift+S 快速抓取
- 支持图片提取、互动数据读取
- v3.29.0, 52个唯一支持特性
- 敏感词检测: 实时扫描敏感内容 (政治/色情/暴力/赌博/毒品/诈骗)
- 品牌提及追踪: 追踪品牌/关键词在文章中的提及
- 情感趋势分析: 正面/中性/负面情绪监控
- 危机预警系统: 异常传播速度检测、敏感内容预警
- CLI集成: w sentiment 命令管理监控
- 53个唯一支持特性

版本: v3.29.0 → v3.30.0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- 数据仪表盘: ECharts图表、阅读趋势、互动热力图、文章排行
- 竞品分析: 多账号对比、竞争力评分、内容策略分析
- AI智能洞察: 异常检测、趋势分析、自动运营建议
- 热点追踪: 自动发现热点话题、传播速度监控、生命周期分析
- CLI集成: w analytics 统一入口
- 57个唯一支持特性

版本: v3.30.0 → v3.31.0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- AI标题生成器: 8种爆款公式、A/B测试、CTR预测
- 智能摘要生成: 5种风格(新闻/营销/极简/故事/要点)
- 内容改写润色: 5种风格转换(专业/轻松/营销/故事/学术)
- 素材库管理: 文案/图片收藏、标签分类、快速检索
- CLI集成: w writing 统一入口
- 61个唯一支持特性

版本: v3.31.0 → v3.32.0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- 定时任务调度器: Cron表达式、自动采集、定时导出
- 任务执行日志: 执行历史、成功率统计、错误追踪
- 通知提醒系统: 邮件/Webhook、5种通知模板
- 5种内置任务类型: 采集/导出/备份/清理/自定义
- CLI集成: w scheduler 统一入口
- 64个唯一支持特性

版本: v3.32.0 → v3.33.0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
核心功能:
- 微信小程序一键转发保存 (pages/save/save.js)
- 微信公众号消息自动保存 (official/webhook/route.ts)
- WeChat OpenID 用户绑定系统 (wechat_bindings 表)
- 异步抓取任务队列 (scrape_jobs 表)

API端点:
- POST /api/wechat/miniapp/login - 小程序登录
- POST /api/wechat/miniapp/save - 小程序保存文章
- POST/GET /api/wechat/official/webhook - 公众号消息处理

数据库迁移:
- 006_wechat_bindings.sql - wechat_bindings, scrape_jobs, wechat_message_logs

小程序源码:
- wechat-miniprogram/app.js - 小程序入口
- wechat-miniprogram/pages/save/ - 保存页面(接收转发)
- wechat-miniprogram/pages/index/ - 文章列表页

竞品对标:
- 实现 Cubox/Matter 级别的微信转发即收藏体验
- 从 3 步操作简化为 1 步转发

文档更新:
- SKILL.md v3.40.0 发布说明
- COMPETITOR_MATRIX.md 更新差距分析
- docs/wechat-ecosystem-setup.md 配置指南

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
建立了完整的基准测试体系:
- 100篇文章测试集 (test-dataset-100.json)
- 抓取成功率自动化测试 (scrape-benchmark.ts)
- 批注定位准确率测试 (annotation-accuracy.ts)
- 测试运行脚本 (run.sh)

目标指标:
- 抓取成功率 ≥ 95% (对标 Omnivore/Matter)
- 批注定位准确率 ≥ 98% (对标 Omnivore 99%)

测试覆盖:
- 15种微信文章类型 (图文/视频/音频/表格/代码等)
- 3种复杂度级别 (简单/中等/复杂)
- 6年跨度 (2020-2025)
- 8大内容类别

下一步:
- 在真实环境中运行测试
- 收集实际成功率数据
- 针对薄弱环节优化

文档:
- tests/benchmark/BENCHMARK_REPORT.md
- COMPETITOR_MATRIX.md 更新进度

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
建立了企业级 CI/CD 流水线(参考 Omnivore 最佳实践):

CI Workflow (ci.yml):
- Lint & Type Check (代码风格、类型检查)
- Unit Tests + 覆盖率报告
- Integration Tests (PostgreSQL 服务)
- E2E Tests with Stagehand AI
- Benchmark Tests (抓取成功率、批注准确率)
- Security Scan (npm audit + Trivy)
- Build Verification + 包大小检查

CD - Staging (cd-staging.yml):
- Vercel Preview 部署
- Smoke Tests (/api/health)
- Supabase 数据库迁移

CD - Production (cd-production.yml):
- 手动确认保护
- Vercel 生产部署
- Sentry Release 集成
- Slack 通知

Scheduled Benchmark (benchmark-scheduled.yml):
- 每周日自动运行基准测试
- 成功率/准确率阈值检查
- 自动报告生成

配置文件:
- .github/PULL_REQUEST_TEMPLATE.md
- docs/ci-cd-setup.md
- cloud/package.json (添加测试脚本)

质量门禁:
- 测试覆盖率 ≥ 30% (逐步提升到 60%)
- 抓取成功率 ≥ 95%
- 批注准确率 ≥ 98%

竞品对标:
- CI/CD 现已对齐 Omnivore/Wallabag
- 自动化基准测试超越竞品

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- 实现 SiteConfigManager 从 GitHub 获取站点规则
- 实现 SiteConfigExtractor 使用社区规则提取内容
- 支持 XPath/CSS selector 内容提取
- 支持子域名回退规则匹配
- 使用 Readability 作为回退策略
- 完整的单元测试覆盖
- 使用安全的 textContent 替代 innerHTML

竞品对标:
- Omnivore: ✅ 站点规则 (已对标)
- Wallabag: ✅ 站点规则 (已对标)

技术亮点:
- 2218+ 条社区维护规则
- 24小时规则缓存
- 自动降级策略

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Round 163-164:
- 后端: 新增文章导出API (/api/articles/{id}/export, /api/articles/export/batch)
- 支持5种导出格式: Markdown/HTML/JSON/PDF/Excel
- 前端: 单篇文章导出按钮 (ArticleDetail)
- 前端: 批量导出功能 (Articles列表页)
- 移动端: 响应式布局,底部导航栏,汉堡菜单
- 版本: 3.33.0 -> 3.34.0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Round 165:
- 新增 ReaderThemeSettings 组件
- 支持4种主题: Light/Sepia/Dark/Ink
- 支持4种字体: 系统默认/霞鹜文楷/思源宋体/Noto Serif
- 支持字号调节 (14-24px)
- 支持行高调节 (1.4-2.4)
- 支持字间距调节 (0-0.2em)
- 设置持久化到 localStorage
- 集成到 ArticleDetail 页面

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- 后端 /api/scrape: 微信文章直接返回提示,引导使用扩展
- 前端 ImportURL: 添加醒目的微信文章扩展提示
- 验证: 扩展提取 → import API → 数据库存储 流程 100% 可用

取竞品精华:
- Omnivore: 浏览器扩展一键保存体验
- wcplusPro: 微信生态深度集成(通过扩展实现)

去自身糟粕:
- 放弃后端直接抓取微信文章(反爬不可战胜)
- 简化用户认知:微信文章 = 用扩展保存

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Preserve defensive guards before removing wechat-article-scraper:
- .pre-commit-config.yaml (gitleaks + check-added-large-files)
- .pii-path-patterns (block node_modules/storage/feeds)
- scripts/repo_path_guard.py (path-based commit guard)
- scripts/find_images.py, scripts/test_standard_article.py
- .claude/ralph-* moved to .claude/archive/
- CLAUDE.md updates, marketplace.json sync

Next commit removes the whole wechat-article-scraper tree
(leaked api key via hardcoded fallback in agents/src/config.ts).

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Skill moved to private repo daymade/wechat-article-scraper due to
leaked api key via hardcoded fallback in agents/src/config.ts.
All scraper history has been purged from this public repo via git filter-repo.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Workflow was pushing Docker images to ghcr.io/daymade/claude-code-skills/wechat-article-scraper.
Package has been deleted from GHCR. Skill moved to private repo daymade/wechat-article-scraper.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…1.1.0 (daymade#53)

Add a prominent Step 1–6 one-shot installation section at the top of
SKILL.md covering git clone, Python fallback download via GitHub API,
agent registration, bulk skill install, credential setup (including the
critical symlink step), and final verification.

Also update the Routing table to link install intents to the new section,
and tighten the 'When in doubt' hint with bold + backtick formatting.
Bundle 7 Claude Code power-user skills under one shared namespace so
invocations render as daymade-claude-code:SKILL instead of the
redundant SKILL:SKILL form produced by same-name single-skill plugins.

Suite members (moved to suites/daymade-claude-code/ mirroring the
daymade-docs pattern for narrow plugin cache footprints):
  - claude-code-history-files-finder  1.0.2 -> 1.0.3
  - continue-claude-work              1.1.1 -> 1.1.2
  - claude-skills-troubleshooting     1.0.0 -> 1.0.1
  - claude-md-progressive-disclosurer 1.2.0 -> 1.2.1
  - statusline-generator              1.0.0 -> 1.0.1
  - claude-export-txt-better          1.0.0 -> 1.0.1
  - marketplace-dev                   1.2.0 -> 1.2.1 (hook paths
    simplified to CLAUDE_PLUGIN_ROOT/hooks/... now that cache root
    is the skill dir itself)

Both the suite and the 7 individual plugins install from the same
canonical location. Transparent to existing users: plugin names and
invocation unchanged; claude plugin update pulls from the new path on
next update.

Marketplace 1.47.0 -> 1.48.0, plugin entries 51 -> 52 (two suites now:
daymade-docs and daymade-claude-code).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@daymade daymade force-pushed the main branch 2 times, most recently from 915a8c6 to 9c9615e Compare June 28, 2026 16:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants