Add paper-review skill for academic paper methodology extraction by igerber · Pull Request #134 · igerber/diff-diff

igerber · 2026-02-08T17:26:12Z

Summary

Add /paper-review skill (.claude/commands/paper-review.md) that reads academic paper PDFs and produces structured methodology documentation
Multi-phase architecture: scout agent (haiku) for paper structure detection, parallel extraction agents for long papers (>20 pages), synthesis agent for combining results
Short-paper fast path reads entire paper in main context for papers ≤20 content pages
Output is a Methodology Registry entry at docs/methodology/papers/{paper-name}-review.md
Add .claude/paper-review/ to .gitignore as safety net for interrupted runs
Add docs/methodology/papers/ documentation entry to CLAUDE.md
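
The routing between the short-paper fast path and the multi-agent pipeline can be sketched as below. This is only an illustration: the 20-page threshold comes from the summary above, while `choose_pipeline`, `split_for_extraction`, and the `chunk_size` default are hypothetical names and values, since the skill itself is a markdown instruction file rather than Python.

```python
SHORT_PAPER_MAX_PAGES = 20  # threshold stated in the PR summary


def choose_pipeline(content_pages: int) -> str:
    """Pick the path a paper takes from its content page count."""
    if content_pages <= SHORT_PAPER_MAX_PAGES:
        return "fast-path"      # read the whole paper in the main context
    return "multi-agent"        # scout + parallel extraction + synthesis


def split_for_extraction(content_pages: int, chunk_size: int = 20) -> list[tuple[int, int]]:
    """Split pages 1..content_pages into inclusive ranges, one per extraction agent.

    chunk_size is an assumed value for illustration only.
    """
    return [
        (start, min(start + chunk_size - 1, content_pages))
        for start in range(1, content_pages + 1, chunk_size)
    ]
```

For example, `split_for_extraction(45)` would hand three agents the ranges 1-20, 21-40, and 41-45.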

Methodology references (required if estimator / math changes)

N/A - no methodology changes, this is a developer tooling skill

Validation

No test changes (skill is a markdown instruction file, not code)
Manual testing applicable: short paper fast path, long paper multi-agent pipeline, invalid PDF handling, --confirm flag pauses, existing output collision

Security / privacy

Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

Multi-phase skill that reads academic paper PDFs and produces structured methodology documentation (equations, algorithms, SEs, edge cases) in the project's Methodology Registry format. Uses scout agent for paper structure detection, parallel extraction agents for long papers, and a synthesis agent to combine results. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-02-08T17:29:14Z

Overall assessment: ⚠️ Needs changes

Executive summary

No estimator/math changes in this PR; methodology registry content is not directly modified.
P1 edge case: scout page count caps at 120 pages, which can truncate long papers and miss assumptions/SE details.
P2 edge case: references detection only scans the last 8 pages, so long references/appendices can be misclassified as content.
P2 methodology/docs mismatch: “direct inclusion” template doesn’t match the actual REGISTRY heading level/labels, risking inconsistent registry entries.
No security concerns observed.

Methodology

P2 | Impact: The template claims it is “formatted for direct inclusion” in the Methodology Registry but uses ### headings and label variants that don’t match the registry’s ##-level entries, which will break structure/ToC consistency when pasted. | Fix: Align the template to the registry format (## headings and matching section labels) or change the “direct inclusion” claim. Ref: .claude/commands/paper-review.md#L41, docs/methodology/REGISTRY.md#L29.

Code Quality

P1 | Impact: The scout binary search hard-codes high = 120, so papers longer than 120 pages are silently truncated and extraction can miss key assumptions, equations, or SE details. | Fix: Use exponential search to find an upper bound (e.g., doubling high until a page fails) before binary search. Ref: .claude/commands/paper-review.md#L183.
P2 | Impact: References detection scans only the last 8 pages; long references or early-start appendices can be treated as content, diluting extraction and increasing noise. | Fix: Scan backward until a references heading is found (or to a larger configurable window), or scan all pages for a references heading. Ref: .claude/commands/paper-review.md#L196.
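
To make the references-window finding concrete, here is a minimal sketch of a backward scan with an optional window. The `read_page` callable, the heading pattern, and the function name are assumptions for illustration; the actual skill performs this step through agent instructions, not Python.

```python
import re
from typing import Callable, Optional

# Assumed heading variants; a real paper may use others.
REFERENCES_HEADING = re.compile(
    r"^\s*(references|bibliography|works cited)\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def find_references_start(
    read_page: Callable[[int], str],
    total_pages: int,
    window: Optional[int] = None,
) -> Optional[int]:
    """Scan backward from the last page for a references heading.

    window=None scans every page; window=8 reproduces the original behaviour
    of looking only at the last 8 pages.
    """
    first = 1 if window is None else max(1, total_pages - window + 1)
    for page in range(total_pages, first - 1, -1):
        if REFERENCES_HEADING.search(read_page(page)):
            return page
    return None
```

With a 30-page paper whose references begin on page 18, a window of 8 covers only pages 23-30 and finds nothing, so the reference pages are treated as content. A window of 15, or an unbounded scan, finds page 18.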

Performance

P3 | Impact: None observed in this PR. | Fix: N/A.

Maintainability

P3 | Impact: None observed in this PR. | Fix: N/A.

Tech Debt

P3 | Impact: None observed in this PR. | Fix: N/A.

Security

P3 | Impact: None observed in this PR. | Fix: N/A.

Documentation/Tests

P3 | Impact: No testable code changes; documentation-only skill addition. | Fix: N/A.

Fix P1: Replace hard-coded 120-page binary search cap with exponential probe (64→128→256…) so papers >120 pages are not silently truncated. Fix P2: Increase references scan window from 8 to 15 pages to cover papers with large reference sections. Fix P2: Align template heading levels and labels with REGISTRY.md (h2 not h3, matching assumption/equation label wording). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
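
A minimal sketch of the exponential probe followed by binary search, assuming a hypothetical `page_exists` callable that reports whether a given page can be read. The starting probe of 64 follows the commit message; everything else is illustrative.

```python
from typing import Callable


def count_pages(page_exists: Callable[[int], bool], start: int = 64) -> int:
    """Return the number of the last existing page (0 if page 1 is missing)."""
    if not page_exists(1):
        return 0
    # Exponential probe: 64 -> 128 -> 256 ... until a probe lands past the end.
    low, high = 1, start
    while page_exists(high):
        low, high = high, high * 2
    # Invariant: page `low` exists and page `high` does not.
    while high - low > 1:
        mid = (low + high) // 2
        if page_exists(mid):
            low = mid
        else:
            high = mid
    return low
```

The search assumes that every page up to the last one is readable. A single unreadable page in the middle would break that assumption and can cause the search to stop early.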

igerber · 2026-02-08T17:37:22Z

/ai-review

github-actions · 2026-02-08T17:40:37Z

🔁 AI review rerun (requested by @igerber)

Head SHA: a2c35a7add0f4c261bce555834a3c8585b2638f0

Overall assessment: ⚠️ Needs changes

Executive summary:

No estimator/math changes; methodology registry content is unaffected.
The new skill uses --name directly in output paths without sanitization, which can lead to accidental path traversal or invalid filenames.
Scout logic treats any Read error as “page beyond document,” which can truncate extraction if a page is unreadable.
Minor inconsistency in slug derivation example vs rule could cause inconsistent filenames.

Methodology

No issues found. (No estimator, identification, or inference changes in this PR.)

Code Quality

P2 – Impact: Scout treats any Read error as “page beyond document,” so a single unreadable page can truncate content_pages and skip methodology sections. – Fix: differentiate out‑of‑range vs parse errors; on error, retry or prompt user and fall back to a linear scan with a warning. (.claude/commands/paper-review.md:L178-L181)
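
One way to separate the two failure modes is sketched below. `PageOutOfRange` and `PageUnreadable` are hypothetical exception types standing in for whatever signal the Read step actually gives; the thread does not show how the real tool reports these cases.

```python
from typing import Callable


class PageOutOfRange(Exception):
    """Hypothetical: the requested page is past the end of the document."""


class PageUnreadable(Exception):
    """Hypothetical: the page exists but could not be parsed."""


def page_exists(read_page: Callable[[int], str], page: int, warnings: list[str]) -> bool:
    """Treat only out-of-range errors as 'beyond the document'."""
    try:
        read_page(page)
        return True
    except PageOutOfRange:
        return False
    except PageUnreadable:
        # A corrupt page still counts as part of the document; record it
        # so the caller can warn the user or fall back to a linear scan.
        warnings.append(f"page {page} is unreadable; treating it as present")
        return True
```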

Performance

No issues found.

Maintainability

P3 – Impact: Slug derivation rules say “first author’s last name,” but the example uses two surnames (callaway-santanna), risking inconsistent filenames across reviews. – Fix: clarify and standardize the slug rule (first author only vs first+second) and update the example. (.claude/commands/paper-review.md:L16-L19, .claude/commands/paper-review.md:L140-L143)

Tech Debt

No issues found.

Security

P2 – Impact: --name is interpolated into output paths without sanitization, allowing path traversal or illegal filenames and possible accidental overwrite outside docs/methodology/papers. – Fix: slugify to [a-z0-9-], strip path separators, and confirm the final path before writing. (.claude/commands/paper-review.md:L16-L19, .claude/commands/paper-review.md:L156-L157)
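
The suggested fix could look roughly like the following sketch. The `[a-z0-9-]` alphabet and the `docs/methodology/papers` directory come from the review; `slugify` and `output_path` are hypothetical helper names, and the `-review.md` suffix follows the output pattern in the PR summary.

```python
import re
from pathlib import Path

OUTPUT_DIR = Path("docs/methodology/papers")


def slugify(name: str) -> str:
    """Reduce a user-supplied name to lowercase [a-z0-9-] with no path separators."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    if not slug:
        raise ValueError(f"cannot derive a slug from {name!r}")
    return slug


def output_path(name: str, base: Path = OUTPUT_DIR) -> Path:
    """Build the review path and confirm it stays inside the output directory."""
    path = (base / f"{slugify(name)}-review.md").resolve()
    if base.resolve() not in path.parents:
        raise ValueError(f"{path} escapes {base}")
    return path
```

With this sketch, `slugify("../../etc/passwd")` yields `etc-passwd`, so the traversal components never reach the filesystem path.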

Documentation/Tests

No issues found (docs/skill-only change; tests not expected).

Fix scout treating any Read error as end-of-document by adding corrupt-page caveat and post-search sanity check. Fix slug derivation rule to match two-author example (callaway-santanna-2021). Add slug sanitization to prevent path traversal via --name argument. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

igerber · 2026-02-08T17:48:19Z

/ai-review

github-actions · 2026-02-08T17:52:41Z

🔁 AI review rerun (requested by @igerber)

Head SHA: 1230e75bb410d66b1014f1f94a990a42f9f0e0a8

Methodology

P3 — No estimator, math, weighting, variance/SE, or default-behavior changes in this PR; Impact: none; Fix: none.

Code Quality

P1 — Missing guard for content_pages == 0 leads to an invalid Read range ("1-0") when references start on page 1 or there is no content; Impact: skill fails on empty-content PDFs; Fix: after parsing scout results, explicitly handle content_pages < 1 (error or AskUserQuestion for manual ranges, or fall back to 1-total_pages) and skip short-path Read in that case; Location: .claude/commands/paper-review.md:L230-L259.
P2 — Short-paper fast path skips output verification and cleans up immediately; Impact: a Write failure could be silently reported as success; Fix: add a read/exists check for docs/methodology/papers/{paper-name}-review.md before cleanup (reuse Phase 4 check logic); Location: .claude/commands/paper-review.md:L256-L263.
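
The zero-content guard can be illustrated with a short sketch. It assumes that `content_pages` equals `references_start_page - 1` (consistent with the "references start on page 1" case above) and falls back to `total_pages` when no references heading was found. The function name is hypothetical.

```python
from typing import Optional


def content_read_range(references_start_page: Optional[int], total_pages: int) -> str:
    """Return the page range to read, refusing to build an invalid '1-0' range."""
    if references_start_page is None:
        content_pages = total_pages
    else:
        content_pages = references_start_page - 1
    if content_pages < 1:
        # Here the skill would stop or ask the user for manual page ranges.
        raise ValueError("no content pages detected; ask for manual page ranges")
    return f"1-{content_pages}"
```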

Performance

P3 — No runtime code changes; Impact: none; Fix: none.

Maintainability

P3 — PDF extension check is case-sensitive (.pdf only); Impact: avoidable UX friction for .PDF paths; Fix: accept case-insensitive extension or validate via file existence/MIME sniffing; Location: .claude/commands/paper-review.md:L14-L26.
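
Both options mentioned in the fix can be sketched in a few lines: a case-insensitive suffix check, and a check of the `%PDF-` header that PDF files start with. The function names are hypothetical.

```python
from pathlib import Path


def has_pdf_extension(path: str) -> bool:
    """Accept .pdf, .PDF, .Pdf, and other case variants."""
    return Path(path).suffix.lower() == ".pdf"


def starts_with_pdf_header(path: str) -> bool:
    """Check the magic bytes at the start of the file instead of trusting the name."""
    with open(path, "rb") as handle:
        return handle.read(5) == b"%PDF-"
```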

Tech Debt

P3 — No additional tech debt identified beyond issues above; Impact: none; Fix: none.

Security

P3 — No secrets or security-sensitive code changes in this diff; Impact: none; Fix: none.

Documentation/Tests

P3 — Documentation updates are present and adequate for a command file; no tests expected; Impact: none; Fix: none.

Open Questions / Assumptions

Should a PDF with references_start_page == 1 be treated as invalid input (error) or should the skill prompt for manual content ranges?

Overall assessment: ⚠️ Needs changes

Executive summary

One P1 edge-case gap: content_pages == 0 triggers an invalid Read range in the short-path flow.
One P2 robustness gap: short-paper path doesn’t verify output before cleanup.
Minor UX issue: .PDF extensions are rejected due to case-sensitive check.
No methodology changes; registry content remains unaffected.

Guard against zero content pages, add output verification to short-paper fast path, and make PDF extension check case-insensitive. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
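
The verify-before-cleanup ordering is the important part of the output check. A minimal sketch, assuming a hypothetical `finish_run` helper and a temporary working directory such as the gitignored `.claude/paper-review/`:

```python
import shutil
from pathlib import Path


def finish_run(output_file: Path, temp_dir: Path) -> None:
    """Delete temporary files only after the review file is confirmed on disk."""
    if not output_file.is_file() or output_file.stat().st_size == 0:
        # Keep temp_dir so an interrupted or failed write can be inspected.
        raise RuntimeError(
            f"expected output missing or empty: {output_file}; keeping {temp_dir}"
        )
    shutil.rmtree(temp_dir, ignore_errors=True)
```

Raising before cleanup means a failed Write surfaces as an error instead of being reported as success.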

igerber · 2026-02-08T18:06:36Z

/ai-review

github-actions · 2026-02-08T18:10:42Z

🔁 AI review rerun (requested by @igerber)

Head SHA: ee59f004a2535f05798d2d2bc179ac3e355e1187

Overall assessment: ✅ Looks good

Executive Summary

No estimator/math/SE/assumption/default behavior changes; diff is a Claude command + doc updates.
Paper-review template explicitly aligns to the Methodology Registry format and preserves equation numbering for cross-checks.
Edge-case checklist not triggered; no inference code paths added/modified.
No security-sensitive changes; temp outputs are gitignored.
Testing gap: no automated validation for the new command (manual use only).

Methodology

No findings. (Registry format reviewed in docs/methodology/REGISTRY.md:1 and referenced by the new command in .claude/commands/paper-review.md:45 and .claude/commands/paper-review.md:459.)

Code Quality

No findings.

Performance

No findings.

Maintainability

No findings.

Tech Debt

No findings.

Security

No findings.

Documentation/Tests

P3: The new /paper-review command is not covered by automated checks; correctness relies on manual runs. Impact: regressions in command structure/templates could slip in unnoticed. Fix: add a lightweight doc-lint/smoke test or document a manual validation checklist in CI/CLAUDE.md. (Related: .claude/commands/paper-review.md:1)
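
A lightweight doc-lint along the lines suggested could be as small as the sketch below. The marker list is hypothetical: the thread mentions a Phase 4 check and the `docs/methodology/papers/` output directory, but the actual headings in the command file are not shown here.

```python
def missing_markers(text: str, required: list[str]) -> list[str]:
    """Return the required markers that do not appear in the command file text."""
    return [marker for marker in required if marker not in text]


# Hypothetical markers; adjust to the real structure of paper-review.md.
REQUIRED = ["Phase 4", "docs/methodology/papers/"]
```

In CI this could read `.claude/commands/paper-review.md` and fail when `missing_markers` returns a non-empty list.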

Address round 3 AI review feedback on paper-review skill

ee59f00


igerber merged commit 4bb60bf into main Feb 8, 2026

igerber deleted the paper-review-skill branch February 8, 2026 18:13

igerber merged 4 commits into main from paper-review-skill
