- CLAUDE.md dev loop: after opening the PR, wait for CI green, then self-review the diff (/code-review) before saying it's ready; never claim ready before CI-green + self-review; still never self-merge.
- .claude/skills/sprint: thin per-project skill so "start sprint N" / "groom sprint N" / "continue the sprint" deterministically runs the groom -> per-issue dev-loop -> retro flow, enforcing the guardrails (plan is source of truth, groom-first, CI-green + self-review before ready, human merges). Composes with the superpowers skills.
- docs/sprints/README.md: fold the CI-green + self-review gate into the per-sprint Definition of Done.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
File-Ingestion Search Pipeline
Authoritative spec — read before planning or any architectural decision: @docs/digger-brief.md
Always
- Use the superpowers brainstorm → plan → execute workflow and TDD (red → green → refactor).
- Do NOT build features until I approve the plan. Create the Forgejo V1 milestone + issues only AFTER the design is approved.
- The v1 plan (docs/superpowers/plans/2026-07-01-digger-v1-plan.md) is the source of truth: whenever a Forgejo issue is added, changed, or removed, update the plan to match (and vice versa); they must not drift.
- Before starting each sprint, groom it into docs/sprints/sprint-<n>-<slug>.md, built from the plan + current repo state + the previous sprint's retro (see docs/sprints/README.md). Groom only the sprint about to start, not all of them. If grooming changes scope, flow it back: groom doc → plan doc → Forgejo issues. Close each sprint with a short retro in its groom doc; it feeds the next sprint's grooming.
- Every branch gets its own git worktree at ../<repo>.worktrees/<issue>-<slug> (sibling to the repo), branched as <type>/<issue>-<slug> (feat|fix|chore|docs|refactor|test) off the latest main. Never work on main. Remove the worktree and delete the branch after merge.
- Dev loop: pick a Forgejo issue (via the Forgejo MCP) → worktree + branch → tests-first → PR (Closes #N) → wait for CI to go green → self-review your own diff (via /code-review or superpowers:requesting-code-review; fix blocking findings, and if you push fixes, re-confirm CI is green) → only then say it's ready → I approve and merge. Never self-merge. Never claim a PR is ready before CI is green and the self-review pass is done.
- Prefer small, functional PRs. I review PRs myself, so keep each one small, self-contained, and independently green; that makes review much easier. If an issue/task is large, split it into multiple PRs (one issue may span several); each PR should build and pass on its own.
- Every sprint ships a working end-to-end slice.
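The worktree/branch naming rule and the ready gate above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names, the `PrState` fields, and the repo name `digger` in the usage lines are hypothetical and not part of the project.

```python
from dataclasses import dataclass

# Branch types listed in the worktree rule above.
BRANCH_TYPES = ("feat", "fix", "chore", "docs", "refactor", "test")


def worktree_names(repo: str, branch_type: str, issue: int, slug: str) -> tuple[str, str]:
    """Return (worktree_path, branch_name) following the naming convention above."""
    if branch_type not in BRANCH_TYPES:
        raise ValueError(f"unknown branch type: {branch_type!r}")
    worktree = f"../{repo}.worktrees/{issue}-{slug}"  # sibling to the repo
    branch = f"{branch_type}/{issue}-{slug}"
    return worktree, branch


@dataclass
class PrState:
    """Hypothetical snapshot of a PR; the field names are illustrative only."""
    ci_green: bool
    self_review_done: bool
    pushed_since_green: bool = False  # fixes pushed after the last green CI run


def can_claim_ready(pr: PrState) -> bool:
    """A PR may be called ready only when CI is green and the self-review is done.

    If fixes were pushed after the last green run, CI has to be re-confirmed first.
    """
    return pr.ci_green and pr.self_review_done and not pr.pushed_since_green


# Usage with a hypothetical repo name and issue.
print(worktree_names("digger", "feat", 12, "ir-schema"))
# ('../digger.worktrees/12-ir-schema', 'feat/12-ir-schema')
print(can_claim_ready(PrState(ci_green=True, self_review_done=False)))
# False
```

Merging is deliberately absent from the sketch: the gate only decides whether "ready" may be claimed, and the merge itself stays with the human reviewer.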
Invariants (never compromise)
- Pipeline runs standalone without Meilisearch; the search backend is swappable behind an interface.
- Every heavy dependency is swappable behind an interface: the search engine (Sink/SearchProvider), the model runtime (ModelBackend), and the extraction engine including Docling (Extractor). DoclingDocument never escapes the extractor; the IR is the sole contract.
- The intermediate representation (IR) is the contract; keep it stable and engine-agnostic.
- All model inference (OCR / ASR / embeddings) is local; no file content leaves the machine.
- v1 = keyword search only; design for vector/hybrid but keep it switched off.
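A minimal sketch of how the swappable-interface invariants might look in Python, using `typing.Protocol`. Only the names Extractor and Sink come from the list above; the `IRDocument` fields, the method signatures, and the fake implementations are assumptions made for illustration (the real IR schema belongs in docs/decisions/). ModelBackend and SearchProvider are omitted but would follow the same pattern.

```python
from dataclasses import dataclass, field
from typing import Iterable, Protocol


@dataclass(frozen=True)
class IRDocument:
    """Hypothetical minimal stand-in for the IR; not the project's real schema."""
    source_path: str
    text: str
    metadata: dict[str, str] = field(default_factory=dict)


class Extractor(Protocol):
    """Turns a file into IR. Engine-specific types must not leak past this boundary."""
    def extract(self, path: str) -> IRDocument: ...


class Sink(Protocol):
    """Receives IR documents; a search backend would be one possible implementation."""
    def write(self, docs: Iterable[IRDocument]) -> int: ...


class FakeExtractor:
    """Toy extractor backed by an in-memory mapping of path -> text."""
    def __init__(self, files: dict[str, str]) -> None:
        self._files = files

    def extract(self, path: str) -> IRDocument:
        return IRDocument(source_path=path, text=self._files[path])


class InMemorySink:
    """Toy sink that collects documents, so the pipeline runs with no search engine."""
    def __init__(self) -> None:
        self.docs: list[IRDocument] = []

    def write(self, docs: Iterable[IRDocument]) -> int:
        batch = list(docs)
        self.docs.extend(batch)
        return len(batch)


def run_pipeline(paths: Iterable[str], extractor: Extractor, sink: Sink) -> int:
    """Extract each path into IR and hand it to the sink; returns the number written."""
    return sink.write(extractor.extract(p) for p in paths)
```

Because `run_pipeline` depends only on the two protocols and the IR type, swapping the extraction engine or the search backend in this sketch means passing a different object rather than editing the pipeline.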
Layout
- docs/digger-brief.md — full spec
- docs/research/ — subagent findings
- docs/decisions/ — ADRs, IR schema, Meilisearch settings