feat(diff): detailed before/after metadata diff — surface + JPEG (#22) #119

Merged
forgejo_admin merged 21 commits from feat/metadata-diff into master 2026-05-15 11:52:15 +00:00

Closes #22.

Summary

Phase 1 of issue #22 — the per-file expandable before/after metadata diff. Surface (types, plumbing, renderer component) + JPEG enumeration. Other formats (PDF/PNG/Office/Video) return metadataItems: [] and will fill in via per-format follow-up PRs as planned in the spec.

User-visible: JPEG rows now show a chevron + clickable diff panel listing what was removed (red strikethrough), modified (green before → after), and kept (neutral grey badge — Orientation + ICC profile preservation). Diff is collapsed by default; single-row-expanded semantics from AppContext.expandedRowId. Mobile stacks to single column at < 420px.

Scope

  • Surface: MetadataItem discriminated union (removed / modified / kept), StripResult.metadataItems, ProcessOutcome + WasmApi.process extension, FileEntry.metadataItems, UPDATE_FILE_METADATA action, MetadataDiffExpansion component with BEM CSS + dark mode tokens, 6 new i18n keys (English placeholders for the other 24 locales), isExpandable gate tightened so non-JPEG Complete rows stay non-expandable.
  • JPEG enumeration: per-tag IFD0/SubIFD/GPS/IFD1/InteropIFD walks (~750 LOC strategy code, ~450 LOC fixture builder). Compound GPS coordinates (47° 36' 22.3" N from triplet + ref). MakerNote emitted as one opaque item with vendor signature detection (Nikon/Canon/Sony/Olympus/Fuji/Panasonic/Pentax/Leica). XMP/ICC/JFIF/Comment dropped segments emit blob items. Kept items for Orientation (when preserveOrientation: true) and ICC profile (when preserveColorProfile: true).
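
For readers unfamiliar with the surface, the discriminated union and the tightened expandability gate might look roughly like the sketch below. Only the three action tags (`removed` / `modified` / `kept`) and the names `MetadataItem` and `isExpandable` come from this PR; every field name, the `describeItem` helper, and the gate's exact condition are assumptions made for illustration.

```typescript
// Hypothetical shape: only the three action tags are taken from the PR text.
type MetadataItem =
  | { action: "removed"; source: string; name: string; before: string }
  | { action: "modified"; source: string; name: string; before: string; after: string }
  | { action: "kept"; source: string; name: string; value: string };

// Narrowing on `action` gives each branch access to its own fields only.
function describeItem(item: MetadataItem): string {
  switch (item.action) {
    case "removed":
      return `${item.source}.${item.name}: removed (${item.before})`;
    case "modified":
      return `${item.source}.${item.name}: ${item.before} → ${item.after}`;
    case "kept":
      return `${item.source}.${item.name}: kept (${item.value})`;
  }
}

// Assumed gate: a row is expandable only when it is Complete AND has items,
// so formats that still return metadataItems: [] never show a chevron.
function isExpandable(status: string, items: readonly MetadataItem[]): boolean {
  return status === "Complete" && items.length > 0;
}
```

Under this assumption a PDF row (Complete, empty item list) stays non-expandable without any format-specific check in the UI.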

Spec + plan

  • Spec: docs/superpowers/specs/2026-05-14-issue-22-metadata-diff-design.md
  • Plan: docs/superpowers/plans/2026-05-14-issue-22-metadata-diff.md

What's deferred (per spec §11)

  • Per-format enumeration for PDF, PNG, Office, Video (each ships as its own follow-up PR; the surface is ready)
  • MakerNote vendor sub-tag decoders (opaque only — see spec §11)
  • XMP property-list parsing
  • Diff persistence / export / sensitive-value masking

Verification

Quality gates (all green):

  • yarn typecheck pass
  • yarn lint pass
  • yarn test — 341/341 pass (31 test files)
  • yarn check:deps pass (no circular)
  • yarn build:web pass
  • yarn build:web:standalone pass

E2E:

  • yarn test:e2e:web — 105/105 pass, 17 skipped (1 flaky web-mobile-ios service-worker precache test passed cleanly on rerun; unrelated to diff)
  • yarn test:e2e:standalone — 7/7 pass (standalone-desktop + standalone-mobile)

Forensic gate (per format-strategy-workflow.md §3 / privacy-invariants.md §3):
Re-ran tools/forensic/jpeg.ts against the Phase 1 JPEG strategy with exiftool 12.76. Zero sentinel survival across every recovery technique — raw strings | grep, exiftool -a -G1 -s tag enumeration, and the in-process APP*/COM marker walker all return [] on the default strip (15 b output) and on preserveOrientation=true (51 b output, single 34-byte APP1 carrying only the Orientation entry). Output sizes, segment counts, sentinel-survival counts, and the ExifTool-marker check reproduce the 2026-05-07 baseline at docs/forensic/jpeg.md byte-for-byte. Phase 1 changes are observation-only on the strip path — the walker emits a MetadataItem[] describing what it saw while dropping the same bytes it always dropped; no segment-drop policy or marker-keep policy moved.

Screenshots

Desktop, collapsed (chevron visible on JPEG row):

diff-collapsed

Desktop, expanded (JFIF + EXIF + GPS groups with strikethrough removed values):

diff-expanded

Mobile (390 × 844, iPhone UA), expanded — single-column stack at < 420px:

diff-expanded-mobile

Test plan

  • Drag a JPEG with EXIF + GPS into the standalone build; row shows chevron; click expands; diff shows EXIF + GPS groups with removed items; click collapses
  • Drag a JPEG into the standalone build with preserveOrientation: true in settings; Orientation row renders as kept with the preserved badge
  • Drag a PDF; no chevron appears; click doesn't reveal a diff
  • Mobile (390px viewport): diff stacks to single column, no horizontal scroll

Generated with Claude Code

forgejo_admin added 16 commits 2026-05-15 05:44:56 +00:00
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Generalises the existing orientation-only IFD0 walker into a per-tag
enumerator that produces MetadataItem entries. Adds tag-decoder tables
for IFD0, the Exif SubIFD, and GPS IFD, plus formatter helpers for
orientation, resolution units, exposure time / f-number / focal length,
EXIF dates, MakerNote vendor detection, and GPS compound coordinates.
The walker follows ExifIFDPointer / GPSIFDPointer / InteropIFDPointer
chains and IFD0's next-IFD pointer into IFD1, surfacing every tag as a
"removed" MetadataItem in document order.
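
As a rough illustration of the MakerNote vendor detection mentioned above, a prefix match on the first bytes of the tag value is one plausible approach. The function name and the prefix table are assumptions, not the strategy's code. Canon is left out because its MakerNote is usually documented as having no ASCII signature, so detecting it would need another signal such as the Make tag.

```typescript
// Assumed table of ASCII prefixes commonly seen at the start of MakerNote data.
const VENDOR_PREFIXES: ReadonlyArray<readonly [string, string]> = [
  ["Nikon", "Nikon"],
  ["OLYMP", "Olympus"],
  ["FUJIFILM", "Fuji"],
  ["Panasonic", "Panasonic"],
  ["SONY", "Sony"],
  ["PENTAX", "Pentax"],
  ["AOC", "Pentax"],
  ["LEICA", "Leica"],
];

// Returns the vendor label for the first matching prefix, or undefined
// when the bytes start with none of the known signatures.
function detectMakerNoteVendor(bytes: Uint8Array): string | undefined {
  for (const [prefix, vendor] of VENDOR_PREFIXES) {
    if (bytes.length < prefix.length) continue;
    let matches = true;
    for (let i = 0; i < prefix.length; i++) {
      if (bytes[i] !== prefix.charCodeAt(i)) {
        matches = false;
        break;
      }
    }
    if (matches) return vendor;
  }
  return undefined;
}
```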

GPS coordinates collapse latitude + GPSLatitudeRef (and longitude + its
ref) into a single compound item formatted as "47° 36' 22.3\" N".
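
A minimal sketch of that collapse, assuming the GPS triplet arrives as three EXIF RATIONAL pairs and that seconds are shown with one decimal place. The type and function names are hypothetical.

```typescript
// An EXIF RATIONAL as [numerator, denominator].
type Rational = readonly [number, number];

function toNumber([numerator, denominator]: Rational): number {
  // Guard against malformed zero denominators instead of producing Infinity/NaN.
  return denominator === 0 ? 0 : numerator / denominator;
}

// Collapses a degrees/minutes/seconds triplet plus its hemisphere ref
// into one display string.
function formatGpsCoordinate(
  dms: readonly [Rational, Rational, Rational],
  ref: "N" | "S" | "E" | "W",
): string {
  const degrees = toNumber(dms[0]);
  const minutes = toNumber(dms[1]);
  const seconds = toNumber(dms[2]);
  return `${degrees}° ${minutes}' ${seconds.toFixed(1)}" ${ref}`;
}

// formatGpsCoordinate([[47, 1], [36, 1], [223, 10]], "N") returns: 47° 36' 22.3" N
```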

Strip behaviour is unchanged — enumeration observes the segments that
were already being dropped, it does not alter which bytes are kept.

Tests via a synthetic big-endian TIFF builder (tests/.../fixtures/jpeg_builders.ts)
that round-trips through the strategy's walker. Three new specs cover
IFD0 tags, SubIFD walking via ExifIFDPointer, and GPS compound coords.
The two existing JPEG tests that asserted metadataItems: [] are updated
to expect the real per-tag content now emitted by the walker.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pure render component for the per-file before/after metadata diff:
groups items by source (first-seen ordering), kept items first within
each group, with per-action row styling (red strikethrough for removed,
green arrow target for modified, neutral with `preserved` badge for
kept). Long values truncate with a `title` tooltip; mobile stacks to a
single column under 420px.
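
The ordering rule described above (groups in first-seen order, kept items first within each group) could be implemented along these lines. This is a sketch with hypothetical type and function names, relying on `Map` preserving insertion order.

```typescript
type DiffAction = "removed" | "modified" | "kept";

interface DiffItem {
  source: string;
  action: DiffAction;
  name: string;
}

interface DiffGroup {
  source: string;
  items: DiffItem[];
}

function groupForDisplay(items: readonly DiffItem[]): DiffGroup[] {
  // Map iteration follows insertion order, which yields first-seen grouping.
  const bySource = new Map<string, DiffItem[]>();
  for (const item of items) {
    const bucket = bySource.get(item.source);
    if (bucket) bucket.push(item);
    else bySource.set(item.source, [item]);
  }
  // Within each group, kept items move to the front; relative order is otherwise stable.
  return Array.from(bySource.entries()).map(([source, bucket]) => ({
    source,
    items: [
      ...bucket.filter((item) => item.action === "kept"),
      ...bucket.filter((item) => item.action !== "kept"),
    ],
  }));
}
```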

The diff i18n keys (added in Task 9) carry `{count}` placeholders.
Interpolation runs locally in the component via `.replace("{count}",
...)`, mirroring the `ErrorExpansion.tsx` pattern, because the live
`useI18n()` `t` signature is `(key: string) => string` and does not
interpolate vars.
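
A sketch of that local interpolation, given a `t` of type `(key: string) => string`. The helper name and the key used in the usage note are hypothetical.

```typescript
type Translate = (key: string) => string;

// Looks up the translated template, then substitutes the first "{count}".
// String.prototype.replace with a string pattern replaces one occurrence only.
function formatCount(t: Translate, key: string, count: number): string {
  return t(key).replace("{count}", String(count));
}
```

For example, with a `t` that returns `"{count} removed"` for a hypothetical `diffRemovedCount` key, `formatCount(t, "diffRemovedCount", 3)` yields `"3 removed"`.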

Tests use `react-dom/server.renderToStaticMarkup` for assertions on
the rendered HTML — no DOM environment, no new dev deps. Vitest
include pattern extended to `.test.{ts,tsx}` so the component test
file is picked up.

Tokens added under `[data-theme="dark"]` (matching the existing
codebase pattern) rather than `@media (prefers-color-scheme: dark)`.
The existing surface/border/text-secondary tokens are reused for the
group header chrome.

Task 12 of issue #22. Two specs run under all three web projects (desktop,
mobile-iOS, mobile-android):

- JPEG: setInputFiles sample.jpg → row becomes Complete + --expandable,
  ChevronIcon is visible, click/tap reveals .file-table__diff with an
  EXIF group header and at least one --removed row carrying the
  --strike value. Mobile-responsive guard asserts the diff width fits
  within the viewport. Toggle collapses the panel.
- PDF: sample.pdf processes to a Complete row but the PDF strategy
  emits no items in Phase 1, so isExpandable falls through and the
  diff panel is never rendered.

Uses .file-browse-button__input + force:true (the established mobile-safe
pattern from file-processing.spec.ts) so no separate drag-vs-tap branch
is needed for file ingestion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The standalone HTML build is the project's primary distribution channel,
so the metadata-diff UX from Task 12 must be verified against the inlined
bundle too — not only the hosted web build. Two projects so the diff is
exercised on both desktop and mobile viewports.

- playwright.config.ts: replace the single `standalone` project with
  `standalone-desktop` (smoke specs + metadata_diff) and `standalone-mobile`
  (metadata_diff). Both set `baseURL` to a `file://` URL of
  `dist/web-standalone/index.html` so `launchPage()` works without per-spec
  branching.
- tests/e2e/web/helpers/page_launcher.ts: handle the file:// baseURL by
  navigating to `""` instead of `"/"` — under `file://`, `"/"` resolves to
  the filesystem root and 404s; `""` resolves to the baseURL itself.
- package.json: `yarn test:e2e:standalone` now drives both standalone
  projects.
- tests/e2e/README.md: document the two new projects.
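
The `""` versus `"/"` behaviour follows from standard URL resolution, which Playwright's documentation says it applies to `baseURL` via the `URL()` constructor. The snippet below reproduces it with a made-up absolute path; `launchTarget` is a hypothetical helper, not the code in `page_launcher.ts`.

```typescript
// Hypothetical absolute path standing in for the built standalone file.
const baseURL = "file:///tmp/dist/web-standalone/index.html";

// "/" is path-absolute, so it replaces the whole path: the filesystem root.
const rootTarget = new URL("/", baseURL).href;

// "" resolves to the base URL itself.
const sameTarget = new URL("", baseURL).href;

// Assumed helper: choose the navigation target from the baseURL scheme.
function launchTarget(base: string): string {
  return base.startsWith("file://") ? "" : "/";
}
```

Here `rootTarget` is `file:///` and `sameTarget` equals `baseURL`, which is why the launcher has to avoid `"/"` under `file://`.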

Verified: yarn test:e2e:standalone → 7 passed (3 smoke + 2 diff specs ×
2 viewports for diff). yarn test:e2e:web → 109 passed, 23 skipped
(unchanged from Task 12). Quality gates green: lint, typecheck, vitest.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The spec tested Electron-era `MetadataExpansion` inspection behaviour using the
`.metadata-expansion` selector. That component was retired in Phase G (Electron
removed entirely), and the diff feature (Task 11) replaced it with
`MetadataDiffExpansion` rendered via `.file-table__diff`.

The spec's only currently-passing test asserted `.metadata-expansion` count is
0 — a meaningless assertion since that selector no longer exists anywhere in
the source. Its other two tests were permanently `test.skip(true, ...)`.
`metadata_diff.spec.ts` (Task 12) now covers the JPEG-row expandability +
diff-panel reveal flow that this stale spec was nominally testing, and does so
against the actual current UI.

No coverage lost: the new diff spec asserts row expandability for JPEG
(equivalent to the deleted spec's surviving check), verifies the new diff
selectors are reachable on click, and adds the non-JPEG no-chevron Phase 1
gate assertion the old spec couldn't.

Verified: `yarn test:e2e:web` → 106 passed (was 109; -3 = the one passing
metadata-inspection test × 3 projects). `yarn test:e2e:standalone` → 7 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
chore(forensic): update JPEG runner log line for new StripResult shape
Some checks failed
CI / Lint, Typecheck & Unit Tests (pull_request) Successful in 38s
CI / E2E (Standalone single-file) (pull_request) Failing after 1m49s
CI / E2E (Web) (pull_request) Successful in 2m34s
1e081b48df
Phase 1 of the metadata-diff feature renamed StripResult.metadataRemoved
(number) to StripResult.metadataItems (array). Logging path in the
forensic runner stayed on the old field and printed 'dropped undefined'.
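
The failure mode is easy to reproduce in isolation. In the sketch below only the two field names and the `dropped undefined` fragment come from this commit; the function names and the rest of the log line are assumptions.

```typescript
// Old shape (a count) and new shape (an array), as described in this commit.
interface LegacyStripResult {
  metadataRemoved: number;
}

interface StripResult {
  metadataItems: readonly unknown[];
}

// Stale logging path: reads a field that no longer exists on the new shape.
function formatDroppedLineStale(result: StripResult): string {
  const legacy = result as unknown as Partial<LegacyStripResult>;
  return `dropped ${legacy.metadataRemoved}`;
}

// Updated logging path: derives the count from the array length.
function formatDroppedLine(result: StripResult): string {
  return `dropped ${result.metadataItems.length}`;
}
```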

Strip behaviour and the recovery-battery results are unchanged:

| Channel                     | default | preserveOrientation | exiftool -all= |
| --------------------------- | ------- | ------------------- | -------------- |
| Output size                 | 15 b    | 51 b (34-byte APP1) | 15 b           |
| Raw strings sentinels       | 0       | 0                   | 0              |
| ExifTool visible tags       | 0       | 0                   | 0              |
| In-process marker walk      | 0       | 0                   | 0              |
| Orientation                 | absent  | 6                   | absent         |
| ExifTool marker in bytes    | no      | no                  | no             |

Matches the baseline at docs/forensic/jpeg.md byte-for-byte. Privacy
bar unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
forgejo_admin added 1 commit 2026-05-15 06:12:57 +00:00
fix(ci): standalone-mobile uses Pixel 7 (Chromium) so the CI's Chromium-only install works
All checks were successful
CI / Lint, Typecheck & Unit Tests (pull_request) Successful in 38s
CI / E2E (Standalone single-file) (pull_request) Successful in 2m1s
CI / E2E (Web) (pull_request) Successful in 3m26s
211acf254f
forgejo_admin added 1 commit 2026-05-15 06:19:24 +00:00
fix(diff): expand EXIF SubIFD + GPS + IFD0 decoder tables to cover Exif 2.32 spec
All checks were successful
CI / Lint, Typecheck & Unit Tests (pull_request) Successful in 40s
CI / E2E (Standalone single-file) (pull_request) Successful in 1m34s
CI / E2E (Web) (pull_request) Successful in 2m29s
3042c26bf7
forgejo_admin added 1 commit 2026-05-15 07:03:39 +00:00
refactor(exif): extract IFD walker + generate tag names from exiftool
All checks were successful
CI / Lint, Typecheck & Unit Tests (pull_request) Successful in 38s
CI / E2E (Standalone single-file) (pull_request) Successful in 4m38s
CI / E2E (Web) (pull_request) Successful in 5m53s
1c17b5a9fb
Splits the EXIF walker and tag dictionary out of `jpeg_strategy.ts` into
shared modules under `src/infrastructure/wasm/exif/` and
`src/domain/exif/`, so future format strategies (PNG eXIf, HEIC Exif,
native TIFF) can consume the same parser without copying the
nine-hundred-LOC body of the JPEG strategy.

The new layout splits along a clean DDD seam:

- `src/domain/exif/ifd_tag_names.ts` — GENERATED from `exiftool -listx`
  via `scripts/generate_exif_tags.mjs` (idempotent, no deps, regenerate
  with `yarn generate:exif-tags`). Four numeric dictionaries: IFD0
  (242), ExifIFD (357), GPS (32), InteropIFD (5).
- `src/domain/exif/ifd_value_formatters.ts` — hand-rolled,
  primitive-typed (no IfdEntry dependency).
- `src/infrastructure/wasm/exif/ifd_entry.ts` — `IfdEntry` interface +
  TIFF type constants.
- `src/infrastructure/wasm/exif/ifd_readers.ts` — endian-aware readers,
  `describeUnknown`, `formatByteSize`.
- `src/infrastructure/wasm/exif/ifd_decoders.ts` — assembles
  `{ name, format }` per tag; small `IFD0_NAME_OVERRIDES` /
  `EXIF_SUBIFD_NAME_OVERRIDES` preserve Exif-spec names where
  ExifTool ships its own alias (0x9204 ExposureBiasValue vs
  ExposureCompensation; 0x927c MakerNote vs the 91 vendor aliases;
  0x0201/0x0202 JPEGInterchangeFormat[Length] vs ThumbnailOffset/Length).
- `src/infrastructure/wasm/exif/ifd_walker.ts` — `parseIfd`,
  `enumerateExifSegment` (APP1 path with "Exif\0\0" prefix),
  `enumerateExifTiff` (prefix-less path for PNG eXIf / HEIC).
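
For orientation, an endian-aware IFD parse over a TIFF block can be sketched as follows. The names `parseIfd` and `IfdEntry` appear in this commit, but the signatures, field names, and the `readTiffHeader` helper here are assumptions based on the TIFF layout (2-byte entry count, 12-byte entries, 4-byte next-IFD offset), not the extracted module's API.

```typescript
interface IfdEntry {
  tag: number;
  type: number;
  count: number;
  // Byte position of the entry's 4-byte value-or-offset field.
  valueOffset: number;
}

// Reads the byte-order mark ("II" or "MM"), the magic number 42, and the first IFD offset.
function readTiffHeader(view: DataView): { littleEndian: boolean; firstIfdOffset: number } {
  const order = view.getUint16(0, false);
  if (order !== 0x4949 && order !== 0x4d4d) throw new Error("not a TIFF header");
  const littleEndian = order === 0x4949;
  if (view.getUint16(2, littleEndian) !== 42) throw new Error("bad TIFF magic");
  return { littleEndian, firstIfdOffset: view.getUint32(4, littleEndian) };
}

// Walks one IFD. DataView throws a RangeError on out-of-bounds reads,
// so truncated input fails loudly rather than returning garbage.
function parseIfd(
  view: DataView,
  ifdOffset: number,
  littleEndian: boolean,
): { entries: IfdEntry[]; nextIfdOffset: number } {
  const entryCount = view.getUint16(ifdOffset, littleEndian);
  const entries: IfdEntry[] = [];
  for (let i = 0; i < entryCount; i++) {
    const base = ifdOffset + 2 + i * 12;
    entries.push({
      tag: view.getUint16(base, littleEndian),
      type: view.getUint16(base + 2, littleEndian),
      count: view.getUint32(base + 4, littleEndian),
      valueOffset: base + 8,
    });
  }
  const nextIfdOffset = view.getUint32(ifdOffset + 2 + entryCount * 12, littleEndian);
  return { entries, nextIfdOffset };
}
```

A caller would follow `nextIfdOffset` from IFD0 to reach IFD1, and follow pointer tags such as ExifIFDPointer by calling `parseIfd` again at the offset stored in the entry's value field.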

`jpeg_strategy.ts` drops from ~1629 to ~495 LOC; it now contains only
the JPEG segment walker, drop policy, orientation-only APP1 synthesis,
and JFIF/ICC/Comment helpers. Zero behaviour change — all 343 existing
tests pass unmodified.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
forgejo_admin added 1 commit 2026-05-15 11:23:36 +00:00
refactor: generate EXIF decoder tables from ExifTool PrintConv data
All checks were successful
CI / Lint, Typecheck & Unit Tests (pull_request) Successful in 1m0s
CI / E2E (Standalone single-file) (pull_request) Successful in 1m30s
CI / E2E (Web) (pull_request) Successful in 2m20s
32e5be6c41
Replace 462-LOC hand-typed ifd_decoders.ts + 337-LOC ifd_value_formatters.ts
with generated tag data (type + enumValues from exiftool -listx) and a ~100-LOC
generic assembler. Enum tables (MeteringMode, FlashMode, etc.) are now machine-
derived; only display-shape overrides (Orientation, ExposureTime, FNumber,
FocalLength, MakerNote, GPS) remain hand-rolled.

- scripts/generate_exif_tags.mjs extended to emit type + enumValues per tag
- src/domain/exif/ifd_tag_names.ts replaced by ifd_tag_data.ts (richer shape:
  TagData interface with name/type/enumValues; four Record<number, TagData>
  exports: IFD0_TAG_DATA, EXIF_SUBIFD_TAG_DATA, GPS_TAG_DATA, INTEROP_TAG_DATA)
- ifd_decoders.ts: IFD0_OVERRIDES/EXIF_SUBIFD_OVERRIDES/GPS_OVERRIDES replace
  hand-typed enum tables; buildDecoders() also merges override-only tags (fixes
  MakerNote: ExifTool buckets 0x927c in ExifIFD but real files emit it from IFD0)
- ifd_value_formatters.ts: deleted hand-typed enum maps; only display-shape
  formatters remain (formatShortEnum/formatAsciiEnum helpers retained)
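
One way to picture the generic assembler is the sketch below. `TagData` (name/type/enumValues) and `buildDecoders` are named in this commit, but the signatures, the `Decoder` and `DecoderOverride` shapes, and the fallback label for unnamed tags are assumptions. The second loop is the part that admits override-only tags.

```typescript
interface TagData {
  name: string;
  type: string;
  enumValues?: Record<number, string>;
}

type RawValue = number | string;

interface Decoder {
  name: string;
  format: (raw: RawValue) => string;
}

type DecoderOverride = Partial<Decoder>;

// Default formatter: map numeric values through the generated enum table, else stringify.
function enumFormatter(enumValues?: Record<number, string>): (raw: RawValue) => string {
  return (raw) => {
    if (typeof raw === "number" && enumValues) {
      const label = enumValues[raw];
      if (label !== undefined) return label;
    }
    return String(raw);
  };
}

function buildDecoders(
  generated: Record<number, TagData>,
  overrides: Record<number, DecoderOverride>,
): Map<number, Decoder> {
  const decoders = new Map<number, Decoder>();
  // Pass 1: every generated tag, with any override fields taking precedence.
  for (const key of Object.keys(generated)) {
    const tag = Number(key);
    const data = generated[tag];
    if (!data) continue;
    const override = overrides[tag];
    decoders.set(tag, {
      name: override?.name ?? data.name,
      format: override?.format ?? enumFormatter(data.enumValues),
    });
  }
  // Pass 2: tags that exist only in the override table for this IFD.
  for (const key of Object.keys(overrides)) {
    const tag = Number(key);
    if (decoders.has(tag)) continue;
    const override = overrides[tag];
    if (!override) continue;
    decoders.set(tag, {
      name: override.name ?? `Tag 0x${tag.toString(16).padStart(4, "0")}`,
      format: override.format ?? enumFormatter(),
    });
  }
  return decoders;
}
```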

Regenerate: node scripts/generate_exif_tags.mjs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
forgejo_admin added 1 commit 2026-05-15 11:47:56 +00:00
feat(diff): add kept-reason copy to metadata diff expansion
All checks were successful
CI / Lint, Typecheck & Unit Tests (pull_request) Successful in 39s
CI / E2E (Standalone single-file) (pull_request) Successful in 2m40s
CI / E2E (Web) (pull_request) Successful in 3m19s
a8716f9b52
Adds optional reason field to the kept MetadataItem variant. The JPEG
walker fills it (diffKeptReasonOrientation / diffKeptReasonColorProfile)
and the diff expansion renders a small muted label explaining why the
item was preserved. i18n: en/es/ar translations added.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
forgejo_admin merged commit 92278e5174 into master 2026-05-15 11:52:15 +00:00
Reference: forgejo_admin/exifcleaner-web#119