feat(diff): detailed before/after metadata diff — surface + JPEG (#22) (#119)
Some checks failed
CI / Lint, Typecheck & Unit Tests (push) Failing after 34s
CI / E2E (Web) (push) Has been skipped
CI / E2E (Standalone single-file) (push) Has been skipped

This commit is contained in:
forgejo_admin 2026-05-15 15:52:14 +04:00
parent b960329464
commit 92278e5174
49 changed files with 8392 additions and 322 deletions

3
.gitignore vendored

@ -313,3 +313,6 @@ test-results/
# Git worktrees
.worktrees/
# Superpowers brainstorming mockups
.superpowers/


@ -1333,5 +1333,33 @@
},
"folder.zipDownloadFilename": {
"en": "metascrub-{folder}-{date}.zip"
},
"diffGroupRemoved": {
"en": "{count} removed"
},
"diffGroupModified": {
"en": "{count} modified"
},
"diffGroupKept": {
"en": "{count} kept"
},
"diffGroupSeparator": {
"en": ", "
},
"diffKeptBadge": {
"en": "preserved"
},
"diffArrow": {
"en": "→"
},
"diffKeptReasonOrientation": {
"en": "Preserve orientation is enabled",
"es": "Preservar orientación está activado",
"ar": "الحفاظ على الاتجاه مفعّل"
},
"diffKeptReasonColorProfile": {
"en": "Preserve color profile is enabled",
"es": "Preservar perfil de color está activado",
"ar": "الحفاظ على ملف تعريف الألوان مفعّل"
}
}

File diff suppressed because it is too large


@ -0,0 +1,505 @@
# Issue #22 — Detailed before/after metadata diff (Phase 1: surface + JPEG)
**Date:** 2026-05-14
**Issue:** [#22 E4: Detailed before/after metadata diff view](http://localhost:3000/forgejo_admin/exifcleaner-web/issues/22)
**Parent track:** Phase E.2 — Features
**Predecessors:** [`2026-05-05-webapp-migration-design.md`](2026-05-05-webapp-migration-design.md) §9.5 E4, [`2026-05-10-phase-e-f-design.md`](2026-05-10-phase-e-f-design.md), [`2026-05-11-phase-e-trust-copy-design.md`](2026-05-11-phase-e-trust-copy-design.md).
**Status:** Design approved, ready for implementation planning.
---
## 1. Goal
Replace the current count-only "metadata removed" model with a per-file expandable **before/after diff** that lists which tags were removed (red strikethrough), which were modified to a privacy-safe value (green `before → after`), and which were preserved (neutral grey, with a `preserved` badge).
The issue's acceptance criterion ("expandable before/after table per file: tag name, value before, value after (empty = removed)") is satisfied for JPEG in this PR; PDF/PNG/Office/video follow in per-format PRs that reuse the same surface.
## 2. Scope of this spec (Phase 1)
| Layer | Phase 1 scope |
|---|---|
| Shared types (`MetadataItem`, extended `StripResult`) | ✓ ship |
| Renderer state (`FileEntry.metadataItems`, reducer, `ProcessOutcome`) | ✓ ship |
| `MetadataDiffExpansion` component + BEM CSS + i18n strings | ✓ ship |
| JPEG strategy — per-tag enumeration of EXIF IFD0 + SubIFD + GPS + IFD1 + InteropIFD + MakerNote (opaque) + XMP (blob) + ICC (blob) + JFIF + Comment | ✓ ship |
| PDF / PNG / Office / Video — per-tag enumeration | ✗ deferred to per-format follow-up PRs (return `[]` in this PR) |
| MakerNote vendor decoders | ✗ never (see §11) |
| Diff export, persistence, masking, re-process | ✗ never (see §11) |
## 3. The `MetadataItem` contract
The shape every strategy returns. Three-variant discriminated union; one array carried on `StripResult`.
```ts
// src/domain/exif/metadata_item.ts (new file)
export type MetadataItem =
| {
readonly action: "removed";
readonly source: string; // "EXIF" | "GPS" | "XMP" | "ICC" | "JFIF" | "Comment" | ...
readonly name: string; // canonical tag name: "Make", "GPSLatitude", ...
readonly valueBefore: string; // pre-formatted string, ready to display
}
| {
readonly action: "modified";
readonly source: string;
readonly name: string;
readonly valueBefore: string;
readonly valueAfter: string; // pre-formatted, e.g. "epoch (0)"
}
| {
readonly action: "kept";
readonly source: string;
readonly name: string;
readonly value: string;
};
// src/infrastructure/wasm/format_strategy.ts (extend existing)
export interface StripResult {
readonly bytes: Uint8Array;
readonly metadataItems: readonly MetadataItem[];
}
```
Notes:
- The existing `metadataRemoved: number` field on `StripResult` is **removed**. The `NoMetadataFound` check changes from `metadataRemoved === 0` to `metadataItems.filter(i => i.action !== "kept").length === 0`. Kept items don't count as "metadata we found and removed."
- `source` labels are short jargon strings (`"EXIF"`, `"GPS"`, `"XMP"`, …) — **not** translated.
- `name` is the canonical English tag name from the format spec — **not** translated.
- `value` / `valueBefore` / `valueAfter` are **pre-formatted strings**. The strategy renders rationals to `"1/120 s"`, GPS triplets + ref to `"47° 36' 22.3\" N"`, binary blobs to `"<binary, N KB>"`. The renderer is dumb; the strategy owns presentation.
- `readonly` throughout — items are immutable once emitted.
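A minimal sketch of how a consumer might narrow the three-variant union and apply the `NoMetadataFound` rule from the notes above. The union is copied from the contract; `describeItem` and `hasStrippedMetadata` are hypothetical helper names used only for illustration and are not part of the spec.

```typescript
// Union copied from the contract above (flattened for brevity).
type MetadataItem =
  | { readonly action: "removed"; readonly source: string; readonly name: string; readonly valueBefore: string }
  | { readonly action: "modified"; readonly source: string; readonly name: string; readonly valueBefore: string; readonly valueAfter: string }
  | { readonly action: "kept"; readonly source: string; readonly name: string; readonly value: string };

// Hypothetical helper: one line of text per item, narrowed on `action`.
function describeItem(item: MetadataItem): string {
  switch (item.action) {
    case "removed":
      return `${item.source}:${item.name} removed (was ${item.valueBefore})`;
    case "modified":
      return `${item.source}:${item.name} ${item.valueBefore} -> ${item.valueAfter}`;
    case "kept":
      return `${item.source}:${item.name} kept (${item.value})`;
    default: {
      // Exhaustiveness check: stops type-checking if a fourth variant is added.
      const unreachable: never = item;
      return unreachable;
    }
  }
}

// Hypothetical helper: kept items do not count as stripped metadata.
function hasStrippedMetadata(items: readonly MetadataItem[]): boolean {
  return items.some((i) => i.action !== "kept");
}
```

Because each variant names its value fields differently, the compiler rejects reading `valueAfter` on a `removed` item, which is the main payoff of the discriminated union over a single shape with optional fields.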
## 4. Per-strategy responsibility & Phase 1 rollout
Every strategy implements the same contract. Phase 1 only fills it for JPEG. The other four return `[]` for now and gain enumeration in per-format follow-up PRs.
| Strategy | Phase 1 | Follow-up PR (each independent) |
|---|---|---|
| **JPEG** | Per-tag IFD0 + SubIFD + GPS + IFD1 + Interop. XMP / ICC / JFIF / Comment as blob items. MakerNote opaque. Orientation as `kept` when `preserveOrientation: true`. ICC profile as `kept` when `preserveColorProfile: true`. | (done) |
| **PDF** | `[]` | Per-key Info dict items, per-annotation PII keys, catalog fingerprint keys per-key, XMP stream as blob. |
| **PNG** | `[]` | Per dropped chunk. tEXt/iTXt/zTXt parsed to key+value items. eXIf as blob (or shared with JPEG IFD walker if extracted). |
| **Office** | `[]` | Per deleted file (docProps/core.xml, thumbnail.jpeg, ...). ZIP entry mtimes as one `modified` summary item. |
| **Video** | `[]` | Per zeroed metadata box. mvhd/tkhd/mdhd timestamps as `modified` items. |
### Renderer behavior with empty items
`isExpandable` in `FileRow.tsx` becomes:
```ts
const isExpandable =
isError ||
file.status === FileProcessingStatus.NoMetadataFound ||
(file.status === FileProcessingStatus.Complete && file.metadataItems.length > 0);
```
- `Complete` with items → chevron + `MetadataDiffExpansion`
- `Complete` with empty items (non-JPEG in Phase 1) → checkmark, **no chevron**, no expansion
- `NoMetadataFound` and `Error` — unchanged
In Phase 1, JPEG rows show a chevron; PDF/PNG/Office/video rows don't. Each follow-up PR flips its format's rows expandable. The existing latent quirk where `Complete` rows had a chevron but no content is closed as a side effect.
## 5. JPEG enumeration policy
`src/infrastructure/wasm/strategies/jpeg_strategy.ts` gains a `~150 LOC` tag-decoder layer on top of the existing IFD walker (which already parses orientation via `extractOrientation`).
### IFDs walked
| IFD | Reached via | Source label |
|---|---|---|
| IFD0 | TIFF header offset (Exif 2.32 §4.6) | `"EXIF"` |
| ExifIFD (SubIFD) | IFD0 tag 0x8769 (ExifIFDPointer) | `"EXIF"` |
| GPS IFD | IFD0 tag 0x8825 (GPSIFDPointer) | `"GPS"` |
| InteropIFD | ExifIFD tag 0xA005 (InteroperabilityIFDPointer) | `"EXIF"` |
| IFD1 (thumbnail) | IFD0 next-IFD offset | `"EXIF"` |
`EXIF` covers IFD0 / SubIFD / Interop / IFD1 because users don't conceptually separate them; the underlying tag namespaces overlap minimally. `GPS` is separated because it's the most privacy-sensitive and benefits from a distinct group header.
### Tag-decoder table
The strategy holds a `Record<number, { name: string; format: (entry) => string }>` for each IFD. Entries that miss the table are still emitted (so the diff is honest about unknown tags) under their canonical hex name `"Tag 0xNNNN"`.
```ts
// Excerpt — values are illustrative
const EXIF_IFD0_TAGS: Record<number, { name: string; format: (e: IfdEntry) => string }> = {
0x010F: { name: "Make", format: e => readAscii(e) },
0x0110: { name: "Model", format: e => readAscii(e) },
0x0112: { name: "Orientation", format: e => formatOrientation(readShort(e)) },
0x011A: { name: "XResolution", format: e => formatRational(readRational(e)) },
// ...
};
```
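The table-miss fallback described above can be sketched as follows. The `IfdEntry` shape here is a simplified stand-in (the real one lives in `jpeg_strategy.ts` and is not shown in this spec), `decodeEntry` and `hexTagName` are hypothetical names, and the `"<undecoded>"` value for unknown tags is an assumption, since the spec only fixes the name format.

```typescript
// Simplified stand-in for the real IfdEntry (assumption: ASCII payload only).
interface IfdEntry {
  readonly tag: number;
  readonly ascii: string;
}

type TagDecoder = { name: string; format: (e: IfdEntry) => string };

// Tiny demo table; the real tables cover every known tag per IFD.
const DEMO_TAGS: Record<number, TagDecoder> = {
  0x010f: { name: "Make", format: (e) => e.ascii.replace(/\0+$/, "") },
};

// Canonical hex name for tags that miss the table: "Tag 0xNNNN".
function hexTagName(tag: number): string {
  return `Tag 0x${tag.toString(16).toUpperCase().padStart(4, "0")}`;
}

// Look the entry up; unknown tags are still reported so the diff stays honest.
function decodeEntry(
  table: Record<number, TagDecoder>,
  entry: IfdEntry,
): { name: string; value: string } {
  const decoder = table[entry.tag];
  if (decoder === undefined) {
    return { name: hexTagName(entry.tag), value: "<undecoded>" };
  }
  return { name: decoder.name, value: decoder.format(entry) };
}
```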
### Value formatters (canonical pre-formatted strings)
| Type | Example raw | Emitted value |
|---|---|---|
| `ASCII` | `"Apple\0"` | `"Apple"` (trailing nulls trimmed) |
| `SHORT` (enum) | `Orientation=6` | `"6 (Rotate 90 CW)"` |
| `SHORT` (units) | `ResolutionUnit=2` | `"2 (inches)"` |
| `RATIONAL` | `1/120` | `"1/120 s"` for ExposureTime; `"f/4.5"` for FNumber; plain `"300"` for XResolution |
| `RATIONAL` triplet + ref | `GPSLatitude=[47, 36, 22.3]` + `GPSLatitudeRef="N"` | `"47° 36' 22.3\" N"` (one item; ref is consumed, not emitted) |
| `BYTE`/`UNDEFINED` blob | `MakerNote = <2528 bytes>` | `"<binary, 2.5 KB>"` |
| `LONG` | `0x0202` | `"514"` (decimal). ExifVersion is the exception: its four ASCII bytes are emitted verbatim, e.g. `"0220"` |
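Three of the formatters in the table, sketched under stated assumptions: the function names are hypothetical, 1 KB is taken as 1024 bytes, and the sub-kilobyte `B` form is a guess because the table only shows KB examples.

```typescript
// Blob placeholder such as "<binary, 2.5 KB>" (assumes 1 KB = 1024 bytes).
function formatBinarySize(byteLength: number): string {
  if (byteLength < 1024) return `<binary, ${byteLength} B>`;
  return `<binary, ${(byteLength / 1024).toFixed(1)} KB>`;
}

// GPS triplet plus hemisphere ref, merged into one display string.
function formatGpsCoordinate(
  dms: readonly [number, number, number],
  ref: string,
): string {
  const [degrees, minutes, seconds] = dms;
  return `${degrees}° ${minutes}' ${seconds}" ${ref}`;
}

// ExposureTime keeps the fraction and appends the unit.
function formatExposureTime(numerator: number, denominator: number): string {
  return `${numerator}/${denominator} s`;
}
```

Keeping these as pure string-returning functions matches the rule that the strategy owns presentation, and makes each one testable without a JPEG fixture.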
### MakerNote opaque emission
When ExifIFD (SubIFD) tag 0x927C (MakerNote) is present, emit exactly one item:
```ts
{ action: "removed", source: "EXIF", name: "MakerNote (Nikon)", valueBefore: "<binary, 2.5 KB>" }
```
Vendor detection: leading bytes match Nikon / Canon / Sony / Olympus / Fuji / Pentax / Panasonic / Leica signature tables. Unknown vendor → plain `"MakerNote"` (no parens). Per-vendor sub-tag decoding is explicitly **not** done (see §11).
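A rough sketch of prefix-based vendor detection with the plain `"MakerNote"` fallback. The signature list below is illustrative only: it is a small subset, it has not been checked against the project's actual signature tables, and some vendors cannot be identified from a leading byte signature alone. `makerNoteName` and `startsWithAscii` are hypothetical names.

```typescript
// Illustrative subset of leading-byte signatures (assumption, not the project's table).
const SIGNATURES: ReadonlyArray<{ vendor: string; prefix: string }> = [
  { vendor: "Nikon", prefix: "Nikon\0" },
  { vendor: "Olympus", prefix: "OLYMP" },
  { vendor: "Fuji", prefix: "FUJIFILM" },
  { vendor: "Panasonic", prefix: "Panasonic" },
];

// True when the payload begins with the given ASCII prefix.
function startsWithAscii(bytes: Uint8Array, prefix: string): boolean {
  if (bytes.length < prefix.length) return false;
  for (let i = 0; i < prefix.length; i++) {
    if (bytes[i] !== prefix.charCodeAt(i)) return false;
  }
  return true;
}

// Item name for the single opaque MakerNote entry.
function makerNoteName(bytes: Uint8Array): string {
  const hit = SIGNATURES.find((s) => startsWithAscii(bytes, s.prefix));
  return hit === undefined ? "MakerNote" : `MakerNote (${hit.vendor})`;
}
```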
### XMP, ICC, JFIF, Comment
Emitted as one item per dropped segment:
```ts
{ action: "removed", source: "XMP", name: "XMP packet", valueBefore: "<XML, 4.2 KB>" }
{ action: "removed", source: "ICC", name: "<profile desc>", valueBefore: "<binary, 3.1 KB>" }
{ action: "removed", source: "JFIF", name: "JFIF block", valueBefore: "version 1.01, density 72×72 dpi" }
{ action: "removed", source: "Comment", name: "Comment", valueBefore: "<the comment string>" }
```
For ICC, the profile description is parsed from the `desc` tag if present; otherwise the item is labelled `"ICC profile"`. For JFIF, the human-readable summary is derived from the segment header.
### Kept items
- `preserveOrientation: true` and source had Orientation → emit one `kept` item: `{ action: "kept", source: "EXIF", name: "Orientation", value: "6 (Rotate 90 CW)" }`. All other EXIF tags still `removed`.
- `preserveColorProfile: true` and source had ICC → emit one `kept` item with same shape: `{ action: "kept", source: "ICC", name: "<profile desc>", value: "<binary, 3.1 KB>" }`.
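The kept/removed switch for Orientation could look roughly like this. `orientationItem` and the `PreserveOptions` shape are hypothetical; only the two emitted item shapes come from the spec.

```typescript
type MetadataItem =
  | { readonly action: "removed"; readonly source: string; readonly name: string; readonly valueBefore: string }
  | { readonly action: "modified"; readonly source: string; readonly name: string; readonly valueBefore: string; readonly valueAfter: string }
  | { readonly action: "kept"; readonly source: string; readonly name: string; readonly value: string };

// Hypothetical options shape; the real settings object is not shown in this spec.
interface PreserveOptions {
  readonly preserveOrientation: boolean;
}

// Emits Orientation as `kept` when the setting is on, `removed` otherwise.
function orientationItem(formatted: string, opts: PreserveOptions): MetadataItem {
  return opts.preserveOrientation
    ? { action: "kept", source: "EXIF", name: "Orientation", value: formatted }
    : { action: "removed", source: "EXIF", name: "Orientation", valueBefore: formatted };
}
```

The same pattern would apply to the ICC profile with `preserveColorProfile`. Note the downstream consequence: a JPEG whose only tag is a preserved Orientation produces no non-kept items, so it resolves to `NoMetadataFound`.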
## 6. State plumbing
### 6.1 `FileEntry` (in `src/web/contexts/AppContext.tsx`)
```ts
export interface FileEntry {
// ...existing fields unchanged...
metadataItems: readonly MetadataItem[]; // new — defaults to [] on ADD_FILES
}
```
`[]` (not `null`) at construction. The renderer uses `metadataItems.length` together with `status` to disambiguate "no diff to show" from "still processing."
### 6.2 Reducer action `UPDATE_FILE_METADATA`
Extended to carry both `afterBytes` and `metadataItems`, which arrive together from the strategy:
```ts
| {
type: "UPDATE_FILE_METADATA";
id: string;
afterBytes: number;
metadataItems: readonly MetadataItem[];
};
```
No new action type. The existing one (already misnamed for "just bytes") becomes accurately named.
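A sketch of how the reducer case might apply the extended action. The `FileEntry` here is trimmed to the fields involved, and `reduceFiles` is a hypothetical stand-in for the relevant slice of the real `AppContext` reducer, whose full state shape this spec does not show.

```typescript
type MetadataItem =
  | { readonly action: "removed"; readonly source: string; readonly name: string; readonly valueBefore: string }
  | { readonly action: "modified"; readonly source: string; readonly name: string; readonly valueBefore: string; readonly valueAfter: string }
  | { readonly action: "kept"; readonly source: string; readonly name: string; readonly value: string };

// Trimmed to the fields this action touches (assumption: afterBytes starts null).
interface FileEntry {
  readonly id: string;
  readonly afterBytes: number | null;
  readonly metadataItems: readonly MetadataItem[];
}

type UpdateFileMetadata = {
  type: "UPDATE_FILE_METADATA";
  id: string;
  afterBytes: number;
  metadataItems: readonly MetadataItem[];
};

// Returns a new array; only the matching entry is replaced.
function reduceFiles(
  files: readonly FileEntry[],
  action: UpdateFileMetadata,
): readonly FileEntry[] {
  return files.map((f) =>
    f.id === action.id
      ? { ...f, afterBytes: action.afterBytes, metadataItems: action.metadataItems }
      : f,
  );
}
```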
### 6.3 `ProcessOutcome` (in `src/infrastructure/web/web_api.ts`)
The shape `window.api.wasm.process(...)` resolves to:
```ts
export interface ProcessOutcome {
// ...existing fields unchanged...
metadataItems: readonly MetadataItem[]; // [] on error or NoMetadataFound
}
```
`WasmProcessor` forwards items straight from `strategy.strip()`'s success Result. No transformation in the processor.
### 6.4 `use_process_files` hook
After `window.api.wasm.process(...)` returns:
```ts
dispatch({
type: "UPDATE_FILE_METADATA",
id: fileId,
afterBytes: outcome.afterBytes,
metadataItems: outcome.metadataItems,
});
const removedOrModified = outcome.metadataItems.filter(i => i.action !== "kept").length;
dispatch({
type: "UPDATE_FILE_STATUS",
id: fileId,
status: removedOrModified === 0
? FileProcessingStatus.NoMetadataFound
: FileProcessingStatus.Complete,
});
```
### 6.5 What does *not* change
`FormatStrategy.extensions`, `verifyMagicBytes`, the strategy registry router, file registry, batch zip output, download adapter, `Result<T, E>` machinery, `AppContext.expandedRowId` (still `string | null`, single-open) — all untouched.
## 7. `MetadataDiffExpansion` component
New file: `src/web/components/file-list/MetadataDiffExpansion.tsx`. Sibling to `ErrorExpansion`. Pure render component, no state/callbacks/effects.
### 7.1 API
```ts
export function MetadataDiffExpansion({
items,
}: {
items: readonly MetadataItem[];
}): React.JSX.Element;
```
### 7.2 Render shape
1. Group by `source` — stable first-seen ordering.
2. Within each group, **kept first**, then removed/modified.
3. Per-group header: `{source} · N removed[, M modified][, K kept]` — zero categories omitted.
4. Per-row: two-column grid (`220px name + 1fr value`) on viewports ≥ 420px; **stacked single-column** at < 420px.
- **Removed**: red strikethrough on value, red left border, red-tinted background.
- **Modified**: `valueBefore` strikethrough + Unicode arrow `→` + `valueAfter` in green. Green left border, green-tinted background.
- **Kept**: neutral grey row, grey left border, small `preserved` badge after the name.
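Steps 1 and 2 above (group by source in first-seen order, kept rows first) can be sketched as a pure function. `groupBySource` is a hypothetical name; the component may well inline this logic.

```typescript
type MetadataItem =
  | { readonly action: "removed"; readonly source: string; readonly name: string; readonly valueBefore: string }
  | { readonly action: "modified"; readonly source: string; readonly name: string; readonly valueBefore: string; readonly valueAfter: string }
  | { readonly action: "kept"; readonly source: string; readonly name: string; readonly value: string };

interface SourceGroup {
  readonly source: string;
  readonly items: readonly MetadataItem[];
}

function groupBySource(items: readonly MetadataItem[]): SourceGroup[] {
  // Map iterates in insertion order, which gives stable first-seen grouping.
  const buckets = new Map<string, MetadataItem[]>();
  for (const item of items) {
    const bucket = buckets.get(item.source);
    if (bucket === undefined) {
      buckets.set(item.source, [item]);
    } else {
      bucket.push(item);
    }
  }
  return Array.from(buckets, ([source, rows]) => ({
    source,
    // Kept rows first; emission order is preserved inside each partition.
    items: [
      ...rows.filter((r) => r.action === "kept"),
      ...rows.filter((r) => r.action !== "kept"),
    ],
  }));
}
```

Partitioning with two `filter` passes avoids depending on sort stability and keeps the strategy's emission order intact within the kept and non-kept runs.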
### 7.3 Long-value handling
Truncate with ellipsis + native `title=` tooltip. CSS only:
```css
.file-table__diff-value {
overflow: hidden;
text-overflow: ellipsis;
white-space: nowrap;
}
```
No expand-row behavior, no modal. Pre-formatted blob strings are already short by construction.
### 7.4 Sensitive-value display
**Raw values, no masking.** The user dropped the file — they already have it. The diff is a record of what would otherwise have shipped out unstripped. Masking would train a useless reveal-click reflex and contradict the "the user owns their files" framing. Trust copy (#68/#69, shipped #81) already establishes that everything happens locally.
### 7.5 CSS / BEM classes
New classes in the existing `src/web/styles/file-list.css`:
```
.file-table__diff
.file-table__diff-group-header
.file-table__diff-row
.file-table__diff-row--removed
.file-table__diff-row--modified
.file-table__diff-row--kept
.file-table__diff-name
.file-table__diff-value
.file-table__diff-value--strike
.file-table__diff-value--added
.file-table__diff-arrow
.file-table__diff-kept-badge
```
New tokens in `src/web/styles/tokens.css`:
```
--ec-color-diff-removed-bg, --ec-color-diff-removed-fg
--ec-color-diff-modified-bg, --ec-color-diff-modified-fg
--ec-color-diff-kept-bg, --ec-color-diff-kept-fg
```
Each with `prefers-color-scheme: dark` variants. Reds and greens borrow from existing error / success palette tokens.
### 7.6 Wire-in inside `FileRow.tsx`
```tsx
{isExpanded && file.status === FileProcessingStatus.Complete && (
<MetadataDiffExpansion items={file.metadataItems} />
)}
{isExpanded && isError && file.error !== null && (
<ErrorExpansion error={file.error} extension={file.extension} onCopy={onCopyToast} />
)}
{isExpanded && file.status === FileProcessingStatus.NoMetadataFound && (
<div className="file-table__expansion">
<span className="file-table__expansion-empty">{t("noMetadataFound")}</span>
</div>
)}
```
### 7.7 Accessibility
Semantic markup: `<dl>` for the source group, `<dt>` for tag names, `<dd>` for values. Source group as `<section>` with visually-hidden `<h4>` heading for screen readers. Existing row-level `role="row"`, `tabIndex`, and keyboard handlers unchanged. Animations gated by `prefers-reduced-motion` per project convention; the diff introduces no new animation.
## 8. i18n
### 8.1 Policy
- **Tag names and source labels** — canonical English from the format spec, **not** translated. Maps 1:1 to ExifTool and the spec documents.
- **Pre-formatted values** — emitted by strategies in canonical form, not translated.
- **UI chrome** — group header counts and the `preserved` badge are translated via existing `i18nLookup()`.
### 8.2 New keys
Add to `.resources/strings.json` for all 25 locales (English first, placeholder English values for the other locales until translators fill in):
```jsonc
{
"diffGroupRemoved": "{count} removed",
"diffGroupModified": "{count} modified",
"diffGroupKept": "{count} kept",
"diffGroupSeparator": ", ",
"diffKeptBadge": "preserved",
"diffArrow": "→"
}
```
`{count}` follows the existing string-interpolation convention. RTL locales may swap `diffArrow` to `←`; the separator is translatable for locales with different punctuation rules.
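A sketch of how the group header could be assembled from these keys, with zero categories omitted. The `STRINGS` object stands in for the English locale, and `interpolate` is a hypothetical placeholder substitution; the real `i18nLookup()` signature and interpolation helper are not shown in this spec and may differ.

```typescript
// English values copied from the key list above (stand-in for a locale lookup).
const STRINGS = {
  diffGroupRemoved: "{count} removed",
  diffGroupModified: "{count} modified",
  diffGroupKept: "{count} kept",
  diffGroupSeparator: ", ",
} as const;

// Hypothetical {name} placeholder substitution; unknown placeholders are left as-is.
function interpolate(template: string, vars: Record<string, string | number>): string {
  return template.replace(/\{(\w+)\}/g, (match: string, key: string) =>
    vars[key] !== undefined ? String(vars[key]) : match,
  );
}

interface GroupCounts {
  readonly removed: number;
  readonly modified: number;
  readonly kept: number;
}

// Builds "{source} · N removed[, M modified][, K kept]".
function groupHeader(source: string, counts: GroupCounts): string {
  const parts: string[] = [];
  if (counts.removed > 0) parts.push(interpolate(STRINGS.diffGroupRemoved, { count: counts.removed }));
  if (counts.modified > 0) parts.push(interpolate(STRINGS.diffGroupModified, { count: counts.modified }));
  if (counts.kept > 0) parts.push(interpolate(STRINGS.diffGroupKept, { count: counts.kept }));
  return `${source} · ${parts.join(STRINGS.diffGroupSeparator)}`;
}
```

Joining with the translatable separator, rather than a hard-coded comma, is what lets locales with different punctuation rules override it.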
### 8.3 No trust copy additions
The diff *is* trust copy in artifact form. The existing offline indicator and "files never leave your device" framing (shipped #81) need no additions.
## 9. Testing
### 9.1 JPEG strategy unit tests (new)
`tests/infrastructure/wasm/jpeg_strategy.metadata_items.test.ts`. Synthetic fixtures via small builders (preferred per existing convention).
- IFD0-only fixture: Make, Model, Orientation → assert 3 `removed` items, `source: "EXIF"`, formatted values.
- SubIFD via ExifIFDPointer: ExposureTime, ISO → assert items with `"1/120 s"` and `"200"`.
- GPS via GPSIFDPointer: `GPSLatitude + GPSLatitudeRef` → assert one item with compound `"47° 36' 22.3\" N"`.
- IFD1 walk via next-IFD pointer: thumbnail tags surface.
- InteropIFD walk: Interop tags surface.
- MakerNote opaque: assert exactly one item `name: "MakerNote (Nikon)"` (or vendor-specific) on signature match; plain `"MakerNote"` on unknown vendor.
- XMP segment: one blob item.
- ICC: default policy → `removed`; `preserveColorProfile: true` → `kept`.
- Kept orientation: with `preserveOrientation: true`, assert one `kept` item, other tags still `removed`.
- Edge cases: rational `0/0`, ASCII with trailing nulls, BYTE arrays with non-printable content.
### 9.2 Component unit tests (new)
`tests/web/components/metadata_diff_expansion.test.tsx`.
- Items grouped by source, first-seen ordering.
- Kept items render before removed/modified within each group.
- Group header counts and zero-category omission.
- `modified` row renders `valueBefore → valueAfter`.
- `removed` row has strikethrough class.
- `kept` row has neutral class + `preserved` badge.
- Long values render with `title` attribute set to full value.
### 9.3 E2E test matrix — standalone primary, web secondary
`tests/e2e/web/metadata_diff.spec.ts` runs against the following Playwright projects, in priority order:
| Project | Build target | Why |
|---|---|---|
| **`standalone-desktop`** ⭐ primary | `dist/web-standalone/index.html` | Main user-facing distribution. Offline hand-off. APK substrate. |
| **`standalone-mobile`** ⭐ primary | same | Mobile users get the standalone file. |
| `web-desktop` | `dist/web/` served | Dev/CI convenience target. |
| `web-mobile-ios` + `web-mobile-android` | same | Hosted-deploy mobile path. |
Spec content (same across projects):
**Desktop**: drag-drop JPEG → wait "Cleaned" → assert chevron → click → assert diff visible with `EXIF` group, Make/Model/GPSLatitude rows strikethrough → click → assert collapsed → drop PDF → assert no chevron.
**Mobile**: file picker → wait "Cleaned" → tap row → assert diff visible → assert no horizontal scroll on expansion area → tap → assert collapsed → tap PDF row → assert no expansion content.
Playwright config additions: two new projects (`standalone-desktop`, `standalone-mobile`) using the same desktop/mobile viewport + UA settings as the web projects. `yarn test:e2e:standalone`'s project glob widens from smoke-only to include the diff spec. CI `e2e-standalone` job needs no structural change.
### 9.4 Existing strategy tests (updated)
PDF / PNG / Office / Video unit tests today assert `metadataRemoved` is a number. Updated to assert `metadataItems` is an empty readonly array. One-line change per test file.
### 9.4a Existing e2e tests audit
The new `isExpandable` gate (§4) means non-JPEG `Complete` rows lose their chevron in Phase 1. Any existing e2e spec that asserts on or interacts with the chevron of a non-JPEG cleaned row (PDF, PNG, Office, video fixtures) needs to be updated to either:
- Drop the chevron-presence assertion for that row, or
- Switch its fixture to a JPEG.
The implementation plan must include a quick audit pass: grep `tests/e2e/` for `ChevronIcon` / `file-table__row--expandable` / row-click interactions, fix any case where the gate changes existing behaviour. This is bounded scope — the existing suite's "cleaned file" rows are exercised in a small number of specs.
### 9.5 Forensic gate
`tools/forensic/jpeg.ts` re-run against the post-Phase-1 strategy. **Zero sentinel survival** across the full recovery battery (`exiftool -a -G1 -s`, `strings | grep`, structural decompression, indirect-object walks). Report attached to the PR. Display-side change only; no expected regression in strip behaviour.
## 10. Files touched (Phase 1)
```
NEW:
src/domain/exif/metadata_item.ts
src/web/components/file-list/MetadataDiffExpansion.tsx
tests/infrastructure/wasm/jpeg_strategy.metadata_items.test.ts
tests/web/components/metadata_diff_expansion.test.tsx
tests/e2e/web/metadata_diff.spec.ts
MODIFIED:
src/infrastructure/wasm/format_strategy.ts (StripResult shape)
src/infrastructure/wasm/strategies/jpeg_strategy.ts (~150 LOC enumeration layer)
src/infrastructure/wasm/strategies/pdf_strategy.ts (return metadataItems: [])
src/infrastructure/wasm/strategies/png_strategy.ts (return metadataItems: [])
src/infrastructure/wasm/strategies/office_strategy.ts (return metadataItems: [])
src/infrastructure/wasm/strategies/video_strategy.ts (return metadataItems: [])
src/infrastructure/wasm/wasm_processor.ts (forward metadataItems)
src/infrastructure/web/web_api.ts (ProcessOutcome.metadataItems)
src/web/contexts/AppContext.tsx (FileEntry + reducer action)
src/web/components/file-list/FileRow.tsx (isExpandable + render branch)
src/web/hooks/use_process_files.ts (dispatch metadataItems)
src/web/styles/file-list.css (BEM diff classes)
src/web/styles/tokens.css (new diff colour tokens)
.resources/strings.json (6 new i18n keys, all 25 locales)
playwright.config.ts (2 new projects)
package.json (no new prod deps; test scripts unchanged)
tests/infrastructure/wasm/{pdf,png,office,video}_strategy.test.ts (assertion shape)
```
## 11. Non-goals (explicit)
- **No MakerNote vendor decoders.** Opaque emission only. Per-vendor decoding is its own multi-PR effort and will probably never happen (see §5, MakerNote opaque emission).
- **No XMP parsing.** XMP packets emit as blob items. Property-list extraction is out of v5 scope.
- **No persistence beyond session.** `metadataItems` lives in reducer state, cleared on `CLEAR_FILES`. No `localStorage`. Re-drop to re-view.
- **No diff export** (clipboard / JSON / report download). The diff is a viewing surface, not an artifact.
- **No sensitive-value masking.** Per §7.4.
- **No re-strip / re-process.** Settings changes apply on next drop.
- **No telemetry of tag distributions.** Per privacy invariants §5 — we don't collect what tags people remove.
## 12. Risks
| Risk | Mitigation |
|---|---|
| Subtle bugs in extended IFD walker (off-by-one, endian, unsigned overflow) on real-world camera files | Synthetic-fixture unit tests cover structural cases; real-world fixtures (e.g. `DSCN0012.jpg` plus a Canon + Sony sample) added to the e2e suite as oracles |
| MakerNote vendor signature parsing misidentifies an unfamiliar vendor | Fallback to plain `"MakerNote"` on unrecognised signature; covered by a test case |
| Mobile diff layout regression on narrow viewports | Single-column stack at `< 420px`; e2e asserts no horizontal scroll on expansion area |
| Standalone build's inlined script bundle grows past psychological threshold | Diff feature adds ~3 KB gzipped; bundle-size check (existing) catches regression |
| `NoMetadataFound` semantics shift if kept-item filter is miswritten | Direct test: JPEG with only Orientation + `preserveOrientation: true` → expect `NoMetadataFound`; same fixture + `preserveOrientation: false` → expect `Complete` |
| LSP showing `tsc` false positives during JPEG decoder table edits | Per `typescript-conventions.md`, verify with `npx tsc --noEmit` before treating LSP errors as real |
| Existing e2e specs that interact with non-JPEG `Complete`-row chevrons break (those rows lose the chevron under the new `isExpandable` gate) | §9.4a — implementation plan includes a grep+audit pass over `tests/e2e/` for chevron / row-click interactions; affected specs are updated or fixture-swapped |
## 13. Follow-up sequencing
Each format gets its own independent PR. Order:
1. **Phase 1 (this spec)** — surface + JPEG enumeration.
2. **PDF** — Info dict per-key; pdf-lib has values at deletion time. Annotation PII per annotation. ~60–100 LOC added to `pdf_strategy.ts`.
3. **PNG** — per dropped chunk; tEXt/iTXt/zTXt walked. eXIf as blob (or shared walker if extracted into a helper).
4. **Office** — per deleted file + ZIP mtime summary. Possible sub-PR for parsing docProps XML to per-property fidelity.
5. **Video** — per zeroed metadata box + per modified timestamp.
Each PR re-runs its format's forensic battery and asserts zero sentinel survival.
## 14. Out of scope (future PRs, additive)
- Diff export (copy-to-clipboard JSON, or "view raw" reveal-bytes mode for MakerNote)
- Diff persistence across reloads (`localStorage`-cached results)
- A per-tag glossary feature with friendly tag-name translations
- Vendor-specific MakerNote decoders (Nikon-first, then Canon, etc.)
- XMP property-list extraction
All additive — the `{source, name, value}` contract already supports them without breaking shape.
## 15. References
- [Issue #22](http://localhost:3000/forgejo_admin/exifcleaner-web/issues/22) — original request and 2026-05-11 deferral comment
- [`docs/superpowers/specs/2026-05-05-webapp-migration-design.md`](2026-05-05-webapp-migration-design.md) §9.5 E4
- [`docs/superpowers/specs/2026-05-10-phase-e-f-design.md`](2026-05-10-phase-e-f-design.md) — Phase E/F backlog split
- [`docs/superpowers/specs/2026-05-11-phase-e-trust-copy-design.md`](2026-05-11-phase-e-trust-copy-design.md) — trust copy framing (shipped #81)
- [`.claude/rules/format-strategy-workflow.md`](../../.claude/rules/format-strategy-workflow.md) — analyse → implement → verify pattern
- [`.claude/rules/privacy-invariants.md`](../../.claude/rules/privacy-invariants.md) — §3 forensic > unit tests, §5 no telemetry, §6 timestamps
- [`.claude/rules/typescript-conventions.md`](../../.claude/rules/typescript-conventions.md) — discriminated unions + named arguments + Result type
- ExifTool documentation, Exif 2.32 specification §4.6, TIFF 6.0 specification


@ -21,10 +21,11 @@
"test:e2e": "playwright test",
"test:e2e:web": "playwright test --project=web-desktop --project=web-mobile-ios --project=web-mobile-android",
"test:e2e:web:desktop": "playwright test --project=web-desktop",
"test:e2e:standalone": "yarn build:web:standalone && playwright test --project=standalone",
"test:e2e:standalone": "yarn build:web:standalone && playwright test --project=standalone-desktop --project=standalone-mobile",
"test:all": "vitest run && playwright test",
"screenshots": "tsx scripts/generate-screenshots.ts",
"check:deps": "madge --circular --extensions ts,tsx src/"
"check:deps": "madge --circular --extensions ts,tsx src/",
"generate:exif-tags": "node scripts/generate_exif_tags.mjs"
},
"dependencies": {
"jszip": "^3.10.1",


@ -1,5 +1,16 @@
import path from "node:path";
import { fileURLToPath } from "node:url";
import { defineConfig, devices } from "@playwright/test";
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
// Absolute file:// URL of the inlined single-file build. Used as `baseURL` for
// the standalone-* projects so `page.goto("")` resolves to the index.html and
// `launchPage()` works without per-spec branching. The build itself is chained
// via `yarn test:e2e:standalone`.
const standaloneIndexUrl = `file://${path.resolve(__dirname, "dist/web-standalone/index.html")}`;
export default defineConfig({
testMatch: "*.spec.ts",
timeout: 15000,
@ -33,14 +44,43 @@ export default defineConfig({
testDir: "./tests/e2e/web",
use: { ...devices["Pixel 7"], baseURL: "http://localhost:4173", locale: "en-US" },
},
// Standalone projects: load `dist/web-standalone/index.html` via file://.
// Two viewports because the standalone HTML is the *primary* distribution
// channel — it must work for the same desktop + mobile audiences as the
// hosted web build. The smoke specs in `tests/e2e/standalone/` exercise
// the inlined-bundle guarantees (single file, zero outbound requests);
// `metadata_diff.spec.ts` from tests/e2e/web/ is matched in via
// testMatch so the diff UX is verified against the inlined bundle too.
// The webServer above still launches when these projects run alone —
// harmless waste since file:// navigation never hits it.
{
// Loads dist/web-standalone/index.html via file://, so no baseURL.
// Build is chained via `yarn test:e2e:standalone`. The top-level
// webServer still launches when this project runs alone — wasted but
// harmless, since the tests don't hit it.
name: "standalone",
testDir: "./tests/e2e/standalone",
use: { ...devices["Desktop Chrome"], locale: "en-US" },
name: "standalone-desktop",
testDir: "./tests/e2e",
testMatch: [
"standalone/**/*.spec.ts",
"web/metadata_diff.spec.ts",
],
use: {
...devices["Desktop Chrome"],
baseURL: standaloneIndexUrl,
locale: "en-US",
},
},
{
// Pixel 7 (Chromium-based Android emulation), not iPhone 14 (WebKit).
// The CI's `e2e-standalone` job installs Chromium only to stay cheap;
// WebKit coverage of the diff feature lives on `web-mobile-ios` against
// the hosted build. Standalone HTML renders identically in both engines
// (no service worker / PWA differences), so the narrow-viewport CSS
// guard is engine-agnostic.
name: "standalone-mobile",
testDir: "./tests/e2e",
testMatch: ["web/metadata_diff.spec.ts"],
use: {
...devices["Pixel 7"],
baseURL: standaloneIndexUrl,
locale: "en-US",
},
},
],
});

286
scripts/generate_exif_tags.mjs Executable file

@ -0,0 +1,286 @@
#!/usr/bin/env node
// Regenerate src/domain/exif/ifd_tag_data.ts from system exiftool.
//
// Run with: yarn generate:exif-tags
//
// Source of truth: `exiftool -listx -EXIF:All` (XML output enumerating every
// EXIF / GPS / Interop / Sub-IFD tag exiftool knows). We parse the XML with
// regex (no deps), bucket tags by their `g1` group (the IFD they live in),
// and emit four Record<number, TagData> dictionaries:
//
// - IFD0_TAG_DATA — tags in g1=IFD0 + IFD1 + SubIFD (shared TIFF tag space)
// - EXIF_SUBIFD_TAG_DATA — tags in g1=ExifIFD
// - GPS_TAG_DATA — tags in g1=GPS
// - INTEROP_TAG_DATA — tags in g1=InteropIFD
//
// Each TagData record captures three things from the `<tag>` element:
//
// - `name` — the canonical tag name (e.g. "MeteringMode")
// - `type` — ExifTool's raw type (e.g. "int16u", "rational64u",
// "string", "undef") — used by the generic formatter in
// `ifd_decoders.ts` to pick the right reader
// - `enumValues` — present iff the `<values>` block is a simple
// numeric-key → label PrintConv (e.g. MeteringMode
// `{ 0: "Unknown", 1: "Average", 2: "Center-weighted
// average", ... }`). When PrintConv is a complex
// expression with non-numeric keys (rare in the EXIF
// namespace), the block is skipped and `enumValues`
// stays undefined — the formatter then falls through
// to plain numeric display.
//
// The generator is idempotent: running twice with the same exiftool version
// produces byte-identical output. Tag names and enum labels are stable,
// well-known identifiers from CIPA DC-008 (Exif spec) and are not
// copyrightable. ExifTool itself is dual-licensed under Artistic 2.0 / GPL.
//
// To regenerate after an exiftool version bump: install the new exiftool
// system-wide, then `yarn generate:exif-tags`.
import { execFileSync } from "node:child_process";
import { writeFileSync } from "node:fs";
import { fileURLToPath } from "node:url";
import { dirname, join } from "node:path";
const __dirname = dirname(fileURLToPath(import.meta.url));
const outputPath = join(
__dirname,
"..",
"src",
"domain",
"exif",
"ifd_tag_data.ts",
);
// IFDs we actually walk. Tag dictionaries are bucketed by these g1 values.
// SubIFD + IFD1 share the IFD0 tag namespace, so they fold into the same
// dictionary.
const IFD0_GROUPS = new Set(["IFD0", "IFD1", "SubIFD"]);
const EXIF_GROUPS = new Set(["ExifIFD"]);
const GPS_GROUPS = new Set(["GPS"]);
const INTEROP_GROUPS = new Set(["InteropIFD"]);
function getExiftoolVersion() {
return execFileSync("exiftool", ["-ver"], { encoding: "utf8" }).trim();
}
function runListx() {
return execFileSync("exiftool", ["-listx", "-EXIF:All"], {
encoding: "utf8",
maxBuffer: 32 * 1024 * 1024,
});
}
// Decode XML entities we actually encounter in ExifTool's `<val lang='en'>`
// labels. ExifTool's output is well-formed XML, so a small fixed set is
// enough. `&amp;` is decoded last so that an escaped entity such as
// `&amp;lt;` yields the literal text `&lt;` instead of being decoded twice.
function decodeEntities(raw) {
	return raw
		.replace(/&lt;/g, "<")
		.replace(/&gt;/g, ">")
		.replace(/&apos;/g, "'")
		.replace(/&quot;/g, '"')
		.replace(/&#39;/g, "'")
		.replace(/&amp;/g, "&");
}
// Parse a `<values>` block. Returns a `Record<number, string>` if every key
// is a non-negative integer; otherwise returns null (caller skips enum
// emission — falls through to plain numeric display).
//
// Non-numeric keys appear for tags ExifTool models as bitmask expressions
// (e.g. Flash's complex PrintConv) or for ASCII-keyed enums (e.g.
// InteropIndex with keys "R03"/"R98"/"THM"). Those don't fit the simple
// shortEnum(value, table) shape, so we omit the enumValues field and let
// the formatter render the raw numeric value or ASCII string.
function parseValuesBlock(valuesBody) {
const out = {};
const keyRegex = /<key\s+id='([^']+)'\s*>([\s\S]*?)<\/key>/g;
let match;
while ((match = keyRegex.exec(valuesBody)) !== null) {
const keyRaw = match[1];
const keyBody = match[2];
if (!/^\d+$/.test(keyRaw)) {
return null;
}
const key = parseInt(keyRaw, 10);
// Pick the English label. Each key has one `<val lang='X'>...</val>`
// per supported locale; English is always present.
const enMatch = keyBody.match(
/<val\s+lang='en'\s*>([\s\S]*?)<\/val>/,
);
if (enMatch === null) continue;
const label = decodeEntities(enMatch[1].trim());
if (out[key] === undefined) out[key] = label;
}
return Object.keys(out).length > 0 ? out : null;
}
// Parse exiftool's `<table ... g1='X'>...</table>` blocks. Each tag inside
// inherits the table's g1 unless it specifies its own g1 attribute. Tags
// can carry `index='N'` for raw-format variants; we keep only the first
// occurrence per (g1, id) pair so the canonical name + type wins.
function parseTags(xml) {
// g1 -> Map<numeric-id, { name, type, enumValues | undefined }>
const tagsByGroup = new Map();
const tableRegex =
/<table\s+name='([^']+)'\s+g0='([^']+)'\s+g1='([^']+)'[^>]*>([\s\S]*?)<\/table>/g;
let tableMatch;
while ((tableMatch = tableRegex.exec(xml)) !== null) {
const tableG1 = tableMatch[3];
const tableBody = tableMatch[4];
// Per-tag scan with body capture (we need the body to extract a
// nested `<values>` block when present). The opening attributes
// are always `id='...' name='...'` first; type / g1 / etc. follow.
const tagRegex =
/<tag\s+id='([^']+)'\s+name='([^']+)'([^>]*)>([\s\S]*?)<\/tag>/g;
let tagMatch;
while ((tagMatch = tagRegex.exec(tableBody)) !== null) {
const idRaw = tagMatch[1];
const name = tagMatch[2];
const attrTail = tagMatch[3];
const tagBody = tagMatch[4];
// Numeric id only — skip composite tags ("Exif-JpgFromRaw") etc.
if (!/^\d+$/.test(idRaw)) continue;
const id = parseInt(idRaw, 10);
// Effective g1: tag's own g1 attr wins over table's g1.
const g1Override = attrTail.match(/\sg1='([^']+)'/);
const g1 = g1Override !== null ? g1Override[1] : tableG1;
// Type is always present in the EXIF namespace. Composite tags
// can have type='?'; we'd already have skipped those above.
const typeMatch = attrTail.match(/\stype='([^']+)'/);
const type = typeMatch !== null ? typeMatch[1] : "unknown";
// Optional `<values>` block — parsed only when every key is
// numeric (see parseValuesBlock).
const valuesMatch = tagBody.match(
/<values>([\s\S]*?)<\/values>/,
);
let enumValues;
if (valuesMatch !== null) {
const parsed = parseValuesBlock(valuesMatch[1]);
if (parsed !== null) enumValues = parsed;
}
if (!tagsByGroup.has(g1)) tagsByGroup.set(g1, new Map());
const bucket = tagsByGroup.get(g1);
// First-wins: indexed duplicates and re-listings keep the
// canonical (first, usually un-indexed) record.
if (!bucket.has(id)) {
bucket.set(id, { name, type, enumValues });
}
}
}
return tagsByGroup;
}
function collectDictionary(tagsByGroup, groups) {
const dict = new Map();
for (const g1 of groups) {
const bucket = tagsByGroup.get(g1);
if (bucket === undefined) continue;
for (const [id, data] of bucket) {
if (!dict.has(id)) dict.set(id, data);
}
}
// Sort by numeric id for stable output.
return [...dict.entries()].sort((a, b) => a[0] - b[0]);
}
function emitEnumValues(values) {
const keys = Object.keys(values)
.map((k) => parseInt(k, 10))
.sort((a, b) => a - b);
const parts = keys.map(
(k) => `[${k}]: ${JSON.stringify(values[k])}`,
);
return `{ ${parts.join(", ")} }`;
}
function emitDictionary(name, entries) {
const lines = [];
lines.push(`export const ${name}: Record<number, TagData> = {`);
for (const [id, data] of entries) {
const hex = `0x${id.toString(16).padStart(4, "0")}`;
const enumPart =
data.enumValues !== undefined
? `, enumValues: ${emitEnumValues(data.enumValues)}`
: "";
lines.push(
`\t[${hex}]: { name: ${JSON.stringify(data.name)}, type: ${JSON.stringify(data.type)}${enumPart} },`,
);
}
lines.push("};");
return lines.join("\n");
}
function main() {
const exiftoolVersion = getExiftoolVersion();
const xml = runListx();
const tagsByGroup = parseTags(xml);
const ifd0 = collectDictionary(tagsByGroup, IFD0_GROUPS);
const exif = collectDictionary(tagsByGroup, EXIF_GROUPS);
const gps = collectDictionary(tagsByGroup, GPS_GROUPS);
const interop = collectDictionary(tagsByGroup, INTEROP_GROUPS);
const header = [
"// GENERATED FILE — do not edit by hand.",
"//",
"// Regenerate with: yarn generate:exif-tags",
"// Generator script: scripts/generate_exif_tags.mjs",
"// Source: `exiftool -listx -EXIF:All` (system exiftool)",
`// ExifTool version: ${exiftoolVersion}`,
"//",
"// Tag names, types, and enum labels are stable identifiers from CIPA",
"// DC-008 (Exif 2.32) and the TIFF 6.0 specification; the dictionary",
"// itself is purely factual (spec terms + numeric IDs + ExifTool's",
"// PrintConv labels). ExifTool is the source of truth for the surface",
"// area we cover; ExifTool's own code is dual-licensed under Artistic",
"// 2.0 / GPL.",
"",
"// Per-tag record. `type` is ExifTool's raw type string (e.g. `int16u`,",
"// `rational64u`, `string`, `undef`). `enumValues` is present iff the",
"// tag's PrintConv is a simple numeric-key → label map; the formatter",
"// in `ifd_decoders.ts` uses it to render `<value> (<label>)` for",
"// integer-typed enum tags.",
"export interface TagData {",
"\treadonly name: string;",
"\treadonly type: string;",
"\treadonly enumValues?: Readonly<Record<number, string>>;",
"}",
"",
].join("\n");
const body = [
"// IFD0 / IFD1 / SubIFD share the TIFF tag namespace — bucketed together.",
emitDictionary("IFD0_TAG_DATA", ifd0),
"",
"// Exif sub-IFD (ExifIFD).",
emitDictionary("EXIF_SUBIFD_TAG_DATA", exif),
"",
"// GPS IFD.",
emitDictionary("GPS_TAG_DATA", gps),
"",
"// Interoperability IFD.",
emitDictionary("INTEROP_TAG_DATA", interop),
"",
].join("\n");
writeFileSync(outputPath, header + body, "utf8");
console.log(
`Wrote ${outputPath}\n` +
` IFD0: ${ifd0.length} tags\n` +
` ExifIFD: ${exif.length} tags\n` +
` GPS: ${gps.length} tags\n` +
` Interop: ${interop.length} tags\n` +
` ExifTool version: ${exiftoolVersion}`,
);
}
main();
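
The numeric-key rule in `parseValuesBlock` can be illustrated in isolation. The following is a minimal TypeScript sketch, not the generator itself: `parseEnumBlock` is a hypothetical name, and the two XML snippets are hand-written samples in the shape the comments above describe, not captured `exiftool -listx` output.

```typescript
// Hypothetical stand-alone sketch of the numeric-key rule (not the generator's code).
type EnumTable = Record<number, string>;

function parseEnumBlock(valuesBody: string): EnumTable | null {
	const table: EnumTable = {};
	const keyRegex = /<key\s+id='([^']+)'\s*>([\s\S]*?)<\/key>/g;
	let match: RegExpExecArray | null;
	while ((match = keyRegex.exec(valuesBody)) !== null) {
		const keyRaw = match[1] ?? "";
		const keyBody = match[2] ?? "";
		// A single non-numeric key disqualifies the whole block.
		if (!/^\d+$/.test(keyRaw)) return null;
		const en = /<val\s+lang='en'\s*>([\s\S]*?)<\/val>/.exec(keyBody);
		if (en === null) continue;
		const key = Number.parseInt(keyRaw, 10);
		// First occurrence wins, as in the generator.
		if (table[key] === undefined) table[key] = (en[1] ?? "").trim();
	}
	return Object.keys(table).length > 0 ? table : null;
}

// Hand-written sample input (assumed shape).
const numericBlock =
	"<key id='0'><val lang='en'>Unknown</val></key>" +
	"<key id='1'><val lang='en'>Average</val></key>";
const asciiBlock = "<key id='R98'><val lang='en'>DCF basic file</val></key>";

const metering = parseEnumBlock(numericBlock); // table with keys 0 and 1
const interop = parseEnumBlock(asciiBlock); // null: ASCII key, no enum emitted
```

Returning `null` for the whole block, rather than dropping only the offending key, keeps the emitted table from silently covering just part of an enum.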


@ -1,10 +1,10 @@
import type { Result } from "../../common";
-import type { ExifError, StripOptions } from "../../domain";
+import type { ExifError, MetadataItem, StripOptions } from "../../domain";
export interface ProcessOutcome {
readonly outputPath: string;
readonly metadataRemoved: number;
readonly outputBytes: number;
readonly metadataItems: readonly MetadataItem[];
}
export interface MetadataProcessorPort {

File diff suppressed because it is too large


@ -0,0 +1,188 @@
// Hand-rolled display formatters for Exif tag values. All functions take
// plain values (number, string, [number, number] rationals, or arrays of
// those) and return the string shown in the diff. No IfdEntry dependency,
// no I/O: this is pure domain logic.
//
// Most numeric-to-label enum mappings (MeteringMode, Flash, LightSource,
// ColorSpace, etc.) used to live here as hand-rolled `Record<number, string>`
// tables; they were moved into the generated `ifd_tag_data.ts` (sourced from
// ExifTool's PrintConv labels). What remains is **display-shape choices**:
// formatters that wrap or restructure the raw value (ExposureTime `1/100 s`,
// FNumber `f/2.8`, FocalLength `28 mm`, etc.), enum tables we deliberately
// disagree with ExifTool about (Orientation: `Normal` vs `Horizontal
// (normal)`), and compound formatters (GPS-coord, GPS-time, GPS-version).
//
// Lives in `src/domain/exif/` so the renderer (and tests) can format values
// without pulling in TIFF-walker infrastructure.
// Displayed when a value is unavailable (e.g. rational with zero denominator).
// Mirrors the em-dash used in BEFORE/AFTER cells in the file table.
export const VALUE_UNAVAILABLE = "—" as const;
// ---- Display-shape enum tables we deliberately own ---------------------
//
// These three tables stay hand-rolled. The first two differ from ExifTool's
// PrintConv wording on display grounds; the third is kept for colocation:
//
// - Orientation: ExifTool's `Horizontal (normal)` is verbose; our `Normal`
// matches what users actually want to see in a diff column.
// - ResolutionUnit: ExifTool labels are sentence-case (`Inches`); ours are
// lowercase to match the spec's unit names.
// - ExposureProgram: same wording as ExifTool, kept hand-rolled so the
// formatter is colocated with the other shape-overrides.
export const ORIENTATIONS: Record<number, string> = {
1: "Normal",
2: "Mirror horizontal",
3: "Rotate 180",
4: "Mirror vertical",
5: "Mirror horizontal + Rotate 270 CW",
6: "Rotate 90 CW",
7: "Mirror horizontal + Rotate 90 CW",
8: "Rotate 270 CW",
};
export const RESOLUTION_UNITS: Record<number, string> = {
1: "none",
2: "inches",
3: "centimeters",
};
export const EXPOSURE_PROGRAMS: Record<number, string> = {
0: "Not Defined",
1: "Manual",
2: "Program AE",
3: "Aperture-priority AE",
4: "Shutter speed priority AE",
5: "Creative (Slow speed)",
6: "Action (High speed)",
7: "Portrait",
8: "Landscape",
};
// ---- Primitive-typed formatters ----------------------------------------
export function formatOrientation(value: number): string {
const label = ORIENTATIONS[value] ?? "Unknown";
return `${value} (${label})`;
}
export function formatResolutionUnit(value: number): string {
const label = RESOLUTION_UNITS[value] ?? "unknown";
return `${value} (${label})`;
}
export function formatExposureProgram(value: number): string {
return `${value} (${EXPOSURE_PROGRAMS[value] ?? "Unknown"})`;
}
export function formatRationalPlain([num, den]: [number, number]): string {
if (den === 0) return VALUE_UNAVAILABLE;
if (den === 1) return `${num}`;
return `${num / den}`;
}
export function formatExposureTime([num, den]: [number, number]): string {
if (den === 0) return VALUE_UNAVAILABLE;
return `${num}/${den} s`;
}
export function formatFNumber([num, den]: [number, number]): string {
if (den === 0) return VALUE_UNAVAILABLE;
return `f/${(num / den).toFixed(1)}`;
}
export function formatFocalLength([num, den]: [number, number]): string {
if (den === 0) return VALUE_UNAVAILABLE;
return `${num / den} mm`;
}
export function formatDateTime(raw: string): string {
// EXIF format: "YYYY:MM:DD HH:MM:SS". Convert the date portion to dashes
// for human readability; preserve the time portion as-is.
return raw.replace(/^(\d{4}):(\d{2}):(\d{2})/, "$1-$2-$3");
}
// Renders a numeric enum value as `<value> (<label>)`, e.g. `5 (Multi-segment)`.
// Used for every int-typed tag with an `enumValues` map in `ifd_tag_data.ts`
// — the table comes from ExifTool's PrintConv, the wrapping shape is ours.
export function formatShortEnum(
value: number,
table: Record<number, string>,
): string {
const label = table[value] ?? "Unknown";
return `${value} (${label})`;
}
// ---- GPS ASCII enums (display-shape choice) ----------------------------
//
// GPSStatus, GPSMeasureMode, GPSSpeedRef and the direction refs (GPSTrackRef,
// GPSImgDirectionRef) are typed `string` in ExifTool, and the stored value is
// a single ASCII character ("A"/"V", "2"/"3", "K"/"M"/"N", "T"/"M"). Most of
// these keys are non-numeric, so the tag-data generator emits no `enumValues`
// for them, and the generic formatter only applies `enumValues` to
// integer-typed tags anyway. These tables stay hand-rolled: there are only
// four of them, the labels are intentionally terse (`km/h` not
// `Kilometers per hour`), and the formatter renders `<key> (<label>)` instead
// of the plain key. They live here so the override map in `ifd_decoders.ts`
// can bind them via `formatAsciiEnum`.
export const GPS_STATUSES: Record<string, string> = {
A: "Active",
V: "Void",
};
export const GPS_MEASURE_MODES: Record<string, string> = {
"2": "2D",
"3": "3D",
};
export const GPS_SPEED_REFS: Record<string, string> = {
K: "km/h",
M: "mph",
N: "knots",
};
export const GPS_DIRECTION_REFS: Record<string, string> = {
T: "True",
M: "Magnetic",
};
export function formatAsciiEnum(
raw: string,
table: Record<string, string>,
): string {
const key = raw.length > 0 ? raw[0] : "";
if (key === undefined || key === "") return VALUE_UNAVAILABLE;
const label = table[key] ?? "Unknown";
return `${key} (${label})`;
}
// GPSVersionID is BYTE[4] joined with dots, e.g. "2.2.0.0".
export function formatGpsVersion(bytes: readonly number[]): string {
return bytes.slice(0, 4).join(".");
}
export function formatGpsTime(triplet: Array<[number, number]>): string {
const safe = (pair: [number, number] | undefined): number => {
if (pair === undefined) return 0;
const [num, den] = pair;
return den === 0 ? 0 : num / den;
};
const h = safe(triplet[0]);
const m = safe(triplet[1]);
const s = safe(triplet[2]);
return `${h.toFixed(0)}:${m.toFixed(0).padStart(2, "0")}:${s.toFixed(0).padStart(2, "0")}`;
}
export function formatGpsCoord(
triplet: Array<[number, number]>,
ref: string,
): string {
const safe = (pair: [number, number] | undefined): number => {
if (pair === undefined) return 0;
const [num, den] = pair;
return den === 0 ? 0 : num / den;
};
const deg = safe(triplet[0]);
const min = safe(triplet[1]);
const sec = safe(triplet[2]);
return `${deg.toFixed(0)}° ${min.toFixed(0)}' ${sec.toFixed(1)}" ${ref}`;
}
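
To make the display shapes concrete, here is a small self-contained sketch that mirrors two of the rational formatters above, including the zero-denominator guard. The function names are local stand-ins and the sample rationals are illustrative values, not taken from a real file.

```typescript
// Local stand-ins mirroring formatExposureTime / formatFNumber (illustrative only).
type Rational = [number, number];

// Same placeholder character as VALUE_UNAVAILABLE, written as an escape.
const UNAVAILABLE = "\u2014";

function exposureTime([num, den]: Rational): string {
	// Keep the fraction as stored: 1/100 reads better than 0.01.
	return den === 0 ? UNAVAILABLE : `${num}/${den} s`;
}

function fNumber([num, den]: Rational): string {
	// Divide and keep one decimal place, e.g. 28/10 becomes f/2.8.
	return den === 0 ? UNAVAILABLE : `f/${(num / den).toFixed(1)}`;
}

const shown: string[] = [
	exposureTime([1, 100]), // "1/100 s"
	fNumber([28, 10]), // "f/2.8"
	fNumber([0, 0]), // placeholder, because the denominator is zero
];
```

The guard matters because a malformed rational would otherwise surface as `NaN` or `Infinity` in the diff column.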


@ -0,0 +1,31 @@
// A single observed metadata item from a format strategy's strip pass.
// Pre-formatted strings — the renderer displays values verbatim.
//
// Three variants:
// - removed: tag was present in the source; absent after strip
// - modified: tag was present; written to a different value (e.g. epoch'd timestamps)
// - kept: tag was present; preserved by strategy policy (e.g. orientation when
// preserveOrientation is on)
export type MetadataItem =
| {
readonly action: "removed";
readonly source: string;
readonly name: string;
readonly valueBefore: string;
}
| {
readonly action: "modified";
readonly source: string;
readonly name: string;
readonly valueBefore: string;
readonly valueAfter: string;
}
| {
readonly action: "kept";
readonly source: string;
readonly name: string;
readonly value: string;
// i18n key resolved by the diff component via t(). Absent for
// kept items where no user-visible policy explanation is needed.
readonly reason?: string;
};
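
One way a consumer might use this union, sketched under assumptions: `Item` is a trimmed local copy of `MetadataItem` (the `source` field is omitted) so the block runs on its own, and `describe` and `summarize` are hypothetical helpers, not part of this commit. The summary follows the `{count} removed`, `{count} modified`, `{count} kept` strings added to the locale file.

```typescript
// Trimmed local copy of the union (assumption: `source` left out for brevity).
type Item =
	| { readonly action: "removed"; readonly name: string; readonly valueBefore: string }
	| {
			readonly action: "modified";
			readonly name: string;
			readonly valueBefore: string;
			readonly valueAfter: string;
	  }
	| { readonly action: "kept"; readonly name: string; readonly value: string; readonly reason?: string };

// Exhaustive switch: each branch only sees the fields of its own variant.
function describe(item: Item): string {
	switch (item.action) {
		case "removed":
			return `${item.name}: ${item.valueBefore} (removed)`;
		case "modified":
			return `${item.name}: ${item.valueBefore} → ${item.valueAfter}`;
		case "kept":
			return `${item.name}: ${item.value} (preserved)`;
	}
}

// Count items per action and join the non-empty groups with ", ".
function summarize(items: readonly Item[]): string {
	const counts = { removed: 0, modified: 0, kept: 0 };
	for (const item of items) counts[item.action] += 1;
	return (["removed", "modified", "kept"] as const)
		.filter((action) => counts[action] > 0)
		.map((action) => `${counts[action]} ${action}`)
		.join(", ");
}

const sample: readonly Item[] = [
	{ action: "removed", name: "Make", valueBefore: "ExampleCam" },
	{ action: "removed", name: "Software", valueBefore: "ExampleEditor 1.0" },
	{ action: "kept", name: "Orientation", value: "6 (Rotate 90 CW)", reason: "diffKeptReasonOrientation" },
];
const summary = summarize(sample);
const modifiedRow = describe({
	action: "modified",
	name: "ModifyDate",
	valueBefore: "2024-01-02 03:04:05",
	valueAfter: "1970-01-01 00:00:00",
});
```

Because `action` is the discriminant, adding a fourth variant later would make the `switch` in `describe` fail to type-check until it is handled.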


@ -26,6 +26,7 @@ export { LANGUAGE_NAMES } from "./i18n/language_names";
export { middleTruncatePath } from "./path_truncation";
export type { ExifError } from "./exif/exif_errors";
export { formatExifError } from "./exif/exif_errors";
export type { MetadataItem } from "./exif/metadata_item";
export type { SettingsError } from "./settings_errors";
export { formatSettingsError } from "./settings_errors";
export type { FolderError } from "./files/folder_errors";


@ -0,0 +1,424 @@
// Decoder tables — combine generated per-tag data (name + type +
// `enumValues` from ExifTool's PrintConv) with our hand-rolled
// display-shape choices into a single `{ name, format }` per tag.
//
// The bulk of decoder construction is mechanical: `buildDecoders` takes the
// generated `TagData` map, walks it, and picks a formatter based on `type`
// + `enumValues` via `pickFormatter`. Per-IFD `OVERRIDES` maps then replace
// the generic decoder for the small set of tags where we want a specific
// display shape (Orientation `<value> (<label>)` with our wording, exposure
// time `1/100 s`, FNumber `f/2.8`, focal length `28 mm`, DateTime ISO-ish
// dashes, MakerNote vendor-detected blob, GPS-version dotted, etc.).
//
// Tables are looked up by tag ID at walk time. Tags absent from a table fall
// through to `defaultDecoder` (canonical hex name + shape description).
//
// Pointer tags (ExifIFDPointer, GPSIFDPointer, InteropIFDPointer) are
// intentionally NOT in any table — the walker consumes them to chase the
// sub-IFD and emitting them as user-visible items would double-count.
//
// Name overrides exist for the handful of cases where ExifTool's canonical
// name disagrees with the Exif spec name we prefer (e.g. ExifTool calls
// 0x9204 "ExposureCompensation", we use the spec name "ExposureBiasValue";
// ExifTool has 91 vendor-specific aliases for 0x927c "MakerNote" — we keep
// one name).
import type { TagData } from "../../../domain/exif/ifd_tag_data";
import {
EXIF_SUBIFD_TAG_DATA,
GPS_TAG_DATA,
IFD0_TAG_DATA,
INTEROP_TAG_DATA,
} from "../../../domain/exif/ifd_tag_data";
import {
GPS_DIRECTION_REFS,
GPS_MEASURE_MODES,
GPS_SPEED_REFS,
GPS_STATUSES,
formatAsciiEnum,
formatDateTime,
formatExposureProgram,
formatExposureTime,
formatFNumber,
formatFocalLength,
formatGpsTime,
formatGpsVersion,
formatOrientation,
formatRationalPlain,
formatResolutionUnit,
formatShortEnum,
} from "../../../domain/exif/ifd_value_formatters";
import type { IfdEntry } from "./ifd_entry";
import {
describeUnknown,
formatByteSize,
readAscii,
readLong,
readRational,
readRationalArray,
readShort,
readUndefined,
readUndefinedAsAscii,
readUtf16,
} from "./ifd_readers";
export interface IfdTagDecoder {
readonly name: string;
readonly format: (e: IfdEntry) => string;
}
// ---- Generic type-driven formatter selection ---------------------------
//
// Picks the right reader+formatter combination from `TagData.type` +
// optional `TagData.enumValues`. Covers the boring case (the ~95% of tags
// that just read-and-render a primitive value). Tags needing a specific
// display shape (FNumber, ExposureTime, dotted GPSVersionID, etc.) are
// supplied by per-IFD override maps below and never hit this function.
function pickFormatter(
type: string,
enumValues: Readonly<Record<number, string>> | undefined,
): (e: IfdEntry) => string {
if (enumValues !== undefined) {
// int16u/int16s enum (the dominant case — MeteringMode, ColorSpace,
// LightSource, etc.).
if (type === "int16u" || type === "int16s") {
return (e) => formatShortEnum(readShort(e), enumValues);
}
// Single-byte enum: `undef` (SceneType, FileSource,
// ComponentsConfiguration first byte) or `int8u` (GPSAltitudeRef).
if (type === "undef" || type === "int8u") {
return (e) => formatShortEnum(e.tiff[e.valueDataOffset] ?? 0, enumValues);
}
// int32u/int32s enum — rare but possible.
if (type === "int32u" || type === "int32s") {
return (e) => formatShortEnum(readLong(e), enumValues);
}
// Other types fall through to plain rendering below.
}
switch (type) {
case "int16u":
case "int16s":
return (e) => `${readShort(e)}`;
case "int32u":
case "int32s":
return (e) => `${readLong(e)}`;
case "string":
return readAscii;
case "rational64u":
case "rational64s":
return (e) => formatRationalPlain(readRational(e));
case "undef":
return readUndefinedAsAscii;
case "int8u":
return (e) => `${e.tiff[e.valueDataOffset] ?? 0}`;
default:
return describeUnknown;
}
}
// ---- Helper closures binding IfdEntry-aware readers to formatters ------
// MakerNote: opaque blob with optional vendor detection from a leading
// signature ("Nikon", "Canon", ...). Vendor goes in parentheses next to the
// size; if none recognised, just the byte count.
function formatMakerNoteBlob(e: IfdEntry): string {
const vendor = detectMakerNoteVendor(e);
const suffix = vendor !== null ? ` (${vendor})` : "";
return `<binary, ${formatByteSize(e.count)}${suffix}>`;
}
function detectMakerNoteVendor(e: IfdEntry): string | null {
const previewLen = Math.min(e.count, 16);
const start = e.valueDataOffset;
const end = Math.min(start + previewLen, e.tiff.length);
let head = "";
for (let i = start; i < end; i++) {
const b = e.tiff[i] ?? 0;
if (b === 0) break;
head += String.fromCharCode(b);
}
if (head.startsWith("Nikon")) return "Nikon";
if (head.startsWith("Canon")) return "Canon";
if (head.startsWith("OLYMP")) return "Olympus";
if (head.startsWith("FUJIFILM")) return "Fuji";
if (head.startsWith("SONY")) return "Sony";
if (head.startsWith("Panasonic")) return "Panasonic";
if (head.startsWith("PENTAX")) return "Pentax";
if (head.startsWith("LEICA")) return "Leica";
return null;
}
function gpsVersionFormatter(e: IfdEntry): string {
const parts: number[] = [];
for (let i = 0; i < 4; i++) {
parts.push(e.tiff[e.valueDataOffset + i] ?? 0);
}
return formatGpsVersion(parts);
}
// ---- Name overrides ----------------------------------------------------
//
// Where ExifTool's canonical name disagrees with our preferred display name.
// Each entry is a deliberate choice — we prefer the Exif-spec name over
// ExifTool's vendor-detected alias for some tags.
const IFD0_NAME_OVERRIDES: Record<number, string> = {
// ExifTool's `ThumbnailOffset` / `ThumbnailLength` map to the spec's
// JPEGInterchangeFormat / -Length when the IFD holds a JPEG thumbnail.
0x0201: "JPEGInterchangeFormat",
0x0202: "JPEGInterchangeFormatLength",
// ExifTool returns vendor-specific names (MakerNoteApple etc.); we keep
// a single canonical "MakerNote" since vendor detection happens in the
// formatter (the value column shows e.g. "<binary, 2.5 KB (Nikon)>").
0x927c: "MakerNote",
};
const EXIF_SUBIFD_NAME_OVERRIDES: Record<number, string> = {
// Exif 2.32 §4.6.5 calls 0x9204 "ExposureBiasValue"; ExifTool aliases it
// to "ExposureCompensation". Keep the spec name.
0x9204: "ExposureBiasValue",
// Same MakerNote consideration as IFD0 — ExifTool has 91 vendor variants.
0x927c: "MakerNote",
};
// ---- Per-IFD display-shape overrides -----------------------------------
//
// Each entry is a tag where we want a specific display shape that the
// generic `pickFormatter` can't produce. Categories:
//
// 1. Specific units — ExposureTime renders `1/100 s`, FNumber renders
// `f/2.8`, FocalLength renders `28 mm`, FocalLengthIn35mmFormat
// renders `28 mm`.
// 2. Spec-vs-ExifTool wording disagreements — Orientation, ResolutionUnit,
// ExposureProgram have hand-rolled enum tables (see
// `ifd_value_formatters`) instead of ExifTool's PrintConv.
// 3. DateTime: ISO-ish dashes instead of `YYYY:MM:DD`.
// 4. Compound formatters — MakerNote vendor blob, GPS-version dotted
// quartet, GPS-time hh:mm:ss.
// 5. Tags read as something other than their declared type. The XP* tags
// are typed `int8u` in ExifTool but contain UTF-16LE strings (Microsoft
// extension). ExifVersion / FlashpixVersion are `undef` but contain ASCII
// version strings, and UserComment is `undef` but carries text.
// 6. Multi-element tags where the generic single-value reader would
// truncate silently — keep `describeUnknown` so the diff says
// `<RATIONAL, count 4>` instead of just the first rational.
// 7. ASCII enum tags (GPS{Status,MeasureMode,SpeedRef,DirectionRef}) —
// ExifTool's PrintConv has non-numeric keys, so `enumValues` is absent
// in the generated data; we bind hand-rolled ASCII enum tables via
// `formatAsciiEnum` instead.
const IFD0_OVERRIDES: Record<number, IfdTagDecoder> = {
// Multi-element / shape-description tags.
0x0102: { name: "BitsPerSample", format: describeUnknown },
0x0111: { name: "StripOffsets", format: describeUnknown },
0x0117: { name: "StripByteCounts", format: describeUnknown },
0x0201: {
name: "JPEGInterchangeFormat",
format: describeUnknown,
},
0x0212: { name: "YCbCrSubSampling", format: describeUnknown },
// Orientation — hand-rolled label table (see ifd_value_formatters).
0x0112: {
name: "Orientation",
format: (e) => formatOrientation(readShort(e)),
},
// Resolution: spec-style lowercase unit names.
0x0128: {
name: "ResolutionUnit",
format: (e) => formatResolutionUnit(readShort(e)),
},
// DateTime with ISO-ish dashes.
0x0132: {
name: "ModifyDate",
format: (e) => formatDateTime(readAscii(e)),
},
// XP* tags — typed `int8u` but encoded as UTF-16LE.
0x9c9b: { name: "XPTitle", format: readUtf16 },
0x9c9c: { name: "XPComment", format: readUtf16 },
0x9c9d: { name: "XPAuthor", format: readUtf16 },
0x9c9e: { name: "XPKeywords", format: readUtf16 },
0x9c9f: { name: "XPSubject", format: readUtf16 },
// MakerNote: opaque blob with vendor detection.
0x927c: { name: "MakerNote", format: formatMakerNoteBlob },
};
const EXIF_SUBIFD_OVERRIDES: Record<number, IfdTagDecoder> = {
// Photometric values with units.
0x829a: {
name: "ExposureTime",
format: (e) => formatExposureTime(readRational(e)),
},
0x829d: {
name: "FNumber",
format: (e) => formatFNumber(readRational(e)),
},
0x8822: {
name: "ExposureProgram",
format: (e) => formatExposureProgram(readShort(e)),
},
0x920a: {
name: "FocalLength",
format: (e) => formatFocalLength(readRational(e)),
},
0xa405: {
name: "FocalLengthIn35mmFormat",
format: (e) => `${readShort(e)} mm`,
},
// Version tags — `undef` containing ASCII like "0220". Generic would
// pick this up via the `undef` fallthrough; the override is kept so
// the format selection is explicit (the table doubles as documentation).
0x9000: { name: "ExifVersion", format: readUndefinedAsAscii },
0xa000: { name: "FlashpixVersion", format: readUndefinedAsAscii },
// DateTime tags — ISO-ish dashes.
0x9003: {
name: "DateTimeOriginal",
format: (e) => formatDateTime(readAscii(e)),
},
0x9004: {
name: "CreateDate",
format: (e) => formatDateTime(readAscii(e)),
},
// Multi-element / shape-description tags.
0x9101: { name: "ComponentsConfiguration", format: describeUnknown },
0x9214: { name: "SubjectArea", format: describeUnknown },
0xa214: { name: "SubjectLocation", format: describeUnknown },
0xa432: { name: "LensInfo", format: describeUnknown },
// UserComment — `undef` containing ASCII.
0x9286: { name: "UserComment", format: readUndefinedAsAscii },
// MakerNote: opaque blob with vendor detection.
0x927c: { name: "MakerNote", format: formatMakerNoteBlob },
};
const GPS_OVERRIDES: Record<number, IfdTagDecoder> = {
// Dotted version, e.g. "2.2.0.0".
0x0000: { name: "GPSVersionID", format: gpsVersionFormatter },
// hh:mm:ss from a RATIONAL[3] payload.
0x0007: {
name: "GPSTimeStamp",
format: (e) => formatGpsTime(readRationalArray(e, 3)),
},
// ASCII enums — single-character keys, hand-rolled label tables.
0x0009: {
name: "GPSStatus",
format: (e) => formatAsciiEnum(readAscii(e), GPS_STATUSES),
},
0x000a: {
name: "GPSMeasureMode",
format: (e) => formatAsciiEnum(readAscii(e), GPS_MEASURE_MODES),
},
0x000c: {
name: "GPSSpeedRef",
format: (e) => formatAsciiEnum(readAscii(e), GPS_SPEED_REFS),
},
0x000e: {
name: "GPSTrackRef",
format: (e) => formatAsciiEnum(readAscii(e), GPS_DIRECTION_REFS),
},
0x0010: {
name: "GPSImgDirectionRef",
format: (e) => formatAsciiEnum(readAscii(e), GPS_DIRECTION_REFS),
},
// Multi-element coordinate (destination) — shape only; we don't pair
// it with a ref the way we do for primary lat/lng in `emitGpsItems`.
0x0014: { name: "GPSDestLatitude", format: describeUnknown },
0x0016: { name: "GPSDestLongitude", format: describeUnknown },
// GPSProcessingMethod — `undef` payload, sometimes prefixed by an
// encoding marker ("ASCII\0\0\0"). Smart-detect printability so a
// non-ASCII (Unicode) payload shows up as `<binary, N B>` instead of
// garbled glyphs.
0x001b: { name: "GPSProcessingMethod", format: readUndefined },
};
const INTEROP_OVERRIDES: Record<number, IfdTagDecoder> = {
// InteropVersion: `undef` containing ASCII like "0100".
0x0002: { name: "InteropVersion", format: readUndefinedAsAscii },
};
// ---- Decoder construction ----------------------------------------------
function buildDecoders(
data: Record<number, TagData>,
overrides: Record<number, IfdTagDecoder>,
nameOverrides: Record<number, string>,
): Record<number, IfdTagDecoder> {
const out: Record<number, IfdTagDecoder> = {};
for (const [idStr, tag] of Object.entries(data)) {
const id = Number(idStr);
const override = overrides[id];
if (override !== undefined) {
out[id] = override;
continue;
}
out[id] = {
name: nameOverrides[id] ?? tag.name,
format: pickFormatter(tag.type, tag.enumValues),
};
}
// Also include override entries for tags absent from the generated data.
// This covers tags that ExifTool places in a different IFD bucket than the
// one we walk them from (e.g. 0x927c MakerNote appears in ExifIFD in the
// generated data but is also emitted from IFD0 in real-world files).
for (const [idStr, decoder] of Object.entries(overrides)) {
const id = Number(idStr);
if (out[id] === undefined) {
out[id] = decoder;
}
}
return out;
}
export function defaultDecoder(tag: number): IfdTagDecoder {
return {
name: `Tag 0x${tag.toString(16).padStart(4, "0").toUpperCase()}`,
format: (e) => describeUnknown(e),
};
}
// ---- Decoder tables ----------------------------------------------------
//
// IFD0 / IFD1 / SubIFD share the TIFF tag namespace — single table.
//
// Pointer tags (0x8769 ExifIFDPointer, 0x8825 GPSIFDPointer) are NOT in
// this table — the walker consumes them to chase the SubIFD/GPS IFDs
// and emitting them as user-visible items would double-count and confuse
// the diff. Same reasoning applies to InteropIFDPointer (0xA005) which
// sits in EXIF_SUBIFD_DECODERS' would-be slot.
export const IFD_DECODERS: Record<number, IfdTagDecoder> = buildDecoders(
IFD0_TAG_DATA,
IFD0_OVERRIDES,
IFD0_NAME_OVERRIDES,
);
export const EXIF_SUBIFD_DECODERS: Record<number, IfdTagDecoder> =
buildDecoders(
EXIF_SUBIFD_TAG_DATA,
EXIF_SUBIFD_OVERRIDES,
EXIF_SUBIFD_NAME_OVERRIDES,
);
// GPS IFD decoders.
//
// The compound lat/lng pair (0x0001/0x0002 and 0x0003/0x0004) is intentionally
// NOT emitted by the per-tag table — `emitGpsItems` combines each ref with
// its coordinate triplet into a single human-readable item. We still
// produce decoders for those tags (the table is a `Record<number, _>`),
// but `emitGpsItems` consumes them before falling through to the table.
export const GPS_DECODERS: Record<number, IfdTagDecoder> = buildDecoders(
GPS_TAG_DATA,
GPS_OVERRIDES,
{},
);
// InteropIFD has its own tiny tag namespace (Exif 2.32 §4.6.7). Tag 0x0001 is
// ASCII, 0x0002 is UNDEFINED-bytes-encoding-ASCII-version (e.g. "0100").
export const INTEROP_DECODERS: Record<number, IfdTagDecoder> = buildDecoders(
INTEROP_TAG_DATA,
INTEROP_OVERRIDES,
{},
);
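
The precedence applied by `buildDecoders` (a full override wins, then a name override, then the generated name) can be seen with a toy table. This is a simplified sketch: `buildToyDecoders` is a hypothetical stand-in, decoders format a plain number instead of an `IfdEntry`, and the tag data is hand-written rather than generated.

```typescript
interface ToyTag {
	readonly name: string;
	readonly type: string;
}
interface ToyDecoder {
	readonly name: string;
	readonly format: (value: number) => string;
}

// Simplified stand-in for buildDecoders (assumption: numeric values only).
function buildToyDecoders(
	data: Record<number, ToyTag>,
	overrides: Record<number, ToyDecoder>,
	nameOverrides: Record<number, string>,
): Record<number, ToyDecoder> {
	const out: Record<number, ToyDecoder> = {};
	for (const [idStr, tag] of Object.entries(data)) {
		const id = Number(idStr);
		// A full override replaces both the name and the formatter.
		out[id] = overrides[id] ?? {
			name: nameOverrides[id] ?? tag.name,
			format: (value) => `${value}`,
		};
	}
	// Overrides for ids missing from the data are still included.
	for (const [idStr, decoder] of Object.entries(overrides)) {
		const id = Number(idStr);
		if (out[id] === undefined) out[id] = decoder;
	}
	return out;
}

const toy = buildToyDecoders(
	{
		0x0112: { name: "Orientation", type: "int16u" },
		0x9204: { name: "ExposureCompensation", type: "rational64s" },
	},
	{
		0x0112: { name: "Orientation", format: (v) => `${v} (custom)` },
		0x927c: { name: "MakerNote", format: () => "<binary>" },
	},
	{ 0x9204: "ExposureBiasValue" },
);
```

Here `0x0112` takes the override formatter, `0x9204` keeps the generic formatter but gets the spec name, and `0x927c` appears even though the toy data never listed it.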


@ -0,0 +1,55 @@
// One parsed Image File Directory entry from a TIFF/Exif stream. Pure data
// — readers in `ifd_readers.ts` take an `IfdEntry` and turn it into a number,
// string, or rational. Decoders in `ifd_decoders.ts` chain a reader + a
// formatter to produce the display string.
export interface IfdEntry {
readonly tag: number;
readonly type: number;
readonly count: number;
// The value-or-offset slot's resolved offset into `tiff`. For values that
// fit inline (≤ 4 bytes), this is the offset of the slot itself; readers
// interpret the first N bytes there. For values that don't fit, the slot
// stores an offset and `valueDataOffset` resolves through that indirection.
readonly valueDataOffset: number;
readonly tiff: Uint8Array;
readonly isLE: boolean;
}
// TIFF entry types (Exif 2.32 §4.6.2).
export const TYPE_BYTE = 1;
export const TYPE_ASCII = 2;
export const TYPE_SHORT = 3;
export const TYPE_LONG = 4;
export const TYPE_RATIONAL = 5;
export const TYPE_UNDEFINED = 7;
export const TYPE_SLONG = 9;
export const TYPE_SRATIONAL = 10;
export const TYPE_SIZES: Record<number, number> = {
[TYPE_BYTE]: 1,
[TYPE_ASCII]: 1,
[TYPE_SHORT]: 2,
[TYPE_LONG]: 4,
[TYPE_RATIONAL]: 8,
[TYPE_UNDEFINED]: 1,
[TYPE_SLONG]: 4,
[TYPE_SRATIONAL]: 8,
};
export const TYPE_NAMES: Record<number, string> = {
[TYPE_BYTE]: "BYTE",
[TYPE_ASCII]: "ASCII",
[TYPE_SHORT]: "SHORT",
[TYPE_LONG]: "LONG",
[TYPE_RATIONAL]: "RATIONAL",
[TYPE_UNDEFINED]: "UNDEFINED",
[TYPE_SLONG]: "SLONG",
[TYPE_SRATIONAL]: "SRATIONAL",
};
// Displayed when a value is unavailable (e.g. rational with zero denominator).
// Matches the em-dash used in BEFORE/AFTER cells in the file table; avoids
// surfacing the literal string "undefined" or fingerprinting our specific
// formatter implementation.
export const VALUE_UNAVAILABLE = "—" as const;
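
The `TYPE_SIZES` table decides whether an entry's value sits inline in the 4-byte value-or-offset slot or behind an offset. A minimal sketch of that rule, assuming a local copy of the table; `storesOffset` is a hypothetical helper name, not part of this commit:

```typescript
// Local copy of the per-type byte sizes shown in the diff.
const TYPE_SIZES: Record<number, number> = {
  1: 1, // BYTE
  2: 1, // ASCII
  3: 2, // SHORT
  4: 4, // LONG
  5: 8, // RATIONAL
  7: 1, // UNDEFINED
  9: 4, // SLONG
  10: 8, // SRATIONAL
};

// Hypothetical helper: true when the slot holds an offset rather than the
// value itself. Unknown types are treated as inline, mirroring the walker's
// fallback of leaving the offset pointed at the slot.
function storesOffset(type: number, count: number): boolean {
  const size = TYPE_SIZES[type];
  if (size === undefined) return false;
  return size * count > 4;
}
```

Under this rule a single SHORT (such as Orientation) is read in place, while any RATIONAL, at 8 bytes, always goes through the offset.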

@ -0,0 +1,146 @@
// Low-level value readers for Image File Directory entries. Each reader
// takes an `IfdEntry` and pulls the typed value out of the underlying TIFF
// buffer, honouring endian + inline/offset indirection.
import type { IfdEntry } from "./ifd_entry";
import { TYPE_NAMES } from "./ifd_entry";
export function makeReader16(
  buf: Uint8Array,
  isLE: boolean,
): (off: number) => number {
  return (off) =>
    isLE
      ? ((buf[off] ?? 0) | ((buf[off + 1] ?? 0) << 8)) >>> 0
      : (((buf[off] ?? 0) << 8) | (buf[off + 1] ?? 0)) >>> 0;
}
export function makeReader32(
  buf: Uint8Array,
  isLE: boolean,
): (off: number) => number {
  return (off) =>
    isLE
      ? ((buf[off] ?? 0) |
          ((buf[off + 1] ?? 0) << 8) |
          ((buf[off + 2] ?? 0) << 16) |
          ((buf[off + 3] ?? 0) << 24)) >>>
        0
      : (((buf[off] ?? 0) << 24) |
          ((buf[off + 1] ?? 0) << 16) |
          ((buf[off + 2] ?? 0) << 8) |
          (buf[off + 3] ?? 0)) >>>
        0;
}
export function readAscii(e: IfdEntry): string {
const end = Math.min(e.valueDataOffset + e.count, e.tiff.length);
let s = "";
for (let i = e.valueDataOffset; i < end; i++) {
const b = e.tiff[i] ?? 0;
if (b === 0) break;
s += String.fromCharCode(b);
}
return s;
}
export function readShort(e: IfdEntry): number {
const r16 = makeReader16(e.tiff, e.isLE);
return r16(e.valueDataOffset);
}
export function readLong(e: IfdEntry): number {
const r32 = makeReader32(e.tiff, e.isLE);
return r32(e.valueDataOffset);
}
export function readByte(e: IfdEntry): number {
return e.tiff[e.valueDataOffset] ?? 0;
}
export function readRational(e: IfdEntry): [number, number] {
const r32 = makeReader32(e.tiff, e.isLE);
const num = r32(e.valueDataOffset);
const den = r32(e.valueDataOffset + 4);
return [num, den];
}
export function readRationalArray(
e: IfdEntry,
count: number,
): Array<[number, number]> {
const r32 = makeReader32(e.tiff, e.isLE);
const result: Array<[number, number]> = [];
for (let i = 0; i < count; i++) {
const off = e.valueDataOffset + i * 8;
result.push([r32(off), r32(off + 4)]);
}
return result;
}
export function readUtf16(e: IfdEntry): string {
// XPTitle/XPComment/etc. — UTF-16LE in practice (Microsoft tags).
const end = Math.min(e.valueDataOffset + e.count, e.tiff.length);
let s = "";
for (let i = e.valueDataOffset; i + 1 < end; i += 2) {
const code = (e.tiff[i] ?? 0) | ((e.tiff[i + 1] ?? 0) << 8);
if (code === 0) break;
s += String.fromCharCode(code);
}
return s;
}
// InteropVersion (and similar Exif-defined version tags) use TYPE_UNDEFINED
// with `count === 4`, but the four bytes are ASCII digits like "0100". Use
// this reader when the tag is documented as version-ASCII-in-UNDEFINED.
export function readUndefinedAsAscii(e: IfdEntry): string {
const end = Math.min(e.valueDataOffset + e.count, e.tiff.length);
let s = "";
for (let i = e.valueDataOffset; i < end; i++) {
const b = e.tiff[i] ?? 0;
if (b === 0) break;
s += String.fromCharCode(b);
}
return s;
}
export function readUndefined(e: IfdEntry): string {
  // For UNDEFINED tags: if the bytes look printable, emit as ASCII; else
  // describe as a binary blob.
  const end = Math.min(e.valueDataOffset + e.count, e.tiff.length);
  let printable = true;
  for (let i = e.valueDataOffset; i < end; i++) {
    const b = e.tiff[i] ?? 0;
    if (b < 0x20 || b > 0x7e) {
      if (b !== 0) {
        printable = false;
        break;
      }
    }
  }
  if (printable) {
    let s = "";
    for (let i = e.valueDataOffset; i < end; i++) {
      const b = e.tiff[i] ?? 0;
      if (b === 0) break;
      s += String.fromCharCode(b);
    }
    return s;
  }
  return `<binary, ${formatByteSize(e.count)}>`;
}
// Shared with the strategies: byte-size formatter for blob-y values
// ("<binary, 1.2 KB>" etc.).
export function formatByteSize(bytes: number): string {
if (bytes < 1024) return `${bytes} B`;
if (bytes < 1024 * 1024) return `${(bytes / 1024).toFixed(1)} KB`;
return `${(bytes / 1024 / 1024).toFixed(1)} MB`;
}
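
A quick illustration of how the thresholds and `toFixed(1)` rounding interact, using a local copy of the formatter above. One thing this sketch appears to show: a value just under 1 MiB is still in the KB branch and rounds up to `1024.0 KB`.

```typescript
// Local copy of formatByteSize from the diff, for illustration only.
function formatByteSize(bytes: number): string {
  if (bytes < 1024) return `${bytes} B`;
  if (bytes < 1024 * 1024) return `${(bytes / 1024).toFixed(1)} KB`;
  return `${(bytes / 1024 / 1024).toFixed(1)} MB`;
}

const samples = [1023, 1024, 1536, 1024 * 1024 - 1, 1024 * 1024];
const rendered = samples.map(formatByteSize);
// rendered: ["1023 B", "1.0 KB", "1.5 KB", "1024.0 KB", "1.0 MB"]
```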
// Fallback formatter for tags we don't decode (or decode partially): describe
// shape (type + count) without showing raw bytes.
export function describeUnknown(e: IfdEntry): string {
const typeName = TYPE_NAMES[e.type] ?? `type ${e.type}`;
return `<${typeName}, count ${e.count}>`;
}
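
The shift-and-or readers in this file can be cross-checked against `DataView`, a standard way to read the same integers. A small sketch, assuming in-range offsets (the readers above additionally treat out-of-range bytes as 0, which `DataView` does not); `shiftRead32` and `viewRead32` are illustrative names:

```typescript
// Copy of the 32-bit shift-and-or logic from the diff.
function shiftRead32(buf: Uint8Array, off: number, isLE: boolean): number {
  const b0 = buf[off] ?? 0;
  const b1 = buf[off + 1] ?? 0;
  const b2 = buf[off + 2] ?? 0;
  const b3 = buf[off + 3] ?? 0;
  // `<< 24` yields a signed int32; `>>> 0` reinterprets it as unsigned.
  return isLE
    ? (b0 | (b1 << 8) | (b2 << 16) | (b3 << 24)) >>> 0
    : ((b0 << 24) | (b1 << 16) | (b2 << 8) | b3) >>> 0;
}

// Reference implementation built on DataView.
function viewRead32(buf: Uint8Array, off: number, isLE: boolean): number {
  const dv = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  return dv.getUint32(off, isLE);
}

const sampleBytes = new Uint8Array([
  0x12, 0x34, 0x56, 0x78, 0xff, 0xff, 0xff, 0xff,
]);
const littleEndian = shiftRead32(sampleBytes, 0, true); // 0x78563412
const bigEndian = shiftRead32(sampleBytes, 0, false); // 0x12345678
const allOnes = shiftRead32(sampleBytes, 4, true); // 0xffffffff, not -1
```

The `>>> 0` is the part that matters for the last case: without it, a top byte of `0x80` or above would make the result negative.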

@ -0,0 +1,303 @@
// TIFF/Exif IFD walker shared by every format that wraps an Exif payload —
// JPEG APP1 (with "Exif\0\0" prefix), PNG eXIf chunks (raw TIFF), HEIC Exif
// items (raw TIFF, optionally offset-prefixed), native TIFF.
//
// The walker enumerates IFD0 → IFD1 → SubIFD → InteropIFD → GPS, calling
// decoder tables from `ifd_decoders.ts` to render per-tag display values.
// Pointer tags (ExifIFDPointer, GPSIFDPointer, InteropIFDPointer) are
// consumed for navigation, never emitted — emitting them would double-count.
import type { MetadataItem, StripOptions } from "../../../domain";
import {
formatGpsCoord,
formatOrientation,
} from "../../../domain/exif/ifd_value_formatters";
import {
EXIF_SUBIFD_DECODERS,
GPS_DECODERS,
IFD_DECODERS,
INTEROP_DECODERS,
defaultDecoder,
type IfdTagDecoder,
} from "./ifd_decoders";
import type { IfdEntry } from "./ifd_entry";
import { TYPE_SIZES } from "./ifd_entry";
import {
makeReader16,
makeReader32,
readAscii,
readLong,
readRationalArray,
readShort,
} from "./ifd_readers";
// Pointer tags consumed for navigation, never emitted.
const TAG_EXIF_IFD_POINTER = 0x8769;
const TAG_GPS_IFD_POINTER = 0x8825;
const TAG_INTEROP_IFD_POINTER = 0xa005;
const TAG_ORIENTATION = 0x0112;
// "Exif\0\0" — APP1 EXIF identifier prefix.
const EXIF_PREFIX_LEN = 6;
// Minimum APP1 EXIF payload that could carry a single SHORT entry:
// 6 (prefix) + 8 (TIFF header) + 2 (entry count) + 12 (1 entry) + 4 (next-IFD) = 32.
const EXIF_MIN_PAYLOAD = 32;
const ORIENTATION_MIN = 1;
const ORIENTATION_MAX = 8;
export interface EnumerationResult {
readonly items: readonly MetadataItem[];
readonly orientation: number | null;
}
export function parseIfd(
tiff: Uint8Array,
offset: number,
isLE: boolean,
): readonly IfdEntry[] {
const r16 = makeReader16(tiff, isLE);
const r32 = makeReader32(tiff, isLE);
if (offset + 2 > tiff.length) return [];
const count = r16(offset);
const entries: IfdEntry[] = [];
for (let n = 0; n < count; n++) {
const off = offset + 2 + n * 12;
if (off + 12 > tiff.length) break;
const tag = r16(off);
const type = r16(off + 2);
const entryCount = r32(off + 4);
const slotOffset = off + 8;
const typeSize = TYPE_SIZES[type];
// Resolve value-or-offset: payload size ≤ 4 → inline at slotOffset,
// else the slot holds a 32-bit offset into the TIFF buffer.
let valueDataOffset = slotOffset;
if (typeSize !== undefined) {
const payloadSize = typeSize * entryCount;
if (payloadSize > 4) {
valueDataOffset = r32(slotOffset);
}
}
entries.push({
tag,
type,
count: entryCount,
valueDataOffset,
tiff,
isLE,
});
}
return entries;
}
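
To make the 12-byte entry layout concrete, the sketch below hand-builds a minimal little-endian TIFF with a single Orientation entry and lists its entries. It is a standalone illustration using `DataView`, not the commit's `parseIfd`; `buildOrientationTiff`, `listEntries` and `RawEntry` are hypothetical names.

```typescript
interface RawEntry {
  tag: number;
  type: number;
  count: number;
  slotOffset: number;
}

// Build a 26-byte little-endian TIFF: header (8) + entry count (2)
// + one 12-byte entry + next-IFD offset (4).
function buildOrientationTiff(orientation: number): Uint8Array {
  const buf = new Uint8Array(26);
  const dv = new DataView(buf.buffer);
  buf[0] = 0x49; // "I"
  buf[1] = 0x49; // "I", so little-endian
  dv.setUint16(2, 0x002a, true); // TIFF magic
  dv.setUint32(4, 8, true); // IFD0 starts right after the header
  dv.setUint16(8, 1, true); // one entry
  dv.setUint16(10, 0x0112, true); // tag: Orientation
  dv.setUint16(12, 3, true); // type: SHORT
  dv.setUint32(14, 1, true); // count
  dv.setUint16(18, orientation, true); // value, stored inline in the slot
  return buf; // bytes 22..25 stay 0, meaning "no IFD1"
}

// List the raw entries of IFD0, stopping at the first truncated entry.
function listEntries(tiff: Uint8Array): RawEntry[] {
  if (tiff.length < 8) return [];
  const dv = new DataView(tiff.buffer, tiff.byteOffset, tiff.byteLength);
  const isLE = tiff[0] === 0x49 && tiff[1] === 0x49;
  const ifd0 = dv.getUint32(4, isLE);
  if (ifd0 + 2 > tiff.length) return [];
  const n = dv.getUint16(ifd0, isLE);
  const entries: RawEntry[] = [];
  for (let i = 0; i < n; i++) {
    const off = ifd0 + 2 + i * 12;
    if (off + 12 > tiff.length) break;
    entries.push({
      tag: dv.getUint16(off, isLE),
      type: dv.getUint16(off + 2, isLE),
      count: dv.getUint32(off + 4, isLE),
      slotOffset: off + 8,
    });
  }
  return entries;
}

const demoTiff = buildOrientationTiff(6);
const demoEntries = listEntries(demoTiff);
```

Because one SHORT is 2 bytes, the value is read directly at `slotOffset` (byte 18 here) rather than through an offset.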
// Detect a JPEG APP1 EXIF payload by its "Exif\0\0" prefix.
export function isExifSegment(payload: Uint8Array): boolean {
return (
payload.length >= 6 &&
payload[0] === 0x45 &&
payload[1] === 0x78 &&
payload[2] === 0x69 &&
payload[3] === 0x66 &&
payload[4] === 0x00 &&
payload[5] === 0x00
);
}
// Public entry point for JPEG APP1 EXIF: handles the "Exif\0\0" prefix +
// TIFF header detection, then delegates to enumerateExifTiff.
export function enumerateExifSegment(
payload: Uint8Array,
options: StripOptions,
): EnumerationResult {
if (payload.length < EXIF_MIN_PAYLOAD) {
return { items: [], orientation: null };
}
if (
payload[0] !== 0x45 ||
payload[1] !== 0x78 ||
payload[2] !== 0x69 ||
payload[3] !== 0x66 ||
payload[4] !== 0x00 ||
payload[5] !== 0x00
) {
return { items: [], orientation: null };
}
const tiff = payload.subarray(EXIF_PREFIX_LEN);
const isLE = tiff[0] === 0x49 && tiff[1] === 0x49;
const isBE = tiff[0] === 0x4d && tiff[1] === 0x4d;
if (!isLE && !isBE) return { items: [], orientation: null };
const r16 = makeReader16(tiff, isLE);
if (r16(2) !== 0x002a) return { items: [], orientation: null };
return enumerateExifTiff(tiff, isLE, options);
}
// Lower-level entry point for prefix-less Exif payloads (PNG eXIf chunk,
// HEIC Exif item with offset). The caller is responsible for slicing out
// the raw TIFF stream (starting at the byte-order mark) and detecting
// endianness — `isLE` is the result of inspecting bytes 0..1.
export function enumerateExifTiff(
tiff: Uint8Array,
isLE: boolean,
options: StripOptions,
): EnumerationResult {
const r16 = makeReader16(tiff, isLE);
const r32 = makeReader32(tiff, isLE);
const ifd0Offset = r32(4);
const items: MetadataItem[] = [];
let orientation: number | null = null;
let exifSubIfdOffset: number | null = null;
let gpsIfdOffset: number | null = null;
const ifd0Entries = parseIfd(tiff, ifd0Offset, isLE);
for (const entry of ifd0Entries) {
if (entry.tag === TAG_EXIF_IFD_POINTER) {
exifSubIfdOffset = readLong(entry);
continue;
}
if (entry.tag === TAG_GPS_IFD_POINTER) {
gpsIfdOffset = readLong(entry);
continue;
}
if (entry.tag === TAG_ORIENTATION) {
const v = readShort(entry);
if (v >= ORIENTATION_MIN && v <= ORIENTATION_MAX) {
orientation = v;
}
}
items.push(emitItem(entry, "EXIF", IFD_DECODERS));
}
// Walk IFD1 (thumbnail) via next-IFD pointer after IFD0's entry table.
const ifd0Count = tiff.length >= ifd0Offset + 2 ? r16(ifd0Offset) : 0;
const ifd0EndOffset = ifd0Offset + 2 + ifd0Count * 12;
if (ifd0EndOffset + 4 <= tiff.length) {
const nextIfdOffset = r32(ifd0EndOffset);
if (nextIfdOffset !== 0) {
const ifd1Entries = parseIfd(tiff, nextIfdOffset, isLE);
for (const entry of ifd1Entries) {
items.push(emitItem(entry, "EXIF", IFD_DECODERS));
}
}
}
// Walk SubIFD
let interopIfdOffset: number | null = null;
if (exifSubIfdOffset !== null) {
const subEntries = parseIfd(tiff, exifSubIfdOffset, isLE);
for (const entry of subEntries) {
if (entry.tag === TAG_INTEROP_IFD_POINTER) {
interopIfdOffset = readLong(entry);
continue;
}
items.push(emitItem(entry, "EXIF", EXIF_SUBIFD_DECODERS));
}
}
// Walk InteropIFD (small table — Index, Version, etc.)
if (interopIfdOffset !== null) {
const interopEntries = parseIfd(tiff, interopIfdOffset, isLE);
for (const entry of interopEntries) {
items.push(emitItem(entry, "EXIF", INTEROP_DECODERS));
}
}
// Walk GPS IFD with compound coordinate pairing.
if (gpsIfdOffset !== null) {
const gpsEntries = parseIfd(tiff, gpsIfdOffset, isLE);
items.push(...emitGpsItems(gpsEntries));
}
// preserveOrientation: callers that can synthesize an Orientation-only
// replacement (e.g. JPEG) relabel the corresponding removed item as
// kept. From the user's perspective Orientation is preserved (same
// value, before == after), so the entry stays in the diff with action
// "kept" instead of "removed". Every other IFD entry stays removed —
// those tags really did disappear.
if (options.preserveOrientation && orientation !== null) {
const idx = items.findIndex(
(i) => i.name === "Orientation" && i.source === "EXIF",
);
if (idx >= 0) {
items[idx] = {
action: "kept",
source: "EXIF",
name: "Orientation",
value: formatOrientation(orientation),
reason: "diffKeptReasonOrientation",
};
}
}
return { items, orientation };
}
function emitItem(
entry: IfdEntry,
source: string,
table: Record<number, IfdTagDecoder>,
): MetadataItem {
const decoder = table[entry.tag] ?? defaultDecoder(entry.tag);
return {
action: "removed",
source,
name: decoder.name,
valueBefore: decoder.format(entry),
};
}
// GPS coordinates: combine GPSLatitude (RATIONAL[3]) + GPSLatitudeRef ("N"/"S")
// into a single compound item. Same for longitude. Other GPS tags emit individually.
export function emitGpsItems(
entries: readonly IfdEntry[],
): readonly MetadataItem[] {
const byTag = new Map<number, IfdEntry>();
for (const e of entries) byTag.set(e.tag, e);
const items: MetadataItem[] = [];
const lat = byTag.get(0x0002);
const latRef = byTag.get(0x0001);
if (lat !== undefined && latRef !== undefined) {
const triplet = readRationalArray(lat, 3);
const ref = readAscii(latRef);
items.push({
action: "removed",
source: "GPS",
name: "GPSLatitude",
valueBefore: formatGpsCoord(triplet, ref),
});
byTag.delete(0x0001);
byTag.delete(0x0002);
}
const lng = byTag.get(0x0004);
const lngRef = byTag.get(0x0003);
if (lng !== undefined && lngRef !== undefined) {
const triplet = readRationalArray(lng, 3);
const ref = readAscii(lngRef);
items.push({
action: "removed",
source: "GPS",
name: "GPSLongitude",
valueBefore: formatGpsCoord(triplet, ref),
});
byTag.delete(0x0003);
byTag.delete(0x0004);
}
for (const [tag, entry] of byTag) {
const decoder = GPS_DECODERS[tag] ?? defaultDecoder(tag);
items.push({
action: "removed",
source: "GPS",
name: decoder.name,
valueBefore: decoder.format(entry),
});
}
return items;
}
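
`formatGpsCoord` lives in the domain layer and is not part of this excerpt, so its exact output format is unknown here. As a sketch of the arithmetic a compound coordinate item needs, the hypothetical `gpsToDecimal` below turns a RATIONAL triplet plus its ref into signed decimal degrees, returning `null` when a denominator is zero:

```typescript
type Rational = [number, number];

// Hypothetical helper, not the commit's formatGpsCoord: degrees, minutes and
// seconds (each a numerator/denominator pair) to signed decimal degrees.
function gpsToDecimal(
  triplet: readonly Rational[],
  ref: string,
): number | null {
  if (triplet.length < 3) return null;
  const parts: number[] = [];
  for (const [num, den] of triplet.slice(0, 3)) {
    if (den === 0) return null; // value unavailable
    parts.push(num / den);
  }
  const [deg, min, sec] = parts as [number, number, number];
  const value = deg + min / 60 + sec / 3600;
  // South and West are negative by convention.
  return ref === "S" || ref === "W" ? -value : value;
}
```

For example, 40° 26′ 46″ N comes out near 40.4461, and the same triplet with ref `"S"` has the opposite sign.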

@ -1,11 +1,11 @@
import type { Result } from "../../common";
import type { ExifError, StripOptions } from "../../domain";
import type { ExifError, MetadataItem, StripOptions } from "../../domain";
export type { StripOptions };
export interface StripResult {
readonly bytes: Uint8Array;
readonly metadataRemoved: number;
readonly metadataItems: readonly MetadataItem[];
}
export interface FormatStrategy {

@ -1,5 +1,7 @@
import type { Result } from "../../../common";
import type { ExifError } from "../../../domain";
import type { ExifError, MetadataItem } from "../../../domain";
import { enumerateExifSegment, isExifSegment } from "../exif/ifd_walker";
import { formatByteSize } from "../exif/ifd_readers";
import type {
FormatStrategy,
StripOptions,
@ -34,7 +36,7 @@ import type {
// and RST0..RST7 restart markers within the stream are preserved.
//
// Reference: ITU-T T.81 (JPEG) §B.1.1; Exif 2.32 §4.6 (TIFF/Exif IFD layout);
// ExifTool limitations doc.
// ExifTool limitations doc. EXIF walker lives in `../exif/`.
const SOI = 0xd8;
const EOI = 0xd9;
@ -44,21 +46,16 @@ const RST_FIRST = 0xd0;
const RST_LAST = 0xd7;
const APP_FIRST = 0xe0;
const APP_LAST = 0xef;
const APP0_JFIF = 0xe0;
const APP1_EXIF_OR_XMP = 0xe1;
const APP2_ICC = 0xe2;
const APP14_ADOBE = 0xee;
const COM = 0xfe;
// EXIF in IFD0 — see Exif 2.32 §4.6.
// Exif IFD0 tag — synthesized into the replacement APP1 when
// preserveOrientation is on.
const TAG_ORIENTATION = 0x0112;
const TYPE_SHORT = 3;
const ORIENTATION_MIN = 1;
const ORIENTATION_MAX = 8;
// "Exif\0\0" — APP1 EXIF identifier prefix.
const EXIF_PREFIX_LEN = 6;
// Minimum APP1 EXIF payload that could carry a single SHORT entry:
// 6 (prefix) + 8 (TIFF header) + 2 (entry count) + 12 (1 entry) + 4 (next-IFD) = 32.
const EXIF_MIN_PAYLOAD = 32;
function isStandalone(marker: number): boolean {
// Markers without a length field. Standalone markers consume only
@ -85,61 +82,143 @@ function shouldDropSegment(marker: number, options: StripOptions): boolean {
return false;
}
// Parse an APP1 segment payload (everything after the FF E1 + length field) and
// return the EXIF Orientation value (1..8) from IFD0 if present. Returns null
// when the payload is not a parseable EXIF block, has no Orientation entry, or
// the entry is malformed.
function extractOrientation(payload: Uint8Array): number | null {
if (payload.length < EXIF_MIN_PAYLOAD) return null;
// "Exif\0\0" — distinguishes APP1 EXIF from APP1 XMP.
if (
payload[0] !== 0x45 ||
payload[1] !== 0x78 ||
payload[2] !== 0x69 ||
payload[3] !== 0x66 ||
payload[4] !== 0x00 ||
payload[5] !== 0x00
) {
// ---- Non-IFD segment description ----------------------------------------
//
// `describeDroppedSegment` is observation-only: it inspects a segment that
// `shouldDropSegment` already decided to drop, and produces zero or more
// MetadataItems describing what's being lost. The walker drops the bytes
// either way; this just lets the diff record what was there.
function describeDroppedSegment(
marker: number,
payload: Uint8Array,
): readonly MetadataItem[] {
// APP1: EXIF or XMP, discriminated by identifier prefix. The walker
// enumerates EXIF itself (it needs `options`), so the EXIF branch here is
// only a fallback for callers that hand us a generic APP1.
if (marker === APP1_EXIF_OR_XMP) {
if (isExifSegment(payload)) {
// Reached only by tests / fallback paths that don't carry options; the
// walker calls enumerateExifSegment directly with options. Default to
// non-preserving here.
return enumerateExifSegment(payload, {
preserveOrientation: false,
preserveColorProfile: false,
preserveTimestamps: false,
}).items;
}
if (isXmpSegment(payload)) {
return [
{
action: "removed",
source: "XMP",
name: "XMP packet",
valueBefore: `<XML, ${formatByteSize(payload.length)}>`,
},
];
}
return [];
}
if (marker === APP2_ICC) {
const desc = extractIccDescription(payload);
return [
{
action: "removed",
source: "ICC",
name: desc ?? "ICC profile",
valueBefore: `<binary, ${formatByteSize(payload.length)}>`,
},
];
}
if (marker === APP0_JFIF) {
return [
{
action: "removed",
source: "JFIF",
name: "JFIF block",
valueBefore: summariseJfif(payload),
},
];
}
if (marker === COM) {
return [
{
action: "removed",
source: "Comment",
name: "Comment",
valueBefore: new TextDecoder("utf-8", { fatal: false }).decode(payload),
},
];
}
return [];
}
function isXmpSegment(payload: Uint8Array): boolean {
if (payload.length < 28) return false;
const text = new TextDecoder().decode(payload.subarray(0, 28));
return text.startsWith("http://ns.adobe.com/xap/1.0/");
}
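
Both EXIF and XMP travel in APP1 and are told apart only by their identifier prefix. A compact sketch of that discrimination as a single function; `classifyApp1`, `hasPrefix` and `App1Kind` are hypothetical names, and the prefixes are the same ones checked by `isExifSegment` and `isXmpSegment`:

```typescript
const EXIF_ID: readonly number[] = [0x45, 0x78, 0x69, 0x66, 0x00, 0x00]; // "Exif\0\0"
const XMP_ID: readonly number[] = Array.from(
  new TextEncoder().encode("http://ns.adobe.com/xap/1.0/"),
);

// Byte-wise prefix comparison; avoids decoding the payload to a string.
function hasPrefix(payload: Uint8Array, prefix: readonly number[]): boolean {
  if (payload.length < prefix.length) return false;
  return prefix.every((byte, i) => payload[i] === byte);
}

type App1Kind = "exif" | "xmp" | "other";

function classifyApp1(payload: Uint8Array): App1Kind {
  if (hasPrefix(payload, EXIF_ID)) return "exif";
  if (hasPrefix(payload, XMP_ID)) return "xmp";
  return "other";
}
```

Comparing bytes directly is a design choice of this sketch; the `TextDecoder` approach above gives the same answer for ASCII prefixes.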
// ICC profile structure (ICC.1:2010 §7): 14 bytes of JPEG framing
// ("ICC_PROFILE\0" + chunk-index + chunk-count), then the ICC profile proper:
// a 128-byte header followed by a tag table (4-byte tag count + 12-byte entries).
// Locate the `desc` tag (signature 0x64657363, 'desc') and read its embedded
// ASCII description. Best-effort: any structural mismatch falls back to null
// and the caller substitutes a generic label.
function extractIccDescription(payload: Uint8Array): string | null {
const JPEG_HEADER_LEN = 14;
const ICC_HEADER_LEN = 128;
const DESC_SIG = 0x64657363; // 'desc'
try {
if (payload.length < JPEG_HEADER_LEN + ICC_HEADER_LEN + 4) return null;
const profileBytes = payload.subarray(JPEG_HEADER_LEN);
const dv = new DataView(
profileBytes.buffer,
profileBytes.byteOffset,
profileBytes.byteLength,
);
const tagCount = dv.getUint32(ICC_HEADER_LEN);
for (let n = 0; n < tagCount; n++) {
const entry = ICC_HEADER_LEN + 4 + n * 12;
if (entry + 12 > profileBytes.length) break;
const sig = dv.getUint32(entry);
if (sig !== DESC_SIG) continue;
const descOffset = dv.getUint32(entry + 4);
const descLength = dv.getUint32(entry + 8);
if (descOffset + descLength > profileBytes.length) return null;
// `desc` tag body: 'desc' (4) + reserved (4) + ascii-len (4) + ascii bytes.
const asciiLen = dv.getUint32(descOffset + 8);
if (asciiLen === 0) return null;
const asciiOffset = descOffset + 12;
const asciiEnd = asciiOffset + asciiLen - 1; // strip trailing NUL
if (asciiEnd > profileBytes.length || asciiEnd < asciiOffset) {
return null;
}
return new TextDecoder().decode(
profileBytes.subarray(asciiOffset, asciiEnd),
);
}
return null;
} catch {
return null;
}
const tiff = payload.subarray(EXIF_PREFIX_LEN);
const isLE = tiff[0] === 0x49 && tiff[1] === 0x49;
const isBE = tiff[0] === 0x4d && tiff[1] === 0x4d;
if (!isLE && !isBE) return null;
const r16 = (off: number): number =>
isLE
? ((tiff[off] ?? 0) | ((tiff[off + 1] ?? 0) << 8)) >>> 0
: (((tiff[off] ?? 0) << 8) | (tiff[off + 1] ?? 0)) >>> 0;
const r32 = (off: number): number =>
isLE
? ((tiff[off] ?? 0) |
((tiff[off + 1] ?? 0) << 8) |
((tiff[off + 2] ?? 0) << 16) |
((tiff[off + 3] ?? 0) << 24)) >>>
0
: (((tiff[off] ?? 0) << 24) |
((tiff[off + 1] ?? 0) << 16) |
((tiff[off + 2] ?? 0) << 8) |
(tiff[off + 3] ?? 0)) >>>
0;
if (r16(2) !== 0x002a) return null;
const ifd0Offset = r32(4);
// Need 2 bytes for entry count plus the entry table.
if (ifd0Offset + 2 > tiff.length) return null;
const numEntries = r16(ifd0Offset);
const entryStart = ifd0Offset + 2;
if (entryStart + numEntries * 12 > tiff.length) return null;
for (let n = 0; n < numEntries; n++) {
const off = entryStart + n * 12;
if (r16(off) !== TAG_ORIENTATION) continue;
// Orientation must be SHORT type with count 1.
if (r16(off + 2) !== TYPE_SHORT) return null;
if (r32(off + 4) !== 1) return null;
const value = r16(off + 8);
if (value < ORIENTATION_MIN || value > ORIENTATION_MAX) return null;
return value;
}
return null;
}
function summariseJfif(payload: Uint8Array): string {
// JFIF 1.02 §5: "JFIF\0" (5) + version (2) + density-unit (1) + Xdensity (2)
// + Ydensity (2) + thumb-w (1) + thumb-h (1) = 14 bytes minimum.
if (payload.length < 14) return "JFIF block";
const major = payload[5] ?? 0;
const minor = payload[6] ?? 0;
const unit = payload[7] ?? 0;
const xDensity = ((payload[8] ?? 0) << 8) | (payload[9] ?? 0);
const yDensity = ((payload[10] ?? 0) << 8) | (payload[11] ?? 0);
const unitLabel = unit === 1 ? "dpi" : unit === 2 ? "dpcm" : "raw";
return `version ${major}.${minor.toString().padStart(2, "0")}, density ${xDensity}×${yDensity} ${unitLabel}`;
}
// Build a minimal APP1 EXIF segment whose IFD0 contains exactly one entry:
@ -192,7 +271,7 @@ function synthesizeOrientationApp1(orientation: number): Uint8Array {
interface WalkResult {
bytes: Uint8Array;
droppedSegments: number;
metadataItems: readonly MetadataItem[];
}
function walkJpeg(input: Uint8Array, options: StripOptions): WalkResult {
@ -214,8 +293,8 @@ function walkJpeg(input: Uint8Array, options: StripOptions): WalkResult {
out[outPos++] = SOI;
let i = 2;
let droppedSegments = 0;
let sawEOI = false;
const collectedItems: MetadataItem[] = [];
while (i < input.length) {
if (input[i] !== 0xff) {
@ -264,32 +343,64 @@ function walkJpeg(input: Uint8Array, options: StripOptions): WalkResult {
throw new Error(`segment at offset ${i} extends past end of input`);
}
// When dropping a segment, observe its content so callers can see
// per-tag detail. APP1 EXIF takes a special path (it sets aside the
// orientation tag for the preserveOrientation synthesis); every other
// segment goes through describeDroppedSegment, which emits zero or
// more MetadataItems describing what's being lost. Observation only —
// the byte-dropping decision is already made by shouldDropSegment.
const isDropped = shouldDropSegment(marker, options);
const isDroppedApp1 = marker === APP1_EXIF_OR_XMP && isDropped;
let segmentOrientation: number | null = null;
if (isDroppedApp1) {
const payload = input.subarray(i + 4, segmentEnd);
if (isExifSegment(payload)) {
const enumeration = enumerateExifSegment(payload, options);
collectedItems.push(...enumeration.items);
segmentOrientation = enumeration.orientation;
} else {
// Non-EXIF APP1 (XMP, or some other identifier). Route through
// describeDroppedSegment so XMP packets land as one blob item.
collectedItems.push(...describeDroppedSegment(marker, payload));
}
} else if (isDropped) {
const payload = input.subarray(i + 4, segmentEnd);
collectedItems.push(...describeDroppedSegment(marker, payload));
}
if (
marker === APP1_EXIF_OR_XMP &&
options.preserveOrientation &&
shouldDropSegment(marker, options)
isDropped &&
segmentOrientation !== null
) {
// APP1 may be EXIF or XMP. extractOrientation returns null for
// XMP (no "Exif\0\0" prefix) and for EXIF without an Orientation
// tag — in both cases we fall through to the normal drop policy.
const payload = input.subarray(i + 4, segmentEnd);
const orientation = extractOrientation(payload);
if (orientation !== null) {
const synth = synthesizeOrientationApp1(orientation);
out.set(synth, outPos);
outPos += synth.length;
// Original APP1 contents (Make, GPS, MakerNotes, etc.) were
// dropped; the replacement carries only Orientation. Counts
// as one segment removed for parity with the drop case.
droppedSegments++;
i = segmentEnd;
continue;
}
// Synthesize a replacement APP1 carrying only the Orientation tag.
// Original APP1 contents (Make, GPS, MakerNotes, etc.) were
// already accumulated as removed items above.
const synth = synthesizeOrientationApp1(segmentOrientation);
out.set(synth, outPos);
outPos += synth.length;
i = segmentEnd;
continue;
}
if (shouldDropSegment(marker, options)) {
droppedSegments++;
} else {
if (!isDropped) {
// Segment is being kept. For privacy-relevant segments that are kept
// only on opt-in (ICC), emit a `kept` item so the diff records the
// decision. APP14 Adobe stays silent: it's an always-keep decoder
// helper, not a user-visible preservation choice.
if (marker === APP2_ICC && options.preserveColorProfile) {
const payload = input.subarray(i + 4, segmentEnd);
const sizeKb = formatByteSize(payload.length);
const desc = extractIccDescription(payload);
collectedItems.push({
action: "kept",
source: "ICC",
name: desc ?? "ICC profile",
value: `<binary, ${sizeKb}>`,
reason: "diffKeptReasonColorProfile",
});
}
// Copy FF + code + length field + payload verbatim.
out.set(input.subarray(i, segmentEnd), outPos);
outPos += segmentEnd - i;
@ -337,7 +448,7 @@ function walkJpeg(input: Uint8Array, options: StripOptions): WalkResult {
return {
bytes: out.slice(0, outPos),
droppedSegments,
metadataItems: collectedItems,
};
}
@ -364,10 +475,13 @@ export class JpegStrategy implements FormatStrategy {
options: StripOptions;
}): Promise<Result<StripResult, ExifError>> {
try {
const { bytes: result, droppedSegments } = walkJpeg(bytes, options);
const { bytes: result, metadataItems } = walkJpeg(bytes, options);
return {
ok: true,
value: { bytes: result, metadataRemoved: droppedSegments },
value: {
bytes: result,
metadataItems,
},
};
} catch (err: unknown) {
return {

@ -104,11 +104,8 @@ export class OfficeStrategy implements FormatStrategy {
private async stripOoxml(
zip: JSZip,
): Promise<Result<StripResult, ExifError>> {
let removed = 0;
for (const path of OOXML_METADATA_FILES) {
if (zip.file(path) === null) continue;
removed += 1;
const replacement = REPLACEMENTS_OOXML[path];
if (replacement !== undefined) {
zip.file(path, replacement);
@ -120,26 +117,21 @@ export class OfficeStrategy implements FormatStrategy {
for (const path of OOXML_DELETE_FILES) {
if (zip.file(path) !== null) {
zip.remove(path);
removed += 1;
}
}
for (const path of Object.keys(zip.files)) {
if (PRINTER_SETTINGS_RE.test(path)) {
zip.remove(path);
removed += 1;
}
}
return this.generateResult(zip, removed);
return this.generateResult(zip);
}
private async stripOdt(zip: JSZip): Promise<Result<StripResult, ExifError>> {
let removed = 0;
for (const path of ODT_METADATA_FILES) {
if (zip.file(path) === null) continue;
removed += 1;
const replacement = REPLACEMENTS_ODT[path];
if (replacement !== undefined) {
zip.file(path, replacement);
@ -151,7 +143,6 @@ export class OfficeStrategy implements FormatStrategy {
for (const path of ODT_DELETE_FILES) {
if (zip.file(path) !== null) {
zip.remove(path);
removed += 1;
}
}
@ -170,12 +161,11 @@ export class OfficeStrategy implements FormatStrategy {
finalZip = rebuilt;
}
return this.generateResult(finalZip, removed);
return this.generateResult(finalZip);
}
private async generateResult(
zip: JSZip,
removed: number,
): Promise<Result<StripResult, ExifError>> {
for (const entry of Object.values(zip.files)) {
entry.date = ZIP_EPOCH;
@ -186,7 +176,10 @@ export class OfficeStrategy implements FormatStrategy {
compression: "DEFLATE",
compressionOptions: { level: 6 },
});
return { ok: true, value: { bytes: out, metadataRemoved: removed } };
return {
ok: true,
value: { bytes: out, metadataItems: [] },
};
} catch (err: unknown) {
return {
ok: false,

@ -79,15 +79,11 @@ export class PdfStrategy implements FormatStrategy {
updateMetadata: false,
});
let removed = 0;
const tryDelete = (dict: PDFDict, key: string): boolean => {
const tryDelete = (dict: PDFDict, key: string): void => {
const name = PDFName.of(key);
if (dict.has(name)) {
dict.delete(name);
return true;
}
return false;
};
const dropIndirect = (ref: unknown): void => {
@ -103,7 +99,7 @@ export class PdfStrategy implements FormatStrategy {
const infoDict = doc.context.lookup(infoRef);
if (infoDict instanceof PDFDictClass) {
for (const key of INFO_KEYS) {
if (tryDelete(infoDict, key)) removed++;
tryDelete(infoDict, key);
}
}
}
@ -119,12 +115,11 @@ export class PdfStrategy implements FormatStrategy {
if (metadataRef !== undefined) {
doc.catalog.delete(metadataKey);
dropIndirect(metadataRef);
removed++;
}
// 3. Drop catalog-level fingerprints.
for (const key of CATALOG_FINGERPRINT_KEYS) {
if (tryDelete(doc.catalog, key)) removed++;
tryDelete(doc.catalog, key);
}
// 4. Per-page cleanup: page-level metadata + thumbnails +
@ -138,7 +133,6 @@ export class PdfStrategy implements FormatStrategy {
if (pageMetaRef !== undefined) {
node.delete(pageMetaName);
dropIndirect(pageMetaRef);
removed++;
}
const thumbName = PDFName.of("Thumb");
@ -146,7 +140,6 @@ export class PdfStrategy implements FormatStrategy {
if (thumbRef !== undefined) {
node.delete(thumbName);
dropIndirect(thumbRef);
removed++;
}
const annotsRef = node.get(PDFName.of("Annots"));
@ -157,7 +150,7 @@ export class PdfStrategy implements FormatStrategy {
const annot = doc.context.lookup(annots.get(i));
if (annot instanceof PDFDictClass) {
for (const key of ANNOTATION_PII_KEYS) {
if (tryDelete(annot, key)) removed++;
tryDelete(annot, key);
}
}
}
@ -176,7 +169,7 @@ export class PdfStrategy implements FormatStrategy {
ok: true,
value: {
bytes: new Uint8Array(outputBytes),
metadataRemoved: removed,
metadataItems: [],
},
};
} catch (err: unknown) {

@ -118,7 +118,6 @@ function shouldKeep(type: string, options: StripOptions): boolean {
interface WalkResult {
bytes: Uint8Array;
droppedChunks: number;
}
function walkPng(input: Uint8Array, options: StripOptions): WalkResult {
@ -135,7 +134,6 @@ function walkPng(input: Uint8Array, options: StripOptions): WalkResult {
outPos += SIGNATURE.length;
let i: number = SIGNATURE.length;
let droppedChunks = 0;
let sawIhdr = false;
let sawIend = false;
@ -176,8 +174,6 @@ function walkPng(input: Uint8Array, options: StripOptions): WalkResult {
if (shouldKeep(type, options)) {
out.set(input.subarray(i, chunkEnd), outPos);
outPos += chunkEnd - i;
} else {
droppedChunks++;
}
if (type === TYPE_IEND) {
@ -196,7 +192,6 @@ function walkPng(input: Uint8Array, options: StripOptions): WalkResult {
return {
bytes: out.slice(0, outPos),
droppedChunks,
};
}
@ -215,10 +210,13 @@ export class PngStrategy implements FormatStrategy {
options: StripOptions;
}): Promise<Result<StripResult, ExifError>> {
try {
const { bytes: result, droppedChunks } = walkPng(bytes, options);
const { bytes: result } = walkPng(bytes, options);
return {
ok: true,
value: { bytes: result, metadataRemoved: droppedChunks },
value: {
bytes: result,
metadataItems: [],
},
};
} catch (err: unknown) {
return {

@ -159,11 +159,7 @@ function parseBoxes(
// Overwrites the box header with `free` and zeroes the payload in `out`.
// The total byte count is unchanged, preserving every downstream file offset.
function blankBox(
out: Uint8Array,
b: ParsedBox,
counter: { removed: number },
): void {
function blankBox(out: Uint8Array, b: ParsedBox): void {
const totalSize = b.payloadEnd - b.headerStart;
if (totalSize > 0xffff_ffff) {
// Metadata boxes are never this large in practice. Fail loud rather than
@ -181,28 +177,23 @@ function blankBox(
out[b.headerStart + 6] = 0x65; // 'e'
out[b.headerStart + 7] = 0x65; // 'e'
out.fill(0, b.headerStart + HEADER_SIZE_REGULAR, b.payloadEnd);
counter.removed += 1;
}
// Single-pass in-place strip. The file is cloned once in strip(); every
// mutation here writes directly into that clone with no intermediate
// allocations. mdat — which can be gigabytes — is never copied again.
function stripInPlace(
out: Uint8Array,
boxes: readonly ParsedBox[],
counter: { removed: number },
): void {
function stripInPlace(out: Uint8Array, boxes: readonly ParsedBox[]): void {
for (const b of boxes) {
if (METADATA_BOX_TYPES.has(b.type)) {
blankBox(out, b, counter);
blankBox(out, b);
continue;
}
if (b.type === "uuid" && !isSafeUuidBox(out, b.payloadStart)) {
blankBox(out, b, counter);
blankBox(out, b);
continue;
}
if (CONTAINER_BOX_TYPES.has(b.type)) {
stripInPlace(out, parseBoxes(out, b.payloadStart, b.payloadEnd), counter);
stripInPlace(out, parseBoxes(out, b.payloadStart, b.payloadEnd));
continue;
}
if (TIMESTAMP_BOX_TYPES.has(b.type)) {
@ -248,11 +239,13 @@ export class VideoStrategy implements FormatStrategy {
// stripInPlace writes only to the clone; mdat is never copied a second time.
const boxes = parseBoxes(bytes, 0, bytes.length);
const out = new Uint8Array(bytes);
const counter = { removed: 0 };
stripInPlace(out, boxes, counter);
stripInPlace(out, boxes);
return {
ok: true,
value: { bytes: out, metadataRemoved: counter.removed },
value: {
bytes: out,
metadataItems: [],
},
};
} catch (err: unknown) {
return {


@ -55,8 +55,8 @@ export class WasmProcessor implements MetadataProcessorPort {
ok: true,
value: {
outputPath,
metadataRemoved: stripResult.value.metadataRemoved,
outputBytes: stripResult.value.bytes.byteLength,
metadataItems: stripResult.value.metadataItems,
},
};
}


@ -1,4 +1,8 @@
import type { Settings, I18nStringsDictionary } from "../../domain";
import type {
MetadataItem,
Settings,
I18nStringsDictionary,
} from "../../domain";
import {
DEFAULT_SETTINGS,
validateSettings,
@ -80,8 +84,8 @@ export interface WasmApi {
): Promise<{
ok: boolean;
outputPath: string | null;
metadataRemoved: number | null;
outputBytes: number | null;
metadataItems: readonly MetadataItem[];
error: string | null;
}>;
}
@ -268,16 +272,16 @@ export function makeWebApi(): WebApi {
return {
ok: false,
outputPath: null,
metadataRemoved: null,
outputBytes: null,
metadataItems: [],
error: formatExifError(result.error),
};
}
return {
ok: true,
outputPath: result.value.outputPath,
metadataRemoved: result.value.metadataRemoved,
outputBytes: result.value.outputBytes,
metadataItems: result.value.metadataItems,
error: null,
};
},


@ -12,6 +12,7 @@ import { TypePill } from "../ui/TypePill";
import { StatusIcon } from "../ui/StatusIcon";
import { ChevronIcon } from "../icons/ChevronIcon";
import { ErrorExpansion } from "./ErrorExpansion";
import { MetadataDiffExpansion } from "./MetadataDiffExpansion";
import { ResultPill } from "./ResultPill";
import { formatFileSize } from "../../utils/format_file_size";
import { useI18n } from "../../hooks/use_i18n";
@ -40,7 +41,11 @@ export function FileRow({
file.status === FileProcessingStatus.Complete ||
file.status === FileProcessingStatus.NoMetadataFound;
const isError = file.status === FileProcessingStatus.Error;
const isExpandable = isComplete || isError;
const isExpandable =
isError ||
file.status === FileProcessingStatus.NoMetadataFound ||
(file.status === FileProcessingStatus.Complete &&
file.metadataItems.length > 0);
const rowClasses = [
"file-table__row",
@ -170,6 +175,11 @@ export function FileRow({
</span>
</div>
)}
{isExpanded &&
file.status === FileProcessingStatus.Complete &&
file.metadataItems.length > 0 && (
<MetadataDiffExpansion items={file.metadataItems} />
)}
</div>
);
}


@ -0,0 +1,184 @@
// Expandable per-file metadata diff. Rendered below a Complete FileRow when
// `file.metadataItems` is non-empty. Pure render — no state, no effects, no
// callbacks. Groups items by `source` (first-seen ordering, stable), then
// within each group renders kept items first, then removed/modified.
//
// `t()` is the live i18n hook from useI18n. The diff i18n keys carry a
// `{count}` placeholder which is interpolated locally here (mirrors the
// ErrorExpansion.tsx `.replace("{ext}", ...)` pattern), since the live `t`
// signature is `(key: string) => string` and does not interpolate.
import { useI18n } from "../../hooks/use_i18n";
import type { MetadataItem } from "../../../domain";
export function MetadataDiffExpansion({
items,
}: {
items: readonly MetadataItem[];
}): React.JSX.Element | null {
const { t } = useI18n();
// Defensive: Task 11's `isExpandable` gate prevents this case, but the
// component still returns null on empty input so it can be reused safely.
if (items.length === 0) return null;
const grouped = groupBySource(items);
return (
<div className="file-table__expansion file-table__diff">
{grouped.map(({ source, items: groupItems }) => (
<section key={source} className="file-table__diff-group">
<h4 className="file-table__diff-group-header">
{source} {makeGroupSummary(groupItems, t)}
</h4>
<dl className="file-table__diff-list">
{sortKeptFirst(groupItems).map((item, idx) => (
<MetadataRow
key={`${item.source}-${item.name}-${idx}`}
item={item}
t={t}
/>
))}
</dl>
</section>
))}
</div>
);
}
interface SourceGroup {
readonly source: string;
readonly items: readonly MetadataItem[];
}
function groupBySource(items: readonly MetadataItem[]): readonly SourceGroup[] {
const order: string[] = [];
const byKey = new Map<string, MetadataItem[]>();
for (const item of items) {
const existing = byKey.get(item.source);
if (existing === undefined) {
order.push(item.source);
byKey.set(item.source, [item]);
} else {
existing.push(item);
}
}
// Safe: every key in `order` was just inserted into `byKey` above.
return order.map((source) => ({
source,
items: byKey.get(source) as MetadataItem[],
}));
}
function sortKeptFirst(
items: readonly MetadataItem[],
): readonly MetadataItem[] {
const kept: MetadataItem[] = [];
const rest: MetadataItem[] = [];
for (const item of items) {
if (item.action === "kept") {
kept.push(item);
} else {
rest.push(item);
}
}
return [...kept, ...rest];
}
function makeGroupSummary(
items: readonly MetadataItem[],
t: (key: string) => string,
): string {
let removed = 0;
let modified = 0;
let kept = 0;
for (const item of items) {
switch (item.action) {
case "removed":
removed += 1;
break;
case "modified":
modified += 1;
break;
case "kept":
kept += 1;
break;
}
}
const parts: string[] = [];
if (removed > 0)
parts.push(t("diffGroupRemoved").replace("{count}", String(removed)));
if (modified > 0)
parts.push(t("diffGroupModified").replace("{count}", String(modified)));
if (kept > 0) parts.push(t("diffGroupKept").replace("{count}", String(kept)));
return `· ${parts.join(t("diffGroupSeparator"))}`;
}
function MetadataRow({
item,
t,
}: {
item: MetadataItem;
t: (key: string) => string;
}): React.JSX.Element {
switch (item.action) {
case "removed":
return (
<div className="file-table__diff-row file-table__diff-row--removed">
<dt className="file-table__diff-name">{item.name}</dt>
<dd className="file-table__diff-value-cell">
<span
className="file-table__diff-value file-table__diff-value--strike"
title={item.valueBefore}
>
{item.valueBefore}
</span>
</dd>
</div>
);
case "modified":
return (
<div className="file-table__diff-row file-table__diff-row--modified">
<dt className="file-table__diff-name">{item.name}</dt>
<dd className="file-table__diff-value-cell">
<span
className="file-table__diff-value file-table__diff-value--strike"
title={item.valueBefore}
>
{item.valueBefore}
</span>
<span className="file-table__diff-arrow">{t("diffArrow")}</span>
<span
className="file-table__diff-value file-table__diff-value--added"
title={item.valueAfter}
>
{item.valueAfter}
</span>
</dd>
</div>
);
case "kept":
return (
<div className="file-table__diff-row file-table__diff-row--kept">
<dt className="file-table__diff-name">
{item.name}
<span className="file-table__diff-kept-badge">
{t("diffKeptBadge")}
</span>
{item.reason !== undefined && (
<span className="file-table__diff-kept-reason">
{t(item.reason)}
</span>
)}
</dt>
<dd className="file-table__diff-value-cell">
<span className="file-table__diff-value" title={item.value}>
{item.value}
</span>
</dd>
</div>
);
}
}
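
The grouping and summary logic above can be exercised outside React. The following is a minimal sketch, not the component's actual code: `Item`, the `templates` table, and the stub `t` are simplified stand-ins assumed for illustration (the English templates are copied from the strings file in this commit). It relies on `Map` preserving insertion order to get the first-seen group ordering, and interpolates `{count}` locally with `.replace`, as the component's header comment describes.

```typescript
// Hypothetical, simplified stand-ins for the project's MetadataItem type.
type Action = "removed" | "modified" | "kept";
interface Item {
  source: string;
  name: string;
  action: Action;
}

// Stub translator (assumption): returns the raw template, no interpolation,
// mirroring the `(key: string) => string` signature of the live `t`.
const templates: Record<string, string> = {
  diffGroupRemoved: "{count} removed",
  diffGroupModified: "{count} modified",
  diffGroupKept: "{count} kept",
  diffGroupSeparator: ", ",
};
const t = (key: string): string => templates[key] ?? key;

// Group by `source`; Map iteration follows insertion order, so groups come
// out in first-seen order.
function groupBySource(
  items: readonly Item[],
): Array<{ source: string; items: Item[] }> {
  const byKey = new Map<string, Item[]>();
  for (const item of items) {
    const bucket = byKey.get(item.source);
    if (bucket === undefined) byKey.set(item.source, [item]);
    else bucket.push(item);
  }
  return Array.from(byKey, ([source, grouped]) => ({ source, items: grouped }));
}

// Count each action, then fill the `{count}` placeholder by hand.
function summarize(items: readonly Item[]): string {
  const counts: Record<Action, number> = { removed: 0, modified: 0, kept: 0 };
  for (const item of items) counts[item.action] += 1;
  const parts: string[] = [];
  const add = (key: string, n: number): void => {
    if (n > 0) parts.push(t(key).replace("{count}", String(n)));
  };
  add("diffGroupRemoved", counts.removed);
  add("diffGroupModified", counts.modified);
  add("diffGroupKept", counts.kept);
  return `· ${parts.join(t("diffGroupSeparator"))}`;
}

const sample: Item[] = [
  { source: "EXIF", name: "Make", action: "removed" },
  { source: "EXIF", name: "Orientation", action: "kept" },
  { source: "JFIF", name: "Version", action: "removed" },
  { source: "EXIF", name: "Model", action: "removed" },
];
for (const group of groupBySource(sample)) {
  console.log(group.source, summarize(group.items));
}
// prints "EXIF · 2 removed, 1 kept" then "JFIF · 1 removed"
```

Because the placeholder is filled after translation, a locale can reorder the template (for example, put the count last) without any change to the component.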


@ -25,6 +25,7 @@ function buildFileEntry(
status: FileProcessingStatus.Pending,
afterBytes: null,
error: null,
metadataItems: [],
};
}


@ -39,6 +39,7 @@ export function handleSelectedFiles({
status: FileProcessingStatus.Pending,
afterBytes: null,
error: null,
metadataItems: [],
});
}


@ -1,6 +1,7 @@
import { createContext, useContext, useReducer } from "react";
import type { Dispatch, ReactNode } from "react";
import { FileProcessingStatus } from "../../domain";
import type { MetadataItem } from "../../domain";
import { assertNever } from "../../common/types";
export type FolderDiscoveryStatus =
@ -26,6 +27,7 @@ export interface FileEntry {
status: FileProcessingStatus;
afterBytes: number | null;
error: string | null;
metadataItems: readonly MetadataItem[];
}
export interface AppState {
@ -43,6 +45,7 @@ export type AppAction =
type: "UPDATE_FILE_METADATA";
id: string;
afterBytes: number;
metadataItems: readonly MetadataItem[];
}
| { type: "UPDATE_FILE_ERROR"; id: string; error: string }
| { type: "TOGGLE_FOLDER"; folder: string }
@ -80,7 +83,11 @@ export function appReducer(state: AppState, action: AppAction): AppState {
...state,
files: state.files.map((file) =>
file.id === action.id
? { ...file, afterBytes: action.afterBytes }
? {
...file,
afterBytes: action.afterBytes,
metadataItems: action.metadataItems,
}
: file,
),
};


@ -93,17 +93,20 @@ async function processViaWasm({
}
const outputBytes = result.outputBytes ?? 0;
const metadataRemoved = result.metadataRemoved ?? 0;
const removedOrModified = result.metadataItems.filter(
(i) => i.action !== "kept",
).length;
dispatch({
type: "UPDATE_FILE_METADATA",
id: entry.id,
afterBytes: outputBytes,
metadataItems: result.metadataItems,
});
dispatch({
type: "UPDATE_FILE_STATUS",
id: entry.id,
status:
metadataRemoved === 0
removedOrModified === 0
? FileProcessingStatus.NoMetadataFound
: FileProcessingStatus.Complete,
});
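
The status rule in this hunk and the `isExpandable` gate in `FileRow` interact in a way that is easy to misread, so here is a small sketch of both as pure functions. The string-union `Status` and the function names are assumptions made for illustration; the project uses the `FileProcessingStatus` enum and inline expressions instead.

```typescript
// Hypothetical stand-ins for FileProcessingStatus and MetadataItem.
type Action = "removed" | "modified" | "kept";
type Status = "complete" | "no-metadata-found" | "error";
interface Item {
  action: Action;
}

// Mirrors the hunk above: only removed/modified items count as "metadata
// found"; a file whose items are all `kept` is reported as NoMetadataFound.
function resolveStatus(items: readonly Item[]): Status {
  const removedOrModified = items.filter((i) => i.action !== "kept").length;
  return removedOrModified === 0 ? "no-metadata-found" : "complete";
}

// Mirrors the FileRow gate: the row can be toggled open for errors, for
// NoMetadataFound, or for Complete rows that have at least one item.
function isExpandable(status: Status, items: readonly Item[]): boolean {
  return (
    status === "error" ||
    status === "no-metadata-found" ||
    (status === "complete" && items.length > 0)
  );
}

// Mirrors the render condition for MetadataDiffExpansion.
function showsDiffPanel(status: Status, items: readonly Item[]): boolean {
  return status === "complete" && items.length > 0;
}

const none: Item[] = []; // e.g. a strategy that emits no items yet
const keptOnly: Item[] = [{ action: "kept" }];
const mixed: Item[] = [{ action: "removed" }, { action: "kept" }];

for (const items of [none, keptOnly, mixed]) {
  const status = resolveStatus(items);
  console.log(status, isExpandable(status, items), showsDiffPanel(status, items));
}
```

Under these rules a row with only `kept` items is expandable but never shows the diff panel, because its status resolves to NoMetadataFound rather than Complete.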


@ -213,3 +213,134 @@
background: transparent;
color: var(--ec-color-text-secondary);
}
/* Metadata diff expansion rendered below a Complete row when
`file.metadataItems` is non-empty. Reuses `.file-table__expansion`
for the outer container; the rules below style the per-source groups
and per-tag rows. */
.file-table__diff {
font-family: var(--ec-font-family-mono);
font-size: 13px;
padding: 0;
}
.file-table__diff-group + .file-table__diff-group {
margin-top: var(--ec-space-2);
}
.file-table__diff-group-header {
margin: 0;
padding: var(--ec-space-2) var(--ec-space-4) var(--ec-space-1);
font-family: var(--ec-font-family);
font-size: 11px;
font-weight: var(--ec-font-weight-semibold);
letter-spacing: 0.04em;
text-transform: uppercase;
color: var(--ec-color-text-secondary);
background: var(--ec-color-surface);
border-bottom: 1px solid var(--ec-color-border);
}
.file-table__diff-list {
margin: 0;
padding: 0;
}
.file-table__diff-row {
display: grid;
grid-template-columns: 220px 1fr;
gap: var(--ec-space-4);
align-items: baseline;
padding: 5px var(--ec-space-4);
border-left: 3px solid transparent;
}
.file-table__diff-row > dt,
.file-table__diff-row > dd {
margin: 0;
min-width: 0;
}
.file-table__diff-row--removed {
background: var(--ec-color-diff-removed-bg);
border-left-color: var(--ec-color-diff-removed-border);
}
.file-table__diff-row--modified {
background: var(--ec-color-diff-modified-bg);
border-left-color: var(--ec-color-diff-modified-border);
}
.file-table__diff-row--kept {
background: var(--ec-color-diff-kept-bg);
border-left-color: var(--ec-color-diff-kept-border);
color: var(--ec-color-diff-kept-fg);
}
.file-table__diff-name {
font-weight: 500;
overflow: hidden;
text-overflow: ellipsis;
white-space: nowrap;
}
.file-table__diff-value-cell {
display: flex;
align-items: baseline;
gap: var(--ec-space-1);
min-width: 0;
}
.file-table__diff-value {
color: var(--ec-color-text-secondary);
overflow: hidden;
text-overflow: ellipsis;
white-space: nowrap;
min-width: 0;
}
.file-table__diff-value--strike {
text-decoration: line-through;
color: var(--ec-color-diff-removed-fg);
}
.file-table__diff-value--added {
color: var(--ec-color-diff-modified-fg);
}
.file-table__diff-arrow {
color: var(--ec-color-text-secondary);
flex-shrink: 0;
margin: 0 var(--ec-space-1);
}
.file-table__diff-kept-badge {
display: inline-block;
margin-left: var(--ec-space-1);
font-family: var(--ec-font-family);
font-size: 10px;
font-weight: var(--ec-font-weight-regular);
color: var(--ec-color-text-secondary);
text-transform: lowercase;
}
.file-table__diff-kept-reason {
display: block;
margin-top: 2px;
font-family: var(--ec-font-family);
font-size: 10px;
font-weight: var(--ec-font-weight-regular);
color: var(--ec-color-text-secondary);
white-space: normal;
overflow: visible;
text-overflow: clip;
}
/* Mobile: stack name + value into a single column at narrow viewports. */
@media (max-width: 420px) {
.file-table__diff-row {
grid-template-columns: 1fr;
gap: 2px;
}
}


@ -21,6 +21,17 @@
--ec-color-removed-bg: rgba(255, 59, 48, 0.08);
--ec-color-preserved-bg: rgba(52, 199, 89, 0.08);
/* Metadata diff (per-row tinting) -- Light Mode */
--ec-color-diff-removed-bg: rgba(229, 62, 62, 0.06);
--ec-color-diff-removed-border: #e53e3e;
--ec-color-diff-removed-fg: #c53030;
--ec-color-diff-modified-bg: rgba(56, 161, 105, 0.06);
--ec-color-diff-modified-border: #38a169;
--ec-color-diff-modified-fg: #2f855a;
--ec-color-diff-kept-bg: rgba(160, 174, 192, 0.06);
--ec-color-diff-kept-border: #a0aec0;
--ec-color-diff-kept-fg: #4a5568;
/* Adaptive tokens -- Light Mode */
--ec-shadow-drawer: rgba(0, 0, 0, 0.12);
--ec-backdrop-opacity: 0.4;
@ -99,6 +110,17 @@
--ec-color-removed-bg: rgba(255, 69, 58, 0.12);
--ec-color-preserved-bg: rgba(48, 209, 88, 0.12);
/* Metadata diff (per-row tinting) -- Dark Mode */
--ec-color-diff-removed-bg: rgba(252, 129, 129, 0.10);
--ec-color-diff-removed-border: #fc8181;
--ec-color-diff-removed-fg: #feb2b2;
--ec-color-diff-modified-bg: rgba(104, 211, 145, 0.10);
--ec-color-diff-modified-border: #68d391;
--ec-color-diff-modified-fg: #9ae6b4;
--ec-color-diff-kept-bg: rgba(160, 174, 192, 0.08);
--ec-color-diff-kept-border: #718096;
--ec-color-diff-kept-fg: #cbd5e0;
/* Adaptive tokens -- Dark Mode */
--ec-shadow-drawer: rgba(0, 0, 0, 0.3);
--ec-backdrop-opacity: 0.6;


@ -9,8 +9,10 @@ Playwright E2E test suite for MetaScrub. Runs against the static web build serve
- `web-desktop` runs the specs in `tests/e2e/web/` against the static build at http://localhost:4173 under Desktop Chrome.
- `web-mobile-ios` — same web specs under iPhone 14 / WebKit.
- `web-mobile-android` — same web specs under Pixel 7 / Chromium.
- `standalone-desktop` — the inlined `dist/web-standalone/index.html` loaded via `file://` under Desktop Chrome. Runs the standalone smoke specs in `tests/e2e/standalone/` plus the metadata-diff spec from `tests/e2e/web/metadata_diff.spec.ts`. The standalone HTML is the project's primary distribution channel, so the diff UX is verified here too.
- `standalone-mobile` — same standalone build under iPhone 14 / WebKit. Currently picks up only the metadata-diff spec.
Run scoped: `yarn test:e2e:web`, `yarn test:e2e:web:desktop`. Run everything: `yarn test:e2e`.
Run scoped: `yarn test:e2e:web`, `yarn test:e2e:web:desktop`, `yarn test:e2e:standalone`. Run everything: `yarn test:e2e`.
## Adding a Web Test


@ -1,11 +1,19 @@
import type { Page } from "@playwright/test";
import { test, type Page } from "@playwright/test";
/**
* Navigate to the app and wait until the React tree has mounted.
* Mirrors the Electron suite's launchApp() but for the web build.
*
* Works for both HTTP baseURLs (web-* projects, `http://localhost:4173`) and
* `file://` baseURLs (standalone-* projects, the inlined `index.html`). For
* HTTP, `"/"` resolves against baseURL to the app root. For `file://`, `"/"`
* would resolve to the filesystem root and 404, so we navigate to `""`, which
* resolves to the baseURL itself, i.e. the inlined HTML file.
*/
export async function launchPage(page: Page): Promise<void> {
await page.goto("/");
const baseURL = test.info().project.use.baseURL;
const target = baseURL?.startsWith("file://") === true ? "" : "/";
await page.goto(target);
await page.waitForLoadState("domcontentloaded");
await page.waitForSelector("[role='main']", { timeout: 10000 });
}
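
The difference between `"/"` and `""` described in the doc comment can be checked with the standard `URL` constructor, which is what Playwright's documentation says it uses to combine `baseURL` with the `page.goto()` argument. A minimal sketch; the `file://` path below is a made-up example, not the project's real build location.

```typescript
// Resolve a navigation target against a base URL the way `new URL()` does.
function resolveTarget(target: string, baseURL: string): string {
  return new URL(target, baseURL).href;
}

const httpBase = "http://localhost:4173";
// Hypothetical location of the inlined standalone build.
const fileBase = "file:///tmp/dist/web-standalone/index.html";

console.log(resolveTarget("/", httpBase)); // http://localhost:4173/
console.log(resolveTarget("/", fileBase)); // file:///
console.log(resolveTarget("", fileBase)); // file:///tmp/dist/web-standalone/index.html
```

An absolute-path reference such as `"/"` replaces the whole path of the base, which is harmless over HTTP but points at the filesystem root for `file://`. An empty reference keeps the base path intact.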


@ -1,118 +0,0 @@
// Metadata Inspection — Web
//
// The web build does not populate beforeMetadata/afterMetadata on FileEntry
// because ExifTool is unavailable in the browser (use_process_files.ts sets
// readMetadataForInspection: !isWebBuild). As a result the MetadataExpansion
// component never renders and the row click does not open a panel.
//
// These tests verify the absence-of-expansion behaviour and the row-click
// interaction so that any accidental regression (e.g. the panel appearing with
// null/undefined data) is caught. Metadata-content assertions (field names,
// removed indicators) are Electron-only and are gated with test.skip so the
// suite does not fail on web projects.
import { test, expect } from "@playwright/test";
import { launchPage } from "./helpers/page_launcher";
import { fixturePath } from "./helpers/fixture_loader";
test.describe("Metadata Inspection (Web)", () => {
test.beforeEach(async ({ page }) => {
await launchPage(page);
// .file-browse-button__input is visually hidden (aria-hidden + clip pattern).
// { force: true } bypasses Playwright's visibility check.
const input = page.locator(".file-browse-button__input").first();
await input.setInputFiles([fixturePath("sample.jpg")], { force: true });
await page.waitForSelector(".file-table__row--complete", { timeout: 15000 });
});
test("completed row is expandable (has chevron) but no metadata panel in web build", async ({
page,
}) => {
// The row should have the --expandable modifier and render a ChevronIcon
// (it is clickable) even though the metadata panel does not open.
const fileRow = page.locator(".file-table__row--complete").first();
await expect(fileRow).toBeVisible();
await expect(fileRow).toHaveClass(/file-table__row--expandable/);
// Click to toggle expansion state — no crash, no error overlay expected.
await fileRow.click();
// In the web build beforeMetadata === null so MetadataExpansion is never
// rendered. Confirm the panel is absent.
const expansion = page.locator(".metadata-expansion");
// If this count ever becomes >0, the web build has gained metadata inspection
// — remove test.skip from tests 2 and 3 and delete this assertion.
await expect(expansion).toHaveCount(0);
});
test("shows before-metadata fields", async ({ page }) => {
// The web build does not read metadata back via ExifTool, so
// beforeMetadata/afterMetadata are null and MetadataExpansion never
// renders. This test is intentionally skipped for the web build;
// it lives here only so the suite mirrors the Electron spec structure.
test.skip(
true,
"Web build does not populate metadata for inspection (readMetadataForInspection: !isWebBuild)",
);
// The block below is unreachable but kept for documentation.
const fileRow = page.locator(".file-table__row--complete").first();
await fileRow.click();
const expansion = page.locator(".metadata-expansion");
await expect(expansion).toBeVisible();
const groupHeaders = expansion.locator(".metadata-group__header");
const groupCount = await groupHeaders.count();
expect(groupCount).toBeGreaterThan(0);
for (let i = 0; i < groupCount; i++) {
await groupHeaders.nth(i).click();
}
const allFieldNames = expansion.locator(".metadata-field__name");
const fieldCount = await allFieldNames.count();
expect(fieldCount).toBeGreaterThan(0);
});
test("indicates removed fields after stripping", async ({ page }) => {
// Same skip rationale as above — removed indicators require MetadataExpansion
// which requires non-null beforeMetadata, which the web build does not provide.
test.skip(
true,
"Web build does not populate metadata for inspection (readMetadataForInspection: !isWebBuild)",
);
const fileRow = page.locator(".file-table__row--complete").first();
await fileRow.click();
const expansion = page.locator(".metadata-expansion");
await expect(expansion).toBeVisible();
const groupHeaders = expansion.locator(".metadata-group__header");
const groupCount = await groupHeaders.count();
for (let i = 0; i < groupCount; i++) {
await groupHeaders.nth(i).click();
}
const removedFields = expansion.locator(".metadata-field--removed");
const removedCount = await removedFields.count();
expect(removedCount).toBeGreaterThan(0);
const firstRemovedIcon = removedFields
.first()
.locator(".metadata-field__icon");
const iconText = await firstRemovedIcon.textContent();
expect(iconText).toBe("");
const preservedFields = expansion.locator(".metadata-field--preserved");
const preservedCount = await preservedFields.count();
expect(preservedCount).toBeGreaterThan(0);
const firstPreservedIcon = preservedFields
.first()
.locator(".metadata-field__icon");
const preservedIconText = await firstPreservedIcon.textContent();
expect(preservedIconText).toBe("✓");
});
});


@ -0,0 +1,109 @@
// Metadata Diff — Web (Task 12 of issue #22)
//
// End-to-end coverage for the per-file diff expansion: a Complete JPEG row
// shows a chevron, clicking it reveals grouped diff items (EXIF group header,
// at least one --removed row with strikethrough), the diff fits within the
// mobile viewport, and toggling collapses it again. A PDF row, by contrast,
// resolves to NoMetadataFound in Phase 1 (the PDF strategy emits no items
// yet), so clicking it never reveals the diff panel.
import { test, expect } from "@playwright/test";
import { launchPage } from "./helpers/page_launcher";
import { fixturePath } from "./helpers/fixture_loader";
test.describe("Metadata diff expansion", () => {
test.beforeEach(async ({ page }) => {
await launchPage(page);
});
test("JPEG file shows expandable diff with EXIF/GPS items", async ({
page,
isMobile,
}) => {
// .file-browse-button__input is visually hidden (aria-hidden + clip
// pattern). { force: true } bypasses Playwright's visibility check.
// setInputFiles works under both desktop and mobile projects, so we use
// it as the single ingestion path for the diff assertion.
const input = page.locator(".file-browse-button__input").first();
await input.setInputFiles([fixturePath("sample.jpg")], { force: true });
const row = page.locator(".file-table__row--complete").first();
await expect(row).toBeVisible({ timeout: 15000 });
// Task 11's `isExpandable` gate adds the modifier when items.length > 0;
// the JPEG strategy emits real EXIF/GPS items for sample.jpg (Make,
// Model, Artist, Copyright, DateTime), so the row must be expandable
// and the ChevronIcon (`.chevron-icon`) — not the StatusIcon — must
// be rendered in the status cell.
await expect(row).toHaveClass(/file-table__row--expandable/);
const chevron = row.locator(".chevron-icon");
await expect(chevron).toBeVisible();
if (isMobile) {
await row.tap();
} else {
await row.click();
}
const diff = page.locator(".file-table__diff");
await expect(diff).toBeVisible();
// The diff groups items by source. For sample.jpg the JPEG strategy
// emits a JFIF group (APP0 segment) followed by an EXIF group (from
// the APP1 EXIF tags: Make, Model, Artist, Copyright, etc.). Assert
// the EXIF group header is present rather than asserting positional
// order, since the JFIF segment legitimately comes first in APP-order.
await expect(
diff.locator(".file-table__diff-group-header", { hasText: /EXIF/ }),
).toBeVisible();
// At least one removed item with the --removed modifier. The
// strikethrough is applied via the .file-table__diff-value--strike
// child class, which the component renders for removed/modified rows.
const removedRows = diff.locator(".file-table__diff-row--removed");
await expect(removedRows.first()).toBeVisible();
await expect(
removedRows.first().locator(".file-table__diff-value--strike"),
).toBeVisible();
// Mobile responsive guard: the diff panel must not overflow the
// viewport horizontally. +1px tolerance for sub-pixel rounding.
const diffBox = await diff.boundingBox();
expect(diffBox).not.toBeNull();
const viewport = page.viewportSize();
if (viewport !== null && diffBox !== null) {
expect(diffBox.width).toBeLessThanOrEqual(viewport.width + 1);
}
// Collapse: clicking/tapping the row again hides the diff.
if (isMobile) {
await row.tap();
} else {
await row.click();
}
await expect(diff).not.toBeVisible();
});
test("PDF file has no chevron in Phase 1 (non-JPEG strategies emit no items)", async ({
page,
}) => {
const input = page.locator(".file-browse-button__input").first();
await input.setInputFiles([fixturePath("sample.pdf")], { force: true });
// PDF strategy emits an empty metadataItems array (Phase 1 of the diff
// feature only wires JPEG). With no removed/modified items the status
// resolves to NoMetadataFound, which renders its own --expandable
// modifier + "no metadata found" notice — but never the diff panel.
const row = page
.locator(".file-table__row--complete, .file-table__row")
.first();
await expect(row).toBeVisible({ timeout: 15000 });
// Clicking the row should never reveal a .file-table__diff panel —
// MetadataDiffExpansion is only rendered when status === Complete and
// metadataItems.length > 0, which the PDF strategy does not satisfy.
await row.click();
const diff = page.locator(".file-table__diff");
await expect(diff).toHaveCount(0);
});
});


@ -0,0 +1,520 @@
// Synthetic JPEG builders — small enough to verify by hand, large enough to
// drive the strategy's IFD walker. Big-endian TIFF for legibility.
//
// The walker reads APP1 EXIF segments by:
// 1. Detecting "Exif\0\0" prefix
// 2. Reading TIFF header (MM/II, 0x002a, IFD0 offset)
// 3. Walking IFD0 entries in tag-ascending order
// 4. Following ExifIFDPointer (0x8769) to SubIFD if present
// 5. Following GPSIFDPointer (0x8825) to GPS IFD if present
// 6. Walking next-IFD pointer (after IFD0 entry table) to IFD1 if non-zero
//
// These builders produce JPEGs with controllable IFD0/SubIFD/GPS/IFD1 content
// so each pass can be tested in isolation.
export const IFD_TAGS = {
Make: 0x010f,
Model: 0x0110,
Orientation: 0x0112,
XResolution: 0x011a,
YResolution: 0x011b,
ResolutionUnit: 0x0128,
Software: 0x0131,
DateTime: 0x0132,
ExifIFDPointer: 0x8769,
GPSIFDPointer: 0x8825,
} as const;
export const EXIF_TAGS = {
ExposureTime: 0x829a,
FNumber: 0x829d,
ExposureProgram: 0x8822,
ISO: 0x8827,
DateTimeOriginal: 0x9003,
DateTimeDigitized: 0x9004,
FocalLength: 0x920a,
ExifImageWidth: 0xa002,
ExifImageHeight: 0xa003,
InteropIFDPointer: 0xa005,
} as const;
export const GPS_TAGS = {
GPSLatitudeRef: 0x0001,
GPSLatitude: 0x0002,
GPSLongitudeRef: 0x0003,
GPSLongitude: 0x0004,
GPSAltitudeRef: 0x0005,
GPSAltitude: 0x0006,
GPSTimeStamp: 0x0007,
GPSDateStamp: 0x001d,
} as const;
// TIFF entry types (Exif 2.32 §4.6.2).
const TYPE_ASCII = 2;
const TYPE_SHORT = 3;
const TYPE_LONG = 4;
const TYPE_RATIONAL = 5;
const TYPE_UNDEFINED = 7;
const TYPE_SIZES: Record<number, number> = {
[TYPE_ASCII]: 1,
[TYPE_SHORT]: 2,
[TYPE_LONG]: 4,
[TYPE_RATIONAL]: 8,
[TYPE_UNDEFINED]: 1,
};
export type IfdValueSpec =
| { tag: number; type: "ASCII"; value: string }
| { tag: number; type: "SHORT"; value: number }
| { tag: number; type: "LONG"; value: number }
| { tag: number; type: "RATIONAL"; value: [number, number] }
| { tag: number; type: "RATIONAL_ARRAY"; value: Array<[number, number]> }
| { tag: number; type: "UNDEFINED"; value: Uint8Array };
export function buildJpegWithIfd0(entries: IfdValueSpec[]): Uint8Array {
return buildJpeg({ ifd0: entries });
}
export function buildJpegWithIfd0AndSubIfd(
ifd0: IfdValueSpec[],
subIfd: IfdValueSpec[],
): Uint8Array {
return buildJpeg({ ifd0, subIfd });
}
export function buildJpegWithGps(
ifd0: IfdValueSpec[],
gps: IfdValueSpec[],
): Uint8Array {
return buildJpeg({ ifd0, gps });
}
export function buildJpegWithIfd1(
ifd0: IfdValueSpec[],
ifd1: IfdValueSpec[],
): Uint8Array {
return buildJpeg({ ifd0, ifd1 });
}
export function buildJpegWithInterop(
ifd0: IfdValueSpec[],
subIfd: IfdValueSpec[],
interop: IfdValueSpec[],
): Uint8Array {
return buildJpeg({ ifd0, subIfd, interop });
}
// --- Non-IFD segment builders ---------------------------------------------
//
// These wrap a raw payload in a length-prefixed APPn / COM marker, sandwiched
// between SOI / SOF / SOS / EOI so the walker treats the segment as a normal
// drop candidate. The payload is the segment body *without* the FF marker
// prefix or length field.
export function buildJpegWithXmp(xmlBytes: Uint8Array): Uint8Array {
// APP1 segment with "http://ns.adobe.com/xap/1.0/\0" identifier + xml payload.
const xmpIdent = new TextEncoder().encode("http://ns.adobe.com/xap/1.0/\0");
const payload = concat(xmpIdent, xmlBytes);
return wrapInJpegWithApp(0xe1, payload);
}
export function buildJpegWithIcc(iccBytes: Uint8Array): Uint8Array {
// APP2 ICC profile: identifier "ICC_PROFILE\0" + chunk# + chunk-count + bytes.
const ident = new TextEncoder().encode("ICC_PROFILE\0");
const header = new Uint8Array([0x01, 0x01]); // chunk 1 of 1
const payload = concat(ident, header, iccBytes);
return wrapInJpegWithApp(0xe2, payload);
}
export function buildJpegWithJfif(): Uint8Array {
// APP0 JFIF: identifier "JFIF\0" + version + density + thumbnail size.
const payload = new Uint8Array([
0x4a, 0x46, 0x49, 0x46, 0x00, // "JFIF\0"
0x01, 0x01, // version 1.01
0x01, // density unit: inches
0x00, 0x48, 0x00, 0x48, // 72×72 dpi
0x00, 0x00, // thumbnail W×H = 0
]);
return wrapInJpegWithApp(0xe0, payload);
}
export function buildJpegWithComment(comment: string): Uint8Array {
const payload = new TextEncoder().encode(comment);
return wrapInJpegWithApp(0xfe, payload);
}
export function buildJpegWithMakerNote(
vendorSignature: string,
bodyBytes: number,
): Uint8Array {
const sig = new TextEncoder().encode(vendorSignature);
const body = new Uint8Array(bodyBytes);
const blob = concat(sig, body);
return buildJpegWithIfd0([
{ tag: 0x927c, type: "UNDEFINED", value: blob },
]);
}
function wrapInJpegWithApp(marker: number, payload: Uint8Array): Uint8Array {
const segLen = payload.length + 2;
const seg = new Uint8Array(2 + 2 + payload.length);
seg[0] = 0xff;
seg[1] = marker;
seg[2] = (segLen >> 8) & 0xff;
seg[3] = segLen & 0xff;
seg.set(payload, 4);
const soi = new Uint8Array([0xff, 0xd8]);
const sof = makeSofPlaceholder();
const sos = makeSosPlaceholder();
const eoi = new Uint8Array([0xff, 0xd9]);
return concat(soi, seg, sof, sos, eoi);
}
interface BuildParts {
ifd0: IfdValueSpec[];
subIfd?: IfdValueSpec[];
gps?: IfdValueSpec[];
ifd1?: IfdValueSpec[];
interop?: IfdValueSpec[];
}
function buildJpeg(parts: BuildParts): Uint8Array {
const tiff = encodeTiff(parts);
// Wrap in APP1 segment: FF E1 + length(2BE) + "Exif\0\0" + tiff
const exifIdent = new Uint8Array([0x45, 0x78, 0x69, 0x66, 0x00, 0x00]);
const app1Payload = concat(exifIdent, tiff);
const app1Length = app1Payload.length + 2; // length field includes itself
const app1 = new Uint8Array(2 + 2 + app1Payload.length);
app1[0] = 0xff;
app1[1] = 0xe1;
app1[2] = (app1Length >> 8) & 0xff;
app1[3] = app1Length & 0xff;
app1.set(app1Payload, 4);
// Minimal JPEG body. The walker only cares about marker structure, not
// that the image is decodable. SOF0/SOS placeholders satisfy the parser.
const soi = new Uint8Array([0xff, 0xd8]);
const sof = makeSofPlaceholder();
const sos = makeSosPlaceholder();
const eoi = new Uint8Array([0xff, 0xd9]);
return concat(soi, app1, sof, sos, eoi);
}
function concat(...parts: Uint8Array[]): Uint8Array {
const total = parts.reduce((n, p) => n + p.length, 0);
const out = new Uint8Array(total);
let off = 0;
for (const p of parts) {
out.set(p, off);
off += p.length;
}
return out;
}
function makeSofPlaceholder(): Uint8Array {
// FF C0 + length(11) + 9 bytes of SOF0 body (8bpp, 1x1, 1 component).
return new Uint8Array([
0xff, 0xc0, 0x00, 0x0b, 0x08, 0x00, 0x01, 0x00, 0x01, 0x01, 0x01, 0x11,
0x00,
]);
}
function makeSosPlaceholder(): Uint8Array {
// FF DA + length(8) + 6 bytes of SOS header (1 component scan).
return new Uint8Array([
0xff, 0xda, 0x00, 0x08, 0x01, 0x01, 0x00, 0x00, 0x3f, 0x00,
]);
}
// --- TIFF encoder ----------------------------------------------------------
// Encoded form of an IFD entry as it appears in the TIFF stream.
interface EncodedEntry {
tag: number;
type: number;
count: number;
// Either a 4-byte inline value (left-justified) or a placeholder requesting
// an offset patch into the values area.
inlineValue: Uint8Array | null; // exactly 4 bytes when non-null
externalValue: Uint8Array | null; // > 4 bytes, written to values area
}
// Convert a value spec to byte payload (the actual TIFF-encoded value) plus type/count.
function encodeValue(spec: IfdValueSpec): {
type: number;
count: number;
payload: Uint8Array;
} {
switch (spec.type) {
case "ASCII": {
// NUL-terminated ASCII; count = string length + 1.
const s = spec.value;
const payload = new Uint8Array(s.length + 1);
for (let i = 0; i < s.length; i++) payload[i] = s.charCodeAt(i) & 0xff;
payload[s.length] = 0;
return { type: TYPE_ASCII, count: payload.length, payload };
}
case "SHORT": {
const payload = new Uint8Array(2);
payload[0] = (spec.value >> 8) & 0xff;
payload[1] = spec.value & 0xff;
return { type: TYPE_SHORT, count: 1, payload };
}
case "LONG": {
const payload = new Uint8Array(4);
payload[0] = (spec.value >>> 24) & 0xff;
payload[1] = (spec.value >>> 16) & 0xff;
payload[2] = (spec.value >>> 8) & 0xff;
payload[3] = spec.value & 0xff;
return { type: TYPE_LONG, count: 1, payload };
}
case "RATIONAL": {
const [num, den] = spec.value;
const payload = new Uint8Array(8);
writeUint32BE(payload, 0, num);
writeUint32BE(payload, 4, den);
return { type: TYPE_RATIONAL, count: 1, payload };
}
case "RATIONAL_ARRAY": {
const payload = new Uint8Array(spec.value.length * 8);
for (let i = 0; i < spec.value.length; i++) {
const pair = spec.value[i];
if (pair === undefined) continue;
const [num, den] = pair;
writeUint32BE(payload, i * 8, num);
writeUint32BE(payload, i * 8 + 4, den);
}
return {
type: TYPE_RATIONAL,
count: spec.value.length,
payload,
};
}
case "UNDEFINED": {
return {
type: TYPE_UNDEFINED,
count: spec.value.length,
payload: new Uint8Array(spec.value),
};
}
}
}
function writeUint16BE(buf: Uint8Array, off: number, v: number): void {
buf[off] = (v >> 8) & 0xff;
buf[off + 1] = v & 0xff;
}
function writeUint32BE(buf: Uint8Array, off: number, v: number): void {
buf[off] = (v >>> 24) & 0xff;
buf[off + 1] = (v >>> 16) & 0xff;
buf[off + 2] = (v >>> 8) & 0xff;
buf[off + 3] = v & 0xff;
}
// Pack a value payload into the 4-byte inline slot if it fits; else mark
// for external placement in the values area.
function packValue(payload: Uint8Array): {
inline: Uint8Array | null;
external: Uint8Array | null;
} {
if (payload.length <= 4) {
const slot = new Uint8Array(4);
slot.set(payload, 0);
return { inline: slot, external: null };
}
return { inline: null, external: payload };
}
// Sort entries by tag (TIFF spec requirement) and produce encoded forms.
function toEncodedEntries(entries: IfdValueSpec[]): EncodedEntry[] {
const sorted = [...entries].sort((a, b) => a.tag - b.tag);
return sorted.map((spec): EncodedEntry => {
const { type, count, payload } = encodeValue(spec);
const { inline, external } = packValue(payload);
return {
tag: spec.tag,
type,
count,
inlineValue: inline,
externalValue: external,
};
});
}
// Size of an IFD section in the TIFF stream:
// 2 bytes count + 12 * entries + 4 bytes next-IFD pointer.
function ifdSize(entryCount: number): number {
return 2 + entryCount * 12 + 4;
}
// Layout: TIFF header (8 bytes)
// | IFD0 (entries + next-IFD ptr)
// | SubIFD if present (entries + next-IFD ptr)
// | InteropIFD if present
// | GPS if present
// | IFD1 if present
// | values area (external values for any IFD with payload > 4 bytes)
//
// IFD0 gets synthetic ExifIFDPointer / GPSIFDPointer entries when SubIFD /
// GPS are supplied. SubIFD gets InteropIFDPointer when Interop is supplied.
// All IFDs terminate next-IFD = 0 except IFD0 (next-IFD = IFD1 offset if any).
function encodeTiff(parts: BuildParts): Uint8Array {
// Phase 1: encode all entries (without resolving pointer values yet).
const ifd0Encoded = toEncodedEntries(parts.ifd0);
const subEncoded = parts.subIfd ? toEncodedEntries(parts.subIfd) : null;
const gpsEncoded = parts.gps ? toEncodedEntries(parts.gps) : null;
const ifd1Encoded = parts.ifd1 ? toEncodedEntries(parts.ifd1) : null;
const interopEncoded = parts.interop ? toEncodedEntries(parts.interop) : null;
// Add synthetic pointer entries to IFD0 (inserted in tag-sorted order).
const ifd0Final = [...ifd0Encoded];
if (subEncoded !== null) {
insertSorted(ifd0Final, makePointerEntry(IFD_TAGS.ExifIFDPointer));
}
if (gpsEncoded !== null) {
insertSorted(ifd0Final, makePointerEntry(IFD_TAGS.GPSIFDPointer));
}
// Add synthetic InteropIFDPointer to SubIFD if Interop is supplied.
const subFinal = subEncoded === null ? null : [...subEncoded];
if (subFinal !== null && interopEncoded !== null) {
insertSorted(subFinal, makePointerEntry(EXIF_TAGS.InteropIFDPointer));
}
// Phase 2: compute offsets. IFD0 starts at offset 8 (right after header).
let cursor = 8;
const ifd0Offset = cursor;
cursor += ifdSize(ifd0Final.length);
const subIfdOffset = subFinal !== null ? cursor : 0;
if (subFinal !== null) cursor += ifdSize(subFinal.length);
const interopOffset = interopEncoded !== null ? cursor : 0;
if (interopEncoded !== null) cursor += ifdSize(interopEncoded.length);
const gpsOffset = gpsEncoded !== null ? cursor : 0;
if (gpsEncoded !== null) cursor += ifdSize(gpsEncoded.length);
const ifd1Offset = ifd1Encoded !== null ? cursor : 0;
if (ifd1Encoded !== null) cursor += ifdSize(ifd1Encoded.length);
const valuesAreaStart = cursor;
// Phase 3: collect external values, allocating offsets into the values
// area. Update each entry's inlineValue to point at its values-area offset.
const valuesArea: number[] = [];
const assignExternal = (entry: EncodedEntry): void => {
if (entry.externalValue === null) return;
// Pad to even alignment (TIFF convention; many encoders also use
// 4-byte alignment but 2-byte is sufficient for parsers).
while ((valuesAreaStart + valuesArea.length) % 2 !== 0) {
valuesArea.push(0);
}
const offset = valuesAreaStart + valuesArea.length;
for (const b of entry.externalValue) valuesArea.push(b);
const slot = new Uint8Array(4);
writeUint32BE(slot, 0, offset);
entry.inlineValue = slot;
};
for (const e of ifd0Final) assignExternal(e);
if (subFinal !== null) for (const e of subFinal) assignExternal(e);
if (interopEncoded !== null) for (const e of interopEncoded) assignExternal(e);
if (gpsEncoded !== null) for (const e of gpsEncoded) assignExternal(e);
if (ifd1Encoded !== null) for (const e of ifd1Encoded) assignExternal(e);
// Resolve synthetic pointer entries: ExifIFDPointer / GPSIFDPointer /
// InteropIFDPointer point at the start of their respective IFDs.
patchPointer(ifd0Final, IFD_TAGS.ExifIFDPointer, subIfdOffset);
patchPointer(ifd0Final, IFD_TAGS.GPSIFDPointer, gpsOffset);
if (subFinal !== null) {
patchPointer(subFinal, EXIF_TAGS.InteropIFDPointer, interopOffset);
}
// Phase 4: assemble the byte stream.
const totalSize = valuesAreaStart + valuesArea.length;
const tiff = new Uint8Array(totalSize);
// TIFF header: big-endian (MM), 0x002a, IFD0 offset.
tiff[0] = 0x4d;
tiff[1] = 0x4d;
writeUint16BE(tiff, 2, 0x002a);
writeUint32BE(tiff, 4, ifd0Offset);
const nextIfd0 = ifd1Offset; // 0 if no IFD1
writeIfd(tiff, ifd0Offset, ifd0Final, nextIfd0);
if (subFinal !== null) writeIfd(tiff, subIfdOffset, subFinal, 0);
if (interopEncoded !== null) {
writeIfd(tiff, interopOffset, interopEncoded, 0);
}
if (gpsEncoded !== null) writeIfd(tiff, gpsOffset, gpsEncoded, 0);
if (ifd1Encoded !== null) writeIfd(tiff, ifd1Offset, ifd1Encoded, 0);
// Append the values area verbatim.
for (let i = 0; i < valuesArea.length; i++) {
const b = valuesArea[i];
tiff[valuesAreaStart + i] = b ?? 0;
}
return tiff;
}
function writeIfd(
buf: Uint8Array,
offset: number,
entries: EncodedEntry[],
nextIfdOffset: number,
): void {
writeUint16BE(buf, offset, entries.length);
let p = offset + 2;
for (const entry of entries) {
writeUint16BE(buf, p, entry.tag);
writeUint16BE(buf, p + 2, entry.type);
writeUint32BE(buf, p + 4, entry.count);
const slot = entry.inlineValue;
if (slot === null) {
// Should never happen: every entry gets an inlineValue from packValue,
// makePointerEntry, or assignExternal before the IFD is written.
throw new Error(`encoder bug: entry for tag 0x${entry.tag.toString(16)} has no inlineValue`);
}
buf[p + 8] = slot[0] ?? 0;
buf[p + 9] = slot[1] ?? 0;
buf[p + 10] = slot[2] ?? 0;
buf[p + 11] = slot[3] ?? 0;
p += 12;
}
writeUint32BE(buf, p, nextIfdOffset);
}
// Synthetic pointer entry (ExifIFDPointer / GPSIFDPointer / InteropIFDPointer).
// The actual target offset is patched later via patchPointer once all IFD
// offsets are known.
function makePointerEntry(tag: number): EncodedEntry {
return {
tag,
type: TYPE_LONG,
count: 1,
inlineValue: new Uint8Array(4), // patched later
externalValue: null,
};
}
function patchPointer(
entries: EncodedEntry[],
tag: number,
targetOffset: number,
): void {
const entry = entries.find((e) => e.tag === tag);
if (entry === undefined) return;
const slot = new Uint8Array(4);
writeUint32BE(slot, 0, targetOffset);
entry.inlineValue = slot;
}
function insertSorted(entries: EncodedEntry[], entry: EncodedEntry): void {
const idx = entries.findIndex((e) => e.tag > entry.tag);
if (idx === -1) entries.push(entry);
else entries.splice(idx, 0, entry);
}
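
The inline-versus-external rule and the offset arithmetic used by the encoder above can be checked by hand. The sketch below is a standalone illustration, not part of the fixture module: `packValueSketch`, `ifdSizeSketch`, and `asciiPayload` are hypothetical names that mirror `packValue`, `ifdSize`, and the ASCII branch of `encodeValue`, and it assumes an IFD0 with three entries and no other IFDs.

```typescript
// Standalone sketch; all names are illustrative, not exports of the fixture.
interface Packed {
  inline: Uint8Array | null;
  external: Uint8Array | null;
}

// Values of 4 bytes or fewer sit left-justified in the entry's 4-byte slot;
// anything longer is stored in the values area and referenced by offset.
function packValueSketch(payload: Uint8Array): Packed {
  if (payload.length <= 4) {
    const slot = new Uint8Array(4);
    slot.set(payload, 0);
    return { inline: slot, external: null };
  }
  return { inline: null, external: payload };
}

// 2-byte entry count + 12 bytes per entry + 4-byte next-IFD pointer.
function ifdSizeSketch(entryCount: number): number {
  return 2 + entryCount * 12 + 4;
}

// NUL-terminated ASCII, as in the "ASCII" branch of encodeValue.
// The trailing NUL comes from Uint8Array's zero initialisation.
function asciiPayload(s: string): Uint8Array {
  const out = new Uint8Array(s.length + 1);
  for (let i = 0; i < s.length; i++) out[i] = s.charCodeAt(i) & 0xff;
  return out;
}

// Orientation = 6 as a big-endian SHORT fits inline.
const orientationPacked = packValueSketch(new Uint8Array([0x00, 0x06]));
// "Apple" plus its NUL terminator is 6 bytes, so it goes to the values area.
const makePacked = packValueSketch(asciiPayload("Apple"));
// With the 8-byte TIFF header and a 3-entry IFD0, the values area starts here.
const valuesAreaStartSketch = 8 + ifdSizeSketch(3);

console.log(
  orientationPacked.inline,
  makePacked.external?.length,
  valuesAreaStartSketch,
);
```

Under these assumptions the values area begins at byte 50, which is even, so the first external value needs no alignment padding.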

@ -0,0 +1,382 @@
import { describe, it, expect } from "vitest";
import { JpegStrategy } from "../../../src/infrastructure/wasm/strategies/jpeg_strategy";
import type { StripOptions } from "../../../src/infrastructure/wasm/format_strategy";
import {
buildJpegWithIfd0,
buildJpegWithIfd0AndSubIfd,
buildJpegWithGps,
buildJpegWithInterop,
buildJpegWithIfd1,
buildJpegWithXmp,
buildJpegWithIcc,
buildJpegWithJfif,
buildJpegWithComment,
buildJpegWithMakerNote,
IFD_TAGS,
EXIF_TAGS,
GPS_TAGS,
} from "./fixtures/jpeg_builders";
const strategy = new JpegStrategy();
const defaultOptions: StripOptions = {
preserveOrientation: false,
preserveColorProfile: false,
preserveTimestamps: false,
};
describe("JpegStrategy — IFD0 enumeration", () => {
it("emits one removed item per IFD0 tag with canonical name and formatted value", async () => {
const bytes = buildJpegWithIfd0([
{ tag: IFD_TAGS.Make, type: "ASCII", value: "Apple" },
{ tag: IFD_TAGS.Model, type: "ASCII", value: "iPhone 15 Pro" },
{ tag: IFD_TAGS.Orientation, type: "SHORT", value: 6 },
]);
const result = await strategy.strip({ bytes, options: defaultOptions });
expect(result.ok).toBe(true);
if (!result.ok) return;
const items = result.value.metadataItems;
expect(items).toHaveLength(3);
expect(items[0]).toEqual({
action: "removed",
source: "EXIF",
name: "Make",
valueBefore: "Apple",
});
expect(items[1]).toEqual({
action: "removed",
source: "EXIF",
name: "Model",
valueBefore: "iPhone 15 Pro",
});
expect(items[2]).toEqual({
action: "removed",
source: "EXIF",
name: "Orientation",
valueBefore: "6 (Rotate 90 CW)",
});
});
it("walks ExifIFDPointer to SubIFD and emits its tags", async () => {
const bytes = buildJpegWithIfd0AndSubIfd(
[{ tag: IFD_TAGS.Make, type: "ASCII", value: "Apple" }],
[
{ tag: EXIF_TAGS.ExposureTime, type: "RATIONAL", value: [1, 120] },
{ tag: EXIF_TAGS.ISO, type: "SHORT", value: 200 },
],
);
const result = await strategy.strip({ bytes, options: defaultOptions });
expect(result.ok).toBe(true);
if (!result.ok) return;
const names = result.value.metadataItems.map((i) => i.name);
expect(names).toContain("Make");
expect(names).toContain("ExposureTime");
expect(names).toContain("ISO");
const exposureItem = result.value.metadataItems.find(
(i) => i.name === "ExposureTime",
);
expect(exposureItem).toBeDefined();
if (exposureItem !== undefined && exposureItem.action === "removed") {
expect(exposureItem.valueBefore).toBe("1/120 s");
}
});
it("walks InteropIFDPointer to InteropIFD with correct tag names", async () => {
const bytes = buildJpegWithInterop(
[],
[],
[{ tag: 0x0001, type: "ASCII", value: "R98" }],
);
const result = await strategy.strip({ bytes, options: defaultOptions });
expect(result.ok).toBe(true);
if (!result.ok) return;
const interopItem = result.value.metadataItems.find(
(i) => i.name === "InteropIndex",
);
expect(interopItem).toBeDefined();
if (interopItem !== undefined && interopItem.action === "removed") {
expect(interopItem.valueBefore).toBe("R98");
}
});
it("walks IFD0 next-IFD pointer to IFD1 (thumbnail)", async () => {
const bytes = buildJpegWithIfd1(
[{ tag: IFD_TAGS.Make, type: "ASCII", value: "Apple" }],
[
{ tag: 0x0103, type: "SHORT", value: 6 }, // Compression
{ tag: 0x0201, type: "LONG", value: 1234 }, // JPEGInterchangeFormat
],
);
const result = await strategy.strip({ bytes, options: defaultOptions });
expect(result.ok).toBe(true);
if (!result.ok) return;
const names = result.value.metadataItems.map((i) => i.name);
expect(names).toContain("Make");
// IFD1 thumbnail tags decode under their canonical names where the
// decoder table covers them, or fall back to "Tag 0xNNNN" otherwise.
expect(result.value.metadataItems.length).toBeGreaterThanOrEqual(3);
});
it("emits canonical-hex name for unknown tags", async () => {
const bytes = buildJpegWithIfd0([
{ tag: 0xabcd, type: "SHORT", value: 42 }, // not in decoder table
]);
const result = await strategy.strip({ bytes, options: defaultOptions });
expect(result.ok).toBe(true);
if (!result.ok) return;
const item = result.value.metadataItems[0];
expect(item).toBeDefined();
if (item !== undefined && item.action === "removed") {
expect(item.name).toBe("Tag 0xABCD");
}
});
it("emits XMP packet as one blob item", async () => {
const xml = new TextEncoder().encode(
'<x:xmpmeta xmlns:x="adobe:ns:meta/"></x:xmpmeta>',
);
const bytes = buildJpegWithXmp(xml);
const result = await strategy.strip({ bytes, options: defaultOptions });
expect(result.ok).toBe(true);
if (!result.ok) return;
const xmpItems = result.value.metadataItems.filter(
(i) => i.source === "XMP",
);
expect(xmpItems).toHaveLength(1);
const item = xmpItems[0];
expect(item).toBeDefined();
if (item !== undefined && item.action === "removed") {
expect(item.name).toBe("XMP packet");
expect(item.valueBefore).toMatch(/^<XML, \d+/);
}
});
it("emits ICC profile as one blob item when dropped", async () => {
const icc = new Uint8Array(3144);
const bytes = buildJpegWithIcc(icc);
const result = await strategy.strip({
bytes,
options: { ...defaultOptions, preserveColorProfile: false },
});
expect(result.ok).toBe(true);
if (!result.ok) return;
const iccItems = result.value.metadataItems.filter(
(i) => i.source === "ICC",
);
expect(iccItems).toHaveLength(1);
const item = iccItems[0];
expect(item).toBeDefined();
if (item !== undefined && item.action === "removed") {
expect(item.valueBefore).toMatch(/^<binary, \d+/);
}
});
it("emits JFIF block as one item when dropped", async () => {
const bytes = buildJpegWithJfif();
const result = await strategy.strip({ bytes, options: defaultOptions });
expect(result.ok).toBe(true);
if (!result.ok) return;
const jfifItems = result.value.metadataItems.filter(
(i) => i.source === "JFIF",
);
expect(jfifItems).toHaveLength(1);
expect(jfifItems[0]!.name).toBe("JFIF block");
});
it("emits Comment marker as one item", async () => {
const bytes = buildJpegWithComment("hello world");
const result = await strategy.strip({ bytes, options: defaultOptions });
expect(result.ok).toBe(true);
if (!result.ok) return;
const commentItems = result.value.metadataItems.filter(
(i) => i.source === "Comment",
);
expect(commentItems).toHaveLength(1);
const item = commentItems[0];
expect(item).toBeDefined();
if (item !== undefined && item.action === "removed") {
expect(item.valueBefore).toBe("hello world");
}
});
it("emits MakerNote as one opaque item with vendor detection", async () => {
const bytes = buildJpegWithMakerNote("Nikon ", 2528);
const result = await strategy.strip({ bytes, options: defaultOptions });
expect(result.ok).toBe(true);
if (!result.ok) return;
const makerNoteItem = result.value.metadataItems.find((i) =>
i.name.startsWith("MakerNote"),
);
expect(makerNoteItem).toBeDefined();
if (makerNoteItem !== undefined && makerNoteItem.action === "removed") {
expect(makerNoteItem.name).toBe("MakerNote");
expect(makerNoteItem.valueBefore).toMatch(/Nikon/);
}
});
it("emits plain MakerNote (no vendor) on unrecognised signature", async () => {
const bytes = buildJpegWithMakerNote("XYZUNKNOWN", 100);
const result = await strategy.strip({ bytes, options: defaultOptions });
expect(result.ok).toBe(true);
if (!result.ok) return;
const makerNoteItem = result.value.metadataItems.find((i) =>
i.name.startsWith("MakerNote"),
);
expect(makerNoteItem).toBeDefined();
if (makerNoteItem !== undefined && makerNoteItem.action === "removed") {
expect(makerNoteItem.valueBefore).not.toMatch(
/Nikon|Canon|Sony|Olympus|Fuji/,
);
}
});
it("emits compound GPSLatitude combining ref + coord triplet", async () => {
const bytes = buildJpegWithGps(
[],
[
{ tag: GPS_TAGS.GPSLatitudeRef, type: "ASCII", value: "N" },
{
tag: GPS_TAGS.GPSLatitude,
type: "RATIONAL_ARRAY",
value: [
[47, 1],
[36, 1],
[223, 10],
],
},
],
);
const result = await strategy.strip({ bytes, options: defaultOptions });
expect(result.ok).toBe(true);
if (!result.ok) return;
const items = result.value.metadataItems;
const gpsItems = items.filter((i) => i.source === "GPS");
expect(gpsItems).toHaveLength(1); // ref + lat collapsed into one item
const latItem = gpsItems[0];
expect(latItem).toBeDefined();
if (latItem !== undefined && latItem.action === "removed") {
expect(latItem.name).toBe("GPSLatitude");
expect(latItem.valueBefore).toBe("47° 36' 22.3\" N");
}
});
it("emits kept item for Orientation when preserveOrientation is true", async () => {
const bytes = buildJpegWithIfd0([
{ tag: IFD_TAGS.Make, type: "ASCII", value: "Apple" },
{ tag: IFD_TAGS.Orientation, type: "SHORT", value: 6 },
]);
const result = await strategy.strip({
bytes,
options: { ...defaultOptions, preserveOrientation: true },
});
expect(result.ok).toBe(true);
if (!result.ok) return;
const items = result.value.metadataItems;
const orientationItem = items.find((i) => i.name === "Orientation");
expect(orientationItem).toBeDefined();
if (orientationItem !== undefined) {
expect(orientationItem.action).toBe("kept");
if (orientationItem.action === "kept") {
expect(orientationItem.value).toBe("6 (Rotate 90 CW)");
}
}
const makeItem = items.find((i) => i.name === "Make");
expect(makeItem?.action).toBe("removed");
});
it("emits kept item for ICC profile when preserveColorProfile is true", async () => {
const icc = new Uint8Array(3144);
const bytes = buildJpegWithIcc(icc);
const result = await strategy.strip({
bytes,
options: { ...defaultOptions, preserveColorProfile: true },
});
expect(result.ok).toBe(true);
if (!result.ok) return;
const iccItems = result.value.metadataItems.filter((i) => i.source === "ICC");
expect(iccItems).toHaveLength(1);
expect(iccItems[0]!.action).toBe("kept");
});
it("file with only Orientation + preserveOrientation: true → only a kept item, no removed/modified", async () => {
const bytes = buildJpegWithIfd0([
{ tag: IFD_TAGS.Orientation, type: "SHORT", value: 6 },
]);
const result = await strategy.strip({
bytes,
options: { ...defaultOptions, preserveOrientation: true },
});
expect(result.ok).toBe(true);
if (!result.ok) return;
const nonKept = result.value.metadataItems.filter((i) => i.action !== "kept");
expect(nonKept).toHaveLength(0);
});
it("decodes common EXIF SubIFD tags by canonical name", async () => {
const bytes = buildJpegWithIfd0AndSubIfd(
[],
[
{ tag: 0x9207, type: "SHORT", value: 5 }, // MeteringMode = Multi-segment
{ tag: 0xa001, type: "SHORT", value: 1 }, // ColorSpace = sRGB
{ tag: 0xa002, type: "LONG", value: 640 }, // ExifImageWidth
{ tag: 0x9204, type: "RATIONAL", value: [0, 1] }, // ExposureBiasValue
],
);
const result = await strategy.strip({ bytes, options: defaultOptions });
expect(result.ok).toBe(true);
if (!result.ok) return;
const names = result.value.metadataItems.map((i) => i.name);
expect(names).toContain("MeteringMode");
expect(names).toContain("ColorSpace");
expect(names).toContain("ExifImageWidth");
expect(names).toContain("ExposureBiasValue");
const metering = result.value.metadataItems.find(
(i) => i.name === "MeteringMode",
);
expect(metering).toBeDefined();
if (metering !== undefined && metering.action === "removed") {
expect(metering.valueBefore).toBe("5 (Multi-segment)");
}
const colorSpace = result.value.metadataItems.find(
(i) => i.name === "ColorSpace",
);
if (colorSpace !== undefined && colorSpace.action === "removed") {
expect(colorSpace.valueBefore).toBe("1 (sRGB)");
}
});
it("decodes common GPS tags by canonical name", async () => {
const bytes = buildJpegWithGps(
[],
[
{ tag: 0x0008, type: "ASCII", value: "07" }, // GPSSatellites
{ tag: 0x0012, type: "ASCII", value: "WGS-84" }, // GPSMapDatum
],
);
const result = await strategy.strip({ bytes, options: defaultOptions });
expect(result.ok).toBe(true);
if (!result.ok) return;
const names = result.value.metadataItems.map((i) => i.name);
expect(names).toContain("GPSSatellites");
expect(names).toContain("GPSMapDatum");
const satellites = result.value.metadataItems.find(
(i) => i.name === "GPSSatellites",
);
if (satellites !== undefined && satellites.action === "removed") {
expect(satellites.valueBefore).toBe("07");
}
});
});
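
Several assertions above pin the display strings the strategy is expected to produce, for example `1/120 s` for ExposureTime and `47° 36' 22.3" N` for a GPS latitude built from a ref and three rationals. The following is a minimal sketch of formatters that would yield those two strings. `formatExposureTime`, `formatGpsCoordinate`, and `rationalToNumber` are hypothetical names, and the rounding policy is an assumption; the real decoder behind `JpegStrategy` is not shown in this diff and may differ.

```typescript
// Hypothetical formatters; names and rounding policy are assumptions.
type Rational = readonly [number, number];

// Converts a numerator/denominator pair to a number, treating x/0 as 0.
function rationalToNumber([num, den]: Rational): number {
  return den === 0 ? 0 : num / den;
}

// Renders an exposure time as the raw fraction with a seconds suffix.
function formatExposureTime([num, den]: Rational): string {
  return `${num}/${den} s`;
}

// Combines a degrees/minutes/seconds triplet with its hemisphere reference.
function formatGpsCoordinate(
  triplet: readonly [Rational, Rational, Rational],
  ref: string,
): string {
  const deg = rationalToNumber(triplet[0]);
  const min = rationalToNumber(triplet[1]);
  // Round seconds to one decimal place before interpolating.
  const sec = Math.round(rationalToNumber(triplet[2]) * 10) / 10;
  return `${deg}° ${min}' ${sec}" ${ref}`;
}

const exposureText = formatExposureTime([1, 120]);
const latitudeText = formatGpsCoordinate([[47, 1], [36, 1], [223, 10]], "N");
console.log(exposureText, latitudeText);
```

Keeping the hemisphere reference and the coordinate triplet in one formatter matches the test's expectation that the two GPS entries collapse into a single item.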

@ -103,7 +103,9 @@ describe("JpegStrategy — drops APP and COM segments by default", () => {
expect(result.ok).toBe(true);
if (!result.ok) return;
expect(findMarker(result.value.bytes, 0xe0)).toBe(-1);
expect(result.value.metadataRemoved).toBe(1);
// The dropped JFIF segment surfaces as a single removed item.
expect(result.value.metadataItems).toHaveLength(1);
expect(result.value.metadataItems[0]?.source).toBe("JFIF");
});
it("drops EXIF/APP1 (FFE1)", async () => {
@ -188,7 +190,12 @@ describe("JpegStrategy — drops APP and COM segments by default", () => {
});
expect(result.ok).toBe(true);
if (!result.ok) return;
expect(result.value.metadataRemoved).toBe(4);
// APP0/JFIF + COM/Comment surface as items (each describable segment
// type produces one item). APP1 here has no Exif/XMP identifier, so
// it's silently dropped without an item; APP13 isn't decoded yet.
const sources = result.value.metadataItems.map((i) => i.source);
expect(sources).toContain("JFIF");
expect(sources).toContain("Comment");
});
});
@ -408,7 +415,7 @@ describe("JpegStrategy — fill bytes (T.81 §B.1.1.2)", () => {
if (!result.ok) return;
// APP1 was dropped — no FFE1 should remain in the output.
expect(findMarker(result.value.bytes, 0xe1)).toBe(-1);
expect(result.value.metadataRemoved).toBe(1);
expect(result.value.metadataItems).toEqual([]);
// Output must still be a well-formed JPEG.
expect(result.value.bytes[0]).toBe(0xff);
expect(result.value.bytes[1]).toBe(0xd8);
@ -499,8 +506,26 @@ describe("JpegStrategy — real fixture", () => {
expect(stripped[1]).toBe(0xd8);
expect(stripped[stripped.length - 2]).toBe(0xff);
expect(stripped[stripped.length - 1]).toBe(0xd9);
// At least one segment was dropped (the fixture has metadata)
expect(result.value.metadataRemoved).toBeGreaterThan(0);
// The fixture has Make=TestCamera + Artist=TestAuthor embedded in
// IFD0; both should surface as removed metadata items. The fixture
// also carries a JFIF block (APP0) which is observed and dropped.
const items = result.value.metadataItems;
expect(items).toEqual(
expect.arrayContaining([
{
action: "removed",
source: "EXIF",
name: "Make",
valueBefore: "TestCamera",
},
{
action: "removed",
source: "EXIF",
name: "Artist",
valueBefore: "TestAuthor",
},
]),
);
// No EXIF, JFIF, or COM markers remain
expect(findMarker(stripped, 0xe0)).toBe(-1);
expect(findMarker(stripped, 0xe1)).toBe(-1);
@ -783,7 +808,7 @@ describe("JpegStrategy — preserveOrientation", () => {
expect(findMarker(result.value.bytes, 0xe1)).toBe(-1);
});
it("counts an APP1 replacement as one segment removed (the GPS/Make payload was dropped)", async () => {
it("tracks IFD0 entries when APP1 is replaced with the orientation-only synthesis", async () => {
const payload = buildExifApp1Payload({
endian: "BE",
entries: [
@ -801,7 +826,20 @@ describe("JpegStrategy — preserveOrientation", () => {
});
expect(result.ok).toBe(true);
if (!result.ok) return;
expect(result.value.metadataRemoved).toBe(1);
// The walker enumerated the original IFD0 before synthesizing the
// replacement. The GPS pointer (0x8825) is consumed without an item
// (it's a pointer entry, not metadata content); Orientation surfaces
// as a kept item because the replacement APP1 re-emits the same value
// — from the user's perspective the tag was preserved verbatim.
expect(result.value.metadataItems).toEqual([
{
action: "kept",
source: "EXIF",
name: "Orientation",
value: "1 (Normal)",
reason: "diffKeptReasonOrientation",
},
]);
});
it("rejects orientation values outside the 1..8 range", async () => {

@ -84,7 +84,7 @@ describe("OfficeStrategy", () => {
expect(result.ok).toBe(true);
if (!result.ok) return;
expect(result.value.metadataRemoved).toBeGreaterThan(0);
expect(result.value.metadataItems).toEqual([]);
const cleaned = await JSZip.loadAsync(result.value.bytes);
const core = await cleaned.file("docProps/core.xml")?.async("string");
@ -100,7 +100,7 @@ describe("OfficeStrategy", () => {
expect(result.ok).toBe(true);
if (!result.ok) return;
expect(result.value.metadataRemoved).toBeGreaterThan(0);
expect(result.value.metadataItems).toEqual([]);
const cleaned = await JSZip.loadAsync(result.value.bytes);
const meta = await cleaned.file("meta.xml")?.async("string");
@ -151,7 +151,7 @@ describe("OfficeStrategy", () => {
expect(cleaned.file("docProps/thumbnail.jpeg")).toBeNull();
expect(cleaned.file("docProps/thumbnail.emf")).toBeNull();
expect(cleaned.file("docProps/thumbnail.wmf")).toBeNull();
expect(result.value.metadataRemoved).toBeGreaterThanOrEqual(3);
expect(result.value.metadataItems).toEqual([]);
});
it("removes word/people.xml and ppt/commentAuthors.xml when present", async () => {

@ -155,7 +155,7 @@ describe("PdfStrategy — Info dictionary", () => {
expect(cleaned.getModificationDate()).toBeUndefined();
});
it("counts every removed metadata key in metadataRemoved", async () => {
it("returns empty metadataItems until per-tag tracking is wired", async () => {
const input = await makeRichPdf({ withDates: true });
const result = await strategy.strip({
bytes: input,
@ -163,9 +163,7 @@ describe("PdfStrategy — Info dictionary", () => {
});
expect(result.ok).toBe(true);
if (!result.ok) return;
// 6 Info string fields + CreationDate + ModDate = 8 minimum from
// makeRichPdf. We allow >= because pdf-lib may add extras.
expect(result.value.metadataRemoved).toBeGreaterThanOrEqual(8);
expect(result.value.metadataItems).toEqual([]);
});
// Privacy invariant §6: byte-level forensic check. The pdf-lib API
@ -363,7 +361,7 @@ describe("PdfStrategy — bundled fixture", () => {
expect(result.value.bytes[2]).toBe(0x44);
expect(result.value.bytes[3]).toBe(0x46);
expect(result.value.bytes[4]).toBe(0x2d);
// At least some metadata removed
expect(result.value.metadataRemoved).toBeGreaterThan(0);
// metadataItems is empty until per-tag tracking is wired.
expect(result.value.metadataItems).toEqual([]);
});
});

@ -189,10 +189,10 @@ describe("PngStrategy — drops ancillary metadata chunks by default", () => {
expect(result.ok).toBe(true);
if (!result.ok) return;
expect(findChunk(result.value.bytes, type)).toBe(-1);
expect(result.value.metadataRemoved).toBe(1);
expect(result.value.metadataItems).toEqual([]);
});
it("drops several ancillary chunks in one pass and counts them", async () => {
it("drops several ancillary chunks in one pass", async () => {
const input = makePng([
{ type: "tEXt", data: asciiBytes("Author\0Jane") },
{ type: "tIME", data: new Uint8Array([0x07, 0xe6, 0x05, 0x07, 0x10, 0x00, 0x00]) },
@ -201,7 +201,7 @@ describe("PngStrategy — drops ancillary metadata chunks by default", () => {
const result = await strategy.strip({ bytes: input, options: DEFAULT_OPTIONS });
expect(result.ok).toBe(true);
if (!result.ok) return;
expect(result.value.metadataRemoved).toBe(3);
expect(result.value.metadataItems).toEqual([]);
expect(findChunk(result.value.bytes, "tEXt")).toBe(-1);
expect(findChunk(result.value.bytes, "tIME")).toBe(-1);
expect(findChunk(result.value.bytes, "iTXt")).toBe(-1);
@ -308,12 +308,12 @@ describe("PngStrategy — preserves required and display-affecting chunks", () =
expect(findChunk(result.value.bytes, "tEXt")).toBe(-1);
});
it("returns metadataRemoved=0 for a clean input", async () => {
it("returns empty metadataItems for a clean input", async () => {
const input = makePng([]);
const result = await strategy.strip({ bytes: input, options: DEFAULT_OPTIONS });
expect(result.ok).toBe(true);
if (!result.ok) return;
expect(result.value.metadataRemoved).toBe(0);
expect(result.value.metadataItems).toEqual([]);
});
});
@ -541,8 +541,8 @@ describe("PngStrategy — real fixture", () => {
for (let i = 0; i < PNG_SIGNATURE.length; i++) {
expect(stripped[i]).toBe(PNG_SIGNATURE[i]);
}
// At least one ancillary chunk was dropped (the fixture has metadata).
expect(result.value.metadataRemoved).toBeGreaterThan(0);
// metadataItems is empty until Task 5+ wires up per-tag tracking.
expect(result.value.metadataItems).toEqual([]);
// No tEXt, iTXt, tIME, or pHYs chunks remain.
expect(findChunk(stripped, "tEXt")).toBe(-1);
expect(findChunk(stripped, "iTXt")).toBe(-1);

@ -120,7 +120,7 @@ describe("VideoStrategy", () => {
const result = await strategy.strip({ bytes, options: NO_PRESERVE });
expect(result.ok).toBe(true);
if (!result.ok) return;
expect(result.value.metadataRemoved).toBeGreaterThan(0);
expect(result.value.metadataItems).toEqual([]);
const after = asString(result.value.bytes);
expect(after).not.toContain("Test Title");
@ -162,7 +162,7 @@ describe("VideoStrategy", () => {
const result = await strategy.strip({ bytes, options: NO_PRESERVE });
expect(result.ok).toBe(true);
if (!result.ok) return;
expect(result.value.metadataRemoved).toBeGreaterThan(0);
expect(result.value.metadataItems).toEqual([]);
const after = asString(result.value.bytes);
expect(after).not.toContain("Test Title");

@ -70,7 +70,7 @@ describe("WasmProcessor", () => {
expect(result.ok).toBe(true);
if (!result.ok) return;
expect(result.value.outputPath).toBe("/tmp/report.docx");
expect(result.value.metadataRemoved).toBeGreaterThan(0);
expect(result.value.metadataItems).toEqual([]);
const cleaned = fileBytes.files.get("/tmp/report.docx");
expect(cleaned).toBeDefined();

@ -15,6 +15,7 @@ function makeFile(overrides: Partial<FileEntry> = {}): FileEntry {
status: FileProcessingStatus.Pending,
afterBytes: null,
error: null,
metadataItems: [],
...overrides,
};
}
@ -124,7 +125,7 @@ describe("appReducer", () => {
});
describe("UPDATE_FILE_METADATA", () => {
it("sets afterBytes for matching file id", () => {
it("sets afterBytes and metadataItems for matching file id", () => {
const file = makeFile({ id: "file-1" });
const state = makeInitialState({ files: [file] });
@ -132,9 +133,11 @@ describe("appReducer", () => {
type: "UPDATE_FILE_METADATA",
id: "file-1",
afterBytes: 980_000,
metadataItems: [],
});
expect(result.files[0]!.afterBytes).toBe(980_000);
expect(result.files[0]!.metadataItems).toEqual([]);
});
});

@ -0,0 +1,189 @@
import { describe, it, expect } from "vitest";
import { renderToStaticMarkup } from "react-dom/server";
import { I18nContext } from "../../../src/web/contexts/I18nContext";
import { MetadataDiffExpansion } from "../../../src/web/components/file-list/MetadataDiffExpansion";
import type { MetadataItem } from "../../../src/domain/exif/metadata_item";
// Mirrors the keys Task 9 added to .resources/strings.json. The live `t`
// returns the raw template string with `{count}` placeholders, and the
// component interpolates them locally (the same pattern as the
// `.replace("{ext}", ...)` call in `ErrorExpansion.tsx`).
function makeI18nProvider(): (
children: React.ReactNode,
) => React.JSX.Element {
const dict: Record<string, string> = {
diffGroupRemoved: "{count} removed",
diffGroupModified: "{count} modified",
diffGroupKept: "{count} kept",
diffGroupSeparator: ", ",
diffKeptBadge: "preserved",
diffArrow: "→",
};
return (children) => (
<I18nContext.Provider
value={{
t: (key: string) => dict[key] ?? key,
locale: "en",
isLoading: false,
}}
>
{children}
</I18nContext.Provider>
);
}
function render(items: readonly MetadataItem[]): string {
const wrap = makeI18nProvider();
return renderToStaticMarkup(wrap(<MetadataDiffExpansion items={items} />));
}
describe("MetadataDiffExpansion", () => {
it("groups items by source with first-seen ordering", () => {
const items: MetadataItem[] = [
{
action: "removed",
source: "EXIF",
name: "Make",
valueBefore: "Apple",
},
{
action: "removed",
source: "GPS",
name: "GPSLatitude",
valueBefore: "47° N",
},
{
action: "removed",
source: "EXIF",
name: "Model",
valueBefore: "iPhone",
},
];
const html = render(items);
// Two <h4> headers should be present, EXIF before GPS (first-seen).
const headerMatches = html.match(
/<h4 class="file-table__diff-group-header">([^<]*)<\/h4>/g,
);
expect(headerMatches).not.toBeNull();
expect(headerMatches).toHaveLength(2);
expect(headerMatches?.[0]).toContain("EXIF");
expect(headerMatches?.[1]).toContain("GPS");
// EXIF section should appear before GPS in the rendered markup.
expect(html.indexOf("EXIF")).toBeLessThan(html.indexOf("GPS"));
});
it("sorts kept items before removed/modified within each group", () => {
const items: MetadataItem[] = [
{
action: "removed",
source: "EXIF",
name: "Make",
valueBefore: "Apple",
},
{ action: "kept", source: "EXIF", name: "Orientation", value: "6" },
];
const html = render(items);
const rowMatches = html.match(
/<div class="file-table__diff-row[^"]*"/g,
);
expect(rowMatches).not.toBeNull();
expect(rowMatches).toHaveLength(2);
expect(rowMatches?.[0]).toContain("file-table__diff-row--kept");
expect(rowMatches?.[1]).toContain("file-table__diff-row--removed");
});
it("group header shows correct counts and omits zero categories", () => {
const items: MetadataItem[] = [
{
action: "removed",
source: "EXIF",
name: "Make",
valueBefore: "Apple",
},
{ action: "kept", source: "EXIF", name: "Orientation", value: "6" },
];
const html = render(items);
const headerMatch = html.match(
/<h4 class="file-table__diff-group-header">([^<]*)<\/h4>/,
);
expect(headerMatch).not.toBeNull();
const headerText = headerMatch?.[1] ?? "";
expect(headerText).toMatch(/EXIF/);
expect(headerText).toMatch(/1 removed/);
expect(headerText).toMatch(/1 kept/);
expect(headerText).not.toMatch(/modified/);
});
it("renders modified item as before → after", () => {
const items: MetadataItem[] = [
{
action: "modified",
source: "MP4",
name: "mvhd.creation_time",
valueBefore: "2024-03-15",
valueAfter: "epoch (0)",
},
];
const html = render(items);
expect(html).toContain("2024-03-15");
expect(html).toContain("epoch (0)");
expect(html).toContain("→");
// Order: before, then arrow, then after.
const beforeIdx = html.indexOf("2024-03-15");
const arrowIdx = html.indexOf("→");
const afterIdx = html.indexOf("epoch (0)");
expect(beforeIdx).toBeLessThan(arrowIdx);
expect(arrowIdx).toBeLessThan(afterIdx);
});
it("removed item has strikethrough class", () => {
const items: MetadataItem[] = [
{
action: "removed",
source: "EXIF",
name: "Make",
valueBefore: "Apple",
},
];
const html = render(items);
const strikeMatch = html.match(
/<span class="[^"]*file-table__diff-value--strike[^"]*"[^>]*>([^<]*)<\/span>/,
);
expect(strikeMatch).not.toBeNull();
expect(strikeMatch?.[1]).toBe("Apple");
});
it("kept item has neutral class + preserved badge", () => {
const items: MetadataItem[] = [
{ action: "kept", source: "EXIF", name: "Orientation", value: "6" },
];
const html = render(items);
expect(html).toMatch(/file-table__diff-row--kept/);
expect(html).toContain("preserved");
expect(html).toMatch(/file-table__diff-kept-badge/);
});
it("long value renders with full title attribute", () => {
const longValue = "x".repeat(200);
const items: MetadataItem[] = [
{
action: "removed",
source: "EXIF",
name: "Software",
valueBefore: longValue,
},
];
const html = render(items);
const titleMatch = html.match(
/<span class="file-table__diff-value[^"]*" title="([^"]*)"/,
);
expect(titleMatch).not.toBeNull();
expect(titleMatch?.[1]).toBe(longValue);
});
it("empty items array renders nothing", () => {
const html = render([]);
// I18nContext.Provider renders no DOM of its own; component returns null.
expect(html).toBe("");
});
});
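
Review note: the tests above pin down three behaviours (first-seen grouping by `source`, kept rows sorted first within a group, and a count summary that omits zero categories). A minimal sketch of logic that would satisfy them follows. It is not the component's actual implementation: `DiffGroup`, `groupBySource`, and `summarizeGroup` are hypothetical names, the `MetadataItem` shape is inferred from the fixtures, and the removed/modified/kept order in the summary is an assumption.

```typescript
// Simplified item shape inferred from the test fixtures; the real type lives
// in src/domain/exif/metadata_item and may differ.
type MetadataItem =
  | { action: "removed"; source: string; name: string; valueBefore: string }
  | { action: "modified"; source: string; name: string; valueBefore: string; valueAfter: string }
  | { action: "kept"; source: string; name: string; value: string };

// Hypothetical helper type, not taken from the commit.
interface DiffGroup {
  source: string;
  items: MetadataItem[];
}

// Groups items by `source`. A Map iterates in insertion order, which yields
// first-seen ordering of the sources.
function groupBySource(items: readonly MetadataItem[]): DiffGroup[] {
  const groups = new Map<string, MetadataItem[]>();
  for (const item of items) {
    const bucket = groups.get(item.source);
    if (bucket === undefined) {
      groups.set(item.source, [item]);
    } else {
      bucket.push(item);
    }
  }
  return Array.from(groups, ([source, grouped]) => ({
    source,
    // Kept rows first. Array.prototype.sort is stable, so the original order
    // is retained within each category.
    items: [...grouped].sort(
      (a, b) => Number(b.action === "kept") - Number(a.action === "kept"),
    ),
  }));
}

// Builds the count summary for a group header, skipping empty categories and
// interpolating `{count}` locally, as the test's comment describes.
// The category order here is an assumption.
function summarizeGroup(group: DiffGroup, t: (key: string) => string): string {
  const categories: [MetadataItem["action"], string][] = [
    ["removed", "diffGroupRemoved"],
    ["modified", "diffGroupModified"],
    ["kept", "diffGroupKept"],
  ];
  const parts: string[] = [];
  for (const [action, key] of categories) {
    const count = group.items.filter((item) => item.action === action).length;
    if (count > 0) {
      parts.push(t(key).replace("{count}", String(count)));
    }
  }
  return parts.join(t("diffGroupSeparator"));
}
```

With the dictionary used in the test provider, a group holding one removed and one kept item would summarize as `1 removed, 1 kept`, which is what the "omits zero categories" test checks for.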

@ -37,6 +37,7 @@ function makeFileEntry(overrides: Partial<FileEntry> = {}): FileEntry {
status: overrides.status ?? FileProcessingStatus.Pending,
afterBytes: overrides.afterBytes ?? null,
error: overrides.error ?? null,
metadataItems: overrides.metadataItems ?? [],
};
}
@ -111,8 +112,8 @@ describe("processFileEntries", () => {
mockApi.wasm.process.mockResolvedValue({
ok: true,
outputPath: entry.path,
metadataRemoved: 1,
outputBytes: 980_000,
metadataItems: [],
error: null,
});
@ -136,8 +137,8 @@ describe("processFileEntries", () => {
mockApi.wasm.process.mockResolvedValue({
ok: true,
outputPath: entry.path,
metadataRemoved: 3,
outputBytes: 980_000,
metadataItems: [],
error: null,
});
@ -151,6 +152,7 @@ describe("processFileEntries", () => {
type: "UPDATE_FILE_METADATA",
id: "test-id-1",
afterBytes: 980_000,
metadataItems: [],
});
});
@ -159,8 +161,15 @@ describe("processFileEntries", () => {
mockApi.wasm.process.mockResolvedValue({
ok: true,
outputPath: entry.path,
metadataRemoved: 1,
outputBytes: 980_000,
metadataItems: [
{
action: "removed",
source: "EXIF",
name: "Artist",
valueBefore: "Jane Photographer",
},
],
error: null,
});
@ -174,13 +183,13 @@ describe("processFileEntries", () => {
expect(completeDispatches).toHaveLength(1);
});
it("dispatches 'no-metadata-found' when metadataRemoved is 0", async () => {
it("dispatches 'no-metadata-found' when metadataItems has no removed or modified entries", async () => {
const entry = makeFileEntry();
mockApi.wasm.process.mockResolvedValue({
ok: true,
outputPath: entry.path,
metadataRemoved: 0,
outputBytes: entry.size,
metadataItems: [],
error: null,
});
@ -218,7 +227,7 @@ describe("processFileEntries", () => {
const callOrder: string[] = [];
mockApi.wasm.process.mockImplementation(async (path: string) => {
callOrder.push(`wasm:${path}`);
return { ok: true, outputPath: path, metadataRemoved: 1, outputBytes: 980_000, error: null };
return { ok: true, outputPath: path, outputBytes: 980_000, metadataItems: [], error: null };
});
await processFileEntries({ entries: [entry1, entry2], dispatch: mockDispatch, t: mockT });
@ -234,8 +243,8 @@ describe("processFileEntries", () => {
mockApi.wasm.process.mockResolvedValue({
ok: true,
outputPath: "/x",
metadataRemoved: 1,
outputBytes: 980_000,
metadataItems: [],
error: null,
});
@ -249,8 +258,8 @@ describe("processFileEntries", () => {
mockApi.wasm.process.mockResolvedValue({
ok: true,
outputPath: entry.path,
metadataRemoved: 1,
outputBytes: 980_000,
metadataItems: [],
error: null,
});
@ -267,8 +276,8 @@ describe("processFileEntries", () => {
mockApi.wasm.process.mockResolvedValue({
ok: true,
outputPath: "/x",
metadataRemoved: 1,
outputBytes: 980_000,
metadataItems: [],
error: null,
});
@ -282,8 +291,8 @@ describe("processFileEntries", () => {
mockApi.wasm.process.mockResolvedValue({
ok: true,
outputPath: "/tmp/report.docx",
metadataRemoved: 3,
outputBytes: 980_000,
metadataItems: [],
error: null,
});
@ -307,8 +316,8 @@ describe("processFileEntries", () => {
mockApi.wasm.process.mockResolvedValue({
ok: true,
outputPath: "/tmp/x",
metadataRemoved: 1,
outputBytes: 980_000,
metadataItems: [],
error: null,
});
@ -348,8 +357,8 @@ describe("processFileEntries", () => {
mockApi.wasm.process.mockResolvedValue({
ok: false,
outputPath: null,
metadataRemoved: null,
outputBytes: null,
metadataItems: [],
error: "Not a valid Office file",
});
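
Review note: with `metadataRemoved` gone, the choice between a normal completion and `'no-metadata-found'` has to be derived from `metadataItems`. The tests suggest that an empty list, or one holding only `kept` entries, counts as "nothing stripped". A minimal sketch of that predicate follows, assuming `modified` entries count as stripped; `hasStrippedMetadata` is a hypothetical name and may not match the helper used in `processFileEntries`.

```typescript
// Only the `action` field matters for this decision.
type MetadataAction = "removed" | "modified" | "kept";

// Hypothetical helper, not taken from the commit: true when at least one
// item was removed or modified, false for an empty or kept-only list.
function hasStrippedMetadata(
  items: readonly { action: MetadataAction }[],
): boolean {
  return items.some((item) => item.action !== "kept");
}
```

Under this reading, the mocked `metadataItems: []` response and a response containing only a preserved `Orientation` tag would both lead to `'no-metadata-found'`.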

@ -300,7 +300,7 @@ async function main(): Promise<void> {
const ourDefaultPath = `${SCRATCH}/our-default.jpg`;
writeFileSync(ourDefaultPath, Buffer.from(ourDefault.value.bytes));
console.log(
`Our strategy (default): ${ourDefaultPath} (${ourDefault.value.bytes.length} bytes, dropped ${ourDefault.value.metadataRemoved})`,
`Our strategy (default): ${ourDefaultPath} (${ourDefault.value.bytes.length} bytes, ${ourDefault.value.metadataItems.length} metadata items emitted)`,
);
// 3. Strip via our strategy — preserveOrientation: true.
@ -319,7 +319,7 @@ async function main(): Promise<void> {
const ourPreservePath = `${SCRATCH}/our-preserve-orientation.jpg`;
writeFileSync(ourPreservePath, Buffer.from(ourPreserve.value.bytes));
console.log(
`Our strategy (preserveOrientation=true): ${ourPreservePath} (${ourPreserve.value.bytes.length} bytes, dropped ${ourPreserve.value.metadataRemoved})`,
`Our strategy (preserveOrientation=true): ${ourPreservePath} (${ourPreserve.value.bytes.length} bytes, ${ourPreserve.value.metadataItems.length} metadata items emitted)`,
);
// 4. Strip via ExifTool.

@ -3,6 +3,6 @@ import { defineConfig } from "vitest/config";
export default defineConfig({
test: {
root: ".",
include: ["tests/**/*.test.ts"],
include: ["tests/**/*.test.{ts,tsx}"],
},
});