feat(wasm): FfmpegFallbackStrategy for MP4/MOV/M4V/MKV/WebM (#183)
All checks were successful
CI / Lint, Typecheck & Unit Tests (push) Successful in 29s
CI / Smoke build (VITE_ENABLE_FFMPEG_FALLBACK=false) (push) Successful in 44s
CI / E2E (Standalone single-file) (push) Successful in 1m33s
CI / E2E (Web) (push) Successful in 3m24s

Adds FfmpegFallbackStrategy as a peer to ExifToolFallbackStrategy, routing MP4/MOV/M4V (Phase 1) and MKV/WebM (Phase 2) through @ffmpeg/core. On by default for all three distributions (standalone HTML, Capacitor APK, PWA self-host); VITE_ENABLE_FFMPEG_FALLBACK=false opts out. Takes priority over VideoStrategy for the MP4 family; VideoStrategy stays registered as the opt-out fallback until a subsequent PR deletes it.

Closes #182. Closes #43.

Resolves the four documented walker KNOWN_GAPS categorically: handler-name leak (#38), compressor-name leak (#39), mvhd.next_track_id leak (#111), GPMF/GPS coordinates leak (#42). On gopro-fusion.mp4 (5.1 MB GPMF + tmcd + fdsc) and dji-phantom4.mov (236 MB UserData GPS log) the forensic battery reports zero device-fingerprint survival across every recovery technique.

Key architectural choices:

- **Main-thread @ffmpeg/core, not @ffmpeg/ffmpeg wrapper.** The wrapper hardcodes type:"module" Workers from Blob URLs, which fail silently under null-origin file:// in Chromium — the standalone build hung forever on every video strip. @ffmpeg/ffmpeg dropped from package.json.
- **Stream mapping -map 0 -map -0:d? -map -0:s? -map -0:t?**. Preserves input track order while dropping data/subtitle/timecode streams. Avoids the eng→und reorder bug of -map 0:v?/-map 0:a?, and sidesteps mat2's exit-234 on action-cam files (GoPro Fusion has tmcd/fdsc).
- **Post-strip pass rewrites the udta box type to 'free'** (ISO/IEC 14496-12 §8.1.2 padding) to neutralise ffmpeg's hardcoded HandlerType:Metadata + HandlerVendorID:Apple stub. Length-preserving so stco/co64 offsets stay valid. Handles both regular and largesize headers via headerStart+4.
- **mdhd.language left as ffmpeg's 'und'** — considered zeroing but reverted: 0x0000 is an invalid ISO 639-2/T code, ffprobe falls back to displaying '(eng)' for invalid codes (actively misleading downstream tools).
- **Diff race fix.** @uswriting/exiftool's parseMetadata uses module-level singletons (Perl, MemoryFS, stdout/stderr StringBuilders). WasmProcessor now serializes all diff builds across the processor's lifetime via a Promise chain — guarantees no two parseMetadata calls overlap, whether within an entry or across the fire-and-forget chunk-drained queue.
- **ExifTool family-1 group names surfaced verbatim** — IFD0, ExifIFD, XMP-dc, Track1, etc. Refuses to collapse to umbrella labels like 'EXIF' because the collapse caused (source, name) key collisions across sub-groups (Track1:HandlerType vs Track2:HandlerType produced spurious diffs on multi-track MP4).
- **Standalone HTML stays single-file.** Two-asset Vite plugin gzip+base64-inlines ffmpeg-core.js + ffmpeg-core.wasm into <script type=text/plain> tags, mirroring the zeroperl pattern. With tree-shaking via __WITH_STANDALONE_INLINE__ the standalone HTML went 116MB → 24MB.
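
The diff race fix in the list above can be pictured as a small promise-chain queue. This is a minimal sketch under stated assumptions, not the actual `WasmProcessor` code: `SerialQueue` and `run` are hypothetical names introduced for illustration only.

```typescript
// Hypothetical helper (not from the codebase): runs async tasks strictly
// one at a time, in the order they were submitted.
class SerialQueue {
	private tail: Promise<unknown> = Promise.resolve();

	run<T>(task: () => Promise<T>): Promise<T> {
		// Start the task only after everything queued before it has settled.
		const result = this.tail.then(() => task());
		// Keep the chain alive even if this task rejects, so one failed
		// diff build does not block the ones queued behind it.
		this.tail = result.catch(() => undefined);
		return result;
	}
}
```

Because every call appends to the same chain, two tasks can never overlap, whether they are submitted back to back or from an unrelated fire-and-forget path.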

Forensic verification: docs/forensic/ffmpeg-fallback.md + tools/forensic/ffmpeg-fallback.ts cover synthetic-mp4/mkv/webm + phone-baseline (2.7MB Android) + gopro-fusion (5MB action-cam) + dji-phantom4 (236MB drone) with zero sentinel/fingerprint survival across the recovery battery. Gap analyses for all three formats at docs/gap-analysis/mp4-ffmpeg.md, mkv.md, webm.md. POC at docs/poc/ffmpeg-wasm.md.

Production deps go from 5 → 6: @ffmpeg/core@0.12.10 (GPL-2.0-or-later; combined distributable inherits, MIT codebase unchanged, source pointer in README per GPL compliance).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
forgejo_admin 2026-05-22 15:04:04 +04:00
parent b2dec037a8
commit a5546afa71
30 changed files with 2873 additions and 152 deletions


@ -39,6 +39,34 @@ jobs:
- name: Check circular dependencies (madge)
run: yarn check:deps
build-no-ffmpeg:
# Smoke check the opt-out path. When VITE_ENABLE_FFMPEG_FALLBACK=false,
# the ffmpeg strategy is omitted at registration time and VideoStrategy
# handles MP4/MOV/M4V. Verifies the bundle still builds without the
# ffmpeg engine in the strategy chain (issue #182).
name: Smoke build (VITE_ENABLE_FFMPEG_FALLBACK=false)
needs: test
runs-on: ubuntu-latest
env:
VITE_ENABLE_FFMPEG_FALLBACK: 'false'
steps:
- uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 22
cache: 'yarn'
- name: Install dependencies
run: yarn install --frozen-lockfile
- name: Build web (opt-out)
run: yarn build:web
- name: Build standalone (opt-out)
run: yarn build:web:standalone
e2e-web:
name: E2E (Web)
needs: test


@ -9,7 +9,7 @@ Privacy-focused metadata stripper. **Primary distributions: desktop offline stan
- **Language**: TypeScript 5.7 with `strict: true` + `verbatimModuleSyntax: true` (type-check only; Vite/esbuild compile)
- **Build**: `vite` 7.x — `vite.config.web.standalone.ts` produces the primary desktop output (`dist/web-standalone/index.html`, single-file inlined). `vite.config.web.ts` produces `dist/web/`, used as the source for the Android APK (via Capacitor `cap sync android`) and for the self-host PWA path. `dist/web/` is not a primary user-facing distribution by itself.
- **Processing engine**: hand-rolled WASM/pure-TS `FormatStrategy` implementations registered in `src/infrastructure/wasm/strategy_registry.ts`. The registry is the sole authority for what is supported.
- **Production deps (4)**: `jszip` (Office), `pdf-lib` (PDF), `react` + `react-dom` (UI).
- **Production deps (6)**: `@ffmpeg/core` (FfmpegFallbackStrategy strip engine for MP4/MOV/MKV/WebM), `@uswriting/exiftool` (ExifToolFallbackStrategy strip + ExifToolDiffStrategy read engine), `jszip` (Office), `pdf-lib` (PDF), `react` + `react-dom` (UI).
- **Performance is sacred**: the app should process hundreds of files in seconds. Never add sync I/O in the loop or heavy DOM operations per row.
## Commands
@ -151,13 +151,15 @@ Root configs: `.prettierrc` (tabs), `.gitattributes` (`* text=auto eol=lf`), `vi
## Dependencies
### Production (4)
### Production (6)
| Package | Purpose |
| --- | --- |
| `jszip` | ZIP archive read/write for the Office strategy (DOCX/XLSX/PPTX/ODT) + batch zip output |
| `pdf-lib` | PDF metadata stripping |
| `react`, `react-dom` | UI |
| `@ffmpeg/core` | Single-threaded ffmpeg-wasm; the FfmpegFallbackStrategy's strip engine for MP4/MOV/MKV/WebM (#182). GPL-2.0-or-later; combined distributable inherits. |
| `@uswriting/exiftool` | WebPerl-ExifTool wrapper; the ExifToolFallbackStrategy strip engine for WebP/GIF/AVIF + the ExifToolDiffStrategy read engine for the before/after diff feature. |
| `jszip` | Office archive read/write (DOCX/XLSX/PPTX/ODT) + batch zip output. |
| `pdf-lib` | PDF metadata stripping. |
| `react`, `react-dom` | UI. |
### Dev
@ -181,7 +183,7 @@ Root configs: `.prettierrc` (tabs), `.gitattributes` (`* text=auto eol=lf`), `vi
- **Naming**: snake_case for filenames, camelCase for functions/variables, PascalCase for React components.
- **CSS**: BEM (mandated for all new CSS).
- **Fonts**: system stack only (`system-ui, -apple-system, BlinkMacSystemFont, ...`). No web font downloads, no bundled fonts.
- **Dependencies**: prefer hand-rolling. Four production deps is the current ceiling; new deps need explicit justification.
- **Dependencies**: prefer hand-rolling. Current count is 6 production deps; new deps need explicit justification.
- **Error handling**: throw `Error` objects; surface errors via `Result<T, E>` shapes (see typescript-conventions.md).
- **i18n**: add translations to `.resources/strings.json`.
- **Performance is sacred**: see Tech Stack. Batch operations should feel instant.


@ -196,3 +196,16 @@ The codebase has been substantially rewritten since:
All upstream contributors are credited in the original [ExifCleaner README](https://github.com/szTheory/exifcleaner#contributors). MIT license preserved throughout.
## Third-party engines and license notices
MetaScrub bundles two upstream WebAssembly engines as build-time-optional fallback strategies. Both default to **on** for the standalone HTML and Android APK distributions; set the corresponding env var to `false` at build time to opt out and omit the engine.
| Engine | Used for | Build flag (env) | License | Source |
|---|---|---|---|---|
| [ffmpeg-wasm](https://github.com/ffmpegwasm/ffmpeg.wasm) (`@ffmpeg/core`) | MP4 / MOV / M4V / MKV / WebM strip via `FfmpegFallbackStrategy` (#182) | `VITE_ENABLE_FFMPEG_FALLBACK` | `@ffmpeg/core`: **GPL-2.0-or-later** (the WASM build includes GPL components from upstream ffmpeg). Loaded directly on the main thread — no `@ffmpeg/ffmpeg` wrapper. | <https://github.com/ffmpegwasm/ffmpeg.wasm> |
| [WebPerl ExifTool](https://github.com/6over3/zeroperl-ts) (`@uswriting/exiftool` + `@6over3/zeroperl-ts`) | WebP / GIF / AVIF strip via `ExifToolFallbackStrategy` (#174); diff via `ExifToolDiffStrategy` (#177) | `VITE_ENABLE_EXIFTOOL_FALLBACK` | Apache-2.0 (wrappers). ExifTool itself is GPL or Artistic. | <https://github.com/6over3/zeroperl-ts> · <https://exiftool.org/> |
**GPL-2.0 implication for ffmpeg**: distributions of MetaScrub that include `ffmpeg-core.wasm` (the default for standalone HTML + APK builds) are subject to GPL-2.0 for the combined work. Our codebase remains MIT (no GPL source is copied into our source tree), but the combined binary distribution must comply with GPL-2.0's source-availability requirement. That requirement is met by linking to <https://github.com/ffmpegwasm/ffmpeg.wasm> — the upstream is fully open and we pin specific versions in `package.json` (recoverable from `git log` plus the lockfile).
Builds with `VITE_ENABLE_FFMPEG_FALLBACK=false` omit the ffmpeg engine from the strategy chain (VideoStrategy handles MP4/MOV/M4V; MKV/WebM become unsupported). The combined binary in that mode contains no GPL-licensed code.
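
The default-on, opt-out behaviour described above can be sketched as follows. This is an illustration only: `isEngineEnabled` and `mp4StrategyChain` are hypothetical names, and treating only the literal string `false` as the opt-out value is an assumption, not confirmed registry code.

```typescript
// Assumption: only the exact string "false" disables an engine; an unset
// or any other value leaves it on (the documented default).
function isEngineEnabled(flag: string | undefined): boolean {
	return flag !== "false";
}

// Hypothetical view of the MP4-family strategy order. With ffmpeg enabled it
// takes priority and VideoStrategy stays behind it; with ffmpeg disabled,
// VideoStrategy handles MP4/MOV/M4V on its own.
function mp4StrategyChain(ffmpegFlag: string | undefined): string[] {
	return isEngineEnabled(ffmpegFlag)
		? ["FfmpegFallbackStrategy", "VideoStrategy"]
		: ["VideoStrategy"];
}
```

In a Vite build the flag would typically be read from the build-time environment and passed in, which keeps the decision testable without a bundler.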


@ -0,0 +1,112 @@
# FfmpegFallbackStrategy forensic recovery test
**Date:** 2026-05-21
**Goal:** Verify that metadata stripped by `FfmpegFallbackStrategy` (#182) cannot be recovered by standard forensic tooling. Cover synthetic MP4 (every metadata source seeded with sentinels via exiftool), synthetic MKV/WebM (sentinels seeded via `ffmpeg -metadata`), and real-world fixtures (`phone-baseline.mp4`, `gopro-fusion.mp4`, and the opt-in `dji-phantom4.mov`). Compare against the gap-analysis policy in [`docs/gap-analysis/mp4-ffmpeg.md`](../gap-analysis/mp4-ffmpeg.md).
**Reproducible at:** [`tools/forensic/ffmpeg-fallback.ts`](../../tools/forensic/ffmpeg-fallback.ts) — `npx tsx tools/forensic/ffmpeg-fallback.ts` from the project root.
## Methodology
The runner replicates the strategy's strip invocation against `@ffmpeg/core` 0.12.10 directly. The strategy class loads the same `@ffmpeg/core` build on the browser main thread (the `@ffmpeg/ffmpeg` wrapper was dropped from `package.json` in this change), so the runner exercises the engine + arg vector rather than the strategy's browser-side loading code. What matters forensically is what ffmpeg's demuxer/muxer pair does, not the loader around it.
Strip command (matches `FfmpegFallbackStrategy.strip`):
```
ffmpeg -i in.mp4 \
-map 0 -map -0:d? -map -0:s? -map -0:t? \
-map_metadata -1 -map_chapters -1 \
-fflags +bitexact \
-c copy \
-movflags +faststart \
-metadata "encoder=" \
out.mp4
```
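
For readers wiring the same invocation into code, the command above maps onto an argument vector roughly as follows. This is a sketch, not the strategy's source: `buildStripArgs` and the `Container` type are hypothetical, and omitting `-movflags +faststart` for Matroska follows the MKV gap analysis in this change.

```typescript
type Container = "mp4" | "matroska";

// Hypothetical helper: returns the ffmpeg argv (without the program name)
// for a metadata-stripping stream-copy remux.
function buildStripArgs(input: string, output: string, container: Container): string[] {
	const args = [
		"-i", input,
		// Keep every stream, then remove data, subtitle and attachment streams.
		"-map", "0", "-map", "-0:d?", "-map", "-0:s?", "-map", "-0:t?",
		// Drop global/per-stream metadata and chapters.
		"-map_metadata", "-1", "-map_chapters", "-1",
		"-fflags", "+bitexact",
		"-c", "copy",
	];
	// +faststart is an MP4-muxer option, so it is only added for MP4-family output.
	if (container === "mp4") args.push("-movflags", "+faststart");
	// The shell form `-metadata "encoder="` becomes the single argv entry "encoder=".
	args.push("-metadata", "encoder=", output);
	return args;
}
```

Passing the arguments as an array avoids shell quoting entirely, which matters for the `?` suffixes and the empty `encoder=` value.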
## Fixtures + results
| Fixture | Bytes (in → out) | Sentinels (in → out) | Verdict |
|---|---:|---:|---|
| `synthetic-mp4` (1 s blue frame, seeded with `Title`, `Author`, `Comment`, `Encoder`, `Description` via exiftool) | 5 647 → 2 238 | 5 → **0** | ✓ clean |
| `synthetic-mkv` (1 s green frame; sentinels seeded via `ffmpeg -metadata`) | 1 991 → 1 761 | 4 → **0** | ✓ clean |
| `synthetic-webm` (1 s green frame; sentinels seeded via `ffmpeg -metadata`) | 1 190 → 991 | 4 → **0** | ✓ clean |
| `phone-baseline.mp4` (samplelib, 2.7 MB modern Android) | 2 848 208 → 2 848 111 | n/a (no seeded sentinels; checks device fingerprints) | ✓ clean |
| `gopro-fusion.mp4` (gpmf-parser repo, 5.1 MB, GPMF + tmcd + fdsc streams) | 5 377 407 → 5 000 118 | 7 device fingerprints (`GoPro AVC`, `gpmd`, `GoPro AAC`, `GoPro TCD`, `GoPro MET`, `GoPro SOS`, `Fusion`) → **0** | ✓ clean |
| `dji-phantom4.mov` (Zenodo record 3604005, 248 MB, opt-in via `--include-large`) | 248 007 654 → 247 901 513 | 5 device fingerprints (`FC6310` drone model, `AVC encoder`, `DJI.AVC`, `DJI.Meta`, `55 deg` GPS lat) → **0** | ✓ clean |
DJI Phantom 4 also carries the **full GPS flight log** under `[UserData] GPSCoordinates`: `55 deg 30' 26.75" N, 10 deg 43' 3.01" E, 10.8 m Above Sea Level` in the input. `exiftool -G1 -a -s` on the stripped output returns no GPS fields whatsoever; the entire UserData block is dropped by `-map_metadata -1`.
## Performance
End-to-end timings, captured on the development host (single Node process, warm module cache):
| Fixture | Size | Wall time | Peak RSS |
|---|---:|---:|---:|
| Each synthetic (mp4 / mkv / webm) | < 6 KB | < 100 ms | ~250 MB |
| phone-baseline.mp4 | 2.7 MB | ~60 ms | ~500 MB |
| gopro-fusion.mp4 | 5.1 MB | ~95 ms | ~800 MB |
| **dji-phantom4.mov** | **236 MB** | **~1.0 s** | **~1.2 GB** |
| Full battery (6 fixtures) | ~250 MB total | **4.2 s** | ~1.7 GB |
The DJI run is the strongest real-world data point: a 236 MB file stripped in roughly one second wall time, peak memory at ~5× input size. WASM linear memory caps at 4 GB, so desktop browsers comfortably handle files up to ~700 MB; mobile WebView with tighter memory ceilings hits ~250 MB as the practical limit — same constraint the walker has today, documented under `#34` (streaming I/O follow-up).
**Recovery battery applied to each output:**
1. `strings | grep <sentinel>` — direct byte-level survival check
2. Device-fingerprint strings from real-world fixture manifests — same byte-level scan against the manifested fingerprints
The runner exits with code 0 iff every fixture comes back clean (zero sentinel + zero fingerprint survival) and no unexpected `KNOWN_GAPS` are surfaced. Phase 1 lands with `KNOWN_GAPS` **empty**.
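
The byte-level survival check in the battery above can be approximated in a few lines. This is a sketch of the idea, not the runner's actual code: `indexOfBytes` and `findSurvivors` are hypothetical names, and the scan only looks for the UTF-8 encoding of each sentinel (a UTF-16 copy of the same string would not be caught).

```typescript
// Naive byte search; adequate for small needles and illustration purposes.
function indexOfBytes(haystack: Uint8Array, needle: Uint8Array): number {
	if (needle.length === 0) return 0;
	outer: for (let i = 0; i + needle.length <= haystack.length; i++) {
		for (let j = 0; j < needle.length; j++) {
			if (haystack[i + j] !== needle[j]) continue outer;
		}
		return i;
	}
	return -1;
}

// Returns the sentinels / fingerprint strings still present in the output bytes.
// An empty result corresponds to a "clean" verdict for that fixture.
function findSurvivors(output: Uint8Array, sentinels: string[]): string[] {
	const encoder = new TextEncoder();
	return sentinels.filter((s) => indexOfBytes(output, encoder.encode(s)) !== -1);
}
```

A runner built on this would report clean only when `findSurvivors` returns an empty array for every fixture.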
## Comparison vs. current VideoStrategy walker
From `docs/forensic/video.md`, the walker has documented KNOWN_GAPS on the same fixtures:
| Channel | Walker on this fixture | ffmpeg-wasm |
|---|---|---|
| `HDLR_NAME_VIDEO` / `GoPro AVC` / etc. (#38) | LEAKED | Removed |
| `COMPRESSORNAME` (#39) | LEAKED | Removed |
| `gpmd-magic` / GoPro device strings | LEAKED on real-world | Removed |
| `mvhd.next_track_id` (#111) | LEAKED | Rewritten by ffmpeg muxer |
| `GPMF` GPS coordinates | LEAKED | Removed (gpmd track dropped) |
Categorical improvement on all known-gap channels for the formats claimed in this PR.
## Comparison vs. mat2 (= ffmpeg `-codec copy -map_metadata -1` with default `-map 0`)
mat2's invocation fails on `gopro-fusion.mp4` (exit 234 from `Could not find tag for codec none in stream #2, codec not currently supported in container` — the `tmcd` and `fdsc` data streams). Our `-map 0 -map -0:d? -map -0:s? -map -0:t?` choice drops those streams entirely, sidestepping the muxer's codec-tag refusal and producing a clean output. mat2 produces no output on this fixture; we produce 5 MB clean output.
## Post-strip rewrite of ffmpeg's udta stub
The strategy runs a post-strip pass (`cleanFfmpegMp4Output` in `src/infrastructure/wasm/strategies/ffmpeg_post_strip.ts`) over each MP4 output before returning it. The runner above replicates the raw ffmpeg invocation only — the post-strip pass is unit-tested separately. Two policy changes have settled into the pass (both verified against the strategy's e2e + `phone-baseline.mp4` direct exiftool dumps):
| Surface | Behaviour | Why |
|---|---|---|
| `moov/udta` and `moov/trak/udta` | Box type rewritten to `free` in place (length-preserving). | ffmpeg's MP4 muxer writes a 0x21-byte `udta/meta/hdlr` block unconditionally; handler_type is `mdir` (exiftool: `HandlerType: Metadata`) and vendor is hardcoded `appl` (exiftool: `HandlerVendorID: Apple`). Renaming to `free` (ISO/IEC 14496-12 §8.1.2 padding) makes every spec-conformant reader skip the contents. Length-preserving so `stco`/`co64` chunk offsets stay valid. |
| `moov/trak/mdia/mdhd.language` | Left as ffmpeg's default `und` (0x55C4). | We considered zeroing it to suppress the `MediaLanguageCode` diff row but reverted: 0x0000 is an invalid ISO 639-2/T code; ffprobe falls back to `(eng)`, actively misleading downstream tools. `und` is the spec's canonical "no language specified" marker; every reader handles it predictably. When the input had a real language the resulting `eng → und` diff row is honest information — we removed the user's language tag. |
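
The `und` value quoted in the table can be checked by hand. ISO/IEC 14496-12 packs the `mdhd` language as three 5-bit values, each being the lowercase ASCII letter minus 0x60. The helper names below are hypothetical; this is a sketch for verifying the arithmetic, not code from the strategy.

```typescript
// Pack a three-letter lowercase ISO 639-2/T code into the 15-bit mdhd field.
function packLanguage(code: string): number {
	if (!/^[a-z]{3}$/.test(code)) throw new Error("expected three lowercase letters");
	return code
		.split("")
		.reduce((acc, ch) => (acc << 5) | (ch.charCodeAt(0) - 0x60), 0);
}

// Unpack the 15-bit field back into three characters.
function unpackLanguage(packed: number): string {
	return [10, 5, 0]
		.map((shift) => String.fromCharCode(((packed >> shift) & 0x1f) + 0x60))
		.join("");
}
```

With these, `packLanguage("und")` gives 0x55C4, and unpacking 0x0000 yields three 0x60 characters (backticks), none of which is a letter. That is why a zeroed field is not a valid language code.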
Verification of `udta → free` on `phone-baseline.mp4` (full exiftool group-1 dump):
- **Before** the pass: exiftool surfaces `HandlerType: Metadata` and a (zero) `HandlerVendorID` from the udta block.
- **After** the pass: neither row appears. The per-track `Handler Type: Video Track` / `Audio Track` rows that remain are mandatory mdia/hdlr — legitimate track descriptors, not muxer-added.
ffprobe regression check: still parses the file normally, frame counts unchanged, byte length unchanged. The renamed `free` box sits in moov and is skipped by parsers.
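
The in-place rename can be illustrated with a small box walker. This is a simplified sketch under stated assumptions, not the real `cleanFfmpegMp4Output`: `renameUdtaToFree` is a hypothetical name, only `moov` and `moov/trak` are descended into, and malformed boxes simply stop the walk.

```typescript
// Renames moov/udta and moov/trak/udta boxes to "free" in place.
// Returns how many boxes were renamed. The buffer length never changes.
function renameUdtaToFree(buf: Uint8Array, start = 0, end = buf.length, insideMoov = false): number {
	const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
	let renamed = 0;
	let pos = start;
	while (pos + 8 <= end) {
		let size = view.getUint32(pos);
		// The 4-byte type always sits at headerStart + 4, for both header forms.
		const type = String.fromCharCode(buf[pos + 4], buf[pos + 5], buf[pos + 6], buf[pos + 7]);
		let headerLen = 8;
		if (size === 1) {
			// largesize form: a 64-bit size follows the type field.
			if (pos + 16 > end) break;
			size = view.getUint32(pos + 8) * 2 ** 32 + view.getUint32(pos + 12);
			headerLen = 16;
		} else if (size === 0) {
			size = end - pos; // box runs to the end of the enclosing range
		}
		if (size < headerLen || pos + size > end) break; // malformed: stop walking
		if (type === "udta" && insideMoov) {
			buf.set([0x66, 0x72, 0x65, 0x65], pos + 4); // ASCII "free"
			renamed++;
		} else if (type === "moov" || (insideMoov && type === "trak")) {
			renamed += renameUdtaToFree(buf, pos + headerLen, pos + size, true);
		}
		pos += size;
	}
	return renamed;
}
```

Only four type bytes are overwritten per box, so every size field and every `stco`/`co64` chunk offset remains valid.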
## Caveats / scope
- **Synthetic fixture only seeds container-level metadata** via exiftool. The synthetic does not exercise `tmcd`/`fdsc`/`gpmd` codec-tag refusal; the real-world `gopro-fusion.mp4` does. Future expansion: add a synthetic with a fake `gpmd` stream so the gpmd-drop path is exercisable in air-gapped CI without the 5 MB fixture download.
- **Forensic runner does not invoke the post-strip pass directly** — only the ffmpeg invocation. The pass has its own unit tests (`tests/infrastructure/wasm/ffmpeg_post_strip.test.ts`) covering top-level udta rename, per-track udta rename, length preservation, and the no-udta no-op. The e2e helper (`tests/e2e/web/helpers/metadata_assertions.ts`) additionally asserts the `00 00 00 21 75 64 74 61` udta box-header signature does not survive in the stripped output bytes. Wiring the post-strip pass into the forensic runner is a follow-up.
- **DJI Phantom 4 fixture not run** in default battery (248 MB download; opt-in via `--include-large` on the fetch script). DJI behaviour matches GoPro Fusion (same `tmcd`/`gpmd` story); covered by the inherent `-map 0 -map -0:d? -map -0:s? -map -0:t?` policy.
- **MKV/WebM exercised on synthetic fixtures only.** The battery covers `synthetic-mkv` and `synthetic-webm` (Phase 2 of #182); real-world MKV/WebM fixtures are a follow-up.
- **Sidecar files** (`.SRT`, `.LRV`, `.THM`, `.LRF`) — out of scope here per #46; no in-file strategy addresses them.
## Reproducing
```bash
# One-time fixture setup
./tools/forensic/fetch-video-fixtures.sh # ~8 MB (phone + gopro)
# Run the battery
npx tsx tools/forensic/ffmpeg-fallback.ts
```
Required tools (host): `ffmpeg`, `exiftool` (for synthetic fixture seeding only — the strip phase runs entirely in-WASM). Optional: `node --permission --allow-fs-read='*' --allow-fs-write='*' --allow-child-process …` to confirm no network capability is exercised end-to-end.
Output: stdout summary + `/tmp/ffmpeg-fallback-forensic/report.json` with per-fixture sentinel/fingerprint survival data.

docs/gap-analysis/mkv.md Normal file

@ -0,0 +1,93 @@
# MKV (Matroska) — ffmpeg-wasm strategy gap analysis
**Date:** 2026-05-21
**Goal:** Document the per-source policy for `FfmpegFallbackStrategy` on Matroska (`.mkv`) inputs. Phase 2 of issue #182. Compare against ExifTool's `-all=` (limited writer support), mat2, and the (deferred) hand-rolled EBML walker approach in `docs/superpowers/plans/2026-05-05-v5-mkv-webm-avi-strategy.md`.
---
## Methodology
- POC writeup: [`docs/poc/ffmpeg-wasm.md`](../poc/ffmpeg-wasm.md) — same engine + same invocation as MP4
- ffmpeg source `libavformat/matroskaenc.c` — matroska muxer behaviour reference
- EBML / Matroska specifications — element registry, semantics
- Empirical test: synthetic `.mkv` generated via `ffmpeg -f lavfi -i color=…`, seeded with `title`, `description`, `comment`, `encoder` metadata via `-metadata`, then stripped via `FfmpegFallbackStrategy` (the same `@ffmpeg/core` invocation)
- Out of band: `exiftool -all=` writes to MKV with **limited** support (refuses some chapter / attachment removals), which is the original reason `VideoStrategy` was scoped as MP4-only and MKV was deferred
The starting policy is the same invocation as MP4, minus `-movflags +faststart` (which is MP4-specific and the matroska muxer ignores with a warning).
---
## Per-source policy
### Track-level (the `-map` axis)
| Source track | Carries privacy data? | Policy | Reasoning |
|---|---|---|---|
| Video tracks | No (frames only) | **Keep (implicit `-map 0`)** | Content |
| Audio tracks | No (samples only) | **Keep (implicit `-map 0`)** | Content |
| Subtitle tracks (`S_TEXT/SRT`, `S_TEXT/UTF8`, `S_TEXT/SSA`, `S_HDMV/PGS`) | Possibly | **Drop** | Most MKV subtitles are content (legit captions, fansubs). But MKV is also the container of choice for some action-cam concatenation tools that fold sidecar SRT into the file — those SRTs carry GPS. Conservative: drop. Legitimate-subtitle edge case flagged for follow-up UX work. |
| Attachments (fonts, cover art, mtree), which ffmpeg exposes as attachment streams | **Yes** | **Drop (`-map -0:t?`)** | Attachments routinely include EXIF-tainted cover art JPEGs and fonts that fingerprint the muxing environment. |
| Chapter tracks | User-authored text | **Dropped via `-map_chapters -1`** | Chapter titles leak project / film names. |
### Container-level metadata
| Source | Lives in | mat2 / ffmpeg default | Our policy | Notes |
|---|---|---|---|---|
| `\Segment\Info\Title` | EBML | Dropped via `-map_metadata -1` | **Dropped** | User-facing title. |
| `\Segment\Info\MuxingApp` | EBML | Rewritten by matroska muxer to `Lavf<version>` | **Suppressed via `-metadata encoder=`** | Strip-tool fingerprint per privacy-invariants §6. |
| `\Segment\Info\WritingApp` | EBML | Rewritten to muxer default | **Accept default** | Less-fingerprint-y than preserving original (could be device-specific). |
| `\Segment\Info\DateUTC` | EBML | Zeroed by `-fflags +bitexact` | **Zeroed** | Privacy-invariants §6 epoch policy. |
| `\Segment\Info\SegmentUID` | EBML | Re-randomised by muxer | **Accept default** | UID changes per-mux — not a stable fingerprint, but worth confirming in forensic verification. |
| `\Segment\Tags` (per-track + global) | EBML | Dropped via `-map_metadata -1` | **Dropped** | Title, artist, copyright, comment, etc. |
| Per-track `Name`, `Language`, `CodecPrivate` | EBML | Mostly preserved; `Name` dropped via `-map_metadata -1`? **Partial** | **Best-effort drop** | Not yet confirmed against real output bytes; see the limitations section below. |
---
## Honest gap summary
### vs. ExifTool standalone
ExifTool's MKV writer is partial — it can read but cannot delete several elements (chapters, attachments). Our ffmpeg-wasm path drops these categorically by re-muxing without them. **Better than ExifTool** on MKV.
### vs. mat2 (`ffmpeg -map 0 -codec copy -map_metadata -1`)
Should behave equivalently on standard MKV (same engine + similar invocation). The `-map 0 -map -0:d? -map -0:s? -map -0:t?` selection protects against the same data-stream codec-tag issue MP4 has, though it's less common in MKV.
### Empirical verification
Synthetic MKV with 4 seeded sentinels (`title`, `description`, `comment`, `encoder`) → 0 survivors. Documented in `tools/forensic/ffmpeg-fallback.ts` Phase 2 row.
### Deferred / out of scope
- **Attachment-track scrubbing of embedded JPEGs/fonts.** Our policy drops the attachment tracks entirely, which is sufficient — but if a user *wants* to keep an attached subtitle font, we currently can't preserve it without also keeping its potential EXIF leaks.
- **Real-world MKV fixtures** — the forensic runner only covers a synthetic fixture today. Real-world MKV testing depends on having a representative MKV in `tests/fixtures/wasm/video/real-world/`. Tracked as a follow-up.
---
## Limitations / unaudited matroska muxer fingerprints
This PR (Phase 2 of #182) does **not** add a matroska post-strip pass. `FfmpegFallbackStrategy.strip()` runs `cleanFfmpegMp4Output()` (in `src/infrastructure/wasm/strategies/ffmpeg_post_strip.ts`) only for MP4-family containers — matroska output is whatever ffmpeg's muxer wrote, unaudited byte-for-byte.
Candidate fingerprints ffmpeg's matroska muxer is known to write into `\Segment\Info`:
- `MuxingApp` — typically `"Lavf<version>"`; explicit muxer fingerprint
- `WritingApp` — sometimes `"Lavf<version>"`, sometimes inherited from the source
- `SegmentUID` — fresh random per mux (not a stable fingerprint, but a per-output identifier worth confirming)
- `DateUTC` — should be epoch under `-fflags +bitexact`, worth confirming empirically
Some of these *should* be suppressed by our existing `-metadata encoder=` and `-fflags +bitexact` flags (see the per-source policy table above), but this has not been verified against the actual byte output. The current forensic battery checks for seeded `-metadata` sentinels, not for muxer-injected fingerprints.
This is also a hole in `assertVideoStripped()` in `tests/e2e/web/helpers/metadata_assertions.ts` — that helper checks for `mdirappl`, `btrt`, `VideoHandler`, `SoundHandler` (all MP4-family). Running it against an MKV/WebM output would pass even if `Lavf<version>` leaked.
Deferred because: (1) scope discipline for Phase 2 of #182, (2) no real-world MKV/WebM fixture exists in `tests/fixtures/wasm/video/real-world/` yet — real-world fixture work is already flagged as a follow-up in the PR description, so empirical muxer-fingerprint verification can't be wired into CI today.
**Suggested follow-up:**
1. Extend `assertVideoStripped()` (`tests/e2e/web/helpers/metadata_assertions.ts`) with a matroska branch that checks the output bytes for `Lavf`, `MuxingApp`, and `WritingApp` substrings.
2. If any leak, add an EBML post-strip pass — `cleanFfmpegMatroskaOutput()` in `ffmpeg_post_strip.ts`, paralleling the existing MP4 helper — to zero the offending elements in place.
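
One possible shape for the first follow-up is sketched below. EBML stores element IDs rather than element names, so the literal text `MuxingApp` never appears in the file; the sketch instead looks for the Matroska IDs for `MuxingApp` (0x4D80) and `WritingApp` (0x5741) and decodes the string that follows. `findAppStrings` is a hypothetical name, and the scan is a heuristic: it only handles one-byte sizes and could match coincidental bytes in media payloads.

```typescript
interface AppString {
	element: string;
	value: string;
}

const APP_ELEMENTS: Array<{ name: string; id: [number, number] }> = [
	{ name: "MuxingApp", id: [0x4d, 0x80] },
	{ name: "WritingApp", id: [0x57, 0x41] },
];

// Heuristic scan: finds ID bytes followed by a one-byte EBML size and
// decodes the payload as UTF-8. Not a full EBML parser.
function findAppStrings(bytes: Uint8Array): AppString[] {
	const found: AppString[] = [];
	const decoder = new TextDecoder();
	for (let i = 0; i + 3 <= bytes.length; i++) {
		for (const { name, id } of APP_ELEMENTS) {
			if (bytes[i] !== id[0] || bytes[i + 1] !== id[1]) continue;
			const sizeByte = bytes[i + 2];
			// One-byte size: marker bit 0x80 set; 0xFF means "unknown size".
			if ((sizeByte & 0x80) === 0 || sizeByte === 0xff) continue;
			const len = sizeByte & 0x7f;
			const start = i + 3;
			if (start + len > bytes.length) continue;
			found.push({ element: name, value: decoder.decode(bytes.subarray(start, start + len)) });
		}
	}
	return found;
}
```

An assertion helper could then fail whenever any returned value contains `Lavf`, alongside a plain byte scan for the same substring.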
---
## Recommendation
Adopt the same invocation as MP4, minus `-movflags +faststart`. Phase 2 of #182 lands `.mkv` in the strategy's claim set. Forensic verification (`tools/forensic/ffmpeg-fallback.ts`) confirms zero sentinel survival on the synthetic MKV battery.


@ -0,0 +1,161 @@
# MP4 / MOV / M4V — ffmpeg-wasm strategy gap analysis
**Date:** 2026-05-21
**Goal:** Document the per-source policy for `FfmpegFallbackStrategy` on MP4/MOV/M4V inputs — what bytes get dropped, rewritten, or preserved when ffmpeg's `-codec copy` remux runs with the project's privacy-strip invocation. Covers Phase 1 of issue #182. Compare against the current `VideoStrategy` box-tree rewriter, ExifTool's `-all=`, and mat2's `-codec copy -map_metadata -1`.
---
## Methodology
- POC writeup: [`docs/poc/ffmpeg-wasm.md`](../poc/ffmpeg-wasm.md) — package install, bundle, privacy audit, functional + performance results
- ISO/IEC 14496-12:2022 (ISOBMFF) — handler type registry, box type definitions
- ffmpeg source `libavformat/movenc.c` — MP4 muxer behaviour reference
- Existing comparison battery: `docs/forensic/video.md` (synthetic + real-world fixtures, mat2 + ExifTool columns)
- Real-world POC runs on `phone-baseline.mp4` (2.7 MB samplelib), `gopro-fusion.mp4` (5.1 MB), `sample-fragmented.mp4` (fragmented MP4 with `moof`)
The starting policy is the invocation validated in the POC. As of the post-PR-#183 review, the stream selection was changed from `-map 0:v? -map 0:a?` to `-map 0 -map -0:d? -map -0:s? -map -0:t?` — the older form put all video streams first, then all audio, swapping track order for files where audio came first in the input. The new negative-selector form preserves input track order while still dropping data / subtitle / timecode streams. Full command:
```
ffmpeg -i in.mp4 -map 0 -map -0:d? -map -0:s? -map -0:t? \
-map_metadata -1 -map_chapters -1 -fflags +bitexact \
-c copy -movflags +faststart -metadata "encoder=" out.mp4
```
---
## Per-source policy
### Stream-level (the `-map` axis)
| Source stream type | Example handler | Carries privacy data? | Policy | Reasoning |
|---|---|---|---|---|
| `vide` (video) | Camera capture, screen recording | Frames themselves — no | **Keep — implicit `-map 0`** | Core content |
| `soun` (audio) | Microphone, system audio | Audio samples — no | **Keep — implicit `-map 0`** | Core content |
| `subt`, `text`, `sbtl` (subtitle / text) | DJI SRT-style telemetry, captions | **Yes** (GPS in DJI; nothing in legit captions) | **Drop (`-map -0:s?`)** | Action-cam SRT is the worst real-world leak channel; legit captions are rare on MP4 (most subtitles ship as sidecars). Flagged as edge case below. |
| `tmcd` (timecode) | GoPro, DJI, dashcams | **Yes** (timecode = clock) | **Drop** | Was the GoPro Fusion default-`-map 0` failure trigger. Dropping it removes the leak *and* fixes the muxer error. |
| `fdsc` (description) | GoPro Fusion, some Sony | **Yes** (device fingerprint) | **Drop** | Same root cause as tmcd; codec `none` in MP4 muxer. |
| `meta` / `gpmd` (timed metadata) | GoPro GPMF | **Yes** (GPS, gyro, accelerometer) | **Drop** | The whole point of stripping action-cam footage. |
| `clcp` (closed caption) | Some broadcast / iMovie exports | Possibly | **Drop** | Rare; if legit content is here, user knows to keep their captions in a sidecar `.srt`. Edge case below. |
| `hint` (RTP hint track) | Server-streamed MP4 | No content; encoder fingerprint | **Drop** | Server-side artifact, not content. |
| Any other (`alis`, `url`, etc.) | Various | Sometimes | **Drop** | Aux references; content-free. |
The `-map 0 -map -0:d? -map -0:s? -map -0:t?` pattern keeps every stream by default, then explicitly removes data (`d`), subtitle (`s`), and attachment (`t`) streams. In ffmpeg's stream-specifier syntax `t` selects attachments; timecode (`tmcd`) and timed-metadata (`gpmd`) tracks are demuxed as data streams, so the `d` removal is what drops them. The trailing `?` makes each removal optional (no error if the type isn't present). Track order among surviving streams matches the input, so an audio-first input doesn't get reordered to video-first in the output. Audio-only and video-only files both work.
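
The ordering difference between the two `-map` forms can be modelled in a few lines. This is a toy model of ffmpeg's selection behaviour as described in this document, not ffmpeg code; the `Stream` shape and both function names are hypothetical.

```typescript
// Stream types follow ffmpeg's specifier letters:
// v = video, a = audio, s = subtitle, d = data, t = attachment.
type StreamType = "v" | "a" | "s" | "d" | "t";

interface Stream {
	index: number;
	type: StreamType;
}

// Models `-map 0 -map -0:d? -map -0:s? -map -0:t?`:
// start from all streams in input order, then remove the unwanted types.
function selectNegative(streams: Stream[]): number[] {
	const dropped: StreamType[] = ["d", "s", "t"];
	return streams.filter((s) => dropped.indexOf(s.type) === -1).map((s) => s.index);
}

// Models the older `-map 0:v? -map 0:a?`:
// all video streams first, then all audio streams.
function selectPositive(streams: Stream[]): number[] {
	const video = streams.filter((s) => s.type === "v");
	const audio = streams.filter((s) => s.type === "a");
	return video.concat(audio).map((s) => s.index);
}
```

For an audio-first input with a trailing data track, the negative form yields input indices `[0, 1]` while the older form yields `[1, 0]`, which is the reordering the change avoids.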
### Container-level metadata (the `-map_metadata` axis)
| Source | Lives in | mat2 / ffmpeg default | Our policy | Notes |
|---|---|---|---|---|
| `udta` children (`©nam`, `©ART`, `©cmt`, `©day`, `©too`, …) | `moov/udta` | Dropped via `-map_metadata -1` | **Dropped** | Apple-style four-cc tags. |
| `meta`/`keys`/`ilst` (iTunes-style key-value) | `moov/meta` | Dropped | **Dropped** | Where modern iOS writes most metadata. |
| `Xtra` (Windows Media key-value) | `moov/udta/Xtra` | Dropped | **Dropped** | Windows Media Player metadata. |
| XMP via `uuid` box | `moov/uuid` (well-known UUID) | Dropped | **Dropped** | Adobe XMP packet. |
| Vendor `uuid` boxes | `moov/uuid` | Dropped (unrecognized) | **Dropped** | DRM-allowlist boxes — for our threat model, irrelevant; user has the file anyway. |
| Per-stream metadata (`Stream Metadata: handler_name`, `encoder`, `vendor_id`) | Inside each `trak/mdia/hdlr`, `stsd` sample entries | Dropped via `-map_metadata -1`? **Partial** | See below | **Important caveat:** mat2 / `-map_metadata -1` does not always clear `handler_name` or sample-entry `compressorname`. ffmpeg's MP4 muxer rewrites these to defaults during remux (`VideoHandler`, `SoundHandler`), which is what the POC observed. |
POC observation on `gopro-fusion.mp4`: after our invocation, the stripped output shows generic `VideoHandler` / `SoundHandler` strings — **GoPro device names, GPMF magic, and the GoPro AVC encoder string are all gone**. This is the categorical improvement over the current `VideoStrategy` walker, which leaves these (#38, #39 KNOWN_GAPS).
### Container brand / ftyp normalisation
ffmpeg's MP4 muxer writes its own `ftyp` brand list and `mvhd` matrix. Defaults observed in POC:
| Field | ffmpeg default | Walker (today) | Privacy policy |
|---|---|---|---|
| `ftyp` major_brand | `isom` (with `-movflags +faststart`) | Preserved from input | **Accept ffmpeg default** — uniform across all inputs is less fingerprint-y than preserving device-specific brands (`mp42+qt ` = iPhone, `iso4` = Android stock, `3gp4` = older 3GPP). |
| `ftyp` compatible_brands | `[isom, iso2, avc1, mp41]` (typical) | Preserved | **Accept ffmpeg default** — same reasoning. |
| `mvhd` matrix | Identity matrix from ffmpeg | Preserved from input | Accept ffmpeg default; identity matrix is universal. |
| `mvhd.creation_time` / `mvhd.modification_time` | 0 with `-fflags +bitexact` | Walker zeroes these per privacy invariant §6 | ✅ Matches policy |
| `tkhd.creation_time` / `tkhd.modification_time` | 0 with `-fflags +bitexact` | Walker zeroes | ✅ Matches policy |
| `mdhd.creation_time` / `mdhd.modification_time` | 0 with `-fflags +bitexact` | Walker zeroes | ✅ Matches policy |
| `mvhd.next_track_id` | Set to count+1 by ffmpeg | LEAKED (#111) | **Rewritten by ffmpeg** — closes #111 categorically |
| `Stream Metadata: encoder` (Lavf string) | `Lavf61.7.100` (or whatever version is bundled in ffmpeg-core 0.12.10) | n/a (walker doesn't write this) | **Override to empty** via `-metadata "encoder="` — suppresses the strip-tool fingerprint per privacy-invariants §6 |
| `Stream Metadata: major_brand`, `compatible_brands` per-stream | ffmpeg writes these | n/a | Default; same reasoning as ftyp |
### Edits / chapters / sidx
| Source | mat2 / ffmpeg default | Our policy | Notes |
|---|---|---|---|
| `edts/elst` (edit list) | Preserved (`-c copy`) | **Preserved** | Edits affect playback semantics (trimmed clips). Walker treats edts as data, not metadata. |
| `moov/chap` (chapter references) | Dropped via `-map_chapters -1` | **Dropped** | Chapter names are user-authored text — can leak film-set names, project names, etc. |
| `sidx` (segment index, fragmented MP4) | Rewritten or dropped when defragmenting | **Categorically gone** — input fragmented MP4 becomes flat output | User has accepted defragmentation as a trade-off (per #182 discussion). |
| `mfra` (movie fragment random access) | Dropped during defragmenting | **Categorically gone** | Same as `sidx`. |
| `moof` / `traf` (fragmented MP4 fragments) | Rewritten as flat `moov/mdat` | **Dropped (defragmented)** | TRAF_META_FRAGMENT (#36) is gone categorically. |
---
## Honest gap summary
### Current `VideoStrategy` walker vs. ffmpeg
The walker has **6 active KNOWN_GAPS** (`docs/forensic/video.md`): handler names, compressor names, GoPro device strings, `mvhd.next_track_id`, fragmented-traf entries, and synthetic `mdat` orphans. ffmpeg closes **all six categorically** by re-writing the container from the stream tables rather than blanking boxes in-place.
The walker has **1 advantage**: byte-preservation of structure. `stco`/`co64`/`sidx` byte offsets that are valid in the input remain valid in the output. This is the property the project's current privacy-invariants §6 leans on for forensic-fidelity claims. ffmpeg cannot make this claim: the output is structurally different.
User has explicitly accepted this trade in #182: for the YouTube/Telegram/social upload audience (per project direction shift, the primary one), defragmentation and structural rewrites don't matter; the metadata removal does.
### ffmpeg vs. mat2 (`-codec copy -map_metadata -1`)
mat2's invocation is `-map 0` (no stream filter) + `-codec copy -map_metadata -1`. This *fails* on real-world action-cam files because the `tmcd` and `fdsc` data streams use codec `none` which the MP4 muxer rejects.
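Our invocation uses `-map 0 -map -0:d? -map -0:s? -map -0:t?`: keep every stream, then explicitly drop data streams (which is where the `tmcd` timecode and `fdsc` tracks land), subtitle streams, and attachments (`t` is ffmpeg's attachment specifier, not a timecode specifier), preserving input track order among the survivors. Empirically verified: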
- **Phone MP4** (`samplelib phone-baseline.mp4`, 2.7 MB) — same coverage as mat2 / `-codec copy`.
- **Action cam** (`gopro-fusion.mp4`, 5.1 MB, GPMF + tmcd + fdsc) — 7/7 device fingerprints removed (`GoPro AVC`, `gpmd`, `GoPro AAC`, `GoPro TCD`, `GoPro MET`, `GoPro SOS`, `Fusion`). mat2 **declines** this file entirely (exit 234).
- **Drone** (`dji-phantom4.mov`, 236 MB) — 5/5 device fingerprints removed (`FC6310` drone model, AVC encoder, `DJI.AVC` and `DJI.Meta` handler descriptions, GPS coordinates). The full GPS flight log under `[UserData] GPSCoordinates` (lat/lon/altitude) is dropped categorically by `-map_metadata -1`.
Both action-cam and drone categories are now verified by direct measurement in `tools/forensic/ffmpeg-fallback.ts` (see `docs/forensic/ffmpeg-fallback.md` for the per-fixture sentinel survival table). **Dashcam coverage is predicted by analogy** (most consumer dashcams put GPS/telemetry in `udta` or in data/subtitle streams — both handled by our policy) but is not yet exercised by a fixture; tracked as a follow-up.
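The difference between the two mapping styles can be illustrated with a toy model. This is a simplified sketch of stream selection, not ffmpeg's actual implementation; the `Stream` type, the function names, and the audio-first input layout are all hypothetical and chosen only to make the ordering effect visible.

```typescript
// Simplified model of ffmpeg stream selection. NOT ffmpeg's real implementation.
// Specifier letters follow ffmpeg: v=video, a=audio, s=subtitle, d=data, t=attachment.
type StreamType = "v" | "a" | "s" | "d" | "t";

interface Stream {
  index: number;
  type: StreamType;
  tag: string; // illustrative label only
}

// Models `-map 0 -map -0:d? -map -0:s? -map -0:t?`:
// start from every input stream, then subtract whole categories.
function negativeMap(streams: Stream[]): Stream[] {
  const dropped: ReadonlySet<StreamType> = new Set<StreamType>(["d", "s", "t"]);
  return streams.filter((s) => !dropped.has(s.type));
}

// Models `-map 0:v? -map 0:a?`:
// output order follows the order of the -map options (video first, then audio).
function positiveMap(streams: Stream[]): Stream[] {
  return [
    ...streams.filter((s) => s.type === "v"),
    ...streams.filter((s) => s.type === "a"),
  ];
}

// Hypothetical input whose audio track precedes its video track.
const exampleInput: Stream[] = [
  { index: 0, type: "a", tag: "audio" },
  { index: 1, type: "v", tag: "video" },
  { index: 2, type: "d", tag: "tmcd" },
  { index: 3, type: "d", tag: "gpmd" },
  { index: 4, type: "d", tag: "fdsc" },
];

const keptInOrder = negativeMap(exampleInput).map((s) => s.index);
const keptReordered = positiveMap(exampleInput).map((s) => s.index);
```

Both styles drop the same three data streams here; only the negative form keeps the survivors in their input order.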
### ffmpeg vs. theoretical
The theoretical bar for "what could be removed" is what the walker plus all the open KNOWN_GAPS would achieve if fully closed. ffmpeg already achieves this on standard MP4 — every channel the walker leaks (or partly-leaks) is dropped categorically.
Where ffmpeg falls short of theoretical:
- `vendor_id` in sample entries — ffmpeg muxer writes `[0][0][0][0]` (null vendor) instead of whatever the input had. This is technically a fingerprint of "an MP4 muxed by ffmpeg," but indistinguishable from any other recently-muxed MP4. Acceptable.
- The `Lavf<version>` encoder string ffmpeg auto-stamps — **suppressed via `-metadata "encoder="`** in our invocation. Verified empirically in the POC: the stripped output of `phone-baseline.mp4` shows no `Lavf` string in `strings | grep -i lavf`.
- Stream-level `creation_time` and `modification_time` (per-track in `mdia/mdhd`) — zeroed by `-fflags +bitexact`. Verified.
- `udta/meta/hdlr` block — `mov_write_meta_tag` writes a 0x21-byte stub at the movie level (and per-track for some inputs) regardless of `-map_metadata -1`. The handler_type is `mdir` (iTunes-style metadata directory) which ExifTool surfaces as `HandlerType: Metadata`; the handler vendor is hardcoded `appl` which surfaces as `HandlerVendorID: Apple` and misrepresents the file's origin. **Patched in post-strip**: `cleanFfmpegMp4Output` rewrites the `udta` box type to `free` (ISO/IEC 14496-12 §8.1.2 padding). Readers ignore `free` boxes entirely — ExifTool stops surfacing the HandlerType row, the HandlerVendorID row, and any other field that would have come from inside. Length-preserving so `stco`/`co64` offsets to `mdat` stay valid. The original "appl" bytes survive as padding inside the renamed box; ExifTool / ffprobe / VLC all see padding. (We previously zeroed just the 4-byte vendor inside `meta/hdlr`; that left `HandlerType: Metadata` still surfacing — the rename supersedes that approach.)
- `mdhd.language` (per-track 15-bit packed ISO 639-2/T code) — ffmpeg writes `und` (0x55C4, "undetermined") when the input had no language and copies the input's code when it did. **Accept as-is.** `und` is the spec's canonical "no language specified" marker; every reader handles it predictably. We considered zeroing it to suppress `MediaLanguageCode` from the diff but reverted — 0x0000 is an invalid ISO 639-2/T code, and ffprobe falls back to displaying `(eng)` (actively misleading for downstream tools that switch on language). When the input had a real language (`eng`, `fra`, etc.) the diff row `eng → und` is honest and informative: we removed the user's language tag, which is exactly what the diff is for. The only cosmetic cost is one extra `MediaLanguageCode: und` row in the diff when the input had no decodable language — accepted vs. spec-invalid container bytes.
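The `udta` → `free` rename described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's `cleanFfmpegMp4Output`: the function names are hypothetical, only `udta` boxes that are direct children of a top-level `moov` are handled (per-track `udta` is left out), and the input is assumed to be a well-formed box sequence. The type field sits at `headerStart + 4` for both regular and largesize headers, so one 4-byte write covers both.

```typescript
// Box types as big-endian 32-bit values.
const MOOV = 0x6d6f6f76; // "moov"
const UDTA = 0x75647461; // "udta"
const FREE = 0x66726565; // "free"

interface BoxHeader {
  start: number;
  headerSize: number;
  size: number;
  type: number;
}

// Parse one box header at `start`, bounded by `limit`. Returns null when no
// complete, plausible box fits.
function readBox(view: DataView, start: number, limit: number): BoxHeader | null {
  if (start + 8 > limit) return null;
  let size = view.getUint32(start);
  let headerSize = 8;
  if (size === 1) {
    // largesize: the real 64-bit size follows the type field.
    if (start + 16 > limit) return null;
    size = view.getUint32(start + 8) * 2 ** 32 + view.getUint32(start + 12);
    headerSize = 16;
  } else if (size === 0) {
    size = limit - start; // box runs to the end of its container
  }
  if (size < headerSize || start + size > limit) return null;
  return { start, headerSize, size, type: view.getUint32(start + 4) };
}

// Rewrite the type of every moov-level `udta` box to `free`, in place.
// Length-preserving: no byte moves, so chunk offsets stay valid.
function renameUdtaToFree(buf: Uint8Array): number {
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  let renamed = 0;
  let off = 0;
  for (let box = readBox(view, off, buf.length); box; box = readBox(view, off, buf.length)) {
    if (box.type === MOOV) {
      const end = box.start + box.size;
      let c = box.start + box.headerSize;
      for (let child = readBox(view, c, end); child; child = readBox(view, c, end)) {
        if (child.type === UDTA) {
          view.setUint32(child.start + 4, FREE); // type lives at headerStart + 4
          renamed++;
        }
        c += child.size;
      }
    }
    off += box.size;
  }
  return renamed;
}
```

The payload bytes inside the renamed box are untouched, which matches the note above that the original `appl` bytes survive as padding.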
### Documented gaps to flag in `PRIVACY_GAPS.md`
| Channel | Status | Note |
|---|---|---|
| Legitimate subtitle / chapter tracks user *wanted* preserved | Dropped | Edge case. No way to distinguish "subtitle that's content" from "subtitle that's GPS-from-DJI-SRT" at the file level. UI may add an opt-in flag post-strip. |
| Sidecar files (`.SRT`, `.LRV`, `.THM`, `.LRF`) | Unchanged | `#46` — not addressable by any in-file strategy. Out of scope for this PR; documented in `PRIVACY_GAPS.md`. |
| Container fingerprint via `ftyp` brand normalisation | Normalised to ffmpeg default | We accept that "uniform = less fingerprint-y" is the right trade; the only fingerprint left is "muxed by ffmpeg," same as billions of other files. |
| Fragmented MP4 byte-preservation | Defragmented | User signed off. Defragged output plays identically on every consumer platform. |
---
## Recommendation
**Adopt the POC invocation as the Phase 1 starting policy.** The full strip command:
```
ffmpeg -i in.<ext> \
-map 0 -map -0:d? -map -0:s? -map -0:t? \
-map_metadata -1 \
-map_chapters -1 \
-fflags +bitexact \
-c copy \
-movflags +faststart \
-metadata "encoder=" \
out.<ext>
```
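Because `@ffmpeg/core` takes argv-style arguments rather than a shell string, the command above translates to an array of tokens. The sketch below is one hedged way to build it; `buildStripArgs` and the `Container` type are hypothetical names, and the choice to omit `-movflags +faststart` outside the MP4 family follows the MKV/WebM note in `docs/gap-analysis/webm.md`.

```typescript
type Container = "mp4" | "mov" | "m4v" | "mkv" | "webm";

// Build the argv for the strip invocation. In argv form there is no shell
// quoting, so `-metadata "encoder="` becomes the two tokens below.
function buildStripArgs(input: string, output: string, container: Container): string[] {
  const isMp4Family = container === "mp4" || container === "mov" || container === "m4v";
  return [
    "-i", input,
    // keep every stream, then drop data / subtitle / attachment streams
    "-map", "0", "-map", "-0:d?", "-map", "-0:s?", "-map", "-0:t?",
    "-map_metadata", "-1", // drop container-level metadata
    "-map_chapters", "-1", // drop chapters
    "-fflags", "+bitexact", // no timestamps / version stamps
    "-c", "copy", // no re-encode
    ...(isMp4Family ? ["-movflags", "+faststart"] : []), // MP4-family only
    "-metadata", "encoder=", // suppress the Lavf<version> string
    output,
  ];
}

const mp4Args = buildStripArgs("in.mp4", "out.mp4", "mp4");
const webmArgs = buildStripArgs("in.webm", "out.webm", "webm");
```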
For Phase 1 (this PR), the claim set is `.mp4`, `.mov`, `.m4v`. The strategy is registered ahead of `VideoStrategy` when `WITH_FFMPEG=1` (default).
### Phase 1 deliverable (this PR)
- `FfmpegFallbackStrategy` claims `.mp4`/`.mov`/`.m4v`
- Strip invocation as above
- Forensic verification under `tools/forensic/ffmpeg-fallback.ts` — sentinel survival = 0 on the synthetic battery (`seeded.mp4`, `sample-fragmented.mp4`) plus real-world (`phone-baseline.mp4`, `gopro-fusion.mp4`)
- KNOWN_GAPS empty for these formats after this lands — every channel from `docs/forensic/video.md` is closed
- `VideoStrategy` retained as the fallback when `WITH_FFMPEG=0` (opt-out builds)
### Deferred to follow-up PRs
- `.mkv` / `.webm` (Phase 2 of #182, same PR if Phase 1 forensic is clean)
- `.avi` / `.wmv` / `.3gp` (separate PRs; each with its own gap analysis + forensic pass)
- `VideoStrategy` deletion (subsequent PR after a validation window of stable ffmpeg in production)
- UI opt-in flag for "preserve subtitle/chapter tracks" — depends on real user feedback

docs/gap-analysis/webm.md Normal file

@ -0,0 +1,31 @@
# WebM — ffmpeg-wasm strategy gap analysis
**Date:** 2026-05-21
**Goal:** Document the per-source policy for `FfmpegFallbackStrategy` on WebM inputs. Phase 2 of issue #182.
WebM is a strict subset of Matroska — same EBML container, restricted to VP8/VP9/AV1 video + Vorbis/Opus audio. The strategy treats them the same: `matchesEbml()` accepts both, `detectContainer()` distinguishes them via the EBML `DocType` element (looks for the ASCII string `"webm"` in the first 64 bytes; falls back to `.mkv` if not found, since MKV is the superset).
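The detection logic described above can be sketched roughly as follows. The function names come from this document, but the signatures and bodies are assumptions for illustration, not the strategy's actual code; the only hard facts used are the EBML magic number (`1A 45 DF A3`) and the 64-byte `DocType` search window stated above.

```typescript
// EBML header element ID, shared by Matroska and WebM.
const EBML_MAGIC: readonly number[] = [0x1a, 0x45, 0xdf, 0xa3];
const WEBM_ASCII: readonly number[] = [0x77, 0x65, 0x62, 0x6d]; // "webm"

function matchesEbml(bytes: Uint8Array): boolean {
  return bytes.length >= 4 && EBML_MAGIC.every((b, i) => bytes[i] === b);
}

// Returns "webm" when the DocType string appears in the first 64 bytes,
// "mkv" for any other EBML file, and null for non-EBML input.
function detectContainer(bytes: Uint8Array): "webm" | "mkv" | null {
  if (!matchesEbml(bytes)) return null;
  const limit = Math.min(bytes.length, 64);
  for (let i = 0; i + WEBM_ASCII.length <= limit; i++) {
    if (WEBM_ASCII.every((b, j) => bytes[i + j] === b)) return "webm";
  }
  return "mkv"; // MKV is the superset, so it is the safe default
}
```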
---
## Per-source policy
Inherits everything from [`docs/gap-analysis/mkv.md`](mkv.md). The differences relative to MKV:
| Aspect | MKV | WebM | Notes |
|---|---|---|---|
| Allowed codecs | Any | VP8/VP9/AV1 video, Vorbis/Opus audio | Our `-c copy` path doesn't transcode, so any codec the input contains passes through. If the user dropped a "WebM" file containing a non-WebM codec (rare; happens with stale exports), ffmpeg's muxer will refuse, which is the same failure mode as the MP4 muxer's codec-tag refusal. |
| Attachments / chapters | Rare | Rarer still | Same drop-all policy as MKV. |
| MediaRecorder-generated WebM (browser screen rec) | n/a | Common | This is the **primary user value of WebM coverage**: web-recorded video from MediaRecorder API. Our strip handles this cleanly. |
| `\Segment\Info\WritingApp` | Often "Chrome" / "Firefox" / "Lavf" | Same | Privacy-invariants §6 fingerprint — suppressed via `-metadata encoder=`. |
---
## Empirical verification
Synthetic WebM with 2 seeded sentinels (`title`, `description`) → 0 survivors. See `tools/forensic/ffmpeg-fallback.ts` Phase 2 row.
---
## Recommendation
`.webm` joins the strategy's claim set in Phase 2 of #182. Same invocation as MKV (no `-movflags +faststart`). Forensic verification confirms zero sentinel survival on the synthetic WebM battery. Real-world WebM fixtures (browser MediaRecorder output) are a useful follow-up.

docs/poc/ffmpeg-wasm.md Normal file

@ -0,0 +1,311 @@
# `@ffmpeg/ffmpeg` (ffmpeg-wasm) POC
**Date:** 2026-05-21
**Goal:** Evaluate whether ffmpeg compiled to WebAssembly is a viable in-browser strategy for ExifCleaner — closing the long-tail video container gap (MKV/WebM/AVI/WMV/3GP) and the GPMF / device-fingerprint gap in `VideoStrategy` (#38, #39), without giving up the "no server, runs offline" privacy guarantee.
Targets that frame the evaluation: **desktop offline standalone HTML** + **Android APK** (Capacitor wrapper). PWA self-host is a secondary target. Issue: #182.
## What was evaluated
| Package | Version | License | Role |
|---|---|---|---|
| `@ffmpeg/ffmpeg` | 0.12.15 | MIT | Browser-facing wrapper (Worker boilerplate, lifecycle, file marshalling) |
| `@ffmpeg/util` | 0.12.2 | MIT | URL/fetch utilities used by the wrapper |
| `@ffmpeg/core` | 0.12.10 | **GPL-2.0-or-later** | The actual ffmpeg compiled to WASM, single-threaded build |
| `@ffmpeg/types` | 0.12.4 | MIT | Shared TypeScript types |
Installed in `/tmp/ffmpeg-poc/` with `npm install --save-exact`. Per-package dependency graph is tiny: the wrapper depends on `@ffmpeg/util` only; `@ffmpeg/core` has no JS-side deps (the WASM is self-contained).
The threaded build (`@ffmpeg/core-mt`, requires `SharedArrayBuffer` + COOP/COEP) is explicitly **out of scope** for this POC — issue #182 picks single-threaded for v1.
## How the packages fit together
They are not alternatives; they're a stack:
```
┌──────────────────────────────────────────────────────────────┐
│ FfmpegFallbackStrategy (we write — issue #182) │
│ ├─ implements src/infrastructure/wasm/format_strategy.ts │
│ └─ same contract as JpegStrategy, ExifToolFallbackStrategy │
├──────────────────────────────────────────────────────────────┤
│ @ffmpeg/ffmpeg (MIT, ~10 KB JS, 2.3 KB gz) │
│ ├─ FFmpeg class: load / exec / writeFile / readFile │
│ ├─ Manages Web Worker lifecycle, file marshalling to MEMFS │
│ └─ Browser-only — Node import resolves to an empty module │
├──────────────────────────────────────────────────────────────┤
│ @ffmpeg/core (GPL, 30.7 MB / 9.79 MB gz) │
│ ├─ ffmpeg-core.wasm — Emscripten-compiled ffmpeg + libs │
│ ├─ ffmpeg-core.js — Emscripten JS bootstrap │
│ └─ Exposes argv-style exec, MEMFS, setLogger, setTimeout │
└──────────────────────────────────────────────────────────────┘
```
Our integration shape: a thin `FfmpegFallbackStrategy` on top of `@ffmpeg/ffmpeg`. It hides the wrapper's quirks (Worker bootstrap, MEMFS lifecycle, stderr/abort semantics) and presents the same `strip(bytes, options) → Result<StripResult, ExifError>` contract as every other strategy. Same architectural pattern as `ExifToolFallbackStrategy`.
## Bundle weight
```
ffmpeg-core.wasm raw: 30 706 KB gzipped: 9 791 KB
ffmpeg-core.js raw: 109 KB gzipped: 29 KB
@ffmpeg/ffmpeg classes.js raw: 9.5 KB gzipped: 2.3 KB
@ffmpeg/ffmpeg worker.js raw: 5.0 KB gzipped: ~1.5 KB (est)
─────────────────────────────────────────────────────────────────
TOTAL transfer weight ≈ 9.82 MB gzipped
TOTAL decompressed in memory ≈ 30.8 MB
```
For comparison:
- `zeroperl.wasm` (WebPerl-ExifTool): ~7.2 MB gzipped, ~24 MB raw
- Existing `VideoStrategy` hand-rolled walker: ~5 KB compiled JS
- ExifCleaner's whole web build today: ~440 KB JS → ~180 KB gzipped
Distribution impact per target:
| Target | Today | + ffmpeg-wasm | Delta |
|---|---:|---:|---:|
| Standalone HTML (inlined as base64) | ~900 KB | ~40 MB | +44× |
| Android APK (compressed asset) | ~5 MB | ~15 MB | +3× |
| PWA self-host (on-demand fetch + SW cache) | ~440 KB initial | +9.8 MB on first video drop | one-time |
Standalone is the heaviest hit by far; APK absorbs it comfortably; PWA pays once and caches forever.
## Privacy / sandbox audit
### Static: JS-side network capability
Two browser-`fetch` call sites in `ffmpeg-core.js`, both in the Emscripten bootstrap path: `getBinary()` and `instantiateAsync()`. Both load **the `.wasm` file itself**, from a URL we provide via the wrapper's `coreURL` / `wasmURL` config (same pattern as ExifTool's `redirectWasmFetch`). No other runtime fetches exist after the WASM is loaded.
The wrapper (`@ffmpeg/ffmpeg/dist/esm/classes.js`) has **zero** references to browser-`fetch`, `XMLHttpRequest`, `WebSocket`, `RTCPeerConnection`, `EventSource`, or `navigator.sendBeacon`.
### WASM-level capability — important caveat
Unlike `zeroperl.wasm` (whose import section uses readable WASI names: `wasi_snapshot_preview1::fd_read`, etc.), `ffmpeg-core.wasm` is built with Emscripten using JS imports whose names are minified to single letters (`a::$`, `a::A`, …`a::z`). **The WASM import audit cannot be done by name alone** — the JS host (`ffmpeg-core.js`) supplies each of the 71 imports under whichever single-letter symbol Emscripten assigned.
The audit therefore moves up a layer: what does the JS-side runtime expose to the WASM?
- ffmpeg-core.js does contain Emscripten's **SOCKFS** code (because ffmpeg's source uses `socket()` for network protocols like `rtsp://`, `tcp://`, `udp://`, `http://`). The SOCKFS implementation requires the consumer to provide a `Module["websocket"]` object — it is **dormant** by default. If `socket()` is called from inside WASM without `Module["websocket"]` set, it fails.
- Our integration **does not** set `Module["websocket"]`, so the WASM cannot make socket calls even if asked.
- ffmpeg's inputs are restricted to MEMFS paths (we write the user's bytes to MEMFS before invoking ffmpeg). It is never handed an `rtsp://` or `http://` URL.
- The deploy-layer CSP (`connect-src 'self'`) blocks any WebSocket handshake at the browser level as a backstop.
Defense in depth, not "structural by construction." This is honestly weaker than the zeroperl story (where sockets are absent from the import section entirely), and the writeup must say so.
### Dynamic: Node permission model
Verified end-to-end under Node 24's default-deny capability model. The script in `/tmp/ffmpeg-poc/run_strip.mjs` loads `@ffmpeg/core` directly (bypassing the browser-only `@ffmpeg/ffmpeg` wrapper) and runs:
```
node --permission --allow-fs-read=/tmp/ffmpeg-poc --allow-fs-write=/tmp/ffmpeg-poc \
run_strip.mjs seeded.mp4 out-restricted.mp4
```
Without `--allow-net`, the strip completes successfully and produces **byte-identical output** to the unrestricted run (`cmp out-restricted.mp4 out-unrestricted.mp4` → ✓ byte-identical). This proves the strip path issues no network syscalls.
```
ok core loaded
ok rc=0
wrote 2238 bytes to out-restricted.mp4
sentinels in restricted output: 0
bytes match unrestricted run? ✓ byte-identical
```
(Linux user namespaces were unavailable in the audit sandbox for `unshare -rn`, so the Node permission model is the dynamic check.)
**Conclusion: no outbound network traffic during strip operations.** Combined with not setting `Module["websocket"]` + the deploy-level CSP, the privacy story holds — but documented as defense-in-depth rather than "by construction."
## Supply-chain integrity
### Pinning + npm integrity
All four packages installed with `--save-exact`. SHA-512 integrity hashes recorded in `package-lock.json`:
```
@ffmpeg/ffmpeg@0.12.15
sha512-1C8Obr4GsN3xw+/1Ww6PFM84wSQAGsdoTuTWPOj2OizsRDLT4CXTaVjPhkw6ARyDus1B9X/L2LiXHqYYsGnRFw==
@ffmpeg/core@0.12.10
sha512-dzNplnn2Nxle2c2i2rrDhqcB19q9cglCkWnoMTDN9Q9l3PvdjZWd1HfSPjCNWc/p8Q3CT+Es9fWOR0UhAeYQZA==
@ffmpeg/util@0.12.2
sha512-ouyoW+4JB7WxjeZ2y6KpRvB+dLp7Cp4ro8z0HIVpZVCM7AwFlHa0c4R8Y/a4M3wMqATpYKhC7lSFHQ0T11MEDw==
@ffmpeg/types@0.12.4
sha512-k9vJQNBGTxE5AhYDtOYR5rO5fKsspbg51gbcwtbkw2lCdoIILzklulcjJfIDwrtn7XhDeF2M+THwJ2FGrLeV6A==
```
### File-level SHA-256 (for our own embed)
```
SHA256 (ffmpeg-core.wasm) = 9f57947a5bd530d8f00c5b3f2cb2a3492faa7e5d823315342d6a8656d0a6b7b7
SHA256 (ffmpeg-core.js) = 67a48f11645f85439f3fde4f2119042c16b374b910206b7a7a24f342e28dcae3
SHA256 (@ffmpeg/ffmpeg classes.js) = 7a829c898bdbc3a8806652a5502d9101178ce4e988a2c50b3abc1306ce4fc919
```
If adopted, the embed pattern would be: pin SHA-256s in our build, compute them as part of `yarn build:web`, fail the build on mismatch. Same protocol as `webperl-exiftool.md`.
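A build-time check along these lines could look like the sketch below. It assumes a Node build step; `assertPinned` and `PINNED_SHA256` are hypothetical names, and the two digests are copied from the listing above. How the check is wired into `yarn build:web` is left open.

```typescript
import { createHash } from "node:crypto";

// Digests copied from the file-level SHA-256 listing in this document.
const PINNED_SHA256: Readonly<Record<string, string>> = {
  "ffmpeg-core.wasm": "9f57947a5bd530d8f00c5b3f2cb2a3492faa7e5d823315342d6a8656d0a6b7b7",
  "ffmpeg-core.js": "67a48f11645f85439f3fde4f2119042c16b374b910206b7a7a24f342e28dcae3",
};

function sha256Hex(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// Throws when the file is unknown or its digest differs from the pin,
// which is what should fail the build.
function assertPinned(
  name: string,
  bytes: Uint8Array,
  pins: Readonly<Record<string, string>> = PINNED_SHA256,
): void {
  const expected = pins[name];
  if (expected === undefined) throw new Error(`no pinned hash for ${name}`);
  const actual = sha256Hex(bytes);
  if (actual !== expected) {
    throw new Error(`SHA-256 mismatch for ${name}: expected ${expected}, got ${actual}`);
  }
}
```

In a real build the bytes would come from reading the vendored file from disk before it is inlined or copied into the output.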
### What this does NOT defend against
- A compromised upstream maintainer publishing a malicious update. Mitigation: pin a known-good version and **review the diff every time we bump**.
- A backdoor at the pinned version we're auditing now. The defense is the network-capability analysis above (JS-side fetch / SOCKFS) plus the dynamic no-network test. Stronger than the obfuscated-import-section audit on its own.
## Functional results
Fixtures generated with `ffmpeg` (1 s synthetic blue frame, 128×128, H.264). Seeded with sentinel metadata via system `exiftool 12.76`. Stripped via `@ffmpeg/core` 0.12.10 in Node. Sentinels counted with `strings | grep -c FORENSIC`.
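For readers without the shell pipeline at hand, an in-process approximation of the sentinel count might look like this. It is a sketch, not the forensic tool's code: `countSentinel` is a hypothetical name, and it counts raw byte occurrences, whereas `strings | grep -c` counts matching lines, so nonzero totals can differ even though a zero result means the same thing for an ASCII sentinel.

```typescript
// Count non-overlapping occurrences of an ASCII sentinel in a byte buffer.
function countSentinel(haystack: Uint8Array, sentinel: string): number {
  const needle = Array.from(sentinel, (ch) => ch.charCodeAt(0));
  if (needle.length === 0) return 0;
  let count = 0;
  for (let i = 0; i + needle.length <= haystack.length; i++) {
    if (needle.every((b, j) => haystack[i + j] === b)) {
      count++;
      i += needle.length - 1; // jump past this match
    }
  }
  return count;
}
```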
### Synthetic battery (sentinel survival)
| Fixture | Bytes (in → out) | Sentinels (in → out) | rc | Notes |
|---|---:|---|---:|---|
| `seeded.mp4` (1 s, 5.7 KB, full XMP + ItemList sentinels) | 5 711 → 2 238 | 8 → **0** | 0 | All XMP, ItemList, encoder, dates stripped |
| `sample-fragmented.mp4` (fragmented MP4 with `moof` boxes) | 1 850 → 1 680 | n/a | 0 | **Works** — defragments to flat `[ftyp, moov, mdat]`, which the user has accepted as the trade-off (see #182 discussion) |
Strip command for both: `-i in.mp4 -map 0 -map_metadata -1 -map_chapters -1 -fflags +bitexact -c copy -movflags +faststart out.mp4`.
### Real-world battery
| Fixture | Bytes | Default `-map 0` | With `-map 0:v? -map 0:a?` | Notes |
|---|---:|---|---|---|
| `phone-baseline.mp4` (samplelib, 2.7 MB, modern Android) | 2 848 111 → 2 848 111 (output) | ✅ rc=0 | ✅ rc=0 | All standard metadata stripped; no codec failures |
| `gopro-fusion.mp4` (gpmf-parser repo, 5.1 MB, has GPMF + timecode + description streams) | 5 124 222 | ❌ rc=1, 0-byte output | ✅ rc=0, 5 000 118 B | See below |
**GoPro Fusion failure mode (default `-map 0`)**:
```
[mp4] Could not find tag for codec none in stream #2, codec not currently supported in container
Could not write header for output file #0 (incorrect codec parameters ?): Invalid argument
Error initializing output stream 0:4 --
Aborted()
```
Streams #0:2 (`tmcd`, timecode) and #0:4 (`fdsc`, description) carry codec `none`, which ffmpeg's MP4 muxer refuses to remux. The `gpmd` stream (#0:3) is **not** the blocker — `tmcd` and `fdsc` are. mat2 hits the same wall (exit 234) because it uses the same `-codec copy` invocation.
**With `-map 0:v? -map 0:a?` (drop all non-video/non-audio streams)**:
The output drops every data stream — GPMF/GPS, timecode, description — which is exactly what a metadata stripper wants to do. Result: 5 MB clean output, **all** GoPro device names / GPMF magic / fusion strings / encoder identifiers removed. ExifTool view of the stripped file shows generic Apple-style headers, no GoPro lineage.
This is a categorical improvement over both:
- Current `VideoStrategy` walker: leaves `COMPRESSORNAME`, `gpmd-magic`, GoPro device-name strings (#38, #39 — handler/compressor names not stripped).
- `mat2` (= ffmpeg `-codec copy`): refuses the file entirely (exit 234) with `-map 0`.
**Trade-off**: we lose subtitle / chapter / data tracks that *might* have been wanted. For a privacy-strip tool, this is correct: anything that isn't video or audio is metadata or telemetry, and the user dropped the file *to remove* metadata and telemetry. The gap-analysis writeup (`docs/gap-analysis/mp4-ffmpeg.md`) will document this policy and flag the legitimate-subtitle edge case for follow-up.
### Closes / improves on current walker gaps
| Gap | Current `VideoStrategy` | ffmpeg-wasm with `-map 0:v? -map 0:a?` |
|---|---|---|
| `HDLR_NAME_VIDEO` / `HDLR_NAME_*` (#38) | LEAKED | Removed (generic `VideoHandler`/`SoundHandler` rewritten) |
| `COMPRESSORNAME` (#39) | LEAKED | Removed |
| `gpmd-magic` / GoPro device strings | LEAKED | Removed (entire `gpmd` track dropped) |
| `mvhd.next_track_id` (#111) | LEAKED | Rewritten by ffmpeg's muxer (verify in forensic phase) |
| GPS coordinates in GPMF | LEAKED | Removed (track dropped) |
| `TRAF_META_FRAGMENT` (#36) | LEAKED on fragmented | N/A — defragmented |
| `MDAT_ORPHAN` (#42) | LEAKED on synthetic | Removed (ffmpeg re-writes mdat from sample tables) |
Categorical: GPMF / timed-metadata / `tmcd` / `fdsc` channels are all gone because the streams themselves are dropped.
## Performance
Steady-state benchmark, 10 strips per fixture in one Node process (warm cache):
```
=== seeded.mp4 (1 s, 5.7 KB) ===
N=10 mean=5ms p50=2ms p95=34ms min=2ms max=34ms
(first call 34 ms cold; 2–3 ms steady-state)
=== big-seeded.mp4 (30 s, 29 KB) ===
N=10 mean=6ms p50=3ms p95=28ms
=== phone-baseline.mp4 (5 s, 2.7 MB) ===
N=5 mean=61ms p50=52ms p95=104ms
=== gopro-fusion.mp4 (5.1 MB, with -map 0:v? -map 0:a?) ===
N=5 mean=~95ms (extrapolated; rc=0 only when v/a-only map applied)
=== sample-fragmented.mp4 (1.8 KB, fragmented) ===
N=10 mean=6ms p50=2ms
```
Process-level (Node + ESM resolution + WASM init + 1 strip): **~170 ms wall**, **~145 MB peak RSS**.
### Comparison vs other strategies
| Engine | Per-file latency (1 s phone MP4) | Cold start | Bundle |
|---|---:|---:|---:|
| `VideoStrategy` walker (current) | < 5 ms | none | ~5 KB |
| `@ffmpeg/core` 0.12.10 | ~50–100 ms | ~30 ms once | ~9.8 MB gz |
| `@uswriting/exiftool` (WebPerl) | ~900 ms | ~1 s once | ~7.2 MB gz |
ffmpeg-wasm sits between the walker and WebPerl-ExifTool on latency: **~10–20× slower than the walker, ~10–15× faster than ExifTool**. Practical implication: hundreds of phone-MP4 strips in a batch still feel responsive (~10 s for 100 files); folder-level batches with thousands of files would be noticeable but not awful (~1 min for 1000).
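The batch estimates follow from simple arithmetic, sketched here under the assumption of strictly sequential strips at 50 to 100 ms per file plus a one-time cold start of about 30 ms. `batchSeconds` is a hypothetical helper, not part of the codebase.

```typescript
// Estimated wall time in seconds for a sequential batch.
function batchSeconds(files: number, msPerFile: number, coldStartMs = 30): number {
  return (coldStartMs + files * msPerFile) / 1000;
}

const hundredWorstCase = batchSeconds(100, 100); // upper end of the per-file range
const thousandBestCase = batchSeconds(1000, 50); // lower end of the per-file range
const thousandWorstCase = batchSeconds(1000, 100);
```

At the upper bound, 100 files take roughly 10 s; 1000 files land between roughly 50 s and 100 s, which is where the "about a minute" figure comes from.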
The project-direction principle from `CLAUDE.md` ("Performance is sacred — hundreds in seconds") holds for the typical batch size. Walker stays faster, which is one reason #182 keeps the walker around during the validation window.
## Maintainer / bus factor / license chain
- **Maintainers** of `@ffmpeg/ffmpeg` and `@ffmpeg/core`: Jerome Wu (`jeromewu`) and Lucas Gelfond (`lucasgelfond`). The `ffmpegwasm` GitHub organization is the upstream.
- **Activity**: latest `@ffmpeg/core` 0.12.10 published 2025-04-07. Repo (`ffmpegwasm/ffmpeg.wasm`) is active — issues responded to, releases roughly quarterly. Healthier than `zeroperl-ts`.
- **License chain**:
- `@ffmpeg/ffmpeg`, `@ffmpeg/util`, `@ffmpeg/types` — **MIT**
- `@ffmpeg/core` (the WASM build) — **GPL-2.0-or-later** (inherits from ffmpeg's enabled GPL components: x264, x265, etc.)
- User direction in #182: "use the better one, don't overthink about licensing"
- **Implication**: distributing the standalone HTML / APK that includes `ffmpeg-core.wasm` makes the distributed binary subject to GPL-2.0. Our MIT codebase is unchanged (no GPL source becomes part of *our* source tree), but the **combined distributable** must comply with GPL-2.0: source-availability + license notice + copyleft for any further redistribution.
- In practice for this project: the codebase is already public + MIT, all upstream ffmpeg-wasm source is public, so the source-availability requirement is met by linking to `ffmpegwasm/ffmpeg.wasm`. The About / Settings → Licenses screen needs a GPL-2.0 notice + source pointer. Documented in #182's pre-merge checklist.
- **Per-distribution implication**: enabling ffmpeg in any opt-in user-built variant subjects *their* build to GPL-2.0 redistribution rules, but they're free to use it locally. The standalone HTML download (when shipped with ffmpeg by default per #182) is a binary distribution that ships under GPL-2.0 for the combined work.
## Conclusion
**Adopt as the primary path for video container metadata stripping** (MP4/MOV/MKV/WebM in #182 Phase 1+2), with `VideoStrategy` retained during a validation window. The math:
| Concern | Verdict |
|---|---|
| Privacy (no network) | ✅ Static (JS-side fetch only loads the wasm; SOCKFS dormant) + Dynamic (Node permission model: byte-identical output without `--allow-net`). Defense-in-depth, not "by construction" — weaker than zeroperl. |
| Forensic completeness on supported formats | ✅ Strips every sentinel on synthetic MP4. Closes #38, #39 (handler / compressor names). Closes #42 (mdat orphan zeroing) categorically by re-writing from sample tables. |
| Bundle size for standalone HTML | ⚠️ +40 MB inlined. Big jump from current ~900 KB single file. User has signed off as worth it for the format coverage and forensic completeness. |
| Bundle size for Android APK | ✅ +10 MB compressed asset. Rounding error inside the Capacitor WebView shell. |
| Performance for batches | ✅ 50–100 ms per file for 2.7 MB phone MP4. Hundreds in seconds is achievable; thousands gets noticeable. |
| Fragmented MP4 (screen rec, modern Android) | ✅ Works — defragments to flat MP4. User has explicitly accepted defragmentation as a trade-off (per #182 discussion). |
| GoPro Fusion / GPMF / device fingerprints | ✅ With `-map 0:v? -map 0:a?`, all data streams dropped. **Better than current walker and better than mat2.** |
| MKV / WebM / AVI | 🟡 Untested in this POC. Phase 2 of #182 verifies forensically; ffmpeg's MKV/WebM mux is well-trodden in mat2's coverage. |
| GPL-2.0 license inheritance | 🟡 Combined distributable subject to GPL-2.0. Requires License notice + source pointer in About screen. User accepted in #182. |
| Maintainer health | ✅ Two maintainers, active org, quarterly releases |
| Tampering posture | ✅ SHA-512 from npm + SHA-256 pinned in our build + JS-side network audit + dynamic test |
## Recommended path forward (aligned with #182)
The strategy ships behind `WITH_FFMPEG` (default ON for all three distributions; `WITH_FFMPEG=0` opts out) and routes MP4/MOV ahead of the existing `VideoStrategy`. ExifToolFallbackStrategy stays for raster formats; PdfStrategy and OfficeStrategy stay (ffmpeg can't write those).
### Strip invocation policy
Default invocation for all claimed formats:
```
ffmpeg -i in.<ext> \
-map 0:v? -map 0:a? \ # drop data/subtitle/timecode streams (metadata vehicles)
-map_metadata -1 \ # drop container-level metadata
-map_chapters -1 \ # drop chapters (potential leak surface)
-fflags +bitexact \ # don't write randomness/timestamps
-c copy \ # no re-encode
-movflags +faststart \ # moov at front (better seek for end-users)
-metadata "encoder=" \ # suppress Lavf<version> fingerprint
out.<ext>
```
The `-map 0:v? -map 0:a?` choice (drop data streams) is the key departure from mat2's default. It's what makes GoPro Fusion / DJI / dashcam files work where mat2 fails. The legitimate-subtitle case (where a user wants to preserve a real subtitle track) is flagged for the gap-analysis phase.
Container brand / muxer-string normalisation (per privacy-invariants §6): the gap-analysis writeup will enumerate every fingerprint ffmpeg auto-stamps (`Lavf<version>` encoder string, `mp42` ftyp brand, etc.) and document per-source rewrite policy.
### Verification required before merge
Per `format-strategy-workflow.md`:
- `docs/gap-analysis/mp4-ffmpeg.md` — per-source/per-marker policy table; honest gap section
- `docs/gap-analysis/mkv.md` and `docs/gap-analysis/webm.md` — Phase 2
- `tools/forensic/ffmpeg-fallback.ts` — synthetic + real-world sentinel fixtures, zero survival across recovery battery
- Build-flag wiring in `vite.config.web.ts`, `vite.config.web.standalone.ts`, Capacitor build config
- CI `WITH_FFMPEG=0` variant in the smoke matrix
### Open questions for the design phase
- **Container brand normalisation policy** — enumerate ffmpeg's auto-stamped strings during gap analysis. Decide which get rewritten vs. allowed-through.
- **Legitimate subtitle/chapter preservation** — opt-in via a strip-options flag? Or always drop with a one-time UI warning? Defer to UX work after the strip path proves out.
- **Cold-start UX** — first MP4 drop pays the 30 ms WASM init cost (in addition to standalone's parse cost of the inlined base64). Below dialog-threshold even on weak devices, but worth measuring on real hardware in the verification phase.
- **MKV/WebM `-codec copy` behaviour** — ffmpeg's matroska muxer is mature, but worth verifying that the same `-map 0:v? -map 0:a?` strategy works without the codec-tag-mismatch issues we saw on GoPro Fusion's MP4 output.
- **Combined GPL-2.0 implications for the APK Play Store path** — not Play Store today (sideload), but a follow-up consideration if Play Store ever happens. Out of scope for this POC.

@ -30,6 +30,7 @@
"generate:exif-tags": "node scripts/generate_exif_tags.mjs"
},
"dependencies": {
"@ffmpeg/core": "0.12.10",
"@uswriting/exiftool": "1.0.9",
"jszip": "^3.10.1",
"pdf-lib": "^1.17.1",

@ -118,45 +118,28 @@ const DROP_GROUPS: ReadonlySet<string> = new Set([
// Top-level keys (no group prefix) that ExifTool emits for housekeeping.
const DROP_KEYS: ReadonlySet<string> = new Set(["SourceFile"]);
- // ExifTool family-1 group → our canonical source label. Granular IFD
- // distinctions collapse to "EXIF" because users don't conceptually
- // separate IFD0/SubIFD/Interop/IFD1. GPS is broken out for privacy
- // salience. XMP-* and ICC-* prefixes collapse to "XMP" / "ICC".
- const EXACT_GROUP_MAP: Readonly<Record<string, string>> = {
- IFD0: "EXIF",
- IFD1: "EXIF",
- ExifIFD: "EXIF",
- SubIFD: "EXIF",
- InteropIFD: "EXIF",
- GPS: "GPS",
- ICC_Profile: "ICC",
- JFIF: "JFIF",
- Photoshop: "Photoshop",
- IPTC: "IPTC",
- MakerNotes: "MakerNotes",
- Adobe: "Adobe",
- APP14: "APP14",
- PDF: "PDF",
- PNG: "PNG",
- QuickTime: "MP4",
- ItemList: "MP4",
- UserData: "MP4",
- Keys: "MP4",
- XML: "Office",
- ZIP: "ZIP",
- RIFF: "RIFF",
- };
+ // Surface every ExifTool family-1 group verbatim as the source label.
+ //
+ // We deliberately do NOT collapse sub-groups to friendlier labels (e.g.
+ // IFD0/IFD1/ExifIFD/SubIFD/InteropIFD → "EXIF", or all XMP-* → "XMP",
+ // or QuickTime/ItemList/UserData/Keys → "MP4"). Collapsing destroys
+ // sub-group identity, and ANY tag name that legitimately appears in two
+ // sub-groups (Track1:HandlerType + Track2:HandlerType, IFD0:Orientation
+ // + IFD1:Orientation, etc.) collides on the same (source, name) key —
+ // the diff renderer then mis-aligns one sub-group's row with another's,
+ // producing spurious diffs like the "Video Track → Audio Track"
+ // regression we hit on MP4 in PR #183.
+ //
+ // Trade: the user sees more granular labels in the diff UI ("IFD0",
+ // "ExifIFD", "Track1", "XMP-dc" instead of "EXIF", "EXIF", "MP4",
+ // "XMP"). Acceptable for the privacy-tool audience — precise labels
+ // are more informative when asking "what was removed" than smoothed-
+ // over umbrella terms.
+ //
+ // The function still exists rather than being removed entirely so we
+ // have a single hook for any future normalisation (e.g. trimming a
+ // prefix from an ExifTool group name we genuinely want to merge).
function mapGroupToSource(rawGroup: string): string {
- const exact = EXACT_GROUP_MAP[rawGroup];
- if (exact !== undefined) return exact;
- if (rawGroup.startsWith("XMP")) return "XMP";
- if (rawGroup.startsWith("ICC")) return "ICC";
- if (rawGroup.startsWith("PNG")) return "PNG";
- if (rawGroup.startsWith("Track")) return "MP4";
- // Unknown group — surface verbatim. Better than dropping (the user
- // sees something), and shows up in test coverage if a new group
- // appears in the wild that we should map.
return rawGroup;
}
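
The collision described in the comment above can be reproduced in isolation. The sketch below is a hypothetical model, not the project's diff renderer: `Row`, `collapse`, `verbatim`, and `indexRows` are illustrative names, and the `source:name` key format is an assumption.

```typescript
// Hypothetical row shape; the real diff document type is not shown in this PR.
interface Row {
  group: string; // ExifTool family-1 group, e.g. "Track1"
  tag: string;
  value: string;
}

// Assumed old behaviour: every Track* group collapses to "MP4".
const collapse = (group: string): string =>
  group.startsWith("Track") ? "MP4" : group;

// New behaviour: the group is surfaced verbatim.
const verbatim = (group: string): string => group;

function indexRows(
  rows: Row[],
  toSource: (group: string) => string,
): Map<string, string> {
  const index = new Map<string, string>();
  for (const row of rows) {
    // Rows that share a (source, tag) key overwrite each other here.
    index.set(`${toSource(row.group)}:${row.tag}`, row.value);
  }
  return index;
}

const rows: Row[] = [
  { group: "Track1", tag: "HandlerType", value: "Video Track" },
  { group: "Track2", tag: "HandlerType", value: "Audio Track" },
];

const collapsed = indexRows(rows, collapse);
const granular = indexRows(rows, verbatim);
console.log(collapsed.size, granular.size);
```

With collapsed labels the two `HandlerType` rows land on one key and the second overwrites the first; with verbatim labels both rows survive.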

@ -37,11 +37,23 @@ export async function redirectWasmFetch(
const inlineEl = document.getElementById(STANDALONE_WASM_INLINE_ID);
const base64 = inlineEl?.textContent?.trim();
if (base64 !== undefined && base64.length > 0) {
- // `fetch(data:URL)` lets the browser's native Base64 → bytes path
- // do the decode work (typically faster than a JS atob+charCodeAt
- // loop for a 33 MB payload). The Response carries the bytes
- // straight into the wrapper's instantiateStreaming call.
- return fetch(`data:application/wasm;base64,${base64}`, init);
+ // Standalone build inlines the WASM as gzipped+base64 (~3× smaller
+ // than raw base64 — see vite.config.web.standalone.ts). The browser
+ // natively decodes base64 via `fetch(data:URL)`, and we pipe the
+ // decoded gzip bytes through DecompressionStream. Result: HTML
+ // payload drops from ~33 MB → ~10 MB; runtime decode ~30 ms once.
+ const gzipped = await (
+ await fetch(`data:application/octet-stream;base64,${base64}`)
+ ).arrayBuffer();
+ const decompressed = await new Response(
+ new Blob([gzipped])
+ .stream()
+ .pipeThrough(new DecompressionStream("gzip")),
+ ).arrayBuffer();
+ return new Response(decompressed, {
+ headers: { "content-type": "application/wasm" },
+ ...init,
+ });
}
const mod = await import("@6over3/zeroperl-ts/zeroperl.wasm?url");
return fetch(mod.default, init);

@ -0,0 +1,21 @@
// @ffmpeg/core ships Emscripten-generated ESM with no .d.ts. The exec()
// + FS shape we use is captured in `FfmpegCore` in ffmpeg_wasm_fetch.ts;
// here we just declare the module so `import("@ffmpeg/core")` typechecks.
declare module "@ffmpeg/core" {
const factory: (config: unknown) => Promise<unknown>;
export default factory;
}
declare module "@ffmpeg/core/wasm?url" {
const url: string;
export default url;
}
// Build-time flag set via Vite `define`. The standalone build (file://)
// inlines @ffmpeg/core as gzipped+base64 in `<script type="text/plain">`
// tags and reads them via readInlinedCore() — the PWA-style bare
// `import("@ffmpeg/core")` path cannot be reached in that target. We
// gate the bare import behind `!__WITH_STANDALONE_INLINE__` so Rollup
// tree-shakes the dead branch in the standalone build, avoiding the
// ~43 MB of ffmpeg-core factory + its data: URL wasm fallback that Vite
// would otherwise bundle into the single-file HTML.
declare const __WITH_STANDALONE_INLINE__: boolean;

@ -0,0 +1,373 @@
import type { Result } from "../../../common";
import type { ExifError } from "../../../domain";
import type {
FormatStrategy,
StripOptions,
StripResult,
} from "../format_strategy";
import { cleanFfmpegMp4Output } from "./ffmpeg_post_strip";
import { loadFfmpegInstance, type FfmpegCore } from "./ffmpeg_wasm_fetch";
// Issue #182. Routes MP4/MOV/M4V (Phase 1) and MKV/WebM (Phase 2) through
// ffmpeg-wasm — registered ahead of VideoStrategy in strategy_registry.ts
// when VITE_ENABLE_FFMPEG_FALLBACK is not "false".
//
// Privacy + bundle math is documented in docs/poc/ffmpeg-wasm.md. Per-source
// policy is documented in docs/gap-analysis/{mp4-ffmpeg,mkv,webm}.md.
//
// Architecture: uses @ffmpeg/core directly in the main thread (not via the
// @ffmpeg/ffmpeg wrapper). The wrapper spawns a type:"module" Web Worker
// from a Blob URL, which fails silently when the page origin is `null`
// (the standalone HTML build runs from file://). Main-thread usage works
// uniformly across standalone, PWA, and Capacitor WebView at the cost of
// blocking the UI during exec() — empirically ~1s on a 236 MB DJI Phantom 4
// fixture (see docs/forensic/ffmpeg-fallback.md), acceptable for a
// single-file metadata strip.
//
// Engine lifecycle: one shared FfmpegCore instance per page session, loaded
// lazily on first strip(). Subsequent strips reuse the same instance.
//
// Deliberately NOT claimed:
// - .avi / .wmv / .3gp — separate forensic-verification PRs (closes #44 etc.)
// - subtitle / chapter / data streams in the input are dropped by the
// -map 0 -map -0:d? -map -0:s? -map -0:t? policy plus -map_chapters -1
// (see gap analyses). The legitimate-subtitle edge case is documented
// as a follow-up UX item.
type ContainerKind = "mp4" | "matroska";
interface ContainerInfo {
kind: ContainerKind;
extension: string;
}
export class FfmpegFallbackStrategy implements FormatStrategy {
readonly extensions: ReadonlySet<string> = new Set([
".mp4",
".mov",
".m4v",
".mkv",
".webm",
]);
private instance: FfmpegCore | null = null;
private loadPromise: Promise<FfmpegCore> | null = null;
constructor() {
// Pre-warm ffmpeg-core in the background once the browser is idle.
// Empirically the ~30 MB wasm gunzip + WebAssembly.instantiate takes
// ~3-5 s on cold start, which dominated user-perceived "drop to
// done" latency. Starting the load right after page load (instead
// of on first strip) overlaps it with the user reading the page +
// dragging the file in, so by the time they drop, the engine is
// usually ready. requestIdleCallback ensures we don't compete with
// first-paint work; setTimeout fallback covers Safari/older
// environments where rIC isn't available.
if (typeof window !== "undefined") {
const kick = () => {
// Swallow load failure here — strip() will retry on demand
// and surface the error to the user. We just don't want an
// uncaught rejection from the prewarm path.
this.getInstance().catch(() => {});
};
const ric = (window as { requestIdleCallback?: (cb: () => void) => void })
.requestIdleCallback;
if (typeof ric === "function") {
ric(kick);
} else {
setTimeout(kick, 0);
}
}
}
verifyMagicBytes({ bytes }: { bytes: Uint8Array }): boolean {
return matchesMp4Family(bytes) || matchesEbml(bytes);
}
async strip({
bytes,
}: {
bytes: Uint8Array;
options: StripOptions;
}): Promise<Result<StripResult, ExifError>> {
const container = detectContainer(bytes);
if (container === null) {
return {
ok: false,
error: {
code: "invalid-file-format",
detail:
"ffmpeg fallback: input bytes are not a recognised MP4-family ISOBMFF or EBML (MKV/WebM) container",
},
};
}
const inputName = `in${container.extension}`;
const outputName = `out${container.extension}`;
try {
const core = await this.getInstance();
// Guarantee MEMFS cleanup on every exit path (success, error
// return, or thrown). Privacy invariant: input bytes must not
// persist in WASM linear memory between strips. Without the
// finally, a throw from FS.readFile (zero-byte output, MEMFS
// corruption, OOM) would leave the user's video bytes sitting
// in MEMFS until the next strip's pre-call safeUnlink ran.
try {
// Clean MEMFS state from any prior strip — file names collide
// across calls because the instance is cached.
safeUnlink(core, inputName);
safeUnlink(core, outputName);
core.FS.writeFile(inputName, bytes);
// Stream selection — keep video+audio in their original
// input positions, drop everything else.
//
// `-map 0:v? -map 0:a?` (what we used before) reorders
// streams: ffmpeg processes -map args in argument order,
// so all video streams land first and all audio streams
// land after. That swaps tracks for any input where audio
// came before video (some encoders write audio Track 1,
// video Track 2). The diff view then correctly reports
// per-track changes, but those changes are spurious —
// the content is the same, just renumbered.
//
// `-map 0 -map -0:d? -map -0:s? -map -0:t?` keeps every
// stream by default then explicitly removes data ("d"),
// subtitle ("s"), and attachment ("t") streams.
// `?` makes each removal optional (no error if the type
// isn't present). Track ORDER among the surviving video
// and audio streams is preserved as in the input.
//
// This still drops the GPMF / tmcd / fdsc tracks that
// make mat2 / `-codec copy -map 0` exit 234 on action-cam
// footage: ffmpeg classifies all three as data (`d`) streams.
const args: string[] = [
"-i",
inputName,
"-map",
"0",
"-map",
"-0:d?",
"-map",
"-0:s?",
"-map",
"-0:t?",
"-map_metadata",
"-1",
"-map_chapters",
"-1",
"-fflags",
"+bitexact",
"-c",
"copy",
];
// MP4-specific muxer hardening (no-ops / warnings under matroska):
// +faststart — moov at front for better seek
// -write_btrt 0 — suppress per-track bitrate boxes (technical
// fields not in the input)
// -write_tmcd 0 — suppress timecode atom (we drop tmcd
// streams via -map anyway; this stops ffmpeg
// from synthesising one)
// -empty_hdlr_name true — zero per-track hdlr.name (otherwise
// ffmpeg writes "VideoHandler"/"SoundHandler"
// as a strip-tool fingerprint)
// -metadata:s:{v,a} vendor_id= handler_name= — defense-in-depth:
// blank stream-level vendor/handler tags
// (ffmpeg already writes zeros for these
// in our config, but the explicit clear
// survives future ffmpeg defaults
// changing)
// The udta/meta/hdlr.vendor "appl" bytes are NOT suppressible via
// any flag (hardcoded in mov_write_hdlr_tag at the movie level);
// cleanFfmpegMp4Output() below handles that.
if (container.kind === "mp4") {
args.push(
"-movflags",
"+faststart",
"-write_btrt",
"0",
"-write_tmcd",
"0",
"-empty_hdlr_name",
"true",
"-metadata:s:v",
"vendor_id=",
"-metadata:s:v",
"handler_name=",
"-metadata:s:a",
"vendor_id=",
"-metadata:s:a",
"handler_name=",
);
}
args.push("-metadata", "encoder=");
args.push(outputName);
const rc = core.exec(...args);
if (rc !== 0) {
return {
ok: false,
error: {
code: "parse-failed",
raw: `ffmpeg returned non-zero exit code: ${rc}. ffmpeg stderr is suppressed by the loader's no-op printErr.`,
},
};
}
let outBytes = new Uint8Array(core.FS.readFile(outputName));
// Post-strip cleanup — rewrites ffmpeg's hardcoded udta blocks
// (movie-level and per-track) to `free` skip boxes. ffmpeg's
// MP4 muxer writes the udta unconditionally with handler_type
// "mdir" (ExifTool surfaces as `HandlerType: Metadata`) and
// hardcoded vendor "appl" (`HandlerVendorID: Apple`); the
// rename neutralises both surfaces in one length-preserving
// step, without perturbing stco/co64 offsets. See
// ffmpeg_post_strip.ts for the per-surface rationale and for
// why mdhd.language is deliberately left as ffmpeg's "und"
// default. No-op for matroska containers (different structure).
if (container.kind === "mp4") {
outBytes = cleanFfmpegMp4Output(outBytes);
}
return {
ok: true,
value: {
bytes: outBytes,
walkerEntries: [],
diffDocument: null,
},
};
} finally {
safeUnlink(core, inputName);
safeUnlink(core, outputName);
}
} catch (err: unknown) {
return {
ok: false,
error: {
code: "file-io-error",
detail:
err instanceof Error
? `ffmpeg fallback: ${err.message}`
: `ffmpeg fallback: ${String(err)}`,
},
};
}
}
private async getInstance(): Promise<FfmpegCore> {
if (this.instance !== null) return this.instance;
if (this.loadPromise !== null) return this.loadPromise;
this.loadPromise = loadFfmpegInstance().then(
(inst) => {
this.instance = inst;
return inst;
},
(err: unknown) => {
// Clear the cached rejection so the next strip() can retry.
// Without this, a transient load failure (network blip, gunzip
// error, dynamic import failure) would permanently brick the
// strategy for the rest of the page session. The reset lives
// inside the rejection handler — callers that awaited this
// same promise still observe the failure, but the *next*
// getInstance() call after the rejection starts fresh.
//
// Interaction with the constructor's requestIdleCallback
// prewarm: a prewarm failure still gets swallowed by the
// .catch(() => {}) there, but because we've nulled
// loadPromise, the next strip() triggers a real retry
// rather than re-surfacing the cached rejection.
this.loadPromise = null;
throw err;
},
);
return this.loadPromise;
}
}
function safeUnlink(core: FfmpegCore, name: string): void {
try {
core.FS.unlink(name);
} catch {
// MEMFS unlink throws if the file was never written (early-error
// path) or if a concurrent op already removed it. Either way, ignore.
}
}
// ISOBMFF ftyp box: 4-byte size + "ftyp" + 4-byte major brand + 4-byte
// minor version + N*4-byte compatible brands. MP4 family brands include:
// isom, iso2..iso6, mp41, mp42, avc1, "qt  " (note the two trailing
// spaces), "M4V " (one trailing space), M4VH, M4VP.
const MP4_BRANDS = new Set([
"isom",
"iso2",
"iso3",
"iso4",
"iso5",
"iso6",
"mp41",
"mp42",
"avc1",
"qt  ",
"M4V ",
"M4VH",
"M4VP",
]);
function matchesMp4Family(b: Uint8Array): boolean {
if (b.length < 16) return false;
if (b[4] !== 0x66 || b[5] !== 0x74 || b[6] !== 0x79 || b[7] !== 0x70) {
return false;
}
const boxSize =
((b[0] ?? 0) << 24) |
((b[1] ?? 0) << 16) |
((b[2] ?? 0) << 8) |
(b[3] ?? 0);
const ftypEnd = Math.min(boxSize, b.length);
for (let i = 8; i + 4 <= ftypEnd; i += 4) {
if (i === 12) continue; // skip minor_version slot
const brand = String.fromCharCode(
b[i] ?? 0,
b[i + 1] ?? 0,
b[i + 2] ?? 0,
b[i + 3] ?? 0,
);
if (MP4_BRANDS.has(brand)) return true;
}
return false;
}
function matchesEbml(b: Uint8Array): boolean {
return (
b.length >= 4 &&
b[0] === 0x1a &&
b[1] === 0x45 &&
b[2] === 0xdf &&
b[3] === 0xa3
);
}
function detectContainer(b: Uint8Array): ContainerInfo | null {
if (matchesMp4Family(b)) {
if (b.length >= 12) {
const major = String.fromCharCode(
b[8] ?? 0,
b[9] ?? 0,
b[10] ?? 0,
b[11] ?? 0,
);
if (major === "qt  ") return { kind: "mp4", extension: ".mov" };
if (major.startsWith("M4V")) return { kind: "mp4", extension: ".m4v" };
}
return { kind: "mp4", extension: ".mp4" };
}
if (matchesEbml(b)) {
const head = b.subarray(0, Math.min(b.length, 64));
const decoded = String.fromCharCode(...head);
if (decoded.includes("webm")) {
return { kind: "matroska", extension: ".webm" };
}
return { kind: "matroska", extension: ".mkv" };
}
return null;
}
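
The reordering argument in the stream-selection comment can be checked with a toy model. This is a deliberately simplified sketch, not ffmpeg's implementation: `StreamType` and `applyMaps` are illustrative names, and the model assumes only what the comment states, namely that positive `-map` entries select in argument order and negative entries subtract from the selection.

```typescript
type StreamType = "v" | "a" | "s" | "d" | "t";

// Toy model only: positive -map entries append matching input indices in
// argument order; negative entries remove matches from the selection.
function applyMaps(
  streams: readonly StreamType[],
  maps: readonly string[],
): number[] {
  let selected: number[] = [];
  for (const map of maps) {
    const negative = map.startsWith("-");
    const spec = (negative ? map.slice(1) : map).replace(/\?$/, "");
    // "0:v" -> "v"; a bare "0" has no type part and matches every stream.
    const wanted = spec.split(":")[1];
    const matches: number[] = [];
    streams.forEach((streamType, index) => {
      if (wanted === undefined || streamType === wanted) matches.push(index);
    });
    selected = negative
      ? selected.filter((index) => matches.indexOf(index) === -1)
      : selected.concat(matches);
  }
  return selected;
}

// Input where audio precedes video, plus one data track (e.g. GPMF).
const streams: StreamType[] = ["a", "v", "d"];
const typeFirst = applyMaps(streams, ["0:v?", "0:a?"]);
const orderKeeping = applyMaps(streams, ["0", "-0:d?", "-0:s?", "-0:t?"]);
console.log(typeFirst, orderKeeping);
```

In this model the type-first mapping yields input indices `[1, 0]` (video moved ahead of audio), while the subtractive mapping yields `[0, 1]`, keeping the input order and dropping only the data track.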

@ -0,0 +1,103 @@
// Post-strip cleanup for ffmpeg's MP4 muxer output.
//
// ffmpeg's MP4 muxer (libavformat/movenc.c) ALWAYS writes a 0x21-byte
// `udta/meta/hdlr` block at the movie level (and per-track for some
// inputs) regardless of CLI options. The handler_type is `mdir`
// (iTunes-style metadata directory), which exiftool surfaces as
// `HandlerType: Metadata`. The handler vendor is hardcoded to ASCII
// "appl", which exiftool surfaces as `HandlerVendorID: Apple` —
// actively misrepresenting the file as Apple-vendored.
// `mov_write_meta_tag` runs unconditionally even after
// `-map_metadata -1` clears the payload.
//
// Fix: rewrite the `udta` box type to `free`. ISO/IEC 14496-12 §8.1.2
// defines `free` (and its alias `skip`) as padding — every reader
// ignores the contents. exiftool stops surfacing the HandlerType row,
// the HandlerVendorID row, and any other fields that would have come
// from inside the block. ffprobe also stops reporting it.
// Length-preserving rewrite so `stco`/`co64` offsets to `mdat` stay
// valid.
//
// Why rewrite the *udta* rather than the inner `meta` or `hdlr`: we're
// claiming the entire `udta/meta/hdlr` substructure is ffmpeg-added
// padding. The strip invocation passes `-map_metadata -1 -map_chapters
// -1`, which already drops every user-authored udta child the input
// may have had. Any udta remaining in the output is muxer-synthesised.
// (We rewrite per-track udta too — ffmpeg writes those for some inputs
// as well.)
//
// Why we do NOT touch `mdhd.language`: ffmpeg writes the canonical ISO
// 639-2/T code `und` (0x55C4, "undetermined") when the input had no
// language and copies the input's code when it did. We considered
// zeroing it to suppress the `MediaLanguageCode` row from the diff
// when an input had no surfaceable language and ffmpeg had to invent
// one. We reverted: 0x0000 is not a valid ISO 639-2/T code (decodes
// to three 0x60 bytes) and downstream tools improvise — ffprobe
// falls back to displaying `(eng)`, which is actively misleading for
// downstream tooling that switches on language. `und` is the spec's
// canonical "no language specified" marker; readers handle it
// predictably. The diff cost (one extra row when input had no
// surfaceable language) is acceptable vs. the spec-invalid bytes.
// When the input *did* have a language (e.g. "eng"), the resulting
// `eng → und` diff row is honest — we removed the user's language
// tag, that's exactly what the diff is for.
//
// Other muxer-added surfaces are suppressed at the muxer level via the
// strategy's invocation flags (`-write_btrt 0`, `-write_tmcd 0`,
// `-empty_hdlr_name true`, `-metadata:s:{v,a} vendor_id= handler_name=`).
// This pass only handles what the muxer CLI cannot.
//
// All writes are length-preserving so every byte offset elsewhere in
// the file (including `stco`/`co64` chunk offsets into `mdat`) stays
// valid.
//
// Operates on the MP4 family only (ISOBMFF). Matroska/WebM use EBML and
// don't have the structure.
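
The `und` = 0x55C4 figure in the comment above follows from the packed-language layout used by `mdhd`: one pad bit, then three 5-bit values, each the ASCII code minus 0x60. A minimal sketch of that arithmetic, with `packLanguage` and `unpackLanguage` as illustrative helper names that are not part of this PR:

```typescript
// Pack a three-letter lowercase code into the 15-bit mdhd language field.
function packLanguage(code: string): number {
  if (!/^[a-z]{3}$/.test(code)) {
    throw new Error(`expected three lowercase letters, got "${code}"`);
  }
  let packed = 0;
  for (let i = 0; i < 3; i++) {
    packed = (packed << 5) | (code.charCodeAt(i) - 0x60);
  }
  return packed;
}

// Reverse the packing: each 5-bit group plus 0x60 gives one character.
function unpackLanguage(packed: number): string {
  return String.fromCharCode(
    ((packed >> 10) & 0x1f) + 0x60,
    ((packed >> 5) & 0x1f) + 0x60,
    (packed & 0x1f) + 0x60,
  );
}

console.log(packLanguage("und").toString(16)); // 55c4
const zeroed = unpackLanguage(0x0000);
console.log(zeroed.charCodeAt(0), zeroed.charCodeAt(1), zeroed.charCodeAt(2)); // 96 96 96
```

The second line of output shows why zeroing the field is not a clean "no language" marker: it decodes to three 0x60 bytes, which is not a letter code at all.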
import { parseBoxesSafe, type ParsedBox } from "./video_boxes";
// Mutates the input in place — caller must own the buffer. Returns the
// same buffer for chaining.
export function cleanFfmpegMp4Output(bytes: Uint8Array): Uint8Array {
const top = parseBoxesSafe(bytes, 0, bytes.length);
for (const box of top) {
if (box.type === "moov") walkMoov(bytes, box);
}
return bytes;
}
// Walk a moov subtree: rewrite each udta → free, recurse into trak
// for per-track udta.
function walkMoov(out: Uint8Array, moov: ParsedBox): void {
const moovChildren = parseBoxesSafe(out, moov.payloadStart, moov.payloadEnd);
for (const child of moovChildren) {
if (child.type === "udta") rewriteToFree(out, child);
if (child.type === "trak") walkTrak(out, child);
}
}
function walkTrak(out: Uint8Array, trak: ParsedBox): void {
const trakChildren = parseBoxesSafe(out, trak.payloadStart, trak.payloadEnd);
for (const child of trakChildren) {
// Per-track udta (some inputs make ffmpeg synthesise these).
if (child.type === "udta") rewriteToFree(out, child);
}
}
// Rewrite a box's 4-byte type field to "free", making readers treat the
// entire box (header + payload) as padding. The box's size field is
// untouched so the file's total length and every offset elsewhere
// stays valid. The type field always sits at headerStart + 4..7 in
// ISO/IEC 14496-12 layout — that's the universal location, whether the
// box uses the 8-byte regular header or the 16-byte largesize header
// (size==1, followed by an 8-byte uint64 size). Using headerStart + 4
// rather than payloadStart - 4 guarantees correctness across both:
// for largesize, payloadStart - 4 would land inside the uint64
// largesize field, not the type bytes.
function rewriteToFree(out: Uint8Array, box: ParsedBox): void {
const typeOffset = box.headerStart + 4;
out[typeOffset] = 0x66; // 'f'
out[typeOffset + 1] = 0x72; // 'r'
out[typeOffset + 2] = 0x65; // 'e'
out[typeOffset + 3] = 0x65; // 'e'
}
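
The length-preserving property of the rewrite can be seen on a hand-built box. The sketch below is self-contained and does not use `parseBoxesSafe`; `makeBox`, `readType`, and `retypeToFree` are hypothetical helpers, and only the regular 8-byte header is modelled.

```typescript
// Build a minimal ISOBMFF box with the regular 8-byte header:
// 4-byte big-endian size, then the 4-byte type, then the payload.
function makeBox(type: string, payload: Uint8Array): Uint8Array {
  const box = new Uint8Array(8 + payload.length);
  new DataView(box.buffer).setUint32(0, box.length, false);
  for (let i = 0; i < 4; i++) box[4 + i] = type.charCodeAt(i);
  box.set(payload, 8);
  return box;
}

// Read the 4-character type at headerStart + 4.
function readType(bytes: Uint8Array, headerStart: number): string {
  let type = "";
  for (let i = 4; i < 8; i++) {
    type += String.fromCharCode(bytes[headerStart + i] ?? 0);
  }
  return type;
}

// Overwrite only the type field; size and payload bytes are untouched.
function retypeToFree(bytes: Uint8Array, headerStart: number): void {
  bytes.set([0x66, 0x72, 0x65, 0x65], headerStart + 4); // "free"
}

const udta = makeBox("udta", new Uint8Array([0xde, 0xad, 0xbe, 0xef]));
const sizeBefore = new DataView(udta.buffer).getUint32(0, false);
retypeToFree(udta, 0);
const sizeAfter = new DataView(udta.buffer).getUint32(0, false);
console.log(readType(udta, 0), sizeBefore === sizeAfter, udta.length);
```

The declared size and the total byte length are identical before and after, which is the reason offsets stored elsewhere in the file stay valid.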

@ -0,0 +1,155 @@
// DOM element IDs where the standalone build stashes the core JS + WASM
// as Base64. Read on first load via document.getElementById. The HTML has
// `<script type="text/plain" id="…">…</script>` blocks holding the bytes
// — the browser stores the text but doesn't parse it as JS, so the
// initial JS-parse cost stays bounded to the small wrapper code instead
// of paying for V8 to allocate the multi-MB Base64 strings as module-
// scope literals. Mirrors the zeroperl-wasm-base64 pattern in
// exiftool_wasm_fetch.ts.
const STANDALONE_CORE_JS_ID = "ffmpeg-core-js-base64";
const STANDALONE_CORE_WASM_ID = "ffmpeg-core-wasm-base64";
// ffmpeg-core's exec() / FS API. The `@ffmpeg/core` package ships a
// factory function whose return shape isn't exported as a type from the
// package, so we type the bits we use here. The factory's result is typed
// as `unknown` in resolveCore below and cast to FfmpegCore in
// loadFfmpegInstance.
export interface FfmpegCore {
FS: {
writeFile: (name: string, data: Uint8Array) => void;
readFile: (name: string) => Uint8Array;
unlink: (name: string) => void;
};
exec: (...args: string[]) => number;
ffprobe?: (...args: string[]) => number;
setLogger?: (cb: (msg: { type: string; message: string }) => void) => void;
setTimeout?: (timeout: number) => void;
reset?: () => void;
}
// Returns a ready-to-use ffmpeg-core instance. Loaded directly into the
// main thread — we deliberately do NOT use the @ffmpeg/ffmpeg wrapper,
// which spawns a `type: "module"` Web Worker from a Blob URL. Module
// workers from Blob URLs fail silently when the page's origin is `null`
// (file://, e.g. the standalone HTML), with cross-origin error censoring
// hiding the actual cause. Running ffmpeg in the main thread is the same
// strategy ImageStrategy / PdfStrategy use; it blocks the UI during
// exec() but the strip is short enough (< 1s on a 236 MB DJI Phantom 4
// fixture, per docs/forensic/ffmpeg-fallback.md) that UX is acceptable.
export async function loadFfmpegInstance(): Promise<FfmpegCore> {
const { factory, wasmBytes } = await resolveCore();
// ffmpeg-core's Emscripten module detects environment via `self`. In a
// browser main thread `self === window`, both are defined by the
// engine. No shim needed (Node-side shim lives in
// tools/forensic/ffmpeg-fallback.ts where the strategy isn't used).
const core = await factory({
wasmBinary: wasmBytes,
// Quiet by default — the strategy surfaces errors via rc != 0
// rather than relying on ffmpeg's stderr noise.
print: () => {},
printErr: () => {},
});
return core as FfmpegCore;
}
interface CoreSources {
factory: (config: unknown) => Promise<unknown>;
wasmBytes: Uint8Array;
}
async function resolveCore(): Promise<CoreSources> {
// Standalone build: bytes are inlined as gzipped+base64 in script-text-
// plain elements. Decompress + decode each, feed the wasm bytes straight
// into createFFmpegCore via `wasmBinary`. Gzipping cut the standalone
// HTML payload from ~116 MB → ~40 MB — HTML parse time scales with text-
// node size at page load, so this is the start-time win.
if (typeof document !== "undefined") {
const inlined = await readInlinedCore();
if (inlined !== null) return inlined;
}
// `__WITH_STANDALONE_INLINE__` is replaced at build time by Vite's
// `define` config (true in the standalone build, false in PWA/APK).
// Gating the bare `import("@ffmpeg/core")` behind this flag lets
// Rollup tree-shake the entire branch in the standalone build,
// dropping ~43 MB of ffmpeg-core factory + its data: URL wasm
// fallback that Vite would otherwise statically bundle into the
// single-file HTML (the runtime never reaches this branch in the
// standalone target because readInlinedCore above returns first, but
// the static analyser doesn't know that).
if (__WITH_STANDALONE_INLINE__) {
throw new Error(
"standalone build reached the PWA fetch path — inline tags missing",
);
}
// PWA / APK build: Vite-emitted asset URLs. Imports go through the
// @ffmpeg/core exports map. Vite's ?url suffix gives us hashed asset
// URLs; we fetch the WASM bytes ourselves and import the JS factory.
const [{ default: factory }, wasmMod] = await Promise.all([
import("@ffmpeg/core"),
import("@ffmpeg/core/wasm?url"),
]);
const wasmBytes = new Uint8Array(
await (await fetch(wasmMod.default)).arrayBuffer(),
);
return {
factory: factory as (config: unknown) => Promise<unknown>,
wasmBytes,
};
}
async function readInlinedCore(): Promise<CoreSources | null> {
const coreJsB64 = document
.getElementById(STANDALONE_CORE_JS_ID)
?.textContent?.trim();
const coreWasmB64 = document
.getElementById(STANDALONE_CORE_WASM_ID)
?.textContent?.trim();
if (
coreJsB64 === undefined ||
coreJsB64 === "" ||
coreWasmB64 === undefined ||
coreWasmB64 === ""
) {
return null;
}
// Decompress both payloads concurrently. The browser's native base64
// → bytes (via fetch(data:URL)) plus DecompressionStream("gzip") is
// far faster than a JS-side atob+pako loop on multi-MB inputs.
const [wasmBytes, jsBytes] = await Promise.all([
base64GunzipToBytes(coreWasmB64),
base64GunzipToBytes(coreJsB64),
]);
// The core JS is an Emscripten-generated ESM module. Build a Blob URL
// for it and dynamic-import to get the createFFmpegCore default
// export. One-shot module import, not a Worker spawn, so the null-
// origin Blob URL restriction affecting type:module workers does NOT
// apply.
const coreBlob = new Blob([jsBytes], { type: "text/javascript" });
const coreUrl = URL.createObjectURL(coreBlob);
const factoryPromise = import(/* @vite-ignore */ coreUrl).then(
(mod): ((config: unknown) => Promise<unknown>) => {
URL.revokeObjectURL(coreUrl);
return mod.default;
},
);
return {
factory: async (config: unknown) => (await factoryPromise)(config),
wasmBytes,
};
}
// Browser-native base64 → bytes (via fetch on a data: URL) followed by
// DecompressionStream gunzip. Used for both the wasm payload (~30 MB raw
// → ~10 MB gz) and the core JS (~110 KB raw → ~30 KB gz). Runtime cost
// ~30 ms total on the wasm side; pays for itself many times over against
// the ~600 ms HTML-parse savings from a smaller text node.
async function base64GunzipToBytes(b64: string): Promise<Uint8Array> {
const gzipped = await (
await fetch(`data:application/octet-stream;base64,${b64}`)
).blob();
const decompressed = await new Response(
gzipped.stream().pipeThrough(new DecompressionStream("gzip")),
).arrayBuffer();
return new Uint8Array(decompressed);
}

@ -4,6 +4,7 @@ import { JpegStrategy } from "./strategies/jpeg_strategy";
import { PngStrategy } from "./strategies/png_strategy";
import { PdfStrategy } from "./strategies/pdf_strategy";
import { ExifToolFallbackStrategy } from "./strategies/exiftool_fallback_strategy";
import { FfmpegFallbackStrategy } from "./strategies/ffmpeg_fallback_strategy";
import type { FormatStrategy } from "./format_strategy";
// VITE_ENABLE_EXIFTOOL_FALLBACK gates the ExifTool-in-WASM fallback. Default
@ -13,8 +14,21 @@ import type { FormatStrategy } from "./format_strategy";
const ENABLE_EXIFTOOL_FALLBACK =
import.meta.env.VITE_ENABLE_EXIFTOOL_FALLBACK !== "false";
// VITE_ENABLE_FFMPEG_FALLBACK gates the ffmpeg-wasm strategy (#182). Default
// on for every build target; "false" omits the engine and falls back to
// VideoStrategy for MP4/MOV/M4V. See docs/poc/ffmpeg-wasm.md and
// docs/gap-analysis/mp4-ffmpeg.md.
const ENABLE_FFMPEG_FALLBACK =
import.meta.env.VITE_ENABLE_FFMPEG_FALLBACK !== "false";
const STRATEGIES: readonly FormatStrategy[] = [
new OfficeStrategy(),
// FfmpegFallbackStrategy claims .mp4/.mov/.m4v AHEAD of VideoStrategy when
// enabled — it closes the walker's KNOWN_GAPS (#38, #39, #111, #42) by
// re-writing the container from the stream tables. VideoStrategy stays
// in the list as the opt-out fallback (and during the validation window
// per #182's "delete walker in a follow-up PR" plan).
...(ENABLE_FFMPEG_FALLBACK ? [new FfmpegFallbackStrategy()] : []),
new VideoStrategy(),
new JpegStrategy(),
new PngStrategy(),
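
The effect of registration order can be sketched with stand-in objects. This is an illustration under stated assumptions: the lookup is assumed to return the first strategy whose `extensions` set contains the file extension, the extension set given for the video stand-in is assumed, and `Candidate`, `buildRegistry`, and `pick` are hypothetical names.

```typescript
interface Candidate {
  readonly label: string;
  readonly extensions: ReadonlySet<string>;
}

// Stand-ins for the two strategies; the video extension list is assumed.
const ffmpegCandidate: Candidate = {
  label: "ffmpeg",
  extensions: new Set([".mp4", ".mov", ".m4v", ".mkv", ".webm"]),
};
const videoCandidate: Candidate = {
  label: "video",
  extensions: new Set([".mp4", ".mov", ".m4v"]),
};

// Mirrors the conditional spread in the STRATEGIES array.
function buildRegistry(enableFfmpeg: boolean): Candidate[] {
  return [...(enableFfmpeg ? [ffmpegCandidate] : []), videoCandidate];
}

// Assumed lookup rule: first registered match wins.
function pick(registry: Candidate[], extension: string): Candidate | undefined {
  return registry.find((candidate) => candidate.extensions.has(extension));
}

console.log(pick(buildRegistry(true), ".mp4")?.label);
console.log(pick(buildRegistry(false), ".mp4")?.label);
console.log(pick(buildRegistry(false), ".mkv")?.label);
```

Under these assumptions the flag only changes which strategy answers for the MP4 family; with the flag off, `.mkv` has no claimant in this two-entry model.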

@ -64,6 +64,25 @@ export class WasmProcessor implements MetadataProcessorPort {
// slate by constructing a new WasmProcessor.
private diffWarmupSignalled = false;
// Promise chain that serializes ALL diff builds across the processor's
// lifetime. `@uswriting/exiftool`'s parseMetadata uses module-level
// singletons (the Perl interpreter, MemoryFileSystem, stdout/stderr
// StringBuilders) — every call does `c.clear(), m.clear(), await
// e.reset()` on the shared state. Two concurrent readDocument calls
// race on those buffers. We already serialize before+after within
// a single entry below; this chain extends the same guarantee across
// entries so that `use_process_files`'s mid-loop + end-of-batch
// fire-and-forget drainDiffQueue invocations (which can overlap when
// a batch crosses DIFF_DRAIN_CHUNK boundaries) can't interleave with
// each other on the singleton.
//
// Each new diff awaits the previous one before starting its own pair
// of parseMetadata calls. Cost: same as serial drain order at the
// hook layer (which we already serialize per-drain anyway) — no new
// wall-clock penalty in the common case; correctness guarantee for
// the overlapping-drain edge case.
private diffChain: Promise<unknown> = Promise.resolve();
constructor({ fileBytes }: { fileBytes: FileBytesPort }) {
this.fileBytes = fileBytes;
this.diffStrategy = new ExifToolDiffStrategy();
@ -166,17 +185,41 @@ export class WasmProcessor implements MetadataProcessorPort {
dispatchExifToolDiffLoading();
}
+ // Queue this diff onto the singleton chain — guarantees no two
+ // parseMetadata calls (within or across entries) overlap on the
+ // shared Perl/StringBuilder state. See diffChain field docstring.
+ const next = this.diffChain.then(() => this.runDiff(pending));
+ // Swallow rejection on the chain itself so one failure doesn't
+ // poison the chain for subsequent diffs. `runDiff` already returns
+ // null on error; the chain just needs to keep its head live.
+ this.diffChain = next.catch(() => null);
+ return next;
+ }
+ // Within a single diff, serialize before+after sequentially.
+ // `@uswriting/exiftool`'s parseMetadata uses module-level singletons
+ // for the Perl interpreter, the MemoryFileSystem, AND the stdout/
+ // stderr StringBuilders. Every call does `c.clear(), m.clear(), await
+ // e.reset()` on that shared state. Running before+after as Promise.all
+ // interleaves the resets and the stdout reads — empirically (Node
+ // repro) the second pair onward came back with both reads returning
+ // the same buffer contents (same key count, same JSON), so the diff
+ // renderer saw no changes. File 1 escaped because the cold-start
+ // `T()` boot blocked one of the two reads long enough for the other
+ // to finish, but once perl was warm the race fired on every
+ // subsequent file. Serial avoids the race entirely.
+ private async runDiff(
+ pending: PendingDiffInputs,
+ ): Promise<MetadataDocument | null> {
try {
- const [beforeResult, afterResult] = await Promise.all([
- this.diffStrategy.readDocument({
- bytes: pending.sourceBytes,
- extension: pending.extension,
- }),
- this.diffStrategy.readDocument({
- bytes: pending.strippedBytes,
- extension: pending.extension,
- }),
- ]);
+ const beforeResult = await this.diffStrategy.readDocument({
+ bytes: pending.sourceBytes,
+ extension: pending.extension,
+ });
+ const afterResult = await this.diffStrategy.readDocument({
+ bytes: pending.strippedBytes,
+ extension: pending.extension,
+ });
if (!beforeResult.ok || !afterResult.ok) {
return null;
}
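The diffChain pattern in the hunk above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the WasmProcessor code: `SerialQueue`, `enqueue`, and `Task` are hypothetical names introduced here, and the task stands in for any call that touches shared singleton state (such as a pair of parseMetadata reads).

```typescript
// Hypothetical sketch of promise-chain serialization. `SerialQueue`,
// `enqueue`, and `Task` are illustrative names, not project APIs.
type Task<T> = () => Promise<T>;

class SerialQueue {
  // Tail of the chain. It never rejects, because failures are swallowed
  // when the tail is updated in `enqueue`.
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(task: Task<T>): Promise<T> {
    // Start this task only after every earlier task has settled.
    const next = this.tail.then(() => task());
    // Swallow rejection on the chain itself so one failure does not block
    // later tasks. The caller still observes the failure through `next`.
    this.tail = next.catch(() => null);
    return next;
  }
}
```

Each caller gets its own promise back, so a rejected task surfaces to its caller while the queue keeps accepting work. The cost is that tasks never overlap, which is the point when the underlying state is a module-level singleton.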

Binary file not shown.

@ -67,6 +67,51 @@ test.describe("Standalone single-file HTML build", () => {
);
});
// Regression test for #182 standalone HTML hang. The original implementation
// used the @ffmpeg/ffmpeg wrapper which spawns a type:"module" Web Worker
// from a Blob URL. Module Workers from Blob URLs fail silently when the
// page origin is `null` (file://), with cross-origin error censoring hiding
// the cause. The strip would hang forever waiting for a response from a
// dead worker. Switching to main-thread @ffmpeg/core fixed it. This test
// would have caught the regression — it drops an MP4 and waits for the
// completion row, with a tight timeout that fails fast on hangs.
test("strips an MP4 via the ffmpeg fallback under file://", async ({
page,
}) => {
// Cold WASM init (~3-5s) + download flush dominates wall time; bump
// the outer test ceiling so the inner waits can use their 45s budget.
test.setTimeout(90_000);
const consoleErrors: string[] = [];
page.on("console", (msg) => {
if (msg.type() === "error") consoleErrors.push(msg.text());
});
page.on("pageerror", (err) => {
consoleErrors.push(`pageerror: ${err.message}`);
});
await page.goto(indexUrl);
await page.waitForLoadState("domcontentloaded");
await page.waitForSelector("[role='main']", { timeout: 10_000 });
await expect(page.locator(".drop-zone")).toBeVisible();
const { bytes, filename } = await captureDownload(page, async () => {
await dropFiles(page, [fixturePath("sample-real.mp4")]);
// Generous timeout because the standalone HTML pays a one-time cold
// WASM init on first video drop (~3-5 s) before processing. If the
// strip hangs (e.g. Worker dies silently), the 45 s ceiling here
// trips well before the wider Playwright default.
await page.waitForSelector(".file-table__row--complete", {
timeout: 45_000,
});
});
expect(filename).toMatch(/\.mp4$/i);
await assertOutputStripped(bytes, filename);
expect(consoleErrors, "Unexpected console errors during MP4 strip").toEqual(
[],
);
});
// Runtime complement to tests/e2e/web/no-network.spec.ts. The web build
// proves "SW + cache serve the whole pipeline" via mid-session route
// abort. The standalone build can't use the same pattern (no SW to fall

@ -39,6 +39,27 @@ test.describe("File Processing — drag-drop (Web)", () => {
await assertOutputStripped(bytes, filename);
});
// End-to-end coverage of the FfmpegFallbackStrategy (#182). The PWA path
// uses Vite-emitted ?url assets for ffmpeg-core (not the inline-base64
// path the standalone uses), so this complements the standalone MP4 test:
// they exercise the two distinct asset-resolution branches.
test("strips metadata from an MP4 via the ffmpeg fallback", async ({
page,
}) => {
test.setTimeout(60_000);
const { bytes, filename } = await captureDownload(page, async () => {
await dropFiles(page, [fixturePath("sample-real.mp4")]);
// Generous timeout: first video drop pays a one-time ~3-5 s WASM
// init for ffmpeg-core. Subsequent strips reuse the instance.
await page.waitForSelector(".file-table__row--complete", {
timeout: 45_000,
});
});
expect(filename).toMatch(/\.mp4$/i);
await assertOutputStripped(bytes, filename);
});
test("multi-file drag-drop bundles outputs into a single flat zip", async ({
page,
}) => {

@ -97,6 +97,78 @@ export async function assertDocxStripped(bytes: Buffer): Promise<void> {
}
}
/**
* Bytewise checks on a stripped MP4/MOV/MKV/WebM output. The
* FfmpegFallbackStrategy invocation uses `-fflags +bitexact`,
* `-map_metadata -1`, and `-metadata encoder=`, so the output must
* not carry an `Lavf<version>` muxer fingerprint, any of the
* sentinel metadata strings the `sample.mp4` fixture seeded
* (`Test Author`, `Test Video`, the dc:title XMP block), or the
* Adobe XMP namespace marker that comes from the seeded uuid box.
*
* The check is fixture-aware (knows what `tests/e2e/fixtures/sample.mp4`
* seeded). If a different fixture is used, extend this list.
*/
export function assertVideoStripped(bytes: Buffer): void {
const ascii = bytes.toString("latin1");
expect(ascii, "Stripped video should not carry Lavf encoder fingerprint").not.toMatch(
/Lavf\d/,
);
expect(ascii, "Stripped video should not contain 'Test Author' sentinel").not.toContain(
"Test Author",
);
expect(ascii, "Stripped video should not contain 'Test Video' sentinel").not.toContain(
"Test Video",
);
expect(ascii, "Stripped video should not contain XMP namespace marker").not.toContain(
"http://ns.adobe.com/xap/1.0/",
);
expect(ascii, "Stripped video should not contain ExifTool authorship").not.toContain(
"Image::ExifTool",
);
// ffmpeg's MP4 muxer always writes a udta/meta/hdlr block at the movie
// level (and per-track for some inputs) regardless of CLI options.
// ExifTool surfaces this as `HandlerType: Metadata` and (without the
// vendor patch) `HandlerVendorID: Apple`. Our post-strip pass renames
// every udta box type to `free`, which readers treat as padding.
//
// Walk the ISOBMFF box tree (moov → top-level udta + per-trak udta)
// and assert no udta box survives. A naive substring search on
// "udta" false-positives on mdat byte collisions (~6% at 250 MB);
// pinning to a specific size byte (e.g. 0x21) is brittle to ffmpeg
// version drift in the udta payload size. The structural walk
// matches what the post-strip pass itself rewrites.
const moov = findTopLevelBox(bytes, "moov");
expect(moov, "Stripped MP4 should contain a moov box").not.toBeNull();
if (moov !== null) {
const udtaInMoov = findChildBox(bytes, moov, "udta");
expect(udtaInMoov, "Stripped MP4 should not contain moov/udta").toBeNull();
// Per-track udta lives under moov/trak — walk every trak.
const traks = findAllChildBoxes(bytes, moov, "trak");
for (const trak of traks) {
const udtaInTrak = findChildBox(bytes, trak, "udta");
expect(
udtaInTrak,
"Stripped MP4 should not contain moov/trak/udta",
).toBeNull();
}
}
// ffmpeg's btrt (Bitrate Box) writes BufferSize/MaxBitrate/AverageBitrate
// from the input stream stats. Suppressed via -write_btrt 0 in the strategy
// invocation. If the flag stops working, btrt boxes will reappear.
expect(ascii, "Stripped video should not contain ffmpeg's btrt bitrate box").not.toMatch(
/btrt/,
);
// Per-track hdlr.name fields (one per video/audio trak). ffmpeg's default
// is "VideoHandler"/"SoundHandler"; we suppress via -empty_hdlr_name true.
expect(ascii, "Stripped video should not contain ffmpeg's default handler names").not.toContain(
"VideoHandler",
);
expect(ascii, "Stripped video should not contain ffmpeg's default handler names").not.toContain(
"SoundHandler",
);
}
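The "~6% at 250 MB" figure quoted in the comment inside assertVideoStripped can be sanity-checked with a short calculation. The sketch below is an estimate, not a measurement: it assumes payload bytes are independent and uniformly random (compressed media only approximates this) and treats each byte offset as an independent trial. `substringCollisionProbability` is a hypothetical helper name.

```typescript
// Assumption: payload bytes are independent and uniformly random, which
// compressed media only approximates. Offsets are treated as independent.
function substringCollisionProbability(
  fileBytes: number,
  patternLength: number = 4,
): number {
  // Number of offsets at which the pattern could start.
  const positions = Math.max(0, fileBytes - patternLength + 1);
  // Chance that one given offset spells out the pattern.
  const perPosition = Math.pow(256, -patternLength);
  // 1 - (1 - p)^n, computed via log1p/expm1 to keep precision for tiny p.
  return -Math.expm1(positions * Math.log1p(-perPosition));
}

// Probability that a 4-byte tag such as "udta" appears by chance
// somewhere in a 250 MB payload.
const at250MB = substringCollisionProbability(250_000_000);
```

Under these assumptions the result lands a little under 6%, which is in line with the comment and is why the assertion walks the box tree instead of searching for the substring.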
export async function assertOutputStripped(
bytes: Buffer,
filename: string,
@ -107,6 +179,13 @@ export async function assertOutputStripped(
case "jpeg":
assertJpegStripped(bytes);
return;
case "mp4":
case "mov":
case "m4v":
case "mkv":
case "webm":
assertVideoStripped(bytes);
return;
case "png":
// Block accidental use until PNG strategy lands. The current
// substring-scan implementation in assertPngStripped is fragile
@ -128,3 +207,91 @@ export async function assertOutputStripped(
}
}
// Minimal ISOBMFF box walker for the udta assertion above. Returns
// `null` on any malformed input rather than throwing — the assertion
// failing with "moov box missing" is more useful than a parse error.
//
// We don't share the strategy's `parseBoxesSafe` here to keep the test
// helper free of `src/` imports; the structure is tiny enough that a
// local copy is fine and the two implementations stay independent
// (the same adversarial-independence rule the forensic runners follow).
interface BoxSpan {
readonly type: string;
readonly payloadStart: number;
readonly payloadEnd: number;
}
function findTopLevelBox(bytes: Buffer, type: string): BoxSpan | null {
return findBoxIn(bytes, 0, bytes.length, type);
}
function findChildBox(
bytes: Buffer,
parent: BoxSpan,
type: string,
): BoxSpan | null {
return findBoxIn(bytes, parent.payloadStart, parent.payloadEnd, type);
}
function findAllChildBoxes(
bytes: Buffer,
parent: BoxSpan,
type: string,
): BoxSpan[] {
const out: BoxSpan[] = [];
let offset = parent.payloadStart;
while (offset + 8 <= parent.payloadEnd) {
const box = parseBoxAt(bytes, offset, parent.payloadEnd);
if (box === null) return out;
if (box.span.type === type) out.push(box.span);
offset = box.next;
}
return out;
}
function findBoxIn(
bytes: Buffer,
start: number,
end: number,
type: string,
): BoxSpan | null {
let offset = start;
while (offset + 8 <= end) {
const box = parseBoxAt(bytes, offset, end);
if (box === null) return null;
if (box.span.type === type) return box.span;
offset = box.next;
}
return null;
}
function parseBoxAt(
bytes: Buffer,
offset: number,
end: number,
): { span: BoxSpan; next: number } | null {
if (offset + 8 > end) return null;
let size = bytes.readUInt32BE(offset);
const type = bytes.toString("latin1", offset + 4, offset + 8);
let headerSize = 8;
if (size === 1) {
if (offset + 16 > end) return null;
const high = bytes.readUInt32BE(offset + 8);
const low = bytes.readUInt32BE(offset + 12);
if (high >= 0x20_0000) return null; // 2^53 limit safeguard
size = high * 0x1_0000_0000 + low;
headerSize = 16;
} else if (size === 0) {
size = end - offset;
}
if (size < headerSize || offset + size > end) return null;
return {
span: {
type,
payloadStart: offset + headerSize,
payloadEnd: offset + size,
},
next: offset + size,
};
}

@ -6,8 +6,9 @@
// metadata document before strip on the left, after strip on the right.
//
// JPEG: drop sample.jpg, row becomes expandable, click → two-pane diff
// shows with EXIF/JFIF source groups, removed entries strike-through on
// the left, placeholder on the right.
// shows with ExifTool family-1 source groups (IFD0, JFIF, etc. — surfaced
// verbatim since 1c3ced5; no umbrella collapse to "EXIF"), removed entries
// strike-through on the left, placeholder on the right.
//
// PDF: drop sample.pdf, same flow. Source label is "PDF" (ExifTool's PDF
// group encompasses Info dict + XMP).
@ -25,7 +26,7 @@ test.describe("Metadata diff expansion (two-pane via ExifTool)", () => {
await launchPage(page);
});
test("JPEG file shows expandable two-pane diff with EXIF group", async ({
test("JPEG file shows expandable two-pane diff with IFD0 group", async ({
page,
isMobile,
browserName,
@ -71,9 +72,15 @@ test.describe("Metadata diff expansion (two-pane via ExifTool)", () => {
const diff = page.locator(".file-table__diff--two-pane");
await expect(diff).toBeVisible();
// EXIF source group header is present (sample.jpg has EXIF tags).
// IFD0 source group header is present (sample.jpg carries Make + Model
// in IFD0). Since 1c3ced5 the diff strategy surfaces ExifTool family-1
// group names verbatim — IFD0, ExifIFD, XMP-dc, etc. — and explicitly
// does NOT collapse them to umbrella labels like "EXIF" (collapsing
// causes (source, name) key collisions across sub-groups, see the
// commit message and exiftool_diff_strategy.ts mapGroupToSource for
// the rationale).
await expect(
diff.locator(".file-table__diff-group-header", { hasText: /EXIF/ }),
diff.locator(".file-table__diff-group-header", { hasText: /IFD0/ }),
).toBeVisible();
// At least one row is classified as removed (ExifTool reads zero

@ -39,7 +39,7 @@ describe("ExifToolDiffStrategy", () => {
}
}, 15_000);
it("maps ExifTool IFD0 / ExifIFD groups to 'EXIF' source label", async () => {
it("surfaces ExifTool family-1 groups verbatim (no collapse)", async () => {
const strategy = new ExifToolDiffStrategy();
const result = await strategy.readDocument({
bytes: loadFixture("sample.jpg"),
@ -49,13 +49,19 @@ describe("ExifToolDiffStrategy", () => {
if (!result.ok) return;
const sources = new Set(result.value.map((e) => e.source));
// IFD0 + ExifIFD + InteropIFD + IFD1 should all collapse to "EXIF".
// We don't assert ALL of them present (depends on fixture), but
// confirm "IFD0" / "ExifIFD" don't leak through as-is.
expect(sources.has("IFD0")).toBe(false);
expect(sources.has("ExifIFD")).toBe(false);
expect(sources.has("InteropIFD")).toBe(false);
expect(sources.has("IFD1")).toBe(false);
// We deliberately preserve raw ExifTool group names — collapsing
// sub-groups (IFD0/IFD1/ExifIFD → "EXIF", XMP-* → "XMP",
// QuickTime/ItemList/UserData → "MP4", Track1/Track2 → "MP4")
// causes diff-renderer key collisions when the same tag name
// appears in two sub-groups (e.g. Track1:HandlerType vs
// Track2:HandlerType). The sample.jpg fixture is known to carry
// at least an IFD0 group; assert it surfaces under that name
// rather than getting flattened to "EXIF".
expect(sources.has("IFD0")).toBe(true);
// Conversely, the collapsed labels we used to emit ("EXIF") should
// no longer appear from the diff strategy — any test elsewhere
// using `source: "EXIF"` is on synthetic data, not live diff data.
expect(sources.has("EXIF")).toBe(false);
}, 15_000);
it("drops File:* / ExifTool:* / System:* / Composite:* / SourceFile entries", async () => {

@ -0,0 +1,156 @@
import { afterEach, describe, it, expect, vi } from "vitest";
import { FfmpegFallbackStrategy } from "../../../src/infrastructure/wasm/strategies/ffmpeg_fallback_strategy";
// FfmpegFallbackStrategy loads @ffmpeg/core directly in the main thread (no
// @ffmpeg/ffmpeg wrapper, no Web Worker). End-to-end strip behaviour is
// exercised by the Playwright e2e tests in tests/e2e/standalone/standalone.spec.ts
// and tests/e2e/web/file-processing.spec.ts, plus the Node forensic runner at
// tools/forensic/ffmpeg-fallback.ts (which shims browser globals so @ffmpeg/core
// runs under Node). The Vitest tests in this file cover:
// - Static surface (extension claim, magic-byte verification)
// - Registry gating (VITE_ENABLE_FFMPEG_FALLBACK env flag)
describe("FfmpegFallbackStrategy", () => {
it("claims .mp4 / .mov / .m4v / .mkv / .webm in Phase 1+2", () => {
const strategy = new FfmpegFallbackStrategy();
// Phase 1
expect(strategy.extensions.has(".mp4")).toBe(true);
expect(strategy.extensions.has(".mov")).toBe(true);
expect(strategy.extensions.has(".m4v")).toBe(true);
// Phase 2 (forensic-verified in tools/forensic/ffmpeg-fallback.ts)
expect(strategy.extensions.has(".mkv")).toBe(true);
expect(strategy.extensions.has(".webm")).toBe(true);
// Deliberately out of scope — separate strategies / future PRs.
expect(strategy.extensions.has(".avi")).toBe(false);
expect(strategy.extensions.has(".wmv")).toBe(false);
expect(strategy.extensions.has(".3gp")).toBe(false);
});
it("verifyMagicBytes accepts EBML headers (MKV / WebM)", () => {
const strategy = new FfmpegFallbackStrategy();
const ebml = new Uint8Array([
0x1a, 0x45, 0xdf, 0xa3, 0x9f, 0x42, 0x86, 0x81,
0x01, 0x42, 0xf7, 0x81, 0x01, 0x42, 0xf2, 0x81,
]);
expect(strategy.verifyMagicBytes?.({ bytes: ebml })).toBe(true);
});
it("verifyMagicBytes accepts ISOBMFF files with MP4-family brands", () => {
const strategy = new FfmpegFallbackStrategy();
// Standard MP4: size + "ftyp" + "isom" + minor + compat["isom","iso2","avc1","mp41"]
const isom = new Uint8Array([
0x00, 0x00, 0x00, 0x20, 0x66, 0x74, 0x79, 0x70,
0x69, 0x73, 0x6f, 0x6d, 0x00, 0x00, 0x02, 0x00,
0x69, 0x73, 0x6f, 0x6d, 0x69, 0x73, 0x6f, 0x32,
0x61, 0x76, 0x63, 0x31, 0x6d, 0x70, 0x34, 0x31,
]);
expect(strategy.verifyMagicBytes?.({ bytes: isom })).toBe(true);
// QuickTime MOV: "ftyp" + "qt " (with spaces, four chars)
const qt = new Uint8Array([
0x00, 0x00, 0x00, 0x14, 0x66, 0x74, 0x79, 0x70,
0x71, 0x74, 0x20, 0x20, 0x00, 0x00, 0x02, 0x00,
0x71, 0x74, 0x20, 0x20,
]);
expect(strategy.verifyMagicBytes?.({ bytes: qt })).toBe(true);
// iPhone HEVC MOV variant: brand = "mp42" major
const mp42 = new Uint8Array([
0x00, 0x00, 0x00, 0x18, 0x66, 0x74, 0x79, 0x70,
0x6d, 0x70, 0x34, 0x32, 0x00, 0x00, 0x00, 0x01,
0x6d, 0x70, 0x34, 0x32, 0x69, 0x73, 0x6f, 0x6d,
]);
expect(strategy.verifyMagicBytes?.({ bytes: mp42 })).toBe(true);
});
it("verifyMagicBytes rejects look-alikes and unrelated containers", () => {
const strategy = new FfmpegFallbackStrategy();
// AVIF — different ISOBMFF major brand, not MP4 family
const avif = new Uint8Array([
0x00, 0x00, 0x00, 0x20, 0x66, 0x74, 0x79, 0x70,
0x61, 0x76, 0x69, 0x66, 0x00, 0x00, 0x00, 0x00,
]);
expect(strategy.verifyMagicBytes?.({ bytes: avif })).toBe(false);
// HEIC — also ISOBMFF but heif/heic brand
const heic = new Uint8Array([
0x00, 0x00, 0x00, 0x20, 0x66, 0x74, 0x79, 0x70,
0x68, 0x65, 0x69, 0x63, 0x00, 0x00, 0x00, 0x00,
]);
expect(strategy.verifyMagicBytes?.({ bytes: heic })).toBe(false);
// PNG header
const png = new Uint8Array([
0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a,
]);
expect(strategy.verifyMagicBytes?.({ bytes: png })).toBe(false);
// JPEG
const jpeg = new Uint8Array([0xff, 0xd8, 0xff, 0xe0]);
expect(strategy.verifyMagicBytes?.({ bytes: jpeg })).toBe(false);
// Too short to even be an ftyp box
const tiny = new Uint8Array([0x00, 0x00]);
expect(strategy.verifyMagicBytes?.({ bytes: tiny })).toBe(false);
// ftyp box at offset 4 (compliant) but no MP4-family brand anywhere
const wrong = new Uint8Array([
0x00, 0x00, 0x00, 0x18, 0x66, 0x74, 0x79, 0x70,
0x6d, 0x69, 0x66, 0x31, 0x00, 0x00, 0x00, 0x00,
0x6d, 0x69, 0x66, 0x31, 0x68, 0x65, 0x69, 0x63,
]);
expect(strategy.verifyMagicBytes?.({ bytes: wrong })).toBe(false);
});
it("accepts MP4 with mp4 brand in compatible_brands list (not major)", () => {
const strategy = new FfmpegFallbackStrategy();
// major = "mp41" (rare but seen), compat = ["mp41","isom","mp42"]; should still match
const compat = new Uint8Array([
0x00, 0x00, 0x00, 0x20, 0x66, 0x74, 0x79, 0x70,
0x6d, 0x70, 0x34, 0x31, 0x00, 0x00, 0x00, 0x00,
0x6d, 0x70, 0x34, 0x31, 0x69, 0x73, 0x6f, 0x6d,
0x6d, 0x70, 0x34, 0x32, 0x00, 0x00, 0x00, 0x00,
]);
expect(strategy.verifyMagicBytes?.({ bytes: compat })).toBe(true);
});
});
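The magic-byte cases exercised above can be summarised as a small detector. This is a hedged sketch, not the strategy's verifyMagicBytes: the function name `looksLikeMp4OrMatroska`, the helper `fourcc`, and the contents of `MP4_FAMILY_BRANDS` are assumptions chosen to match the brands the tests accept, and the real brand list may differ.

```typescript
// Hypothetical sketch. The brand list is an assumption based on the test
// vectors above, not the strategy's actual list.
const EBML_MAGIC = [0x1a, 0x45, 0xdf, 0xa3];
const MP4_FAMILY_BRANDS = new Set(["isom", "mp41", "mp42", "qt  "]);

// Read four bytes as an ASCII/latin1 four-character code.
function fourcc(view: DataView, offset: number): string {
  let s = "";
  for (let i = 0; i < 4; i++) {
    s += String.fromCharCode(view.getUint8(offset + i));
  }
  return s;
}

function looksLikeMp4OrMatroska(bytes: Uint8Array): boolean {
  // MKV and WebM both start with the EBML header ID.
  if (bytes.length >= 4 && EBML_MAGIC.every((b, i) => bytes[i] === b)) {
    return true;
  }
  // An ftyp box needs size(4) + type(4) + major_brand(4) at minimum.
  if (bytes.length < 12) return false;
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  if (fourcc(view, 4) !== "ftyp") return false;
  // Major brand sits right after the box header.
  if (MP4_FAMILY_BRANDS.has(fourcc(view, 8))) return true;
  // compatible_brands start after minor_version and run to the box end.
  const boxEnd = Math.min(view.getUint32(0), bytes.length);
  for (let off = 16; off + 4 <= boxEnd; off += 4) {
    if (MP4_FAMILY_BRANDS.has(fourcc(view, off))) return true;
  }
  return false;
}
```

Checking compatible_brands as well as the major brand is what lets a file with an unusual major brand still route to the video path, while HEIC and AVIF, which share the ftyp container but carry no MP4-family brand, are rejected.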
// Build-flag gating: when VITE_ENABLE_FFMPEG_FALLBACK is "false", the
// strategy must not be registered. Mirrors the ExifTool fallback gating
// test in this same directory.
describe("strategy_registry — VITE_ENABLE_FFMPEG_FALLBACK gating", () => {
const MP4_ISOM = new Uint8Array([
0x00, 0x00, 0x00, 0x20, 0x66, 0x74, 0x79, 0x70,
0x69, 0x73, 0x6f, 0x6d, 0x00, 0x00, 0x02, 0x00,
0x69, 0x73, 0x6f, 0x6d, 0x69, 0x73, 0x6f, 0x32,
0x61, 0x76, 0x63, 0x31, 0x6d, 0x70, 0x34, 0x31,
]);
afterEach(() => {
vi.unstubAllEnvs();
});
it("routes MP4 to FfmpegFallbackStrategy when flag is unset (default on)", async () => {
vi.resetModules();
const { selectStrategy } = await import(
"../../../src/infrastructure/wasm/strategy_registry"
);
const result = selectStrategy({
filename: "video.mp4",
bytes: MP4_ISOM,
});
expect(result).not.toBeNull();
// Should be the ffmpeg strategy, not the legacy walker.
expect(result?.constructor.name).toBe("FfmpegFallbackStrategy");
});
it("falls back to VideoStrategy walker when flag is 'false'", async () => {
vi.resetModules();
vi.stubEnv("VITE_ENABLE_FFMPEG_FALLBACK", "false");
const { selectStrategy } = await import(
"../../../src/infrastructure/wasm/strategy_registry"
);
const result = selectStrategy({
filename: "video.mp4",
bytes: MP4_ISOM,
});
expect(result).not.toBeNull();
expect(result?.constructor.name).toBe("VideoStrategy");
});
});

@ -0,0 +1,319 @@
import { describe, it, expect } from "vitest";
import { cleanFfmpegMp4Output } from "../../../src/infrastructure/wasm/strategies/ffmpeg_post_strip";
// Build a minimal box: 8-byte header (size + type) followed by payload.
function box(type: string, payload: Uint8Array | Uint8Array[]): Uint8Array {
const flat = Array.isArray(payload) ? concat(payload) : payload;
const total = 8 + flat.length;
const out = new Uint8Array(total);
new DataView(out.buffer).setUint32(0, total);
for (let i = 0; i < 4; i++) out[4 + i] = type.charCodeAt(i);
out.set(flat, 8);
return out;
}
function concat(chunks: Uint8Array[]): Uint8Array {
const total = chunks.reduce((s, c) => s + c.length, 0);
const out = new Uint8Array(total);
let off = 0;
for (const c of chunks) {
out.set(c, off);
off += c.length;
}
return out;
}
// Build a valid hdlr FullBox with the given handler_type and vendor.
// Layout from payloadStart:
// +0..+3: version (1) + flags (3) — FullBox header
// +4..+7: pre_defined: uint32
// +8..+11: handler_type: 4 ASCII
// +12..+15: reserved[0] = vendor ← target
// +16..+23: reserved[1..2] — zeros
// +24..: name (UTF-8 zero-terminated)
function buildHdlr({
handlerType,
vendor,
name = "",
}: {
handlerType: string;
vendor: string;
name?: string;
}): Uint8Array {
const nameBytes = new TextEncoder().encode(name + "\x00");
const payload = new Uint8Array(4 + 4 + 4 + 4 + 8 + nameBytes.length);
// version+flags [0..3] = 0; pre_defined [4..7] = 0
for (let i = 0; i < 4; i++) payload[8 + i] = handlerType.charCodeAt(i);
for (let i = 0; i < 4; i++) payload[12 + i] = vendor.charCodeAt(i);
// reserved[1..2] (16..23) left as zeros
payload.set(nameBytes, 24);
return box("hdlr", payload);
}
// Wrap an hdlr in a meta FullBox: 4-byte version+flags then the hdlr child.
function buildMeta(hdlr: Uint8Array): Uint8Array {
const fullBoxHeader = new Uint8Array(4); // version + flags
return box("meta", concat([fullBoxHeader, hdlr]));
}
// Linear scan for the first box of `type` and return its [headerStart, payloadEnd).
function findBox(
bytes: Uint8Array,
type: string,
): { start: number; end: number } | null {
const target = type.split("").map((c) => c.charCodeAt(0));
for (let i = 0; i + 8 <= bytes.length; i++) {
let match = true;
for (let j = 0; j < 4; j++) {
if (bytes[i + 4 + j] !== target[j]) {
match = false;
break;
}
}
if (!match) continue;
const view = new DataView(bytes.buffer, bytes.byteOffset + i);
const size = view.getUint32(0);
if (size < 8 || i + size > bytes.length) continue;
return { start: i, end: i + size };
}
return null;
}
// Decode bytes as latin1 — used to scan for ASCII strings without throwing on
// non-text bytes.
function asString(bytes: Uint8Array): string {
let out = "";
for (let i = 0; i < bytes.length; i++) {
out += String.fromCharCode(bytes[i] ?? 0);
}
return out;
}
const FTYP = box("ftyp", new TextEncoder().encode("isom\x00\x00\x00\x00"));
const MDAT = box("mdat", new TextEncoder().encode("MEDIA_PAYLOAD"));
describe("cleanFfmpegMp4Output", () => {
it("renames moov/udta to free (length-preserving) and leaves everything else intact", () => {
const hdlr = buildHdlr({ handlerType: "mdir", vendor: "appl" });
const meta = buildMeta(hdlr);
const udta = box("udta", meta);
const moov = box("moov", udta);
const input = concat([FTYP, moov, MDAT]);
const inputCopy = new Uint8Array(input); // pristine reference
// Find udta location in the original (used for offset math).
const udtaLoc = findBox(input, "udta");
expect(udtaLoc).not.toBeNull();
if (udtaLoc === null) return;
const typeOffset = udtaLoc.start + 4; // skip the 4-byte size field to reach the type field
const out = cleanFfmpegMp4Output(input);
// Length preserved.
expect(out.length).toBe(inputCopy.length);
// Type field is now "free".
expect(asString(out.subarray(typeOffset, typeOffset + 4))).toBe("free");
// The "udta" string is gone from the output.
expect(asString(out)).not.toContain("udta");
// Only 4 bytes changed in the file (the type field).
const headBefore = inputCopy.subarray(0, typeOffset);
const headAfter = out.subarray(0, typeOffset);
expect(headAfter).toEqual(headBefore);
const tailBefore = inputCopy.subarray(typeOffset + 4);
const tailAfter = out.subarray(typeOffset + 4);
expect(tailAfter).toEqual(tailBefore);
// ftyp byte-identical.
const ftypLoc = findBox(out, "ftyp");
expect(ftypLoc).not.toBeNull();
if (ftypLoc === null) return;
expect(out.subarray(ftypLoc.start, ftypLoc.end)).toEqual(
inputCopy.subarray(ftypLoc.start, ftypLoc.end),
);
// mdat byte-identical.
const mdatLoc = findBox(out, "mdat");
expect(mdatLoc).not.toBeNull();
if (mdatLoc === null) return;
expect(out.subarray(mdatLoc.start, mdatLoc.end)).toEqual(
inputCopy.subarray(mdatLoc.start, mdatLoc.end),
);
});
it("renames a largesize-header udta (size=1, 16-byte header) correctly", () => {
// Build a box with the largesize encoding: size field is 1, followed
// by an 8-byte uint64 actual size. The type field sits at offset
// +4..+7 (same as a regular box). rewriteToFree must address the
// type via headerStart + 4 — using payloadStart - 4 would land on
// byte 12 of the header, corrupting the largesize value.
// Payload: bare meta + hdlr to make this realistic, though contents
// don't matter for the rename.
const hdlr = buildHdlr({ handlerType: "mdir", vendor: "appl" });
const meta = buildMeta(hdlr);
const inner = meta; // udta payload
const totalSize = 16 + inner.length; // largesize header (16) + payload
const udta = new Uint8Array(totalSize);
// size field = 1 (signals largesize)
new DataView(udta.buffer).setUint32(0, 1);
// type = "udta"
udta[4] = 0x75;
udta[5] = 0x64;
udta[6] = 0x74;
udta[7] = 0x61;
// largesize (uint64) = totalSize (high 4 bytes = 0; low 4 bytes = size)
new DataView(udta.buffer).setUint32(8, 0);
new DataView(udta.buffer).setUint32(12, totalSize);
// payload
udta.set(inner, 16);
const moov = box("moov", udta);
const input = concat([FTYP, moov, MDAT]);
const inputCopy = new Uint8Array(input);
// Locate the udta in the input (search for the type bytes).
const udtaIdx = (() => {
for (let i = 0; i + 8 <= input.length; i++) {
if (
input[i + 4] === 0x75 &&
input[i + 5] === 0x64 &&
input[i + 6] === 0x74 &&
input[i + 7] === 0x61
)
return i;
}
return -1;
})();
expect(udtaIdx).toBeGreaterThanOrEqual(0);
// Pre-condition: size field is 1 (largesize signal).
const sizeBefore = new DataView(
input.buffer,
input.byteOffset + udtaIdx,
).getUint32(0);
expect(sizeBefore).toBe(1);
// Pre-condition: largesize value at +8..+15 encodes totalSize.
const largeSizeBefore = new DataView(
input.buffer,
input.byteOffset + udtaIdx + 8,
).getBigUint64(0);
expect(largeSizeBefore).toBe(BigInt(totalSize));
const out = cleanFfmpegMp4Output(input);
// Type field rewritten to "free".
expect(asString(out.subarray(udtaIdx + 4, udtaIdx + 8))).toBe("free");
// Size field (regular size, at +0..+3) untouched — still 1.
const sizeAfter = new DataView(
out.buffer,
out.byteOffset + udtaIdx,
).getUint32(0);
expect(sizeAfter).toBe(1);
// Largesize value (uint64 at +8..+15) untouched, still totalSize.
// This is the key correctness assertion: if rewriteToFree used
// payloadStart - 4 (= headerStart + 12), the low 32 bits of the
// largesize uint64 would have been overwritten with "free", giving
// a size of 0x00000000_66726565 and a box that readers would treat
// as corrupt.
const largeSizeAfter = new DataView(
out.buffer,
out.byteOffset + udtaIdx + 8,
).getBigUint64(0);
expect(largeSizeAfter).toBe(BigInt(totalSize));
// Length preserved overall.
expect(out.length).toBe(inputCopy.length);
});
it("renames per-track udta (moov/trak/udta) to free — drone/action-cam style", () => {
const hdlr = buildHdlr({ handlerType: "mdir", vendor: "appl" });
const meta = buildMeta(hdlr);
const udta = box("udta", meta);
const tkhd = box("tkhd", new Uint8Array(84));
const trak = box("trak", concat([tkhd, udta]));
const moov = box("moov", trak);
const input = concat([FTYP, moov, MDAT]);
const inputCopy = new Uint8Array(input);
const udtaLoc = findBox(input, "udta");
expect(udtaLoc).not.toBeNull();
if (udtaLoc === null) return;
const typeOffset = udtaLoc.start + 4;
const out = cleanFfmpegMp4Output(input);
expect(out.length).toBe(inputCopy.length);
expect(asString(out.subarray(typeOffset, typeOffset + 4))).toBe("free");
expect(asString(out)).not.toContain("udta");
// tkhd untouched.
const tkhdLoc = findBox(out, "tkhd");
expect(tkhdLoc).not.toBeNull();
if (tkhdLoc === null) return;
expect(out.subarray(tkhdLoc.start, tkhdLoc.end)).toEqual(
inputCopy.subarray(tkhdLoc.start, tkhdLoc.end),
);
});
it("is a no-op when no udta is present", () => {
const mvhd = box("mvhd", new Uint8Array(100));
const tkhd = box("tkhd", new Uint8Array(84));
const trak = box("trak", tkhd);
const moov = box("moov", concat([mvhd, trak]));
const input = concat([FTYP, moov, MDAT]);
const inputCopy = new Uint8Array(input);
const out = cleanFfmpegMp4Output(input);
expect(out.length).toBe(inputCopy.length);
expect(out).toEqual(inputCopy);
});
it("does not throw on a truncated udta payload (parseBoxesSafe swallows)", () => {
// udta with a 4-byte payload that's not a complete child box.
const udta = box("udta", new Uint8Array(4));
const moov = box("moov", udta);
const input = concat([FTYP, moov, MDAT]);
expect(() => cleanFfmpegMp4Output(input)).not.toThrow();
const out = cleanFfmpegMp4Output(input);
const udtaScan = asString(out).indexOf("udta");
const freeScan = asString(out).indexOf("free");
// udta type renamed regardless of its child contents.
expect(udtaScan).toBe(-1);
expect(freeScan).toBeGreaterThanOrEqual(0);
});
it("renames every udta — top-level and per-track — in the same pass", () => {
// One top-level moov/udta AND one moov/trak/udta. Verifies the
// walker visits both branches.
const hdlrA = buildHdlr({ handlerType: "mdir", vendor: "appl" });
const udtaA = box("udta", buildMeta(hdlrA));
const hdlrB = buildHdlr({ handlerType: "mdir", vendor: "appl" });
const udtaB = box("udta", buildMeta(hdlrB));
const tkhd = box("tkhd", new Uint8Array(84));
const trak = box("trak", concat([tkhd, udtaB]));
const moov = box("moov", concat([udtaA, trak]));
const input = concat([FTYP, moov, MDAT]);
const inputCopy = new Uint8Array(input);
// Pre-condition: two "udta" occurrences in the file.
expect(asString(inputCopy).split("udta").length - 1).toBe(2);
// "free" doesn't appear yet (no top-level free box in our synthetic input).
expect(asString(inputCopy)).not.toContain("free");
const out = cleanFfmpegMp4Output(input);
// Length preserved.
expect(out.length).toBe(inputCopy.length);
// Both "udta" types are gone.
expect(asString(out)).not.toContain("udta");
// Replaced by two "free" types.
expect(asString(out).split("free").length - 1).toBe(2);
});
});
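The property these tests pin down, that the type field sits at headerStart + 4 for both regular and largesize headers, can be shown with a compact rename pass. This is a sketch under assumptions, not cleanFfmpegMp4Output itself: `renameUdtaToFree`, `walk`, `readType`, `writeType`, and the `CONTAINERS` set (limited to moov and trak) are hypothetical names and choices made for illustration.

```typescript
// Hypothetical sketch of a length-preserving udta -> free rename.
// Only moov and trak are descended into; that scope is an assumption.
const CONTAINERS = new Set(["moov", "trak"]);

function readType(view: DataView, offset: number): string {
  let s = "";
  for (let i = 0; i < 4; i++) {
    s += String.fromCharCode(view.getUint8(offset + i));
  }
  return s;
}

function writeType(view: DataView, offset: number, type: string): void {
  for (let i = 0; i < 4; i++) view.setUint8(offset + i, type.charCodeAt(i));
}

function walk(view: DataView, start: number, end: number): void {
  let offset = start;
  while (offset + 8 <= end) {
    let size = view.getUint32(offset);
    let headerSize = 8;
    if (size === 1) {
      // Largesize form: the real size is a uint64 at +8..+15.
      if (offset + 16 > end) return;
      const high = view.getUint32(offset + 8);
      const low = view.getUint32(offset + 12);
      if (high >= 0x20_0000) return; // would exceed 2^53
      size = high * 0x1_0000_0000 + low;
      headerSize = 16;
    } else if (size === 0) {
      size = end - offset; // box extends to the end of its parent
    }
    if (size < headerSize || offset + size > end) return;
    const type = readType(view, offset + 4);
    if (type === "udta") {
      // The type field is at headerStart + 4 in both header forms, so
      // the size and largesize fields are never touched.
      writeType(view, offset + 4, "free");
    } else if (CONTAINERS.has(type)) {
      walk(view, offset + headerSize, offset + size);
    }
    offset += size;
  }
}

function renameUdtaToFree(input: Uint8Array): Uint8Array {
  const out = new Uint8Array(input); // copy, so the input stays untouched
  walk(new DataView(out.buffer, out.byteOffset, out.byteLength), 0, out.length);
  return out;
}
```

Because only four bytes per udta box change and no box is resized, chunk offsets stored elsewhere in the file keep pointing at the same media bytes.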

@ -203,6 +203,64 @@ describe("WasmProcessor — async diff build", () => {
expect(second).toBeNull();
}, 30_000);
// Regression guard: processing two different files and then running
// their diffs back to back must produce a distinct, correct diff for
// each entry. Before the fix, buildDiffDocumentForEntry ran the
// before+after ExifTool reads in Promise.all. @uswriting/exiftool
// shares a module-level Perl interpreter and stdout/stderr buffers
// across calls, so two concurrent parseMetadata calls clobbered each
// other's output. The race was masked on the very first diff because
// the cold start blocked one of the pair, then surfaced on every diff
// after.
//
// This test exercises two diffs against the same processor instance:
// JPEG (sample.jpg, has Make=TestCamera) and PNG (sample.png with
// known iTXt/tEXt chunks). The first diff should show the JPEG's
// metadata removal; the second should show the PNG's. With the race
// in place, the second came back identical to the first or empty.
it("produces distinct correct diffs for two sequential entries", async () => {
const jpegBytes = new Uint8Array(
readFileSync(join(IMAGE_FIXTURES, "sample.jpg")),
);
const pngBytes = new Uint8Array(
readFileSync(join(IMAGE_FIXTURES, "sample.png")),
);
fileBytes.files.set("/tmp/a.jpg", jpegBytes);
fileBytes.files.set("/tmp/b.png", pngBytes);
await processor.process({
entryId: "entry-jpeg-multi",
filePath: "/tmp/a.jpg",
options: { ...NO_PRESERVE },
});
await processor.process({
entryId: "entry-png-multi",
filePath: "/tmp/b.png",
options: { ...NO_PRESERVE },
});
const diffJpeg = await processor.buildDiffDocumentForEntry({
entryId: "entry-jpeg-multi",
});
const diffPng = await processor.buildDiffDocumentForEntry({
entryId: "entry-png-multi",
});
expect(diffJpeg).not.toBeNull();
expect(diffPng).not.toBeNull();
if (diffJpeg === null || diffPng === null) return;
// JPEG diff must surface the EXIF Make tag from sample.jpg.
const jpegMake = diffJpeg.before.find((e) => e.name === "Make");
expect(jpegMake?.value).toBe("TestCamera");
// PNG diff must NOT contain the JPEG's EXIF tags (race regression
// would have made the second diff parrot back the first file's
// metadata). Equivalently: the before lists must not be identical.
expect(JSON.stringify(diffJpeg.before)).not.toBe(
JSON.stringify(diffPng.before),
);
}, 30_000);
// Finding 4.3: graceful degradation when ExifTool errors out. We swap
// the strategy on the instance with a stub that always errors; the
// processor must catch and return null rather than throwing.


@@ -0,0 +1,366 @@
// Forensic recovery battery for FfmpegFallbackStrategy (#182 Phase 1 + 2).
//
// The strategy class itself is browser-only (it runs @ffmpeg/core on the
// main thread; see ffmpeg_fallback_strategy.ts), so it cannot be imported
// under Node. This runner replicates the strategy's strip invocation
// against @ffmpeg/core directly: the same WASM the strategy loads, the
// same arg vector, the same MEMFS lifecycle.
//
// Run:
// npx tsx tools/forensic/ffmpeg-fallback.ts
//
// Sentinel seeding uses `ffmpeg -metadata` directly (MKV/WebM) and `exiftool`
// (MP4). System ffmpeg + exiftool required for fixture build; strip phase
// runs entirely in-WASM.
//
// No-network verification (after tsx is cached):
// TSX=$(ls -d ~/.npm/_npx/*/node_modules/tsx | head -1)/dist/cli.mjs
// node --permission --allow-fs-read='*' --allow-fs-write='*' \
// --allow-child-process "$TSX" tools/forensic/ffmpeg-fallback.ts
//
// Pass criteria: zero sentinel survival across every fixture. KNOWN_GAPS
// empty for Phase 1 + Phase 2 formats. New gaps → file an issue, mark in
// KNOWN_GAPS, never silently dismiss.
import { readFileSync, writeFileSync, mkdirSync, existsSync } from "node:fs";
import { execFileSync } from "node:child_process";
import { join, resolve, dirname, extname } from "node:path";
import { fileURLToPath } from "node:url";
import { tmpdir } from "node:os";
const HERE = dirname(fileURLToPath(import.meta.url));
const REPO_ROOT = resolve(HERE, "..", "..");
const REAL_WORLD_DIR = join(
REPO_ROOT,
"tests",
"fixtures",
"wasm",
"video",
"real-world",
);
const WORK_DIR = join(tmpdir(), "ffmpeg-fallback-forensic");
mkdirSync(WORK_DIR, { recursive: true });
const SENTINELS = {
TITLE: "FORENSIC-FF-TITLE-AAAA",
AUTHOR: "FORENSIC-FF-AUTHOR-BBBB",
COMMENT: "FORENSIC-FF-COMMENT-CCCC",
ENCODER: "FORENSIC-FF-ENCODER-DDDD",
DESCRIPTION: "FORENSIC-FF-DESC-EEEE",
} as const;
const KNOWN_GAPS: ReadonlyMap<string, string> = new Map();
type SentinelKey = keyof typeof SENTINELS;
type FixtureKind =
| "synthetic-mp4"
| "synthetic-mkv"
| "synthetic-webm"
| "phone-baseline"
| "gopro-fusion"
| "dji-phantom4";
interface FixtureResult {
kind: FixtureKind;
inputBytes: number;
outputBytes: number;
rc: number;
stderr: string;
survivors: SentinelKey[];
deviceFingerprintSurvivors: string[];
skipped?: string;
}
function buildSyntheticMp4(): { bytes: Uint8Array; landed: SentinelKey[] } {
const synth = join(WORK_DIR, "synth.mp4");
const seeded = join(WORK_DIR, "seeded.mp4");
execFileSync(
"ffmpeg",
["-f", "lavfi", "-i", "color=c=blue:s=128x128:d=1", "-pix_fmt", "yuv420p", "-y", synth],
{ stdio: ["ignore", "ignore", "ignore"] },
);
execFileSync("cp", [synth, seeded], { stdio: ["ignore", "ignore", "ignore"] });
execFileSync(
"exiftool",
[
"-overwrite_original",
`-Title=${SENTINELS.TITLE}`,
`-Author=${SENTINELS.AUTHOR}`,
`-Comment=${SENTINELS.COMMENT}`,
`-Encoder=${SENTINELS.ENCODER}`,
`-Description=${SENTINELS.DESCRIPTION}`,
seeded,
],
{ stdio: ["ignore", "ignore", "ignore"] },
);
const bytes = readFileSync(seeded);
const landed: SentinelKey[] = [];
for (const key of Object.keys(SENTINELS) as SentinelKey[]) {
if (bytes.includes(Buffer.from(SENTINELS[key]))) landed.push(key);
}
return { bytes: new Uint8Array(bytes), landed };
}
function buildSyntheticEbml(
containerExt: ".mkv" | ".webm",
): { bytes: Uint8Array; landed: SentinelKey[] } {
// exiftool refuses MKV/WebM writes; seed via ffmpeg's own -metadata flag.
const synth = join(WORK_DIR, `synth${containerExt}`);
const seeded = join(WORK_DIR, `seeded${containerExt}`);
execFileSync(
"ffmpeg",
["-f", "lavfi", "-i", "color=c=green:s=128x128:d=1", "-pix_fmt", "yuv420p", "-y", synth],
{ stdio: ["ignore", "ignore", "ignore"] },
);
execFileSync(
"ffmpeg",
[
"-i", synth,
"-map", "0", "-c", "copy",
"-metadata", `title=${SENTINELS.TITLE}`,
"-metadata", `description=${SENTINELS.DESCRIPTION}`,
"-metadata", `comment=${SENTINELS.COMMENT}`,
"-metadata", `encoder=${SENTINELS.ENCODER}`,
"-y", seeded,
],
{ stdio: ["ignore", "ignore", "ignore"] },
);
const bytes = readFileSync(seeded);
const landed: SentinelKey[] = [];
for (const key of Object.keys(SENTINELS) as SentinelKey[]) {
if (bytes.includes(Buffer.from(SENTINELS[key]))) landed.push(key);
}
return { bytes: new Uint8Array(bytes), landed };
}
async function runStrip(
inputBytes: Uint8Array,
containerExt: string,
): Promise<{ outputBytes: Uint8Array; rc: number; stderr: string }> {
(globalThis as unknown as { self: typeof globalThis }).self = globalThis;
(globalThis as unknown as { window: typeof globalThis }).window = globalThis;
(globalThis as unknown as { location: { protocol: string; href: string } }).location = {
protocol: "file:",
href: "file:///tmp/ffmpeg-fallback-forensic/",
};
const wasmPath = join(
REPO_ROOT,
"node_modules",
"@ffmpeg",
"core",
"dist",
"esm",
"ffmpeg-core.wasm",
);
const wasmBytes = readFileSync(wasmPath);
const { default: createFFmpegCore } = await import("@ffmpeg/core");
let stderr = "";
const core = await (createFFmpegCore as (config: unknown) => Promise<{
FS: {
writeFile: (name: string, data: Uint8Array) => void;
readFile: (name: string) => Uint8Array;
unlink: (name: string) => void;
};
exec: (...args: string[]) => number;
}>)({
wasmBinary: wasmBytes,
print: () => {},
printErr: (m: string) => {
stderr += m + "\n";
},
});
const inputName = `in${containerExt}`;
const outputName = `out${containerExt}`;
try { core.FS.unlink(inputName); } catch { /* ignore */ }
try { core.FS.unlink(outputName); } catch { /* ignore */ }
core.FS.writeFile(inputName, inputBytes);
const args = [
"-i", inputName,
// Keep video+audio in input order; drop data ("d", which is where
// ffmpeg files tmcd timecode tracks), subtitle ("s"), and attachment
// ("t") streams. See strategy comments in
// src/infrastructure/wasm/strategies/ffmpeg_fallback_strategy.ts.
"-map", "0",
"-map", "-0:d?",
"-map", "-0:s?",
"-map", "-0:t?",
"-map_metadata", "-1",
"-map_chapters", "-1",
"-fflags", "+bitexact",
"-c", "copy",
];
const isMp4 = containerExt === ".mp4" || containerExt === ".mov" || containerExt === ".m4v";
if (isMp4) args.push("-movflags", "+faststart");
args.push("-metadata", "encoder=");
args.push(outputName);
const rc = core.exec(...args);
let outputBytes = new Uint8Array(0);
if (rc === 0) {
try { outputBytes = new Uint8Array(core.FS.readFile(outputName)); } catch { /* ignore */ }
}
return { outputBytes, rc, stderr };
}
function recoveryBattery(
output: Uint8Array,
sentinelKeys: SentinelKey[],
fingerprintStrings: readonly string[] = [],
): { survivors: SentinelKey[]; fingerprintSurvivors: string[] } {
// Wrap the output once as a zero-copy Buffer view. Calling
// Buffer.from(output) inside each loop iteration would copy the whole
// file per needle, which adds up on the multi-hundred-MB fixtures.
const haystack = Buffer.from(output.buffer, output.byteOffset, output.byteLength);
const survivors = sentinelKeys.filter((key) => haystack.includes(SENTINELS[key]));
const fingerprintSurvivors = fingerprintStrings.filter((fp) => haystack.includes(fp));
return { survivors, fingerprintSurvivors };
}
async function runSyntheticMp4(): Promise<FixtureResult> {
const { bytes, landed } = buildSyntheticMp4();
const { outputBytes, rc, stderr } = await runStrip(bytes, ".mp4");
const { survivors, fingerprintSurvivors } = recoveryBattery(outputBytes, landed);
return {
kind: "synthetic-mp4", inputBytes: bytes.length, outputBytes: outputBytes.length,
rc, stderr: stderr.slice(0, 400), survivors, deviceFingerprintSurvivors: fingerprintSurvivors,
};
}
async function runSyntheticEbml(
kind: "synthetic-mkv" | "synthetic-webm",
): Promise<FixtureResult> {
const ext = kind === "synthetic-mkv" ? ".mkv" : ".webm";
const { bytes, landed } = buildSyntheticEbml(ext);
const { outputBytes, rc, stderr } = await runStrip(bytes, ext);
const { survivors, fingerprintSurvivors } = recoveryBattery(outputBytes, landed);
return {
kind, inputBytes: bytes.length, outputBytes: outputBytes.length,
rc, stderr: stderr.slice(0, 400), survivors, deviceFingerprintSurvivors: fingerprintSurvivors,
};
}
async function runRealWorld(
kind: "phone-baseline" | "gopro-fusion" | "dji-phantom4",
filename: string,
fingerprints: readonly string[],
): Promise<FixtureResult> {
const path = join(REAL_WORLD_DIR, filename);
if (!existsSync(path)) {
return {
kind, inputBytes: 0, outputBytes: 0, rc: -1, stderr: "",
survivors: [], deviceFingerprintSurvivors: [],
skipped: `fixture not present — run tools/forensic/fetch-video-fixtures.sh first`,
};
}
const bytes = new Uint8Array(readFileSync(path));
const ext = extname(filename) === ".mov" ? ".mov" : ".mp4";
const { outputBytes, rc, stderr } = await runStrip(bytes, ext);
const { fingerprintSurvivors } = recoveryBattery(outputBytes, [], fingerprints);
return {
kind, inputBytes: bytes.length, outputBytes: outputBytes.length,
rc, stderr: stderr.slice(0, 400), survivors: [],
deviceFingerprintSurvivors: fingerprintSurvivors,
};
}
function reportRow(r: FixtureResult): void {
if (r.skipped !== undefined) {
console.log(` ${r.kind.padEnd(22)} SKIPPED — ${r.skipped}`);
return;
}
const sizeDelta = `${r.inputBytes} → ${r.outputBytes} bytes`;
const verdict =
r.rc === 0 && r.survivors.length === 0 && r.deviceFingerprintSurvivors.length === 0
? "✓ clean"
: r.rc !== 0
? `✗ rc=${r.rc}`
: `✗ leaked: ${[...r.survivors, ...r.deviceFingerprintSurvivors].join(", ")}`;
console.log(` ${r.kind.padEnd(22)} ${sizeDelta.padEnd(30)} ${verdict}`);
if (r.stderr.length > 0 && r.rc !== 0) {
console.log(` stderr: ${r.stderr.replace(/\n/g, " | ").slice(0, 200)}`);
}
}
async function main(): Promise<void> {
console.log("\nFfmpegFallbackStrategy forensic battery (Phase 1 + 2)");
console.log("======================================================\n");
const results: FixtureResult[] = [];
console.log("Synthetic battery:");
results.push(await runSyntheticMp4()); reportRow(results.at(-1)!);
results.push(await runSyntheticEbml("synthetic-mkv")); reportRow(results.at(-1)!);
results.push(await runSyntheticEbml("synthetic-webm")); reportRow(results.at(-1)!);
console.log("\nReal-world battery (MP4 / MOV only — no MKV/WebM real-world fixtures yet):");
results.push(await runRealWorld("phone-baseline", "phone-baseline.mp4", []));
reportRow(results.at(-1)!);
results.push(
await runRealWorld(
"gopro-fusion",
"gopro-fusion.mp4",
[
"GoPro AVC", "gpmd", "GoPro AAC", "GoPro TCD",
"GoPro MET", "GoPro SOS", "Fusion",
],
),
);
reportRow(results.at(-1)!);
// DJI Phantom 4 — opt-in (248 MB, fetched via
// `tools/forensic/fetch-video-fixtures.sh --include-large`). Per
// manifest: 5 device-natural fingerprints + full GPS flight log under
// UserData. Confirms the strategy handles drone files in addition to
// action cams.
results.push(
await runRealWorld(
"dji-phantom4",
"dji-phantom4.mov",
[
"FC6310", // drone model — UserData
"AVC encoder", // compressorname — sample entry
"DJI.AVC", // video handler description
"DJI.Meta", // metadata handler description
"55 deg", // GPS latitude (Denmark flight log)
],
),
);
reportRow(results.at(-1)!);
console.log("\n======================================================");
const failed = results.filter(
(r) =>
r.skipped === undefined &&
(r.rc !== 0 || r.survivors.length > 0 || r.deviceFingerprintSurvivors.length > 0),
);
const unexpectedFailures = failed.filter(
(r) => !KNOWN_GAPS.has(`${r.kind}:${r.rc !== 0 ? "rc-fail" : "leak"}`),
);
if (unexpectedFailures.length === 0) {
console.log("✓ PASS — zero unexpected sentinel/fingerprint survival.\n");
const reportPath = join(WORK_DIR, "report.json");
writeFileSync(reportPath, JSON.stringify({ results, KNOWN_GAPS: [...KNOWN_GAPS] }, null, 2));
console.log(`report: ${reportPath}\n`);
process.exit(0);
} else {
console.log("✗ FAIL — unexpected survivors:");
for (const r of unexpectedFailures) {
console.log(` ${r.kind}: rc=${r.rc} survivors=[${[...r.survivors, ...r.deviceFingerprintSurvivors].join(", ")}]`);
}
process.exit(1);
}
}
main().catch((err) => {
console.error("forensic battery crashed:", err);
process.exit(2);
});
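The arg vector assembled inline in `runStrip` above can also be expressed as a pure function, which makes the container-specific branch easy to check without loading the WASM. This is a sketch only: `buildStripArgs` is a hypothetical name, and deriving the container from the input file name is an assumption made for the example.

```typescript
// Hypothetical helper mirroring the arguments runStrip passes to core.exec().
function buildStripArgs(inputName: string, outputName: string): string[] {
  const ext = inputName.slice(inputName.lastIndexOf(".")).toLowerCase();
  const isMp4Family = [".mp4", ".mov", ".m4v"].includes(ext);
  return [
    "-i", inputName,
    // Map everything, then subtract data, subtitle and attachment streams.
    "-map", "0",
    "-map", "-0:d?",
    "-map", "-0:s?",
    "-map", "-0:t?",
    // Drop global/stream metadata and chapters; no encoder fingerprint.
    "-map_metadata", "-1",
    "-map_chapters", "-1",
    "-fflags", "+bitexact",
    // Stream copy: no re-encode.
    "-c", "copy",
    // faststart only applies to the MP4 family muxers.
    ...(isMp4Family ? ["-movflags", "+faststart"] : []),
    "-metadata", "encoder=",
    outputName,
  ];
}
```

For example, `buildStripArgs("in.mkv", "out.mkv")` yields the same list without the `-movflags +faststart` pair.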


@@ -8,6 +8,7 @@ import {
rmdirSync,
existsSync,
} from "node:fs";
import { gzipSync } from "node:zlib";
import { resolve } from "node:path";
import type { Plugin } from "vite";
@@ -117,6 +118,14 @@ export default defineConfig({
// with extra files. The plugin above strips the corresponding <link> tags.
publicDir: false,
base: "./",
// Build-time flag consumed by ffmpeg_wasm_fetch.ts to tree-shake the
// bare `import("@ffmpeg/core")` PWA branch. Without this, Vite statically
// bundles the ~110 KB factory + its ~43 MB data: URL wasm fallback into
// the single-file HTML even though readInlinedCore() returns first at
// runtime. See the comment block above resolveCore() for the rationale.
define: {
__WITH_STANDALONE_INLINE__: "true",
},
build: {
outDir: resolve(__dirname, "dist/web-standalone"),
emptyOutDir: true,
@@ -153,15 +162,20 @@ export default defineConfig({
// - viteSingleFile: inlines JS/CSS into the HTML.
// - standaloneHtmlFixupPlugin: rewrites the inlined script tag's
// attributes (singlefile preserves `type="module"`).
// - standaloneInlineWasmsPlugin: injects zeroperl.wasm AND ffmpeg-core
// (.js + .wasm) as `<script type="text/plain">` tags in a single
// read+write of the HTML. Merged into one plugin so the injection
// sequence is explicit and not dependent on Rollup's
// hookParallel(closeBundle) semantics — two plugins each doing
// read+write of the same file would race the moment either hook
// body grows an `await`.
plugins: [
react(),
standaloneWasmStubPlugin(),
standaloneFfmpegStubPlugin(),
viteSingleFile(),
standaloneHtmlFixupPlugin(),
standaloneInlineWasmsPlugin(),
],
});
@@ -192,104 +206,196 @@ function standaloneWasmStubPlugin(): Plugin {
};
}
// Same pattern as standaloneWasmStubPlugin, but for ffmpeg-core (JS + WASM).
// Worker JS is NOT inlined because we run ffmpeg-core in the main thread —
// the @ffmpeg/ffmpeg wrapper would spawn a type:"module" Web Worker from a
// Blob URL, which fails silently when the page origin is `null` (the
// standalone HTML's file:// case). See ffmpeg_fallback_strategy.ts for the
// architectural rationale.
function standaloneFfmpegStubPlugin(): Plugin {
const STUBS = new Map<string, string>([
["@ffmpeg/core?url", "\0virtual:standalone-ffmpeg-core-js-url"],
["@ffmpeg/core/wasm?url", "\0virtual:standalone-ffmpeg-core-wasm-url"],
]);
const SENTINELS: Record<string, string> = {
"\0virtual:standalone-ffmpeg-core-js-url": "inline:ffmpeg-core-js",
"\0virtual:standalone-ffmpeg-core-wasm-url": "inline:ffmpeg-core-wasm",
};
return {
name: "standalone-ffmpeg-stub",
enforce: "pre",
resolveId(id) {
const virtual = STUBS.get(id);
return virtual ?? undefined;
},
load(id) {
const sentinel = SENTINELS[id];
if (sentinel === undefined) return undefined;
return `export default "${sentinel}";`;
},
};
}
// Merged inline plugin for ALL WASM assets that need to be stashed in the
// standalone HTML. Previously this was two separate plugins
// (standaloneWasmInlinePlugin + standaloneFfmpegInlinePlugin), each with its
// own closeBundle hook doing read+write of the same dist/web-standalone/
// index.html. That worked only because both hook bodies were fully
// synchronous (sync fs calls) — Rollup invokes closeBundle hooks via
// hookParallel(), and any future `await` inside either hook would let the
// second plugin read a stale HTML mid-mutation and clobber the first
// plugin's injection. Merging into one closeBundle removes the ordering
// hazard and reduces three reads + two writes of index.html to one each.
//
// What lands in the HTML, in order:
//
// 1. zeroperl.wasm
// The ExifTool fallback / diff strategies load zeroperl.wasm via a
// `?url` import. viteSingleFile only inlines JS/CSS chunks; large
// asset files like .wasm get emitted as siblings even when
// assetsInlineLimit is set high.
//
// Why this matters for the standalone target specifically: the
// standalone HTML is opened via `file://`. Chromium browsers block
// cross-file `fetch()` from `file://` origins by default. A sibling
// .wasm would silently fail to load on Chrome/Edge/Brave under
// file://, breaking every diff view and the .webp/.gif/.avif strip
// paths.
//
// What we used to do (chunk B / B.1 early):
// Substitute every `./assets/zeroperl-<hash>.wasm` URL in the
// inlined JS with a `data:application/wasm;base64,…` URL.
// Problem: the resulting Base64 string is a MODULE-SCOPE STRING
// LITERAL in the JS bundle. V8 allocates it eagerly during module
// parse — ~500-1500ms blocking the page load before first paint
// on a 33 MB Base64 payload. That's the regression the user
// reported.
//
// What we do now:
// Inject the Base64 as a
// `<script type="text/plain" id="zeroperl-wasm-base64">…</script>`
// tag in the HTML body BEFORE the module script. The HTML parser
// stores the textContent in the DOM but does NOT parse the
// contents as JavaScript, so V8's module-parse cost drops from
// 33 MB to ~150 KB (the wrapper code). On first WASM request, the
// wrapper's `redirectWasmFetch` helper reads the textContent and
// decodes it via `fetch(data:URL)` (browser-native Base64 path).
// Same total disk I/O, ~500-1500ms shaved off time-to-interactive.
//
// PWA / APK builds keep the sibling asset (they don't hit the
// file:// CORS constraints and `runtimeCaching` handles repeat
// loads). See
// docs/superpowers/specs/2026-05-21-issue-22-diff-pivot-design.md
// §8.1 for the original tradeoff discussion.
//
// 2. ffmpeg-core.js + ffmpeg-core.wasm
// ffmpeg_wasm_fetch.ts reads from the same DOM IDs at runtime (see
// readInlinedCore there) and feeds the WASM bytes directly to
// createFFmpegCore({wasmBinary}). Without this:
// - ffmpeg-core.wasm (30.7 MB) gets emitted as
// `dist/web-standalone/assets/ffmpeg-core-<hash>.wasm` —
// defeats the single-file deliverable and 404s under file://
// CORS rules.
// - ffmpeg-core.js similarly emits as a sibling.
//
// Gzip + base64 cuts each payload roughly 3× (wasm compresses well —
// lots of repeated LEB128 patterns + symbol tables; ffmpeg's 30.7 MB
// wasm → ~10 MB gz → ~13 MB base64). The runtime decoders in
// exiftool_wasm_fetch.ts / ffmpeg_wasm_fetch.ts pipe the bytes through
// DecompressionStream("gzip") at first use. HTML-parse cost at page
// load scales with text-node size, so shrinking the inlined string is
// the start-time win.
//
// Base64 alphabet (A-Z a-z 0-9 + / =) contains no HTML-special
// characters, so direct embedding in a <script> body is safe without
// escaping.
function standaloneInlineWasmsPlugin(): Plugin {
const outDir = resolve(__dirname, "dist/web-standalone");
const htmlPath = resolve(outDir, "index.html");
const assetsDir = resolve(outDir, "assets");
// Source assets directly from node_modules: the stub plugins intercept
// the `?url` imports and replace them with sentinels, so Vite never
// sees these assets and never emits them. We read the bytes here at
// closeBundle time and stash them as <script type="text/plain"> tags.
// Sourced from each package's ESM export path (matches what the `?url`
// imports would have resolved to).
const INLINE_ASSETS: ReadonlyArray<{
label: string;
domId: string;
source: string;
}> = [
{
label: "zeroperl.wasm",
domId: "zeroperl-wasm-base64",
source: resolve(
__dirname,
"node_modules/@6over3/zeroperl-ts/dist/esm/zeroperl.wasm",
),
},
{
label: "ffmpeg-core.js",
domId: "ffmpeg-core-js-base64",
source: resolve(
__dirname,
"node_modules/@ffmpeg/core/dist/esm/ffmpeg-core.js",
),
},
{
label: "ffmpeg-core.wasm",
domId: "ffmpeg-core-wasm-base64",
source: resolve(
__dirname,
"node_modules/@ffmpeg/core/dist/esm/ffmpeg-core.wasm",
),
},
];
return {
name: "standalone-inline-wasms",
closeBundle() {
// 1. Read HTML once.
let html = readFileSync(htmlPath, "utf8");
const moduleScriptMarker = '<script type="module">';
if (!html.includes(moduleScriptMarker)) {
throw new Error(
`standaloneInlineWasmsPlugin: could not find <script type="module"> ` +
`in HTML. viteSingleFile may have changed its inline-script ` +
`shape; the inline-tag injection point needs updating.`,
);
}
// 2. Read + gzip + base64 each asset; accumulate inline tags.
let injected = "";
const summaryLines: string[] = [];
for (const asset of INLINE_ASSETS) {
if (!existsSync(asset.source)) {
throw new Error(
`standaloneInlineWasmsPlugin: ${asset.label} not found at ` +
`${asset.source}. Check the corresponding dependency install.`,
);
}
const bytes = readFileSync(asset.source);
const gzipped = gzipSync(bytes, { level: 9 });
const base64 = gzipped.toString("base64");
injected += `<script type="text/plain" id="${asset.domId}">${base64}</script>\n`;
summaryLines.push(
` ${asset.label}: ${bytes.length} → ${gzipped.length} bytes gzipped ` +
`(base64 ${base64.length} bytes)`,
);
}
// 3. Write HTML once with all injections, then log a single summary.
html = html.replace(
moduleScriptMarker,
`${injected}${moduleScriptMarker}`,
);
writeFileSync(htmlPath, html);
console.log(
`standaloneInlineWasmsPlugin: stashed ${INLINE_ASSETS.length} assets in ` +
`<script type="text/plain"> tags\n${summaryLines.join("\n")}`,
);
// 4. Defensive: if Vite ever emits assets/ siblings again, clean
// them up to keep the standalone output to exactly one file.
if (existsSync(assetsDir) && readdirSync(assetsDir).length === 0) {
rmdirSync(assetsDir);
}
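The gzip + Base64 stash described in the plugin comments can be exercised as a round trip. This is a minimal sketch under assumptions: `buildInlineTag` and `decodeInlineTag` are hypothetical names, and the decode side uses Node's `gunzipSync` only so the example runs outside a browser. Per the comments above, the real runtime decoders pipe the bytes through `DecompressionStream("gzip")`.

```typescript
import { gzipSync, gunzipSync } from "node:zlib";

// Build side (hypothetical helper): gzip the asset, Base64 it, and wrap it in
// an inert <script type="text/plain"> tag that the HTML parser stores as text.
function buildInlineTag(domId: string, bytes: Uint8Array): string {
  const base64 = gzipSync(bytes, { level: 9 }).toString("base64");
  return `<script type="text/plain" id="${domId}">${base64}</script>`;
}

// Decode side (hypothetical, Node stand-in for the browser decoder): pull the
// Base64 body back out of the tag and gunzip it to the original bytes.
function decodeInlineTag(tag: string): Uint8Array {
  const match = /<script[^>]*>([A-Za-z0-9+/=]*)<\/script>/.exec(tag);
  if (match === null) throw new Error("no inline payload found");
  return new Uint8Array(gunzipSync(Buffer.from(match[1] ?? "", "base64")));
}
```

The character class in the regular expression is the Base64 alphabet itself, which is why the payload can sit in a script body without escaping.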


@@ -54,6 +54,15 @@ export default defineConfig({
outDir: resolve(__dirname, "dist/web"),
emptyOutDir: true,
},
// Build-time flag consumed by ffmpeg_wasm_fetch.ts. In the PWA / APK
// build the inlined `<script type="text/plain">` ffmpeg tags don't
// exist, so readInlinedCore() returns null and we MUST reach the bare
// `import("@ffmpeg/core")` branch. Setting this to `false` here keeps
// Rollup from tree-shaking that branch. The standalone config sets it
// to `true` to drop ~43 MB from the single-file HTML.
define: {
__WITH_STANDALONE_INLINE__: "false",
},
plugins: [
react(),
webCspPlugin(),


@@ -1010,6 +1010,11 @@
resolved "https://registry.yarnpkg.com/@esbuild/win32-x64/-/win32-x64-0.27.3.tgz#0eaf705c941a218a43dba8e09f1df1d6cd2f1f17"
integrity sha512-4uJGhsxuptu3OcpVAzli+/gWusVGwZZHTlS63hh++ehExkVT8SgiEf7/uC/PclrPPkLhZqGgCTjd0VWLo6xMqA==
"@ffmpeg/core@0.12.10":
version "0.12.10"
resolved "https://registry.yarnpkg.com/@ffmpeg/core/-/core-0.12.10.tgz#3177e88852bfbfaad5d258e9e0ac1fd9dffd3223"
integrity sha512-dzNplnn2Nxle2c2i2rrDhqcB19q9cglCkWnoMTDN9Q9l3PvdjZWd1HfSPjCNWc/p8Q3CT+Es9fWOR0UhAeYQZA==
"@ionic/cli-framework-output@^2.2.8":
version "2.2.8"
resolved "https://registry.yarnpkg.com/@ionic/cli-framework-output/-/cli-framework-output-2.2.8.tgz#29d541acc7773a6aaceec5f3b079937fbcef5402"