feat(wasm): FfmpegFallbackStrategy for MP4/MOV/M4V/MKV/WebM (#183)
Adds FfmpegFallbackStrategy as a peer to ExifToolFallbackStrategy, routing MP4/MOV/M4V (Phase 1) and MKV/WebM (Phase 2) through @ffmpeg/core. On by default for all three distributions (standalone HTML, Capacitor APK, PWA self-host); VITE_ENABLE_FFMPEG_FALLBACK=false opts out. Takes priority over VideoStrategy for the MP4 family; VideoStrategy stays registered as the opt-out fallback until a subsequent PR deletes it.

Closes #182. Closes #43.

Resolves the four documented walker KNOWN_GAPS categorically: handler-name leak (#38), compressor-name leak (#39), mvhd.next_track_id leak (#111), GPMF/GPS coordinates leak (#42). On gopro-fusion.mp4 (5.1 MB GPMF + tmcd + fdsc) and dji-phantom4.mov (236 MB UserData GPS log) the forensic battery reports zero device-fingerprint survival across every recovery technique.

Key architectural choices:

- **Main-thread @ffmpeg/core, not @ffmpeg/ffmpeg wrapper.** The wrapper hardcodes type:"module" Workers from Blob URLs, which fail silently under null-origin file:// in Chromium — the standalone build hung forever on every video strip. @ffmpeg/ffmpeg dropped from package.json.
- **Stream mapping -map 0 -map -0:d? -map -0:s? -map -0:t?**. Preserves input track order while dropping data/subtitle/timecode streams. Avoids the eng→und reorder bug of -map 0:v?/-map 0:a?, and sidesteps mat2's exit-234 on action-cam files (GoPro Fusion has tmcd/fdsc).
- **Post-strip pass rewrites the udta box type to 'free'** (ISO/IEC 14496-12 §8.1.2 padding) to neutralise ffmpeg's hardcoded HandlerType:Metadata + HandlerVendorID:Apple stub. Length-preserving so stco/co64 offsets stay valid. Handles both regular and largesize headers via headerStart+4.
- **mdhd.language left as ffmpeg's 'und'** — considered zeroing but reverted: 0x0000 is an invalid ISO 639-2/T code, ffprobe falls back to displaying '(eng)' for invalid codes (actively misleading downstream tools).
- **Diff race fix.** @uswriting/exiftool's parseMetadata uses module-level singletons (Perl, MemoryFS, stdout/stderr StringBuilders). WasmProcessor now serializes all diff builds across the processor's lifetime via a Promise chain — guarantees no two parseMetadata calls overlap, whether within an entry or across the fire-and-forget chunk-drained queue.
- **ExifTool family-1 group names surfaced verbatim** — IFD0, ExifIFD, XMP-dc, Track1, etc. Refuses to collapse to umbrella labels like 'EXIF' because the collapse caused (source, name) key collisions across sub-groups (Track1:HandlerType vs Track2:HandlerType produced spurious diffs on multi-track MP4).
- **Standalone HTML stays single-file.** Two-asset Vite plugin gzip+base64-inlines ffmpeg-core.js + ffmpeg-core.wasm into <script type=text/plain> tags, mirroring the zeroperl pattern. With tree-shaking via __WITH_STANDALONE_INLINE__ the standalone HTML went 116MB → 24MB.

Forensic verification: docs/forensic/ffmpeg-fallback.md + tools/forensic/ffmpeg-fallback.ts cover synthetic-mp4/mkv/webm + phone-baseline (2.7MB Android) + gopro-fusion (5MB action-cam) + dji-phantom4 (236MB drone) with zero sentinel/fingerprint survival across the recovery battery.

Gap analyses for all three formats at docs/gap-analysis/mp4-ffmpeg.md, mkv.md, webm.md. POC at docs/poc/ffmpeg-wasm.md.

Production deps go from 5 → 6: @ffmpeg/core@0.12.10 (GPL-2.0-or-later; combined distributable inherits, MIT codebase unchanged, source pointer in README per GPL compliance).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent b2dec037a8
commit a5546afa71
30 changed files with 2873 additions and 152 deletions
28
.github/workflows/ci.yml
vendored
@@ -39,6 +39,34 @@ jobs:
       - name: Check circular dependencies (madge)
         run: yarn check:deps
 
+  build-no-ffmpeg:
+    # Smoke check the opt-out path. When VITE_ENABLE_FFMPEG_FALLBACK=false,
+    # the ffmpeg strategy is omitted at registration time and VideoStrategy
+    # handles MP4/MOV/M4V. Verifies the bundle still builds without the
+    # ffmpeg engine in the strategy chain (issue #182).
+    name: Smoke build (VITE_ENABLE_FFMPEG_FALLBACK=false)
+    needs: test
+    runs-on: ubuntu-latest
+    env:
+      VITE_ENABLE_FFMPEG_FALLBACK: 'false'
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: 22
+          cache: 'yarn'
+
+      - name: Install dependencies
+        run: yarn install --frozen-lockfile
+
+      - name: Build web (opt-out)
+        run: yarn build:web
+
+      - name: Build standalone (opt-out)
+        run: yarn build:web:standalone
+
   e2e-web:
     name: E2E (Web)
     needs: test
14
CLAUDE.md
@@ -9,7 +9,7 @@ Privacy-focused metadata stripper. **Primary distributions: desktop offline stan
 - **Language**: TypeScript 5.7 with `strict: true` + `verbatimModuleSyntax: true` (type-check only; Vite/esbuild compile)
 - **Build**: `vite` 7.x — `vite.config.web.standalone.ts` produces the primary desktop output (`dist/web-standalone/index.html`, single-file inlined). `vite.config.web.ts` produces `dist/web/`, used as the source for the Android APK (via Capacitor `cap sync android`) and for the self-host PWA path. `dist/web/` is not a primary user-facing distribution by itself.
 - **Processing engine**: hand-rolled WASM/pure-TS `FormatStrategy` implementations registered in `src/infrastructure/wasm/strategy_registry.ts`. The registry is the sole authority for what is supported.
-- **Production deps (4)**: `jszip` (Office), `pdf-lib` (PDF), `react` + `react-dom` (UI).
+- **Production deps (6)**: `@ffmpeg/core` (FfmpegFallbackStrategy strip engine for MP4/MOV/MKV/WebM), `@uswriting/exiftool` (ExifToolFallbackStrategy strip + ExifToolDiffStrategy read engine), `jszip` (Office), `pdf-lib` (PDF), `react` + `react-dom` (UI).
 - **Performance is sacred**: the app should process hundreds of files in seconds. Never add sync I/O in the loop or heavy DOM operations per row.
 
 ## Commands
@@ -151,13 +151,15 @@ Root configs: `.prettierrc` (tabs), `.gitattributes` (`* text=auto eol=lf`), `vi
 
 ## Dependencies
 
-### Production (4)
+### Production (6)
 
 | Package | Purpose |
 | --- | --- |
-| `jszip` | ZIP archive read/write for the Office strategy (DOCX/XLSX/PPTX/ODT) + batch zip output |
-| `pdf-lib` | PDF metadata stripping |
-| `react`, `react-dom` | UI |
+| `@ffmpeg/core` | Single-threaded ffmpeg-wasm; the FfmpegFallbackStrategy's strip engine for MP4/MOV/MKV/WebM (#182). GPL-2.0; combined distributable inherits. |
+| `@uswriting/exiftool` | WebPerl-ExifTool wrapper; the ExifToolFallbackStrategy strip engine for WebP/GIF/AVIF + the ExifToolDiffStrategy read engine for the before/after diff feature. |
+| `jszip` | Office archive read/write (DOCX/XLSX/PPTX/ODT) + batch zip output. |
+| `pdf-lib` | PDF metadata stripping. |
+| `react`, `react-dom` | UI. |
 
 ### Dev
 
@@ -181,7 +183,7 @@ Root configs: `.prettierrc` (tabs), `.gitattributes` (`* text=auto eol=lf`), `vi
 - **Naming**: snake_case for filenames, camelCase for functions/variables, PascalCase for React components.
 - **CSS**: BEM (mandated for all new CSS).
 - **Fonts**: system stack only (`system-ui, -apple-system, BlinkMacSystemFont, ...`). No web font downloads, no bundled fonts.
-- **Dependencies**: prefer hand-rolling. Four production deps is the current ceiling; new deps need explicit justification.
+- **Dependencies**: prefer hand-rolling. Current count is 6 production deps; new deps need explicit justification.
 - **Error handling**: throw `Error` objects; surface errors via `Result<T, E>` shapes (see typescript-conventions.md).
 - **i18n**: add translations to `.resources/strings.json`.
 - **Performance is sacred**: see Tech Stack. Batch operations should feel instant.
13
README.md
@@ -196,3 +196,16 @@ The codebase has been substantially rewritten since:
 
 All upstream contributors are credited in the original [ExifCleaner README](https://github.com/szTheory/exifcleaner#contributors). MIT license preserved throughout.
 
+## Third-party engines and license notices
+
+MetaScrub bundles two upstream WebAssembly engines as build-time-optional fallback strategies. Both default to **on** for the standalone HTML and Android APK distributions; set the corresponding env var to `false` at build time to omit the engine.
+
+| Engine | Used for | Build flag (env) | License | Source |
+|---|---|---|---|---|
+| [ffmpeg-wasm](https://github.com/ffmpegwasm/ffmpeg.wasm) (`@ffmpeg/core`) | MP4 / MOV / M4V / MKV / WebM strip via `FfmpegFallbackStrategy` (#182) | `VITE_ENABLE_FFMPEG_FALLBACK` | `@ffmpeg/core`: **GPL-2.0-or-later** (the WASM build includes GPL components from upstream ffmpeg). Loaded directly on the main thread — no `@ffmpeg/ffmpeg` wrapper. | <https://github.com/ffmpegwasm/ffmpeg.wasm> |
+| [WebPerl ExifTool](https://github.com/6over3/zeroperl-ts) (`@uswriting/exiftool` + `@6over3/zeroperl-ts`) | WebP / GIF / AVIF strip via `ExifToolFallbackStrategy` (#174); diff via `ExifToolDiffStrategy` (#177) | `VITE_ENABLE_EXIFTOOL_FALLBACK` | Apache-2.0 (wrappers). ExifTool itself is GPL or Artistic. | <https://github.com/6over3/zeroperl-ts> · <https://exiftool.org/> |
+
+**GPL-2.0 implication for ffmpeg**: distributions of MetaScrub that include `ffmpeg-core.wasm` (the default for standalone HTML + APK builds) are subject to GPL-2.0 for the combined work. Our codebase remains MIT (no GPL source is copied into our source tree), but the combined binary distribution must comply with GPL-2.0's source-availability requirement. That requirement is met by linking to <https://github.com/ffmpegwasm/ffmpeg.wasm> — the upstream is fully open and we pin specific versions in `package.json` (recoverable from `git log` plus the lockfile).
+
+Builds with `VITE_ENABLE_FFMPEG_FALLBACK=false` omit the ffmpeg engine from the strategy chain (VideoStrategy handles MP4/MOV/M4V; MKV/WebM become unsupported). The combined binary in that mode contains no GPL-licensed code.
+
112
docs/forensic/ffmpeg-fallback.md
Normal file
@@ -0,0 +1,112 @@
# FfmpegFallbackStrategy forensic recovery test

**Date:** 2026-05-21
**Goal:** Verify that metadata stripped by `FfmpegFallbackStrategy` (#182 Phase 1) cannot be recovered by standard forensic tooling. Cover synthetic MP4 (every metadata source seeded with sentinels via exiftool) plus real-world fixtures (`phone-baseline.mp4`, `gopro-fusion.mp4`). Compare against the gap-analysis policy in [`docs/gap-analysis/mp4-ffmpeg.md`](../gap-analysis/mp4-ffmpeg.md).

**Reproducible at:** [`tools/forensic/ffmpeg-fallback.ts`](../../tools/forensic/ffmpeg-fallback.ts) — `npx tsx tools/forensic/ffmpeg-fallback.ts` from the project root.

## Methodology

The runner replicates the strategy's strip invocation against `@ffmpeg/core` 0.12.10 directly. The strategy class itself uses the browser-only `@ffmpeg/ffmpeg` wrapper (Node import = empty module per package.json conditional exports). The runner therefore exercises the engine + arg vector, not the wrapper boilerplate — what matters forensically is what ffmpeg's MP4 demuxer/muxer pair does, not the wrapper.

Strip command (matches `FfmpegFallbackStrategy.strip`):

```
ffmpeg -i in.mp4 \
  -map 0 -map -0:d? -map -0:s? -map -0:t? \
  -map_metadata -1 -map_chapters -1 \
  -fflags +bitexact \
  -c copy \
  -movflags +faststart \
  -metadata "encoder=" \
  out.mp4
```

## Fixtures + results

| Fixture | Bytes (in → out) | Sentinels (in → out) | Verdict |
|---|---:|---:|---|
| `synthetic-mp4` (1 s blue frame, seeded with `Title`, `Author`, `Comment`, `Encoder`, `Description` via exiftool) | 5 647 → 2 238 | 5 → **0** | ✓ clean |
| `synthetic-mkv` (1 s green frame; sentinels seeded via `ffmpeg -metadata`) | 1 991 → 1 761 | 4 → **0** | ✓ clean |
| `synthetic-webm` (1 s green frame; sentinels seeded via `ffmpeg -metadata`) | 1 190 → 991 | 4 → **0** | ✓ clean |
| `phone-baseline.mp4` (samplelib, 2.7 MB modern Android) | 2 848 208 → 2 848 111 | n/a (no seeded sentinels; checks device fingerprints) | ✓ clean |
| `gopro-fusion.mp4` (gpmf-parser repo, 5.1 MB, GPMF + tmcd + fdsc streams) | 5 377 407 → 5 000 118 | 7 device fingerprints (`GoPro AVC`, `gpmd`, `GoPro AAC`, `GoPro TCD`, `GoPro MET`, `GoPro SOS`, `Fusion`) → **0** | ✓ clean |
| `dji-phantom4.mov` (Zenodo record 3604005, 248 MB decimal ≈ 236 MiB, opt-in via `--include-large`) | 248 007 654 → 247 901 513 | 5 device fingerprints (`FC6310` drone model, `AVC encoder`, `DJI.AVC`, `DJI.Meta`, `55 deg` GPS lat) → **0** | ✓ clean |

DJI Phantom 4 also carries the **full GPS flight log** under `[UserData] GPSCoordinates` — `55 deg 30' 26.75" N, 10 deg 43' 3.01" E, 10.8 m Above Sea Level` in the input. `exiftool -G1 -a -s` on the stripped output returns no GPS fields whatsoever — the entire UserData block is dropped by `-map_metadata -1`.

## Performance

End-to-end timings, captured on the development host (single Node process, warm module cache):

| Fixture | Size | Wall time | Peak RSS |
|---|---:|---:|---:|
| Each synthetic (mp4 / mkv / webm) | 1–6 KB | < 100 ms | ~250 MB |
| phone-baseline.mp4 | 2.7 MB | ~60 ms | ~500 MB |
| gopro-fusion.mp4 | 5.1 MB | ~95 ms | ~800 MB |
| **dji-phantom4.mov** | **236 MB** | **~1.0 s** | **~1.2 GB** |
| Full battery (6 fixtures) | ~250 MB total | **4.2 s** | ~1.7 GB |

The DJI run is the strongest real-world data point: a 236 MB file stripped in roughly one second wall time, peak memory at ~5× input size. WASM linear memory caps at 4 GB, so desktop browsers comfortably handle files up to ~700 MB; mobile WebView with tighter memory ceilings hits ~250 MB as the practical limit — same constraint the walker has today, documented under `#34` (streaming I/O follow-up).

**Recovery battery applied to each output:**

1. `strings | grep <sentinel>` — direct byte-level survival check
2. Device-fingerprint strings from real-world fixture manifests — same byte-level scan against the manifested fingerprints

The runner exits with code 0 iff every fixture comes back clean (zero sentinel + zero fingerprint survival) and no unexpected `KNOWN_GAPS` are surfaced. Phase 1 lands with `KNOWN_GAPS` **empty**.
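
The byte-level survival check in the recovery battery can be sketched in TypeScript as follows. This is a minimal illustration under stated assumptions, not the runner's code: `findSurvivors` and `isClean` are hypothetical names, and sentinels are assumed to be plain ASCII (which is what `strings | grep` would surface).

```typescript
// Hypothetical helper, not taken from tools/forensic/ffmpeg-fallback.ts.
// Returns the sentinels whose ASCII bytes still occur anywhere in `output`.
function findSurvivors(output: Uint8Array, sentinels: string[]): string[] {
	const survivors: string[] = [];
	for (const sentinel of sentinels) {
		if (sentinel.length === 0) continue; // an empty needle would match everywhere
		const lastStart = output.length - sentinel.length;
		for (let start = 0; start <= lastStart; start++) {
			let matched = true;
			for (let j = 0; j < sentinel.length; j++) {
				// Assumption: sentinels are ASCII, so one char code equals one byte.
				if (output[start + j] !== sentinel.charCodeAt(j)) {
					matched = false;
					break;
				}
			}
			if (matched) {
				survivors.push(sentinel);
				break; // a single hit is enough to fail the fixture
			}
		}
	}
	return survivors;
}

// A fixture counts as clean only when no sentinel or fingerprint survived.
function isClean(output: Uint8Array, sentinels: string[]): boolean {
	return findSurvivors(output, sentinels).length === 0;
}
```

The same function would serve both battery steps, since seeded sentinels and manifested device fingerprints are both plain strings scanned against the output bytes.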

## Comparison vs. current VideoStrategy walker

From `docs/forensic/video.md`, the walker has documented KNOWN_GAPS on the same fixtures:

| Channel | Walker on this fixture | ffmpeg-wasm |
|---|---|---|
| `HDLR_NAME_VIDEO` / `GoPro AVC` / etc. (#38) | LEAKED | Removed |
| `COMPRESSORNAME` (#39) | LEAKED | Removed |
| `gpmd-magic` / GoPro device strings | LEAKED on real-world | Removed |
| `mvhd.next_track_id` (#111) | LEAKED | Rewritten by ffmpeg muxer |
| `GPMF` GPS coordinates | LEAKED | Removed (gpmd track dropped) |

Categorical improvement on all known-gap channels for the formats claimed in this PR.

## Comparison vs. mat2 (= ffmpeg `-codec copy -map_metadata -1` with default `-map 0`)

mat2's invocation fails on `gopro-fusion.mp4` (exit 234 from `Could not find tag for codec none in stream #2, codec not currently supported in container` — the `tmcd` and `fdsc` data streams). Our `-map 0 -map -0:d? -map -0:s? -map -0:t?` choice drops those streams entirely, sidestepping the muxer's codec-tag refusal and producing a clean output. mat2 produces no output on this fixture; we produce 5 MB clean output.

## Post-strip rewrite of ffmpeg's udta stub

The strategy runs a post-strip pass (`cleanFfmpegMp4Output` in `src/infrastructure/wasm/strategies/ffmpeg_post_strip.ts`) over each MP4 output before returning it. The runner above replicates the raw ffmpeg invocation only — the post-strip pass is unit-tested separately. Two policy changes have settled into the pass (both verified against the strategy's e2e + `phone-baseline.mp4` direct exiftool dumps):

| Surface | Behaviour | Why |
|---|---|---|
| `moov/udta` and `moov/trak/udta` | Box type rewritten to `free` in place (length-preserving). | ffmpeg's MP4 muxer writes a 0x21-byte `udta/meta/hdlr` block unconditionally; handler_type is `mdir` (exiftool: `HandlerType: Metadata`) and vendor is hardcoded `appl` (exiftool: `HandlerVendorID: Apple`). Renaming to `free` (ISO/IEC 14496-12 §8.1.2 padding) makes every spec-conformant reader skip the contents. Length-preserving so `stco`/`co64` chunk offsets stay valid. |
| `moov/trak/mdia/mdhd.language` | Left as ffmpeg's default `und` (0x55C4). | We considered zeroing it to suppress the `MediaLanguageCode` diff row but reverted: 0x0000 is an invalid ISO 639-2/T code; ffprobe falls back to `(eng)`, actively misleading downstream tools. `und` is the spec's canonical "no language specified" marker; every reader handles it predictably. When the input had a real language the resulting `eng → und` diff row is honest information — we removed the user's language tag. |

Verification of `udta → free` on `phone-baseline.mp4` (full exiftool group-1 dump):

- **Before** the pass: exiftool surfaces `HandlerType: Metadata` and a (zero) `HandlerVendorID` from the udta block.
- **After** the pass: neither row appears. The per-track `Handler Type: Video Track` / `Audio Track` rows that remain are mandatory mdia/hdlr — legitimate track descriptors, not muxer-added.

ffprobe regression check: still parses the file normally, frame counts unchanged, byte length unchanged. The renamed `free` box sits in moov and is skipped by parsers.
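
To make the length-preserving rename concrete, here is a rough TypeScript sketch of the idea. It is not the code of `cleanFfmpegMp4Output`, which is not shown in this document: `readBoxHeader` and `renameUdtaToFree` are hypothetical names, and the sketch assumes a well-formed, non-fragmented file with `moov` at the top level. It relies only on the ISOBMFF header layout (32-bit size, four-character type, optional 64-bit largesize when size is 1).

```typescript
// Hypothetical sketch; names and structure are illustrative assumptions.
const FREE_TYPE = [0x66, 0x72, 0x65, 0x65]; // ASCII "free"

interface BoxHeader {
	type: string;
	headerSize: number; // 8 normally, 16 when a 64-bit largesize follows the type
	size: number; // total box length, header included
}

function readBoxHeader(view: DataView, offset: number, end: number): BoxHeader | null {
	if (offset + 8 > end) return null;
	let size = view.getUint32(offset); // DataView reads big-endian by default
	let type = "";
	for (let i = 4; i < 8; i++) type += String.fromCharCode(view.getUint8(offset + i));
	let headerSize = 8;
	if (size === 1) {
		if (offset + 16 > end) return null;
		// largesize: combine two 32-bit halves (exact for sizes below 2^53)
		size = view.getUint32(offset + 8) * 4294967296 + view.getUint32(offset + 12);
		headerSize = 16;
	} else if (size === 0) {
		size = end - offset; // box extends to the end of its container
	}
	if (size < headerSize || offset + size > end) return null; // malformed: stop walking
	return { type, headerSize, size };
}

// Renames moov/udta and moov/trak/udta to "free" in place; returns the rename count.
function renameUdtaToFree(bytes: Uint8Array): number {
	const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
	let renamed = 0;
	const walk = (start: number, end: number, path: string): void => {
		let offset = start;
		while (offset < end) {
			const box = readBoxHeader(view, offset, end);
			if (box === null) return;
			const boxEnd = offset + box.size;
			if (box.type === "udta" && path !== "") {
				// The type field sits at header start + 4 for regular and largesize boxes alike.
				bytes.set(FREE_TYPE, offset + 4);
				renamed++;
			} else if (path === "" && box.type === "moov") {
				walk(offset + box.headerSize, boxEnd, "moov");
			} else if (path === "moov" && box.type === "trak") {
				walk(offset + box.headerSize, boxEnd, "moov/trak");
			}
			offset = boxEnd;
		}
	};
	walk(0, bytes.byteLength, "");
	return renamed;
}
```

Only four bytes per box change and no size field is touched, which is why chunk offsets recorded elsewhere in the file remain valid.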

## Caveats / scope

- **Synthetic fixture only seeds container-level metadata** via exiftool. The synthetic does not exercise `tmcd`/`fdsc`/`gpmd` codec-tag refusal; the real-world `gopro-fusion.mp4` does. Future expansion: add a synthetic with a fake `gpmd` stream so the gpmd-drop path is exercisable in air-gapped CI without the 5 MB fixture download.
- **Forensic runner does not invoke the post-strip pass directly** — only the ffmpeg invocation. The pass has its own unit tests (`tests/infrastructure/wasm/ffmpeg_post_strip.test.ts`) covering top-level udta rename, per-track udta rename, length preservation, and the no-udta no-op. The e2e helper (`tests/e2e/web/helpers/metadata_assertions.ts`) additionally asserts the `00 00 00 21 75 64 74 61` udta box-header signature does not survive in the stripped output bytes. Wiring the post-strip pass into the forensic runner is a follow-up.
- **DJI Phantom 4 fixture not run** in default battery (248 MB download; opt-in via `--include-large` on the fetch script). DJI behaviour matches GoPro Fusion (same `tmcd`/`gpmd` story); covered by the inherent `-map 0 -map -0:d? -map -0:s? -map -0:t?` policy.
- **MKV/WebM exercised on synthetic fixtures only.** The results table above covers `synthetic-mkv` and `synthetic-webm` (Phase 2 of #182); real-world MKV/WebM fixtures have not been added to this runner yet.
- **Sidecar files** (`.SRT`, `.LRV`, `.THM`, `.LRF`) — out of scope here per #46; no in-file strategy addresses them.

## Reproducing

```bash
# One-time fixture setup
./tools/forensic/fetch-video-fixtures.sh # ~8 MB (phone + gopro)
# Run the battery
npx tsx tools/forensic/ffmpeg-fallback.ts
```

Required tools (host): `ffmpeg`, `exiftool` (for synthetic fixture seeding only — the strip phase runs entirely in-WASM). Optional: `node --permission --allow-fs-read='*' --allow-fs-write='*' --allow-child-process …` to confirm no network capability is exercised end-to-end.

Output: stdout summary + `/tmp/ffmpeg-fallback-forensic/report.json` with per-fixture sentinel/fingerprint survival data.
93
docs/gap-analysis/mkv.md
Normal file
@@ -0,0 +1,93 @@
# MKV (Matroska) — ffmpeg-wasm strategy gap analysis

**Date:** 2026-05-21
**Goal:** Document the per-source policy for `FfmpegFallbackStrategy` on Matroska (`.mkv`) inputs. Phase 2 of issue #182. Compare against ExifTool's `-all=` (limited writer support), mat2, and the (deferred) hand-rolled EBML walker approach in `docs/superpowers/plans/2026-05-05-v5-mkv-webm-avi-strategy.md`.

---

## Methodology

- POC writeup: [`docs/poc/ffmpeg-wasm.md`](../poc/ffmpeg-wasm.md) — same engine + same invocation as MP4
- ffmpeg source `libavformat/matroskaenc.c` — matroska muxer behaviour reference
- EBML / Matroska specifications — element registry, semantics
- Empirical test: synthetic `.mkv` generated via `ffmpeg -f lavfi -i color=…`, seeded with `title`, `description`, `comment`, `encoder` metadata via `-metadata`, then stripped via `FfmpegFallbackStrategy` (the same `@ffmpeg/core` invocation)
- Out of band: `exiftool -all=` writes to MKV with **limited** support (refuses some chapter / attachment removals), which is the original reason `VideoStrategy` was scoped as MP4-only and MKV was deferred

The starting policy is the same invocation as MP4, minus `-movflags +faststart` (which is MP4-specific and the matroska muxer ignores with a warning).
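
As an illustration of "same invocation, minus `-movflags +faststart`", the argument vector could be assembled per container family roughly like this. `buildStripArgs` and `ContainerFamily` are hypothetical names introduced for the sketch; the flags themselves are the ones in the strip command documented in `docs/forensic/ffmpeg-fallback.md`, but how the strategy actually builds its arguments is an assumption.

```typescript
// Hypothetical sketch: names are illustrative, flags come from the documented strip command.
type ContainerFamily = "mp4" | "matroska";

function buildStripArgs(input: string, output: string, family: ContainerFamily): string[] {
	const args: string[] = [
		"-i", input,
		// Keep every stream, then remove data, subtitle and attachment streams if present.
		"-map", "0", "-map", "-0:d?", "-map", "-0:s?", "-map", "-0:t?",
		// Drop global/per-stream metadata and chapters.
		"-map_metadata", "-1", "-map_chapters", "-1",
		"-fflags", "+bitexact",
		// Remux only; no re-encode.
		"-c", "copy",
	];
	if (family === "mp4") {
		// MP4-specific: move the moov box ahead of the media data.
		args.push("-movflags", "+faststart");
	}
	args.push("-metadata", "encoder=", output);
	return args;
}
```

For example, `buildStripArgs("in.mkv", "out.mkv", "matroska")` yields the MP4 vector without the two `-movflags` entries, so the two families cannot silently drift apart on the privacy-relevant flags.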

---

## Per-source policy

### Track-level (the `-map` axis)

| Source track | Carries privacy data? | Policy | Reasoning |
|---|---|---|---|
| Video tracks | Frames — no | **Keep — implicit `-map 0`** | Content |
| Audio tracks | Samples — no | **Keep — implicit `-map 0`** | Content |
| Subtitle tracks (`S_TEXT/SRT`, `S_TEXT/UTF8`, `S_TEXT/SSA`, `S_HDMV/PGS`) | Possibly | **Drop** | Most MKV subtitles are content (legit captions, fansubs). But MKV is also the container of choice for some action-cam concatenation tools that fold sidecar SRT into the file — those SRTs carry GPS. Conservative: drop. Legitimate-subtitle edge case flagged for follow-up UX work. |
| Attachment tracks (`A_*`) — fonts, cover art, mtree | **Yes** | **Drop** | Attachments routinely include EXIF-tainted cover art JPEGs and fonts that fingerprint the muxing environment. |
| Chapter tracks | User-authored text | **Dropped via `-map_chapters -1`** | Chapter titles leak project / film names. |

### Container-level metadata

| Source | Lives in | mat2 / ffmpeg default | Our policy | Notes |
|---|---|---|---|---|
| `\Segment\Info\Title` | EBML | Dropped via `-map_metadata -1` | **Dropped** | User-facing title. |
| `\Segment\Info\MuxingApp` | EBML | Rewritten by matroska muxer to `Lavf<version>` | **Suppressed via `-metadata encoder=`** | Strip-tool fingerprint per privacy-invariants §6. |
| `\Segment\Info\WritingApp` | EBML | Rewritten to muxer default | **Accept default** | Less-fingerprint-y than preserving original (could be device-specific). |
| `\Segment\Info\DateUTC` | EBML | Zeroed by `-fflags +bitexact` | **Zeroed** | Privacy-invariants §6 epoch policy. |
| `\Segment\Info\SegmentUID` | EBML | Re-randomised by muxer | **Accept default** | UID changes per-mux — not a stable fingerprint, but worth confirming in forensic verification. |
| `\Segment\Tags` (per-track + global) | EBML | Dropped via `-map_metadata -1` | **Dropped** | Title, artist, copyright, comment, etc. |
| Per-track `Name`, `Language`, `CodecPrivate` | EBML | Mostly preserved; `Name` dropped via `-map_metadata -1`? **Partial** | **Best-effort drop** | Forensic verification confirms behaviour. |

---

## Honest gap summary

### vs. ExifTool standalone

ExifTool's MKV writer is partial — it can read but cannot delete several elements (chapters, attachments). Our ffmpeg-wasm path drops these categorically by re-muxing without them. **Better than ExifTool** on MKV.

### vs. mat2 (`ffmpeg -map 0 -codec copy -map_metadata -1`)

Should behave equivalently on standard MKV (same engine + similar invocation). Dropping data streams via `-map -0:d?` protects against the same data-stream codec-tag issue MP4 has, though it's less common in MKV.

### Empirical verification

Synthetic MKV with 4 seeded sentinels (`title`, `description`, `comment`, `encoder`) → 0 survivors. Documented in `tools/forensic/ffmpeg-fallback.ts` Phase 2 row.

### Deferred / out of scope

- **Attachment-track scrubbing of embedded JPEGs/fonts.** Our policy drops the attachment tracks entirely, which is sufficient — but if a user *wants* to keep an attached subtitle font, we currently can't preserve it without also keeping its potential EXIF leaks.
- **Real-world MKV fixtures** — the forensic runner only covers a synthetic fixture today. Real-world MKV testing depends on having a representative MKV in `tests/fixtures/wasm/video/real-world/`. Tracked as a follow-up.

---

## Limitations / unaudited matroska muxer fingerprints

This PR (Phase 2 of #182) does **not** add a matroska post-strip pass. `FfmpegFallbackStrategy.strip()` runs `cleanFfmpegMp4Output()` (in `src/infrastructure/wasm/strategies/ffmpeg_post_strip.ts`) only for MP4-family containers — matroska output is whatever ffmpeg's muxer wrote, unaudited byte-for-byte.

Candidate fingerprints ffmpeg's matroska muxer is known to write into `\Segment\Info`:

- `MuxingApp` — typically `"Lavf<version>"`; explicit muxer fingerprint
- `WritingApp` — sometimes `"Lavf<version>"`, sometimes inherited from the source
- `SegmentUID` — fresh random per mux (not a stable fingerprint, but a per-output identifier worth confirming)
- `DateUTC` — should be epoch under `-fflags +bitexact`, worth confirming empirically

Some of these *should* be suppressed by our existing `-metadata encoder=` and `-fflags +bitexact` flags (see the per-source policy table above), but this has not been verified against the actual byte output. The current forensic battery checks for seeded `-metadata` sentinels, not for muxer-injected fingerprints.

This is also a hole in `assertVideoStripped()` in `tests/e2e/web/helpers/metadata_assertions.ts` — that helper checks for `mdirappl`, `btrt`, `VideoHandler`, `SoundHandler` (all MP4-family). Running it against an MKV/WebM output would pass even if `Lavf<version>` leaked.

Deferred because: (1) scope discipline for Phase 2 of #182, (2) no real-world MKV/WebM fixture exists in `tests/fixtures/wasm/video/real-world/` yet — real-world fixture work is already flagged as a follow-up in the PR description, so empirical muxer-fingerprint verification can't be wired into CI today.

**Suggested follow-up:**

1. Extend `assertVideoStripped()` (`tests/e2e/web/helpers/metadata_assertions.ts`) with a matroska branch that checks the output bytes for `Lavf`, `MuxingApp`, and `WritingApp` substrings.
2. If any leak, add an EBML post-strip pass — `cleanFfmpegMatroskaOutput()` in `ffmpeg_post_strip.ts`, paralleling the existing MP4 helper — to zero the offending elements in place.

---

## Recommendation

Adopt the same invocation as MP4, minus `-movflags +faststart`. Phase 2 of #182 lands `.mkv` in the strategy's claim set. Forensic verification (`tools/forensic/ffmpeg-fallback.ts`) confirms zero sentinel survival on the synthetic MKV battery.
161
docs/gap-analysis/mp4-ffmpeg.md
Normal file
@@ -0,0 +1,161 @@
# MP4 / MOV / M4V — ffmpeg-wasm strategy gap analysis

**Date:** 2026-05-21
**Goal:** Document the per-source policy for `FfmpegFallbackStrategy` on MP4/MOV/M4V inputs — what bytes get dropped, rewritten, or preserved when ffmpeg's `-codec copy` remux runs with the project's privacy-strip invocation. Covers Phase 1 of issue #182. Compare against the current `VideoStrategy` box-tree rewriter, ExifTool's `-all=`, and mat2's `-codec copy -map_metadata -1`.

---

## Methodology

- POC writeup: [`docs/poc/ffmpeg-wasm.md`](../poc/ffmpeg-wasm.md) — package install, bundle, privacy audit, functional + performance results
- ISO/IEC 14496-12:2022 (ISOBMFF) — handler type registry, box type definitions
- ffmpeg source `libavformat/movenc.c` — MP4 muxer behaviour reference
- Existing comparison battery: `docs/forensic/video.md` (synthetic + real-world fixtures, mat2 + ExifTool columns)
- Real-world POC runs on `phone-baseline.mp4` (2.7 MB samplelib), `gopro-fusion.mp4` (5.1 MB), `sample-fragmented.mp4` (fragmented MP4 with `moof`)

The starting policy is the invocation validated in the POC. As of the post-PR-#183 review, the stream selection was changed from `-map 0:v? -map 0:a?` to `-map 0 -map -0:d? -map -0:s? -map -0:t?` — the older form put all video streams first, then all audio, swapping track order for files where audio came first in the input. The new negative-selector form preserves input track order while still dropping data / subtitle / timecode streams. Full command:

```
ffmpeg -i in.mp4 -map 0 -map -0:d? -map -0:s? -map -0:t? \
  -map_metadata -1 -map_chapters -1 -fflags +bitexact \
  -c copy -movflags +faststart out.mp4
```

---

## Per-source policy

### Stream-level (the `-map` axis)

| Source stream type | Example handler | Carries privacy data? | Policy | Reasoning |
|---|---|---|---|---|
| `vide` (video) | Camera capture, screen recording | Frames themselves — no | **Keep — implicit `-map 0`** | Core content |
| `soun` (audio) | Microphone, system audio | Audio samples — no | **Keep — implicit `-map 0`** | Core content |
| `subt`, `text`, `sbtl` (subtitle / text) | DJI SRT-style telemetry, captions | **Yes** (GPS in DJI; nothing in legit captions) | **Drop — `-map -0:s?`** | Action-cam SRT is the worst real-world leak channel; legit captions are rare on MP4 (most subtitles ship as sidecars). Flagged as edge case below. |
| `tmcd` (timecode) | GoPro, DJI, dashcams | **Yes** (timecode = clock) | **Drop** | Was the GoPro Fusion default-`-map 0` failure trigger. Dropping it removes the leak *and* fixes the muxer error. |
| `fdsc` (description) | GoPro Fusion, some Sony | **Yes** (device fingerprint) | **Drop** | Same root cause as tmcd; codec `none` in MP4 muxer. |
| `meta` / `gpmd` (timed metadata) | GoPro GPMF | **Yes** (GPS, gyro, accelerometer) | **Drop** | The whole point of stripping action-cam footage. |
| `clcp` (closed caption) | Some broadcast / iMovie exports | Possibly | **Drop** | Rare; if legit content is here, user knows to keep their captions in a sidecar `.srt`. Edge case below. |
| `hint` (RTP hint track) | Server-streamed MP4 | No content; encoder fingerprint | **Drop** | Server-side artifact, not content. |
| Any other (`alis`, `url`, etc.) | Various | Sometimes | **Drop** | Aux references; content-free. |

|
||||
The `-map 0 -map -0:d? -map -0:s? -map -0:t?` pattern keeps every stream by default, then explicitly removes data ("d"), subtitle ("s"), and attachment/timecode ("t") streams. `?` on each removal makes it optional (no error if the type isn't present). Track ORDER among surviving streams matches input — important so an audio-first input doesn't get reordered to video-first in the output. Files with only audio or only video both work.
|
||||
|
||||
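The ordering argument above can be illustrated with a toy model. This is a minimal sketch, not ffmpeg's real selection code: the `Stream` type and both function names are hypothetical, and the model only captures "filter in place" versus "video first, then audio".

```typescript
// Hypothetical model of the two stream-selection forms; not ffmpeg's code.
type StreamKind = "video" | "audio" | "subtitle" | "data" | "attachment";

interface Stream {
  index: number; // position in the input file
  kind: StreamKind;
}

// Models `-map 0 -map -0:d? -map -0:s? -map -0:t?`:
// start from every stream, then filter out the unwanted kinds in place.
function selectNegative(streams: Stream[]): Stream[] {
  const dropped: StreamKind[] = ["data", "subtitle", "attachment"];
  return streams.filter((s) => dropped.indexOf(s.kind) === -1);
}

// Models `-map 0:v? -map 0:a?`: all video first, then all audio.
function selectVideoThenAudio(streams: Stream[]): Stream[] {
  return streams
    .filter((s) => s.kind === "video")
    .concat(streams.filter((s) => s.kind === "audio"));
}

// An audio-first input with one data track (e.g. a timecode track).
const audioFirstInput: Stream[] = [
  { index: 0, kind: "audio" },
  { index: 1, kind: "video" },
  { index: 2, kind: "data" },
];

const keptOrder = selectNegative(audioFirstInput).map((s) => s.index);
const reordered = selectVideoThenAudio(audioFirstInput).map((s) => s.index);
console.log(keptOrder, reordered);
```

In this model the negative-selector form keeps the audio track ahead of the video track, while the older form swaps them.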
### Container-level metadata (the `-map_metadata` axis)

| Source | Lives in | mat2 / ffmpeg default | Our policy | Notes |
|---|---|---|---|---|
| `udta` children (`©nam`, `©ART`, `©cmt`, `©day`, `©too`, …) | `moov/udta` | Dropped via `-map_metadata -1` | **Dropped** | Apple-style four-cc tags. |
| `meta`/`keys`/`ilst` (iTunes-style key-value) | `moov/meta` | Dropped | **Dropped** | Where modern iOS writes most metadata. |
| `Xtra` (Windows Media key-value) | `moov/udta/Xtra` | Dropped | **Dropped** | Windows Media Player metadata. |
| XMP via `uuid` box | `moov/uuid` (well-known UUID) | Dropped | **Dropped** | Adobe XMP packet. |
| Vendor `uuid` boxes | `moov/uuid` | Dropped (unrecognized) | **Dropped** | DRM-allowlist boxes; irrelevant for our threat model, since the user has the file anyway. |
| Per-stream metadata (`Stream Metadata: handler_name`, `encoder`, `vendor_id`) | Inside each `trak/mdia/hdlr`, `stsd` sample entries | Dropped via `-map_metadata -1`? **Partial** | See below | **Important caveat:** mat2 / `-map_metadata -1` does not always clear `handler_name` or sample-entry `compressorname`. ffmpeg's MP4 muxer rewrites these to defaults during remux (`VideoHandler`, `SoundHandler`), which is what the POC observed. |

POC observation on `gopro-fusion.mp4`: after our invocation, the stripped output shows generic `VideoHandler` / `SoundHandler` strings: **GoPro device names, GPMF magic, and the GoPro AVC encoder string are all gone**. This is the categorical improvement over the current `VideoStrategy` walker, which leaves these (#38, #39 KNOWN_GAPS).
### Container brand / ftyp normalisation

ffmpeg's MP4 muxer writes its own `ftyp` brand list and `mvhd` matrix. Defaults observed in POC:

| Field | ffmpeg default | Walker (today) | Privacy policy |
|---|---|---|---|
| `ftyp` major_brand | `isom` (with `-movflags +faststart`) | Preserved from input | **Accept ffmpeg default**: uniform across all inputs is less fingerprint-y than preserving device-specific brands (`mp42+qt ` = iPhone, `iso4` = Android stock, `3gp4` = older 3GPP). |
| `ftyp` compatible_brands | `[isom, iso2, avc1, mp41]` (typical) | Preserved | **Accept ffmpeg default**: same reasoning. |
| `mvhd` matrix | Identity matrix from ffmpeg | Preserved from input | Accept ffmpeg default; identity matrix is universal. |
| `mvhd.creation_time` / `mvhd.modification_time` | 0 with `-fflags +bitexact` | Walker zeroes these per privacy invariant §6 | ✅ Matches policy |
| `tkhd.creation_time` / `tkhd.modification_time` | 0 with `-fflags +bitexact` | Walker zeroes | ✅ Matches policy |
| `mdhd.creation_time` / `mdhd.modification_time` | 0 with `-fflags +bitexact` | Walker zeroes | ✅ Matches policy |
| `mvhd.next_track_id` | Set to count+1 by ffmpeg | LEAKED (#111) | **Rewritten by ffmpeg**: closes #111 categorically |
| `Stream Metadata: encoder` (Lavf string) | `Lavf61.7.100` (or whatever version is bundled in ffmpeg-core 0.12.10) | n/a (walker doesn't write this) | **Override to empty** via `-metadata "encoder="`, which suppresses the strip-tool fingerprint per privacy-invariants §6 |
| `Stream Metadata: major_brand`, `compatible_brands` per-stream | ffmpeg writes these | n/a | Default; same reasoning as ftyp |
### Edits / chapters / sidx

| Source | mat2 / ffmpeg default | Our policy | Notes |
|---|---|---|---|
| `edts/elst` (edit list) | Preserved (`-c copy`) | **Preserved** | Edits affect playback semantics (trimmed clips). Walker treats edts as data, not metadata. |
| `moov/chap` (chapter references) | Dropped via `-map_chapters -1` | **Dropped** | Chapter names are user-authored text and can leak film-set names, project names, etc. |
| `sidx` (segment index, fragmented MP4) | Rewritten or dropped when defragmenting | **Categorically gone**: input fragmented MP4 becomes flat output | User has accepted defragmentation as a trade-off (per #182 discussion). |
| `mfra` (movie fragment random access) | Dropped during defragmenting | **Categorically gone** | Same as `sidx`. |
| `moof` / `traf` (fragmented MP4 fragments) | Rewritten as flat `moov/mdat` | **Dropped (defragmented)** | TRAF_META_FRAGMENT (#36) is gone categorically. |

---
## Honest gap summary

### Current `VideoStrategy` walker vs. ffmpeg

The walker has **6 active KNOWN_GAPS** (`docs/forensic/video.md`): handler names, compressor names, GoPro device strings, `mvhd.next_track_id`, fragmented-traf entries, and synthetic `mdat` orphans. ffmpeg closes **all six categorically** by re-writing the container from the stream tables rather than blanking boxes in-place.

The walker has **1 advantage**: byte-preservation of structure. `stco`/`co64`/`sidx` byte offsets that are valid in the input remain valid in the output. This is the property the project's current privacy-invariants §6 leans on for forensic-fidelity claims. ffmpeg cannot make this claim, because its output is structurally different.

User has explicitly accepted this trade in #182: for the YouTube/Telegram/social upload audience (per project direction shift, the primary one), defragmentation and structural rewrites don't matter; the metadata removal does.
### ffmpeg vs. mat2 (`-codec copy -map_metadata -1`)

mat2's invocation is `-map 0` (no stream filter) + `-codec copy -map_metadata -1`. This *fails* on real-world action-cam files because the `tmcd` and `fdsc` data streams use codec `none` which the MP4 muxer rejects.

Our invocation uses `-map 0 -map -0:d? -map -0:s? -map -0:t?`: keep every stream, then explicitly drop data / subtitle / timecode streams, preserving input track order among the survivors. Empirically verified:

- **Phone MP4** (`samplelib phone-baseline.mp4`, 2.7 MB): same coverage as mat2 / `-codec copy`.
- **Action cam** (`gopro-fusion.mp4`, 5.1 MB, GPMF + tmcd + fdsc): 7/7 device fingerprints removed (`GoPro AVC`, `gpmd`, `GoPro AAC`, `GoPro TCD`, `GoPro MET`, `GoPro SOS`, `Fusion`). mat2 **declines** this file entirely (exit 234).
- **Drone** (`dji-phantom4.mov`, 236 MB): 5/5 device fingerprints removed (`FC6310` drone model, AVC encoder, `DJI.AVC` and `DJI.Meta` handler descriptions, GPS coordinates). The full GPS flight log under `[UserData] GPSCoordinates` (lat/lon/altitude) is dropped categorically by `-map_metadata -1`.

Both action-cam and drone categories are now verified by direct measurement in `tools/forensic/ffmpeg-fallback.ts` (see `docs/forensic/ffmpeg-fallback.md` for the per-fixture sentinel survival table). **Dashcam coverage is predicted by analogy** (most consumer dashcams put GPS/telemetry in `udta` or in data/subtitle streams, both handled by our policy) but is not yet exercised by a fixture; tracked as a follow-up.
### ffmpeg vs. theoretical

The theoretical bar for "what could be removed" is what the walker plus all the open KNOWN_GAPS would achieve if fully closed. ffmpeg already achieves this on standard MP4: every channel the walker leaks (or partly leaks) is dropped categorically.

Where ffmpeg falls short of theoretical:

- `vendor_id` in sample entries: the ffmpeg muxer writes `[0][0][0][0]` (null vendor) instead of whatever the input had. This is technically a fingerprint of "an MP4 muxed by ffmpeg," but indistinguishable from any other recently-muxed MP4. Acceptable.
- The `Lavf<version>` encoder string ffmpeg auto-stamps: **suppressed via `-metadata "encoder="`** in our invocation. Verified empirically in the POC: the stripped output of `phone-baseline.mp4` shows no `Lavf` string in `strings | grep -i lavf`.
- Stream-level `creation_time` and `modification_time` (per-track in `mdia/mdhd`): zeroed by `-fflags +bitexact`. Verified.
- `udta/meta/hdlr` block: `mov_write_meta_tag` writes a 0x21-byte stub at the movie level (and per-track for some inputs) regardless of `-map_metadata -1`. The handler_type is `mdir` (iTunes-style metadata directory) which ExifTool surfaces as `HandlerType: Metadata`; the handler vendor is hardcoded `appl` which surfaces as `HandlerVendorID: Apple` and misrepresents the file's origin. **Patched in post-strip**: `cleanFfmpegMp4Output` rewrites the `udta` box type to `free` (ISO/IEC 14496-12 §8.1.2 padding). Readers ignore `free` boxes entirely: ExifTool stops surfacing the HandlerType row, the HandlerVendorID row, and any other field that would have come from inside. Length-preserving so `stco`/`co64` offsets to `mdat` stay valid. The original "appl" bytes survive as padding inside the renamed box; ExifTool / ffprobe / VLC all see padding. (We previously zeroed just the 4-byte vendor inside `meta/hdlr`; that left `HandlerType: Metadata` still surfacing, so the rename supersedes that approach.)
- `mdhd.language` (per-track 15-bit packed ISO 639-2/T code): ffmpeg writes `und` (0x55C4, "undetermined") when the input had no language and copies the input's code when it did. **Accept as-is.** `und` is the spec's canonical "no language specified" marker; every reader handles it predictably. We considered zeroing it to suppress `MediaLanguageCode` from the diff but reverted: 0x0000 is an invalid ISO 639-2/T code, and ffprobe falls back to displaying `(eng)` (actively misleading for downstream tools that switch on language). When the input had a real language (`eng`, `fra`, etc.) the diff row `eng → und` is honest and informative: we removed the user's language tag, which is exactly what the diff is for. The only cosmetic cost is one extra `MediaLanguageCode: und` row in the diff when the input had no decodable language, which we accept over spec-invalid container bytes.
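Two of the items above, the length-preserving `udta` to `free` rename and the packed `mdhd.language` code, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's `cleanFfmpegMp4Output`: the function names are hypothetical, only `moov/udta` and `moov/trak/udta` are visited, and malformed input simply stops the walk.

```typescript
// Hypothetical sketch; the project's real post-strip pass is not shown in this document.

const fourcc = (b: Uint8Array, at: number): string =>
  String.fromCharCode(b[at], b[at + 1], b[at + 2], b[at + 3]);

// Visit each box in [start, end). The type field is always at headerStart + 4,
// for both 32-bit size headers and 64-bit largesize headers.
function forEachBox(
  b: Uint8Array,
  start: number,
  end: number,
  visit: (type: string, typeAt: number, bodyStart: number, boxEnd: number) => void,
): void {
  const view = new DataView(b.buffer, b.byteOffset, b.byteLength);
  let pos = start;
  while (pos + 8 <= end) {
    let size = view.getUint32(pos); // big-endian by default
    let header = 8;
    if (size === 1) {
      // largesize: the real size is the 64-bit value after the type
      if (pos + 16 > end) break;
      size = view.getUint32(pos + 8) * 4294967296 + view.getUint32(pos + 12);
      header = 16;
    } else if (size === 0) {
      size = end - pos; // box extends to the end of the enclosing range
    }
    if (size < header || pos + size > end) break; // malformed: stop walking
    visit(fourcc(b, pos + 4), pos + 4, pos + header, pos + size);
    pos += size;
  }
}

// Rewrite every `udta` under `moov` (movie level and per-track) to `free`.
// Only the 4 type bytes change, so every offset in the file stays valid.
function renameUdtaToFree(b: Uint8Array): number {
  const FREE = [0x66, 0x72, 0x65, 0x65]; // ASCII "free"
  let renamed = 0;
  const scan = (start: number, end: number, insideMoov: boolean): void =>
    forEachBox(b, start, end, (type, typeAt, bodyStart, boxEnd) => {
      if (insideMoov && type === "udta") {
        b.set(FREE, typeAt);
        renamed++;
      } else if (type === "moov" || (insideMoov && type === "trak")) {
        scan(bodyStart, boxEnd, true);
      }
    });
  scan(0, b.length, false);
  return renamed;
}

// mdhd.language: three lowercase letters, 5 bits each, each offset by 0x60.
function packLanguage(code: string): number {
  return code
    .split("")
    .reduce((acc, ch) => (acc << 5) | (ch.charCodeAt(0) - 0x60), 0);
}
```

Because the rename touches only the type field, the bytes inside the former `udta` (including the `appl` vendor) stay in place as padding, which matches the behaviour described above. `packLanguage("und")` reproduces the 0x55C4 value quoted in the list.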
### Documented gaps to flag in `PRIVACY_GAPS.md`

| Channel | Status | Note |
|---|---|---|
| Legitimate subtitle / chapter tracks user *wanted* preserved | Dropped | Edge case. No way to distinguish "subtitle that's content" from "subtitle that's GPS-from-DJI-SRT" at the file level. UI may add an opt-in flag post-strip. |
| Sidecar files (`.SRT`, `.LRV`, `.THM`, `.LRF`) | Unchanged | `#46`: not addressable by any in-file strategy. Out of scope for this PR; documented in `PRIVACY_GAPS.md`. |
| Container fingerprint via `ftyp` brand normalisation | Normalised to ffmpeg default | We accept that "uniform = less fingerprint-y" is the right trade; the only fingerprint left is "muxed by ffmpeg," same as billions of other files. |
| Fragmented MP4 byte-preservation | Defragmented | User signed off. Defragged output plays identically on every consumer platform. |

---
## Recommendation

**Adopt the POC invocation as the Phase 1 starting policy.** The full strip command:

```
ffmpeg -i in.<ext> \
  -map 0 -map -0:d? -map -0:s? -map -0:t? \
  -map_metadata -1 \
  -map_chapters -1 \
  -fflags +bitexact \
  -c copy \
  -movflags +faststart \
  -metadata "encoder=" \
  out.<ext>
```

For Phase 1 (this PR), the claim set is `.mp4`, `.mov`, `.m4v`. The strategy is registered ahead of `VideoStrategy` when `WITH_FFMPEG=1` (default).
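Since `@ffmpeg/core` takes an argv-style argument list rather than a shell string, the command above has to be expressed as an array. The following is a minimal sketch of that translation; `buildStripArgs` and the `Container` type are hypothetical names, not the project's API, and the only branching it encodes is the one this writeup states (the Matroska-family invocation omits `-movflags +faststart`).

```typescript
// Hypothetical helper; names are illustrative, not the project's API.
type Container = "mp4" | "mov" | "m4v" | "mkv" | "webm";

function buildStripArgs(input: string, output: string, container: Container): string[] {
  const args: string[] = [
    "-i", input,
    // Keep everything, then drop data / subtitle / attachment streams.
    "-map", "0", "-map", "-0:d?", "-map", "-0:s?", "-map", "-0:t?",
    "-map_metadata", "-1",
    "-map_chapters", "-1",
    "-fflags", "+bitexact",
    "-c", "copy",
  ];
  // `-movflags` is specific to the MOV/MP4 muxer family.
  if (container === "mp4" || container === "mov" || container === "m4v") {
    args.push("-movflags", "+faststart");
  }
  // The shell form `-metadata "encoder="` is a single argv element with an empty value.
  args.push("-metadata", "encoder=", output);
  return args;
}

const mp4Args = buildStripArgs("in.mp4", "out.mp4", "mp4");
const mkvArgs = buildStripArgs("in.mkv", "out.mkv", "mkv");
console.log(mp4Args.join(" "));
```

Keeping the argument list in one pure function makes it easy to assert in a unit test that the output path stays last and that `-movflags` never reaches a Matroska muxer.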
### Phase 1 deliverable (this PR)

- `FfmpegFallbackStrategy` claims `.mp4`/`.mov`/`.m4v`
- Strip invocation as above
- Forensic verification under `tools/forensic/ffmpeg-fallback.ts`: sentinel survival = 0 on the synthetic battery (`seeded.mp4`, `sample-fragmented.mp4`) plus real-world (`phone-baseline.mp4`, `gopro-fusion.mp4`)
- KNOWN_GAPS empty for these formats after this lands: every channel from `docs/forensic/video.md` is closed
- `VideoStrategy` retained as the fallback when `WITH_FFMPEG=0` (opt-out builds)
### Deferred to follow-up PRs

- `.mkv` / `.webm` (Phase 2 of #182, same PR if Phase 1 forensic is clean)
- `.avi` / `.wmv` / `.3gp` (separate PRs; each with its own gap analysis + forensic pass)
- `VideoStrategy` deletion (subsequent PR after a validation window of stable ffmpeg in production)
- UI opt-in flag for "preserve subtitle/chapter tracks" (depends on real user feedback)
31
docs/gap-analysis/webm.md
Normal file
@ -0,0 +1,31 @@
# WebM: ffmpeg-wasm strategy gap analysis

**Date:** 2026-05-21
**Goal:** Document the per-source policy for `FfmpegFallbackStrategy` on WebM inputs. Phase 2 of issue #182.

WebM is a strict subset of Matroska: same EBML container, restricted to VP8/VP9/AV1 video + Vorbis/Opus audio. The strategy treats them the same: `matchesEbml()` accepts both, `detectContainer()` distinguishes them via the EBML `DocType` element (looks for the ASCII string `"webm"` in the first 64 bytes; falls back to `.mkv` if not found, since MKV is the superset).

---
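The `DocType` sniff described in the introduction above can be sketched as follows. This is an illustrative approximation only: the real `matchesEbml()` and `detectContainer()` are not shown in this document, so the `...Sketch` names and the exact scanning loop are assumptions. The EBML header ID `0x1A45DFA3` is the standard Matroska/WebM magic.

```typescript
// Hypothetical sketch; the project's real functions are not shown in this document.
const EBML_MAGIC = [0x1a, 0x45, 0xdf, 0xa3]; // EBML header element ID

function matchesEbmlSketch(bytes: Uint8Array): boolean {
  return bytes.length >= 4 && EBML_MAGIC.every((v, i) => bytes[i] === v);
}

// Look for the ASCII DocType "webm" in the first 64 bytes; anything else is
// treated as MKV, since Matroska is the superset.
function detectContainerSketch(bytes: Uint8Array): "webm" | "mkv" {
  const needle = [0x77, 0x65, 0x62, 0x6d]; // ASCII "webm"
  const limit = Math.min(bytes.length, 64);
  for (let i = 0; i + needle.length <= limit; i++) {
    if (needle.every((v, j) => bytes[i + j] === v)) return "webm";
  }
  return "mkv";
}
```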
## Per-source policy

Inherits everything from [`docs/gap-analysis/mkv.md`](mkv.md). The differences relative to MKV:

| Aspect | MKV | WebM | Notes |
|---|---|---|---|
| Allowed codecs | Any | VP8/VP9/AV1 video, Vorbis/Opus audio | Our `-c copy` path doesn't transcode, so any codec the input contains passes through. If the user dropped a "WebM" file containing a non-WebM codec (rare; happens with stale exports), ffmpeg's muxer will refuse, which is the same failure mode as the MP4 muxer's codec-tag refusal. |
| Attachments / chapters | Rare | Rarer still | Same drop-all policy as MKV. |
| MediaRecorder-generated WebM (browser screen rec) | n/a | Common | This is the **primary user value of WebM coverage**: web-recorded video from MediaRecorder API. Our strip handles this cleanly. |
| `\Segment\Info\WritingApp` | Often "Chrome" / "Firefox" / "Lavf" | Same | Privacy-invariants §6 fingerprint, suppressed via `-metadata encoder=`. |

---
## Empirical verification

Synthetic WebM with 2 seeded sentinels (`title`, `description`) → 0 survivors. See `tools/forensic/ffmpeg-fallback.ts` Phase 2 row.

---
## Recommendation

`.webm` joins the strategy's claim set in Phase 2 of #182. Same invocation as MKV (no `-movflags +faststart`). Forensic verification confirms zero sentinel survival on the synthetic WebM battery. Real-world WebM fixtures (browser MediaRecorder output) are a useful follow-up.
311
docs/poc/ffmpeg-wasm.md
Normal file
@ -0,0 +1,311 @@
# `@ffmpeg/ffmpeg` (ffmpeg-wasm) POC

**Date:** 2026-05-21
**Goal:** Evaluate whether ffmpeg compiled to WebAssembly is a viable in-browser strategy for ExifCleaner: closing the long-tail video container gap (MKV/WebM/AVI/WMV/3GP) and the GPMF / device-fingerprint gap in `VideoStrategy` (#38, #39), without giving up the "no server, runs offline" privacy guarantee.

Targets that frame the evaluation: **desktop offline standalone HTML** + **Android APK** (Capacitor wrapper). PWA self-host is a secondary target. Issue: #182.
## What was evaluated

| Package | Version | License | Role |
|---|---|---|---|
| `@ffmpeg/ffmpeg` | 0.12.15 | MIT | Browser-facing wrapper (Worker boilerplate, lifecycle, file marshalling) |
| `@ffmpeg/util` | 0.12.2 | MIT | URL/fetch utilities used by the wrapper |
| `@ffmpeg/core` | 0.12.10 | **GPL-2.0-or-later** | The actual ffmpeg compiled to WASM, single-threaded build |
| `@ffmpeg/types` | 0.12.4 | MIT | Shared TypeScript types |

Installed in `/tmp/ffmpeg-poc/` with `npm install --save-exact`. Per-package dependency graph is tiny: the wrapper depends on `@ffmpeg/util` only; `@ffmpeg/core` has no JS-side deps (the WASM is self-contained).

The threaded build (`@ffmpeg/core-mt`, requires `SharedArrayBuffer` + COOP/COEP) is explicitly **out of scope** for this POC; issue #182 picks single-threaded for v1.
## How the packages fit together

They are not alternatives; they're a stack:

```
┌──────────────────────────────────────────────────────────────┐
│ FfmpegFallbackStrategy (we write, issue #182)                │
│ ├─ implements src/infrastructure/wasm/format_strategy.ts     │
│ └─ same contract as JpegStrategy, ExifToolFallbackStrategy  │
├──────────────────────────────────────────────────────────────┤
│ @ffmpeg/ffmpeg                   (MIT, ~10 KB JS, 2.3 KB gz)│
│ ├─ FFmpeg class: load / exec / writeFile / readFile         │
│ ├─ Manages Web Worker lifecycle, file marshalling to MEMFS  │
│ └─ Browser-only; Node import resolves to an empty module    │
├──────────────────────────────────────────────────────────────┤
│ @ffmpeg/core                     (GPL, 30.7 MB / 9.79 MB gz)│
│ ├─ ffmpeg-core.wasm: Emscripten-compiled ffmpeg + libs       │
│ ├─ ffmpeg-core.js:   Emscripten JS bootstrap                 │
│ └─ Exposes argv-style exec, MEMFS, setLogger, setTimeout    │
└──────────────────────────────────────────────────────────────┘
```

Our integration shape: a thin `FfmpegFallbackStrategy` on top of `@ffmpeg/ffmpeg`. It hides the wrapper's quirks (Worker bootstrap, MEMFS lifecycle, stderr/abort semantics) and presents the same `strip(bytes, options) → Result<StripResult, ExifError>` contract as every other strategy. Same architectural pattern as `ExifToolFallbackStrategy`.
## Bundle weight

```
ffmpeg-core.wasm             raw: 30 706 KB   gzipped: 9 791 KB
ffmpeg-core.js               raw:    109 KB   gzipped:    29 KB
@ffmpeg/ffmpeg classes.js    raw:    9.5 KB   gzipped:   2.3 KB
@ffmpeg/ffmpeg worker.js     raw:    5.0 KB   gzipped:  ~1.5 KB (est)
─────────────────────────────────────────────────────────────────
TOTAL transfer weight        ≈ 9.82 MB gzipped
TOTAL decompressed in memory ≈ 30.8 MB
```

For comparison:

- `zeroperl.wasm` (WebPerl-ExifTool): ~7.2 MB gzipped, ~24 MB raw
- Existing `VideoStrategy` hand-rolled walker: ~5 KB compiled JS
- ExifCleaner's whole web build today: ~440 KB JS → ~180 KB gzipped

Distribution impact per target:

| Target | Today | + ffmpeg-wasm | Delta |
|---|---:|---:|---:|
| Standalone HTML (inlined as base64) | ~900 KB | ~40 MB | +44× |
| Android APK (compressed asset) | ~5 MB | ~15 MB | +3× |
| PWA self-host (on-demand fetch + SW cache) | ~440 KB initial | +9.8 MB on first video drop | one-time |

Standalone is the heaviest hit by far; APK absorbs it comfortably; PWA pays once and caches forever.
## Privacy / sandbox audit

### Static: JS-side network capability

Two browser-`fetch` call sites in `ffmpeg-core.js`, both in the Emscripten bootstrap path: `getBinary()` and `instantiateAsync()`. Both load **the `.wasm` file itself**, from a URL we provide via the wrapper's `coreURL` / `wasmURL` config (same pattern as ExifTool's `redirectWasmFetch`). No other runtime fetches exist after the WASM is loaded.

The wrapper (`@ffmpeg/ffmpeg/dist/esm/classes.js`) has **zero** references to browser-`fetch`, `XMLHttpRequest`, `WebSocket`, `RTCPeerConnection`, `EventSource`, or `navigator.sendBeacon`.
### WASM-level capability: important caveat

Unlike `zeroperl.wasm` (whose import section uses readable WASI names: `wasi_snapshot_preview1::fd_read`, etc.), `ffmpeg-core.wasm` is built with Emscripten using JS imports whose names are minified to single letters (`a::$`, `a::A`, …`a::z`). **The WASM import audit cannot be done by name alone**: the JS host (`ffmpeg-core.js`) supplies each of the 71 imports under whichever single-letter symbol Emscripten assigned.

The audit therefore moves up a layer: what does the JS-side runtime expose to the WASM?

- ffmpeg-core.js does contain Emscripten's **SOCKFS** code (because ffmpeg's source uses `socket()` for network protocols like `rtsp://`, `tcp://`, `udp://`, `http://`). The SOCKFS implementation requires the consumer to provide a `Module["websocket"]` object, so it is **dormant** by default. If `socket()` is called from inside WASM without `Module["websocket"]` set, it fails.
- Our integration **does not** set `Module["websocket"]`, so the WASM cannot make socket calls even if asked.
- ffmpeg's inputs are restricted to MEMFS paths (we write the user's bytes to MEMFS before invoking ffmpeg). It is never handed an `rtsp://` or `http://` URL.
- The deploy-layer CSP (`connect-src 'self'`) blocks any WebSocket handshake at the browser level as a backstop.

Defense in depth, not "structural by construction." This is honestly weaker than the zeroperl story (where sockets are absent from the import section entirely), and the writeup must say so.
### Dynamic: Node permission model

Verified end-to-end under Node 24's default-deny capability model. The script in `/tmp/ffmpeg-poc/run_strip.mjs` loads `@ffmpeg/core` directly (bypassing the browser-only `@ffmpeg/ffmpeg` wrapper) and runs:

```
node --permission --allow-fs-read=/tmp/ffmpeg-poc --allow-fs-write=/tmp/ffmpeg-poc \
  run_strip.mjs seeded.mp4 out-restricted.mp4
```

Without `--allow-net`, the strip completes successfully and produces **byte-identical output** to the unrestricted run (`cmp out-restricted.mp4 out-unrestricted.mp4` → ✓ byte-identical). This shows that the strip path does not depend on network access.

```
ok core loaded
ok rc=0
wrote 2238 bytes to out-restricted.mp4
sentinels in restricted output: 0
bytes match unrestricted run? ✓ byte-identical
```

(Linux user namespaces were unavailable in the audit sandbox for `unshare -rn`, so the Node permission model is the dynamic check.)

**Conclusion: no outbound network traffic during strip operations.** Combined with not setting `Module["websocket"]` + the deploy-level CSP, the privacy story holds, but it is documented as defense-in-depth rather than "by construction."
## Supply-chain integrity

### Pinning + npm integrity

All four packages installed with `--save-exact`. SHA-512 integrity hashes recorded in `package-lock.json`:

```
@ffmpeg/ffmpeg@0.12.15
  sha512-1C8Obr4GsN3xw+/1Ww6PFM84wSQAGsdoTuTWPOj2OizsRDLT4CXTaVjPhkw6ARyDus1B9X/L2LiXHqYYsGnRFw==
@ffmpeg/core@0.12.10
  sha512-dzNplnn2Nxle2c2i2rrDhqcB19q9cglCkWnoMTDN9Q9l3PvdjZWd1HfSPjCNWc/p8Q3CT+Es9fWOR0UhAeYQZA==
@ffmpeg/util@0.12.2
  sha512-ouyoW+4JB7WxjeZ2y6KpRvB+dLp7Cp4ro8z0HIVpZVCM7AwFlHa0c4R8Y/a4M3wMqATpYKhC7lSFHQ0T11MEDw==
@ffmpeg/types@0.12.4
  sha512-k9vJQNBGTxE5AhYDtOYR5rO5fKsspbg51gbcwtbkw2lCdoIILzklulcjJfIDwrtn7XhDeF2M+THwJ2FGrLeV6A==
```

### File-level SHA-256 (for our own embed)

```
SHA256 (ffmpeg-core.wasm)          = 9f57947a5bd530d8f00c5b3f2cb2a3492faa7e5d823315342d6a8656d0a6b7b7
SHA256 (ffmpeg-core.js)            = 67a48f11645f85439f3fde4f2119042c16b374b910206b7a7a24f342e28dcae3
SHA256 (@ffmpeg/ffmpeg classes.js) = 7a829c898bdbc3a8806652a5502d9101178ce4e988a2c50b3abc1306ce4fc919
```

If adopted, the embed pattern would be: pin SHA-256s in our build, compute them as part of `yarn build:web`, fail the build on mismatch. Same protocol as `webperl-exiftool.md`.
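The pin-and-verify step could look roughly like the sketch below. It is an assumption about shape, not the project's build script: `sha256Hex` and `assertPinned` are hypothetical names, and how the bytes are read and where the pinned values live is left to the build. Only Node's built-in `node:crypto` is used.

```typescript
import { createHash } from "node:crypto";

// Hypothetical build-step helpers; not the project's actual build script.
function sha256Hex(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// Throws (and therefore fails the build) when an embedded asset does not
// match its pinned digest.
function assertPinned(name: string, bytes: Uint8Array, expectedHex: string): void {
  const actual = sha256Hex(bytes);
  if (actual !== expectedHex.toLowerCase()) {
    throw new Error(`${name}: SHA-256 mismatch (expected ${expectedHex}, got ${actual})`);
  }
}
```

In a build, `assertPinned("ffmpeg-core.wasm", wasmBytes, PINNED_WASM_SHA256)` would run before the asset is inlined, where `wasmBytes` and `PINNED_WASM_SHA256` are placeholders for the file contents and the digest recorded above.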
### What this does NOT defend against

- A compromised upstream maintainer publishing a malicious update. Mitigation: pin a known-good version and **review the diff every time we bump**.
- A backdoor at the pinned version we're auditing now. The defense is the network-capability analysis above (JS-side fetch / SOCKFS) plus the dynamic no-network test. Stronger than the obfuscated-import-section audit on its own.
## Functional results

Fixtures generated with `ffmpeg` (1 s synthetic blue frame, 128×128, H.264). Seeded with sentinel metadata via system `exiftool 12.76`. Stripped via `@ffmpeg/core` 0.12.10 in Node. Sentinels counted with `strings | grep -c FORENSIC`.
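For an in-process equivalent of the sentinel count, a byte-level scan is enough. The sketch below is an approximation rather than a port of the shell pipeline: `countSentinel` is a hypothetical name, and it counts non-overlapping occurrences of the sentinel, whereas `grep -c` counts matching lines of `strings` output. Both report zero exactly when no sentinel survives.

```typescript
// Counts non-overlapping occurrences of an ASCII sentinel in raw bytes.
// Hypothetical helper; the forensic tool's real implementation is not shown here.
function countSentinel(bytes: Uint8Array, sentinel: string): number {
  const needle = sentinel.split("").map((c) => c.charCodeAt(0));
  if (needle.length === 0) return 0;
  let count = 0;
  let i = 0;
  while (i + needle.length <= bytes.length) {
    if (needle.every((v, j) => bytes[i + j] === v)) {
      count++;
      i += needle.length; // skip past the match
    } else {
      i++;
    }
  }
  return count;
}
```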
### Synthetic battery (sentinel survival)

| Fixture | Bytes (in → out) | Sentinels (in → out) | rc | Notes |
|---|---:|---|---:|---|
| `seeded.mp4` (1 s, 5.7 KB, full XMP + ItemList sentinels) | 5 711 → 2 238 | 8 → **0** | 0 | All XMP, ItemList, encoder, dates stripped |
| `sample-fragmented.mp4` (fragmented MP4 with `moof` boxes) | 1 850 → 1 680 | n/a | 0 | **Works**: defragments to flat `[ftyp, moov, mdat]`, which the user has accepted as the trade-off (see #182 discussion) |

Strip command for both: `-i in.mp4 -map 0 -map_metadata -1 -map_chapters -1 -fflags +bitexact -c copy -movflags +faststart out.mp4`.
### Real-world battery

| Fixture | Bytes | Default `-map 0` | With `-map 0:v? -map 0:a?` | Notes |
|---|---:|---|---|---|
| `phone-baseline.mp4` (samplelib, 2.7 MB, modern Android) | 2 848 111 → 2 848 111 (output) | ✅ rc=0 | ✅ rc=0 | All standard metadata stripped; no codec failures |
| `gopro-fusion.mp4` (gpmf-parser repo, 5.1 MB, has GPMF + timecode + description streams) | 5 124 222 | ❌ rc=1, 0-byte output | ✅ rc=0, 5 000 118 B | See below |

**GoPro Fusion failure mode (default `-map 0`)**:

```
[mp4] Could not find tag for codec none in stream #2, codec not currently supported in container
Could not write header for output file #0 (incorrect codec parameters ?): Invalid argument
Error initializing output stream 0:4 --
Aborted()
```

Streams #0:2 (`tmcd`, timecode) and #0:4 (`fdsc`, description) carry codec `none`, which ffmpeg's MP4 muxer refuses to remux. The `gpmd` stream (#0:3) is **not** the blocker; `tmcd` and `fdsc` are. mat2 hits the same wall (exit 234) because it uses the same `-codec copy` invocation.

**With `-map 0:v? -map 0:a?` (drop all non-video/non-audio streams)**:

The output drops every data stream (GPMF/GPS, timecode, description), which is exactly what a metadata stripper wants to do. Result: 5 MB clean output, **all** GoPro device names / GPMF magic / fusion strings / encoder identifiers removed. ExifTool view of the stripped file shows generic Apple-style headers, no GoPro lineage.

This is a categorical improvement over both:

- Current `VideoStrategy` walker: leaves `COMPRESSORNAME`, `gpmd-magic`, GoPro device-name strings (#38, #39: handler/compressor names not stripped).
- `mat2` (= ffmpeg `-codec copy`): refuses the file entirely (exit 234) with `-map 0`.

**Trade-off**: we lose subtitle / chapter / data tracks that *might* have been wanted. For a privacy-strip tool, this is correct: anything that isn't video or audio is metadata or telemetry, and the user dropped the file *to remove* metadata and telemetry. The gap-analysis writeup (`docs/gap-analysis/mp4-ffmpeg.md`) will document this policy and flag the legitimate-subtitle edge case for follow-up.
### Closes / improves on current walker gaps

| Gap | Current `VideoStrategy` | ffmpeg-wasm with `-map 0:v? -map 0:a?` |
|---|---|---|
| `HDLR_NAME_VIDEO` / `HDLR_NAME_*` (#38) | LEAKED | Removed (generic `VideoHandler`/`SoundHandler` rewritten) |
| `COMPRESSORNAME` (#39) | LEAKED | Removed |
| `gpmd-magic` / GoPro device strings | LEAKED | Removed (entire `gpmd` track dropped) |
| `mvhd.next_track_id` (#111) | LEAKED | Rewritten by ffmpeg's muxer (verify in forensic phase) |
| GPS coordinates in GPMF | LEAKED | Removed (track dropped) |
| `TRAF_META_FRAGMENT` (#36) | LEAKED on fragmented | N/A (defragmented) |
| `MDAT_ORPHAN` (#42) | LEAKED on synthetic | Removed (ffmpeg re-writes mdat from sample tables) |

Categorical: GPMF / timed-metadata / `tmcd` / `fdsc` channels are all gone because the streams themselves are dropped.
## Performance

Steady-state benchmark, 10 strips per fixture in one Node process (warm cache):

```
=== seeded.mp4 (1 s, 5.7 KB) ===
N=10 mean=5ms p50=2ms p95=34ms min=2ms max=34ms
(first call 34 ms cold; 2–3 ms steady-state)

=== big-seeded.mp4 (30 s, 29 KB) ===
N=10 mean=6ms p50=3ms p95=28ms

=== phone-baseline.mp4 (5 s, 2.7 MB) ===
N=5 mean=61ms p50=52ms p95=104ms

=== gopro-fusion.mp4 (5.1 MB, with -map 0:v? -map 0:a?) ===
N=5 mean=~95ms (extrapolated; rc=0 only when v/a-only map applied)

=== sample-fragmented.mp4 (1.8 KB, fragmented) ===
N=10 mean=6ms p50=2ms
```

Process-level (Node + ESM resolution + WASM init + 1 strip): **~170 ms wall**, **~145 MB peak RSS**.
### Comparison vs other strategies

| Engine | Per-file latency (2.7 MB phone MP4) | Cold start | Bundle |
|---|---:|---:|---:|
| `VideoStrategy` walker (current) | < 5 ms | none | ~5 KB |
| `@ffmpeg/core` 0.12.10 | ~50–100 ms | ~30 ms once | ~9.8 MB gz |
| `@uswriting/exiftool` (WebPerl) | ~900 ms | ~1 s once | ~7.2 MB gz |

ffmpeg-wasm sits between the walker and WebPerl-ExifTool on latency: **~10–20× slower than the walker, ~10–15× faster than ExifTool**. Practical implication: at 50–100 ms per file, hundreds of phone-MP4 strips in a batch still feel responsive (~10 s for 100 files), while folder-level batches with thousands of files would be noticeable but not awful (~1 min for 1000).

The project-direction principle from `CLAUDE.md` ("Performance is sacred — hundreds in seconds") holds for the typical batch size. Walker stays faster, which is one reason #182 keeps the walker around during the validation window.
## Maintainer / bus factor / license chain

- **Maintainers** of `@ffmpeg/ffmpeg` and `@ffmpeg/core`: Jerome Wu (`jeromewu`) and Lucas Gelfond (`lucasgelfond`). The `ffmpegwasm` GitHub organization is the upstream.
- **Activity**: latest `@ffmpeg/core` 0.12.10 published 2025-04-07. Repo (`ffmpegwasm/ffmpeg.wasm`) is active: issues responded to, releases roughly quarterly. Healthier than `zeroperl-ts`.
- **License chain**:
  - `@ffmpeg/ffmpeg`, `@ffmpeg/util`, `@ffmpeg/types`: **MIT**
  - `@ffmpeg/core` (the WASM build): **GPL-2.0-or-later** (inherits from ffmpeg's enabled GPL components: x264, x265, etc.)
  - User direction in #182: "use the better one, don't overthink about licensing"
- **Implication**: distributing the standalone HTML / APK that includes `ffmpeg-core.wasm` makes the distributed binary subject to GPL-2.0. Our MIT codebase is unchanged (no GPL source becomes part of *our* source tree), but the **combined distributable** must comply with GPL-2.0: source-availability + license notice + copyleft for any further redistribution.
- In practice for this project: the codebase is already public + MIT, all upstream ffmpeg-wasm source is public, so the source-availability requirement is met by linking to `ffmpegwasm/ffmpeg.wasm`. The About / Settings → Licenses screen needs a GPL-2.0 notice + source pointer. Documented in #182's pre-merge checklist.
- **Per-distribution implication**: enabling ffmpeg in any opt-in user-built variant subjects *their* build to GPL-2.0 redistribution rules, but they're free to use it locally. The standalone HTML download (when shipped with ffmpeg by default per #182) is a binary distribution that ships under GPL-2.0 for the combined work.
## Conclusion

**Adopt as the primary path for video container metadata stripping** (MP4/MOV/MKV/WebM in #182 Phase 1+2), with `VideoStrategy` retained during a validation window. The math:

| Concern | Verdict |
|---|---|
| Privacy (no network) | ✅ Static (JS-side fetch only loads the wasm; SOCKFS dormant) + Dynamic (Node permission model: byte-identical output without `--allow-net`). Defense-in-depth, not "by construction"; weaker than zeroperl. |
| Forensic completeness on supported formats | ✅ Strips every sentinel on synthetic MP4. Closes #38, #39 (handler / compressor names). Closes #42 (mdat orphan zeroing) categorically by re-writing from sample tables. |
| Bundle size for standalone HTML | ⚠️ +40 MB inlined. Big jump from current ~900 KB single file. User has signed off as worth it for the format coverage and forensic completeness. |
| Bundle size for Android APK | ✅ +10 MB compressed asset. Rounding error inside the Capacitor WebView shell. |
| Performance for batches | ✅ 50–100 ms per file for 2.7 MB phone MP4. Hundreds in seconds is achievable; thousands gets noticeable. |
| Fragmented MP4 (screen rec, modern Android) | ✅ Works: defragments to flat MP4. User has explicitly accepted defragmentation as a trade-off (per #182 discussion). |
| GoPro Fusion / GPMF / device fingerprints | ✅ With `-map 0:v? -map 0:a?`, all data streams dropped. **Better than current walker and better than mat2.** |
| MKV / WebM / AVI | 🟡 Untested in this POC. Phase 2 of #182 verifies forensically; ffmpeg's MKV/WebM mux is well-trodden in mat2's coverage. |
| GPL-2.0 license inheritance | 🟡 Combined distributable subject to GPL-2.0. Requires License notice + source pointer in About screen. User accepted in #182. |
| Maintainer health | ✅ Two maintainers, active org, quarterly releases |
| Tampering posture | ✅ SHA-512 from npm + SHA-256 pinned in our build + JS-side network audit + dynamic test |

## Recommended path forward (aligned with #182)

The strategy ships behind `WITH_FFMPEG` (default ON for all three distributions; `WITH_FFMPEG=0` opts out) and routes MP4/MOV ahead of the existing `VideoStrategy`. `ExifToolFallbackStrategy` stays for raster formats; `PdfStrategy` and `OfficeStrategy` stay (ffmpeg can't write those).

### Strip invocation policy

Default invocation for all claimed formats:

```
# -map 0:v? -map 0:a?    drop data/subtitle/timecode streams (metadata vehicles)
# -map_metadata -1       drop container-level metadata
# -map_chapters -1       drop chapters (potential leak surface)
# -fflags +bitexact      don't write randomness/timestamps
# -c copy                no re-encode
# -movflags +faststart   moov at front (better seek for end-users)
# -metadata "encoder="   suppress Lavf<version> fingerprint
ffmpeg -i in.<ext> \
  -map 0:v? -map 0:a? \
  -map_metadata -1 \
  -map_chapters -1 \
  -fflags +bitexact \
  -c copy \
  -movflags +faststart \
  -metadata "encoder=" \
  out.<ext>
```

The `-map 0:v? -map 0:a?` choice (drop data streams) is the key departure from mat2's default. It's what makes GoPro Fusion / DJI / dashcam files work where mat2 fails. The legitimate-subtitle case (where a user wants to preserve a real subtitle track) is flagged for the gap-analysis phase.

Container brand / muxer-string normalisation (per privacy-invariants §6): the gap-analysis writeup will enumerate every fingerprint ffmpeg auto-stamps (`Lavf<version>` encoder string, `mp42` ftyp brand, etc.) and document per-source rewrite policy.
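
To make the policy concrete, the default invocation in this section could be assembled as an argv array before being handed to an ffmpeg entry point. This is a sketch under assumptions: `buildStripArgs` and `ContainerKind` are illustrative names, and gating `-movflags +faststart` on the MP4 family is a choice made for this sketch, since it is an option of the MOV/MP4 muxer.

```typescript
// Illustrative only: the names below are hypothetical, not the project's API.
type ContainerKind = "mp4" | "matroska";

function buildStripArgs(
  input: string,
  output: string,
  kind: ContainerKind,
): string[] {
  const args = [
    "-i", input,
    "-map", "0:v?", "-map", "0:a?", // keep only video and audio streams
    "-map_metadata", "-1",          // drop container-level metadata
    "-map_chapters", "-1",          // drop chapters
    "-fflags", "+bitexact",         // don't write randomness/timestamps
    "-c", "copy",                   // stream copy, no re-encode
  ];
  if (kind === "mp4") {
    // Assumption in this sketch: only pass faststart to the MOV/MP4 muxer.
    args.push("-movflags", "+faststart");
  }
  // Blank the encoder tag; the output path goes last.
  args.push("-metadata", "encoder=", output);
  return args;
}

const mp4Args = buildStripArgs("in.mp4", "out.mp4", "mp4");
const mkvArgs = buildStripArgs("in.mkv", "out.mkv", "matroska");
console.log(mp4Args.join(" "));
```

Keeping the argument list in one pure function makes the policy easy to unit-test without loading the WASM engine.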

### Verification required before merge

Per `format-strategy-workflow.md`:

- `docs/gap-analysis/mp4-ffmpeg.md`: per-source/per-marker policy table; honest gap section
- `docs/gap-analysis/mkv.md` and `docs/gap-analysis/webm.md`: Phase 2
- `tools/forensic/ffmpeg-fallback.ts`: synthetic + real-world sentinel fixtures, zero survival across recovery battery
- Build-flag wiring in `vite.config.web.ts`, `vite.config.web.standalone.ts`, Capacitor build config
- CI `WITH_FFMPEG=0` variant in the smoke matrix
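
The "zero survival" criterion above amounts to searching the stripped bytes for known sentinel strings. A minimal sketch of that idea, assuming plain ASCII sentinels; `findAsciiMarker` and `survivingSentinels` are hypothetical helpers and say nothing about how `tools/forensic/ffmpeg-fallback.ts` is actually implemented:

```typescript
// Hypothetical helper: first byte offset of an ASCII marker, or -1 if absent.
function findAsciiMarker(bytes: Uint8Array, marker: string): number {
  if (marker.length === 0) return -1;
  const last = bytes.length - marker.length;
  for (let i = 0; i <= last; i++) {
    let j = 0;
    while (j < marker.length && bytes[i + j] === marker.charCodeAt(j)) j++;
    if (j === marker.length) return i;
  }
  return -1;
}

// Returns the sentinels still present in the output (empty array means pass).
function survivingSentinels(bytes: Uint8Array, sentinels: string[]): string[] {
  return sentinels.filter((s) => findAsciiMarker(bytes, s) !== -1);
}

// Toy input: an "Lavf" encoder stamp embedded in otherwise zeroed bytes.
const sample = new Uint8Array(16);
sample.set([0x4c, 0x61, 0x76, 0x66], 5); // "Lavf" at offset 5
const survivors = survivingSentinels(sample, ["Lavf", "GoPro"]);
console.log(survivors);
```

A real battery would also have to cover non-ASCII encodings and packed binary fields, which this sketch ignores.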

### Open questions for the design phase

- **Container brand normalisation policy**: enumerate ffmpeg's auto-stamped strings during gap analysis. Decide which get rewritten vs. allowed-through.
- **Legitimate subtitle/chapter preservation**: opt-in via a strip-options flag? Or always drop with a one-time UI warning? Defer to UX work after the strip path proves out.
- **Cold-start UX**: first MP4 drop pays the 30 ms WASM init cost (in addition to standalone's parse cost of the inlined base64). Below dialog-threshold even on weak devices, but worth measuring on real hardware in the verification phase.
- **MKV/WebM `-codec copy` behaviour**: ffmpeg's matroska muxer is mature, but worth verifying that the same `-map 0:v? -map 0:a?` strategy works without the codec-tag-mismatch issues we saw on GoPro Fusion's MP4 output.
- **Combined GPL-2.0 implications for the APK Play Store path**: not Play Store today (sideload), but a follow-up consideration if Play Store ever happens. Out of scope for this POC.

@@ -30,6 +30,7 @@
     "generate:exif-tags": "node scripts/generate_exif_tags.mjs"
   },
   "dependencies": {
+    "@ffmpeg/core": "0.12.10",
     "@uswriting/exiftool": "1.0.9",
     "jszip": "^3.10.1",
     "pdf-lib": "^1.17.1",

@@ -118,45 +118,28 @@ const DROP_GROUPS: ReadonlySet<string> = new Set([
 // Top-level keys (no group prefix) that ExifTool emits for housekeeping.
 const DROP_KEYS: ReadonlySet<string> = new Set(["SourceFile"]);

-// ExifTool family-1 group → our canonical source label. Granular IFD
-// distinctions collapse to "EXIF" because users don't conceptually
-// separate IFD0/SubIFD/Interop/IFD1. GPS is broken out for privacy
-// salience. XMP-* and ICC-* prefixes collapse to "XMP" / "ICC".
-const EXACT_GROUP_MAP: Readonly<Record<string, string>> = {
-  IFD0: "EXIF",
-  IFD1: "EXIF",
-  ExifIFD: "EXIF",
-  SubIFD: "EXIF",
-  InteropIFD: "EXIF",
-  GPS: "GPS",
-  ICC_Profile: "ICC",
-  JFIF: "JFIF",
-  Photoshop: "Photoshop",
-  IPTC: "IPTC",
-  MakerNotes: "MakerNotes",
-  Adobe: "Adobe",
-  APP14: "APP14",
-  PDF: "PDF",
-  PNG: "PNG",
-  QuickTime: "MP4",
-  ItemList: "MP4",
-  UserData: "MP4",
-  Keys: "MP4",
-  XML: "Office",
-  ZIP: "ZIP",
-  RIFF: "RIFF",
-};
-
+// Surface every ExifTool family-1 group verbatim as the source label.
+//
+// We deliberately do NOT collapse sub-groups to friendlier labels (e.g.
+// IFD0/IFD1/ExifIFD/SubIFD/InteropIFD → "EXIF", or all XMP-* → "XMP",
+// or QuickTime/ItemList/UserData/Keys → "MP4"). Collapsing destroys
+// sub-group identity, and ANY tag name that legitimately appears in two
+// sub-groups (Track1:HandlerType + Track2:HandlerType, IFD0:Orientation
+// + IFD1:Orientation, etc.) collides on the same (source, name) key;
+// the diff renderer then mis-aligns one sub-group's row with another's,
+// producing spurious diffs like the "Video Track → Audio Track"
+// regression we hit on MP4 in PR #183.
+//
+// Trade: the user sees more granular labels in the diff UI ("IFD0",
+// "ExifIFD", "Track1", "XMP-dc" instead of "EXIF", "EXIF", "MP4",
+// "XMP"). Acceptable for the privacy-tool audience: precise labels
+// are more informative when asking "what was removed" than
+// smoothed-over umbrella terms.
+//
+// The function still exists rather than being removed entirely so we
+// have a single hook for any future normalisation (e.g. trimming a
+// prefix from an ExifTool group name we genuinely want to merge).
 function mapGroupToSource(rawGroup: string): string {
-  const exact = EXACT_GROUP_MAP[rawGroup];
-  if (exact !== undefined) return exact;
-  if (rawGroup.startsWith("XMP")) return "XMP";
-  if (rawGroup.startsWith("ICC")) return "ICC";
-  if (rawGroup.startsWith("PNG")) return "PNG";
-  if (rawGroup.startsWith("Track")) return "MP4";
-  // Unknown group: surface verbatim. Better than dropping (the user
-  // sees something), and shows up in test coverage if a new group
-  // appears in the wild that we should map.
   return rawGroup;
 }

@@ -37,11 +37,23 @@ export async function redirectWasmFetch(
   const inlineEl = document.getElementById(STANDALONE_WASM_INLINE_ID);
   const base64 = inlineEl?.textContent?.trim();
   if (base64 !== undefined && base64.length > 0) {
-    // `fetch(data:URL)` lets the browser's native Base64 → bytes path
-    // do the decode work (typically faster than a JS atob+charCodeAt
-    // loop for a 33 MB payload). The Response carries the bytes
-    // straight into the wrapper's instantiateStreaming call.
-    return fetch(`data:application/wasm;base64,${base64}`, init);
+    // Standalone build inlines the WASM as gzipped+base64 (~3× smaller
+    // than raw base64; see vite.config.web.standalone.ts). The browser
+    // natively decodes base64 via `fetch(data:URL)`, and we pipe the
+    // decoded gzip bytes through DecompressionStream. Result: HTML
+    // payload drops from ~33 MB → ~10 MB; runtime decode ~30 ms once.
+    const gzipped = await (
+      await fetch(`data:application/octet-stream;base64,${base64}`)
+    ).arrayBuffer();
+    const decompressed = await new Response(
+      new Blob([gzipped])
+        .stream()
+        .pipeThrough(new DecompressionStream("gzip")),
+    ).arrayBuffer();
+    return new Response(decompressed, {
+      headers: { "content-type": "application/wasm" },
+      ...init,
+    });
   }
   const mod = await import("@6over3/zeroperl-ts/zeroperl.wasm?url");
   return fetch(mod.default, init);

src/infrastructure/wasm/strategies/ffmpeg_core.d.ts (21 lines, vendored, normal file)
@@ -0,0 +1,21 @@
// @ffmpeg/core ships Emscripten-generated ESM with no .d.ts. The exec()
// + FS shape we use is captured in `FfmpegCore` in ffmpeg_wasm_fetch.ts;
// here we just declare the module so `import("@ffmpeg/core")` typechecks.
declare module "@ffmpeg/core" {
  const factory: (config: unknown) => Promise<unknown>;
  export default factory;
}
declare module "@ffmpeg/core/wasm?url" {
  const url: string;
  export default url;
}

// Build-time flag set via Vite `define`. The standalone build (file://)
// inlines @ffmpeg/core as gzipped+base64 in `<script type="text/plain">`
// tags and reads them via readInlinedCore(); the PWA-style bare
// `import("@ffmpeg/core")` path cannot be reached in that target. We
// gate the bare import behind `!__WITH_STANDALONE_INLINE__` so Rollup
// tree-shakes the dead branch in the standalone build, avoiding the
// ~43 MB of ffmpeg-core factory + its data: URL wasm fallback that Vite
// would otherwise bundle into the single-file HTML.
declare const __WITH_STANDALONE_INLINE__: boolean;

src/infrastructure/wasm/strategies/ffmpeg_fallback_strategy.ts (373 lines, normal file)
@@ -0,0 +1,373 @@
import type { Result } from "../../../common";
import type { ExifError } from "../../../domain";
import type {
  FormatStrategy,
  StripOptions,
  StripResult,
} from "../format_strategy";
import { cleanFfmpegMp4Output } from "./ffmpeg_post_strip";
import { loadFfmpegInstance, type FfmpegCore } from "./ffmpeg_wasm_fetch";

// Issue #182. Routes MP4/MOV/M4V (Phase 1) and MKV/WebM (Phase 2) through
// ffmpeg-wasm; registered ahead of VideoStrategy in strategy_registry.ts
// when VITE_ENABLE_FFMPEG_FALLBACK is not "false".
//
// Privacy + bundle math is documented in docs/poc/ffmpeg-wasm.md. Per-source
// policy is documented in docs/gap-analysis/{mp4-ffmpeg,mkv,webm}.md.
//
// Architecture: uses @ffmpeg/core directly in the main thread (not via the
// @ffmpeg/ffmpeg wrapper). The wrapper spawns a type:"module" Web Worker
// from a Blob URL, which fails silently when the page origin is `null`
// (the standalone HTML build runs from file://). Main-thread usage works
// uniformly across standalone, PWA, and Capacitor WebView at the cost of
// blocking the UI during exec(): empirically ~1s on a 236 MB DJI Phantom 4
// fixture (see docs/forensic/ffmpeg-fallback.md), acceptable for a
// single-file metadata strip.
//
// Engine lifecycle: one shared FfmpegCore instance per page session, prewarmed
// on idle by the constructor or loaded on first strip(), then reused.
//
// Deliberately NOT claimed:
// - .avi / .wmv / .3gp: separate forensic-verification PRs (closes #44 etc.)
// - subtitle / chapter / data streams in the input are dropped by the
//   -map 0 -map -0:d? -map -0:s? -map -0:t? policy (see gap analyses). The
//   legitimate-subtitle edge case is documented as a follow-up UX item.

type ContainerKind = "mp4" | "matroska";
interface ContainerInfo {
  kind: ContainerKind;
  extension: string;
}

export class FfmpegFallbackStrategy implements FormatStrategy {
  readonly extensions: ReadonlySet<string> = new Set([
    ".mp4",
    ".mov",
    ".m4v",
    ".mkv",
    ".webm",
  ]);

  private instance: FfmpegCore | null = null;
  private loadPromise: Promise<FfmpegCore> | null = null;

  constructor() {
    // Pre-warm ffmpeg-core in the background once the browser is idle.
    // Empirically the ~30 MB wasm gunzip + WebAssembly.instantiate takes
    // ~3-5 s on cold start, which dominated user-perceived "drop to
    // done" latency. Starting the load right after page load (instead
    // of on first strip) overlaps it with the user reading the page +
    // dragging the file in, so by the time they drop, the engine is
    // usually ready. requestIdleCallback ensures we don't compete with
    // first-paint work; setTimeout fallback covers Safari/older
    // environments where rIC isn't available.
    if (typeof window !== "undefined") {
      const kick = () => {
        // Swallow load failure here: strip() will retry on demand
        // and surface the error to the user. We just don't want an
        // uncaught rejection from the prewarm path.
        this.getInstance().catch(() => {});
      };
      const ric = (window as { requestIdleCallback?: (cb: () => void) => void })
        .requestIdleCallback;
      if (typeof ric === "function") {
        ric(kick);
      } else {
        setTimeout(kick, 0);
      }
    }
  }

  verifyMagicBytes({ bytes }: { bytes: Uint8Array }): boolean {
    return matchesMp4Family(bytes) || matchesEbml(bytes);
  }

  async strip({
    bytes,
  }: {
    bytes: Uint8Array;
    options: StripOptions;
  }): Promise<Result<StripResult, ExifError>> {
    const container = detectContainer(bytes);
    if (container === null) {
      return {
        ok: false,
        error: {
          code: "invalid-file-format",
          detail:
            "ffmpeg fallback: input bytes are not a recognised MP4-family ISOBMFF or EBML (MKV/WebM) container",
        },
      };
    }

    const inputName = `in${container.extension}`;
    const outputName = `out${container.extension}`;

    try {
      const core = await this.getInstance();
      // Guarantee MEMFS cleanup on every exit path (success, error
      // return, or thrown). Privacy invariant: input bytes must not
      // persist in WASM linear memory between strips. Without the
      // finally, a throw from FS.readFile (zero-byte output, MEMFS
      // corruption, OOM) would leave the user's video bytes sitting
      // in MEMFS until the next strip's pre-call safeUnlink ran.
      try {
        // Clean MEMFS state from any prior strip; file names collide
        // across calls because the instance is cached.
        safeUnlink(core, inputName);
        safeUnlink(core, outputName);
        core.FS.writeFile(inputName, bytes);

        // Stream selection: keep video+audio in their original
        // input positions, drop everything else.
        //
        // `-map 0:v? -map 0:a?` (what we used before) reorders
        // streams: ffmpeg processes -map args in argument order,
        // so all video streams land first and all audio streams
        // land after. That swaps tracks for any input where audio
        // came before video (some encoders write audio Track 1,
        // video Track 2). The diff view then correctly reports
        // per-track changes, but those changes are spurious:
        // the content is the same, just renumbered.
        //
        // `-map 0 -map -0:d? -map -0:s? -map -0:t?` keeps every
        // stream by default then explicitly removes data ("d"),
        // subtitle ("s"), and attachment ("t") streams.
        // `?` makes each removal optional (no error if the type
        // isn't present). Track ORDER among the surviving video
        // and audio streams is preserved as in the input.
        //
        // This still drops the GPMF / tmcd / fdsc tracks that
        // make mat2 / `-codec copy -map 0` exit 234 on action-cam
        // footage: ffmpeg classifies all of them as data (`d`) streams.
        const args: string[] = [
          "-i",
          inputName,
          "-map",
          "0",
          "-map",
          "-0:d?",
          "-map",
          "-0:s?",
          "-map",
          "-0:t?",
          "-map_metadata",
          "-1",
          "-map_chapters",
          "-1",
          "-fflags",
          "+bitexact",
          "-c",
          "copy",
        ];
        // MP4-specific muxer hardening (no-ops / warnings under matroska):
        //   +faststart - moov at front for better seek
        //   -write_btrt 0 - suppress per-track bitrate boxes (technical
        //     fields not in the input)
        //   -write_tmcd 0 - suppress timecode atom (we drop tmcd
        //     streams via -map anyway; this stops ffmpeg
        //     from synthesising one)
        //   -empty_hdlr_name true - zero per-track hdlr.name (otherwise
        //     ffmpeg writes "VideoHandler"/"SoundHandler"
        //     as a strip-tool fingerprint)
        //   -metadata:s:{v,a} vendor_id= handler_name= - defense-in-depth:
        //     blank stream-level vendor/handler tags
        //     (ffmpeg already writes zeros for these
        //     in our config, but the explicit clear
        //     survives future ffmpeg defaults
        //     changing)
        // The udta/meta/hdlr.vendor "appl" bytes are NOT suppressible via
        // any flag (hardcoded in mov_write_hdlr_tag at the movie level);
        // cleanFfmpegMp4Output() below handles that.
        if (container.kind === "mp4") {
          args.push(
            "-movflags",
            "+faststart",
            "-write_btrt",
            "0",
            "-write_tmcd",
            "0",
            "-empty_hdlr_name",
            "true",
            "-metadata:s:v",
            "vendor_id=",
            "-metadata:s:v",
            "handler_name=",
            "-metadata:s:a",
            "vendor_id=",
            "-metadata:s:a",
            "handler_name=",
          );
        }
        args.push("-metadata", "encoder=");
        args.push(outputName);

        const rc = core.exec(...args);
        if (rc !== 0) {
          return {
            ok: false,
            error: {
              code: "parse-failed",
              raw: `ffmpeg returned non-zero exit code: ${rc}. See browser console for ffmpeg stderr.`,
            },
          };
        }

        let outBytes = new Uint8Array(core.FS.readFile(outputName));
        // Post-strip cleanup: rewrites ffmpeg's hardcoded udta blocks
        // (movie-level and per-track) to `free` skip boxes. ffmpeg's
        // MP4 muxer writes the udta unconditionally with handler_type
        // "mdir" (ExifTool surfaces as `HandlerType: Metadata`) and
        // hardcoded vendor "appl" (`HandlerVendorID: Apple`); the
        // rename neutralises both surfaces in one length-preserving
        // step, without perturbing stco/co64 offsets. See
        // ffmpeg_post_strip.ts for the per-surface rationale and for
        // why mdhd.language is deliberately left as ffmpeg's "und"
        // default. No-op for matroska containers (different structure).
        if (container.kind === "mp4") {
          outBytes = cleanFfmpegMp4Output(outBytes);
        }

        return {
          ok: true,
          value: {
            bytes: outBytes,
            walkerEntries: [],
            diffDocument: null,
          },
        };
      } finally {
        safeUnlink(core, inputName);
        safeUnlink(core, outputName);
      }
    } catch (err: unknown) {
      return {
        ok: false,
        error: {
          code: "file-io-error",
          detail:
            err instanceof Error
              ? `ffmpeg fallback: ${err.message}`
              : `ffmpeg fallback: ${String(err)}`,
        },
      };
    }
  }

  private async getInstance(): Promise<FfmpegCore> {
    if (this.instance !== null) return this.instance;
    if (this.loadPromise !== null) return this.loadPromise;
    this.loadPromise = loadFfmpegInstance().then(
      (inst) => {
        this.instance = inst;
        return inst;
      },
      (err: unknown) => {
        // Clear the cached rejection so the next strip() can retry.
        // Without this, a transient load failure (network blip, gunzip
        // error, dynamic import failure) would permanently brick the
        // strategy for the rest of the page session. The reset lives
        // inside the rejection handler: callers that awaited this
        // same promise still observe the failure, but the *next*
        // getInstance() call after the rejection starts fresh.
        //
        // Interaction with the constructor's requestIdleCallback
        // prewarm: a prewarm failure still gets swallowed by the
        // .catch(() => {}) there, but because we've nulled
        // loadPromise, the next strip() triggers a real retry
        // rather than re-surfacing the cached rejection.
        this.loadPromise = null;
        throw err;
      },
    );
    return this.loadPromise;
  }
}

function safeUnlink(core: FfmpegCore, name: string): void {
  try {
    core.FS.unlink(name);
  } catch {
    // MEMFS unlink throws if the file was never written (early-error
    // path) or if a concurrent op already removed it. Either way, ignore.
  }
}

// ISOBMFF ftyp box: 4-byte size + "ftyp" + 4-byte major brand + 4-byte
// minor version + N*4-byte compatible brands. MP4 family brands include:
// isom, iso2..iso6, mp41, mp42, avc1, "qt  " (note trailing spaces),
// "M4V ", M4VH, M4VP.
const MP4_BRANDS = new Set([
  "isom",
  "iso2",
  "iso3",
  "iso4",
  "iso5",
  "iso6",
  "mp41",
  "mp42",
  "avc1",
  "qt  ",
  "M4V ",
  "M4VH",
  "M4VP",
]);

function matchesMp4Family(b: Uint8Array): boolean {
  if (b.length < 16) return false;
  if (b[4] !== 0x66 || b[5] !== 0x74 || b[6] !== 0x79 || b[7] !== 0x70) {
    return false;
  }
  const boxSize =
    ((b[0] ?? 0) << 24) |
    ((b[1] ?? 0) << 16) |
    ((b[2] ?? 0) << 8) |
    (b[3] ?? 0);
  const ftypEnd = Math.min(boxSize, b.length);
  for (let i = 8; i + 4 <= ftypEnd; i += 4) {
    if (i === 12) continue; // skip minor_version slot
    const brand = String.fromCharCode(
      b[i] ?? 0,
      b[i + 1] ?? 0,
      b[i + 2] ?? 0,
      b[i + 3] ?? 0,
    );
    if (MP4_BRANDS.has(brand)) return true;
  }
  return false;
}

function matchesEbml(b: Uint8Array): boolean {
  return (
    b.length >= 4 &&
    b[0] === 0x1a &&
    b[1] === 0x45 &&
    b[2] === 0xdf &&
    b[3] === 0xa3
  );
}

function detectContainer(b: Uint8Array): ContainerInfo | null {
  if (matchesMp4Family(b)) {
    if (b.length >= 12) {
      const major = String.fromCharCode(
        b[8] ?? 0,
        b[9] ?? 0,
        b[10] ?? 0,
        b[11] ?? 0,
      );
      if (major === "qt  ") return { kind: "mp4", extension: ".mov" };
      if (major.startsWith("M4V")) return { kind: "mp4", extension: ".m4v" };
    }
    return { kind: "mp4", extension: ".mp4" };
  }
  if (matchesEbml(b)) {
    const head = b.subarray(0, Math.min(b.length, 64));
    const decoded = String.fromCharCode(...head);
    if (decoded.includes("webm")) {
      return { kind: "matroska", extension: ".webm" };
    }
    return { kind: "matroska", extension: ".mkv" };
  }
  return null;
}
src/infrastructure/wasm/strategies/ffmpeg_post_strip.ts (103 lines, normal file)
@@ -0,0 +1,103 @@
// Post-strip cleanup for ffmpeg's MP4 muxer output.
//
// ffmpeg's MP4 muxer (libavformat/movenc.c) ALWAYS writes a 0x21-byte
// `udta/meta/hdlr` block at the movie level (and per-track for some
// inputs) regardless of CLI options. The handler_type is `mdir`
// (iTunes-style metadata directory), which exiftool surfaces as
// `HandlerType: Metadata`. The handler vendor is hardcoded to ASCII
// "appl", which exiftool surfaces as `HandlerVendorID: Apple`,
// actively misrepresenting the file as Apple-vendored.
// `mov_write_meta_tag` runs unconditionally even after
// `-map_metadata -1` clears the payload.
//
// Fix: rewrite the `udta` box type to `free`. ISO/IEC 14496-12 §8.1.2
// defines `free` (and its alias `skip`) as padding; every reader
// ignores the contents. exiftool stops surfacing the HandlerType row,
// the HandlerVendorID row, and any other fields that would have come
// from inside the block. ffprobe also stops reporting it.
// Length-preserving rewrite so `stco`/`co64` offsets to `mdat` stay
// valid.
//
// Why rewrite the *udta* rather than the inner `meta` or `hdlr`: we're
// claiming the entire `udta/meta/hdlr` substructure is ffmpeg-added
// padding. The strip invocation passes `-map_metadata -1 -map_chapters
// -1`, which already drops every user-authored udta child the input
// may have had. Any udta remaining in the output is muxer-synthesised.
// (We rewrite per-track udta too; ffmpeg writes those for some inputs
// as well.)
//
// Why we do NOT touch `mdhd.language`: ffmpeg writes the canonical ISO
// 639-2/T code `und` (0x55C4, "undetermined") when the input had no
// language and copies the input's code when it did. We considered
// zeroing it to suppress the `MediaLanguageCode` row from the diff
// when an input had no surfaceable language and ffmpeg had to invent
// one. We reverted: 0x0000 is not a valid ISO 639-2/T code (decodes
// to three 0x60 bytes) and downstream tools improvise: ffprobe
// falls back to displaying `(eng)`, which is actively misleading for
// downstream tooling that switches on language. `und` is the spec's
// canonical "no language specified" marker; readers handle it
// predictably. The diff cost (one extra row when input had no
// surfaceable language) is acceptable vs. the spec-invalid bytes.
// When the input *did* have a language (e.g. "eng"), the resulting
// `eng → und` diff row is honest: we removed the user's language
// tag, that's exactly what the diff is for.
//
// Other muxer-added surfaces are suppressed at the muxer level via the
// strategy's invocation flags (`-write_btrt 0`, `-write_tmcd 0`,
// `-empty_hdlr_name true`, `-metadata:s:{v,a} vendor_id= handler_name=`).
// This pass only handles what the muxer CLI cannot.
//
// All writes are length-preserving so every byte offset elsewhere in
// the file (including `stco`/`co64` chunk offsets into `mdat`) stays
// valid.
//
// Operates on the MP4 family only (ISOBMFF). Matroska/WebM use EBML and
// don't have the structure.

import { parseBoxesSafe, type ParsedBox } from "./video_boxes";

// Mutates the input in place; caller must own the buffer. Returns the
// same buffer for chaining.
export function cleanFfmpegMp4Output(bytes: Uint8Array): Uint8Array {
  const top = parseBoxesSafe(bytes, 0, bytes.length);
  for (const box of top) {
    if (box.type === "moov") walkMoov(bytes, box);
  }
  return bytes;
}

// Walk a moov subtree: rewrite each udta → free, recurse into trak
// for per-track udta.
function walkMoov(out: Uint8Array, moov: ParsedBox): void {
  const moovChildren = parseBoxesSafe(out, moov.payloadStart, moov.payloadEnd);
  for (const child of moovChildren) {
    if (child.type === "udta") rewriteToFree(out, child);
    if (child.type === "trak") walkTrak(out, child);
  }
}

function walkTrak(out: Uint8Array, trak: ParsedBox): void {
  const trakChildren = parseBoxesSafe(out, trak.payloadStart, trak.payloadEnd);
  for (const child of trakChildren) {
    // Per-track udta (some inputs make ffmpeg synthesise these).
    if (child.type === "udta") rewriteToFree(out, child);
  }
}

// Rewrite a box's 4-byte type field to "free", making readers treat the
// entire box (header + payload) as padding. The box's size field is
// untouched so the file's total length and every offset elsewhere
// stays valid. The type field always sits at headerStart + 4..7 in
// ISO/IEC 14496-12 layout; that's the universal location, whether the
// box uses the 8-byte regular header or the 16-byte largesize header
// (size==1, followed by an 8-byte uint64 size). Using headerStart + 4
// rather than payloadStart - 4 guarantees correctness across both:
// for largesize, payloadStart - 4 would land inside the uint64
// largesize field, not the type bytes.
function rewriteToFree(out: Uint8Array, box: ParsedBox): void {
  const typeOffset = box.headerStart + 4;
  out[typeOffset] = 0x66; // 'f'
  out[typeOffset + 1] = 0x72; // 'r'
  out[typeOffset + 2] = 0x65; // 'e'
  out[typeOffset + 3] = 0x65; // 'e'
}
155
src/infrastructure/wasm/strategies/ffmpeg_wasm_fetch.ts
Normal file
155
src/infrastructure/wasm/strategies/ffmpeg_wasm_fetch.ts
Normal file
|
|
@ -0,0 +1,155 @@
|
|||
// DOM element IDs where the standalone build stashes the core JS + WASM
// as gzip-compressed Base64. Read on first load via
// document.getElementById. The HTML has
// `<script type="text/plain" id="…">…</script>` blocks holding the bytes.
// The browser stores the text but doesn't parse it as JS, so the
// initial JS-parse cost stays bounded to the small wrapper code instead
// of paying for V8 to allocate the multi-MB Base64 strings as
// module-scope literals. Mirrors the zeroperl-wasm-base64 pattern in
// exiftool_wasm_fetch.ts.
const STANDALONE_CORE_JS_ID = "ffmpeg-core-js-base64";
const STANDALONE_CORE_WASM_ID = "ffmpeg-core-wasm-base64";

// ffmpeg-core's exec() / FS API. The `@ffmpeg/core` package ships a
// factory function whose return shape isn't exported as a type from the
// package, so we type the bits we use here. The factory's result is typed
// as `unknown` in resolveCore below and cast to FfmpegCore in
// loadFfmpegInstance.
export interface FfmpegCore {
  FS: {
    writeFile: (name: string, data: Uint8Array) => void;
    readFile: (name: string) => Uint8Array;
    unlink: (name: string) => void;
  };
  exec: (...args: string[]) => number;
  ffprobe?: (...args: string[]) => number;
  setLogger?: (cb: (msg: { type: string; message: string }) => void) => void;
  setTimeout?: (timeout: number) => void;
  reset?: () => void;
}

// Returns a ready-to-use ffmpeg-core instance. Loaded directly into the
// main thread. We deliberately do NOT use the @ffmpeg/ffmpeg wrapper,
// which spawns a `type: "module"` Web Worker from a Blob URL. Module
// workers from Blob URLs fail silently when the page's origin is `null`
// (file://, e.g. the standalone HTML), with cross-origin error censoring
// hiding the actual cause. Running ffmpeg in the main thread is the same
// strategy ImageStrategy / PdfStrategy use; it blocks the UI during
// exec() but the strip is short enough (< 1s on a 236 MB DJI Phantom 4
// fixture, per docs/forensic/ffmpeg-fallback.md) that UX is acceptable.
export async function loadFfmpegInstance(): Promise<FfmpegCore> {
  const { factory, wasmBytes } = await resolveCore();
  // ffmpeg-core's Emscripten module detects environment via `self`. In a
  // browser main thread `self === window`, both are defined by the
  // engine. No shim needed (Node-side shim lives in
  // tools/forensic/ffmpeg-fallback.ts where the strategy isn't used).
  const core = await factory({
    wasmBinary: wasmBytes,
    // Quiet by default: the strategy surfaces errors via rc != 0
    // rather than relying on ffmpeg's stderr noise.
    print: () => {},
    printErr: () => {},
  });
  return core as FfmpegCore;
}

interface CoreSources {
  factory: (config: unknown) => Promise<unknown>;
  wasmBytes: Uint8Array;
}

async function resolveCore(): Promise<CoreSources> {
  // Standalone build: bytes are inlined as gzipped+base64 in
  // `<script type="text/plain">` elements. Decompress + decode each, feed
  // the wasm bytes straight into createFFmpegCore via `wasmBinary`.
  // Gzipping cut the standalone HTML payload from ~116 MB to ~40 MB. HTML
  // parse time scales with text-node size at page load, so this is the
  // start-time win.
  if (typeof document !== "undefined") {
    const inlined = await readInlinedCore();
    if (inlined !== null) return inlined;
  }

  // `__WITH_STANDALONE_INLINE__` is replaced at build time by Vite's
  // `define` config (true in the standalone build, false in PWA/APK).
  // Gating the bare `import("@ffmpeg/core")` behind this flag lets
  // Rollup tree-shake the entire branch in the standalone build,
  // dropping ~43 MB of ffmpeg-core factory + its data: URL wasm
  // fallback that Vite would otherwise statically bundle into the
  // single-file HTML (the runtime never reaches this branch in the
  // standalone target because readInlinedCore above returns first, but
  // the static analyser doesn't know that).
  if (__WITH_STANDALONE_INLINE__) {
    throw new Error(
      "standalone build reached the PWA fetch path — inline tags missing",
    );
  }

  // PWA / APK build: Vite-emitted asset URLs. Imports go through the
  // @ffmpeg/core exports map. Vite's ?url suffix gives us hashed asset
  // URLs; we fetch the WASM bytes ourselves and import the JS factory.
  const [{ default: factory }, wasmMod] = await Promise.all([
    import("@ffmpeg/core"),
    import("@ffmpeg/core/wasm?url"),
  ]);
  const wasmBytes = new Uint8Array(
    await (await fetch(wasmMod.default)).arrayBuffer(),
  );
  return {
    factory: factory as (config: unknown) => Promise<unknown>,
    wasmBytes,
  };
}

async function readInlinedCore(): Promise<CoreSources | null> {
  const coreJsB64 = document
    .getElementById(STANDALONE_CORE_JS_ID)
    ?.textContent?.trim();
  const coreWasmB64 = document
    .getElementById(STANDALONE_CORE_WASM_ID)
    ?.textContent?.trim();
  if (
    coreJsB64 === undefined ||
    coreJsB64 === "" ||
    coreWasmB64 === undefined ||
    coreWasmB64 === ""
  ) {
    return null;
  }
  // Decompress both payloads concurrently. The browser's native base64
  // → bytes (via fetch(data:URL)) plus DecompressionStream("gzip") is
  // far faster than a JS-side atob+pako loop on multi-MB inputs.
  const [wasmBytes, jsBytes] = await Promise.all([
    base64GunzipToBytes(coreWasmB64),
    base64GunzipToBytes(coreJsB64),
  ]);
  // The core JS is an Emscripten-generated ESM module. Build a Blob URL
  // for it and dynamic-import to get the createFFmpegCore default
  // export. One-shot module import, not a Worker spawn, so the
  // null-origin Blob URL restriction affecting type:module workers does
  // NOT apply.
  const coreBlob = new Blob([jsBytes], { type: "text/javascript" });
  const coreUrl = URL.createObjectURL(coreBlob);
  const factoryPromise = import(/* @vite-ignore */ coreUrl).then(
    (mod): ((config: unknown) => Promise<unknown>) => {
      URL.revokeObjectURL(coreUrl);
      return mod.default;
    },
  );
  return {
    factory: async (config: unknown) => (await factoryPromise)(config),
    wasmBytes,
  };
}

// Browser-native base64 → bytes (via fetch on a data: URL) followed by
// DecompressionStream gunzip. Used for both the wasm payload (~30 MB raw
// → ~10 MB gz) and the core JS (~110 KB raw → ~30 KB gz). Runtime cost
// ~30 ms total on the wasm side; pays for itself many times over against
// the ~600 ms HTML-parse savings from a smaller text node.
async function base64GunzipToBytes(b64: string): Promise<Uint8Array> {
  const gzipped = await (
    await fetch(`data:application/octet-stream;base64,${b64}`)
  ).blob();
  const decompressed = await new Response(
    gzipped.stream().pipeThrough(new DecompressionStream("gzip")),
  ).arrayBuffer();
  return new Uint8Array(decompressed);
}
|
|
@@ -4,6 +4,7 @@ import { JpegStrategy } from "./strategies/jpeg_strategy";
|
|||
import { PngStrategy } from "./strategies/png_strategy";
|
||||
import { PdfStrategy } from "./strategies/pdf_strategy";
|
||||
import { ExifToolFallbackStrategy } from "./strategies/exiftool_fallback_strategy";
|
||||
import { FfmpegFallbackStrategy } from "./strategies/ffmpeg_fallback_strategy";
|
||||
import type { FormatStrategy } from "./format_strategy";
|
||||
|
||||
// VITE_ENABLE_EXIFTOOL_FALLBACK gates the ExifTool-in-WASM fallback. Default
|
||||
|
|
@@ -13,8 +14,21 @@ import type { FormatStrategy } from "./format_strategy";
|
|||
const ENABLE_EXIFTOOL_FALLBACK =
|
||||
import.meta.env.VITE_ENABLE_EXIFTOOL_FALLBACK !== "false";
|
||||
|
||||
// VITE_ENABLE_FFMPEG_FALLBACK gates the ffmpeg-wasm strategy (#182). Default
|
||||
// on for every build target; "false" omits the engine and falls back to
|
||||
// VideoStrategy for MP4/MOV/M4V. See docs/poc/ffmpeg-wasm.md and
|
||||
// docs/gap-analysis/mp4-ffmpeg.md.
|
||||
const ENABLE_FFMPEG_FALLBACK =
|
||||
import.meta.env.VITE_ENABLE_FFMPEG_FALLBACK !== "false";
|
||||
|
||||
const STRATEGIES: readonly FormatStrategy[] = [
|
||||
new OfficeStrategy(),
|
||||
// FfmpegFallbackStrategy claims .mp4/.mov/.m4v AHEAD of VideoStrategy when
|
||||
// enabled — it closes the walker's KNOWN_GAPS (#38, #39, #111, #42) by
|
||||
// re-writing the container from the stream tables. VideoStrategy stays
|
||||
// in the list as the opt-out fallback (and during the validation window
|
||||
// per #182's "delete walker in a follow-up PR" plan).
|
||||
...(ENABLE_FFMPEG_FALLBACK ? [new FfmpegFallbackStrategy()] : []),
|
||||
new VideoStrategy(),
|
||||
new JpegStrategy(),
|
||||
new PngStrategy(),
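
The ordering comment above implies that the registry is resolved first-match in list order, which is how an earlier entry can take priority for `.mp4`/`.mov`/`.m4v`. The lookup itself is not part of this hunk, so the following is only a sketch under that assumption; `SketchStrategy`, `buildRegistry`, and `pickStrategy` are hypothetical names, and the real `FormatStrategy` has more members than shown.

```typescript
// Hypothetical, simplified strategy shape (illustration only).
interface SketchStrategy {
  readonly name: string;
  readonly extensions: ReadonlySet<string>;
}

// Mirrors the conditional spread in STRATEGIES: the ffmpeg entry is present
// only when the flag is on, and it sits ahead of the walker-based entry.
function buildRegistry(enableFfmpeg: boolean): readonly SketchStrategy[] {
  const ffmpeg: SketchStrategy = {
    name: "ffmpeg",
    extensions: new Set([".mp4", ".mov", ".m4v", ".mkv", ".webm"]),
  };
  const video: SketchStrategy = {
    name: "video",
    extensions: new Set([".mp4", ".mov", ".m4v"]),
  };
  return [...(enableFfmpeg ? [ffmpeg] : []), video];
}

// Assumed resolution rule: the first strategy claiming the extension wins.
function pickStrategy(
  registry: readonly SketchStrategy[],
  extension: string,
): SketchStrategy | undefined {
  for (const strategy of registry) {
    if (strategy.extensions.has(extension)) return strategy;
  }
  return undefined;
}

console.log(pickStrategy(buildRegistry(true), ".mp4")?.name); // ffmpeg
console.log(pickStrategy(buildRegistry(false), ".mp4")?.name); // video
```

Under this rule the opt-out build still handles the MP4 family through the walker, while `.mkv` and `.webm` have no claimant in this two-entry sketch once the flag is off.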
|
||||
|
|
|
|||
|
|
@@ -64,6 +64,25 @@ export class WasmProcessor implements MetadataProcessorPort {
|
|||
// slate by constructing a new WasmProcessor.
|
||||
private diffWarmupSignalled = false;
|
||||
|
||||
// Promise chain that serializes ALL diff builds across the processor's
|
||||
// lifetime. `@uswriting/exiftool`'s parseMetadata uses module-level
|
||||
// singletons (the Perl interpreter, MemoryFileSystem, stdout/stderr
|
||||
// StringBuilders) — every call does `c.clear(), m.clear(), await
|
||||
// e.reset()` on the shared state. Two concurrent readDocument calls
|
||||
// race on those buffers. We already serialize before+after within
|
||||
// a single entry below; this chain extends the same guarantee across
|
||||
// entries so that `use_process_files`'s mid-loop + end-of-batch
|
||||
// fire-and-forget drainDiffQueue invocations (which can overlap when
|
||||
// a batch crosses DIFF_DRAIN_CHUNK boundaries) can't interleave with
|
||||
// each other on the singleton.
|
||||
//
|
||||
// Each new diff awaits the previous one before starting its own pair
|
||||
// of parseMetadata calls. Cost: same as serial drain order at the
|
||||
// hook layer (which we already serialize per-drain anyway) — no new
|
||||
// wall-clock penalty in the common case; correctness guarantee for
|
||||
// the overlapping-drain edge case.
|
||||
private diffChain: Promise<unknown> = Promise.resolve();
|
||||
|
||||
constructor({ fileBytes }: { fileBytes: FileBytesPort }) {
|
||||
this.fileBytes = fileBytes;
|
||||
this.diffStrategy = new ExifToolDiffStrategy();
|
||||
|
|
@@ -166,17 +185,41 @@ export class WasmProcessor implements MetadataProcessorPort {
|
|||
dispatchExifToolDiffLoading();
|
||||
}
|
||||
|
||||
// Queue this diff onto the singleton chain — guarantees no two
|
||||
// parseMetadata calls (within or across entries) overlap on the
|
||||
// shared Perl/StringBuilder state. See diffChain field docstring.
|
||||
const next = this.diffChain.then(() => this.runDiff(pending));
|
||||
// Swallow rejection on the chain itself so one failure doesn't
|
||||
// poison the chain for subsequent diffs. `runDiff` already returns
|
||||
// null on error; the chain just needs to keep its head live.
|
||||
this.diffChain = next.catch(() => null);
|
||||
return next;
|
||||
}
|
||||
|
||||
// Within a single diff, serialize before+after sequentially.
|
||||
// `@uswriting/exiftool`'s parseMetadata uses module-level singletons
|
||||
// for the Perl interpreter, the MemoryFileSystem, AND the stdout/
|
||||
// stderr StringBuilders. Every call does `c.clear(), m.clear(), await
|
||||
// e.reset()` on that shared state. Running before+after as Promise.all
|
||||
// interleaves the resets and the stdout reads — empirically (Node
|
||||
// repro) the second pair onward came back with both reads returning
|
||||
// the same buffer contents (same key count, same JSON), so the diff
|
||||
// renderer saw no changes. File 1 escaped because the cold-start
|
||||
// `T()` boot blocked one of the two reads long enough for the other
|
||||
// to finish, but once perl was warm the race fired on every
|
||||
// subsequent file. Serial avoids the race entirely.
|
||||
private async runDiff(
|
||||
pending: PendingDiffInputs,
|
||||
): Promise<MetadataDocument | null> {
|
||||
try {
|
||||
- const [beforeResult, afterResult] = await Promise.all([
- this.diffStrategy.readDocument({
- bytes: pending.sourceBytes,
- extension: pending.extension,
- }),
- this.diffStrategy.readDocument({
- bytes: pending.strippedBytes,
- extension: pending.extension,
- }),
- ]);
+ const beforeResult = await this.diffStrategy.readDocument({
+ bytes: pending.sourceBytes,
+ extension: pending.extension,
+ });
+ const afterResult = await this.diffStrategy.readDocument({
+ bytes: pending.strippedBytes,
+ extension: pending.extension,
+ });
|
||||
if (!beforeResult.ok || !afterResult.ok) {
|
||||
return null;
|
||||
}
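
The `diffChain` pattern in this hunk (append each job to a shared promise, then swallow rejections on the stored tail only) can be sketched in isolation. `SerialQueue`, `sleep`, and `serialQueueDemo` are hypothetical names for this example; the real processor chains `runDiff` calls, not timers.

```typescript
// Minimal serializing queue: each task starts only after the previous
// one has settled, whether it fulfilled or rejected.
class SerialQueue {
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(task: () => Promise<T>): Promise<T> {
    const next = this.tail.then(task);
    // Keep the stored tail alive if `task` rejects; the caller still
    // observes the rejection through `next`.
    this.tail = next.catch(() => null);
    return next;
  }
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

async function serialQueueDemo(): Promise<string[]> {
  const queue = new SerialQueue();
  const log: string[] = [];
  const job = (label: string, ms: number, fail = false) => async () => {
    log.push(`start ${label}`);
    await sleep(ms);
    log.push(`end ${label}`);
    if (fail) throw new Error(`${label} failed`);
    return label;
  };

  // All three are enqueued synchronously, like overlapping drains.
  const a = queue.enqueue(job("a", 20));
  const b = queue.enqueue(job("b", 5, true)).catch(() => "b rejected");
  const c = queue.enqueue(job("c", 1));
  await Promise.all([a, b, c]);
  return log;
}
```

In this sketch job `b` rejects, yet `c` still runs afterwards and no two jobs interleave, which is the property the shared ExifTool singleton needs.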
|
||||
|
|
|
|||
BIN
tests/e2e/fixtures/sample-real.mp4
Normal file
Binary file not shown.
|
|
@@ -67,6 +67,51 @@ test.describe("Standalone single-file HTML build", () => {
|
|||
);
|
||||
});
|
||||
|
||||
// Regression test for #182 standalone HTML hang. The original implementation
|
||||
// used the @ffmpeg/ffmpeg wrapper which spawns a type:"module" Web Worker
|
||||
// from a Blob URL. Module Workers from Blob URLs fail silently when the
|
||||
// page origin is `null` (file://), with cross-origin error censoring hiding
|
||||
// the cause. The strip would hang forever waiting for a response from a
|
||||
// dead worker. Switching to main-thread @ffmpeg/core fixed it. This test
|
||||
// would have caught the regression — it drops an MP4 and waits for the
|
||||
// completion row, with a tight timeout that fails fast on hangs.
|
||||
test("strips an MP4 via the ffmpeg fallback under file://", async ({
|
||||
page,
|
||||
}) => {
|
||||
// Cold WASM init (~3-5s) + download flush dominates wall time; bump
|
||||
// the outer test ceiling so the inner waits can use their 45s budget.
|
||||
test.setTimeout(90_000);
|
||||
const consoleErrors: string[] = [];
|
||||
page.on("console", (msg) => {
|
||||
if (msg.type() === "error") consoleErrors.push(msg.text());
|
||||
});
|
||||
page.on("pageerror", (err) => {
|
||||
consoleErrors.push(`pageerror: ${err.message}`);
|
||||
});
|
||||
|
||||
await page.goto(indexUrl);
|
||||
await page.waitForLoadState("domcontentloaded");
|
||||
await page.waitForSelector("[role='main']", { timeout: 10_000 });
|
||||
await expect(page.locator(".drop-zone")).toBeVisible();
|
||||
|
||||
const { bytes, filename } = await captureDownload(page, async () => {
|
||||
await dropFiles(page, [fixturePath("sample-real.mp4")]);
|
||||
// Generous timeout because the standalone HTML pays a one-time cold
// WASM init on first video drop (~3-5 s) before processing. If the
// strip hangs (e.g. Worker dies silently), the 45 s ceiling here
// trips well before the 90 s test timeout set above.
|
||||
await page.waitForSelector(".file-table__row--complete", {
|
||||
timeout: 45_000,
|
||||
});
|
||||
});
|
||||
|
||||
expect(filename).toMatch(/\.mp4$/i);
|
||||
await assertOutputStripped(bytes, filename);
|
||||
expect(consoleErrors, "Unexpected console errors during MP4 strip").toEqual(
|
||||
[],
|
||||
);
|
||||
});
|
||||
|
||||
// Runtime complement to tests/e2e/web/no-network.spec.ts. The web build
|
||||
// proves "SW + cache serve the whole pipeline" via mid-session route
|
||||
// abort. The standalone build can't use the same pattern (no SW to fall
|
||||
|
|
|
|||
|
|
@@ -39,6 +39,27 @@ test.describe("File Processing — drag-drop (Web)", () => {
|
|||
await assertOutputStripped(bytes, filename);
|
||||
});
|
||||
|
||||
// End-to-end coverage of the FfmpegFallbackStrategy (#182). The PWA path
|
||||
// uses Vite-emitted ?url assets for ffmpeg-core (not the inline-base64
|
||||
// path the standalone uses), so this complements the standalone MP4 test:
|
||||
// they exercise the two distinct asset-resolution branches.
|
||||
test("strips metadata from an MP4 via the ffmpeg fallback", async ({
|
||||
page,
|
||||
}) => {
|
||||
test.setTimeout(60_000);
|
||||
const { bytes, filename } = await captureDownload(page, async () => {
|
||||
await dropFiles(page, [fixturePath("sample-real.mp4")]);
|
||||
// Generous timeout: first video drop pays a one-time ~3-5 s WASM
|
||||
// init for ffmpeg-core. Subsequent strips reuse the instance.
|
||||
await page.waitForSelector(".file-table__row--complete", {
|
||||
timeout: 45_000,
|
||||
});
|
||||
});
|
||||
|
||||
expect(filename).toMatch(/\.mp4$/i);
|
||||
await assertOutputStripped(bytes, filename);
|
||||
});
|
||||
|
||||
test("multi-file drag-drop bundles outputs into a single flat zip", async ({
|
||||
page,
|
||||
}) => {
|
||||
|
|
|
|||
|
|
@@ -97,6 +97,78 @@ export async function assertDocxStripped(bytes: Buffer): Promise<void> {
|
|||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Bytewise checks on a stripped MP4/MOV/MKV/WebM output. The
|
||||
* FfmpegFallbackStrategy invocation uses `-fflags +bitexact`,
|
||||
* `-map_metadata -1`, and `-metadata encoder=`, so the output must
|
||||
* not carry an `Lavf<version>` muxer fingerprint, any of the
|
||||
* sentinel metadata strings the `sample.mp4` fixture seeded
|
||||
* (`Test Author`, `Test Video`, the dc:title XMP block), or the
|
||||
* Adobe XMP namespace marker that comes from the seeded uuid box.
|
||||
*
|
||||
* The check is fixture-aware (knows what `tests/e2e/fixtures/sample.mp4`
|
||||
* seeded). If a different fixture is used, extend this list.
|
||||
*/
|
||||
export function assertVideoStripped(bytes: Buffer): void {
|
||||
const ascii = bytes.toString("latin1");
|
||||
expect(ascii, "Stripped video should not carry Lavf encoder fingerprint").not.toMatch(
|
||||
/Lavf\d/,
|
||||
);
|
||||
expect(ascii, "Stripped video should not contain 'Test Author' sentinel").not.toContain(
|
||||
"Test Author",
|
||||
);
|
||||
expect(ascii, "Stripped video should not contain 'Test Video' sentinel").not.toContain(
|
||||
"Test Video",
|
||||
);
|
||||
expect(ascii, "Stripped video should not contain XMP namespace marker").not.toContain(
|
||||
"http://ns.adobe.com/xap/1.0/",
|
||||
);
|
||||
expect(ascii, "Stripped video should not contain ExifTool authorship").not.toContain(
|
||||
"Image::ExifTool",
|
||||
);
|
||||
// ffmpeg's MP4 muxer always writes a udta/meta/hdlr block at the movie
|
||||
// level (and per-track for some inputs) regardless of CLI options.
|
||||
// ExifTool surfaces this as `HandlerType: Metadata` and (without the
|
||||
// vendor patch) `HandlerVendorID: Apple`. Our post-strip pass renames
|
||||
// every udta box type to `free`, which readers treat as padding.
|
||||
//
|
||||
// Walk the ISOBMFF box tree (moov → top-level udta + per-trak udta)
|
||||
// and assert no udta box survives. A naive substring search on
|
||||
// "udta" false-positives on mdat byte collisions (~6% at 250 MB);
|
||||
// pinning to a specific size byte (e.g. 0x21) is brittle to ffmpeg
|
||||
// version drift in the udta payload size. The structural walk
|
||||
// matches what the post-strip pass itself rewrites.
|
||||
const moov = findTopLevelBox(bytes, "moov");
|
||||
expect(moov, "Stripped MP4 should contain a moov box").not.toBeNull();
|
||||
if (moov !== null) {
|
||||
const udtaInMoov = findChildBox(bytes, moov, "udta");
|
||||
expect(udtaInMoov, "Stripped MP4 should not contain moov/udta").toBeNull();
|
||||
// Per-track udta lives under moov/trak — walk every trak.
|
||||
const traks = findAllChildBoxes(bytes, moov, "trak");
|
||||
for (const trak of traks) {
|
||||
const udtaInTrak = findChildBox(bytes, trak, "udta");
|
||||
expect(
|
||||
udtaInTrak,
|
||||
"Stripped MP4 should not contain moov/trak/udta",
|
||||
).toBeNull();
|
||||
}
|
||||
}
|
||||
// ffmpeg's btrt (Bitrate Box) writes BufferSize/MaxBitrate/AverageBitrate
|
||||
// from the input stream stats. Suppressed via -write_btrt 0 in the strategy
|
||||
// invocation. If the flag stops working, btrt boxes will reappear.
|
||||
expect(ascii, "Stripped video should not contain ffmpeg's btrt bitrate box").not.toMatch(
|
||||
/btrt/,
|
||||
);
|
||||
// Per-track hdlr.name fields (one per video/audio trak). ffmpeg's default
|
||||
// is "VideoHandler"/"SoundHandler"; we suppress via -empty_hdlr_name true.
|
||||
expect(ascii, "Stripped video should not contain ffmpeg's default handler names").not.toContain(
|
||||
"VideoHandler",
|
||||
);
|
||||
expect(ascii, "Stripped video should not contain ffmpeg's default handler names").not.toContain(
|
||||
"SoundHandler",
|
||||
);
|
||||
}
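
The "~6% at 250 MB" figure quoted in the udta comment above can be reproduced with a short calculation. This is a back-of-the-envelope sketch, assuming mdat bytes behave like independent uniform random bytes (only approximately true for compressed video), taking 250 MB as 250,000,000 bytes, and using a Poisson approximation; the function names are hypothetical.

```typescript
// Expected number of positions where a fixed 4-byte pattern (such as
// "udta") appears in `lengthBytes` of uniformly random data.
function expectedFourByteMatches(lengthBytes: number): number {
  const positions = Math.max(lengthBytes - 3, 0);
  return positions / 2 ** 32;
}

// Probability of at least one accidental match (Poisson approximation).
function probabilityOfAccidentalMatch(lengthBytes: number): number {
  return 1 - Math.exp(-expectedFourByteMatches(lengthBytes));
}

const p = probabilityOfAccidentalMatch(250_000_000);
console.log(`${(p * 100).toFixed(1)}%`);
```

The expected count is about 250,000,000 / 2^32 ≈ 0.058, so the chance of at least one spurious hit is a little under 6%, which is why the assertion walks the box tree instead of scanning for the substring.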
|
||||
|
||||
export async function assertOutputStripped(
|
||||
bytes: Buffer,
|
||||
filename: string,
|
||||
|
|
@@ -107,6 +179,13 @@ export async function assertOutputStripped(
|
|||
case "jpeg":
|
||||
assertJpegStripped(bytes);
|
||||
return;
|
||||
case "mp4":
|
||||
case "mov":
|
||||
case "m4v":
|
||||
case "mkv":
|
||||
case "webm":
|
||||
assertVideoStripped(bytes);
|
||||
return;
|
||||
case "png":
|
||||
// Block accidental use until PNG strategy lands. The current
|
||||
// substring-scan implementation in assertPngStripped is fragile
|
||||
|
|
@@ -128,3 +207,91 @@ export async function assertOutputStripped(
|
|||
}
|
||||
}
|
||||
|
||||
// Minimal ISOBMFF box walker for the udta assertion above. Returns
|
||||
// `null` on any malformed input rather than throwing — the assertion
|
||||
// failing with "moov box missing" is more useful than a parse error.
|
||||
//
|
||||
// We don't share the strategy's `parseBoxesSafe` here to keep the test
|
||||
// helper free of `src/` imports; the structure is tiny enough that a
|
||||
// local copy is fine and the two implementations stay independent
|
||||
// (the same adversarial-independence rule the forensic runners follow).
|
||||
interface BoxSpan {
|
||||
readonly type: string;
|
||||
readonly payloadStart: number;
|
||||
readonly payloadEnd: number;
|
||||
}
|
||||
|
||||
function findTopLevelBox(bytes: Buffer, type: string): BoxSpan | null {
|
||||
return findBoxIn(bytes, 0, bytes.length, type);
|
||||
}
|
||||
|
||||
function findChildBox(
|
||||
bytes: Buffer,
|
||||
parent: BoxSpan,
|
||||
type: string,
|
||||
): BoxSpan | null {
|
||||
return findBoxIn(bytes, parent.payloadStart, parent.payloadEnd, type);
|
||||
}
|
||||
|
||||
function findAllChildBoxes(
|
||||
bytes: Buffer,
|
||||
parent: BoxSpan,
|
||||
type: string,
|
||||
): BoxSpan[] {
|
||||
const out: BoxSpan[] = [];
|
||||
let offset = parent.payloadStart;
|
||||
while (offset + 8 <= parent.payloadEnd) {
|
||||
const box = parseBoxAt(bytes, offset, parent.payloadEnd);
|
||||
if (box === null) return out;
|
||||
if (box.span.type === type) out.push(box.span);
|
||||
offset = box.next;
|
||||
}
|
||||
return out;
|
||||
}
|
||||
|
||||
function findBoxIn(
|
||||
bytes: Buffer,
|
||||
start: number,
|
||||
end: number,
|
||||
type: string,
|
||||
): BoxSpan | null {
|
||||
let offset = start;
|
||||
while (offset + 8 <= end) {
|
||||
const box = parseBoxAt(bytes, offset, end);
|
||||
if (box === null) return null;
|
||||
if (box.span.type === type) return box.span;
|
||||
offset = box.next;
|
||||
}
|
||||
return null;
|
||||
}
|
||||
|
||||
function parseBoxAt(
|
||||
bytes: Buffer,
|
||||
offset: number,
|
||||
end: number,
|
||||
): { span: BoxSpan; next: number } | null {
|
||||
if (offset + 8 > end) return null;
|
||||
let size = bytes.readUInt32BE(offset);
|
||||
const type = bytes.toString("latin1", offset + 4, offset + 8);
|
||||
let headerSize = 8;
|
||||
if (size === 1) {
|
||||
if (offset + 16 > end) return null;
|
||||
const high = bytes.readUInt32BE(offset + 8);
|
||||
const low = bytes.readUInt32BE(offset + 12);
|
||||
if (high >= 0x20_0000) return null; // 2^53 limit safeguard
|
||||
size = high * 0x1_0000_0000 + low;
|
||||
headerSize = 16;
|
||||
} else if (size === 0) {
|
||||
size = end - offset;
|
||||
}
|
||||
if (size < headerSize || offset + size > end) return null;
|
||||
return {
|
||||
span: {
|
||||
type,
|
||||
payloadStart: offset + headerSize,
|
||||
payloadEnd: offset + size,
|
||||
},
|
||||
next: offset + size,
|
||||
};
|
||||
}
|
||||
|
||||
|
|
|
|||
|
|
@@ -6,8 +6,9 @@
|
|||
// metadata document before strip on the left, after strip on the right.
|
||||
//
|
||||
// JPEG: drop sample.jpg, row becomes expandable, click → two-pane diff
|
||||
- // shows with EXIF/JFIF source groups, removed entries strike-through on
- // the left, placeholder on the right.
+ // shows with ExifTool family-1 source groups (IFD0, JFIF, etc., surfaced
+ // verbatim since 1c3ced5; no umbrella collapse to "EXIF"), removed entries
+ // strike-through on the left, placeholder on the right.
|
||||
//
|
||||
// PDF: drop sample.pdf, same flow. Source label is "PDF" (ExifTool's PDF
|
||||
// group encompasses Info dict + XMP).
|
||||
|
|
@@ -25,7 +26,7 @@ test.describe("Metadata diff expansion (two-pane via ExifTool)", () => {
|
|||
await launchPage(page);
|
||||
});
|
||||
|
||||
- test("JPEG file shows expandable two-pane diff with EXIF group", async ({
+ test("JPEG file shows expandable two-pane diff with IFD0 group", async ({
|
||||
page,
|
||||
isMobile,
|
||||
browserName,
|
||||
|
|
@@ -71,9 +72,15 @@ test.describe("Metadata diff expansion (two-pane via ExifTool)", () => {
|
|||
const diff = page.locator(".file-table__diff--two-pane");
|
||||
await expect(diff).toBeVisible();
|
||||
|
||||
- // EXIF source group header is present (sample.jpg has EXIF tags).
+ // IFD0 source group header is present (sample.jpg carries Make + Model
+ // in IFD0). Since 1c3ced5 the diff strategy surfaces ExifTool family-1
+ // group names verbatim (IFD0, ExifIFD, XMP-dc, etc.) and explicitly
+ // does NOT collapse them to umbrella labels like "EXIF" (collapsing
+ // causes (source, name) key collisions across sub-groups, see the
+ // commit message and exiftool_diff_strategy.ts mapGroupToSource for
+ // the rationale).
await expect(
- diff.locator(".file-table__diff-group-header", { hasText: /EXIF/ }),
+ diff.locator(".file-table__diff-group-header", { hasText: /IFD0/ }),
|
||||
).toBeVisible();
|
||||
|
||||
// At least one row is classified as removed (ExifTool reads zero
|
||||
|
|
|
|||
|
|
@@ -39,7 +39,7 @@ describe("ExifToolDiffStrategy", () => {
|
|||
}
|
||||
}, 15_000);
|
||||
|
||||
- it("maps ExifTool IFD0 / ExifIFD groups to 'EXIF' source label", async () => {
+ it("surfaces ExifTool family-1 groups verbatim (no collapse)", async () => {
|
||||
const strategy = new ExifToolDiffStrategy();
|
||||
const result = await strategy.readDocument({
|
||||
bytes: loadFixture("sample.jpg"),
|
||||
|
|
@@ -49,13 +49,19 @@ describe("ExifToolDiffStrategy", () => {
|
|||
if (!result.ok) return;
|
||||
|
||||
const sources = new Set(result.value.map((e) => e.source));
|
||||
- // IFD0 + ExifIFD + InteropIFD + IFD1 should all collapse to "EXIF".
- // We don't assert ALL of them present (depends on fixture), but
- // confirm "IFD0" / "ExifIFD" don't leak through as-is.
- expect(sources.has("IFD0")).toBe(false);
- expect(sources.has("ExifIFD")).toBe(false);
- expect(sources.has("InteropIFD")).toBe(false);
- expect(sources.has("IFD1")).toBe(false);
+ // We deliberately preserve raw ExifTool group names: collapsing
+ // sub-groups (IFD0/IFD1/ExifIFD → "EXIF", XMP-* → "XMP",
+ // QuickTime/ItemList/UserData → "MP4", Track1/Track2 → "MP4")
+ // causes diff-renderer key collisions when the same tag name
+ // appears in two sub-groups (e.g. Track1:HandlerType vs
+ // Track2:HandlerType). The sample.jpg fixture is known to carry
+ // at least an IFD0 group; assert it surfaces under that name
+ // rather than getting flattened to "EXIF".
+ expect(sources.has("IFD0")).toBe(true);
+ // Conversely, the collapsed labels we used to emit ("EXIF") should
+ // no longer appear from the diff strategy; any test elsewhere
+ // using `source: "EXIF"` is on synthetic data, not live diff data.
+ expect(sources.has("EXIF")).toBe(false);
|
||||
}, 15_000);
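
The key-collision rationale in the test comment above can be shown with a tiny example. This is a sketch only: `DiffEntry`, `indexByKey`, and the `"source:name"` key format are assumptions for illustration, and the tag values are illustrative rather than taken from a real ExifTool run; the actual renderer's keying is not shown in this diff.

```typescript
interface DiffEntry {
  source: string;
  name: string;
  value: string;
}

// Index entries by a (source, name) key; a later entry with the same key
// overwrites the earlier one.
function indexByKey(entries: readonly DiffEntry[]): Map<string, string> {
  const index = new Map<string, string>();
  for (const entry of entries) {
    index.set(`${entry.source}:${entry.name}`, entry.value);
  }
  return index;
}

// Family-1 group names kept verbatim: two distinct keys.
const verbatim: DiffEntry[] = [
  { source: "Track1", name: "HandlerType", value: "Video Track" },
  { source: "Track2", name: "HandlerType", value: "Audio Track" },
];

// Same entries with both tracks collapsed to an umbrella label.
const collapsed: DiffEntry[] = verbatim.map((entry) => ({
  ...entry,
  source: "MP4",
}));

console.log(indexByKey(verbatim).size); // 2
console.log(indexByKey(collapsed).size); // 1
```

With the collapsed labels, the second track's value silently replaces the first under `MP4:HandlerType`, which is the kind of spurious diff the verbatim group names avoid.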
|
||||
|
||||
it("drops File:* / ExifTool:* / System:* / Composite:* / SourceFile entries", async () => {
|
||||
|
|
|
|||
156
tests/infrastructure/wasm/ffmpeg_fallback_strategy.test.ts
Normal file
|
|
@@ -0,0 +1,156 @@
|
|||
import { afterEach, describe, it, expect, vi } from "vitest";
|
||||
|
||||
import { FfmpegFallbackStrategy } from "../../../src/infrastructure/wasm/strategies/ffmpeg_fallback_strategy";
|
||||
|
||||
// FfmpegFallbackStrategy loads @ffmpeg/core directly in the main thread (no
|
||||
// @ffmpeg/ffmpeg wrapper, no Web Worker). End-to-end strip behaviour is
|
||||
// exercised by the Playwright e2e tests in tests/e2e/standalone/standalone.spec.ts
|
||||
// and tests/e2e/web/file-processing.spec.ts, plus the Node forensic runner at
|
||||
// tools/forensic/ffmpeg-fallback.ts (which shims browser globals so @ffmpeg/core
|
||||
// runs under Node). The Vitest tests in this file cover:
|
||||
// - Static surface (extension claim, magic-byte verification)
|
||||
// - Registry gating (VITE_ENABLE_FFMPEG_FALLBACK env flag)
|
||||
|
||||
describe("FfmpegFallbackStrategy", () => {
|
||||
it("claims .mp4 / .mov / .m4v / .mkv / .webm in Phase 1+2", () => {
|
||||
const strategy = new FfmpegFallbackStrategy();
|
||||
// Phase 1
|
||||
expect(strategy.extensions.has(".mp4")).toBe(true);
|
||||
expect(strategy.extensions.has(".mov")).toBe(true);
|
||||
    expect(strategy.extensions.has(".m4v")).toBe(true);
    // Phase 2 (forensic-verified in tools/forensic/ffmpeg-fallback.ts)
    expect(strategy.extensions.has(".mkv")).toBe(true);
    expect(strategy.extensions.has(".webm")).toBe(true);
    // Deliberately out of scope — separate strategies / future PRs.
    expect(strategy.extensions.has(".avi")).toBe(false);
    expect(strategy.extensions.has(".wmv")).toBe(false);
    expect(strategy.extensions.has(".3gp")).toBe(false);
  });

  it("verifyMagicBytes accepts EBML headers (MKV / WebM)", () => {
    const strategy = new FfmpegFallbackStrategy();
    const ebml = new Uint8Array([
      0x1a, 0x45, 0xdf, 0xa3, 0x9f, 0x42, 0x86, 0x81,
      0x01, 0x42, 0xf7, 0x81, 0x01, 0x42, 0xf2, 0x81,
    ]);
    expect(strategy.verifyMagicBytes?.({ bytes: ebml })).toBe(true);
  });

  it("verifyMagicBytes accepts ISOBMFF files with MP4-family brands", () => {
    const strategy = new FfmpegFallbackStrategy();
    // Standard MP4: size + "ftyp" + "isom" + minor + compat["isom","mp42"]
    const isom = new Uint8Array([
      0x00, 0x00, 0x00, 0x20, 0x66, 0x74, 0x79, 0x70,
      0x69, 0x73, 0x6f, 0x6d, 0x00, 0x00, 0x02, 0x00,
      0x69, 0x73, 0x6f, 0x6d, 0x69, 0x73, 0x6f, 0x32,
      0x61, 0x76, 0x63, 0x31, 0x6d, 0x70, 0x34, 0x31,
    ]);
    expect(strategy.verifyMagicBytes?.({ bytes: isom })).toBe(true);

    // QuickTime MOV: "ftyp" + "qt  " (two trailing spaces, four chars)
    const qt = new Uint8Array([
      0x00, 0x00, 0x00, 0x14, 0x66, 0x74, 0x79, 0x70,
      0x71, 0x74, 0x20, 0x20, 0x00, 0x00, 0x02, 0x00,
      0x71, 0x74, 0x20, 0x20,
    ]);
    expect(strategy.verifyMagicBytes?.({ bytes: qt })).toBe(true);

    // iPhone HEVC MOV variant: brand = "mp42" major
    const mp42 = new Uint8Array([
      0x00, 0x00, 0x00, 0x18, 0x66, 0x74, 0x79, 0x70,
      0x6d, 0x70, 0x34, 0x32, 0x00, 0x00, 0x00, 0x01,
      0x6d, 0x70, 0x34, 0x32, 0x69, 0x73, 0x6f, 0x6d,
    ]);
    expect(strategy.verifyMagicBytes?.({ bytes: mp42 })).toBe(true);
  });

  it("verifyMagicBytes rejects look-alikes and unrelated containers", () => {
    const strategy = new FfmpegFallbackStrategy();
    // AVIF — different ISOBMFF major brand, not MP4 family
    const avif = new Uint8Array([
      0x00, 0x00, 0x00, 0x20, 0x66, 0x74, 0x79, 0x70,
      0x61, 0x76, 0x69, 0x66, 0x00, 0x00, 0x00, 0x00,
    ]);
    expect(strategy.verifyMagicBytes?.({ bytes: avif })).toBe(false);
    // HEIC — also ISOBMFF but heif/heic brand
    const heic = new Uint8Array([
      0x00, 0x00, 0x00, 0x20, 0x66, 0x74, 0x79, 0x70,
      0x68, 0x65, 0x69, 0x63, 0x00, 0x00, 0x00, 0x00,
    ]);
    expect(strategy.verifyMagicBytes?.({ bytes: heic })).toBe(false);
    // PNG header
    const png = new Uint8Array([
      0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a,
    ]);
    expect(strategy.verifyMagicBytes?.({ bytes: png })).toBe(false);
    // JPEG
    const jpeg = new Uint8Array([0xff, 0xd8, 0xff, 0xe0]);
    expect(strategy.verifyMagicBytes?.({ bytes: jpeg })).toBe(false);
    // Too short to even be an ftyp box
    const tiny = new Uint8Array([0x00, 0x00]);
    expect(strategy.verifyMagicBytes?.({ bytes: tiny })).toBe(false);
    // ftyp box at offset 4 (compliant) but no MP4-family brand anywhere
    const wrong = new Uint8Array([
      0x00, 0x00, 0x00, 0x18, 0x66, 0x74, 0x79, 0x70,
      0x6d, 0x69, 0x66, 0x31, 0x00, 0x00, 0x00, 0x00,
      0x6d, 0x69, 0x66, 0x31, 0x68, 0x65, 0x69, 0x63,
    ]);
    expect(strategy.verifyMagicBytes?.({ bytes: wrong })).toBe(false);
  });

  it("accepts MP4 with mp4 brand in compatible_brands list (not major)", () => {
    const strategy = new FfmpegFallbackStrategy();
    // major = "mp41" (rare but seen) — should still match
    const compat = new Uint8Array([
      0x00, 0x00, 0x00, 0x20, 0x66, 0x74, 0x79, 0x70,
      0x6d, 0x70, 0x34, 0x31, 0x00, 0x00, 0x00, 0x00,
      0x6d, 0x70, 0x34, 0x31, 0x69, 0x73, 0x6f, 0x6d,
      0x6d, 0x70, 0x34, 0x32, 0x00, 0x00, 0x00, 0x00,
    ]);
    expect(strategy.verifyMagicBytes?.({ bytes: compat })).toBe(true);
  });
});

// Build-flag gating: when VITE_ENABLE_FFMPEG_FALLBACK is "false", the
// strategy must not be registered. Mirrors the ExifTool fallback gating
// test in this same directory.
describe("strategy_registry — VITE_ENABLE_FFMPEG_FALLBACK gating", () => {
  const MP4_ISOM = new Uint8Array([
    0x00, 0x00, 0x00, 0x20, 0x66, 0x74, 0x79, 0x70,
    0x69, 0x73, 0x6f, 0x6d, 0x00, 0x00, 0x02, 0x00,
    0x69, 0x73, 0x6f, 0x6d, 0x69, 0x73, 0x6f, 0x32,
    0x61, 0x76, 0x63, 0x31, 0x6d, 0x70, 0x34, 0x31,
  ]);

  afterEach(() => {
    vi.unstubAllEnvs();
  });

  it("routes MP4 to FfmpegFallbackStrategy when flag is unset (default on)", async () => {
    vi.resetModules();
    const { selectStrategy } = await import(
      "../../../src/infrastructure/wasm/strategy_registry"
    );
    const result = selectStrategy({
      filename: "video.mp4",
      bytes: MP4_ISOM,
    });
    expect(result).not.toBeNull();
    // Should be the ffmpeg strategy, not the legacy walker.
    expect(result?.constructor.name).toBe("FfmpegFallbackStrategy");
  });

  it("falls back to VideoStrategy walker when flag is 'false'", async () => {
    vi.resetModules();
    vi.stubEnv("VITE_ENABLE_FFMPEG_FALLBACK", "false");
    const { selectStrategy } = await import(
      "../../../src/infrastructure/wasm/strategy_registry"
    );
    const result = selectStrategy({
      filename: "video.mp4",
      bytes: MP4_ISOM,
    });
    expect(result).not.toBeNull();
    expect(result?.constructor.name).toBe("VideoStrategy");
  });
});
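
The magic-byte tests above pin down which inputs are accepted and rejected, but the detector itself is not part of this hunk. As a rough sketch only, the check could be written as below. It assumes a brand allow-list chosen to satisfy the test vectors; the real `verifyMagicBytes` may use a different list or a different scan. `looksLikeFfmpegInput`, `MP4_FAMILY_BRANDS`, `EBML_MAGIC` and `fourcc` are hypothetical names, not identifiers from this PR.

```typescript
// Hypothetical sketch, not the strategy's actual implementation.
const EBML_MAGIC = [0x1a, 0x45, 0xdf, 0xa3];
// Assumed allow-list; the strategy's real list is not shown in this diff.
const MP4_FAMILY_BRANDS = new Set(["isom", "iso2", "mp41", "mp42", "qt  ", "M4V "]);

// Read four bytes at `offset` as an ASCII four-character code.
function fourcc(bytes: Uint8Array, offset: number): string {
  let out = "";
  for (let i = 0; i < 4; i++) out += String.fromCharCode(bytes[offset + i] ?? 0);
  return out;
}

function looksLikeFfmpegInput(bytes: Uint8Array): boolean {
  // Matroska / WebM: the file starts with the EBML header ID.
  if (bytes.length >= 4 && EBML_MAGIC.every((b, i) => bytes[i] === b)) return true;
  // ISOBMFF: need size (4) + "ftyp" (4) + major_brand (4) at minimum.
  if (bytes.length < 12 || fourcc(bytes, 4) !== "ftyp") return false;
  if (MP4_FAMILY_BRANDS.has(fourcc(bytes, 8))) return true;
  // compatible_brands follow major_brand (4) + minor_version (4); stay
  // inside both the declared box size and the bytes actually available.
  const declared = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength).getUint32(0);
  const end = Math.min(declared, bytes.length);
  for (let off = 16; off + 4 <= end; off += 4) {
    if (MP4_FAMILY_BRANDS.has(fourcc(bytes, off))) return true;
  }
  return false;
}
```

Checking compatible brands as well as the major brand is what lets a file with an unusual major brand still match, while `avif`, `heic` and `mif1` files fall through because none of their brands are on the list.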
319
tests/infrastructure/wasm/ffmpeg_post_strip.test.ts
Normal file
@@ -0,0 +1,319 @@
import { describe, it, expect } from "vitest";
import { cleanFfmpegMp4Output } from "../../../src/infrastructure/wasm/strategies/ffmpeg_post_strip";

// Build a minimal box: 8-byte header (size + type) followed by payload.
function box(type: string, payload: Uint8Array | Uint8Array[]): Uint8Array {
  const flat = Array.isArray(payload) ? concat(payload) : payload;
  const total = 8 + flat.length;
  const out = new Uint8Array(total);
  new DataView(out.buffer).setUint32(0, total);
  for (let i = 0; i < 4; i++) out[4 + i] = type.charCodeAt(i);
  out.set(flat, 8);
  return out;
}

function concat(chunks: Uint8Array[]): Uint8Array {
  const total = chunks.reduce((s, c) => s + c.length, 0);
  const out = new Uint8Array(total);
  let off = 0;
  for (const c of chunks) {
    out.set(c, off);
    off += c.length;
  }
  return out;
}

// Build a valid hdlr FullBox with the given handler_type and vendor.
// Layout from payloadStart:
//   +0..+3:   version (1) + flags (3) — FullBox header
//   +4..+7:   pre_defined: uint32
//   +8..+11:  handler_type: 4 ASCII
//   +12..+15: reserved[0] = vendor ← target
//   +16..+23: reserved[1..2] — zeros
//   +24..:    name (UTF-8 zero-terminated)
function buildHdlr({
  handlerType,
  vendor,
  name = "",
}: {
  handlerType: string;
  vendor: string;
  name?: string;
}): Uint8Array {
  const nameBytes = new TextEncoder().encode(name + "\x00");
  const payload = new Uint8Array(4 + 4 + 4 + 4 + 8 + nameBytes.length);
  // version+flags [0..3] = 0; pre_defined [4..7] = 0
  for (let i = 0; i < 4; i++) payload[8 + i] = handlerType.charCodeAt(i);
  for (let i = 0; i < 4; i++) payload[12 + i] = vendor.charCodeAt(i);
  // reserved[1..2] (16..23) left as zeros
  payload.set(nameBytes, 24);
  return box("hdlr", payload);
}

// Wrap an hdlr in a meta FullBox: 4-byte version+flags then the hdlr child.
function buildMeta(hdlr: Uint8Array): Uint8Array {
  const fullBoxHeader = new Uint8Array(4); // version + flags
  return box("meta", concat([fullBoxHeader, hdlr]));
}

// Linear scan for the first box of `type` and return its [headerStart, payloadEnd).
function findBox(
  bytes: Uint8Array,
  type: string,
): { start: number; end: number } | null {
  const target = type.split("").map((c) => c.charCodeAt(0));
  for (let i = 0; i + 8 <= bytes.length; i++) {
    let match = true;
    for (let j = 0; j < 4; j++) {
      if (bytes[i + 4 + j] !== target[j]) {
        match = false;
        break;
      }
    }
    if (!match) continue;
    const view = new DataView(bytes.buffer, bytes.byteOffset + i);
    const size = view.getUint32(0);
    if (size < 8 || i + size > bytes.length) continue;
    return { start: i, end: i + size };
  }
  return null;
}

// Decode bytes as latin1 — used to scan for ASCII strings without throwing on
// non-text bytes.
function asString(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i++) {
    out += String.fromCharCode(bytes[i] ?? 0);
  }
  return out;
}

const FTYP = box("ftyp", new TextEncoder().encode("isom\x00\x00\x00\x00"));
const MDAT = box("mdat", new TextEncoder().encode("MEDIA_PAYLOAD"));

describe("cleanFfmpegMp4Output", () => {
  it("renames moov/udta to free (length-preserving) and leaves everything else intact", () => {
    const hdlr = buildHdlr({ handlerType: "mdir", vendor: "appl" });
    const meta = buildMeta(hdlr);
    const udta = box("udta", meta);
    const moov = box("moov", udta);
    const input = concat([FTYP, moov, MDAT]);
    const inputCopy = new Uint8Array(input); // pristine reference

    // Find udta location in the original (used for offset math).
    const udtaLoc = findBox(input, "udta");
    expect(udtaLoc).not.toBeNull();
    if (udtaLoc === null) return;
    const typeOffset = udtaLoc.start + 4; // type field follows the 4-byte size field

    const out = cleanFfmpegMp4Output(input);

    // Length preserved.
    expect(out.length).toBe(inputCopy.length);

    // Type field is now "free".
    expect(asString(out.subarray(typeOffset, typeOffset + 4))).toBe("free");

    // The "udta" string is gone from the output.
    expect(asString(out)).not.toContain("udta");

    // Only 4 bytes changed in the file (the type field).
    const sizeBytesBefore = inputCopy.subarray(0, typeOffset);
    const sizeBytesAfter = out.subarray(0, typeOffset);
    expect(sizeBytesAfter).toEqual(sizeBytesBefore);
    const tailBefore = inputCopy.subarray(typeOffset + 4);
    const tailAfter = out.subarray(typeOffset + 4);
    expect(tailAfter).toEqual(tailBefore);

    // ftyp byte-identical.
    const ftypLoc = findBox(out, "ftyp");
    expect(ftypLoc).not.toBeNull();
    if (ftypLoc === null) return;
    expect(out.subarray(ftypLoc.start, ftypLoc.end)).toEqual(
      inputCopy.subarray(ftypLoc.start, ftypLoc.end),
    );

    // mdat byte-identical.
    const mdatLoc = findBox(out, "mdat");
    expect(mdatLoc).not.toBeNull();
    if (mdatLoc === null) return;
    expect(out.subarray(mdatLoc.start, mdatLoc.end)).toEqual(
      inputCopy.subarray(mdatLoc.start, mdatLoc.end),
    );
  });

  it("renames a largesize-header udta (size=1, 16-byte header) correctly", () => {
    // Build a box with the largesize encoding: size field is 1, followed
    // by an 8-byte uint64 actual size. The type field sits at offset
    // +4..+7 (same as a regular box). rewriteToFree must address the
    // type via headerStart + 4 — using payloadStart - 4 would land on
    // byte 12 of the header, corrupting the largesize value.

    // Payload: bare meta + hdlr to make this realistic, though contents
    // don't matter for the rename.
    const hdlr = buildHdlr({ handlerType: "mdir", vendor: "appl" });
    const meta = buildMeta(hdlr);
    const inner = meta; // udta payload
    const totalSize = 16 + inner.length; // largesize header (16) + payload

    const udta = new Uint8Array(totalSize);
    // size field = 1 (signals largesize)
    new DataView(udta.buffer).setUint32(0, 1);
    // type = "udta"
    udta[4] = 0x75;
    udta[5] = 0x64;
    udta[6] = 0x74;
    udta[7] = 0x61;
    // largesize (uint64) = totalSize (high 4 bytes = 0; low 4 bytes = size)
    new DataView(udta.buffer).setUint32(8, 0);
    new DataView(udta.buffer).setUint32(12, totalSize);
    // payload
    udta.set(inner, 16);

    const moov = box("moov", udta);
    const input = concat([FTYP, moov, MDAT]);
    const inputCopy = new Uint8Array(input);

    // Locate the udta in the input (search for the type bytes).
    const udtaIdx = (() => {
      for (let i = 0; i + 8 <= input.length; i++) {
        if (
          input[i + 4] === 0x75 &&
          input[i + 5] === 0x64 &&
          input[i + 6] === 0x74 &&
          input[i + 7] === 0x61
        )
          return i;
      }
      return -1;
    })();
    expect(udtaIdx).toBeGreaterThanOrEqual(0);
    // Pre-condition: size field is 1 (largesize signal).
    const sizeBefore = new DataView(
      input.buffer,
      input.byteOffset + udtaIdx,
    ).getUint32(0);
    expect(sizeBefore).toBe(1);
    // Pre-condition: largesize value at +8..+15 encodes totalSize.
    const largeSizeBefore = new DataView(
      input.buffer,
      input.byteOffset + udtaIdx + 8,
    ).getBigUint64(0);
    expect(largeSizeBefore).toBe(BigInt(totalSize));

    const out = cleanFfmpegMp4Output(input);

    // Type field rewritten to "free".
    expect(asString(out.subarray(udtaIdx + 4, udtaIdx + 8))).toBe("free");
    // Size field (regular size, at +0..+3) untouched — still 1.
    const sizeAfter = new DataView(
      out.buffer,
      out.byteOffset + udtaIdx,
    ).getUint32(0);
    expect(sizeAfter).toBe(1);
    // Largesize value (uint64 at +8..+15) untouched — still totalSize.
    // This is the key correctness assertion: if rewriteToFree used
    // payloadStart - 4 (= headerStart + 12), the low 4 bytes of the
    // largesize uint64 would have been clobbered with "free", giving
    // a size of 0x00000000_66726565 here and a reader-corrupt box.
    const largeSizeAfter = new DataView(
      out.buffer,
      out.byteOffset + udtaIdx + 8,
    ).getBigUint64(0);
    expect(largeSizeAfter).toBe(BigInt(totalSize));
    // Length preserved overall.
    expect(out.length).toBe(inputCopy.length);
  });

  it("renames per-track udta (moov/trak/udta) to free — drone/action-cam style", () => {
    const hdlr = buildHdlr({ handlerType: "mdir", vendor: "appl" });
    const meta = buildMeta(hdlr);
    const udta = box("udta", meta);
    const tkhd = box("tkhd", new Uint8Array(84));
    const trak = box("trak", concat([tkhd, udta]));
    const moov = box("moov", trak);
    const input = concat([FTYP, moov, MDAT]);
    const inputCopy = new Uint8Array(input);

    const udtaLoc = findBox(input, "udta");
    expect(udtaLoc).not.toBeNull();
    if (udtaLoc === null) return;
    const typeOffset = udtaLoc.start + 4;

    const out = cleanFfmpegMp4Output(input);

    expect(out.length).toBe(inputCopy.length);
    expect(asString(out.subarray(typeOffset, typeOffset + 4))).toBe("free");
    expect(asString(out)).not.toContain("udta");

    // tkhd untouched.
    const tkhdLoc = findBox(out, "tkhd");
    expect(tkhdLoc).not.toBeNull();
    if (tkhdLoc === null) return;
    expect(out.subarray(tkhdLoc.start, tkhdLoc.end)).toEqual(
      inputCopy.subarray(tkhdLoc.start, tkhdLoc.end),
    );
  });

  it("is a no-op when no udta is present", () => {
    const mvhd = box("mvhd", new Uint8Array(100));
    const tkhd = box("tkhd", new Uint8Array(84));
    const trak = box("trak", tkhd);
    const moov = box("moov", concat([mvhd, trak]));
    const input = concat([FTYP, moov, MDAT]);
    const inputCopy = new Uint8Array(input);

    const out = cleanFfmpegMp4Output(input);

    expect(out.length).toBe(inputCopy.length);
    expect(out).toEqual(inputCopy);
  });

  it("does not throw on a truncated udta payload (parseBoxesSafe swallows)", () => {
    // udta with a 4-byte payload that's not a complete child box.
    const udta = box("udta", new Uint8Array(4));
    const moov = box("moov", udta);
    const input = concat([FTYP, moov, MDAT]);

    expect(() => cleanFfmpegMp4Output(input)).not.toThrow();

    const out = cleanFfmpegMp4Output(input);
    const udtaScan = asString(out).indexOf("udta");
    const freeScan = asString(out).indexOf("free");
    // udta type renamed regardless of its child contents.
    expect(udtaScan).toBe(-1);
    expect(freeScan).toBeGreaterThanOrEqual(0);
  });

  it("renames every udta — top-level and per-track — in the same pass", () => {
    // One top-level moov/udta AND one moov/trak/udta. Verifies the
    // walker visits both branches.
    const hdlrA = buildHdlr({ handlerType: "mdir", vendor: "appl" });
    const udtaA = box("udta", buildMeta(hdlrA));

    const hdlrB = buildHdlr({ handlerType: "mdir", vendor: "appl" });
    const udtaB = box("udta", buildMeta(hdlrB));
    const tkhd = box("tkhd", new Uint8Array(84));
    const trak = box("trak", concat([tkhd, udtaB]));

    const moov = box("moov", concat([udtaA, trak]));
    const input = concat([FTYP, moov, MDAT]);
    const inputCopy = new Uint8Array(input);

    // Pre-condition: two "udta" occurrences in the file.
    expect(asString(inputCopy).split("udta").length - 1).toBe(2);
    // "free" doesn't appear yet (no top-level free box in our synthetic).
    expect(asString(inputCopy)).not.toContain("free");

    const out = cleanFfmpegMp4Output(input);

    // Length preserved.
    expect(out.length).toBe(inputCopy.length);
    // Both "udta" types are gone.
    expect(asString(out)).not.toContain("udta");
    // Replaced by two "free" types.
    expect(asString(out).split("free").length - 1).toBe(2);
  });
});
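
The tests above fix the contract of the post-strip pass: every `udta` box under `moov` or `moov/trak` has its type rewritten to `free`, the length is preserved, and the type is addressed at `headerStart + 4` even for largesize headers. A minimal sketch of a walker with that behaviour follows. It is an illustration under assumptions, not the body of `cleanFfmpegMp4Output`: `renameUdtaToFree`, `walk`, `boxType` and the `CONTAINER_TYPES` list are hypothetical, and a `size` of 0 is treated as "runs to the end of the enclosing range" for simplicity.

```typescript
// Hypothetical sketch; the PR's real walker is not shown in this diff.
const CONTAINER_TYPES = new Set(["moov", "trak"]); // assumed descent list
const FREE = [0x66, 0x72, 0x65, 0x65]; // ASCII "free"

// The box type always sits at headerStart + 4, for both header forms.
function boxType(buf: Uint8Array, headerStart: number): string {
  let out = "";
  for (let i = 0; i < 4; i++) out += String.fromCharCode(buf[headerStart + 4 + i] ?? 0);
  return out;
}

function walk(buf: Uint8Array, view: DataView, start: number, end: number): void {
  let pos = start;
  while (pos + 8 <= end) {
    let size = view.getUint32(pos);
    let headerLen = 8;
    if (size === 1) {
      // largesize: the real size is a uint64 at +8..+15 (read as two uint32s).
      if (pos + 16 > end) return;
      size = view.getUint32(pos + 8) * 0x100000000 + view.getUint32(pos + 12);
      headerLen = 16;
    } else if (size === 0) {
      size = end - pos; // simplification, see lead-in
    }
    if (size < headerLen || pos + size > end) return; // malformed: stop quietly
    const type = boxType(buf, pos);
    if (type === "udta") {
      buf.set(FREE, pos + 4); // headerStart + 4, never payloadStart - 4
    } else if (CONTAINER_TYPES.has(type)) {
      walk(buf, view, pos + headerLen, pos + size);
    }
    pos += size;
  }
}

function renameUdtaToFree(input: Uint8Array): Uint8Array {
  const out = new Uint8Array(input); // copy; the caller's bytes stay untouched
  walk(out, new DataView(out.buffer), 0, out.length);
  return out;
}
```

Because only the four type bytes change, sizes and chunk offsets elsewhere in the file stay valid, which is the property the length and byte-identity assertions in the tests rely on.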
|
||||
|
|
@@ -203,6 +203,64 @@ describe("WasmProcessor — async diff build", () => {
    expect(second).toBeNull();
  }, 30_000);

  // Regression guard: processing two different files and then running
  // their diffs back to back must produce a distinct, correct diff for
  // each entry. Before the fix, buildDiffDocumentForEntry ran the
  // before+after ExifTool reads in Promise.all. @uswriting/exiftool
  // shares a module-level perl instance and stdout/stderr buffers, so
  // two concurrent parseMetadata calls clobbered each other's output.
  // The race was masked on the very first diff because the cold start
  // blocked one of the pair, then surfaced on every diff after.
  //
  // This test exercises two diffs against the same processor instance:
  // JPEG (sample.jpg, has Make=TestCamera) and PNG (sample.png with
  // known iTXt/tEXt chunks). The first diff should show the JPEG's
  // metadata removal; the second should show the PNG's. With the race
  // in place, the second came back identical to the first or empty.
  it("produces distinct correct diffs for two sequential entries", async () => {
    const jpegBytes = new Uint8Array(
      readFileSync(join(IMAGE_FIXTURES, "sample.jpg")),
    );
    const pngBytes = new Uint8Array(
      readFileSync(join(IMAGE_FIXTURES, "sample.png")),
    );
    fileBytes.files.set("/tmp/a.jpg", jpegBytes);
    fileBytes.files.set("/tmp/b.png", pngBytes);

    await processor.process({
      entryId: "entry-jpeg-multi",
      filePath: "/tmp/a.jpg",
      options: { ...NO_PRESERVE },
    });
    await processor.process({
      entryId: "entry-png-multi",
      filePath: "/tmp/b.png",
      options: { ...NO_PRESERVE },
    });

    const diffJpeg = await processor.buildDiffDocumentForEntry({
      entryId: "entry-jpeg-multi",
    });
    const diffPng = await processor.buildDiffDocumentForEntry({
      entryId: "entry-png-multi",
    });

    expect(diffJpeg).not.toBeNull();
    expect(diffPng).not.toBeNull();
    if (diffJpeg === null || diffPng === null) return;

    // JPEG diff must surface the EXIF Make tag from sample.jpg.
    const jpegMake = diffJpeg.before.find((e) => e.name === "Make");
    expect(jpegMake?.value).toBe("TestCamera");

    // PNG diff must NOT contain the JPEG's EXIF tags (race regression
    // would have made the second diff parrot back the first file's
    // metadata). Equivalently: the before lists must not be identical.
    expect(JSON.stringify(diffJpeg.before)).not.toBe(
      JSON.stringify(diffPng.before),
    );
  }, 30_000);

  // Finding 4.3: graceful degradation when ExifTool errors out. We swap
  // the strategy on the instance with a stub that always errors; the
  // processor must catch and return null rather than throwing.
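
The regression test in this hunk guards a fix that serializes diff builds through a Promise chain, so that no two `parseMetadata` calls overlap. One common way to build such a chain is sketched below. `SerialQueue` and its `run` method are hypothetical names used for illustration; how `WasmProcessor` actually stores and extends its chain is not shown in this diff.

```typescript
// Hypothetical sketch of Promise-chain serialization; not WasmProcessor's code.
class SerialQueue {
  // Tail of the chain. It never rejects, so one failed task cannot
  // poison the tasks queued after it.
  private tail: Promise<unknown> = Promise.resolve();

  run<T>(task: () => Promise<T>): Promise<T> {
    // Start this task only after everything queued before it has settled.
    const result = this.tail.then(() => task());
    // Swallow the outcome on the chain itself; the caller still
    // observes success or failure through `result`.
    this.tail = result.catch(() => undefined);
    return result;
  }
}

// Usage: both calls are issued immediately, but the second body does not
// start until the first has finished.
const queue = new SerialQueue();
const order: string[] = [];
const pause = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

const first = queue.run(async () => {
  order.push("first:start");
  await pause(20);
  order.push("first:end");
  return "a";
});
const second = queue.run(async () => {
  order.push("second:start");
  order.push("second:end");
  return "b";
});
```

A queue like this serializes across the lifetime of whatever object owns it, which matches the requirement that diff builds from different entries never interleave on the shared ExifTool state.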
366
tools/forensic/ffmpeg-fallback.ts
Normal file
@@ -0,0 +1,366 @@
// Forensic recovery battery for FfmpegFallbackStrategy (#182 Phase 1 + 2).
//
// The strategy class itself loads @ffmpeg/core on the browser main thread
// (the @ffmpeg/ffmpeg wrapper was dropped from package.json). This
// runner replicates the strategy's strip invocation against @ffmpeg/core
// directly: the same WASM the strategy loads, the same arg vector, the
// same MEMFS lifecycle.
//
// Run:
//   npx tsx tools/forensic/ffmpeg-fallback.ts
//
// Sentinel seeding uses `ffmpeg -metadata` directly (MKV/WebM) and `exiftool`
// (MP4). System ffmpeg + exiftool required for fixture build; strip phase
// runs entirely in-WASM.
//
// No-network verification (after tsx is cached):
//   TSX=$(ls -d ~/.npm/_npx/*/node_modules/tsx | head -1)/dist/cli.mjs
//   node --permission --allow-fs-read='*' --allow-fs-write='*' \
//     --allow-child-process "$TSX" tools/forensic/ffmpeg-fallback.ts
//
// Pass criteria: zero sentinel survival across every fixture. KNOWN_GAPS
// empty for Phase 1 + Phase 2 formats. New gaps → file an issue, mark in
// KNOWN_GAPS, never silently dismiss.

import { readFileSync, writeFileSync, mkdirSync, existsSync } from "node:fs";
import { execFileSync } from "node:child_process";
import { join, resolve, dirname, extname } from "node:path";
import { fileURLToPath } from "node:url";
import { tmpdir } from "node:os";

const HERE = dirname(fileURLToPath(import.meta.url));
const REPO_ROOT = resolve(HERE, "..", "..");
const REAL_WORLD_DIR = join(
  REPO_ROOT,
  "tests",
  "fixtures",
  "wasm",
  "video",
  "real-world",
);
const WORK_DIR = join(tmpdir(), "ffmpeg-fallback-forensic");
mkdirSync(WORK_DIR, { recursive: true });

const SENTINELS = {
  TITLE: "FORENSIC-FF-TITLE-AAAA",
  AUTHOR: "FORENSIC-FF-AUTHOR-BBBB",
  COMMENT: "FORENSIC-FF-COMMENT-CCCC",
  ENCODER: "FORENSIC-FF-ENCODER-DDDD",
  DESCRIPTION: "FORENSIC-FF-DESC-EEEE",
} as const;

const KNOWN_GAPS: ReadonlyMap<string, string> = new Map();

type SentinelKey = keyof typeof SENTINELS;
type FixtureKind =
  | "synthetic-mp4"
  | "synthetic-mkv"
  | "synthetic-webm"
  | "phone-baseline"
  | "gopro-fusion"
  | "dji-phantom4";

interface FixtureResult {
  kind: FixtureKind;
  inputBytes: number;
  outputBytes: number;
  rc: number;
  stderr: string;
  survivors: SentinelKey[];
  deviceFingerprintSurvivors: string[];
  skipped?: string;
}

function buildSyntheticMp4(): { bytes: Uint8Array; landed: SentinelKey[] } {
  const synth = join(WORK_DIR, "synth.mp4");
  const seeded = join(WORK_DIR, "seeded.mp4");
  execFileSync(
    "ffmpeg",
    ["-f", "lavfi", "-i", "color=c=blue:s=128x128:d=1", "-pix_fmt", "yuv420p", "-y", synth],
    { stdio: ["ignore", "ignore", "ignore"] },
  );
  execFileSync("cp", [synth, seeded], { stdio: ["ignore", "ignore", "ignore"] });
  execFileSync(
    "exiftool",
    [
      "-overwrite_original",
      `-Title=${SENTINELS.TITLE}`,
      `-Author=${SENTINELS.AUTHOR}`,
      `-Comment=${SENTINELS.COMMENT}`,
      `-Encoder=${SENTINELS.ENCODER}`,
      `-Description=${SENTINELS.DESCRIPTION}`,
      seeded,
    ],
    { stdio: ["ignore", "ignore", "ignore"] },
  );
  const bytes = readFileSync(seeded);
  const landed: SentinelKey[] = [];
  for (const key of Object.keys(SENTINELS) as SentinelKey[]) {
    if (bytes.includes(Buffer.from(SENTINELS[key]))) landed.push(key);
  }
  return { bytes: new Uint8Array(bytes), landed };
}

function buildSyntheticEbml(
  containerExt: ".mkv" | ".webm",
): { bytes: Uint8Array; landed: SentinelKey[] } {
  // exiftool refuses MKV/WebM writes; seed via ffmpeg's own -metadata flag.
  const synth = join(WORK_DIR, `synth${containerExt}`);
  const seeded = join(WORK_DIR, `seeded${containerExt}`);
  execFileSync(
    "ffmpeg",
    ["-f", "lavfi", "-i", "color=c=green:s=128x128:d=1", "-pix_fmt", "yuv420p", "-y", synth],
    { stdio: ["ignore", "ignore", "ignore"] },
  );
  execFileSync(
    "ffmpeg",
    [
      "-i", synth,
      "-map", "0", "-c", "copy",
      "-metadata", `title=${SENTINELS.TITLE}`,
      "-metadata", `description=${SENTINELS.DESCRIPTION}`,
      "-metadata", `comment=${SENTINELS.COMMENT}`,
      "-metadata", `encoder=${SENTINELS.ENCODER}`,
      "-y", seeded,
    ],
    { stdio: ["ignore", "ignore", "ignore"] },
  );
  const bytes = readFileSync(seeded);
  const landed: SentinelKey[] = [];
  for (const key of Object.keys(SENTINELS) as SentinelKey[]) {
    if (bytes.includes(Buffer.from(SENTINELS[key]))) landed.push(key);
  }
  return { bytes: new Uint8Array(bytes), landed };
}

async function runStrip(
  inputBytes: Uint8Array,
  containerExt: string,
): Promise<{ outputBytes: Uint8Array; rc: number; stderr: string }> {
  (globalThis as unknown as { self: typeof globalThis }).self = globalThis;
  (globalThis as unknown as { window: typeof globalThis }).window = globalThis;
  (globalThis as unknown as { location: { protocol: string; href: string } }).location = {
    protocol: "file:",
    href: "file:///tmp/ffmpeg-fallback-forensic/",
  };

  const wasmPath = join(
    REPO_ROOT,
    "node_modules",
    "@ffmpeg",
    "core",
    "dist",
    "esm",
    "ffmpeg-core.wasm",
  );
  const wasmBytes = readFileSync(wasmPath);
  const { default: createFFmpegCore } = await import("@ffmpeg/core");

  let stderr = "";
  const core = await (createFFmpegCore as (config: unknown) => Promise<{
    FS: {
      writeFile: (name: string, data: Uint8Array) => void;
      readFile: (name: string) => Uint8Array;
      unlink: (name: string) => void;
    };
    exec: (...args: string[]) => number;
  }>)({
    wasmBinary: wasmBytes,
    print: () => {},
    printErr: (m: string) => {
      stderr += m + "\n";
    },
  });

  const inputName = `in${containerExt}`;
  const outputName = `out${containerExt}`;
  try { core.FS.unlink(inputName); } catch { /* ignore */ }
  try { core.FS.unlink(outputName); } catch { /* ignore */ }

  core.FS.writeFile(inputName, inputBytes);
  const args = [
    "-i", inputName,
    // Keep video+audio in input order; drop data ("d"), subtitle
    // ("s"), and attachment ("t") streams. ffmpeg classifies tmcd
    // timecode tracks as data streams, so "-0:d?" covers them. See
    // strategy comments in
    // src/infrastructure/wasm/strategies/ffmpeg_fallback_strategy.ts.
    "-map", "0",
    "-map", "-0:d?",
    "-map", "-0:s?",
    "-map", "-0:t?",
    "-map_metadata", "-1",
    "-map_chapters", "-1",
    "-fflags", "+bitexact",
    "-c", "copy",
  ];
  const isMp4 = containerExt === ".mp4" || containerExt === ".mov" || containerExt === ".m4v";
  if (isMp4) args.push("-movflags", "+faststart");
  args.push("-metadata", "encoder=");
  args.push(outputName);

  const rc = core.exec(...args);
  let outputBytes = new Uint8Array(0);
  if (rc === 0) {
    try { outputBytes = new Uint8Array(core.FS.readFile(outputName)); } catch { /* ignore */ }
  }
  return { outputBytes, rc, stderr };
}

function recoveryBattery(
  output: Uint8Array,
  sentinelKeys: SentinelKey[],
  fingerprintStrings: readonly string[] = [],
): { survivors: SentinelKey[]; fingerprintSurvivors: string[] } {
  const survivors: SentinelKey[] = [];
  for (const key of sentinelKeys) {
    if (output.length > 0 && Buffer.from(output).includes(SENTINELS[key])) {
      survivors.push(key);
    }
  }
  const fingerprintSurvivors: string[] = [];
  for (const fp of fingerprintStrings) {
    if (output.length > 0 && Buffer.from(output).includes(fp)) {
      fingerprintSurvivors.push(fp);
    }
  }
  return { survivors, fingerprintSurvivors };
}

async function runSyntheticMp4(): Promise<FixtureResult> {
  const { bytes, landed } = buildSyntheticMp4();
  const { outputBytes, rc, stderr } = await runStrip(bytes, ".mp4");
  const { survivors, fingerprintSurvivors } = recoveryBattery(outputBytes, landed);
  return {
    kind: "synthetic-mp4", inputBytes: bytes.length, outputBytes: outputBytes.length,
    rc, stderr: stderr.slice(0, 400), survivors, deviceFingerprintSurvivors: fingerprintSurvivors,
  };
}

async function runSyntheticEbml(
  kind: "synthetic-mkv" | "synthetic-webm",
): Promise<FixtureResult> {
  const ext = kind === "synthetic-mkv" ? ".mkv" : ".webm";
  const { bytes, landed } = buildSyntheticEbml(ext);
  const { outputBytes, rc, stderr } = await runStrip(bytes, ext);
  const { survivors, fingerprintSurvivors } = recoveryBattery(outputBytes, landed);
  return {
    kind, inputBytes: bytes.length, outputBytes: outputBytes.length,
    rc, stderr: stderr.slice(0, 400), survivors, deviceFingerprintSurvivors: fingerprintSurvivors,
  };
}

async function runRealWorld(
  kind: "phone-baseline" | "gopro-fusion" | "dji-phantom4",
  filename: string,
  fingerprints: readonly string[],
): Promise<FixtureResult> {
  const path = join(REAL_WORLD_DIR, filename);
  if (!existsSync(path)) {
    return {
      kind, inputBytes: 0, outputBytes: 0, rc: -1, stderr: "",
      survivors: [], deviceFingerprintSurvivors: [],
      skipped: `fixture not present — run tools/forensic/fetch-video-fixtures.sh first`,
    };
  }
  const bytes = new Uint8Array(readFileSync(path));
  const ext = extname(filename) === ".mov" ? ".mov" : ".mp4";
  const { outputBytes, rc, stderr } = await runStrip(bytes, ext);
  const { fingerprintSurvivors } = recoveryBattery(outputBytes, [], fingerprints);
  return {
    kind, inputBytes: bytes.length, outputBytes: outputBytes.length,
    rc, stderr: stderr.slice(0, 400), survivors: [],
    deviceFingerprintSurvivors: fingerprintSurvivors,
  };
}

function reportRow(r: FixtureResult): void {
  if (r.skipped !== undefined) {
    console.log(` ${r.kind.padEnd(22)} SKIPPED — ${r.skipped}`);
    return;
  }
  const sizeDelta = `${r.inputBytes} → ${r.outputBytes} bytes`;
  const verdict =
    r.rc === 0 && r.survivors.length === 0 && r.deviceFingerprintSurvivors.length === 0
      ? "✓ clean"
      : r.rc !== 0
        ? `✗ rc=${r.rc}`
        : `✗ leaked: ${[...r.survivors, ...r.deviceFingerprintSurvivors].join(", ")}`;
  console.log(` ${r.kind.padEnd(22)} ${sizeDelta.padEnd(30)} ${verdict}`);
  if (r.stderr.length > 0 && r.rc !== 0) {
    console.log(` stderr: ${r.stderr.replace(/\n/g, " | ").slice(0, 200)}`);
  }
}

async function main(): Promise<void> {
  console.log("\nFfmpegFallbackStrategy forensic battery (Phase 1 + 2)");
  console.log("======================================================\n");

  const results: FixtureResult[] = [];

  console.log("Synthetic battery:");
  results.push(await runSyntheticMp4()); reportRow(results.at(-1)!);
  results.push(await runSyntheticEbml("synthetic-mkv")); reportRow(results.at(-1)!);
  results.push(await runSyntheticEbml("synthetic-webm")); reportRow(results.at(-1)!);

  console.log("\nReal-world battery (MP4 / MOV only — no MKV/WebM real-world fixtures yet):");
  results.push(await runRealWorld("phone-baseline", "phone-baseline.mp4", []));
  reportRow(results.at(-1)!);
  results.push(
    await runRealWorld(
      "gopro-fusion",
      "gopro-fusion.mp4",
      [
        "GoPro AVC", "gpmd", "GoPro AAC", "GoPro TCD",
        "GoPro MET", "GoPro SOS", "Fusion",
      ],
    ),
  );
  reportRow(results.at(-1)!);
  // DJI Phantom 4 — opt-in (248 MB, fetched via
  // `tools/forensic/fetch-video-fixtures.sh --include-large`). Per
  // manifest: 5 device-natural fingerprints + full GPS flight log under
  // UserData. Confirms the strategy handles drone files in addition to
  // action cams.
  results.push(
    await runRealWorld(
      "dji-phantom4",
      "dji-phantom4.mov",
      [
        "FC6310", // drone model — UserData
        "AVC encoder", // compressorname — sample entry
        "DJI.AVC", // video handler description
        "DJI.Meta", // metadata handler description
        "55 deg", // GPS latitude (Denmark flight log)
      ],
    ),
  );
  reportRow(results.at(-1)!);

  console.log("\n======================================================");
  const failed = results.filter(
    (r) =>
      r.skipped === undefined &&
      (r.rc !== 0 || r.survivors.length > 0 || r.deviceFingerprintSurvivors.length > 0),
  );
  const unexpectedFailures = failed.filter(
    (r) => !KNOWN_GAPS.has(`${r.kind}:${r.rc !== 0 ? "rc-fail" : "leak"}`),
  );

  if (unexpectedFailures.length === 0) {
    console.log("✓ PASS — zero unexpected sentinel/fingerprint survival.\n");
    const reportPath = join(WORK_DIR, "report.json");
    writeFileSync(reportPath, JSON.stringify({ results, KNOWN_GAPS: [...KNOWN_GAPS] }, null, 2));
|
||||
console.log(`report: ${reportPath}\n`);
|
||||
process.exit(0);
|
||||
} else {
|
||||
console.log("✗ FAIL — unexpected survivors:");
|
||||
for (const r of unexpectedFailures) {
|
||||
console.log(` ${r.kind}: rc=${r.rc} survivors=[${[...r.survivors, ...r.deviceFingerprintSurvivors].join(", ")}]`);
|
||||
}
|
||||
process.exit(1);
|
||||
}
|
||||
}
|
||||
|
||||
main().catch((err) => {
|
||||
console.error("forensic battery crashed:", err);
|
||||
process.exit(2);
|
||||
});
|
||||
|
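The pass/fail gate at the end of `main()` reduces to a small classification step: a row is ignored when skipped, counted as `rc-fail` when the exit code is non-zero, counted as `leak` when anything survived, and only reported if its `kind:class` key is absent from `KNOWN_GAPS`. A minimal sketch of that logic, assuming a trimmed-down result shape; `ResultLike`, `classify`, and `unexpectedFailures` are hypothetical names used for illustration, not identifiers from the tool itself.

```typescript
// Hypothetical, trimmed-down result shape: only the fields the gate needs.
interface ResultLike {
  kind: string;
  rc: number;
  survivors: string[];
  deviceFingerprintSurvivors: string[];
  skipped?: string;
}

type FailureClass = "rc-fail" | "leak";

// Returns undefined for clean or skipped rows, otherwise the failure class.
// A non-zero exit code takes precedence over survivors, matching the
// `${kind}:${rc !== 0 ? "rc-fail" : "leak"}` key used in the battery.
function classify(r: ResultLike): FailureClass | undefined {
  if (r.skipped !== undefined) return undefined;
  if (r.rc !== 0) return "rc-fail";
  const leaked = r.survivors.length + r.deviceFingerprintSurvivors.length;
  return leaked > 0 ? "leak" : undefined;
}

// Keeps only failures whose `kind:class` key is not a documented gap.
function unexpectedFailures(
  results: ResultLike[],
  knownGaps: ReadonlySet<string>,
): ResultLike[] {
  return results.filter((r) => {
    const cls = classify(r);
    return cls !== undefined && !knownGaps.has(`${r.kind}:${cls}`);
  });
}
```

Splitting classification from filtering keeps the known-gap key format in one place, so a row can only be excused for the specific failure class that was documented.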
|
@ -8,6 +8,7 @@ import {
|
|||
rmdirSync,
|
||||
existsSync,
|
||||
} from "node:fs";
|
||||
import { gzipSync } from "node:zlib";
|
||||
import { resolve } from "node:path";
|
||||
import type { Plugin } from "vite";
|
||||
|
||||
|
|
@ -117,6 +118,14 @@ export default defineConfig({
|
|||
// with extra files. The plugin above strips the corresponding <link> tags.
|
||||
publicDir: false,
|
||||
base: "./",
|
||||
// Build-time flag consumed by ffmpeg_wasm_fetch.ts to tree-shake the
|
||||
// bare `import("@ffmpeg/core")` PWA branch. Without this, Vite statically
|
||||
// bundles the ~110 KB factory + its ~43 MB data: URL wasm fallback into
|
||||
// the single-file HTML even though readInlinedCore() returns first at
|
||||
// runtime. See the comment block above resolveCore() for the rationale.
|
||||
define: {
|
||||
__WITH_STANDALONE_INLINE__: "true",
|
||||
},
|
||||
build: {
|
||||
outDir: resolve(__dirname, "dist/web-standalone"),
|
||||
emptyOutDir: true,
|
||||
|
|
@ -153,15 +162,20 @@ export default defineConfig({
|
|||
// - viteSingleFile: inlines JS/CSS into the HTML.
|
||||
// - standaloneHtmlFixupPlugin: rewrites the inlined script tag's
|
||||
// attributes (singlefile preserves `type="module"`).
|
||||
// - standaloneWasmInlinePlugin: injects the WASM bytes as a
|
||||
// `<script type="text/plain" id="zeroperl-wasm-base64">` tag in
|
||||
// the HTML, read by `redirectWasmFetch` on first WASM request.
|
||||
// - standaloneInlineWasmsPlugin: injects zeroperl.wasm AND ffmpeg-core
|
||||
// (.js + .wasm) as `<script type="text/plain">` tags in a single
|
||||
// read+write of the HTML. Merged into one plugin so the injection
|
||||
// sequence is explicit and not dependent on Rollup's
|
||||
// hookParallel(closeBundle) semantics — two plugins each doing
|
||||
// read+write of the same file would race the moment either hook
|
||||
// body grows an `await`.
|
||||
plugins: [
|
||||
react(),
|
||||
standaloneWasmStubPlugin(),
|
||||
standaloneFfmpegStubPlugin(),
|
||||
viteSingleFile(),
|
||||
standaloneHtmlFixupPlugin(),
|
||||
standaloneWasmInlinePlugin(),
|
||||
standaloneInlineWasmsPlugin(),
|
||||
],
|
||||
});
|
||||
|
||||
|
|
@ -192,104 +206,196 @@ function standaloneWasmStubPlugin(): Plugin {
|
|||
};
|
||||
}
|
||||
|
||||
// The ExifTool fallback / diff strategies load zeroperl.wasm via a
|
||||
// `?url` import. viteSingleFile only inlines JS/CSS chunks; large asset
|
||||
// files like .wasm get emitted as siblings even when assetsInlineLimit is
|
||||
// high. So we hand-stash the WASM here.
|
||||
// Same pattern as standaloneWasmStubPlugin, but for ffmpeg-core (JS + WASM).
|
||||
// Worker JS is NOT inlined because we run ffmpeg-core in the main thread —
|
||||
// the @ffmpeg/ffmpeg wrapper would spawn a type:"module" Web Worker from a
|
||||
// Blob URL, which fails silently when the page origin is `null` (the
|
||||
// standalone HTML's file:// case). See ffmpeg_fallback_strategy.ts for the
|
||||
// architectural rationale.
|
||||
function standaloneFfmpegStubPlugin(): Plugin {
|
||||
const STUBS = new Map<string, string>([
|
||||
["@ffmpeg/core?url", "\0virtual:standalone-ffmpeg-core-js-url"],
|
||||
["@ffmpeg/core/wasm?url", "\0virtual:standalone-ffmpeg-core-wasm-url"],
|
||||
]);
|
||||
const SENTINELS: Record<string, string> = {
|
||||
"\0virtual:standalone-ffmpeg-core-js-url": "inline:ffmpeg-core-js",
|
||||
"\0virtual:standalone-ffmpeg-core-wasm-url": "inline:ffmpeg-core-wasm",
|
||||
};
|
||||
return {
|
||||
name: "standalone-ffmpeg-stub",
|
||||
enforce: "pre",
|
||||
resolveId(id) {
|
||||
const virtual = STUBS.get(id);
|
||||
return virtual ?? undefined;
|
||||
},
|
||||
load(id) {
|
||||
const sentinel = SENTINELS[id];
|
||||
if (sentinel === undefined) return undefined;
|
||||
return `export default "${sentinel}";`;
|
||||
},
|
||||
};
|
||||
}
|
||||
|
||||
// Merged inline plugin for ALL WASM assets that need to be stashed in the
|
||||
// standalone HTML. Previously this was two separate plugins
|
||||
// (standaloneWasmInlinePlugin + standaloneFfmpegInlinePlugin), each with its
|
||||
// own closeBundle hook doing read+write of the same dist/web-standalone/
|
||||
// index.html. That worked only because both hook bodies were fully
|
||||
// synchronous (sync fs calls) — Rollup invokes closeBundle hooks via
|
||||
// hookParallel(), and any future `await` inside either hook would let the
|
||||
// second plugin read a stale HTML mid-mutation and clobber the first
|
||||
// plugin's injection. Merging into one closeBundle removes the ordering
|
||||
// hazard and reduces three reads + two writes of index.html to one each.
|
||||
//
|
||||
// Why this matters for the standalone target specifically: the standalone
|
||||
// HTML is opened via `file://`. Chromium-based browsers block cross-file `fetch()`
|
||||
// from `file://` origins by default. A sibling .wasm would silently fail
|
||||
// to load on Chrome/Edge/Brave under file://, breaking every diff view
|
||||
// and the .webp/.gif/.avif strip paths.
|
||||
// What lands in the HTML, in order:
|
||||
//
|
||||
// What we used to do (chunk B / B.1 early):
|
||||
// Substitute every `./assets/zeroperl-<hash>.wasm` URL in the inlined JS
|
||||
// with a `data:application/wasm;base64,…` URL. Problem: the resulting
|
||||
// Base64 string is a MODULE-SCOPE STRING LITERAL in the JS bundle. V8
|
||||
// allocates it eagerly during module parse — ~500-1500ms blocking the
|
||||
// page load before first paint on a 33 MB Base64 payload. That's the
|
||||
// regression the user reported.
|
||||
// 1. zeroperl.wasm
|
||||
// The ExifTool fallback / diff strategies load zeroperl.wasm via a
|
||||
// `?url` import. viteSingleFile only inlines JS/CSS chunks; large
|
||||
// asset files like .wasm get emitted as siblings even when
|
||||
// assetsInlineLimit is high.
|
||||
//
|
||||
// What we do now:
|
||||
// Inject the Base64 as a `<script type="text/plain" id="zeroperl-wasm-base64">…
|
||||
// </script>` tag in the HTML body BEFORE the module script. The HTML
|
||||
// parser stores the textContent in the DOM but does NOT parse the
|
||||
// contents as JavaScript, so V8's module-parse cost drops from 33 MB
|
||||
// to ~150 KB (the wrapper code). On first WASM request, the wrapper's
|
||||
// `redirectWasmFetch` helper reads the textContent and decodes it via
|
||||
// `fetch(data:URL)` (browser-native Base64 path). Same total disk I/O,
|
||||
// ~500-1500ms shaved off time-to-interactive.
|
||||
// Why this matters for the standalone target specifically: the
|
||||
// standalone HTML is opened via `file://`. Chromium-based browsers block
|
||||
// cross-file `fetch()` from `file://` origins by default. A sibling
|
||||
// .wasm would silently fail to load on Chrome/Edge/Brave under
|
||||
// file://, breaking every diff view and the .webp/.gif/.avif strip
|
||||
// paths.
|
||||
//
|
||||
// PWA / APK builds keep the sibling asset (they don't hit the file://
|
||||
// CORS constraints and `runtimeCaching` handles repeat loads).
|
||||
// See docs/superpowers/specs/2026-05-21-issue-22-diff-pivot-design.md §8.1
|
||||
// for the original tradeoff discussion.
|
||||
function standaloneWasmInlinePlugin(): Plugin {
|
||||
// What we used to do (chunk B / B.1 early):
|
||||
// Substitute every `./assets/zeroperl-<hash>.wasm` URL in the
|
||||
// inlined JS with a `data:application/wasm;base64,…` URL.
|
||||
// Problem: the resulting Base64 string is a MODULE-SCOPE STRING
|
||||
// LITERAL in the JS bundle. V8 allocates it eagerly during module
|
||||
// parse — ~500-1500ms blocking the page load before first paint
|
||||
// on a 33 MB Base64 payload. That's the regression the user
|
||||
// reported.
|
||||
//
|
||||
// What we do now:
|
||||
// Inject the Base64 as a
|
||||
// `<script type="text/plain" id="zeroperl-wasm-base64">…</script>`
|
||||
// tag in the HTML body BEFORE the module script. The HTML parser
|
||||
// stores the textContent in the DOM but does NOT parse the
|
||||
// contents as JavaScript, so V8's module-parse cost drops from
|
||||
// 33 MB to ~150 KB (the wrapper code). On first WASM request, the
|
||||
// wrapper's `redirectWasmFetch` helper reads the textContent and
|
||||
// decodes it via `fetch(data:URL)` (browser-native Base64 path).
|
||||
// Same total disk I/O, ~500-1500ms shaved off time-to-interactive.
|
||||
//
|
||||
// PWA / APK builds keep the sibling asset (they don't hit the
|
||||
// file:// CORS constraints and `runtimeCaching` handles repeat
|
||||
// loads). See
|
||||
// docs/superpowers/specs/2026-05-21-issue-22-diff-pivot-design.md
|
||||
// §8.1 for the original tradeoff discussion.
|
||||
//
|
||||
// 2. ffmpeg-core.js + ffmpeg-core.wasm
|
||||
// ffmpeg_wasm_fetch.ts reads from the same DOM IDs at runtime (see
|
||||
// readInlinedCore there) and feeds the WASM bytes directly to
|
||||
// createFFmpegCore({wasmBinary}). Without this:
|
||||
// - ffmpeg-core.wasm (30.7 MB) gets emitted as
|
||||
// `dist/web-standalone/assets/ffmpeg-core-<hash>.wasm` —
|
||||
// defeats the single-file deliverable and fails to load under file://
|
||||
// CORS rules.
|
||||
// - ffmpeg-core.js similarly emits as a sibling.
|
||||
//
|
||||
// Gzip cuts each payload roughly 3× before base64 overhead (wasm compresses well:
|
||||
// lots of repeated LEB128 patterns + symbol tables; ffmpeg's 30.7 MB
|
||||
// wasm → ~10 MB gz → ~13 MB base64). The runtime decoders in
|
||||
// exiftool_wasm_fetch.ts / ffmpeg_wasm_fetch.ts pipe the bytes through
|
||||
// DecompressionStream("gzip") at first use. HTML-parse cost at page
|
||||
// load scales with text-node size, so shrinking the inlined string is
|
||||
// the start-time win.
|
||||
//
|
||||
// Base64 alphabet (A-Z a-z 0-9 + / =) contains no HTML-special
|
||||
// characters, so direct embedding in a <script> body is safe without
|
||||
// escaping.
|
||||
function standaloneInlineWasmsPlugin(): Plugin {
|
||||
const outDir = resolve(__dirname, "dist/web-standalone");
|
||||
const htmlPath = resolve(outDir, "index.html");
|
||||
const assetsDir = resolve(outDir, "assets");
|
||||
// Source the WASM directly from node_modules: standaloneWasmStubPlugin
|
||||
// intercepts the `?url` import and replaces it with a sentinel, so Vite
|
||||
// never sees the asset and never emits it. We read the bytes here at
|
||||
// closeBundle time and stash them as a <script type="text/plain"> tag
|
||||
// in the HTML.
|
||||
// Sourced from the package's ESM export path (matches what the `?url`
|
||||
// import would have resolved to).
|
||||
const wasmSourcePath = resolve(
|
||||
__dirname,
|
||||
"node_modules/@6over3/zeroperl-ts/dist/esm/zeroperl.wasm",
|
||||
);
|
||||
// Source assets directly from node_modules: the stub plugins intercept
|
||||
// the `?url` imports and replace them with sentinels, so Vite never
|
||||
// sees these assets and never emits them. We read the bytes here at
|
||||
// closeBundle time and stash them as <script type="text/plain"> tags.
|
||||
// Sourced from each package's ESM export path (matches what the `?url`
|
||||
// imports would have resolved to).
|
||||
const INLINE_ASSETS: ReadonlyArray<{
|
||||
label: string;
|
||||
domId: string;
|
||||
source: string;
|
||||
}> = [
|
||||
{
|
||||
label: "zeroperl.wasm",
|
||||
domId: "zeroperl-wasm-base64",
|
||||
source: resolve(
|
||||
__dirname,
|
||||
"node_modules/@6over3/zeroperl-ts/dist/esm/zeroperl.wasm",
|
||||
),
|
||||
},
|
||||
{
|
||||
label: "ffmpeg-core.js",
|
||||
domId: "ffmpeg-core-js-base64",
|
||||
source: resolve(
|
||||
__dirname,
|
||||
"node_modules/@ffmpeg/core/dist/esm/ffmpeg-core.js",
|
||||
),
|
||||
},
|
||||
{
|
||||
label: "ffmpeg-core.wasm",
|
||||
domId: "ffmpeg-core-wasm-base64",
|
||||
source: resolve(
|
||||
__dirname,
|
||||
"node_modules/@ffmpeg/core/dist/esm/ffmpeg-core.wasm",
|
||||
),
|
||||
},
|
||||
];
|
||||
return {
|
||||
name: "standalone-wasm-inline",
|
||||
name: "standalone-inline-wasms",
|
||||
closeBundle() {
|
||||
if (!existsSync(wasmSourcePath)) {
|
||||
throw new Error(
|
||||
`standaloneWasmInlinePlugin: zeroperl.wasm not found at ` +
|
||||
`${wasmSourcePath}. Check @6over3/zeroperl-ts dependency.`,
|
||||
);
|
||||
}
|
||||
const bytes = readFileSync(wasmSourcePath);
|
||||
const base64 = bytes.toString("base64");
|
||||
|
||||
// 1. Read HTML once.
|
||||
let html = readFileSync(htmlPath, "utf8");
|
||||
|
||||
// Inject the WASM payload as a <script type="text/plain"> tag
|
||||
// before the module script. The browser stores the textContent in
|
||||
// the DOM but does NOT parse it as JavaScript, so V8's
|
||||
// module-parse cost stays bounded to the small wrapper code
|
||||
// (~150 KB) instead of paying ~500-1500ms to allocate a 33 MB
|
||||
// Base64 string as a module-scope literal at page load.
|
||||
//
|
||||
// On first WASM request the `redirectWasmFetch` helper reads
|
||||
// the textContent and decodes via `fetch(data:URL)` (browser's
|
||||
// native Base64 path).
|
||||
//
|
||||
// Base64 alphabet (A-Z a-z 0-9 + / =) contains no HTML-special
|
||||
// characters, so direct embedding in a <script> body is safe
|
||||
// without escaping.
|
||||
const moduleScriptMarker = '<script type="module">';
|
||||
if (!html.includes(moduleScriptMarker)) {
|
||||
throw new Error(
|
||||
`standaloneWasmInlinePlugin: could not find <script type="module"> ` +
|
||||
`standaloneInlineWasmsPlugin: could not find <script type="module"> ` +
|
||||
`in HTML. viteSingleFile may have changed its inline-script ` +
|
||||
`shape; the inline-tag injection point needs updating.`,
|
||||
);
|
||||
}
|
||||
const inlineTag = `<script type="text/plain" id="zeroperl-wasm-base64">${base64}</script>`;
|
||||
|
||||
// 2. Read + gzip + base64 each asset; accumulate inline tags.
|
||||
let injected = "";
|
||||
const summaryLines: string[] = [];
|
||||
for (const asset of INLINE_ASSETS) {
|
||||
if (!existsSync(asset.source)) {
|
||||
throw new Error(
|
||||
`standaloneInlineWasmsPlugin: ${asset.label} not found at ` +
|
||||
`${asset.source}. Check the corresponding dependency install.`,
|
||||
);
|
||||
}
|
||||
const bytes = readFileSync(asset.source);
|
||||
const gzipped = gzipSync(bytes, { level: 9 });
|
||||
const base64 = gzipped.toString("base64");
|
||||
injected += `<script type="text/plain" id="${asset.domId}">${base64}</script>\n`;
|
||||
summaryLines.push(
|
||||
` ${asset.label}: ${bytes.length} → ${gzipped.length} bytes gzipped ` +
|
||||
`(base64 ${base64.length} bytes)`,
|
||||
);
|
||||
}
|
||||
|
||||
// 3. Write HTML once with all injections, then log a single summary.
|
||||
html = html.replace(
|
||||
moduleScriptMarker,
|
||||
`${inlineTag}\n${moduleScriptMarker}`,
|
||||
`${injected}${moduleScriptMarker}`,
|
||||
);
|
||||
|
||||
console.log(
|
||||
`standaloneWasmInlinePlugin: stashed zeroperl.wasm in <script type="text/plain"> (${bytes.length} bytes)`,
|
||||
);
|
||||
|
||||
writeFileSync(htmlPath, html);
|
||||
console.log(
|
||||
`standaloneInlineWasmsPlugin: stashed ${INLINE_ASSETS.length} assets in ` +
|
||||
`<script type="text/plain"> tags\n${summaryLines.join("\n")}`,
|
||||
);
|
||||
|
||||
// Defensive: if Vite ever emits assets/ siblings again, clean
|
||||
// them up to keep the standalone-output to exactly one file.
|
||||
// 4. Defensive: if Vite ever emits assets/ siblings again, clean
|
||||
// them up to keep the standalone output to exactly one file.
|
||||
if (existsSync(assetsDir) && readdirSync(assetsDir).length === 0) {
|
||||
rmdirSync(assetsDir);
|
||||
}
|
||||
|
|
|
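The size arithmetic and the round trip behind the gzip + base64 inlining can be checked in isolation. Base64 emits 4 characters per 3 input bytes, which is why roughly 10 MB of gzip output becomes roughly 13 MB of text. A minimal sketch under Node, assuming hypothetical helpers `base64Length`, `encodeInlineTag`, and `decodeInlineTag`; `gunzipSync` stands in here for the browser-side `DecompressionStream("gzip")` that the comments describe, so this is not the project's runtime decoder.

```typescript
import { gzipSync, gunzipSync } from "node:zlib";

// Length of the padded base64 encoding of `byteLength` bytes.
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

// Build-side step: gzip the bytes, base64 them, wrap them in an inert tag.
// The base64 alphabet has no HTML-special characters, so no escaping is needed.
function encodeInlineTag(domId: string, bytes: Uint8Array): string {
  const gzipped = gzipSync(bytes, { level: 9 });
  const base64 = gzipped.toString("base64");
  return `<script type="text/plain" id="${domId}">${base64}</script>`;
}

// Illustrative inverse (Node stand-in for the browser decoder): pull the
// text between the tags, base64-decode it, then gunzip.
function decodeInlineTag(tag: string): Uint8Array {
  const start = tag.indexOf(">") + 1;
  const end = tag.lastIndexOf("</script>");
  const gzipped = Buffer.from(tag.slice(start, end), "base64");
  return new Uint8Array(gunzipSync(gzipped));
}
```

With these helpers, `base64Length(10_000_000)` evaluates to 13,333,336, in line with the "~10 MB gz → ~13 MB base64" figure quoted in the plugin comment.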
|||
|
|
@ -54,6 +54,15 @@ export default defineConfig({
|
|||
outDir: resolve(__dirname, "dist/web"),
|
||||
emptyOutDir: true,
|
||||
},
|
||||
// Build-time flag consumed by ffmpeg_wasm_fetch.ts. In the PWA / APK
|
||||
// build the inlined `<script type="text/plain">` ffmpeg tags don't
|
||||
// exist, so readInlinedCore() returns null and we MUST reach the bare
|
||||
// `import("@ffmpeg/core")` branch. Setting this to `false` here keeps
|
||||
// Rollup from tree-shaking that branch. The standalone config sets it
|
||||
// to `true` to drop ~43 MB from the single-file HTML.
|
||||
define: {
|
||||
__WITH_STANDALONE_INLINE__: "false",
|
||||
},
|
||||
plugins: [
|
||||
react(),
|
||||
webCspPlugin(),
|
||||
|
|
|
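The two `define` blocks set the same flag to opposite values so that one source file yields two different bundles. A minimal sketch of how such a flag can guard a branch, assuming a hypothetical `resolveCoreSketch` and a hypothetical `readInlined` callback standing in for `readInlinedCore()`; the error thrown in the standalone case is an assumption made for illustration, and the flag is an ordinary constant here so the sketch runs on its own.

```typescript
// Stand-in for the build-time flag. In a real build, Vite's `define` would
// substitute a literal, which lets the bundler drop the dead branch.
const WITH_STANDALONE_INLINE: boolean = true;

type CoreSource =
  | { from: "inline"; js: string }
  | { from: "package" };

// `readInlined` is a hypothetical stand-in for readInlinedCore(): it returns
// the inlined factory source, or null when the tags are absent.
function resolveCoreSketch(readInlined: () => string | null): CoreSource {
  const inlined = readInlined();
  if (inlined !== null) return { from: "inline", js: inlined };
  if (WITH_STANDALONE_INLINE) {
    // Standalone build: the package branch below is never meant to run,
    // so a missing tag is treated as a build error in this sketch.
    throw new Error("inlined ffmpeg-core tags are missing");
  }
  // PWA / APK build: the real code would reach import("@ffmpeg/core") here.
  return { from: "package" };
}
```

Because the flag is known at build time, the bundler can prove which branch is dead in each configuration, which is what keeps the package import (and its wasm fallback) out of the single-file HTML.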
|||
|
|
@ -1010,6 +1010,11 @@
|
|||
resolved "https://registry.yarnpkg.com/@esbuild/win32-x64/-/win32-x64-0.27.3.tgz#0eaf705c941a218a43dba8e09f1df1d6cd2f1f17"
|
||||
integrity sha512-4uJGhsxuptu3OcpVAzli+/gWusVGwZZHTlS63hh++ehExkVT8SgiEf7/uC/PclrPPkLhZqGgCTjd0VWLo6xMqA==
|
||||
|
||||
"@ffmpeg/core@0.12.10":
|
||||
version "0.12.10"
|
||||
resolved "https://registry.yarnpkg.com/@ffmpeg/core/-/core-0.12.10.tgz#3177e88852bfbfaad5d258e9e0ac1fd9dffd3223"
|
||||
integrity sha512-dzNplnn2Nxle2c2i2rrDhqcB19q9cglCkWnoMTDN9Q9l3PvdjZWd1HfSPjCNWc/p8Q3CT+Es9fWOR0UhAeYQZA==
|
||||
|
||||
"@ionic/cli-framework-output@^2.2.8":
|
||||
version "2.2.8"
|
||||
resolved "https://registry.yarnpkg.com/@ionic/cli-framework-output/-/cli-framework-output-2.2.8.tgz#29d541acc7773a6aaceec5f3b079937fbcef5402"
|
||||
|
|
|
|||