Rebrand to MetaScrub + upstream attribution (#95)
All checks were successful
CI / Lint, Typecheck & Unit Tests (push) Successful in 38s
CI / E2E (Standalone single-file) (push) Successful in 2m0s
CI / E2E (Web) (push) Successful in 3m48s

Follow-up to #93 (Phase G). The "ExifCleaner" name no longer reflects what the project does — v5 strips PDF Info dicts, Office docProps, MP4 atoms, and JPEG/PNG markers, not just EXIF. The "wraps ExifTool" framing died in Phase D; Phase G drove the last nail. This PR renames the user-facing surface to **MetaScrub** and adds prominent attribution to the [szTheory/exifcleaner](https://github.com/szTheory/exifcleaner) upstream.

## What changed

**Brand surfaces renamed:**

- `package.json` — `name`, `productName`, `description` (dropped the no-longer-controlled `exifcleaner.com` author URL)
- `src/web/index.html` — page title
- `public/manifest.webmanifest` — PWA name / short_name / description
- `.resources/strings.json` — unsupported-RAW copy + zip filename template
- `src/infrastructure/web/batch_output.ts` — zip filename
- All user-facing docs (`README.md`, `CLAUDE.md`, `CHANGELOG.md`, `docs/architecture.md`, `docs/PRIVACY_GAPS.md`, `docs/android-apk.md`, `docs/animation-principles.md`, `docs/deploying.md`)
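
For reviewers who want to see what the renamed zip filename template expands to, here is a minimal sketch of placeholder substitution. The template string is the one from `.resources/strings.json`; the `fillTemplate` helper and the ISO-style date value are assumptions made for illustration, not the code in `batch_output.ts`.

```typescript
// Template as it appears in .resources/strings.json after this PR.
const ZIP_FILENAME_TEMPLATE = "metascrub-{folder}-{date}.zip";

// Hypothetical helper: replaces each {name} with values[name].
// Unknown placeholders are left untouched so a typo stays visible in the output.
function fillTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (placeholder: string, name: string) => {
    const value = Object.prototype.hasOwnProperty.call(values, name)
      ? values[name]
      : undefined;
    return value ?? placeholder;
  });
}

// The date format here is an assumption; the PR does not specify one.
const exampleZipName = fillTemplate(ZIP_FILENAME_TEMPLATE, {
  folder: "holiday",
  date: "2026-05-14",
});
console.log(exampleZipName);
```

Leaving unknown placeholders in place is a deliberate choice in this sketch: a missing value shows up as a literal `{name}` in the filename rather than silently disappearing.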

**Attribution:**

- README lead now carries a "Forked from [szTheory/exifcleaner](https://github.com/szTheory/exifcleaner)" banner under the title.
- New `## Credits` section at the bottom of the README walks through the lineage v3.6 → v4 (modernization) → v5 Phase A–G (WASM strategies, web-only, Electron retirement) → MetaScrub.
- The history notes in `CLAUDE.md` and `docs/architecture.md` also link to upstream.

**localStorage migration (no data loss):**

- New key: `metascrub-settings-v1`
- One-time shim reads `exifcleaner-settings-v1` on first launch and copies it forward, then clears the legacy key. Existing PWA users keep their toggles. Covered by 4 new tests in `tests/renderer/infrastructure/web_api.test.ts`.
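
The copy-forward behaviour described above can be sketched roughly as follows. The two key names come from this PR; the `KeyValueStore` interface, the function names, and the in-memory store are hypothetical stand-ins so the example runs outside a browser, and the real shim in the web API layer may be structured differently.

```typescript
// Minimal subset of the Web Storage API that the shim needs.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const LEGACY_SETTINGS_KEY = "exifcleaner-settings-v1";
const CURRENT_SETTINGS_KEY = "metascrub-settings-v1";

// Hypothetical helper: returns the raw settings string, migrating on first read.
function readSettingsWithMigration(store: KeyValueStore): string | null {
  const current = store.getItem(CURRENT_SETTINGS_KEY);
  if (current !== null) {
    return current; // already migrated, or a fresh post-rebrand install
  }
  const legacy = store.getItem(LEGACY_SETTINGS_KEY);
  if (legacy === null) {
    return null; // nothing stored under either key
  }
  store.setItem(CURRENT_SETTINGS_KEY, legacy); // copy forward first
  store.removeItem(LEGACY_SETTINGS_KEY); // then clear the legacy key
  return legacy;
}

// In-memory stand-in for window.localStorage (assumption, for illustration only).
function createMemoryStore(initial: Record<string, string> = {}): KeyValueStore {
  const data = new Map<string, string>(Object.entries(initial));
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
    removeItem: (key) => {
      data.delete(key);
    },
  };
}
```

Writing the new key before removing the old one means an interruption between the two steps leaves the settings readable under at least one key. In a browser, `window.localStorage` satisfies the `KeyValueStore` shape and could be passed in directly.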

**Stale-URL cleanup bundled in:**

- Replaced all `https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/N` links across docs with bare `#N` refs (Forgejo auto-links these locally; the old GitHub URLs have been dead since the May-12 migration).
- Dropped the in-app HEIC follow-up link — it was hard-coding `http://localhost:3000/forgejo_admin/exifcleaner-web/issues/48`, which is a private dev URL useless to end users.
- Cleaned a stale `https://exifcleaner.app/` comment in `vite.config.web.ts`.

**Forensic-scanner update:**

- `tools/forensic/{png,office}.ts` stray-marker regexes now also catch `/metascrub/i`. The `exifcleaner` pattern is kept alongside so any pre-rebrand fingerprint regression also fails. See the §"Forensic verification" rationale in `format-strategy-workflow.md`.
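
As an illustration of the dual-pattern check, the sketch below scans output bytes for either brand string. Only the two patterns are taken from this PR; the `findStrayMarkers` function and its byte-to-text approach are assumptions and do not reproduce the actual scanner in `tools/forensic/`.

```typescript
// The two patterns come from the PR text; everything else here is illustrative.
const STRAY_MARKER_PATTERNS: readonly RegExp[] = [/metascrub/i, /exifcleaner/i];

// Hypothetical helper: returns the source of every pattern found in the output bytes.
function findStrayMarkers(bytes: Uint8Array): string[] {
  // Map each byte to one character so binary data never fails to decode.
  let text = "";
  bytes.forEach((byte) => {
    text += String.fromCharCode(byte);
  });
  // The patterns carry no `g` flag, so test() is stateless across calls.
  return STRAY_MARKER_PATTERNS.filter((re) => re.test(text)).map((re) => re.source);
}
```

A cleaned file should produce an empty array. Any non-empty result means the tool left its own fingerprint in the output, under either the new or the pre-rebrand name.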

## Deliberately NOT renamed

- **Internal identifiers**: `ExifError` type, `formatExifError`, `src/domain/exif/`, CSS `--ec-*` tokens. Not user-facing brand surfaces; renaming would be pervasive cosmetic churn.
- **CHANGELOG.md historical entries**: v4.0.0 and earlier entries describe events that happened when the project was named ExifCleaner. Treating them as historical record.
- **`docs/superpowers/specs/*` and `docs/superpowers/plans/*`**: same rationale — historical snapshots of design / planning work.
- **Cloudflare Pages project name** (`exifcleaner-web` in `.github/workflows/deploy-web.yml` + the corresponding lines in `docs/deploying.md`): tied to a live Cloudflare resource. Renaming would orphan the existing deploy URL. Inline comment in deploying.md notes the legacy name + that renaming is a CF-dashboard operation.

## Verification

- `yarn lint` ✓
- `yarn typecheck` ✓
- `yarn test` ✓ (318 tests, +4 for the migration shim, all passing)
- `yarn check:deps` ✓
- `yarn build:web` ✓
- `yarn build:web:standalone` ✓

## Test plan

- [ ] CI green on the PR
- [ ] After deploy, existing PWA installs see "MetaScrub" name on next launch; settings persist across the rename (migration shim runs on first read)
- [ ] Fresh PWA install shows MetaScrub end-to-end (title, manifest name, zip filenames)
- [ ] Drop an unsupported file → "We don't support {ext}" message no longer points anywhere clickable for HEIC (was leaking localhost URL)

## Rollback

Single PR, squash-merge. `git revert <merge-sha>` restores the ExifCleaner brand. The localStorage migration is one-way (the legacy key is removed after the copy), so after a revert the restored code would read `exifcleaner-settings-v1` again and find it empty for anyone who launched MetaScrub in the interim; those users would lose their saved settings on the next visit. Users who never opened the app between merge and revert would be unaffected. Low-impact in practice.

Co-authored-by: Randa <obuvuyoviz26@gmail.com>
Reviewed-on: http://localhost:3000/forgejo_admin/exifcleaner-web/pulls/95
forgejo_admin 2026-05-14 10:39:19 +04:00
parent c09c156973
commit 6779e59efd
26 changed files with 256 additions and 130 deletions


@ -1,6 +1,6 @@
# GitHub Context
**Snapshot of the upstream `szTheory/exifcleaner` community state.** This is a historical reference; active modernization development happens in `obuvuyoviz26-lab/exifcleaner-web` (the private migration repo). Many of the upstream issues catalogued here are addressed by the v5 architecture (Phase D shipped 2026-05-10) — see `modernization-roadmap.md` for the per-phase status.
**Snapshot of the upstream `szTheory/exifcleaner` community state.** This is a historical reference; active modernization development happens in this Forgejo repo. Many of the upstream issues catalogued here are addressed by the v5 architecture (Phase D shipped 2026-05-10) — see `modernization-roadmap.md` for the per-phase status.
Upstream snapshot updated: Feb 2025. Last upstream release: v3.6.0 (May 4, 2021). Last upstream commit: March 2022.


@ -32,13 +32,13 @@
"en": "Choose Files"
},
"error.pending.heic": {
"en": "HEIC support is on the way (tracked in #48). Convert to JPEG for now, or check back after the next release."
"en": "HEIC support is on the way. Convert to JPEG for now, or check back after the next release."
},
"error.unknown": {
"en": "We don't support {ext} files yet."
},
"error.unsupported.raw": {
"en": "RAW formats aren't supported in ExifCleaner. Try ExifTool standalone for RAW files."
"en": "RAW formats aren't supported in MetaScrub. Try ExifTool standalone for RAW files."
},
"folder.pickButton": {
"en": "Pick folder"
@ -1328,6 +1328,6 @@
"fr": "Effacer"
},
"folder.zipDownloadFilename": {
"en": "exifcleaner-{folder}-{date}.zip"
"en": "metascrub-{folder}-{date}.zip"
}
}


@ -1,5 +1,10 @@
# Changelog
## Unreleased
- **Rebrand to MetaScrub.** The project name changes from ExifCleaner to MetaScrub to reflect the broader format coverage (PDF, Office docProps, MP4 atoms — not just EXIF). Lineage: forked from [szTheory/exifcleaner](https://github.com/szTheory/exifcleaner). Internal CSS variables (`--ec-*`) and the `ExifError` type stay — they are not user-facing brand surfaces. The localStorage settings key migrates from `exifcleaner-settings-v1` to `metascrub-settings-v1` on first launch (existing settings carry over via a one-time shim).
- **Phase G — Electron retired.** PWA is the sole distribution channel. macOS xattr scrubbing documented as a gap in `docs/PRIVACY_GAPS.md`. See PR #93 / issue #80.
## 4.0.0
Complete modernization of ExifCleaner after a 5-year hiatus. Every layer of the application has been rebuilt — from Electron 11 to 35, vanilla DOM to React 19, loose scripts to DDD architecture, zero tests to 265 unit + 42 E2E tests.


@ -1,6 +1,6 @@
# ExifCleaner
# MetaScrub
Privacy-focused metadata stripper, shipped as a browser PWA. No Perl runtime, no server-side calls, no Electron shell. MIT license.
Privacy-focused metadata stripper, shipped as a browser PWA. No Perl runtime, no server-side calls, no Electron shell. MIT license. Forked from [szTheory/exifcleaner](https://github.com/szTheory/exifcleaner) and rebranded in v5; lineage notes in the README.
## Tech Stack


@ -1,6 +1,8 @@
# <img src="static/icon.svg" height=26> ExifCleaner
# <img src="static/icon.svg" height=26> MetaScrub
> Clean metadata from images, videos, PDFs, and other files. Runs entirely in your browser — no uploads, no server.
> Strip metadata from images, videos, PDFs, and Office documents. Runs entirely in your browser — no uploads, no server.
> **Forked from [szTheory/exifcleaner](https://github.com/szTheory/exifcleaner)** (MIT). Substantially rewritten across v4 (modernization) and v5 (Phase A–G: WASM strategy registry, web-only build, Electron retirement). Rebranded from ExifCleaner to MetaScrub in v5 to reflect coverage beyond EXIF (PDF, Office, MP4). See [Credits](#credits).
## Features
@ -17,11 +19,11 @@ See the [CHANGELOG](CHANGELOG.md) for release history.
## Project Direction
ExifCleaner is a browser PWA built on a single processing engine — WASM and pure-TypeScript format strategies that run entirely in the user's browser. Phase D (shipped 2026-05-10) collapsed the previous two-engine architecture onto one strategy registry; Phase G (shipped 2026-05-14, issue #80) retired the Electron desktop shell. The PWA is now the sole distribution channel.
MetaScrub is a browser PWA built on a single processing engine — WASM and pure-TypeScript format strategies that run entirely in the user's browser. Phase D (shipped 2026-05-10) collapsed the previous two-engine architecture onto one strategy registry; Phase G (shipped 2026-05-14, issue #80) retired the Electron desktop shell. The PWA is now the sole distribution channel.
Hand-rolled pure-TypeScript marker and chunk walkers cover documented containers (JPEG, PNG today, WebP/GIF/BMP/TIFF in flight). For ISOBMFF-based formats (HEIC, AVIF, MP4), the existing video-strategy box walker provides a foundation; a targeted Rust→WASM module is the second-line option only if a hand-rolled approach proves insufficient. Library evaluations under [`docs/poc/`](docs/poc/) showed the hand-rolled approach is smaller, more transparent, and more thorough than the maintained alternatives we tried (`little_exif` and `exiv2-wasm` both leave significant metadata behind on JPEG/PNG).
Server-side processing is **explicitly out of scope**. Uploading user files to a server, even as a "last resort fallback", would invalidate the privacy guarantee that defines this app — and "last resort" tends to drift to "default". Per-format size caps with explicit messaging ([issue #63](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/63)) cover the large-file edge case without ever reaching for a remote endpoint.
Server-side processing is **explicitly out of scope**. Uploading user files to a server, even as a "last resort fallback", would invalidate the privacy guarantee that defines this app — and "last resort" tends to drift to "default". Per-format size caps with explicit messaging (issue #63) cover the large-file edge case without ever reaching for a remote endpoint.
RAW formats are unsupported from v5 forward. ExifTool's RAW support represents roughly two decades of reverse-engineering on proprietary containers (CR2, CR3, NEF, ARW, RAF, ORF, DNG, and dozens of vendor variants), and no production-ready WASM library covers that surface. RAW workflows belong on dedicated tools — see [`docs/PRIVACY_GAPS.md`](docs/PRIVACY_GAPS.md#raw-unsupported) for context and alternatives.
@ -36,12 +38,12 @@ For "what's *partially* cleaned even when supported", see [`docs/PRIVACY_GAPS.md
| JPG, JPEG | Full¹ (hand-rolled walker) |
| PNG | Full¹ (hand-rolled walker) |
| GIF, WebP, BMP, TIFF | Unsupported² (hand-rolled walkers in flight) |
| HEIC, HEIF | Unsupported² ([issue #48](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/48) in flight — highest-priority deferred format) |
| HEIC, HEIF | Unsupported² (issue #48 in flight — highest-priority deferred format) |
| AVIF | Unsupported² |
| PDF | Best-effort³ |
| DOCX, XLSX, PPTX, ODT | Partial⁴ (WASM strategy) |
| MP4, MOV, M4V, 3GP, 3G2 | Partial⁵ (WASM strategy) |
| MKV | Unsupported ([issue #43](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/43), deferred to v6) |
| MKV | Unsupported (issue #43, deferred to v6) |
| RAW (CR2/CR3/NEF/ARW/RAF/ORF/DNG/...) | Unsupported⁶ |
| SVG, JXL, JPEG 2000, AVI | Unsupported |
@ -50,13 +52,13 @@ Footnotes:
1. JPEG and PNG: hand-rolled walkers. JPEG drops APP0–APP15 (except APP14 Adobe DCT) and the COM marker; PNG drops `tEXt`/`zTXt`/`iTXt`/`eXIf` chunks and other metadata-bearing ancillary chunks. Both preserve image data verbatim and mirror ExifTool's `-all=` policy. Per-format tables: [`docs/gap-analysis/jpeg.md`](docs/gap-analysis/jpeg.md), [`docs/gap-analysis/png.md`](docs/gap-analysis/png.md). Forensic verification: [`docs/forensic/jpeg.md`](docs/forensic/jpeg.md), [`docs/forensic/png.md`](docs/forensic/png.md).
2. Formats listed as Unsupported fall through with an explicit "unsupported" error in the UI. Hand-rolled marker/chunk walkers are the planned path; see [`docs/poc/`](docs/poc/) for the investigations that ruled out WASM library alternatives.
3. PDF: the strategy clears the Info dictionary (Title, Author, Subject, Keywords, Producer, Creator, CreationDate, ModDate), drops the catalog `/Metadata` XMP stream and its indirect object, scrubs annotation author/comment/timestamp keys, and removes catalog-level fingerprints (`/Lang`, `/PageLabels`, `/OutputIntents`) plus per-page `/Metadata` and `/Thumb`. Embedded files and AcroForm data are not touched (they may carry legitimate document content). The strip is structurally cleaner than ExifTool's PDF behaviour, which uses incremental updates and leaves the original metadata recoverable in the file body. Full analysis: [`docs/gap-analysis/pdf.md`](docs/gap-analysis/pdf.md). Forensic verification: [`docs/forensic/pdf.md`](docs/forensic/pdf.md).
4. Office: clears `docProps/{core,app,custom}.xml` and a thumbnail. Known partial coverage of tracked changes/comments, RSIDs, embedded media EXIF, `customXml/` parts, and file paths in `*.rels` — tracked under [issue #62 (Office Phase 2 hardening)](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/62). See [`docs/PRIVACY_GAPS.md`](docs/PRIVACY_GAPS.md) for the user-facing summary.
4. Office: clears `docProps/{core,app,custom}.xml` and a thumbnail. Known partial coverage of tracked changes/comments, RSIDs, embedded media EXIF, `customXml/` parts, and file paths in `*.rels` — tracked under issue #62 (Office Phase 2 hardening). See [`docs/PRIVACY_GAPS.md`](docs/PRIVACY_GAPS.md) for the user-facing summary.
5. MP4/MOV: drops `udta`, `meta`, and `Xtra` containers via mp4box.js box-tree rewrite (no re-encoding, lossless). Known gaps in timed-metadata tracks, `hdlr` names, `compressorname`, mdat orphans, and sidecar files — see [`docs/PRIVACY_GAPS.md`](docs/PRIVACY_GAPS.md#mp4--mov-video-gaps) for the user-facing summary.
6. RAW: removed in v5 (decided 2026-05-09, shipped 2026-05-10). No production-ready WASM library covers proprietary RAW. RAW workflows should use [ExifTool standalone](https://exiftool.org/) or a dedicated RAW tool — see [`docs/PRIVACY_GAPS.md#raw-unsupported`](docs/PRIVACY_GAPS.md#raw-unsupported).
## Running the web app locally
ExifCleaner runs entirely in your browser — no server-side processing, no file uploads.
MetaScrub runs entirely in your browser — no server-side processing, no file uploads.
### Option 1: Single-file HTML (no server, no install)
@ -72,10 +74,10 @@ This is a desktop deliverable (Chrome, Brave, Edge, Firefox, Safari). Chrome on
```bash
# Build the image
docker build -t exifcleaner-web .
docker build -t metascrub .
# Run on http://localhost:8080
docker run -p 8080:80 exifcleaner-web
docker run -p 8080:80 metascrub
```
Open http://localhost:8080. Drag and drop files to clean metadata.
@ -107,7 +109,7 @@ The `docs/` tree is organized around an **analyse → implement → verify** pat
- [`docs/architecture.md`](docs/architecture.md) — narrative architecture guide for new contributors: the build pipeline, an end-to-end trace of a file drop, the DDD layers, state management in the renderer, and a React primer aimed at backend devs. Start here if the codebase is new to you.
- [`docs/gap-analysis/`](docs/gap-analysis/) — per-format coverage analysis written *before* implementation. Each writeup compares current state vs reference implementations (typically ExifTool) vs what's theoretically possible, and locks in the marker/chunk policy. Currently: `jpeg.md`, `pdf.md`, `png.md`.
- [`docs/poc/`](docs/poc/) — library evaluation writeups for approaches considered and ruled out, with bundle sizes and coverage tables. Currently: `little-exif-wasm.md`, `exiv2-wasm.md`.
- [`docs/forensic/`](docs/forensic/) — adversarial recovery tests run *after* implementation lands, with reproducible runners under [`tools/forensic/`](tools/forensic/). Tests embed sentinel strings, strip multiple ways, and compare survivors across recovery techniques. Currently: `jpeg.md`, `pdf.md`, `png.md`. Office and Video forensic writeups are in flight ([issues #64, #65](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/64)).
- [`docs/forensic/`](docs/forensic/) — adversarial recovery tests run *after* implementation lands, with reproducible runners under [`tools/forensic/`](tools/forensic/). Tests embed sentinel strings, strip multiple ways, and compare survivors across recovery techniques. Currently: `jpeg.md`, `pdf.md`, `png.md`. Office and Video forensic writeups are in flight (issues #64, #65).
- [`docs/PRIVACY_GAPS.md`](docs/PRIVACY_GAPS.md) — the inverse of `forensic/`: known cases where the privacy guarantee bends (RAW unsupported, MP4 timed-metadata tracks, sidecar files, etc.). Required reading for anyone touching format strategies.
- [`docs/deploying.md`](docs/deploying.md) — deployment guide for the web app: Cloudflare Pages, self-hosted Docker (with Cloudflare Tunnel, VPS + nginx/Caddy, or Tailscale Funnel), and PWA install on Android/iOS.
@ -118,8 +120,6 @@ Built with [React 19](https://react.dev) and [TypeScript 5.7](https://www.typesc
### Run the app in dev mode
```bash
git clone https://github.com/szTheory/exifcleaner.git
cd exifcleaner
yarn install
yarn dev
```
@ -139,7 +139,7 @@ yarn typecheck # TypeScript strict mode check
### Contributing a Format Strategy
A `FormatStrategy` is a pure function that takes file bytes and returns cleaned bytes for one or more file extensions. Strategies are the single processing pipeline ExifCleaner ships, with no Perl ExifTool dependency.
A `FormatStrategy` is a pure function that takes file bytes and returns cleaned bytes for one or more file extensions. Strategies are the single processing pipeline MetaScrub ships, with no Perl ExifTool dependency.
The interface lives at [`src/infrastructure/wasm/format_strategy.ts`](src/infrastructure/wasm/format_strategy.ts):
@ -181,3 +181,16 @@ For broader context on the analysis-then-implementation pattern, the existing st
See [`docs/deploying.md`](docs/deploying.md) for the deployment paths (Cloudflare Pages, Docker, static hosting).
## Credits
MetaScrub began as **[ExifCleaner](https://github.com/szTheory/exifcleaner)** by [szTheory](https://github.com/szTheory) and contributors — first released in 2019 as an Electron desktop wrapper around ExifTool, with translations and platform support contributed by the community over five years. v3.6.0 (May 2021) was the last upstream release before this fork.
The codebase has been substantially rewritten since:
- **v4** (2025–2026): Electron 11 → 35, vanilla DOM → React 19, build system from electron-webpack → electron-vite, DDD layering introduced, zero tests → 300+ unit + e2e tests.
- **v5 Phases A–C**: hand-rolled WASM/pure-TS strategy registry for JPEG, PNG, PDF, Office, MP4 — replaces the bundled Perl ExifTool. Deployable web build + PWA + Docker.
- **v5 Phase D** (2026-05-10): single processing engine across Electron and web.
- **v5 Phase G** (2026-05-14): Electron retired entirely. Project rebranded ExifCleaner → MetaScrub to reflect coverage beyond EXIF.
All upstream contributors are credited in the original [ExifCleaner README](https://github.com/szTheory/exifcleaner#contributors). MIT license preserved throughout.


@ -18,23 +18,23 @@ The current shipping state. Expect this table to drift; the README's Format Supp
| PDF | Supported (best-effort) | Embedded files + AcroForm data not touched (see [`forensic/pdf.md`](forensic/pdf.md) §"caveats"). |
| PNG | Supported (full) | None known. See [`forensic/png.md`](forensic/png.md). |
| MP4 / MOV | Supported (partial) | Timed-metadata tracks, `hdlr` names, `compressorname`, mdat orphans, sidecar files — see [§MP4 video gaps](#mp4--mov-video-gaps) below. |
| DOCX / XLSX / PPTX / ODT | Supported (partial) | Tracked changes/comments, RSIDs, embedded media EXIF, `customXml/`, file paths in `*.rels` — see [Office Phase 2 hardening (issue #62)](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/62). |
| HEIC / AVIF | Unsupported (in flight) | Strategy tracked in [issue #48](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/48). |
| DOCX / XLSX / PPTX / ODT | Supported (partial) | Tracked changes/comments, RSIDs, embedded media EXIF, `customXml/`, file paths in `*.rels` — see Office Phase 2 hardening (issue #62). |
| HEIC / AVIF | Unsupported (in flight) | Strategy tracked in issue #48. |
| GIF / WebP / BMP / TIFF | Unsupported in web build | Hand-rolled walkers planned (see README §"Format Support Matrix"). |
| MKV | Unsupported | Strategy tracked in [issue #43](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/43), deferred to v6. |
| MKV | Unsupported | Strategy tracked in issue #43, deferred to v6. |
| RAW | Unsupported (v5+) | See [§RAW unsupported](#raw-unsupported) below. |
| SVG, JXL, JPEG 2000, AVI | Unsupported | No strategy planned for v5 ([#44](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/44) closed wontfix — small audience, no demand signal). |
| SVG, JXL, JPEG 2000, AVI | Unsupported | No strategy planned for v5 (#44 closed wontfix — small audience, no demand signal). |
---
## macOS extended file attributes (xattr) — lost in Phase G
**Decided 2026-05-11, shipped 2026-05-14 ([issue #80](http://localhost:3000/forgejo_admin/exifcleaner-web/issues/80)).** Prior to Phase G, the Electron desktop build scrubbed macOS-specific extended attributes (`kMDItemContentCreationDate`, `kMDItemDateAdded`, `kMDItemFSContentChangeDate`, `kMDItemFSCreatorCode`, `com.apple.quarantine`, `com.apple.metadata:*`, etc.) via the `XattrCommand` running against the system `xattr` binary. With the Electron shell retired in Phase G, that code path is gone.
**Decided 2026-05-11, shipped 2026-05-14 (issue #80).** Prior to Phase G, the Electron desktop build scrubbed macOS-specific extended attributes (`kMDItemContentCreationDate`, `kMDItemDateAdded`, `kMDItemFSContentChangeDate`, `kMDItemFSCreatorCode`, `com.apple.quarantine`, `com.apple.metadata:*`, etc.) via the `XattrCommand` running against the system `xattr` binary. With the Electron shell retired in Phase G, that code path is gone.
**What may leak:** Spotlight-indexed timestamps, the "Where from" download origin URL, Finder tags, the Quarantine flag (which records the application and the date it was downloaded). These survive on the file even after metadata stripping, because the browser cannot reach the filesystem's xattr namespace from sandboxed JavaScript.
**What you can do:**
- On the file you saved from ExifCleaner, run `xattr -c <file>` from Terminal. That clears all extended attributes in one command.
- On the file you saved from MetaScrub, run `xattr -c <file>` from Terminal. That clears all extended attributes in one command.
- For a directory of cleaned files: `find <dir> -type f -exec xattr -c {} +`
- If you need a deeper sweep (Spotlight metadata stores, recent-items lists), [ExifTool standalone](https://exiftool.org/) and dedicated forensic tooling go further than the simple `xattr -c` strip.
@ -44,14 +44,14 @@ The current shipping state. Expect this table to drift; the README's Format Supp
## RAW unsupported
**Decided 2026-05-09 ([issue #16](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/16)).** Previously, RAW formats (CR2, CR3, NEF, ARW, RAF, ORF, DNG, RW2, X3F, and dozens of vendor variants) were processed by the bundled Perl ExifTool inside the Electron desktop build. Phase D removes that wrapper entirely; v5 ships a single WASM/pure-TS code path that does not cover proprietary RAW.
**Decided 2026-05-09 (issue #16).** Previously, RAW formats (CR2, CR3, NEF, ARW, RAF, ORF, DNG, RW2, X3F, and dozens of vendor variants) were processed by the bundled Perl ExifTool inside the Electron desktop build. Phase D removes that wrapper entirely; v5 ships a single WASM/pure-TS code path that does not cover proprietary RAW.
**What may leak:** Everything ExifTool's RAW support previously stripped — IFD0 EXIF, GPSInfo, MakerNotes, embedded JPEG previews with their own EXIF, XMP, and IPTC. Dropping a RAW into v5 returns "unsupported"; the file is not modified, so any pre-existing metadata remains.
**What you can do:**
- Use [ExifTool standalone](https://exiftool.org/) — the canonical reference implementation, far more thorough than any wrapper.
- Convert RAW → JPEG/TIFF in your photo editor first, then process with ExifCleaner. (Note: the JPEG/TIFF often inherits a subset of the RAW's metadata — verify with `exiftool` before assuming a clean output.)
- For photo libraries, native OS share-sheets often offer "Remove location" before sharing — coarser than ExifCleaner but server-free.
- Convert RAW → JPEG/TIFF in your photo editor first, then process with MetaScrub. (Note: the JPEG/TIFF often inherits a subset of the RAW's metadata — verify with `exiftool` before assuming a clean output.)
- For photo libraries, native OS share-sheets often offer "Remove location" before sharing — coarser than MetaScrub but server-free.
**Why this trade-off:** ExifTool's RAW support represents roughly two decades of reverse-engineering on undocumented proprietary containers. No production-ready WASM library covers that surface (see `docs/poc/little-exif-wasm.md` and `docs/poc/exiv2-wasm.md` for evaluations of the closest candidates). Maintaining the Perl runtime alive in Electron solely for RAW added complexity disproportionate to the audience size; once the convergence-on-one-code-path direction was committed (`project-direction.md`), keeping it became dead weight.
@ -63,7 +63,7 @@ The current `VideoStrategy` (mp4box.js-based box-tree rewriter) drops `udta`, `m
### Timed-metadata tracks (GoPro GPS, DJI telemetry, CAMM, tmcd)
**Status:** [issue #35](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/35) (priority-1, privacy must-fix).
**Status:** issue #35 (priority-1, privacy must-fix).
**What it is:** Action cams (GoPro `gpmd` handler), drones (DJI proprietary, CAMM standard), and many cameras with timecode (`tmcd`) carry per-frame telemetry as separate sample tracks inside the MP4. The current strategy strips container-level metadata but leaves these tracks intact.
@ -73,7 +73,7 @@ The current `VideoStrategy` (mp4box.js-based box-tree rewriter) drops `udta`, `m
### Orphaned mdat bytes after track blanking
**Status:** [issue #42](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/42) (privacy gap, paired with #35).
**Status:** issue #42 (privacy gap, paired with #35).
**What it is:** When a metadata track is removed at the box-tree level, the underlying `mdat` (media data) atom may still contain the raw sample bytes the track referenced. A forensic walk over the cleaned MP4's `mdat` can carve those bytes back out via `strings | grep` or structural carving.
@ -83,7 +83,7 @@ The current `VideoStrategy` (mp4box.js-based box-tree rewriter) drops `udta`, `m
### `hdlr` handler name strings (encoder fingerprint)
**Status:** [issue #38](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/38) (fingerprint hardening, lower priority).
**Status:** issue #38 (fingerprint hardening, lower priority).
**What it is:** ISOBMFF `hdlr` boxes carry a human-readable name string (often "VideoHandler", "Apple Video Media Handler", "GoPro AVC Encoder", etc.). The current strategy doesn't zero these.
@ -91,25 +91,25 @@ The current `VideoStrategy` (mp4box.js-based box-tree rewriter) drops `udta`, `m
### `compressorname` in `avc1`/`hvc1` codec sample entries
**Status:** [issue #39](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/39) (fingerprint hardening, lower priority).
**Status:** issue #39 (fingerprint hardening, lower priority).
**What it is:** Sample entries for H.264/H.265 codecs carry a 32-byte `compressorname` field (commonly "H.264", "x264 - core 152", etc.). Same fingerprinting concern as `hdlr` names.
### H.264/H.265 SEI NAL units
**Status:** [issue #41](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/41) (known limitation — accepted, not fixable without re-encoding).
**Status:** issue #41 (known limitation — accepted, not fixable without re-encoding).
**What it is:** H.264/H.265 video streams can carry SEI (Supplemental Enhancement Information) NAL units inside the encoded bitstream itself. These can encode timestamps, GPS coordinates, recording-device identifiers, and arbitrary user data.
**What may leak:** Same surface as timed-metadata tracks, but baked into the video stream. Removing them requires re-encoding the video — which violates the project's "no quality loss" promise and breaks the core forensic invariant ("we don't decode and re-encode").
**What you can do:** Re-encode with a tool that strips SEI: `ffmpeg -i input.mp4 -map_metadata -1 -bsf:v "filter_units=remove_types=6" -c:v libx264 -c:a copy output.mp4`. Note: this DOES re-encode (lossy) and is therefore explicitly out of scope for ExifCleaner's pure-strip approach.
**What you can do:** Re-encode with a tool that strips SEI: `ffmpeg -i input.mp4 -map_metadata -1 -bsf:v "filter_units=remove_types=6" -c:v libx264 -c:a copy output.mp4`. Note: this DOES re-encode (lossy) and is therefore explicitly out of scope for MetaScrub's pure-strip approach.
### Sidecar files (DJI .SRT, GoPro .THM/.LRV, Insta360 .LRV/.LRF)
**Status:** [issue #46](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/46) (priority-1, privacy must-fix).
**Status:** issue #46 (priority-1, privacy must-fix).
**What it is:** Many cameras drop sidecar files alongside the video — most notably DJI's `.SRT` files, which contain literal `lat,lon,alt,timestamp` per frame as plain ASCII. ExifCleaner only processes the file the user dropped; the sidecar sitting in the same folder is untouched.
**What it is:** Many cameras drop sidecar files alongside the video — most notably DJI's `.SRT` files, which contain literal `lat,lon,alt,timestamp` per frame as plain ASCII. MetaScrub only processes the file the user dropped; the sidecar sitting in the same folder is untouched.
**What may leak:** The entire flight path / route in plain text. Worse than the in-file gaps because the data is *outside* the file the user thinks they cleaned.
@ -119,7 +119,7 @@ The current `VideoStrategy` (mp4box.js-based box-tree rewriter) drops `udta`, `m
## Filesystem timestamps
**Decided 2026-05-12 ([issue #83](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/83)).** The app zeros every in-file timestamp it can reach (see `privacy-invariants.md` §6 for the full policy) — but the *filesystem* timestamps on the cleaned output file are partially out of reach. Two known platform gaps, both documented here rather than fixed in code:
**Decided 2026-05-12 (issue #83).** The app zeros every in-file timestamp it can reach (see `privacy-invariants.md` §6 for the full policy) — but the *filesystem* timestamps on the cleaned output file are partially out of reach. Two known platform gaps, both documented here rather than fixed in code:
### Web build — output download mtime/atime is OS-clock time
@ -129,7 +129,7 @@ The current `VideoStrategy` (mp4box.js-based box-tree rewriter) drops `udta`, `m
**What may leak:** Time-of-download — not time-of-content. An adversary inspecting a downloaded file's filesystem mtime learns roughly when the user saved it, not when the photo was taken / video was recorded. The correlation is weaker the longer the user holds the file before sharing.
**Partial mitigation:** Every File the web build constructs sets `lastModified: 0`. `<a download>` ignores this (the OS clock wins), but Web Share Target consumers ([#23](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/23)) may honor it — meaning files shared via iOS/Android share sheets carry a zero `lastModified` to the receiving app. The disk-write gap remains.
**Partial mitigation:** Every File the web build constructs sets `lastModified: 0`. `<a download>` ignores this (the OS clock wins), but Web Share Target consumers (#23) may honor it — meaning files shared via iOS/Android share sheets carry a zero `lastModified` to the receiving app. The disk-write gap remains.
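A minimal sketch of the `lastModified: 0` construction described above. The helper name `makeCleanFile` and the sample bytes are hypothetical and not taken from the codebase; the `File` constructor and its `lastModified` option are standard Web APIs (also exposed as a global in recent Node releases).

```typescript
// Hypothetical helper (not from the MetaScrub source): wrap cleaned bytes in a
// File whose lastModified is pinned to the Unix epoch instead of "now".
function makeCleanFile(data: ArrayBuffer, name: string, type: string): File {
  return new File([data], name, { type, lastModified: 0 });
}

// Four placeholder bytes standing in for a cleaned JPEG.
const sampleBytes = new Uint8Array([0xff, 0xd8, 0xff, 0xd9]).buffer;
const cleanedFile = makeCleanFile(sampleBytes, "cleaned.jpg", "image/jpeg");

// Without the option, lastModified would default to the construction time.
console.log(cleanedFile.lastModified); // 0
```

A consumer that reads `file.lastModified` (for example a share-target handler) sees the epoch value; a plain `<a download>` never consults it, which is why the disk-write gap remains.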
**What you can do:** If filesystem-timestamp parity matters for your threat model:
- Touch the file to a known time after download: `touch -t 197001010000 cleaned.jpg`.
@ -152,7 +152,7 @@ The current `VideoStrategy` (mp4box.js-based box-tree rewriter) drops `udta`, `m
### Electron build — output mtime/atime zeroed; xattrs scrubbed
For completeness: the Electron build *does* zero filesystem mtime/atime on every output write (via `fs.utimes(output, 0, 0)`). It also runs `XattrCommand` to scrub macOS Spotlight xattrs and Linux user xattrs. The two gaps above (download-time mtime on web, birthtime/crtime cross-platform) remain even on Electron. Phase G ([#80](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/80)) is decommissioning the Electron build; the web mitigations above become the primary path.
For completeness: the Electron build *does* zero filesystem mtime/atime on every output write (via `fs.utimes(output, 0, 0)`). It also runs `XattrCommand` to scrub macOS Spotlight xattrs and Linux user xattrs. The two gaps above (download-time mtime on web, birthtime/crtime cross-platform) remain even on Electron. Phase G (#80) is decommissioning the Electron build; the web mitigations above become the primary path.
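For illustration, a minimal Node sketch of the `fs.utimes(output, 0, 0)` step mentioned above, using the synchronous variant so the result can be checked inline. The helper name `zeroFileTimes` and the scratch file are assumptions for the example, not the Electron build's actual code.

```typescript
import { mkdtempSync, statSync, utimesSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: set both atime and mtime to the Unix epoch.
// birthtime/crtime cannot be set this way, which is the cross-platform gap
// the text above refers to.
function zeroFileTimes(path: string): void {
  utimesSync(path, 0, 0);
}

// Write a scratch file, zero its timestamps, and read them back.
const scratchDir = mkdtempSync(join(tmpdir(), "utimes-sketch-"));
const outputPath = join(scratchDir, "cleaned.jpg");
writeFileSync(outputPath, "placeholder");
zeroFileTimes(outputPath);

const outputStat = statSync(outputPath);
console.log(outputStat.mtimeMs, outputStat.atimeMs);
```

This is the programmatic counterpart of the `touch -t 197001010000` workaround suggested for web downloads.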
---

@ -62,7 +62,7 @@ These steps add Capacitor to the repo. Run them once, commit the result, then `c
yarn add -D @capacitor/core @capacitor/cli @capacitor/android
# 2. Initialize Capacitor — points it at our existing build output
npx cap init ExifCleaner com.exifcleaner.app --web-dir=dist/web
npx cap init MetaScrub com.metascrub.app --web-dir=dist/web
# 3. Add the Android platform — generates the android/ Gradle project
npx cap add android
@ -102,7 +102,7 @@ For sideloaded personal use:
1. Transfer `app-debug.apk` to the device (USB, an AirDrop equivalent, Signal, email, SD card; anything works except an HTTP upload, which would defeat the point).
2. On the device, allow "Install from unknown sources" for the file manager being used (Android 8+ scopes this per-app rather than system-wide).
3. Tap the APK. Android shows an install dialog. After install, ExifCleaner appears in the launcher like any other app.
3. Tap the APK. Android shows an install dialog. After install, MetaScrub appears in the launcher like any other app.
For wider distribution (if you ever wanted to put the APK on a website), you'd want to switch from `assembleDebug` to `assembleRelease` with a real signing config. F-Droid is the most privacy-aligned distribution channel; the Play Store is technically possible but introduces a Google account dependency on the developer side.

@ -1,7 +1,7 @@
# Animation Principles Reference
Source: Emil Kowalski's UI animation articles (emilkowal.ski/ui)
Compiled for ExifCleaner v4.0 design system.
Compiled for the MetaScrub (formerly ExifCleaner) v4.0 design system.
---
@ -194,13 +194,13 @@ Once ANY tooltip is open: subsequent tooltips display **immediately** — no del
| `400ms` | Toast transitions |
| `500ms` | Drawer/sheet (iOS-like) |
| `cubic-bezier(0.32, 0.72, 0, 1)` | iOS sheet easing |
| `cubic-bezier(0.16, 1, 0.3, 1)` | Standard ease-out (ExifCleaner `--ec-ease-out`) |
| `cubic-bezier(0.16, 1, 0.3, 1)` | Standard ease-out (MetaScrub `--ec-ease-out`) |
| `2px` | Blur value to mask imperfections |
| `100ms` | Anti-accidental scroll timeout |
---
## ExifCleaner Application
## MetaScrub Application
### What to animate (low-frequency, purposeful)
- Drawer slide-in/out (user opens settings occasionally)

@ -1,10 +1,10 @@
# Architecture Guide
A walkthrough of how ExifCleaner is wired together, written for a backend developer who's new to React. `CLAUDE.md` is the reference (LLM-optimised for fast symbol lookup); this document is the narrative — analogies, sequence diagrams, end-to-end traces.
A walkthrough of how MetaScrub is wired together, written for a backend developer who's new to React. `CLAUDE.md` is the reference (LLM-optimised for fast symbol lookup); this document is the narrative — analogies, sequence diagrams, end-to-end traces.
If you've worked on a microservice backend with a DDD-flavoured domain layer, almost everything in this codebase has a familiar analogue. The places where it diverges (React reconciliation, the strategy-registry pattern) are called out explicitly.
> **History note.** Earlier versions of ExifCleaner shipped as an Electron desktop app with bundled Perl ExifTool. Phase D (2026-05-10) consolidated everything onto one WASM/pure-TS engine that ran in both Electron and web. Phase G (2026-05-14, issue #80) retired the Electron shell entirely. The PWA is now the sole distribution channel; this document reflects that state.
> **History note.** This project began life as [ExifCleaner](https://github.com/szTheory/exifcleaner), an Electron desktop app wrapping Perl ExifTool. Phase D (2026-05-10) consolidated everything onto one WASM/pure-TS engine that ran in both Electron and web. Phase G (2026-05-14, issue #80) retired the Electron shell entirely. v5 rebranded the fork as **MetaScrub** to reflect the broader format coverage (PDF, Office, MP4 — not just EXIF). The PWA is now the sole distribution channel; this document reflects that state.
---
@ -25,7 +25,7 @@ If you've worked on a microservice backend with a DDD-flavoured domain layer, al
## Mental model in one page
ExifCleaner is a React SPA that strips metadata from files in the user's browser. There is no server, no Electron shell, no native code. Files are read via the File API, processed by hand-rolled `FormatStrategy` walkers, and handed back to the user via `<a download>` (or bundled into a zip when the batch contains multiple files / folder structure).
MetaScrub is a React SPA that strips metadata from files in the user's browser. There is no server, no Electron shell, no native code. Files are read via the File API, processed by hand-rolled `FormatStrategy` walkers, and handed back to the user via `<a download>` (or bundled into a zip when the batch contains multiple files / folder structure).
```
┌─────────────────────────────────────────────────────────────┐

@ -20,8 +20,8 @@ The included `Dockerfile` is a multi-stage build: stage 1 builds the bundle with
### Build and run locally
```bash
docker build -t exifcleaner-web .
docker run -d -p 8080:80 --name exifcleaner-web exifcleaner-web
docker build -t metascrub .
docker run -d -p 8080:80 --name metascrub metascrub
# → reachable at http://localhost:8080
```
@ -42,9 +42,9 @@ Stable URL on a domain you own (requires a free Cloudflare account with the doma
```bash
cloudflared tunnel login
cloudflared tunnel create exifcleaner
cloudflared tunnel route dns exifcleaner exifcleaner.example.com
cloudflared tunnel run --url http://localhost:8080 exifcleaner
cloudflared tunnel create metascrub
cloudflared tunnel route dns metascrub metascrub.example.com
cloudflared tunnel run --url http://localhost:8080 metascrub
```
Cloudflare provisions and renews TLS certs automatically.
@ -60,7 +60,7 @@ The Docker container's internal nginx already sets all required response headers
Caddy provisions and renews Let's Encrypt certs automatically. Full config:
```caddy
exifcleaner.example.com {
metascrub.example.com {
reverse_proxy localhost:8080
}
```
@ -70,10 +70,10 @@ exifcleaner.example.com {
#### nginx + certbot (most operators already know it)
```nginx
# /etc/nginx/sites-enabled/exifcleaner
# /etc/nginx/sites-enabled/metascrub
server {
listen 80;
server_name exifcleaner.example.com;
server_name metascrub.example.com;
location / {
proxy_pass http://localhost:8080;
@ -88,7 +88,7 @@ server {
Then provision the cert and rewrite the config to enable HTTPS:
```bash
sudo certbot --nginx -d exifcleaner.example.com
sudo certbot --nginx -d metascrub.example.com
```
Certbot adds the SSL block and the HTTP→HTTPS redirect. It also installs a renewal cron/timer.
@ -141,7 +141,7 @@ In the repo:
While the workflow is on `workflow_dispatch`:
- Go to the **Actions** tab → **Deploy Web App to Cloudflare Pages** → **Run workflow** → pick branch → **Run**
- The first run auto-creates the Pages project with name `exifcleaner-web`
- The first run auto-creates the Pages project with name `exifcleaner-web` (the legacy CF project name — kept as-is to avoid orphaning an existing deploy; rename via the Cloudflare dashboard if desired)
To re-enable auto-deploys on every push, edit the `on:` block in the workflow file. Once enabled:

@ -1,9 +1,9 @@
# Office (DOCX/XLSX/PPTX/ODT) forensic recovery test
**Date:** 2026-05-09
**Goal:** Verify what `OfficeStrategy` actually removes from OOXML and ODT archives, by embedding unique sentinels in every metadata-bearing path the strategy might touch (and several it currently doesn't), stripping the fixtures, and running a recovery battery on the outputs. Closes the Phase-3 verification gap for issue [#64](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/64).
**Goal:** Verify what `OfficeStrategy` actually removes from OOXML and ODT archives, by embedding unique sentinels in every metadata-bearing path the strategy might touch (and several it currently doesn't), stripping the fixtures, and running a recovery battery on the outputs. Closes the Phase-3 verification gap for issue #64.
This is a **deliberately honest** test. OfficeStrategy is in production without prior forensic verification; the runner ships even though several sentinels survive, because masking the result would defeat the privacy invariant that "forensic > unit tests" exists to enforce. The surviving sentinels map cleanly to the open Phase-2 hardening umbrella ([#62](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/62)) and its child issues #28-#32; they are documented here as known gaps so the umbrella PR has explicit before/after evidence to point at when it lands.
This is a **deliberately honest** test. OfficeStrategy is in production without prior forensic verification; the runner ships even though several sentinels survive, because masking the result would defeat the privacy invariant that "forensic > unit tests" exists to enforce. The surviving sentinels map cleanly to the open Phase-2 hardening umbrella (#62) and its child issues #28-#32; they are documented here as known gaps so the umbrella PR has explicit before/after evidence to point at when it lands.
**Reproducible at:** [`tools/forensic/office.ts`](../../tools/forensic/office.ts) — `npx tsx tools/forensic/office.ts` from the project root.
@ -27,16 +27,16 @@ Sources covered (DOCX):
| `APP_COMPANY` | `docProps/app.xml` `Company` | (handled) |
| `APP_MANAGER` | `docProps/app.xml` `Manager` | (handled) |
| `CUSTOM_PROP` | `docProps/custom.xml` `vt:lpwstr` | (handled) |
| `CUSTOMXML_ID` | `customXml/item1.xml` SharePoint datastore item ID | [#31](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/31) |
| `CUSTOMXML_PROPS` | `customXml/item1.xml` SharePoint document ID | [#31](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/31) |
| `SETTINGS_RSID` | `word/settings.xml` `<w:rsids>` block | [#29](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/29) |
| `COMMENT_AUTHOR` | `word/comments.xml` `w:author` attribute | [#28](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/28) |
| `COMMENT_TEXT` | `word/comments.xml` `<w:t>` body | [#28](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/28) |
| `TRACKCHANGE_AUTHOR` | `word/document.xml` inline `<w:ins w:author=…>` | [#28](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/28) |
| `CUSTOMXML_ID` | `customXml/item1.xml` SharePoint datastore item ID | #31 |
| `CUSTOMXML_PROPS` | `customXml/item1.xml` SharePoint document ID | #31 |
| `SETTINGS_RSID` | `word/settings.xml` `<w:rsids>` block | #29 |
| `COMMENT_AUTHOR` | `word/comments.xml` `w:author` attribute | #28 |
| `COMMENT_TEXT` | `word/comments.xml` `<w:t>` body | #28 |
| `TRACKCHANGE_AUTHOR` | `word/document.xml` inline `<w:ins w:author=…>` | #28 |
| `PEOPLE_AUTHOR` | `word/people.xml` (handled by current strategy) | (handled) |
| `RELS_FILEPATH` | `word/_rels/document.xml.rels` external `file:///` hyperlink | [#32](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/32) |
| `MEDIA_EXIF_ARTIST` | EXIF Artist tag inside `word/media/image1.jpeg` | [#30](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/30) |
| `MEDIA_EXIF_GPS` | EXIF UserComment + GPS coords inside `word/media/image1.jpeg` | [#30](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/30) |
| `RELS_FILEPATH` | `word/_rels/document.xml.rels` external `file:///` hyperlink | #32 |
| `MEDIA_EXIF_ARTIST` | EXIF Artist tag inside `word/media/image1.jpeg` | #30 |
| `MEDIA_EXIF_GPS` | EXIF UserComment + GPS coords inside `word/media/image1.jpeg` | #30 |
| `PRINTER_NAME` | `word/printerSettings/printerSettings1.bin` (handled) | (handled) |
| `THUMBNAIL_BYTES` | `docProps/thumbnail.jpeg` (handled) | (handled) |
@ -89,17 +89,17 @@ Plus structural checks: list of remaining ZIP entries by path; surviving metadat
| Sentinel | Source | Channel that recovered it | Tracked by |
|---|---|---|---|
| `CUSTOMXML_ID` | `customXml/item1.xml` (SharePoint datastore item ID) | unzip -p, entry walk | [#31](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/31) |
| `CUSTOMXML_PROPS` | `customXml/item1.xml` (SharePoint document ID) | unzip -p, entry walk | [#31](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/31) |
| `SETTINGS_RSID` | `word/settings.xml` `<w:rsids>` block | unzip -p, entry walk | [#29](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/29) |
| `COMMENT_AUTHOR` | `word/comments.xml` `w:author` attribute | unzip -p, entry walk | [#28](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/28) |
| `COMMENT_TEXT` | `word/comments.xml` `<w:t>` body | unzip -p, entry walk | [#28](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/28) |
| `TRACKCHANGE_AUTHOR` | `word/document.xml` inline `<w:ins w:author=…>` | unzip -p, entry walk | [#28](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/28) |
| `RELS_FILEPATH` | `word/_rels/document.xml.rels` `Target="file:///…"` | unzip -p, entry walk | [#32](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/32) |
| `MEDIA_EXIF_ARTIST` | EXIF Artist inside `word/media/image1.jpeg` | unzip -p, entry walk, embedded-JPEG EXIF | [#30](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/30) |
| `MEDIA_EXIF_GPS` | EXIF UserComment + GPS inside `word/media/image1.jpeg` | unzip -p, entry walk, embedded-JPEG EXIF | [#30](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/30) |
| `CUSTOMXML_ID` | `customXml/item1.xml` (SharePoint datastore item ID) | unzip -p, entry walk | #31 |
| `CUSTOMXML_PROPS` | `customXml/item1.xml` (SharePoint document ID) | unzip -p, entry walk | #31 |
| `SETTINGS_RSID` | `word/settings.xml` `<w:rsids>` block | unzip -p, entry walk | #29 |
| `COMMENT_AUTHOR` | `word/comments.xml` `w:author` attribute | unzip -p, entry walk | #28 |
| `COMMENT_TEXT` | `word/comments.xml` `<w:t>` body | unzip -p, entry walk | #28 |
| `TRACKCHANGE_AUTHOR` | `word/document.xml` inline `<w:ins w:author=…>` | unzip -p, entry walk | #28 |
| `RELS_FILEPATH` | `word/_rels/document.xml.rels` `Target="file:///…"` | unzip -p, entry walk | #32 |
| `MEDIA_EXIF_ARTIST` | EXIF Artist inside `word/media/image1.jpeg` | unzip -p, entry walk, embedded-JPEG EXIF | #30 |
| `MEDIA_EXIF_GPS` | EXIF UserComment + GPS inside `word/media/image1.jpeg` | unzip -p, entry walk, embedded-JPEG EXIF | #30 |
Every one of these maps to an open child issue under the [Office Phase 2 hardening umbrella (#62)](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/62). When the umbrella PR ships, this section becomes the regression bar — re-running the runner with the new strategy must produce zero surviving sentinels for these channels.
Every one of these maps to an open child issue under the Office Phase 2 hardening umbrella (#62). When the umbrella PR ships, this section becomes the regression bar — re-running the runner with the new strategy must produce zero surviving sentinels for these channels.
### ODT
@ -165,4 +165,4 @@ Required tools: `exiftool`, `unzip`, `strings`. All available on Debian/Ubuntu v
## What ships when
This writeup is the "before" snapshot. The umbrella PR for [#62](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/62) — closing #28-#32 — will re-run the runner with the hardened strategy and either update this section's results table to "all zeros" or document any irreducible remaining surface. The bar for that PR is: every sentinel in the DOCX results table moves to zero across every recovery channel, and a regression in any of them is a release-blocker.
This writeup is the "before" snapshot. The umbrella PR for #62 — closing #28-#32 — will re-run the runner with the hardened strategy and either update this section's results table to "all zeros" or document any irreducible remaining surface. The bar for that PR is: every sentinel in the DOCX results table moves to zero across every recovery channel, and a regression in any of them is a release-blocker.

@ -17,12 +17,12 @@ The runner builds a **synthetic ISOBMFF fixture** programmatically (no committed
| `META_ILST_DESC` | top-level `meta/ilst` data atom | — (current strategy handles) |
| `XMP_UUID_TITLE` | top-level `uuid` (Adobe XMP UUID) | — (current strategy handles) |
| `VENDOR_UUID_DATA` | top-level `uuid` (vendor user-type, DJI-style) | — (current strategy handles) |
| `HDLR_NAME_VIDEO` | `mdia/hdlr.name` of the video track | [#38](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/38) |
| `HDLR_NAME_GPMD` | `mdia/hdlr.name` of a GPMD timed-metadata track | [#35](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/35), [#38](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/38) |
| `COMPRESSORNAME` | `avc1.compressorname` Pascal string (offset +42 inside the sample entry) | [#39](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/39) |
| `GPMD_MDAT_GPS` | sample bytes inside `mdat` referenced by the GPMD track's `stco` | [#35](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/35), [#42](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/42) |
| `HDLR_NAME_VIDEO` | `mdia/hdlr.name` of the video track | #38 |
| `HDLR_NAME_GPMD` | `mdia/hdlr.name` of a GPMD timed-metadata track | #35, #38 |
| `COMPRESSORNAME` | `avc1.compressorname` Pascal string (offset +42 inside the sample entry) | #39 |
| `GPMD_MDAT_GPS` | sample bytes inside `mdat` referenced by the GPMD track's `stco` | #35, #42 |
| `FREE_ARTIFACT` | top-level pre-existing `free` atom payload (simulates a prior tool's "blanked" residue) | — |
| `MDAT_ORPHAN` | `mdat` payload bytes referenced by no current `stco` (simulates leftover trim residue) | [#42](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/42) |
| `MDAT_ORPHAN` | `mdat` payload bytes referenced by no current `stco` (simulates leftover trim residue) | #42 |
The fixture's box tree mirrors a realistic ISOBMFF layout: `ftyp`, two top-level `uuid` boxes (XMP + vendor), a `free` artifact, a `moov` containing `mvhd` plus a video `trak` (with `avc1` sample entry), a GPMD timed-metadata `trak`, a `udta`, a top-level `meta` (Apple-style with `hdlr` + `keys` + `ilst`), and a final `mdat` containing GPMD sample bytes plus an orphan-byte sentinel. Total fixture size: 2 802 bytes.
@ -35,7 +35,7 @@ For each output, four recovery techniques are applied:
3. **In-process atom-tree walker** — re-parses the box tree, recurses into `moov` / `trak` / `mdia` / `minf` / `stbl` / `udta` / `meta` / `moof` / `traf`, and scans every payload as latin-1 for sentinels. This is the equivalent of "read the whole structure, decompress nothing fancy, grep" — what an attacker writes in 200 lines.
4. **`mdat` carving** — extracts the raw `mdat` payload (every `mdat` box, recursively) and scans it for sentinels regardless of whether any `stco` references those bytes. This surfaces orphan-mdat survival.
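The walker and carving steps above can be sketched roughly as follows. This is an illustrative reimplementation, not the runner's code: the names `findBoxes`, `containsAscii`, `carveMdat`, and the tiny two-box demo file are all hypothetical, and `meta` is left out of the container list because its header layout differs between ISO and QuickTime files.

```typescript
// Plain containers whose payload is itself a sequence of boxes.
const CONTAINER_TYPES = ["moov", "trak", "mdia", "minf", "stbl", "udta", "moof", "traf"];

// Collect the payload of every box of type `wanted`, recursing into containers.
function findBoxes(buf: Uint8Array, wanted: string): Uint8Array[] {
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  const found: Uint8Array[] = [];
  let pos = 0;
  while (pos + 8 <= buf.length) {
    let size = view.getUint32(pos);
    let header = 8;
    const type = String.fromCharCode(
      view.getUint8(pos + 4), view.getUint8(pos + 5),
      view.getUint8(pos + 6), view.getUint8(pos + 7));
    if (size === 1) {
      // 64-bit "largesize" follows the type field.
      if (pos + 16 > buf.length) break;
      size = view.getUint32(pos + 8) * 4294967296 + view.getUint32(pos + 12);
      header = 16;
    } else if (size === 0) {
      size = buf.length - pos; // box extends to the end of the buffer
    }
    if (size < header || pos + size > buf.length) break; // malformed, stop
    const payload = buf.subarray(pos + header, pos + size);
    if (type === wanted) found.push(payload);
    else if (CONTAINER_TYPES.indexOf(type) !== -1) found.push(...findBoxes(payload, wanted));
    pos += size;
  }
  return found;
}

// Byte-wise search for an ASCII sentinel (equivalent to a latin-1 scan).
function containsAscii(haystack: Uint8Array, needle: string): boolean {
  for (let i = 0; i + needle.length <= haystack.length; i++) {
    let j = 0;
    while (j < needle.length && haystack[i + j] === needle.charCodeAt(j)) j++;
    if (j === needle.length) return true;
  }
  return false;
}

// True if the sentinel is recoverable from any mdat payload.
function carveMdat(file: Uint8Array, sentinel: string): boolean {
  return findBoxes(file, "mdat").some((payload) => containsAscii(payload, sentinel));
}

// Demo helpers: build a box with a 32-bit size header and an ASCII payload.
function makeBox(type: string, text: string): Uint8Array {
  const out = new Uint8Array(8 + text.length);
  new DataView(out.buffer).setUint32(0, out.length);
  for (let i = 0; i < 4; i++) out[4 + i] = type.charCodeAt(i);
  for (let i = 0; i < text.length; i++) out[8 + i] = text.charCodeAt(i);
  return out;
}

const freeBox = makeBox("free", "FREE_ARTIFACT");
const mdatBox = makeBox("mdat", "xxMDAT_ORPHANxx");
const demoFile = new Uint8Array(freeBox.length + mdatBox.length);
demoFile.set(freeBox, 0);
demoFile.set(mdatBox, freeBox.length);

console.log(carveMdat(demoFile, "MDAT_ORPHAN"));   // true
console.log(carveMdat(demoFile, "FREE_ARTIFACT")); // false: lives in `free`, not `mdat`
```

Because the carve ignores `stco`/`co64` entirely, it finds bytes that no track indexes any more, which is the property that makes orphan survival visible.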
In addition, a **structural assertion** (added 2026-05-12 for [#83](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/83)) reads `creation_time` and `modification_time` out of every `mvhd` / `tkhd` / `mdhd` FullBox. Sentinels are not a useful channel here (the fields are 32- or 64-bit integers, not ASCII), so the runner extracts the numeric values directly and the verdict fails on any non-zero. Privacy invariant §6 admits no documented gap for these fields — non-zero is an unconditional regression.
In addition, a **structural assertion** (added 2026-05-12 for #83) reads `creation_time` and `modification_time` out of every `mvhd` / `tkhd` / `mdhd` FullBox. Sentinels are not a useful channel here (the fields are 32- or 64-bit integers, not ASCII), so the runner extracts the numeric values directly and the verdict fails on any non-zero. Privacy invariant §6 admits no documented gap for these fields — non-zero is an unconditional regression.
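A rough sketch of what such a structural check can look like. It assumes the standard ISO BMFF FullBox layout shared by `mvhd`, `tkhd`, and `mdhd` (size, type, version, flags, then `creation_time` and `modification_time`); the function name `fullBoxTimesAreZero` and the truncated demo box are hypothetical, not the runner's actual code.

```typescript
// `box` must start at the box's 4-byte size field.
// Assumed layout: size(4) type(4) version(1) flags(3), then creation_time and
// modification_time, 4 bytes each for version 0 and 8 bytes each for version 1.
function fullBoxTimesAreZero(box: Uint8Array): boolean {
  const version = box[8];
  const fieldBytes = version === 1 ? 8 : 4;
  const end = 12 + 2 * fieldBytes;
  if (box.length < end) throw new Error("box too short to hold timestamps");
  for (let i = 12; i < end; i++) {
    if (box[i] !== 0) return false;
  }
  return true;
}

// Demo: the first 20 bytes of a version-0 `mvhd` (a real box is longer).
const demoMvhd = new Uint8Array(20);
const demoView = new DataView(demoMvhd.buffer);
demoView.setUint32(0, 20);                               // size
"mvhd".split("").forEach((ch, i) => { demoMvhd[4 + i] = ch.charCodeAt(0); });
demoView.setUint32(12, 0x12345678);                      // creation_time
demoView.setUint32(16, 0xaabbccdd);                      // modification_time

const beforeStrip = fullBoxTimesAreZero(demoMvhd);       // false

// Zero both fields in place, as a strip pass would.
demoMvhd.fill(0, 12, 20);
const afterStrip = fullBoxTimesAreZero(demoMvhd);        // true

console.log(beforeStrip, afterStrip);
```

Checking raw bytes rather than decoding integers keeps the sketch independent of field width: any non-zero byte in either field fails the assertion.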
## Results
@ -49,12 +49,12 @@ The runner exits 0 if every surviving sentinel is in a documented `KNOWN_GAPS` s
| `META_ILST_DESC` | yes | **gone** | **gone** | n/a | (n/a) | **stripped** |
| `XMP_UUID_TITLE` | yes | **gone** | **gone** | n/a | (n/a) | **stripped** |
| `VENDOR_UUID_DATA` | yes | **gone** | **gone** | n/a | (n/a) | **stripped** |
| `HDLR_NAME_VIDEO` | yes | **survives** | **survives** | n/a | (n/a) | gap → [#38](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/38) |
| `HDLR_NAME_GPMD` | yes | **survives** | **survives** | n/a | (n/a) | gap → [#35](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/35), [#38](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/38) |
| `COMPRESSORNAME` | yes | **survives** | **survives** | n/a | (n/a) | gap → [#39](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/39) |
| `GPMD_MDAT_GPS` | yes | **survives** | **survives** | **survives** | (n/a) | gap → [#35](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/35), [#42](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/42) |
| `HDLR_NAME_VIDEO` | yes | **survives** | **survives** | n/a | (n/a) | gap → #38 |
| `HDLR_NAME_GPMD` | yes | **survives** | **survives** | n/a | (n/a) | gap → #35, #38 |
| `COMPRESSORNAME` | yes | **survives** | **survives** | n/a | (n/a) | gap → #39 |
| `GPMD_MDAT_GPS` | yes | **survives** | **survives** | **survives** | (n/a) | gap → #35, #42 |
| `FREE_ARTIFACT` | yes | **survives** | **survives** | n/a | (n/a) | gap (pre-existing `free` not scrubbed) |
| `MDAT_ORPHAN` | yes | **survives** | **survives** | **survives** | (n/a) | gap → [#42](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/42) |
| `MDAT_ORPHAN` | yes | **survives** | **survives** | **survives** | (n/a) | gap → #42 |
**Summary:** Of the 12 sentinels, 6 are stripped to zero recoverability across every channel, and 6 survive at least one recovery channel. Every survivor maps to a documented known gap, and all but `FREE_ARTIFACT` have an open issue tracking them. **No unexpected survivors.**
@ -68,7 +68,7 @@ Six categories work as designed and are forensically locked in:
- **Top-level `meta` (incl. Apple `keys` / `ilst`).** Same treatment. Sentinels in both the keys catalog and the ilst data atoms vanish.
- **Adobe XMP `uuid`.** The XMP UUID is well-known but not on the safe-list (which only covers DRM/CENC). Replaced with `free`. The `xmpmeta` payload is gone.
- **Vendor / unknown `uuid`.** Treated as metadata unless on the DRM safe-list. DJI-style payloads disappear.
- **`mvhd` / `tkhd` / `mdhd` timestamps.** Both `creation_time` and `modification_time` zeroed in-place across all 5 boxes in the test fixture (1 `mvhd`, 2 `tkhd`, 2 `mdhd`). Verified by the structural assertion added for [#83](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/83). Input fixture values (`0x12345678`, `0xaabbccdd`, `0x55aa55aa`, etc.) → `0x00000000` post-strip.
- **`mvhd` / `tkhd` / `mdhd` timestamps.** Both `creation_time` and `modification_time` zeroed in-place across all 5 boxes in the test fixture (1 `mvhd`, 2 `tkhd`, 2 `mdhd`). Verified by the structural assertion added for #83. Input fixture values (`0x12345678`, `0xaabbccdd`, `0x55aa55aa`, etc.) → `0x00000000` post-strip.
Six categories are documented gaps that this test makes reproducibly visible:
@ -78,16 +78,16 @@ Six categories are documented gaps that this test makes reproducibly visible:
- **`mdat` orphan bytes (#42).** Even when a metadata `trak` is fully blanked, the bytes the trak's `stco`/`co64` used to point at remain in `mdat`. A forensic carver scanning for GPMF magic bytes (`GPMF` ASCII signature) or CAMM packet headers can still find them. Our `MDAT_ORPHAN` sentinel demonstrates this: no current atom indexes those bytes, but raw `strings` and a structural scan recover them.
- **Pre-existing `free` atoms.** The strategy leaves existing `free`/`skip` atoms untouched (it only emits new ones as same-size replacements for stripped boxes). If a previous tool "blanked" metadata by replacing it with a non-zero-padded `free` atom, the residue remains. Real-world exposure is probably small (most tools either zero-fill or rewrite-without-the-box) but the gap is real.
The first set (six sentinel-string categories + timestamp fields) validates the strategy's existing policy. The second six (gaps) are the deliverables for [#35](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/35), [#38](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/38), [#39](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/39), and [#42](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/42) — closing them moves their sentinels from `KNOWN_GAPS` to the stripped set, and this runner is what verifies that.
The first set (six sentinel-string categories + timestamp fields) validates the strategy's existing policy. The second six (gaps) are the deliverables for #35, #38, #39, and #42 — closing them moves their sentinels from `KNOWN_GAPS` to the stripped set, and this runner is what verifies that.
## Caveats and limits of this test
- **Synthetic fixture only.** The runner builds a programmatic MP4 with the box tree we want to exercise. It is not a real GoPro / DJI / iPhone capture. Real-world files have richer `mdia` substructure, often-larger `stbl` tables, deeper sample-entry hierarchies (parameter-set NALs in `avcC`), and vendor-specific top-level boxes we don't synthesize. The strategy's policy is structural (treat every `udta`/`meta`/non-DRM-`uuid` the same way), so the result on real captures should have the same shape. Extending the test with real-world captures is a worthwhile follow-up; committing them under `tests/fixtures/wasm/video/` would not be acceptable due to size, but reference URLs plus a download script would be.
- **`mp4box` CLI not used.** The original task description mentioned an `mp4box` CLI walk; that tool is not a project dependency and is not assumed available. The in-process atom-tree walker plus `strings` plus optional ExifTool covers the same surface, and the in-process walk is what runs in CI when we wire this up.
- **ExifTool comparison is best-effort.** ExifTool refuses to write to our synthetic fixture because the placeholder `stco` entries don't address real `mdat` chunks. We capture ExifTool's read-only view of the input fixture (which surfaces 6 of the 12 sentinels), but the post-strip comparison is skipped when ExifTool fails. This is an honest reflection of ExifTool's behaviour on adversarial inputs and does not weaken the in-process battery.
- **Fragmented MP4 (`moof` / `traf`) is not exercised here.** The strategy _does_ support `moof`/`traf` (it recurses into `traf` as a container and would blank metadata children there), but this fixture is non-fragmented for simplicity. Fragmented-MP4 forensic coverage is tracked as part of [#36](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/36) (blank `traf` fragments for blanked metadata tracks).
- **MKV / AVI / other non-ISOBMFF video formats are out of scope.** They use entirely different containers and are tracked under separate strategies — see [#43](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/43) (deferred Matroska).
- **Sidecar files (`.SRT`, `.LRV`, `.THM`, `.LRF`) are out of scope.** Per [#46](https://github.com/obuvuyoviz26-lab/exifcleaner-web/issues/46), DJI/GoPro/Insta360 cameras emit companion files that carry the same GPS/telemetry as the main video. They live outside the file we touch, so this test cannot exercise them. Closing #46 requires UI work in the renderer's add-files pipeline, not strategy work.
- **Fragmented MP4 (`moof` / `traf`) is not exercised here.** The strategy _does_ support `moof`/`traf` (it recurses into `traf` as a container and would blank metadata children there), but this fixture is non-fragmented for simplicity. Fragmented-MP4 forensic coverage is tracked as part of #36 (blank `traf` fragments for blanked metadata tracks).
- **MKV / AVI / other non-ISOBMFF video formats are out of scope.** They use entirely different containers and are tracked under separate strategies — see #43 (deferred Matroska).
- **Sidecar files (`.SRT`, `.LRV`, `.THM`, `.LRF`) are out of scope.** Per #46, DJI/GoPro/Insta360 cameras emit companion files that carry the same GPS/telemetry as the main video. They live outside the file we touch, so this test cannot exercise them. Closing #46 requires UI work in the renderer's add-files pipeline, not strategy work.
- **Two-pass `mdat` zeroing not yet implemented.** Closing #42 requires the strategy to do a first pass to collect chunk offsets from metadata `trak`s before blanking them, then a second pass to zero those offset+size ranges in `mdat`. This is a larger architectural change and is intentionally not included in this test's pass criteria.
- **Not in CI yet.** As with the other forensic runners under `tools/forensic/`, the bar today is "run before merging a strategy change, attach the report." Wiring this into a release-gate test that hard-fails on `UNEXPECTED survivors` is a natural follow-up.
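As a rough illustration of the second pass described in the two-pass `mdat` zeroing caveat above: once a first pass has collected the chunk ranges of the metadata tracks, the zeroing itself is a bounds-checked fill. Everything here is hypothetical (the `ByteRange` shape, the `zeroRanges` name, and the demo bytes), and the first pass that derives ranges from `stco`/`co64` and sample sizes is deliberately omitted.

```typescript
// A byte range inside the file, as the (omitted) first pass might report it.
interface ByteRange {
  offset: number;
  size: number;
}

// Hypothetical second pass: overwrite each collected range with zeros in place.
// File length is unchanged, so chunk offsets of the remaining tracks stay valid.
function zeroRanges(file: Uint8Array, ranges: ByteRange[]): void {
  for (const { offset, size } of ranges) {
    if (offset < 0 || size < 0 || offset + size > file.length) {
      throw new RangeError(`range ${offset}+${size} falls outside the file`);
    }
    file.fill(0, offset, offset + size);
  }
}

// Demo: blank three bytes in the middle of an eight-byte buffer.
const demoBytes = new Uint8Array([1, 2, 3, 4, 5, 6, 7, 8]);
zeroRanges(demoBytes, [{ offset: 2, size: 3 }]);
console.log(Array.from(demoBytes)); // [1, 2, 0, 0, 0, 6, 7, 8]
```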

@ -1,16 +1,11 @@
{
"name": "exifcleaner",
"productName": "ExifCleaner",
"name": "metascrub",
"productName": "MetaScrub",
"version": "4.0.0",
"description": "Clean exif metadata from images, videos, and PDF documents",
"description": "Strip metadata from images, videos, PDFs, and Office documents — entirely in your browser.",
"license": "MIT",
"repository": "github:szTheory/exifcleaner",
"type": "module",
"author": {
"name": "szTheory",
"email": "szTheory@users.noreply.github.com",
"url": "https://exifcleaner.com"
},
"scripts": {
"format": "yarn prettier --write 'src/**/*.{ts,tsx}'",
"lint": "prettier --check 'src/**/*.{ts,tsx}'",

@ -1,7 +1,7 @@
{
"name": "ExifCleaner",
"short_name": "ExifCleaner",
"description": "Remove metadata from your files. 100% private — files never leave your device.",
"name": "MetaScrub",
"short_name": "MetaScrub",
"description": "Strip metadata from your files. 100% private — files never leave your device.",
"start_url": "./",
"display": "standalone",
"background_color": "#1a1a1a",

@@ -70,7 +70,7 @@ export class BatchOutputController {
}
const blob = await zip.generateAsync({ type: "blob" });
const date = new Date().toISOString().slice(0, 10);
const filename = `exifcleaner-${rootLabel}-${date}.zip`;
const filename = `metascrub-${rootLabel}-${date}.zip`;
// Wrap in a File so consumers (Web Share Target #23) see lastModified: 0.
// <a download> ignores File metadata, so this is privacy-positive for
// the share path and a no-op for direct downloads. See §6 of

@@ -99,7 +99,11 @@ export interface WebApi {
wasm: WasmApi;
}
const SETTINGS_KEY = "exifcleaner-settings-v1";
const SETTINGS_KEY = "metascrub-settings-v1";
// Pre-rebrand key, read once on first launch after the rename and copied to
// SETTINGS_KEY. Existing users keep their toggles; we never read from this
// key again afterward.
const LEGACY_SETTINGS_KEY = "exifcleaner-settings-v1";
// Decides whether a freshly-added batch of files should be bundled into a zip:
// - Any File with a non-empty webkitRelativePath came from a folder picker;
@@ -138,9 +142,20 @@ function triggerZipDownload(zipFile: File, filename: string): void {
setTimeout(() => URL.revokeObjectURL(url), 1000);
}
function loadSettingsFromStorage(): Settings {
// Exported for unit testing — the migration shim has subtle ordering that
// merits direct coverage rather than going through makeWebApi().
export function loadSettingsFromStorage(): Settings {
try {
const raw = localStorage.getItem(SETTINGS_KEY);
let raw = localStorage.getItem(SETTINGS_KEY);
if (raw === null) {
// One-time migration from the pre-rebrand key.
const legacy = localStorage.getItem(LEGACY_SETTINGS_KEY);
if (legacy !== null) {
localStorage.setItem(SETTINGS_KEY, legacy);
localStorage.removeItem(LEGACY_SETTINGS_KEY);
raw = legacy;
}
}
if (raw === null) return { ...DEFAULT_SETTINGS };
const parsed: unknown = JSON.parse(raw);
if (typeof parsed !== "object" || parsed === null)
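
The ordering in the migration shim above can be isolated in a small sketch with the storage injected, which is one way to exercise it without a browser. `KeyValueStore`, `readWithMigration`, and `memoryStore` are hypothetical names for illustration and are not part of `web_api.ts`.

```typescript
// Hypothetical stand-in for the subset of the Storage API the shim needs.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const NEW_KEY = "metascrub-settings-v1";
const OLD_KEY = "exifcleaner-settings-v1";

// Returns the raw settings string, copying the legacy value forward if needed.
function readWithMigration(store: KeyValueStore): string | null {
  const current = store.getItem(NEW_KEY);
  if (current !== null) return current; // already migrated: legacy key ignored
  const legacy = store.getItem(OLD_KEY);
  if (legacy === null) return null; // nothing stored: caller uses defaults
  store.setItem(NEW_KEY, legacy); // write the new key first...
  store.removeItem(OLD_KEY); // ...and only then drop the old one
  return legacy;
}

// Map-backed store, similar in spirit to a fake localStorage in unit tests.
function memoryStore(): KeyValueStore & { data: Map<string, string> } {
  const data = new Map<string, string>();
  return {
    data,
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
    removeItem: (key) => {
      data.delete(key);
    },
  };
}
```

Writing the new key before removing the old one means that if `setItem` throws (for example on a full quota), the legacy value is still there on the next launch.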

@@ -26,9 +26,6 @@ const RAW_EXTENSIONS = new Set([
const PENDING_HEIC = new Set(["HEIC", "HEIF"]);
const HEIC_ISSUE_URL =
"http://localhost:3000/forgejo_admin/exifcleaner-web/issues/48";
export function classifyUnsupportedExtension(
extension: string,
): UnsupportedEntry | undefined {
@@ -44,7 +41,6 @@ export function classifyUnsupportedExtension(
return {
class: "pending",
messageKey: "error.pending.heic",
linkHref: HEIC_ISSUE_URL,
};
}
return {

@@ -3,7 +3,7 @@
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>ExifCleaner</title>
<title>MetaScrub</title>
<link rel="icon" type="image/png" href="./icon-192.png" />
<link rel="manifest" href="./manifest.webmanifest" />
<meta name="theme-color" content="#2a9d8f" />

@@ -1,6 +1,6 @@
# E2E Tests
Playwright E2E test suite for ExifCleaner. Runs against the static web build served by `vite preview`.
Playwright E2E test suite for MetaScrub. Runs against the static web build served by `vite preview`.
(The previous `tests/e2e/electron/` suite was removed alongside the desktop-build CI workflows in preparation for full Electron retirement — tracked on issue #80 / Phase G.)

@@ -65,7 +65,7 @@ test.describe("File Processing — drag-drop (Web)", () => {
const download = await downloadPromise;
expect(download.suggestedFilename()).toMatch(
/^exifcleaner-files-\d{4}-\d{2}-\d{2}\.zip$/,
/^metascrub-files-\d{4}-\d{2}-\d{2}\.zip$/,
);
const tmpPath = await download.path();
@@ -167,7 +167,7 @@ test.describe("File Processing — picker (Web)", () => {
const download = await downloadPromise;
const downloadFilename = download.suggestedFilename();
expect(downloadFilename).toMatch(
/^exifcleaner-folder-sample-\d{4}-\d{2}-\d{2}\.zip$/,
/^metascrub-folder-sample-\d{4}-\d{2}-\d{2}\.zip$/,
);
const tmpPath = await download.path();

@@ -54,7 +54,7 @@ describe("BatchOutputController", () => {
expect(triggerDownload).toHaveBeenCalledTimes(1);
const [zipFile, filename] = triggerDownload.mock.calls[0]!;
expect(zipFile).toBeInstanceOf(File);
expect(filename).toMatch(/^exifcleaner-vacation-\d{4}-\d{2}-\d{2}\.zip$/);
expect(filename).toMatch(/^metascrub-vacation-\d{4}-\d{2}-\d{2}\.zip$/);
expect(controller.mode).toBe("individual");
expect(controller.entries).toHaveLength(0);
});
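
For reference, the filename template these assertions match can be reduced to a small pure function. This is a sketch only: `zipFilename` is a hypothetical helper, not something exported by `batch_output.ts`, and the injectable `now` parameter is an assumption added to make the date deterministic.

```typescript
// Hypothetical helper mirroring the template `metascrub-${rootLabel}-${date}.zip`.
function zipFilename(rootLabel: string, now: Date = new Date()): string {
  // toISOString() is always UTC, so this is the UTC calendar date (YYYY-MM-DD).
  const date = now.toISOString().slice(0, 10);
  return `metascrub-${rootLabel}-${date}.zip`;
}
```

Because the date comes from `toISOString()`, a download made late in the evening in a timezone behind UTC carries the next day's date. The regex-based assertions are unaffected, since they only check the shape of the date.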

@@ -1,5 +1,94 @@
import { describe, it, expect } from "vitest";
import { decideBatchMode } from "../../../src/infrastructure/web/web_api";
import { describe, it, expect, beforeEach, afterEach } from "vitest";
import {
decideBatchMode,
loadSettingsFromStorage,
} from "../../../src/infrastructure/web/web_api";
import { DEFAULT_SETTINGS } from "../../../src/domain";
function installFakeLocalStorage(): {
store: Map<string, string>;
} {
const store = new Map<string, string>();
(globalThis as Record<string, unknown>).localStorage = {
getItem: (key: string) => store.get(key) ?? null,
setItem: (key: string, value: string) => {
store.set(key, value);
},
removeItem: (key: string) => {
store.delete(key);
},
clear: () => store.clear(),
key: () => null,
length: 0,
};
return { store };
}
describe("loadSettingsFromStorage (legacy → MetaScrub migration)", () => {
let store: Map<string, string>;
beforeEach(() => {
({ store } = installFakeLocalStorage());
});
afterEach(() => {
delete (globalThis as Record<string, unknown>).localStorage;
});
it("returns DEFAULT_SETTINGS when neither key is present", () => {
expect(loadSettingsFromStorage()).toEqual(DEFAULT_SETTINGS);
});
it("reads from the new metascrub-settings-v1 key when present", () => {
store.set(
"metascrub-settings-v1",
JSON.stringify({
...DEFAULT_SETTINGS,
preserveOrientation: false,
}),
);
const settings = loadSettingsFromStorage();
expect(settings.preserveOrientation).toBe(false);
});
it("migrates from legacy exifcleaner-settings-v1 on first read", () => {
store.set(
"exifcleaner-settings-v1",
JSON.stringify({
...DEFAULT_SETTINGS,
preserveColorProfile: false,
language: "fr",
}),
);
const settings = loadSettingsFromStorage();
// Migrated value carries over.
expect(settings.preserveColorProfile).toBe(false);
expect(settings.language).toBe("fr");
// New key was populated, legacy key cleared.
expect(store.get("metascrub-settings-v1")).not.toBeUndefined();
expect(store.has("exifcleaner-settings-v1")).toBe(false);
});
it("ignores the legacy key when the new key is already populated", () => {
// Simulates a user who has already migrated and then has stale legacy data
// (e.g. from accidentally restoring an old backup).
store.set(
"metascrub-settings-v1",
JSON.stringify({ ...DEFAULT_SETTINGS, preserveOrientation: true }),
);
store.set(
"exifcleaner-settings-v1",
JSON.stringify({ ...DEFAULT_SETTINGS, preserveOrientation: false }),
);
const settings = loadSettingsFromStorage();
expect(settings.preserveOrientation).toBe(true);
// Legacy key untouched (we only clear it during the actual migration path).
expect(store.has("exifcleaner-settings-v1")).toBe(true);
});
});
function fileWithoutFolder(name: string): File {
return new File([new Uint8Array([0])], name, { type: "image/jpeg" });

@@ -13,12 +13,15 @@ describe("classifyUnsupportedExtension", () => {
}
});
it("classifies HEIC/HEIF as pending while #48 is open", () => {
it("classifies HEIC/HEIF as pending with no follow-up link", () => {
for (const ext of ["HEIC", "HEIF"]) {
const entry = classifyUnsupportedExtension(ext);
expect(entry?.class).toBe("pending");
expect(entry?.messageKey).toBe("error.pending.heic");
expect(entry?.linkHref).toContain("/issues/48");
// linkHref dropped in the MetaScrub rebrand — the in-app message
// stands on its own; we no longer leak a private Forgejo URL to
// public PWA users.
expect(entry?.linkHref).toBeUndefined();
}
});

@@ -476,11 +476,14 @@ async function runForensics({
.sort();
// 5. Stray markers — would indicate the strategy stamped its own
// fingerprint into the output (e.g. "ExifCleaner" or "JSZip" comments).
// fingerprint into the output (e.g. "MetaScrub" / "ExifCleaner" / "JSZip"
// comments). "exifcleaner" is kept alongside the new "metascrub" name to
// catch any residual pre-rebrand fingerprint regression.
const strayMarkers: string[] = [];
for (const line of stringsOutput.split("\n")) {
if (
/exifcleaner/i.test(line) ||
/metascrub/i.test(line) ||
/officestrategy/i.test(line) ||
line.startsWith("BeginExifToolUpdate")
) {

@@ -422,14 +422,21 @@ function runForensics(label: string, path: string): ForensicReport {
const walkResult = walkAndDecompress(bytes);
// Stray markers: scan the file BYTES (not exiftool reader output) for
// case-insensitive "exiftool" or "exifcleaner" substrings — those would
// indicate the strip tool stamped its own fingerprint into the output.
// case-insensitive "exiftool" / "exifcleaner" / "metascrub" substrings —
// those would indicate the strip tool stamped its own fingerprint into the
// output. "exifcleaner" is kept alongside the new "metascrub" name to catch
// any residual pre-rebrand fingerprints that might re-appear from a stale
// dependency or a regression.
// Note: `exiftool -a -G1 -s` always reports `[ExifTool] ExifToolVersion`
// as its own banner regardless of file contents, so reading from the
// reader's output produces a false positive. Read from `strings` instead.
const strayMarkers: string[] = [];
for (const line of stringsOutput.split("\n")) {
if (/exiftool/i.test(line) || /exifcleaner/i.test(line)) {
if (
/exiftool/i.test(line) ||
/exifcleaner/i.test(line) ||
/metascrub/i.test(line)
) {
strayMarkers.push(line.trim());
}
}
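
The stray-marker check above can also be expressed as a table of patterns, which might make it easier to keep the two forensic runners in sync. This is a sketch under that assumption: `FINGERPRINTS` and `findStrayMarkers` are hypothetical names, and the input is assumed to be the newline-separated output of `strings`.

```typescript
// Hypothetical pattern table; the runners in this PR inline these tests instead.
// None of the patterns use the `g` flag, so RegExp.test() stays stateless.
const FINGERPRINTS: RegExp[] = [/exiftool/i, /exifcleaner/i, /metascrub/i];

// Returns every trimmed line that contains at least one tool fingerprint.
function findStrayMarkers(stringsOutput: string): string[] {
  const hits: string[] = [];
  for (const line of stringsOutput.split("\n")) {
    if (FINGERPRINTS.some((pattern) => pattern.test(line))) {
      hits.push(line.trim());
    }
  }
  return hits;
}
```

An empty result is the passing case; any hit means the output carries a tool fingerprint.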

@@ -47,8 +47,8 @@ function webCspPlugin(): Plugin {
export default defineConfig({
root: resolve(__dirname, "src/web"),
publicDir: resolve(__dirname, "public"),
// Relative paths so the same bundle works under both https://exifcleaner.app/
// and Electron's file:// load of dist/web/index.html (Phase D #14).
// Relative paths so the bundle works at any deploy root (PWA install,
// subdirectory hosting, the single-file standalone build at file://).
base: "./",
build: {
outDir: resolve(__dirname, "dist/web"),