Commit graph

13 commits

Author SHA1 Message Date
c5a26fc464 Merge pull request 'Add README command to list available qwen2.5vl tags' (#5) from readme-list-tags into master 2026-06-27 11:37:26 +04:00
Randa
26954bb01f Add README command to list available qwen2.5vl tags
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-27 11:37:11 +04:00
80126622a0 Merge pull request 'Update README: remove document types, document all CLI params' (#4) from readme-update into master 2026-06-27 11:30:30 +04:00
Randa
7c612778ac Update README: drop document-type docs, add Options table with all params
- Rewrite intro to describe the single universal prompt (no per-page detection)
- Remove the Document Types table and --type references
- Add --ctx and --timeout usage examples
- Add an Options table documenting every flag and default, including --poppler
- Fix output-format example to drop the removed Type label

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-27 11:28:50 +04:00
05fa727036 Add JPEG and PNG image support (#2)
## Summary

- Accept `.jpg`, `.jpeg`, and `.png` files in addition to `.pdf`
- Images are loaded directly via Pillow — no poppler required
- Unsupported extensions fail fast with a clear error message
- Output header uses "Image N" for images, "Page N" for PDFs
- `--dpi` and `--poppler` args apply to PDFs only (no behaviour change)

## Test plan

- [ ] Run on a JPEG scan and verify output is correct
- [ ] Run on a PNG and verify output is correct
- [ ] Run on a PDF and verify nothing regressed
- [ ] Pass an unsupported extension and verify the error message

Co-authored-by: Randa <obuvuyoviz26@gmail.com>
Reviewed-on: http://forgejo.localhost:3000/forgejo_admin/arabic-ocr/pulls/2
2026-06-26 22:22:49 +04:00
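Sketch for 05fa727036: the extension handling in that commit's summary (accept `.jpg`, `.jpeg`, `.png`, and `.pdf`, fail fast on anything else, and label output "Image N" or "Page N") might look roughly like the following. The names `classify_input` and `page_header` are hypothetical and do not come from the repository; the script's actual structure is not shown in this log.

```python
from pathlib import Path

# Extensions named in the commit summary.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}
PDF_EXTS = {".pdf"}


def classify_input(path: str) -> str:
    """Return "pdf" or "image" for a supported file, else raise ValueError.

    Hypothetical helper: the check is case-insensitive and looks only at
    the file extension, not at the file contents.
    """
    ext = Path(path).suffix.lower()
    if ext in PDF_EXTS:
        return "pdf"
    if ext in IMAGE_EXTS:
        return "image"
    supported = ", ".join(sorted(PDF_EXTS | IMAGE_EXTS))
    raise ValueError(f"Unsupported file type {ext!r}; expected one of: {supported}")


def page_header(kind: str, n: int) -> str:
    """Build the output header: "Image N" for images, "Page N" for PDFs."""
    label = "Image" if kind == "image" else "Page"
    return f"{label} {n}"
```

Checking the extension before any rendering work is what makes the failure fast: an unsupported file is rejected before Pillow or poppler is involved.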
Randa
5f816bf3fa Add streaming, /no_think, token stats, --ctx/--timeout args; bump defaults to 12K ctx and 600s timeout
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-26 21:56:50 +04:00
Randa
8b5273c949 Add Windows support: --poppler arg and Windows setup instructions in README
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-26 20:32:45 +04:00
Randa
eee3600b33 Switch to streaming, add --ctx and --timeout args, print eval_count per page
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-26 20:30:27 +04:00
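Sketch for eee3600b33: one way to consume a streamed response and report `eval_count` per page. It assumes the server sends newline-delimited JSON chunks carrying a `response` text fragment, a `done` flag, and an `eval_count` on the final chunk. That is the shape of an Ollama-style generate stream, but the log does not confirm which API the script calls, and `collect_stream` is a hypothetical name.

```python
import json
from typing import Iterable, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, int]:
    """Join streamed text fragments and return (text, eval_count).

    Assumes each non-empty line is one JSON object. The final object is
    assumed to have "done": true and to carry the token count.
    """
    parts = []
    eval_count = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip keep-alive blank lines
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            eval_count = chunk.get("eval_count", 0)
            break
    return "".join(parts), eval_count
```

Because the function takes any iterable of lines, it can be fed from an HTTP response iterator in the real script or from a plain list in a test.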
Randa
eb97632f6d Rewrite README with Hugging Face model pull instructions; prompt iteration 3
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-26 20:22:25 +04:00
Randa
f208354782 Prompt iteration 2: expert manuscript persona + context reconstruction
- Frame model as Arabic manuscript scholar with decades of experience
- Name specific script styles (نسخ، رقعة، ديواني، إجازة، كوفي) to activate deeper knowledge
- Instruct use of surrounding context to infer unclear words rather than skipping
- [؟] now only used when context-based reconstruction also fails

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-26 19:13:43 +04:00
Randa
622bc16a3a Simplify to single universal mixed prompt, remove detection pass
- Replace two-pass (detect + extract) with one call per page
- Single PROMPT handles all content: handwritten, IDs, tables, forms, printed text
- Remove --type flag, detect_type(), and PROMPTS dict
- Halves API calls and eliminates misclassification errors

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-26 18:47:18 +04:00
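Sketch for 622bc16a3a: the single-pass design described in that commit amounts to one model call per page with one shared prompt. `extract_pages` and `call_model` are hypothetical names, and `PROMPT` below is a placeholder, since the real prompt text does not appear in this log.

```python
from typing import Callable, List

# Placeholder only; the actual prompt text is not shown in the commit log.
PROMPT = "<universal extraction prompt>"


def extract_pages(
    pages: List[bytes],
    call_model: Callable[[str, bytes], str],
    prompt: str = PROMPT,
) -> List[str]:
    """Make exactly one model call per page, returning outputs in page order.

    There is no separate detection call: the same prompt is sent with
    every page regardless of its content.
    """
    return [call_model(prompt, page) for page in pages]
```

With a two-pass design each page costs a detection call plus an extraction call, so dropping detection halves the call count, as the commit message states.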
Randa
01ff048411 Add README with setup, usage, and document type reference
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-26 18:33:59 +04:00
Randa
5aec8a5c6c Initial commit: smart Arabic OCR script with document-aware prompting
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-26 18:31:44 +04:00