
feat(authoring): ship image embedding + custom-font loading (closes v0.1.0 wrapping gaps) #38

Merged
billdenney merged 4 commits into main from claude/close-wrapping-gaps on May 21, 2026

Conversation

@billdenney
Member

Summary

Closes the two wrapping-only gaps that the comparison vignette previously flagged: image embedding and custom fonts. After this PR, the only authoring gaps left are the two upstream-PDFium ones (FPDF_SetMetaText, FPDF_SetEncryption), for which we've already proposed upstream patches.

Along the way, this fixes a silent bug in pdf_save() where page objects newly inserted into freshly created docs were dropped from the saved PDF. Stacks on top of #36 (already merged); the "refactor: tighten package scope" commit and the two "docs:" commits at the base come from the scope-tighten branch that was rebased in.

New public surface

  • pdf_image_new(page, jpeg, bounds = NULL) — embeds JPEG bytes inline via FPDFImageObj_LoadJpegFileInline. jpeg accepts raw bytes or a file path; bounds = c(left, bottom, right, top) controls placement.
  • pdf_font_load_standard(doc, name) — wraps FPDFText_LoadStandardFont for the 14 PDF standard fonts.
  • pdf_font_load(doc, font_data, type = "truetype", cid = TRUE) — wraps FPDFText_LoadFont for arbitrary TrueType / Type1 bytes. Returns a pdfium_font handle.
  • pdf_font_close(font) — idempotent, mirrors pdf_doc_close.
  • pdf_text_new() now polymorphic on font — accepts either a standard-font name (existing path) or a pdfium_font handle (new path via FPDFPageObj_CreateTextObj).
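The bounds argument of pdf_image_new can be illustrated with a little arithmetic. The sketch below is a guess at the mapping, not the package's actual code: it assumes the image occupies the unit square and that bounds = c(left, bottom, right, top) becomes a pure scale-and-translate matrix in the (a, b, c, d, e, f) order that FPDFImageObj_SetMatrix takes. The helper names bounds_to_matrix and apply_matrix are hypothetical, as is the left < right / bottom < top validation rule.

```cpp
#include <array>
#include <cassert>
#include <stdexcept>

// Hypothetical helper (not part of the package or of PDFium): converts
// bounds = {left, bottom, right, top} into the six affine coefficients
// {a, b, c, d, e, f} in the order FPDFImageObj_SetMatrix takes them.
std::array<double, 6> bounds_to_matrix(const std::array<double, 4>& bounds) {
    const double left = bounds[0], bottom = bounds[1];
    const double right = bounds[2], top = bounds[3];
    // Assumed validation rule: the rectangle must have positive area.
    if (!(right > left) || !(top > bottom)) {
        throw std::invalid_argument(
            "bounds must satisfy left < right and bottom < top");
    }
    // No rotation or skew (b = c = 0): scale the unit square to the
    // rectangle's width and height, then translate to (left, bottom).
    return {right - left, 0.0, 0.0, top - bottom, left, bottom};
}

// Applies a PDF affine matrix to a point:
// (x, y) -> (a*x + c*y + e, b*x + d*y + f).
std::array<double, 2> apply_matrix(const std::array<double, 6>& m,
                                   double x, double y) {
    return {m[0] * x + m[2] * y + m[4], m[1] * x + m[3] * y + m[5]};
}
```

Under these assumptions, bounds of (72, 72, 272, 222) place the image's lower-left corner at (72, 72) and its upper-right corner at (272, 222), so a 200 x 150 point rectangle.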

New pdfium_font S3 class. Its externalptr carries an FPDFFont_Close finalizer; the parent doc is pinned in the prot slot. is_open(font) checks own-ptr validity AND parent-doc liveness.
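The ownership shape described here (a child handle that pins its parent and reports open only while both are live) can be modelled in plain C++. This is a toy analogy, not the package's implementation: Doc and FontHandle are hypothetical types, std::shared_ptr stands in for the externalptr prot slot, and a std::unique_ptr stands in for the native font pointer that FPDFFont_Close would release.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy stand-in for a document handle; `open` models doc liveness.
struct Doc {
    bool open = true;
    void close() { open = false; }
};

// Toy stand-in for a pdfium_font handle (hypothetical type).
class FontHandle {
public:
    explicit FontHandle(std::shared_ptr<Doc> parent)
        : parent_(std::move(parent)), native_(std::make_unique<int>(0)) {}

    // Open only if the font's own pointer is still set AND the parent
    // document has not been closed.
    bool is_open() const {
        return native_ != nullptr && parent_ != nullptr && parent_->open;
    }

    // Idempotent: resetting an already-empty unique_ptr is a no-op.
    void close() { native_.reset(); }

private:
    std::shared_ptr<Doc> parent_;  // pins the parent, like the prot slot
    std::unique_ptr<int> native_;  // placeholder for the native font pointer
};
```

Holding the parent by shared_ptr means the document object cannot be destroyed while a font still refers to it, which is the property the prot slot provides on the R side.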

The flush_dirty_pages bug

pdf_save's auto-flush was calling FPDFPage_GenerateContent on a freshly-loaded FPDF_PAGE rather than the user's actual handle. The fresh handle didn't share the in-memory page-object array that holds user inserts, so its empty state got serialised over the real edits — every newly-inserted path / rect / text / image disappeared on round-trip.

Fix tracks open pdfium_page handles in doc$state$open_pages keyed by page index; flush_dirty_pages uses the registered handle when one is valid (falls back to fresh-load otherwise, still correct for page-dict edits like rotation). Existing tests never round-tripped a newly-authored page-object through pdf_save + pdf_doc_open, which is why the bug stayed latent — the new test suites now cover this end-to-end.
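A simplified model of the fix may help make the failure mode concrete. It is a sketch under stated assumptions rather than the real code (which lives in R, in doc$state$open_pages): PageHandle, Doc, and every member below are hypothetical, strings stand in for page objects, and "stored" stands in for the serialised PDF content.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for an FPDF_PAGE handle: each handle has its own
// in-memory page-object list.
struct PageHandle {
    bool valid = true;
    std::vector<std::string> objects;
};

struct Doc {
    std::map<int, std::vector<std::string>> stored;  // serialised content
    std::map<int, PageHandle*> open_pages;           // non-owning registry

    // A freshly loaded handle sees only what is already stored; it does
    // not share the object list of any other handle.
    PageHandle load_page(int index) const {
        PageHandle fresh;
        auto it = stored.find(index);
        if (it != stored.end()) fresh.objects = it->second;
        return fresh;
    }

    void register_page(int index, PageHandle* handle) {
        open_pages[index] = handle;
    }
    void deregister_page(int index) { open_pages.erase(index); }

    // Prefer the registered user handle; fall back to a fresh load, which
    // rewrites the stored content unchanged.
    void flush_page(int index) {
        auto it = open_pages.find(index);
        if (it != open_pages.end() && it->second != nullptr &&
            it->second->valid) {
            stored[index] = it->second->objects;
        } else {
            stored[index] = load_page(index).objects;
        }
    }
};
```

In this model the old behaviour corresponds to always taking the else branch: objects pushed onto the user's handle never reach stored, which mirrors how the inserts vanished on round-trip.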

Test plan

  • test-image-authoring.R: 13 tests — raw + path input, bounds validation, read-only / closed-page rejection, save-and-reopen round-trip
  • test-font-authoring.R: 22 tests — all 14 standard fonts, system-TTF load (skips when no candidate found), pdf_text_new dispatch on pdfium_font, save-and-reopen preserves the text run
  • Full suite: 2,210 passing serially. R coverage still 100% on every gated file.
  • tools/check-pkgdown-reference.R + tools/check-rd-xrefs.R clean.
  • _pkgdown.yml reference index updated (new "Font loading" section; image creator added to "Page-object creation").
  • Vignette comparison.Rmd, README.Rmd, NEWS.md, dev/v0.1.0-api-gap-audit.md all reflect the closure.

billdenney and others added 4 commits May 21, 2026 19:54
pdf_save's flush_dirty_pages was loading a fresh FPDF_PAGE via
FPDF_LoadPage and calling FPDFPage_GenerateContent on that handle —
but a freshly-loaded handle doesn't share the in-memory
page-object array of the user's original handle (the one that
holds inserts from pdf_path_new / pdf_rect_new / pdf_text_new /
pdf_image_new / annotation authoring / etc.). The fresh handle's
empty in-memory state then got serialised over the user's edits,
silently dropping every just-inserted page object from the saved
PDF.

The fix tracks every open pdfium_page handle in doc$state$open_pages
keyed by page index. new_pdfium_page registers; pdf_page_close
deregisters. flush_dirty_pages uses the registered handle when one
is valid; falls back to the legacy fresh-load path otherwise (still
correct for page-dict edits like rotation, where the modification
landed on the dict and re-generating empty content is a no-op).

Existing tests were oblivious because none of them round-tripped a
newly-authored page-object through pdf_save + pdf_doc_open. The
new test-image-authoring.R / test-font-authoring.R suites added in
the next commit cover this end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the two wrapping-only gaps in the v0.1.0 authoring surface:

  * pdf_image_new(page, jpeg, bounds) — wraps
    FPDFImageObj_NewImageObj + FPDFImageObj_LoadJpegFileInline +
    FPDFImageObj_SetMatrix. JPEG bytes are embedded inline (the
    "Inline" variant copies into the PDF up front, so the input
    raw vector is free to be GC'd after the call). bounds =
    c(left, bottom, right, top) maps the image's unit square to a
    placement rectangle.

  * pdf_font_load_standard(doc, name) — one of the 14 PDF
    standard fonts via FPDFText_LoadStandardFont. No bytes
    embedded.

  * pdf_font_load(doc, font_data, type, cid) — arbitrary
    TrueType / Type1 bytes via FPDFText_LoadFont. font_data can be
    a raw vector or a path. cid defaults TRUE for full Unicode
    coverage.

  * pdf_font_close(font) — idempotent, mirrors pdf_doc_close.

  * pdf_text_new() now dispatches on the `font` argument: a
    character argument hits the standard-font shortcut (existing
    code path), a `pdfium_font` handle hits the new
    FPDFPageObj_CreateTextObj path.

The pdfium_font handle is a new fourth top-level handle class
(alongside doc / page / obj). Its externalptr carries a finalizer
that calls FPDFFont_Close; the parent doc is pinned in the prot
slot. is_open(font) tracks both own-ptr validity and parent-doc
liveness (matches the doc-owned-with-finalizer shape from annot).

Tests:
  * test-image-authoring.R covers raw bytes, file paths, bounds
    validation, read-only / closed-page rejection, plus a
    save-and-reopen round-trip that confirms the JPEG actually
    lands in the PDF.
  * test-font-authoring.R covers all 14 standard fonts; loads a
    system TrueType font (DejaVu / Liberation / Arial — skips
    when no candidate is available); confirms pdf_text_new
    dispatch on the handle; verifies the round-trip preserves the
    embedded text run.

Vignette comparison + README + NEWS updated: the wrapping-gap
section is now empty, leaving FPDF_SetMetaText and
FPDF_SetEncryption as the only remaining v0.1.0 authoring gaps —
both upstream blockers we've already proposed patches for.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- dev/v0.2.0-plan.md: drop the "page-object authoring is a
  non-goal" item (paths/text/rect have shipped in v0.1.0; images
  + custom fonts just landed). Reframe as "PNG / TIFF / raw-bitmap
  image embedding" — the actual remaining v0.2.0 work, which
  needs an FPDF_BITMAP lifecycle class.

- vignettes/mutating-pdfs.Rmd: fix the stale "Building a page
  from scratch" example (the pdf_rect_new signature was outdated;
  the pdf_text_new call omitted its `text` argument and used `size=`
  instead of `font_size=`). Add two new subsections under it:
  "Embedding a JPEG image" demonstrates pdf_image_new with both
  natural-size and explicit-bounds placement; "Custom fonts"
  shows pdf_font_load + the pdf_text_new(font = handle) dispatch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Windows GCC 14.3 (MinGW via rtools45) is strict about the
int → Rboolean enum conversion in R_RegisterCFinalizerEx's third
argument (`Rboolean onexit`) and errors under -Wall -pedantic with
"invalid conversion from 'int' to 'Rboolean' [-fpermissive]".

Linux GCC accepted the bare TRUE silently, which is why this
slipped through the local check. Switch to the static_cast pattern
the rest of the codebase already uses (mutation.cpp,
annot_handles.cpp, form_field_handles.cpp, page.cpp).

Both call sites in font_authoring.cpp updated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
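The conversion rule behind this error can be reproduced without R's headers. The sketch below uses a mock enum and a mock registration function, both hypothetical; it only assumes that the flag reaches the call as a plain int, which is what the quoted compiler message suggests.

```cpp
#include <cassert>

// Mock of an unscoped enum shaped like Rboolean (the real definition
// lives in R's headers).
enum MockRboolean { MOCK_FALSE = 0, MOCK_TRUE = 1 };

// Hypothetical stand-in for a function whose last parameter is an enum,
// like the `Rboolean onexit` argument of R_RegisterCFinalizerEx.
inline bool mock_register_finalizer(MockRboolean onexit) {
    return onexit == MOCK_TRUE;
}

inline bool register_with_onexit() {
    const int flag = 1;
    // mock_register_finalizer(flag);  // ill-formed in C++: there is no
    //                                 // implicit int -> enum conversion
    // The explicit cast is well-formed on every conforming compiler.
    return mock_register_finalizer(static_cast<MockRboolean>(flag));
}
```

C++ allows enum-to-int conversion implicitly but not the reverse, so the explicit static_cast is the portable spelling regardless of how lenient a given compiler is.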
@billdenney billdenney merged commit 6c464f6 into main May 21, 2026
13 checks passed
@billdenney billdenney deleted the claude/close-wrapping-gaps branch May 21, 2026 20:23