12 Commits

Author SHA1 Message Date
Zach Nelson
7dc518624c fix: Use fixed-point fractional x-advance and kerning for better text layout (#1168)
## Summary

**What is the goal of this PR?**

Hopefully fixes #1182.

_Note: I think letterforms got a "heavier" appearance after #1098, which
makes this more noticeable. The current version of this PR reverts the
change to add `--force-autohint` for Bookerly, which to me seems to
bring the font back to a more aesthetic and consistent weight._

#### Problem

Character spacing was uneven in certain words. The word "drew" in
Bookerly was the clearest example: a visible gap between `d` and `r`,
while `e` and `w` appeared tightly condensed. The root cause was
twofold:

1. **Integer-only glyph advances.** `advanceX` was stored as a `uint8_t`
of whole pixels, sourced from FreeType's hinted `advance.x` (which
grid-fits to integers). A glyph whose true advance is 15.56px was stored
as 16px -- an error of +0.44px per character that compounds across a
line.

2. **Floor-rounded kerning.** Kern adjustments were converted with
`math.floor()`, which systematically over-tightened negative kerns. A
kern of -0.3px became -1px -- a 0.7px over-correction that visibly
closed gaps.

Combined, these produced the classic symptom: some pairs too wide,
others too tight, with the imbalance varying per word.
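The kerning half of the problem can be reproduced in a few lines. The sketch below is only an illustration: `kernFloorPx` and `kernRoundFp4` are hypothetical names, not functions from `fontconvert.py` (which is Python), and the second helper assumes the kern fits in the 4.4 range.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helpers that contrast the two rounding strategies.

// Old behavior: floor to whole pixels. Any negative kern, however small,
// is pushed to the next more negative integer, so -0.3px becomes -1px.
inline int kernFloorPx(double kernPx) {
    return static_cast<int>(std::floor(kernPx));
}

// New behavior: round to the nearest 1/16 pixel (4 fractional bits).
// Assumes the input lies within -8.0 .. +7.9375 px; no clamping is done.
inline int8_t kernRoundFp4(double kernPx) {
    return static_cast<int8_t>(std::lround(kernPx * 16.0));
}
```

With these definitions, `kernFloorPx(-0.3)` yields `-1` (a whole pixel), while `kernRoundFp4(-0.3)` yields `-5`, i.e. -0.3125px.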

#### Solution: fixed-point accumulation with 1/16-pixel resolution, for sub-pixel precision during text layout

All font metrics now use a "fixed-point 4" format -- 4 fractional bits
giving 1/16-pixel (0.0625px) resolution. This is implemented with plain
integer arithmetic (shifts and adds), requiring no floating-point on the
ESP32.

**How it works:**

A value like 15.56px is stored as the integer `249`:

```
249 = 15 * 16 + 9    (where 9/16 = 0.5625, closest to 0.56)
```
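As a rough illustration of that encoding, the sketch below converts a pixel value to the 4-fractional-bit format and splits it back into whole and fractional parts. The names `encodeFp4`, `wholePart`, `fracPart`, and `decodeFp4` are made up for this example and are not the PR's `fp4` helpers.

```cpp
#include <cmath>
#include <cstdint>

// Assumed names for illustration only.
constexpr int kFracBits = 4;          // 4 fractional bits
constexpr int kOne = 1 << kFracBits;  // 16 fixed-point units per pixel

// Round a pixel value to the nearest 1/16 px (done offline, so floats are fine).
inline uint16_t encodeFp4(double px) {
    return static_cast<uint16_t>(std::lround(px * kOne));
}

// Integer-only accessors, as would be used on the device.
constexpr int wholePart(uint16_t fp) { return fp >> kFracBits; }
constexpr int fracPart(uint16_t fp) { return fp & (kOne - 1); }

// Debug-only conversion back to a floating-point pixel value.
constexpr double decodeFp4(uint16_t fp) { return static_cast<double>(fp) / kOne; }
```

Here `encodeFp4(15.56)` gives `249`, which splits into a whole part of `15` and a fractional part of `9` sixteenths.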

Two storage widths share the same 4 fractional bits:

| Field | Type | Format | Range | Use |
|-------|------|--------|-------|-----|
| `advanceX` | `uint16_t` | 12.4 | 0 -- 4095.9375 px | Glyph advance width |
| `kernMatrix` | `int8_t` | 4.4 | -8.0 -- +7.9375 px | Kerning adjustment |

Because both have 4 fractional bits, they add directly into a single
`int32_t` accumulator during layout. The accumulator is only snapped to
the nearest whole pixel at the moment each glyph is rendered:

```cpp
int32_t xFP = fp4::fromPixel(startX);     // pixel to 12.4: startX << 4

for each character:
    xFP += kernFP;                          // add 4.4 kern (sign-extends into int32_t)
    int xPx = fp4::toPixel(xFP);           // snap to nearest pixel: (xFP + 8) >> 4
    render glyph at xPx;
    xFP += glyph->advanceX;                // add 12.4 advance
```

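Fractional remainders carry forward indefinitely. The snap-to-pixel
error at each glyph stays within +/- 0.5px and never compounds; only the
quantization error of each stored metric (at most 1/32px) can add up.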
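A compilable version of the loop might look like the following. This is a minimal sketch under assumptions: `fp4sketch` stands in for the PR's `fp4` namespace, and `GlyphMetrics` and `layoutPenPositions` are hypothetical names that bundle only the two values the loop needs.

```cpp
#include <cstdint>
#include <vector>

namespace fp4sketch {  // hypothetical stand-in for the PR's `fp4` namespace
// Pixel to fixed-point; multiplication avoids shifting a negative value.
constexpr int32_t fromPixel(int px) { return static_cast<int32_t>(px) * 16; }
// Snap to the nearest pixel. Relies on arithmetic right shift for negative
// values, which is guaranteed only from C++20 on (but universal in practice).
constexpr int toPixel(int32_t fp) { return static_cast<int>((fp + 8) >> 4); }
}  // namespace fp4sketch

struct GlyphMetrics {   // hypothetical: just what the loop needs
    uint16_t advanceX;  // 12.4 fixed-point advance
    int8_t kernBefore;  // 4.4 fixed-point kern against the previous glyph
};

// Returns the snapped pen x-position at which each glyph would be drawn.
std::vector<int> layoutPenPositions(int startX, const std::vector<GlyphMetrics>& glyphs) {
    std::vector<int> pens;
    int32_t xFP = fp4sketch::fromPixel(startX);
    for (const GlyphMetrics& g : glyphs) {
        xFP += g.kernBefore;                      // sign-extends into int32_t
        pens.push_back(fp4sketch::toPixel(xFP));  // snap only when drawing
        xFP += g.advanceX;                        // fraction carries forward
    }
    return pens;
}
```

For ten glyphs with an advance of 249 (15.56px) starting at x = 0, the tenth glyph lands at pixel 140; with whole-pixel advances of 16px it would land at 144.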

#### Concrete example: "drew" in Bookerly

**Before** (integer advances, floor-rounded kerning):

| Char | Advance | Kern | Cursor | Snap | Gap from prev |
|------|---------|------|--------|------|---------------|
| d | 16 px | -- | 33 | 33 | -- |
| r | 12 px | 0 | 49 | 49 | ~2px |
| e | 13 px | -1 | 60 | 60 | ~0px |
| w | 22 px | -1 | 72 | 72 | ~0px |

The d-to-r gap was visibly wider than the tightly packed `rew`.

**After** (12.4 advances, 4.4 kerning, fractional accumulation):

| Char | Advance (FP) | Kern (FP) | Accumulator | Snap | Ink start | Gap from prev |
|------|-------------|-----------|-------------|------|-----------|---------------|
| d | 249 (15.56px) | -- | 528 | 33 | 34 | -- |
| r | 184 (11.50px) | 0 | 777 | 49 | 49 | 0px |
| e | 208 (13.00px) | -8 (-0.50px) | 953 | 60 | 61 | 1px |
| w | 356 (22.25px) | -4 (-0.25px) | 1157 | 72 | 72 | 0px |

Spacing is now `0, 1, 0` pixels -- nearly uniform. Verified on-device:
all 5 copies of "drew" in the test EPUB produce identical spacing,
confirming zero accumulator drift.
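The Accumulator and Snap columns of the "after" table can be re-derived at compile time. The sketch below uses two hypothetical helpers, `nextPen` and `snap`; the "Ink start" and "Gap" columns depend on glyph bearings that are not listed here, so they are not reproduced.

```cpp
#include <cstdint>

// Hypothetical helper: advance past the previous glyph, then apply the kern
// for the new pair. All values are fixed-point with 4 fractional bits.
constexpr int32_t nextPen(int32_t penFP, uint16_t prevAdvanceFP, int8_t kernFP) {
    return penFP + prevAdvanceFP + kernFP;
}

// Snap to the nearest whole pixel.
constexpr int snap(int32_t penFP) { return static_cast<int>((penFP + 8) >> 4); }

constexpr int32_t penD = 33 * 16;                 // start at x = 33px -> 528
constexpr int32_t penR = nextPen(penD, 249, 0);   // after 'd' (15.56px), no kern
constexpr int32_t penE = nextPen(penR, 184, -8);  // after 'r' (11.50px), kern -0.50px
constexpr int32_t penW = nextPen(penE, 208, -4);  // after 'e' (13.00px), kern -0.25px

static_assert(penR == 777 && penE == 953 && penW == 1157,
              "matches the Accumulator column");
static_assert(snap(penD) == 33 && snap(penR) == 49 && snap(penE) == 60 && snap(penW) == 72,
              "matches the Snap column");
```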

#### Changes

**Font conversion (`fontconvert.py`)**
- Use `linearHoriAdvance` (FreeType 16.16, unhinted) instead of
`advance.x` (26.6, grid-fitted to integers) for glyph advances
- Encode kern values as 4.4 fixed-point with `round()` instead of
`floor()`
- Add `fp4_from_ft16_16()` and `fp4_from_design_units()` helper
functions
- Add module-level documentation of fixed-point conventions
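The two conversions named above are written in Python in `fontconvert.py`; the following C++ sketch only illustrates the arithmetic they plausibly perform. The function names mirror the Python helpers, but the signatures, the rounding mode, and the use of `ppem` and `unitsPerEm` for scaling design units are assumptions.

```cpp
#include <cstdint>

// FreeType 16.16 -> 12.4: drop 12 of the 16 fractional bits, rounding to
// nearest. Assumes a non-negative advance that fits the 12.4 range.
constexpr uint16_t fp4FromFt16_16(uint32_t v16_16) {
    return static_cast<uint16_t>((v16_16 + (1u << 11)) >> 12);
}

// Font design units -> signed fixed-point with 4 fractional bits, rounding
// half away from zero (Python's round() rounds half to even, so exact ties
// may differ). `ppem` is the pixel size, `unitsPerEm` the font's em size.
constexpr int fp4FromDesignUnits(int designUnits, int ppem, int unitsPerEm) {
    return static_cast<int>(
        static_cast<long long>(designUnits) * ppem * 16 >= 0
            ? (static_cast<long long>(designUnits) * ppem * 16 + unitsPerEm / 2) / unitsPerEm
            : -((-static_cast<long long>(designUnits) * ppem * 16 + unitsPerEm / 2) / unitsPerEm));
}
```

For example, a 16.16 value of `1019740` (about 15.56px) converts to `249`, and a kern of -31 design units at 16 ppem in a 1000-unit em converts to `-8` (-0.5px).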

**Font data structures (`EpdFontData.h`)**
- `EpdGlyph::advanceX`: `uint8_t` to `uint16_t` (no memory cost due to
existing struct padding)
- Add `fp4` namespace with `constexpr` helpers: `fromPixel()`,
`toPixel()`, `toFloat()`
- Document fixed-point conventions

**Font API (`EpdFont.h/cpp`, `EpdFontFamily.h/cpp`)**
- `getKerning()` return type: `int8_t` to `int` (to avoid truncation of
the 4.4 value)

**Rendering (`GfxRenderer.cpp`)**
- `drawText()`: replace integer cursor with `int32_t` fixed-point
accumulator
- `drawTextRotated90CW()`: same accumulator treatment for vertical
layout
- `getTextAdvanceX()`, `getSpaceWidth()`, `getSpaceKernAdjust()`,
`getKerning()`: convert from fixed-point to pixel at API boundary

**Regenerated all built-in font headers** with new 12.4 advances and 4.4
kern values.

#### Memory impact

Zero additional RAM. The `advanceX` field grew from `uint8_t` to
`uint16_t`, but the `EpdGlyph` struct already had 1 byte of padding at
that position, so the struct size is unchanged. The fixed-point
accumulator is a single `int32_t` on the stack.
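The padding argument can be checked with a `static_assert`. The field layout below is a guess made for illustration (the real `EpdGlyph` definition is not shown in this description); it only demonstrates how a `uint8_t` followed by a 2-byte-aligned field leaves one spare byte that a `uint16_t` can absorb.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layouts; field names and order are assumptions.
struct GlyphBefore {
    uint8_t width;
    uint8_t height;
    uint8_t advanceX;  // whole pixels
                       // 1 byte of implicit padding so `left` is 2-byte aligned
    int16_t left;
    int16_t top;
    uint16_t dataLength;
    uint32_t dataOffset;
};

struct GlyphAfter {
    uint8_t width;
    uint8_t height;
    uint16_t advanceX;  // 12.4 fixed-point, occupies the former padding byte
    int16_t left;
    int16_t top;
    uint16_t dataLength;
    uint32_t dataOffset;
};

// Holds on common ABIs where int16_t is 2-byte and uint32_t is 4-byte aligned.
static_assert(sizeof(GlyphBefore) == sizeof(GlyphAfter),
              "widening advanceX must not grow the struct");
```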

#### Test plan

- [ ] Verify "drew" spacing in Bookerly at small, medium, and large
sizes
- [ ] Verify uppercase kerning pairs: AVERY, WAVE, VALUE
- [ ] Verify ligature words: coffee, waffle, office
- [ ] Verify all built-in fonts render correctly at each size
- [ ] Verify rotated text (progress bar percentage) renders correctly
- [ ] Verify combining marks (accented characters) still position
correctly
- [ ] Spot-check a full-length book for any layout regressions

---

### AI Usage

While CrossPoint doesn't have restrictions on AI tools in contributing,
please be transparent about their usage as it
helps set the right context for reviewers.

Did you use AI tools to help write this code? _**YES, Claude Opus 4.6
helped figure out a non-floating point approach for sub-pixel error
accumulation**_
2026-03-01 10:43:37 -06:00
Zach Nelson
3cc8e272ca refactor: Use std binary search algorithms for font lookups (#1202)
## Summary

**What is the goal of this PR?**

Rewrite of font routines to use std binary search algorithms instead of
custom repeated implementations: `lookupKernClass`,
`EpdFont::getLigature`, and `EpdFont::getGlyph`.
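As an example of the pattern, a ligature lookup over a sorted table could be expressed with `std::lower_bound` roughly as follows. `LigatureEntry`, `findLigature`, and `sampleLigatures` are hypothetical; the actual table layout used by `EpdFont::getLigature` is not described here.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct LigatureEntry {  // hypothetical table row, sorted by (first, second)
    uint32_t first;
    uint32_t second;
    uint32_t ligature;
};

// Returns the ligature codepoint for the pair, or 0 if there is none.
inline uint32_t findLigature(const std::vector<LigatureEntry>& table,
                             uint32_t first, uint32_t second) {
    const LigatureEntry key{first, second, 0};
    const auto it = std::lower_bound(
        table.begin(), table.end(), key,
        [](const LigatureEntry& e, const LigatureEntry& k) {
            return e.first != k.first ? e.first < k.first : e.second < k.second;
        });
    if (it != table.end() && it->first == first && it->second == second) {
        return it->ligature;
    }
    return 0;
}

// Small sample table: ff, fi, fl -> U+FB00, U+FB01, U+FB02.
inline std::vector<LigatureEntry> sampleLigatures() {
    return {{0x66, 0x66, 0xFB00}, {0x66, 0x69, 0xFB01}, {0x66, 0x6C, 0xFB02}};
}
```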

---

### AI Usage

Did you use AI tools to help write this code? _**NO**_
2026-03-01 10:28:15 -06:00
Zach Nelson
0eb8a9346b feat: Support for kerning and ligatures (#873)
## Summary

**What is the goal of this PR?**
Improved typesetting, including
[kerning](https://en.wikipedia.org/wiki/Kerning) and
[ligatures](https://en.wikipedia.org/wiki/Ligature_(writing)#Latin_alphabet).

**What changes are included?**
- The script to convert built-in fonts now adds kerning and ligature
information to the generated font headers.
- Epub page layout calculates proper kerning spaces and makes ligature
substitutions according to the selected font.


![3U1B1808](https://github.com/user-attachments/assets/1accb16f-2f1a-41e5-adca-89f1f1348494)

![3U1B1810](https://github.com/user-attachments/assets/2f6bd007-490e-420f-b774-3380b4add7ea)

![3U1B1815](https://github.com/user-attachments/assets/1986bb77-2db0-46e2-a5d6-8315dae9eb19)

## Additional Context

- I am not a typography expert. 
- The implementation has been reworked from the earlier version, so it
is no longer necessary to omit Open Dyslexic, and kerning data now
covers all fonts, styles, and codepoints for which we include bitmap
data.
- Claude Opus 4.6 helped with a lot of this.
- There's an included test epub document with lots of kerning and
ligature examples, shown in the photos.

**_After some time to mature, I think this change is in decent shape to
merge and get people testing._**

After opening this PR I came across #660, which overlaps in adding
ligature support.

---

### AI Usage

Did you use AI tools to help write this code? _**YES, Claude Opus 4.6**_

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-24 11:31:43 +03:00
Zach Nelson
13fc8b94b0 refactor: Simplify REPLACEMENT_GLYPH fallback (#1119)
## Summary

**What is the goal of this PR?**

Consolidated repeated logic to fall back to REPLACEMENT_GLYPH.
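One way such a consolidation could look is a single helper that every caller uses instead of repeating the null check. This is a sketch with stand-in types: the `std::map`-based font, `Glyph`, `glyphOrReplacement`, and `sampleFont` are hypothetical, and only the `REPLACEMENT_GLYPH` name comes from the codebase (its value is assumed to be U+FFFD).

```cpp
#include <cstdint>
#include <map>

constexpr uint32_t REPLACEMENT_GLYPH = 0xFFFD;  // assumed value: U+FFFD

struct Glyph {  // hypothetical minimal glyph record
    uint16_t advanceX;
};

// Single place that implements "use the replacement glyph when the
// codepoint is missing". Returns nullptr only if the font also lacks
// the replacement glyph itself.
inline const Glyph* glyphOrReplacement(const std::map<uint32_t, Glyph>& font,
                                       uint32_t codepoint) {
    auto it = font.find(codepoint);
    if (it == font.end()) {
        it = font.find(REPLACEMENT_GLYPH);
    }
    return it == font.end() ? nullptr : &it->second;
}

// Tiny sample font containing 'A' and the replacement glyph.
inline std::map<uint32_t, Glyph> sampleFont() {
    return {{0x41, {160}}, {REPLACEMENT_GLYPH, {200}}};
}
```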

---

### AI Usage

Did you use AI tools to help write this code? _**NO**_
2026-02-23 13:32:50 +01:00
jpirnay
5f5561b684 fix: Fix hyphenation and rendering of decomposed characters (#1037)
## Summary

* This PR fixes decomposed diacritic handling end-to-end:
  - Hyphenation: normalize common Latin base+combining sequences to
  precomposed codepoints before Liang pattern matching, so decomposed
  words hyphenate correctly.
  - Rendering: correct combining-mark placement logic so non-spacing marks
  are attached to the preceding base glyph in normal and rotated text
  rendering paths, with corresponding text-bounds consistency updates.
  - Hyphenation around non-breaking space variants has been fixed (and
  extended).
  - Hyphenation of terms that already include hyphens has been fixed to
  also apply Liang patterns (e.g. "US-Satellitensystem" was previously
  broken *exclusively* at the existing hyphen).
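The hyphenation-side normalization can be pictured as a small hand-written mapping applied while collecting codepoints. The sketch below is not the PR's code: `Composition`, `kCompositions`, and `composeCommon` are hypothetical, and the table covers only a handful of pairs rather than full NFC.

```cpp
#include <cstdint>
#include <vector>

struct Composition {  // hypothetical: base + combining mark -> precomposed
    uint32_t base;
    uint32_t mark;
    uint32_t composed;
};

// A few common Latin pairs (U+0308 combining diaeresis, U+0301 combining acute).
constexpr Composition kCompositions[] = {
    {0x0041, 0x0308, 0x00C4},  // A + diaeresis -> Ä
    {0x004F, 0x0308, 0x00D6},  // O + diaeresis -> Ö
    {0x0055, 0x0308, 0x00DC},  // U + diaeresis -> Ü
    {0x0061, 0x0308, 0x00E4},  // a + diaeresis -> ä
    {0x006F, 0x0308, 0x00F6},  // o + diaeresis -> ö
    {0x0075, 0x0308, 0x00FC},  // u + diaeresis -> ü
    {0x0065, 0x0301, 0x00E9},  // e + acute     -> é
};

// Replaces each known base+mark pair with its precomposed codepoint and
// passes everything else through unchanged.
std::vector<uint32_t> composeCommon(const std::vector<uint32_t>& in) {
    std::vector<uint32_t> out;
    for (uint32_t cp : in) {
        bool merged = false;
        if (!out.empty()) {
            for (const Composition& c : kCompositions) {
                if (out.back() == c.base && cp == c.mark) {
                    out.back() = c.composed;
                    merged = true;
                    break;
                }
            }
        }
        if (!merged) out.push_back(cp);
    }
    return out;
}
```

Unknown combinations are left decomposed, which matches the "common sequences only" scope described above.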

## Additional Context

* Before
<img width="800" height="480" alt="2"
src="https://github.com/user-attachments/assets/b9c515c4-ab75-45cc-8b52-f4d86bce519d"
/>


* After
<img width="480" height="800" alt="fix1"
src="https://github.com/user-attachments/assets/4999f6a8-f51c-4c0a-b144-f153f77ddb57"
/>
<img width="800" height="480" alt="fix2"
src="https://github.com/user-attachments/assets/7355126b-80c7-441f-b390-4e0897ee3fb6"
/>

* Note 1: the hyphenation fix is not a 100% bulletproof implementation.
It adds composition of *common* base+combining sequences (e.g. O +
U+0308 -> Ö) during codepoint collection. A complete solution would
require implementing proper Unicode normalization (at least NFC,
possibly NFKC in specific cases) before hyphenation and rendering,
instead of hand-mapping a few combining marks. That was beyond the scope
of this fix.

* Note 2: the render fix should be universal and not limited to the
constraints outlined above: it properly x-centers the compound glyph over
the previous one, and it uses at least 1pt of visual distance in y.

Before:
<img width="478" height="167" alt="Image"
src="https://github.com/user-attachments/assets/f8db60d5-35b1-4477-96d0-5003b4e4a2a1"
/>

After: 
<img width="479" height="180" alt="Image"
src="https://github.com/user-attachments/assets/1b48ef97-3a77-475a-8522-23f4aca8e904"
/>

* This should resolve the issues described in #998

---

### AI Usage

Did you use AI tools to help write this code? _**PARTIALLY**_
2026-02-22 13:11:07 +11:00
Zach Nelson
448a77f02b perf: Remove hasPrintableChars pass (#971)
## Summary

**What is the goal of this PR?**

`hasPrintableChars` does a pass over text before rendering. It looks up
glyphs in the font and measures dimensions, returning early if the text
results in zero size.

This additional pass doesn't offer any benefit over moving straight to
rendering the text, because the rendering loop already gracefully
handles missing glyphs. This change saves an extra pass over all
rendered text.

Note that both `hasPrintableChars` and `renderChar` replace missing
glyphs with `glyph = getGlyph(REPLACEMENT_GLYPH)`, so there's no
difference for characters which are not present in the font.

---

### AI Usage

Did you use AI tools to help write this code? _**NO**_
2026-02-19 21:58:09 +11:00
Maeve Andrews
5fef99c641 fix: render U+FFFD replacement character instead of ? (#366)
The current behavior of rendering `?` for an unknown Unicode character
can be hard to distinguish from a typo. Use the standard Unicode
"replacement character" instead; that is what it is designed for:

https://en.wikipedia.org/wiki/Specials_(Unicode_block)

I'm making this PR as a draft because I'm not sure I did everything that
was needed to change the character set covered by the fonts. Running
that script is in its own commit. If this is proper, I'll rebase/squash
into one commit and un-draft.

Co-authored-by: Maeve Andrews <maeve@git.mail.maeveandrews.com>
2026-01-19 22:58:43 +11:00
Dave Allie
52a0b5bbe9 Small cleanups from https://github.com/juicecultus/crosspoint-reader-x4 2025-12-30 23:19:08 +11:00
Eunchurn Park
dc7544d944 Optimize glyph lookup with binary search (#125)
Replace linear O(n) search with binary search O(log n) for unicode
interval lookup. Korean fonts have many intervals (~30,000+ glyphs), so
this improves text rendering performance during page navigation.

## Summary

* **What is the goal of this PR?**

Replace linear `O(n)` glyph lookup with binary search `O(log n)` to
improve text rendering performance during page navigation.

* **What changes are included?**

- Modified `EpdFont::getGlyph()` to use binary search instead of linear
search for unicode interval lookup
- Added early return for empty interval count

## Additional Context

- Performance implications: Fonts with many unicode intervals benefit
the most. Korean fonts have ~30,000+ glyphs across multiple intervals,
but any font with significant glyph coverage (CJK, extended Latin,
emoji, etc.) will see improvement.
- Complexity: from `O(n)` to `O(log n)` where n = number of unicode
intervals. For fonts with 10+ intervals, this reduces lookup iterations
significantly.
- Risk: Low - the binary search logic is straightforward and the
intervals are already sorted by unicode codepoint (required for the
original early-exit optimization).
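The lookup described above might be sketched as follows. The `UnicodeInterval` fields, `findGlyphIndex`, and the sample table are assumptions made for illustration; the real `EpdFont::getGlyph()` returns a glyph pointer rather than an index.

```cpp
#include <cstdint>

struct UnicodeInterval {  // hypothetical layout, sorted by `first`, non-overlapping
    uint32_t first;       // first codepoint in the interval (inclusive)
    uint32_t last;        // last codepoint in the interval (inclusive)
    uint32_t offset;      // glyph index of `first`
};

// Binary search over the sorted intervals: O(log n) in the interval count.
// Returns the glyph index, or -1 if the codepoint is not covered.
int findGlyphIndex(const UnicodeInterval* intervals, int count, uint32_t cp) {
    if (count == 0) return -1;  // early return for an empty table
    int lo = 0;
    int hi = count - 1;
    while (lo <= hi) {
        const int mid = lo + (hi - lo) / 2;
        const UnicodeInterval& iv = intervals[mid];
        if (cp < iv.first) {
            hi = mid - 1;
        } else if (cp > iv.last) {
            lo = mid + 1;
        } else {
            return static_cast<int>(iv.offset + (cp - iv.first));
        }
    }
    return -1;
}

// Sample table: printable ASCII, Latin-1 supplement, Hangul syllables.
constexpr UnicodeInterval kSampleIntervals[] = {
    {0x0020, 0x007E, 0},
    {0x00A0, 0x00FF, 95},
    {0xAC00, 0xD7A3, 191},
};
```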
2025-12-26 11:46:17 +11:00
Dave Allie
ad8cee12ab Small cleanup 2025-12-06 20:24:24 +11:00
Dave Allie
4ecfdea1a1 More pass by reference changes 2025-12-06 15:56:00 +11:00
Dave Allie
2ccdbeecc8 Public release 2025-12-03 22:06:45 +11:00