## Summary
* Remove miniz and move completely to uzlib
* Move the uzlib interfacing into `InflateReader` to better modularise the inflation code (a rough sketch of the wrapper shape follows)
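For reference, a minimal sketch of the wrapper shape, assuming uzlib's raw-DEFLATE API (`uzlib_uncompress_init`/`uzlib_uncompress`); the class itself is illustrative, not the actual `InflateReader`:
```
#include <cstdint>
#include <uzlib.h>

// Illustrative wrapper; uzlib_init() is assumed to have run once at startup.
// With no dictionary, the destination buffer doubles as the back-reference
// window, so the whole stream is inflated into a single buffer.
class InflateReader {
  struct uzlib_uncomp d_ {};

 public:
  InflateReader(const uint8_t* src, size_t srcLen) {
    uzlib_uncompress_init(&d_, nullptr, 0);
    d_.source = src;
    d_.source_limit = src + srcLen;
  }

  // Inflates the raw DEFLATE stream into out; true once the stream is done.
  bool readAll(uint8_t* out, size_t outLen) {
    d_.dest_start = d_.dest = out;
    d_.dest_limit = out + outLen;
    return uzlib_uncompress(&d_) == TINF_DONE;
  }
};
```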
---
### AI Usage
While CrossPoint doesn't have restrictions on using AI tools when contributing, please be transparent about their usage, as it helps set the right context for reviewers.
Did you use AI tools to help write this code? Yes, Claude helped with the extraction and refactor.
## Summary
**What is the goal of this PR?**
Compress reader font bitmaps to reduce flash usage by 30.7%.
**What changes are included?**
- New `EpdFontGroup` struct and extended `EpdFontData` with
`groups`/`groupCount` fields
- `--compress` flag in `fontconvert.py`: groups glyphs (ASCII base group
+ groups of 8) and compresses each with raw DEFLATE
- `FontDecompressor` class with a 4-slot LRU cache for on-demand decompression during rendering (a sketch follows this list)
- `GfxRenderer` transparently routes bitmap access through
`getGlyphBitmap()` (compressed or direct flash)
- Uses `uzlib` for decompression with minimal heap overhead
- 48 reader fonts (Bookerly, NotoSans 12-18pt, OpenDyslexic) regenerated
with compression; 5 UI fonts unchanged
- Round-trip verification script (`verify_compression.py`) runs as part
of font generation
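For context, a minimal sketch of what such a 4-slot LRU can look like; the class and member names are illustrative, not the actual `FontDecompressor` API:
```
#include <cstdint>
#include <cstdlib>

// One slot per cached glyph group.
struct CacheSlot {
  const void* group = nullptr;  // identity of the cached compressed group
  uint8_t* bitmap = nullptr;    // heap buffer holding the inflated bitmaps
  uint32_t lastUse = 0;         // recency stamp for LRU eviction
};

class GroupCache {
  static constexpr int kSlots = 4;
  CacheSlot slots_[kSlots];
  uint32_t tick_ = 0;

 public:
  // Returns the inflated bitmap for `group`, decompressing on a cache miss.
  uint8_t* fetch(const void* group, uint8_t* (*inflate)(const void*)) {
    CacheSlot* victim = &slots_[0];
    for (CacheSlot& s : slots_) {
      if (s.group == group) {  // hit: bump recency, no decompression needed
        s.lastUse = ++tick_;
        return s.bitmap;
      }
      if (s.lastUse < victim->lastUse) victim = &s;  // remember the LRU slot
    }
    // Miss: evict the least recently used slot and inflate into it.
    free(victim->bitmap);
    victim->bitmap = inflate(group);
    victim->group = group;
    victim->lastUse = ++tick_;
    return victim->bitmap;
  }
};
```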
## Additional Context
## Flash & RAM
| | baseline | font-compression | Difference |
|---|---|---|---|
| Flash (ELF) | 6,302,476 B (96.2%) | 4,365,022 B (66.6%) | -1,937,454 B (-30.7%) |
| firmware.bin | 6,468,192 B | 4,531,008 B | -1,937,184 B (-29.9%) |
| RAM | 101,700 B (31.0%) | 103,076 B (31.5%) | +1,376 B (+0.5%) |
## Script-Based Grouping (Cold Cache)
Comparison of uncompressed baseline vs script-based group compression
(4-slot LRU cache, cleared each page). Glyphs are grouped by Unicode
block (ASCII, Latin-1, Latin Extended-A, Combining Marks, Cyrillic,
General Punctuation, etc.) instead of sequential groups of 8.
### Render Time
| | Baseline | Compressed (cold cache) | Difference |
|---|---|---|---|
| **Median** | 414.9 ms | 431.6 ms | +16.7 ms (+4.0%) |
| **Pages** | 37 | 37 | |
### Memory Usage
| | Baseline | Compressed (cold cache) | Difference |
|---|---|---|---|
| **Heap free (median)** | 187.0 KB | 176.3 KB | -10.7 KB |
| **Heap free (min)** | 186.0 KB | 166.5 KB | -19.5 KB |
| **Largest block (median)** | 148.0 KB | 128.0 KB | -20.0 KB |
| **Largest block (min)** | 148.0 KB | 120.0 KB | -28.0 KB |
### Cache Effectiveness
| | Misses/page | Hit rate |
|---|---|---|
| **Compressed (cold cache)** | 2.1 | 99.85% |
---
### AI Usage
While CrossPoint doesn't have restrictions on using AI tools when contributing, please be transparent about their usage, as it helps set the right context for reviewers.
Did you use AI tools to help write this code? _**YES**_
Implementation was done by Claude Code (Opus 4.6) based on a plan
developed collaboratively. All generated font headers were verified with
an automated round-trip decompression test. The firmware was compiled
successfully but has not yet been tested on-device.
---
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
## Summary
* Definition and use of a central LOG function that can later be extended or removed entirely (for public builds where debugging information may not be required) to save flash by omitting `-DENABLE_SERIAL_LOG`, as in the slim branch
## Additional Context
* By using the central logger the usual:
```
#include <HardwareSerial.h>
...
Serial.printf("[%lu] [WCS] Obfuscating/deobfuscating %zu bytes\n", millis(), data.size());
```
would then become
```
#include <Logging.h>
...
LOG_DBG("WCS", "Obfuscating/deobfuscating %zu bytes", data.size());
```
You have `LOG_DBG` for debug messages, `LOG_ERR` for error messages, and `LOG_INF` for informational messages. Depending on the verbosity level defined (see below), some of these message types will be suppressed/not compiled; a sketch of the gating follows this list.
* The normal compilation (default) creates a firmware.elf of 42,194,356 bytes; the same code built via slim creates 42,024,048 bytes, i.e. 170,308 bytes less
* firmware.bin: 6,469,984 bytes for default, 6,418,672 bytes for slim, i.e. 51,312 bytes less
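As a rough illustration, the compile-time gating could look like the following; the `LOG_VERBOSITY` knob and the level constants are assumptions, not necessarily the actual contents of `Logging.h`. With `ENABLE_SERIAL_LOG` undefined, as in the slim build, every call collapses to a no-op and the format strings are not compiled in.
```
#include <Arduino.h>

// Hypothetical level constants and LOG_VERBOSITY knob; the real Logging.h
// may name and gate these differently.
#define LOG_LEVEL_ERR 1
#define LOG_LEVEL_INF 2
#define LOG_LEVEL_DBG 3

#ifndef LOG_VERBOSITY
#define LOG_VERBOSITY LOG_LEVEL_DBG
#endif

#ifdef ENABLE_SERIAL_LOG
// The level check is a compile-time constant, so calls above the configured
// verbosity are optimized away entirely.
#define LOG_AT(lvl, tag, fmt, ...)                                         \
  do {                                                                     \
    if ((lvl) <= LOG_VERBOSITY)                                            \
      Serial.printf("[%lu] [%s] " fmt "\n", millis(), tag, ##__VA_ARGS__); \
  } while (0)
#else
// Slim build: logging compiles to nothing, saving flash.
#define LOG_AT(lvl, tag, fmt, ...) ((void)0)
#endif

#define LOG_ERR(tag, fmt, ...) LOG_AT(LOG_LEVEL_ERR, tag, fmt, ##__VA_ARGS__)
#define LOG_INF(tag, fmt, ...) LOG_AT(LOG_LEVEL_INF, tag, fmt, ##__VA_ARGS__)
#define LOG_DBG(tag, fmt, ...) LOG_AT(LOG_LEVEL_DBG, tag, fmt, ##__VA_ARGS__)
```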
### AI Usage
While CrossPoint doesn't have restrictions on using AI tools when contributing, please be transparent about their usage, as it helps set the right context for reviewers.
Did you use AI tools to help write this code? _NO_
---
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
## Summary
Continues my changes introducing the HAL infrastructure from https://github.com/crosspoint-reader/crosspoint-reader/pull/522.
This PR touches quite a lot of files, but most of them are just renames. It should not have any impact on the end behavior.
## Additional Context
My plan is to first add this small shim layer, which may sound useless on its own, but I'll then implement an emulated driver that can be helpful for testing and development.
Currently, on my fork, I'm using an FS driver that allows "mounting" a local directory from my computer onto the device, much like the `-v` mount option in Docker. This lets me quickly reset the `.crosspoint` directory if anything goes wrong. I plan to upstream this feature once this PR gets merged.
---
### AI Usage
While CrossPoint doesn't have restrictions on using AI tools when contributing, please be transparent about their usage, as it helps set the right context for reviewers.
Did you use AI tools to help write this code? NO
## Summary
Optimizes EPUB metadata indexing for large books (2000+ chapters) from
~30 minutes to ~50 seconds by replacing O(n²) algorithms with O(n log n)
hash-indexed lookups.
Fixes #134
## Problem
Three phases had O(n²) complexity due to nested loops:
| Phase | Operation | Before (2768 chapters) |
|-------|-----------|------------------------|
| OPF Pass | For each spine ref, scan all manifest items | ~25 min |
| TOC Pass | For each TOC entry, scan all spine items | ~5 min |
| buildBookBin | For each spine item, scan ZIP central directory | ~8.4 min |
Total: **~30+ minutes** for first-time indexing of large EPUBs.
## Solution
Replace linear scans with sorted hash indexes + binary search:
- **OPF Pass**: Build `{hash(id), len, offset}` index from manifest,
binary search for each spine ref
- **TOC Pass**: Build `{hash(href), len, spineIndex}` index from spine,
binary search for each TOC entry
- **buildBookBin**: New `ZipFile::fillUncompressedSizes()` API - single
ZIP central directory scan with batch hash matching
All indexes use FNV-1a hashing with length as a secondary key to minimize collisions, and each index is freed immediately after its phase (a sketch of the lookup follows).
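For concreteness, a minimal sketch of the index-and-lookup shape; the struct layout and names are illustrative, not the exact code in `ContentOpfParser` or `BookMetadataCache`:
```
#include <algorithm>
#include <cstdint>
#include <vector>

// One entry per manifest item / spine entry: no string storage, just the
// 64-bit FNV-1a hash, the length as a tie-breaker, and a payload.
struct IndexEntry {
  uint64_t hash;
  uint16_t len;
  uint32_t value;  // e.g. file offset (OPF pass) or spine index (TOC pass)
};

static uint64_t fnv1a(const char* s, size_t len) {
  uint64_t h = 0xcbf29ce484222325ULL;  // FNV-1a 64-bit offset basis
  for (size_t i = 0; i < len; i++) {
    h ^= (uint8_t)s[i];
    h *= 0x100000001b3ULL;             // FNV-1a 64-bit prime
  }
  return h;
}

static bool keyLess(const IndexEntry& a, const IndexEntry& b) {
  return a.hash != b.hash ? a.hash < b.hash : a.len < b.len;
}

// Sort once after the build phase (O(n log n))...
void finishIndex(std::vector<IndexEntry>& index) {
  std::sort(index.begin(), index.end(), keyLess);
}

// ...then each lookup is a binary search (O(log n)) instead of a linear scan.
const IndexEntry* find(const std::vector<IndexEntry>& index, const char* key, size_t len) {
  IndexEntry probe{fnv1a(key, len), (uint16_t)len, 0};
  auto it = std::lower_bound(index.begin(), index.end(), probe, keyLess);
  if (it != index.end() && it->hash == probe.hash && it->len == probe.len) return &*it;
  return nullptr;
}
```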
## Results
**Shadow Slave EPUB (2768 chapters):**
| Phase | Before | After | Speedup |
|-------|--------|-------|---------|
| OPF pass | ~25 min | 10.8 sec | ~140x |
| TOC pass | ~5 min | 4.7 sec | ~60x |
| buildBookBin | 506 sec | 34.6 sec | ~15x |
| **Total** | **~30+ min** | **~50 sec** | **~36x** |
**Normal EPUB (87 chapters):** 1.7 sec - no regression.
## Memory
Peak temporary memory during indexing:
- OPF index: ~33KB (2770 items × 12 bytes)
- TOC index: ~33KB (2768 items × 12 bytes)
- ZIP batch: ~44KB (targets + sizes arrays)
All indexes cleared immediately after each phase. No OOM risk on
ESP32-C3.
## Note on Threshold
All optimizations are gated by `LARGE_SPINE_THRESHOLD = 400` to preserve
existing behavior for small books. However, the algorithms work
correctly for any book size and are faster even for small books:
| Book Size | Old O(n²) | New O(n log n) | Improvement |
|-----------|-----------|----------------|-------------|
| 10 ch | 100 ops | 50 ops | 2x |
| 100 ch | 10K ops | 800 ops | 12x |
| 400 ch | 160K ops | 4K ops | 40x |
If preferred, the threshold could be removed to use the optimized path
universally.
## Testing
- [x] Shadow Slave (2768 chapters): 50s first-time indexing, loads and
navigates correctly
- [x] Normal book (87 chapters): 1.7s indexing, no regression
- [x] Build passes
- [x] clang-format passes
## Files Changed
- `lib/Epub/Epub/parsers/ContentOpfParser.h/.cpp` - OPF manifest index
- `lib/Epub/Epub/BookMetadataCache.h/.cpp` - TOC index + batch size
lookup
- `lib/ZipFile/ZipFile.h/.cpp` - New `fillUncompressedSizes()` API
- `lib/Epub/Epub.cpp` - Timing logs
<details>
<summary><b>Algorithm Details</b> (click to expand)</summary>
### Phase 1: OPF Pass - Manifest to Spine Lookup
**Problem**: Each `<itemref idref="ch001">` in spine must find matching
`<item id="ch001" href="...">` in manifest.
```
OLD: For each of 2768 spine refs, scan all 2770 manifest items
= 7.6M string comparisons
NEW: While parsing manifest, build index:
{ hash("ch001"), len=5, file_offset=120 }
Sort index, then binary search for each spine ref:
2768 × log₂(2770) ≈ 2768 × 11 = 30K comparisons
```
### Phase 2: TOC Pass - TOC Entry to Spine Index Lookup
**Problem**: Each TOC entry with `href="chapter0001.xhtml"` must find
its spine index.
```
OLD: For each of 2768 TOC entries, scan all 2768 spine entries
= 7.6M string comparisons
NEW: At beginTocPass(), read spine once and build index:
{ hash("OEBPS/chapter0001.xhtml"), len=25, spineIndex=0 }
Sort index, binary search for each TOC entry:
2768 × log₂(2768) ≈ 30K comparisons
Clear index at endTocPass() to free memory.
```
### Phase 3: buildBookBin - ZIP Size Lookup
**Problem**: Need uncompressed file size for each spine item (for
reading progress). Sizes are in ZIP central directory.
```
OLD: For each of 2768 spine items, scan ZIP central directory (2773 entries)
= 7.6M filename reads + string comparisons
Time: 506 seconds
NEW:
Step 1: Build targets from spine
{ hash("OEBPS/chapter0001.xhtml"), len=25, index=0 }
Sort by (hash, len)
Step 2: Single pass through ZIP central directory
For each entry:
- Compute hash ON THE FLY (no string allocation)
- Binary search targets
- If match: sizes[target.index] = uncompressedSize
Step 3: Use sizes array directly (O(1) per spine item)
Total: 2773 entries × log₂(2768) ≈ 33K comparisons
Time: 35 seconds
```
### Why Hash + Length?
Using 64-bit FNV-1a hash + string length as a composite key:
- Collision probability: negligible for a few thousand entries (the birthday bound on a 64-bit hash is ~n²/2⁶⁵, before the length tie-breaker)
- No string storage needed in index (just 12-16 bytes per entry)
- Integer comparisons are faster than string comparisons
- Verification on match handles the rare collision case
</details>
---
_AI-assisted development. All changes tested on hardware._
## Summary
* Swap to updated SDCardManager which uses SdFat
* Add exFAT support
* Swap to using FsFile everywhere
* Use the newly exposed `SdMan` macro to get at the static instance of SDCardManager
* Move a bunch of FsHelpers up to SDCardManager
Still a bit raw, but this gets the time required to determine the size of each chapter (for reading progress) down from ~25ms to 0-1ms.
This is done by keeping the zipArchive open (so simple ;)).
Probably we don't need to cache the spine sizes anymore then...
---------
Co-authored-by: Dave Allie <dave@daveallie.com>
## Problem
`readFileToMemory()` allocates an output buffer via `malloc()` at line 120 but doesn't check whether the allocation succeeds. On low memory, the NULL pointer is passed to `fread()`, causing a crash.
## Fix
Add a NULL check after `malloc()` for the output buffer. This follows the existing pattern already used for `deflatedData` at line 141 (sketched below).
Changed `lib/ZipFile/ZipFile.cpp`.
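A minimal sketch of the pattern; the function and variable names are illustrative, not the actual `readFileToMemory()` code:
```
#include <cstdint>
#include <cstdio>
#include <cstdlib>

// Illustrative only: buffer/size names are assumptions.
uint8_t* readAll(FILE* file, size_t uncompressedSize) {
  uint8_t* output = (uint8_t*)malloc(uncompressedSize);
  if (output == nullptr) {
    // Low-memory path: bail out instead of handing NULL to fread().
    return nullptr;
  }
  fread(output, 1, uncompressedSize, file);
  return output;
}
```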
## Test
- Follows the existing validated pattern from the same function
- Defensive addition only; no logic changes
## Problem
Three `fopen()` calls in ZipFile.cpp did not check for NULL before using the file handle. If a file cannot be opened, `fseek`/`fread`/`fclose` receive NULL and crash.
## Fix
Added NULL checks with appropriate error logging and early returns at all three locations (a sketch of the guard follows the list):
- `getDataOffset()`
- `readFileToMemory()`
- `readFileToStream()`
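A minimal sketch of the guard; the function name and log wording are illustrative, not the actual ZipFile.cpp code:
```
#include <cstdio>

// Illustrative only: names are assumptions.
bool openArchive(const char* zipPath) {
  FILE* fp = fopen(zipPath, "rb");
  if (fp == nullptr) {
    // Early return: never hand a NULL handle to fseek/fread/fclose.
    printf("[ZipFile] failed to open %s\n", zipPath);
    return false;
  }
  // ... fseek/fread as before ...
  fclose(fp);
  return true;
}
```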
## Testing
- Builds successfully with `pio run`
- Affects: `lib/ZipFile/ZipFile.cpp`