Files
crosspoint-reader-mod/lib/Epub/Epub/ParsedText.cpp

550 lines
23 KiB
C++
Raw Normal View History

#include "ParsedText.h"
#include <GfxRenderer.h>
feat: Support for kerning and ligatures (#873) ## Summary **What is the goal of this PR?** Improved typesetting, including [kerning](https://en.wikipedia.org/wiki/Kerning) and [ligatures](https://en.wikipedia.org/wiki/Ligature_(writing)#Latin_alphabet). **What changes are included?** - The script to convert built-in fonts now adds kerning and ligature information to the generated font headers. - Epub page layout calculates proper kerning spaces and makes ligature substitutions according to the selected font. ![3U1B1808](https://github.com/user-attachments/assets/1accb16f-2f1a-41e5-adca-89f1f1348494) ![3U1B1810](https://github.com/user-attachments/assets/2f6bd007-490e-420f-b774-3380b4add7ea) ![3U1B1815](https://github.com/user-attachments/assets/1986bb77-2db0-46e2-a5d6-8315dae9eb19) ## Additional Context - I am not a typography expert. - The implementation has been reworked from the earlier version, so it is no longer necessary to omit Open Dyslexic, and kerning data now covers all fonts, styles, and codepoints for which we include bitmap data. - Claude Opus 4.6 helped with a lot of this. - There's an included test epub document with lots of kerning and ligature examples, shown in the photos. **_After some time to mature, I think this change is in decent shape to merge and get people testing._** After opening this PR I came across #660, which overlaps in adding ligature support. --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES, Claude Opus 4.6**_ --------- Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-24 02:31:43 -06:00
#include <Utf8.h>
#include <algorithm>
#include <cmath>
#include <functional>
#include <limits>
#include <vector>
#include "hyphenation/Hyphenator.h"
constexpr int MAX_COST = std::numeric_limits<int>::max();
namespace {
// Soft hyphen byte pattern used throughout EPUBs (UTF-8 for U+00AD).
constexpr char SOFT_HYPHEN_UTF8[] = "\xC2\xAD";
constexpr size_t SOFT_HYPHEN_BYTES = 2;
feat: Support for kerning and ligatures (#873) ## Summary **What is the goal of this PR?** Improved typesetting, including [kerning](https://en.wikipedia.org/wiki/Kerning) and [ligatures](https://en.wikipedia.org/wiki/Ligature_(writing)#Latin_alphabet). **What changes are included?** - The script to convert built-in fonts now adds kerning and ligature information to the generated font headers. - Epub page layout calculates proper kerning spaces and makes ligature substitutions according to the selected font. ![3U1B1808](https://github.com/user-attachments/assets/1accb16f-2f1a-41e5-adca-89f1f1348494) ![3U1B1810](https://github.com/user-attachments/assets/2f6bd007-490e-420f-b774-3380b4add7ea) ![3U1B1815](https://github.com/user-attachments/assets/1986bb77-2db0-46e2-a5d6-8315dae9eb19) ## Additional Context - I am not a typography expert. - The implementation has been reworked from the earlier version, so it is no longer necessary to omit Open Dyslexic, and kerning data now covers all fonts, styles, and codepoints for which we include bitmap data. - Claude Opus 4.6 helped with a lot of this. - There's an included test epub document with lots of kerning and ligature examples, shown in the photos. **_After some time to mature, I think this change is in decent shape to merge and get people testing._** After opening this PR I came across #660, which overlaps in adding ligature support. --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES, Claude Opus 4.6**_ --------- Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-24 02:31:43 -06:00
// Returns the first rendered codepoint of a word (skipping leading soft hyphens).
uint32_t firstCodepoint(const std::string& word) {
const auto* ptr = reinterpret_cast<const unsigned char*>(word.c_str());
while (true) {
const uint32_t cp = utf8NextCodepoint(&ptr);
if (cp == 0) return 0;
if (cp != 0x00AD) return cp; // skip soft hyphens
}
}
// Returns the last codepoint of a word by scanning backward for the start of the last UTF-8 sequence.
uint32_t lastCodepoint(const std::string& word) {
if (word.empty()) return 0;
// UTF-8 continuation bytes start with 10xxxxxx; scan backward to find the leading byte.
size_t i = word.size() - 1;
while (i > 0 && (static_cast<uint8_t>(word[i]) & 0xC0) == 0x80) {
--i;
}
const auto* ptr = reinterpret_cast<const unsigned char*>(word.c_str() + i);
return utf8NextCodepoint(&ptr);
}
bool containsSoftHyphen(const std::string& word) { return word.find(SOFT_HYPHEN_UTF8) != std::string::npos; }
// Removes every soft hyphen in-place so rendered glyphs match measured widths.
void stripSoftHyphensInPlace(std::string& word) {
size_t pos = 0;
while ((pos = word.find(SOFT_HYPHEN_UTF8, pos)) != std::string::npos) {
word.erase(pos, SOFT_HYPHEN_BYTES);
}
}
// Returns the advance width for a word while ignoring soft hyphen glyphs and optionally appending a visible hyphen.
feat: Support for kerning and ligatures (#873) ## Summary **What is the goal of this PR?** Improved typesetting, including [kerning](https://en.wikipedia.org/wiki/Kerning) and [ligatures](https://en.wikipedia.org/wiki/Ligature_(writing)#Latin_alphabet). **What changes are included?** - The script to convert built-in fonts now adds kerning and ligature information to the generated font headers. - Epub page layout calculates proper kerning spaces and makes ligature substitutions according to the selected font. ![3U1B1808](https://github.com/user-attachments/assets/1accb16f-2f1a-41e5-adca-89f1f1348494) ![3U1B1810](https://github.com/user-attachments/assets/2f6bd007-490e-420f-b774-3380b4add7ea) ![3U1B1815](https://github.com/user-attachments/assets/1986bb77-2db0-46e2-a5d6-8315dae9eb19) ## Additional Context - I am not a typography expert. - The implementation has been reworked from the earlier version, so it is no longer necessary to omit Open Dyslexic, and kerning data now covers all fonts, styles, and codepoints for which we include bitmap data. - Claude Opus 4.6 helped with a lot of this. - There's an included test epub document with lots of kerning and ligature examples, shown in the photos. **_After some time to mature, I think this change is in decent shape to merge and get people testing._** After opening this PR I came across #660, which overlaps in adding ligature support. --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES, Claude Opus 4.6**_ --------- Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-24 02:31:43 -06:00
// Uses advance width (sum of glyph advances + kerning) rather than bounding box width so that italic glyph overhangs
// don't inflate inter-word spacing.
uint16_t measureWordWidth(const GfxRenderer& renderer, const int fontId, const std::string& word,
const EpdFontFamily::Style style, const bool appendHyphen = false) {
if (word.size() == 1 && word[0] == ' ' && !appendHyphen) {
return renderer.getSpaceWidth(fontId, style);
}
const bool hasSoftHyphen = containsSoftHyphen(word);
if (!hasSoftHyphen && !appendHyphen) {
return renderer.getTextAdvanceX(fontId, word.c_str(), style);
}
std::string sanitized = word;
if (hasSoftHyphen) {
stripSoftHyphensInPlace(sanitized);
}
if (appendHyphen) {
sanitized.push_back('-');
}
return renderer.getTextAdvanceX(fontId, sanitized.c_str(), style);
}
} // namespace
void ParsedText::addWord(std::string word, const EpdFontFamily::Style fontStyle, const bool underline,
const bool attachToPrevious) {
if (word.empty()) return;
words.push_back(std::move(word));
EpdFontFamily::Style combinedStyle = fontStyle;
feat: Add CSS parsing and CSS support in EPUBs (#411) ## Summary * **What is the goal of this PR?** - Adds basic CSS parsing to EPUBs and determine the CSS rules when rendering to the screen so that text is styled correctly. Currently supports bold, underline, italics, margin, padding, and text alignment ## Additional Context - My main reason for wanting this is that the book I'm currently reading, Carl's Doomsday Scenario (2nd in the Dungeon Crawler Carl series), relies _a lot_ on styled text for telling parts of the story. When text is bolded, it's supposed to be a message that's rendered "on-screen" in the story. When characters are "chatting" with each other, the text is bolded and their names are underlined. Plus, normal emphasis is provided with italicizing words here and there. So, this greatly improves my experience reading this book on the Xteink, and I figured it was useful enough for others too. - For transparency: I'm a software engineer, but I'm mostly frontend and TypeScript/JavaScript. It's been _years_ since I did any C/C++, so I would not be surprised if I'm doing something dumb along the way in this code. Please don't hesitate to ask for changes if something looks off. I heavily relied on Claude Code for help, and I had a lot of inspiration from how [microreader](https://github.com/CidVonHighwind/microreader) achieves their CSS parsing and styling. I did give this as good of a code review as I could and went through everything, and _it works on my machine_ 😄 ### Before ![IMG_6271](https://github.com/user-attachments/assets/dba7554d-efb6-4d13-88bc-8b83cd1fc615) ![IMG_6272](https://github.com/user-attachments/assets/61ba2de0-87c9-4f39-956f-013da4fe20a4) ### After ![IMG_6268](https://github.com/user-attachments/assets/ebe11796-cca9-4a46-b9c7-0709c7932818) ![IMG_6269](https://github.com/user-attachments/assets/e89c33dc-ff47-4bb7-855e-863fe44b3202) --- ### AI Usage Did you use AI tools to help write this code? **YES**, Claude Code
2026-02-05 05:28:10 -05:00
if (underline) {
combinedStyle = static_cast<EpdFontFamily::Style>(combinedStyle | EpdFontFamily::UNDERLINE);
}
wordStyles.push_back(combinedStyle);
wordContinues.push_back(attachToPrevious);
}
// Consumes data to minimize memory usage
void ParsedText::layoutAndExtractLines(const GfxRenderer& renderer, const int fontId, const uint16_t viewportWidth,
const std::function<void(std::shared_ptr<TextBlock>)>& processLine,
const bool includeLastLine) {
if (words.empty()) {
return;
}
// Apply fixed transforms before any per-line layout work.
applyParagraphIndent();
Rotation Support (#77) • What is the goal of this PR? Implement a horizontal EPUB reading mode so books can be read in landscape orientation (both 90° and 270°), while keeping the rest of the UI in portrait. • What changes are included? ◦ Rendering / Display ▪ Added an orientation model to GfxRenderer (Portrait, LandscapeNormal, LandscapeFlipped) and made: ▪ drawPixel, drawImage, displayWindow map logical coordinates differently depending on orientation. ▪ getScreenWidth() / getScreenHeight() return orientation‑aware logical dimensions (480×800 in portrait, 800×480 in landscape). ◦ Settings / Configuration ▪ Extended CrossPointSettings with: ▪ landscapeReading (toggle for portrait vs. landscape EPUB reading). ▪ landscapeFlipped (toggle to flip landscape 180° so both horizontal holding directions are supported). ▪ Updated settings serialization/deserialization to persist these fields while remaining backward‑compatible with existing settings files. ▪ Updated SettingsActivity to expose two new toggles: ▪ “Landscape Reading” ▪ “Flip Landscape (swap top/bottom)” ◦ EPUB Reader ▪ In EpubReaderActivity: ▪ On onEnter, set GfxRenderer orientation based on the new settings (Portrait, LandscapeNormal, or LandscapeFlipped). ▪ On onExit, reset orientation back to Portrait so Home, WiFi, Settings, etc. continue to render as before. ▪ Adjusted renderStatusBar to position the status bar and battery indicator relative to GfxRenderer::getScreenHeight() instead of hard‑coded Y coordinates, so it stays correctly at the bottom in both portrait and landscape. ◦ EPUB Caching / Layout ▪ Extended Section cache metadata (section.bin) to include the logical screenWidth and screenHeight used when pages were generated; bumped SECTION_FILE_VERSION. ▪ Updated loadCacheMetadata to compare: ▪ font/margins/line compression/extraParagraphSpacing and screen dimensions; mismatches now invalidate and clear the cache. ▪ Updated persistPageDataToSD and all call sites in EpubReaderActivity to pass the current GfxRenderer::getScreenWidth() / getScreenHeight() so portrait and landscape caches are kept separate and correctly sized. Additional Context • Cache behavior / migration ◦ Existing section.bin files (old SECTION_FILE_VERSION) will be detected as incompatible and their caches cleared and rebuilt once per chapter when first opened after this change. ◦ Within a given orientation, caches will be reused as before. Switching orientation (portrait ↔ landscape) will cause a one‑time re‑index of each chapter in the new orientation. • Scope and risks ◦ Orientation changes are scoped to the EPUB reader; the Home screen, Settings, WiFi selection, sleep screens, and web server UI continue to assume portrait orientation. ◦ The renderer’s orientation is a static/global setting; if future code uses GfxRenderer outside the reader while a reader instance is active, it should be aware that orientation is no longer implicitly fixed. ◦ All drawing primitives now go through orientation‑aware coordinate transforms; any code that previously relied on edge‑case behavior or out‑of‑bounds writes might surface as logged “Outside range” warnings instead. • Testing suggestions / areas to focus on ◦ Verify in hardware: ▪ Portrait mode still renders correctly (boot, home, settings, WiFi, reader). ▪ Landscape reading in both directions: ▪ Landscape Reading = ON, Flip Landscape = OFF. ▪ Landscape Reading = ON, Flip Landscape = ON. ▪ Status bar (page X/Y, % progress, battery icon) is fully visible and aligned at the bottom in all three combinations. ◦ Open the same book: ▪ In portrait first, then switch to landscape and reopen it. ▪ Confirm that: ▪ Old portrait caches are rebuilt once for landscape (you should see the “Indexing…” page). ▪ Progress save/restore still works (resume opens to the correct page in the current orientation). ◦ Ensure grayscale rendering (the secondary pass in EpubReaderActivity::renderContents) still looks correct in both orientations. --------- Co-authored-by: Dave Allie <dave@daveallie.com>
2025-12-28 05:33:20 -05:00
const int pageWidth = viewportWidth;
feat: Support for kerning and ligatures (#873) ## Summary **What is the goal of this PR?** Improved typesetting, including [kerning](https://en.wikipedia.org/wiki/Kerning) and [ligatures](https://en.wikipedia.org/wiki/Ligature_(writing)#Latin_alphabet). **What changes are included?** - The script to convert built-in fonts now adds kerning and ligature information to the generated font headers. - Epub page layout calculates proper kerning spaces and makes ligature substitutions according to the selected font. ![3U1B1808](https://github.com/user-attachments/assets/1accb16f-2f1a-41e5-adca-89f1f1348494) ![3U1B1810](https://github.com/user-attachments/assets/2f6bd007-490e-420f-b774-3380b4add7ea) ![3U1B1815](https://github.com/user-attachments/assets/1986bb77-2db0-46e2-a5d6-8315dae9eb19) ## Additional Context - I am not a typography expert. - The implementation has been reworked from the earlier version, so it is no longer necessary to omit Open Dyslexic, and kerning data now covers all fonts, styles, and codepoints for which we include bitmap data. - Claude Opus 4.6 helped with a lot of this. - There's an included test epub document with lots of kerning and ligature examples, shown in the photos. **_After some time to mature, I think this change is in decent shape to merge and get people testing._** After opening this PR I came across #660, which overlaps in adding ligature support. --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES, Claude Opus 4.6**_ --------- Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-24 02:31:43 -06:00
const int spaceWidth = renderer.getSpaceWidth(fontId, EpdFontFamily::REGULAR);
auto wordWidths = calculateWordWidths(renderer, fontId);
std::vector<size_t> lineBreakIndices;
if (hyphenationEnabled) {
// Use greedy layout that can split words mid-loop when a hyphenated prefix fits.
perf: Replace std::list with std::vector in text layout (#1038) ## Summary _Revision to @blindbat's #802. Description comes from the original PR._ - Replace `std::list` with `std::vector` for word storage in `TextBlock` and `ParsedText` - Use index-based access (`words[i]`) instead of iterator advancement (`std::advance(it, n)`) - Remove the separate `continuesVec` copy that was built from `wordContinues` for O(1) access — now unnecessary since `std::vector<bool>` already provides O(1) indexing ## Why `std::list` allocates each node individually on the heap with 16 bytes of prev/next pointer overhead per node. For text layout with many small words, this means: - Scattered heap allocations instead of contiguous memory - Poor cache locality during iteration (each node can be anywhere in memory) - Per-node malloc/free overhead during construction and destruction `std::vector` stores elements contiguously, giving better cache performance during the tight rendering and layout loops. The `extractLine` function also benefits: list splice was O(1) but required maintaining three parallel iterators, while vector range construction with move iterators is simpler and still efficient for the small line-sized chunks involved. ## Files changed - `lib/Epub/Epub/blocks/TextBlock.h` / `.cpp` - `lib/Epub/Epub/ParsedText.h` / `.cpp` ## AI Usage YES ## Test plan - [ ] Open an EPUB with mixed formatting (bold, italic, underline) — verify text renders correctly - [ ] Open a book with justified text — verify word spacing is correct - [ ] Open a book with hyphenation enabled — verify words break correctly at hyphens - [ ] Navigate through pages rapidly — verify no rendering glitches or crashes - [ ] Open a book with long paragraphs — verify text layout matches pre-change behavior --------- Co-authored-by: Kuanysh Bekkulov <kbekkulov@gmail.com>
2026-02-21 22:28:56 -06:00
lineBreakIndices = computeHyphenatedLineBreaks(renderer, fontId, pageWidth, spaceWidth, wordWidths, wordContinues);
} else {
perf: Replace std::list with std::vector in text layout (#1038) ## Summary _Revision to @blindbat's #802. Description comes from the original PR._ - Replace `std::list` with `std::vector` for word storage in `TextBlock` and `ParsedText` - Use index-based access (`words[i]`) instead of iterator advancement (`std::advance(it, n)`) - Remove the separate `continuesVec` copy that was built from `wordContinues` for O(1) access — now unnecessary since `std::vector<bool>` already provides O(1) indexing ## Why `std::list` allocates each node individually on the heap with 16 bytes of prev/next pointer overhead per node. For text layout with many small words, this means: - Scattered heap allocations instead of contiguous memory - Poor cache locality during iteration (each node can be anywhere in memory) - Per-node malloc/free overhead during construction and destruction `std::vector` stores elements contiguously, giving better cache performance during the tight rendering and layout loops. The `extractLine` function also benefits: list splice was O(1) but required maintaining three parallel iterators, while vector range construction with move iterators is simpler and still efficient for the small line-sized chunks involved. ## Files changed - `lib/Epub/Epub/blocks/TextBlock.h` / `.cpp` - `lib/Epub/Epub/ParsedText.h` / `.cpp` ## AI Usage YES ## Test plan - [ ] Open an EPUB with mixed formatting (bold, italic, underline) — verify text renders correctly - [ ] Open a book with justified text — verify word spacing is correct - [ ] Open a book with hyphenation enabled — verify words break correctly at hyphens - [ ] Navigate through pages rapidly — verify no rendering glitches or crashes - [ ] Open a book with long paragraphs — verify text layout matches pre-change behavior --------- Co-authored-by: Kuanysh Bekkulov <kbekkulov@gmail.com>
2026-02-21 22:28:56 -06:00
lineBreakIndices = computeLineBreaks(renderer, fontId, pageWidth, spaceWidth, wordWidths, wordContinues);
}
const size_t lineCount = includeLastLine ? lineBreakIndices.size() : lineBreakIndices.size() - 1;
for (size_t i = 0; i < lineCount; ++i) {
feat: Support for kerning and ligatures (#873) ## Summary **What is the goal of this PR?** Improved typesetting, including [kerning](https://en.wikipedia.org/wiki/Kerning) and [ligatures](https://en.wikipedia.org/wiki/Ligature_(writing)#Latin_alphabet). **What changes are included?** - The script to convert built-in fonts now adds kerning and ligature information to the generated font headers. - Epub page layout calculates proper kerning spaces and makes ligature substitutions according to the selected font. ![3U1B1808](https://github.com/user-attachments/assets/1accb16f-2f1a-41e5-adca-89f1f1348494) ![3U1B1810](https://github.com/user-attachments/assets/2f6bd007-490e-420f-b774-3380b4add7ea) ![3U1B1815](https://github.com/user-attachments/assets/1986bb77-2db0-46e2-a5d6-8315dae9eb19) ## Additional Context - I am not a typography expert. - The implementation has been reworked from the earlier version, so it is no longer necessary to omit Open Dyslexic, and kerning data now covers all fonts, styles, and codepoints for which we include bitmap data. - Claude Opus 4.6 helped with a lot of this. - There's an included test epub document with lots of kerning and ligature examples, shown in the photos. **_After some time to mature, I think this change is in decent shape to merge and get people testing._** After opening this PR I came across #660, which overlaps in adding ligature support. --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES, Claude Opus 4.6**_ --------- Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-24 02:31:43 -06:00
extractLine(i, pageWidth, spaceWidth, wordWidths, wordContinues, lineBreakIndices, processLine, renderer, fontId);
perf: Replace std::list with std::vector in text layout (#1038) ## Summary _Revision to @blindbat's #802. Description comes from the original PR._ - Replace `std::list` with `std::vector` for word storage in `TextBlock` and `ParsedText` - Use index-based access (`words[i]`) instead of iterator advancement (`std::advance(it, n)`) - Remove the separate `continuesVec` copy that was built from `wordContinues` for O(1) access — now unnecessary since `std::vector<bool>` already provides O(1) indexing ## Why `std::list` allocates each node individually on the heap with 16 bytes of prev/next pointer overhead per node. For text layout with many small words, this means: - Scattered heap allocations instead of contiguous memory - Poor cache locality during iteration (each node can be anywhere in memory) - Per-node malloc/free overhead during construction and destruction `std::vector` stores elements contiguously, giving better cache performance during the tight rendering and layout loops. The `extractLine` function also benefits: list splice was O(1) but required maintaining three parallel iterators, while vector range construction with move iterators is simpler and still efficient for the small line-sized chunks involved. ## Files changed - `lib/Epub/Epub/blocks/TextBlock.h` / `.cpp` - `lib/Epub/Epub/ParsedText.h` / `.cpp` ## AI Usage YES ## Test plan - [ ] Open an EPUB with mixed formatting (bold, italic, underline) — verify text renders correctly - [ ] Open a book with justified text — verify word spacing is correct - [ ] Open a book with hyphenation enabled — verify words break correctly at hyphens - [ ] Navigate through pages rapidly — verify no rendering glitches or crashes - [ ] Open a book with long paragraphs — verify text layout matches pre-change behavior --------- Co-authored-by: Kuanysh Bekkulov <kbekkulov@gmail.com>
2026-02-21 22:28:56 -06:00
}
// Remove consumed words so size() reflects only remaining words
if (lineCount > 0) {
const size_t consumed = lineBreakIndices[lineCount - 1];
words.erase(words.begin(), words.begin() + consumed);
wordStyles.erase(wordStyles.begin(), wordStyles.begin() + consumed);
wordContinues.erase(wordContinues.begin(), wordContinues.begin() + consumed);
}
}
std::vector<uint16_t> ParsedText::calculateWordWidths(const GfxRenderer& renderer, const int fontId) {
std::vector<uint16_t> wordWidths;
perf: Replace std::list with std::vector in text layout (#1038) ## Summary _Revision to @blindbat's #802. Description comes from the original PR._ - Replace `std::list` with `std::vector` for word storage in `TextBlock` and `ParsedText` - Use index-based access (`words[i]`) instead of iterator advancement (`std::advance(it, n)`) - Remove the separate `continuesVec` copy that was built from `wordContinues` for O(1) access — now unnecessary since `std::vector<bool>` already provides O(1) indexing ## Why `std::list` allocates each node individually on the heap with 16 bytes of prev/next pointer overhead per node. For text layout with many small words, this means: - Scattered heap allocations instead of contiguous memory - Poor cache locality during iteration (each node can be anywhere in memory) - Per-node malloc/free overhead during construction and destruction `std::vector` stores elements contiguously, giving better cache performance during the tight rendering and layout loops. The `extractLine` function also benefits: list splice was O(1) but required maintaining three parallel iterators, while vector range construction with move iterators is simpler and still efficient for the small line-sized chunks involved. ## Files changed - `lib/Epub/Epub/blocks/TextBlock.h` / `.cpp` - `lib/Epub/Epub/ParsedText.h` / `.cpp` ## AI Usage YES ## Test plan - [ ] Open an EPUB with mixed formatting (bold, italic, underline) — verify text renders correctly - [ ] Open a book with justified text — verify word spacing is correct - [ ] Open a book with hyphenation enabled — verify words break correctly at hyphens - [ ] Navigate through pages rapidly — verify no rendering glitches or crashes - [ ] Open a book with long paragraphs — verify text layout matches pre-change behavior --------- Co-authored-by: Kuanysh Bekkulov <kbekkulov@gmail.com>
2026-02-21 22:28:56 -06:00
wordWidths.reserve(words.size());
perf: Replace std::list with std::vector in text layout (#1038) ## Summary _Revision to @blindbat's #802. Description comes from the original PR._ - Replace `std::list` with `std::vector` for word storage in `TextBlock` and `ParsedText` - Use index-based access (`words[i]`) instead of iterator advancement (`std::advance(it, n)`) - Remove the separate `continuesVec` copy that was built from `wordContinues` for O(1) access — now unnecessary since `std::vector<bool>` already provides O(1) indexing ## Why `std::list` allocates each node individually on the heap with 16 bytes of prev/next pointer overhead per node. For text layout with many small words, this means: - Scattered heap allocations instead of contiguous memory - Poor cache locality during iteration (each node can be anywhere in memory) - Per-node malloc/free overhead during construction and destruction `std::vector` stores elements contiguously, giving better cache performance during the tight rendering and layout loops. The `extractLine` function also benefits: list splice was O(1) but required maintaining three parallel iterators, while vector range construction with move iterators is simpler and still efficient for the small line-sized chunks involved. ## Files changed - `lib/Epub/Epub/blocks/TextBlock.h` / `.cpp` - `lib/Epub/Epub/ParsedText.h` / `.cpp` ## AI Usage YES ## Test plan - [ ] Open an EPUB with mixed formatting (bold, italic, underline) — verify text renders correctly - [ ] Open a book with justified text — verify word spacing is correct - [ ] Open a book with hyphenation enabled — verify words break correctly at hyphens - [ ] Navigate through pages rapidly — verify no rendering glitches or crashes - [ ] Open a book with long paragraphs — verify text layout matches pre-change behavior --------- Co-authored-by: Kuanysh Bekkulov <kbekkulov@gmail.com>
2026-02-21 22:28:56 -06:00
for (size_t i = 0; i < words.size(); ++i) {
wordWidths.push_back(measureWordWidth(renderer, fontId, words[i], wordStyles[i]));
}
return wordWidths;
}
std::vector<size_t> ParsedText::computeLineBreaks(const GfxRenderer& renderer, const int fontId, const int pageWidth,
const int spaceWidth, std::vector<uint16_t>& wordWidths,
std::vector<bool>& continuesVec) {
if (words.empty()) {
return {};
}
fix: Hanging indent (negative text-indent) and em-unit sizing (#1229) ## Summary * **What is the goal of this PR?** Fixing two independent CSS rendering bugs combined to make hanging-indent list styles (e.g. margin-left:3em; text-indent:-1em) render incorrectly: * **What changes are included?** 1. Negative text-indent was silently ignored Three guards in ParsedText.cpp (computeLineBreaks, computeHyphenatedLineBreaks, extractLine) conditioned firstLineIndent on blockStyle.textIndent > 0, so any negative value collapsed to zero. Additionally, wordXpos was uint16_t, which cannot represent negative offsets — a cast of e.g. −18 would wrap to 65518 and render the word far off-screen. 2. extraParagraphSpacing suppressed hanging indents Even after removing the > 0 guard, the existing !extraParagraphSpacing condition would still suppress all text-indent when that setting is on (its default). Positive text-indent is a decorative paragraph indent that the user can reasonably replace with vertical spacing — negative text-indent is structural (it positions the list marker) and must always apply. 3. em unit was calibrated against line height, not font size emSize was computed as getLineHeight() * lineCompression (the full line advance). CSS em units are defined relative to the font-size, which corresponds to the ascender height — not the line height. Using line height makes every em-based margin/indent ~20–30% wider than a browser would render it, and is especially noticeable for CSS that uses font-size: small (which we do not implement). ## Additional Context Test case ``` .lsl1 { margin-left: 3em; text-indent: -1em; } <div class="lsl1">• First list item that wraps across lines</div> <div class="lsl1">• Short item</div> ``` Before: all lines of all items started at 3 em from the left edge (indent ignored). After: the bullet marker hangs at 2 em; continuation lines align at 3 em. <img width="240" alt="before" src="https://github.com/user-attachments/assets/9dcbf3e0-fcd9-4af8-b451-a90ba4d2fb75" /> <img width="240" alt="after" src="https://github.com/user-attachments/assets/1ffdcf56-a180-4267-9590-c60d7ac44707" /> --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES**_
2026-03-02 12:02:09 +01:00
// Calculate first line indent (only for left/justified text).
// Positive text-indent (paragraph indent) is suppressed when extraParagraphSpacing is on.
// Negative text-indent (hanging indent, e.g. margin-left:3em; text-indent:-1em) always applies —
// it is structural (positions the bullet/marker), not decorative.
feat: Add CSS parsing and CSS support in EPUBs (#411) ## Summary * **What is the goal of this PR?** - Adds basic CSS parsing to EPUBs and determine the CSS rules when rendering to the screen so that text is styled correctly. Currently supports bold, underline, italics, margin, padding, and text alignment ## Additional Context - My main reason for wanting this is that the book I'm currently reading, Carl's Doomsday Scenario (2nd in the Dungeon Crawler Carl series), relies _a lot_ on styled text for telling parts of the story. When text is bolded, it's supposed to be a message that's rendered "on-screen" in the story. When characters are "chatting" with each other, the text is bolded and their names are underlined. Plus, normal emphasis is provided with italicizing words here and there. So, this greatly improves my experience reading this book on the Xteink, and I figured it was useful enough for others too. - For transparency: I'm a software engineer, but I'm mostly frontend and TypeScript/JavaScript. It's been _years_ since I did any C/C++, so I would not be surprised if I'm doing something dumb along the way in this code. Please don't hesitate to ask for changes if something looks off. I heavily relied on Claude Code for help, and I had a lot of inspiration from how [microreader](https://github.com/CidVonHighwind/microreader) achieves their CSS parsing and styling. I did give this as good of a code review as I could and went through everything, and _it works on my machine_ 😄 ### Before ![IMG_6271](https://github.com/user-attachments/assets/dba7554d-efb6-4d13-88bc-8b83cd1fc615) ![IMG_6272](https://github.com/user-attachments/assets/61ba2de0-87c9-4f39-956f-013da4fe20a4) ### After ![IMG_6268](https://github.com/user-attachments/assets/ebe11796-cca9-4a46-b9c7-0709c7932818) ![IMG_6269](https://github.com/user-attachments/assets/e89c33dc-ff47-4bb7-855e-863fe44b3202) --- ### AI Usage Did you use AI tools to help write this code? **YES**, Claude Code
2026-02-05 05:28:10 -05:00
const int firstLineIndent =
fix: Hanging indent (negative text-indent) and em-unit sizing (#1229) ## Summary * **What is the goal of this PR?** Fixing two independent CSS rendering bugs combined to make hanging-indent list styles (e.g. margin-left:3em; text-indent:-1em) render incorrectly: * **What changes are included?** 1. Negative text-indent was silently ignored Three guards in ParsedText.cpp (computeLineBreaks, computeHyphenatedLineBreaks, extractLine) conditioned firstLineIndent on blockStyle.textIndent > 0, so any negative value collapsed to zero. Additionally, wordXpos was uint16_t, which cannot represent negative offsets — a cast of e.g. −18 would wrap to 65518 and render the word far off-screen. 2. extraParagraphSpacing suppressed hanging indents Even after removing the > 0 guard, the existing !extraParagraphSpacing condition would still suppress all text-indent when that setting is on (its default). Positive text-indent is a decorative paragraph indent that the user can reasonably replace with vertical spacing — negative text-indent is structural (it positions the list marker) and must always apply. 3. em unit was calibrated against line height, not font size emSize was computed as getLineHeight() * lineCompression (the full line advance). CSS em units are defined relative to the font-size, which corresponds to the ascender height — not the line height. Using line height makes every em-based margin/indent ~20–30% wider than a browser would render it, and is especially noticeable for CSS that uses font-size: small (which we do not implement). ## Additional Context Test case ``` .lsl1 { margin-left: 3em; text-indent: -1em; } <div class="lsl1">• First list item that wraps across lines</div> <div class="lsl1">• Short item</div> ``` Before: all lines of all items started at 3 em from the left edge (indent ignored). After: the bullet marker hangs at 2 em; continuation lines align at 3 em. <img width="240" alt="before" src="https://github.com/user-attachments/assets/9dcbf3e0-fcd9-4af8-b451-a90ba4d2fb75" /> <img width="240" alt="after" src="https://github.com/user-attachments/assets/1ffdcf56-a180-4267-9590-c60d7ac44707" /> --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES**_
2026-03-02 12:02:09 +01:00
blockStyle.textIndentDefined && (blockStyle.textIndent < 0 || !extraParagraphSpacing) &&
feat: Add CSS parsing and CSS support in EPUBs (#411) ## Summary * **What is the goal of this PR?** - Adds basic CSS parsing to EPUBs and determine the CSS rules when rendering to the screen so that text is styled correctly. Currently supports bold, underline, italics, margin, padding, and text alignment ## Additional Context - My main reason for wanting this is that the book I'm currently reading, Carl's Doomsday Scenario (2nd in the Dungeon Crawler Carl series), relies _a lot_ on styled text for telling parts of the story. When text is bolded, it's supposed to be a message that's rendered "on-screen" in the story. When characters are "chatting" with each other, the text is bolded and their names are underlined. Plus, normal emphasis is provided with italicizing words here and there. So, this greatly improves my experience reading this book on the Xteink, and I figured it was useful enough for others too. - For transparency: I'm a software engineer, but I'm mostly frontend and TypeScript/JavaScript. It's been _years_ since I did any C/C++, so I would not be surprised if I'm doing something dumb along the way in this code. Please don't hesitate to ask for changes if something looks off. I heavily relied on Claude Code for help, and I had a lot of inspiration from how [microreader](https://github.com/CidVonHighwind/microreader) achieves their CSS parsing and styling. I did give this as good of a code review as I could and went through everything, and _it works on my machine_ 😄 ### Before ![IMG_6271](https://github.com/user-attachments/assets/dba7554d-efb6-4d13-88bc-8b83cd1fc615) ![IMG_6272](https://github.com/user-attachments/assets/61ba2de0-87c9-4f39-956f-013da4fe20a4) ### After ![IMG_6268](https://github.com/user-attachments/assets/ebe11796-cca9-4a46-b9c7-0709c7932818) ![IMG_6269](https://github.com/user-attachments/assets/e89c33dc-ff47-4bb7-855e-863fe44b3202) --- ### AI Usage Did you use AI tools to help write this code? **YES**, Claude Code
2026-02-05 05:28:10 -05:00
(blockStyle.alignment == CssTextAlign::Justify || blockStyle.alignment == CssTextAlign::Left)
? blockStyle.textIndent
: 0;
// Ensure any word that would overflow even as the first entry on a line is split using fallback hyphenation.
for (size_t i = 0; i < wordWidths.size(); ++i) {
feat: Add CSS parsing and CSS support in EPUBs (#411) ## Summary * **What is the goal of this PR?** - Adds basic CSS parsing to EPUBs and determine the CSS rules when rendering to the screen so that text is styled correctly. Currently supports bold, underline, italics, margin, padding, and text alignment ## Additional Context - My main reason for wanting this is that the book I'm currently reading, Carl's Doomsday Scenario (2nd in the Dungeon Crawler Carl series), relies _a lot_ on styled text for telling parts of the story. When text is bolded, it's supposed to be a message that's rendered "on-screen" in the story. When characters are "chatting" with each other, the text is bolded and their names are underlined. Plus, normal emphasis is provided with italicizing words here and there. So, this greatly improves my experience reading this book on the Xteink, and I figured it was useful enough for others too. - For transparency: I'm a software engineer, but I'm mostly frontend and TypeScript/JavaScript. It's been _years_ since I did any C/C++, so I would not be surprised if I'm doing something dumb along the way in this code. Please don't hesitate to ask for changes if something looks off. I heavily relied on Claude Code for help, and I had a lot of inspiration from how [microreader](https://github.com/CidVonHighwind/microreader) achieves their CSS parsing and styling. I did give this as good of a code review as I could and went through everything, and _it works on my machine_ 😄 ### Before ![IMG_6271](https://github.com/user-attachments/assets/dba7554d-efb6-4d13-88bc-8b83cd1fc615) ![IMG_6272](https://github.com/user-attachments/assets/61ba2de0-87c9-4f39-956f-013da4fe20a4) ### After ![IMG_6268](https://github.com/user-attachments/assets/ebe11796-cca9-4a46-b9c7-0709c7932818) ![IMG_6269](https://github.com/user-attachments/assets/e89c33dc-ff47-4bb7-855e-863fe44b3202) --- ### AI Usage Did you use AI tools to help write this code? **YES**, Claude Code
2026-02-05 05:28:10 -05:00
// First word needs to fit in reduced width if there's an indent
const int effectiveWidth = i == 0 ? pageWidth - firstLineIndent : pageWidth;
while (wordWidths[i] > effectiveWidth) {
perf: Replace std::list with std::vector in text layout (#1038) ## Summary _Revision to @blindbat's #802. Description comes from the original PR._ - Replace `std::list` with `std::vector` for word storage in `TextBlock` and `ParsedText` - Use index-based access (`words[i]`) instead of iterator advancement (`std::advance(it, n)`) - Remove the separate `continuesVec` copy that was built from `wordContinues` for O(1) access — now unnecessary since `std::vector<bool>` already provides O(1) indexing ## Why `std::list` allocates each node individually on the heap with 16 bytes of prev/next pointer overhead per node. For text layout with many small words, this means: - Scattered heap allocations instead of contiguous memory - Poor cache locality during iteration (each node can be anywhere in memory) - Per-node malloc/free overhead during construction and destruction `std::vector` stores elements contiguously, giving better cache performance during the tight rendering and layout loops. The `extractLine` function also benefits: list splice was O(1) but required maintaining three parallel iterators, while vector range construction with move iterators is simpler and still efficient for the small line-sized chunks involved. ## Files changed - `lib/Epub/Epub/blocks/TextBlock.h` / `.cpp` - `lib/Epub/Epub/ParsedText.h` / `.cpp` ## AI Usage YES ## Test plan - [ ] Open an EPUB with mixed formatting (bold, italic, underline) — verify text renders correctly - [ ] Open a book with justified text — verify word spacing is correct - [ ] Open a book with hyphenation enabled — verify words break correctly at hyphens - [ ] Navigate through pages rapidly — verify no rendering glitches or crashes - [ ] Open a book with long paragraphs — verify text layout matches pre-change behavior --------- Co-authored-by: Kuanysh Bekkulov <kbekkulov@gmail.com>
2026-02-21 22:28:56 -06:00
if (!hyphenateWordAtIndex(i, effectiveWidth, renderer, fontId, wordWidths, /*allowFallbackBreaks=*/true)) {
break;
}
}
}
const size_t totalWordCount = words.size();
// DP table to store the minimum badness (cost) of lines starting at index i
std::vector<int> dp(totalWordCount);
// 'ans[i]' stores the index 'j' of the *last word* in the optimal line starting at 'i'
std::vector<size_t> ans(totalWordCount);
// Base Case
dp[totalWordCount - 1] = 0;
ans[totalWordCount - 1] = totalWordCount - 1;
for (int i = totalWordCount - 2; i >= 0; --i) {
int currlen = 0;
dp[i] = MAX_COST;
feat: Add CSS parsing and CSS support in EPUBs (#411) ## Summary * **What is the goal of this PR?** - Adds basic CSS parsing to EPUBs and determine the CSS rules when rendering to the screen so that text is styled correctly. Currently supports bold, underline, italics, margin, padding, and text alignment ## Additional Context - My main reason for wanting this is that the book I'm currently reading, Carl's Doomsday Scenario (2nd in the Dungeon Crawler Carl series), relies _a lot_ on styled text for telling parts of the story. When text is bolded, it's supposed to be a message that's rendered "on-screen" in the story. When characters are "chatting" with each other, the text is bolded and their names are underlined. Plus, normal emphasis is provided with italicizing words here and there. So, this greatly improves my experience reading this book on the Xteink, and I figured it was useful enough for others too. - For transparency: I'm a software engineer, but I'm mostly frontend and TypeScript/JavaScript. It's been _years_ since I did any C/C++, so I would not be surprised if I'm doing something dumb along the way in this code. Please don't hesitate to ask for changes if something looks off. I heavily relied on Claude Code for help, and I had a lot of inspiration from how [microreader](https://github.com/CidVonHighwind/microreader) achieves their CSS parsing and styling. I did give this as good of a code review as I could and went through everything, and _it works on my machine_ 😄 ### Before ![IMG_6271](https://github.com/user-attachments/assets/dba7554d-efb6-4d13-88bc-8b83cd1fc615) ![IMG_6272](https://github.com/user-attachments/assets/61ba2de0-87c9-4f39-956f-013da4fe20a4) ### After ![IMG_6268](https://github.com/user-attachments/assets/ebe11796-cca9-4a46-b9c7-0709c7932818) ![IMG_6269](https://github.com/user-attachments/assets/e89c33dc-ff47-4bb7-855e-863fe44b3202) --- ### AI Usage Did you use AI tools to help write this code? **YES**, Claude Code
2026-02-05 05:28:10 -05:00
// First line has reduced width due to text-indent
const int effectivePageWidth = i == 0 ? pageWidth - firstLineIndent : pageWidth;
for (size_t j = i; j < totalWordCount; ++j) {
// Add space before word j, unless it's the first word on the line or a continuation
feat: Support for kerning and ligatures (#873) ## Summary **What is the goal of this PR?** Improved typesetting, including [kerning](https://en.wikipedia.org/wiki/Kerning) and [ligatures](https://en.wikipedia.org/wiki/Ligature_(writing)#Latin_alphabet). **What changes are included?** - The script to convert built-in fonts now adds kerning and ligature information to the generated font headers. - Epub page layout calculates proper kerning spaces and makes ligature substitutions according to the selected font. ![3U1B1808](https://github.com/user-attachments/assets/1accb16f-2f1a-41e5-adca-89f1f1348494) ![3U1B1810](https://github.com/user-attachments/assets/2f6bd007-490e-420f-b774-3380b4add7ea) ![3U1B1815](https://github.com/user-attachments/assets/1986bb77-2db0-46e2-a5d6-8315dae9eb19) ## Additional Context - I am not a typography expert. - The implementation has been reworked from the earlier version, so it is no longer necessary to omit Open Dyslexic, and kerning data now covers all fonts, styles, and codepoints for which we include bitmap data. - Claude Opus 4.6 helped with a lot of this. - There's an included test epub document with lots of kerning and ligature examples, shown in the photos. **_After some time to mature, I think this change is in decent shape to merge and get people testing._** After opening this PR I came across #660, which overlaps in adding ligature support. --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES, Claude Opus 4.6**_ --------- Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-24 02:31:43 -06:00
int gap = 0;
if (j > static_cast<size_t>(i) && !continuesVec[j]) {
gap = spaceWidth;
gap += renderer.getSpaceKernAdjust(fontId, lastCodepoint(words[j - 1]), firstCodepoint(words[j]),
wordStyles[j - 1]);
} else if (j > static_cast<size_t>(i) && continuesVec[j]) {
// Cross-boundary kerning for continuation words (e.g. nonbreaking spaces, attached punctuation)
gap = renderer.getKerning(fontId, lastCodepoint(words[j - 1]), firstCodepoint(words[j]), wordStyles[j - 1]);
}
currlen += wordWidths[j] + gap;
feat: Add CSS parsing and CSS support in EPUBs (#411) ## Summary * **What is the goal of this PR?** - Adds basic CSS parsing to EPUBs and determine the CSS rules when rendering to the screen so that text is styled correctly. Currently supports bold, underline, italics, margin, padding, and text alignment ## Additional Context - My main reason for wanting this is that the book I'm currently reading, Carl's Doomsday Scenario (2nd in the Dungeon Crawler Carl series), relies _a lot_ on styled text for telling parts of the story. When text is bolded, it's supposed to be a message that's rendered "on-screen" in the story. When characters are "chatting" with each other, the text is bolded and their names are underlined. Plus, normal emphasis is provided with italicizing words here and there. So, this greatly improves my experience reading this book on the Xteink, and I figured it was useful enough for others too. - For transparency: I'm a software engineer, but I'm mostly frontend and TypeScript/JavaScript. It's been _years_ since I did any C/C++, so I would not be surprised if I'm doing something dumb along the way in this code. Please don't hesitate to ask for changes if something looks off. I heavily relied on Claude Code for help, and I had a lot of inspiration from how [microreader](https://github.com/CidVonHighwind/microreader) achieves their CSS parsing and styling. I did give this as good of a code review as I could and went through everything, and _it works on my machine_ 😄 ### Before ![IMG_6271](https://github.com/user-attachments/assets/dba7554d-efb6-4d13-88bc-8b83cd1fc615) ![IMG_6272](https://github.com/user-attachments/assets/61ba2de0-87c9-4f39-956f-013da4fe20a4) ### After ![IMG_6268](https://github.com/user-attachments/assets/ebe11796-cca9-4a46-b9c7-0709c7932818) ![IMG_6269](https://github.com/user-attachments/assets/e89c33dc-ff47-4bb7-855e-863fe44b3202) --- ### AI Usage Did you use AI tools to help write this code? **YES**, Claude Code
2026-02-05 05:28:10 -05:00
if (currlen > effectivePageWidth) {
break;
}
// Cannot break after word j if the next word attaches to it (continuation group)
if (j + 1 < totalWordCount && continuesVec[j + 1]) {
continue;
}
int cost;
if (j == totalWordCount - 1) {
cost = 0; // Last line
} else {
feat: Add CSS parsing and CSS support in EPUBs (#411) ## Summary * **What is the goal of this PR?** - Adds basic CSS parsing to EPUBs and determine the CSS rules when rendering to the screen so that text is styled correctly. Currently supports bold, underline, italics, margin, padding, and text alignment ## Additional Context - My main reason for wanting this is that the book I'm currently reading, Carl's Doomsday Scenario (2nd in the Dungeon Crawler Carl series), relies _a lot_ on styled text for telling parts of the story. When text is bolded, it's supposed to be a message that's rendered "on-screen" in the story. When characters are "chatting" with each other, the text is bolded and their names are underlined. Plus, normal emphasis is provided with italicizing words here and there. So, this greatly improves my experience reading this book on the Xteink, and I figured it was useful enough for others too. - For transparency: I'm a software engineer, but I'm mostly frontend and TypeScript/JavaScript. It's been _years_ since I did any C/C++, so I would not be surprised if I'm doing something dumb along the way in this code. Please don't hesitate to ask for changes if something looks off. I heavily relied on Claude Code for help, and I had a lot of inspiration from how [microreader](https://github.com/CidVonHighwind/microreader) achieves their CSS parsing and styling. I did give this as good of a code review as I could and went through everything, and _it works on my machine_ 😄 ### Before ![IMG_6271](https://github.com/user-attachments/assets/dba7554d-efb6-4d13-88bc-8b83cd1fc615) ![IMG_6272](https://github.com/user-attachments/assets/61ba2de0-87c9-4f39-956f-013da4fe20a4) ### After ![IMG_6268](https://github.com/user-attachments/assets/ebe11796-cca9-4a46-b9c7-0709c7932818) ![IMG_6269](https://github.com/user-attachments/assets/e89c33dc-ff47-4bb7-855e-863fe44b3202) --- ### AI Usage Did you use AI tools to help write this code? **YES**, Claude Code
2026-02-05 05:28:10 -05:00
const int remainingSpace = effectivePageWidth - currlen;
// Use long long for the square to prevent overflow
const long long cost_ll = static_cast<long long>(remainingSpace) * remainingSpace + dp[j + 1];
if (cost_ll > MAX_COST) {
cost = MAX_COST;
} else {
cost = static_cast<int>(cost_ll);
}
}
if (cost < dp[i]) {
dp[i] = cost;
ans[i] = j; // j is the index of the last word in this optimal line
}
}
Add retry logic and progress bar for chapter indexing (#128) ## Summary * **What is the goal of this PR?** Improve reliability and user experience during chapter indexing by adding retry logic for SD card operations and a visual progress bar. * **What changes are included?** - **Retry logic**: Add 3 retry attempts with 50ms delay for ZIP to SD card streaming to handle timing issues after display refresh - **Progress bar**: Display a visual progress bar (0-100%) during chapter indexing based on file read progress, updating every 10% to balance responsiveness with e-ink display limitations ## Additional Context * **Problem observed**: When navigating quickly through books with many chapters (before chapter titles finish rendering), the "Indexing..." screen would appear frozen. Checking the serial log revealed the operation had silently failed, but the UI showed no indication of this. Users would likely assume the device had crashed. Pressing the next button again would resume operation, but this behavior was confusing and unexpected. * **Solution**: - Retry logic handles transient SD card timing failures automatically, so users don't need to manually retry - Progress bar provides visual feedback so users know indexing is actively working (not frozen) * **Why timing issues occur**: After display refresh operations, there can be timing conflicts when immediately starting SD card write operations. This is more likely to happen when rapidly navigating through chapters. * **Progress bar design**: Updates every 10% to avoid excessive e-ink refreshes while still providing meaningful feedback during long indexing operations (especially for large chapters with CJK characters). * **Performance**: Minimal overhead - progress calculation is simple byte counting, and display updates use `FAST_REFRESH` mode.
2025-12-28 13:59:44 +09:00
// Handle oversized word: if no valid configuration found, force single-word line
// This prevents cascade failure where one oversized word breaks all preceding words
if (dp[i] == MAX_COST) {
ans[i] = i; // Just this word on its own line
// Inherit cost from next word to allow subsequent words to find valid configurations
if (i + 1 < static_cast<int>(totalWordCount)) {
dp[i] = dp[i + 1];
} else {
dp[i] = 0;
}
}
}
// Stores the index of the word that starts the next line (last_word_index + 1)
std::vector<size_t> lineBreakIndices;
size_t currentWordIndex = 0;
while (currentWordIndex < totalWordCount) {
size_t nextBreakIndex = ans[currentWordIndex] + 1;
// Safety check: prevent infinite loop if nextBreakIndex doesn't advance
if (nextBreakIndex <= currentWordIndex) {
// Force advance by at least one word to avoid infinite loop
nextBreakIndex = currentWordIndex + 1;
}
lineBreakIndices.push_back(nextBreakIndex);
currentWordIndex = nextBreakIndex;
}
return lineBreakIndices;
}
void ParsedText::applyParagraphIndent() {
if (extraParagraphSpacing || words.empty()) {
return;
}
feat: Add CSS parsing and CSS support in EPUBs (#411) ## Summary * **What is the goal of this PR?** - Adds basic CSS parsing to EPUBs and determine the CSS rules when rendering to the screen so that text is styled correctly. Currently supports bold, underline, italics, margin, padding, and text alignment ## Additional Context - My main reason for wanting this is that the book I'm currently reading, Carl's Doomsday Scenario (2nd in the Dungeon Crawler Carl series), relies _a lot_ on styled text for telling parts of the story. When text is bolded, it's supposed to be a message that's rendered "on-screen" in the story. When characters are "chatting" with each other, the text is bolded and their names are underlined. Plus, normal emphasis is provided with italicizing words here and there. So, this greatly improves my experience reading this book on the Xteink, and I figured it was useful enough for others too. - For transparency: I'm a software engineer, but I'm mostly frontend and TypeScript/JavaScript. It's been _years_ since I did any C/C++, so I would not be surprised if I'm doing something dumb along the way in this code. Please don't hesitate to ask for changes if something looks off. I heavily relied on Claude Code for help, and I had a lot of inspiration from how [microreader](https://github.com/CidVonHighwind/microreader) achieves their CSS parsing and styling. I did give this as good of a code review as I could and went through everything, and _it works on my machine_ 😄 ### Before ![IMG_6271](https://github.com/user-attachments/assets/dba7554d-efb6-4d13-88bc-8b83cd1fc615) ![IMG_6272](https://github.com/user-attachments/assets/61ba2de0-87c9-4f39-956f-013da4fe20a4) ### After ![IMG_6268](https://github.com/user-attachments/assets/ebe11796-cca9-4a46-b9c7-0709c7932818) ![IMG_6269](https://github.com/user-attachments/assets/e89c33dc-ff47-4bb7-855e-863fe44b3202) --- ### AI Usage Did you use AI tools to help write this code? **YES**, Claude Code
2026-02-05 05:28:10 -05:00
if (blockStyle.textIndentDefined) {
// CSS text-indent is explicitly set (even if 0) - don't use fallback EmSpace
// The actual indent positioning is handled in extractLine()
} else if (blockStyle.alignment == CssTextAlign::Justify || blockStyle.alignment == CssTextAlign::Left) {
// No CSS text-indent defined - use EmSpace fallback for visual indent
words.front().insert(0, "\xe2\x80\x83");
}
}
// Builds break indices while opportunistically splitting the word that would overflow the current line.
std::vector<size_t> ParsedText::computeHyphenatedLineBreaks(const GfxRenderer& renderer, const int fontId,
const int pageWidth, const int spaceWidth,
std::vector<uint16_t>& wordWidths,
std::vector<bool>& continuesVec) {
fix: Hanging indent (negative text-indent) and em-unit sizing (#1229) ## Summary * **What is the goal of this PR?** Fixing two independent CSS rendering bugs combined to make hanging-indent list styles (e.g. margin-left:3em; text-indent:-1em) render incorrectly: * **What changes are included?** 1. Negative text-indent was silently ignored Three guards in ParsedText.cpp (computeLineBreaks, computeHyphenatedLineBreaks, extractLine) conditioned firstLineIndent on blockStyle.textIndent > 0, so any negative value collapsed to zero. Additionally, wordXpos was uint16_t, which cannot represent negative offsets — a cast of e.g. −18 would wrap to 65518 and render the word far off-screen. 2. extraParagraphSpacing suppressed hanging indents Even after removing the > 0 guard, the existing !extraParagraphSpacing condition would still suppress all text-indent when that setting is on (its default). Positive text-indent is a decorative paragraph indent that the user can reasonably replace with vertical spacing — negative text-indent is structural (it positions the list marker) and must always apply. 3. em unit was calibrated against line height, not font size emSize was computed as getLineHeight() * lineCompression (the full line advance). CSS em units are defined relative to the font-size, which corresponds to the ascender height — not the line height. Using line height makes every em-based margin/indent ~20–30% wider than a browser would render it, and is especially noticeable for CSS that uses font-size: small (which we do not implement). ## Additional Context Test case ``` .lsl1 { margin-left: 3em; text-indent: -1em; } <div class="lsl1">• First list item that wraps across lines</div> <div class="lsl1">• Short item</div> ``` Before: all lines of all items started at 3 em from the left edge (indent ignored). After: the bullet marker hangs at 2 em; continuation lines align at 3 em. <img width="240" alt="before" src="https://github.com/user-attachments/assets/9dcbf3e0-fcd9-4af8-b451-a90ba4d2fb75" /> <img width="240" alt="after" src="https://github.com/user-attachments/assets/1ffdcf56-a180-4267-9590-c60d7ac44707" /> --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES**_
2026-03-02 12:02:09 +01:00
// Calculate first line indent (only for left/justified text).
// Positive text-indent (paragraph indent) is suppressed when extraParagraphSpacing is on.
// Negative text-indent (hanging indent, e.g. margin-left:3em; text-indent:-1em) always applies —
// it is structural (positions the bullet/marker), not decorative.
feat: Add CSS parsing and CSS support in EPUBs (#411) ## Summary * **What is the goal of this PR?** - Adds basic CSS parsing to EPUBs and determine the CSS rules when rendering to the screen so that text is styled correctly. Currently supports bold, underline, italics, margin, padding, and text alignment ## Additional Context - My main reason for wanting this is that the book I'm currently reading, Carl's Doomsday Scenario (2nd in the Dungeon Crawler Carl series), relies _a lot_ on styled text for telling parts of the story. When text is bolded, it's supposed to be a message that's rendered "on-screen" in the story. When characters are "chatting" with each other, the text is bolded and their names are underlined. Plus, normal emphasis is provided with italicizing words here and there. So, this greatly improves my experience reading this book on the Xteink, and I figured it was useful enough for others too. - For transparency: I'm a software engineer, but I'm mostly frontend and TypeScript/JavaScript. It's been _years_ since I did any C/C++, so I would not be surprised if I'm doing something dumb along the way in this code. Please don't hesitate to ask for changes if something looks off. I heavily relied on Claude Code for help, and I had a lot of inspiration from how [microreader](https://github.com/CidVonHighwind/microreader) achieves their CSS parsing and styling. I did give this as good of a code review as I could and went through everything, and _it works on my machine_ 😄 ### Before ![IMG_6271](https://github.com/user-attachments/assets/dba7554d-efb6-4d13-88bc-8b83cd1fc615) ![IMG_6272](https://github.com/user-attachments/assets/61ba2de0-87c9-4f39-956f-013da4fe20a4) ### After ![IMG_6268](https://github.com/user-attachments/assets/ebe11796-cca9-4a46-b9c7-0709c7932818) ![IMG_6269](https://github.com/user-attachments/assets/e89c33dc-ff47-4bb7-855e-863fe44b3202) --- ### AI Usage Did you use AI tools to help write this code? **YES**, Claude Code
2026-02-05 05:28:10 -05:00
const int firstLineIndent =
fix: Hanging indent (negative text-indent) and em-unit sizing (#1229) ## Summary * **What is the goal of this PR?** Fixing two independent CSS rendering bugs combined to make hanging-indent list styles (e.g. margin-left:3em; text-indent:-1em) render incorrectly: * **What changes are included?** 1. Negative text-indent was silently ignored Three guards in ParsedText.cpp (computeLineBreaks, computeHyphenatedLineBreaks, extractLine) conditioned firstLineIndent on blockStyle.textIndent > 0, so any negative value collapsed to zero. Additionally, wordXpos was uint16_t, which cannot represent negative offsets — a cast of e.g. −18 would wrap to 65518 and render the word far off-screen. 2. extraParagraphSpacing suppressed hanging indents Even after removing the > 0 guard, the existing !extraParagraphSpacing condition would still suppress all text-indent when that setting is on (its default). Positive text-indent is a decorative paragraph indent that the user can reasonably replace with vertical spacing — negative text-indent is structural (it positions the list marker) and must always apply. 3. em unit was calibrated against line height, not font size emSize was computed as getLineHeight() * lineCompression (the full line advance). CSS em units are defined relative to the font-size, which corresponds to the ascender height — not the line height. Using line height makes every em-based margin/indent ~20–30% wider than a browser would render it, and is especially noticeable for CSS that uses font-size: small (which we do not implement). ## Additional Context Test case ``` .lsl1 { margin-left: 3em; text-indent: -1em; } <div class="lsl1">• First list item that wraps across lines</div> <div class="lsl1">• Short item</div> ``` Before: all lines of all items started at 3 em from the left edge (indent ignored). After: the bullet marker hangs at 2 em; continuation lines align at 3 em. <img width="240" alt="before" src="https://github.com/user-attachments/assets/9dcbf3e0-fcd9-4af8-b451-a90ba4d2fb75" /> <img width="240" alt="after" src="https://github.com/user-attachments/assets/1ffdcf56-a180-4267-9590-c60d7ac44707" /> --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES**_
2026-03-02 12:02:09 +01:00
blockStyle.textIndentDefined && (blockStyle.textIndent < 0 || !extraParagraphSpacing) &&
feat: Add CSS parsing and CSS support in EPUBs (#411) ## Summary * **What is the goal of this PR?** - Adds basic CSS parsing to EPUBs and determine the CSS rules when rendering to the screen so that text is styled correctly. Currently supports bold, underline, italics, margin, padding, and text alignment ## Additional Context - My main reason for wanting this is that the book I'm currently reading, Carl's Doomsday Scenario (2nd in the Dungeon Crawler Carl series), relies _a lot_ on styled text for telling parts of the story. When text is bolded, it's supposed to be a message that's rendered "on-screen" in the story. When characters are "chatting" with each other, the text is bolded and their names are underlined. Plus, normal emphasis is provided with italicizing words here and there. So, this greatly improves my experience reading this book on the Xteink, and I figured it was useful enough for others too. - For transparency: I'm a software engineer, but I'm mostly frontend and TypeScript/JavaScript. It's been _years_ since I did any C/C++, so I would not be surprised if I'm doing something dumb along the way in this code. Please don't hesitate to ask for changes if something looks off. I heavily relied on Claude Code for help, and I had a lot of inspiration from how [microreader](https://github.com/CidVonHighwind/microreader) achieves their CSS parsing and styling. I did give this as good of a code review as I could and went through everything, and _it works on my machine_ 😄 ### Before ![IMG_6271](https://github.com/user-attachments/assets/dba7554d-efb6-4d13-88bc-8b83cd1fc615) ![IMG_6272](https://github.com/user-attachments/assets/61ba2de0-87c9-4f39-956f-013da4fe20a4) ### After ![IMG_6268](https://github.com/user-attachments/assets/ebe11796-cca9-4a46-b9c7-0709c7932818) ![IMG_6269](https://github.com/user-attachments/assets/e89c33dc-ff47-4bb7-855e-863fe44b3202) --- ### AI Usage Did you use AI tools to help write this code? **YES**, Claude Code
2026-02-05 05:28:10 -05:00
(blockStyle.alignment == CssTextAlign::Justify || blockStyle.alignment == CssTextAlign::Left)
? blockStyle.textIndent
: 0;
std::vector<size_t> lineBreakIndices;
size_t currentIndex = 0;
feat: Add CSS parsing and CSS support in EPUBs (#411) ## Summary * **What is the goal of this PR?** - Adds basic CSS parsing to EPUBs and determine the CSS rules when rendering to the screen so that text is styled correctly. Currently supports bold, underline, italics, margin, padding, and text alignment ## Additional Context - My main reason for wanting this is that the book I'm currently reading, Carl's Doomsday Scenario (2nd in the Dungeon Crawler Carl series), relies _a lot_ on styled text for telling parts of the story. When text is bolded, it's supposed to be a message that's rendered "on-screen" in the story. When characters are "chatting" with each other, the text is bolded and their names are underlined. Plus, normal emphasis is provided with italicizing words here and there. So, this greatly improves my experience reading this book on the Xteink, and I figured it was useful enough for others too. - For transparency: I'm a software engineer, but I'm mostly frontend and TypeScript/JavaScript. It's been _years_ since I did any C/C++, so I would not be surprised if I'm doing something dumb along the way in this code. Please don't hesitate to ask for changes if something looks off. I heavily relied on Claude Code for help, and I had a lot of inspiration from how [microreader](https://github.com/CidVonHighwind/microreader) achieves their CSS parsing and styling. I did give this as good of a code review as I could and went through everything, and _it works on my machine_ 😄 ### Before ![IMG_6271](https://github.com/user-attachments/assets/dba7554d-efb6-4d13-88bc-8b83cd1fc615) ![IMG_6272](https://github.com/user-attachments/assets/61ba2de0-87c9-4f39-956f-013da4fe20a4) ### After ![IMG_6268](https://github.com/user-attachments/assets/ebe11796-cca9-4a46-b9c7-0709c7932818) ![IMG_6269](https://github.com/user-attachments/assets/e89c33dc-ff47-4bb7-855e-863fe44b3202) --- ### AI Usage Did you use AI tools to help write this code? **YES**, Claude Code
2026-02-05 05:28:10 -05:00
bool isFirstLine = true;
while (currentIndex < wordWidths.size()) {
const size_t lineStart = currentIndex;
int lineWidth = 0;
feat: Add CSS parsing and CSS support in EPUBs (#411) ## Summary * **What is the goal of this PR?** - Adds basic CSS parsing to EPUBs and determine the CSS rules when rendering to the screen so that text is styled correctly. Currently supports bold, underline, italics, margin, padding, and text alignment ## Additional Context - My main reason for wanting this is that the book I'm currently reading, Carl's Doomsday Scenario (2nd in the Dungeon Crawler Carl series), relies _a lot_ on styled text for telling parts of the story. When text is bolded, it's supposed to be a message that's rendered "on-screen" in the story. When characters are "chatting" with each other, the text is bolded and their names are underlined. Plus, normal emphasis is provided with italicizing words here and there. So, this greatly improves my experience reading this book on the Xteink, and I figured it was useful enough for others too. - For transparency: I'm a software engineer, but I'm mostly frontend and TypeScript/JavaScript. It's been _years_ since I did any C/C++, so I would not be surprised if I'm doing something dumb along the way in this code. Please don't hesitate to ask for changes if something looks off. I heavily relied on Claude Code for help, and I had a lot of inspiration from how [microreader](https://github.com/CidVonHighwind/microreader) achieves their CSS parsing and styling. I did give this as good of a code review as I could and went through everything, and _it works on my machine_ 😄 ### Before ![IMG_6271](https://github.com/user-attachments/assets/dba7554d-efb6-4d13-88bc-8b83cd1fc615) ![IMG_6272](https://github.com/user-attachments/assets/61ba2de0-87c9-4f39-956f-013da4fe20a4) ### After ![IMG_6268](https://github.com/user-attachments/assets/ebe11796-cca9-4a46-b9c7-0709c7932818) ![IMG_6269](https://github.com/user-attachments/assets/e89c33dc-ff47-4bb7-855e-863fe44b3202) --- ### AI Usage Did you use AI tools to help write this code? **YES**, Claude Code
2026-02-05 05:28:10 -05:00
// First line has reduced width due to text-indent
const int effectivePageWidth = isFirstLine ? pageWidth - firstLineIndent : pageWidth;
// Consume as many words as possible for current line, splitting when prefixes fit
while (currentIndex < wordWidths.size()) {
const bool isFirstWord = currentIndex == lineStart;
feat: Support for kerning and ligatures (#873) ## Summary **What is the goal of this PR?** Improved typesetting, including [kerning](https://en.wikipedia.org/wiki/Kerning) and [ligatures](https://en.wikipedia.org/wiki/Ligature_(writing)#Latin_alphabet). **What changes are included?** - The script to convert built-in fonts now adds kerning and ligature information to the generated font headers. - Epub page layout calculates proper kerning spaces and makes ligature substitutions according to the selected font. ![3U1B1808](https://github.com/user-attachments/assets/1accb16f-2f1a-41e5-adca-89f1f1348494) ![3U1B1810](https://github.com/user-attachments/assets/2f6bd007-490e-420f-b774-3380b4add7ea) ![3U1B1815](https://github.com/user-attachments/assets/1986bb77-2db0-46e2-a5d6-8315dae9eb19) ## Additional Context - I am not a typography expert. - The implementation has been reworked from the earlier version, so it is no longer necessary to omit Open Dyslexic, and kerning data now covers all fonts, styles, and codepoints for which we include bitmap data. - Claude Opus 4.6 helped with a lot of this. - There's an included test epub document with lots of kerning and ligature examples, shown in the photos. **_After some time to mature, I think this change is in decent shape to merge and get people testing._** After opening this PR I came across #660, which overlaps in adding ligature support. --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES, Claude Opus 4.6**_ --------- Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-24 02:31:43 -06:00
int spacing = 0;
if (!isFirstWord && !continuesVec[currentIndex]) {
spacing = spaceWidth;
spacing += renderer.getSpaceKernAdjust(fontId, lastCodepoint(words[currentIndex - 1]),
firstCodepoint(words[currentIndex]), wordStyles[currentIndex - 1]);
} else if (!isFirstWord && continuesVec[currentIndex]) {
// Cross-boundary kerning for continuation words (e.g. nonbreaking spaces, attached punctuation)
spacing = renderer.getKerning(fontId, lastCodepoint(words[currentIndex - 1]),
firstCodepoint(words[currentIndex]), wordStyles[currentIndex - 1]);
}
const int candidateWidth = spacing + wordWidths[currentIndex];
// Word fits on current line
feat: Add CSS parsing and CSS support in EPUBs (#411) ## Summary * **What is the goal of this PR?** - Adds basic CSS parsing to EPUBs and determine the CSS rules when rendering to the screen so that text is styled correctly. Currently supports bold, underline, italics, margin, padding, and text alignment ## Additional Context - My main reason for wanting this is that the book I'm currently reading, Carl's Doomsday Scenario (2nd in the Dungeon Crawler Carl series), relies _a lot_ on styled text for telling parts of the story. When text is bolded, it's supposed to be a message that's rendered "on-screen" in the story. When characters are "chatting" with each other, the text is bolded and their names are underlined. Plus, normal emphasis is provided with italicizing words here and there. So, this greatly improves my experience reading this book on the Xteink, and I figured it was useful enough for others too. - For transparency: I'm a software engineer, but I'm mostly frontend and TypeScript/JavaScript. It's been _years_ since I did any C/C++, so I would not be surprised if I'm doing something dumb along the way in this code. Please don't hesitate to ask for changes if something looks off. I heavily relied on Claude Code for help, and I had a lot of inspiration from how [microreader](https://github.com/CidVonHighwind/microreader) achieves their CSS parsing and styling. I did give this as good of a code review as I could and went through everything, and _it works on my machine_ 😄 ### Before ![IMG_6271](https://github.com/user-attachments/assets/dba7554d-efb6-4d13-88bc-8b83cd1fc615) ![IMG_6272](https://github.com/user-attachments/assets/61ba2de0-87c9-4f39-956f-013da4fe20a4) ### After ![IMG_6268](https://github.com/user-attachments/assets/ebe11796-cca9-4a46-b9c7-0709c7932818) ![IMG_6269](https://github.com/user-attachments/assets/e89c33dc-ff47-4bb7-855e-863fe44b3202) --- ### AI Usage Did you use AI tools to help write this code? **YES**, Claude Code
2026-02-05 05:28:10 -05:00
if (lineWidth + candidateWidth <= effectivePageWidth) {
lineWidth += candidateWidth;
++currentIndex;
continue;
}
// Word would overflow — try to split based on hyphenation points
feat: Add CSS parsing and CSS support in EPUBs (#411) ## Summary * **What is the goal of this PR?** - Adds basic CSS parsing to EPUBs and determine the CSS rules when rendering to the screen so that text is styled correctly. Currently supports bold, underline, italics, margin, padding, and text alignment ## Additional Context - My main reason for wanting this is that the book I'm currently reading, Carl's Doomsday Scenario (2nd in the Dungeon Crawler Carl series), relies _a lot_ on styled text for telling parts of the story. When text is bolded, it's supposed to be a message that's rendered "on-screen" in the story. When characters are "chatting" with each other, the text is bolded and their names are underlined. Plus, normal emphasis is provided with italicizing words here and there. So, this greatly improves my experience reading this book on the Xteink, and I figured it was useful enough for others too. - For transparency: I'm a software engineer, but I'm mostly frontend and TypeScript/JavaScript. It's been _years_ since I did any C/C++, so I would not be surprised if I'm doing something dumb along the way in this code. Please don't hesitate to ask for changes if something looks off. I heavily relied on Claude Code for help, and I had a lot of inspiration from how [microreader](https://github.com/CidVonHighwind/microreader) achieves their CSS parsing and styling. I did give this as good of a code review as I could and went through everything, and _it works on my machine_ 😄 ### Before ![IMG_6271](https://github.com/user-attachments/assets/dba7554d-efb6-4d13-88bc-8b83cd1fc615) ![IMG_6272](https://github.com/user-attachments/assets/61ba2de0-87c9-4f39-956f-013da4fe20a4) ### After ![IMG_6268](https://github.com/user-attachments/assets/ebe11796-cca9-4a46-b9c7-0709c7932818) ![IMG_6269](https://github.com/user-attachments/assets/e89c33dc-ff47-4bb7-855e-863fe44b3202) --- ### AI Usage Did you use AI tools to help write this code? **YES**, Claude Code
2026-02-05 05:28:10 -05:00
const int availableWidth = effectivePageWidth - lineWidth - spacing;
const bool allowFallbackBreaks = isFirstWord; // Only for first word on line
perf: Replace std::list with std::vector in text layout (#1038) ## Summary _Revision to @blindbat's #802. Description comes from the original PR._ - Replace `std::list` with `std::vector` for word storage in `TextBlock` and `ParsedText` - Use index-based access (`words[i]`) instead of iterator advancement (`std::advance(it, n)`) - Remove the separate `continuesVec` copy that was built from `wordContinues` for O(1) access — now unnecessary since `std::vector<bool>` already provides O(1) indexing ## Why `std::list` allocates each node individually on the heap with 16 bytes of prev/next pointer overhead per node. For text layout with many small words, this means: - Scattered heap allocations instead of contiguous memory - Poor cache locality during iteration (each node can be anywhere in memory) - Per-node malloc/free overhead during construction and destruction `std::vector` stores elements contiguously, giving better cache performance during the tight rendering and layout loops. The `extractLine` function also benefits: list splice was O(1) but required maintaining three parallel iterators, while vector range construction with move iterators is simpler and still efficient for the small line-sized chunks involved. ## Files changed - `lib/Epub/Epub/blocks/TextBlock.h` / `.cpp` - `lib/Epub/Epub/ParsedText.h` / `.cpp` ## AI Usage YES ## Test plan - [ ] Open an EPUB with mixed formatting (bold, italic, underline) — verify text renders correctly - [ ] Open a book with justified text — verify word spacing is correct - [ ] Open a book with hyphenation enabled — verify words break correctly at hyphens - [ ] Navigate through pages rapidly — verify no rendering glitches or crashes - [ ] Open a book with long paragraphs — verify text layout matches pre-change behavior --------- Co-authored-by: Kuanysh Bekkulov <kbekkulov@gmail.com>
2026-02-21 22:28:56 -06:00
if (availableWidth > 0 &&
hyphenateWordAtIndex(currentIndex, availableWidth, renderer, fontId, wordWidths, allowFallbackBreaks)) {
// Prefix now fits; append it to this line and move to next line
lineWidth += spacing + wordWidths[currentIndex];
++currentIndex;
break;
}
// Could not split: force at least one word per line to avoid infinite loop
if (currentIndex == lineStart) {
lineWidth += candidateWidth;
++currentIndex;
}
break;
}
// Don't break before a continuation word (e.g., orphaned "?" after "question").
// Backtrack to the start of the continuation group so the whole group moves to the next line.
while (currentIndex > lineStart + 1 && currentIndex < wordWidths.size() && continuesVec[currentIndex]) {
--currentIndex;
}
lineBreakIndices.push_back(currentIndex);
feat: Add CSS parsing and CSS support in EPUBs (#411) ## Summary * **What is the goal of this PR?** - Adds basic CSS parsing to EPUBs and determine the CSS rules when rendering to the screen so that text is styled correctly. Currently supports bold, underline, italics, margin, padding, and text alignment ## Additional Context - My main reason for wanting this is that the book I'm currently reading, Carl's Doomsday Scenario (2nd in the Dungeon Crawler Carl series), relies _a lot_ on styled text for telling parts of the story. When text is bolded, it's supposed to be a message that's rendered "on-screen" in the story. When characters are "chatting" with each other, the text is bolded and their names are underlined. Plus, normal emphasis is provided with italicizing words here and there. So, this greatly improves my experience reading this book on the Xteink, and I figured it was useful enough for others too. - For transparency: I'm a software engineer, but I'm mostly frontend and TypeScript/JavaScript. It's been _years_ since I did any C/C++, so I would not be surprised if I'm doing something dumb along the way in this code. Please don't hesitate to ask for changes if something looks off. I heavily relied on Claude Code for help, and I had a lot of inspiration from how [microreader](https://github.com/CidVonHighwind/microreader) achieves their CSS parsing and styling. I did give this as good of a code review as I could and went through everything, and _it works on my machine_ 😄 ### Before ![IMG_6271](https://github.com/user-attachments/assets/dba7554d-efb6-4d13-88bc-8b83cd1fc615) ![IMG_6272](https://github.com/user-attachments/assets/61ba2de0-87c9-4f39-956f-013da4fe20a4) ### After ![IMG_6268](https://github.com/user-attachments/assets/ebe11796-cca9-4a46-b9c7-0709c7932818) ![IMG_6269](https://github.com/user-attachments/assets/e89c33dc-ff47-4bb7-855e-863fe44b3202) --- ### AI Usage Did you use AI tools to help write this code? **YES**, Claude Code
2026-02-05 05:28:10 -05:00
isFirstLine = false;
}
return lineBreakIndices;
}
// Splits words[wordIndex] into prefix (adding a hyphen only when needed) and remainder when a legal breakpoint fits the
// available width.
bool ParsedText::hyphenateWordAtIndex(const size_t wordIndex, const int availableWidth, const GfxRenderer& renderer,
const int fontId, std::vector<uint16_t>& wordWidths,
perf: Replace std::list with std::vector in text layout (#1038) ## Summary _Revision to @blindbat's #802. Description comes from the original PR._ - Replace `std::list` with `std::vector` for word storage in `TextBlock` and `ParsedText` - Use index-based access (`words[i]`) instead of iterator advancement (`std::advance(it, n)`) - Remove the separate `continuesVec` copy that was built from `wordContinues` for O(1) access — now unnecessary since `std::vector<bool>` already provides O(1) indexing ## Why `std::list` allocates each node individually on the heap with 16 bytes of prev/next pointer overhead per node. For text layout with many small words, this means: - Scattered heap allocations instead of contiguous memory - Poor cache locality during iteration (each node can be anywhere in memory) - Per-node malloc/free overhead during construction and destruction `std::vector` stores elements contiguously, giving better cache performance during the tight rendering and layout loops. The `extractLine` function also benefits: list splice was O(1) but required maintaining three parallel iterators, while vector range construction with move iterators is simpler and still efficient for the small line-sized chunks involved. ## Files changed - `lib/Epub/Epub/blocks/TextBlock.h` / `.cpp` - `lib/Epub/Epub/ParsedText.h` / `.cpp` ## AI Usage YES ## Test plan - [ ] Open an EPUB with mixed formatting (bold, italic, underline) — verify text renders correctly - [ ] Open a book with justified text — verify word spacing is correct - [ ] Open a book with hyphenation enabled — verify words break correctly at hyphens - [ ] Navigate through pages rapidly — verify no rendering glitches or crashes - [ ] Open a book with long paragraphs — verify text layout matches pre-change behavior --------- Co-authored-by: Kuanysh Bekkulov <kbekkulov@gmail.com>
2026-02-21 22:28:56 -06:00
const bool allowFallbackBreaks) {
// Guard against invalid indices or zero available width before attempting to split.
if (availableWidth <= 0 || wordIndex >= words.size()) {
return false;
}
perf: Replace std::list with std::vector in text layout (#1038) ## Summary _Revision to @blindbat's #802. Description comes from the original PR._ - Replace `std::list` with `std::vector` for word storage in `TextBlock` and `ParsedText` - Use index-based access (`words[i]`) instead of iterator advancement (`std::advance(it, n)`) - Remove the separate `continuesVec` copy that was built from `wordContinues` for O(1) access — now unnecessary since `std::vector<bool>` already provides O(1) indexing ## Why `std::list` allocates each node individually on the heap with 16 bytes of prev/next pointer overhead per node. For text layout with many small words, this means: - Scattered heap allocations instead of contiguous memory - Poor cache locality during iteration (each node can be anywhere in memory) - Per-node malloc/free overhead during construction and destruction `std::vector` stores elements contiguously, giving better cache performance during the tight rendering and layout loops. The `extractLine` function also benefits: list splice was O(1) but required maintaining three parallel iterators, while vector range construction with move iterators is simpler and still efficient for the small line-sized chunks involved. ## Files changed - `lib/Epub/Epub/blocks/TextBlock.h` / `.cpp` - `lib/Epub/Epub/ParsedText.h` / `.cpp` ## AI Usage YES ## Test plan - [ ] Open an EPUB with mixed formatting (bold, italic, underline) — verify text renders correctly - [ ] Open a book with justified text — verify word spacing is correct - [ ] Open a book with hyphenation enabled — verify words break correctly at hyphens - [ ] Navigate through pages rapidly — verify no rendering glitches or crashes - [ ] Open a book with long paragraphs — verify text layout matches pre-change behavior --------- Co-authored-by: Kuanysh Bekkulov <kbekkulov@gmail.com>
2026-02-21 22:28:56 -06:00
const std::string& word = words[wordIndex];
const auto style = wordStyles[wordIndex];
// Collect candidate breakpoints (byte offsets and hyphen requirements).
auto breakInfos = Hyphenator::breakOffsets(word, allowFallbackBreaks);
if (breakInfos.empty()) {
return false;
}
size_t chosenOffset = 0;
int chosenWidth = -1;
bool chosenNeedsHyphen = true;
// Iterate over each legal breakpoint and retain the widest prefix that still fits.
for (const auto& info : breakInfos) {
const size_t offset = info.byteOffset;
if (offset == 0 || offset >= word.size()) {
continue;
}
const bool needsHyphen = info.requiresInsertedHyphen;
const int prefixWidth = measureWordWidth(renderer, fontId, word.substr(0, offset), style, needsHyphen);
if (prefixWidth > availableWidth || prefixWidth <= chosenWidth) {
continue; // Skip if too wide or not an improvement
}
chosenWidth = prefixWidth;
chosenOffset = offset;
chosenNeedsHyphen = needsHyphen;
}
if (chosenWidth < 0) {
// No hyphenation point produced a prefix that fits in the remaining space.
return false;
}
// Split the word at the selected breakpoint and append a hyphen if required.
std::string remainder = word.substr(chosenOffset);
perf: Replace std::list with std::vector in text layout (#1038) ## Summary _Revision to @blindbat's #802. Description comes from the original PR._ - Replace `std::list` with `std::vector` for word storage in `TextBlock` and `ParsedText` - Use index-based access (`words[i]`) instead of iterator advancement (`std::advance(it, n)`) - Remove the separate `continuesVec` copy that was built from `wordContinues` for O(1) access — now unnecessary since `std::vector<bool>` already provides O(1) indexing ## Why `std::list` allocates each node individually on the heap with 16 bytes of prev/next pointer overhead per node. For text layout with many small words, this means: - Scattered heap allocations instead of contiguous memory - Poor cache locality during iteration (each node can be anywhere in memory) - Per-node malloc/free overhead during construction and destruction `std::vector` stores elements contiguously, giving better cache performance during the tight rendering and layout loops. The `extractLine` function also benefits: list splice was O(1) but required maintaining three parallel iterators, while vector range construction with move iterators is simpler and still efficient for the small line-sized chunks involved. ## Files changed - `lib/Epub/Epub/blocks/TextBlock.h` / `.cpp` - `lib/Epub/Epub/ParsedText.h` / `.cpp` ## AI Usage YES ## Test plan - [ ] Open an EPUB with mixed formatting (bold, italic, underline) — verify text renders correctly - [ ] Open a book with justified text — verify word spacing is correct - [ ] Open a book with hyphenation enabled — verify words break correctly at hyphens - [ ] Navigate through pages rapidly — verify no rendering glitches or crashes - [ ] Open a book with long paragraphs — verify text layout matches pre-change behavior --------- Co-authored-by: Kuanysh Bekkulov <kbekkulov@gmail.com>
2026-02-21 22:28:56 -06:00
words[wordIndex].resize(chosenOffset);
if (chosenNeedsHyphen) {
perf: Replace std::list with std::vector in text layout (#1038) ## Summary _Revision to @blindbat's #802. Description comes from the original PR._ - Replace `std::list` with `std::vector` for word storage in `TextBlock` and `ParsedText` - Use index-based access (`words[i]`) instead of iterator advancement (`std::advance(it, n)`) - Remove the separate `continuesVec` copy that was built from `wordContinues` for O(1) access — now unnecessary since `std::vector<bool>` already provides O(1) indexing ## Why `std::list` allocates each node individually on the heap with 16 bytes of prev/next pointer overhead per node. For text layout with many small words, this means: - Scattered heap allocations instead of contiguous memory - Poor cache locality during iteration (each node can be anywhere in memory) - Per-node malloc/free overhead during construction and destruction `std::vector` stores elements contiguously, giving better cache performance during the tight rendering and layout loops. The `extractLine` function also benefits: list splice was O(1) but required maintaining three parallel iterators, while vector range construction with move iterators is simpler and still efficient for the small line-sized chunks involved. ## Files changed - `lib/Epub/Epub/blocks/TextBlock.h` / `.cpp` - `lib/Epub/Epub/ParsedText.h` / `.cpp` ## AI Usage YES ## Test plan - [ ] Open an EPUB with mixed formatting (bold, italic, underline) — verify text renders correctly - [ ] Open a book with justified text — verify word spacing is correct - [ ] Open a book with hyphenation enabled — verify words break correctly at hyphens - [ ] Navigate through pages rapidly — verify no rendering glitches or crashes - [ ] Open a book with long paragraphs — verify text layout matches pre-change behavior --------- Co-authored-by: Kuanysh Bekkulov <kbekkulov@gmail.com>
2026-02-21 22:28:56 -06:00
words[wordIndex].push_back('-');
}
// Insert the remainder word (with matching style and continuation flag) directly after the prefix.
perf: Replace std::list with std::vector in text layout (#1038) ## Summary _Revision to @blindbat's #802. Description comes from the original PR._ - Replace `std::list` with `std::vector` for word storage in `TextBlock` and `ParsedText` - Use index-based access (`words[i]`) instead of iterator advancement (`std::advance(it, n)`) - Remove the separate `continuesVec` copy that was built from `wordContinues` for O(1) access — now unnecessary since `std::vector<bool>` already provides O(1) indexing ## Why `std::list` allocates each node individually on the heap with 16 bytes of prev/next pointer overhead per node. For text layout with many small words, this means: - Scattered heap allocations instead of contiguous memory - Poor cache locality during iteration (each node can be anywhere in memory) - Per-node malloc/free overhead during construction and destruction `std::vector` stores elements contiguously, giving better cache performance during the tight rendering and layout loops. The `extractLine` function also benefits: list splice was O(1) but required maintaining three parallel iterators, while vector range construction with move iterators is simpler and still efficient for the small line-sized chunks involved. ## Files changed - `lib/Epub/Epub/blocks/TextBlock.h` / `.cpp` - `lib/Epub/Epub/ParsedText.h` / `.cpp` ## AI Usage YES ## Test plan - [ ] Open an EPUB with mixed formatting (bold, italic, underline) — verify text renders correctly - [ ] Open a book with justified text — verify word spacing is correct - [ ] Open a book with hyphenation enabled — verify words break correctly at hyphens - [ ] Navigate through pages rapidly — verify no rendering glitches or crashes - [ ] Open a book with long paragraphs — verify text layout matches pre-change behavior --------- Co-authored-by: Kuanysh Bekkulov <kbekkulov@gmail.com>
2026-02-21 22:28:56 -06:00
words.insert(words.begin() + wordIndex + 1, remainder);
wordStyles.insert(wordStyles.begin() + wordIndex + 1, style);
fix: Fix hyphenation and rendering of decomposed characters (#1037) ## Summary * This PR fixes decomposed diacritic handling end-to-end: - Hyphenation: normalize common Latin base+combining sequences to precomposed codepoints before Liang pattern matching, so decomposed words hyphenate correctly - Rendering: correct combining-mark placement logic so non-spacing marks are attached to the preceding base glyph in normal and rotated text rendering paths, with corresponding text-bounds consistency updates. - Hyphenation around non breaking space variants have been fixed (and extended) - Hyphenation of terms that already included of hyphens were fixed to include Liang pattern application (eg "US-Satellitensystem" was *exclusively* broken at the existing hyphen) ## Additional Context * Before <img width="800" height="480" alt="2" src="https://github.com/user-attachments/assets/b9c515c4-ab75-45cc-8b52-f4d86bce519d" /> * After <img width="480" height="800" alt="fix1" src="https://github.com/user-attachments/assets/4999f6a8-f51c-4c0a-b144-f153f77ddb57" /> <img width="800" height="480" alt="fix2" src="https://github.com/user-attachments/assets/7355126b-80c7-441f-b390-4e0897ee3fb6" /> * Note 1: the hyphenation fix is not a 100% bullet proof implementation. It adds composition of *common* base+combining sequences (e.g. O + U+0308 -> Ö) during codepoint collection. A complete solution would require implementing proper Unicode normalization (at least NFC, possibly NFKC in specific cases) before hyphenation and rendering, instead of hand-mapping a few combining marks. That was beyond the scope of this fix. * Note 2: the render fix should be universal and not limited to the constraints outlined above: it properly x-centers the compund glyph over the previous one, and it uses at least 1pt of visual distance in y. Before: <img width="478" height="167" alt="Image" src="https://github.com/user-attachments/assets/f8db60d5-35b1-4477-96d0-5003b4e4a2a1" /> After: <img width="479" height="180" alt="Image" src="https://github.com/user-attachments/assets/1b48ef97-3a77-475a-8522-23f4aca8e904" /> * This should resolve the issues described in #998 --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**PARTIALLY**_
2026-02-22 03:11:07 +01:00
// Continuation flag handling after splitting a word into prefix + remainder.
//
// The prefix keeps the original word's continuation flag so that no-break-space groups
// stay linked. The remainder always gets continues=false because it starts on the next
// line and is not attached to the prefix.
//
// Example: "200&#xA0;Quadratkilometer" produces tokens:
// [0] "200" continues=false
// [1] " " continues=true
// [2] "Quadratkilometer" continues=true <-- the word being split
//
// After splitting "Quadratkilometer" at "Quadrat-" / "kilometer":
// [0] "200" continues=false
// [1] " " continues=true
// [2] "Quadrat-" continues=true (KEPT — still attached to the no-break group)
// [3] "kilometer" continues=false (NEW — starts fresh on the next line)
//
// This lets the backtracking loop keep the entire prefix group ("200 Quadrat-") on one
// line, while "kilometer" moves to the next line.
perf: Replace std::list with std::vector in text layout (#1038) ## Summary _Revision to @blindbat's #802. Description comes from the original PR._ - Replace `std::list` with `std::vector` for word storage in `TextBlock` and `ParsedText` - Use index-based access (`words[i]`) instead of iterator advancement (`std::advance(it, n)`) - Remove the separate `continuesVec` copy that was built from `wordContinues` for O(1) access — now unnecessary since `std::vector<bool>` already provides O(1) indexing ## Why `std::list` allocates each node individually on the heap with 16 bytes of prev/next pointer overhead per node. For text layout with many small words, this means: - Scattered heap allocations instead of contiguous memory - Poor cache locality during iteration (each node can be anywhere in memory) - Per-node malloc/free overhead during construction and destruction `std::vector` stores elements contiguously, giving better cache performance during the tight rendering and layout loops. The `extractLine` function also benefits: list splice was O(1) but required maintaining three parallel iterators, while vector range construction with move iterators is simpler and still efficient for the small line-sized chunks involved. ## Files changed - `lib/Epub/Epub/blocks/TextBlock.h` / `.cpp` - `lib/Epub/Epub/ParsedText.h` / `.cpp` ## AI Usage YES ## Test plan - [ ] Open an EPUB with mixed formatting (bold, italic, underline) — verify text renders correctly - [ ] Open a book with justified text — verify word spacing is correct - [ ] Open a book with hyphenation enabled — verify words break correctly at hyphens - [ ] Navigate through pages rapidly — verify no rendering glitches or crashes - [ ] Open a book with long paragraphs — verify text layout matches pre-change behavior --------- Co-authored-by: Kuanysh Bekkulov <kbekkulov@gmail.com>
2026-02-21 22:28:56 -06:00
// wordContinues[wordIndex] is intentionally left unchanged — the prefix keeps its original attachment.
wordContinues.insert(wordContinues.begin() + wordIndex + 1, false);
// Update cached widths to reflect the new prefix/remainder pairing.
wordWidths[wordIndex] = static_cast<uint16_t>(chosenWidth);
const uint16_t remainderWidth = measureWordWidth(renderer, fontId, remainder, style);
wordWidths.insert(wordWidths.begin() + wordIndex + 1, remainderWidth);
return true;
}
void ParsedText::extractLine(const size_t breakIndex, const int pageWidth, const int spaceWidth,
const std::vector<uint16_t>& wordWidths, const std::vector<bool>& continuesVec,
const std::vector<size_t>& lineBreakIndices,
feat: Support for kerning and ligatures (#873) ## Summary **What is the goal of this PR?** Improved typesetting, including [kerning](https://en.wikipedia.org/wiki/Kerning) and [ligatures](https://en.wikipedia.org/wiki/Ligature_(writing)#Latin_alphabet). **What changes are included?** - The script to convert built-in fonts now adds kerning and ligature information to the generated font headers. - Epub page layout calculates proper kerning spaces and makes ligature substitutions according to the selected font. ![3U1B1808](https://github.com/user-attachments/assets/1accb16f-2f1a-41e5-adca-89f1f1348494) ![3U1B1810](https://github.com/user-attachments/assets/2f6bd007-490e-420f-b774-3380b4add7ea) ![3U1B1815](https://github.com/user-attachments/assets/1986bb77-2db0-46e2-a5d6-8315dae9eb19) ## Additional Context - I am not a typography expert. - The implementation has been reworked from the earlier version, so it is no longer necessary to omit Open Dyslexic, and kerning data now covers all fonts, styles, and codepoints for which we include bitmap data. - Claude Opus 4.6 helped with a lot of this. - There's an included test epub document with lots of kerning and ligature examples, shown in the photos. **_After some time to mature, I think this change is in decent shape to merge and get people testing._** After opening this PR I came across #660, which overlaps in adding ligature support. --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES, Claude Opus 4.6**_ --------- Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-24 02:31:43 -06:00
const std::function<void(std::shared_ptr<TextBlock>)>& processLine,
const GfxRenderer& renderer, const int fontId) {
const size_t lineBreak = lineBreakIndices[breakIndex];
const size_t lastBreakAt = breakIndex > 0 ? lineBreakIndices[breakIndex - 1] : 0;
const size_t lineWordCount = lineBreak - lastBreakAt;
fix: Hanging indent (negative text-indent) and em-unit sizing (#1229) ## Summary * **What is the goal of this PR?** Fixing two independent CSS rendering bugs combined to make hanging-indent list styles (e.g. margin-left:3em; text-indent:-1em) render incorrectly: * **What changes are included?** 1. Negative text-indent was silently ignored Three guards in ParsedText.cpp (computeLineBreaks, computeHyphenatedLineBreaks, extractLine) conditioned firstLineIndent on blockStyle.textIndent > 0, so any negative value collapsed to zero. Additionally, wordXpos was uint16_t, which cannot represent negative offsets — a cast of e.g. −18 would wrap to 65518 and render the word far off-screen. 2. extraParagraphSpacing suppressed hanging indents Even after removing the > 0 guard, the existing !extraParagraphSpacing condition would still suppress all text-indent when that setting is on (its default). Positive text-indent is a decorative paragraph indent that the user can reasonably replace with vertical spacing — negative text-indent is structural (it positions the list marker) and must always apply. 3. em unit was calibrated against line height, not font size emSize was computed as getLineHeight() * lineCompression (the full line advance). CSS em units are defined relative to the font-size, which corresponds to the ascender height — not the line height. Using line height makes every em-based margin/indent ~20–30% wider than a browser would render it, and is especially noticeable for CSS that uses font-size: small (which we do not implement). ## Additional Context Test case ``` .lsl1 { margin-left: 3em; text-indent: -1em; } <div class="lsl1">• First list item that wraps across lines</div> <div class="lsl1">• Short item</div> ``` Before: all lines of all items started at 3 em from the left edge (indent ignored). After: the bullet marker hangs at 2 em; continuation lines align at 3 em. <img width="240" alt="before" src="https://github.com/user-attachments/assets/9dcbf3e0-fcd9-4af8-b451-a90ba4d2fb75" /> <img width="240" alt="after" src="https://github.com/user-attachments/assets/1ffdcf56-a180-4267-9590-c60d7ac44707" /> --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES**_
2026-03-02 12:02:09 +01:00
// Calculate first line indent (only for left/justified text).
// Positive text-indent (paragraph indent) is suppressed when extraParagraphSpacing is on.
// Negative text-indent (hanging indent, e.g. margin-left:3em; text-indent:-1em) always applies —
// it is structural (positions the bullet/marker), not decorative.
feat: Add CSS parsing and CSS support in EPUBs (#411) ## Summary * **What is the goal of this PR?** - Adds basic CSS parsing to EPUBs and determine the CSS rules when rendering to the screen so that text is styled correctly. Currently supports bold, underline, italics, margin, padding, and text alignment ## Additional Context - My main reason for wanting this is that the book I'm currently reading, Carl's Doomsday Scenario (2nd in the Dungeon Crawler Carl series), relies _a lot_ on styled text for telling parts of the story. When text is bolded, it's supposed to be a message that's rendered "on-screen" in the story. When characters are "chatting" with each other, the text is bolded and their names are underlined. Plus, normal emphasis is provided with italicizing words here and there. So, this greatly improves my experience reading this book on the Xteink, and I figured it was useful enough for others too. - For transparency: I'm a software engineer, but I'm mostly frontend and TypeScript/JavaScript. It's been _years_ since I did any C/C++, so I would not be surprised if I'm doing something dumb along the way in this code. Please don't hesitate to ask for changes if something looks off. I heavily relied on Claude Code for help, and I had a lot of inspiration from how [microreader](https://github.com/CidVonHighwind/microreader) achieves their CSS parsing and styling. I did give this as good of a code review as I could and went through everything, and _it works on my machine_ 😄 ### Before ![IMG_6271](https://github.com/user-attachments/assets/dba7554d-efb6-4d13-88bc-8b83cd1fc615) ![IMG_6272](https://github.com/user-attachments/assets/61ba2de0-87c9-4f39-956f-013da4fe20a4) ### After ![IMG_6268](https://github.com/user-attachments/assets/ebe11796-cca9-4a46-b9c7-0709c7932818) ![IMG_6269](https://github.com/user-attachments/assets/e89c33dc-ff47-4bb7-855e-863fe44b3202) --- ### AI Usage Did you use AI tools to help write this code? **YES**, Claude Code
2026-02-05 05:28:10 -05:00
const bool isFirstLine = breakIndex == 0;
const int firstLineIndent =
fix: Hanging indent (negative text-indent) and em-unit sizing (#1229) ## Summary * **What is the goal of this PR?** Fixing two independent CSS rendering bugs combined to make hanging-indent list styles (e.g. margin-left:3em; text-indent:-1em) render incorrectly: * **What changes are included?** 1. Negative text-indent was silently ignored Three guards in ParsedText.cpp (computeLineBreaks, computeHyphenatedLineBreaks, extractLine) conditioned firstLineIndent on blockStyle.textIndent > 0, so any negative value collapsed to zero. Additionally, wordXpos was uint16_t, which cannot represent negative offsets — a cast of e.g. −18 would wrap to 65518 and render the word far off-screen. 2. extraParagraphSpacing suppressed hanging indents Even after removing the > 0 guard, the existing !extraParagraphSpacing condition would still suppress all text-indent when that setting is on (its default). Positive text-indent is a decorative paragraph indent that the user can reasonably replace with vertical spacing — negative text-indent is structural (it positions the list marker) and must always apply. 3. em unit was calibrated against line height, not font size emSize was computed as getLineHeight() * lineCompression (the full line advance). CSS em units are defined relative to the font-size, which corresponds to the ascender height — not the line height. Using line height makes every em-based margin/indent ~20–30% wider than a browser would render it, and is especially noticeable for CSS that uses font-size: small (which we do not implement). ## Additional Context Test case ``` .lsl1 { margin-left: 3em; text-indent: -1em; } <div class="lsl1">• First list item that wraps across lines</div> <div class="lsl1">• Short item</div> ``` Before: all lines of all items started at 3 em from the left edge (indent ignored). After: the bullet marker hangs at 2 em; continuation lines align at 3 em. <img width="240" alt="before" src="https://github.com/user-attachments/assets/9dcbf3e0-fcd9-4af8-b451-a90ba4d2fb75" /> <img width="240" alt="after" src="https://github.com/user-attachments/assets/1ffdcf56-a180-4267-9590-c60d7ac44707" /> --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES**_
2026-03-02 12:02:09 +01:00
isFirstLine && blockStyle.textIndentDefined && (blockStyle.textIndent < 0 || !extraParagraphSpacing) &&
feat: Add CSS parsing and CSS support in EPUBs (#411) ## Summary * **What is the goal of this PR?** - Adds basic CSS parsing to EPUBs and determine the CSS rules when rendering to the screen so that text is styled correctly. Currently supports bold, underline, italics, margin, padding, and text alignment ## Additional Context - My main reason for wanting this is that the book I'm currently reading, Carl's Doomsday Scenario (2nd in the Dungeon Crawler Carl series), relies _a lot_ on styled text for telling parts of the story. When text is bolded, it's supposed to be a message that's rendered "on-screen" in the story. When characters are "chatting" with each other, the text is bolded and their names are underlined. Plus, normal emphasis is provided with italicizing words here and there. So, this greatly improves my experience reading this book on the Xteink, and I figured it was useful enough for others too. - For transparency: I'm a software engineer, but I'm mostly frontend and TypeScript/JavaScript. It's been _years_ since I did any C/C++, so I would not be surprised if I'm doing something dumb along the way in this code. Please don't hesitate to ask for changes if something looks off. I heavily relied on Claude Code for help, and I had a lot of inspiration from how [microreader](https://github.com/CidVonHighwind/microreader) achieves their CSS parsing and styling. I did give this as good of a code review as I could and went through everything, and _it works on my machine_ 😄 ### Before ![IMG_6271](https://github.com/user-attachments/assets/dba7554d-efb6-4d13-88bc-8b83cd1fc615) ![IMG_6272](https://github.com/user-attachments/assets/61ba2de0-87c9-4f39-956f-013da4fe20a4) ### After ![IMG_6268](https://github.com/user-attachments/assets/ebe11796-cca9-4a46-b9c7-0709c7932818) ![IMG_6269](https://github.com/user-attachments/assets/e89c33dc-ff47-4bb7-855e-863fe44b3202) --- ### AI Usage Did you use AI tools to help write this code? **YES**, Claude Code
2026-02-05 05:28:10 -05:00
(blockStyle.alignment == CssTextAlign::Justify || blockStyle.alignment == CssTextAlign::Left)
? blockStyle.textIndent
: 0;
feat: Support for kerning and ligatures (#873) ## Summary **What is the goal of this PR?** Improved typesetting, including [kerning](https://en.wikipedia.org/wiki/Kerning) and [ligatures](https://en.wikipedia.org/wiki/Ligature_(writing)#Latin_alphabet). **What changes are included?** - The script to convert built-in fonts now adds kerning and ligature information to the generated font headers. - Epub page layout calculates proper kerning spaces and makes ligature substitutions according to the selected font. ![3U1B1808](https://github.com/user-attachments/assets/1accb16f-2f1a-41e5-adca-89f1f1348494) ![3U1B1810](https://github.com/user-attachments/assets/2f6bd007-490e-420f-b774-3380b4add7ea) ![3U1B1815](https://github.com/user-attachments/assets/1986bb77-2db0-46e2-a5d6-8315dae9eb19) ## Additional Context - I am not a typography expert. - The implementation has been reworked from the earlier version, so it is no longer necessary to omit Open Dyslexic, and kerning data now covers all fonts, styles, and codepoints for which we include bitmap data. - Claude Opus 4.6 helped with a lot of this. - There's an included test epub document with lots of kerning and ligature examples, shown in the photos. **_After some time to mature, I think this change is in decent shape to merge and get people testing._** After opening this PR I came across #660, which overlaps in adding ligature support. --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES, Claude Opus 4.6**_ --------- Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-24 02:31:43 -06:00
// Calculate total word width for this line, count actual word gaps,
// and accumulate total natural gap widths (including space kerning adjustments).
int lineWordWidthSum = 0;
size_t actualGapCount = 0;
feat: Support for kerning and ligatures (#873) ## Summary **What is the goal of this PR?** Improved typesetting, including [kerning](https://en.wikipedia.org/wiki/Kerning) and [ligatures](https://en.wikipedia.org/wiki/Ligature_(writing)#Latin_alphabet). **What changes are included?** - The script to convert built-in fonts now adds kerning and ligature information to the generated font headers. - Epub page layout calculates proper kerning spaces and makes ligature substitutions according to the selected font. ![3U1B1808](https://github.com/user-attachments/assets/1accb16f-2f1a-41e5-adca-89f1f1348494) ![3U1B1810](https://github.com/user-attachments/assets/2f6bd007-490e-420f-b774-3380b4add7ea) ![3U1B1815](https://github.com/user-attachments/assets/1986bb77-2db0-46e2-a5d6-8315dae9eb19) ## Additional Context - I am not a typography expert. - The implementation has been reworked from the earlier version, so it is no longer necessary to omit Open Dyslexic, and kerning data now covers all fonts, styles, and codepoints for which we include bitmap data. - Claude Opus 4.6 helped with a lot of this. - There's an included test epub document with lots of kerning and ligature examples, shown in the photos. **_After some time to mature, I think this change is in decent shape to merge and get people testing._** After opening this PR I came across #660, which overlaps in adding ligature support. --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES, Claude Opus 4.6**_ --------- Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-24 02:31:43 -06:00
int totalNaturalGaps = 0;
for (size_t wordIdx = 0; wordIdx < lineWordCount; wordIdx++) {
lineWordWidthSum += wordWidths[lastBreakAt + wordIdx];
// Count gaps: each word after the first creates a gap, unless it's a continuation
if (wordIdx > 0 && !continuesVec[lastBreakAt + wordIdx]) {
actualGapCount++;
feat: Support for kerning and ligatures (#873) ## Summary **What is the goal of this PR?** Improved typesetting, including [kerning](https://en.wikipedia.org/wiki/Kerning) and [ligatures](https://en.wikipedia.org/wiki/Ligature_(writing)#Latin_alphabet). **What changes are included?** - The script to convert built-in fonts now adds kerning and ligature information to the generated font headers. - Epub page layout calculates proper kerning spaces and makes ligature substitutions according to the selected font. ![3U1B1808](https://github.com/user-attachments/assets/1accb16f-2f1a-41e5-adca-89f1f1348494) ![3U1B1810](https://github.com/user-attachments/assets/2f6bd007-490e-420f-b774-3380b4add7ea) ![3U1B1815](https://github.com/user-attachments/assets/1986bb77-2db0-46e2-a5d6-8315dae9eb19) ## Additional Context - I am not a typography expert. - The implementation has been reworked from the earlier version, so it is no longer necessary to omit Open Dyslexic, and kerning data now covers all fonts, styles, and codepoints for which we include bitmap data. - Claude Opus 4.6 helped with a lot of this. - There's an included test epub document with lots of kerning and ligature examples, shown in the photos. **_After some time to mature, I think this change is in decent shape to merge and get people testing._** After opening this PR I came across #660, which overlaps in adding ligature support. --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES, Claude Opus 4.6**_ --------- Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-24 02:31:43 -06:00
int naturalGap = spaceWidth;
naturalGap += renderer.getSpaceKernAdjust(fontId, lastCodepoint(words[lastBreakAt + wordIdx - 1]),
firstCodepoint(words[lastBreakAt + wordIdx]),
wordStyles[lastBreakAt + wordIdx - 1]);
totalNaturalGaps += naturalGap;
} else if (wordIdx > 0 && continuesVec[lastBreakAt + wordIdx]) {
// Cross-boundary kerning for continuation words (e.g. nonbreaking spaces, attached punctuation)
totalNaturalGaps +=
renderer.getKerning(fontId, lastCodepoint(words[lastBreakAt + wordIdx - 1]),
firstCodepoint(words[lastBreakAt + wordIdx]), wordStyles[lastBreakAt + wordIdx - 1]);
}
}
feat: Add CSS parsing and CSS support in EPUBs (#411) ## Summary * **What is the goal of this PR?** - Adds basic CSS parsing to EPUBs and determine the CSS rules when rendering to the screen so that text is styled correctly. Currently supports bold, underline, italics, margin, padding, and text alignment ## Additional Context - My main reason for wanting this is that the book I'm currently reading, Carl's Doomsday Scenario (2nd in the Dungeon Crawler Carl series), relies _a lot_ on styled text for telling parts of the story. When text is bolded, it's supposed to be a message that's rendered "on-screen" in the story. When characters are "chatting" with each other, the text is bolded and their names are underlined. Plus, normal emphasis is provided with italicizing words here and there. So, this greatly improves my experience reading this book on the Xteink, and I figured it was useful enough for others too. - For transparency: I'm a software engineer, but I'm mostly frontend and TypeScript/JavaScript. It's been _years_ since I did any C/C++, so I would not be surprised if I'm doing something dumb along the way in this code. Please don't hesitate to ask for changes if something looks off. I heavily relied on Claude Code for help, and I had a lot of inspiration from how [microreader](https://github.com/CidVonHighwind/microreader) achieves their CSS parsing and styling. I did give this as good of a code review as I could and went through everything, and _it works on my machine_ 😄 ### Before ![IMG_6271](https://github.com/user-attachments/assets/dba7554d-efb6-4d13-88bc-8b83cd1fc615) ![IMG_6272](https://github.com/user-attachments/assets/61ba2de0-87c9-4f39-956f-013da4fe20a4) ### After ![IMG_6268](https://github.com/user-attachments/assets/ebe11796-cca9-4a46-b9c7-0709c7932818) ![IMG_6269](https://github.com/user-attachments/assets/e89c33dc-ff47-4bb7-855e-863fe44b3202) --- ### AI Usage Did you use AI tools to help write this code? **YES**, Claude Code
2026-02-05 05:28:10 -05:00
// Calculate spacing (account for indent reducing effective page width on first line)
const int effectivePageWidth = pageWidth - firstLineIndent;
const bool isLastLine = breakIndex == lineBreakIndices.size() - 1;
feat: Support for kerning and ligatures (#873) ## Summary **What is the goal of this PR?** Improved typesetting, including [kerning](https://en.wikipedia.org/wiki/Kerning) and [ligatures](https://en.wikipedia.org/wiki/Ligature_(writing)#Latin_alphabet). **What changes are included?** - The script to convert built-in fonts now adds kerning and ligature information to the generated font headers. - Epub page layout calculates proper kerning spaces and makes ligature substitutions according to the selected font. ![3U1B1808](https://github.com/user-attachments/assets/1accb16f-2f1a-41e5-adca-89f1f1348494) ![3U1B1810](https://github.com/user-attachments/assets/2f6bd007-490e-420f-b774-3380b4add7ea) ![3U1B1815](https://github.com/user-attachments/assets/1986bb77-2db0-46e2-a5d6-8315dae9eb19) ## Additional Context - I am not a typography expert. - The implementation has been reworked from the earlier version, so it is no longer necessary to omit Open Dyslexic, and kerning data now covers all fonts, styles, and codepoints for which we include bitmap data. - Claude Opus 4.6 helped with a lot of this. - There's an included test epub document with lots of kerning and ligature examples, shown in the photos. **_After some time to mature, I think this change is in decent shape to merge and get people testing._** After opening this PR I came across #660, which overlaps in adding ligature support. --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES, Claude Opus 4.6**_ --------- Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-24 02:31:43 -06:00
// For justified text, compute per-gap extra to distribute remaining space evenly
const int spareSpace = effectivePageWidth - lineWordWidthSum - totalNaturalGaps;
const int justifyExtra = (blockStyle.alignment == CssTextAlign::Justify && !isLastLine && actualGapCount >= 1)
? spareSpace / static_cast<int>(actualGapCount)
: 0;
fix: Hanging indent (negative text-indent) and em-unit sizing (#1229) ## Summary * **What is the goal of this PR?** Fixing two independent CSS rendering bugs combined to make hanging-indent list styles (e.g. margin-left:3em; text-indent:-1em) render incorrectly: * **What changes are included?** 1. Negative text-indent was silently ignored Three guards in ParsedText.cpp (computeLineBreaks, computeHyphenatedLineBreaks, extractLine) conditioned firstLineIndent on blockStyle.textIndent > 0, so any negative value collapsed to zero. Additionally, wordXpos was uint16_t, which cannot represent negative offsets — a cast of e.g. −18 would wrap to 65518 and render the word far off-screen. 2. extraParagraphSpacing suppressed hanging indents Even after removing the > 0 guard, the existing !extraParagraphSpacing condition would still suppress all text-indent when that setting is on (its default). Positive text-indent is a decorative paragraph indent that the user can reasonably replace with vertical spacing — negative text-indent is structural (it positions the list marker) and must always apply. 3. em unit was calibrated against line height, not font size emSize was computed as getLineHeight() * lineCompression (the full line advance). CSS em units are defined relative to the font-size, which corresponds to the ascender height — not the line height. Using line height makes every em-based margin/indent ~20–30% wider than a browser would render it, and is especially noticeable for CSS that uses font-size: small (which we do not implement). ## Additional Context Test case ``` .lsl1 { margin-left: 3em; text-indent: -1em; } <div class="lsl1">• First list item that wraps across lines</div> <div class="lsl1">• Short item</div> ``` Before: all lines of all items started at 3 em from the left edge (indent ignored). After: the bullet marker hangs at 2 em; continuation lines align at 3 em. <img width="240" alt="before" src="https://github.com/user-attachments/assets/9dcbf3e0-fcd9-4af8-b451-a90ba4d2fb75" /> <img width="240" alt="after" src="https://github.com/user-attachments/assets/1ffdcf56-a180-4267-9590-c60d7ac44707" /> --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES**_
2026-03-02 12:02:09 +01:00
// Calculate initial x position (first line starts at indent for left/justified text;
// may be negative for hanging indents, e.g. margin-left:3em; text-indent:-1em).
auto xpos = static_cast<int16_t>(firstLineIndent);
feat: Add CSS parsing and CSS support in EPUBs (#411) ## Summary * **What is the goal of this PR?** - Adds basic CSS parsing to EPUBs and determine the CSS rules when rendering to the screen so that text is styled correctly. Currently supports bold, underline, italics, margin, padding, and text alignment ## Additional Context - My main reason for wanting this is that the book I'm currently reading, Carl's Doomsday Scenario (2nd in the Dungeon Crawler Carl series), relies _a lot_ on styled text for telling parts of the story. When text is bolded, it's supposed to be a message that's rendered "on-screen" in the story. When characters are "chatting" with each other, the text is bolded and their names are underlined. Plus, normal emphasis is provided with italicizing words here and there. So, this greatly improves my experience reading this book on the Xteink, and I figured it was useful enough for others too. - For transparency: I'm a software engineer, but I'm mostly frontend and TypeScript/JavaScript. It's been _years_ since I did any C/C++, so I would not be surprised if I'm doing something dumb along the way in this code. Please don't hesitate to ask for changes if something looks off. I heavily relied on Claude Code for help, and I had a lot of inspiration from how [microreader](https://github.com/CidVonHighwind/microreader) achieves their CSS parsing and styling. I did give this as good of a code review as I could and went through everything, and _it works on my machine_ 😄 ### Before ![IMG_6271](https://github.com/user-attachments/assets/dba7554d-efb6-4d13-88bc-8b83cd1fc615) ![IMG_6272](https://github.com/user-attachments/assets/61ba2de0-87c9-4f39-956f-013da4fe20a4) ### After ![IMG_6268](https://github.com/user-attachments/assets/ebe11796-cca9-4a46-b9c7-0709c7932818) ![IMG_6269](https://github.com/user-attachments/assets/e89c33dc-ff47-4bb7-855e-863fe44b3202) --- ### AI Usage Did you use AI tools to help write this code? **YES**, Claude Code
2026-02-05 05:28:10 -05:00
if (blockStyle.alignment == CssTextAlign::Right) {
feat: Support for kerning and ligatures (#873) ## Summary **What is the goal of this PR?** Improved typesetting, including [kerning](https://en.wikipedia.org/wiki/Kerning) and [ligatures](https://en.wikipedia.org/wiki/Ligature_(writing)#Latin_alphabet). **What changes are included?** - The script to convert built-in fonts now adds kerning and ligature information to the generated font headers. - Epub page layout calculates proper kerning spaces and makes ligature substitutions according to the selected font. ![3U1B1808](https://github.com/user-attachments/assets/1accb16f-2f1a-41e5-adca-89f1f1348494) ![3U1B1810](https://github.com/user-attachments/assets/2f6bd007-490e-420f-b774-3380b4add7ea) ![3U1B1815](https://github.com/user-attachments/assets/1986bb77-2db0-46e2-a5d6-8315dae9eb19) ## Additional Context - I am not a typography expert. - The implementation has been reworked from the earlier version, so it is no longer necessary to omit Open Dyslexic, and kerning data now covers all fonts, styles, and codepoints for which we include bitmap data. - Claude Opus 4.6 helped with a lot of this. - There's an included test epub document with lots of kerning and ligature examples, shown in the photos. **_After some time to mature, I think this change is in decent shape to merge and get people testing._** After opening this PR I came across #660, which overlaps in adding ligature support. --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES, Claude Opus 4.6**_ --------- Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-24 02:31:43 -06:00
xpos = effectivePageWidth - lineWordWidthSum - totalNaturalGaps;
feat: Add CSS parsing and CSS support in EPUBs (#411) ## Summary * **What is the goal of this PR?** - Adds basic CSS parsing to EPUBs and determine the CSS rules when rendering to the screen so that text is styled correctly. Currently supports bold, underline, italics, margin, padding, and text alignment ## Additional Context - My main reason for wanting this is that the book I'm currently reading, Carl's Doomsday Scenario (2nd in the Dungeon Crawler Carl series), relies _a lot_ on styled text for telling parts of the story. When text is bolded, it's supposed to be a message that's rendered "on-screen" in the story. When characters are "chatting" with each other, the text is bolded and their names are underlined. Plus, normal emphasis is provided with italicizing words here and there. So, this greatly improves my experience reading this book on the Xteink, and I figured it was useful enough for others too. - For transparency: I'm a software engineer, but I'm mostly frontend and TypeScript/JavaScript. It's been _years_ since I did any C/C++, so I would not be surprised if I'm doing something dumb along the way in this code. Please don't hesitate to ask for changes if something looks off. I heavily relied on Claude Code for help, and I had a lot of inspiration from how [microreader](https://github.com/CidVonHighwind/microreader) achieves their CSS parsing and styling. I did give this as good of a code review as I could and went through everything, and _it works on my machine_ 😄 ### Before ![IMG_6271](https://github.com/user-attachments/assets/dba7554d-efb6-4d13-88bc-8b83cd1fc615) ![IMG_6272](https://github.com/user-attachments/assets/61ba2de0-87c9-4f39-956f-013da4fe20a4) ### After ![IMG_6268](https://github.com/user-attachments/assets/ebe11796-cca9-4a46-b9c7-0709c7932818) ![IMG_6269](https://github.com/user-attachments/assets/e89c33dc-ff47-4bb7-855e-863fe44b3202) --- ### AI Usage Did you use AI tools to help write this code? **YES**, Claude Code
2026-02-05 05:28:10 -05:00
} else if (blockStyle.alignment == CssTextAlign::Center) {
feat: Support for kerning and ligatures (#873) ## Summary **What is the goal of this PR?** Improved typesetting, including [kerning](https://en.wikipedia.org/wiki/Kerning) and [ligatures](https://en.wikipedia.org/wiki/Ligature_(writing)#Latin_alphabet). **What changes are included?** - The script to convert built-in fonts now adds kerning and ligature information to the generated font headers. - Epub page layout calculates proper kerning spaces and makes ligature substitutions according to the selected font. ![3U1B1808](https://github.com/user-attachments/assets/1accb16f-2f1a-41e5-adca-89f1f1348494) ![3U1B1810](https://github.com/user-attachments/assets/2f6bd007-490e-420f-b774-3380b4add7ea) ![3U1B1815](https://github.com/user-attachments/assets/1986bb77-2db0-46e2-a5d6-8315dae9eb19) ## Additional Context - I am not a typography expert. - The implementation has been reworked from the earlier version, so it is no longer necessary to omit Open Dyslexic, and kerning data now covers all fonts, styles, and codepoints for which we include bitmap data. - Claude Opus 4.6 helped with a lot of this. - There's an included test epub document with lots of kerning and ligature examples, shown in the photos. **_After some time to mature, I think this change is in decent shape to merge and get people testing._** After opening this PR I came across #660, which overlaps in adding ligature support. --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES, Claude Opus 4.6**_ --------- Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-24 02:31:43 -06:00
xpos = (effectivePageWidth - lineWordWidthSum - totalNaturalGaps) / 2;
}
// Pre-calculate X positions for words
// Continuation words attach to the previous word with no space before them
fix: Hanging indent (negative text-indent) and em-unit sizing (#1229) ## Summary * **What is the goal of this PR?** Fixing two independent CSS rendering bugs combined to make hanging-indent list styles (e.g. margin-left:3em; text-indent:-1em) render incorrectly: * **What changes are included?** 1. Negative text-indent was silently ignored Three guards in ParsedText.cpp (computeLineBreaks, computeHyphenatedLineBreaks, extractLine) conditioned firstLineIndent on blockStyle.textIndent > 0, so any negative value collapsed to zero. Additionally, wordXpos was uint16_t, which cannot represent negative offsets — a cast of e.g. −18 would wrap to 65518 and render the word far off-screen. 2. extraParagraphSpacing suppressed hanging indents Even after removing the > 0 guard, the existing !extraParagraphSpacing condition would still suppress all text-indent when that setting is on (its default). Positive text-indent is a decorative paragraph indent that the user can reasonably replace with vertical spacing — negative text-indent is structural (it positions the list marker) and must always apply. 3. em unit was calibrated against line height, not font size emSize was computed as getLineHeight() * lineCompression (the full line advance). CSS em units are defined relative to the font-size, which corresponds to the ascender height — not the line height. Using line height makes every em-based margin/indent ~20–30% wider than a browser would render it, and is especially noticeable for CSS that uses font-size: small (which we do not implement). ## Additional Context Test case ``` .lsl1 { margin-left: 3em; text-indent: -1em; } <div class="lsl1">• First list item that wraps across lines</div> <div class="lsl1">• Short item</div> ``` Before: all lines of all items started at 3 em from the left edge (indent ignored). After: the bullet marker hangs at 2 em; continuation lines align at 3 em. <img width="240" alt="before" src="https://github.com/user-attachments/assets/9dcbf3e0-fcd9-4af8-b451-a90ba4d2fb75" /> <img width="240" alt="after" src="https://github.com/user-attachments/assets/1ffdcf56-a180-4267-9590-c60d7ac44707" /> --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES**_
2026-03-02 12:02:09 +01:00
std::vector<int16_t> lineXPos;
perf: Replace std::list with std::vector in text layout (#1038) ## Summary _Revision to @blindbat's #802. Description comes from the original PR._ - Replace `std::list` with `std::vector` for word storage in `TextBlock` and `ParsedText` - Use index-based access (`words[i]`) instead of iterator advancement (`std::advance(it, n)`) - Remove the separate `continuesVec` copy that was built from `wordContinues` for O(1) access — now unnecessary since `std::vector<bool>` already provides O(1) indexing ## Why `std::list` allocates each node individually on the heap with 16 bytes of prev/next pointer overhead per node. For text layout with many small words, this means: - Scattered heap allocations instead of contiguous memory - Poor cache locality during iteration (each node can be anywhere in memory) - Per-node malloc/free overhead during construction and destruction `std::vector` stores elements contiguously, giving better cache performance during the tight rendering and layout loops. The `extractLine` function also benefits: list splice was O(1) but required maintaining three parallel iterators, while vector range construction with move iterators is simpler and still efficient for the small line-sized chunks involved. ## Files changed - `lib/Epub/Epub/blocks/TextBlock.h` / `.cpp` - `lib/Epub/Epub/ParsedText.h` / `.cpp` ## AI Usage YES ## Test plan - [ ] Open an EPUB with mixed formatting (bold, italic, underline) — verify text renders correctly - [ ] Open a book with justified text — verify word spacing is correct - [ ] Open a book with hyphenation enabled — verify words break correctly at hyphens - [ ] Navigate through pages rapidly — verify no rendering glitches or crashes - [ ] Open a book with long paragraphs — verify text layout matches pre-change behavior --------- Co-authored-by: Kuanysh Bekkulov <kbekkulov@gmail.com>
2026-02-21 22:28:56 -06:00
lineXPos.reserve(lineWordCount);
for (size_t wordIdx = 0; wordIdx < lineWordCount; wordIdx++) {
lineXPos.push_back(xpos);
const bool nextIsContinuation = wordIdx + 1 < lineWordCount && continuesVec[lastBreakAt + wordIdx + 1];
feat: Support for kerning and ligatures (#873) ## Summary **What is the goal of this PR?** Improved typesetting, including [kerning](https://en.wikipedia.org/wiki/Kerning) and [ligatures](https://en.wikipedia.org/wiki/Ligature_(writing)#Latin_alphabet). **What changes are included?** - The script to convert built-in fonts now adds kerning and ligature information to the generated font headers. - Epub page layout calculates proper kerning spaces and makes ligature substitutions according to the selected font. ![3U1B1808](https://github.com/user-attachments/assets/1accb16f-2f1a-41e5-adca-89f1f1348494) ![3U1B1810](https://github.com/user-attachments/assets/2f6bd007-490e-420f-b774-3380b4add7ea) ![3U1B1815](https://github.com/user-attachments/assets/1986bb77-2db0-46e2-a5d6-8315dae9eb19) ## Additional Context - I am not a typography expert. - The implementation has been reworked from the earlier version, so it is no longer necessary to omit Open Dyslexic, and kerning data now covers all fonts, styles, and codepoints for which we include bitmap data. - Claude Opus 4.6 helped with a lot of this. - There's an included test epub document with lots of kerning and ligature examples, shown in the photos. **_After some time to mature, I think this change is in decent shape to merge and get people testing._** After opening this PR I came across #660, which overlaps in adding ligature support. --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**YES, Claude Opus 4.6**_ --------- Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-24 02:31:43 -06:00
if (nextIsContinuation) {
int advance = wordWidths[lastBreakAt + wordIdx];
// Cross-boundary kerning for continuation words (e.g. nonbreaking spaces, attached punctuation)
advance +=
renderer.getKerning(fontId, lastCodepoint(words[lastBreakAt + wordIdx]),
firstCodepoint(words[lastBreakAt + wordIdx + 1]), wordStyles[lastBreakAt + wordIdx]);
xpos += advance;
} else {
int gap = spaceWidth;
if (wordIdx + 1 < lineWordCount) {
gap += renderer.getSpaceKernAdjust(fontId, lastCodepoint(words[lastBreakAt + wordIdx]),
firstCodepoint(words[lastBreakAt + wordIdx + 1]),
wordStyles[lastBreakAt + wordIdx]);
}
if (blockStyle.alignment == CssTextAlign::Justify && !isLastLine) {
gap += justifyExtra;
}
xpos += wordWidths[lastBreakAt + wordIdx] + gap;
}
}
perf: Replace std::list with std::vector in text layout (#1038) ## Summary _Revision to @blindbat's #802. Description comes from the original PR._ - Replace `std::list` with `std::vector` for word storage in `TextBlock` and `ParsedText` - Use index-based access (`words[i]`) instead of iterator advancement (`std::advance(it, n)`) - Remove the separate `continuesVec` copy that was built from `wordContinues` for O(1) access — now unnecessary since `std::vector<bool>` already provides O(1) indexing ## Why `std::list` allocates each node individually on the heap with 16 bytes of prev/next pointer overhead per node. For text layout with many small words, this means: - Scattered heap allocations instead of contiguous memory - Poor cache locality during iteration (each node can be anywhere in memory) - Per-node malloc/free overhead during construction and destruction `std::vector` stores elements contiguously, giving better cache performance during the tight rendering and layout loops. The `extractLine` function also benefits: list splice was O(1) but required maintaining three parallel iterators, while vector range construction with move iterators is simpler and still efficient for the small line-sized chunks involved. ## Files changed - `lib/Epub/Epub/blocks/TextBlock.h` / `.cpp` - `lib/Epub/Epub/ParsedText.h` / `.cpp` ## AI Usage YES ## Test plan - [ ] Open an EPUB with mixed formatting (bold, italic, underline) — verify text renders correctly - [ ] Open a book with justified text — verify word spacing is correct - [ ] Open a book with hyphenation enabled — verify words break correctly at hyphens - [ ] Navigate through pages rapidly — verify no rendering glitches or crashes - [ ] Open a book with long paragraphs — verify text layout matches pre-change behavior --------- Co-authored-by: Kuanysh Bekkulov <kbekkulov@gmail.com>
2026-02-21 22:28:56 -06:00
// Build line data by moving from the original vectors using index range
std::vector<std::string> lineWords(std::make_move_iterator(words.begin() + lastBreakAt),
std::make_move_iterator(words.begin() + lineBreak));
std::vector<EpdFontFamily::Style> lineWordStyles(wordStyles.begin() + lastBreakAt, wordStyles.begin() + lineBreak);
for (auto& word : lineWords) {
if (containsSoftHyphen(word)) {
stripSoftHyphensInPlace(word);
}
}
feat: Add CSS parsing and CSS support in EPUBs (#411) ## Summary * **What is the goal of this PR?** - Adds basic CSS parsing to EPUBs and determine the CSS rules when rendering to the screen so that text is styled correctly. Currently supports bold, underline, italics, margin, padding, and text alignment ## Additional Context - My main reason for wanting this is that the book I'm currently reading, Carl's Doomsday Scenario (2nd in the Dungeon Crawler Carl series), relies _a lot_ on styled text for telling parts of the story. When text is bolded, it's supposed to be a message that's rendered "on-screen" in the story. When characters are "chatting" with each other, the text is bolded and their names are underlined. Plus, normal emphasis is provided with italicizing words here and there. So, this greatly improves my experience reading this book on the Xteink, and I figured it was useful enough for others too. - For transparency: I'm a software engineer, but I'm mostly frontend and TypeScript/JavaScript. It's been _years_ since I did any C/C++, so I would not be surprised if I'm doing something dumb along the way in this code. Please don't hesitate to ask for changes if something looks off. I heavily relied on Claude Code for help, and I had a lot of inspiration from how [microreader](https://github.com/CidVonHighwind/microreader) achieves their CSS parsing and styling. I did give this as good of a code review as I could and went through everything, and _it works on my machine_ 😄 ### Before ![IMG_6271](https://github.com/user-attachments/assets/dba7554d-efb6-4d13-88bc-8b83cd1fc615) ![IMG_6272](https://github.com/user-attachments/assets/61ba2de0-87c9-4f39-956f-013da4fe20a4) ### After ![IMG_6268](https://github.com/user-attachments/assets/ebe11796-cca9-4a46-b9c7-0709c7932818) ![IMG_6269](https://github.com/user-attachments/assets/e89c33dc-ff47-4bb7-855e-863fe44b3202) --- ### AI Usage Did you use AI tools to help write this code? **YES**, Claude Code
2026-02-05 05:28:10 -05:00
processLine(
std::make_shared<TextBlock>(std::move(lineWords), std::move(lineXPos), std::move(lineWordStyles), blockStyle));
}