# Dictionary Feature Bug Fixes (Round 3)
**Date:** 2026-02-12
**Branch:** mod/add-dictionary
## Task
Fix three issues reported after round 2 of dictionary fixes.
## Changes Made
### 1. Fix: Definitions truncated for some words (Dictionary.cpp)
**Root cause:** The `asciiCaseCmp` case-insensitive match introduced in round 2 returns the *first* case variant found in the index. In StarDict order, "Professor" (capitalized) sorts before "professor" (lowercase). If the dictionary has separate entries for each — e.g., "Professor" as a title (short definition) and "professor" as the common noun (full multi-page definition) — the shorter entry is returned.
**Fix:** The linear scan in `searchIndex` now remembers the first case-insensitive match as a fallback, but continues scanning adjacent entries (case variants are always adjacent in StarDict order). If an exact case-sensitive match is found, it's used immediately. Otherwise, the first case-insensitive match is used. This ensures `cleanWord("professor")` → `"professor"` finds the full lowercase entry, not the shorter capitalized one.
**Files:** `src/util/Dictionary.cpp`
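A minimal sketch of the fallback logic described above, not the actual `searchIndex` code. It assumes the index is available as an in-memory list of headwords and scans it from the start for simplicity; `findBestMatch` and `asciiCaseCmpSketch` are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// ASCII-only lowercase; bytes outside 'A'..'Z' are returned unchanged.
static unsigned char asciiLower(unsigned char c) {
  return (c >= 'A' && c <= 'Z') ? static_cast<unsigned char>(c + 32) : c;
}

// Hypothetical stand-in for asciiCaseCmp: returns 0 when the two strings
// are equal ignoring ASCII case.
static int asciiCaseCmpSketch(const std::string& a, const std::string& b) {
  const std::size_t n = a.size() < b.size() ? a.size() : b.size();
  for (std::size_t i = 0; i < n; ++i) {
    const int ca = asciiLower(static_cast<unsigned char>(a[i]));
    const int cb = asciiLower(static_cast<unsigned char>(b[i]));
    if (ca != cb) return ca - cb;
  }
  if (a.size() == b.size()) return 0;
  return a.size() < b.size() ? -1 : 1;
}

// Returns the position of the best match in `words`, or -1 if none.
// Assumes case variants of a word sit next to each other (StarDict order).
static int findBestMatch(const std::vector<std::string>& words,
                         const std::string& target) {
  int fallback = -1;
  for (std::size_t i = 0; i < words.size(); ++i) {
    if (asciiCaseCmpSketch(words[i], target) == 0) {
      if (words[i] == target) return static_cast<int>(i);  // exact match wins
      if (fallback < 0) fallback = static_cast<int>(i);    // first variant seen
    } else if (fallback >= 0) {
      break;  // left the run of adjacent case variants
    }
  }
  return fallback;
}
```

With the headwords `"Professor"`, `"professor"`, a lookup for `"professor"` selects the second entry, while a lookup for `"PROFESSOR"` falls back to the first.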
### 2. Fix: Non-renderable foreign script characters in definitions (DictionaryDefinitionActivity)
**Root cause:** Dictionary definitions include text from other languages (Chinese, Greek, Arabic, Cyrillic, etc.) as etymological references or examples. These characters aren't in the e-ink bitmap font and render as empty boxes. This is the same class of issue as the IPA pronunciation fix from round 2, but affecting inline content within definitions.
**Fix:**
- Added `isRenderableCodepoint(uint32_t cp)` static helper that whitelists character ranges the e-ink font supports:
  - U+0000–U+024F: Basic Latin through Latin Extended-B (ASCII + accented chars)
  - U+0300–U+036F: Combining Diacritical Marks
  - U+2000–U+206F: General Punctuation (dashes, quotes, bullets, ellipsis)
  - U+20A0–U+20CF: Currency Symbols
  - U+2100–U+214F: Letterlike Symbols
  - U+2190–U+21FF: Arrows
- Replaced the byte-by-byte character append in `parseHtml()` with a UTF-8-aware decoder that reads multi-byte sequences, decodes the codepoint, and only appends renderable characters. Invalid or non-renderable characters are silently skipped.
**Files:** `src/activities/reader/DictionaryDefinitionActivity.h`, `src/activities/reader/DictionaryDefinitionActivity.cpp`
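The whitelist and the UTF-8-aware filtering can be sketched roughly as follows. The ranges in `isRenderableCodepoint` come from the list above; `filterRenderable` is a hypothetical standalone helper, not the actual `parseHtml()` code, and for brevity it does not reject overlong encodings.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Whitelist of ranges the summary says the e-ink font supports.
static bool isRenderableCodepoint(uint32_t cp) {
  if (cp <= 0x024F) return true;                  // Basic Latin .. Latin Extended-B
  if (cp >= 0x0300 && cp <= 0x036F) return true;  // Combining Diacritical Marks
  if (cp >= 0x2000 && cp <= 0x206F) return true;  // General Punctuation
  if (cp >= 0x20A0 && cp <= 0x20CF) return true;  // Currency Symbols
  if (cp >= 0x2100 && cp <= 0x214F) return true;  // Letterlike Symbols
  if (cp >= 0x2190 && cp <= 0x21FF) return true;  // Arrows
  return false;
}

// Hypothetical helper: copies only well-formed UTF-8 sequences whose
// codepoint passes the whitelist; everything else is silently skipped.
static std::string filterRenderable(const std::string& in) {
  std::string out;
  std::size_t i = 0;
  while (i < in.size()) {
    const unsigned char lead = static_cast<unsigned char>(in[i]);
    std::size_t len = 0;
    uint32_t cp = 0;
    if (lead < 0x80) { len = 1; cp = lead; }
    else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; }
    else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; }
    else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; }
    else { ++i; continue; }          // stray continuation or invalid lead byte
    if (i + len > in.size()) break;  // truncated sequence at end of input
    bool valid = true;
    for (std::size_t k = 1; k < len; ++k) {
      const unsigned char c = static_cast<unsigned char>(in[i + k]);
      if ((c & 0xC0) != 0x80) { valid = false; break; }
      cp = (cp << 6) | (c & 0x3F);
    }
    if (!valid) { ++i; continue; }   // drop the bad lead byte and resync
    if (isRenderableCodepoint(cp)) out.append(in, i, len);
    i += len;
  }
  return out;
}
```

Under these assumptions, accented Latin text and typographic punctuation pass through unchanged, while Greek, Cyrillic, or CJK sequences are dropped as whole characters rather than byte by byte.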
### 3. Fix: Revert to standard-height hints, keep overlap hiding (DictionaryWordSelectActivity)
**What changed:** Reverted from 22px thin custom hints back to the standard 40px theme-style buttons (rounded corners with `cornerRadius=6`, `SMALL_FONT_ID` text, matching `LyraTheme::drawButtonHints` exactly). The overlap detection is preserved.
**Key design choice:** Instead of calling `GUI.drawButtonHints()` (which always clears all 4 button areas, erasing page content even for hidden buttons), the method draws each button individually in portrait mode. Hidden buttons are skipped entirely (`continue`), so the page content and word highlight underneath remain visible. Non-hidden buttons get the full theme treatment: white fill + rounded rect border + centered text.
**Files:** `src/activities/reader/DictionaryWordSelectActivity.cpp`
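The per-button drawing loop might look roughly like the sketch below. The 40px height and corner radius of 6 come from the description above; everything else is an assumption for illustration: `SketchRenderer` is a hypothetical stand-in that only records draw calls, and `ButtonHint`, `drawHints`, the labels, and the 100px button width are not taken from the project.

```cpp
#include <array>
#include <string>
#include <vector>

// Hypothetical renderer stand-in: records calls instead of drawing.
struct SketchRenderer {
  std::vector<std::string> calls;
  void fillRect(int x, int, int, int) { calls.push_back("fill@" + std::to_string(x)); }
  void drawRoundedRect(int x, int, int, int, int) { calls.push_back("border@" + std::to_string(x)); }
  void drawCenteredText(int, int, int, int, const std::string& text) { calls.push_back("text:" + text); }
};

struct ButtonHint {
  std::string label;
  int x;        // left edge of the button area
  bool hidden;  // true when the hint would overlap the highlighted word
};

constexpr int kHintHeight = 40;   // standard hint height per the summary
constexpr int kCornerRadius = 6;  // cornerRadius=6 per the summary
constexpr int kHintWidth = 100;   // assumed width, for illustration only

// Draws each visible hint on its own; hidden hints are skipped entirely,
// so nothing underneath them is cleared.
static void drawHints(SketchRenderer& r, const std::array<ButtonHint, 4>& hints,
                      int screenHeight) {
  const int y = screenHeight - kHintHeight;
  for (const ButtonHint& h : hints) {
    if (h.hidden) continue;                                             // keep page content visible
    r.fillRect(h.x, y, kHintWidth, kHintHeight);                        // white fill
    r.drawRoundedRect(h.x, y, kHintWidth, kHintHeight, kCornerRadius);  // border
    r.drawCenteredText(h.x, y, kHintWidth, kHintHeight, h.label);       // label
  }
}
```

The point of the design is that the clear (fill) step happens per button, so a hidden button never triggers a fill over the word highlight beneath it.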
## Follow-up Items
- The `isRenderableCodepoint` whitelist is conservative — if the font gains additional glyph coverage (e.g., Greek letters for math), the whitelist can be extended
- Entity-decoded characters bypass the codepoint filter since they're appended as raw bytes; this is fine for the current entity set (all produce ASCII or General Punctuation characters)