# EPUB Table Rendering Implementation
## Task
Replace the `[Table omitted]` placeholder in the EPUB reader with full column-aligned table rendering, including grid lines, proportional column widths, and proper serialization.
## Changes Made
### New file
- **`lib/Epub/Epub/TableData.h`** -- Lightweight structs (`TableCell`, `TableRow`, `TableData`) for buffering table content during SAX parsing.
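
As a rough illustration of the buffering structs named above, here is a minimal sketch. The field names, the use of `std::string` in place of a `ParsedText`, and the `columnCount()` helper are all assumptions made for illustration; the actual `TableData.h` may look different.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical layout: the real structs likely hold ParsedText objects
// rather than plain strings, and their field names may differ.
struct TableCell {
  std::string text;       // buffered cell content (stand-in for a ParsedText)
  bool isHeader = false;  // true for <th>, which is rendered bold
};

struct TableRow {
  std::vector<TableCell> cells;
};

struct TableData {
  std::vector<TableRow> rows;

  // Column count is taken from the widest row, since rows may
  // contain differing numbers of cells. (Illustrative helper.)
  std::size_t columnCount() const {
    std::size_t n = 0;
    for (const auto& row : rows) {
      if (row.cells.size() > n) n = row.cells.size();
    }
    return n;
  }
};
```

Buffering rows this way lets the parser defer all measurement until the closing `</table>` tag, when every cell is known.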
### Modified files
- **`lib/Epub/Epub/ParsedText.h` / `.cpp`**
  - Added `getNaturalWidth()` public method to measure the single-line content width of a ParsedText. Used by column width calculation.
- **`lib/Epub/Epub/Page.h` / `.cpp`**
  - Added `TAG_PageTableRow = 2` to `PageElementTag` enum.
  - Added `getTag()` pure virtual method to `PageElement` base class for tag-based serialization.
  - Added `PageTableCellData` struct (cell lines, column width, x-offset).
  - Added `PageTableRow` class with render (grid lines + cell text), serialize, and deserialize support.
  - Updated `Page::serialize()` to use `el->getTag()` instead of hardcoded tag.
  - Updated `Page::deserialize()` to handle `TAG_PageTableRow`.
- **`lib/Epub/Epub/parsers/ChapterHtmlSlimParser.h`**
  - Added `#include "../TableData.h"`.
  - Added table state fields: `bool inTable`, `std::unique_ptr<TableData> tableData`.
  - Added `processTable()` and `addTableRowToPage()` method declarations.
- **`lib/Epub/Epub/parsers/ChapterHtmlSlimParser.cpp`**
  - Added table-related tag arrays (`TABLE_TRANSPARENT_TAGS`, `TABLE_SKIP_TAGS`).
  - Replaced `[Table omitted]` placeholder with full table buffering logic in `startElement`.
  - Modified `startNewTextBlock` to be a no-op when inside a table (cell content stays in one ParsedText).
  - Added table close handling in `endElement` for `</td>`, `</th>`, and `</table>`.
  - Disabled the 750-word early split when inside a table.
  - Implemented `processTable()`: column width calculation (natural + proportional distribution), per-cell layout via `layoutAndExtractLines`, `PageTableRow` creation.
  - Implemented `addTableRowToPage()`: page-break handling for table rows.
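
The column width step in `processTable()` could look roughly like the sketch below. The function name `computeColumnWidths`, its signature, and the left-to-right handling of rounding leftovers are assumptions for illustration. It also ignores cell padding and grid line thickness, which the real code presumably accounts for.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: distributes `available` pixels across columns given
// each column's natural (single-line) width. Assumes non-negative inputs.
std::vector<int> computeColumnWidths(const std::vector<int>& natural, int available) {
  const std::size_t n = natural.size();
  std::vector<int> widths(n, 0);
  if (n == 0 || available <= 0) return widths;

  const long long total = std::accumulate(natural.begin(), natural.end(), 0LL);
  int used = 0;
  if (total <= available) {
    // Content fits: natural width plus an equal share of the spare space.
    const int extraEach = static_cast<int>((available - total) / static_cast<long long>(n));
    for (std::size_t i = 0; i < n; ++i) {
      widths[i] = natural[i] + extraEach;
      used += widths[i];
    }
  } else {
    // Content overflows: scale each column in proportion to its natural width.
    for (std::size_t i = 0; i < n; ++i) {
      widths[i] = static_cast<int>(static_cast<long long>(available) * natural[i] / total);
      used += widths[i];
    }
  }
  // Integer division can leave a few pixels unassigned; hand them out
  // one at a time, left to right, so the widths sum to `available`.
  for (std::size_t i = 0; used < available; i = (i + 1) % n) {
    ++widths[i];
    ++used;
  }
  return widths;
}
```

For example, natural widths of 100 and 50 in a 300px area would come out as 175 and 125 under this sketch, while 300 and 100 in a 200px area would shrink to 150 and 50.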
## Design Decisions
- Tables are buffered entirely during parsing, then processed on `</table>` close (two-pass: measure then layout).
- When the natural content widths fit the available width, each column keeps its natural width and the extra space is distributed equally; otherwise column widths are scaled in proportion to natural content width.
- Grid lines (1px) are drawn around every cell, with 2px of horizontal cell padding.
- Nested tables are skipped (v1 limitation).
- `<caption>`, `<colgroup>`, `<col>` are skipped; `<thead>`, `<tbody>`, `<tfoot>` are transparent.
- `<th>` cells get bold text. Cell text is left-aligned with no paragraph indent.
- Serialization is backward-compatible: old firmware encountering the new tag will re-parse the section.
## Follow-up Items
- Nested table support (currently skipped)
- `colspan` / `rowspan` support
- `<caption>` rendering as centered text above the table
- CSS border detection (currently always draws grid lines)
- Consider CSS-based cell alignment