# Table Rendering Fixes: `&nbsp;` Entities and Colspan Support

## Task Description

Fix two issues with the newly implemented EPUB table rendering:

1. Stray `&nbsp;` entities appearing as literal text in table cells instead of whitespace
2. Cells with `colspan` attributes (e.g., section headers like "Anders Celsius", "Scientific career") rendering as narrow single-column cells instead of spanning the full table width

## Changes Made

### 1. `lib/Epub/Epub/parsers/ChapterHtmlSlimParser.cpp`: `flushPartWordBuffer()`

- Added detection and replacement of literal `&nbsp;` strings in the word buffer before flushing to `ParsedText`
- This handles double-encoded `&amp;nbsp;` entities, common in Wikipedia and other generated EPUBs, where XML parsing converts `&amp;` to `&`, leaving a literal `&nbsp;` in the character data

### 2. `lib/Epub/Epub/TableData.h`: `TableCell` struct

- Added an `int colspan = 1` field to store the HTML `colspan` attribute value

### 3. `lib/Epub/Epub/parsers/ChapterHtmlSlimParser.cpp`: `startElement()`

- Added parsing of the `colspan` attribute from `<td>` and `<th>` tags
- Stores the parsed value (minimum 1) in the `TableCell::colspan` field

### 4. `lib/Epub/Epub/parsers/ChapterHtmlSlimParser.cpp`: `processTable()`

- **Column count**: Changed from `max(row.cells.size())` to the maximum over rows of the summed `cell.colspan`, which gives the correct logical column count
- **Natural width measurement**: Only non-spanning cells (`colspan` = 1) contribute to per-column width calculations; spanning cells use the combined width of the columns they cover
- **Layout**: Added `spanContentWidth()` and `spanFullCellWidth()` lambdas to compute the combined content width and full cell width for cells spanning multiple columns
- **Cell mapping**: Each `PageTableCellData` now maps to an actual cell (not a logical column), with the correct x-offset and combined column width for spanning cells
- **Fill logic**: Empty cells are appended only for unused logical columns after all actual cells are placed

## Follow-up Items

- Rowspan support is not yet implemented (uncommon in typical EPUB content)
- The `&nbsp;` fix only handles the most common double-encoded entity; other double-encoded entities (e.g., `&mdash;`) could be handled similarly if needed
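
The entity replacement, colspan clamping, and logical column count described under Changes Made can be sketched roughly as follows. This is a minimal standalone illustration, not the code in `ChapterHtmlSlimParser.cpp`. The names `TableRow`, `replaceLiteralNbsp()`, `parseColspan()`, `logicalColumnCount()`, and `spanWidth()` are hypothetical, and the `TableCell` here only mirrors the `colspan` field. Replacing the entity with a plain space is an assumption, and the real `spanContentWidth()` / `spanFullCellWidth()` lambdas may also account for padding or borders, which this sketch ignores.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical stand-ins for the structs in TableData.h.
struct TableCell {
  std::string text;
  int colspan = 1;  // number of logical columns this cell covers
};

struct TableRow {
  std::vector<TableCell> cells;
};

// Replace each literal "&nbsp;" left over from double-encoding with a
// plain space (assumption: a regular space is the desired whitespace).
inline void replaceLiteralNbsp(std::string& buf) {
  static const std::string kEntity = "&nbsp;";
  std::size_t pos = 0;
  while ((pos = buf.find(kEntity, pos)) != std::string::npos) {
    buf.replace(pos, kEntity.size(), " ");
    pos += 1;  // continue searching after the inserted space
  }
}

// Parse a colspan attribute value, falling back to 1 when the value is
// missing, non-numeric, zero, or negative.
inline int parseColspan(const char* value) {
  if (value == nullptr) return 1;
  return std::max(1, std::atoi(value));
}

// Logical column count: the largest per-row sum of colspan values.
inline int logicalColumnCount(const std::vector<TableRow>& rows) {
  int maxCols = 0;
  for (const auto& row : rows) {
    int cols = 0;
    for (const auto& cell : row.cells) {
      assert(cell.colspan >= 1);
      cols += cell.colspan;
    }
    maxCols = std::max(maxCols, cols);
  }
  return maxCols;
}

// Combined width of the columns covered by a spanning cell, clipped to
// the columns that actually exist. Padding and borders are ignored.
inline int spanWidth(const std::vector<int>& colWidths,
                     std::size_t startCol, int colspan) {
  const std::size_t span = static_cast<std::size_t>(std::max(1, colspan));
  const std::size_t end = std::min(colWidths.size(), startCol + span);
  int total = 0;
  for (std::size_t c = startCol; c < end; ++c) total += colWidths[c];
  return total;
}
```

With a two-column table whose first row is a single header cell with `colspan="2"`, `logicalColumnCount()` reports two columns where counting cells per row would see only one in the header row, and `spanWidth()` gives that header the width of both columns.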