# Table Rendering Fixes: Entities and Colspan Support
## Task Description
Fix two issues with the newly implemented EPUB table rendering:
1. Stray `&nbsp;` entities appearing as literal text in table cells instead of whitespace
2. Cells with `colspan` attributes (e.g., section headers like "Anders Celsius", "Scientific career") rendering as narrow single-column cells instead of spanning the full table width
## Changes Made
### 1. `lib/Epub/Epub/parsers/ChapterHtmlSlimParser.cpp` — `flushPartWordBuffer()`
- Added detection and replacement of literal `&nbsp;` strings in the word buffer before flushing to `ParsedText`
- This handles double-encoded `&nbsp;` entities (written as `&amp;nbsp;` in the markup), which are common in Wikipedia and other generated EPUBs: XML parsing converts `&amp;` to `&`, leaving literal `&nbsp;` in the character data
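The replacement step can be sketched as follows. This is a minimal illustration, not the parser's actual code: it assumes the word buffer is a `std::string` and that a plain ASCII space is the desired substitute, and `replaceLiteralNbsp` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the codebase): replaces every literal
// "&nbsp;" in `buffer` with a single ASCII space and returns the number
// of replacements made.
std::size_t replaceLiteralNbsp(std::string& buffer) {
    static const std::string kEntity = "&nbsp;";
    std::size_t count = 0;
    std::size_t pos = 0;
    while ((pos = buffer.find(kEntity, pos)) != std::string::npos) {
        buffer.replace(pos, kEntity.size(), " ");
        pos += 1;  // resume the search just after the inserted space
        ++count;
    }
    // A space is not part of the entity, so a replacement can never
    // create a new match; the buffer is now free of literal entities.
    assert(buffer.find(kEntity) == std::string::npos);
    return count;
}
```

If the buffer is a fixed-size `char` array instead, the same scan applies, with the tail of the buffer shifted left by five bytes per match.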
### 2. `lib/Epub/Epub/TableData.h` — `TableCell` struct
- Added `int colspan = 1` field to store the HTML `colspan` attribute value
### 3. `lib/Epub/Epub/parsers/ChapterHtmlSlimParser.cpp` — `startElement()`
- Added parsing of the `colspan` attribute from `<td>` and `<th>` tags
- Stores the parsed value (minimum 1) in the `TableCell::colspan` field
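A minimal sketch of the attribute parsing, assuming the parser receives attributes as a null-terminated array of alternating name/value C strings (the convention used by Expat-style SAX callbacks). `parseColspan` and `colspanFromAttributes` are hypothetical names, and the upper clamp of 1000 is an assumption borrowed from the limit the HTML standard places on `colspan`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: converts a colspan attribute value to an int,
// falling back to 1 for missing, non-numeric, zero, or negative input.
int parseColspan(const char* value) {
    if (value == nullptr) return 1;
    char* end = nullptr;
    long parsed = std::strtol(value, &end, 10);
    if (end == value || parsed < 1) return 1;  // nothing parsed, or below minimum
    if (parsed > 1000) parsed = 1000;          // assumed upper clamp
    assert(parsed >= 1 && parsed <= 1000);
    return static_cast<int>(parsed);
}

// Hypothetical helper: scans name/value pairs for "colspan".
// `atts` is assumed to be {name0, value0, name1, value1, ..., nullptr}.
int colspanFromAttributes(const char* const* atts) {
    if (atts == nullptr) return 1;
    for (std::size_t i = 0; atts[i] != nullptr && atts[i + 1] != nullptr; i += 2) {
        if (std::strcmp(atts[i], "colspan") == 0) {
            return parseColspan(atts[i + 1]);
        }
    }
    return 1;  // attribute absent: the cell occupies one column
}
```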
### 4. `lib/Epub/Epub/parsers/ChapterHtmlSlimParser.cpp` — `processTable()`
- **Column count**: Changed from `max(row.cells.size())` to sum of `cell.colspan` per row, correctly determining logical column count
- **Natural width measurement**: Only non-spanning cells (colspan=1) contribute to per-column width calculations; spanning cells use the combined width of the columns they cover
- **Layout**: Added `spanContentWidth()` and `spanFullCellWidth()` lambdas to compute the combined content width and full cell width for cells spanning multiple columns
- **Cell mapping**: Each `PageTableCellData` now maps to an actual cell (not a logical column), with correct x-offset and combined column width for spanning cells
- **Fill logic**: Empty cells are appended only for unused logical columns after all actual cells are placed
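The column count, span width, and fill steps above can be sketched together as below. This is an illustration under stated assumptions rather than the real `processTable()` code: `Cell`, `Row`, and `Placement` are simplified stand-ins for `TableCell`, the row type, and the geometry stored in `PageTableCellData`; `gap` is an assumed fixed spacing between adjacent columns; and the single `spanWidth()` function collapses the distinction the actual `spanContentWidth()` and `spanFullCellWidth()` lambdas make.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Cell { int colspan; };              // stand-in for TableCell
struct Row { std::vector<Cell> cells; };   // stand-in for a table row
struct Placement { int x; int width; };    // stand-in for cell geometry

// Logical column count: the largest per-row sum of colspans.
int logicalColumnCount(const std::vector<Row>& rows) {
    int columns = 0;
    for (const Row& row : rows) {
        int sum = 0;
        for (const Cell& cell : row.cells) sum += std::max(1, cell.colspan);
        columns = std::max(columns, sum);
    }
    return columns;
}

// Combined width of `span` columns starting at `start`, including the
// gaps that lie between the spanned columns.
int spanWidth(const std::vector<int>& widths, std::size_t start,
              std::size_t span, int gap) {
    assert(span >= 1 && start + span <= widths.size());
    int total = 0;
    for (std::size_t i = start; i < start + span; ++i) total += widths[i];
    return total + gap * static_cast<int>(span - 1);
}

// Places each actual cell at its x-offset with its combined width, then
// appends empty placements for any unused logical columns.
std::vector<Placement> placeRow(const Row& row, const std::vector<int>& widths,
                                int gap) {
    std::vector<Placement> out;
    std::size_t col = 0;
    int x = 0;
    for (const Cell& cell : row.cells) {
        if (col >= widths.size()) break;  // row is wider than the table
        const std::size_t wanted =
            static_cast<std::size_t>(std::max(1, cell.colspan));
        const std::size_t span = std::min(wanted, widths.size() - col);
        const int width = spanWidth(widths, col, span, gap);
        out.push_back({x, width});
        x += width + gap;
        col += span;
    }
    for (; col < widths.size(); ++col) {  // fill the remaining columns
        out.push_back({x, widths[col]});
        x += widths[col] + gap;
    }
    return out;
}
```

With column widths `{40, 60, 50}` and a gap of 2, a row holding a single `colspan=3` cell yields one placement at x = 0 with width 154, while a row with two ordinary cells yields three placements, the last being an empty fill cell at x = 104.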
## Follow-up Items
- Rowspan support is not yet implemented (uncommon in typical EPUB content)
- The `&nbsp;` fix only handles the most common double-encoded entity; other double-encoded entities (e.g., `&mdash;`) could be handled similarly if needed