epub parsing improvement ideas
parent 59f493d293
commit 67494a7c90
@ -0,0 +1,537 @@
# EPUB Reader Architectural Decisions

**Date:** 2026-01-23 19:47:23
**Status:** Active
**Based on:** EPUB 3.3 Compliance Audit

---

## Purpose

This document captures architectural decisions about which EPUB 3.3 features to implement, intentionally omit, or defer. Each decision includes rationale based on:

- Hardware constraints (ESP32-C3, 800x480 4-level grayscale e-ink)
- Memory limitations (~400KB SRAM, no PSRAM)
- User experience goals
- Implementation complexity

---

## ADR-001: Inline Image Support

**Status:** RECOMMENDED FOR IMPLEMENTATION

### Context

EPUBs frequently contain images for:
- Cover art
- Chapter illustrations
- Diagrams and figures
- Decorative elements

The current implementation displays an `[Image: alt_text]` placeholder in place of each image.

### Decision

**Implement inline image rendering** using existing infrastructure.

### Rationale

1. **Infrastructure Exists:**
   - `Bitmap` class handles BMP parsing with grayscale conversion and dithering
   - `JpegToBmpConverter` converts JPEG to BMP (most common EPUB image format)
   - `GfxRenderer::drawBitmap()` already renders bitmaps to e-ink
   - `ZipFile` can extract files from EPUB archive
   - Home screen cover rendering demonstrates the pattern works

2. **Memory Management Pattern:**
   - Convert and cache images to SD card (like thumbnail generation)
   - Load one image at a time during page render
   - Use streaming conversion to minimize RAM usage

3. **High User Impact:**
   - Many EPUBs contain important visual content
   - Technical books rely on diagrams
   - Children's books heavily use illustrations

### Implementation Architecture

```
┌────────────────────────────────────────────────────────────────┐
│                   Image Processing Pipeline                    │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  EPUB ZIP ──► Extract Image ──► Convert to BMP ──► Cache to SD │
│      │                              │                          │
│      │                              ├─ JPEG: JpegToBmpConverter│
│      │                              └─ BMP: Direct copy        │
│      │                                                         │
│      └─► During page render:                                   │
│          Load cached BMP ──► drawBitmap()                      │
│                                                                │
└────────────────────────────────────────────────────────────────┘
```

### Page Element Structure

```cpp
// New PageImage element (alongside PageLine)
class PageImage final : public PageElement {
  std::string cachedBmpPath;  // Path to converted BMP on SD
  int16_t width;
  int16_t height;

 public:
  void render(GfxRenderer& renderer, int fontId, int xOffset, int yOffset) override;
  bool serialize(FsFile& file) override;
  static std::unique_ptr<PageImage> deserialize(FsFile& file);
};
```

### Constraints

- **No PNG support** initially (would require adding `pngle` library)
- **Maximum image size:** Scale to viewport width, max 800x480
- **Memory budget:** ~10KB row buffer during conversion
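
The scaling constraint can be illustrated with a small helper. This is a minimal sketch rather than the project's code: `ImageSize` and `fitToViewport` are hypothetical names, and it assumes images are only ever scaled down, with aspect ratio preserved, to fit an 800x480 viewport.

```cpp
#include <cstdint>

// Hypothetical helper type; not part of the existing codebase.
struct ImageSize {
  int32_t width;
  int32_t height;
};

// Scale (srcW, srcH) down to fit inside (maxW, maxH), preserving aspect ratio.
// Images that already fit are returned unchanged (no upscaling).
// Integer math only; the ESP32-C3 has no hardware FPU.
inline ImageSize fitToViewport(int32_t srcW, int32_t srcH,
                               int32_t maxW = 800, int32_t maxH = 480) {
  if (srcW <= 0 || srcH <= 0) return {0, 0};
  if (srcW <= maxW && srcH <= maxH) return {srcW, srcH};
  // Find the tighter constraint by cross-multiplying:
  // srcW / maxW > srcH / maxH  <=>  srcW * maxH > srcH * maxW
  if (static_cast<int64_t>(srcW) * maxH > static_cast<int64_t>(srcH) * maxW) {
    const int32_t h = static_cast<int32_t>(static_cast<int64_t>(srcH) * maxW / srcW);
    return {maxW, h > 0 ? h : 1};  // width-limited
  }
  const int32_t w = static_cast<int32_t>(static_cast<int64_t>(srcW) * maxH / srcH);
  return {w > 0 ? w : 1, maxH};  // height-limited
}
```

For example, a 1600x1200 source is height-limited and comes out as 640x480, while a 1600x600 source is width-limited and comes out as 800x300. The default viewport values would need swapping if the reader lays pages out in portrait orientation.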

---

## ADR-002: JavaScript/Scripting Support

**Status:** INTENTIONALLY OMITTED

### Context

EPUB 3.3 allows JavaScript in content documents for interactive features.

### Decision

**Do not implement JavaScript execution.**

### Rationale

1. **Security Risk:**
   - Untrusted code execution on embedded device
   - No sandboxing infrastructure
   - Potential for malicious EPUBs

2. **Hardware Limitations:**
   - E-ink display unsuitable for interactive content
   - Limited RAM for JavaScript engine
   - No benefit for static reading experience

3. **Minimal EPUB Use:**
   - Most EPUBs don't use JavaScript
   - Interactive textbooks target tablets, not e-readers

4. **Implementation Complexity:**
   - Would require embedding V8/Duktape/QuickJS
   - DOM manipulation engine
   - Event handling system

### Alternative

`<script>` elements are silently ignored. Content remains readable.
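
One way a streaming parser can ignore `<script>` content is a skip-depth counter. The sketch below is illustrative only: `ScriptSkipState` and the three callbacks are hypothetical names in the style of a SAX parser, not the actual `ChapterHtmlSlimParser` interface.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical parser state; the real parser is not shown in this document.
struct ScriptSkipState {
  int skipDepth = 0;  // > 0 while inside a <script> element
  std::string text;   // text that would be handed on to layout
};

// Called for every opening tag.
inline void onStartElement(ScriptSkipState& s, const char* name) {
  if (s.skipDepth > 0 || std::strcmp(name, "script") == 0) {
    ++s.skipDepth;  // track nesting so the matching end tag re-enables output
  }
}

// Called for every closing tag.
inline void onEndElement(ScriptSkipState& s, const char* /*name*/) {
  if (s.skipDepth > 0) --s.skipDepth;
}

// Called for character data between tags.
inline void onCharacterData(ScriptSkipState& s, const char* data, int len) {
  if (s.skipDepth > 0) return;  // drop script source silently
  s.text.append(data, static_cast<std::size_t>(len));
}
```

Counting depth instead of using a boolean keeps the state correct even if markup is nested inside the skipped element.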

---

## ADR-003: Fixed Layout (FXL) Support

**Status:** INTENTIONALLY OMITTED

### Context

EPUB Fixed Layout provides pixel-precise page positioning for:
- Comic books
- Children's picture books
- Magazines
- Technical drawings with precise layout

### Decision

**Do not implement Fixed Layout support.**

### Rationale

1. **Display Mismatch:**
   - FXL designed for high-resolution color tablets
   - 800x480 grayscale e-ink would require heavy downscaling
   - Visual quality would be poor

2. **User Experience:**
   - FXL EPUBs expect pan/zoom interaction
   - E-ink refresh rate makes this impractical
   - Text would be too small to read without zoom

3. **Implementation Complexity:**
   - Requires full CSS positioning engine
   - Viewport meta tag handling
   - Coordinate transformation system

### Alternative

FXL EPUBs will open but may display incorrectly. Users should use reflowable EPUBs on this device.

---

## ADR-004: Audio/Video and Media Overlays

**Status:** HARDWARE LIMITED - CANNOT IMPLEMENT

### Context

EPUB 3.3 supports:
- `<audio>` and `<video>` elements
- Media Overlays (SMIL synchronization)
- Text-to-speech hints

### Decision

**Cannot implement due to hardware constraints.**

### Rationale

1. **No Audio Hardware:**
   - Device has no speaker or audio DAC
   - No audio output jack

2. **No Video Capability:**
   - E-ink refresh rate (~1 Hz) incompatible with video
   - No video decoding hardware

### Alternative

- Audio/video elements are ignored
- Alt text or fallback content displayed if available
- Media Overlays not processed

---

## ADR-005: Color CSS Properties

**Status:** HARDWARE LIMITED - SIMPLIFIED HANDLING

### Context

EPUBs use CSS colors for:
- Text color (`color`)
- Background color (`background-color`)
- Border colors

### Decision

**Ignore color properties; display in grayscale.**

### Rationale

1. **Hardware Constraint:**
   - Display is 4-level grayscale only
   - Cannot render colors

2. **Acceptable Degradation:**
   - Text remains readable in black
   - Background remains white
   - Colored images still render, converted to shades of gray

### Implementation

Color CSS properties are parsed but not applied; text renders in the default black on a white background.
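
Wherever color content does reach the display (images, for instance), it has to be reduced to one of four gray levels. The following is a minimal sketch of such a mapping, assuming integer Rec. 601 luma weights and plain thresholding without dithering; `rgbToGray4` is a hypothetical name, and the existing `Bitmap` class may do this differently.

```cpp
#include <cstdint>

// Map an 8-bit RGB color to one of four gray levels (0 = black ... 3 = white).
// Uses integer Rec. 601 luma weights (77, 150, 29 out of 256); no dithering.
inline uint8_t rgbToGray4(uint8_t r, uint8_t g, uint8_t b) {
  const uint32_t luma = (77u * r + 150u * g + 29u * b) >> 8;  // 0..255
  return static_cast<uint8_t>(luma >> 6);                     // 0..3
}
```

With these weights, pure green lands on a lighter level than pure red, and pure blue falls to black, which matches how bright each primary appears to the eye.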

---

## ADR-006: Table Rendering

**Status:** DEFERRED - OPTIONAL IMPLEMENTATION

### Context

Tables appear in:
- Technical documentation
- Reference material
- Data presentations

The current implementation shows a `[Table omitted]` placeholder.

### Decision

**Implement simple text-based table rendering as an optional enhancement.**

### Rationale

1. **Moderate Impact:**
   - Some EPUBs use tables, but not the majority
   - Technical users would benefit most

2. **Complexity vs. Benefit:**
   - Full table layout is complex (colspan, rowspan, sizing)
   - Simple tables can be rendered as text columns

### Implementation Approach (if implemented)

```
┌──────────────────────────────────────┐
│      Text-Based Table Rendering      │
├──────────────────────────────────────┤
│                                      │
│ Header1    │ Header2    │ Header3    │
│ ─────────────────────────────────────│
│ Data 1     │ Data 2     │ Data 3     │
│ Data 4     │ Data 5     │ Data 6     │
│                                      │
└──────────────────────────────────────┘
```

### Constraints

- Equal-width columns (no complex sizing)
- No colspan/rowspan support
- Truncate wide content
- Maximum 4-5 columns before overflow
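
The constraints above (equal-width columns, truncation, no spans) can be sketched in a few lines. This is an illustration under simplifying assumptions, not a proposed implementation: `fitCell` and `renderRow` are hypothetical names, widths are counted in characters, and cells are assumed to be ASCII. Real code would have to measure glyph widths and count UTF-8 code points.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Truncate or pad a cell to exactly `width` characters (ASCII assumed).
inline std::string fitCell(const std::string& cell, std::size_t width) {
  if (cell.size() > width) return cell.substr(0, width);
  return cell + std::string(width - cell.size(), ' ');
}

// Render one table row as text with equal-width columns separated by " | ".
inline std::string renderRow(const std::vector<std::string>& cells, std::size_t lineWidth) {
  if (cells.empty()) return "";
  const std::size_t sepWidth = 3 * (cells.size() - 1);
  if (lineWidth <= sepWidth) return "";  // no room for any cell text
  const std::size_t colWidth = (lineWidth - sepWidth) / cells.size();
  std::string out;
  for (std::size_t i = 0; i < cells.size(); ++i) {
    if (i > 0) out += " | ";
    out += fitCell(cells[i], colWidth);
  }
  return out;
}
```

Because the column width shrinks as columns are added, the practical limit of four to five columns falls out of the arithmetic: on a narrow line, each additional column leaves fewer characters per cell before truncation sets in.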

---

## ADR-007: SVG and MathML

**Status:** INTENTIONALLY OMITTED

### Context

- SVG: Scalable Vector Graphics for illustrations
- MathML: Mathematical notation markup

### Decision

**Do not implement SVG or MathML rendering.**

### Rationale

1. **Implementation Complexity:**
   - SVG requires full vector graphics engine
   - MathML requires specialized math typesetting

2. **Limited Use:**
   - Most EPUBs use raster images, not SVG
   - MathML primarily in academic texts

3. **Alternative Exists:**
   - Many EPUBs include a fallback PNG for SVG (displaying it depends on the PNG support deferred in ADR-001)
   - MathML often has image fallback

### Alternative

Display alt text or fallback image if available.

---

## ADR-008: CSS Selector Support

**Status:** CURRENT IMPLEMENTATION SUFFICIENT

### Context

CSS selectors enable targeting elements for styling. Current support:
- Element selectors (`p`, `div`)
- Class selectors (`.classname`)
- Element.class selectors (`p.intro`)

### Decision

**Maintain current limited selector support; do not expand.**

### Rationale

1. **Sufficient for Most EPUBs:**
   - Simple selectors cover the large majority of EPUB styling (estimated at 90%+)
   - Complex selectors rarely affect core readability

2. **Implementation Complexity:**
   - Descendant selectors require DOM tree
   - Pseudo-selectors need state tracking
   - Specificity calculation is complex

3. **Memory Constraints:**
   - DOM tree would consume significant RAM
   - Current streaming parser is memory-efficient

### Not Implemented

- Descendant selectors (`div p`)
- Child selectors (`ul > li`)
- Sibling selectors (`h1 + p`)
- Pseudo-classes (`:first-child`, `:hover`)
- Pseudo-elements (`::before`, `::after`)
- Attribute selectors (`[type="text"]`)
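
The three supported selector forms can be matched without a DOM tree, since each needs only the current element's tag and class. The sketch below is a hedged illustration: `matchesSimpleSelector` and `isUnsupportedSelector` are hypothetical names, selectors are assumed to be trimmed of surrounding whitespace, and each element is assumed to carry a single class name (real `class` attributes may list several).

```cpp
#include <cstddef>
#include <string>

// Returns true if `selector` (one of "p", ".intro", "p.intro") matches an
// element with the given tag name and single class name.
inline bool matchesSimpleSelector(const std::string& selector,
                                  const std::string& tag,
                                  const std::string& cls) {
  const std::size_t dot = selector.find('.');
  if (dot == std::string::npos) return selector == tag;  // "p"
  const std::string selTag = selector.substr(0, dot);
  const std::string selClass = selector.substr(dot + 1);
  if (selClass.empty() || selClass != cls) return false;
  return selTag.empty() || selTag == tag;  // ".intro" or "p.intro"
}

// True for selectors outside the supported set (combinators, pseudo-classes,
// pseudo-elements, attribute selectors); such rules can be skipped at parse time.
inline bool isUnsupportedSelector(const std::string& selector) {
  return selector.find_first_of(" >+~:[") != std::string::npos;
}
```

Rejecting unsupported selectors up front means a rule such as `div p` is dropped entirely instead of being misread as a rule for `div` or `p`.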

---

## ADR-009: Internal Link Navigation

**Status:** RECOMMENDED FOR IMPLEMENTATION (PHASE 3)

### Context

EPUBs use internal links for:
- Footnotes
- Cross-references
- Table of contents
- Index entries

### Decision

**Implement internal link navigation in Phase 3, after image support.**

### Rationale

1. **User Value:**
   - Footnotes are common in non-fiction
   - Reference navigation improves usability

2. **Complexity:**
   - Requires anchor parsing and storage
   - Needs selection UI for link activation
   - Cross-chapter navigation adds complexity

### Implementation Architecture

```
Link Navigation Flow:
1. Parse <a href="#id"> during HTML parsing
2. Store link targets in PageLine metadata
3. Add link highlighting (underline or marker)
4. User selects link via UI
5. Resolve target: same chapter (anchor) or cross-chapter (spine + anchor)
6. Navigate to target page/position
```

### Deferred

- External links (http://) - no network navigation on e-reader
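
Step 5 of the flow, together with the external-link exclusion, amounts to splitting an `href` into a document part and an anchor. The following is a minimal sketch with hypothetical names (`LinkTarget`, `parseHref`); it does not resolve relative paths against the current chapter or look up the spine index, both of which a real implementation would need.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for a parsed link.
struct LinkTarget {
  std::string document;  // path before '#'; empty means "same chapter"
  std::string anchor;    // fragment after '#'; empty means "start of document"
  bool external;         // true for scheme-qualified links such as http://
};

inline LinkTarget parseHref(const std::string& href) {
  LinkTarget t{"", "", false};
  if (href.find("://") != std::string::npos || href.compare(0, 7, "mailto:") == 0) {
    t.external = true;  // deferred: no network navigation on the device
    return t;
  }
  const std::size_t hash = href.find('#');
  if (hash == std::string::npos) {
    t.document = href;  // link to the top of another document
  } else {
    t.document = href.substr(0, hash);
    t.anchor = href.substr(hash + 1);
  }
  return t;
}
```

A footnote link such as `#note1` yields an empty document and the anchor `note1`, so the caller stays in the current chapter; `chapter2.xhtml#sec3` yields both parts and signals a cross-chapter jump.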

---

## ADR-010: DRM and Encryption

**Status:** INTENTIONALLY OMITTED

### Context

EPUB supports DRM through:
- `encryption.xml` in META-INF
- Adobe DRM
- Various proprietary schemes

### Decision

**Do not implement DRM support.**

### Rationale

1. **Licensing Complexity:**
   - Adobe DRM requires licensing agreements
   - Proprietary schemes have legal restrictions

2. **User Expectation:**
   - Open-source e-reader users expect DRM-free content
   - DRM conflicts with device modification philosophy

3. **Implementation Complexity:**
   - Each DRM scheme is different
   - Secure key storage required
   - Regular updates needed for scheme changes

### Alternative

Users should remove DRM from purchased content using legal tools before loading to device.

---

## ADR-011: List Rendering Enhancements

**Status:** RECOMMENDED FOR IMPLEMENTATION (PHASE 1)

### Context

Current implementation:
- `<li>` renders bullet character `•`
- No numbered list support
- No nesting indentation

### Decision

**Enhance list rendering with ordered numbers and nesting.**

### Rationale

1. **Low Complexity:**
   - Track list type (`<ol>` vs `<ul>`) in parser state
   - Maintain counter for ordered lists
   - Apply indentation based on nesting depth

2. **Clear User Benefit:**
   - Numbered lists convey sequence
   - Indentation shows hierarchy

### Implementation

```cpp
// Parser state additions
int listDepth = 0;
int orderedListCounter = 0;
bool isOrderedList = false;

// On <ol> start
isOrderedList = true;
orderedListCounter = 1;
listDepth++;

// On <li> in ordered list
addWord(std::to_string(orderedListCounter++) + ".", REGULAR);

// Apply indent
textIndent = listDepth * 20;  // pixels per level

// On </ol> end (note: a single counter is not restored for an enclosing <ol>)
listDepth--;
```
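
A single counter and flag lose their place once lists nest: a nested `<ol>` resets the counter, and the outer list cannot resume numbering afterwards. One possible refinement, sketched here with hypothetical names (`ListLevel`, `ListTracker`), keeps one entry per open list on a stack. It is an illustration of the idea, not the parser's actual state.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-list state; one entry is pushed for every open <ol>/<ul>.
struct ListLevel {
  bool ordered;
  int nextNumber;
};

class ListTracker {
 public:
  void openList(bool ordered) { levels_.push_back({ordered, 1}); }

  void closeList() {
    if (!levels_.empty()) levels_.pop_back();
  }

  // Marker text for the next <li>: "1.", "2.", ... or a UTF-8 bullet.
  std::string nextMarker() {
    if (levels_.empty() || !levels_.back().ordered) return "\xe2\x80\xa2";
    return std::to_string(levels_.back().nextNumber++) + ".";
  }

  // Indent in pixels for the current nesting depth (20 px per level).
  int indentPx() const { return static_cast<int>(levels_.size()) * 20; }

 private:
  std::vector<ListLevel> levels_;
};
```

With this structure, an ordered list that contains a nested bulleted list resumes at "3." after the nested list closes, and the indent follows the stack depth automatically.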

---

## Summary Table

| Feature | Decision | Rationale |
|---------|----------|-----------|
| Inline Images | IMPLEMENT (Phase 2) | High impact, infrastructure ready |
| JavaScript | OMIT | Security risk, no benefit for e-ink |
| Fixed Layout | OMIT | Display mismatch, poor UX |
| Audio/Video | CANNOT | No hardware support |
| Color CSS | IGNORE | Grayscale display |
| Tables | DEFER | Moderate impact, high complexity |
| SVG/MathML | OMIT | High complexity, limited use |
| Complex CSS Selectors | OMIT | Memory constraints, limited benefit |
| Internal Links | IMPLEMENT (Phase 3) | User value for references |
| DRM | OMIT | Licensing, philosophy conflict |
| List Enhancements | IMPLEMENT (Phase 1) | Low complexity, clear benefit |

---

## Implementation Priority

### Phase 1 (Quick Wins)
- [x] Basic bullet rendering (already implemented)
- [ ] Ordered list numbering
- [ ] Nested list indentation
- [ ] Line-height CSS support

### Phase 2 (Image Support)
- [ ] Image extraction from EPUB
- [ ] JPEG to BMP conversion for inline images
- [ ] PageImage element integration
- [ ] Image scaling and layout

### Phase 3 (Navigation)
- [ ] Internal link parsing
- [ ] Link selection UI
- [ ] Anchor navigation

### Deferred/Not Planned
- PNG support (would need library addition)
- Table rendering
- Complex CSS selectors
- JavaScript, FXL, DRM
@ -0,0 +1,262 @@
# EPUB 3.3 Compliance Feature Prioritization

**Date:** 2026-01-23 19:46:02
**Based on:** EPUB 3.3 Compliance Audit

---

## Overview

This document reviews the findings of the CrossPoint Reader EPUB 3.3 compliance audit and prioritizes features for implementation based on:

1. **User Impact** - How much the feature improves the reading experience
2. **Implementation Complexity** - Level of effort required
3. **Existing Infrastructure** - Available code that can be leveraged
4. **Hardware Feasibility** - Whether the e-ink display can support it

---

## Priority 1: High Impact, Infrastructure Ready

These features have existing code that can be leveraged and provide significant user value.

### 1.1 Inline Image Rendering

**Current State:** Images show placeholder text `[Image: alt_text]`

**Infrastructure Available:**
- `Bitmap` class (`lib/GfxRenderer/Bitmap.h`) - BMP parsing with grayscale conversion and dithering
- `JpegToBmpConverter` (`lib/JpegToBmpConverter/`) - JPEG to BMP conversion with prescaling
- `GfxRenderer::drawBitmap()` - Already renders bitmaps to e-ink display
- `ZipFile` - Can extract images from EPUB archive

**Implementation Approach:**
1. Extend `ChapterHtmlSlimParser` to extract image `src` attributes
2. Create `PageImage` element (alongside existing `PageLine`)
3. Extract and cache images from EPUB ZIP to SD card as BMP
4. Integrate image blocks into page layout calculations
5. Render images inline with text during page display

**Complexity:** Medium
**User Impact:** High - Many EPUBs have important diagrams, illustrations, and decorative images

**Supported Formats:**
- JPEG (via `picojpeg`) - Most common in EPUBs
- BMP (native support)
- PNG - **Not currently supported** (would need `pngle` or similar library)

---

### 1.2 List Markers (Bullets and Numbers)

**Current State:** Every `<li>` renders with a bullet character, including items in ordered lists; numbering is not yet implemented

**Infrastructure Available:**
- `ChapterHtmlSlimParser` already handles `<li>` tags (line 248 adds bullet)
- Font system supports Unicode characters

**Implementation Approach:**
1. Track list type (`<ol>` vs `<ul>`) in parser state
2. For `<ol>`: maintain counter, render as "1.", "2.", etc.
3. For `<ul>`: use bullet character (already implemented: `\xe2\x80\xa2`)
4. Apply text indentation for list item content

**Current Code (already adds bullet):**
```cpp
if (strcmp(name, "li") == 0) {
  self->currentTextBlock->addWord("\xe2\x80\xa2", EpdFontFamily::REGULAR);
}
```

**Complexity:** Low
**User Impact:** Medium - Improves readability of enumerated content

---

### 1.3 Nested List Indentation

**Current State:** Nested lists are flattened to same indentation level

**Implementation Approach:**
1. Track nesting depth in parser (`listDepth` counter)
2. Apply progressive `text-indent` based on depth (e.g., 20px per level)
3. Store depth in `BlockStyle` for rendering

**Complexity:** Low
**User Impact:** Medium - Important for technical documentation and outlines

---

## Priority 2: Medium Impact, Moderate Complexity

### 2.1 Basic Table Rendering

**Current State:** Tables show `[Table omitted]` placeholder

**Implementation Approach:**
1. Parse table structure (`<table>`, `<tr>`, `<td>`, `<th>`)
2. Calculate column widths based on content or proportional division
3. Render as text rows with column separators (e.g., `|` character)
4. Apply header styling (bold) for `<th>` elements

**Constraints:**
- Fixed-width font may be needed for alignment
- Complex tables (colspan, rowspan) would remain unsupported
- Wide tables may need horizontal truncation

**Complexity:** Medium-High
**User Impact:** Medium - Some EPUBs have data tables, but many don't

---

### 2.2 Internal Link Navigation

**Current State:** Links render as plain text, not interactive

**Implementation Approach:**
1. Parse `<a href="#id">` attributes during HTML parsing
2. Store link targets in page elements
3. Add UI for link selection (highlight/underline links)
4. Implement navigation to target anchor when activated
5. Handle cross-chapter links (resolve to spine + anchor)

**Complexity:** Medium-High (requires UI changes)
**User Impact:** Medium - Important for reference material, footnotes

---

### 2.3 Line Height CSS Support

**Current State:** Line height is fixed based on font metrics

**Implementation Approach:**
1. Add `line-height` property to `CssParser`
2. Store in `CssStyle` struct
3. Apply multiplier to `getLineHeight()` during layout

**Complexity:** Low
**User Impact:** Medium - Lets text density match the publisher's intent
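
Turning a `line-height` declaration into the multiplier from step 3 could look roughly like this. It is a sketch under stated assumptions: `parseLineHeight` is a hypothetical name, the value is assumed to be trimmed already, and absolute units such as `px` are ignored because converting them would require the current font size.

```cpp
#include <cstdlib>
#include <string>

// Parse a CSS line-height value into a multiplier of the font's natural line height.
// Handles unitless numbers ("1.5"), percentages ("150%") and em values ("1.5em").
// Returns 1.0f for "normal", absolute units such as px, or anything unparseable.
inline float parseLineHeight(const std::string& value) {
  const char* begin = value.c_str();
  char* end = nullptr;
  const float number = std::strtof(begin, &end);
  if (end == begin || !(number > 0.0f)) return 1.0f;  // keyword, garbage, or non-positive
  const std::string unit(end);
  if (unit.empty() || unit == "em") return number;
  if (unit == "%") return number / 100.0f;
  return 1.0f;  // px, pt, etc. would need the font size to convert
}
```

The result would then scale the value returned by `getLineHeight()` during layout; clamping it to a sensible range (for example 1.0 to 2.0) might be worth considering so that unusual stylesheets cannot make pages unreadable.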

---

## Priority 3: Lower Impact or Higher Complexity

### 3.1 Font Size CSS Support

**Current State:** Font size controlled by reader settings only

**Implementation Approach:**
1. Parse `font-size` in `CssParser` (relative units: em, rem, %)
2. Map to available font sizes (snap to nearest)
3. Apply during text rendering

**Constraints:**
- Limited font size options in bitmap fonts
- Large size changes may cause layout issues

**Complexity:** Medium
**User Impact:** Low-Medium - Most content uses default sizing
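
The "snap to nearest" step can be shown with a short helper. This is a minimal sketch: `snapFontSize` is a hypothetical name, and `availablePx` stands in for whatever set of bitmap font sizes the firmware actually ships, which this document does not list.

```cpp
#include <cstdlib>
#include <vector>

// Pick the available font size closest to basePx * scale.
// Ties resolve to the earlier entry; returns basePx if the list is empty.
inline int snapFontSize(int basePx, float scale, const std::vector<int>& availablePx) {
  if (availablePx.empty()) return basePx;
  const int target = static_cast<int>(basePx * scale + 0.5f);  // round to nearest pixel
  int best = availablePx.front();
  for (int size : availablePx) {
    if (std::abs(size - target) < std::abs(best - target)) best = size;
  }
  return best;
}
```

With a 16 px base and sizes of 12, 14, 16, and 18 px, a `font-size: 1.2em` request targets 19 px and snaps to 18 px, while `2em` also ends up at 18 px. That second case illustrates the layout risk noted in the constraints: large requested changes collapse onto the largest available size.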

---

### 3.2 Page-List Navigation

**Current State:** Only TOC navigation supported

**Implementation Approach:**
1. Parse `<nav epub:type="page-list">` in navigation document
2. Store page reference mappings
3. Add UI to jump to specific page numbers

**Complexity:** Medium
**User Impact:** Low - Useful for academic/reference texts with page citations

---

### 3.3 Landmarks Navigation

**Current State:** Not implemented

**Implementation Approach:**
1. Parse `<nav epub:type="landmarks">`
2. Extract semantic markers (cover, toc, bodymatter, etc.)
3. Add quick-nav UI

**Complexity:** Low-Medium
**User Impact:** Low - Convenience feature

---

## Features NOT Recommended for Implementation

These features are either hardware-limited or provide minimal value for the target use case.

### Hardware Limited (Cannot Implement)

| Feature | Reason |
|---------|--------|
| Color rendering (images, CSS) | Display is 4-level grayscale only |
| Audio playback | No audio hardware |
| Video playback | No video hardware, e-ink refresh too slow |
| Animations | E-ink refresh latency incompatible |
| Media Overlays | Requires audio hardware |

### Not Worth Implementing

| Feature | Reason |
|---------|--------|
| JavaScript | Security concerns, minimal EPUB use, high complexity |
| Fixed Layout (FXL) | Designed for tablets; poor e-ink experience |
| SVG | Complex parser needed, limited use in text EPUBs |
| MathML | Extremely complex, niche use case |
| CSS Grid/Flexbox | Overkill for text layout |
| @font-face | Memory constraints, limited benefit |
| DRM | Licensing complexity, user expectation of DRM-free |
| Forms | Interactive elements unsuitable for e-ink |

---

## Implementation Roadmap

### Phase 1: Quick Wins (Low Effort, High Value)
1. ~~List bullet rendering~~ (already implemented)
2. Ordered list numbering
3. Nested list indentation
4. Line-height CSS support

### Phase 2: Image Support (Medium Effort, High Value)
1. JPEG image extraction and caching
2. BMP image support
3. `PageImage` element integration
4. Image scaling to viewport width
5. (Optional) PNG support via external library

### Phase 3: Enhanced Navigation (Medium Effort, Medium Value)
1. Internal link parsing
2. Link highlighting/selection UI
3. Anchor navigation within chapters
4. Cross-chapter link navigation

### Phase 4: Table Support (High Effort, Medium Value)
1. Basic table parsing
2. Simple column layout
3. Text-based table rendering
4. Header styling

---

## Summary

| Priority | Feature | Complexity | Impact | Dependencies |
|----------|---------|------------|--------|--------------|
| 1.1 | Inline Images | Medium | High | Bitmap, JpegToBmpConverter |
| 1.2 | Ordered List Numbers | Low | Medium | None |
| 1.3 | Nested List Indent | Low | Medium | None |
| 2.1 | Basic Tables | Medium-High | Medium | None |
| 2.2 | Internal Links | Medium-High | Medium | UI changes |
| 2.3 | Line Height CSS | Low | Medium | CssParser |
| 3.1 | Font Size CSS | Medium | Low-Medium | Font system |
| 3.2 | Page-List Nav | Medium | Low | Navigation system |
| 3.3 | Landmarks Nav | Low-Medium | Low | Navigation system |

**Recommended First Implementation:** Start with Phase 1 quick wins (list improvements, line-height), then proceed to Image Support as it has the highest user-visible impact and leverages existing infrastructure.