# Serial Blocking Debug Session Summary
**Date:** 2026-01-28
**Issue:** Device freezes when booted without USB connected
**Resolution:** `Serial.setTxTimeoutMs(0)` - make Serial TX non-blocking
## Problem Description
During release preparation for ef-0.15.9, the device was found to freeze completely in the following sequence:
1. Device is unplugged from USB
2. Device is powered on via the power button
3. Book page displays, then the device becomes unresponsive
4. No button presses register
The device worked perfectly when USB was connected.
## Investigation Process
### Initial Hypotheses Tested
Multiple hypotheses were systematically investigated:
1. **Hypotheses A-D:** Display/rendering mutex issues
   - Added mutex logging to SD card
   - Mutex operations completed successfully
   - Ruled out as root cause
2. **Hypothesis E:** FreeRTOS task creation issues
   - Task created and ran successfully
   - First render completed normally
   - Ruled out
3. **Hypotheses F-G:** Main loop execution
   - Added loop counter logging to SD card
   - **Key finding:** Main loop never started logging
   - `setup()` completed, but `loop()` never executed meaningful work
4. **Hypotheses H-J:** Various timing and initialization issues
   - Tested different delays and initialization orders
   - No improvement
### Root Cause Discovery
The breakthrough came from analyzing the boot sequence:
1. `setup()` completes successfully
2. `EpubReaderActivity::onEnter()` runs and calls `Serial.printf()` to log progress
3. **Device hangs at the `Serial.printf()` call**
On the ESP32-C3 with USB CDC (USB serial), `Serial.printf()` blocks indefinitely when USB is not connected. With no host reading the data, the TX buffer never drains, and the default behavior is to wait for space.
### Evidence
- When USB connected: `Serial.printf()` returns immediately (data sent to host)
- When USB disconnected: `Serial.printf()` blocks forever waiting for TX buffer space
- The hang occurred specifically in `EpubReaderActivity.cpp` during progress logging
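
The evidence above can be illustrated with a toy model of a fixed-size TX buffer. This is a sketch for building intuition only, not the Arduino core's implementation: `ToyTxBuffer`, its simulated millisecond clock, and the 64-byte capacity used in the note below are all hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Toy model (hypothetical, NOT the Arduino core's code): a fixed-size TX
// buffer that only empties when a host is connected and reading.
struct ToyTxBuffer {
  std::size_t capacity;
  std::size_t used = 0;
  bool hostConnected = false;

  explicit ToyTxBuffer(std::size_t cap) : capacity(cap) {}

  // One simulated millisecond: a connected host drains the whole buffer.
  void tick() {
    if (hostConnected) used = 0;
  }

  // Queue `len` bytes, waiting up to `timeoutMs` simulated ms for space.
  // Returns the bytes accepted; `waitedMs` reports how long the caller stalled.
  std::size_t write(std::size_t len, std::uint32_t timeoutMs,
                    std::uint32_t& waitedMs) {
    waitedMs = 0;
    std::size_t accepted = 0;
    while (accepted < len) {
      const std::size_t room = capacity - used;
      if (room > 0) {
        const std::size_t remaining = len - accepted;
        const std::size_t n = remaining < room ? remaining : room;
        used += n;
        accepted += n;
        continue;
      }
      if (waitedMs >= timeoutMs) break;  // give up; the rest is dropped
      ++waitedMs;
      tick();
    }
    return accepted;
  }
};
```

In this model, a 64-byte buffer with no host accepts 64 of 100 bytes and then stalls the caller for the full timeout on every later write. A very large timeout approximates the indefinite hang, a timeout of 0 returns at once, and a connected host lets the same write finish after a single tick.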
## Solution
### Primary Fix
Configure Serial to be non-blocking in `src/main.cpp`:
```cpp
// Always initialize Serial but make it non-blocking
Serial.begin(115200);
Serial.setTxTimeoutMs(0); // Non-blocking TX - critical for USB disconnect handling
```
`Serial.setTxTimeoutMs(0)` tells the ESP32 Arduino core to return immediately from Serial write operations if the buffer is full, rather than blocking. The trade-off is that any output that does not fit is discarded instead of being sent later.
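
With a zero timeout, writes can return having accepted fewer bytes than requested. A minimal sketch of tracking that loss follows, assuming the port's `write(const uint8_t*, size_t)` returns the number of bytes accepted, as the Arduino `Print` interface specifies. `DropCountingLog` and `CappedPort` are hypothetical names; `CappedPort` only stands in for `Serial` so the helper can be exercised off-device.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for Serial: accepts at most `room` bytes in total,
// then refuses everything, like a full TX buffer with no host attached.
struct CappedPort {
  std::size_t room;
  std::size_t write(const std::uint8_t* /*data*/, std::size_t len) {
    const std::size_t n = len < room ? len : room;
    room -= n;
    return n;
  }
};

// Hypothetical helper: writes a message and tallies bytes the port refused.
template <typename Port>
class DropCountingLog {
 public:
  explicit DropCountingLog(Port& port) : port_(port) {}

  // Returns the number of bytes the port actually accepted.
  std::size_t log(const char* msg) {
    const std::size_t len = std::strlen(msg);
    const std::size_t sent =
        port_.write(reinterpret_cast<const std::uint8_t*>(msg), len);
    dropped_ += len - sent;
    return sent;
  }

  std::size_t dropped() const { return dropped_; }

 private:
  Port& port_;
  std::size_t dropped_ = 0;
};
```

On the device, the same helper could wrap `Serial` (for example `DropCountingLog<decltype(Serial)> log(Serial);`), and the dropped-byte count could be written to the SD card to show how much debug output was lost while unplugged.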
### Secondary Protection (Belt and Suspenders)
Added `if (Serial)` guards to high-traffic Serial calls in `EpubReaderActivity.cpp`:
```cpp
if (Serial) Serial.printf("[%lu] [ERS] Loaded progress...\n", millis());
```
This provides an additional check before attempting to print, though it's not strictly necessary with the timeout set to 0.
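
Rather than repeating the guard at each call site, the check could be folded into one helper. The sketch below is one possible shape, not what the project does: `logIfConnected` and `FakeSerial` are hypothetical, and the helper assumes the port converts to `bool` and offers a `printf`-style member, as `Serial` does in the snippet above.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in for Serial so the helper can be checked off-device.
struct FakeSerial {
  bool connected = false;
  std::string out;

  explicit operator bool() const { return connected; }

  std::size_t printf(const char* fmt, ...) {
    char buf[128];
    va_list args;
    va_start(args, fmt);
    const int n = std::vsnprintf(buf, sizeof(buf), fmt, args);
    va_end(args);
    if (n <= 0) return 0;
    std::size_t len = static_cast<std::size_t>(n);
    if (len >= sizeof(buf)) len = sizeof(buf) - 1;  // output was truncated
    out.append(buf, len);
    return len;
  }
};

// Hypothetical helper: skips formatting and writing when the port reports
// no connection, otherwise forwards to the port's printf.
template <typename Port, typename... Args>
std::size_t logIfConnected(Port& port, const char* fmt, Args... args) {
  if (!port) return 0;
  return port.printf(fmt, args...);
}
```

A call site would then read `logIfConnected(Serial, "[%lu] [ERS] Loaded progress...\n", millis());`, keeping the guard in one place.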
## Files Changed
| File | Change |
|------|--------|
| `src/main.cpp` | Added `Serial.setTxTimeoutMs(0)` after `Serial.begin()` |
| `src/main.cpp` | Added `if (Serial)` guard to auto-sleep log |
| `src/main.cpp` | Added `if (Serial)` guard to max loop duration log |
| `src/activities/reader/EpubReaderActivity.cpp` | Added 16 `if (Serial)` guards |
## Verification
After applying the fix:
1. Device boots successfully when unplugged from USB
2. Book pages render correctly
3. Button presses register normally
4. Sleep/wake cycle works
5. No functionality lost when USB is connected
## Lessons Learned
1. **ESP32-C3 USB CDC behavior:** Serial output can block indefinitely without a connected host
2. **Always set non-blocking:** `Serial.setTxTimeoutMs(0)` should be standard for battery-powered devices
3. **Debug logging location matters:** When debugging hangs, SD card logging proved essential since Serial was the problem
4. **Systematic hypothesis testing:** Ruled out many red herrings (mutex, task, rendering) before finding the true cause
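
Lesson 3 relies on log lines reaching the SD card before a hang. A minimal sketch of such a checkpoint logger is shown here, under the assumption that the file object offers `println` and `flush` as Arduino's `File` does. `CheckpointLog` and `FakeFile` are hypothetical names, and the notes do not show the logging code that was actually used.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an SD card File, for off-device checks.
struct FakeFile {
  std::vector<std::string> lines;
  std::size_t flushes = 0;

  std::size_t println(const char* s) {
    lines.emplace_back(s);
    return lines.back().size() + 2;  // text plus "\r\n"
  }
  void flush() { ++flushes; }
};

// Hypothetical helper: writes one line per checkpoint and flushes right
// away, so the last line on the card marks the last point reached.
template <typename FileLike>
class CheckpointLog {
 public:
  explicit CheckpointLog(FileLike& file) : file_(file) {}

  void mark(const char* label) {
    file_.println(label);
    file_.flush();  // push the line out before the next risky call
    ++count_;
  }

  std::size_t count() const { return count_; }

 private:
  FileLike& file_;
  std::size_t count_ = 0;
};
```

Placing a `mark()` call before and after each suspect call narrows a hang down to the first checkpoint missing from the file.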
## Technical Details
### Why This Affects ESP32-C3 Specifically
The ESP32-C3 uses native USB CDC for serial communication, provided by its built-in USB Serial/JTAG controller (no external USB-UART chip). The Arduino core's default behavior is to wait for TX buffer space, and that space only frees up while a USB host is connected and reading.
### Alternative Approaches Considered
1. **Only initialize Serial when USB connected:** Partially implemented, but insufficient because USB can be disconnected after boot
2. **Add `if (Serial)` guards everywhere:** Too invasive (400+ calls)
3. **Disable Serial entirely:** Would lose debug output when USB connected
The chosen solution (`setTxTimeoutMs(0)`) provides the best balance: debug output works when USB is connected, device operates normally when disconnected.