feat: add historical backfill with --init CLI and episode numbering

Adds a --init mode that seeds the database with past shows from a given
anchor episode/date forward, batch-fetching likes from SoundCloud and
partitioning them into weekly buckets. Episode numbers are tracked in
the shows table and auto-incremented by the poller for new shows.

Includes full API documentation (docs/api.md) and updated README.

Made-with: Cursor
This commit is contained in:
cottongin
2026-03-12 02:09:15 -04:00
parent c88826ac4d
commit cb3ae403cf
14 changed files with 922 additions and 21 deletions

@@ -0,0 +1,39 @@
# NtR SoundCloud Fetcher — Full Implementation
## Task Description
Designed and implemented a Python service that polls NicktheRat's SoundCloud likes, builds weekly playlists aligned to the Wednesday 22:00 ET show schedule, and serves them via a JSON API for an IRC bot.
## Changes Made
### Design Phase
- Brainstormed requirements through 6 clarifying questions
- Evaluated 3 architectural approaches, selected single-process daemon
- Produced design doc covering architecture, data model, API, poller logic
- Produced 13-task TDD implementation plan
### Implementation (42 tests, all passing, lint clean)
| Module | File | Purpose |
|--------|------|---------|
| Config | `src/ntr_fetcher/config.py` | Pydantic settings with `NTR_` env prefix |
| Week | `src/ntr_fetcher/week.py` | DST-aware Wednesday 22:00 ET boundary computation |
| Models | `src/ntr_fetcher/models.py` | Track, Show, ShowTrack dataclasses |
| Database | `src/ntr_fetcher/db.py` | SQLite schema, CRUD, track sync with unlike removal |
| SoundCloud | `src/ntr_fetcher/soundcloud.py` | client_id extraction, user resolution, likes fetching |
| Poller | `src/ntr_fetcher/poller.py` | Hourly polling with supervised restart |
| API | `src/ntr_fetcher/api.py` | FastAPI routes for playlist, shows, admin, health |
| Main | `src/ntr_fetcher/main.py` | Entry point wiring everything together |
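
The DST-aware boundary computation listed for the Week module can be sketched as below. This is a minimal illustration rather than the contents of `week.py`: the function name `week_start_for` and the constants are hypothetical, and it assumes the caller passes a timezone-aware datetime. Building the boundary from a local date plus a wall-clock time, instead of subtracting a fixed number of hours, is what keeps the result at 22:00 ET on both sides of a DST change.

```python
from datetime import datetime, time, timedelta
from zoneinfo import ZoneInfo

# Assumed constants; the real module may name or store these differently.
ET = ZoneInfo("America/New_York")
SHOW_WEEKDAY = 2          # datetime.weekday(): Monday is 0, so Wednesday is 2
SHOW_TIME = time(22, 0)   # 22:00 wall-clock time in ET


def week_start_for(moment: datetime) -> datetime:
    """Return the latest Wednesday 22:00 ET at or before `moment`.

    `moment` must be timezone-aware.
    """
    local = moment.astimezone(ET)
    # Step back to the most recent Wednesday (possibly today).
    days_back = (local.weekday() - SHOW_WEEKDAY) % 7
    boundary_day = local.date() - timedelta(days=days_back)
    candidate = datetime.combine(boundary_day, SHOW_TIME, tzinfo=ET)
    # On a Wednesday before 22:00 the boundary is still last week's.
    if candidate > local:
        candidate = datetime.combine(
            boundary_day - timedelta(days=7), SHOW_TIME, tzinfo=ET
        )
    return candidate
```

For example, `week_start_for(datetime(2026, 1, 8, 12, 0, tzinfo=ET))` falls on a Thursday and resolves to Wednesday 2026-01-07 at 22:00 ET.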
### Key Design Decisions
- Tracks removed when Nick unlikes them (positions re-compact)
- Cursor-seeking for efficient SoundCloud API pagination
- Automatic client_id rotation on 401
- Supervisor restarts poller on failure without affecting API
## Follow-up Items
- **Incremental fetching**: Currently fetches full week every poll; could optimize to stop at known tracks
- **Retry/backoff for non-401 errors**: 429, 5xx, timeouts not yet handled with retries
- **`full` parameter**: Accepted but currently equivalent to a normal poll, since there is no incremental mode for it to differ from
- **`soundcloud_url` in admin add track**: Removed from API; only `track_id` supported

@@ -0,0 +1,30 @@
# Historical Backfill (--init) Feature
## Task
Add CLI-based historical show backfill with episode numbering throughout the system.
## Changes Made
### New file
- `src/ntr_fetcher/backfill.py` — Computes show weeks from an anchor episode/date, batch-fetches all likes from SoundCloud, partitions them into weekly buckets, and populates the DB.
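
A rough sketch of the week computation and partitioning described for `backfill.py`, under stated assumptions: the function names are hypothetical, likes are modelled as `(liked_at, track_id)` pairs with timezone-aware timestamps, and each episode's window is assumed to be the seven days ending at its Wednesday 22:00 ET air time. The actual window convention may differ.

```python
from datetime import date, datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo

ET = ZoneInfo("America/New_York")
SHOW_TIME = time(22, 0)


def show_windows(anchor_episode: int, anchor_aired: date, count: int):
    """Return (episode, start, end) windows from the anchor forward."""
    windows = []
    for i in range(count):
        aired_day = anchor_aired + timedelta(weeks=i)
        # Combine local dates with the wall-clock time so DST shifts
        # do not move the boundary away from 22:00 ET.
        end = datetime.combine(aired_day, SHOW_TIME, tzinfo=ET)
        start = datetime.combine(aired_day - timedelta(days=7), SHOW_TIME, tzinfo=ET)
        windows.append((anchor_episode + i, start, end))
    return windows


def partition_likes(likes, windows):
    """Bucket (liked_at, track_id) pairs by the window containing liked_at."""
    buckets = {episode: [] for episode, _, _ in windows}
    for liked_at, track_id in likes:
        for episode, start, end in windows:
            if start <= liked_at < end:
                buckets[episode].append(track_id)
                break  # windows do not overlap, so stop at the first match
    return buckets


windows = show_windows(521, date(2026, 1, 7), count=2)
likes = [
    (datetime(2026, 1, 5, 12, 0, tzinfo=timezone.utc), 101),
    (datetime(2026, 1, 8, 4, 0, tzinfo=timezone.utc), 102),   # Jan 7, 23:00 ET
    (datetime(2025, 12, 1, 0, 0, tzinfo=timezone.utc), 103),  # before the anchor
]
buckets = partition_likes(likes, windows)
```

Likes that fall outside every window, such as track 103 here, are simply dropped in this sketch.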
### Modified files
- `src/ntr_fetcher/models.py` — Added `episode_number: int | None` to `Show` dataclass.
- `src/ntr_fetcher/db.py` — Added `episode_number` column to schema, ALTER TABLE migration for existing DBs, updated `get_or_create_show` to accept/store episode numbers, added `get_latest_episode_number()` and `update_show_episode_number()`, changed `list_shows` ordering to `week_start DESC`.
- `src/ntr_fetcher/main.py` — Added `argparse` with `--init`, `--show`, `--aired` flags. `--init` runs backfill then exits; default starts the server as before.
- `src/ntr_fetcher/poller.py` — Auto-assigns episode number (latest + 1) when creating a new show if historical data exists.
- `src/ntr_fetcher/api.py` — Added `episode_number` to `/playlist`, `/shows`, `/shows/{show_id}` responses.
### New/updated tests
- `tests/test_backfill.py` — Week computation, batch partitioning, empty data, idempotency.
- `tests/test_db.py` — Episode number creation, update, and `get_latest_episode_number`.
- `tests/test_poller.py` — Auto-numbering when history exists, skips when no history, skips when already assigned.
- `tests/test_api.py` — `episode_number` present in show responses.
## Results
- 58 tests passing (up from 42), ruff clean.
## Usage
```
NTR_ADMIN_TOKEN=token ntr-fetcher --init --show 521 --aired 2026-01-07
```
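
For orientation, the flag handling behind this command might look like the sketch below. Only the flag names `--init`, `--show`, and `--aired` come from the change description; the types, help text, and the rule that `--init` requires both other flags are assumptions.

```python
import argparse
from datetime import date


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="ntr-fetcher")
    parser.add_argument("--init", action="store_true",
                        help="run the historical backfill, then exit")
    parser.add_argument("--show", type=int,
                        help="anchor episode number")
    parser.add_argument("--aired", type=date.fromisoformat,
                        help="air date of the anchor episode (YYYY-MM-DD)")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumed validation: a backfill needs both anchor values.
    if args.init and (args.show is None or args.aired is None):
        parser.error("--init requires --show and --aired")
    return args


args = parse_args(["--init", "--show", "521", "--aired", "2026-01-07"])
```

With no flags, `args.init` is `False`, which corresponds to the default path of starting the server as before.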