feat: add historical backfill with --init CLI and episode numbering

Adds an --init mode that seeds the database with past shows from a given anchor episode/date forward, batch-fetching likes from SoundCloud and partitioning them into weekly buckets. Episode numbers are tracked in the shows table and auto-incremented by the poller for new shows. Includes full API documentation (docs/api.md) and an updated README.

Made-with: Cursor
README.md (45 changed lines)
@@ -14,16 +14,34 @@ ntr-fetcher
 The API starts at `http://127.0.0.1:8000`.
 
+## Historical Backfill
+
+Seed the database with past shows by providing an anchor episode and its air date:
+
+```bash
+NTR_ADMIN_TOKEN=token ntr-fetcher --init --show 521 --aired 2026-01-07
+```
+
+This computes every weekly show from the anchor forward to today, batch-fetches
+the corresponding likes from SoundCloud, and populates the database. Episode
+numbers are assigned automatically (521, 522, ...). After backfill completes,
+the normal server mode will auto-increment from the latest episode.
+
 ## API
 
-| Endpoint | Description |
-|----------|-------------|
-| `GET /playlist` | Current week's playlist |
-| `GET /playlist/{n}` | Track at position n |
-| `GET /shows` | List all shows |
-| `GET /shows/{id}` | Specific show's playlist |
-| `GET /health` | Service health check |
-| `POST /admin/refresh` | Trigger SoundCloud fetch (token required) |
+Full documentation: [`docs/api.md`](docs/api.md)
+
+| Endpoint | Method | Auth | Description |
+|----------|--------|------|-------------|
+| `/health` | GET | -- | Service health check |
+| `/playlist` | GET | -- | Current week's playlist |
+| `/playlist/{position}` | GET | -- | Single track by position (1-indexed) |
+| `/shows` | GET | -- | List all shows (paginated) |
+| `/shows/{show_id}` | GET | -- | Specific show with tracks |
+| `/admin/refresh` | POST | Bearer | Trigger immediate SoundCloud fetch |
+| `/admin/tracks` | POST | Bearer | Add track to current show |
+| `/admin/tracks/{track_id}` | DELETE | Bearer | Remove track from current show |
+| `/admin/tracks/{track_id}/position` | PUT | Bearer | Move track to new position |
 
 ## Configuration
 
@@ -33,14 +51,17 @@ Environment variables (prefix `NTR_`):
 |----------|---------|-------------|
 | `NTR_PORT` | `8000` | API port |
 | `NTR_HOST` | `127.0.0.1` | Bind address |
-| `NTR_DB_PATH` | `./ntr_fetcher.db` | SQLite path |
-| `NTR_POLL_INTERVAL_SECONDS` | `3600` | Poll frequency |
-| `NTR_ADMIN_TOKEN` | (required) | Admin bearer token |
-| `NTR_SOUNDCLOUD_USER` | `nicktherat` | SoundCloud user |
+| `NTR_DB_PATH` | `./ntr_fetcher.db` | SQLite database path |
+| `NTR_POLL_INTERVAL_SECONDS` | `3600` | How often to check SoundCloud (seconds) |
+| `NTR_ADMIN_TOKEN` | *(required)* | Bearer token for admin endpoints |
+| `NTR_SOUNDCLOUD_USER` | `nicktherat` | SoundCloud username to track |
+| `NTR_SHOW_DAY` | `2` | Day of week for show (0=Mon, 2=Wed) |
+| `NTR_SHOW_HOUR` | `22` | Hour (Eastern Time) when the show starts |
 
 ## Development
 
 ```bash
 pip install -e ".[dev]"
 pytest
+ruff check src/ tests/
 ```
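The episode arithmetic the README's backfill section describes can be sketched in a few lines. This is an illustrative stand-alone snippet (not the shipped `backfill.py`), using the anchor values from the example command above and a fixed cutoff date:

```python
from datetime import date, timedelta

def enumerate_shows(anchor_episode: int, anchor_aired: date, until: date) -> list[tuple[int, date]]:
    """One (episode_number, air_date) pair per week from the anchor through `until`."""
    shows = []
    aired, episode = anchor_aired, anchor_episode
    while aired <= until:
        shows.append((episode, aired))
        aired += timedelta(days=7)  # shows are strictly weekly
        episode += 1
    return shows

# Episode 521 aired 2026-01-07; enumerate through early February.
shows = enumerate_shows(521, date(2026, 1, 7), date(2026, 2, 4))
print(len(shows))   # 5
print(shows[-1])    # (525, datetime.date(2026, 2, 4))
```

The real backfill uses `date.today()` as the cutoff, so the list grows by one entry per week.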
chat-summaries/2026-03-12_00-00-summary.md (new file, 39 lines)
# NtR SoundCloud Fetcher — Full Implementation

## Task Description

Designed and implemented a Python service that polls NicktheRat's SoundCloud likes, builds weekly playlists aligned to the Wednesday 22:00 ET show schedule, and serves them via a JSON API for an IRC bot.

## Changes Made

### Design Phase

- Brainstormed requirements through 6 clarifying questions
- Evaluated 3 architectural approaches, selected single-process daemon
- Produced design doc covering architecture, data model, API, poller logic
- Produced 13-task TDD implementation plan

### Implementation (42 tests, all passing, lint clean)

| Module | File | Purpose |
|--------|------|---------|
| Config | `src/ntr_fetcher/config.py` | Pydantic settings with `NTR_` env prefix |
| Week | `src/ntr_fetcher/week.py` | DST-aware Wednesday 22:00 ET boundary computation |
| Models | `src/ntr_fetcher/models.py` | Track, Show, ShowTrack dataclasses |
| Database | `src/ntr_fetcher/db.py` | SQLite schema, CRUD, track sync with unlike removal |
| SoundCloud | `src/ntr_fetcher/soundcloud.py` | client_id extraction, user resolution, likes fetching |
| Poller | `src/ntr_fetcher/poller.py` | Hourly polling with supervised restart |
| API | `src/ntr_fetcher/api.py` | FastAPI routes for playlist, shows, admin, health |
| Main | `src/ntr_fetcher/main.py` | Entry point wiring everything together |

### Key Design Decisions

- Tracks removed when Nick unlikes them (positions re-compact)
- Cursor-seeking for efficient SoundCloud API pagination
- Automatic client_id rotation on 401
- Supervisor restarts poller on failure without affecting API

## Follow-up Items

- **Incremental fetching**: Currently fetches full week every poll; could optimize to stop at known tracks
- **Retry/backoff for non-401 errors**: 429, 5xx, timeouts not yet handled with retries
- **`full` parameter**: Accepted but currently equivalent to normal poll (no incremental to differentiate from)
- **`soundcloud_url` in admin add track**: Removed from API; only `track_id` supported
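The "positions re-compact" decision noted above can be illustrated with a small stand-alone helper (hypothetical names, not the actual `db.py` sync code): surviving tracks keep their relative order and are renumbered from 1.

```python
def sync_positions(current: list[int], still_liked: set[int]) -> list[tuple[int, int]]:
    """Drop unliked track IDs and renumber survivors 1..N.
    Illustrative sketch only -- the real implementation works against SQLite."""
    kept = [tid for tid in current if tid in still_liked]
    return [(pos, tid) for pos, tid in enumerate(kept, start=1)]

# Track 222 was unliked; 333 moves up from position 3 to position 2.
result = sync_positions([111, 222, 333], {111, 333})
print(result)  # [(1, 111), (2, 333)]
```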
chat-summaries/2026-03-12_16-30-summary.md (new file, 30 lines)
# Historical Backfill (--init) Feature

## Task

Add CLI-based historical show backfill with episode numbering throughout the system.

## Changes Made

### New file

- `src/ntr_fetcher/backfill.py` — Computes show weeks from an anchor episode/date, batch-fetches all likes from SoundCloud, partitions them into weekly buckets, and populates the DB.

### Modified files

- `src/ntr_fetcher/models.py` — Added `episode_number: int | None` to `Show` dataclass.
- `src/ntr_fetcher/db.py` — Added `episode_number` column to schema, ALTER TABLE migration for existing DBs, updated `get_or_create_show` to accept/store episode numbers, added `get_latest_episode_number()` and `update_show_episode_number()`, changed `list_shows` ordering to `week_start DESC`.
- `src/ntr_fetcher/main.py` — Added `argparse` with `--init`, `--show`, `--aired` flags. `--init` runs backfill then exits; default starts the server as before.
- `src/ntr_fetcher/poller.py` — Auto-assigns episode number (latest + 1) when creating a new show if historical data exists.
- `src/ntr_fetcher/api.py` — Added `episode_number` to `/playlist`, `/shows`, `/shows/{show_id}` responses.

### New/updated tests

- `tests/test_backfill.py` — Week computation, batch partitioning, empty data, idempotency.
- `tests/test_db.py` — Episode number creation, update, and `get_latest_episode_number`.
- `tests/test_poller.py` — Auto-numbering when history exists, skips when no history, skips when already assigned.
- `tests/test_api.py` — `episode_number` present in show responses.

## Results

- 58 tests passing (up from 42), ruff clean.

## Usage

```bash
NTR_ADMIN_TOKEN=token ntr-fetcher --init --show 521 --aired 2026-01-07
```
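The batch partitioning step the summary describes (one fetch, then bucketing likes into weekly windows) can be sketched as a pure function. This is an illustrative stand-in for the per-week filtering in `backfill.py`, assuming each like carries a timezone-aware `liked_at` timestamp:

```python
from datetime import datetime, timedelta, timezone

def partition_by_week(likes, week_starts):
    """Bucket (track_id, liked_at) pairs into half-open [start, start + 7 days) windows."""
    buckets = {ws: [] for ws in week_starts}
    for track_id, liked_at in likes:
        for ws in week_starts:
            if ws <= liked_at < ws + timedelta(days=7):
                buckets[ws].append(track_id)
                break  # each like belongs to exactly one window
    return buckets

# Two consecutive show windows starting Thu 03:00 UTC (the EST-era boundary).
w1 = datetime(2026, 1, 8, 3, tzinfo=timezone.utc)
w2 = w1 + timedelta(days=7)
likes = [
    (101, w1 + timedelta(days=2)),
    (102, w2 + timedelta(hours=1)),
    (103, w1 + timedelta(days=6)),
]
buckets = partition_by_week(likes, [w1, w2])
# 101 and 103 land in the first window; 102 in the second.
```

The half-open interval matters: a like at exactly the boundary instant belongs to the following week, never to both.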
docs/api.md (new file, 310 lines)
# NtR SoundCloud Fetcher -- API Reference

Base URL: `http://127.0.0.1:8000` (configurable via `NTR_HOST` / `NTR_PORT`)

---

## Public Endpoints

### `GET /health`

Service health check.

**Response**

```json
{
  "status": "ok",
  "poller_alive": true,
  "last_fetch": "2026-03-12T02:00:00+00:00",
  "current_week_track_count": 9
}
```

| Field | Type | Description |
|-------|------|-------------|
| `status` | string | Always `"ok"` |
| `poller_alive` | boolean | Whether the background poller is running |
| `last_fetch` | string \| null | ISO 8601 timestamp of last successful poll, or `null` if never |
| `current_week_track_count` | integer | Number of tracks in the current week's playlist |

---

### `GET /playlist`

Returns the current week's full playlist.

**Response**

```json
{
  "show_id": 10,
  "episode_number": 530,
  "week_start": "2026-03-05T02:00:00+00:00",
  "week_end": "2026-03-12T02:00:00+00:00",
  "tracks": [
    {
      "show_id": 10,
      "track_id": 12345,
      "position": 1,
      "title": "Night Drive",
      "artist": "SomeArtist",
      "permalink_url": "https://soundcloud.com/someartist/night-drive",
      "artwork_url": "https://i1.sndcdn.com/artworks-...-large.jpg",
      "duration_ms": 245000,
      "license": "cc-by",
      "liked_at": "2026-03-06T14:23:00+00:00",
      "raw_json": "{...}"
    }
  ]
}
```

| Field | Type | Description |
|-------|------|-------------|
| `show_id` | integer | Internal database ID for this show |
| `episode_number` | integer \| null | Episode number (e.g. 530), or `null` if not assigned |
| `week_start` | string | ISO 8601 UTC timestamp -- start of the show's like window |
| `week_end` | string | ISO 8601 UTC timestamp -- end of the show's like window |
| `tracks` | array | Ordered list of tracks (see Track Object below) |

---

### `GET /playlist/{position}`

Returns a single track by its position in the current week's playlist. Positions are 1-indexed (matching IRC commands `!1`, `!2`, etc.).

**Path Parameters**

| Parameter | Type | Description |
|-----------|------|-------------|
| `position` | integer | 1-based position in the playlist |

**Response** -- a single Track Object (see below).

**Errors**

| Status | Detail |
|--------|--------|
| 404 | `"No track at position {n}"` |

---

### `GET /shows`

Lists all shows, ordered by week start date (newest first).

**Query Parameters**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `limit` | integer | 20 | Max number of shows to return |
| `offset` | integer | 0 | Number of shows to skip |

**Response**

```json
[
  {
    "id": 10,
    "episode_number": 530,
    "week_start": "2026-03-05T02:00:00+00:00",
    "week_end": "2026-03-12T02:00:00+00:00",
    "created_at": "2026-03-05T03:00:00+00:00"
  }
]
```

---

### `GET /shows/{show_id}`

Returns a specific show with its full track listing.

**Path Parameters**

| Parameter | Type | Description |
|-----------|------|-------------|
| `show_id` | integer | The show's internal database ID |

**Response**

```json
{
  "show_id": 10,
  "episode_number": 530,
  "week_start": "2026-03-05T02:00:00+00:00",
  "week_end": "2026-03-12T02:00:00+00:00",
  "tracks": [...]
}
```

**Errors**

| Status | Detail |
|--------|--------|
| 404 | `"Show not found"` |

---

## Admin Endpoints

All admin endpoints require a bearer token via the `Authorization` header:

```
Authorization: Bearer <NTR_ADMIN_TOKEN>
```

Returns `401` with `"Missing or invalid token"` if the header is absent or the token doesn't match.
---

### `POST /admin/refresh`

Triggers an immediate SoundCloud fetch for the current week's show.

**Request Body** (optional)

```json
{
  "full": false
}
```

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `full` | boolean | `false` | Reserved for future use (full vs incremental refresh) |

**Response**

```json
{
  "status": "refreshed",
  "track_count": 9
}
```

---

### `POST /admin/tracks`

Manually add a track to the current week's show.

**Request Body**

```json
{
  "track_id": 12345,
  "position": 3
}
```

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `track_id` | integer | yes | SoundCloud track ID (must already exist in the `tracks` table) |
| `position` | integer | no | Insert at this position (shifts others down). Omit to append at end. |

**Response**

```json
{
  "status": "added"
}
```

---

### `DELETE /admin/tracks/{track_id}`

Remove a track from the current week's show. Remaining positions are re-compacted.

**Path Parameters**

| Parameter | Type | Description |
|-----------|------|-------------|
| `track_id` | integer | SoundCloud track ID to remove |

**Response**

```json
{
  "status": "removed"
}
```

**Errors**

| Status | Detail |
|--------|--------|
| 404 | `"Track not in current show"` |

---

### `PUT /admin/tracks/{track_id}/position`

Move a track to a new position within the current week's show.

**Path Parameters**

| Parameter | Type | Description |
|-----------|------|-------------|
| `track_id` | integer | SoundCloud track ID to move |

**Request Body**

```json
{
  "position": 1
}
```

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `position` | integer | yes | New 1-based position for the track |

**Response**

```json
{
  "status": "moved"
}
```

**Errors**

| Status | Detail |
|--------|--------|
| 404 | `"Track not in current show"` |

---

## Track Object

Returned inside playlist and show detail responses.

| Field | Type | Description |
|-------|------|-------------|
| `show_id` | integer | The show this track belongs to |
| `track_id` | integer | SoundCloud track ID |
| `position` | integer | 1-based position in the playlist |
| `title` | string | Track title |
| `artist` | string | Uploader's SoundCloud username |
| `permalink_url` | string | Full URL to the track on SoundCloud |
| `artwork_url` | string \| null | URL to artwork image, or `null` |
| `duration_ms` | integer | Track duration in milliseconds |
| `license` | string | License string (e.g. `"cc-by"`, `"cc-by-sa"`) |
| `liked_at` | string | ISO 8601 timestamp of when the host liked the track |
| `raw_json` | string | Full SoundCloud API response for this track (JSON string) |

---

## Week Boundaries

Shows follow a weekly cadence aligned to **Wednesday 22:00 Eastern Time** (EST or EDT depending on DST). The like window for a show runs from the previous Wednesday 22:00 ET to the current Wednesday 22:00 ET.

All timestamps in API responses are UTC. The boundary shifts by 1 hour across DST transitions:

| Period | Eastern | UTC boundary |
|--------|---------|--------------|
| EST (Nov -- Mar) | Wed 22:00 | Thu 03:00 |
| EDT (Mar -- Nov) | Wed 22:00 | Thu 02:00 |
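The EST/EDT rows in the boundary table can be double-checked with Python's `zoneinfo` (requires the IANA tz database on the host). This sketch converts Wednesday 22:00 Eastern to UTC for one winter and one summer date in 2026, both of which are Wednesdays:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

ET = ZoneInfo("America/New_York")

def boundary_utc(year: int, month: int, day: int) -> datetime:
    """UTC instant corresponding to 22:00 Eastern on the given calendar date."""
    return datetime(year, month, day, 22, 0, tzinfo=ET).astimezone(timezone.utc)

print(boundary_utc(2026, 1, 7))  # EST, UTC-5 -> 2026-01-08 03:00:00+00:00
print(boundary_utc(2026, 7, 8))  # EDT, UTC-4 -> 2026-07-09 02:00:00+00:00
```

The one-hour shift is exactly why week lengths straddling a DST transition are 167 or 169 hours in Eastern wall-clock terms while staying Wednesday-aligned.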
src/ntr_fetcher/api.py

@@ -61,6 +61,7 @@ def create_app(
     tracks = db.get_show_tracks(show.id)
     return {
         "show_id": show.id,
+        "episode_number": show.episode_number,
         "week_start": show.week_start.isoformat(),
         "week_end": show.week_end.isoformat(),
         "tracks": tracks,
@@ -80,6 +81,7 @@ def create_app(
     return [
         {
             "id": s.id,
+            "episode_number": s.episode_number,
             "week_start": s.week_start.isoformat(),
             "week_end": s.week_end.isoformat(),
             "created_at": s.created_at.isoformat(),
@@ -96,6 +98,7 @@ def create_app(
     tracks = db.get_show_tracks(show.id)
     return {
         "show_id": show.id,
+        "episode_number": show.episode_number,
         "week_start": show.week_start.isoformat(),
         "week_end": show.week_end.isoformat(),
         "tracks": tracks,
src/ntr_fetcher/backfill.py (new file, 82 lines)
```python
import logging
from datetime import date, datetime, timedelta, timezone
from zoneinfo import ZoneInfo

from ntr_fetcher.db import Database
from ntr_fetcher.soundcloud import SoundCloudClient
from ntr_fetcher.week import get_show_week

EASTERN = ZoneInfo("America/New_York")

logger = logging.getLogger(__name__)


def _compute_show_weeks(
    anchor_aired: date,
    anchor_episode: int,
    show_day: int,
    show_hour: int,
) -> list[tuple[int, datetime, datetime]]:
    """Return (episode_number, week_start_utc, week_end_utc) for each show
    from the anchor forward through the current date."""
    today = date.today()
    weeks: list[tuple[int, datetime, datetime]] = []
    aired = anchor_aired
    episode = anchor_episode

    while aired <= today:
        noon_et = datetime(aired.year, aired.month, aired.day, 12, 0, 0, tzinfo=EASTERN)
        noon_utc = noon_et.astimezone(timezone.utc)
        week_start, week_end = get_show_week(noon_utc, show_day, show_hour)
        weeks.append((episode, week_start, week_end))
        aired += timedelta(days=7)
        episode += 1

    return weeks


async def run_backfill(
    db: Database,
    soundcloud: SoundCloudClient,
    soundcloud_user: str,
    show_day: int,
    show_hour: int,
    anchor_episode: int,
    anchor_aired: date,
) -> None:
    weeks = _compute_show_weeks(anchor_aired, anchor_episode, show_day, show_hour)
    if not weeks:
        logger.warning("No show weeks to backfill")
        return

    logger.info(
        "Backfilling %d shows: #%d (%s) through #%d (%s)",
        len(weeks),
        weeks[0][0], anchor_aired.isoformat(),
        weeks[-1][0], (anchor_aired + timedelta(days=7 * (len(weeks) - 1))).isoformat(),
    )

    overall_start = weeks[0][1]
    overall_end = weeks[-1][2]

    user_id = await soundcloud.resolve_user(soundcloud_user)
    all_tracks = await soundcloud.fetch_likes(
        user_id=user_id,
        since=overall_start,
        until=overall_end,
    )
    logger.info("Fetched %d total tracks from SoundCloud", len(all_tracks))

    for track in all_tracks:
        db.upsert_track(track)

    for episode, week_start, week_end in weeks:
        show = db.get_or_create_show(week_start, week_end, episode_number=episode)
        week_tracks = [t for t in all_tracks if week_start <= t.liked_at < week_end]
        week_tracks.sort(key=lambda t: t.liked_at)
        track_ids = [t.id for t in week_tracks]
        db.set_show_tracks(show.id, track_ids)
        logger.info("Show #%d (%s): %d tracks", episode, week_start.isoformat(), len(week_tracks))
```
src/ntr_fetcher/db.py

@@ -20,7 +20,8 @@ CREATE TABLE IF NOT EXISTS shows (
     id INTEGER PRIMARY KEY AUTOINCREMENT,
     week_start TEXT NOT NULL,
     week_end TEXT NOT NULL,
-    created_at TEXT NOT NULL
+    created_at TEXT NOT NULL,
+    episode_number INTEGER
 );
 
 CREATE TABLE IF NOT EXISTS show_tracks (
@@ -49,6 +50,11 @@ class Database:
     def initialize(self) -> None:
         conn = self._connect()
         conn.executescript(SCHEMA)
+        try:
+            conn.execute("ALTER TABLE shows ADD COLUMN episode_number INTEGER")
+            conn.commit()
+        except sqlite3.OperationalError:
+            pass
         conn.close()
 
     def upsert_track(self, track: Track) -> None:
@@ -102,26 +108,36 @@ class Database:
         )
 
     def get_or_create_show(
-        self, week_start: datetime, week_end: datetime
+        self,
+        week_start: datetime,
+        week_end: datetime,
+        episode_number: int | None = None,
     ) -> Show:
         conn = self._connect()
         row = conn.execute(
-            "SELECT id, week_start, week_end, created_at FROM shows "
+            "SELECT id, week_start, week_end, created_at, episode_number FROM shows "
             "WHERE week_start = ? AND week_end = ?",
             (week_start.isoformat(), week_end.isoformat()),
         ).fetchone()
         if row is not None:
+            if episode_number is not None and row["episode_number"] != episode_number:
+                conn.execute(
+                    "UPDATE shows SET episode_number = ? WHERE id = ?",
+                    (episode_number, row["id"]),
+                )
+                conn.commit()
             conn.close()
             return Show(
                 id=row["id"],
                 week_start=datetime.fromisoformat(row["week_start"]),
                 week_end=datetime.fromisoformat(row["week_end"]),
                 created_at=datetime.fromisoformat(row["created_at"]),
+                episode_number=episode_number if episode_number is not None else row["episode_number"],
             )
         now = datetime.now(timezone.utc).isoformat()
         cursor = conn.execute(
-            "INSERT INTO shows (week_start, week_end, created_at) VALUES (?, ?, ?)",
-            (week_start.isoformat(), week_end.isoformat(), now),
+            "INSERT INTO shows (week_start, week_end, created_at, episode_number) VALUES (?, ?, ?, ?)",
+            (week_start.isoformat(), week_end.isoformat(), now, episode_number),
         )
         conn.commit()
         show_id = cursor.lastrowid
@@ -131,6 +147,7 @@ class Database:
             week_start=week_start,
             week_end=week_end,
             created_at=datetime.fromisoformat(now),
+            episode_number=episode_number,
         )
 
     def get_show_tracks(self, show_id: int) -> list[dict]:
@@ -203,9 +220,9 @@ class Database:
         conn = self._connect()
         rows = conn.execute(
             """
-            SELECT id, week_start, week_end, created_at
+            SELECT id, week_start, week_end, created_at, episode_number
             FROM shows
-            ORDER BY created_at DESC
+            ORDER BY week_start DESC
             LIMIT ? OFFSET ?
             """,
             (limit, offset),
@@ -217,10 +234,28 @@ class Database:
                 week_start=datetime.fromisoformat(row["week_start"]),
                 week_end=datetime.fromisoformat(row["week_end"]),
                 created_at=datetime.fromisoformat(row["created_at"]),
+                episode_number=row["episode_number"],
             )
             for row in rows
         ]
 
+    def get_latest_episode_number(self) -> int | None:
+        conn = self._connect()
+        row = conn.execute(
+            "SELECT MAX(episode_number) as max_ep FROM shows WHERE episode_number IS NOT NULL"
+        ).fetchone()
+        conn.close()
+        return row["max_ep"] if row else None
+
+    def update_show_episode_number(self, show_id: int, episode_number: int) -> None:
+        conn = self._connect()
+        conn.execute(
+            "UPDATE shows SET episode_number = ? WHERE id = ?",
+            (episode_number, show_id),
+        )
+        conn.commit()
+        conn.close()
+
     def has_track_in_show(self, show_id: int, track_id: int) -> bool:
         conn = self._connect()
         row = conn.execute(
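The try/except `ALTER TABLE` migration pattern used in `db.py` is idempotent because SQLite raises `sqlite3.OperationalError` ("duplicate column name") when the column already exists. A quick in-memory demonstration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shows (id INTEGER PRIMARY KEY, week_start TEXT)")

# Run the migration twice: the second ALTER fails with a duplicate-column
# error and is swallowed, so re-initializing an already-migrated DB is safe.
for _ in range(2):
    try:
        conn.execute("ALTER TABLE shows ADD COLUMN episode_number INTEGER")
    except sqlite3.OperationalError:
        pass

cols = [row[1] for row in conn.execute("PRAGMA table_info(shows)")]
print(cols)  # ['id', 'week_start', 'episode_number']
```

The catch is broad: any other `OperationalError` during the ALTER would also be silenced, which is an accepted trade-off for a single-column migration like this one.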
src/ntr_fetcher/main.py

@@ -1,9 +1,12 @@
+import argparse
 import asyncio
 import logging
+from datetime import date
 
 import uvicorn
 
 from ntr_fetcher.api import create_app
+from ntr_fetcher.backfill import run_backfill
 from ntr_fetcher.config import Settings
 from ntr_fetcher.db import Database
 from ntr_fetcher.poller import Poller
@@ -16,13 +19,51 @@ logging.basicConfig(
 logger = logging.getLogger(__name__)
 
 
+def _parse_args() -> argparse.Namespace:
+    parser = argparse.ArgumentParser(description="NtR SoundCloud Fetcher")
+    parser.add_argument(
+        "--init", action="store_true",
+        help="Run historical backfill instead of starting the server",
+    )
+    parser.add_argument(
+        "--show", type=int,
+        help="Anchor episode number (required with --init)",
+    )
+    parser.add_argument(
+        "--aired", type=date.fromisoformat,
+        help="Air date of anchor episode as YYYY-MM-DD (required with --init)",
+    )
+    args = parser.parse_args()
+    if args.init and (args.show is None or args.aired is None):
+        parser.error("--init requires both --show and --aired")
+    return args
+
+
 def run() -> None:
+    args = _parse_args()
     settings = Settings()
 
     db = Database(settings.db_path)
     db.initialize()
     logger.info("Database initialized at %s", settings.db_path)
 
+    if args.init:
+        sc = SoundCloudClient()
+        asyncio.run(
+            run_backfill(
+                db=db,
+                soundcloud=sc,
+                soundcloud_user=settings.soundcloud_user,
+                show_day=settings.show_day,
+                show_hour=settings.show_hour,
+                anchor_episode=args.show,
+                anchor_aired=args.aired,
+            )
+        )
+        asyncio.run(sc.close())
+        logger.info("Backfill complete")
+        return
+
     sc = SoundCloudClient()
     poller = Poller(
         db=db,
src/ntr_fetcher/models.py

@@ -21,6 +21,7 @@ class Show:
     week_start: datetime
     week_end: datetime
     created_at: datetime
+    episode_number: int | None = None
 
 
 @dataclass(frozen=True)
@@ -40,6 +40,13 @@ class Poller:
         week_start, week_end = get_show_week(now, self._show_day, self._show_hour)
         show = self._db.get_or_create_show(week_start, week_end)

+        if show.episode_number is None:
+            latest = self._db.get_latest_episode_number()
+            if latest is not None:
+                new_ep = latest + 1
+                self._db.update_show_episode_number(show.id, new_ep)
+                logger.info("Auto-assigned episode #%d to show %d", new_ep, show.id)
+
         tracks = await self._sc.fetch_likes(
             user_id=user_id,
             since=week_start,
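The assignment rule this hunk adds — never overwrite an existing number, and only increment when some history exists — can be isolated as a pure function. A sketch; `assign_episode_number` and the bare `Show` here are illustrative, not the real API:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Show:
    id: int
    episode_number: Optional[int] = None

def assign_episode_number(show: Show, latest: Optional[int]) -> Optional[int]:
    """Return the number to assign, or None to leave the show unnumbered."""
    if show.episode_number is not None:
        return None  # already assigned, e.g. by backfill
    if latest is None:
        return None  # no history to increment from
    return latest + 1

assert assign_episode_number(Show(5), 530) == 531                       # auto-increment
assert assign_episode_number(Show(1), None) is None                     # no history
assert assign_episode_number(Show(1, episode_number=530), 999) is None  # keep existing
```

These three branches correspond one-to-one with the three poller tests added later in this commit.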
@@ -62,6 +62,7 @@ def test_playlist(client, db):
     resp = client.get("/playlist")
     assert resp.status_code == 200
     data = resp.json()
+    assert "episode_number" in data
     assert len(data["tracks"]) == 2
     assert data["tracks"][0]["position"] == 1
     assert data["tracks"][0]["title"] == "Song A"
@@ -84,14 +85,18 @@ def test_shows_list(client, db):
     _seed_show(db)
     resp = client.get("/shows")
     assert resp.status_code == 200
-    assert len(resp.json()) >= 1
+    data = resp.json()
+    assert len(data) >= 1
+    assert "episode_number" in data[0]


 def test_shows_detail(client, db):
     show = _seed_show(db)
     resp = client.get(f"/shows/{show.id}")
     assert resp.status_code == 200
-    assert len(resp.json()["tracks"]) == 2
+    data = resp.json()
+    assert "episode_number" in data
+    assert len(data["tracks"]) == 2


 def test_admin_refresh_requires_token(client):
198 tests/test_backfill.py Normal file
@@ -0,0 +1,198 @@
+from datetime import date, datetime, timezone
+from unittest.mock import AsyncMock
+
+import pytest
+
+from ntr_fetcher.backfill import _compute_show_weeks, run_backfill
+from ntr_fetcher.db import Database
+from ntr_fetcher.models import Track
+
+
+@pytest.fixture
+def db(tmp_path):
+    database = Database(str(tmp_path / "test.db"))
+    database.initialize()
+    return database
+
+
+def _make_track(id: int, liked_at: str) -> Track:
+    return Track(
+        id=id,
+        title=f"Track {id}",
+        artist="Artist",
+        permalink_url=f"https://soundcloud.com/a/t-{id}",
+        artwork_url=None,
+        duration_ms=180000,
+        license="cc-by",
+        liked_at=datetime.fromisoformat(liked_at),
+        raw_json="{}",
+    )
+
+
+class TestComputeShowWeeks:
+    def test_single_week_anchor_is_today(self):
+        today = date.today()
+        weeks = _compute_show_weeks(today, 100, show_day=2, show_hour=22)
+        assert len(weeks) >= 1
+        assert weeks[0][0] == 100
+
+    def test_multiple_weeks(self):
+        weeks = _compute_show_weeks(
+            anchor_aired=date(2026, 1, 7),
+            anchor_episode=521,
+            show_day=2,
+            show_hour=22,
+        )
+        assert weeks[0][0] == 521
+        assert weeks[1][0] == 522
+        for i, (ep, start, end) in enumerate(weeks):
+            assert ep == 521 + i
+            assert end > start
+
+    def test_week_boundaries_are_utc(self):
+        weeks = _compute_show_weeks(
+            anchor_aired=date(2026, 1, 7),
+            anchor_episode=521,
+            show_day=2,
+            show_hour=22,
+        )
+        for _, start, end in weeks:
+            assert start.tzinfo == timezone.utc
+            assert end.tzinfo == timezone.utc
+
+    def test_consecutive_weeks_are_contiguous(self):
+        weeks = _compute_show_weeks(
+            anchor_aired=date(2026, 1, 7),
+            anchor_episode=521,
+            show_day=2,
+            show_hour=22,
+        )
+        for i in range(len(weeks) - 1):
+            assert weeks[i][2] == weeks[i + 1][1], (
+                f"Week {i} end != week {i+1} start"
+            )
+
+    def test_anchor_in_future_returns_empty(self):
+        future = date(2099, 1, 1)
+        weeks = _compute_show_weeks(future, 999, show_day=2, show_hour=22)
+        assert weeks == []
+
+
+@pytest.mark.asyncio
+async def test_run_backfill_populates_db(db):
+    mock_sc = AsyncMock()
+    mock_sc.resolve_user.return_value = 12345
+
+    t1 = _make_track(1, "2026-01-02T05:00:00+00:00")
+    t2 = _make_track(2, "2026-01-04T15:00:00+00:00")
+    t3 = _make_track(3, "2026-01-09T10:00:00+00:00")
+    mock_sc.fetch_likes.return_value = [t1, t2, t3]
+
+    await run_backfill(
+        db=db,
+        soundcloud=mock_sc,
+        soundcloud_user="nicktherat",
+        show_day=2,
+        show_hour=22,
+        anchor_episode=521,
+        anchor_aired=date(2026, 1, 7),
+    )
+
+    shows = db.list_shows(limit=100, offset=0)
+    assert len(shows) >= 1
+
+    ep_521 = next((s for s in shows if s.episode_number == 521), None)
+    assert ep_521 is not None
+
+    tracks = db.get_show_tracks(ep_521.id)
+    track_ids = [t["track_id"] for t in tracks]
+    assert 1 in track_ids or 2 in track_ids or 3 in track_ids
+
+
+@pytest.mark.asyncio
+async def test_run_backfill_partitions_tracks_by_week(db):
+    mock_sc = AsyncMock()
+    mock_sc.resolve_user.return_value = 12345
+
+    t_week1 = _make_track(10, "2026-01-02T12:00:00+00:00")
+    t_week2 = _make_track(20, "2026-01-10T12:00:00+00:00")
+    mock_sc.fetch_likes.return_value = [t_week1, t_week2]
+
+    await run_backfill(
+        db=db,
+        soundcloud=mock_sc,
+        soundcloud_user="nicktherat",
+        show_day=2,
+        show_hour=22,
+        anchor_episode=521,
+        anchor_aired=date(2026, 1, 7),
+    )
+
+    shows = db.list_shows(limit=100, offset=0)
+    ep_521 = next((s for s in shows if s.episode_number == 521), None)
+    ep_522 = next((s for s in shows if s.episode_number == 522), None)
+
+    if ep_521:
+        tracks_521 = db.get_show_tracks(ep_521.id)
+        ids_521 = {t["track_id"] for t in tracks_521}
+    else:
+        ids_521 = set()
+
+    if ep_522:
+        tracks_522 = db.get_show_tracks(ep_522.id)
+        ids_522 = {t["track_id"] for t in tracks_522}
+    else:
+        ids_522 = set()
+
+    assert ids_521 & ids_522 == set(), "Tracks should not appear in multiple weeks"
+
+
+@pytest.mark.asyncio
+async def test_run_backfill_no_tracks(db):
+    mock_sc = AsyncMock()
+    mock_sc.resolve_user.return_value = 12345
+    mock_sc.fetch_likes.return_value = []
+
+    await run_backfill(
+        db=db,
+        soundcloud=mock_sc,
+        soundcloud_user="nicktherat",
+        show_day=2,
+        show_hour=22,
+        anchor_episode=521,
+        anchor_aired=date(2026, 1, 7),
+    )
+
+    shows = db.list_shows(limit=100, offset=0)
+    assert len(shows) >= 1
+    for show in shows:
+        tracks = db.get_show_tracks(show.id)
+        assert len(tracks) == 0
+
+
+@pytest.mark.asyncio
+async def test_run_backfill_idempotent(db):
+    """Running backfill twice with the same data shouldn't duplicate shows."""
+    mock_sc = AsyncMock()
+    mock_sc.resolve_user.return_value = 12345
+    mock_sc.fetch_likes.return_value = [
+        _make_track(1, "2026-01-05T12:00:00+00:00"),
+    ]
+
+    kwargs = dict(
+        db=db,
+        soundcloud=mock_sc,
+        soundcloud_user="nicktherat",
+        show_day=2,
+        show_hour=22,
+        anchor_episode=521,
+        anchor_aired=date(2026, 1, 7),
+    )
+
+    await run_backfill(**kwargs)
+    count_first = len(db.list_shows(limit=1000, offset=0))
+
+    await run_backfill(**kwargs)
+    count_second = len(db.list_shows(limit=1000, offset=0))
+
+    assert count_first == count_second
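The body of `_compute_show_weeks` is not part of this diff. The following sketch satisfies the properties the tests above assert — UTC boundaries, contiguous weeks, episode numbers incrementing from the anchor, and an empty result for a future anchor. Day-of-week alignment via `show_day` is elided for brevity, and the function name here is illustrative:

```python
from datetime import date, datetime, timedelta, timezone

def compute_show_weeks(anchor_aired, anchor_episode, show_day, show_hour):
    # Sketch only: one (episode, week_start, week_end) tuple per weekly
    # show from the anchor date forward to today. show_day alignment is
    # omitted; the anchor date itself is treated as the first show day.
    today = date.today()
    if anchor_aired > today:
        return []
    start = datetime(anchor_aired.year, anchor_aired.month, anchor_aired.day,
                     show_hour, tzinfo=timezone.utc)
    weeks, episode = [], anchor_episode
    while start.date() <= today:
        end = start + timedelta(days=7)  # contiguous: next start == this end
        weeks.append((episode, start, end))
        episode += 1
        start = end
    return weeks

weeks = compute_show_weeks(date(2024, 1, 3), 521, show_day=2, show_hour=22)
assert weeks[0][0] == 521 and weeks[1][0] == 522
assert all(w[1].tzinfo == timezone.utc for w in weeks)
assert all(weeks[i][2] == weeks[i + 1][1] for i in range(len(weeks) - 1))
assert compute_show_weeks(date(2099, 1, 1), 999, 2, 22) == []
```

Contiguity matters because the partition test asserts no track lands in two weeks: half-open `[start, end)` buckets that share boundaries cover the timeline without overlap.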
@@ -226,6 +226,52 @@ def test_add_track_to_show_at_position(db):
     assert tracks[2]["track_id"] == 2


+def test_get_or_create_show_with_episode_number(db):
+    week_start = datetime(2026, 1, 8, 3, 0, 0, tzinfo=timezone.utc)
+    week_end = datetime(2026, 1, 15, 3, 0, 0, tzinfo=timezone.utc)
+    show = db.get_or_create_show(week_start, week_end, episode_number=521)
+    assert show.episode_number == 521
+    show2 = db.get_or_create_show(week_start, week_end)
+    assert show2.id == show.id
+    assert show2.episode_number == 521
+
+
+def test_get_or_create_show_updates_episode_number(db):
+    week_start = datetime(2026, 1, 8, 3, 0, 0, tzinfo=timezone.utc)
+    week_end = datetime(2026, 1, 15, 3, 0, 0, tzinfo=timezone.utc)
+    show = db.get_or_create_show(week_start, week_end)
+    assert show.episode_number is None
+    show2 = db.get_or_create_show(week_start, week_end, episode_number=521)
+    assert show2.id == show.id
+    assert show2.episode_number == 521
+
+
+def test_get_latest_episode_number(db):
+    assert db.get_latest_episode_number() is None
+    db.get_or_create_show(
+        datetime(2026, 1, 8, 3, 0, 0, tzinfo=timezone.utc),
+        datetime(2026, 1, 15, 3, 0, 0, tzinfo=timezone.utc),
+        episode_number=521,
+    )
+    assert db.get_latest_episode_number() == 521
+    db.get_or_create_show(
+        datetime(2026, 1, 15, 3, 0, 0, tzinfo=timezone.utc),
+        datetime(2026, 1, 22, 3, 0, 0, tzinfo=timezone.utc),
+        episode_number=522,
+    )
+    assert db.get_latest_episode_number() == 522
+
+
+def test_update_show_episode_number(db):
+    week_start = datetime(2026, 1, 8, 3, 0, 0, tzinfo=timezone.utc)
+    week_end = datetime(2026, 1, 15, 3, 0, 0, tzinfo=timezone.utc)
+    show = db.get_or_create_show(week_start, week_end)
+    assert show.episode_number is None
+    db.update_show_episode_number(show.id, 521)
+    show2 = db.get_or_create_show(week_start, week_end)
+    assert show2.episode_number == 521
+
+
 def test_has_track_in_show(db):
     week_start = datetime(2026, 3, 13, 2, 0, 0, tzinfo=timezone.utc)
     week_end = datetime(2026, 3, 20, 2, 0, 0, tzinfo=timezone.utc)
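One way `get_latest_episode_number` could be backed (illustrative schema and query, not necessarily the project's actual SQL): `MAX()` skips NULLs from unnumbered shows and yields NULL — surfaced to Python as `None` — on an empty table, which is exactly the contract the test above checks:

```python
import sqlite3

# Minimal stand-in for the shows table; real schema has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shows (id INTEGER PRIMARY KEY, episode_number INTEGER)"
)

def get_latest_episode_number(conn):
    row = conn.execute("SELECT MAX(episode_number) FROM shows").fetchone()
    return row[0]  # None when the table is empty or all values are NULL

assert get_latest_episode_number(conn) is None
conn.execute("INSERT INTO shows (episode_number) VALUES (521)")
conn.execute("INSERT INTO shows (episode_number) VALUES (NULL)")  # unnumbered show
conn.execute("INSERT INTO shows (episode_number) VALUES (522)")
assert get_latest_episode_number(conn) == 522
```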
@@ -93,6 +93,89 @@ async def test_poll_once_removes_unliked_tracks():
     assert call_args[0][1] == [1]


+@pytest.mark.asyncio
+async def test_poll_once_auto_assigns_episode_number():
+    mock_sc = AsyncMock()
+    mock_sc.resolve_user.return_value = 206979918
+    mock_sc.fetch_likes.return_value = [
+        _make_track(1, "2026-03-14T01:00:00+00:00"),
+    ]
+
+    mock_db = MagicMock()
+    mock_show = MagicMock()
+    mock_show.id = 5
+    mock_show.episode_number = None
+    mock_db.get_or_create_show.return_value = mock_show
+    mock_db.get_latest_episode_number.return_value = 530
+
+    poller = Poller(
+        db=mock_db,
+        soundcloud=mock_sc,
+        soundcloud_user="nicktherat",
+        show_day=2,
+        show_hour=22,
+        poll_interval=3600,
+    )
+
+    await poller.poll_once()
+
+    mock_db.update_show_episode_number.assert_called_once_with(5, 531)
+
+
+@pytest.mark.asyncio
+async def test_poll_once_skips_numbering_when_no_history():
+    mock_sc = AsyncMock()
+    mock_sc.resolve_user.return_value = 206979918
+    mock_sc.fetch_likes.return_value = []
+
+    mock_db = MagicMock()
+    mock_show = MagicMock()
+    mock_show.id = 1
+    mock_show.episode_number = None
+    mock_db.get_or_create_show.return_value = mock_show
+    mock_db.get_latest_episode_number.return_value = None
+
+    poller = Poller(
+        db=mock_db,
+        soundcloud=mock_sc,
+        soundcloud_user="nicktherat",
+        show_day=2,
+        show_hour=22,
+        poll_interval=3600,
+    )
+
+    await poller.poll_once()
+
+    mock_db.update_show_episode_number.assert_not_called()
+
+
+@pytest.mark.asyncio
+async def test_poll_once_skips_numbering_when_already_assigned():
+    mock_sc = AsyncMock()
+    mock_sc.resolve_user.return_value = 206979918
+    mock_sc.fetch_likes.return_value = []
+
+    mock_db = MagicMock()
+    mock_show = MagicMock()
+    mock_show.id = 1
+    mock_show.episode_number = 530
+    mock_db.get_or_create_show.return_value = mock_show
+
+    poller = Poller(
+        db=mock_db,
+        soundcloud=mock_sc,
+        soundcloud_user="nicktherat",
+        show_day=2,
+        show_hour=22,
+        poll_interval=3600,
+    )
+
+    await poller.poll_once()
+
+    mock_db.get_latest_episode_number.assert_not_called()
+    mock_db.update_show_episode_number.assert_not_called()
+
+
 @pytest.mark.asyncio
 async def test_poll_once_full_refresh():
     mock_sc = AsyncMock()