Add design doc and SoundCloud API reference
Design for a SoundCloud likes fetcher service that builds weekly playlists for Nick the Rat Radio and serves them via JSON API. Made-with: Cursor
5
.gitignore
vendored
Normal file
@@ -0,0 +1,5 @@
.DS_Store
*.db
.env
__pycache__/
*.pyc
224
docs/plans/2026-03-12-ntr-soundcloud-fetcher-design.md
Normal file
@@ -0,0 +1,224 @@
# NtR SoundCloud Fetcher — Design Document

> **Date**: 2026-03-12
> **Status**: Approved

## Purpose

A service that periodically fetches SoundCloud likes from NicktheRat's profile, builds weekly playlists aligned to his Wednesday 22:00 ET show schedule, and exposes them via a JSON API for an IRC bot to query track info by position number (`!1`, `!2`, etc.).

## Architecture

Single Python process with three internal responsibilities:

1. **API server** — FastAPI on a configurable port, serves playlist data as JSON.
2. **Poller** — async background task that fetches Nick's SoundCloud likes every hour.
3. **Supervisor** — monitors the poller task, restarts it on failure without affecting the API.

The poller and API run as independent `asyncio` tasks. If the poller crashes, the supervisor catches the exception, logs it, waits a backoff period, and restarts the poller. The API continues serving from the last-known-good SQLite data.

External process management (systemd with `Restart=on-failure`) handles whole-process crashes. The service does not try to be its own process manager.
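
The supervisor behavior described above could be sketched roughly as follows. This is a minimal illustration, not the project's code: the `supervise` function, its parameters, and the `max_restarts` knob are hypothetical names introduced here, and the 30-second default mirrors the backoff listed under error scenarios.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Optional

log = logging.getLogger("ntr_fetcher.supervisor")


async def supervise(
    poller: Callable[[], Awaitable[None]],
    backoff_seconds: float = 30.0,
    max_restarts: Optional[int] = None,
) -> int:
    """Run `poller`, restarting it after a crash; return the restart count.

    `max_restarts` is a hypothetical knob so a demo can terminate; a
    long-running service would leave it as None and run until cancelled.
    """
    restarts = 0
    while True:
        try:
            await poller()
            return restarts  # the poller exited cleanly
        except asyncio.CancelledError:
            raise  # shutdown was requested: never swallow cancellation
        except Exception:
            log.exception("poller crashed; restarting in %ss", backoff_seconds)
            if max_restarts is not None and restarts >= max_restarts:
                raise
            restarts += 1
            await asyncio.sleep(backoff_seconds)
```

Because the loop only wraps the poller coroutine, an exception there never propagates into the task that runs the API server.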

### Startup sequence

1. Open/create SQLite database, run migrations.
2. Check whether the current week's playlist exists. If it is missing or stale, do an immediate fetch.
3. Start the API server.
4. Start the poller on its hourly schedule.

## Data Model (SQLite)

### `tracks`

Canonical store of every SoundCloud track Nick has liked.

| Column | Type | Notes |
|--------|------|-------|
| `id` | INTEGER PK | SoundCloud track ID (not auto-increment) |
| `title` | TEXT | Track title |
| `artist` | TEXT | `track.user.username` from the API |
| `permalink_url` | TEXT | Full SoundCloud URL |
| `artwork_url` | TEXT | Nullable |
| `duration_ms` | INTEGER | Duration in milliseconds |
| `license` | TEXT | e.g. `cc-by-sa` |
| `liked_at` | TEXT | ISO 8601 — when Nick liked it |
| `raw_json` | TEXT | Full track JSON blob |

### `shows`

One row per weekly show.

| Column | Type | Notes |
|--------|------|-------|
| `id` | INTEGER PK | Auto-increment |
| `week_start` | TEXT | ISO 8601 UTC of the Wednesday 22:00 ET boundary that opens this week |
| `week_end` | TEXT | ISO 8601 UTC of the next Wednesday 22:00 ET boundary |
| `created_at` | TEXT | When this row was created |

### `show_tracks`

Join table linking tracks to shows with position.

| Column | Type | Notes |
|--------|------|-------|
| `show_id` | INTEGER FK | References `shows.id` |
| `track_id` | INTEGER FK | References `tracks.id` |
| `position` | INTEGER | 1-indexed — maps to `!1`, `!2`, etc. |
| UNIQUE | | `(show_id, track_id)` |

Position assignment: likes sorted by `liked_at` ascending (oldest first), positions assigned 1, 2, 3... New likes mid-week get the next position; existing positions never shift.

Once a track is assigned a position in a show, it stays even if Nick unlikes it. Admin endpoints exist for manual corrections.
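
As an illustration of the append-only position rule, the sketch below assigns positions with the standard-library `sqlite3` module. It is a simplified assumption rather than the actual `db.py`: the `assign_positions` helper is hypothetical, and the schema covers only `show_tracks` (foreign keys to `shows` and `tracks` are omitted for brevity).

```python
import sqlite3

# Reduced schema: only the join table, without the foreign-key clauses.
SCHEMA = """
CREATE TABLE show_tracks (
    show_id  INTEGER NOT NULL,
    track_id INTEGER NOT NULL,
    position INTEGER NOT NULL,
    UNIQUE (show_id, track_id)
);
"""


def assign_positions(conn, show_id, likes):
    """Append unseen likes to a show, oldest first; return {track_id: position}.

    `likes` is an iterable of (track_id, liked_at) pairs. liked_at is assumed
    to be an ISO 8601 UTC string in one consistent format, so that string
    order equals chronological order.
    """
    known = {
        row[0]
        for row in conn.execute(
            "SELECT track_id FROM show_tracks WHERE show_id = ?", (show_id,)
        )
    }
    (last,) = conn.execute(
        "SELECT COALESCE(MAX(position), 0) FROM show_tracks WHERE show_id = ?",
        (show_id,),
    ).fetchone()
    assigned = {}
    for track_id, _liked_at in sorted(likes, key=lambda like: like[1]):
        if track_id in known:
            continue  # already positioned: existing positions never shift
        last += 1
        conn.execute(
            "INSERT INTO show_tracks (show_id, track_id, position) VALUES (?, ?, ?)",
            (show_id, track_id, last),
        )
        known.add(track_id)
        assigned[track_id] = last
    conn.commit()
    return assigned
```

Since the function only ever appends after the current maximum position, a track that disappears from a later fetch (for example, because Nick unliked it) keeps its row and its number.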

## API Endpoints

All endpoints return JSON. Base URL: `http://localhost:{port}`.

### Current Week

| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/playlist` | Current week's full playlist |
| `GET` | `/playlist/{position}` | Single track by position |

`GET /playlist` response shape:

```json
{
  "show_id": 12,
  "week_start": "2026-03-12T02:00:00Z",
  "week_end": "2026-03-19T02:00:00Z",
  "tracks": [
    {
      "position": 1,
      "title": "Running Through My Mind",
      "artist": "Purrple Panther",
      "permalink_url": "https://soundcloud.com/...",
      "artwork_url": "https://...",
      "duration_ms": 202909,
      "liked_at": "2026-03-09T02:24:30Z"
    }
  ]
}
```

`GET /playlist/{position}` returns a single track object. 404 if the position doesn't exist.

### History

| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/shows` | List all shows, newest first. Supports `?limit=` and `?offset=`. |
| `GET` | `/shows/{show_id}` | Full playlist for a specific show |

### Admin (bearer token required)

| Method | Path | Description |
|--------|------|-------------|
| `POST` | `/admin/refresh` | Trigger immediate SoundCloud fetch. `{"full": true}` re-fetches the entire week; default is incremental. |
| `POST` | `/admin/tracks` | Add a track to the current show. Body: `{"soundcloud_url": "..."}` or `{"track_id": 12345, "position": 5}` |
| `DELETE` | `/admin/tracks/{track_id}` | Remove a track from the current show. Remaining positions re-compact. |
| `PUT` | `/admin/tracks/{track_id}/position` | Move a track to a different position. Body: `{"position": 3}` |

### Health

| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/health` | Poller status, last fetch time, current week track count |

Admin endpoints are protected by a bearer token (`NTR_ADMIN_TOKEN`). Read endpoints require no authentication; the service is intended to be reachable from localhost only.
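
One way the bearer-token check might look, independent of the web framework, is shown below. The `is_authorized` helper is a hypothetical name; the document does not say how the comparison is done, so the constant-time `hmac.compare_digest` call is an assumption about good practice rather than a description of the real code.

```python
import hmac


def is_authorized(authorization_header, admin_token):
    """Return True if the header carries the expected bearer token.

    `authorization_header` is the raw value of the HTTP Authorization header
    (or None if it is absent); `admin_token` is the NTR_ADMIN_TOKEN setting.
    """
    scheme, _, presented = (authorization_header or "").partition(" ")
    if scheme.lower() != "bearer" or not admin_token:
        return False
    # compare_digest avoids leaking the token length or prefix through timing.
    return hmac.compare_digest(presented.strip().encode(), admin_token.encode())
```

An admin route would call this before doing any work and respond with 401 when it returns `False`.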

## Poller & Fetcher Logic

### Polling cycle

1. Compute current week's boundary window (Wednesday 22:00 ET -> next Wednesday 22:00 ET, converted to UTC accounting for DST via `zoneinfo`).
2. Ensure a `shows` row exists for this window.
3. Fetch likes from SoundCloud.
4. Upsert new tracks, assign positions, update `show_tracks`.
5. Sleep for the configured interval.
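
Step 1 of the cycle above, the DST-aware week window, might be computed along these lines. This is a sketch under assumptions: the `week_window` function is a hypothetical name, "ET" is taken to mean the `America/New_York` zone, and the defaults mirror `NTR_SHOW_DAY` and `NTR_SHOW_HOUR`.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("America/New_York")  # assumed meaning of "ET"


def week_window(now_utc, show_day=2, show_hour=22):
    """Return (week_start, week_end) in UTC for the show week containing now_utc.

    `now_utc` must be a timezone-aware datetime. `show_day` uses Python's
    weekday numbering (0 = Monday, 2 = Wednesday).
    """
    local_now = now_utc.astimezone(EASTERN)
    days_back = (local_now.weekday() - show_day) % 7
    # Arithmetic on an aware datetime is wall-clock arithmetic, so this lands
    # on 22:00 local time regardless of any DST change in between.
    start_local = (local_now - timedelta(days=days_back)).replace(
        hour=show_hour, minute=0, second=0, microsecond=0
    )
    if start_local.astimezone(timezone.utc) > now_utc:
        start_local -= timedelta(days=7)  # show day, but before the show hour
    end_local = start_local + timedelta(days=7)
    return start_local.astimezone(timezone.utc), end_local.astimezone(timezone.utc)
```

For a `now_utc` of 2026-03-12 12:00 UTC this yields the same `week_start` and `week_end` as the sample `GET /playlist` response. A week that spans a DST change is 167 or 169 hours long in UTC, which is why the boundaries are computed in local time and only then converted.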

### Incremental fetching

The poller does not re-fetch all likes every hour. It uses cursor-seeking:

- **First fetch for a new week**: craft a synthetic cursor at the week's end boundary, paginate backward until hitting the week's start boundary.
- **Subsequent fetches**: craft a cursor at "now", paginate backward until hitting a track already in the database. Most hourly polls fetch a single page or zero pages.
- **Full refresh** (`POST /admin/refresh` with `{"full": true}`): re-fetches the entire week from scratch, same as the first-fetch path.
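
The backward pagination with its two stop conditions can be sketched as below. The cursor format and the SoundCloud response shape are not specified here, so they are abstracted behind a hypothetical `fetch_page` callable; the `fetch_new_likes` name and the like dictionaries are likewise assumptions made for illustration.

```python
def fetch_new_likes(fetch_page, known_ids, week_start, start_cursor):
    """Walk likes newest-first until a known track or the week start is hit.

    fetch_page(cursor) is a hypothetical callable returning
    (likes, next_cursor). Each like is a dict with "track_id" and "liked_at"
    (ISO 8601 UTC strings in one consistent format, so they compare
    chronologically); next_cursor is None after the last page.
    """
    new_likes = []
    cursor = start_cursor
    while cursor is not None:
        likes, cursor = fetch_page(cursor)
        for like in likes:
            if like["track_id"] in known_ids or like["liked_at"] < week_start:
                return new_likes  # everything older is stored or out of range
            new_likes.append(like)
    return new_likes
```

Passing an empty `known_ids` set with a cursor at the week's end boundary gives the first-fetch and full-refresh path; passing the stored track IDs with a cursor at "now" gives the incremental path.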

### `client_id` management

- Extract from `soundcloud.com` HTML (`__sc_hydration` -> `apiClient` -> `id`) on startup.
- Cache in memory (not persisted — rotates too frequently).
- On any 401 response, re-extract and retry.
- If re-extraction fails, log the error and let the next tick retry.
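
A possible shape for the extraction step is sketched below. The exact page layout is an assumption: the code presumes the HTML contains an assignment `__sc_hydration = [...]` whose array holds objects with a `hydratable` name and a `data` payload, which is one reading of the `__sc_hydration` -> `apiClient` -> `id` path above. The `extract_client_id` name is hypothetical.

```python
import json
import re

# Assumed layout: `window.__sc_hydration = [ {...}, {...} ];` inside a script tag.
_HYDRATION_START = re.compile(r"__sc_hydration\s*=\s*")


def extract_client_id(html):
    """Return the apiClient id found in the page HTML, or None if not found."""
    match = _HYDRATION_START.search(html)
    if match is None:
        return None
    try:
        # raw_decode parses one JSON value and ignores whatever follows it.
        entries, _ = json.JSONDecoder().raw_decode(html, match.end())
    except json.JSONDecodeError:
        return None
    if not isinstance(entries, list):
        return None
    for entry in entries:
        if isinstance(entry, dict) and entry.get("hydratable") == "apiClient":
            data = entry.get("data")
            if isinstance(data, dict):
                return data.get("id")
    return None
```

Returning `None` instead of raising keeps the caller's policy simple: log the failure, skip this tick, and try again on the next one.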

### Retry & backoff

Each SoundCloud HTTP call: 3 attempts, exponential backoff (2s, 4s, 8s). 401s trigger `client_id` refresh before retry (doesn't count against attempts). Request timeout: 15 seconds.
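
The retry policy could be expressed as a small wrapper like the one below. All names here (`call_with_retry`, `Unauthorized`, `Retryable`) are hypothetical, and two details are interpretations rather than stated design: the sketch sleeps only between attempts (so three attempts wait 2s and then 4s), and it allows a single `client_id` refresh per call so that repeated 401s cannot loop forever.

```python
import asyncio


class Unauthorized(Exception):
    """Hypothetical marker for an HTTP 401 from SoundCloud."""


class Retryable(Exception):
    """Hypothetical marker for 5xx responses and network timeouts."""


async def call_with_retry(
    request, refresh_client_id, attempts=3, base_delay=2.0, sleep=asyncio.sleep
):
    """Await request() with exponential backoff and one client_id refresh."""
    failures = 0
    refreshed = False
    while True:
        try:
            return await request()
        except Unauthorized:
            if refreshed:
                raise  # still 401 with a fresh client_id: give up
            await refresh_client_id()
            refreshed = True  # a refresh is not counted as a failed attempt
        except Retryable:
            failures += 1
            if failures >= attempts:
                raise  # the caller skips this tick
            await sleep(base_delay * 2 ** (failures - 1))
```

The `sleep` parameter exists only so tests can substitute a fake and run instantly. The 15-second request timeout would be set on the HTTP client itself, not in this wrapper.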

### Error scenarios

| Scenario | Behavior |
|----------|----------|
| SoundCloud 401 | Refresh `client_id`, retry |
| SoundCloud 429 | Back off, retry next tick |
| SoundCloud 5xx | Retry with backoff, skip tick after 3 failures |
| Network timeout | Same as 5xx |
| `client_id` extraction failure | Log error, skip tick, retry next hour |
| Poller task crash | Supervisor restarts after 30s backoff |
| Nick unlikes a track | Track stays in show — positions are stable |

## Project Structure

```
NtR-soundcloud-fetcher/
├── docs/
│   ├── soundcloud-likes-api.md
│   └── plans/
│       └── 2026-03-12-ntr-soundcloud-fetcher-design.md
├── src/
│   └── ntr_fetcher/
│       ├── __init__.py
│       ├── main.py          # entry point — starts API + poller
│       ├── config.py        # settings from env vars / .env
│       ├── db.py            # SQLite connection, migrations, queries
│       ├── models.py        # dataclasses for Track, Show, ShowTrack
│       ├── soundcloud.py    # client_id extraction, likes fetching
│       ├── poller.py        # polling loop + supervisor
│       ├── api.py           # FastAPI routes
│       └── week.py          # week boundary computation (ET → UTC)
├── tests/
├── pyproject.toml
└── README.md
```

## Dependencies

| Package | Purpose |
|---------|---------|
| `fastapi` | API framework |
| `uvicorn` | ASGI server |
| `httpx` | Async HTTP client for SoundCloud |
| `pydantic` | Config + response models (installed as a FastAPI dependency) |

Standard library: `sqlite3`, `zoneinfo`, `asyncio`, `dataclasses`, `json`.

No ORM. Raw SQL via `sqlite3`, wrapped in `asyncio.to_thread` for async compatibility.
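
The `asyncio.to_thread` wrapping might look like the sketch below. The `query` helper and the connection-per-call approach are assumptions chosen for simplicity: `sqlite3` connections refuse use from a thread other than the one that created them by default, so opening the connection inside the worker function sidesteps that check.

```python
import asyncio
import sqlite3


def _query(db_path, sql, params):
    # Runs in a worker thread. The connection is opened and closed here so it
    # never crosses threads.
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(sql, params).fetchall()
        conn.commit()
        return rows
    finally:
        conn.close()


async def query(db_path, sql, params=()):
    """Run one SQL statement without blocking the event loop."""
    return await asyncio.to_thread(_query, db_path, sql, params)
```

A route handler or the poller would then write, for example, `rows = await query(path, "SELECT ... WHERE show_id = ?", (show_id,))`.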

Python 3.11+.

## Configuration

Environment variables, loaded by pydantic `BaseSettings` (in Pydantic v2 this class lives in the separate `pydantic-settings` package). Supports `.env` file.

| Variable | Default | Description |
|----------|---------|-------------|
| `NTR_PORT` | `8000` | API listen port |
| `NTR_HOST` | `127.0.0.1` | API bind address |
| `NTR_DB_PATH` | `./ntr_fetcher.db` | SQLite database file path |
| `NTR_POLL_INTERVAL_SECONDS` | `3600` | Polling interval |
| `NTR_ADMIN_TOKEN` | (required) | Bearer token for admin endpoints |
| `NTR_SOUNDCLOUD_USER` | `nicktherat` | SoundCloud username to track |
| `NTR_SHOW_DAY` | `2` | Day of week (0=Mon, 2=Wed) |
| `NTR_SHOW_HOUR` | `22` | Hour in Eastern Time |
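
To make the variable-to-setting mapping concrete, here is a standard-library stand-in for the settings loader. It deliberately swaps pydantic `BaseSettings` for a plain dataclass so that it runs without third-party packages; the `Settings` and `load_settings` names are hypothetical, and `.env` file loading is not covered.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    admin_token: str
    port: int = 8000
    host: str = "127.0.0.1"
    db_path: str = "./ntr_fetcher.db"
    poll_interval_seconds: int = 3600
    soundcloud_user: str = "nicktherat"
    show_day: int = 2   # 0 = Monday, 2 = Wednesday
    show_hour: int = 22  # hour in Eastern Time


def load_settings(env=None):
    """Build Settings from NTR_* variables, using the documented defaults."""
    env = os.environ if env is None else env
    if not env.get("NTR_ADMIN_TOKEN"):
        raise ValueError("NTR_ADMIN_TOKEN is required")
    return Settings(
        admin_token=env["NTR_ADMIN_TOKEN"],
        port=int(env.get("NTR_PORT", 8000)),
        host=env.get("NTR_HOST", "127.0.0.1"),
        db_path=env.get("NTR_DB_PATH", "./ntr_fetcher.db"),
        poll_interval_seconds=int(env.get("NTR_POLL_INTERVAL_SECONDS", 3600)),
        soundcloud_user=env.get("NTR_SOUNDCLOUD_USER", "nicktherat"),
        show_day=int(env.get("NTR_SHOW_DAY", 2)),
        show_hour=int(env.get("NTR_SHOW_HOUR", 22)),
    )
```

Failing fast on a missing `NTR_ADMIN_TOKEN` matches its "(required)" entry in the table: the process refuses to start rather than exposing unprotected admin endpoints.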
1
docs/soundcloud-likes-api.md
Symbolic link
@@ -0,0 +1 @@
/Users/erikfredericks/dev-ai/one-offs/soundcloud-reverse-engineering/docs/soundcloud-likes-api.md