Initial release v1.0.0
@@ -0,0 +1,86 @@
|
||||
# Lidify - Feature Overview
|
||||
|
||||
A self-hosted music streaming platform with intelligent discovery, podcast support, audiobooks, and a unique vibe-matching system.
|
||||
|
||||
---
|
||||
|
||||
## Music Discovery
|
||||
|
||||
**Discover Weekly** - AI-generated weekly playlists based on your listening history. Customize parameters like track duration and date ranges to fine-tune your discoveries.
|
||||
|
||||
**Smart Mixes** - Automatically generated playlists including era-based mixes (90s, 2000s), genre mixes, top tracks, rediscover mix (songs you haven't played in a while), and artist similarity mixes.
|
||||
|
||||
**Library Radio Stations** - One-click radio modes including Shuffle All, Workout (high energy), Discovery (lesser-played gems), and Favorites (most played). Genre and decade-based radio stations are dynamically created from your library.
|
||||
|
||||
**Similar Artists & Recommendations** - Powered by Last.fm integration, discover artists similar to ones you love and get personalized recommendations based on your listening habits.
|
||||
|
||||
---
|
||||
|
||||
## The Vibe System
|
||||
|
||||
**Vibe Button** - The standout feature. While listening to any track, tap the vibe button to see real-time audio analysis including energy, mood (valence), danceability, tempo, and arousal levels displayed on a visual radar chart.
|
||||
|
||||
**Keep The Vibe Going** - Uses ML mood predictions (Happy, Sad, Relaxed, Aggressive, Party, Acoustic, Electronic) to queue tracks that match your current vibe with a match percentage score.
|
||||
|
||||
**Mood Mixer** - Create custom playlists by adjusting mood sliders or using presets like Happy, Energetic, Chill, Focus, or Workout. The system finds tracks in your library matching your desired vibe.
|
||||
|
||||
---
|
||||
|
||||
## Playlist Import
|
||||
|
||||
**Spotify Import** - Paste any Spotify playlist URL to import. Preview shows which tracks match your library, which albums need downloading, and which tracks have no matches. Selectively download what you need.
|
||||
|
||||
**Deezer Import** - Same functionality for Deezer playlists. Browse Deezer's featured and genre playlists directly in-app.
|
||||
|
||||
---
|
||||
|
||||
## Podcasts
|
||||
|
||||
**Full Podcast Support** - Search iTunes for podcasts, subscribe, and manage your library. Browse top podcasts and discover by genre (Comedy, True Crime, News, Business, Sports, etc.).
|
||||
|
||||
**Progress Tracking** - Continue Listening resumes each subscribed show exactly where you left off.
|
||||
|
||||
---
|
||||
|
||||
## Audiobooks
|
||||
|
||||
**Audiobookshelf Integration** - Connect your Audiobookshelf instance to browse and play your audiobook collection directly in Lidify.
|
||||
|
||||
**Smart Organization** - Filter by currently listening or finished books. Group by series with proper sequence ordering. Sort by title, author, or recently played.
|
||||
|
||||
**Progress Sync** - Seamless progress tracking so you never lose your place.
|
||||
|
||||
---
|
||||
|
||||
## Library Management
|
||||
|
||||
**Multi-View Library** - Browse by Artists, Albums, or Tracks with flexible sorting and filtering options.
|
||||
|
||||
**Smart Filters** - View owned music, discovery tracks (from Discover Weekly), or everything combined.
|
||||
|
||||
**Bulk Operations** - Delete artists, albums, or tracks with confirmation. Paginated views for large libraries.
|
||||
|
||||
---
|
||||
|
||||
## Player
|
||||
|
||||
**Adaptive Player** - Full-width desktop player, mini player for mobile, and an immersive overlay mode.
|
||||
|
||||
**Universal Playback** - Single unified player handles music, podcasts, and audiobooks with type-specific controls.
|
||||
|
||||
**Queue Management** - Full control over what's playing next with shuffle and repeat modes.
|
||||
|
||||
---
|
||||
|
||||
## Additional Features
|
||||
|
||||
- Global search across artists, albums, and tracks
|
||||
- Featured playlists from Deezer
|
||||
- Recently added and popular artists sections
|
||||
- Create and share playlists with other users
|
||||
- MusicBrainz integration for accurate metadata
|
||||
- Clean, responsive UI that works on desktop, tablet, and mobile
|
||||
|
||||
---
|
||||
|
||||
*Lidify is self-hosted, giving you full control over your music library with the discovery features of commercial streaming services.*
|
||||
@@ -0,0 +1,212 @@
|
||||
# Lidify Testing Checklist
|
||||
|
||||
Use this checklist when testing Lidify before releases or after major changes.
|
||||
|
||||
## ✅ Automated Pre-Deploy Smoke Test (Recommended)
|
||||
|
||||
This repo includes a one-command smoke test that covers the **core** flows (API + UI). It intentionally skips “hard” items like lock-screen media controls, background playback on real devices, etc.
|
||||
|
||||
### Run (one command)
|
||||
|
||||
```bash
./scripts/predeploy-test.sh
```
|
||||
|
||||
### Notes
|
||||
|
||||
- **Requires music in `MUSIC_PATH`** (or `./music`) with at least one track, otherwise playback/playlist-related checks will fail.
|
||||
- **Environment overrides** (optional):
  - `LIDIFY_UI_BASE_URL` (default `http://127.0.0.1:3030`)
  - `LIDIFY_API_BASE_URL` (default `http://127.0.0.1:3006`)
  - `LIDIFY_TEST_USERNAME` / `LIDIFY_TEST_PASSWORD`
  - `LIDIFY_TEARDOWN=0` to keep containers running after the script finishes
|
||||
|
||||
## 🎵 Audio Playback
|
||||
|
||||
### Music (Tracks)
|
||||
|
||||
- [ ] Play a track from an album
|
||||
- [ ] Play/pause toggle works
|
||||
- [ ] Seeking works (drag the progress bar)
|
||||
- [ ] Fast forward (10s) works
|
||||
- [ ] Rewind (10s) works
|
||||
- [ ] Next track works
|
||||
- [ ] Previous track works
|
||||
- [ ] Volume slider works
|
||||
- [ ] Mute toggle works
|
||||
- [ ] Shuffle toggle works (plays random order)
|
||||
- [ ] Repeat modes work (off, repeat all, repeat one)
|
||||
- [ ] Queue displays correctly
|
||||
- [ ] Removing tracks from queue works
|
||||
|
||||
### Podcasts
|
||||
|
||||
- [ ] Play a podcast episode
|
||||
- [ ] Seeking works (when cached)
|
||||
- [ ] Progress saves when pausing
|
||||
- [ ] Progress resumes on different device/browser
|
||||
- [ ] Can seek far ahead after episode is fully cached/downloaded
|
||||
- [ ] Subscribing to a new podcast works
|
||||
- [ ] Unsubscribing from a podcast works
|
||||
- [ ] Episode list loads correctly
|
||||
|
||||
### Audiobooks
|
||||
|
||||
- [ ] Play an audiobook (requires Audiobookshelf integration)
|
||||
- [ ] Progress saves automatically
|
||||
- [ ] Can resume from saved position
|
||||
- [ ] Reset progress works
|
||||
- [ ] Mark as complete works
|
||||
|
||||
### Cross-Device Sync
|
||||
|
||||
- [ ] Start playing on desktop, resume on mobile (or vice versa)
|
||||
- [ ] Queue syncs between devices
|
||||
|
||||
---
|
||||
|
||||
## 🔍 Discovery & Search
|
||||
|
||||
### Deezer Previews
|
||||
|
||||
- [ ] Preview button appears on unowned albums
|
||||
- [ ] Preview button appears on artist discovery pages
|
||||
- [ ] Preview plays 30-second clip
|
||||
- [ ] Preview stops when full track starts
|
||||
|
||||
### Search
|
||||
|
||||
- [ ] Library search finds artists
|
||||
- [ ] Library search finds albums
|
||||
- [ ] Library search finds tracks
|
||||
- [ ] Discovery search finds external artists
|
||||
- [ ] Discovery search finds podcasts
|
||||
|
||||
---
|
||||
|
||||
## 📥 Downloads & Integration
|
||||
|
||||
### Lidarr Integration
|
||||
|
||||
- [ ] Download entire artist works
|
||||
- [ ] Download individual album works
|
||||
- [ ] Download status updates in real-time
|
||||
- [ ] Webhook triggers library rescan after import
|
||||
|
||||
### Soularr (Soulseek)
|
||||
|
||||
- [ ] Search returns results
|
||||
- [ ] Download from Soulseek works
|
||||
- [ ] Downloaded files appear in library after scan
|
||||
|
||||
---
|
||||
|
||||
## 📚 Library Management
|
||||
|
||||
### Discover Weekly
|
||||
|
||||
- [ ] Generate Discover Weekly works
|
||||
- [ ] Playlist populates with recommendations
|
||||
- [ ] Can like/dislike albums
|
||||
- [ ] Liked albums move to permanent collection
|
||||
|
||||
### Playlists
|
||||
|
||||
- [ ] Create new playlist works
|
||||
- [ ] Add track to playlist works
|
||||
- [ ] Remove track from playlist works
|
||||
- [ ] Delete playlist works
|
||||
- [ ] Reorder tracks (drag and drop) works
|
||||
|
||||
---
|
||||
|
||||
## 🔐 Authentication & Users
|
||||
|
||||
### Two-Factor Authentication
|
||||
|
||||
- [ ] Enable 2FA works
|
||||
- [ ] Login with 2FA code works
|
||||
- [ ] Recovery codes work
|
||||
- [ ] Disable 2FA works
|
||||
|
||||
### User Management
|
||||
|
||||
- [ ] Create new user works (admin only)
|
||||
- [ ] User can log in
|
||||
- [ ] User has separate playlists/history
|
||||
- [ ] Delete user works (admin only)
|
||||
|
||||
---
|
||||
|
||||
## 🎨 Metadata & Enrichment
|
||||
|
||||
### Artist Enrichment
|
||||
|
||||
- [ ] Manual enrichment button works
|
||||
- [ ] Artist bio populates
|
||||
- [ ] Artist genres populate
|
||||
- [ ] Hero image/background loads
|
||||
- [ ] Album art loads correctly
|
||||
|
||||
---
|
||||
|
||||
## 📱 PWA / Mobile
|
||||
|
||||
### Installation
|
||||
|
||||
- [ ] PWA install prompt appears on mobile browsers
|
||||
- [ ] Can install to home screen (Android Chrome)
|
||||
- [ ] Can add to home screen (iOS Safari)
|
||||
|
||||
### PWA Features
|
||||
|
||||
- [ ] Installed PWA opens in standalone mode
|
||||
- [ ] Media Session controls show in notification/lock screen
|
||||
- [ ] Background audio continues when screen is off
|
||||
- [ ] Audio continues when switching tabs
|
||||
|
||||
---
|
||||
|
||||
## 🖥️ UI/UX
|
||||
|
||||
### General
|
||||
|
||||
- [ ] Login page loads correctly
|
||||
- [ ] Onboarding flow works for new users
|
||||
- [ ] Navigation between pages works
|
||||
- [ ] Dark theme renders correctly
|
||||
- [ ] Mobile responsive layout works
|
||||
|
||||
### Player
|
||||
|
||||
- [ ] Mini player shows on mobile
|
||||
- [ ] Full player expands correctly
|
||||
- [ ] Album art displays
|
||||
- [ ] Artist/track info displays
|
||||
|
||||
---
|
||||
|
||||
## 🐳 Docker
|
||||
|
||||
### All-in-One Container
|
||||
|
||||
- [ ] Container starts without errors
|
||||
- [ ] Web UI accessible on port 3030
|
||||
- [ ] API proxying works (rewrites to backend)
|
||||
- [ ] Database persists on restart
|
||||
- [ ] Library scan works
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
**Test Environment:**
|
||||
|
||||
- Browser:
|
||||
- OS:
|
||||
- Lidify Version:
|
||||
- Date:
|
||||
|
||||
**Issues Found:**
|
||||
|
||||
-
|
||||
@@ -0,0 +1,153 @@
|
||||
<!-- f0350f33-28ae-4b99-a6ef-c0ec4fc46b90 3ebd44b8-4704-4bf4-a7cc-824ec82aafa3 -->
|
||||
# Fix Lidarr Webhooks, Progress Updates, and Discovery Isolation
|
||||
|
||||
## Issue 1: Lidarr Webhook URL Missing /api Prefix (Critical)
|
||||
|
||||
**Root Cause**: [backend/src/routes/systemSettings.ts](backend/src/routes/systemSettings.ts) line 276 sets webhook URL to `http://host.docker.internal:3006/webhooks/lidarr` but the route is mounted at `/api/webhooks` in [backend/src/index.ts](backend/src/index.ts) line 137.
|
||||
|
||||
**Fix**: Update the webhook URL construction to:
|
||||
|
||||
1. Add `/api` prefix to the path
|
||||
2. Use a smarter URL based on the request origin or a configurable callback URL
|
||||
```typescript
// Line 276 - change from:
// const webhookUrl = "http://host.docker.internal:3006/webhooks/lidarr";

// To something like:
const callbackHost = process.env.LIDIFY_CALLBACK_URL || "http://host.docker.internal:3006";
const webhookUrl = `${callbackHost}/api/webhooks/lidarr`;
```
|
||||
|
||||
|
||||
Also add `LIDIFY_CALLBACK_URL` to Docker compose environment variables so users can configure it.
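
Since `LIDIFY_CALLBACK_URL` would be user-supplied, a value with a trailing slash or stray whitespace could produce a malformed URL such as `//api/webhooks/lidarr`. The sketch below shows one way to guard against that. `buildLidarrWebhookUrl` and `DEFAULT_CALLBACK_HOST` are hypothetical names, not existing code; only the environment variable, the default host, and the `/api/webhooks/lidarr` path come from the plan above.

```typescript
// Hypothetical helper, not part of the current codebase.
const DEFAULT_CALLBACK_HOST = "http://host.docker.internal:3006";

function buildLidarrWebhookUrl(callbackUrl: string | undefined): string {
  const trimmed = callbackUrl?.trim();
  // Fall back to the Docker default when the variable is unset or blank.
  const base = trimmed ? trimmed : DEFAULT_CALLBACK_HOST;
  // Drop trailing slashes so the joined URL never contains "//api".
  return `${base.replace(/\/+$/, "")}/api/webhooks/lidarr`;
}
```

At the call site this would presumably be invoked as `buildLidarrWebhookUrl(process.env.LIDIFY_CALLBACK_URL)`.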
|
||||
|
||||
---
|
||||
|
||||
## Issue 2: Audiobook/Podcast Progress Not Updating Real-time
|
||||
|
||||
**Root Cause**: [frontend/app/audiobooks/page.tsx](frontend/app/audiobooks/page.tsx) computes `continueListening` from `useAudiobooksQuery()` data only. When playback starts, the audio context updates but the query cache doesn't invalidate.
|
||||
|
||||
**Fix**: Modify the audiobooks page to:
|
||||
|
||||
1. Check if `currentAudiobook` from audio context matches any book in the list
|
||||
2. If the currently playing audiobook isn't in `continueListening`, prepend it
|
||||
3. Invalidate audiobooks query when playback starts/stops
|
||||
```typescript
// In audiobooks page, combine query data with audio context
const { currentAudiobook } = useAudio();

const continueListening = useMemo(() => {
  const inProgress = audiobooks.filter(
    (book) => book.progress && book.progress.progress > 0 && !book.progress.isFinished
  );

  // If currently playing an audiobook that's not in the list, add it
  if (currentAudiobook && !inProgress.find(b => b.id === currentAudiobook.id)) {
    const currentBook = audiobooks.find(b => b.id === currentAudiobook.id);
    if (currentBook) {
      return [currentBook, ...inProgress];
    }
  }
  return inProgress;
}, [audiobooks, currentAudiobook]);
```
|
||||
|
||||
|
||||
---
|
||||
|
||||
## Issue 3: Discovery Albums Not Isolated from Library
|
||||
|
||||
**Root Cause Analysis**: The discovery system relies on:
|
||||
|
||||
1. Webhook firing to mark download complete
|
||||
2. Download job having `discoveryBatchId` set
|
||||
3. Scanner checking `isDiscoveryDownload()` during scan
|
||||
|
||||
If the webhook never fires (Issue 1), the download job is never marked complete, so the scan still runs but cannot identify the albums as discovery downloads.
|
||||
|
||||
**Fix**:
|
||||
|
||||
1. Fix webhook URL (Issue 1) - this is the primary fix
|
||||
2. Add fallback: During scan, also check if album path contains "discovery" in Lidarr metadata
|
||||
3. Verify library routes filter by `location: "LIBRARY"` consistently
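
The path-based fallback in step 2 could look roughly like the sketch below. `looksLikeDiscoveryPath` is a hypothetical name, and the assumption that discovery downloads land under a folder whose name contains "discovery" is taken from the fix description, not verified against the scanner.

```typescript
// Hypothetical fallback check, assuming discovery downloads are stored
// under a directory whose name contains "discovery".
function looksLikeDiscoveryPath(albumPath: string): boolean {
  return albumPath
    .split(/[\\/]+/) // handle both POSIX and Windows separators
    .some((segment) => segment.toLowerCase().includes("discovery"));
}
```

Note that an artist or album folder literally named "Discovery" would also match, so this should stay a fallback behind the `discoveryBatchId` check rather than replace it.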
|
||||
|
||||
---
|
||||
|
||||
## Issue 4: Album Cover 404s Spamming Console
|
||||
|
||||
**Root Cause**: [frontend/features/artist/components/AvailableAlbums.tsx](frontend/features/artist/components/AvailableAlbums.tsx) fetches covers for unowned albums. When Cover Art Archive doesn't have them, 404 errors spam the console.
|
||||
|
||||
**Fix**:
|
||||
|
||||
1. In [backend/src/routes/library.ts](backend/src/routes/library.ts) `/album-cover/:mbid` endpoint - return 204 No Content instead of 404 for missing covers (less noisy)
|
||||
2. In frontend - catch and silently handle missing covers, show placeholder
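
On the frontend side, the status handling could be reduced to a small pure function like the one sketched here. `coverSrcForResponse` and `PLACEHOLDER_COVER` are hypothetical names, and the placeholder path is an assumption.

```typescript
// Hypothetical asset path; the real placeholder may live elsewhere.
const PLACEHOLDER_COVER = "/placeholder-album.png";

// Maps the HTTP status of a cover request to the image source to render.
function coverSrcForResponse(status: number, coverUrl: string): string {
  // 204 is the proposed "no cover available" signal; any non-2xx status
  // is treated the same way so the UI always has something to show.
  if (status === 204 || status < 200 || status >= 300) {
    return PLACEHOLDER_COVER;
  }
  return coverUrl;
}
```

The backend change still matters here: browsers generally log failed resource requests on their own, so catching the error in frontend code is unlikely to silence the console while the endpoint keeps returning 404.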
|
||||
|
||||
---
|
||||
|
||||
## Issue 5: Shared Playlists Not Showing Username
|
||||
|
||||
**Verification Needed**: The code exists in [frontend/app/playlists/page.tsx](frontend/app/playlists/page.tsx) lines 162-164. Check if backend is returning `user.username` correctly.
|
||||
|
||||
**Files to check**:
|
||||
|
||||
- [backend/src/routes/playlists.ts](backend/src/routes/playlists.ts) - verify `include: { user: { select: { username: true } } }` is working
|
||||
- Verify playlists actually have `isOwner: false` when shared
|
||||
|
||||
---
|
||||
|
||||
## Issue 6: Discovery Playlist Never Appears
|
||||
|
||||
**Root Cause**: This is directly caused by Issue 1 (webhook URL). The discovery playlist flow is:
|
||||
|
||||
1. Discover Weekly generates recommendations and starts downloads
|
||||
2. Lidarr grabs and downloads the albums
|
||||
3. **Lidarr webhook fires on completion** (BROKEN - wrong URL)
|
||||
4. `simpleDownloadManager.onDownloadComplete()` marks job complete
|
||||
5. `discoverWeeklyService.checkBatchCompletion()` checks if all albums done
|
||||
6. When batch complete, triggers scan with `source: "discover-weekly-completion"`
|
||||
7. Scan processor calls `discoverWeeklyService.buildFinalPlaylist()`
|
||||
8. Discovery playlist appears in UI
|
||||
|
||||
Since step 3 never happens, the playlist is never built.
|
||||
|
||||
**Fix**:
|
||||
1. Fix webhook URL (Issue 1) - primary fix
|
||||
2. Add a manual "Rebuild Discovery Playlist" button in the UI as fallback
|
||||
3. Add a background job that periodically checks for orphaned discovery batches
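
For the background job in step 3, the selection logic might be sketched as below. Everything here is hypothetical: `DiscoveryBatchSnapshot`, its fields, and `findOrphanedBatches` are illustrative names, and the definition of "orphaned" (all jobs finished without a playlist, or older than a cutoff) is an assumption rather than existing behavior.

```typescript
// Hypothetical shape; the real batch model may differ.
interface DiscoveryBatchSnapshot {
  id: string;
  createdAt: Date;
  completedJobs: number;
  totalJobs: number;
  playlistBuilt: boolean;
}

function findOrphanedBatches(
  batches: readonly DiscoveryBatchSnapshot[],
  now: Date,
  maxAgeMs: number
): DiscoveryBatchSnapshot[] {
  return batches.filter((batch) => {
    if (batch.playlistBuilt) return false;
    // Finished but never built: the completion event was likely missed.
    const allJobsDone = batch.completedJobs >= batch.totalJobs;
    // Still unfinished after the cutoff: the webhook probably never fired.
    const tooOld = now.getTime() - batch.createdAt.getTime() > maxAgeMs;
    return allJobsDone || tooOld;
  });
}
```

A periodic job could pass the result to the same playlist-building path that the manual "Rebuild Discovery Playlist" button would use.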
|
||||
|
||||
---
|
||||
|
||||
## Issue 7: Audiobooks/Podcasts Missing Filter/Sort Controls
|
||||
|
||||
**Problem**: Library page has sorting, pagination, and shuffle controls but audiobooks and podcasts pages don't match this design.
|
||||
|
||||
**Fix**: Add to [frontend/app/audiobooks/page.tsx](frontend/app/audiobooks/page.tsx) and [frontend/app/podcasts/page.tsx](frontend/app/podcasts/page.tsx):
|
||||
|
||||
- Sort dropdown (Title A-Z, Author A-Z, Recently Added, etc.)
|
||||
- Items per page dropdown (25, 50, 100, 250)
|
||||
- Pagination controls
|
||||
- "Shuffle" button for audiobooks (shuffle all chapters/books)
|
||||
|
||||
Match the styling from [frontend/app/library/page.tsx](frontend/app/library/page.tsx) for visual consistency.
|
||||
|
||||
---
|
||||
|
||||
## Implementation Order
|
||||
|
||||
1. Fix Lidarr webhook URL (critical - blocking all download tracking)
|
||||
2. Add real-time audiobook progress
|
||||
3. Add filter/sort/pagination to audiobooks and podcasts pages
|
||||
4. Suppress album cover 404 noise
|
||||
5. Verify shared playlist data flow
|
||||
6. Test discovery isolation after webhook fix
|
||||
|
||||
### To-dos
|
||||
|
||||
- [ ] Fix owned artist pages - not showing downloadable albums
|
||||
- [ ] Change default playback quality to 'original'
|
||||
- [ ] Create docs/ directory with tracking file, add to gitignore
|
||||
- [ ] Fix Lidarr webhook URL to include /api prefix and make configurable
|
||||
- [ ] Add real-time audiobook progress by combining query data with audio context
|
||||
- [ ] Change album cover endpoint to return 204 instead of 404 for missing covers
|
||||
- [ ] Debug shared playlist username display
|
||||
- [ ] Test discovery isolation after webhook fix
|
||||
@@ -0,0 +1,64 @@
|
||||
# Lidify Design System
|
||||
|
||||
## Brand Colors
|
||||
- **Primary**: #fca200 (logo gold)
|
||||
- **Hover**: #e69200 (darker gold for hover states)
|
||||
- **Light**: #fcb84d (lighter gold for accents)
|
||||
- **Dark**: #d48c00 (darker gold for emphasis)
|
||||
|
||||
## Design Principles
|
||||
- **Glassmorphism**: Use `backdrop-blur-sm` with semi-transparent cards for premium feel
|
||||
- **Border Radius**: `rounded-lg` (8px) for modern, edgy feel - avoid overly rounded elements
|
||||
- **Shadows**: Prefer `shadow-lg`/`shadow-xl` over `shadow-2xl` for subtlety
|
||||
- **Spacing**: 20-25% tighter than current values for refined look
|
||||
- **Typography**: Smaller, tighter proportions for elegance
|
||||
|
||||
## Component Guidelines
|
||||
|
||||
### Buttons
|
||||
- **Primary CTA**: `bg-brand hover:bg-brand-hover text-black font-bold rounded-lg py-3`
|
||||
- **Secondary**: `bg-white/5 hover:bg-white/10 border border-white/10 rounded-lg py-2.5`
|
||||
- **Avoid**: `rounded-full` (too soft), `rounded-2xl` (too rounded)
|
||||
|
||||
### Cards
|
||||
- **Style**: `rounded-lg backdrop-blur-sm bg-[#111]/90 border border-white/10`
|
||||
- **Shadow**: `shadow-xl` (subtle, premium)
|
||||
- **Padding**: `p-6 md:p-8` (tighter than current)
|
||||
|
||||
### Form Elements
|
||||
- **Inputs**: `rounded-lg py-2.5 px-4 bg-white/5 border border-white/10`
|
||||
- **Focus**: `focus:ring-2 focus:ring-brand/30 focus:border-transparent`
|
||||
- **Labels**: `text-sm font-medium text-white/90 mb-1.5`
|
||||
|
||||
### Typography
|
||||
- **Page Headings**: `text-2xl` (reduced from `text-3xl`)
|
||||
- **Section Headings**: `text-xl` (reduced from `text-2xl`)
|
||||
- **Card Titles**: `text-sm font-semibold`
|
||||
- **Spacing**: Tighter margins (`mb-1` vs `mb-2`)
|
||||
|
||||
## Layout Guidelines
|
||||
|
||||
### Login Page
|
||||
- Logo: `mb-8`, `width={40}`
|
||||
- Card: `rounded-lg p-6 md:p-8`
|
||||
- Form: `space-y-4`
|
||||
- Button: `py-3 rounded-lg`
|
||||
|
||||
### Onboarding Page
|
||||
- Logo: `width={48}`
|
||||
- Title: `text-4xl`
|
||||
- Progress: `w-9 h-9` step circles
|
||||
- Card: `rounded-lg p-6 md:p-8`
|
||||
- Buttons: `py-3.5 rounded-lg`
|
||||
|
||||
## Color Usage
|
||||
- Replace all `#ecb200` with `#fca200`
|
||||
- Replace all `#ffc933` with `#e69200`
|
||||
- Use Tailwind `text-brand`, `bg-brand`, `border-brand` classes
|
||||
- Update gradient overlays to use new brand color
|
||||
|
||||
## Implementation Notes
|
||||
- Glassmorphism effect: `backdrop-blur-sm` (subtle)
|
||||
- Card opacity: `bg-[#111]/90` (90% opacity)
|
||||
- Border consistency: `border-white/10` throughout
|
||||
- Shadow consistency: `shadow-xl` for cards
|
||||
@@ -0,0 +1,191 @@
|
||||
# Spotify Import - Code Reference
|
||||
|
||||
Quick reference to key code sections for the next agent.
|
||||
|
||||
## Backend Entry Points
|
||||
|
||||
### Preview Playlist
|
||||
**File**: `backend/src/routes/spotify.ts`
|
||||
**Endpoint**: `POST /spotify/preview`
|
||||
**Handler**: Lines ~50-120
|
||||
|
||||
```typescript
// Fetches Spotify playlist, searches MusicBrainz for albums
const preview = await spotifyImportService.previewPlaylist(url);
// Returns: matchedTracks, unmatchedTracks, albumsToDownload
```
|
||||
|
||||
### Execute Import
|
||||
**File**: `backend/src/routes/spotify.ts`
|
||||
**Endpoint**: `POST /spotify/import`
|
||||
**Handler**: Lines ~130-200
|
||||
|
||||
```typescript
// Starts async import job
const job = await spotifyImportService.executeImport(preview, userId, playlistName);
// Returns: jobId for status polling
```
|
||||
|
||||
### Retry Pending Track
|
||||
**File**: `backend/src/routes/playlists.ts`
|
||||
**Endpoint**: `POST /playlists/:id/pending/:trackId/retry`
|
||||
**Handler**: Lines ~630-745
|
||||
|
||||
```typescript
// Non-blocking retry flow:
// 1. Search Soulseek (15s timeout)
// 2. Return immediately with success/failure
// 3. Download in background
// 4. Trigger library scan after download
```
|
||||
|
||||
## Core Import Logic
|
||||
|
||||
### spotifyImportService.executeImport()
|
||||
**File**: `backend/src/services/spotifyImport.ts`
|
||||
**Function**: Lines ~150-350
|
||||
|
||||
Key sections:
|
||||
- **Lines ~180-220**: Download albums via Lidarr or Soulseek
|
||||
- **Lines ~230-280**: Wait for downloads, handle failures
|
||||
- **Lines ~290-350**: Create playlist, match tracks, store pending
|
||||
|
||||
### Soulseek Download Flow
|
||||
**File**: `backend/src/services/soulseek.ts`
|
||||
|
||||
Key methods:
|
||||
- `searchTrack()` - Lines ~150-250: Search with 15s timeout
|
||||
- `downloadTrack()` - Lines ~300-400: Download single file with 180s timeout
|
||||
- `searchAndDownloadBatch()` - Lines ~525-600: Parallel search, concurrent download
|
||||
- `downloadBestMatch()` - Lines ~465-520: Download from pre-searched results
|
||||
|
||||
### Track Matching
|
||||
**File**: `backend/src/services/spotifyImport.ts`
|
||||
**Function**: `matchTrackToLibrary()` - Lines ~400-500
|
||||
|
||||
Matching strategies (in order):
|
||||
1. Exact normalized title + artist first word
|
||||
2. Stripped title (remove remaster/remix suffixes)
|
||||
3. Contains search
|
||||
4. Fuzzy artist + title
|
||||
5. StartsWith search
|
||||
6. Last resort fuzzy
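
To make the first three strategies concrete, here is a minimal sketch. It is not the code in `matchTrackToLibrary()`: the names `normalizeForMatch`, `stripVersionSuffix`, `matchTrackSketch`, and `LibraryTrackSketch` are hypothetical, the normalization is ASCII-only, and the suffix patterns cover only remaster/remix variants.

```typescript
interface LibraryTrackSketch {
  id: string;
  title: string;
  artist: string;
}

// ASCII-only normalization (assumption): lowercase, collapse punctuation.
function normalizeForMatch(value: string): string {
  return value.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
}

// Removes "(Remastered 2009)", "[Remix]" or " - 2017 Remaster" style suffixes.
function stripVersionSuffix(title: string): string {
  return title
    .replace(/\s*[(\[][^)\]]*\b(?:remaster(?:ed)?|remix)\b[^)\]]*[)\]]\s*$/i, "")
    .replace(/\s+-\s+[^-]*\b(?:remaster(?:ed)?|remix)\b[^-]*$/i, "")
    .trim();
}

function firstWord(value: string): string {
  return normalizeForMatch(value).split(" ")[0] ?? "";
}

function matchTrackSketch(
  title: string,
  artist: string,
  library: readonly LibraryTrackSketch[]
): LibraryTrackSketch | undefined {
  const wantArtist = firstWord(artist);
  const sameArtist = library.filter((t) => firstWord(t.artist) === wantArtist);

  // Strategy 1: exact normalized title + artist first word.
  const wantTitle = normalizeForMatch(title);
  const exact = sameArtist.find((t) => normalizeForMatch(t.title) === wantTitle);
  if (exact) return exact;

  // Strategy 2: compare titles with remaster/remix suffixes stripped.
  const wantStripped = normalizeForMatch(stripVersionSuffix(title));
  const stripped = sameArtist.find(
    (t) => normalizeForMatch(stripVersionSuffix(t.title)) === wantStripped
  );
  if (stripped) return stripped;

  // Strategy 3: contains search (guarded against an empty needle).
  if (wantStripped.length === 0) return undefined;
  return sameArtist.find((t) => normalizeForMatch(t.title).includes(wantStripped));
}
```

Running the strict strategies first keeps the looser ones from picking a wrong version when an exact match exists, which is presumably why the real function orders them this way.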
|
||||
|
||||
### Pending Track Reconciliation
|
||||
**File**: `backend/src/services/spotifyImport.ts`
|
||||
**Function**: `reconcilePendingTracks()` - Lines ~550-650
|
||||
|
||||
Called after library scan to match pending tracks to newly added files.
|
||||
|
||||
## Frontend Components
|
||||
|
||||
### Import Wizard
|
||||
**File**: `frontend/app/import/spotify/page.tsx`
|
||||
|
||||
Key state:
|
||||
- `step`: "url" | "preview" | "importing" | "complete"
|
||||
- `preview`: PreviewResult from API
|
||||
- `jobStatus`: Polling status during import
|
||||
|
||||
### Playlist Detail - Pending Tracks
|
||||
**File**: `frontend/app/playlist/[id]/page.tsx`
|
||||
|
||||
Key handlers (Lines ~100-160):
|
||||
- `handlePlayPreview()` - Fetches fresh Deezer URL, plays audio
|
||||
- `handleRetryPendingTrack()` - Calls retry API, shows toast
|
||||
- `handleRemovePendingTrack()` - Removes from playlist
|
||||
|
||||
Pending track rendering: Lines ~555-650
|
||||
|
||||
## Database Queries
|
||||
|
||||
### Get Playlist with Pending Tracks
|
||||
```typescript
const playlist = await prisma.playlist.findUnique({
  where: { id: playlistId },
  include: {
    items: { include: { track: { include: { album: { include: { artist: true } } } } } },
    pendingTracks: { orderBy: { sort: 'asc' } }
  }
});
```
|
||||
|
||||
### Create Pending Track
|
||||
```typescript
await prisma.playlistPendingTrack.create({
  data: {
    playlistId,
    spotifyArtist: track.artist,
    spotifyTitle: track.title,
    spotifyAlbum: resolvedAlbum,
    spotifyTrackId: track.spotifyId,
    deezerPreviewUrl: previewUrl,
    sort: index
  }
});
```
|
||||
|
||||
### Reconcile Pending Track (convert to real track)
|
||||
```typescript
// Delete pending, add real track
await prisma.$transaction([
  prisma.playlistPendingTrack.delete({ where: { id: pendingId } }),
  prisma.playlistItem.create({
    data: { playlistId, trackId: matchedTrack.id, sort: pending.sort }
  })
]);
```
|
||||
|
||||
## Configuration Check
|
||||
|
||||
```typescript
const settings = await getSystemSettings();
// Key fields:
// - settings.downloadSource: "soulseek" | "lidarr"
// - settings.soulseekFallback: "none" | "failed" | "always"
// - settings.musicPath: where files are downloaded
// - settings.soulseekUsername / soulseekPassword
// - settings.lidarrUrl / lidarrApiKey
```
|
||||
|
||||
## Error Handling Patterns
|
||||
|
||||
### Soulseek Connection
|
||||
```typescript
try {
  await soulseekService.ensureConnected();
} catch (err) {
  // Credentials not configured or connection failed
  return { success: false, error: "Soulseek connection failed" };
}
```
|
||||
|
||||
### Download Retry Logic
|
||||
```typescript
const matchesToTry = allMatches.slice(0, MAX_DOWNLOAD_RETRIES); // 3 attempts
for (const match of matchesToTry) {
  const result = await this.downloadTrack(match, destPath);
  if (result.success) return { success: true, filePath: destPath };
  // Try next user on failure
}
return { success: false, error: "All attempts failed" };
```
|
||||
|
||||
## Logging
|
||||
|
||||
Session logging for debugging:
|
||||
```typescript
import { sessionLog } from "../utils/playlistLogger";

sessionLog("SOULSEEK", "Message here"); // INFO level
sessionLog("SOULSEEK", "Error message", "ERROR");
sessionLog("SOULSEEK", "Warning", "WARN");
```
|
||||
|
||||
Job-specific logging:
|
||||
```typescript
import { createPlaylistLogger } from "../utils/playlistLogger";

const logger = createPlaylistLogger(jobId);
logger.info("Message");
logger.error("Error");
logger.debug("Debug info");
```
|
||||
@@ -0,0 +1,194 @@
|
||||
# Spotify Import Feature - Handoff Document
|
||||
|
||||
## Overview
|
||||
|
||||
The Spotify Import feature allows users to import playlists from Spotify into Lidify. It searches for matching tracks on Soulseek (and optionally Lidarr), downloads them, creates a local playlist, and matches downloaded tracks to the playlist.
|
||||
|
||||
## Current State
|
||||
|
||||
### What Works
|
||||
1. **Spotify Playlist Parsing**: Fetches playlist metadata via Spotify embed API
|
||||
2. **Soulseek Downloads**: Direct P2P downloads with retry logic (tries up to 3 different users)
|
||||
3. **Parallel Processing**: Searches run in parallel, downloads limited to concurrency of 4
|
||||
4. **Track Matching**: Multiple matching strategies (exact, fuzzy, contains, startsWith)
|
||||
5. **Pending Track System**: Tracks that fail to download are stored as "pending" with:
   - Deezer preview playback (30s samples)
   - Manual retry button
   - Remove button
|
||||
6. **Retry Functionality**: Non-blocking retry - returns immediately, downloads in background
|
||||
7. **Reconciliation**: After library scan, pending tracks are automatically matched to downloaded files
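
The "concurrency of 4" behavior from item 3 can be illustrated with a small worker-lane helper. This is a generic sketch under stated assumptions, not the code in `searchAndDownloadBatch()`; `mapWithConcurrency` is a hypothetical name.

```typescript
// Runs `worker` over `items` with at most `limit` calls in flight at once.
// Results are returned in input order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each lane pulls the next unclaimed index until the list is exhausted.
  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T, index);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => runLane()));
  return results;
}
```

A call such as `mapWithConcurrency(tracks, 4, downloadOne)` would keep four downloads active while the rest wait, where `downloadOne` stands in for whatever per-track download function is used. In this sketch a rejected worker rejects the whole batch, so a real download function would likely return a failure result instead of throwing.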
|
||||
|
||||
### What Needs Testing
|
||||
1. **Lidarr Integration**: Download source can be set to "lidarr" but needs end-to-end testing
|
||||
2. **Lidarr + Soulseek Fallback**: When `downloadSource: "lidarr"` and `soulseekFallback: "failed"`, should try Lidarr first then fall back to Soulseek
|
||||
3. **Activity Panel Integration**: Downloads should show progress in the activity panel
|
||||
4. **Edge Cases**: Various artist name formats, special characters, live recordings filtering
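
When testing item 2, it may help to write down the expected decision table first. The sketch below encodes one reading of the settings: the setting names and values come from the system settings, but `shouldUseSoulseek` is a hypothetical helper and the meaning of `"always"` (use Soulseek regardless of the Lidarr outcome) is an assumption that should be checked against the service code.

```typescript
type DownloadSource = "soulseek" | "lidarr";
type SoulseekFallback = "none" | "failed" | "always";

interface DownloadSettingsSketch {
  downloadSource: DownloadSource;
  soulseekFallback: SoulseekFallback;
}

// Decides whether Soulseek should be tried for an album.
function shouldUseSoulseek(
  settings: DownloadSettingsSketch,
  lidarrSucceeded: boolean
): boolean {
  // Soulseek is the primary source, so the fallback setting is irrelevant.
  if (settings.downloadSource === "soulseek") return true;

  switch (settings.soulseekFallback) {
    case "none":
      return false;
    case "failed":
      // Expected behavior from item 2: Lidarr first, Soulseek only on failure.
      return !lidarrSucceeded;
    case "always":
      // Assumption: Soulseek is used no matter how Lidarr fared.
      return true;
  }
}
```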
|
||||
|
||||
## Architecture
|
||||
|
||||
### Flow
|
||||
```
1. User pastes Spotify playlist URL
2. Frontend calls POST /spotify/preview with URL
3. Backend fetches playlist via Spotify embed API
4. Backend searches MusicBrainz for album MBIDs
5. Preview returned to user showing matched/unmatched tracks
6. User confirms import
7. Frontend calls POST /spotify/import
8. Backend:
   a. For each album, either:
      - Sends to Lidarr (if enabled)
      - Downloads directly via Soulseek
   b. Waits for downloads to complete
   c. Runs library scan
   d. Matches tracks to playlist
   e. Creates pending entries for unmatched tracks
9. User sees playlist with matched tracks + failed/pending tracks
```
|
||||
|
||||
### Key Files
|
||||
|
||||
#### Backend Routes
|
||||
- `backend/src/routes/spotify.ts` - Main import endpoints
  - `POST /spotify/preview` - Parse and preview playlist
  - `POST /spotify/import` - Execute import job
  - `GET /spotify/import/:jobId/status` - Check job status

- `backend/src/routes/playlists.ts` - Playlist management + pending track handling
  - `GET /playlists/:id/pending/:trackId/preview` - Get fresh Deezer preview URL
  - `POST /playlists/:id/pending/:trackId/retry` - Retry downloading a failed track
  - `DELETE /playlists/:id/pending/:trackId` - Remove pending track from playlist
  - `POST /playlists/:id/pending/reconcile` - Manually trigger reconciliation
|
||||
|
||||
#### Backend Services
|
||||
- `backend/src/services/spotifyImport.ts` - Core import logic
  - `previewPlaylist()` - Parse Spotify URL and match to MusicBrainz
  - `executeImport()` - Run the full import job
  - `reconcilePendingTracks()` - Match pending tracks to library after scan

- `backend/src/services/soulseek.ts` - Direct Soulseek P2P client
  - `searchTrack()` - Search for a track (15s timeout)
  - `downloadTrack()` - Download a single file
  - `searchAndDownload()` - Search + download with retry
  - `searchAndDownloadBatch()` - Parallel search, concurrent download
  - `downloadBestMatch()` - Download from pre-searched results (used by retry)

- `backend/src/services/lidarr.ts` - Lidarr integration
  - `searchAlbum()` - Search for album by MBID
  - `addAlbum()` - Add album to Lidarr for download
  - `getDownloadQueue()` - Check download progress

- `backend/src/services/deezer.ts` - Deezer API for previews
  - `getTrackPreview()` - Get 30s preview URL for a track

- `backend/src/services/musicbrainz.ts` - MusicBrainz lookups
  - `searchRecordingByISRC()` - Find recording by ISRC
  - `searchRecording()` - Search by artist/title
  - `getReleaseDetails()` - Get album details
|
||||
|
||||
#### Frontend
|
||||
- `frontend/app/import/spotify/page.tsx` - Import wizard UI
|
||||
- `frontend/app/playlist/[id]/page.tsx` - Playlist detail with pending track handling
|
||||
- `frontend/lib/api.ts` - API client methods
|
||||
|
||||
#### Database Schema (relevant tables)
|
||||
```prisma
|
||||
model Playlist {
|
||||
id String @id @default(cuid())
|
||||
name String
|
||||
userId String
|
||||
isPublic Boolean @default(false)
|
||||
spotifyUrl String? // Original Spotify URL
|
||||
items PlaylistItem[]
|
||||
pendingTracks PlaylistPendingTrack[]
|
||||
}
|
||||
|
||||
model PlaylistPendingTrack {
|
||||
id String @id @default(cuid())
|
||||
playlistId String
|
||||
spotifyArtist String
|
||||
spotifyTitle String
|
||||
spotifyAlbum String
|
||||
spotifyTrackId String?
|
||||
deezerPreviewUrl String?
|
||||
sort Int
|
||||
createdAt DateTime @default(now())
|
||||
}
|
||||
```
|
||||
|
||||
## Known Issues
|
||||
|
||||
### 1. Album Name Shows "Unknown Album"
|
||||
**Problem**: Pending tracks sometimes show "Unknown Album" instead of the real album name.
|
||||
**Cause**: The Spotify embed API sometimes returns "Unknown Album" for `track.album`.
|
||||
**Fix Applied**: Now uses resolved album name from `albumsToDownload` (MusicBrainz data) instead of Spotify embed data.
|
||||
**File**: `backend/src/services/spotifyImport.ts` line ~280
|
||||
|
||||
### 2. Deezer Preview URLs Expire
|
||||
**Problem**: Deezer preview URLs embed a timestamp and expire quickly, so a URL stored at import time may no longer play.
|
||||
**Fix Applied**: Added endpoint to fetch fresh preview URL on demand.
|
||||
**File**: `backend/src/routes/playlists.ts` - `GET /:id/pending/:trackId/preview`
|
||||
|
||||
### 3. Retry Button Was Hanging
|
||||
**Problem**: Clicking retry would hang for up to 180s (download timeout).
|
||||
**Fix Applied**: Made retry non-blocking - search first (15s), return immediately, download in background.
|
||||
**File**: `backend/src/routes/playlists.ts` - `POST /:id/pending/:trackId/retry`
|
||||
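The non-blocking retry can be sketched as follows. The backend route itself is TypeScript; this is a language-neutral illustration in Python, and `search`, `download`, and `background_tasks` are hypothetical stand-ins rather than the real Soulseek service API. Only the 15s search timeout is taken from the fix described above.

```python
import asyncio

SEARCH_TIMEOUT_S = 15  # search timeout quoted in the fix above


async def retry_pending_track(track, search, download, background_tasks):
    """Bounded search first, then a fire-and-forget download.

    `search`, `download`, and `background_tasks` are hypothetical stand-ins.
    """
    try:
        results = await asyncio.wait_for(search(track), timeout=SEARCH_TIMEOUT_S)
    except asyncio.TimeoutError:
        return {"status": "search_timeout"}
    if not results:
        return {"status": "not_found"}
    # Schedule the slow download without awaiting it. Keeping a reference in
    # the set prevents the task from being garbage-collected mid-flight.
    task = asyncio.create_task(download(results[0]))
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    return {"status": "downloading", "candidates": len(results)}


async def demo():
    async def fake_search(track):
        return ["candidate-1", "candidate-2"]

    async def fake_download(result):
        await asyncio.sleep(0.05)  # stands in for a long transfer
        return result

    tasks = set()
    response = await retry_pending_track(
        {"title": "Example"}, fake_search, fake_download, tasks
    )
    still_running = len(tasks)  # download is still pending when we respond
    await asyncio.gather(*tasks)
    return response, still_running
```

The caller gets its response as soon as the search finishes, so the worst-case wait is bounded by the search timeout rather than the download timeout.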
|
||||
### 4. Missing Files After Scan (Unresolved)
|
||||
**Problem**: During testing, original downloaded files disappeared from disk, causing scan to remove 7 tracks.
|
||||
**Status**: Cause unknown. The files appear to have been deleted outside the application rather than by a code bug. Monitor in future tests.
|
||||
|
||||
## Testing Checklist
|
||||
|
||||
### Soulseek-Only Mode (Current Focus)
|
||||
- [x] Basic playlist import with Soulseek
|
||||
- [x] Track matching after download
|
||||
- [x] Pending track display for failed downloads
|
||||
- [x] Deezer preview playback
|
||||
- [x] Retry button functionality
|
||||
- [x] Remove pending track
|
||||
- [ ] Toast notifications for retry status
|
||||
- [ ] Activity panel shows download progress
|
||||
- [ ] Verify files persist after download
|
||||
|
||||
### Lidarr Mode (Needs Testing)
|
||||
- [ ] Set `downloadSource: "lidarr"` in settings
|
||||
- [ ] Import playlist - should send albums to Lidarr
|
||||
- [ ] Lidarr downloads complete
|
||||
- [ ] Library scan picks up Lidarr downloads
|
||||
- [ ] Tracks match to playlist
|
||||
|
||||
### Lidarr + Soulseek Fallback (Needs Testing)
|
||||
- [ ] Set `downloadSource: "lidarr"`, `soulseekFallback: "failed"`
|
||||
- [ ] Import playlist with mix of albums (some in Lidarr, some not)
|
||||
- [ ] Albums not in Lidarr should fall back to Soulseek
|
||||
- [ ] Both sources' downloads get matched
|
||||
|
||||
## Configuration
|
||||
|
||||
System settings relevant to import (in `SystemSettings` table):
|
||||
```
|
||||
downloadSource: "soulseek" | "lidarr"
|
||||
soulseekFallback: "none" | "failed" | "always"
|
||||
soulseekUsername: string
|
||||
soulseekPassword: string (encrypted)
|
||||
lidarrEnabled: boolean
|
||||
lidarrUrl: string
|
||||
lidarrApiKey: string (encrypted)
|
||||
musicPath: string (e.g., "C:/Users/kevin/Music")
|
||||
```
|
||||
|
||||
## Logs
|
||||
|
||||
Import logs are written to: `docs/logs/playlists/import_<jobId>_<timestamp>.log`
|
||||
|
||||
Session log for Soulseek activity: `docs/logs/playlists/session.log`
|
||||
|
||||
## Next Steps
|
||||
|
||||
1. Run fresh import test with Soulseek
|
||||
2. Verify files persist and scan works correctly
|
||||
3. Test Lidarr-only mode
|
||||
4. Test Lidarr + Soulseek fallback
|
||||
5. Add activity panel integration for download progress
|
||||
6. Consider adding notification when background retry completes
|
||||
@@ -0,0 +1,174 @@
|
||||
# Audio Analysis - Enhanced Mode (MusiCNN)
|
||||
|
||||
## Overview
|
||||
|
||||
Enhanced mode uses Essentia's TensorFlow integration with MusiCNN (Music Convolutional Neural Network) models to perform ML-based mood and audio classification. This provides significantly more accurate mood detection compared to the heuristic-based Standard mode.
|
||||
|
||||
## Architecture
|
||||
|
||||
```
|
||||
┌─────────────────┐
|
||||
│ Audio File │
|
||||
│ (16kHz mono) │
|
||||
└────────┬────────┘
|
||||
│
|
||||
┌────────▼────────┐
|
||||
│ TensorflowPredict│
|
||||
│ MusiCNN │
|
||||
│ (Embeddings) │
|
||||
└────────┬────────┘
|
||||
│
|
||||
┌──────────────┼──────────────┐
|
||||
│ │ │
|
||||
┌─────────▼─────┐ ┌──────▼─────┐ ┌──────▼─────┐
|
||||
│ Mood Happy │ │ Mood Sad │ │ Danceability│
|
||||
│ TensorFlow │ │ TensorFlow │ │ TensorFlow │
|
||||
│ Predict2D │ │ Predict2D │ │ Predict2D │
|
||||
└───────┬───────┘ └─────┬──────┘ └──────┬──────┘
|
||||
│ │ │
|
||||
└───────────────┼───────────────┘
|
||||
│
|
||||
┌───────▼───────┐
|
||||
│ Derived Scores│
|
||||
│ Valence/Arousal│
|
||||
└───────────────┘
|
||||
```
|
||||
|
||||
## Key Components
|
||||
|
||||
### 1. Base Model: MusiCNN
|
||||
|
||||
- **Model**: `msd-musicnn-1.pb` (~3MB)
|
||||
- **Source**: [Essentia Model Zoo](https://essentia.upf.edu/models/autotagging/msd/)
|
||||
- **Function**: Extracts 200-dimensional embeddings from audio
|
||||
- **Algorithm**: `TensorflowPredictMusiCNN`
|
||||
|
||||
### 2. Classification Heads
|
||||
|
||||
Each classification head takes the MusiCNN embeddings and outputs probabilities:
|
||||
|
||||
| Model | File | Output |
|
||||
|-------|------|--------|
|
||||
| Mood Happy | `mood_happy-msd-musicnn-1.pb` | P(happy) |
|
||||
| Mood Sad | `mood_sad-msd-musicnn-1.pb` | P(sad) |
|
||||
| Mood Relaxed | `mood_relaxed-msd-musicnn-1.pb` | P(relaxed) |
|
||||
| Mood Aggressive | `mood_aggressive-msd-musicnn-1.pb` | P(aggressive) |
|
||||
| Mood Party | `mood_party-msd-musicnn-1.pb` | P(party) |
|
||||
| Mood Acoustic | `mood_acoustic-msd-musicnn-1.pb` | P(acoustic) |
|
||||
| Mood Electronic | `mood_electronic-msd-musicnn-1.pb` | P(electronic) |
|
||||
| Danceability | `danceability-msd-musicnn-1.pb` | P(danceable) |
|
||||
| Voice/Instrumental | `voice_instrumental-msd-musicnn-1.pb` | P(instrumental) |
|
||||
|
||||
### 3. Derived Features
|
||||
|
||||
Valence and Arousal are derived from the mood predictions:
|
||||
|
||||
```python
# Valence = emotional positivity
valence = happy * 0.5 + party * 0.3 + (1 - sad) * 0.2

# Arousal = energy level (parentheses allow the expression to span two lines)
arousal = (
    aggressive * 0.35 + party * 0.25 + electronic * 0.2
    + (1 - relaxed) * 0.1 + (1 - acoustic) * 0.1
)
```
|
||||
|
||||
## Docker Configuration
|
||||
|
||||
### Dockerfile
|
||||
|
||||
```dockerfile
|
||||
FROM ubuntu:20.04
|
||||
|
||||
# Install essentia-tensorflow (includes TensorFlow + MusiCNN support)
|
||||
RUN pip3 install --no-cache-dir essentia-tensorflow
|
||||
|
||||
# Download MusiCNN models
|
||||
RUN curl -L -o /app/models/msd-musicnn-1.pb \
|
||||
"https://essentia.upf.edu/models/autotagging/msd/msd-musicnn-1.pb"
|
||||
|
||||
# Classification heads
|
||||
RUN curl -L -o /app/models/mood_happy-msd-musicnn-1.pb \
|
||||
"https://essentia.upf.edu/models/classification-heads/mood_happy/mood_happy-msd-musicnn-1.pb"
|
||||
# ... (other models)
|
||||
```
|
||||
|
||||
### Requirements
|
||||
|
||||
- **Ubuntu 20.04** (for Python 3.8 compatibility)
|
||||
- **essentia-tensorflow** pip package
|
||||
- **~10MB** for all models combined
|
||||
|
||||
## Usage in Code
|
||||
|
||||
```python
import numpy as np
from essentia.standard import MonoLoader, TensorflowPredictMusiCNN, TensorflowPredict2D

# Load base embedding model
musicnn = TensorflowPredictMusiCNN(
    graphFilename='/app/models/msd-musicnn-1.pb',
    output="model/dense/BiasAdd"  # Embedding output layer
)

# Load classification head
mood_happy = TensorflowPredict2D(
    graphFilename='/app/models/mood_happy-msd-musicnn-1.pb',
    output="model/Softmax"
)

# Process audio (the models expect 16 kHz mono input)
path = '/music/path/to/song.mp3'  # placeholder path
audio = MonoLoader(filename=path, sampleRate=16000)()
embeddings = musicnn(audio)            # Shape: [frames, 200]
predictions = mood_happy(embeddings)   # Shape: [frames, 2]
# Column order can differ between heads; confirm the positive-class index
# in the model's metadata JSON before relying on column 1.
happy_score = float(np.mean(predictions[:, 1]))  # Average over frames
```
|
||||
|
||||
## Output Fields
|
||||
|
||||
Enhanced mode produces these additional fields:
|
||||
|
||||
| Field | Type | Range | Description |
|
||||
|-------|------|-------|-------------|
|
||||
| moodHappy | float | 0-1 | ML probability of happy mood |
|
||||
| moodSad | float | 0-1 | ML probability of sad mood |
|
||||
| moodRelaxed | float | 0-1 | ML probability of relaxed mood |
|
||||
| moodAggressive | float | 0-1 | ML probability of aggressive mood |
|
||||
| moodParty | float | 0-1 | ML probability of party mood |
|
||||
| moodAcoustic | float | 0-1 | ML probability of acoustic sound |
|
||||
| moodElectronic | float | 0-1 | ML probability of electronic sound |
|
||||
| danceabilityMl | float | 0-1 | ML danceability score |
|
||||
| valence | float | 0-1 | Derived emotional positivity |
|
||||
| arousal | float | 0-1 | Derived energy level |
|
||||
| acousticness | float | 0-1 | From moodAcoustic |
|
||||
| instrumentalness | float | 0-1 | ML voice/instrumental detection |
|
||||
|
||||
## Comparison: Standard vs Enhanced
|
||||
|
||||
| Feature | Standard Mode | Enhanced Mode |
|
||||
|---------|---------------|---------------|
|
||||
| Mood Detection | Heuristic (key/BPM/energy) | ML (MusiCNN) |
|
||||
| Accuracy | Approximate | Research-grade |
|
||||
| Speed | Fast (~100ms) | Moderate (~500ms) |
|
||||
| Dependencies | Essentia core | Essentia + TensorFlow |
|
||||
| Model Size | 0 | ~10MB |
|
||||
| Python Version | Any | 3.7-3.9 (for pip) |
|
||||
|
||||
## Fallback Behavior
|
||||
|
||||
If Enhanced mode fails to initialize (missing models, TensorFlow errors), the analyzer automatically falls back to Standard mode:
|
||||
|
||||
```python
|
||||
if self.enhanced_mode and self.musicnn_model:
|
||||
ml_features = self._extract_ml_features(audio_16k)
|
||||
result.update(ml_features)
|
||||
else:
|
||||
self._apply_standard_estimates(result, scale, bpm)
|
||||
```
|
||||
|
||||
## References
|
||||
|
||||
- [Essentia TensorFlow Documentation](https://essentia.upf.edu/machine_learning.html)
|
||||
- [MusiCNN Paper](https://arxiv.org/abs/1711.02520)
|
||||
- [Essentia Model Zoo](https://essentia.upf.edu/models/)
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,443 @@
|
||||
# Audio Analysis: Standard Mode (Heuristic Approach)
|
||||
|
||||
## Overview
|
||||
|
||||
The Lidify audio analyzer has two modes:
|
||||
- **Enhanced Mode**: Uses TensorFlow ML models for accurate mood/valence/arousal predictions
|
||||
- **Standard Mode**: Uses signal processing heuristics when ML models aren't available
|
||||
|
||||
This document covers the **Standard Mode** implementation for code review.
|
||||
|
||||
---
|
||||
|
||||
## Architecture
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────┐
|
||||
│ Docker Container │
|
||||
│ lidify_audio_analyzer │
|
||||
│ │
|
||||
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ │
|
||||
│ │ Redis │◄───│ Worker │───►│ PostgreSQL │ │
|
||||
│ │ Job Queue │ │ Loop │ │ Track Table │ │
|
||||
│ └─────────────┘ └──────┬──────┘ └─────────────────────┘ │
|
||||
│ │ │
|
||||
│ ┌──────▼──────┐ │
|
||||
│ │ AudioAnalyzer│ │
|
||||
│ │ Class │ │
|
||||
│ └──────┬──────┘ │
|
||||
│ │ │
|
||||
│ ┌────────────────┼────────────────┐ │
|
||||
│ ▼ ▼ ▼ │
|
||||
│ ┌───────────────┐ ┌─────────────┐ ┌──────────────────┐ │
|
||||
│ │ Basic Features│ │ Spectral │ │ Heuristic │ │
|
||||
│ │ (BPM, Key) │ │ Analysis │ │ Mood Estimation │ │
|
||||
│ └───────────────┘ └─────────────┘ └──────────────────┘ │
|
||||
│ │
|
||||
└─────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## File Structure
|
||||
|
||||
```
|
||||
services/audio-analyzer/
|
||||
├── analyzer.py # Main analyzer code (870 lines)
|
||||
├── requirements.txt # Python dependencies
|
||||
└── Dockerfile # Container build configuration
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Key Classes
|
||||
|
||||
### 1. `AudioAnalyzer` (Lines 130-660)
|
||||
|
||||
Main analysis class with two modes:
|
||||
|
||||
```python
|
||||
class AudioAnalyzer:
|
||||
def __init__(self):
|
||||
self.enhanced_mode = False # Falls back to Standard if ML unavailable
|
||||
self._init_essentia() # Initialize signal processing algorithms
|
||||
self._load_ml_models() # Attempt to load ML models
|
||||
```
|
||||
|
||||
### 2. `AnalysisWorker` (Lines 663-847)
|
||||
|
||||
Redis queue worker that:
|
||||
1. Polls for pending tracks from `audio:analysis:queue`
|
||||
2. Falls back to scanning `Track` table for `analysisStatus = 'pending'`
|
||||
3. Processes tracks and updates database
|
||||
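The queue-then-database fallback in the steps above can be sketched with in-memory stand-ins. This is an illustration under assumptions, not the worker's actual code: `queue` stands in for the Redis list `audio:analysis:queue`, `scan_pending` for a query on `Track` where `analysisStatus = 'pending'`, and `analyze` and `save` are hypothetical callables.

```python
from collections import deque


def next_track_id(queue, scan_pending):
    """Prefer the queue; fall back to scanning for pending tracks."""
    if queue:
        return queue.popleft()
    pending = scan_pending(1)
    return pending[0] if pending else None


def drain(queue, scan_pending, analyze, save):
    """Process tracks until neither source has work left."""
    processed = []
    while (track_id := next_track_id(queue, scan_pending)) is not None:
        save(track_id, analyze(track_id))
        processed.append(track_id)
    return processed


# In-memory stand-ins for Redis and the Track table (hypothetical data).
statuses = {"a": "pending", "b": "pending", "c": "pending"}
queue = deque(["b"])


def scan_pending(limit):
    return [t for t, s in statuses.items() if s == "pending"][:limit]


def save(track_id, result):
    statuses[track_id] = "completed"


processed = drain(queue, scan_pending, lambda track_id: {"bpm": 120.0}, save)
```

A long-running worker would sleep and poll again instead of returning when both sources are empty; `drain` stops so the example terminates.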
|
||||
---
|
||||
|
||||
## Standard Mode: Heuristic Calculations
|
||||
|
||||
### Input Features (Always Extracted)
|
||||
|
||||
| Feature | Essentia Algorithm | Description |
|
||||
|---------|-------------------|-------------|
|
||||
| BPM | `RhythmExtractor2013` | Beats per minute |
|
||||
| Key/Scale | `KeyExtractor` | Musical key (C, D#, etc.) and mode (major/minor) |
|
||||
| Loudness | `Loudness` | Perceived loudness in dB |
|
||||
| Dynamic Range | `DynamicComplexity` | Difference between quiet and loud parts |
|
||||
| Danceability | `Danceability` | How suitable for dancing (0-1) |
|
||||
| RMS Energy | `RMS` | Root Mean Square amplitude per frame |
|
||||
| Spectral Centroid | `Centroid` | "Brightness" - center of spectral mass |
|
||||
| Spectral Flatness | `FlatnessDB` | Noise-like vs tonal content |
|
||||
| Zero-Crossing Rate | `ZeroCrossingRate` | Rate of signal sign changes |
|
||||
|
||||
### Frame-Based Processing (Lines 328-365)
|
||||
|
||||
```python
|
||||
frame_size = 2048
|
||||
hop_size = 1024
|
||||
|
||||
for i in range(0, len(audio_44k) - frame_size, hop_size):
|
||||
frame = audio_44k[i:i + frame_size]
|
||||
windowed = self.windowing(frame)
|
||||
spectrum = self.spectrum(windowed)
|
||||
|
||||
rms_values.append(self.rms(frame))
|
||||
zcr_values.append(self.zcr(frame))
|
||||
spectral_centroid_values.append(self.spectral_centroid(spectrum))
|
||||
spectral_flatness_values.append(self.spectral_flatness(spectrum))
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Heuristic Formulas
|
||||
|
||||
### Energy (Lines 347-353)
|
||||
|
||||
**Problem Solved**: The previous implementation used `es.Energy()`, which returns the sum of squared samples (a value that grows with signal length), and normalized it incorrectly as `energy / 100`.
|
||||
|
||||
**Current Implementation**:
|
||||
```python
|
||||
avg_rms = np.mean(rms_values)
|
||||
energy = min(1.0, avg_rms * 3) # RMS typically 0.0-0.5, scale to 0-1
|
||||
```
|
||||
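To see why RMS is the better basis, compare both normalizations on a synthetic sine wave. This is a pure-Python sketch that does not call Essentia; the helper names are hypothetical, and the "old" formula is reproduced only as described above.

```python
import math

SAMPLE_RATE = 44100


def sine(amplitude, seconds=1.0, freq_hz=440.0):
    """Generate a mono sine wave as a list of floats."""
    n = int(SAMPLE_RATE * seconds)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
        for i in range(n)
    ]


def sum_of_squares(samples):
    return sum(x * x for x in samples)


def rms(samples):
    return math.sqrt(sum_of_squares(samples) / len(samples))


one_second = sine(0.3, seconds=1.0)
ten_seconds = sine(0.3, seconds=10.0)

# Old normalization: scales with signal length, so it is not a 0-1 score.
old_1s = sum_of_squares(one_second) / 100
old_10s = sum_of_squares(ten_seconds) / 100

# Current normalization: independent of length and bounded by 1.0.
new_1s = min(1.0, rms(one_second) * 3)
new_10s = min(1.0, rms(ten_seconds) * 3)
```

The same tone at the same volume yields a tenfold larger "old" value when the clip is ten times longer, while the RMS-based value stays the same.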
|
||||
---
|
||||
|
||||
### Valence (Happiness/Positivity) - Lines 495-518
|
||||
|
||||
**Formula**:
|
||||
```
|
||||
valence = key_valence * 0.40
|
||||
+ bpm_valence * 0.25
|
||||
+ brightness_valence * 0.20
|
||||
+ energy * 0.15
|
||||
```
|
||||
|
||||
**Components**:
|
||||
|
||||
| Component | Weight | Calculation | Rationale |
|-----------|--------|-------------|-----------|
| Key Valence | 40% | Major = 0.65, Minor = 0.35 | Major keys sound happier |
| BPM Valence | 25% | ≥120 BPM: `0.5 + (bpm - 120) / 200`, capped at 0.8; ≤80 BPM: `0.5 - (80 - bpm) / 100`, floored at 0.2; otherwise 0.5 | Fast tempo = upbeat |
| Brightness | 20% | `spectral_centroid * 1.5`, capped at 1.0 | Bright sounds feel positive |
| Energy | 15% | RMS energy (0-1) | Loud = energetic/positive |
|
||||
|
||||
**Code**:
|
||||
```python
|
||||
# Key contribution
|
||||
key_valence = 0.65 if scale == 'major' else 0.35
|
||||
|
||||
# BPM contribution
|
||||
if bpm >= 120:
|
||||
bpm_valence = min(0.8, 0.5 + (bpm - 120) / 200)
|
||||
elif bpm <= 80:
|
||||
bpm_valence = max(0.2, 0.5 - (80 - bpm) / 100)
|
||||
else:
|
||||
bpm_valence = 0.5
|
||||
|
||||
# Brightness contribution
|
||||
brightness_valence = min(1.0, spectral_centroid * 1.5)
|
||||
|
||||
# Final weighted sum
|
||||
result['valence'] = round(
|
||||
key_valence * 0.4 +
|
||||
bpm_valence * 0.25 +
|
||||
brightness_valence * 0.2 +
|
||||
energy * 0.15,
|
||||
3
|
||||
)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Arousal (Energy/Intensity) - Lines 520-543
|
||||
|
||||
**Formula**:
|
||||
```
|
||||
arousal = bpm_arousal * 0.35
|
||||
+ energy_arousal * 0.35
|
||||
+ brightness_arousal * 0.15
|
||||
+ compression_arousal * 0.15
|
||||
```
|
||||
|
||||
**Components**:
|
||||
|
||||
| Component | Weight | Calculation | Rationale |
|-----------|--------|-------------|-----------|
| BPM Arousal | 35% | `(bpm - 60) / 140`, clamped to 0.1-0.9 | Fast = high energy |
| Energy | 35% | RMS energy (0-1) | Loud = intense |
| Brightness | 15% | `spectral_centroid * 1.2`, capped at 1.0 | Bright = energetic |
| Compression | 15% | `1 - (dynamic_range / 20)`, clamped to 0-1 | Compressed = intense/modern |
|
||||
|
||||
**Code**:
|
||||
```python
|
||||
# BPM contribution: (bpm - 60) / 140, clamped to 0.1-0.9
|
||||
bpm_arousal = min(0.9, max(0.1, (bpm - 60) / 140))
|
||||
|
||||
# Energy is direct intensity indicator
|
||||
energy_arousal = energy
|
||||
|
||||
# Low dynamic range = compressed = more intense
|
||||
compression_arousal = max(0, min(1.0, 1 - (dynamic_range / 20)))
|
||||
|
||||
# Brightness adds perceived energy
|
||||
brightness_arousal = min(1.0, spectral_centroid * 1.2)
|
||||
|
||||
result['arousal'] = round(
|
||||
bpm_arousal * 0.35 +
|
||||
energy_arousal * 0.35 +
|
||||
brightness_arousal * 0.15 +
|
||||
compression_arousal * 0.15,
|
||||
3
|
||||
)
|
||||
```
|
||||
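The valence and arousal formulas can be wrapped in plain functions to check their behavior by hand. This is a minimal sketch that mirrors the snippets above; the function names and the sample inputs are hypothetical, and it assumes `spectral_centroid` and `energy` are already normalized to 0-1.

```python
def standard_valence(scale, bpm, spectral_centroid, energy):
    """Weighted sum of key, tempo, brightness, and energy contributions."""
    key_valence = 0.65 if scale == "major" else 0.35
    if bpm >= 120:
        bpm_valence = min(0.8, 0.5 + (bpm - 120) / 200)
    elif bpm <= 80:
        bpm_valence = max(0.2, 0.5 - (80 - bpm) / 100)
    else:
        bpm_valence = 0.5
    brightness_valence = min(1.0, spectral_centroid * 1.5)
    return round(
        key_valence * 0.4
        + bpm_valence * 0.25
        + brightness_valence * 0.2
        + energy * 0.15,
        3,
    )


def standard_arousal(bpm, energy, spectral_centroid, dynamic_range):
    """Weighted sum of tempo, energy, brightness, and compression."""
    bpm_arousal = min(0.9, max(0.1, (bpm - 60) / 140))
    compression_arousal = max(0, min(1.0, 1 - (dynamic_range / 20)))
    brightness_arousal = min(1.0, spectral_centroid * 1.2)
    return round(
        bpm_arousal * 0.35
        + energy * 0.35
        + brightness_arousal * 0.15
        + compression_arousal * 0.15,
        3,
    )


# Hypothetical track: minor key, 128.5 BPM, fairly loud, moderately compressed.
valence = standard_valence("minor", 128.5, spectral_centroid=0.2, energy=0.65)
arousal = standard_arousal(128.5, energy=0.65, spectral_centroid=0.2, dynamic_range=7.5)

# Extremes reachable with these weights.
valence_max = standard_valence("major", 180, spectral_centroid=1.0, energy=1.0)
valence_min = standard_valence("minor", 40, spectral_centroid=0.0, energy=0.0)
```

Because the key and BPM terms are bounded away from 0 and 1, valence from these weights always falls between 0.19 and 0.81, which is worth keeping in mind when choosing thresholds that consume it.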
|
||||
---
|
||||
|
||||
### Instrumentalness - Lines 545-563
|
||||
|
||||
**Approach**: Estimate likelihood of vocals vs instrumental based on spectral characteristics.
|
||||
|
||||
**Formula**:
|
||||
```
|
||||
instrumentalness = flatness_normalized * 0.6 + zcr_instrumental * 0.4
|
||||
```
|
||||
|
||||
**Components**:
|
||||
|
||||
| Component | Weight | Calculation | Rationale |
|-----------|--------|-------------|-----------|
| Spectral Flatness | 60% | `(flatness + 40) / 40`, clamped to 0-1 | Noise-like (0 dB) = instrumental; tonal (-40 dB or lower) = vocals |
| ZCR Pattern | 40% | Low (<0.05) = 0.7; High (>0.15) = 0.4; otherwise 0.5 | Sustained tones = instrumental |
|
||||
|
||||
**Code**:
|
||||
```python
|
||||
# Spectral flatness: -40dB to 0dB → 0 to 1
|
||||
flatness_normalized = min(1.0, max(0, (spectral_flatness + 40) / 40))
|
||||
|
||||
# ZCR patterns
|
||||
if zcr < 0.05:
|
||||
zcr_instrumental = 0.7 # Sustained instrumental tones
|
||||
elif zcr > 0.15:
|
||||
zcr_instrumental = 0.4 # Could be speech or percussion
|
||||
else:
|
||||
zcr_instrumental = 0.5 # Uncertain
|
||||
|
||||
result['instrumentalness'] = round(
|
||||
flatness_normalized * 0.6 + zcr_instrumental * 0.4,
|
||||
3
|
||||
)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### Acousticness - Lines 565-568
|
||||
|
||||
**Simple heuristic**: High dynamic range suggests acoustic recording (natural dynamics preserved).
|
||||
|
||||
```python
|
||||
result['acousticness'] = round(min(1.0, dynamic_range / 12), 3)
|
||||
```
|
||||
|
||||
| Dynamic Range | Acousticness | Interpretation |
|
||||
|---------------|--------------|----------------|
|
||||
| < 6 dB | < 0.5 | Heavily compressed (electronic/pop) |
|
||||
| 6-12 dB | 0.5-1.0 | Moderate (mixed) |
|
||||
| > 12 dB | 1.0 | High dynamic range (acoustic/classical) |
|
||||
|
||||
---
|
||||
|
||||
### Speechiness - Lines 570-575
|
||||
|
||||
**Approach**: Speech has characteristic ZCR + spectral centroid patterns.
|
||||
|
||||
```python
|
||||
if 0.08 < zcr < 0.2 and 0.1 < spectral_centroid < 0.4:
|
||||
result['speechiness'] = round(min(0.5, zcr * 3), 3)
|
||||
else:
|
||||
result['speechiness'] = 0.1
|
||||
```
|
||||
|
||||
| Condition | Result |
|
||||
|-----------|--------|
|
||||
| ZCR 0.08-0.2 AND centroid 0.1-0.4 | Speech-like (up to 0.5) |
|
||||
| Outside range | Low speechiness (0.1) |
|
||||
|
||||
---
|
||||
|
||||
## Mood Tag Generation (Lines 581-660)
|
||||
|
||||
Tags are derived from computed features:
|
||||
|
||||
| Condition | Tags Added |
|
||||
|-----------|------------|
|
||||
| `arousal >= 0.7` | energetic, upbeat |
|
||||
| `arousal <= 0.3` | calm, peaceful |
|
||||
| `valence >= 0.7` | happy, uplifting |
|
||||
| `valence <= 0.3` | sad, melancholic |
|
||||
| `danceability >= 0.7` | dance, groovy |
|
||||
| `bpm >= 140` | fast |
|
||||
| `bpm <= 80` | slow |
|
||||
| `keyScale == 'minor'` (and not happy) | moody |
|
||||
| `arousal >= 0.7 AND bpm >= 120` | workout |
|
||||
| `arousal <= 0.4 AND valence <= 0.4` | atmospheric |
|
||||
| `arousal <= 0.3 AND bpm <= 90` | chill |
|
||||
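The rules in the table translate directly into a small function. This sketch applies them in table order and makes two assumptions that the table does not settle: "not happy" is read as "the `valence >= 0.7` rule did not fire", and tags are not deduplicated or capped. The function name is hypothetical.

```python
def derive_mood_tags(features):
    """Apply the tag rules from the table, in table order."""
    arousal = features["arousal"]
    valence = features["valence"]
    danceability = features["danceability"]
    bpm = features["bpm"]
    tags = []

    if arousal >= 0.7:
        tags += ["energetic", "upbeat"]
    if arousal <= 0.3:
        tags += ["calm", "peaceful"]
    is_happy = valence >= 0.7  # assumption: defines "happy" for the minor-key rule
    if is_happy:
        tags += ["happy", "uplifting"]
    if valence <= 0.3:
        tags += ["sad", "melancholic"]
    if danceability >= 0.7:
        tags += ["dance", "groovy"]
    if bpm >= 140:
        tags.append("fast")
    if bpm <= 80:
        tags.append("slow")
    if features["keyScale"] == "minor" and not is_happy:
        tags.append("moody")
    if arousal >= 0.7 and bpm >= 120:
        tags.append("workout")
    if arousal <= 0.4 and valence <= 0.4:
        tags.append("atmospheric")
    if arousal <= 0.3 and bpm <= 90:
        tags.append("chill")
    return tags


# Hypothetical feature set for a danceable minor-key track.
example_tags = derive_mood_tags(
    {"arousal": 0.68, "valence": 0.42, "danceability": 0.72,
     "bpm": 128.5, "keyScale": "minor"}
)
```

Note that an arousal of 0.68 sits just under the 0.7 threshold, so this track gets no `energetic`, `upbeat`, or `workout` tag.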
|
||||
---
|
||||
|
||||
## Output Schema
|
||||
|
||||
```typescript
|
||||
interface AnalysisResult {
|
||||
// Basic features
|
||||
bpm: number; // 60-200 typical
|
||||
beatsCount: number; // Total beat count
|
||||
key: string; // "C", "D#", etc.
|
||||
keyScale: string; // "major" or "minor"
|
||||
keyStrength: number; // 0-1 confidence
|
||||
|
||||
// Energy metrics
|
||||
energy: number; // 0-1 (RMS-based)
|
||||
loudness: number; // dB
|
||||
dynamicRange: number; // dB
|
||||
|
||||
// Heuristic estimates
|
||||
danceability: number; // 0-1
|
||||
valence: number; // 0-1 (happiness)
|
||||
arousal: number; // 0-1 (energy)
|
||||
instrumentalness: number; // 0-1
|
||||
acousticness: number; // 0-1
|
||||
speechiness: number; // 0-1
|
||||
|
||||
// Derived
|
||||
moodTags: string[]; // ["calm", "peaceful", "chill"]
|
||||
analysisMode: "standard"; // Always "standard" for this mode
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Database Update (Lines 766-822)
|
||||
|
||||
All features are persisted to the `Track` table:
|
||||
|
||||
```sql
|
||||
UPDATE "Track"
|
||||
SET
|
||||
bpm = %s,
|
||||
"beatsCount" = %s,
|
||||
key = %s,
|
||||
"keyScale" = %s,
|
||||
"keyStrength" = %s,
|
||||
energy = %s,
|
||||
loudness = %s,
|
||||
"dynamicRange" = %s,
|
||||
danceability = %s,
|
||||
valence = %s,
|
||||
arousal = %s,
|
||||
instrumentalness = %s,
|
||||
acousticness = %s,
|
||||
speechiness = %s,
|
||||
"moodTags" = %s,
|
||||
"analysisMode" = 'standard',
|
||||
"analysisStatus" = 'completed',
|
||||
"analysisVersion" = %s,
|
||||
"analyzedAt" = %s
|
||||
WHERE id = %s
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Known Limitations
|
||||
|
||||
### Standard Mode vs ML Models
|
||||
|
||||
| Aspect | Standard Mode | Enhanced Mode (ML) |
|
||||
|--------|--------------|-------------------|
|
||||
| Valence accuracy | ~60% correlation | ~85% correlation |
|
||||
| Arousal accuracy | ~65% correlation | ~88% correlation |
|
||||
| Mood detection | Rule-based | Neural network |
|
||||
| Processing speed | Fast (~1-2 sec) | Slower (~5-10 sec) |
|
||||
| Dependencies | Essentia only | Essentia + TensorFlow |
|
||||
|
||||
### Edge Cases
|
||||
|
||||
1. **Ambient music**: Low BPM detection reliability
|
||||
2. **Classical**: Variable tempo causes BPM averaging issues
|
||||
3. **Spoken word**: May be misclassified as low-energy music
|
||||
4. **Electronic/EDM**: Compression detection may overestimate arousal
|
||||
|
||||
---
|
||||
|
||||
## Dependencies
|
||||
|
||||
```
|
||||
# requirements.txt
|
||||
essentia==2.1b6.dev1110
|
||||
essentia-tensorflow==2.1b6.dev1110
|
||||
numpy>=1.21.0,<2.0.0
|
||||
tensorflow==2.15.0
|
||||
redis>=4.5.0
|
||||
psycopg2-binary>=2.9.0
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Testing
|
||||
|
||||
Run single file analysis:
|
||||
```bash
|
||||
docker exec lidify_audio_analyzer python3 analyzer.py --test /music/path/to/song.mp3
|
||||
```
|
||||
|
||||
Example output:
|
||||
```json
|
||||
{
|
||||
"bpm": 128.5,
|
||||
"beatsCount": 256,
|
||||
"key": "C",
|
||||
"keyScale": "minor",
|
||||
"keyStrength": 0.723,
|
||||
"energy": 0.65,
|
||||
"loudness": -8.2,
|
||||
"dynamicRange": 7.5,
|
||||
"danceability": 0.72,
|
||||
"valence": 0.42,
|
||||
"arousal": 0.68,
|
||||
"instrumentalness": 0.35,
|
||||
"acousticness": 0.625,
|
||||
"speechiness": 0.1,
|
||||
"moodTags": ["dance", "groovy", "moody"],
|
||||
"analysisMode": "standard"
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Related Files
|
||||
|
||||
- `services/audio-analyzer/Dockerfile` - Container build
|
||||
- `backend/src/services/vibeMatching.ts` - Uses these features for song matching
|
||||
- `prisma/schema.prisma` - Track table schema with analysis columns
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,107 @@
|
||||
# Curated Vibe Mixes Implementation
|
||||
|
||||
## Overview
|
||||
|
||||
This update adds **19 new curated vibe mixes** and a **Mood-on-Demand** feature that allows users to generate custom mixes based on audio features.
|
||||
|
||||
## Bug Fix
|
||||
|
||||
Fixed the `genres` field bug - the Album model uses `genres` (JSON array) not `genre` (string). Added a helper function `findTracksByGenrePatterns()` that properly queries:
|
||||
1. Track's `lastfmTags` and `essentiaGenres` (native String[] fields)
|
||||
2. Falls back to filtering `album.genres` JSON array in application code
|
||||
|
||||
## New Daily Vibe Mixes (10 tracks each)
|
||||
|
||||
| Mix Name | Description | Key Audio Features |
|
||||
|----------|-------------|-------------------|
|
||||
| **Sad Girl Sundays** | Melancholic introspection | valence < 0.35, minor key, arousal < 0.4 |
|
||||
| **Main Character Energy** | You're the protagonist ✨ | valence > 0.55, energy > 0.55, danceability > 0.5 |
|
||||
| **Villain Era** | Dark & empowering 😈 | minor key, energy > 0.65, aggressive tags |
|
||||
| **3AM Thoughts** | Late night overthinking 🌙 | arousal < 0.35, energy < 0.45, valence < 0.45 |
|
||||
| **Hot Girl Walk** | Confident cardio 💅 | danceability > 0.65, BPM 95-135, energy > 0.55 |
|
||||
| **Rage Cleaning** | Aggressive productivity 🔥 | energy > 0.75, arousal > 0.65, BPM > 125 |
|
||||
| **Golden Hour** | Warm sunset vibes 🌅 | valence > 0.45, acousticness > 0.35, energy 0.25-0.65 |
|
||||
| **Shower Karaoke** | Belters you can't help but sing 🚿 | instrumentalness < 0.35, energy > 0.55, valence > 0.45 |
|
||||
| **In My Feelings** | Let it all out 💔 | valence < 0.4, arousal < 0.55, acousticness > 0.25 |
|
||||
| **Midnight Drive** | Late night cruising 🚗 | energy 0.35-0.65, arousal 0.25-0.55, BPM 85-125 |
|
||||
| **Coffee Shop Vibes** | Cozy background ☕ | acousticness > 0.4, energy 0.15-0.55 |
|
||||
| **Romanticize Your Life** | Aesthetic moments 🎬 | valence 0.35-0.75, arousal 0.25-0.65, acousticness > 0.25 |
|
||||
| **That Girl Era** | Self-improvement mode 💪 | valence > 0.55, energy > 0.45, danceability > 0.45 |
|
||||
| **Unhinged** | Embrace the chaos 🎪 | Extreme features (high or low everything) |
|
||||
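Each row of the table is essentially a set of feature thresholds. A minimal sketch of how one mix could be selected, assuming tracks are plain dictionaries of analysis fields; the rule representation, `matches`, and `build_mix` are hypothetical and not the code in `programmaticPlaylists.ts`. Only the "Sad Girl Sundays" thresholds and the 10-track size come from the table.

```python
import operator
import random

# "Sad Girl Sundays" from the table: valence < 0.35, minor key, arousal < 0.4
SAD_GIRL_SUNDAYS = [
    ("valence", operator.lt, 0.35),
    ("arousal", operator.lt, 0.4),
    ("keyScale", operator.eq, "minor"),
]


def matches(track, conditions):
    """True when every condition holds; unanalyzed fields never match."""
    return all(
        track.get(name) is not None and op(track[name], value)
        for name, op, value in conditions
    )


def build_mix(tracks, conditions, size=10, seed=None):
    """Pick up to `size` matching tracks in shuffled order."""
    candidates = [t for t in tracks if matches(t, conditions)]
    random.Random(seed).shuffle(candidates)
    return candidates[:size]


# Hypothetical library entries.
library = [
    {"id": 1, "valence": 0.20, "arousal": 0.30, "keyScale": "minor"},
    {"id": 2, "valence": 0.80, "arousal": 0.70, "keyScale": "major"},
    {"id": 3, "valence": 0.30, "arousal": 0.35, "keyScale": "minor"},
    {"id": 4, "valence": 0.30, "arousal": None, "keyScale": "minor"},  # not analyzed
]
mix = build_mix(library, SAD_GIRL_SUNDAYS, size=10, seed=42)
```

Treating a missing value as a non-match keeps unanalyzed tracks out of feature-based mixes; a fallback to tags would need a separate path.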
|
||||
## New Weekly Curated Mixes (20 tracks each)
|
||||
|
||||
| Mix Name | Description | Algorithm |
|
||||
|----------|-------------|-----------|
|
||||
| **Deep Cuts** | Hidden gems 💎 | Tracks with zero or few plays |
|
||||
| **Key Journey** | Harmonic progression 🎹 | Ordered by circle of fifths |
|
||||
| **Tempo Flow** | Energy arc 📈 | slow → fast → slow BPM journey |
|
||||
| **Vocal Detox** | Instrumental escape 🧘 | instrumentalness > 0.75 |
|
||||
| **Minor Key Mondays** | All minor key bangers 🖤 | keyScale = 'minor', energy > 0.45 |
|
||||
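Two of these mixes are defined by ordering rather than filtering. The sketch below shows one possible way to produce each ordering; both functions are hypothetical illustrations, and the interleaving used for the tempo arc is an assumption, not the implemented algorithm. Keys use sharp spellings, as in the analyzer output.

```python
# Circle of fifths, clockwise from C, using sharp spellings.
CIRCLE_OF_FIFTHS = ["C", "G", "D", "A", "E", "B", "F#", "C#", "G#", "D#", "A#", "F"]


def key_journey(tracks):
    """Order tracks by their key's position on the circle of fifths."""
    position = {key: i for i, key in enumerate(CIRCLE_OF_FIFTHS)}
    # Unknown keys sort last; sorted() is stable, so ties keep input order.
    return sorted(tracks, key=lambda t: position.get(t["key"], len(position)))


def tempo_flow(tracks):
    """Arrange tracks so BPM rises to a peak and then falls again."""
    by_bpm = sorted(tracks, key=lambda t: t["bpm"])
    rising = by_bpm[0::2]          # every other track, ascending
    falling = by_bpm[1::2][::-1]   # the rest, descending
    return rising + falling


# Hypothetical tracks.
keys_in = [{"key": k} for k in ["F", "C", "D", "G"]]
keys_out = [t["key"] for t in key_journey(keys_in)]

bpms_in = [{"bpm": b} for b in [100, 70, 130, 90, 120, 80, 110]]
bpms_out = [t["bpm"] for t in tempo_flow(bpms_in)]
```

Splitting the sorted list into alternating halves places the fastest track near the middle and keeps adjacent tempo jumps small on both sides of the peak.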
|
||||
## Mood-on-Demand Feature
|
||||
|
||||
### Backend Endpoints
|
||||
|
||||
- `POST /api/mixes/mood` - Generate a custom mix based on audio parameters
|
||||
- `GET /api/mixes/mood/presets` - Get available mood presets for the UI
|
||||
|
||||
### Preset Moods (12 total)
|
||||
|
||||
1. 😊 Happy & Upbeat
|
||||
2. 😢 Melancholic
|
||||
3. 😌 Chill & Relaxed
|
||||
4. ⚡ High Energy
|
||||
5. 🎯 Focus Mode
|
||||
6. 💃 Dance Party
|
||||
7. 🎸 Acoustic Vibes
|
||||
8. 🖤 Dark & Moody
|
||||
9. 💕 Romantic
|
||||
10. 💪 Workout Beast
|
||||
11. 😴 Sleep & Unwind
|
||||
12. 👑 Confidence Boost
|
||||
|
||||
### Custom Mix Builder
|
||||
|
||||
Users can adjust sliders for:
|
||||
- Happiness (valence)
|
||||
- Energy
|
||||
- Danceability
|
||||
- Tempo (BPM)
|
||||
|
||||
## Frontend Changes
|
||||
|
||||
### New Component: `MoodMixer.tsx`
|
||||
|
||||
A beautiful Spotify-esque modal with:
|
||||
- Gradient preset cards with emojis
|
||||
- Smooth animations (Framer Motion)
|
||||
- Custom range slider controls
|
||||
- Dark theme matching the app aesthetic
|
||||
|
||||
### Homepage Integration
|
||||
|
||||
Added "Mood Mixer" button next to the "Refresh" button in the "Made For You" section.
|
||||
|
||||
## Files Modified
|
||||
|
||||
### Backend
|
||||
- `backend/src/services/programmaticPlaylists.ts` - Added helper function, fixed 12 genre bugs, added 19 new mix generators
|
||||
- `backend/src/routes/mixes.ts` - Added mood endpoints and presets
|
||||
|
||||
### Frontend
|
||||
- `frontend/lib/api.ts` - Added types and API methods for mood mixing
|
||||
- `frontend/app/page.tsx` - Integrated MoodMixer modal
|
||||
- `frontend/components/MoodMixer.tsx` - New component (created)
|
||||
|
||||
## Technical Notes
|
||||
|
||||
- All mixes use Essentia audio analysis data (valence, energy, danceability, BPM, key, etc.)
|
||||
- Fallback to Last.fm tags when audio analysis is insufficient
|
||||
- Daily mixes: 10 tracks, refreshed daily
|
||||
- Weekly mixes: 20 tracks, for longer listening sessions
|
||||
- Mix generation is cached in Redis for performance
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,798 @@
|
||||
# Modified Files for Review
|
||||
|
||||
> **Last Updated:** December 16, 2025
|
||||
> **Features:** Spotify Import + UI Overhaul (Activity Panel, Carousels, Notifications, Playlist/Mix/Discover Redesign, Settings Page Redesign)
|
||||
|
||||
## Overview
|
||||
|
||||
This document tracks all files created or modified as part of:
|
||||
|
||||
1. **Spotify Import Feature** - Import Spotify playlists, match tracks, download albums
|
||||
2. **UI Overhaul** - Activity Panel, horizontal carousels, notification system
|
||||
|
||||
---
|
||||
|
||||
## Backend - New Files
|
||||
|
||||
| File | Purpose |
|
||||
| --------------------------------------------- | --------------------------------------------------------------- |
|
||||
| `backend/src/services/notificationService.ts` | Notification CRUD service with convenience methods |
|
||||
| `backend/src/services/spotifyImport.ts` | Spotify playlist import logic, track matching, album resolution |
|
||||
| `backend/src/services/spotify.ts` | Spotify API/scraping service (embed data extraction) |
|
||||
| `backend/src/routes/notifications.ts` | Notification & download history API endpoints |
|
||||
| `backend/src/routes/spotify.ts` | Spotify import API endpoints |
|
||||
| `backend/src/utils/playlistLogger.ts` | Debug logger for Spotify import jobs |
|
||||
|
||||
## Backend - Modified Files
|
||||
|
||||
| File | Changes |
|
||||
| ----------------------------------------------- | --------------------------------------------------------------------- |
|
||||
| `backend/prisma/schema.prisma` | Added `Notification` model, `DownloadJob.cleared` field |
|
||||
| `backend/src/services/simpleDownloadManager.ts` | Added notification integration, failure deduplication |
|
||||
| `backend/src/services/lidarr.ts` | Smart `anyReleaseOk` fallback, MusicBrainz fallback for artist lookup |
|
||||
| `backend/src/services/musicbrainz.ts` | Recording filtering, scoring system, title normalization |
|
||||
| `backend/src/services/spotify.ts` | Embed scraping improvements, debug logging |
|
||||
| `backend/src/index.ts` | Registered notification routes |
|
||||
|
||||
---
|
||||
|
||||
## Frontend - New Files
|
||||
|
||||
| File | Purpose |
|
||||
| ----------------------------------------------------- | ----------------------------------------------------- |
|
||||
| `frontend/components/layout/ActivityPanel.tsx` | Collapsible 3rd column with tabs, PWA install button |
|
||||
| `frontend/components/activity/NotificationsTab.tsx` | System notifications list |
|
||||
| `frontend/components/activity/ActiveDownloadsTab.tsx` | Currently downloading items |
|
||||
| `frontend/components/activity/HistoryTab.tsx` | Completed/failed with retry |
|
||||
| `frontend/components/ui/HorizontalCarousel.tsx` | Reusable carousel with arrows |
|
||||
| `frontend/hooks/useActivityPanel.ts` | Panel state management |
|
||||
| `frontend/app/import/spotify/page.tsx` | Spotify import UI page (preview, selection, progress) |
|
||||
|
||||
## Frontend - Modified Files
|
||||
|
||||
| File | Changes |
|
||||
| ------------------------------------------------------------- | ------------------------------------------------------ |
|
||||
| `frontend/components/layout/AuthenticatedLayout.tsx` | Added 3rd column, event listener for toggle |
|
||||
| `frontend/components/layout/TopBar.tsx` | Added `ActivityPanelToggle` button |
|
||||
| `frontend/components/MixCard.tsx` | Reduced padding/sizing (`p-4` → `p-2.5`) |
|
||||
| `frontend/features/home/components/ArtistsGrid.tsx` | Uses `HorizontalCarousel` |
|
||||
| `frontend/features/home/components/MixesGrid.tsx` | Uses `HorizontalCarousel` |
|
||||
| `frontend/features/home/components/ContinueListening.tsx` | Uses `HorizontalCarousel` |
|
||||
| `frontend/features/home/components/PodcastsGrid.tsx` | Uses `HorizontalCarousel` |
|
||||
| `frontend/features/home/components/HomeHero.tsx` | Already optimized (compact greeting) |
|
||||
| `frontend/lib/api.ts` | Added notification API methods, Spotify import methods |
|
||||
| `frontend/app/playlists/page.tsx` | Added "Import from Spotify" button/link |
|
||||
| `frontend/app/playlist/[id]/page.tsx` | Full Spotify-style redesign (see below) |
|
||||
| `frontend/app/mix/[id]/page.tsx` | Full Spotify-style redesign (matches playlist page) |
|
||||
| `frontend/app/discover/page.tsx` | Updated to use consistent container widths |
|
||||
| `frontend/features/discover/components/DiscoverHero.tsx` | Redesigned to match playlist/mix hero style |
|
||||
| `frontend/features/discover/components/DiscoverActionBar.tsx` | Redesigned with Lidify yellow play button |
|
||||
| `frontend/features/discover/components/TrackList.tsx` | Redesigned to match playlist/mix track listing |
|
||||
| `frontend/components/layout/Sidebar.tsx` | Removed unused icon imports |
|
||||
|
||||
---
|
||||
|
||||
## Database Changes
|
||||
|
||||
```prisma
|
||||
// NEW MODEL
|
||||
model Notification {
|
||||
id String @id @default(cuid())
|
||||
userId String
|
||||
type String // system, download_complete, playlist_ready, error, import_complete
|
||||
title String
|
||||
message String?
|
||||
metadata Json? // { playlistId, albumId, artistId, etc. }
|
||||
read Boolean @default(false)
|
||||
cleared Boolean @default(false)
|
||||
createdAt DateTime @default(now())
|
||||
|
||||
user User @relation(fields: [userId], references: [id], onDelete: Cascade)
|
||||
|
||||
@@index([userId, cleared])
|
||||
@@index([userId, read])
|
||||
@@index([createdAt])
|
||||
}
|
||||
|
||||
// MODIFIED MODEL - DownloadJob
|
||||
model DownloadJob {
|
||||
// ... existing fields ...
|
||||
cleared Boolean @default(false) // NEW: User dismissed from history
|
||||
}
|
||||
```
|
||||
|
||||
**Migration Applied:** `npx prisma db push` (syncs the schema to the database directly; no migration file is generated)
|
||||
|
||||
---
|
||||
|
||||
## API Endpoints
|
||||
|
||||
### Notifications (`/api/notifications`)
|
||||
|
||||
| Method | Endpoint | Description |
|
||||
| ------ | ------------------------------------ | ---------------------------- |
|
||||
| GET | `/notifications` | List uncleared notifications |
|
||||
| GET | `/notifications/unread-count` | Get unread count |
|
||||
| POST | `/notifications/:id/read` | Mark as read |
|
||||
| POST | `/notifications/read-all` | Mark all as read |
|
||||
| POST | `/notifications/:id/clear` | Clear (dismiss) notification |
|
||||
| POST | `/notifications/clear-all` | Clear all notifications |
|
||||
| GET | `/notifications/downloads/active` | Active downloads |
|
||||
| GET | `/notifications/downloads/history` | Completed/failed downloads |
|
||||
| POST | `/notifications/downloads/:id/clear` | Clear from history |
|
||||
| POST | `/notifications/downloads/clear-all` | Clear all history |
|
||||
| POST | `/notifications/downloads/:id/retry` | Retry failed download |
|
||||
|
||||
### Spotify Import (`/api/spotify`)
|
||||
|
||||
| Method | Endpoint | Description |
|
||||
| ------ | ---------------------------- | -------------------------------- |
|
||||
| POST | `/spotify/import/preview` | Generate import preview from URL |
|
||||
| POST | `/spotify/import/start` | Start import with selections |
|
||||
| GET | `/spotify/import/:id/status` | Get import job status |
|
||||
|
||||
---
|
||||
|
||||
## Key Bug Fixes
|
||||
|
||||
### 1. Track Matching (Spotify Import)
|
||||
|
||||
- **File:** `backend/src/services/spotifyImport.ts`
|
||||
- **Fix:** Added `stripTrackSuffix()` to remove suffixes such as "- 2011 Remaster" while keeping punctuation in the title
|
||||
- **Fix:** Added Unicode normalization for artist names (Röyksopp → Royksopp)
|
||||
- **Fix:** Multiple matching strategies (exact → stripped → fuzzy)
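The three fixes above can be sketched roughly as follows. This is a minimal illustration, not the code in `spotifyImport.ts`: only the name `stripTrackSuffix` comes from this document, while `normalizeText`, `matchTrack`, the suffix regex, the Levenshtein-based similarity, and the 0.85 threshold are all assumptions.

```typescript
interface LibraryTrack { id: string; title: string; artist: string; }
type MatchStrategy = "exact" | "stripped" | "fuzzy";

// Decompose accented characters (NFD) and drop the combining marks,
// so "Röyksopp" compares equal to "Royksopp". (Hypothetical helper.)
function normalizeText(text: string): string {
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "").toLowerCase().trim();
}

// Remove trailing " - 2011 Remaster"-style suffixes; other punctuation is kept.
// The keyword list is an assumption for illustration.
function stripTrackSuffix(title: string): string {
  return title.replace(/\s+-\s+.*\b(remaster(ed)?|version|edit|mix)\b.*$/i, "");
}

// Classic edit distance, computed one row at a time.
function levenshtein(a: string, b: string): number {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

function similarity(a: string, b: string): number {
  const maxLen = Math.max(a.length, b.length);
  return maxLen === 0 ? 1 : 1 - levenshtein(a, b) / maxLen;
}

// Try the strategies in order: exact -> stripped -> fuzzy.
function matchTrack(
  title: string,
  artist: string,
  library: LibraryTrack[],
  fuzzyThreshold = 0.85, // assumed cut-off
): { track: LibraryTrack; strategy: MatchStrategy } | null {
  const wantedArtist = normalizeText(artist);
  const candidates = library.filter((t) => normalizeText(t.artist) === wantedArtist);

  const wantedTitle = normalizeText(title);
  const exact = candidates.find((t) => normalizeText(t.title) === wantedTitle);
  if (exact) return { track: exact, strategy: "exact" };

  const strippedTitle = normalizeText(stripTrackSuffix(title));
  const stripped = candidates.find(
    (t) => normalizeText(stripTrackSuffix(t.title)) === strippedTitle,
  );
  if (stripped) return { track: stripped, strategy: "stripped" };

  let best: LibraryTrack | null = null;
  let bestScore = fuzzyThreshold;
  for (const t of candidates) {
    const score = similarity(strippedTitle, normalizeText(stripTrackSuffix(t.title)));
    if (score >= bestScore) {
      best = t;
      bestScore = score;
    }
  }
  return best ? { track: best, strategy: "fuzzy" } : null;
}
```

Ordering the strategies from strictest to loosest means a fuzzy match is only accepted when nothing more precise exists. Note that NFD stripping only handles characters with a combining-mark decomposition; letters such as "ø" pass through unchanged.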
|
||||
|
||||
### 2. MusicBrainz Album Resolution
|
||||
|
||||
- **File:** `backend/src/services/musicbrainz.ts`
|
||||
- **Fix:** Score threshold > 50 for studio albums
|
||||
- **Fix:** Recording filtering (exclude live/demo/acoustic)
|
||||
- **Fix:** Soundtrack penalty in scoring
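A rough sketch of how these three rules could combine when picking a studio album. Only the `> 50` threshold and the live/demo/acoustic exclusion come from this document; the `RecordingCandidate` shape, the penalty value, and the function names are hypothetical.

```typescript
// Hypothetical candidate shape; the real service works on MusicBrainz responses.
interface RecordingCandidate {
  title: string;         // recording title, e.g. "Song (Live)"
  album: string;         // release title the recording appears on
  score: number;         // match score, 0-100
  isSoundtrack: boolean; // whether the release is a soundtrack
}

const MIN_SCORE = 50;          // threshold from the fix list: score must be > 50
const SOUNDTRACK_PENALTY = 30; // assumed value, for illustration only

// Matches "live", "demo" or "acoustic" inside brackets or after " - ",
// so a title like "Live and Let Die" is not excluded by accident.
const EXCLUDED_VARIANT = /(\(|\[|\s-\s)[^)\]]*\b(live|demo|acoustic)\b/i;

function adjustedScore(candidate: RecordingCandidate): number {
  return candidate.isSoundtrack ? candidate.score - SOUNDTRACK_PENALTY : candidate.score;
}

// Filter out unwanted variants, apply the penalty, enforce the threshold,
// then return the best remaining candidate (or null if none qualifies).
function pickStudioRecording(candidates: RecordingCandidate[]): RecordingCandidate | null {
  const viable = candidates
    .filter((c) => !EXCLUDED_VARIANT.test(c.title))
    .map((c) => ({ candidate: c, score: adjustedScore(c) }))
    .filter((entry) => entry.score > MIN_SCORE)
    .sort((a, b) => b.score - a.score);
  return viable.length > 0 ? viable[0].candidate : null;
}
```

Applying the penalty before the threshold check means a soundtrack appearance can still win when it is the only strong candidate, but loses to a comparable studio release.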
|
||||
|
||||
### 3. Lidarr Album Addition
|
||||
|
||||
- **File:** `backend/src/services/lidarr.ts`
|
||||
- **Fix:** Smart `anyReleaseOk` fallback (try strict first, then loosen)
|
||||
- **Fix:** MusicBrainz fallback when Lidarr's metadata server fails
|
||||
- **Fix:** Immediate error when no releases found
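The strict-then-loose fallback might look roughly like this. The `addAlbum` callback and its signature are stand-ins for illustration; the actual `lidarrService.addAlbum()` signature is not shown in this document.

```typescript
// Hypothetical callback: resolves to true when the album was added.
type AddAlbumFn = (
  albumMbid: string,
  options: { anyReleaseOk: boolean },
) => Promise<boolean>;

// Try the strict request first; loosen to anyReleaseOk only if that fails.
// If neither attempt succeeds, fail immediately instead of leaving a job pending.
async function addAlbumWithFallback(
  albumMbid: string,
  addAlbum: AddAlbumFn,
): Promise<"strict" | "loose"> {
  if (await addAlbum(albumMbid, { anyReleaseOk: false })) return "strict";
  if (await addAlbum(albumMbid, { anyReleaseOk: true })) return "loose";
  throw new Error(`No releases found for album ${albumMbid}`);
}
```

Returning which mode succeeded lets the caller log how often the loose path is needed.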
|
||||
|
||||
### 4. Multiple Failure Notifications
|
||||
|
||||
- **File:** `backend/src/services/simpleDownloadManager.ts`
|
||||
- **Fix:** 30-second deduplication window for failure events
|
||||
- **Fix:** Only notify on final exhaustion, not each retry
|
||||
- **Fix:** Skip notifications for discovery/import batches
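A minimal sketch of the three rules above, assuming failures are keyed by job ID. The `FailureEvent` shape and the class name are hypothetical; only the 30-second window, the notify-on-exhaustion rule, and the batch skip come from this document.

```typescript
// Hypothetical event shape for illustration.
interface FailureEvent {
  jobId: string;
  attempt: number;     // 1-based attempt that just failed
  maxAttempts: number; // retries are exhausted when attempt === maxAttempts
  batch?: "discovery" | "import";
}

const DEDUPE_WINDOW_MS = 30 * 1000;

class FailureNotifier {
  private lastNotified = new Map<string, number>();
  private windowMs: number;

  constructor(windowMs: number = DEDUPE_WINDOW_MS) {
    this.windowMs = windowMs;
  }

  // Returns true when a notification should be created for this failure.
  shouldNotify(event: FailureEvent, now: number = Date.now()): boolean {
    if (event.batch !== undefined) return false;            // skip discovery/import batches
    if (event.attempt < event.maxAttempts) return false;    // retries remain
    const last = this.lastNotified.get(event.jobId);
    if (last !== undefined && now - last < this.windowMs) return false; // duplicate
    this.lastNotified.set(event.jobId, now);
    return true;
  }
}
```

Passing `now` explicitly keeps the window logic testable without timers. Suppressed events do not extend the window in this sketch.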
|
||||
|
||||
---
|
||||
|
||||
## Testing Checklist
|
||||
|
||||
### Activity Panel
|
||||
|
||||
- [ ] Panel opens/closes from TopBar button
|
||||
- [ ] Panel state persists in localStorage
|
||||
- [ ] Notifications tab shows system messages
|
||||
- [ ] Active tab shows downloading items (refreshes every 5s)
|
||||
- [ ] History tab shows completed/failed
|
||||
- [ ] Retry button works for failed downloads
|
||||
- [ ] Clear buttons work
|
||||
|
||||
### Home Page Carousels
|
||||
|
||||
- [ ] Horizontal scroll works
|
||||
- [ ] Arrow buttons appear on hover (desktop)
|
||||
- [ ] Snap behavior works
|
||||
- [ ] Card sizing is compact
|
||||
|
||||
### Spotify Import
|
||||
|
||||
- [ ] Preview generation works
|
||||
- [ ] Album selection works
|
||||
- [ ] Downloads start correctly
|
||||
- [ ] Track matching works after downloads
|
||||
- [ ] Playlist is created with matched tracks
|
||||
- [ ] Notification appears when complete
|
||||
|
||||
### Notifications
|
||||
|
||||
- [ ] Download complete creates notification
|
||||
- [ ] Download failed creates notification (only on exhaustion)
|
||||
- [ ] Spotify import complete creates notification
|
||||
- [ ] Unread badge shows count
|
||||
- [ ] Mark as read works
|
||||
- [ ] Clear works
|
||||
|
||||
### Playlist Page
|
||||
|
||||
- [ ] Hero section is compact with bottom-aligned content
|
||||
- [ ] Shuffle button randomizes and plays tracks
|
||||
- [ ] Track listing spans full width (no container)
|
||||
- [ ] Currently playing track is highlighted
|
||||
- [ ] Track numbers become play icons on hover
|
||||
- [ ] Album column hidden on mobile
|
||||
|
||||
### PWA Install
|
||||
|
||||
- [ ] "Install App" button appears in Activity Panel (when installable)
|
||||
- [ ] Button triggers browser install prompt
|
||||
- [ ] Button disappears after installation
|
||||
|
||||
---
|
||||
|
||||
## Rollback Instructions
|
||||
|
||||
If issues arise, revert these files:
|
||||
|
||||
```bash
|
||||
# Core files to revert for UI changes
|
||||
git checkout HEAD~1 -- frontend/components/layout/AuthenticatedLayout.tsx
|
||||
git checkout HEAD~1 -- frontend/components/layout/TopBar.tsx
|
||||
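# The next two paths are listed above as new files; if they do not exist at HEAD~1, checkout fails and `git rm` is the revert
git checkout HEAD~1 -- frontend/components/layout/ActivityPanel.tsx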
|
||||
git checkout HEAD~1 -- frontend/components/activity/
|
||||
|
||||
# For Spotify import issues
|
||||
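# spotifyImport.ts is listed above as a new file; if it does not exist at HEAD~1, use `git rm` instead
git checkout HEAD~1 -- backend/src/services/spotifyImport.ts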
|
||||
git checkout HEAD~1 -- backend/src/services/musicbrainz.ts
|
||||
git checkout HEAD~1 -- backend/src/services/lidarr.ts
|
||||
|
||||
# Database rollback (if needed)
|
||||
# Remove Notification model and DownloadJob.cleared from schema
|
||||
npx prisma db push
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
- The old `DownloadNotifications.tsx` (floating modal) still exists but is no longer imported in the layout
|
||||
- All grid components were already converted to carousels prior to this session
|
||||
- The Spotify import flow uses `lidarrService.addAlbum()` directly instead of `simpleDownloadManager` to avoid same-artist fallback
|
||||
|
||||
## Playlist Page Redesign
|
||||
|
||||
**File:** `frontend/app/playlist/[id]/page.tsx`
|
||||
|
||||
### Changes Made
|
||||
|
||||
1. **Fixed React Hooks Error** - Moved `totalDuration` useMemo before early returns
|
||||
2. **Full-Width Track Listing** - Removed container wrapper, tracks span full panel width like Spotify
|
||||
3. **Compact Hero Section** - Smaller cover art (140px/192px), bottom-aligned content, reduced title size
|
||||
4. **Added Shuffle Button** - Shuffles and plays all tracks in random order
|
||||
5. **Grid-Based Track Layout** - Columns: #, Title/Artist, Album, Duration (responsive)
|
||||
6. **Track Hover States** - Number becomes play icon on hover, row highlights
|
||||
|
||||
### PWA Install in Activity Panel
|
||||
|
||||
**File:** `frontend/components/layout/ActivityPanel.tsx`
|
||||
|
||||
- Added `beforeinstallprompt` event listener
|
||||
- "Install App" button appears at bottom of panel when PWA can be installed
|
||||
- Hides automatically when app is already installed or running in standalone mode
|
||||
|
||||
### Sidebar Cleanup
|
||||
|
||||
**File:** `frontend/components/layout/Sidebar.tsx`
|
||||
|
||||
- Removed unused icon imports (Home, Library, Sparkles, Book, Mic2)
|
||||
- Navigation items are text-only (no icons), matching the minimalist design
|
||||
|
||||
### Playlists Page Redesign
|
||||
|
||||
**File:** `frontend/app/playlists/page.tsx`
|
||||
|
||||
**Before → After:**
|
||||
|
||||
| Element | Before | After |
|
||||
| ---------------- | --------------------------------- | -------------------------------------- |
|
||||
| Header title | `text-3xl md:text-4xl font-black` | `text-2xl font-bold` |
|
||||
| Header padding | `px-6 md:px-8 py-6 md:py-8` | `px-6 pt-6 pb-4` |
|
||||
| Gradient overlay | Yellow gradient at top | Removed |
|
||||
| Import button | Green outline with icon | Solid green `bg-[#1DB954]`, no icon |
|
||||
| Hidden toggle | Icon + text, bordered | Text only, minimal style |
|
||||
| Card wrapper | `<Card>` component | Simple `<div>` with `hover:bg-white/5` |
|
||||
| Card padding | `p-4` (via Card) | `p-3` |
|
||||
| Play button | `w-12 h-12` | `w-10 h-10` |
|
||||
| Empty state | `<EmptyState>` with icons | Simple centered div |
|
||||
| Shared badge | Purple badge | Shown in subtitle instead |
|
||||
| Track count | "tracks" | "songs" (matches Spotify) |
|
||||
|
||||
**Design Philosophy:**
|
||||
|
||||
- Remove decorative icons where text suffices
|
||||
- Reduce spacing for a tighter, more professional feel
|
||||
- Use native hover states instead of custom components
|
||||
- Minimal color - let content speak
|
||||
- Match Spotify's terminology
|
||||
|
||||
---
|
||||
|
||||
## Spotify-Style Design Patterns
|
||||
|
||||
> **Use these patterns consistently across all pages for a cohesive look.**
|
||||
|
||||
### 1. Hero Sections (Albums, Playlists, Artists)
|
||||
|
||||
```
|
||||
- Compact height (max ~180px for cover on desktop)
|
||||
- Content bottom-aligned to the cover art
|
||||
- Title: text-2xl md:text-3xl font-bold (NOT text-4xl+)
|
||||
- Subtitle info: text-sm text-gray-400
|
||||
- Reduced vertical spacing (gap-2 to gap-4 max)
|
||||
- No decorative gradients overlaying the hero
|
||||
```
|
||||
|
||||
### 2. Track Listings
|
||||
|
||||
```
|
||||
- Full-width, no container card wrapping
|
||||
- Grid layout: [#] [Title/Artist] [Album] [Duration]
|
||||
- Album column: hidden on mobile (md:grid-cols-[16px_1fr_1fr_60px])
|
||||
- Hover: row bg-white/5, number → play icon
|
||||
- Playing indicator: Lidify yellow (#ecb200) on track number
|
||||
- Compact row height (~56px)
|
||||
```
|
||||
|
||||
### 3. Page Headers
|
||||
|
||||
```
|
||||
- Title: text-2xl font-bold (not text-3xl+)
|
||||
- Subtitle: text-sm text-gray-400
|
||||
- Actions: rounded-full buttons with minimal icons
|
||||
- No excessive padding (px-6 py-4 is enough)
|
||||
```
|
||||
|
||||
### 4. Cards (Albums, Artists, Playlists)
|
||||
|
||||
```
|
||||
- Compact padding: p-2.5 (not p-4)
|
||||
- Title: text-sm font-medium truncate
|
||||
- Subtitle: text-xs text-gray-500
|
||||
- Play button: bottom-right, shows on hover
|
||||
```
|
||||
|
||||
### 5. Grids → Carousels
|
||||
|
||||
```
|
||||
- Use HorizontalCarousel for content rows
|
||||
- Single horizontal line, scroll/swipe
|
||||
- Arrow buttons on hover (desktop)
|
||||
- Snap behavior for smooth scrolling
|
||||
```
|
||||
|
||||
### 6. General Typography
|
||||
|
||||
```
|
||||
- Section headers: text-lg font-semibold (not text-xl)
|
||||
- Greeting (home): text-2xl md:text-3xl font-bold tracking-tight
|
||||
- No ALL CAPS unless absolutely necessary
|
||||
- Muted subtitles: text-gray-400 or text-gray-500
|
||||
```
|
||||
|
||||
### 7. Buttons & Actions
|
||||
|
||||
```
|
||||
- Primary action: rounded-full, bg-[#ecb200] text-black
|
||||
- Secondary: bg-white/10 hover:bg-white/20
|
||||
- Icon-only buttons: rounded-full p-2
|
||||
- Minimal icon usage - text labels preferred
|
||||
```
|
||||
|
||||
### 8. Spacing Philosophy
|
||||
|
||||
```
|
||||
- Tight but breathable
|
||||
- Section gaps: gap-6 (not gap-8 or gap-10)
|
||||
- Card grids: gap-4
|
||||
- Hero to content: pt-6 (not pt-10)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Post-Implementation Fixes
|
||||
|
||||
| Date | File | Issue | Fix |
|
||||
| ---------- | --------------------------------------------------------------- | ----------------------------------------------------- | -------------------------------------------------------------------------------------- |
|
||||
| 2025-12-15 | `backend/src/routes/notifications.ts` | Wrong import path `../db` | Changed to `../utils/db` |
|
||||
| 2025-12-15 | `frontend/app/playlist/[id]/page.tsx` | React hooks order violation | Moved `useMemo` before early returns |
|
||||
| 2025-12-15 | `frontend/app/playlist/[id]/page.tsx` | `useAuth` not defined | Removed unused `isAuthenticated` |
|
||||
| 2025-12-15 | `frontend/components/layout/ActivityPanel.tsx` | Badge not clearing after clear all | Added `notifications-changed` event listener |
|
||||
| 2025-12-15 | `frontend/components/activity/NotificationsTab.tsx` | Badge not updating | Dispatch `notifications-changed` event on mutations |
|
||||
| 2025-12-15 | `backend/src/services/spotifyImport.ts` | Track matching failing (apostrophes, artist matching) | Added `normalizeApostrophes()`, changed artist match to use `contains` with first word |
|
||||
| 2025-12-15 | `frontend/app/playlists/page.tsx` | Page design not matching Spotify style | Full redesign: compact header, cleaner cards, minimal icons, refined typography |
|
||||
| 2025-12-15 | `frontend/app/import/spotify/page.tsx` | Using Music2 icon instead of Spotify logo | Uses SpotIcon.png, cleaner layout, matches style guide, removed heavy Card components |
|
||||
| 2025-12-15 | `frontend/app/import/spotify/page.tsx` | Grey/transparent gradient not matching brand | Added yellow-to-purple gradient (same as home page) with quick fade ratio (35vh/25vh) |
|
||||
| 2025-12-15 | `frontend/app/discover/page.tsx` | Container width inconsistent with hero | Added `max-w-7xl mx-auto` to track listing section |
|
||||
| 2025-12-15 | `frontend/app/mix/[id]/page.tsx` | Container width inconsistent with hero | Added `max-w-7xl mx-auto` to track listing section |
|
||||
| 2025-12-15 | `frontend/app/playlist/[id]/page.tsx` | Container width inconsistent with hero | Added `max-w-7xl mx-auto` to track listing section |
|
||||
| 2025-12-15 | `frontend/features/discover/components/*` | Discover page not matching playlist/mix design | Redesigned DiscoverHero, DiscoverActionBar, TrackList to match Spotify style |
|
||||
| 2025-12-15 | `frontend/app/library/page.tsx` | Container width not matching other pages | Removed `max-w-7xl mx-auto`, now full-width with `px-4 md:px-8` |
|
||||
| 2025-12-15 | `frontend/features/library/components/LibraryHeader.tsx` | Container width not matching other pages | Removed `max-w-7xl mx-auto`, now full-width with `px-4 md:px-8` |
|
||||
| 2025-12-15 | `frontend/app/podcasts/page.tsx` | Container width + card styling not matching | Removed `max-w-7xl mx-auto`, cleaner cards without borders/gradients |
|
||||
| 2025-12-15 | `frontend/app/audiobooks/page.tsx` | Container width not matching other pages | Removed `max-w-7xl mx-auto`, smaller header text, consistent with Spotify style |
|
||||
| 2025-12-15 | `frontend/app/artist/[id]/page.tsx` | Container width not matching other pages | Removed `max-w-7xl mx-auto`, now full-width with `px-4 md:px-8` |
|
||||
| 2025-12-15 | `frontend/app/album/[id]/page.tsx` | Container width not matching other pages | Removed `max-w-7xl mx-auto`, now full-width with `px-4 md:px-8` |
|
||||
| 2025-12-15 | `frontend/features/artist/components/ArtistHero.tsx` | Hero not matching Spotify style | Compact hero, full-width, bottom-aligned content, kept VibrantJS gradients |
|
||||
| 2025-12-15 | `frontend/features/artist/components/ArtistActionBar.tsx` | Action bar too heavy | Simplified to play button + shuffle + download, matching playlist style |
|
||||
| 2025-12-15 | `frontend/features/artist/components/PopularTracks.tsx` | Track list not matching new style | Removed Card wrapper, grid-based layout, cleaner typography |
|
||||
| 2025-12-15 | `frontend/features/artist/components/Discography.tsx` | Section header too large | Changed header from `text-2xl md:text-3xl` to `text-xl` |
|
||||
| 2025-12-15 | `frontend/features/artist/components/AvailableAlbums.tsx` | Section headers too large | Changed headers to `text-xl font-bold mb-4`, renamed sections |
|
||||
| 2025-12-15 | `frontend/features/artist/components/SimilarArtists.tsx` | Cards not matching new style | Cleaner cards with transparent bg, smaller header |
|
||||
| 2025-12-15 | `frontend/features/artist/components/ArtistBio.tsx` | Using Card component | Replaced Card with simple `bg-white/5` div |
|
||||
| 2025-12-15 | `frontend/features/album/components/AlbumHero.tsx` | Hero not matching Spotify style | Compact hero, full-width, bottom-aligned content, kept VibrantJS gradients |
|
||||
| 2025-12-15 | `frontend/features/album/components/AlbumActionBar.tsx` | Action bar too heavy | Simplified to play + shuffle + add to playlist, matching playlist style |
|
||||
| 2025-12-15 | `frontend/features/album/components/SimilarAlbums.tsx` | Section header too large | Changed header to `text-xl font-bold mb-4` |
|
||||
| 2025-12-15 | `frontend/app/artist/[id]/page.tsx` | Artist bio/about not showing | Now uses `artist.bio \|\| artist.summary` for library artists with `summary` field |
|
||||
| 2025-12-15 | `frontend/features/artist/components/ArtistBio.tsx` | Read more link not brand color | Added `[&_a]:text-[#ecb200]` for Lidify yellow links |
|
||||
| 2025-12-15 | `frontend/app/audiobooks/[id]/page.tsx` | Page design not matching Spotify style | Compact hero, yellow play button, integrated action bar, full-width layout |
|
||||
| 2025-12-15 | `frontend/features/audiobook/components/AudiobookHero.tsx` | Hero too large and dated | Compact Spotify-style hero with bottom-aligned content, VibrantJS gradients preserved |
|
||||
| 2025-12-15 | `frontend/features/audiobook/components/AudiobookActionBar.tsx` | Action bar not matching other pages | Yellow play button, inline progress, subtle action icons |
|
||||
| 2025-12-15 | `frontend/app/podcasts/[id]/page.tsx` | Page design not matching Spotify style | Compact hero, fixed height gradient (25vh), full-width layout |
|
||||
| 2025-12-15 | `frontend/features/podcast/components/PodcastHero.tsx` | Hero too large and dated | Compact Spotify-style hero with bottom-aligned content, VibrantJS gradients preserved |
|
||||
| 2025-12-15 | `frontend/features/podcast/components/PodcastActionBar.tsx` | Action bar too heavy | Yellow subscribe button, subtle RSS link, cleaner remove confirmation |
|
||||
| 2025-12-15 | `frontend/features/podcast/components/ContinueListening.tsx` | Cards not matching new style | Yellow play button, cleaner progress bar, simpler prev/next episodes |
|
||||
| 2025-12-15 | `frontend/features/podcast/components/EpisodeList.tsx` | Episode list not matching new style | Removed Card wrapper, yellow highlights, cleaner typography |
|
||||
| 2025-12-15 | `frontend/features/podcast/components/SimilarPodcasts.tsx` | Cards not matching new style | Transparent bg with hover, smaller header, cleaner layout |
|
||||
| 2025-12-15 | `frontend/features/podcast/components/PreviewEpisodes.tsx` | Cards not matching new style | Removed Card wrappers, yellow subscribe button, cleaner About section |
|
||||
|
||||
---
|
||||
|
||||
## Settings Page Redesign (December 16, 2025)
|
||||
|
||||
### Overview
|
||||
|
||||
Complete redesign of the settings page to match Spotify's clean, minimal aesthetic with:
|
||||
|
||||
- **Sidebar navigation** - Fixed sidebar with section links, active state tracking via intersection observer
|
||||
- **Single scrollable page** - All sections on one page instead of tabs
|
||||
- **Unified Spotify section** - Combined OAuth user connection + Developer API credentials
|
||||
- **Spotify-style design patterns** - Row-based layouts, clean toggles, minimal borders
|
||||
|
||||
### Database Changes
|
||||
|
||||
```prisma
|
||||
model User {
|
||||
// ... existing fields ...
|
||||
|
||||
// NEW: Spotify OAuth connection
|
||||
spotifyAccessToken String? // Encrypted OAuth access token
|
||||
spotifyRefreshToken String? // Encrypted OAuth refresh token
|
||||
spotifyTokenExpiry DateTime? // When access token expires
|
||||
spotifyUserId String? // Spotify user ID
|
||||
spotifyDisplayName String? // Display name from Spotify
|
||||
}
|
||||
```
|
||||
|
||||
### New API Endpoints
|
||||
|
||||
| Method | Endpoint | Description |
|
||||
| ------ | ------------------------------ | ----------------------------------- |
|
||||
| GET | `/api/spotify/auth/url` | Generate OAuth authorization URL |
|
||||
| GET | `/api/spotify/auth/callback` | Handle OAuth callback, store tokens |
|
||||
| POST | `/api/spotify/auth/disconnect` | Remove user's Spotify connection |
|
||||
| GET | `/api/spotify/auth/status` | Check if user is connected |
|
||||
|
||||
### New Frontend Files
|
||||
|
||||
| File | Purpose |
|
||||
| ----------------------------------------------------------------------------- | --------------------------------------- |
|
||||
| `frontend/features/settings/components/ui/SettingsLayout.tsx` | Sidebar + main content wrapper |
|
||||
| `frontend/features/settings/components/ui/SettingsSidebar.tsx` | Navigation sidebar with section links |
|
||||
| `frontend/features/settings/components/ui/SettingsSection.tsx` | Section header with separator |
|
||||
| `frontend/features/settings/components/ui/SettingsRow.tsx` | Label + description left, control right |
|
||||
| `frontend/features/settings/components/ui/SettingsToggle.tsx` | Spotify-style toggle switch |
|
||||
| `frontend/features/settings/components/ui/SettingsSelect.tsx` | Dropdown select |
|
||||
| `frontend/features/settings/components/ui/SettingsInput.tsx` | Text/password input with show/hide |
|
||||
| `frontend/features/settings/components/ui/ConnectionCard.tsx` | OAuth connection card (Spotify) |
|
||||
| `frontend/features/settings/components/ui/index.ts` | Barrel export |
|
||||
| `frontend/features/settings/components/sections/AccountSection.tsx` | Password change + 2FA |
|
||||
| `frontend/features/settings/components/sections/PlaybackSection.tsx` | Streaming quality dropdown |
|
||||
| `frontend/features/settings/components/sections/SpotifyConnectionSection.tsx` | Spotify OAuth connection |
|
||||
| `frontend/features/settings/components/sections/SpotifyAPISection.tsx` | Developer API credentials |
|
||||
| `frontend/features/settings/components/sections/CacheSection.tsx` | Cache sizes + automation toggles |
|
||||
| `frontend/features/settings/hooks/useSpotifyOAuth.ts` | OAuth state management |
|
||||
|
||||
### Modified Frontend Files
|
||||
|
||||
| File | Changes |
|
||||
| -------------------------------------------------------------------------- | ------------------------------------- |
|
||||
| `frontend/app/settings/page.tsx` | Complete redesign with sidebar layout |
|
||||
| `frontend/features/settings/components/sections/LidarrSection.tsx` | Spotify-style row layout |
|
||||
| `frontend/features/settings/components/sections/AudiobookshelfSection.tsx` | Spotify-style row layout |
|
||||
| `frontend/features/settings/components/sections/SoulseekSection.tsx` | Spotify-style row layout |
|
||||
| `frontend/features/settings/components/sections/AIServicesSection.tsx` | Spotify-style row layout |
|
||||
| `frontend/features/settings/components/sections/StoragePathsSection.tsx` | Spotify-style row layout |
|
||||
| `frontend/features/settings/components/sections/UserManagementSection.tsx` | Cleaner design, modal for delete |
|
||||
|
||||
### Modified Backend Files
|
||||
|
||||
| File | Changes |
|
||||
| ------------------------------- | ---------------------------------------- |
|
||||
| `backend/prisma/schema.prisma` | Added Spotify OAuth fields to User model |
|
||||
| `backend/src/routes/spotify.ts` | Added OAuth routes |
|
||||
|
||||
### Deleted Files (Consolidated)
|
||||
|
||||
| File | Reason |
|
||||
| ---------------------------------------------------------------------------- | --------------------------------- |
|
||||
| `frontend/features/settings/components/UserSettingsTab.tsx` | Replaced by unified settings page |
|
||||
| `frontend/features/settings/components/AccountTab.tsx` | Replaced by unified settings page |
|
||||
| `frontend/features/settings/components/SystemSettingsTab.tsx` | Replaced by unified settings page |
|
||||
| `frontend/features/settings/components/sections/ChangePasswordSection.tsx` | Merged into AccountSection |
|
||||
| `frontend/features/settings/components/sections/TwoFactorAuthSection.tsx` | Merged into AccountSection |
|
||||
| `frontend/features/settings/components/sections/PlaybackQualitySection.tsx` | Replaced by PlaybackSection |
|
||||
| `frontend/features/settings/components/sections/AdvancedSettingsSection.tsx` | Replaced by CacheSection |
|
||||
| `frontend/features/settings/components/sections/CacheSettingsSection.tsx` | Replaced by CacheSection |
|
||||
| `frontend/features/settings/components/sections/SpotifySection.tsx` | Split into Connection + API |
|
||||
|
||||
### Settings Sections
|
||||
|
||||
**All Users:** Account, Playback, Connected Services (Spotify OAuth)
|
||||
|
||||
**Admin Only:** Download Services, Media Servers, P2P Networks, AI Services, Spotify API, Storage, Cache & Automation, User Management
|
||||
|
||||
---
|
||||
|
||||
## Home Page Enhancements (December 16, 2025)
|
||||
|
||||
### New Features
|
||||
|
||||
1. **Radio Stations Section** - Compact horizontal row at the top of the home page showing random Deezer radio stations
|
||||
2. **Featured Playlists Section** - Grid showing 10 featured Deezer playlists after the Popular Artists section
|
||||
|
||||
### New Files Created
|
||||
|
||||
| File | Purpose |
|
||||
| ------------------------------------------------------ | ------------------------------------------- |
|
||||
| `frontend/features/home/components/FeaturedPlaylistsGrid.tsx` | Grid component for featured playlists |
|
||||
| `frontend/features/home/components/RadioStationsGrid.tsx` | Horizontal scroll component for radio stations |
|
||||
|
||||
### Modified Files
|
||||
|
||||
| File | Changes |
|
||||
| ---------------------------------------------------- | ------------------------------------------------ |
|
||||
| `frontend/app/page.tsx` | Added radio stations and featured playlists sections |
|
||||
| `frontend/features/home/hooks/useHomeData.ts` | Added browse data fetching for playlists/radios |
|
||||
| `frontend/hooks/useQueries.ts` | Added browse query keys and hooks |
|
||||
| `backend/src/routes/browse.ts` | Increased featured playlists limit from 50 to 200 |
|
||||
|
||||
---
|
||||
|
||||
## Notification & Sync Button Improvements (December 16, 2025)
|
||||
|
||||
### Changes
|
||||
|
||||
1. **Sync Button** - No longer shows a toast overlay; the button turns green with a spinning animation while syncing
|
||||
2. **Optimistic Notification Clearing** - Notifications are removed from the UI immediately, before the API call completes
|
||||
3. **Duplicate Key Fix** - Added a context parameter to `renderCard` on the browse page to prevent duplicate key errors
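The optimistic clearing in item 2 can be sketched as below. This is an illustration of the pattern only: `NotificationStore`, `clearNotification`, and the `apiClear` callback are hypothetical names, and the actual `NotificationsTab.tsx` may rely on its query library's mutation hooks instead.

```typescript
interface NotificationItem { id: string; title: string; read: boolean; }

// Hypothetical minimal store; in the real component this would be React/query state.
interface NotificationStore { items: NotificationItem[]; }

// Remove the item from local state first, then call the API.
// If the request fails, restore the previous list so the UI matches the server.
async function clearNotification(
  store: NotificationStore,
  id: string,
  apiClear: (id: string) => Promise<void>,
): Promise<void> {
  const previous = store.items;
  store.items = previous.filter((n) => n.id !== id); // optimistic update
  try {
    await apiClear(id);
  } catch (_err) {
    // Simple rollback; concurrent clears would need a merge rather than a restore.
    store.items = previous;
  }
}
```

The trade-off is a brief inconsistency on failure (the item reappears) in exchange for a UI that responds without waiting on the network.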
|
||||
|
||||
### Modified Files
|
||||
|
||||
| File | Changes |
|
||||
| -------------------------------------------------------- | ------------------------------------------------ |
|
||||
| `frontend/components/layout/Sidebar.tsx` | Removed toast, added green color while syncing |
|
||||
| `frontend/components/activity/NotificationsTab.tsx` | Implemented optimistic updates for all mutations |
|
||||
| `frontend/app/browse/playlists/page.tsx` | Fixed duplicate key errors with unique keys |
|
||||
|
||||
---
|
||||
|
||||
## Essentia Audio Analysis Integration (December 16, 2025)
|
||||
|
||||
### Overview
|
||||
|
||||
Integrated Essentia audio analysis to extract BPM, key, mood, energy, and other audio features from tracks. This enables intelligent mood-based mixes and personalized playlists.
|
||||
|
||||
### Database Changes
|
||||
|
||||
Added to `Track` model in `backend/prisma/schema.prisma`:
|
||||
|
||||
| Field | Type | Description |
|
||||
| ------------------ | ---------- | ------------------------------------- |
|
||||
| `bpm` | Float? | Beats per minute |
|
||||
| `beatsCount` | Int? | Total beats in track |
|
||||
| `key` | String? | Musical key (C, F#, Bb, etc.) |
|
||||
| `keyScale` | String? | "major" or "minor" |
|
||||
| `keyStrength` | Float? | Key detection confidence (0-1) |
|
||||
| `energy` | Float? | Overall energy (0-1) |
|
||||
| `loudness` | Float? | Average loudness in dB |
|
||||
| `dynamicRange` | Float? | Dynamic range in dB |
|
||||
| `danceability` | Float? | Danceability score (0-1) |
|
||||
| `valence` | Float? | Happy (1) to sad (0) |
|
||||
| `arousal` | Float? | Energetic (1) to calm (0) |
|
||||
| `instrumentalness` | Float?     | Lack of vocals (0-1, 1=instrumental)  |
|
||||
| `acousticness` | Float? | Acoustic vs electronic (0-1) |
|
||||
| `speechiness` | Float? | Spoken word content (0-1) |
|
||||
| `moodTags` | String[] | ML-classified mood tags |
|
||||
| `essentiaGenres` | String[] | ML-classified genres |
|
||||
| `lastfmTags` | String[] | User-generated mood tags from Last.fm |
|
||||
| `analysisStatus` | String | pending/processing/completed/failed |
|
||||
| `analysisVersion` | String? | Essentia version used |
|
||||
| `analyzedAt` | DateTime? | When analysis was completed |
|
||||
| `analysisError` | String? | Error message if failed |
|
||||
|
||||
### New Files
|
||||
|
||||
| File                                       | Description                               |
| ------------------------------------------ | ----------------------------------------- |
| `services/audio-analyzer/Dockerfile`       | Python 3.11 + Essentia container          |
| `services/audio-analyzer/analyzer.py`      | Main audio analysis service               |
| `services/audio-analyzer/requirements.txt` | Python dependencies                       |
| `backend/src/workers/trackEnrichment.ts`   | Last.fm tag enrichment worker             |
| `backend/src/routes/analysis.ts`           | API routes for analysis status & triggers |
|
||||
|
||||
### Modified Files
|
||||
|
||||
| File                                                         | Changes                                        |
| ------------------------------------------------------------ | ---------------------------------------------- |
| `backend/prisma/schema.prisma`                               | Added audio analysis fields to Track model     |
| `backend/src/workers/index.ts`                               | Added track enrichment worker startup/shutdown |
| `backend/src/workers/queues.ts`                              | Added `analysisQueue` for audio analysis jobs  |
| `backend/src/index.ts`                                       | Registered `/api/analysis` routes              |
| `backend/src/services/programmaticPlaylists.ts`              | Added mood-based mix generators                |
| `backend/src/routes/library.ts`                              | Added mood-based radio station filtering       |
| `frontend/features/home/components/LibraryRadioStations.tsx` | Added mood-based radio station buttons         |
| `docker-compose.yml`                                         | Added `audio-analyzer` service (optional)      |
|
||||
|
||||
### New Mix Types (Audio Analysis-Based)
|
||||
|
||||
| Mix Type          | Criteria                                      |
| ----------------- | --------------------------------------------- |
| High Energy       | energy >= 0.7, BPM >= 120                     |
| Late Night        | energy <= 0.4, BPM <= 90, low arousal         |
| Happy Vibes       | valence >= 0.6, energy >= 0.5                 |
| Melancholy        | valence <= 0.4, minor key preferred           |
| Dance Floor       | danceability >= 0.7, BPM 110-140              |
| Acoustic          | acousticness >= 0.6, energy 0.3-0.6           |
| Instrumental      | instrumentalness >= 0.7, energy 0.3-0.6       |
| Road Trip         | tags or energy 0.5-0.8, BPM 100-130           |
| Sunday Morning    | low energy, high acousticness (day-specific)  |
| Monday Motivation | high energy, high valence (day-specific)      |
| Friday Night      | high danceability, high energy (day-specific) |
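
The numeric criteria in the table translate naturally into filter predicates. The sketch below is illustrative only: `AnalyzedTrack`, `mixPredicates`, and `tracksForMix` are hypothetical names, only three of the mixes are shown, and the real generators in `programmaticPlaylists.ts` may query the database directly instead of filtering in memory.

```typescript
// Hypothetical shape: only the analysis fields these three mixes need.
interface AnalyzedTrack {
  bpm: number | null;
  energy: number | null;
  danceability: number | null;
  valence: number | null;
}

type MixPredicate = (t: AnalyzedTrack) => boolean;

// Thresholds copied from the table above.
// A track that is missing a required field never matches (an assumption).
const mixPredicates: Record<string, MixPredicate> = {
  highEnergy: (t) =>
    t.energy !== null && t.bpm !== null && t.energy >= 0.7 && t.bpm >= 120,
  happyVibes: (t) =>
    t.valence !== null && t.energy !== null && t.valence >= 0.6 && t.energy >= 0.5,
  danceFloor: (t) =>
    t.danceability !== null &&
    t.bpm !== null &&
    t.danceability >= 0.7 &&
    t.bpm >= 110 &&
    t.bpm <= 140,
};

// Returns the tracks that qualify for a mix, or an empty list for unknown mixes.
function tracksForMix(mix: string, tracks: AnalyzedTrack[]): AnalyzedTrack[] {
  const predicate = mixPredicates[mix];
  return predicate ? tracks.filter(predicate) : [];
}
```

Keeping each mix as a named predicate makes the thresholds easy to tune in one place; the fuzzier criteria ("low arousal", "minor key preferred") would need a concrete cutoff or a sort order before they could be expressed this way.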
|
||||
|
||||
### API Endpoints
|
||||
|
||||
| Method | Endpoint                     | Description                          |
| ------ | ---------------------------- | ------------------------------------ |
| GET    | `/api/analysis/status`       | Get analysis progress statistics     |
| POST   | `/api/analysis/start`        | Queue pending tracks for analysis    |
| POST   | `/api/analysis/retry-failed` | Reset failed tracks to pending       |
| POST   | `/api/analysis/analyze/:id`  | Queue specific track for analysis    |
| GET    | `/api/analysis/track/:id`    | Get analysis data for specific track |
| GET    | `/api/analysis/features`     | Get aggregated feature statistics    |
|
||||
|
||||
### Starting the Audio Analyzer
|
||||
|
||||
The audio analyzer is disabled by default. To enable it:
|
||||
|
||||
```bash
docker-compose --profile audio-analysis up -d
```
|
||||
|
||||
Or start only the analyzer service:

```bash
docker-compose up -d audio-analyzer
```
|
||||
|
||||
---
|
||||
|
||||
## Notification System Fixes (Dec 16, 2025)
|
||||
|
||||
### Issues Fixed
|
||||
|
||||
1. **Toast overlays for cache clearing and sync** - Removed toast.success overlays for "Caches cleared" and "Library scan started" since these should appear in the activity panel notification bar instead.
|
||||
|
||||
2. **Notification badge not clearing immediately** - The `useNotifications` hook wasn't responding to `notifications-changed` events. Fixed by adding an event listener that triggers a refetch.
|
||||
|
||||
3. **Settings page glitchy sidebar** - Replaced IntersectionObserver with scroll-based tracking for smoother sidebar highlighting.
|
||||
|
||||
### Modified Files
|
||||
|
||||
| File | Change |
|------|--------|
| `frontend/hooks/useNotifications.ts` | Added event listener for `notifications-changed` to trigger immediate refetch |
| `frontend/features/settings/components/sections/CacheSection.tsx` | Removed toast.success for cache clearing and sync, added local error state |
| `frontend/components/layout/TopBar.tsx` | Removed toast.success for library scan started |
| `frontend/components/layout/Sidebar.tsx` | Added `notifications-changed` event dispatch after sync |
| `frontend/features/settings/components/ui/SettingsLayout.tsx` | Replaced IntersectionObserver with throttled scroll listener for smoother sidebar tracking |
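
For readers unfamiliar with throttling, here is a minimal sketch of the idea behind the scroll listener change. It is not the code in `SettingsLayout.tsx`: `throttleCalls` is a hypothetical helper, the clock is injectable purely so the behavior can be checked without timers, and this variant drops calls that arrive inside the interval rather than deferring them.

```typescript
// Leading-edge throttle: runs `fn` at most once per `intervalMs`.
// Calls that arrive too early are dropped, not queued (a simplifying assumption).
function throttleCalls<A extends unknown[]>(
  fn: (...args: A) => void,
  intervalMs: number,
  now: () => number = Date.now,
): (...args: A) => void {
  let lastRun = -Infinity;
  return (...args: A) => {
    const current = now();
    if (current - lastRun >= intervalMs) {
      lastRun = current;
      fn(...args);
    }
  };
}

// Hypothetical usage in a browser component:
//   const onScroll = throttleCalls(updateActiveSection, 100);
//   container.addEventListener("scroll", onScroll);
```

Compared with IntersectionObserver callbacks, a throttled handler recomputes the active section at a bounded rate, which is one plausible reason the sidebar highlight stops jumping.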
|
||||
|
||||
### Behavior Changes
|
||||
|
||||
- **Sync button**: No longer shows toast overlay - progress appears in activity panel
|
||||
- **Clear caches button**: No longer shows toast overlay - implicit success (button returns to normal state)
|
||||
- **Notification badge**: Now clears immediately via optimistic updates and event system
|
||||
- **Settings sidebar**: Smoother scrolling behavior without jumpy highlights
|
||||
|
||||
---
|
||||
|
||||
## Session 8: Artist Radio Feature
|
||||
|
||||
### New Feature: Artist Radio with Hybrid Similarity Matching
|
||||
|
||||
| File | Change |
|------|--------|
| `backend/src/routes/library.ts` | Added `artist` case to `/library/radio` endpoint with hybrid matching |
| `backend/src/routes/library.ts` | Added artist name filtering to `/library/genres` endpoint |
| `frontend/features/artist/components/ArtistActionBar.tsx` | Added Radio icon button for library artists |
| `frontend/app/artist/[id]/page.tsx` | Added `handleStartRadio` function and passed to ArtistActionBar |
| `frontend/lib/api.ts` | Added `getRadioTracks()` method |
|
||||
|
||||
### Artist Radio Logic
|
||||
|
||||
The artist radio uses a **hybrid approach** with vibe boosting:
|
||||
|
||||
1. **Last.fm Similar Artists (filtered to library)**: Primary source; takes up to 15 similar artists that exist in the user's library
|
||||
2. **Genre Matching Fallback**: If < 5 similar artists, finds library artists with overlapping genres
|
||||
3. **Vibe Boost via Audio Analysis**: Scores similar artists' tracks by BPM, energy, valence, and danceability similarity
|
||||
4. **Track Mix**: ~40% from original artist, ~60% from vibe-matched similar artists
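
The thresholds in the steps above can be summarized in a small planning function. This is a sketch under assumptions, not the code in `library.ts`: `planArtistRadio` and `RadioPlan` are hypothetical names, and rounding the 40% share to a whole number of tracks is a choice made here for illustration.

```typescript
// Hypothetical summary of how a radio queue would be split.
interface RadioPlan {
  useGenreFallback: boolean; // true when fewer than 5 similar artists are in the library
  similarArtistsUsed: number; // capped at 15
  originalArtistTracks: number; // ~40% of the queue
  similarArtistTracks: number; // remaining ~60%
}

function planArtistRadio(queueSize: number, similarInLibrary: number): RadioPlan {
  const MAX_SIMILAR = 15;
  const MIN_SIMILAR = 5;
  const originalArtistTracks = Math.round(queueSize * 0.4);
  return {
    useGenreFallback: similarInLibrary < MIN_SIMILAR,
    similarArtistsUsed: Math.min(similarInLibrary, MAX_SIMILAR),
    originalArtistTracks,
    similarArtistTracks: queueSize - originalArtistTracks,
  };
}
```

For a 50-track queue with only 3 similar artists in the library, this plan asks for 20 tracks from the original artist, 30 from similar artists, and turns on the genre fallback to find more candidates.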
|
||||
|
||||
### Genre Filtering Fix
|
||||
|
||||
Artist names (like "Jamiroquai") were incorrectly showing as genres. Fixed by:
|
||||
- Fetching all artist names at query time
|
||||
- Filtering out any "genre" that matches an artist name (case-insensitive)
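
A minimal sketch of that filter, assuming the genre and artist names are already loaded as plain string arrays (`removeArtistNamesFromGenres` is a hypothetical name, not the function in `library.ts`):

```typescript
// Drops any "genre" whose name matches an artist name, ignoring case.
function removeArtistNamesFromGenres(genres: string[], artistNames: string[]): string[] {
  const artistSet = new Set(artistNames.map((name) => name.toLowerCase()));
  return genres.filter((genre) => !artistSet.has(genre.toLowerCase()));
}

// Example: "jamiroquai" is removed, real genres are kept.
const cleanedGenres = removeArtistNamesFromGenres(
  ["Funk", "jamiroquai", "Acid Jazz"],
  ["Jamiroquai", "Paramore"],
);
```

Building the lowercase set once keeps the check at one lookup per genre, which matters if the artist list is fetched on every request.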
|
||||
|
||||
### Bug Fix: Artist Radio "Unknown Artist" / No Image
|
||||
|
||||
Fixed two issues with artist radio playback:
|
||||
1. **Frontend**: Removed double-transformation of tracks - backend already returns properly formatted data
|
||||
2. **Backend**: Fixed `coverArt` to use `track.album.coverUrl` directly instead of conditional `lidarrAlbumId` check
|
||||
|
||||
---
|
||||
|
||||
## Session 9: Vibe Match Feature
|
||||
|
||||
### New Feature: "Vibe Match" Button on Media Player
|
||||
|
||||
Allows users to instantly create a queue of tracks that sound like the currently playing track.
|
||||
|
||||
| File | Change |
|------|--------|
| `backend/src/routes/library.ts` | Added `vibe` case to `/library/radio` endpoint with audio feature matching |
| `frontend/components/player/MiniPlayer.tsx` | Added Vibe button (AudioWaveform icon) with loading state |
| `frontend/components/player/FullPlayer.tsx` | Added Vibe button (AudioWaveform icon) with loading state |
|
||||
|
||||
### How Vibe Match Works
|
||||
|
||||
1. **Takes current track's audio features** (BPM, energy, valence, danceability, key, mood tags)
2. **Searches entire library** for tracks with similar audio profiles
3. **Scores matches** using weighted algorithm:
    - BPM (25%) - within ±15 BPM is ideal
    - Energy (25%)
    - Valence/mood (20%)
    - Danceability (15%)
    - Key compatibility (10%)
    - Mood tag overlap (5%)
4. **Falls back gracefully** if not enough audio matches:
    - Same artist's other tracks
    - Last.fm similar artists' tracks
    - Same genre tracks
    - Random library tracks
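
The weights in step 3 can be read as a weighted sum of per-feature similarities. The sketch below uses the documented weights, but everything else is an assumption made for illustration: the `VibeProfile` shape, the linear BPM falloff beyond ±15, treating key compatibility as an exact key-and-scale match, and using Jaccard overlap for mood tags.

```typescript
// Hypothetical profile: assumes all features are present and 0-1 normalized (except bpm).
interface VibeProfile {
  bpm: number;
  energy: number;
  valence: number;
  danceability: number;
  key: string;
  keyScale: string;
  moodTags: string[];
}

// Weights from the list above; they sum to 1.0.
const VIBE_WEIGHTS = { bpm: 0.25, energy: 0.25, valence: 0.2, danceability: 0.15, key: 0.1, moodTags: 0.05 };

// Full credit within ±15 BPM, then a linear falloff to 0 at ±60 (assumed curve).
function bpmSimilarity(a: number, b: number): number {
  const diff = Math.abs(a - b);
  return diff <= 15 ? 1 : Math.max(0, 1 - (diff - 15) / 45);
}

// Jaccard overlap of two tag lists (assumed metric).
function moodTagOverlap(a: string[], b: string[]): number {
  const setA = new Set(a);
  const setB = new Set(b);
  if (setA.size === 0 || setB.size === 0) return 0;
  let shared = 0;
  setA.forEach((tag) => {
    if (setB.has(tag)) shared++;
  });
  return shared / (setA.size + setB.size - shared);
}

function vibeMatchScore(source: VibeProfile, candidate: VibeProfile): number {
  const sameKey = source.key === candidate.key && source.keyScale === candidate.keyScale ? 1 : 0;
  return (
    VIBE_WEIGHTS.bpm * bpmSimilarity(source.bpm, candidate.bpm) +
    VIBE_WEIGHTS.energy * (1 - Math.abs(source.energy - candidate.energy)) +
    VIBE_WEIGHTS.valence * (1 - Math.abs(source.valence - candidate.valence)) +
    VIBE_WEIGHTS.danceability * (1 - Math.abs(source.danceability - candidate.danceability)) +
    VIBE_WEIGHTS.key * sameKey +
    VIBE_WEIGHTS.moodTags * moodTagOverlap(source.moodTags, candidate.moodTags)
  );
}
```

With these definitions a track compared against itself scores 1.0, and the score can be shown directly as a match percentage. A real key-compatibility measure would likely also give partial credit to related keys.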
|
||||
|
||||
### UI Location
|
||||
|
||||
The Vibe button (waveform icon) appears after the Repeat button in both:
|
||||
- MiniPlayer (sidebar player)
|
||||
- FullPlayer (bottom bar player)
|
||||
|
||||
Clicking it replaces the current queue with vibe-matched tracks and shows a toast notification.
|
||||
|
||||
---
|
||||
|
||||
## Session 9 (continued): Search Tracks Fix
|
||||
|
||||
### Bug Fix: Library Tracks Not Showing in Search
|
||||
|
||||
The backend was returning tracks in search results, but the frontend never displayed them.
|
||||
|
||||
| File | Change |
|------|--------|
| `frontend/app/search/page.tsx` | Added import for `LibraryTracksList` and section to display library tracks |
| `frontend/features/search/components/LibraryTracksList.tsx` | **New file** - Component to display library tracks in search results |
|
||||
|
||||
### Features of LibraryTracksList
|
||||
|
||||
- Shows up to 10 tracks matching the search query
|
||||
- Displays cover art, title, artist, album, and duration
|
||||
- Click to play (integrates with audio context)
|
||||
- Currently playing track highlighted in yellow
|
||||
- Artist and album names link to their respective pages
|
||||
@@ -0,0 +1,396 @@
|
||||
# Vibe Matching Algorithm Overhaul Plan
|
||||
|
||||
## Overview
|
||||
|
||||
This document outlines the plan to overhaul the vibe matching algorithm to use **cosine similarity** on a comprehensive feature vector that includes all 9 ML mood predictions, audio features, and genre/tag matching.
|
||||
|
||||
## Current State (Before Overhaul)
|
||||
|
||||
### What We Have
|
||||
- **ML Mood Predictions (9 total):**
    - `moodHappy`, `moodSad`, `moodRelaxed`, `moodAggressive` (existing)
    - `moodParty`, `moodAcoustic`, `moodElectronic` (newly added)
    - `danceabilityMl`, `aggressivenessMl` (existing)

- **Audio Features:**
    - `bpm`, `key`, `keyScale` (major/minor)
    - `energy`, `danceability`, `valence`, `arousal`
    - `instrumentalness`, `acousticness`, `speechiness`

- **Metadata:**
    - `lastfmTags` (JSON array of tag objects with name/count)
    - `essentiaGenres` (JSON array of genre strings)
    - `trackGenres` relation (linked genre records)
|
||||
|
||||
### Previous Algorithm (Weighted Manhattan Distance)
|
||||
```typescript
// Old approach - arbitrary weights, limited features
const weights = {
    energy: 1.5,
    danceability: 1.2,
    valence: 1.0,
    arousal: 1.0,
    instrumentalness: 0.8,
    bpm: 0.5,
};

let score = 0;
for (const [feature, weight] of Object.entries(weights)) {
    const diff = Math.abs(sourceTrack[feature] - candidateTrack[feature]);
    score += diff * weight;
}
// Lower score = more similar (inverted logic)
```
|
||||
|
||||
**Problems with old approach:**
|
||||
1. Only used 6 features, ignored all ML mood predictions
|
||||
2. Arbitrary weights with no scientific basis
|
||||
3. Manhattan distance less effective for high-dimensional feature spaces
|
||||
4. No genre/tag matching
|
||||
5. Score inversion was confusing
|
||||
|
||||
---
|
||||
|
||||
## New Algorithm (Cosine Similarity)
|
||||
|
||||
### Phase 1: Database Schema Update ✅
|
||||
Add new mood fields to Prisma schema:
|
||||
|
||||
```prisma
model Track {
  // ... existing fields ...

  // ML Mood Predictions (0.0-1.0)
  moodHappy      Float?
  moodSad        Float?
  moodRelaxed    Float?
  moodAggressive Float?
  moodParty      Float? // NEW
  moodAcoustic   Float? // NEW
  moodElectronic Float? // NEW

  // ... rest of schema ...
}
```
|
||||
|
||||
**Migration command:**
|
||||
```bash
cd backend
npx prisma db push --skip-generate
```
|
||||
|
||||
### Phase 2: Audio Analyzer Update ✅
|
||||
Update `services/audio-analyzer/analyzer.py` to extract and save all 7 mood predictions:
|
||||
|
||||
```python
# MusiCNN mood classifiers
mood_models = {
    'moodHappy': 'mood_happy-musicnn-msd-2',
    'moodSad': 'mood_sad-musicnn-msd-2',
    'moodRelaxed': 'mood_relaxed-musicnn-msd-2',
    'moodAggressive': 'mood_aggressive-musicnn-msd-2',
    'moodParty': 'mood_party-musicnn-msd-2',
    'moodAcoustic': 'mood_acoustic-musicnn-msd-2',
    'moodElectronic': 'mood_electronic-musicnn-msd-2',
}

# Save all to database.
# Parameterized UPDATE kept as a string (update_sql is an illustrative name);
# the remaining columns and WHERE clause are elided here.
update_sql = """
    UPDATE "Track" SET
        "moodHappy" = %s,
        "moodSad" = %s,
        "moodRelaxed" = %s,
        "moodAggressive" = %s,
        "moodParty" = %s,
        "moodAcoustic" = %s,
        "moodElectronic" = %s,
        ...
"""
```
|
||||
|
||||
### Phase 3: Feature Vector Construction
|
||||
Build a normalized feature vector for each track:
|
||||
|
||||
```typescript
interface TrackFeatures {
    // ML Moods (0-1)
    moodHappy: number | null;
    moodSad: number | null;
    moodRelaxed: number | null;
    moodAggressive: number | null;
    moodParty: number | null;
    moodAcoustic: number | null;
    moodElectronic: number | null;

    // Audio Features
    energy: number | null;
    arousal: number | null;
    danceability: number | null;
    danceabilityMl: number | null;
    instrumentalness: number | null;
    bpm: number | null;
    keyScale: string | null;

    // Metadata
    lastfmTags: any;
    essentiaGenres: any;
}

function buildFeatureVector(track: TrackFeatures): number[] {
    return [
        // 7 ML Mood predictions (indices 0-6)
        track.moodHappy ?? 0.5,
        track.moodSad ?? 0.5,
        track.moodRelaxed ?? 0.5,
        track.moodAggressive ?? 0.5,
        track.moodParty ?? 0.5,
        track.moodAcoustic ?? 0.5,
        track.moodElectronic ?? 0.5,

        // Core audio features (indices 7-10)
        track.energy ?? 0.5,
        track.arousal ?? 0.5,
        track.danceabilityMl ?? track.danceability ?? 0.5,
        track.instrumentalness ?? 0.5,

        // Normalized BPM (index 11)
        // Maps 60-180 BPM to 0-1 range
        Math.max(0, Math.min(1, ((track.bpm ?? 120) - 60) / 120)),

        // Key mode (index 12)
        // Major = 1, Minor = 0
        track.keyScale === 'major' ? 1 : 0,
    ];
}
```
|
||||
|
||||
**Feature Vector Dimensions: 13**
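
To make the BPM normalization at index 11 concrete, here is the same expression pulled out into a standalone helper (`normalizeBpm` is a hypothetical name used only for this example):

```typescript
// Maps 60-180 BPM onto 0-1, clamping values outside that range.
// A missing BPM falls back to 120, which lands in the middle of the range.
function normalizeBpm(bpm: number | null): number {
  return Math.max(0, Math.min(1, ((bpm ?? 120) - 60) / 120));
}

const bpmExamples = [40, 60, 120, 180, 200].map((bpm) => normalizeBpm(bpm));
// bpmExamples is [0, 0, 0.5, 1, 1]
```

Note the clamping: a 200 BPM track and a 180 BPM track become indistinguishable on this dimension, as do anything at or below 60 BPM.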
|
||||
|
||||
### Phase 4: Cosine Similarity Calculation
|
||||
|
||||
```typescript
function cosineSimilarity(a: number[], b: number[]): number {
    let dotProduct = 0;
    let magnitudeA = 0;
    let magnitudeB = 0;

    for (let i = 0; i < a.length; i++) {
        dotProduct += a[i] * b[i];
        magnitudeA += a[i] * a[i];
        magnitudeB += b[i] * b[i];
    }

    if (magnitudeA === 0 || magnitudeB === 0) return 0;

    return dotProduct / (Math.sqrt(magnitudeA) * Math.sqrt(magnitudeB));
}
```
|
||||
|
||||
**Properties:**
|
||||
- Returns a value between -1 and 1 (our feature vectors are non-negative, so the result is always between 0 and 1)
|
||||
- 1.0 = identical vectors (perfect match)
|
||||
- 0.0 = orthogonal vectors (no similarity)
|
||||
- Higher = better (intuitive, no inversion needed)
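
A few small inputs make these properties easier to see. The helper below is a self-contained copy of the same formula (`cosineSim` is a hypothetical name, and the length check is an addition for safety). One consequence worth knowing is that cosine similarity only compares direction, so two vectors that differ only in overall scale still score as a perfect match.

```typescript
function cosineSim(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have equal length");
  let dot = 0;
  let magA = 0;
  let magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  if (magA === 0 || magB === 0) return 0;
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

const orthogonalScore = cosineSim([1, 0], [0, 1]); // no shared direction: 0
const scaledScore = cosineSim([0.2, 0.2], [0.9, 0.9]); // same direction, different scale: ~1
const zeroScore = cosineSim([0, 0], [1, 1]); // zero vector guard: 0
```

The scaled case matters for vibe matching: a uniformly "quiet" profile and a uniformly "intense" profile point in the same direction, so any extra discrimination between them has to come from features that differ in proportion, not just in size.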
|
||||
|
||||
### Phase 5: Tag/Genre Bonus
|
||||
|
||||
Add bonus points for matching tags and genres:
|
||||
|
||||
```typescript
function calculateTagBonus(
    sourceTrack: TrackFeatures,
    candidateTrack: TrackFeatures
): number {
    let bonus = 0;

    // Extract tags
    const sourceTags = new Set<string>();
    const candidateTags = new Set<string>();

    // Parse lastfmTags
    if (Array.isArray(sourceTrack.lastfmTags)) {
        sourceTrack.lastfmTags.forEach((t: any) => {
            if (t?.name) sourceTags.add(t.name.toLowerCase());
        });
    }
    if (Array.isArray(candidateTrack.lastfmTags)) {
        candidateTrack.lastfmTags.forEach((t: any) => {
            if (t?.name) candidateTags.add(t.name.toLowerCase());
        });
    }

    // Parse essentiaGenres
    if (Array.isArray(sourceTrack.essentiaGenres)) {
        sourceTrack.essentiaGenres.forEach((g: string) => {
            sourceTags.add(g.toLowerCase());
        });
    }
    if (Array.isArray(candidateTrack.essentiaGenres)) {
        candidateTrack.essentiaGenres.forEach((g: string) => {
            candidateTags.add(g.toLowerCase());
        });
    }

    // Count overlapping tags
    let overlap = 0;
    for (const tag of sourceTags) {
        if (candidateTags.has(tag)) overlap++;
    }

    // Bonus: up to 0.1 (10%) for tag overlap
    // Normalized by the smaller set size to handle varying tag counts
    const minSize = Math.min(sourceTags.size, candidateTags.size);
    if (minSize > 0) {
        bonus = (overlap / minSize) * 0.1;
    }

    return bonus;
}
```
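
To see how the normalization by the smaller set behaves, the overlap arithmetic can be isolated into a helper that works on plain string lists. `tagOverlapBonus` is a hypothetical name for this example; it mirrors only the final computation above, not the tag parsing.

```typescript
// Bonus in [0, 0.1]: shared tags divided by the size of the smaller tag set.
function tagOverlapBonus(sourceTags: string[], candidateTags: string[]): number {
  const source = new Set(sourceTags.map((t) => t.toLowerCase()));
  const candidate = new Set(candidateTags.map((t) => t.toLowerCase()));
  let overlap = 0;
  source.forEach((tag) => {
    if (candidate.has(tag)) overlap++;
  });
  const minSize = Math.min(source.size, candidate.size);
  return minSize > 0 ? (overlap / minSize) * 0.1 : 0;
}

// 2 shared tags out of a smaller set of 2: full bonus.
const fullBonus = tagOverlapBonus(["Rock", "indie", "pop"], ["rock", "POP"]);
// 1 shared tag out of a smaller set of 2: half bonus.
const halfBonus = tagOverlapBonus(["rock", "indie", "pop", "90s"], ["rock", "jazz"]);
// A candidate with a single tag that matches also earns the full bonus.
const singleTagBonus = tagOverlapBonus(["rock", "indie", "pop"], ["rock"]);
```

The last case is a side effect of dividing by the smaller set: sparsely tagged tracks reach the maximum bonus easily. Whether that is desirable depends on how complete the tag data is across the library.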
|
||||
|
||||
### Phase 6: Final Score Calculation
|
||||
|
||||
```typescript
function calculateVibeScore(
    sourceTrack: TrackFeatures,
    candidateTrack: TrackFeatures
): number {
    // Build feature vectors
    const sourceVector = buildFeatureVector(sourceTrack);
    const candidateVector = buildFeatureVector(candidateTrack);

    // Calculate cosine similarity (0-1)
    const cosineSim = cosineSimilarity(sourceVector, candidateVector);

    // Add tag bonus (0-0.1)
    const tagBonus = calculateTagBonus(sourceTrack, candidateTrack);

    // Final score: cosine similarity + tag bonus
    // Capped at 1.0
    const finalScore = Math.min(1.0, cosineSim + tagBonus);

    return finalScore;
}
```
|
||||
|
||||
### Phase 7: Integration into Radio Endpoint
|
||||
|
||||
Update `backend/src/routes/library.ts`:
|
||||
|
||||
```typescript
// In the vibe radio section
const sourceTrack = await prisma.track.findUnique({
    where: { id: trackId },
    select: {
        moodHappy: true,
        moodSad: true,
        moodRelaxed: true,
        moodAggressive: true,
        moodParty: true,
        moodAcoustic: true,
        moodElectronic: true,
        energy: true,
        arousal: true,
        danceability: true,
        danceabilityMl: true,
        instrumentalness: true,
        bpm: true,
        keyScale: true,
        lastfmTags: true,
        essentiaGenres: true,
    },
});

// Get candidates
const candidates = await prisma.track.findMany({
    where: {
        id: { not: trackId },
        analysisStatus: 'enhanced', // Only use analyzed tracks
    },
    select: { /* same fields */ },
    take: 500, // Get more candidates for better matching
});

// Score all candidates
const scored = candidates.map(candidate => ({
    ...candidate,
    vibeScore: calculateVibeScore(sourceTrack, candidate),
}));

// Sort by score (highest first)
scored.sort((a, b) => b.vibeScore - a.vibeScore);

// Take top N for the queue
const vibeQueue = scored.slice(0, limit);

// DO NOT SHUFFLE - preserve the sorted order!
```
|
||||
|
||||
---
|
||||
|
||||
## Implementation Checklist
|
||||
|
||||
- [x] **Phase 1:** Add `moodParty`, `moodAcoustic`, `moodElectronic` to Prisma schema
|
||||
- [x] **Phase 2:** Update audio analyzer to extract all 7 moods
|
||||
- [x] **Phase 3:** Implement `buildFeatureVector()` function
|
||||
- [x] **Phase 4:** Implement `cosineSimilarity()` function
|
||||
- [x] **Phase 5:** Implement `calculateTagBonus()` function (called `computeTagBonus`)
|
||||
- [x] **Phase 6:** Implement `calculateVibeScore()` combining all components
|
||||
- [x] **Phase 7:** Integrate into `/library/radio` endpoint
|
||||
- [ ] **Phase 8:** Update frontend to display match percentage (optional enhancement)
|
||||
- [ ] **Phase 9:** Re-analyze tracks to populate new mood fields
|
||||
|
||||
---
|
||||
|
||||
## Re-Analysis Script
|
||||
|
||||
To populate the new mood fields for existing tracks:
|
||||
|
||||
```sql
-- Reset analysis status for enhanced tracks to re-run analysis
UPDATE "Track"
SET "analysisStatus" = 'pending'
WHERE "analysisStatus" = 'enhanced';
```
|
||||
|
||||
Or use the existing script:
|
||||
```bash
docker exec lidify_db psql -U lidifydb -d lidify -f /path/to/reset-analysis-for-new-moods.sql
```
|
||||
|
||||
---
|
||||
|
||||
## Expected Improvements
|
||||
|
||||
1. **Better Similarity Matching:** Cosine similarity compares the direction of two feature vectors instead of summing per-feature differences, and is a standard measure for high-dimensional feature vectors
|
||||
2. **Full ML Utilization:** The 7 ML mood predictions and ML danceability now contribute to matching (the Phase 3 feature vector does not include `aggressivenessMl`)
|
||||
3. **Genre Awareness:** Tag/genre overlap provides meaningful boost
|
||||
4. **Intuitive Scores:** Higher score = better match (no inversion)
|
||||
5. **Normalized Features:** All features scaled to 0-1 for fair comparison
|
||||
|
||||
---
|
||||
|
||||
## Testing Strategy
|
||||
|
||||
1. Pick a track with known characteristics (e.g., happy upbeat pop song)
|
||||
2. Generate vibe queue
|
||||
3. Verify top matches share similar mood profiles
|
||||
4. Check that match percentages in UI reflect actual similarity
|
||||
5. Test with various genres to ensure cross-genre matching works appropriately
|
||||
|
||||
---
|
||||
|
||||
## Files Modified
|
||||
|
||||
- `backend/prisma/schema.prisma` - New mood fields
|
||||
- `backend/src/routes/library.ts` - New scoring algorithm
|
||||
- `services/audio-analyzer/analyzer.py` - Extract all 7 moods
|
||||
- `frontend/components/player/VibeOverlay.tsx` - Display all moods
|
||||
- `frontend/lib/audio-state-context.tsx` - Extended AudioFeatures interface
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
- **Gaia:** Essentia has a companion library called Gaia for large-scale similarity search using KD-trees. This is overkill for our scale (< 100k tracks) but could be considered for future scaling.
|
||||
- **MusiCNN Limitations:** The model was trained on MSD (Million Song Dataset) which is pop/rock heavy. For classical/ambient music, predictions may be less reliable. We've added normalization to handle this.
|
||||
- **Shuffle Interaction:** Vibe mode automatically disables shuffle to preserve the sorted order.
|
||||
|
||||
@@ -0,0 +1,571 @@
|
||||
# Vibe Matching Implementation Plan
|
||||
|
||||
## Executive Summary
|
||||
|
||||
The current vibe matching system uses Essentia for audio analysis but only extracts **basic features**. Critical mood/emotion features are either placeholder values or poorly estimated. This document outlines a comprehensive plan to achieve Spotify-quality vibe matching while being conscious of performance on user hardware.
|
||||
|
||||
## Strategy Update (Latest)
|
||||
|
||||
**Default:** Enhanced mode (ML-powered, accurate)
|
||||
**Fallback:** Standard mode (lightweight, for troubleshooting or power saving)
|
||||
|
||||
**Approach:**
|
||||
1. ✅ Pre-package all Essentia TensorFlow models in Docker image (~200MB)
|
||||
2. 🔄 Fix Enhanced mode FIRST - make it actually use the ML models
|
||||
3. ⏳ THEN create Standard mode as a lightweight fallback
|
||||
4. Users can toggle to Standard mode to save CPU if needed
|
||||
|
||||
---
|
||||
|
||||
## Current State Analysis
|
||||
|
||||
### What Essentia IS Currently Extracting (Working)
|
||||
|
||||
| Feature | Status | Quality |
|---------|--------|---------|
| **BPM** | ✅ Working | Good - Uses `RhythmExtractor2013` |
| **Key** | ✅ Working | Good - Uses `KeyExtractor` |
| **KeyScale** | ✅ Working | Good - major/minor detection |
| **Energy** | ✅ Working | Moderate - Raw energy normalized |
| **Loudness** | ✅ Working | Good - dB measurement |
| **Dynamic Range** | ✅ Working | Good |
| **Danceability** | ✅ Working | Good - Uses `Danceability` algorithm |
| **Beats Count** | ✅ Working | Good |
|
||||
|
||||
### What's Broken or Placeholder
|
||||
|
||||
| Feature | Status | Problem |
|---------|--------|---------|
| **Valence** | ⚠️ Fake | Calculated as `(major/minor * 0.4) + (energy * 0.6)` - NOT actual emotional valence |
| **Arousal** | ⚠️ Fake | Calculated as `(BPM * 0.5) + (energy * 0.5)` - NOT actual arousal |
| **Instrumentalness** | ❌ Placeholder | Hardcoded to `0.5` |
| **Acousticness** | ⚠️ Estimate | Rough estimate from dynamic range |
| **Speechiness** | ❌ Placeholder | Hardcoded to `0.1` |
| **Mood Tags** | ⚠️ Derived | Generated from fake valence/arousal, not ML |
| **Genre Tags** | ❌ Empty | TensorFlow models not loaded |
|
||||
|
||||
### The Core Issue
|
||||
|
||||
```python
# Current valence calculation (analyzer.py lines 226-231)
key_valence = 0.6 if scale == 'major' else 0.4
energy_valence = result['energy']
result['valence'] = round((key_valence * 0.4 + energy_valence * 0.6), 3)
```
|
||||
|
||||
**"Fake Happy" by Paramore** (emotionally complex, about masking sadness):
|
||||
- Major key → 0.6
|
||||
- High energy → ~0.7
|
||||
- Calculated valence: `(0.6 * 0.4) + (0.7 * 0.6) = 0.66` (appears "happy")
|
||||
|
||||
**"Summer Girl" by Jamiroquai** (genuinely upbeat funk):
|
||||
- Major key → 0.6
|
||||
- High energy → ~0.7
|
||||
- Calculated valence: `(0.6 * 0.4) + (0.7 * 0.6) = 0.66` (appears "happy")
|
||||
|
||||
**Result: 97% match despite being completely different vibes!**
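
The collision is easy to reproduce. The snippet below is a TypeScript transliteration of the Python expression above (`heuristicValence` is a hypothetical name, and the energy value of 0.7 is the approximate figure used in the two examples): any two major-key tracks with the same energy receive exactly the same valence, regardless of how they actually feel.

```typescript
// Same formula as the analyzer: 40% key mode, 60% energy, rounded to 3 decimals.
function heuristicValence(scale: "major" | "minor", energy: number): number {
  const keyValence = scale === "major" ? 0.6 : 0.4;
  return Math.round((keyValence * 0.4 + energy * 0.6) * 1000) / 1000;
}

const fakeHappyValence = heuristicValence("major", 0.7); // 0.66
const summerGirlValence = heuristicValence("major", 0.7); // 0.66
const valencesCollide = fakeHappyValence === summerGirlValence; // true
```

Because key mode can move the result by at most 0.08 (the difference between 0.6 and 0.4, weighted by 0.4), this "valence" is mostly a rescaled copy of energy, which is why it cannot separate the two songs.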
|
||||
|
||||
---
|
||||
|
||||
## How Spotify Does It
|
||||
|
||||
Spotify's audio analysis uses a combination of:
|
||||
|
||||
### 1. Low-Level Audio Features (Similar to what we have)
|
||||
- Tempo/BPM
|
||||
- Key/Mode
|
||||
- Loudness
|
||||
- Time signature
|
||||
|
||||
### 2. Mid-Level Features (We're missing these)
|
||||
- **Spectral Centroid** - "brightness" of the sound
|
||||
- **Spectral Rolloff** - frequency distribution
|
||||
- **Zero Crossing Rate** - percussiveness
|
||||
- **MFCCs** - Mel-frequency cepstral coefficients (timbral texture)
|
||||
- **Chroma Features** - harmonic content
|
||||
|
||||
### 3. High-Level Features (We're faking these)
|
||||
- **Valence** - Musical positiveness (0-1)
|
||||
- **Arousal/Energy** - Intensity and activity
|
||||
- **Instrumentalness** - Vocal presence prediction
|
||||
- **Acousticness** - Acoustic vs electronic
|
||||
- **Speechiness** - Presence of spoken words
|
||||
- **Liveness** - Audience presence detection
|
||||
|
||||
### 4. Deep Learning Models
|
||||
Spotify trains neural networks on millions of labeled tracks to predict:
|
||||
- Mood categories
|
||||
- Genre classification
|
||||
- User preference patterns
|
||||
|
||||
---
|
||||
|
||||
## Two-Tier System
|
||||
|
||||
### Default: Enhanced Vibe Matching (ML-Powered)
|
||||
**Status:** DEFAULT - Pre-packaged in Docker, just works
|
||||
**Target:** High accuracy, ~5-10 seconds per track
|
||||
|
||||
**Features (from Essentia TensorFlow Models):**
|
||||
1. **Mood Predictions (real ML, not estimated):**
    - `mood_happy-discogs-effnet-1.pb` - Happiness/positivity 0-1
    - `mood_sad-discogs-effnet-1.pb` - Sadness 0-1
    - `mood_relaxed-discogs-effnet-1.pb` - Relaxation/calmness 0-1
    - `mood_aggressive-discogs-effnet-1.pb` - Aggression/intensity 0-1

2. **Audio Characteristics:**
    - `danceability-discogs-effnet-1.pb` - ML-based danceability
    - `voice_instrumental-discogs-effnet-1.pb` - Vocal detection (instrumentalness)

3. **Embeddings for Similarity:**
    - `discogs-effnet-bs64-1.pb` - Audio embeddings (neural "fingerprint")
    - Can be used for direct similarity comparison

4. **Spectral Features:**
    - Spectral Centroid (brightness)
    - MFCCs (timbral texture - 13 coefficients)
|
||||
|
||||
**Models Pre-packaged:** ~200MB in Docker image (no user download)
|
||||
**RAM Requirement:** ~500MB during analysis
|
||||
**CPU Requirement:** Any modern CPU (2015+)
|
||||
|
||||
### Fallback: Standard Vibe Matching (Lightweight)
|
||||
**Status:** FALLBACK - For troubleshooting or power saving
|
||||
**Target:** Fast, <2 seconds per track, low CPU
|
||||
|
||||
**Features Used:**
|
||||
- BPM (Essentia RhythmExtractor)
|
||||
- Energy (Essentia Energy)
|
||||
- Danceability (Essentia Danceability - non-ML version)
|
||||
- Key/Scale (Essentia KeyExtractor)
|
||||
- Spectral Centroid (cheap to compute)
|
||||
- Last.fm mood tags
|
||||
- Genre matching from tags
|
||||
|
||||
**When to use Standard mode:**
|
||||
- Low-power devices (Raspberry Pi, older NAS)
|
||||
- Troubleshooting if Enhanced mode has issues
|
||||
- User preference to save CPU cycles
|
||||
|
||||
---
|
||||
|
||||
## Implementation Plan
|
||||
|
||||
### Phase 1: Pre-Package Models in Docker (Day 1)
|
||||
|
||||
#### 1.1 Update Dockerfile to Include Models
|
||||
|
||||
```dockerfile
# Download Essentia ML models during build (~200MB)
RUN apt-get update && apt-get install -y --no-install-recommends curl && \
    # Make sure the target directory exists before curl writes into it
    mkdir -p /app/models && \
    # Base embedding model (required for all predictions)
    curl -L -o /app/models/discogs-effnet-bs64-1.pb \
        "https://essentia.upf.edu/models/feature-extractors/discogs-effnet/discogs-effnet-bs64-1.pb" && \
    # Mood models
    curl -L -o /app/models/mood_happy-discogs-effnet-1.pb \
        "https://essentia.upf.edu/models/classification-heads/mood_happy/mood_happy-discogs-effnet-1.pb" && \
    curl -L -o /app/models/mood_sad-discogs-effnet-1.pb \
        "https://essentia.upf.edu/models/classification-heads/mood_sad/mood_sad-discogs-effnet-1.pb" && \
    curl -L -o /app/models/mood_relaxed-discogs-effnet-1.pb \
        "https://essentia.upf.edu/models/classification-heads/mood_relaxed/mood_relaxed-discogs-effnet-1.pb" && \
    curl -L -o /app/models/mood_aggressive-discogs-effnet-1.pb \
        "https://essentia.upf.edu/models/classification-heads/mood_aggressive/mood_aggressive-discogs-effnet-1.pb" && \
    # Danceability and voice/instrumental
    curl -L -o /app/models/danceability-discogs-effnet-1.pb \
        "https://essentia.upf.edu/models/classification-heads/danceability/danceability-discogs-effnet-1.pb" && \
    curl -L -o /app/models/voice_instrumental-discogs-effnet-1.pb \
        "https://essentia.upf.edu/models/classification-heads/voice_instrumental/voice_instrumental-discogs-effnet-1.pb" && \
    # Arousal/Valence models
    curl -L -o /app/models/arousal-discogs-effnet-1.pb \
        "https://essentia.upf.edu/models/classification-heads/mood_arousal/mood_arousal-discogs-effnet-1.pb" && \
    curl -L -o /app/models/valence-discogs-effnet-1.pb \
        "https://essentia.upf.edu/models/classification-heads/mood_valence/mood_valence-discogs-effnet-1.pb" && \
    apt-get purge -y curl && rm -rf /var/lib/apt/lists/*
```
|
||||
|
||||
### Phase 2: Implement Enhanced Analysis (Days 2-4)
|
||||
|
||||
#### 2.1 Rewrite analyzer.py with ML Models
|
||||
|
||||
```python
import os
from typing import Any, Dict

import numpy as np

# `logger` and `ESSENTIA_AVAILABLE` are assumed to be defined at module level
# elsewhere in analyzer.py.


class AudioAnalyzer:
    """Enhanced audio analysis using Essentia TensorFlow models"""

    def __init__(self):
        self.models_loaded = False
        self.embedding_model = None
        self.mood_models = {}

        if ESSENTIA_AVAILABLE:
            self._init_essentia()
            self._load_ml_models()

    def _load_ml_models(self):
        """Load TensorFlow models for enhanced analysis"""
        try:
            from essentia.standard import (
                TensorflowPredictEffnetDiscogs,
                TensorflowPredict2D
            )

            # Load embedding extractor (base for all predictions)
            embedding_path = '/app/models/discogs-effnet-bs64-1.pb'
            if os.path.exists(embedding_path):
                self.embedding_model = TensorflowPredictEffnetDiscogs(
                    graphFilename=embedding_path,
                    output="PartitionedCall:1"
                )
                logger.info("Loaded embedding model")

            # Load mood prediction models
            mood_models = {
                'happy': '/app/models/mood_happy-discogs-effnet-1.pb',
                'sad': '/app/models/mood_sad-discogs-effnet-1.pb',
                'relaxed': '/app/models/mood_relaxed-discogs-effnet-1.pb',
                'aggressive': '/app/models/mood_aggressive-discogs-effnet-1.pb',
                'danceability': '/app/models/danceability-discogs-effnet-1.pb',
                'voice_instrumental': '/app/models/voice_instrumental-discogs-effnet-1.pb',
                'arousal': '/app/models/arousal-discogs-effnet-1.pb',
                'valence': '/app/models/valence-discogs-effnet-1.pb',
            }

            for name, path in mood_models.items():
                if os.path.exists(path):
                    self.mood_models[name] = TensorflowPredict2D(
                        graphFilename=path,
                        output="model/Softmax"
                    )
                    logger.info(f"Loaded {name} model")

            # Every prediction head consumes embeddings, so the embedding model
            # must be present as well as at least one head.
            self.models_loaded = (
                self.embedding_model is not None and len(self.mood_models) > 0
            )
            logger.info(f"ML models loaded: {self.models_loaded} ({len(self.mood_models)} models)")

        except Exception as e:
            logger.warning(f"Could not load ML models: {e}")
            self.models_loaded = False

    def analyze(self, file_path: str) -> Dict[str, Any]:
        """Full analysis with ML models if available"""
        result = self._extract_basic_features(file_path)

        if self.models_loaded:
            ml_features = self._extract_ml_features(file_path)
            result.update(ml_features)
            result['analysisMode'] = 'enhanced'
        else:
            # Fallback to estimated values
            result.update(self._estimate_mood_features(result))
            result['analysisMode'] = 'standard'

        return result

    def _extract_ml_features(self, file_path: str) -> Dict[str, Any]:
        """Extract features using TensorFlow models"""
        result = {}

        # Load audio at 16kHz for ML models
        audio = self.load_audio(file_path, sample_rate=16000)
        if audio is None:
            return result

        # Get embeddings
        embeddings = self.embedding_model(audio)

        # NOTE: column 1 is assumed to be the positive class for every head;
        # confirm the class order against each model's metadata before relying on it.

        # Mood predictions
        if 'happy' in self.mood_models:
            preds = self.mood_models['happy'](embeddings)
            result['moodHappy'] = float(np.mean(preds[:, 1]))  # Probability of "happy"

        if 'sad' in self.mood_models:
            preds = self.mood_models['sad'](embeddings)
            result['moodSad'] = float(np.mean(preds[:, 1]))

        if 'relaxed' in self.mood_models:
            preds = self.mood_models['relaxed'](embeddings)
            result['moodRelaxed'] = float(np.mean(preds[:, 1]))

        if 'aggressive' in self.mood_models:
            preds = self.mood_models['aggressive'](embeddings)
            result['moodAggressive'] = float(np.mean(preds[:, 1]))

        # Real valence and arousal from dedicated models
        if 'valence' in self.mood_models:
            preds = self.mood_models['valence'](embeddings)
            result['valence'] = float(np.mean(preds[:, 1]))

        if 'arousal' in self.mood_models:
            preds = self.mood_models['arousal'](embeddings)
            result['arousal'] = float(np.mean(preds[:, 1]))

        # Instrumentalness from voice/instrumental model
        if 'voice_instrumental' in self.mood_models:
            preds = self.mood_models['voice_instrumental'](embeddings)
            result['instrumentalness'] = float(np.mean(preds[:, 1]))  # 1 = instrumental

        # ML-based danceability
        if 'danceability' in self.mood_models:
            preds = self.mood_models['danceability'](embeddings)
            result['danceabilityMl'] = float(np.mean(preds[:, 1]))

        return result
```
|
||||
|
||||
### Phase 3: Update Database Schema (Day 3)
|
||||
|
||||
#### 3.1 Add New Feature Columns
|
||||
|
||||
```prisma
|
||||
model Track {
|
||||
// ... existing fields ...
|
||||
|
||||
// ML-based mood predictions (Enhanced mode)
|
||||
moodHappy Float? // ML prediction 0-1
|
||||
moodSad Float? // ML prediction 0-1
|
||||
moodRelaxed Float? // ML prediction 0-1
|
||||
moodAggressive Float? // ML prediction 0-1
|
||||
danceabilityMl Float? // ML-based danceability
|
||||
|
||||
// Analysis metadata
|
||||
analysisMode String? // 'standard' or 'enhanced'
|
||||
}
|
||||
```
|
||||
|
||||
### Phase 4: Update Vibe Matching Algorithm (Day 4)
|
||||
|
||||
#### 4.1 Use Real Mood Predictions in Matching
|
||||
|
||||
```typescript
|
||||
// In library.ts - Enhanced vibe matching
|
||||
const scored = analyzedTracks.map(t => {
|
||||
let score = 0;
|
||||
let factors = 0;
|
||||
|
||||
// === MOOD MATCHING (50% total - the heart of vibe) ===
|
||||
|
||||
// Happy mood (15%)
|
||||
if (sourceTrack.moodHappy !== null && t.moodHappy !== null) {
|
||||
score += (1 - Math.abs(sourceTrack.moodHappy - t.moodHappy)) * 0.15;
|
||||
factors += 0.15;
|
||||
}
|
||||
|
||||
// Sad mood (10%)
|
||||
if (sourceTrack.moodSad !== null && t.moodSad !== null) {
|
||||
score += (1 - Math.abs(sourceTrack.moodSad - t.moodSad)) * 0.10;
|
||||
factors += 0.10;
|
||||
}
|
||||
|
||||
// Relaxed mood (10%)
|
||||
if (sourceTrack.moodRelaxed !== null && t.moodRelaxed !== null) {
|
||||
score += (1 - Math.abs(sourceTrack.moodRelaxed - t.moodRelaxed)) * 0.10;
|
||||
factors += 0.10;
|
||||
}
|
||||
|
||||
// Aggressive mood (10%)
|
||||
if (sourceTrack.moodAggressive !== null && t.moodAggressive !== null) {
|
||||
score += (1 - Math.abs(sourceTrack.moodAggressive - t.moodAggressive)) * 0.10;
|
||||
factors += 0.10;
|
||||
}
|
||||
|
||||
// Valence - overall positivity (5%)
|
||||
if (sourceTrack.valence !== null && t.valence !== null) {
|
||||
score += (1 - Math.abs(sourceTrack.valence - t.valence)) * 0.05;
|
||||
factors += 0.05;
|
||||
}
|
||||
|
||||
// === AUDIO CHARACTERISTICS (35% total) ===
|
||||
|
||||
// BPM (15%) - linear falloff: 0.5 at a 15 BPM difference, 0 at 30 BPM or more
|
||||
if (sourceTrack.bpm && t.bpm) {
|
||||
const bpmDiff = Math.abs(sourceTrack.bpm - t.bpm);
|
||||
score += Math.max(0, 1 - bpmDiff / 30) * 0.15;
|
||||
factors += 0.15;
|
||||
}
|
||||
|
||||
// Energy (10%)
|
||||
if (sourceTrack.energy !== null && t.energy !== null) {
|
||||
score += (1 - Math.abs(sourceTrack.energy - t.energy)) * 0.10;
|
||||
factors += 0.10;
|
||||
}
|
||||
|
||||
// Danceability - prefer ML version (10%)
|
||||
const srcDance = sourceTrack.danceabilityMl ?? sourceTrack.danceability;
|
||||
const tDance = t.danceabilityMl ?? t.danceability;
|
||||
if (srcDance !== null && tDance !== null) {
|
||||
score += (1 - Math.abs(srcDance - tDance)) * 0.10;
|
||||
factors += 0.10;
|
||||
}
|
||||
|
||||
// === GENRE/TAGS (15% total) ===
|
||||
|
||||
// Genre/tag overlap (10%)
|
||||
const sourceGenres = [...(sourceTrack.lastfmTags || []), ...(sourceTrack.essentiaGenres || [])];
|
||||
const trackGenres = [...(t.lastfmTags || []), ...(t.essentiaGenres || [])];
|
||||
if (sourceGenres.length > 0 && trackGenres.length > 0) {
|
||||
const overlap = sourceGenres.filter(g => trackGenres.includes(g)).length;
|
||||
const maxOverlap = Math.max(sourceGenres.length, trackGenres.length);
|
||||
score += (overlap / maxOverlap) * 0.10;
|
||||
factors += 0.10;
|
||||
}
|
||||
|
||||
// Key compatibility (5%)
|
||||
if (sourceTrack.keyScale && t.keyScale) {
|
||||
score += (sourceTrack.keyScale === t.keyScale ? 1 : 0.5) * 0.05;
|
||||
factors += 0.05;
|
||||
}
|
||||
|
||||
const finalScore = factors > 0 ? score / factors : 0;
|
||||
return { id: t.id, score: finalScore };
|
||||
});
|
||||
```
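
The per-feature blocks above all follow one pattern: compute a similarity in the range 0 to 1, multiply it by a weight, and finally divide by the sum of the weights that were actually used, so that a track with missing features is not penalised. A minimal sketch of that pattern as a reusable helper follows. The names `Feature` and `weightedScore` are hypothetical and do not appear in `library.ts`.

```typescript
// Hypothetical helper illustrating the renormalised weighted average used above.
// `Feature` and `weightedScore` are illustrative names, not part of library.ts.
interface Feature {
    weight: number;            // e.g. 0.15 for moodHappy
    source: number | null;     // value on the source track
    candidate: number | null;  // value on the candidate track
    scale?: number;            // difference at which similarity reaches 0 (default 1)
}

function weightedScore(features: Feature[]): number {
    let score = 0;
    let factors = 0;
    for (const f of features) {
        // Skip features missing on either side so they do not dilute the score.
        if (f.source == null || f.candidate == null) continue;
        const diff = Math.abs(f.source - f.candidate) / (f.scale ?? 1);
        score += Math.max(0, 1 - diff) * f.weight;
        factors += f.weight;
    }
    // Renormalise by the weights that actually contributed.
    return factors > 0 ? score / factors : 0;
}

// Example: only moodHappy and BPM are known for both tracks.
const example = weightedScore([
    { weight: 0.15, source: 0.8, candidate: 0.6 },             // similarity 0.8
    { weight: 0.15, source: 120, candidate: 135, scale: 30 },  // similarity 0.5
    { weight: 0.10, source: 0.4, candidate: null },            // skipped
]);
console.log(example.toFixed(2));
```

Because the result is divided by `factors`, two tracks that share only a few features can still reach a high score; a minimum on `factors` may be worth considering if that proves too generous.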
|
||||
|
||||
### Phase 5: Create Standard Mode Fallback (Day 5)
|
||||
|
||||
After Enhanced mode is working, implement Standard mode:
|
||||
- Same algorithm structure but skip ML features
|
||||
- Use estimated valence (improved heuristics)
|
||||
- Lower weights on mood matching since it's estimated
|
||||
- Higher weights on BPM, energy, genre tags
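
One way to express these bullets is a second weight table that drops the ML-only mood fields and moves their weight to BPM, energy, and genre tags. In the sketch below only the `enhanced` row mirrors the Phase 4 weights; the `standard` numbers, the `weightsFor` helper, and its fallback rule are assumptions made for illustration.

```typescript
const FEATURES = [
    'moodHappy', 'moodSad', 'moodRelaxed', 'moodAggressive', 'valence',
    'bpm', 'energy', 'danceability', 'genre', 'key',
] as const;

type FeatureName = typeof FEATURES[number];
type AnalysisMode = 'enhanced' | 'standard';

// `enhanced` mirrors Phase 4. `standard` values are ASSUMED, not taken from the plan.
const WEIGHTS: Record<AnalysisMode, Record<FeatureName, number>> = {
    enhanced: {
        moodHappy: 0.15, moodSad: 0.10, moodRelaxed: 0.10, moodAggressive: 0.10, valence: 0.05,
        bpm: 0.15, energy: 0.10, danceability: 0.10, genre: 0.10, key: 0.05,
    },
    standard: {
        moodHappy: 0, moodSad: 0, moodRelaxed: 0, moodAggressive: 0, valence: 0.10,
        bpm: 0.25, energy: 0.20, danceability: 0.15, genre: 0.20, key: 0.10,
    },
};

// Sum of all weights for a mode; useful as a sanity check that a table adds up to 1.
function totalWeight(mode: AnalysisMode): number {
    let sum = 0;
    for (const f of FEATURES) sum += WEIGHTS[mode][f];
    return sum;
}

// Assumed rule: use the enhanced table only when both tracks were analyzed in enhanced mode.
function weightsFor(sourceMode: string | null, candidateMode: string | null): Record<FeatureName, number> {
    const bothEnhanced = sourceMode === 'enhanced' && candidateMode === 'enhanced';
    return WEIGHTS[bothEnhanced ? 'enhanced' : 'standard'];
}
```

Keeping both tables in one place makes it easy to check that each sums to 1 and to tune the Standard weights once real comparisons are available.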
|
||||
|
||||
### Phase 6: Settings & UI (Day 6)
|
||||
|
||||
#### 6.1 Add Settings Toggle
|
||||
|
||||
```typescript
|
||||
// System settings - Enhanced is DEFAULT
|
||||
{
|
||||
audioAnalysis: {
|
||||
vibeMatchingMode: 'enhanced' | 'standard', // Default: 'enhanced'
|
||||
reanalyzeOnModeChange: boolean, // Default: false
|
||||
}
|
||||
}
|
||||
```
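
The shape above can be written as a concrete type with its defaults applied when stored settings are missing or invalid. This is a sketch only: `AudioAnalysisSettings`, `DEFAULT_AUDIO_ANALYSIS`, and `resolveAudioAnalysis` are hypothetical names, and how Lidify actually stores settings is not shown in this plan.

```typescript
type VibeMatchingMode = 'enhanced' | 'standard';

interface AudioAnalysisSettings {
    vibeMatchingMode: VibeMatchingMode;
    reanalyzeOnModeChange: boolean;
}

// Defaults as stated in the plan: enhanced mode, no automatic re-analysis.
const DEFAULT_AUDIO_ANALYSIS: AudioAnalysisSettings = {
    vibeMatchingMode: 'enhanced',
    reanalyzeOnModeChange: false,
};

// Hypothetical helper: merge stored (possibly partial or invalid) values with the defaults.
function resolveAudioAnalysis(
    stored?: { vibeMatchingMode?: unknown; reanalyzeOnModeChange?: unknown } | null
): AudioAnalysisSettings {
    const mode = stored?.vibeMatchingMode;
    const flag = stored?.reanalyzeOnModeChange;
    return {
        vibeMatchingMode:
            mode === 'enhanced' || mode === 'standard'
                ? mode
                : DEFAULT_AUDIO_ANALYSIS.vibeMatchingMode,
        reanalyzeOnModeChange:
            typeof flag === 'boolean' ? flag : DEFAULT_AUDIO_ANALYSIS.reanalyzeOnModeChange,
    };
}
```

Validating the mode string here means an unknown value falls back to `enhanced` instead of reaching the matching code.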
|
||||
|
||||
#### 6.2 Settings UI
|
||||
|
||||
```
|
||||
Audio Analysis
|
||||
├── Vibe Matching Mode
|
||||
│ ├── ● Enhanced (Recommended - Default)
|
||||
│ │ └── Uses ML models for accurate mood detection
|
||||
│ └── ○ Standard (Power Saver)
|
||||
│ └── Faster, uses basic audio features only
|
||||
│
|
||||
├── Analysis Status
|
||||
│ └── "1,234 / 1,500 tracks analyzed (Enhanced mode)"
|
||||
│
|
||||
└── [Re-analyze Library] button
|
||||
└── "Re-analyze all tracks with current settings"
|
||||
```
|
||||
|
||||
### Phase 7: Testing & Validation (Day 7)
|
||||
|
||||
#### 7.1 Test Cases
|
||||
|
||||
| Source Track | Bad Match (Current) | Expected Good Match |
|--------------|---------------------|---------------------|
| "Fake Happy" (Paramore) | "Summer Girl" (Jamiroquai) 97% | Other emo/pop-punk <60% |
| "Creep" (Radiohead) | Fast dance track | Other melancholic rock |
| "Uptown Funk" | Slow ballad | Other high-energy funk/pop |
|
||||
|
||||
#### 7.2 Performance Testing
|
||||
- Analyze 100 tracks, measure time
|
||||
- Memory usage during analysis
|
||||
- Queue handling under load
|
||||
|
||||
---
|
||||
|
||||
## Database Schema Updates
|
||||
|
||||
```prisma
|
||||
model Track {
|
||||
// ... existing fields ...
|
||||
|
||||
// ML-based mood predictions (Enhanced mode)
|
||||
moodHappy Float? // ML prediction 0-1
|
||||
moodSad Float? // ML prediction 0-1
|
||||
moodRelaxed Float? // ML prediction 0-1
|
||||
moodAggressive Float? // ML prediction 0-1
|
||||
danceabilityMl Float? // ML-based danceability
|
||||
|
||||
// Analysis metadata
|
||||
analysisMode String? // 'standard' or 'enhanced'
|
||||
}
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Performance Benchmarks (Estimated)
|
||||
|
||||
| Operation | Standard Mode | Enhanced Mode |
|-----------|---------------|---------------|
| Analysis per track | 1-2 sec | 5-10 sec |
| RAM usage | ~100MB | ~500MB |
| Models in Docker | N/A | ~200MB (pre-packaged) |
| Vibe match query | <100ms | <100ms |
| Full library (1000 tracks) | ~30 min | ~2-3 hours |
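
The full-library row follows from the per-track row if tracks are analyzed one at a time. A small sketch of that arithmetic is shown here; the function name and the assumption of strictly sequential processing are illustrative, and none of these numbers are measurements.

```typescript
// Estimate sequential analysis time (in minutes) for a library,
// given a [low, high] per-track range in seconds.
function estimateLibraryMinutes(
    trackCount: number,
    secPerTrack: [number, number]
): [number, number] {
    const [low, high] = secPerTrack;
    return [(trackCount * low) / 60, (trackCount * high) / 60];
}

const standardEstimate = estimateLibraryMinutes(1000, [1, 2]);   // roughly 17 to 33 minutes
const enhancedEstimate = estimateLibraryMinutes(1000, [5, 10]);  // roughly 83 to 167 minutes

console.log(standardEstimate.map(Math.round), enhancedEstimate.map(Math.round));
```

The table's "~30 min" and "~2-3 hours" sit toward the upper end of these ranges, which is a reasonable place to be for a planning estimate.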
|
||||
|
||||
---
|
||||
|
||||
## Files to Modify
|
||||
|
||||
| File | Changes |
|------|---------|
| `services/audio-analyzer/Dockerfile` | Add model downloads during build |
| `services/audio-analyzer/analyzer.py` | Implement ML model loading and prediction |
| `backend/prisma/schema.prisma` | Add mood prediction columns |
| `backend/src/routes/library.ts` | Update vibe matching algorithm weights |
| `frontend/features/settings/` | Add analysis mode toggle (default: enhanced) |
| `frontend/components/player/VibeGraph.tsx` | Display mood predictions |
|
||||
|
||||
---
|
||||
|
||||
## Success Metrics
|
||||
|
||||
After implementation, "Fake Happy" and "Summer Girl" should:
|
||||
- Match at **<50%** (different emotional content, different genre)
|
||||
|
||||
Better matches for "Fake Happy" would be:
|
||||
- Other Paramore songs (same artist = genre/production match)
|
||||
- Emo/pop-punk with similar emotional complexity
|
||||
- Songs with high energy but mixed emotional signals
|
||||
|
||||
---
|
||||
|
||||
## Implementation Order (Enhanced First)
|
||||
|
||||
### Week 1: Get Enhanced Mode Working
|
||||
1. [x] Create implementation plan (this document)
|
||||
2. [x] Update Dockerfile to pre-package ML models (~200MB)
|
||||
3. [x] Rewrite analyzer.py with TensorFlow model loading
|
||||
4. [x] Add new database columns for mood predictions (moodHappy, moodSad, etc.)
|
||||
5. [x] Update vibe matching algorithm with ML mood weights
|
||||
6. [x] Update programmatic playlists to use ML mood predictions
|
||||
7. [ ] Run Prisma migration to apply schema changes
|
||||
8. [ ] Rebuild audio-analyzer Docker container
|
||||
9. [ ] Test ML analysis on sample tracks
|
||||
|
||||
### Week 2: Polish & Fallback
|
||||
10. [ ] Test accuracy with diverse track pairs
|
||||
11. [ ] Add settings UI (Enhanced = default)
|
||||
12. [ ] Implement Standard mode as explicit fallback option
|
||||
13. [ ] Update VibeGraph to show mood predictions
|
||||
14. [ ] Documentation and testing
|
||||
|
||||
---
|
||||
|
||||
## Quick Reference: Models to Include
|
||||
|
||||
| Model | File | Purpose | Size |
|-------|------|---------|------|
| Embeddings | `discogs-effnet-bs64-1.pb` | Base model for all predictions | ~85MB |
| Happy | `mood_happy-discogs-effnet-1.pb` | Happiness detection | ~15MB |
| Sad | `mood_sad-discogs-effnet-1.pb` | Sadness detection | ~15MB |
| Relaxed | `mood_relaxed-discogs-effnet-1.pb` | Relaxation detection | ~15MB |
| Aggressive | `mood_aggressive-discogs-effnet-1.pb` | Aggression detection | ~15MB |
| Arousal | `arousal-discogs-effnet-1.pb` (downloaded from `mood_arousal-discogs-effnet-1.pb`) | Energy/calm scale | ~15MB |
| Valence | `valence-discogs-effnet-1.pb` (downloaded from `mood_valence-discogs-effnet-1.pb`) | Positive/negative | ~15MB |
| Danceability | `danceability-discogs-effnet-1.pb` | ML danceability | ~15MB |
| Voice/Instrumental | `voice_instrumental-discogs-effnet-1.pb` | Vocal detection | ~15MB |
|
||||
|
||||
**Total:** ~200MB (one-time addition to Docker image)
|
||||
|
||||
@@ -0,0 +1,132 @@
|
||||
#!/usr/bin/env bash
|
||||
set -euo pipefail
|
||||
|
||||
# One-command predeploy test runner for Lidify.
|
||||
#
|
||||
# What it does:
|
||||
# - Starts a clean docker compose stack (core services only)
|
||||
# - Runs backend API smoke tests
|
||||
# - Runs frontend Playwright E2E smoke tests
|
||||
# - Optionally tears the stack down
|
||||
#
|
||||
# Requirements:
|
||||
# - Docker + docker compose plugin
|
||||
# - Node/npm available (to run the test runners)
|
||||
# - A MUSIC_PATH that contains at least one track if you want playback/playlist tests to pass
|
||||
#
|
||||
# Environment variables:
|
||||
# - LIDIFY_UI_BASE_URL (default: http://127.0.0.1:3030)
|
||||
# - LIDIFY_API_BASE_URL (default: http://127.0.0.1:3006)
|
||||
# - LIDIFY_TEST_USERNAME (default: predeploy)
|
||||
# - LIDIFY_TEST_PASSWORD (default: predeploy-password)
|
||||
# - LIDIFY_COMPOSE_FILE (default: docker-compose.yml)
|
||||
# - LIDIFY_COMPOSE_PROJECT (default: lidify_predeploy_<timestamp>)
|
||||
# - LIDIFY_TEARDOWN (default: 1) set to 0 to keep containers running
|
||||
|
||||
ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
|
||||
|
||||
COMPOSE_FILE="${LIDIFY_COMPOSE_FILE:-docker-compose.yml}"
|
||||
UI_BASE_URL="${LIDIFY_UI_BASE_URL:-http://127.0.0.1:3030}"
|
||||
API_BASE_URL="${LIDIFY_API_BASE_URL:-http://127.0.0.1:3006}"
|
||||
TEARDOWN="${LIDIFY_TEARDOWN:-1}"
|
||||
|
||||
PROJECT="${LIDIFY_COMPOSE_PROJECT:-lidify_predeploy_$(date +%Y%m%d_%H%M%S)}"
|
||||
|
||||
cd "$ROOT_DIR"
|
||||
|
||||
echo "[predeploy] project=$PROJECT"
|
||||
echo "[predeploy] compose=$COMPOSE_FILE"
|
||||
echo "[predeploy] ui=$UI_BASE_URL"
|
||||
echo "[predeploy] api=$API_BASE_URL"
|
||||
|
||||
if ! command -v docker >/dev/null 2>&1; then
|
||||
echo "[predeploy] ERROR: docker is not installed or not in PATH"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
if ! docker compose version >/dev/null 2>&1; then
|
||||
echo "[predeploy] ERROR: docker compose plugin not available (try: docker --version, docker compose version)"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
cleanup() {
|
||||
if [ "$TEARDOWN" = "1" ]; then
|
||||
echo "[predeploy] tearing down docker compose stack..."
|
||||
docker compose -p "$PROJECT" -f "$COMPOSE_FILE" down -v
|
||||
else
|
||||
echo "[predeploy] teardown disabled (LIDIFY_TEARDOWN=0) - leaving containers running"
|
||||
fi
|
||||
}
|
||||
trap cleanup EXIT
|
||||
|
||||
echo "[predeploy] starting docker compose (core services only)..."
|
||||
docker compose -p "$PROJECT" -f "$COMPOSE_FILE" up -d postgres redis backend frontend
|
||||
|
||||
echo "[predeploy] waiting for backend health..."
|
||||
node - <<'NODE'
|
||||
const base = (process.env.LIDIFY_API_BASE_URL || "http://127.0.0.1:3006").replace(/\/$/, "");
|
||||
const timeoutMs = 120000;
|
||||
const start = Date.now();
|
||||
|
||||
async function sleep(ms){ return new Promise(r=>setTimeout(r, ms)); }
|
||||
|
||||
(async () => {
|
||||
while (Date.now() - start < timeoutMs) {
|
||||
try {
|
||||
const res = await fetch(`${base}/health`);
|
||||
if (res.ok) process.exit(0);
|
||||
} catch {}
|
||||
await sleep(1000);
|
||||
}
|
||||
console.error(`Backend did not become healthy at ${base}/health within ${timeoutMs}ms`);
|
||||
process.exit(1);
|
||||
})();
|
||||
NODE
|
||||
|
||||
echo "[predeploy] waiting for frontend health..."
|
||||
node - <<'NODE'
|
||||
const base = (process.env.LIDIFY_UI_BASE_URL || "http://127.0.0.1:3030").replace(/\/$/, "");
|
||||
const timeoutMs = 120000;
|
||||
const start = Date.now();
|
||||
|
||||
async function sleep(ms){ return new Promise(r=>setTimeout(r, ms)); }
|
||||
|
||||
(async () => {
|
||||
while (Date.now() - start < timeoutMs) {
|
||||
try {
|
||||
const res = await fetch(`${base}/health`);
|
||||
if (res.ok) process.exit(0);
|
||||
} catch {}
|
||||
await sleep(1000);
|
||||
}
|
||||
console.error(`Frontend did not become healthy at ${base}/health within ${timeoutMs}ms`);
|
||||
process.exit(1);
|
||||
})();
|
||||
NODE
|
||||
|
||||
echo "[predeploy] running backend API smoke tests..."
|
||||
(cd backend && \
|
||||
LIDIFY_API_BASE_URL="$API_BASE_URL" \
|
||||
LIDIFY_TEST_USERNAME="${LIDIFY_TEST_USERNAME:-predeploy}" \
|
||||
LIDIFY_TEST_PASSWORD="${LIDIFY_TEST_PASSWORD:-predeploy-password}" \
|
||||
npm run test:smoke)
|
||||
|
||||
echo "[predeploy] ensuring Playwright browser is installed..."
|
||||
(cd frontend && npx playwright install chromium)
|
||||
|
||||
echo "[predeploy] running frontend E2E smoke tests..."
|
||||
(cd frontend && \
|
||||
LIDIFY_UI_BASE_URL="$UI_BASE_URL" \
|
||||
LIDIFY_TEST_USERNAME="${LIDIFY_TEST_USERNAME:-predeploy}" \
|
||||
LIDIFY_TEST_PASSWORD="${LIDIFY_TEST_PASSWORD:-predeploy-password}" \
|
||||
npm run test:e2e)
|
||||
|
||||
echo "[predeploy] PASS"
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,17 @@
|
||||
-- Reset all enhanced tracks for re-analysis to populate new mood fields
|
||||
-- (moodParty, moodAcoustic, moodElectronic)
|
||||
|
||||
-- Option 1: Reset only enhanced tracks (faster - already have ML models loaded)
|
||||
UPDATE "Track"
|
||||
SET
|
||||
"analysisStatus" = 'pending',
|
||||
"moodParty" = NULL,
|
||||
"moodAcoustic" = NULL,
|
||||
"moodElectronic" = NULL
|
||||
WHERE "analysisMode" = 'enhanced';
|
||||
|
||||
-- Check how many tracks are now pending re-analysis (this also counts tracks that were already pending before the reset)
|
||||
SELECT COUNT(*) as tracks_to_reanalyze FROM "Track" WHERE "analysisStatus" = 'pending';
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,222 @@
|
||||
/**
|
||||
* Lidify predeploy smoke tests (API-level).
|
||||
*
|
||||
* Goals:
|
||||
* - deterministic, fast "is the app basically working?" checks
|
||||
* - no build step (runs via tsx)
|
||||
*
|
||||
* Usage:
|
||||
* LIDIFY_API_BASE_URL=http://127.0.0.1:3006 \
|
||||
* LIDIFY_TEST_USERNAME=predeploy \
|
||||
* LIDIFY_TEST_PASSWORD=predeploy-password \
|
||||
* npm run test:smoke
|
||||
*/
|
||||
|
||||
type Json = any;
|
||||
|
||||
const API_BASE_URL = (process.env.LIDIFY_API_BASE_URL || "http://127.0.0.1:3006").replace(/\/$/, "");
|
||||
const USERNAME = process.env.LIDIFY_TEST_USERNAME || "predeploy";
|
||||
const PASSWORD = process.env.LIDIFY_TEST_PASSWORD || "predeploy-password";
|
||||
|
||||
const WAIT_MS = Number(process.env.LIDIFY_SMOKE_WAIT_MS || "60000"); // total budget
|
||||
const POLL_INTERVAL_MS = Number(process.env.LIDIFY_SMOKE_POLL_INTERVAL_MS || "1000");
|
||||
|
||||
function sleep(ms: number) {
|
||||
return new Promise((r) => setTimeout(r, ms));
|
||||
}
|
||||
|
||||
function assert(condition: any, message: string): asserts condition {
|
||||
if (!condition) throw new Error(message);
|
||||
}
|
||||
|
||||
async function fetchJson(
|
||||
path: string,
|
||||
opts: RequestInit & { token?: string } = {}
|
||||
): Promise<{ status: number; ok: boolean; json: Json }> {
|
||||
const url = `${API_BASE_URL}${path}`;
|
||||
const headers: Record<string, string> = {
|
||||
"Content-Type": "application/json",
|
||||
...(opts.headers as any),
|
||||
};
|
||||
if (opts.token) headers.Authorization = `Bearer ${opts.token}`;
|
||||
|
||||
const res = await fetch(url, { ...opts, headers });
|
||||
const json = await res.json().catch(() => ({}));
|
||||
return { status: res.status, ok: res.ok, json };
|
||||
}
|
||||
|
||||
async function waitForHealth() {
|
||||
const start = Date.now();
|
||||
let lastErr: any = null;
|
||||
|
||||
while (Date.now() - start < WAIT_MS) {
|
||||
try {
|
||||
const res = await fetch(`${API_BASE_URL}/health`);
|
||||
if (res.ok) return;
|
||||
lastErr = new Error(`health returned ${res.status}`);
|
||||
} catch (e) {
|
||||
lastErr = e;
|
||||
}
|
||||
await sleep(POLL_INTERVAL_MS);
|
||||
}
|
||||
|
||||
throw new Error(
|
||||
`Backend did not become healthy at ${API_BASE_URL}/health within ${WAIT_MS}ms. Last error: ${
|
||||
lastErr instanceof Error ? lastErr.message : String(lastErr)
|
||||
}`
|
||||
);
|
||||
}
|
||||
|
||||
async function ensureTestUserAndToken(): Promise<string> {
|
||||
// Prefer onboarding/register because it's available without admin and works even when users exist.
|
||||
const register = await fetchJson("/api/onboarding/register", {
|
||||
method: "POST",
|
||||
body: JSON.stringify({ username: USERNAME, password: PASSWORD }),
|
||||
});
|
||||
|
||||
if (register.ok && register.json?.token) {
|
||||
return register.json.token as string;
|
||||
}
|
||||
|
||||
// If user already exists, login.
|
||||
const login = await fetchJson("/api/auth/login", {
|
||||
method: "POST",
|
||||
body: JSON.stringify({ username: USERNAME, password: PASSWORD }),
|
||||
});
|
||||
assert(login.ok, `Login failed: status=${login.status} body=${JSON.stringify(login.json)}`);
|
||||
assert(login.json?.token, `Login did not return token: ${JSON.stringify(login.json)}`);
|
||||
return login.json.token as string;
|
||||
}
|
||||
|
||||
async function completeOnboarding(token: string) {
|
||||
const res = await fetchJson("/api/onboarding/complete", {
|
||||
method: "POST",
|
||||
token,
|
||||
});
|
||||
// It's fine if it's already complete; endpoint should still succeed.
|
||||
assert(res.ok, `Onboarding complete failed: status=${res.status} body=${JSON.stringify(res.json)}`);
|
||||
}
|
||||
|
||||
async function getOneTrackId(token: string): Promise<string | null> {
|
||||
const tracks = await fetchJson("/api/library/tracks?limit=1&offset=0", { method: "GET", token });
|
||||
assert(tracks.ok, `Fetch tracks failed: status=${tracks.status} body=${JSON.stringify(tracks.json)}`);
|
||||
const id = tracks.json?.tracks?.[0]?.id;
|
||||
return typeof id === "string" ? id : null;
|
||||
}
|
||||
|
||||
async function scanLibraryIfNeeded(token: string) {
|
||||
// If you already have at least one track, don’t force a scan (keeps it fast).
|
||||
const existing = await getOneTrackId(token);
|
||||
if (existing) return;
|
||||
|
||||
const scan = await fetchJson("/api/library/scan", { method: "POST", token });
|
||||
assert(scan.ok, `Library scan start failed: status=${scan.status} body=${JSON.stringify(scan.json)}`);
|
||||
const jobId = scan.json?.jobId;
|
||||
assert(typeof jobId === "string", `Library scan did not return jobId: ${JSON.stringify(scan.json)}`);
|
||||
|
||||
const start = Date.now();
|
||||
while (Date.now() - start < WAIT_MS) {
|
||||
const status = await fetchJson(`/api/library/scan/status/${jobId}`, { method: "GET", token });
|
||||
assert(status.ok, `Library scan status failed: status=${status.status} body=${JSON.stringify(status.json)}`);
|
||||
const s = status.json?.status;
|
||||
if (s === "completed" || s === "complete" || s === "done" || s === "success") return;
|
||||
if (s === "failed" || s === "error") {
|
||||
throw new Error(`Library scan failed: ${JSON.stringify(status.json)}`);
|
||||
}
|
||||
await sleep(POLL_INTERVAL_MS);
|
||||
}
|
||||
|
||||
throw new Error(`Library scan did not complete within ${WAIT_MS}ms (jobId=${jobId}).`);
|
||||
}
|
||||
|
||||
async function playlistsCrud(token: string) {
|
||||
// Needs at least one track.
|
||||
const trackId = await getOneTrackId(token);
|
||||
assert(
|
||||
trackId,
|
||||
`No tracks found. Set MUSIC_PATH to a library with at least one track, or run a scan before testing.`
|
||||
);
|
||||
|
||||
const created = await fetchJson("/api/playlists", {
|
||||
method: "POST",
|
||||
token,
|
||||
body: JSON.stringify({ name: `predeploy-smoke-${Date.now()}`, isPublic: false }),
|
||||
});
|
||||
assert(created.ok, `Create playlist failed: status=${created.status} body=${JSON.stringify(created.json)}`);
|
||||
const playlistId = created.json?.id;
|
||||
assert(typeof playlistId === "string", `Create playlist missing id: ${JSON.stringify(created.json)}`);
|
||||
|
||||
const add = await fetchJson(`/api/playlists/${playlistId}/items`, {
|
||||
method: "POST",
|
||||
token,
|
||||
body: JSON.stringify({ trackId }),
|
||||
});
|
||||
assert(add.ok, `Add track to playlist failed: status=${add.status} body=${JSON.stringify(add.json)}`);
|
||||
|
||||
const del = await fetchJson(`/api/playlists/${playlistId}`, { method: "DELETE", token });
|
||||
assert(del.ok, `Delete playlist failed: status=${del.status} body=${JSON.stringify(del.json)}`);
|
||||
}
|
||||
|
||||
async function playbackStateRoundTrip(token: string) {
|
||||
const trackId = await getOneTrackId(token);
|
||||
assert(
|
||||
trackId,
|
||||
`No tracks found. Set MUSIC_PATH to a library with at least one track, or run a scan before testing.`
|
||||
);
|
||||
|
||||
const payload = {
|
||||
playbackType: "track",
|
||||
trackId,
|
||||
queue: [{ id: trackId }],
|
||||
currentIndex: 0,
|
||||
isShuffle: false,
|
||||
};
|
||||
|
||||
const save = await fetchJson("/api/playback-state", {
|
||||
method: "POST",
|
||||
token,
|
||||
body: JSON.stringify(payload),
|
||||
});
|
||||
assert(save.ok, `Save playback state failed: status=${save.status} body=${JSON.stringify(save.json)}`);
|
||||
|
||||
const got = await fetchJson("/api/playback-state", { method: "GET", token });
|
||||
assert(got.ok, `Get playback state failed: status=${got.status} body=${JSON.stringify(got.json)}`);
|
||||
}
|
||||
|
||||
async function main() {
|
||||
const started = Date.now();
|
||||
console.log(`[smoke] API_BASE_URL=${API_BASE_URL}`);
|
||||
|
||||
await waitForHealth();
|
||||
console.log("[smoke] health ok");
|
||||
|
||||
const token = await ensureTestUserAndToken();
|
||||
console.log(`[smoke] got token for user=${USERNAME}`);
|
||||
|
||||
await completeOnboarding(token);
|
||||
console.log("[smoke] onboarding marked complete");
|
||||
|
||||
await scanLibraryIfNeeded(token);
|
||||
console.log("[smoke] library ready");
|
||||
|
||||
await playlistsCrud(token);
|
||||
console.log("[smoke] playlists CRUD ok");
|
||||
|
||||
await playbackStateRoundTrip(token);
|
||||
console.log("[smoke] playback-state roundtrip ok");
|
||||
|
||||
console.log(`[smoke] PASS in ${Date.now() - started}ms`);
|
||||
}
|
||||
|
||||
main().catch((err) => {
|
||||
console.error("[smoke] FAIL", err);
|
||||
process.exit(1);
|
||||
});
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,97 @@
|
||||
/**
|
||||
* Check if tracks have Enhanced vibe analysis data
|
||||
*/
|
||||
import { prisma } from "../utils/db";
|
||||
|
||||
async function check() {
|
||||
// Get a sample of tracks with their analysis data
|
||||
const tracks = await prisma.track.findMany({
|
||||
take: 10,
|
||||
select: {
|
||||
title: true,
|
||||
album: { select: { artist: { select: { name: true } } } },
|
||||
analysisMode: true,
|
||||
moodHappy: true,
|
||||
moodSad: true,
|
||||
moodRelaxed: true,
|
||||
moodAggressive: true,
|
||||
danceabilityMl: true,
|
||||
valence: true,
|
||||
arousal: true,
|
||||
energy: true,
|
||||
bpm: true,
|
||||
moodTags: true,
|
||||
},
|
||||
where: {
|
||||
bpm: { not: null }
|
||||
}
|
||||
});
|
||||
|
||||
console.log('Sample tracks with analysis data:');
|
||||
for (const t of tracks) {
|
||||
console.log(`\n${t.album?.artist?.name} - ${t.title}`);
|
||||
console.log(` analysisMode: ${t.analysisMode || 'NOT SET (legacy)'}`);
|
||||
console.log(` ML moods: happy=${t.moodHappy}, sad=${t.moodSad}, relaxed=${t.moodRelaxed}, aggressive=${t.moodAggressive}`);
|
||||
console.log(` danceabilityMl: ${t.danceabilityMl}`);
|
||||
console.log(` valence: ${t.valence}, arousal: ${t.arousal}`);
|
||||
console.log(` energy: ${t.energy}, bpm: ${t.bpm}`);
|
||||
console.log(` moodTags: ${t.moodTags?.join(', ') || 'none'}`);
|
||||
}
|
||||
|
||||
// Count tracks with enhanced analysis
|
||||
const enhancedCount = await prisma.track.count({ where: { analysisMode: 'enhanced' } });
|
||||
const standardCount = await prisma.track.count({ where: { analysisMode: 'standard' } });
|
||||
const noModeCount = await prisma.track.count({ where: { analysisMode: null, bpm: { not: null } } });
|
||||
const totalAnalyzed = await prisma.track.count({ where: { bpm: { not: null } } });
|
||||
|
||||
// Count tracks with ML mood data
|
||||
const withMoodHappy = await prisma.track.count({ where: { moodHappy: { not: null } } });
|
||||
|
||||
console.log(`\n--- Analysis Mode Stats ---`);
|
||||
console.log(`Enhanced: ${enhancedCount}`);
|
||||
console.log(`Standard: ${standardCount}`);
|
||||
console.log(`No mode (legacy): ${noModeCount}`);
|
||||
console.log(`Total analyzed: ${totalAnalyzed}`);
|
||||
console.log(`With ML mood data: ${withMoodHappy}`);
|
||||
|
||||
// Check specific songs the user mentioned
|
||||
console.log(`\n--- Checking specific songs ---`);
|
||||
const specificSongs = await prisma.track.findMany({
|
||||
where: {
|
||||
OR: [
|
||||
{ title: { contains: "I Love You", mode: "insensitive" } },
|
||||
{ title: { contains: "Roots", mode: "insensitive" } },
|
||||
{ title: { contains: "Alright", mode: "insensitive" } },
|
||||
]
|
||||
},
|
||||
select: {
|
||||
title: true,
|
||||
album: { select: { artist: { select: { name: true } } } },
|
||||
analysisMode: true,
|
||||
moodHappy: true,
|
||||
moodSad: true,
|
||||
moodRelaxed: true,
|
||||
moodAggressive: true,
|
||||
valence: true,
|
||||
arousal: true,
|
||||
energy: true,
|
||||
bpm: true,
|
||||
danceability: true,
|
||||
moodTags: true,
|
||||
}
|
||||
});
|
||||
|
||||
for (const t of specificSongs) {
|
||||
console.log(`\n${t.album?.artist?.name} - ${t.title}`);
|
||||
console.log(` analysisMode: ${t.analysisMode || 'NOT SET (legacy)'}`);
|
||||
console.log(` ML moods: happy=${t.moodHappy}, sad=${t.moodSad}, relaxed=${t.moodRelaxed}, aggressive=${t.moodAggressive}`);
|
||||
console.log(` valence: ${t.valence}, arousal: ${t.arousal}`);
|
||||
console.log(` energy: ${t.energy}, bpm: ${t.bpm}, dance: ${t.danceability}`);
|
||||
console.log(` moodTags: ${t.moodTags?.join(', ') || 'none'}`);
|
||||
}
|
||||
|
||||
await prisma.$disconnect();
|
||||
}
|
||||
|
||||
check().catch(console.error);
|
||||
|
||||
@@ -0,0 +1,550 @@
|
||||
# Lidify Vibe Matching System - Research Review Document
|
||||
|
||||
## Executive Summary
|
||||
|
||||
This document provides a complete overview of Lidify's audio-based music recommendation ("vibe matching") system for research review. The system uses ML-based audio analysis to find similar songs based on how they *sound*, not metadata or collaborative filtering.
|
||||
|
||||
---
|
||||
|
||||
## Sample Results (Live Terminal Output)
|
||||
|
||||
### Example 1: Piano Music ("I Love You" by RIOPY)
|
||||
|
||||
```
|
||||
SOURCE: "I Love You" by RIOPY
|
||||
Album: RIOPY
|
||||
Analysis Mode: enhanced
|
||||
BPM: 91.3 | Energy: 0.28 | Valence: 0.53
|
||||
Danceability: 0.96 | Arousal: 0.52 | Key: major
|
||||
ML Moods: Happy=0.91, Sad=0.65, Relaxed=1.00, Aggressive=0.99
|
||||
Mood Tags: sad, dance, chill, melancholic, relaxed, uplifting, aggressive, intense, groovy, happy
|
||||
|
||||
TOP MATCHES (by cosine similarity):
|
||||
# | TRACK | ARTIST | BPM | ENG | VAL | H | S | R | A
|
||||
----|--------------------------------|------------------|------|------|------|------|------|------|------
|
||||
1 | Minimal Game | RIOPY | 84 | 0.25 | 0.51 | 0.70 | 0.20 | 0.80 | 0.76
|
||||
2 | Lullaby | RIOPY | 82 | 0.28 | 0.54 | 0.75 | 0.20 | 0.80 | 0.76
|
||||
3 | Joy | RIOPY | 97 | 0.34 | 0.57 | 0.98 | 0.58 | 1.00 | 0.99
|
||||
4 | Introspective (From Home) | Dirk Maassen | 94 | 0.32 | 0.55 | 0.79 | 0.20 | 0.80 | 0.80
|
||||
5 | Sweet dream | RIOPY | 91 | 0.28 | 0.48 | 0.64 | 0.20 | 0.80 | 0.77
|
||||
6 | Sense of hope | RIOPY | 99 | 0.25 | 0.53 | 0.74 | 0.20 | 0.80 | 0.78
|
||||
7 | Drive | RIOPY | 96 | 0.44 | 0.55 | 0.78 | 0.20 | 0.80 | 0.78
|
||||
8 | Air (From Home) | Dirk Maassen | 81 | 0.14 | 0.56 | 0.79 | 0.20 | 0.80 | 0.76
|
||||
9 | Prelude | Muse | 85 | 0.39 | 0.40 | 0.68 | 0.70 | 0.96 | 1.00
|
||||
10 | Towards the Sun | Dirk Maassen | 117 | 0.25 | 0.49 | 0.66 | 0.20 | 0.80 | 0.80
|
||||
```
|
||||
|
||||
**Observation:** Piano music correctly matches with other piano composers (RIOPY, Dirk Maassen).
|
||||
|
||||
---
|
||||
|
||||
### Example 2: Alt-Rock ("You and I" by Pvris)
|
||||
|
||||
```
|
||||
SOURCE: "You and I" by Pvris
|
||||
Album: White Noise
|
||||
Analysis Mode: enhanced
|
||||
BPM: 101.9 | Energy: 0.57 | Valence: 0.50
|
||||
Danceability: 1.00 | Arousal: 0.44 | Key: major
|
||||
ML Moods: Happy=0.49, Sad=0.31, Relaxed=0.44, Aggressive=0.68
|
||||
Mood Tags: intense, dance, aggressive, groovy
|
||||
|
||||
TOP MATCHES:
|
||||
# | TRACK | ARTIST | BPM | ENG | VAL | H | S | R | A
|
||||
----|--------------------------------|------------------|------|------|------|------|------|------|------
|
||||
1 | Tether | CHVRCHES | 120 | 0.52 | 0.47 | 0.43 | 0.28 | 0.50 | 0.69
|
||||
2 | By The Throat (Live) | CHVRCHES | 118 | 0.50 | 0.52 | 0.37 | 0.20 | 0.34 | 0.72
|
||||
3 | Separate | Pvris | 90 | 0.64 | 0.52 | 0.49 | 0.26 | 0.40 | 0.85
|
||||
4 | Strong Hand (Live) | CHVRCHES | 80 | 0.58 | 0.60 | 0.55 | 0.34 | 0.34 | 0.74
|
||||
5 | Stay Gold | Pvris | 100 | 0.72 | 0.57 | 0.47 | 0.25 | 0.35 | 0.80
|
||||
6 | I Like The Devil | Purity Ring | 100 | 0.65 | 0.54 | 0.60 | 0.31 | 0.43 | 0.92
|
||||
7 | Madness (Live) | Muse | 92 | 0.78 | 0.62 | 0.77 | 0.52 | 0.57 | 0.77
|
||||
```
|
||||
|
||||
**Observation:** Synth-pop/alt-rock correctly matches with similar artists (CHVRCHES, Pvris, Purity Ring).
|
||||
|
||||
---
|
||||
|
||||
### Example 3: Rock ("Supermassive Black Hole" by Muse)
|
||||
|
||||
```
|
||||
SOURCE: "Supermassive Black Hole" by Muse
|
||||
Album: HAARP
|
||||
Analysis Mode: enhanced
|
||||
BPM: 120.1 | Energy: 0.67 | Valence: 0.56
|
||||
Danceability: 1.00 | Arousal: 0.42 | Key: minor
|
||||
ML Moods: Happy=0.72, Sad=0.64, Relaxed=0.16, Aggressive=0.22
|
||||
Mood Tags: sad, dance, melancholic, uplifting, groovy, happy
|
||||
|
||||
TOP MATCHES:
|
||||
# | TRACK | ARTIST | BPM | ENG | VAL | H | S | R | A
|
||||
----|--------------------------------|------------------|------|------|------|------|------|------|------
|
||||
1 | Supermassive Black Hole (Live) | Muse | 120 | 0.75 | 0.56 | 0.76 | 0.58 | 0.06 | 0.04
|
||||
2 | Thought Contagion (Live) | Muse | 140 | 0.76 | 0.57 | 0.77 | 0.52 | 0.08 | 0.09
|
||||
3 | Let Them In | Pvris | 146 | 0.64 | 0.62 | 0.67 | 0.50 | 0.22 | 0.22
|
||||
4 | Panic Station (Live) | Muse | 105 | 0.69 | 0.47 | 0.61 | 0.61 | 0.02 | 0.03
|
||||
5 | Smoke | Pvris | 150 | 0.57 | 0.56 | 0.64 | 0.66 | 0.20 | 0.30
|
||||
6 | Animals | Muse | 113 | 0.82 | 0.55 | 0.79 | 0.59 | 0.24 | 0.21
|
||||
```
|
||||
|
||||
**Observation:** Rock music correctly matches with other Muse tracks and similar-sounding rock/alt artists.
|
||||
|
||||
---
|
||||
|
||||
## System Architecture
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────────┐
|
||||
│ AUDIO ANALYSIS PIPELINE │
|
||||
├─────────────────────────────────────────────────────────────────────────────┤
|
||||
│ │
|
||||
│ ┌─────────────┐ ┌─────────────────────────────────────────────────┐ │
|
||||
│ │ Audio File │────►│ Essentia Audio Processing │ │
|
||||
│ │ (.flac/.mp3)│ │ │ │
|
||||
│ └─────────────┘ │ • FFT/Spectral Analysis │ │
|
||||
│ │ • Beat/Tempo Detection │ │
|
||||
│ │ • Key/Scale Detection │ │
|
||||
│ │ • RMS Energy Calculation │ │
|
||||
│ └─────────────┬────────────────────────────────────┘ │
|
||||
│ │ │
|
||||
│ ┌─────────────▼────────────────────────────────────┐ │
|
||||
│ │ MusiCNN (TensorFlow Model) │ │
|
||||
│ │ │ │
|
||||
│ │ Input: 16kHz mono audio │ │
|
||||
│ │ Output: 200-dimensional embeddings │ │
|
||||
│ │ Architecture: Convolutional Neural Network │ │
|
||||
│ │ Training: Million Song Dataset (MSD) │ │
|
||||
│ └─────────────┬────────────────────────────────────┘ │
|
||||
│ │ │
|
||||
│ ┌────────────────────────┼────────────────────────────┐ │
|
||||
│ │ │ │ │
|
||||
│ ▼ ▼ ▼ │
|
||||
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
|
||||
│ │ Mood Happy │ │ Mood Sad │ ... │ Danceability │ │
|
||||
│ │ Classifier │ │ Classifier │ │ Classifier │ │
|
||||
│ │ (Softmax) │ │ (Softmax) │ │ (Softmax) │ │
|
||||
│ └────────┬────────┘ └────────┬────────┘ └────────┬────────┘ │
|
||||
│ │ │ │ │
|
||||
│ └──────────────────────┼────────────────────────────┘ │
|
||||
│ │ │
|
||||
│ ┌───────────▼───────────┐ │
|
||||
│ │ DERIVED FEATURES │ │
|
||||
│ │ │ │
|
||||
│ │ Valence = f(happy, party, sad) │
|
||||
│ │ Arousal = f(aggressive, party, electronic, │
|
||||
│ │ relaxed, acoustic) │
|
||||
│ └───────────────────────┘ │
|
||||
│ │
|
||||
└─────────────────────────────────────────────────────────────────────────────┘
|
||||
│
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────────────────────────┐
|
||||
│ VIBE MATCHING ALGORITHM │
|
||||
├─────────────────────────────────────────────────────────────────────────────┤
|
||||
│ │
|
||||
│ 1. Build Feature Vector (13 dimensions): │
|
||||
│ [moodHappy, moodSad, moodRelaxed, moodAggressive, moodParty, │
|
||||
│ moodAcoustic, moodElectronic, energy, arousal, danceability, │
|
||||
│ instrumentalness, normalizedBPM, keyMode] │
|
||||
│ │
|
||||
│ 2. Compute Cosine Similarity: │
|
||||
│ Σ(aᵢ × bᵢ) │
|
||||
│ cos(θ) = ───────────────────── │
|
||||
│ √(Σaᵢ²) × √(Σbᵢ²) │
|
||||
│ │
|
||||
│ 3. Add Tag/Genre Bonus (max 5%): │
|
||||
│ Shared-tag count over lastfmTags ∪ essentiaGenres (0.01 per shared tag) │
|
||||
│ │
|
||||
│ 4. Final Score = 0.95 × cosineSim + tagBonus │
|
||||
│ │
|
||||
│ 5. Filter threshold: 40% (Enhanced) or 50% (Standard) │
|
||||
│ │
|
||||
└─────────────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Data Schema (What We Store Per Track)
|
||||
|
||||
### Database Schema (PostgreSQL + Prisma)
|
||||
|
||||
```prisma
// Track model: audio analysis columns
model Track {
  // Basic Info
  id       String @id
  title    String
  albumId  String
  duration Int    // seconds
  filePath String // relative path to audio file

  // === RHYTHM ANALYSIS (Essentia) ===
  bpm        Float? // beats per minute (60-200 typical)
  beatsCount Int?   // total beats in track

  // === TONALITY (Essentia) ===
  key         String? // musical key ("C", "F#", "Bb", etc.)
  keyScale    String? // "major" or "minor"
  keyStrength Float?  // confidence 0-1

  // === ENERGY & DYNAMICS (Essentia) ===
  energy       Float? // overall energy 0-1 (RMS-based)
  loudness     Float? // average loudness in dB
  dynamicRange Float? // dynamic range in dB

  // === BASIC AUDIO FEATURES ===
  danceability Float? // 0-1 how suitable for dancing
  valence      Float? // 0 (sad) to 1 (happy) - DERIVED
  arousal      Float? // 0 (calm) to 1 (energetic) - DERIVED

  // === INSTRUMENTATION ===
  instrumentalness Float? // 0-1 (1 = no vocals) - ML predicted
  acousticness     Float? // 0-1 (1 = acoustic)
  speechiness      Float? // 0-1 (1 = spoken word)

  // === ML MOOD PREDICTIONS (Enhanced Mode) ===
  // These are the core ML outputs from MusiCNN classifiers
  moodHappy      Float? // ML prediction 0-1 (probability of happy)
  moodSad        Float? // ML prediction 0-1 (probability of sad)
  moodRelaxed    Float? // ML prediction 0-1 (probability of relaxed)
  moodAggressive Float? // ML prediction 0-1 (probability of aggressive)
  moodParty      Float? // ML prediction 0-1 (probability of party/upbeat)
  moodAcoustic   Float? // ML prediction 0-1 (probability of acoustic)
  moodElectronic Float? // ML prediction 0-1 (probability of electronic)
  danceabilityMl Float? // ML-based danceability (more accurate)

  // === DERIVED TAGS ===
  moodTags       String[] // ["aggressive", "happy", "chill", "workout"]
  essentiaGenres String[] // ["rock", "electronic", "jazz"]
  lastfmTags     String[] // ["chill", "workout", "sad", "90s"]

  // === ANALYSIS METADATA ===
  analysisStatus  String  // pending, processing, completed, failed
  analysisMode    String? // 'standard' or 'enhanced'
  analysisVersion String? // Essentia version used
  analyzedAt      DateTime?
}
```
|
||||
|
||||
---
|
||||
|
||||
## Core Algorithm: Feature Extraction (Python)
|
||||
|
||||
### analyzer.py - ML Feature Extraction
|
||||
|
||||
```python
from typing import Any, Dict

import numpy as np


# Excerpt: method of the analyzer class in analyzer.py
def _extract_ml_features(self, audio_16k) -> Dict[str, Any]:
    """
    Extract features using Essentia MusiCNN + classification heads.

    Architecture:
    1. TensorflowPredictMusiCNN extracts embeddings from audio
    2. TensorflowPredict2D classification heads output predictions
    """
    result = {}

    # Step 1: Get embeddings from base MusiCNN model
    # Output shape: [frames, 200] - 200-dimensional embedding per frame
    embeddings = self.musicnn_model(audio_16k)

    # Step 2: Pass embeddings through classification heads
    # Each head outputs [frames, 2] where [:, 1] is probability of positive class

    # Collect raw predictions
    if 'mood_happy' in self.prediction_models:
        preds = self.prediction_models['mood_happy'](embeddings)
        result['moodHappy'] = float(np.mean(preds[:, 1]))

    if 'mood_sad' in self.prediction_models:
        preds = self.prediction_models['mood_sad'](embeddings)
        result['moodSad'] = float(np.mean(preds[:, 1]))

    if 'mood_relaxed' in self.prediction_models:
        preds = self.prediction_models['mood_relaxed'](embeddings)
        result['moodRelaxed'] = float(np.mean(preds[:, 1]))

    if 'mood_aggressive' in self.prediction_models:
        preds = self.prediction_models['mood_aggressive'](embeddings)
        result['moodAggressive'] = float(np.mean(preds[:, 1]))

    if 'mood_party' in self.prediction_models:
        preds = self.prediction_models['mood_party'](embeddings)
        result['moodParty'] = float(np.mean(preds[:, 1]))

    if 'mood_acoustic' in self.prediction_models:
        preds = self.prediction_models['mood_acoustic'](embeddings)
        result['moodAcoustic'] = float(np.mean(preds[:, 1]))

    if 'mood_electronic' in self.prediction_models:
        preds = self.prediction_models['mood_electronic'](embeddings)
        result['moodElectronic'] = float(np.mean(preds[:, 1]))

    # === VALENCE (derived from mood models) ===
    # Valence = emotional positivity: happy/party vs sad
    happy = result.get('moodHappy', 0.5)
    sad = result.get('moodSad', 0.5)
    party = result.get('moodParty', 0.5)
    result['valence'] = round(happy * 0.5 + party * 0.3 + (1 - sad) * 0.2, 3)

    # === AROUSAL (derived from mood models) ===
    # Arousal = energy level: aggressive/party/electronic vs relaxed/acoustic
    aggressive = result.get('moodAggressive', 0.5)
    relaxed = result.get('moodRelaxed', 0.5)
    acoustic = result.get('moodAcoustic', 0.5)
    electronic = result.get('moodElectronic', 0.5)
    result['arousal'] = round(
        aggressive * 0.35 +
        party * 0.25 +
        electronic * 0.2 +
        (1 - relaxed) * 0.1 +
        (1 - acoustic) * 0.1,
        3
    )

    return result
```
|
||||
|
||||
---
|
||||
|
||||
## Core Algorithm: Cosine Similarity Matching (TypeScript)
|
||||
|
||||
### library.ts - Vibe Matching Implementation
|
||||
|
||||
```typescript
|
||||
// === COSINE SIMILARITY SCORING ===
|
||||
// Industry-standard approach: build feature vectors, compute cosine similarity
|
||||
// Uses ALL 13 features for comprehensive matching
|
||||
|
||||
// Helper: Build normalized feature vector from track
|
||||
const buildFeatureVector = (track: TrackFeatures): number[] => {
|
||||
return [
|
||||
// ML Mood predictions (7 features) - 0.5 default for missing
|
||||
track.moodHappy ?? 0.5,
|
||||
track.moodSad ?? 0.5,
|
||||
track.moodRelaxed ?? 0.5,
|
||||
track.moodAggressive ?? 0.5,
|
||||
track.moodParty ?? 0.5,
|
||||
track.moodAcoustic ?? 0.5,
|
||||
track.moodElectronic ?? 0.5,
|
||||
// Audio features (4 features; BPM and key mode follow separately)
|
||||
track.energy ?? 0.5,
|
||||
track.arousal ?? 0.5,
|
||||
track.danceabilityMl ?? track.danceability ?? 0.5,
|
||||
track.instrumentalness ?? 0.5,
|
||||
// BPM normalized to 0-1 (60-180 BPM range)
|
||||
Math.max(0, Math.min(1, ((track.bpm ?? 120) - 60) / 120)),
|
||||
// Key: major=1, minor=0, unknown=0.5
|
||||
track.keyScale === 'major' ? 1 : track.keyScale === 'minor' ? 0 : 0.5,
|
||||
];
|
||||
};
|
||||
|
||||
// Helper: Compute cosine similarity between two vectors
|
||||
const cosineSimilarity = (a: number[], b: number[]): number => {
|
||||
let dot = 0, magA = 0, magB = 0;
|
||||
for (let i = 0; i < a.length; i++) {
|
||||
dot += a[i] * b[i];
|
||||
magA += a[i] * a[i];
|
||||
magB += b[i] * b[i];
|
||||
}
|
||||
if (magA === 0 || magB === 0) return 0;
|
||||
return dot / (Math.sqrt(magA) * Math.sqrt(magB));
|
||||
};
|
||||
|
||||
// Helper: Compute tag overlap bonus
|
||||
const computeTagBonus = (
|
||||
sourceTags: string[],
|
||||
sourceGenres: string[],
|
||||
trackTags: string[],
|
||||
trackGenres: string[]
|
||||
): number => {
|
||||
const sourceSet = new Set([...sourceTags, ...sourceGenres].map(t => t.toLowerCase()));
|
||||
const trackSet = new Set([...trackTags, ...trackGenres].map(t => t.toLowerCase()));
|
||||
if (sourceSet.size === 0 || trackSet.size === 0) return 0;
|
||||
const overlap = [...sourceSet].filter(tag => trackSet.has(tag)).length;
|
||||
// Max 5% bonus for tag overlap
|
||||
return Math.min(0.05, overlap * 0.01);
|
||||
};
|
||||
|
||||
// Score all candidate tracks
|
||||
const scored = analyzedTracks.map(t => {
|
||||
const targetVector = buildFeatureVector(t);
|
||||
|
||||
// Compute base cosine similarity
|
||||
let score = cosineSimilarity(sourceVector, targetVector);
|
||||
|
||||
// Add tag/genre overlap bonus (max 5%)
|
||||
const tagBonus = computeTagBonus(
|
||||
sourceTrack.lastfmTags || [],
|
||||
sourceTrack.essentiaGenres || [],
|
||||
t.lastfmTags || [],
|
||||
t.essentiaGenres || []
|
||||
);
|
||||
|
||||
// Final score: 95% cosine similarity plus up to 5% tag bonus
|
||||
const finalScore = score * 0.95 + tagBonus;
|
||||
|
||||
return { id: t.id, score: finalScore };
|
||||
});
|
||||
|
||||
// Filter to good matches (>40% for Enhanced, >50% for Standard)
|
||||
const minThreshold = isEnhancedAnalysis ? 0.40 : 0.50;
|
||||
const goodMatches = scored
|
||||
.filter(t => t.score > minThreshold)
|
||||
.sort((a, b) => b.score - a.score);
|
||||
```
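The scoring steps above can be exercised end to end on toy data. The following is a minimal, self-contained sketch rather than the code in `library.ts`: the `VibeCandidate` shape, the `rankMatches` helper, and the three-dimensional sample vectors are assumptions made for illustration (the real vectors have 13 dimensions).

```typescript
// Hypothetical minimal shape: a precomputed feature vector plus merged tags/genres.
interface VibeCandidate {
  id: string;
  vector: number[];
  tags: string[];
}

// Cosine similarity; returns 0 when either vector has zero magnitude.
const cosine = (a: number[], b: number[]): number => {
  const dot = a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
  const magA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const magB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  return magA === 0 || magB === 0 ? 0 : dot / (magA * magB);
};

// 0.01 per shared (case-insensitive) tag, capped at 0.05.
const tagOverlapBonus = (sourceTags: string[], candidateTags: string[]): number => {
  const sourceSet = new Set(sourceTags.map(t => t.toLowerCase()));
  const candidateSet = new Set(candidateTags.map(t => t.toLowerCase()));
  const overlap = Array.from(sourceSet).filter(t => candidateSet.has(t)).length;
  return Math.min(0.05, overlap * 0.01);
};

// Score, filter by threshold, and sort best-first.
const rankMatches = (
  sourceTrack: VibeCandidate,
  pool: VibeCandidate[],
  minThreshold: number
): { id: string; score: number }[] =>
  pool
    .map(c => ({
      id: c.id,
      score: cosine(sourceTrack.vector, c.vector) * 0.95 + tagOverlapBonus(sourceTrack.tags, c.tags),
    }))
    .filter(c => c.score > minThreshold)
    .sort((a, b) => b.score - a.score);

const demoSource: VibeCandidate = { id: "src", vector: [0.9, 0.1, 0.5], tags: ["rock", "live"] };
const demoPool: VibeCandidate[] = [
  { id: "close", vector: [0.8, 0.2, 0.5], tags: ["Rock"] },
  { id: "far", vector: [0, 1, 0], tags: [] },
];
const demoRanked = rankMatches(demoSource, demoPool, 0.4);
console.log(demoRanked);
```

With these inputs only `close` clears the 0.40 threshold; `far` is nearly orthogonal to the source and is filtered out.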
|
||||
|
||||
---
|
||||
|
||||
## Feature Vector Breakdown
|
||||
|
||||
| Index | Feature | Range | Description | Weight Rationale |
|
||||
|-------|---------|-------|-------------|------------------|
|
||||
| 0 | moodHappy | 0-1 | ML probability of happy mood | Core mood dimension |
|
||||
| 1 | moodSad | 0-1 | ML probability of sad mood | Core mood dimension |
|
||||
| 2 | moodRelaxed | 0-1 | ML probability of relaxed mood | Core mood dimension |
|
||||
| 3 | moodAggressive | 0-1 | ML probability of aggressive mood | Core mood dimension |
|
||||
| 4 | moodParty | 0-1 | ML probability of party/upbeat | Core mood dimension |
|
||||
| 5 | moodAcoustic | 0-1 | ML probability of acoustic sound | Instrumentation |
|
||||
| 6 | moodElectronic | 0-1 | ML probability of electronic sound | Instrumentation |
|
||||
| 7 | energy | 0-1 | RMS-based energy level | Audio characteristic |
|
||||
| 8 | arousal | 0-1 | Derived energy/intensity | Composite dimension |
|
||||
| 9 | danceability | 0-1 | ML or Essentia danceability | Rhythm characteristic |
|
||||
| 10 | instrumentalness | 0-1 | Voice/instrumental ML detection | Instrumentation |
|
||||
| 11 | normalizedBPM | 0-1 | (bpm - 60) / 120 | Tempo matching |
|
||||
| 12 | keyMode | 0/0.5/1 | minor/unknown/major | Tonality |
|
||||
|
||||
---
|
||||
|
||||
## Valence & Arousal Derivation
|
||||
|
||||
Because the classifier heads listed under Dependencies do not output valence or arousal directly, we derive both from the mood predictions:
|
||||
|
||||
### Valence (Emotional Positivity)
|
||||
```python
|
||||
valence = moodHappy * 0.5 + moodParty * 0.3 + (1 - moodSad) * 0.2
|
||||
```
|
||||
|
||||
**Rationale:**
|
||||
- Happy mood is the strongest positive indicator (50% weight)
|
||||
- Party/upbeat suggests positive energy (30% weight)
|
||||
- Low sadness contributes to positivity (20% weight)
|
||||
|
||||
### Arousal (Energy Level)
|
||||
```python
|
||||
arousal = moodAggressive * 0.35 + moodParty * 0.25 + moodElectronic * 0.2
|
||||
+ (1 - moodRelaxed) * 0.1 + (1 - moodAcoustic) * 0.1
|
||||
```
|
||||
|
||||
**Rationale:**
|
||||
- Aggressive music is high-energy (35% weight)
|
||||
- Party music has high arousal (25% weight)
|
||||
- Electronic music tends to be energetic (20% weight)
|
||||
- Low relaxation indicates higher energy (10% weight)
|
||||
- Non-acoustic sound suggests higher energy (10% weight)
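As a worked illustration of the two formulas, the sketch below applies them to a hypothetical calm, acoustic track. The `MoodScores` type, the `deriveValenceArousal` helper, and the input values are assumptions for the example; the weights and the 0.5 fallback for missing moods come from the formulas above.

```typescript
// Hypothetical input shape; any mood may be missing.
interface MoodScores {
  moodHappy?: number | null;
  moodSad?: number | null;
  moodRelaxed?: number | null;
  moodAggressive?: number | null;
  moodParty?: number | null;
  moodAcoustic?: number | null;
  moodElectronic?: number | null;
}

const orNeutral = (x: number | null | undefined): number => x ?? 0.5;
const round3 = (x: number): number => Math.round(x * 1000) / 1000;

const deriveValenceArousal = (m: MoodScores): { valence: number; arousal: number } => {
  const valence =
    orNeutral(m.moodHappy) * 0.5 +
    orNeutral(m.moodParty) * 0.3 +
    (1 - orNeutral(m.moodSad)) * 0.2;
  const arousal =
    orNeutral(m.moodAggressive) * 0.35 +
    orNeutral(m.moodParty) * 0.25 +
    orNeutral(m.moodElectronic) * 0.2 +
    (1 - orNeutral(m.moodRelaxed)) * 0.1 +
    (1 - orNeutral(m.moodAcoustic)) * 0.1;
  return { valence: round3(valence), arousal: round3(arousal) };
};

// Made-up values for a quiet acoustic piece.
const calmTrack = deriveValenceArousal({
  moodHappy: 0.2, moodSad: 0.7, moodParty: 0.1,
  moodAggressive: 0.1, moodElectronic: 0.1, moodRelaxed: 0.9, moodAcoustic: 0.9,
});
console.log(calmTrack); // valence = 0.10 + 0.03 + 0.06, arousal = 0.035 + 0.025 + 0.02 + 0.01 + 0.01

// With no mood data at all, both values land on the neutral midpoint.
const noData = deriveValenceArousal({});
console.log(noData);
```

Because the weights in each formula sum to 1.0, both outputs stay within 0-1 whenever the inputs do, and a track with no mood data lands exactly on 0.5.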
|
||||
|
||||
---
|
||||
|
||||
## Known Limitations & Edge Cases
|
||||
|
||||
### 1. Out-of-Distribution Audio
|
||||
MusiCNN was trained on the Million Song Dataset (mostly pop/rock). For genres outside this distribution (classical, ambient, piano), the model sometimes outputs high values for ALL mood dimensions.
|
||||
|
||||
**Detection & Normalization:**
|
||||
```python
core_moods = ['moodHappy', 'moodSad', 'moodRelaxed', 'moodAggressive']
core_values = [raw_moods[m][0] for m in core_moods if m in raw_moods]

if len(core_values) >= 4:
    min_mood = min(core_values)
    max_mood = max(core_values)
    spread = max_mood - min_mood

    # If all core moods are > 0.7 AND the range is small,
    # the predictions are likely unreliable (out-of-distribution audio).
    # Requiring spread > 0 avoids a division by zero when all four values are equal.
    if min_mood > 0.7 and 0 < spread < 0.3:
        # Normalize: scale so max becomes 0.8 and min becomes 0.2
        for mood_key in core_moods:
            old_val = raw_moods[mood_key][0]
            normalized = 0.2 + (old_val - min_mood) / spread * 0.6
            raw_moods[mood_key] = normalized
```
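To make the rescaling concrete, here is a small TypeScript port of the same rule with one worked input. The `CoreMoods` type, the `normalizeCoreMoods` name, and the sample values are illustrative assumptions; this is a sketch, not the analyzer's code path.

```typescript
interface CoreMoods {
  moodHappy: number;
  moodSad: number;
  moodRelaxed: number;
  moodAggressive: number;
}

// Rescale to the 0.2-0.8 band only when all four moods are high and tightly clustered.
const normalizeCoreMoods = (moods: CoreMoods): CoreMoods => {
  const values = [moods.moodHappy, moods.moodSad, moods.moodRelaxed, moods.moodAggressive];
  const min = Math.min(...values);
  const max = Math.max(...values);
  const spread = max - min;
  const looksUnreliable = min > 0.7 && spread > 0 && spread < 0.3;
  if (!looksUnreliable) return moods;

  const rescale = (v: number): number => 0.2 + ((v - min) / spread) * 0.6;
  return {
    moodHappy: rescale(moods.moodHappy),
    moodSad: rescale(moods.moodSad),
    moodRelaxed: rescale(moods.moodRelaxed),
    moodAggressive: rescale(moods.moodAggressive),
  };
};

// Hypothetical saturated predictions: everything above 0.7, spread of 0.25.
const saturated = normalizeCoreMoods({
  moodHappy: 0.91, moodSad: 0.75, moodRelaxed: 1.0, moodAggressive: 0.99,
});
console.log(saturated); // lowest mood maps to 0.2, highest to 0.8

// Predictions with a clear spread are returned unchanged.
const untouched = normalizeCoreMoods({
  moodHappy: 0.49, moodSad: 0.31, moodRelaxed: 0.44, moodAggressive: 0.68,
});
console.log(untouched);
```

The rescaling preserves the ordering of the four moods but stretches their differences, which is consistent with the `S = 0.20` / `R = 0.80` pairs visible in the piano results above.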
|
||||
|
||||
### 2. Standard Mode Fallback
|
||||
When ML models aren't available, heuristic estimates are used:
|
||||
|
||||
| Feature | Heuristic Formula |
|
||||
|---------|-------------------|
|
||||
| Valence | key_valence * 0.4 + bpm_valence * 0.25 + brightness * 0.2 + energy * 0.15 |
|
||||
| Arousal | bpm_arousal * 0.35 + energy * 0.35 + brightness * 0.15 + compression * 0.15 |
|
||||
| Instrumentalness | spectral_flatness * 0.6 + zcr_instrumental * 0.4 |
|
||||
| Acousticness | dynamic_range / 12 |
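The table's formulas can be written out as a single function. This is a sketch under stated assumptions: the document does not define how `key_valence`, `bpm_valence`, `bpm_arousal`, `brightness`, `compression`, or `zcr_instrumental` are computed, so the `HeuristicInputs` type below simply takes them as precomputed 0-1 values, and clamping acousticness to 0-1 is an added assumption.

```typescript
// Hypothetical container for the intermediate signals named in the table.
interface HeuristicInputs {
  keyValence: number;
  bpmValence: number;
  bpmArousal: number;
  brightness: number;
  energy: number;
  compression: number;
  spectralFlatness: number;
  zcrInstrumental: number;
  dynamicRangeDb: number;
}

const clamp01 = (x: number): number => Math.max(0, Math.min(1, x));

const estimateStandardFeatures = (f: HeuristicInputs) => ({
  valence: f.keyValence * 0.4 + f.bpmValence * 0.25 + f.brightness * 0.2 + f.energy * 0.15,
  arousal: f.bpmArousal * 0.35 + f.energy * 0.35 + f.brightness * 0.15 + f.compression * 0.15,
  instrumentalness: f.spectralFlatness * 0.6 + f.zcrInstrumental * 0.4,
  // Assumption: clamp so a dynamic range above 12 dB does not exceed 1.
  acousticness: clamp01(f.dynamicRangeDb / 12),
});

const midpointInputs: HeuristicInputs = {
  keyValence: 0.5, bpmValence: 0.5, bpmArousal: 0.5, brightness: 0.5, energy: 0.5,
  compression: 0.5, spectralFlatness: 0.5, zcrInstrumental: 0.5, dynamicRangeDb: 6,
};
const midpointEstimate = estimateStandardFeatures(midpointInputs);
console.log(midpointEstimate);
```

The valence, arousal, and instrumentalness weights each sum to 1.0, so midpoint inputs produce midpoint outputs.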
|
||||
|
||||
### 3. Feature Vector Missing Values
|
||||
Missing values default to 0.5 (neutral) to prevent bias:
|
||||
```typescript
|
||||
track.moodHappy ?? 0.5
|
||||
```
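One consequence of the 0.5 default is worth checking numerically: because every component is non-negative, a track with no analysis at all still scores high against almost any analyzed track. The sketch below is illustrative only. The analyzed vector reuses the published values from Example 2 where available (moods H/S/R/A, energy, arousal, danceability, BPM, key) and fills the remaining slots (party, acoustic, electronic, instrumentalness) with made-up numbers.

```typescript
const cosineSim = (a: number[], b: number[]): number => {
  const dot = a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
  const magA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const magB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  return magA === 0 || magB === 0 ? 0 : dot / (magA * magB);
};

// A track with no analysis: every feature falls back to 0.5
// (a missing BPM defaults to 120, which also normalizes to 0.5).
const unanalyzedVector: number[] = new Array<number>(13).fill(0.5);

// Order: happy, sad, relaxed, aggressive, party*, acoustic*, electronic*,
// energy, arousal, danceability, instrumentalness*, normalizedBPM, keyMode.
// Entries marked * are hypothetical.
const analyzedVector: number[] = [
  0.49, 0.31, 0.44, 0.68, 0.4, 0.2, 0.6,
  0.57, 0.44, 1.0, 0.3,
  (101.9 - 60) / 120,
  1,
];

const defaultSimilarity = cosineSim(unanalyzedVector, analyzedVector);
const defaultScore = defaultSimilarity * 0.95; // no tag bonus
console.log(defaultSimilarity.toFixed(3), defaultScore.toFixed(3));
```

In this example the unanalyzed track clears both the 40% and the 50% threshold by a wide margin, which is relevant to the "Cold Start" question about excluding such tracks.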
|
||||
|
||||
---
|
||||
|
||||
## Open Questions for Review
|
||||
|
||||
1. **Feature Weighting:** Currently all 13 features have equal weight in cosine similarity. Should mood features (indices 0-6) have higher weight than audio features?
|
||||
|
||||
2. **Threshold Selection:** We use a 40% similarity threshold for Enhanced mode (50% for Standard). Is this too permissive? Too restrictive?
|
||||
|
||||
3. **Valence/Arousal Derivation:** Our formulas for deriving valence/arousal from mood predictions are hand-tuned. Are the weights reasonable?
|
||||
|
||||
4. **BPM Normalization:** We normalize BPM linearly over the 60-180 range, so 60 and 120 BPM end up half the scale apart. Should we use octave-aware BPM (treating 60 and 120 as similar)?
|
||||
|
||||
5. **Cross-Genre Matching:** The algorithm matches based on audio similarity regardless of genre. Should genre matching have more weight?
|
||||
|
||||
6. **Cold Start:** Tracks with missing analysis fall back to 0.5 for all features. Should they be excluded from matching?
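For question 4, one possible shape of an octave-aware tempo distance is sketched below. This is a hypothetical formulation for discussion, not an implementation taken from the codebase: `octaveAwareBpmDistanceSketch` measures the tempo ratio in octaves (log base 2), discards whole octaves, and folds the remainder so that half-time and double-time tempos have distance 0.

```typescript
// Returns 0 when the tempos are equal or differ by a power of two,
// and 1 when they are maximally "out of octave" (ratio of sqrt(2)).
const octaveAwareBpmDistanceSketch = (bpmA: number, bpmB: number): number => {
  if (bpmA <= 0 || bpmB <= 0) return 1; // treat invalid tempos as maximally distant
  const octaves = Math.abs(Math.log2(bpmA / bpmB)); // 1.0 means exactly double or half
  const withinOctave = octaves % 1;                 // drop whole octaves
  return Math.min(withinOctave, 1 - withinOctave) * 2; // fold to [0, 0.5], scale to [0, 1]
};

// Linear normalization used today, for comparison.
const linearBpm = (bpm: number): number => Math.max(0, Math.min(1, (bpm - 60) / 120));

console.log(octaveAwareBpmDistanceSketch(60, 120)); // half-time: distance 0
console.log(Math.abs(linearBpm(60) - linearBpm(120))); // linear scale: 0.5 apart
console.log(octaveAwareBpmDistanceSketch(90, 120).toFixed(2));
```

A trade-off to keep in mind: a pairwise distance like this does not drop directly into a per-track feature vector, so using it would mean either comparing BPM outside the cosine step or encoding tempo relative to a fixed reference.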
|
||||
|
||||
---
|
||||
|
||||
## Dependencies
|
||||
|
||||
### Python (Audio Analyzer)
|
||||
```
|
||||
essentia==2.1b6.dev1110
|
||||
essentia-tensorflow==2.1b6.dev1110
|
||||
numpy>=1.21.0,<2.0.0
|
||||
tensorflow==2.15.0
|
||||
redis>=4.5.0
|
||||
psycopg2-binary>=2.9.0
|
||||
```
|
||||
|
||||
### MusiCNN Models (Essentia Model Zoo)
|
||||
- `msd-musicnn-1.pb` - Base embedding model (~3MB)
|
||||
- `mood_happy-msd-musicnn-1.pb` - Happy classifier
|
||||
- `mood_sad-msd-musicnn-1.pb` - Sad classifier
|
||||
- `mood_relaxed-msd-musicnn-1.pb` - Relaxed classifier
|
||||
- `mood_aggressive-msd-musicnn-1.pb` - Aggressive classifier
|
||||
- `mood_party-msd-musicnn-1.pb` - Party classifier
|
||||
- `mood_acoustic-msd-musicnn-1.pb` - Acoustic classifier
|
||||
- `mood_electronic-msd-musicnn-1.pb` - Electronic classifier
|
||||
- `danceability-msd-musicnn-1.pb` - Danceability classifier
|
||||
- `voice_instrumental-msd-musicnn-1.pb` - Voice/instrumental classifier
|
||||
|
||||
---
|
||||
|
||||
## References
|
||||
|
||||
- [Essentia TensorFlow Documentation](https://essentia.upf.edu/machine_learning.html)
|
||||
- [MusiCNN Paper (Pons et al.)](https://arxiv.org/abs/1711.02520)
|
||||
- [Essentia Model Zoo](https://essentia.upf.edu/models/)
|
||||
- [Million Song Dataset](http://millionsongdataset.com/)
|
||||
|
||||
---
|
||||
|
||||
## File Locations
|
||||
|
||||
| Component | Path |
|
||||
|-----------|------|
|
||||
| Audio Analyzer | `services/audio-analyzer/analyzer.py` |
|
||||
| Vibe Matching | `backend/src/routes/library.ts` (lines 3293-3580) |
|
||||
| Database Schema | `backend/prisma/schema.prisma` |
|
||||
| Standard Mode Docs | `docs/implementation-summaries/audio-analysis-standard-mode/README.md` |
|
||||
| Enhanced Mode Docs | `docs/implementation-summaries/audio-analysis-standard-mode/ENHANCED_MODE.md` |
|
||||
| Algorithm Overview | `docs/implementation-summaries/vibe-matching-overhaul/README.md` |
|
||||
|
||||
@@ -0,0 +1,651 @@
|
||||
# Lidify Vibe System Documentation
|
||||
|
||||
This document describes the Vibe System: how Lidify analyzes tracks, collects audio metrics, and compares them for vibe matching. Use it as a reference when building frontend interfaces.
|
||||
|
||||
---
|
||||
|
||||
## Table of Contents
|
||||
|
||||
1. [Overview](#overview)
|
||||
2. [Metrics Collected](#metrics-collected)
|
||||
3. [Data Structures](#data-structures)
|
||||
4. [Vibe Matching Algorithm](#vibe-matching-algorithm)
|
||||
5. [API Endpoints](#api-endpoints)
|
||||
6. [Frontend Integration Guide](#frontend-integration-guide)
|
||||
7. [Existing Components Reference](#existing-components-reference)
|
||||
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
The Vibe System uses a combination of **audio signal analysis** and **ML-based mood prediction** to understand the "feel" of a track. It operates in two modes:
|
||||
|
||||
| Mode | Description | Accuracy |
|
||||
|------|-------------|----------|
|
||||
| **Standard** | Heuristic-based analysis using audio signal features (BPM, key, energy) | Good |
|
||||
| **Enhanced** | ML-based analysis using MusiCNN neural network for mood prediction | Best |
|
||||
|
||||
The system enables:
|
||||
- Finding tracks with similar vibes to a source track
|
||||
- Generating mood-based playlists
|
||||
- Visualizing track characteristics in real-time
|
||||
|
||||
---
|
||||
|
||||
## Metrics Collected
|
||||
|
||||
### Core Audio Features (Always Available)
|
||||
|
||||
These are extracted directly from audio signal analysis at 44.1kHz:
|
||||
|
||||
| Metric | Type | Range | Description |
|
||||
|--------|------|-------|-------------|
|
||||
| `bpm` | Float | 60-200 | Tempo in beats per minute |
|
||||
| `beatsCount` | Int | 0+ | Total number of beats detected |
|
||||
| `key` | String | "C", "F#", etc. | Musical key |
|
||||
| `keyScale` | String | "major" \| "minor" | Major or minor tonality |
|
||||
| `keyStrength` | Float | 0-1 | Confidence of key detection |
|
||||
| `energy` | Float | 0-1 | RMS-based intensity level |
|
||||
| `loudness` | Float | dB | Average loudness |
|
||||
| `dynamicRange` | Float | dB | Difference between quietest and loudest |
|
||||
| `danceability` | Float | 0-1 | Rhythm regularity and groove potential |
|
||||
|
||||
### ML Mood Predictions (Enhanced Mode)
|
||||
|
||||
Seven core mood dimensions predicted by the MusiCNN model:
|
||||
|
||||
| Metric | Type | Range | Description | Icon Suggestion |
|
||||
|--------|------|-------|-------------|-----------------|
|
||||
| `moodHappy` | Float | 0-1 | Happiness/cheerfulness probability | Smile |
|
||||
| `moodSad` | Float | 0-1 | Sadness/melancholy probability | Frown |
|
||||
| `moodRelaxed` | Float | 0-1 | Calm/peaceful probability | Coffee |
|
||||
| `moodAggressive` | Float | 0-1 | Intensity/aggression probability | Flame |
|
||||
| `moodParty` | Float | 0-1 | Upbeat/party probability | PartyPopper |
|
||||
| `moodAcoustic` | Float | 0-1 | Acoustic instrumentation probability | Guitar |
|
||||
| `moodElectronic` | Float | 0-1 | Electronic/synthetic probability | Radio |
|
||||
|
||||
### Derived Features (Computed)
|
||||
|
||||
These are calculated from the ML predictions:
|
||||
|
||||
#### Valence (Emotional Positivity)
|
||||
|
||||
```typescript
|
||||
// Formula:
|
||||
valence = (
|
||||
moodHappy * 0.5 + // Happy mood (50% weight)
|
||||
moodParty * 0.3 + // Party mood (30% weight)
|
||||
(1 - moodSad) * 0.2 // Inverse of sadness (20% weight)
|
||||
)
|
||||
```
|
||||
|
||||
| Value | Interpretation |
|
||||
|-------|----------------|
|
||||
| 0.0 - 0.3 | Melancholic, sad |
|
||||
| 0.3 - 0.6 | Neutral, balanced |
|
||||
| 0.6 - 1.0 | Happy, positive |
|
||||
|
||||
#### Arousal (Energy/Excitement Level)
|
||||
|
||||
```typescript
|
||||
// Formula:
|
||||
arousal = (
|
||||
moodAggressive * 0.35 + // Aggressive mood (35% weight)
|
||||
moodParty * 0.25 + // Party mood (25% weight)
|
||||
moodElectronic * 0.2 + // Electronic sound (20% weight)
|
||||
(1 - moodRelaxed) * 0.1 + // Inverse of relaxation (10% weight)
|
||||
(1 - moodAcoustic) * 0.1 // Inverse of acoustic (10% weight)
|
||||
)
|
||||
```
|
||||
|
||||
| Value | Interpretation |
|
||||
|-------|----------------|
|
||||
| 0.0 - 0.3 | Calm, peaceful |
|
||||
| 0.3 - 0.6 | Moderate energy |
|
||||
| 0.6 - 1.0 | High energy, intense |
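For UI code, the valence and arousal interpretation tables can be turned into a small lookup. This is a sketch: the helper names are hypothetical, and because the tables list 0.3 and 0.6 in two rows each, the sketch assumes each boundary value belongs to the higher band.

```typescript
type LevelBand = "low" | "mid" | "high";

// Assumption: 0.3 counts as "mid" and 0.6 counts as "high".
const toLevelBand = (value: number): LevelBand =>
  value < 0.3 ? "low" : value < 0.6 ? "mid" : "high";

const VALENCE_LABELS: Record<LevelBand, string> = {
  low: "Melancholic, sad",
  mid: "Neutral, balanced",
  high: "Happy, positive",
};

const AROUSAL_LABELS: Record<LevelBand, string> = {
  low: "Calm, peaceful",
  mid: "Moderate energy",
  high: "High energy, intense",
};

const describeValence = (valence: number): string => VALENCE_LABELS[toLevelBand(valence)];
const describeArousal = (arousal: number): string => AROUSAL_LABELS[toLevelBand(arousal)];

console.log(describeValence(0.53)); // "Neutral, balanced"
console.log(describeArousal(0.72)); // "High energy, intense"
```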
|
||||
|
||||
### Additional Features
|
||||
|
||||
| Metric | Type | Range | Description |
|
||||
|--------|------|-------|-------------|
|
||||
| `instrumentalness` | Float | 0-1 | Voice presence (0=vocal, 1=instrumental) |
|
||||
| `acousticness` | Float | 0-1 | Acoustic vs. processed sound |
|
||||
| `speechiness` | Float | 0-1 | Spoken word detection |
|
||||
| `danceabilityMl` | Float | 0-1 | ML-based danceability (more accurate) |
|
||||
|
||||
### Metadata & Tags
|
||||
|
||||
| Field | Type | Description |
|
||||
|-------|------|-------------|
|
||||
| `moodTags` | String[] | Derived mood labels (e.g., ["chill", "happy"]) |
|
||||
| `essentiaGenres` | String[] | ML-predicted genres (e.g., ["rock", "electronic"]) |
|
||||
| `lastfmTags` | String[] | User-generated tags from Last.fm |
|
||||
| `analysisStatus` | String | "pending" \| "processing" \| "completed" \| "failed" |
|
||||
| `analysisMode` | String | "standard" \| "enhanced" |
|
||||
| `analyzedAt` | DateTime | When analysis was performed |
|
||||
|
||||
---
|
||||
|
||||
## Data Structures
|
||||
|
||||
### TypeScript Interface
|
||||
|
||||
```typescript
|
||||
interface AudioFeatures {
|
||||
// Core audio features
|
||||
bpm?: number | null;
|
||||
beatsCount?: number | null;
|
||||
key?: string | null;
|
||||
keyScale?: string | null;
|
||||
keyStrength?: number | null;
|
||||
energy?: number | null;
|
||||
loudness?: number | null;
|
||||
dynamicRange?: number | null;
|
||||
danceability?: number | null;
|
||||
|
||||
// Derived features
|
||||
valence?: number | null;
|
||||
arousal?: number | null;
|
||||
|
||||
// Additional features
|
||||
instrumentalness?: number | null;
|
||||
acousticness?: number | null;
|
||||
speechiness?: number | null;
|
||||
danceabilityMl?: number | null;
|
||||
|
||||
// ML Mood predictions (Enhanced mode)
|
||||
moodHappy?: number | null;
|
||||
moodSad?: number | null;
|
||||
moodRelaxed?: number | null;
|
||||
moodAggressive?: number | null;
|
||||
moodParty?: number | null;
|
||||
moodAcoustic?: number | null;
|
||||
moodElectronic?: number | null;
|
||||
|
||||
// Metadata
|
||||
analysisStatus?: string | null;
|
||||
analysisMode?: string | null;
|
||||
analyzedAt?: string | null;
|
||||
|
||||
// Tags
|
||||
moodTags?: string[];
|
||||
essentiaGenres?: string[];
|
||||
lastfmTags?: string[];
|
||||
}
|
||||
```
|
||||
|
||||
### Feature Display Configuration
|
||||
|
||||
Recommended configuration for displaying features in UI:
|
||||
|
||||
```typescript
|
||||
const FEATURE_CONFIG = [
|
||||
{
|
||||
key: "energy",
|
||||
label: "Energy",
|
||||
icon: "Zap", // lucide-react icon
|
||||
min: 0,
|
||||
max: 1,
|
||||
lowLabel: "Calm",
|
||||
highLabel: "Intense",
|
||||
},
|
||||
{
|
||||
key: "valence",
|
||||
label: "Mood",
|
||||
icon: "Heart",
|
||||
min: 0,
|
||||
max: 1,
|
||||
lowLabel: "Melancholic",
|
||||
highLabel: "Happy",
|
||||
},
|
||||
{
|
||||
key: "danceability",
|
||||
label: "Groove",
|
||||
icon: "Footprints",
|
||||
min: 0,
|
||||
max: 1,
|
||||
lowLabel: "Freeform",
|
||||
highLabel: "Danceable",
|
||||
},
|
||||
{
|
||||
key: "bpm",
|
||||
label: "Tempo",
|
||||
icon: "Gauge",
|
||||
min: 60,
|
||||
max: 180,
|
||||
lowLabel: "Slow",
|
||||
highLabel: "Fast",
|
||||
unit: "BPM",
|
||||
},
|
||||
{
|
||||
key: "arousal",
|
||||
label: "Arousal",
|
||||
icon: "AudioWaveform",
|
||||
min: 0,
|
||||
max: 1,
|
||||
lowLabel: "Peaceful",
|
||||
highLabel: "Energetic",
|
||||
},
|
||||
];
|
||||
|
||||
const ML_MOOD_CONFIG = [
|
||||
{ key: "moodHappy", label: "Happy", icon: "Smile", color: "yellow-400" },
|
||||
{ key: "moodSad", label: "Sad", icon: "Frown", color: "blue-400" },
|
||||
{ key: "moodRelaxed", label: "Relaxed", icon: "Coffee", color: "green-400" },
|
||||
{ key: "moodAggressive", label: "Aggressive", icon: "Flame", color: "red-400" },
|
||||
{ key: "moodParty", label: "Party", icon: "PartyPopper", color: "pink-400" },
|
||||
{ key: "moodAcoustic", label: "Acoustic", icon: "Guitar", color: "amber-400" },
|
||||
{ key: "moodElectronic", label: "Electronic", icon: "Radio", color: "purple-400" },
|
||||
];
|
||||
```
|
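As a rough illustration of how one `FEATURE_CONFIG` entry might drive a label in the UI, the sketch below formats a raw value and picks the low or high label. `FeatureConfigEntry` and `describeFeature` are hypothetical names, and using the midpoint of the range as the cutoff between `lowLabel` and `highLabel` is an assumption.

```typescript
// Hypothetical shape of one FEATURE_CONFIG entry (field names mirror the config above).
interface FeatureConfigEntry {
  key: string;
  label: string;
  min: number;
  max: number;
  lowLabel: string;
  highLabel: string;
  unit?: string;
}

// Assumed helper: formats a raw value and picks the low/high label
// depending on which half of the [min, max] range the value falls in.
function describeFeature(entry: FeatureConfigEntry, value: number | null): string {
  if (value === null) return `${entry.label}: n/a`;
  const position = (value - entry.min) / (entry.max - entry.min);
  const clamped = Math.max(0, Math.min(1, position));
  const side = clamped >= 0.5 ? entry.highLabel : entry.lowLabel;
  // Features with a unit show the raw value; unitless ones show a percentage.
  const shown = entry.unit
    ? `${Math.round(value)} ${entry.unit}`
    : `${Math.round(clamped * 100)}%`;
  return `${entry.label}: ${shown} (${side})`;
}

const tempoEntry: FeatureConfigEntry = {
  key: "bpm", label: "Tempo", min: 60, max: 180,
  lowLabel: "Slow", highLabel: "Fast", unit: "BPM",
};

console.log(describeFeature(tempoEntry, 150)); // "Tempo: 150 BPM (Fast)"
```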
||||
|
||||
---
|
||||
|
||||
## Vibe Matching Algorithm
|
||||
|
||||
### Feature Vector Construction
|
||||
|
||||
The system builds a **13-dimensional feature vector** for each track:
|
||||
|
||||
```typescript
|
||||
const buildFeatureVector = (track: AudioFeatures) => [
|
||||
// ML Mood predictions (7 features) - 1.3x weight for semantic importance
|
||||
getMoodValue(track.moodHappy, 0.5) * 1.3,
|
||||
getMoodValue(track.moodSad, 0.5) * 1.3,
|
||||
getMoodValue(track.moodRelaxed, 0.5) * 1.3,
|
||||
getMoodValue(track.moodAggressive, 0.5) * 1.3,
|
||||
getMoodValue(track.moodParty, 0.5) * 1.3,
|
||||
getMoodValue(track.moodAcoustic, 0.5) * 1.3,
|
||||
getMoodValue(track.moodElectronic, 0.5) * 1.3,
|
||||
|
||||
// Audio features (4 features)
|
||||
track.energy ?? 0.5,
|
||||
calculateEnhancedArousal(track),
|
||||
track.danceabilityMl ?? track.danceability ?? 0.5,
|
||||
track.instrumentalness ?? 0.5,
|
||||
|
||||
// BPM (octave-aware normalization)
|
||||
1 - octaveAwareBPMDistance(track.bpm ?? 120, 120),
|
||||
|
||||
// Valence
|
||||
calculateEnhancedValence(track),
|
||||
];
|
||||
|
||||
// Helper: Get mood value with fallback
|
||||
const getMoodValue = (value: number | null | undefined, fallback: number) =>
|
||||
value ?? fallback;
|
||||
```
|
||||
|
||||
### Cosine Similarity Calculation
|
||||
|
||||
Tracks are compared using cosine similarity:
|
||||
|
||||
```typescript
|
||||
const cosineSimilarity = (vectorA: number[], vectorB: number[]): number => {
|
||||
let dotProduct = 0;
|
||||
let magA = 0;
|
||||
let magB = 0;
|
||||
|
||||
for (let i = 0; i < vectorA.length; i++) {
|
||||
dotProduct += vectorA[i] * vectorB[i];
|
||||
magA += vectorA[i] * vectorA[i];
|
||||
magB += vectorB[i] * vectorB[i];
|
||||
}
|
||||
|
||||
return magA === 0 || magB === 0 ? 0 : dotProduct / (Math.sqrt(magA) * Math.sqrt(magB)); // avoid dividing by zero for all-zero vectors
|
||||
};
|
||||
```
|
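A small worked example may help build intuition. The toy vectors below have three components rather than thirteen and are invented for illustration; `cosineSim` repeats the computation above so the snippet runs on its own. Assuming every feature is non-negative (as the 0 to 1 ranges in this document suggest), the result always falls between 0 and 1.

```typescript
// Same computation as above, repeated so the snippet is self-contained
// (with a guard for all-zero vectors).
const cosineSim = (a: number[], b: number[]): number => {
  let dot = 0;
  let magA = 0;
  let magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  const denom = Math.sqrt(magA) * Math.sqrt(magB);
  return denom === 0 ? 0 : dot / denom;
};

// Toy vectors, invented for this example.
const upbeat = [0.9, 0.1, 0.8];
const similarTrack = [0.8, 0.2, 0.7];
const oppositeTrack = [0.1, 0.9, 0.1];

console.log(cosineSim(upbeat, similarTrack).toFixed(3));  // close to 1
console.log(cosineSim(upbeat, oppositeTrack).toFixed(3)); // much lower
```

Because cosine similarity compares direction rather than magnitude, two tracks whose features are all scaled by the same factor still score as a perfect match.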
||||
|
||||
### Tag/Genre Bonus
|
||||
|
||||
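Tracks that share tags or genres with the source track receive a small additive bonus: 1% per shared tag (compared case-insensitively), capped at 5%: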
|
||||
|
||||
```typescript
|
||||
const computeTagBonus = (
|
||||
sourceTags: string[],
|
||||
sourceGenres: string[],
|
||||
trackTags: string[],
|
||||
trackGenres: string[]
|
||||
): number => {
|
||||
const sourceSet = new Set(
|
||||
[...sourceTags, ...sourceGenres].map(t => t.toLowerCase())
|
||||
);
|
||||
const trackSet = new Set(
|
||||
[...trackTags, ...trackGenres].map(t => t.toLowerCase())
|
||||
);
|
||||
|
||||
const overlap = [...sourceSet].filter(tag => trackSet.has(tag)).length;
|
||||
return Math.min(0.05, overlap * 0.01); // Max 5% bonus
|
||||
};
|
||||
```
|
||||
|
||||
### Final Score
|
||||
|
||||
```typescript
|
||||
const finalScore = cosineSimilarity(sourceVector, targetVector) * 0.95 + tagBonus;
|
||||
```
|
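To make the arithmetic concrete, here is a minimal sketch that combines the two terms. `finalVibeScore` is a hypothetical helper name; it takes an already-computed cosine similarity and a count of shared tags rather than the raw vectors and tag lists.

```typescript
// Assumed helper combining the two terms described above.
// `cosine` is the cosine similarity of the two feature vectors;
// `sharedTagCount` is the number of tags/genres both tracks have in common.
function finalVibeScore(cosine: number, sharedTagCount: number): number {
  const tagBonus = Math.min(0.05, sharedTagCount * 0.01); // capped at 5%
  return cosine * 0.95 + tagBonus;
}

console.log(finalVibeScore(0.9, 2).toFixed(3));  // 0.9 * 0.95 + 0.02
console.log(finalVibeScore(1.0, 10).toFixed(3)); // 0.95 + 0.05 (bonus capped)
```

Scaling the cosine term by 0.95 leaves exactly enough headroom for the maximum 0.05 bonus, so the final score cannot exceed 1 when the cosine similarity is at most 1.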
||||
|
||||
### Matching Thresholds
|
||||
|
||||
| Mode | Minimum Similarity |
|
||||
|------|-------------------|
|
||||
| Enhanced | 40% |
|
||||
| Standard | 50% |
|
||||
|
||||
Enhanced mode uses the lower threshold because its ML mood predictions differentiate tracks more finely, so genuinely similar tracks can still score below 50%.
|
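One way the thresholds could be applied is sketched below. `selectMatches`, `ScoredTrack`, and `MIN_SIMILARITY` are hypothetical names, and the sketch assumes scores are already expressed as fractions between 0 and 1.

```typescript
type AnalysisMode = "enhanced" | "standard";

// Thresholds from the table above, expressed as fractions.
const MIN_SIMILARITY: Record<AnalysisMode, number> = {
  enhanced: 0.4,
  standard: 0.5,
};

interface ScoredTrack {
  id: string;
  score: number; // final score in [0, 1]
}

// Hypothetical helper: drop candidates below the mode's threshold,
// then return the best `limit` matches, highest score first.
function selectMatches(
  candidates: ScoredTrack[],
  mode: AnalysisMode,
  limit: number
): ScoredTrack[] {
  return candidates
    .filter(c => c.score >= MIN_SIMILARITY[mode]) // filter() copies, so the input is not mutated
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

const candidates: ScoredTrack[] = [
  { id: "a", score: 0.45 },
  { id: "b", score: 0.87 },
  { id: "c", score: 0.3 },
];

console.log(selectMatches(candidates, "enhanced", 20).map(t => t.id)); // ["b", "a"]
console.log(selectMatches(candidates, "standard", 20).map(t => t.id)); // ["b"]
```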
||||
|
||||
### Octave-Aware BPM Matching
|
||||
|
||||
Treats harmonically related tempos as similar (60 BPM ≈ 120 BPM ≈ 240 BPM):
|
||||
|
||||
```typescript
|
||||
const octaveAwareBPMDistance = (bpm1: number, bpm2: number): number => {
|
||||
const normalizeToOctave = (bpm: number): number => {
|
||||
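if (!isFinite(bpm) || bpm <= 0) return 120; // invalid tempo: use the neutral 120 BPM instead of looping forever
while (bpm < 77) bpm *= 2;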
|
||||
while (bpm > 154) bpm /= 2;
|
||||
return bpm;
|
||||
};
|
||||
|
||||
const norm1 = normalizeToOctave(bpm1);
|
||||
const norm2 = normalizeToOctave(bpm2);
|
||||
|
||||
const logDistance = Math.abs(Math.log2(norm1) - Math.log2(norm2));
|
||||
return Math.min(logDistance, 1);
|
||||
};
|
||||
```
|
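The following sketch runs the distance on a few tempos to show the folding behaviour. `bpmDistance` is a standalone copy of the function above, with a guard for non-positive or non-finite input whose fallback to 120 BPM is an assumption.

```typescript
// Standalone copy of the distance function so the snippet runs on its own.
const bpmDistance = (bpm1: number, bpm2: number): number => {
  const fold = (bpm: number): number => {
    if (!isFinite(bpm) || bpm <= 0) return 120; // assumed neutral fallback
    while (bpm < 77) bpm *= 2;   // fold slow tempos up an octave
    while (bpm > 154) bpm /= 2;  // fold fast tempos down an octave
    return bpm;
  };
  return Math.min(Math.abs(Math.log2(fold(bpm1)) - Math.log2(fold(bpm2))), 1);
};

console.log(bpmDistance(60, 120));             // 0: 60 folds up to 120
console.log(bpmDistance(240, 120));            // 0: 240 folds down to 120
console.log(bpmDistance(90, 120).toFixed(3));  // 0.415, i.e. log2(120 / 90)
console.log(bpmDistance(76, 78).toFixed(3));   // large: 76 folds to 152, 78 stays at 78
```

The last line shows a side effect of folding into a fixed 77 to 154 window: tempos that sit just either side of the 77 BPM boundary land at opposite ends of the window and are scored as far apart, even though they are nearly identical.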
||||
|
||||
---
|
||||
|
||||
## API Endpoints
|
||||
|
||||
### Get Track Audio Features
|
||||
|
||||
```
|
||||
GET /api/tracks/:id/features
|
||||
```
|
||||
|
||||
Response:
|
||||
```json
|
||||
{
|
||||
"bpm": 128.5,
|
||||
"energy": 0.78,
|
||||
"valence": 0.65,
|
||||
"arousal": 0.72,
|
||||
"danceability": 0.85,
|
||||
"key": "C",
|
||||
"keyScale": "major",
|
||||
"moodHappy": 0.72,
|
||||
"moodSad": 0.15,
|
||||
"moodRelaxed": 0.28,
|
||||
"moodAggressive": 0.45,
|
||||
"moodParty": 0.68,
|
||||
"moodAcoustic": 0.12,
|
||||
"moodElectronic": 0.78,
|
||||
"analysisMode": "enhanced",
|
||||
"analysisStatus": "completed"
|
||||
}
|
||||
```
|
||||
|
||||
### Find Similar Tracks (Vibe Match)
|
||||
|
||||
```
|
||||
GET /api/library/vibe-match?trackId=:id&limit=20
|
||||
```
|
||||
|
||||
Response:
|
||||
```json
|
||||
{
|
||||
"source": { /* track with features */ },
|
||||
"matches": [
|
||||
{
|
||||
"track": { /* track data */ },
|
||||
"similarity": 0.87,
|
||||
"features": { /* audio features */ }
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
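On the client side, the request path and response handling could look roughly like this. The interfaces are inferred from the example response above (the `track`, `features`, and `source` payloads are left opaque because their fields are not spelled out here), and `vibeMatchPath` and `strongMatchPercents` are hypothetical helper names.

```typescript
// Response shape inferred from the example above.
interface VibeMatch {
  track: unknown;
  similarity: number; // 0..1
  features: unknown;
}

interface VibeMatchResponse {
  source: unknown;
  matches: VibeMatch[];
}

// Builds the request path for the endpoint above.
function vibeMatchPath(trackId: string, limit = 20): string {
  return `/api/library/vibe-match?trackId=${encodeURIComponent(trackId)}&limit=${limit}`;
}

// Hypothetical helper: keep only strong matches and convert them to percentages.
function strongMatchPercents(res: VibeMatchResponse, minSimilarity = 0.8): number[] {
  return res.matches
    .filter(m => m.similarity >= minSimilarity)
    .map(m => Math.round(m.similarity * 100));
}

const sampleResponse: VibeMatchResponse = {
  source: null,
  matches: [
    { track: null, similarity: 0.87, features: null },
    { track: null, similarity: 0.52, features: null },
  ],
};

console.log(vibeMatchPath("abc123"));             // "/api/library/vibe-match?trackId=abc123&limit=20"
console.log(strongMatchPercents(sampleResponse)); // [87]
```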
||||
|
||||
### Generate Mood Mix
|
||||
|
||||
```
|
||||
POST /api/mixes/mood
|
||||
```
|
||||
|
||||
Request:
|
||||
```json
|
||||
{
|
||||
"valence": { "min": 0.6, "max": 1.0 },
|
||||
"energy": { "min": 0.5, "max": 0.8 },
|
||||
"danceability": { "min": 0.7, "max": 1.0 },
|
||||
"bpm": { "min": 100, "max": 140 },
|
||||
"limit": 15
|
||||
}
|
||||
```
|
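A request like the one above could be matched against tracks along these lines. This is a sketch rather than the backend's actual query: `matchesMoodRequest` is a hypothetical name, and treating a missing feature value as a failed range is an assumption.

```typescript
interface Range {
  min: number;
  max: number;
}

// Request body shape inferred from the example above; every range is optional.
interface MoodMixRequest {
  valence?: Range;
  energy?: Range;
  danceability?: Range;
  bpm?: Range;
  limit?: number;
}

type FeatureKey = "valence" | "energy" | "danceability" | "bpm";
type TrackFeatures = Partial<Record<FeatureKey, number | null>>;

const FEATURE_KEYS: FeatureKey[] = ["valence", "energy", "danceability", "bpm"];

// Assumed rule: a track qualifies only if every requested range contains
// the track's value; a missing value fails that range.
function matchesMoodRequest(track: TrackFeatures, req: MoodMixRequest): boolean {
  return FEATURE_KEYS.every(key => {
    const range = req[key];
    if (!range) return true; // this feature is not constrained
    const value = track[key];
    return value != null && value >= range.min && value <= range.max;
  });
}

const moodRequest: MoodMixRequest = {
  valence: { min: 0.6, max: 1.0 },
  energy: { min: 0.5, max: 0.8 },
  danceability: { min: 0.7, max: 1.0 },
  bpm: { min: 100, max: 140 },
  limit: 15,
};

console.log(matchesMoodRequest({ valence: 0.7, energy: 0.6, danceability: 0.85, bpm: 128 }, moodRequest)); // true
console.log(matchesMoodRequest({ valence: 0.7, energy: 0.9, danceability: 0.85, bpm: 128 }, moodRequest)); // false
```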
||||
|
||||
### Get Mood Presets
|
||||
|
||||
```
|
||||
GET /api/mixes/mood-presets
|
||||
```
|
||||
|
||||
Response:
|
||||
```json
|
||||
[
|
||||
{
|
||||
"id": "chill",
|
||||
"name": "Chill Vibes",
|
||||
"color": "from-blue-600 to-purple-600",
|
||||
"params": {
|
||||
"valence": { "min": 0.3, "max": 0.7 },
|
||||
"energy": { "min": 0.1, "max": 0.4 }
|
||||
}
|
||||
}
|
||||
]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Frontend Integration Guide
|
||||
|
||||
### Displaying Feature Values
|
||||
|
||||
Normalize values for consistent display:
|
||||
|
||||
```typescript
|
||||
function normalizeValue(
|
||||
value: number | null | undefined,
|
||||
min: number,
|
||||
max: number
|
||||
): number {
|
||||
if (value === null || value === undefined) return 0;
|
||||
return Math.max(0, Math.min(1, (value - min) / (max - min)));
|
||||
}
|
||||
|
||||
// Usage
|
||||
const normalizedBpm = normalizeValue(track.bpm, 60, 180);
|
||||
const normalizedEnergy = normalizeValue(track.energy, 0, 1);
|
||||
```
|
||||
|
||||
### Calculating Match Scores
|
||||
|
||||
```typescript
|
||||
function calculateFeatureMatch(
|
||||
sourceVal: number | null,
|
||||
currentVal: number | null,
|
||||
min: number,
|
||||
max: number
|
||||
): { diff: number; match: number } {
|
||||
const sourceNorm = normalizeValue(sourceVal, min, max);
|
||||
const currentNorm = normalizeValue(currentVal, min, max);
|
||||
const diff = Math.abs(sourceNorm - currentNorm);
|
||||
const match = Math.round((1 - diff) * 100);
|
||||
|
||||
return { diff, match };
|
||||
}
|
||||
```
|
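A short worked example shows what these helpers return. The two functions are copied from above so the snippet runs on its own; `overallMatch`, which averages the per-feature percentages, is a hypothetical aggregate and not necessarily how the overlay computes its headline score.

```typescript
// Copies of the two helpers above so the snippet is self-contained.
function normalizeValue(value: number | null | undefined, min: number, max: number): number {
  if (value === null || value === undefined) return 0;
  return Math.max(0, Math.min(1, (value - min) / (max - min)));
}

function calculateFeatureMatch(
  sourceVal: number | null,
  currentVal: number | null,
  min: number,
  max: number
): { diff: number; match: number } {
  const diff = Math.abs(normalizeValue(sourceVal, min, max) - normalizeValue(currentVal, min, max));
  return { diff, match: Math.round((1 - diff) * 100) };
}

// Hypothetical aggregate: the mean of the per-feature match percentages.
function overallMatch(matches: number[]): number {
  if (matches.length === 0) return 0;
  return Math.round(matches.reduce((sum, m) => sum + m, 0) / matches.length);
}

const tempoMatch = calculateFeatureMatch(120, 150, 60, 180); // normalized 0.5 vs 0.75
const energyMatch = calculateFeatureMatch(0.8, 0.6, 0, 1);

console.log(tempoMatch.match, energyMatch.match); // 75 80
console.log(overallMatch([tempoMatch.match, energyMatch.match])); // 78
```

One caveat follows from `normalizeValue` mapping missing values to 0: if both tracks lack a feature, `calculateFeatureMatch(null, null, 0, 1)` reports a 100% match, so callers may want to skip features that are missing on either side.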
||||
|
||||
### Match Score Color Coding
|
||||
|
||||
```typescript
|
||||
function getMatchColor(matchPercent: number): string {
|
||||
if (matchPercent >= 80) return "text-green-400"; // Excellent
|
||||
if (matchPercent >= 60) return "text-yellow-400"; // Good
|
||||
return "text-red-400"; // Different
|
||||
}
|
||||
|
||||
function getMatchDescription(matchPercent: number): string {
|
||||
if (matchPercent >= 80) return "Excellent match - very similar vibe";
|
||||
if (matchPercent >= 60) return "Good match - similar energy";
|
||||
return "Different vibe - exploring variety";
|
||||
}
|
||||
```
|
||||
|
||||
### Visualization Recommendations
|
||||
|
||||
#### 1. Radar Chart (Spider Graph)
|
||||
Best for comparing multiple features at once. Shows source track (dashed line) vs current track (solid fill).
|
||||
|
||||
#### 2. Progress Bars
|
||||
Best for individual feature comparison with source marker overlay.
|
||||
|
||||
#### 3. Mood Grid
|
||||
4x2 or 4x4 grid of ML mood indicators with percentage matches.
|
||||
|
||||
#### 4. Valence-Arousal Quadrant
|
||||
2D scatter plot with:
|
||||
- X-axis: Valence (sad → happy)
|
||||
- Y-axis: Arousal (calm → energetic)
|
||||
|
||||
Quadrants:
|
||||
- Top-right: Happy + Energetic (Party)
|
||||
- Top-left: Sad + Energetic (Angry/Tense)
|
||||
- Bottom-right: Happy + Calm (Peaceful)
|
||||
- Bottom-left: Sad + Calm (Melancholic)
|
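The quadrant layout above translates directly into a small classifier. The sketch assumes both axes run from 0 to 1 and uses the neutral value 0.5 as the boundary; `classifyQuadrant` is a hypothetical name.

```typescript
type Quadrant = "Party" | "Angry/Tense" | "Peaceful" | "Melancholic";

// Assumes valence and arousal are both in [0, 1] and that 0.5 (the
// neutral value) is the boundary; values exactly at 0.5 count as high.
function classifyQuadrant(valence: number, arousal: number): Quadrant {
  const happy = valence >= 0.5;
  const energetic = arousal >= 0.5;
  if (happy && energetic) return "Party";
  if (!happy && energetic) return "Angry/Tense";
  if (happy && !energetic) return "Peaceful";
  return "Melancholic";
}

console.log(classifyQuadrant(0.65, 0.72)); // "Party"
console.log(classifyQuadrant(0.2, 0.3));   // "Melancholic"
```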
||||
|
||||
---
|
||||
|
||||
## Existing Components Reference
|
||||
|
||||
### VibeOverlay
|
||||
Location: `frontend/components/player/VibeOverlay.tsx`
|
||||
|
||||
Full-featured overlay showing:
|
||||
- Overall match percentage
|
||||
- Feature-by-feature comparison bars
|
||||
- ML mood grid (enhanced mode)
|
||||
- Source vs current legend
|
||||
|
||||
### VibeGraph
|
||||
Location: `frontend/components/player/VibeGraph.tsx`
|
||||
|
||||
Compact radar chart for:
|
||||
- 4-feature comparison (Energy, Mood, Dance, BPM)
|
||||
- Match score badge
|
||||
- Inline display in player
|
||||
|
||||
### MoodMixer
|
||||
Location: `frontend/components/MoodMixer.tsx`
|
||||
|
||||
Modal for:
|
||||
- Quick mood presets
|
||||
- Custom range sliders
|
||||
- Generating mood-based playlists
|
||||
|
||||
---
|
||||
|
||||
## Special Considerations
|
||||
|
||||
### Out-of-Distribution (OOD) Detection
|
||||
|
||||
The MusiCNN model was trained on pop/rock music, so its predictions may be unreliable for other genres (classical, ambient, jazz). The backend detects and normalizes these cases.
|
||||
|
||||
**Detection criteria:**
|
||||
- All mood values > 0.7 with low variance
|
||||
- All mood values clustered around 0.5
|
||||
|
||||
**UI Recommendation:** Show a subtle indicator when `analysisMode` is "standard" or when predictions seem unreliable.
|
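The two detection criteria could be checked on the frontend with something like the sketch below. The cutoffs `LOW_VARIANCE` and `NEUTRAL_BAND` are assumptions chosen for illustration, not values taken from the backend, and `looksOutOfDistribution` is a hypothetical name.

```typescript
// Assumed cutoffs; the backend's actual values are not documented here.
const LOW_VARIANCE = 0.01;  // "low variance" threshold
const NEUTRAL_BAND = 0.1;   // how close to 0.5 counts as "clustered"

function looksOutOfDistribution(moods: number[]): boolean {
  if (moods.length === 0) return false;
  const mean = moods.reduce((sum, m) => sum + m, 0) / moods.length;
  const variance =
    moods.reduce((sum, m) => sum + (m - mean) * (m - mean), 0) / moods.length;

  // Criterion 1: every mood is high and they barely differ from each other.
  const allHighAndFlat = moods.every(m => m > 0.7) && variance < LOW_VARIANCE;
  // Criterion 2: every mood sits near the neutral value 0.5.
  const allNearNeutral = moods.every(m => Math.abs(m - 0.5) <= NEUTRAL_BAND);
  return allHighAndFlat || allNearNeutral;
}

// Mood values from the example API response earlier in this document.
console.log(looksOutOfDistribution([0.72, 0.15, 0.28, 0.45, 0.68, 0.12, 0.78])); // false
console.log(looksOutOfDistribution([0.8, 0.82, 0.79, 0.81, 0.8, 0.83, 0.78]));   // true
```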
||||
|
||||
### Handling Missing Data
|
||||
|
||||
Always provide fallback values:
|
||||
|
||||
```typescript
|
||||
const safeFeatures = {
|
||||
energy: track.energy ?? 0.5,
|
||||
valence: track.valence ?? 0.5,
|
||||
bpm: track.bpm ?? 120,
|
||||
// ... etc
|
||||
};
|
||||
```
|
||||
|
||||
### Analysis Status States
|
||||
|
||||
| Status | UI Treatment |
|
||||
|--------|--------------|
|
||||
| `pending` | Show "Analyzing..." with spinner |
|
||||
| `processing` | Show progress indicator |
|
||||
| `completed` | Show full vibe data |
|
||||
| `failed` | Show fallback/retry option |
|
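The table above can be expressed as a small mapping function. The view-state names are hypothetical, and treating a missing or unrecognized status the same as `pending` is an assumption.

```typescript
// Hypothetical view-state names; the mapping follows the table above.
type VibeViewState = "spinner" | "progress" | "vibe-data" | "retry";

function viewStateFor(analysisStatus: string | null | undefined): VibeViewState {
  switch (analysisStatus) {
    case "pending":
      return "spinner";    // Show "Analyzing..." with spinner
    case "processing":
      return "progress";   // Show progress indicator
    case "completed":
      return "vibe-data";  // Show full vibe data
    case "failed":
      return "retry";      // Show fallback/retry option
    default:
      return "spinner";    // assumption: unknown or missing status is treated as pending
  }
}

console.log(viewStateFor("completed")); // "vibe-data"
console.log(viewStateFor(null));        // "spinner"
```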
||||
|
||||
---
|
||||
|
||||
## Quick Reference: Value Ranges
|
||||
|
||||
| Metric | Min | Max | Neutral |
|
||||
|--------|-----|-----|---------|
|
||||
| All mood* | 0 | 1 | 0.5 |
|
||||
| energy | 0 | 1 | 0.5 |
|
||||
| valence | 0 | 1 | 0.5 |
|
||||
| arousal | 0 | 1 | 0.5 |
|
||||
| danceability | 0 | 1 | 0.5 |
|
||||
| bpm | 60 | 200 | 120 |
|
||||
| keyStrength | 0 | 1 | - |
|
||||
|
||||
---
|
||||
|
||||
## File Locations
|
||||
|
||||
| Component | Path |
|
||||
|-----------|------|
|
||||
| Audio Analyzer (Python) | `services/audio-analyzer/analyzer.py` |
|
||||
| Vibe Matching Logic | `backend/src/routes/library.ts` |
|
||||
| Database Schema | `backend/prisma/schema.prisma` |
|
||||
| Frontend Vibe Overlay | `frontend/components/player/VibeOverlay.tsx` |
|
||||
| Frontend Vibe Graph | `frontend/components/player/VibeGraph.tsx` |
|
||||
| Mood Mixer | `frontend/components/MoodMixer.tsx` |
|
||||
| Audio State Context | `frontend/lib/audio-state-context.tsx` |
|
||||
|
||||
---
|
||||
|
||||
## Research Background
|
||||
|
||||
The Vibe System's valence and arousal calculations are informed by music psychology research:
|
||||
|
||||
### Valence (Emotional Positivity)
|
||||
|
||||
**Key Finding:** Mode/tonality is the strongest predictor of perceived valence in music.
|
||||
|
||||
- **Lee et al. (ICASSP 2020)** - Demonstrated that musical mode (major vs. minor) has the highest correlation with listener-reported valence
|
||||
- Major keys contribute positively (+0.3 in our formula), minor keys negatively (-0.2)
|
||||
- This aligns with centuries of music theory and empirical psychology research
|
||||
|
||||
### Arousal (Energy/Excitement)
|
||||
|
||||
**Key Finding:** The "electronic" mood prediction from ML models is unreliable for arousal calculation.
|
||||
|
||||
- **Grekow (2018)** - Found that direct energy and tempo features outperform genre-based predictions for arousal
|
||||
- Our implementation replaces the "electronic" mood with explicit energy and BPM contributions
|
||||
- This provides more consistent arousal predictions across diverse genres
|
||||
|
||||
### Feature Weights
|
||||
|
||||
The specific weights in our formulas (e.g., 0.35 for happy mood, 0.25 for energy) were tuned through:
|
||||
1. Initial values from published research
|
||||
2. Empirical testing on a diverse music library
|
||||
3. User feedback on vibe matching accuracy
|
||||
|
||||
### References
|
||||
|
||||
- Lee, J., et al. (2020). "Music Emotion Recognition Using Valence-Arousal Regression." ICASSP 2020.
|
||||
- Grekow, J. (2018). "Music Emotion Maps in Arousal-Valence Space." IFIP International Conference on Computer Information Systems and Industrial Management.
|
||||