Initial release v1.0.0

Kevin O'Neill
2025-12-25 18:58:06 -06:00
commit 021aec7a63
439 changed files with 116588 additions and 0 deletions
+86
@@ -0,0 +1,86 @@
# Lidify - Feature Overview
A self-hosted music streaming platform with intelligent discovery, podcast support, audiobooks, and a unique vibe-matching system.
---
## Music Discovery
**Discover Weekly** - AI-generated weekly playlists based on your listening history. Customize parameters like track duration and date ranges to fine-tune your discoveries.
**Smart Mixes** - Automatically generated playlists including era-based mixes (90s, 2000s), genre mixes, top tracks, rediscover mix (songs you haven't played in a while), and artist similarity mixes.
**Library Radio Stations** - One-click radio modes including Shuffle All, Workout (high energy), Discovery (lesser-played gems), and Favorites (most played). Genre and decade-based radio stations are dynamically created from your library.
**Similar Artists & Recommendations** - Powered by Last.fm integration, discover artists similar to ones you love and get personalized recommendations based on your listening habits.
---
## The Vibe System
**Vibe Button** - The standout feature. While listening to any track, tap the vibe button to see real-time audio analysis including energy, mood (valence), danceability, tempo, and arousal levels displayed on a visual radar chart.
**Keep The Vibe Going** - Uses ML mood predictions (Happy, Sad, Relaxed, Aggressive, Party, Acoustic, Electronic) to queue tracks that match your current vibe with a match percentage score.
**Mood Mixer** - Create custom playlists by adjusting mood sliders or using presets like Happy, Energetic, Chill, Focus, or Workout. The system finds tracks in your library matching your desired vibe.
---
## Playlist Import
**Spotify Import** - Paste any Spotify playlist URL to import. Preview shows which tracks match your library, which albums need downloading, and which tracks have no matches. Selectively download what you need.
**Deezer Import** - Same functionality for Deezer playlists. Browse Deezer's featured and genre playlists directly in-app.
---
## Podcasts
**Full Podcast Support** - Search iTunes for podcasts, subscribe, and manage your library. Browse top podcasts and discover by genre (Comedy, True Crime, News, Business, Sports, etc.).
**Progress Tracking** - "Continue listening" picks up exactly where you left off across all your subscribed shows.
---
## Audiobooks
**Audiobookshelf Integration** - Connect your Audiobookshelf instance to browse and play your audiobook collection directly in Lidify.
**Smart Organization** - Filter by currently listening or finished books. Group by series with proper sequence ordering. Sort by title, author, or recently played.
**Progress Sync** - Seamless progress tracking so you never lose your place.
---
## Library Management
**Multi-View Library** - Browse by Artists, Albums, or Tracks with flexible sorting and filtering options.
**Smart Filters** - View owned music, discovery tracks (from Discover Weekly), or everything combined.
**Bulk Operations** - Delete artists, albums, or tracks with confirmation. Paginated views for large libraries.
---
## Player
**Adaptive Player** - Full-width desktop player, mini player for mobile, and an immersive overlay mode.
**Universal Playback** - Single unified player handles music, podcasts, and audiobooks with type-specific controls.
**Queue Management** - Full control over what's playing next with shuffle and repeat modes.
---
## Additional Features
- Global search across artists, albums, and tracks
- Featured playlists from Deezer
- Recently added and popular artists sections
- Create and share playlists with other users
- MusicBrainz integration for accurate metadata
- Clean, responsive UI that works on desktop, tablet, and mobile
---
*Lidify is self-hosted, giving you full control over your music library with the discovery features of commercial streaming services.*
+212
@@ -0,0 +1,212 @@
# Lidify Testing Checklist
Use this checklist when testing Lidify before releases or after major changes.
## ✅ Automated Pre-Deploy Smoke Test (Recommended)
This repo includes a one-command smoke test that covers the **core** flows (API + UI). It intentionally skips “hard” items like lock-screen media controls, background playback on real devices, etc.
### Run (one command)
```bash
./scripts/predeploy-test.sh
```
### Notes
- **Requires music in `MUSIC_PATH`** (or `./music`) with at least one track, otherwise playback/playlist-related checks will fail.
- **Environment overrides** (optional):
  - `LIDIFY_UI_BASE_URL` (default `http://127.0.0.1:3030`)
  - `LIDIFY_API_BASE_URL` (default `http://127.0.0.1:3006`)
  - `LIDIFY_TEST_USERNAME` / `LIDIFY_TEST_PASSWORD`
  - `LIDIFY_TEARDOWN=0` to keep containers running after the script finishes
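
The overrides above amount to a defaults-plus-environment lookup. A minimal TypeScript sketch of that resolution follows. It assumes an unset or empty variable means "use the default" and that teardown is on unless `LIDIFY_TEARDOWN` is `0`; `resolveSmokeTestConfig` and `SmokeTestConfig` are hypothetical names, not part of `predeploy-test.sh`.

```typescript
// Hypothetical helper: mirrors the documented defaults, not the script's real code.
interface SmokeTestConfig {
  uiBaseUrl: string;
  apiBaseUrl: string;
  teardown: boolean;
}

type Env = Record<string, string | undefined>;

function resolveSmokeTestConfig(env: Env): SmokeTestConfig {
  // Assumption: an empty string is treated like an unset variable.
  const pick = (name: string, fallback: string): string => {
    const value = env[name];
    return value !== undefined && value !== "" ? value : fallback;
  };
  return {
    uiBaseUrl: pick("LIDIFY_UI_BASE_URL", "http://127.0.0.1:3030"),
    apiBaseUrl: pick("LIDIFY_API_BASE_URL", "http://127.0.0.1:3006"),
    // Assumption: only the literal "0" disables teardown.
    teardown: pick("LIDIFY_TEARDOWN", "1") !== "0",
  };
}
```

Passing the environment in as a plain object keeps the lookup easy to check without touching the real process environment.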
## 🎵 Audio Playback
### Music (Tracks)
- [ ] Play a track from an album
- [ ] Play/pause toggle works
- [ ] Seeking works (drag the progress bar)
- [ ] Fast forward (10s) works
- [ ] Rewind (10s) works
- [ ] Next track works
- [ ] Previous track works
- [ ] Volume slider works
- [ ] Mute toggle works
- [ ] Shuffle toggle works (plays random order)
- [ ] Repeat modes work (off, repeat all, repeat one)
- [ ] Queue displays correctly
- [ ] Removing tracks from queue works
### Podcasts
- [ ] Play a podcast episode
- [ ] Seeking works (when cached)
- [ ] Progress saves when pausing
- [ ] Progress resumes on different device/browser
- [ ] Can seek far ahead after episode is fully cached/downloaded
- [ ] Subscribing to a new podcast works
- [ ] Unsubscribing from a podcast works
- [ ] Episode list loads correctly
### Audiobooks
- [ ] Play an audiobook (requires Audiobookshelf integration)
- [ ] Progress saves automatically
- [ ] Can resume from saved position
- [ ] Reset progress works
- [ ] Mark as complete works
### Cross-Device Sync
- [ ] Start playing on desktop, resume on mobile (or vice versa)
- [ ] Queue syncs between devices
---
## 🔍 Discovery & Search
### Deezer Previews
- [ ] Preview button appears on unowned albums
- [ ] Preview button appears on artist discovery pages
- [ ] Preview plays 30-second clip
- [ ] Preview stops when full track starts
### Search
- [ ] Library search finds artists
- [ ] Library search finds albums
- [ ] Library search finds tracks
- [ ] Discovery search finds external artists
- [ ] Discovery search finds podcasts
---
## 📥 Downloads & Integration
### Lidarr Integration
- [ ] Download entire artist works
- [ ] Download individual album works
- [ ] Download status updates in real-time
- [ ] Webhook triggers library rescan after import
### Soularr (Soulseek)
- [ ] Search returns results
- [ ] Download from Soulseek works
- [ ] Downloaded files appear in library after scan
---
## 📚 Library Management
### Discover Weekly
- [ ] Generate Discover Weekly works
- [ ] Playlist populates with recommendations
- [ ] Can like/dislike albums
- [ ] Liked albums move to permanent collection
### Playlists
- [ ] Create new playlist works
- [ ] Add track to playlist works
- [ ] Remove track from playlist works
- [ ] Delete playlist works
- [ ] Reorder tracks (drag and drop) works
---
## 🔐 Authentication & Users
### Two-Factor Authentication
- [ ] Enable 2FA works
- [ ] Login with 2FA code works
- [ ] Recovery codes work
- [ ] Disable 2FA works
### User Management
- [ ] Create new user works (admin only)
- [ ] User can log in
- [ ] User has separate playlists/history
- [ ] Delete user works (admin only)
---
## 🎨 Metadata & Enrichment
### Artist Enrichment
- [ ] Manual enrichment button works
- [ ] Artist bio populates
- [ ] Artist genres populate
- [ ] Hero image/background loads
- [ ] Album art loads correctly
---
## 📱 PWA / Mobile
### Installation
- [ ] PWA install prompt appears on mobile browsers
- [ ] Can install to home screen (Android Chrome)
- [ ] Can add to home screen (iOS Safari)
### PWA Features
- [ ] Installed PWA opens in standalone mode
- [ ] Media Session controls show in notification/lock screen
- [ ] Background audio continues when screen is off
- [ ] Audio continues when switching tabs
---
## 🖥️ UI/UX
### General
- [ ] Login page loads correctly
- [ ] Onboarding flow works for new users
- [ ] Navigation between pages works
- [ ] Dark theme renders correctly
- [ ] Mobile responsive layout works
### Player
- [ ] Mini player shows on mobile
- [ ] Full player expands correctly
- [ ] Album art displays
- [ ] Artist/track info displays
---
## 🐳 Docker
### All-in-One Container
- [ ] Container starts without errors
- [ ] Web UI accessible on port 3030
- [ ] API proxying works (rewrites to backend)
- [ ] Database persists on restart
- [ ] Library scan works
---
## Notes
**Test Environment:**
- Browser:
- OS:
- Lidify Version:
- Date:
**Issues Found:**
-
@@ -0,0 +1,153 @@
<!-- f0350f33-28ae-4b99-a6ef-c0ec4fc46b90 3ebd44b8-4704-4bf4-a7cc-824ec82aafa3 -->
# Fix Lidarr Webhooks, Progress Updates, and Discovery Isolation
## Issue 1: Lidarr Webhook URL Missing /api Prefix (Critical)
**Root Cause**: [backend/src/routes/systemSettings.ts](backend/src/routes/systemSettings.ts) line 276 sets webhook URL to `http://host.docker.internal:3006/webhooks/lidarr` but the route is mounted at `/api/webhooks` in [backend/src/index.ts](backend/src/index.ts) line 137.
**Fix**: Update the webhook URL construction to:
1. Add `/api` prefix to the path
2. Use a smarter URL based on the request origin or a configurable callback URL
```typescript
// Line 276 - change from:
const webhookUrl = "http://host.docker.internal:3006/webhooks/lidarr";
// To something like:
const callbackHost = process.env.LIDIFY_CALLBACK_URL || "http://host.docker.internal:3006";
const webhookUrl = `${callbackHost}/api/webhooks/lidarr`;
```
Also add `LIDIFY_CALLBACK_URL` to Docker compose environment variables so users can configure it.
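
One detail worth guarding against once the host is configurable: a value with a trailing slash would produce `//api/webhooks/lidarr`. A small sketch of a normalizing builder, where `buildLidarrWebhookUrl` is a hypothetical helper name and the default host is the one quoted above:

```typescript
// Sketch only: `buildLidarrWebhookUrl` is a hypothetical helper, not existing Lidify code.
const DEFAULT_CALLBACK_URL = "http://host.docker.internal:3006";

function buildLidarrWebhookUrl(callbackUrl?: string): string {
  // Fall back to the default when the setting is missing or blank.
  const base =
    callbackUrl && callbackUrl.trim() !== "" ? callbackUrl.trim() : DEFAULT_CALLBACK_URL;
  // Drop trailing slashes so "http://host:3006/" does not yield "//api/...".
  const normalized = base.replace(/\/+$/, "");
  return `${normalized}/api/webhooks/lidarr`;
}
```

In the settings route this would presumably be called with the value of `process.env.LIDIFY_CALLBACK_URL`.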
---
## Issue 2: Audiobook/Podcast Progress Not Updating Real-time
**Root Cause**: [frontend/app/audiobooks/page.tsx](frontend/app/audiobooks/page.tsx) computes `continueListening` from `useAudiobooksQuery()` data only. When playback starts, the audio context updates but the query cache doesn't invalidate.
**Fix**: Modify the audiobooks page to:
1. Check if `currentAudiobook` from audio context matches any book in the list
2. If the currently playing audiobook isn't in `continueListening`, prepend it
3. Invalidate audiobooks query when playback starts/stops
```typescript
// In audiobooks page, combine query data with audio context
const { currentAudiobook } = useAudio();
const continueListening = useMemo(() => {
  const inProgress = audiobooks.filter(
    (book) => book.progress && book.progress.progress > 0 && !book.progress.isFinished
  );
  // If currently playing an audiobook that's not in the list, add it
  if (currentAudiobook && !inProgress.find(b => b.id === currentAudiobook.id)) {
    const currentBook = audiobooks.find(b => b.id === currentAudiobook.id);
    if (currentBook) {
      return [currentBook, ...inProgress];
    }
  }
  return inProgress;
}, [audiobooks, currentAudiobook]);
```
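
The same logic can be pulled out of the component as a pure function, which makes the "prepend the currently playing book" rule easy to unit test. This is a sketch under assumptions: the `Audiobook` and `AudiobookProgress` shapes are inferred from the snippet above, and `computeContinueListening` is a hypothetical name.

```typescript
// Shapes inferred from the snippet above; the real types may carry more fields.
interface AudiobookProgress {
  progress: number;
  isFinished: boolean;
}

interface Audiobook {
  id: string;
  title: string;
  progress?: AudiobookProgress | null;
}

function computeContinueListening(audiobooks: Audiobook[], currentId: string | null): Audiobook[] {
  const inProgress = audiobooks.filter(
    (book) => book.progress != null && book.progress.progress > 0 && !book.progress.isFinished
  );
  // Nothing playing, or the playing book is already listed: return as-is.
  if (currentId === null || inProgress.some((book) => book.id === currentId)) {
    return inProgress;
  }
  // Otherwise prepend the playing book, if it exists in the query data.
  const current = audiobooks.find((book) => book.id === currentId);
  return current ? [current, ...inProgress] : inProgress;
}
```

The component's `useMemo` would then only wrap a call to this function.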
---
## Issue 3: Discovery Albums Not Isolated from Library
**Root Cause Analysis**: The discovery system relies on:
1. Webhook firing to mark download complete
2. Download job having `discoveryBatchId` set
3. Scanner checking `isDiscoveryDownload()` during scan
If the webhook never fires (Issue 1), the scan still runs but cannot identify the albums as discovery downloads.
**Fix**:
1. Fix webhook URL (Issue 1) - this is the primary fix
2. Add fallback: During scan, also check if album path contains "discovery" in Lidarr metadata
3. Verify library routes filter by `location: "LIBRARY"` consistently
---
## Issue 4: Album Cover 404s Spamming Console
**Root Cause**: [frontend/features/artist/components/AvailableAlbums.tsx](frontend/features/artist/components/AvailableAlbums.tsx) fetches covers for unowned albums. When Cover Art Archive doesn't have them, 404 errors spam the console.
**Fix**:
1. In [backend/src/routes/library.ts](backend/src/routes/library.ts) `/album-cover/:mbid` endpoint - return 204 No Content instead of 404 for missing covers (less noisy)
2. In frontend - catch and silently handle missing covers, show placeholder
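
On the frontend side, the decision reduces to mapping a response status to either the cover URL or a placeholder. A minimal sketch, assuming the component has the HTTP status available; `resolveCoverSrc` and the placeholder path are hypothetical:

```typescript
// Hypothetical asset path; substitute whatever placeholder the UI already ships.
const PLACEHOLDER_COVER = "/placeholder-album.png";

function resolveCoverSrc(status: number, coverUrl: string | null): string {
  // 204 = backend has no cover; 404 is still handled for older backends.
  if (status === 204 || status === 404 || !coverUrl) {
    return PLACEHOLDER_COVER;
  }
  // Any other non-2xx response also falls back to the placeholder.
  return status >= 200 && status < 300 ? coverUrl : PLACEHOLDER_COVER;
}
```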
---
## Issue 5: Shared Playlists Not Showing Username
**Verification Needed**: The code exists in [frontend/app/playlists/page.tsx](frontend/app/playlists/page.tsx) lines 162-164. Check if backend is returning `user.username` correctly.
**Files to check**:
- [backend/src/routes/playlists.ts](backend/src/routes/playlists.ts) - verify `include: { user: { select: { username: true } } }` is working
- Verify playlists actually have `isOwner: false` when shared
---
## Issue 6: Discovery Playlist Never Appears
**Root Cause**: This is directly caused by Issue 1 (webhook URL). The discovery playlist flow is:
1. Discover Weekly generates recommendations and starts downloads
2. Lidarr grabs and downloads the albums
3. **Lidarr webhook fires on completion** (BROKEN - wrong URL)
4. `simpleDownloadManager.onDownloadComplete()` marks job complete
5. `discoverWeeklyService.checkBatchCompletion()` checks if all albums done
6. When batch complete, triggers scan with `source: "discover-weekly-completion"`
7. Scan processor calls `discoverWeeklyService.buildFinalPlaylist()`
8. Discovery playlist appears in UI
Since step 3 never happens, the playlist is never built.
**Fix**:
1. Fix webhook URL (Issue 1) - primary fix
2. Add a manual "Rebuild Discovery Playlist" button in the UI as fallback
3. Add a background job that periodically checks for orphaned discovery batches
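
For the background job in item 3, the core check could be a filter over batches that are still downloading after some cutoff. The sketch below is illustrative only: the `DiscoveryBatch` shape, its status values, and `findOrphanedBatches` are all assumptions, since the real batch model is not shown in this plan.

```typescript
// All names here are hypothetical; the real batch model is not shown in this plan.
interface DiscoveryBatch {
  id: string;
  status: "downloading" | "scanning" | "completed" | "failed";
  startedAt: Date;
}

function findOrphanedBatches(
  batches: DiscoveryBatch[],
  now: Date,
  maxAgeMs: number
): DiscoveryBatch[] {
  // A batch counts as orphaned if it is still downloading past the cutoff,
  // which is what happens when the completion webhook never arrives.
  return batches.filter(
    (batch) =>
      batch.status === "downloading" &&
      now.getTime() - batch.startedAt.getTime() > maxAgeMs
  );
}
```

The job would then run the normal completion check for each returned batch; the cutoff itself is a tuning choice.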
---
## Issue 7: Audiobooks/Podcasts Missing Filter/Sort Controls
**Problem**: Library page has sorting, pagination, and shuffle controls but audiobooks and podcasts pages don't match this design.
**Fix**: Add to [frontend/app/audiobooks/page.tsx](frontend/app/audiobooks/page.tsx) and [frontend/app/podcasts/page.tsx](frontend/app/podcasts/page.tsx):
- Sort dropdown (Title A-Z, Author A-Z, Recently Added, etc.)
- Items per page dropdown (25, 50, 100, 250)
- Pagination controls
- "Shuffle" button for audiobooks (shuffle all chapters/books)
Match the styling from [frontend/app/library/page.tsx](frontend/app/library/page.tsx) for visual consistency.
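
The sort dropdown, items-per-page dropdown, and pagination controls can share one small helper. A sketch under assumptions: the `Sortable` shape covers only the two text sort keys listed above, and `sortAndPaginate` is a hypothetical name rather than code from the library page.

```typescript
// Hypothetical shape: only the fields needed for the "Title A-Z" / "Author A-Z" sorts.
interface Sortable {
  title: string;
  author: string;
}

type SortKey = "title" | "author";

function sortAndPaginate<T extends Sortable>(
  items: T[],
  sortKey: SortKey,
  page: number, // 1-indexed
  perPage: number
): { pageItems: T[]; totalPages: number } {
  // Copy before sorting so the caller's (possibly cached) array is not mutated.
  const sorted = items.slice().sort((a, b) => a[sortKey].localeCompare(b[sortKey]));
  const totalPages = Math.max(1, Math.ceil(sorted.length / perPage));
  // Clamp out-of-range pages, e.g. after the user lowers the page size.
  const safePage = Math.min(Math.max(1, page), totalPages);
  const start = (safePage - 1) * perPage;
  return { pageItems: sorted.slice(start, start + perPage), totalPages };
}
```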
---
## Implementation Order
1. Fix Lidarr webhook URL (critical - blocking all download tracking)
2. Add real-time audiobook progress
3. Add filter/sort/pagination to audiobooks and podcasts pages
4. Suppress album cover 404 noise
5. Verify shared playlist data flow
6. Test discovery isolation after webhook fix
### To-dos
- [ ] Fix owned artist pages - not showing downloadable albums
- [ ] Change default playback quality to 'original'
- [ ] Create docs/ directory with tracking file, add to gitignore
- [ ] Fix Lidarr webhook URL to include /api prefix and make configurable
- [ ] Add real-time audiobook progress by combining query data with audio context
- [ ] Change album cover endpoint to return 204 instead of 404 for missing covers
- [ ] Debug shared playlist username display
- [ ] Test discovery isolation after webhook fix
+64
@@ -0,0 +1,64 @@
# Lidify Design System
## Brand Colors
- **Primary**: #fca200 (logo gold)
- **Hover**: #e69200 (darker gold for hover states)
- **Light**: #fcb84d (lighter gold for accents)
- **Dark**: #d48c00 (darker gold for emphasis)
## Design Principles
- **Glassmorphism**: Use `backdrop-blur-sm` with semi-transparent cards for premium feel
- **Border Radius**: `rounded-lg` (8px) for modern, edgy feel - avoid overly rounded elements
- **Shadows**: Prefer `shadow-lg`/`shadow-xl` over `shadow-2xl` for subtlety
- **Spacing**: 20-25% tighter than current values for refined look
- **Typography**: Smaller, tighter proportions for elegance
## Component Guidelines
### Buttons
- **Primary CTA**: `bg-brand hover:bg-brand-hover text-black font-bold rounded-lg py-3`
- **Secondary**: `bg-white/5 hover:bg-white/10 border border-white/10 rounded-lg py-2.5`
- **Avoid**: `rounded-full` (too soft), `rounded-2xl` (too rounded)
### Cards
- **Style**: `rounded-lg backdrop-blur-sm bg-[#111]/90 border border-white/10`
- **Shadow**: `shadow-xl` (subtle, premium)
- **Padding**: `p-6 md:p-8` (tighter than current)
### Form Elements
- **Inputs**: `rounded-lg py-2.5 px-4 bg-white/5 border border-white/10`
- **Focus**: `focus:ring-2 focus:ring-brand/30 focus:border-transparent`
- **Labels**: `text-sm font-medium text-white/90 mb-1.5`
### Typography
- **Page Headings**: `text-2xl` (reduced from `text-3xl`)
- **Section Headings**: `text-xl` (reduced from `text-2xl`)
- **Card Titles**: `text-sm font-semibold`
- **Spacing**: Tighter margins (`mb-1` vs `mb-2`)
## Layout Guidelines
### Login Page
- Logo: `mb-8`, `width={40}`
- Card: `rounded-lg p-6 md:p-8`
- Form: `space-y-4`
- Button: `py-3 rounded-lg`
### Onboarding Page
- Logo: `width={48}`
- Title: `text-4xl`
- Progress: `w-9 h-9` step circles
- Card: `rounded-lg p-6 md:p-8`
- Buttons: `py-3.5 rounded-lg`
## Color Usage
- Replace all `#ecb200` with `#fca200`
- Replace all `#ffc933` with `#e69200`
- Use Tailwind `text-brand`, `bg-brand`, `border-brand` classes
- Update gradient overlays to use new brand color
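
For `bg-brand`, `text-brand`, and `bg-brand-hover` to exist, the brand palette has to be registered in the Tailwind theme. A sketch of what that could look like, assuming a JS/TS Tailwind config with `theme.extend.colors`; the project's actual config file is not shown here, so treat the structure as illustrative:

```typescript
// tailwind.config.ts (sketch; the project's real config is not shown in this document)
const brandColors = {
  DEFAULT: "#fca200", // Primary: bg-brand, text-brand, border-brand
  hover: "#e69200", // bg-brand-hover
  light: "#fcb84d", // bg-brand-light
  dark: "#d48c00", // bg-brand-dark
};

const config = {
  theme: {
    extend: {
      colors: { brand: brandColors },
    },
  },
};
// In a real config file this object would be the default export.
```

The `DEFAULT` key is what lets the bare `brand` name resolve to the primary gold, while the other keys become suffixed utilities.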
## Implementation Notes
- Glassmorphism effect: `backdrop-blur-sm` (subtle)
- Card opacity: `bg-[#111]/90` (90% opacity)
- Border consistency: `border-white/10` throughout
- Shadow consistency: `shadow-xl` for cards
@@ -0,0 +1,191 @@
# Spotify Import - Code Reference
Quick reference to key code sections for the next agent.
## Backend Entry Points
### Preview Playlist
**File**: `backend/src/routes/spotify.ts`
**Endpoint**: `POST /spotify/preview`
**Handler**: Lines ~50-120
```typescript
// Fetches Spotify playlist, searches MusicBrainz for albums
const preview = await spotifyImportService.previewPlaylist(url);
// Returns: matchedTracks, unmatchedTracks, albumsToDownload
```
### Execute Import
**File**: `backend/src/routes/spotify.ts`
**Endpoint**: `POST /spotify/import`
**Handler**: Lines ~130-200
```typescript
// Starts async import job
const job = await spotifyImportService.executeImport(preview, userId, playlistName);
// Returns: jobId for status polling
```
### Retry Pending Track
**File**: `backend/src/routes/playlists.ts`
**Endpoint**: `POST /playlists/:id/pending/:trackId/retry`
**Handler**: Lines ~630-745
```typescript
// Non-blocking retry flow:
// 1. Search Soulseek (15s timeout)
// 2. Return immediately with success/failure
// 3. Download in background
// 4. Trigger library scan after download
```
## Core Import Logic
### spotifyImportService.executeImport()
**File**: `backend/src/services/spotifyImport.ts`
**Function**: Lines ~150-350
Key sections:
- **Lines ~180-220**: Download albums via Lidarr or Soulseek
- **Lines ~230-280**: Wait for downloads, handle failures
- **Lines ~290-350**: Create playlist, match tracks, store pending
### Soulseek Download Flow
**File**: `backend/src/services/soulseek.ts`
Key methods:
- `searchTrack()` - Lines ~150-250: Search with 15s timeout
- `downloadTrack()` - Lines ~300-400: Download single file with 180s timeout
- `searchAndDownloadBatch()` - Lines ~525-600: Parallel search, concurrent download
- `downloadBestMatch()` - Lines ~465-520: Download from pre-searched results
### Track Matching
**File**: `backend/src/services/spotifyImport.ts`
**Function**: `matchTrackToLibrary()` - Lines ~400-500
Matching strategies (in order):
1. Exact normalized title + artist first word
2. Stripped title (remove remaster/remix suffixes)
3. Contains search
4. Fuzzy artist + title
5. StartsWith search
6. Last resort fuzzy
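
The cascade idea can be sketched as an ordered list of predicates where the first hit wins. This is not the code in `spotifyImport.ts`: it covers only the non-fuzzy strategies (1, 2, 3, and 5), and `normalize`, `stripVersionSuffix`, `matchTrackSketch`, and the `LibraryTrack` shape are hypothetical.

```typescript
// Illustrative only; the real matchTrackToLibrary() also has fuzzy strategies.
interface LibraryTrack {
  id: string;
  title: string;
  artist: string;
}

function normalize(text: string): string {
  // ASCII-only normalization; non-Latin titles would need a gentler approach.
  return text.toLowerCase().replace(/[^a-z0-9 ]+/g, " ").replace(/\s+/g, " ").trim();
}

// Drops suffixes such as " - 2017 Remaster" or " (Remastered 2011)".
function stripVersionSuffix(title: string): string {
  return title.replace(/\s*(-|\(|\[)\s*[^()\[\]-]*\b(remaster(ed)?|remix)\b.*$/i, "").trim();
}

function matchTrackSketch(
  title: string,
  artist: string,
  library: LibraryTrack[]
): { track: LibraryTrack; strategy: string } | null {
  const wantTitle = normalize(title);
  const wantStripped = normalize(stripVersionSuffix(title));
  if (wantStripped === "") return null;
  const artistWord = normalize(artist).split(" ")[0];
  const sameArtist = library.filter((t) => normalize(t.artist).split(" ")[0] === artistWord);

  const strategies: Array<{ name: string; test: (t: LibraryTrack) => boolean }> = [
    { name: "exact", test: (t) => normalize(t.title) === wantTitle },
    { name: "stripped", test: (t) => normalize(stripVersionSuffix(t.title)) === wantStripped },
    { name: "contains", test: (t) => normalize(t.title).includes(wantStripped) },
    {
      name: "startsWith",
      test: (t) => normalize(t.title) !== "" && wantStripped.startsWith(normalize(t.title)),
    },
  ];
  for (const strategy of strategies) {
    const track = sameArtist.find(strategy.test);
    if (track) return { track, strategy: strategy.name };
  }
  return null;
}
```

Returning the strategy name alongside the track makes it easier to see in logs which rule produced a match.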
### Pending Track Reconciliation
**File**: `backend/src/services/spotifyImport.ts`
**Function**: `reconcilePendingTracks()` - Lines ~550-650
Called after library scan to match pending tracks to newly added files.
## Frontend Components
### Import Wizard
**File**: `frontend/app/import/spotify/page.tsx`
Key state:
- `step`: "url" | "preview" | "importing" | "complete"
- `preview`: PreviewResult from API
- `jobStatus`: Polling status during import
### Playlist Detail - Pending Tracks
**File**: `frontend/app/playlist/[id]/page.tsx`
Key handlers (Lines ~100-160):
- `handlePlayPreview()` - Fetches fresh Deezer URL, plays audio
- `handleRetryPendingTrack()` - Calls retry API, shows toast
- `handleRemovePendingTrack()` - Removes from playlist
Pending track rendering: Lines ~555-650
## Database Queries
### Get Playlist with Pending Tracks
```typescript
const playlist = await prisma.playlist.findUnique({
  where: { id: playlistId },
  include: {
    items: { include: { track: { include: { album: { include: { artist: true } } } } } },
    pendingTracks: { orderBy: { sort: 'asc' } }
  }
});
```
### Create Pending Track
```typescript
await prisma.playlistPendingTrack.create({
  data: {
    playlistId,
    spotifyArtist: track.artist,
    spotifyTitle: track.title,
    spotifyAlbum: resolvedAlbum,
    spotifyTrackId: track.spotifyId,
    deezerPreviewUrl: previewUrl,
    sort: index
  }
});
```
### Reconcile Pending Track (convert to real track)
```typescript
// Delete pending, add real track
await prisma.$transaction([
  prisma.playlistPendingTrack.delete({ where: { id: pendingId } }),
  prisma.playlistItem.create({
    data: { playlistId, trackId: matchedTrack.id, sort: pending.sort }
  })
]);
```
## Configuration Check
```typescript
const settings = await getSystemSettings();
// Key fields:
// - settings.downloadSource: "soulseek" | "lidarr"
// - settings.soulseekFallback: "none" | "failed" | "always"
// - settings.musicPath: where files are downloaded
// - settings.soulseekUsername / soulseekPassword
// - settings.lidarrUrl / lidarrApiKey
```
## Error Handling Patterns
### Soulseek Connection
```typescript
try {
  await soulseekService.ensureConnected();
} catch (err) {
  // Credentials not configured or connection failed
  return { success: false, error: "Soulseek connection failed" };
}
```
### Download Retry Logic
```typescript
const matchesToTry = allMatches.slice(0, MAX_DOWNLOAD_RETRIES); // 3 attempts
for (const match of matchesToTry) {
  const result = await this.downloadTrack(match, destPath);
  if (result.success) return { success: true, filePath: destPath };
  // Try next user on failure
}
return { success: false, error: "All attempts failed" };
```
## Logging
Session logging for debugging:
```typescript
import { sessionLog } from "../utils/playlistLogger";
sessionLog("SOULSEEK", "Message here"); // INFO level
sessionLog("SOULSEEK", "Error message", "ERROR");
sessionLog("SOULSEEK", "Warning", "WARN");
```
Job-specific logging:
```typescript
import { createPlaylistLogger } from "../utils/playlistLogger";
const logger = createPlaylistLogger(jobId);
logger.info("Message");
logger.error("Error");
logger.debug("Debug info");
```
+194
@@ -0,0 +1,194 @@
# Spotify Import Feature - Handoff Document
## Overview
The Spotify Import feature allows users to import playlists from Spotify into Lidify. It searches for matching tracks on Soulseek (and optionally Lidarr), downloads them, creates a local playlist, and matches downloaded tracks to the playlist.
## Current State
### What Works
1. **Spotify Playlist Parsing**: Fetches playlist metadata via Spotify embed API
2. **Soulseek Downloads**: Direct P2P downloads with retry logic (tries up to 3 different users)
3. **Parallel Processing**: Searches run in parallel, downloads limited to concurrency of 4
4. **Track Matching**: Multiple matching strategies (exact, fuzzy, contains, startsWith)
5. **Pending Track System**: Tracks that fail to download are stored as "pending" with:
- Deezer preview playback (30s samples)
- Manual retry button
- Remove button
6. **Retry Functionality**: Non-blocking retry - returns immediately, downloads in background
7. **Reconciliation**: After library scan, pending tracks are automatically matched to downloaded files
### What Needs Testing
1. **Lidarr Integration**: Download source can be set to "lidarr" but needs end-to-end testing
2. **Lidarr + Soulseek Fallback**: When `downloadSource: "lidarr"` and `soulseekFallback: "failed"`, should try Lidarr first then fall back to Soulseek
3. **Activity Panel Integration**: Downloads should show progress in the activity panel
4. **Edge Cases**: Various artist name formats, special characters, live recordings filtering
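
When testing the fallback combinations, it may help to write down the expected source order for each settings pair. The sketch below encodes one reading of the settings: the `"failed"` case follows item 2 above, while the meaning of `"always"` is not specified in this document, so treating it like `"failed"` is purely an assumption. `planDownloadSources` is a hypothetical name.

```typescript
type DownloadSource = "soulseek" | "lidarr";
type SoulseekFallback = "none" | "failed" | "always";

// Returns the sources to try, in order, for a single album.
function planDownloadSources(
  source: DownloadSource,
  fallback: SoulseekFallback
): DownloadSource[] {
  if (source === "soulseek") {
    // Soulseek-only mode: there is nothing to fall back to.
    return ["soulseek"];
  }
  // Lidarr first; Soulseek follows only when a fallback is configured.
  // Assumption: "always" is ordered the same way as "failed".
  return fallback === "none" ? ["lidarr"] : ["lidarr", "soulseek"];
}
```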
## Architecture
### Flow
```
1. User pastes Spotify playlist URL
2. Frontend calls POST /spotify/preview with URL
3. Backend fetches playlist via Spotify embed API
4. Backend searches MusicBrainz for album MBIDs
5. Preview returned to user showing matched/unmatched tracks
6. User confirms import
7. Frontend calls POST /spotify/import
8. Backend:
   a. For each album, either:
      - Sends to Lidarr (if enabled)
      - Downloads directly via Soulseek
   b. Waits for downloads to complete
   c. Runs library scan
   d. Matches tracks to playlist
   e. Creates pending entries for unmatched tracks
9. User sees playlist with matched tracks + failed/pending tracks
```
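
Step 1 depends on extracting a playlist ID from whatever the user pastes. How Lidify does this is not shown here; the sketch below is one way to do it, assuming the two common forms (`https://open.spotify.com/playlist/<id>` web links and `spotify:playlist:<id>` URIs). `parseSpotifyPlaylistId` is a hypothetical name.

```typescript
// Sketch only; the real previewPlaylist() may accept more URL variants.
function parseSpotifyPlaylistId(input: string): string | null {
  const trimmed = input.trim();
  // Web URL, optionally with a locale segment such as "/intl-de/" and a query string.
  const web = /^https?:\/\/open\.spotify\.com\/(?:[a-z-]+\/)?playlist\/([A-Za-z0-9]+)(?:[/?#].*)?$/.exec(
    trimmed
  );
  if (web) return web[1] ?? null;
  // Spotify URI form.
  const uri = /^spotify:playlist:([A-Za-z0-9]+)$/.exec(trimmed);
  return uri ? uri[1] ?? null : null;
}
```

Rejecting album and track links up front lets the import wizard show a clear error before any network call is made.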
### Key Files
#### Backend Routes
- `backend/src/routes/spotify.ts` - Main import endpoints
  - `POST /spotify/preview` - Parse and preview playlist
  - `POST /spotify/import` - Execute import job
  - `GET /spotify/import/:jobId/status` - Check job status
- `backend/src/routes/playlists.ts` - Playlist management + pending track handling
  - `GET /playlists/:id/pending/:trackId/preview` - Get fresh Deezer preview URL
  - `POST /playlists/:id/pending/:trackId/retry` - Retry downloading a failed track
  - `DELETE /playlists/:id/pending/:trackId` - Remove pending track from playlist
  - `POST /playlists/:id/pending/reconcile` - Manually trigger reconciliation
#### Backend Services
- `backend/src/services/spotifyImport.ts` - Core import logic
  - `previewPlaylist()` - Parse Spotify URL and match to MusicBrainz
  - `executeImport()` - Run the full import job
  - `reconcilePendingTracks()` - Match pending tracks to library after scan
- `backend/src/services/soulseek.ts` - Direct Soulseek P2P client
  - `searchTrack()` - Search for a track (15s timeout)
  - `downloadTrack()` - Download a single file
  - `searchAndDownload()` - Search + download with retry
  - `searchAndDownloadBatch()` - Parallel search, concurrent download
  - `downloadBestMatch()` - Download from pre-searched results (used by retry)
- `backend/src/services/lidarr.ts` - Lidarr integration
  - `searchAlbum()` - Search for album by MBID
  - `addAlbum()` - Add album to Lidarr for download
  - `getDownloadQueue()` - Check download progress
- `backend/src/services/deezer.ts` - Deezer API for previews
  - `getTrackPreview()` - Get 30s preview URL for a track
- `backend/src/services/musicbrainz.ts` - MusicBrainz lookups
  - `searchRecordingByISRC()` - Find recording by ISRC
  - `searchRecording()` - Search by artist/title
  - `getReleaseDetails()` - Get album details
#### Frontend
- `frontend/app/import/spotify/page.tsx` - Import wizard UI
- `frontend/app/playlist/[id]/page.tsx` - Playlist detail with pending track handling
- `frontend/lib/api.ts` - API client methods
#### Database Schema (relevant tables)
```prisma
model Playlist {
  id            String   @id @default(cuid())
  name          String
  userId        String
  isPublic      Boolean  @default(false)
  spotifyUrl    String?  // Original Spotify URL
  items         PlaylistItem[]
  pendingTracks PlaylistPendingTrack[]
}

model PlaylistPendingTrack {
  id               String   @id @default(cuid())
  playlistId       String
  spotifyArtist    String
  spotifyTitle     String
  spotifyAlbum     String
  spotifyTrackId   String?
  deezerPreviewUrl String?
  sort             Int
  createdAt        DateTime @default(now())
}
```
## Known Issues
### 1. Album Name Shows "Unknown Album"
**Problem**: Pending tracks sometimes show "Unknown Album" instead of the real album name.
**Cause**: Spotify embed API sometimes returns "Unknown Album" for track.album.
**Fix Applied**: Now uses resolved album name from `albumsToDownload` (MusicBrainz data) instead of Spotify embed data.
**File**: `backend/src/services/spotifyImport.ts` line ~280
### 2. Deezer Preview URLs Expire
**Problem**: Deezer preview URLs have timestamps and expire quickly.
**Fix Applied**: Added endpoint to fetch fresh preview URL on demand.
**File**: `backend/src/routes/playlists.ts` - `GET /:id/pending/:trackId/preview`
### 3. Retry Button Was Hanging
**Problem**: Clicking retry would hang for up to 180s (download timeout).
**Fix Applied**: Made retry non-blocking - search first (15s), return immediately, download in background.
**File**: `backend/src/routes/playlists.ts` - `POST /:id/pending/:trackId/retry`
### 4. Missing Files After Scan (Unresolved)
**Problem**: During testing, original downloaded files disappeared from disk, causing scan to remove 7 tracks.
**Status**: Cause unknown. It does not appear to be a code bug; the files seem to have been deleted externally. Monitor in future tests.
## Testing Checklist
### Soulseek-Only Mode (Current Focus)
- [x] Basic playlist import with Soulseek
- [x] Track matching after download
- [x] Pending track display for failed downloads
- [x] Deezer preview playback
- [x] Retry button functionality
- [x] Remove pending track
- [ ] Toast notifications for retry status
- [ ] Activity panel shows download progress
- [ ] Verify files persist after download
### Lidarr Mode (Needs Testing)
- [ ] Set `downloadSource: "lidarr"` in settings
- [ ] Import playlist - should send albums to Lidarr
- [ ] Lidarr downloads complete
- [ ] Library scan picks up Lidarr downloads
- [ ] Tracks match to playlist
### Lidarr + Soulseek Fallback (Needs Testing)
- [ ] Set `downloadSource: "lidarr"`, `soulseekFallback: "failed"`
- [ ] Import playlist with mix of albums (some in Lidarr, some not)
- [ ] Albums not in Lidarr should fall back to Soulseek
- [ ] Both sources' downloads get matched
## Configuration
System settings relevant to import (in `SystemSettings` table):
```
downloadSource: "soulseek" | "lidarr"
soulseekFallback: "none" | "failed" | "always"
soulseekUsername: string
soulseekPassword: string (encrypted)
lidarrEnabled: boolean
lidarrUrl: string
lidarrApiKey: string (encrypted)
musicPath: string (e.g., "C:/Users/kevin/Music")
```
## Logs
Import logs are written to: `docs/logs/playlists/import_<jobId>_<timestamp>.log`
Session log for Soulseek activity: `docs/logs/playlists/session.log`
## Next Steps
1. Run fresh import test with Soulseek
2. Verify files persist and scan works correctly
3. Test Lidarr-only mode
4. Test Lidarr + Soulseek fallback
5. Add activity panel integration for download progress
6. Consider adding notification when background retry completes
@@ -0,0 +1,174 @@
# Audio Analysis - Enhanced Mode (MusiCNN)
## Overview
Enhanced mode uses Essentia's TensorFlow integration with MusiCNN (musically motivated convolutional neural network) models to perform ML-based mood and audio classification. It is intended to give more accurate mood detection than the heuristic-based Standard mode.
## Architecture
```
               ┌───────────────────┐
               │    Audio File     │
               │   (16kHz mono)    │
               └─────────┬─────────┘
               ┌─────────▼─────────┐
               │ TensorflowPredict │
               │      MusiCNN      │
               │   (Embeddings)    │
               └─────────┬─────────┘
        ┌────────────────┼────────────────┐
        │                │                │
┌───────▼──────┐ ┌───────▼──────┐ ┌───────▼──────┐
│  Mood Happy  │ │   Mood Sad   │ │ Danceability │
│  TensorFlow  │ │  TensorFlow  │ │  TensorFlow  │
│  Predict2D   │ │  Predict2D   │ │  Predict2D   │
└───────┬──────┘ └───────┬──────┘ └───────┬──────┘
        │                │                │
        └────────────────┼────────────────┘
                ┌────────▼────────┐
                │ Derived Scores  │
                │ Valence/Arousal │
                └─────────────────┘
```
## Key Components
### 1. Base Model: MusiCNN
- **Model**: `msd-musicnn-1.pb` (~3MB)
- **Source**: [Essentia Model Zoo](https://essentia.upf.edu/models/autotagging/msd/)
- **Function**: Extracts 200-dimensional embeddings from audio
- **Algorithm**: `TensorflowPredictMusiCNN`
### 2. Classification Heads
Each classification head takes the MusiCNN embeddings and outputs probabilities:
| Model | File | Output |
|-------|------|--------|
| Mood Happy | `mood_happy-msd-musicnn-1.pb` | P(happy) |
| Mood Sad | `mood_sad-msd-musicnn-1.pb` | P(sad) |
| Mood Relaxed | `mood_relaxed-msd-musicnn-1.pb` | P(relaxed) |
| Mood Aggressive | `mood_aggressive-msd-musicnn-1.pb` | P(aggressive) |
| Mood Party | `mood_party-msd-musicnn-1.pb` | P(party) |
| Mood Acoustic | `mood_acoustic-msd-musicnn-1.pb` | P(acoustic) |
| Mood Electronic | `mood_electronic-msd-musicnn-1.pb` | P(electronic) |
| Danceability | `danceability-msd-musicnn-1.pb` | P(danceable) |
| Voice/Instrumental | `voice_instrumental-msd-musicnn-1.pb` | P(instrumental) |
### 3. Derived Features
Valence and Arousal are derived from the mood predictions:
```python
# Valence = emotional positivity
valence = happy * 0.5 + party * 0.3 + (1 - sad) * 0.2
# Arousal = energy level
arousal = (
    aggressive * 0.35 + party * 0.25 + electronic * 0.2
    + (1 - relaxed) * 0.1 + (1 - acoustic) * 0.1
)
```
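The derivation above can be exercised in isolation. The following is a minimal sketch, assuming the nine head outputs have already been averaged into a plain dictionary of probabilities; the function name `derive_valence_arousal` and the dictionary keys are illustrative and not taken from `analyzer.py`.

```python
def derive_valence_arousal(moods: dict) -> tuple:
    """Combine mood probabilities (each 0-1) into (valence, arousal)."""
    # Valence = emotional positivity; weights sum to 1.0
    valence = (
        moods["happy"] * 0.5
        + moods["party"] * 0.3
        + (1 - moods["sad"]) * 0.2
    )
    # Arousal = energy level; weights also sum to 1.0
    arousal = (
        moods["aggressive"] * 0.35
        + moods["party"] * 0.25
        + moods["electronic"] * 0.2
        + (1 - moods["relaxed"]) * 0.1
        + (1 - moods["acoustic"]) * 0.1
    )
    return round(valence, 3), round(arousal, 3)


example = {
    "happy": 0.8, "party": 0.6, "sad": 0.1, "aggressive": 0.2,
    "electronic": 0.5, "relaxed": 0.3, "acoustic": 0.2,
}
valence, arousal = derive_valence_arousal(example)
print(valence, arousal)
```

Because each weight set sums to 1.0 and every input is a probability, both results stay within 0-1 without extra clamping.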
## Docker Configuration
### Dockerfile
```dockerfile
FROM ubuntu:20.04
# Install essentia-tensorflow (includes TensorFlow + MusiCNN support)
RUN pip3 install --no-cache-dir essentia-tensorflow
# Download MusiCNN models
RUN curl -L -o /app/models/msd-musicnn-1.pb \
    "https://essentia.upf.edu/models/autotagging/msd/msd-musicnn-1.pb"
# Classification heads
RUN curl -L -o /app/models/mood_happy-msd-musicnn-1.pb \
    "https://essentia.upf.edu/models/classification-heads/mood_happy/mood_happy-msd-musicnn-1.pb"
# ... (other models)
```
### Requirements
- **Ubuntu 20.04** (for Python 3.8 compatibility)
- **essentia-tensorflow** pip package
- **~10MB** for all models combined
## Usage in Code
```python
import numpy as np
import essentia.standard as es
from essentia.standard import TensorflowPredictMusiCNN, TensorflowPredict2D

# Load base embedding model
musicnn = TensorflowPredictMusiCNN(
    graphFilename='/app/models/msd-musicnn-1.pb',
    output="model/dense/BiasAdd"  # Embedding output layer
)
# Load classification head
mood_happy = TensorflowPredict2D(
    graphFilename='/app/models/mood_happy-msd-musicnn-1.pb',
    output="model/Softmax"
)
# Process audio (`path` is the audio file to analyze; resampled to 16 kHz mono)
audio = es.MonoLoader(filename=path, sampleRate=16000)()
embeddings = musicnn(audio)  # Shape: [frames, 200]
predictions = mood_happy(embeddings)  # Shape: [frames, 2]
# Column 1 is treated as the positive class here; confirm the class order
# in the model's metadata JSON before relying on it
happy_score = float(np.mean(predictions[:, 1]))  # Average over frames
```
## Output Fields
Enhanced mode produces these additional fields:
| Field | Type | Range | Description |
|-------|------|-------|-------------|
| moodHappy | float | 0-1 | ML probability of happy mood |
| moodSad | float | 0-1 | ML probability of sad mood |
| moodRelaxed | float | 0-1 | ML probability of relaxed mood |
| moodAggressive | float | 0-1 | ML probability of aggressive mood |
| moodParty | float | 0-1 | ML probability of party mood |
| moodAcoustic | float | 0-1 | ML probability of acoustic sound |
| moodElectronic | float | 0-1 | ML probability of electronic sound |
| danceabilityMl | float | 0-1 | ML danceability score |
| valence | float | 0-1 | Derived emotional positivity |
| arousal | float | 0-1 | Derived energy level |
| acousticness | float | 0-1 | From moodAcoustic |
| instrumentalness | float | 0-1 | ML voice/instrumental detection |
## Comparison: Standard vs Enhanced
| Feature | Standard Mode | Enhanced Mode |
|---------|---------------|---------------|
| Mood Detection | Heuristic (key/BPM/energy) | ML (MusiCNN) |
| Accuracy | Approximate | Research-grade |
| Speed | Fast (~100ms) | Moderate (~500ms) |
| Dependencies | Essentia core | Essentia + TensorFlow |
| Model Size | 0 | ~10MB |
| Python Version | Any | 3.7-3.9 (for pip) |
## Fallback Behavior
If Enhanced mode fails to initialize (missing models, TensorFlow errors), the analyzer automatically falls back to Standard mode:
```python
if self.enhanced_mode and self.musicnn_model:
    ml_features = self._extract_ml_features(audio_16k)
    result.update(ml_features)
else:
    self._apply_standard_estimates(result, scale, bpm)
```
## References
- [Essentia TensorFlow Documentation](https://essentia.upf.edu/machine_learning.html)
- [MusiCNN Paper](https://arxiv.org/abs/1711.02520)
- [Essentia Model Zoo](https://essentia.upf.edu/models/)
@@ -0,0 +1,443 @@
# Audio Analysis: Standard Mode (Heuristic Approach)
## Overview
The Lidify audio analyzer has two modes:
- **Enhanced Mode**: Uses TensorFlow ML models for accurate mood/valence/arousal predictions
- **Standard Mode**: Uses signal processing heuristics when ML models aren't available
This document covers the **Standard Mode** implementation for code review.
---
## Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│ Docker Container │
│ lidify_audio_analyzer │
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ │
│ │ Redis │◄───│ Worker │───►│ PostgreSQL │ │
│ │ Job Queue │ │ Loop │ │ Track Table │ │
│ └─────────────┘ └──────┬──────┘ └─────────────────────┘ │
│ │ │
│ ┌──────▼──────┐ │
│ │ AudioAnalyzer│ │
│ │ Class │ │
│ └──────┬──────┘ │
│ │ │
│ ┌────────────────┼────────────────┐ │
│ ▼ ▼ ▼ │
│ ┌───────────────┐ ┌─────────────┐ ┌──────────────────┐ │
│ │ Basic Features│ │ Spectral │ │ Heuristic │ │
│ │ (BPM, Key) │ │ Analysis │ │ Mood Estimation │ │
│ └───────────────┘ └─────────────┘ └──────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
```
---
## File Structure
```
services/audio-analyzer/
├── analyzer.py # Main analyzer code (870 lines)
├── requirements.txt # Python dependencies
└── Dockerfile # Container build configuration
```
---
## Key Classes
### 1. `AudioAnalyzer` (Lines 130-660)
Main analysis class with two modes:
```python
class AudioAnalyzer:
    def __init__(self):
        self.enhanced_mode = False  # Falls back to Standard if ML unavailable
        self._init_essentia()       # Initialize signal processing algorithms
        self._load_ml_models()      # Attempt to load ML models
```
### 2. `AnalysisWorker` (Lines 663-847)
Redis queue worker that:
1. Polls for pending tracks from `audio:analysis:queue`
2. Falls back to scanning `Track` table for `analysisStatus = 'pending'`
3. Processes tracks and updates database
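The queue-then-database order in the steps above can be sketched without any external services. In this illustration a `deque` stands in for the Redis list `audio:analysis:queue` and a callable stands in for the `Track` table query; `next_track_id` is a hypothetical helper, not a function from `analyzer.py`.

```python
from collections import deque
from typing import Callable, List, Optional


def next_track_id(
    queue: deque,
    pending_ids_from_db: Callable[[], List[str]],
) -> Optional[str]:
    """Pick the next track: queued jobs first, then pending rows in the database."""
    if queue:
        # Step 1: a job is waiting in the queue
        return queue.popleft()
    # Step 2: queue is empty, so fall back to tracks marked 'pending'
    pending = pending_ids_from_db()
    return pending[0] if pending else None


jobs = deque(["track-1"])
first = next_track_id(jobs, lambda: ["track-9"])   # comes from the queue
second = next_track_id(jobs, lambda: ["track-9"])  # comes from the fallback scan
idle = next_track_id(jobs, lambda: [])             # nothing left to do
print(first, second, idle)
```

Keeping the selection logic separate from the Redis and PostgreSQL clients makes the fallback order easy to test on its own.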
---
## Standard Mode: Heuristic Calculations
### Input Features (Always Extracted)
| Feature | Essentia Algorithm | Description |
|---------|-------------------|-------------|
| BPM | `RhythmExtractor2013` | Beats per minute |
| Key/Scale | `KeyExtractor` | Musical key (C, D#, etc.) and mode (major/minor) |
| Loudness | `Loudness` | Perceived loudness in dB |
| Dynamic Range | `DynamicComplexity` | Difference between quiet and loud parts |
| Danceability | `Danceability` | How suitable for dancing (0-1) |
| RMS Energy | `RMS` | Root Mean Square amplitude per frame |
| Spectral Centroid | `Centroid` | "Brightness" - center of spectral mass |
| Spectral Flatness | `FlatnessDB` | Noise-like vs tonal content |
| Zero-Crossing Rate | `ZeroCrossingRate` | Rate of signal sign changes |
### Frame-Based Processing (Lines 328-365)
```python
frame_size = 2048
hop_size = 1024
for i in range(0, len(audio_44k) - frame_size, hop_size):
    frame = audio_44k[i:i + frame_size]
    windowed = self.windowing(frame)
    spectrum = self.spectrum(windowed)
    rms_values.append(self.rms(frame))
    zcr_values.append(self.zcr(frame))
    spectral_centroid_values.append(self.spectral_centroid(spectrum))
    spectral_flatness_values.append(self.spectral_flatness(spectrum))
```
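To see what the framing loop produces, here is a pure-Python sketch of the same 2048/1024 framing that computes only RMS and zero-crossing rate. It is a stand-in for the Essentia `RMS` and `ZeroCrossingRate` algorithms, not their implementation; `frame_features` and the test tone are illustrative.

```python
import math


def frame_features(samples, frame_size=2048, hop_size=1024):
    """Return per-frame RMS and zero-crossing rate for a mono signal."""
    rms_values, zcr_values = [], []
    # Same loop bounds as the analyzer excerpt: 50% overlapping frames
    for start in range(0, len(samples) - frame_size, hop_size):
        frame = samples[start:start + frame_size]
        # RMS: square root of the mean squared amplitude
        rms_values.append(math.sqrt(sum(s * s for s in frame) / frame_size))
        # ZCR: sign changes between neighbours, normalized by frame length
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcr_values.append(crossings / frame_size)
    return rms_values, zcr_values


# One second of a 440 Hz sine at amplitude 0.5, sampled at 44.1 kHz
sample_rate = 44100
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate)
        for n in range(sample_rate)]
rms_values, zcr_values = frame_features(tone)
print(len(rms_values), rms_values[0], zcr_values[0])
```

For this tone each frame's RMS is close to 0.5 / √2 ≈ 0.354 and the ZCR is close to 2 × 440 / 44100 ≈ 0.02, which matches the "low ZCR = sustained tone" reading used later in the heuristics.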
---
## Heuristic Formulas
### Energy (Lines 347-353)
**Problem Solved**: The previous implementation used `es.Energy()`, which returns the sum of squared samples (a very large number), and normalized it incorrectly as `energy / 100`.
**Current Implementation**:
```python
avg_rms = np.mean(rms_values)
energy = min(1.0, avg_rms * 3) # RMS typically 0.0-0.5, scale to 0-1
```
---
### Valence (Happiness/Positivity) - Lines 495-518
**Formula**:
```
valence = key_valence * 0.40
+ bpm_valence * 0.25
+ brightness_valence * 0.20
+ energy * 0.15
```
**Components**:
| Component | Weight | Calculation | Rationale |
|-----------|--------|-------------|-----------|
| Key Valence | 40% | Major = 0.65, Minor = 0.35 | Major keys sound happier |
| BPM Valence | 25% | ≥120 BPM: rises from 0.5 to 0.8 (cap reached at 180); ≤80 BPM: falls from 0.5 to 0.2 (floor reached at 50); otherwise 0.5 | Fast tempo = upbeat |
| Brightness | 20% | `spectral_centroid * 1.5` | Bright sounds feel positive |
| Energy | 15% | RMS energy (0-1) | Loud = energetic/positive |
**Code**:
```python
# Key contribution
key_valence = 0.65 if scale == 'major' else 0.35
# BPM contribution
if bpm >= 120:
    bpm_valence = min(0.8, 0.5 + (bpm - 120) / 200)
elif bpm <= 80:
    bpm_valence = max(0.2, 0.5 - (80 - bpm) / 100)
else:
    bpm_valence = 0.5
# Brightness contribution
brightness_valence = min(1.0, spectral_centroid * 1.5)
# Final weighted sum
result['valence'] = round(
    key_valence * 0.4 +
    bpm_valence * 0.25 +
    brightness_valence * 0.2 +
    energy * 0.15,
    3
)
```
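As a sanity check on the weights, the sketch below wraps the same formula in a standalone function and evaluates it at a typical point and at both extremes. The name `heuristic_valence` is illustrative, and the sketch assumes `spectral_centroid` and `energy` are already normalized to 0-1.

```python
def heuristic_valence(scale: str, bpm: float,
                      spectral_centroid: float, energy: float) -> float:
    """Standalone version of the Standard-mode valence formula."""
    key_valence = 0.65 if scale == "major" else 0.35
    if bpm >= 120:
        bpm_valence = min(0.8, 0.5 + (bpm - 120) / 200)
    elif bpm <= 80:
        bpm_valence = max(0.2, 0.5 - (80 - bpm) / 100)
    else:
        bpm_valence = 0.5
    brightness_valence = min(1.0, spectral_centroid * 1.5)
    return round(
        key_valence * 0.4
        + bpm_valence * 0.25
        + brightness_valence * 0.2
        + energy * 0.15,
        3,
    )


typical = heuristic_valence("major", 140, 0.4, 0.6)
lowest = heuristic_valence("minor", 40, 0.0, 0.0)
highest = heuristic_valence("major", 200, 1.0, 1.0)
print(typical, lowest, highest)
```

Under these assumptions the formula can only produce values between 0.19 and 0.81, so any threshold applied to Standard-mode valence operates on that narrower band rather than the full 0-1 range.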
---
### Arousal (Energy/Intensity) - Lines 520-543
**Formula**:
```
arousal = bpm_arousal * 0.35
+ energy_arousal * 0.35
+ brightness_arousal * 0.15
+ compression_arousal * 0.15
```
**Components**:
| Component | Weight | Calculation | Rationale |
|-----------|--------|-------------|-----------|
| BPM Arousal | 35% | `(bpm - 60) / 140`, clamped to 0.1-0.9 | Fast = high energy |
| Energy | 35% | RMS energy (0-1) | Loud = intense |
| Brightness | 15% | `spectral_centroid * 1.2` | Bright = energetic |
| Compression | 15% | `1 - (dynamic_range / 20)` | Compressed = intense/modern |
**Code**:
```python
# BPM contribution: (bpm - 60) / 140, clamped to 0.1-0.9
# (the clamp is reached at about 74 BPM and 186 BPM)
bpm_arousal = min(0.9, max(0.1, (bpm - 60) / 140))
# Energy is direct intensity indicator
energy_arousal = energy
# Low dynamic range = compressed = more intense
compression_arousal = max(0, min(1.0, 1 - (dynamic_range / 20)))
# Brightness adds perceived energy
brightness_arousal = min(1.0, spectral_centroid * 1.2)
result['arousal'] = round(
    bpm_arousal * 0.35 +
    energy_arousal * 0.35 +
    brightness_arousal * 0.15 +
    compression_arousal * 0.15,
    3
)
```
---
### Instrumentalness - Lines 545-563
**Approach**: Estimate likelihood of vocals vs instrumental based on spectral characteristics.
**Formula**:
```
instrumentalness = flatness_normalized * 0.6 + zcr_instrumental * 0.4
```
**Components**:
| Component | Weight | Calculation | Rationale |
|-----------|--------|-------------|-----------|
| Spectral Flatness | 60% | `(flatness + 40) / 40` | Noise-like (0dB) = instrumental; Tonal (-60dB) = vocals |
| ZCR Pattern | 40% | Low (<0.05) = 0.7; High (>0.15) = 0.4 | Sustained tones = instrumental |
**Code**:
```python
# Spectral flatness: -40dB to 0dB → 0 to 1
flatness_normalized = min(1.0, max(0, (spectral_flatness + 40) / 40))
# ZCR patterns
if zcr < 0.05:
    zcr_instrumental = 0.7  # Sustained instrumental tones
elif zcr > 0.15:
    zcr_instrumental = 0.4  # Could be speech or percussion
else:
    zcr_instrumental = 0.5  # Uncertain
result['instrumentalness'] = round(
    flatness_normalized * 0.6 + zcr_instrumental * 0.4,
    3
)
```
---
### Acousticness - Lines 565-568
**Simple heuristic**: High dynamic range suggests acoustic recording (natural dynamics preserved).
```python
result['acousticness'] = round(min(1.0, dynamic_range / 12), 3)
```
| Dynamic Range | Acousticness | Interpretation |
|---------------|--------------|----------------|
| < 6 dB | < 0.5 | Heavily compressed (electronic/pop) |
| 6-12 dB | 0.5-1.0 | Moderate (mixed) |
| > 12 dB | 1.0 | High dynamic range (acoustic/classical) |
---
### Speechiness - Lines 570-575
**Approach**: Speech has characteristic ZCR + spectral centroid patterns.
```python
if zcr > 0.08 and zcr < 0.2 and spectral_centroid > 0.1 and spectral_centroid < 0.4:
    result['speechiness'] = round(min(0.5, zcr * 3), 3)
else:
    result['speechiness'] = 0.1
```
| Condition | Result |
|-----------|--------|
| ZCR 0.08-0.2 AND centroid 0.1-0.4 | Speech-like (up to 0.5) |
| Outside range | Low speechiness (0.1) |
---
## Mood Tag Generation (Lines 581-660)
Tags are derived from computed features:
| Condition | Tags Added |
|-----------|------------|
| `arousal >= 0.7` | energetic, upbeat |
| `arousal <= 0.3` | calm, peaceful |
| `valence >= 0.7` | happy, uplifting |
| `valence <= 0.3` | sad, melancholic |
| `danceability >= 0.7` | dance, groovy |
| `bpm >= 140` | fast |
| `bpm <= 80` | slow |
| `keyScale == 'minor'` (and not happy) | moody |
| `arousal >= 0.7 AND bpm >= 120` | workout |
| `arousal <= 0.4 AND valence <= 0.4` | atmospheric |
| `arousal <= 0.3 AND bpm <= 90` | chill |
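The rule table above translates directly into code. This sketch applies the rules in table order; `generate_mood_tags` is a hypothetical name, and reading "(and not happy)" as "the `happy` tag was not added" is an assumption about the original logic.

```python
def generate_mood_tags(f: dict) -> list:
    """Derive mood tags from analysis features using the threshold table."""
    tags = []
    if f["arousal"] >= 0.7:
        tags += ["energetic", "upbeat"]
    if f["arousal"] <= 0.3:
        tags += ["calm", "peaceful"]
    if f["valence"] >= 0.7:
        tags += ["happy", "uplifting"]
    if f["valence"] <= 0.3:
        tags += ["sad", "melancholic"]
    if f["danceability"] >= 0.7:
        tags += ["dance", "groovy"]
    if f["bpm"] >= 140:
        tags.append("fast")
    if f["bpm"] <= 80:
        tags.append("slow")
    # Assumption: "not happy" means the happy tag was not set above
    if f["keyScale"] == "minor" and "happy" not in tags:
        tags.append("moody")
    if f["arousal"] >= 0.7 and f["bpm"] >= 120:
        tags.append("workout")
    if f["arousal"] <= 0.4 and f["valence"] <= 0.4:
        tags.append("atmospheric")
    if f["arousal"] <= 0.3 and f["bpm"] <= 90:
        tags.append("chill")
    return tags


calm_track = {"arousal": 0.25, "valence": 0.5, "danceability": 0.3,
              "bpm": 75, "keyScale": "major"}
gym_track = {"arousal": 0.8, "valence": 0.75, "danceability": 0.8,
             "bpm": 145, "keyScale": "major"}
calm_tags = generate_mood_tags(calm_track)
gym_tags = generate_mood_tags(gym_track)
print(calm_tags)
print(gym_tags)
```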
---
## Output Schema
```typescript
interface AnalysisResult {
    // Basic features
    bpm: number;              // 60-200 typical
    beatsCount: number;       // Total beat count
    key: string;              // "C", "D#", etc.
    keyScale: string;         // "major" or "minor"
    keyStrength: number;      // 0-1 confidence
    // Energy metrics
    energy: number;           // 0-1 (RMS-based)
    loudness: number;         // dB
    dynamicRange: number;     // dB
    // Heuristic estimates
    danceability: number;     // 0-1
    valence: number;          // 0-1 (happiness)
    arousal: number;          // 0-1 (energy)
    instrumentalness: number; // 0-1
    acousticness: number;     // 0-1
    speechiness: number;      // 0-1
    // Derived
    moodTags: string[];       // ["calm", "peaceful", "chill"]
    analysisMode: "standard"; // Always "standard" for this mode
}
```
---
## Database Update (Lines 766-822)
All features are persisted to the `Track` table:
```sql
UPDATE "Track"
SET
    bpm = %s,
    "beatsCount" = %s,
    key = %s,
    "keyScale" = %s,
    "keyStrength" = %s,
    energy = %s,
    loudness = %s,
    "dynamicRange" = %s,
    danceability = %s,
    valence = %s,
    arousal = %s,
    instrumentalness = %s,
    acousticness = %s,
    speechiness = %s,
    "moodTags" = %s,
    "analysisMode" = 'standard',
    "analysisStatus" = 'completed',
    "analysisVersion" = %s,
    "analyzedAt" = %s
WHERE id = %s
```
---
## Known Limitations
### Standard Mode vs ML Models
| Aspect | Standard Mode | Enhanced Mode (ML) |
|--------|--------------|-------------------|
| Valence accuracy | ~60% correlation | ~85% correlation |
| Arousal accuracy | ~65% correlation | ~88% correlation |
| Mood detection | Rule-based | Neural network |
| Processing speed | Fast (~1-2 sec) | Slower (~5-10 sec) |
| Dependencies | Essentia only | Essentia + TensorFlow |
### Edge Cases
1. **Ambient music**: Low BPM detection reliability
2. **Classical**: Variable tempo causes BPM averaging issues
3. **Spoken word**: May be misclassified as low-energy music
4. **Electronic/EDM**: Compression detection may overestimate arousal
---
## Dependencies
```
# requirements.txt
essentia==2.1b6.dev1110
essentia-tensorflow==2.1b6.dev1110
numpy>=1.21.0,<2.0.0
tensorflow==2.15.0
redis>=4.5.0
psycopg2-binary>=2.9.0
```
---
## Testing
Run single file analysis:
```bash
docker exec lidify_audio_analyzer python3 analyzer.py --test /music/path/to/song.mp3
```
Example output:
```json
{
  "bpm": 128.5,
  "beatsCount": 256,
  "key": "C",
  "keyScale": "minor",
  "keyStrength": 0.723,
  "energy": 0.65,
  "loudness": -8.2,
  "dynamicRange": 7.5,
  "danceability": 0.72,
  "valence": 0.42,
  "arousal": 0.68,
  "instrumentalness": 0.35,
  "acousticness": 0.625,
  "speechiness": 0.1,
  "moodTags": ["energetic", "upbeat", "moody", "dance"],
  "analysisMode": "standard"
}
```
---
## Related Files
- `services/audio-analyzer/Dockerfile` - Container build
- `backend/src/services/vibeMatching.ts` - Uses these features for song matching
- `prisma/schema.prisma` - Track table schema with analysis columns
@@ -0,0 +1,107 @@
# Curated Vibe Mixes Implementation
## Overview
This update adds **19 new curated vibe mixes** and a **Mood-on-Demand** feature that allows users to generate custom mixes based on audio features.
## Bug Fix
Fixed the `genres` field bug - the Album model uses `genres` (JSON array) not `genre` (string). Added a helper function `findTracksByGenrePatterns()` that properly queries:
1. Track's `lastfmTags` and `essentiaGenres` (native String[] fields)
2. Falls back to filtering `album.genres` JSON array in application code
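The two-step lookup can be illustrated on in-memory data. The real helper lives in the TypeScript backend and queries the database; this Python sketch only mirrors the order of operations, and both the substring matching and the "fall back only when the tag lookup finds nothing" behavior are assumptions.

```python
import json


def find_tracks_by_genre_patterns(tracks: list, patterns: list) -> list:
    """Match genre patterns against track tags first, album genres second."""
    lowered = [p.lower() for p in patterns]

    def matches(values) -> bool:
        return any(p in str(v).lower() for v in values for p in lowered)

    # Step 1: native string-array fields on the track itself
    by_tags = [
        t for t in tracks
        if matches(t.get("lastfmTags", []) + t.get("essentiaGenres", []))
    ]
    if by_tags:
        return by_tags

    # Step 2: album.genres is JSON, so decode it and filter in application code
    def album_genres(track) -> list:
        raw = track.get("album", {}).get("genres") or []
        return json.loads(raw) if isinstance(raw, str) else raw

    return [t for t in tracks if matches(album_genres(t))]


library = [
    {"title": "A", "lastfmTags": ["Indie Rock"], "essentiaGenres": [],
     "album": {"genres": ["pop"]}},
    {"title": "B", "lastfmTags": [], "essentiaGenres": [],
     "album": {"genres": '["Jazz", "Bebop"]'}},
]
rock = find_tracks_by_genre_patterns(library, ["rock"])
jazz = find_tracks_by_genre_patterns(library, ["jazz"])
print([t["title"] for t in rock], [t["title"] for t in jazz])
```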
## New Daily Vibe Mixes (10 tracks each)
| Mix Name | Description | Key Audio Features |
|----------|-------------|-------------------|
| **Sad Girl Sundays** | Melancholic introspection | valence < 0.35, minor key, arousal < 0.4 |
| **Main Character Energy** | You're the protagonist ✨ | valence > 0.55, energy > 0.55, danceability > 0.5 |
| **Villain Era** | Dark & empowering 😈 | minor key, energy > 0.65, aggressive tags |
| **3AM Thoughts** | Late night overthinking 🌙 | arousal < 0.35, energy < 0.45, valence < 0.45 |
| **Hot Girl Walk** | Confident cardio 💅 | danceability > 0.65, BPM 95-135, energy > 0.55 |
| **Rage Cleaning** | Aggressive productivity 🔥 | energy > 0.75, arousal > 0.65, BPM > 125 |
| **Golden Hour** | Warm sunset vibes 🌅 | valence > 0.45, acousticness > 0.35, energy 0.25-0.65 |
| **Shower Karaoke** | Belters you can't help sing 🚿 | instrumentalness < 0.35, energy > 0.55, valence > 0.45 |
| **In My Feelings** | Let it all out 💔 | valence < 0.4, arousal < 0.55, acousticness > 0.25 |
| **Midnight Drive** | Late night cruising 🚗 | energy 0.35-0.65, arousal 0.25-0.55, BPM 85-125 |
| **Coffee Shop Vibes** | Cozy background ☕ | acousticness > 0.4, energy 0.15-0.55 |
| **Romanticize Your Life** | Aesthetic moments 🎬 | valence 0.35-0.75, arousal 0.25-0.65, acousticness > 0.25 |
| **That Girl Era** | Self-improvement mode 💪 | valence > 0.55, energy > 0.45, danceability > 0.45 |
| **Unhinged** | Embrace the chaos 🎪 | Extreme features (high or low everything) |
## New Weekly Curated Mixes (20 tracks each)
| Mix Name | Description | Algorithm |
|----------|-------------|-----------|
| **Deep Cuts** | Hidden gems 💎 | Tracks with zero or few plays |
| **Key Journey** | Harmonic progression 🎹 | Ordered by circle of fifths |
| **Tempo Flow** | Energy arc 📈 | slow → fast → slow BPM journey |
| **Vocal Detox** | Instrumental escape 🧘 | instrumentalness > 0.75 |
| **Minor Key Mondays** | All minor key bangers 🖤 | keyScale = 'minor', energy > 0.45 |
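Two of the weekly mixes are defined by ordering rather than filtering. The sketch below shows one possible way to build each ordering; the sharp-only key spelling, the function names, and the interleaving strategy for the tempo arc are assumptions, not the logic in `programmaticPlaylists.ts`.

```python
# Keys arranged clockwise around the circle of fifths, spelled with sharps
CIRCLE_OF_FIFTHS = ["C", "G", "D", "A", "E", "B",
                    "F#", "C#", "G#", "D#", "A#", "F"]


def key_journey(tracks: list) -> list:
    """Order tracks so neighbouring keys are a fifth apart where possible."""
    position = {key: i for i, key in enumerate(CIRCLE_OF_FIFTHS)}
    known = [t for t in tracks if t.get("key") in position]
    return sorted(known, key=lambda t: position[t["key"]])


def tempo_flow(tracks: list) -> list:
    """Arrange tracks as a slow -> fast -> slow BPM arc."""
    by_bpm = sorted(tracks, key=lambda t: t["bpm"])
    rising = by_bpm[0::2]          # every other track, climbing in tempo
    falling = by_bpm[1::2][::-1]   # the rest, descending back down
    return rising + falling


keys = [{"key": k} for k in ["F", "C", "D", "G"]]
tempos = [{"bpm": b} for b in [150, 70, 110, 90, 130]]
key_order = [t["key"] for t in key_journey(keys)]
bpm_order = [t["bpm"] for t in tempo_flow(tempos)]
print(key_order, bpm_order)
```

Splitting the sorted list into alternating halves keeps both the climb and the descent smooth, since each half skips at most one tempo step.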
## Mood-on-Demand Feature
### Backend Endpoints
- `POST /api/mixes/mood` - Generate a custom mix based on audio parameters
- `GET /api/mixes/mood/presets` - Get available mood presets for the UI
### Preset Moods (12 total)
1. 😊 Happy & Upbeat
2. 😢 Melancholic
3. 😌 Chill & Relaxed
4. ⚡ High Energy
5. 🎯 Focus Mode
6. 💃 Dance Party
7. 🎸 Acoustic Vibes
8. 🖤 Dark & Moody
9. 💕 Romantic
10. 💪 Workout Beast
11. 😴 Sleep & Unwind
12. 👑 Confidence Boost
### Custom Mix Builder
Users can adjust sliders for:
- Happiness (valence)
- Energy
- Danceability
- Tempo (BPM)
## Frontend Changes
### New Component: `MoodMixer.tsx`
A beautiful Spotify-esque modal with:
- Gradient preset cards with emojis
- Smooth animations (Framer Motion)
- Custom range slider controls
- Dark theme matching the app aesthetic
### Homepage Integration
Added "Mood Mixer" button next to the "Refresh" button in the "Made For You" section.
## Files Modified
### Backend
- `backend/src/services/programmaticPlaylists.ts` - Added helper function, fixed 12 genre bugs, added 19 new mix generators
- `backend/src/routes/mixes.ts` - Added mood endpoints and presets
### Frontend
- `frontend/lib/api.ts` - Added types and API methods for mood mixing
- `frontend/app/page.tsx` - Integrated MoodMixer modal
- `frontend/components/MoodMixer.tsx` - New component (created)
## Technical Notes
- All mixes use Essentia audio analysis data (valence, energy, danceability, BPM, key, etc.)
- Fallback to Last.fm tags when audio analysis is insufficient
- Daily mixes: 10 tracks, refreshed daily
- Weekly mixes: 20 tracks, for longer listening sessions
- Mix generation is cached in Redis for performance
@@ -0,0 +1,798 @@
# Modified Files for Review
> **Last Updated:** December 16, 2025
> **Features:** Spotify Import + UI Overhaul (Activity Panel, Carousels, Notifications, Playlist/Mix/Discover Redesign, Settings Page Redesign)
## Overview
This document tracks all files created or modified as part of:
1. **Spotify Import Feature** - Import Spotify playlists, match tracks, download albums
2. **UI Overhaul** - Activity Panel, horizontal carousels, notification system
---
## Backend - New Files
| File | Purpose |
| --------------------------------------------- | --------------------------------------------------------------- |
| `backend/src/services/notificationService.ts` | Notification CRUD service with convenience methods |
| `backend/src/services/spotifyImport.ts` | Spotify playlist import logic, track matching, album resolution |
| `backend/src/services/spotify.ts` | Spotify API/scraping service (embed data extraction) |
| `backend/src/routes/notifications.ts` | Notification & download history API endpoints |
| `backend/src/routes/spotify.ts` | Spotify import API endpoints |
| `backend/src/utils/playlistLogger.ts` | Debug logger for Spotify import jobs |
## Backend - Modified Files
| File | Changes |
| ----------------------------------------------- | --------------------------------------------------------------------- |
| `backend/prisma/schema.prisma` | Added `Notification` model, `DownloadJob.cleared` field |
| `backend/src/services/simpleDownloadManager.ts` | Added notification integration, failure deduplication |
| `backend/src/services/lidarr.ts` | Smart `anyReleaseOk` fallback, MusicBrainz fallback for artist lookup |
| `backend/src/services/musicbrainz.ts` | Recording filtering, scoring system, title normalization |
| `backend/src/services/spotify.ts` | Embed scraping improvements, debug logging |
| `backend/src/index.ts` | Registered notification routes |
---
## Frontend - New Files
| File | Purpose |
| ----------------------------------------------------- | ----------------------------------------------------- |
| `frontend/components/layout/ActivityPanel.tsx` | Collapsible 3rd column with tabs, PWA install button |
| `frontend/components/activity/NotificationsTab.tsx` | System notifications list |
| `frontend/components/activity/ActiveDownloadsTab.tsx` | Currently downloading items |
| `frontend/components/activity/HistoryTab.tsx` | Completed/failed with retry |
| `frontend/components/ui/HorizontalCarousel.tsx` | Reusable carousel with arrows |
| `frontend/hooks/useActivityPanel.ts` | Panel state management |
| `frontend/app/import/spotify/page.tsx` | Spotify import UI page (preview, selection, progress) |
## Frontend - Modified Files
| File | Changes |
| ------------------------------------------------------------- | ------------------------------------------------------ |
| `frontend/components/layout/AuthenticatedLayout.tsx` | Added 3rd column, event listener for toggle |
| `frontend/components/layout/TopBar.tsx` | Added `ActivityPanelToggle` button |
| `frontend/components/MixCard.tsx`                             | Reduced padding/sizing (`p-4` → `p-2.5`)               |
| `frontend/features/home/components/ArtistsGrid.tsx` | Uses `HorizontalCarousel` |
| `frontend/features/home/components/MixesGrid.tsx` | Uses `HorizontalCarousel` |
| `frontend/features/home/components/ContinueListening.tsx` | Uses `HorizontalCarousel` |
| `frontend/features/home/components/PodcastsGrid.tsx` | Uses `HorizontalCarousel` |
| `frontend/features/home/components/HomeHero.tsx` | Already optimized (compact greeting) |
| `frontend/lib/api.ts` | Added notification API methods, Spotify import methods |
| `frontend/app/playlists/page.tsx` | Added "Import from Spotify" button/link |
| `frontend/app/playlist/[id]/page.tsx` | Full Spotify-style redesign (see below) |
| `frontend/app/mix/[id]/page.tsx` | Full Spotify-style redesign (matches playlist page) |
| `frontend/app/discover/page.tsx` | Updated to use consistent container widths |
| `frontend/features/discover/components/DiscoverHero.tsx` | Redesigned to match playlist/mix hero style |
| `frontend/features/discover/components/DiscoverActionBar.tsx` | Redesigned with Lidify yellow play button |
| `frontend/features/discover/components/TrackList.tsx` | Redesigned to match playlist/mix track listing |
| `frontend/components/layout/Sidebar.tsx` | Removed unused icon imports |
---
## Database Changes
```prisma
// NEW MODEL
model Notification {
  id        String   @id @default(cuid())
  userId    String
  type      String   // system, download_complete, playlist_ready, error, import_complete
  title     String
  message   String?
  metadata  Json?    // { playlistId, albumId, artistId, etc. }
  read      Boolean  @default(false)
  cleared   Boolean  @default(false)
  createdAt DateTime @default(now())
  user      User     @relation(fields: [userId], references: [id], onDelete: Cascade)
  @@index([userId, cleared])
  @@index([userId, read])
  @@index([createdAt])
}
// MODIFIED MODEL - DownloadJob
model DownloadJob {
  // ... existing fields ...
  cleared Boolean @default(false) // NEW: User dismissed from history
}
```
**Migration Applied:** `npx prisma db push`
---
## API Endpoints
### Notifications (`/api/notifications`)
| Method | Endpoint | Description |
| ------ | ------------------------------------ | ---------------------------- |
| GET | `/notifications` | List uncleared notifications |
| GET | `/notifications/unread-count` | Get unread count |
| POST | `/notifications/:id/read` | Mark as read |
| POST | `/notifications/read-all` | Mark all as read |
| POST | `/notifications/:id/clear` | Clear (dismiss) notification |
| POST | `/notifications/clear-all` | Clear all notifications |
| GET | `/notifications/downloads/active` | Active downloads |
| GET | `/notifications/downloads/history` | Completed/failed downloads |
| POST | `/notifications/downloads/:id/clear` | Clear from history |
| POST | `/notifications/downloads/clear-all` | Clear all history |
| POST | `/notifications/downloads/:id/retry` | Retry failed download |
### Spotify Import (`/api/spotify`)
| Method | Endpoint | Description |
| ------ | ---------------------------- | -------------------------------- |
| POST | `/spotify/import/preview` | Generate import preview from URL |
| POST | `/spotify/import/start` | Start import with selections |
| GET | `/spotify/import/:id/status` | Get import job status |
---
## Key Bug Fixes
### 1. Track Matching (Spotify Import)
- **File:** `backend/src/services/spotifyImport.ts`
- **Fix:** Added `stripTrackSuffix()` to remove "- 2011 Remaster" etc. while keeping punctuation
- **Fix:** Added Unicode normalization for artist names (Röyksopp → Royksopp)
- **Fix:** Multiple matching strategies (exact → stripped → fuzzy)
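The three fixes above combine into a small matching cascade. The backend implements this in TypeScript; the Python sketch below only illustrates the idea, and the remaster-only suffix pattern, the 0.85 fuzzy threshold, and every function name other than the concept of `stripTrackSuffix()` are assumptions.

```python
import re
import unicodedata
from difflib import SequenceMatcher

# Hypothetical pattern: covers "- 2011 Remaster" style suffixes only
SUFFIX_RE = re.compile(
    r"\s+-\s+(?:\d{4}\s+)?remaster(?:ed)?(?:\s+\d{4})?\s*$", re.IGNORECASE
)


def strip_track_suffix(title: str) -> str:
    """Drop a trailing remaster suffix while leaving punctuation intact."""
    return SUFFIX_RE.sub("", title)


def fold_accents(text: str) -> str:
    """Decompose accented characters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def match_track(title: str, library_titles: list, fuzzy_threshold: float = 0.85):
    """Try exact, then suffix-stripped, then fuzzy matching."""
    def norm(s: str) -> str:
        return fold_accents(s).casefold().strip()

    wanted = norm(title)
    for candidate in library_titles:
        if norm(candidate) == wanted:
            return candidate, "exact"
    stripped = norm(strip_track_suffix(title))
    for candidate in library_titles:
        if norm(strip_track_suffix(candidate)) == stripped:
            return candidate, "stripped"
    best, best_ratio = None, 0.0
    for candidate in library_titles:
        other = norm(strip_track_suffix(candidate))
        ratio = SequenceMatcher(None, stripped, other).ratio()
        if ratio > best_ratio:
            best, best_ratio = candidate, ratio
    if best is not None and best_ratio >= fuzzy_threshold:
        return best, "fuzzy"
    return None


print(strip_track_suffix("Don't Stop Me Now - 2011 Remaster"))
print(fold_accents("Röyksopp"))
print(match_track("Heroes - 2017 Remaster", ["Changes", "Heroes"]))
```

Running the cheapest strategy first means the fuzzy comparison only runs for titles that neither exact nor suffix-stripped matching could resolve.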
### 2. MusicBrainz Album Resolution
- **File:** `backend/src/services/musicbrainz.ts`
- **Fix:** Score threshold > 50 for studio albums
- **Fix:** Recording filtering (exclude live/demo/acoustic)
- **Fix:** Soundtrack penalty in scoring
### 3. Lidarr Album Addition
- **File:** `backend/src/services/lidarr.ts`
- **Fix:** Smart `anyReleaseOk` fallback (try strict first, then loosen)
- **Fix:** MusicBrainz fallback when Lidarr's metadata server fails
- **Fix:** Immediate error when no releases found
### 4. Multiple Failure Notifications
- **File:** `backend/src/services/simpleDownloadManager.ts`
- **Fix:** 30-second deduplication window for failure events
- **Fix:** Only notify on final exhaustion, not each retry
- **Fix:** Skip notifications for discovery/import batches
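The first two rules can be expressed as a small gate in front of the notification call. This is a sketch of the idea rather than the code in `simpleDownloadManager.ts`: the class name, the per-job key, and the injectable clock are illustrative, and only the 30-second window and the "final exhaustion only" rule come from the notes above.

```python
import time


class FailureNotificationGate:
    """Suppress repeated failure notifications for the same job."""

    def __init__(self, window_seconds: float = 30.0, clock=time.monotonic):
        self._window = window_seconds
        self._clock = clock          # injectable so tests can control time
        self._last_notified = {}     # job key -> time of last notification

    def should_notify(self, job_key: str, retries_exhausted: bool) -> bool:
        # Intermediate retries never notify; only the final failure does
        if not retries_exhausted:
            return False
        now = self._clock()
        last = self._last_notified.get(job_key)
        # Drop duplicates that arrive inside the deduplication window
        if last is not None and now - last < self._window:
            return False
        self._last_notified[job_key] = now
        return True


fake_now = [100.0]
gate = FailureNotificationGate(clock=lambda: fake_now[0])
during_retry = gate.should_notify("album-1", retries_exhausted=False)
first_failure = gate.should_notify("album-1", retries_exhausted=True)
fake_now[0] = 110.0
duplicate = gate.should_notify("album-1", retries_exhausted=True)
fake_now[0] = 131.0
after_window = gate.should_notify("album-1", retries_exhausted=True)
print(during_retry, first_failure, duplicate, after_window)
```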
---
## Testing Checklist
### Activity Panel
- [ ] Panel opens/closes from TopBar button
- [ ] Panel state persists in localStorage
- [ ] Notifications tab shows system messages
- [ ] Active tab shows downloading items (refreshes every 5s)
- [ ] History tab shows completed/failed
- [ ] Retry button works for failed downloads
- [ ] Clear buttons work
### Home Page Carousels
- [ ] Horizontal scroll works
- [ ] Arrow buttons appear on hover (desktop)
- [ ] Snap behavior works
- [ ] Card sizing is compact
### Spotify Import
- [ ] Preview generation works
- [ ] Album selection works
- [ ] Downloads start correctly
- [ ] Track matching works after downloads
- [ ] Playlist is created with matched tracks
- [ ] Notification appears when complete
### Notifications
- [ ] Download complete creates notification
- [ ] Download failed creates notification (only on exhaustion)
- [ ] Spotify import complete creates notification
- [ ] Unread badge shows count
- [ ] Mark as read works
- [ ] Clear works
### Playlist Page
- [ ] Hero section is compact with bottom-aligned content
- [ ] Shuffle button randomizes and plays tracks
- [ ] Track listing spans full width (no container)
- [ ] Currently playing track is highlighted
- [ ] Track numbers become play icons on hover
- [ ] Album column hidden on mobile
### PWA Install
- [ ] "Install App" button appears in Activity Panel (when installable)
- [ ] Button triggers browser install prompt
- [ ] Button disappears after installation
---
## Rollback Instructions
If issues arise, revert these files:
```bash
# Core files to revert for UI changes
git checkout HEAD~1 -- frontend/components/layout/AuthenticatedLayout.tsx
git checkout HEAD~1 -- frontend/components/layout/TopBar.tsx
git checkout HEAD~1 -- frontend/components/layout/ActivityPanel.tsx
git checkout HEAD~1 -- frontend/components/activity/
# For Spotify import issues
git checkout HEAD~1 -- backend/src/services/spotifyImport.ts
git checkout HEAD~1 -- backend/src/services/musicbrainz.ts
git checkout HEAD~1 -- backend/src/services/lidarr.ts
# Database rollback (if needed)
# Remove Notification model and DownloadJob.cleared from schema
npx prisma db push
```
---
## Notes
- The old `DownloadNotifications.tsx` (floating modal) still exists but is no longer imported in the layout
- All grid components were already converted to carousels prior to this session
- The Spotify import flow uses `lidarrService.addAlbum()` directly instead of `simpleDownloadManager` to avoid same-artist fallback
## Playlist Page Redesign
**File:** `frontend/app/playlist/[id]/page.tsx`
### Changes Made
1. **Fixed React Hooks Error** - Moved `totalDuration` useMemo before early returns
2. **Full-Width Track Listing** - Removed container wrapper, tracks span full panel width like Spotify
3. **Compact Hero Section** - Smaller cover art (140px/192px), bottom-aligned content, reduced title size
4. **Added Shuffle Button** - Shuffles and plays all tracks in random order
5. **Grid-Based Track Layout** - Columns: #, Title/Artist, Album, Duration (responsive)
6. **Track Hover States** - Number becomes play icon on hover, row highlights
### PWA Install in Activity Panel
**File:** `frontend/components/layout/ActivityPanel.tsx`
- Added `beforeinstallprompt` event listener
- "Install App" button appears at bottom of panel when PWA can be installed
- Hides automatically when app is already installed or running in standalone mode
### Sidebar Cleanup
**File:** `frontend/components/layout/Sidebar.tsx`
- Removed unused icon imports (Home, Library, Sparkles, Book, Mic2)
- Navigation items use text-only (no icons) - matching minimalist design
### Playlists Page Redesign
**File:** `frontend/app/playlists/page.tsx`
**Before → After:**
| Element | Before | After |
| ---------------- | --------------------------------- | -------------------------------------- |
| Header title | `text-3xl md:text-4xl font-black` | `text-2xl font-bold` |
| Header padding | `px-6 md:px-8 py-6 md:py-8` | `px-6 pt-6 pb-4` |
| Gradient overlay | Yellow gradient at top | Removed |
| Import button | Green outline with icon | Solid green `bg-[#1DB954]`, no icon |
| Hidden toggle | Icon + text, bordered | Text only, minimal style |
| Card wrapper | `<Card>` component | Simple `<div>` with `hover:bg-white/5` |
| Card padding | `p-4` (via Card) | `p-3` |
| Play button | `w-12 h-12` | `w-10 h-10` |
| Empty state | `<EmptyState>` with icons | Simple centered div |
| Shared badge | Purple badge | Shown in subtitle instead |
| Track count | "tracks" | "songs" (matches Spotify) |
**Design Philosophy:**
- Remove decorative icons where text suffices
- Reduce spacing for tighter, professional feel
- Use native hover states instead of custom components
- Minimal color - let content speak
- Match Spotify's terminology
---
## Spotify-Style Design Patterns
> **Use these patterns consistently across all pages for a cohesive look.**
### 1. Hero Sections (Albums, Playlists, Artists)
```
- Compact height (max ~180px for cover on desktop)
- Content bottom-aligned to the cover art
- Title: text-2xl md:text-3xl font-bold (NOT text-4xl+)
- Subtitle info: text-sm text-gray-400
- Reduced vertical spacing (gap-2 to gap-4 max)
- No decorative gradients overlaying the hero
```
### 2. Track Listings
```
- Full-width, no container card wrapping
- Grid layout: [#] [Title/Artist] [Album] [Duration]
- Album column: hidden on mobile (md:grid-cols-[16px_1fr_1fr_60px])
- Hover: row bg-white/5, number → play icon
- Playing indicator: Lidify yellow (#ecb200) on track number
- Compact row height (~56px)
```
### 3. Page Headers
```
- Title: text-2xl font-bold (not text-3xl+)
- Subtitle: text-sm text-gray-400
- Actions: rounded-full buttons with minimal icons
- No excessive padding (px-6 py-4 is enough)
```
### 4. Cards (Albums, Artists, Playlists)
```
- Compact padding: p-2.5 (not p-4)
- Title: text-sm font-medium truncate
- Subtitle: text-xs text-gray-500
- Play button: bottom-right, shows on hover
```
### 5. Grids → Carousels
```
- Use HorizontalCarousel for content rows
- Single horizontal line, scroll/swipe
- Arrow buttons on hover (desktop)
- Snap behavior for smooth scrolling
```
### 6. General Typography
```
- Section headers: text-lg font-semibold (not text-xl)
- Greeting (home): text-2xl md:text-3xl font-bold tracking-tight
- No ALL CAPS unless absolutely necessary
- Muted subtitles: text-gray-400 or text-gray-500
```
### 7. Buttons & Actions
```
- Primary action: rounded-full, bg-[#ecb200] text-black
- Secondary: bg-white/10 hover:bg-white/20
- Icon-only buttons: rounded-full p-2
- Minimal icon usage - text labels preferred
```
### 8. Spacing Philosophy
```
- Tight but breathable
- Section gaps: gap-6 (not gap-8 or gap-10)
- Card grids: gap-4
- Hero to content: pt-6 (not pt-10)
```
---
## Post-Implementation Fixes
| Date | File | Issue | Fix |
| ---------- | --------------------------------------------------------------- | ----------------------------------------------------- | -------------------------------------------------------------------------------------- |
| 2025-12-15 | `backend/src/routes/notifications.ts` | Wrong import path `../db` | Changed to `../utils/db` |
| 2025-12-15 | `frontend/app/playlist/[id]/page.tsx` | React hooks order violation | Moved `useMemo` before early returns |
| 2025-12-15 | `frontend/app/playlist/[id]/page.tsx` | `useAuth` not defined | Removed unused `isAuthenticated` |
| 2025-12-15 | `frontend/components/layout/ActivityPanel.tsx` | Badge not clearing after clear all | Added `notifications-changed` event listener |
| 2025-12-15 | `frontend/components/activity/NotificationsTab.tsx` | Badge not updating | Dispatch `notifications-changed` event on mutations |
| 2025-12-15 | `backend/src/services/spotifyImport.ts` | Track matching failing (apostrophes, artist matching) | Added `normalizeApostrophes()`, changed artist match to use `contains` with first word |
| 2025-12-15 | `frontend/app/playlists/page.tsx` | Page design not matching Spotify style | Full redesign: compact header, cleaner cards, minimal icons, refined typography |
| 2025-12-15 | `frontend/app/import/spotify/page.tsx` | Using Music2 icon instead of Spotify logo | Uses SpotIcon.png, cleaner layout, matches style guide, removed heavy Card components |
| 2025-12-15 | `frontend/app/import/spotify/page.tsx` | Grey/transparent gradient not matching brand | Added yellow-to-purple gradient (same as home page) with quick fade ratio (35vh/25vh) |
| 2025-12-15 | `frontend/app/discover/page.tsx` | Container width inconsistent with hero | Added `max-w-7xl mx-auto` to track listing section |
| 2025-12-15 | `frontend/app/mix/[id]/page.tsx` | Container width inconsistent with hero | Added `max-w-7xl mx-auto` to track listing section |
| 2025-12-15 | `frontend/app/playlist/[id]/page.tsx` | Container width inconsistent with hero | Added `max-w-7xl mx-auto` to track listing section |
| 2025-12-15 | `frontend/features/discover/components/*` | Discover page not matching playlist/mix design | Redesigned DiscoverHero, DiscoverActionBar, TrackList to match Spotify style |
| 2025-12-15 | `frontend/app/library/page.tsx` | Container width not matching other pages | Removed `max-w-7xl mx-auto`, now full-width with `px-4 md:px-8` |
| 2025-12-15 | `frontend/features/library/components/LibraryHeader.tsx` | Container width not matching other pages | Removed `max-w-7xl mx-auto`, now full-width with `px-4 md:px-8` |
| 2025-12-15 | `frontend/app/podcasts/page.tsx` | Container width + card styling not matching | Removed `max-w-7xl mx-auto`, cleaner cards without borders/gradients |
| 2025-12-15 | `frontend/app/audiobooks/page.tsx` | Container width not matching other pages | Removed `max-w-7xl mx-auto`, smaller header text, consistent with Spotify style |
| 2025-12-15 | `frontend/app/artist/[id]/page.tsx` | Container width not matching other pages | Removed `max-w-7xl mx-auto`, now full-width with `px-4 md:px-8` |
| 2025-12-15 | `frontend/app/album/[id]/page.tsx` | Container width not matching other pages | Removed `max-w-7xl mx-auto`, now full-width with `px-4 md:px-8` |
| 2025-12-15 | `frontend/features/artist/components/ArtistHero.tsx` | Hero not matching Spotify style | Compact hero, full-width, bottom-aligned content, kept VibrantJS gradients |
| 2025-12-15 | `frontend/features/artist/components/ArtistActionBar.tsx` | Action bar too heavy | Simplified to play button + shuffle + download, matching playlist style |
| 2025-12-15 | `frontend/features/artist/components/PopularTracks.tsx` | Track list not matching new style | Removed Card wrapper, grid-based layout, cleaner typography |
| 2025-12-15 | `frontend/features/artist/components/Discography.tsx` | Section header too large | Changed header from `text-2xl md:text-3xl` to `text-xl` |
| 2025-12-15 | `frontend/features/artist/components/AvailableAlbums.tsx` | Section headers too large | Changed headers to `text-xl font-bold mb-4`, renamed sections |
| 2025-12-15 | `frontend/features/artist/components/SimilarArtists.tsx` | Cards not matching new style | Cleaner cards with transparent bg, smaller header |
| 2025-12-15 | `frontend/features/artist/components/ArtistBio.tsx` | Using Card component | Replaced Card with simple `bg-white/5` div |
| 2025-12-15 | `frontend/features/album/components/AlbumHero.tsx` | Hero not matching Spotify style | Compact hero, full-width, bottom-aligned content, kept VibrantJS gradients |
| 2025-12-15 | `frontend/features/album/components/AlbumActionBar.tsx` | Action bar too heavy | Simplified to play + shuffle + add to playlist, matching playlist style |
| 2025-12-15 | `frontend/features/album/components/SimilarAlbums.tsx` | Section header too large | Changed header to `text-xl font-bold mb-4` |
| 2025-12-15 | `frontend/app/artist/[id]/page.tsx` | Artist bio/about not showing | Now uses `artist.bio \|\| artist.summary` for library artists with `summary` field |
| 2025-12-15 | `frontend/features/artist/components/ArtistBio.tsx` | Read more link not brand color | Added `[&_a]:text-[#ecb200]` for Lidify yellow links |
| 2025-12-15 | `frontend/app/audiobooks/[id]/page.tsx` | Page design not matching Spotify style | Compact hero, yellow play button, integrated action bar, full-width layout |
| 2025-12-15 | `frontend/features/audiobook/components/AudiobookHero.tsx` | Hero too large and dated | Compact Spotify-style hero with bottom-aligned content, VibrantJS gradients preserved |
| 2025-12-15 | `frontend/features/audiobook/components/AudiobookActionBar.tsx` | Action bar not matching other pages | Yellow play button, inline progress, subtle action icons |
| 2025-12-15 | `frontend/app/podcasts/[id]/page.tsx` | Page design not matching Spotify style | Compact hero, fixed height gradient (25vh), full-width layout |
| 2025-12-15 | `frontend/features/podcast/components/PodcastHero.tsx` | Hero too large and dated | Compact Spotify-style hero with bottom-aligned content, VibrantJS gradients preserved |
| 2025-12-15 | `frontend/features/podcast/components/PodcastActionBar.tsx` | Action bar too heavy | Yellow subscribe button, subtle RSS link, cleaner remove confirmation |
| 2025-12-15 | `frontend/features/podcast/components/ContinueListening.tsx` | Cards not matching new style | Yellow play button, cleaner progress bar, simpler prev/next episodes |
| 2025-12-15 | `frontend/features/podcast/components/EpisodeList.tsx` | Episode list not matching new style | Removed Card wrapper, yellow highlights, cleaner typography |
| 2025-12-15 | `frontend/features/podcast/components/SimilarPodcasts.tsx` | Cards not matching new style | Transparent bg with hover, smaller header, cleaner layout |
| 2025-12-15 | `frontend/features/podcast/components/PreviewEpisodes.tsx` | Cards not matching new style | Removed Card wrappers, yellow subscribe button, cleaner About section |
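The apostrophe and artist-matching fix listed for `spotifyImport.ts` could look roughly like the sketch below. The real `normalizeApostrophes()` is not shown in this log, so the set of characters it replaces and the `artistMatches` helper are assumptions made for illustration.

```typescript
// Hypothetical sketch: the actual implementation in spotifyImport.ts is not shown in this log.
// Maps common apostrophe look-alikes (curly quotes, modifier apostrophe, backtick, acute accent)
// to the plain ASCII apostrophe.
const APOSTROPHE_VARIANTS = /[\u2018\u2019\u02BC\u0060\u00B4]/g;

function normalizeApostrophes(text: string): string {
  return text.replace(APOSTROPHE_VARIANTS, "'");
}

// Hypothetical helper illustrating a "contains with first word" artist match.
function artistMatches(libraryArtist: string, spotifyArtist: string): boolean {
  const words = normalizeApostrophes(spotifyArtist).trim().split(/\s+/);
  const firstWord = (words[0] ?? "").toLowerCase();
  if (firstWord.length === 0) return false;
  return normalizeApostrophes(libraryArtist).toLowerCase().indexOf(firstWord) !== -1;
}
```

Matching on the first word only is deliberately loose, so a stricter comparison may be needed wherever false positives matter.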
---
## Settings Page Redesign (December 16, 2025)
### Overview
Complete redesign of the settings page to match Spotify's clean, minimal aesthetic with:
- **Sidebar navigation** - Fixed sidebar with section links, active state tracking via intersection observer
- **Single scrollable page** - All sections on one page instead of tabs
- **Unified Spotify section** - Combined OAuth user connection + Developer API credentials
- **Spotify-style design patterns** - Row-based layouts, clean toggles, minimal borders
### Database Changes
```prisma
model User {
// ... existing fields ...
// NEW: Spotify OAuth connection
spotifyAccessToken String? // Encrypted OAuth access token
spotifyRefreshToken String? // Encrypted OAuth refresh token
spotifyTokenExpiry DateTime? // When access token expires
spotifyUserId String? // Spotify user ID
spotifyDisplayName String? // Display name from Spotify
}
```
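As an illustration of how the three token fields work together, the sketch below derives a connection state from them. The `SpotifyConnection` shape mirrors the nullable Prisma fields above; the `tokenState` function and the 60-second safety margin are assumptions, not code from the repository.

```typescript
// Mirrors the nullable OAuth fields on the User model (decrypted values assumed).
interface SpotifyConnection {
  spotifyAccessToken: string | null;
  spotifyRefreshToken: string | null;
  spotifyTokenExpiry: Date | null;
}

type TokenState = "disconnected" | "valid" | "needs-refresh";

// skewMs is an assumed safety margin so the token is refreshed slightly before it expires.
function tokenState(conn: SpotifyConnection, now: Date, skewMs = 60000): TokenState {
  if (!conn.spotifyAccessToken || !conn.spotifyRefreshToken) return "disconnected";
  if (!conn.spotifyTokenExpiry) return "needs-refresh";
  const remainingMs = conn.spotifyTokenExpiry.getTime() - now.getTime();
  return remainingMs > skewMs ? "valid" : "needs-refresh";
}
```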
### New API Endpoints
| Method | Endpoint | Description |
| ------ | ------------------------------ | ----------------------------------- |
| GET | `/api/spotify/auth/url` | Generate OAuth authorization URL |
| GET | `/api/spotify/auth/callback` | Handle OAuth callback, store tokens |
| POST | `/api/spotify/auth/disconnect` | Remove user's Spotify connection |
| GET | `/api/spotify/auth/status` | Check if user is connected |
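The `/api/spotify/auth/url` endpoint has to produce a Spotify authorization URL. The sketch below assumes the Authorization Code flow; the `SpotifyAuthParams` shape and the function name are hypothetical, while the `accounts.spotify.com/authorize` endpoint and its query parameters are the ones Spotify documents.

```typescript
// Hypothetical input shape; values would come from the Developer API credentials in settings.
interface SpotifyAuthParams {
  clientId: string;
  redirectUri: string; // e.g. the public URL of /api/spotify/auth/callback
  scopes: string[];
  state: string; // opaque anti-CSRF value that the callback should verify
}

function buildSpotifyAuthUrl(p: SpotifyAuthParams): string {
  const query: Array<[string, string]> = [
    ["response_type", "code"],
    ["client_id", p.clientId],
    ["redirect_uri", p.redirectUri],
    ["scope", p.scopes.join(" ")],
    ["state", p.state],
  ];
  const qs = query
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  return `https://accounts.spotify.com/authorize?${qs}`;
}
```

The callback handler would then compare the returned `state` with the stored one before exchanging the `code` for tokens.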
### New Frontend Files
| File | Purpose |
| ----------------------------------------------------------------------------- | --------------------------------------- |
| `frontend/features/settings/components/ui/SettingsLayout.tsx` | Sidebar + main content wrapper |
| `frontend/features/settings/components/ui/SettingsSidebar.tsx` | Navigation sidebar with section links |
| `frontend/features/settings/components/ui/SettingsSection.tsx` | Section header with separator |
| `frontend/features/settings/components/ui/SettingsRow.tsx` | Label + description left, control right |
| `frontend/features/settings/components/ui/SettingsToggle.tsx` | Spotify-style toggle switch |
| `frontend/features/settings/components/ui/SettingsSelect.tsx` | Dropdown select |
| `frontend/features/settings/components/ui/SettingsInput.tsx` | Text/password input with show/hide |
| `frontend/features/settings/components/ui/ConnectionCard.tsx` | OAuth connection card (Spotify) |
| `frontend/features/settings/components/ui/index.ts` | Barrel export |
| `frontend/features/settings/components/sections/AccountSection.tsx` | Password change + 2FA |
| `frontend/features/settings/components/sections/PlaybackSection.tsx` | Streaming quality dropdown |
| `frontend/features/settings/components/sections/SpotifyConnectionSection.tsx` | Spotify OAuth connection |
| `frontend/features/settings/components/sections/SpotifyAPISection.tsx` | Developer API credentials |
| `frontend/features/settings/components/sections/CacheSection.tsx` | Cache sizes + automation toggles |
| `frontend/features/settings/hooks/useSpotifyOAuth.ts` | OAuth state management |
### Modified Frontend Files
| File | Changes |
| -------------------------------------------------------------------------- | ------------------------------------- |
| `frontend/app/settings/page.tsx` | Complete redesign with sidebar layout |
| `frontend/features/settings/components/sections/LidarrSection.tsx` | Spotify-style row layout |
| `frontend/features/settings/components/sections/AudiobookshelfSection.tsx` | Spotify-style row layout |
| `frontend/features/settings/components/sections/SoulseekSection.tsx` | Spotify-style row layout |
| `frontend/features/settings/components/sections/AIServicesSection.tsx` | Spotify-style row layout |
| `frontend/features/settings/components/sections/StoragePathsSection.tsx` | Spotify-style row layout |
| `frontend/features/settings/components/sections/UserManagementSection.tsx` | Cleaner design, modal for delete |
### Modified Backend Files
| File | Changes |
| ------------------------------- | ---------------------------------------- |
| `backend/prisma/schema.prisma` | Added Spotify OAuth fields to User model |
| `backend/src/routes/spotify.ts` | Added OAuth routes |
### Deleted Files (Consolidated)
| File | Reason |
| ---------------------------------------------------------------------------- | --------------------------------- |
| `frontend/features/settings/components/UserSettingsTab.tsx` | Replaced by unified settings page |
| `frontend/features/settings/components/AccountTab.tsx` | Replaced by unified settings page |
| `frontend/features/settings/components/SystemSettingsTab.tsx` | Replaced by unified settings page |
| `frontend/features/settings/components/sections/ChangePasswordSection.tsx` | Merged into AccountSection |
| `frontend/features/settings/components/sections/TwoFactorAuthSection.tsx` | Merged into AccountSection |
| `frontend/features/settings/components/sections/PlaybackQualitySection.tsx` | Replaced by PlaybackSection |
| `frontend/features/settings/components/sections/AdvancedSettingsSection.tsx` | Replaced by CacheSection |
| `frontend/features/settings/components/sections/CacheSettingsSection.tsx` | Replaced by CacheSection |
| `frontend/features/settings/components/sections/SpotifySection.tsx` | Split into Connection + API |
### Settings Sections
**All Users:** Account, Playback, Connected Services (Spotify OAuth)
**Admin Only:** Download Services, Media Servers, P2P Networks, AI Services, Spotify API, Storage, Cache & Automation, User Management
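The split between all-user and admin-only sections can be expressed as data plus a filter. This is a sketch only: the `id` values and the `visibleSections` helper are hypothetical, and only the section titles come from the lists above.

```typescript
type Role = "user" | "admin";

interface SettingsSectionDef {
  id: string; // hypothetical anchor id used by the sidebar
  title: string;
  adminOnly: boolean;
}

const SECTIONS: SettingsSectionDef[] = [
  { id: "account", title: "Account", adminOnly: false },
  { id: "playback", title: "Playback", adminOnly: false },
  { id: "connected-services", title: "Connected Services", adminOnly: false },
  { id: "download-services", title: "Download Services", adminOnly: true },
  { id: "media-servers", title: "Media Servers", adminOnly: true },
  { id: "p2p-networks", title: "P2P Networks", adminOnly: true },
  { id: "ai-services", title: "AI Services", adminOnly: true },
  { id: "spotify-api", title: "Spotify API", adminOnly: true },
  { id: "storage", title: "Storage", adminOnly: true },
  { id: "cache-automation", title: "Cache & Automation", adminOnly: true },
  { id: "user-management", title: "User Management", adminOnly: true },
];

// Admins see everything; regular users see only the non-admin sections.
function visibleSections(role: Role): SettingsSectionDef[] {
  return SECTIONS.filter((section) => role === "admin" || !section.adminOnly);
}
```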
---
## Home Page Enhancements (Dec 16, 2025)
### New Features
1. **Radio Stations Section** - Compact horizontal row at the top of the home page showing random Deezer radio stations
2. **Featured Playlists Section** - Grid showing 10 featured Deezer playlists after the Popular Artists section
### New Files Created
| File | Purpose |
| ------------------------------------------------------ | ------------------------------------------- |
| `frontend/features/home/components/FeaturedPlaylistsGrid.tsx` | Grid component for featured playlists |
| `frontend/features/home/components/RadioStationsGrid.tsx` | Horizontal scroll component for radio stations |
### Modified Files
| File | Changes |
| ---------------------------------------------------- | ------------------------------------------------ |
| `frontend/app/page.tsx` | Added radio stations and featured playlists sections |
| `frontend/features/home/hooks/useHomeData.ts` | Added browse data fetching for playlists/radios |
| `frontend/hooks/useQueries.ts` | Added browse query keys and hooks |
| `backend/src/routes/browse.ts` | Increased featured playlists limit from 50 to 200 |
---
## Notification & Sync Button Improvements (Dec 16, 2025)
### Changes
1. **Sync Button** - No longer shows a toast overlay; it turns green with a spinning animation while syncing
2. **Optimistic Notification Clearing** - Notifications are removed from the UI immediately, before the API call completes
3. **Duplicate Key Fix** - Added a context parameter to `renderCard` in the browse page to prevent duplicate key errors
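The optimistic clearing described in item 2 follows a common pattern: update local state first, then roll back if the request fails. The sketch below uses a hypothetical `NotificationStore` and `clearAllOptimistic` helper; it is not the code in `NotificationsTab.tsx`.

```typescript
// Hypothetical in-memory store standing in for the component's state.
interface NotificationStore {
  items: string[];
}

// Clears the list immediately, then restores the snapshot if the request rejects.
function clearAllOptimistic(
  store: NotificationStore,
  request: () => Promise<void>
): Promise<void> {
  const snapshot = store.items;
  store.items = []; // optimistic update: the UI reflects the change right away
  return request().catch((err: unknown) => {
    store.items = snapshot; // roll back so the UI matches the server again
    throw err;
  });
}
```

The rollback branch matters: without it, a failed request would leave the badge cleared while the server still holds the notifications.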
### Modified Files
| File | Changes |
| -------------------------------------------------------- | ------------------------------------------------ |
| `frontend/components/layout/Sidebar.tsx` | Removed toast, added green color while syncing |
| `frontend/components/activity/NotificationsTab.tsx` | Implemented optimistic updates for all mutations |
| `frontend/app/browse/playlists/page.tsx` | Fixed duplicate key errors with unique keys |
---
## Essentia Audio Analysis Integration (Dec 16, 2025)
### Overview
Integrated Essentia audio analysis to extract BPM, key, mood, energy, and other audio features from tracks. This enables intelligent mood-based mixes and personalized playlists.
### Database Changes
Added to `Track` model in `backend/prisma/schema.prisma`:
| Field | Type | Description |
| ------------------ | ---------- | ------------------------------------- |
| `bpm` | Float? | Beats per minute |
| `beatsCount` | Int? | Total beats in track |
| `key` | String? | Musical key (C, F#, Bb, etc.) |
| `keyScale` | String? | "major" or "minor" |
| `keyStrength` | Float? | Key detection confidence (0-1) |
| `energy` | Float? | Overall energy (0-1) |
| `loudness` | Float? | Average loudness in dB |
| `dynamicRange` | Float? | Dynamic range in dB |
| `danceability` | Float? | Danceability score (0-1) |
| `valence` | Float? | Happy (1) to sad (0) |
| `arousal` | Float? | Energetic (1) to calm (0) |
| `instrumentalness` | Float? | Lack of vocals (0-1, 1=instrumental) |
| `acousticness` | Float? | Acoustic vs electronic (0-1) |
| `speechiness` | Float? | Spoken word content (0-1) |
| `moodTags` | String[] | ML-classified mood tags |
| `essentiaGenres` | String[] | ML-classified genres |
| `lastfmTags` | String[] | User-generated mood tags from Last.fm |
| `analysisStatus` | String | pending/processing/completed/failed |
| `analysisVersion` | String? | Essentia version used |
| `analyzedAt` | DateTime? | When analysis was completed |
| `analysisError` | String? | Error message if failed |
### New Files
| File | Description |
| ------------------------------------------------- | -------------------------------------------------- |
| `services/audio-analyzer/Dockerfile` | Python 3.11 + Essentia container |
| `services/audio-analyzer/analyzer.py` | Main audio analysis service |
| `services/audio-analyzer/requirements.txt` | Python dependencies |
| `backend/src/workers/trackEnrichment.ts` | Last.fm tag enrichment worker |
| `backend/src/routes/analysis.ts` | API routes for analysis status & triggers |
### Modified Files
| File | Changes |
| -------------------------------------------------------------- | ----------------------------------------------- |
| `backend/prisma/schema.prisma` | Added audio analysis fields to Track model |
| `backend/src/workers/index.ts` | Added track enrichment worker startup/shutdown |
| `backend/src/workers/queues.ts` | Added `analysisQueue` for audio analysis jobs |
| `backend/src/index.ts` | Registered `/api/analysis` routes |
| `backend/src/services/programmaticPlaylists.ts` | Added mood-based mix generators |
| `backend/src/routes/library.ts` | Added mood-based radio station filtering |
| `frontend/features/home/components/LibraryRadioStations.tsx` | Added mood-based radio station buttons |
| `docker-compose.yml` | Added `audio-analyzer` service (optional) |
### New Mix Types (Audio Analysis-Based)
| Mix Type | Criteria |
| -------------- | --------------------------------------------- |
| High Energy | energy >= 0.7, BPM >= 120 |
| Late Night | energy <= 0.4, BPM <= 90, low arousal |
| Happy Vibes | valence >= 0.6, energy >= 0.5 |
| Melancholy | valence <= 0.4, minor key preferred |
| Dance Floor | danceability >= 0.7, BPM 110-140 |
| Acoustic | acousticness >= 0.6, energy 0.3-0.6 |
| Instrumental | instrumentalness >= 0.7, energy 0.3-0.6 |
| Road Trip | tags or energy 0.5-0.8, BPM 100-130 |
| Sunday Morning | low energy, high acousticness (day-specific) |
| Monday Motivation | high energy, high valence (day-specific) |
| Friday Night | high danceability, high energy (day-specific) |
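The numeric rows of the table translate directly into predicates. The sketch below covers only the mixes with explicit thresholds; the `AnalyzedTrack` shape, `MIX_RULES`, and `mixesFor` are illustrative names, and treating a missing value as "does not match" is an assumption.

```typescript
interface AnalyzedTrack {
  energy: number | null;
  bpm: number | null;
  valence: number | null;
  danceability: number | null;
  acousticness: number | null;
  instrumentalness: number | null;
}

// True when the value is present and lies within [min, max].
const inRange = (v: number | null, min: number, max = Infinity): boolean =>
  v !== null && v >= min && v <= max;

const MIX_RULES: Array<{ name: string; matches: (t: AnalyzedTrack) => boolean }> = [
  { name: "High Energy", matches: (t) => inRange(t.energy, 0.7) && inRange(t.bpm, 120) },
  { name: "Happy Vibes", matches: (t) => inRange(t.valence, 0.6) && inRange(t.energy, 0.5) },
  { name: "Dance Floor", matches: (t) => inRange(t.danceability, 0.7) && inRange(t.bpm, 110, 140) },
  { name: "Acoustic", matches: (t) => inRange(t.acousticness, 0.6) && inRange(t.energy, 0.3, 0.6) },
  { name: "Instrumental", matches: (t) => inRange(t.instrumentalness, 0.7) && inRange(t.energy, 0.3, 0.6) },
];

// Returns the names of every mix the track qualifies for.
function mixesFor(track: AnalyzedTrack): string[] {
  return MIX_RULES.filter((rule) => rule.matches(track)).map((rule) => rule.name);
}
```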
### API Endpoints
| Method | Endpoint | Description |
| ------ | ----------------------------- | ---------------------------------------- |
| GET | `/api/analysis/status` | Get analysis progress statistics |
| POST | `/api/analysis/start` | Queue pending tracks for analysis |
| POST | `/api/analysis/retry-failed` | Reset failed tracks to pending |
| POST | `/api/analysis/analyze/:id` | Queue specific track for analysis |
| GET | `/api/analysis/track/:id` | Get analysis data for specific track |
| GET | `/api/analysis/features` | Get aggregated feature statistics |
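For client code, the endpoint table can be captured in one small helper that returns the method and path for each action. The `AnalysisAction` and `RequestSpec` types and the `analysisRequest` function are hypothetical; only the methods and paths come from the table.

```typescript
type AnalysisAction = "status" | "start" | "retry-failed" | "analyze" | "track" | "features";

interface RequestSpec {
  method: "GET" | "POST";
  path: string;
}

// Builds the method and path for an analysis endpoint; the two per-track actions require an id.
function analysisRequest(action: AnalysisAction, trackId?: string): RequestSpec {
  const base = "/api/analysis";
  if (action === "analyze" || action === "track") {
    if (!trackId) throw new Error(`"${action}" requires a track id`);
    const id = encodeURIComponent(trackId);
    return action === "analyze"
      ? { method: "POST", path: `${base}/analyze/${id}` }
      : { method: "GET", path: `${base}/track/${id}` };
  }
  const method = action === "status" || action === "features" ? "GET" : "POST";
  return { method, path: `${base}/${action}` };
}
```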
### Starting the Audio Analyzer
The audio analyzer is disabled by default. To enable it:
```bash
docker-compose --profile audio-analysis up -d
```
Or start only the analyzer service:
```bash
docker-compose up -d audio-analyzer
```
---
## Notification System Fixes (Dec 16, 2025)
### Issues Fixed
1. **Toast overlays for cache clearing and sync** - Removed toast.success overlays for "Caches cleared" and "Library scan started" since these should appear in the activity panel notification bar instead.
2. **Notification badge not clearing immediately** - The `useNotifications` hook wasn't responding to `notifications-changed` events. Fixed by adding an event listener that triggers a refetch.
3. **Settings page glitchy sidebar** - Replaced IntersectionObserver with scroll-based tracking for smoother sidebar highlighting.
### Modified Files
| File | Change |
|------|--------|
| `frontend/hooks/useNotifications.ts` | Added event listener for `notifications-changed` to trigger immediate refetch |
| `frontend/features/settings/components/sections/CacheSection.tsx` | Removed toast.success for cache clearing and sync, added local error state |
| `frontend/components/layout/TopBar.tsx` | Removed toast.success for library scan started |
| `frontend/components/layout/Sidebar.tsx` | Added `notifications-changed` event dispatch after sync |
| `frontend/features/settings/components/ui/SettingsLayout.tsx` | Replaced IntersectionObserver with throttled scroll listener for smoother sidebar tracking |
### Behavior Changes
- **Sync button**: No longer shows toast overlay - progress appears in activity panel
- **Clear caches button**: No longer shows toast overlay - implicit success (button returns to normal state)
- **Notification badge**: Now clears immediately via optimistic updates and event system
- **Settings sidebar**: Smoother scrolling behavior without jumpy highlights
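The scroll-based sidebar tracking can be sketched as two small pieces: a throttle wrapper and a function that picks the active section from known offsets. Both are illustrative assumptions rather than the code in `SettingsLayout.tsx`; in particular the 100px offset and the leading-edge throttle behavior are guesses.

```typescript
interface SectionOffset {
  id: string;
  top: number; // distance in px from the top of the scroll container
}

// Returns the last section whose top edge is at or above the scroll position plus an offset.
// Assumes `sections` is sorted by ascending `top`.
function findActiveSection(
  sections: SectionOffset[],
  scrollTop: number,
  offset = 100
): string | null {
  let active: string | null = null;
  for (const section of sections) {
    if (section.top <= scrollTop + offset) active = section.id;
    else break;
  }
  return active;
}

// Leading-edge throttle: runs at most once per interval and drops calls in between.
// `now` is injectable so the behavior can be checked without real timers.
function throttle<A extends unknown[]>(
  fn: (...args: A) => void,
  intervalMs: number,
  now: () => number = Date.now
): (...args: A) => void {
  let last = -Infinity;
  return (...args: A) => {
    const t = now();
    if (t - last >= intervalMs) {
      last = t;
      fn(...args);
    }
  };
}
```

In a component, the throttled handler would be attached to the container's `scroll` event and would call `findActiveSection` with freshly measured offsets.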
---
## Session 8: Artist Radio Feature
### New Feature: Artist Radio with Hybrid Similarity Matching
| File | Change |
|------|--------|
| `backend/src/routes/library.ts` | Added `artist` case to `/library/radio` endpoint with hybrid matching |
| `backend/src/routes/library.ts` | Added artist name filtering to `/library/genres` endpoint |
| `frontend/features/artist/components/ArtistActionBar.tsx` | Added Radio icon button for library artists |
| `frontend/app/artist/[id]/page.tsx` | Added `handleStartRadio` function and passed to ArtistActionBar |
| `frontend/lib/api.ts` | Added `getRadioTracks()` method |
### Artist Radio Logic
The artist radio uses a **hybrid approach** with vibe boosting:
1. **Last.fm Similar Artists (filtered to library)**: Primary source; gets up to 15 similar artists that exist in the user's library
2. **Genre Matching Fallback**: If fewer than 5 similar artists are found, finds library artists with overlapping genres
3. **Vibe Boost via Audio Analysis**: Scores similar artists' tracks by BPM, energy, valence, and danceability similarity
4. **Track Mix**: ~40% from original artist, ~60% from vibe-matched similar artists
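Steps 1, 2, and 4 above can be sketched as two pure functions. The names and the `RadioTrack` shape are hypothetical, and the choice to top up (rather than replace) the Last.fm list with genre matches is an assumption, since the log does not say which the endpoint does.

```typescript
interface RadioTrack {
  id: string;
  artistId: string;
}

// Steps 1-2: prefer Last.fm similar artists; add genre matches when fewer than 5 were found.
function pickSimilarArtists(
  lastfmSimilar: string[],
  genreMatches: string[],
  max = 15,
  minBeforeFallback = 5
): string[] {
  const picked = lastfmSimilar.slice(0, max);
  if (picked.length < minBeforeFallback) {
    for (const id of genreMatches) {
      if (picked.length >= max) break;
      if (picked.indexOf(id) === -1) picked.push(id);
    }
  }
  return picked;
}

// Step 4: roughly 40% of the queue from the seed artist, the rest from similar artists.
// Inputs are assumed to be pre-sorted (e.g. by vibe score) so slicing keeps the best matches.
function mixRadioTracks(
  seedTracks: RadioTrack[],
  similarTracks: RadioTrack[],
  size: number,
  seedShare = 0.4
): RadioTrack[] {
  const seedCount = Math.min(seedTracks.length, Math.round(size * seedShare));
  const similarCount = Math.min(similarTracks.length, size - seedCount);
  return seedTracks.slice(0, seedCount).concat(similarTracks.slice(0, similarCount));
}
```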
### Genre Filtering Fix
Artist names (like "Jamiroquai") were incorrectly showing as genres. Fixed by:
- Fetching all artist names at query time
- Filtering out any "genre" that matches an artist name (case-insensitive)
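A minimal sketch of that filter, assuming genres and artist names are available as plain string arrays (the function name is hypothetical):

```typescript
// Removes any "genre" whose name equals an artist name, ignoring case and surrounding whitespace.
function filterOutArtistNames(genres: string[], artistNames: string[]): string[] {
  const artists = new Set(artistNames.map((name) => name.trim().toLowerCase()));
  return genres.filter((genre) => !artists.has(genre.trim().toLowerCase()));
}
```

Building the `Set` once keeps the check linear in the number of genres, which matters when the artist list is fetched on every request.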
### Bug Fix: Artist Radio "Unknown Artist" / No Image
Fixed two issues with artist radio playback:
1. **Frontend**: Removed double-transformation of tracks - backend already returns properly formatted data
2. **Backend**: Fixed `coverArt` to use `track.album.coverUrl` directly instead of conditional `lidarrAlbumId` check
---
## Session 9: Vibe Match Feature
### New Feature: "Vibe Match" Button on Media Player
Allows users to instantly create a queue of tracks that sound like the currently playing track.
| File | Change |
|------|--------|
| `backend/src/routes/library.ts` | Added `vibe` case to `/library/radio` endpoint with audio feature matching |
| `frontend/components/player/MiniPlayer.tsx` | Added Vibe button (AudioWaveform icon) with loading state |
| `frontend/components/player/FullPlayer.tsx` | Added Vibe button (AudioWaveform icon) with loading state |
### How Vibe Match Works
1. **Takes current track's audio features** (BPM, energy, valence, danceability, key, mood tags)
2. **Searches entire library** for tracks with similar audio profiles
3. **Scores matches** using weighted algorithm:
- BPM (25%) - within ±15 BPM is ideal
- Energy (25%)
- Valence/mood (20%)
- Danceability (15%)
- Key compatibility (10%)
- Mood tag overlap (5%)
4. **Falls back gracefully** if not enough audio matches:
- Same artist's other tracks
- Last.fm similar artists' tracks
- Same genre tracks
- Random library tracks
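The weighting in step 3 can be sketched as a single scoring function. Only the six weights come from the list above; the per-feature similarity formulas (the BPM falloff beyond ±15, exact key matching, and Jaccard overlap for mood tags) are assumptions chosen to make the example concrete.

```typescript
interface VibeFeatures {
  bpm: number;
  energy: number; // 0-1
  valence: number; // 0-1
  danceability: number; // 0-1
  key: string;
  keyScale: "major" | "minor";
  moodTags: string[]; // assumed to contain no duplicates
}

const WEIGHTS = { bpm: 0.25, energy: 0.25, valence: 0.2, danceability: 0.15, key: 0.1, moodTags: 0.05 };

// Similarity of two 0-1 values: 1 when equal, 0 when maximally apart.
const unitSimilarity = (a: number, b: number): number => 1 - Math.min(1, Math.abs(a - b));

function vibeMatchScore(a: VibeFeatures, b: VibeFeatures): number {
  // Assumption: full credit within 15 BPM, then a linear falloff to 0 at a 45 BPM difference.
  const bpmDiff = Math.abs(a.bpm - b.bpm);
  const bpm = bpmDiff <= 15 ? 1 : Math.max(0, 1 - (bpmDiff - 15) / 30);
  // Assumption: keys are compatible only when key and scale are identical.
  const key = a.key === b.key && a.keyScale === b.keyScale ? 1 : 0;
  // Assumption: mood tag overlap is measured with the Jaccard index.
  const shared = a.moodTags.filter((tag) => b.moodTags.indexOf(tag) !== -1).length;
  const union = a.moodTags.length + b.moodTags.length - shared;
  const tags = union === 0 ? 0 : shared / union;

  return (
    WEIGHTS.bpm * bpm +
    WEIGHTS.energy * unitSimilarity(a.energy, b.energy) +
    WEIGHTS.valence * unitSimilarity(a.valence, b.valence) +
    WEIGHTS.danceability * unitSimilarity(a.danceability, b.danceability) +
    WEIGHTS.key * key +
    WEIGHTS.moodTags * tags
  );
}
```

Because the weights sum to 1, the result stays in the 0-1 range and can be shown as a match percentage.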
### UI Location
The Vibe button (waveform icon) appears after the Repeat button in both:
- MiniPlayer (sidebar player)
- FullPlayer (bottom bar player)
Clicking it replaces the current queue with vibe-matched tracks and shows a toast notification.
---
## Session 9 (continued): Search Tracks Fix
### Bug Fix: Library Tracks Not Showing in Search
The backend was returning tracks in search results, but the frontend never displayed them.
| File | Change |
|------|--------|
| `frontend/app/search/page.tsx` | Added import for `LibraryTracksList` and section to display library tracks |
| `frontend/features/search/components/LibraryTracksList.tsx` | **New file** - Component to display library tracks in search results |
### Features of LibraryTracksList
- Shows up to 10 tracks matching the search query
- Displays cover art, title, artist, album, and duration
- Click to play (integrates with audio context)
- Currently playing track highlighted in yellow
- Artist and album names link to their respective pages
@@ -0,0 +1,396 @@
# Vibe Matching Algorithm Overhaul Plan
## Overview
This document outlines the plan to overhaul the vibe matching algorithm to use **cosine similarity** on a comprehensive feature vector that includes all 9 ML mood predictions, audio features, and genre/tag matching.
## Current State (Before Overhaul)
### What We Have
- **ML Mood Predictions (9 total):**
- `moodHappy`, `moodSad`, `moodRelaxed`, `moodAggressive` (existing)
- `moodParty`, `moodAcoustic`, `moodElectronic` (newly added)
- `danceabilityMl`, `aggressivenessMl` (existing)
- **Audio Features:**
- `bpm`, `key`, `keyScale` (major/minor)
- `energy`, `danceability`, `valence`, `arousal`
- `instrumentalness`, `acousticness`, `speechiness`
- **Metadata:**
- `lastfmTags` (JSON array of tag objects with name/count)
- `essentiaGenres` (JSON array of genre strings)
- `trackGenres` relation (linked genre records)
### Previous Algorithm (Weighted Manhattan Distance)
```typescript
// Old approach - arbitrary weights, limited features
const weights = {
energy: 1.5,
danceability: 1.2,
valence: 1.0,
arousal: 1.0,
instrumentalness: 0.8,
bpm: 0.5,
};
let score = 0;
for (const [feature, weight] of Object.entries(weights)) {
const diff = Math.abs(sourceTrack[feature] - candidateTrack[feature]);
score += diff * weight;
}
// Lower score = more similar (inverted logic)
```
**Problems with old approach:**
1. Only used 6 features, ignored all ML mood predictions
2. Arbitrary weights with no scientific basis
3. Manhattan distance is less effective for high-dimensional feature spaces
4. No genre/tag matching
5. Score inversion was confusing
---
## New Algorithm (Cosine Similarity)
### Phase 1: Database Schema Update ✅
Add new mood fields to Prisma schema:
```prisma
model Track {
// ... existing fields ...
// ML Mood Predictions (0.0-1.0)
moodHappy Float?
moodSad Float?
moodRelaxed Float?
moodAggressive Float?
moodParty Float? // NEW
moodAcoustic Float? // NEW
moodElectronic Float? // NEW
// ... rest of schema ...
}
```
**Migration command:**
```bash
cd backend
npx prisma db push --skip-generate
```
### Phase 2: Audio Analyzer Update ✅
Update `services/audio-analyzer/analyzer.py` to extract and save all 7 mood predictions:
```python
# MusiCNN mood classifiers
mood_models = {
'moodHappy': 'mood_happy-musicnn-msd-2',
'moodSad': 'mood_sad-musicnn-msd-2',
'moodRelaxed': 'mood_relaxed-musicnn-msd-2',
'moodAggressive': 'mood_aggressive-musicnn-msd-2',
'moodParty': 'mood_party-musicnn-msd-2',
'moodAcoustic': 'mood_acoustic-musicnn-msd-2',
'moodElectronic': 'mood_electronic-musicnn-msd-2',
}
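# Save all to database (SQL excerpt held in a string so the snippet is valid Python;
# UPDATE_MOODS_SQL is an illustrative name and the remaining columns are elided)
UPDATE_MOODS_SQL = """
UPDATE "Track" SET
    "moodHappy" = %s,
    "moodSad" = %s,
    "moodRelaxed" = %s,
    "moodAggressive" = %s,
    "moodParty" = %s,
    "moodAcoustic" = %s,
    "moodElectronic" = %s,
    ...
"""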
```
### Phase 3: Feature Vector Construction
Build a normalized feature vector for each track:
```typescript
interface TrackFeatures {
// ML Moods (0-1)
moodHappy: number | null;
moodSad: number | null;
moodRelaxed: number | null;
moodAggressive: number | null;
moodParty: number | null;
moodAcoustic: number | null;
moodElectronic: number | null;
// Audio Features
energy: number | null;
arousal: number | null;
danceability: number | null;
danceabilityMl: number | null;
instrumentalness: number | null;
bpm: number | null;
keyScale: string | null;
// Metadata
lastfmTags: any;
essentiaGenres: any;
}
function buildFeatureVector(track: TrackFeatures): number[] {
return [
// 7 ML Mood predictions (indices 0-6)
track.moodHappy ?? 0.5,
track.moodSad ?? 0.5,
track.moodRelaxed ?? 0.5,
track.moodAggressive ?? 0.5,
track.moodParty ?? 0.5,
track.moodAcoustic ?? 0.5,
track.moodElectronic ?? 0.5,
// Core audio features (indices 7-10)
track.energy ?? 0.5,
track.arousal ?? 0.5,
track.danceabilityMl ?? track.danceability ?? 0.5,
track.instrumentalness ?? 0.5,
// Normalized BPM (index 11)
// Maps 60-180 BPM to 0-1 range
Math.max(0, Math.min(1, ((track.bpm ?? 120) - 60) / 120)),
// Key mode (index 12)
// Major = 1, Minor = 0
track.keyScale === 'major' ? 1 : 0,
];
}
```
**Feature Vector Dimensions: 13**
### Phase 4: Cosine Similarity Calculation
```typescript
function cosineSimilarity(a: number[], b: number[]): number {
let dotProduct = 0;
let magnitudeA = 0;
let magnitudeB = 0;
for (let i = 0; i < a.length; i++) {
dotProduct += a[i] * b[i];
magnitudeA += a[i] * a[i];
magnitudeB += b[i] * b[i];
}
if (magnitudeA === 0 || magnitudeB === 0) return 0;
return dotProduct / (Math.sqrt(magnitudeA) * Math.sqrt(magnitudeB));
}
```
**Properties:**
- Returns value between -1 and 1 (for our 0-1 normalized vectors, always 0 to 1)
- 1.0 = identical vectors (perfect match)
- 0.0 = orthogonal vectors (no similarity)
- Higher = better (intuitive, no inversion needed)
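A small worked example makes these properties concrete. The `cosine` helper below is a compact restatement for illustration, and the sample vectors are made up. It also shows a consequence worth keeping in mind: cosine similarity ignores magnitude, so two vectors pointing in the same direction score 1.0 even when one is uniformly "louder" than the other.

```typescript
// Compact cosine similarity for illustration; returns 0 when either vector has zero length.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
  const norm = (v: number[]): number => Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

const quiet = [0.2, 0.4];
const loud = [0.4, 0.8]; // same direction as `quiet`, twice the magnitude

const sameDirection = cosine(quiet, loud); // approximately 1: scaling does not change the angle
const orthogonal = cosine([1, 0], [0, 1]); // 0: no shared components
const degenerate = cosine([0, 0], [1, 1]); // 0: guarded division by zero
```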
### Phase 5: Tag/Genre Bonus
Add bonus points for matching tags and genres:
```typescript
function calculateTagBonus(
sourceTrack: TrackFeatures,
candidateTrack: TrackFeatures
): number {
let bonus = 0;
// Extract tags
const sourceTags = new Set<string>();
const candidateTags = new Set<string>();
// Parse lastfmTags
if (Array.isArray(sourceTrack.lastfmTags)) {
sourceTrack.lastfmTags.forEach((t: any) => {
if (t?.name) sourceTags.add(t.name.toLowerCase());
});
}
if (Array.isArray(candidateTrack.lastfmTags)) {
candidateTrack.lastfmTags.forEach((t: any) => {
if (t?.name) candidateTags.add(t.name.toLowerCase());
});
}
// Parse essentiaGenres
if (Array.isArray(sourceTrack.essentiaGenres)) {
sourceTrack.essentiaGenres.forEach((g: string) => {
sourceTags.add(g.toLowerCase());
});
}
if (Array.isArray(candidateTrack.essentiaGenres)) {
candidateTrack.essentiaGenres.forEach((g: string) => {
candidateTags.add(g.toLowerCase());
});
}
// Count overlapping tags
let overlap = 0;
for (const tag of sourceTags) {
if (candidateTags.has(tag)) overlap++;
}
// Bonus: up to 0.1 (10%) for tag overlap
// Normalized by the smaller set size to handle varying tag counts
const minSize = Math.min(sourceTags.size, candidateTags.size);
if (minSize > 0) {
bonus = (overlap / minSize) * 0.1;
}
return bonus;
}
```
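Dividing the overlap by the smaller set size is the overlap coefficient. The sketch below isolates that calculation on plain string arrays (the `overlapBonus` name and signature are illustrative) to show one consequence: a candidate with a single tag that happens to match receives the full bonus.

```typescript
// Overlap coefficient scaled to a maximum bonus; comparison is case-insensitive.
function overlapBonus(source: string[], candidate: string[], maxBonus = 0.1): number {
  const a = new Set(source.map((tag) => tag.toLowerCase()));
  const b = new Set(candidate.map((tag) => tag.toLowerCase()));
  let overlap = 0;
  a.forEach((tag) => {
    if (b.has(tag)) overlap++;
  });
  const minSize = Math.min(a.size, b.size);
  return minSize > 0 ? (overlap / minSize) * maxBonus : 0;
}

const singleTagMatch = overlapBonus(["rock", "indie", "90s"], ["Rock"]); // 1 of 1 tags overlap: full bonus
const halfMatch = overlapBonus(["rock", "indie"], ["rock", "jazz"]); // 1 of 2 tags overlap: half bonus
const noTags = overlapBonus([], ["rock"]); // an empty side yields no bonus
```

If sparsely tagged tracks end up over-rewarded in practice, normalizing by the union (Jaccard index) instead would be one alternative to evaluate.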
### Phase 6: Final Score Calculation
```typescript
function calculateVibeScore(
sourceTrack: TrackFeatures,
candidateTrack: TrackFeatures
): number {
// Build feature vectors
const sourceVector = buildFeatureVector(sourceTrack);
const candidateVector = buildFeatureVector(candidateTrack);
// Calculate cosine similarity (0-1)
const cosineSim = cosineSimilarity(sourceVector, candidateVector);
// Add tag bonus (0-0.1)
const tagBonus = calculateTagBonus(sourceTrack, candidateTrack);
// Final score: cosine similarity + tag bonus
// Capped at 1.0
const finalScore = Math.min(1.0, cosineSim + tagBonus);
return finalScore;
}
```
### Phase 7: Integration into Radio Endpoint
Update `backend/src/routes/library.ts`:
```typescript
// In the vibe radio section
const sourceTrack = await prisma.track.findUnique({
where: { id: trackId },
select: {
moodHappy: true,
moodSad: true,
moodRelaxed: true,
moodAggressive: true,
moodParty: true,
moodAcoustic: true,
moodElectronic: true,
energy: true,
arousal: true,
danceability: true,
danceabilityMl: true,
instrumentalness: true,
bpm: true,
keyScale: true,
lastfmTags: true,
essentiaGenres: true,
},
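});
// findUnique returns null when no track matches, so fail early
if (!sourceTrack) {
  throw new Error(`Track not found: ${trackId}`);
}
// Get candidates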
const candidates = await prisma.track.findMany({
where: {
id: { not: trackId },
analysisMode: 'enhanced', // Only use tracks analyzed in Enhanced mode
},
select: { /* same fields */ },
take: 500, // Get more candidates for better matching
});
// Score all candidates
const scored = candidates.map(candidate => ({
...candidate,
vibeScore: calculateVibeScore(sourceTrack, candidate),
}));
// Sort by score (highest first)
scored.sort((a, b) => b.vibeScore - a.vibeScore);
// Take top N for the queue
const vibeQueue = scored.slice(0, limit);
// DO NOT SHUFFLE - preserve the sorted order!
```
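The score, sort, and slice steps can be exercised without a database. The following is a minimal sketch; `rankByScore` and the toy score table are hypothetical and only stand in for `calculateVibeScore()` and the Prisma results.

```typescript
// Hypothetical helper mirroring the score -> sort -> slice steps above.
interface Scored<T> {
  item: T;
  vibeScore: number;
}

function rankByScore<T>(candidates: T[], score: (c: T) => number, limit: number): Scored<T>[] {
  return candidates
    .map((item) => ({ item, vibeScore: score(item) }))
    .sort((a, b) => b.vibeScore - a.vibeScore) // highest score first
    .slice(0, limit);
}

// Toy data: ids with precomputed scores standing in for calculateVibeScore().
const toyScores: Record<string, number> = { a: 0.42, b: 0.91, c: 0.77, d: 0.15 };
const queue = rankByScore(Object.keys(toyScores), (id) => toyScores[id], 3);
console.log(queue.map((q) => q.item));
```

Because the queue order carries the ranking, any later shuffle would discard exactly the information this step produces.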
---
## Implementation Checklist
- [x] **Phase 1:** Add `moodParty`, `moodAcoustic`, `moodElectronic` to Prisma schema
- [x] **Phase 2:** Update audio analyzer to extract all 7 moods
- [x] **Phase 3:** Implement `buildFeatureVector()` function
- [x] **Phase 4:** Implement `cosineSimilarity()` function
- [x] **Phase 5:** Implement `calculateTagBonus()` function (called `computeTagBonus`)
- [x] **Phase 6:** Implement `calculateVibeScore()` combining all components
- [x] **Phase 7:** Integrate into `/library/radio` endpoint
- [ ] **Phase 8:** Update frontend to display match percentage (optional enhancement)
- [ ] **Phase 9:** Re-analyze tracks to populate new mood fields
---
## Re-Analysis Script
To populate the new mood fields for existing tracks:
```sql
-- Reset analysis status for enhanced tracks to re-run analysis
UPDATE "Track"
SET "analysisStatus" = 'pending'
WHERE "analysisMode" = 'enhanced';
```
Or use the existing script:
```bash
docker exec lidify_db psql -U lidifydb -d lidify -f /path/to/reset-analysis-for-new-moods.sql
```
---
## Expected Improvements
1. **Better Similarity Matching:** Cosine similarity compares the overall shape of two feature vectors instead of summing per-feature differences, which suits multi-dimensional mood profiles
2. **Full ML Utilization:** All 7 mood predictions now contribute to matching
3. **Genre Awareness:** Tag/genre overlap provides meaningful boost
4. **Intuitive Scores:** Higher score = better match (no inversion)
5. **Normalized Features:** All features scaled to 0-1 for fair comparison
---
## Testing Strategy
1. Pick a track with known characteristics (e.g., happy upbeat pop song)
2. Generate vibe queue
3. Verify top matches share similar mood profiles
4. Check that match percentages in UI reflect actual similarity
5. Test with various genres to ensure cross-genre matching works appropriately
---
## Files Modified
- `backend/prisma/schema.prisma` - New mood fields
- `backend/src/routes/library.ts` - New scoring algorithm
- `services/audio-analyzer/analyzer.py` - Extract all 7 moods
- `frontend/components/player/VibeOverlay.tsx` - Display all moods
- `frontend/lib/audio-state-context.tsx` - Extended AudioFeatures interface
---
## Notes
- **Gaia:** Essentia has a companion library called Gaia for large-scale similarity search using KD-trees. This is overkill for our scale (< 100k tracks) but could be considered for future scaling.
- **MusiCNN Limitations:** The model was trained on MSD (Million Song Dataset) which is pop/rock heavy. For classical/ambient music, predictions may be less reliable. We've added normalization to handle this.
- **Shuffle Interaction:** Vibe mode automatically disables shuffle to preserve the sorted order.
@@ -0,0 +1,571 @@
# Vibe Matching Implementation Plan
## Executive Summary
The current vibe matching system uses Essentia for audio analysis but only extracts **basic features**. Critical mood/emotion features are either placeholder values or poorly estimated. This document outlines a comprehensive plan to achieve Spotify-quality vibe matching while being conscious of performance on user hardware.
## Strategy Update (Latest)
**Default:** Enhanced mode (ML-powered, accurate)
**Fallback:** Standard mode (lightweight, for troubleshooting or power saving)
**Approach:**
1. ✅ Pre-package all Essentia TensorFlow models in Docker image (~200MB)
2. 🔄 Fix Enhanced mode FIRST - make it actually use the ML models
3. ⏳ THEN create Standard mode as a lightweight fallback
4. Users can toggle to Standard mode to save CPU if needed
---
## Current State Analysis
### What Essentia IS Currently Extracting (Working)
| Feature | Status | Quality |
|---------|--------|---------|
| **BPM** | ✅ Working | Good - Uses `RhythmExtractor2013` |
| **Key** | ✅ Working | Good - Uses `KeyExtractor` |
| **KeyScale** | ✅ Working | Good - major/minor detection |
| **Energy** | ✅ Working | Moderate - Raw energy normalized |
| **Loudness** | ✅ Working | Good - dB measurement |
| **Dynamic Range** | ✅ Working | Good |
| **Danceability** | ✅ Working | Good - Uses `Danceability` algorithm |
| **Beats Count** | ✅ Working | Good |
### What's Broken or Placeholder
| Feature | Status | Problem |
|---------|--------|---------|
| **Valence** | ⚠️ Fake | Calculated as `(major/minor * 0.4) + (energy * 0.6)` - NOT actual emotional valence |
| **Arousal** | ⚠️ Fake | Calculated as `(BPM * 0.5) + (energy * 0.5)` - NOT actual arousal |
| **Instrumentalness** | ❌ Placeholder | Hardcoded to `0.5` |
| **Acousticness** | ⚠️ Estimate | Rough estimate from dynamic range |
| **Speechiness** | ❌ Placeholder | Hardcoded to `0.1` |
| **Mood Tags** | ⚠️ Derived | Generated from fake valence/arousal, not ML |
| **Genre Tags** | ❌ Empty | TensorFlow models not loaded |
### The Core Issue
```python
# Current valence calculation (analyzer.py lines 226-231)
key_valence = 0.6 if scale == 'major' else 0.4
energy_valence = result['energy']
result['valence'] = round((key_valence * 0.4 + energy_valence * 0.6), 3)
```
**"Fake Happy" by Paramore** (emotionally complex, about masking sadness):
- Major key → 0.6
- High energy → ~0.7
- Calculated valence: `(0.6 * 0.4) + (0.7 * 0.6) = 0.66` (appears "happy")
**"Summer Girl" by Jamiroquai** (genuinely upbeat funk):
- Major key → 0.6
- High energy → ~0.7
- Calculated valence: `(0.6 * 0.4) + (0.7 * 0.6) = 0.66` (appears "happy")
**Result: 97% match despite being completely different vibes!**
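
The collision can be reproduced numerically. The sketch below ports the heuristic to TypeScript for illustration (the analyzer itself is Python); the energy value of 0.7 is the approximate figure used above, not a measurement, and `heuristicValence` is a hypothetical name.

```typescript
type Scale = "major" | "minor";

// Port of the heuristic valence formula: 40% key, 60% energy.
function heuristicValence(scale: Scale, energy: number): number {
  const keyValence = scale === "major" ? 0.6 : 0.4;
  const raw = keyValence * 0.4 + energy * 0.6;
  return Math.round(raw * 1000) / 1000; // 3 decimals, mirroring the analyzer's round(..., 3)
}

const fakeHappy = heuristicValence("major", 0.7);
const summerGirl = heuristicValence("major", 0.7);
console.log(fakeHappy, summerGirl, fakeHappy === summerGirl);
```

Any two major-key tracks with similar energy collapse to the same valence, so the heuristic cannot separate them no matter how different they sound.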
---
## How Spotify Does It
Spotify's audio analysis uses a combination of:
### 1. Low-Level Audio Features (Similar to what we have)
- Tempo/BPM
- Key/Mode
- Loudness
- Time signature
### 2. Mid-Level Features (We're missing these)
- **Spectral Centroid** - "brightness" of the sound
- **Spectral Rolloff** - frequency distribution
- **Zero Crossing Rate** - percussiveness
- **MFCCs** - Mel-frequency cepstral coefficients (timbral texture)
- **Chroma Features** - harmonic content
### 3. High-Level Features (We're faking these)
- **Valence** - Musical positiveness (0-1)
- **Arousal/Energy** - Intensity and activity
- **Instrumentalness** - Vocal presence prediction
- **Acousticness** - Acoustic vs electronic
- **Speechiness** - Presence of spoken words
- **Liveness** - Audience presence detection
### 4. Deep Learning Models
Spotify trains neural networks on millions of labeled tracks to predict:
- Mood categories
- Genre classification
- User preference patterns
---
## Two-Tier System
### Default: Enhanced Vibe Matching (ML-Powered)
**Status:** DEFAULT - Pre-packaged in Docker, just works
**Target:** High accuracy, ~5-10 seconds per track
**Features (from Essentia TensorFlow Models):**
1. **Mood Predictions (real ML, not estimated):**
- `mood_happy-discogs-effnet-1.pb` - Happiness/positivity 0-1
- `mood_sad-discogs-effnet-1.pb` - Sadness 0-1
- `mood_relaxed-discogs-effnet-1.pb` - Relaxation/calmness 0-1
- `mood_aggressive-discogs-effnet-1.pb` - Aggression/intensity 0-1
2. **Audio Characteristics:**
- `danceability-discogs-effnet-1.pb` - ML-based danceability
- `voice_instrumental-discogs-effnet-1.pb` - Vocal detection (instrumentalness)
3. **Embeddings for Similarity:**
- `discogs-effnet-bs64-1.pb` - Audio embeddings (neural "fingerprint")
- Can be used for direct similarity comparison
4. **Spectral Features:**
- Spectral Centroid (brightness)
- MFCCs (timbral texture - 13 coefficients)
**Models Pre-packaged:** ~200MB in Docker image (no user download)
**RAM Requirement:** ~500MB during analysis
**CPU Requirement:** Any modern CPU (2015+)
### Fallback: Standard Vibe Matching (Lightweight)
**Status:** FALLBACK - For troubleshooting or power saving
**Target:** Fast, <2 seconds per track, low CPU
**Features Used:**
- BPM (Essentia RhythmExtractor)
- Energy (Essentia Energy)
- Danceability (Essentia Danceability - non-ML version)
- Key/Scale (Essentia KeyExtractor)
- Spectral Centroid (cheap to compute)
- Last.fm mood tags
- Genre matching from tags
**When to use Standard mode:**
- Low-power devices (Raspberry Pi, older NAS)
- Troubleshooting if Enhanced mode has issues
- User preference to save CPU cycles
---
## Implementation Plan
### Phase 1: Pre-Package Models in Docker (Day 1)
#### 1.1 Update Dockerfile to Include Models
```dockerfile
# Download Essentia ML models during build (~200MB)
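# ca-certificates keeps the HTTPS downloads working even if the base image lacks CA certificates
RUN apt-get update && apt-get install -y --no-install-recommends curl ca-certificates && \
mkdir -p /app/models && \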
# Base embedding model (required for all predictions)
curl -L -o /app/models/discogs-effnet-bs64-1.pb \
"https://essentia.upf.edu/models/feature-extractors/discogs-effnet/discogs-effnet-bs64-1.pb" && \
# Mood models
curl -L -o /app/models/mood_happy-discogs-effnet-1.pb \
"https://essentia.upf.edu/models/classification-heads/mood_happy/mood_happy-discogs-effnet-1.pb" && \
curl -L -o /app/models/mood_sad-discogs-effnet-1.pb \
"https://essentia.upf.edu/models/classification-heads/mood_sad/mood_sad-discogs-effnet-1.pb" && \
curl -L -o /app/models/mood_relaxed-discogs-effnet-1.pb \
"https://essentia.upf.edu/models/classification-heads/mood_relaxed/mood_relaxed-discogs-effnet-1.pb" && \
curl -L -o /app/models/mood_aggressive-discogs-effnet-1.pb \
"https://essentia.upf.edu/models/classification-heads/mood_aggressive/mood_aggressive-discogs-effnet-1.pb" && \
# Danceability and voice/instrumental
curl -L -o /app/models/danceability-discogs-effnet-1.pb \
"https://essentia.upf.edu/models/classification-heads/danceability/danceability-discogs-effnet-1.pb" && \
curl -L -o /app/models/voice_instrumental-discogs-effnet-1.pb \
"https://essentia.upf.edu/models/classification-heads/voice_instrumental/voice_instrumental-discogs-effnet-1.pb" && \
# Arousal/Valence models
curl -L -o /app/models/arousal-discogs-effnet-1.pb \
"https://essentia.upf.edu/models/classification-heads/mood_arousal/mood_arousal-discogs-effnet-1.pb" && \
curl -L -o /app/models/valence-discogs-effnet-1.pb \
"https://essentia.upf.edu/models/classification-heads/mood_valence/mood_valence-discogs-effnet-1.pb" && \
apt-get purge -y curl && rm -rf /var/lib/apt/lists/*
```
### Phase 2: Implement Enhanced Analysis (Days 2-4)
#### 2.1 Rewrite analyzer.py with ML Models
```python
import os
from typing import Any, Dict

import numpy as np

# `logger` and `ESSENTIA_AVAILABLE` are assumed to be defined at module level in analyzer.py.


class AudioAnalyzer:
    """Enhanced audio analysis using Essentia TensorFlow models"""

    def __init__(self):
        self.models_loaded = False
        self.embedding_model = None
        self.mood_models = {}
        if ESSENTIA_AVAILABLE:
            self._init_essentia()
            self._load_ml_models()

    def _load_ml_models(self):
        """Load TensorFlow models for enhanced analysis"""
        try:
            from essentia.standard import (
                TensorflowPredictEffnetDiscogs,
                TensorflowPredict2D
            )
            # Load embedding extractor (base for all predictions)
            embedding_path = '/app/models/discogs-effnet-bs64-1.pb'
            if os.path.exists(embedding_path):
                self.embedding_model = TensorflowPredictEffnetDiscogs(
                    graphFilename=embedding_path,
                    output="PartitionedCall:1"
                )
                logger.info("Loaded embedding model")
            # Load mood prediction models
            mood_models = {
                'happy': '/app/models/mood_happy-discogs-effnet-1.pb',
                'sad': '/app/models/mood_sad-discogs-effnet-1.pb',
                'relaxed': '/app/models/mood_relaxed-discogs-effnet-1.pb',
                'aggressive': '/app/models/mood_aggressive-discogs-effnet-1.pb',
                'danceability': '/app/models/danceability-discogs-effnet-1.pb',
                'voice_instrumental': '/app/models/voice_instrumental-discogs-effnet-1.pb',
                'arousal': '/app/models/arousal-discogs-effnet-1.pb',
                'valence': '/app/models/valence-discogs-effnet-1.pb',
            }
            for name, path in mood_models.items():
                if os.path.exists(path):
                    self.mood_models[name] = TensorflowPredict2D(
                        graphFilename=path,
                        output="model/Softmax"
                    )
                    logger.info(f"Loaded {name} model")
            # The classification heads are useless without the embedding model,
            # so require both before enabling Enhanced mode.
            self.models_loaded = (
                self.embedding_model is not None and len(self.mood_models) > 0
            )
            logger.info(f"ML models loaded: {self.models_loaded} ({len(self.mood_models)} models)")
        except Exception as e:
            logger.warning(f"Could not load ML models: {e}")
            self.models_loaded = False

    def analyze(self, file_path: str) -> Dict[str, Any]:
        """Full analysis with ML models if available"""
        result = self._extract_basic_features(file_path)
        if self.models_loaded:
            ml_features = self._extract_ml_features(file_path)
            result.update(ml_features)
            result['analysisMode'] = 'enhanced'
        else:
            # Fallback to estimated values
            result.update(self._estimate_mood_features(result))
            result['analysisMode'] = 'standard'
        return result

    def _extract_ml_features(self, file_path: str) -> Dict[str, Any]:
        """Extract features using TensorFlow models"""
        result = {}
        if self.embedding_model is None:
            return result
        # Load audio at 16kHz for ML models
        audio = self.load_audio(file_path, sample_rate=16000)
        if audio is None:
            return result
        # Get embeddings
        embeddings = self.embedding_model(audio)
        # NOTE: column 1 is assumed to be the positive class for every model.
        # Class order can differ per model, so verify it against each model's metadata.
        # Mood predictions
        if 'happy' in self.mood_models:
            preds = self.mood_models['happy'](embeddings)
            result['moodHappy'] = float(np.mean(preds[:, 1]))  # Probability of "happy"
        if 'sad' in self.mood_models:
            preds = self.mood_models['sad'](embeddings)
            result['moodSad'] = float(np.mean(preds[:, 1]))
        if 'relaxed' in self.mood_models:
            preds = self.mood_models['relaxed'](embeddings)
            result['moodRelaxed'] = float(np.mean(preds[:, 1]))
        if 'aggressive' in self.mood_models:
            preds = self.mood_models['aggressive'](embeddings)
            result['moodAggressive'] = float(np.mean(preds[:, 1]))
        # Real valence and arousal from dedicated models
        if 'valence' in self.mood_models:
            preds = self.mood_models['valence'](embeddings)
            result['valence'] = float(np.mean(preds[:, 1]))
        if 'arousal' in self.mood_models:
            preds = self.mood_models['arousal'](embeddings)
            result['arousal'] = float(np.mean(preds[:, 1]))
        # Instrumentalness from voice/instrumental model
        if 'voice_instrumental' in self.mood_models:
            preds = self.mood_models['voice_instrumental'](embeddings)
            result['instrumentalness'] = float(np.mean(preds[:, 1]))  # 1 = instrumental
        # ML-based danceability
        if 'danceability' in self.mood_models:
            preds = self.mood_models['danceability'](embeddings)
            result['danceabilityMl'] = float(np.mean(preds[:, 1]))
        return result
```
### Phase 3: Update Database Schema (Day 3)
#### 3.1 Add New Feature Columns
```prisma
model Track {
// ... existing fields ...
// ML-based mood predictions (Enhanced mode)
moodHappy Float? // ML prediction 0-1
moodSad Float? // ML prediction 0-1
moodRelaxed Float? // ML prediction 0-1
moodAggressive Float? // ML prediction 0-1
danceabilityMl Float? // ML-based danceability
// Analysis metadata
analysisMode String? // 'standard' or 'enhanced'
}
```
### Phase 4: Update Vibe Matching Algorithm (Day 4)
#### 4.1 Use Real Mood Predictions in Matching
```typescript
// In library.ts - Enhanced vibe matching
const scored = analyzedTracks.map(t => {
let score = 0;
let factors = 0;
// === MOOD MATCHING (50% total - the heart of vibe) ===
// Happy mood (15%)
if (sourceTrack.moodHappy !== null && t.moodHappy !== null) {
score += (1 - Math.abs(sourceTrack.moodHappy - t.moodHappy)) * 0.15;
factors += 0.15;
}
// Sad mood (10%)
if (sourceTrack.moodSad !== null && t.moodSad !== null) {
score += (1 - Math.abs(sourceTrack.moodSad - t.moodSad)) * 0.10;
factors += 0.10;
}
// Relaxed mood (10%)
if (sourceTrack.moodRelaxed !== null && t.moodRelaxed !== null) {
score += (1 - Math.abs(sourceTrack.moodRelaxed - t.moodRelaxed)) * 0.10;
factors += 0.10;
}
// Aggressive mood (10%)
if (sourceTrack.moodAggressive !== null && t.moodAggressive !== null) {
score += (1 - Math.abs(sourceTrack.moodAggressive - t.moodAggressive)) * 0.10;
factors += 0.10;
}
// Valence - overall positivity (5%)
if (sourceTrack.valence !== null && t.valence !== null) {
score += (1 - Math.abs(sourceTrack.valence - t.valence)) * 0.05;
factors += 0.05;
}
// === AUDIO CHARACTERISTICS (35% total) ===
// BPM (15%) - linear falloff: a 15 BPM gap scores 0.5, a gap of 30+ BPM scores 0
if (sourceTrack.bpm && t.bpm) {
const bpmDiff = Math.abs(sourceTrack.bpm - t.bpm);
score += Math.max(0, 1 - bpmDiff / 30) * 0.15;
factors += 0.15;
}
// Energy (10%)
if (sourceTrack.energy !== null && t.energy !== null) {
score += (1 - Math.abs(sourceTrack.energy - t.energy)) * 0.10;
factors += 0.10;
}
// Danceability - prefer ML version (10%)
const srcDance = sourceTrack.danceabilityMl ?? sourceTrack.danceability;
const tDance = t.danceabilityMl ?? t.danceability;
if (srcDance !== null && tDance !== null) {
score += (1 - Math.abs(srcDance - tDance)) * 0.10;
factors += 0.10;
}
// === GENRE/TAGS (15% total) ===
// Genre/tag overlap (10%)
const sourceGenres = [...(sourceTrack.lastfmTags || []), ...(sourceTrack.essentiaGenres || [])];
const trackGenres = [...(t.lastfmTags || []), ...(t.essentiaGenres || [])];
if (sourceGenres.length > 0 && trackGenres.length > 0) {
const overlap = sourceGenres.filter(g => trackGenres.includes(g)).length;
const maxOverlap = Math.max(sourceGenres.length, trackGenres.length);
score += (overlap / maxOverlap) * 0.10;
factors += 0.10;
}
// Key compatibility (5%)
if (sourceTrack.keyScale && t.keyScale) {
score += (sourceTrack.keyScale === t.keyScale ? 1 : 0.5) * 0.05;
factors += 0.05;
}
const finalScore = factors > 0 ? score / factors : 0;
return { id: t.id, score: finalScore };
});
```
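The `score / factors` normalization is what keeps tracks with missing features comparable: each pair is scored only on the features both tracks have. The sketch below isolates that idea; the `WeightedFeature` shape and `weightedSimilarity` name are assumptions for illustration, not the code in `library.ts`.

```typescript
// Hypothetical shape: one comparable feature with its weight.
interface WeightedFeature {
  source: number | null;
  candidate: number | null;
  weight: number;
}

// Weighted mean of (1 - |difference|) over the features both tracks have.
function weightedSimilarity(features: WeightedFeature[]): number {
  let score = 0;
  let factors = 0;
  for (const f of features) {
    if (f.source === null || f.candidate === null) continue; // skip missing data
    score += (1 - Math.abs(f.source - f.candidate)) * f.weight;
    factors += f.weight;
  }
  return factors > 0 ? score / factors : 0;
}

const complete = weightedSimilarity([
  { source: 0.9, candidate: 0.7, weight: 0.15 }, // moodHappy
  { source: 0.2, candidate: 0.2, weight: 0.1 }, // moodSad
]);
const partial = weightedSimilarity([
  { source: 0.9, candidate: 0.7, weight: 0.15 }, // moodHappy
  { source: 0.2, candidate: null, weight: 0.1 }, // moodSad missing on the candidate
]);
console.log(complete.toFixed(3), partial.toFixed(3));
```

A side effect worth keeping in mind is that a candidate sharing only one or two features with the source can still reach a high normalized score, so a minimum on `factors` may be worth considering.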
### Phase 5: Create Standard Mode Fallback (Day 5)
After Enhanced mode is working, implement Standard mode:
- Same algorithm structure but skip ML features
- Use estimated valence (improved heuristics)
- Lower weights on mood matching since it's estimated
- Higher weights on BPM, energy, genre tags
### Phase 6: Settings & UI (Day 6)
#### 6.1 Add Settings Toggle
```typescript
// System settings - Enhanced is DEFAULT
{
audioAnalysis: {
vibeMatchingMode: 'enhanced' | 'standard', // Default: 'enhanced'
reanalyzeOnModeChange: boolean, // Default: false
}
}
```
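One way to apply these defaults when reading stored settings is sketched below, under the assumption that settings arrive as untyped JSON. `resolveAnalysisSettings` and `DEFAULT_AUDIO_ANALYSIS` are hypothetical names, not existing Lidify code.

```typescript
type VibeMatchingMode = "enhanced" | "standard";

interface AudioAnalysisSettings {
  vibeMatchingMode: VibeMatchingMode;
  reanalyzeOnModeChange: boolean;
}

// Defaults taken from the settings shape above.
const DEFAULT_AUDIO_ANALYSIS: AudioAnalysisSettings = {
  vibeMatchingMode: "enhanced",
  reanalyzeOnModeChange: false,
};

// Hypothetical helper: fall back to the defaults for missing or invalid values.
function resolveAnalysisSettings(raw: unknown): AudioAnalysisSettings {
  const input = (typeof raw === "object" && raw !== null ? raw : {}) as Record<string, unknown>;
  const mode = input.vibeMatchingMode;
  const reanalyze = input.reanalyzeOnModeChange;
  return {
    vibeMatchingMode:
      mode === "enhanced" || mode === "standard" ? mode : DEFAULT_AUDIO_ANALYSIS.vibeMatchingMode,
    reanalyzeOnModeChange:
      typeof reanalyze === "boolean" ? reanalyze : DEFAULT_AUDIO_ANALYSIS.reanalyzeOnModeChange,
  };
}

const fromEmpty = resolveAnalysisSettings({});
const fromUser = resolveAnalysisSettings({ vibeMatchingMode: "standard", reanalyzeOnModeChange: true });
const fromBad = resolveAnalysisSettings({ vibeMatchingMode: "turbo" });
console.log(fromEmpty, fromUser, fromBad);
```

Validating the mode on read means an unknown or corrupted value silently falls back to Enhanced, which matches the "Enhanced is DEFAULT" intent.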
#### 6.2 Settings UI
```
Audio Analysis
├── Vibe Matching Mode
│ ├── ● Enhanced (Recommended - Default)
│ │ └── Uses ML models for accurate mood detection
│ └── ○ Standard (Power Saver)
│ └── Faster, uses basic audio features only
├── Analysis Status
│ └── "1,234 / 1,500 tracks analyzed (Enhanced mode)"
└── [Re-analyze Library] button
└── "Re-analyze all tracks with current settings"
```
### Phase 7: Testing & Validation (Day 7)
#### 7.1 Test Cases
| Source Track | Bad Match (Current) | Expected Good Match |
|--------------|---------------------|---------------------|
| "Fake Happy" (Paramore) | "Summer Girl" (Jamiroquai), currently 97% | Other emo/pop-punk; "Summer Girl" should fall below 50% |
| "Creep" (Radiohead) | Fast dance track | Other melancholic rock |
| "Uptown Funk" | Slow ballad | Other high-energy funk/pop |
#### 7.2 Performance Testing
- Analyze 100 tracks, measure time
- Memory usage during analysis
- Queue handling under load
---
## Database Schema Updates
```prisma
model Track {
// ... existing fields ...
// ML-based mood predictions (Enhanced mode)
moodHappy Float? // ML prediction 0-1
moodSad Float? // ML prediction 0-1
moodRelaxed Float? // ML prediction 0-1
moodAggressive Float? // ML prediction 0-1
danceabilityMl Float? // ML-based danceability
// Analysis metadata
analysisMode String? // 'standard' or 'enhanced'
}
```
---
## Performance Benchmarks (Estimated)
| Operation | Standard Mode | Enhanced Mode |
|-----------|---------------|---------------|
| Analysis per track | 1-2 sec | 5-10 sec |
| RAM usage | ~100MB | ~500MB |
| Models in Docker | N/A | ~200MB (pre-packaged) |
| Vibe match query | <100ms | <100ms |
| Full library (1000 tracks) | ~30 min | ~2-3 hours |
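
The full-library row follows roughly from the per-track row. The sketch below does that arithmetic, assuming tracks are analyzed one at a time with no queue or I/O overhead; `estimateMinutes` is a hypothetical helper.

```typescript
// Estimate total analysis time in minutes from a per-track range in seconds.
// Assumes strictly sequential processing with no overhead between tracks.
function estimateMinutes(tracks: number, secPerTrack: [number, number]): [number, number] {
  const [lo, hi] = secPerTrack;
  return [Math.round((tracks * lo) / 60), Math.round((tracks * hi) / 60)];
}

const standard = estimateMinutes(1000, [1, 2]);
const enhanced = estimateMinutes(1000, [5, 10]);
console.log(`standard: ${standard[0]}-${standard[1]} min`);
console.log(`enhanced: ${enhanced[0]}-${enhanced[1]} min`);
```

Pure per-track arithmetic puts Enhanced mode at roughly 1.4 to 2.8 hours for 1000 tracks, so the table's ~2-3 hours sits at the upper end and presumably leaves room for overhead.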
---
## Files to Modify
| File | Changes |
|------|---------|
| `services/audio-analyzer/Dockerfile` | Add model downloads during build |
| `services/audio-analyzer/analyzer.py` | Implement ML model loading and prediction |
| `backend/prisma/schema.prisma` | Add mood prediction columns |
| `backend/src/routes/library.ts` | Update vibe matching algorithm weights |
| `frontend/features/settings/` | Add analysis mode toggle (default: enhanced) |
| `frontend/components/player/VibeGraph.tsx` | Display mood predictions |
---
## Success Metrics
After implementation, "Fake Happy" and "Summer Girl" should:
- Match at **<50%** (different emotional content, different genre)
Better matches for "Fake Happy" would be:
- Other Paramore songs (same artist = genre/production match)
- Emo/pop-punk with similar emotional complexity
- Songs with high energy but mixed emotional signals
---
## Implementation Order (Enhanced First)
### Week 1: Get Enhanced Mode Working
1. [x] Create implementation plan (this document)
2. [x] Update Dockerfile to pre-package ML models (~200MB)
3. [x] Rewrite analyzer.py with TensorFlow model loading
4. [x] Add new database columns for mood predictions (moodHappy, moodSad, etc.)
5. [x] Update vibe matching algorithm with ML mood weights
6. [x] Update programmatic playlists to use ML mood predictions
7. [ ] Run Prisma migration to apply schema changes
8. [ ] Rebuild audio-analyzer Docker container
9. [ ] Test ML analysis on sample tracks
### Week 2: Polish & Fallback
10. [ ] Test accuracy with diverse track pairs
11. [ ] Add settings UI (Enhanced = default)
12. [ ] Implement Standard mode as explicit fallback option
13. [ ] Update VibeGraph to show mood predictions
14. [ ] Documentation and testing
---
## Quick Reference: Models to Include
| Model | File | Purpose | Size |
|-------|------|---------|------|
| Embeddings | `discogs-effnet-bs64-1.pb` | Base model for all predictions | ~85MB |
| Happy | `mood_happy-discogs-effnet-1.pb` | Happiness detection | ~15MB |
| Sad | `mood_sad-discogs-effnet-1.pb` | Sadness detection | ~15MB |
| Relaxed | `mood_relaxed-discogs-effnet-1.pb` | Relaxation detection | ~15MB |
| Aggressive | `mood_aggressive-discogs-effnet-1.pb` | Aggression detection | ~15MB |
| Arousal | `mood_arousal-discogs-effnet-1.pb` (saved as `arousal-discogs-effnet-1.pb`) | Energy/calm scale | ~15MB |
| Valence | `mood_valence-discogs-effnet-1.pb` (saved as `valence-discogs-effnet-1.pb`) | Positive/negative | ~15MB |
| Danceability | `danceability-discogs-effnet-1.pb` | ML danceability | ~15MB |
| Voice/Instrumental | `voice_instrumental-discogs-effnet-1.pb` | Vocal detection | ~15MB |
**Total:** ~200MB (one-time addition to Docker image)
@@ -0,0 +1,132 @@
#!/usr/bin/env bash
set -euo pipefail
# One-command predeploy test runner for Lidify.
#
# What it does:
# - Starts a clean docker compose stack (core services only)
# - Runs backend API smoke tests
# - Runs frontend Playwright E2E smoke tests
# - Optionally tears the stack down
#
# Requirements:
# - Docker + docker compose plugin
# - Node 18+ and npm available (the health checks below rely on the built-in fetch)
# - A MUSIC_PATH that contains at least one track if you want playback/playlist tests to pass
#
# Environment variables:
# - LIDIFY_UI_BASE_URL (default: http://127.0.0.1:3030)
# - LIDIFY_API_BASE_URL (default: http://127.0.0.1:3006)
# - LIDIFY_TEST_USERNAME (default: predeploy)
# - LIDIFY_TEST_PASSWORD (default: predeploy-password)
# - LIDIFY_COMPOSE_FILE (default: docker-compose.yml)
# - LIDIFY_COMPOSE_PROJECT (default: lidify_predeploy_<timestamp>)
# - LIDIFY_TEARDOWN (default: 1) set to 0 to keep containers running
ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
COMPOSE_FILE="${LIDIFY_COMPOSE_FILE:-docker-compose.yml}"
UI_BASE_URL="${LIDIFY_UI_BASE_URL:-http://127.0.0.1:3030}"
API_BASE_URL="${LIDIFY_API_BASE_URL:-http://127.0.0.1:3006}"
TEARDOWN="${LIDIFY_TEARDOWN:-1}"
PROJECT="${LIDIFY_COMPOSE_PROJECT:-lidify_predeploy_$(date +%Y%m%d_%H%M%S)}"
cd "$ROOT_DIR"
echo "[predeploy] project=$PROJECT"
echo "[predeploy] compose=$COMPOSE_FILE"
echo "[predeploy] ui=$UI_BASE_URL"
echo "[predeploy] api=$API_BASE_URL"
if ! command -v docker >/dev/null 2>&1; then
echo "[predeploy] ERROR: docker is not installed or not in PATH"
exit 1
fi
if ! docker compose version >/dev/null 2>&1; then
echo "[predeploy] ERROR: docker compose plugin not available (try: docker --version, docker compose version)"
exit 1
fi
cleanup() {
if [ "$TEARDOWN" = "1" ]; then
echo "[predeploy] tearing down docker compose stack..."
docker compose -p "$PROJECT" -f "$COMPOSE_FILE" down -v
else
echo "[predeploy] teardown disabled (LIDIFY_TEARDOWN=0) - leaving containers running"
fi
}
trap cleanup EXIT
echo "[predeploy] starting docker compose (core services only)..."
docker compose -p "$PROJECT" -f "$COMPOSE_FILE" up -d postgres redis backend frontend
echo "[predeploy] waiting for backend health..."
node - <<'NODE'
const base = (process.env.LIDIFY_API_BASE_URL || "http://127.0.0.1:3006").replace(/\/$/, "");
const timeoutMs = 120000;
const start = Date.now();
async function sleep(ms){ return new Promise(r=>setTimeout(r, ms)); }
(async () => {
while (Date.now() - start < timeoutMs) {
try {
const res = await fetch(`${base}/health`);
if (res.ok) process.exit(0);
} catch {}
await sleep(1000);
}
console.error(`Backend did not become healthy at ${base}/health within ${timeoutMs}ms`);
process.exit(1);
})();
NODE
echo "[predeploy] waiting for frontend health..."
node - <<'NODE'
const base = (process.env.LIDIFY_UI_BASE_URL || "http://127.0.0.1:3030").replace(/\/$/, "");
const timeoutMs = 120000;
const start = Date.now();
async function sleep(ms){ return new Promise(r=>setTimeout(r, ms)); }
(async () => {
while (Date.now() - start < timeoutMs) {
try {
const res = await fetch(`${base}/health`);
if (res.ok) process.exit(0);
} catch {}
await sleep(1000);
}
console.error(`Frontend did not become healthy at ${base}/health within ${timeoutMs}ms`);
process.exit(1);
})();
NODE
echo "[predeploy] running backend API smoke tests..."
(cd backend && \
LIDIFY_API_BASE_URL="$API_BASE_URL" \
LIDIFY_TEST_USERNAME="${LIDIFY_TEST_USERNAME:-predeploy}" \
LIDIFY_TEST_PASSWORD="${LIDIFY_TEST_PASSWORD:-predeploy-password}" \
npm run test:smoke)
echo "[predeploy] ensuring Playwright browser is installed..."
(cd frontend && npx playwright install chromium)
echo "[predeploy] running frontend E2E smoke tests..."
(cd frontend && \
LIDIFY_UI_BASE_URL="$UI_BASE_URL" \
LIDIFY_TEST_USERNAME="${LIDIFY_TEST_USERNAME:-predeploy}" \
LIDIFY_TEST_PASSWORD="${LIDIFY_TEST_PASSWORD:-predeploy-password}" \
npm run test:e2e)
echo "[predeploy] PASS"
@@ -0,0 +1,17 @@
-- Reset all enhanced tracks for re-analysis to populate new mood fields
-- (moodParty, moodAcoustic, moodElectronic)
-- Option 1: Reset only enhanced tracks (faster - already have ML models loaded)
UPDATE "Track"
SET
"analysisStatus" = 'pending',
"moodParty" = NULL,
"moodAcoustic" = NULL,
"moodElectronic" = NULL
WHERE "analysisMode" = 'enhanced';
-- Check how many tracks will be re-analyzed
SELECT COUNT(*) as tracks_to_reanalyze FROM "Track" WHERE "analysisStatus" = 'pending';
@@ -0,0 +1,222 @@
/**
* Lidify predeploy smoke tests (API-level).
*
* Goals:
* - deterministic, fast "is the app basically working?" checks
* - no build step (runs via tsx)
*
* Usage:
* LIDIFY_API_BASE_URL=http://127.0.0.1:3006 \
* LIDIFY_TEST_USERNAME=predeploy \
* LIDIFY_TEST_PASSWORD=predeploy-password \
* npm run test:smoke
*/
type Json = any;
const API_BASE_URL = (process.env.LIDIFY_API_BASE_URL || "http://127.0.0.1:3006").replace(/\/$/, "");
const USERNAME = process.env.LIDIFY_TEST_USERNAME || "predeploy";
const PASSWORD = process.env.LIDIFY_TEST_PASSWORD || "predeploy-password";
const WAIT_MS = Number(process.env.LIDIFY_SMOKE_WAIT_MS || "60000"); // total budget
const POLL_INTERVAL_MS = Number(process.env.LIDIFY_SMOKE_POLL_INTERVAL_MS || "1000");
function sleep(ms: number) {
return new Promise((r) => setTimeout(r, ms));
}
function assert(condition: any, message: string): asserts condition {
if (!condition) throw new Error(message);
}
async function fetchJson(
path: string,
opts: RequestInit & { token?: string } = {}
): Promise<{ status: number; ok: boolean; json: Json }> {
const url = `${API_BASE_URL}${path}`;
const headers: Record<string, string> = {
"Content-Type": "application/json",
...(opts.headers as any),
};
if (opts.token) headers.Authorization = `Bearer ${opts.token}`;
const res = await fetch(url, { ...opts, headers });
const json = await res.json().catch(() => ({}));
return { status: res.status, ok: res.ok, json };
}
async function waitForHealth() {
const start = Date.now();
let lastErr: any = null;
while (Date.now() - start < WAIT_MS) {
try {
const res = await fetch(`${API_BASE_URL}/health`);
if (res.ok) return;
lastErr = new Error(`health returned ${res.status}`);
} catch (e) {
lastErr = e;
}
await sleep(POLL_INTERVAL_MS);
}
throw new Error(
`Backend did not become healthy at ${API_BASE_URL}/health within ${WAIT_MS}ms. Last error: ${
lastErr instanceof Error ? lastErr.message : String(lastErr)
}`
);
}
async function ensureTestUserAndToken(): Promise<string> {
// Prefer onboarding/register because it's available without admin and works even when users exist.
const register = await fetchJson("/api/onboarding/register", {
method: "POST",
body: JSON.stringify({ username: USERNAME, password: PASSWORD }),
});
if (register.ok && register.json?.token) {
return register.json.token as string;
}
// If user already exists, login.
const login = await fetchJson("/api/auth/login", {
method: "POST",
body: JSON.stringify({ username: USERNAME, password: PASSWORD }),
});
assert(login.ok, `Login failed: status=${login.status} body=${JSON.stringify(login.json)}`);
assert(login.json?.token, `Login did not return token: ${JSON.stringify(login.json)}`);
return login.json.token as string;
}
async function completeOnboarding(token: string) {
const res = await fetchJson("/api/onboarding/complete", {
method: "POST",
token,
});
// It's fine if it's already complete; endpoint should still succeed.
assert(res.ok, `Onboarding complete failed: status=${res.status} body=${JSON.stringify(res.json)}`);
}
async function getOneTrackId(token: string): Promise<string | null> {
const tracks = await fetchJson("/api/library/tracks?limit=1&offset=0", { method: "GET", token });
assert(tracks.ok, `Fetch tracks failed: status=${tracks.status} body=${JSON.stringify(tracks.json)}`);
const id = tracks.json?.tracks?.[0]?.id;
return typeof id === "string" ? id : null;
}
async function scanLibraryIfNeeded(token: string) {
// If you already have at least one track, don't force a scan (keeps it fast).
const existing = await getOneTrackId(token);
if (existing) return;
const scan = await fetchJson("/api/library/scan", { method: "POST", token });
assert(scan.ok, `Library scan start failed: status=${scan.status} body=${JSON.stringify(scan.json)}`);
const jobId = scan.json?.jobId;
assert(typeof jobId === "string", `Library scan did not return jobId: ${JSON.stringify(scan.json)}`);
const start = Date.now();
while (Date.now() - start < WAIT_MS) {
const status = await fetchJson(`/api/library/scan/status/${jobId}`, { method: "GET", token });
assert(status.ok, `Library scan status failed: status=${status.status} body=${JSON.stringify(status.json)}`);
const s = status.json?.status;
if (s === "completed" || s === "complete" || s === "done" || s === "success") return;
if (s === "failed" || s === "error") {
throw new Error(`Library scan failed: ${JSON.stringify(status.json)}`);
}
await sleep(POLL_INTERVAL_MS);
}
throw new Error(`Library scan did not complete within ${WAIT_MS}ms (jobId=${jobId}).`);
}
async function playlistsCrud(token: string) {
// Needs at least one track.
const trackId = await getOneTrackId(token);
assert(
trackId,
`No tracks found. Set MUSIC_PATH to a library with at least one track, or run a scan before testing.`
);
const created = await fetchJson("/api/playlists", {
method: "POST",
token,
body: JSON.stringify({ name: `predeploy-smoke-${Date.now()}`, isPublic: false }),
});
assert(created.ok, `Create playlist failed: status=${created.status} body=${JSON.stringify(created.json)}`);
const playlistId = created.json?.id;
assert(typeof playlistId === "string", `Create playlist missing id: ${JSON.stringify(created.json)}`);
const add = await fetchJson(`/api/playlists/${playlistId}/items`, {
method: "POST",
token,
body: JSON.stringify({ trackId }),
});
assert(add.ok, `Add track to playlist failed: status=${add.status} body=${JSON.stringify(add.json)}`);
const del = await fetchJson(`/api/playlists/${playlistId}`, { method: "DELETE", token });
assert(del.ok, `Delete playlist failed: status=${del.status} body=${JSON.stringify(del.json)}`);
}
async function playbackStateRoundTrip(token: string) {
const trackId = await getOneTrackId(token);
assert(
trackId,
`No tracks found. Set MUSIC_PATH to a library with at least one track, or run a scan before testing.`
);
const payload = {
playbackType: "track",
trackId,
queue: [{ id: trackId }],
currentIndex: 0,
isShuffle: false,
};
const save = await fetchJson("/api/playback-state", {
method: "POST",
token,
body: JSON.stringify(payload),
});
assert(save.ok, `Save playback state failed: status=${save.status} body=${JSON.stringify(save.json)}`);
const got = await fetchJson("/api/playback-state", { method: "GET", token });
assert(got.ok, `Get playback state failed: status=${got.status} body=${JSON.stringify(got.json)}`);
}
async function main() {
const started = Date.now();
console.log(`[smoke] API_BASE_URL=${API_BASE_URL}`);
await waitForHealth();
console.log("[smoke] health ok");
const token = await ensureTestUserAndToken();
console.log(`[smoke] got token for user=${USERNAME}`);
await completeOnboarding(token);
console.log("[smoke] onboarding marked complete");
await scanLibraryIfNeeded(token);
console.log("[smoke] library ready");
await playlistsCrud(token);
console.log("[smoke] playlists CRUD ok");
await playbackStateRoundTrip(token);
console.log("[smoke] playback-state roundtrip ok");
console.log(`[smoke] PASS in ${Date.now() - started}ms`);
}
main().catch((err) => {
console.error("[smoke] FAIL", err);
process.exit(1);
});
+97
View File
@@ -0,0 +1,97 @@
/**
* Check if tracks have Enhanced vibe analysis data
*/
import { prisma } from "../utils/db";
async function check() {
// Get a sample of tracks with their analysis data
const tracks = await prisma.track.findMany({
take: 10,
select: {
title: true,
album: { select: { artist: { select: { name: true } } } },
analysisMode: true,
moodHappy: true,
moodSad: true,
moodRelaxed: true,
moodAggressive: true,
danceabilityMl: true,
valence: true,
arousal: true,
energy: true,
bpm: true,
moodTags: true,
},
where: {
bpm: { not: null }
}
});
console.log('Sample tracks with analysis data:');
for (const t of tracks) {
console.log(`\n${t.album?.artist?.name} - ${t.title}`);
console.log(` analysisMode: ${t.analysisMode || 'NOT SET (legacy)'}`);
console.log(` ML moods: happy=${t.moodHappy}, sad=${t.moodSad}, relaxed=${t.moodRelaxed}, aggressive=${t.moodAggressive}`);
console.log(` danceabilityMl: ${t.danceabilityMl}`);
console.log(` valence: ${t.valence}, arousal: ${t.arousal}`);
console.log(` energy: ${t.energy}, bpm: ${t.bpm}`);
console.log(` moodTags: ${t.moodTags?.join(', ') || 'none'}`);
}
// Count tracks with enhanced analysis
const enhancedCount = await prisma.track.count({ where: { analysisMode: 'enhanced' } });
const standardCount = await prisma.track.count({ where: { analysisMode: 'standard' } });
const noModeCount = await prisma.track.count({ where: { analysisMode: null, bpm: { not: null } } });
const totalAnalyzed = await prisma.track.count({ where: { bpm: { not: null } } });
// Count tracks with ML mood data
const withMoodHappy = await prisma.track.count({ where: { moodHappy: { not: null } } });
console.log(`\n--- Analysis Mode Stats ---`);
console.log(`Enhanced: ${enhancedCount}`);
console.log(`Standard: ${standardCount}`);
console.log(`No mode (legacy): ${noModeCount}`);
console.log(`Total analyzed: ${totalAnalyzed}`);
console.log(`With ML mood data: ${withMoodHappy}`);
// Check specific songs the user mentioned
console.log(`\n--- Checking specific songs ---`);
const specificSongs = await prisma.track.findMany({
where: {
OR: [
{ title: { contains: "I Love You", mode: "insensitive" } },
{ title: { contains: "Roots", mode: "insensitive" } },
{ title: { contains: "Alright", mode: "insensitive" } },
]
},
select: {
title: true,
album: { select: { artist: { select: { name: true } } } },
analysisMode: true,
moodHappy: true,
moodSad: true,
moodRelaxed: true,
moodAggressive: true,
valence: true,
arousal: true,
energy: true,
bpm: true,
danceability: true,
moodTags: true,
}
});
for (const t of specificSongs) {
console.log(`\n${t.album?.artist?.name} - ${t.title}`);
console.log(` analysisMode: ${t.analysisMode || 'NOT SET (legacy)'}`);
console.log(` ML moods: happy=${t.moodHappy}, sad=${t.moodSad}, relaxed=${t.moodRelaxed}, aggressive=${t.moodAggressive}`);
console.log(` valence: ${t.valence}, arousal: ${t.arousal}`);
console.log(` energy: ${t.energy}, bpm: ${t.bpm}, dance: ${t.danceability}`);
console.log(` moodTags: ${t.moodTags?.join(', ') || 'none'}`);
}
await prisma.$disconnect();
}
check().catch(console.error);
+550
View File
@@ -0,0 +1,550 @@
# Lidify Vibe Matching System - Research Review Document
## Executive Summary
This document provides a complete overview of Lidify's audio-based music recommendation ("vibe matching") system for research review. The system uses ML-based audio analysis to find similar songs based on how they *sound*, not metadata or collaborative filtering.
---
## Sample Results (Live Terminal Output)
### Example 1: Piano Music ("I Love You" by RIOPY)
```
SOURCE: "I Love You" by RIOPY
Album: RIOPY
Analysis Mode: enhanced
BPM: 91.3 | Energy: 0.28 | Valence: 0.53
Danceability: 0.96 | Arousal: 0.52 | Key: major
ML Moods: Happy=0.91, Sad=0.65, Relaxed=1.00, Aggressive=0.99
Mood Tags: sad, dance, chill, melancholic, relaxed, uplifting, aggressive, intense, groovy, happy
TOP MATCHES (by cosine similarity):
# | TRACK | ARTIST | BPM | ENG | VAL | H | S | R | A
----|--------------------------------|------------------|------|------|------|------|------|------|------
1 | Minimal Game | RIOPY | 84 | 0.25 | 0.51 | 0.70 | 0.20 | 0.80 | 0.76
2 | Lullaby | RIOPY | 82 | 0.28 | 0.54 | 0.75 | 0.20 | 0.80 | 0.76
3 | Joy | RIOPY | 97 | 0.34 | 0.57 | 0.98 | 0.58 | 1.00 | 0.99
4 | Introspective (From Home) | Dirk Maassen | 94 | 0.32 | 0.55 | 0.79 | 0.20 | 0.80 | 0.80
5 | Sweet dream | RIOPY | 91 | 0.28 | 0.48 | 0.64 | 0.20 | 0.80 | 0.77
6 | Sense of hope | RIOPY | 99 | 0.25 | 0.53 | 0.74 | 0.20 | 0.80 | 0.78
7 | Drive | RIOPY | 96 | 0.44 | 0.55 | 0.78 | 0.20 | 0.80 | 0.78
8 | Air (From Home) | Dirk Maassen | 81 | 0.14 | 0.56 | 0.79 | 0.20 | 0.80 | 0.76
9 | Prelude | Muse | 85 | 0.39 | 0.40 | 0.68 | 0.70 | 0.96 | 1.00
10 | Towards the Sun | Dirk Maassen | 117 | 0.25 | 0.49 | 0.66 | 0.20 | 0.80 | 0.80
```
**Observation:** The piano source matches mostly other RIOPY tracks plus fellow piano composer Dirk Maassen; the only other artist is Muse ("Prelude", rank 9).
---
### Example 2: Alt-Rock ("You and I" by Pvris)
```
SOURCE: "You and I" by Pvris
Album: White Noise
Analysis Mode: enhanced
BPM: 101.9 | Energy: 0.57 | Valence: 0.50
Danceability: 1.00 | Arousal: 0.44 | Key: major
ML Moods: Happy=0.49, Sad=0.31, Relaxed=0.44, Aggressive=0.68
Mood Tags: intense, dance, aggressive, groovy
TOP MATCHES:
# | TRACK | ARTIST | BPM | ENG | VAL | H | S | R | A
----|--------------------------------|------------------|------|------|------|------|------|------|------
1 | Tether | CHVRCHES | 120 | 0.52 | 0.47 | 0.43 | 0.28 | 0.50 | 0.69
2 | By The Throat (Live) | CHVRCHES | 118 | 0.50 | 0.52 | 0.37 | 0.20 | 0.34 | 0.72
3 | Separate | Pvris | 90 | 0.64 | 0.52 | 0.49 | 0.26 | 0.40 | 0.85
4 | Strong Hand (Live) | CHVRCHES | 80 | 0.58 | 0.60 | 0.55 | 0.34 | 0.34 | 0.74
5 | Stay Gold | Pvris | 100 | 0.72 | 0.57 | 0.47 | 0.25 | 0.35 | 0.80
6 | I Like The Devil | Purity Ring | 100 | 0.65 | 0.54 | 0.60 | 0.31 | 0.43 | 0.92
7 | Madness (Live) | Muse | 92 | 0.78 | 0.62 | 0.77 | 0.52 | 0.57 | 0.77
```
**Observation:** The synth-pop/alt-rock source matches other Pvris tracks and similar-sounding artists (CHVRCHES, Purity Ring); a Muse live track appears at rank 7.
---
### Example 3: Rock ("Supermassive Black Hole" by Muse)
```
SOURCE: "Supermassive Black Hole" by Muse
Album: HAARP
Analysis Mode: enhanced
BPM: 120.1 | Energy: 0.67 | Valence: 0.56
Danceability: 1.00 | Arousal: 0.42 | Key: minor
ML Moods: Happy=0.72, Sad=0.64, Relaxed=0.16, Aggressive=0.22
Mood Tags: sad, dance, melancholic, uplifting, groovy, happy
TOP MATCHES:
# | TRACK | ARTIST | BPM | ENG | VAL | H | S | R | A
----|--------------------------------|------------------|------|------|------|------|------|------|------
1 | Supermassive Black Hole (Live) | Muse | 120 | 0.75 | 0.56 | 0.76 | 0.58 | 0.06 | 0.04
2 | Thought Contagion (Live) | Muse | 140 | 0.76 | 0.57 | 0.77 | 0.52 | 0.08 | 0.09
3 | Let Them In | Pvris | 146 | 0.64 | 0.62 | 0.67 | 0.50 | 0.22 | 0.22
4 | Panic Station (Live) | Muse | 105 | 0.69 | 0.47 | 0.61 | 0.61 | 0.02 | 0.03
5 | Smoke | Pvris | 150 | 0.57 | 0.56 | 0.64 | 0.66 | 0.20 | 0.30
6 | Animals | Muse | 113 | 0.82 | 0.55 | 0.79 | 0.59 | 0.24 | 0.21
```
**Observation:** Rock music correctly matches with other Muse tracks and similar-sounding rock/alt artists.
---
## System Architecture
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ AUDIO ANALYSIS PIPELINE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌─────────────────────────────────────────────────┐ │
│ │ Audio File │────►│ Essentia Audio Processing │ │
│ │ (.flac/.mp3)│ │ │ │
│ └─────────────┘ │ • FFT/Spectral Analysis │ │
│ │ • Beat/Tempo Detection │ │
│ │ • Key/Scale Detection │ │
│ │ • RMS Energy Calculation │ │
│ └─────────────┬────────────────────────────────────┘ │
│ │ │
│ ┌─────────────▼────────────────────────────────────┐ │
│ │ MusiCNN (TensorFlow Model) │ │
│ │ │ │
│ │ Input: 16kHz mono audio │ │
│ │ Output: 200-dimensional embeddings │ │
│ │ Architecture: Convolutional Neural Network │ │
│ │ Training: Million Song Dataset (MSD) │ │
│ └─────────────┬────────────────────────────────────┘ │
│ │ │
│ ┌────────────────────────┼────────────────────────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ Mood Happy │ │ Mood Sad │ ... │ Danceability │ │
│ │ Classifier │ │ Classifier │ │ Classifier │ │
│ │ (Softmax) │ │ (Softmax) │ │ (Softmax) │ │
│ └────────┬────────┘ └────────┬────────┘ └────────┬────────┘ │
│ │ │ │ │
│ └──────────────────────┼────────────────────────────┘ │
│ │ │
│ ┌───────────▼───────────┐ │
│ │ DERIVED FEATURES │ │
│ │ │ │
│ │ Valence = f(happy, party, sad) │
│ │ Arousal = f(aggressive, party, electronic, │
│ │ relaxed, acoustic) │
│ └───────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────┐
│ VIBE MATCHING ALGORITHM │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ 1. Build Feature Vector (13 dimensions): │
│ [moodHappy, moodSad, moodRelaxed, moodAggressive, moodParty, │
│ moodAcoustic, moodElectronic, energy, arousal, danceability, │
│ instrumentalness, normalizedBPM, keyMode] │
│ │
│ 2. Compute Cosine Similarity: │
│ Σ(aᵢ × bᵢ) │
│ cos(θ) = ───────────────────── │
│ √(Σaᵢ²) × √(Σbᵢ²) │
│ │
│ 3. Add Tag/Genre Bonus (max 5%): │
│ Shared-tag overlap on lastfmTags + essentiaGenres (0.01 per tag) │
│ │
│ 4. Final Score = 0.95 × cosineSim + tagBonus │
│ │
│ 5. Filter threshold: 40% (Enhanced) or 50% (Standard) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
```
---
## Data Schema (What We Store Per Track)
### Database Schema (PostgreSQL + Prisma)
```prisma
// Track table audio analysis columns
model Track {
  // Basic Info
  id       String @id
  title    String
  albumId  String
  duration Int    // seconds
  filePath String // relative path to audio file

  // === RHYTHM ANALYSIS (Essentia) ===
  bpm        Float? // beats per minute (60-200 typical)
  beatsCount Int?   // total beats in track

  // === TONALITY (Essentia) ===
  key         String? // musical key ("C", "F#", "Bb", etc.)
  keyScale    String? // "major" or "minor"
  keyStrength Float?  // confidence 0-1

  // === ENERGY & DYNAMICS (Essentia) ===
  energy       Float? // overall energy 0-1 (RMS-based)
  loudness     Float? // average loudness in dB
  dynamicRange Float? // dynamic range in dB

  // === BASIC AUDIO FEATURES ===
  danceability Float? // 0-1 how suitable for dancing
  valence      Float? // 0 (sad) to 1 (happy) - DERIVED
  arousal      Float? // 0 (calm) to 1 (energetic) - DERIVED

  // === INSTRUMENTATION ===
  instrumentalness Float? // 0-1 (1 = no vocals) - ML predicted
  acousticness     Float? // 0-1 (1 = acoustic)
  speechiness      Float? // 0-1 (1 = spoken word)

  // === ML MOOD PREDICTIONS (Enhanced Mode) ===
  // These are the core ML outputs from MusiCNN classifiers
  moodHappy      Float? // ML prediction 0-1 (probability of happy)
  moodSad        Float? // ML prediction 0-1 (probability of sad)
  moodRelaxed    Float? // ML prediction 0-1 (probability of relaxed)
  moodAggressive Float? // ML prediction 0-1 (probability of aggressive)
  moodParty      Float? // ML prediction 0-1 (probability of party/upbeat)
  moodAcoustic   Float? // ML prediction 0-1 (probability of acoustic)
  moodElectronic Float? // ML prediction 0-1 (probability of electronic)
  danceabilityMl Float? // ML-based danceability (more accurate)

  // === DERIVED TAGS ===
  moodTags       String[] // ["aggressive", "happy", "chill", "workout"]
  essentiaGenres String[] // ["rock", "electronic", "jazz"]
  lastfmTags     String[] // ["chill", "workout", "sad", "90s"]

  // === ANALYSIS METADATA ===
  analysisStatus  String  // pending, processing, completed, failed
  analysisMode    String? // 'standard' or 'enhanced'
  analysisVersion String? // Essentia version used
  analyzedAt      DateTime?
}
```
---
## Core Algorithm: Feature Extraction (Python)
### analyzer.py - ML Feature Extraction
```python
from typing import Any, Dict

import numpy as np


def _extract_ml_features(self, audio_16k) -> Dict[str, Any]:
    """
    Extract features using Essentia MusiCNN + classification heads.

    Architecture:
    1. TensorflowPredictMusiCNN extracts embeddings from audio
    2. TensorflowPredict2D classification heads output predictions
    """
    result = {}

    # Step 1: Get embeddings from base MusiCNN model
    # Output shape: [frames, 200] - 200-dimensional embedding per frame
    embeddings = self.musicnn_model(audio_16k)

    # Step 2: Pass embeddings through classification heads
    # Each head outputs [frames, 2] where [:, 1] is probability of positive class
    # Collect raw predictions
    if 'mood_happy' in self.prediction_models:
        preds = self.prediction_models['mood_happy'](embeddings)
        result['moodHappy'] = float(np.mean(preds[:, 1]))
    if 'mood_sad' in self.prediction_models:
        preds = self.prediction_models['mood_sad'](embeddings)
        result['moodSad'] = float(np.mean(preds[:, 1]))
    if 'mood_relaxed' in self.prediction_models:
        preds = self.prediction_models['mood_relaxed'](embeddings)
        result['moodRelaxed'] = float(np.mean(preds[:, 1]))
    if 'mood_aggressive' in self.prediction_models:
        preds = self.prediction_models['mood_aggressive'](embeddings)
        result['moodAggressive'] = float(np.mean(preds[:, 1]))
    if 'mood_party' in self.prediction_models:
        preds = self.prediction_models['mood_party'](embeddings)
        result['moodParty'] = float(np.mean(preds[:, 1]))
    if 'mood_acoustic' in self.prediction_models:
        preds = self.prediction_models['mood_acoustic'](embeddings)
        result['moodAcoustic'] = float(np.mean(preds[:, 1]))
    if 'mood_electronic' in self.prediction_models:
        preds = self.prediction_models['mood_electronic'](embeddings)
        result['moodElectronic'] = float(np.mean(preds[:, 1]))

    # === VALENCE (derived from mood models) ===
    # Valence = emotional positivity: happy/party vs sad
    happy = result.get('moodHappy', 0.5)
    sad = result.get('moodSad', 0.5)
    party = result.get('moodParty', 0.5)
    result['valence'] = round(happy * 0.5 + party * 0.3 + (1 - sad) * 0.2, 3)

    # === AROUSAL (derived from mood models) ===
    # Arousal = energy level: aggressive/party/electronic vs relaxed/acoustic
    aggressive = result.get('moodAggressive', 0.5)
    relaxed = result.get('moodRelaxed', 0.5)
    acoustic = result.get('moodAcoustic', 0.5)
    electronic = result.get('moodElectronic', 0.5)
    result['arousal'] = round(
        aggressive * 0.35 +
        party * 0.25 +
        electronic * 0.2 +
        (1 - relaxed) * 0.1 +
        (1 - acoustic) * 0.1,
        3
    )
    return result
```
---
## Core Algorithm: Cosine Similarity Matching (TypeScript)
### library.ts - Vibe Matching Implementation
```typescript
// === COSINE SIMILARITY SCORING ===
// Industry-standard approach: build feature vectors, compute cosine similarity
// Uses ALL 13 features for comprehensive matching
// Helper: Build normalized feature vector from track
const buildFeatureVector = (track: TrackFeatures): number[] => {
return [
// ML Mood predictions (7 features) - 0.5 default for missing
track.moodHappy ?? 0.5,
track.moodSad ?? 0.5,
track.moodRelaxed ?? 0.5,
track.moodAggressive ?? 0.5,
track.moodParty ?? 0.5,
track.moodAcoustic ?? 0.5,
track.moodElectronic ?? 0.5,
// Audio features (5 features)
track.energy ?? 0.5,
track.arousal ?? 0.5,
track.danceabilityMl ?? track.danceability ?? 0.5,
track.instrumentalness ?? 0.5,
// BPM normalized to 0-1 (60-180 BPM range)
Math.max(0, Math.min(1, ((track.bpm ?? 120) - 60) / 120)),
// Key: major=1, minor=0, unknown=0.5
track.keyScale === 'major' ? 1 : track.keyScale === 'minor' ? 0 : 0.5,
];
};
// Helper: Compute cosine similarity between two vectors
const cosineSimilarity = (a: number[], b: number[]): number => {
let dot = 0, magA = 0, magB = 0;
for (let i = 0; i < a.length; i++) {
dot += a[i] * b[i];
magA += a[i] * a[i];
magB += b[i] * b[i];
}
if (magA === 0 || magB === 0) return 0;
return dot / (Math.sqrt(magA) * Math.sqrt(magB));
};
// Helper: Compute tag overlap bonus
const computeTagBonus = (
sourceTags: string[],
sourceGenres: string[],
trackTags: string[],
trackGenres: string[]
): number => {
const sourceSet = new Set([...sourceTags, ...sourceGenres].map(t => t.toLowerCase()));
const trackSet = new Set([...trackTags, ...trackGenres].map(t => t.toLowerCase()));
if (sourceSet.size === 0 || trackSet.size === 0) return 0;
const overlap = [...sourceSet].filter(tag => trackSet.has(tag)).length;
// Max 5% bonus for tag overlap
return Math.min(0.05, overlap * 0.01);
};
// Score all candidate tracks
const scored = analyzedTracks.map(t => {
const targetVector = buildFeatureVector(t);
// Compute base cosine similarity
const score = cosineSimilarity(sourceVector, targetVector);
// Add tag/genre overlap bonus (max 5%)
const tagBonus = computeTagBonus(
sourceTrack.lastfmTags || [],
sourceTrack.essentiaGenres || [],
t.lastfmTags || [],
t.essentiaGenres || []
);
// Final score: 95% cosine similarity + 5% tag bonus
const finalScore = score * 0.95 + tagBonus;
return { id: t.id, score: finalScore };
});
// Filter to good matches (>40% for Enhanced, >50% for Standard)
const minThreshold = isEnhancedAnalysis ? 0.40 : 0.50;
const goodMatches = scored
.filter(t => t.score > minThreshold)
.sort((a, b) => b.score - a.score);
```
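To make the scoring pipeline above concrete, here is a minimal self-contained sketch that runs it on toy data. The `ToyTrack` shape, the three-element vectors, and the names `cosine`, `sharedTagBonus`, and `rankByVibe` are illustrative assumptions, not part of `library.ts`; only the 0.95 weighting, the 0.01-per-tag bonus capped at 0.05, and the threshold filter are taken from the excerpt.

```typescript
// Hypothetical, simplified track shape (not the real TrackFeatures type).
interface ToyTrack {
  id: string;
  vector: number[]; // stand-in for the 13-dimensional feature vector
  tags: string[];   // stand-in for lastfmTags + essentiaGenres combined
}

// Cosine similarity with a guard for zero-magnitude vectors.
const cosine = (a: number[], b: number[]): number => {
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  if (magA === 0 || magB === 0) return 0;
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
};

// 0.01 per shared (case-insensitive) tag, capped at 0.05.
const sharedTagBonus = (sourceTags: string[], trackTags: string[]): number => {
  const sourceSet = new Set(sourceTags.map(t => t.toLowerCase()));
  const trackSet = new Set(trackTags.map(t => t.toLowerCase()));
  const overlap = Array.from(sourceSet).filter(tag => trackSet.has(tag)).length;
  return Math.min(0.05, overlap * 0.01);
};

// Score every candidate, drop those at or below the threshold, best first.
const rankByVibe = (source: ToyTrack, candidates: ToyTrack[], minThreshold: number) =>
  candidates
    .map(c => ({
      id: c.id,
      score: cosine(source.vector, c.vector) * 0.95 + sharedTagBonus(source.tags, c.tags),
    }))
    .filter(c => c.score > minThreshold)
    .sort((a, b) => b.score - a.score);

const toySource: ToyTrack = { id: "src", vector: [0.9, 0.1, 0.8], tags: ["piano", "chill"] };
const toyCandidates: ToyTrack[] = [
  { id: "far", vector: [0.0, 1.0, 0.0], tags: [] },
  { id: "near", vector: [0.8, 0.2, 0.7], tags: ["Piano"] },
];

const ranked = rankByVibe(toySource, toyCandidates, 0.4);
console.log(ranked); // only "near" clears the 0.4 threshold
```

With these inputs the "far" candidate points in a nearly orthogonal direction and is filtered out, while "near" keeps a high cosine score and picks up a 0.01 bonus for the shared "piano" tag.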
---
## Feature Vector Breakdown
| Index | Feature | Range | Description | Weight Rationale |
|-------|---------|-------|-------------|------------------|
| 0 | moodHappy | 0-1 | ML probability of happy mood | Core mood dimension |
| 1 | moodSad | 0-1 | ML probability of sad mood | Core mood dimension |
| 2 | moodRelaxed | 0-1 | ML probability of relaxed mood | Core mood dimension |
| 3 | moodAggressive | 0-1 | ML probability of aggressive mood | Core mood dimension |
| 4 | moodParty | 0-1 | ML probability of party/upbeat | Core mood dimension |
| 5 | moodAcoustic | 0-1 | ML probability of acoustic sound | Instrumentation |
| 6 | moodElectronic | 0-1 | ML probability of electronic sound | Instrumentation |
| 7 | energy | 0-1 | RMS-based energy level | Audio characteristic |
| 8 | arousal | 0-1 | Derived energy/intensity | Composite dimension |
| 9 | danceability | 0-1 | ML or Essentia danceability | Rhythm characteristic |
| 10 | instrumentalness | 0-1 | Voice/instrumental ML detection | Instrumentation |
| 11 | normalizedBPM | 0-1 | (bpm - 60) / 120 | Tempo matching |
| 12 | keyMode | 0/0.5/1 | minor/unknown/major | Tonality |
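
The last two rows of the table are the only features that need encoding before they fit the 0-1 scale. The sketch below isolates that encoding so the clamping behaviour is easy to see; the helper names `normalizeBpm` and `encodeKeyMode` are hypothetical, while the formula and the 120 BPM default mirror the `buildFeatureVector` excerpt.

```typescript
// Map BPM onto 0-1 over the 60-180 range, clamping anything outside it.
// Missing BPM falls back to 120, which lands at the midpoint (0.5).
const normalizeBpm = (bpm: number | null | undefined): number =>
  Math.max(0, Math.min(1, ((bpm ?? 120) - 60) / 120));

// Encode tonality: major = 1, minor = 0, anything else (unknown) = 0.5.
const encodeKeyMode = (keyScale: string | null | undefined): number =>
  keyScale === "major" ? 1 : keyScale === "minor" ? 0 : 0.5;

console.log(normalizeBpm(60));   // 0
console.log(normalizeBpm(120));  // 0.5
console.log(normalizeBpm(180));  // 1
console.log(normalizeBpm(200));  // 1 (clamped)
console.log(normalizeBpm(null)); // 0.5 (default)
console.log(encodeKeyMode("minor")); // 0
```

One consequence of the clamp is that any tempo above 180 BPM is indistinguishable from 180, and any tempo below 60 from 60.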
---
## Valence & Arousal Derivation
The classifier heads used here do not output valence or arousal directly, so we derive both from the mood predictions:
### Valence (Emotional Positivity)
```python
valence = moodHappy * 0.5 + moodParty * 0.3 + (1 - moodSad) * 0.2
```
**Rationale:**
- Happy mood is the strongest positive indicator (50% weight)
- Party/upbeat suggests positive energy (30% weight)
- Low sadness contributes to positivity (20% weight)
### Arousal (Energy Level)
```python
arousal = moodAggressive * 0.35 + moodParty * 0.25 + moodElectronic * 0.2
+ (1 - moodRelaxed) * 0.1 + (1 - moodAcoustic) * 0.1
```
**Rationale:**
- Aggressive music is high-energy (35% weight)
- Party music has high arousal (25% weight)
- Electronic music tends to be energetic (20% weight)
- Low relaxation indicates higher energy (10% weight)
- Non-acoustic sound suggests higher energy (10% weight)
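
As a worked example of the two formulas, the sketch below computes valence and arousal for a made-up calm, acoustic track. The `MoodScores` type, the function names, and the input values are assumptions chosen for illustration; the weights and the 3-decimal rounding follow the formulas and the Python excerpt above.

```typescript
// Hypothetical container for the seven classifier outputs (each 0-1).
interface MoodScores {
  happy: number;
  sad: number;
  relaxed: number;
  aggressive: number;
  party: number;
  acoustic: number;
  electronic: number;
}

// Round to 3 decimals, matching the round(..., 3) in the Python code.
const round3 = (x: number): number => Math.round(x * 1000) / 1000;

const deriveValence = (m: MoodScores): number =>
  round3(m.happy * 0.5 + m.party * 0.3 + (1 - m.sad) * 0.2);

const deriveArousal = (m: MoodScores): number =>
  round3(
    m.aggressive * 0.35 +
      m.party * 0.25 +
      m.electronic * 0.2 +
      (1 - m.relaxed) * 0.1 +
      (1 - m.acoustic) * 0.1
  );

// Made-up scores for a calm acoustic piece.
const calmTrack: MoodScores = {
  happy: 0.6, sad: 0.4, relaxed: 0.9, aggressive: 0.1,
  party: 0.2, acoustic: 0.9, electronic: 0.1,
};

console.log(deriveValence(calmTrack)); // 0.48  = 0.30 + 0.06 + 0.12
console.log(deriveArousal(calmTrack)); // 0.125 = 0.035 + 0.05 + 0.02 + 0.01 + 0.01
```

Because each set of weights sums to 1 and every input lies in 0-1, both derived values also stay within 0-1.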
---
## Known Limitations & Edge Cases
### 1. Out-of-Distribution Audio
MusiCNN was trained on the Million Song Dataset (mostly pop/rock). For genres outside this distribution (classical, ambient, piano), the model sometimes outputs high values for ALL mood dimensions.
**Detection & Normalization:**
```python
core_moods = ['moodHappy', 'moodSad', 'moodRelaxed', 'moodAggressive']
core_values = [raw_moods[m][0] for m in core_moods if m in raw_moods]
if len(core_values) >= 4:
    min_mood = min(core_values)
    max_mood = max(core_values)
    # If all core moods are > 0.7 AND the range is small,
    # the predictions are likely unreliable (out-of-distribution audio).
    # The range must also be non-zero, or the rescale below divides by zero.
    if min_mood > 0.7 and 0 < (max_mood - min_mood) < 0.3:
        # Normalize: scale so max becomes 0.8 and min becomes 0.2
        for mood_key in core_moods:
            old_val = raw_moods[mood_key][0]
            normalized = 0.2 + (old_val - min_mood) / (max_mood - min_mood) * 0.6
            raw_moods[mood_key] = normalized
```
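
The same rule can be sketched in TypeScript to show which inputs trigger it. This is an illustration only: the `CoreMoods` type and `normalizeSaturatedMoods` name are hypothetical, and the moods are held as plain numbers rather than the indexed `raw_moods` entries used by the analyzer. The thresholds (all values above 0.7, spread below 0.3) and the 0.2-0.8 target range come from the Python block.

```typescript
// Hypothetical plain-number view of the four core mood predictions.
type CoreMoods = {
  moodHappy: number;
  moodSad: number;
  moodRelaxed: number;
  moodAggressive: number;
};

const normalizeSaturatedMoods = (moods: CoreMoods): CoreMoods => {
  const values = [moods.moodHappy, moods.moodSad, moods.moodRelaxed, moods.moodAggressive];
  const min = Math.min(...values);
  const max = Math.max(...values);
  const spread = max - min;
  // Only rescale when every mood is high and tightly clustered;
  // a zero spread is left alone to avoid dividing by zero.
  if (!(min > 0.7 && spread > 0 && spread < 0.3)) return moods;
  const rescale = (v: number): number => 0.2 + ((v - min) / spread) * 0.6;
  return {
    moodHappy: rescale(moods.moodHappy),
    moodSad: rescale(moods.moodSad),
    moodRelaxed: rescale(moods.moodRelaxed),
    moodAggressive: rescale(moods.moodAggressive),
  };
};

// Made-up saturated prediction: all four moods high and close together.
const saturated: CoreMoods = { moodHappy: 0.9, moodSad: 0.75, moodRelaxed: 0.95, moodAggressive: 0.92 };
console.log(normalizeSaturatedMoods(saturated)); // min (sad) -> 0.2, max (relaxed) -> 0.8

// Values from Example 1's source track: Sad=0.65 is not above 0.7, so nothing changes.
const example1: CoreMoods = { moodHappy: 0.91, moodSad: 0.65, moodRelaxed: 1.0, moodAggressive: 0.99 };
console.log(normalizeSaturatedMoods(example1) === example1); // true
```

Note that the rule requires all four core moods to exceed 0.7, so a track with three saturated moods and one moderate value passes through untouched.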
### 2. Standard Mode Fallback
When ML models aren't available, heuristic estimates are used:
| Feature | Heuristic Formula |
|---------|-------------------|
| Valence | key_valence * 0.4 + bpm_valence * 0.25 + brightness * 0.2 + energy * 0.15 |
| Arousal | bpm_arousal * 0.35 + energy * 0.35 + brightness * 0.15 + compression * 0.15 |
| Instrumentalness | spectral_flatness * 0.6 + zcr_instrumental * 0.4 |
| Acousticness | dynamic_range / 12 |
### 3. Feature Vector Missing Values
Missing values default to 0.5 (neutral) to prevent bias:
```typescript
track.moodHappy ?? 0.5
```
---
## Open Questions for Review
1. **Feature Weighting:** Currently all 13 features have equal weight in cosine similarity. Should mood features (indices 0-6) have higher weight than audio features?
2. **Threshold Selection:** We use a 40% similarity threshold for Enhanced mode (50% for Standard). Is this too permissive? Too restrictive?
3. **Valence/Arousal Derivation:** Our formulas for deriving valence/arousal from mood predictions are hand-tuned. Are the weights reasonable?
4. **BPM Normalization:** We normalize BPM over the 60-180 range. Should we use octave-aware BPM (treating 60 and 120 as similar)?
5. **Cross-Genre Matching:** The algorithm matches based on audio similarity regardless of genre. Should genre matching have more weight?
6. **Cold Start:** Tracks with missing analysis fall back to 0.5 for all features. Should they be excluded from matching?
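
For question 4, one possible way to make tempo comparison octave-aware is to measure distance on a log2 scale and fold whole octaves away. The sketch below is a hypothetical illustration of that idea, not the project's implementation; the `octaveBpmDistance` name and the choice to return 1 for non-positive tempos are assumptions.

```typescript
// Distance between two tempos in 0..1, where half-time and double-time
// count as identical (0) and the worst case is half an octave apart (1).
const octaveBpmDistance = (a: number, b: number): number => {
  if (a <= 0 || b <= 0) return 1;               // assumption: invalid tempo is maximally distant
  const octaves = Math.abs(Math.log2(a / b));   // tempo ratio expressed in octaves
  const frac = octaves - Math.floor(octaves);   // position within a single octave, 0..1
  return Math.min(frac, 1 - frac) * 2;          // fold around the octave and scale to 0..1
};

console.log(octaveBpmDistance(60, 120));  // 0 (double-time)
console.log(octaveBpmDistance(120, 120)); // 0 (same tempo)
console.log(octaveBpmDistance(100, 120).toFixed(2)); // "0.53"
```

Under this measure 60 BPM and 120 BPM are treated as the same pulse, whereas the linear `(bpm - 60) / 120` normalization places them half the scale apart.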
---
## Dependencies
### Python (Audio Analyzer)
```
essentia==2.1b6.dev1110
essentia-tensorflow==2.1b6.dev1110
numpy>=1.21.0,<2.0.0
tensorflow==2.15.0
redis>=4.5.0
psycopg2-binary>=2.9.0
```
### MusiCNN Models (Essentia Model Zoo)
- `msd-musicnn-1.pb` - Base embedding model (~3MB)
- `mood_happy-msd-musicnn-1.pb` - Happy classifier
- `mood_sad-msd-musicnn-1.pb` - Sad classifier
- `mood_relaxed-msd-musicnn-1.pb` - Relaxed classifier
- `mood_aggressive-msd-musicnn-1.pb` - Aggressive classifier
- `mood_party-msd-musicnn-1.pb` - Party classifier
- `mood_acoustic-msd-musicnn-1.pb` - Acoustic classifier
- `mood_electronic-msd-musicnn-1.pb` - Electronic classifier
- `danceability-msd-musicnn-1.pb` - Danceability classifier
- `voice_instrumental-msd-musicnn-1.pb` - Voice/instrumental classifier
---
## References
- [Essentia TensorFlow Documentation](https://essentia.upf.edu/machine_learning.html)
- [MusiCNN Paper (Pons et al.)](https://arxiv.org/abs/1711.02520)
- [Essentia Model Zoo](https://essentia.upf.edu/models/)
- [Million Song Dataset](http://millionsongdataset.com/)
---
## File Locations
| Component | Path |
|-----------|------|
| Audio Analyzer | `services/audio-analyzer/analyzer.py` |
| Vibe Matching | `backend/src/routes/library.ts` (lines 3293-3580) |
| Database Schema | `backend/prisma/schema.prisma` |
| Standard Mode Docs | `docs/implementation-summaries/audio-analysis-standard-mode/README.md` |
| Enhanced Mode Docs | `docs/implementation-summaries/audio-analysis-standard-mode/ENHANCED_MODE.md` |
| Algorithm Overview | `docs/implementation-summaries/vibe-matching-overhaul/README.md` |
+651
View File
@@ -0,0 +1,651 @@
# Lidify Vibe System Documentation
This document provides comprehensive documentation of the Vibe System - how Lidify analyzes tracks, collects audio metrics, and compares them for vibe matching. Use this as a reference for building frontend interfaces.
---
## Table of Contents
1. [Overview](#overview)
2. [Metrics Collected](#metrics-collected)
3. [Data Structures](#data-structures)
4. [Vibe Matching Algorithm](#vibe-matching-algorithm)
5. [API Endpoints](#api-endpoints)
6. [Frontend Integration Guide](#frontend-integration-guide)
7. [Existing Components Reference](#existing-components-reference)
---
## Overview
The Vibe System uses a combination of **audio signal analysis** and **ML-based mood prediction** to understand the "feel" of a track. It operates in two modes:
| Mode | Description | Accuracy |
|------|-------------|----------|
| **Standard** | Heuristic-based analysis using audio signal features (BPM, key, energy) | Good |
| **Enhanced** | ML-based analysis using MusiCNN neural network for mood prediction | Best |
The system enables:
- Finding tracks with similar vibes to a source track
- Generating mood-based playlists
- Visualizing track characteristics in real-time
---
## Metrics Collected
### Core Audio Features (Always Available)
These are extracted directly from audio signal analysis at 44.1kHz:
| Metric | Type | Range | Description |
|--------|------|-------|-------------|
| `bpm` | Float | 60-200 | Tempo in beats per minute |
| `beatsCount` | Int | 0+ | Total number of beats detected |
| `key` | String | "C", "F#", etc. | Musical key |
| `keyScale` | String | "major" \| "minor" | Major or minor tonality |
| `keyStrength` | Float | 0-1 | Confidence of key detection |
| `energy` | Float | 0-1 | RMS-based intensity level |
| `loudness` | Float | dB | Average loudness |
| `dynamicRange` | Float | dB | Difference between quietest and loudest |
| `danceability` | Float | 0-1 | Rhythm regularity and groove potential |
### ML Mood Predictions (Enhanced Mode)
Seven core mood dimensions predicted by the MusiCNN model:
| Metric | Type | Range | Description | Icon Suggestion |
|--------|------|-------|-------------|-----------------|
| `moodHappy` | Float | 0-1 | Happiness/cheerfulness probability | Smile |
| `moodSad` | Float | 0-1 | Sadness/melancholy probability | Frown |
| `moodRelaxed` | Float | 0-1 | Calm/peaceful probability | Coffee |
| `moodAggressive` | Float | 0-1 | Intensity/aggression probability | Flame |
| `moodParty` | Float | 0-1 | Upbeat/party probability | PartyPopper |
| `moodAcoustic` | Float | 0-1 | Acoustic instrumentation probability | Guitar |
| `moodElectronic` | Float | 0-1 | Electronic/synthetic probability | Radio |
### Derived Features (Computed)
These are calculated from the ML predictions:
#### Valence (Emotional Positivity)
```typescript
// Formula:
valence = (
moodHappy * 0.5 + // Happy mood (50% weight)
moodParty * 0.3 + // Party mood (30% weight)
(1 - moodSad) * 0.2 // Inverse of sadness (20% weight)
)
```
| Value | Interpretation |
|-------|----------------|
| 0.0 - 0.3 | Melancholic, sad |
| 0.3 - 0.6 | Neutral, balanced |
| 0.6 - 1.0 | Happy, positive |
#### Arousal (Energy/Excitement Level)
```typescript
// Formula:
arousal = (
moodAggressive * 0.35 + // Aggressive mood (35% weight)
moodParty * 0.25 + // Party mood (25% weight)
moodElectronic * 0.2 + // Electronic sound (20% weight)
(1 - moodRelaxed) * 0.1 + // Inverse of relaxation (10% weight)
(1 - moodAcoustic) * 0.1 // Inverse of acoustic (10% weight)
)
```
| Value | Interpretation |
|-------|----------------|
| 0.0 - 0.3 | Calm, peaceful |
| 0.3 - 0.6 | Moderate energy |
| 0.6 - 1.0 | High energy, intense |
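
For a frontend, the two interpretation tables can be turned into a small lookup. The sketch below is one way to do that; the `Band` type, `toBand`, and `describeVibe` are hypothetical names, and because the table ranges share their endpoints, it assumes a value exactly on a boundary (0.3 or 0.6) belongs to the higher band.

```typescript
type Band = "low" | "mid" | "high";

// Assumption: boundaries 0.3 and 0.6 fall into the higher band.
const toBand = (value: number): Band =>
  value < 0.3 ? "low" : value < 0.6 ? "mid" : "high";

// Labels copied from the Valence and Arousal interpretation tables.
const VALENCE_LABELS: Record<Band, string> = {
  low: "Melancholic, sad",
  mid: "Neutral, balanced",
  high: "Happy, positive",
};

const AROUSAL_LABELS: Record<Band, string> = {
  low: "Calm, peaceful",
  mid: "Moderate energy",
  high: "High energy, intense",
};

const describeVibe = (valence: number, arousal: number) => ({
  valence: VALENCE_LABELS[toBand(valence)],
  arousal: AROUSAL_LABELS[toBand(arousal)],
});

console.log(describeVibe(0.75, 0.2));
// { valence: "Happy, positive", arousal: "Calm, peaceful" }
```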
### Additional Features
| Metric | Type | Range | Description |
|--------|------|-------|-------------|
| `instrumentalness` | Float | 0-1 | Voice presence (0=vocal, 1=instrumental) |
| `acousticness` | Float | 0-1 | Acoustic vs. processed sound |
| `speechiness` | Float | 0-1 | Spoken word detection |
| `danceabilityMl` | Float | 0-1 | ML-based danceability (more accurate) |
### Metadata & Tags
| Field | Type | Description |
|-------|------|-------------|
| `moodTags` | String[] | Derived mood labels (e.g., ["chill", "happy"]) |
| `essentiaGenres` | String[] | ML-predicted genres (e.g., ["rock", "electronic"]) |
| `lastfmTags` | String[] | User-generated tags from Last.fm |
| `analysisStatus` | String | "pending" \| "processing" \| "completed" \| "failed" |
| `analysisMode` | String | "standard" \| "enhanced" |
| `analyzedAt` | DateTime | When analysis was performed |
---
## Data Structures
### TypeScript Interface
```typescript
interface AudioFeatures {
// Core audio features
bpm?: number | null;
beatsCount?: number | null;
key?: string | null;
keyScale?: string | null;
keyStrength?: number | null;
energy?: number | null;
loudness?: number | null;
dynamicRange?: number | null;
danceability?: number | null;
// Derived features
valence?: number | null;
arousal?: number | null;
// Additional features
instrumentalness?: number | null;
acousticness?: number | null;
speechiness?: number | null;
danceabilityMl?: number | null;
// ML Mood predictions (Enhanced mode)
moodHappy?: number | null;
moodSad?: number | null;
moodRelaxed?: number | null;
moodAggressive?: number | null;
moodParty?: number | null;
moodAcoustic?: number | null;
moodElectronic?: number | null;
// Metadata
analysisStatus?: string | null;
analysisMode?: string | null;
analyzedAt?: string | null;
// Tags
moodTags?: string[];
essentiaGenres?: string[];
lastfmTags?: string[];
}
```
### Feature Display Configuration
Recommended configuration for displaying features in UI:
```typescript
const FEATURE_CONFIG = [
{
key: "energy",
label: "Energy",
icon: "Zap", // lucide-react icon
min: 0,
max: 1,
lowLabel: "Calm",
highLabel: "Intense",
},
{
key: "valence",
label: "Mood",
icon: "Heart",
min: 0,
max: 1,
lowLabel: "Melancholic",
highLabel: "Happy",
},
{
key: "danceability",
label: "Groove",
icon: "Footprints",
min: 0,
max: 1,
lowLabel: "Freeform",
highLabel: "Danceable",
},
{
key: "bpm",
label: "Tempo",
icon: "Gauge",
min: 60,
max: 180,
lowLabel: "Slow",
highLabel: "Fast",
unit: "BPM",
},
{
key: "arousal",
label: "Arousal",
icon: "AudioWaveform",
min: 0,
max: 1,
lowLabel: "Peaceful",
highLabel: "Energetic",
},
];
const ML_MOOD_CONFIG = [
{ key: "moodHappy", label: "Happy", icon: "Smile", color: "yellow-400" },
{ key: "moodSad", label: "Sad", icon: "Frown", color: "blue-400" },
{ key: "moodRelaxed", label: "Relaxed", icon: "Coffee", color: "green-400" },
{ key: "moodAggressive", label: "Aggressive", icon: "Flame", color: "red-400" },
{ key: "moodParty", label: "Party", icon: "PartyPopper", color: "pink-400" },
{ key: "moodAcoustic", label: "Acoustic", icon: "Guitar", color: "amber-400" },
{ key: "moodElectronic", label: "Electronic", icon: "Radio", color: "purple-400" },
];
```
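
Since `bpm` uses a 60-180 range while the other features use 0-1, a chart needs every value on a common scale. The sketch below shows one way to use the `min`/`max` fields for that; `FeatureRange` and `toUnitScale` are hypothetical names, and returning `null` for missing values is an assumption about how the UI wants to treat unanalyzed tracks.

```typescript
// Minimal subset of a FEATURE_CONFIG entry needed for scaling.
interface FeatureRange {
  key: string;
  min: number;
  max: number;
}

// Scale a raw feature value to 0-1 using its configured range, clamping outliers.
// Returns null when the value is missing so the UI can render an empty state.
const toUnitScale = (value: number | null | undefined, cfg: FeatureRange): number | null => {
  if (value == null) return null;
  const t = (value - cfg.min) / (cfg.max - cfg.min);
  return Math.max(0, Math.min(1, t));
};

const bpmRange: FeatureRange = { key: "bpm", min: 60, max: 180 };
const energyRange: FeatureRange = { key: "energy", min: 0, max: 1 };

console.log(toUnitScale(120, bpmRange));     // 0.5
console.log(toUnitScale(200, bpmRange));     // 1 (clamped)
console.log(toUnitScale(0.28, energyRange)); // 0.28
console.log(toUnitScale(null, energyRange)); // null
```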
---
## Vibe Matching Algorithm
### Feature Vector Construction
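The system builds a **13-dimensional feature vector** for each track: seven ML mood predictions (weighted 1.3x), four audio features, one tempo term, and one valence term. Missing mood, energy, danceability, and instrumentalness values fall back to a neutral 0.5, and a missing tempo falls back to 120 BPM: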
```typescript
const buildFeatureVector = (track: AudioFeatures): number[] => [
  // ML mood predictions (7 features), weighted 1.3x for semantic importance
  getMoodValue(track.moodHappy, 0.5) * 1.3,
  getMoodValue(track.moodSad, 0.5) * 1.3,
  getMoodValue(track.moodRelaxed, 0.5) * 1.3,
  getMoodValue(track.moodAggressive, 0.5) * 1.3,
  getMoodValue(track.moodParty, 0.5) * 1.3,
  getMoodValue(track.moodAcoustic, 0.5) * 1.3,
  getMoodValue(track.moodElectronic, 0.5) * 1.3,
  // Audio features (4 features)
  track.energy ?? 0.5,
  calculateEnhancedArousal(track),
  track.danceabilityMl ?? track.danceability ?? 0.5,
  track.instrumentalness ?? 0.5,
  // Tempo (1 feature): octave-aware closeness to a 120 BPM reference
  1 - octaveAwareBPMDistance(track.bpm ?? 120, 120),
  // Valence (1 feature)
  calculateEnhancedValence(track),
];
// Helper: return the mood value, or the fallback when it is null or undefined
const getMoodValue = (value: number | null | undefined, fallback: number) =>
  value ?? fallback;
```
### Cosine Similarity Calculation
Tracks are compared using cosine similarity:
```typescript
const cosineSimilarity = (vectorA: number[], vectorB: number[]): number => {
  let dotProduct = 0;
  let magA = 0;
  let magB = 0;
  for (let i = 0; i < vectorA.length; i++) {
    dotProduct += vectorA[i] * vectorB[i];
    magA += vectorA[i] * vectorA[i];
    magB += vectorB[i] * vectorB[i];
  }
  const denominator = Math.sqrt(magA) * Math.sqrt(magB);
  // A zero-magnitude vector has no direction; return 0 instead of NaN
  return denominator === 0 ? 0 : dotProduct / denominator;
};
```
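To illustrate how the score behaves, the sketch below re-declares a compact cosine similarity (named `cosine` here, with made-up three-dimensional vectors) and checks two properties: scaling a vector does not change the score, and two different non-negative vectors score somewhere between 0 and 1.

```typescript
// Compact cosine similarity for illustration; same idea as the function above.
const cosine = (a: number[], b: number[]): number => {
  let dot = 0;
  let magA = 0;
  let magB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    magA += x * x;
    magB += y * y;
  }
  const denom = Math.sqrt(magA) * Math.sqrt(magB);
  return denom === 0 ? 0 : dot / denom;
};

// Hypothetical 3-dimensional vectors (the real vectors have 13 dimensions).
const upbeat = [0.9, 0.1, 0.8];
const mellow = [0.1, 0.9, 0.2];

const selfScore = cosine(upbeat, upbeat); // ~1: identical direction
const scaledScore = cosine(upbeat, upbeat.map(v => v * 2)); // ~1: magnitude is ignored
const crossScore = cosine(upbeat, mellow); // ~0.30: different direction
```

Because magnitude is ignored, the 1.3x mood weight only matters relative to the unweighted features: it tilts the vector toward the mood dimensions rather than making scores larger overall.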
### Tag/Genre Bonus
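Tracks that share tags or genres with the source track receive a small bonus: 1% per shared tag (compared case-insensitively), capped at 5%: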
```typescript
const computeTagBonus = (
sourceTags: string[],
sourceGenres: string[],
trackTags: string[],
trackGenres: string[]
): number => {
const sourceSet = new Set(
[...sourceTags, ...sourceGenres].map(t => t.toLowerCase())
);
const trackSet = new Set(
[...trackTags, ...trackGenres].map(t => t.toLowerCase())
);
const overlap = [...sourceSet].filter(tag => trackSet.has(tag)).length;
return Math.min(0.05, overlap * 0.01); // Max 5% bonus
};
```
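A short worked example may help here. The sketch below uses a simplified variant, `sharedTagBonus` (a hypothetical name), that takes tags and genres already merged into one list per track; the sample labels are invented.

```typescript
// Simplified variant of the bonus above: tags and genres are pre-merged.
const sharedTagBonus = (sourceLabels: string[], trackLabels: string[]): number => {
  const sourceSet = new Set(sourceLabels.map(label => label.toLowerCase()));
  const trackSet = new Set(trackLabels.map(label => label.toLowerCase()));
  let overlap = 0;
  sourceSet.forEach(label => {
    if (trackSet.has(label)) overlap += 1;
  });
  return Math.min(0.05, overlap * 0.01); // 1% per shared label, capped at 5%
};

// "Rock"/"rock" and "indie"/"Indie" match despite the different casing.
const twoShared = sharedTagBonus(["Rock", "indie", "90s"], ["rock", "Indie", "pop"]); // 0.02

// Seven shared labels would be 0.07, so the cap applies.
const capped = sharedTagBonus(
  ["a", "b", "c", "d", "e", "f", "g"],
  ["a", "b", "c", "d", "e", "f", "g"]
); // 0.05
```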
### Final Score
The cosine similarity is scaled to 95% so that, with the tag bonus of up to 5%, the final score never exceeds 1.0:
```typescript
const finalScore = cosineSimilarity(sourceVector, targetVector) * 0.95 + tagBonus;
```
### Matching Thresholds
| Mode | Minimum Similarity |
|------|-------------------|
| Enhanced | 40% |
| Standard | 50% |
Enhanced mode uses a lower threshold because its ML mood predictions provide more nuanced differentiation between tracks.
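The scoring and threshold steps can be sketched together as follows. The names `MIN_SIMILARITY`, `combineScore`, and `passesThreshold` are illustrative, and applying the threshold to the combined score (rather than to the raw cosine similarity) is an assumption.

```typescript
type AnalysisMode = "enhanced" | "standard";

// Thresholds from the table above, expressed as fractions.
const MIN_SIMILARITY: Record<AnalysisMode, number> = {
  enhanced: 0.4,
  standard: 0.5,
};

// Combine the cosine similarity with the tag bonus, as in the Final Score formula.
const combineScore = (similarity: number, tagBonus: number): number =>
  similarity * 0.95 + tagBonus;

// Assumption: the threshold is compared against the combined score.
const passesThreshold = (score: number, mode: AnalysisMode): boolean =>
  score >= MIN_SIMILARITY[mode];

// Example: cosine similarity 0.45 with three shared tags (0.03 bonus).
const exampleScore = combineScore(0.45, 0.03); // 0.4275 + 0.03 = 0.4575
const keptInEnhanced = passesThreshold(exampleScore, "enhanced"); // true
const keptInStandard = passesThreshold(exampleScore, "standard"); // false
```

The same candidate is kept in Enhanced mode and dropped in Standard mode, which is the practical effect of the two thresholds.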
### Octave-Aware BPM Matching
Treats tempos that differ by a factor of two as similar (60 BPM ≈ 120 BPM ≈ 240 BPM). Both tempos are folded into the 77-154 BPM range, then compared on a log scale:
```typescript
const octaveAwareBPMDistance = (bpm1: number, bpm2: number): number => {
  const normalizeToOctave = (bpm: number): number => {
    // Guard: zero, negative, or non-finite input would loop forever below
    if (!Number.isFinite(bpm) || bpm <= 0) return 120;
    while (bpm < 77) bpm *= 2;
    while (bpm > 154) bpm /= 2;
    return bpm;
  };
  const norm1 = normalizeToOctave(bpm1);
  const norm2 = normalizeToOctave(bpm2);
  const logDistance = Math.abs(Math.log2(norm1) - Math.log2(norm2));
  return Math.min(logDistance, 1);
};
```
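A few worked values show what the distance looks like in practice. The sketch below uses the same folding logic under a different name (`octaveDistance`), with a guard for invalid input, so that it runs on its own.

```typescript
// Same logic as the function above, condensed, with a guard for invalid input.
const octaveDistance = (bpm1: number, bpm2: number): number => {
  const fold = (bpm: number): number => {
    if (!Number.isFinite(bpm) || bpm <= 0) return 120; // avoid endless loops
    while (bpm < 77) bpm *= 2;
    while (bpm > 154) bpm /= 2;
    return bpm;
  };
  return Math.min(Math.abs(Math.log2(fold(bpm1)) - Math.log2(fold(bpm2))), 1);
};

const halfTime = octaveDistance(60, 120); // 0: 60 folds up to 120
const doubleTime = octaveDistance(240, 120); // 0: 240 folds down to 120
const nearby = octaveDistance(100, 120); // log2(1.2), about 0.263
const boundary = octaveDistance(76, 78); // log2(152 / 78), about 0.963
```

The last case is a boundary effect worth knowing about: 76 BPM folds up to 152 while 78 BPM stays where it is, so two nearly identical tempos end up almost a full octave apart. If that matters for a given use, one possible adjustment (not something the source describes) is a wrap-around distance such as `Math.min(d, 1 - d)`.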
---
## API Endpoints
### Get Track Audio Features
```
GET /api/tracks/:id/features
```
Response:
```json
{
"bpm": 128.5,
"energy": 0.78,
"valence": 0.65,
"arousal": 0.72,
"danceability": 0.85,
"key": "C",
"keyScale": "major",
"moodHappy": 0.72,
"moodSad": 0.15,
"moodRelaxed": 0.28,
"moodAggressive": 0.45,
"moodParty": 0.68,
"moodAcoustic": 0.12,
"moodElectronic": 0.78,
"analysisMode": "enhanced",
"analysisStatus": "completed"
}
```
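A client for this endpoint might look like the sketch below. `featuresPath`, `loadTrackFeatures`, and the `FetchLike` type are hypothetical names, the response interface lists only a subset of the fields, and the string type of the track id is assumed. Authentication and error reporting are left out.

```typescript
// Subset of the response fields shown above. All are optional or nullable
// because analysis may still be pending or may have run in standard mode.
interface TrackFeaturesResponse {
  bpm?: number | null;
  energy?: number | null;
  valence?: number | null;
  analysisMode?: string | null;
  analysisStatus?: string | null;
}

// Build the endpoint path, escaping the id in case it contains reserved characters.
const featuresPath = (trackId: string): string =>
  `/api/tracks/${encodeURIComponent(trackId)}/features`;

// Minimal shape of a fetch-like function, declared locally so the sketch
// does not depend on DOM or Node type definitions.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Hypothetical helper: resolves to null when the request is not successful.
const loadTrackFeatures = async (
  trackId: string,
  fetchImpl: FetchLike
): Promise<TrackFeaturesResponse | null> => {
  const response = await fetchImpl(featuresPath(trackId));
  if (!response.ok) return null;
  return (await response.json()) as TrackFeaturesResponse;
};
```

In a browser, `fetchImpl` can be `(url) => fetch(url)`. Passing it in as a parameter also makes the helper easy to test with a stub.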
### Find Similar Tracks (Vibe Match)
```
GET /api/library/vibe-match?trackId=:id&limit=20
```
Response:
```json
{
"source": { /* track with features */ },
"matches": [
{
"track": { /* track data */ },
"similarity": 0.87,
"features": { /* audio features */ }
}
]
}
```
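The response can be turned into display rows with a small helper. In this sketch the `VibeMatch` track fields (`id`, `title`) are assumed for illustration, and the helper sorts the matches itself because the response order is not stated above.

```typescript
// One entry of `matches`, reduced to the fields used below.
interface VibeMatch {
  track: { id: string; title: string }; // hypothetical track fields
  similarity: number; // 0..1
}

// Build the request path for the endpoint above.
const vibeMatchPath = (trackId: string, limit: number = 20): string =>
  `/api/library/vibe-match?trackId=${encodeURIComponent(trackId)}&limit=${limit}`;

// Sort best-first and express similarity as a whole percentage.
const toDisplayRows = (matches: VibeMatch[]): { title: string; percent: number }[] =>
  [...matches]
    .sort((a, b) => b.similarity - a.similarity)
    .map(m => ({ title: m.track.title, percent: Math.round(m.similarity * 100) }));

const displayRows = toDisplayRows([
  { track: { id: "1", title: "Track A" }, similarity: 0.62 },
  { track: { id: "2", title: "Track B" }, similarity: 0.87 },
]);
// displayRows: Track B (87), then Track A (62)
```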
### Generate Mood Mix
```
POST /api/mixes/mood
```
Request:
```json
{
"valence": { "min": 0.6, "max": 1.0 },
"energy": { "min": 0.5, "max": 0.8 },
"danceability": { "min": 0.7, "max": 1.0 },
"bpm": { "min": 100, "max": 140 },
"limit": 15
}
```
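Before sending the request, a form can check the ranges client-side. The sketch below is one way to do that. The `MoodMixRequest` shape follows the example above, while treating every field as optional and the specific validation rules are assumptions, not documented server behavior.

```typescript
interface NumericRange {
  min: number;
  max: number;
}

// Request body for POST /api/mixes/mood, following the example above.
// Treating every field as optional is an assumption.
interface MoodMixRequest {
  valence?: NumericRange;
  energy?: NumericRange;
  danceability?: NumericRange;
  bpm?: NumericRange;
  limit?: number;
}

// Collect readable problems instead of throwing, so a form can show them all.
const validateMoodMix = (req: MoodMixRequest): string[] => {
  const problems: string[] = [];
  const unitKeys = ["valence", "energy", "danceability"] as const;
  for (const key of unitKeys) {
    const range = req[key];
    if (!range) continue;
    if (range.min > range.max) problems.push(`${key}: min exceeds max`);
    if (range.min < 0 || range.max > 1) problems.push(`${key}: must lie within 0..1`);
  }
  if (req.bpm && req.bpm.min > req.bpm.max) problems.push("bpm: min exceeds max");
  if (req.limit !== undefined && (!Number.isInteger(req.limit) || req.limit < 1)) {
    problems.push("limit: must be a positive integer");
  }
  return problems;
};

const okRequest: MoodMixRequest = {
  valence: { min: 0.6, max: 1.0 },
  energy: { min: 0.5, max: 0.8 },
  limit: 15,
};
const badRequest: MoodMixRequest = { energy: { min: 0.9, max: 0.2 } };
// validateMoodMix(okRequest) is empty; validateMoodMix(badRequest) reports the energy range.
```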
### Get Mood Presets
```
GET /api/mixes/mood-presets
```
Response:
```json
[
{
"id": "chill",
"name": "Chill Vibes",
"color": "from-blue-600 to-purple-600",
"params": {
"valence": { "min": 0.3, "max": 0.7 },
"energy": { "min": 0.1, "max": 0.4 }
}
}
]
```
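Since a preset's `params` use the same range shape as the mood mix request, a preset can plausibly be turned into a request body directly. The sketch below assumes exactly that, which the source does not confirm. `presetToRequest` is a hypothetical helper, and the default limit of 15 simply mirrors the request example.

```typescript
interface PresetRange {
  min: number;
  max: number;
}

// Shape of one preset, following the response above.
interface MoodPreset {
  id: string;
  name: string;
  color: string;
  params: Record<string, PresetRange>;
}

// Merge a preset with user overrides (for example, after a slider is moved)
// and attach a track limit.
const presetToRequest = (
  preset: MoodPreset,
  overrides: Record<string, PresetRange> = {},
  limit: number = 15
): Record<string, PresetRange | number> => {
  const body: Record<string, PresetRange | number> = { ...preset.params, ...overrides };
  body["limit"] = limit;
  return body;
};

const chillPreset: MoodPreset = {
  id: "chill",
  name: "Chill Vibes",
  color: "from-blue-600 to-purple-600",
  params: {
    valence: { min: 0.3, max: 0.7 },
    energy: { min: 0.1, max: 0.4 },
  },
};

// The user narrows the energy range; valence keeps the preset value.
const chillBody = presetToRequest(chillPreset, { energy: { min: 0.1, max: 0.3 } });
```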
---
## Frontend Integration Guide
### Displaying Feature Values
Normalize values to the 0-1 range for consistent display. Missing values (`null` or `undefined`) map to 0:
```typescript
function normalizeValue(
value: number | null | undefined,
min: number,
max: number
): number {
if (value === null || value === undefined) return 0;
return Math.max(0, Math.min(1, (value - min) / (max - min)));
}
// Usage
const normalizedBpm = normalizeValue(track.bpm, 60, 180);
const normalizedEnergy = normalizeValue(track.energy, 0, 1);
```
### Calculating Match Scores
```typescript
function calculateFeatureMatch(
  sourceVal: number | null | undefined, // optional AudioFeatures fields may be undefined
  currentVal: number | null | undefined,
  min: number,
  max: number
): { diff: number; match: number } {
  const sourceNorm = normalizeValue(sourceVal, min, max);
  const currentNorm = normalizeValue(currentVal, min, max);
  const diff = Math.abs(sourceNorm - currentNorm);
  const match = Math.round((1 - diff) * 100);
  return { diff, match };
}
```
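Per-feature matches can be combined into a single overall percentage. How the existing components compute their overall figure is not specified here, so the sketch below assumes an unweighted mean and skips features that are missing on either track. `RANGES`, `clamp01`, and `overallMatch` are illustrative names.

```typescript
interface FeatureRange {
  key: string;
  min: number;
  max: number;
}

// Ranges taken from FEATURE_CONFIG above.
const RANGES: FeatureRange[] = [
  { key: "energy", min: 0, max: 1 },
  { key: "valence", min: 0, max: 1 },
  { key: "danceability", min: 0, max: 1 },
  { key: "bpm", min: 60, max: 180 },
];

type FeatureValues = Record<string, number | null | undefined>;

const clamp01 = (value: number, min: number, max: number): number =>
  Math.max(0, Math.min(1, (value - min) / (max - min)));

// Average the per-feature match, skipping features missing on either track
// so that absent data does not count as a mismatch.
const overallMatch = (source: FeatureValues, current: FeatureValues): number | null => {
  const matches: number[] = [];
  for (const { key, min, max } of RANGES) {
    const a = source[key];
    const b = current[key];
    if (a == null || b == null) continue;
    matches.push(1 - Math.abs(clamp01(a, min, max) - clamp01(b, min, max)));
  }
  if (matches.length === 0) return null;
  const mean = matches.reduce((sum, m) => sum + m, 0) / matches.length;
  return Math.round(mean * 100);
};

// energy 0.80, valence 0.90, bpm 0.75 (danceability skipped): mean 0.8167
const overall = overallMatch(
  { energy: 0.8, valence: 0.6, bpm: 120 },
  { energy: 0.6, valence: 0.7, bpm: 150, danceability: 0.9 }
); // 82
```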
### Match Score Color Coding
```typescript
function getMatchColor(matchPercent: number): string {
if (matchPercent >= 80) return "text-green-400"; // Excellent
if (matchPercent >= 60) return "text-yellow-400"; // Good
return "text-red-400"; // Different
}
function getMatchDescription(matchPercent: number): string {
if (matchPercent >= 80) return "Excellent match - very similar vibe";
if (matchPercent >= 60) return "Good match - similar energy";
return "Different vibe - exploring variety";
}
```
### Visualization Recommendations
#### 1. Radar Chart (Spider Graph)
Best for comparing multiple features at once. Shows source track (dashed line) vs current track (solid fill).
#### 2. Progress Bars
Best for individual feature comparison with source marker overlay.
#### 3. Mood Grid
4x2 or 4x4 grid of ML mood indicators with percentage matches.
#### 4. Valence-Arousal Quadrant
2D scatter plot with:
- X-axis: Valence (sad → happy)
- Y-axis: Arousal (calm → energetic)
Quadrants:
- Top-right: Happy + Energetic (Party)
- Top-left: Sad + Energetic (Angry/Tense)
- Bottom-right: Happy + Calm (Peaceful)
- Bottom-left: Sad + Calm (Melancholic)
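The quadrant mapping can be expressed as a small classifier. This sketch splits each axis at the neutral midpoint of 0.5, and treating a value of exactly 0.5 as the high side is an arbitrary choice made for illustration.

```typescript
type Quadrant = "Party" | "Angry/Tense" | "Peaceful" | "Melancholic";

// Classify a track by its position relative to the neutral midpoint (0.5).
const classifyQuadrant = (valence: number, arousal: number): Quadrant => {
  const happy = valence >= 0.5;
  const energetic = arousal >= 0.5;
  if (happy && energetic) return "Party"; // top-right
  if (!happy && energetic) return "Angry/Tense"; // top-left
  if (happy && !energetic) return "Peaceful"; // bottom-right
  return "Melancholic"; // bottom-left
};

// Valence 0.65 and arousal 0.72, as in the sample features response earlier.
const sampleQuadrant = classifyQuadrant(0.65, 0.72); // "Party"
```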
---
## Existing Components Reference
### VibeOverlay
Location: `frontend/components/player/VibeOverlay.tsx`
Full-featured overlay showing:
- Overall match percentage
- Feature-by-feature comparison bars
- ML mood grid (enhanced mode)
- Source vs current legend
### VibeGraph
Location: `frontend/components/player/VibeGraph.tsx`
Compact radar chart for:
- 4-feature comparison (Energy, Mood, Dance, BPM)
- Match score badge
- Inline display in player
### MoodMixer
Location: `frontend/components/MoodMixer.tsx`
Modal for:
- Quick mood presets
- Custom range sliders
- Generating mood-based playlists
---
## Special Considerations
### Out-of-Distribution (OOD) Detection
The MusiCNN model was trained on pop/rock music, so its predictions may be unreliable for other genres (classical, ambient, jazz). The backend detects such cases and normalizes them.
**Detection criteria:**
- All mood values > 0.7 with low variance
- All mood values clustered around 0.5
**UI Recommendation:** Show a subtle indicator when `analysisMode` is "standard" or when predictions seem unreliable.
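For a UI that wants to flag unreliable predictions itself, the two detection criteria could be approximated as below. This is a sketch, not the backend's implementation: only the 0.7 floor comes from the criteria above, while `LOW_VARIANCE` and `NEUTRAL_BAND` are hypothetical thresholds chosen for illustration.

```typescript
// Hypothetical thresholds; the criteria above give no exact numbers for these.
const LOW_VARIANCE = 0.01;
const NEUTRAL_BAND = 0.1;

const looksOutOfDistribution = (moods: number[]): boolean => {
  if (moods.length === 0) return false;
  const mean = moods.reduce((sum, m) => sum + m, 0) / moods.length;
  const variance =
    moods.reduce((sum, m) => sum + (m - mean) * (m - mean), 0) / moods.length;

  // Criterion 1: every mood is high and the values barely differ.
  const allHighAndFlat = moods.every(m => m > 0.7) && variance < LOW_VARIANCE;
  // Criterion 2: every mood sits close to the neutral value 0.5.
  const allNearNeutral = moods.every(m => Math.abs(m - 0.5) <= NEUTRAL_BAND);

  return allHighAndFlat || allNearNeutral;
};

// Mood values from the sample features response: clearly differentiated.
const differentiated = [0.72, 0.15, 0.28, 0.45, 0.68, 0.12, 0.78];
// Invented saturated output, as might occur for an unfamiliar genre.
const saturated = [0.82, 0.8, 0.85, 0.79, 0.81, 0.83, 0.8];

const flagDifferentiated = looksOutOfDistribution(differentiated); // false
const flagSaturated = looksOutOfDistribution(saturated); // true
```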
### Handling Missing Data
Feature fields can be `null` or absent, for example before analysis completes, so always provide fallback values:
```typescript
const safeFeatures = {
energy: track.energy ?? 0.5,
valence: track.valence ?? 0.5,
bpm: track.bpm ?? 120,
// ... etc
};
```
### Analysis Status States
| Status | UI Treatment |
|--------|--------------|
| `pending` | Show "Analyzing..." with spinner |
| `processing` | Show progress indicator |
| `completed` | Show full vibe data |
| `failed` | Show fallback/retry option |
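The table can be encoded as a lookup that a component consults before rendering. In this sketch the `StatusView` shape and the message strings other than "Analyzing..." are invented, and falling back to the pending treatment for unknown or missing statuses is an assumption.

```typescript
interface StatusView {
  message: string;
  indicator: "spinner" | "progress" | "none";
  showVibeData: boolean;
  allowRetry: boolean;
}

// Map each status from the table above to display flags.
const statusToView = (statusValue: string | null | undefined): StatusView => {
  switch (statusValue) {
    case "completed":
      return { message: "", indicator: "none", showVibeData: true, allowRetry: false };
    case "failed":
      return { message: "Analysis failed", indicator: "none", showVibeData: false, allowRetry: true };
    case "processing":
      return { message: "Analysis in progress", indicator: "progress", showVibeData: false, allowRetry: false };
    case "pending":
    default:
      // Unknown or missing statuses are treated like "pending" (assumption).
      return { message: "Analyzing...", indicator: "spinner", showVibeData: false, allowRetry: false };
  }
};

const completedView = statusToView("completed"); // full vibe data, no indicator
const missingView = statusToView(null); // spinner with "Analyzing..."
```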
---
## Quick Reference: Value Ranges
| Metric | Min | Max | Neutral |
|--------|-----|-----|---------|
| All mood* | 0 | 1 | 0.5 |
| energy | 0 | 1 | 0.5 |
| valence | 0 | 1 | 0.5 |
| arousal | 0 | 1 | 0.5 |
| danceability | 0 | 1 | 0.5 |
| bpm | 60 | 200 | 120 |
| keyStrength | 0 | 1 | - |
---
## File Locations
| Component | Path |
|-----------|------|
| Audio Analyzer (Python) | `services/audio-analyzer/analyzer.py` |
| Vibe Matching Logic | `backend/src/routes/library.ts` |
| Database Schema | `backend/prisma/schema.prisma` |
| Frontend Vibe Overlay | `frontend/components/player/VibeOverlay.tsx` |
| Frontend Vibe Graph | `frontend/components/player/VibeGraph.tsx` |
| Mood Mixer | `frontend/components/MoodMixer.tsx` |
| Audio State Context | `frontend/lib/audio-state-context.tsx` |
---
## Research Background
The Vibe System's valence and arousal calculations are informed by music psychology research:
### Valence (Emotional Positivity)
**Key Finding:** Mode/tonality is the strongest predictor of perceived valence in music.
- **Lee et al. (ICASSP 2020)** - Demonstrated that musical mode (major vs. minor) has the highest correlation with listener-reported valence
- Major keys contribute positively (+0.3 in our formula), minor keys negatively (-0.2)
- This aligns with centuries of music theory and empirical psychology research
### Arousal (Energy/Excitement)
**Key Finding:** The "electronic" mood prediction from ML models is unreliable for arousal calculation.
- **Grekow (2018)** - Found that direct energy and tempo features outperform genre-based predictions for arousal
- Our implementation replaces the "electronic" mood with explicit energy and BPM contributions
- This provides more consistent arousal predictions across diverse genres
### Feature Weights
The specific weights in our formulas (e.g., 0.35 for happy mood, 0.25 for energy) were tuned through:
1. Initial values from published research
2. Empirical testing on a diverse music library
3. User feedback on vibe matching accuracy
### References
- Lee, J., et al. (2020). "Music Emotion Recognition Using Valence-Arousal Regression." ICASSP 2020.
- Grekow, J. (2018). "Music Emotion Maps in Arousal-Valence Space." IFIP International Conference on Computer Information Systems and Industrial Management.