Initial release v1.0.0

This commit is contained in:
Kevin O'Neill
2025-12-25 18:58:06 -06:00
commit 021aec7a63
439 changed files with 116588 additions and 0 deletions


@@ -0,0 +1,174 @@
# Audio Analysis - Enhanced Mode (MusiCNN)
## Overview
Enhanced mode uses Essentia's TensorFlow integration with MusiCNN (Music Convolutional Neural Network) models to perform ML-based mood and audio classification. This provides significantly more accurate mood detection compared to the heuristic-based Standard mode.
## Architecture
```
                   ┌───────────────────┐
                   │    Audio File     │
                   │   (16kHz mono)    │
                   └─────────┬─────────┘
                   ┌─────────▼─────────┐
                   │ TensorflowPredict │
                   │      MusiCNN      │
                   │   (Embeddings)    │
                   └─────────┬─────────┘
              ┌──────────────┼──────────────┐
              │              │              │
    ┌─────────▼─────┐ ┌──────▼─────┐ ┌──────▼──────┐
    │  Mood Happy   │ │  Mood Sad  │ │ Danceability│
    │  TensorFlow   │ │ TensorFlow │ │ TensorFlow  │
    │  Predict2D    │ │ Predict2D  │ │ Predict2D   │
    └───────┬───────┘ └─────┬──────┘ └──────┬──────┘
            │               │               │
            └───────────────┼───────────────┘
                    ┌───────▼───────┐
                    │ Derived Scores│
                    │Valence/Arousal│
                    └───────────────┘
```
## Key Components
### 1. Base Model: MusiCNN
- **Model**: `msd-musicnn-1.pb` (~3MB)
- **Source**: [Essentia Model Zoo](https://essentia.upf.edu/models/autotagging/msd/)
- **Function**: Extracts 200-dimensional embeddings from audio
- **Algorithm**: `TensorflowPredictMusiCNN`
### 2. Classification Heads
Each classification head takes the MusiCNN embeddings and outputs probabilities:
| Model | File | Output |
|-------|------|--------|
| Mood Happy | `mood_happy-msd-musicnn-1.pb` | P(happy) |
| Mood Sad | `mood_sad-msd-musicnn-1.pb` | P(sad) |
| Mood Relaxed | `mood_relaxed-msd-musicnn-1.pb` | P(relaxed) |
| Mood Aggressive | `mood_aggressive-msd-musicnn-1.pb` | P(aggressive) |
| Mood Party | `mood_party-msd-musicnn-1.pb` | P(party) |
| Mood Acoustic | `mood_acoustic-msd-musicnn-1.pb` | P(acoustic) |
| Mood Electronic | `mood_electronic-msd-musicnn-1.pb` | P(electronic) |
| Danceability | `danceability-msd-musicnn-1.pb` | P(danceable) |
| Voice/Instrumental | `voice_instrumental-msd-musicnn-1.pb` | P(instrumental) |
### 3. Derived Features
Valence and Arousal are derived from the mood predictions:
```python
# Valence = emotional positivity
valence = happy * 0.5 + party * 0.3 + (1 - sad) * 0.2
# Arousal = energy level
arousal = (
    aggressive * 0.35 + party * 0.25 + electronic * 0.2
    + (1 - relaxed) * 0.1 + (1 - acoustic) * 0.1
)
```
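
As a quick check on the weights, the two formulas can be wrapped in a small function. This is a minimal sketch: the name `derive_valence_arousal` and the dict-based input are assumptions made for illustration, not the analyzer's actual API. Both weight sets sum to 1.0, so the outputs stay within 0-1 whenever every input probability does.

```python
def derive_valence_arousal(moods):
    """Combine per-mood probabilities (each in 0-1) into valence and arousal.

    `moods` is a hypothetical dict keyed by mood name; the weights are the
    ones given in the formulas above.
    """
    valence = (
        moods["happy"] * 0.5
        + moods["party"] * 0.3
        + (1 - moods["sad"]) * 0.2
    )
    arousal = (
        moods["aggressive"] * 0.35
        + moods["party"] * 0.25
        + moods["electronic"] * 0.2
        + (1 - moods["relaxed"]) * 0.1
        + (1 - moods["acoustic"]) * 0.1
    )
    return round(valence, 3), round(arousal, 3)


# Made-up probabilities, purely for illustration.
example = {
    "happy": 0.8, "sad": 0.1, "party": 0.6, "aggressive": 0.2,
    "electronic": 0.5, "relaxed": 0.3, "acoustic": 0.4,
}
valence, arousal = derive_valence_arousal(example)
```

With these made-up inputs the valence terms are 0.40 + 0.18 + 0.18 and the arousal terms are 0.07 + 0.15 + 0.10 + 0.07 + 0.06.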
## Docker Configuration
### Dockerfile
```dockerfile
FROM ubuntu:20.04
# pip3 and curl are not part of the base image
RUN apt-get update && apt-get install -y --no-install-recommends \
        python3-pip curl ca-certificates \
    && rm -rf /var/lib/apt/lists/*
# Install essentia-tensorflow (includes TensorFlow + MusiCNN support)
RUN pip3 install --no-cache-dir essentia-tensorflow
# Download MusiCNN models
RUN mkdir -p /app/models \
    && curl -L -o /app/models/msd-musicnn-1.pb \
        "https://essentia.upf.edu/models/autotagging/msd/msd-musicnn-1.pb"
# Classification heads
RUN curl -L -o /app/models/mood_happy-msd-musicnn-1.pb \
    "https://essentia.upf.edu/models/classification-heads/mood_happy/mood_happy-msd-musicnn-1.pb"
# ... (other models)
```
### Requirements
- **Ubuntu 20.04** (for Python 3.8 compatibility)
- **essentia-tensorflow** pip package
- **~10MB** for all models combined
## Usage in Code
```python
import numpy as np
from essentia.standard import MonoLoader, TensorflowPredictMusiCNN, TensorflowPredict2D

# Load base embedding model
musicnn = TensorflowPredictMusiCNN(
    graphFilename='/app/models/msd-musicnn-1.pb',
    output="model/dense/BiasAdd"  # Embedding output layer
)
# Load classification head
mood_happy = TensorflowPredict2D(
    graphFilename='/app/models/mood_happy-msd-musicnn-1.pb',
    output="model/Softmax"
)
# Process audio (`path` is the audio file to analyze)
audio = MonoLoader(filename=path, sampleRate=16000)()
embeddings = musicnn(audio)  # Shape: [frames, 200]
predictions = mood_happy(embeddings)  # Shape: [frames, 2]
# Column order follows the `classes` list in each model's JSON metadata;
# check which index is the positive class before relying on column 1.
happy_score = float(np.mean(predictions[:, 1]))  # Average over frames
```
## Output Fields
Enhanced mode produces these additional fields:
| Field | Type | Range | Description |
|-------|------|-------|-------------|
| moodHappy | float | 0-1 | ML probability of happy mood |
| moodSad | float | 0-1 | ML probability of sad mood |
| moodRelaxed | float | 0-1 | ML probability of relaxed mood |
| moodAggressive | float | 0-1 | ML probability of aggressive mood |
| moodParty | float | 0-1 | ML probability of party mood |
| moodAcoustic | float | 0-1 | ML probability of acoustic sound |
| moodElectronic | float | 0-1 | ML probability of electronic sound |
| danceabilityMl | float | 0-1 | ML danceability score |
| valence | float | 0-1 | Derived emotional positivity |
| arousal | float | 0-1 | Derived energy level |
| acousticness | float | 0-1 | From moodAcoustic |
| instrumentalness | float | 0-1 | ML voice/instrumental detection |
## Comparison: Standard vs Enhanced
| Feature | Standard Mode | Enhanced Mode |
|---------|---------------|---------------|
| Mood Detection | Heuristic (key/BPM/energy) | ML (MusiCNN) |
| Accuracy | Approximate | Research-grade |
| Speed | Fast (~100ms) | Moderate (~500ms) |
| Dependencies | Essentia core | Essentia + TensorFlow |
| Model Size | 0 | ~10MB |
| Python Version | Any | 3.7-3.9 (for pip) |
## Fallback Behavior
If Enhanced mode fails to initialize (missing models, TensorFlow errors), the analyzer automatically falls back to Standard mode:
```python
if self.enhanced_mode and self.musicnn_model:
    ml_features = self._extract_ml_features(audio_16k)
    result.update(ml_features)
else:
    self._apply_standard_estimates(result, scale, bpm)
```
## References
- [Essentia TensorFlow Documentation](https://essentia.upf.edu/machine_learning.html)
- [MusiCNN Paper](https://arxiv.org/abs/1711.02520)
- [Essentia Model Zoo](https://essentia.upf.edu/models/)


@@ -0,0 +1,443 @@
# Audio Analysis: Standard Mode (Heuristic Approach)
## Overview
The Lidify audio analyzer has two modes:
- **Enhanced Mode**: Uses TensorFlow ML models for accurate mood/valence/arousal predictions
- **Standard Mode**: Uses signal processing heuristics when ML models aren't available
This document covers the **Standard Mode** implementation for code review.
---
## Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│                        Docker Container                         │
│                      lidify_audio_analyzer                      │
│                                                                 │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────────────┐  │
│  │    Redis    │◄───│   Worker    │───►│     PostgreSQL      │  │
│  │  Job Queue  │    │    Loop     │    │     Track Table     │  │
│  └─────────────┘    └──────┬──────┘    └─────────────────────┘  │
│                            │                                    │
│                     ┌──────▼──────┐                             │
│                     │AudioAnalyzer│                             │
│                     │    Class    │                             │
│                     └──────┬──────┘                             │
│                            │                                    │
│           ┌────────────────┼────────────────┐                   │
│           ▼                ▼                ▼                   │
│  ┌───────────────┐  ┌─────────────┐  ┌──────────────────┐       │
│  │ Basic Features│  │  Spectral   │  │    Heuristic     │       │
│  │  (BPM, Key)   │  │  Analysis   │  │ Mood Estimation  │       │
│  └───────────────┘  └─────────────┘  └──────────────────┘       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```
---
## File Structure
```
services/audio-analyzer/
├── analyzer.py # Main analyzer code (870 lines)
├── requirements.txt # Python dependencies
└── Dockerfile # Container build configuration
```
---
## Key Classes
### 1. `AudioAnalyzer` (Lines 130-660)
Main analysis class with two modes:
```python
class AudioAnalyzer:
    def __init__(self):
        self.enhanced_mode = False  # Falls back to Standard if ML unavailable
        self._init_essentia()       # Initialize signal processing algorithms
        self._load_ml_models()      # Attempt to load ML models
```
### 2. `AnalysisWorker` (Lines 663-847)
Redis queue worker that:
1. Polls for pending tracks from `audio:analysis:queue`
2. Falls back to scanning `Track` table for `analysisStatus = 'pending'`
3. Processes tracks and updates database
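
The queue-first, database-second polling order described above can be sketched as follows. This is an illustration under stated assumptions, not the code in `analyzer.py`: `pop_from_queue`, `find_pending_in_db` and `process` are hypothetical callables standing in for the Redis pop on `audio:analysis:queue`, the `Track` table scan and the analysis step, and the `max_idle_polls` cutoff exists only so the sketch terminates (a real worker would presumably keep polling).

```python
import time


def next_track_id(pop_from_queue, find_pending_in_db):
    """Return the next track ID to analyze, or None if nothing is pending.

    The queue is checked first; the database scan is only a fallback.
    """
    track_id = pop_from_queue()
    if track_id is not None:
        return track_id
    return find_pending_in_db()


def run_worker(pop_from_queue, find_pending_in_db, process,
               max_idle_polls=3, idle_sleep=0.0):
    """Process tracks until `max_idle_polls` consecutive polls find no work."""
    idle = 0
    processed = []
    while idle < max_idle_polls:
        track_id = next_track_id(pop_from_queue, find_pending_in_db)
        if track_id is None:
            idle += 1
            time.sleep(idle_sleep)  # Back off before polling again
            continue
        idle = 0
        process(track_id)
        processed.append(track_id)
    return processed
```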
---
## Standard Mode: Heuristic Calculations
### Input Features (Always Extracted)
| Feature | Essentia Algorithm | Description |
|---------|-------------------|-------------|
| BPM | `RhythmExtractor2013` | Beats per minute |
| Key/Scale | `KeyExtractor` | Musical key (C, D#, etc.) and mode (major/minor) |
| Loudness | `Loudness` | Perceived loudness in dB |
| Dynamic Range | `DynamicComplexity` | Difference between quiet and loud parts |
| Danceability | `Danceability` | How suitable for dancing (0-1) |
| RMS Energy | `RMS` | Root Mean Square amplitude per frame |
| Spectral Centroid | `Centroid` | "Brightness" - center of spectral mass |
| Spectral Flatness | `FlatnessDB` | Noise-like vs tonal content |
| Zero-Crossing Rate | `ZeroCrossingRate` | Rate of signal sign changes |
### Frame-Based Processing (Lines 328-365)
```python
frame_size = 2048
hop_size = 1024
for i in range(0, len(audio_44k) - frame_size, hop_size):
    frame = audio_44k[i:i + frame_size]
    windowed = self.windowing(frame)
    spectrum = self.spectrum(windowed)
    rms_values.append(self.rms(frame))
    zcr_values.append(self.zcr(frame))
    spectral_centroid_values.append(self.spectral_centroid(spectrum))
    spectral_flatness_values.append(self.spectral_flatness(spectrum))
```
---
## Heuristic Formulas
### Energy (Lines 347-353)
**Problem Solved**: The previous implementation used `es.Energy()`, which returns the sum of squared samples (a very large number), and normalized it incorrectly as `energy / 100`.
**Current Implementation**:
```python
avg_rms = np.mean(rms_values)
energy = min(1.0, avg_rms * 3)  # RMS is typically 0.0-0.5; x3 scaling saturates at 1.0 once RMS exceeds ~0.33
```
---
### Valence (Happiness/Positivity) - Lines 495-518
**Formula**:
```
valence = key_valence * 0.40
+ bpm_valence * 0.25
+ brightness_valence * 0.20
+ energy * 0.15
```
**Components**:
| Component | Weight | Calculation | Rationale |
|-----------|--------|-------------|-----------|
| Key Valence | 40% | Major = 0.65, Minor = 0.35 | Major keys sound happier |
| BPM Valence | 25% | 0.5 between 80 and 120 BPM; rises to 0.8 by 180 BPM, falls to 0.2 by 50 BPM | Fast tempo = upbeat |
| Brightness | 20% | `spectral_centroid * 1.5` | Bright sounds feel positive |
| Energy | 15% | RMS energy (0-1) | Loud = energetic/positive |
**Code**:
```python
# Key contribution
key_valence = 0.65 if scale == 'major' else 0.35
# BPM contribution
if bpm >= 120:
    bpm_valence = min(0.8, 0.5 + (bpm - 120) / 200)
elif bpm <= 80:
    bpm_valence = max(0.2, 0.5 - (80 - bpm) / 100)
else:
    bpm_valence = 0.5
# Brightness contribution
brightness_valence = min(1.0, spectral_centroid * 1.5)
# Final weighted sum
result['valence'] = round(
key_valence * 0.4 +
bpm_valence * 0.25 +
brightness_valence * 0.2 +
energy * 0.15,
3
)
```
---
### Arousal (Energy/Intensity) - Lines 520-543
**Formula**:
```
arousal = bpm_arousal * 0.35
+ energy_arousal * 0.35
+ brightness_arousal * 0.15
+ compression_arousal * 0.15
```
**Components**:
| Component | Weight | Calculation | Rationale |
|-----------|--------|-------------|-----------|
| BPM Arousal | 35% | `(bpm - 60) / 140`, clamped to 0.1-0.9 | Fast = high energy |
| Energy | 35% | RMS energy (0-1) | Loud = intense |
| Brightness | 15% | `spectral_centroid * 1.2` | Bright = energetic |
| Compression | 15% | `1 - (dynamic_range / 20)` | Compressed = intense/modern |
**Code**:
```python
# BPM contribution: (bpm - 60) / 140, clamped to 0.1-0.9
bpm_arousal = min(0.9, max(0.1, (bpm - 60) / 140))
# Energy is direct intensity indicator
energy_arousal = energy
# Low dynamic range = compressed = more intense
compression_arousal = max(0, min(1.0, 1 - (dynamic_range / 20)))
# Brightness adds perceived energy
brightness_arousal = min(1.0, spectral_centroid * 1.2)
result['arousal'] = round(
bpm_arousal * 0.35 +
energy_arousal * 0.35 +
brightness_arousal * 0.15 +
compression_arousal * 0.15,
3
)
```
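
One consequence of the weights and clamps in the valence and arousal formulas is that neither score can reach 0 or 1. The sketch below re-implements both formulas as standalone functions (the names `heuristic_valence` and `heuristic_arousal` are assumptions for illustration; the analyzer writes into `result` instead) and evaluates them at extreme inputs. Under these formulas valence stays within 0.19-0.81 and arousal within 0.035-0.965, which is worth keeping in mind when choosing thresholds on these values.

```python
def heuristic_valence(scale, bpm, spectral_centroid, energy):
    """Standalone copy of the Standard-mode valence formula."""
    key_valence = 0.65 if scale == "major" else 0.35
    if bpm >= 120:
        bpm_valence = min(0.8, 0.5 + (bpm - 120) / 200)
    elif bpm <= 80:
        bpm_valence = max(0.2, 0.5 - (80 - bpm) / 100)
    else:
        bpm_valence = 0.5
    brightness_valence = min(1.0, spectral_centroid * 1.5)
    return round(
        key_valence * 0.4 + bpm_valence * 0.25
        + brightness_valence * 0.2 + energy * 0.15,
        3,
    )


def heuristic_arousal(bpm, spectral_centroid, energy, dynamic_range):
    """Standalone copy of the Standard-mode arousal formula."""
    bpm_arousal = min(0.9, max(0.1, (bpm - 60) / 140))
    compression_arousal = max(0, min(1.0, 1 - (dynamic_range / 20)))
    brightness_arousal = min(1.0, spectral_centroid * 1.2)
    return round(
        bpm_arousal * 0.35 + energy * 0.35
        + brightness_arousal * 0.15 + compression_arousal * 0.15,
        3,
    )


# Extreme inputs give the reachable range of each score.
valence_range = (
    heuristic_valence("minor", 40, 0.0, 0.0),   # lowest:  0.14 + 0.05
    heuristic_valence("major", 200, 1.0, 1.0),  # highest: 0.26 + 0.20 + 0.20 + 0.15
)
arousal_range = (
    heuristic_arousal(40, 0.0, 0.0, 30.0),      # lowest:  0.1 * 0.35
    heuristic_arousal(200, 1.0, 1.0, 0.0),      # highest: 0.315 + 0.35 + 0.15 + 0.15
)
```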
---
### Instrumentalness - Lines 545-563
**Approach**: Estimate likelihood of vocals vs instrumental based on spectral characteristics.
**Formula**:
```
instrumentalness = flatness_normalized * 0.6 + zcr_instrumental * 0.4
```
**Components**:
| Component | Weight | Calculation | Rationale |
|-----------|--------|-------------|-----------|
| Spectral Flatness | 60% | `(flatness + 40) / 40`, clamped to 0-1 | Noise-like (0 dB) → 1.0 (instrumental); tonal (-40 dB or lower) → 0.0 (vocals) |
| ZCR Pattern | 40% | Low (<0.05) = 0.7; High (>0.15) = 0.4 | Sustained tones = instrumental |
**Code**:
```python
# Spectral flatness: -40dB to 0dB → 0 to 1
flatness_normalized = min(1.0, max(0, (spectral_flatness + 40) / 40))
# ZCR patterns
if zcr < 0.05:
    zcr_instrumental = 0.7  # Sustained instrumental tones
elif zcr > 0.15:
    zcr_instrumental = 0.4  # Could be speech or percussion
else:
    zcr_instrumental = 0.5  # Uncertain
result['instrumentalness'] = round(
flatness_normalized * 0.6 + zcr_instrumental * 0.4,
3
)
```
---
### Acousticness - Lines 565-568
**Simple heuristic**: High dynamic range suggests acoustic recording (natural dynamics preserved).
```python
result['acousticness'] = round(min(1.0, dynamic_range / 12), 3)
```
| Dynamic Range | Acousticness | Interpretation |
|---------------|--------------|----------------|
| < 6 dB | < 0.5 | Heavily compressed (electronic/pop) |
| 6-12 dB | 0.5-1.0 | Moderate (mixed) |
| > 12 dB | 1.0 | High dynamic range (acoustic/classical) |
---
### Speechiness - Lines 570-575
**Approach**: Speech has characteristic ZCR + spectral centroid patterns.
```python
if 0.08 < zcr < 0.2 and 0.1 < spectral_centroid < 0.4:
    result['speechiness'] = round(min(0.5, zcr * 3), 3)
else:
    result['speechiness'] = 0.1
```
| Condition | Result |
|-----------|--------|
| ZCR 0.08-0.2 AND centroid 0.1-0.4 | Speech-like (up to 0.5) |
| Outside range | Low speechiness (0.1) |
---
## Mood Tag Generation (Lines 581-660)
Tags are derived from computed features:
| Condition | Tags Added |
|-----------|------------|
| `arousal >= 0.7` | energetic, upbeat |
| `arousal <= 0.3` | calm, peaceful |
| `valence >= 0.7` | happy, uplifting |
| `valence <= 0.3` | sad, melancholic |
| `danceability >= 0.7` | dance, groovy |
| `bpm >= 140` | fast |
| `bpm <= 80` | slow |
| `keyScale == 'minor'` (and not happy) | moody |
| `arousal >= 0.7 AND bpm >= 120` | workout |
| `arousal <= 0.4 AND valence <= 0.4` | atmospheric |
| `arousal <= 0.3 AND bpm <= 90` | chill |
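
The table translates fairly directly into code. The following is a minimal sketch, not the implementation at lines 581-660: the function name `derive_mood_tags` is hypothetical, the tag order simply follows the table, and "not happy" is interpreted here as `valence < 0.7` (an assumption based on the "happy" threshold in the same table).

```python
def derive_mood_tags(arousal, valence, danceability, bpm, key_scale):
    """Apply the rule table above and return the tags in table order."""
    tags = []
    if arousal >= 0.7:
        tags += ["energetic", "upbeat"]
    if arousal <= 0.3:
        tags += ["calm", "peaceful"]
    if valence >= 0.7:
        tags += ["happy", "uplifting"]
    if valence <= 0.3:
        tags += ["sad", "melancholic"]
    if danceability >= 0.7:
        tags += ["dance", "groovy"]
    if bpm >= 140:
        tags.append("fast")
    if bpm <= 80:
        tags.append("slow")
    # Assumption: "not happy" means the happy rule above did not fire.
    if key_scale == "minor" and valence < 0.7:
        tags.append("moody")
    if arousal >= 0.7 and bpm >= 120:
        tags.append("workout")
    if arousal <= 0.4 and valence <= 0.4:
        tags.append("atmospheric")
    if arousal <= 0.3 and bpm <= 90:
        tags.append("chill")
    return tags


# A slow, quiet track in a major key.
quiet_tags = derive_mood_tags(
    arousal=0.25, valence=0.5, danceability=0.3, bpm=75, key_scale="major"
)
```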
---
## Output Schema
```typescript
interface AnalysisResult {
// Basic features
bpm: number; // 60-200 typical
beatsCount: number; // Total beat count
key: string; // "C", "D#", etc.
keyScale: string; // "major" or "minor"
keyStrength: number; // 0-1 confidence
// Energy metrics
energy: number; // 0-1 (RMS-based)
loudness: number; // dB
dynamicRange: number; // dB
// Heuristic estimates
danceability: number; // 0-1
valence: number; // 0-1 (happiness)
arousal: number; // 0-1 (energy)
instrumentalness: number; // 0-1
acousticness: number; // 0-1
speechiness: number; // 0-1
// Derived
moodTags: string[]; // ["calm", "peaceful", "chill"]
analysisMode: "standard"; // Always "standard" for this mode
}
```
---
## Database Update (Lines 766-822)
All features are persisted to the `Track` table:
```sql
UPDATE "Track"
SET
bpm = %s,
"beatsCount" = %s,
key = %s,
"keyScale" = %s,
"keyStrength" = %s,
energy = %s,
loudness = %s,
"dynamicRange" = %s,
danceability = %s,
valence = %s,
arousal = %s,
instrumentalness = %s,
acousticness = %s,
speechiness = %s,
"moodTags" = %s,
"analysisMode" = 'standard',
"analysisStatus" = 'completed',
"analysisVersion" = %s,
"analyzedAt" = %s
WHERE id = %s
```
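
Because the statement uses positional `%s` placeholders, the parameter tuple has to follow the column order exactly: 15 analysis values, then the version, the timestamp, and finally the track ID for the `WHERE` clause. A minimal sketch of building that tuple is shown below; `build_update_params` and `RESULT_COLUMNS` are hypothetical names, and the UTC timestamp default is an assumption about how `analyzedAt` is populated. The tuple would then be passed as the second argument to a psycopg2 `cursor.execute()` call alongside the SQL above.

```python
from datetime import datetime, timezone

# Result keys in the same order as the first 15 %s placeholders.
RESULT_COLUMNS = [
    "bpm", "beatsCount", "key", "keyScale", "keyStrength",
    "energy", "loudness", "dynamicRange",
    "danceability", "valence", "arousal",
    "instrumentalness", "acousticness", "speechiness",
    "moodTags",
]


def build_update_params(result, track_id, analysis_version, analyzed_at=None):
    """Return the 18 positional parameters for the UPDATE statement."""
    if analyzed_at is None:
        analyzed_at = datetime.now(timezone.utc)  # Assumed timestamp source
    values = [result.get(column) for column in RESULT_COLUMNS]  # Missing keys become NULL
    return tuple(values + [analysis_version, analyzed_at, track_id])
```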
---
## Known Limitations
### Standard Mode vs ML Models
| Aspect | Standard Mode | Enhanced Mode (ML) |
|--------|--------------|-------------------|
| Valence accuracy | ~60% correlation | ~85% correlation |
| Arousal accuracy | ~65% correlation | ~88% correlation |
| Mood detection | Rule-based | Neural network |
| Processing speed | Fast (~1-2 sec) | Slower (~5-10 sec) |
| Dependencies | Essentia only | Essentia + TensorFlow |
### Edge Cases
1. **Ambient music**: Low BPM detection reliability
2. **Classical**: Variable tempo causes BPM averaging issues
3. **Spoken word**: May be misclassified as low-energy music
4. **Electronic/EDM**: Compression detection may overestimate arousal
---
## Dependencies
```
# requirements.txt
essentia==2.1b6.dev1110
essentia-tensorflow==2.1b6.dev1110
numpy>=1.21.0,<2.0.0
tensorflow==2.15.0
redis>=4.5.0
psycopg2-binary>=2.9.0
```
---
## Testing
Run single file analysis:
```bash
docker exec lidify_audio_analyzer python3 analyzer.py --test /music/path/to/song.mp3
```
Example output:
```json
{
"bpm": 128.5,
"beatsCount": 256,
"key": "C",
"keyScale": "minor",
"keyStrength": 0.723,
"energy": 0.65,
"loudness": -8.2,
"dynamicRange": 7.5,
"danceability": 0.72,
"valence": 0.42,
"arousal": 0.68,
"instrumentalness": 0.35,
"acousticness": 0.625,
"speechiness": 0.1,
"moodTags": ["energetic", "upbeat", "moody", "dance"],
"analysisMode": "standard"
}
```
---
## Related Files
- `services/audio-analyzer/Dockerfile` - Container build
- `backend/src/services/vibeMatching.ts` - Uses these features for song matching
- `prisma/schema.prisma` - Track table schema with analysis columns


@@ -0,0 +1,107 @@
# Curated Vibe Mixes Implementation
## Overview
This update adds **19 new curated vibe mixes** and a **Mood-on-Demand** feature that allows users to generate custom mixes based on audio features.
## Bug Fix
Fixed the `genres` field bug: the Album model uses `genres` (a JSON array), not `genre` (a string). Added a helper function, `findTracksByGenrePatterns()`, that queries in two steps:
1. The track's `lastfmTags` and `essentiaGenres` (native `String[]` fields)
2. As a fallback, the `album.genres` JSON array, filtered in application code
## New Daily Vibe Mixes (10 tracks each)
| Mix Name | Description | Key Audio Features |
|----------|-------------|-------------------|
| **Sad Girl Sundays** | Melancholic introspection | valence < 0.35, minor key, arousal < 0.4 |
| **Main Character Energy** | You're the protagonist ✨ | valence > 0.55, energy > 0.55, danceability > 0.5 |
| **Villain Era** | Dark & empowering 😈 | minor key, energy > 0.65, aggressive tags |
| **3AM Thoughts** | Late night overthinking 🌙 | arousal < 0.35, energy < 0.45, valence < 0.45 |
| **Hot Girl Walk** | Confident cardio 💅 | danceability > 0.65, BPM 95-135, energy > 0.55 |
| **Rage Cleaning** | Aggressive productivity 🔥 | energy > 0.75, arousal > 0.65, BPM > 125 |
| **Golden Hour** | Warm sunset vibes 🌅 | valence > 0.45, acousticness > 0.35, energy 0.25-0.65 |
| **Shower Karaoke** | Belters you can't help but sing 🚿 | instrumentalness < 0.35, energy > 0.55, valence > 0.45 |
| **In My Feelings** | Let it all out 💔 | valence < 0.4, arousal < 0.55, acousticness > 0.25 |
| **Midnight Drive** | Late night cruising 🚗 | energy 0.35-0.65, arousal 0.25-0.55, BPM 85-125 |
| **Coffee Shop Vibes** | Cozy background ☕ | acousticness > 0.4, energy 0.15-0.55 |
| **Romanticize Your Life** | Aesthetic moments 🎬 | valence 0.35-0.75, arousal 0.25-0.65, acousticness > 0.25 |
| **That Girl Era** | Self-improvement mode 💪 | valence > 0.55, energy > 0.45, danceability > 0.45 |
| **Unhinged** | Embrace the chaos 🎪 | Extreme features (high or low everything) |
## New Weekly Curated Mixes (20 tracks each)
| Mix Name | Description | Algorithm |
|----------|-------------|-----------|
| **Deep Cuts** | Hidden gems 💎 | Tracks with zero or few plays |
| **Key Journey** | Harmonic progression 🎹 | Ordered by circle of fifths |
| **Tempo Flow** | Energy arc 📈 | slow → fast → slow BPM journey |
| **Vocal Detox** | Instrumental escape 🧘 | instrumentalness > 0.75 |
| **Minor Key Mondays** | All minor key bangers 🖤 | keyScale = 'minor', energy > 0.45 |
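
Two of these orderings are easy to illustrate. The sketch below is written in Python to match the analyzer examples (the backend itself is TypeScript) and is only one possible interpretation: `tempo_arc` and `order_by_circle_of_fifths` are hypothetical names, tracks are plain dicts, and the key ordering ignores major/minor mode for simplicity.

```python
# Circle of fifths, spelled with sharps to match the analyzer's key names.
CIRCLE_OF_FIFTHS = ["C", "G", "D", "A", "E", "B", "F#", "C#", "G#", "D#", "A#", "F"]


def order_by_circle_of_fifths(tracks):
    """Sort tracks so neighbouring keys are a fifth apart (unknown keys last)."""
    position = {key: index for index, key in enumerate(CIRCLE_OF_FIFTHS)}
    return sorted(tracks, key=lambda t: position.get(t["key"], len(CIRCLE_OF_FIFTHS)))


def tempo_arc(tracks):
    """Arrange tracks slow -> fast -> slow.

    Sort by BPM, send every other track up the ramp, and play the
    remaining ones in descending order on the way back down.
    """
    by_bpm = sorted(tracks, key=lambda t: t["bpm"])
    rising = by_bpm[0::2]
    falling = by_bpm[1::2][::-1]
    return rising + falling


arc = tempo_arc([{"bpm": b} for b in (150, 70, 110, 90, 130)])
arc_bpms = [t["bpm"] for t in arc]
```

For the five BPM values above the arc climbs 70, 110, 150 and then descends 130, 90.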
## Mood-on-Demand Feature
### Backend Endpoints
- `POST /api/mixes/mood` - Generate a custom mix based on audio parameters
- `GET /api/mixes/mood/presets` - Get available mood presets for the UI
### Preset Moods (12 total)
1. 😊 Happy & Upbeat
2. 😢 Melancholic
3. 😌 Chill & Relaxed
4. ⚡ High Energy
5. 🎯 Focus Mode
6. 💃 Dance Party
7. 🎸 Acoustic Vibes
8. 🖤 Dark & Moody
9. 💕 Romantic
10. 💪 Workout Beast
11. 😴 Sleep & Unwind
12. 👑 Confidence Boost
### Custom Mix Builder
Users can adjust sliders for:
- Happiness (valence)
- Energy
- Danceability
- Tempo (BPM)
## Frontend Changes
### New Component: `MoodMixer.tsx`
A beautiful Spotify-esque modal with:
- Gradient preset cards with emojis
- Smooth animations (Framer Motion)
- Custom range slider controls
- Dark theme matching the app aesthetic
### Homepage Integration
Added "Mood Mixer" button next to the "Refresh" button in the "Made For You" section.
## Files Modified
### Backend
- `backend/src/services/programmaticPlaylists.ts` - Added helper function, fixed 12 genre bugs, added 19 new mix generators
- `backend/src/routes/mixes.ts` - Added mood endpoints and presets
### Frontend
- `frontend/lib/api.ts` - Added types and API methods for mood mixing
- `frontend/app/page.tsx` - Integrated MoodMixer modal
- `frontend/components/MoodMixer.tsx` - New component (created)
## Technical Notes
- All mixes use Essentia audio analysis data (valence, energy, danceability, BPM, key, etc.)
- Fallback to Last.fm tags when audio analysis is insufficient
- Daily mixes: 10 tracks, refreshed daily
- Weekly mixes: 20 tracks, for longer listening sessions
- Mix generation is cached in Redis for performance


@@ -0,0 +1,798 @@
# Modified Files for Review
> **Last Updated:** December 16, 2025
> **Features:** Spotify Import + UI Overhaul (Activity Panel, Carousels, Notifications, Playlist/Mix/Discover Redesign, Settings Page Redesign)
## Overview
This document tracks all files created or modified as part of:
1. **Spotify Import Feature** - Import Spotify playlists, match tracks, download albums
2. **UI Overhaul** - Activity Panel, horizontal carousels, notification system
---
## Backend - New Files
| File | Purpose |
| --------------------------------------------- | --------------------------------------------------------------- |
| `backend/src/services/notificationService.ts` | Notification CRUD service with convenience methods |
| `backend/src/services/spotifyImport.ts` | Spotify playlist import logic, track matching, album resolution |
| `backend/src/services/spotify.ts` | Spotify API/scraping service (embed data extraction) |
| `backend/src/routes/notifications.ts` | Notification & download history API endpoints |
| `backend/src/routes/spotify.ts` | Spotify import API endpoints |
| `backend/src/utils/playlistLogger.ts` | Debug logger for Spotify import jobs |
## Backend - Modified Files
| File | Changes |
| ----------------------------------------------- | --------------------------------------------------------------------- |
| `backend/prisma/schema.prisma` | Added `Notification` model, `DownloadJob.cleared` field |
| `backend/src/services/simpleDownloadManager.ts` | Added notification integration, failure deduplication |
| `backend/src/services/lidarr.ts` | Smart `anyReleaseOk` fallback, MusicBrainz fallback for artist lookup |
| `backend/src/services/musicbrainz.ts` | Recording filtering, scoring system, title normalization |
| `backend/src/services/spotify.ts` | Embed scraping improvements, debug logging |
| `backend/src/index.ts` | Registered notification routes |
---
## Frontend - New Files
| File | Purpose |
| ----------------------------------------------------- | ----------------------------------------------------- |
| `frontend/components/layout/ActivityPanel.tsx` | Collapsible 3rd column with tabs, PWA install button |
| `frontend/components/activity/NotificationsTab.tsx` | System notifications list |
| `frontend/components/activity/ActiveDownloadsTab.tsx` | Currently downloading items |
| `frontend/components/activity/HistoryTab.tsx` | Completed/failed with retry |
| `frontend/components/ui/HorizontalCarousel.tsx` | Reusable carousel with arrows |
| `frontend/hooks/useActivityPanel.ts` | Panel state management |
| `frontend/app/import/spotify/page.tsx` | Spotify import UI page (preview, selection, progress) |
## Frontend - Modified Files
| File | Changes |
| ------------------------------------------------------------- | ------------------------------------------------------ |
| `frontend/components/layout/AuthenticatedLayout.tsx` | Added 3rd column, event listener for toggle |
| `frontend/components/layout/TopBar.tsx` | Added `ActivityPanelToggle` button |
| `frontend/components/MixCard.tsx` | Reduced padding/sizing (`p-4` → `p-2.5`) |
| `frontend/features/home/components/ArtistsGrid.tsx` | Uses `HorizontalCarousel` |
| `frontend/features/home/components/MixesGrid.tsx` | Uses `HorizontalCarousel` |
| `frontend/features/home/components/ContinueListening.tsx` | Uses `HorizontalCarousel` |
| `frontend/features/home/components/PodcastsGrid.tsx` | Uses `HorizontalCarousel` |
| `frontend/features/home/components/HomeHero.tsx` | Already optimized (compact greeting) |
| `frontend/lib/api.ts` | Added notification API methods, Spotify import methods |
| `frontend/app/playlists/page.tsx` | Added "Import from Spotify" button/link |
| `frontend/app/playlist/[id]/page.tsx` | Full Spotify-style redesign (see below) |
| `frontend/app/mix/[id]/page.tsx` | Full Spotify-style redesign (matches playlist page) |
| `frontend/app/discover/page.tsx` | Updated to use consistent container widths |
| `frontend/features/discover/components/DiscoverHero.tsx` | Redesigned to match playlist/mix hero style |
| `frontend/features/discover/components/DiscoverActionBar.tsx` | Redesigned with Lidify yellow play button |
| `frontend/features/discover/components/TrackList.tsx` | Redesigned to match playlist/mix track listing |
| `frontend/components/layout/Sidebar.tsx` | Removed unused icon imports |
---
## Database Changes
```prisma
// NEW MODEL
model Notification {
id String @id @default(cuid())
userId String
type String // system, download_complete, playlist_ready, error, import_complete
title String
message String?
metadata Json? // { playlistId, albumId, artistId, etc. }
read Boolean @default(false)
cleared Boolean @default(false)
createdAt DateTime @default(now())
user User @relation(fields: [userId], references: [id], onDelete: Cascade)
@@index([userId, cleared])
@@index([userId, read])
@@index([createdAt])
}
// MODIFIED MODEL - DownloadJob
model DownloadJob {
// ... existing fields ...
cleared Boolean @default(false) // NEW: User dismissed from history
}
```
**Migration Applied:** `npx prisma db push`
---
## API Endpoints
### Notifications (`/api/notifications`)
| Method | Endpoint | Description |
| ------ | ------------------------------------ | ---------------------------- |
| GET | `/notifications` | List uncleared notifications |
| GET | `/notifications/unread-count` | Get unread count |
| POST | `/notifications/:id/read` | Mark as read |
| POST | `/notifications/read-all` | Mark all as read |
| POST | `/notifications/:id/clear` | Clear (dismiss) notification |
| POST | `/notifications/clear-all` | Clear all notifications |
| GET | `/notifications/downloads/active` | Active downloads |
| GET | `/notifications/downloads/history` | Completed/failed downloads |
| POST | `/notifications/downloads/:id/clear` | Clear from history |
| POST | `/notifications/downloads/clear-all` | Clear all history |
| POST | `/notifications/downloads/:id/retry` | Retry failed download |
### Spotify Import (`/api/spotify`)
| Method | Endpoint | Description |
| ------ | ---------------------------- | -------------------------------- |
| POST | `/spotify/import/preview` | Generate import preview from URL |
| POST | `/spotify/import/start` | Start import with selections |
| GET | `/spotify/import/:id/status` | Get import job status |
---
## Key Bug Fixes
### 1. Track Matching (Spotify Import)
- **File:** `backend/src/services/spotifyImport.ts`
- **Fix:** Added `stripTrackSuffix()` to remove "- 2011 Remaster" etc. while keeping punctuation
- **Fix:** Added Unicode normalization for artist names (Röyksopp → Royksopp)
- **Fix:** Multiple matching strategies (exact → stripped → fuzzy)
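
The three matching strategies can be sketched as a small cascade. This is an illustration in Python (the real `spotifyImport.ts` is TypeScript), and the details are assumptions: the suffix pattern covers only a few example markers, `match_title` and the other helper names are hypothetical, and the 0.85 fuzzy threshold is an arbitrary choice for the example.

```python
import re
import unicodedata
from difflib import SequenceMatcher

# Hypothetical suffix list; the real stripTrackSuffix() may cover other cases.
_SUFFIX = re.compile(
    r"\s+-\s+(\d{4}\s+)?(remaster(ed)?|single version|radio edit)\b.*$",
    re.IGNORECASE,
)


def strip_track_suffix(title):
    """Drop trailing edition markers such as ' - 2011 Remaster'."""
    return _SUFFIX.sub("", title).strip()


def normalize(text):
    """Case-fold and strip diacritics (Röyksopp -> royksopp)."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_marks.casefold().strip()


def match_title(wanted, candidates, fuzzy_threshold=0.85):
    """Return (candidate, strategy), trying exact -> stripped -> fuzzy."""
    target = normalize(wanted)
    for candidate in candidates:
        if normalize(candidate) == target:
            return candidate, "exact"

    stripped_target = normalize(strip_track_suffix(wanted))
    for candidate in candidates:
        if normalize(strip_track_suffix(candidate)) == stripped_target:
            return candidate, "stripped"

    best, best_score = None, 0.0
    for candidate in candidates:
        score = SequenceMatcher(
            None, stripped_target, normalize(strip_track_suffix(candidate))
        ).ratio()
        if score > best_score:
            best, best_score = candidate, score
    if best is not None and best_score >= fuzzy_threshold:
        return best, "fuzzy"
    return None, None
```

Returning the strategy name alongside the match makes it easy to log how each track was resolved, which helps when tuning the fuzzy threshold.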
### 2. MusicBrainz Album Resolution
- **File:** `backend/src/services/musicbrainz.ts`
- **Fix:** Score threshold > 50 for studio albums
- **Fix:** Recording filtering (exclude live/demo/acoustic)
- **Fix:** Soundtrack penalty in scoring
### 3. Lidarr Album Addition
- **File:** `backend/src/services/lidarr.ts`
- **Fix:** Smart `anyReleaseOk` fallback (try strict first, then loosen)
- **Fix:** MusicBrainz fallback when Lidarr's metadata server fails
- **Fix:** Immediate error when no releases found
### 4. Multiple Failure Notifications
- **File:** `backend/src/services/simpleDownloadManager.ts`
- **Fix:** 30-second deduplication window for failure events
- **Fix:** Only notify on final exhaustion, not each retry
- **Fix:** Skip notifications for discovery/import batches
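
A 30-second deduplication window can be sketched as below. This is a Python illustration of the idea only (the service itself is TypeScript): `FailureDeduplicator` is a hypothetical name, events are keyed by an arbitrary string such as a job ID, and the window is measured from the last notification that was actually sent, which is one of several reasonable choices. The clock is injectable so the behaviour can be checked without waiting.

```python
import time


class FailureDeduplicator:
    """Suppress repeat failure notifications for the same key within a window."""

    def __init__(self, window_seconds=30.0, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.clock = clock
        self._last_sent = {}  # key -> time of the last notification sent

    def should_notify(self, key):
        """Return True if a notification for `key` should be sent now."""
        now = self.clock()
        last = self._last_sent.get(key)
        if last is not None and now - last < self.window_seconds:
            return False  # Duplicate inside the window: stay quiet
        self._last_sent[key] = now
        return True
```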
---
## Testing Checklist
### Activity Panel
- [ ] Panel opens/closes from TopBar button
- [ ] Panel state persists in localStorage
- [ ] Notifications tab shows system messages
- [ ] Active tab shows downloading items (refreshes every 5s)
- [ ] History tab shows completed/failed
- [ ] Retry button works for failed downloads
- [ ] Clear buttons work
### Home Page Carousels
- [ ] Horizontal scroll works
- [ ] Arrow buttons appear on hover (desktop)
- [ ] Snap behavior works
- [ ] Card sizing is compact
### Spotify Import
- [ ] Preview generation works
- [ ] Album selection works
- [ ] Downloads start correctly
- [ ] Track matching works after downloads
- [ ] Playlist is created with matched tracks
- [ ] Notification appears when complete
### Notifications
- [ ] Download complete creates notification
- [ ] Download failed creates notification (only on exhaustion)
- [ ] Spotify import complete creates notification
- [ ] Unread badge shows count
- [ ] Mark as read works
- [ ] Clear works
### Playlist Page
- [ ] Hero section is compact with bottom-aligned content
- [ ] Shuffle button randomizes and plays tracks
- [ ] Track listing spans full width (no container)
- [ ] Currently playing track is highlighted
- [ ] Track numbers become play icons on hover
- [ ] Album column hidden on mobile
### PWA Install
- [ ] "Install App" button appears in Activity Panel (when installable)
- [ ] Button triggers browser install prompt
- [ ] Button disappears after installation
---
## Rollback Instructions
If issues arise, revert these files:
```bash
# Core files to revert for UI changes
git checkout HEAD~1 -- frontend/components/layout/AuthenticatedLayout.tsx
git checkout HEAD~1 -- frontend/components/layout/TopBar.tsx
git checkout HEAD~1 -- frontend/components/layout/ActivityPanel.tsx
git checkout HEAD~1 -- frontend/components/activity/
# For Spotify import issues
git checkout HEAD~1 -- backend/src/services/spotifyImport.ts
git checkout HEAD~1 -- backend/src/services/musicbrainz.ts
git checkout HEAD~1 -- backend/src/services/lidarr.ts
# Database rollback (if needed)
# Remove Notification model and DownloadJob.cleared from schema
npx prisma db push
```
---
## Notes
- The old `DownloadNotifications.tsx` (floating modal) still exists but is no longer imported in the layout
- All grid components were already converted to carousels prior to this session
- The Spotify import flow uses `lidarrService.addAlbum()` directly instead of `simpleDownloadManager` to avoid same-artist fallback
## Playlist Page Redesign
**File:** `frontend/app/playlist/[id]/page.tsx`
### Changes Made
1. **Fixed React Hooks Error** - Moved `totalDuration` useMemo before early returns
2. **Full-Width Track Listing** - Removed container wrapper, tracks span full panel width like Spotify
3. **Compact Hero Section** - Smaller cover art (140px/192px), bottom-aligned content, reduced title size
4. **Added Shuffle Button** - Shuffles and plays all tracks in random order
5. **Grid-Based Track Layout** - Columns: #, Title/Artist, Album, Duration (responsive)
6. **Track Hover States** - Number becomes play icon on hover, row highlights
### PWA Install in Activity Panel
**File:** `frontend/components/layout/ActivityPanel.tsx`
- Added `beforeinstallprompt` event listener
- "Install App" button appears at bottom of panel when PWA can be installed
- Hides automatically when app is already installed or running in standalone mode
### Sidebar Cleanup
**File:** `frontend/components/layout/Sidebar.tsx`
- Removed unused icon imports (Home, Library, Sparkles, Book, Mic2)
- Navigation items use text-only (no icons) - matching minimalist design
### Playlists Page Redesign
**File:** `frontend/app/playlists/page.tsx`
**Before → After:**
| Element | Before | After |
| ---------------- | --------------------------------- | -------------------------------------- |
| Header title | `text-3xl md:text-4xl font-black` | `text-2xl font-bold` |
| Header padding | `px-6 md:px-8 py-6 md:py-8` | `px-6 pt-6 pb-4` |
| Gradient overlay | Yellow gradient at top | Removed |
| Import button | Green outline with icon | Solid green `bg-[#1DB954]`, no icon |
| Hidden toggle | Icon + text, bordered | Text only, minimal style |
| Card wrapper | `<Card>` component | Simple `<div>` with `hover:bg-white/5` |
| Card padding | `p-4` (via Card) | `p-3` |
| Play button | `w-12 h-12` | `w-10 h-10` |
| Empty state | `<EmptyState>` with icons | Simple centered div |
| Shared badge | Purple badge | Shown in subtitle instead |
| Track count | "tracks" | "songs" (matches Spotify) |
**Design Philosophy:**
- Remove decorative icons where text suffices
- Reduce spacing for tighter, professional feel
- Use native hover states instead of custom components
- Minimal color - let content speak
- Match Spotify's terminology
---
## Spotify-Style Design Patterns
> **Use these patterns consistently across all pages for a cohesive look.**
### 1. Hero Sections (Albums, Playlists, Artists)
```
- Compact height (max ~180px for cover on desktop)
- Content bottom-aligned to the cover art
- Title: text-2xl md:text-3xl font-bold (NOT text-4xl+)
- Subtitle info: text-sm text-gray-400
- Reduced vertical spacing (gap-2 to gap-4 max)
- No decorative gradients overlaying the hero
```
### 2. Track Listings
```
- Full-width, no container card wrapping
- Grid layout: [#] [Title/Artist] [Album] [Duration]
- Album column: hidden on mobile (md:grid-cols-[16px_1fr_1fr_60px])
- Hover: row bg-white/5, number → play icon
- Playing indicator: Lidify yellow (#ecb200) on track number
- Compact row height (~56px)
```
### 3. Page Headers
```
- Title: text-2xl font-bold (not text-3xl+)
- Subtitle: text-sm text-gray-400
- Actions: rounded-full buttons with minimal icons
- No excessive padding (px-6 py-4 is enough)
```
### 4. Cards (Albums, Artists, Playlists)
```
- Compact padding: p-2.5 (not p-4)
- Title: text-sm font-medium truncate
- Subtitle: text-xs text-gray-500
- Play button: bottom-right, shows on hover
```
### 5. Grids → Carousels
```
- Use HorizontalCarousel for content rows
- Single horizontal line, scroll/swipe
- Arrow buttons on hover (desktop)
- Snap behavior for smooth scrolling
```
### 6. General Typography
```
- Section headers: text-lg font-semibold (not text-xl)
- Greeting (home): text-2xl md:text-3xl font-bold tracking-tight
- No ALL CAPS unless absolutely necessary
- Muted subtitles: text-gray-400 or text-gray-500
```
### 7. Buttons & Actions
```
- Primary action: rounded-full, bg-[#ecb200] text-black
- Secondary: bg-white/10 hover:bg-white/20
- Icon-only buttons: rounded-full p-2
- Minimal icon usage - text labels preferred
```
### 8. Spacing Philosophy
```
- Tight but breathable
- Section gaps: gap-6 (not gap-8 or gap-10)
- Card grids: gap-4
- Hero to content: pt-6 (not pt-10)
```
---
## Post-Implementation Fixes
| Date | File | Issue | Fix |
| ---------- | --------------------------------------------------------------- | ----------------------------------------------------- | -------------------------------------------------------------------------------------- |
| 2025-12-15 | `backend/src/routes/notifications.ts` | Wrong import path `../db` | Changed to `../utils/db` |
| 2025-12-15 | `frontend/app/playlist/[id]/page.tsx` | React hooks order violation | Moved `useMemo` before early returns |
| 2025-12-15 | `frontend/app/playlist/[id]/page.tsx` | `useAuth` not defined | Removed unused `isAuthenticated` |
| 2025-12-15 | `frontend/components/layout/ActivityPanel.tsx` | Badge not clearing after clear all | Added `notifications-changed` event listener |
| 2025-12-15 | `frontend/components/activity/NotificationsTab.tsx` | Badge not updating | Dispatch `notifications-changed` event on mutations |
| 2025-12-15 | `backend/src/services/spotifyImport.ts` | Track matching failing (apostrophes, artist matching) | Added `normalizeApostrophes()`, changed artist match to use `contains` with first word |
| 2025-12-15 | `frontend/app/playlists/page.tsx` | Page design not matching Spotify style | Full redesign: compact header, cleaner cards, minimal icons, refined typography |
| 2025-12-15 | `frontend/app/import/spotify/page.tsx` | Using Music2 icon instead of Spotify logo | Uses SpotIcon.png, cleaner layout, matches style guide, removed heavy Card components |
| 2025-12-15 | `frontend/app/import/spotify/page.tsx` | Grey/transparent gradient not matching brand | Added yellow-to-purple gradient (same as home page) with quick fade ratio (35vh/25vh) |
| 2025-12-15 | `frontend/app/discover/page.tsx` | Container width inconsistent with hero | Added `max-w-7xl mx-auto` to track listing section |
| 2025-12-15 | `frontend/app/mix/[id]/page.tsx` | Container width inconsistent with hero | Added `max-w-7xl mx-auto` to track listing section |
| 2025-12-15 | `frontend/app/playlist/[id]/page.tsx` | Container width inconsistent with hero | Added `max-w-7xl mx-auto` to track listing section |
| 2025-12-15 | `frontend/features/discover/components/*` | Discover page not matching playlist/mix design | Redesigned DiscoverHero, DiscoverActionBar, TrackList to match Spotify style |
| 2025-12-15 | `frontend/app/library/page.tsx` | Container width not matching other pages | Removed `max-w-7xl mx-auto`, now full-width with `px-4 md:px-8` |
| 2025-12-15 | `frontend/features/library/components/LibraryHeader.tsx` | Container width not matching other pages | Removed `max-w-7xl mx-auto`, now full-width with `px-4 md:px-8` |
| 2025-12-15 | `frontend/app/podcasts/page.tsx` | Container width + card styling not matching | Removed `max-w-7xl mx-auto`, cleaner cards without borders/gradients |
| 2025-12-15 | `frontend/app/audiobooks/page.tsx` | Container width not matching other pages | Removed `max-w-7xl mx-auto`, smaller header text, consistent with Spotify style |
| 2025-12-15 | `frontend/app/artist/[id]/page.tsx` | Container width not matching other pages | Removed `max-w-7xl mx-auto`, now full-width with `px-4 md:px-8` |
| 2025-12-15 | `frontend/app/album/[id]/page.tsx` | Container width not matching other pages | Removed `max-w-7xl mx-auto`, now full-width with `px-4 md:px-8` |
| 2025-12-15 | `frontend/features/artist/components/ArtistHero.tsx` | Hero not matching Spotify style | Compact hero, full-width, bottom-aligned content, kept VibrantJS gradients |
| 2025-12-15 | `frontend/features/artist/components/ArtistActionBar.tsx` | Action bar too heavy | Simplified to play button + shuffle + download, matching playlist style |
| 2025-12-15 | `frontend/features/artist/components/PopularTracks.tsx` | Track list not matching new style | Removed Card wrapper, grid-based layout, cleaner typography |
| 2025-12-15 | `frontend/features/artist/components/Discography.tsx` | Section header too large | Changed header from `text-2xl md:text-3xl` to `text-xl` |
| 2025-12-15 | `frontend/features/artist/components/AvailableAlbums.tsx` | Section headers too large | Changed headers to `text-xl font-bold mb-4`, renamed sections |
| 2025-12-15 | `frontend/features/artist/components/SimilarArtists.tsx` | Cards not matching new style | Cleaner cards with transparent bg, smaller header |
| 2025-12-15 | `frontend/features/artist/components/ArtistBio.tsx` | Using Card component | Replaced Card with simple `bg-white/5` div |
| 2025-12-15 | `frontend/features/album/components/AlbumHero.tsx` | Hero not matching Spotify style | Compact hero, full-width, bottom-aligned content, kept VibrantJS gradients |
| 2025-12-15 | `frontend/features/album/components/AlbumActionBar.tsx` | Action bar too heavy | Simplified to play + shuffle + add to playlist, matching playlist style |
| 2025-12-15 | `frontend/features/album/components/SimilarAlbums.tsx` | Section header too large | Changed header to `text-xl font-bold mb-4` |
| 2025-12-15 | `frontend/app/artist/[id]/page.tsx` | Artist bio/about not showing | Now uses `artist.bio \|\| artist.summary` for library artists with `summary` field |
| 2025-12-15 | `frontend/features/artist/components/ArtistBio.tsx` | Read more link not brand color | Added `[&_a]:text-[#ecb200]` for Lidify yellow links |
| 2025-12-15 | `frontend/app/audiobooks/[id]/page.tsx` | Page design not matching Spotify style | Compact hero, yellow play button, integrated action bar, full-width layout |
| 2025-12-15 | `frontend/features/audiobook/components/AudiobookHero.tsx` | Hero too large and dated | Compact Spotify-style hero with bottom-aligned content, VibrantJS gradients preserved |
| 2025-12-15 | `frontend/features/audiobook/components/AudiobookActionBar.tsx` | Action bar not matching other pages | Yellow play button, inline progress, subtle action icons |
| 2025-12-15 | `frontend/app/podcasts/[id]/page.tsx` | Page design not matching Spotify style | Compact hero, fixed height gradient (25vh), full-width layout |
| 2025-12-15 | `frontend/features/podcast/components/PodcastHero.tsx` | Hero too large and dated | Compact Spotify-style hero with bottom-aligned content, VibrantJS gradients preserved |
| 2025-12-15 | `frontend/features/podcast/components/PodcastActionBar.tsx` | Action bar too heavy | Yellow subscribe button, subtle RSS link, cleaner remove confirmation |
| 2025-12-15 | `frontend/features/podcast/components/ContinueListening.tsx` | Cards not matching new style | Yellow play button, cleaner progress bar, simpler prev/next episodes |
| 2025-12-15 | `frontend/features/podcast/components/EpisodeList.tsx` | Episode list not matching new style | Removed Card wrapper, yellow highlights, cleaner typography |
| 2025-12-15 | `frontend/features/podcast/components/SimilarPodcasts.tsx` | Cards not matching new style | Transparent bg with hover, smaller header, cleaner layout |
| 2025-12-15 | `frontend/features/podcast/components/PreviewEpisodes.tsx` | Cards not matching new style | Removed Card wrappers, yellow subscribe button, cleaner About section |
---
## Settings Page Redesign (December 16, 2025)
### Overview
Complete redesign of the settings page to match Spotify's clean, minimal aesthetic with:
- **Sidebar navigation** - Fixed sidebar with section links, active state tracking via intersection observer
- **Single scrollable page** - All sections on one page instead of tabs
- **Unified Spotify section** - Combined OAuth user connection + Developer API credentials
- **Spotify-style design patterns** - Row-based layouts, clean toggles, minimal borders
### Database Changes
```prisma
model User {
    // ... existing fields ...

    // NEW: Spotify OAuth connection
    spotifyAccessToken  String?   // Encrypted OAuth access token
    spotifyRefreshToken String?   // Encrypted OAuth refresh token
    spotifyTokenExpiry  DateTime? // When access token expires
    spotifyUserId       String?   // Spotify user ID
    spotifyDisplayName  String?   // Display name from Spotify
}
```
### New API Endpoints
| Method | Endpoint | Description |
| ------ | ------------------------------ | ----------------------------------- |
| GET | `/api/spotify/auth/url` | Generate OAuth authorization URL |
| GET | `/api/spotify/auth/callback` | Handle OAuth callback, store tokens |
| POST | `/api/spotify/auth/disconnect` | Remove user's Spotify connection |
| GET | `/api/spotify/auth/status` | Check if user is connected |
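The `/api/spotify/auth/url` endpoint has to produce a link to Spotify's authorization page. A minimal sketch of how such a URL could be assembled for the authorization-code flow is shown below. The helper name `buildSpotifyAuthUrl`, the parameter shape, the example scopes, and the example redirect host are all assumptions for illustration, not the project's actual code.

```typescript
// Hypothetical helper: assembles a Spotify authorization-code URL.
interface SpotifyAuthParams {
    clientId: string;
    redirectUri: string; // must match a redirect URI registered with Spotify
    scopes: string[];
    state: string;       // opaque value, checked again in the callback
}

function buildSpotifyAuthUrl(params: SpotifyAuthParams): string {
    const url = new URL("https://accounts.spotify.com/authorize");
    url.searchParams.set("response_type", "code");
    url.searchParams.set("client_id", params.clientId);
    url.searchParams.set("redirect_uri", params.redirectUri);
    url.searchParams.set("scope", params.scopes.join(" "));
    url.searchParams.set("state", params.state);
    return url.toString();
}

// Placeholder values only; none of these are real credentials or hosts.
const exampleAuthUrl = buildSpotifyAuthUrl({
    clientId: "your-client-id",
    redirectUri: "https://lidify.example/api/spotify/auth/callback",
    scopes: ["playlist-read-private", "user-library-read"],
    state: "random-state-123",
});
```

The `state` value would typically be stored server-side and compared in `/api/spotify/auth/callback` before any tokens are saved.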
### New Frontend Files
| File | Purpose |
| ----------------------------------------------------------------------------- | --------------------------------------- |
| `frontend/features/settings/components/ui/SettingsLayout.tsx` | Sidebar + main content wrapper |
| `frontend/features/settings/components/ui/SettingsSidebar.tsx` | Navigation sidebar with section links |
| `frontend/features/settings/components/ui/SettingsSection.tsx` | Section header with separator |
| `frontend/features/settings/components/ui/SettingsRow.tsx` | Label + description left, control right |
| `frontend/features/settings/components/ui/SettingsToggle.tsx` | Spotify-style toggle switch |
| `frontend/features/settings/components/ui/SettingsSelect.tsx` | Dropdown select |
| `frontend/features/settings/components/ui/SettingsInput.tsx` | Text/password input with show/hide |
| `frontend/features/settings/components/ui/ConnectionCard.tsx` | OAuth connection card (Spotify) |
| `frontend/features/settings/components/ui/index.ts` | Barrel export |
| `frontend/features/settings/components/sections/AccountSection.tsx` | Password change + 2FA |
| `frontend/features/settings/components/sections/PlaybackSection.tsx` | Streaming quality dropdown |
| `frontend/features/settings/components/sections/SpotifyConnectionSection.tsx` | Spotify OAuth connection |
| `frontend/features/settings/components/sections/SpotifyAPISection.tsx` | Developer API credentials |
| `frontend/features/settings/components/sections/CacheSection.tsx` | Cache sizes + automation toggles |
| `frontend/features/settings/hooks/useSpotifyOAuth.ts` | OAuth state management |
### Modified Frontend Files
| File | Changes |
| -------------------------------------------------------------------------- | ------------------------------------- |
| `frontend/app/settings/page.tsx` | Complete redesign with sidebar layout |
| `frontend/features/settings/components/sections/LidarrSection.tsx` | Spotify-style row layout |
| `frontend/features/settings/components/sections/AudiobookshelfSection.tsx` | Spotify-style row layout |
| `frontend/features/settings/components/sections/SoulseekSection.tsx` | Spotify-style row layout |
| `frontend/features/settings/components/sections/AIServicesSection.tsx` | Spotify-style row layout |
| `frontend/features/settings/components/sections/StoragePathsSection.tsx` | Spotify-style row layout |
| `frontend/features/settings/components/sections/UserManagementSection.tsx` | Cleaner design, modal for delete |
### Modified Backend Files
| File | Changes |
| ------------------------------- | ---------------------------------------- |
| `backend/prisma/schema.prisma` | Added Spotify OAuth fields to User model |
| `backend/src/routes/spotify.ts` | Added OAuth routes |
### Deleted Files (Consolidated)
| File | Reason |
| ---------------------------------------------------------------------------- | --------------------------------- |
| `frontend/features/settings/components/UserSettingsTab.tsx` | Replaced by unified settings page |
| `frontend/features/settings/components/AccountTab.tsx` | Replaced by unified settings page |
| `frontend/features/settings/components/SystemSettingsTab.tsx` | Replaced by unified settings page |
| `frontend/features/settings/components/sections/ChangePasswordSection.tsx` | Merged into AccountSection |
| `frontend/features/settings/components/sections/TwoFactorAuthSection.tsx` | Merged into AccountSection |
| `frontend/features/settings/components/sections/PlaybackQualitySection.tsx` | Replaced by PlaybackSection |
| `frontend/features/settings/components/sections/AdvancedSettingsSection.tsx` | Replaced by CacheSection |
| `frontend/features/settings/components/sections/CacheSettingsSection.tsx` | Replaced by CacheSection |
| `frontend/features/settings/components/sections/SpotifySection.tsx` | Split into Connection + API |
### Settings Sections
**All Users:** Account, Playback, Connected Services (Spotify OAuth)
**Admin Only:** Download Services, Media Servers, P2P Networks, AI Services, Spotify API, Storage, Cache & Automation, User Management
---
## Home Page Enhancements (Dec 16, 2025)
### New Features
1. **Radio Stations Section** - Compact horizontal row at the top of the home page showing random Deezer radio stations
2. **Featured Playlists Section** - Grid showing 10 featured Deezer playlists after Popular Artists section
### New Files Created
| File | Purpose |
| ------------------------------------------------------ | ------------------------------------------- |
| `frontend/features/home/components/FeaturedPlaylistsGrid.tsx` | Grid component for featured playlists |
| `frontend/features/home/components/RadioStationsGrid.tsx` | Horizontal scroll component for radio stations |
### Modified Files
| File | Changes |
| ---------------------------------------------------- | ------------------------------------------------ |
| `frontend/app/page.tsx` | Added radio stations and featured playlists sections |
| `frontend/features/home/hooks/useHomeData.ts` | Added browse data fetching for playlists/radios |
| `frontend/hooks/useQueries.ts` | Added browse query keys and hooks |
| `backend/src/routes/browse.ts` | Increased featured playlists limit from 50 to 200 |
---
## Notification & Sync Button Improvements (Dec 16, 2025)
### Changes
1. **Sync Button** - No longer shows toast overlay, turns green with spinning animation while syncing
2. **Optimistic Notification Clearing** - Notifications are cleared from UI immediately before API call completes
3. **Duplicate Key Fix** - Added context parameter to renderCard in browse page to prevent duplicate key errors
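The optimistic clearing described in item 2 can be sketched roughly as follows. This is an illustration under assumptions: `AppNotification`, `clearAllOptimistically`, and the callback signatures are hypothetical names, and the rollback on failure is one common way to pair with an optimistic update rather than confirmed behavior of `NotificationsTab.tsx`.

```typescript
interface AppNotification {
    id: string;
    message: string;
}

// Hypothetical helper: empties the list in the UI first, then calls the API,
// and restores the previous list if the request fails.
async function clearAllOptimistically(
    current: AppNotification[],
    setNotifications: (next: AppNotification[]) => void,
    clearOnServer: () => Promise<void>
): Promise<boolean> {
    setNotifications([]); // UI updates before the request completes
    try {
        await clearOnServer();
        return true;
    } catch {
        setNotifications(current); // roll back so the UI matches the server
        return false;
    }
}
```

Because the state setter runs before the first `await`, the list is already empty by the time the call returns its promise.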
### Modified Files
| File | Changes |
| -------------------------------------------------------- | ------------------------------------------------ |
| `frontend/components/layout/Sidebar.tsx` | Removed toast, added green color while syncing |
| `frontend/components/activity/NotificationsTab.tsx` | Implemented optimistic updates for all mutations |
| `frontend/app/browse/playlists/page.tsx` | Fixed duplicate key errors with unique keys |
---
## Essentia Audio Analysis Integration (Dec 16, 2025)
### Overview
Integrated Essentia audio analysis to extract BPM, key, mood, energy, and other audio features from tracks. This enables intelligent mood-based mixes and personalized playlists.
### Database Changes
Added to `Track` model in `backend/prisma/schema.prisma`:
| Field | Type | Description |
| ------------------ | ---------- | ------------------------------------- |
| `bpm` | Float? | Beats per minute |
| `beatsCount` | Int? | Total beats in track |
| `key` | String? | Musical key (C, F#, Bb, etc.) |
| `keyScale` | String? | "major" or "minor" |
| `keyStrength` | Float? | Key detection confidence (0-1) |
| `energy` | Float? | Overall energy (0-1) |
| `loudness` | Float? | Average loudness in dB |
| `dynamicRange` | Float? | Dynamic range in dB |
| `danceability` | Float? | Danceability score (0-1) |
| `valence` | Float? | Happy (1) to sad (0) |
| `arousal` | Float? | Energetic (1) to calm (0) |
| `instrumentalness` | Float? | Absence of vocals (0-1, 1=instrumental) |
| `acousticness` | Float? | Acoustic vs electronic (0-1) |
| `speechiness` | Float? | Spoken word content (0-1) |
| `moodTags` | String[] | ML-classified mood tags |
| `essentiaGenres` | String[] | ML-classified genres |
| `lastfmTags` | String[] | User-generated mood tags from Last.fm |
| `analysisStatus` | String | pending/processing/completed/failed |
| `analysisVersion` | String? | Essentia version used |
| `analyzedAt` | DateTime? | When analysis was completed |
| `analysisError` | String? | Error message if failed |
### New Files
| File | Description |
| ------------------------------------------------- | -------------------------------------------------- |
| `services/audio-analyzer/Dockerfile` | Python 3.11 + Essentia container |
| `services/audio-analyzer/analyzer.py` | Main audio analysis service |
| `services/audio-analyzer/requirements.txt` | Python dependencies |
| `backend/src/workers/trackEnrichment.ts` | Last.fm tag enrichment worker |
| `backend/src/routes/analysis.ts` | API routes for analysis status & triggers |
### Modified Files
| File | Changes |
| -------------------------------------------------------------- | ----------------------------------------------- |
| `backend/prisma/schema.prisma` | Added audio analysis fields to Track model |
| `backend/src/workers/index.ts` | Added track enrichment worker startup/shutdown |
| `backend/src/workers/queues.ts` | Added `analysisQueue` for audio analysis jobs |
| `backend/src/index.ts` | Registered `/api/analysis` routes |
| `backend/src/services/programmaticPlaylists.ts` | Added mood-based mix generators |
| `backend/src/routes/library.ts` | Added mood-based radio station filtering |
| `frontend/features/home/components/LibraryRadioStations.tsx` | Added mood-based radio station buttons |
| `docker-compose.yml` | Added `audio-analyzer` service (optional) |
### New Mix Types (Audio Analysis-Based)
| Mix Type | Criteria |
| -------------- | --------------------------------------------- |
| High Energy | energy >= 0.7, BPM >= 120 |
| Late Night | energy <= 0.4, BPM <= 90, low arousal |
| Happy Vibes | valence >= 0.6, energy >= 0.5 |
| Melancholy | valence <= 0.4, minor key preferred |
| Dance Floor | danceability >= 0.7, BPM 110-140 |
| Acoustic | acousticness >= 0.6, energy 0.3-0.6 |
| Instrumental | instrumentalness >= 0.7, energy 0.3-0.6 |
| Road Trip | tags or energy 0.5-0.8, BPM 100-130 |
| Sunday Morning | low energy, high acousticness (day-specific) |
| Monday Motivation | high energy, high valence (day-specific) |
| Friday Night | high danceability, high energy (day-specific) |
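The numeric criteria in the table translate naturally into predicates over a track's analysis fields. The sketch below covers only three rows whose thresholds are fully specified above; the type `AnalyzedTrack`, the helper names, and the choice to treat missing values as non-matching are assumptions, not the actual code in `programmaticPlaylists.ts`.

```typescript
interface AnalyzedTrack {
    bpm: number | null;
    energy: number | null;
    valence: number | null;
    danceability: number | null;
}

type MixPredicate = (t: AnalyzedTrack) => boolean;

// Assumption: a track with a missing value never matches a criterion.
const atLeast = (v: number | null, min: number): boolean => v !== null && v >= min;
const between = (v: number | null, lo: number, hi: number): boolean =>
    v !== null && v >= lo && v <= hi;

const mixCriteria: Record<string, MixPredicate> = {
    "High Energy": t => atLeast(t.energy, 0.7) && atLeast(t.bpm, 120),
    "Happy Vibes": t => atLeast(t.valence, 0.6) && atLeast(t.energy, 0.5),
    "Dance Floor": t => atLeast(t.danceability, 0.7) && between(t.bpm, 110, 140),
};

// Returns the names of every mix the track qualifies for.
function matchingMixes(track: AnalyzedTrack): string[] {
    return Object.entries(mixCriteria)
        .filter(([, matches]) => matches(track))
        .map(([name]) => name);
}
```

A single track can qualify for several mixes at once, for example a 128 BPM track with high energy, valence, and danceability.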
### API Endpoints
| Method | Endpoint | Description |
| ------ | ----------------------------- | ---------------------------------------- |
| GET | `/api/analysis/status` | Get analysis progress statistics |
| POST | `/api/analysis/start` | Queue pending tracks for analysis |
| POST | `/api/analysis/retry-failed` | Reset failed tracks to pending |
| POST | `/api/analysis/analyze/:id` | Queue specific track for analysis |
| GET | `/api/analysis/track/:id` | Get analysis data for specific track |
| GET | `/api/analysis/features` | Get aggregated feature statistics |
### Starting the Audio Analyzer
The audio analyzer is disabled by default. To enable it:
```bash
docker-compose --profile audio-analysis up -d
```
Or start only the analyzer service:
```bash
docker-compose up audio-analyzer -d
```
---
## Notification System Fixes (Dec 16, 2025)
### Issues Fixed
1. **Toast overlays for cache clearing and sync** - Removed toast.success overlays for "Caches cleared" and "Library scan started" since these should appear in the activity panel notification bar instead.
2. **Notification badge not clearing immediately** - The `useNotifications` hook wasn't responding to `notifications-changed` events. Fixed by adding an event listener that triggers a refetch.
3. **Settings page glitchy sidebar** - Replaced IntersectionObserver with scroll-based tracking for smoother sidebar highlighting.
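The event-based refetch from item 2 follows a simple publish/subscribe pattern. The sketch below uses a plain `EventTarget` in place of `window` so it runs outside a browser; the function names are hypothetical and only the event name `notifications-changed` comes from the changes above.

```typescript
// Hook side: refetch whenever a mutation announces a change.
// Returns a cleanup function, as an effect in the hook would need.
function subscribeToNotificationChanges(
    target: EventTarget,
    refetch: () => void
): () => void {
    const handler = () => refetch();
    target.addEventListener("notifications-changed", handler);
    return () => target.removeEventListener("notifications-changed", handler);
}

// Mutation side: called after clearing, dismissing, or syncing.
function announceNotificationChange(target: EventTarget): void {
    target.dispatchEvent(new Event("notifications-changed"));
}

// Usage: in the app the target would be `window`; here it is a local bus.
const exampleBus = new EventTarget();
let exampleRefetchCount = 0;
const stopListening = subscribeToNotificationChanges(exampleBus, () => {
    exampleRefetchCount++;
});
announceNotificationChange(exampleBus); // listener runs synchronously
stopListening();
```

Returning the unsubscribe function keeps the listener from leaking when the component using the hook unmounts.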
### Modified Files
| File | Change |
|------|--------|
| `frontend/hooks/useNotifications.ts` | Added event listener for `notifications-changed` to trigger immediate refetch |
| `frontend/features/settings/components/sections/CacheSection.tsx` | Removed toast.success for cache clearing and sync, added local error state |
| `frontend/components/layout/TopBar.tsx` | Removed toast.success for library scan started |
| `frontend/components/layout/Sidebar.tsx` | Added `notifications-changed` event dispatch after sync |
| `frontend/features/settings/components/ui/SettingsLayout.tsx` | Replaced IntersectionObserver with throttled scroll listener for smoother sidebar tracking |
### Behavior Changes
- **Sync button**: No longer shows toast overlay - progress appears in activity panel
- **Clear caches button**: No longer shows toast overlay - implicit success (button returns to normal state)
- **Notification badge**: Now clears immediately via optimistic updates and event system
- **Settings sidebar**: Smoother scrolling behavior without jumpy highlights
---
## Session 8: Artist Radio Feature
### New Feature: Artist Radio with Hybrid Similarity Matching
| File | Change |
|------|--------|
| `backend/src/routes/library.ts` | Added `artist` case to `/library/radio` endpoint with hybrid matching |
| `backend/src/routes/library.ts` | Added artist name filtering to `/library/genres` endpoint |
| `frontend/features/artist/components/ArtistActionBar.tsx` | Added Radio icon button for library artists |
| `frontend/app/artist/[id]/page.tsx` | Added `handleStartRadio` function and passed to ArtistActionBar |
| `frontend/lib/api.ts` | Added `getRadioTracks()` method |
### Artist Radio Logic
The artist radio uses a **hybrid approach** with vibe boosting:
1. **Last.fm Similar Artists (filtered to library)**: Primary source, gets up to 15 similar artists that exist in user's library
2. **Genre Matching Fallback**: If fewer than 5 similar artists are found, finds library artists with overlapping genres
3. **Vibe Boost via Audio Analysis**: Scores similar artists' tracks by BPM, energy, valence, and danceability similarity
4. **Track Mix**: ~40% from original artist, ~60% from vibe-matched similar artists
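Steps 1, 2, and 4 above can be sketched as two small functions. Only the numbers 15, 5, 40%, and 60% come from the description; the function names, the input shapes, and the decision to top up with genre matches until the cap of 15 is reached are assumptions.

```typescript
// Hypothetical sketch of artist selection for artist radio.
function pickRadioArtists(
    lastfmSimilarInLibrary: string[],
    genreMatches: string[],
    maxSimilar = 15,
    minSimilar = 5
): string[] {
    const picked = lastfmSimilarInLibrary.slice(0, maxSimilar);
    // Fallback: too few Last.fm matches, so add genre-overlap artists.
    if (picked.length < minSimilar) {
        for (const artist of genreMatches) {
            if (picked.length >= maxSimilar) break;
            if (!picked.includes(artist)) picked.push(artist);
        }
    }
    return picked;
}

// Splits a queue of `total` tracks roughly 40/60 between the seed artist
// and the similar artists.
function splitTrackCounts(total: number): { fromArtist: number; fromSimilar: number } {
    const fromArtist = Math.round(total * 0.4);
    return { fromArtist, fromSimilar: total - fromArtist };
}
```

For a 50-track queue this yields 20 tracks from the original artist and 30 from similar artists.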
### Genre Filtering Fix
Artist names (like "Jamiroquai") were incorrectly showing as genres. Fixed by:
- Fetching all artist names at query time
- Filtering out any "genre" that matches an artist name (case-insensitive)
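The filter described above amounts to a case-insensitive set lookup. A minimal sketch, with a hypothetical function name and plain string arrays standing in for the query results:

```typescript
// Removes any "genre" whose name matches an artist name, ignoring case.
function filterOutArtistNames(genres: string[], artistNames: string[]): string[] {
    const artistSet = new Set(artistNames.map(name => name.toLowerCase()));
    return genres.filter(genre => !artistSet.has(genre.toLowerCase()));
}

const cleanedGenres = filterOutArtistNames(
    ["funk", "jamiroquai", "Acid Jazz"],
    ["Jamiroquai", "Daft Punk"]
);
// cleanedGenres is ["funk", "Acid Jazz"]
```

Building the set once keeps the check cheap even when the library contains many artists.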
### Bug Fix: Artist Radio "Unknown Artist" / No Image
Fixed two issues with artist radio playback:
1. **Frontend**: Removed double-transformation of tracks - backend already returns properly formatted data
2. **Backend**: Fixed `coverArt` to use `track.album.coverUrl` directly instead of conditional `lidarrAlbumId` check
---
## Session 9: Vibe Match Feature
### New Feature: "Vibe Match" Button on Media Player
Allows users to instantly create a queue of tracks that sound like the currently playing track.
| File | Change |
|------|--------|
| `backend/src/routes/library.ts` | Added `vibe` case to `/library/radio` endpoint with audio feature matching |
| `frontend/components/player/MiniPlayer.tsx` | Added Vibe button (AudioWaveform icon) with loading state |
| `frontend/components/player/FullPlayer.tsx` | Added Vibe button (AudioWaveform icon) with loading state |
### How Vibe Match Works
1. **Takes current track's audio features** (BPM, energy, valence, danceability, key, mood tags)
2. **Searches entire library** for tracks with similar audio profiles
3. **Scores matches** using weighted algorithm:
- BPM (25%) - within ±15 BPM is ideal
- Energy (25%)
- Valence/mood (20%)
- Danceability (15%)
- Key compatibility (10%)
- Mood tag overlap (5%)
4. **Falls back gracefully** if not enough audio matches:
- Same artist's other tracks
- Last.fm similar artists' tracks
- Same genre tracks
- Random library tracks
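The weighted scoring in step 3 can be sketched as a weighted sum of per-feature sub-scores. The weights are taken from the list above; everything else is an assumption, including the names, the idea that each sub-score is pre-scaled to 0..1, and the linear falloff used for BPM outside the ±15 window.

```typescript
// Each sub-score is assumed to be scaled to 0..1 (1 = perfect match).
interface VibeSubScores {
    bpm: number;
    energy: number;
    valence: number;
    danceability: number;
    key: number;
    moodTags: number;
}

// Weights from the list above; they sum to 1.
const VIBE_WEIGHTS: VibeSubScores = {
    bpm: 0.25,
    energy: 0.25,
    valence: 0.2,
    danceability: 0.15,
    key: 0.1,
    moodTags: 0.05,
};

function weightedVibeScore(scores: VibeSubScores): number {
    const keys = Object.keys(VIBE_WEIGHTS) as (keyof VibeSubScores)[];
    return keys.reduce((sum, k) => sum + VIBE_WEIGHTS[k] * scores[k], 0);
}

// One possible BPM sub-score: full credit within ±15 BPM, then a linear
// fade to 0 at a 45 BPM difference (the fade is an assumption).
function bpmSubScore(sourceBpm: number, candidateBpm: number): number {
    const diff = Math.abs(sourceBpm - candidateBpm);
    if (diff <= 15) return 1;
    return Math.max(0, 1 - (diff - 15) / 30);
}
```

Because the weights sum to 1, a candidate that matches perfectly on every feature scores 1, and BPM and energy together account for half of the result.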
### UI Location
The Vibe button (waveform icon) appears after the Repeat button in both:
- MiniPlayer (sidebar player)
- FullPlayer (bottom bar player)
Clicking it replaces the current queue with vibe-matched tracks and shows a toast notification.
---
## Session 9 (continued): Search Tracks Fix
### Bug Fix: Library Tracks Not Showing in Search
The backend was returning tracks in search results, but the frontend never displayed them.
| File | Change |
|------|--------|
| `frontend/app/search/page.tsx` | Added import for `LibraryTracksList` and section to display library tracks |
| `frontend/features/search/components/LibraryTracksList.tsx` | **New file** - Component to display library tracks in search results |
### Features of LibraryTracksList
- Shows up to 10 tracks matching the search query
- Displays cover art, title, artist, album, and duration
- Click to play (integrates with audio context)
- Currently playing track highlighted in yellow
- Artist and album names link to their respective pages


@@ -0,0 +1,396 @@
# Vibe Matching Algorithm Overhaul Plan
## Overview
This document outlines the plan to overhaul the vibe matching algorithm to use **cosine similarity** on a comprehensive feature vector that includes all 9 ML mood predictions, audio features, and genre/tag matching.
## Current State (Before Overhaul)
### What We Have
- **ML Mood Predictions (9 total):**
- `moodHappy`, `moodSad`, `moodRelaxed`, `moodAggressive` (existing)
- `moodParty`, `moodAcoustic`, `moodElectronic` (newly added)
- `danceabilityMl`, `aggressivenessMl` (existing)
- **Audio Features:**
- `bpm`, `key`, `keyScale` (major/minor)
- `energy`, `danceability`, `valence`, `arousal`
- `instrumentalness`, `acousticness`, `speechiness`
- **Metadata:**
- `lastfmTags` (JSON array of tag objects with name/count)
- `essentiaGenres` (JSON array of genre strings)
- `trackGenres` relation (linked genre records)
### Previous Algorithm (Weighted Manhattan Distance)
```typescript
// Old approach - arbitrary weights, limited features
const weights = {
    energy: 1.5,
    danceability: 1.2,
    valence: 1.0,
    arousal: 1.0,
    instrumentalness: 0.8,
    bpm: 0.5,
};

let score = 0;
for (const [feature, weight] of Object.entries(weights)) {
    const diff = Math.abs(sourceTrack[feature] - candidateTrack[feature]);
    score += diff * weight;
}
// Lower score = more similar (inverted logic)
```
**Problems with old approach:**
1. Only used 6 features, ignored all ML mood predictions
2. Arbitrary weights with no scientific basis
3. Manhattan distance less effective for high-dimensional feature spaces
4. No genre/tag matching
5. Score inversion was confusing
---
## New Algorithm (Cosine Similarity)
### Phase 1: Database Schema Update ✅
Add new mood fields to Prisma schema:
```prisma
model Track {
    // ... existing fields ...

    // ML Mood Predictions (0.0-1.0)
    moodHappy      Float?
    moodSad        Float?
    moodRelaxed    Float?
    moodAggressive Float?
    moodParty      Float? // NEW
    moodAcoustic   Float? // NEW
    moodElectronic Float? // NEW

    // ... rest of schema ...
}
```
**Migration command:**
```bash
cd backend
npx prisma db push --skip-generate
```
### Phase 2: Audio Analyzer Update ✅
Update `services/audio-analyzer/analyzer.py` to extract and save all 7 mood predictions:
```python
# MusiCNN mood classifiers
mood_models = {
    'moodHappy': 'mood_happy-musicnn-msd-2',
    'moodSad': 'mood_sad-musicnn-msd-2',
    'moodRelaxed': 'mood_relaxed-musicnn-msd-2',
    'moodAggressive': 'mood_aggressive-musicnn-msd-2',
    'moodParty': 'mood_party-musicnn-msd-2',
    'moodAcoustic': 'mood_acoustic-musicnn-msd-2',
    'moodElectronic': 'mood_electronic-musicnn-msd-2',
}

# Save all to database (parameterized SQL held in a string; the constant
# name is illustrative and the remaining columns are elided)
UPDATE_TRACK_MOODS_SQL = """
UPDATE "Track" SET
    "moodHappy" = %s,
    "moodSad" = %s,
    "moodRelaxed" = %s,
    "moodAggressive" = %s,
    "moodParty" = %s,
    "moodAcoustic" = %s,
    "moodElectronic" = %s,
    ...
"""
```
### Phase 3: Feature Vector Construction
Build a normalized feature vector for each track:
```typescript
interface TrackFeatures {
    // ML Moods (0-1)
    moodHappy: number | null;
    moodSad: number | null;
    moodRelaxed: number | null;
    moodAggressive: number | null;
    moodParty: number | null;
    moodAcoustic: number | null;
    moodElectronic: number | null;

    // Audio Features
    energy: number | null;
    arousal: number | null;
    danceability: number | null;
    danceabilityMl: number | null;
    instrumentalness: number | null;
    bpm: number | null;
    keyScale: string | null;

    // Metadata
    lastfmTags: any;
    essentiaGenres: any;
}

function buildFeatureVector(track: TrackFeatures): number[] {
    return [
        // 7 ML Mood predictions (indices 0-6)
        // Missing values default to the neutral midpoint 0.5
        track.moodHappy ?? 0.5,
        track.moodSad ?? 0.5,
        track.moodRelaxed ?? 0.5,
        track.moodAggressive ?? 0.5,
        track.moodParty ?? 0.5,
        track.moodAcoustic ?? 0.5,
        track.moodElectronic ?? 0.5,

        // Core audio features (indices 7-10)
        track.energy ?? 0.5,
        track.arousal ?? 0.5,
        track.danceabilityMl ?? track.danceability ?? 0.5,
        track.instrumentalness ?? 0.5,

        // Normalized BPM (index 11)
        // Maps 60-180 BPM to 0-1 range
        Math.max(0, Math.min(1, ((track.bpm ?? 120) - 60) / 120)),

        // Key mode (index 12)
        // Major = 1, Minor = 0 (a missing keyScale is also treated as 0)
        track.keyScale === 'major' ? 1 : 0,
    ];
}
```
**Feature Vector Dimensions: 13**
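The BPM normalization at index 11 is worth checking with a few concrete values. The sketch below isolates the same expression in a standalone helper; the name `normalizeBpm` is hypothetical, while the formula matches the one in the feature vector.

```typescript
// Same formula as index 11 of the feature vector: 60-180 BPM maps to 0-1,
// values outside that range are clamped, and a missing BPM defaults to 120.
function normalizeBpm(bpm: number | null): number {
    return Math.max(0, Math.min(1, ((bpm ?? 120) - 60) / 120));
}

const bpmExamples = [
    normalizeBpm(60),   // 0
    normalizeBpm(90),   // 0.25
    normalizeBpm(180),  // 1
    normalizeBpm(200),  // 1 (clamped, indistinguishable from 180)
    normalizeBpm(40),   // 0 (clamped, indistinguishable from 60)
    normalizeBpm(null), // 0.5 (missing BPM lands at the midpoint)
];
```

Two consequences follow: very fast or very slow tracks lose tempo resolution because of the clamp, and unanalyzed tracks look like 120 BPM tracks in this dimension.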
### Phase 4: Cosine Similarity Calculation
```typescript
function cosineSimilarity(a: number[], b: number[]): number {
    let dotProduct = 0;
    let magnitudeA = 0;
    let magnitudeB = 0;

    for (let i = 0; i < a.length; i++) {
        dotProduct += a[i] * b[i];
        magnitudeA += a[i] * a[i];
        magnitudeB += b[i] * b[i];
    }

    if (magnitudeA === 0 || magnitudeB === 0) return 0;

    return dotProduct / (Math.sqrt(magnitudeA) * Math.sqrt(magnitudeB));
}
```
**Properties:**
- Returns a value between -1 and 1 (always between 0 and 1 here, because every component of our feature vectors is non-negative)
- 1.0 = identical vectors (perfect match)
- 0.0 = orthogonal vectors (no similarity)
- Higher = better (intuitive, no inversion needed)
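These properties can be checked on small vectors. The sketch below uses a compact, self-contained cosine function (named `cosine` here to keep it separate from the plan's implementation) and two-dimensional vectors chosen purely for illustration.

```typescript
// Compact cosine similarity, equivalent in behavior to the version above
// for equal-length vectors.
function cosine(a: number[], b: number[]): number {
    const dot = a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
    const norm = (v: number[]): number =>
        Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
    const denom = norm(a) * norm(b);
    return denom === 0 ? 0 : dot / denom;
}

const orthogonal = cosine([1, 0], [0, 1]);           // 0
const opposingProfile = cosine([0.9, 0.1], [0.1, 0.9]); // about 0.22
const scaledCopy = cosine([0.2, 0.2], [0.9, 0.9]);   // about 1
```

The last case shows a property to keep in mind: cosine similarity compares direction, not magnitude, so a track that is uniformly low on every feature and one that is uniformly high still score close to 1.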
### Phase 5: Tag/Genre Bonus
Add bonus points for matching tags and genres:
```typescript
function calculateTagBonus(
  sourceTrack: TrackFeatures,
  candidateTrack: TrackFeatures
): number {
  let bonus = 0;

  // Extract tags
  const sourceTags = new Set<string>();
  const candidateTags = new Set<string>();

  // Parse lastfmTags
  if (Array.isArray(sourceTrack.lastfmTags)) {
    sourceTrack.lastfmTags.forEach((t: any) => {
      if (t?.name) sourceTags.add(t.name.toLowerCase());
    });
  }
  if (Array.isArray(candidateTrack.lastfmTags)) {
    candidateTrack.lastfmTags.forEach((t: any) => {
      if (t?.name) candidateTags.add(t.name.toLowerCase());
    });
  }

  // Parse essentiaGenres
  if (Array.isArray(sourceTrack.essentiaGenres)) {
    sourceTrack.essentiaGenres.forEach((g: string) => {
      sourceTags.add(g.toLowerCase());
    });
  }
  if (Array.isArray(candidateTrack.essentiaGenres)) {
    candidateTrack.essentiaGenres.forEach((g: string) => {
      candidateTags.add(g.toLowerCase());
    });
  }

  // Count overlapping tags
  let overlap = 0;
  for (const tag of sourceTags) {
    if (candidateTags.has(tag)) overlap++;
  }

  // Bonus: up to 0.1 (10%) for tag overlap
  // Normalized by the smaller set size to handle varying tag counts
  const minSize = Math.min(sourceTags.size, candidateTags.size);
  if (minSize > 0) {
    bonus = (overlap / minSize) * 0.1;
  }
  return bonus;
}
```
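A worked example of the bonus arithmetic, sketched in Python with plain string tags. The function name, the `max_bonus` parameter, and the tag lists are illustrative assumptions.

```python
def tag_bonus(source_tags, candidate_tags, max_bonus=0.1):
    """Overlap normalized by the smaller tag set, scaled to max_bonus."""
    source = {t.lower() for t in source_tags}
    candidate = {t.lower() for t in candidate_tags}
    smaller = min(len(source), len(candidate))
    if smaller == 0:
        return 0.0
    return len(source & candidate) / smaller * max_bonus


# 2 shared tags, smaller set has 3 tags: (2 / 3) * 0.1
partial = tag_bonus(["Rock", "indie", "pop"], ["rock", "pop", "funk", "soul", "dance"])

# A single-tag track that matches gets the full bonus: (1 / 1) * 0.1
single = tag_bonus(["rock"], ["rock", "pop", "jazz"])

print(round(partial, 4), round(single, 4))
```

Normalizing by the smaller set keeps sparsely tagged tracks from being penalized, at the cost of giving a one-tag track the full bonus on a single match.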
### Phase 6: Final Score Calculation
```typescript
function calculateVibeScore(
  sourceTrack: TrackFeatures,
  candidateTrack: TrackFeatures
): number {
  // Build feature vectors
  const sourceVector = buildFeatureVector(sourceTrack);
  const candidateVector = buildFeatureVector(candidateTrack);

  // Calculate cosine similarity (0-1)
  const cosineSim = cosineSimilarity(sourceVector, candidateVector);

  // Add tag bonus (0-0.1)
  const tagBonus = calculateTagBonus(sourceTrack, candidateTrack);

  // Final score: cosine similarity + tag bonus
  // Capped at 1.0
  const finalScore = Math.min(1.0, cosineSim + tagBonus);
  return finalScore;
}
```
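To illustrate how the cap interacts with ranking, here is a small Python sketch with made-up similarity and bonus values. It suggests that candidates whose raw sum exceeds 1.0 end up tied, so their relative order then depends only on the sort's tie-breaking.

```python
def vibe_score(cosine_sim: float, tag_bonus: float) -> float:
    """Cosine similarity plus tag bonus, capped at 1.0."""
    return min(1.0, cosine_sim + tag_bonus)


# Hypothetical (cosine similarity, tag bonus) pairs for three candidates.
candidates = {
    "track_a": (0.93, 0.10),
    "track_b": (0.99, 0.02),
    "track_c": (0.80, 0.05),
}

scores = {name: vibe_score(*parts) for name, parts in candidates.items()}

# Highest score first; track_a and track_b tie at the cap.
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores, ranked)
```

If ties at the cap become common in practice, sorting on the uncapped sum and capping only the displayed percentage would be one way to keep the ordering informative.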
### Phase 7: Integration into Radio Endpoint
Update `backend/src/routes/library.ts`:
```typescript
// In the vibe radio section
const sourceTrack = await prisma.track.findUnique({
  where: { id: trackId },
  select: {
    moodHappy: true,
    moodSad: true,
    moodRelaxed: true,
    moodAggressive: true,
    moodParty: true,
    moodAcoustic: true,
    moodElectronic: true,
    energy: true,
    arousal: true,
    danceability: true,
    danceabilityMl: true,
    instrumentalness: true,
    bpm: true,
    keyScale: true,
    lastfmTags: true,
    essentiaGenres: true,
  },
});

// findUnique returns null when the track does not exist
if (!sourceTrack) {
  throw new Error(`Track ${trackId} not found`);
}

// Get candidates
const candidates = await prisma.track.findMany({
  where: {
    id: { not: trackId },
    analysisStatus: 'enhanced', // Only use analyzed tracks
  },
  select: { /* same fields */ },
  take: 500, // Get more candidates for better matching
});

// Score all candidates
const scored = candidates.map(candidate => ({
  ...candidate,
  vibeScore: calculateVibeScore(sourceTrack, candidate),
}));

// Sort by score (highest first)
scored.sort((a, b) => b.vibeScore - a.vibeScore);

// Take top N for the queue
const vibeQueue = scored.slice(0, limit);

// DO NOT SHUFFLE - preserve the sorted order!
```
---
## Implementation Checklist
- [x] **Phase 1:** Add `moodParty`, `moodAcoustic`, `moodElectronic` to Prisma schema
- [x] **Phase 2:** Update audio analyzer to extract all 7 moods
- [x] **Phase 3:** Implement `buildFeatureVector()` function
- [x] **Phase 4:** Implement `cosineSimilarity()` function
- [x] **Phase 5:** Implement `calculateTagBonus()` function (called `computeTagBonus`)
- [x] **Phase 6:** Implement `calculateVibeScore()` combining all components
- [x] **Phase 7:** Integrate into `/library/radio` endpoint
- [ ] **Phase 8:** Update frontend to display match percentage (optional enhancement)
- [ ] **Phase 9:** Re-analyze tracks to populate new mood fields
---
## Re-Analysis Script
To populate the new mood fields for existing tracks:
```sql
-- Reset analysis status for enhanced tracks to re-run analysis
UPDATE "Track"
SET "analysisStatus" = 'pending'
WHERE "analysisStatus" = 'enhanced';
```
Or use the existing script:
```bash
docker exec lidify_db psql -U lidifydb -d lidify -f /path/to/reset-analysis-for-new-moods.sql
```
---
## Expected Improvements
1. **Better Similarity Matching:** Cosine similarity compares the overall shape of two feature vectors rather than any single feature, which suits a 13-dimensional mix of mood and audio features
2. **Full ML Utilization:** All 7 mood predictions now contribute to matching
3. **Genre Awareness:** Tag/genre overlap provides meaningful boost
4. **Intuitive Scores:** Higher score = better match (no inversion)
5. **Normalized Features:** All features scaled to 0-1 for fair comparison
---
## Testing Strategy
1. Pick a track with known characteristics (e.g., happy upbeat pop song)
2. Generate vibe queue
3. Verify top matches share similar mood profiles
4. Check that match percentages in UI reflect actual similarity
5. Test with various genres to ensure cross-genre matching works appropriately
---
## Files Modified
- `backend/prisma/schema.prisma` - New mood fields
- `backend/src/routes/library.ts` - New scoring algorithm
- `services/audio-analyzer/analyzer.py` - Extract all 7 moods
- `frontend/components/player/VibeOverlay.tsx` - Display all moods
- `frontend/lib/audio-state-context.tsx` - Extended AudioFeatures interface
---
## Notes
- **Gaia:** Essentia has a companion library called Gaia for large-scale similarity search using KD-trees. This is overkill for our scale (< 100k tracks) but could be considered for future scaling.
- **MusiCNN Limitations:** The model was trained on MSD (Million Song Dataset) which is pop/rock heavy. For classical/ambient music, predictions may be less reliable. We've added normalization to handle this.
- **Shuffle Interaction:** Vibe mode automatically disables shuffle to preserve the sorted order.

@@ -0,0 +1,571 @@
# Vibe Matching Implementation Plan
## Executive Summary
The current vibe matching system uses Essentia for audio analysis but only extracts **basic features**. Critical mood/emotion features are either placeholder values or poorly estimated. This document outlines a comprehensive plan to achieve Spotify-quality vibe matching while being conscious of performance on user hardware.
## Strategy Update (Latest)
**Default:** Enhanced mode (ML-powered, accurate)
**Fallback:** Standard mode (lightweight, for troubleshooting or power saving)
**Approach:**
1. ✅ Pre-package all Essentia TensorFlow models in Docker image (~200MB)
2. 🔄 Fix Enhanced mode FIRST - make it actually use the ML models
3. ⏳ THEN create Standard mode as a lightweight fallback
4. Users can toggle to Standard mode to save CPU if needed
---
## Current State Analysis
### What Essentia IS Currently Extracting (Working)
| Feature | Status | Quality |
|---------|--------|---------|
| **BPM** | ✅ Working | Good - Uses `RhythmExtractor2013` |
| **Key** | ✅ Working | Good - Uses `KeyExtractor` |
| **KeyScale** | ✅ Working | Good - major/minor detection |
| **Energy** | ✅ Working | Moderate - Raw energy normalized |
| **Loudness** | ✅ Working | Good - dB measurement |
| **Dynamic Range** | ✅ Working | Good |
| **Danceability** | ✅ Working | Good - Uses `Danceability` algorithm |
| **Beats Count** | ✅ Working | Good |
### What's Broken or Placeholder
| Feature | Status | Problem |
|---------|--------|---------|
| **Valence** | ⚠️ Fake | Calculated as `(major/minor * 0.4) + (energy * 0.6)` - NOT actual emotional valence |
| **Arousal** | ⚠️ Fake | Calculated as `(BPM * 0.5) + (energy * 0.5)` - NOT actual arousal |
| **Instrumentalness** | ❌ Placeholder | Hardcoded to `0.5` |
| **Acousticness** | ⚠️ Estimate | Rough estimate from dynamic range |
| **Speechiness** | ❌ Placeholder | Hardcoded to `0.1` |
| **Mood Tags** | ⚠️ Derived | Generated from fake valence/arousal, not ML |
| **Genre Tags** | ❌ Empty | TensorFlow models not loaded |
### The Core Issue
```python
# Current valence calculation (analyzer.py lines 226-231)
key_valence = 0.6 if scale == 'major' else 0.4
energy_valence = result['energy']
result['valence'] = round((key_valence * 0.4 + energy_valence * 0.6), 3)
```
**"Fake Happy" by Paramore** (emotionally complex, about masking sadness):
- Major key → 0.6
- High energy → ~0.7
- Calculated valence: `(0.6 * 0.4) + (0.7 * 0.6) = 0.66` (appears "happy")
**"Summer Girl" by Jamiroquai** (genuinely upbeat funk):
- Major key → 0.6
- High energy → ~0.7
- Calculated valence: `(0.6 * 0.4) + (0.7 * 0.6) = 0.66` (appears "happy")
**Result: 97% match despite being completely different vibes!**
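The collision can be reproduced directly from the formula. This Python sketch reuses the calculation quoted above; the function name is ours and the energy value of 0.7 is the approximate figure from the example, not a measured one.

```python
def heuristic_valence(scale: str, energy: float) -> float:
    """Valence as currently estimated: 40% key mode, 60% energy."""
    key_valence = 0.6 if scale == "major" else 0.4
    return round(key_valence * 0.4 + energy * 0.6, 3)


fake_happy = heuristic_valence("major", 0.7)
summer_girl = heuristic_valence("major", 0.7)

# The key mode only moves the result by (0.6 - 0.4) * 0.4 = 0.08.
key_effect = heuristic_valence("major", 0.7) - heuristic_valence("minor", 0.7)

print(fake_happy, summer_girl, round(key_effect, 3))
```

Any two tracks with the same key mode and similar energy receive nearly the same valence, regardless of lyrical or emotional content, and switching between major and minor shifts the value by only 0.08.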
---
## How Spotify Does It
Spotify's audio analysis uses a combination of:
### 1. Low-Level Audio Features (Similar to what we have)
- Tempo/BPM
- Key/Mode
- Loudness
- Time signature
### 2. Mid-Level Features (We're missing these)
- **Spectral Centroid** - "brightness" of the sound
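- **Spectral Rolloff** - frequency below which most of the spectral energy lies (how energy is distributed across the spectrum)
- **Zero Crossing Rate** - how often the waveform changes sign (noisiness/percussiveness)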
- **MFCCs** - Mel-frequency cepstral coefficients (timbral texture)
- **Chroma Features** - harmonic content
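For intuition, two of the mid-level features listed above can be computed in a few lines. The sketch below uses only the Python standard library and a naive DFT on short synthetic sine waves; a real analyzer would use Essentia's optimized algorithms instead, and the signal parameters here are arbitrary.

```python
import cmath
import math


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for prev, cur in zip(samples, samples[1:]) if (prev >= 0) != (cur >= 0)
    )
    return crossings / (len(samples) - 1)


def spectral_centroid(samples, sample_rate):
    """Magnitude-weighted mean frequency in Hz, via a naive DFT."""
    n = len(samples)
    weighted = 0.0
    total = 0.0
    for k in range(n // 2 + 1):
        coeff = sum(
            s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples)
        )
        magnitude = abs(coeff)
        weighted += (k * sample_rate / n) * magnitude
        total += magnitude
    return weighted / total if total else 0.0


rate = 8000
length = 256
low_tone = [math.sin(2 * math.pi * 250 * i / rate) for i in range(length)]
high_tone = [math.sin(2 * math.pi * 2000 * i / rate) for i in range(length)]

low_centroid = spectral_centroid(low_tone, rate)
high_centroid = spectral_centroid(high_tone, rate)
print(round(low_centroid), round(high_centroid))
print(zero_crossing_rate(low_tone) < zero_crossing_rate(high_tone))
```

The brighter (higher-pitched) tone has both a higher centroid and a higher zero crossing rate, which is why these features help separate timbres that share tempo and key.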
### 3. High-Level Features (We're faking these)
- **Valence** - Musical positiveness (0-1)
- **Arousal/Energy** - Intensity and activity
- **Instrumentalness** - Vocal presence prediction
- **Acousticness** - Acoustic vs electronic
- **Speechiness** - Presence of spoken words
- **Liveness** - Audience presence detection
### 4. Deep Learning Models
Spotify trains neural networks on millions of labeled tracks to predict:
- Mood categories
- Genre classification
- User preference patterns
---
## Two-Tier System
### Default: Enhanced Vibe Matching (ML-Powered)
**Status:** DEFAULT - Pre-packaged in Docker, just works
**Target:** High accuracy, ~5-10 seconds per track
**Features (from Essentia TensorFlow Models):**
1. **Mood Predictions (real ML, not estimated):**
- `mood_happy-discogs-effnet-1.pb` - Happiness/positivity 0-1
- `mood_sad-discogs-effnet-1.pb` - Sadness 0-1
- `mood_relaxed-discogs-effnet-1.pb` - Relaxation/calmness 0-1
- `mood_aggressive-discogs-effnet-1.pb` - Aggression/intensity 0-1
2. **Audio Characteristics:**
- `danceability-discogs-effnet-1.pb` - ML-based danceability
- `voice_instrumental-discogs-effnet-1.pb` - Vocal detection (instrumentalness)
3. **Embeddings for Similarity:**
- `discogs-effnet-bs64-1.pb` - Audio embeddings (neural "fingerprint")
- Can be used for direct similarity comparison
4. **Spectral Features:**
- Spectral Centroid (brightness)
- MFCCs (timbral texture - 13 coefficients)
**Models Pre-packaged:** ~200MB in Docker image (no user download)
**RAM Requirement:** ~500MB during analysis
**CPU Requirement:** Any modern CPU (2015+)
### Fallback: Standard Vibe Matching (Lightweight)
**Status:** FALLBACK - For troubleshooting or power saving
**Target:** Fast, <2 seconds per track, low CPU
**Features Used:**
- BPM (Essentia RhythmExtractor)
- Energy (Essentia Energy)
- Danceability (Essentia Danceability - non-ML version)
- Key/Scale (Essentia KeyExtractor)
- Spectral Centroid (cheap to compute)
- Last.fm mood tags
- Genre matching from tags
**When to use Standard mode:**
- Low-power devices (Raspberry Pi, older NAS)
- Troubleshooting if Enhanced mode has issues
- User preference to save CPU cycles
---
## Implementation Plan
### Phase 1: Pre-Package Models in Docker (Day 1)
#### 1.1 Update Dockerfile to Include Models
```dockerfile
# Download Essentia ML models during build (~200MB)
# mkdir ensures the target directory exists; curl -f fails the build on HTTP
# errors instead of saving an error page as a .pb file
RUN apt-get update && apt-get install -y --no-install-recommends curl && \
    mkdir -p /app/models && \
    # Base embedding model (required for all predictions)
    curl -fL -o /app/models/discogs-effnet-bs64-1.pb \
    "https://essentia.upf.edu/models/feature-extractors/discogs-effnet/discogs-effnet-bs64-1.pb" && \
    # Mood models
    curl -fL -o /app/models/mood_happy-discogs-effnet-1.pb \
    "https://essentia.upf.edu/models/classification-heads/mood_happy/mood_happy-discogs-effnet-1.pb" && \
    curl -fL -o /app/models/mood_sad-discogs-effnet-1.pb \
    "https://essentia.upf.edu/models/classification-heads/mood_sad/mood_sad-discogs-effnet-1.pb" && \
    curl -fL -o /app/models/mood_relaxed-discogs-effnet-1.pb \
    "https://essentia.upf.edu/models/classification-heads/mood_relaxed/mood_relaxed-discogs-effnet-1.pb" && \
    curl -fL -o /app/models/mood_aggressive-discogs-effnet-1.pb \
    "https://essentia.upf.edu/models/classification-heads/mood_aggressive/mood_aggressive-discogs-effnet-1.pb" && \
    # Danceability and voice/instrumental
    curl -fL -o /app/models/danceability-discogs-effnet-1.pb \
    "https://essentia.upf.edu/models/classification-heads/danceability/danceability-discogs-effnet-1.pb" && \
    curl -fL -o /app/models/voice_instrumental-discogs-effnet-1.pb \
    "https://essentia.upf.edu/models/classification-heads/voice_instrumental/voice_instrumental-discogs-effnet-1.pb" && \
    # Arousal/Valence models
    curl -fL -o /app/models/arousal-discogs-effnet-1.pb \
    "https://essentia.upf.edu/models/classification-heads/mood_arousal/mood_arousal-discogs-effnet-1.pb" && \
    curl -fL -o /app/models/valence-discogs-effnet-1.pb \
    "https://essentia.upf.edu/models/classification-heads/mood_valence/mood_valence-discogs-effnet-1.pb" && \
    apt-get purge -y curl && rm -rf /var/lib/apt/lists/*
```
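Because a failed or partial download would otherwise only surface later as a silent fallback to Standard mode, a startup check for the packaged files may be useful. The following is a sketch under the assumption that the file names match the `-o` targets in the Dockerfile above; `missing_models` and `EXPECTED_MODELS` are hypothetical names, not part of the existing analyzer.

```python
import os
from typing import List

# File names as written by the Dockerfile's `curl -o` targets.
EXPECTED_MODELS = [
    "discogs-effnet-bs64-1.pb",
    "mood_happy-discogs-effnet-1.pb",
    "mood_sad-discogs-effnet-1.pb",
    "mood_relaxed-discogs-effnet-1.pb",
    "mood_aggressive-discogs-effnet-1.pb",
    "danceability-discogs-effnet-1.pb",
    "voice_instrumental-discogs-effnet-1.pb",
    "arousal-discogs-effnet-1.pb",
    "valence-discogs-effnet-1.pb",
]


def missing_models(model_dir: str, expected: List[str] = EXPECTED_MODELS) -> List[str]:
    """Return the expected model files that are absent or empty in model_dir."""
    missing = []
    for name in expected:
        path = os.path.join(model_dir, name)
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Outside the container this simply reports every file as missing.
    for name in missing_models("/app/models"):
        print(f"Missing model: {name}")
```

Logging the result at analyzer startup would make it obvious which predictions are unavailable and why.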
### Phase 2: Implement Enhanced Analysis (Days 2-4)
#### 2.1 Rewrite analyzer.py with ML Models
```python
import logging
import os
from typing import Any, Dict

import numpy as np

logger = logging.getLogger(__name__)

# ESSENTIA_AVAILABLE is assumed to be set by the existing Essentia import guard


class AudioAnalyzer:
    """Enhanced audio analysis using Essentia TensorFlow models"""

    def __init__(self):
        self.models_loaded = False
        self.embedding_model = None
        self.mood_models = {}
        if ESSENTIA_AVAILABLE:
            self._init_essentia()
            self._load_ml_models()

    def _load_ml_models(self):
        """Load TensorFlow models for enhanced analysis"""
        try:
            from essentia.standard import (
                TensorflowPredictEffnetDiscogs,
                TensorflowPredict2D
            )

            # Load embedding extractor (base for all predictions)
            embedding_path = '/app/models/discogs-effnet-bs64-1.pb'
            if os.path.exists(embedding_path):
                self.embedding_model = TensorflowPredictEffnetDiscogs(
                    graphFilename=embedding_path,
                    output="PartitionedCall:1"
                )
                logger.info("Loaded embedding model")

            # Load mood prediction models
            mood_models = {
                'happy': '/app/models/mood_happy-discogs-effnet-1.pb',
                'sad': '/app/models/mood_sad-discogs-effnet-1.pb',
                'relaxed': '/app/models/mood_relaxed-discogs-effnet-1.pb',
                'aggressive': '/app/models/mood_aggressive-discogs-effnet-1.pb',
                'danceability': '/app/models/danceability-discogs-effnet-1.pb',
                'voice_instrumental': '/app/models/voice_instrumental-discogs-effnet-1.pb',
                'arousal': '/app/models/arousal-discogs-effnet-1.pb',
                'valence': '/app/models/valence-discogs-effnet-1.pb',
            }
            for name, path in mood_models.items():
                if os.path.exists(path):
                    self.mood_models[name] = TensorflowPredict2D(
                        graphFilename=path,
                        output="model/Softmax"
                    )
                    logger.info(f"Loaded {name} model")

            # The heads are useless without the embedding model, so require both
            self.models_loaded = (
                self.embedding_model is not None and len(self.mood_models) > 0
            )
            logger.info(f"ML models loaded: {self.models_loaded} ({len(self.mood_models)} models)")
        except Exception as e:
            logger.warning(f"Could not load ML models: {e}")
            self.models_loaded = False

    def analyze(self, file_path: str) -> Dict[str, Any]:
        """Full analysis with ML models if available"""
        result = self._extract_basic_features(file_path)
        if self.models_loaded:
            ml_features = self._extract_ml_features(file_path)
            result.update(ml_features)
            result['analysisMode'] = 'enhanced'
        else:
            # Fallback to estimated values
            result.update(self._estimate_mood_features(result))
            result['analysisMode'] = 'standard'
        return result

    def _extract_ml_features(self, file_path: str) -> Dict[str, Any]:
        """Extract features using TensorFlow models"""
        result = {}

        # Load audio at 16kHz for ML models
        audio = self.load_audio(file_path, sample_rate=16000)
        if audio is None:
            return result

        # Get embeddings
        embeddings = self.embedding_model(audio)

        # NOTE: index 1 assumes the positive class is the second softmax column.
        # Column order can differ per model; confirm it against each model's
        # metadata before relying on these values.

        # Mood predictions
        if 'happy' in self.mood_models:
            preds = self.mood_models['happy'](embeddings)
            result['moodHappy'] = float(np.mean(preds[:, 1]))  # Probability of "happy"
        if 'sad' in self.mood_models:
            preds = self.mood_models['sad'](embeddings)
            result['moodSad'] = float(np.mean(preds[:, 1]))
        if 'relaxed' in self.mood_models:
            preds = self.mood_models['relaxed'](embeddings)
            result['moodRelaxed'] = float(np.mean(preds[:, 1]))
        if 'aggressive' in self.mood_models:
            preds = self.mood_models['aggressive'](embeddings)
            result['moodAggressive'] = float(np.mean(preds[:, 1]))

        # Real valence and arousal from dedicated models
        if 'valence' in self.mood_models:
            preds = self.mood_models['valence'](embeddings)
            result['valence'] = float(np.mean(preds[:, 1]))
        if 'arousal' in self.mood_models:
            preds = self.mood_models['arousal'](embeddings)
            result['arousal'] = float(np.mean(preds[:, 1]))

        # Instrumentalness from voice/instrumental model
        if 'voice_instrumental' in self.mood_models:
            preds = self.mood_models['voice_instrumental'](embeddings)
            result['instrumentalness'] = float(np.mean(preds[:, 1]))  # 1 = instrumental

        # ML-based danceability
        if 'danceability' in self.mood_models:
            preds = self.mood_models['danceability'](embeddings)
            result['danceabilityMl'] = float(np.mean(preds[:, 1]))

        return result
```
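Each classification head returns one softmax row per audio patch, and the analyzer averages a single column across patches. Since a hard-coded column index is easy to get wrong, one option is to look the column up by label. The sketch below uses plain lists and made-up numbers; the label lists are assumptions and should be taken from the metadata published with each model.

```python
from typing import List, Sequence


def mean_class_probability(
    frame_predictions: Sequence[Sequence[float]],
    class_labels: List[str],
    target_label: str,
) -> float:
    """Average the probability of target_label across all frames."""
    column = class_labels.index(target_label)  # raises ValueError if absent
    values = [frame[column] for frame in frame_predictions]
    return sum(values) / len(values)


# Hypothetical per-frame softmax output of a two-class head.
frames = [[0.9, 0.1], [0.7, 0.3], [0.8, 0.2]]

# The same numbers give opposite answers depending on the label order.
happy_first = mean_class_probability(frames, ["happy", "non_happy"], "happy")
happy_second = mean_class_probability(frames, ["non_happy", "happy"], "happy")
print(round(happy_first, 3), round(happy_second, 3))
```

Resolving the column from labels rather than position guards against a track being scored as 80% happy when the model actually said 20%.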
### Phase 3: Update Database Schema (Day 3)
#### 3.1 Add New Feature Columns
```prisma
model Track {
// ... existing fields ...
// ML-based mood predictions (Enhanced mode)
moodHappy Float? // ML prediction 0-1
moodSad Float? // ML prediction 0-1
moodRelaxed Float? // ML prediction 0-1
moodAggressive Float? // ML prediction 0-1
danceabilityMl Float? // ML-based danceability
// Analysis metadata
analysisMode String? // 'standard' or 'enhanced'
}
```
### Phase 4: Update Vibe Matching Algorithm (Day 4)
#### 4.1 Use Real Mood Predictions in Matching
```typescript
// In library.ts - Enhanced vibe matching
const scored = analyzedTracks.map(t => {
  let score = 0;
  let factors = 0;

  // === MOOD MATCHING (50% total - the heart of vibe) ===
  // Happy mood (15%)
  if (sourceTrack.moodHappy !== null && t.moodHappy !== null) {
    score += (1 - Math.abs(sourceTrack.moodHappy - t.moodHappy)) * 0.15;
    factors += 0.15;
  }
  // Sad mood (10%)
  if (sourceTrack.moodSad !== null && t.moodSad !== null) {
    score += (1 - Math.abs(sourceTrack.moodSad - t.moodSad)) * 0.10;
    factors += 0.10;
  }
  // Relaxed mood (10%)
  if (sourceTrack.moodRelaxed !== null && t.moodRelaxed !== null) {
    score += (1 - Math.abs(sourceTrack.moodRelaxed - t.moodRelaxed)) * 0.10;
    factors += 0.10;
  }
  // Aggressive mood (10%)
  if (sourceTrack.moodAggressive !== null && t.moodAggressive !== null) {
    score += (1 - Math.abs(sourceTrack.moodAggressive - t.moodAggressive)) * 0.10;
    factors += 0.10;
  }
  // Valence - overall positivity (5%)
  if (sourceTrack.valence !== null && t.valence !== null) {
    score += (1 - Math.abs(sourceTrack.valence - t.valence)) * 0.05;
    factors += 0.05;
  }

  // === AUDIO CHARACTERISTICS (35% total) ===
  // BPM (15%) - within ±15 BPM is good
  if (sourceTrack.bpm && t.bpm) {
    const bpmDiff = Math.abs(sourceTrack.bpm - t.bpm);
    score += Math.max(0, 1 - bpmDiff / 30) * 0.15;
    factors += 0.15;
  }
  // Energy (10%)
  if (sourceTrack.energy !== null && t.energy !== null) {
    score += (1 - Math.abs(sourceTrack.energy - t.energy)) * 0.10;
    factors += 0.10;
  }
  // Danceability - prefer ML version (10%)
  const srcDance = sourceTrack.danceabilityMl ?? sourceTrack.danceability;
  const tDance = t.danceabilityMl ?? t.danceability;
  if (srcDance !== null && tDance !== null) {
    score += (1 - Math.abs(srcDance - tDance)) * 0.10;
    factors += 0.10;
  }

  // === GENRE/TAGS (15% total) ===
  // Genre/tag overlap (10%)
  const sourceGenres = [...(sourceTrack.lastfmTags || []), ...(sourceTrack.essentiaGenres || [])];
  const trackGenres = [...(t.lastfmTags || []), ...(t.essentiaGenres || [])];
  if (sourceGenres.length > 0 && trackGenres.length > 0) {
    const overlap = sourceGenres.filter(g => trackGenres.includes(g)).length;
    const maxOverlap = Math.max(sourceGenres.length, trackGenres.length);
    score += (overlap / maxOverlap) * 0.10;
    factors += 0.10;
  }
  // Key compatibility (5%)
  if (sourceTrack.keyScale && t.keyScale) {
    score += (sourceTrack.keyScale === t.keyScale ? 1 : 0.5) * 0.05;
    factors += 0.05;
  }

  const finalScore = factors > 0 ? score / factors : 0;
  return { id: t.id, score: finalScore };
});
```
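The key mechanism in the algorithm above is that missing features are skipped and the score is divided by the sum of the weights actually used. A reduced Python sketch with three of the weights shows the effect; the function name and sample values are illustrative.

```python
# Subset of the weights used in the matching algorithm above.
WEIGHTS = {"moodHappy": 0.15, "moodSad": 0.10, "energy": 0.10}


def weighted_similarity(source: dict, candidate: dict, weights: dict = WEIGHTS) -> float:
    """Weighted mean of per-feature similarity over features both tracks have."""
    score = 0.0
    used_weight = 0.0
    for name, weight in weights.items():
        a = source.get(name)
        b = candidate.get(name)
        if a is None or b is None:
            continue  # skip features missing on either side
        score += (1 - abs(a - b)) * weight
        used_weight += weight
    return score / used_weight if used_weight > 0 else 0.0


source = {"moodHappy": 0.9, "moodSad": 0.1, "energy": 0.7}
partial = {"moodHappy": 0.5, "moodSad": None, "energy": 0.7}
sparse = {"energy": 0.7}

# (0.6 * 0.15 + 1.0 * 0.10) / 0.25
print(round(weighted_similarity(source, partial), 3))
# Only energy is comparable and it matches exactly.
print(round(weighted_similarity(source, sparse), 3))
```

Renormalizing keeps scores comparable across tracks with different amounts of data, but a sparsely analyzed candidate can reach a perfect score on a single matching feature, so requiring a minimum total weight before trusting the result may be worth considering.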
### Phase 5: Create Standard Mode Fallback (Day 5)
After Enhanced mode is working, implement Standard mode:
- Same algorithm structure but skip ML features
- Use estimated valence (improved heuristics)
- Lower weights on mood matching since it's estimated
- Higher weights on BPM, energy, genre tags
### Phase 6: Settings & UI (Day 6)
#### 6.1 Add Settings Toggle
```typescript
// System settings - Enhanced is DEFAULT
{
audioAnalysis: {
vibeMatchingMode: 'enhanced' | 'standard', // Default: 'enhanced'
reanalyzeOnModeChange: boolean, // Default: false
}
}
```
#### 6.2 Settings UI
```
Audio Analysis
├── Vibe Matching Mode
│   ├── ● Enhanced (Recommended - Default)
│   │   └── Uses ML models for accurate mood detection
│   └── ○ Standard (Power Saver)
│       └── Faster, uses basic audio features only
├── Analysis Status
│   └── "1,234 / 1,500 tracks analyzed (Enhanced mode)"
└── [Re-analyze Library] button
    └── "Re-analyze all tracks with current settings"
```
### Phase 7: Testing & Validation (Day 7)
#### 7.1 Test Cases
| Source Track | Bad Match (Current) | Expected Good Match |
|--------------|---------------------|---------------------|
| "Fake Happy" (Paramore) | "Summer Girl" (Jamiroquai) 97% | Other emo/pop-punk <60% |
| "Creep" (Radiohead) | Fast dance track | Other melancholic rock |
| "Uptown Funk" | Slow ballad | Other high-energy funk/pop |
#### 7.2 Performance Testing
- Analyze 100 tracks, measure time
- Memory usage during analysis
- Queue handling under load
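A small harness along these lines could collect the timing numbers. It is a sketch only: `benchmark` is a hypothetical helper, the `analyze` callable stands in for the analyzer's entry point, and `tracemalloc` reports Python-level allocations only, so memory used natively by TensorFlow would need to be measured separately (for example from container stats).

```python
import time
import tracemalloc
from typing import Callable, Iterable


def benchmark(analyze: Callable[[str], object], paths: Iterable[str]) -> dict:
    """Run analyze() over paths and report timing and peak Python memory."""
    tracemalloc.start()
    start = time.perf_counter()
    count = 0
    for path in paths:
        analyze(path)
        count += 1
    elapsed = time.perf_counter() - start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "tracks": count,
        "total_seconds": elapsed,
        "seconds_per_track": elapsed / count if count else 0.0,
        "peak_python_mb": peak_bytes / (1024 * 1024),
    }


# Stand-in analyzer so the harness can be exercised without audio files.
report = benchmark(lambda path: len(path), ["a.mp3", "b.mp3", "c.mp3"])
print(report["tracks"], report["seconds_per_track"] >= 0)
```

Running it once per mode over the same 100 tracks would give directly comparable per-track figures.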
---
## Database Schema Updates
```prisma
model Track {
// ... existing fields ...
// ML-based mood predictions (Enhanced mode)
moodHappy Float? // ML prediction 0-1
moodSad Float? // ML prediction 0-1
moodRelaxed Float? // ML prediction 0-1
moodAggressive Float? // ML prediction 0-1
danceabilityMl Float? // ML-based danceability
// Analysis metadata
analysisMode String? // 'standard' or 'enhanced'
}
```
---
## Performance Benchmarks (Estimated)
| Operation | Standard Mode | Enhanced Mode |
|-----------|---------------|---------------|
| Analysis per track | 1-2 sec | 5-10 sec |
| RAM usage | ~100MB | ~500MB |
| Models in Docker | N/A | ~200MB (pre-packaged) |
| Vibe match query | <100ms | <100ms |
| Full library (1000 tracks) | ~30 min | ~2-3 hours |
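The full-library estimates follow from the per-track figures. The sketch below redoes that arithmetic, assuming tracks are analyzed one at a time with no parallelism or queue overhead; the function name is ours.

```python
def library_hours(track_count: int, seconds_per_track_range: tuple) -> tuple:
    """Convert a per-track time range into a total time range in hours."""
    low, high = seconds_per_track_range
    return (track_count * low / 3600, track_count * high / 3600)


standard = library_hours(1000, (1, 2))    # Standard mode: 1-2 s per track
enhanced = library_hours(1000, (5, 10))   # Enhanced mode: 5-10 s per track

print([round(h * 60) for h in standard], "minutes")
print([round(h, 1) for h in enhanced], "hours")
```

Under these assumptions Standard mode lands at roughly 17 to 33 minutes and Enhanced mode at roughly 1.4 to 2.8 hours, so the table's figures sit toward the slower end of each range.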
---
## Files to Modify
| File | Changes |
|------|---------|
| `services/audio-analyzer/Dockerfile` | Add model downloads during build |
| `services/audio-analyzer/analyzer.py` | Implement ML model loading and prediction |
| `backend/prisma/schema.prisma` | Add mood prediction columns |
| `backend/src/routes/library.ts` | Update vibe matching algorithm weights |
| `frontend/features/settings/` | Add analysis mode toggle (default: enhanced) |
| `frontend/components/player/VibeGraph.tsx` | Display mood predictions |
---
## Success Metrics
After implementation, "Fake Happy" and "Summer Girl" should:
- Match at **<50%** (different emotional content, different genre)
Better matches for "Fake Happy" would be:
- Other Paramore songs (same artist = genre/production match)
- Emo/pop-punk with similar emotional complexity
- Songs with high energy but mixed emotional signals
---
## Implementation Order (Enhanced First)
### Week 1: Get Enhanced Mode Working
1. [x] Create implementation plan (this document)
2. [x] Update Dockerfile to pre-package ML models (~200MB)
3. [x] Rewrite analyzer.py with TensorFlow model loading
4. [x] Add new database columns for mood predictions (moodHappy, moodSad, etc.)
5. [x] Update vibe matching algorithm with ML mood weights
6. [x] Update programmatic playlists to use ML mood predictions
7. [ ] Run Prisma migration to apply schema changes
8. [ ] Rebuild audio-analyzer Docker container
9. [ ] Test ML analysis on sample tracks
### Week 2: Polish & Fallback
10. [ ] Test accuracy with diverse track pairs
11. [ ] Add settings UI (Enhanced = default)
12. [ ] Implement Standard mode as explicit fallback option
13. [ ] Update VibeGraph to show mood predictions
14. [ ] Documentation and testing
---
## Quick Reference: Models to Include
| Model | File | Purpose | Size |
|-------|------|---------|------|
| Embeddings | `discogs-effnet-bs64-1.pb` | Base model for all predictions | ~85MB |
| Happy | `mood_happy-discogs-effnet-1.pb` | Happiness detection | ~15MB |
| Sad | `mood_sad-discogs-effnet-1.pb` | Sadness detection | ~15MB |
| Relaxed | `mood_relaxed-discogs-effnet-1.pb` | Relaxation detection | ~15MB |
| Aggressive | `mood_aggressive-discogs-effnet-1.pb` | Aggression detection | ~15MB |
| Arousal | `arousal-discogs-effnet-1.pb` (downloaded as `mood_arousal-discogs-effnet-1.pb`) | Energy/calm scale | ~15MB |
| Valence | `valence-discogs-effnet-1.pb` (downloaded as `mood_valence-discogs-effnet-1.pb`) | Positive/negative | ~15MB |
| Danceability | `danceability-discogs-effnet-1.pb` | ML danceability | ~15MB |
| Voice/Instrumental | `voice_instrumental-discogs-effnet-1.pb` | Vocal detection | ~15MB |
**Total:** ~200MB (one-time addition to Docker image)