Initial release v1.0.0
@@ -0,0 +1,174 @@
|
||||
# Audio Analysis - Enhanced Mode (MusiCNN)
|
||||
|
||||
## Overview
|
||||
|
||||
Enhanced mode uses Essentia's TensorFlow integration with MusiCNN (Music Convolutional Neural Network) models to perform ML-based mood and audio classification. This provides significantly more accurate mood detection compared to the heuristic-based Standard mode.
|
||||
|
||||
## Architecture
|
||||
|
||||
```
                ┌───────────────────┐
                │    Audio File     │
                │   (16kHz mono)    │
                └─────────┬─────────┘
                          │
                ┌─────────▼─────────┐
                │ TensorflowPredict │
                │      MusiCNN      │
                │   (Embeddings)    │
                └─────────┬─────────┘
                          │
         ┌────────────────┼────────────────┐
         │                │                │
  ┌──────▼──────┐  ┌──────▼──────┐  ┌──────▼──────┐
  │ Mood Happy  │  │  Mood Sad   │  │ Danceability│
  │ TensorFlow  │  │ TensorFlow  │  │ TensorFlow  │
  │  Predict2D  │  │  Predict2D  │  │  Predict2D  │
  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘
         │                │                │
         └────────────────┼────────────────┘
                          │
                 ┌────────▼────────┐
                 │ Derived Scores  │
                 │ Valence/Arousal │
                 └─────────────────┘
```
|
||||
|
||||
## Key Components
|
||||
|
||||
### 1. Base Model: MusiCNN
|
||||
|
||||
- **Model**: `msd-musicnn-1.pb` (~3MB)
|
||||
- **Source**: [Essentia Model Zoo](https://essentia.upf.edu/models/autotagging/msd/)
|
||||
- **Function**: Extracts 200-dimensional embeddings from audio
|
||||
- **Algorithm**: `TensorflowPredictMusiCNN`
|
||||
|
||||
### 2. Classification Heads
|
||||
|
||||
Each classification head takes the MusiCNN embeddings and outputs probabilities:
|
||||
|
||||
| Model | File | Output |
|-------|------|--------|
| Mood Happy | `mood_happy-msd-musicnn-1.pb` | P(happy) |
| Mood Sad | `mood_sad-msd-musicnn-1.pb` | P(sad) |
| Mood Relaxed | `mood_relaxed-msd-musicnn-1.pb` | P(relaxed) |
| Mood Aggressive | `mood_aggressive-msd-musicnn-1.pb` | P(aggressive) |
| Mood Party | `mood_party-msd-musicnn-1.pb` | P(party) |
| Mood Acoustic | `mood_acoustic-msd-musicnn-1.pb` | P(acoustic) |
| Mood Electronic | `mood_electronic-msd-musicnn-1.pb` | P(electronic) |
| Danceability | `danceability-msd-musicnn-1.pb` | P(danceable) |
| Voice/Instrumental | `voice_instrumental-msd-musicnn-1.pb` | P(instrumental) |
|
||||
|
||||
### 3. Derived Features
|
||||
|
||||
Valence and Arousal are derived from the mood predictions:
|
||||
|
||||
```python
# Valence = emotional positivity
valence = happy * 0.5 + party * 0.3 + (1 - sad) * 0.2

# Arousal = energy level
arousal = (
    aggressive * 0.35 + party * 0.25 + electronic * 0.2
    + (1 - relaxed) * 0.1 + (1 - acoustic) * 0.1
)
```
|
||||
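
The weighted sums above can be wrapped in a small helper. This is a minimal sketch, not code from `analyzer.py`: the function name `derive_valence_arousal` and the dictionary keys are assumptions made for illustration. Because each set of weights sums to 1.0 and every input is a probability in 0-1, both results stay within 0-1 without extra clamping.

```python
def derive_valence_arousal(moods):
    """Combine per-mood probabilities (0-1) into valence and arousal.

    `moods` is a hypothetical dict keyed by mood name; the weights are
    the ones listed in the formulas above.
    """
    valence = (
        moods["happy"] * 0.5
        + moods["party"] * 0.3
        + (1 - moods["sad"]) * 0.2
    )
    arousal = (
        moods["aggressive"] * 0.35
        + moods["party"] * 0.25
        + moods["electronic"] * 0.2
        + (1 - moods["relaxed"]) * 0.1
        + (1 - moods["acoustic"]) * 0.1
    )
    return round(valence, 3), round(arousal, 3)


# A track that every head scores at exactly 0.5 lands in the middle.
MOOD_NAMES = ("happy", "sad", "relaxed", "aggressive",
              "party", "acoustic", "electronic")
neutral = {name: 0.5 for name in MOOD_NAMES}
print(derive_valence_arousal(neutral))  # (0.5, 0.5)
```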
|
||||
## Docker Configuration
|
||||
|
||||
### Dockerfile
|
||||
|
||||
```dockerfile
FROM ubuntu:20.04

# Install pip and curl, which the steps below rely on
RUN apt-get update \
    && apt-get install -y --no-install-recommends python3-pip curl ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Install essentia-tensorflow (includes TensorFlow + MusiCNN support)
RUN pip3 install --no-cache-dir essentia-tensorflow

# Download MusiCNN models
RUN mkdir -p /app/models \
    && curl -L -o /app/models/msd-musicnn-1.pb \
    "https://essentia.upf.edu/models/autotagging/msd/msd-musicnn-1.pb"

# Classification heads
RUN curl -L -o /app/models/mood_happy-msd-musicnn-1.pb \
    "https://essentia.upf.edu/models/classification-heads/mood_happy/mood_happy-msd-musicnn-1.pb"
# ... (other models)
```
|
||||
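
The Dockerfile excerpt only spells out the download for one head. The sketch below generates the remaining `curl` commands, assuming every classification head follows the same `<head>/<head>-msd-musicnn-1.pb` layout as the `mood_happy` URL shown above. That layout is an assumption and should be checked against the Essentia model listing before relying on it. The helper name `model_urls` is hypothetical.

```python
BASE_URL = "https://essentia.upf.edu/models"

# Head names taken from the classification-head table in this document.
HEADS = [
    "mood_happy", "mood_sad", "mood_relaxed", "mood_aggressive",
    "mood_party", "mood_acoustic", "mood_electronic",
    "danceability", "voice_instrumental",
]


def model_urls(heads=HEADS):
    """Map each local model path to its assumed download URL."""
    urls = {
        "/app/models/msd-musicnn-1.pb":
            f"{BASE_URL}/autotagging/msd/msd-musicnn-1.pb",
    }
    for head in heads:
        filename = f"{head}-msd-musicnn-1.pb"
        # Assumption: same <head>/<file> layout as the mood_happy URL.
        urls[f"/app/models/{filename}"] = (
            f"{BASE_URL}/classification-heads/{head}/{filename}"
        )
    return urls


for path, url in model_urls().items():
    print(f'curl -L -o {path} "{url}"')
```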
|
||||
### Requirements
|
||||
|
||||
- **Ubuntu 20.04** (for Python 3.8 compatibility)
|
||||
- **essentia-tensorflow** pip package
|
||||
- **~10MB** for all models combined
|
||||
|
||||
## Usage in Code
|
||||
|
||||
```python
import numpy as np
from essentia.standard import MonoLoader, TensorflowPredictMusiCNN, TensorflowPredict2D

# Load base embedding model
musicnn = TensorflowPredictMusiCNN(
    graphFilename='/app/models/msd-musicnn-1.pb',
    output="model/dense/BiasAdd"  # Embedding output layer
)

# Load classification head
mood_happy = TensorflowPredict2D(
    graphFilename='/app/models/mood_happy-msd-musicnn-1.pb',
    output="model/Softmax"
)

# Process audio (MusiCNN expects 16 kHz mono input)
path = '/music/path/to/song.mp3'  # example path
audio = MonoLoader(filename=path, sampleRate=16000)()
embeddings = musicnn(audio)           # Shape: [frames, 200]
predictions = mood_happy(embeddings)  # Shape: [frames, 2]
# Column order follows the `classes` list in the model's JSON metadata;
# check it to confirm which column holds the positive class.
happy_score = float(np.mean(predictions[:, 1]))  # Average over frames
```
|
||||
|
||||
## Output Fields
|
||||
|
||||
Enhanced mode produces these additional fields:
|
||||
|
||||
| Field | Type | Range | Description |
|-------|------|-------|-------------|
| moodHappy | float | 0-1 | ML probability of happy mood |
| moodSad | float | 0-1 | ML probability of sad mood |
| moodRelaxed | float | 0-1 | ML probability of relaxed mood |
| moodAggressive | float | 0-1 | ML probability of aggressive mood |
| moodParty | float | 0-1 | ML probability of party mood |
| moodAcoustic | float | 0-1 | ML probability of acoustic sound |
| moodElectronic | float | 0-1 | ML probability of electronic sound |
| danceabilityMl | float | 0-1 | ML danceability score |
| valence | float | 0-1 | Derived emotional positivity |
| arousal | float | 0-1 | Derived energy level |
| acousticness | float | 0-1 | From moodAcoustic |
| instrumentalness | float | 0-1 | ML voice/instrumental detection |
|
||||
|
||||
## Comparison: Standard vs Enhanced
|
||||
|
||||
| Feature | Standard Mode | Enhanced Mode |
|---------|---------------|---------------|
| Mood Detection | Heuristic (key/BPM/energy) | ML (MusiCNN) |
| Accuracy | Approximate | Research-grade |
| Speed | Fast (~100ms) | Moderate (~500ms) |
| Dependencies | Essentia core | Essentia + TensorFlow |
| Model Size | 0 | ~10MB |
| Python Version | Any | 3.7-3.9 (for pip) |
|
||||
|
||||
## Fallback Behavior
|
||||
|
||||
If Enhanced mode fails to initialize (missing models, TensorFlow errors), the analyzer automatically falls back to Standard mode:
|
||||
|
||||
```python
if self.enhanced_mode and self.musicnn_model:
    ml_features = self._extract_ml_features(audio_16k)
    result.update(ml_features)
else:
    self._apply_standard_estimates(result, scale, bpm)
```
|
||||
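
The initialization side of this fallback could look like the sketch below. It is an illustration under stated assumptions, not the actual `AudioAnalyzer` code: the class name `ModeSelector`, the injected `load_models` callable, and the `"enhanced"` mode label are all hypothetical. The point is that any loading failure is caught once at startup, so the per-track branch above only has to check two attributes.

```python
import logging

logger = logging.getLogger("audio-analyzer")


class ModeSelector:
    """Hypothetical stand-in for the mode handling in AudioAnalyzer."""

    def __init__(self, load_models):
        # `load_models` is an injected callable so the sketch runs without
        # TensorFlow; it should return a model object or raise on failure.
        self.enhanced_mode = False
        self.musicnn_model = None
        try:
            self.musicnn_model = load_models()
            self.enhanced_mode = self.musicnn_model is not None
        except Exception as exc:  # missing files, TensorFlow errors, ...
            logger.warning("Enhanced mode unavailable, using Standard: %s", exc)

    @property
    def analysis_mode(self):
        if self.enhanced_mode and self.musicnn_model:
            return "enhanced"  # assumed label; the document only shows "standard"
        return "standard"


def broken_loader():
    raise FileNotFoundError("/app/models/msd-musicnn-1.pb")


print(ModeSelector(broken_loader).analysis_mode)     # standard
print(ModeSelector(lambda: object()).analysis_mode)  # enhanced
```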
|
||||
## References
|
||||
|
||||
- [Essentia TensorFlow Documentation](https://essentia.upf.edu/machine_learning.html)
|
||||
- [MusiCNN Paper](https://arxiv.org/abs/1711.02520)
|
||||
- [Essentia Model Zoo](https://essentia.upf.edu/models/)
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,443 @@
|
||||
# Audio Analysis: Standard Mode (Heuristic Approach)
|
||||
|
||||
## Overview
|
||||
|
||||
The Lidify audio analyzer has two modes:
|
||||
- **Enhanced Mode**: Uses TensorFlow ML models for accurate mood/valence/arousal predictions
|
||||
- **Standard Mode**: Uses signal processing heuristics when ML models aren't available
|
||||
|
||||
This document covers the **Standard Mode** implementation for code review.
|
||||
|
||||
---
|
||||
|
||||
## Architecture
|
||||
|
||||
```
┌─────────────────────────────────────────────────────────────────┐
│                        Docker Container                         │
│                      lidify_audio_analyzer                      │
│                                                                 │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────────────┐  │
│  │    Redis    │◄───│   Worker    │───►│     PostgreSQL      │  │
│  │  Job Queue  │    │    Loop     │    │     Track Table     │  │
│  └─────────────┘    └──────┬──────┘    └─────────────────────┘  │
│                            │                                    │
│                     ┌──────▼──────┐                             │
│                     │AudioAnalyzer│                             │
│                     │    Class    │                             │
│                     └──────┬──────┘                             │
│                            │                                    │
│           ┌────────────────┼────────────────┐                   │
│           ▼                ▼                ▼                   │
│   ┌───────────────┐ ┌─────────────┐ ┌──────────────────┐        │
│   │ Basic Features│ │  Spectral   │ │    Heuristic     │        │
│   │  (BPM, Key)   │ │  Analysis   │ │ Mood Estimation  │        │
│   └───────────────┘ └─────────────┘ └──────────────────┘        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```
|
||||
|
||||
---
|
||||
|
||||
## File Structure
|
||||
|
||||
```
services/audio-analyzer/
├── analyzer.py          # Main analyzer code (870 lines)
├── requirements.txt     # Python dependencies
└── Dockerfile           # Container build configuration
```
|
||||
|
||||
---
|
||||
|
||||
## Key Classes
|
||||
|
||||
### 1. `AudioAnalyzer` (Lines 130-660)
|
||||
|
||||
Main analysis class with two modes:
|
||||
|
||||
```python
class AudioAnalyzer:
    def __init__(self):
        self.enhanced_mode = False  # Falls back to Standard if ML unavailable
        self._init_essentia()       # Initialize signal processing algorithms
        self._load_ml_models()      # Attempt to load ML models
```
|
||||
|
||||
### 2. `AnalysisWorker` (Lines 663-847)
|
||||
|
||||
Redis queue worker that:
|
||||
1. Polls for pending tracks from `audio:analysis:queue`
|
||||
2. Falls back to scanning `Track` table for `analysisStatus = 'pending'`
|
||||
3. Processes tracks and updates database
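
The queue-first, database-second lookup in steps 1 and 2 can be sketched as follows. This is not the worker's actual code: `next_track_id` and both callables are hypothetical stand-ins for the Redis pop on `audio:analysis:queue` and the `Track` table scan, injected so the example runs without Redis or PostgreSQL.

```python
from collections import deque


def next_track_id(pop_from_queue, scan_pending):
    """Return the next track to analyze, or None when there is no work.

    Both arguments are hypothetical callables: the first pops one ID from
    the job queue (or returns None), the second lists pending track IDs.
    """
    track_id = pop_from_queue()
    if track_id is not None:
        return track_id
    pending = scan_pending()
    return pending[0] if pending else None


# In-memory stand-ins for the queue and the pending rows.
queue = deque(["track-1"])
pending_in_db = ["track-7", "track-9"]


def pop_from_queue():
    return queue.popleft() if queue else None


def scan_pending():
    return pending_in_db


print(next_track_id(pop_from_queue, scan_pending))  # track-1 (queue has priority)
print(next_track_id(pop_from_queue, scan_pending))  # track-7 (database fallback)
```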
|
||||
|
||||
---
|
||||
|
||||
## Standard Mode: Heuristic Calculations
|
||||
|
||||
### Input Features (Always Extracted)
|
||||
|
||||
| Feature | Essentia Algorithm | Description |
|---------|-------------------|-------------|
| BPM | `RhythmExtractor2013` | Beats per minute |
| Key/Scale | `KeyExtractor` | Musical key (C, D#, etc.) and mode (major/minor) |
| Loudness | `Loudness` | Perceived loudness in dB |
| Dynamic Range | `DynamicComplexity` | Difference between quiet and loud parts |
| Danceability | `Danceability` | How suitable for dancing (0-1) |
| RMS Energy | `RMS` | Root Mean Square amplitude per frame |
| Spectral Centroid | `Centroid` | "Brightness" - center of spectral mass |
| Spectral Flatness | `FlatnessDB` | Noise-like vs tonal content |
| Zero-Crossing Rate | `ZeroCrossingRate` | Rate of signal sign changes |
|
||||
|
||||
### Frame-Based Processing (Lines 328-365)
|
||||
|
||||
```python
frame_size = 2048
hop_size = 1024

# Per-frame feature accumulators
rms_values, zcr_values = [], []
spectral_centroid_values, spectral_flatness_values = [], []

for i in range(0, len(audio_44k) - frame_size, hop_size):
    frame = audio_44k[i:i + frame_size]
    windowed = self.windowing(frame)
    spectrum = self.spectrum(windowed)

    rms_values.append(self.rms(frame))
    zcr_values.append(self.zcr(frame))
    spectral_centroid_values.append(self.spectral_centroid(spectrum))
    spectral_flatness_values.append(self.spectral_flatness(spectrum))
```
|
||||
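
To make the framing arithmetic concrete, here is a dependency-free sketch of the same loop bounds together with simplified RMS and zero-crossing computations. The helper names are hypothetical, and the pure-Python `rms` and `zero_crossing_rate` only approximate what the Essentia algorithms compute. Note that the exclusive upper bound of `range` means a clip of exactly `frame_size` samples produces no frames.

```python
import math


def frame_starts(num_samples, frame_size=2048, hop_size=1024):
    """Start offsets produced by the loop bounds shown above."""
    return list(range(0, num_samples - frame_size, hop_size))


def rms(frame):
    """Root mean square amplitude of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (simplified)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / len(frame)


# One second of audio at 44.1 kHz yields 42 overlapping frames.
print(len(frame_starts(44100)))  # 42

# A full-scale square wave that flips every sample: RMS 1.0, ZCR close to 1.
square = [1.0, -1.0] * 1024
print(rms(square), zero_crossing_rate(square))
```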
|
||||
---
|
||||
|
||||
## Heuristic Formulas
|
||||
|
||||
### Energy (Lines 347-353)

**Problem Solved**: The previous implementation used `es.Energy()`, which returns the sum of squared samples (a very large number for a full track), and normalized it incorrectly as `energy / 100`.

**Current Implementation**:
```python
avg_rms = np.mean(rms_values)
energy = min(1.0, avg_rms * 3)  # RMS is typically 0.0-0.5; scale by 3 and clamp to 1.0
```
|
||||
|
||||
---
|
||||
|
||||
### Valence (Happiness/Positivity) - Lines 495-518
|
||||
|
||||
**Formula**:
|
||||
```
valence = key_valence        * 0.40
        + bpm_valence        * 0.25
        + brightness_valence * 0.20
        + energy             * 0.15
```
|
||||
|
||||
**Components**:
|
||||
|
||||
| Component | Weight | Calculation | Rationale |
|-----------|--------|-------------|-----------|
| Key Valence | 40% | Major = 0.65, Minor = 0.35 | Major keys sound happier |
| BPM Valence | 25% | ≥120 BPM: rises from 0.5, capped at 0.8 (reached at 180 BPM); ≤80 BPM: falls from 0.5, floored at 0.2 (reached at 50 BPM); otherwise 0.5 | Fast tempo = upbeat |
| Brightness | 20% | `spectral_centroid * 1.5`, capped at 1.0 | Bright sounds feel positive |
| Energy | 15% | RMS energy (0-1) | Loud = energetic/positive |
|
||||
|
||||
**Code**:
|
||||
```python
# Key contribution
key_valence = 0.65 if scale == 'major' else 0.35

# BPM contribution
if bpm >= 120:
    bpm_valence = min(0.8, 0.5 + (bpm - 120) / 200)
elif bpm <= 80:
    bpm_valence = max(0.2, 0.5 - (80 - bpm) / 100)
else:
    bpm_valence = 0.5

# Brightness contribution
brightness_valence = min(1.0, spectral_centroid * 1.5)

# Final weighted sum
result['valence'] = round(
    key_valence * 0.4 +
    bpm_valence * 0.25 +
    brightness_valence * 0.2 +
    energy * 0.15,
    3
)
```
|
||||
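
One consequence of these weights is that the heuristic valence cannot span the full 0-1 range. The sketch below wraps the formula in a standalone function (the name `standard_valence` is hypothetical) and evaluates it at the most negative and most positive inputs, assuming `spectral_centroid` and `energy` are already normalized to 0-1 as the formula implies. This is worth keeping in mind when comparing results against thresholds such as `valence >= 0.7`.

```python
def standard_valence(scale, bpm, spectral_centroid, energy):
    """Heuristic valence, mirroring the weighted sum above."""
    key_valence = 0.65 if scale == "major" else 0.35
    if bpm >= 120:
        bpm_valence = min(0.8, 0.5 + (bpm - 120) / 200)
    elif bpm <= 80:
        bpm_valence = max(0.2, 0.5 - (80 - bpm) / 100)
    else:
        bpm_valence = 0.5
    brightness_valence = min(1.0, spectral_centroid * 1.5)
    return round(
        key_valence * 0.4 + bpm_valence * 0.25
        + brightness_valence * 0.2 + energy * 0.15,
        3,
    )


# Most negative and most positive inputs the formula can see.
lowest = standard_valence("minor", bpm=40, spectral_centroid=0.0, energy=0.0)
highest = standard_valence("major", bpm=200, spectral_centroid=1.0, energy=1.0)
print(lowest, highest)  # 0.19 0.81
```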
|
||||
---
|
||||
|
||||
### Arousal (Energy/Intensity) - Lines 520-543
|
||||
|
||||
**Formula**:
|
||||
```
arousal = bpm_arousal         * 0.35
        + energy_arousal      * 0.35
        + brightness_arousal  * 0.15
        + compression_arousal * 0.15
```
|
||||
|
||||
**Components**:
|
||||
|
||||
| Component | Weight | Calculation | Rationale |
|-----------|--------|-------------|-----------|
| BPM Arousal | 35% | `(bpm - 60) / 140`, clamped to 0.1-0.9 | Fast = high energy |
| Energy | 35% | RMS energy (0-1) | Loud = intense |
| Brightness | 15% | `spectral_centroid * 1.2`, capped at 1.0 | Bright = energetic |
| Compression | 15% | `1 - (dynamic_range / 20)`, clamped to 0-1 | Compressed = intense/modern |
|
||||
|
||||
**Code**:
|
||||
```python
# BPM contribution: linear in BPM, clamped to 0.1-0.9 (limits reached at 74 and 186 BPM)
bpm_arousal = min(0.9, max(0.1, (bpm - 60) / 140))

# Energy is direct intensity indicator
energy_arousal = energy

# Low dynamic range = compressed = more intense
compression_arousal = max(0, min(1.0, 1 - (dynamic_range / 20)))

# Brightness adds perceived energy
brightness_arousal = min(1.0, spectral_centroid * 1.2)

result['arousal'] = round(
    bpm_arousal * 0.35 +
    energy_arousal * 0.35 +
    brightness_arousal * 0.15 +
    compression_arousal * 0.15,
    3
)
```
|
||||
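
The clamping in the BPM term is easy to misread, so the sketch below isolates it and prints a few sample tempos. The function names are hypothetical and the code is a standalone restatement of the formula above, not an excerpt from `analyzer.py`. The ramp is flat at 0.1 up to 74 BPM and flat at 0.9 from 186 BPM onward.

```python
def bpm_arousal(bpm):
    """BPM term of the arousal formula: linear ramp clamped to 0.1-0.9."""
    return min(0.9, max(0.1, (bpm - 60) / 140))


def standard_arousal(bpm, energy, spectral_centroid, dynamic_range):
    """Heuristic arousal, mirroring the weighted sum above."""
    compression = max(0, min(1.0, 1 - (dynamic_range / 20)))
    brightness = min(1.0, spectral_centroid * 1.2)
    return round(
        bpm_arousal(bpm) * 0.35 + energy * 0.35
        + brightness * 0.15 + compression * 0.15,
        3,
    )


# Sample tempos across the clamped and linear parts of the ramp.
for bpm in (60, 74, 130, 186, 200):
    print(bpm, round(bpm_arousal(bpm), 3))
```

With a dynamic range of 20 dB or more the compression term drops to zero, so a quiet, dark, uncompressed track at 130 BPM scores `standard_arousal(130, 0.5, 0.0, 20) == 0.35`.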
|
||||
---
|
||||
|
||||
### Instrumentalness - Lines 545-563
|
||||
|
||||
**Approach**: Estimate likelihood of vocals vs instrumental based on spectral characteristics.
|
||||
|
||||
**Formula**:
|
||||
```
instrumentalness = flatness_normalized * 0.6 + zcr_instrumental * 0.4
```
|
||||
|
||||
**Components**:
|
||||
|
||||
| Component | Weight | Calculation | Rationale |
|-----------|--------|-------------|-----------|
| Spectral Flatness | 60% | `(flatness + 40) / 40`, clamped to 0-1 | Noise-like (0 dB) = instrumental; tonal (-40 dB or lower) = vocals |
| ZCR Pattern | 40% | Low (<0.05) = 0.7; High (>0.15) = 0.4; otherwise 0.5 | Sustained tones = instrumental |
|
||||
|
||||
**Code**:
|
||||
```python
# Spectral flatness: -40dB to 0dB → 0 to 1
flatness_normalized = min(1.0, max(0, (spectral_flatness + 40) / 40))

# ZCR patterns
if zcr < 0.05:
    zcr_instrumental = 0.7  # Sustained instrumental tones
elif zcr > 0.15:
    zcr_instrumental = 0.4  # Could be speech or percussion
else:
    zcr_instrumental = 0.5  # Uncertain

result['instrumentalness'] = round(
    flatness_normalized * 0.6 + zcr_instrumental * 0.4,
    3
)
```
|
||||
|
||||
---
|
||||
|
||||
### Acousticness - Lines 565-568

**Simple heuristic**: High dynamic range suggests acoustic recording (natural dynamics preserved).

```python
result['acousticness'] = round(min(1.0, dynamic_range / 12), 3)
```
|
||||
|
||||
| Dynamic Range | Acousticness | Interpretation |
|---------------|--------------|----------------|
| < 6 dB | < 0.5 | Heavily compressed (electronic/pop) |
| 6-12 dB | 0.5-1.0 | Moderate (mixed) |
| > 12 dB | 1.0 | High dynamic range (acoustic/classical) |
|
||||
|
||||
---
|
||||
|
||||
### Speechiness - Lines 570-575
|
||||
|
||||
**Approach**: Speech has characteristic ZCR + spectral centroid patterns.
|
||||
|
||||
```python
if 0.08 < zcr < 0.2 and 0.1 < spectral_centroid < 0.4:
    result['speechiness'] = round(min(0.5, zcr * 3), 3)
else:
    result['speechiness'] = 0.1
```
|
||||
|
||||
| Condition | Result |
|-----------|--------|
| ZCR 0.08-0.2 AND centroid 0.1-0.4 | Speech-like (up to 0.5) |
| Outside range | Low speechiness (0.1) |
|
||||
|
||||
---
|
||||
|
||||
## Mood Tag Generation (Lines 581-660)
|
||||
|
||||
Tags are derived from computed features:
|
||||
|
||||
| Condition | Tags Added |
|-----------|------------|
| `arousal >= 0.7` | energetic, upbeat |
| `arousal <= 0.3` | calm, peaceful |
| `valence >= 0.7` | happy, uplifting |
| `valence <= 0.3` | sad, melancholic |
| `danceability >= 0.7` | dance, groovy |
| `bpm >= 140` | fast |
| `bpm <= 80` | slow |
| `keyScale == 'minor'` (and not happy) | moody |
| `arousal >= 0.7 AND bpm >= 120` | workout |
| `arousal <= 0.4 AND valence <= 0.4` | atmospheric |
| `arousal <= 0.3 AND bpm <= 90` | chill |
|
||||
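
The rules in the table translate directly into a sequence of independent checks. The sketch below is one possible reading, not the code at lines 581-660: the function name `mood_tags` is hypothetical, tags are emitted in table order, and "not happy" is interpreted as "the `happy` tag was not added", which is an assumption.

```python
def mood_tags(arousal, valence, danceability, bpm, key_scale):
    """Apply the tag rules from the table above, in table order."""
    tags = []
    if arousal >= 0.7:
        tags += ["energetic", "upbeat"]
    if arousal <= 0.3:
        tags += ["calm", "peaceful"]
    if valence >= 0.7:
        tags += ["happy", "uplifting"]
    if valence <= 0.3:
        tags += ["sad", "melancholic"]
    if danceability >= 0.7:
        tags += ["dance", "groovy"]
    if bpm >= 140:
        tags.append("fast")
    if bpm <= 80:
        tags.append("slow")
    # Assumption: "not happy" means the happy tag was not added.
    if key_scale == "minor" and "happy" not in tags:
        tags.append("moody")
    if arousal >= 0.7 and bpm >= 120:
        tags.append("workout")
    if arousal <= 0.4 and valence <= 0.4:
        tags.append("atmospheric")
    if arousal <= 0.3 and bpm <= 90:
        tags.append("chill")
    return tags


print(mood_tags(arousal=0.25, valence=0.5, danceability=0.3, bpm=72, key_scale="major"))
# ['calm', 'peaceful', 'slow', 'chill']
```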
|
||||
---
|
||||
|
||||
## Output Schema
|
||||
|
||||
```typescript
interface AnalysisResult {
  // Basic features
  bpm: number;              // 60-200 typical
  beatsCount: number;       // Total beat count
  key: string;              // "C", "D#", etc.
  keyScale: string;         // "major" or "minor"
  keyStrength: number;      // 0-1 confidence

  // Energy metrics
  energy: number;           // 0-1 (RMS-based)
  loudness: number;         // dB
  dynamicRange: number;     // dB

  // Heuristic estimates
  danceability: number;     // 0-1
  valence: number;          // 0-1 (happiness)
  arousal: number;          // 0-1 (energy)
  instrumentalness: number; // 0-1
  acousticness: number;     // 0-1
  speechiness: number;      // 0-1

  // Derived
  moodTags: string[];       // ["calm", "peaceful", "chill"]
  analysisMode: "standard"; // Always "standard" for this mode
}
```
|
||||
|
||||
---
|
||||
|
||||
## Database Update (Lines 766-822)
|
||||
|
||||
All features are persisted to the `Track` table:
|
||||
|
||||
```sql
UPDATE "Track"
SET
    bpm = %s,
    "beatsCount" = %s,
    key = %s,
    "keyScale" = %s,
    "keyStrength" = %s,
    energy = %s,
    loudness = %s,
    "dynamicRange" = %s,
    danceability = %s,
    valence = %s,
    arousal = %s,
    instrumentalness = %s,
    acousticness = %s,
    speechiness = %s,
    "moodTags" = %s,
    "analysisMode" = 'standard',
    "analysisStatus" = 'completed',
    "analysisVersion" = %s,
    "analyzedAt" = %s
WHERE id = %s
```
|
||||
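
The statement has 18 `%s` placeholders, and the parameter tuple must supply values in exactly that order. The sketch below shows one way to build it from an analysis result; `RESULT_KEYS`, `build_update_params`, and the version string are hypothetical names chosen for illustration. Missing keys become `None`, which the driver would send as SQL `NULL`.

```python
from datetime import datetime, timezone

# Column order must match the %s placeholders in the UPDATE statement.
RESULT_KEYS = [
    "bpm", "beatsCount", "key", "keyScale", "keyStrength",
    "energy", "loudness", "dynamicRange", "danceability",
    "valence", "arousal", "instrumentalness", "acousticness",
    "speechiness", "moodTags",
]


def build_update_params(result, track_id, analysis_version):
    """Return the parameter tuple for the UPDATE above (18 values)."""
    values = [result.get(key) for key in RESULT_KEYS]
    analyzed_at = datetime.now(timezone.utc)
    return (*values, analysis_version, analyzed_at, track_id)


params = build_update_params(
    {"bpm": 128.5, "moodTags": ["moody"]},
    track_id="track-1",
    analysis_version="1.0.0",  # placeholder version string
)
print(len(params))  # 18
```

The tuple would then be passed as the second argument to the cursor's `execute` call alongside the SQL text, so that values are bound by the driver and never interpolated into the string.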
|
||||
---
|
||||
|
||||
## Known Limitations
|
||||
|
||||
### Standard Mode vs ML Models
|
||||
|
||||
| Aspect | Standard Mode | Enhanced Mode (ML) |
|--------|--------------|-------------------|
| Valence accuracy | ~60% correlation | ~85% correlation |
| Arousal accuracy | ~65% correlation | ~88% correlation |
| Mood detection | Rule-based | Neural network |
| Processing speed | Fast (~1-2 sec) | Slower (~5-10 sec) |
| Dependencies | Essentia only | Essentia + TensorFlow |
|
||||
|
||||
### Edge Cases
|
||||
|
||||
1. **Ambient music**: Low BPM detection reliability
|
||||
2. **Classical**: Variable tempo causes BPM averaging issues
|
||||
3. **Spoken word**: May be misclassified as low-energy music
|
||||
4. **Electronic/EDM**: Compression detection may overestimate arousal
|
||||
|
||||
---
|
||||
|
||||
## Dependencies
|
||||
|
||||
```
# requirements.txt
essentia==2.1b6.dev1110
essentia-tensorflow==2.1b6.dev1110
numpy>=1.21.0,<2.0.0
tensorflow==2.15.0
redis>=4.5.0
psycopg2-binary>=2.9.0
```
|
||||
|
||||
---
|
||||
|
||||
## Testing
|
||||
|
||||
Run single file analysis:
|
||||
```bash
docker exec lidify_audio_analyzer python3 analyzer.py --test /music/path/to/song.mp3
```
|
||||
|
||||
Example output:
|
||||
```json
{
  "bpm": 128.5,
  "beatsCount": 256,
  "key": "C",
  "keyScale": "minor",
  "keyStrength": 0.723,
  "energy": 0.65,
  "loudness": -8.2,
  "dynamicRange": 7.5,
  "danceability": 0.72,
  "valence": 0.42,
  "arousal": 0.68,
  "instrumentalness": 0.35,
  "acousticness": 0.625,
  "speechiness": 0.1,
  "moodTags": ["energetic", "upbeat", "moody", "dance"],
  "analysisMode": "standard"
}
```
|
||||
|
||||
---
|
||||
|
||||
## Related Files
|
||||
|
||||
- `services/audio-analyzer/Dockerfile` - Container build
|
||||
- `backend/src/services/vibeMatching.ts` - Uses these features for song matching
|
||||
- `prisma/schema.prisma` - Track table schema with analysis columns
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,107 @@
|
||||
# Curated Vibe Mixes Implementation
|
||||
|
||||
## Overview
|
||||
|
||||
This update adds **19 new curated vibe mixes** and a **Mood-on-Demand** feature that allows users to generate custom mixes based on audio features.
|
||||
|
||||
## Bug Fix
|
||||
|
||||
Fixed the `genres` field bug - the Album model uses `genres` (JSON array) not `genre` (string). Added a helper function `findTracksByGenrePatterns()` that properly queries:
|
||||
1. Track's `lastfmTags` and `essentiaGenres` (native String[] fields)
|
||||
2. Falls back to filtering `album.genres` JSON array in application code
|
||||
|
||||
## New Daily Vibe Mixes (10 tracks each)
|
||||
|
||||
| Mix Name | Description | Key Audio Features |
|----------|-------------|-------------------|
| **Sad Girl Sundays** | Melancholic introspection | valence < 0.35, minor key, arousal < 0.4 |
| **Main Character Energy** | You're the protagonist ✨ | valence > 0.55, energy > 0.55, danceability > 0.5 |
| **Villain Era** | Dark & empowering 😈 | minor key, energy > 0.65, aggressive tags |
| **3AM Thoughts** | Late night overthinking 🌙 | arousal < 0.35, energy < 0.45, valence < 0.45 |
| **Hot Girl Walk** | Confident cardio 💅 | danceability > 0.65, BPM 95-135, energy > 0.55 |
| **Rage Cleaning** | Aggressive productivity 🔥 | energy > 0.75, arousal > 0.65, BPM > 125 |
| **Golden Hour** | Warm sunset vibes 🌅 | valence > 0.45, acousticness > 0.35, energy 0.25-0.65 |
| **Shower Karaoke** | Belters you can't help but sing 🚿 | instrumentalness < 0.35, energy > 0.55, valence > 0.45 |
| **In My Feelings** | Let it all out 💔 | valence < 0.4, arousal < 0.55, acousticness > 0.25 |
| **Midnight Drive** | Late night cruising 🚗 | energy 0.35-0.65, arousal 0.25-0.55, BPM 85-125 |
| **Coffee Shop Vibes** | Cozy background ☕ | acousticness > 0.4, energy 0.15-0.55 |
| **Romanticize Your Life** | Aesthetic moments 🎬 | valence 0.35-0.75, arousal 0.25-0.65, acousticness > 0.25 |
| **That Girl Era** | Self-improvement mode 💪 | valence > 0.55, energy > 0.45, danceability > 0.45 |
| **Unhinged** | Embrace the chaos 🎪 | Extreme features (high or low everything) |
|
||||
|
||||
## New Weekly Curated Mixes (20 tracks each)
|
||||
|
||||
| Mix Name | Description | Algorithm |
|----------|-------------|-----------|
| **Deep Cuts** | Hidden gems 💎 | Tracks with zero or few plays |
| **Key Journey** | Harmonic progression 🎹 | Ordered by circle of fifths |
| **Tempo Flow** | Energy arc 📈 | slow → fast → slow BPM journey |
| **Vocal Detox** | Instrumental escape 🧘 | instrumentalness > 0.75 |
| **Minor Key Mondays** | All minor key bangers 🖤 | keyScale = 'minor', energy > 0.45 |
|
||||
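
As an illustration of the "slow → fast → slow" ordering used by Tempo Flow, the sketch below sorts tracks by BPM and splits them into a rising half and a falling half. This is one simple way to produce such an arc and is not necessarily how `programmaticPlaylists.ts` does it (that file is TypeScript; Python is used here only for a compact example). The function name `tempo_arc` and the track shape are hypothetical.

```python
def tempo_arc(tracks):
    """Order tracks so BPM rises to a peak and then falls.

    `tracks` is a list of dicts with a "bpm" key (hypothetical shape).
    """
    by_bpm = sorted(tracks, key=lambda t: t["bpm"])
    rising = by_bpm[::2]          # every other track, slow to fast
    falling = by_bpm[1::2][::-1]  # the remaining tracks, fast to slow
    return rising + falling


playlist = [{"bpm": b} for b in (120, 60, 140, 80, 100)]
print([t["bpm"] for t in tempo_arc(playlist)])  # [60, 100, 140, 120, 80]
```

Taking alternate tracks for each half keeps the two slopes similar in steepness, and the fastest track always ends the rising half, so the peak sits near the middle of the mix.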
|
||||
## Mood-on-Demand Feature
|
||||
|
||||
### Backend Endpoints
|
||||
|
||||
- `POST /api/mixes/mood` - Generate a custom mix based on audio parameters
|
||||
- `GET /api/mixes/mood/presets` - Get available mood presets for the UI
|
||||
|
||||
### Preset Moods (12 total)
|
||||
|
||||
1. 😊 Happy & Upbeat
|
||||
2. 😢 Melancholic
|
||||
3. 😌 Chill & Relaxed
|
||||
4. ⚡ High Energy
|
||||
5. 🎯 Focus Mode
|
||||
6. 💃 Dance Party
|
||||
7. 🎸 Acoustic Vibes
|
||||
8. 🖤 Dark & Moody
|
||||
9. 💕 Romantic
|
||||
10. 💪 Workout Beast
|
||||
11. 😴 Sleep & Unwind
|
||||
12. 👑 Confidence Boost
|
||||
|
||||
### Custom Mix Builder
|
||||
|
||||
Users can adjust sliders for:
|
||||
- Happiness (valence)
|
||||
- Energy
|
||||
- Danceability
|
||||
- Tempo (BPM)
|
||||
|
||||
## Frontend Changes
|
||||
|
||||
### New Component: `MoodMixer.tsx`
|
||||
|
||||
A beautiful Spotify-esque modal with:
|
||||
- Gradient preset cards with emojis
|
||||
- Smooth animations (Framer Motion)
|
||||
- Custom range slider controls
|
||||
- Dark theme matching the app aesthetic
|
||||
|
||||
### Homepage Integration
|
||||
|
||||
Added "Mood Mixer" button next to the "Refresh" button in the "Made For You" section.
|
||||
|
||||
## Files Modified
|
||||
|
||||
### Backend
|
||||
- `backend/src/services/programmaticPlaylists.ts` - Added helper function, fixed 12 genre bugs, added 19 new mix generators
|
||||
- `backend/src/routes/mixes.ts` - Added mood endpoints and presets
|
||||
|
||||
### Frontend
|
||||
- `frontend/lib/api.ts` - Added types and API methods for mood mixing
|
||||
- `frontend/app/page.tsx` - Integrated MoodMixer modal
|
||||
- `frontend/components/MoodMixer.tsx` - New component (created)
|
||||
|
||||
## Technical Notes
|
||||
|
||||
- All mixes use Essentia audio analysis data (valence, energy, danceability, BPM, key, etc.)
|
||||
- Fallback to Last.fm tags when audio analysis is insufficient
|
||||
- Daily mixes: 10 tracks, refreshed daily
|
||||
- Weekly mixes: 20 tracks, for longer listening sessions
|
||||
- Mix generation is cached in Redis for performance
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -0,0 +1,798 @@
|
||||
# Modified Files for Review
|
||||
|
||||
> **Last Updated:** December 16, 2025
|
||||
> **Features:** Spotify Import + UI Overhaul (Activity Panel, Carousels, Notifications, Playlist/Mix/Discover Redesign, Settings Page Redesign)
|
||||
|
||||
## Overview
|
||||
|
||||
This document tracks all files created or modified as part of:
|
||||
|
||||
1. **Spotify Import Feature** - Import Spotify playlists, match tracks, download albums
|
||||
2. **UI Overhaul** - Activity Panel, horizontal carousels, notification system
|
||||
|
||||
---
|
||||
|
||||
## Backend - New Files
|
||||
|
||||
| File | Purpose |
| ---- | ------- |
| `backend/src/services/notificationService.ts` | Notification CRUD service with convenience methods |
| `backend/src/services/spotifyImport.ts` | Spotify playlist import logic, track matching, album resolution |
| `backend/src/services/spotify.ts` | Spotify API/scraping service (embed data extraction) |
| `backend/src/routes/notifications.ts` | Notification & download history API endpoints |
| `backend/src/routes/spotify.ts` | Spotify import API endpoints |
| `backend/src/utils/playlistLogger.ts` | Debug logger for Spotify import jobs |
|
||||
|
||||
## Backend - Modified Files
|
||||
|
||||
| File | Changes |
| ---- | ------- |
| `backend/prisma/schema.prisma` | Added `Notification` model, `DownloadJob.cleared` field |
| `backend/src/services/simpleDownloadManager.ts` | Added notification integration, failure deduplication |
| `backend/src/services/lidarr.ts` | Smart `anyReleaseOk` fallback, MusicBrainz fallback for artist lookup |
| `backend/src/services/musicbrainz.ts` | Recording filtering, scoring system, title normalization |
| `backend/src/services/spotify.ts` | Embed scraping improvements, debug logging |
| `backend/src/index.ts` | Registered notification routes |
|
||||
|
||||
---
|
||||
|
||||
## Frontend - New Files
|
||||
|
||||
| File | Purpose |
| ---- | ------- |
| `frontend/components/layout/ActivityPanel.tsx` | Collapsible 3rd column with tabs, PWA install button |
| `frontend/components/activity/NotificationsTab.tsx` | System notifications list |
| `frontend/components/activity/ActiveDownloadsTab.tsx` | Currently downloading items |
| `frontend/components/activity/HistoryTab.tsx` | Completed/failed with retry |
| `frontend/components/ui/HorizontalCarousel.tsx` | Reusable carousel with arrows |
| `frontend/hooks/useActivityPanel.ts` | Panel state management |
| `frontend/app/import/spotify/page.tsx` | Spotify import UI page (preview, selection, progress) |
|
||||
|
||||
## Frontend - Modified Files
|
||||
|
||||
| File | Changes |
| ---- | ------- |
| `frontend/components/layout/AuthenticatedLayout.tsx` | Added 3rd column, event listener for toggle |
| `frontend/components/layout/TopBar.tsx` | Added `ActivityPanelToggle` button |
| `frontend/components/MixCard.tsx` | Reduced padding/sizing (`p-4` → `p-2.5`) |
| `frontend/features/home/components/ArtistsGrid.tsx` | Uses `HorizontalCarousel` |
| `frontend/features/home/components/MixesGrid.tsx` | Uses `HorizontalCarousel` |
| `frontend/features/home/components/ContinueListening.tsx` | Uses `HorizontalCarousel` |
| `frontend/features/home/components/PodcastsGrid.tsx` | Uses `HorizontalCarousel` |
| `frontend/features/home/components/HomeHero.tsx` | Already optimized (compact greeting) |
| `frontend/lib/api.ts` | Added notification API methods, Spotify import methods |
| `frontend/app/playlists/page.tsx` | Added "Import from Spotify" button/link |
| `frontend/app/playlist/[id]/page.tsx` | Full Spotify-style redesign (see below) |
| `frontend/app/mix/[id]/page.tsx` | Full Spotify-style redesign (matches playlist page) |
| `frontend/app/discover/page.tsx` | Updated to use consistent container widths |
| `frontend/features/discover/components/DiscoverHero.tsx` | Redesigned to match playlist/mix hero style |
| `frontend/features/discover/components/DiscoverActionBar.tsx` | Redesigned with Lidify yellow play button |
| `frontend/features/discover/components/TrackList.tsx` | Redesigned to match playlist/mix track listing |
| `frontend/components/layout/Sidebar.tsx` | Removed unused icon imports |
|
||||
|
||||
---
|
||||
|
||||
## Database Changes
|
||||
|
||||
```prisma
// NEW MODEL
model Notification {
  id        String   @id @default(cuid())
  userId    String
  type      String   // system, download_complete, playlist_ready, error, import_complete
  title     String
  message   String?
  metadata  Json?    // { playlistId, albumId, artistId, etc. }
  read      Boolean  @default(false)
  cleared   Boolean  @default(false)
  createdAt DateTime @default(now())

  user User @relation(fields: [userId], references: [id], onDelete: Cascade)

  @@index([userId, cleared])
  @@index([userId, read])
  @@index([createdAt])
}

// MODIFIED MODEL - DownloadJob
model DownloadJob {
  // ... existing fields ...
  cleared Boolean @default(false) // NEW: User dismissed from history
}
```
|
||||
|
||||
**Migration Applied:** `npx prisma db push`
|
||||
|
||||
---
|
||||
|
||||
## API Endpoints
|
||||
|
||||
### Notifications (`/api/notifications`)
|
||||
|
||||
| Method | Endpoint | Description |
| ------ | -------- | ----------- |
| GET | `/notifications` | List uncleared notifications |
| GET | `/notifications/unread-count` | Get unread count |
| POST | `/notifications/:id/read` | Mark as read |
| POST | `/notifications/read-all` | Mark all as read |
| POST | `/notifications/:id/clear` | Clear (dismiss) notification |
| POST | `/notifications/clear-all` | Clear all notifications |
| GET | `/notifications/downloads/active` | Active downloads |
| GET | `/notifications/downloads/history` | Completed/failed downloads |
| POST | `/notifications/downloads/:id/clear` | Clear from history |
| POST | `/notifications/downloads/clear-all` | Clear all history |
| POST | `/notifications/downloads/:id/retry` | Retry failed download |
|
||||
|
||||
### Spotify Import (`/api/spotify`)
|
||||
|
||||
| Method | Endpoint                     | Description                      |
| ------ | ---------------------------- | -------------------------------- |
| POST   | `/spotify/import/preview`    | Generate import preview from URL |
| POST   | `/spotify/import/start`      | Start import with selections     |
| GET    | `/spotify/import/:id/status` | Get import job status            |
|
||||
|
||||
---
|
||||
|
||||
## Key Bug Fixes
|
||||
|
||||
### 1. Track Matching (Spotify Import)
|
||||
|
||||
- **File:** `backend/src/services/spotifyImport.ts`
|
||||
- **Fix:** Added `stripTrackSuffix()` to remove "- 2011 Remaster" etc. while keeping punctuation
|
||||
- **Fix:** Added Unicode normalization for artist names (Röyksopp → Royksopp)
|
||||
- **Fix:** Multiple matching strategies (exact → stripped → fuzzy)
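
The matching cascade described above can be sketched roughly as follows. This is a minimal illustration, not the code in `spotifyImport.ts`: only the name `stripTrackSuffix` comes from the notes above, while `foldDiacritics`, `matchTitle`, the suffix list, and the substring test standing in for "fuzzy" matching are all assumptions.

```typescript
// Hypothetical sketch; the real helpers in spotifyImport.ts may differ.

// Assumed suffix list: trailing " - 2011 Remaster", " - Remastered 2011", etc.
const SUFFIX_PATTERN =
  /\s+-\s+(?:\d{4}\s+)?(?:remaster(?:ed)?|radio edit|single version|mono|stereo)\b.*$/i;

/** Remove a trailing edition marker while leaving other punctuation intact. */
function stripTrackSuffix(title: string): string {
  return title.replace(SUFFIX_PATTERN, "").trim();
}

/** Decompose accented characters and drop the combining marks (Röyksopp -> Royksopp). */
function foldDiacritics(name: string): string {
  return name.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

type MatchKind = "exact" | "stripped" | "fuzzy" | "none";

/** Try the strategies in order: exact, then suffix-stripped, then a loose comparison. */
function matchTitle(wanted: string, candidate: string): MatchKind {
  const a = foldDiacritics(wanted).toLowerCase();
  const b = foldDiacritics(candidate).toLowerCase();
  if (a === b) return "exact";

  const strippedA = stripTrackSuffix(a);
  const strippedB = stripTrackSuffix(b);
  if (strippedA === strippedB) return "stripped";

  // Stand-in for fuzzy matching: one title contains the other.
  if (
    strippedA.length > 0 &&
    strippedB.length > 0 &&
    (strippedA.includes(strippedB) || strippedB.includes(strippedA))
  ) {
    return "fuzzy";
  }
  return "none";
}
```

Returning the strategy that succeeded, rather than a plain boolean, would let a caller log how often the looser strategies are needed.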
|
||||
|
||||
### 2. MusicBrainz Album Resolution
|
||||
|
||||
- **File:** `backend/src/services/musicbrainz.ts`
|
||||
- **Fix:** Score threshold > 50 for studio albums
|
||||
- **Fix:** Recording filtering (exclude live/demo/acoustic)
|
||||
- **Fix:** Soundtrack penalty in scoring
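
One way the three rules above could combine is sketched below. The `AlbumCandidate` shape, the penalty size, and applying the live/demo/acoustic filter to titles are all assumptions made for illustration; only the `> 50` threshold is taken from the notes, and the actual logic in `musicbrainz.ts` may differ.

```typescript
// Hypothetical sketch; field names and the penalty value are illustrative.
interface AlbumCandidate {
  title: string;
  score: number; // search relevance, assumed to be on a 0-100 scale
  secondaryTypes: string[]; // e.g. ["Soundtrack"], ["Live"]
}

const SCORE_THRESHOLD = 50; // candidates must score strictly above this
const SOUNDTRACK_PENALTY = 20; // assumed value, not taken from the source
const VARIANT_PATTERN = /\b(live|demo|acoustic)\b/i;

/** Apply the soundtrack penalty to the raw search score. */
function effectiveScore(candidate: AlbumCandidate): number {
  const isSoundtrack = candidate.secondaryTypes.some((t) => t === "Soundtrack");
  return isSoundtrack ? candidate.score - SOUNDTRACK_PENALTY : candidate.score;
}

/** Pick the best-scoring candidate that is not a live/demo/acoustic variant. */
function pickStudioAlbum(candidates: AlbumCandidate[]): AlbumCandidate | null {
  let best: AlbumCandidate | null = null;
  let bestScore = SCORE_THRESHOLD;
  for (const candidate of candidates) {
    if (VARIANT_PATTERN.test(candidate.title)) continue;
    const score = effectiveScore(candidate);
    if (score > bestScore) {
      best = candidate;
      bestScore = score;
    }
  }
  return best;
}
```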
|
||||
|
||||
### 3. Lidarr Album Addition
|
||||
|
||||
- **File:** `backend/src/services/lidarr.ts`
|
||||
- **Fix:** Smart `anyReleaseOk` fallback (try strict first, then loosen)
|
||||
- **Fix:** MusicBrainz fallback when Lidarr's metadata server fails
|
||||
- **Fix:** Immediate error when no releases found
|
||||
|
||||
### 4. Multiple Failure Notifications
|
||||
|
||||
- **File:** `backend/src/services/simpleDownloadManager.ts`
|
||||
- **Fix:** 30-second deduplication window for failure events
|
||||
- **Fix:** Only notify on final exhaustion, not each retry
|
||||
- **Fix:** Skip notifications for discovery/import batches
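
The three rules above can be sketched as a single gate in front of the notification call. This is an illustration under assumptions, not the code in `simpleDownloadManager.ts`: `FailureDeduplicator`, `FailureEvent`, and `shouldNotifyFailure` are hypothetical names, and the clock is passed in explicitly so the behaviour is easy to test.

```typescript
// Hypothetical sketch of the failure-notification gate.
class FailureDeduplicator {
  private readonly lastNotified = new Map<string, number>();
  private readonly windowMs: number;

  constructor(windowMs: number = 30000) {
    this.windowMs = windowMs;
  }

  /** Returns true at most once per key within the deduplication window. */
  shouldNotify(key: string, nowMs: number = Date.now()): boolean {
    const previous = this.lastNotified.get(key);
    if (previous !== undefined && nowMs - previous < this.windowMs) {
      return false;
    }
    this.lastNotified.set(key, nowMs);
    return true;
  }
}

interface FailureEvent {
  jobId: string;
  attempt: number; // 1-based attempt that just failed
  maxAttempts: number;
  isBatchJob: boolean; // discovery or import batch
}

function shouldNotifyFailure(
  event: FailureEvent,
  dedup: FailureDeduplicator,
  nowMs: number = Date.now(),
): boolean {
  if (event.isBatchJob) return false; // batches report on their own
  if (event.attempt < event.maxAttempts) return false; // retries remain
  return dedup.shouldNotify(event.jobId, nowMs);
}
```

Checking the cheap conditions first means the deduplication map only records events that would otherwise have produced a notification.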
|
||||
|
||||
---
|
||||
|
||||
## Testing Checklist
|
||||
|
||||
### Activity Panel
|
||||
|
||||
- [ ] Panel opens/closes from TopBar button
|
||||
- [ ] Panel state persists in localStorage
|
||||
- [ ] Notifications tab shows system messages
|
||||
- [ ] Active tab shows downloading items (refreshes every 5s)
|
||||
- [ ] History tab shows completed/failed
|
||||
- [ ] Retry button works for failed downloads
|
||||
- [ ] Clear buttons work
|
||||
|
||||
### Home Page Carousels
|
||||
|
||||
- [ ] Horizontal scroll works
|
||||
- [ ] Arrow buttons appear on hover (desktop)
|
||||
- [ ] Snap behavior works
|
||||
- [ ] Card sizing is compact
|
||||
|
||||
### Spotify Import
|
||||
|
||||
- [ ] Preview generation works
|
||||
- [ ] Album selection works
|
||||
- [ ] Downloads start correctly
|
||||
- [ ] Track matching works after downloads
|
||||
- [ ] Playlist is created with matched tracks
|
||||
- [ ] Notification appears when complete
|
||||
|
||||
### Notifications
|
||||
|
||||
- [ ] Download complete creates notification
|
||||
- [ ] Download failed creates notification (only on exhaustion)
|
||||
- [ ] Spotify import complete creates notification
|
||||
- [ ] Unread badge shows count
|
||||
- [ ] Mark as read works
|
||||
- [ ] Clear works
|
||||
|
||||
### Playlist Page
|
||||
|
||||
- [ ] Hero section is compact with bottom-aligned content
|
||||
- [ ] Shuffle button randomizes and plays tracks
|
||||
- [ ] Track listing spans full width (no container)
|
||||
- [ ] Currently playing track is highlighted
|
||||
- [ ] Track numbers become play icons on hover
|
||||
- [ ] Album column hidden on mobile
|
||||
|
||||
### PWA Install
|
||||
|
||||
- [ ] "Install App" button appears in Activity Panel (when installable)
|
||||
- [ ] Button triggers browser install prompt
|
||||
- [ ] Button disappears after installation
|
||||
|
||||
---
|
||||
|
||||
## Rollback Instructions
|
||||
|
||||
If issues arise, revert these files:
|
||||
|
||||
```bash
# Core files to revert for UI changes
git checkout HEAD~1 -- frontend/components/layout/AuthenticatedLayout.tsx
git checkout HEAD~1 -- frontend/components/layout/TopBar.tsx
git checkout HEAD~1 -- frontend/components/layout/ActivityPanel.tsx
git checkout HEAD~1 -- frontend/components/activity/

# For Spotify import issues
git checkout HEAD~1 -- backend/src/services/spotifyImport.ts
git checkout HEAD~1 -- backend/src/services/musicbrainz.ts
git checkout HEAD~1 -- backend/src/services/lidarr.ts

# Database rollback (if needed)
# Remove Notification model and DownloadJob.cleared from schema
npx prisma db push
```
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
- The old `DownloadNotifications.tsx` (floating modal) still exists but is no longer imported in the layout
|
||||
- All grid components were already converted to carousels prior to this session
|
||||
- The Spotify import flow uses `lidarrService.addAlbum()` directly instead of `simpleDownloadManager` to avoid same-artist fallback
|
||||
|
||||
## Playlist Page Redesign
|
||||
|
||||
**File:** `frontend/app/playlist/[id]/page.tsx`
|
||||
|
||||
### Changes Made
|
||||
|
||||
1. **Fixed React Hooks Error** - Moved `totalDuration` useMemo before early returns
|
||||
2. **Full-Width Track Listing** - Removed container wrapper, tracks span full panel width like Spotify
|
||||
3. **Compact Hero Section** - Smaller cover art (140px/192px), bottom-aligned content, reduced title size
|
||||
4. **Added Shuffle Button** - Shuffles and plays all tracks in random order
|
||||
5. **Grid-Based Track Layout** - Columns: #, Title/Artist, Album, Duration (responsive)
|
||||
6. **Track Hover States** - Number becomes play icon on hover, row highlights
|
||||
|
||||
### PWA Install in Activity Panel
|
||||
|
||||
**File:** `frontend/components/layout/ActivityPanel.tsx`
|
||||
|
||||
- Added `beforeinstallprompt` event listener
|
||||
- "Install App" button appears at bottom of panel when PWA can be installed
|
||||
- Hides automatically when app is already installed or running in standalone mode
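
The button's visibility logic can be sketched as a small state holder that is independent of React and the DOM. The names `InstallPromptState` and `InstallPromptEventLike` are hypothetical, and the event shape is declared locally on the assumption that the project's DOM typings do not include `BeforeInstallPromptEvent`. The component in `ActivityPanel.tsx` may be structured differently.

```typescript
// Hypothetical sketch; only the parts of the browser event that are used here.
interface InstallPromptEventLike {
  preventDefault(): void;
  prompt(): Promise<unknown>;
}

class InstallPromptState {
  private deferred: InstallPromptEventLike | null = null;
  private installed: boolean;

  /** isStandalone: whether the app already runs as an installed PWA. */
  constructor(isStandalone: boolean) {
    this.installed = isStandalone;
  }

  /** Call from the `beforeinstallprompt` listener. */
  onBeforeInstallPrompt(event: InstallPromptEventLike): void {
    event.preventDefault(); // suppress the browser's own install UI
    this.deferred = event; // keep the event so the button can trigger it later
  }

  /** Call from the `appinstalled` listener. */
  onAppInstalled(): void {
    this.installed = true;
    this.deferred = null;
  }

  /** True when the "Install App" button should be shown. */
  get canInstall(): boolean {
    return !this.installed && this.deferred !== null;
  }

  /** Call from the button's click handler. */
  async install(): Promise<void> {
    const event = this.deferred;
    if (event === null) return;
    this.deferred = null; // a deferred prompt can only be used once
    await event.prompt();
  }
}
```

In a component, the two `on...` methods would be wired to `window` event listeners and `canInstall` mirrored into component state.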
|
||||
|
||||
### Sidebar Cleanup
|
||||
|
||||
**File:** `frontend/components/layout/Sidebar.tsx`
|
||||
|
||||
- Removed unused icon imports (Home, Library, Sparkles, Book, Mic2)
|
||||
- Navigation items use text-only (no icons) - matching minimalist design
|
||||
|
||||
### Playlists Page Redesign
|
||||
|
||||
**File:** `frontend/app/playlists/page.tsx`
|
||||
|
||||
**Before → After:**
|
||||
|
||||
| Element          | Before                            | After                                  |
| ---------------- | --------------------------------- | -------------------------------------- |
| Header title     | `text-3xl md:text-4xl font-black` | `text-2xl font-bold`                   |
| Header padding   | `px-6 md:px-8 py-6 md:py-8`       | `px-6 pt-6 pb-4`                       |
| Gradient overlay | Yellow gradient at top            | Removed                                |
| Import button    | Green outline with icon           | Solid green `bg-[#1DB954]`, no icon    |
| Hidden toggle    | Icon + text, bordered             | Text only, minimal style               |
| Card wrapper     | `<Card>` component                | Simple `<div>` with `hover:bg-white/5` |
| Card padding     | `p-4` (via Card)                  | `p-3`                                  |
| Play button      | `w-12 h-12`                       | `w-10 h-10`                            |
| Empty state      | `<EmptyState>` with icons         | Simple centered div                    |
| Shared badge     | Purple badge                      | Shown in subtitle instead              |
| Track count      | "tracks"                          | "songs" (matches Spotify)              |
|
||||
|
||||
**Design Philosophy:**
|
||||
|
||||
- Remove decorative icons where text suffices
|
||||
- Reduce spacing for tighter, professional feel
|
||||
- Use native hover states instead of custom components
|
||||
- Minimal color - let content speak
|
||||
- Match Spotify's terminology
|
||||
|
||||
---
|
||||
|
||||
## Spotify-Style Design Patterns
|
||||
|
||||
> **Use these patterns consistently across all pages for a cohesive look.**
|
||||
|
||||
### 1. Hero Sections (Albums, Playlists, Artists)
|
||||
|
||||
```
- Compact height (max ~180px for cover on desktop)
- Content bottom-aligned to the cover art
- Title: text-2xl md:text-3xl font-bold (NOT text-4xl+)
- Subtitle info: text-sm text-gray-400
- Reduced vertical spacing (gap-2 to gap-4 max)
- No decorative gradients overlaying the hero
```
|
||||
|
||||
### 2. Track Listings
|
||||
|
||||
```
- Full-width, no container card wrapping
- Grid layout: [#] [Title/Artist] [Album] [Duration]
- Album column: hidden on mobile (md:grid-cols-[16px_1fr_1fr_60px])
- Hover: row bg-white/5, number → play icon
- Playing indicator: Lidify yellow (#ecb200) on track number
- Compact row height (~56px)
```
|
||||
|
||||
### 3. Page Headers
|
||||
|
||||
```
- Title: text-2xl font-bold (not text-3xl+)
- Subtitle: text-sm text-gray-400
- Actions: rounded-full buttons with minimal icons
- No excessive padding (px-6 py-4 is enough)
```
|
||||
|
||||
### 4. Cards (Albums, Artists, Playlists)
|
||||
|
||||
```
- Compact padding: p-2.5 (not p-4)
- Title: text-sm font-medium truncate
- Subtitle: text-xs text-gray-500
- Play button: bottom-right, shows on hover
```
|
||||
|
||||
### 5. Grids → Carousels
|
||||
|
||||
```
- Use HorizontalCarousel for content rows
- Single horizontal line, scroll/swipe
- Arrow buttons on hover (desktop)
- Snap behavior for smooth scrolling
```
|
||||
|
||||
### 6. General Typography
|
||||
|
||||
```
- Section headers: text-lg font-semibold (not text-xl)
- Greeting (home): text-2xl md:text-3xl font-bold tracking-tight
- No ALL CAPS unless absolutely necessary
- Muted subtitles: text-gray-400 or text-gray-500
```
|
||||
|
||||
### 7. Buttons & Actions
|
||||
|
||||
```
- Primary action: rounded-full, bg-[#ecb200] text-black
- Secondary: bg-white/10 hover:bg-white/20
- Icon-only buttons: rounded-full p-2
- Minimal icon usage - text labels preferred
```
|
||||
|
||||
### 8. Spacing Philosophy
|
||||
|
||||
```
- Tight but breathable
- Section gaps: gap-6 (not gap-8 or gap-10)
- Card grids: gap-4
- Hero to content: pt-6 (not pt-10)
```
|
||||
|
||||
---
|
||||
|
||||
## Post-Implementation Fixes
|
||||
|
||||
| Date | File | Issue | Fix |
|
||||
| ---------- | --------------------------------------------------------------- | ----------------------------------------------------- | -------------------------------------------------------------------------------------- |
|
||||
| 2025-12-15 | `backend/src/routes/notifications.ts` | Wrong import path `../db` | Changed to `../utils/db` |
|
||||
| 2025-12-15 | `frontend/app/playlist/[id]/page.tsx` | React hooks order violation | Moved `useMemo` before early returns |
|
||||
| 2025-12-15 | `frontend/app/playlist/[id]/page.tsx` | `useAuth` not defined | Removed unused `isAuthenticated` |
|
||||
| 2025-12-15 | `frontend/components/layout/ActivityPanel.tsx` | Badge not clearing after clear all | Added `notifications-changed` event listener |
|
||||
| 2025-12-15 | `frontend/components/activity/NotificationsTab.tsx` | Badge not updating | Dispatch `notifications-changed` event on mutations |
|
||||
| 2025-12-15 | `backend/src/services/spotifyImport.ts` | Track matching failing (apostrophes, artist matching) | Added `normalizeApostrophes()`, changed artist match to use `contains` with first word |
|
||||
| 2025-12-15 | `frontend/app/playlists/page.tsx` | Page design not matching Spotify style | Full redesign: compact header, cleaner cards, minimal icons, refined typography |
|
||||
| 2025-12-15 | `frontend/app/import/spotify/page.tsx` | Using Music2 icon instead of Spotify logo | Uses SpotIcon.png, cleaner layout, matches style guide, removed heavy Card components |
|
||||
| 2025-12-15 | `frontend/app/import/spotify/page.tsx` | Grey/transparent gradient not matching brand | Added yellow-to-purple gradient (same as home page) with quick fade ratio (35vh/25vh) |
|
||||
| 2025-12-15 | `frontend/app/discover/page.tsx` | Container width inconsistent with hero | Added `max-w-7xl mx-auto` to track listing section |
|
||||
| 2025-12-15 | `frontend/app/mix/[id]/page.tsx` | Container width inconsistent with hero | Added `max-w-7xl mx-auto` to track listing section |
|
||||
| 2025-12-15 | `frontend/app/playlist/[id]/page.tsx` | Container width inconsistent with hero | Added `max-w-7xl mx-auto` to track listing section |
|
||||
| 2025-12-15 | `frontend/features/discover/components/*` | Discover page not matching playlist/mix design | Redesigned DiscoverHero, DiscoverActionBar, TrackList to match Spotify style |
|
||||
| 2025-12-15 | `frontend/app/library/page.tsx` | Container width not matching other pages | Removed `max-w-7xl mx-auto`, now full-width with `px-4 md:px-8` |
|
||||
| 2025-12-15 | `frontend/features/library/components/LibraryHeader.tsx` | Container width not matching other pages | Removed `max-w-7xl mx-auto`, now full-width with `px-4 md:px-8` |
|
||||
| 2025-12-15 | `frontend/app/podcasts/page.tsx` | Container width + card styling not matching | Removed `max-w-7xl mx-auto`, cleaner cards without borders/gradients |
|
||||
| 2025-12-15 | `frontend/app/audiobooks/page.tsx` | Container width not matching other pages | Removed `max-w-7xl mx-auto`, smaller header text, consistent with Spotify style |
|
||||
| 2025-12-15 | `frontend/app/artist/[id]/page.tsx` | Container width not matching other pages | Removed `max-w-7xl mx-auto`, now full-width with `px-4 md:px-8` |
|
||||
| 2025-12-15 | `frontend/app/album/[id]/page.tsx` | Container width not matching other pages | Removed `max-w-7xl mx-auto`, now full-width with `px-4 md:px-8` |
|
||||
| 2025-12-15 | `frontend/features/artist/components/ArtistHero.tsx` | Hero not matching Spotify style | Compact hero, full-width, bottom-aligned content, kept VibrantJS gradients |
|
||||
| 2025-12-15 | `frontend/features/artist/components/ArtistActionBar.tsx` | Action bar too heavy | Simplified to play button + shuffle + download, matching playlist style |
|
||||
| 2025-12-15 | `frontend/features/artist/components/PopularTracks.tsx` | Track list not matching new style | Removed Card wrapper, grid-based layout, cleaner typography |
|
||||
| 2025-12-15 | `frontend/features/artist/components/Discography.tsx` | Section header too large | Changed header from `text-2xl md:text-3xl` to `text-xl` |
|
||||
| 2025-12-15 | `frontend/features/artist/components/AvailableAlbums.tsx` | Section headers too large | Changed headers to `text-xl font-bold mb-4`, renamed sections |
|
||||
| 2025-12-15 | `frontend/features/artist/components/SimilarArtists.tsx` | Cards not matching new style | Cleaner cards with transparent bg, smaller header |
|
||||
| 2025-12-15 | `frontend/features/artist/components/ArtistBio.tsx` | Using Card component | Replaced Card with simple `bg-white/5` div |
|
||||
| 2025-12-15 | `frontend/features/album/components/AlbumHero.tsx` | Hero not matching Spotify style | Compact hero, full-width, bottom-aligned content, kept VibrantJS gradients |
|
||||
| 2025-12-15 | `frontend/features/album/components/AlbumActionBar.tsx` | Action bar too heavy | Simplified to play + shuffle + add to playlist, matching playlist style |
|
||||
| 2025-12-15 | `frontend/features/album/components/SimilarAlbums.tsx` | Section header too large | Changed header to `text-xl font-bold mb-4` |
|
||||
| 2025-12-15 | `frontend/app/artist/[id]/page.tsx` | Artist bio/about not showing | Now uses `artist.bio \|\| artist.summary` for library artists with `summary` field |
|
||||
| 2025-12-15 | `frontend/features/artist/components/ArtistBio.tsx` | Read more link not brand color | Added `[&_a]:text-[#ecb200]` for Lidify yellow links |
|
||||
| 2025-12-15 | `frontend/app/audiobooks/[id]/page.tsx` | Page design not matching Spotify style | Compact hero, yellow play button, integrated action bar, full-width layout |
|
||||
| 2025-12-15 | `frontend/features/audiobook/components/AudiobookHero.tsx` | Hero too large and dated | Compact Spotify-style hero with bottom-aligned content, VibrantJS gradients preserved |
|
||||
| 2025-12-15 | `frontend/features/audiobook/components/AudiobookActionBar.tsx` | Action bar not matching other pages | Yellow play button, inline progress, subtle action icons |
|
||||
| 2025-12-15 | `frontend/app/podcasts/[id]/page.tsx` | Page design not matching Spotify style | Compact hero, fixed height gradient (25vh), full-width layout |
|
||||
| 2025-12-15 | `frontend/features/podcast/components/PodcastHero.tsx` | Hero too large and dated | Compact Spotify-style hero with bottom-aligned content, VibrantJS gradients preserved |
|
||||
| 2025-12-15 | `frontend/features/podcast/components/PodcastActionBar.tsx` | Action bar too heavy | Yellow subscribe button, subtle RSS link, cleaner remove confirmation |
|
||||
| 2025-12-15 | `frontend/features/podcast/components/ContinueListening.tsx` | Cards not matching new style | Yellow play button, cleaner progress bar, simpler prev/next episodes |
|
||||
| 2025-12-15 | `frontend/features/podcast/components/EpisodeList.tsx` | Episode list not matching new style | Removed Card wrapper, yellow highlights, cleaner typography |
|
||||
| 2025-12-15 | `frontend/features/podcast/components/SimilarPodcasts.tsx` | Cards not matching new style | Transparent bg with hover, smaller header, cleaner layout |
|
||||
| 2025-12-15 | `frontend/features/podcast/components/PreviewEpisodes.tsx` | Cards not matching new style | Removed Card wrappers, yellow subscribe button, cleaner About section |
|
||||
|
||||
---
|
||||
|
||||
## Settings Page Redesign (December 16, 2025)
|
||||
|
||||
### Overview
|
||||
|
||||
Complete redesign of the settings page to match Spotify's clean, minimal aesthetic with:
|
||||
|
||||
- **Sidebar navigation** - Fixed sidebar with section links, active state tracking via intersection observer
|
||||
- **Single scrollable page** - All sections on one page instead of tabs
|
||||
- **Unified Spotify section** - Combined OAuth user connection + Developer API credentials
|
||||
- **Spotify-style design patterns** - Row-based layouts, clean toggles, minimal borders
|
||||
|
||||
### Database Changes
|
||||
|
||||
```prisma
model User {
  // ... existing fields ...

  // NEW: Spotify OAuth connection
  spotifyAccessToken  String?   // Encrypted OAuth access token
  spotifyRefreshToken String?   // Encrypted OAuth refresh token
  spotifyTokenExpiry  DateTime? // When access token expires
  spotifyUserId       String?   // Spotify user ID
  spotifyDisplayName  String?   // Display name from Spotify
}
```
|
||||
|
||||
### New API Endpoints
|
||||
|
||||
| Method | Endpoint                       | Description                         |
| ------ | ------------------------------ | ----------------------------------- |
| GET    | `/api/spotify/auth/url`        | Generate OAuth authorization URL    |
| GET    | `/api/spotify/auth/callback`   | Handle OAuth callback, store tokens |
| POST   | `/api/spotify/auth/disconnect` | Remove user's Spotify connection    |
| GET    | `/api/spotify/auth/status`     | Check if user is connected          |
|
||||
|
||||
### New Frontend Files
|
||||
|
||||
| File | Purpose |
|
||||
| ----------------------------------------------------------------------------- | --------------------------------------- |
|
||||
| `frontend/features/settings/components/ui/SettingsLayout.tsx` | Sidebar + main content wrapper |
|
||||
| `frontend/features/settings/components/ui/SettingsSidebar.tsx` | Navigation sidebar with section links |
|
||||
| `frontend/features/settings/components/ui/SettingsSection.tsx` | Section header with separator |
|
||||
| `frontend/features/settings/components/ui/SettingsRow.tsx` | Label + description left, control right |
|
||||
| `frontend/features/settings/components/ui/SettingsToggle.tsx` | Spotify-style toggle switch |
|
||||
| `frontend/features/settings/components/ui/SettingsSelect.tsx` | Dropdown select |
|
||||
| `frontend/features/settings/components/ui/SettingsInput.tsx` | Text/password input with show/hide |
|
||||
| `frontend/features/settings/components/ui/ConnectionCard.tsx` | OAuth connection card (Spotify) |
|
||||
| `frontend/features/settings/components/ui/index.ts` | Barrel export |
|
||||
| `frontend/features/settings/components/sections/AccountSection.tsx` | Password change + 2FA |
|
||||
| `frontend/features/settings/components/sections/PlaybackSection.tsx` | Streaming quality dropdown |
|
||||
| `frontend/features/settings/components/sections/SpotifyConnectionSection.tsx` | Spotify OAuth connection |
|
||||
| `frontend/features/settings/components/sections/SpotifyAPISection.tsx` | Developer API credentials |
|
||||
| `frontend/features/settings/components/sections/CacheSection.tsx` | Cache sizes + automation toggles |
|
||||
| `frontend/features/settings/hooks/useSpotifyOAuth.ts` | OAuth state management |
|
||||
|
||||
### Modified Frontend Files
|
||||
|
||||
| File | Changes |
|
||||
| -------------------------------------------------------------------------- | ------------------------------------- |
|
||||
| `frontend/app/settings/page.tsx` | Complete redesign with sidebar layout |
|
||||
| `frontend/features/settings/components/sections/LidarrSection.tsx` | Spotify-style row layout |
|
||||
| `frontend/features/settings/components/sections/AudiobookshelfSection.tsx` | Spotify-style row layout |
|
||||
| `frontend/features/settings/components/sections/SoulseekSection.tsx` | Spotify-style row layout |
|
||||
| `frontend/features/settings/components/sections/AIServicesSection.tsx` | Spotify-style row layout |
|
||||
| `frontend/features/settings/components/sections/StoragePathsSection.tsx` | Spotify-style row layout |
|
||||
| `frontend/features/settings/components/sections/UserManagementSection.tsx` | Cleaner design, modal for delete |
|
||||
|
||||
### Modified Backend Files
|
||||
|
||||
| File | Changes |
|
||||
| ------------------------------- | ---------------------------------------- |
|
||||
| `backend/prisma/schema.prisma` | Added Spotify OAuth fields to User model |
|
||||
| `backend/src/routes/spotify.ts` | Added OAuth routes |
|
||||
|
||||
### Deleted Files (Consolidated)
|
||||
|
||||
| File | Reason |
|
||||
| ---------------------------------------------------------------------------- | --------------------------------- |
|
||||
| `frontend/features/settings/components/UserSettingsTab.tsx` | Replaced by unified settings page |
|
||||
| `frontend/features/settings/components/AccountTab.tsx` | Replaced by unified settings page |
|
||||
| `frontend/features/settings/components/SystemSettingsTab.tsx` | Replaced by unified settings page |
|
||||
| `frontend/features/settings/components/sections/ChangePasswordSection.tsx` | Merged into AccountSection |
|
||||
| `frontend/features/settings/components/sections/TwoFactorAuthSection.tsx` | Merged into AccountSection |
|
||||
| `frontend/features/settings/components/sections/PlaybackQualitySection.tsx` | Replaced by PlaybackSection |
|
||||
| `frontend/features/settings/components/sections/AdvancedSettingsSection.tsx` | Replaced by CacheSection |
|
||||
| `frontend/features/settings/components/sections/CacheSettingsSection.tsx` | Replaced by CacheSection |
|
||||
| `frontend/features/settings/components/sections/SpotifySection.tsx` | Split into Connection + API |
|
||||
|
||||
### Settings Sections
|
||||
|
||||
**All Users:** Account, Playback, Connected Services (Spotify OAuth)
|
||||
|
||||
**Admin Only:** Download Services, Media Servers, P2P Networks, AI Services, Spotify API, Storage, Cache & Automation, User Management
|
||||
|
||||
---
|
||||
|
||||
## Home Page Enhancements (Dec 16, 2025)
|
||||
|
||||
### New Features
|
||||
|
||||
1. **Radio Stations Section** - Compact horizontal row at the top of the home page showing random Deezer radio stations
|
||||
2. **Featured Playlists Section** - Grid showing 10 featured Deezer playlists after Popular Artists section
|
||||
|
||||
### New Files Created
|
||||
|
||||
| File | Purpose |
|
||||
| ------------------------------------------------------ | ------------------------------------------- |
|
||||
| `frontend/features/home/components/FeaturedPlaylistsGrid.tsx` | Grid component for featured playlists |
|
||||
| `frontend/features/home/components/RadioStationsGrid.tsx` | Horizontal scroll component for radio stations |
|
||||
|
||||
### Modified Files
|
||||
|
||||
| File | Changes |
|
||||
| ---------------------------------------------------- | ------------------------------------------------ |
|
||||
| `frontend/app/page.tsx` | Added radio stations and featured playlists sections |
|
||||
| `frontend/features/home/hooks/useHomeData.ts` | Added browse data fetching for playlists/radios |
|
||||
| `frontend/hooks/useQueries.ts` | Added browse query keys and hooks |
|
||||
| `backend/src/routes/browse.ts` | Increased featured playlists limit from 50 to 200 |
|
||||
|
||||
---
|
||||
|
||||
## Notification & Sync Button Improvements (Dec 16, 2025)
|
||||
|
||||
### Changes
|
||||
|
||||
1. **Sync Button** - No longer shows toast overlay, turns green with spinning animation while syncing
|
||||
2. **Optimistic Notification Clearing** - Notifications are cleared from UI immediately before API call completes
|
||||
3. **Duplicate Key Fix** - Added context parameter to renderCard in browse page to prevent duplicate key errors
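
The optimistic clearing in item 2 can be sketched as follows: remove the notification from local state first, and keep a snapshot so the list can be restored if the request fails. The names `NotificationItem`, `clearOptimistically`, and `clearNotification` are hypothetical, and the `request` callback stands in for whatever API client `NotificationsTab.tsx` actually uses.

```typescript
// Hypothetical sketch of an optimistic update with rollback.
interface NotificationItem {
  id: string;
  title: string;
  read: boolean;
}

interface OptimisticResult {
  next: NotificationItem[]; // state to render immediately
  rollback: () => NotificationItem[]; // state to restore if the request fails
}

function clearOptimistically(current: NotificationItem[], id: string): OptimisticResult {
  const snapshot = current.slice();
  return {
    next: current.filter((item) => item.id !== id),
    rollback: () => snapshot,
  };
}

async function clearNotification(
  state: { items: NotificationItem[] },
  id: string,
  request: (id: string) => Promise<void>,
): Promise<void> {
  const { next, rollback } = clearOptimistically(state.items, id);
  state.items = next; // the UI updates before the request resolves
  try {
    await request(id);
  } catch {
    state.items = rollback(); // restore the previous list on failure
  }
}
```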
|
||||
|
||||
### Modified Files
|
||||
|
||||
| File | Changes |
|
||||
| -------------------------------------------------------- | ------------------------------------------------ |
|
||||
| `frontend/components/layout/Sidebar.tsx` | Removed toast, added green color while syncing |
|
||||
| `frontend/components/activity/NotificationsTab.tsx` | Implemented optimistic updates for all mutations |
|
||||
| `frontend/app/browse/playlists/page.tsx` | Fixed duplicate key errors with unique keys |
|
||||
|
||||
---
|
||||
|
||||
## Essentia Audio Analysis Integration (Dec 16, 2025)
|
||||
|
||||
### Overview
|
||||
|
||||
Integrated Essentia audio analysis to extract BPM, key, mood, energy, and other audio features from tracks. This enables intelligent mood-based mixes and personalized playlists.
|
||||
|
||||
### Database Changes
|
||||
|
||||
Added to `Track` model in `backend/prisma/schema.prisma`:
|
||||
|
||||
| Field              | Type      | Description                               |
| ------------------ | --------- | ----------------------------------------- |
| `bpm`              | Float?    | Beats per minute                          |
| `beatsCount`       | Int?      | Total beats in track                      |
| `key`              | String?   | Musical key (C, F#, Bb, etc.)             |
| `keyScale`         | String?   | "major" or "minor"                        |
| `keyStrength`      | Float?    | Key detection confidence (0-1)            |
| `energy`           | Float?    | Overall energy (0-1)                      |
| `loudness`         | Float?    | Average loudness in dB                    |
| `dynamicRange`     | Float?    | Dynamic range in dB                       |
| `danceability`     | Float?    | Danceability score (0-1)                  |
| `valence`          | Float?    | Happy (1) to sad (0)                      |
| `arousal`          | Float?    | Energetic (1) to calm (0)                 |
| `instrumentalness` | Float?    | Absence of vocals (0-1, 1 = instrumental) |
| `acousticness`     | Float?    | Acoustic vs electronic (0-1)              |
| `speechiness`      | Float?    | Spoken word content (0-1)                 |
| `moodTags`         | String[]  | ML-classified mood tags                   |
| `essentiaGenres`   | String[]  | ML-classified genres                      |
| `lastfmTags`       | String[]  | User-generated mood tags from Last.fm     |
| `analysisStatus`   | String    | pending/processing/completed/failed       |
| `analysisVersion`  | String?   | Essentia version used                     |
| `analyzedAt`       | DateTime? | When analysis was completed               |
| `analysisError`    | String?   | Error message if failed                   |
|
||||
|
||||
### New Files
|
||||
|
||||
| File | Description |
|
||||
| ------------------------------------------------- | -------------------------------------------------- |
|
||||
| `services/audio-analyzer/Dockerfile` | Python 3.11 + Essentia container |
|
||||
| `services/audio-analyzer/analyzer.py` | Main audio analysis service |
|
||||
| `services/audio-analyzer/requirements.txt` | Python dependencies |
|
||||
| `backend/src/workers/trackEnrichment.ts` | Last.fm tag enrichment worker |
|
||||
| `backend/src/routes/analysis.ts` | API routes for analysis status & triggers |
|
||||
|
||||
### Modified Files
|
||||
|
||||
| File | Changes |
|
||||
| -------------------------------------------------------------- | ----------------------------------------------- |
|
||||
| `backend/prisma/schema.prisma` | Added audio analysis fields to Track model |
|
||||
| `backend/src/workers/index.ts` | Added track enrichment worker startup/shutdown |
|
||||
| `backend/src/workers/queues.ts` | Added `analysisQueue` for audio analysis jobs |
|
||||
| `backend/src/index.ts` | Registered `/api/analysis` routes |
|
||||
| `backend/src/services/programmaticPlaylists.ts` | Added mood-based mix generators |
|
||||
| `backend/src/routes/library.ts` | Added mood-based radio station filtering |
|
||||
| `frontend/features/home/components/LibraryRadioStations.tsx` | Added mood-based radio station buttons |
|
||||
| `docker-compose.yml` | Added `audio-analyzer` service (optional) |
|
||||
|
||||
### New Mix Types (Audio Analysis-Based)
|
||||
|
||||
| Mix Type          | Criteria                                      |
| ----------------- | --------------------------------------------- |
| High Energy       | energy >= 0.7, BPM >= 120                     |
| Late Night        | energy <= 0.4, BPM <= 90, low arousal         |
| Happy Vibes       | valence >= 0.6, energy >= 0.5                 |
| Melancholy        | valence <= 0.4, minor key preferred           |
| Dance Floor       | danceability >= 0.7, BPM 110-140              |
| Acoustic          | acousticness >= 0.6, energy 0.3-0.6           |
| Instrumental      | instrumentalness >= 0.7, energy 0.3-0.6       |
| Road Trip         | tags or energy 0.5-0.8, BPM 100-130           |
| Sunday Morning    | low energy, high acousticness (day-specific)  |
| Monday Motivation | high energy, high valence (day-specific)      |
| Friday Night      | high danceability, high energy (day-specific) |
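
The numeric criteria in the table can be expressed as data and checked with one predicate. The sketch below covers only the rows with explicit thresholds; `TrackFeatures`, `MIX_CRITERIA`, and `matchesMix` are hypothetical names, and treating un-analyzed (`null`) features as non-matching is an assumption. The generators in `programmaticPlaylists.ts` may instead build database queries.

```typescript
// Hypothetical sketch; only rows with explicit numeric thresholds are included.
interface TrackFeatures {
  bpm: number | null;
  energy: number | null;
  valence: number | null;
  danceability: number | null;
  acousticness: number | null;
  instrumentalness: number | null;
}

interface FeatureRange {
  min?: number;
  max?: number;
}

type MixCriteria = Partial<Record<keyof TrackFeatures, FeatureRange>>;

const MIX_CRITERIA: Record<string, MixCriteria> = {
  "High Energy": { energy: { min: 0.7 }, bpm: { min: 120 } },
  "Happy Vibes": { valence: { min: 0.6 }, energy: { min: 0.5 } },
  "Dance Floor": { danceability: { min: 0.7 }, bpm: { min: 110, max: 140 } },
  "Acoustic": { acousticness: { min: 0.6 }, energy: { min: 0.3, max: 0.6 } },
  "Instrumental": { instrumentalness: { min: 0.7 }, energy: { min: 0.3, max: 0.6 } },
};

/** True when every criterion of the named mix is satisfied by the track. */
function matchesMix(track: TrackFeatures, mixName: string): boolean {
  const criteria = MIX_CRITERIA[mixName];
  if (!criteria) return false; // unknown mix
  for (const key of Object.keys(criteria) as (keyof TrackFeatures)[]) {
    const range = criteria[key];
    if (!range) continue;
    const value = track[key];
    if (value === null) return false; // track has not been analyzed for this feature
    if (range.min !== undefined && value < range.min) return false;
    if (range.max !== undefined && value > range.max) return false;
  }
  return true;
}
```

Keeping the thresholds in a table-like constant makes it easy to compare the code against the documented criteria when either changes.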
|
||||
|
||||
### API Endpoints
|
||||
|
||||
| Method | Endpoint                     | Description                          |
| ------ | ---------------------------- | ------------------------------------ |
| GET    | `/api/analysis/status`       | Get analysis progress statistics     |
| POST   | `/api/analysis/start`        | Queue pending tracks for analysis    |
| POST   | `/api/analysis/retry-failed` | Reset failed tracks to pending       |
| POST   | `/api/analysis/analyze/:id`  | Queue specific track for analysis    |
| GET    | `/api/analysis/track/:id`    | Get analysis data for specific track |
| GET    | `/api/analysis/features`     | Get aggregated feature statistics    |
|
||||
|
||||
### Starting the Audio Analyzer
|
||||
|
||||
The audio analyzer is disabled by default. To enable it:
|
||||
|
||||
```bash
docker-compose --profile audio-analysis up -d
```
|
||||
|
||||
Alternatively, start only the analyzer service:

```bash
docker-compose up -d audio-analyzer
```
|
||||
|
||||
---
|
||||
|
||||
## Notification System Fixes (Dec 16, 2025)
|
||||
|
||||
### Issues Fixed
|
||||
|
||||
1. **Toast overlays for cache clearing and sync** - Removed toast.success overlays for "Caches cleared" and "Library scan started" since these should appear in the activity panel notification bar instead.
|
||||
|
||||
2. **Notification badge not clearing immediately** - The `useNotifications` hook wasn't responding to `notifications-changed` events. Fixed by adding an event listener that triggers a refetch.
|
||||
|
||||
3. **Settings page glitchy sidebar** - Replaced IntersectionObserver with scroll-based tracking for smoother sidebar highlighting.
|
||||
|
||||
### Modified Files
|
||||
|
||||
| File | Change |
|------|--------|
| `frontend/hooks/useNotifications.ts` | Added event listener for `notifications-changed` to trigger immediate refetch |
| `frontend/features/settings/components/sections/CacheSection.tsx` | Removed toast.success for cache clearing and sync, added local error state |
| `frontend/components/layout/TopBar.tsx` | Removed toast.success for library scan started |
| `frontend/components/layout/Sidebar.tsx` | Added `notifications-changed` event dispatch after sync |
| `frontend/features/settings/components/ui/SettingsLayout.tsx` | Replaced IntersectionObserver with throttled scroll listener for smoother sidebar tracking |

### Behavior Changes
|
||||
|
||||
- **Sync button**: No longer shows toast overlay - progress appears in activity panel
|
||||
- **Clear caches button**: No longer shows toast overlay - implicit success (button returns to normal state)
|
||||
- **Notification badge**: Now clears immediately via optimistic updates and event system
|
||||
- **Settings sidebar**: Smoother scrolling behavior without jumpy highlights
|
||||
|
||||
---
|
||||
|
||||
## Session 8: Artist Radio Feature
|
||||
|
||||
### New Feature: Artist Radio with Hybrid Similarity Matching
|
||||
|
||||
| File | Change |
|------|--------|
| `backend/src/routes/library.ts` | Added `artist` case to `/library/radio` endpoint with hybrid matching |
| `backend/src/routes/library.ts` | Added artist name filtering to `/library/genres` endpoint |
| `frontend/features/artist/components/ArtistActionBar.tsx` | Added Radio icon button for library artists |
| `frontend/app/artist/[id]/page.tsx` | Added `handleStartRadio` function and passed to ArtistActionBar |
| `frontend/lib/api.ts` | Added `getRadioTracks()` method |

### Artist Radio Logic
|
||||
|
||||
The artist radio uses a **hybrid approach** with vibe boosting:
|
||||
|
||||
1. **Last.fm Similar Artists (filtered to library)**: Primary source. Fetches up to 15 similar artists that exist in the user's library
2. **Genre Matching Fallback**: If fewer than 5 similar artists are found, adds library artists with overlapping genres
3. **Vibe Boost via Audio Analysis**: Scores the similar artists' tracks by how close they are in BPM, energy, valence, and danceability
|
||||
4. **Track Mix**: ~40% from original artist, ~60% from vibe-matched similar artists
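
The ~40/60 track mix in the last step can be sketched as follows. This is an illustrative Python version, not the code in `library.ts`: the function name, the `limit` default of 50, and the assumption that `similar_tracks` is already sorted by vibe score (best first) are all hypothetical.

```python
import random
from typing import List, Optional


def mix_radio_tracks(
    artist_tracks: List[str],
    similar_tracks: List[str],
    limit: int = 50,
    artist_share: float = 0.4,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Build a queue with ~artist_share of tracks from the seed artist.

    similar_tracks is assumed to be pre-sorted by vibe score, so the
    top entries are taken as-is. If the seed artist has too few tracks,
    the remainder is filled from similar_tracks.
    """
    rng = rng or random.Random()
    artist_quota = min(len(artist_tracks), round(limit * artist_share))
    similar_quota = min(len(similar_tracks), limit - artist_quota)
    picked = rng.sample(artist_tracks, artist_quota) + similar_tracks[:similar_quota]
    rng.shuffle(picked)  # interleave the two sources in the final queue
    return picked
```

With 30 seed-artist tracks, 100 similar tracks and `limit=50`, this yields 20 tracks from the seed artist and 30 from similar artists.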
|
||||
|
||||
### Genre Filtering Fix
|
||||
|
||||
Artist names (like "Jamiroquai") were incorrectly showing as genres. Fixed by:
|
||||
- Fetching all artist names at query time
|
||||
- Filtering out any "genre" that matches an artist name (case-insensitive)
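
The filter described above amounts to a case-insensitive set lookup. A minimal Python sketch, with a hypothetical function name and plain string lists standing in for the database query results:

```python
from typing import Iterable, List


def filter_artist_names_from_genres(genres: Iterable[str], artist_names: Iterable[str]) -> List[str]:
    """Drop any 'genre' whose name equals an artist name, ignoring case."""
    artist_keys = {name.strip().casefold() for name in artist_names}
    return [genre for genre in genres if genre.strip().casefold() not in artist_keys]


genres = filter_artist_names_from_genres(["Funk", "jamiroquai", "Acid Jazz"], ["Jamiroquai"])
```

One trade-off of this approach: a genuine genre that happens to share its name with an artist in the library is filtered out as well.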
|
||||
|
||||
### Bug Fix: Artist Radio "Unknown Artist" / No Image
|
||||
|
||||
Fixed two issues with artist radio playback:
|
||||
1. **Frontend**: Removed double-transformation of tracks - backend already returns properly formatted data
|
||||
2. **Backend**: Fixed `coverArt` to use `track.album.coverUrl` directly instead of conditional `lidarrAlbumId` check
|
||||
|
||||
---
|
||||
|
||||
## Session 9: Vibe Match Feature
|
||||
|
||||
### New Feature: "Vibe Match" Button on Media Player
|
||||
|
||||
Allows users to instantly create a queue of tracks that sound like the currently playing track.
|
||||
|
||||
| File | Change |
|------|--------|
| `backend/src/routes/library.ts` | Added `vibe` case to `/library/radio` endpoint with audio feature matching |
| `frontend/components/player/MiniPlayer.tsx` | Added Vibe button (AudioWaveform icon) with loading state |
| `frontend/components/player/FullPlayer.tsx` | Added Vibe button (AudioWaveform icon) with loading state |

### How Vibe Match Works
|
||||
|
||||
1. **Takes current track's audio features** (BPM, energy, valence, danceability, key, mood tags)
2. **Searches entire library** for tracks with similar audio profiles
3. **Scores matches** using weighted algorithm:
   - BPM (25%) - within ±15 BPM is ideal
   - Energy (25%)
   - Valence/mood (20%)
   - Danceability (15%)
   - Key compatibility (10%)
   - Mood tag overlap (5%)
4. **Falls back gracefully** if not enough audio matches:
   - Same artist's other tracks
   - Last.fm similar artists' tracks
   - Same genre tracks
   - Random library tracks
|
||||
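
The weighted scoring step can be sketched roughly as below. The weights come from the list above; everything else is an assumption made for illustration: the BPM falloff curve (full credit within ±15 BPM, linear decay to zero at a 60 BPM difference), exact-match key compatibility, Jaccard overlap for mood tags, and renormalizing over whichever features are present. The real implementation in `library.ts` may differ on all of these.

```python
from typing import Optional

# Weights from the list above; they sum to 1.0.
WEIGHTS = {
    "bpm": 0.25,
    "energy": 0.25,
    "valence": 0.20,
    "danceability": 0.15,
    "key": 0.10,
    "moodTags": 0.05,
}


def closeness(a: Optional[float], b: Optional[float]) -> Optional[float]:
    """1.0 for identical 0-1 features, 0.0 for maximally different, None if missing."""
    if a is None or b is None:
        return None
    return 1.0 - min(1.0, abs(a - b))


def bpm_closeness(a, b, ideal: float = 15.0, cutoff: float = 60.0):
    """Full credit within +/-ideal BPM, then linear falloff to 0 at cutoff (assumed curve)."""
    if a is None or b is None:
        return None
    diff = abs(a - b)
    if diff <= ideal:
        return 1.0
    return max(0.0, 1.0 - (diff - ideal) / (cutoff - ideal))


def key_closeness(source: dict, candidate: dict):
    """Simplification: 1.0 only when key and scale are both identical."""
    if source.get("key") is None or candidate.get("key") is None:
        return None
    same = (source["key"], source.get("keyScale")) == (candidate["key"], candidate.get("keyScale"))
    return 1.0 if same else 0.0


def tag_overlap(source: dict, candidate: dict):
    """Jaccard overlap of mood tags; None when either track has no tags."""
    a = set(source.get("moodTags") or [])
    b = set(candidate.get("moodTags") or [])
    if not a or not b:
        return None
    return len(a & b) / len(a | b)


def vibe_match_score(source: dict, candidate: dict) -> float:
    """Weighted average of per-feature closeness over the features both tracks have."""
    parts = {
        "bpm": bpm_closeness(source.get("bpm"), candidate.get("bpm")),
        "energy": closeness(source.get("energy"), candidate.get("energy")),
        "valence": closeness(source.get("valence"), candidate.get("valence")),
        "danceability": closeness(source.get("danceability"), candidate.get("danceability")),
        "key": key_closeness(source, candidate),
        "moodTags": tag_overlap(source, candidate),
    }
    available = {name: value for name, value in parts.items() if value is not None}
    used_weight = sum(WEIGHTS[name] for name in available)
    if used_weight == 0:
        return 0.0  # nothing comparable: caller should use the fallbacks
    return sum(WEIGHTS[name] * value for name, value in available.items()) / used_weight
```

A score of 0.0 for a candidate with no comparable features is what would push the caller into the fallback chain (same artist, similar artists, genre, random).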
|
||||
### UI Location
|
||||
|
||||
The Vibe button (waveform icon) appears after the Repeat button in both:
|
||||
- MiniPlayer (sidebar player)
|
||||
- FullPlayer (bottom bar player)
|
||||
|
||||
Clicking it replaces the current queue with vibe-matched tracks and shows a toast notification.
|
||||
|
||||
---
|
||||
|
||||
## Session 9 (continued): Search Tracks Fix
|
||||
|
||||
### Bug Fix: Library Tracks Not Showing in Search
|
||||
|
||||
The backend was returning tracks in search results, but the frontend never displayed them.
|
||||
|
||||
| File | Change |
|------|--------|
| `frontend/app/search/page.tsx` | Added import for `LibraryTracksList` and section to display library tracks |
| `frontend/features/search/components/LibraryTracksList.tsx` | **New file** - Component to display library tracks in search results |

### Features of LibraryTracksList
|
||||
|
||||
- Shows up to 10 tracks matching the search query
|
||||
- Displays cover art, title, artist, album, and duration
|
||||
- Click to play (integrates with audio context)
|
||||
- Currently playing track highlighted in yellow
|
||||
- Artist and album names link to their respective pages
|
||||
396
docs/implementation-summaries/vibe-matching-overhaul/README.md
Normal file
@@ -0,0 +1,396 @@
|
||||
# Vibe Matching Algorithm Overhaul Plan
|
||||
|
||||
## Overview
|
||||
|
||||
This document outlines the plan to overhaul the vibe matching algorithm to use **cosine similarity** on a comprehensive feature vector that includes all 9 ML mood predictions, audio features, and genre/tag matching.
|
||||
|
||||
## Current State (Before Overhaul)
|
||||
|
||||
### What We Have
|
||||
- **ML Mood Predictions (9 total):**
  - `moodHappy`, `moodSad`, `moodRelaxed`, `moodAggressive` (existing)
  - `moodParty`, `moodAcoustic`, `moodElectronic` (newly added)
  - `danceabilityMl`, `aggressivenessMl` (existing)

- **Audio Features:**
  - `bpm`, `key`, `keyScale` (major/minor)
  - `energy`, `danceability`, `valence`, `arousal`
  - `instrumentalness`, `acousticness`, `speechiness`

- **Metadata:**
  - `lastfmTags` (JSON array of tag objects with name/count)
  - `essentiaGenres` (JSON array of genre strings)
  - `trackGenres` relation (linked genre records)

### Previous Algorithm (Weighted Manhattan Distance)
|
||||
```typescript
// Old approach - arbitrary weights, limited features
const weights = {
  energy: 1.5,
  danceability: 1.2,
  valence: 1.0,
  arousal: 1.0,
  instrumentalness: 0.8,
  bpm: 0.5,
};

let score = 0;
for (const [feature, weight] of Object.entries(weights)) {
  const diff = Math.abs(sourceTrack[feature] - candidateTrack[feature]);
  score += diff * weight;
}
// Lower score = more similar (inverted logic)
```
|
||||
|
||||
**Problems with old approach:**
|
||||
1. Only used 6 features, ignored all ML mood predictions
|
||||
2. Arbitrary weights with no scientific basis
|
||||
3. Manhattan distance less effective for high-dimensional feature spaces
|
||||
4. No genre/tag matching
|
||||
5. Score inversion was confusing
|
||||
|
||||
---
|
||||
|
||||
## New Algorithm (Cosine Similarity)
|
||||
|
||||
### Phase 1: Database Schema Update ✅
|
||||
Add new mood fields to Prisma schema:
|
||||
|
||||
```prisma
model Track {
  // ... existing fields ...

  // ML Mood Predictions (0.0-1.0)
  moodHappy       Float?
  moodSad         Float?
  moodRelaxed     Float?
  moodAggressive  Float?
  moodParty       Float?  // NEW
  moodAcoustic    Float?  // NEW
  moodElectronic  Float?  // NEW

  // ... rest of schema ...
}
```
|
||||
|
||||
**Migration command:**
|
||||
```bash
|
||||
cd backend
|
||||
npx prisma db push --skip-generate
|
||||
```
|
||||
|
||||
### Phase 2: Audio Analyzer Update ✅
|
||||
Update `services/audio-analyzer/analyzer.py` to extract and save all 7 mood predictions:
|
||||
|
||||
```python
# MusiCNN mood classifiers
mood_models = {
    'moodHappy': 'mood_happy-musicnn-msd-2',
    'moodSad': 'mood_sad-musicnn-msd-2',
    'moodRelaxed': 'mood_relaxed-musicnn-msd-2',
    'moodAggressive': 'mood_aggressive-musicnn-msd-2',
    'moodParty': 'mood_party-musicnn-msd-2',
    'moodAcoustic': 'mood_acoustic-musicnn-msd-2',
    'moodElectronic': 'mood_electronic-musicnn-msd-2',
}

# Save all to database.
# SQL excerpt kept in a string so the snippet stays valid Python;
# the variable name is illustrative and the remaining columns are elided.
update_sql = """
UPDATE "Track" SET
    "moodHappy" = %s,
    "moodSad" = %s,
    "moodRelaxed" = %s,
    "moodAggressive" = %s,
    "moodParty" = %s,
    "moodAcoustic" = %s,
    "moodElectronic" = %s,
    ...
"""
```
|
||||
|
||||
### Phase 3: Feature Vector Construction
|
||||
Build a normalized feature vector for each track:
|
||||
|
||||
```typescript
interface TrackFeatures {
  // ML Moods (0-1)
  moodHappy: number | null;
  moodSad: number | null;
  moodRelaxed: number | null;
  moodAggressive: number | null;
  moodParty: number | null;
  moodAcoustic: number | null;
  moodElectronic: number | null;

  // Audio Features
  energy: number | null;
  arousal: number | null;
  danceability: number | null;
  danceabilityMl: number | null;
  instrumentalness: number | null;
  bpm: number | null;
  keyScale: string | null;

  // Metadata
  lastfmTags: any;
  essentiaGenres: any;
}

function buildFeatureVector(track: TrackFeatures): number[] {
  return [
    // 7 ML Mood predictions (indices 0-6)
    track.moodHappy ?? 0.5,
    track.moodSad ?? 0.5,
    track.moodRelaxed ?? 0.5,
    track.moodAggressive ?? 0.5,
    track.moodParty ?? 0.5,
    track.moodAcoustic ?? 0.5,
    track.moodElectronic ?? 0.5,

    // Core audio features (indices 7-10)
    track.energy ?? 0.5,
    track.arousal ?? 0.5,
    track.danceabilityMl ?? track.danceability ?? 0.5,
    track.instrumentalness ?? 0.5,

    // Normalized BPM (index 11)
    // Maps 60-180 BPM to 0-1 range
    Math.max(0, Math.min(1, ((track.bpm ?? 120) - 60) / 120)),

    // Key mode (index 12)
    // Major = 1, Minor = 0
    track.keyScale === 'major' ? 1 : 0,
  ];
}
```
|
||||
|
||||
**Feature Vector Dimensions: 13**
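
The BPM normalization used for index 11 is worth checking with concrete numbers. The sketch below mirrors the `Math.max(0, Math.min(1, (bpm - 60) / 120))` expression in Python; the function name and keyword parameters are illustrative only.

```python
from typing import Optional


def normalize_bpm(bpm: Optional[float], low: float = 60.0, high: float = 180.0,
                  default: float = 120.0) -> float:
    """Map low..high BPM onto 0..1, clamping anything outside that range."""
    value = default if bpm is None else bpm
    return max(0.0, min(1.0, (value - low) / (high - low)))


examples = {bpm: normalize_bpm(bpm) for bpm in (60, 90, 120, 180, 200)}
```

Two consequences follow directly from the formula: a missing BPM maps to 0.5 (the same as a real 120 BPM track), and tempos outside 60-180 BPM are clamped, so a 180 BPM and a 200 BPM track look identical in this dimension.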
|
||||
|
||||
### Phase 4: Cosine Similarity Calculation
|
||||
|
||||
```typescript
function cosineSimilarity(a: number[], b: number[]): number {
  let dotProduct = 0;
  let magnitudeA = 0;
  let magnitudeB = 0;

  for (let i = 0; i < a.length; i++) {
    dotProduct += a[i] * b[i];
    magnitudeA += a[i] * a[i];
    magnitudeB += b[i] * b[i];
  }

  if (magnitudeA === 0 || magnitudeB === 0) return 0;

  return dotProduct / (Math.sqrt(magnitudeA) * Math.sqrt(magnitudeB));
}
```
|
||||
|
||||
**Properties:**
|
||||
- Returns value between -1 and 1 (for our 0-1 normalized vectors, always 0 to 1)
|
||||
- 1.0 = identical vectors (perfect match)
|
||||
- 0.0 = orthogonal vectors (no similarity)
|
||||
- Higher = better (intuitive, no inversion needed)
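
These properties can be verified with a few small vectors. The snippet below is a Python re-statement of the same formula for illustration (it is not the production code), using made-up two- and three-dimensional vectors rather than real 13-dimensional track features.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """dot(a, b) / (|a| * |b|), returning 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


identical = cosine_similarity([0.9, 0.1, 0.5], [0.9, 0.1, 0.5])  # same vector
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])           # no shared component
scaled = cosine_similarity([0.2, 0.4], [0.4, 0.8])               # same direction, different magnitude
opposite_moods = cosine_similarity([0.9, 0.1], [0.1, 0.9])       # mirrored mood profile
```

The `scaled` case is the one to keep in mind: cosine similarity ignores magnitude, so two vectors pointing the same way score as a perfect match even when every individual feature value differs. The `opposite_moods` case also shows that non-negative vectors never reach 0 unless they share no active dimension at all.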
|
||||
|
||||
### Phase 5: Tag/Genre Bonus
|
||||
|
||||
Add bonus points for matching tags and genres:
|
||||
|
||||
```typescript
function calculateTagBonus(
  sourceTrack: TrackFeatures,
  candidateTrack: TrackFeatures
): number {
  let bonus = 0;

  // Extract tags
  const sourceTags = new Set<string>();
  const candidateTags = new Set<string>();

  // Parse lastfmTags
  if (Array.isArray(sourceTrack.lastfmTags)) {
    sourceTrack.lastfmTags.forEach((t: any) => {
      if (t?.name) sourceTags.add(t.name.toLowerCase());
    });
  }
  if (Array.isArray(candidateTrack.lastfmTags)) {
    candidateTrack.lastfmTags.forEach((t: any) => {
      if (t?.name) candidateTags.add(t.name.toLowerCase());
    });
  }

  // Parse essentiaGenres
  if (Array.isArray(sourceTrack.essentiaGenres)) {
    sourceTrack.essentiaGenres.forEach((g: string) => {
      sourceTags.add(g.toLowerCase());
    });
  }
  if (Array.isArray(candidateTrack.essentiaGenres)) {
    candidateTrack.essentiaGenres.forEach((g: string) => {
      candidateTags.add(g.toLowerCase());
    });
  }

  // Count overlapping tags
  let overlap = 0;
  for (const tag of sourceTags) {
    if (candidateTags.has(tag)) overlap++;
  }

  // Bonus: up to 0.1 (10%) for tag overlap
  // Normalized by the smaller set size to handle varying tag counts
  const minSize = Math.min(sourceTags.size, candidateTags.size);
  if (minSize > 0) {
    bonus = (overlap / minSize) * 0.1;
  }

  return bonus;
}
```
|
||||
|
||||
### Phase 6: Final Score Calculation
|
||||
|
||||
```typescript
function calculateVibeScore(
  sourceTrack: TrackFeatures,
  candidateTrack: TrackFeatures
): number {
  // Build feature vectors
  const sourceVector = buildFeatureVector(sourceTrack);
  const candidateVector = buildFeatureVector(candidateTrack);

  // Calculate cosine similarity (0-1)
  const cosineSim = cosineSimilarity(sourceVector, candidateVector);

  // Add tag bonus (0-0.1)
  const tagBonus = calculateTagBonus(sourceTrack, candidateTrack);

  // Final score: cosine similarity + tag bonus
  // Capped at 1.0
  const finalScore = Math.min(1.0, cosineSim + tagBonus);

  return finalScore;
}
```
|
||||
|
||||
### Phase 7: Integration into Radio Endpoint
|
||||
|
||||
Update `backend/src/routes/library.ts`:
|
||||
|
||||
```typescript
// In the vibe radio section
const sourceTrack = await prisma.track.findUnique({
  where: { id: trackId },
  select: {
    moodHappy: true,
    moodSad: true,
    moodRelaxed: true,
    moodAggressive: true,
    moodParty: true,
    moodAcoustic: true,
    moodElectronic: true,
    energy: true,
    arousal: true,
    danceability: true,
    danceabilityMl: true,
    instrumentalness: true,
    bpm: true,
    keyScale: true,
    lastfmTags: true,
    essentiaGenres: true,
  },
});

// Get candidates
const candidates = await prisma.track.findMany({
  where: {
    id: { not: trackId },
    analysisStatus: 'enhanced', // Only use analyzed tracks
  },
  select: { /* same fields */ },
  take: 500, // Get more candidates for better matching
});

// Score all candidates
const scored = candidates.map(candidate => ({
  ...candidate,
  vibeScore: calculateVibeScore(sourceTrack, candidate),
}));

// Sort by score (highest first)
scored.sort((a, b) => b.vibeScore - a.vibeScore);

// Take top N for the queue
const vibeQueue = scored.slice(0, limit);

// DO NOT SHUFFLE - preserve the sorted order!
```
|
||||
|
||||
---
|
||||
|
||||
## Implementation Checklist
|
||||
|
||||
- [x] **Phase 1:** Add `moodParty`, `moodAcoustic`, `moodElectronic` to Prisma schema
|
||||
- [x] **Phase 2:** Update audio analyzer to extract all 7 moods
|
||||
- [x] **Phase 3:** Implement `buildFeatureVector()` function
|
||||
- [x] **Phase 4:** Implement `cosineSimilarity()` function
|
||||
- [x] **Phase 5:** Implement `calculateTagBonus()` function (named `computeTagBonus` in the actual code)
|
||||
- [x] **Phase 6:** Implement `calculateVibeScore()` combining all components
|
||||
- [x] **Phase 7:** Integrate into `/library/radio` endpoint
|
||||
- [ ] **Phase 8:** Update frontend to display match percentage (optional enhancement)
|
||||
- [ ] **Phase 9:** Re-analyze tracks to populate new mood fields
|
||||
|
||||
---
|
||||
|
||||
## Re-Analysis Script
|
||||
|
||||
To populate the new mood fields for existing tracks:
|
||||
|
||||
```sql
|
||||
-- Reset analysis status for enhanced tracks to re-run analysis
|
||||
UPDATE "Track"
|
||||
SET "analysisStatus" = 'pending'
|
||||
WHERE "analysisStatus" = 'enhanced';
|
||||
```
|
||||
|
||||
Or use the existing script:
|
||||
```bash
|
||||
docker exec lidify_db psql -U lidifydb -d lidify -f /path/to/reset-analysis-for-new-moods.sql
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Expected Improvements
|
||||
|
||||
1. **Better Similarity Matching:** Cosine similarity compares the direction of two feature vectors rather than their absolute distance, and is a widely used measure for high-dimensional feature vectors
|
||||
2. **Full ML Utilization:** All 9 mood predictions now contribute to matching
|
||||
3. **Genre Awareness:** Tag/genre overlap provides meaningful boost
|
||||
4. **Intuitive Scores:** Higher score = better match (no inversion)
|
||||
5. **Normalized Features:** All features scaled to 0-1 for fair comparison
|
||||
|
||||
---
|
||||
|
||||
## Testing Strategy
|
||||
|
||||
1. Pick a track with known characteristics (e.g., happy upbeat pop song)
|
||||
2. Generate vibe queue
|
||||
3. Verify top matches share similar mood profiles
|
||||
4. Check that match percentages in UI reflect actual similarity
|
||||
5. Test with various genres to ensure cross-genre matching works appropriately
|
||||
|
||||
---
|
||||
|
||||
## Files Modified
|
||||
|
||||
- `backend/prisma/schema.prisma` - New mood fields
|
||||
- `backend/src/routes/library.ts` - New scoring algorithm
|
||||
- `services/audio-analyzer/analyzer.py` - Extract all 7 moods
|
||||
- `frontend/components/player/VibeOverlay.tsx` - Display all moods
|
||||
- `frontend/lib/audio-state-context.tsx` - Extended AudioFeatures interface
|
||||
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
- **Gaia:** Essentia has a companion library called Gaia for large-scale similarity search using KD-trees. This is overkill for our scale (< 100k tracks) but could be considered for future scaling.
|
||||
- **MusiCNN Limitations:** The model was trained on MSD (Million Song Dataset) which is pop/rock heavy. For classical/ambient music, predictions may be less reliable. We've added normalization to handle this.
|
||||
- **Shuffle Interaction:** Vibe mode automatically disables shuffle to preserve the sorted order.
|
||||
|
||||
@@ -0,0 +1,571 @@
|
||||
# Vibe Matching Implementation Plan
|
||||
|
||||
## Executive Summary
|
||||
|
||||
The current vibe matching system uses Essentia for audio analysis but only extracts **basic features**. Critical mood/emotion features are either placeholder values or poorly estimated. This document outlines a comprehensive plan to achieve Spotify-quality vibe matching while being conscious of performance on user hardware.
|
||||
|
||||
## Strategy Update (Latest)
|
||||
|
||||
**Default:** Enhanced mode (ML-powered, accurate)
|
||||
**Fallback:** Standard mode (lightweight, for troubleshooting or power saving)
|
||||
|
||||
**Approach:**
|
||||
1. ✅ Pre-package all Essentia TensorFlow models in Docker image (~200MB)
|
||||
2. 🔄 Fix Enhanced mode FIRST - make it actually use the ML models
|
||||
3. ⏳ THEN create Standard mode as a lightweight fallback
|
||||
4. Users can toggle to Standard mode to save CPU if needed
|
||||
|
||||
---
|
||||
|
||||
## Current State Analysis
|
||||
|
||||
### What Essentia IS Currently Extracting (Working)
|
||||
|
||||
| Feature | Status | Quality |
|---------|--------|---------|
| **BPM** | ✅ Working | Good - Uses `RhythmExtractor2013` |
| **Key** | ✅ Working | Good - Uses `KeyExtractor` |
| **KeyScale** | ✅ Working | Good - major/minor detection |
| **Energy** | ✅ Working | Moderate - Raw energy normalized |
| **Loudness** | ✅ Working | Good - dB measurement |
| **Dynamic Range** | ✅ Working | Good |
| **Danceability** | ✅ Working | Good - Uses `Danceability` algorithm |
| **Beats Count** | ✅ Working | Good |

### What's Broken or Placeholder
|
||||
|
||||
| Feature | Status | Problem |
|---------|--------|---------|
| **Valence** | ⚠️ Fake | Calculated as `(major/minor * 0.4) + (energy * 0.6)` - NOT actual emotional valence |
| **Arousal** | ⚠️ Fake | Calculated as `(BPM * 0.5) + (energy * 0.5)` - NOT actual arousal |
| **Instrumentalness** | ❌ Placeholder | Hardcoded to `0.5` |
| **Acousticness** | ⚠️ Estimate | Rough estimate from dynamic range |
| **Speechiness** | ❌ Placeholder | Hardcoded to `0.1` |
| **Mood Tags** | ⚠️ Derived | Generated from fake valence/arousal, not ML |
| **Genre Tags** | ❌ Empty | TensorFlow models not loaded |

### The Core Issue
|
||||
|
||||
```python
# Current valence calculation (analyzer.py lines 226-231)
key_valence = 0.6 if scale == 'major' else 0.4
energy_valence = result['energy']
result['valence'] = round((key_valence * 0.4 + energy_valence * 0.6), 3)
```
|
||||
|
||||
**"Fake Happy" by Paramore** (emotionally complex, about masking sadness):
|
||||
- Major key → 0.6
|
||||
- High energy → ~0.7
|
||||
- Calculated valence: `(0.6 * 0.4) + (0.7 * 0.6) = 0.66` (appears "happy")
|
||||
|
||||
**"Summer Girl" by Jamiroquai** (genuinely upbeat funk):
|
||||
- Major key → 0.6
|
||||
- High energy → ~0.7
|
||||
- Calculated valence: `(0.6 * 0.4) + (0.7 * 0.6) = 0.66` (appears "happy")
|
||||
|
||||
**Result: 97% match despite being completely different vibes!**
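
The arithmetic behind this collision is easy to reproduce. The snippet below wraps the valence formula quoted above in a standalone function (the function name is hypothetical) and feeds it the approximate values from the two examples: major key, energy of about 0.7.

```python
def estimated_valence(scale: str, energy: float) -> float:
    """Heuristic valence: 40% from key mode, 60% from energy."""
    key_valence = 0.6 if scale == 'major' else 0.4
    return round(key_valence * 0.4 + energy * 0.6, 3)


fake_happy = estimated_valence('major', 0.7)    # emotionally complex track
summer_girl = estimated_valence('major', 0.7)   # genuinely upbeat track
minor_variant = estimated_valence('minor', 0.7)
```

Both tracks land on exactly the same value, because the formula only sees key mode and energy. Switching the key to minor moves the result by just 0.08, so energy dominates the estimate and the emotional content of the track never enters the calculation.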
|
||||
|
||||
---
|
||||
|
||||
## How Spotify Does It
|
||||
|
||||
Spotify's audio analysis uses a combination of:
|
||||
|
||||
### 1. Low-Level Audio Features (Similar to what we have)
|
||||
- Tempo/BPM
|
||||
- Key/Mode
|
||||
- Loudness
|
||||
- Time signature
|
||||
|
||||
### 2. Mid-Level Features (We're missing these)
|
||||
- **Spectral Centroid** - "brightness" of the sound
|
||||
- **Spectral Rolloff** - frequency distribution
|
||||
- **Zero Crossing Rate** - percussiveness
|
||||
- **MFCCs** - Mel-frequency cepstral coefficients (timbral texture)
|
||||
- **Chroma Features** - harmonic content
|
||||
|
||||
### 3. High-Level Features (We're faking these)
|
||||
- **Valence** - Musical positiveness (0-1)
|
||||
- **Arousal/Energy** - Intensity and activity
|
||||
- **Instrumentalness** - Vocal presence prediction
|
||||
- **Acousticness** - Acoustic vs electronic
|
||||
- **Speechiness** - Presence of spoken words
|
||||
- **Liveness** - Audience presence detection
|
||||
|
||||
### 4. Deep Learning Models
|
||||
Spotify trains neural networks on millions of labeled tracks to predict:
|
||||
- Mood categories
|
||||
- Genre classification
|
||||
- User preference patterns
|
||||
|
||||
---
|
||||
|
||||
## Two-Tier System
|
||||
|
||||
### Default: Enhanced Vibe Matching (ML-Powered)
|
||||
**Status:** DEFAULT - Pre-packaged in Docker, just works
|
||||
**Target:** High accuracy, ~5-10 seconds per track
|
||||
|
||||
**Features (from Essentia TensorFlow Models):**
|
||||
1. **Mood Predictions (real ML, not estimated):**
   - `mood_happy-discogs-effnet-1.pb` - Happiness/positivity 0-1
   - `mood_sad-discogs-effnet-1.pb` - Sadness 0-1
   - `mood_relaxed-discogs-effnet-1.pb` - Relaxation/calmness 0-1
   - `mood_aggressive-discogs-effnet-1.pb` - Aggression/intensity 0-1

2. **Audio Characteristics:**
   - `danceability-discogs-effnet-1.pb` - ML-based danceability
   - `voice_instrumental-discogs-effnet-1.pb` - Vocal detection (instrumentalness)

3. **Embeddings for Similarity:**
   - `discogs-effnet-bs64-1.pb` - Audio embeddings (neural "fingerprint")
   - Can be used for direct similarity comparison

4. **Spectral Features:**
   - Spectral Centroid (brightness)
   - MFCCs (timbral texture - 13 coefficients)

**Models Pre-packaged:** ~200MB in Docker image (no user download)
|
||||
**RAM Requirement:** ~500MB during analysis
|
||||
**CPU Requirement:** Any modern CPU (2015+)
|
||||
|
||||
### Fallback: Standard Vibe Matching (Lightweight)
|
||||
**Status:** FALLBACK - For troubleshooting or power saving
|
||||
**Target:** Fast, <2 seconds per track, low CPU
|
||||
|
||||
**Features Used:**
|
||||
- BPM (Essentia RhythmExtractor)
|
||||
- Energy (Essentia Energy)
|
||||
- Danceability (Essentia Danceability - non-ML version)
|
||||
- Key/Scale (Essentia KeyExtractor)
|
||||
- Spectral Centroid (cheap to compute)
|
||||
- Last.fm mood tags
|
||||
- Genre matching from tags
|
||||
|
||||
**When to use Standard mode:**
|
||||
- Low-power devices (Raspberry Pi, older NAS)
|
||||
- Troubleshooting if Enhanced mode has issues
|
||||
- User preference to save CPU cycles
|
||||
|
||||
---
|
||||
|
||||
## Implementation Plan
|
||||
|
||||
### Phase 1: Pre-Package Models in Docker (Day 1)
|
||||
|
||||
#### 1.1 Update Dockerfile to Include Models
|
||||
|
||||
```dockerfile
# Download Essentia ML models during build (~200MB)
RUN apt-get update && apt-get install -y --no-install-recommends curl && \
    # Base embedding model (required for all predictions)
    curl -L -o /app/models/discogs-effnet-bs64-1.pb \
        "https://essentia.upf.edu/models/feature-extractors/discogs-effnet/discogs-effnet-bs64-1.pb" && \
    # Mood models
    curl -L -o /app/models/mood_happy-discogs-effnet-1.pb \
        "https://essentia.upf.edu/models/classification-heads/mood_happy/mood_happy-discogs-effnet-1.pb" && \
    curl -L -o /app/models/mood_sad-discogs-effnet-1.pb \
        "https://essentia.upf.edu/models/classification-heads/mood_sad/mood_sad-discogs-effnet-1.pb" && \
    curl -L -o /app/models/mood_relaxed-discogs-effnet-1.pb \
        "https://essentia.upf.edu/models/classification-heads/mood_relaxed/mood_relaxed-discogs-effnet-1.pb" && \
    curl -L -o /app/models/mood_aggressive-discogs-effnet-1.pb \
        "https://essentia.upf.edu/models/classification-heads/mood_aggressive/mood_aggressive-discogs-effnet-1.pb" && \
    # Danceability and voice/instrumental
    curl -L -o /app/models/danceability-discogs-effnet-1.pb \
        "https://essentia.upf.edu/models/classification-heads/danceability/danceability-discogs-effnet-1.pb" && \
    curl -L -o /app/models/voice_instrumental-discogs-effnet-1.pb \
        "https://essentia.upf.edu/models/classification-heads/voice_instrumental/voice_instrumental-discogs-effnet-1.pb" && \
    # Arousal/Valence models
    curl -L -o /app/models/arousal-discogs-effnet-1.pb \
        "https://essentia.upf.edu/models/classification-heads/mood_arousal/mood_arousal-discogs-effnet-1.pb" && \
    curl -L -o /app/models/valence-discogs-effnet-1.pb \
        "https://essentia.upf.edu/models/classification-heads/mood_valence/mood_valence-discogs-effnet-1.pb" && \
    apt-get purge -y curl && rm -rf /var/lib/apt/lists/*
```
|
||||
|
||||
### Phase 2: Implement Enhanced Analysis (Days 2-4)
|
||||
|
||||
#### 2.1 Rewrite analyzer.py with ML Models
|
||||
|
||||
```python
class AudioAnalyzer:
    """Enhanced audio analysis using Essentia TensorFlow models"""

    def __init__(self):
        self.models_loaded = False
        self.embedding_model = None
        self.mood_models = {}

        if ESSENTIA_AVAILABLE:
            self._init_essentia()
            self._load_ml_models()

    def _load_ml_models(self):
        """Load TensorFlow models for enhanced analysis"""
        try:
            from essentia.standard import (
                TensorflowPredictEffnetDiscogs,
                TensorflowPredict2D
            )

            # Load embedding extractor (base for all predictions)
            embedding_path = '/app/models/discogs-effnet-bs64-1.pb'
            if os.path.exists(embedding_path):
                self.embedding_model = TensorflowPredictEffnetDiscogs(
                    graphFilename=embedding_path,
                    output="PartitionedCall:1"
                )
                logger.info("Loaded embedding model")

            # Load mood prediction models
            mood_models = {
                'happy': '/app/models/mood_happy-discogs-effnet-1.pb',
                'sad': '/app/models/mood_sad-discogs-effnet-1.pb',
                'relaxed': '/app/models/mood_relaxed-discogs-effnet-1.pb',
                'aggressive': '/app/models/mood_aggressive-discogs-effnet-1.pb',
                'danceability': '/app/models/danceability-discogs-effnet-1.pb',
                'voice_instrumental': '/app/models/voice_instrumental-discogs-effnet-1.pb',
                'arousal': '/app/models/arousal-discogs-effnet-1.pb',
                'valence': '/app/models/valence-discogs-effnet-1.pb',
            }

            for name, path in mood_models.items():
                if os.path.exists(path):
                    self.mood_models[name] = TensorflowPredict2D(
                        graphFilename=path,
                        output="model/Softmax"
                    )
                    logger.info(f"Loaded {name} model")

            # Every classification head consumes the embeddings, so enhanced mode
            # needs the embedding model as well as at least one head.
            self.models_loaded = (
                self.embedding_model is not None and len(self.mood_models) > 0
            )
            logger.info(f"ML models loaded: {self.models_loaded} ({len(self.mood_models)} models)")

        except Exception as e:
            logger.warning(f"Could not load ML models: {e}")
            self.models_loaded = False

    def analyze(self, file_path: str) -> Dict[str, Any]:
        """Full analysis with ML models if available"""
        result = self._extract_basic_features(file_path)

        if self.models_loaded:
            ml_features = self._extract_ml_features(file_path)
            result.update(ml_features)
            result['analysisMode'] = 'enhanced'
        else:
            # Fallback to estimated values
            result.update(self._estimate_mood_features(result))
            result['analysisMode'] = 'standard'

        return result

    def _extract_ml_features(self, file_path: str) -> Dict[str, Any]:
        """Extract features using TensorFlow models"""
        result = {}

        # Load audio at 16kHz for ML models
        audio = self.load_audio(file_path, sample_rate=16000)
        if audio is None:
            return result

        # Get embeddings
        embeddings = self.embedding_model(audio)

        # Mood predictions
        if 'happy' in self.mood_models:
            preds = self.mood_models['happy'](embeddings)
            result['moodHappy'] = float(np.mean(preds[:, 1]))  # Probability of "happy"

        if 'sad' in self.mood_models:
            preds = self.mood_models['sad'](embeddings)
            result['moodSad'] = float(np.mean(preds[:, 1]))

        if 'relaxed' in self.mood_models:
            preds = self.mood_models['relaxed'](embeddings)
            result['moodRelaxed'] = float(np.mean(preds[:, 1]))

        if 'aggressive' in self.mood_models:
            preds = self.mood_models['aggressive'](embeddings)
            result['moodAggressive'] = float(np.mean(preds[:, 1]))

        # Real valence and arousal from dedicated models
        if 'valence' in self.mood_models:
            preds = self.mood_models['valence'](embeddings)
            result['valence'] = float(np.mean(preds[:, 1]))

        if 'arousal' in self.mood_models:
            preds = self.mood_models['arousal'](embeddings)
            result['arousal'] = float(np.mean(preds[:, 1]))

        # Instrumentalness from voice/instrumental model
        if 'voice_instrumental' in self.mood_models:
            preds = self.mood_models['voice_instrumental'](embeddings)
            result['instrumentalness'] = float(np.mean(preds[:, 1]))  # 1 = instrumental

        # ML-based danceability
        if 'danceability' in self.mood_models:
            preds = self.mood_models['danceability'](embeddings)
            result['danceabilityMl'] = float(np.mean(preds[:, 1]))

        return result
```

### Phase 3: Update Database Schema (Day 3)

#### 3.1 Add New Feature Columns

```prisma
model Track {
  // ... existing fields ...

  // ML-based mood predictions (Enhanced mode)
  moodHappy      Float?  // ML prediction 0-1
  moodSad        Float?  // ML prediction 0-1
  moodRelaxed    Float?  // ML prediction 0-1
  moodAggressive Float?  // ML prediction 0-1
  danceabilityMl Float?  // ML-based danceability

  // Analysis metadata
  analysisMode   String? // 'standard' or 'enhanced'
}
```

### Phase 4: Update Vibe Matching Algorithm (Day 4)

#### 4.1 Use Real Mood Predictions in Matching

```typescript
// In library.ts - Enhanced vibe matching
// Note: `!= null` is used so that both null and undefined values are skipped.
const scored = analyzedTracks.map(t => {
  let score = 0;
  let factors = 0;

  // === MOOD MATCHING (50% total - the heart of vibe) ===

  // Happy mood (15%)
  if (sourceTrack.moodHappy != null && t.moodHappy != null) {
    score += (1 - Math.abs(sourceTrack.moodHappy - t.moodHappy)) * 0.15;
    factors += 0.15;
  }

  // Sad mood (10%)
  if (sourceTrack.moodSad != null && t.moodSad != null) {
    score += (1 - Math.abs(sourceTrack.moodSad - t.moodSad)) * 0.10;
    factors += 0.10;
  }

  // Relaxed mood (10%)
  if (sourceTrack.moodRelaxed != null && t.moodRelaxed != null) {
    score += (1 - Math.abs(sourceTrack.moodRelaxed - t.moodRelaxed)) * 0.10;
    factors += 0.10;
  }

  // Aggressive mood (10%)
  if (sourceTrack.moodAggressive != null && t.moodAggressive != null) {
    score += (1 - Math.abs(sourceTrack.moodAggressive - t.moodAggressive)) * 0.10;
    factors += 0.10;
  }

  // Valence - overall positivity (5%)
  if (sourceTrack.valence != null && t.valence != null) {
    score += (1 - Math.abs(sourceTrack.valence - t.valence)) * 0.05;
    factors += 0.05;
  }

  // === AUDIO CHARACTERISTICS (35% total) ===

  // BPM (15%) - linear falloff: a 15 BPM gap scores 0.5, 30 BPM or more scores 0
  if (sourceTrack.bpm && t.bpm) {
    const bpmDiff = Math.abs(sourceTrack.bpm - t.bpm);
    score += Math.max(0, 1 - bpmDiff / 30) * 0.15;
    factors += 0.15;
  }

  // Energy (10%)
  if (sourceTrack.energy != null && t.energy != null) {
    score += (1 - Math.abs(sourceTrack.energy - t.energy)) * 0.10;
    factors += 0.10;
  }

  // Danceability - prefer ML version (10%)
  const srcDance = sourceTrack.danceabilityMl ?? sourceTrack.danceability;
  const tDance = t.danceabilityMl ?? t.danceability;
  if (srcDance != null && tDance != null) {
    score += (1 - Math.abs(srcDance - tDance)) * 0.10;
    factors += 0.10;
  }

  // === GENRE/TAGS (15% total) ===

  // Genre/tag overlap (10%)
  const sourceGenres = [...(sourceTrack.lastfmTags || []), ...(sourceTrack.essentiaGenres || [])];
  const trackGenres = [...(t.lastfmTags || []), ...(t.essentiaGenres || [])];
  if (sourceGenres.length > 0 && trackGenres.length > 0) {
    const overlap = sourceGenres.filter(g => trackGenres.includes(g)).length;
    const maxOverlap = Math.max(sourceGenres.length, trackGenres.length);
    score += (overlap / maxOverlap) * 0.10;
    factors += 0.10;
  }

  // Key compatibility (5%)
  if (sourceTrack.keyScale && t.keyScale) {
    score += (sourceTrack.keyScale === t.keyScale ? 1 : 0.5) * 0.05;
    factors += 0.05;
  }

  // Normalise by the weight actually used, so missing features do not lower the score
  const finalScore = factors > 0 ? score / factors : 0;
  return { id: t.id, score: finalScore };
});
```
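
The final `score / factors` step renormalises by the weight that was actually used, so a feature missing on either track is skipped instead of counting as a mismatch. A minimal sketch of that pattern as a standalone helper follows; `FeaturePair` and `weightedSimilarity` are hypothetical names for illustration and are not taken from `library.ts`.

```typescript
// Hypothetical helper; the names here are illustrative, not part of library.ts.
type FeaturePair = {
  a: number | null | undefined; // value on the source track (expected range 0-1)
  b: number | null | undefined; // value on the candidate track (expected range 0-1)
  weight: number;               // share of the total score
};

function weightedSimilarity(features: FeaturePair[]): number {
  let score = 0;
  let factors = 0;
  for (const { a, b, weight } of features) {
    // `== null` matches both null and undefined, so absent values are skipped.
    if (a == null || b == null) continue;
    score += (1 - Math.abs(a - b)) * weight;
    factors += weight;
  }
  // Renormalise by the weight actually used.
  return factors > 0 ? score / factors : 0;
}

const example = weightedSimilarity([
  { a: 0.8, b: 0.6, weight: 0.15 },  // e.g. moodHappy
  { a: 0.2, b: null, weight: 0.10 }, // e.g. moodSad, missing on one track
  { a: 0.5, b: 0.5, weight: 0.10 },  // e.g. energy
]);
```

In this example the pair with a missing value is skipped, so the result is 0.22 / 0.25 = 0.88 rather than 0.22 / 0.35.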

### Phase 5: Create Standard Mode Fallback (Day 5)

After Enhanced mode is working, implement Standard mode:
- Same algorithm structure but skip ML features
- Use estimated valence (improved heuristics)
- Lower weights on mood matching since it's estimated
- Higher weights on BPM, energy, genre tags
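
One way to express this reweighting is a per-mode weight table. In the sketch below, the Enhanced weights are the ones used in the Phase 4 matching code; the Standard weights are placeholder values chosen only for illustration (the plan does not specify them) and would need tuning. `VIBE_WEIGHTS`, `FeatureKey`, and `totalWeight` are hypothetical names.

```typescript
type AnalysisMode = 'enhanced' | 'standard';
type FeatureKey =
  | 'moodHappy' | 'moodSad' | 'moodRelaxed' | 'moodAggressive'
  | 'valence' | 'bpm' | 'energy' | 'danceability' | 'genre' | 'key';

const VIBE_WEIGHTS: Record<AnalysisMode, Record<FeatureKey, number>> = {
  // Enhanced: weights from the Phase 4 code (50% mood, 35% audio, 15% genre/key).
  enhanced: {
    moodHappy: 0.15, moodSad: 0.10, moodRelaxed: 0.10, moodAggressive: 0.10,
    valence: 0.05, bpm: 0.15, energy: 0.10, danceability: 0.10, genre: 0.10, key: 0.05,
  },
  // Standard: ASSUMED placeholder values, not taken from the plan.
  // ML moods are unavailable (weight 0), estimated valence keeps a small
  // weight, and BPM, energy, and genre tags carry more of the score.
  standard: {
    moodHappy: 0, moodSad: 0, moodRelaxed: 0, moodAggressive: 0,
    valence: 0.05, bpm: 0.30, energy: 0.20, danceability: 0.15, genre: 0.20, key: 0.10,
  },
};

// Sum of all weights for a mode; each table is meant to total 1.0.
function totalWeight(mode: AnalysisMode): number {
  const weights = VIBE_WEIGHTS[mode];
  let sum = 0;
  for (const key in weights) {
    sum += weights[key as FeatureKey];
  }
  return sum;
}
```

Keeping both tables summing to 1.0 makes scores from the two modes roughly comparable, although the existing `score / factors` normalisation would also tolerate tables that do not.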

### Phase 6: Settings & UI (Day 6)

#### 6.1 Add Settings Toggle

```typescript
// System settings - Enhanced is DEFAULT (interface name is illustrative)
interface SystemSettings {
  audioAnalysis: {
    vibeMatchingMode: 'enhanced' | 'standard'; // Default: 'enhanced'
    reanalyzeOnModeChange: boolean;            // Default: false
  };
}
```
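
A minimal sketch of how these settings might be given defaults, assuming stored settings can be partial (for example, from an installation that predates the toggle). `AudioAnalysisSettings`, `DEFAULT_AUDIO_ANALYSIS`, and `resolveAudioAnalysis` are hypothetical names, not existing code.

```typescript
type VibeMatchingMode = 'enhanced' | 'standard';

interface AudioAnalysisSettings {
  vibeMatchingMode: VibeMatchingMode;
  reanalyzeOnModeChange: boolean;
}

// Defaults as stated in the plan: Enhanced mode, no automatic re-analysis.
const DEFAULT_AUDIO_ANALYSIS: AudioAnalysisSettings = {
  vibeMatchingMode: 'enhanced',
  reanalyzeOnModeChange: false,
};

// Fall back to the default for any field that is missing, null, or undefined.
function resolveAudioAnalysis(
  stored: Partial<AudioAnalysisSettings> = {},
): AudioAnalysisSettings {
  return {
    vibeMatchingMode: stored.vibeMatchingMode ?? DEFAULT_AUDIO_ANALYSIS.vibeMatchingMode,
    reanalyzeOnModeChange:
      stored.reanalyzeOnModeChange ?? DEFAULT_AUDIO_ANALYSIS.reanalyzeOnModeChange,
  };
}
```

With this shape, `resolveAudioAnalysis()` yields Enhanced mode, and a user who only switched the mode to `'standard'` still gets `reanalyzeOnModeChange: false`.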

#### 6.2 Settings UI

```
Audio Analysis
├── Vibe Matching Mode
│   ├── ● Enhanced (Recommended - Default)
│   │   └── Uses ML models for accurate mood detection
│   └── ○ Standard (Power Saver)
│       └── Faster, uses basic audio features only
│
├── Analysis Status
│   └── "1,234 / 1,500 tracks analyzed (Enhanced mode)"
│
└── [Re-analyze Library] button
    └── "Re-analyze all tracks with current settings"
```

### Phase 7: Testing & Validation (Day 7)

#### 7.1 Test Cases

| Source Track | Bad Match (Current) | Expected Good Match |
|--------------|---------------------|---------------------|
| "Fake Happy" (Paramore) | "Summer Girl" (Jamiroquai) 97% | Other emo/pop-punk <60% |
| "Creep" (Radiohead) | Fast dance track | Other melancholic rock |
| "Uptown Funk" | Slow ballad | Other high-energy funk/pop |

#### 7.2 Performance Testing
- Analyze 100 tracks, measure time
- Memory usage during analysis
- Queue handling under load

---

## Database Schema Updates

```prisma
model Track {
  // ... existing fields ...

  // ML-based mood predictions (Enhanced mode)
  moodHappy      Float?  // ML prediction 0-1
  moodSad        Float?  // ML prediction 0-1
  moodRelaxed    Float?  // ML prediction 0-1
  moodAggressive Float?  // ML prediction 0-1
  danceabilityMl Float?  // ML-based danceability

  // Analysis metadata
  analysisMode   String? // 'standard' or 'enhanced'
}
```

---

## Performance Benchmarks (Estimated)

| Operation | Standard Mode | Enhanced Mode |
|-----------|---------------|---------------|
| Analysis per track | 1-2 sec | 5-10 sec |
| RAM usage | ~100MB | ~500MB |
| Models in Docker | N/A | ~200MB (pre-packaged) |
| Vibe match query | <100ms | <100ms |
| Full library (1000 tracks) | ~30 min | ~2-3 hours |
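
The full-library rows follow roughly from the per-track estimates. A small sketch of that arithmetic, assuming tracks are analyzed one at a time with no parallelism or queue overhead (`estimateLibraryMinutes` is a hypothetical helper):

```typescript
// Estimated wall-clock minutes to analyze a library sequentially.
function estimateLibraryMinutes(trackCount: number, secondsPerTrack: number): number {
  return (trackCount * secondsPerTrack) / 60;
}

const tracks = 1000;

// Standard mode: 1-2 seconds per track.
const standardRange = [estimateLibraryMinutes(tracks, 1), estimateLibraryMinutes(tracks, 2)];

// Enhanced mode: 5-10 seconds per track.
const enhancedRange = [estimateLibraryMinutes(tracks, 5), estimateLibraryMinutes(tracks, 10)];
```

For 1000 tracks this gives about 17-33 minutes in Standard mode and about 83-167 minutes (roughly 1.4-2.8 hours) in Enhanced mode, in the same range as the table's rounded figures.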

---

## Files to Modify

| File | Changes |
|------|---------|
| `services/audio-analyzer/Dockerfile` | Add model downloads during build |
| `services/audio-analyzer/analyzer.py` | Implement ML model loading and prediction |
| `backend/prisma/schema.prisma` | Add mood prediction columns |
| `backend/src/routes/library.ts` | Update vibe matching algorithm weights |
| `frontend/features/settings/` | Add analysis mode toggle (default: enhanced) |
| `frontend/components/player/VibeGraph.tsx` | Display mood predictions |

---

## Success Metrics

After implementation, "Fake Happy" and "Summer Girl" should:
- Match at **<50%** (different emotional content, different genre)

Better matches for "Fake Happy" would be:
- Other Paramore songs (same artist = genre/production match)
- Emo/pop-punk with similar emotional complexity
- Songs with high energy but mixed emotional signals

---

## Implementation Order (Enhanced First)

### Week 1: Get Enhanced Mode Working
1. [x] Create implementation plan (this document)
2. [x] Update Dockerfile to pre-package ML models (~200MB)
3. [x] Rewrite analyzer.py with TensorFlow model loading
4. [x] Add new database columns for mood predictions (moodHappy, moodSad, etc.)
5. [x] Update vibe matching algorithm with ML mood weights
6. [x] Update programmatic playlists to use ML mood predictions
7. [ ] Run Prisma migration to apply schema changes
8. [ ] Rebuild audio-analyzer Docker container
9. [ ] Test ML analysis on sample tracks

### Week 2: Polish & Fallback
10. [ ] Test accuracy with diverse track pairs
11. [ ] Add settings UI (Enhanced = default)
12. [ ] Implement Standard mode as explicit fallback option
13. [ ] Update VibeGraph to show mood predictions
14. [ ] Documentation and testing

---

## Quick Reference: Models to Include

| Model | File | Purpose | Size |
|-------|------|---------|------|
| Embeddings | `discogs-effnet-bs64-1.pb` | Base model for all predictions | ~85MB |
| Happy | `mood_happy-discogs-effnet-1.pb` | Happiness detection | ~15MB |
| Sad | `mood_sad-discogs-effnet-1.pb` | Sadness detection | ~15MB |
| Relaxed | `mood_relaxed-discogs-effnet-1.pb` | Relaxation detection | ~15MB |
| Aggressive | `mood_aggressive-discogs-effnet-1.pb` | Aggression detection | ~15MB |
| Arousal | `mood_arousal-discogs-effnet-1.pb` | Energy/calm scale | ~15MB |
| Valence | `mood_valence-discogs-effnet-1.pb` | Positive/negative | ~15MB |
| Danceability | `danceability-discogs-effnet-1.pb` | ML danceability | ~15MB |
| Voice/Instrumental | `voice_instrumental-discogs-effnet-1.pb` | Vocal detection | ~15MB |

**Total:** ~200MB (one-time addition to Docker image)