# Lidify Vibe System Documentation

This document provides comprehensive documentation of the Vibe System - how Lidify analyzes tracks, collects audio metrics, and compares them for vibe matching. Use this as a reference for building frontend interfaces.

---

## Table of Contents

1. [Overview](#overview)
2. [Metrics Collected](#metrics-collected)
3. [Data Structures](#data-structures)
4. [Vibe Matching Algorithm](#vibe-matching-algorithm)
5. [API Endpoints](#api-endpoints)
6. [Frontend Integration Guide](#frontend-integration-guide)
7. [Existing Components Reference](#existing-components-reference)

---

## Overview

The Vibe System uses a combination of **audio signal analysis** and **ML-based mood prediction** to understand the "feel" of a track. It operates in two modes:

| Mode | Description | Accuracy |
|------|-------------|----------|
| **Standard** | Heuristic-based analysis using audio signal features (BPM, key, energy) | Good |
| **Enhanced** | ML-based analysis using MusiCNN neural network for mood prediction | Best |

The system enables:

- Finding tracks with similar vibes to a source track
- Generating mood-based playlists
- Visualizing track characteristics in real-time

---

## Metrics Collected

### Core Audio Features (Always Available)

These are extracted directly from audio signal analysis at 44.1kHz:

| Metric | Type | Range | Description |
|--------|------|-------|-------------|
| `bpm` | Float | 60-200 | Tempo in beats per minute |
| `beatsCount` | Int | 0+ | Total number of beats detected |
| `key` | String | "C", "F#", etc. | Musical key |
| `keyScale` | String | "major" \| "minor" | Major or minor tonality |
| `keyStrength` | Float | 0-1 | Confidence of key detection |
| `energy` | Float | 0-1 | RMS-based intensity level |
| `loudness` | Float | dB | Average loudness |
| `dynamicRange` | Float | dB | Difference between quietest and loudest |
| `danceability` | Float | 0-1 | Rhythm regularity and groove potential |

### ML Mood Predictions (Enhanced Mode)

Seven core mood dimensions predicted by the MusiCNN model:

| Metric | Type | Range | Description | Icon Suggestion |
|--------|------|-------|-------------|-----------------|
| `moodHappy` | Float | 0-1 | Happiness/cheerfulness probability | Smile |
| `moodSad` | Float | 0-1 | Sadness/melancholy probability | Frown |
| `moodRelaxed` | Float | 0-1 | Calm/peaceful probability | Coffee |
| `moodAggressive` | Float | 0-1 | Intensity/aggression probability | Flame |
| `moodParty` | Float | 0-1 | Upbeat/party probability | PartyPopper |
| `moodAcoustic` | Float | 0-1 | Acoustic instrumentation probability | Guitar |
| `moodElectronic` | Float | 0-1 | Electronic/synthetic probability | Radio |

### Derived Features (Computed)

These are calculated from the ML predictions:

#### Valence (Emotional Positivity)

```typescript
// Formula:
valence = (
  moodHappy * 0.5 +      // Happy mood (50% weight)
  moodParty * 0.3 +      // Party mood (30% weight)
  (1 - moodSad) * 0.2    // Inverse of sadness (20% weight)
)
```

| Value | Interpretation |
|-------|----------------|
| 0.0 - 0.3 | Melancholic, sad |
| 0.3 - 0.6 | Neutral, balanced |
| 0.6 - 1.0 | Happy, positive |

#### Arousal (Energy/Excitement Level)

```typescript
// Formula:
arousal = (
  moodAggressive * 0.35 +   // Aggressive mood (35% weight)
  moodParty * 0.25 +        // Party mood (25% weight)
  moodElectronic * 0.2 +    // Electronic sound (20% weight)
  (1 - moodRelaxed) * 0.1 + // Inverse of relaxation (10% weight)
  (1 - moodAcoustic) * 0.1  // Inverse of acoustic (10% weight)
)
```

| Value | Interpretation |
|-------|----------------|
| 0.0 - 0.3 | Calm, peaceful |
| 0.3 - 0.6 | Moderate energy |
| 0.6 - 1.0 | High energy, intense |
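
The two derived-feature formulas above can be wrapped in small helpers. This is a sketch rather than the backend's actual code; it assumes missing moods fall back to the neutral 0.5, matching the fallback behavior used elsewhere in this document.

```typescript
// Sketch of the valence/arousal formulas above.
// `Moods` is a hypothetical partial shape; field names match the metrics tables.
type Moods = {
  moodHappy?: number | null;
  moodSad?: number | null;
  moodRelaxed?: number | null;
  moodAggressive?: number | null;
  moodParty?: number | null;
  moodAcoustic?: number | null;
  moodElectronic?: number | null;
};

// Missing predictions default to the neutral 0.5.
const mood = (v: number | null | undefined): number => v ?? 0.5;

// valence = happy*0.5 + party*0.3 + (1 - sad)*0.2
const computeValence = (m: Moods): number =>
  mood(m.moodHappy) * 0.5 + mood(m.moodParty) * 0.3 + (1 - mood(m.moodSad)) * 0.2;

// arousal = aggressive*0.35 + party*0.25 + electronic*0.2
//         + (1 - relaxed)*0.1 + (1 - acoustic)*0.1
const computeArousal = (m: Moods): number =>
  mood(m.moodAggressive) * 0.35 +
  mood(m.moodParty) * 0.25 +
  mood(m.moodElectronic) * 0.2 +
  (1 - mood(m.moodRelaxed)) * 0.1 +
  (1 - mood(m.moodAcoustic)) * 0.1;
```

With every mood at the neutral 0.5, both helpers return 0.5, which lines up with the neutral column in the Quick Reference table at the end of this document.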

### Additional Features

| Metric | Type | Range | Description |
|--------|------|-------|-------------|
| `instrumentalness` | Float | 0-1 | Voice presence (0=vocal, 1=instrumental) |
| `acousticness` | Float | 0-1 | Acoustic vs. processed sound |
| `speechiness` | Float | 0-1 | Spoken word detection |
| `danceabilityMl` | Float | 0-1 | ML-based danceability (more accurate) |

### Metadata & Tags

| Field | Type | Description |
|-------|------|-------------|
| `moodTags` | String[] | Derived mood labels (e.g., ["chill", "happy"]) |
| `essentiaGenres` | String[] | ML-predicted genres (e.g., ["rock", "electronic"]) |
| `lastfmTags` | String[] | User-generated tags from Last.fm |
| `analysisStatus` | String | "pending" \| "processing" \| "completed" \| "failed" |
| `analysisMode` | String | "standard" \| "enhanced" |
| `analyzedAt` | DateTime | When analysis was performed |

---

## Data Structures

### TypeScript Interface

```typescript
interface AudioFeatures {
  // Core audio features
  bpm?: number | null;
  beatsCount?: number | null;
  key?: string | null;
  keyScale?: string | null;
  keyStrength?: number | null;
  energy?: number | null;
  loudness?: number | null;
  dynamicRange?: number | null;
  danceability?: number | null;

  // Derived features
  valence?: number | null;
  arousal?: number | null;

  // Additional features
  instrumentalness?: number | null;
  acousticness?: number | null;
  speechiness?: number | null;
  danceabilityMl?: number | null;

  // ML mood predictions (Enhanced mode)
  moodHappy?: number | null;
  moodSad?: number | null;
  moodRelaxed?: number | null;
  moodAggressive?: number | null;
  moodParty?: number | null;
  moodAcoustic?: number | null;
  moodElectronic?: number | null;

  // Metadata
  analysisStatus?: string | null;
  analysisMode?: string | null;
  analyzedAt?: string | null;

  // Tags
  moodTags?: string[];
  essentiaGenres?: string[];
  lastfmTags?: string[];
}
```
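
UI code often needs to decide whether the ML mood data can be shown at all. A hypothetical convenience helper (not part of the interface above) that checks for enhanced-mode data:

```typescript
// Hypothetical UI helper: true when the track was analyzed in enhanced mode
// and the core ML mood predictions are actually populated.
type MaybeEnhanced = {
  analysisMode?: string | null;
  moodHappy?: number | null;
  moodSad?: number | null;
  moodRelaxed?: number | null;
  moodAggressive?: number | null;
};

const hasEnhancedMoods = (t: MaybeEnhanced): boolean =>
  t.analysisMode === "enhanced" &&
  [t.moodHappy, t.moodSad, t.moodRelaxed, t.moodAggressive].every(
    (v) => typeof v === "number"
  );
```

A component can then fall back to the core audio features when this returns false.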

### Feature Display Configuration

Recommended configuration for displaying features in UI:

```typescript
const FEATURE_CONFIG = [
  {
    key: "energy",
    label: "Energy",
    icon: "Zap", // lucide-react icon
    min: 0,
    max: 1,
    lowLabel: "Calm",
    highLabel: "Intense",
  },
  {
    key: "valence",
    label: "Mood",
    icon: "Heart",
    min: 0,
    max: 1,
    lowLabel: "Melancholic",
    highLabel: "Happy",
  },
  {
    key: "danceability",
    label: "Groove",
    icon: "Footprints",
    min: 0,
    max: 1,
    lowLabel: "Freeform",
    highLabel: "Danceable",
  },
  {
    key: "bpm",
    label: "Tempo",
    icon: "Gauge",
    min: 60,
    max: 180,
    lowLabel: "Slow",
    highLabel: "Fast",
    unit: "BPM",
  },
  {
    key: "arousal",
    label: "Arousal",
    icon: "AudioWaveform",
    min: 0,
    max: 1,
    lowLabel: "Peaceful",
    highLabel: "Energetic",
  },
];

const ML_MOOD_CONFIG = [
  { key: "moodHappy", label: "Happy", icon: "Smile", color: "yellow-400" },
  { key: "moodSad", label: "Sad", icon: "Frown", color: "blue-400" },
  { key: "moodRelaxed", label: "Relaxed", icon: "Coffee", color: "green-400" },
  { key: "moodAggressive", label: "Aggressive", icon: "Flame", color: "red-400" },
  { key: "moodParty", label: "Party", icon: "PartyPopper", color: "pink-400" },
  { key: "moodAcoustic", label: "Acoustic", icon: "Guitar", color: "amber-400" },
  { key: "moodElectronic", label: "Electronic", icon: "Radio", color: "purple-400" },
];
```
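
Both configs are meant to be iterated over rather than hard-coded per feature. A sketch of turning a config entry plus a raw track value into a 0-100 display percentage (the clamping mirrors the `normalizeValue` helper shown in the Frontend Integration Guide):

```typescript
// Map a feature config entry and a raw value to a 0-100 display percent.
type FeatureConfigEntry = { key: string; label: string; min: number; max: number };

const clamp01 = (v: number): number => Math.max(0, Math.min(1, v));

const toDisplayPercent = (
  entry: FeatureConfigEntry,
  raw: number | null | undefined
): number =>
  raw == null
    ? 0 // missing values render as empty bars
    : Math.round(clamp01((raw - entry.min) / (entry.max - entry.min)) * 100);
```

For example, a 120 BPM track displays at 50% on the 60-180 BPM tempo scale.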

---

## Vibe Matching Algorithm

### Feature Vector Construction

The system builds a **13-dimensional feature vector** for each track:

```typescript
const buildFeatureVector = (track: AudioFeatures) => [
  // ML mood predictions (7 features) - 1.3x weight for semantic importance
  getMoodValue(track.moodHappy, 0.5) * 1.3,
  getMoodValue(track.moodSad, 0.5) * 1.3,
  getMoodValue(track.moodRelaxed, 0.5) * 1.3,
  getMoodValue(track.moodAggressive, 0.5) * 1.3,
  getMoodValue(track.moodParty, 0.5) * 1.3,
  getMoodValue(track.moodAcoustic, 0.5) * 1.3,
  getMoodValue(track.moodElectronic, 0.5) * 1.3,

  // Audio features (4 features)
  track.energy ?? 0.5,
  calculateEnhancedArousal(track),
  track.danceabilityMl ?? track.danceability ?? 0.5,
  track.instrumentalness ?? 0.5,

  // BPM (octave-aware normalization)
  1 - octaveAwareBPMDistance(track.bpm ?? 120, 120),

  // Valence
  calculateEnhancedValence(track),
];

// Helper: get a mood value with a neutral fallback
const getMoodValue = (value: number | null | undefined, fallback: number) =>
  value ?? fallback;
```

The `calculateEnhancedValence` and `calculateEnhancedArousal` helpers extend the base formulas from the Derived Features section with the adjustments described in the Research Background section.

### Cosine Similarity Calculation

Tracks are compared using cosine similarity:

```typescript
const cosineSimilarity = (vectorA: number[], vectorB: number[]): number => {
  let dotProduct = 0;
  let magA = 0;
  let magB = 0;

  for (let i = 0; i < vectorA.length; i++) {
    dotProduct += vectorA[i] * vectorB[i];
    magA += vectorA[i] * vectorA[i];
    magB += vectorB[i] * vectorB[i];
  }

  // Guard against zero-magnitude vectors to avoid division by zero
  if (magA === 0 || magB === 0) return 0;

  return dotProduct / (Math.sqrt(magA) * Math.sqrt(magB));
};
```

### Tag/Genre Bonus

Additional boost for shared tags:

```typescript
const computeTagBonus = (
  sourceTags: string[],
  sourceGenres: string[],
  trackTags: string[],
  trackGenres: string[]
): number => {
  const sourceSet = new Set(
    [...sourceTags, ...sourceGenres].map(t => t.toLowerCase())
  );
  const trackSet = new Set(
    [...trackTags, ...trackGenres].map(t => t.toLowerCase())
  );

  const overlap = [...sourceSet].filter(tag => trackSet.has(tag)).length;
  return Math.min(0.05, overlap * 0.01); // Max 5% bonus
};
```

### Final Score

```typescript
const finalScore = cosineSimilarity(sourceVector, targetVector) * 0.95 + tagBonus;
```
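
The 0.95 scaling is what keeps the combined score bounded: cosine similarity contributes at most 0.95 and the tag bonus at most 0.05, so the total never exceeds 1.0. A minimal sketch of the combination (`combineScore` is a hypothetical helper; the backend inlines this):

```typescript
// Combine cosine similarity with the tag/genre overlap bonus.
// Similarity is scaled to 95% so the capped 5% tag bonus cannot push past 1.0.
const combineScore = (similarity: number, tagOverlap: number): number => {
  const tagBonus = Math.min(0.05, tagOverlap * 0.01); // same cap as computeTagBonus
  return similarity * 0.95 + tagBonus;
};
```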

### Matching Thresholds

| Mode | Minimum Similarity |
|------|-------------------|
| Enhanced | 40% |
| Standard | 50% |

Lower threshold for Enhanced mode because ML predictions provide more nuanced differentiation.
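
Applied in code, assuming `analysisMode` determines which cutoff to use (a sketch; the backend's actual gating may differ):

```typescript
// Mode-dependent minimum similarity for a track to count as a vibe match.
const minSimilarity = (mode: "enhanced" | "standard"): number =>
  mode === "enhanced" ? 0.4 : 0.5;

const isVibeMatch = (similarity: number, mode: "enhanced" | "standard"): boolean =>
  similarity >= minSimilarity(mode);
```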

### Octave-Aware BPM Matching

Treats harmonically related tempos as similar (60 BPM ≈ 120 BPM ≈ 240 BPM):

```typescript
const octaveAwareBPMDistance = (bpm1: number, bpm2: number): number => {
  const normalizeToOctave = (bpm: number): number => {
    while (bpm < 77) bpm *= 2;
    while (bpm > 154) bpm /= 2;
    return bpm;
  };

  const norm1 = normalizeToOctave(bpm1);
  const norm2 = normalizeToOctave(bpm2);

  const logDistance = Math.abs(Math.log2(norm1) - Math.log2(norm2));
  return Math.min(logDistance, 1);
};
```

---

## API Endpoints

### Get Track Audio Features

```
GET /api/tracks/:id/features
```

Response:

```json
{
  "bpm": 128.5,
  "energy": 0.78,
  "valence": 0.65,
  "arousal": 0.72,
  "danceability": 0.85,
  "key": "C",
  "keyScale": "major",
  "moodHappy": 0.72,
  "moodSad": 0.15,
  "moodRelaxed": 0.28,
  "moodAggressive": 0.45,
  "moodParty": 0.68,
  "moodAcoustic": 0.12,
  "moodElectronic": 0.78,
  "analysisMode": "enhanced",
  "analysisStatus": "completed"
}
```
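
A minimal client sketch for this endpoint. The path and response shape are taken from this document; the error handling and the loose return type are assumptions for illustration:

```typescript
// Hypothetical fetch wrapper for GET /api/tracks/:id/features.
const featuresUrl = (trackId: string): string =>
  `/api/tracks/${encodeURIComponent(trackId)}/features`;

async function getTrackFeatures(trackId: string): Promise<Record<string, unknown>> {
  const res = await fetch(featuresUrl(trackId));
  if (!res.ok) throw new Error(`Features request failed: ${res.status}`);
  return res.json();
}
```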

### Find Similar Tracks (Vibe Match)

```
GET /api/library/vibe-match?trackId=:id&limit=20
```

Response:

```json
{
  "source": { /* track with features */ },
  "matches": [
    {
      "track": { /* track data */ },
      "similarity": 0.87,
      "features": { /* audio features */ }
    }
  ]
}
```

### Generate Mood Mix

```
POST /api/mixes/mood
```

Request:

```json
{
  "valence": { "min": 0.6, "max": 1.0 },
  "energy": { "min": 0.5, "max": 0.8 },
  "danceability": { "min": 0.7, "max": 1.0 },
  "bpm": { "min": 100, "max": 140 },
  "limit": 15
}
```

### Get Mood Presets

```
GET /api/mixes/mood-presets
```

Response:

```json
[
  {
    "id": "chill",
    "name": "Chill Vibes",
    "color": "from-blue-600 to-purple-600",
    "params": {
      "valence": { "min": 0.3, "max": 0.7 },
      "energy": { "min": 0.1, "max": 0.4 }
    }
  }
]
```
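
A preset's `params` object is already shaped like the mood-mix request body, so generating a mix from a preset is mostly spreading it in. A sketch (`presetToMixRequest` is a hypothetical helper; the `limit` default of 15 matches the example request above):

```typescript
// Build a POST /api/mixes/mood body from a mood preset.
type Range = { min: number; max: number };
type MoodPreset = {
  id: string;
  name: string;
  color?: string;
  params: Record<string, Range>;
};

const presetToMixRequest = (
  preset: MoodPreset,
  limit = 15
): Record<string, Range | number> => ({ ...preset.params, limit });
```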

---

## Frontend Integration Guide

### Displaying Feature Values

Normalize values for consistent display:

```typescript
function normalizeValue(
  value: number | null | undefined,
  min: number,
  max: number
): number {
  if (value === null || value === undefined) return 0;
  return Math.max(0, Math.min(1, (value - min) / (max - min)));
}

// Usage
const normalizedBpm = normalizeValue(track.bpm, 60, 180);
const normalizedEnergy = normalizeValue(track.energy, 0, 1);
```

### Calculating Match Scores

```typescript
function calculateFeatureMatch(
  sourceVal: number | null,
  currentVal: number | null,
  min: number,
  max: number
): { diff: number; match: number } {
  const sourceNorm = normalizeValue(sourceVal, min, max);
  const currentNorm = normalizeValue(currentVal, min, max);
  const diff = Math.abs(sourceNorm - currentNorm);
  const match = Math.round((1 - diff) * 100);

  return { diff, match };
}
```

### Match Score Color Coding

```typescript
function getMatchColor(matchPercent: number): string {
  if (matchPercent >= 80) return "text-green-400";  // Excellent
  if (matchPercent >= 60) return "text-yellow-400"; // Good
  return "text-red-400";                            // Different
}

function getMatchDescription(matchPercent: number): string {
  if (matchPercent >= 80) return "Excellent match - very similar vibe";
  if (matchPercent >= 60) return "Good match - similar energy";
  return "Different vibe - exploring variety";
}
```

### Visualization Recommendations

#### 1. Radar Chart (Spider Graph)

Best for comparing multiple features at once. Shows the source track (dashed line) vs. the current track (solid fill).

#### 2. Progress Bars

Best for individual feature comparison with a source marker overlay.

#### 3. Mood Grid

A 4x2 grid of the seven ML mood indicators with per-mood match percentages.

#### 4. Valence-Arousal Quadrant

A 2D scatter plot with:

- X-axis: Valence (sad → happy)
- Y-axis: Arousal (calm → energetic)

Quadrants:

- Top-right: Happy + Energetic (Party)
- Top-left: Sad + Energetic (Angry/Tense)
- Bottom-right: Happy + Calm (Peaceful)
- Bottom-left: Sad + Calm (Melancholic)
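
The quadrant labels can be computed directly from the two derived features. A sketch that splits each axis at 0.5 (the midpoint split is an assumption of this sketch; the interpretation tables earlier use 0.3/0.6 bands):

```typescript
// Map a (valence, arousal) pair to its quadrant label.
const getQuadrant = (valence: number, arousal: number): string => {
  if (arousal >= 0.5) return valence >= 0.5 ? "Party" : "Angry/Tense";
  return valence >= 0.5 ? "Peaceful" : "Melancholic";
};
```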

---

## Existing Components Reference

### VibeOverlay

Location: `frontend/components/player/VibeOverlay.tsx`

Full-featured overlay showing:

- Overall match percentage
- Feature-by-feature comparison bars
- ML mood grid (enhanced mode)
- Source vs. current legend

### VibeGraph

Location: `frontend/components/player/VibeGraph.tsx`

Compact radar chart for:

- 4-feature comparison (Energy, Mood, Dance, BPM)
- Match score badge
- Inline display in player

### MoodMixer

Location: `frontend/components/MoodMixer.tsx`

Modal for:

- Quick mood presets
- Custom range sliders
- Generating mood-based playlists

---

## Special Considerations

### Out-of-Distribution (OOD) Detection

The MusiCNN model was trained on pop/rock music. For other genres (classical, ambient, jazz), predictions may be unreliable. The backend normalizes these cases:

**Detection criteria:**

- All mood values > 0.7 with low variance
- All mood values clustered around 0.5
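
Both criteria reduce to simple statistics over the seven mood probabilities. A sketch; the exact cutoffs used here (variance below 0.01, mean within 0.1 of neutral) are illustrative assumptions, not the backend's tuned values:

```typescript
// Heuristic OOD check over the seven mood probabilities.
const looksOutOfDistribution = (moods: number[]): boolean => {
  if (moods.length === 0) return false;
  const mean = moods.reduce((s, v) => s + v, 0) / moods.length;
  const variance = moods.reduce((s, v) => s + (v - mean) ** 2, 0) / moods.length;
  const lowVariance = variance < 0.01; // assumed cutoff

  // Criterion 1: everything suspiciously high with little spread.
  const allHigh = moods.every((v) => v > 0.7);
  // Criterion 2: everything clustered around the 0.5 neutral point.
  const allNeutral = lowVariance && Math.abs(mean - 0.5) < 0.1;

  return (allHigh && lowVariance) || allNeutral;
};
```

Healthy enhanced predictions usually spread across the 0-1 range, so a mixed mood vector passes this check.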

**UI Recommendation:** Show a subtle indicator when `analysisMode` is "standard" or when predictions seem unreliable.

### Handling Missing Data

Always provide fallback values:

```typescript
const safeFeatures = {
  energy: track.energy ?? 0.5,
  valence: track.valence ?? 0.5,
  bpm: track.bpm ?? 120,
  // ... etc
};
```

### Analysis Status States

| Status | UI Treatment |
|--------|--------------|
| `pending` | Show "Analyzing..." with spinner |
| `processing` | Show progress indicator |
| `completed` | Show full vibe data |
| `failed` | Show fallback/retry option |

---

## Quick Reference: Value Ranges

| Metric | Min | Max | Neutral |
|--------|-----|-----|---------|
| All mood* | 0 | 1 | 0.5 |
| energy | 0 | 1 | 0.5 |
| valence | 0 | 1 | 0.5 |
| arousal | 0 | 1 | 0.5 |
| danceability | 0 | 1 | 0.5 |
| bpm | 60 | 200 | 120 |
| keyStrength | 0 | 1 | - |

---

## File Locations

| Component | Path |
|-----------|------|
| Audio Analyzer (Python) | `services/audio-analyzer/analyzer.py` |
| Vibe Matching Logic | `backend/src/routes/library.ts` |
| Database Schema | `backend/prisma/schema.prisma` |
| Frontend Vibe Overlay | `frontend/components/player/VibeOverlay.tsx` |
| Frontend Vibe Graph | `frontend/components/player/VibeGraph.tsx` |
| Mood Mixer | `frontend/components/MoodMixer.tsx` |
| Audio State Context | `frontend/lib/audio-state-context.tsx` |

---

## Research Background

The Vibe System's valence and arousal calculations are informed by music psychology research:

### Valence (Emotional Positivity)

**Key Finding:** Mode/tonality is the strongest predictor of perceived valence in music.

- **Lee et al. (ICASSP 2020)** demonstrated that musical mode (major vs. minor) has the highest correlation with listener-reported valence
- In the enhanced valence calculation, major keys contribute positively (+0.3) and minor keys negatively (-0.2)
- This aligns with centuries of music theory and empirical psychology research

### Arousal (Energy/Excitement)

**Key Finding:** The "electronic" mood prediction from ML models is unreliable for arousal calculation.

- **Grekow (2018)** found that direct energy and tempo features outperform genre-based predictions for arousal
- The enhanced arousal calculation therefore replaces the "electronic" mood with explicit energy and BPM contributions
- This provides more consistent arousal predictions across diverse genres

### Feature Weights

The specific weights in the formulas (e.g., 0.35 for the aggressive mood and 0.25 for the party mood in arousal) were tuned through:

1. Initial values from published research
2. Empirical testing on a diverse music library
3. User feedback on vibe matching accuracy

### References

- Lee, J., et al. (2020). "Music Emotion Recognition Using Valence-Arousal Regression." ICASSP 2020.
- Grekow, J. (2018). "Music Emotion Maps in Arousal-Valence Space." IFIP International Conference on Computer Information Systems and Industrial Management.