Vibe Matching Algorithm Overhaul Plan
Overview
This document outlines the plan to overhaul the vibe matching algorithm. The new algorithm applies cosine similarity to a normalized feature vector built from the ML mood predictions and audio features, then adds a small bonus for genre/tag overlap.
Current State (Before Overhaul)
What We Have
- ML Mood Predictions (9 total):
  - moodHappy, moodSad, moodRelaxed, moodAggressive (existing)
  - moodParty, moodAcoustic, moodElectronic (newly added)
  - danceabilityMl, aggressivenessMl (existing)
- Audio Features:
  - bpm, key, keyScale (major/minor)
  - energy, danceability, valence, arousal
  - instrumentalness, acousticness, speechiness
- Metadata:
  - lastfmTags (JSON array of tag objects with name/count)
  - essentiaGenres (JSON array of genre strings)
  - trackGenres relation (linked genre records)
Previous Algorithm (Weighted Manhattan Distance)
// Old approach - arbitrary weights, limited features
const weights = {
  energy: 1.5,
  danceability: 1.2,
  valence: 1.0,
  arousal: 1.0,
  instrumentalness: 0.8,
  bpm: 0.5,
};
let score = 0;
for (const [feature, weight] of Object.entries(weights)) {
  const diff = Math.abs(sourceTrack[feature] - candidateTrack[feature]);
  score += diff * weight;
}
// Lower score = more similar (inverted logic)
Problems with old approach:
- Only used 6 features, ignored all ML mood predictions
- Arbitrary weights with no scientific basis
- Manhattan distance less effective for high-dimensional feature spaces
- No genre/tag matching
- Score inversion was confusing
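The sketch below illustrates one practical consequence of the old scoring. It assumes bpm was stored in raw beats per minute (as the Phase 3 normalization suggests) while the other features were in the 0-1 range. The track values and the oldScore helper name are hypothetical and exist only for illustration.

```typescript
// Weights copied from the old algorithm.
const oldWeights: Record<string, number> = {
  energy: 1.5,
  danceability: 1.2,
  valence: 1.0,
  arousal: 1.0,
  instrumentalness: 0.8,
  bpm: 0.5,
};

type OldFeatures = Record<string, number>;

// Weighted Manhattan distance: lower means more similar.
function oldScore(source: OldFeatures, candidate: OldFeatures): number {
  let score = 0;
  for (const [feature, weight] of Object.entries(oldWeights)) {
    score += Math.abs(source[feature] - candidate[feature]) * weight;
  }
  return score;
}

// Hypothetical tracks; the numbers are illustrative, not measured.
const base = {
  energy: 0.8, danceability: 0.7, valence: 0.6,
  arousal: 0.7, instrumentalness: 0.1, bpm: 120,
};
// Same tempo, clearly different mood.
const sameTempoDifferentMood = { ...base, energy: 0.2, valence: 0.1, arousal: 0.2 };
// Same mood, tempo 8 BPM faster.
const sameMoodDifferentTempo = { ...base, bpm: 128 };

const moodGap = oldScore(base, sameTempoDifferentMood);
const tempoGap = oldScore(base, sameMoodDifferentTempo);
console.log({ moodGap, tempoGap });
```

Under these assumptions the 8 BPM difference produces a larger distance than the mood change does, so the track with the different mood would rank as the closer match.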
New Algorithm (Cosine Similarity)
Phase 1: Database Schema Update ✅
Add new mood fields to Prisma schema:
model Track {
  // ... existing fields ...
  // ML Mood Predictions (0.0-1.0)
  moodHappy      Float?
  moodSad        Float?
  moodRelaxed    Float?
  moodAggressive Float?
  moodParty      Float? // NEW
  moodAcoustic   Float? // NEW
  moodElectronic Float? // NEW
  // ... rest of schema ...
}
Migration command:
cd backend
npx prisma db push --skip-generate
Phase 2: Audio Analyzer Update ✅
Update services/audio-analyzer/analyzer.py to extract and save all 7 mood predictions:
# MusiCNN mood classifiers
mood_models = {
    'moodHappy': 'mood_happy-musicnn-msd-2',
    'moodSad': 'mood_sad-musicnn-msd-2',
    'moodRelaxed': 'mood_relaxed-musicnn-msd-2',
    'moodAggressive': 'mood_aggressive-musicnn-msd-2',
    'moodParty': 'mood_party-musicnn-msd-2',
    'moodAcoustic': 'mood_acoustic-musicnn-msd-2',
    'moodElectronic': 'mood_electronic-musicnn-msd-2',
}
# Save all to database (parameterized SQL)
UPDATE "Track" SET
    "moodHappy" = %s,
    "moodSad" = %s,
    "moodRelaxed" = %s,
    "moodAggressive" = %s,
    "moodParty" = %s,
    "moodAcoustic" = %s,
    "moodElectronic" = %s,
    ...
Phase 3: Feature Vector Construction
Build a normalized feature vector for each track:
interface TrackFeatures {
  // ML Moods (0-1)
  moodHappy: number | null;
  moodSad: number | null;
  moodRelaxed: number | null;
  moodAggressive: number | null;
  moodParty: number | null;
  moodAcoustic: number | null;
  moodElectronic: number | null;
  // Audio Features
  energy: number | null;
  arousal: number | null;
  danceability: number | null;
  danceabilityMl: number | null;
  instrumentalness: number | null;
  bpm: number | null;
  keyScale: string | null;
  // Metadata
  lastfmTags: any;
  essentiaGenres: any;
}
function buildFeatureVector(track: TrackFeatures): number[] {
  return [
    // 7 ML Mood predictions (indices 0-6)
    // Missing values fall back to a neutral 0.5
    track.moodHappy ?? 0.5,
    track.moodSad ?? 0.5,
    track.moodRelaxed ?? 0.5,
    track.moodAggressive ?? 0.5,
    track.moodParty ?? 0.5,
    track.moodAcoustic ?? 0.5,
    track.moodElectronic ?? 0.5,
    // Core audio features (indices 7-10)
    track.energy ?? 0.5,
    track.arousal ?? 0.5,
    // Prefer the ML danceability estimate, then the classic one
    track.danceabilityMl ?? track.danceability ?? 0.5,
    track.instrumentalness ?? 0.5,
    // Normalized BPM (index 11)
    // Maps 60-180 BPM to 0-1 range; values outside that range are clamped
    Math.max(0, Math.min(1, ((track.bpm ?? 120) - 60) / 120)),
    // Key mode (index 12)
    // Major = 1, Minor = 0, unknown = 0.5 (neutral, so a missing key is not treated as minor)
    track.keyScale === 'major' ? 1 : track.keyScale === 'minor' ? 0 : 0.5,
  ];
}
Feature Vector Dimensions: 13
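To make two of the vector entries concrete, here is a small sketch of the BPM normalization (index 11) and the danceability fallback (index 9). The helper names normalizeBpm and pickDanceability are hypothetical; they only isolate the expressions already used in buildFeatureVector.

```typescript
// Hypothetical helper: same formula as index 11 of buildFeatureVector.
// Maps 60-180 BPM linearly onto 0-1 and clamps anything outside that range.
function normalizeBpm(bpm: number | null): number {
  return Math.max(0, Math.min(1, ((bpm ?? 120) - 60) / 120));
}

// Hypothetical helper: same fallback chain as index 9.
// Prefers the ML estimate, then the classic estimate, then a neutral 0.5.
function pickDanceability(ml: number | null, classic: number | null): number {
  return ml ?? classic ?? 0.5;
}

const bpmExamples = [40, 60, 90, 120, 150, 180, 200].map((bpm) => normalizeBpm(bpm));
console.log(bpmExamples); // [0, 0, 0.25, 0.5, 0.75, 1, 1]

console.log(normalizeBpm(null));          // 0.5 (missing BPM is treated as 120)
console.log(pickDanceability(null, 0.3)); // 0.3 (falls back to the classic value)
console.log(pickDanceability(0.8, 0.3));  // 0.8 (ML value wins when present)
```

One effect of the clamp is that tempos of 180 and 200 BPM become indistinguishable, as do 40 and 60 BPM.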
Phase 4: Cosine Similarity Calculation
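function cosineSimilarity(a: number[], b: number[]): number {
  // Both vectors must come from the same feature layout
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dotProduct = 0;
  let magnitudeA = 0; // accumulates the squared magnitude of a
  let magnitudeB = 0; // accumulates the squared magnitude of b
  for (let i = 0; i < a.length; i++) {
    dotProduct += a[i] * b[i];
    magnitudeA += a[i] * a[i];
    magnitudeB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as "no similarity"
  if (magnitudeA === 0 || magnitudeB === 0) return 0;
  return dotProduct / (Math.sqrt(magnitudeA) * Math.sqrt(magnitudeB));
}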
Properties:
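- Returns a value between -1 and 1; because every component of our vectors lies in the 0-1 range, the result here is always between 0 and 1
- 1.0 = vectors pointing in the same direction, which includes identical vectors (perfect match)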
- 0.0 = orthogonal vectors (no similarity)
- Higher = better (intuitive, no inversion needed)
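The properties above can be checked with a few small vectors. This is a minimal sketch that restates the cosine similarity function so it runs on its own; the example vectors are made up for illustration.

```typescript
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have the same length');
  let dot = 0;
  let magA = 0;
  let magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  if (magA === 0 || magB === 0) return 0;
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

const v = [0.9, 0.1, 0.5];
const scaled = v.map((x) => x * 2); // same direction, twice the magnitude

const identical = cosineSimilarity(v, v);            // approximately 1
const sameDirection = cosineSimilarity(v, scaled);   // approximately 1
const orthogonal = cosineSimilarity([1, 0], [0, 1]); // exactly 0
const zeroVector = cosineSimilarity([0, 0], [1, 1]); // 0 by the guard clause

// Two "opposite" profiles that share two neutral 0.5 entries
// still score about 0.515, because no component is negative.
const opposite = cosineSimilarity([0.9, 0.1, 0.5, 0.5], [0.1, 0.9, 0.5, 0.5]);

console.log({ identical, sameDirection, orthogonal, zeroVector, opposite });
```

Because all components are non-negative and missing values default to 0.5, scores for real tracks are likely to sit in the upper part of the 0-1 range. The relative ordering of candidates is therefore more informative than the absolute value.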
Phase 5: Tag/Genre Bonus
Add bonus points for matching tags and genres:
function calculateTagBonus(
  sourceTrack: TrackFeatures,
  candidateTrack: TrackFeatures
): number {
  let bonus = 0;
  // Extract tags
  const sourceTags = new Set<string>();
  const candidateTags = new Set<string>();
  // Parse lastfmTags
  if (Array.isArray(sourceTrack.lastfmTags)) {
    sourceTrack.lastfmTags.forEach((t: any) => {
      if (t?.name) sourceTags.add(t.name.toLowerCase());
    });
  }
  if (Array.isArray(candidateTrack.lastfmTags)) {
    candidateTrack.lastfmTags.forEach((t: any) => {
      if (t?.name) candidateTags.add(t.name.toLowerCase());
    });
  }
  // Parse essentiaGenres
  if (Array.isArray(sourceTrack.essentiaGenres)) {
    sourceTrack.essentiaGenres.forEach((g: string) => {
      sourceTags.add(g.toLowerCase());
    });
  }
  if (Array.isArray(candidateTrack.essentiaGenres)) {
    candidateTrack.essentiaGenres.forEach((g: string) => {
      candidateTags.add(g.toLowerCase());
    });
  }
  // Count overlapping tags
  let overlap = 0;
  for (const tag of sourceTags) {
    if (candidateTags.has(tag)) overlap++;
  }
  // Bonus: up to 0.1 (10%) for tag overlap
  // Normalized by the smaller set size to handle varying tag counts
  const minSize = Math.min(sourceTags.size, candidateTags.size);
  if (minSize > 0) {
    bonus = (overlap / minSize) * 0.1;
  }
  return bonus;
}
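A worked example of the bonus arithmetic may help. The sketch below assumes the tags have already been flattened to plain strings; tagOverlapBonus is a hypothetical, condensed stand-in for the overlap and normalization steps of calculateTagBonus, not a replacement for it.

```typescript
// Hypothetical condensed version of the overlap step:
// bonus = (shared tags / size of the smaller tag set) * 0.1
function tagOverlapBonus(sourceTags: string[], candidateTags: string[]): number {
  const source = new Set(sourceTags.map((t) => t.toLowerCase()));
  const candidate = new Set(candidateTags.map((t) => t.toLowerCase()));
  const overlap = Array.from(source).filter((t) => candidate.has(t)).length;
  const minSize = Math.min(source.size, candidate.size);
  return minSize > 0 ? (overlap / minSize) * 0.1 : 0;
}

// Both source tags appear in the candidate: 2 / 2 gives the full bonus.
const full = tagOverlapBonus(['Rock', 'indie'], ['rock', 'pop', 'indie', 'alternative']);

// One shared tag out of four: 1 / 4 gives a quarter of the bonus.
const partial = tagOverlapBonus(
  ['rock', 'indie', 'shoegaze', 'dream pop'],
  ['rock', 'pop', 'metal', 'punk']
);

// A track without tags never earns a bonus.
const none = tagOverlapBonus([], ['rock']);

// A candidate with a single matching tag also earns the full bonus.
const singleTag = tagOverlapBonus(['rock', 'indie', 'shoegaze', 'dream pop'], ['rock']);

console.log({ full, partial, none, singleTag });
```

Normalizing by the smaller set keeps sparsely tagged tracks from being penalized, at the cost that a candidate with one matching tag receives the same bonus as one that matches every tag.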
Phase 6: Final Score Calculation
function calculateVibeScore(
  sourceTrack: TrackFeatures,
  candidateTrack: TrackFeatures
): number {
  // Build feature vectors
  const sourceVector = buildFeatureVector(sourceTrack);
  const candidateVector = buildFeatureVector(candidateTrack);
  // Calculate cosine similarity (0-1)
  const cosineSim = cosineSimilarity(sourceVector, candidateVector);
  // Add tag bonus (0-0.1)
  const tagBonus = calculateTagBonus(sourceTrack, candidateTrack);
  // Final score: cosine similarity + tag bonus
  // Capped at 1.0
  const finalScore = Math.min(1.0, cosineSim + tagBonus);
  return finalScore;
}
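The effect of the 1.0 cap is easiest to see with numbers. In this sketch, combineScore mirrors the last step of calculateVibeScore, and toMatchPercent is a hypothetical helper showing one way the Phase 8 match percentage could be derived; neither name comes from the codebase.

```typescript
// Hypothetical helper mirroring the final step of calculateVibeScore.
function combineScore(cosineSim: number, tagBonus: number): number {
  return Math.min(1.0, cosineSim + tagBonus);
}

// Hypothetical helper: convert a 0-1 score to a whole-number percentage.
function toMatchPercent(score: number): number {
  return Math.round(score * 100);
}

const typical = combineScore(0.82, 0.05);   // below the cap
const capped = combineScore(0.96, 0.1);     // 1.06 before the cap
const alsoCapped = combineScore(0.93, 0.1); // 1.03 before the cap

console.log(toMatchPercent(typical));    // 87
console.log(toMatchPercent(capped));     // 100
console.log(toMatchPercent(alsoCapped)); // 100
```

Any candidate whose cosine similarity plus bonus reaches 1.0 ties at the cap, so the order among those candidates depends on the sort's handling of equal scores rather than on their underlying similarity.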
Phase 7: Integration into Radio Endpoint
Update backend/src/routes/library.ts:
// In the vibe radio section
const sourceTrack = await prisma.track.findUnique({
  where: { id: trackId },
  select: {
    moodHappy: true,
    moodSad: true,
    moodRelaxed: true,
    moodAggressive: true,
    moodParty: true,
    moodAcoustic: true,
    moodElectronic: true,
    energy: true,
    arousal: true,
    danceability: true,
    danceabilityMl: true,
    instrumentalness: true,
    bpm: true,
    keyScale: true,
    lastfmTags: true,
    essentiaGenres: true,
  },
});
// findUnique returns null when no track matches; stop before scoring
if (!sourceTrack) {
  throw new Error(`Track ${trackId} not found`);
}
// Get candidates
const candidates = await prisma.track.findMany({
  where: {
    id: { not: trackId },
    analysisStatus: 'enhanced', // Only use analyzed tracks
  },
  select: { /* same fields */ },
  take: 500, // Get more candidates for better matching
});
// Score all candidates
const scored = candidates.map(candidate => ({
  ...candidate,
  vibeScore: calculateVibeScore(sourceTrack, candidate),
}));
// Sort by score (highest first)
scored.sort((a, b) => b.vibeScore - a.vibeScore);
// Take top N for the queue
const vibeQueue = scored.slice(0, limit);
// DO NOT SHUFFLE - preserve the sorted order!
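The score, sort, and slice steps can also be pulled into a pure function that is testable without a database. The following is a sketch under that assumption; rankByVibe, DemoTrack, and the stand-in scorer are hypothetical names used only for illustration.

```typescript
// Hypothetical pure helper: score every candidate, sort descending, keep the top N.
// Returns an empty queue when the source track could not be loaded.
function rankByVibe<T extends object>(
  source: T | null,
  candidates: T[],
  limit: number,
  score: (source: T, candidate: T) => number
): Array<T & { vibeScore: number }> {
  if (source === null) return [];
  return candidates
    .map((candidate) => ({ ...candidate, vibeScore: score(source, candidate) }))
    .sort((a, b) => b.vibeScore - a.vibeScore)
    .slice(0, limit);
}

// Illustrative data and a stand-in scorer (not the real vibe score).
interface DemoTrack {
  id: string;
  energy: number;
}
const demoSource: DemoTrack = { id: 's', energy: 0.8 };
const demoCandidates: DemoTrack[] = [
  { id: 'a', energy: 0.75 },
  { id: 'b', energy: 0.2 },
  { id: 'c', energy: 0.9 },
];
const byEnergy = (s: DemoTrack, c: DemoTrack) => 1 - Math.abs(s.energy - c.energy);

const demoQueue = rankByVibe(demoSource, demoCandidates, 2, byEnergy);
console.log(demoQueue.map((t) => t.id)); // ['a', 'c']
```

In the route itself, calculateVibeScore would take the place of the stand-in scorer.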
Implementation Checklist
- Phase 1: Add moodParty, moodAcoustic, moodElectronic to Prisma schema
- Phase 2: Update audio analyzer to extract all 7 moods
- Phase 3: Implement buildFeatureVector() function
- Phase 4: Implement cosineSimilarity() function
- Phase 5: Implement calculateTagBonus() function (called computeTagBonus)
- Phase 6: Implement calculateVibeScore() combining all components
- Phase 7: Integrate into /library/radio endpoint
- Phase 8: Update frontend to display match percentage (optional enhancement)
- Phase 9: Re-analyze tracks to populate new mood fields
Re-Analysis Script
To populate the new mood fields for existing tracks:
-- Reset analysis status for enhanced tracks to re-run analysis
UPDATE "Track"
SET "analysisStatus" = 'pending'
WHERE "analysisStatus" = 'enhanced';
Or use the existing script:
docker exec lidify_db psql -U lidifydb -d lidify -f /path/to/reset-analysis-for-new-moods.sql
Expected Improvements
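- Better Similarity Matching: Cosine similarity compares the overall shape of two feature profiles instead of summing per-feature differences, and is a standard measure for high-dimensional feature vectors
- Full ML Utilization: All 7 mood predictions plus danceabilityMl now contribute to matching (aggressivenessMl is not part of the Phase 3 vector)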
- Genre Awareness: Tag/genre overlap provides meaningful boost
- Intuitive Scores: Higher score = better match (no inversion)
- Normalized Features: All features scaled to 0-1 for fair comparison
Testing Strategy
- Pick a track with known characteristics (e.g., happy upbeat pop song)
- Generate vibe queue
- Verify top matches share similar mood profiles
- Check that match percentages in UI reflect actual similarity
- Test with various genres to ensure cross-genre matching works appropriately
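The first three steps can be turned into an automated sanity check. The sketch below compares hypothetical 7-value mood vectors (ordered happy, sad, relaxed, aggressive, party, acoustic, electronic) and asserts that an upbeat candidate outranks a sad one for a happy source track. The values are invented for illustration, and the compact cosine helper is restated so the block runs on its own.

```typescript
// Compact cosine similarity for equal-length vectors.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  const magA = Math.sqrt(a.reduce((sum, x) => sum + x * x, 0));
  const magB = Math.sqrt(b.reduce((sum, x) => sum + x * x, 0));
  return magA === 0 || magB === 0 ? 0 : dot / (magA * magB);
}

// Hypothetical mood profiles:
// [happy, sad, relaxed, aggressive, party, acoustic, electronic]
const happySource = [0.9, 0.1, 0.3, 0.1, 0.8, 0.2, 0.6];
const upbeatCandidate = [0.85, 0.15, 0.35, 0.1, 0.7, 0.25, 0.5];
const sadCandidate = [0.1, 0.9, 0.7, 0.05, 0.1, 0.8, 0.1];

const simUpbeat = cosine(happySource, upbeatCandidate);
const simSad = cosine(happySource, sadCandidate);

// The upbeat candidate should score clearly higher than the sad one.
if (!(simUpbeat > simSad)) {
  throw new Error('Expected the upbeat candidate to outrank the sad candidate');
}
console.log({ simUpbeat, simSad });
```

A real test would load analyzed tracks and call calculateVibeScore; this version only checks that the similarity measure orders clearly different mood profiles as expected.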
Files Modified
- backend/prisma/schema.prisma - New mood fields
- backend/src/routes/library.ts - New scoring algorithm
- services/audio-analyzer/analyzer.py - Extract all 7 moods
- frontend/components/player/VibeOverlay.tsx - Display all moods
- frontend/lib/audio-state-context.tsx - Extended AudioFeatures interface
Notes
- Gaia: Essentia has a companion library called Gaia for large-scale similarity search using KD-trees. This is overkill for our scale (< 100k tracks) but could be considered for future scaling.
- MusiCNN Limitations: The model was trained on MSD (Million Song Dataset), which is pop/rock heavy, so predictions for classical/ambient music may be less reliable. We've added normalization to handle this.
- Shuffle Interaction: Vibe mode automatically disables shuffle to preserve the sorted order.