
11 min read · March 10, 2026

ML Music Research

Constraint-Aware Recommendation as Creative Scaffolding

Music recommendation systems optimize for engagement. DJs optimize for a feeling. What happens when you build a system that takes structural constraints seriously?

Spotify's recommendation engine is good at one thing: keeping you listening. It optimizes for session length, click-through rate, and platform engagement. These are legitimate optimization targets for a streaming service. They are not legitimate targets for a DJ building a set. A DJ's constraints are structural. The next track must be within a few BPM of the current one — not for algorithmic similarity, but because a real-time mix requires tempo alignment. The key must be harmonically compatible — not for acoustic similarity, but because mixing two tracks in conflicting keys creates audible dissonance on a sound system. The energy must follow a deliberate arc — not a random walk, but a shape that moves a room from one emotional state to another. These constraints aren't edge cases. They're the core logic of how DJs actually select tracks.

And mainstream recommendation systems don't model any of them.

Engagement optimization vs. set optimization

Spotify's Discover Weekly, Release Radar, and personalized playlists all optimize for one thing: will you keep listening? The signals are implicit: skips, saves, repeats, session length. The feedback loop is simple: more engagement means more data means better engagement predictions.

DJ set selection operates under a completely different objective function. A DJ isn't trying to maximize listening time. They're trying to construct a sequence that moves a room: one that builds tension, releases it, creates moments of surprise and recognition, and resolves into a satisfying whole. The signal for success isn't "did they keep listening" but "did the room move." This is not a small difference.

It's the difference between a system that optimizes for convenience and one that optimizes for craft.

The constraint landscape

A constraint-aware recommendation system for DJs needs to model at least four structural axes:

| Constraint | What it controls | Typical range | Why it matters |
| --- | --- | --- | --- |
| Tempo (BPM) | Beat-mixing feasibility | ±3-5 BPM for a smooth transition | Physical constraint: two tracks at different tempos can't be blended without pitch shift |
| Key compatibility | Harmonic mixing | Same key ±1 semitone, or Camelot adjacent | Dissonance in a live mix is immediately audible |
| Energy contour | Set arc shape | Normalized RMS & spectral centroid over time | The arc of a set is an energy story, not a tempo story |
| Spectral density | Perceived intensity | Sparse (minimal) to dense (anthemic) | Two tracks at 128 BPM can feel completely different depending on frequency content |

Notice what's not on the list: genre, mood, popularity, listening history.

These are the axes that Spotify and Apple Music use for recommendation. They matter for discovery, but they don't model the structural constraints that determine whether two tracks can actually be mixed together.

BPM is not energy (the 128 BPM problem)

The most common mistake in DJ-adjacent recommendation is treating BPM as a proxy for energy. It isn't. A minimal techno track at 128 BPM (sparse, bass and hi-hat only, enormous gaps of silence) has roughly a quarter of the perceived energy of a peak-time trance anthem at the same tempo. BPM tells you how fast the beat pulses. It doesn't tell you how full the frequency spectrum is, how dense the arrangement is, or how the track sits in the room. Within a single genre, the BPM range across an entire DJ set is typically only 10-15 BPM, roughly 13% variation.

The energy range across the same set, measured by RMS loudness and spectral density, varies by 400% or more. BPM is the guardrail. Energy is the steering. Even Spotify's own API acknowledges this. The "energy" value in Spotify's audio features data is documented as a perceptual measure of intensity and activity, drawing on dynamic range, perceived loudness, timbre, onset rate, and general entropy; tempo is reported as a separate field. Spotify built a feature that treats energy as something distinct from tempo for its own purposes. DJs need the same decoupling, with the added constraint that the results must be mixable in real time.

What constraint-aware recommendation looks like

A constraint-aware recommender doesn't replace the DJ. It scaffolds the DJ. The constraints define a space of possible next tracks, and the DJ selects within that space based on taste, intuition, and room feedback.
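As a rough sketch of how the two hard constraints (tempo and key) carve out that space of possible next tracks, here is a small Python check. Everything named here is an assumption for illustration: the `Track` fields, the 4 BPM default tolerance, and the simplified Camelot rule (same wheel number, or one step around the wheel in the same mode). It is not a description of any existing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    """Hypothetical track record; fields are assumed to be pre-tagged."""
    title: str
    bpm: float
    camelot: str  # Camelot wheel code such as "6A" (G minor)


def parse_camelot(code: str) -> tuple[int, str]:
    """Split a Camelot code like '10B' into (10, 'B')."""
    return int(code[:-1]), code[-1].upper()


def keys_compatible(a: str, b: str) -> bool:
    """Simplified harmonic rule: same number, or adjacent number in the same mode."""
    num_a, mode_a = parse_camelot(a)
    num_b, mode_b = parse_camelot(b)
    if num_a == num_b:
        return True  # same key, or relative major/minor (e.g. 6A and 6B)
    # Wheel positions run 1..12 and wrap around, so 12 and 1 are neighbours.
    return mode_a == mode_b and (num_a - num_b) % 12 in (1, 11)


def tempo_compatible(a: float, b: float, max_delta: float = 4.0) -> bool:
    """True when the tempo gap is small enough to beat-match (assumed threshold)."""
    return abs(a - b) <= max_delta


def is_mixable(current: Track, candidate: Track, max_bpm_delta: float = 4.0) -> bool:
    """A candidate is in the viable space only if it passes both hard constraints."""
    return (tempo_compatible(current.bpm, candidate.bpm, max_bpm_delta)
            and keys_compatible(current.camelot, candidate.camelot))


current = Track("current track", 126.0, "6A")
print(is_mixable(current, Track("neighbour key", 128.0, "7A")))
print(is_mixable(current, Track("clashing key", 126.0, "9A")))
```

Both checks return a plain yes or no rather than a score, which is the point: a track that fails either one never reaches the DJ as a suggestion.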

The system handles the structural logic so the DJ can focus on the creative logic. Concretely, this means:

Filter before suggest. The system first eliminates tracks that violate structural constraints: wrong key, incompatible tempo, energy profile that would break the set arc. Only then does it rank the remaining candidates by similarity, novelty, or other preference signals.

Model the arc, not the track. Instead of recommending "tracks like this one," the system recommends tracks that maintain or advance the current energy trajectory. If the set is building, suggest tracks with higher spectral density. If the set is peaking, suggest tracks with similar density but different timbral character. If the set is winding down, suggest sparser, lower-energy alternatives.

Treat key and tempo as constraints, not features. In collaborative filtering, key and BPM are features like any other: they contribute to a similarity score. In DJ recommendation, they're hard constraints. A track in the wrong key isn't "less similar"; it's unusable without pitch-shifting that degrades audio quality.

Use spectral profile as the primary similarity axis. Two tracks with similar spectral centroid, spectral flatness, and dynamic range will sound more alike to a DJ than two tracks with similar BPM and key but different spectral profiles. Spectral similarity predicts mixability.

Conventional vs. constraint-aware

Conventional: "You liked this track. Here are more tracks like it."

Constraint-aware: "You're 40 minutes into a set, currently at 126 BPM in G minor, energy trending upward. Here are tracks that maintain that trajectory, are harmonically compatible, and won't kill the room."

Why existing systems don't do this

Spotify, Apple Music, and YouTube Music don't build for DJs because DJs aren't their primary audience. Their optimization targets (session length, ad revenue, discovery) are served by engagement-maximizing recommendation, not constraint-aware recommendation. Building for DJs would mean building a different product.

Even Beatport, the DJ-focused marketplace, only allows filtering by BPM and key as separate parameters. There's no "suggest tracks that mix well with this one" button. The DJ is expected to have the expertise to construct a set manually.

Rekordbox's cloud analysis feature and Serato's upcoming recommendation features both focus on library management: "here are tracks you haven't played recently" or "here are tracks similar to ones you've played." Neither models the real-time constraint satisfaction problem that a DJ faces during a set.

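The real-time selection problem described above can be sketched as "filter first, then rank against the arc." The following is a minimal Python illustration under stated assumptions, not a reference implementation: the `Track` fields, the pre-computed 0..1 `energy` value, the 4 BPM tolerance, the simplified Camelot rule, and the `ARC_STEP` offsets are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    """Hypothetical track record with pre-computed analysis values."""
    title: str
    bpm: float
    camelot: str   # Camelot wheel code, e.g. "6A" for G minor
    energy: float  # assumed normalized to 0..1 by some upstream analysis


def is_mixable(current: Track, candidate: Track, max_bpm_delta: float = 4.0) -> bool:
    """Hard constraints: tempo within tolerance and a compatible Camelot key."""
    if abs(current.bpm - candidate.bpm) > max_bpm_delta:
        return False
    num_a, mode_a = int(current.camelot[:-1]), current.camelot[-1]
    num_b, mode_b = int(candidate.camelot[:-1]), candidate.camelot[-1]
    if num_a == num_b:
        return True  # same key or relative major/minor
    return mode_a == mode_b and (num_a - num_b) % 12 in (1, 11)


# Assumed energy offsets for each phase of the set arc.
ARC_STEP = {"build": 0.10, "peak": 0.0, "wind_down": -0.10}


def target_energy(current_energy: float, phase: str) -> float:
    """Energy level the next track should aim for, clamped to 0..1."""
    return min(1.0, max(0.0, current_energy + ARC_STEP[phase]))


def suggest_next(current: Track, library: list[Track], phase: str, limit: int = 3) -> list[Track]:
    """Step 1: drop tracks that break hard constraints. Step 2: rank the rest by arc fit."""
    viable = [t for t in library if t != current and is_mixable(current, t)]
    goal = target_energy(current.energy, phase)
    return sorted(viable, key=lambda t: abs(t.energy - goal))[:limit]


current = Track("current track", 126.0, "6A", 0.60)
library = [
    Track("sparse roller", 125.0, "6A", 0.45),
    Track("driving builder", 127.0, "7A", 0.72),
    Track("peak anthem", 128.0, "6B", 0.90),
    Track("wrong key", 126.0, "9A", 0.70),
    Track("too fast", 134.0, "6A", 0.70),
]

for phase in ("build", "wind_down"):
    print(phase, [t.title for t in suggest_next(current, library, phase)])
```

Tracks that fail the key or tempo check never appear in the output, however close their energy is to the target. The ranking step here uses a single energy number; a fuller version would also compare spectral profiles, and the final pick still belongs to the DJ.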
The scaffolding, not the scaffold

The goal of constraint-aware recommendation isn't to automate the DJ. It's to reduce the cognitive load of selection so the DJ can focus on the things that machines can't model: reading the room, feeling the energy, making the call that no algorithm can make.

A good DJ set is a structured improvisation. The structure comes from the constraints: key, tempo, energy, density. The improvisation comes from the DJ's taste, experience, and real-time perception. A constraint-aware system handles the structure so the DJ can improvise. This is the difference between scaffolding, a temporary support that makes creative work possible, and a scaffold, a rigid framework that replaces it.

The research on AI-assisted creativity points in the same direction: co-creative tools that preserve human agency tend to do better than fully automated systems on creative output quality and user satisfaction. Doshi & Hauser's study of AI-assisted writing found that AI help improved individual creative output but reduced collective diversity, the homogenization effect; work on AI art platforms has reported individual productivity gains of around 25%. The design implication: co-create, don't autopilot.

Constraint-aware recommendation is scaffolding. It narrows the search space to structurally viable options, then gets out of the way. The DJ is still the one deciding. The system just makes sure the options on the table are all options that could actually work.

Engagement-optimized recommendation gives you more of what you already like. Constraint-aware recommendation gives you things you might not have considered, but that will actually work in the context you're in.

For DJs — and for anyone making sequential creative decisions under real constraints — that's the difference between a feed and a tool.