The Good Bot Rises: How Computational Skepticism Exposes Spotify's Ghost Artist Fraud
One professor got angry reading about Spotify's systematic theft. So he built the detection system the FTC won't.
I spent a couple of hours reading Liz Pelly’s Mood Machine: The Rise of Spotify and the Costs of the Perfect Playlist and got angrier with each chapter. Not the productive anger that fades into acceptance. The kind that demands building something. By page 200, when Pelly documented Spotify’s internal Slack messages showing the Strategic Programming team celebrating €61.4 million in gross profit from replacing real jazz artists with Swedish ghost musicians, I stopped reading and started coding.
The result is the Musinique playlist auditor—a computational framework that does what journalism can’t: systematically measure the scale of theft hiding in plain sight across 5.8 million Spotify playlists. Not to help artists navigate a rigged system. To prove the system is rigged with numbers regulators and courts can’t dismiss as anecdotal.
The Theft Pelly Documented, Quantified
Pelly proved through leaked documents that Spotify runs an internal program called “Perfect Fit Content”—licensing anonymous stock music at reduced royalty rates, releasing it under fabricated artist names with invented biographies, then systematically replacing real musicians on official playlists with these ghosts. One hundred-plus playlists are now over 90% fake artists. “Stress Relief” (1.45 million followers) contains 270 tracks. Forty-one are compositions by Johan Röhr, a Swedish composer operating behind 650+ invented identities who’s accumulated 15 billion streams and makes $30 million annually. Users encounter “diversity”—different artist names, different album covers. Reality: three guys in a Stockholm studio recording single takes, optimized for background listening, designed to be “as milquetoast as possible.”
What Pelly couldn’t measure: How many total playlists are compromised? What percentage of mood category streams go to ghosts? What’s the exact displaced revenue for independent musicians? How do you detect this at scale without access to Spotify’s internal monitoring tools?
That’s what Musinique calculates. We’ve scraped playlists, extracted complete track listings and artist metadata for every playlist, mapped artist-declared genres against playlist-claimed contexts, and built detection algorithms targeting the six fraud signatures Pelly’s reporting revealed. The framework isn’t finished—historical time-series collection is still underway and won’t reach statistical validity for a couple of months. But what’s operational already exposes patterns invisible to human observation.
What’s Built: The Computational Skepticism Stack
Focus Score: The Genre Coherence Audit
The first layer analyzes playlist integrity through mathematical rigor, not taste judgment. The Focus Score (0-100) combines three weighted metrics, sketched in code after the list:
Genre Breadth (45%): Penalizes playlists covering 10+ primary genres. One genre scores 100. Fifty genres scores zero. The logic: real human curators specialize. Bot farms accepting paid submissions dump anything.
Genre Density (30%): Rewards depth over breadth. Four hundred tracks across two genres (200 tracks/genre) scores 100. One hundred tracks across fifteen genres (6.7 tracks/genre) scores 2.3. Deep catalogs indicate expertise. Shallow mixing indicates spam.
Artist Repetition (25%): Rewards curation over randomness. If a playlist features the same 10 artists repeatedly (30% artist uniqueness), that’s focused sound curation—score 100. If every artist appears exactly once (100% uniqueness), that’s random dumping—score zero.
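Here’s what that looks like as code—a minimal sketch that assumes linear scaling between the anchor points above; the function name and exact interpolation are illustrative, not the production formula:

```python
def focus_score(track_genres: list[str], track_artists: list[str]) -> float:
    """Illustrative Focus Score: 0 (chaotic dumping) to 100 (focused curation).

    Assumes linear interpolation between the anchor points described above;
    Musinique's production formula may differ in its scaling details.
    """
    if not track_genres or not track_artists:
        return 0.0
    n_tracks = len(track_genres)
    n_genres = len(set(track_genres))

    # Genre Breadth (45%): one genre scores 100, fifty genres scores zero.
    breadth = max(0.0, 100.0 * (50 - n_genres) / 49)

    # Genre Density (30%): tracks per genre, saturating at 200 tracks/genre.
    density = min(100.0, 100.0 * (n_tracks / n_genres) / 200)

    # Artist Repetition (25%): 30% uniqueness scores 100, 100% scores zero.
    uniqueness = len(set(track_artists)) / len(track_artists)
    repetition = max(0.0, min(100.0, 100.0 * (1.0 - uniqueness) / 0.7))

    return 0.45 * breadth + 0.30 * density + 0.25 * repetition
```

Plug in the worked examples: 400 tracks across two genres maxes out density; a playlist where every artist appears exactly once zeroes out repetition.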
The mathematics matter because they’re objective. You can’t argue with entropy calculations. You can’t spin power law distributions. When the analysis of 25,000 curators reveals the top 1% control 54% of total reach, and the #1 curator by followers (9.19 million) is Filtr US—Sony Music’s playlist operation disguised as independent curation—the concentration isn’t opinion. It’s measurement.
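That 54% figure isn’t a model output—it’s a sort and a sum. A minimal sketch (the function name is mine, not the repo’s):

```python
import numpy as np

def top_share(reach: np.ndarray, top_frac: float = 0.01) -> float:
    """Share of total reach controlled by the top `top_frac` of curators."""
    reach = np.sort(reach)[::-1]              # biggest curators first
    k = max(1, int(len(reach) * top_frac))    # size of the top slice
    return reach[:k].sum() / reach.sum()

# On the 25,000-curator dataset, top_share(reach) comes out around 0.54:
# the top 250 curators control 54% of total follower reach.
```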
Scores below 40 correlate with bot farm indicators: high follower counts plus genre chaos plus zero social media presence. Scores above 70 predict human curation: genre-specific communities, consistent updates, verifiable identities. This isn’t subjective taste. This is forensic auditing of technological artifacts. These thresholds still need validation through formal, peer-reviewed research.
The Contact Discovery Agent
The second operational component automates what used to require hours of manual research: finding how to actually reach playlist curators. Spotify’s API provides playlist data but deliberately omits contact information. Artists are left guessing—send Instagram DMs? Email submission forms? Twitter mentions?
The LangGraph-orchestrated research agent executes systematic intelligence gathering: Google searches curator names, scrapes potential websites, extracts social media handles and submission forms, verifies matches through context (requires music-related confirmation, won’t extract “John Smith the plumber” when searching for “John Smith the DJ”). Success rate: approximately 80% for curators with public web presence. Rate-limited to ~20 curators per hour to avoid detection. Already enriched 84 curators from initial dataset.
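For readers who want the shape of it: a hedged sketch of how such a pipeline wires together in LangGraph. The node names are mine, and the three stub helpers stand in for the real SerpAPI search, website scraping, and music-context verification logic:

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END

class CuratorState(TypedDict):
    name: str
    urls: list[str]
    handles: dict[str, str]
    verified: bool

# Stub helpers -- placeholders for the actual search/scrape/verify logic.
def google_search(query: str) -> list[str]:
    return []

def extract_social_handles(urls: list[str]) -> dict[str, str]:
    return {}

def is_music_related(handles: dict[str, str], name: str) -> bool:
    return False

def search_web(state: CuratorState) -> dict:
    return {"urls": google_search(f'"{state["name"]}" spotify playlist curator')}

def scrape_profiles(state: CuratorState) -> dict:
    return {"handles": extract_social_handles(state["urls"])}

def verify_context(state: CuratorState) -> dict:
    # Require music-related context so "John Smith the plumber" never
    # matches when searching for "John Smith the DJ".
    return {"verified": is_music_related(state["handles"], state["name"])}

graph = StateGraph(CuratorState)
graph.add_node("search", search_web)
graph.add_node("scrape", scrape_profiles)
graph.add_node("verify", verify_context)
graph.set_entry_point("search")
graph.add_edge("search", "scrape")
graph.add_edge("scrape", "verify")
graph.add_edge("verify", END)
agent = graph.compile()

# agent.invoke({"name": "Some Curator", "urls": [], "handles": {}, "verified": False})
```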
The output is Playlisters.csv: curator name, Instagram, Twitter, Facebook, submission forms, average Focus Score across their catalog, total reach across all playlists. Not selling access to a broken game. Documenting the game’s mechanics so we can prove it’s broken.
The Validation Infrastructure
Third operational layer: multi-process Playwright automation verifying playlist liveness. Thirty percent of playlist URLs become invalid over six months—curators delete playlists, accounts get suspended, links break. The validator runs 16 parallel headless browsers, simulates human behavior (random mouse movements, variable scroll amounts, realistic wait times), masks browser fingerprints (disables webdriver detection), and checks for Spotify’s error messages plus content indicators. Processing 1,000 URLs takes 20-30 minutes. Manual verification would take days.
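A minimal sketch of the liveness check using Playwright’s async API—the Spotify error string is an assumption, and the human-behavior simulation (random mouse movements, variable waits) is omitted for brevity:

```python
import asyncio
from playwright.async_api import async_playwright

CONCURRENCY = 16  # parallel pages, mirroring the 16-browser setup

async def is_alive(context, url: str) -> bool:
    """Heuristic liveness check for one playlist URL."""
    page = await context.new_page()
    try:
        await page.goto(url, wait_until="domcontentloaded", timeout=30_000)
        body = await page.inner_text("body")
        # The exact error copy Spotify serves is an assumption here.
        return "Couldn't find that page" not in body and "playlist" in body.lower()
    except Exception:
        return False
    finally:
        await page.close()

async def validate(urls: list[str]) -> dict[str, bool]:
    results: dict[str, bool] = {}
    sem = asyncio.Semaphore(CONCURRENCY)
    async with async_playwright() as pw:
        browser = await pw.chromium.launch(headless=True)
        context = await browser.new_context()
        # Mask the most common automation fingerprint.
        await context.add_init_script(
            "Object.defineProperty(navigator, 'webdriver', {get: () => undefined})"
        )
        async def bounded(url: str) -> None:
            async with sem:
                results[url] = await is_alive(context, url)
        await asyncio.gather(*(bounded(u) for u in urls))
        await browser.close()
    return results
```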
This matters because data entropy is the enemy of research. Without weekly re-validation, the database becomes archaeological record—interesting historically, useless practically. Automating liveness checks means the dataset stays current.
What’s Missing: The Fraud Detection Suite
Reading Pelly’s documentation of the Michael Smith case—a musician who stole $10 million by generating hundreds of thousands of AI tracks with names like “Zygotic Washstands,” then using 10,000+ bot accounts to stream them billions of times, all while Spotify’s fraud detection failed for years—clarified what Musinique actually needs to build. Not playlist recommendations. Fraud forensics.
The Z-Score Growth Monitor (In Development)
Traditional bot detection flags sudden follower spikes. Sophisticated operations use “low and slow” methods—distributing streams across thousands of accounts at rates mimicking organic growth. Statistical process control solves this: monitor follower growth against genre-specific baselines, calculate Z-scores (standard deviations from mean), flag vertical spikes (Z > 3.0 indicates bot injection with 99.7% confidence).
Data requirements: 90 days of historical follower counts to establish baseline. Current status: daily snapshots launched February 12, 2026. Will have statistical validity by May 15. Then we can scan every playlist in the database, identify which ones show growth patterns inconsistent with organic discovery, and estimate the scale of stream fraud across the platform. Pelly documented this happens. We’ll measure exactly how much.
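The math is genuinely that simple. A sketch assuming daily follower counts arrive as pandas Series (the names are illustrative):

```python
import pandas as pd

def flag_bot_spikes(followers: pd.Series, genre_baseline: pd.Series,
                    threshold: float = 3.0) -> pd.Series:
    """Flag days where daily follower growth exceeds the genre baseline
    by more than `threshold` standard deviations.

    Both inputs: daily follower counts indexed by date. Jazz playlists
    get a jazz baseline; Viral Pop gets its own.
    """
    growth = followers.pct_change().dropna()
    baseline_growth = genre_baseline.pct_change().dropna()

    mu, sigma = baseline_growth.mean(), baseline_growth.std()
    z_scores = (growth - mu) / sigma
    return z_scores > threshold  # True = growth inconsistent with organic discovery
```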
The Churn Pattern Analyzer (In Development)
Pay-for-play schemes charge artists $50-500 for temporary playlist placement—one week, two weeks, one month. Then mechanically remove tracks to make room for the next paying customer. The pattern is exact-interval retention: if 30% of a playlist’s songs are removed at precisely 7±1 days, that’s not organic curation (which varies). That’s weekly paid slots.
The algorithm is simple: compare weekly playlist snapshots, calculate retention periods for every removed track, build histograms, test for clustering around 7/14/30-day intervals using chi-square goodness-of-fit. If the test rejects the null hypothesis (that retention periods are normally distributed, as organic curation would produce), you’ve detected mechanical replacement. The data exists—weekly snapshots are running. The analysis script is TODO. Estimated implementation: 2-3 days once sufficient temporal data accumulates.
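A sketch of that test—the one-day binning and the fitted-normal null are modeling choices for illustration, not the final spec:

```python
import numpy as np
from scipy import stats

def detect_exact_interval_churn(retention_days: np.ndarray, alpha: float = 0.001):
    """Flag mechanical slot-cycling in track retention periods.

    Null hypothesis (per the design above): retention periods are normally
    distributed, as organic curation would produce.
    """
    edges = np.arange(0, retention_days.max() + 2)   # 1-day bins
    observed, _ = np.histogram(retention_days, bins=edges)

    # Expected counts under a normal fitted to the observed periods.
    mu, sigma = retention_days.mean(), retention_days.std()
    expected = np.diff(stats.norm.cdf(edges, mu, sigma)) * len(retention_days)
    expected *= observed.sum() / expected.sum()      # match totals exactly

    _, p_value = stats.chisquare(observed, expected)

    # Fraction of removals landing at exactly 7±1, 14±1, or 30±1 days.
    paid_slot_share = np.isin(
        retention_days, [6, 7, 8, 13, 14, 15, 29, 30, 31]
    ).mean()
    return p_value < alpha and paid_slot_share >= 0.30, p_value, paid_slot_share
```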
Expected finding based on Pelly’s reporting: 15-25% of playlists with SubmitHub submission forms will show exact-interval clustering. That’s not a guess. That’s a hypothesis derived from the documented business model (SubmitHub charges $5-50 per submission to thousands of curators) plus internal logic (curators maximize revenue by cycling paid slots on fixed schedules).
The Ghost Artist Detector (Highest Priority)
This is the analysis that made me angry enough to build Musinique. Pelly documented the mechanism: Spotify’s Strategic Programming team commissions tracks from production companies (Firefly Entertainment, Epidemic Sound, Hush Hush LLC, Cat Farm Music, Queen Street Content, Mind Stream, Slumber Group, Audio Network), releases them under fabricated names (Ekvatt—“classically trained Icelandic beatmaker”—who doesn’t exist), places them on official mood playlists, monitors the “PFC %” using internal dashboards, and celebrates when targets are hit. One hundred-plus playlists over 90% ghost artists. €61.4 million annual profit. Real musicians systematically displaced.
But Pelly reviewed internal Slack messages—maybe 100-200 playlists total. We have access to every playlist on Spotify. We can check every artist name against external verification: Google search results, Wikipedia pages, Instagram accounts, artist websites, MusicBrainz databases. We can flag known PFC labels. We can scan artist bios for fabrication patterns (generic wellness language, invented conservatory credentials, vague origin stories). We can identify playlists where artists appear on 100+ lists but have fewer than 1,000 monthly listeners—an impossible ratio for real musicians.
The detection pipeline is straightforward (a code sketch follows the steps):
For each artist on mood playlists (ambient, jazz, classical, lofi, sleep, focus, chill):
Google Search API: Check if name returns >5 relevant results
Wikipedia API: Check if page exists
Instagram Graph API: Verify account + follower count
Label cross-reference: Flag Firefly, Epidemic, etc.
Bio NLP: Scan for fabrication patterns
Calculate PFC probability: 0 (verified) to 1.0 (ghost)
Aggregate to playlist level: percentage of tracks that are likely fabricated. Flag playlists where PFC% exceeds 50% (Pelly’s threshold) or 90% (extreme cases).
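Sketched in code, with the evidence weights as uncalibrated placeholder assumptions—the real pipeline would back each boolean with the API calls listed above:

```python
KNOWN_PFC_LABELS = {
    "Firefly Entertainment", "Epidemic Sound", "Hush Hush LLC", "Cat Farm Music",
    "Queen Street Content", "Mind Stream", "Slumber Group", "Audio Network",
}

def pfc_probability(artist: dict) -> float:
    """Combine verification signals into 0 (verified human) .. 1.0 (ghost).

    The weights are illustrative placeholders, not calibrated values.
    """
    score = 0.0
    if artist["google_results"] <= 5:          score += 0.30  # thin web presence
    if not artist["has_wikipedia"]:            score += 0.15
    if not artist["has_instagram"]:            score += 0.15
    if artist["label"] in KNOWN_PFC_LABELS:    score += 0.25  # known PFC provider
    if artist["bio_fabrication_score"] > 0.5:  score += 0.15  # NLP classifier output
    return min(score, 1.0)

def playlist_pfc_percent(artists: list[dict], threshold: float = 0.5) -> float:
    """Percentage of a playlist's artists flagged as likely fabricated."""
    ghosts = sum(pfc_probability(a) > threshold for a in artists)
    return 100.0 * ghosts / len(artists)

# Flag playlists per Pelly's thresholds: >50% majority-ghost, >90% extreme.
```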
Expected scope: Pelly said “100+ playlists over 90% PFC.” We’ll count exactly. We’ll calculate total streams to ghost artists monthly. We’ll estimate displaced revenue for real musicians (streams × $0.003-0.005 per stream). We’ll map the production company networks. We’ll provide regulators with numbers.
Current status: Algorithm designed, API integrations planned. Implementation blocked by rate limits—290 million artist verifications required (5.8M playlists × average 50 artists per playlist). Even at 1 million API calls daily, that’s 290 days. Solution: strategic sampling (analyze mood playlists first where PFC concentration is highest), parallel processing, and patience. This is research, not rapid prototyping. Accuracy matters more than speed.
Why the Only Way to Combat Evil Bots Is With Good Bots
Pelly’s journalism documents exploitation through testimony and leaked documents. Powerful—but dismissible as anecdotal. “That’s just a few playlists.” “Those are disgruntled employees.” “Individual cases don’t prove systematic behavior.”
Computational methods eliminate that escape. When data analysis shows power law concentration (top 1% of curators controlling 54% of reach), that’s not interpretation. When statistical tests show retention periods clustering at exact 7-day intervals (p<0.001), that’s not speculation. When artist verification reveals 90% of “Peaceful Piano” tracks come from fabricated identities, that’s not journalism—it’s forensic accounting.
The Michael Smith case proves why automation is necessary. Smith ran his fraud for years using bots to stream AI-generated music from 10,000+ fake accounts. Spotify’s human fraud team missed it. The company only caught him after the FBI investigation was already underway—and even then, only because Smith got greedy (streaming billions of times monthly, impossible to miss). Sophisticated fraud doesn’t announce itself. It looks like optimized platform participation. Only algorithmic auditing at scale can detect it.
This is computational skepticism as civic infrastructure: using data science not to optimize extraction but to expose it. Using algorithms not to replace human judgment but to audit systems that claim algorithmic neutrality while systematically favoring corporate interests. Using automation not to generate more content but to verify what’s real.
Spotify built bots to replace musicians (ghost artists, AI-generated mood music, algorithmic playlist stuffing). The response can’t be more journalism. Journalism documented the crime. What’s needed now is measurement—quantifying prevalence, identifying perpetrators, calculating damages, providing evidence courts and regulators can act on.
Evil bots steal. Good bots count what was stolen and identify who took it.
What Musinique Is Actually Building
Not: A better SubmitHub (playlist pitching service for desperate artists)
Not: A fair alternative to Spotify (mathematically impossible—$10/month can’t support millions of artists)
Not: Tools for navigating a broken system (monetizing artist desperation)
But: Research infrastructure exposing and measuring exploitation
Immediate Release (This Week):
Curators.csv (25,000 curators, CC-BY open license): Contact information, reach metrics, Focus Scores, corporate flags (Filtr US = Sony, Digster = Universal). Proves major label playlist operations dominate ecosystem. Free on GitHub, permanent archive on Zenodo.
Power Law Analysis: Top 1% control 54% of reach. “Democratization” claim is empirically false. Corporate curators (major label playlist brands) sit at top of rankings. This is the wealth inequality of playlist curation, quantified.
Near-Term Research (3-6 Months):
PFC Detection Analysis: Quantify ghost artist prevalence across all mood playlists. Expected finding: 40-60% of chill/sleep/focus playlists contain majority fabricated artists. Calculate displaced revenue (€X million annually). Identify production company networks (Firefly, Epidemic connections). Evidence package for FTC investigation.
Payola Pattern Detection: Statistical analysis of retention periods reveals pay-for-play schemes. Expected finding: 15-25% of playlists with submission forms show exact-interval clustering (7/14/30-day cycles). Estimate weekly payola market (€Y million). Evidence for FTC Section 5 enforcement (deceptive trade practices).
Algorithmic Bias Study: Why do algorithmically optimized tracks outperform artistically ambitious music? Compare “viral unknowns” (>1M streams, <50K artist followers) vs “great unknowns” (<10K streams, critical acclaim). Test Pelly’s hypothesis: mood playlist placement predicts virality independent of musical quality. Expected finding: streaming optimization systematically biases against complexity, originality, cultural specificity.
Long-Term Infrastructure (6-12 Months):
Alternative Music Infrastructure Map: Database of non-Spotify pathways. Library streaming programs (50+ cities, flat licensing fees, no data extraction). Cooperative platforms (Catalytic Sound, Resonate, Ampled—democratic governance, fair payment). Public funding opportunities (Ireland’s €325/week basic income, France’s intermittence du spectacle, state arts council grants). Independent radio contacts (college stations, community radio, actual human DJs). Music journalism (newsletters, blogs, zines still operating).
This becomes the actual solution. Not “pitch better on Spotify.” But “here are 200 ways to exit Spotify entirely.”
The Technical Reality: What Actually Works Right Now
The operational components aren’t theoretical. They’re running code, processing real data, generating forensic evidence.
Multi-Process Playlist Validator: Sixteen parallel headless browsers verifying playlist liveness. Simulates human behavior (random mouse movements, variable scrolling, realistic wait times), masks browser fingerprints, checks for Spotify error messages plus content indicators. Processes 1,000 URLs in 20-30 minutes. Detects the 30% decay rate (playlists deleted, links broken) over six-month periods. Weekly re-validation combats data entropy. This is infrastructure for maintaining dataset freshness—prerequisite for longitudinal fraud detection.
LangGraph Contact Discovery Agent: Automated research pipeline extracting curator social media handles and submission forms Spotify’s API deliberately omits. Google searches curator names, scrapes potential websites, extracts Instagram/Twitter/Facebook/submission portals, verifies matches through music-related context requirements. Success rate: ~80% for curators with public presence. Rate-limited to ~20 curators hourly (SerpAPI constraints, Gemini API quotas, manual delays preventing detection). Already enriched 84 curators. Proves contact aggregation is automatable—the “gatekeeper to the gatekeeper” information asymmetry Pelly documented can be eliminated through systematic intelligence gathering.
Spotify API Data Collector: Asynchronous pipeline fetching complete metadata for curators and their catalogs. For each curator: all playlists via pagination, all tracks from all playlists, all artist details batched efficiently. Collects 5,000+ Spotify sub-genres, maps them to 18 primary categories for coherent analysis. Handles rate limits (0.15-second delays), retries with exponential backoff (refreshing tokens on 401s, honoring Retry-After headers on 429s), and processes thousands of playlists daily. This is the data foundation—without complete track listings and artist metadata, fraud detection is impossible.
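A hedged sketch of the retry-and-paginate core using aiohttp—`refresh_token` is a placeholder stub, though the endpoint path, the `next` cursor, and the Retry-After header are real Spotify Web API behavior:

```python
import asyncio
import aiohttp

API = "https://api.spotify.com/v1"

async def refresh_token() -> str:
    # Placeholder: the real client-credentials OAuth refresh goes here.
    raise NotImplementedError

async def get_json(session: aiohttp.ClientSession, url: str, token: str,
                   max_retries: int = 5) -> dict:
    """GET with the retry behavior described above: refresh on 401,
    honor Retry-After on 429, exponential backoff on anything else."""
    backoff = 1.0
    for _ in range(max_retries):
        headers = {"Authorization": f"Bearer {token}"}
        async with session.get(url, headers=headers) as resp:
            if resp.status == 200:
                return await resp.json()
            if resp.status == 401:
                token = await refresh_token()
            elif resp.status == 429:
                await asyncio.sleep(int(resp.headers.get("Retry-After", "1")))
            else:
                await asyncio.sleep(backoff)
                backoff *= 2
    raise RuntimeError(f"gave up on {url}")

async def all_playlist_tracks(session: aiohttp.ClientSession,
                              playlist_id: str, token: str) -> list[dict]:
    """Walk Spotify's pagination (`next` links) to collect every track."""
    url = f"{API}/playlists/{playlist_id}/tracks?limit=100"
    items: list[dict] = []
    while url:
        page = await get_json(session, url, token)
        items.extend(page["items"])
        url = page.get("next")      # None once the catalog is exhausted
        await asyncio.sleep(0.15)   # the 0.15-second courtesy delay
    return items
```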
Focus Score Calculation Engine: The mathematical formula measuring playlist quality through genre coherence:
Focus Score = 0.45×(Genre Breadth) + 0.30×(Genre Density) + 0.25×(Artist Repetition)
Where Genre Breadth penalizes covering 10+ genres (real curators specialize), Genre Density rewards deep catalogs (200 tracks in two genres scores higher than 100 tracks across fifteen), and Artist Repetition rewards showcasing consistent sound (same artists appearing multiple times indicates curation, not random dumping).
Empirical validation from initial dataset: scores below 40 flag suspected bot farms (high followers, genre chaos, no external presence). Scores above 70 predict human curation (focused communities, regular updates, verifiable identities). Peter Ries Music: 94.66 focus score, 78 playlists, 12 genres covered—genuine metal curator. Filtr US: 33.78 focus score, 96 playlists, 17 genres covered—Sony’s promotional vehicle masquerading as independent tastemaker.
This is what’s operational. Not complete—missing the fraud detection layers that transform this from “playlist database” into “Consumer Reports for streaming exploitation.” But sufficient to prove the approach works.
What’s Missing: The Forensic Layers That Matter
Z-Score Growth Monitor (Critical—In Development): Detects bot injection through statistical process control. Monitors daily follower counts, calculates Z-scores against genre-specific baselines (Jazz growth rate differs from Viral Pop), flags vertical spikes (Z > 3.0 = bot farm activity with 99.7% statistical confidence). Current blocker: requires 90 days of historical data for baseline calculation. Daily snapshots launched February 12. Will have validity May 15. Then we scan 5.8 million playlists, identify which show growth inconsistent with organic discovery, estimate scale of stream fraud platform-wide.
Churn Pattern Analyzer (Critical—Algorithm Ready, Waiting for Data): Detects payola through exact-interval retention clustering. Compares weekly snapshots, calculates days each track remained on playlist before removal, builds retention histograms, tests for clustering around 7/14/30-day periods. If 30%+ of removals occur at exactly seven days, that’s not curation—that’s weekly paid slots. Statistical test: chi-square goodness-of-fit. Null hypothesis: retention periods normally distributed. Rejection indicates mechanical replacement. Implementation: 2-3 days once temporal data sufficient.
PFC Ghost Artist Detector (Highest Impact—Designed, Not Implemented): The analysis that matters most. For every artist on every mood playlist: verify web presence through Google/Wikipedia/Instagram APIs, cross-reference labels against known PFC providers (Firefly, Epidemic, Hush Hush, Cat Farm, Queen Street, Mind Stream, Slumber Group, Audio Network), scan bios for fabrication patterns (NLP classifier trained on Pelly’s examples: “classically trained,” “conservatory,” “limited edition cassettes,” “joined the [genre] crew”). Calculate PFC probability per artist (0 = verified human, 1.0 = definitely ghost). Aggregate to playlist level. Flag where ghost percentage exceeds 50% or 90%.
Expected deliverable: “We analyzed 5.8 million playlists. X% of mood playlists contain majority ghost artists. Y billion monthly streams go to fabricated musicians. Estimated €Z million annual displacement of independent artist revenue.” This becomes evidence for regulatory action, journalism follow-up, artist organizing.
Current blocker: 290 million artist verifications required (5.8M playlists × 50 average artists). Even at 1 million API calls daily, that’s 290 days. Solution: strategic sampling (mood playlists first—highest expected PFC concentration), distributed processing, institutional collaboration (academic research API access). This is the work. This is what proves Pelly’s reporting at scale.
Semantic Alignment Auditor (High Priority—Ready to Implement): Detects playlist stuffing through title/description mismatch. Uses Sentence-BERT embeddings: encode playlist description (“Chill Lofi Beats for Studying”), encode actual genres from track analysis, calculate cosine similarity. If similarity drops below 0.3, flag as deceptive labeling. Example: playlist titled “Peaceful Morning Meditation” containing Death Metal. The mismatch is measurable. Implementation: 2-3 days (sentence-transformers library, straightforward logic). Waiting on prioritization.
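The logic really is a few lines. A sketch—the MiniLM checkpoint is my assumption; the design above only specifies Sentence-BERT:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed checkpoint

def semantic_alignment(description: str, track_genres: list[str]) -> float:
    """Cosine similarity between what a playlist claims and what it plays."""
    desc_vec = model.encode(description, convert_to_tensor=True)
    genre_vec = model.encode(", ".join(track_genres), convert_to_tensor=True)
    return util.cos_sim(desc_vec, genre_vec).item()

# The example from the text: meditation branding over death metal content.
if semantic_alignment("Peaceful Morning Meditation",
                      ["death metal", "grindcore", "thrash metal"]) < 0.3:
    print("flag: deceptive labeling")
```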
Ellipsoid Diversity Metric (Medium Priority—Research Required): Quantifies “sonic chaos” through multidimensional feature space analysis. Model playlists as ellipsoids using Spotify’s audio features (energy, valence, danceability, tempo, acousticness, instrumentalness, speechiness). Calculate volume. Human-curated playlists cluster tightly (small ellipsoid = focused sound). Bot farms scatter randomly (large ellipsoid = accepts anything). Research shows human playlists are five orders of magnitude smaller than random sampling. Implementation challenge: 580 million audio feature API calls needed (5.8M playlists × 100 average tracks). Timeline: 1-2 weeks with batch processing and caching. Prerequisite for sonic coherence auditing.
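A sketch of the volume calculation, done in log space so those five orders of magnitude don’t underflow—it assumes the seven features arrive pre-normalized as one matrix row per track:

```python
import numpy as np
from scipy.special import gammaln

FEATURES = ["energy", "valence", "danceability", "tempo",
            "acousticness", "instrumentalness", "speechiness"]

def log_ellipsoid_volume(feature_matrix: np.ndarray) -> float:
    """Log-volume of the covariance ellipsoid for one playlist.

    feature_matrix: shape (n_tracks, 7), one column per audio feature
    above, each rescaled to 0-1 (tempo included).
    """
    d = feature_matrix.shape[1]
    cov = np.cov(feature_matrix, rowvar=False)
    _, logdet = np.linalg.slogdet(cov)
    # Unit d-ball volume is pi^(d/2) / Gamma(d/2 + 1); the ellipsoid
    # scales that by sqrt(det(covariance)).
    log_unit_ball = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)
    return log_unit_ball + 0.5 * logdet

# Smaller log-volume = tighter sonic cluster = more likely human curation.
```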
The Actual Contribution: Evidence for Transformation
Pelly’s Mood Machine documented that Spotify’s founding mythology is false (ad-tech entrepreneurs seeking traffic, not saving music from piracy), that major labels designed streaming for their benefit (equity stakes, guaranteed minimums, privileged terms), that Perfect Fit Content program systematically replaces artists with ghosts (€61.4M profit, 100+ playlists compromised), that algorithmic personalization optimizes for engagement not discovery (session extension metrics, not musical diversity), and that Discovery Mode functions as undisclosed payola (30% royalty cuts for algorithmic promotion, no user labeling).
What she couldn’t prove: exact prevalence (how many playlists total?), exact scale (what percentage of streams?), exact mechanisms (which detection methods work?), exact alternatives (what infrastructure is needed?).
That’s what computational analysis provides. Not better storytelling. Quantitative measurement enabling regulatory action, legislative proposals, cooperative organizing, and infrastructure development.
The Living Wage for Musicians Act (introduced March 2024 by Rep. Rashida Tlaib) proposes new royalty stream paid directly to artists, bypassing labels and platforms. It needs evidence showing current system’s inadequacy. Ghost artist displacement calculations provide that evidence: “If Y billion monthly streams go to fabrications, and pro-rata means this directly reduces independent artist payments by €Z million annually, here’s the concrete harm requiring legislative remedy.”
The Federal Trade Commission could investigate Discovery Mode as digital payola under Section 5 (deceptive trade practices). It needs proof of scale and consumer deception. Payola detection showing X% of playlists accept undisclosed payments provides that proof: “Listeners have no way to know they’re hearing paid placements, warping perceived popularity, exactly what radio payola prohibitions addressed.”
United Musicians and Allied Workers organized protests in 32 cities demanding transparency, fair payment, user-centric royalties. They need organizing tools and advocacy data. The corporate curator dominance analysis provides that infrastructure: “Major label playlist operations control 54% of curator ecosystem reach while operating as ostensibly independent brands—this is the information asymmetry requiring transparency mandates.”
Cooperative platforms (Catalytic Sound, Resonate) demonstrate alternatives work at small scale but face adoption challenges. They need evidence that niche-focused models are economically viable. The Focus Score analysis identifying 200+ genuine jazz curators (score >70, <50 playlists, jazz-focused, independently operated) provides recruitment targets: “Here are the humans already doing curatorial work on Spotify for free. Contact them. Offer cooperative ownership. Build the jazz streaming collective using them as founding editorial board.”
The Question This Forces
Pelly documented the theft. The question she left: What do we do about it?
The reformist answer: Living Wage Act (direct artist payments), FTC enforcement (ban digital payola), GDPR privacy laws (limit surveillance), user-centric payments (your subscription supports artists you actually hear). Achievable through legislation, regulation, organizing. Band-aids on a system designed to extract.
The abolitionist answer: Cooperative platforms (artist-owned, democratically governed), library streaming (public funding, local focus, flat fees), public arts support (Ireland’s basic income model, France’s intermittence du spectacle). Requires reimagining digital infrastructure, treating culture as public good, rejecting venture capital ownership entirely. Radical but microscopic—Catalytic Sound serves 30 artists, library streaming reaches tens of thousands. Spotify serves 615 million.
The computational skepticism answer: Build the evidence base that makes either path possible. Expose exploitation through measurement. Map alternatives through systematic documentation. Create tools that make fraud impossible to sustain because it becomes visible, quantifiable, and prosecutable.
Journalists can document that Perfect Fit Content exists. Data scientists can measure exactly how many playlists are compromised and exactly how much money was stolen. Organizers can demand change. Regulators can enforce accountability. Artists can choose exits. But only if the evidence exists in forms power can’t dismiss.
This is why Musinique isn’t selling playlist contacts to desperate independent artists. That’s profiting from the system Pelly exposed. This is building forensic infrastructure that makes the theft measurable, the fraud detectable, the alternatives mappable. Then releasing it—data, code, methodology, findings—as public good. Creative Commons license. GitHub repository. Zenodo archive. Permanent, citable, reproducible.
When Pelly writes “100+ playlists over 90% ghost artists,” power can say “just 100.” When Musinique calculates “X% of all mood playlists contain majority fabrications, representing Y billion monthly streams and €Z million displaced revenue,” power must respond to evidence.
When Pelly documents Discovery Mode functions as payola, power can say “no proof of scale.” When churn analysis shows “15-25% of playlists with submission forms demonstrate exact-interval retention clustering statistically inconsistent with organic curation (p<0.001),” that’s not opinion. That’s forensic proof.
When Pelly argues streaming systematically favors background-optimized content over artistically ambitious music, power can say “subjective taste.” When statistical modeling shows “mood playlist placement predicts virality independent of musical complexity after controlling for genre, release date, and artist followers,” that’s not critique. That’s quantified bias.
The Only Way This Works
Computational methods alone can’t achieve justice. They provide evidence making justice possible. The path requires:
Release the data (public good, not proprietary product): Curators.csv, playlist analysis, ghost artist detection results, payola findings. Creative Commons licensed. Anyone can use commercially, must credit source, derivative works encouraged. This builds scientific credibility (peer review, validation, collaboration) and movement infrastructure (artists organize around shared evidence, journalists cite findings, regulators reference studies).
Run the research (academic rigor, not corporate metrics): Publish in peer-reviewed journals (Cultural Analytics, New Media & Society, First Monday). Present at conferences (ISMIR, Web Conference, music industry events). Submit to regulatory bodies (FTC complaints, Congressional testimony). Co-author with journalists (Pelly, David Turner, Cherie Hu—quantitative follow-up to qualitative documentation). This establishes authority and creates citeable evidence base.
Build the alternatives (cooperative infrastructure, not corporate replacement): Not another streaming platform competing with Spotify (that failed—Resonate went on hiatus in 2024). But infrastructure-as-service for niche communities. Open-source streaming toolkit (audio delivery, payment processing, governance tools, discovery interfaces). License cooperatively—jazz collective deploys it, ambient archive customizes it, local library adapts it. Thirty specialized platforms serving their communities, not one mega-platform serving capital.
Validate what works (measurement, not mythology): Does playlist placement lead to sustainable careers? (Track artists longitudinally: streams, followers, show attendance, merch sales.) Do cooperative models generate fair income? (Compare Catalytic Sound’s equal distribution vs Spotify’s pro-rata.) Does public funding support artistic practice? (Ireland pilot shows decreased anxiety, more hours on creative work.) Evidence-based policy requires evidence.
This is computational skepticism as public service: data science not for optimization but for accountability, algorithms not for replacement but for protection, automation not for content generation but for fraud detection. The opposite of what Spotify built.
What Gets Built Tomorrow
The playlist database was written on a whim. A couple of hours of curiosity about why mediocre music gets millions of streams. Got bored. Read Pelly’s book. Got angry. Saw the connection: the data I’d collected on a lark could quantify the theft she’d documented through years of investigation.
Tomorrow I don’t build better tools for pitching to Spotify curators. Soon volunteers from Humanitarians AI (https://www.humanitarians.ai/) will join the project to finish the PFC detection pipeline. Tomorrow I calculate exactly how many playlists are ghost-artist operations. Soon we’ll measure the displaced revenue in euros, not emotions. Soon we’ll provide regulators with evidence courts can’t ignore.
The only way to combat evil bots is with good bots. Spotify built automation to replace musicians. The response is automation to expose that replacement, measure its scale, identify its perpetrators, calculate its damages, and build the infrastructure that makes alternatives viable.
Not friendly competition. Not incremental reform. Not working within the system hoping it improves. But forensic accounting of systematic fraud, followed by blueprint for cooperative reconstruction, backed by computational evidence making denial impossible.
Pelly documented the crime. Musinique measures it. Then we build what comes next.
The ghost artists are already here. The detection system launches in 87 days when the statistical baselines achieve validity. Then we count exactly how many ghosts Spotify created, exactly how much they stole, and exactly who profited.
You’re still streaming. The playlist is still lying. The artist is still erased. But the evidence is compiling. And the algorithm doesn’t forgive.
Tags: Spotify fraud detection infrastructure, computational ghost artist analysis, streaming platform accountability research, Perfect Fit Content quantification, music industry forensic data science