Club Sentiment

Measuring football fan sentiment worldwide


Methodology

How Club Sentiment Measures Fan Mood

Club Sentiment is an independent football analytics project that measures the emotional state of a club’s public supporter ecosystem. Each day, the platform gathers supporter discussion, football media coverage, and public reaction across several channels, filters that evidence for club relevance, and converts it into a structured sentiment score from 1 to 100.

The system is designed to answer a specific question: how do supporters feel about the club right now? It does not attempt to rate squad quality, predict results, or judge tactical strength. It is a measurement of supporter mood as expressed in public football conversation.

Institutional principle

Club Sentiment treats fan sentiment as a measurable football signal. The objective is not to chase viral noise, but to transform public supporter discussion into a consistent, explainable daily index that can be compared across clubs, leagues, and time.

1. What the score represents

Every club receives a daily score on a 1-100 scale. Higher values indicate a more positive supporter environment, while lower values indicate more frustration, pressure, pessimism, or emotional instability around the club.

The score is intended to measure the aggregate mood of publicly visible supporter discussion. That mood is shaped by results, injuries, transfers, manager pressure, media narratives, and broader club context. Because football fandom is emotional and often reactive, the system is built to preserve genuine swings in mood while reducing distortion from off-topic content, ambiguity, spam, and short-lived noise.

Range	Label	Interpretation
96-100	Ecstatic	Peak euphoria, title-level energy, historic result energy, or overwhelming optimism.
91-95	Electric	Supporter mood is intensely positive and emotionally charged.
86-90	Euphoric	Major positivity, broad unity, strong confidence.
81-85	Surging	Strong upward momentum in fan mood.
76-80	Buzzing	Clearly positive atmosphere with visible excitement.
71-75	Very Positive	Fans are confident, encouraged, and generally pleased.
66-70	Confident	Supporters are in a healthy positive state.
61-65	Encouraged	More positive than negative, but not emotionally explosive.
56-60	Optimistic	Positive lean with remaining uncertainty.
51-55	Leaning Positive	Slightly favorable balance of public mood.
46-50	Mixed	Divided reaction, balanced between positivity and frustration.
41-45	Uneasy	Confidence is weakening; unease is visible.
36-40	Frustrated	Negative feeling is established and sustained.
31-35	Disappointed	Fan confidence is deteriorating in a visible way.
26-30	Angry	Public supporter mood is sharply negative.
21-25	Toxic	Very negative discourse, often featuring blame and pressure.
16-20	Miserable	Deeply pessimistic atmosphere around the club.
11-15	Meltdown	Severe instability in supporter sentiment.
1-10	In Crisis	Extreme negativity or public emotional collapse.
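As an illustration, the bands in the table above can be looked up with a small helper. This is a minimal sketch: the function name `score_label` is hypothetical, and only the ranges and labels come from the table.

```python
import bisect

# Lower bound of each band and its label, copied from the table above.
_LOWER_BOUNDS = [1, 11, 16, 21, 26, 31, 36, 41, 46, 51,
                 56, 61, 66, 71, 76, 81, 86, 91, 96]
_LABELS = ["In Crisis", "Meltdown", "Miserable", "Toxic", "Angry",
           "Disappointed", "Frustrated", "Uneasy", "Mixed", "Leaning Positive",
           "Optimistic", "Encouraged", "Confident", "Very Positive", "Buzzing",
           "Surging", "Euphoric", "Electric", "Ecstatic"]


def score_label(score: int) -> str:
    """Return the label of the band that contains a 1-100 score."""
    if not 1 <= score <= 100:
        raise ValueError("score must be between 1 and 100")
    # bisect_right counts how many lower bounds are <= score.
    return _LABELS[bisect.bisect_right(_LOWER_BOUNDS, score) - 1]
```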

2. Full daily flow

For each club, Club Sentiment runs a daily evidence pipeline. The process is designed to be source-aware, relevance-aware, and transparent enough to support historical comparison rather than only one-off daily reactions.

Step 1 — Team context

The system starts with a structured team record containing club name, slug, search name, league, and known source-specific aliases. These values shape how evidence is searched and how ambiguity is handled.

For clubs with ambiguous names, source-specific search overrides are used so the system searches for the football club rather than a city, person, product, or unrelated topic.
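A team record of this kind might be modelled roughly as follows. The class name, field names, and the example club are all hypothetical; the source only states which kinds of values the record holds.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TeamRecord:
    """Hypothetical shape of the structured team record described above."""
    name: str
    slug: str
    search_name: str
    league: str
    # Source-specific aliases, e.g. {"reddit": ["riversidetownfc"]}.
    aliases: dict[str, list[str]] = field(default_factory=dict)
    # Source-specific search overrides for ambiguous club names.
    search_overrides: dict[str, str] = field(default_factory=dict)

    def query_for(self, source: str) -> str:
        """Use the override for this source if one exists, else the search name."""
        return self.search_overrides.get(source, self.search_name)
```

With a record like this, an ambiguous club could carry an override such as `{"news": '"Riverside Town" football'}` while every other source falls back to the plain search name.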

Step 2 — Parallel evidence collection

The platform then gathers evidence in parallel from four public source families: Reddit, X, YouTube, and football news.

Each source has its own search logic, relevance logic, and fallback logic. The aim is not to force identical behavior across all platforms, but to extract the most useful football signal from each environment.

Step 3 — Relevance filtering

Raw evidence is filtered using club-specific rules and, for selected ambiguous clubs, an optional Gemini-based relevance gate.

If filtering becomes too aggressive and evidence volume drops below the minimum threshold, the system relaxes filtering in stages rather than letting the club go dark.

Step 4 — Daily sentiment analysis

The filtered evidence is sent to Gemini in a structured prompt that asks the model to score several football-specific sentiment dimensions, explain the dominant themes, and produce source-level sub-scores.

The final daily reactive score is then calculated from the dimension scores rather than trusting a single loose overall number.
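One plausible way to derive a single score from several dimension scores is a weighted mean clamped to the 1-100 scale. The sketch below is an assumption, not the platform's actual formula: the source does not name the dimensions or their weights, so both are left to the caller.

```python
def reactive_score(dimensions: dict[str, float], weights: dict[str, float]) -> int:
    """Weighted mean of dimension scores, clamped to the 1-100 scale.

    Dimension names and weights are hypothetical; the source does not list them.
    """
    total_weight = sum(weights[name] for name in dimensions)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(score * weights[name] for name, score in dimensions.items())
    return max(1, min(100, round(weighted / total_weight)))
```

Computing the score from the dimensions keeps the final number explainable: each dimension's contribution can be inspected separately.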

Step 5 — Baseline trend computation

After the reactive score is produced, the system looks back over prior days and computes a baseline sentiment trend using an exponentially weighted average of earlier reactive scores.

This baseline is intentionally slower-moving than the daily score, so it can represent the club’s broader emotional trend.

Step 6 — Display score

The score shown publicly on the site is a blend of the new daily reactive score and the prior baseline trend. This creates a display score that remains responsive without becoming unstable.

Evidence volume influences how much weight is given to the reactive layer versus the baseline layer.

3. How Reddit evidence is collected

Reddit is treated as a supporter-community source rather than a generic social feed. The system uses several progressively broader mechanisms, starting with direct club communities and expanding only when necessary.

3.1 Known subreddit mapping

Club Sentiment maintains a known set of team-to-subreddit mappings for clubs with established supporter communities. When a known mapping exists, that subreddit is searched first.

Additional candidate subreddit names are also generated from the team slug, normalized club name, and common naming variations.
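Candidate generation could look something like the sketch below. The specific variations (compact name, slug without hyphens, slug with underscores, an "fc" suffix) are illustrative assumptions; the source only says that candidates come from the slug, the normalized name, and common naming variations.

```python
import re
from typing import Optional


def candidate_subreddits(name: str, slug: str, known: Optional[str] = None) -> list[str]:
    """Return candidate subreddit names, known mapping first, without duplicates."""
    compact = re.sub(r"[^a-z0-9]", "", name.lower())
    variants = [
        known,                                   # known mapping is tried first
        compact,                                 # normalized club name
        slug.replace("-", ""),                   # slug without separators
        slug.replace("-", "_"),                  # slug with underscores
        compact if compact.endswith("fc") else compact + "fc",
    ]
    seen: set[str] = set()
    ordered: list[str] = []
    for variant in variants:
        # Subreddit names are compared case-insensitively here.
        if variant and variant.lower() not in seen:
            seen.add(variant.lower())
            ordered.append(variant)
    return ordered
```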

3.2 Team subreddit RSS

The Reddit collector first reads the club subreddit’s recent RSS feed. This provides recent post titles and summaries without requiring special access credentials.

Each item is stored with metadata such as source kind, publication time, subreddit name, post id, and whether the post appears to be a match thread.
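The sketch below shows how feed entries might be turned into items with that metadata. It assumes the feed is Atom-formatted XML with `id`, `title`, and `updated` elements per entry; the dictionary keys and the match-thread pattern are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

ATOM = {"atom": "http://www.w3.org/2005/Atom"}
# Assumed heuristic: titles such as "Match Thread", "Pre-Match Thread", "Post Match Thread".
MATCH_THREAD = re.compile(r"\b(pre[- ]?match|post[- ]?match|match)\s+thread\b", re.IGNORECASE)


def parse_subreddit_feed(xml_text: str, subreddit: str) -> list[dict]:
    """Convert Atom entries into evidence items with basic metadata."""
    root = ET.fromstring(xml_text)
    items = []
    for entry in root.findall("atom:entry", ATOM):
        title = entry.findtext("atom:title", default="", namespaces=ATOM).strip()
        items.append({
            "source_kind": "reddit_rss",
            "subreddit": subreddit,
            "post_id": entry.findtext("atom:id", default="", namespaces=ATOM),
            "published": entry.findtext("atom:updated", default="", namespaces=ATOM),
            "title": title,
            "is_match_thread": bool(MATCH_THREAD.search(title)),
        })
    return items
```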

3.3 Match-thread and post-comment enrichment

After fetching the freshest posts from club subreddits, the system prioritizes likely match threads and recent team posts with commentable post ids. It then fetches top comments from a limited number of those posts.

Match threads are allowed a higher comment cap than ordinary posts because they are often the highest-value expressions of real supporter mood immediately before, during, and after matches.

3.4 Subreddit discovery

If the direct team subreddit is weak, missing, or uncertain, the system searches Reddit’s subreddit directory for communities that resemble the club’s name and football context.

Candidate subreddits are scored using their display name, title, query-term overlap, and basic community size signals.
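A scoring function along these lines might combine term overlap with a damped size signal. The weights, the log scaling, and the function name are assumptions made for illustration; the source names the inputs but not how they are combined.

```python
import math


def score_subreddit(display_name: str, title: str, subscribers: int,
                    query_terms: list[str]) -> float:
    """Score a candidate subreddit between 0 and 1 (hypothetical weighting)."""
    terms = [term.lower() for term in query_terms]
    if not terms:
        return 0.0
    name_hits = sum(term in display_name.lower() for term in terms)
    title_hits = sum(term in title.lower() for term in terms)
    # Display-name matches count double compared with title matches.
    overlap = (2 * name_hits + title_hits) / (3 * len(terms))
    # Log scaling so very large communities do not dominate; capped at 1.0.
    size = min(1.0, math.log10(subscribers + 1) / 6)
    return 0.8 * overlap + 0.2 * size
```

Weighting overlap far above size reflects the idea that a small but clearly named supporter community is a better match than a large community that only partially resembles the club's name.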

3.5 League and general-football fallback

If club-specific subreddit evidence is still thin, the system searches broader football communities such as league-level or general-football subreddits for recent posts mentioning the club.

This allows the platform to capture relevant discussion even when a club’s own subreddit is inactive or very small.

3.6 Ordering and de-duplication

Reddit evidence is bucketed into team, discovered, and fallback categories. Within those buckets, evidence is sorted by source type and freshness, then de-duplicated at the text level.

This preserves the highest-value direct supporter evidence while still allowing broader football discussion to supplement sparse clubs.
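The ordering and de-duplication described here can be sketched as a stable sort followed by a text-level filter. The bucket names come from the text above; the `kind` values, their priority order, and the `Evidence` class are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime

BUCKET_ORDER = {"team": 0, "discovered": 1, "fallback": 2}
# Assumed priority of source types within a bucket.
KIND_ORDER = {"match_thread_comment": 0, "comment": 1, "post": 2}


@dataclass
class Evidence:
    text: str
    bucket: str
    kind: str
    published: datetime


def order_and_dedupe(items: list[Evidence]) -> list[Evidence]:
    """Sort by bucket, then source type, then freshness; drop repeated text."""
    # Two stable sorts: newest first, then by bucket and kind.
    newest_first = sorted(items, key=lambda e: e.published, reverse=True)
    ranked = sorted(newest_first, key=lambda e: (BUCKET_ORDER[e.bucket], KIND_ORDER[e.kind]))
    seen: set[str] = set()
    result: list[Evidence] = []
    for item in ranked:
        key = " ".join(item.text.lower().split())  # case- and whitespace-insensitive
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result
```

Because de-duplication runs after sorting, the copy that survives is always the one from the highest-priority bucket.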

4. How X evidence is collected

X is treated as a rapid-reaction source. It is especially useful for major clubs, breaking news, and emotionally immediate public discussion after significant events.

4.1 Query construction

For selected clubs, Club Sentiment uses club-specific X query overrides that combine exact club names, major hashtags, and official or highly associated handles.

For clubs without an explicit override, the system builds a simpler search around the club name and a normalized hashtag form. Ambiguous clubs receive extra football-context terms to reduce drift.
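Query construction of this kind might be sketched as follows. The function name, the default context terms, and the example club are hypothetical; the quoted phrase, `OR`, and parenthesised grouping are used here as generic search syntax rather than a statement of the exact operators the platform sends.

```python
import re
from typing import Optional, Sequence


def build_x_query(name: str, override: Optional[str] = None, ambiguous: bool = False,
                  context_terms: Sequence[str] = ("football", "FC", "match")) -> str:
    """Build a search query: explicit override first, else name plus hashtag."""
    if override:
        return override
    # Normalized hashtag form: club name with non-alphanumerics removed.
    hashtag = "#" + re.sub(r"[^A-Za-z0-9]", "", name)
    query = f'("{name}" OR {hashtag})'
    if ambiguous:
        # Extra football context reduces drift toward unrelated topics.
        query += " (" + " OR ".join(context_terms) + ")"
    return query
```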

4.2 Recent-volume check

Before spending search calls on full result collection, the system checks the recent tweet count for the query. If the count falls below a configured threshold, X evidence is skipped for that run.

This avoids wasting usage on clubs or days where X is unlikely to add meaningful signal.

4.3 Recent search and pagination

When a topic shows enough recent activity, the system uses X’s recent-search endpoint to fetch the newest relevant posts. For high-volume clubs or major stories, a second page may be requested.

This keeps the collector efficient: it expands when the topic is busy, but remains conservative when the topic is quiet.

4.4 Ranking

X posts are not used in raw API order alone. The system ranks them using a hybrid of freshness and engagement. Extremely fresh posts are favored, while public engagement helps separate meaningful supporter reaction from low-signal chatter.

The output is then de-duplicated so repeated phrasing or near duplicates do not overwhelm the day’s sample.
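A hybrid ranking of this sort could be approximated by multiplying an exponential freshness decay by a log-damped engagement term. Everything specific in the sketch is an assumption: the six-hour half-life, the engagement weights, the dictionary keys, and the normalization used for near-duplicate detection.

```python
import math
import re
from datetime import datetime


def rank_score(post: dict, now: datetime, half_life_hours: float = 6.0) -> float:
    """Freshness decays by half every `half_life_hours`; engagement is log-damped."""
    age_hours = max(0.0, (now - post["created_at"]).total_seconds() / 3600)
    freshness = 0.5 ** (age_hours / half_life_hours)
    engagement = math.log1p(
        post.get("likes", 0) + 2 * post.get("reposts", 0) + post.get("replies", 0)
    )
    return freshness * (1.0 + engagement)


def dedupe_key(text: str) -> str:
    """Normalize text so links, handles, hashtags, case, and punctuation are ignored."""
    text = re.sub(r"https?://\S+|[@#]\w+", " ", text.lower())
    return " ".join(re.findall(r"[a-z0-9']+", text))


def rank_posts(posts: list[dict], now: datetime, limit: int = 50) -> list[dict]:
    """Rank posts by score, then keep the best-ranked copy of each near duplicate."""
    ranked = sorted(posts, key=lambda p: rank_score(p, now), reverse=True)
    seen: set[str] = set()
    kept: list[dict] = []
    for post in ranked:
        key = dedupe_key(post["text"])
        if key and key not in seen:
            seen.add(key)
            kept.append(post)
    return kept[:limit]
```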

4.5 Rate and usage awareness

The X collector monitors usage and request limits and slows down when necessary. This protects system stability and helps preserve the source as a sustainable daily signal rather than a source that works only on some days.

5. How YouTube evidence is collected

YouTube is treated as a supporter reaction layer. It is especially useful for post-match fan commentary, match reaction content, and discussion around emotionally loaded club events.

5.1 Query variants

The YouTube collector supports multiple football-specific query templates (for example match reaction, fan reaction, and post-match analysis) around the club’s search name.

In production, query expansion is currently constrained for quota control, so only a limited subset of those templates may run on a given scrape.

5.2 Video-level relevance screening

Before comments are collected, candidate videos are screened using the video title and description. The system checks whether the club is actually being discussed and whether the surrounding context is recognizably football-related.

Ambiguous club names are held to a stricter standard and must show both club identity and football context.

5.3 Comment collection

Once relevant videos are identified, the system collects top-level comments from a limited number of the most recent eligible videos. Comments are trimmed to a bounded length and de-duplicated.

YouTube is therefore not being used as a raw video crawler. It is being used as a football fan-reaction sampler.

5.4 Cost control

YouTube search calls are relatively expensive in usage terms compared with comment fetch calls, so the collector is intentionally selective. Only a subset of clubs are enabled by default for YouTube, with a focus on clubs and situations where YouTube adds the strongest signal.

6. How football news evidence is collected

News is treated as a narrative and framing source. It does not represent pure supporter voice, but it helps capture how club events are being described in public football coverage.

6.1 Query design

Club Sentiment uses a recent Google News RSS search and limits attention to the last seven days. For ambiguous clubs, football context is injected into the query itself.
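A query URL for such a search might be built as in the sketch below. The `rss/search` path, the `when:7d` operator, and the `hl`/`gl`/`ceid` parameters reflect commonly used but not formally documented Google News behaviour, so treat them as assumptions; the function name and defaults are hypothetical.

```python
from urllib.parse import urlencode


def google_news_rss_url(search_name: str, ambiguous: bool = False, days: int = 7,
                        hl: str = "en-GB", gl: str = "GB") -> str:
    """Build a Google News RSS search URL limited to the last `days` days."""
    terms = f'"{search_name}"'
    if ambiguous:
        terms += " football"          # inject football context into the query itself
    terms += f" when:{days}d"         # assumed recency operator
    params = {"q": terms, "hl": hl, "gl": gl, "ceid": f"{gl}:{hl.split('-')[0]}"}
    return "https://news.google.com/rss/search?" + urlencode(params)
```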

6.2 Headline and summary extraction

The collector reads both headlines and article summaries, strips markup, removes obvious junk, and stores the resulting snippets as candidate evidence.

6.3 Relevance and junk suppression

News evidence is filtered for football context and common low-value markers such as betting-style language, live-update pages, or clearly non-club topics.

This makes the news layer more useful as a media-tone signal rather than a broad web-search feed.

6.4 Recency ranking

News snippets are ranked with a strong recency bias. The aim is to measure the club’s current narrative environment, so fresh coverage outranks older framing that events may already have overtaken.

7. Club relevance filtering

Raw evidence alone is not enough. Many club names are ambiguous, and online discussion is full of partial matches, unrelated references, and duplicate phrases. Club Sentiment therefore uses a layered relevance system.

7.1 Rule-based relevance

For selected ambiguous clubs, the system defines strong terms, football-context terms, and excluded terms. A snippet can be retained because it strongly identifies the club, or because it mentions the club alongside football context.

Snippets that match known non-football ambiguity patterns are excluded.
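The rule layer can be sketched as three checks applied in order: exclusions, strong identification, then club mention plus football context. The class name, the separate `club_terms` field, and plain substring matching are assumptions; the source describes the term groups but not the matching mechanics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelevanceRules:
    """Hypothetical per-club rule set; all terms are expected in lowercase."""
    strong_terms: tuple[str, ...]     # unambiguous club identifiers
    club_terms: tuple[str, ...]       # ambiguous mentions of the club name
    context_terms: tuple[str, ...]    # football context
    excluded_terms: tuple[str, ...]   # known non-football ambiguity patterns


def is_relevant(snippet: str, rules: RelevanceRules) -> bool:
    """Keep a snippet if it strongly identifies the club, or mentions it in football context."""
    text = snippet.lower()
    if any(term in text for term in rules.excluded_terms):
        return False
    if any(term in text for term in rules.strong_terms):
        return True
    mentions_club = any(term in text for term in rules.club_terms)
    has_context = any(term in text for term in rules.context_terms)
    return mentions_club and has_context
```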

7.2 Optional AI relevance check

When enabled, a second layer sends batches of snippets to Gemini and asks the model to classify whether each snippet is truly about the football club in question.

This is used selectively, especially for ambiguous clubs, to keep cost under control while improving relevance quality.

7.3 Fallback when evidence is too limited

The system is intentionally designed not to collapse coverage when filtering becomes too strict. If evidence volume falls below the minimum threshold after full filtering, it retries with rules only. If evidence is still too thin, it falls back to raw source evidence.

This preserves continuity while still preferring cleaner evidence whenever possible.
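The staged fallback can be expressed as a short function that tries each filtering level in turn. This is a sketch under assumptions: the function name, the stage labels, and the default minimum of five items are hypothetical, since the source does not state the threshold.

```python
from typing import Callable, Optional

Filter = Callable[[list[str]], list[str]]


def filter_with_fallback(raw: list[str], rule_filter: Filter,
                         ai_filter: Optional[Filter] = None,
                         min_items: int = 5) -> tuple[list[str], str]:
    """Return (evidence, stage), relaxing filtering when evidence is too thin."""
    rules_only = rule_filter(raw)
    if ai_filter is not None:
        full = ai_filter(rules_only)          # stage 1: rules plus AI relevance check
        if len(full) >= min_items:
            return full, "full"
    if len(rules_only) >= min_items:          # stage 2: rules only
        return rules_only, "rules_only"
    return list(raw), "raw"                   # stage 3: raw source evidence
```

Returning the stage alongside the evidence makes it possible to record how clean the evidence was on a given day.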

8. Reactive score, baseline trend, and public display score

Club Sentiment separates immediate supporter reaction from slower trend behavior.

Reactive score

The reactive score is the day’s raw sentiment output based on that day’s evidence. It is intentionally responsive to recent matches, injuries, transfers, sack pressure, or sudden mood shifts.

Baseline score

The baseline is an exponentially weighted average of prior reactive scores over up to 30 days. Newer days carry more weight than older days, so the baseline can move, but more slowly than the daily reaction layer.
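An exponentially weighted average over a 30-day window can be sketched as below. The 30-day cap comes from the text; the seven-day half-life and the function name are assumptions, since the source does not state the decay rate.

```python
from typing import Optional, Sequence


def baseline_score(prior_scores: Sequence[float], half_life_days: float = 7.0,
                   max_days: int = 30) -> Optional[float]:
    """Exponentially weighted average of prior reactive scores (oldest first).

    The most recent day has weight 1; a day `half_life_days` older has weight 0.5.
    """
    window = list(prior_scores)[-max_days:]
    if not window:
        return None  # no history yet
    decay = 0.5 ** (1.0 / half_life_days)
    # Age 0 is the newest score, so weights rise toward the end of the window.
    weights = [decay ** age for age in range(len(window) - 1, -1, -1)]
    return sum(w * s for w, s in zip(weights, window)) / sum(weights)
```

For example, with a one-day half-life, scores of 40 then 60 give (0.5 × 40 + 1 × 60) / 1.5 ≈ 53.3, which sits closer to the newer value.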

Display score

The public score displayed on the site blends the new reactive score with the prior baseline. This allows a club’s score to respond to real emotion without becoming too unstable from a single evidence burst.
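A blend in which evidence volume shifts weight between the two layers might look like the following. The weight range (0.3 to 0.7), the 40-item saturation point, and the function name are assumptions; the source states only that evidence volume influences the weighting.

```python
from typing import Optional


def display_score(reactive: float, baseline: Optional[float], evidence_count: int,
                  full_evidence: int = 40, min_weight: float = 0.3,
                  max_weight: float = 0.7) -> int:
    """Blend the reactive score with the prior baseline, trusting the reactive
    layer more when more evidence was collected."""
    if baseline is None:
        blended = reactive  # no history yet: fall back to the reactive score
    else:
        coverage = min(1.0, max(0.0, evidence_count / full_evidence))
        reactive_weight = min_weight + (max_weight - min_weight) * coverage
        blended = reactive_weight * reactive + (1 - reactive_weight) * baseline
    return max(1, min(100, round(blended)))
```

With a reactive score of 80 and a baseline of 60, a well-evidenced day displays 74, while a day with no usable evidence volume displays 66.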

9. Persistence, history, and continuity

Every club-day result is written to storage with the full score record, including the display score, reactive score, baseline score, baseline confidence, reasoning, source counts, dominant themes, and sample snippets.

This historical record serves two purposes. First, it allows Club Sentiment to compute trend baselines over time. Second, it enables users to compare supporter mood across days, matches, and broader club cycles rather than seeing each day in isolation.

10. What the system is designed to do well

Capture short-term supporter mood after results, transfers, or major club events.
Preserve direct fan voice through club communities, match-thread comments, and reaction-heavy platforms.
Reduce noise caused by ambiguous club names and low-quality off-topic snippets.
Produce a public-facing score that is reactive enough to feel real but stable enough to be historically meaningful.
Build a longitudinal trend layer that becomes stronger as history accumulates.

11. What the system does not claim

It is not a betting model.
It is not a prediction engine.
It is not a rating of football quality.
It does not claim to see all supporter discussion everywhere.
It does not interpret every source identically; each source is used for the kind of football signal it provides best.

12. Why this methodology exists

Football is not only a sport of results. It is also a sport of emotion, expectation, pressure, belief, resentment, unity, and reaction. Those forces shape the public environment around clubs every day.

Club Sentiment exists to measure that emotional environment with as much structure, consistency, and methodological honesty as possible. The score is not meant to replace football analysis. It is meant to add a new layer to it: a disciplined measure of public supporter mood.

13. Manager Pressure Index (MPI)

MPI is a daily manager risk meter from 0 to 100. Higher values mean the pressure around a manager is rising quickly; lower values mean the situation looks stable.

It combines fan mood volatility with results context and expectation gaps, then applies a confidence adjustment so that low-evidence days do not overreact.

What moves the score

Fan sentiment stress and week-to-week momentum
Narrative inconsistency and divergence from baseline
Discussion intensity across sources
Results stress and expectation-gap stress

Quick score guide

85-100 Critical
70-84 Severe
50-69 High
25-49 Watch
0-24 Low
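The score guide above maps directly onto a lookup such as this one. The function name `mpi_band` is hypothetical; the thresholds and labels are taken from the guide.

```python
def mpi_band(mpi: float) -> str:
    """Return the quick-guide label for an MPI value between 0 and 100."""
    if not 0 <= mpi <= 100:
        raise ValueError("MPI must be between 0 and 100")
    # Bands are checked from the highest lower bound downward.
    for lower, label in ((85, "Critical"), (70, "Severe"), (50, "High"),
                         (25, "Watch"), (0, "Low")):
        if mpi >= lower:
            return label
    return "Low"  # unreachable; keeps the return type total
```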

14. Match Reaction Index (MRI) and Matches page

Club Sentiment also publishes a match-level layer called Match Reaction Index (MRI). While the daily club score tracks broader supporter mood over time, MRI focuses on how mood moves around a single fixture.

On the Matches page, each tracked team can receive an MRI card for that fixture. The score ranges from -100 to +100: positive values indicate that mood improved after the match window, and negative values indicate that it deteriorated.

How MRI is measured

MRI compares pre-match and post-match supporter mood windows around kickoff, then combines that swing with context signals such as surges in discussion volume, expectation gap, narrative concentration, and match-event intensity.

The objective is not to judge football quality. It is to measure supporter emotional reaction around that specific fixture.

Observed vs proxy windows

When intraday snapshots are available, MRI uses observed windows. If coverage is incomplete, it falls back to nearby daily sentiment as a proxy so fixtures do not disappear.

This is why some cards have stronger precision and confidence than others, especially shortly after kickoff.

Provisional and final states

MRI updates are intentionally iterative. Scheduled and live fixtures are provisional. Finished fixtures stay provisional until late post-match windows are collected; then they move to final.

This protects accuracy by avoiding premature lock-in when supporter reaction is still evolving.

How cards are ordered on /matches

Fixtures are ranked by reaction strength adjusted by confidence so the most meaningful emotional moves appear first. Within a fixture, tracked-team cards are also sorted by the same logic.
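The ordering can be sketched by treating priority as the absolute MRI value multiplied by confidence. That formula, the use of a fixture's strongest card as its rank, and all names below are assumptions; the source states only that reaction strength is adjusted by confidence.

```python
from dataclasses import dataclass


@dataclass
class MriCard:
    team: str
    mri: float         # -100 to +100
    confidence: float  # assumed to lie between 0 and 1


def card_priority(card: MriCard) -> float:
    """Reaction strength (in either direction) scaled by confidence."""
    return abs(card.mri) * card.confidence


def order_fixtures(fixtures: dict[str, list[MriCard]]) -> list[tuple[str, list[MriCard]]]:
    """Sort cards within each fixture, then fixtures by their strongest card."""
    ordered = [
        (fixture_id, sorted(cards, key=card_priority, reverse=True))
        for fixture_id, cards in fixtures.items()
    ]
    ordered.sort(
        key=lambda item: max((card_priority(c) for c in item[1]), default=0.0),
        reverse=True,
    )
    return ordered
```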

If a fixture shows “not available yet,” MRI coverage for that team-match context has not been persisted yet.

Summary

In practical terms, the flow is: gather public football discussion from Reddit, X, YouTube, and news; filter it for club relevance; analyze it with a structured Gemini prompt; compute a daily reactive score; blend that score with a historical baseline; and publish a daily index that reflects how supporters feel, not how the table looks.