Algorithm Methodology
A transparent breakdown of the computational models used to derive player ratings, rarities, and historical confidence scores from structured open-source data.
1. Data Ingestion & Preprocessing
The osu!cards rating system is built on a unified dataset parsed from historical tournament spreadsheets spanning multiple eras. The ingestion pipeline (`pipeline.py`) standardizes the diverse sheet formats into a single structured CSV, yielding the following baseline metrics per player:
- Average Accuracy (`avg_accuracy`): The mean accuracy across all valid played maps.
- Match Wins (`total_wins`): The aggregate number of series victories.
- Volume Metrics: Total maps played (`total_maps`) and MVPs awarded (`total_mvps`).
- Temporal Footprint: Unique years participated (`appearances`), denoting continuous relevance.
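As a rough illustration of what consuming the structured CSV might look like, the sketch below parses one row per player into a typed record. The `player` column, the `PlayerBaseline` class, the `load_baselines` helper, and the sample row are all hypothetical; only the metric names come from the list above, and the 0-100 accuracy scale is an assumption.

```python
import csv
import io
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PlayerBaseline:
    """Baseline metrics for one player (metric names follow the list above)."""
    player: str          # hypothetical identifier column
    avg_accuracy: float  # assumed to be a percentage on a 0-100 scale
    total_wins: int
    total_maps: int
    total_mvps: int
    appearances: int


def load_baselines(csv_text: str) -> List[PlayerBaseline]:
    """Parse CSV text with one row per player into typed records."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        PlayerBaseline(
            player=row["player"],
            avg_accuracy=float(row["avg_accuracy"]),
            total_wins=int(row["total_wins"]),
            total_maps=int(row["total_maps"]),
            total_mvps=int(row["total_mvps"]),
            appearances=int(row["appearances"]),
        )
        for row in rows
    ]


# Made-up sample row, for illustration only (not real tournament data).
SAMPLE_CSV = (
    "player,avg_accuracy,total_wins,total_maps,total_mvps,appearances\n"
    "example_player,97.5,12,80,5,4\n"
)
baselines = load_baselines(SAMPLE_CSV)
```

Converting each field explicitly means a malformed cell raises an error at load time rather than surfacing later as a skewed rating.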
2. Core Characteristic Derivation
The computational model maps raw volume and performance metrics into five distinct, normalized characteristic vectors. Calculated via `card_metrics.py`, each vector serves a specific analytical purpose:
Power (POW)
Represents the player's performance ceiling and direct impact on match outcomes.
POW = (avg_accuracy * 0.70) + (total_wins * 2.00)
Consistency (CON)
Reflects sustained baseline mechanical performance over large map volumes.
CON = (avg_accuracy * 0.80) + (total_maps * 0.05)
Clutch (CLT)
Measures the ability to deliver match-winning performances under pressure.
CLT = (total_mvps * 3.00) + (total_wins * 1.50)
Experience (EXP)
Focuses on sheer exposure to competitive tension and bracket survival.
EXP = (appearances * 10.00) + (total_maps * 0.10)
Legacy (LEG)
Quantifies the span and footprint of a player's career across eras.
LEG = (appearances * 8.00) + (last_year - first_year + 1)
3. Overall Coefficient (OVR) & Confidence
The Overall Rating (OVR) is a weighted linear combination of the five characteristics, governed by `rating_policy.py`. The weights sum to 1.00 and none exceeds 0.27, which limits how far any single characteristic (e.g., pure longevity) can inflate a player's perceived skill band:
OVR = (POW * 0.27) + (CON * 0.23) + (CLT * 0.17) + (EXP * 0.13) + (LEG * 0.20)
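The five characteristic formulas and the OVR weighting can be sketched together as follows. The coefficients are taken directly from the formulas in this document; the `Characteristics` class, the function names, and the sample inputs are hypothetical, and the sketch assumes `avg_accuracy` is a percentage on a 0-100 scale. It does not reproduce the actual contents of `card_metrics.py` or `rating_policy.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Characteristics:
    pow: float
    con: float
    clt: float
    exp: float
    leg: float


def derive_characteristics(
    avg_accuracy: float,
    total_wins: int,
    total_maps: int,
    total_mvps: int,
    appearances: int,
    first_year: int,
    last_year: int,
) -> Characteristics:
    """Apply the five documented formulas to one player's baseline metrics."""
    return Characteristics(
        pow=avg_accuracy * 0.70 + total_wins * 2.00,
        con=avg_accuracy * 0.80 + total_maps * 0.05,
        clt=total_mvps * 3.00 + total_wins * 1.50,
        exp=appearances * 10.00 + total_maps * 0.10,
        leg=appearances * 8.00 + (last_year - first_year + 1),
    )


# Weights from the OVR formula; they sum to 1.00.
OVR_WEIGHTS = {"pow": 0.27, "con": 0.23, "clt": 0.17, "exp": 0.13, "leg": 0.20}


def overall_rating(c: Characteristics) -> float:
    """Weighted linear combination of the five characteristics."""
    return sum(getattr(c, name) * weight for name, weight in OVR_WEIGHTS.items())


# Made-up sample inputs, for illustration only.
sample = derive_characteristics(
    avg_accuracy=97.5, total_wins=12, total_maps=80, total_mvps=5,
    appearances=4, first_year=2019, last_year=2022,
)
sample_ovr = overall_rating(sample)
```

For these sample inputs the characteristics come out to POW 92.25, CON 82.0, CLT 33.0, EXP 48.0, and LEG 36.0, giving an OVR of about 62.82 before any bonus or cap.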
Cross-Era Coefficients: To counteract statistical devaluation over multiple eras, a static scalar bonus is applied to veterans:
+2.50 OVR for ≥5 active years, and +1.00 OVR for ≥3 active years.
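A minimal sketch of the veteran bonus is shown below. It assumes the two tiers do not stack (a player with five or more active years receives +2.50, not +3.50) and that "active years" corresponds to the `appearances` metric; neither point is stated explicitly in this document. The function name `veteran_bonus` is hypothetical.

```python
def veteran_bonus(active_years: int) -> float:
    """Return the static OVR bonus for a player's number of active years.

    Assumption: tiers are exclusive, so the higher tier replaces the lower one.
    """
    if active_years >= 5:
        return 2.50
    if active_years >= 3:
        return 1.00
    return 0.0
```

Checking the higher threshold first keeps the tiers exclusive; if the bonuses were meant to be cumulative, the two checks would instead be added together.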
Data Quality Decay & Caps
Because historical eras suffer from fragmented recording, every record carries a confidence score (`source_quality`). Records below full confidence, including data ingested through fallback channels or manual review, have their OVR capped so that unreliable data cannot produce an inflated rating:
- Full Real (1.00 Confidence): Uncapped ceiling.
- Partial Real (0.90 Confidence): Capped at OVR 89.9.
- Fallback Controlled (0.78 Confidence): Capped at OVR 84.9.
- Manual Review (0.72 Confidence): Capped at OVR 79.9.
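The caps above could be applied with a simple lookup, as in the sketch below. The confidence values and caps are those listed; the tier keys (`"full_real"`, etc.), the `QUALITY_TIERS` table, and the `apply_quality_cap` function are hypothetical names chosen for illustration.

```python
from typing import Dict, Optional, Tuple

# tier key -> (source_quality confidence, OVR cap or None for uncapped)
QUALITY_TIERS: Dict[str, Tuple[float, Optional[float]]] = {
    "full_real": (1.00, None),
    "partial_real": (0.90, 89.9),
    "fallback_controlled": (0.78, 84.9),
    "manual_review": (0.72, 79.9),
}


def apply_quality_cap(ovr: float, tier: str) -> float:
    """Clamp an OVR to the ceiling allowed for its data-quality tier."""
    _confidence, cap = QUALITY_TIERS[tier]
    if cap is None:
        return ovr
    return min(ovr, cap)
```

Under this sketch, an OVR of 95.0 from a manually reviewed record would be reduced to 79.9, while a value already below the cap passes through unchanged.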
4. Rarity Classification & Prestige Floors
The final computed OVR is projected through a tiered threshold algorithm to determine the card's rarity class:
Prestige Exceptions: The engine allows manual overrides via the `MANUAL_PRESTIGE_FLOORS` mapping. Historically transcendent players, whose true impact was not fully captured by surviving spreadsheets, are assigned a minimum rarity: they cannot drop below an Epic or Legendary floor regardless of statistical degradation.
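The interaction between threshold-based rarity and a prestige floor might look like the sketch below. The actual thresholds and the full list of rarity classes are not given here, so `RARITY_ORDER` and `PLACEHOLDER_THRESHOLDS` are invented placeholders, not the project's real values. The shape of `MANUAL_PRESTIGE_FLOORS` (player name mapped to a floor rarity) and the player names in it are also assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical rarity ladder, lowest to highest. Only "Epic" and "Legendary"
# are named in the text; the other classes are placeholders.
RARITY_ORDER: List[str] = ["Common", "Rare", "Epic", "Legendary"]

# Placeholder (minimum OVR, rarity) pairs, highest first. NOT the real thresholds.
PLACEHOLDER_THRESHOLDS: List[Tuple[float, str]] = [
    (90.0, "Legendary"),
    (80.0, "Epic"),
    (65.0, "Rare"),
]

# Assumed shape of the override mapping; the entries are made up.
MANUAL_PRESTIGE_FLOORS: Dict[str, str] = {
    "example_legend": "Legendary",
    "example_veteran": "Epic",
}


def rarity_from_ovr(ovr: float) -> str:
    """Return the first rarity whose minimum OVR the rating meets."""
    for min_ovr, rarity in PLACEHOLDER_THRESHOLDS:
        if ovr >= min_ovr:
            return rarity
    return RARITY_ORDER[0]


def final_rarity(player: str, ovr: float) -> str:
    """Apply the prestige floor, if any, on top of the computed rarity."""
    computed = rarity_from_ovr(ovr)
    floor = MANUAL_PRESTIGE_FLOORS.get(player)
    if floor is None:
        return computed
    # Keep whichever of the two rarities ranks higher on the ladder.
    return max(computed, floor, key=RARITY_ORDER.index)
```

A floor only ever raises a rarity: a listed player whose computed rarity already exceeds the floor keeps the computed value.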