1. Emotion Taxonomy
We use a fixed list of 20 emotions selected to span Russell's circumplex model of affect. Each emotion is assigned a fixed valence (negative ↔ positive), arousal (calm ↔ energetic), polarity class (positive, neutral, or negative), and weight.
| Emotion | Valence | Arousal | Polarity | Weight |
|---|---|---|---|---|
| happy | +0.85 | +0.40 | positive | 1.00 |
| excited | +0.75 | +0.80 | positive | 1.00 |
| love | +0.95 | +0.30 | positive | 1.10 |
| grateful | +0.90 | −0.10 | positive | 1.00 |
| hopeful | +0.70 | +0.20 | positive | 1.00 |
| confident | +0.70 | +0.40 | positive | 1.00 |
| calm | +0.50 | −0.60 | positive | 1.00 |
| relaxed | +0.55 | −0.55 | positive | 1.00 |
| curious | +0.40 | +0.50 | neutral | 0.90 |
| surprised | +0.10 | +0.75 | neutral | 0.80 |
| bored | −0.30 | −0.60 | neutral | 0.80 |
| tired | −0.30 | −0.70 | neutral | 0.90 |
| nervous | −0.50 | +0.50 | negative | 1.00 |
| anxious | −0.65 | +0.60 | negative | 1.00 |
| stressed | −0.70 | +0.70 | negative | 1.10 |
| frustrated | −0.70 | +0.60 | negative | 1.00 |
| angry | −0.80 | +0.80 | negative | 1.10 |
| sad | −0.75 | −0.40 | negative | 1.00 |
| lonely | −0.80 | −0.30 | negative | 1.00 |
| depressed | −0.90 | −0.50 | negative | 1.20 |
2. Data Collection
Visitors choose one emotion and submit. The system records:
- The selected emotion ID.
- Approximate location (browser-granted coordinates or IP-derived country).
- UTC timestamp.
- SHA-256 hashes of the IP and User-Agent strings (for rate-limiting and abuse detection).
- A bot score (0–100) computed from CSRF token, honeypot, request shape, and user-agent heuristics.
No personally identifiable information is ever stored.
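As a rough illustration of the stored record, the sketch below hashes the IP and User-Agent strings with SHA-256 and keeps only the digests. The field names and the `build_record` helper are hypothetical, and whether the real system salts the hashes is not stated here; this version is unsalted.

```python
import hashlib


def sha256_hex(value: str) -> str:
    """Return the hex-encoded SHA-256 digest of a UTF-8 string."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def build_record(emotion_id: str, country: str, timestamp_utc: str,
                 ip: str, user_agent: str, bot_score: int) -> dict:
    """Assemble a submission record (hypothetical field names).

    The raw IP and User-Agent are hashed and never placed in the record.
    """
    return {
        "emotion_id": emotion_id,
        "country": country,
        "timestamp_utc": timestamp_utc,
        "ip_hash": sha256_hex(ip),
        "ua_hash": sha256_hex(user_agent),
        "bot_score": bot_score,
    }


record = build_record("calm", "NL", "2025-05-01T12:00:00Z",
                      "203.0.113.7", "ExampleBrowser/1.0", 12)
```

Identical inputs always yield identical digests, which is what allows rate-limiting per IP hash without retaining the address itself.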
3. Geolocation
Two methods, in priority order:
- Browser geolocation (if granted). Coordinates are rounded to four decimal places (~11 m precision), then bucketed to ~1 km for caching.
- IP geocoding via ip-api.com, run on the request's source IP (the address that is hashed for storage). Returns country only (no city).
Reverse geocoding uses OpenStreetMap Nominatim, cached for six hours per coordinate bucket.
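The rounding and bucketing steps for browser coordinates can be sketched as follows. Rounding to four decimals follows the text; the bucketing rule is an assumption, approximating "~1 km" by rounding to two decimals (0.01° of latitude is about 1.1 km, and 0.01° of longitude is shorter away from the equator). The function names are illustrative.

```python
def round_coordinate(value: float) -> float:
    """Round a latitude or longitude to four decimal places (~11 m)."""
    return round(value, 4)


def cache_bucket(lat: float, lon: float) -> tuple:
    """Map a coordinate pair to a coarse cache key.

    Assumption: two-decimal rounding stands in for the ~1 km bucket;
    the real bucketing rule is not specified on this page.
    """
    return (round(lat, 2), round(lon, 2))


lat, lon = round_coordinate(48.858370), round_coordinate(2.294481)
bucket = cache_bucket(lat, lon)
```

Nearby points fall into the same bucket, so a cached reverse-geocoding result can be reused for all of them.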
4. Emotional Index
The flagship metric. Range 0–100. For a set of counts \(n_e\) per emotion \(e\):
EI = (V_raw + 1) · 50
The raw valence is in [−1, 1]; it is linearly rescaled to [0, 100]. Below 50 = predominantly negative; above 50 = predominantly positive.
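The definition of V_raw is not reproduced on this page. A minimal sketch, assuming it is the mean valence weighted by both the submission count and the per-emotion weight, \(V_{raw} = \sum_e n_e w_e v_e / \sum_e n_e w_e\), looks like this. The valence and weight values are copied from the taxonomy table; the function names are illustrative and this is not the PHP implementation.

```python
# (valence, weight) per emotion, copied from the taxonomy table.
EMOTIONS = {
    "happy": (0.85, 1.00), "excited": (0.75, 1.00), "love": (0.95, 1.10),
    "grateful": (0.90, 1.00), "hopeful": (0.70, 1.00),
    "confident": (0.70, 1.00), "calm": (0.50, 1.00), "relaxed": (0.55, 1.00),
    "curious": (0.40, 0.90), "surprised": (0.10, 0.80),
    "bored": (-0.30, 0.80), "tired": (-0.30, 0.90),
    "nervous": (-0.50, 1.00), "anxious": (-0.65, 1.00),
    "stressed": (-0.70, 1.10), "frustrated": (-0.70, 1.00),
    "angry": (-0.80, 1.10), "sad": (-0.75, 1.00),
    "lonely": (-0.80, 1.00), "depressed": (-0.90, 1.20),
}


def raw_valence(counts: dict) -> float:
    """Weighted mean valence in [-1, 1] (assumed definition of V_raw)."""
    total_weight = sum(n * EMOTIONS[e][1] for e, n in counts.items())
    if total_weight == 0:
        raise ValueError("no submissions")
    weighted_sum = sum(n * EMOTIONS[e][1] * EMOTIONS[e][0]
                       for e, n in counts.items())
    return weighted_sum / total_weight


def emotional_index(counts: dict) -> float:
    """Rescale raw valence from [-1, 1] to [0, 100]: EI = (V_raw + 1) * 50."""
    return (raw_valence(counts) + 1) * 50


ei_balanced = emotional_index({"happy": 1, "sad": 1})
```

Under this assumption, one "happy" and one "sad" submission give a raw valence of +0.05, which places the index slightly above the neutral midpoint of 50.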
5. Positivity Score
Share of submissions classified as positive emotions:
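A minimal sketch of this share, assuming it is an unweighted fraction of submissions expressed as a percentage (the exact scaling and any weighting are not stated here). The positive set is taken from the Polarity column of the taxonomy table; the function name is illustrative.

```python
# Emotions whose Polarity column is "positive" in the taxonomy table.
POSITIVE = {"happy", "excited", "love", "grateful",
            "hopeful", "confident", "calm", "relaxed"}


def positivity_score(counts: dict) -> float:
    """Percentage of submissions that fall in the positive polarity class."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    positive = sum(n for e, n in counts.items() if e in POSITIVE)
    return positive / total * 100


score = positivity_score({"happy": 3, "sad": 1})
```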
6. Stress Score
Share of negative high-arousal emotions (anxious, stressed, angry, frustrated, nervous):
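The same kind of share can be sketched for stress, again assuming an unweighted percentage of submissions. The five emotions are the ones listed above; the function name is illustrative.

```python
# Negative high-arousal emotions named in the text.
STRESS_EMOTIONS = {"anxious", "stressed", "angry", "frustrated", "nervous"}


def stress_score(counts: dict) -> float:
    """Percentage of submissions that are negative high-arousal emotions."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    stressed = sum(n for e, n in counts.items() if e in STRESS_EMOTIONS)
    return stressed / total * 100


score = stress_score({"anxious": 1, "angry": 1, "calm": 2})
```

Note that low-arousal negative emotions such as "sad" or "depressed" do not count toward this score.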
7. Resilience
Inversely related to depressive emotions, weighted by overall positivity:
The result is clamped to [0, 100].
8. Volatility
Shannon entropy of the emotion distribution, normalized to [0, 100]:
volatility = (H / log₂(20)) · 100
A country that always feels the same has 0 volatility; one with every emotion equally represented has 100.
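The volatility formula can be sketched directly, with H computed as the Shannon entropy (in bits) of the observed emotion shares. The function name is illustrative; emotions with zero submissions contribute nothing to H.

```python
import math


def volatility(counts: dict, n_emotions: int = 20) -> float:
    """Normalized Shannon entropy of the emotion distribution, in [0, 100]."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for n in counts.values():
        if n > 0:
            p = n / total
            entropy -= p * math.log2(p)
    return entropy / math.log2(n_emotions) * 100


single = volatility({"happy": 40})
uniform = volatility({f"emotion_{i}": 5 for i in range(20)})
```

Here `single` is 0 because only one emotion occurs, and `uniform` reaches the maximum because all 20 emotions are equally represented.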
9. Confidence
How much weight to put on a country's score given its sample size:
Reaches 0.86 at n=200, 0.99 at n=460.
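The confidence formula itself is not reproduced on this page. One curve consistent with both quoted values is a saturating exponential, \(1 - e^{-n/100}\); the sketch below uses it as an assumption, and the `scale` parameter name is hypothetical.

```python
import math


def confidence(n: int, scale: float = 100.0) -> float:
    """Saturating confidence in [0, 1): 1 - exp(-n / scale).

    Assumption: this functional form and scale=100 are inferred from the
    two quoted data points, not taken from the implementation.
    """
    return 1.0 - math.exp(-n / scale)


checks = {n: round(confidence(n), 2) for n in (200, 460)}
```

With this form, confidence is 0 for an empty sample and approaches but never reaches 1 as the sample grows.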
10. Bayesian Shrinkage
To prevent tiny countries from dominating rankings with noise, we shrink each country's raw valence toward the global prior:
k = 30, prior_valence = +0.10
A country with n=300 trusts its own data 10× more than the prior. A country with n=3 trusts the prior 10× more than its own data.
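A standard shrinkage estimator that matches the 10× statements above weights the country's own valence by n and the prior by k: \((n \cdot v + k \cdot v_{prior}) / (n + k)\). The sketch below assumes that form; the function and parameter names are illustrative.

```python
K = 30                 # prior strength, from the text
PRIOR_VALENCE = 0.10   # global prior, from the text


def shrunk_valence(raw: float, n: int, k: float = K,
                   prior: float = PRIOR_VALENCE) -> float:
    """Pull a country's raw valence toward the prior.

    Assumed form: (n * raw + k * prior) / (n + k). The data-to-prior
    weight ratio is n / k, i.e. 10 at n=300 and 1/10 at n=3.
    """
    return (n * raw + k * prior) / (n + k)


large = shrunk_valence(0.50, 300)  # stays close to the raw 0.50
small = shrunk_valence(0.50, 3)    # stays close to the prior 0.10
```

With no data at all (n = 0) the estimate is exactly the prior.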
11. Forecasting
Daily Emotional Index series are forecast with Holt-Winters triple exponential smoothing (additive model) with weekly seasonality (m=7):
Min history: 21 days · Horizons: 24h, 7d, 30d
Prediction intervals derive from one-step-ahead residual standard deviation scaled by √h. 80% interval = ±1.282σ√h; 95% = ±1.96σ√h.
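A compact sketch of additive Holt-Winters with the interval rule above is shown here. The smoothing parameters (`alpha`, `beta`, `gamma`) and the initialization from the first two weeks are assumptions, since the page does not state them; the z-values and the √h scaling follow the text.

```python
import math

Z_VALUES = {0.80: 1.282, 0.95: 1.96}  # from the text


def fit_holt_winters(y, m=7, alpha=0.3, beta=0.05, gamma=0.2):
    """Fit additive Holt-Winters; return level, trend, seasonals, residuals.

    Assumption: smoothing parameters and initialization are illustrative.
    """
    if len(y) < 21:
        raise ValueError("need at least 21 days of history")
    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    seasonals = [y[i] - level for i in range(m)]
    residuals = []
    for t, obs in enumerate(y):
        season = seasonals[t % m]
        # One-step-ahead error before the state is updated.
        residuals.append(obs - (level + trend + season))
        new_level = alpha * (obs - season) + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        seasonals[t % m] = gamma * (obs - new_level) + (1 - gamma) * season
        level = new_level
    return level, trend, seasonals, residuals


def forecast(y, h, m=7, interval=0.95, alpha=0.3, beta=0.05, gamma=0.2):
    """Return (point, lower, upper) for h days ahead."""
    level, trend, seasonals, residuals = fit_holt_winters(
        y, m, alpha, beta, gamma)
    point = level + h * trend + seasonals[(len(y) + h - 1) % m]
    mean_r = sum(residuals) / len(residuals)
    sigma = math.sqrt(sum((r - mean_r) ** 2 for r in residuals)
                      / (len(residuals) - 1))
    half_width = Z_VALUES[interval] * sigma * math.sqrt(h)
    return point, point - half_width, point + half_width


weekly = [2, -1, 0, 3, -2, -1, -1]
history = [50 + weekly[t % 7] for t in range(28)]
next_day = forecast(history, h=1)
```

On a daily series the three published horizons correspond to h = 1, 7, and 30. Because the half-width grows with √h, the interval at h = 4 is twice as wide as at h = 1.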
12. Anomaly Detection
Per (country, emotion), today's volume is compared against a 14-day rolling baseline. Z-scores ≥ 2.5 are flagged. Severity bands by minimum z-score: low (≥ 2.5), medium (≥ 3.0), high (≥ 4.0), critical (≥ 5.0).
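The flagging rule can be sketched as below. The thresholds come from the text; the use of the population standard deviation and the handling of a flat baseline (no flag) are assumptions, and the function names are illustrative.

```python
import statistics

# (minimum z-score, label), checked from most to least severe.
SEVERITY_BANDS = [(5.0, "critical"), (4.0, "high"),
                  (3.0, "medium"), (2.5, "low")]


def z_score(today: float, baseline: list):
    """Z-score of today's volume against the 14-day baseline.

    Returns None when the baseline has zero variance (assumption).
    """
    mean = statistics.mean(baseline)
    std = statistics.pstdev(baseline)
    if std == 0:
        return None
    return (today - mean) / std


def severity(z):
    """Map a z-score to a severity label, or None if it is not flagged."""
    if z is None:
        return None
    for threshold, label in SEVERITY_BANDS:
        if z >= threshold:
            return label
    return None


baseline = [8, 12] * 7            # 14 days, mean 10, std 2
label = severity(z_score(16, baseline))
```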
13. Anti-Abuse
- HMAC double-submit cookie for CSRF.
- Hidden honeypot field that automated form-fillers trip.
- Rate limit: three submissions per IP-hash per six hours, sliding window via Redis with DB fallback.
- Bot score from UA heuristics, IP class, missing headers; submissions scoring above 70 are silently dropped.
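The rate limit in the list above can be illustrated with an in-memory sliding window. This is a sketch only: the real system is described as using Redis with a database fallback, and the class and method names here are hypothetical.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 6 * 60 * 60  # six hours, from the text
MAX_SUBMISSIONS = 3           # per IP hash, from the text


class SlidingWindowLimiter:
    """In-memory stand-in for the Redis-backed sliding window."""

    def __init__(self, limit=MAX_SUBMISSIONS, window=WINDOW_SECONDS):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)  # ip_hash -> accepted timestamps

    def allow(self, ip_hash: str, now: float) -> bool:
        """Record and accept the submission if the window has room."""
        hits = self._hits[ip_hash]
        # Drop timestamps that have slid out of the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter()
results = [limiter.allow("hash-a", t) for t in (0, 10, 20, 30)]
```

Unlike a fixed window that resets on a schedule, each accepted submission here expires individually six hours after it was made.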
14. Known Limitations
- Self-selection. Users who choose to participate are not a representative population sample. We treat the Index as a live signal, not a survey.
- VPN noise. A non-trivial fraction of submissions arrive via VPNs and may be attributed to the wrong country.
- Time-zone effects. Evening submissions skew toward fatigue; we publish hourly breakdowns to surface this.
- Forecast horizon. Beyond 30 days the prediction interval widens past usefulness.
Methodology v1.0, last updated May 2025. Reproducible computations: every formula on this page is implemented in the open-source server code under src/Services/EmotionIndexCalculator.php.