Jula Luehring, Hannah Metzler, Ruggero Lazzaroni, Apeksha Shetty, Jana Lasser
Misinformation is hard to detect due to
- subtle & grey-area content
- ethical complexity
- conceptual/methodological constraints & traditions
Source-based methods have the advantage that they
- include grey-area content & better reflect information diets
- make judgments easier
- are scalable & unbiased
database of ~12,000 news domains rated for trustworthiness (0–100)
selected based on web data and evaluated using 9 journalistic quality criteria \(\rightarrow\) accurate at scale
ratings are updated regularly by country experts
initially US-focused, now including other countries
++ NewsGuard is the most comprehensive list of source ratings & widely used (e.g., in studies published in Science & Nature)
-- it’s not reproducible, and we can’t validate it
assess stability and completeness of ratings over time and countries
evaluate the value of additional labels (e.g., political orientation, topics)
provide recommendations for source-based approaches
Luehring, Metzler et al., 2025
trustworthiness is relatively stable (changes rare, avg. 2 yrs)
drops due to new low-trustworthiness sources being added
majority of sources are US-based (~76%)
trustworthiness scores vary by country (US lowest on avg.)
stable state reached by ~2022 for US, DE, FR, IT, CA
blue = number of sources, green = trustworthiness
political orientation label (right/left) sparse (~33%, mostly US)
right-leaning sources score lower on average
most sources have useful topic label (e.g., “health”, “politics”)
continuous scores: stable results across time
binary labels: can distort trends (e.g., spike in “untrustworthy” links from Republicans post-2020)
Lasser et al., 2022
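A minimal sketch of why binary labels can distort trends while continuous scores stay stable. The domains, scores, and cutoff of 60 are hypothetical illustration values, not data from the study:

```python
from statistics import mean

# Hypothetical trustworthiness scores (0-100) for the same domains in two snapshots.
# Two domains drift by only a few points but cross the cutoff.
ratings_2020 = {"a.example": 62.0, "b.example": 61.0, "c.example": 90.0, "d.example": 20.0}
ratings_2021 = {"a.example": 58.0, "b.example": 59.0, "c.example": 90.0, "d.example": 20.0}

THRESHOLD = 60.0  # assumed cutoff below which a source counts as "untrustworthy"


def mean_score(ratings):
    """Continuous measure: average trustworthiness score."""
    return mean(ratings.values())


def share_untrustworthy(ratings, threshold=THRESHOLD):
    """Binary measure: fraction of sources scoring below the cutoff."""
    flagged = sum(1 for score in ratings.values() if score < threshold)
    return flagged / len(ratings)


for year, ratings in (("2020", ratings_2020), ("2021", ratings_2021)):
    print(year, mean_score(ratings), share_untrustworthy(ratings))
```

In this toy example the mean score moves by 1.5 points (58.25 to 56.75), while the binary share of "untrustworthy" sources triples (0.25 to 0.75).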
always prefer continuous over binary scores (or match dynamically)
use annual snapshots after first stable state and check for major source additions/removals
check for country-specific journalistic traditions, esp. for comparative research
topic and orientation labels are useful to characterize sources beyond trustworthiness but should be validated (or at least spot-checked)
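One possible way to check for major source additions/removals between annual snapshots, sketched under assumptions: each snapshot is a plain domain-to-score mapping, and the function name, domains, and scores are hypothetical (this is not the NewsGuard data format):

```python
def compare_snapshots(old, new):
    """Compare two snapshots given as {domain: score} dictionaries.

    Returns the domains added, the domains removed, and the mean score
    change among domains present in both snapshots.
    """
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    retained = set(old) & set(new)
    if retained:
        shift = sum(new[d] - old[d] for d in retained) / len(retained)
    else:
        shift = 0.0
    return added, removed, shift


# Hypothetical annual snapshots.
snap_2022 = {"a.example": 80.0, "b.example": 45.0, "c.example": 95.0}
snap_2023 = {"a.example": 82.5, "c.example": 95.0, "d.example": 30.0}

added, removed, shift = compare_snapshots(snap_2022, snap_2023)
print("added:", added)
print("removed:", removed)
print("mean change among retained sources:", shift)
```

Reporting the shift among retained sources separately from additions and removals helps distinguish genuine rating changes from changes in database composition.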
binary true/false labels can be volatile across time periods
coverage & ratings are relatively insensitive to time once they reach a stable state, which speaks for the reliability of source-based approaches
need more open, transparent alternatives
Email: jula.luehring@univie.ac.at
Bluesky: @julaluehring.bsky.social
GitHub: github.com/julaluehring