Best Practices for Source-Based Research on News Trustworthiness

ICA 2025 in Denver, USA



Jula Luehring, Hannah Metzler, Ruggero Lazzaroni, Apeksha Shetty, Jana Lasser

How to measure misinformation?

Misinformation is hard to detect due to

-   subtle & grey-area content

-   ethical complexity
    
-   conceptual/methodological constraints & traditions
    

Source-based methods have the advantage that they

-   include grey-area content & better reflect information diets

-   make judgments easier
    
-   are scalable & unbiased
    

NewsGuard

  • database of ~12,000 news domains rated for trustworthiness (0–100)

  • selected based on web data and evaluated using 9 journalistic quality criteria → accurate at scale

  • ratings are updated regularly by country experts

  • initially US-focused, now including other countries

++ NewsGuard is the most comprehensive list of source ratings & widely used (e.g., in studies published in Science & Nature)

-- it’s not reproducible, and we can’t validate it

Research goals

  1. assess stability and completeness of ratings over time and countries

  2. evaluate the value of additional labels (e.g., political orientation, topics)

  3. provide recommendations for source-based approaches

Rating stability over time

  • trustworthiness is relatively stable (changes rare, avg. 2 yrs)

  • drops in average trustworthiness are due to new low-trustworthiness sources being added
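
One way to check whether a drop in the average score comes from newly added sources rather than from re-ratings is to compare the change in the overall mean with the change among sources present in both snapshots. The following is a minimal sketch, not the authors' analysis code: it assumes each snapshot is a plain mapping from domain to score, and all domains and scores are invented for illustration.

```python
from statistics import mean

# Hypothetical snapshots: domain -> trustworthiness score (0-100).
# Values are invented for illustration, not taken from NewsGuard.
snapshot_old = {"a.example": 90.0, "b.example": 80.0, "c.example": 70.0}
snapshot_new = {"a.example": 90.0, "b.example": 80.0, "c.example": 70.0,
                "d.example": 20.0, "e.example": 30.0}


def decompose_mean_change(old, new):
    """Split the change in mean score into re-rating and composition parts."""
    # Sources rated in both snapshots (assumed to be non-empty).
    persistent = sorted(old.keys() & new.keys())
    # Change among persistent sources: genuine re-ratings.
    rerating = mean(new[d] for d in persistent) - mean(old[d] for d in persistent)
    # Total change in the overall mean between the two snapshots.
    total = mean(new.values()) - mean(old.values())
    # The remainder is attributed to sources being added or removed.
    return {"total": total, "rerating": rerating, "composition": total - rerating}


result = decompose_mean_change(snapshot_old, snapshot_new)
print(result)
```

In this toy example none of the persistent sources is re-rated, so the whole drop in the mean is attributed to the two newly added low-scoring domains.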

Country-level completeness

  • majority of sources are US-based (~76%)

  • trustworthiness scores vary by country (US lowest on avg.)

  • stable state reached by ~2022 for US, DE, FR, IT, CA

blue = number of sources, green = trustworthiness
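
The slide does not state how a "stable state" is determined. One possible operationalization, shown here purely as an assumption, is the first snapshot after which the number of rated sources never again changes by more than a chosen tolerance. The function name, the 5% tolerance, and the yearly counts are all hypothetical.

```python
def first_stable_index(counts, tolerance=0.05):
    """Index of the first snapshot from which every later relative change
    in the source count stays within `tolerance`; None if it never settles.

    Assumes all counts are positive.
    """
    for start in range(len(counts) - 1):
        settled = all(
            abs(counts[i + 1] - counts[i]) / counts[i] <= tolerance
            for i in range(start, len(counts) - 1)
        )
        if settled:
            return start
    return None


# Hypothetical number of rated sources per yearly snapshot for one country.
yearly_counts = {2019: 200, 2020: 400, 2021: 700, 2022: 720, 2023: 730, 2024: 735}

years = sorted(yearly_counts)
idx = first_stable_index([yearly_counts[y] for y in years])
stable_year = years[idx] if idx is not None else None
print(stable_year)
```

The same check could be run on mean trustworthiness per snapshot instead of source counts; which series and tolerance are appropriate is a judgment call for the study at hand.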

Use of contextual labels

  • political orientation label (right/left) sparse (~33%, mostly US)

  • right-leaning sources score lower on average

  • most sources have useful topic label (e.g., “health”, “politics”)

What happens when we use different versions?

  • continuous scores: stable results across time

  • binary labels: can distort trends (e.g., spike in “untrustworthy” links from Republicans post-2020)
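
The difference between the two kinds of measure can be illustrated with a toy example. This is a sketch under stated assumptions: the cutoff of 60 for a binary "untrustworthy" label, the domains, the scores, and the shared links are all hypothetical and chosen only to show the mechanism.

```python
from statistics import mean

THRESHOLD = 60.0  # assumed cutoff for a binary "untrustworthy" label

# Hypothetical scores for the same domains in two database versions.
version_a = {"a.example": 95.0, "b.example": 62.0, "c.example": 61.0, "d.example": 30.0}
version_b = {"a.example": 95.0, "b.example": 58.0, "c.example": 57.0, "d.example": 30.0}

# Hypothetical shared links, one entry per link, identified by domain.
shared_links = ["a.example", "b.example", "b.example", "c.example", "d.example"]


def mean_score(links, ratings):
    """Continuous measure: average score of the linked domains."""
    return mean(ratings[d] for d in links)


def share_untrustworthy(links, ratings, threshold=THRESHOLD):
    """Binary measure: fraction of links to domains below the cutoff."""
    flagged = sum(1 for d in links if ratings[d] < threshold)
    return flagged / len(links)


for name, ratings in [("version A", version_a), ("version B", version_b)]:
    print(name, mean_score(shared_links, ratings),
          share_untrustworthy(shared_links, ratings))
```

Two domains move by only four points between versions, so the continuous mean barely shifts, but both cross the cutoff and the binary share of "untrustworthy" links jumps from one in five to four in five.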

Recommendations

  1. always prefer continuous over binary scores (or match ratings dynamically to the time of the data)

  2. use annual snapshots after first stable state and check for major source additions/removals

  3. check for country-specific journalistic traditions, esp. for comparative research

  4. topic and orientation labels are useful to characterize sources beyond trustworthiness but should be validated (or at least spot-checked)
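
Dynamic matching in the first recommendation can be read as assigning each shared link the rating that was in effect when the link was posted, rather than a rating from a single fixed snapshot. A minimal sketch of that idea follows. The data layout (a date-sorted rating history per domain), the function name `rating_at`, and all values are assumptions made for illustration.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical rating history per domain:
# (date the rating took effect, score), sorted by date.
rating_history = {
    "a.example": [(date(2020, 1, 1), 80.0), (date(2021, 6, 1), 55.0)],
    "b.example": [(date(2019, 3, 1), 40.0)],
}


def rating_at(domain, when, history=rating_history):
    """Return the most recent score in effect on `when`, or None if unrated."""
    entries = history.get(domain)
    if not entries:
        return None
    dates = [d for d, _ in entries]
    # Number of ratings that took effect on or before `when`.
    idx = bisect_right(dates, when)
    if idx == 0:
        return None  # the link was shared before the first rating
    return entries[idx - 1][1]


print(rating_at("a.example", date(2020, 12, 31)))
print(rating_at("a.example", date(2021, 6, 1)))
```

Links shared before a domain was first rated return `None` here; whether to drop them or back-fill with the earliest rating is a separate decision that should be reported.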

Implications for source-based approaches

  • binary true/false labels can be volatile across time periods

  • coverage & ratings are relatively insensitive to time once they reach a stable state, which supports the reliability of source-based approaches

  • need more open, transparent alternatives

Published in JQD:DM earlier this year


Thank you!

Email: jula.luehring@univie.ac.at

Bluesky: @julaluehring.bsky.social

GitHub: github.com/julaluehring