Why good statistics are important for fact checkers
What we found from our research
The importance of context
Understanding where the data came from, how it was made, and whether there are caveats to watch out for is often extremely important in the work of fact checkers. Fact checkers often also include this context in the fact checks they write.
Extra context is sometimes needed to show readers whether they should trust the numbers published by national statistics institutes. It also helps to show the shortcomings of the data, if there are any. This in turn helps build trust in the fact checkers amongst their readers.
Knowing if ‘experimental’ methods are used – such as modelling or sampling – is helpful, and especially relevant when more data science or machine learning approaches are used by the national statistics institute.
Data reliability for fact checking
A crucial emerging theme was whether a fact checker can depend on the data.
The timeliness of data – how recent the information is – is considered a high priority. However, timely data is often lacking in countries or departments with fewer resources.
Many fact checkers said it was very important to link data across time and make comparisons with previous years. A long historical series where the methodology hasn’t changed was found to give a feeling of stability or of the data being more reliable.
Comparability was also mentioned quite often – fact checkers want to be able to compare data on a similar topic from two different places or two different times.
Multiple organisations publishing data on a similar topic is a complex issue depending on the country. While some found it good to have a single source of truth, others found it useful to have multiple organisations to back up or add more trust to the numbers.
Data from other organisations can complement data published by national statistics institutes, or can provide a safety net if the government service goes down. When data and information are spread across multiple organisations, it creates a challenge for big structural issues like Covid-19, climate change, or the environment. This is less of a problem for specific, focused topics.
Publishing and formatting matters
CSV files were by far the most requested format for publishing data – but are not always in ready supply for fact checkers.
Fact checkers often have to ‘unbundle’ data from the medium it was published in, for example flat images or PDFs. They often then create their own spreadsheets or datasets with the data to do their own analysis.
National statistics institutes should publish related spreadsheets alongside the report in which they include data.
Not surprisingly, APIs were very popular with technologists, although other tech-savvy fact checkers also called for them.
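To illustrate why a CSV file is so much easier to work with than a PDF or image, here is a minimal sketch of the kind of year-on-year comparison fact checkers described wanting to make. The column names (year, value) and the figures are invented for illustration and do not come from any real dataset.

```python
import csv
import io

# Hypothetical CSV content, standing in for a file a statistics
# institute might publish. Column names and figures are invented.
CSV_TEXT = """year,value
2019,100
2020,110
2021,99
"""


def year_on_year_change(csv_text):
    """Return {year: percentage change from the previous year}."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Sort by year so the comparison does not depend on file order.
    rows.sort(key=lambda row: int(row["year"]))
    changes = {}
    for prev, curr in zip(rows, rows[1:]):
        prev_value = float(prev["value"])
        curr_value = float(curr["value"])
        change = (curr_value - prev_value) / prev_value * 100
        changes[int(curr["year"])] = round(change, 1)
    return changes


changes = year_on_year_change(CSV_TEXT)
print(changes)
```

With the data in a PDF or a flat image, the same comparison would first require retyping or extracting the numbers by hand, which is the 'unbundling' step described above.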
Conclusion and next steps
The themes from this research were not unexpected, but they do neatly show the challenge we face.
Easy access to high-quality statistics makes fact checking easier. Consistent access to high-quality statistics in a form that machines can process means we may be able to use technology to reduce the time it takes to fact check. Having the statistics available in a form that lets machines not just process the numbers, but also understand their context, caveats and complexity, would be game-changing.
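As a rough sketch of what a statistic that carries its own context could look like, the record below keeps the number together with its source, period, method flag and caveats. Every field name and value here is a hypothetical illustration, not an existing standard or any institute's actual format.

```python
from dataclasses import dataclass, field


@dataclass
class Statistic:
    """A single published figure plus the context needed to interpret it.

    All field names are assumptions made for this illustration.
    """
    name: str
    value: float
    unit: str
    period: str
    source: str
    experimental: bool = False  # e.g. a modelled rather than counted figure
    caveats: list = field(default_factory=list)


def describe(stat):
    """Render the figure together with its context and caveats."""
    text = f"{stat.name}: {stat.value} {stat.unit} ({stat.period}, {stat.source})"
    if stat.experimental:
        text += " [experimental method]"
    for caveat in stat.caveats:
        text += f"\n  caveat: {caveat}"
    return text


# Invented example values, for illustration only.
example = Statistic(
    name="Example rate",
    value=4.2,
    unit="%",
    period="2021",
    source="Hypothetical statistics institute",
    experimental=True,
    caveats=["Modelled estimate, not a full count"],
)
print(describe(example))
```

The point of the design is that the caveats travel with the number, so a tool reading the record cannot pick up the value without also seeing how it was produced.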