The Times, They Are A-Changin'
Characterizing Post-Publication Changes to Online News
This work studies how often news articles are changed after publication. We collect articles over a period of 9 months from news publishers of varying popularity and political bias, and show that 165k out of 600k articles exhibit some post-publication change. We also use Natural Language Processing to characterize the semantics of these changes, such as whether a change alters the meaning of the paragraph it occurs in, which it does in 22% of cases.
Full Paper PDF
Bibtex for citation
Parsers: You can find the Python code used to parse the HTML of the crawled articles here.
Mapping of changes to categories: You can download the CSVs that we created for this mapping from this folder.
R Notebooks used for analysis and graphs: Finally, a collection of notebooks we used to arrive at various metrics and create the graphs present in the paper can be found here.
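To make the idea of a post-publication change concrete, here is a minimal sketch of comparing two snapshots of the same article at the paragraph level. It is an illustration only and not the pipeline used in the paper: the function name `paragraph_changes`, the blank-line paragraph splitting, and the use of Python's standard `difflib` are all assumptions made for this example.

```python
import difflib


def paragraph_changes(old_text, new_text):
    """Return (tag, old_paragraphs, new_paragraphs) for each changed region.

    Hypothetical helper: paragraphs are assumed to be separated by blank
    lines, which may not match how real article HTML is structured.
    """
    old = [p.strip() for p in old_text.split("\n\n") if p.strip()]
    new = [p.strip() for p in new_text.split("\n\n") if p.strip()]

    # Align the two paragraph lists and keep only the non-matching regions.
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # tag is "replace", "insert", or "delete"
            changes.append((tag, old[i1:i2], new[j1:j2]))
    return changes


if __name__ == "__main__":
    first = "Mayor announces budget.\n\nThe plan costs $2M.\n\nVote is Friday."
    later = "Mayor announces budget.\n\nThe plan costs $3M.\n\nVote is Friday."
    for change in paragraph_changes(first, later):
        print(change)
```

A diff like this only shows that a paragraph changed. Deciding whether the change alters the paragraph's meaning is a separate step that this sketch does not attempt.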