Tor users: An untapped resource for Wikipedia?

New research suggests ban on Tor contributors may be unfounded

An image from the study displaying the message that Tor users typically receive when trying to make edits on Wikipedia, stating that the user's IP address has been identified as a Tor exit node, and that "editing through Tor is blocked to prevent abuse." 

Users of the anonymous browsing software Tor make contributions to Wikipedia that are just as valuable as those made by other groups of editors, and they are more likely to engage with controversial topics, according to a newly published research paper co-authored by Rachel Greenstadt, an associate professor of computer science and engineering, and doctoral candidate Chau Tran, who was the lead author.

By examining more than 11,000 Wikipedia edits made between 2007 and 2018 by Tor users who were able to bypass Wikipedia’s Tor ban, the research team found that the quality of Tor users’ edits was similar to that of edits by first-time editors and by IP editors, who are non-logged-in users identified by their IP addresses. The paper notes that, on average, Tor users contributed higher-quality changes to articles than IP editors did.

The study also finds that Tor-based editors are more likely than other users to focus on topics that may be considered controversial, such as politics, technology, and religion.

An image from the study showing the differences in topics edited by Tor users and other Wikipedia users. The image suggests that Tor users are more likely to edit pages discussing topics such as politics, religion, and technology. Other types of users, including IP, first-time, and registered editors, are more likely to edit pages discussing topics such as music and sports.

The research team used a range of analytical techniques including direct parsing of article histories, manual inspections of article changes, and a machine learning platform called ORES to analyze the quality of contributions. The team also used a machine learning technique called topic modeling to analyze Tor users’ areas of interest by checking their edits against clusters of keywords.
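As a rough illustration of what checking edits against clusters of keywords can look like, here is a minimal Python sketch. It is not the study's method: a real topic model learns its keyword clusters from the text itself, whereas the clusters, function names, and sample sentences below are hypothetical and hand-written for this example.

```python
import re
from collections import Counter

# Hypothetical, hand-picked keyword clusters (an assumption for this sketch).
# A real topic model would learn such clusters from the corpus of edits.
TOPIC_KEYWORDS = {
    "politics": {"election", "government", "senate", "policy", "party"},
    "religion": {"church", "faith", "scripture", "prayer", "temple"},
    "technology": {"software", "encryption", "network", "browser", "server"},
    "sports": {"league", "season", "goal", "coach", "tournament"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def topic_scores(text):
    """Return, per topic, the share of words in the text that match its keywords."""
    counts = Counter(tokenize(text))
    total = sum(counts.values())
    if total == 0:
        return {topic: 0.0 for topic in TOPIC_KEYWORDS}
    return {
        topic: sum(counts[word] for word in keywords) / total
        for topic, keywords in TOPIC_KEYWORDS.items()
    }


def best_topic(text):
    """Return the highest-scoring topic, or None if no keyword matches."""
    scores = topic_scores(text)
    topic, score = max(scores.items(), key=lambda item: item[1])
    return topic if score > 0 else None


if __name__ == "__main__":
    print(best_topic("The senate passed a new election policy"))
    print(best_topic("The browser uses encryption on the network"))
```

Tallying the best-matching topic for each edit, grouped by editor type, would yield the kind of comparison the study reports, although the paper's own analysis relies on learned topics rather than a fixed word list.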

Greenstadt suggested these findings imply Tor users are quite similar to other internet users, highlighting other work indicating that Tor users frequently visit websites in the Alexa top one million.

Tran said, “Initially when Tor was not widely used, users might have been using it for bad purposes. But today, the population of normal users is growing and anybody can start using it. [Tor users] shouldn’t be treated differently.”

“People often choose to use Tor from a data hygiene perspective,” Greenstadt added. “They’re primarily concerned with the data trail they leave on websites. They don’t know what ultimately will become of their data and may not want all that info easily accessible.”

The paper suggests that, given the value of their contributions, the benefits of a “pathway to legitimacy” for Tor contributors to Wikipedia might exceed the potential harm.

“It’s something for the Wikipedia community to decide,” said Greenstadt. “There might be some experimentation with allowing Tor edits under certain conditions where they have to be reviewed before they go through. There are a lot of ways to do this short of a full ban.”

Greenstadt suggested the role of anonymity in online discourse as a whole be further investigated, adding that she was interested in examining anonymous activity on other websites.

“I think that people overall are concerned about the effect the internet has had on our discourse and ability to operate. Wikipedia is an example of how it can go right in several ways.” She added: “People can attribute a lot of things online to anonymity and lack of accountability. But some evidence shows that the nastiest stuff online is not done anonymously. It’s easy to jump to conclusions about anonymity, so we should rigorously study this.”

“We must move forward with this conversation about how we can handle discourse in our society and our new environment,” she said.

The paper, “Are anonymity-seekers just like everybody else? An analysis of contributions to Wikipedia from Tor,” has been accepted for publication at the 2020 IEEE Symposium on Security and Privacy, which runs from May 18 to 20. Originally planned for San Francisco, the event will be held digitally due to concerns about the COVID-19 pandemic. The other co-authors are Kaylea Champion and Benjamin Mako Hill of the University of Washington, and Andrea Forte of Drexel University.

The research was funded by the National Science Foundation.   

Andrew Laurent
BA, College of Arts and Science
Class of 2022