Software can 'proof-read' Wikipedia

IOWA CITY, Iowa, Sept. 23 (UPI) -- A new tool may help fight malicious editing that introduces incorrect or misleading information on Web sites such as Wikipedia, U.S. researchers say.

University of Iowa researchers are developing a software tool that can detect potential vandalism and improve the accuracy of Wikipedia entries, a university release says.


The tool is an algorithm that looks at new edits to a page and compares them to existing words in the rest of the entry, and then alerts an editor or page manager if it senses a problem.

Existing tools can spot obscenities and vulgarities, as well as major edits such as the deletion of entire sections or significant changes throughout a document. But those tools are built manually, with prohibited words and phrases entered by hand, so they're time-consuming to maintain and easy to evade, the UI researchers say.

Their automatic statistical language model algorithm works by flagging words or vocabulary patterns in a new edit that appear nowhere else in the entry's history, going back to when it was first written.

For instance, when someone wrote "Pete loves PANCAKES" into the Wikipedia entry for Abraham Lincoln, the algorithm recognized the graffiti as potential vandalism after scanning the rest of the entry.


"It determines the probability of each word appearing, and because the word 'pancakes' didn't turn up anywhere else in the history of Lincoln's entry, the algorithm saw it as something new and possible graffiti," said Si-Chi Chin, a graduate student in UI's Interdisciplinary Graduate Program in Informatics.
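The idea Chin describes can be illustrated with a rough sketch. This is not the researchers' code: it assumes a simple word-frequency (unigram) model with add-one smoothing, and the function names and sample history text are hypothetical, chosen only for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_counts(revisions):
    """Count every word seen across all past revisions of an entry."""
    counts = Counter()
    for revision in revisions:
        counts.update(tokenize(revision))
    return counts


def word_probability(word, counts):
    """Add-one smoothed probability of a word (an assumed model choice)."""
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot stands in for unseen words
    return (counts[word] + 1) / (total + vocab)


def flag_unseen_words(edit, revisions):
    """Return the words in a new edit that never appear in the history."""
    counts = build_counts(revisions)
    return [word for word in tokenize(edit) if counts[word] == 0]


# Hypothetical revision history, made up for this example.
history = [
    "Abraham Lincoln was the 16th president of the United States.",
    "Lincoln led the nation through the Civil War.",
]

print(flag_unseen_words("Pete loves PANCAKES", history))
```

In this sketch, words that never occur in the history receive the lowest possible probability and are returned for a human editor to review. A real system would presumably weigh more than single-word counts before raising an alert.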
