Is Your Big Data Messy? We're Making an App for That

Researchers are creating software to ‘clean’ large datasets, making it easier for scientists and the public to use big data


Like a teenager's bedroom, big data is often messy. 

Malfunctioning computers, data entry errors and other hard-to-spot problems can skew datasets and mislead anyone trying to draw conclusions from raw data, from data scientists to data hobbyists.

Vizier, a software tool under development by a University at Buffalo-led research team, aims to proactively catch those errors.

The project, backed by a $2.7 million National Science Foundation grant, launched in January. Like Excel and other spreadsheet software, Vizier will allow users to interactively work with datasets. 
