NYU Polytechnic School of Engineering Professor Honored for Pioneering Work on Provenance Research

Juliana Freire Is Named a Fellow of the Association for Computing Machinery


Juliana Freire, a professor of computer science and engineering at the NYU Polytechnic School of Engineering, has been named a fellow of the Association for Computing Machinery (ACM), the world’s largest educational and scientific computing society. Freire’s cutting-edge work has made possible significant advances in the management, integration, analysis, and visualization of big data—essential work given the widely cited estimate that we create 2.5 quintillion bytes of data each day.

Freire also is a member of the faculties of the NYU Courant Institute of Mathematical Sciences, the NYU Center for Urban Science and Progress, and the NYU Center for Data Science, where she is the director of graduate studies.

Dean of Engineering Katepalli R. Sreenivasan said, “Juliana Freire, as a new fellow of the ACM, exemplifies the spirit of scientific inquiry, innovation, energy, and collaboration so important to us at the NYU School of Engineering. She is a great asset to the school, and we are gratified that she has won this recognition.”

Freire’s contributions to provenance management and computational reproducibility were specifically cited by the ACM. To analyze and understand data, complex computational processes need to be assembled, often requiring the combination of loosely coupled resources, specialized libraries, and tools. These processes may generate yet more data, adding to the overflow of information that data scientists must manage.

Ad hoc approaches to data exploration are widely used but have serious limitations. In particular, analysts expend substantial effort managing data (for example, scripts that encode computational tasks, raw data, data products, and notes) and recording provenance information so that basic questions can be answered, such as: Who created a data product and when? When was it modified and by whom? What was the process used to create the data product? Were two data products derived from the same raw data? This manual process is not only time-consuming but also error-prone.
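To make those questions concrete, here is a minimal sketch of what recording provenance by hand might look like. It is an illustration only, not the VisTrails API or Freire's method: the names `ProvenanceRecord` and `ProvenanceLog`, the file names, and the user names are all hypothetical, and the sketch assumes that derivations never form a cycle.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ProvenanceRecord:
    """One creation or modification event for a data product (hypothetical)."""
    product: str             # name of the data product, e.g. a file
    author: str              # who created or modified it
    timestamp: datetime      # when the event happened
    process: str             # script or tool that produced it
    inputs: tuple[str, ...]  # data products it was derived from


class ProvenanceLog:
    """Hypothetical in-memory log; the first event per product is its creation."""

    def __init__(self) -> None:
        self._history: dict[str, list[ProvenanceRecord]] = {}

    def record(self, product, author, process, inputs=(), timestamp=None):
        """Append an event for `product` and return the stored record."""
        rec = ProvenanceRecord(
            product, author,
            timestamp or datetime.now(timezone.utc),
            process, tuple(inputs),
        )
        self._history.setdefault(product, []).append(rec)
        return rec

    def creation(self, product) -> ProvenanceRecord:
        """Who created the product, when, and with which process."""
        return self._history[product][0]

    def modifications(self, product) -> list[ProvenanceRecord]:
        """Every event after the first: when it was modified and by whom."""
        return self._history[product][1:]

    def raw_sources(self, product) -> set[str]:
        """Follow inputs back to products that have no recorded inputs."""
        events = self._history.get(product)
        if not events or not events[-1].inputs:
            return {product}  # unrecorded or input-free: treated as raw data
        sources: set[str] = set()
        for parent in events[-1].inputs:
            sources |= self.raw_sources(parent)
        return sources

    def share_raw_data(self, a, b) -> bool:
        """Were two data products derived from the same raw data?"""
        return bool(self.raw_sources(a) & self.raw_sources(b))


# Illustrative usage with made-up products and authors.
log = ProvenanceLog()
log.record("temps_clean.csv", "alice", "clean.py", inputs=["sensor_dump.csv"])
log.record("monthly_means.csv", "bob", "aggregate.py", inputs=["temps_clean.csv"])
log.record("monthly_means.csv", "carol", "aggregate.py", inputs=["temps_clean.csv"])
log.record("survey_summary.csv", "dave", "summarize.py", inputs=["survey.csv"])

print(log.creation("monthly_means.csv").author)
print([m.author for m in log.modifications("monthly_means.csv")])
print(log.share_raw_data("monthly_means.csv", "temps_clean.csv"))
print(log.share_raw_data("monthly_means.csv", "survey_summary.csv"))
```

Even in this toy form, every analysis step has to remember to call `record` with the right inputs, which hints at why bookkeeping of this kind is tedious and easy to get wrong when done by hand.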

As a step towards addressing these problems, Freire and collaborators have performed seminal work on provenance management. Besides patents and publications (which include two best-paper awards), they have built VisTrails, an open-source system that supports data exploration and provides a comprehensive provenance management infrastructure. VisTrails not only enables the creation of reproducible results but also leverages provenance information through a series of operations and intuitive user interfaces that help users analyze data collaboratively.

Since its beta release, the system has been downloaded more than 40,000 times, and it has been adopted in scientific projects, both nationally and internationally, in fields including environmental sciences, climate data analysis, psychiatry, astronomy, cosmology, high-energy physics, molecular modeling, and quantum physics.

Freire has broken new ground in the area of reproducibility of computational experiments. In addition to scientific contributions and open-source systems, she has actively reached out to the computer science community and other scientific domains to raise awareness regarding reproducibility and to inform researchers about technology and tools that can help them create reproducible results. She currently leads the Reproducibility and Open Science group as part of the Moore-Sloan Data Science Environment at NYU.

Only the top one percent of ACM members are ever awarded fellowships for outstanding accomplishments in computing and information technology and/or outstanding service to ACM and the larger computing community. Freire will be honored along with the other 2014 fellows at the annual ACM Awards Banquet in June 2015 in San Francisco.

The ACM fellowship is only the latest in a long list of accomplishments for Freire. She has received a Google Faculty Research Award, a National Science Foundation CAREER Award, and two IBM Faculty Awards.