Machine Learning Demystified

Google Faculty Research Awards recognize and support world-class academic researchers whose cutting-edge work could shape how future generations use technology. Assistant Professor of Computer Science & Engineering Enrico Bertini, in collaboration with co-principal investigator Yindalon Aphinyanaphongs of NYU’s Langone Medical Center, recently received one of these highly competitive, unrestricted grants. The award supports the creation of interactive visualization and machine learning methods that will enable developers and domain experts to explore and understand the decisions made by machine learning models.


“Data scientists have been very successful in creating machine-learning models that accurately predict future outcomes,” Bertini says. “But now there’s a pressing need to understand, explain and describe the decisions a trained classification model makes, in order to build trust in applied machine learning. That’s important because while models can be validated statistically through procedures that verify their generalization power, validation of their plausibility in specific domains requires domain experts to interact with the model to make sense of the decisions it makes.”

Their proposed solution integrates machine learning and visual analytics methods to enable end users to explore classification models interactively. Because the approach is model-agnostic, meaning it does not depend on the internals of any particular type of model, it can be applied in a wide variety of contexts. “Machine learning is now being applied in many sectors, including cybersecurity, banking, and advertising,” Bertini explains. “We chose to focus on healthcare, however, because not only does the field pose exceptionally interesting questions, but collaborations between the NYU schools of engineering and medicine present great potential for making major contributions to the public good.”

For this study, the researchers are using large open-source datasets of patient information to build mortality models that predict, from a patient’s discharge summary, whether the patient is likely to die within 30 days of release from the hospital.

“Our broad aim, however, is to pilot and validate novel approaches of visualization of model explanations,” Bertini says, “because success in those efforts will open the door to implementation in other domains and datasets.”
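
To make the model-agnostic idea concrete, here is a minimal Python sketch of one common way to explain a text classifier without looking inside it: remove each word in turn and measure how much the predicted risk changes. This is not the researchers' method, which the article does not describe in detail. The function names, the toy scoring model, its word weights, and the sample sentence are all hypothetical stand-ins invented for illustration and carry no clinical meaning.

```python
from typing import Callable, List, Tuple


def occlusion_importance(
    predict: Callable[[str], float], text: str
) -> List[Tuple[str, float]]:
    """Rank words by how much the prediction changes when each is removed.

    `predict` is treated as a black box: any function that maps text to a
    score works, which is what makes the explanation model-agnostic.
    """
    words = text.split()
    baseline = predict(text)
    scores = []
    for i, word in enumerate(words):
        # Rebuild the text with the i-th word left out.
        reduced = " ".join(words[:i] + words[i + 1:])
        # Positive value: the word pushed the score up; negative: down.
        scores.append((word, baseline - predict(reduced)))
    # Largest absolute effect first.
    return sorted(scores, key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical stand-in for a trained classifier. The weights are made up
# for this example and are NOT derived from any real data.
TOY_WEIGHTS = {"sepsis": 0.4, "metastatic": 0.3, "stable": -0.2}


def toy_risk_model(text: str) -> float:
    """Return a toy risk score clamped to the range [0, 1]."""
    score = 0.2 + sum(TOY_WEIGHTS.get(w.lower(), 0.0) for w in text.split())
    return min(1.0, max(0.0, score))


# Synthetic sentence, not taken from any real discharge summary.
summary = "patient stable after treatment for sepsis"
ranked = occlusion_importance(toy_risk_model, summary)

for word, effect in ranked:
    print(f"{word:>10}  {effect:+.2f}")
```

Because `occlusion_importance` only ever calls `predict`, the toy model could be swapped for any trained classifier that returns a score for a piece of text. A ranking like this is the kind of output a visualization could then present to a domain expert for review.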