Anima Anandkumar Kicks Off Fall AI Series With Talk on Data, Algorithms and Infrastructure
Anima Anandkumar, Bren professor of computing and mathematical sciences at the California Institute of Technology (Caltech) and Director of Research in Machine Learning at NVIDIA, offered her insights on data, algorithms and cloud infrastructure, the three pillars of artificial intelligence, for developing scalable and easily available datasets for training systems.
Anandkumar kicked off the Modern Artificial Intelligence seminar series with an exploration of research by her and other scientists on the development and analysis of tensor algorithms for machine learning, emphasizing the need to simplify the large datasets required for modern deep learning models.
Besides being costly and difficult to collect, these torrents of data are not easily available in all domains; Anandkumar offered insights on how to use machine learning to build more efficient, less cumbersome data acquisition processes.
Active learning, she explained, can be an effective tool for strategic data collection because it allows for the use of partial labels rather than labels with full information, which require a much longer processing time. (Labels are the meaningful tags attached to data samples that are used for training machine learning models.)
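The talk's specific method is not described here, so the following is only a minimal sketch of one common active learning strategy, uncertainty sampling, in which the model asks for labels only on the samples it is least sure about. The sample ids, probabilities and the `select_for_labeling` helper are hypothetical and chosen purely for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(predictions, budget):
    """Return the ids of the `budget` samples with the highest predictive entropy.

    `predictions` maps a sample id to the model's predicted class probabilities.
    This helper is an illustrative assumption, not a method from the talk.
    """
    ranked = sorted(
        predictions,
        key=lambda sample_id: entropy(predictions[sample_id]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model outputs for four unlabeled samples (two classes each).
predictions = {
    "a": [0.99, 0.01],  # model is confident, so a label adds little
    "b": [0.55, 0.45],
    "c": [0.80, 0.20],
    "d": [0.50, 0.50],  # model is maximally unsure
}

to_label = select_for_labeling(predictions, budget=2)
print(to_label)
```

With a labeling budget of two, only the two most ambiguous samples would be sent to annotators, which is the sense in which data collection becomes strategic rather than exhaustive.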
“Building intelligence into data collection and aggregation allows you to reduce sample complexity and improve generalization for machine learning models,” she explained, describing various ways of augmenting domain information through graphics rendering, GANs (generative adversarial networks) and symbolic expression.
She noted that with the use of algorithms and cloud infrastructure, it is possible to train machine learning models at a large scale with fewer data samples, avoiding many of the difficulties that come with acquiring large datasets for model training.