Big Data Tutorial 1: MapReduce

For NYU Community

NYU High Performance Computing

This class will provide a brief overview of what Hadoop is and the various components of the Hadoop ecosystem. There will be a hands-on showcase of how to use the Dumbo (Hadoop) cluster to run basic MapReduce jobs, along with several exercises to help users build a better understanding. Please bring your laptop to this workshop, as it is intended to be a practical hands-on session.

1. An HPC user account is mandatory
2. Basic knowledge of Unix and Java/Python is required
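To give a flavor of the kind of job covered in the session, here is a minimal sketch of the classic MapReduce word-count pattern in Python. This is a generic illustration, not taken from the workshop materials; on the cluster itself such mapper and reducer functions would typically be run via Hadoop Streaming rather than as a single local script.

```python
from collections import defaultdict

def map_words(lines):
    """Map phase: emit a (word, 1) pair for each word in the input."""
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def reduce_counts(pairs):
    """Reduce phase: sum the counts emitted for each word."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

if __name__ == "__main__":
    sample = ["Hadoop runs MapReduce", "MapReduce runs on Hadoop"]
    # → {'hadoop': 2, 'runs': 2, 'mapreduce': 2, 'on': 1}
    print(reduce_counts(map_words(sample)))
```

In a real Hadoop Streaming job, the mapper and reducer would be separate scripts reading from stdin and writing tab-separated key/value pairs to stdout, with the framework handling the shuffle and sort between the two phases.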