Takahiro Yabe, Ph.D.
Assistant Professor
Dr. Takahiro Yabe is an Assistant Professor at the Department of Technology Management and Innovation (TMI) and the Center for Urban Science and Progress (CUSP). His research focuses on computational social science and network science approaches to modeling the resilience of cities to disasters, pandemics, and disruptive mobility technology, and he has published in journals including PNAS, Nature Communications, and Nature Machine Intelligence. Previously, he was a Postdoctoral Associate at the MIT Institute of Data, Systems and Society (IDSS) and the MIT Media Lab with Alex 'Sandy' Pentland and Esteban Moro. He obtained his Ph.D. from Purdue University in 2021, and his MS and BS from the University of Tokyo in 2017 and 2015, respectively.
Urban Resilience
Human Mobility
Inequality and Segregation
Socioeconomic Networks
Education
Postdoc, Massachusetts Institute of Technology (2021 - 2023)
Ph.D., Purdue University (2017 - 2021)
MS, The University of Tokyo (2015 - 2017)
Bachelor of Engineering, The University of Tokyo (2011 - 2015)
Selected Awards and Grants
- Vice Presidential Unit Award (2022), The World Bank, to GFDRR mobility data analytics team
- Top 10% Citation Award (2021), PLoS ONE, for “Cross-comparative analysis of evacuation behavior after earthquakes using mobile phone data”
- Doctoral “Systems” Fellowship (2018 – 2020), Purdue University
- STV Civil Engineering Graduate Assistantship Endowment Award (2020), Lyles School of Civil Engineering, Purdue University
Selected Publications
- Yabe, T., Bueno, B. G. B., Dong, X., Pentland, A., & Moro, E. (2023). Behavioral changes during the COVID-19 pandemic decreased income diversity of urban encounters. Nature Communications, 14(1), 2310.
- Yabe, T., Rao, P. S. C., Ukkusuri, S. V., & Cutter, S. L. (2022). Toward data-driven, dynamical complex systems approaches to disaster resilience. Proceedings of the National Academy of Sciences, 119(8), e2111997119.
- Yabe, T., Jones, N. K., Rao, P. S. C., Gonzalez, M. C., & Ukkusuri, S. V. (2022). Mobile phone location data for disasters: A review from natural hazards and epidemics. Computers, Environment and Urban Systems, 94, 101777.
- Yabe, T., Tsubouchi, K., Fujiwara, N., Sekimoto, Y., & Ukkusuri, S. V. (2020). Understanding post-disaster population recovery patterns. Journal of the Royal Society Interface, 17(163), 20190532.
- Yabe, T., Tsubouchi, K., Fujiwara, N., Wada, T., Sekimoto, Y., & Ukkusuri, S. V. (2020). Non-compulsory measures sufficiently reduced human mobility in Tokyo during the COVID-19 epidemic. Scientific Reports, 10(1), 18053.
Research News
Towards open and standardized human mobility data: NYU Tandon researchers address key challenges and solutions
Despite the vast amounts of human mobility data generated by smartphones, a lack of standardized formats, protocols, and privacy-protected open-source datasets hampers innovation across various sectors, including city planning, transportation design, public health, emergency response, and economic research. The absence of established benchmarks further complicates efforts to evaluate progress and share best practices.
Takahiro Yabe, assistant professor at the Center for Urban Science + Progress (CUSP) and Department of Technology Management and Innovation at NYU Tandon, recently collaborated with a team of researchers — Massimiliano Luca (Fondazione Bruno Kessler), Kota Tsubouchi (LY Corporation), Bruno Lepri (Fondazione Bruno Kessler), Marta C. Gonzalez (University of California, Berkeley), and Esteban Moro (Northeastern University) — to advocate for the necessity of open and standardized human mobility data. Their opinion piece, published in Nature Computational Science, builds upon the recent release of the open-source, anonymized, large-scale human mobility dataset YJMob100K developed by several authors of this paper.
"The creation of the YJMob100K dataset and associated HuMob Challenge 2024 was motivated by our concern that similar human mobility datasets are exclusive to select researchers and industry organizations," said Yabe.
Unpacking the complexities of human mobility data pre-processing
Tracking a journey from origin to destination using mobile phone data is not straightforward. Data can vary significantly depending on the application employed, and datasets may not capture all transit modes. Inconsistencies in definitions, classifications, and data tags can introduce biases.
For example, defining what constitutes a "stop" — whether it’s a pause of 5 minutes, 30 minutes, or an hour — can be somewhat subjective. “To detect a stop within a mobility trajectory, data scientists need to define arbitrary hyperparameters such as the minimum number of minutes spent at the stop and the maximum movement distance allowed from the stop centroid,” noted the authors. “With several hyperparameters needed for each pre-processing step, a slight change in the selection of these parameters could result in a very different processed human mobility dataset.”
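The sensitivity the authors describe can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `haversine_m` and `detect_stops`, the ping format, and the greedy centroid-based clustering rule are all assumptions chosen for illustration, and the sample coordinates are made up. The two hyperparameters match the ones named in the quote: the minimum number of minutes spent at a stop and the maximum distance allowed from the stop centroid.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    r = 6_371_000.0  # mean Earth radius in meters
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(a))


def centroid(cluster):
    """Mean latitude and longitude of a list of (minute, lat, lon) pings."""
    n = len(cluster)
    return sum(p[1] for p in cluster) / n, sum(p[2] for p in cluster) / n


def detect_stops(pings, min_minutes, max_radius_m):
    """Hypothetical stop detector (illustrative only).

    pings: list of (minute, lat, lon) tuples sorted by time.
    Returns a list of (start_minute, end_minute, centroid_lat, centroid_lon).
    """
    stops = []
    i = 0
    while i < len(pings):
        cluster = [pings[i]]
        j = i + 1
        # Extend the cluster while the next ping stays close to the running centroid.
        while j < len(pings):
            c_lat, c_lon = centroid(cluster)
            if haversine_m(c_lat, c_lon, pings[j][1], pings[j][2]) > max_radius_m:
                break
            cluster.append(pings[j])
            j += 1
        duration = cluster[-1][0] - cluster[0][0]
        if duration >= min_minutes:
            c_lat, c_lon = centroid(cluster)
            stops.append((cluster[0][0], cluster[-1][0], c_lat, c_lon))
            i = j  # continue after the detected stop
        else:
            i += 1  # too short to count as a stop; try the next ping
    return stops


# Made-up trajectory: about 20 minutes near one place, then about 5 minutes near another.
pings = [
    (0, 35.6800, 139.7600),
    (5, 35.6801, 139.7601),
    (10, 35.6800, 139.7600),
    (15, 35.6801, 139.7600),
    (20, 35.6800, 139.7601),
    (25, 35.7000, 139.8000),
    (30, 35.7001, 139.8000),
]

for min_minutes in (5, 10, 30):
    found = detect_stops(pings, min_minutes=min_minutes, max_radius_m=50)
    print(min_minutes, len(found))
```

On this toy trajectory, the same pings yield two stops with a 5-minute threshold, one stop with a 10-minute threshold, and none with a 30-minute threshold, which is the kind of divergence the authors warn about.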
Pre-processing human mobility data is critical for protecting individual privacy. However, the methods used in this process are often kept confidential, creating a "black box" problem that raises concerns about data validity and accuracy. Companies try to address these concerns by comparing their datasets with external sources such as census data. Researchers who want to introduce and evaluate new methods for predicting human mobility still face a significant hurdle: there is currently no standard open benchmark dataset.
Advancing human mobility research with the YJMob100K dataset
The authors propose two strategies for approaching this challenge: creating synthetic, privacy-preserving human mobility datasets using machine learning models, or anonymizing a large-scale mobility dataset through collaboration with a private company. The latter strategy was employed for YJMob100K, which was developed through a collaboration involving Takahiro Yabe from NYU Tandon, Yoshihide Sekimoto and Kaoru Sezaki from the University of Tokyo, Esteban Moro and Alex Pentland from MIT, along with Kota Tsubouchi and Toru Shimizu from the private Japanese internet company LY Corporation.
The YJMob100K dataset was anonymized by converting location pings into 500x500 meter grid cells and aggregating timestamps into 30-minute intervals while masking the actual dates. Individuals who contributed data signed a consent form outlining the frequency and accuracy of location data collection as well as its intended use. All data processing and analysis were conducted on servers managed by the company.
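The discretization described above can be sketched as follows. This is a simplified illustration, not the procedure used to build YJMob100K: the function names, the reference origin, the flat-earth approximation for converting degrees to meters, and the use of a relative day index to stand in for "masking the actual dates" are all assumptions.

```python
from datetime import date, datetime
from math import cos, floor, radians

CELL_SIZE_M = 500        # grid cell edge length, from the article
SLOT_MINUTES = 30        # time interval length, from the article
M_PER_DEG_LAT = 111_320  # approximate meters per degree of latitude (assumption)


def to_grid_cell(lat, lon, origin_lat, origin_lon):
    """Map a location ping to (x, y) indices of a 500 m grid.

    Uses a local flat-earth approximation around a hypothetical origin,
    which is only reasonable over a city-scale area.
    """
    dy_m = (lat - origin_lat) * M_PER_DEG_LAT
    dx_m = (lon - origin_lon) * M_PER_DEG_LAT * cos(radians(origin_lat))
    return floor(dx_m / CELL_SIZE_M), floor(dy_m / CELL_SIZE_M)


def to_time_slot(timestamp, day_zero):
    """Replace a timestamp with (relative day index, 30-minute slot of the day).

    The calendar date is not kept; only the offset from a hypothetical,
    undisclosed day_zero is returned.
    """
    day_index = (timestamp.date() - day_zero).days
    slot = (timestamp.hour * 60 + timestamp.minute) // SLOT_MINUTES
    return day_index, slot


# Made-up ping and reference values for illustration.
cell = to_grid_cell(35.01, 139.01, origin_lat=35.0, origin_lon=139.0)
slot = to_time_slot(datetime(2024, 1, 3, 14, 47), day_zero=date(2024, 1, 1))
print(cell, slot)
```

Coarsening in this way trades spatial and temporal precision for privacy: every ping inside the same cell and the same half-hour window becomes indistinguishable in the released data.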
To promote the dataset, the authors initiated the Human Mobility Prediction Challenge (HuMob Challenge) in 2023, which is now in its second year. Selected entries will be showcased at the ACM SIGSPATIAL conference, taking place in Atlanta from October 29 to November 1, 2024.
"Inspired by advancements in natural language processing models like transformers, many submissions in the 2023 edition used AI and machine learning approaches," said Yabe. "Eight of the top ten submissions used deep learning techniques. With over 85 teams and 200 participants from 22 countries, the overwhelming response motivated us to continue into 2024. This year's challenge is more advanced, targeting predictions for cities that lack data."
The authors have identified several key challenges they aim to address in their future work. First, they aim to establish criteria for creating “fit-for-purpose” benchmarking datasets to achieve consensus within the research community. This involves defining clear metrics for data specification and establishing industry standards for pre-processing. Second, recognizing that different tasks require different types of data, they propose creating a collection of “fit-for-purpose” datasets, each tailored to specific research domains, communities, and socio-spatial-temporal contexts. This necessitates a bottom-up approach led by relevant research communities to ensure the datasets are well-aligned with their intended uses.
Yabe, T., Luca, M., Tsubouchi, K., Lepri, B., Gonzalez, M. C., & Moro, E. (2024). Enhancing human mobility research with open and standardized datasets. Nature Computational Science, 4(7), 469-472. https://doi.org/10.1038/s43588-024-00650-3