Generate Standardized Global City Segments for Public Health, Climate, and Urban Planning Applications | NYU Tandon School of Engineering

Health & Wellness, Urban


Project Sponsor:

Dana R. Thomson, Associate Director of Science Applications, Center for Integrated Earth System Information (CIESIN), Climate School, Columbia University

 

Mentors:

Claire Dooley, Lecturer in Spatial Data Science, Bartlett Faculty of the Built Environment, University College London

Caitlin Clary, Data Scientist, Biostat Global Consulting LLC

 


Authors

Lula Yang, Shourya Dokania, Pranay Kashyap


Research Question

How can we generate standardized, census-like neighborhood-level city segments across urban areas globally using open data and methods to support public health, climate, and planning field activities?


Background

The Center for Integrated Earth System Information (CIESIN) at Columbia University is an internationally recognized leader in geospatial data science and interdisciplinary research. Since its founding in 1984, CIESIN has produced hundreds of granular, high-impact global datasets about human-environment interactions, and has hosted one of NASA's 12 data centers on socioeconomic data applications for more than 30 years. CIESIN's current research and collaborations focus on four data streams: Population, Operational & Decision Support, Environmental Mobility, and Hazard & Climate Risk.

This capstone project focuses on generating standardized city segments across urban areas globally using open data. The segments are designed to support a variety of field activities, such as household surveys, vaccination campaigns, disaster assessments, and environmental monitoring. Each segment is homogeneous in population size and urban characteristics, making it a meaningful unit for analysis, field data collection, and interventions. The project first defines major uncrossable features (e.g., highways, rivers, administrative boundaries), then uses minor features (e.g., secondary roads and rivers) to split the urban space between those barriers into smaller polygons. Population and morphometric characteristics (e.g., building density, road length) are estimated for each polygon, enabling the creation of navigable spatial units that reflect real-world urban conditions.
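
The morphometric estimates mentioned above can be illustrated with a minimal sketch. It assumes coordinates are already in a projected, metre-based system, that buildings and roads have already been clipped to the polygon, and that "building density" means the share of the polygon's area covered by footprints; the project itself may define these measures differently. All function names and sample geometries are hypothetical.

```python
import math

def polygon_area(ring):
    """Shoelace area of a simple polygon given as [(x, y), ...] in metres."""
    n = len(ring)
    twice_area = sum(
        ring[i][0] * ring[(i + 1) % n][1] - ring[(i + 1) % n][0] * ring[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0

def line_length(coords):
    """Total length of a polyline given as [(x, y), ...] in metres."""
    return sum(math.dist(a, b) for a, b in zip(coords, coords[1:]))

def segment_metrics(segment_ring, building_rings, road_lines):
    """Summarise one polygon; inputs are assumed to be pre-clipped to it."""
    seg_area = polygon_area(segment_ring)
    built_area = sum(polygon_area(b) for b in building_rings)
    return {
        "area_m2": seg_area,
        "building_count": len(building_rings),
        # Assumed definition: footprint area divided by polygon area.
        "building_density": built_area / seg_area,
        "road_length_m": sum(line_length(r) for r in road_lines),
    }

# Hypothetical 100 m x 100 m polygon with two buildings and two roads.
segment = [(0, 0), (100, 0), (100, 100), (0, 100)]
buildings = [
    [(10, 10), (30, 10), (30, 30), (10, 30)],   # 20 m x 20 m
    [(50, 50), (70, 50), (70, 60), (50, 60)],   # 20 m x 10 m
]
roads = [
    [(0, 50), (100, 50)],            # 100 m
    [(50, 0), (50, 30), (80, 30)],   # 30 m + 30 m
]
metrics = segment_metrics(segment, buildings, roads)
print(metrics)
```

In practice a geospatial library would handle clipping, holes, and reprojection; the sketch only shows which quantities feed into each polygon's profile.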

Using data science methods including GIS, AI, and geocomputing, the project works toward a global city segments dataset. To remain achievable within the academic timeframe, the initial focus is on developing the methodology and generating segments for a subset of cities or regions. Students work with open data sources such as GHS-POP, OpenStreetMap, and Overture building footprint data. The project also incorporates census best practices, ensuring flexibility for future updates as urban populations and conditions evolve. This capstone provides students with a practical opportunity to apply interdisciplinary skills in geospatial analysis and data science, while contributing to scalable solutions for global urban challenges.


Methodology
  • Defining Major Uncrossable Features: The project defines major uncrossable features such as highways, water bodies, cliffs, city walls, fences, and administrative boundaries. These features act as barriers, preventing the merging of segments across them and helping to maintain clear and meaningful city boundaries.
  • Segmentation Within Uncrossable Features: Minor features, such as secondary roads and rivers, are used to split the space within these major barriers into smaller polygons. These polygons represent potential city segments, capturing the functional divisions of urban areas.
  • Population and Morphometric Estimation: Population is estimated for each polygon using gridded population data (e.g., GHS-POP) or building data. Morphometric characteristics are calculated, such as building density and road length, based on available building footprints and road networks within the polygon.
  • Combining Polygons into Homogeneous City Segments: Polygons are combined based on population and morphometric parameters to form homogeneous city segments. Each segment has a target population size that ensures it is large enough for meaningful analysis and small enough to support field activities.
  • Additional Processing for Large Segments: Large city segments may be further subdivided using techniques that identify voids between building footprints, ensuring segments are appropriately sized and functional for fieldwork.
  • Handling Unpopulated Areas: Large unpopulated areas are removed, similar to how census enumeration areas (EAs) are delineated. This ensures that the resulting segments are relevant for analysis and intervention.
  • Census Best Practices: The city segments follow best practices from census methodologies, ensuring consistency and the ability to adjust segments as population distributions change over time.
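
The combining step can be sketched as a greedy merge over an adjacency graph. This is an illustrative assumption, not the project's actual algorithm: it uses population only (morphometric similarity is ignored), identifies each area enclosed by major uncrossable features with a hypothetical `block` label, and never merges polygons from different blocks. The polygon ids, populations, and target are made up.

```python
def merge_polygons(polygons, adjacency, target_pop):
    """Greedily grow segments until each reaches target_pop.

    polygons:  {polygon_id: {"block": str, "pop": float}}
    adjacency: {polygon_id: set of neighbouring polygon_ids}
    Returns (assignment, segment_pop), where assignment maps each
    polygon_id to a segment number.
    """
    assignment = {}
    segment_pop = {}
    next_segment = 0
    for seed in sorted(polygons):
        if seed in assignment:
            continue
        seg = next_segment
        next_segment += 1
        assignment[seed] = seg
        segment_pop[seg] = polygons[seed]["pop"]
        frontier = [seed]
        while frontier and segment_pop[seg] < target_pop:
            current = frontier.pop(0)
            for nb in sorted(adjacency.get(current, ())):
                if nb in assignment:
                    continue
                if polygons[nb]["block"] != polygons[seed]["block"]:
                    continue  # never merge across a major barrier
                assignment[nb] = seg
                segment_pop[seg] += polygons[nb]["pop"]
                frontier.append(nb)
                if segment_pop[seg] >= target_pop:
                    break
    return assignment, segment_pop

# Hypothetical input: blocks A and B are separated by a major barrier.
polygons = {
    "a": {"block": "A", "pop": 120},
    "b": {"block": "A", "pop": 90},
    "c": {"block": "A", "pop": 60},
    "d": {"block": "B", "pop": 200},
    "e": {"block": "B", "pop": 40},
}
adjacency = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},   # c touches d, but only across the barrier
    "d": {"c", "e"},
    "e": {"d"},
}
assignment, segment_pop = merge_polygons(polygons, adjacency, target_pop=200)
print(assignment)
print(segment_pop)
```

With these inputs, polygons a and b merge into one segment, while c and e remain below the target because their only free neighbours are already assigned or lie across the barrier. A fuller method would need a second pass to attach such leftovers to a neighbouring segment in the same block, and a way to split segments that are too large.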

Deliverables
  • City Segments Spatial Dataset
  • Technical Documentation

Data Sources
  • 2024 Global Human Settlement Urban Centres Database (GHS-UCDB) from the European Commission Joint Research Centre (EC-JRC)
  • 2025 Global Human Settlement Population estimates (GHS-POP) from the EC-JRC (or similar fine-scale population estimates)
  • Current OpenStreetMap (OSM) features, including roads, rivers, waterbodies, wetlands, cliffs, and city walls
  • Current Building Footprints from the Overture Maps Foundation
  • Current Global Subnational Administrative Boundaries from geoBoundaries (or similar source)