Automating First Floor Elevation Estimates with Google Street View and Machine Learning
Riccardo Negri, Ph.D. Candidate at NYU Tandon and UC Berkeley, and member of the Disaster Risk Analysis Lab
Authors
Lishun Liu Gaara, Danni Yang, Qirui Su
Research Question
The project addresses the gap between existing research methods for estimating First Floor Height Above Ground (FFHAG) and their practical application by local governments. Accurate FFHAG data are critical for flood risk assessments, but current approaches often rely on cost-intensive surveys that most cities cannot afford. The team is reviewing state-of-the-art methods for estimating FFHAG and detecting basement presence using ML and Google Street View imagery, as well as methods for retrieving images from geocoded addresses. They are integrating these components into an accessible, automated system and testing its reliability by comparing its outputs with the standard Hazus-based flood risk estimates used by FEMA.
Background
This project aims to develop a system to estimate First Floor Height Above Ground (FFHAG) for residential buildings exposed to flooding, using Google Street View imagery and machine learning (ML). Accurate FFHAG data are critical for flood risk assessments but are rarely available at scale. For example, NYC compiled FFHAG data through the Building Elevation and Subgrade (BES) survey, which required significant resources. The proposed system uses publicly available imagery and existing ML algorithms to estimate FFHAG, reducing the need for resource-intensive surveys and making the approach feasible for other flood-prone cities.

The project has three phases. In the first phase, the team reviews literature on ML methods for estimating FFHAG and detecting basement presence from building images. The team then manually labels images in a selected NYC neighborhood, trains selected models, and evaluates their accuracy using the BES dataset. In the second phase, the team assesses techniques for extracting building images from Google Street View that are suitable for ML analysis. They select an existing algorithm (or develop a new one) to automate image retrieval from Google Street View starting from a list of geocoded addresses. In the final phase, the team applies the system in a pilot neighborhood to generate parcel-level FFHAG and basement presence estimates. The team uses these results to perform a flood risk analysis and compare the outputs with those from Hazus-based methods currently used by FEMA.
Methodology
The project follows three steps:
- The team reviews existing machine learning methods for estimating First Floor Height Above Ground (FFHAG) and detecting basement presence from street-level imagery. Based on this review, they select 2-3 suitable models for testing. They also manually label a sample of buildings in a selected NYC neighborhood, train the chosen models, and evaluate their accuracy using the NYC Building Elevation and Subgrade dataset (a model-training sketch follows this list).
- The team reviews available algorithms for retrieving building images from Google Street View based on address inputs. They implement a system that automates this process, ensuring that the retrieved images clearly show building entrances and other relevant features (an image-retrieval sketch follows this list).
- The team applies the full system (image retrieval and FFHAG/basement estimation) to a case-study neighborhood. They use the output to run a flood risk analysis and compare the results with those generated using FEMA's standard Hazus methodology (a comparison sketch follows this list).
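For the first step, one plausible setup is to treat FFHAG estimation as image regression: fine-tune a pretrained convolutional backbone on the manually labeled facade images, then compare predictions against the BES records. The sketch below is a minimal illustration assuming PyTorch/torchvision, a ResNet-18 backbone, and a hypothetical labels.csv with filename and ffhag_ft columns; none of these choices are specified by the project.

```python
# Minimal sketch: fine-tune a ResNet-18 backbone to regress FFHAG (in feet) from
# street-level facade images. The labels.csv layout (filename, ffhag_ft), paths,
# and hyperparameters are illustrative assumptions, not the project's actual setup.
import pandas as pd
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader
from torchvision import models, transforms
from PIL import Image


class FacadeDataset(Dataset):
    """Pairs one facade image with its manually labeled FFHAG value (feet)."""

    def __init__(self, labels_csv, image_dir):
        self.labels = pd.read_csv(labels_csv)  # assumed columns: filename, ffhag_ft
        self.image_dir = image_dir
        self.transform = transforms.Compose([
            transforms.Resize((224, 224)),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225]),
        ])

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        row = self.labels.iloc[idx]
        image = Image.open(f"{self.image_dir}/{row['filename']}").convert("RGB")
        target = torch.tensor([row["ffhag_ft"]], dtype=torch.float32)
        return self.transform(image), target


# Swap the ImageNet classification head for a single regression output.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 1)

loader = DataLoader(FacadeDataset("labels.csv", "images"), batch_size=16, shuffle=True)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.L1Loss()  # mean absolute error in feet is easy to interpret

model.train()
for epoch in range(10):
    for images, targets in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), targets)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: last-batch MAE {loss.item():.2f} ft")
```

Predictions from the trained model would then be compared against the matching BES records, for example by reporting mean absolute error per parcel.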
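For the second step, building images can be requested from the Google Street View Static API given a coordinate pair (or street address). The sketch below queries the free metadata endpoint first to confirm imagery exists before spending quota on the image request; the size, fov, and pitch values are placeholder assumptions that would need tuning so entrances remain visible.

```python
# Minimal sketch: fetch one Street View image for a geocoded location via the
# Google Street View Static API. Requires an API key; size, fov, and pitch are
# placeholder values that would need tuning so building entrances stay visible.
import requests

API_KEY = "YOUR_API_KEY"  # assumption: supplied via environment/config in practice
METADATA_URL = "https://maps.googleapis.com/maps/api/streetview/metadata"
IMAGE_URL = "https://maps.googleapis.com/maps/api/streetview"


def fetch_street_view(lat, lon, out_path):
    """Save a facade image for (lat, lon); return False if no imagery exists there."""
    params = {
        "location": f"{lat},{lon}",
        "size": "640x640",    # maximum size on the standard tier
        "fov": 80,
        "pitch": 10,          # tilt slightly upward toward the facade
        "source": "outdoor",  # skip indoor panoramas
        "key": API_KEY,
    }
    # The metadata endpoint is free and reports whether imagery exists at this location.
    meta = requests.get(METADATA_URL, params=params, timeout=10).json()
    if meta.get("status") != "OK":
        return False
    image = requests.get(IMAGE_URL, params=params, timeout=10)
    image.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(image.content)
    return True


if __name__ == "__main__":
    fetch_street_view(40.6892, -74.0445, "example_facade.jpg")
```

Checking the metadata endpoint first keeps the pipeline from burning paid requests on parcels with no street-level coverage.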
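For the third step, the FFHAG estimate matters because damage depends on flood depth relative to the first floor rather than the ground. The sketch below interpolates a generic depth-damage curve and contrasts a default first-floor assumption with a parcel-specific FFHAG estimate; the curve values are illustrative placeholders, not actual Hazus coefficients.

```python
# Minimal sketch: how a parcel-level FFHAG estimate changes a depth-damage
# calculation. The depth-damage curve below is an illustrative placeholder,
# not actual Hazus coefficients.
import numpy as np

# Depth above the first floor (ft) -> fraction of building value damaged.
DEPTH_FT = np.array([-2.0, 0.0, 1.0, 2.0, 4.0, 8.0])
DAMAGE_FRACTION = np.array([0.00, 0.05, 0.15, 0.25, 0.40, 0.60])


def damage_fraction(ground_flood_depth_ft, ffhag_ft):
    """Interpolate damage using flood depth measured relative to the first floor."""
    depth_above_first_floor = ground_flood_depth_ft - ffhag_ft
    return float(np.interp(depth_above_first_floor, DEPTH_FT, DAMAGE_FRACTION))


# Example: a 3 ft flood at ground level, comparing a generic 1 ft first-floor
# assumption with a parcel-specific 4 ft FFHAG estimate.
print(damage_fraction(3.0, ffhag_ft=1.0))  # 0.25 under the generic assumption
print(damage_fraction(3.0, ffhag_ft=4.0))  # 0.025 when the first floor is elevated
```

Running the same flood scenario with default versus estimated first-floor heights is one way to quantify how much the parcel-level FFHAG data shift the Hazus-style damage estimates.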
Deliverables
- Prototype of an automated system that retrieves Google Street View images for a set of geocoded addresses and estimates First Floor Height Above Ground (FFHAG) and basement presence using machine learning.
- Technical report documenting the system architecture, data sources, model performance, and comparison with standard FEMA Hazus-based risk outputs.
- Web-based interface or notebook demonstrating the system's application to a pilot neighborhood, highlighting usability for non-research users.
Data Sources
All required data are open-access or accessible through standard APIs, and no proprietary or sponsor-provided data are needed. The main data sources include:
- Google Street View imagery, accessed through the Google Maps API, for visual input.
- Geocoded building address lists, which can be obtained from open NYC datasets such as PLUTO (Primary Land Use Tax Lot Output); a retrieval sketch follows this list.
- Building Elevation and Subgrade (BES) dataset published by NYC, used as a ground-truth reference for model training and validation.
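For the geocoded address list, PLUTO records can be pulled from the NYC Open Data portal through its Socrata (SODA) API. The dataset identifier and column names in the sketch below are assumptions that should be verified against the current PLUTO release on data.cityofnewyork.us; the query filters one community district and keeps only the fields needed for image retrieval.

```python
# Minimal sketch: pull a geocoded address list from PLUTO through the NYC Open
# Data (Socrata SODA) API. The dataset id and column names are assumptions to
# verify against the current PLUTO release on data.cityofnewyork.us.
import requests

PLUTO_ENDPOINT = "https://data.cityofnewyork.us/resource/64uk-42ks.json"  # assumed PLUTO dataset id


def fetch_addresses(community_district, limit=1000):
    """Return parcel records (bbl, address, borough, lat/lon) for one community district."""
    params = {
        "$select": "bbl,address,borough,latitude,longitude",
        "$where": f"cd = {community_district}",  # column name/type may differ by release
        "$limit": limit,
    }
    response = requests.get(PLUTO_ENDPOINT, params=params, timeout=30)
    response.raise_for_status()
    return response.json()


if __name__ == "__main__":
    rows = fetch_addresses(301)  # e.g., Brooklyn Community District 1
    print(len(rows), "parcels retrieved")
```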