How the scores are built
Each risk score is grounded in public federal data, normalized to a 0–100 scale, and translated into plain-English buyer guidance. This document describes our data sources, normalization approach, risk banding definitions, and the editorial layer that sits above the algorithmic scores.
Our foundational principle
We do not generate proprietary risk models. We aggregate, normalize, and translate existing federal government datasets into buyer-readable intelligence. Every score on this platform has a direct lineage to a named federal source — FEMA, NOAA, USGS, USDA, or EPA — and we expose that lineage transparently on every page.
This approach means our scores are only as accurate as the underlying federal data. We note known limitations in the Known Limitations section below. Where federal coverage is thin, we flag that explicitly rather than filling gaps with interpolation.
Primary Data Sources
National Risk Index (NRI)
The FEMA National Risk Index is the primary backbone of our scoring system. It provides county-level composite risk scores for 18 natural hazards, normalized to a 0–100 scale against the national distribution. We use NRI's Expected Annual Loss (EAL), Social Vulnerability (SoVI), and Community Resilience indices as the core score components in our city-level profiles.
Storm Events Database & Climate Normals
NOAA's Storm Events Database provides granular historical event records dating to 1950, including fatality counts, damage estimates, and geographic extent for every declared storm event in the United States. We use these records to validate and contextualize the FEMA NRI hazard scores and to populate our historical event timelines. NOAA Climate Normals provide the 30-year baseline temperature and precipitation context for our heat exposure analysis.
National Seismic Hazard Maps
The USGS National Seismic Hazard Maps provide probabilistic ground motion estimates at 2% probability of exceedance over 50 years — the standard for building code and insurance actuarial use. We use USGS peak ground acceleration (PGA) values to validate and supplement the FEMA NRI earthquake score where NRI coverage is thin.
Wildfire Hazard Potential (WHP)
The USDA Forest Service Wildfire Hazard Potential raster dataset provides 30-meter resolution wildfire risk coverage for the contiguous United States. WHP integrates fire weather, fuels, and terrain to produce a relative index of wildfire likelihood and intensity. We use WHP to supplement the FEMA NRI wildfire score, particularly for wildland-urban interface (WUI) assessments in the Western states.
EnviroAtlas & Air Quality Index
EPA's EnviroAtlas and Air Quality Index data provide environmental burden context for our heat and urban heat island analysis. We use EPA air quality data as a secondary signal for communities where extreme heat and air quality interact — particularly relevant for cities in the American Southwest and Southeast.
Scoring Process
- 01
Raw data ingestion
We pull source data from federal APIs and published datasets on a quarterly basis, storing retrieval timestamps with each record to maintain data provenance.
- 02
County-to-city mapping
FEMA NRI scores are reported at the county level. We map counties to cities using FIPS codes and population-weighted centroids, then validate against ZIP code–level overlays.
- 03
Score normalization (0–100)
Raw NRI composite values are already on a 0–100 national percentile scale. We preserve this scale for transparency. Supplemental datasets (USGS PGA, USDA WHP) are normalized to the same scale using national distribution bounds.
- 04
Hazard-specific banding
Each normalized score is classified into one of five bands (Minimal, Low, Moderate, High, Extreme) using percentile thresholds derived from the national FEMA NRI distribution.
- 05
Editorial review & annotation
City and state profiles receive editorial review by the Open Data Collective research team. The editorial notes layer is separate from the algorithmic score — it adds market context, insurance nuance, and buyer-relevant interpretation that raw numbers cannot express.
- 06
Address-level report generation
For paid reports, parcel-level data is joined against ZIP code aggregates, FEMA flood zone designations, and insurance market context to generate the 12-page address report.
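Steps 03 and 04 above can be sketched in a few lines. This is a minimal illustration, not our production pipeline: it assumes plain min-max scaling against national bounds, and the cut points in `BAND_CUTS` are hypothetical placeholders, since the actual thresholds are derived from the national FEMA NRI distribution. The names `normalize`, `band`, `BAND_CUTS`, and `BAND_NAMES` are illustrative only.

```python
from bisect import bisect_right

# Hypothetical cut points for illustration only. The real thresholds are
# percentile-based and derived from the national FEMA NRI distribution.
BAND_CUTS = [20.0, 40.0, 60.0, 80.0]
BAND_NAMES = ["Minimal", "Low", "Moderate", "High", "Extreme"]


def normalize(value: float, national_min: float, national_max: float) -> float:
    """Min-max scale a raw value to 0-100 against national bounds, clamped."""
    if national_max <= national_min:
        raise ValueError("national_max must exceed national_min")
    scaled = 100.0 * (value - national_min) / (national_max - national_min)
    return max(0.0, min(100.0, scaled))


def band(score: float) -> str:
    """Return the name of the band that contains a 0-100 score."""
    # bisect_right counts how many cut points are <= score, which is
    # exactly the index of the band the score falls into.
    return BAND_NAMES[bisect_right(BAND_CUTS, score)]
```

For example, a supplemental value of 0.5 with assumed national bounds of 0.0 and 2.0 normalizes to 25.0, which falls in the "Low" band under these placeholder cut points.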
Risk Band Definitions
Update Cadence
FEMA updates NRI scores annually. We ingest the new release within 30 days and update all city and state profiles.
NOAA updates the Storm Events database continuously. We pull quarterly snapshots and refresh historical event timelines.
USGS updates national seismic hazard maps periodically. We monitor for new releases and update on publication.
USDA Forest Service publishes updated WHP rasters annually. We integrate each annual release.
How to read a score
From 0–100 to national percentile
Per the 2026 product audit, every public surface presents scores as national-percentile phrases rather than raw numbers. Here are the exact conversion math and the band definitions.
Conversion formula
Our internal scoring convention is "higher = riskier" on a 0–100 scale, where 100 represents the most at-risk properties in the United States and 0 represents the least. Because raw numbers are hard to interpret, every public surface converts the raw score to a national-percentile phrase before display.
Conversion math: percentile = round(100 - score), clamped to [0, 100]
Here, percentile is the N in the "Top N% nationally" phrase.
For data sourced from FEMA NRI (the National Risk Index), the underlying field (RISK_SCORE, RFLD_RISKP, HRCN_RISKP, etc.) is already a national percentile, so the score needs no calibration before conversion. For parcel-level sources (FEMA NFHL, USDA Wildfire Hazard Potential, USGS Design Maps) we calibrate the synthetic 0–100 score against the NRI distribution of the same hazard: a property that scores 80 on a parcel-level flood query sits in roughly the same relative position as a county whose NRI flood percentile is 80. The calibration table is reproduced in the L3 audit appendix.
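The conversion math can be written as a short function. This is a sketch under stated assumptions: the function name `top_percent` is illustrative, and it uses Python's built-in `round`, which rounds exact halves to the nearest even integer. The rounding mode used in production is not specified here.

```python
def top_percent(score: float) -> int:
    """Convert a 0-100 'higher = riskier' score to the N in 'Top N% nationally'."""
    # Assumption: Python's built-in round(), which sends exact halves to the
    # nearest even integer (for example, round(12.5) is 12).
    percentile = round(100 - score)
    # Clamp so out-of-range inputs still yield a value in [0, 100].
    return max(0, min(100, percentile))
```

A score of 87 gives 13, and out-of-range inputs such as 120 or -5 are clamped to 0 and 100 respectively.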
A property that scores 87 is in the top 13% of risk nationally; that is, 87% of US properties are less at risk. We surface that as “Top 13% nationally” so users see context, not a verdict on the property itself.
Percentile bands shown to users
| Score range | Phrase displayed |
|---|---|
| 95–100 | Top 5% nationally |
| 90–94 | Top 10% nationally |
| 85–89 | Top 15% nationally |
| 80–84 | Top 20% nationally |
| 75–79 | Top 25% nationally |
| 60–74 | Above the national median (top 40%) |
| 40–59 | Around the national median |
| 20–39 | Below the national median |
| 0–19 | Bottom 20% nationally |
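The table above maps directly onto a lookup keyed by each band's lower bound. The sketch below is illustrative: the names `PHRASE_BANDS` and `phrase_for` are hypothetical, and assigning fractional scores (such as 94.5) to the band whose lower bound they meet is an assumption, since the table lists whole-number ranges only.

```python
# (lower bound, phrase) pairs copied from the table, highest band first.
PHRASE_BANDS = [
    (95, "Top 5% nationally"),
    (90, "Top 10% nationally"),
    (85, "Top 15% nationally"),
    (80, "Top 20% nationally"),
    (75, "Top 25% nationally"),
    (60, "Above the national median (top 40%)"),
    (40, "Around the national median"),
    (20, "Below the national median"),
    (0, "Bottom 20% nationally"),
]


def phrase_for(score: float) -> str:
    """Return the display phrase for a 0-100 score."""
    clamped = max(0.0, min(100.0, score))
    # Assumption: a fractional score belongs to the first band whose lower
    # bound it meets or exceeds.
    for lower_bound, phrase in PHRASE_BANDS:
        if clamped >= lower_bound:
            return phrase
    raise AssertionError("unreachable: the last band starts at 0")
```

For instance, `phrase_for(62)` returns "Above the national median (top 40%)".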
Internal band name → user-facing label
Our pipeline keeps the internal bucket name (for analytics and color bucketing) and translates it to a softer phrase on every public surface. The 2026 product audit specifically called out “extreme” and “high” as too strong for end users; we re-render them as “Significant” and “Substantial” so the word choice doesn't imply property-level fault.
| Internal (analytics) | Public-facing (UI/PDF/email) | Why |
|---|---|---|
| extreme | Significant | Internal bucket name preserved for analytics; 'Significant' is what the user sees in copy. |
| high | Substantial | Same — softened to avoid implying property-level fault. |
| moderate | Moderate | Unchanged. |
| low | Low | Unchanged. |
| minimal | Minimal | Unchanged. |
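The translation in the table amounts to a fixed mapping applied at render time. A minimal sketch follows, assuming the internal names are stored as lowercase strings; `PUBLIC_LABELS` and `public_label` are hypothetical names, not identifiers from our pipeline.

```python
# Internal analytics bucket -> label shown in UI, PDF, and email copy.
PUBLIC_LABELS = {
    "extreme": "Significant",
    "high": "Substantial",
    "moderate": "Moderate",
    "low": "Low",
    "minimal": "Minimal",
}


def public_label(internal_band: str) -> str:
    """Translate an internal band name to its public-facing label."""
    # Normalizing case and whitespace is an assumption for robustness.
    # Unknown band names raise KeyError rather than leaking into public copy.
    return PUBLIC_LABELS[internal_band.strip().lower()]
```

Keeping the mapping in one place means the analytics layer can keep using the internal names while every public surface renders the softer wording.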
Known Limitations
Transparency about what our data cannot do is as important as what it can.
County-level resolution for many scores
FEMA NRI is a county-level dataset. City and neighborhood-level scores inherit county-level risk values, which can mask significant intra-county variation. Address-level reports attempt to refine this using ZIP code overlays, but county-level averaging remains a structural constraint.
Future climate change not fully integrated
Federal datasets reflect historical and current baseline risk. Long-term climate change projections (e.g., sea level rise, increased wildfire frequency) are noted editorially but are not embedded in the primary NRI scores.
Insurance market data is indicative, not quoted
All insurance premium ranges on this platform are estimated ranges based on publicly available industry data and editorial research. They are not insurer quotes. Actual premiums depend on property-specific factors, underwriting criteria, and carrier availability.
This is not insurance, financial, or legal advice
Risk Before Buy is an educational due-diligence tool. Nothing on this platform constitutes professional insurance advice, financial advice, or legal counsel. Consult licensed professionals before making any real estate transaction.