How AI Scores Roof Condition From Satellite Imagery (Full Guide)
A non-fluffy explainer of how vision AI scores roof condition from satellite imagery — what it sees, what it misses, and what makes one tool better than another.
When a roofer hears "AI-powered roof scoring," they usually picture one of two things: either a magic box that's smarter than a 30-year roofer (unrealistic) or a marketing gimmick wrapping basic image filtering (often true). The truth is in the middle, and it's more useful to know exactly what's under the hood than to take any vendor's word for it.
I'm going to walk through, end to end, how modern vision AI actually scores roof condition. What signals it picks up. What it can't see. How accuracy is measured. And what makes one tool's output meaningfully better than another's. Then you'll have a real framework for evaluating any "AI roofing" tool — including Roofbird, which I'll be transparent about throughout.
The simplest possible explanation
Vision AI is a pattern recognizer. You give it a labeled training set — thousands of satellite roof images that human experts have already scored 0-10 for condition, tagged with specific damage types, and assigned an age band — and it learns the visual patterns that correspond to each label.
Once trained, you can give the AI a new satellite image it's never seen, and it'll output:
- A condition score (0-10, where higher means more wear)
- A confidence level (low / medium / high)
- A list of detected damage signals
- An estimated age band
- A replacement-likelihood prediction
That's the whole loop. The interesting question is what each step looks like in practice — and where each step can go wrong.
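As a rough sketch, the output for one property could be held in a record like the one below. The class name, field names, and example values are illustrative assumptions, not any vendor's actual schema; only the five fields themselves come from the list above.

```python
from dataclasses import dataclass, field
from typing import List

LEVELS = ("low", "medium", "high")


@dataclass
class RoofScore:
    """One scored property. All names here are hypothetical."""
    condition_score: float                     # 0-10
    confidence: str                            # "low" | "medium" | "high"
    damage_signals: List[str] = field(default_factory=list)
    age_band: str = "unknown"                  # e.g. "20-25 years"
    replacement_likelihood: str = "low"        # "low" | "medium" | "high"

    def __post_init__(self) -> None:
        # Reject values outside the ranges described in the text.
        if not 0 <= self.condition_score <= 10:
            raise ValueError("condition_score must be between 0 and 10")
        if self.confidence not in LEVELS:
            raise ValueError("confidence must be low, medium, or high")
        if self.replacement_likelihood not in LEVELS:
            raise ValueError("replacement_likelihood must be low, medium, or high")


example = RoofScore(
    condition_score=7.5,
    confidence="high",
    damage_signals=["algae_streaking", "missing_tabs"],
    age_band="20-25 years",
    replacement_likelihood="high",
)
print(example)
```

Validating ranges at construction time means a malformed model response fails loudly instead of quietly landing in a lead list.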
Step 1: Image acquisition (and why this matters more than the model)
The single biggest variable in scoring accuracy isn't the AI model. It's the imagery feeding it.
Most "AI roofing" tools pull overhead imagery (satellite or aerial) from one of three sources:
Google Maps Static API (free or near-free, what most newer tools use):
- Resolution: ~5-15cm per pixel at maximum zoom (zoom 20)
- Freshness: varies wildly by region — Dallas might be 6 months old, rural Texas 3 years old
- Coverage: global, US-comprehensive
- Pricing: $2 per 1000 image requests
Nearmap (paid, what most commercial inspection tools use):
- Resolution: ~7cm per pixel, sometimes finer
- Freshness: 3-6 month refresh in major metros
- Coverage: US + a few international markets
- Pricing: enterprise contracts, $$$$
EagleView / Vexcel (paid, the most expensive option):
- Resolution: under 5cm per pixel, near-survey quality
- Freshness: variable
- Coverage: US-comprehensive
- Pricing: per-property or annual subscriptions
Why this matters: if your tool is scoring a roof from a 2-year-old Google image, you might be looking at the previous owner's roof. We've all seen the case where Google imagery still shows a tarp on a roof that was replaced 18 months ago.
When evaluating any AI roofing tool, ask which imagery provider they use, and what the typical age of imagery is in your service area. A model trained on perfect labels but fed stale or low-resolution imagery will be systematically wrong.
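For a quick sanity check on resolution claims, the nominal size of one map pixel can be computed from latitude and zoom level. This is a minimal sketch using the standard Web Mercator ground-resolution formula for 256-pixel tiles; the function name and the Dallas latitude are my own choices for illustration. The result is the size of a map pixel, not a guarantee of how much detail the underlying capture holds.

```python
import math

# Metres per pixel at the equator at zoom 0 (Web Mercator, 256-px tiles).
EQUATOR_METERS_PER_PIXEL_ZOOM0 = 156543.03392


def ground_resolution_cm(latitude_deg: float, zoom: int) -> float:
    """Approximate ground distance covered by one map pixel, in centimetres."""
    meters = (
        EQUATOR_METERS_PER_PIXEL_ZOOM0
        * math.cos(math.radians(latitude_deg))
        / (2 ** zoom)
    )
    return meters * 100


dallas_cm = ground_resolution_cm(32.78, zoom=20)
print(f"Zoom 20 near Dallas: about {dallas_cm:.1f} cm per pixel")
```

Each zoom level halves the pixel size, so the same roof at zoom 19 gets a quarter of the pixels it gets at zoom 20.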
Step 2: Feature extraction (what the AI looks for)
Modern vision AI for roof inspection looks at roughly 8-12 visual feature categories. The specific list varies by vendor; here's the standard set:
Material classification:
- Asphalt 3-tab vs. architectural vs. metal vs. tile vs. slate vs. wood
- Sub-classification within asphalt (luxury vs. premium vs. standard)
Age indicators:
- Color uniformity (faded vs. fresh)
- Granule loss progression
- Surface texture patterns (smooth = new, mottled = aged)
Damage signals:
- Missing tabs / shingles (visible underlayment)
- Curling and lifting (raised edges visible from above)
- Algae streaking (dark streaks running down the slope)
- Moss growth (visible green/dark patches)
- Tarp presence (bright unnatural colors)
- Hail-impact bruising (circular patterns, granule displacement)
- Patch repairs (color/texture inconsistency)
Structural features:
- Roof complexity (simple gable vs. multi-hip with valleys)
- Estimated square footage (computed from footprint)
- Slope/pitch (estimated from shadow analysis)
- Penetrations (vents, plumbing stacks, skylights — affects replacement complexity)
Context features:
- Neighbor replacement signals (newer/different roofs nearby)
- Tree overhang severity
- Surrounding land use (residential vs. commercial-adjacent)
Each feature is detected independently. A typical roof might trigger 3-7 of these signals simultaneously. The AI then needs to combine them into a coherent score.
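To make "detected independently" concrete, here's a minimal sketch of per-feature output and a filter that keeps only the signals detected with enough confidence. The feature names, the 0-1 confidence scale, and the 0.6 threshold are all assumptions made for this example.

```python
import json

# Hypothetical model output for one roof.
raw_output = """
{
  "material": {"value": "asphalt_architectural", "confidence": 0.93},
  "algae_streaking": {"value": true, "confidence": 0.88},
  "missing_tabs": {"value": true, "confidence": 0.97},
  "hail_bruising": {"value": true, "confidence": 0.41},
  "tarp_present": {"value": false, "confidence": 0.99}
}
"""


def triggered_signals(features: dict, min_confidence: float = 0.6) -> list:
    """Return names of boolean signals detected with enough confidence."""
    return sorted(
        name
        for name, feature in features.items()
        if feature["value"] is True and feature["confidence"] >= min_confidence
    )


features = json.loads(raw_output)
signals = triggered_signals(features)
print(signals)  # ['algae_streaking', 'missing_tabs']
```

The low-confidence hail detection is dropped here, but a real pipeline might keep it as a "verify on the ground" flag rather than discard it.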
Step 3: Scoring (combining features into one number)
This is where the vendor differentiation gets real.
Raw feature detection is largely commoditized. Most modern vision models can spot missing shingles or algae streaking with 85-95% accuracy. What separates tools is how the features get combined into a score and how that maps to a commercial action (is this worth knocking?).
Three approaches I've seen in the wild:
Approach A: Weighted feature sum
score = w1*granule_loss + w2*curl + w3*algae + ... + bias
Simple, transparent, but doesn't handle interactions well. A roof with mild granule loss AND mild curl is worse than one with either alone — but a linear sum doesn't capture that.
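Here's a small sketch of that limitation. The weights, bias, and 0-1 severity scale are made up for the example; a real tool would fit them to labeled data.

```python
# Hypothetical weights and bias for three of the features.
WEIGHTS = {"granule_loss": 3.0, "curl": 2.5, "algae": 1.5}
BIAS = 1.0


def weighted_score(features: dict) -> float:
    """Linear score: bias plus weight * severity (0-1) for each feature."""
    total = BIAS + sum(WEIGHTS[name] * features.get(name, 0.0) for name in WEIGHTS)
    return max(0.0, min(10.0, total))  # clamp to the 0-10 scale


granule_only = weighted_score({"granule_loss": 0.3})
curl_only = weighted_score({"curl": 0.3})
both = weighted_score({"granule_loss": 0.3, "curl": 0.3})

# In a linear model, the combined increase over the bias is exactly the
# sum of the two separate increases. There is no extra penalty for both.
combined_increase = both - BIAS
separate_increases = (granule_only - BIAS) + (curl_only - BIAS)
print(combined_increase, separate_increases)
```

Capturing "both together is worse" would need an explicit interaction term (for example a weight on `granule_loss * curl`), which is exactly what the other two approaches learn on their own.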
Approach B: End-to-end classifier
Train a model directly on (image → 0-10 score) pairs. Better at interactions but a black box. Hard to explain why a specific roof scored what it did.
Approach C: Hybrid (what most modern tools use)
Use a vision model to extract structured features (Step 2), then a second model (often simpler, like a gradient-boosted tree) to combine features into a final score. The features are auditable, the combination is learned from data, and the score can be explained in roofer-language ("scored high because of algae + 20+ year age + missing tabs").
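A toy sketch of the two-stage shape looks like this. Both stages are stand-ins: the first returns hard-coded features instead of running a vision model, and the second uses hand-written rules with made-up point values where a real tool would use a learned model such as a gradient-boosted tree. The function names are hypothetical.

```python
def extract_features(image_id: str) -> dict:
    """Stage 1 stand-in: a real system would run a vision model here."""
    # Hard-coded so the example runs without a model or an image.
    return {"algae": True, "missing_tabs": True, "age_years": 22, "curl": False}


def combine(features: dict) -> tuple:
    """Stage 2 stand-in: hand-written rules in place of a learned model."""
    score = 2.0
    reasons = []
    if features["algae"]:
        score += 1.5
        reasons.append("algae")
    if features["age_years"] >= 20:
        score += 2.5
        reasons.append("20+ year age")
    if features["missing_tabs"]:
        score += 2.0
        reasons.append("missing tabs")
    if features["curl"] and features["missing_tabs"]:
        # An interaction: the pair counts for more than each alone.
        score += 1.0
        reasons.append("curl combined with missing tabs")
    return min(score, 10.0), reasons


score, reasons = combine(extract_features("example-roof"))
explanation = f"Scored {score:.1f} because of " + " + ".join(reasons)
print(explanation)  # Scored 8.0 because of algae + 20+ year age + missing tabs
```

Because the second stage only ever sees named features, every point in the score can be traced back to something a roofer can check on the roof.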
Roofbird uses Approach C. It's the trade-off that gives the best combination of accuracy and explainability — which matters because a roofer needs to defend the score to a homeowner.
Step 4: Replacement-likelihood prediction (the actually-useful number)
A 0-10 condition score tells you the roof's state. It doesn't tell you whether the homeowner is likely to replace.
The two diverge often:
- A roof with a vision condition score of 9 (severe wear) where the owner bought the house six months ago → low replacement likelihood (they have other priorities)
- A roof with a vision condition score of 5 (visible wear) where 3 neighbors just replaced and a hail event hit 14 days ago → high replacement likelihood
This is the layer where context matters more than the image itself:
- Roof age band + historical replacement-cycle data (asphalt lasts 20-25 years on average)
- Storm exposure (hail events in the last 12 months within 1 mile)
- Neighborhood replacement cascade (3+ adjacent properties with fresh roofs)
- Homeowner tenure (longer = more likely to invest)
- Zip-level willingness-to-pay (income/value bands)
The output of this layer is a 0-100 "buy probability" or "replacement likelihood" score. That's the number a roofer should actually filter on — not the raw condition score.
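Here's one way the context layer could be sketched: a logistic function over the condition score plus a few context signals. Every weight, cap, and parameter name below is invented for illustration; a production model would learn them from actual replacement outcomes and would use more of the signals listed above.

```python
import math


def replacement_likelihood(
    condition_score: float,          # 0-10, higher = more wear
    tenure_years: float,             # how long the current owner has been there
    recent_hail_within_1_mile: bool,
    neighbors_replaced: int,         # adjacent properties with fresh roofs
) -> int:
    """Return a 0-100 score from condition plus context (made-up weights)."""
    z = -5.0
    z += 0.35 * condition_score
    z += 0.15 * min(tenure_years, 15)      # tenure effect capped at 15 years
    z += 1.5 if recent_hail_within_1_mile else 0.0
    z += 0.6 * min(neighbors_replaced, 3)  # cascade effect capped at 3 homes
    return round(100 / (1 + math.exp(-z)))


# The two cases from the text: worse roof with a brand-new owner,
# versus a middling roof with hail and three neighbor replacements.
new_owner = replacement_likelihood(9, 0.5, False, 0)
storm_hit = replacement_likelihood(5, 8, True, 3)
print(new_owner, storm_hit)
```

With these illustrative weights, the roof in worse shape still ranks well below the storm-hit one, which is the divergence described above.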
For Roofbird's vision-only v1 (we're transparent that this is the simplification we made), we combine the AI's condition score with the AI's self-reported replacement-likelihood class (high/medium/low) plus a confidence multiplier. Future versions will incorporate storm + permit + neighborhood signals more deeply.
The signals AI gets right
In our internal validation against ground-truth data (roofers inspecting in person):
| Signal | Accuracy | Notes |
|---|---|---|
| Missing tabs / shingles | 98% | Very high contrast in satellite |
| Visible tarp | 99% | Bright blue is unmistakable |
| Material classification (asphalt vs. metal vs. tile) | 96% | Texture is distinctive |
| 3-tab vs. architectural asphalt | 88% | Harder distinction |
| Algae streaking presence | 94% | Clear in mid-resolution imagery |
| Granule loss (severity) | 85% | More subjective |
| Age band (within 5 years) | 80% | Hardest visual signal — relies on indirect cues |
| Neighborhood replacement cluster | 92% | Computed, not visual |
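When reading accuracy figures like these, sample size matters as much as the percentage. The sketch below computes a 95% Wilson score interval for an accuracy figure. The 192-of-200 input is an assumed example chosen to land on 96%, not a published count.

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.96) -> tuple:
    """Wilson score interval for an accuracy of correct/total (z=1.96 is 95%)."""
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    )
    return center - half_width, center + half_width


# Assumed example: 192 of 200 predictions matched the in-person inspection.
low, high = wilson_interval(192, 200)
print(f"Observed 96.0%, 95% interval {low:.1%} to {high:.1%}")
```

At that sample size, a 96% headline figure is consistent with a true accuracy anywhere from roughly 92% to 98%, which is why the scale of a vendor's validation is worth asking about.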
The signals AI gets wrong
Equally important to be honest about:
| Signal | Why AI misses | Workaround |
|---|---|---|
| Hail bruise depth | Granule displacement is visible from above, but bruise depth can only be confirmed on the roof | AI flags candidates, roofer verifies on ground |
| Mat damage under intact tabs | Hidden by the top layer | AI flags candidates, roofer verifies on ground |
| Soft spots / underlying decking | Not visible at all | In-person inspection required |
| Leak history | Not in any image | Homeowner conversation |
| Active warranty status | Not visible | County permit records |
| Solar panels skewing age estimate | Newer panels on older roofs read as "newer roof" | Tag solar separately |
The honest pitch any AI roofing tool should make: we screen the field of candidates so you can spend your in-person time on the right doors. Not "we replace the in-person inspection."
What makes one tool better than another
If you're evaluating "AI roofing" tools, here's the framework I'd use:
1. Imagery freshness in your specific service area. Ask the vendor: "what's the median age of the imagery in 75024 (or your top zip)?" If they don't know, they're not really thinking about this.
2. Feature extraction transparency. Can you see the specific damage signals the AI flagged for a given property? Or is the score a black box? Transparent tools build roofer trust faster.
3. Replacement-likelihood vs. condition score separation. Tools that only give you a condition number are missing the commercially important layer. The roofer's buying question is "will this house buy?" not "how bad is this roof?"
4. Ground-truth accuracy data. Has the vendor done any validation against in-person inspections? At what scale? If they can't share numbers, treat the scoring as a heuristic, not a measurement.
5. Coverage of your service area. Some tools focus on storm-belt markets, others on uniform-density coverage. Check before you sign.
6. Output usability. Can the tool generate door hangers, CRM exports, route plans? A great score that lives in a dashboard your reps don't open is a score that does nothing.
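For the first item in that list, the number you're asking the vendor for is simple to compute if they track capture dates. A minimal sketch, with made-up capture dates and an assumed "today" of June 1, 2026:

```python
from datetime import date
from statistics import median

# Hypothetical imagery capture dates for properties in one zip code.
capture_dates = [
    date(2025, 11, 3),
    date(2025, 2, 14),
    date(2024, 6, 30),
    date(2023, 9, 1),
]


def median_age_months(dates, today: date) -> float:
    """Median imagery age in months, using 30.44 days per month on average."""
    ages = [(today - captured).days / 30.44 for captured in dates]
    return median(ages)


age = median_age_months(capture_dates, today=date(2026, 6, 1))
print(f"Median imagery age: {age:.1f} months")
```

The median is a better ask than the average because one very old tile in a rural corner of the zip won't distort it.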
How Roofbird approaches it (transparent methodology)
Since I built one of these tools, here's exactly how Roofbird's stack works today (v1, June 2026):
- Imagery: Google Maps Static API at zoom 20 (5-10cm/pixel). We chose this over Nearmap to keep pricing under $199/month for small shops. Trade-off: imagery freshness varies (most metros within 12-18 months). We're adding Nearmap as a paid upgrade tier.
- Feature extraction: AI vision model, prompt-engineered for the 10 feature categories listed above. Outputs structured JSON with confidence per feature.
- Scoring: Vision-only v1, with the condition score mapped directly from AI output. We don't blend storm/age/tenure into the score yet (it's coming). Instead, those signals surface as tags on the property card so the roofer can weight them themselves.
- Replacement-likelihood: AI's self-reported high/medium/low class, modulated by its confidence level.
- Validation: ~200 ground-truth comparisons so far. Material classification 96% accurate, missing-shingles detection 98%, age estimation within ±4 years at 80%. We publish these numbers because they're the only honest way to set roofer expectations.
We're a year-old company; the methodology will get more sophisticated over time. The principle won't change: screen the field of candidates with AI so roofers spend in-person time on the right doors.
What's next: multi-modal scoring
The next 12 months in AI roof scoring will be about combining vision with non-visual data sources:
- Permit records (last replacement date directly)
- Insurance claim density (community-level storm exposure)
- Property records (home value, last sale, tenure)
- Weather data (specific event impacts, not just hail counts)
- Drone imagery (for high-value verification when satellite is too coarse)
The tools that integrate all five will outperform vision-only tools by a meaningful margin. We're working on this. So is every other serious player in the space.
The roofers who win in 2026-2027 will be the ones who learn to use these tools effectively — not the ones who wait for the technology to be perfect.
If you want to see exactly what the output looks like for one metro: Roofbird's DFW sample dashboard has 25 real, scored properties — 10 with full diagnostics unlocked. No signup to view the first 10. The trial loads 25 free scored leads in your own service area.
— Jake
Written by
Jake Thompson
Have a question about anything in this post? Reach the Roofbird team at support@roofbird.ai.