Albedo

Built at CMU TartanHacks

Solar energy analysis platform that loads photorealistic 3D city tiles for any address, uses a custom AI-powered pipeline to detect and mask windows, then ray traces a full sky-dome lighting model to generate per-surface irradiance heatmaps with estimated energy yield and financial savings.

Unity · C# · Computer Vision · Cesium 3D Tiles · Mapbox · GPU Ray Tracing · OpenCV

Albedo is a solar energy analysis platform that maps irradiance across real 3D building geometry — detecting windows with a Gemini LLM, simulating a full sky-dome lighting model, and producing per-surface heatmaps with energy yield and financial savings estimates. Give it an address; it handles the rest.

The Problem

Rooftop solar assessments today are manual, expensive, and coarse.

Traditional tools use flat 2D satellite imagery, ignore window placement, and treat buildings as uniform rectangles. They miss partial shading from adjacent structures, can't distinguish panel-viable surface from glass, and produce estimates that can differ widely from what the system actually generates once installed.

The result is that solar feasibility for a specific building — at the façade and window level — is either not computed at all, or requires a paid site assessment by a licensed installer.

Albedo removes that barrier.

The Pipeline

The system moves from an address to a rendered heatmap in eight stages:

  1. Address geocoding — Mapbox geocoder converts an address to lat/lon coordinates.
  2. 3D city load — Google Photorealistic 3D Tiles (via Cesium for Unity) or Mapbox SDK stream the building geometry. A single building is isolated from the tile stream using OBB frustum filtering.
  3. Multi-view capture (5 shots) — Orthographic cameras render the building from 4 cardinal directions and top-down. Each render is saved as a JPEG alongside a JSON file recording camera position, rotation, and FOV.
  4. AI window detection — All 5 images are sent concurrently to Gemini 2.5 Flash with a prompt instructing it to recolor every exterior window to pure green (#00FF00) and leave everything else unchanged. The LLM handles reflections, occlusion, and perspective.
  5. Mask generation — HSV thresholding converts each Gemini-edited image to a binary mask. Green pixels become white (window); everything else becomes black (usable surface). Optional morphological cleanup removes jagged edges.

Window segmentation — original, Gemini-masked, and 3D mesh projection

  6. Mesh projection — The saved camera pose reconstructs the original view frustum. For each white pixel in the mask, a ray is cast through the view-projection matrix onto the building mesh, tagging intersected triangles as window geometry. Coarse Cesium polygons are subdivided 6–12× for pixel-level spatial accuracy.
  7. Sky-dome lighting simulation — Hundreds of directional lights are distributed across a hemisphere. Surface points on the building mesh receive rays from each light; window geometry is excluded via Unity layer culling. Irradiance accumulates per surface point.
  8. Heatmap + analytics UI — Irradiance is color-mapped blue → red and rendered onto the building. The UI displays avg/min/max irradiance, estimated annual kWh, financial savings, and window film ROI.
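
The sky-dome lighting and heatmap stages can be illustrated with a minimal Python sketch (the project itself runs them in Unity with DirectionalLights). Everything here is an assumption made for illustration: the Fibonacci-spiral light placement, the uniform sky radiance, the simple linear blue-to-red ramp, and the hypothetical is_occluded callback that stands in for a shadow test against neighboring geometry. The names sky_dome_directions, irradiance, and heat_color are not from the project.

```python
import math


def sky_dome_directions(n):
    """Return n unit vectors spread over the upper hemisphere (y is up).

    A Fibonacci spiral is used here as an assumed stand-in for however
    the project actually places its directional lights.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    dirs = []
    for i in range(n):
        y = (i + 0.5) / n           # uniform steps in height give uniform area coverage
        r = math.sqrt(1.0 - y * y)  # radius of the horizontal circle at that height
        phi = i * golden_angle
        dirs.append((r * math.cos(phi), y, r * math.sin(phi)))
    return dirs


def irradiance(normal, directions, is_occluded=lambda d: False, sky_radiance=1.0):
    """Average the cosine-weighted light arriving from every unoccluded direction.

    is_occluded is a hypothetical callback: it should return True when a ray
    from the surface point toward direction d hits other geometry.
    """
    total = 0.0
    for d in directions:
        cos_theta = sum(a * b for a, b in zip(normal, d))
        if cos_theta > 0.0 and not is_occluded(d):
            total += sky_radiance * cos_theta
    return total / len(directions)


def heat_color(t):
    """Map t in [0, 1] to an (r, g, b) tuple running from blue (0) to red (1)."""
    t = min(1.0, max(0.0, t))
    return (t, 0.0, 1.0 - t)
```

In these relative units an unshaded, upward-facing point averages 0.5 and an unshaded vertical wall roughly 0.25; occlusion from neighboring buildings only lowers those values.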

Tech Stack

  • 3D engine: Unity 2022 (C#)
  • Geospatial data: Cesium for Unity (Google Photorealistic 3D Tiles), Mapbox Unity SDK (vector tiles, geocoding)
  • AI vision: Google Gemini 2.5 Flash Image API — LLM-driven semantic segmentation
  • Image processing: OpenCV + NumPy + Pillow (Python) for offline mask generation; HSV thresholding in Unity for real-time
  • Rendering: Custom GLSL projective texture shader, RenderTexture + Graphics.Blit, DirectionalLight sky dome with layer culling masks
  • Physics/geometry: PhysX MeshColliders, custom raycasting pipeline, adaptive triangle subdivision, Unity.Mathematics
  • Networking: UnityWebRequest with concurrent HTTP coroutines (all 5 Gemini calls fire in parallel)
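
The HSV thresholding named under image processing can be sketched in a few lines of Python. This version uses the standard-library colorsys module in place of OpenCV so that it runs without dependencies. The function name green_mask and the tolerance values are assumptions chosen for illustration, not the thresholds the project uses.

```python
import colorsys


def green_mask(pixels, hue_center=1 / 3, hue_tol=0.06, min_sat=0.6, min_val=0.6):
    """Convert rows of (r, g, b) pixels (0-255) into a 0/255 binary mask.

    A pixel is marked 255 (window) when its hue is close to pure green and
    it is both saturated and bright; all other pixels become 0.
    The default thresholds are illustrative assumptions.
    """
    mask = []
    for row in pixels:
        out = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            is_green = abs(h - hue_center) <= hue_tol and s >= min_sat and v >= min_val
            out.append(255 if is_green else 0)
        mask.append(out)
    return mask
```

Because the model is asked to paint windows in a single saturated color, a narrow hue band with saturation and brightness floors is enough to separate them from the rest of the image; the tolerance mainly absorbs JPEG compression noise.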

Challenges & Solutions

  • Projecting 2D masks onto coarse 3D geometry: Google Photorealistic 3D Tiles use large baked-texture polygons where one triangle can cover an entire building face. A 1:1 pixel-to-triangle raycast produces terrible spatial resolution. We implemented adaptive triangle subdivision (depth 0–24) before raycasting — tessellating each large polygon into sub-triangles sized to match the mask's pixel density. The camera pose JSON makes this deterministic without keeping the capture camera alive.
  • Reliable window detection across building styles: Classical edge detection and color segmentation fail on reflective glass, curved façades, and varying sun angles. Instead of training a segmentation model, we used Gemini as a semantic oracle: asking it for a precise output format (#00FF00 for windows) lets us use dead-simple HSV thresholding downstream. The LLM handles the hard problem; the threshold handles the easy one. Only the prompt needs to change for a different detection target (e.g., marking existing solar panels in a different color).
  • Isolating a single building from a streaming tile dataset: Cesium has no “get me this building” API — tiles load lazily as the camera moves. We pause tile streaming after initial load, construct a tight AABB from the target lat/lon, and filter all loaded mesh objects by centroid. MeshColliders are baked post-isolation since Cesium tiles arrive without physics geometry.
  • Light culling mask inheritance in Unity: Unity DirectionalLights affected window overlays even when culled because child objects inherit parent layer assignments. Fix: assign all window meshes to a dedicated WindowLayer, exclude it from every DirectionalLight's culling mask bitmask, and call Physics.IgnoreLayerCollision to prevent spurious raycast hits from the lighting sim.
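
The adaptive subdivision described in the first challenge can be sketched as a recursive midpoint split. This is a minimal Python illustration, not the project's Unity code: the function name subdivide, the max_edge stopping rule (a stand-in for "sub-triangles sized to match the mask's pixel density"), and the default depth cap are all assumptions.

```python
import math


def _midpoint(a, b):
    """Return the point halfway between two (x, y, z) vertices."""
    return tuple((p + q) / 2.0 for p, q in zip(a, b))


def subdivide(tri, max_edge, max_depth=6, depth=0):
    """Split a triangle into four at its edge midpoints until small enough.

    tri is a tuple of three (x, y, z) vertices. Recursion stops when the
    longest edge is at most max_edge or when max_depth is reached.
    Child triangles keep the winding order of the parent.
    """
    a, b, c = tri
    longest = max(math.dist(a, b), math.dist(b, c), math.dist(c, a))
    if longest <= max_edge or depth >= max_depth:
        return [tri]
    ab, bc, ca = _midpoint(a, b), _midpoint(b, c), _midpoint(c, a)
    out = []
    for child in ((a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)):
        out.extend(subdivide(child, max_edge, max_depth, depth + 1))
    return out
```

Under these assumptions, max_edge would be set to roughly the world-space footprint of one mask pixel, so a face-sized triangle is split heavily while triangles that are already small pass through untouched.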

Outcome

Albedo showing irradiance heatmap on a NYC skyscraper with energy insights panel

895 MWh/year estimated for a Midtown Manhattan high-rise

Albedo shows that LLMs can serve as preprocessing oracles for computer vision pipelines, not just as chat interfaces. By combining real-world 3D tile data, AI semantic segmentation, and physically grounded lighting simulation, it makes solar feasibility analysis fast and building-specific for any address covered by the 3D tile data.

Built at Carnegie Mellon's TartanHacks hackathon.