An interactive notebook for practitioners in Built Environment Intelligence and data visualisation.
# BIM Data Visualisation 101: Introduction to data reporting
**SUMMARY**
This notebook demonstrates seven browser-based visualisation techniques for BIM data, from element distribution and issue trends to system dependencies, so that coordination teams can turn tens of thousands of model elements into actionable intelligence without servers or server-side databases. Using Observable Plot, Chart.js, D3.js, and DuckDB-Wasm, it shows how the same normalised data pipeline that powers human-readable dashboards can feed machine-readable APIs for agentic BIM assistants. The patterns scale from project dashboards to live digital twin interfaces, with an in-browser SQL editor for ad-hoc exploration.
A modern BIM model can contain tens of thousands of elements, hundreds of tracked issues, and compliance requirements spanning multiple ISO standards. Most of that data lives in spreadsheets or static PDF reports — useful for record-keeping, invisible to decision-makers.
This notebook demonstrates seven visualisation techniques, all running **live in the browser** with no server, no build step, and no database server. Every chart draws from construction-domain data: element counts, issue trends, quality metrics, and system dependencies, with an ad-hoc SQL editor alongside. The same patterns scale from project dashboards to digital twin interfaces.
An agentic co-worker can read the chart configurations, iterate on them, and let new data flow into the system.
[1] Why Visualise BIM Data?
Consider a federated model with 12,000 elements across six disciplines, 240 open issues in a BCF tracker, and a BIM Execution Plan requiring LOD 350 compliance at design freeze. A spreadsheet can hold all of it. A human cannot read all of it.
Visualisation turns volume into signal. A bar chart reveals that one discipline accounts for 60 % of the model's geometry but only 5 % of the issues — is that good coverage or a data gap? A time-series line exposes whether issues are being resolved faster than they are raised. A scatter plot shows which projects consistently underperform relative to their size.
These are the questions that coordination meetings should answer in seconds, not hours. The same structured data that feeds a chart can feed an MCP tool or an LLM-powered assistant — visualisation is the human interface, structured queries are the machine interface, and both consume the same intelligence pipeline.
In the operational phase, these patterns become the foundation of a digital twin: live sensor data replaces static exports, but the visualisation grammar stays the same.
[2] The Intelligence Stack
The charts in this notebook follow a three-tier architecture common to BIM reporting and business intelligence:
**Ingestion** — Raw data enters the pipeline as IFC geometry, BCF issues, CSV exports, or API responses. In this notebook, the data is embedded as JSON to keep everything self-contained.
**Intelligence** — Data is normalised into consistent views: disciplines get canonical names, dates align to ISO weeks, severity levels map to a controlled vocabulary. This is where business rules live — the same normalisation layer that powers a coordination report can feed a digital twin dashboard.
**Presentation** — Normalised data is rendered through chart components. Each section in this notebook is a self-contained card: a prose introduction, a chart container, and interpretive notes. The page structure is declared in TOML; the rendering engine dispatches each card to the right template.
This mirrors how professional BIM reporting works at scale: structured data in, templated output out. The only difference is that here, the entire pipeline runs in the browser.
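The Intelligence tier described above can be sketched in a few lines. This is a minimal illustration, not the notebook's actual normalisation code: the alias table, the `raised_on` field, and the function names are all assumptions made for the example.

```javascript
// Hypothetical alias table: raw discipline labels mapped to canonical names.
const DISCIPLINE_ALIASES = {
  arch: "Architectural",
  architectural: "Architectural",
  str: "Structural",
  structural: "Structural",
  mep: "MEP",
  "m&e": "MEP",
};

// Controlled severity vocabulary, matching the issue dataset's column values.
const SEVERITIES = ["Critical", "Major", "Minor", "Info"];

function canonicalDiscipline(raw) {
  const key = String(raw).trim().toLowerCase();
  return DISCIPLINE_ALIASES[key] ?? "Unknown";
}

// Returns null for values outside the vocabulary so the caller can flag them.
function canonicalSeverity(raw) {
  const key = String(raw).trim().toLowerCase();
  return SEVERITIES.find((s) => s.toLowerCase() === key) ?? null;
}

// Converts a YYYY-MM-DD date to an ISO 8601 week label such as "2026-W11".
function isoWeek(dateString) {
  const d = new Date(`${dateString}T00:00:00Z`);
  const weekday = d.getUTCDay() || 7; // Monday = 1 ... Sunday = 7
  // Move to the Thursday of the same ISO week; its year is the ISO year.
  d.setUTCDate(d.getUTCDate() + 4 - weekday);
  const isoYear = d.getUTCFullYear();
  const yearStart = Date.UTC(isoYear, 0, 1);
  const week = Math.ceil(((d.getTime() - yearStart) / 86400000 + 1) / 7);
  return `${isoYear}-W${String(week).padStart(2, "0")}`;
}

// `raised_on` is an assumed field name for the date an issue was raised.
function normaliseIssue(raw) {
  return {
    discipline: canonicalDiscipline(raw.discipline),
    severity: canonicalSeverity(raw.severity),
    week: isoWeek(raw.raised_on),
  };
}
```

Keeping each rule in its own small function means the same rules can be reused by a coordination report and a dashboard without duplicating logic.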
[3] Ad-Hoc Exploration — In-Browser SQL
**The question:** What if you could query model data with SQL — right here, right now, with no server?
Before diving into the charts, try the data yourself. Below is a fully interactive SQL engine running **in the browser** via [DuckDB-Wasm](https://duckdb.org/docs/api/wasm/overview.html) — the same analytical database used in data engineering pipelines, compiled to WebAssembly so it runs client-side with zero infrastructure.
The dataset contains **200 BIM issues** with six columns:
| Column | Type | Example |
|--------|------|---------|
| `id` | INTEGER | 1, 2, ... 200 |
| `discipline` | VARCHAR | Architectural, Structural, MEP, ... |
| `severity` | VARCHAR | Critical, Major, Minor, Info |
| `status` | VARCHAR | Open, In Progress, Resolved, Closed |
| `element_type` | VARCHAR | Wall, Beam, Duct, Pipe, ... |
| `days_open` | INTEGER | 0–120 (how long the issue has been active) |
Use the **preset buttons** to explore common query patterns, or write your own SQL in the editor. Press **Run** (or `Ctrl+Enter` / `Cmd+Enter`) to execute. Every query runs entirely in your browser — nothing leaves your machine.
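Before rows reach the SQL engine, it can help to check them against the six-column schema above. The following is a sketch under assumptions: the validator names are illustrative, and the only rules enforced are the ones the table itself states (types and the listed severity and status values).

```javascript
// One predicate per column, derived from the schema table.
const ISSUE_SCHEMA = {
  id: (v) => Number.isInteger(v) && v >= 1,
  discipline: (v) => typeof v === "string" && v.length > 0,
  severity: (v) => ["Critical", "Major", "Minor", "Info"].includes(v),
  status: (v) => ["Open", "In Progress", "Resolved", "Closed"].includes(v),
  element_type: (v) => typeof v === "string" && v.length > 0,
  days_open: (v) => Number.isInteger(v) && v >= 0,
};

// Returns the names of the columns that fail validation (empty array = valid row).
function validateIssue(row) {
  return Object.entries(ISSUE_SCHEMA)
    .filter(([column, isValid]) => !isValid(row[column]))
    .map(([column]) => column);
}
```

A row that passes returns an empty array; a row with a misspelt severity returns `["severity"]`, which is easier to act on than a silent type coercion inside the database.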
### Learning by Example
Each preset button above demonstrates a different SQL pattern. Here is what they teach and how to extend them:
```sql
-- BASIC SELECT: Retrieve raw rows from the table.
-- LIMIT caps the output, which is essential when exploring unfamiliar data.
-- Try: change LIMIT 25 to LIMIT 100, or add WHERE discipline = 'MEP' before ORDER BY
SELECT * FROM issues ORDER BY id LIMIT 25;

-- GROUP BY + COUNT: Aggregate rows into summary statistics.
-- The bar chart in cell [4] applies the same grouping to element counts.
-- Try: replace COUNT(*) with AVG(days_open) to compare how long issues stay open
SELECT discipline, COUNT(*) AS total
FROM issues GROUP BY discipline ORDER BY total DESC;

-- PIVOT: Reshape rows into a cross-tabulation matrix.
-- DuckDB's PIVOT is a single statement; no subqueries needed.
-- This is the table that opens a design review meeting.
-- Try: change USING COUNT(*) to USING AVG(days_open) for time analysis
PIVOT issues ON severity USING COUNT(*) GROUP BY discipline;

-- WHERE + AND: Filter rows on multiple conditions.
-- Overdue open issues are the triage list for coordination meetings.
-- Try: change days_open > 30 to days_open > 60 for critical backlog
SELECT id, discipline, severity, element_type, days_open
FROM issues WHERE days_open > 30 AND status = 'Open'
ORDER BY days_open DESC;

-- ROUND + AVG: Compute averages with controlled decimal precision.
-- Reveals which severity levels stay open longest.
-- Try: add WHERE status = 'Resolved' before GROUP BY to see only resolved issues
SELECT severity, ROUND(AVG(days_open), 1) AS avg_days, COUNT(*) AS total
FROM issues GROUP BY severity ORDER BY avg_days DESC;
```
This is the same pattern that powers LLM-based BIM assistants: a structured dataset exposed through a query interface. Replace the text input with a natural-language prompt and an MCP tool, and you have an agentic data exploration workflow — the SQL is generated, not typed.
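To make the agentic variant concrete, here is a hedged sketch of how the query interface might be exposed as an MCP tool. The tool name `query_issues`, the `runQuery` callback, and the read-only guard are assumptions for illustration; the descriptor fields (`name`, `description`, `inputSchema`) and the result shape follow the MCP tool convention. The guard is a coarse filter for generated SQL, not a security boundary.

```javascript
// Hypothetical tool descriptor an assistant would see when listing tools.
const queryIssuesTool = {
  name: "query_issues",
  description: "Run a single read-only SQL query against the issues table.",
  inputSchema: {
    type: "object",
    properties: { sql: { type: "string" } },
    required: ["sql"],
  },
};

// Accepts one SELECT or PIVOT statement; rejects anything else.
// Conservative by design: a semicolon inside a string literal is also rejected.
function isReadOnly(sql) {
  const statement = sql.trim().replace(/;\s*$/, "");
  if (statement.includes(";")) return false;
  return /^(select|pivot)\b/i.test(statement);
}

// `runQuery` is an assumed async function that executes SQL and returns rows,
// for example a thin wrapper around a DuckDB-Wasm connection.
async function handleQueryTool(args, runQuery) {
  if (!isReadOnly(args.sql)) {
    return {
      isError: true,
      content: [{ type: "text", text: "Only a single SELECT or PIVOT statement is allowed." }],
    };
  }
  const rows = await runQuery(args.sql);
  return { content: [{ type: "text", text: JSON.stringify(rows) }] };
}
```

The assistant never touches the database directly: it proposes SQL, the guard screens it, and only then does the query run.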
[4] Element Distribution — Bar Chart
**The question:** What is the element breakdown across disciplines in a federated model — and where are the gaps?
A horizontal bar chart showing element counts by discipline. On a real project, this view is your first check after federation: does the structural model contribute the expected volume? Is the MEP count consistent with the services design stage?
Architectural elements dominate, as expected in a typical design-stage federation. The trailing MEP count suggests either an early-stage model or a discipline that has not yet published its latest exchange. In an ISO 19650 workflow, this chart maps directly to the **Model Production and Delivery Table (MPDT)** — you can cross-reference expected vs. actual element volumes to flag delivery gaps before the information exchange deadline.
*Technical aside:* Observable Plot's `barX` mark handles layout, sorting, and axis labels automatically. One function call, no manual scale configuration.
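As a rough sketch of what that single call can look like, the function below builds a sorted horizontal bar chart. The field names `discipline` and `count` are assumptions about the data shape, and `Plot` is passed in as a parameter so the helper carries no hidden global dependency.

```javascript
// `Plot` is the Observable Plot namespace (for example, from a CDN import).
// `rows` is assumed to look like [{ discipline: "Architectural", count: 5200 }, ...].
function elementBarChart(Plot, rows) {
  return Plot.plot({
    marginLeft: 110, // room for discipline labels
    x: { label: "Elements" },
    y: { label: null },
    marks: [
      // Sort the y (discipline) domain by bar length, largest first.
      Plot.barX(rows, {
        x: "count",
        y: "discipline",
        sort: { y: "x", reverse: true },
        tip: true,
      }),
      Plot.ruleX([0]), // baseline at zero
    ],
  });
}
```

In the browser, the returned element would be appended to the chart container, for example `container.append(elementBarChart(Plot, rows))`.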
[5] Issue Trends Over Time — Line Chart
**The question:** Are issues being resolved at a healthy rate — or is the backlog growing?
A time-series chart tracking weekly issue counts across three disciplines over six months. In coordination management, the shape of these curves matters more than the absolute numbers.
A healthy project shows converging lines — all disciplines trending toward zero open issues as the design matures. Diverging lines signal a coordination bottleneck: one discipline is generating issues faster than the team resolves them.
Under ISO 19650, this view supports the **Information Model Review** process. The trend data answers a question that a single snapshot cannot: is the information model *improving* between exchanges, or are we shipping the same problems forward?
*Technical aside:* Observable Plot handles date axes natively — no manual tick formatting. The `tip` option adds interactive tooltips on hover for exact weekly values.
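The converging-or-diverging reading can also be approximated numerically. The sketch below fits a least-squares slope to each discipline's weekly open-issue counts; the function names and the three labels are assumptions for illustration, and a real project would likely want a tolerance band around zero.

```javascript
// Least-squares slope of counts against week index (0, 1, 2, ...).
function trendSlope(counts) {
  const n = counts.length;
  if (n < 2) return 0;
  const meanX = (n - 1) / 2;
  const meanY = counts.reduce((a, b) => a + b, 0) / n;
  let numerator = 0;
  let denominator = 0;
  counts.forEach((y, x) => {
    numerator += (x - meanX) * (y - meanY);
    denominator += (x - meanX) ** 2;
  });
  return numerator / denominator;
}

// `series` maps discipline name to an array of weekly open-issue counts.
function classifyTrends(series) {
  return Object.fromEntries(
    Object.entries(series).map(([discipline, counts]) => {
      const slope = trendSlope(counts);
      const label = slope < 0 ? "shrinking" : slope > 0 ? "growing" : "flat";
      return [discipline, label];
    })
  );
}
```

A discipline labelled "growing" across several exchanges is the bottleneck the chart shows as a diverging line.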
[6] Project Performance — Scatter Plot
**The question:** Which projects underperform relative to their size — and is that a model quality issue or a process issue?
A scatter plot correlating model file size against load time for 40 projects. Dot radius maps to element count, colour maps to discipline. Outliers in the upper-right quadrant signal bloated models; outliers in the lower-left may be suspiciously lean.
In practice, this view is a model hygiene check. A project with a large file size but low element count likely contains duplicated geometry or embedded raster images — common IFC export artefacts. Conversely, a high element count in a small file suggests efficient modelling or aggressive geometric simplification.
For digital twin lifecycles, this correlation becomes a leading indicator: models that are expensive to load in a viewer will be expensive to query in a twin platform. Catching bloat early saves infrastructure cost downstream.
*Technical aside:* Observable Plot's `dot` mark with `r` and `fill` channels creates a bubble chart from a flat data array. The `tip` option shows project details on hover.
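The hygiene check described above can be expressed as a simple ratio test. This is a sketch under assumptions: the field names `file_mb` and `elements`, and the threshold of three times the median, are illustrative choices rather than project rules, and every project is assumed to have a non-zero element count.

```javascript
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Flags projects whose file size per element is far above the portfolio median.
function flagBloat(projects, factor = 3) {
  const density = (p) => p.file_mb / p.elements; // MB per element
  const typical = median(projects.map(density));
  return projects.filter((p) => density(p) > factor * typical).map((p) => p.name);
}
```

Using the median rather than the mean keeps one or two badly bloated models from shifting the baseline they are measured against.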
[7] Quality Breakdown — Doughnut & Polar Area
**The question:** What is the severity distribution of open issues — and how complete is each discipline's contribution?
Two complementary views side by side: a **doughnut chart** showing issue severity proportions, and a **polar area chart** showing model completeness by discipline. Together, they form the core of a coordination report KPI dashboard.
The doughnut chart answers the triage question: how much of the backlog is critical vs. minor? If the red slice dominates, the next coordination meeting should focus on resolution, not new model exchanges. The polar area chart answers the coverage question: which disciplines are closest to their BEP targets?
In a live dashboard, these charts update with every BCF import or model check run. The same data structure — severity counts, completeness percentages — feeds both human-readable charts and machine-readable APIs for downstream automation.
*Technical aside:* Chart.js canvas rendering is fast and the animation system is polished. The doughnut uses a 65 % cutout for label readability.
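For reference, a doughnut configuration along these lines might look like the sketch below. The function name, the `counts` and `palette` shapes, and the legend position are assumptions; `type`, `data.datasets`, and `options.cutout` are standard Chart.js configuration keys.

```javascript
// `counts` is assumed to look like { Critical: 12, Major: 30, ... }.
// `palette` maps each severity label to a colour string.
function severityDoughnutConfig(counts, palette) {
  const labels = Object.keys(counts);
  return {
    type: "doughnut",
    data: {
      labels,
      datasets: [
        {
          data: labels.map((label) => counts[label]),
          backgroundColor: labels.map((label) => palette[label]),
        },
      ],
    },
    options: {
      cutout: "65%", // inner hole size, as described above
      plugins: { legend: { position: "bottom" } },
    },
  };
}

// In the browser: new Chart(canvasElement, severityDoughnutConfig(counts, palette));
```

Because the configuration is a plain object, the same function can serve a chart on screen and a JSON payload for downstream automation.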
[8] Multi-Dimensional Quality — Radar Chart
**The question:** How do two projects compare across multiple quality dimensions simultaneously?
A radar chart mapping six quality axes: geometry accuracy, data completeness, naming compliance, clash resolution, LOD consistency, and documentation coverage. This is BEP and IDP maturity scoring made visual.
Radar charts reveal trade-offs that tables hide. A project may score highly on geometry accuracy but poorly on naming compliance — a common pattern when modellers focus on visual output and neglect classification data. In an ISO 19650 context, all six axes map to verifiable information requirements.
For BEP audits, overlay the target profile (the contract requirement) against the actual profile (the delivered model). The gap between the two polygons is your non-conformance surface — instantly legible to a non-technical project sponsor.
*Technical aside:* Chart.js handles polar coordinate systems, gridlines, and fill opacity natively. Two datasets on the same radar chart require no extra configuration beyond the data arrays.
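A target-versus-actual overlay can be sketched as follows. The 0 to 100 scoring scale, the dataset labels, and the `conformanceGaps` helper are assumptions for illustration; the six axis names come from the description above.

```javascript
const QUALITY_AXES = [
  "Geometry accuracy",
  "Data completeness",
  "Naming compliance",
  "Clash resolution",
  "LOD consistency",
  "Documentation coverage",
];

// `target` and `actual` are arrays of six scores, assumed to be on a 0-100 scale.
function radarConfig(target, actual) {
  return {
    type: "radar",
    data: {
      labels: QUALITY_AXES,
      datasets: [
        { label: "Target (BEP)", data: target, fill: true },
        { label: "Actual (delivered)", data: actual, fill: true },
      ],
    },
    options: { scales: { r: { min: 0, max: 100 } } },
  };
}

// Lists the axes where the delivered model falls short of the target.
function conformanceGaps(target, actual) {
  return QUALITY_AXES
    .map((axis, i) => ({ axis, gap: Math.max(0, target[i] - actual[i]) }))
    .filter((entry) => entry.gap > 0);
}
```

The gap list is the tabular counterpart of the area between the two polygons, useful when an audit needs numbers as well as a picture.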
[9] System Dependencies — Force Graph
**The question:** How are building systems connected — and where are the coordination bottlenecks?
A force-directed graph showing relationships between model components. Nodes represent building systems, edges represent data dependencies. The layout emerges organically from link structure — tightly coupled systems cluster together, isolated nodes drift to the periphery.
This type of graph is invaluable for two BIM workflows. First, **coordination planning**: highly connected nodes are high-risk interfaces that need early and frequent clash detection. Second, **FM handover**: the dependency map shows which systems must be commissioned together and which asset data packages must cross-reference each other.
In a digital twin context, force graphs can visualise live system topology — HVAC zones connected to sensors, sensors connected to controllers, controllers connected to BMS gateways. The same graph grammar applies at design-time and operations-time.
*Technical aside:* D3's `forceSimulation` with charge, center, and link forces positions nodes dynamically. Try dragging nodes — the simulation responds in real time. Node colours map to discipline, link thickness maps to dependency strength.
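A minimal sketch of both ideas follows: counting connections to find high-risk interfaces, and setting up the three forces. The node and link shapes (`id`, `source`, `target`) and the charge strength of -200 are assumptions; `d3` is passed in as a parameter rather than imported.

```javascript
// Counts how many links touch each node. Call this before starting the
// simulation, because d3.forceLink replaces source/target ids with node objects.
function nodeDegrees(nodes, links) {
  const degree = new Map(nodes.map((node) => [node.id, 0]));
  for (const { source, target } of links) {
    degree.set(source, (degree.get(source) ?? 0) + 1);
    degree.set(target, (degree.get(target) ?? 0) + 1);
  }
  return degree;
}

// `d3` is the D3 namespace. Returns a running simulation; attach a "tick"
// listener to update SVG positions as the layout settles.
function createSimulation(d3, nodes, links, width, height) {
  return d3
    .forceSimulation(nodes)
    .force("link", d3.forceLink(links).id((d) => d.id))
    .force("charge", d3.forceManyBody().strength(-200))
    .force("center", d3.forceCenter(width / 2, height / 2));
}
```

Sorting the degree map in descending order gives a first-pass list of the interfaces that deserve early clash detection.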
[10] The Technology Stack
The charts above are powered by four open-source libraries, each chosen for a specific class of visualisation problem. No npm, no bundler, no framework — just CDN imports and structured data.
| Library | Cell | Purpose |
|---------|------|---------|
| **DuckDB-Wasm** | [3] | In-browser SQL engine — ad-hoc queries, pivots, window functions. |
| **Observable Plot** | [4] [5] [6] | Declarative statistical charts — bar, line, scatter. Minimal code, automatic scales. |
| **Chart.js** | [7] [8] | Canvas-based animated charts — doughnut, polar, radar. Compact, label-heavy formats. |
| **D3.js** | [9] | Low-level SVG data binding — force-directed graphs. Total layout control. |
Observable Plot covers most typical BIM charting needs (roughly 80 %, as a rule of thumb) with the least code. Chart.js fills the gap for KPI-style summary charts where animation and canvas rendering matter. D3 is reserved for bespoke layouts that no declarative library covers. DuckDB turns the browser into a query terminal, which is useful for exploration, prototyping, and demonstrating data structures to stakeholders who speak SQL.
All four libraries coexist on a single page without conflict, sharing a unified dark/light theme system.
[11] From Raw Data to Intelligence
Every chart in this notebook consumes pre-structured JSON. In production, that JSON is the output of a data pipeline — the same ETL pattern used in business intelligence, adapted for construction data.
**Extract** — Pull data from source systems: IFC models (geometry + properties), BCF files (issues + viewpoints), COBie exports (asset registers), or project APIs. Each source has its own schema and quirks.
**Transform** — Normalise into consistent views. Discipline names get mapped to a controlled vocabulary. Dates align to ISO weeks. Severity levels follow a project-specific classification. Element types map to Uniclass or OmniClass codes. This is where business rules live — and where most data quality problems surface.
**Load** — Write the normalised data to a queryable store: date-partitioned Parquet files, a DuckDB database, or a REST API. Idempotent loads matter — when an ISO 19650 audit asks "what was the state of the model on 15 March?", the pipeline must reproduce that snapshot exactly.
The JSON embedded in this notebook simulates pipeline output. In a production deployment, the same chart code reads from Parquet, a database, or an API — the visualisation layer is decoupled from the data source by design.
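The idempotent-load requirement can be illustrated with a small sketch. Everything here is an assumption made for the example: a `Map` stands in for object storage, and the `dataset/date=YYYY-MM-DD/data.parquet` key layout is a hypothetical partitioning scheme, not the notebook's actual one.

```javascript
// Hypothetical partition layout: one key per dataset per snapshot date.
function partitionKey(dataset, snapshotDate) {
  return `${dataset}/date=${snapshotDate}/data.parquet`;
}

// Overwrites the partition instead of appending, so re-running the same
// load for the same date leaves the store unchanged.
function loadSnapshot(store, dataset, snapshotDate, rows) {
  store.set(partitionKey(dataset, snapshotDate), rows.map((row) => ({ ...row })));
  return store;
}

// Answers "what was the state on this date?" or returns null if never loaded.
function readSnapshot(store, dataset, snapshotDate) {
  return store.get(partitionKey(dataset, snapshotDate)) ?? null;
}
```

Because the key is derived only from the dataset name and the snapshot date, a pipeline retry cannot create duplicates, and an audit query for a given date always resolves to exactly one partition.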
[12] Reflections & Recommendations
After building all seven visualisation types against BIM-domain data, a few patterns stand out.
**Observable Plot** is the clear first choice for most BIM charting. Its declarative API means a bar chart is one function call, a scatter plot is two. Automatic scales and built-in tooltips eliminate boilerplate. The trade-off: no animation, and limited customisation for edge cases.
**Chart.js** excels at compact, animated charts — doughnuts, polar areas, radar overlays. For KPI dashboards and coordination report summaries, its canvas rendering is fast and the animation system adds polish that stakeholders notice. The trade-off: canvas means no DOM-level interactivity without workarounds.
**D3.js** gives total control. The force-directed graph in this notebook would be difficult to replicate in a purely declarative library. For system topology, spatial overlays, and custom interaction patterns, D3 is the strongest option of the four. The trade-off: steep learning curve and verbose code for simple charts.
**DuckDB-Wasm** is genuinely impressive — full SQL including window functions, CTEs, and PIVOT running entirely client-side. For exploration pages, prototyping, and BIM assistant interfaces, it turns a static page into a query terminal. The trade-off: a ~4 MB binary that should be lazy-loaded.
### The agentic workflow
An agentic code assistant helped iterate on chart configurations, debug dark/light theme transitions, scaffold the TOML content structure, and refine prose across multiple editing passes. This is not a replacement for domain expertise — it is an amplifier. The BIM questions, the ISO references, the architectural framing: those came from years of practice. The speed of iteration, the consistency of formatting, the breadth of library coverage: that is where the AI co-worker earns its place in the pipeline.
### Recommended stack for Built Environment Intelligence
1. **Observable Plot** as the primary chart library (bar, line, scatter)
2. **Chart.js** for compact KPI-style charts (doughnut, polar, radar)
3. **D3.js** only when custom interactivity is essential (force, tree, spatial)
4. **DuckDB-Wasm** for exploration and ad-hoc data validation
[13] Implementation Notes
### Stack Summary
| Library | Version | Size (gzip) | Purpose |
|---------|---------|-------------|---------|
| Observable Plot | 0.6 | ~45 KB | Declarative statistical charts |
| Chart.js | 4.x | ~65 KB | Canvas-based animated charts |
| D3.js | 7.x | ~90 KB | Low-level SVG data binding |
| DuckDB-Wasm | 1.x | ~4 MB | In-browser SQL engine |
| markdown-it | 14.x | ~30 KB | Client-side markdown rendering |
### How This Notebook Is Built
The page structure is declared in `page.toml`: metadata, section ordering, and card assignments. All prose and chart data live in `content.toml`. The rendering engine reads both files, dispatches each card to its template, and produces a single HTML page with one shared stylesheet and one shared script file.
Charts are rendered client-side by targeting DOM containers emitted by card templates. Each chart has a `figure_id` that the JavaScript runtime uses to locate its container and inject the visualisation. No server-side rendering, no canvas-to-image conversion — what you see is live code.
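The container lookup can be sketched as a small dispatcher. This is an illustration only: it assumes the `figure_id` is used directly as the DOM element id, and the `type` and `data` fields on each chart entry, like the `renderers` registry, are hypothetical names.

```javascript
// `doc` is the page's `document` (passed in for testability).
// `charts` is an array like [{ figure_id: "fig-bar", type: "bar", data: [...] }].
// `renderers` maps a chart type to a function (container, data) => void.
function mountCharts(doc, charts, renderers) {
  const mounted = [];
  for (const chart of charts) {
    const container = doc.getElementById(chart.figure_id);
    const render = renderers[chart.type];
    if (!container || !render) continue; // skip unknown ids or chart types
    render(container, chart.data);
    mounted.push(chart.figure_id);
  }
  return mounted;
}
```

Returning the list of mounted ids makes it easy to log which figures were skipped when a content block and a template fall out of sync.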
### Architecture Decisions
- **CDN-only dependencies** — no npm install, no bundler, no node_modules. Every library loads from jsDelivr at a pinned version.
- **Unified theme** — all charts respond to the page-level dark/light toggle. Observable Plot and D3 inherit CSS custom properties; Chart.js uses a shared palette object.
- **Lazy DuckDB** — the ~4 MB Wasm binary loads only when the SQL section scrolls into view, keeping initial page load fast.
- **TOML content separation** — prose and chart data are editable without touching templates or JavaScript. New charts require only a content block and a section entry.
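Of these decisions, the lazy-loading one is the easiest to show in code. The sketch below uses the standard `IntersectionObserver` API; the function name, the `loader` callback, and the 200px margin are assumptions, and the observer constructor is injectable so the logic can be exercised outside a browser.

```javascript
// Calls `loader` once, the first time `element` approaches the viewport.
// `loader` would typically fetch and instantiate the DuckDB-Wasm bundle.
function lazyLoadOnView(element, loader, Observer = globalThis.IntersectionObserver) {
  let started = false;
  const observer = new Observer(
    (entries) => {
      if (started || !entries.some((entry) => entry.isIntersecting)) return;
      started = true;
      observer.disconnect(); // one-shot: stop watching after the first hit
      loader();
    },
    { rootMargin: "200px" } // begin loading slightly before the section is visible
  );
  observer.observe(element);
  return observer;
}
```

The `started` flag guards against a second callback arriving before the disconnect takes effect, so the large binary is requested at most once.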
Cite this work (APA & BibTeX)
Milotin, D. (2026). BIM Data Visualisation 101: Introduction to data reporting. Delta Persist.
@article{milotind2026,
author = {Milotin, Dragos},
title = {BIM Data Visualisation 101: Introduction to data reporting},
journal = {Delta Persist},
year = {2026},
note = {Field: Built Environment Intelligence. Accessed: 2026-02-18}
}