We have over twenty senior managers at Ritchies Transport who need access to reporting. Board members, general managers, regional heads. The people who run the business and need to know the state of it at a glance. KPIs. Trends. A few filtered views. None of them are analysts. None of them open a report and start writing DAX. They open it, look at the numbers, maybe filter by region or month, and close it.
For this audience, Ritchies has been maintaining Power BI Pro licences. Power BI Pro currently costs around USD $14 per seat per month, so roughly twenty read-only viewers works out to about USD $3,360 a year (20 seats × $14 × 12 months). The alternative, Fabric F64 capacity, the smallest SKU at which viewers no longer need a per-seat licence, would remove that cost, but its price starts well north of what makes sense for a transport company our size. The economics simply do not work for us.
So I started looking elsewhere. Looker is elegant, but substantially more expensive. Tableau is powerful, but the per-seat model has the same fundamental problem. I even spent a few hours evaluating SSRS, and that was enough of that. The common thread: every option solving the viewer licensing problem either costs more than Power BI or requires infrastructure I do not want to manage.
I have been using DuckDB as part of our ETL pipeline for a while now and know its capabilities well. What I had not had a proper chance to explore was DuckDB-WASM, the build that runs entirely inside the browser via WebAssembly. Once I started looking at it properly, this came together quickly.
Our Data Lake IS the Query Target. No Extra Database Required.
Ritchies follows an ELT pattern. Data lands in Azure Data Lake Storage Gen2 as Parquet files, partitioned by date in Hive-style paths like /yyyy/mm/dd/. From there it gets loaded through to our data warehouse, which has four schemas: load for raw ingestion, ODS as our bronze reporting layer, EDM as silver, and MDS as gold.
The key insight is that DuckDB does not need a database at all. It reads Parquet files directly, and it understands Hive partition paths natively. A query scoped to a specific date range will use the partition folder structure to skip any files outside that range entirely, before reading a single byte. Within each file, columnar storage means a query asking for fare_amount and payment_type never reads passenger_count or trip_distance. Row group statistics let DuckDB skip entire chunks when a filter predicate cannot match. This is predicate pushdown at the storage layer, built into the file format and the path structure, and it means DuckDB can serve fast aggregation queries directly off a data lake with no indexing, no separate database, and no extra infrastructure.
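To make the pruning concrete, here is a sketch of building a date-scoped DuckDB query over Hive-partitioned Parquet. The container URL is hypothetical, and fare_amount and payment_type are just the example columns from above; adjust both to your own lake.

```javascript
// Sketch: a DuckDB query over Hive-partitioned Parquet in ADLS.
// With hive_partitioning = true, DuckDB parses path segments like
// /yyyy=2024/mm=06/dd=15/ as columns, so the WHERE clause below prunes
// whole partition folders before a single file is opened.
function buildPartitionedQuery(baseUrl, year, month) {
  return `
    SELECT payment_type, SUM(fare_amount) AS total_fares
    FROM read_parquet('${baseUrl}/yyyy=*/mm=*/dd=*/*.parquet', hive_partitioning = true)
    WHERE yyyy = ${year} AND mm = ${month}
    GROUP BY payment_type
  `;
}

// Hypothetical container URL for illustration only.
const sql = buildPartitionedQuery(
  'https://account.dfs.core.windows.net/lake/trips',
  2024, 6
);
```

Within the folders that survive pruning, the column projection and row-group statistics described above do the rest.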
Delta Lake support makes this even more compelling. If your data platform is entirely lake-based (no separate data warehouse, just Delta tables on ADLS or S3), DuckDB handles that too. The delta extension lets DuckDB read Delta Lake tables directly, including time travel (querying a snapshot as of a specific version or timestamp) and automatic schema resolution. You get ACID-compliant reads from your Delta tables with the same predicate pushdown behaviour as raw Parquet, all client-side. If your whole platform lives in the lake and you have been reluctant to add a data warehouse just to serve reporting, this removes that need entirely. The lake is already your warehouse as far as DuckDB is concerned.
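As a sketch, reading a Delta table is close to a one-line change to the query, assuming the delta extension is available in your DuckDB build. The abfss path below is a placeholder, not a real endpoint.

```javascript
// Sketch: the same aggregation, but against a Delta Lake table.
// delta_scan is the delta extension's table function; the extension is
// installed and loaded on demand before the query runs.
const deltaSql = `
  INSTALL delta;
  LOAD delta;
  SELECT payment_type, SUM(fare_amount) AS total_fares
  FROM delta_scan('abfss://lake@account.dfs.core.windows.net/delta/trips')
  GROUP BY payment_type;
`;
```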
What this means in practice: DuckDB reads your raw Hive-partitioned Parquet from ADLS, runs the aggregation queries, and produces a compact summary. That summary is what the browser loads via DuckDB-WASM. The interactive filtering in the dashboard runs as fresh SQL queries against that in-memory summary. No API round trip. No server billed per click. The interactive layer is entirely client-side.
Imagine your data is a filing cabinet organised by date, with each drawer labelled by year and month (Hive partitioning). Normally you would need to hire a librarian (a database server or API) to go to the filing cabinet, pull the relevant files, and hand them to whoever asked. DuckDB-WASM fires the librarian. The person asking the question now has a copy of the librarian's brain running in their own browser tab. They go directly to the filing cabinet, open only the drawers matching the date filter, grab only the specific columns they need, and answer the question. Nobody else was involved and nothing was billed.
Charts Power BI Cannot Build Without Paying for Extensions
The three visuals below do not exist in Power BI's standard chart library. The sunburst chart shows trip distribution across time of day and payment type as a two-ring hierarchy. The radial clock chart shows the 24-hour pattern as a clock face with bars radiating outward, coloured by time of day. The line chart with rich hover tooltip replicates Power BI's premium tooltip page feature: hover over any day and a floating panel appears with a live payment breakdown chart inside it.
Every click on any visual cross-filters all the others. Click the sunburst inner ring to filter by time period. Click a radial bar to filter by hour. Both filters can be active simultaneously. This is the same cross-filtering behaviour Power BI users expect, running as real DuckDB SQL in your browser tab with zero server involvement.
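A minimal sketch of how those clicks become SQL, assuming an in-memory summary table named trip_summary with time_period and pickup_hour columns (the names are illustrative, not taken from the demo):

```javascript
// Sketch: translate the current cross-filter state into one SQL query
// against the in-memory summary. Both filters can be active at once,
// matching the cross-filtering behaviour described above.
function crossFilterQuery(filters) {
  const predicates = [];
  if (filters.timePeriod) predicates.push(`time_period = '${filters.timePeriod}'`);
  if (filters.hour !== undefined) predicates.push(`pickup_hour = ${filters.hour}`);
  const where = predicates.length ? `WHERE ${predicates.join(' AND ')}` : '';
  return `SELECT payment_type, COUNT(*) AS trips
          FROM trip_summary ${where}
          GROUP BY payment_type`;
}
```

Each click updates the filter state and re-runs the query through the DuckDB-WASM connection; no request ever leaves the tab.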
Sunburst chart, not available natively in Power BI. Click the inner ring to filter by time of day.
Radial bar chart, not available natively in Power BI. Click a bar to filter by hour.
What you are looking at: Three chart types that do not exist in Power BI's standard library, each built with a short AI-assisted conversation. All cross-filtering each other. All running real DuckDB SQL in your browser tab with zero server involvement. No marketplace extension. No compatibility check. No dependency on someone else's maintained package. The chart you had in your head, described in plain English, and done.
The data powering this dashboard is being queried directly from a public S3 bucket. No pre-loaded dataset, no API sitting in the middle. DuckDB-WASM is reaching out to the source and querying it live, exactly as it would against your own enterprise data lake. If you swapped the public bucket URL for your ADLS Gen2 SAS-authenticated endpoint, the experience would be identical. What you are seeing here is what your reporting would actually look like in production.
Two Services. No API. No Compute Layer. No Middleman.
The architecture is deliberately minimal. There is no API management layer, no App Service, no Logic Apps, nothing running between the browser and the data. The core pattern is just two things: a cloud object store and a static web host. The specifics below are Azure because that is what Ritchies runs on, but the same pattern works identically on AWS (S3 + Amplify or CloudFront), or GCP (Cloud Storage + Firebase Hosting or Cloud Run serving static files). The DuckDB-WASM layer in the browser does not care which cloud vendor hosts the Parquet file, as long as it is accessible over HTTPS.
| Component | Role | Cost |
|---|---|---|
| ADLS Gen2 (already yours) | Holds your Parquet files, Hive-partitioned by date. Your existing ELT pipeline writes to it. DuckDB reads directly from it using partition path filtering. No extra database, no copy, no transformation layer needed. Azure does not charge for data ingress. Egress to the internet is free for the first 100GB/month, then ~USD $0.087/GB. For a 50KB pre-aggregated summary file, this cost rounds to cents. | ~$0 extra |
| Azure Static Web Apps (Standard, $9 USD/mo) | Serves the HTML and JavaScript globally via CDN. Native Entra ID authentication built in, including custom identity provider configuration and role assignment. GitHub Actions CI/CD. Custom domain. HTTPS. Environment variables for secrets like SAS tokens. Zero backend compute whatsoever. The Standard tier is required to configure a custom identity provider, specifically so you can lock authentication to your own Entra tenant rather than accepting any Microsoft account. Without a custom provider, the default Entra login allows sign-in from any Microsoft-backed identity; with one configured, only users from your specific tenant can authenticate. That is the gate that makes this safe for internal enterprise reporting, and it is also what enables Entra group-based role assignment downstream. | ~$15 NZD/mo |
| DuckDB-WASM (in the browser) | A JavaScript library that initialises in the user's tab. Loads the summary data into an in-memory SQL engine and answers every filter interaction client-side. No server is billed per query, ever. | $0 |
Authentication and Role Assignment: Native, Zero Extra Infrastructure
Azure Static Web Apps has built-in Entra ID authentication that works out of the box, with no extra services and no custom auth code. You define roles in a staticwebapp.config.json file and use a small serverless function at login time to check the user's Entra security group membership and assign a role. I have done this in other SWA projects at Ritchies: the login function calls Microsoft Graph to get the user's group memberships, maps a group like SeniorManagers to a role like viewer, and the routing rules enforce that only users with that role can access the dashboard path.
But it goes further than page-level access. Because DuckDB-WASM is running SQL in the browser, the role from the authenticated session can be passed directly into the query as a WHERE clause filter. A regional manager assigned the role region_auckland gets queries like WHERE region = 'Auckland' injected automatically. A board member with the all_regions role sees everything. This is the same row-level security behaviour Power BI provides through RLS, but it is running as a SQL predicate in the user's browser, scoped by their Entra identity, with no extra infrastructure required.
// Role assignment happens in your login function via a Graph API group check
{
  "routes": [
    {
      "route": "/dashboard/*",
      "allowedRoles": ["viewer", "analyst"]
    }
  ],
  "responseOverrides": {
    "401": { "redirect": "/.auth/login/aad", "statusCode": 302 }
  }
}
// In your api/GetRoles function:
// 1. Read caller's userId from the x-ms-client-principal header
// 2. Call Graph: GET /v1.0/me/memberOf
// 3. If "AucklandManagers" group, return role "region_auckland"
// 4. If "BoardMembers" group, return role "all_regions"
// 5. In your dashboard JS: map the role to its region value and inject WHERE region = 'Auckland' into DuckDB queries
// Row-level filtering via SQL predicate. Same outcome as Power BI RLS, no extra cost.
// api/GetRoles/index.js, invoked by SWA at login via the rolesSource setting.
module.exports = async function (context, req) {
  const userEmail = req.body?.clientPrincipal?.userDetails;

  // Call Microsoft Graph to retrieve the user's security group memberships.
  // Map each group to an application role and attach metadata.
  const userGroups = await getUserSecurityGroups(context, userEmail);
  const roles = [];
  const roleMetadata = {};

  // Walk through the user's groups and assign the highest applicable role.
  // Only one role per user. The most privileged one wins.
  if (isGroupMember(userGroups, 'Reporting-Admin')) {
    roles.push('admin');
    roleMetadata.displayBadge = 'Admin';
    roleMetadata.roleLevel = 3;
  } else if (isGroupMember(userGroups, 'RegionalManagers-Auckland')) {
    roles.push('region_auckland');
    roleMetadata.displayBadge = 'Auckland Region';
    roleMetadata.roleLevel = 2;
  } else if (isGroupMember(userGroups, 'Reporting-Viewer')) {
    roles.push('viewer');
    roleMetadata.displayBadge = 'Viewer';
    roleMetadata.roleLevel = 1;
  }

  // Return the roles array. SWA injects it into the client principal.
  // The DuckDB query layer reads the role and scopes WHERE clauses accordingly.
  context.res = { body: { roles, ...roleMetadata } };
};
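On the client, the assigned role can then be turned into a SQL predicate. A minimal sketch follows; the role names mirror the examples above, the table name is illustrative, and this is client-side scoping for convenience rather than an enforced security boundary.

```javascript
// Sketch: map the SWA-assigned role to a row-level predicate for DuckDB.
const ROLE_PREDICATES = {
  region_auckland: "region = 'Auckland'",
  all_regions: '1 = 1', // board members: unrestricted
};

function scopedQuery(role, baseSql) {
  // Unknown or missing roles see nothing rather than everything.
  const predicate = ROLE_PREDICATES[role] ?? '1 = 0';
  return `${baseSql} WHERE ${predicate}`;
}
```

In practice the role would be read from SWA's /.auth/me endpoint on the authenticated session before any query runs.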
The Frontend Gap Just Closed for Data Engineers
This section is really for every data analyst, BI analyst, and reporting analyst out there. All of these roles already have the hard skills: the ability to dig through data, find the patterns, and recover the insight. The pain has always been the last step. Translating that understanding into something a boardroom of non-analysts can absorb in thirty seconds, as an executive-level elevator pitch rather than a technical deep-dive. BI tools help, but they also constrain you to the charts they chose to build. You spend hours fighting with a drag-and-drop interface trying to produce a reasonable approximation of the visual you actually had in mind.
The sunburst chart in the demo above is a good example. I described it: a donut chart with two rings, the inner showing time of day periods, the outer showing payment type breakdown within each period, clicking the inner ring filters other charts. I got working SVG code back. A few colour adjustments and it was done. No marketplace. No version compatibility check. No hoping someone has maintained the extension package this year. The same goes for the radial clock chart, which I described as a clock face where each of the 24 hours has a bar radiating outward, length proportional to trip volume, coloured by time of day.
With agentic AI tools, the time to a working custom chart is now measured in minutes, not hours. The question is no longer "does the BI tool have a version of this?" but simply "what would best communicate this pattern in my data?"
As a data engineer, you have always been the person who understands what the data is saying. AI just removed the last barrier between that understanding and the visual that communicates it.
This is the argument for any analyst, data engineer, or BI developer who has been in the same position: you are not bottlenecked by front-end skill anymore. You are bottlenecked by imagination. Sankey flow diagrams for route analysis, bump charts for rank changes over time, chord diagrams showing which zones connect to each other, anything your data suggests would tell a better story. Describe it and you have it. The time that used to go into finding out whether the BI tool's marketplace had anything remotely similar now goes into actually building the right visual for the story you are trying to tell. That changes the work entirely.
More importantly, it changes where analyst energy actually goes. The presentation layer is no longer a significant piece of work. That means analysts can spend the majority of their time doing what they are actually good at: sitting with the data, finding the patterns, asking the next question, and pulling a story out of what would otherwise be rows on a screen. The insight is the hard part. It always was. The gap between finding a pattern and being able to show it clearly to a non-technical audience was an overhead that had nothing to do with analytical skill, and AI has now largely absorbed that overhead.
The Analyst Keeps Desktop. The Senior Manager Gets a Free Browser Tab.
Power BI Desktop is free. Always has been. The analysts on my team who are building data models, writing DAX measures, and doing exploratory analysis are not going anywhere. Desktop is the right tool for that work and it costs nothing. The licensing cost kicks in entirely at the point of sharing through the Power BI service. That is the specific problem this approach solves.
For our senior managers, what they actually need is not a Power BI subscription. They need a fast, well-designed page that shows them the state of the business, responds to their filters, and does not require them to understand how it was built. This delivers exactly that. Secured by Entra ID. Served from a global CDN. Costing essentially nothing. Built on top of data infrastructure we already maintain.
If an analyst ever needs to drill into transaction-level detail, going past the summary and into individual records, Power BI Desktop is still there, still free, still connected to the source. Nothing about this approach removes that capability. You are separating the build-and-analyse tool from the view-and-consume tool, and you are not paying per seat for the viewing side anymore.
The Honest Limits of This Approach
Row-level security at the data model layer, where two viewers see different subsets of the same report, is a genuine Power BI strength that would require more engineering to replicate here. The browser-side WHERE-clause approach described earlier scopes what a user sees, but because the filtering happens client-side, it is a convenience rather than a hard security boundary: a determined user with browser dev tools could query the unfiltered summary. If you need enforced row-level security across many reports, Power BI may still be the right tool for those specific cases. Native Excel integration and Power BI embedded in Teams with commenting and subscriptions also have real workflow value for organisations already deep in the Microsoft 365 ecosystem.
On real-time data: Power BI's legacy streaming datasets technically still work with Pro licences, but Microsoft has announced their retirement in October 2027. The modern replacement, Fabric Real-Time Intelligence using Eventstream and Eventhouse, requires Fabric capacity. So even for streaming scenarios, you end up back at the same Fabric capacity pricing conversation, and the per-seat model is not a viable path forward for real-time use cases. I have a working SignalR and Azure Functions implementation at Ritchies for near-real-time data, and surfacing that in a Static Web App dashboard is absolutely achievable outside of Power BI entirely. That deserves its own post, which I will write separately.
The pattern described here is best suited to a summary view of operational data, consumed by non-analyst viewers who need to see the state of the business. For that scenario, which covers the majority of what senior managers actually use reporting for, the per-seat licensing cost is an unnecessary overhead. This removes it.
I am also aware that some organisations will be hesitant to move reporting to a full-stack web approach precisely because it is harder to hand off to the business. With Power BI, a reasonably skilled analyst can open Desktop and build something new without involving development. A static web app requires someone who can write HTML and JavaScript, and most business users are not that person. That is a real friction point and worth being honest about.
But the shape of that friction is changing. The pattern that makes sense here is the same one that already works in Power BI: a developer or senior reporting analyst owns and maintains the data model, the connection layer, and the underlying structure. End users do not touch any of that. What changes is how they interact with the presentation layer. With AI becoming a standard part of how people work, prompting for a chart is not a significantly higher bar than configuring one in a drag-and-drop interface. The data model can even include a pre-written prompt: a plain-language description of what the tables contain, what the key measures are, and what questions this data is meant to answer. That way the AI already understands the structure before the user asks anything. The end user experience becomes: describe what you want to see, and it appears. That is not a harder experience than Power BI. It is arguably a simpler one.
