Rethinking Funnel Analysis: Choosing the Right Tool for the Job
Funnel analysis is a core technique in product analytics. It allows teams to track how users progress through key workflows—be it sign-up flows, onboarding steps, or activation milestones—and identify where users drop off. A well-constructed funnel can highlight bottlenecks, validate hypotheses about feature adoption, and uncover insights that drive product iteration.
But while the concept of funnels is simple, implementing them efficiently is far from trivial—especially at scale. And the technical choices you make early on—whether you're using a dedicated product analytics tool or a general-purpose data warehouse—can introduce long-term trade-offs, including performance limitations, schema rigidity, or analytical blind spots.
This article is not about pushing a specific solution. It’s about understanding the architectural choices behind funnel analysis—and making informed decisions based on the scale, flexibility, and depth your organization needs.
What Is Funnel Analysis (and Why It’s Not Just Business Intelligence)
At its core, funnel analysis tracks a sequence of user events leading toward a goal. This could be something straightforward like “Signed Up → Onboarded → Activated” or a more specific product journey such as “Viewed Feature → Used Feature → Converted to Paid.”
While business intelligence (BI) tools aggregate metrics along dimensions (e.g. users by region, revenue by product), funnel analysis is fundamentally sequential and user-centric. It requires you to reconstruct the path users take through your product, in time order, with precise filtering and segmentation.
That brings up a key technical consideration: your event schema and storage layer need to support this kind of sequence-aware querying.
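Concretely, the minimal shape is an append-only event log with one row per user action. A sketch in generic SQL, using the column names referenced later in this article (exact types vary by warehouse):

```sql
CREATE TABLE events (
  user_id    BIGINT    NOT NULL,  -- who performed the action
  event_name VARCHAR   NOT NULL,  -- what happened, e.g. 'Signed Up'
  event_time TIMESTAMP NOT NULL,  -- when; the basis for sequence ordering
  platform   VARCHAR,             -- optional segmentation dimensions
  browser    VARCHAR
);
```

Everything that follows, whether joins or window functions, is ultimately a way of asking this table sequence-aware questions: "for each user, did event B happen after event A?"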
The Data Model: A Hidden Source of Technical Debt
In an ideal world, your event data is clean, structured, and easy to query. In reality, it’s often:
- Semi-structured, with properties varying across events
- Spread across multiple sources, requiring joins for business context
- Massive in volume, with millions or billions of rows
Some tools like Mixpanel or Amplitude solve this by abstracting away the complexity. They ingest and pre-process the data, creating opinionated schemas that optimise for fast funnel queries.
This can be great for PMs who need answers quickly, but it can also become a source of technical debt. These tools often:
- Limit join flexibility across datasets
- Abstract away query logic, making debugging hard
- Lock you into a specific interface or pricing model
On the other hand, data warehouses like BigQuery, Snowflake, or Redshift offer raw flexibility and scale. You can model events exactly as you like, integrate them with customer, revenue, or marketing data, and run arbitrary SQL. But with that power comes complexity: writing performant funnel queries in SQL is not trivial, especially at scale.
Two Ways to Query Funnels in SQL: Joins vs. Window Functions
Let’s say you have an `events` table with columns like `event_name`, `event_time`, `user_id`, `platform`, and `browser`. You want to compute a three-stage funnel: “Getting Started” → “Analytics Course” → “Paid Activation.”
Two common approaches in SQL:
1. Join-Based Approach
Each stage is expressed as a subquery and joined with the next. This method is intuitive and maps well to how PMs think about funnels.
```sql
SELECT stage1.user_id
FROM stage1
JOIN stage2 ON stage2.user_id = stage1.user_id
           AND stage2.event_time > stage1.event_time
JOIN stage3 ON stage3.user_id = stage1.user_id
           AND stage3.event_time > stage2.event_time
```
- Pros: Easy to reason about; explicit control
- Cons: Multiple joins can become expensive, especially if subqueries are not materialized or indexed
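Fleshing out that skeleton, a complete join-based version might look like the following. This is a sketch against the `events` schema described above; the stage CTE names and the final counting step are illustrative:

```sql
WITH stage1 AS (
  -- first time each user hit the entry event
  SELECT user_id, MIN(event_time) AS t1
  FROM events
  WHERE event_name = 'Getting Started'
  GROUP BY user_id
),
stage2 AS (
  -- first qualifying event strictly after stage 1
  SELECT e.user_id, MIN(e.event_time) AS t2
  FROM events e
  JOIN stage1 s ON s.user_id = e.user_id AND e.event_time > s.t1
  WHERE e.event_name = 'Analytics Course'
  GROUP BY e.user_id
),
stage3 AS (
  SELECT e.user_id, MIN(e.event_time) AS t3
  FROM events e
  JOIN stage2 s ON s.user_id = e.user_id AND e.event_time > s.t2
  WHERE e.event_name = 'Paid Activation'
  GROUP BY e.user_id
)
SELECT
  (SELECT COUNT(*) FROM stage1) AS reached_stage1,
  (SELECT COUNT(*) FROM stage2) AS reached_stage2,
  (SELECT COUNT(*) FROM stage3) AS reached_stage3;
```

Note how each stage re-scans `events`: three stages means three scans plus two joins, which is exactly the cost profile the cons above describe.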
2. Window Functions Approach
Use `PARTITION BY user_id ORDER BY event_time` to simulate the sequence, and stack window functions for each funnel stage.
Because a column computed in a SELECT (like `t1`) can’t be referenced by another expression at the same level, the stages are typically layered as CTEs, each adding one windowed timestamp:

```sql
WITH s1 AS (
  SELECT *,
         MIN(CASE WHEN event_name = 'Getting Started'
                  THEN event_time END)
           OVER (PARTITION BY user_id) AS t1
  FROM events
),
s2 AS (
  SELECT *,
         MIN(CASE WHEN event_name = 'Analytics Course'
                   AND event_time > t1
                  THEN event_time END)
           OVER (PARTITION BY user_id) AS t2
  FROM s1
)
SELECT user_id, MIN(t1) AS t1, MIN(t2) AS t2, ...
FROM s2
GROUP BY user_id
```
- Pros: Fewer joins, single pass over data, better for parallel execution
- Cons: Verbose, harder to maintain, may require fine-tuned sort and partition strategies
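Whichever approach produces them, per-user stage timestamps reduce to funnel metrics with plain aggregation, since `COUNT(column)` skips NULLs. A sketch, assuming the previous query’s output is available as a `funnel_times` relation with nullable `t1`/`t2`/`t3` columns (the relation name is illustrative):

```sql
SELECT COUNT(t1) AS reached_stage1,
       COUNT(t2) AS reached_stage2,
       COUNT(t3) AS reached_stage3,
       -- NULLIF guards against division by zero on an empty funnel
       ROUND(100.0 * COUNT(t3) / NULLIF(COUNT(t1), 0), 1)
         AS overall_conversion_pct
FROM funnel_times;
```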
Optimisations and Gotchas
Modern data warehouses are built for analytical workloads, but efficient funnel analysis depends on more than just compute horsepower. Consider:
- Sorting & Clustering: Clustering on `user_id` and `event_time` can dramatically improve window function performance.
- Scan Minimization: Avoid scanning the `events` table multiple times by materializing intermediate results or using Common Table Expressions (CTEs) wisely.
- Join Indices: Subquery joins without indices can be costly, especially on high-cardinality keys like user IDs.
Product Analytics Tools vs. Warehouse-Native Analytics: Trade-offs to Consider
| Criteria | Product Analytics Tools | Data Warehouses |
|---|---|---|
| Time to Insight | Fast, UI-driven | Slower, requires SQL |
| Flexibility | Limited joins/custom logic | Fully flexible |
| Schema Lock-in | High | Low |
| Cost Control | Prone to scale cost surprises | Tunable per-query |
| Debuggability | Black-box behavior | Transparent SQL |
There’s no “best” solution. But there is a right solution for your context.
If your team needs speed, out-of-the-box charts, and little overhead, a tool like Mixpanel can work well—especially early-stage. If you're scaling and need custom funnel definitions, integrations with other data, or want to avoid vendor lock-in, a warehouse-native approach is likely the better bet.
Conclusion: Don’t Just Build Funnels—Build for Flexibility
Funnel analysis is one of the most powerful tools in a product analytics arsenal. But the implementation path you choose can either empower your team or create future bottlenecks.
For consultants and data teams making this choice, the decision is not just about performance: it's about maintainability, cost, flexibility, and visibility. Understanding what happens under the hood helps you avoid shiny-tool syndrome and build analytics capabilities that scale with the business.