They Asked About Marginal Returns. The Data Could Only Tell Them Cross-Tabs.

Why dashboarding has replaced analytics in most data teams — and what would fix it. Notes from forty interviews in Prague and a decade of working in the US data industry.

The assignment that broke itself

The take-home test arrived in March. A digital marketing company in Prague had reached out about a senior data analyst role, and after the screening call they sent me a parquet file with about a quarter-million rows — event-level data from January and February 2026, covering digital ad campaigns across Google, Meta, Sklik, Microsoft, and a handful of smaller platforms.

The business context they provided was sophisticated. The company runs a two-stage funnel: media buying acquires users at a loss (the direct margin on acquisition is negative), and remarketing — SMS, email, voicebot — re-engages those users to generate revenue over time. The instructions asked me to investigate cost efficiency, diminishing returns, and traffic-source performance. Which platforms deliver the best return on ad spend? Should spend be reallocated? Where are returns flattening?

These are good questions. Real questions. The kind of questions that can move millions of dollars.

I opened the file and started looking around. And then I noticed something.

The dataset included a click_id column that linked all events from a single user session — visit, lead submission, lead-tier assignment, eventual revenue or cost. But the documentation contained one quiet sentence: “Remarketing activities for a user are linked to different click_ids and are not connected to the original acquisition session in this dataset.”

That sentence broke the assignment.

The most important business question — which acquisition source produces users with the highest lifetime value? — was structurally unanswerable in the data they had given me. I could compute cost per lead by platform. I could compute revenue per acquisition session. But the entire revenue-generating side of the business — the remarketing channel that turned a money-losing acquisition funnel into a profitable enterprise — could not be attributed back to the platform that had originally acquired each user.

The questions were about marginal returns. The data could only tell me cross-tabs.

I did the analysis they asked for. I computed cost per lead and revenue per lead by platform. I built a marginal cost curve. I identified one platform that should be paused immediately (471,000 visits, five leads). I flagged Google’s top three campaigns as suspicious outliers consuming 50% of total spend at 51% higher cost per lead than the rest. Then I wrote them a follow-up section: here is what the analysis would actually look like if your data could be linked end-to-end. I simulated the linkage, showed how the platform ranking shifts when remarketing lifetime value is included, and walked through a marginal-optimization framework where you fund campaigns from most to least efficient until the next dollar stops paying off.
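The marginal-optimization step can be sketched in a few lines of Python. This is an illustration with made-up numbers, not the analysis from my report: the `Tranche` class, the `allocate` function, the campaign labels, and the revenue figures are all hypothetical, and the sketch assumes the incremental revenue of each spend tranche is already known, which in practice is the hard part.

```python
from dataclasses import dataclass


@dataclass
class Tranche:
    """One block of spend on a campaign (all values are hypothetical)."""
    campaign: str
    cost: float
    incremental_revenue: float  # assumed known; estimating it is the real work

    @property
    def marginal_roas(self) -> float:
        return self.incremental_revenue / self.cost


def allocate(tranches, budget):
    """Fund tranches from most to least efficient.

    Stops when the next tranche returns less than it costs,
    or when the remaining budget cannot cover it.
    """
    funded, remaining = [], budget
    for t in sorted(tranches, key=lambda t: t.marginal_roas, reverse=True):
        if t.marginal_roas < 1.0 or t.cost > remaining:
            break
        funded.append(t)
        remaining -= t.cost
    return funded, remaining


# Invented example: returns flatten within each campaign as spend grows.
tranches = [
    Tranche("A", 100.0, 300.0),
    Tranche("A", 100.0, 150.0),
    Tranche("A", 100.0, 80.0),
    Tranche("B", 100.0, 200.0),
    Tranche("B", 100.0, 110.0),
    Tranche("C", 100.0, 60.0),
]
funded, remaining = allocate(tranches, budget=500.0)
for t in funded:
    print(t.campaign, t.marginal_roas)
print("unspent:", remaining)
```

With these invented figures, four tranches get funded and 100 of the budget stays unspent, because the next-best tranche returns only 0.80 per unit of spend.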

I did not get the job. I did not, exactly, play it safe in the interview — I told them the assignment was boring, because the data could not answer the questions it was asking. I’ll come back to why I pushed that hard. First, the broader pattern.

It wasn’t a one-off

I have spent the past few months interviewing for data roles in Prague. About forty companies. Banks, e-commerce platforms, gaming firms, sports betting, retailers, telecoms, consultancies, automotive, insurance, media. A representative cross-section of the local data market.

What I found, with a few notable exceptions, was a pattern I’ve now seen often enough that I can describe it from memory.

The job posting asks for a “data analyst” or “data scientist.” During the interview, the hiring manager asks how I would measure incrementality, attribute conversions, or estimate the effect of a campaign. The role description lists Python, SQL, statistical modeling, A/B testing.

Then the test assignment arrives, or the work gets described in detail, and the truth becomes visible: the actual job is building dashboards in Power BI, writing SQL for cross-tabs, monitoring KPIs week-over-week, and producing reports that describe what happened. The questions during the interview were aspirational. The work is something else.

Another company sent me a take-home assignment with a dataset that could only support frequency tables and conditional means. Another wanted me to “tell a story with the data” but the data couldn’t support causal claims of any kind. Another asked me about A/B testing methodology in the interview, then explained that the team had never actually run an experiment.

This is not a Czech problem. I’ll get to that in a moment. But it is a problem.

Three modes of data work

It helps to name what we’re talking about. Most data work in industry falls into one of three modes.

Dashboarding is descriptive and retrospective. It answers the question “what happened?” Power BI, Tableau, Looker, KPI reports, weekly cross-tabs, executive summaries. This is real work. Done well, it produces the shared situational awareness an organization needs to function. But on its own, dashboarding cannot tell you why something happened, what to do about it, or what would have happened under a different decision.

Analytics, in the precise sense I want to use here, is explanatory and prescriptive. It answers “why did this happen, and what should we do?” Causal inference, A/B testing, lift studies, attribution, mediation analysis, structural econometric models, marginal optimization. These techniques require careful identification assumptions, thoughtful experimental or quasi-experimental design, and, critically, data infrastructure built to support counterfactual questions: linked sessions, longitudinal panels, randomized treatments, controlled exposures.

Data science, in its predictive form, answers “what will happen?”, or builds systems that automate decisions. Forecasting, classification, recommender systems, NLP, computer vision, and more recently large language models. Predictive modeling has its own well-developed methodology and is genuinely valuable for the right problems.

These three modes are not in competition with each other. Mature organizations do all three. The question is whether the organization has the architecture and the talent to do all three, and whether decision-makers know which mode is appropriate for which question.

What I have observed in most companies, both in Prague and globally, is that dashboarding has eaten everything. Analytics barely exists. Predictive data science exists but is often misapplied, asked to answer counterfactual questions that prediction cannot answer.

Where the US seems to be ahead

I want to update something. When I first started thinking about this, my impression was that the gap I was describing in Prague was roughly mirrored in the US. The longer I look, the less sure I am that’s right — at least when it comes to the leading edge of the US tech industry and the US data-science job market.

A handful of US firms have made causal inference central to their data work. Amazon brought economists into its data teams long before most others did. Uber recruited John List as Chief Economist. eBay built an economics team under Steve Tadelis. Microsoft Research has long run a serious causal inference program. LinkedIn and Zillow have followed. These firms run experiments at scale, treat attribution and lift measurement as first-class problems, and design their data infrastructure to answer counterfactual questions.

The US job market reflects this. Open LinkedIn on a given day and you will find dozens of data-scientist and applied-economist roles where the title means A/B testing, causal inference, or experimentation — and the candidates being hired are expected to know what an instrumental variable is and how to read a treatment effect. The labels and the work are aligned.

Of course, not every US firm has caught up. Plenty have dashboards, ML models, and an “AI strategy” without a coherent program for causal estimation. But the leading edge is sharper than what I have observed in Prague, and the job market signals more clearly what good looks like.

That alignment between label and practice is much rarer in the Czech market. I have seen a number of Prague job ads that list A/B testing or causal inference among the responsibilities. I have not yet seen one where the label corresponded to a real underlying practice. The Scalepunk-style mismatch — analytical questions in the job description, dashboarding-grade work in reality — appears to be the dominant pattern, not the exception.

There are exceptions. Out of the forty-odd companies I interviewed with, several stood out as taking analytics seriously: Yieldigo, which builds dynamic pricing systems and lives or dies by causal estimates of price elasticity; Ematiq, in sports betting, where margins are thin enough that lift measurement is an existential concern; Allegro, the e-commerce platform, which has invested in real experimentation and attribution infrastructure; and Filevine, CDN77, Mastercard, and Dateio, where my conversations suggested a similar level of methodological seriousness. They are organizations where someone in leadership understood that the questions they wanted to answer required a different kind of infrastructure.

But on what I have observed, the broader Czech tech market is further behind the leading edge of the US industry than I had initially assumed.

The management gap

Why is this so rare? The answer is partly historical.

The data revolution of the past two decades was largely about products that use data — recommender systems, search ranking, ad bidding, content feeds, fraud detection. These are predictive systems embedded in software. They generate revenue automatically once deployed. Management never had to understand the math behind them. They hired computer scientists, gave them latitude, and watched the metrics improve. The expertise sat behind an API. The business captured the value without ever having to learn the language.

Analytics in the sense I’ve been describing — causal estimation, attribution, lift measurement, the Hanssens marketing-mix tradition — does not work that way. It requires a partnership between the analyst and the executive. The analyst cannot do the work in isolation, because the questions are interpretive. Which counterfactual do we want? What assumptions are we willing to defend? What decision will this estimate actually inform? The executive cannot delegate the work entirely, because they have to be able to formulate the question, understand the answer, and act on it.

That partnership requires shared vocabulary. The executive does not need to know how to run a regression. But they do need to know what a confidence interval is, what statistical significance means and does not mean, what a p-value is and is not, and — most importantly — the difference between accounting and analytics. Accounting describes what happened. Analytics explains why, and predicts what would happen under a different decision.

Management schools do not teach this. Most MBAs take a single statistics course that fades within months. Executives rise through marketing, finance, operations, engineering. They learn to demand “data-driven” decisions without learning the vocabulary that would let them evaluate whether the data is actually telling them what they think it is.

I saw the consequence of this gap up close in another Prague take-home assignment. An analytics consultancy sent me e-shop transaction data and asked, among other questions: did increasing the marketing budget on Google Ads, Sklik, and Facebook on a specific date lead to a change in sales? Average daily orders fell 55% after the budget increase. To an accounting eye, that looks bad: the budget went up, sales went down. But after controlling for trend and seasonality with both OLS and Poisson regression, the decline was not statistically distinguishable from background variation. A power analysis showed the dataset could reliably detect only changes of 38% or more in either direction; anything subtler would have been invisible. The honest answer was: the data can neither confirm nor deny that the budget change had an effect. To know, you would need session-level analytics data, or better, a controlled experiment next time.

I wrote it up with a section explicitly distinguishing the accounting view (the raw numbers) from the statistical view (what we can defensibly conclude). I did not hear back. I suspect the report read as inconvenient. The manager wanted a clear yes-or-no. The honest answer was “we don’t know, and here is why we don’t know.” That answer required understanding what the dataset could and could not support, and not every reader is prepared for that.
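The power calculation behind a statement like that can be sketched with the standard library alone. This is a normal-approximation sketch, not the calculation from my report: `minimum_detectable_effect` is a hypothetical helper, and the standard error of 0.10 is an invented input.

```python
from statistics import NormalDist


def minimum_detectable_effect(std_error, alpha=0.05, power=0.80):
    """Smallest true effect a two-sided z-test detects with the given power.

    `std_error` is the standard error of the estimated effect, in the same
    units as the effect itself (here: relative change in daily orders).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = z.inv_cdf(power)          # quantile for the desired power
    return (z_alpha + z_power) * std_error


# Hypothetical standard error of the estimated budget-change effect.
mde = minimum_detectable_effect(std_error=0.10)
print(f"minimum detectable effect: {mde:.2f}")
```

Under these assumptions the minimum detectable effect is about 0.28, so a true change smaller than roughly 28% would usually go unnoticed at the conventional 5% significance level and 80% power. A noisier series means a larger standard error and a proportionally larger blind spot.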

The training gap

The management gap is compounded by a training gap.

Most data scientists and analysts are recruited from computer science programs, and computer science programs train students in prediction. Supervised learning, neural networks, classification, regression as a fitting exercise. These are valuable techniques. But they are designed to answer the question “what will happen given what I have observed?”, not “what would happen if I changed something?”

Causal inference is its own field, with its own intellectual lineage: Donald Rubin’s potential outcomes framework, Judea Pearl’s causal graphs, Joshua Angrist and Guido Imbens on instrumental variables, James Heckman on selection, and Hanssens, Parsons, and Schultz on market response models. These ideas live in economics and biostatistics departments, not in most CS programs. A data scientist who knows XGBoost cold may never have seen a directed acyclic graph, never run a difference-in-differences, never thought hard about identification.

The current AI cycle is making this worse, not better. Every organization wants an “AI strategy.” The phrase “data-driven” gets attached to anything involving a model. But the techniques that actually answer business questions about causal effects — the techniques that justify a recommendation to shift four million dollars from Google to Meta — are not the same techniques that power chatbots and image generators. The hype is pulling talent and attention toward generative models and predictive ML, and away from the less glamorous but more decision-relevant work of causal estimation.

Prediction and causal inference are different tools for different jobs

None of this is an argument against ML. Predictive ML and causal inference are not competing paradigms. They are different tools that answer different questions, and the most sophisticated organizations use both — often together.

If I want to predict next quarter’s demand for a product, a gradient boosting model trained on historical sales, seasonality, and macro indicators is exactly the right tool. The question is associative: given everything I know, what is the best estimate of next quarter’s demand? Predictive ML excels at this.

But if I want to know what happens to demand if I raise prices by 10%, no amount of training data on past prices and past demand will, on its own, give me the right answer, because past prices were not set randomly, and the same observed correlation between price and demand can be generated by very different causal structures. This is a counterfactual question. It requires a randomized experiment, a natural experiment, an instrumental variable, or a structural model that explicitly identifies the price elasticity I am after.
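A small simulation makes the point concrete. Everything in it is assumed: the true price effect of -2, the unobserved `appeal` variable that drives both price and demand, and the noise levels. The sketch compares the slope estimated from observational prices with the slope estimated from randomized prices.

```python
import random

TRUE_EFFECT = -2.0  # assumed causal effect of one unit of price on demand


def ols_slope(x, y):
    """Slope of the least-squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


def estimated_price_slope(randomized, n=20_000, seed=0):
    """Simulate demand and return the estimated slope of demand on price."""
    rng = random.Random(seed)
    prices, demand = [], []
    for _ in range(n):
        appeal = rng.gauss(0, 1)  # unobserved driver of demand
        if randomized:
            # Price set independently of appeal, as in an experiment.
            price = 10 + rng.gauss(0, 1)
        else:
            # Price raised when appeal is high, as a manager might do.
            price = 10 + appeal + rng.gauss(0, 0.5)
        quantity = 100 + TRUE_EFFECT * price + 5 * appeal + rng.gauss(0, 1)
        prices.append(price)
        demand.append(quantity)
    return ols_slope(prices, demand)


observational = estimated_price_slope(randomized=False)
experimental = estimated_price_slope(randomized=True)
print(f"observational slope: {observational:+.2f}")
print(f"experimental slope:  {experimental:+.2f}")
```

In this setup the observational slope comes out near +2, the wrong sign, while the randomized slope recovers a value close to the assumed -2. More rows would not fix the first estimate; only a different design does.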

Sophisticated marketing analytics — the kind of work that Hanssens, Parsons, and Schultz describe in Market Response Models — uses both. A demand forecast from a GBM can feed into a structural marketing mix model that decomposes revenue into baseline demand, advertising effects, promotional lifts, and seasonality. The predictive component handles the data-rich, associative part. The structural component handles the causal decomposition. Together, they answer questions neither one could answer alone.

The mistake is not using ML. The mistake is using ML to answer questions it cannot answer, because nobody on the team knew the question was causal in the first place.

Why I pushed

I escalated in that interview. I told them, in plain terms, that the assignment was boring — not because the questions were uninteresting, but because the data they had given me could not answer them. I asked whether their production data was structured the same way as the assignment data, or whether the broken linkage was something they had introduced for the test. They told me their production data could be linked end-to-end. Which left me with a question I could not put down: why send a candidate a dataset that prevents the analyses you are asking for? Were they testing whether candidates would notice? Did they not realize themselves what format their own data needed to be in to answer the business questions they kept raising?

I pushed harder than I had to. I knew it. I have my reasons.

I spent the last decade at a previous organization — a serious institution doing serious work — where leadership routinely demanded data-driven decisions while the underlying infrastructure could not support them, and where any suggestion that the architecture itself needed investment was treated as complaining rather than as useful information. Data, in that environment, was not a tool for learning. It was a marketing instrument — something to point at, not something to be tested by. After ten years of trying to shift that culture from the inside, I had nothing to show for it.

My time at Amazon taught me the opposite lesson. A confrontational style there was not a flaw to be smoothed over — it was how the organization got to the right answer fast. If a colleague’s analysis had a problem, you said so. If a leader was about to make a decision based on a number that did not mean what they thought it meant, you stopped them. The discomfort was the point. The discomfort produced the work.

I sent the company my follow-up report not just as analysis but as a diagnostic. If they had read it and said “yes, this is exactly the kind of thinking we need,” I would have wanted to work with them. If they had read it and got defensive, I would have had my answer. The fact that I did not hear back is a result, not a failure.

I will turn forty next year. I do not have another ten years to spend at organizations that mistake dashboarding for analytics, or where the architectural conversation is treated as criticism instead of contribution. I am being more careful now about whom I lend my time to. That selectivity is a luxury I did not have a decade ago.

If I get rejected by a company that does take analytics seriously, it will be because they operate above my current skill level. That kind of rejection I would accept gratefully, treat as a signal to grow, and use to plan what to study next. But rejection from a company that did not engage with the diagnostic I was offering is not really a rejection. It is a filter, and it filtered the right way.

Why bother trying, then?

A reader might reasonably ask: if these gaps are so visible, why spend a decade trying to convince an organization that did not get it? Why send a follow-up report to a company that handed you a broken assignment? Why not just walk away and work with organizations that already understand?

The answer is economic. Marginal productivity is highest where productivity is currently lowest. The organizations that have done this poorly for the longest have the most to gain from doing it well. There are real fixed costs — hiring the right people, building the right infrastructure, training executives in basic statistical literacy. But once those costs are paid, the gains in the first quarter or two can be enormous. The first few steps from zero are the equivalent of going from 0 to 60 miles per hour.

I saw this concretely at Amazon. One of my team’s earliest meaningful wins was the transition from magic numbers — hand-picked weights that engineers and product managers had been arguing about for years — to the first data-driven model that estimated how much customers actually valued different aspects of different sellers’ offers for the same product, and ranked those offers accordingly. The economic gain in the first three months was substantial, and the methodology generalized to other ranking problems across the marketplace. The lesson was not that the model was sophisticated. It was that anything grounded in actual customer behavior beats anything grounded in argument and intuition.

The same logic applies to most of the companies I have been describing. The marketing firm with the broken click_id linkage would see disproportionate returns the moment they fix it, because they are starting from zero attribution. The e-shop manager who reads a 55% post-budget drop as causal would, the moment they learn to control for trend and run a power calculation, stop making expensive decisions on noise. The first proper marketing-mix model in an organization that has never built one will surface insights that nobody currently has.

This is also why I keep trying with managers and executives who do not have the technical background but seem genuinely curious. The cost of bringing them up to speed is small relative to the gains. The transition from “we should probably look at the data” to “we have a clear hypothesis and a way to test it” is a transition almost anyone with intellectual humility can make. Helping a Yieldigo or an Allegro improve their methodology by five percent is genuinely useful. But it does not move the same kind of needle as helping a company go from “we have a Power BI dashboard” to “we know which marketing dollar is paying off.”

What would help

A few concrete moves, in rough order of leverage.

Hire economists alongside CS-trained data scientists. This is what Amazon, Uber, eBay, and Microsoft Research figured out. Economists are trained from day one to think about identification — to ask “what would I need to assume in order to interpret this estimate causally?” That habit of mind is rare among CS-trained data scientists, not because they are incapable of learning it, but because their training did not emphasize it.

Design data infrastructure for causal questions. Linked sessions across acquisition and retention. Longitudinal user panels. Randomized treatments wherever feasible. Controlled exposure logs. Holdout groups for measurement. These design choices need to happen upstream of the analyst, not after the fact. The Scalepunk-style assignment I described — analytical questions on top of dashboarding-grade data — is the predictable consequence of skipping this step.
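A toy example shows what the linkage buys. Only `click_id` comes from the dataset I described; `user_id`, the other field names, the `value_by_platform` helper, and all values are invented for illustration.

```python
from collections import defaultdict

# Hypothetical event rows: two acquisition sessions, two remarketing sessions.
events = [
    {"click_id": "c1", "user_id": "u1", "platform": "google", "stage": "acquisition", "revenue": -2.0},
    {"click_id": "c2", "user_id": "u2", "platform": "meta", "stage": "acquisition", "revenue": -1.5},
    {"click_id": "c3", "user_id": "u1", "platform": None, "stage": "remarketing", "revenue": 9.0},
    {"click_id": "c4", "user_id": "u2", "platform": None, "stage": "remarketing", "revenue": 2.0},
]


def value_by_platform(rows, key):
    """Sum revenue per acquisition platform, linking rows through `key`."""
    # Map each key value to the platform that originally acquired it.
    acquired_via = {
        r[key]: r["platform"] for r in rows if r["stage"] == "acquisition"
    }
    totals = defaultdict(float)
    for r in rows:
        platform = acquired_via.get(r[key])  # None when the row cannot be linked
        totals[platform or "unattributed"] += r["revenue"]
    return dict(totals)


session_view = value_by_platform(events, "click_id")
user_view = value_by_platform(events, "user_id")
print(session_view)
print(user_view)
```

Joined on `click_id`, the remarketing revenue lands in an unattributed bucket and both platforms look like pure cost. Joined on a user-level key, the same rows attribute 7.0 to one platform and 0.5 to the other, and the ranking flips.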

Treat attribution and lift measurement as first-class problems. Most marketing organizations have an attribution model that nobody fully trusts. Most product organizations have an A/B testing platform but lack the statistical sophistication to detect the effects they care about. These are solvable problems, but they require investment and serious engineering, not a single senior hire.

Match the question to the tool. A demand forecast is a prediction problem. A pricing decision is a causal problem. A recommender system is a prediction problem. An ad-spend reallocation is a causal problem. This sounds obvious. It is routinely violated.

The opportunity

The companies that figure this out first — globally, and in Prague — will pull ahead in ways that are hard to overstate. Better attribution means better marketing budgets. Better lift measurement means faster product iteration. Better causal models for pricing, retention, and cross-sell mean millions of dollars unlocked from data the company is already collecting.

The talent is there. The appetite is there. What is missing is the architectural shift — and a hiring philosophy that sees economists and statisticians as core to the data function, not peripheral to it.

The next time someone hands you a parquet file and asks you to find marginal returns, be prepared for the possibility that the data architecture does not allow the question to be answered at all. Check the linkage first. Otherwise you may spend hours processing data with SQL only to find out that the dataset is a dud.

If this resonates — or if you’re building a data team and want to talk about what causal infrastructure looks like in practice — reach out on LinkedIn or at nail@hassairi.com.