Key trends in quantitative evaluation for Government

By Hugh Miller
Principal
23 April 2026

Clear evidence about outcomes underpins sound policy and funding decisions, yet generating that evidence is rarely straightforward. Choices about data and evaluation design shape what conclusions are possible, particularly when trying to understand impact and why programs do (or don’t) work for different groups. Drawing on a decade of experience, Hugh Miller reflects on how quantitative evaluation has matured to better allow government to understand impact.

Recently, I had the opportunity to present some thoughts on how evaluation has changed over the past decade, with an emphasis on the quantitative analysis we specialise in. My main thoughts are summarised below and have implications for most areas of government.

I also recognise that many of the items below reflect my own growth as an evaluator as much as broader changes in the space.

Enduring data linkages are changing how governments evaluate impact.

Trend 1 – The welcome rise in enduring linked data assets

Data linkage has been a key part of the evaluation landscape for the past decade or so, since it enables us to look beyond what is in the specific program data. For instance, we can track outcomes across time (e.g. do people using a mental health program re-access services afterwards?), across services (how frequently do they present to hospitals with mental health issues?), and against broader comparisons (how do service patterns compare to those of similar people who did not access the program?).
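
To make this concrete, here is a minimal sketch of the kind of follow-up question linked data supports, in Python with pandas. All table names, columns, dates and flags are illustrative assumptions, not the structure of any real data asset.

```python
import pandas as pd

# Hypothetical extracts from a linked asset (all names are illustrative):
# one row per program participant, and one row per hospital presentation
program = pd.DataFrame({
    "person_id": [1, 2, 3],
    "exit_date": pd.to_datetime(["2024-03-01", "2024-04-15", "2024-05-20"]),
})
hospital = pd.DataFrame({
    "person_id": [1, 1, 3],
    "presentation_date": pd.to_datetime(["2024-06-10", "2025-01-05", "2024-07-01"]),
    "mh_flag": [True, False, True],  # mental-health-related presentation
})

# Join participants to their subsequent hospital activity
linked = program.merge(hospital, on="person_id", how="left")

# Flag mental health presentations within 12 months of program exit
in_window = linked["presentation_date"].between(
    linked["exit_date"], linked["exit_date"] + pd.DateOffset(months=12)
)
linked["mh_within_12m"] = in_window & linked["mh_flag"].fillna(False).astype(bool)

# Person-level outcome: share of participants re-presenting within 12 months
print(linked.groupby("person_id")["mh_within_12m"].any().mean())
```

The same pattern extends to cross-service and comparison-group questions by joining in further datasets from the asset.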

Many of our past evaluations have involved bespoke linkages, where a specific set of administrative and program data would be linked together for a specific cohort. This added significant time and effort to the analysis. In some cases, this was also duplicative and wasteful – we would see the same people and datasets linked multiple times for different projects.

A much better solution is to have regular, enduring linkages that can be accessed for specific projects. Governance and privacy provisions remain strong, but this approach avoids the delays associated with bespoke linkage and removes the duplication.

Examples of these assets include national ones such as the Person Level Integrated Data Asset (PLIDA, formerly MADIP) and the National Disability Data Asset, alongside various state-based equivalents.

Despite this progress, significant work remains. An obvious challenge in Australia is the gap between State and Territory datasets and Commonwealth ones – sharing and governing data for linkage remains complex, inhibiting work in areas of shared responsibility such as health, education and disability.

Trend 2 – An increasing appetite to address causation with administrative data

We’ve built a lot of models based on administrative data for government, and most carry the well-worn disclaimer that ‘correlation does not imply causation’. That reluctance to tackle questions of causation often reflects a proper assessment of what the data can support, but it is unsatisfying for policymakers who want to understand how a program or policy is affecting people.

We have seen something of a causal revolution over the past two decades, with much theoretical and applied work in statistics and econometrics exploring new methods and the circumstances in which causal interpretation is reasonable (see for instance Pearl, 2009 or Imbens & Rubin, 2015*). These causal methods are increasingly popular, particularly in economics.

I think this push is mostly positive – it is useful to have researchers think deeply about questions of cause, and about what assumptions need to hold for results to be causal. It is also useful to have a variety of tools available to explore these questions when the right data presents itself. And it has encouraged researchers to be on the lookout for natural experiments that can create the insights we seek.
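
As an illustration of one tool in that kit, the sketch below runs a difference-in-differences regression on invented data. It is not any specific evaluation: the interaction coefficient estimates a program effect only under the parallel-trends assumption.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic person-period data: service costs for a program group and a
# comparison group, before and after the program start (figures invented)
df = pd.DataFrame({
    "cost":    [520, 480, 300, 310, 500, 505, 495, 490],
    "treated": [1, 1, 1, 1, 0, 0, 0, 0],  # 1 = program participant
    "post":    [0, 0, 1, 1, 0, 0, 1, 1],  # 1 = period after program start
})

# Difference-in-differences: the treated:post coefficient is the estimated
# program effect, valid only if both groups would have trended in parallel
model = smf.ols("cost ~ treated * post", data=df).fit()
print(model.params["treated:post"])
```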

There is always a question of whether the pendulum has swung too far, and in some cases it has. There are pieces of work where the word ‘causal’ is added on the strength of an analytic technique, when the underlying data warrants far more caution due to confounding factors or selection effects. Overall, though, we are in a better place with our larger toolkit.

Trend 3 – More ingenuity in getting from administrative data to useful outcomes

One area I’ve particularly enjoyed is thinking carefully in the design stage of an evaluation to understand whether there are clever ways to exploit the data to ask more strategic questions.

As an example, consider health interventions designed to help people regularly take their medication. A straight linkage between the program and an outcome such as hospital presentations would be unlikely to reveal much – we might actually see that people who take more medication go to hospital more (since their starting health may be worse). However, a more targeted approach that looks at outcomes for people hospitalised with a specific condition, then given a post-hospital medication regimen, allows for a more controlled program and comparison group analysis, including of subsequent re-admissions. In this case, we throw a lot of data away (by focusing on specific sub-cohorts and hospital presentations) but end up with a much more meaningful test.
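
A sketch of that targeting logic is below, under assumed table names, column names and condition codes (none of which come from a real evaluation).

```python
import pandas as pd

# Illustrative linked extracts; every name and value here is an assumption
admissions = pd.DataFrame({
    "person_id": [1, 2, 3, 4],
    "admit_date": pd.to_datetime(["2023-01-10", "2023-02-01",
                                  "2023-02-20", "2023-03-05"]),
    "condition": ["heart_failure", "heart_failure", "fracture", "heart_failure"],
    "in_program": [True, False, False, True],  # received the adherence program
})
readmissions = pd.DataFrame({
    "person_id": [1, 2, 2],
    "readmit_date": pd.to_datetime(["2023-08-01", "2023-04-10", "2023-09-15"]),
})

# Step 1: throw data away – keep only index admissions for the target condition
cohort = admissions[admissions["condition"] == "heart_failure"]

# Step 2: flag any readmission within 12 months of the index admission
merged = cohort.merge(readmissions, on="person_id", how="left")
in_window = merged["readmit_date"].between(
    merged["admit_date"], merged["admit_date"] + pd.DateOffset(months=12)
)
readmit_12m = in_window.groupby(merged["person_id"]).any()

# Step 3: compare readmission rates for program vs comparison sub-cohorts
summary = cohort.set_index("person_id").assign(readmit_12m=readmit_12m)
print(summary.groupby("in_program")["readmit_12m"].mean())
```

The filtering in step 1 discards most of the admissions data, but what remains supports a like-for-like comparison.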

In a world with more linkage and larger datasets, the opportunity to be targeted and creative in the outcomes we measure is greater.

Again, challenges persist, with some areas remaining difficult to convert into unambiguous, meaningful outcomes. For example, mental health services are hard to evaluate even with good linkage – a person no longer accessing services might indicate a good recovery, or an ongoing need that is no longer being met – and the data alone cannot distinguish between the two.

Trend 4 – A stronger connection between outcomes and benefits

Evaluations often include an economic component, seeking to understand whether the service offers good value for money, which in turn can inform program design and future funding.

Historically, economic analyses have varied widely – particularly in the value placed on outcomes. For example, the value of successfully helping someone into employment will vary with factors such as the assumed timeframe of employment, whether government and/or private benefits are recognised, and how the opportunity cost of time while employed is recognised. This variation can be partly resolved by good transparency in reporting and greater use of standardised benefit values to ensure consistency.
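
To see how much those choices matter, here is a toy calculation – every wage, rate and duration is an assumption for illustration only, and it ignores the opportunity cost of time entirely.

```python
# Toy calculation: how assumptions change the value placed on one person
# moving into work (all dollar figures, rates and durations are assumptions)
weekly_wage = 1_200        # assumed gross weekly wage
welfare_saving = 400       # assumed weekly reduction in income support
tax_rate = 0.20            # assumed average tax rate on the wage

for weeks in (26, 52, 104):  # assumed duration of employment
    govt = weeks * (welfare_saving + tax_rate * weekly_wage)   # fiscal benefit
    private = weeks * weekly_wage * (1 - tax_rate)             # take-home pay
    print(f"{weeks} weeks: government ${govt:,.0f}, total ${govt + private:,.0f}")
```

Even in this toy example, the same employment outcome is valued anywhere between roughly $17,000 (government benefits over 26 weeks) and $166,000 (all benefits over 104 weeks), which is why transparency and standardised values matter.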

A more fundamental source of variation is the degree to which assumptions (whether drawn from broader evidence or plausibly guessed) are used, compared to numbers tied to quantitative outcomes. At worst, this creates unhelpful estimates that are not grounded in reality. My all-time least favourite economic analysis of a government program took estimated fiscal multipliers from international health programs and assumed they were a reasonable estimate of the program’s benefit-to-cost ratio – meaning large benefits appear irrespective of whether any actual value is delivered.

Encouragingly, there is a growing expectation that economic benefit estimates be meaningfully tied to measured outcomes. This can be challenging when funding cycles are short and evidence is still emerging, but the added rigour and discipline will ultimately lead to a better articulation of what a program is achieving.

Final thoughts

I’m ultimately optimistic about where evaluation is heading. Today, expectations are higher: evaluations are more likely to focus on measurable outcomes and explore impact. This shift means evaluation is better serving the need to inform funding and policy choices. This raises the bar for all of us – not just to produce analysis, but to be clear about what programs are achieving, for whom, and why.

*Pearl, J. (2009). Causality: Models, Reasoning, and Inference (2nd ed.). Cambridge University Press; Imbens, G. W., & Rubin, D. B. (2015). Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction. Cambridge University Press.

