
Hospital league tables could be returning to the UK. On the 21st anniversary of the Royal Statistical Society Working Party report on Performance Monitoring in the Public Services, Sheila Bird hopes the benefits of performance monitoring will not be overlooked in the heated debate

 

In early November 2024, I awoke to the news that Sir Keir Starmer’s Labour government was re-introducing performance monitoring in the public services – ranging from health and social care to policing and criminal justice. Described in the media as “a football-style league table” designed to “name and shame” England’s worst-performing hospitals, the plan drew criticism from many, including health bodies such as the NHS Confederation. My view on performance monitoring (PM) is this: league tables are not a good idea, but clued-up performance monitoring is.

Performance monitoring was introduced across UK public services by Tony Blair’s government in 2001, in an attempt to measure outcomes and as a goad to efficiency and effectiveness. The 2003 Royal Statistical Society Working Party, of which I was chair, identified three broad aims of PM: to establish “what works” in promoting stated objectives of the public services (research role); to identify the functional competence of individual practitioners or organisations (managerial role); and to provide public accountability by ministers for their stewardship of the public services (democratic role).

Will our 2003 report help to ensure that – this time around – departments have well-thought-out performance indicators that do not create perverse incentives and are detailed in each department’s PM protocol? This written protocol, we argued, should be open-access and should set out performance targets that are reasoned and realistic (not aspirational), together with how target compliance will be measured in a principled way that abides by methodological rigour.

The Royal Statistical Society held a press conference in October 2003 to launch its Working Party Report, Performance Indicators: Good, Bad and Ugly1. As chair, and a proud Scot, I had suggested a heather colour for the report’s cover. The publisher’s interpretation was “bell heather pink”, which my male colleagues tolerated with good humour. I had acquired my first mobile phone, also pink, to receive press calls about PM, and my current phone is still referred to as “The Pink”.

In addition to its formal recommendations (which I summarise below), the Working Party commended – and illustrated – the use of funnel plots with superimposed 95% and 99% uncertainty bounds (or “tram-lines”) to show how statistical variation depends on throughput: for example, the number of operations performed or the number of prisoners held per establishment. Following our 2003 report, we were told on good authority that Prime Minister Blair referred to uncertainty bounds as “tram-lines” and appreciated that there was no need for ministers to intervene in respect of prisons or hospitals that were within those tram-lines.
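The idea behind those “tram-lines” can be sketched in a few lines of code. The snippet below is a minimal illustration, not the Working Party’s own method: it uses the standard normal approximation for a binomial proportion, and the 5% baseline event rate is an assumption chosen purely for demonstration. As throughput (the denominator) grows, the 95% and 99% limits funnel inwards, which is why a small unit far from the average may still lie within the tram-lines while a large unit with the same rate does not.

```python
import math

def funnel_limits(p0, n, z):
    """Normal-approximation control limits for a proportion,
    given baseline rate p0 and throughput (denominator) n."""
    se = math.sqrt(p0 * (1 - p0) / n)           # standard error shrinks as n grows
    return max(0.0, p0 - z * se), min(1.0, p0 + z * se)

# Two-sided z-values for the 95% and 99% "tram-lines"
Z95, Z99 = 1.96, 2.576

baseline = 0.05  # illustrative assumption: 5% overall event rate
for n in (50, 200, 1000):
    lo95, hi95 = funnel_limits(baseline, n, Z95)
    lo99, hi99 = funnel_limits(baseline, n, Z99)
    print(f"n={n:5d}  95%: ({lo95:.3f}, {hi95:.3f})  99%: ({lo99:.3f}, {hi99:.3f})")
```

Plotting these limits against n, with each hospital or prison as a point, gives the funnel plot: only units outside the outer tram-lines call for ministerial attention.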

Done well, PM can lead to good results. Done badly, PM can be ineffective if it promotes perverse behaviours or demoralises, rather than inspires, staff.

Last week, I attended a conference in Glasgow where the last speaker was Susanne Millar, a social worker by profession and former chief officer of Glasgow’s Health and Social Care Partnership. She had inspired her social work staff by deploying the intensive review procedure – usually reserved for when cases have gone badly wrong – for cases that had gone exceptionally well, so that staff learned from, and were encouraged by, what had been done well. It was a brilliant decision that empowered colleagues to have the courage to take difficult decisions: for example, not to use an out-of-area private placement to solve a late-Friday child-safety crisis, but to take the time to find a better solution. Even 15 years ago, placing an at-risk child in out-of-area privately provided care would cost half a million pounds per annum.

It is good that the discussion around PM has been reignited, and that our report is reaching the new generation of policymakers, advisors and ministers. At my request, in November 2024 the RSS wrote to ministers and chief scientific advisors in government departments where PM has already been announced to remind them about the RSS’s recommendations on how to ensure that their performance monitoring is well-designed, realistically targeted and methodologically robust. One recipient was kind enough to thank the RSS: “for reminding me about this report which I read with interest at its first release – it is just as good to read again”.

Recommendations:

1. All PM procedures need a detailed protocol.
2. A PM procedure must have clearly specified objectives and achieve them with methodological rigour. Individuals and/or institutions monitored should have substantial input to the development of a PM procedure.
3. A PM procedure should be so designed that counter-productive behaviour is discouraged.
4. Cost-effectiveness should be given wider consideration in both the design and the evaluation of PM procedures. Realistic assessment of the burden (indirect as well as direct) of collecting quality-assured PM data is important, for PM’s benefits should outweigh the burden.
5. Independent scrutiny of a PM procedure is needed as a safeguard of public accountability, methodological rigour and of the individuals and/or institutions being monitored. The scrutineers’ role includes checking that the objectives of PM are being achieved without disproportionate burden, inducement of counter-productive behaviours, inappropriate setting or revision of targets or interference in, or over-interpretation of, analyses and reporting.
6. Performance indicators (PIs) need clear definition. Even so, they are typically subject to several sources of variation, essential or systematic – due to case-mix, for example – as well as random. This must be recognised in design, target setting (if any) and analysis.
7. The reporting of PM data should always include measures of uncertainty.
8. Investigations on a range of aspects of PM should be done under Research Council sponsorship, including study of the relative merits of different dissemination strategies for public release of PM data.
9. Research should be undertaken on robust methods for evaluating new government policies, including the role of randomised controlled trials. In particular, efficient designs are needed for when government departments, in accordance with budgetary or other constraints, introduce (or ‘roll-out’) a series of PI-monitored policies.
10. Ethical considerations may be involved in all aspects of PM procedures, and must be properly addressed.
11. A wide-ranging educational effort is required about the role and interpretation of PM data.

 

Reference

  1. Bird, S. M., Cox, D. R., Farewell, V. T., Goldstein, H., Holt, T. and Smith, P. C. (2005). Performance Indicators: Good, Bad and Ugly. Journal of the Royal Statistical Society: Series A (Statistics in Society), 168(1), 1–27. https://doi.org/10.1111/j.1467-985X.2004.00333.x

 

Sheila M. Bird is a former programme leader at the MRC Biostatistics Unit, Cambridge, and is honorary professor at the College of Medicine and Veterinary Medicine at Edinburgh University.

 
