Cardiovascular Journal of Africa: Vol 23 No 10 (November 2012)


Discussion

This study compared the performance of four methods for profiling hospitals and assessed their agreement. The methods combined two Bayesian models, fixed effects and hierarchical (random effects), with two ways of identifying outliers: two methods classified a hospital by the probability that its risk-adjusted mortality rate (RAMR) exceeded some threshold, and two by the hospital’s rank for its RAMR, in each case obtained from fitting both fixed- and random-effects models. Agreement between the methods was examined empirically using an extensive dataset of ACS patients.

Although all the methods were able to classify hospitals as low- and high-outcome outliers, profiling methods using random-effects models were more conservative than those using fixed-effects models in classifying hospitals as having better- or worse-than-expected mortality. These findings were expected on theoretical grounds and are consistent with the results of many prior studies showing that random-effects models identify fewer performance outliers [8,11].
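The theoretical expectation can be illustrated with a minimal sketch of shrinkage. It assumes a simple normal-normal hierarchical model rather than the risk-adjusted models fitted in this study, and every name and number in it is hypothetical. Under that assumption, a hospital's estimate is pulled towards the overall mean, and the pull is strongest for hospitals with few patients, which is why fewer hospitals end up flagged as outliers.

```python
def shrunken_estimate(hospital_mean, n_patients, overall_mean,
                      within_var, between_var):
    """Posterior mean of a hospital's rate under a normal-normal model.

    Simplifying assumption: known within-hospital and between-hospital
    variances. This is an illustration, not the model used in the study.
    """
    # Weight given to the hospital's own data; it approaches 1 as n grows.
    weight = between_var / (between_var + within_var / n_patients)
    return weight * hospital_mean + (1 - weight) * overall_mean


# Hypothetical values: two hospitals with the same observed mortality (20%)
# but very different numbers of patients; the overall mean is 10%.
small = shrunken_estimate(0.20, 25, 0.10, within_var=0.09, between_var=0.001)
large = shrunken_estimate(0.20, 2500, 0.10, within_var=0.09, between_var=0.001)

print(f"small hospital: {small:.3f}, large hospital: {large:.3f}")
```

A fixed-effects analysis would leave both hospitals at their observed 20%; in this sketch the small hospital's estimate moves most of the way back towards the overall mean, while the large hospital's barely changes.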

In the present study, the observed agreement in the methods’ classification of hospitals ranged from 90 to 98%, the highest being between the two methods within each effects model. Agreement was excellent (κ = 0.77) in only one of the six comparisons; in the remaining five, it was at best moderate (κ < 0.75).
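As a minimal sketch of how such an agreement statistic can be obtained, the following computes an unweighted Cohen's kappa from a 3 x 3 cross-classification. It assumes the unweighted form of kappa, which the text does not state, and the counts are the Hierarchical rank versus Hierarchical RAMR block of Table 3 as best it can be read (one zero cell is inferred from the column totals).

```python
def cohens_kappa(table):
    """Unweighted Cohen's kappa for a square cross-classification table."""
    total = sum(sum(row) for row in table)
    # Proportion of hospitals placed in the same category by both methods.
    observed = sum(table[i][i] for i in range(len(table))) / total
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Agreement expected by chance, given each method's marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Rows: Hierarchical rank (Low, Normal, High).
# Columns: Hierarchical RAMR (Low, Normal, High).
hier_rank_vs_hier_ramr = [
    [2, 0, 0],
    [4, 117, 0],
    [0, 0, 5],
]

kappa = cohens_kappa(hier_rank_vs_hier_ramr)
print(round(kappa, 2))  # prints 0.77
```

Kappa discounts the agreement expected by chance, so it can be modest even when raw agreement is high, as happens here because most hospitals fall in the Normal category under every method.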

Our findings relied on routinely collected clinical data. Such data suffer from incompleteness and inaccuracy in the variables entered [31]. In our preliminary investigation, 11% of all patients had missing codes for survival status. We did not have complete data for admission age, SBP, HR, ECG findings and biochemical markers. Other risk variables that might have been used also had missing data, which limited the number of risk factors in the case-mix adjustment model on this occasion. However, our findings were robust to the choice of factors included in the risk-adjustment model. Indeed, difficult-to-obtain key clinical variables add little to the predictive power of ACS risk scores [27].

Variation in definitions and data quality may well have contributed substantially to the variation in hospital performance seen in this study, as alluded to by Lilford et al. [4]. However, it is unlikely that these issues alone could account for the outcome variation found across the four analytical strategies examined.

We did not impute missing data, since other researchers have shown that this does not affect the prediction model or mortality [32]. A more elaborate assessment of the effect of MINAP data quality and validity on the resulting classification of hospitals is the subject of a British Heart Foundation-funded project within our group, undertaken by Gale et al. [33]. For the present study, it suffices to say that the number of patients analysed was sufficient, and the data used were of sufficient quality, to enable a comparison of different methods for assessing hospitals’ performance on 30-day mortality among ACS patients. However, we remain cautious about the exact inferences made for some hospitals, given their data quality.

We performed a limited sensitivity analysis to different prior specifications of the hospital random-effects variation and to different threshold values. The classification of outlying hospitals was not affected by changes in the prior for the random-effects variation, but it was slightly affected when the thresholds were changed.

A more elaborate sensitivity analysis would alter the specification of the hospital random-effects distribution, as the assumed normal distribution is not robust or flexible enough to accommodate outlying hospital effects. Future research may therefore need to model the hospital effects more flexibly, for example with heavy-tailed t-distributions, to investigate both the sensitivity and the robustness of the results, as in Manda [34], or with mixtures or non-parametric Dirichlet distributions, as in Ohlssen [35].
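To illustrate why a heavy-tailed distribution accommodates outlying hospital effects better than a normal one, the sketch below compares the density of a standard Student t-distribution with that of a standard normal at increasing distances from the centre. The choice of 4 degrees of freedom is an assumption made for illustration only and is not taken from the study or from the cited work.

```python
import math


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)


def student_t_pdf(x, df):
    """Density of the standard Student t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


# Ratio of t density to normal density; df=4 is a hypothetical choice.
for z in (1, 2, 3, 4):
    ratio = student_t_pdf(z, df=4) / normal_pdf(z)
    print(f"{z} units from centre: t/normal density ratio = {ratio:.1f}")
```

The ratio grows rapidly in the tails, so a hospital effect several units from the centre is far less surprising under the t-distribution and is shrunk less aggressively towards the overall mean.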

The threshold chosen, and the probability of exceeding it required to classify a hospital as an outlier on the basis of its risk-adjusted mortality rate, were subjective and arbitrary. We could have used other thresholds and probabilities, as in Austin [12], which might have produced stronger or weaker agreement between the methods. Furthermore, the requirement that the interval for a hospital’s rank lie entirely within the bottom or top quarter of ranks for the hospital to be classified as an outlier was also arbitrary, although it has been used before [11,12].
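The two classification rules described above can be sketched as follows, assuming that posterior draws of each hospital's risk-adjusted mortality rate and rank are already available. The function names, cut-off values, required probability and 95% interval level are all hypothetical, as is the convention that rank 1 denotes the lowest mortality; the study's actual values are not restated here.

```python
def classify_by_threshold(ramr_draws, low_cut, high_cut, min_prob):
    """Flag a hospital when enough posterior mass lies beyond a cut-off."""
    n = len(ramr_draws)
    p_low = sum(d < low_cut for d in ramr_draws) / n
    p_high = sum(d > high_cut for d in ramr_draws) / n
    if p_high >= min_prob:
        return "High"
    if p_low >= min_prob:
        return "Low"
    return "Normal"


def classify_by_rank(rank_draws, n_hospitals, level=0.95):
    """Flag a hospital when its rank interval lies entirely in an outer quarter."""
    ordered = sorted(rank_draws)
    n = len(ordered)
    tail = (1 - level) / 2
    # Simple empirical quantiles of the posterior rank draws.
    lower = ordered[int(tail * n)]
    upper = ordered[min(n - 1, int((1 - tail) * n))]
    if upper <= n_hospitals / 4:
        return "Low"   # assumes rank 1 is the lowest mortality
    if lower > 3 * n_hospitals / 4:
        return "High"
    return "Normal"


# Hypothetical draws for one hospital among 128.
ramr_draws = [0.12] * 95 + [0.08] * 5
rank_draws = [100 + i % 28 for i in range(1000)]

print(classify_by_threshold(ramr_draws, low_cut=0.05, high_cut=0.10, min_prob=0.9))
print(classify_by_rank(rank_draws, n_hospitals=128))
```

Changing `min_prob`, the cut-offs or `level` changes which hospitals are flagged, which is the sense in which these choices are arbitrary and can alter the agreement between methods.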

Results from any study profiling hospitals’ performance are predictably used to produce league tables of performance. We are aware of the many criticisms surrounding the statistics used in measuring performance and the subsequent ranking of hospitals. We did not intend to contribute to this controversy. Our aim was to describe and compare the performance of four different Bayesian methods for institutional profiling. When ranks are used to compare hospitals, caution should be exercised, since most hospitals had considerably overlapping intervals, which made reliable ranking difficult, especially for hospitals admitting fewer patients.

We follow Normand et al. [10], Marshall and Spiegelhalter [11], Austin [12] and Ohlssen et al. [18] in advocating the use of Bayesian methods, which, by pooling data across hospitals, handle the problem of small hospitals better than frequentist methods, for which a minimum number of patients is required before a hospital can be included [12]. However, if we are willing to accept wide confidence intervals, exact probabilistic methods can be used within a frequentist framework to handle small hospitals (see Luft and Brown [36]).

Furthermore, it is much easier within

Bayesian methods to determine uncertainty associated with

the ranks, which are very sensitive to sampling variations (see

TABLE 3. CLASSIFICATION OF HOSPITALS UNDER THE FIXED AND HIERARCHICAL MODELS

                     Fixed RAMR           Fixed rank           Hierarchical RAMR
                     Low  Normal  High    Low  Normal  High    Low  Normal  High
Fixed RAMR
  Low                –    –       –       20   0       0       6    14      0
  Normal             –    –       –       7    88      0       0    95      0
  High               –    –       –       0    9       4       0    8       5
                                          κ = 0.71             κ = 0.46
Fixed rank
  Low                –    –       –       –    –       –       6    21      0
  Normal             –    –       –       –    –       –       0    96      1
  High               –    –       –       –    –       –       0    0       4
                                                               κ = 0.44
Hierarchical rank
  Low                2    0       0       2    0       0       2    0       0
  Normal             18   95      8       25   96      0       4    117     0
  High               0    0       5       0    1       4       0    0       5
                     κ = 0.32             κ = 0.29             κ = 0.77
