How do world and European standard populations impact burden of disease studies? A case study of disability-adjusted life years (DALYs) in Scotland

Background Disability-Adjusted Life Years (DALYs) are an established method for quantifying population health needs and guiding prioritisation decisions. Global Burden of Disease (GBD) estimates aim to ensure comparability between countries and over time by using age-standardised rates (ASR) to account for differences in the age structure of different populations. Different standard populations are used for this purpose but it is not widely appreciated that the choice of standard may affect not only the resulting rates but also the rankings of causes of DALYs. We aimed to evaluate the impact of the choice of standard, using the example of Scotland.

Methods DALY estimates were derived from the 2016 Scottish Burden of Disease (SBoD) study for an abridged list of 68 causes of disease/injury, representing a three-year annual average across 2014–16. Crude DALY rates were calculated using Scottish national population estimates. DALY ASRs standardised using the GBD World Standard Population (GBD WSP) were compared to those using the 2013 European Standard Population (ESP2013). Differences in ASR and in rank order within the cause list were summarised for all-cause and for each individual cause.

Results The ranking of causes by DALYs was similar using crude rates or ASR (ESP2013). All-cause DALY rates using ASR (GBD WSP) were around 26% lower. Overall 58 out of 68 causes had a lower ASR using GBD WSP compared with ESP2013, with the largest falls occurring for leading causes of mortality observed in older ages. Gains in ASR were much smaller in absolute scale and largely affected causes that operated early in life. These differences were associated with a substantial change to the ranking of causes when GBD WSP was used compared with ESP2013.

Conclusion Disease rankings based on DALY ASRs are strongly influenced by the choice of standard population. While GBD WSP offers international comparability, within-country analyses based on DALY ASRs should reflect local age structures. For European countries, including Scotland, ESP2013 may better guide local priority setting by avoiding large disparities between crude and age-standardised result sets, which could potentially confuse non-technical audiences.


Background
A Burden of Disease (BoD) approach can be used to summarise the debilitating effects of morbidity and premature mortality in a population in a consistent and comparable manner. This is achieved by framing the effects of morbidity and mortality as population health loss as a function of time, in a composite measure called Disability-Adjusted Life Years (DALYs) [1]. By framing health loss in this way, DALYs combine the effects of morbidity and mortality in an equitable way and thus can be used to identify the leading causes of disease or injury that cause BoD and to quantify the relative importance of specific risk factors [2].
The Global Burden of Disease (GBD) study [3] provides estimates of the BoD for regions, countries and selected sub-national regions across the world. Country representatives and researchers across the world can contribute to BoD activities in collaboration with, or independent of, the GBD study. It is often highlighted that a major benefit of using the GBD study is its comparability across international regions and over time [2,4]. Independent national BoD studies often lose direct comparability with estimates from the GBD study and other independent national BoD studies when they opt to make different methodological choices, such as using a different life table to facilitate Years of Life Lost (YLL) calculations or using different methods to standardise rate calculations [5-9]. BoD studies are becoming an increasingly popular way to assess national and local population health as a means to influence national and local policy decisions for within-country resource allocation. It is therefore essential that estimates used to set national and local policies are based on the needs of the populations they represent and are a valid reflection of the relative burden of different causes of ill-health and mortality. Once this assessment has been made, comparisons between different locations are a further important approach that can be usefully applied.
In order to retain international and temporal comparability it is essential that estimates are adjusted to reflect potential differences in population demographic structures between comparator groups. The most common approach to achieve this in BoD studies is to calculate directly standardised rates per 100,000 population. This is accomplished by applying a common reference population age structure to the populations which are being compared. The result is a set of artificial rates that describe the hypothetical scenario in which the populations being compared shared the same age distribution. In BoD studies the most common approach is to compute age-standardised rates (ASR) using the GBD 2017 World Standard Population (GBD WSP) [10] or the 2013 European Standard Population (ESP2013) [11] as common reference population structures. From the outset of the first GBD study for 1990, the World Health Organization WSP was used as the reference due to the worldwide remit of the study [1]. In more recent years the GBD study has developed its own WSP for use within the study [10].
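As an illustration of direct standardisation as described above, the following minimal Python sketch computes a directly standardised rate as a weighted average of age-specific rates. The function name, the three age bands and all numbers are hypothetical and chosen only for illustration; they are not the SBoD inputs, the GBD WSP or the ESP2013.

```python
from typing import Sequence


def directly_standardised_rate(
    events: Sequence[float],
    population: Sequence[float],
    standard_weights: Sequence[float],
    per: float = 100_000,
) -> float:
    """Weighted average of age-specific rates, weighted by a standard population."""
    if not (len(events) == len(population) == len(standard_weights)):
        raise ValueError("all inputs must cover the same age groups")
    total_weight = sum(standard_weights)
    weighted = sum(
        (e / p) * w for e, p, w in zip(events, population, standard_weights)
    )
    return per * weighted / total_weight


# Hypothetical three-band example (young, middle, old); illustrative numbers only.
dalys = [100.0, 400.0, 1500.0]
pop = [20_000.0, 50_000.0, 30_000.0]
older_standard = [0.20, 0.50, 0.30]    # mirrors the observed age structure
younger_standard = [0.50, 0.40, 0.10]  # puts more weight on the youngest band

crude = 100_000 * sum(dalys) / sum(pop)
asr_older = directly_standardised_rate(dalys, pop, older_standard)
asr_younger = directly_standardised_rate(dalys, pop, younger_standard)
print(round(crude), round(asr_older), round(asr_younger))
```

In this toy example the standard that mirrors the observed age structure reproduces the crude rate, whereas the younger standard gives a markedly lower rate because the burden is concentrated in the oldest band.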
The primary aim of a BoD study is to identify the impact of health problems and causes of death in a consistent and comparable manner between causes, subgroups, locations and time, which is facilitated by using DALYs [12]. From a planning perspective it is important to understand what is currently causing mortality and health loss and to understand how this has varied over time and location. Although consistency in comparisons across location and time is important, users of BoD at a national and sub-national level must understand the impact these choices have on estimates to ensure that the primary aim of the BoD method is not threatened by introducing a significant bias. This study is highly topical, particularly for European and other high income countries carrying out BoD studies, because it is unclear whether the ranking of causes is being skewed when users focus on monitoring changes over time and location at the expense of correctly assessing the national and local priorities of the populations they serve.
The aim of our study was to evaluate the impact that the choice of method used for rate calculations (crude or age-standardised) has on the rates and ranking of causes of disease/injury by DALYs. This was carried out by comparing crude and age-standardised rates, and assessing differences between age-standardised rates derived using different standard populations (ESP2013 and GBD WSP). We illustrate this using the example of Scotland.

Data
Estimates of the number of DALYs were derived from the Scottish Burden of Disease (SBoD) 2016 study [6]. These estimates represented a three-year annual average across 2014-16 based on an abridged cause list of 68 causes of disease/injury and were stratified by sex and five-year age-group, splitting the under 5 years age-group into under 1 year and 1 to 4 years. Further information on the derivation of these estimates is provided elsewhere [6]. A three-year annual average across 2014-16 of Scottish national mid-year population estimates was sourced from National Records of Scotland, by sex and five-year age-group, respecting the aforementioned split of the under 5 years age-group [13]. Two different standard populations were sourced for use in calculations of ASR: the GBD WSP [10] and the ESP2013 [11].

Analyses
Analyses were carried out for all ages and both sexes combined. DALYs were summed to give the observed number of all-cause DALYs and of DALYs for each of the 68 causes of disease/injury. Crude rates were calculated by dividing the number of DALYs by the three-year average annual (2014-16) Scottish national mid-year population. Directly age-standardised rates were calculated for all-cause DALYs and for DALYs for each of the 68 causes of disease/injury using two standard populations, ESP2013 and GBD WSP, with an upper age-group of 90 years and above and with the under 5 years age-group split into under 1 year and 1 to 4 years.
The SBoD 2016 study directly standardised rates to the ESP2013 to facilitate comparisons across different sub-national areas; rates standardised to ESP2013 were therefore taken as the baseline when comparing standardisation methods. The main study outcome was the absolute and relative difference in the ASR of all-cause DALYs, and in the ASR of DALYs for each of the 68 causes of disease/injury, between rates standardised using GBD WSP and those standardised using ESP2013. Causes of disease/injury were ranked by their respective crude rates of DALYs, and rankings based on ASR using ESP2013 were compared with those using GBD WSP.
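The comparison described above (absolute difference, relative difference and change in rank between two sets of standardised rates) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, cause labels and rates are hypothetical, and ties in rank are not handled.

```python
from typing import Dict


def rank_causes(rates: Dict[str, float]) -> Dict[str, int]:
    """Rank causes from the highest rate (rank 1) to the lowest."""
    ordered = sorted(rates, key=lambda cause: rates[cause], reverse=True)
    return {cause: position for position, cause in enumerate(ordered, start=1)}


def compare_standards(
    baseline: Dict[str, float], alternative: Dict[str, float]
) -> Dict[str, Dict[str, float]]:
    """Absolute and relative rate differences, and rank change, per cause."""
    base_rank = rank_causes(baseline)
    alt_rank = rank_causes(alternative)
    summary = {}
    for cause in baseline:
        abs_diff = alternative[cause] - baseline[cause]
        summary[cause] = {
            "abs_diff": abs_diff,
            "rel_diff_pct": 100 * abs_diff / baseline[cause],
            # Positive values mean the cause moves up the ranking.
            "rank_change": base_rank[cause] - alt_rank[cause],
        }
    return summary


# Hypothetical rates per 100,000 under a baseline and an alternative standard.
baseline_asr = {"older_age_cause": 2000.0, "mid_life_cause": 1500.0, "early_life_cause": 300.0}
alternative_asr = {"older_age_cause": 1200.0, "mid_life_cause": 1400.0, "early_life_cause": 360.0}

result = compare_standards(baseline_asr, alternative_asr)
for cause, stats in result.items():
    print(cause, stats)
```

In this toy input the older-age cause loses 40% of its rate and drops one place, while the early-life cause gains 20% in relative terms but only 60 per 100,000 in absolute terms.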

Data permissions
Formal permission to access linked patient-level National Health Service (NHS) administrative databases as part of the SBoD study was granted by the Privacy Advisory Committee, NHS National Services Scotland (NSS) [PAC Reference 51/14] [25]. All summary data used in this study are provided (see Additional file 1).

Differences in population structures
The age distributions of the GBD WSP, ESP2013 and three-year annual average (2014-16) mid-year estimate of the Scottish national population are shown in Fig. 1. The GBD WSP is skewed towards younger ages and has a modal percentage of 8.7% in the age-group 5 to 9 years. The ESP2013 and 2014-16 Scottish national population have a similar distribution which reflects a much older population than the GBD WSP. The main deviations between ESP2013 and the 2014-16 Scottish national population occur across the age range 20 to 29 years, where the population of Scotland is proportionately higher than ESP2013, and the age range 35 to 44 years, where the population of Scotland is proportionately lower than ESP2013.

Effect of standard populations on DALY rate estimates
The three-year annual average number of DALYs across 2014-16 was 1,305,004 (Table 1). The crude rate of all-cause DALYs was 24,279 per 100,000 population and the all-cause ASR of DALYs in Scotland was 24,753 per 100,000 population when directly standardised to ESP2013. By contrast, the all-cause ASR of DALYs in Scotland directly standardised using the GBD WSP was 18,275 per 100,000 population, 26% lower than the ASR using ESP2013.
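The relative differences quoted above can be reproduced from the three all-cause rates reported in this paragraph. The short Python sketch below does this; the helper function name is hypothetical, and only the three rates are taken from the text.

```python
def relative_difference_pct(value: float, baseline: float) -> float:
    """Percentage difference of `value` relative to `baseline`."""
    return 100.0 * (value - baseline) / baseline


# All-cause DALY rates per 100,000 population, as reported in the text.
crude_rate = 24_279.0
asr_esp2013 = 24_753.0
asr_gbd_wsp = 18_275.0

gbd_vs_esp = relative_difference_pct(asr_gbd_wsp, asr_esp2013)
esp_vs_crude = relative_difference_pct(asr_esp2013, crude_rate)
print(f"GBD WSP vs ESP2013: {gbd_vs_esp:.1f}%")
print(f"ESP2013 vs crude:   {esp_vs_crude:.1f}%")
```

The first figure corresponds to the reduction of roughly 26% reported above, while the ESP2013 rate differs from the crude rate by only about 2%.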
Ischaemic heart disease was the leading cause of DALYs for both crude rates of DALYs and rates standardised using ESP2013 (Fig. 2). The ranking of causes of disease/injury by DALYs was very similar when ranked by crude rates or ESP2013 age-standardised rates. Within the leading 10 causes, tracheal, bronchus and lung cancer and migraine dropped slightly in ranking when based on ESP2013 age-standardised rates compared to crude rates. Cerebrovascular disease and chronic obstructive pulmonary disease were ranked slightly higher when based on ESP2013 age-standardised rates rather than crude rates.
However, these changes were small compared to those observed between ranks based on crude rates and those based on GBD WSP age-standardised rates. Ischaemic heart disease dropped in rank to become the second leading cause when using GBD WSP age-standardised rates, whilst lower back and neck pain was ranked as the leading cause. Within the leading 10 causes other drops in rank occurred (ESP2013 vs. GBD WSP) for: tracheal, bronchus and lung cancer (3 places); cerebrovascular disease (5 places); Alzheimer's disease and other dementias (7 places); and chronic obstructive pulmonary disease (4 places). Other increases in rank within the leading 10 causes occurred for: migraine (4 places); drug use disorders (4 places); and anxiety disorders (4 places). Additionally, sensory disorders and other cancers, which were ranked outside the leading 10 causes, moved up 4 and 2 places respectively and were ranked within the leading 10 causes when ranked based on ASR (GBD WSP). The largest change in rank (ESP2013 vs. GBD WSP) across the full abridged cause list was for neonatal disorders, which increased 23 places.
Overall, 58 out of a total of 68 causes of disease/injury had lower DALY ASRs (GBD WSP vs. ESP2013). The largest absolute and relative changes in ASR of DALYs were observed for conditions that were leading causes of mortality and that occurred at older ages. Reductions in ASR accounted for most of the overall change; where increases in ASR were observed, they tended to be much smaller.

Summary of findings
Our study found that the ranking of causes of disease/injury was similar whether based on crude rates of DALYs or on age-standardised rates of DALYs using ESP2013 as the reference population. On the other hand, there were large absolute and relative differences in rates, and large differences in rank order, between causes of disease/injury when rates were age-standardised using GBD WSP as the reference population compared with the ESP2013 or crude rate methods. The largest absolute reductions between standardisation methods were observed in those causes of disease/injury where onset occurs at older ages, such as ischaemic heart disease, Alzheimer's disease and other dementias, and cerebrovascular disease. Overall, the use of GBD WSP in standardised rate calculations reduced rates. The ranking of conditions also changed due to the differences in age-group weights between the two standard populations. Some causes of disease/injury where the burden is experienced early in the life course, such as neonatal disorders, congenital birth defects and sudden infant death syndrome, saw large relative, but small absolute, increases in rate.

Strengths and limitations
The major strength of this study is that it assesses the effect of different methods of age-standardisation of rates in an objective way using a comprehensive national assessment of BoD. All other parameters remained constant throughout the analysis so that the impact of the choice of standard population could be illustrated. A possible limitation of the study is the need to truncate the oldest open-ended age-group to 90 years and above to allow for the same age-groups to be used. As the 90 years and above age-group represents less than 1% of the Scottish population, any effect on these findings would be small.
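The truncation step mentioned above can be sketched as follows, assuming that one of the inputs reports separate bands above 90 years (for instance 90-94 and 95+) that need to be merged into a single open-ended group. The function name, band labels and weights are hypothetical and purely illustrative.

```python
from typing import List, Tuple


def collapse_top_bands(
    bands: List[Tuple[str, float]], start_label: str, new_label: str
) -> List[Tuple[str, float]]:
    """Merge every band from `start_label` upwards into one open-ended band."""
    labels = [label for label, _ in bands]
    start = labels.index(start_label)  # raises ValueError if the label is absent
    merged_value = sum(value for _, value in bands[start:])
    return bands[:start] + [(new_label, merged_value)]


# Illustrative weights for the oldest bands only (not a real standard population).
oldest_bands = [("80-84", 30.0), ("85-89", 15.0), ("90-94", 4.0), ("95+", 1.0)]
collapsed = collapse_top_bands(oldest_bands, "90-94", "90+")
print(collapsed)
```

The same merge would be applied to DALY counts and population estimates so that all inputs share identical age-groups before rates are calculated; the total weight is unchanged by the merge.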

How this compares with existing literature
There is currently no published literature appraising the impact of using different standard population structures to directly age-standardise rates of DALYs. Interrogation of GBD estimates for the United Kingdom (UK) via the GBD country profiles highlights the disparities in different ways of ranking causes of disease/injury (see Additional file 2). From a national needs assessment perspective, the country profiles correctly rank the top 10 leading causes of disease/injury based on the number of DALYs to give an indication of the leading causes of disease/injury and to identify ischaemic heart disease as the leading cause of disease/injury. The GBD UK country profile provides comparisons with other countries and regions, which suggest that low back pain has a higher age-standardised rate of DALYs than that of ischaemic heart disease. However, this simply reflects the use of GBD WSP for standardisation. Similarly, the higher ranking of both headaches and neonatal disorders compared to chronic obstructive pulmonary disease, depressive disorders, lung cancer, stroke and falls is largely driven by the use of GBD WSP and does not reflect the fact that the latter conditions generate substantially larger numbers of DALYs than headaches and neonatal disorders.

Figure legend: Absolute rate differences were calculated as the difference in age-standardised rates (between those directly standardised to GBD WSP and ESP2013). Relative rate differences were calculated as the percentage difference in age-standardised rates (those directly standardised to GBD WSP relative to ESP2013). Causes of disease/injury are ranked in ascending order of absolute rate differences. Causes above/below the solid black line have lower/higher age-standardised rates of DALYs when the GBD WSP is used compared to the ESP2013.
The GBD country profiles are an excellent example of how BoD estimates should be used from a national perspective by considering both the number (or crude rate) of DALYs and age-standardised rates. However, the disparities in rank order that occur can lead to confusion for non-expert users of BoD estimates. These types of challenges have been faced before in other settings, such as in the United States' (US) Surveillance, Epidemiology and End Results programme, which has standardised rates to the US standard population for some time [14]. The Australian Bureau of Statistics also standardises rates based on estimates of its resident population [15], while the NORDCAN (Cancer statistics for the Nordic countries) project offers the option of calculating age-standardised rates using a standard population from the Nordic countries [16].

Implications for research and policy
These results demonstrate the importance of the choices researchers make when designing BoD studies as a means for supporting evidence-based decision making. This study serves as an important reminder that the use of different reference populations in rate calculations can significantly impact both rates and rankings, which are both crucially important in BoD studies.
Currently BoD work internationally focuses on advocating for better country-specific prevalence data as an input, which would have clear benefits. However, it is important to note that improvements to other inputs, such as severity distributions, also have significant potential to improve these estimates [17]. This study opted to assess the effect of standard populations in rate calculations, as this is another area which has largely been thought of as a fixed choice. Our findings are an important reminder that there are other highly feasible approaches available to improve methods and estimates. Future planned research from the SBoD study includes assessing the impact of the use of different life tables on estimates, which remains another highly topical issue for BoD researchers.
From the perspective of international comparisons the use of world standard populations in rate calculations remains a valid approach. As more users become interested in the value of BoD estimates primarily to influence national and local policy decisions, the consistency and comparability of estimates across causes must be retained and key messages must be clear. Those using GBD estimates locally for prioritisation must ensure that they consider consistency and comparability of estimates across causes rather than merely comparability across time and countries. For users of the GBD, this can currently be done by using the GBD country profiles [18], GBD results tool [19] or any of the GBD data visualisations [20] and focusing on crude rate or numbers as the method for prioritisation within countries, or allocation of international resource across countries if resource is to be focused on the regions of greatest need. Time trends and wider international comparisons remain highly important. The current approach to standardise rates to the world standard population is important for comparing across the world by accounting for different age-structures of populations. However, if not supplemented by crude rate or numbers, it has the potential to significantly underestimate the burden in older ages in high income countries, and overestimate the burden in younger ages in low income countries as well as introducing important distortions to DALY rankings. These concerns extend to sub-national comparisons, particularly in countries exhibiting wide inequalities which lead to the emergence of different sub-national population structures.

Conclusion
Our findings support the use of ESP2013 to calculate DALY ASRs as a means of achieving comparability across the sub-national regions of Scotland. We recommend that high-income and European countries, such as those involved in the European Burden of Disease Network (EBoDN) [21,22], use ESP2013 standardised rates or at least offer them as an alternative. This would limit the potential for mixed messages, or incorrect conclusions, being drawn by non-experts when using BoD estimates.