Improving estimates of the burden of severe acute malnutrition and predictions of caseload for programs treating severe acute malnutrition: experiences from Nigeria

Background: The burden of severe acute malnutrition (SAM) is estimated using unadjusted prevalence estimates. SAM is an acute condition and many children with SAM will either recover or die within a few weeks. Estimating SAM burden using unadjusted prevalence estimates results in significant underestimation. This has a negative impact on allocation of resources for the prevention and treatment of SAM. A simple method for adjusting prevalence estimates intended to improve the accuracy of burden estimates and caseload predictions has been proposed. This method employs an incidence correction factor. Application of this method using the globally recommended incidence correction factor has led to programs underestimating burden and caseload in some settings.

Methods: A method for estimating a locally appropriate incidence correction factor from prevalence, population size, program caseload, and program coverage was developed and tested using data from the Nigerian national SAM treatment program.

Results: Applying the developed method resulted in errors in caseload prediction of about 10%. This is a considerable improvement upon the current method, which resulted in a 79.5% underestimate. Methods for improving the precision of estimates are proposed.

Conclusions: It is possible to considerably improve predictions of caseload by applying a simple model to data that are readily available to program managers. This implies that more accurate estimates of burden may also be made using the same methods and data.

Electronic supplementary material: The online version of this article (10.1186/s13690-017-0234-4) contains supplementary material, which is available to authorized users.


Background
A child with severe acute malnutrition (SAM) has a high risk of near term mortality [1,2]. It has been estimated that SAM affected more than 16 million children globally in 2016 [3]. This figure is based on prevalence estimates from cross-sectional surveys. SAM is an acute condition and many children with SAM will either recover or die within a few weeks. Estimating the number of SAM cases present in a population over a given period of time, the "SAM burden", using unadjusted prevalence estimates is likely, therefore, to miss many new (incident) cases and significantly underestimate the SAM burden [4]. A recent estimate of the annual global SAM burden that attempts to account for incident cases suggests that 110 million cases per year might be a more accurate estimate [5]. Poor estimates of SAM burden are a problem for program managers at all levels. Underestimation has a negative impact on the prioritization of resource allocation for the prevention and treatment of SAM both globally and locally [6].
Burden is the sum of prevalent cases at the start of a period and incident cases that arise during that period. The number of prevalent cases in a population at a given point in time can be estimated using a combination of a prevalence estimate from a cross-sectional survey and population data. This information is usually already available to program managers. Incidence is more complicated and more expensive to estimate.
The relationship between incidence and prevalence is frequently described using a "bathtub" metaphor [7]. In this model the flow of water into the bathtub is analogous to incidence, the level of the water in the bathtub represents prevalence, and the flow of water out of the bathtub through the drain represents recovery and mortality. Incidence in relation to prevalence depends, to a large extent, upon the average duration of illness (see Fig. 1).
The simple relationship between prevalence, incidence, and duration of illness makes it possible to create a simple mathematical model that allows the estimation of burden using prevalence and population estimates together with other data (e.g. program coverage and program caseloads) that will usually be available to program managers.
The Community Management of Acute Malnutrition (CMAM) Forum has proposed a simple method to estimate SAM burden and predict the number of cases that a program will treat over a given planning period [8]. The number of prevalent cases present in a population at the time of a prevalence survey is estimated as the product of prevalence and population size:

$$\text{Prevalent cases} = N \times P$$

where N is the size of the population of interest and P is the prevalence of the condition of interest.

The population burden (B) consists of both prevalent cases and new (incident) cases that are expected to occur in the program area over a given planning period. The expected number of incident cases can be estimated using:

$$\text{Expected number of incident cases} = N \times P \times K$$

where K is a correction factor [9] calculated as:

$$K = \frac{t}{D}$$

where t is the length of the planning period and D is the average duration of an untreated SAM episode, both in the same units (e.g. months). This allows the population burden (B) to be estimated:

$$B = N \times P + N \times P \times K = N \times P \times (1 + K)$$

The population burden (B) can be used to predict the number of cases that a program will treat over the planning period (L) using an estimate of program coverage (C):

$$L = B \times C = N \times P \times (1 + K) \times C$$

Fig. 1 The "bathtub" metaphor for the relationship between incidence and prevalence. The rate at which cases leave the population depends upon the average duration of illness

All of the terms in this estimator are subject to uncertainty.
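The burden and caseload calculation described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names and the example population, prevalence, and coverage values are hypothetical, and the default episode duration of 7.5 months is the globally used value discussed in this article.

```python
def incidence_correction_factor(planning_months, episode_months=7.5):
    """K = length of planning period / average duration of an untreated episode."""
    return planning_months / episode_months


def burden(population, prevalence, k):
    """B = prevalent cases + expected incident cases = N * P * (1 + K)."""
    prevalent_cases = population * prevalence
    incident_cases = population * prevalence * k
    return prevalent_cases + incident_cases


def expected_caseload(population, prevalence, k, coverage):
    """L = B * C: the share of the burden that the program is expected to treat."""
    return burden(population, prevalence, k) * coverage


# Hypothetical program area (illustrative values only, not data from the article).
N = 100_000   # population of interest
P = 0.015     # SAM prevalence (1.5%)
C = 0.50      # program coverage (50%)
K = incidence_correction_factor(12)  # one-year planning period

print("K =", round(K, 2))
print("Burden B =", round(burden(N, P, K)))
print("Expected caseload L =", round(expected_caseload(N, P, K, C)))
```

Keeping K as an explicit argument makes it straightforward to substitute a locally calibrated value in place of the global default.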
Uncertainty regarding coverage (C) and prevalence (P) is usually quantifiable and is quantified by confidence intervals or credible intervals on point estimates. The prevalence of severe acute malnutrition (SAM) is often estimated with poor relative precision. For example, the commonly used Standardised Monitoring and Assessment of Relief and Transitions (SMART) prevalence surveys typically have effective sample sizes (i.e. the sample size after accounting for survey design effects) between n = 300 and n = 400 [10]. An effective sample size of n = 400 yields an exact 95% confidence interval of [0.55%; 3.24%] on a 1.50% point estimate of SAM prevalence [11]. The relative precision of this estimate is:

$$\text{Relative precision} = \frac{3.24\% - 0.55\%}{1.50\%} \approx 179\%$$

Coverage is typically estimated with a precision of about ± 10% on a 50% estimate [12]. This is a 40% relative precision.
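The worked example above (6 cases in an effective sample of 400) can be checked with a short sketch. It assumes that the "exact" interval is the Clopper-Pearson binomial interval, which the text does not state explicitly, and that relative precision is the full width of the interval divided by the point estimate (consistent with ± 10% on 50% being described as 40%). The function names are hypothetical, and the limits are found by simple bisection rather than with a statistics library.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); adequate for small k and moderate n."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_increasing(g, iterations=60):
    """Find p in (0, 1) with g(p) = 0 for a function g that increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for x cases observed in a sample of size n."""
    lower = 0.0 if x == 0 else solve_increasing(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    upper = 1.0 if x == n else solve_increasing(
        lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper


def relative_precision(estimate, lcl, ucl):
    """Full width of the confidence interval relative to the point estimate."""
    return (ucl - lcl) / estimate


lcl, ucl = exact_ci(6, 400)  # 1.5% prevalence with an effective n of 400
print(f"95% CI: {lcl:.2%} to {ucl:.2%}")
print(f"Relative precision: {relative_precision(6 / 400, lcl, ucl):.0%}")
```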
Useful accuracy of population estimates can be achieved by correcting census data to account for population growth and migration. It can often be assumed that the population is estimated with little or no error. This may not, however, be the case in emergencies in which there is considerable and ongoing population movement and / or high levels of mortality.
Caseload (L) is a simple count of program admissions. These data are collected and reported on a routine basis and can usually be assumed to be measured with little or no error.
There is considerable uncertainty about the value of the incidence correction factor (K). The average duration of an untreated SAM episode that is currently being used globally is 7.5 months. This is based on data from two cohort studies and provides an incidence correction factor (K) of 1.6 for a one-year planning period [13]. It was assumed that this value of K would apply in all contexts. Governments, United Nations agencies, non-governmental organizations (NGOs), and other SAM treatment program implementing partners have, in the absence of other evidence, been using this value of K to estimate the burden and expected caseload and to advocate for resources to treat children with SAM. Reports from SAM treatment programs suggest that the use of K = 1.6 has led to programs underestimating caseload in some West African settings. Recent work indicates that a single value of K for use globally may not be useful (see Table 1) [6,14-16].
Data from the Nigerian Community-based Management of Acute Malnutrition (CMAM) program from 2014 and 2015 are presented in this article. This program started operations in 2009 and has treated between 300 thousand and 500 thousand SAM cases each year. During the course of implementation it was recognized that the use of K = 1.6 had led to considerable underestimation of SAM burden and program caseload. Given the public health and security situation in Nigeria it is anticipated that the Nigerian CMAM program will run for many years and accurate estimates of expected caseloads will be required to secure adequate continued funding.
This article presents a method to adjust or calibrate the value of K using the population of the program area, the number of program admissions, estimates of program coverage, and estimates of the prevalence of SAM in order to provide more accurate estimates of burden and expected caseload during program implementation. The method is illustrated using data from the Nigerian CMAM program. The revised estimate of K may also be useful to predict SAM burden and caseload from prevalence surveys in similar settings.

Methods
The caseload estimation formula:

$$L = N \times P \times (1 + K) \times C$$

can be rearranged to find K given the other terms:

$$K = \frac{L}{N \times P \times C} - 1$$

A suitable value for K can be found by substituting known values for L, C, N, and P with L being the observed program caseload (i.e. the number of admissions).
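As a minimal sketch of this calibration step, assuming the caseload model L = N × P × (1 + K) × C, the rearrangement for K can be written as a single function. The function name and the input values are hypothetical and are not data from the Nigerian program.

```python
def calibrate_k(caseload, population, prevalence, coverage):
    """Solve L = N * P * (1 + K) * C for K given an observed caseload L."""
    expected_prevalent_treated = population * prevalence * coverage
    return caseload / expected_prevalent_treated - 1


# Hypothetical inputs: 9,750 admissions observed in a program area with
# 100,000 children, 1.5% SAM prevalence and 50% program coverage.
k_local = calibrate_k(caseload=9_750, population=100_000,
                      prevalence=0.015, coverage=0.50)
print("Locally calibrated K =", round(k_local, 2))
```

With these illustrative inputs the calibrated K is 12, far above the global default of 1.6, which is the kind of discrepancy the method is intended to reveal.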
The method outlined here assumes that both population (N) and caseload (L) are measured with little or no error although the method can be easily extended to accommodate uncertainty in these terms. The principal sources of uncertainty in this analysis are, therefore, prevalence (P) and coverage (C). This can lead to considerable uncertainty in their product (PC) used in the estimator (Additional file 1).
An approximate 95% confidence interval for the product of two proportions (i.e. prevalence (P) and coverage (C) in this application) is:

$$95\%\ \text{CI} = \exp\left(\log\hat{\theta} \pm 1.96 \times SE_{\log\hat{\theta}}\right)$$

where:

$$\hat{\theta} = P \times C$$

$$SE_{\log\hat{\theta}} = \sqrt{\frac{1 - P}{n_P \times P} + \frac{1 - C}{n_C \times C}}$$

and $n_P$ and $n_C$ are the sample sizes used to estimate prevalence (P) and coverage (C) respectively [17]. This formula is not immediately applicable to the sorts of data likely to be available to program managers because the effective sample sizes used to estimate both prevalence and coverage ($n_P$ and $n_C$) will differ from reported sample sizes due to design effects introduced by the use of complex samples and / or the use of prior information [12,18,19].
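The approximate interval for the product PC, and the way it carries through to K, can be sketched as below. The sketch assumes the standard log-scale (delta method) interval for a product of two independent proportions, and the caseload model L = N × P × (1 + K) × C; because K falls as PC rises, the upper limit of PC gives the lower limit of K. Function names and inputs are hypothetical.

```python
from math import exp, log, sqrt


def product_ci(p, n_p, c, n_c, z=1.96):
    """Approximate 95% CI for theta = p * c, computed on the log scale."""
    theta = p * c
    se_log_theta = sqrt((1 - p) / (n_p * p) + (1 - c) / (n_c * c))
    lower = exp(log(theta) - z * se_log_theta)
    upper = exp(log(theta) + z * se_log_theta)
    return theta, lower, upper


def k_interval(caseload, population, pc_lower, pc_upper):
    """Limits for K = L / (N * PC) - 1; a larger PC implies a smaller K."""
    k_lower = caseload / (population * pc_upper) - 1
    k_upper = caseload / (population * pc_lower) - 1
    return k_lower, k_upper


# Hypothetical inputs: prevalence 1.5% (effective n = 400),
# coverage 50% (effective n = 96), 9,750 admissions, population 100,000.
theta, pc_lo, pc_hi = product_ci(0.015, 400, 0.50, 96)
k_lo, k_hi = k_interval(9_750, 100_000, pc_lo, pc_hi)
print(f"PC = {theta:.4f} (95% CI {pc_lo:.4f}; {pc_hi:.4f})")
print(f"K 95% CI: {k_lo:.2f}; {k_hi:.2f}")
```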
Prevalence is usually estimated using surveys employing complex sample designs. The prevalence estimates used in this report were made by combining results from several cross-sectional household surveys that used a two-stage cluster sample design representative at the state level following the SMART methodology [10,20,21].
Coverage of CMAM programs is often estimated using spatially stratified samples [12,22-24]. Semi-Quantitative Evaluation of Access and Coverage (SQUEAC) coverage assessments use a Bayesian beta-binomial conjugate analysis in which the conjugate prior contains information that contributes "pseudo-observations" to the analysis [12,19].
The effective sample size associated with the estimate of a proportion can be calculated from the reported point estimate (p) and its associated upper and lower 95% confidence limits (UCL and LCL).
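Variance (VAR) is calculated as:

$$VAR = \left(\frac{UCL - LCL}{2 \times 1.96}\right)^2$$

The effective sample size ($n_{\text{effective}}$) is calculated as:

$$n_{\text{effective}} = \frac{p \times (1 - p)}{VAR}$$

This calculation is performed to find both $n_P$ and $n_C$ before calculating $SE_{\log\hat{\theta}}$.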
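A minimal sketch of the effective sample size calculation follows. It assumes the usual normal approximation, in which the standard error is the width of the 95% confidence interval divided by 2 × 1.96 and the variance of a proportion is p(1 − p)/n. The function name and the example values are illustrative.

```python
def effective_sample_size(p, lcl, ucl, z=1.96):
    """Back-calculate n from a proportion and its 95% confidence limits."""
    standard_error = (ucl - lcl) / (2 * z)
    variance = standard_error ** 2
    return p * (1 - p) / variance


# Coverage of 50% reported with 95% limits of 40% and 60%.
n_c = effective_sample_size(0.50, 0.40, 0.60)
print("Effective sample size for coverage:", round(n_c))  # about 96

# Prevalence of 1.5% reported with 95% limits of 0.55% and 3.24%.
n_p = effective_sample_size(0.015, 0.0055, 0.0324)
print("Effective sample size for prevalence:", round(n_p))
```

Because the calculation treats the interval as symmetric, the value recovered from an asymmetric exact interval will only approximate the true effective sample size.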
We used the approach outlined above to find a suitable value for K for the Nigerian CMAM program. Data relating to program admissions (i.e. caseload) in 2014 and 2015 were taken from routine program monitoring reports. Population estimates were made using data from the 2006 Nigerian Census corrected for population growth and migration [25]. Prevalence estimates for SAM were available for 2014 and 2015 [20,21]. An estimate of program coverage was available from a wide-area Simplified Lot Quality Assurance Evaluation of Access and Coverage (SLEAC) survey completed in early 2014 [12,26,27]. The reported coverage from this survey was used for  [28].
The method used to calculate the 95% confidence limits for the product of prevalence and coverage (PC) is approximate. A less approximate 95% confidence interval (i.e. an interval that contains the true value close to 95% of the time) may be calculated using a bootstrap estimator [29,30]. Estimates of the incidence correction factor (K) were made using a bootstrap estimator for the product of prevalence and coverage (PC). A percentile bootstrap estimator with one million replicates for prevalence and coverage drawn from appropriate binomial distributions was used [29]. Data were analyzed using the R Language and Environment for Statistical Computing version 3.3.3 [28].

Results
Table 2 shows the observed and expected (i.e. calculated using K = 1.6) caseloads and the revised incidence correction factors (K) for 2014 and 2015 together with the data on which the calculations were based. Use of K = 1.6 to predict caseload had resulted in gross underestimates in both years. The resulting revised estimates for K were K = 14.39 (95% CI = 6.64; 30.02) and K = 11.66 (95% CI = 5.94; 22.10) for 2014 and 2015 respectively. These estimates were pooled giving K = 13.02 (95% CI = 6.80; 19.25). The final two rows of Table 2 show the expected caseloads for 2014 and 2015 using the pooled estimate for K and the difference between the observed and expected caseloads. Table 3 compares estimates of the incidence correction factor (K) calculated using the approximate method and the bootstrap estimator.
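The percentile bootstrap used for the interval estimates of K can be illustrated with the sketch below. It is an illustration in Python rather than the authors' R code: the function names and inputs are hypothetical, far fewer replicates are drawn than the one million used in the analysis, and replicates in which the simulated product PC is zero (so that K is undefined) are simply discarded, which is an assumption of this sketch.

```python
import random


def draw_proportion(rng, p, n):
    """One replicate of a proportion: a Binomial(n, p) count divided by n."""
    successes = sum(rng.random() < p for _ in range(n))
    return successes / n


def bootstrap_k_interval(caseload, population, p, n_p, c, n_c,
                         replicates=2000, seed=1):
    """Percentile 95% interval for K = L / (N * P * C) - 1."""
    rng = random.Random(seed)
    k_values = []
    for _ in range(replicates):
        pc = draw_proportion(rng, p, n_p) * draw_proportion(rng, c, n_c)
        if pc == 0:
            continue  # K is undefined when the replicate has no cases
        k_values.append(caseload / (population * pc) - 1)
    k_values.sort()
    m = len(k_values)
    return k_values[int(0.025 * m)], k_values[min(m - 1, int(0.975 * m))]


# Hypothetical inputs: prevalence 1.5% (effective n = 400),
# coverage 50% (effective n = 96), 9,750 admissions, population 100,000.
k_lower, k_upper = bootstrap_k_interval(9_750, 100_000, 0.015, 400, 0.50, 96)
print(f"Bootstrap 95% CI for K: {k_lower:.2f}; {k_upper:.2f}")
```

Effective sample sizes are rounded to whole numbers before use here, since binomial draws require an integer number of trials.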

Discussion
The approach outlined in this document can provide useful estimates of locally appropriate incidence correction factors. Applying the value of K estimated for 2014 to the population, prevalence, and coverage data for 2015 yields a predicted caseload of 484,766 cases. This is a 21.6% overestimate of the observed caseload for 2015. Some of this error may have been due to lower than specified coverage during the implementation phase of additional CMAM programming initiated in early 2015 as part of the ongoing emergency response in Northern Nigeria. This degree of error in caseload prediction is a considerable improvement upon the current method (K = 1.6), which resulted in a 79.5% underestimate.

Underestimation may lead to an under-resourced program in which program activities essential to achieving and maintaining coverage (e.g. community mobilization, community sensitization, and community-based case-finding activities) are neglected in order to maintain core clinical activities. Underestimation, in some cases, may lead to supply breaks necessitating the temporary closure of programs.

Confidence intervals for the bootstrap estimates of the incidence correction factors are wider than when the approximate method is used. Estimates made using the approximate method are likely to be spuriously precise. The use of approximate methods to calculate confidence intervals is, however, a widely accepted practice for many public health applications. The approximate method has the advantage of being easy to implement using software, such as Microsoft Excel, that is available and familiar to CMAM program managers.
Estimates of the incidence correction factor (K) lack precision even when the approximate method is used. For example, the 95% confidence interval for the 2015 estimate of the incidence correction factor (K) using the approximate method ranges between K = 5.94 and K = 22.10. This translates into a 95% confidence interval for the caseload prediction of between about 218 thousand and 728 thousand. This degree of imprecision may limit the utility of the method as a planning tool.
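The link between the interval for K and the interval for predicted caseload can be checked with a short piece of arithmetic. Assuming the caseload model L = N × P × (1 + K) × C with N, P, and C held fixed, the ratio of the upper to the lower caseload limit depends only on the limits of K, and this ratio agrees with the reported caseload limits to within rounding:

```latex
\[
\frac{L_{\text{upper}}}{L_{\text{lower}}}
  = \frac{N \times P \times C \times (1 + K_{\text{upper}})}
         {N \times P \times C \times (1 + K_{\text{lower}})}
  = \frac{1 + 22.10}{1 + 5.94}
  = \frac{23.10}{6.94}
  \approx 3.33
\]
\[
\frac{728\ \text{thousand}}{218\ \text{thousand}} \approx 3.34
\]
```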
The principal sources of imprecision are in estimates of prevalence and coverage. Improving the precision of estimates of prevalence and / or coverage will improve the precision with which the incidence correction factor (K) is estimated.
SAM prevalence is usually estimated with poor relative precision. The relative precision of the prevalence estimates was 138% for the 2014 SAM prevalence estimate and 118% for the 2015 prevalence estimate. The lack of precision in prevalence estimates is due, in part, to the use of sample designs that reduce the effective sample sizes of surveys. It is likely that precision could be improved using, for example, stratified sample designs and larger sample sizes. This would, however, require considerable changes to current practice. Lack of precision is also due to the way that survey data are analyzed. Replacing the classic estimator:

$$\text{Prevalence} = \frac{\text{Number of SAM cases found in the survey sample}}{\text{Survey sample size}}$$

with a PROBIT estimator has been demonstrated to reduce the half-width of 95% CIs by about 60% with only small losses of accuracy [31-33]. Slightly larger gains in precision have been demonstrated using a Bayesian-PROBIT estimator [19,34]. The advantage of data analytic approaches to improving precision is that they can be applied to data collected with currently used survey methods, including historical data, at little extra cost.

The precision of the coverage estimate was not an issue in the work reported here because a large stratified sample was used to estimate coverage with good relative precision (i.e. 23.5%). Precision of coverage estimates may, however, be a problem for smaller programs. We investigated this issue using data from 227 SQUEAC coverage assessments of district-level NGO-delivered CMAM programs performed between January 2010 and July 2015 and provided to us by the Coverage Monitoring Network. The median relative precision for coverage estimates between 40% and 60% was 42.6% (IQR = 38.4%; 48.5%). This is an expected result as SQUEAC coverage assessments are usually designed to estimate coverage with this level of precision [12].
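The contrast between the classic and PROBIT prevalence estimators mentioned above can be illustrated with a short sketch. The PROBIT variant shown is an assumption made for illustration: it fits a normal distribution to mid-upper arm circumference (MUAC) values using the sample mean and standard deviation and takes the fitted probability of falling below the case-defining threshold as the prevalence. The cited work should be consulted for the estimators actually evaluated. The 115 mm threshold is a commonly used MUAC case definition for SAM; the function names and data are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def classic_prevalence(values, threshold):
    """Proportion of sampled children whose measurement is below the threshold."""
    return sum(v < threshold for v in values) / len(values)


def probit_prevalence(values, threshold):
    """Probability of falling below the threshold under a fitted normal model."""
    fitted = NormalDist(mean(values), stdev(values))
    return fitted.cdf(threshold)


# Hypothetical MUAC measurements in millimetres (illustrative only).
muac_mm = [112, 121, 126, 128, 131, 133, 135, 136, 138, 140,
           141, 143, 144, 146, 148, 150, 152, 155, 158, 163]

print(f"Classic estimate: {classic_prevalence(muac_mm, 115):.1%}")
print(f"PROBIT estimate:  {probit_prevalence(muac_mm, 115):.1%}")
```

The PROBIT approach uses every measurement in the sample rather than only the count of cases, which is the intuition behind the gains in precision reported in the cited studies.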
The poorer relative precision of SAM prevalence estimates means that efforts to improve the precision of these estimates are likely to yield greater improvements in the precision with which the incidence correction factor (K) is estimated than may be achieved by efforts to improve the precision of coverage estimates. This is illustrated in Table 4 using the data from 2015. It is important to note that improvement in the precision of prevalence estimates can be achieved with very little increase in costs but that improvements in the precision of coverage estimates would entail considerable increases in costs.

Limitations
A key limitation of the work reported here is that coverage data were not current, particularly for 2015.
A limitation of the method described here is that burden and caseload may be influenced by migration into and out of the program area. Rapid and substantial changes in the population of the program area are likely to affect population size (N), prevalence (P), and program coverage (C). Migration may, therefore, result in grossly inaccurate predictions of burden (B) and caseload (L) that are based on estimates of population size (N), prevalence (P), and program coverage (C). Monitoring population movements and adjusting burden and caseload predictions may help to address this problem. Adjustment may also require that additional prevalence and coverage surveys be undertaken.
In the case of the Nigerian CMAM program there have been reports of SAM cases entering Nigeria from Niger and being admitted to CMAM sites in districts that border Niger. The effect of this on the work reported here is likely to be small since data for the whole country were used. It is important to note that this may have larger effects on burden (B) and caseload (L) predictions for, e.g., small NGO-delivered programs operating in border districts.
The assumption that caseload (L) is measured with little or no error may also be a limitation. In the case of the Nigerian CMAM program there have been reports from 3 of the 114 districts in which the program is operating of beneficiaries being registered at more than one CMAM site with the assumed intention of receiving additional food and drugs. New CMAM sites were opened in these districts and some of the double registration may have been due to informal transfers between sites. An informal transfer would have been reported as a new admission at the destination site and, some weeks later, as a defaulting patient at the originating site. The effect of this would have been to increase reported caseload (L). It seems likely that double registration will have had only a small effect on caseload (L) used in the work reported here. This would have caused only a small increase in the estimates for K reported here. The covert nature of some double registrations does mean that the magnitude of any increase will always be difficult to quantify.

Conclusion
The work reported here shows that it is possible to considerably improve predictions of CMAM caseload by applying a simple mathematical model to data that are readily available to program managers. This implies that more accurate predictions of burden may also be made using the same methods and data. The precision of estimates of caseload and burden may be improved by using PROBIT or Bayesian-PROBIT estimators of SAM prevalence.
The implication of this study, and of similar reports based on a variety of approaches (see Table 1), is that the current estimates of SAM burden are likely to be gross underestimates. Applying the pooled incidence correction factor found in this study to the 16 million estimate made using prevalence data yields an estimated global SAM burden of 208 (95% CI = 109; 308) million cases annually. It seems unlikely, however, that the incidence correction factor estimated for the Nigerian CMAM program will be globally applicable. Local estimates of K will be needed to make local prediction of burden and caseload. These local estimates of K could be applied to local estimates of prevalence and population with the results summed in order to estimate global SAM burden.
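The arithmetic behind the global figure quoted above can be reproduced directly. The reported values correspond to multiplying the 16 million prevalence-based estimate by the pooled K and by its confidence limits (all figures in millions of cases):

```latex
\begin{align*}
16 \times 13.02 &\approx 208 \\
16 \times 6.80  &\approx 109 \\
16 \times 19.25 &= 308
\end{align*}
```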
Given the public health importance of having reliable estimates of burden and caseload and the uncertainties of this approach based on program data, a confirmation of estimates of K using direct estimates of incidence from continuous monitoring of open cohorts and surveillance systems in similar settings may be warranted. Comparison with other indirect methods may also prove useful.

Additional file
Additional file 1: Caseload method. (XLSX 36 kb)

Table 4 Effect of improved precision of SAM prevalence estimates and coverage estimates on the precision of the estimate of the incidence correction factor (K) using 2015 data from the Nigerian CMAM program
This level of improvement is achievable using a PROBIT estimator with existing survey designs and survey data
c This level of improvement could only be achieved by a considerable increase in survey sample sizes