The effect of resistance training on health-related quality of life in older adults: Systematic review and meta-analysis

Background: Resistance training (RT) is recommended as part of our national physical activity guidelines, which include working all major muscle groups on two or more days a week. Older adults can gain many health benefits from RT, such as increased muscle strength, increased muscle mass, and maintenance of bone density. Additionally, certain dimensions of health-related quality of life (HRQOL) have been shown to improve in older adults due to RT intervention. The purpose of this study was to use systematic review and meta-analytic techniques to examine the effect of RT on HRQOL in older adults.
Methods: A systematic review of current studies (2008 through 2017) was conducted using PubMed. Studies were included if they used a randomized controlled design, had RT as an intervention, measured HRQOL using the SF-36/12 assessment, and included adults 50+ years of age. Eight dimension scores (physical functioning, bodily pain, physical role function, general health, mental health, emotional role function, social function, and vitality) and two summary scores (physical component and mental component) were extracted. Ten meta-analyses were performed using standardized mean effect sizes and random effects models. Study quality, moderator, and sensitivity analyses were conducted.
Results: A total of 16 studies were included in the analyses, with a mean Physiotherapy Evidence Database (PEDro) score of 4.9 (SD=1.0). Among the mental health measures, RT had the greatest effect on mental health (effect size [ES]=0.64, 95% CI: 0.30-0.99, I²=79.7%). Among the physical health measures, RT had the largest effect on bodily pain (ES=0.81, 95% CI: 0.26-1.35, I²=85.9%). Initially, RT did not significantly affect measures of emotional role function, social function, or physical role function. However, after removing a single study from each of these meta-analyses, RT significantly increased all HRQOL measures.
Conclusion: The meta-analytic evidence presented in this research clearly supports the promotion of RT in improving HRQOL in older adults.


Introduction
Resistance training (RT) is recommended as part of the 2008 physical activity guidelines for Americans. 1 Specifically, adults should engage in muscle strengthening activities of moderate to high intensity that work all major muscle groups on two or more days a week. For older adults, the same muscle strengthening guidelines apply, and RT may hold even greater benefit for this population. Several health problems affecting older adults can be countered by adopting a regular RT program. For example, older adults are at greater risk of premature death due to falls, which are associated with age-related declines in muscular fitness and balance. [2][3][4][5] A recent report from the Centers for Disease Control and Prevention (CDC) states that approximately one in four older (65+ years of age) US adults falls each year and that deaths from falls increased an average of 3% annually from 2007 to 2016. 6 Older adults can gain other health benefits from RT besides increased muscle mass and strength. 7 Studies have shown that RT can benefit bone mineral density, 8,9 lipoprotein profiles, 10 glycemic control, 11 body composition, 12 symptoms of frailty, 13 metabolic syndrome risk factors, 14 and cardiovascular disease markers. 15 Studies have further shown that RT can decrease the risk of all-cause mortality in both observational 16,17 and experimental 18 designs. Furthermore, RT intervention has been shown to effectively improve psychosocial health outcomes such as sense of coherence, 19 perceived stress, 20 depression, 21 anxiety, 22 and fatigue. 23
Health-related quality of life (HRQOL) is another psychosocial outcome of increasing interest in health sciences research. 24 HRQOL is a multidimensional construct that considers the relationship between an individual's health status and their quality of life. 25 As a recent addition to the Healthy People goals for year 2020, two objectives were issued. 26 Specifically, these objectives are to increase the proportion of adults who report at least good health, with one objective specifying physical health and the other specifying mental health.
As with any health outcome measure used in practice or in research, the use of a reliable HRQOL measure is important to the internal validity of study findings. 27 Therefore, selecting an appropriate assessment is paramount to research soundness. Many different HRQOL assessments have been used in physical activity-related research; however, the Medical Outcomes Study 36-Item Short Form Survey (SF-36) and its variant (SF-12) have served as a gold standard. 28 One attractive characteristic of the SF-36 and SF-12 assessments (SF-36/12) is the many different outcome scores resulting from their administration. Specifically, ten different scores can be computed from the SF-36: eight dimension scores (physical functioning [PF], physical role function [PRF], bodily pain [BP], general health [GH], vitality [VT], social function [SF], emotional role function [ERF], and mental health [MH]) and two summary scores (physical component [PCS] and mental component [MCS]). 29 Due to its widespread use and above-standard psychometric properties, 30 this research delimited its examination to only studies using the SF-36/12 to measure HRQOL. Moreover, studies support the positive effect that RT has on HRQOL. 31 However, a collective summary of the effect that RT has on a gold-standard HRQOL assessment is necessary. A collective summary through systematic review can ensure that the promotion and adoption of RT among older adults will contribute to the effort to meet our national HRQOL objectives. Therefore, the purpose of this study was to use systematic review and meta-analytic techniques to examine the effect of RT on HRQOL, assessed only using the SF-36/12, among older adults.

Systematic review search strategy
Two researchers independently engaged in all search strategy procedures. During review of the results at each stage, any discrepancies found between the researchers were reviewed and discussed until agreement was reached. The search strategy steps consisted of: (1) initial search of the PubMed database using keyword search terms and review of all initial abstracts, (2) retrieval and review of all full-text articles judged to be appropriate from the initial abstract review, and (3) agreement on the final set of full-text articles included in the study. The following terms were used in the initial PubMed search: "(elderly OR older OR aging) ("strength training" OR "resistance training" OR "resistance exercise" OR "muscle strengthening" OR "weight training") ("HRQOL" OR "SF-36" OR "SF-12" OR "health-related quality of life" OR PCS OR MCS OR "physical component" OR "mental component")".

Inclusion and exclusion criteria
During the search procedures stated above, inclusion criteria were used to flag abstracts and full-text articles as appropriate for the study. During the last stage of the search, included studies were excluded only if the data reported were not conducive to a standardized mean difference meta-analysis (e.g., regression analysis). The following inclusion criteria were used during each step of the search: 1.

Data extraction
Data were extracted from each study independently by the same two researchers. A preformatted spreadsheet was created for both researchers and included the following columns: study number, first author last name, year of publication, HRQOL form (SF-36 or SF-12), mean age of participants, minimum age of participants, gender of participants (male/female/both), disease status (e.g., diabetic) (yes or no), length of intervention (in weeks), and whether the intervention included other components (i.e., multiplicity) (RT only or RT plus). Additionally, data for effect size calculations were extracted and included any number of columns such as pretest mean value and standard deviation (SD), posttest mean value and SD, mean gain score value and SD, between group mean difference in gains and SD, confidence interval limits, standard errors (SEs), P values, and test statistics. All data were entered into spreadsheets with the same formatting and a comparison of results was performed using the SAS PROC COMPARE procedure. 32 Any discrepancies in data extraction were discussed until an agreement was made.
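The double-extraction check described above can be illustrated with a small sketch. The study used the SAS PROC COMPARE procedure; the Python below is only an analogous illustration, and the function name `compare_extractions`, the column names, and all values are hypothetical.

```python
def compare_extractions(sheet_a, sheet_b, key="study_number"):
    """Return (study, column, value_a, value_b) for every cell that differs."""
    rows_a = {row[key]: row for row in sheet_a}
    rows_b = {row[key]: row for row in sheet_b}
    discrepancies = []
    for study in sorted(set(rows_a) | set(rows_b)):
        a, b = rows_a.get(study), rows_b.get(study)
        if a is None or b is None:
            # One researcher has no row for this study at all.
            discrepancies.append((study, "<missing row>", a is not None, b is not None))
            continue
        for column in sorted(set(a) | set(b)):
            if a.get(column) != b.get(column):
                discrepancies.append((study, column, a.get(column), b.get(column)))
    return discrepancies


# Hypothetical extraction rows from two researchers (not data from the review).
researcher_1 = [
    {"study_number": 1, "hrqol_form": "SF-36", "weeks": 12, "post_mean": 71.4},
    {"study_number": 2, "hrqol_form": "SF-12", "weeks": 24, "post_mean": 48.0},
]
researcher_2 = [
    {"study_number": 1, "hrqol_form": "SF-36", "weeks": 12, "post_mean": 71.4},
    {"study_number": 2, "hrqol_form": "SF-12", "weeks": 26, "post_mean": 48.0},
]

mismatches = compare_extractions(researcher_1, researcher_2)
for item in mismatches:
    print(item)
```

Each reported mismatch would then be resolved by discussion between the two researchers, as described in the text.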

Statistical analysis
Each of the ten HRQOL measures from the SF-36/12 assessment was considered a distinct measure of HRQOL, and so ten different meta-analyses were performed. Each meta-analysis was conducted using the computed standardized mean effect size (ES) and its SE. 33 The effect size in this study represents the effect that RT has on HRQOL, as compared to a control. In 15 of the 16 studies, the reported pretest and posttest means and standard deviations were used to compute each effect size. The numerator of each effect size was the difference between the treatment mean change and the control mean change. The denominator of each effect size was a pooled standard deviation of the two groups' standard deviations of change. When these standard deviations of change were not reported, we estimated them using conventional methods. 34 When pretest and posttest group standard deviations were not reported directly, we computed them from reported confidence intervals. The effect size standard errors were calculated using the computed effect size and group sample sizes. The one remaining study reported all change statistics for each group. In this case, the effect size numerator was the difference between the two reported mean changes, and the denominator was the conventional pooled standard deviation of the two reported standard deviations of change.
For this meta-analysis research, it was assumed that different populations exist within the older adult population (e.g., diseased and non-diseased) and therefore that RT would have a varying effect on HRQOL across these populations. With this assumption in mind, random effects models were pre-planned and performed for all meta-analyses. 35 To describe individual study-level effect sizes and each pooled effect size, forest plots were constructed with 95% confidence intervals (CIs). 36 To further describe variability in effect sizes, the Q statistic for heterogeneity, tau-squared (τ²) representing the variance component, and I² describing the percentage of heterogeneity were computed. 37 Additionally, moderator analyses using random effects models were performed for four categorical factors and three continuous factors. 35
Three procedures were employed as part of a sensitivity analysis. First, Egger's regression was performed to test funnel plot asymmetry. 38 Second, a trim-and-fill procedure was performed to estimate the number of effect sizes needed to produce a symmetric funnel plot. 39 An estimated mean effect size was produced as part of the trim-and-fill analysis and represents the pooled effect size after imputing the study effect sizes required to balance each funnel plot. Third, a leave-one-out analysis was performed, which re-estimates the pooled effect size with each study deleted in turn. 40 Finally, the experimental design quality of each study included in this research was evaluated using the Physiotherapy Evidence Database (PEDro) scale. 41
SAS version 9.4 (SAS Institute, Cary, NC, USA), 32 R version 3.5 (R Core Team, Vienna, Austria), 42 and STATA version 14 (StataCorp, College Station, TX, USA) 43 were used for all analyses. Significance was set to P < 0.05. Due to a relatively small sample size for some meta-analyses, suggestive evidence was set at P < 0.10 for the moderator analysis. Strength criteria for the standardized mean difference effect sizes were set as follows: 0.20 (small), 0.50 (medium), 0.80 (large). 44

Results
Figure 1 displays the results of the systematic review procedures. A total of 245 studies were first identified by keywords. After a complete review of all abstracts, the full text of 114 articles was retrieved. After review of all full-text articles, 20 studies met inclusion criteria. Table 1 describes these studies in terms of their characteristics. [45][46][47][48][49][50][51][52][53][54][55][56][57][58][59][60] A total of 77 effect sizes were computed from all studies, with specific HRQOL measures ranging from 6 (ERF and PRF) to 12 (PF) effect sizes. Table 2 contains results from the PEDro methodological quality analysis. Of the 16 studies included in the analyses, the mean PEDro score was 4.9 (SD = 1.0). Of the ten meta-analyses, RT did not appear to significantly affect ERF, SF, and PRF. Table 3 provides evidence for the heterogeneity of effect sizes across the ten meta-analyses. All Q statistics were significant (P < 0.01), indicating heterogeneity in effect sizes. Additionally, I² values were large for all ten meta-analyses, with the smallest I² showing approximately 63% of variance (inconsistency) in effect sizes due to factors other than sampling error (chance).
Table 6 contains results from the three-step sensitivity analysis. Only two meta-analyses showed signs of funnel plot asymmetry. Although two effect sizes were required to balance the MH meta-analysis, its pooled mean effect size was still significant after imputation (MH: ES = 0.48, CI: 0.16-0.80). Conversely, two effect sizes were also required to balance the GH meta-analysis, but its pooled mean effect size was no longer significant after imputation (GH: ES = 0.34, CI: -0.03 to 0.71). Finally, results from the leave-one-out analyses were less consistent. Specifically, seven (MCS, MH, VT, PCS, BP, GH, and PF) of the ten meta-analyses had effects that remained significant regardless of which single study was removed from the pooled mean estimate. This implies that no single study influenced the significance of the effect that RT had on those HRQOL measures. The remaining three meta-analyses (ERF, SF, and PRF) each showed non-significant effects when any single study was removed, with the exception of one study. That is, for each of these three meta-analyses, removing a single study brought the pooled mean estimate to a significant level.
Specifically, if Tomas-Carus (2016)
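The change-score effect size described under Statistical analysis can be sketched as follows. This is a minimal illustration and not the authors' code: the pre/post correlation `r = 0.5` used to estimate the standard deviation of change, the large-sample `z = 1.96` used to recover a standard deviation from a confidence interval, and the standard error formula are common conventions assumed here, and all input values are hypothetical.

```python
import math


def sd_from_ci(lower, upper, n, z=1.96):
    """Recover a group SD from a 95% CI around a mean (large-sample z ASSUMED)."""
    return math.sqrt(n) * (upper - lower) / (2 * z)


def sd_of_change(sd_pre, sd_post, r=0.5):
    """Estimate the SD of pre-to-post change; r is an ASSUMED pre/post correlation."""
    return math.sqrt(sd_pre**2 + sd_post**2 - 2 * r * sd_pre * sd_post)


def pooled_sd(sd_a, n_a, sd_b, n_b):
    """Pool two group SDs, weighting each by its degrees of freedom."""
    return math.sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2))


def change_effect_size(treat, control, r=0.5):
    """Standardized difference in mean change (treatment minus control) and its SE."""
    change_t = treat["post_mean"] - treat["pre_mean"]
    change_c = control["post_mean"] - control["pre_mean"]
    sd_t = sd_of_change(treat["pre_sd"], treat["post_sd"], r)
    sd_c = sd_of_change(control["pre_sd"], control["post_sd"], r)
    n_t, n_c = treat["n"], control["n"]
    es = (change_t - change_c) / pooled_sd(sd_t, n_t, sd_c, n_c)
    # Conventional large-sample SE of a standardized mean difference (assumed form).
    se = math.sqrt((n_t + n_c) / (n_t * n_c) + es**2 / (2 * (n_t + n_c)))
    return es, se


# Hypothetical group summaries (not taken from any included study).
treatment = {"pre_mean": 60.0, "pre_sd": 10.0, "post_mean": 68.0, "post_sd": 10.0, "n": 20}
control = {"pre_mean": 61.0, "pre_sd": 10.0, "post_mean": 62.0, "post_sd": 10.0, "n": 20}

es, se = change_effect_size(treatment, control)
print(f"ES = {es:.2f}, SE = {se:.3f}")
```

A positive effect size here means the treatment group improved more than the control group, which matches the direction used for the pooled estimates in this study.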

Discussion
The purpose of this study was to use systematic review and meta-analytic techniques to examine the effect of RT on measures of HRQOL in older adults. Additionally, this research sought to use only HRQOL measures assessed by the gold-standard SF-36/12 assessment, which consisted of MCS, ERF, MH, VT, SF, PCS, BP, GH, PF, and PRF. Results from this study support RT intervention as an effective means for improving HRQOL in older adults. These results, however, are not without caveats, which should be discussed. For instance, three of the ten meta-analyses (ERF, SF, and PRF) did not significantly support RT as an efficacious means for increasing HRQOL at the initial stages of analysis. However, results from the sensitivity analysis revealed that a single study was influencing the non-significant pooled mean effects. Specifically, if Tomas-Carus (2016) is removed from the meta-analyses, RT has a significant effect on both ERF and SF. This inconsistency is clarified after further inspection of the Tomas-Carus (2016) data. Specifically, Tomas-Carus (2016) reported an unusually low pretest ERF mean value for the control group of 79.5, whereas the treatment group pretest ERF mean was 93.8. Posttest ERF means for control and treatment groups were 92.3 and 92.5, respectively. Conversely, Tomas-Carus (2016) reported an unusually high pretest SF mean value for the treatment group of 89.1, whereas the control group pretest mean was 82.7 for SF. Posttest SF means for treatment and control groups were 82.0 and 82.7, respectively. When examining the Tomas-Carus (2016) data this way, it becomes clearer that a case of statistical regression toward the mean possibly influenced these study findings. 61 Furthermore, the control group in Tomas-Carus (2016) was a true control group in that they were only instructed to behave in their usual manner.
With this in mind, it is unlikely to see a control group experience this amount of ERF improvement, and it is more likely that some subjects were measured as having abnormally lower ERF than typical. It is also likely that this same reasoning (i.e., statistical regression) explains why the treatment group in Tomas-Carus (2016) appeared to suffer such a drop in SF from pretest to posttest. The third meta-analysis that indicated an initial non-significant effect concerned PRF. Specifically, if Teixeira (2010) is removed from the meta-analysis, RT has a significant effect on PRF. At first glance, the pooled mean effect becoming significant after removing Teixeira (2010) seems counterintuitive (see Figure 5). It would appear that including Teixeira (2010) in the PRF meta-analysis would, if anything, skew the pooled mean estimate in the positive (greater effect) direction. However, in random effects models, the standard error of the pooled mean effect size is a function of not only the inverse variance weights but also the variance component, tau-squared (τ²). 62 Tau-squared is a measure of between-study variance, and adding its component to the inverse variance weights of the standard error computation is the driving mechanism behind random effects models. 63,64 In this case, the value of tau-squared was considerably large with Teixeira (2010) included in the PRF meta-analysis. More specifically, the Teixeira (2010) effect size contributing to the PRF meta-analysis was 2.28 (CI: 1.74-2.83), which yielded a tau-squared of 0.65 (see Table 3). With such a large tau-squared, the standard error of the pooled mean effect size produced an unusually large confidence interval. Hence, the PRF meta-analysis lacked power to find an RT effect.
And so, the RT effect size of 0.30 (CI: 0.05-0.54), with Teixeira (2010) omitted, may be a more suitable reported effect on PRF, with the Teixeira (2010) effect size likely belonging to a different population.
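The mechanism described above, in which one extreme effect size inflates tau-squared and widens the pooled confidence interval, can be illustrated with a short sketch. It assumes the DerSimonian-Laird estimator of tau-squared, which the text does not name, and every effect size and standard error below is hypothetical rather than taken from the PRF meta-analysis.

```python
import math


def random_effects(effects, ses, z=1.96):
    """Random effects pooling with a DerSimonian-Laird tau-squared (ASSUMED estimator)."""
    w = [1 / se**2 for se in ses]  # inverse variance weights
    sum_w = sum(w)
    fixed = sum(wi * es for wi, es in zip(w, effects)) / sum_w
    q = sum(wi * (es - fixed) ** 2 for wi, es in zip(w, effects))  # Q statistic
    df = len(effects) - 1
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # percent heterogeneity
    w_star = [1 / (se**2 + tau2) for se in ses]  # tau-squared added to each variance
    pooled = sum(wi * es for wi, es in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1 / sum(w_star))
    return {
        "pooled": pooled,
        "ci_low": pooled - z * se_pooled,
        "ci_high": pooled + z * se_pooled,
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }


def leave_one_out(effects, ses):
    """Re-estimate the pooled effect with each study removed in turn."""
    results = []
    for i in range(len(effects)):
        kept_es = effects[:i] + effects[i + 1:]
        kept_se = ses[:i] + ses[i + 1:]
        results.append(random_effects(kept_es, kept_se))
    return results


# Hypothetical studies: five modest effects plus one very large effect (last entry).
effects = [0.10, 0.25, 0.15, 0.30, 0.20, 2.28]
ses = [0.15, 0.18, 0.14, 0.20, 0.16, 0.20]

all_studies = random_effects(effects, ses)
loo = leave_one_out(effects, ses)
print("all studies:", {k: round(v, 3) for k, v in all_studies.items()})
for i, res in enumerate(loo):
    print(f"without study {i}: ES = {res['pooled']:.2f} "
          f"(CI: {res['ci_low']:.2f} to {res['ci_high']:.2f}), tau2 = {res['tau2']:.2f}")
```

With these made-up inputs, the pooled interval spans zero when all six effects are included and excludes zero only when the extreme effect is dropped, mirroring the pattern discussed for the PRF meta-analysis.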
With the above caveats explained, it can then be concluded that RT has a robust effect on HRQOL in older adults. This conclusion is supported by findings from similar meta-analyses. A recent meta-analysis examined the effect of RT on HRQOL among participants with chronic heart failure (CHF). 65 This meta-analysis included studies that used a different HRQOL assessment (Minnesota Living with Heart Failure Questionnaire) with a lower bound mean age of 48 years. The findings from this meta-analysis supported RT as a strong positive factor in increasing HRQOL. Another meta-analysis examined the effect of RT on HRQOL among participants with chronic kidney disease (CKD). 66 In this research, measures of HRQOL were extracted from studies that used the PF and PCS of the SF-36 assessment with a lower bound mean age of 43 years. Results from this meta-analysis also supported RT as an effective intervention in improving HRQOL in participants. A final study worth noting is a meta-analysis that examined the effect of RT on HRQOL in cancer patients. 67 This meta-analysis included studies that used two different disease-specific HRQOL assessments, the Functional Assessment of Cancer Therapy (FACT) self-report questionnaire and the Cancer Rehabilitation Evaluation System Short Form (CARES-SF). Only six studies were included in this meta-analysis, with a lower bound mean age of 49 years. However, a small RT effect (ES = -0.17 in favor of intervention) on HRQOL was still seen. Given the results from these supporting studies and results from the current meta-analyses, RT clearly is an effective intervention for increasing HRQOL in older adults.
The major strength of this study was its use of the SF-36/12 assessment as an inclusion criterion during the systematic review. This inclusion criterion gave strength to this research for two reasons. First, the SF-36/12 assessment, as previously mentioned, is a gold-standard HRQOL assessment in physical activity research, providing both valid and reliable measures. 28,30 Second, the SF-36/12 assessment has a unique attribute in that it allows for ten different HRQOL scores. 29 This attribute of the SF-36/12 assessment permits a greater and more valid coverage of the various health-related dimensions that ultimately affect the quality of life of older adults.
This study does have limitations worth noting. First, this study is possibly limited due to the phenomenon of publication bias. 68 Publication bias exists in a meta-analysis if studies with negative (null) findings have been systematically omitted from the data extraction process. However, this phenomenon is more likely to occur in industries such as pharmaceutical manufacturing, where organizations have a stake in the research results. 69 In physical activity research, a null finding is more likely considered a valuable addition to the literature. For example, of the 77 effects extracted from the 16 studies in this research, 49 were non-significant, which arguably is evidence against publication bias. Additionally, bias was addressed in this research during the sensitivity analysis, where little bias was found. Second, this study is possibly limited due to search bias, that is, a bias introduced by using a limited search strategy. Although this limitation is important to consider, this study took measures to prevent search bias. Specifically, the systematic review procedures included a search of the PubMed database and a large set of keyword terms to ensure a sensitive search. Third, this study is possibly limited due to selection bias, which is related to bias in the way flagged abstracts and articles were included in the meta-analysis. This limitation is important to consider. However, this study utilized two independent researchers at all stages of the systematic review and data extraction procedures to limit this potential bias. Finally, the use of a single database (i.e., PubMed) may have limited this research and decreased the quality of the search strategy by systematically missing relevant research articles. However, PubMed, a web-based portal of MEDLINE developed by the United States Department of Health and Human Services, has been shown to be more effective with comprehensive medical-related reviews than other similar databases. 70
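The funnel plot asymmetry check used in the sensitivity analysis (Egger's regression) can be sketched in its usual form: the standardized effect (ES/SE) is regressed on precision (1/SE), and an intercept far from zero suggests asymmetry. This is an illustrative sketch only; it omits the significance test on the intercept, the function name `egger_intercept` is hypothetical, and the data are made up.

```python
def egger_intercept(effects, ses):
    """Ordinary least squares of ES/SE on 1/SE; returns (intercept, slope)."""
    x = [1 / se for se in ses]  # precision
    y = [es / se for es, se in zip(effects, ses)]  # standardized effect
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data in which less precise studies report larger effects.
demo_effects = [0.25, 0.30, 0.45, 0.60, 0.80]
demo_ses = [0.10, 0.15, 0.20, 0.30, 0.40]

intercept, slope = egger_intercept(demo_effects, demo_ses)
print(f"Egger intercept = {intercept:.2f}, slope = {slope:.2f}")
```

In practice the intercept would be tested against zero with its standard error, which a statistical package reports directly.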

Conclusion
The meta-analytic evidence presented in this research clearly supports RT as an effective means for improving HRQOL in older adults. The array of specific HRQOL dimensions that RT may improve spans both mental (MCS, ERF, MH, VT, and SF) and physical (PCS, BP, GH, PRF, and PF) HRQOL domains. RT may, however, be particularly effective at improving MH and BP in older adults. RT should be a priority intervention for improving HRQOL in older adults and helping to meet our national HRQOL goals.

Ethical approval
This study used already published data from journal articles. Therefore, institutional review board approval was not required.