Relationships between physical activity and other health-related measures using state-based prevalence estimates

Background: Both physical activity and muscle-strengthening activity have known relationships with other health-related variables such as alcohol and tobacco use, diet, and health-related quality of life (HRQOL). The purpose of this study was to explore and quantify the associations between physical activity measures and health-related variables at the state level, a higher level of observation. Methods: This cross-sectional study used data from the 2017 and 2019 Behavioral Risk Factor Surveillance System surveys. State-based prevalence (%) estimates were computed for meeting physical activity guidelines (PA), meeting muscle-strengthening activity guidelines (MS), meeting both PA and MS guidelines (MB), drinking alcohol (D1), heavy alcohol drinking (HD), fruit consumption (F1), vegetable consumption (V1), good self-rated health (GH), overweight (OW), obesity (OB), current smoking (SN), and smokeless tobacco use (SL). Descriptive statistics, correlation coefficients, and data visualization methods were employed. Results: The strongest associations were seen between PA and F1 (2017: r = 0.717 & 2019: r = 0.695), MS and OB (2017: r = -0.781 & 2019: r = -0.599), PA and GH (2017: r = 0.631 & 2019: r = 0.649), PA and OB (2017: r = -0.645 & 2019: r = -0.763), and MB and SN (2017: r = -0.713 & 2019: r = -0.645). V1 was associated only with PA (2017: r = 0.335 & 2019: r = 0.357), whereas OW was associated with MS and MB but not with PA. Canonical correlation analysis showed the physical activity variables were directly related (r_c = 0.884, P < 0.001) to the health variables. Conclusion: This study used high-level data to support the many known relationships between PA measures and health-related variables.


Introduction
Physical activity (PA) is a known preventive health behavior: greater amounts are associated with lower health risk, and PA is therefore promoted to all adult populations. 1 Accordingly, Healthy People 2030 has an objective to increase to 52.9% (from 47.9% in 2020) the percentage of adults 18+ years of age who engage in aerobic PA of at least moderate intensity for 150+ minutes per week, or 75+ minutes per week of vigorous intensity, or an equivalent combination. 2 Muscle-strengthening activity (MSA) is a specific form of PA and is also promoted to adults as a preventive health behavior. 3 Another Healthy People 2030 objective, related to MSA, is to increase to 36.6% (from 31.9% in 2020) the percentage of adults 18+ years of age who perform muscle-strengthening activity on 2+ days per week. 4 Both PA and MSA are promoted in combination to adult populations and can serve as an additional PA guideline measure for individuals meeting both recommendations. 5 The influence that PA has on health outcomes can be direct, indirect, or both. 6 For example, PA can directly improve a person's cardiovascular disease risk by reducing blood pressure. 7 Similarly, MSA can directly improve an individual's functional ability by improving their muscular strength and balance. 8 PA can also contribute to positive health outcomes indirectly by first influencing a different health behavior or outcome. For example, PA can indirectly improve a person's cancer risk by stimulating the desire to improve nutrition and adhere to the American Cancer Society (ACS) Guideline for Diet and Physical Activity. 9 In this scenario, the ACS diet would be directly related to improved cancer risk.
There is a large body of knowledge published on the health-related correlates of PA that directly relate to health outcomes in adult populations. For instance, objectively measured PA has been shown proportionally related to ][18] Evidence of these aforementioned associations between PA measures and health-related behaviors is important for both preventive medicine and individual-level behavior change. But a more comprehensive understanding of these relationships can be achieved by also examining them at a higher level of analysis. That is to say, if prevalence estimates of certain behaviors are collected across varying geographic locations, such as U.S. states, these estimates can serve as correlational data from a higher observational level. One such study from the Centers for Disease Control and Prevention (CDC) reported this type of analysis by correlating alcohol-use behaviors among youth with those of adults, across U.S. states. 19 Another study used state-based prevalence of obesity to find its associations with six different types of cancer using data from a national health survey and the U.S. Cancer Statistics. 20 Considering this, examining the extent to which state-based PA estimates relate to other health measures can be an important contribution to population health. The aim of this study was to explore and quantify the associations between PA measures and other health-related variables at a higher observational level using state-based prevalence estimates.

Study procedures
A cross-sectional correlational design was employed to address the study's research question. Data came from the 2017 and 2019 U.S. CDC's Behavioral Risk Factor Surveillance System (BRFSS) surveys. Details regarding the BRFSS background and design can be found elsewhere. 21,22 Briefly, the BRFSS is a state-based annual telephone survey designed to collect consistent data on health-risk behaviors, health conditions, and preventive care in noninstitutionalized U.S. adults 18+ years of age. In this analysis, data from all available states were included; the District of Columbia, Guam, and Puerto Rico were excluded. This resulted in N = 50 state records for the 2017 BRFSS and N = 49 state records for the 2019 BRFSS (no data for New Jersey). At the time of the study, the 2017 and 2019 BRFSS surveys were the most recent to assess MSA.

PA measures and health-related variables
Twelve different health variables were used in this study, each representing a state's weighted prevalence (%) estimate. 23 Aerobic PA represents the prevalence of adults who participate in 150 minutes or more of aerobic PA per week. Muscle strengthening exercise (MS) represents the prevalence of adults who participate in muscle strengthening exercises two or more times per week. Meeting both PA and MS (MB) represents the prevalence of adults who participated in enough aerobic and muscle strengthening exercises to meet guidelines (i.e., PA and MS). Drinking alcohol (D1) represents the prevalence of adults who have had at least one drink of alcohol within the past 30 days. Heavy drinking (HD) represents the prevalence of heavy alcohol consumption, defined as an adult male having more than 14 drinks per week or an adult female having more than 7 drinks per week. Fruit consumption (F1) represents the prevalence of consuming fruit one or more times per day. Vegetable consumption (V1) represents the prevalence of consuming vegetables one or more times per day. General health (GH) represents the prevalence of self-rated good health or better. Obese (OB) represents the prevalence of obesity as assessed by a body mass index (BMI) within the 30.0 kg/m² to 99.8 kg/m² range. Overweight (OW) represents the prevalence of overweight as assessed by a BMI within the 25.0 kg/m² to 29.9 kg/m² range. Smoking (SN) represents the prevalence of adults who are current smokers. Smokeless tobacco use (SL) represents the prevalence of adults who currently use chewing tobacco, snuff, or snus every day.
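To make the idea of a weighted prevalence estimate concrete, the following is a minimal sketch and not the BRFSS estimation procedure itself. It assumes each respondent has a 0/1 indicator and a survey weight; the function name, the respondents, and the weights are all hypothetical.

```python
def weighted_prevalence(indicators, weights):
    """Weighted prevalence (%) of a yes/no indicator.

    indicators: 0/1 values (1 = respondent meets the definition)
    weights:    matching survey weights (hypothetical values in this sketch)
    """
    indicators = list(indicators)
    weights = list(weights)
    if len(indicators) != len(weights):
        raise ValueError("indicators and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # Weighted share of respondents with indicator == 1, as a percentage
    return 100.0 * sum(w * y for y, w in zip(indicators, weights)) / total


# Hypothetical respondents: 1 = meets the aerobic PA guideline
meets_pa = [1, 0, 1, 1, 0]
survey_weights = [120.0, 80.0, 100.0, 50.0, 150.0]
pa_prevalence = weighted_prevalence(meets_pa, survey_weights)
```

Each state contributes one such percentage per variable, and those state-level percentages are the units of analysis in this study.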

Statistical analyses
Data analysis for this study included (1) descriptive statistics, (2) normality and outlier analyses, (3) Pearson correlations for bivariate associations, (4) data visualization methods for displaying prevalence estimates and correlations with 95% confidence intervals (CIs), and (5) canonical correlation analysis on the set of PA measures and set of other health variables. The purpose of the normality and outlier analyses was to further explore the state-based prevalence data as well as to check for major violations of Pearson correlation coefficient assumptions. The correlation analysis was replicated using Spearman correlations; each coefficient was found to be similar in direction and magnitude and is therefore not presented. Data visualization techniques included (a) a prevalence plot by state using average 2017 and 2019 values (see description below), (b) bivariate scatter plots for each year (i.e., 2017 and 2019) for each PA measure and health variable pair, with fit line, (c) forest plots of the Pearson correlations with their 95% CI, by year, for each PA measure (i.e., PA, MS, and MB), (d) a canonical correlation analysis path diagram, representing two constructs of PA and health, and (e) a scatter plot with fit regression line for the first canonical PA (i.e., x variable) and health (i.e., y variable) variates.
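As an illustration of step (3) and the CIs in step (4), the sketch below computes a Pearson correlation and a 95% CI using the Fisher z transformation. The text does not state how the CIs were obtained, so the Fisher z approach is an assumption, and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for a correlation via the Fisher z transformation.

    Assumption: this is one common method, not necessarily the one used
    for the forest plots in this study.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)               # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)     # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Example: the reported 2017 PA-F1 correlation with N = 50 states
ci_low, ci_high = fisher_ci(0.717, 50)
```

Because the interval is built on the z scale and transformed back, it is asymmetric around r and always stays within -1 and 1.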
The canonical correlation analysis also used average 2017 and 2019 prevalence data with N = 50. Since the SL prevalence was missing for Rhode Island in 2017, its 2019 value was used as the average value. Additionally, since New Jersey did not have reliable data in 2019, its 2017 prevalence estimates were used as average values. Multicollinearity was checked during the canonical correlation analysis, and MB was dropped from the set of PA variables due to high variance inflation (i.e., VIF > 10).[26]
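The two data-handling rules in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and all numeric inputs are hypothetical, and the VIF helper uses the closed-form R² for one variable regressed on two others, which may differ from how the statistical software computed it.

```python
def two_year_average(value_2017, value_2019):
    """Average two yearly estimates; use the available year if one is None."""
    available = [v for v in (value_2017, value_2019) if v is not None]
    if not available:
        raise ValueError("no estimate available for either year")
    return sum(available) / len(available)


def vif_from_correlations(r_12, r_13, r_23):
    """VIFs for three variables given their pairwise correlations.

    Each VIF is 1 / (1 - R^2), where R^2 comes from regressing that
    variable on the other two.
    """
    def r_squared(r_ya, r_yb, r_ab):
        # R^2 of y regressed on predictors a and b, from correlations only
        return (r_ya ** 2 + r_yb ** 2 - 2 * r_ya * r_yb * r_ab) / (1 - r_ab ** 2)

    r2 = {
        "x1": r_squared(r_12, r_13, r_23),
        "x2": r_squared(r_12, r_23, r_13),
        "x3": r_squared(r_13, r_23, r_12),
    }
    return {name: 1.0 / (1.0 - value) for name, value in r2.items()}


# Hypothetical prevalence values: one state with both years, one with a gap
avg_both = two_year_average(20.0, 24.0)
avg_fallback = two_year_average(None, 3.5)

# Hypothetical correlations among three predictors (not the study's values)
vifs = vif_from_correlations(r_12=0.5, r_13=0.9, r_23=0.8)
dropped = [name for name, v in vifs.items() if v > 10]
```

With these made-up correlations, the variables whose VIF exceeds 10 would be flagged, mirroring the rule that led to MB being dropped.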

Results
Table 1 contains descriptive statistics, normality checks, and outlier analysis for the 2017 BRFSS prevalence estimates with a total sample size of N = 50 for all variables, except SL due to the lack of Rhode Island data on that variable. On average, the prevalence of PA (Mean = 50.6%, SD = 4.5%) is greater than that of MS (Mean = 30.1%, SD = 2.7%) and MB (Mean = 20.2%, SD = 2.6%). Table 2 contains the same exploratory data analyses but for the 2019 BRFSS prevalence estimates with a total sample size of N = 49 for all variables (minus New Jersey). Similarly, the average prevalence of PA (Mean = 50.6%, SD = 5.6%) is greater than that of MS (Mean = 34.9%, SD = 3.5%) and MB (Mean = 22.6%, SD = 3.3%). Figure 1 displays these findings visually with 2017 and 2019 prevalence estimates averaged. The normality and outlier analyses were judged acceptable overall for both years because (1) there were few outliers, (2) the skewness and kurtosis statistics were relatively small, (3) the normality test was rejected for only one variable in each year (F1 in 2017 and HD in 2019), (4) the histograms looked appropriate on subjective inspection, (5) all prevalence data were checked and found to be accurate, and (6) Spearman correlations were not judged different from the Pearson correlation coefficients in the upcoming bivariate analyses.
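The skewness and kurtosis Z statistics and the outlier count described in the table notes (skew/sqrt(6/N), kurt/sqrt(24/N), and standard scores beyond 2.5 in absolute value) can be sketched as below. The moment-based skewness and excess kurtosis estimators are an assumption; statistical packages often apply small-sample adjustments, so values may differ slightly from those in the tables. The function name is hypothetical.

```python
import math


def shape_statistics(values, outlier_cutoff=2.5):
    """Skewness, excess kurtosis, their Z statistics, and an outlier count."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    mean = sum(values) / n
    # Central moments (population form; an assumption of this sketch)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2 - 3.0          # excess kurtosis (normal = 0)
    sd = math.sqrt(m2 * n / (n - 1))   # sample standard deviation
    outliers = sum(1 for v in values if abs((v - mean) / sd) > outlier_cutoff)
    return {
        "skew": skew,
        "kurt": kurt,
        "z_skew": skew / math.sqrt(6.0 / n),
        "z_kurt": kurt / math.sqrt(24.0 / n),
        "outliers": outliers,
    }


# Hypothetical state prevalence values (%), for illustration only
stats = shape_statistics([1.0, 2.0, 3.0, 4.0, 5.0])
```

A Z statistic that is large in absolute value would suggest a departure from normality on that dimension.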
Table 3 contains the bivariate Pearson correlation coefficients for the PA measure and health variable prevalence estimates for the 2017 BRFSS data. Table 4 contains the same bivariate correlations for the 2019 BRFSS data. Figure 2 displays the scatter plots for these relationships with fit linear regression lines, confirming the assumption of linearity for the correlations. Collectively, most state-based correlations were significant (P < 0.05), with the strongest associations seen between PA and F1 (2017: r = 0.717 & 2019: r = 0.695), MS and OB (2017: r = -0.781 & 2019: r = -0.599), PA and GH (2017: r = 0.631 & 2019: r = 0.649), PA and OB (2017: r = -0.645 & 2019: r = -0.763), and MB and SN (2017: r = -0.713 & 2019: r = -0.645). Table 5 displays results of the canonical correlation analysis using PA and MS measures as one set of PA variables and D1, HD, F1, V1, GH, OB, OW, SN, and SL as a set of health variables. These results show that the linear combination of PA variables is strongly related (r_c = 0.884, P < 0.001) to the linear combination of health variables.
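As a quick arithmetic check, not an additional analysis, squaring the reported canonical correlation gives the proportion of variance shared by the first pair of canonical variates. The result agrees with the R² = 0.78 reported for the Figure 7 scatter plot.

```latex
r_c^2 = 0.884^2 \approx 0.781
```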

Discussion
This study used a novel approach for examining associations between PA measures and health-related variables. Specifically, state-level prevalence (%) estimates, weighted and representing the U.S. civilian adult population, were used for bivariate correlation data. Findings showed that most state-level health-related variables were associated with all three state-level PA measures. Predictably, the prevalence of OB was strongly negatively correlated (all r < -0.60) with all three PA measures. The prevalence of OW, however, was not correlated with PA and only moderately positively correlated with MS and MB. These findings are noteworthy since individual-level associations appear mixed regarding PA and overweightness. 27,280][31][32][33][34] An exception was the weak associations between PA measures and vegetable consumption (V1), with just a modest correlation between V1 and PA and no relationship between V1 and MS or MB. 35 This study also used a novel multivariate statistical technique, canonical correlation analysis, to examine the extent to which the set of PA measures (i.e., PA and MS) correlates with the set of health-related variables (i.e., D1, HD, F1, V1, GH, OB, OW, SN, and SL). The results clearly indicated a strong association (r_c = 0.884) between the two sets of health variables. This may be the only application, to date, where such a multivariate correlation coefficient has been computed using U.S. state-level prevalence estimates of health variables.
A strength of this study is its use of BRFSS data, which provide representative samples of noninstitutionalized adults for estimation of health-related summary statistics. Another strength is the series of BRFSS survey questions and modules assessing various health risk behaviors and health status outcomes. The BRFSS specifically designs its questionnaires to target the leading causes of premature death and disability in the U.S. Therefore, the variables used in this study are of the utmost importance to the health status of U.S. adults. There are limitations worth mentioning. Firstly, this study uses a higher-level unit of analysis, in the form of state-based prevalence estimates, and its data can be considered ecological. Therefore, findings from this study should not necessarily imply that the same associations exist at the individual level (i.e., ecological fallacy). 36 Secondly, BRFSS data are cross-sectional in nature and thus do not provide evidence for cause and effect. Specifically, these findings do not imply that changing a person's PA will subsequently change their health behavior or health status. Thirdly, all variables in this study were assessed via self-report telephone interviews. Therefore, participant misclassification due to measurement and reporting bias cannot be ruled out. In sum, findings from this study should be considered with caution.

Conclusion
This study used higher-level data to support the many known relationships between PA and health. Findings clearly showed moderately strong associations between the different PA measures and F1, GH, OB, and SN among U.S. adults. Conversely, findings showed consistently weak associations between the PA measures and V1 and OW. Thus, at the state level, PA may provide little information regarding adult overweightness status and vegetable consumption. State-based associations between PA and health can be an alternative source of needs assessment for health promotion professionals and policy makers.

Figure 1. Plot of physical activity (PA) prevalence by physical activity (PA) measure for each state, Behavioral Risk Factor Surveillance System (BRFSS) 2017 and 2019
Figure 7 displays the scatter and fit line for the first canonical scores with large explained variance (R² = 0.78).

Figure 2. Scatter plots for physical activity (PA) measure and health variable prevalence estimates with fit linear regression lines for Behavioral Risk Factor Surveillance System (BRFSS) 2017 (left) and 2019 (right)

Figure 3. Forest plot of Pearson correlations for meeting physical activity guidelines (PA) and health-related variables, Behavioral Risk Factor Surveillance System (BRFSS) 2017 and 2019

Figure 7. Scatter and fit line for the first canonical variate scores for physical activity (PA) and health, Behavioral Risk Factor Surveillance System (BRFSS) 2017 and 2019

Table 1. Descriptive and normality statistics for state-based health-related prevalence estimates, Behavioral Risk Factor Surveillance System (BRFSS) 2017
Note. N = 50 (N = 49 for SL). All states participated in 2017. Descriptive statistics are for state-based prevalence estimates (%s). SD is standard deviation. Skew is skewness. Kurt is kurtosis. Z Skew and Z Kurt are the Z statistics for skewness (skew/sqrt(6/N)) and kurtosis (kurt/sqrt(24/N)), respectively. Outliers is the number of % values with an absolute standard score greater than 2.5. P value is for the Kolmogorov-Smirnov normality test. PA: Participated in 150 minutes or more of aerobic physical activity per week. MS: Participated in muscle strengthening exercises two or more times per week. MB: Participated in enough aerobic and muscle strengthening exercises to meet guidelines. D1: Adults who have had at least one drink of alcohol within the past 30 days. HD: Heavy drinkers (adult men having more than 14 drinks per week and adult women having more than 7 drinks per week). F1: Consumed fruit one or more times per day. V1: Consumed vegetables one or more times per day. GH: Good or better health. OB: Obese (BMI 30.0 kg/m² to 99.8 kg/m²). OW: Overweight (BMI 25.0 kg/m² to 29.9 kg/m²). SN: Adults who are current smokers. SL: Adults who currently use chewing tobacco, snuff, or snus every day.

Table 2. Descriptive and normality statistics for state-based health-related prevalence estimates, Behavioral Risk Factor Surveillance System (BRFSS) 2019
Note. N = 49. All states participated in 2019 except NJ. Descriptive statistics are for state-based prevalence estimates (%s). SD is standard deviation. Skew is skewness. Kurt is kurtosis. Z Skew and Z Kurt are the Z statistics for skewness (skew/sqrt(6/N)) and kurtosis (kurt/sqrt(24/N)), respectively. Outliers is the number of % values with an absolute standard score greater than 2.5. P value is for the Kolmogorov-Smirnov normality test. PA: Participated in 150 minutes or more of aerobic physical activity per week. MS: Participated in muscle strengthening exercises two or more times per week. MB: Participated in enough aerobic and muscle strengthening exercises to meet guidelines. D1: Adults who have had at least one drink of alcohol within the past 30 days. HD: Heavy drinkers (adult men having more than 14 drinks per week and adult women having more than 7 drinks per week). F1: Consumed fruit one or more times per day. V1: Consumed vegetables one or more times per day. GH: Good or better health. OB: Obese (BMI 30.0 kg/m² to 99.8 kg/m²). OW: Overweight (BMI 25.0 kg/m² to 29.9 kg/m²). SN: Adults who are current smokers. SL: Adults who currently use chewing tobacco, snuff, or snus every day.

Table 3. Pearson correlation coefficients for state-based health-related prevalence estimates, Behavioral Risk Factor Surveillance System (BRFSS) 2017
Note. N = 50 (N = 49 for SL). All states participated in 2017. Average is the mean of the absolute value of correlation coefficients. PA: Participated in 150 minutes or more of aerobic physical activity per week. MS: Participated in muscle strengthening exercises two or more times per week. MB: Participated in enough aerobic and muscle strengthening exercises to meet guidelines. D1: Adults who have had at least one drink of alcohol within the past 30 days. HD: Heavy drinkers (adult men having more than 14 drinks per week and adult women having more than 7 drinks per week). F1: Consumed fruit one or more times per day. V1: Consumed vegetables one or more times per day. GH: Good or better health. OB: Obese (BMI 30.0 kg/m² to 99.8 kg/m²). OW: Overweight (BMI 25.0 kg/m² to 29.9 kg/m²). SN: Adults who are current smokers. SL: Adults who currently use chewing tobacco, snuff, or snus every day.

Table 4. Pearson correlation coefficients for state-based health-related prevalence estimates, Behavioral Risk Factor Surveillance System (BRFSS) 2019
Note. N = 49. All states participated in 2019 except NJ. Average is the mean of the absolute value of correlation coefficients. PA: Participated in 150 minutes or more of aerobic physical activity per week. MS: Participated in muscle strengthening exercises two or more times per week. MB: Participated in enough aerobic and muscle strengthening exercises to meet guidelines. D1: Adults who have had at least one drink of alcohol within the past 30 days. HD: Heavy drinkers (adult men having more than 14 drinks per week and adult women having more than 7 drinks per week). F1: Consumed fruit one or more times per day. V1: Consumed vegetables one or more times per day. GH: Good or better health. OB: Obese (BMI 30.0 kg/m² to 99.8 kg/m²). OW: Overweight (BMI 25.0 kg/m² to 29.9 kg/m²). SN: Adults who are current smokers. SL: Adults who currently use chewing tobacco, snuff, or snus every day.

Table 5. Canonical correlation analysis of physical activity (PA) and health prevalence, Behavioral Risk Factor Surveillance System (BRFSS) 2017 and 2019
Note. b is the raw canonical coefficient (within-construct slope). B is the standardized canonical coefficient (within-construct standardized slope). r_s.w is the within-construct structure loading (correlation). r_s.b is the between (cross) construct structure loading (correlation). RI is the Stewart-Love redundancy index. r_c is the canonical correlation coefficient. % represents the proportion of the sum of eigenvalues explained by the first canonical variate. The F statistic is for the likelihood ratio test and λ is Wilks' Lambda, with the null hypothesis stating that the canonical correlation is zero. The second canonical variate was also significant but had low explained variance and was uninterpretable.