Gender difference in colorectal cancer indicators for exercise interventions: the National Health Insurance Sharing Service-Derived Big Data Analysis
Abstract
We aimed to examine the characteristic features of colorectal cancer and the effects of sex-associated parameters, including exercise, on its prevalence by using data from the National Health Insurance Sharing Service Database (NHISS DB). Data from NHISS were collected on Koreans aged from 40 to 85 years and were subjected to thematic analysis. The target study group was selected using the colorectal cancer codes (C19, C20, D011, and D012) from the Korean Standard Classification of Disease and Causes of Death, and t-test and logistic regression were used. As a result, age was higher in men who had colorectal cancer than in the noncancer group; however, in women, high and low blood pressure, hemoglobin, and age had lower values in the cancer group than in their counterparts. Only total cholesterol in men and waist size in women were shown to differ significantly between the cancer and noncancer groups. Serum glutamic pyruvic transaminase and alanine aminotransferase (SGPT_ALT) showed significant differences for both sexes. For the exercise-related parameter, response number 2 (1–2 times/wk; odds ratio, 0.535) in women and response number 3 (3–4 times/wk; odds ratio, 0.466) in men were associated with a reduced incidence of colon cancer. Parameters differed between the sexes in patients with colorectal cancer aged over 40 years, except for SGPT_ALT. Regular physical activity might be one of the strong factors affecting or predicting colorectal cancer incidence.
INTRODUCTION
Westernized unhealthy and unbalanced dietary habits and lifestyles contribute to the increased risk for cancer (Haydon et al., 2006). Cancer is the most common cause of death, and colorectal cancer (16.5/100,000 persons) is the third-leading cause of death according to the 2016 report from Statistics Korea (http://kostat.go.kr).
The National Health Insurance Sharing Service (NHISS; https://www.nhis.or.kr) was implemented in 1963, and the whole Korean population is currently compulsorily enrolled in the NHISS (Choi et al., 2015). The medical records of the Korean population can thus be tracked electronically through the NHISS database (DB). This is one of the coordinated strategies to solve global health issues (Chalkidou and Vega, 2013). NHISS data-derived results have been increasingly reported (Jee et al., 2018).
Various unspecified etiologies of cancer, such as physical stimulations (e.g., viruses, radiation, and ultraviolet rays), smoking, and heredity, may cause genetic mutations and ultimately, inhibitor genes (Mafune et al., 2015; Upadhyay et al., 2016). Owing to unidentified patients with cancer in this study, the cure for colorectal cancer was not provided despite conducting numerous trials. In those trials, retrospective patient-oriented studies were also tracked to identify a reasonable solution to prevent cancer symptoms and to understand the various characteristics of colorectal cancer from diverse perspectives. This is important for deciding the factors influencing the incidence of colorectal cancer, and such efforts play a critical role in developing beneficial intervention programs such as tailor-made exercise prescription programs.
Thus, in this study, we aimed to examine the various characteristic features and the effect of parameters on the prevalence of colorectal cancer using big data sourced from the NHISS DB. To accomplish the goals of this study, we did the following:
Patients with colorectal cancer included in the NHISS DB and their demographic characteristics were determined.
Sex-specific parameters based on the characteristics of patients with colorectal cancer were detected using t-test and logistic regression analyses, which may provide beneficial information for designing an evidence-based program for alleviating colorectal cancer symptoms.
MATERIALS AND METHODS
Study design
The NHISS-derived data contributed to a longitudinal study in the population registered from 2002 to 2008. Patients with colorectal cancer were tracked using codes from the Korean Standard Classification of Disease and Causes of Death (KSCDCD; http://kssc.kostat.go.kr/ksscNew_web/index.jsp); the code numbers were C19, C20, C218, D011, and D012. From the data of the extracted patients with colorectal cancer, sex-specific values of 12 selected parameters were analyzed using t-test. A total of 1,740 patients who were registered as having benign colon lesions were subsequently tracked for comparison with patients in the extracted colorectal cancer group. The codes for benign colon lesions were M814 (adenomas and adenocarcinomas), M838 (adenomas and adenocarcinomas), K621 (rectal polyps), K635 (polyps of the colon), K6358 (other polyps of the colon), D12 (benign neoplasms of the colon, rectum, anus, and anal canal), D123 (benign neoplasms of the transverse colon), and D126 (benign neoplasms of the colon, unspecified). This part of the study is described in Fig. 1.
Data source and subject population
A total of 514,866 people older than 40 years were randomly selected from the NHISS DB pool, representing almost 10% of the whole Korean population in the DB. All Koreans are required to enroll (Lee et al., 2016), and those older than 40 years in particular should undergo regular health checkups; it is therefore unnecessary to recruit medical records separately, because they are recorded automatically and there are essentially no missing data in the NHISS DB. The incidence of colorectal cancer increases abruptly from the age of 40 years; thus, we focused on analyzing data from the NHISS DB from this age. Patients with cancer were tracked using colorectal cancer-related codes such as C19 (malignant neoplasm of the colon with the rectum), C20 (malignant neoplasm of the rectal ampulla), C21.8 (malignant neoplasm of the anorectal junction), D01.1 (carcinoma in situ of the rectosigmoid junction), or D01.2 (carcinoma in situ of the rectum) from the KSCDCD. The institutional review board of Seoul National University Bundang Hospital (X-1707-411-903) approved this study.
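The code-based selection described above can be sketched as follows. This is a minimal illustration, not the authors' extraction procedure: the record layout (dictionaries with `AGE` and `DIAG_CODE` keys), the function names, and the choice of exact matching after removing dots are all assumptions. Dots are dropped so that the dotted forms (C21.8) and undotted forms (C218) used in this article compare equal.

```python
# Colorectal cancer codes as listed in the Study design section.
CRC_CODES = {"C19", "C20", "C218", "D011", "D012"}


def normalize_code(code):
    """Upper-case a code and drop dots/spaces, so 'C21.8' matches 'C218'."""
    return (code or "").replace(".", "").replace(" ", "").upper()


def is_crc(code):
    """True if the normalized code is one of the study's cancer codes."""
    return normalize_code(code) in CRC_CODES


def select_cohort(records, min_age=40, max_age=85):
    """Split records in the age range into (cancer, noncancer) lists.

    `records` is a hypothetical iterable of dicts with 'AGE' and
    'DIAG_CODE' keys; the real NHISS DB layout is not given in the text.
    """
    cancer, noncancer = [], []
    for rec in records:
        if not (min_age <= rec["AGE"] <= max_age):
            continue  # outside the 40-85 year range analyzed here
        (cancer if is_crc(rec["DIAG_CODE"]) else noncancer).append(rec)
    return cancer, noncancer
```

Exact matching was chosen for the sketch because the article lists specific codes; whether subcodes should also match is not stated.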
Categorization of variables
The following variables were used to analyze the obtained data: body mass index (BMI; kg/m2); high blood pressure (BP_HIGH; mmHg); low blood pressure (BP_LWST; mmHg); fasting blood sugar levels (BLDS; mg/dL); total cholesterol levels (TOT_CHOLE; mg/dL); hemoglobin levels (HMG; g/dL); urine glucose (GLY_CD; 1, negative; 2, weakly positive; 3, positive [+1]; 4, positive [+2]; 5, positive [+3]; 6, positive [+4]); occult hematuria (OLIG_OCCU_CD; 1, negative; 2, weakly positive; 3, positive [+1]; 4, positive [+2]; 5, positive [+3]; 6, positive [+4]); urine pH levels (OLIG_PH; pH); protein in urine (OLIG_PROTE_CD; 1, negative; 2, weakly positive; 3, positive [+1]; 4, positive [+2]; 5, positive [+3]); serum glutamic oxaloacetic transaminase and aspartate aminotransferase levels (SGOT_AST; U/L); serum glutamic pyruvic transaminase and alanine aminotransferase levels (SGPT_ALT; U/L); gamma glutamyl transpeptidase levels (GAMMA_GTP; U/L); individual disease history (HCHK_PMH_CD; 1, tuberculosis; 2, hepatitis; 3, hepatism; 4, high blood pressure; 5, cardiopathy; 6, cerebral apoplexy; 7, diabetes; 8, cancer; 9, others); family history of hepatism (FMLY_LIVER_DISE_PATIEN_YN; 1, no; 2, yes); family history of high blood pressure (FMLY_HPRTS_PATIEN_YN; 1, no; 2, yes); family history of cerebral apoplexy (FMLY_APOP_PATIEN_YN; 1, no; 2, yes); family history of cardiopathy (FMLY_HDISE_PATIEN_YN; 1, no; 2, yes); family history of diabetes (FMLY_DIABML_PATIEN_YN; 1, no; 2, yes); family history of cancers (FMLY_CANCER_PATIEN_YN; 1, no; 2, yes); current smoking status (SMK_STAT_TYPE_RSPS_CD; 1, no; 2, quit; 3, smoking); smoking period in the past (SMK_TERM_RSPS_CD; 1, within 6 years; 2, 6–9 years; 3, 10–19 years; 4, 20–29 years; 5, over 30 years); amount smoked per day (DSQTY_RSPS_CD; 1, half-pack/day; 2, half-pack/day–1 pack/day; 3, 1 pack/day–2 packs/day; 4, more than 2 packs/day); alcohol consumption frequency (DRNK_HABIT_RSPS_CD; 1, almost nondrinking; 2, 2–3 times/month; 3, 1–2 times/week; 4, almost every day); quantity of alcohol consumed at one time, in 360-mL bottles of 20% alcohol by volume (TM1_DRKQTY_RSPS_CD; 1, less than half a bottle; 2, one bottle; 3, one and a half bottles; 4, more than 2 bottles); frequency of moderate-intensity exercise per week (EXERCI_FREQ_RSPS_CD); waist circumference (WAIST; cm); and age (AGE; years).
Statistical analysis
All data are presented as mean±standard deviation. In the sex-specific comparison, t-test was used to analyze the mean difference between the cancer and noncancer groups within each sex. Logistic regression analysis was performed separately by sex to estimate the effect of various parameters on colorectal carcinogenesis, and odds ratios (ORs), P-values, and 95% confidence intervals (CIs) were calculated. SAS ver. 9.4 (SAS Institute, Cary, NC, USA) and IBM SPSS ver. 18.0 (IBM Co., Armonk, NY, USA) were used for all statistical analyses. A P-value of <0.05 was considered to indicate a statistically significant difference in all analyses.
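As a rough illustration of the two quantities named in this section, the sketch below computes a two-sample t statistic from summary statistics and converts a logistic regression coefficient into an OR with a 95% CI. It is not the SAS or SPSS procedure used in the study. The text does not say which t-test variant was applied, so the unequal-variance (Welch) form is an assumption, as are the function names and the Wald form of the interval.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and approximate degrees of freedom.

    Inputs are group means, standard deviations, and sample sizes,
    e.g. for a cancer group versus a noncancer group.
    """
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2  # squared standard errors
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def odds_ratio_ci(beta, se, z=1.959964):
    """OR and Wald 95% CI from a logistic coefficient and its standard error.

    The interval is symmetric on the log-odds scale and is exponentiated.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)
```

An OR below 1 with an upper CI limit below 1 indicates lower odds at P<0.05, which is how the exercise-related results in this article are read.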
RESULTS
Subject characteristics
The medical histories of Korean people, who are mandatorily registered, were tracked through the NHISS DB. We analyzed 514,866 randomly selected representatives of the Korean population older than 40 years (a one-tenth sample; thus, the whole population older than 40 years was approximately 5 million). Among the 514,866 people, 279,125 were men and 235,741 were women. Their average age was 58.56±9.58 years, calculated over the age range from 40 to 85 years. Importantly, we could extract 1,675 patients with colorectal cancer from the DB using the cancer codes from the KSCDCD (Table 1).
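The counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated above; the constant and function names are illustrative, and the per-100,000 figure is a crude proportion of coded patients in the sample, not an age-standardized rate.

```python
# Counts as reported in the Subject characteristics paragraph.
N_TOTAL = 514_866
N_MEN = 279_125
N_WOMEN = 235_741
N_CRC = 1_675


def per_100k(cases, population):
    """Crude proportion expressed per 100,000 persons."""
    return cases / population * 100_000


def share(part, whole):
    """Fraction of the whole, as a percentage."""
    return part / whole * 100


if __name__ == "__main__":
    print("men + women equals total:", N_MEN + N_WOMEN == N_TOTAL)
    print("men (%):", round(share(N_MEN, N_TOTAL), 1))
    print("colorectal cancer per 100,000:", round(per_100k(N_CRC, N_TOTAL), 1))
```

The sex-specific counts sum to the stated total, and the 1,675 extracted patients correspond to roughly 325 per 100,000 sampled persons.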
Sex-based differences in 12 parameters according to colorectal cancer pathogenesis
Table 2 compares the cancer and noncancer groups by sex. Parameters that required selection of a response number (questionnaire items) were excluded from Table 2. Among the 12 parameters, significant differences (P<0.01) were observed for BMI, BLDS, HMG, SGPT_ALT, WAIST, and AGE in men with cancer. Only BLDS was significantly higher in the men of the cancer group, whereas BP_HIGH, BP_LWST, BLDS, and AGE were higher in the women of the cancer group than in those of the noncancer group (P<0.05 and P<0.01, respectively). BLDS in men with colorectal cancer was almost 5% higher (P<0.01). AGE in women of the cancer group was approximately 12% higher (P<0.01).
Factors affecting colorectal cancer analyzed using logistic regression analysis
Significant factors affecting colorectal cancer depended on sex, except for family-related factors such as FMLY_HPRTS_PATIEN_YN, FMLY_APOP_PATIEN_YN, and FMLY_CANCER_PATIEN_YN. In addition, DRNK_HABIT_RSPS_CD, EXERCI_FREQ_RSPS_CD, and AGE were significant factors (Table 3). The sex-specific significant factors were as follows: BMI (men), BP_LWST (women), BLDS (men), HMG (men), GLY_CD (men), OLIG_OCCU_CD (men), OLIG_PROTE_CD (men), SGPT_ALT (men), HCHK_PMH_CD (women), FMLY_LIVER_DISE_PATIEN_YN (men), SMK_STAT_TYPE_RSPS_CD (men), and WAIST (men) (P<0.05 and P<0.01). Interestingly, protein in urine (OLIG_PROTE_CD) with response number 2 (weakly positive) in men was associated with 2.173 times greater odds of colorectal cancer than in control men (P<0.035; 95% CI, 1.054–4.479). Regarding individual disease history (HCHK_PMH_CD), women with cardiopathy (response number 5) had 22 times greater odds of colorectal cancer than control women (P<0.01; 95% CI, 6.440–75.161). Unexpectedly, BMI (men), SMK_STAT_TYPE_RSPS_CD (men), and WAIST (men) had ORs of 0.895 (95% CI, 0.868–0.922), 1.413 (95% CI, 1.117–1.787), and 0.952 (95% CI, 0.934–0.971), respectively (P<0.01).
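The ORs and CIs reported in this paragraph can be checked for internal consistency. The sketch below assumes the intervals are Wald intervals, that is, symmetric around the OR on the log scale. Under that assumption the geometric midpoint of the CI limits should reproduce the OR, and the CI width gives the standard error, the z statistic, and a two-sided P-value. The function name and the returned keys are illustrative, and whether SAS or SPSS produced Wald intervals here is not stated in the text.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal


def wald_check(odds_ratio, ci_low, ci_high):
    """Recover log-scale quantities implied by a reported OR and 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
    midpoint = math.sqrt(ci_low * ci_high)  # geometric mean of the limits
    z = math.log(odds_ratio) / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return {"se": se, "midpoint": midpoint, "z": z, "p": p_two_sided}


if __name__ == "__main__":
    # Values quoted in the paragraph above.
    for label, or_, lo, hi in [
        ("OLIG_PROTE_CD = 2 (men)", 2.173, 1.054, 4.479),
        ("HCHK_PMH_CD = 5 (women)", 22.0, 6.440, 75.161),
        ("BMI (men)", 0.895, 0.868, 0.922),
    ]:
        print(label, wald_check(or_, lo, hi))
```

For the three examples listed, the geometric midpoints agree with the reported ORs to the stated precision, which suggests the reported intervals are of the Wald type.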
DISCUSSION
The results of this retrospective cohort of patients with colorectal cancer who were aged older than 40 years and registered in the NHISS DB were as follows.
A total of 1,675 patients with colorectal cancer who were traced through C19, C20, C218, D011, and D012 codes were found from the NHISS DB during the study years from 2002 to 2008.
BMI, BLDS, HMG, SGPT_ALT, WAIST, and AGE differed significantly between the men of the cancer group and those of the noncancer group (P<0.01). All values, except those of BLDS, were lower in the men of the cancer group than in those of the noncancer group. However, BP_HIGH, BP_LWST, BLDS, and AGE were higher in the women of the cancer group than in those of the noncancer group (P<0.05 and P<0.01, respectively).
Familial disease-related history (high blood pressure, cerebral apoplexy, and cancer), alcohol consumption habits, exercise frequency, and age differences significantly affected colon carcinogenesis without sex differences (P<0.05). Twelve sex difference-dependent factors, such as waist circumference in men (P<0.01; OR, 0.952; 95% CI, 0.934–0.971) and diastolic pressure in women (P<0.05; OR, 1.050; 95% CI, 1.009–1.092), that affected colorectal cancer were also reported.
Big data analysis to identify parameters related to colorectal cancer incidence
To possibly conquer cancer or at least ameliorate its symptoms, convergent insights from multidisciplinary cancer-related studies are indispensable. Results from direct in vivo and in vitro experiments (Jee et al., 2016) have been important; big data analysis can be added as an allied aid. One study introduced a synergistic double-check of the interrelationships among cancer-regulating genes in colorectal cancer, in which target gene DBs were built from previously published datasets by data mining on the IlluminaGA_miRNASeq platform provided by The Cancer Genome Atlas (https://cancergenome.nih.gov/) and were validated by in vitro experiments. In particular, because Koreans are mandatorily enrolled, the large-scale NHISS DB is well suited to research (i.e., cross-sectional, retrospective, and prospective studies of each individual patient) (Kim et al., 2016). We believe that these big data-related research activities can contribute to a new frontier field committed to national health and welfare.
Individualized exercise interventions to reduce the incidence of colorectal cancer
Colorectal cancer is a feared disease called “quiet cancer” because it metastasizes with no specific symptoms. It occurs when the golden rule for preventing colorectal cancer is defied, even though patients are usually aware of their defiance. The golden rule involves a healthy diet and regular lifestyle. The easiest means to prevent colorectal cancer is to maintain a healthy lifestyle by, for example, engaging in 30-min walking exercises and avoiding excessive intake of red meat. In our findings, the exercise-related results (EXERCI_FREQ_RSPS_CD) were the clearest from the standpoint of the statistical analysis, in that response number 2 (1–2 times/wk) in women and response number 3 (3–4 times/wk) in men were associated with a reduced incidence of colon cancer (P<0.01). In women, the ORs of response numbers 1 and 2 were 0.462 and 0.535, respectively; in men, the ORs of response numbers 1, 2, and 3 were 0.544, 0.496, and 0.466, respectively. These results suggest that designing tailor-made exercise programs using the parameters analyzed in this study is one way to prevent or alleviate the symptoms of colorectal cancer. However, the exercise-related parameter in this study did not capture details such as the intensity or duration of exercise, and these factors appear to be more important for obtaining the maximal effect of exercise intervention. Jee et al. (2016) reported that high rather than moderate intensity of exercise confers a greater effect on various parameters of cancer models. To design an individual exercise program for preventing colorectal cancer, the markers found to be significant, such as SEX, BMI, BP_LWST, BLDS, HMG, SGPT_ALT, and WAIST, should be adjusted for.
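To make the exercise-related ORs above easier to read, the sketch below converts each OR into a percentage change in the odds of colorectal cancer relative to the reference response. The ORs are copied from the preceding paragraph; the dictionary layout and function name are illustrative. A change in odds is not the same as a change in risk, although the two are close when the outcome is rare.

```python
# ORs for EXERCI_FREQ_RSPS_CD by response number, as quoted above.
EXERCISE_OR = {
    "women": {1: 0.462, 2: 0.535},
    "men": {1: 0.544, 2: 0.496, 3: 0.466},
}


def pct_change_in_odds(odds_ratio):
    """Percentage change in odds versus the reference category.

    Negative values mean lower odds than the reference.
    """
    return (odds_ratio - 1.0) * 100.0


if __name__ == "__main__":
    for sex, by_response in EXERCISE_OR.items():
        for response, or_ in sorted(by_response.items()):
            change = pct_change_in_odds(or_)
            print(f"{sex}, response {response}: OR {or_} -> {change:.1f}% odds")
```

Read this way, response number 3 in men (3–4 times/wk) corresponds to odds about 53% lower than the reference, and response number 2 in women (1–2 times/wk) to odds about 47% lower.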
As shown by the unexpected results obtained in this study (e.g., OLIG_OCCU_CD only in men), we also consider that the appropriate response of patients with colorectal cancer depends on sex-based parameters, which suggests that there may be ideal values of the relevant parameters that at least suppress the prevalence of colon cancer (Vainio et al., 2002; Thune and Lund, 1996).
Prescriptive exercise programs can thus be designed around individual, age-, and sex-based differences in terms of the frequency, intensity, time (duration), and type of exercise, because it has been suggested that these differences may reflect different susceptibilities that induce non-identical effects of exercise interventions (Bufill, 1990; Dubrow et al., 1993).
Sex differences and other etiological patterns in logistic regression analysis for colorectal carcinogenesis
Regarding logistic regression analysis, sex-based differences were also shown to a limited extent in this study. In Table 3, significant patterns are shown only for BMI, BLDS, HMG, GLY_CD, OLIG_OCCU_CD, OLIG_PROTE_CD, and SGPT_ALT in men with colorectal cancer; however, only BP_LWST and HCHK_PMH_CD are significantly different in women (P<0.05). Interestingly, a 1.955 times higher risk for colorectal cancer was found in women with a family history of cerebral apoplexy (P<0.05; 95% CI, 1.007–3.796), and 0.435 times the risk for colorectal cancer was found in women with a family history of diabetes. Regarding the relation between cerebral apoplexy and colorectal cancer, some studies have found that bevacizumab (an antitumor angiogenesis drug) increases the risk for cerebral apoplexy as a side effect in patients with colorectal cancer (Lv et al., 2012). Aspirin intake for colorectal cancer increases the risk for developing cerebral hemorrhages (Dehmer et al., 2016). This suggests that patients with colorectal cancer would take aspirin because of secondary diseases, which may induce blood disorders in the brain. Regarding diabetes and colorectal cancer incidence, our results were consistent with those of the study by de Kort et al. (2016). However, results from various studies on the relation between colorectal cancer and family disease histories are discrepant. To pursue the etiological relationship between the incidence of colorectal cancer and the various factors described in this study, such as family disease histories, a longitudinal etiology-related study can be one way to address this issue (Lin et al., 2018). This is also a way of raising the quality of the next study.
The results from the large-scale NHISS DB suggest that sex-based differences in parameters are observed in patients with colorectal cancer aged older than 40 years, except for the factors described above; thus, evidence-based programs (such as reasonable exercise interventions) that reflect sex difference-adjusted parameters can be developed for alleviating colorectal cancer symptoms.
ACKNOWLEDGMENTS
This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2018S1A5B5A02033087).
Notes
CONFLICT OF INTEREST
No potential conflict of interest relevant to this article was reported.