The interactions among factors associated with the risk of lung cancer among diabetes patients: a survival tree analysis

Abstract

Past epidemiological studies have reported mixed findings on the association between diabetes and lung cancer. Given the possible links between diabetes, smoking, and respiratory diseases, this study aimed to examine the interaction patterns among factors associated with the risk of lung cancer among diabetes patients. A territory-wide retrospective cohort study was performed using electronic health records from Hong Kong. Patients without a history of cancer who received diabetes care in general outpatient clinics between 2010 and 2019 were included and followed up until December 2019. A conditional inference survival tree was applied to examine the interaction patterns among factors associated with the risk of lung cancer. A total of 385,521 patients were included. During a median follow-up of 6.2 years, 3395 developed lung cancer. Age emerged as the primary factor in differentiating the risk of lung cancer. Conditional on age (≤64 vs >64 years), smoking appeared as the next dominant risk factor within each subpopulation. Among old smokers aged >64 years characterized by a long duration of diabetes (median: 6–8 years), chronic obstructive pulmonary disease (COPD) emerged as a key risk factor. Six distinct subgroups of diabetes patients with different levels of lung cancer risk were identified according to age, smoking, metformin use, and COPD status. The findings suggest that age, smoking, and COPD interact in shaping the risk of lung cancer among diabetes patients, providing targets for public health interventions.

Introduction

Lung cancer is the most frequently diagnosed and deadliest cancer worldwide [1]. The major etiological factor is tobacco smoking. While diabetes has been linked to several cancers [2,3], its association with lung cancer remains inconclusive [2–6].

Nevertheless, pathophysiological evidence suggests that in diabetes the lung could be a target organ of microangiopathy, potentially increasing susceptibility to respiratory infections and pulmonary dysfunction [7]. Furthermore, in addition to diabetes, some factors associated with lung cancer, such as smoking and chronic lung diseases [8], may jointly contribute to intensified oxidative stress, chronic inflammation, and a decline in pulmonary function that is faster than normal biological aging [9–11], potentially elevating the risk of lung cancer.

While biological evidence suggests intricate links among factors potentially associated with the risk of lung cancer, epidemiological studies examining how different factors interact to influence that risk among diabetes patients are lacking. The conflicting overall associations between diabetes and lung cancer found in previous studies [2–6] could be due to links that differ by smoking [12] or lung disease status [13]. Moreover, while biological aging is a risk factor for many cancers, it remains unclear how diabetes, together with the possibly weakened pulmonary function that accompanies it [7], may influence the risk of lung cancer across different age groups and by smoking status. Furthermore, it remains uncertain whether pulmonary diseases influence the risk of lung cancer individually [14] or collectively [7,13] in the presence of other conditions.

The conditional inference survival tree [15] is a tree-structured (recursive partitioning) algorithm whose partitioning procedure is grounded in statistical theory. Tree-structured algorithms can (i) capture interaction patterns among covariates [15,16]; and (ii) provide an intuitive basis for interpretation. Compared with other tree-structured algorithms, the conditional inference survival tree has the advantages of (i) incorporating a theoretical framework; (ii) preventing overfitting; (iii) minimizing selection bias towards covariates with many possible values; and (iv) not requiring explicit pruning.

To address this gap, this study examines the interaction patterns among factors potentially associated with the risk of lung cancer among diabetes patients using a survival tree analysis.

Methods

Study design and study population

This is a territory-wide retrospective cohort study based on electronic health records from Hong Kong’s public healthcare system. The Hospital Authority (HA) is a statutory body responsible for managing 43 hospitals, 49 specialist outpatient clinics and 74 general outpatient clinics. The HA systematically stores records on patients’ demographics, disease diagnoses, prescriptions, laboratory results, inpatient admissions and outpatient attendances in a centralized data repository. Disease diagnoses were coded with the International Classification of Diseases, 9th or 10th revision (ICD-9 or ICD-10) or the International Classification of Primary Care 2nd edition (ICPC-2). Data were accessed via the HA Data Collaboration Lab. Ethics approval for secondary data analysis was provided by the Chinese University of Hong Kong – Survey and Behavioural Research Ethics Committee (reference number: SBRE-22-0386).

Patients

Patients who were diagnosed with diabetes and received a first diabetes complication screening assessment at any of the general outpatient clinics between 2010 and 2019 were initially included. The index date was defined as the date of the first assessment. Those who (i) were diagnosed with non-type 2 diabetes; (ii) had missing information on the time of diabetes diagnosis; (iii) received a diabetes diagnosis below the age of 18 years; (iv) had a history of malignancy; or (v) had a follow-up period of less than six months were excluded. Patients were followed up until a lung cancer diagnosis, death, or December 2019, whichever occurred first.
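The follow-up rule above (censoring at the earliest of lung cancer diagnosis, death, or the end of the study, and excluding patients with under six months of follow-up) can be sketched in Python. This is an illustrative sketch, not the authors' code: the function name, the choice of 31 December 2019 as the end date, and the approximation of six months as 183 days are all assumptions.

```python
from datetime import date
from typing import Optional, Tuple

STUDY_END = date(2019, 12, 31)  # assumption: "December 2019" read as 31 December
MIN_FOLLOW_UP_DAYS = 183        # assumption: "six months" approximated as 183 days


def follow_up(index_date: date,
              lung_cancer_date: Optional[date] = None,
              death_date: Optional[date] = None) -> Optional[Tuple[int, bool]]:
    """Return (follow-up days, lung cancer event flag), or None if excluded."""
    # Follow-up ends at whichever of the three dates comes first.
    candidates = [STUDY_END]
    if lung_cancer_date is not None:
        candidates.append(lung_cancer_date)
    if death_date is not None:
        candidates.append(death_date)
    end = min(candidates)

    days = (end - index_date).days
    if days < MIN_FOLLOW_UP_DAYS:
        return None  # excluded: follow-up shorter than six months

    # The outcome counts only if lung cancer is what ended follow-up.
    event = lung_cancer_date is not None and lung_cancer_date == end
    return days, event


# Hypothetical patients, for illustration only.
print(follow_up(date(2012, 1, 1), lung_cancer_date=date(2015, 1, 1)))
print(follow_up(date(2015, 1, 1), death_date=date(2016, 1, 1)))
print(follow_up(date(2019, 9, 1)))
```

Patients who die or reach the end of the study without a lung cancer diagnosis are treated as censored, which is the input a survival tree or Cox model expects.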

Outcome

The outcome of interest was diagnosis of lung cancer (ICD-9: 162; ICD-10: C33-34) during follow-up.

Covariates

Information on input variables was ascertained at the first assessment. Candidate split covariates included demographics (age and sex), disease history, medication use, behavioral factors, and anthropometric and laboratory measurements. Disease history included duration of diabetes, family history of diabetes, lung diseases (chronic obstructive pulmonary disease [COPD] [8] and pneumonia) [17], and common comorbidities (ischemic heart disease, cerebrovascular disease, heart failure, hypertension, chronic kidney disease, and liver cirrhosis). Medication use included commonly used anti-diabetic drugs (metformin, sulfonylurea, insulin, and dipeptidyl peptidase-4 inhibitors), aspirin, non-steroidal anti-inflammatory drugs, anti-coagulants, anti-platelets, anti-hypertensive drugs, and statins. Medication use was defined as whether a patient had been prescribed a drug at the time of assessment. Behavioral factors included smoking and alcohol use. Anthropometric measurements included body mass index and waist-to-hip ratio. Laboratory measurements included HbA1c, fasting glucose, low-density lipoprotein cholesterol, high-density lipoprotein cholesterol, triglycerides, and serum creatinine. Laboratory values were taken from the result closest to the assessment date, within one year of it.

Data analysis

A conditional inference survival tree [15] was applied to examine the interaction patterns among factors associated with the risk of lung cancer. At each split, a global null hypothesis of independence between the set of covariates and the outcome was tested at a pre-determined α level. If it was rejected, a set of partial null hypotheses of independence between each covariate and the outcome was tested at the same α level. The covariate with the strongest association (smallest Bonferroni-corrected p-value) was then chosen as the split variable. The algorithm partitioned recursively until the global null hypothesis could no longer be rejected. The α level and the maximum depth of the tree were set at 0.01 and 4, respectively. For continuous split variables, the cutoff value was selected to maximize the difference in survival outcomes between the two resulting groups. Each path from the root node to a terminal node represented an interaction pattern [16], in which the effect of a split variable was conditional on the split variables selected at its ancestor nodes. At the terminal nodes, patients were separated into mutually exclusive subgroups with the most homogeneous within-group survival outcomes. The cumulative lung cancer incidence during follow-up was examined graphically across the identified subgroups. Model performance was assessed using the area under the curve (AUC). In post-hoc analyses, Cox proportional hazards regression was applied to examine the association between each identified important factor and the risk of lung cancer.
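The split procedure described above (test each covariate, apply a Bonferroni correction, split on the covariate with the smallest corrected p-value if it falls below α, then search for a cutoff) can be sketched in plain Python. This is a simplified sketch under stated assumptions, not the authors' implementation: it substitutes a two-sample log-rank test for the permutation-test statistics of the conditional inference framework, handles only binary covariates in the selection step, and all function names and the toy data are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence


def logrank_chi2(times: Sequence[float], events: Sequence[int],
                 in_group: Sequence[bool]) -> float:
    """Two-sample log-rank chi-square statistic (1 degree of freedom)."""
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        n = sum(1 for u in times if u >= t)                       # at risk
        n_g = sum(1 for u, g in zip(times, in_group) if u >= t and g)
        d = sum(1 for u, e in zip(times, events) if u == t and e)  # events at t
        d_g = sum(1 for u, e, g in zip(times, events, in_group)
                  if u == t and e and g)
        p = n_g / n
        obs_minus_exp += d_g - d * p
        if n > 1:
            variance += d * p * (1 - p) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def p_value_1df(chi2: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def select_split_variable(times: Sequence[float], events: Sequence[int],
                          covariates: Dict[str, Sequence[bool]],
                          alpha: float = 0.01) -> Optional[str]:
    """Return the covariate with the smallest Bonferroni-corrected p-value,
    or None (stop splitting) if no corrected p-value is below alpha."""
    m = len(covariates)
    adjusted = {name: min(1.0, m * p_value_1df(logrank_chi2(times, events, flags)))
                for name, flags in covariates.items()}
    best = min(adjusted, key=adjusted.get)
    return best if adjusted[best] < alpha else None


def best_cutoff(times: Sequence[float], events: Sequence[int],
                values: Sequence[float]) -> float:
    """Cutoff c maximizing the log-rank statistic of (value <= c) vs (value > c)."""
    candidates = sorted(set(values))[:-1]  # the largest value would empty one side
    return max(candidates,
               key=lambda c: logrank_chi2(times, events, [v <= c for v in values]))


# Toy data (hypothetical): ten subjects with events, ten censored at time 20.
times = list(range(1, 11)) + [20] * 10
events = [1] * 10 + [0] * 10
smoker = [True] * 10 + [False] * 10
noise = [i % 2 == 0 for i in range(20)]
print(select_split_variable(times, events, {"smoker": smoker, "noise": noise}))
```

Selecting the variable first and searching for the cutoff only afterwards mirrors the two-step design that the conditional inference framework uses to limit selection bias towards covariates with many possible split points.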

Results

Of the 385,521 patients included, 3395 developed lung cancer during a median follow-up of 6.2 years. The incidence rates among smokers and non-smokers were 2.93 and 1.00 per 1000 person-years, respectively. In the tree model, age emerged as the primary factor in differentiating the risk of lung cancer. Conditional on age (≤64 vs >64 years), smoking appeared as the next predominant risk factor within each age-specific subpopulation. Among old smokers aged >64 years who took metformin and had a longer duration of diabetes (median: 6–8 years), COPD emerged as an important risk factor. Six distinct subgroups of patients were identified: young never smokers, young ever smokers, old never smokers, old ever smokers without metformin use, old ever smokers with metformin use but without COPD, and old ever smokers with metformin use and COPD (Fig. 1; Table 1).
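The six terminal nodes described above can be written as a short decision rule. The following Python sketch is illustrative only: the function name, argument names, and label strings are assumptions, and only the 64-year cutoff and the order of the splits (age, then smoking, then metformin use, then COPD) come from the text.

```python
def assign_subgroup(age: float, ever_smoker: bool,
                    metformin: bool, copd: bool) -> str:
    """Map a patient to one of the six subgroups reported by the tree model."""
    if age <= 64:
        # Young patients are split by smoking status only.
        return "young ever smoker" if ever_smoker else "young never smoker"
    if not ever_smoker:
        return "old never smoker"
    if not metformin:
        return "old ever smoker without metformin use"
    # COPD differentiates risk only among old ever smokers on metformin.
    if copd:
        return "old ever smoker with metformin use and COPD"
    return "old ever smoker with metformin use, without COPD"


# Hypothetical patient, for illustration only.
print(assign_subgroup(age=70, ever_smoker=True, metformin=True, copd=True))
```

Each return statement corresponds to one root-to-leaf path, so the effect of COPD is conditional on the three splits above it.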

Fig. 1: Survival tree diagram for lung cancer incidence among diabetes patients.

COPD chronic obstructive pulmonary disease.

Table 1 Characteristics of distinct subgroups of diabetes patients.

Age and smoking

An age of 64 years was identified as the primary cutoff differentiating the risk of lung cancer in the overall diabetes population. In both the old (>64 years) and young (≤64 years) subpopulations, smoking emerged as the dominant risk factor for lung cancer. Among old and young patients, ever smokers were 3.12 and 3.24 times as likely to develop lung cancer, respectively, as never smokers, controlling for age, sex, and duration of diabetes (Table 2).
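For readers less familiar with adjusted hazard ratios, the comparison above can be read against the standard Cox proportional hazards form. The source does not print its model equation, so the notation below is an assumption for illustration, with smoking as the exposure and age, sex, and duration of diabetes as the adjustment covariates named in the text.

```latex
h(t \mid \mathbf{x}) = h_0(t)\,
  \exp\!\left(\beta_{\text{smoke}}\,x_{\text{smoke}}
            + \beta_{\text{age}}\,x_{\text{age}}
            + \beta_{\text{sex}}\,x_{\text{sex}}
            + \beta_{\text{dur}}\,x_{\text{dur}}\right),
\qquad
\mathrm{HR}_{\text{smoke}} = \exp\!\left(\beta_{\text{smoke}}\right)
```

Under this form, a hazard ratio of 3.12 corresponds to a coefficient of ln 3.12 ≈ 1.14 for ever smoking, with the other covariates held fixed.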

Table 2 Significant differences in adjusted hazard ratios of selected split variables between sibling or comparison nodes.

Age, smoking, metformin use, and COPD

Among old ever smokers, the presence of COPD emerged as the key factor differentiating the risk of lung cancer in metformin users (Fig. 1). Those with COPD were 2.09 times as likely to develop lung cancer as those without COPD, adjusting for age, sex, and duration of diabetes (Table 2). Results remained similar when HbA1c and fasting glucose were additionally controlled for. Old ever smokers who took metformin and had COPD had the highest risk of developing lung cancer (Table 2; Fig. 2). They were also more likely to have a history of pneumonia than other old ever smokers (33 vs 5–8%) (Table 1).

Fig. 2: Cumulative lung cancer incidence across distinct subgroups of diabetes patients.

COPD chronic obstructive pulmonary disease.

Among old ever smokers, metformin users tended to have a longer duration of diabetes (median: 6–8 vs 3 years) and were also more likely to be prescribed sulfonylurea (52–58 vs 12%) than metformin non-users (Table 1). Nevertheless, metformin use on its own did not appear to be associated with the risk of lung cancer (Supplementary Table S1).

Overall, compared with young never smokers, old never smokers, young ever smokers, old ever smokers without metformin use/COPD, and old ever smokers with metformin use and COPD were 1.30, 2.83, 4.10–4.11, and 8.38 times as likely to develop lung cancer, respectively, controlling for age, sex, duration of diabetes, HbA1c, and fasting glucose (Table 2).

Model performance

The AUCs of the tree model at 2, 5, and 7 years were 0.764 (95% CI: 0.763–0.765), 0.743 (95% CI: 0.742–0.744), and 0.735 (95% CI: 0.734–0.736), respectively.

Discussion

The present study revealed that, although tobacco smoking is a well-established risk factor for lung cancer, age, smoking, and COPD may interact to differentiate the risk of lung cancer in diabetes. The tree model identified 64 years as the optimal age cutoff for the primary split in lung cancer risk in the study's diabetes population. Among old ever smokers characterized by a longer history of diabetes (median: 6–8 years), the presence of COPD emerged as a key risk factor for lung cancer. Young never smokers, old never smokers, young ever smokers, old ever smokers without metformin use/COPD, and old ever smokers with metformin use and COPD showed a gradient of increasing lung cancer risk.

In the current study, age, smoking, and COPD exhibited an interaction pattern in the risk of lung cancer among patients with diabetes. Aging, smoking, diabetes, and COPD are intricately linked and may collectively influence the risk of lung cancer among diabetes patients. Diabetes [18] and COPD [19,20] are both age-related diseases [18–20] characterized by chronic low-grade inflammation [18,20] and cellular senescence [18,19,21]. Prior research has shown that COPD, but not asthma, is associated with an elevated risk of diabetes, potentially because the two diseases share oxidative stress, systemic inflammation mechanisms, and cytokine profiles [22]. Elevated levels of proinflammatory factors in COPD may promote insulin resistance over time [22,23]. On the other hand, while smoking exposure is a major risk factor for COPD, an estimated one-third of patients with COPD are never smokers [24]. Previous research has demonstrated that COPD is a risk factor for lung cancer regardless of smoking status [25]. Nevertheless, smoking exposure may induce additional damage to the lungs by intensifying oxidative stress and triggering systemic inflammation [10]. Moreover, the repair of injured lungs may cause scar formation [10]. Smoking and COPD may both accelerate the decline in lung function beyond the normal physiological aging process [11]. In addition, cumulative exposure to carcinogens in tobacco smoke may increase with age. Furthermore, under chronic hyperglycemia, the lungs may suffer further injury due to diabetic microangiopathy [7]. As a result, in addition to direct exposure to carcinogens in tobacco smoke, the co-existence of accelerated decline in lung function and systemic inflammation under smoking exposure, chronic lung disease, and diabetes [10,22] may collectively accelerate lung carcinogenesis.

The present study has some potential public health implications. While the overall association between diabetes and lung cancer remains controversial in the literature [2–6], biological aging, smoking, and COPD may collectively promote chronic inflammation and accelerate the decline in lung function beyond normal aging in diabetes [11]. Prior research suggests that improved metabolic health may delay the progression of chronic lung diseases such as COPD [23]. In addition to preventing tobacco use, improving metabolic health may slow the deterioration of lung function in the presence of chronic lung diseases [23], potentially lowering the risk of developing lung cancer in diabetes.

The current study has several limitations. First, information on cumulative exposure to active smoking was not available, so dose-response effects of smoking were not evaluated. Second, smoking information was self-reported and prone to social desirability bias. Third, information on COPD severity was not available. Fourth, glycemic levels were measured only at baseline; subsequent changes were not captured. Fifth, the dosage and duration of medication use were not evaluated. Sixth, information on occupational and environmental exposures to potentially carcinogenic agents was not available. Seventh, histological subtypes of lung cancer were not differentiated. Lastly, the dominant factors and the optimal age cutoff may vary across populations. Further studies are warranted to verify the generalizability of the findings in other populations.

Conclusions

This study suggests that age, smoking, and COPD interact in shaping the risk of lung cancer among diabetes patients. While tobacco smoking is a known risk factor for lung cancer, the concurrent presence of smoking, COPD, and diabetes against the background of biological aging may exhibit an interaction pattern in the risk of lung cancer. The findings may help identify target groups for public health prevention strategies, and provide evidence for the importance of tobacco prevention and chronic lung disease management in attenuating the risk of lung cancer under age-associated conditions.

Data availability

The data are not available for sharing because of access restrictions.

Code availability

The underlying code for this study is not publicly available but may be made available to qualified researchers on reasonable request from the corresponding author.

References

1. Bray, F. et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 74, 229–263 (2024).
2. Giovannucci, E. et al. Diabetes and cancer: a consensus report. Diabetes Care 33, 1674–1685 (2010).
3. Tsilidis, K. K., Kasimis, J. C., Lopez, D. S., Ntzani, E. E. & Ioannidis, J. P. Type 2 diabetes and cancer: umbrella review of meta-analyses of observational studies. BMJ 350, g7607 (2015).
4. Lee, J. Y., Jeon, I., Lee, J. M., Yoon, J. M. & Park, S. M. Diabetes mellitus as an independent risk factor for lung cancer: a meta-analysis of observational studies. Eur J Cancer 49, 2411–2423 (2013).
5. Liu, J., Wang, R., Tan, S., Zhao, X. & Hou, A. Association between insulin resistance, metabolic syndrome and its components and lung cancer: a systematic review and meta-analysis. Diabetol Metab Syndr 16, 63 (2024).
6. Hua, J. et al. Associations of glycosylated hemoglobin, pre-diabetes, and type 2 diabetes with incident lung cancer: a large prospective cohort study. Diabetes Metab Syndr 18, 102968 (2024).
7. Zheng, H., Wu, J., Jin, Z. & Yan, L. J. Potential biochemical mechanisms of lung injury in diabetes. Aging Dis 8, 7–16 (2017).
8. Malhotra, J., Malvezzi, M., Negri, E., La Vecchia, C. & Boffetta, P. Risk factors for lung cancer worldwide. Eur Respir J 48, 889–902 (2016).
9. Hecht, S. S. Lung carcinogenesis by tobacco smoke. Int J Cancer 131, 2724–2732 (2012).
10. Agustí, A. & Hogg, J. C. Update on the pathogenesis of chronic obstructive pulmonary disease. N Engl J Med 381, 1248–1256 (2019).
11. Agustí, A. & Faner, R. Lung function trajectories in health and disease. Lancet Respir Med 7, 358–364 (2019).
12. Park, H. J., Joh, H. K., Choi, S. & Park, S. M. Type 2 diabetes mellitus does not increase the risk of lung cancer among never-smokers: a nationwide cohort study. Transl Lung Cancer Res 8, 1073–1077 (2019).
13. Kim, N. E., Kang, E. H., Ha, E., Lee, J. Y. & Lee, J. H. Association of type 2 diabetes mellitus with lung cancer in patients with chronic obstructive pulmonary disease. Front Med (Lausanne) 10, 1118863 (2023).
14. Zhao, G. et al. Prevalence of lung cancer in chronic obstructive pulmonary disease: a systematic review and meta-analysis. Front Oncol 12, 947981 (2022).
15. Hothorn, T., Hornik, K. & Zeileis, A. Unbiased recursive partitioning: a conditional inference framework. J Comput Graph Stat 15, 651–674 (2012).
16. Ramezankhani, A., Tohidi, M., Azizi, F. & Hadaegh, F. Application of survival tree analysis for exploration of potential interactions between predictors of incident chronic kidney disease: a 15-year follow-up study. J Transl Med 15, 240 (2017).
17. Ang, L., Ghosh, P. & Seow, W. J. Association between previous lung diseases and lung cancer risk: a systematic review and meta-analysis. Carcinogenesis 42, 1461–1474 (2021).
18. Guo, J. et al. Aging and aging-related diseases: from molecular mechanisms to interventions and treatments. Signal Transduct Target Ther 7, 391 (2022).
19. Mercado, N., Ito, K. & Barnes, P. J. Accelerated ageing of the lung in COPD: new concepts. Thorax 70, 482–489 (2015).
20. Shaykhiev, R. & Crystal, R. G. Innate immunity and chronic obstructive pulmonary disease: a mini-review. Gerontology 59, 481–489 (2013).
21. Palmer, A. K., Gustafson, B., Kirkland, J. L. & Smith, U. Cellular senescence: at the nexus between ageing and diabetes. Diabetologia 62, 1835–1841 (2019).
22. Rana, J. S. et al. Chronic obstructive pulmonary disease, asthma, and risk of type 2 diabetes in women. Diabetes Care 27, 2478–2484 (2004).
23. Papaioannou, O. et al. Metabolic disorders in chronic lung diseases. Front Med (Lausanne) 4, 246 (2018).
24. Salvi, S. S. & Barnes, P. J. Chronic obstructive pulmonary disease in non-smokers. Lancet 374, 733–743 (2009).
25. Park, H. Y. et al. Chronic obstructive pulmonary disease and lung cancer incidence in never smokers: a cohort study. Thorax 75, 506–509 (2020).

Author information

Authors and Affiliations

JC School of Public Health and Primary Care, The Chinese University of Hong Kong, Hong Kong SAR, China

Sarah Tsz Yui Yau, Chi Tim Hung, Eman Yee Man Leung, Albert Lee & Eng Kiong Yeoh

Contributions

Conceptualization & Methodology, S.T.Y.Y., E.Y.M.L. and E.K.Y.; Data Curation & Formal Analysis, S.T.Y.Y.; Writing – Original Draft Preparation, S.T.Y.Y.; Writing – Review & Editing, C.T.H., E.Y.M.L., A.L., and E.K.Y.; Supervision, C.T.H. and E.K.Y. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Eman Yee Man Leung or Eng Kiong Yeoh.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Tables

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Yau, S.T.Y., Hung, C.T., Leung, E.Y.M. et al. The interactions among factors associated with the risk of lung cancer among diabetes patients: a survival tree analysis. npj Prim. Care Respir. Med. 35, 20 (2025). https://doi.org/10.1038/s41533-025-00417-x

Received: 07 November 2024

Accepted: 26 February 2025

Published: 30 March 2025

DOI: https://doi.org/10.1038/s41533-025-00417-x
