Could Machine Learning-Based Gestational Diabetes Mellitus Prediction Models Replace Traditional Screening Test?

Jong Yun Hwang

doi:10.21896/jkmch.2024.28.4.153

Gestational diabetes mellitus (GDM) is defined as the initial detection of high blood sugar levels during pregnancy. The prevalence of GDM is increasing globally, with the highest rates in the Middle East and North Africa, followed by South-east Asia. In South Korea, the incidence of GDM increased from 8% in 2012 to 11.1% in 2016 (Jung et al., 2021).

The diagnosis and treatment of GDM are important for maternal and child health. The American Diabetes Association, the American College of Obstetricians and Gynecologists, and the Korean Diabetes Association recommend screening for GDM between 24 and 28 weeks of gestation (ACOG Practice Bulletin, 2018).

However, some researchers have reported that women with GDM may experience complications before undergoing scheduled screening tests. Yang et al. (2024) reported that uncontrolled diabetes mellitus (DM) during early pregnancy is associated with an increased risk of miscarriage, and that hyperglycemia in early pregnancy is associated with fetal malformations. Management of pregnant women with GDM diagnosed after 24 weeks of gestation has been less effective in preventing macrosomia, even with aggressive treatment, because fetal weight increases steadily from early pregnancy (Fu & Retnakaran, 2022).

Early diagnosis and early intervention for GDM are necessary to prevent pregnancy complications from the beginning of pregnancy. Predicting GDM in the first trimester is crucial because pregnant women with high-risk factors for hyperglycemia require customized early interventions, such as education on weight control and exercise, as soon as possible.

Models based on traditional logistic regression and nomograms have been reported to predict GDM in the first trimester. However, their clinical usefulness is limited for several reasons.

Machine learning, which has recently been used for disease prediction, is being applied to the development of GDM prediction models (Arain et al., 2023).

In this article, I review the current research trends in GDM prediction models and suggest future directions for their development.

Results of a PubMed Search for Machine Learning-Based GDM Prediction Models

I searched PubMed using the keywords ‘machine learning and gestational diabetes mellitus,’ ‘machine learning and gestational diabetes mellitus,’ ‘machine learning and prediction model and gestational diabetes mellitus,’ and ‘machine learning and prediction model and gestational diabetes mellitus.’

A search using the keywords ‘machine learning and gestational diabetes mellitus’ retrieved 148 articles. The phrase ‘machine learning and gestational diabetes mellitus’ returned 98 papers. The search for ‘machine learning and prediction model and gestational diabetes mellitus’ yielded 103 results, while ‘machine learning and prediction model and gestational diabetes mellitus’ produced 69 results.

Of these, 61 articles relevant to GDM-related prediction models were selected. Among these, 47 articles focused on prediction models for GDM during pregnancy, while 5 articles dealt with predicting complications in pregnant women with GDM. Additionally, 4 articles addressed progression to type 2 DM in the postpartum period, and another 4 articles introduced the development of GDM prediction models.

Of the 47 articles on GDM prediction models in pregnancy, 3 were review articles and 44 were original research articles. I reviewed the original research articles and describe representative studies below.

Introduction to Previous Machine Learning-Based GDM Prediction Models

Of the 47 retrieved articles, the earliest was published in 2017. This study included 33,935 pregnant women from a cohort in a Chinese hospital, of whom 4,378 were selected for analysis. Electronic health records obtained at 22-24 weeks of gestation were analyzed using 5 machine learning methods, and an accuracy of 62.16% and a positive predictive value of 98.4% were reported. This paper presents a GDM prediction model that differs from traditional screening tests. However, it has some limitations: the timing of the test is not meaningfully earlier than that of traditional screening, and its accuracy is relatively low (Qiu et al., 2017). A 2020 Israeli study analyzed the electronic health records of 984,122 women from 2010 to 2017. This study used 2,355 variables selected from the records and reported an area under the receiver operating characteristic curve of 0.85. Its strengths are the use of nationwide data and high predictive performance. However, the large number of variables (2,355) must be reduced before clinical application (Artzi et al., 2020).
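The metrics quoted above measure different things, which is why a model can combine a high positive predictive value with a modest accuracy. The following is a minimal sketch of how these quantities are computed, not the method of any cited study. The function names `screening_metrics` and `auc` and all counts and scores are hypothetical and chosen only for illustration.

```python
def screening_metrics(tp, fp, tn, fn):
    """Compute common screening metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives,
    and false negatives.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,    # share of all predictions that are correct
        "ppv": tp / (tp + fp),            # share of positive predictions that are correct
        "sensitivity": tp / (tp + fn),    # share of true cases that are detected
        "specificity": tn / (tn + fp),    # share of non-cases correctly ruled out
    }


def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen case receives a higher
    score than a randomly chosen non-case (ties count as one half).
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical counts, NOT taken from any cited study: the model rarely
# raises a false alarm (high PPV) but misses many true cases (low accuracy).
m = screening_metrics(tp=50, fp=2, tn=70, fn=78)
print(f"accuracy {m['accuracy']:.2f}, PPV {m['ppv']:.2f}")  # accuracy 0.60, PPV 0.96

# Hypothetical risk scores for women with and without GDM.
print(round(auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.1]), 3))  # 0.917
```

In this toy example, almost every positive prediction is correct, yet more than half of the true cases are missed, so a high positive predictive value alone does not show that a model is suitable for screening.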

Recently, Chinese studies have attempted to predict GDM using machine learning prior to routine screening tests. One study reported higher accuracy than traditional logistic regression models in pregnant women before 15 weeks of gestation (Liu et al., 2021). Another study reported high accuracy using machine learning in the first trimester without a blood test (Wang et al., 2021).

Two studies were conducted in South Korea. (1) Kang et al. (2023) reported a machine learning-based model for predicting GDM in 34,387 pregnant Asian women. At the first visit, the general characteristics of the pregnant women and questionnaire responses were used for GDM prediction. Biochemical markers such as white blood cell count, fasting blood sugar, and cholesterol were used to develop a GDM prediction model before 10 weeks of gestation, and factors such as glycosylated hemoglobin were used to develop a further model at 14-24 weeks of gestation. The number of variables used ranged from 165 to 361 (Kang et al., 2023). (2) Another study developed a GDM prediction model using clinical data prior to 14 weeks of gestation and found that the model performance improved with the addition of nonalcoholic fatty liver disease-associated variables. The machine learning methods used were logistic regression, random forest, support vector machines, and deep neural networks (Lee et al., 2022).

Research Trends

Early diagnosis of GDM is important for both maternal and child health as it allows for early intervention and treatment. There is no consensus on screening for GDM before 24 weeks of gestation because the timing of the screening test must be cost-effective for the population.

However, the development of a low-cost, simple, and convenient test to predict GDM in early pregnancy would greatly benefit maternal and child health. Fortunately, recent efforts to use machine learning for disease prediction are increasing, and this approach is also being applied to GDM. Research trends in GDM prediction models using machine learning have focused on moving predictive testing earlier and reducing the number of variables.
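As an illustration of what reducing the number of variables can mean in practice, the sketch below ranks candidate variables by the strength of their correlation with the GDM outcome and keeps only the strongest few. This simple filter approach is an assumption made for illustration; the reviewed studies do not necessarily use it. The function names, variable names, and toy values are all hypothetical.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists.

    Returns 0.0 when either list is constant, since the correlation is
    undefined in that case.
    """
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (sx * sy)


def top_k_variables(columns, outcome, k):
    """Return the names of the k variables most correlated with the outcome.

    columns maps a variable name to its list of values (one per woman);
    outcome is a list of 0/1 labels (1 = GDM).
    """
    ranked = sorted(
        columns,
        key=lambda name: abs(pearson(columns[name], outcome)),
        reverse=True,
    )
    return ranked[:k]


# Toy data for six hypothetical women; values are invented for illustration.
toy = {
    "fasting_glucose": [4.2, 4.5, 4.4, 5.3, 5.6, 5.8],
    "bmi": [21.0, 27.5, 22.0, 26.0, 23.5, 29.0],
    "shoe_size": [240, 235, 245, 240, 235, 245],
}
gdm = [0, 0, 0, 1, 1, 1]

print(top_k_variables(toy, gdm, k=2))  # ['fasting_glucose', 'bmi']
```

A filter of this kind considers each variable separately, so it can discard variables that are informative only in combination. Published models typically rely on more elaborate selection procedures.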

Expectations for Machine Learning Models to Predict GDM

The incidence of GDM is expected to increase because of the increasing number of older pregnant women in South Korea (Hwang, 2020; Lee et al., 2021). National professional societies have defined high-risk groups for GDM and recommend screening them for GDM as soon as possible. However, the screening test for GDM requires at least 1 hour and is invasive, which may lead clinicians to hesitate and miss the appropriate timing.

A machine learning-based GDM prediction model is less expensive and noninvasive because it uses hospital data acquired during routine antenatal visits. This could be especially useful in developing countries, where access to maternity hospitals is difficult.

However, there are still insufficient studies to determine whether such models can replace existing screening tests. If future studies prove their effectiveness, they could replace time-consuming and invasive screening tests.

Conflict of Interest

The author has nothing to disclose.