Introduction
Acute myeloid leukemia (AML) is a highly heterogeneous disease whose subtypes differ in genetic profile, clinical features, and prognosis [1, 2]. It arises from the clonal expansion of myeloid precursors that exhibit impaired differentiation. This condition results from the accumulation of genetic and epigenetic alterations in hematopoietic stem cells, which leads to dysregulated gene expression and chromosomal abnormalities [1].
Despite advances in treatment protocols, the overall five-year survival rate for AML is estimated at approximately 30% [2]. This rate has not improved significantly, owing to factors such as high relapse rates and resistance to standard chemotherapy. These persistent challenges highlight the urgent need for novel therapeutic strategies that address the complex nature of the disease and support a more individualized treatment approach for AML patients.
Treatment decisions and patient management depend on identifying the key genes and pathways associated with cancer prognosis [3]. The widespread use of high-throughput sequencing technology has led to an accumulation of gene signature studies and to large-scale, reusable gene expression profiles that are publicly accessible. Because many prognostic studies draw on the same public dataset sources, such as The Cancer Genome Atlas (TCGA), they have repeatedly identified the same linked genes (or genes with similar functions) [4].
Resistant and relapsed disease remain the most common causes of treatment failure in both pediatric and adult AML. One of the primary reasons for AML relapse is the persistence of leukemic stem cells (LSCs) [5], which are also associated with drug resistance. Gene expression signatures of LSCs have therefore been identified and quantified to enhance risk classification and predict prognosis in AML. The 17-gene stemness score (LSC17) [6] and the six-gene LSC score in pediatric AML (pLSC6) [7] have the potential to redefine initial risk stratification and identify low-risk AML, which will aid in the development of new treatment strategies.
Pyroptosis is an inflammatory form of programmed cell death mediated by inflammatory caspases, which are activated by inflammasomes [8]. A six-gene pyroptosis-related signature has been closely linked to the prognosis of AML [8]. Additionally, the following gene signatures have been reported: the IED172 (172-gene immune effector dysfunction) signature [9], the PS29MRC signature [10], a 24-gene signature [11], the IDO1 (immune-related gene) signature [12], an autophagy signature [13], a 7-gene signature [14], the IRG (immune-related gene) signature [15], a hypoxia signature [16], and a CXCR signature [17].
The literature-based strategy, which has gained popularity as an alternative to the data-driven approach, identifies and validates predictive biomarkers using prior research papers or databases [4]. In this study, we used a literature-based methodology with penalized Cox regression to develop a gene signature for AML prognosis and demonstrated its robustness. To build the gene signature model, we used the previously published Tumor Online Prognostic Analysis Platform (ToPP), a validated tool that is available through a web interface or as code and that supports reproducibility [18].
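For reference, penalized (LASSO) Cox regression is usually written as the maximization of an L1-penalized partial log-likelihood. The formulation below is the standard textbook form, given as a sketch; the notation is assumed for illustration (x_i is the expression vector of patient i, delta_i the event indicator, R(t_i) the set of patients still at risk at time t_i, lambda the penalty weight), and tied event times are ignored.

```latex
\hat{\beta}
  = \arg\max_{\beta}
    \left[
      \sum_{i=1}^{n} \delta_i
        \left(
          x_i^{\top}\beta
          - \log \sum_{j \in R(t_i)} \exp\!\left(x_j^{\top}\beta\right)
        \right)
      - \lambda \sum_{k=1}^{p} \lvert \beta_k \rvert
    \right]
```

Genes whose coefficients are shrunk to exactly zero drop out of the model, which is how a large candidate set can be reduced to a short signature.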
Discussion
We developed a predictive gene signature, termed LBS6, and an associated risk score using a LASSO-Cox regression model based on a literature-driven approach in the TCGA-LAML dataset. We validated LBS6 in two independent datasets: TARGET-AML and BeatAML. Our analysis drew on a curated set of 300 well-validated genes reported in the literature, including immune- and cell-death–related genes, across multiple AML datasets. ToPP was used for prognostic analysis and model construction [18].
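How a signature-based risk score of this kind is typically computed can be sketched as follows. This is a minimal illustration, not the fitted model: the coefficients, the toy expression values, and the median cutoff are all assumptions, since the LBS6 weights and the cutoff actually used are not given in this section.

```python
from statistics import median

# Hypothetical coefficients, for illustration only (not the fitted LBS6 weights).
EXAMPLE_COEFS = {
    "ETFB": 0.30,
    "ARL6IP5": 0.25,
    "PTP4A3": 0.20,
    "CSK": 0.35,
    "HS3ST3B1": 0.15,
    "PLA2G4A": 0.10,
}


def risk_score(expression, coefs=EXAMPLE_COEFS):
    """Return the linear predictor: sum of coefficient * expression per gene."""
    return sum(weight * expression[gene] for gene, weight in coefs.items())


def stratify(scores, cutoff=None):
    """Label patients 'high' or 'low'; the default cutoff is the cohort median (assumed)."""
    if cutoff is None:
        cutoff = median(scores.values())
    return {pid: "high" if score > cutoff else "low" for pid, score in scores.items()}


# Toy cohort with made-up, normalized expression values.
cohort = {
    "P1": {gene: 1.0 for gene in EXAMPLE_COEFS},
    "P2": {gene: 2.0 for gene in EXAMPLE_COEFS},
    "P3": {gene: 3.0 for gene in EXAMPLE_COEFS},
}
scores = {pid: risk_score(expr) for pid, expr in cohort.items()}
labels = stratify(scores)
print(labels)
```

With a median cutoff, a patient whose score equals the median falls in the low-risk group; a different threshold can be passed explicitly through `cutoff`.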
Notably, the risk score derived from the LBS6 gene signature distinguished survival outcomes in the AML patient cohorts. The LBS6 gene signature, which includes ETFB, ARL6IP5, PTP4A3, CSK, HS3ST3B1, and PLA2G4A, was significantly associated with lower OS rates in the TCGA dataset (HR, 4.2; 95% CI, 2.59 to 6.81; p < 0.0001), as well as in two independent datasets: BeatAML (HR, 1.52; 95% CI, 1.17 to 1.96; p = 0.0013) and TARGET (HR, 2.04; 95% CI, 1.39 to 3.08; p < 0.001).
Compared with the well-established LSC17 gene signature, our six-gene signature showed a similar difference in survival between the high-risk and low-risk groups, with an HR of 4.2 (95% CI, 2.59 to 6.81) for LBS6 and an HR of 4.67 (95% CI, 2.82 to 7.76) for LSC17 (S12 Fig.). This indicates that the LBS6 gene signature performs comparably to LSC17 and that a risk score derived from only six genes is a reliable predictor of AML outcomes.
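The reported hazard ratios and confidence intervals can be checked for internal consistency with a few lines of arithmetic. The sketch below assumes that each interval is a Wald interval, symmetric on the log scale; that assumption, the helper name `wald_from_ci`, and the dictionary labels are introduced here for illustration and are not part of the original analysis.

```python
import math


def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover SE of log(HR), the Wald z statistic, and a two-sided p-value.

    Assumes the 95% CI is symmetric around log(HR) (Wald interval).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided standard normal tail
    return se, z, p


# HR and 95% CI values as reported in the text.
reported = {
    "LBS6, TCGA": (4.2, 2.59, 6.81),
    "LBS6, BeatAML": (1.52, 1.17, 1.96),
    "LBS6, TARGET": (2.04, 1.39, 3.08),
    "LSC17 (comparison)": (4.67, 2.82, 7.76),
}
results = {name: wald_from_ci(*values) for name, values in reported.items()}
for name, (se, z, p) in results.items():
    print(f"{name}: SE(log HR)={se:.3f}, z={z:.2f}, p={p:.2g}")
```

Under this assumption the recovered p-values fall in the same range as the reported ones; small differences are expected from rounding of the published intervals and from the test actually used. The LBS6 and LSC17 intervals overlap almost entirely, which is in line with the statement that the two signatures perform comparably.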
The literature-based LBS6 gene signature performed consistently across the AML cohorts, despite heterogeneity in therapies, age, and gene mutations among the patients. The LBS6 signature emerged as a powerful prognostic tool in our study, showing independent significance in multivariate analyses. Notably, it retained predictive power for both OS and EFS across diverse AML patient cohorts, even after adjusting for well-established risk factors, including WBC count and the mutation status of key genes such as CEBPA, WT1, FLT3, and NPM1. That the LBS6 signature maintains its prognostic relevance alongside these established factors underscores its potential to enhance AML risk stratification systems.
Among the six genes in the LBS6 signature, CSK (c-Src, C-terminal Src kinase), a regulator of Src family kinases, has emerged as a critical component linked to adverse clinical outcomes. High expression levels of CSK have been consistently associated with negative prognostic factors, such as advanced age, elevated WBC counts, and poor cytogenetics, underscoring its influence within the immune microenvironment [15]. Furthermore, the role of CSK in regulating Src family kinases, which are crucial for T-cell activation, aligns with its association with negative outcomes in AML, where pathways such as Akt/mammalian target of rapamycin are affected by alterations in c-Src activity [22]. This analysis of immune-related gene expression may reflect the significance of the tumor microenvironment in disease progression and patient outcomes in AML [12, 15].
Four of the six genes in the LBS6 signature (ETFB, ARL6IP5, PTP4A3, and PLA2G4A) belong to the 24-gene prognostic signature [11], which was derived from a meta-analysis of Cox regression values across multiple training sets. Interestingly, ARL6IP5, PTP4A3, and PLA2G4A showed frequent interactions among the 24 genes within this signature [11].
Previous studies have indicated that ETFB is involved in mitochondrial function and energy metabolism, which are critical processes in cancer cell survival and proliferation [11]. Alterations in genes such as ETFB can lead to the disruption of host-cell homeostasis.
PTP4A3 (protein tyrosine phosphatase type IVA, member 3), also known as PRL-3, is a well-known oncogene that promotes cancer cell migration, invasion, and metastasis. High expression levels of PTP4A3 are associated with poor prognosis in various cancers, including AML [23]. Studies have shown that PTP4A3 is significantly upregulated in AML and correlates with adverse outcomes [23].
ARL6IP5 (ADP ribosylation factor like GTPase 6 interacting protein 5) is involved in various cellular processes, including intracellular transport [24]. ARL6IP5 expression has been correlated with chemotherapeutic response [24]. The 24-gene signature, which like our LBS6 includes ARL6IP5, was associated with poor prognosis in AML, suggesting that this gene contributes to the aggressive behavior of leukemia cells [11].
PLA2G4A (phospholipase A2 group IVA) has emerged as a significant gene in the progression and prognosis of AML. PLA2G4A serves as a potential biomarker for predicting shorter OS in patients with non-M3/NPM1 wild-type AML [25]. Additionally, PLA2G4A may function as a prognostic marker and a potential therapeutic target for specific subtypes of AML [26].
The PLA2G4A gene in our LBS6 signature is associated with necroptosis [27]. While our LBS6 signature included PLA2G4A, it did not contain any other genes linked to various forms of regulated cell death, such as necroptosis, pyroptosis, apoptosis, cuproptosis, and ferroptosis. Recent studies integrating lipid metabolism with immune-related genes have proposed new prognostic classifications for AML, highlighting the pivotal role of PLA2G4A in both metabolic and immune processes [28]. The collective evidence underscores the significance of PLA2G4A in the pathogenesis of AML, as well as its potential as a biomarker and therapeutic target.
HS3ST3B1 is a component of the IED172 signature, as reported by Rutella et al. [9]. Although this gene is part of a broader immune-related expression profile in AML, its specific clinical significance and individual contribution to leukemia pathogenesis have yet to be fully elucidated.
The enrichment analysis highlighted the significant involvement of several molecular functions, particularly those associated with transcriptional activity and chemokine signaling. Notably, enrichment of the term “CXCR3 chemokine receptor binding” suggests that chemokine signaling plays a crucial role in the pathogenesis of AML, emphasizing its importance in immune activation and its potential as a therapeutic target. This signaling pathway recruits effector T cells to tumor sites, thereby enhancing anti-tumor immunity and influencing leukemia progression [29]. CXCR3 expression in regulatory T cells affects CD8(+) T-cell immunity and impacts cancer dissemination by guiding T-cell fate. Targeting CXCR3 signaling could enhance immune responses and improve therapeutic outcomes in AML, making it a promising treatment strategy. These enriched Gene Ontology (GO) terms underscore the complex interplay of genetic regulation, immune response, and cellular interactions in pediatric AML, providing insights into potential therapeutic targets and the biological foundations of risk stratification.
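Over-representation of a GO term in a gene list is commonly assessed with a one-sided hypergeometric test. The sketch below illustrates that general idea only: the enrichment method used in this study is not described in this section, the function name `hypergeom_sf` is hypothetical, and all counts (background size, annotated genes, list size, overlap) are made-up values.

```python
from math import comb


def hypergeom_sf(k, population, annotated, drawn):
    """Return P(X >= k) for a hypergeometric overlap count X.

    population: number of background genes
    annotated:  background genes carrying the GO term
    drawn:      size of the gene list being tested
    """
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


# Made-up example: 5 of 200 listed genes carry a term that annotates
# 50 of 20,000 background genes (expected overlap by chance: 0.5).
p_value = hypergeom_sf(k=5, population=20_000, annotated=50, drawn=200)
print(f"enrichment p-value: {p_value:.2e}")
```

In practice the resulting p-values are corrected for the number of GO terms tested, for example with the Benjamini-Hochberg procedure.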
Notably, the CIBERSORTx analysis revealed a significant enrichment of monocytes in the high-risk group and an increased presence of resting mast cells in the low-risk group. These findings suggest distinct immune landscapes, with the high-risk group exhibiting a potential activation profile that may contribute to aggressive disease progression. According to the literature, the presence of monocytes is associated with a pro-tumorigenic environment due to their role in suppressing anti-tumor responses and promoting angiogenesis and tumor growth [30]. Conversely, an increased presence of mast cells has been linked to improved outcomes in certain cancers, potentially due to their role in modulating immune responses and inflammation. This pattern underscores the complexity of the immune landscape in pediatric AML and highlights the potential of specific immune profiles as markers of disease severity and prognosis.
Notably, the LBS6 gene signature proved to be an independent risk factor in multivariate analysis, irrespective of other established risk factors such as poor karyotypes, elevated WBC count, advanced age, and gene mutations. An important limitation of our study is that key prognostic factors, such as AML M3 status, core binding factor AML status, and detailed treatment information, were not consistently available across the public databases utilized. We employed multivariate analysis to control for known clinical variables and minimize potential confounding effects, which helps to mitigate the impact of treatment heterogeneity and key prognostic factors. Future studies incorporating these important clinical and molecular features will be valuable for a more comprehensive understanding of IED signatures across AML subtypes.
Although our study suggests that the LBS6 signature has prognostic value, the signature may serve as a complement to, rather than a replacement for, established risk factors such as cytogenetics and molecular abnormalities. Its independent prognostic significance in multivariate analysis indicates that it might provide complementary information that could enhance current risk stratification strategies. Based on LBS6-defined risk groups, different therapeutic approaches could be considered: patients with high LBS6 scores might benefit from more intensive therapy, possibly including consideration for allogeneic stem cell transplantation, while standard treatment protocols could be sufficient for those with low LBS6 scores. However, these therapeutic suggestions would require prospective validation through clinical trials before any consideration for implementation in clinical practice.
Our study is significant in that the LBS6 signature was constructed based on the expression levels of genes derived from well-documented signatures that reflect the key biological processes associated with AML across various datasets. Additionally, a literature-oriented approach presents a viable option for constructing a robust gene signature at a reasonable cost, particularly when multiple studies are available.
In addition, our study demonstrated that the six-gene signature performed consistently across various AML cohorts, regardless of differences in age, genetic backgrounds, and therapies, establishing it as an independent prognostic factor for AML patients. Because the signature comprises only six genes, it can be measured alongside cytogenetic and molecular tests with practical techniques such as multiplex real-time quantitative polymerase chain reaction, allowing patients with AML to be accurately stratified by risk.
In conclusion, the LBS6 score, derived from well-validated gene signatures of 300 genes across multiple independent AML datasets, has the potential to redefine early risk categorization and identify low-risk AML cases. By refining the gene panel while preserving its predictive power, the LBS6 score enhances clinical value and may inform the development of new therapeutic strategies.