Original Article
62(2); 135-148
doi: 10.25259/ANAMS_147_2024

Phylogenetic landscape of peptide-binding domains of classical HLA alleles reported in the Indian population: Plausible implications for stem cell transplantation

Department of Translational and Regenerative Medicine, Postgraduate Institute of Medical Education and Research, Chandigarh, India

* Corresponding author: Dr. Gaurav Sharma, Department of Translational and Regenerative Medicine, Postgraduate Institute of Medical Education and Research, Chandigarh, India. sharma.gaurav@pgimer.edu.in

Licence
This is an open-access article distributed under the terms of the Creative Commons Attribution-Non Commercial-Share Alike 4.0 License, which allows others to remix, transform, and build upon the work non-commercially, as long as the author is credited and the new creations are licensed under the identical terms.

How to cite this article: Agarwal D, Sharma G. Phylogenetic landscape of peptide binding domains of classical HLA alleles reported in the Indian population: Plausible implications for stem cell transplantation. Ann Natl Acad Med Sci (India). 2026;62:135-48. doi: 10.25259/ANAMS_147_2024

Abstract

Objectives

Hematopoietic stem cell transplantation (HSCT) is preferred for several malignant and non-malignant hematopoietic conditions. However, high diversity in the histocompatibility genes, i.e., human leukocyte antigen (HLA), poses a challenge due to the unavailability of HLA-identical donors for most cases. This has increased the use of haplo-identical HSCT in the last few years, which can lead to enhanced allorecognition and propensity for graft-versus-host disease (GvHD). The population-specific molecular diversity of peptide-binding domains (PBD) of HLA alleles defines allo-sensitivity. Therefore, we analyzed the reported Indian HLA diversity for phylogenetic relatedness in the PBD, with a view to plausible implications for population-specific donor selection in haplo-identical HSCT.

Material and Methods

For this observational study, published literature, allelefrequencies.net, and the immuno polymorphism database (IPD) - international ImMunoGeneTics information system (IMGT)/HLA database were consulted to cumulatively estimate the degree of reported HLA diversity in the Indian population for the PBD region, with reference to global diversity. CLUSTAL Omega and interactive tree of life (iTOL) were used for estimating phylogenetic relatedness.

Results

Comprehensive analysis of reported HLA data for the Indian population revealed a heterogeneous distribution pattern representing the majority of globally present allelic groups (one field). At the two-field level, 114 HLA-A, 185 HLA-B, and 68 HLA-C alleles for HLA class I are reported for the Indian population. The dendrograms highlighted phylogenetic relatedness between small clusters usually having the same allele group (one-field level), though a few exceptions were also observed.

Conclusion

The observed heterogeneity in the distribution of class I and class II alleles and their phylogenetic relatedness could be attributed to diverse microflora and environmental conditions. Our analyses could plausibly facilitate the formulation of algorithms guiding optimal selection of a permissible histocompatible donor for transplantation.

Keywords

Hematopoietic stem cell transplantation
Histocompatibility
Human leukocyte antigen
Peptide binding domain
Phylogeny
Phylogram

INTRODUCTION

Hematopoietic stem cell transplantation (HSCT) has emerged as the preferred treatment modality for several malignant and non-malignant hematopoietic conditions. Over the last few decades, technological and clinical advancements have increased the quantum of HSCT globally, with nearly 90,000 transplants performed every year.1 According to Indian Society for Blood and Marrow Transplantation (ISBMT), nearly 23,843 transplants have been performed so far across 110 centers.2 Advancements in high-throughput clinical and diagnostic technologies, viz., transplant protocols, high-resolution genotyping for human leukocyte antigen (HLA), with improved accessibility from HLA-based stem cell donor registries, are encouraging.3

The HLA system is primarily involved in antigen presentation to T-cells for orchestrating the adaptive immune responses and thus largely defines the histocompatibility for transplantation. Its genomic locus is on chromosome 6 (p21.3) and represents the most polymorphic gene-dense region (with strong linkage and ∼39,627 known alleles) of the human genome.4,5 HLA class I (A, B, and C) genes are involved in presenting antigens to CD8+ cells, while HLA class II (DP, DQ, DR) present antigenic peptides to CD4+ cells. As of August 2024, a total of 27,717 HLA class I alleles have been described, which include 8,381 at locus HLA-A, 10,080 at locus HLA-B, and 8,454 at locus HLA-C. Similarly, a total of 11,910 HLA class II alleles have been reported, which include 3,714 at the HLA-DRB1 locus, 2,602 at HLA-DQB1, and 2,607 at the HLA-DPB1 locus.4 This high diversity poses the challenge of the unavailability of HLA-identical donors for most cases.

To this end, HLA-based international donor stem cell registries and associations/networks, viz., World Marrow Donor Association, National Marrow Donor Program, International Bone Marrow Transplant Registry, and others, work in close collaboration to support unrelated HLA-matched donor selection for HSCT. However, the ethnic distribution therein suggests limited Indian population-specific allelic coverage in HLA loci. Accordingly, the Marrow Donor Registry of India (MDRI), DATRI blood stem cell donors registry, ISBMT registry, Asian Indian Donor Marrow Registry (AIDMR), and others are important initiatives to meet the increasing requirement for HLA-identical donors for HSCT.

To fill the gap in HLA identical donor availability, there has been a significant increase in haplo-identical HSCT in the last few years, wherein the recipient and donor have one haplotype identical. This potentially enhances allorecognition and propensity for graft versus host disease (GvHD), thus increasing immunosuppression regimens, thereby increasing vulnerability to post-HSCT infections. The disparity of HLA in donor and recipient elicits an immunogenic response with T lymphocytes of the donor targeting the recipient’s healthy cells, disrupting normal physiology and metabolic functions of the host.6 Further, haplo-HSCT poses a challenge of identification of a permissible mismatched donor among the multiple available related haplo-identical donors. This sets the rationale for defining optimal methods of donor selection, minimizing the load of posttransplant complications. A contributing factor in this direction could be phylogenetic relatedness of different HLA alleles based on their peptide binding domain (PBD) sequences, which are encoded by exon 2 and 3 in class I and exon 2 in class II molecules, respectively.7,8 The population-specific molecular diversity of PBD of HLA alleles largely defines allo-sensitivity via presentation of antigenic peptides differentially and promiscuously. Briefly, these sequences upon translation form an antigen recognition site/domain (ARS/ARD) carrying multiple pockets where the antigenic peptide fragments bind, based on their amino acid composition and binding efficiency with the residues forming those pockets.7-9 These amino acids contribute not only to the structural diversity of different HLA alleles but are also a significant aspect in peptide recognition and the length of the peptide binding to that domain.9 Thus, identifying evolutionary associations between different alleles based on their PBD region could support the identification of closely related alleles having analogy in peptide presentation patterns. 
Furthermore, this could contribute to developing population-specific donor selection algorithms for haploidentical HSCT and even for organ transplantation. Therefore, we analyzed the reported Indian HLA diversity at a low-resolution level for phylogenetic relatedness in the PBD, with a view to plausible implications for population-specific donor selection in haplo-identical HSCT.

MATERIAL AND METHODS

For this observational study, published literature and the allelefrequencies.net database were consulted to cumulatively estimate the degree of reported HLA diversity at the one- and two-field levels, respectively.10 The global HLA data were retrieved and analyzed from hlaalleles.org and the international ImMunoGeneTics information system (IMGT)/HLA database.11,12 Further, the PBD sequences of classical HLA class I and II alleles were acquired from the IPD-IMGT/HLA database.12 Sequence relatedness was deduced using phylogenetic trees, which were built with the freely accessible online multiple DNA alignment tool CLUSTAL Omega13 and formatted using the online tool interactive tree of life (iTOL),14 designed for visualization, annotation, and management of dendrograms. In our analysis, we used the default settings of CLUSTAL Omega, which constructs a guide tree based on pairwise distance matrices of aligned sequences. From an ethical viewpoint, we analyzed only data that are already published, in the public domain, and in databases, which are duly referred to and acknowledged. This reflective study supports the direction of our ongoing research grant from the Indian Council of Medical Research, Govt. of India (No. 2021-12089/SCR/ADHOC-BMS) towards defining permissible HLA matching for HSCT.
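To make the idea of a pairwise distance matrix concrete, the following is a minimal sketch in Python. It computes a simple p-distance (the fraction of differing sites) between already-aligned sequences. This is an illustrative assumption, not the distance measure that CLUSTAL Omega uses internally, and the sequence names and fragments are made up rather than taken from the IPD-IMGT/HLA database.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differing = 0
    for base_a, base_b in zip(seq_a, seq_b):
        if base_a == "-" or base_b == "-":
            continue
        compared += 1
        if base_a != base_b:
            differing += 1
    return differing / compared if compared else 0.0


def distance_matrix(aligned: dict) -> dict:
    """Pairwise p-distances keyed by (name_1, name_2), names in sorted order."""
    return {
        (first, second): p_distance(aligned[first], aligned[second])
        for first, second in combinations(sorted(aligned), 2)
    }


# Hypothetical, made-up aligned fragments; these are NOT real HLA sequences.
toy_alignment = {
    "allele_1": "ACGTACGT",
    "allele_2": "ACGTACGA",
    "allele_3": "AC-TTCGA",
}

matrix = distance_matrix(toy_alignment)
for pair, dist in matrix.items():
    print(pair, round(dist, 3))
```

A tree-building step would then repeatedly join the closest pairs in such a matrix; alleles separated by small distances end up on neighboring leaves.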

RESULTS

HLA diversity of peptide binding sequences in the Indian population with reference to global data

Analysis of reported HLA data at low resolution (one field, i.e., two-digit level) suggests a diversified allelic composition in the Indian population, as depicted in Table 1. For HLA class I (A, B, and C loci), 34 of the 36 HLA-B allelic groups reported globally are documented in the Indian dataset, while for HLA-A and HLA-C, all allelic groups known so far in the IPD-IMGT/HLA database12 are reported in the Indian population [Figure 1a].

Table 1: Diversity of classical HLA class I (A, B, and C) and HLA class II (DP, DQ, and DRB1) alleles at low resolution (one- and two-field levels) as reported for Indian and global populations.
Locus       One-field level       Two-field level
            Global    Indian      Global    Indian
HLA-A       21        21          5094      114
HLA-B       36        34          6027      185
HLA-C       14        14          4765      68
HLA-DPA1    4         4           324       7
HLA-DPB1    1520      71          1523      73
HLA-DQA1    6         6           357       17
HLA-DQB1    5         5           1628      51
HLA-DRB1    13        13          2438      118
Figure 1:
(a) Numbers of classical HLA class I (A, B, and C), (b) Numbers of classical HLA class II (DP, DQ, and DRB1) allelic groups at low resolution (one field, i.e., two-digit) as reported for Indian and global populations so far.

A similar assessment of HLA class II (DP, DQ, and DRB1) alleles at the two-digit level revealed that all documented allelic groups of HLA-DPA1 (n=4), DQA1 (n=6), DQB1 (n=5), and DRB1 (n=13) are reported in the Indian dataset. However, only 71 HLA-DPB1 allelic groups have been identified so far in the Indian population, comprising ∼4.6% of the globally known HLA-DPB1 allelic groups at the two-digit level [Figure 1b]. The lists of these reported HLA class I and II alleles in the Indian population are given in Supplementary Tables 1 and 2.
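The coverage figures quoted above follow directly from the one-field counts in Table 1. As a minimal sketch, the Python snippet below recomputes the share of globally known allelic groups that are reported in the Indian dataset for each locus. The counts are transcribed from Table 1; the variable and function names are illustrative only.

```python
# (global, Indian) counts at the one-field level, transcribed from Table 1.
ONE_FIELD_COUNTS = {
    "HLA-A": (21, 21),
    "HLA-B": (36, 34),
    "HLA-C": (14, 14),
    "HLA-DPA1": (4, 4),
    "HLA-DPB1": (1520, 71),
    "HLA-DQA1": (6, 6),
    "HLA-DQB1": (5, 5),
    "HLA-DRB1": (13, 13),
}


def coverage_percent(counts: dict) -> dict:
    """Percentage of globally known allelic groups reported in the Indian dataset."""
    return {
        locus: 100.0 * indian / global_count
        for locus, (global_count, indian) in counts.items()
    }


coverage = coverage_percent(ONE_FIELD_COUNTS)
for locus, percent in coverage.items():
    print(f"{locus}: {percent:.2f}%")
```

For HLA-DPB1 this gives 71/1520, i.e., a little under 5%, in line with the approximate value stated in the text, while HLA-B is the only other locus below 100%.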

Supplementary Table 1

Supplementary Table 2

Phylogenetics of peptide-binding sequences of classical HLA alleles of the Indian population

ARS/ARD for HLA class I (encoded by exons 2 and 3) and II (encoded by exon 2) are composed of specific pockets with unique biochemical properties, which define the structural diversity of antigenic peptide binding. Here we analyzed these PBD sequences of reported HLA alleles (A, B, C, DP, DQ, DRB1) from India using dendrograms, which indicate both the homology as well as diversity in the amino acid sequences and their functional aspects reflecting their phylogenetic relatedness.

i) Classical HLA-Class I (A, B, C)

At the two-field level, 114 alleles for HLA-A, 185 for HLA-B, and 68 for HLA-C are reported for the Indian population, of which the PBD sequences of 85 alleles for HLA-A, 147 for HLA-B, and 57 for HLA-C were available in the IPD-IMGT/HLA database12 and were retrieved for phylogenetic analysis. For HLA-A, the phylogram broadly grouped these leaves into two sections, further bifurcated into sub-branches comprising alleles with similar nomenclature at the two-digit level, e.g., A*74:01 and A*74:03 formed a close cladistic group with A*03:01 and A*03:224 [Figure 2a]. However, the A*02 lineage was distributed heterogeneously with close association to A*24 alleles, indicating homology of gene sequences in the peptide-binding domain. For HLA-B, which is more diverse, the phylogenetic tree revealed a heterogeneous alignment of alleles, with many small clusters sharing the same allele group but encoding different HLA proteins. For instance, B*27, B*18, B*13, and B*39 formed close cladistic groups [Figure 2b]. On the other hand, B*15 alleles were distributed across the tree, with B*15:13 and B*15:18 showing sequence homology with B*40:01 and B*14:01, respectively. Notably, some monophyletic groups were formed, e.g., by the B*35:57, B*35:42, B*35:40N, and B*35:01 alleles, while others included a combination of B*35 and B*07, or B*18 and B*27, lineages. Similarly, B*15:10, B*40:26, B*40:23, and B*15:09 clustered closely [Figure 2b].
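Whether a cluster contains one allele group or several can be checked mechanically from the allele names, since the first field after the asterisk identifies the allele group. The sketch below is a minimal, assumed helper (the function names and the regular expression are illustrative, not part of any HLA software) that parses names such as those quoted above and groups them by their first field.

```python
import re
from collections import defaultdict

# Locus, colon-separated numeric fields, and an optional expression suffix
# (N, L, S, C, A, or Q), e.g. "B*35:40N". An "HLA-" prefix is tolerated.
ALLELE_PATTERN = re.compile(
    r"^(?:HLA-)?(?P<locus>[A-Z0-9]+)\*(?P<fields>\d+(?::\d+)*)(?P<suffix>[NLSCAQ]?)$"
)


def parse_allele(name: str):
    """Split an allele name into (locus, list of fields, suffix)."""
    match = ALLELE_PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognizable HLA allele name: {name!r}")
    return match.group("locus"), match.group("fields").split(":"), match.group("suffix")


def allele_group(name: str) -> str:
    """One-field (two-digit) allele group, e.g. 'A*74:01' -> 'A*74'."""
    locus, fields, _ = parse_allele(name)
    return f"{locus}*{fields[0]}"


def group_by_first_field(names) -> dict:
    """Map each one-field allele group to the alleles that belong to it."""
    groups = defaultdict(list)
    for name in names:
        groups[allele_group(name)].append(name)
    return dict(groups)


# Alleles named in the text as forming one close cladistic group.
cluster = ["A*74:01", "A*74:03", "A*03:01", "A*03:224"]
groups = group_by_first_field(cluster)
is_single_lineage = len(groups) == 1
print(groups, is_single_lineage)
```

Applied to the A*74/A*03 cluster, the helper reports two allele groups, which is the kind of mixed-lineage cluster the paragraph describes.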

Figure 2a:
Neighbor-joining (midpoint rooted) phylogenetic tree depicting clustering of peptide binding domain (PBD) sequences of HLA class I alleles reported in the Indian population for HLA-A. Only the genomic sequences of the PBD, which are available in the IPD-IMGT/HLA database12, were retrieved for the formation of a phylogram [Created by CLUSTAL Omega and iTOL13,14].
Figure 2b:
Neighbor-joining (midpoint rooted) phylogenetic tree depicting clustering of peptide binding domain (PBD) sequences of HLA class I alleles reported in the Indian population for HLA-B. Only the genomic sequences of the PBD, which are available in the IPD-IMGT/HLA database12, were retrieved for the formation of a phylogram [Created by CLUSTAL Omega and iTOL13,14].

In contrast, for HLA-C, alleles sharing the same allele group at the first field largely clustered together in the phylogram, e.g., C*03 and C*07 each formed a clearly distinguished group. Further, C*02 and C*15 alleles were found to diverge from a common node [Figure 2c], suggesting a common origin. Similarly, C*18:01 was found to be ancestrally closer to C*04 alleles, while C*05 and C*08 alleles originated from a common node with approximately similar branch lengths [Figure 2c], indicating a higher sequence similarity in the PBD region.

Figure 2c:
Neighbor-joining (midpoint rooted) phylogenetic tree depicting clustering of peptide binding domain (PBD) sequences of HLA class I alleles reported in the Indian population for HLA-C. Only the genomic sequences of the PBD, which are available in the IPD-IMGT/HLA database12, were retrieved for the formation of the phylogram [Created by CLUSTAL Omega and iTOL13,14].

ii) Classical HLA-Class II (DP, DQ, DRB1)

For HLA class II, the beta chains are diverse and largely define the differential peptide binding. At the two-field level, seven alleles for HLA-DPA1, 73 for HLA-DPB1, 17 for HLA-DQA1, 51 for HLA-DQB1, and 118 for HLA-DRB1 are reported for the Indian population, of which the PBD sequences of five alleles for HLA-DPA1, 55 for HLA-DPB1, 16 for HLA-DQA1, 33 for HLA-DQB1, and 73 for HLA-DRB1 were available in the IPD-IMGT/HLA database and were retrieved for phylogenetic analysis. The phylogram for HLA-DPA1 comprised five leaves, DPA1*04:01, 02:01, 01:04, 01:03, and 03:01 [Figure 3a], while that for HLA-DPB1 comprised 55 leaves and showed heterogeneity, with genealogically diverse relatedness between alleles within the same allele group, e.g., DPB1*02:01 was observed to have more homology with DPB1*46:01 and DPB1*41:01, while DPB1*02:02 was phylogenetically closer to DPB1*47:01 [Figure 3b]. Intriguingly, most of the monophyletic groups of HLA-DPB1 comprised different allele groups at the first field, e.g., DPB1*11:01 with DPB1*15:01 and DPB1*36:01 with DPB1*21:01 [Figure 3b], depicting the diverse trends in PBD sequences.

Figure 3a:
Neighbor-joining (midpoint rooted) phylogenetic tree depicting clustering of peptide-binding domain (PBD) sequences of HLA class II alleles reported in the Indian population for HLA-DPA1. Only the genomic sequences of the PBD, which are available in the IPD-IMGT/HLA database12, were retrieved for the formation of the phylogram [Created by CLUSTAL Omega and iTOL13, 14].
Figure 3b:
Neighbor-joining (midpoint rooted) phylogenetic tree depicting clustering of peptide binding domain (PBD) sequences of HLA class II alleles reported in the Indian population for HLA-DPB1. Only the genomic sequences of the PBD, which are available in the IPD-IMGT/HLA database12, were retrieved for the formation of a phylogram [Created by CLUSTAL Omega and iTOL13, 14].

Analysis of HLA-DQA1 PBD sequences revealed 16 leaves broadly divided into two groups, with the first descending group comprising only the DQA1*01 allele group [Figure 3c]. The second group branched into two subgroups displaying mixed allelic clustering. DQA1*03:01 and 03:02 formed a single monophyletic group with DQA1*02:01, while DQA1*03:03 displayed a high similarity index with DQA1*04:01 and DQA1*04:02, having the same branch length, indicating an analogous evolutionary pattern of descent [Figure 3c]. On the other hand, the phylogram for HLA-DQB1 comprised 33 leaves and showed a uniform allelic distribution, with alleles sharing the same allele group at the one-field level forming one cluster, e.g., the DQB1*06 lineage clustered with two branches descending from a common node. Similarly, DQB1*05 formed an individual cladistic group while DQB1*03 formed another [Figure 3d]. This indicates that PBD sequences are conserved in a lineage-specific manner, and thereby points to plausible permissive mismatches among multiple mismatched donors for HSCT.
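The notion of a monophyletic group used above can be stated precisely: a set of alleles is monophyletic in a rooted tree when it is exactly the set of leaves below some internal node. A minimal Python sketch follows, representing a rooted tree as nested tuples. The toy topology is hypothetical and only loosely inspired by the DQA1 description; it is not the tree shown in Figure 3c, and the function names are illustrative.

```python
def leaves(node) -> frozenset:
    """All leaf labels below a node; a leaf is a plain string."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def clades(node) -> set:
    """Leaf sets of every internal node in the (rooted) tree."""
    if isinstance(node, str):
        return set()
    found = {leaves(node)}
    for child in node:
        found |= clades(child)
    return found


def is_monophyletic(tree, group) -> bool:
    """True if `group` is exactly the leaf set of some internal node."""
    return frozenset(group) in clades(tree)


# Hypothetical toy topology (NOT the published tree).
toy_tree = (
    (("DQA1*03:01", "DQA1*03:02"), "DQA1*02:01"),
    ("DQA1*01:01", "DQA1*01:02"),
)

mixed_clade = is_monophyletic(toy_tree, {"DQA1*03:01", "DQA1*03:02", "DQA1*02:01"})
broken_group = is_monophyletic(toy_tree, {"DQA1*03:01", "DQA1*02:01"})
print(mixed_clade, broken_group)
```

Because the check depends on where the root sits, results for a midpoint-rooted tree apply to that rooting only.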

Figure 3c:
Neighbor-joining (midpoint rooted) phylogenetic tree depicting clustering of PBD sequences of HLA class II alleles reported in the Indian population for HLA-DQA1. Only the genomic sequences of the PBD which are available in the IPD-IMGT/HLA database12, were retrieved for the formation of a phylogram [Created by CLUSTAL Omega and iTOL13, 14].
Figure 3d:
Neighbor-joining (midpoint rooted) phylogenetic tree depicting clustering of PBD sequences of HLA class II alleles reported in the Indian population for HLA-DQB1. Only the genomic sequences of the PBD, which are available in the IPD-IMGT/HLA database12, were retrieved for the formation of a phylogram [Created by CLUSTAL Omega and iTOL13, 14].

For HLA-DRB1, the midpoint-rooted tree comprised two major branches, which further bifurcated into smaller branches, generating clusters. DRB1*04 formed a major group having two subgroups. Similarly, DRB1*11 alleles formed a small group diverging from the common nodal point of the DRB1*13, DRB1*14, and DRB1*03 allele groups [Figure 3e]. The DRB1*14 group was distributed across the tree, with some sequences, e.g., DRB1*14:21 and 14:02, showing a higher similarity index with DRB1*03:01, 03:02, and 03:05 and forming a single cladistic group, while DRB1*14:15 clustered with DRB1*08:04. Similarly, DRB1*07 alleles formed one monophyletic group with DRB1*09:01, whereas DRB1*12:02 and DRB1*12:01 formed a closed-end cluster [Figure 3e]. Overall, the evolutionary pattern of class II alleles was lineage-specific, with the majority of alleles belonging to one lineage forming single cladistic groups, indicating close homology of exon 2 sequences and thereby of their PBD repertoire.

Figure 3e:
Neighbor-joining (midpoint rooted) phylogenetic tree depicting clustering of peptide binding domain (PBD) sequences of HLA class II alleles reported in the Indian population for HLA-DRB1. Only the genomic sequences of the PBD, which are available in the IPD-IMGT/HLA database12, were retrieved for the formation of a phylogram [Created by CLUSTAL Omega and iTOL13, 14].

DISCUSSION

Hematopoietic stem cell transplantation has been a preferred treatment modality in cases of malignant and non-malignant hematological disorders. The increasing burden of these hematological conditions has elevated the number of allogeneic HSCTs performed worldwide and in India.1 A major factor governing histocompatibility in transplant settings is the HLA system, which is highly polymorphic. In this reflective study, we compared the HLA diversity reported in the Indian population, at the one- and two-field levels, with that of the global dataset. The majority of allele groups (one field, i.e., two-digit level) for both classical class I and II present globally are observed in the Indian dataset. At the two-field level, 114 HLA-A, 185 HLA-B, and 68 HLA-C alleles for HLA class I and 7 HLA-DPA1, 73 HLA-DPB1, 17 HLA-DQA1, 51 HLA-DQB1, and 118 HLA-DRB1 alleles for HLA class II are reported for the Indian population [Table 1 and Supplementary Tables 1, 2].

Since the variability of HLA is primarily attributed to the presence of a structurally and biochemically diverse group of amino acids in the PBD, we also attempted to deduce the phylogenetic landscape of these peptide-binding sequences of classical HLA class I (A, B, and C) and class II (DP, DQ, and DRB1) alleles reported from various regions of the Indian subcontinent. Out of the reported HLA class I and II alleles at the two-field level in the Indian population [Supplementary Tables 1, 2], PBD sequences of 85 HLA-A, 147 HLA-B, 57 HLA-C, 5 HLA-DPA1, 55 HLA-DPB1, 16 HLA-DQA1, 33 HLA-DQB1, and 73 HLA-DRB1 alleles were available in the IPD-IMGT/HLA database and were retrieved for phylogenetic analysis.
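The counts above also show how much of the reported diversity actually entered the trees. As a rough sketch, the Python snippet below tabulates, per locus, the number of reported two-field alleles, the number with a retrievable PBD sequence, and the resulting share. The figures are transcribed from the text; the variable names are illustrative.

```python
# (reported in the Indian population, PBD sequence available), two-field level,
# transcribed from the Results and Discussion text.
TWO_FIELD_COUNTS = {
    "HLA-A": (114, 85),
    "HLA-B": (185, 147),
    "HLA-C": (68, 57),
    "HLA-DPA1": (7, 5),
    "HLA-DPB1": (73, 55),
    "HLA-DQA1": (17, 16),
    "HLA-DQB1": (51, 33),
    "HLA-DRB1": (118, 73),
}

total_reported = sum(reported for reported, _ in TWO_FIELD_COUNTS.values())
total_available = sum(available for _, available in TWO_FIELD_COUNTS.values())

for locus, (reported, available) in TWO_FIELD_COUNTS.items():
    share = 100.0 * available / reported
    print(f"{locus}: {available}/{reported} analyzed ({share:.1f}%), "
          f"{reported - available} not analyzed")

overall_share = 100.0 * total_available / total_reported
print(f"Overall: {total_available}/{total_reported} ({overall_share:.1f}%)")
```

On these counts, roughly three quarters of the reported alleles were represented in the phylograms, with HLA-DRB1 and HLA-DQB1 having the largest shortfalls.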

The phylograms of HLA-C displayed a relatively uniform pattern with alleles of the same lineage clustering together [Figure 2c], while HLA-A and B revealed a mixed and miscellaneous distribution of alleles [Figure 2a, 2b]. This multifarious distribution of HLA-A and HLA-B alleles could plausibly contribute to donor selection based on their parallel peptide presentation and immunogenicity. Similarly, phylogenetic analyses of reported classical HLA class II alleles from the Indian population reflected small monophyletic groups combining alleles of the same and/or different lineages for both HLA-DPB1 [Figure 3b] and HLA-DRB1 [Figure 3e].

Incidentally, the Indian subcontinent is one of the most diverse geographical landscapes of the South Asian region, inhabited by evolutionarily distinct microflora and fauna. This diversity is also reflected in the antigen presentation system, i.e., HLA, thereby rendering the identification of a permissible histocompatible donor for transplant challenging.3,5 Briefly, PBDs in HLA class I are constituted by the α1 and α2 domains of the heavy chain, forming a closed-ended domain housing six pockets named A-F,7,9 while in class II, the PBD is an open-ended cleft formed by the N-terminal α1 and β1 domains, where the β chain forms the floor of the groove and the α chain creates the walls.8 Each binding pocket is unique in its biochemical properties owing to its specific HLA allotype. Due to the structural and biochemical complexity of antigen presentation by HLA molecules, finding an HLA-identical donor and meeting the transplant burden is challenging. Moreover, the shift towards haploidentical transplants and the availability of multiple related donors identical for one haplotype have highlighted the need for suitable criteria for the identification of permissible mismatched donors. Our analyses of reported allelic heterogeneity of classical HLA alleles in the Indian population and their phylogenetic clustering based on the PBD region could plausibly facilitate algorithms guiding optimal selection of permissible histocompatible donors for HSCT as well as for solid organ transplantation.

To this end, at the two-digit (one field) level, a high divergence is reported in the Indian population, covering the majority of allelic groups reported globally [Table 1, Figure 1]. For example, HLA-A and HLA-C, driving both innate and adaptive immune responses, displayed a diversifying trend in the Indian population with all 21 and 14 lineages, respectively [Table 1]. Pioneering leads by Narinder Mehra and his group,7 along with others,15-18 reported predisposition and HLA association in the context of diseases like diabetes mellitus, tuberculosis, leprosy, rheumatoid arthritis, multiple sclerosis, spondyloarthropathies, and others. These studies observed the landscape of the allelic repertoire specific to discrete geographical regions and ethnic diversities of the Indian subcontinent.19-42 They also covered tribal populations of the north and central zones of the country,21-25 while others focused on groups from south26,27 and east India.28,29 Furthermore, the literature elucidated novel HLA alleles and unique haplotypes of HLA-A, HLA-B, and HLA-DRB1 at low resolution, contributing to the existing Indian population-specific information on these loci.43-53 Pan-India data are covered across various regions (e.g., Haryana, Punjab, Delhi, Uttar Pradesh, Himachal Pradesh, Kashmir, Madhya Pradesh, Chhattisgarh, Rajasthan, Bihar, Gujarat, Maharashtra, Kerala, Karnataka, Tamil Nadu, Jalpaiguri, and others, including eastern regions) and diverse ethnicities and tribes, accessed through the simplified user interface of allelefrequencies.net, developed by Derek Middleton and his team,10,11 as well as through the literature. Global data available at the IPD-IMGT/HLA database, maintained and curated by Baker et al.12 (2023), were used to retrieve PBD sequences for phylogenetic analysis; this database is instrumental to transplant immunology nationwide.

The multifarious distribution of classical HLA lineages could be attributed to the interplay of diverse microflora and evolving environmental conditions. The dendrograms were generated using the multiple sequence alignment tool CLUSTAL Omega13 and visualized with iTOL.14 Alleles converging to a common nodal point reflect a closer PBD homology, which influences the peptide repertoire. In our study, the phylograms of HLA-A, B, DPB1, and DRB1 reflected a diversifying trend with mixed and miscellaneous relatedness, based on broad allelic groups [Figures 2 and 3]. On the other hand, HLA-C, DQA1, DQB1, and DPA1 displayed known lineage-specific clustering of alleles [Figures 2 and 3]. Initial leads by Mary Carrington’s group in 199954 highlighted the phylogenetic hierarchy of HLA-A, -B, and -C alleles based on their PBD sequences for the alleles known at that time. The new nomenclature was adopted thereafter, with several new alleles identified and reflected in our study. Previously, phylogenetic analyses based on whole genome sequences,54 specific promoters and 3’ untranslated regions (UTR),55 and the peptide-binding groove56-58 of classical HLA-A, -B, -DRB, and -DPB1 have been reported in different populations. The allelic heterogeneity and distribution varied in our study [Figures 2a, b and 3e]; nevertheless, the HLA-C allelic clustering in our study [Figure 2c] was similar to that reported by Van Dorp C.H. and Kesmir C. (2018),57 e.g., HLA-C*16:02 clustered with C*12:03. In contrast, the phylogenetic distance between C*17:01 and C*15:02 increased in our study owing to the addition of new alleles of closer descent (C*03 with C*17:01 and C*15:08 with C*15:02) [Figure 2c]. Further, Morishima et al. (2018) analyzed the phylogenetics of HLA class II alleles, such as the 3’ UTR region of HLA-DP,59 while others reported lineage-specific heterogeneity of DQ and DR alleles among different population subsets.60

Nevertheless, certain concerns should be considered while interpreting this HLA phylogenetic landscape in a heterogeneously assorted population like that of India. Here, we could cover only classical HLA data through scanning published reports and relied primarily on the data available at allelefrequencies.net and IPD-IMGT/HLA as of November 2023. The genomic sequences of the PBDs of classical HLA class I and II alleles were acquired from the IPD-IMGT/HLA database, as most of the population-specific alleles are reported at low resolution. A multitude of HLA data remains inaccessible, including data from various private laboratories involved in pre-transplant workup as well as databases maintained by various Indian-origin donor registries like DATRI, MDRI, AIDMR, Be The Cure registry, and others for the identification of potential donors for matched unrelated HSCT. Further, the frequencies of classical HLA loci are not covered here, which could affect the development of donor selection algorithms beyond phylogenetic relatedness. Analyses of frequency data, inclusive of registry data, could add new dimensions. Additionally, anomalies/ambiguities in the referred literature at the author’s end at the time of allele call/genotyping or entry into the database cannot be ruled out. Moreover, most of the referred literature used conventional low-resolution techniques, viz. PCR-SSP and SSO, for which high-resolution details are not available. Also, for a few HLA alleles, only partial coding sequences (CDS) were available; these alleles could not be analyzed phylogenetically because their exon 2 and 3 genomic sequences were unavailable in the global database IPD-IMGT/HLA [allele list in Supplementary Tables 1 and 2].
Moreover, the tools used for phylogenetic study generated phylograms without branch lengths, which were then interpreted using iTOL.14 Furthermore, here we have analyzed only at the genomic sequence level but not at the amino acid sequence level for phylogenetic relatedness. This aspect will require additional evaluation not only at the amino acid sequence levels but also at their interactive levels, which could be further analyzed in the future.
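To illustrate why relatedness at the nucleotide level and at the amino acid level can differ, the following minimal sketch translates two in-frame coding fragments with the standard genetic code and counts differences at both levels; synonymous substitutions contribute to the nucleotide count but not to the amino acid count. The helper names (`translate`, `count_differences`, `compare_levels`) and the 9-base fragments are hypothetical placeholders, and the sketch assumes gap-free, in-frame input of equal length, which real exon 2 and 3 sequences would need to be prepared for.

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate an in-frame, gap-free DNA coding fragment to amino acids."""
    if len(cds) % 3 != 0:
        raise ValueError("length must be a multiple of three")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def count_differences(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return sum(x != y for x, y in zip(a, b))


def compare_levels(cds_a, cds_b):
    """Return (nucleotide differences, amino acid differences)."""
    return (
        count_differences(cds_a, cds_b),
        count_differences(translate(cds_a), translate(cds_b)),
    )


# Hypothetical fragments: two synonymous changes versus one replacement change.
print(compare_levels("GCTGAAAAA", "GCCGAAAAG"))  # (2, 0)
print(compare_levels("GCTGAAAAA", "GCTGATAAA"))  # (1, 1)
```

In the first pair, the fragments look more distant at the nucleotide level than in the second pair, yet only the second pair differs in the encoded peptide.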

To the best of our knowledge, the present study is the first attempt to delineate the phylogenetic landscape of HLA alleles observed/reported in a specific (Indian) population. The given information can serve as a source for developing algorithms that optimize donor selection for haploidentical HSCT and organ transplantation. The PBD-based allelic clustering can further be evaluated for clinical utility along with existing algorithms like Predicted Indirectly Recognizable HLA Epitopes (PIRCHE), wherein low PIRCHE scores are indicative of low immunogenicity.61,62 Alleles clustering together could be assessed with respect to their PIRCHE scores for plausible implications of these dendrograms in transplant outcomes. In addition, the relevance of these T-cell epitope (TCE) binding-based algorithms relying on PBD could be explored in relation to B-cell epitopes, i.e., cross-reactive groups (CREGs), for the post-transplant development of de novo donor-specific antibodies (DSA) against HLA. Moreover, this information could support stem cell registries as well as the characterization of disease susceptibility and interventions, e.g., diagnostics and peptide-based vaccines. Therefore, this preliminary step can be extended further to extract mechanistic information on the roots of HLA alleles, their evolutionary patterns, and their translational relevance.

CONCLUSION

The HLA complex is a highly polymorphic genetic locus (6p21.3) that governs antigen presentation and histocompatibility in transplantation. Most of the variation in these molecules is present in the PBDs. In HSCT, alloreactivity mediated through molecularly diverse PBDs poses a major challenge, e.g., GvHD. Here, we analyzed the known Indian HLA diversity for phylogenetic relatedness in the PBDs, towards plausible implications for population-specific donor selection in haploidentical HSCT. Approaches that utilize population-specific evolutionary analogy based on the PBD region of HLA class I and II alleles could refine algorithms for donor selection strategies in haploidentical HSCT.

Acknowledgement

The authors acknowledge the databases used for the analyses, i.e., IPD-IMGT/HLA and the Allele Frequency Net Database (AFND), and the authors who reported the distribution of HLA alleles in the Indian population.

Authors’ contributions

GS: Conceptualization and supervision; DA and GS: Analysed the data and wrote the manuscript draft.

Ethical approval

Institutional Review Board approval is not required. This is an analysis report based solely on freely available data and databases, which are duly referenced. The study does not involve the enrolment of human participants or the use of any human samples, and no ethical concerns are linked to it.

Declaration of patient consent

Patient consent is not required as there are no patients in this study.

Financial support and sponsorship

This reflective study is supportive of the direction of our ongoing research grant from the Indian Council of Medical Research, Govt. of India (No. 2021-12089/SCR/ADHOC-BMS) towards defining permissible HLA matching for HSCT.

Conflicts of interest

There are no conflicts of interest.

Use of artificial intelligence (AI)-assisted technology for manuscript preparation

The authors confirm that there was no use of artificial intelligence (AI)-assisted technology for assisting in the writing or editing of the manuscript and no images were manipulated using AI.

References

  1. , , , , , , et al. One and half million hematopoietic stem cell transplants (HSCT). Dissemination, trends and potential to improve activity by telemedicine from the Worldwide Network for Blood and Marrow Transplantation (WBMT). Blood. 2019;134(Supplement_1):2035.
  2. , , , , , , et al. American Society of Transplantation and Cellular Therapy International Affair Committee: Report of the Third Workshop on Global Perspective to Access to Transplantation at the 2022 Tandem Meeting. Transplantation and Cellular Therapy. 2023;29:410-7.
  3. Mehra NK. The HLA complex in biology and medicine: A resource book. Boydell Brewer Ltd ISBN 978-81-8448-870-8; 2010.
  4. , , , , , . IPD-IMGT/HLA Database. Nucleic Acids Res. 2020;48:D948-55.
  5. , , . The human leukocyte antigen system in human disease and transplantation medicine. In: Clinical molecular medicine. Elsevier; p. 309-25.
  6. , , . Review of graft-versus-host disease. Dermatol Clin. 2019;37:569-82.
  7. , , . Molecular challenges imposed by MHC-I restricted long epitopes on T cell immunity. Biol Chem. 2017;398:1027-36.
  8. , , , , , , et al. Three-dimensional structure of the human class II histocompatibility antigen HLA-DR1. Nature. 1993;364:33-9.
  9. , , . The pockets guide to HLA class I molecules. Biochem Soc Trans. 2021;49:2319-31.
  10. , , , , , , et al. Allele frequency net database (AFND) 2020 update: Gold-standard data classification, open access genotype data and new query tools. Nucleic Acids Res. 2020;48:D783-8.
  11. , , , , , , et al. IPD-MHC: nomenclature requirements for the non-human major histocompatibility complex in the next-generation sequencing era. Immunogenetics. 2018;70:619-23.
  12. , , , , , , et al. The IPD-IMGT/HLA database. Nucleic Acids Res. 2023;51:D1053-60.
  13. , , , , , , et al. Fast, scalable generation of high-quality protein multiple sequence alignments using clustal omega. Mol Syst Biol. 2011;7:539.
  14. , . Interactive tree of life (iTOL) v5: An online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 2021;49:W293-6.
  15. , , , , . Human leucocyte antigen class II DRB1 and DQB1 associations in human immunodeficiency virus‐infected patients of Mumbai, India. Int J Immunogenetics. 2010;37:199-204.
  16. , , , , , , et al. Distinctive KIR and HLA diversity in a panel of north Indian Hindus. Immunogenetics. 2002;53:1009-19.
  17. , , , , , . Kannadigas from South India: Putatively unique five-locus haplotypes among the Kannadigas of South India. HLA. 2018;92:193-5.
  18. , , , , , . Malayalam speaking population from South India: Common five-locus haplotypes in Malayalam speaking population. HLA. 2018;92:432-4.
  19. , . Molecular diversity of HLA-A, HLA-B, HLA-DRB1 and HLA-DQB1 Alleles from Mumbai India. International Journal of Human Genetics. 2012;12:57-62.
  20. , , , . High‐resolution HLA genotyping identifies alleles associated with severe COVID‐19: A preliminary study from India. Immunity Inflam & Disease. 2021;9:1781-5.
  21. , , , . Analysis of HLA-B allele polymorphism in North Indian population: Experience at tertiary care centre. Gene Reports. 2021;22:100996.
  22. , , , , , . Distribution of HLA-A, B and DRB1 alleles in Sahariya tribe of North Central India: An association with pulmonary tuberculosis. Infect Genet Evol. 2014;22:175-82.
  23. , , , , , , et al. HLA-A*02 repertoires in three defined population groups from North and Central India: Punjabi Khatries, Kashmiri Brahmins and Sahariya Tribe. HLA. 2019;93:16-23.
  24. , . HLA‐A and HLA‐B distribution in Toto – A vanishing sub‐Himalayan tribe of India. Tissue Antigens. 2006;67:64-5.
  25. , , . HLA polymorphisms in Sindhi community in Mumbai, India. Int J Immunogenet. 2010;37:373-7.
  26. , , , , , , et al. Gradients in distribution of HLA-DRB1 alleles in castes and tribes of South India. Int J Hum Genet. 2012;12:45-5.
  27. , , , , , , et al. Distribution of HLA alleles and haplotypes in Tamil-speaking South Indian populations: Affinities with Spanish and Austronesian. Russ J Genet. 2020;56:1139-50.
  28. , , . A study of the association of childhood asthma with HLA alleles in the population of Siliguri, West Bengal, India. Tissue Antigens. 2014;84:316-20.
  29. , , , , , . Distribution of HLA genes and disease predisposition in Bengali speaking people from India. Indian J Transplantation. 2015;9:2-6.
  30. , , , . Analysis of HLA-A, HLA-B and HLA-DRB1 allelic frequencies in tertiary care from Telangana and Andhra Pradesh. J Med Sci Res. 2014;2:140-4.
  31. , , , , , , et al. Frequency analysis of HLA-B allele in leukemia patients from a North Indian population: A case-control study. Meta Gene. 2021;27:100842.
  32. , , , , . Novel and extended HLA class I and II alleles encountered in Kashmiri Brahmin population from North India. HLA. 2020;96:487-9.
  33. , , , , , , et al. Diverse human leukocyte antigen association of type 1 diabetes in north India. J Diabetes. 2019;11:719-28.
  34. , , . HLA-A, B, Cw, DRB1 and DQB1 alleles in multiple sclerosis patients in India. Int J Hum Genet. 2012;12:37-40.
  35. , , , , , , et al. HLA associations in South Asian multiple sclerosis. Mult Scler. 2016;22:19-24.
  36. , . Detection of HLA class II alleles in the Muslim population of South India. The Res–June.. 2019;5(2):21-29.
  37. , , . Genetic diversity through human leukocyte antigen typing in end-stage renal disease patients and prospective donors of North India. Indian J Pathol Microbiol. 2016;59:59-62.
  38. , , , , . Caucasian and Asian specific rheumatoid arthritis risk loci reveal limited replication and apparent allelic heterogeneity in north Indians. PLoS One. 2012;7:e31584.
  39. , , , , . Association of Human Leucocyte Antigen (HLA) class II with systemic lupus erythematosis (SLE) patients from western India. Meta Gene. 2018;16:230-3.
  40. , , , , , . The Austroasiatic Munda population from India and Its enigmatic origin: A HLA diversity study. Hum Biol. 2011;83:405-35.
  41. , , , , , . A novel HLA‐DPB1 allele, DPB1*125:01, identified by sequence‐based typing in an Indian individual. Tissue Antigens. 2011;77:85-7.
  42. , , , . HLA–DQ genotyping in celiac disease in western India. Trop Gastroenterol. 2015;36:174-178.
  43. , , , , , , et al. HLA-DQ haplotypes in 15 different populations. In: Major histocompatibility complex. Springer Japan; p. 412-26.
  44. , , , , , . Haplotype analysis of HLA‐A, ‐B antigens and ‐DRB1 alleles in south Indian HIV‐1‐infected patients with and without pulmonary tuberculosis. Int J Immunogenetics. 2009;36:129-33.
  45. , , , . HLA haplotype diversity in the South Indian population and its relevance. Indian J Transplantation. 2015;9:138-43.
  46. , , , , , , et al. HLA-DRB1* and DQB1* allele and haplotype diversity in eight tribal populations: Global affinities and genetic basis of diseases in South India. Infect Genet Evol. 2021;89:104685.
  47. , , , , , , et al. Association of HLA‐DR/DQ alleles and haplotypes with nephrotic syndrome. Nephrology. 2016;21:745-52.
  48. , , , , , , et al. Human leucocyte antigen (HLA)-A, -B, -C, -DRB1 and -DQB1 haplotype frequencies from 2491 cord blood units from Tamil speaking population from Tamil Nadu, India. Mol Biol Rep. 2018;45:2821-9.
  49. , , , , . Next Generation Sequencing in HLA haplotype distribution among Telugu speaking population from Andhra Pradesh, India. Hum Immunol. 2018;79:583-4.
  50. , , , , , , et al. Human leukocyte antigen association in systemic sclerosis patients: Our experience at a tertiary care center in North India. Front Immunol. 2023;14:1179514.
  51. , . Role of HLA-A, HLA-B, HLA-DRB1 and HLA-DQB1 alleles in HIV-1 patients with pulmonary tuberculosis co-infection from India. Int J Hum Genet. 2012;12:11-3.
  52. , , , , , , et al. Human leucocyte antigen (HLA)-A, -B, -C, -DRB1 and -DQB1 haplotype frequencies from 2491 cord blood units from Tamil speaking population from Tamil Nadu, India. Mol Biol Rep. 2018;45:2821-9.
  53. , , , , . Extensive studies on polymerase chain reaction-sequence-specific primers (PCR-SSP) based HLA-DRB1* allele profiling in non insulin dependent diabetes mellitus (Indian population). African Journal of Pharmacy and Pharmacology. 2012;6:685-91.
  54. , , , . Taxonomic hierarchy of HLA class I allele sequences. Genes Immun. 1999;1:120-9.
  55. , , , , , et al. Sequence and phylogenetic analysis of the untranslated promoter regions for HLA class I Genes. J Immunol. 2017;198:2320-9.
  56. , , . HLA Class I supertype classification based on structural similarity. J Immunol. 2023;210:103-14.
  57. , . Estimating HLA disease associations using similarity trees. bioRxiv. 2018:408302.
  58. , , . Clustering HLA class I superfamilies using structural interaction patterns. PLoS One. 2014;9:e86655.
  59. , , , , , , et al. Evolutionary basis of HLA-DPB1 alleles affects acute GVHD in unrelated donor stem cell transplantation. Blood. 2018;131:808-17.
  60. , . Study of the HLA class II allele polymorphism and phylogenetic analysis in Vojvodina population. Genetika. 2011;47:412-6.
  61. , . Matching donor and recipient based on predicted indirectly recognizable human leucocyte antigen epitopes. Int J Immunogenet. 2018;45:41-53.
  62. , . PIRCHE-II: An algorithm to predict indirectly recognizable HLA epitopes in solid organ transplantation. Immunogenetics. 2020;72:119-2.