Computational approaches to facilitate epitope-based HLA matching in solid organ transplantation
M. Niemann (Berlin, DE)
Epitope-based HLA matching has emerged over the last few years as an improved method for HLA matching in solid organ transplantation. The epitope-based matching concept has been incorporated in both the PIRCHE-II and the HLAmatchmaker algorithms to find the most suitable donor for a recipient. These algorithms require high-resolution HLA genotype data of both donor and recipient. Since high-resolution HLA genotype data are often not available, we developed a computational method which allows epitope-based HLA matching from serological split level HLA typing, relying on HLA haplotype frequencies. To validate this method, we simulated a donor-recipient population for which PIRCHE-II and eplet values were calculated using both high-resolution HLA genotype data and serological split level HLA typing. The majority of the ln(PIRCHE-II)/ln(eplet) values determined from serological split level HLA typing deviate little or not at all from the reference values determined from high-resolution HLA genotype data. This deviation increased slightly when HLA-C or HLA-DQ was omitted from the input and decreased substantially when two-field resolution HLA genotype data of the recipient were combined with serological split level HLA typing of the donor. Thus, our data suggest that our computational approach is a powerful tool to estimate PIRCHE-II/eplet values when high-resolution HLA genotype data are not available.
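The frequency-weighted idea described above can be illustrated with a minimal single-locus sketch. It is not the PIRCHE-II or HLAmatchmaker algorithm: the allele frequencies, the epitope sets and the names `ALLELE_FREQ`, `EPITOPES`, `mismatch_count` and `expected_mismatch` are hypothetical placeholders, and a real implementation would presumably work on multi-locus haplotype frequencies.

```python
from itertools import product

# Hypothetical frequencies of high-resolution alleles within each serological
# split antigen (illustrative values only, not real population data).
ALLELE_FREQ = {
    "A2": {"A*02:01": 0.9, "A*02:05": 0.1},
    "A1": {"A*01:01": 1.0},
}

# Hypothetical epitope sets per allele (placeholders, not real eplets).
EPITOPES = {
    "A*02:01": {"e1", "e2"},
    "A*02:05": {"e1", "e3"},
    "A*01:01": {"e4"},
}


def mismatch_count(donor_allele, recipient_allele):
    """Toy score: number of donor epitopes absent from the recipient."""
    return len(EPITOPES[donor_allele] - EPITOPES[recipient_allele])


def expected_mismatch(donor_sero, recipient_sero):
    """Average the toy score over every high-resolution allele pair that is
    compatible with the serological typings, weighted by allele frequency."""
    total = 0.0
    donor_options = ALLELE_FREQ[donor_sero].items()
    recipient_options = ALLELE_FREQ[recipient_sero].items()
    for (d_allele, d_freq), (r_allele, r_freq) in product(donor_options, recipient_options):
        total += d_freq * r_freq * mismatch_count(d_allele, r_allele)
    return total


if __name__ == "__main__":
    print(expected_mismatch("A2", "A1"))
    print(expected_mismatch("A2", "A2"))
```

Under these assumed inputs, a serologically identical pair can still carry a non-zero expected mismatch, which is the kind of deviation from the high-resolution reference that the validation quantifies.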
Ferret, a user-friendly tool to extract data from the 1000 Genomes Project
S. Limou (Nantes, FR)
The 1000 Genomes (1KG) Project provides a near-comprehensive resource on human genetic variation in worldwide reference populations, including high-resolution HLA typing. We implemented a user-friendly Java tool, “Ferret”, to ease community access to the large and complex 1KG genomic data files. Ferret's unique advantages encompass: (1) multiple accepted inputs, such as a locus, gene(s) or SNP(s) of interest; (2) fast extraction of 1KG individual genotype data for SNPs and indels; (3) allelic frequency computation for SNPs, indels and CNVs in 1KG populations; and (4) options for the Exome Sequencing Project populations. By converting the NCBI annotations and the 1KG data into files that can be imported into popular pre-existing tools (e.g. PLINK and HaploView), Ferret offers a straightforward way, even for clinicians and biologists, to manipulate, explore, and merge 1KG data with the user’s dataset, as well as to visualize linkage disequilibrium patterns, infer haplotypes, and design tagSNPs. This tool could therefore empower the immunogenetics community to leverage the 1KG data, especially in the prospect of granting HLA data access. Ferret is publicly available at: http://limousophie35.github.io/Ferret/.
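As an illustration of point (3), per-population allele frequencies can be computed from individual genotypes as sketched below. This is not Ferret's code (Ferret is a Java tool); the function name `allele_frequencies` and the input layout are assumptions made for the example.

```python
from collections import Counter, defaultdict


def allele_frequencies(genotypes, populations):
    """Compute allele frequencies per population for one variant.

    genotypes:   sample id -> (allele1, allele2)
    populations: sample id -> population label
    """
    counts = defaultdict(Counter)
    for sample, alleles in genotypes.items():
        counts[populations[sample]].update(alleles)
    return {
        pop: {allele: n / sum(counter.values()) for allele, n in counter.items()}
        for pop, counter in counts.items()
    }


if __name__ == "__main__":
    # Made-up samples and labels, for illustration only.
    genotypes = {"s1": ("A", "G"), "s2": ("G", "G"), "s3": ("A", "A")}
    populations = {"s1": "POP1", "s2": "POP1", "s3": "POP2"}
    print(allele_frequencies(genotypes, populations))
```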
HLA and the EMR: Developing HL7 FHIR tools for exchanging NGS-based HLA genotyping
M. Maiers (Minneapolis, US)
While principles and standards have recently been developed for exchanging NGS-based HLA genotyping data between typing labs, researchers, and donor registries (MIRING & HML), it is still challenging to interoperate with clinical EHR/EMR and other healthcare systems, especially when the data include clinical genomics and sequencing information. New standards are emerging that use modern approaches for interoperability with health care systems. The most promising standard is Health Level 7 Fast Healthcare Interoperability Resources (HL7 FHIR). HL7 FHIR has developed great momentum in the vendor community, evidenced by the Argonaut Project (argonautproject.org), a private sector initiative of EHR/EMR vendors and health care organizations that aims to advance industry adoption of modern open interoperability standards for sharing electronic health records and is focused on developing a FHIR-based API and Core Data Services. FHIR’s fundamental building block is a resource (e.g. Patient, Observation). These resources account for expected use by 80% of the implementations. The other 20% is accomplished by profiling the resource: constraining existing data elements and introducing domain-specific elements with extensions. Using existing resources and profiles (including those developed for clinical genomics) in HL7 FHIR Standard for Trial Use 3, we have developed a transaction bundle for exchanging HLA typing reports. The scenario includes specimen collection (Specimen), typing labs (Organization), registries (Organization), lab orders for NGS-based HLA typing (Diagnostic Request), and typing reports (Diagnostic Report) that consolidate the genotype reported and the supporting evidence for allele assignment (nested Observations and Sequences). Tools are being developed to support this HLA typing scenario. These include a FHIR-compliant server based on the Java-based HAPI FHIR Server (http://fhirtest.b12x.org/), clients and middleware for creating FHIR resources and profiles for HLA either directly or by conversion from HML, and FHIR-compliant terminology resources for HLA nomenclature.
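The general shape of such a transaction bundle can be sketched as follows. This is a simplified illustration, not the profile developed by the authors: only a few generic elements are filled in, codes are given as free text rather than coded terminology, the order and Sequence resources are omitted, and the helper names `entry` and `hla_typing_bundle` as well as the example references are assumptions.

```python
import json
import uuid


def entry(resource):
    """Wrap a resource as a transaction entry that the server creates via POST."""
    return {
        "fullUrl": f"urn:uuid:{uuid.uuid4()}",
        "resource": resource,
        "request": {"method": "POST", "url": resource["resourceType"]},
    }


def hla_typing_bundle(patient_reference, gl_string):
    """Build a minimal transaction bundle for one HLA typing report."""
    subject = {"reference": patient_reference}
    specimen = entry({"resourceType": "Specimen", "subject": subject})
    lab = entry({"resourceType": "Organization", "name": "Example typing lab"})
    genotype = entry({
        "resourceType": "Observation",
        "status": "final",
        "code": {"text": "HLA genotype"},  # a real profile would use coded terminology
        "subject": subject,
        "specimen": {"reference": specimen["fullUrl"]},
        "performer": [{"reference": lab["fullUrl"]}],
        "valueString": gl_string,
    })
    report = entry({
        "resourceType": "DiagnosticReport",
        "status": "final",
        "code": {"text": "HLA typing report"},
        "subject": subject,
        "specimen": [{"reference": specimen["fullUrl"]}],
        "result": [{"reference": genotype["fullUrl"]}],
    })
    return {
        "resourceType": "Bundle",
        "type": "transaction",
        "entry": [specimen, lab, genotype, report],
    }


if __name__ == "__main__":
    bundle = hla_typing_bundle("Patient/example", "HLA-A*02:01+HLA-A*24:02")
    print(json.dumps(bundle, indent=2))
```

The temporary `urn:uuid:` identifiers let resources inside the bundle refer to each other before the server has assigned permanent ids.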
The kidney transplantation application (KiTapp): a visualization and contextualization tool in a kidney graft patient cohort.
P. Gourraud (Nantes, FR)
Around 10% of adults suffer from chronic kidney disease (CKD). Among those, more than 550,000 patients with advanced CKD have progressed to end-stage renal disease (ESRD) and become candidates for kidney transplantation. With around 200 kidney transplantations performed annually, our institute has had the opportunity to record data on more than 1,500 patients followed since 2008, gathering over 200 clinical and immunological items on the day of the graft and each year after that. How can this amount of data be used in a personalized way, for both patients and practitioners, to inform the decision-making process? KiTapp (Kidney transplantation application) is prototype software designed to place a given patient in population context, facilitating actionable comparisons with his/her peers in the reference cohort. To some extent, KiTapp is a generalized version of the paper-based growth charts used in pediatrics. The goal here is to understand how the immunosuppressive treatment and its intensity could affect the course of a patient’s disease. By comparing the patient with a user-defined sub-cohort, the ultimate objective is to support clinical decisions by trying to anticipate the effect of possible decisions on the patient’s evolution. The development of the app required careful curation of the data. We present the two types of algorithms developed: 1) a populational contextualization, where we compare a patient to the different treatments available, and 2) a referential contextualization, where we compare a patient to defined extreme groups (designed with the help of clinicians) such as acute graft rejection, humoral rejection, cellular rejection or tolerance. KiTapp is presented as a web app displaying a dynamic graphical view of the patient-centered comparisons. From a clinical point of view, this app aims to support clinical decisions by making a large amount of data accessible in an actionable manner. From a technological point of view, the software and algorithms developed here could be applicable to various chronic medical conditions.
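The growth-chart analogy can be made concrete with a small sketch that places one patient's measurement within a user-defined sub-cohort. It is not KiTapp's implementation: the record fields, the values and the function names `select_subcohort` and `percentile_rank` are hypothetical.

```python
def select_subcohort(records, **criteria):
    """Keep the records whose fields match every given criterion."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]


def percentile_rank(value, cohort_values):
    """Percentage of cohort values below the given value (ties count half)."""
    below = sum(v < value for v in cohort_values)
    equal = sum(v == value for v in cohort_values)
    return 100.0 * (below + 0.5 * equal) / len(cohort_values)


if __name__ == "__main__":
    # Made-up cohort: serum creatinine (µmol/L) one year after the graft.
    cohort = [
        {"treatment": "A", "creatinine": 100},
        {"treatment": "A", "creatinine": 120},
        {"treatment": "A", "creatinine": 140},
        {"treatment": "A", "creatinine": 160},
        {"treatment": "B", "creatinine": 90},
    ]
    peers = select_subcohort(cohort, treatment="A")
    print(percentile_rank(130, [r["creatinine"] for r in peers]))
```

Repeating the calculation for each treatment group would correspond to the populational contextualization, and repeating it for each clinician-defined extreme group to the referential one.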
CWD-viewer: a tool to visualise and inspect EFI and other CWD catalogues
A. Sanchez-Mazas (Geneva 4, CH)
The knowledge of which alleles are Common and Well Documented (CWD) is of great relevance to the HLA practitioner in diagnostics and research. The EFI catalogue published in February 2017 extends previous CWD catalogues by increasing the amount of data for Europe and by providing information about the CWD status of alleles at regional geographic levels. It is therefore possible to determine whether a given allele is CWD across all of Europe or only within specific regions. To facilitate access to this EFI CWD catalogue, we have built a web-based interface accessible from any connected device such as computers, tablets or smartphones (http://hla-net.eu/cwd). The design of the interface is tailored to simplify queries to all the CWD catalogue information. Queries, which can include several alleles and loci simultaneously, are either visual or text-based. The results are synoptic tables and maps illustrating the regional status (Common, Well-Documented or other) of each allele and enriched with additional relevant information retrieved from or linked to other databases. In particular, as the CWD information does not reflect the population diversity from the point of view of population genetics, the viewer provides links to maps and charts displaying the frequency distribution of CWD alleles across Europe when such information is available. The catalogue viewer presented here is implemented as a tool that depends both on a data format used to describe the data sources and on a set of computer programs used to query the data and produce statistical summaries and detailed analyses of the CWD data. By providing the full technical description of these formats and the algorithms used, it will also be possible to build equivalent viewers for other specific CWD catalogues. Finally, we discuss conditions for automatic updates of such catalogues.
SNP-HLA Reference Consortium: HLA and SNP data sharing for promoting HLA centric analyses in genomics.
P. Gourraud (Nantes, FR)
SNP-based imputation of HLA alleles is now essential in the analysis of large human genomics studies. Recently, the Haplotype Reference Consortium aggregated over 20 studies to create a very large reference panel of human haplotypes in which ~50M genetic variants are observed from >31,500 sequenced whole genomes (http://www.haplotype-reference-consortium.org/). Genotype imputation improves in accuracy with increasing numbers of sequenced samples. While the number of extensive studies using SNP genotypes has grown extremely fast, much work is still needed to increase the capacity to study HLA alleles and their association with different diseases where SNP genotypes are available. Indeed, it has become much easier to impute HLA alleles from SNPs genotyped in the MHC region. To fill this gap between genomic data availability and HLA allele studies, we propose to create a validated data framework for HLA imputation: the SNP-HLA Reference Consortium. The aim is to facilitate HLA imputation from SNP studies and therefore improve the correlation between SNPs and HLA genotypes. We already have access to large cohorts of over 4,000 ethnically diverse individuals with HLA types and SNPs in high-resolution genotypes. We plan to curate both SNP and HLA data to improve the resources available for training software that performs SNP-based HLA imputation. We plan to make available to the community, through the IHIW websites, large reference panels suitable for easily imputing HLA from SNPs genotyped on chip arrays. We will present this ambitious project at the next HLA and Immunogenetics Workshop in San Francisco next September. To develop this tool, we encourage willing participants with large HLA type + SNP datasets to join the project and contribute since, as seen for SNPs within the genomic community, accuracy increases when data from different ancestry backgrounds are merged.
High resolution haplotype inference for HLA genes from family trios
M. Li (Mountain View, US)
Full resolution genotypes of Human Leukocyte Antigen (HLA) genes can be obtained with next generation sequencing technology (NGS). All 11 major loci can be typed for multiple samples in a single run. Compared to traditional methods of SSO, SSP and Sanger sequencing, NGS delivers more complete coverage of the genome and phased contig sequences to resolve ambiguities. Combining the genotype at 4-field resolution and family pedigree information, haplotypes for HLA genes can be constructed with high accuracy. Here we describe a method for building phased haplotypes from family trios.
Approximately 1500 family trios were typed by this method, and the genotype calls made by the software were manually reviewed based on all the quality metrics. The accuracy of genotyping results at 3- and 4-field resolution was assessed through segregation analysis. Concordance was computed by comparing allele calls of the child to those of the parents. At 3-field resolution, the validated genotyping accuracies for automatic calls are above 98.8% and 98.3% for class I and class II genes respectively. For reviewed calls, the validated accuracies are around 99.9% and 99.8% for class I and class II genes respectively. For family trios, haplotype phase can be determined at multiple loci by applying Mendelian constraints. The child within a trio must share one allele identical by descent at each locus with each of the parents. This information allows resolution of the haplotypes at strongly linked loci. The challenge in constructing haplotype phase arises at loci where all three family members have the same heterozygous genotype. In this case, coalescent-based methods and Hidden Markov models are used to resolve the ambiguous assignment. A database of haplotype frequencies is constructed from the phased haplotypes based on family trios. The method to build the haplotype database with high resolution and accuracy allows linkage analysis to be applied in population and disease association studies.
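The Mendelian constraint and the ambiguous case described above can be sketched as follows. This is a minimal illustration, not the authors' software: the function names `transmission_options` and `phase_trio`, the input layout and the example alleles are assumptions, and loci that stay ambiguous are simply left unresolved rather than passed to a coalescent or Hidden Markov model.

```python
def transmission_options(child, mother, father):
    """List the (maternal allele, paternal allele) assignments of the child's
    genotype that are consistent with Mendelian inheritance at one locus."""
    c1, c2 = child
    options = set()
    for maternal, paternal in ((c1, c2), (c2, c1)):
        if maternal in mother and paternal in father:
            options.add((maternal, paternal))
    return sorted(options)


def phase_trio(trio):
    """Build the child's maternal and paternal haplotypes across loci.

    trio: locus -> {"child": (a, b), "mother": (a, b), "father": (a, b)}
    Loci with two consistent assignments (e.g. all three members share the
    same heterozygous genotype) are left as None.
    """
    maternal, paternal = {}, {}
    for locus, genotypes in trio.items():
        options = transmission_options(
            genotypes["child"], genotypes["mother"], genotypes["father"]
        )
        if not options:
            raise ValueError(f"Mendelian inconsistency at {locus}")
        if len(options) == 1:
            maternal[locus], paternal[locus] = options[0]
        else:
            maternal[locus] = paternal[locus] = None
    return maternal, paternal


if __name__ == "__main__":
    trio = {
        "HLA-A": {"child": ("A*01:01", "A*02:01"),
                  "mother": ("A*01:01", "A*03:01"),
                  "father": ("A*02:01", "A*24:02")},
        "HLA-B": {"child": ("B*07:02", "B*08:01"),
                  "mother": ("B*07:02", "B*08:01"),
                  "father": ("B*07:02", "B*08:01")},
    }
    print(phase_trio(trio))
```

The same consistency check doubles as a segregation test: an empty list of options flags a genotype call that cannot be explained by the parents.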
Community resources for automated annotation of HLA, KIR and beyond
M. Maiers (Minneapolis, US)
New sequencing technologies have increased demand for tools and methods for annotating and analyzing sequence data. The extreme allelic and structural polymorphisms present in HLA and KIR render general genetic variation nomenclatures, as well as those used within the immunogenetics field, only marginally useful for describing 1) consensus sequence with partial phasing, 2) incomplete gene sequence coverage and 3) novel variants, especially intronic variants. In preparation for the 17th IHIWS, we have introduced open source web services that perform automated analysis of NGS consensus sequences and deliver Gene Feature Enumeration (GFE) strings, a computable shorthand description of consensus sequences. This GFE service [http://gfe.b12x.org/] accepts (curated or pre-curated) consensus sequences, performs alignment and annotation, and leverages a simpler system for persisting sequence data called “feature service” [http://feature.nmdp-bioinformatics.org/]. Feature Service has been developed to authoritatively assign a unique identifier to any sequence indexed by its locus (any gene in the list maintained by the Human Genome Organization (HUGO)) and feature (any term in the list maintained by the Sequence Ontology (SO)). We have demonstrated the utility of these services through the analysis of sequences generated from over 500K genotyping results from HLA, KIR, ABO and other blood group antigen gene families with a variety of levels of coverage and phasing. In situations where targeted sequencing is used (e.g. exons only), we have extended and applied the Genotype List format and GL Service (gl.nmdp.org) for representing and persisting information about phase and allelic ambiguity. Applied together, these tools become a new platform for accelerating the development of NGS data analysis for population genetics (LD, HWE), disease association, peptide binding, expression and clinical histocompatibility.
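The idea of assigning one identifier per distinct feature sequence and concatenating those identifiers into a shorthand string can be sketched with an in-memory stand-in. This is not the GFE or Feature Service API: the class `FeatureRegistry`, the function `gfe_string`, the output layout (locus, `w`, then hyphen-separated accession numbers, with `0` for a feature without sequence) and the toy sequences are assumptions made for illustration.

```python
class FeatureRegistry:
    """In-memory stand-in for a feature service: one accession number per
    distinct sequence within each (locus, term, rank) feature."""

    def __init__(self):
        self._accessions = {}

    def accession(self, locus, term, rank, sequence):
        per_feature = self._accessions.setdefault((locus, term, rank), {})
        # A known sequence keeps its number; a new one gets the next number.
        return per_feature.setdefault(sequence.upper(), len(per_feature) + 1)


def gfe_string(registry, locus, features):
    """Build a shorthand string from an ordered list of (term, rank, sequence).

    A feature whose sequence is None (not covered) is written as 0.
    """
    numbers = [
        registry.accession(locus, term, rank, seq) if seq else 0
        for term, rank, seq in features
    ]
    return f"{locus}w" + "-".join(str(n) for n in numbers)


if __name__ == "__main__":
    registry = FeatureRegistry()
    first = [("five_prime_UTR", 1, "ACGT"), ("exon", 1, "ATGGCC"), ("intron", 1, "GTAAGT")]
    variant = [("five_prime_UTR", 1, "ACGT"), ("exon", 1, "ATGGCA"), ("intron", 1, "GTAAGT")]
    print(gfe_string(registry, "HLA-A", first))
    print(gfe_string(registry, "HLA-A", variant))
```

Two consensus sequences that differ in a single feature then differ in exactly one position of the string, which is what makes the notation computable.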
Pacific and European ancestry of Amerindians: a HLA relatedness study in Wiwa (Arsario) Colombian populations.
A. Arnaiz-Villena (Madrid, ES)
The HLA profile of an isolated Amerindian group from North Colombia (Wiwa) was studied in order to draw conclusions about its preventive medicine, its genetic relationship with worldwide populations and the peopling of the Americas, the last of which is hotly debated. Peripheral blood was obtained from volunteer blood donors belonging to the Wiwa (also named Arsario) ethnic group. HLA-A, -B, -C, -DRB1 and -DQB1 genes were analyzed by standard methods. Relationships of Wiwa Amerindians with other populations were calculated using the Arlequin, Dispan and Vista software packages. Extended HLA haplotypes have been studied for the first time in this population. Classical Amerindian haplotypes have been found, as well as new Wiwa (Arsario) Amerindian haplotypes. The new haplotypes are A*68:01 - B*15:01 - C*03:03 - DRB1*14:02 - DQB1*03:02, A*11:01 - B*07:02 - C*07:02 - DRB1*15:03 - DQB1*06:02 and A*68:01 - B*15:01 - C*03:04 - DRB1*14:02 - DQB1*03:01. Conclusions have been reached after exhaustive comparisons of Wiwa with other Amerindians and worldwide populations using genetic distances, neighbor joining trees, correspondence analysis and specific groups of alleles which are common and frequent in both Amerindians and Pacific Islanders. They are: 1) The first inhabitants of the Americas probably came through the Bering Strait and by the Pacific (from Austronesia and Asia) and Atlantic (from Europe) routes. Bidirectional gene flow cannot be ruled out. 2) The genetic HLA Amerindian profile is distinct from that of other world populations. 3) Geographical proximity among Amerindian groups is concordant neither with HLA genetic relatedness nor with language. This may be explained by a substantial population decrease that occurred after Europeans invaded America in 1492 and carried new pathogens and epidemics. 4) Our results are also useful for future preventive medicine (HLA-linked diseases), HLA pharmacogenomics and regional transplantation programs in Wiwa and other Amerindians. https://benthamopen.com/MEDJ/VOLUME/3/
The distribution of alleles and haplotypes in Russian and Tatarian populations of the Samara region of Russia
D. Klyuchnikov (Samara, RU)
In the present work we analyzed HLA-A, -B, -DRB1 alleles and haplotypes of unrelated bone marrow donors of the Samara bone marrow donor registry. HLA typing was performed using SSO and SSP techniques with the aid of OneLambda typing kits at low resolution level. The allele and haplotype frequencies of the samples of Russian (n=1177) and Tatarian (n=85) populations of the Samara region were estimated using the EM algorithm and Arlequin v.3.5 population genetics software. We found a significant difference in the distribution of A*02 (27.6% in the Russian population vs 16.5% in the Tatarian population), B*35 (11.8% vs 16.5%) and B*49 (1.9% vs 5.3%). There was no significant difference in the frequencies of DRB1 alleles. The most frequent 11 haplotypes (frequency >1%) of the Russian population were: A*01-B*08-DRB1*03 (4.25%), A*03-B*07-DRB1*15 (3.61%), A*03-B*35-DRB1*01 (1.95%), A*30-B*13-DRB1*07 (1.78%), A*02-B*07-DRB1*15 (1.70%), A*25-B*18-DRB1*15 (1.61%), A*02-B*13-DRB1*07 (1.53%), A*23-B*44-DRB1*07 (1.27%), A*02-B*15-DRB1*04 (1.19%), A*02-B*41-DRB1*13 (1.15%) and A*02-B*44-DRB1*16 (1.15%). The most frequent 9 haplotypes (frequency >1.5%) of the Tatarian population were: A*03-B*35-DRB1*01 (6.47%), A*25-B*18-DRB1*15 (4.12%), A*01-B*08-DRB1*03 (2.94%), A*02-B*44-DRB1*04 (2.35%), A*03-B*07-DRB1*15 (2.35%), A*24-B*13-DRB1*07 (2.35%), A*03-B*07-DRB1*07 (1.77%), A*23-B*44-DRB1*16 (1.77%) and A*26-B*49-DRB1*15 (1.77%). Knowledge of the allele and haplotype frequencies of different populations can be used in population genetics to establish genetic relationships, in disease association studies, and in developing typing and search strategies in bone marrow donor registries.
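For readers unfamiliar with EM-based haplotype frequency estimation from unphased genotypes, the principle can be sketched as below. This is a generic textbook-style sketch assuming Hardy-Weinberg proportions, not Arlequin's implementation; the function names and the toy two-locus genotypes are assumptions.

```python
from collections import defaultdict
from itertools import product


def haplotype_pairs(genotype):
    """All unordered haplotype pairs compatible with an unphased genotype.

    genotype: one (allele1, allele2) tuple per locus.
    """
    pairs = set()
    for choice in product(*[((a, b), (b, a)) for a, b in genotype]):
        h1 = tuple(c[0] for c in choice)
        h2 = tuple(c[1] for c in choice)
        pairs.add(tuple(sorted((h1, h2))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, iterations=50):
    """Estimate haplotype frequencies by expectation-maximization."""
    expansions = [haplotype_pairs(g) for g in genotypes]
    haplotypes = {h for pairs in expansions for pair in pairs for h in pair}
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}
    for _ in range(iterations):
        counts = defaultdict(float)
        for pairs in expansions:
            # E-step: posterior probability of each compatible haplotype pair.
            weights = [freq[h1] * freq[h2] * (1 if h1 == h2 else 2) for h1, h2 in pairs]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: expected haplotype counts divided by the number of chromosomes.
        n_chromosomes = 2 * len(genotypes)
        freq = {h: counts[h] / n_chromosomes for h in haplotypes}
    return freq


if __name__ == "__main__":
    # Toy data: two double homozygotes and one double heterozygote.
    genotypes = [
        (("A*01", "A*01"), ("B*08", "B*08")),
        (("A*01", "A*01"), ("B*08", "B*08")),
        (("A*01", "A*03"), ("B*08", "B*07")),
    ]
    for hap, f in sorted(em_haplotype_frequencies(genotypes).items()):
        print("-".join(hap), round(f, 4))
```

In the toy data the homozygous donors make A*01-B*08 the favoured phase for the double heterozygote, so the estimate converges towards frequencies of 5/6 for A*01-B*08 and 1/6 for A*03-B*07.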
Confirmation of the rare allele B*35:311 in potential stem cell donor from Sicily
M. Francone (Reggio Calabria, IT)
In the context of the cooperation between donor centre ME01 and transplant centre RC01 for resolving ambiguous results during HLA typing of new donors for inclusion in the Italian Bone Marrow Donor Registry, the rare allele B*35:311 was identified in a Caucasian male born in Sicily. It was first reported by N. Cereb of the Histogenetics Laboratory in 2015 in the IPD-IMGT/HLA Database, in an individual of unknown ethnic origin with the genotype A*01,*24; B*18,*35; C*04,*12; DRB1*03,*11. Only exons 2 and 3, coding for the antigen recognition site, were sequenced. HLA-B*35:311 was until now unconfirmed, and its status in CWD Catalogue v2.0.0 was not defined. There are no references published in the literature. Genomic DNA of the donor was extracted from peripheral blood leukocytes using an automated system. HLA typing was performed by PCR-SSO (Mr.Spot BAG) and SSP (One Lambda) methods. The B*35:311 allele was confirmed by sequencing exons 2 and 3 in forward and reverse directions (SBT ROSE). Within these exons, the most similar allele is B*35:02:01, which differs from B*35:311 at two positions, 45 and 46 of exon 2 (ATG to ACG and GCG to GAG respectively), generating a protein with a different binding site. B*35:311 was submitted to AFND in February 2017. The genotype of the donor was: A*24:02,*68:02; B*35:311,*53:01; C*04:01 homozygous; DRB1*01:02,*11:04. The study will be extended to the parents and sister of the donor in order to separate parental haplotypes. Since HLA matching between donor and recipient is a key factor in the incidence of engraftment and GvHD, detection and communication of rare alleles is important for selecting donors with optimal characteristics.
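The effect of the two codon changes quoted above can be checked with a small sketch. The table contains only the four codons named in the abstract, translated with the standard genetic code; the function name `describe_substitution` and the reading of the changes as going from B*35:02:01 to B*35:311 are assumptions.

```python
# Standard genetic code, restricted to the four codons quoted in the abstract.
CODON_TABLE = {"ATG": "Met", "ACG": "Thr", "GCG": "Ala", "GAG": "Glu"}


def describe_substitution(ref_codon, alt_codon):
    """Report which codon positions differ and the resulting amino acid change."""
    changed = [
        i + 1 for i, (r, a) in enumerate(zip(ref_codon, alt_codon)) if r != a
    ]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    return {
        "changed_positions": changed,
        "ref_aa": ref_aa,
        "alt_aa": alt_aa,
        "nonsynonymous": ref_aa != alt_aa,
    }


if __name__ == "__main__":
    for position, ref, alt in [(45, "ATG", "ACG"), (46, "GCG", "GAG")]:
        print(position, describe_substitution(ref, alt))
```

Both changes affect the second base of the codon and alter the encoded amino acid (Met to Thr and Ala to Glu), consistent with the statement that the resulting protein has a different binding site.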