DOI: 10.1101/2023.05.24.542206

The T2T-CHM13 reference genome has more accurate sequences for immunoglobulin genes than GRCh38

J. Nie, J. Tellier, I. Tarasova, S. L. Nutt, G. K. Smyth
Background: The production of antibodies by cells of the B-cell lineage is essential for protective immunity. The clonal selection theory posits that each mature B cell has a unique immunoglobulin receptor, generated through random gene recombination, and that, when stimulated to differentiate into an antibody-secreting cell, it can produce antibody of only a single specificity. Under this classical "one cell, one antibody" dogma, single-cell RNA sequencing data from antibody-secreting cells should show each cell expressing only a single form of each of the immunoglobulin heavy and light chains. However, when GRCh38 is used as the reference genome, many plasma cells appear to express multiple immunoglobulin isotypes.

Results: We show that this apparent multi-isotype expression is a mapping artifact caused by inaccurate immunoglobulin sequences in GRCh38. The recently published human reference genome T2T-CHM13, with its more accurate sequences, avoids this false mapping. In addition, reads that map ambiguously to GRCh38 are resolved cleanly with T2T-CHM13.

Conclusions: These results show that the immunoglobulin gene sequences in T2T-CHM13 are more accurate than those in GRCh38. T2T-CHM13 is therefore the reference genome of choice for accurate mapping and identification of human immunoglobulin genes.

Keywords: Reference genome, T2T-CHM13, GRCh38, Immunoglobulin genes, Clonal selection theory