Circular consensus sequencing

Circular consensus sequencing (CCS) is a DNA sequencing method used in conjunction with single-molecule real-time sequencing to yield highly accurate long-read datasets, with read lengths averaging 15–25 kb and median accuracy greater than 99.9%.[1][2] Each long read is a consensus sequence built from multiple passes over a single DNA molecule. These reads can improve results for complex applications such as single nucleotide and structural variant detection, genome assembly, assembly of difficult polyploid or highly repetitive genomes, and assembly of metagenomes.[3]

CCS allows resolution of large or complex genomes of any species, such as the California redwood genome, which is nine times the size of the human genome. It also supports high-precision detection of variants ranging from single nucleotide variants (SNVs) to structural variants.[4][5] CCS also enables separation of the different copies of each chromosome (e.g., maternal and paternal for diploid organisms), known as haplotypes. CCS reads offer accuracy equivalent to short-read sequencing data, but with the length necessary for complex genome assemblies and phasing of variants across the genome.[6][7]

Technology

Revio SMRT cell.

In this method, circularized fragments of DNA in solution float across the surface of a nanofluidic chip called a SMRT (Single Molecule, Real-Time) Cell. The surface of the chip is covered with millions of wells called zero-mode waveguides (ZMWs), each tens of nanometers wide.[8] To prepare a sample for CCS/HiFi sequencing, primers and DNA polymerase are added to SMRTbell libraries. A circularized DNA molecule becomes trapped in a ZMW, nucleotides are added, and the DNA polymerase begins to copy the molecule base by base. Each incorporation releases a tiny amount of light, which is read by a detector and allows the sequencer’s computer to determine the order of bases in the sample. Because the template is circular, the polymerase sequences the same molecule in repeated passes, which is the origin of the name “circular” consensus sequencing. The primers and adapters are then removed computationally, and the passes are combined into a single, highly accurate consensus read.[9]
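The idea of combining repeated passes into one consensus read can be illustrated with a minimal Python sketch. It is a simplification rather than the algorithm used by production software: it assumes the passes have already been aligned to equal length and takes a plain majority vote at each position, whereas real CCS pipelines must first align the passes and handle insertions and deletions. The function name `majority_consensus` and the example sequences are hypothetical.

```python
from collections import Counter


def majority_consensus(passes):
    """Return the column-wise majority base over equal-length, pre-aligned passes.

    Illustrative only: assumes every pass covers the same positions and that
    errors are substitutions, which is not true of real sequencing data.
    """
    if not passes:
        raise ValueError("at least one pass is required")
    length = len(passes[0])
    if any(len(p) != length for p in passes):
        raise ValueError("all passes must have the same length")

    consensus = []
    for column in zip(*passes):
        # Pick the most frequent base observed at this position.
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Three hypothetical passes over the same molecule, each with at most one error.
example_passes = ["ACGTAC", "ACGTTC", "ACCTAC"]
example_consensus = majority_consensus(example_passes)
print(example_consensus)
```

In this toy example each error appears in only one pass, so the vote at every position recovers the same base, which is why additional passes tend to raise consensus accuracy.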

In CCS, the genomic DNA is prepared without amplification such that individual base modifications such as methylation can be detected during sequencing. This allows for the capture of both sequence and valuable methylation information in a single experiment.[10]

History

This sequencing method was first described by Travers et al. in Nucleic Acids Research in 2010.[3] It was later commercialized by Pacific Biosciences in 2018 and made available on Sequel II and Revio long-read sequencing instruments.[11][12]

CCS technology has subsequently been used in numerous studies across several fields, including human telomere-to-telomere whole-genome assembly and pangenome research,[13][14][15] pediatric rare disease genomic analysis,[16][17] DNA methylation in a rare disease cohort,[18] assembly of whole genomes of non-human vertebrates,[19] assembly of whole genomes of agriculturally significant species,[20] analysis of cancer genomes,[21][22] and metagenomics and microbial research, among others.[23][24]

Recognizing the importance of this technology in future genomic exploration and discovery, the editors of Nature Methods named long-read sequencing technology its method of the year for 2022.[25]

Applications

Human and conservation biology

CCS can be useful to researchers seeking to perform de novo genome assembly or to study haplotype-phased sequences from each chromosomal copy, regardless of how many chromosomes are present in the species. Many biodiversity-oriented consortia have used the technology in their conservation biology studies, including the African BioGenome Project, California Conservation Genomics Project, Darwin Tree of Life, Desert Agriculture Initiative, Earth BioGenome Project, Global Ant Genomics Alliance, Human Pangenome, Telomere-to-Telomere Consortium, the 10,000 Fish Genomes Project, and Vertebrate Genomes Project.[26][27][28]

Human health

Circular consensus sequencing is helping researchers identify and characterize rare or structural variants with high confidence to better identify the underlying genomics of a given phenotype, with numerous applications to human health including rare disease research, microbiology and infectious disease, cancer research, and other genetic disease research areas.[29][30]

Rare diseases

Although each occurs with low frequency in the human population, rare diseases are collectively common, and most have a genetic cause, presenting unique diagnostic challenges. An estimated 50–80% of structural variants are tandem repeats.[31]

Because CCS provides a comprehensive view of variation in the human genome, producing complete, accurate, and phased assemblies for variant calling and for identifying repeat expansions and medically relevant interruption sequences, it is enabling the identification of causative pathogenic variants and helping researchers discover novel disease-associated genes.[32]

Microbiology and infectious diseases

Circular consensus sequencing can rapidly identify emerging pathogens and detect changes in pathogen genomes as part of regional or global surveillance operations. Where other molecular technologies for public health surveillance may require re-validation or the development of new panels, the unbiased nature of circular consensus sequencing delivers comprehensive genetic information to further characterize global outbreaks, pandemics, and epidemics.[12]

Cancer research

Comprehensive resolution of structural variants enables researchers to better study and detect somatic variants driving cancer. Because of their size (>50 bp), structural variants and tandem repeats account for much genomic variation between individuals.[33]

Long-read RNA sequencing can be useful in cancer research to uncover the alternative splicing and fusion events that drive cancer growth.[34][35][36][37] CCS also has an advantage over other sequencing technologies in that it can provide phasing information for expressed mutations.[38]

References

  1. ^ Mastrorosa, Francesco Kumara; Miller, Danny E.; Eichler, Evan E. (2023-06-14). "Applications of long-read sequencing to Mendelian genetics". Genome Medicine. 15 (1): 42. doi:10.1186/s13073-023-01194-3. ISSN 1756-994X. PMC 10266321. PMID 37316925.
  2. ^ Wenger, Aaron M.; Peluso, Paul; Rowell, William J.; Chang, Pi-Chuan; Hall, Richard J.; Concepcion, Gregory T.; Ebler, Jana; Fungtammasan, Arkarachai; Kolesnikov, Alexey; Olson, Nathan D.; Töpfer, Armin (2019-10-12). "Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome". Nature Biotechnology. 37 (10): 1155–1162. doi:10.1038/s41587-019-0217-9. ISSN 1546-1696. PMC 6776680. PMID 31406327.
  3. ^ a b Travers, K. J.; Chin, C.-S.; Rank, D. R.; Eid, J. S.; Turner, S. W. (2010-08-01). "A flexible and efficient template format for circular consensus sequencing and SNP detection". Nucleic Acids Research. 38 (15): e159. doi:10.1093/nar/gkq543. ISSN 0305-1048. PMC 2926623. PMID 20571086.
  4. ^ Sharma, Priyanka; Masouleh, Ardashir Kharabian; Topp, Bruce; Furtado, Agnelo; Henry, Robert J. (February 2022). "De novo chromosome level assembly of a plant genome from long read sequence data". The Plant Journal. 109 (3): 727–736. doi:10.1111/tpj.15583. ISSN 0960-7412. PMC 9300133. PMID 34784084.
  5. ^ Cheng, Haoyu; Concepcion, Gregory T; Feng, Xiaowen; Zhang, Haowen; Li, Heng (2021-02-01). "Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm". Nature Methods. 18 (2): 170–175. doi:10.1038/s41592-020-01056-5. ISSN 1548-7091. PMC 7961889. PMID 33526886.
  6. ^ Cheng, Haoyu; Concepcion, Gregory T.; Feng, Xiaowen; Zhang, Haowen; Li, Heng (2021-02-01). "Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm". Nature Methods. 18 (2): 170–175. doi:10.1038/s41592-020-01056-5. ISSN 1548-7105. PMC 7961889. PMID 33526886.
  7. ^ Nurk, Sergey; Walenz, Brian P.; Rhie, Arang; Vollger, Mitchell R.; Logsdon, Glennis A.; Grothe, Robert; Miga, Karen H.; Eichler, Evan E.; Phillippy, Adam M.; Koren, Sergey (2020-09-01). "HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads". Genome Research. 30 (9): 1291–1305. doi:10.1101/gr.263566.120. ISSN 1088-9051. PMC 7545148. PMID 32801147.
  8. ^ Eid, John; Fehr, Adrian; Gray, Jeremy; Luong, Khai; Lyle, John; Otto, Geoff; Peluso, Paul; Rank, David; Baybayan, Primo; Bettman, Brad; Bibillo, Arkadiusz; Bjornson, Keith; Chaudhuri, Bidhan; Christians, Frederick; Cicero, Ronald (2009-01-02). "Real-Time DNA Sequencing from Single Polymerase Molecules". Science. 323 (5910): 133–138. Bibcode:2009Sci...323..133E. doi:10.1126/science.1162986. ISSN 0036-8075. PMID 19023044. S2CID 54488479.
  9. ^ Travers, K. J.; Chin, C.-S.; Rank, D. R.; Eid, J. S.; Turner, S. W. (2010-06-22). "A flexible and efficient template format for circular consensus sequencing and SNP detection". Nucleic Acids Research. 38 (15): e159. doi:10.1093/nar/gkq543. ISSN 0305-1048. PMC 2926623. PMID 20571086.
  10. ^ Flusberg, Benjamin A.; Webster, Dale R.; Lee, Jessica H.; Travers, Kevin J.; Olivares, Eric C.; Clark, Tyson A.; Korlach, Jonas; Turner, Stephen W. (2010-05-09). "Direct detection of DNA methylation during single-molecule, real-time sequencing". Nature Methods. 7 (6): 461–465. doi:10.1038/nmeth.1459. ISSN 1548-7105. PMC 2879396. PMID 20453866.
  11. ^ Wenger, Aaron M.; Peluso, Paul; Rowell, William J.; Chang, Pi-Chuan; Hall, Richard J.; Concepcion, Gregory T.; Ebler, Jana; Fungtammasan, Arkarachai; Kolesnikov, Alexey; Olson, Nathan D.; Töpfer, Armin; Alonge, Michael; Mahmoud, Medhat; Qian, Yufeng; Chin, Chen-Shan (2019-08-12). "Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome". Nature Biotechnology. 37 (10): 1155–1162. doi:10.1038/s41587-019-0217-9. ISSN 1546-1696. PMC 6776680. PMID 31406327. S2CID 199542686.
  12. ^ a b Oehler, Josephine B.; Wright, Helen; Stark, Zornitza; Mallett, Andrew J.; Schmitz, Ulf (2023-08-08). "The application of long-read sequencing in clinical settings". Human Genomics. 17 (1): 73. doi:10.1186/s40246-023-00522-3. ISSN 1479-7364. PMC 10410870. PMID 37553611.
  13. ^ Jarvis, Erich D.; Formenti, Giulio; Rhie, Arang; Guarracino, Andrea; Yang, Chentao; Wood, Jonathan; Tracey, Alan; Thibaud-Nissen, Francoise; Vollger, Mitchell R.; Porubsky, David; Cheng, Haoyu; Asri, Mobin; Logsdon, Glennis A.; Carnevali, Paolo; Chaisson, Mark J. P. (2022-11-22). "Semi-automated assembly of high-quality diploid human reference genomes". Nature. 611 (7936): 519–531. Bibcode:2022Natur.611..519J. doi:10.1038/s41586-022-05325-5. ISSN 1476-4687. PMC 9668749. PMID 36261518.
  14. ^ Nurk, Sergey; Koren, Sergey; Rhie, Arang; Rautiainen, Mikko; Bzikadze, Andrey V.; Mikheenko, Alla; Vollger, Mitchell R.; Altemose, Nicolas; Uralsky, Lev; Gershman, Ariel; Aganezov, Sergey; Hoyt, Savannah J.; Diekhans, Mark; Logsdon, Glennis A.; Alonge, Michael (2022-03-31). "The complete sequence of a human genome". Science. 376 (6588): 44–53. Bibcode:2022Sci...376...44N. doi:10.1126/science.abj6987. ISSN 0036-8075. PMC 9186530. PMID 35357919.
  15. ^ Gao, Yang; Yang, Xiaofei; Chen, Hao; Tan, Xinjiang; Yang, Zhaoqing; Deng, Lian; Wang, Baonan; Kong, Shuang; Li, Songyang; Cui, Yuhang; Lei, Chang; Wang, Yimin; Pan, Yuwen; Ma, Sen; Sun, Hao (2023-06-14). "A pangenome reference of 36 Chinese populations". Nature. 619 (7968): 112–121. Bibcode:2023Natur.619..112G. doi:10.1038/s41586-023-06173-7. ISSN 1476-4687. PMC 10322713. PMID 37316654.
  16. ^ Cohen, Ana S.A.; Farrow, Emily G.; Abdelmoity, Ahmed T.; Alaimo, Joseph T.; Amudhavalli, Shivarajan M.; Anderson, John T.; Bansal, Lalit; Bartik, Lauren; Baybayan, Primo; Belden, Bradley; Berrios, Courtney D.; Biswell, Rebecca L.; Buczkowicz, Pawel; Buske, Orion; Chakraborty, Shreyasee (June 2022). "Genomic answers for children: Dynamic analyses of >1000 pediatric rare disease genomes". Genetics in Medicine. 24 (6): 1336–1348. doi:10.1016/j.gim.2022.02.007. PMID 35305867. S2CID 263467538.
  17. ^ Sanford Kobayashi, Erica; Batalov, Serge; Wenger, Aaron M.; Lambert, Christine; Dhillon, Harsharan; Hall, Richard J.; Baybayan, Primo; Ding, Yan; Rego, Seema; Wigby, Kristen; Friedman, Jennifer; Hobbs, Charlotte; Bainbridge, Matthew N. (2022-10-09). "Approaches to long-read sequencing in a clinical setting to improve diagnostic rate". Scientific Reports. 12 (1): 16945. Bibcode:2022NatSR..1216945S. doi:10.1038/s41598-022-20113-x. ISSN 2045-2322. PMC 9548499. PMID 36210382.
  18. ^ Cheung, Warren A.; Johnson, Adam F.; Rowell, William J.; Farrow, Emily; Hall, Richard; Cohen, Ana S. A.; Means, John C.; Zion, Tricia N.; Portik, Daniel M.; Saunders, Christopher T.; Koseva, Boryana; Bi, Chengpeng; Truong, Tina K.; Schwendinger-Schreck, Carl; Yoo, Byunggil (2023-05-29). "Direct haplotype-resolved 5-base HiFi sequencing for genome-wide profiling of hypermethylation outliers in a rare disease cohort". Nature Communications. 14 (1): 3090. Bibcode:2023NatCo..14.3090C. doi:10.1038/s41467-023-38782-1. ISSN 2041-1723. PMC 10226990. PMID 37248219.
  19. ^ Rhie, Arang; McCarthy, Shane A.; Fedrigo, Olivier; Damas, Joana; Formenti, Giulio; Koren, Sergey; Uliano-Silva, Marcela; Chow, William; Fungtammasan, Arkarachai; Kim, Juwan; Lee, Chul; Ko, Byung June; Chaisson, Mark; Gedman, Gregory L.; Cantin, Lindsey J. (2021-04-29). "Towards complete and error-free genome assemblies of all vertebrate species". Nature. 592 (7856): 737–746. Bibcode:2021Natur.592..737R. doi:10.1038/s41586-021-03451-0. ISSN 0028-0836. PMC 8081667. PMID 33911273.
  20. ^ Chen, Jian; Wang, Zijian; Tan, Kaiwen; Huang, Wei; Shi, Junpeng; Li, Tong; Hu, Jiang; Wang, Kai; Wang, Chao; Xin, Beibei; Zhao, Haiming; Song, Weibin; Hufford, Matthew B.; Schnable, James C.; Jin, Weiwei (July 2023). "A complete telomere-to-telomere assembly of the maize genome". Nature Genetics. 55 (7): 1221–1231. doi:10.1038/s41588-023-01419-6. ISSN 1546-1718. PMC 10335936. PMID 37322109.
  21. ^ Veiga, Diogo F. T.; Nesta, Alex; Zhao, Yuqi; Mays, Anne Deslattes; Huynh, Richie; Rossi, Robert; Wu, Te-Chia; Palucka, Karolina; Anczukow, Olga; Beck, Christine R.; Banchereau, Jacques (2022-01-21). "A comprehensive long-read isoform analysis platform and sequencing resource for breast cancer". Science Advances. 8 (3): eabg6711. Bibcode:2022SciA....8.6711V. doi:10.1126/sciadv.abg6711. ISSN 2375-2548. PMC 8769553. PMID 35044822.
  22. ^ Choy, L Y Lois; Peng, Wenlei; Jiang, Peiyong; Cheng, Suk Hang; Yu, Stephanie C Y (19 May 2022). "Single-Molecule Sequencing Enables Long Cell-Free DNA Detection and Direct Methylation Analysis for Cancer Patients". Clinical Chemistry. 68 (9): 1151–1163. doi:10.1093/clinchem/hvac086. PMID 35587130.
  23. ^ Reiter, Taylor E.; Brown, C. Titus (2022-03-22). "MAGs achieve lineage resolution". Nature Microbiology. 7 (2): 193–194. doi:10.1038/s41564-021-01027-2. ISSN 2058-5276. PMID 34980920. S2CID 245653539.
  24. ^ Oyewole, Oluwaseun Rume-Abiola; Latzin, Philipp; Brugger, Silvio D.; Hilty, Markus (2022-09-22). "Strain-level resolution and pneumococcal carriage dynamics by single-molecule real-time (SMRT) sequencing of the plyNCR marker: a longitudinal study in Swiss infants". Microbiome. 10 (1): 152. doi:10.1186/s40168-022-01344-6. ISSN 2049-2618. PMC 9502908. PMID 36138483.
  25. ^ Marx, Vivien (2023-01-12). "Method of the year: long-read sequencing". Nature Methods. 20 (1): 6–11. doi:10.1038/s41592-022-01730-w. ISSN 1548-7105. PMID 36635542. S2CID 255773787.
  26. ^ Nurk, Sergey; Koren, Sergey; Rhie, Arang; Rautiainen, Mikko; Bzikadze, Andrey V.; Mikheenko, Alla; Vollger, Mitchell R.; Altemose, Nicolas; Uralsky, Lev; Gershman, Ariel; Aganezov, Sergey; Hoyt, Savannah J.; Diekhans, Mark; Logsdon, Glennis A.; Alonge, Michael (2022-03-31). "The complete sequence of a human genome". Science. 376 (6588): 44–53. Bibcode:2022Sci...376...44N. doi:10.1126/science.abj6987. ISSN 0036-8075. PMC 9186530. PMID 35357919.
  27. ^ Aganezov, Sergey; Yan, Stephanie M.; Soto, Daniela C.; Kirsche, Melanie; Zarate, Samantha; Avdeyev, Pavel; Taylor, Dylan J.; Shafin, Kishwar; Shumate, Alaina; Xiao, Chunlin; Wagner, Justin; McDaniel, Jennifer; Olson, Nathan D.; Sauria, Michael E. G.; Vollger, Mitchell R. (2022-04-01). "A complete reference genome improves analysis of human genetic variation". Science. 376 (6588): eabl3533. doi:10.1126/science.abl3533. ISSN 0036-8075. PMC 9336181. PMID 35357935.
  28. ^ Vollger, Mitchell R.; Guitart, Xavi; Dishuck, Philip C.; Mercuri, Ludovica; Harvey, William T.; Gershman, Ariel; Diekhans, Mark; Sulovari, Arvis; Munson, Katherine M.; Lewis, Alexandra P.; Hoekzema, Kendra; Porubsky, David; Li, Ruiyang; Nurk, Sergey; Koren, Sergey (2022-04-01). "Segmental duplications and their variation in a complete human genome". Science. 376 (6588): eabj6965. doi:10.1126/science.abj6965. ISSN 0036-8075. PMC 8979283. PMID 35357917.
  29. ^ Wenger, Aaron M.; Peluso, Paul; Rowell, William J.; Chang, Pi-Chuan; Hall, Richard J.; Concepcion, Gregory T.; Ebler, Jana; Fungtammasan, Arkarachai; Kolesnikov, Alexey; Olson, Nathan D.; Töpfer, Armin; Alonge, Michael; Mahmoud, Medhat; Qian, Yufeng; Chin, Chen-Shan (2019-08-12). "Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome". Nature Biotechnology. 37 (10): 1155–1162. doi:10.1038/s41587-019-0217-9. ISSN 1546-1696. PMC 6776680. PMID 31406327.
  30. ^ Salk, Jesse J.; Schmitt, Michael W.; Loeb, Lawrence A. (2018-03-26). "Enhancing the accuracy of next-generation sequencing for detecting rare and subclonal mutations". Nature Reviews. Genetics. 19 (5): 269–285. doi:10.1038/nrg.2017.117. PMC 6485430. PMID 29576615.
  31. ^ English, Adam C.; Menon, Vipin K.; Gibbs, Richard A.; Metcalf, Ginger A.; Sedlazeck, Fritz J. (2022-12-27). "Truvari: refined structural variant comparison preserves allelic diversity". Genome Biology. 23 (1): 271. doi:10.1186/s13059-022-02840-6. ISSN 1474-760X. PMC 9793516. PMID 36575487.
  32. ^ "Customer Success Story: Experts at Children's Mercy Kansas City Turn to Long-Read Whole Genome Sequencing to Find Answers for Rare Diseases". PacBio. Retrieved 2023-11-02.
  33. ^ Ebert, Peter; Audano, Peter A.; Zhu, Qihui; Rodriguez-Martin, Bernardo; Porubsky, David; Bonder, Marc Jan; Sulovari, Arvis; Ebler, Jana; Zhou, Weichen; Serra Mari, Rebecca; Yilmaz, Feyza; Zhao, Xuefang; Hsieh, PingHsun; Lee, Joyce; Kumar, Sushant (2021-04-02). "Haplotype-resolved diverse human genomes and integrated analysis of structural variation". Science. 372 (6537): eabf7117. doi:10.1126/science.abf7117. ISSN 1095-9203. PMC 8026704. PMID 33632895.
  34. ^ Veiga, Diogo F. T.; Nesta, Alex; Zhao, Yuqi; Deslattes Mays, Anne; Huynh, Richie; Rossi, Robert; Wu, Te-Chia; Palucka, Karolina; Anczukow, Olga; Beck, Christine R.; Banchereau, Jacques (2022-01-21). "A comprehensive long-read isoform analysis platform and sequencing resource for breast cancer". Science Advances. 8 (3): eabg6711. Bibcode:2022SciA....8.6711V. doi:10.1126/sciadv.abg6711. ISSN 2375-2548. PMC 8769553. PMID 35044822.
  35. ^ Mikheenko, Alla; Prjibelski, Andrey D.; Joglekar, Anoushka; Tilgner, Hagen U. (2022-04-01). "Sequencing of individual barcoded cDNAs using Pacific Biosciences and Oxford Nanopore Technologies reveals platform-specific error patterns". Genome Research. 32 (4): 726–737. doi:10.1101/gr.276405.121. PMC 8997348. PMID 35301264.
  36. ^ Nattestad, Maria; Goodwin, Sara; Ng, Karen; Baslan, Timour; Sedlazeck, Fritz J.; Rescheneder, Philipp; Garvin, Tyler; Fang, Han; Gurtowski, James; Hutton, Elizabeth; Tseng, Elizabeth; Chin, Chen-Shan; Beck, Timothy; Sundaravadanam, Yogi; Kramer, Melissa (2018-08-28). "Complex rearrangements and oncogene amplifications revealed by long-read DNA and RNA sequencing of a breast cancer cell line". Genome Research. 28 (8): 1126–1135. doi:10.1101/gr.231100.117. ISSN 1549-5469. PMC 6071638. PMID 29954844.
  37. ^ Miller, Anthony R.; Wijeratne, Saranga; McGrath, Sean D.; Schieffer, Kathleen M.; Miller, Katherine E.; Lee, Kristy; Mathew, Mariam; LaHaye, Stephanie; Fitch, James R.; Kelly, Benjamin J.; White, Peter; Mardis, Elaine R.; Wilson, Richard K.; Cottrell, Catherine E.; Magrini, Vincent (December 2022). "Pacific Biosciences Fusion and Long Isoform Pipeline for Cancer Transcriptome–Based Resolution of Isoform Complexity". The Journal of Molecular Diagnostics. 24 (12): 1292–1306. doi:10.1016/j.jmoldx.2022.09.003. PMID 36191838. S2CID 252653559.
  38. ^ Olson, Nathan D.; Wagner, Justin; McDaniel, Jennifer; Stephens, Sarah H.; Westreich, Samuel T.; Prasanna, Anish G.; Johanson, Elaine; Boja, Emily; Maier, Ezekiel J.; Serang, Omar; Jáspez, David; Lorenzo-Salazar, José M.; Muñoz-Barrera, Adrián; Rubio-Rodríguez, Luis A.; Flores, Carlos (2022-05-11). "PrecisionFDA Truth Challenge V2: Calling variants from short and long reads in difficult-to-map regions". Cell Genomics. 2 (5): 100129. doi:10.1016/j.xgen.2022.100129. ISSN 2666-979X. PMC 9205427. PMID 35720974.