Considerations on nomenclature and phylogeny

The US government and other national and international health organizations (e.g. Public Health England [PHE]) developed a three-tiered classification system for SARS-CoV-2 variants according to the risks associated with them: increased transmissibility, increased morbidity, increased mortality, ability to evade detection by diagnostic tests, ability to evade natural immunity (e.g. causing reinfections), ability to infect vaccinated individuals, etc. If a SARS-CoV-2 variant meets one or more of these criteria, it is classified as a Variant of Interest (VOI) or Variant under Investigation (VUI); if the risk is validated, it is then classified as a Variant of Concern (VOC). In addition, the World Health Organization (WHO) introduced a nomenclature for these variants that uses letters of the Greek alphabet to simplify discussion of variants. This nomenclature has become very popular in the media.

The correspondence between the two nomenclature systems is as follows:

  • Variants of High Consequence (VHC): no variants have been classified in this category.
  • Variants of Concern (VOC): Alpha, Beta, Gamma, Delta and Epsilon.
  • Variants of Interest (VOI): Eta, Theta, Iota, Kappa and Lambda.
  • Variants under investigation (VUI)

CovidPhy uses the stable nomenclature initially proposed by Gómez-Carballa et al. (2020) during the first phase of the pandemic and provides information on the mutations that determine the three-tiered classification; the popular Greek-alphabet nomenclature is also implemented. It pays special attention to the most important VOCs: Alpha, Beta, Gamma, and Delta. Their sequence motifs are:

Alpha: GAT28280CTA, A28111G, G28048T, C27972T, G24914C, T24506G, C23709T, C23604A, C23271A, TTA21991---, TACATG21765------, T16176C, C15279T, C14676T, TCTGGTTTT11288---------, T6954C, C5986T, C5388A, C3267T, C913T, GGG28881AAC, C14408T, C241T, C3037T, A23403G, A23063T.
Beta: C28253T, C26456T, C25904T, C23664T, G23012A, G22813T, CTTTACTTG22281---------, A21801C, A10323G, G5230T, G174T, A23063T, C1059T, G25563T, C14408T, C241T, C3037T, A23403G.
Gamma: AG28877TC, C28512G, G28167A, C24642T, C23525T, A23063T, G23012A, A22812C, G22132T, G21974T, C21638T, C21621A, C21614T, G17259T, C13860T, C12778T, TCTGGTTTT11288---------, A5648C, C3828T, C2749T, T733C, C14408T, C241T, C3037T, A23403G, GGG28881AAC.
Delta: C23604G, T22917G, C22995A, G210T, G24410A, C25469T, T26767C, T27638C, C27752T, G28881T, C21618G, A28271-, A28461G, AGTTCA22029------, C14408T, C241T, C3037T, A23403G.
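
The motifs above use a compact notation in which each mutation appears to be written as reference base(s), genome position, and alternate base(s), with dashes marking deleted bases. The following Python sketch shows one way such strings could be parsed. The regular expression and the names `parse_mutation` and `parse_motif` are illustrative assumptions, not part of CovidPhy.

```python
import re

# Assumed notation: <reference bases><position><alternate bases or dashes>,
# e.g. "C23604G" (substitution) or "AGTTCA22029------" (deletion).
MUTATION_RE = re.compile(r"^([ACGT]+)(\d+)([ACGT]+|-+)$")


def parse_mutation(token):
    """Split one mutation string into its reference, position and alternate parts."""
    match = MUTATION_RE.match(token.strip())
    if match is None:
        raise ValueError(f"unrecognized mutation: {token!r}")
    ref, pos, alt = match.groups()
    return {
        "ref": ref,
        "pos": int(pos),
        "alt": alt,
        "deletion": alt.startswith("-"),
    }


def parse_motif(text):
    """Parse a comma-separated motif line, ignoring a trailing full stop."""
    tokens = text.strip().rstrip(".").split(",")
    return [parse_mutation(tok) for tok in tokens if tok.strip()]
```

For example, `parse_mutation("TACATG21765------")` yields reference `TACATG`, position `21765`, and a six-dash alternate flagged as a deletion.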

Only sequences containing the full sequence motif are classified into these categories. Following the nomenclature of Gómez-Carballa et al. (2020), Alpha and Gamma are sub-branches of A2a4, Beta derives from A2a2a, and Delta is a new sub-branch of A2a.
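
The full-motif rule can be read as a set-containment test: a sequence receives a label only if every mutation in that label's motif is among the mutations observed in the sequence. The sketch below illustrates this reading in Python for two of the motifs listed above, taking the second and fourth motif lines to be Beta and Delta. The names `MOTIFS`, `to_set` and `classify` are hypothetical, and this is not CovidPhy's actual implementation.

```python
# Motif strings copied from the lists above; the Beta/Delta labels are an assumption.
BETA_MOTIF = (
    "C28253T, C26456T, C25904T, C23664T, G23012A, G22813T, "
    "CTTTACTTG22281---------, A21801C, A10323G, G5230T, G174T, A23063T, "
    "C1059T, G25563T, C14408T, C241T, C3037T, A23403G"
)
DELTA_MOTIF = (
    "C23604G, T22917G, C22995A, G210T, G24410A, C25469T, T26767C, T27638C, "
    "C27752T, G28881T, C21618G, A28271-, A28461G, AGTTCA22029------, "
    "C14408T, C241T, C3037T, A23403G"
)


def to_set(motif):
    """Turn a comma-separated motif string into a set of mutation names."""
    return frozenset(tok.strip() for tok in motif.split(",") if tok.strip())


MOTIFS = {"Beta": to_set(BETA_MOTIF), "Delta": to_set(DELTA_MOTIF)}


def classify(sample_mutations):
    """Return the labels whose complete motif is present in the sample."""
    observed = set(sample_mutations)
    return sorted(label for label, motif in MOTIFS.items() if motif <= observed)
```

A sample carrying every Delta mutation plus additional private mutations would be labeled Delta, while a sample missing even one motif mutation (for instance `A28271-`) would receive no label under this rule.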

Below is the phylogeny by Gómez-Carballa et al. (2020).