Manually verified microbial gene clusters

All clusters listed below represent functions that are known to be of microbial origin and had not previously been observed on fully sequenced viral genomes. For example, the photosystem I psaA gene is represented by a cluster that is listed below (T109), while psbA (T2911), which has previously been found on cyanophage genomes, is not listed.
The cluster identification process consisted of two stages:
  • Automatic pipeline, whose details will be available soon in a paper describing the project and its findings, and
  • Manual inspection, in which all microbial gene clusters on viral scaffolds were scanned manually against strict criteria meant to prevent false positives.
All clusters listed below passed the manual inspection. A description of the criteria used for manual inspection is available here.
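The listing rule described above (a microbial function that had not previously been seen on a fully sequenced viral genome) can be pictured as a simple set filter. The Python sketch below is illustrative only and is not the project's pipeline: the names `microbial_clusters`, `seen_on_viral_genomes`, and `select_listed` are hypothetical, and the two entries are taken from the psaA/psbA example in the text.

```python
# Hypothetical input: cluster IDs mapped to gene names, taken from the example in the text.
microbial_clusters = {"T109": "psaA", "T2911": "psbA"}

# Hypothetical input: clusters whose function was already found on a
# fully sequenced viral genome (psbA was found on cyanophage genomes).
seen_on_viral_genomes = {"T2911"}


def select_listed(clusters, already_seen):
    """Keep only clusters that were not previously observed on viral genomes."""
    return {cid: gene for cid, gene in clusters.items() if cid not in already_seen}


listed = select_listed(microbial_clusters, seen_on_viral_genomes)
print(listed)
```

Under these assumptions T109 (psaA) is kept and T2911 (psbA) is dropped, matching the example given in the text.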

    T128 70 Peptide deformylase
    T1396 58 Glycerol-3-phosphate cytidylyltransferase
    T982 36 Exported protein
    T451 31 Glycine dehydrogenase
    T768 30 Antioxidant, AhpC/Tsa family protein
    T1414 28 Fructose-1,6-bisphosphate aldolase class I
    T100 20 Glycosyltransferase, group 1
    T156 15 Serine hydroxymethyltransferase
    T212 13 NAD(P)H-dehydrogenase subunit D
    T603 13 NAD(P)H-dehydrogenase subunit I
    T17 13 Membrane protein
    T90 13 Phosphoribosylaminoimidazole-succinocarboxamide synthase
    T1338 12 Glycosyltransferase involved in cell wall biogenesis-like
    T1486 11 NH(3)-dependent NAD+ synthetase NadE
    T71 10 Glycine cleavage system protein P2
    T885 10 Iron-sulfur cluster-binding protein
    T891 8 Phosphorylase
    T327 8 Mannitol-1-phosphate/altronate dehydrogenase
    T198 8 Alkyl hydroperoxide reductase/Thiol specific antioxidant/Mal allergen
    T1113 8 Putative NH(3)-dependent NAD synthetase
    T596 8 Scaffold protein
    T158 7 Sugar isomerase (SIS)
    T255 6 Translation initiation factor IF-1
    T1119 6 Coenzyme PQQ biosynthesis protein A
    T445 6 Adenylate kinase
    T444 5 Iron-sulfur cluster insertion protein ErpA
    T883 5 Glycine cleavage system aminomethyltransferase T
    T712 5 Antioxidant, AhpC/TSA family protein
    T446 5 Photosynthesis protein N (psbN)
    T361 5 5'-methylthioadenosine nucleosidase/S-adenosylhomocysteine nucleosidase
    T72 5 Ribosomal protein S21
    T204 4 Photosystem I reaction center subunit IX (PsaJ)
    T1557 4 Photosystem I subunit VII (PsaC)
    T109 4 Photosystem I P700 chlorophyll a apoprotein A1 (PsaA)
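
Each row of the list above has three fields: a cluster ID, an integer, and a functional description. For readers who want to work with the list programmatically, here is a minimal Python sketch of a parser. It assumes the rows are whitespace-separated as shown. The names `ClusterRow` and `parse_cluster_rows` are hypothetical, and because the list does not state what the second column measures, the code simply calls it `count`.

```python
import re
from typing import List, NamedTuple


class ClusterRow(NamedTuple):
    cluster_id: str   # e.g. "T109"
    count: int        # second column; its meaning is not stated in the list
    description: str  # functional annotation, kept verbatim


# Assumed row layout: "T<digits> <integer> <free-text description>"
ROW_PATTERN = re.compile(r"^\s*(T\d+)\s+(\d+)\s+(.+?)\s*$")


def parse_cluster_rows(text: str) -> List[ClusterRow]:
    """Parse every line that matches the assumed layout; skip all others."""
    rows = []
    for line in text.splitlines():
        match = ROW_PATTERN.match(line)
        if match:
            cluster_id, count, description = match.groups()
            rows.append(ClusterRow(cluster_id, int(count), description))
    return rows


# Three rows copied from the list above.
sample = """
    T1396 58 Glycerol-3-phosphate cytidylyltransferase
    T1486 11 NH(3)-dependent NAD+ synthetase NadE
    T109 4 Photosystem I P700 chlorophyll a apoprotein A1 (PsaA)
"""

rows = parse_cluster_rows(sample)
for row in rows:
    print(row.cluster_id, row.count, row.description)
```

Descriptions are matched as free text to the end of the line, so annotations that contain spaces, commas, or parentheses are kept intact.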