Table 2 Protein families with high fraction of conserved and variable positions

From: Analysis of lineage-specific protein family variability in prokaryotes combined with evolutionary reconstructions

| csCOG identifier | COG | Func | Gene | Description | Comment |
| --- | --- | --- | --- | --- | --- |
| flavo9.00376 | COG1158 | K | Rho | Transcription termination factor Rho | Mostly Bacteroidetes |
| flavo9.00582 | COG1314 | U | SecG | Protein translocase subunit SecG | All Bacteroidetes, but also some other bacteria such as Chlorobia, some Proteobacteria, and Spirochaetes; others do not possess the variable tail |
| flavo9.00756 | – | – | – | – | XRE family HTH (N-terminal); the loop is present mostly in Bacteroidetes, but is seen in some Bacilli too |
| flavo9.00944 | COG4807 | S | YehS | Uncharacterized conserved protein YehS, DUF1456 family | Specific for Flavobacterium |
| deino9.00350 | – | – | – | – | An artefact: wrong ORF starts in some of these genes |
| deino9.00475 | COG1722 | L | XseB | Exonuclease VII small subunit | Variable tail in other bacteria too |
| deino9.00842 | COG0511 | I | AccB | Biotin carboxyl carrier protein | PA-rich, present in most bacteria |
| deino9.01337 | – | – | – | – | Uncharacterized, small, Deinococcus-specific |
| deino9.01490 | COG0568 | K | RpoD | DNA-directed RNA polymerase, sigma subunit (sigma70/sigma32) | Specific N-terminal extension in Deinococci and Truepera, although a partially low-complexity region is present in Thermus |
| deino9.03407 | COG0199 | J | RpsN | Ribosomal protein S14 | Xenologous gene displacement by zinc finger variant in some Deinococci |
| paen9.00611 | COG1937 | K | FrmR | DNA-binding transcriptional regulator, FrmR family | Copper-sensitive operon repressor; variable N-terminal region is present in many other Firmicutes |
| paen9.00802 | – | – | – | YycC-like protein, PF14174.7 | Paenibacillus-specific variable tail |
| paen9.00805 | COG3874 | S | YtfJ | Uncharacterized spore protein YtfJ | Sporulation protein YtfJ; variable region is present in many sporulating Bacilli, but the variable tail is rather specific for Paenibacillus |
| paen9.00958 | COG1674 | D | FtsK | DNA segregation ATPase FtsK/SpoIIIE or related protein | Variable insertion is present in all Bacilli and other bacteria; in Paenibacillus these regions are longer |
| paen9.01226 | COG0323 | L | MutL | DNA mismatch repair ATPase MutL | Common feature among some archaea and some bacteria |
| paen9.01699 | COG4467 | L | YabA | Regulator of replication initiation timing YabA | Variable insertion is present in all Firmicutes and other bacteria; in Paenibacillus these regions are longer [66] |
| paen9.02368 | COG0532 | J | InfB | Translation initiation factor IF-2, a GTPase | Variable insertion is present in all Firmicutes (very different lengths); in Paenibacillus these regions are longer, but not the longest among Firmicutes. In many other bacteria the insertion is much smaller [67] |
| rhodo7.000637 | COG1826 | U | TatA | Twin-arginine protein secretion pathway components TatA and TatB | Variable tail is specific for at least actinobacteria |
| rhodo7.001015 | COG5416 | S | YrvD | Uncharacterized integral membrane protein YrvD | Variable N-terminal region specific for actinobacteria, but not others |
| rhodo7.001149 | COG2409 | S | YdfJ | Predicted lipid transporter YdfJ, MMPL/SSD domain, RND superfamily | Variable tail region specific for actinobacteria, but not others; sometimes the tail is missing in actinobacteria too |
| rhodo7.001169 | – | – | – | Lipid droplet-associated protein | Found in lipid droplets in Mycobacterium tuberculosis [68]; two variable internal regions specific for actinobacteria |
| rhodo7.001269 | COG1158 | K | Rho | Transcription termination factor Rho | N-terminal variable region specific for actinobacteria |
| rhodo7.001344 | COG0328 | L | RnhA | Ribonuclease HI | Variable region is present in many bacteria |
| rhodo7.001562 | COG1862 | U | YajC | Protein translocase subunit YajC | Variable region is present in many bacteria |
| rhodo7.001949 | COG0305 | L | DnaB | Replicative DNA helicase | Some contain an intein |
| thermo9.00277 | (arCOG04026) | – | – | Pilin/Flagellin, contains class III signal peptide | Thermococcus-specific, not present elsewhere |
| halo9.00332 | COG0323 | L | MutL | DNA mismatch repair enzyme (predicted ATPase) | Common feature among some archaea and some bacteria |
| halo9.00351 | COG1885 | S | – | Uncharacterized protein, DUF555 family | Uncharacterized; variable tail present in Methanosarcina, but not in a few other euryarchaea |
| halo9.00421 | COG4530 | S | – | Uncharacterized protein | Uncharacterized DUF5806; variable N-terminal region specific for Halobacteria; some have a CxxCxHxxH motif |
| halo9.00587 | COG0805 | U | – | Sec-independent protein translocase protein TatC | Variable N-terminal region specific for Halobacteria |
| halo9.00602 | COG0552 | U | – | Signal recognition particle-docking protein FtsY | N-terminal variable region present in many euryarchaea |
| halo9.00879 | COG1474 | L | – | orc1/cdc6 family replication initiation protein | N-terminal region specific for Haloferacales |
| halo9.00317 | COG0358 | L | DnaG | DNA primase (bacterial type) | Common feature among euryarchaea |
| methano7.000496 | COG1311 | L | HYS2 | Archaeal DNA polymerase II, small subunit/DNA polymerase delta, subunit B | Specific for Methanosarcina |