Table 1 Classification, domain architectures, gene-neighborhoods and other salient features of HEPN proteins

From: Comprehensive analysis of the HEPN superfamily: identification of novel roles in intra-genomic conflicts, defense, pathogenesis and RNA processing

Family (with any Pfam names/id)

Conservation of Rx4-6H

Salient Architecture and operons

Phyletic pattern, available structures and comments

Nucleotidyltransferase (NT-) associated HEPN families

HEPN-T (PF05168)

D replaces conserved H in several cases

Standalone versions and fusions to MNT. In Sacsin it is part of a multi-domain protein, with vertebrates showing a further fusion to a ubiquitin-like domain and some animals showing a fusion to a Death domain. Several instances of genomic clustering with R-M system operons

Bacteria, Archaea, Eukaryotes

PDB: 1wwp, 2hsb, 1o3u.

Proteins with conserved D in place of H have a conserved H elsewhere which could contribute to activity

HEPN-T (Parep1/8)

Lacks R but H is conserved

Fused to inactive LAF-1/Vasa-like RNA helicase N-terminal ATPase domain in Caenorhabditis. In operon with genes encoding Parep in tandem repeats or with genes encoding proteins with MNT and REase (DUF1626)

Archaea. Has two distinct families, PAE0096 and PaREP1. PDB: 2q00

HEPN-T (Cpin_6617)

No

Fusions to a dyad of ferredoxin domains (gi: 381187024, Bacteroidetes, Nitrospirae)

Mostly Bacteria

HEPN-M (PF08780/DUF86-PF01934)

Mostly conserved (83%)

Occasionally fused to MNT, a previously undetected archaeal Holliday junction resolvase-like REase (Additional file 1), and nucleic acid methylase domains. In operon with a HAD phosphoesterase gene

PDB: 1ylm, 1jog-A. Bacteria, Archaea

HEPN-M (SAV_6107)

No

-

Actinobacteria

Aminoglycoside_NT_C (PF07827/DUF4037)

No

Found at the C-termini of aminoglycoside nucleotidyltransferase and related proteins (gi: 15923025). Occasionally fused to TPRs (gi: 296454793)

PDB: 1kny, 3jyy, 3jz0, 2pbe. Bacteria

GlnD/GlnE (PF08335)/ DUF294_C (PF10335)

No

Fused to GlnD/E-like nucleotidyltransferase. Usually part of the glutamine synthetase modifying complex. DrrA is a secreted toxin in Legionella.

PDB: 1v4a, 3l0i. Bacteria

DUF4145-like

DUF4145 (PF13643)

Mostly conserved (80%)

Fused to Restriction Endonuclease (REase, SF-II-Helicase); Sel1, Zinc Ribbon, TM and SH3 (Firmicutes), UvrD Helicase (endoV alpha subunit), TIR and ATPase (Thiorhodococcus drewsii AZ1); SIGMA-HTH; DpnII/MboI-NTD; AbiJ-NTD1.

Bacteria > Archaea [a], dsDNA viruses;

In operon with R-M, TerD, McrB/C and symE toxin

c2405

Conserved H but lacks R

Fused to N-terminal AbiTii domain and in a few cases to a C-terminal Helix-hairpin-helix domain

Bacteria

MtlR

60%

Most often a part of mannitol operon with other mannitol utilization genes

Gammaproteobacteria. PDB: 3c8g, 3brj

Abi2/Swt1

Abi2/AbiF/AbiD

Yes

Abi2/AbiF/AbiD and jhp1408 families

Bacteria

Embedded in R-M operons and also a protein with DNase domains ParB and HNH (Victivallis, Fusobacterium);

Swt1-like

Partly conserved

Swt1 - Dyad of HEPN domains fused to a PIN domain, with an additional fusion to WW in some; Inactive.

Swt1 - Eukaryotes

Ava_2192 - HEPN fused to a novel AAA+ ATPase. The Pfam profile DUF499 overlaps with this AAA+ ATPase (see Table 2). In operon with R-M components, where the SNF2-Helicase is fused to DUF3883, which is a novel REase domain. Active.

Ava_2192 - Bacterial with transfer to Naegleria, Dictyosteliida, Daphnia (expansion). All eukaryotes are solos.

Cxorf38 - Zn ribbon inserted into HEPN, DSRBD, NACHT, Ankyrin, CARD and DEATH, Active.

Cxorf38 - Vertebrates, Saccoglossus, Branchiostoma, Ciona, Nematostella. The human gene is highly expressed in B lymphoblasts and CD56+ NK cells, suggesting that this group might be involved in RNA virus defense.

PY00838 – Fusion to Aegerolysin (Apicomplexa, Inactive)

Other fusions to TM (STY4199), active; Phospholipase D Nuclease (SAV_2148), inactive; ParB (Saro_3948), mostly active; ParB (DUF262) with HNH (DUF2081) (VNG7073), active; Transglutaminase, SF-I-Helicase, Vsr REase and 2 wHTH (MTES_1575), active; CBS and HD (alr3009), active; RNASEIII and DSRBD (Cyanobacteria), active; STAND-ATPase, TPR, S1 (Npun_F6454, MED222_16016, Desac_1927), mostly active; SWI2/SNF2-ATPase (WQE_15321), active; Zinc Ribbon (Npun_R5629); ZnR with two TMs (Plim_2023), active.

 

Ribo L-PSP-HEPN

Yes

Fused to endoRNase L-PSP (gi: 166363853); in operon with ParB

Bacteria. Distantly related to AbiF and AbiD

Other Abi

AbiU2

Yes

In operon with a gene encoding protein with Sel1 repeats; R-M operons;

Bacteria

AbiV

No

-

Bacteria; Has an alternative conserved H at the same position as the first HEPN-T family; hence, could be related to that family

AbiJ

Yes

Fused to various novel N terminal domains labeled AbiJ-NTD1 to 5; Some of the solos occur in operon with R-M system

Bacteria

AbiA-CTD

Yes

Fused to Reverse Transcriptase; in operon with R-M system

Bacteria

MAE_28990

MAE_28990

Yes

In operon with a ParB nuclease and DNA methylase genes

Bacteria

MAE_18760

Yes

Fused to HEPN/RES-NTD1, HEPN/Toprim-NTD1, Schlafen and a novel beta rich domain. In operon with ParA/Soj ATPase of SIMIBI-type GTPase fold

Bacteria

CRISPR-Cas

Csx1 (MJ1666)

Yes

A dyad of HEPN domains fused to a Rossmann fold domain (PF09455)

Archaea > Bacteria; PDB:2i71, 4EOG

Csx1 (TM1812)

Yes

HEPN fused to a Rossmann fold (PF09455), and a few other novel domains

Bacteria;

Csm6

Yes

HEPN fused to Csm6 (PF09659) and a helical domain

Bacteria;

Csm6 (Cas_Cas02710)

Yes

HEPN fused to Csm6 (PF09670)

Bacteria > Archaea;

Other families

Ymh (PF09509)

Yes

Solos and fusions to pMORC, AbiJ-NTD1 and AbiTii domain.

Bacteria > Archaea

In operon with R-M

C6orf70

Yes

Fused to TPR; WD40 (Dictyostelium).

Bacteria > Eukaryotes. Overlaps with DUF4209 (PF13910). This family can be traced to LECA

Occurs in R-M related operons

DUF2526 (PF10735)

Yes

None detected

Gammaproteobacteria

KEN (RNase L/Ire1)

Mostly conserved (95%)

Fused to S/T/Y-Kinase, along with ankyrin repeats, CCCH in some. Also found fused to UBI (gi:125543109) and BRCT (gi: 218187285)

Eukaryotes. PDB: 3lj2. Solo RNase L in Oikopleura; an independent LSE of the same is also seen in plants (mainly monocots)

Las1

Yes

Mainly Solos. Sometimes fused to Metallo-beta-lactamase and EF-HAND (Ascomycota) and to family specific globular domains

Eukaryotes

RNase LS

Yes

Fused to RNase H (gi: 300902643), along with Caulimovirus viroplasmin domain (gi: 222100146). In some a TATA-binding protein (TBP)-like domain replaces the RNase H fold domain. In operon with antitoxin RnlB

Bacteria

DZIP3/ hRUL138

Mostly conserved

Fusion to TPR, Zn-ribbon, RING, Ankyrin, CARD, NACHT ATPase, DEATH and LRR in various animal lineages

Eukaryotes. Mainly animal lineage: LSEs in Nematostella, the oyster and Capsaspora

PrrC/RloC/ APECO1_4465

Yes

Fused to ABC-ATPase. Often found in R-M operon and with genes for RhuM-like or Fic/Doc-like toxins. APECO1_4465 is also found in prophages

Bacteria

ERFG_01251

Yes

Fused to ABC-ATPase and HEPN/TOPRIM-NTD1

Bacteria

ApeA/BMEI1217

Yes

In epsilonproteobacteria embedded in R-M operons

Bacteria > Archaea;

EC042_2821

Yes

Fused to wHTH, REase and ZnR domains. Occurs in R-M system operons

Bacteria. Overlaps with DUF3644

Integron cassette HEPN

Yes

Part of mobile integron element

PDB: 3jrt. Gammaproteobacteria

pEK499_p136_Ecoli like (B)

Yes

Some in operon with R-M genes, ADP-ribosyltransferase-like enzymes (ART), and Macro. Also found in operon with NamA-like RNase H fold nuclease and with the Pgl components

The NamA toxin/RlfA (replication in phage P1) has an RNase H fold

LA2681

Yes

Fused to TPR, and in operon with TPR

Bacteria > Archaea

Cthe_2314

Yes

None detected

Bacteria

Bxe_C0808

Yes

In operon with AbiU2

Bacteria

a: The “>” sign indicates a postulated transfer from one lineage to another.