Table 1 Classification, domain architectures, gene-neighborhoods and other salient features of HEPN proteins

From: Comprehensive analysis of the HEPN superfamily: identification of novel roles in intra-genomic conflicts, defense, pathogenesis and RNA processing

| Family (with any Pfam names/id) | Conservation of Rx4-6H | Salient architecture and operons | Phyletic pattern, available structures and comments |
| --- | --- | --- | --- |
| Nucleotidyltransferase (NT-) associated HEPN families | | | |
| HEPN-T (PF05168) | D replaces conserved H in several cases | Standalone versions and fusions to MNT. In the case of Sacsin it is part of a multi-domain protein, with vertebrates showing a further fusion to a ubiquitin-like domain and some animals showing a fusion to a Death domain. Several instances of genomic clustering with R-M system operons | Bacteria, Archaea, Eukaryotes. PDB: 1wwp, 2hsb, 1o3u. Proteins with conserved D in place of H have a conserved H elsewhere which could contribute to activity |
| HEPN-T (Parep1/8) | Lacks R but H is conserved | Fused to inactive LAF-1/Vasa-like RNA helicase N-terminal ATPase domain in Caenorhabditis. In operon with genes encoding Parep in tandem repeats or with genes encoding proteins with MNT and REase (DUF1626) | Archaea. Has two distinct families, PAE0096 and PaREP1. PDB: 2q00 |
| HEPN-T (Cpin_6617) | No | Fusions to a dyad of ferredoxin domains (gi: 381187024, Bacteroidetes, Nitrospirae) | Mostly Bacteria |
| HEPN-M (PF08780/DUF86-PF01934) | Mostly conserved (83%) | Occasionally fused to MNT, a previously undetected archaeal Holliday junction resolvase-like REase (Additional file 1), and nucleic acid methylase domains. In operon with a HAD phosphoesterase gene | Bacteria, Archaea. PDB: 1ylm, 1jog-A |
| HEPN-M (SAV_6107) | No | - | Actinobacteria |
| Aminoglycoside_NT_C (PF07827/DUF4037) | No | Found at the C-termini of aminoglycoside nucleotidyltransferase and related proteins (gi: 15923025). Occasionally fused to TPRs (gi: 296454793) | Bacteria. PDB: 1kny, 3jyy, 3jz0, 2pbe |
| GlnD/GlnE (PF08335)/DUF294_C (PF10335) | No | Fused to GlnD/E-like nucleotidyltransferase. Usually part of the glutamine synthetase modifying complex. DrrA is a secreted toxin in Legionella | Bacteria. PDB: 1v4a, 3l0i |
| DUF4145 (PF13643) | Mostly conserved (80%) | Fused to Restriction Endonuclease (REase, SF-II-Helicase); Sel1, Zinc Ribbon, TM and SH3 (Firmicutes), UvrD Helicase (endoV alpha subunit), TIR and ATPase (Thiorhodococcus drewsii AZ1); SIGMA-HTH; DpnII/MboI-NTD; AbiJ-NTD1. In operon with R-M, TerD, McrB/C and symE toxin | Bacteria > Archaea (note a), dsDNA viruses |
| c2405 | Conserved H but lacks R | Fused to N-terminal AbiTii domain and in a few cases to a C-terminal Helix-hairpin-helix domain | Bacteria |
| MtlR | 60% | Most often a part of mannitol operon with other mannitol utilization genes | Gammaproteobacteria. PDB: 3c8g, 3brj |
| Abi2/AbiF/AbiD | Yes | Abi2/AbiF/AbiD and jhp1408 families. Embedded in R-M operons and also a protein with DNase domains ParB and HNH (Victivallis, Fusobacterium) | Bacteria |
| Swt1-like | Partly conserved | Swt1 - Dyad of HEPN domains fused to a PIN domain, with an additional fusion to WW in some; inactive. Ava_2192 - HEPN fused to a novel AAA+ ATPase. The Pfam profile DUF499 overlaps with this AAA+ ATPase (see Table 2); in operon with R-M components, where the SNF2-Helicase is fused to DUF3883, which is a novel REase domain; active. Cxorf38 - Zn ribbon inserted into HEPN, DSRBD, NACHT, Ankyrin, CARD and DEATH; active. PY00838 - Fusion to Aegerolysin (Apicomplexa, inactive). Other - Fusions to TM (STY4199), active; Phospholipase D Nuclease (SAV_2148), inactive; ParB (Saro_3948), mostly active; and ParB (DUF262) with HNH (DUF2081) (VNG7073), active; Transglutaminase, SF-I-Helicase, Vsr REase and 2 wHTH (MTES_1575), active; CBS and HD (alr3009), active; RNase III and DSRBD (Cyanobacteria), active; STAND-ATPase, TPR, S1 (Npun_F6454, MED222_16016, Desac_1927), mostly active; SWI2/SNF2-ATPase (WQE_15321), active; Zinc Ribbon (Npun_R5629); ZnR with two TMs (Plim_2023), active | Swt1 - Eukaryotes. Ava_2192 - Bacterial with transfer to Naegleria, Dictyosteliida, Daphnia (expansion). All eukaryotes are solos. Cxorf38 - Vertebrates, Saccoglossus, Branchiostoma, Ciona, Nematostella. The human gene is highly expressed in B lymphoblasts and CD56+ NK cells, suggesting that this group might be involved in RNA virus defense |
| Ribo L-PSP-HEPN | Yes | Fused to endoRNase L-PSP (gi: 166363853); operon with ParB | Bacteria. Distantly related to AbiF and AbiD |
| Other Abi | | | |
| AbiU2 | Yes | In operon with a gene encoding protein with Sel1 repeats; R-M operons | Bacteria |
| AbiV | No | - | Bacteria. Has an alternative conserved H at the same position as the first HEPN-T family; hence, could be related to that family |
| AbiJ | Yes | Fused to various novel N-terminal domains labeled AbiJ-NTD1 to 5. Some of the solos occur in operon with R-M system | Bacteria |
| AbiA-CTD | Yes | Fused to Reverse Transcriptase; in operon with R-M system | Bacteria |
| MAE_28990 | Yes | In operon with a ParB nuclease and DNA methylase genes | Bacteria |
| MAE_18760 | Yes | Fused to HEPN/RES-NTD1, HEPN/Toprim-NTD1, Schlafen and a novel beta rich domain. In operon with ParA/Soj ATPase of SIMIBI-type GTPase fold | Bacteria |
| Csx1 (MJ1666) | Yes | A dyad of HEPN domains fused to a Rossmann fold domain (PF09455) | Archaea > Bacteria. PDB: 2i71, 4EOG |
| Csx1 (TM1812) | Yes | HEPN fused to a Rossmann fold (PF09455), and a few other novel domains | Bacteria |
| Csm6 | Yes | HEPN fused to Csm6 (PF09659) and a helical domain | Bacteria |
| Csm6 (Cas_Cas02710) | Yes | HEPN fused to Csm6 (PF09670) | Bacteria > Archaea |
| Other families | | | |
| Ymh (PF09509) | Yes | Solos and fusions to pMORC, AbiJ-NTD1 and AbiTii domain. In operon with R-M | Bacteria > Archaea |
| C6orf70 | Yes | Fused to TPR; WD40 (Dictyostelium). Occurs in R-M related operons | Bacteria > Eukaryotes. Overlaps with DUF4209 (PF13910). This family can be traced to LECA |
| DUF2526 (PF10735) | Yes | None detected | Gammaproteobacteria |
| KEN (RNase L/Ire1) | Mostly conserved (95%) | Fused to S/T/Y-Kinase, along with ankyrin repeats, CCCH in some. Also found fused to UBI (gi: 125543109) and BRCT (gi: 218187285) | Eukaryotes. PDB: 3lj2. Solo RNase L in Oikopleura, and an independent LSE of the same is also seen in plants (mainly monocots) |
| Las1 | Yes | Mainly solos. Sometimes fused to Metallo-beta-lactamase and EF-HAND (Ascomycota) and to family-specific globular domains | Eukaryotes |
| RNase LS | Yes | Fused to RNase H (gi: 300902643), along with Caulimovirus viroplasmin domain (gi: 222100146). In some a TATA-binding protein (TBP)-like domain replaces the RNase H fold domain. In operon with antitoxin RnlB | Bacteria |
| DZIP3/hRUL138 | Mostly conserved | Fusion to TPR, Zn-ribbon, RING, Ankyrin, CARD, NACHT ATPase, DEATH and LRR in various animal lineages | Eukaryotes. Mainly animal lineage: LSEs in Nematostella, and the oyster and Capsaspora |
| PrrC/RloC/APECO1_4465 | Yes | Fused to ABC-ATPase. Often found in R-M operon and with genes for RhuM-like or Fic/Doc-like toxins. APECO1_4465 is also found in prophages | Bacteria |
| ERFG_01251 | Yes | Fused to ABC-ATPase and HEPN/TOPRIM-NTD1 | Bacteria |
| ApeA/BMEI1217 | Yes | In epsilonproteobacteria embedded in R-M operons | Bacteria > Archaea |
| EC042_2821 | Yes | Fused to wHTH, REase and ZnR domains. Occurs in R-M system operons | Bacteria. Overlaps with DUF3644 |
| Integron cassette HEPN | Yes | Part of mobile integron element | Gammaproteobacteria. PDB: 3jrt |
| pEK499_p136_Ecoli like (B) | Yes | Some in operon with R-M genes, ADP-ribosyltransferase-like enzymes (ART), and Macro. Also found in operon with NamA-like RNase H fold nuclease and with the Pgl components | NamA toxin / RlfA Replication in Phage P1 has an RNase H fold |
| LA2681 | Yes | Fused to TPR, and in operon with TPR | Bacteria > Archaea |
| Cthe_2314 | Yes | None detected | Bacteria |
| Bxe_C0808 | Yes | In operon with AbiU2 | Bacteria |
a: The “>” sign indicates a postulated transfer from one lineage to another.
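
The "Conservation of Rx4-6H" column refers to a short sequence motif. As a minimal sketch, assuming the motif is read as an arginine (R), then 4 to 6 residues of any kind, then a histidine (H), candidate sites in a one-letter protein sequence could be located as follows. The function name `find_rx4_6h` and the toy sequence are hypothetical and are not taken from the source. A hit from such a simple pattern scan does not by itself indicate a HEPN domain; the conservation values in the table come from the authors' own analysis, not from a scan of this kind.

```python
def find_rx4_6h(sequence, min_gap=4, max_gap=6):
    """Return (start, gap, text) for each candidate Rx4-6H site.

    start is the 0-based index of the R, gap is the number of residues
    between R and H, and text is the matched stretch including R and H.
    Assumption: the motif means R, then 4 to 6 arbitrary residues, then H.
    """
    seq = sequence.upper()
    hits = []
    for i, residue in enumerate(seq):
        if residue != "R":
            continue
        # Try every allowed spacer length so that no candidate is hidden.
        for gap in range(min_gap, max_gap + 1):
            j = i + gap + 1  # index where the H would sit
            if j < len(seq) and seq[j] == "H":
                hits.append((i, gap, seq[i:j + 1]))
    return hits


# Toy sequence, made up for illustration only.
print(find_rx4_6h("MKRAAAAHLL"))
```

Reporting each allowed spacer length separately, rather than relying on a single greedy pattern match, keeps every candidate visible when several H residues fall within range of the same R.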