About TF family

In AnimalTFDB4.0, the TF family classification is consistent with that of AnimalTFDB3.0. Because we used the latest version of the Ensembl FASTA and GTF files, we adjusted the thresholds for some families. For example, the Fork_head and MYB family thresholds are adjusted to 1e-2, and the threshold for the Homeobox family is adjusted to 1e-3.

In most cases, a TF has only one kind of DNA-binding domain (DBD), so it can be assigned unambiguously to a single family. In some cases, however, a TF has more than one kind of DBD. To classify such TFs into the correct family, we checked all human and mouse TFs that contain multiple kinds of DBDs and then set up three rules.

First, if a superfamily has several subfamilies, we classify its TFs based on the subfamily DBD. For example, the Homeobox superfamily has four subfamilies: Pou, CUT, TF_Otx and other Homeobox. All TFs in this superfamily have a Homeobox domain, and some of them also have one of the Pou, CUT and TF_Otx subfamily signature domains. We assign these TFs to a specific family based on their subfamily signature domain.

Second, if a TF has more than one unrelated DBD, we classify it into the family of the DBD with the smallest E-value.

Third, if a TF family contains a non-TF gene (such as an enzyme gene) that cannot be rejected by changing the cutoff, we remove the non-TF gene based on the prediction result for a separate non-TF gene domain. Six domains act as non-TF gene domains in our prediction process: three enzyme gene domains (PLU-1, MOZ_SAS and TRAM_LAG1_CLN8) and three other domains (TIP_N, SNF2_N and EP400_N). The thresholds for PLU-1, MOZ_SAS, TRAM_LAG1_CLN8 and TIP_N are 1e-2, and the thresholds for SNF2_N and EP400_N are 1e-60.

We checked all the human and mouse classification results and found that our method was reasonable. The self-built HMM files (marked "self-build" in the table) are available for download.
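The three rules can be sketched in Python as follows. This is a minimal illustration, not the database's actual pipeline: the function name `assign_family`, the input format (a mapping from domain name to E-value for hits that already passed their cutoffs), and the small lookup tables are assumptions made for the example. The order in which the rules are applied, and the tie-break when several subfamily signature domains are present, are also assumptions, since the text does not specify them.

```python
from typing import Dict, Optional

# Illustrative subset of DBD -> family assignments (names taken from the table).
DBD_TO_FAMILY = {
    "Homeobox": "Homeobox",
    "HLH": "bHLH",
    "Ets": "ETS",
    "ARID": "ARID",
    "Myb_DNA-bd": "MYB",
}

# Subfamily signature domains of the Homeobox superfamily (rule 1).
HOMEOBOX_SIGNATURES = ("Pou", "CUT", "TF_Otx")

# Non-TF gene domains that disqualify a DBD hit (rule 3), per the "Rules" column.
VETO = {
    "ARID": frozenset({"PLU-1"}),
    "Myb_DNA-bd": frozenset({"SNF2_N", "EP400_N"}),
    "Homeobox": frozenset({"TRAM_LAG1_CLN8"}),
}


def assign_family(hits: Dict[str, float]) -> Optional[str]:
    """Return a family name for one protein, or None if no DBD survives.

    `hits` maps domain name -> E-value (hypothetical input format).
    """
    # Rule 3: ignore a DBD when one of its veto domains is also present.
    dbds = {
        domain: evalue
        for domain, evalue in hits.items()
        if domain in DBD_TO_FAMILY
        and VETO.get(domain, frozenset()).isdisjoint(hits)
    }
    if not dbds:
        return None

    # Rule 2: among unrelated DBDs, keep the one with the smallest E-value.
    best = min(dbds, key=dbds.get)

    # Rule 1: inside the Homeobox superfamily, a signature domain decides
    # the subfamily (first match in tuple order; an assumption).
    if best == "Homeobox":
        for signature in HOMEOBOX_SIGNATURES:
            if signature in hits:
                return signature

    return DBD_TO_FAMILY[best]
```

For example, a protein with both a Homeobox and a Pou hit is assigned to Pou, while a protein whose only DBD is ARID and that also carries a PLU-1 hit is left unassigned.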

| Group | Family | DNA-binding domain | Pfam ID or InterPro ID | Cutoff | Rules |
|---|---|---|---|---|---|
| | AF-4 | AF-4 | PF05110 | 1E-04 | AF-4 domain |
| | AP-2 | TF_AP-2 | PF03299 | 1E-04 | TF_AP-2 domain |
| | ARID | ARID | PF01388 | 1E-04 | ARID domain, no PLU-1 domain |
| | bHLH | HLH | PF00010 | 1E-02 | HLH domain |
| | CBF | CBF_beta | PF02312 | 1E-04 | CBF_beta domain |
| | CSL | BTD | PF09270 | 1E-04 | BTD domain |
| NF-Y | NF-YA | CBFB_NFYA | PF02045 | 1E-04 | CBFB_NFYA domain |
| | NF-YB | NF-YB | self-build | 1E-04 | NF-YB domain |
| | NF-YC | NF-YC | self-build | 1E-04 | NF-YC domain |
| | CG-1 | CG-1 | PF03859 | 1E-04 | CG-1 domain |
| | CP2 | CP2 | PF04516 | 1E-04 | CP2 domain |
| | CSD | CSD | PF00313 | 1E-04 | CSD domain |
| | CSRNP_N | CSRNP_N | PF16019 | 1E-04 | CSRNP_N domain |
| | DACH | DACH | self-build | 1E-04 | DACH domain |
| | E2F | E2F_TDP | PF02319 | 1E-04 | E2F_TDP domain |
| | ETS | Ets | PF00178 | 1E-04 | Ets domain |
| | Fork_head | Fork_head | PF00250 | 1E-03 | Fork_head domain |
| | GCM | GCM | PF03615 | 1E-04 | GCM domain |
| | GTF2I | GTF2I | PF02946 | 1E-04 | GTF2I domain |
| | HMG | HMG_box | PF00505 | 1E-03 | HMG_box domain |
| | HMGA | HMGA | self-build | 1E-04 | HMGA domain |
| | HSF | HSF_DNA-bind | PF00447 | 1E-04 | HSF_DNA-bind domain |
| | HTH | HTH_psq | PF05225 | 1E-04 | HTH_psq domain |
| | IRF | IRF | PF00605 | 1E-04 | IRF domain |
| | MYB | Myb_DNA-bd | PF00249 | 1E-03 | Myb_DNA-binding domain, no SNF2_N and EP400_N domain |
| | MBD | MBD | PF01429 | 1E-04 | MBD domain |
| | NCU-G1 | NCU-G1 | PF15065 | 1E-04 | NCU-G1 domain |
| | NDT80_PhoG | NDT80_PhoG | PF05224 | 1E-04 | NDT80_PhoG domain |
| | Nrf1 | Nrf1_DNA-bind | PF10491 | 1E-04 | Nrf1_DNA-bind domain |
| | PC4 | PC4 | PF02229 | 1E-04 | PC4 domain |
| | P53 | P53 | PF00870 | 1E-04 | P53 domain |
| | PAX | PAX | PF00292 | 1E-04 | PAX domain |
| | HPD | HPD | PF05044 | 1E-04 | HPD domain |
| | LRRFIP | LRRFIP | PF09738 | 1E-04 | LRRFIP domain |
| | RFX | RFX | PF02257 | 1E-04 | RFX domain |
| | RHD | RHD | PF00554 | 1E-04 | RHD domain |
| | Runt | Runt | PF00853 | 1E-04 | Runt domain |
| | SAND | SAND | PF01342 | 1E-04 | SAND domain |
| | SRF | SRF | PF00319 | 1E-04 | SRF domain |
| | STAT | STAT_bind | PF02864 | 1E-04 | STAT_bind domain |
| | T-box | T-box | PF00907 | 1E-04 | T-box domain |
| | TEA | TEA | PF01285 | 1E-04 | TEA domain |
| | COE | COE | self-build | 1E-04 | COE domain |
| | GCFC | GCFC | PF07842 | 1E-04 | GCFC domain, no TIP_N domain |
| | TSC22 | TSC22 | PF01166 | 1E-04 | TSC22 domain |
| | Tub | Tub | PF01167 | 1E-04 | Tub domain |
| bZIP | TF_bZIP | bZIP | self-build | 1E-04 | bZIP domain |
| MH1 | CTF_NFI | MH1 | PF00859 | 1E-04 | CTF/NFI and MH1 domain |
| | MH1 | MH1 | PF03165 | 1E-04 | MH1 domain |
| Homeobox | Homeobox | Homeobox | PF00046 | 1E-02 | All TFs have a Homeobox domain, without TRAM_LAG1_CLN8 domain; some of them have one of the Pou, CUT, and TF_Otx subfamily signature domains |
| | Pou | Homeobox, Pou | PF00157 | 1E-04 | |
| | CUT | Homeobox, CUT | PF02376 | 1E-04 | |
| | TF_Otx | Homeobox, TF_Otx | PF03529 | 1E-04 | |
| Zinc finger | zf-CCCH | zf-CCCH | PF00642 | 1E-20 | zf-CCCH domain |
| | zf-C2HC | zf-C2HC | PF01530 | 1E-04 | zf-C2HC domain, no MOZ_SAS domain |
| | zf-GAGA | zf-GAGA | PF09237 | 1E-04 | zf-GAGA domain |
| | zf-BED | zf-BED | PF02892 | 1E-03 | zf-BED domain |
| zf-C2H2 | ZBTB | zf-C2H2 | PF00651 | 1E-04 | All TFs have a zf-C2H2 domain; the ZBTB family has both ZBTB and zf-C2H2 domains |
| | zf-C2H2 | zf-C2H2 | PF00096 | 1E-03 | |
| Nuclear Receptor | Miscellaneous | zf-C4 | self-build | 1E-04 | All TFs have a zf-C4 domain; some of them have one of the Miscellaneous, THR-like, RXR-like, ESR-like, NGFIB-like, SF-like and GCNF-like subfamily signature domains |
| | THR-like | zf-C4 | self-build | 1E-04 | |
| | RXR-like | zf-C4 | self-build | 1E-04 | |
| | ESR-like | zf-C4 | self-build | 1E-04 | |
| | NGFIB-like | zf-C4 | self-build | 1E-04 | |
| | SF-like | zf-C4 | self-build | 1E-04 | |
| | GCNF-like | zf-C4 | self-build | 1E-04 | |
| | DM | DM | PF00751 | 1E-04 | DM domain |
| | zf-GATA | zf-GATA | PF00320 | 1E-04 | zf-GATA domain |
| | zf-LITAF-like | zf-LITAF-like | PF10601 | 1E-04 | zf-LITAF-like domain |
| | zf-MIZ | zf-MIZ | PF02891 | 1E-04 | zf-MIZ domain |
| | zf-NF-X1 | zf-NF-X1 | PF01422 | 1E-04 | zf-NF-X1 domain |
| | THAP | THAP | PF05485 | 1E-04 | THAP domain |
| | Others | | | 1E-04 | |
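
As an illustration of how the per-domain cutoffs in the table might be applied, the sketch below filters domain hits by Pfam ID. The hit format (protein ID, Pfam ID, E-value), the names `CUTOFFS`, `DEFAULT_CUTOFF` and `filter_hits`, the use of 1E-04 as a fallback for unlisted domains, and the inclusive comparison are all assumptions made for the example. Only the cutoff values themselves are copied from the table, and only the rows that differ from 1E-04 are listed.

```python
from typing import List, Tuple

# Fallback used for any Pfam ID not listed below (assumption: most rows
# in the table, including "Others", use 1E-04).
DEFAULT_CUTOFF = 1e-4

# Cutoffs that differ from the default, keyed by Pfam ID (from the table).
CUTOFFS = {
    "PF00010": 1e-2,   # HLH
    "PF00250": 1e-3,   # Fork_head
    "PF00505": 1e-3,   # HMG_box
    "PF00249": 1e-3,   # Myb_DNA-bd
    "PF00046": 1e-2,   # Homeobox
    "PF00642": 1e-20,  # zf-CCCH
    "PF02892": 1e-3,   # zf-BED
    "PF00096": 1e-3,   # zf-C2H2
}

# Hypothetical hit record: (protein ID, Pfam ID, E-value).
Hit = Tuple[str, str, float]


def filter_hits(hits: List[Hit]) -> List[Hit]:
    """Keep hits whose E-value is at or below the cutoff for their domain."""
    return [
        hit for hit in hits
        if hit[2] <= CUTOFFS.get(hit[1], DEFAULT_CUTOFF)
    ]
```

With this sketch, a Homeobox hit at 5e-3 is kept (cutoff 1E-02), while an Ets hit (PF00178) at the same E-value is dropped because it falls back to the 1E-04 default.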