Pfam Promiscuity
1 PF00117      304.56
2 PF00364      222.74
3 PF00443      201.48
4 PF00769      195.98
5 PF00173      182.43
'Pfam' (from package 'dcGOR' version 1.0.5) has been loaded into the working environment
Pfam
An object of S4 class 'InfoDataFrame'
  rowNames: PF00001 PF00002 PF00003 ... PF15659 PF15660 (14831 total)
  colNames: id level description
Start at 2015-07-23 12:15:20
First, load the ontology 'GOBP', the domain 'Pfam', and their associations (2015-07-23 12:15:20) ...
'onto.GOBP' (from package 'dcGOR' version 1.0.5) has been loaded into the working environment
'Pfam' (from package 'dcGOR' version 1.0.5) has been loaded into the working environment
'Pfam2GOBP' (from package 'dcGOR' version 1.0.5) has been loaded into the working environment
Second, perform enrichment analysis using HypergeoTest (2015-07-23 12:15:41) ...
	There are 722 terms being used, each restricted within [10,1000] annotations
Last, adjust the p-values using the BH method (2015-07-23 12:15:41) ...
End at 2015-07-23 12:15:41
Runtime in total is: 21 secs
eoutput
An object of S4 class 'Eoutput', containing following slots:
  @domain: 'Pfam'
  @ontology: 'GOBP'
  @term_info: a data.frame of 74 terms X 5 information
  @anno: a list of 74 terms, each storing annotated domains
  @data: a vector containing a group of 29 input domains (annotatable)
  @background: a vector containing a group of 3241 background domains (annotatable)
  @overlap: a list of 74 terms, each containing domains overlapped with input domains
  @zscore: a vector of 74 terms, containing z-scores
  @pvalue: a vector of 74 terms, containing p-values
  @adjp: a vector of 74 terms, containing adjusted p-values
In summary, a total of 74 terms ('GOBP') are analysed for a group of 29 input domains ('Pfam')
A file ('Basu_GOBP_enrichments.txt') has been written into your local directory ('/Users/hfang/Sites/SUPERFAMILY/dcGO/dcGOR')
### view the top 5 significant terms
view(eoutput, top_num=5, sortBy="pvalue", details=TRUE)
              term_id nAnno nGroup nOverlap zscore  pvalue    adjp
GO:0006298 GO:0006298    62     29        5   6.05 1.3e-05 0.00096
GO:0006281 GO:0006281    88     29        5   4.83 9.7e-05 0.00310
GO:0006974 GO:0006974    92     29        5   4.69 1.2e-04 0.00310
GO:0033554 GO:0033554   145     29        5   3.34 1.5e-03 0.02700
GO:0016051 GO:0016051   119     29        4   2.91 3.6e-03 0.05000
                                   term_name     term_namespace term_distance
GO:0006298                   mismatch repair biological_process             8
GO:0006281                        DNA repair biological_process             6
GO:0006974   response to DNA damage stimulus biological_process             5
GO:0033554       cellular response to stress biological_process             3
GO:0016051 carbohydrate biosynthetic process biological_process             4
                                           members
GO:0006298 PF00289,PF00310,PF00549,PF01546,PF00106
GO:0006281 PF00289,PF00310,PF00549,PF01546,PF00106
GO:0006974 PF00289,PF00310,PF00549,PF01546,PF00106
GO:0033554 PF00289,PF00310,PF00549,PF01546,PF00106
GO:0016051         PF02902,PF01421,PF00082,PF00085
Ontology 'GOBP' containing 13 nodes/terms (including 4 in query; also highlighted in frame) has been shown in your screen, with colorbar indicating -1*log10(adjusted p-values)
#### look at Pfam domains annotated by the most signficant term
tmp <- as.character(view(eoutput, top_num=1, sortBy="pvalue", details=T)$members)
tmp <- unlist(strsplit(tmp,","))
Data(Pfam)[match(tmp,rowNames(Pfam)),]
             id level                                             description
PF00289 PF00289  Pfam Carbamoyl-phosphate synthase L chain, N-terminal domain
PF00310 PF00310  Pfam                    Glutamine amidotransferases class-II
PF00549 PF00549  Pfam                                              CoA-ligase
PF01546 PF01546  Pfam                            Peptidase family M20/M25/M40
PF00106 PF00106  Pfam                               short chain dehydrogenase
Start at 2015-07-23 12:16:35
First, load the ontology 'GOMF', the domain 'Pfam', and their associations (2015-07-23 12:16:35) ...
'onto.GOMF' (from package 'dcGOR' version 1.0.5) has been loaded into the working environment
'Pfam' (from package 'dcGOR' version 1.0.5) has been loaded into the working environment
'Pfam2GOMF' (from package 'dcGOR' version 1.0.5) has been loaded into the working environment
Second, perform enrichment analysis using HypergeoTest (2015-07-23 12:16:46) ...
	There are 334 terms being used, each restricted within [10,1000] annotations
Last, adjust the p-values using the BH method (2015-07-23 12:16:46) ...
End at 2015-07-23 12:16:46
Runtime in total is: 11 secs
eoutput
An object of S4 class 'Eoutput', containing following slots:
  @domain: 'Pfam'
  @ontology: 'GOMF'
  @term_info: a data.frame of 24 terms X 5 information
  @anno: a list of 24 terms, each storing annotated domains
  @data: a vector containing a group of 36 input domains (annotatable)
  @background: a vector containing a group of 3359 background domains (annotatable)
  @overlap: a list of 24 terms, each containing domains overlapped with input domains
  @zscore: a vector of 24 terms, containing z-scores
  @pvalue: a vector of 24 terms, containing p-values
  @adjp: a vector of 24 terms, containing adjusted p-values
In summary, a total of 24 terms ('GOMF') are analysed for a group of 36 input domains ('Pfam')
A file ('Basu_GOMF_enrichments.txt') has been written into your local directory ('/Users/hfang/Sites/SUPERFAMILY/dcGO/dcGOR')
### view the top 5 significant terms
view(eoutput, top_num=5, sortBy="pvalue", details=TRUE)
              term_id nAnno nGroup nOverlap zscore  pvalue   adjp
GO:0016887 GO:0016887   106     36        6   4.66 0.00010 0.0022
GO:0017111 GO:0017111   121     36        6   4.23 0.00023 0.0022
GO:0016462 GO:0016462   132     36        6   3.95 0.00040 0.0022
GO:0016818 GO:0016818   133     36        6   3.93 0.00042 0.0022
GO:0016817 GO:0016817   135     36        6   3.88 0.00046 0.0022
                                                                                    term_name
GO:0016887                                                                    ATPase activity
GO:0017111                                                 nucleoside-triphosphatase activity
GO:0016462                                                           pyrophosphatase activity
GO:0016818 hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides
GO:0016817                                      hydrolase activity, acting on acid anhydrides
               term_namespace term_distance
GO:0016887 molecular_function             7
GO:0017111 molecular_function             6
GO:0016462 molecular_function             5
GO:0016818 molecular_function             4
GO:0016817 molecular_function             3
                                                   members
GO:0016887 PF00175,PF00258,PF01565,PF00070,PF00106,PF00107
GO:0017111 PF00175,PF00258,PF01565,PF00070,PF00106,PF00107
GO:0016462 PF00175,PF00258,PF01565,PF00070,PF00106,PF00107
GO:0016818 PF00175,PF00258,PF01565,PF00070,PF00106,PF00107
GO:0016817 PF00175,PF00258,PF01565,PF00070,PF00106,PF00107
Ontology 'GOMF' containing 8 nodes/terms (including 5 in query; also highlighted in frame) has been shown in your screen, with colorbar indicating -1*log10(adjusted p-values)
#### look at Pfam domains annotated by the most signficant term
tmp <- as.character(view(eoutput, top_num=1, sortBy="pvalue", details=T)$members)
tmp <- unlist(strsplit(tmp,","))
Data(Pfam)[match(tmp,rowNames(Pfam)),]
             id level                                   description
PF00175 PF00175  Pfam            Oxidoreductase NAD-binding domain 
PF00258 PF00258  Pfam                                    Flavodoxin
PF01565 PF01565  Pfam                           FAD binding domain 
PF00070 PF00070  Pfam Pyridine nucleotide-disulphide oxidoreductase
PF00106 PF00106  Pfam                     short chain dehydrogenase
PF00107 PF00107  Pfam                    Zinc-binding dehydrogenase
Start at 2015-07-23 12:16:59
First, load the ontology 'GOCC', the domain 'Pfam', and their associations (2015-07-23 12:16:59) ...
'onto.GOCC' (from package 'dcGOR' version 1.0.5) has been loaded into the working environment
'Pfam' (from package 'dcGOR' version 1.0.5) has been loaded into the working environment
'Pfam2GOCC' (from package 'dcGOR' version 1.0.5) has been loaded into the working environment
Second, perform enrichment analysis using HypergeoTest (2015-07-23 12:17:03) ...
	There are 149 terms being used, each restricted within [10,1000] annotations
Last, adjust the p-values using the BH method (2015-07-23 12:17:03) ...
End at 2015-07-23 12:17:03
Runtime in total is: 4 secs
eoutput
An object of S4 class 'Eoutput', containing following slots:
  @domain: 'Pfam'
  @ontology: 'GOCC'
  @term_info: a data.frame of 10 terms X 5 information
  @anno: a list of 10 terms, each storing annotated domains
  @data: a vector containing a group of 8 input domains (annotatable)
  @background: a vector containing a group of 2036 background domains (annotatable)
  @overlap: a list of 10 terms, each containing domains overlapped with input domains
  @zscore: a vector of 10 terms, containing z-scores
  @pvalue: a vector of 10 terms, containing p-values
  @adjp: a vector of 10 terms, containing adjusted p-values
In summary, a total of 10 terms ('GOCC') are analysed for a group of 8 input domains ('Pfam')
A file ('Basu_GOCC_enrichments.txt') has been written into your local directory ('/Users/hfang/Sites/SUPERFAMILY/dcGO/dcGOR')
### view the top 5 significant terms
view(eoutput, top_num=5, sortBy="pvalue", details=FALSE)
              term_id nAnno nGroup nOverlap zscore pvalue  adjp
GO:0043234 GO:0043234  1246      8        7  1.530  0.020 0.083
GO:0044445 GO:0044445   326      8        3  1.660  0.026 0.083
GO:0005829 GO:0005829   327      8        3  1.650  0.027 0.083
GO:0032991 GO:0032991  1332      8        7  1.320  0.033 0.083
GO:0005623 GO:0005623  1613      8        7  0.578  0.150 0.260
                        term_name
GO:0043234        protein complex
GO:0044445         cytosolic part
GO:0005829                cytosol
GO:0032991 macromolecular complex
GO:0005623                   cell
Ontology 'GOCC' containing 9 nodes/terms (including 5 in query; also highlighted in frame) has been shown in your screen, with colorbar indicating -1*log10(adjusted p-values)
#### look at Pfam domains annotated by the most signficant term
tmp <- as.character(view(eoutput, top_num=1, sortBy="pvalue", details=T)$members)
tmp <- unlist(strsplit(tmp,","))
Data(Pfam)[match(tmp,rowNames(Pfam)),]
             id level                               description
PF06247 PF06247  Pfam Plasmodium ookinete surface protein Pvs28
PF00520 PF00520  Pfam                     Ion transport protein
PF01496 PF01496  Pfam     V-type ATPase 116kDa subunit family  
PF00038 PF00038  Pfam             Intermediate filament protein
PF05955 PF05955  Pfam       Equine herpesvirus glycoprotein gp2
PF00083 PF00083  Pfam             Sugar (and other) transporter
PF00769 PF00769  Pfam               Ezrin/radixin/moesin family
'onto.GOBP' (from package 'dcGOR' version 1.0.5) has been loaded into the working environment
g
An object of S4 class 'Onto'
@adjMatrix: a direct matrix of 25154 terms (parents/from) X 25154 terms (children/to) 
@nodeInfo (InfoDataFrame)
  nodeNames: GO:0008150 GO:0000003 GO:0001906 ... GO:0021810 GO:0021811
    (25154 total)
  nodeAttr: term_id term_name term_namespace term_distance
'Pfam2GOBP' (from package 'dcGOR' version 1.0.5) has been loaded into the working environment
Anno
An object of S4 class 'Anno'
@annoData: 3241 domains, 1005 terms 
@termData (InfoDataFrame)
  termNames: GO:0008152 GO:0040007 GO:0048511 ... GO:0019281 GO:0032862
    (1005 total)
  tvarLabels: ID Name Namespace Distance
@domainData (InfoDataFrame)
  domainNames: PF00001 PF00002 PF00003 ... PF15550 PF15556 (3241 total)
  dvarLabels: id level description
An object of S4 class 'Onto'
@adjMatrix: a direct matrix of 1601 terms (parents/from) X 1601 terms (children/to) 
@nodeInfo (InfoDataFrame)
  nodeNames: GO:0008150 GO:0000003 GO:0001906 ... GO:0019281 GO:0032862
    (1601 total)
  nodeAttr: term_id term_name term_namespace term_distance annotations
    IC
Start at 2015-07-23 12:18:01
First, extract all annotatable domains (2015-07-23 12:18:01)...
	there are 29 input domains amongst 3241 annotatable domains
Second, pre-compute semantic similarity between 20 terms (forced to be the most specific for each domain) using Resnik method (2015-07-23 12:18:11)...
Last, calculate pair-wise semantic similarity between 29 domains using BM.average method (2015-07-23 12:18:12)...
	1 out of 29 (2015-07-23 12:18:12)
	3 out of 29 (2015-07-23 12:18:12)
	6 out of 29 (2015-07-23 12:18:13)
	9 out of 29 (2015-07-23 12:18:13)
	12 out of 29 (2015-07-23 12:18:13)
	15 out of 29 (2015-07-23 12:18:13)
	18 out of 29 (2015-07-23 12:18:13)
	21 out of 29 (2015-07-23 12:18:13)
	24 out of 29 (2015-07-23 12:18:13)
	27 out of 29 (2015-07-23 12:18:13)
Finish at 2015-07-23 12:18:13
Runtime in total is: 12 secs
dnetwork
An object of S4 class 'Dnetwork'
@adjMatrix: a weighted symmetric matrix of 29 domains X 29 domains 
@nodeInfo (InfoDataFrame)
  nodeNames: PF00443 PF00070 PF05557 ... PF00332 PF00872 (29 total)
  nodeAttr: id
### heatmap the adjacency matrix of the domain network
Adj_GOBP <- as.matrix(adjMatrix(dnetwork))
visHeatmapAdv(Adj_GOBP, Rowv=F, Colv=F, dendrogram="none", colormap="white-lightpink-darkred", zlim=c(0,1.5), cexRow=0.7, cexCol=0.7, KeyValueName="GOBP semantic similarity")
'onto.GOMF' (from package 'dcGOR' version 1.0.5) has been loaded into the working environment
g
An object of S4 class 'Onto'
@adjMatrix: a direct matrix of 9595 terms (parents/from) X 9595 terms (children/to) 
@nodeInfo (InfoDataFrame)
  nodeNames: GO:0003674 GO:0000988 GO:0001071 ... GO:0004008 GO:0086037
    (9595 total)
  nodeAttr: term_id term_name term_namespace term_distance
'Pfam2GOMF' (from package 'dcGOR' version 1.0.5) has been loaded into the working environment
Anno
An object of S4 class 'Anno'
@annoData: 3359 domains, 1065 terms 
@termData (InfoDataFrame)
  termNames: GO:0003824 GO:0004872 GO:0005198 ... GO:0016286 GO:0005219
    (1065 total)
  tvarLabels: ID Name Namespace Distance
@domainData (InfoDataFrame)
  domainNames: PF00001 PF00002 PF00003 ... PF15510 PF15549 (3359 total)
  dvarLabels: id level description
An object of S4 class 'Onto'
@adjMatrix: a direct matrix of 1368 terms (parents/from) X 1368 terms (children/to) 
@nodeInfo (InfoDataFrame)
  nodeNames: GO:0003674 GO:0000988 GO:0001071 ... GO:0016286 GO:0005219
    (1368 total)
  nodeAttr: term_id term_name term_namespace term_distance annotations
    IC
Start at 2015-07-23 12:18:29
First, extract all annotatable domains (2015-07-23 12:18:29)...
	there are 36 input domains amongst 3359 annotatable domains
Second, pre-compute semantic similarity between 33 terms (forced to be the most specific for each domain) using Resnik method (2015-07-23 12:18:33)...
Last, calculate pair-wise semantic similarity between 36 domains using BM.average method (2015-07-23 12:18:34)...
	1 out of 36 (2015-07-23 12:18:34)
	4 out of 36 (2015-07-23 12:18:35)
	8 out of 36 (2015-07-23 12:18:35)
	12 out of 36 (2015-07-23 12:18:35)
	16 out of 36 (2015-07-23 12:18:35)
	20 out of 36 (2015-07-23 12:18:35)
	24 out of 36 (2015-07-23 12:18:35)
	28 out of 36 (2015-07-23 12:18:35)
	32 out of 36 (2015-07-23 12:18:35)
Finish at 2015-07-23 12:18:35
Runtime in total is: 6 secs
dnetwork
An object of S4 class 'Dnetwork'
@adjMatrix: a weighted symmetric matrix of 36 domains X 36 domains 
@nodeInfo (InfoDataFrame)
  nodeNames: PF00769 PF00173 PF02786 ... PF00332 PF00872 (36 total)
  nodeAttr: id
### heatmap the adjacency matrix of the domain network
Adj_GOMF <- as.matrix(adjMatrix(dnetwork))
visHeatmapAdv(Adj_GOMF, Rowv=F, Colv=F, dendrogram="none", colormap="white-lightpink-darkred", zlim=c(0,1.5), cexRow=0.7, cexCol=0.7, KeyValueName="GOMF semantic similarity")
'onto.GOCC' (from package 'dcGOR' version 1.0.5) has been loaded into the working environment
g
An object of S4 class 'Onto'
@adjMatrix: a direct matrix of 3215 terms (parents/from) X 3215 terms (children/to) 
@nodeInfo (InfoDataFrame)
  nodeNames: GO:0005575 GO:0005576 GO:0005623 ... GO:0044192 GO:0044201
    (3215 total)
  nodeAttr: term_id term_name term_namespace term_distance
'Pfam2GOCC' (from package 'dcGOR' version 1.0.5) has been loaded into the working environment
Anno
An object of S4 class 'Anno'
@annoData: 2036 domains, 307 terms 
@termData (InfoDataFrame)
  termNames: GO:0005576 GO:0009295 GO:0016020 ... GO:0070188 GO:0042729
    (307 total)
  tvarLabels: ID Name Namespace Distance
@domainData (InfoDataFrame)
  domainNames: PF00001 PF00002 PF00003 ... PF15550 PF15556 (2036 total)
  dvarLabels: id level description
An object of S4 class 'Onto'
@adjMatrix: a direct matrix of 417 terms (parents/from) X 417 terms (children/to) 
@nodeInfo (InfoDataFrame)
  nodeNames: GO:0005575 GO:0005576 GO:0009295 ... GO:0070188 GO:0042729
    (417 total)
  nodeAttr: term_id term_name term_namespace term_distance annotations
    IC
Start at 2015-07-23 12:18:44
First, extract all annotatable domains (2015-07-23 12:18:44)...
	there are 8 input domains amongst 2036 annotatable domains
Second, pre-compute semantic similarity between 8 terms (forced to be the most specific for each domain) using Resnik method (2015-07-23 12:18:46)...
Last, calculate pair-wise semantic similarity between 8 domains using BM.average method (2015-07-23 12:18:47)...
	1 out of 8 (2015-07-23 12:18:47)
	2 out of 8 (2015-07-23 12:18:47)
	3 out of 8 (2015-07-23 12:18:47)
	4 out of 8 (2015-07-23 12:18:47)
	5 out of 8 (2015-07-23 12:18:47)
	6 out of 8 (2015-07-23 12:18:47)
	7 out of 8 (2015-07-23 12:18:47)
Finish at 2015-07-23 12:18:47
Runtime in total is: 3 secs
dnetwork
An object of S4 class 'Dnetwork'
@adjMatrix: a weighted symmetric matrix of 8 domains X 8 domains 
@nodeInfo (InfoDataFrame)
  nodeNames: PF00769 PF01496 PF05955 ... PF00520 PF00083 (8 total)
  nodeAttr: id
### heatmap the adjacency matrix of the domain network
Adj_GOCC <- as.matrix(adjMatrix(dnetwork))
visHeatmapAdv(Adj_GOCC, Rowv=F, Colv=F, dendrogram="none", colormap="white-lightpink-darkred", zlim=c(0,1.5), cexRow=0.7, cexCol=0.7, KeyValueName="GOCC semantic similarity")
## 4) Obtain GO-based overall semantic similarity via merging all three subontology (GOBP, GOMF and GOCC) based semantic similarity
allnodes <- sort(unique(c(rownames(Adj_GOBP), rownames(Adj_GOMF), rownames(Adj_GOCC))))
D <- matrix(0, nrow=length(allnodes), ncol=length(allnodes))
colnames(D) <- rownames(D) <- allnodes
### add Adj_GOBP
ind <- match(rownames(Adj_GOBP), allnodes)
D[ind,ind] <- D[ind,ind]+Adj_GOBP
### add Adj_GOMF
ind <- match(rownames(Adj_GOMF), allnodes)
D[ind,ind] <- D[ind,ind]+Adj_GOMF
### add Adj_GOCC
ind <- match(rownames(Adj_GOCC), allnodes)
D[ind,ind] <- D[ind,ind]+Adj_GOCC
### heatmap the GO-based overall semantic similarity
visHeatmapAdv(D, Rowv=T, Colv=T, dendrogram="none", colormap="white-lightpink-darkred", zlim=c(0,2), cexRow=0.5, cexCol=0.5, KeyValueName="GO overall semantic similarity")