dcAlgoPredictGenome is supposed to predict ontology terms for genomes with domain architectures (including individual domains).
dcAlgoPredictGenome(input.file,
    RData.HIS = c(NULL,
        "Feature2GOBP.sf", "Feature2GOMF.sf", "Feature2GOCC.sf", "Feature2HPPA.sf",
        "Feature2GOBP.pfam", "Feature2GOMF.pfam", "Feature2GOCC.pfam", "Feature2HPPA.pfam",
        "Feature2GOBP.interpro", "Feature2GOMF.interpro", "Feature2GOCC.interpro", "Feature2HPPA.interpro"),
    weight.method = c("none", "copynum", "ic", "both"),
    merge.method = c("sum", "max", "sequential"),
    scale.method = c("log", "linear", "none"),
    feature.mode = c("supra", "individual", "comb"),
    slim.level = NULL, max.num = NULL,
    parallel = TRUE, multicores = NULL, verbose = T,
    RData.HIS.customised = NULL,
    RData.location = "https://github.com/hfang-bristol/RDataCentre/blob/master/dcGOR")
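(see RData.HIS.customised below)
The sequential merge is done via $\sum_{i=1}{\frac{R_{i}}{i}}$, where $R_{i}$ is the $i^{th}$ ranked highest hscore.
Scaling into the range between 0 and 1 is done via $\frac{S - S_{min}}{S_{max} - S_{min}}$, where $S_{min}$ and $S_{max}$ are the minimum and maximum values for $S$.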
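The two formulas can be illustrated with a few lines of base R. This is a minimal sketch, not the package's internal code: the helper names `merge_sequential` and `scale_linear` are hypothetical, and the sketch assumes the scores are plain numeric vectors with $S_{max} > S_{min}$.

```r
# Hypothetical helpers for illustration only; they are not part of dcGOR.

# Sequential merge: rank hscores from highest to lowest, divide the
# i-th ranked score R_i by its rank i, and sum the results.
merge_sequential <- function(hscores) {
  ranked <- sort(hscores, decreasing = TRUE)
  sum(ranked / seq_along(ranked))
}

# Min-max scaling into [0, 1]: (S - S_min) / (S_max - S_min).
# Assumes max(S) > min(S); a constant vector would give NaN.
scale_linear <- function(S) {
  (S - min(S)) / (max(S) - min(S))
}

merged <- merge_sequential(c(2, 6, 3))  # 6/1 + 3/2 + 2/3
scaled <- scale_linear(c(1, 3, 5))      # minimum maps to 0, maximum to 1
```

Dividing by rank means the highest hscore counts in full while each further score contributes progressively less.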
slim.level
The packages "foreach" and "doMC" needed for parallel computation can be installed via:
source("http://bioconductor.org/biocLite.R"); biocLite(c("foreach","doMC"))
If not yet installed, this option will be disabled.
See dcAlgoPropagate on how this object (RData.HIS.customised) is created.
For RData files downloaded into the current working directory, set RData.location=".". If the RData to load is already part of the package itself, this parameter can be ignored (since this function will try to load it via the function data first). Here is the UNIX command for downloading all RData files (preserving the directory structure):
wget -r -l2 -A "*.RData" -np -nH --cut-dirs=0 "http://dcgor.r-forge.r-project.org/data"
a matrix of terms X genomes, containing the predicted scores (per genome) as a whole
none
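Because the parallel option is disabled when "foreach" and "doMC" are missing, it can help to check for them before calling the function. The following is a small sketch using base R's `requireNamespace`; the variable names are illustrative assumptions, and the final call is left commented out because it needs dcGOR and an input file.

```r
# Check whether both parallel backend packages can be loaded.
# requireNamespace() returns FALSE (without an error) if a package is absent.
pkgs <- c("foreach", "doMC")
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Only request parallel computation when both packages are available.
use_parallel <- all(installed)

# Hypothetical usage with an input.file defined elsewhere:
# output <- dcAlgoPredictGenome(input.file, RData.HIS = "Feature2GOMF.sf",
#                               parallel = use_parallel)
```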
# 1) Prepare an input file containing domain architectures
input.file <- "http://dcgor.r-forge.r-project.org/data/Feature/Hominidae.txt"

# 2) Do prediction using built-in data
output <- dcAlgoPredictGenome(input.file, RData.HIS="Feature2GOMF.sf", parallel=FALSE)

Start at 2015-07-23 12:28:29
Read the input file 'http://dcgor.r-forge.r-project.org/data/Feature/Hominidae.txt' ...
Predictions for 4 sequences (9214 distinct architectures) using 'Feature2GOMF.sf' RData, 'sum' merge method, 'log' scale method and 'supra' feature mode (2015-07-23 12:28:29) ...
##############################
'dcAlgoPredict' is being called...
##############################
Start at 2015-07-23 12:28:29
Load the HIS object 'Feature2GOMF.sf' (2015-07-23 12:28:29) ...
'Feature2GOMF.sf' (from https://github.com/hfang-bristol/RDataCentre/blob/master/dcGOR/Feature2GOMF.sf.RData?raw=true) has been loaded into the working environment
Predictions for 9214 architectures using 'sum' merge method, 'log' scale method and 'supra' feature mode (2015-07-23 12:28:30)...
1 out of 9214 (2015-07-23 12:28:30)
922 out of 9214 (2015-07-23 12:28:37)
1844 out of 9214 (2015-07-23 12:28:40)
2766 out of 9214 (2015-07-23 12:28:43)
3688 out of 9214 (2015-07-23 12:28:46)
4610 out of 9214 (2015-07-23 12:28:48)
5532 out of 9214 (2015-07-23 12:28:51)
6454 out of 9214 (2015-07-23 12:28:55)
7376 out of 9214 (2015-07-23 12:28:58)
8298 out of 9214 (2015-07-23 12:29:00)
9214 out of 9214 (2015-07-23 12:29:03)
End at 2015-07-23 12:29:03
Runtime in total is: 34 secs
##############################
'dcAlgoPredict' has been completed!
##############################
A summary in terms of ontology terms using 'none' weight method (2015-07-23 12:29:03)...
Load the HIS object 'Feature2GOMF.sf' (2015-07-23 12:29:03) ...
'Feature2GOMF.sf' (from https://github.com/hfang-bristol/RDataCentre/blob/master/dcGOR/Feature2GOMF.sf.RData?raw=true) has been loaded into the working environment
1 out of 4 (2015-07-23 12:29:03)
2 out of 4 (2015-07-23 12:29:04)
3 out of 4 (2015-07-23 12:29:06)
4 out of 4 (2015-07-23 12:29:07)
End at 2015-07-23 12:29:08
Runtime in total is: 39 secs

dim(output)

[1] 2836    4

output[1:10,]

               gx     hs     of     xp
GO:0003674 1.0000 1.0000 1.0000 1.0000
GO:0005488 0.8282 0.8334 0.8267 0.8246
GO:0005515 0.6853 0.6901 0.6819 0.6830
GO:0003824 0.7433 0.7332 0.7443 0.7430
GO:0043167 0.5006 0.4976 0.4971 0.4925
GO:0016787 0.4827 0.4732 0.4812 0.4798
GO:0016740 0.5363 0.5200 0.5361 0.5350
GO:0097159 0.4650 0.4481 0.4638 0.4606
GO:1901363 0.4538 0.4368 0.4524 0.4502
GO:0043168 0.4166 0.4213 0.4147 0.4107

# 3) Advanced usage: using customised data
x <- base::load(base::url("http://dcgor.r-forge.r-project.org/data/Feature2GOMF.sf.RData"))

Error: the input does not start with a magic number compatible with loading from a connection

RData.HIS.customised <- 'Feature2GOMF.sf.RData'
base::save(list=x, file=RData.HIS.customised)

Error in base::save(list = x, file = RData.HIS.customised): object 'x' not found

#list.files(pattern='*.RData') ## you will see an RData file 'Feature2GOMF.sf.RData' in local directory
output <- dcAlgoPredictGenome(input.file, parallel=FALSE, RData.HIS.customised=RData.HIS.customised)

Start at 2015-07-23 12:29:08
Read the input file 'http://dcgor.r-forge.r-project.org/data/Feature/Hominidae.txt' ...
Predictions for 4 sequences (9214 distinct architectures) using 'Feature2GOBP.sf' RData, 'sum' merge method, 'log' scale method and 'supra' feature mode (2015-07-23 12:29:09) ...
##############################
'dcAlgoPredict' is being called...
##############################
Start at 2015-07-23 12:29:09
Load the HIS object 'Feature2GOBP.sf' (2015-07-23 12:29:09) ...
'Feature2GOBP.sf' (from https://github.com/hfang-bristol/RDataCentre/blob/master/dcGOR/Feature2GOBP.sf.RData?raw=true) has been loaded into the working environment
Predictions for 9214 architectures using 'sum' merge method, 'log' scale method and 'supra' feature mode (2015-07-23 12:29:11)...
1 out of 9214 (2015-07-23 12:29:11)
922 out of 9214 (2015-07-23 12:29:24)
1844 out of 9214 (2015-07-23 12:29:41)
2766 out of 9214 (2015-07-23 12:29:53)
3688 out of 9214 (2015-07-23 12:30:08)
4610 out of 9214 (2015-07-23 12:30:18)
5532 out of 9214 (2015-07-23 12:30:33)
6454 out of 9214 (2015-07-23 12:30:51)
7376 out of 9214 (2015-07-23 12:31:04)
8298 out of 9214 (2015-07-23 12:31:16)
9214 out of 9214 (2015-07-23 12:31:26)
End at 2015-07-23 12:31:26
Runtime in total is: 137 secs
##############################
'dcAlgoPredict' has been completed!
##############################
A summary in terms of ontology terms using 'none' weight method (2015-07-23 12:31:26)...
Load the HIS object 'Feature2GOBP.sf' (2015-07-23 12:31:26) ...
'Feature2GOBP.sf' (from https://github.com/hfang-bristol/RDataCentre/blob/master/dcGOR/Feature2GOBP.sf.RData?raw=true) has been loaded into the working environment
1 out of 4 (2015-07-23 12:31:27)
2 out of 4 (2015-07-23 12:31:37)
3 out of 4 (2015-07-23 12:31:50)
4 out of 4 (2015-07-23 12:31:58)
End at 2015-07-23 12:32:06
Runtime in total is: 178 secs

dim(output)

[1] 11203     4

output[1:10,]

               gx     hs     of     xp
GO:0008150 1.0000 1.0000 1.0000 1.0000
GO:0009987 0.9451 0.9440 0.9455 0.9458
GO:0044699 0.8750 0.8741 0.8725 0.8713
GO:0044763 0.8430 0.8397 0.8412 0.8404
GO:0065007 0.8517 0.8472 0.8487 0.8494
GO:0008152 0.8505 0.8403 0.8490 0.8485
GO:0050789 0.8296 0.8250 0.8267 0.8272
GO:0032501 0.7676 0.7672 0.7651 0.7618
GO:0044707 0.7554 0.7545 0.7519 0.7490
GO:0032502 0.7531 0.7511 0.7505 0.7466
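Since the return value is an ordinary terms X genomes matrix, base R indexing is enough to pull out terms per genome. The sketch below rebuilds by hand the first three rows of the 'Feature2GOMF.sf' output shown earlier, so it runs without dcGOR or network access. The 0.8 cutoff and the variable names are illustrative assumptions, not package defaults.

```r
# First three rows of the 'Feature2GOMF.sf' example output, copied by hand
# (the real call returned 2836 terms x 4 genomes).
scores <- matrix(
  c(1.0000, 1.0000, 1.0000, 1.0000,
    0.8282, 0.8334, 0.8267, 0.8246,
    0.6853, 0.6901, 0.6819, 0.6830),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("GO:0003674", "GO:0005488", "GO:0005515"),
                  c("gx", "hs", "of", "xp"))
)

# Terms whose score in genome 'hs' reaches a hypothetical cutoff of 0.8
cutoff <- 0.8
hs_terms <- rownames(scores)[scores[, "hs"] >= cutoff]

# Genome with the highest score for each term (ties go to the first column)
best_genome <- colnames(scores)[apply(scores, 1, which.max)]
names(best_genome) <- rownames(scores)
```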
dcAlgoPredictGenome.r
dcAlgoPredictGenome.Rd
dcAlgoPredictGenome.pdf
dcRDataLoader, dcAlgoPropagate, dcAlgoPredict