Description
visEnrichment
visualises enrichment analysis
outputs (represented as an 'Eoutput' object) in the context of the
ontology hierarchy (a directed acyclic graph; DAG). Only the part of the
DAG induced by the nodes/terms specified in the query (and by the mode
defining the paths to the root of the DAG) is visualised. Nodes in the
query are framed in black (by default), and all nodes (those in the query
plus the induced ones) are color-coded according to a given data.type:
the z-score itself for 'zscore', or the negative base-10 logarithm for
'adjp' or 'pvalue'. If no query nodes are given, the top 5 significant
terms (in terms of adjusted p-value) are used for visualisation.
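The color-coding described above can be sketched in base R. This is only an illustration under stated assumptions: the adjusted p-values and Term IDs are made up, the helpers `palette_from_name` and `map_to_colors` are hypothetical names that do not exist in dcGOR, and the package's internal implementation may differ.

```r
# Hypothetical adjusted p-values, named by (made-up) Term IDs
adjp <- c("GO:0000001" = 1, "GO:0000002" = 0.1, "GO:0000003" = 0.01,
          "GO:0000004" = 1e-4, "GO:0000005" = 1e-6)

# For 'adjp' or 'pvalue', the negative base-10 logarithm is color-coded
z <- -log10(adjp)

# Build a palette function from a hyphen-separated color string
# (assumption: every piece is a color name known to R)
palette_from_name <- function(colormap) {
  grDevices::colorRampPalette(strsplit(colormap, "-", fixed = TRUE)[[1]])
}

# Map values onto equispaced intervals whose midpoints span zlim
# (assumes at least two colors)
map_to_colors <- function(z, colors, zlim = range(z, finite = TRUE)) {
  n <- length(colors)
  step <- diff(zlim) / (n - 1)
  breaks <- seq(zlim[1] - step / 2, zlim[2] + step / 2, length.out = n + 1)
  idx <- findInterval(z, breaks, rightmost.closed = TRUE)
  idx[which(idx < 1 | idx > n)] <- NA  # values well outside zlim get no color
  setNames(colors[idx], names(z))
}

ncolors <- 5
cols <- palette_from_name("lightyellow-orange")(ncolors)
node_cols <- map_to_colors(z, cols, zlim = c(0, 4))
```

With `zlim = c(0, 4)`, the term whose value lies far above the upper limit receives no color (NA) in this sketch, whereas a value only slightly beyond the limit would still fall into the last interval.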
Usage
visEnrichment(e, nodes_query = NULL, num_top_nodes = 5, path.mode = c("all_shortest_paths",
"shortest_paths", "all_paths"), data.type = c("adjp", "pvalue", "zscore"), height = 7,
width = 7, margin = rep(0.1, 4), colormap = c("yr", "bwr", "jet", "gbr", "wyr",
"br", "rainbow", "wb", "lightyellow-orange"), ncolors = 40, zlim = NULL,
colorbar = T, colorbar.fraction = 0.1, newpage = T, layout.orientation = c("left_right",
"top_bottom", "bottom_top", "right_left"), node.info = c("both", "none",
"term_id", "term_name", "full_term_name"), graph.node.attrs = NULL, graph.edge.attrs = NULL,
node.attrs = NULL)
Arguments
- e
- an object of S4 class 'Eoutput'
- nodes_query
- a vector containing the nodes/terms in query.
These nodes are used to produce a subgraph of the ontology DAG induced
by them. If NULL, the top significant terms (in terms of p-value) are
used instead; their number is set by the next argument 'num_top_nodes'
- num_top_nodes
- a numeric value specifying the number of the top
significant terms (in terms of p-value) to be used. This parameter
is ignored if the previous 'nodes_query' has been specified
- path.mode
- the mode of paths induced by nodes in query. It can be
"all_paths" for all possible paths to the root, "shortest_paths" for
only one shortest path to the root (for each node in query), or
"all_shortest_paths" for all shortest paths to the root (i.e. for each
node, find all shortest paths of equal length)
- data.type
- a character string specifying which data type for nodes in query
is used to color-code nodes. It can be one of 'adjp' for adjusted
p-values (by default), 'pvalue' for p-values and 'zscore' for z-scores.
When 'adjp' or 'pvalue' is used, the negative base-10 logarithm is taken.
For the style of color-coding, please see the arguments
colormap, ncolors, zlim and colorbar
- height
- a numeric value specifying the height of device
- width
- a numeric value specifying the width of device
- margin
- margins as units of length 4 or 1
- colormap
- short name for the colormap. It can be one of "yr"
(yellow-red colormap; by default), "jet" (jet colormap), "bwr"
(blue-white-red colormap), "gbr" (green-black-red colormap), "wyr"
(white-yellow-red colormap), "br" (black-red colormap), "wb"
(white-black colormap), and "rainbow" (rainbow colormap, that is,
red-yellow-green-cyan-blue-magenta). Alternatively, any
hyphen-separated HTML color names, e.g. "lightyellow-orange",
"blue-black-yellow", "royalblue-white-sandybrown",
"darkgreen-white-darkviolet". A list of standard color names can be
found in http://html-color-codes.info/color-names
- ncolors
- the number of colors specified over the colormap
- zlim
- the minimum and maximum z/data values for which colors
should be plotted, defaulting to the range of the finite values of z.
Each of the given colors will be used to color an equispaced interval
of this range. The midpoints of the intervals cover the range, so that
values just outside the range will be plotted
- colorbar
- logical to indicate whether to append a colorbar. If
data is NULL, it is always set to FALSE
- colorbar.fraction
- the relative fraction of colorbar block
against the device size
- newpage
- logical to indicate whether to open a new page. By
default, it is set to TRUE, so a new page is opened
- layout.orientation
- the orientation of the DAG layout. It can be
one of "left_right" for the left-right layout (viewed from the DAG root
point; by default), "top_bottom" for the top-bottom layout,
"bottom_top" for the bottom-top layout, and "right_left" for the
right-left layout
- node.info
- specifies the ontology term information used to label
nodes. It can be one of "both" for using both Term ID and Name (the
first 15 characters; by default), "none" for no node labeling,
"term_id" for using Term ID, "term_name" for using Term Name (the first
15 characters), and "full_term_name" for using the full Term Name
- graph.node.attrs
- a list of global node attributes. These node
attributes will be changed globally. See 'Note' below for details on
the attributes
- graph.edge.attrs
- a list of global edge attributes. These edge
attributes will be changed globally. See 'Note' below for details on
the attributes
- node.attrs
- a list of local node attributes. These node
attributes will be changed locally; as such, for each attribute, the
input value must be a named vector (i.e. using Term ID as names). See
'Note' below for details on the attributes
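As a sketch of how the three attribute arguments differ in shape, the global lists hold one value per attribute, whereas each element of the local list is a vector named by Term ID. The Term IDs below are hypothetical placeholders, and the call itself is shown only as a comment because it needs a real 'Eoutput' object.

```r
# Hypothetical Term IDs standing in for real query nodes
term_ids <- c("GO:0000001", "GO:0000002", "GO:0000003")

# Global attributes: a single value applied to every node or edge
graph.node.attrs <- list(shape = "ellipse", fontsize = 10)
graph.edge.attrs <- list(color = "gray", style = "dashed")

# Local attributes: each element is a vector named by Term ID,
# so only the named nodes are affected
node.attrs <- list(
  color = setNames(rep("blue", length(term_ids)), term_ids),
  style = setNames(c("bold", "solid", "solid"), term_ids)
)

# With real Term IDs and an 'Eoutput' object, the lists would be passed as:
# visEnrichment(eoutput, nodes_query = term_ids,
#               graph.node.attrs = graph.node.attrs,
#               graph.edge.attrs = graph.edge.attrs,
#               node.attrs = node.attrs)
```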
Value
An object of class 'Ragraph'
Note
A list of global node attributes used in "graph.node.attrs":
- "shape": the shape of the node: "circle", "rectangle", "rect",
"box" and "ellipse"
- "fixedsize": logical indicating whether to use only the width and
height attributes. By default, it is set to TRUE, so the node does not
expand to fit the width of the label
- "fillcolor": the background color of the node
- "color": the color for the node, corresponding to the outside
edge of the node
- "fontcolor": the color for the node text/labelings
- "fontsize": the font size for the node text/labelings
- "height": the height (in inches) of the node: 0.5 by default
- "width": the width (in inches) of the node: 0.75 by default
- "style": the line style for the node: "solid", "dashed",
"dotted", "invis" and "bold"
A list of global edge attributes used in "graph.edge.attrs":
- "color": the color of the edge: gray by default
- "weight": the weight of the edge: 1 by default
- "style": the line style for the edge: "solid", "dashed",
"dotted", "invis" and "bold"
A list of local node attributes used in "node.attrs" (only those named
Term IDs will be changed locally!):
- "label": a named vector specifying the node text/labelings
- "shape": a named vector specifying the shape of the node:
"circle", "rectangle", "rect", "box" and "ellipse"
- "fixedsize": a named vector specifying whether the node size is
fixed (TRUE means not expanding to fit the width of the label)
- "fillcolor": a named vector specifying the background color of
the node
- "color": a named vector specifying the color for the node,
corresponding to the outside edge of the node
- "fontcolor": a named vector specifying the color for the node
text/labelings
- "fontsize": a named vector specifying the font size for the node
text/labelings
- "height": a named vector specifying the height (in inches) of the
node: 0.5 by default
- "width": a named vector specifying the width (in inches) of the
node: 0.75 by default
- "style": a named vector specifying the line style for the node:
"solid", "dashed", "dotted", "invis" and "bold"
Examples
# 1) load SCOP.sf (as 'InfoDataFrame' object)
SCOP.sf <- dcRDataLoader('SCOP.sf')
'SCOP.sf' (from package 'dcGOR' version 1.0.5) has been loaded into the working environment
# randomly select 20 domains
data <- sample(rowNames(SCOP.sf), 20)
# 2) perform enrichment analysis, producing an object of S4 class 'Eoutput'
eoutput <- dcEnrichment(data, domain="SCOP.sf", ontology="GOMF")
Start at 2015-07-23 13:18:37
First, load the ontology 'GOMF', the domain 'SCOP.sf', and their associations (2015-07-23 13:18:37) ...
'onto.GOMF' (from package 'dcGOR' version 1.0.5) has been loaded into the working environment
'SCOP.sf' (from package 'dcGOR' version 1.0.5) has been loaded into the working environment
'SCOP.sf2GOMF' (from package 'dcGOR' version 1.0.5) has been loaded into the working environment
Second, perform enrichment analysis using HypergeoTest (2015-07-23 13:19:06) ...
There are 811 terms being used, each restricted within [10,1000] annotations
Last, adjust the p-values using the BH method (2015-07-23 13:19:06) ...
End at 2015-07-23 13:19:06
Runtime in total is: 29 secs
eoutput
An object of S4 class 'Eoutput', containing following slots:
@domain: 'SCOP.sf'
@ontology: 'GOMF'
@term_info: a data.frame of 142 terms X 5 information
@anno: a list of 142 terms, each storing annotated domains
@data: a vector containing a group of 10 input domains (annotatable)
@background: a vector containing a group of 1083 background domains (annotatable)
@overlap: a list of 142 terms, each containing domains overlapped with input domains
@zscore: a vector of 142 terms, containing z-scores
@pvalue: a vector of 142 terms, containing p-values
@adjp: a vector of 142 terms, containing adjusted p-values
In summary, a total of 142 terms ('GOMF') are analysed for a group of 10 input domains ('SCOP.sf')
# 3) visualise the top 5 significant terms (the default)
# color-coded according to the negative base-10 logarithm of adjusted p-values
visEnrichment(eoutput)
Ontology 'GOMF' containing 15 nodes/terms (including 5 in query; also highlighted in frame) has been shown in your screen, with colorbar indicating -1*log10(adjusted p-values)
# color-coded according to zscore
visEnrichment(eoutput, data.type='zscore')
Ontology 'GOMF' containing 15 nodes/terms (including 5 in query; also highlighted in frame) has been shown in your screen, with colorbar indicating z-scores
# 4) visualise the top 5 significant terms in the ontology hierarchy
nodes_query <- names(sort(adjp(eoutput))[1:5])
visEnrichment(eoutput, nodes_query=nodes_query)
Ontology 'GOMF' containing 14 nodes/terms (including 5 in query; also highlighted in frame) has been shown in your screen, with colorbar indicating -1*log10(adjusted p-values)
# change the frame color: highlight (framed in blue) nodes/terms in query
nodes.highlight <- rep("blue", length(nodes_query))
names(nodes.highlight) <- nodes_query
visEnrichment(eoutput, nodes_query=nodes_query,
node.attrs=list(color=nodes.highlight))
Ontology 'GOMF' containing 14 nodes/terms (including 5 in query; also highlighted in frame) has been shown in your screen, with colorbar indicating -1*log10(adjusted p-values)
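The messages above report more nodes than were queried (for example, 14 or 15 nodes for 5 query terms) because the plotted subgraph also contains the nodes lying on the paths from each query node to the root. The following base-R sketch illustrates, on a toy DAG, how the choice of path.mode can change the set of induced nodes. The node names, the `parents` list and the helper `all_paths_to_root` are hypothetical and not part of dcGOR; the package's own path-finding may differ.

```r
# Toy DAG: each node maps to its parents (hypothetical node names)
parents <- list(
  root = character(0),
  A = "root",
  B = "root",
  C = "A",
  D = c("C", "B")  # D reaches the root via C and A, or more directly via B
)

# Enumerate every path from a node up to the root
all_paths_to_root <- function(node, parents) {
  if (length(parents[[node]]) == 0) return(list(node))
  paths <- list()
  for (p in parents[[node]]) {
    for (up in all_paths_to_root(p, parents)) {
      paths[[length(paths) + 1]] <- c(node, up)
    }
  }
  paths
}

paths <- all_paths_to_root("D", parents)   # analogous to "all_paths"
lens <- lengths(paths)
all_shortest <- paths[lens == min(lens)]   # analogous to "all_shortest_paths"
one_shortest <- all_shortest[1]            # analogous to "shortest_paths"

# Nodes induced by the query node "D" under each mode
induced_all <- unique(unlist(paths))
induced_shortest <- unique(unlist(all_shortest))
```

In this toy case, all paths induce five nodes (D, C, A, root and B), whereas the shortest paths induce only three (D, B and root).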