Hits-class {IRanges} | R Documentation |
The Hits class stores a set of "hits" between the elements in one
vector-like object (called the "query") and the elements in another (called
the "subject"). Currently, Hits are used to represent the result of a call
to findOverlaps, though other operations producing "hits" are imaginable.
The as.matrix and as.data.frame methods coerce a Hits object to a
two-column matrix or data.frame with one row for each hit, where the value
in the first column is the index of an element in the query and the value in
the second column is the index of an element in the subject.
The as.table method counts the number of hits for each query element and
outputs the counts as a table.
To transpose a Hits x, so that the subject and query are interchanged, call
t(x). This allows, for example, counting the number of hits for each subject
element using as.table.
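The coercion, counting, and transposition described above can be sketched in base R without IRanges. This is only an illustration under stated assumptions: the two-column matrix `hits` and the names `query_length` and `subject_length` are hypothetical stand-ins for a Hits object, not the package's internal representation. The hit pairs are the ones the example ranges at the end of this page would be expected to produce.

```r
# Hypothetical stand-in for a Hits object: one row per hit,
# column 1 = query index, column 2 = subject index.
hits <- cbind(queryHits = c(1L, 1L, 3L), subjectHits = c(1L, 2L, 3L))
query_length <- 3L    # assumed number of query elements
subject_length <- 3L  # assumed number of subject elements

# Hits per query element (analogous to as.table(x)). Using a factor with
# every query index as a level keeps queries that have zero hits.
per_query <- table(factor(hits[, "queryHits"],
                          levels = seq_len(query_length)))

# "Transpose" by swapping the two columns (analogous to t(x)), then count
# again to obtain the hits per subject element.
transposed <- hits[, c("subjectHits", "queryHits"), drop = FALSE]
colnames(transposed) <- c("queryHits", "subjectHits")
per_subject <- table(factor(transposed[, "queryHits"],
                            levels = seq_len(subject_length)))

print(per_query)
print(per_subject)
```

Query 2 has no hits here, so its count is 0 rather than being dropped from the table.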
In the code snippets below, x is a Hits object.
as.matrix(x): Coerces x to a two-column integer matrix, with each row
representing a hit between a query index (first column) and subject index
(second column).
as(from, "DataFrame"): Creates a DataFrame by combining the result of
as.matrix(from) with mcols(from).
as.data.frame(x): Attempts to coerce the result of as(x, "DataFrame") to a
data.frame.
as.table(x): Counts the number of hits for each query element in x and
outputs the counts as a table.
t(x): Interchanges the query and subject in x and returns the transposed
Hits.
as.list(x): Returns a list with an element for each query, where each
element contains the indices of the subjects that have a hit with the
corresponding query.
as(x, "List"): Like as.list, above.
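The list form described for as.list can be mimicked in base R. This is a minimal sketch, assuming the hits are available as two parallel integer vectors; the names `q`, `s`, and `query_length` are hypothetical and not part of the IRanges API.

```r
q <- c(1L, 1L, 3L)  # query indices of the hits (assumed data)
s <- c(1L, 2L, 3L)  # subject indices of the hits (assumed data)
query_length <- 3L  # assumed number of query elements

# split() on a factor whose levels are all query indices yields one list
# element per query, with an empty integer vector for queries without hits.
by_query <- unname(split(s, factor(q, levels = seq_len(query_length))))
str(by_query)
```

The second element is `integer(0)` because query 2 has no hits, which keeps the list aligned with the query.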
x[i]: Extracts a subset of the hits. The index argument i may be logical or
numeric. If numeric, be sure that i does not contain any duplicates, which
would violate the set property of Hits.
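Why duplicated indices matter can be seen on a plain matrix of hits. The sketch below is an assumption-laden illustration in base R: `hits`, `i_ok`, and `i_bad` are hypothetical names, and the code does not reproduce how IRanges itself handles such an index.

```r
# Hypothetical stand-in for a Hits object: one row per hit.
hits <- cbind(queryHits = c(1L, 1L, 3L), subjectHits = c(1L, 2L, 3L))

i_ok  <- c(1, 3)     # distinct indices: the result is still a set of hits
i_bad <- c(1, 1, 3)  # index 1 repeated: the same hit would appear twice

subset_ok  <- hits[i_ok, , drop = FALSE]
subset_bad <- hits[i_bad, , drop = FALSE]

# anyDuplicated() returns 0 when nothing is duplicated, so it can serve as
# a guard before subsetting.
index_has_dups <- anyDuplicated(i_bad) > 0       # TRUE
ok_is_set      <- anyDuplicated(subset_ok) == 0  # TRUE
bad_is_set     <- anyDuplicated(subset_bad) == 0 # FALSE: a row is repeated
```

Checking `anyDuplicated(i)` before calling `x[i]` is one simple way to avoid producing a repeated hit.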
queryHits(x): Equivalent to as.data.frame(x)[[1]].
subjectHits(x): Equivalent to as.data.frame(x)[[2]].
countQueryHits(x): Counts the number of hits for each query, returning an
integer vector.
countSubjectHits(x): Counts the number of hits for each subject, returning
an integer vector.
length(x): Gets the number of hits.
queryLength(x), nrow(x): Gets the number of elements in the query.
subjectLength(x), ncol(x): Gets the number of elements in the subject.
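The difference between the number of hits and the query and subject lengths is easy to blur, so here is a small base R sketch. It assumes the hits are held as two parallel integer vectors; `q`, `s`, `query_length`, and `subject_length` are hypothetical names, with `subject_length` deliberately set larger than the largest subject index.

```r
q <- c(1L, 1L, 3L)    # query indices of the hits (assumed data)
s <- c(1L, 2L, 3L)    # subject indices of the hits (assumed data)
query_length <- 3L    # assumed number of query elements
subject_length <- 4L  # assumed: subject has a 4th element with no hits

n_hits <- length(q)   # analogous to length(x): one entry per hit

# tabulate() returns a plain integer vector with one count per element,
# including zeros, which mirrors the description of countQueryHits(x) and
# countSubjectHits(x).
query_counts   <- tabulate(q, nbins = query_length)
subject_counts <- tabulate(s, nbins = subject_length)
```

The counts always sum to the number of hits, while their lengths follow the query and subject lengths, not the number of hits.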
remapHits(x, query.map=NULL, new.queryLength=NA,
          subject.map=NULL, new.subjectLength=NA):
Remaps the hits in x through a "query map" and/or a "subject map". The query
hits are remapped through the "query map", which is specified via the
query.map and new.queryLength arguments. The subject hits are remapped
through the "subject map", which is specified via the subject.map and
new.subjectLength arguments.
The "query map" is conceptually a function (in the mathematical sense) and
is also known as the "mapping function". It must be defined on the 1..M
interval and take values in the 1..N interval, where M is queryLength(x) and
N is the value specified by the user via the new.queryLength argument. Note
that this mapping function doesn't need to be injective or surjective. Also,
it is not represented by an R function but by an integer vector of length M
with no NAs. More precisely, query.map can be NULL (identity map), or a
vector of queryLength(x) non-NA integers that are >= 1 and
<= new.queryLength, or a factor of length queryLength(x) with no NAs (a
factor is treated as an integer vector, and, if new.queryLength is missing,
it is taken to be the number of levels of the factor). Note that a factor
will typically be used to represent a mapping function that is not
injective.
The same applies to the "subject map".
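How a factor doubles as an integer map can be checked in base R alone. The sketch below reuses the factor from the example at the end of this page; the variable names `subject_map`, `codes`, `implied_length`, and `is_injective` are hypothetical.

```r
# A factor of length 3 (one entry per subject element) with 5 levels.
subject_map <- factor(c("e", "e", "d"), levels = letters[1:5])

# Treated as an integer vector, the factor gives the new index of each
# original subject element.
codes <- as.integer(subject_map)

# With no explicit new length, the number of levels plays that role.
implied_length <- nlevels(subject_map)

# Subjects 1 and 2 both map to 5, so this map is not injective.
is_injective <- anyDuplicated(codes) == 0
```

Here `codes` is 5, 5, 4 and `implied_length` is 5, which is why the factor and the pair `c(5, 5, 4)` with a new length of 5 describe the same map.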
remapHits returns a Hits object where all the query and subject hits
(accessed with queryHits and subjectHits, respectively) have been remapped
through the 2 specified maps. This remapping is actually only the 1st step
of the transformation, and is followed by 2 additional steps: (2) the
removal of duplicated hits, and (3) the reordering of the hits (first by
query hits, then by subject hits). Note that if the 2 maps are injective
then the remapping won't introduce duplicated hits, so, in that case, step
(2) is a no-op (but is still performed). Also, if the "query map" is
strictly ascending and the "subject map" ascending then the remapping will
preserve the order of the hits, so, in that case, step (3) is also a no-op
(but is still performed).
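The three steps above can be sketched in base R. This is not the remapHits implementation: `remap_sketch` is a hypothetical helper that works on two parallel integer vectors of hits, ignores the new length arguments, and performs no validation of the maps.

```r
# Hypothetical helper illustrating the three steps on plain vectors.
remap_sketch <- function(q, s, query_map = NULL, subject_map = NULL) {
  # Step 1: remap each index through its map (NULL means identity).
  # as.integer() also turns a factor into its integer codes.
  if (!is.null(query_map))   q <- as.integer(query_map)[q]
  if (!is.null(subject_map)) s <- as.integer(subject_map)[s]
  hits <- cbind(queryHits = q, subjectHits = s)

  # Step 2: remove duplicated hits (duplicated() works row-wise here).
  hits <- hits[!duplicated(hits), , drop = FALSE]

  # Step 3: reorder by query hit first, then by subject hit.
  hits[order(hits[, "queryHits"], hits[, "subjectHits"]), , drop = FALSE]
}

# Hits (1,1), (1,2), (3,3) with subjects remapped by c(5, 5, 4).
out <- remap_sketch(c(1L, 1L, 3L), c(1L, 2L, 3L), subject_map = c(5, 5, 4))
print(out)
```

Because subjects 1 and 2 both map to 5, the first two hits collapse into the single hit (1, 5), leaving two rows: (1, 5) and (3, 4).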
Michael Lawrence
findOverlaps, which generates an instance of this class.
setops-methods for set operations on Hits objects.
library(IRanges)

query <- IRanges(c(1, 4, 9), c(5, 7, 10))
subject <- IRanges(c(2, 2, 10), c(2, 3, 12))
tree <- IntervalTree(subject)
overlaps <- findOverlaps(query, tree)

as.matrix(overlaps)
as.data.frame(overlaps)

as.table(overlaps) # hits per query
as.table(t(overlaps)) # hits per subject

## Remap the subject hits through a factor, then through the equivalent
## integer vector; both calls should return the same Hits object.
hits1 <- remapHits(overlaps, subject.map=factor(c("e", "e", "d"), letters[1:5]))
hits1
hits2 <- remapHits(overlaps, subject.map=c(5, 5, 4), new.subjectLength=5)
hits2
stopifnot(identical(hits1, hits2))