Calculate the posterior marginal probabilities at a given variant using forward and backward tables propagated to that position.
Usage
PostProbs(
fwd,
bck,
unif.on.underflow = FALSE,
M = NULL,
beta.theta.opts = NULL,
nthreads = min(parallel::detectCores(logical = FALSE), fwd$to_recipient -
fwd$from_recipient + 1)
)
Arguments
- fwd
a forward table as returned by
MakeForwardTable()
and propagated to a target variant byForward()
. Must be at the same variant asbck
(unlessbck
is in "beta-theta space" in which case if must be downstream ... seeBackward()
for details).- bck
a backward table as returned by
MakeBackwardTable()
and propagated to a target variant byBackward()
. Must be at the same variant asfwd
(unlessbck
is in "beta-theta space" in which case if must be downstream ... seeBackward()
for details).- unif.on.underflow
a logical; if
TRUE
, then if all probabilities in a column underflow, then they will be set to \(\frac{1}{N-1}\) instead of 0- M
a pre-existing matrix into which to write the probabilities, can yield substantial speed up but requires special attention (see Details)
- beta.theta.opts
a list; see Details.
- nthreads
the number of CPU cores to use. By default uses the
parallel
package to detect the number of physical cores.
Value
Matrix of posterior marginal probabilities. The \((j,i)\)-th element of of the returned matrix is the probability that \(j\) is copied by \(i\) at the current variant, \(l\), of the two tables, given the haplotypes observed (over the whole sequence).
Details
The forward and backward tables must be at the same variant in order for them to be combined to yield the posterior marginal probabilities at variant \(l\). The \((j,i)\)-th element of of the returned matrix is the probability that \(j\) is copied by \(i\) at the current variant, \(l\), of the two tables, given the haplotypes observed (over the whole sequence).
Note that each column represents an independent HMM.
By convention, every diagonal element is zero.
Notes on beta.theta.opts
In order to obtain posterior marginal probability matrices between variants fwd$l
and bck$l
, then bck
must be in "beta-theta space", see Backward()
for details.
This allows the forward and backward tables to be transitioning both tables to some genomic position between fwd$l
and bck$l
.
The precise recombination distance by which each table is propagated can be controlled by passing optional arguments in a list via beta.theta.opts
.
The recombination distances used can be specified in one of two ways.
Manually. In this case,
beta.theta.opts
is a list containing two named elements:"rho.fwd"
\(\in (0,1)\) specifies the transition probabilityrho
for propagating the forward table."rho.bck"
\(\in (0,1)\) specifies the transition probabilityrho
for propagating the backward table.
Implicitly. In this case,
beta.theta.opts
is a list containing two named elements:
"pars"
: akalisParameters
object that implicitly defines the recombination distance \(\rho^\star\) betweenfwd$l
andbck$l
"bias"
\(\in (0,1)\). The forward table is propagated a distance of(bias)
\(\rho^\star\) and the backward table is propagated a distance of(1-bias)
\(\rho^\star\).
Performance notes
If calculating many posterior probability matrices in succession, providing a pre-existing matrix M
that can be updated in-place can dramatically increase speed by eliminating the time needed for memory allocation.
Be warned, since the matrix is updated in-place, if any other variables point to the same memory address, they will also be simultaneously overwritten.
For example, writing
will update M
and P
simultaneously.
When provided, M
must have dimensions matching that of fwd$alpha
.
Typically, that is simply \(N \times N\) for \(N\) haplotypes.
However, if kalis is being run in a distributed manner, M
will be a \(N \times R\) matrix where \(R\) is the number of recipient haplotypes on the current machine.
References
Aslett, L.J.M. and Christ, R.R. (2024) "kalis: a modern implementation of the Li & Stephens model for local ancestry inference in R", BMC Bioinformatics, 25(1). Available at: doi:10.1186/s12859-024-05688-8 .
See also
DistMat()
to generate calculate \(d_{ji}\) distances directly;
Forward()
to propagate a Forward table to a new variant;
Backward()
to propagate a Backward table to a new variant
Examples
# To get the posterior probabilities at, say, variants 100 of the toy data
# built into kalis
data("SmallHaps")
data("SmallMap")
CacheHaplotypes(SmallHaps)
#> Warning: haplotypes already cached ... overwriting existing cache.
rho <- CalcRho(diff(SmallMap))
pars <- Parameters(rho)
fwd <- MakeForwardTable(pars)
bck <- MakeBackwardTable(pars)
Forward(fwd, pars, 100)
Backward(bck, pars, 100)
p <- PostProbs(fwd, bck)
d <- DistMat(fwd, bck)