biseqt.seeds module

class biseqt.seeds.SeedIndex(S, T, kmer_cache=None, **kw)

Bases: biseqt.kmers.KmerDBWrapper

An index for seeds in diagonal coordinates.

S

biseqt.sequence.Sequence – The 1st sequence.

T

biseqt.sequence.Sequence – The 2nd sequence.

cache

KmerCache – optional KmerCache object to use for retrieving integer representations of sequences.

seeds_table

The seeds table name seeds_[name], cf. KmerDBWrapper.name.

classmethod to_diagonal_coordinates(i, j)

Convert standard coordinates to diagonal coordinates via:

\[\begin{split}\begin{aligned} d & = i - j \\ a & = i + j \end{aligned}\end{split}\]
classmethod to_ij_coordinates(d, a)

Convert diagonal coordinates to standard coordinates:

\[\begin{split}\begin{aligned} i & = \frac{a + d}{2} \\ j & = \frac{a - d}{2} \end{aligned}\end{split}\]
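
The two conversions above are inverses of each other. The following is a minimal standalone sketch of that round trip, not the library's own implementation: the function names `to_diagonal` and `to_ij` are hypothetical, and the parity check is an added assumption that follows from \(a + d = 2i\) and \(a - d = 2j\).

```python
def to_diagonal(i, j):
    """Map standard coordinates (i, j) to diagonal coordinates (d, a)."""
    return i - j, i + j


def to_ij(d, a):
    """Map diagonal coordinates (d, a) back to standard coordinates (i, j)."""
    # a + d = 2i and a - d = 2j, so d and a always share the same parity
    # when they come from integer (i, j).
    if (a + d) % 2 != 0:
        raise ValueError("d and a must have the same parity")
    return (a + d) // 2, (a - d) // 2


d, a = to_diagonal(7, 3)
print((d, a))        # (4, 10)
print(to_ij(d, a))   # (7, 3)
```

Integer division keeps the result in integers, which is safe here only because the parity check has already ruled out odd sums.
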
classmethod to_ij_coordinates_seg(seg)

Convert a segment in diagonal coordinates to standard coordinates according to to_ij_coordinates().

Parameters:seg (tuple) – \((d_{\min}, d_{\max}), (a_{\min}, a_{\max})\)
Returns:\((i_s, i_e), (j_s, j_e)\), the start and end coordinates along the origin and mutant sequences (\(i\) and \(j\) respectively).
Return type:tuple
seeds(d_band=None, exclude_trivial=False)

Yields all seeds, optionally those within a diagonal band.

Keyword Arguments:
 d_band (tuple|None) – If specified, a \((d_{\min}, d_{\max})\) tuple restricting the yielded seeds to a diagonal band.
Yields:tuple – seed coordinates \((i, j)\).
seed_count(d_band=None, a_band=None)

Counts the number of seeds, either in the whole table or within the specified diagonal and/or antidiagonal band.

Parameters:
  • d_band (tuple|None) – If specified, a \((d_{\min}, d_{\max})\) tuple restricting the seed count to a diagonal band.
  • a_band (tuple|None) – If specified, a \((a_{\min}, a_{\max})\) tuple restricting the seed count to an antidiagonal band.
Returns:

Number of seeds found in the entire table or within the specified diagonal and/or antidiagonal band.

Return type:

int
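
To make the band semantics concrete, here is a minimal sketch that counts seeds from an in-memory list of \((i, j)\) pairs rather than from the seeds table. The helper name `count_seeds` is hypothetical, and treating both band limits as inclusive is an assumption; the library may use different boundary conventions.

```python
def count_seeds(seeds, d_band=None, a_band=None):
    """Count (i, j) seeds whose diagonal coordinates lie in the given bands.

    Assumption: band limits are inclusive on both ends.
    """
    count = 0
    for i, j in seeds:
        d, a = i - j, i + j  # same mapping as to_diagonal_coordinates
        if d_band is not None and not (d_band[0] <= d <= d_band[1]):
            continue
        if a_band is not None and not (a_band[0] <= a <= a_band[1]):
            continue
        count += 1
    return count


seeds = [(0, 0), (5, 3), (4, 6), (10, 10)]
print(count_seeds(seeds))                                 # 4
print(count_seeds(seeds, d_band=(-1, 1)))                 # 2
print(count_seeds(seeds, d_band=(1, 3), a_band=(0, 10)))  # 1
```

Passing neither band counts every seed; passing both restricts the count to the intersection of the two bands.
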

class biseqt.seeds.SeedIndexMultiple(*seqs, **kw)

Bases: biseqt.kmers.KmerDBWrapper

An index for seeds between multiple sequences in diagonal coordinates.

seqs

list[biseqt.sequence.Sequence] – The sequences of interest.

cache

KmerCache – optional KmerCache object to use for retrieving integer representations of sequences.

seeds_table

The seeds table name seeds_[name], cf. KmerDBWrapper.name.

classmethod to_diagonal_coordinates(*idxs)

Convert standard coordinates to diagonal coordinates via:

\[\begin{split}\begin{aligned} d_k & = i_1 - i_{k+1}, \qquad k = 1, 2, \ldots, n - 1 \\ a & = \sum_{k=1}^n i_k \end{aligned}\end{split}\]
classmethod to_ij_coordinates(ds, a)

Convert diagonal \((d_1,\ldots, d_{n-1}, a)\) coordinates to standard coordinates via:

\[\begin{split}\begin{aligned} i_1 & = \frac{a + \sum_{k=1}^{n-1}d_k}{n} \\ i_k & = i_1 - d_{k-1}, \qquad k = 2, \ldots, n \end{aligned}\end{split}\]
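
The inverse follows from substituting \(i_{k+1} = i_1 - d_k\) into the sum, which gives \(a = n\,i_1 - \sum_k d_k\). The sketch below checks that round trip for \(n\) sequences. It is a standalone illustration with hypothetical names (`to_diagonal`, `to_standard`), not the library's code, and the divisibility check is an added assumption.

```python
def to_diagonal(idxs):
    """Map (i_1, ..., i_n) to ((d_1, ..., d_{n-1}), a)."""
    first = idxs[0]
    ds = tuple(first - i for i in idxs[1:])  # d_k = i_1 - i_{k+1}
    return ds, sum(idxs)                     # a = i_1 + ... + i_n


def to_standard(ds, a):
    """Map ((d_1, ..., d_{n-1}), a) back to (i_1, ..., i_n)."""
    n = len(ds) + 1
    total = a + sum(ds)  # equals n * i_1 for valid input
    if total % n != 0:
        raise ValueError("a + sum(ds) must be divisible by n")
    i_1 = total // n
    return (i_1,) + tuple(i_1 - d for d in ds)


ds, a = to_diagonal((7, 3, 5))
print(ds, a)               # (4, 2) 15
print(to_standard(ds, a))  # (7, 3, 5)
```

With two sequences this reduces to the pairwise case: `to_diagonal((7, 3))` gives `((4,), 10)`.
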
classmethod to_ij_coordinates_seg(seg)

Convert a segment in diagonal coordinates to standard coordinates according to to_ij_coordinates().

Parameters:seg (tuple) – \((d_{1,\min}, d_{1,\max}), \ldots, (d_{n-1,\min}, d_{n-1,\max}), (a_{\min}, a_{\max})\)
Returns:start and end coordinates in each of the \(n\) sequences.
Return type:tuple
seeds()

Yields all seeds in diagonal coordinates.

Yields:tuple – seed coordinates \((d_1, \ldots, d_{n-1}, a)\).
seed_count(ds_band=None, a_band=None)

Counts the number of seeds, either in the whole table or within the specified diagonal hyper-band and/or antidiagonal band.

Parameters:
  • ds_band (list|None) – If specified, a list of \(n-1\) tuples \((d_{k,\min}, d_{k,\max})\) restricting the seed count to a diagonal hyper-band.
  • a_band (tuple|None) – If specified, a \((a_{\min}, a_{\max})\) tuple restricting the seed count to an antidiagonal band.
Returns:

Number of seeds found in the entire table or within the specified diagonal hyper-band and/or antidiagonal band.

Return type:

int
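
As an illustration of the hyper-band restriction, the sketch below counts seeds given directly in diagonal coordinates \((d_1, \ldots, d_{n-1}, a)\), the form that seeds() yields for this class. The helper name `count_seeds_in_hyperband` is hypothetical, the seeds live in a plain list rather than the seeds table, and inclusive band limits are an assumption.

```python
def count_seeds_in_hyperband(seeds, ds_band=None, a_band=None):
    """Count seeds (d_1, ..., d_{n-1}, a) lying inside the given bands.

    Assumption: every band limit is inclusive.
    """
    count = 0
    for seed in seeds:
        ds, a = seed[:-1], seed[-1]
        # Each d_k must fall within its own (min, max) band.
        if ds_band is not None and not all(
            lo <= d <= hi for d, (lo, hi) in zip(ds, ds_band)
        ):
            continue
        if a_band is not None and not (a_band[0] <= a <= a_band[1]):
            continue
        count += 1
    return count


seeds = [(0, 0, 0), (4, 2, 15), (-1, 3, 9), (1, 1, 30)]
print(count_seeds_in_hyperband(seeds))                             # 4
print(count_seeds_in_hyperband(seeds, ds_band=[(0, 5), (0, 5)]))   # 3
print(count_seeds_in_hyperband(seeds, ds_band=[(0, 5), (0, 5)],
                               a_band=(0, 20)))                    # 2
```
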