biseqt.pw module
- biseqt.pw.lib
  The loaded shared object for pwlib. All functions defined in the header file are accessible through this object. This object is automatically populated upon loading this module (cf. setup_ffi()) and users never have to manipulate it.
- biseqt.pw.ffi
  The main FFI instance used throughout this module. This object is automatically populated upon loading this module (cf. setup_ffi()) and users never have to manipulate it.
- biseqt.pw.setup_ffi()
  Instantiates an FFI object as ffi and loads the shared object for pwlib into lib. This function is called automatically when this module loads.
  Note: CFFI has issues with loading macros as they are defined in a header file. For this reason, and since the macros are not used in Python code, any line of the header file that begins with #define is ignored. This means multiline macros will not work.
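The filtering described in the note can be sketched in a few lines. This is only an illustration under assumptions, not the module's actual code: strip_defines and the sample header contents are hypothetical, and the real setup_ffi() may filter lines differently.

```python
def strip_defines(header_text):
    """Drop every line that begins with '#define' (hypothetical helper)."""
    kept = [line for line in header_text.splitlines()
            if not line.startswith('#define')]
    return '\n'.join(kept)


# Made-up header contents, for illustration only.
HEADER = '\n'.join([
    '#define MAX_LEN 100',
    'int solve(int n);',
    '#define SQUARE(x) \\',
    '    ((x) * (x))',
    'double score(void);',
])

cleaned = strip_defines(HEADER)
print(cleaned)
```

The continuation line of the multiline macro survives the filter because it does not itself begin with #define, which is why such macros would leave stray text behind for the C declaration parser.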
- biseqt.pw.STD_MODE
  Standard alignment type; time and memory complexity is quadratic in sequence lengths.
- biseqt.pw.BANDED_MODE
  Banded alignment type; time and memory complexity is linear in sequence lengths with a constant proportional to band width. This mode is incompatible with local alignments.
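To make the complexity difference concrete, the sketch below counts how many dynamic programming cells each mode would have to populate. It assumes a table of (n + 1) by (m + 1) cells and numbers diagonals as d = j - i; both conventions, and the helper names, are assumptions made for illustration and may not match pwlib.

```python
def table_cells(n, m):
    """Number of cells in a full (n + 1) x (m + 1) table."""
    return (n + 1) * (m + 1)


def band_cells(n, m, diag_range):
    """Number of cells (i, j) with lo <= j - i <= hi inside the table."""
    lo, hi = diag_range
    count = 0
    for i in range(n + 1):
        j_min = max(0, i + lo)
        j_max = min(m, i + hi)
        if j_max >= j_min:
            count += j_max - j_min + 1
    return count


full = table_cells(1000, 1000)
banded = band_cells(1000, 1000, (-5, 5))
print(full, banded)
```

For two sequences of length 1000 the band of width 11 touches roughly one percent of the full table, and the banded count grows with n times the band width rather than with n times m.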
- biseqt.pw.GLOBAL
  Standard global alignment problem, i.e. Needleman-Wunsch.
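As a reminder of what the global problem computes, here is a minimal pure-Python Needleman-Wunsch score. It is a sketch under simplifying assumptions: it uses a single linear gap penalty instead of separate gap open and gap extend scores, its default scores are arbitrary, and the function name is hypothetical. It is not the pwlib implementation.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Score of the best end-to-end alignment of a and b (linear gaps)."""
    # Row 0: aligning an empty prefix of a with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]  # column 0: a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(prev[j - 1] + sub,   # match or substitution
                           prev[j] + gap,       # gap in b
                           cur[j - 1] + gap))   # gap in a
        prev = cur
    return prev[-1]


print(global_score('ACGT', 'AGT'))
```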
- biseqt.pw.LOCAL
  Standard local alignment problem, i.e. Smith-Waterman.
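The local variant differs from the global one only in its boundary conditions: an alignment may start and end anywhere, so cell values never drop below zero and the answer is the maximum over the whole table. The following is a textbook Smith-Waterman sketch with a linear gap penalty, arbitrary default scores, and a hypothetical function name; pwlib's own scoring may differ.

```python
def local_score(a, b, match=1, mismatch=-1, gap=-1):
    """Score of the best alignment between any substrings of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(0,                    # start a fresh alignment here
                       prev[j - 1] + sub,    # match or substitution
                       prev[j] + gap,        # gap in b
                       cur[j - 1] + gap)     # gap in a
            cur.append(cell)
            best = max(best, cell)
        prev = cur
    return best


print(local_score('TTTACGTTT', 'GGACGGG'))
```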
- biseqt.pw.START_ANCHORED
  Standard local alignment demanding that it begins at the start of frame of both sequences.
- biseqt.pw.END_ANCHORED
  Standard local alignment demanding that it ends at the end of frame of both sequences.
- biseqt.pw.OVERLAP
  Standard suffix-prefix alignment in any direction; this includes alignments where a prefix of either sequence matches a suffix of the other and alignments where one sequence is a substring of the other.
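One common way to obtain the overlap behaviour described above is to let an alignment start for free anywhere on the first row or first column of the table and end anywhere on the last row or last column. The sketch below follows that textbook formulation with a linear gap penalty; the function name and default scores are assumptions, and pwlib may implement the problem differently.

```python
def overlap_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best score of an alignment that starts at the beginning of a or b
    and ends at the end of a or b (suffix-prefix or containment)."""
    m = len(b)
    prev = [0] * (m + 1)     # free start anywhere on row 0
    best = prev[m]           # last-column candidate on row 0
    for i in range(1, len(a) + 1):
        cur = [0]            # free start anywhere on column 0
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(prev[j - 1] + sub,
                           prev[j] + gap,
                           cur[j - 1] + gap))
        best = max(best, cur[m])   # alignment ending at the end of b
        prev = cur
    return max(best, max(prev))    # alignment ending at the end of a


print(overlap_score('AAACGT', 'CGTTTT'))
```

The same function covers both overlap directions and the containment case, since none of the four free boundaries is tied to a particular sequence.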
- biseqt.pw.START_ANCHORED_OVERLAP
  Standard suffix-prefix alignment demanding that it begins at the start of frame of both sequences.
- biseqt.pw.END_ANCHORED_OVERLAP
  Standard suffix-prefix alignment demanding that it ends at the end of frame of both sequences.
- biseqt.pw.B_GLOBAL
  Banded global alignment problem; may not be well-defined (end points of the table may not lie in band).
- biseqt.pw.B_OVERLAP
  Banded suffix-prefix alignment problem in either direction including substring alignments.
- biseqt.pw.B_LOCAL
  Banded local alignment problem.
- class biseqt.pw.Aligner(origin, mutant, **kw)
  Bases: object
  Provides a context that solves a pairwise alignment problem. Memory is allocated upon entering the context and is freed upon leaving it. All alignment calculations (solve() and traceback()) are explicitly invoked by the caller.
  Parameters:
  - origin (sequence.Sequence) – The original (“from”) sequence.
  - mutant (sequence.Sequence) – The mutant (“to”) sequence.
  Keyword Arguments:
  - origin_range (tuple) – The range of interest on the original (“from”) sequence; cf. alnframe::origin_range.
  - mutant_range (tuple) – The range of interest on the mutant (“to”) sequence; cf. alnframe::mutant_range.
  - alnmode (int) – One of STD_MODE or BANDED_MODE; default is STD_MODE; cf. alnprob::mode.
  - alntype (int) – One of the allowed alignment types for the given alnmode, see ALN_TYPES; default is GLOBAL; cf. std_alnparams and banded_alnparams.
  - subst_scores (list) – The overriding definition of the substitution score matrix; cf. alnscores::subst_scores. Default is None, in which case the score matrix is populated based on match and mismatch scores.
  - match_score (float) – If subst_scores is not given, this parameter is used to populate the diagonal entries of the substitution score matrix; default is 1.
  - mismatch_score (float) – If subst_scores is not given, this parameter is used to populate the off-diagonal entries of the substitution score matrix; default is 0.
  - go_score (float) – The gap open score; cf. alnscores::gap_open_score. Default is 0.
  - ge_score (float) – The gap extend score; cf. alnscores::gap_extend_score. Default is 0.
  - max_new_mins (int) – Maximum number of tolerated new minima encountered in the running score of an alignment; cf. alnprob::max_new_mins. Default is -1, in which case no such constraint is imposed.
  - diag_range (tuple) – In BANDED_MODE, this argument specifies the upper and lower limits on the diagonals of the dynamic programming table to be populated; cf. banded_alnparams.
  - min_score (float) – The minimum required score for an alignment to be reported; default is float("-inf"), in which case all alignments are reported.
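The way match_score and mismatch_score fill in the substitution matrix when subst_scores is not given can be sketched as follows. The helper name and the representation of the alphabet as a plain string are assumptions for illustration; the real Aligner works with sequence.Alphabet objects.

```python
def build_subst_scores(alphabet, match_score=1, mismatch_score=0):
    """Square matrix with match_score on the diagonal and
    mismatch_score everywhere else (hypothetical helper)."""
    size = len(alphabet)
    return [[match_score if i == j else mismatch_score
             for j in range(size)]
            for i in range(size)]


for row in build_subst_scores('ACGT'):
    print(row)
```

Passing an explicit subst_scores instead allows asymmetric or letter-specific scores that a single match/mismatch pair cannot express.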
  - solve()
    Populates the regions of interest in the dynamic programming table and reports the optimal score, if any. This function must be called within the context; cf. __enter__(), __exit__().
    Returns: The score of the optimal alignment, or None if none is found.
    Return type: float
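The allocate-on-enter, free-on-exit pattern and the requirement that solve() run inside the context can be illustrated with a small stand-in class. SketchAligner is hypothetical and works on plain strings; it is not the real Aligner, which delegates to pwlib. With the documented defaults (match 1, mismatch 0, gap scores 0) the global score reduces to the length of a longest common subsequence, which is what this sketch computes.

```python
class SketchAligner:
    """Stand-in for the context manager pattern (not biseqt's Aligner)."""

    def __init__(self, origin, mutant):
        self.origin = origin
        self.mutant = mutant
        self._table = None  # nothing is allocated until __enter__

    def __enter__(self):
        rows, cols = len(self.origin) + 1, len(self.mutant) + 1
        self._table = [[0] * cols for _ in range(rows)]
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._table = None  # release the table
        return False        # do not swallow exceptions

    def solve(self):
        if self._table is None:
            raise RuntimeError('solve() must be called within the context')
        t = self._table
        for i, x in enumerate(self.origin, 1):
            for j, y in enumerate(self.mutant, 1):
                t[i][j] = max(t[i - 1][j - 1] + (1 if x == y else 0),
                              t[i - 1][j],
                              t[i][j - 1])
        return float(t[-1][-1])


with SketchAligner('ACGT', 'AGT') as aligner:
    score = aligner.solve()
print(score)
```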
- class biseqt.pw.Alignment(origin, mutant, transcript, score=None, origin_start=0, mutant_start=0)
  Bases: object
  Represents a pairwise alignment.
  - origin
    sequence.Sequence – The original (“from”) sequence.
  - mutant
    sequence.Sequence – The mutant (“to”) sequence.
  - alphabet
    sequence.Alphabet – The shared alphabet of origin and mutant.
  - transcript
    str – The sequence of edit operations that transforms origin to mutant. The alphabet for edit operations is M for match, S for substitution (mismatch), and I and D for insertion and deletion.
  - origin_start
    int – Starting position on the original sequence; default is 0.
  - mutant_start
    int – Starting position on the mutant sequence; default is 0.
  - score
    float – The score of the alignment; default is None.
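To illustrate how a transcript relates the two sequences, the sketch below expands one into a pair of gapped rows. It assumes that M and S consume one letter from each sequence, I consumes a letter of the mutant only, and D consumes a letter of the origin only; the function name and the use of plain strings are hypothetical.

```python
def gapped_rows(origin, mutant, transcript, origin_start=0, mutant_start=0):
    """Expand an edit transcript into two equal-length gapped strings."""
    i, j = origin_start, mutant_start
    top, bottom = [], []
    for op in transcript:
        if op in ('M', 'S'):
            top.append(origin[i])
            bottom.append(mutant[j])
            i += 1
            j += 1
        elif op == 'I':      # letter present only in the mutant
            top.append('-')
            bottom.append(mutant[j])
            j += 1
        elif op == 'D':      # letter present only in the origin
            top.append(origin[i])
            bottom.append('-')
            i += 1
        else:
            raise ValueError('unknown edit operation: %r' % op)
    return ''.join(top), ''.join(bottom)


rows = gapped_rows('ACGT', 'AGTT', 'MDMMI')
print(rows[0])
print(rows[1])
```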
  - classmethod projected_len(transcript, on='origin')
    Calculates the projected length of a given transcript on either of the involved sequences. For instance:
    >>> biseqt.Alignment.projected_len('MSI', on='origin')
    2
    >>> biseqt.Alignment.projected_len('MSI', on='mutant')
    3
    Parameters: transcript (str) – A sequence of edit operations, cf. Alignment.transcript.
    Keyword Arguments: on (str) – Either origin or mutant.
    Returns: The projected length of the edit transcript.
    Return type: int
  - calculate_score(subst_scores, go_score, ge_score)
    Scores this alignment according to the given scoring scheme.
    Parameters:
    - subst_scores (list) – The substitution score matrix, cf. Aligner.subst_scores.
    - go_score (float) – The gap open score; cf. Aligner.go_score.
    - ge_score (float) – The gap extend score; cf. Aligner.ge_score.
    Returns: The score of the alignment for origin and mutant based on the given scores.
    Return type: float
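A transcript can be scored by walking it once and accumulating substitution and gap scores. The sketch below makes two assumptions that the text above does not confirm: a run of k consecutive insertions (or deletions) contributes go_score + k * ge_score, and subst_scores is indexed by the position of each letter in the alphabet. The function name and the plain-string arguments are hypothetical.

```python
def score_transcript(origin, mutant, transcript, alphabet,
                     subst_scores, go_score, ge_score):
    """Score an edit transcript under an assumed affine gap model."""
    index = {letter: k for k, letter in enumerate(alphabet)}
    i = j = 0
    score = 0.0
    prev_op = None
    for op in transcript:
        if op in ('M', 'S'):
            score += subst_scores[index[origin[i]]][index[mutant[j]]]
            i += 1
            j += 1
        elif op in ('I', 'D'):
            if op != prev_op:     # a new gap run opens here
                score += go_score
            score += ge_score     # every gap position pays the extend score
            if op == 'I':
                j += 1
            else:
                i += 1
        else:
            raise ValueError('unknown edit operation: %r' % op)
        prev_op = op
    return score


# Match 2, mismatch -1 over the alphabet ACGT.
SUBST = [[2 if r == c else -1 for c in range(4)] for r in range(4)]
total = score_transcript('ACGT', 'AGTT', 'MDMMI', 'ACGT', SUBST, -3, -1)
print(total)
```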
  - render_term(term_width=120, margin=0, colored=True)
    Renders a textual representation of the alignment.
    Keyword Arguments:
    - term_width (int) – Terminal width used for wrapping; default is 120 and the smallest valid value is 30.
    - margin (int) – Length of the leading and trailing substrings of the original and mutant sequences to include; default is 0.
    - colored (bool) – Whether or not to use ANSI color codes in output; default is True.
    Returns: str
-