biseqt.pw module¶
-
biseqt.pw.lib¶ The loaded shared object for pwlib. All functions defined in the header file are accessible through this object. This object is automatically populated upon loading this module (cf. setup_ffi()) and users never have to manipulate it.
-
biseqt.pw.ffi¶ The main FFI instance used throughout this module. This object is automatically populated upon loading this module (cf. setup_ffi()) and users never have to manipulate it.
-
biseqt.pw.setup_ffi()[source]¶ Instantiates an FFI object as ffi and loads the shared object for pwlib into lib. This function is automatically called when this module loads.
Note: CFFI has issues with loading macros as they are defined in a header file. For this reason, and since we don’t use the macros in Python code, any line that begins with #define is dropped from the header file before it is loaded. This means multiline macros will not work.
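The filtering described in the note can be sketched as follows. This is a minimal illustration rather than the module's actual code: strip_define_lines and the header fragment are hypothetical, and only the "drop lines that begin with #define" rule comes from the note above.

```python
def strip_define_lines(header_text):
    """Drop every line that begins with ``#define`` (hypothetical helper)."""
    kept = [
        line for line in header_text.splitlines()
        if not line.startswith("#define")
    ]
    return "\n".join(kept)


# A made-up header fragment, used only for illustration.
header = "\n".join([
    "#define MAX_LEN 100",
    "typedef struct { int length; } seq;",
    "#define SQUARE(x) \\",
    "    ((x) * (x))",
    "int seq_len(seq *s);",
])

cleaned = strip_define_lines(header)
print(cleaned)
```

The continuation line of the two-line SQUARE macro does not itself begin with #define, so it survives the filter and is left behind as a stray fragment. This is why multiline macros do not work under this scheme.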
-
biseqt.pw.STD_MODE¶ Standard alignment type; time and memory complexity is quadratic in sequence lengths.
-
biseqt.pw.BANDED_MODE¶ Banded alignment type; time and memory complexity is linear in sequence lengths with a constant proportional to band width. This mode is incompatible with local alignments.
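To see why a band makes the cost linear, the sketch below counts the dynamic programming cells inside a band of diagonals. It is an illustration under stated assumptions: cells_in_band is a hypothetical helper, and the convention that cell (i, j) lies on diagonal j - i is assumed here and may differ from the one used by pwlib.

```python
def cells_in_band(origin_len, mutant_len, diag_range):
    """Count table cells (i, j) with lo <= j - i <= hi (assumed convention)."""
    lo, hi = diag_range
    count = 0
    for i in range(origin_len + 1):
        # Columns on row i that fall inside the band, clipped to the table.
        first = max(0, i + lo)
        last = min(mutant_len, i + hi)
        if last >= first:
            count += last - first + 1
    return count


full_table = (10 + 1) * (10 + 1)             # cells populated in STD_MODE
narrow = cells_in_band(10, 10, (-1, 1))      # band of three diagonals
longer = cells_in_band(100, 100, (-1, 1))    # same band, 10x longer sequences
print(full_table, narrow, longer)
```

With a fixed band width, a tenfold increase in sequence length increases the number of populated cells roughly tenfold, whereas the full table grows a hundredfold.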
-
biseqt.pw.GLOBAL¶ Standard global alignment problem, i.e. Needleman-Wunsch.
-
biseqt.pw.LOCAL¶ Standard local alignment problem, i.e. Smith-Waterman.
-
biseqt.pw.START_ANCHORED¶ Standard local alignment demanding that it begins at the start of frame of both sequences.
-
biseqt.pw.END_ANCHORED¶ Standard local alignment demanding that it ends at the end of frame of both sequences.
-
biseqt.pw.OVERLAP¶ Standard suffix-prefix alignment in any direction; this includes alignments where a prefix of either sequence matches a suffix of the other and alignments where one sequence is a substring of the other.
-
biseqt.pw.START_ANCHORED_OVERLAP¶ Standard suffix-prefix alignment demanding that it begins at the start of frame of both sequences.
-
biseqt.pw.END_ANCHORED_OVERLAP¶ Standard suffix-prefix alignment demanding that it ends at the end of frame of both sequences.
-
biseqt.pw.B_GLOBAL¶ Banded global alignment problem; may not be well-defined (end points of the table may not lie in band).
-
biseqt.pw.B_OVERLAP¶ Banded suffix-prefix alignment problem in either direction including substring alignments.
-
biseqt.pw.B_LOCAL¶ Banded local alignment problem.
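The difference between the global and local problems can be illustrated with a small pure-Python scorer. This is a textbook sketch, not the pwlib implementation: align_score is a hypothetical function, and it assumes a single linear gap score instead of the separate gap open and gap extend scores accepted by Aligner.

```python
def align_score(origin, mutant, local=False, match=1, mismatch=-1, gap=-1):
    """Optimal global (Needleman-Wunsch) or local (Smith-Waterman) score."""
    rows, cols = len(origin) + 1, len(mutant) + 1
    table = [[0] * cols for _ in range(rows)]
    if not local:
        # A global alignment must pay for leading gaps.
        for i in range(1, rows):
            table[i][0] = i * gap
        for j in range(1, cols):
            table[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            subst = match if origin[i - 1] == mutant[j - 1] else mismatch
            score = max(
                table[i - 1][j - 1] + subst,  # match or substitution
                table[i - 1][j] + gap,        # deletion
                table[i][j - 1] + gap,        # insertion
            )
            if local:
                # A local alignment may restart anywhere, so never go below 0.
                score = max(score, 0)
                best = max(best, score)
            table[i][j] = score
    # Global: score at the table corner; local: best score seen anywhere.
    return best if local else table[-1][-1]


global_score = align_score("GGGACGT", "ACGTCCC")
local_score = align_score("GGGACGT", "ACGTCCC", local=True)
print(global_score, local_score)
```

The two sequences share only the substring ACGT, so the local score rewards that region alone while the global score is pulled down by the unmatched flanks. Both variants fill the whole table, which is the quadratic cost noted for STD_MODE.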
-
class biseqt.pw.Aligner(origin, mutant, **kw)[source]¶ Bases: object
Provides a context that solves a pairwise alignment problem. Memory is allocated upon entering the context and is freed upon leaving it. All alignment calculations (solve() and traceback()) are explicitly invoked by the caller.
Parameters:
- origin (sequence.Sequence) – The original (“from”) sequence.
- mutant (sequence.Sequence) – The mutant (“to”) sequence.
Keyword Arguments:
- origin_range (tuple) – The range of the original (“from”) sequence; cf. alnframe::origin_range.
- mutant_range (tuple) – The range of the mutant (“to”) sequence; cf. alnframe::mutant_range.
- alnmode (int) – One of STD_MODE or BANDED_MODE; default is STD_MODE; cf. alnprob::mode.
- alntype (int) – One of the allowed alignment types for the given alnmode, see ALN_TYPES; default is GLOBAL; cf. std_alnparams and banded_alnparams.
- subst_scores (list) – The overriding definition of the substitution score matrix; cf. alnscores::subst_scores. Default is None, in which case the score matrix is populated based on match and mismatch scores.
- match_score (float) – If subst_scores is not given, this parameter is used to populate the diagonal entries of the substitution score matrix; default is 1.
- mismatch_score (float) – If subst_scores is not given, this parameter is used to populate the off-diagonal entries of the substitution score matrix; default is 0.
- go_score (float) – The gap open score; cf. alnscores::gap_open_score. Default is 0.
- ge_score (float) – The gap extend score; cf. alnscores::gap_extend_score. Default is 0.
- max_new_mins (int) – Maximum number of tolerated new minima encountered in the running score of an alignment; cf. alnprob::max_new_mins. Default is -1, in which case no such constraint is imposed.
- diag_range (tuple) – If in BANDED_MODE, this argument specifies the upper and lower limit on diagonals of the dynamic programming table to be populated; cf. banded_alnparams.
- min_score (float) – The minimum required score for an alignment to be reported; default is float("-inf"), in which case all alignments are reported.
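As an illustration of how match_score and mismatch_score populate the substitution score matrix when subst_scores is not given, consider the sketch below. The helper build_subst_scores is hypothetical, and the nested-list layout indexed by letter position is an assumption, not a statement about the layout expected by pwlib.

```python
def build_subst_scores(alphabet, match_score=1, mismatch_score=0):
    """Square matrix: match_score on the diagonal, mismatch_score elsewhere."""
    return [
        [match_score if a == b else mismatch_score for b in alphabet]
        for a in alphabet
    ]


scores = build_subst_scores("ACGT", match_score=1, mismatch_score=-1)
for row in scores:
    print(row)
```

With the documented defaults (1 and 0), the resulting matrix is the identity matrix over the alphabet.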
-
solve()[source]¶ Populates the regions of interest in the dynamic programming table and reports the optimal score, if any. This function must be called within the context, cf. __enter__(), __exit__().
Returns: The score of the optimal alignment or None if none is found.
Return type: float
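The requirement that solve() be called within the context follows from the allocate-on-enter, free-on-exit pattern described for Aligner. The sketch below shows that pattern with a hypothetical ToyAligner class. It is not the real Aligner: the toy scoring only counts matching positions, and whether the real class raises an error outside the context is not specified here.

```python
class ToyAligner:
    """Hypothetical stand-in showing the allocate-on-enter pattern."""

    def __init__(self, origin, mutant):
        self.origin, self.mutant = origin, mutant
        self.table = None  # nothing is allocated until the context is entered

    def __enter__(self):
        rows, cols = len(self.origin) + 1, len(self.mutant) + 1
        self.table = [[0.0] * cols for _ in range(rows)]
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.table = None  # release the table on the way out
        return False       # do not swallow exceptions

    def solve(self):
        if self.table is None:
            raise RuntimeError("solve() must be called inside the context")
        # Toy score: number of positions where the two strings agree.
        return float(sum(a == b for a, b in zip(self.origin, self.mutant)))


with ToyAligner("ACGT", "ACCT") as aligner:
    score = aligner.solve()
print(score)
```

Tying the table's lifetime to a with block means the memory is released even if the body raises, which matters when the allocation lives in a shared object rather than in Python.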
-
class biseqt.pw.Alignment(origin, mutant, transcript, score=None, origin_start=0, mutant_start=0)[source]¶ Bases: object
Represents a pairwise alignment.
-
origin¶ sequence.Sequence – The original (“from”) sequence.
-
mutant¶ sequence.Sequence – The mutant (“to”) sequence.
-
alphabet¶ sequence.Alphabet – The shared alphabet of origin and mutant.
-
transcript¶ str – The sequence of edit operations that transforms origin to mutant. The alphabet for edit operations is M for match, S for substitution (mismatch), and I and D for insertion and deletion.
-
origin_start¶ int – Starting position on the original sequence; default is 0.
-
mutant_start¶ int – Starting position on the mutant sequence; default is 0.
-
score¶ float – The score of the alignment; default is None.
-
classmethod projected_len(transcript, on='origin')[source]¶ Calculates the projected length of a given transcript on either of the involved sequences. For instance:
>>> biseqt.Alignment.projected_len('MSI', on='origin')
2
>>> biseqt.Alignment.projected_len('MSI', on='mutant')
3
Parameters: transcript (str) – A sequence of edit operations, cf. Alignment.transcript.
Keyword Arguments: on (str) – Either of origin or mutant.
Returns: The projected length of the edit transcript.
Return type: int
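The projection can be sketched as a standalone function. This is an assumed reimplementation for illustration, not the library source: it takes M and S to consume one letter of both sequences, I to consume the mutant only (consistent with the example above), and D to consume the origin only.

```python
def projected_len(transcript, on="origin"):
    """Number of letters of the chosen sequence covered by a transcript."""
    if on not in ("origin", "mutant"):
        raise ValueError("on must be 'origin' or 'mutant'")
    # Operations that advance the chosen sequence by one letter.
    consumed = "MSD" if on == "origin" else "MSI"
    return sum(1 for op in transcript if op in consumed)


print(projected_len("MSI", on="origin"))
print(projected_len("MSI", on="mutant"))
```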
-
calculate_score(subst_scores, go_score, ge_score)[source]¶ Scores this alignment according to the given scoring scheme.
Parameters:
- subst_scores (list) – The substitution score matrix, cf. Aligner.subst_scores.
- go_score (float) – The gap open score; cf. Aligner.go_score.
- ge_score (float) – The gap extend score; cf. Aligner.ge_score.
Returns: The score of the alignment for origin and mutant based on the given scores.
Return type: float
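A rough sketch of scoring a transcript under such a scheme is given below. Several details are assumptions rather than documented behaviour: score_transcript is a hypothetical function, the substitution matrix is taken to be a nested list indexed by letter position in the alphabet, and a gap of length k is taken to score go_score + k * ge_score.

```python
def score_transcript(origin, mutant, transcript, subst_scores,
                     go_score, ge_score, alphabet="ACGT"):
    """Score an edit transcript with affine gap scores (assumed convention)."""
    i = j = 0
    total = 0.0
    previous = None
    for op in transcript:
        if op in "MS":
            row = alphabet.index(origin[i])
            col = alphabet.index(mutant[j])
            total += subst_scores[row][col]
            i += 1
            j += 1
        elif op in "ID":
            if op != previous:
                total += go_score  # a new gap is being opened
            total += ge_score      # every gap position pays the extend score
            if op == "D":
                i += 1
            else:
                j += 1
        else:
            raise ValueError("unknown edit operation: %r" % op)
        previous = op
    return total


# Match scores 1 and mismatch scores -1 over the alphabet ACGT.
subst = [[1 if a == b else -1 for b in "ACGT"] for a in "ACGT"]
print(score_transcript("ACGT", "ACT", "MMDM", subst, go_score=-2, ge_score=-1))
```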
-
render_term(term_width=120, margin=0, colored=True)[source]¶ Renders a textual representation of the alignment.
Keyword Arguments: - term_width (int) – Terminal width used for wrapping; default is 120 and the smallest valid value is 30.
- margin (int) – Length of leading and trailing substring to include in original and mutant sequences; default is 0.
- colored (bool) – Whether or not to use ANSI color codes in output; default is True.
Returns: str
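For orientation, a much simplified rendering of an alignment from its transcript might look like the sketch below. It is not the render_term() implementation: gapped_rows and render_plain are hypothetical helpers, and margins, ANSI colors and position labels are left out.

```python
def gapped_rows(origin, mutant, transcript):
    """Expand a transcript into two equal-length rows with '-' for gaps."""
    i = j = 0
    top, bottom = [], []
    for op in transcript:
        if op in "MS":
            top.append(origin[i])
            bottom.append(mutant[j])
            i += 1
            j += 1
        elif op == "D":  # letter present in origin only
            top.append(origin[i])
            bottom.append("-")
            i += 1
        elif op == "I":  # letter present in mutant only
            top.append("-")
            bottom.append(mutant[j])
            j += 1
        else:
            raise ValueError("unknown edit operation: %r" % op)
    return "".join(top), "".join(bottom)


def render_plain(origin, mutant, transcript, term_width=120):
    """Wrap the two gapped rows into blocks of at most term_width columns."""
    top, bottom = gapped_rows(origin, mutant, transcript)
    blocks = [
        top[start:start + term_width] + "\n" + bottom[start:start + term_width]
        for start in range(0, len(top), term_width)
    ]
    return "\n\n".join(blocks)


print(render_plain("ACGT", "ACT", "MMDM", term_width=2))
```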
-