biseqt.sequence module
class biseqt.sequence.Alphabet(letters)
Bases: object
A sequence alphabet.
_letters
tuple – The letters in the alphabet. All getitem operations (i.e. indexing and slicing) are delegated to this tuple. This attribute should be considered read-only.
_letlen
int – The length of each letter in the alphabet when represented as a string. This attribute should be considered read-only.
Parameters: letters (iterable) – The elements of this iterable must be hashable, i.e. usable as dictionary keys, and must respond to len(). Typically, they are single-character strings.
letter_to_idx(letters)
Translates the provided letters to the integer sequence corresponding to the index of each letter in this alphabet.
Parameters: letters (iterable) – The letters to be translated to integer indices. Each element retrieved through iteration should be an element of _letters.
Returns: tuple
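The translation described above can be illustrated without biseqt itself. The following is a minimal sketch, not the library's implementation: the function name `letter_to_idx_sketch` and the dictionary lookup are assumptions made for illustration.

```python
# Illustrative sketch only; this is NOT biseqt's implementation.
# `letter_to_idx_sketch` is a hypothetical stand-in for Alphabet.letter_to_idx.

def letter_to_idx_sketch(alphabet_letters, letters):
    """Return a tuple holding the alphabet position of each given letter."""
    # Build the lookup table once so each letter costs one dict access.
    # This is why alphabet letters must be hashable.
    index_of = {letter: idx for idx, letter in enumerate(alphabet_letters)}
    return tuple(index_of[letter] for letter in letters)


dna = ('A', 'C', 'G', 'T')
indices = letter_to_idx_sketch(dna, 'AGGGT')
print(indices)  # (0, 2, 2, 2, 3)
```

A letter that is not in the alphabet raises `KeyError` in this sketch; how biseqt reports that case is not stated in this reference.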
parse(string)
Given a string representation of a sequence, returns the corresponding Sequence object.
Parameters: string (str) – The raw sequence represented as a string.
Returns: Sequence
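Since letters may be longer than one character (see _letlen), parsing a raw string plausibly means cutting it into fixed-width chunks before looking each chunk up. The sketch below shows that idea on plain tuples. The name `parse_sketch`, the two-character example alphabet, and the `ValueError` on a ragged input are all assumptions rather than documented biseqt behaviour.

```python
# Illustrative sketch only; this is NOT biseqt's implementation.
# `parse_sketch` is a hypothetical helper that returns index tuples
# rather than a real Sequence object.

def parse_sketch(alphabet_letters, string):
    """Split `string` into letters of equal width and map them to indices."""
    letlen = len(alphabet_letters[0])  # plays the role of _letlen
    if len(string) % letlen != 0:
        # Assumed behaviour: reject strings that do not divide evenly.
        raise ValueError('string length is not a multiple of the letter length')
    index_of = {letter: idx for idx, letter in enumerate(alphabet_letters)}
    chunks = [string[i:i + letlen] for i in range(0, len(string), letlen)]
    return tuple(index_of[chunk] for chunk in chunks)


codes = ('00', '01', '10', '11')      # an alphabet with two-character letters
parsed = parse_sketch(codes, '001110')
print(parsed)  # (0, 3, 2)
```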
transform(seq, mappings={})
Transforms the given sequence to another sequence in the same alphabet according to the provided letter-to-letter mappings.
Parameters:
- seq (Sequence) – The original sequence.
- mappings (list|dict) – If a dictionary is given, each entry represents a translation rule from the key to the value. If a list is given, each entry must have two elements and is taken to represent a bidirectional translation rule between those two elements. Each element in either a dictionary or a list can be either a letter in string format or an integer representing the position of the letter.
Returns: Sequence
For example, to get the complement of a DNA sequence:
>>> from biseqt.sequence import Alphabet
>>> A = Alphabet('ACGT')
>>> S = A.parse('AGGGT')
>>> print A.transform(S, mappings=['AT', 'CG'])
'TCCCA'
whereas to get the same effect with a dictionary:
>>> mappings = {'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}
>>> print A.transform(S, mappings=mappings)
'TCCCA'
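To make the difference between the two mapping forms concrete, here is a self-contained sketch that expands either form into a single index-to-index table and applies it. It does not use biseqt. The helper names `normalize_mappings` and `transform_sketch` are hypothetical, and leaving unmapped letters unchanged is an assumption, since this reference does not say how such letters are handled.

```python
# Illustrative sketch only; this is NOT biseqt's implementation.

def normalize_mappings(alphabet_letters, mappings):
    """Turn a dict (one-way rules) or a list (two-way rules) into an
    index-to-index dictionary."""
    index_of = {letter: idx for idx, letter in enumerate(alphabet_letters)}

    def to_idx(element):
        # Elements may be letters or integer positions.
        return element if isinstance(element, int) else index_of[element]

    if isinstance(mappings, dict):
        pairs = list(mappings.items())
    else:
        pairs = []
        for first, second in mappings:   # e.g. 'AT' unpacks to 'A', 'T'
            pairs.append((first, second))
            pairs.append((second, first))  # list entries are bidirectional
    return {to_idx(src): to_idx(dst) for src, dst in pairs}


def transform_sketch(alphabet_letters, contents, mappings):
    """Apply the mappings to a tuple of letter indices."""
    table = normalize_mappings(alphabet_letters, mappings)
    # Assumption: letters without a rule are left unchanged.
    return tuple(table.get(idx, idx) for idx in contents)


dna = ('A', 'C', 'G', 'T')
contents = (0, 2, 2, 2, 3)  # AGGGT
from_list = transform_sketch(dna, contents, ['AT', 'CG'])
from_dict = transform_sketch(
    dna, contents, {'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'})
print(''.join(dna[i] for i in from_list))  # TCCCA
```

With a list, two entries are enough for the full complement because each entry implies both directions; the dictionary form needs all four rules spelled out.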
class biseqt.sequence.Sequence(alphabet, contents=())
Bases: object
An immutable sequence of letters from some Alphabet which behaves mostly like a tuple.
contents
tuple – The contents of the sequence, represented as a tuple of integers of the same length in which each letter is represented by its position in the alphabet.
content_id
string – Hex representation of the SHA1 digest of the sequence.
Initializes the sequence object: translates all letters to integers corresponding to the position of each letter in the alphabet.
Parameters:
- alphabet (Alphabet) – The Alphabet of the sequence.
- contents (iterable) – The contents of the sequence as an iterable, each element of which is the integer representation of a letter from the Alphabet; default is an empty sequence. If the alphabet letter length is one, this argument can be a string.
reverse()
Returns another sequence whose contents are those of this sequence in reverse order.
Returns: Sequence
transform(mappings={})
Wraps Alphabet.transform() for convenience.
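Because reverse() and transform() each return a new Sequence, they can presumably be chained, for instance as `S.reverse().transform(mappings=['AT', 'CG'])` to obtain a reverse complement. That chained call is inferred from the signatures on this page and has not been checked against the library. The sketch below reproduces the same composition on plain index tuples, with hypothetical names and no dependency on biseqt.

```python
# Illustrative sketch only; this is NOT biseqt's implementation.
# It works on plain tuples of letter indices instead of Sequence objects.

DNA = ('A', 'C', 'G', 'T')
# Index-level complement rules for the letter order above: A<->T, C<->G.
COMPLEMENT = {0: 3, 3: 0, 1: 2, 2: 1}


def reverse_complement_sketch(contents):
    """Reverse the index tuple, then complement every position."""
    return tuple(COMPLEMENT[idx] for idx in reversed(contents))


contents = (0, 2, 2, 2, 3)  # AGGGT
rc = reverse_complement_sketch(contents)
as_text = ''.join(DNA[i] for i in rc)
print(as_text)  # ACCCT
```

The input tuple is never modified; each step builds a new tuple, which mirrors the immutability described for Sequence.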