3.3. RNF format library
3.3.1. Read tuple: rnftools.rnfformat.ReadTuple
class rnftools.rnfformat.ReadTuple.ReadTuple(segments=[], read_tuple_id=0, prefix='', suffix='')
Bases: object
Class for an RNF read tuple.
Parameters: - segments (list of rnftools.rnfformat.Segment) – Segments of the read.
- read_tuple_id (int) – Read tuple ID.
- prefix (str) – Prefix for the read name.
- suffix (str) – Suffix for the read name.
destringize(string)
Get RNF values for this read from its textual representation and save them into this object.
Parameters: string (str) – Textual representation of a read.
Raises: ValueError
stringize(rnf_profile=<rnftools.rnfformat.RnfProfile.RnfProfile object>)
Create RNF representation of this read.
Parameters: - read_tuple_id_width (int) – Maximal expected string length of read tuple ID.
- genome_id_width (int) – Maximal expected string length of genome ID.
- chr_id_width (int) – Maximal expected string length of chromosome ID.
- coor_width (int) – Maximal expected string length of a coordinate.
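The width parameters above are described as maximal expected string lengths. The following minimal sketch shows one way such widths could be applied, assuming that numeric fields are zero-padded to the given width and that a segment is written as a parenthesized, comma-separated group. Both the padding rule and the layout are assumptions made for illustration, and `pad` and `format_segment` are hypothetical helpers, not part of rnftools. The default widths are borrowed from the RnfProfile signature.

```python
# Hypothetical helpers for illustration only; not the rnftools implementation.

def pad(value, width):
    """Left-pad a non-negative integer with zeros to at least `width` characters."""
    return str(value).zfill(width)


def format_segment(genome_id, chr_id, direction, left, right,
                   genome_id_width=1, chr_id_width=2, coor_width=9):
    """Render one segment with fixed-width numeric fields (assumed layout)."""
    return "({},{},{},{},{})".format(
        pad(genome_id, genome_id_width),
        pad(chr_id, chr_id_width),
        direction,
        pad(left, coor_width),
        pad(right, coor_width),
    )


segment_text = format_segment(1, 2, "F", 100, 250)
print(segment_text)  # (1,02,F,000000100,000000250)
```

In this sketch a value longer than its width is not truncated, so names stay aligned only while the widths really are upper bounds.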
3.3.2. Segment: rnftools.rnfformat.Segment
class rnftools.rnfformat.Segment.Segment(genome_id=0, chr_id=0, direction='N', left=0, right=0)
Bases: object
Class for a single segment in an RNF read name.
3.3.3. RNF profile: rnftools.rnfformat.RnfProfile
class rnftools.rnfformat.RnfProfile.RnfProfile(prefix_width=0, read_tuple_id_width=8, genome_id_width=1, chr_id_width=2, coor_width=9, read_tuple_name=None)
Bases: object
Class for profile of RNF reads (widths).
Parameters: - prefix_width (int) – Length of prefix.
- read_tuple_id_width (int) – Width of read tuple ID.
- genome_id_width (int) – Width of genome ID.
- chr_id_width (int) – Width of chromosome ID.
- coor_width (int) – Width of a coordinate.
- read_tuple_name (str) – Read tuple name to initialize all the values.
prefix_width
int – Length of prefix.
read_tuple_id_width
int – Width of read tuple ID.
genome_id_width
int – Width of genome ID.
chr_id_width
int – Width of chromosome ID.
coor_width
int – Width of a coordinate.
apply(read_tuple_name, read_tuple_id=None, synchronize_widths=True)
Apply this profile to a read tuple name and update the read tuple ID.
Parameters: - read_tuple_name (str) – Read tuple name to be updated.
- read_tuple_id (int) – New read tuple ID.
- synchronize_widths (bool) – Update widths (in accordance with this profile).
check(read_tuple_name)
Check if the given read tuple name satisfies this profile.
Parameters: read_tuple_name (str) – Read tuple name.
combine()
Combine several profiles, taking the maximal value of each width.
Parameters: *rnf_profiles (rnftools.rnfformat.RnfProfile) – RNF profile.
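Combining profiles by their maximal values can be sketched as a field-wise maximum. The example below is a self-contained illustration under that reading. `WidthProfile` and `combine_profiles` are hypothetical stand-ins that only mirror the width fields and defaults listed above; they are not the rnftools classes.

```python
from dataclasses import dataclass, fields


@dataclass
class WidthProfile:
    """Hypothetical stand-in holding the same width fields as RnfProfile."""
    prefix_width: int = 0
    read_tuple_id_width: int = 8
    genome_id_width: int = 1
    chr_id_width: int = 2
    coor_width: int = 9


def combine_profiles(*profiles):
    """Return a profile whose every width is the maximum over all inputs."""
    if not profiles:
        raise ValueError("at least one profile is required")
    return WidthProfile(**{
        f.name: max(getattr(p, f.name) for p in profiles)
        for f in fields(WidthProfile)
    })


a = WidthProfile(read_tuple_id_width=16, coor_width=8)
b = WidthProfile(genome_id_width=2)
combined = combine_profiles(a, b)
print(combined)
```

A field-wise maximum yields a profile wide enough for every input, which is what lets read names from several sources share one fixed-width layout.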
3.3.4. FASTQ creator: rnftools.rnfformat.FqCreator
class rnftools.rnfformat.FqCreator.FqCreator(fastq_fo, read_tuple_id_width=16, genome_id_width=2, chr_id_width=2, coor_width=8, info_reads_in_tuple=True, info_simulator=None)
Bases: object
Class for writing RNF reads to FASTQ files.
Every new read is added to an internal buffer. When the read tuple ID changes, the buffer is flushed; hence, reads from the same tuple must be added consecutively. The order in which blocks are reported, and the exact reads they are reported with, does not matter: they are sorted during flushing.
Parameters: - fastq_fo (file) – Output FASTQ file (file object).
- read_tuple_id_width (int) – Maximal expected string length of read tuple ID.
- genome_id_width (int) – Maximal expected string length of genome ID.
- chr_id_width (int) – Maximal expected string length of chromosome ID.
- coor_width (int) – Maximal expected string length of a coordinate.
- info_reads_in_tuple (bool) – Include information about reads as a RNF comment.
- info_simulator (str) – Name of used simulator (to be included as a RNF comment).
add_read(read_tuple_id, bases, qualities, segments)
Add a new read to the current buffer. If it is a new read tuple (detected from ID), the buffer will be flushed.
Parameters: - read_tuple_id (int) – ID of the read tuple.
- bases (str) – Sequence of bases.
- qualities (str) – Sequence of FASTQ qualities.
- segments (list of rnftools.rnfformat.Segment) – List of segments constituting the read.
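The buffering behaviour described for FqCreator (collect reads while the read tuple ID stays the same, flush when it changes) can be illustrated with a small stand-alone sketch. `TupleBuffer` is a hypothetical class written for this example. It records flushed tuples in a list instead of writing FASTQ, and it does not reproduce the sorting or name generation that the real class performs.

```python
class TupleBuffer:
    """Hypothetical stand-in showing flush-on-ID-change; not FqCreator itself."""

    def __init__(self):
        self.current_id = None
        self.reads = []
        self.flushed = []  # list of (read_tuple_id, reads) pairs

    def add_read(self, read_tuple_id, bases, qualities):
        # A different ID means the previous tuple is complete.
        if self.current_id is not None and read_tuple_id != self.current_id:
            self.flush()
        self.current_id = read_tuple_id
        self.reads.append((bases, qualities))

    def flush(self):
        # Emit the buffered tuple (if any) and reset the buffer.
        if self.reads:
            self.flushed.append((self.current_id, list(self.reads)))
        self.reads = []
        self.current_id = None


buf = TupleBuffer()
buf.add_read(1, "ACGT", "IIII")
buf.add_read(1, "TTGA", "IIII")
buf.add_read(2, "GGCC", "IIII")
buf.flush()  # the last tuple has to be flushed explicitly in this sketch
print([(tuple_id, len(reads)) for tuple_id, reads in buf.flushed])  # [(1, 2), (2, 1)]
```

With this design, interleaving reads of tuples 1 and 2 would split tuple 1 into two separately flushed groups, which is why reads from one tuple have to arrive consecutively.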
3.3.5. FASTQ merger: rnftools.rnfformat.FqMerger
class rnftools.rnfformat.FqMerger.FqMerger(mode, input_files_fn, output_prefix)
Bases: object
Class for merging several RNF FASTQ files.
Parameters: - mode (str) – Output mode (single-end / paired-end-bwa / paired-end-bfast).
- input_files_fn (list) – List of file names of input FASTQ files.
- output_prefix (str) – Prefix for output FASTQ files.
3.3.6. RNF validator: rnftools.rnfformat.Validator
class rnftools.rnfformat.Validator.Validator(initial_read_tuple_name, report_only_first=True, warnings_as_errors=False)
Bases: object
Class for validation of RNF.
Parameters: - initial_read_tuple_name (str) – Initial read tuple name to detect profile (widths).
- report_only_first (bool) – Report only first occurrence of every error.
- warnings_as_errors (bool) – Treat warnings as errors (error code).
report_error(read_tuple_name, error_name, wrong='', message='', warning=False)
Report an error.
Parameters: - read_tuple_name (str) – Name of the read tuple.
- error_name (str) – Name of the error.
- wrong (str) – What is wrong.
- message (str) – Additional message to be printed.
- warning (bool) – Warning (not an error).
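How report_only_first and warnings_as_errors might interact can be sketched as follows. This is an illustration of one plausible reading of the parameter descriptions, not the Validator implementation. `ErrorLog`, its `exit_code` method, and the error names used below are hypothetical.

```python
class ErrorLog:
    """Hypothetical stand-in for the reporting logic; not rnftools' Validator."""

    def __init__(self, report_only_first=True, warnings_as_errors=False):
        self.report_only_first = report_only_first
        self.warnings_as_errors = warnings_as_errors
        self.seen = set()       # error names already reported
        self.messages = []      # messages that would be printed
        self.error_found = False

    def report(self, read_tuple_name, error_name, warning=False):
        # Warnings affect the error status only when treated as errors.
        if not warning or self.warnings_as_errors:
            self.error_found = True
        # Optionally suppress repeated reports of the same error name.
        if self.report_only_first and error_name in self.seen:
            return
        self.seen.add(error_name)
        kind = "warning" if warning else "error"
        self.messages.append("{}: {} ({})".format(kind, error_name, read_tuple_name))

    def exit_code(self):
        return 1 if self.error_found else 0


log = ErrorLog()
log.report("read_a", "wrong_width")
log.report("read_b", "wrong_width")              # same error name, suppressed
log.report("read_c", "odd_suffix", warning=True)
print(log.messages)
print(log.exit_code())
```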