3.3. RNF format library

3.3.1. Read tuple: rnftools.rnfformat.ReadTuple

class rnftools.rnfformat.ReadTuple.ReadTuple(segments=[], read_tuple_id=0, prefix='', suffix='')[source]

Bases: object

Class for an RNF read tuple.

Parameters:
  • segments (list of rnftools.rnfformat.Segment) – Segments of the read.
  • read_tuple_id (int) – Read tuple ID.
  • prefix (str) – Prefix for the read name.
  • suffix (str) – Suffix for the read name.
destringize(string)[source]

Get RNF values for this read from its textual representation and save them into this object.

Parameters:string (str) – Textual representation of a read.
Raises:ValueError
stringize(rnf_profile=<rnftools.rnfformat.RnfProfile.RnfProfile object>)[source]

Create the RNF representation of this read.

Parameters:rnf_profile (rnftools.rnfformat.RnfProfile) – RNF profile providing the widths of the read tuple ID, genome ID, chromosome ID, and coordinates.

3.3.2. Segment: rnftools.rnfformat.Segment

class rnftools.rnfformat.Segment.Segment(genome_id=0, chr_id=0, direction='N', left=0, right=0)[source]

Bases: object

Class for a single segment in an RNF read name.

destringize(string)[source]

Get RNF values for this segment from its textual representation and save them into this object.

Parameters:string (str) – Textual representation of a segment.
stringize(rnf_profile)[source]

Create RNF representation of this segment.

Parameters:rnf_profile (rnftools.rnfformat.RnfProfile) – RNF profile (with widths).
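
The round trip between stringize and destringize can be sketched as follows. This is a minimal illustration, not the rnftools implementation: SketchSegment and WidthProfile are hypothetical stand-ins, and the textual layout (a parenthesized, comma-separated tuple with decimal zero-padding) is an assumption, since this section does not state the exact format.

```python
from dataclasses import dataclass


@dataclass
class WidthProfile:
    """Hypothetical stand-in for an RNF profile: only the widths used below."""
    genome_id_width: int = 1
    chr_id_width: int = 2
    coor_width: int = 9


@dataclass
class SketchSegment:
    """Hypothetical stand-in mirroring the Segment constructor arguments."""
    genome_id: int = 0
    chr_id: int = 0
    direction: str = "N"
    left: int = 0
    right: int = 0

    def stringize(self, profile):
        # Zero-pad every numeric field to the width given by the profile
        # (assumed layout, see the note above).
        return "({},{},{},{},{})".format(
            str(self.genome_id).zfill(profile.genome_id_width),
            str(self.chr_id).zfill(profile.chr_id_width),
            self.direction,
            str(self.left).zfill(profile.coor_width),
            str(self.right).zfill(profile.coor_width),
        )

    def destringize(self, string):
        # Strip the parentheses, split on commas, and store the parsed values.
        fields = string.strip("()").split(",")
        if len(fields) != 5:
            raise ValueError("expected 5 comma-separated fields: " + string)
        self.genome_id = int(fields[0])
        self.chr_id = int(fields[1])
        self.direction = fields[2]
        self.left = int(fields[3])
        self.right = int(fields[4])


segment = SketchSegment(genome_id=1, chr_id=2, direction="F", left=12345, right=12394)
text = segment.stringize(WidthProfile())

restored = SketchSegment()
restored.destringize(text)
```

Because the padding is removed again by int(), the restored segment compares equal to the original regardless of the widths chosen.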

3.3.3. RNF profile: rnftools.rnfformat.RnfProfile

class rnftools.rnfformat.RnfProfile.RnfProfile(prefix_width=0, read_tuple_id_width=8, genome_id_width=1, chr_id_width=2, coor_width=9, read_tuple_name=None)[source]

Bases: object

Class for profile of RNF reads (widths).

Parameters:
  • prefix_width (int) – Length of prefix.
  • read_tuple_id_width (int) – Width of read tuple ID.
  • genome_id_width (int) – Width of genome ID.
  • chr_id_width (int) – Width of chromosome ID.
  • coor_width (int) – Width of a coordinate.
  • read_tuple_name (str) – Read tuple name to initialize all the values.
prefix_width

Length of prefix.

Type:int
read_tuple_id_width

Width of read tuple ID.

Type:int
genome_id_width

Width of genome ID.

Type:int
chr_id_width

Width of chromosome ID.

Type:int
coor_width

Width of a coordinate.

Type:int
apply(read_tuple_name, read_tuple_id=None, synchronize_widths=True)[source]

Apply this profile to a read tuple name and update the read tuple ID.

Parameters:
  • read_tuple_name (str) – Read tuple name to be updated.
  • read_tuple_id (int) – New read tuple ID.
  • synchronize_widths (bool) – Update widths (in accordance with this profile).
check(read_tuple_name)[source]

Check if the given read tuple name satisfies this profile.

Parameters:read_tuple_name (str) – Read tuple name.
combine()[source]

Combine several profiles, taking the maximal value of each width.

Parameters:*rnf_profiles (rnftools.rnfformat.RnfProfile) – RNF profiles to combine.
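
Combining profiles by their maximal values can be illustrated with a short sketch. SketchProfile and the free function combine below are hypothetical stand-ins written for this example; they only mirror the width attributes documented above and are not the rnftools code.

```python
from dataclasses import dataclass, fields


@dataclass
class SketchProfile:
    """Hypothetical stand-in holding the documented width attributes."""
    prefix_width: int = 0
    read_tuple_id_width: int = 8
    genome_id_width: int = 1
    chr_id_width: int = 2
    coor_width: int = 9


def combine(*profiles):
    """Return a profile whose every width is the maximum over all inputs."""
    if not profiles:
        raise ValueError("at least one profile is required")
    return SketchProfile(**{
        f.name: max(getattr(p, f.name) for p in profiles)
        for f in fields(SketchProfile)
    })


a = SketchProfile(read_tuple_id_width=8, coor_width=9)
b = SketchProfile(read_tuple_id_width=16, genome_id_width=2, coor_width=8)
combined = combine(a, b)
```

Taking the field-wise maximum guarantees that any name that fits one of the input profiles also fits the combined one.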
get_rnf_name(read_tuple)[source]

Get well-formatted RNF representation of a read tuple.

Parameters:read_tuple (rnftools.rnfformat.ReadTuple) – Read tuple.

load(read_tuple_name)[source]

Load RNF values from a read tuple name.

Parameters:read_tuple_name (str) – Read tuple name which the values are taken from.
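
Deriving widths from a read tuple name might look like the sketch below. It assumes a name of the form prefix__ID__(genome,chr,direction,left,right),...__suffix; that layout, the helper name load_widths, and the sample name are assumptions made for illustration and are not taken from this section.

```python
import re


def load_widths(read_tuple_name):
    """Derive widths from a read tuple name (assumed layout, see above)."""
    parts = read_tuple_name.split("__")
    if len(parts) != 4:
        raise ValueError("expected 4 '__'-separated parts: " + read_tuple_name)
    prefix, read_tuple_id, segments, _suffix = parts

    # Only the first segment is inspected; all segments are assumed to
    # share the same widths.
    first = re.match(r"\((\w+),(\w+),([FRN]),(\w+),(\w+)\)", segments)
    if first is None:
        raise ValueError("no segment found in: " + read_tuple_name)
    genome_id, chr_id, _direction, left, _right = first.groups()

    return {
        "prefix_width": len(prefix),
        "read_tuple_id_width": len(read_tuple_id),
        "genome_id_width": len(genome_id),
        "chr_id_width": len(chr_id),
        "coor_width": len(left),
    }


widths = load_widths("sim__00000001__(1,02,F,000012345,000012394)__[single-end]")
```

A check against a profile then reduces to comparing such a dictionary with the widths stored in the profile.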

3.3.4. FASTQ creator: rnftools.rnfformat.FqCreator

class rnftools.rnfformat.FqCreator.FqCreator(fastq_fo, read_tuple_id_width=16, genome_id_width=2, chr_id_width=2, coor_width=8, info_reads_in_tuple=True, info_simulator=None)[source]

Bases: object

Class for writing RNF reads to FASTQ files.

Every new read is added to an internal buffer. When a read with a different read tuple ID arrives, the buffer is flushed; hence, reads from the same tuple must be added consecutively. The order in which blocks are reported, and which reads they are reported with, does not matter: they are sorted during flushing.

Parameters:
  • fastq_fo (file) – Output FASTQ file (file object).
  • read_tuple_id_width (int) – Maximal expected string length of read tuple ID.
  • genome_id_width (int) – Maximal expected string length of genome ID.
  • chr_id_width (int) – Maximal expected string length of chromosome ID.
  • coor_width (int) – Maximal expected string length of a coordinate.
  • info_reads_in_tuple (bool) – Include information about reads as an RNF comment.
  • info_simulator (str) – Name of the simulator used (to be included as an RNF comment).
add_read(read_tuple_id, bases, qualities, segments)[source]

Add a new read to the current buffer. If it is a new read tuple (detected from ID), the buffer will be flushed.

Parameters:
  • read_tuple_id (int) – ID of the read tuple.
  • bases (str) – Sequence of bases.
  • qualities (str) – Sequence of FASTQ qualities.
  • segments (list of rnftools.rnfformat.Segment) – List of segments constituting the read.
empty()[source]

Empty all internal buffers.

flush_read_tuple()[source]

Flush the internal buffer of reads.

is_empty()[source]

Check whether all internal buffers are empty.
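
The buffering behaviour described for this class (flush when the read tuple ID changes) can be sketched as follows. BufferedTupleWriter is a hypothetical stand-in: it omits segments, RNF names, and the sorting done during flushing, and it writes placeholder read names rather than RNF names.

```python
import io


class BufferedTupleWriter:
    """Hypothetical stand-in for the flush-on-new-ID buffering."""

    def __init__(self, fastq_fo):
        self.fastq_fo = fastq_fo
        self.current_id = None
        self.buffer = []

    def add_read(self, read_tuple_id, bases, qualities):
        # A different ID means the previous read tuple is complete.
        if self.current_id is not None and read_tuple_id != self.current_id:
            self.flush_read_tuple()
        self.current_id = read_tuple_id
        self.buffer.append((bases, qualities))

    def flush_read_tuple(self):
        # Write every buffered read as one four-line FASTQ record.
        for number, (bases, qualities) in enumerate(self.buffer, start=1):
            name = "tuple{}/{}".format(self.current_id, number)  # placeholder, not an RNF name
            self.fastq_fo.write("@{}\n{}\n+\n{}\n".format(name, bases, qualities))
        self.buffer = []

    def is_empty(self):
        return not self.buffer


out = io.StringIO()
writer = BufferedTupleWriter(out)
writer.add_read(1, "ACGT", "IIII")
writer.add_read(1, "TTGA", "IIII")
pending_before = out.getvalue()      # still buffered, nothing written
writer.add_read(2, "GGCC", "IIII")   # new ID: tuple 1 is flushed
written_after = out.getvalue()
writer.flush_read_tuple()            # the last tuple needs an explicit flush
```

The sketch also shows why reads of one tuple must arrive consecutively: a read added after its tuple has been flushed would start a new tuple with the same ID.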

3.3.5. FASTQ merger: rnftools.rnfformat.FqMerger

class rnftools.rnfformat.FqMerger.FqMerger(mode, input_files_fn, output_prefix)[source]

Bases: object

Class for merging several RNF FASTQ files.

Parameters:
  • mode (str) – Output mode (single-end / paired-end-bwa / paired-end-bfast).
  • input_files_fn (list) – List of file names of input FASTQ files.
  • output_prefix (str) – Prefix for output FASTQ files.
run()[source]

Run merging.

3.3.6. RNF validator: rnftools.rnfformat.Validator

class rnftools.rnfformat.Validator.Validator(initial_read_tuple_name, report_only_first=True, warnings_as_errors=False)[source]

Bases: object

Class for validation of RNF.

Parameters:
  • initial_read_tuple_name (str) – Initial read tuple name to detect profile (widths).
  • report_only_first (bool) – Report only first occurrence of every error.
  • warnings_as_errors (bool) – Treat warnings as errors (error code).
get_return_code()[source]

Get the final return code (0 = OK, 1 = an error occurred).

report_error(read_tuple_name, error_name, wrong='', message='', warning=False)[source]

Report an error.

Parameters:
  • read_tuple_name (str) – Name of the read tuple.
  • error_name (str) – Name of the error.
  • wrong (str) – What is wrong.
  • message (str) – Additional message to be printed.
  • warning (bool) – Warning (not an error).
validate(read_tuple_name)[source]

Check RNF validity of a read tuple.

Parameters:read_tuple_name (str) – Read tuple name to be checked.
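
How report_only_first and warnings_as_errors could interact with the return code is sketched below. SketchValidator is a hypothetical stand-in, not the rnftools Validator: it performs no RNF checks, collects messages in a list instead of printing them, and the error names used are invented for the example.

```python
class SketchValidator:
    """Hypothetical stand-in illustrating the reporting options."""

    def __init__(self, report_only_first=True, warnings_as_errors=False):
        self.report_only_first = report_only_first
        self.warnings_as_errors = warnings_as_errors
        self.seen_errors = set()   # error names already reported
        self.messages = []         # collected instead of printed
        self.error_has_been_reported = False

    def report_error(self, read_tuple_name, error_name, wrong="", message="", warning=False):
        # A warning affects the return code only if warnings count as errors.
        if not warning or self.warnings_as_errors:
            self.error_has_been_reported = True
        # Optionally suppress repeated reports of the same kind of error.
        if self.report_only_first and error_name in self.seen_errors:
            return
        self.seen_errors.add(error_name)
        kind = "warning" if warning else "error"
        self.messages.append(
            "{} '{}' in '{}': {} {}".format(
                kind, error_name, read_tuple_name, wrong, message
            ).strip()
        )

    def get_return_code(self):
        return 1 if self.error_has_been_reported else 0


validator = SketchValidator(report_only_first=True)
validator.report_error("read_a", "wrong_width", wrong="coordinate", warning=True)
code_after_warning = validator.get_return_code()
validator.report_error("read_b", "bad_direction", wrong="X")
validator.report_error("read_c", "bad_direction", wrong="Y")  # suppressed: already reported
final_code = validator.get_return_code()
```

In this sketch a suppressed report still counts toward the return code; only its message is skipped.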