RNASeqReadSimulator is a set of scripts that generate simulated
RNA-Seq reads. It gives users a simple tool for generating RNA-Seq
reads for research purposes, and gives experienced users a framework
for adding new functions. RNASeqReadSimulator offers the
following features:
RNASeqReadSimulator is distributed via GitHub. All updates and bug fixes are published to the GitHub repository immediately.
-h/--help | Print this message. |
-e/--lognormal <float,float> | Specify the mean and variance of the lognormal distribution used to assign expression levels. Default is -4,4. |
-f/--statonly | Print the statistics only; do not assign expression levels. |
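
To illustrate what the -e/--lognormal option describes, here is a minimal Python sketch that draws one expression level per transcript from a lognormal distribution. It is not the code used by RNASeqReadSimulator. The function name `sample_expression_levels` and the transcript IDs are hypothetical, and the sketch assumes the two parameters are the mean and variance of the underlying normal distribution, so the standard deviation is the square root of the variance.

```python
import math
import random


def sample_expression_levels(ids, mu=-4.0, variance=4.0, seed=None):
    """Draw one lognormal expression level per annotation ID.

    Assumption: mu and variance describe the underlying normal
    distribution (log scale), mirroring the documented default -4,4.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(variance)
    return {tid: rng.lognormvariate(mu, sigma) for tid in ids}


# Hypothetical transcript IDs, for illustration only.
levels = sample_expression_levels(["tx1", "tx2", "tx3"], seed=0)
for tid, level in levels.items():
    print(f"{tid}\t{level:.6f}")
```

With a log-scale mean of -4, most sampled levels are small positive numbers, with a long tail of highly expressed transcripts.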
-e/--expression <expression level file> | Specify the weight of each transcript. Each line in the file should have at least (NFIELD+1) fields, where field 0 is the annotation ID and field NFIELD is the weight of that annotation (NFIELD is given by the -f/--field option). If this file is not provided, uniform weights are applied. See the output of genexplvprofile.py for an example. |
-n/--nreads <int> | Specify the number of reads to be generated. Default is 100,000. |
-b/--posbias <positional bias file> | Specify the positional bias file. The file should include at least 100 lines, each containing a single integer that gives the preference for that relative position. If no positional bias file is specified, a uniform distribution is used. |
-l/--readlen <int> | Specify the read length. Default is 32. |
-o/--output <output file> | Specify the output file. The default is STDOUT. |
-f/--field <int> | The field of each line to use as the weight. Default is 7 (counting from field 0). |
-p/--pairend <int,int> | Generate paired-end reads with the specified insert length mean and standard deviation. The default is 200,20. |
--stranded | Generate strand-specific RNA-Seq reads. |
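
The -b/--posbias and -p/--pairend options above can be pictured with a short Python sketch. This is an illustration under stated assumptions, not the implementation in gensimreads.py: the function names `sample_read_start` and `sample_insert_length` are hypothetical, each line of the bias profile is assumed to cover an equal slice of the transcript, and insert lengths are assumed to follow a normal distribution that is clamped to at least the read length.

```python
import random


def sample_read_start(transcript_len, read_len, bias_weights, rng):
    """Pick a read start position, weighted by a positional bias profile."""
    max_start = transcript_len - read_len
    if max_start < 0:
        raise ValueError("transcript is shorter than the read length")
    nbins = len(bias_weights)
    # Choose a relative-position bin in proportion to its weight.
    bin_index = rng.choices(range(nbins), weights=bias_weights, k=1)[0]
    # Place the start uniformly inside the chosen bin.
    fraction = (bin_index + rng.random()) / nbins
    return min(int(fraction * (max_start + 1)), max_start)


def sample_insert_length(mean, sd, read_len, rng):
    """Draw an insert length; assumed normal, never shorter than a read."""
    return max(read_len, round(rng.gauss(mean, sd)))


rng = random.Random(42)
# Hypothetical profile that favours the 3' end of the transcript.
profile = list(range(1, 101))
starts = [sample_read_start(2000, 32, profile, rng) for _ in range(5)]
inserts = [sample_insert_length(200, 20, 32, rng) for _ in range(5)]
print(starts)
print(inserts)
```

A profile of 100 equal integers reduces `sample_read_start` to uniform sampling, which matches the documented behaviour when no bias file is given.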
-b/--seqerror <error file> | Specify the positional error profile to be used. The file should include at least 100 lines, each containing a positive number. The number on line x is the weight of an error occurring at the x% position of the read. If no positional error file is specified, uniform weights are assumed. |
-r/--errorrate <0.0-1.0> | Specify the overall error rate, a number between 0 and 1. Default is 0 (no errors). |
-l/--readlen <int> | Specify the read length. Default is 75. |
-f/--fill <string> | If a read is shorter than the read length, pad the end of the read with the given sequence. This option is useful when simulating poly-A tails in RNA-Seq reads. |
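
The following Python sketch shows one way the -b/--seqerror, -r/--errorrate and -f/--fill options above could interact. It is an assumption-laden illustration, not the code in the distributed scripts: the names `per_base_error_probs`, `add_errors` and `fill_read` are hypothetical, and the profile weights are assumed to be rescaled so that the average per-base substitution probability equals the overall error rate.

```python
import random

BASES = "ACGT"


def per_base_error_probs(read_len, profile, error_rate):
    """Scale profile weights so the mean per-base probability is error_rate."""
    nbins = len(profile)
    # Map each read position to its percentage bin in the profile.
    weights = [profile[min(i * nbins // read_len, nbins - 1)]
               for i in range(read_len)]
    mean_w = sum(weights) / read_len
    return [min(1.0, error_rate * w / mean_w) for w in weights]


def add_errors(read, profile, error_rate, rng):
    """Substitute bases at random, following the positional error profile."""
    if not read:
        return read
    probs = per_base_error_probs(len(read), profile, error_rate)
    out = []
    for base, p in zip(read, probs):
        if rng.random() < p:
            # Replace with a different base (assumed substitution-only errors).
            base = rng.choice([b for b in BASES if b != base])
        out.append(base)
    return "".join(out)


def fill_read(read, read_len, fill="A"):
    """Pad a short read with the fill sequence up to read_len."""
    missing = max(0, read_len - len(read))
    return read + (fill * missing)[:missing]


rng = random.Random(7)
# Hypothetical profile: errors become more likely toward the 3' end.
profile = [1.0 + i / 10 for i in range(100)]
read = fill_read("ACGTACGTAC", 16, "A")
print(read)
print(add_errors(read, profile, 0.05, rng))
```

Padding before adding errors means the simulated poly-A tail can also contain sequencing errors; reversing the order would keep the tail error-free.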
2012.04.30
2012.02.16
2012.02.08 Initialization
by Wei Li