Input and Output Files
The SPARCS pipeline relies on several different files for the analysis. The following is a list of the file types and their purposes
Input files
SPARCS can handle gzipped files or files that are not compressed as input.
Reference Genome (FASTA) - The reference genome holds the sequence of the genome of interest. This file is used to get the sequences of the genes of interest.
Gene Annotation (GTF/GFF) - The gene annotation file holds the locations of the genes in the genome. This file is used to get the locations of the start and stop positions of the genes of interest.
Variant File (VCF) - The variant file holds the variants that are found in the sample. This file is used to get the locations of the variants of interest.
Output files:
The main output of the SPARCS pipeline is a file that contains all
of the riboSNitch predictions for the variants of interest. This file
is located inside of the results/ribosnitch_predictions directory
inside of the output directory. The file name is combined_ribosnitch_prediction_{TEMPERATURE}.txt
where {TEMPERATURE} is the temperature that was used for the analysis.