POMP consists of two software packages: (1) POMP-DETECT, which detects candidate splice junctions, and
(2) POMP-PRUNE, which prunes candidate junctions via a Support Vector Machine (SVM).

Prerequisites
-------------
(1) Java 1.8.0_31 or higher.
(2) R version 3.1.1 or higher with the "e1071" package.
(3) The bowtie and bowtie-build executables must be in the POMP-DETECT folder. For
    convenience, both executables are supplied with the package.
(4) gcc version 4.6.1 or above.
(5) A Linux machine with multiple processors.
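
The prerequisites above can be checked before installing. The following Python sketch is not part of POMP; the function name missing_prerequisites is hypothetical. It only tests that the tools can be found. It does not verify version numbers or that the "e1071" R package is installed.

```python
import shutil
from pathlib import Path


def missing_prerequisites(pomp_detect_dir):
    """Return the names of required tools that could not be located.

    Hypothetical helper, not shipped with POMP. Versions are not checked.
    """
    missing = []
    # Tools that are expected to be reachable through the PATH.
    for tool in ("java", "javac", "R", "g++"):
        if shutil.which(tool) is None:
            missing.append(tool)
    # bowtie and bowtie-build must sit inside the POMP-DETECT folder itself.
    for tool in ("bowtie", "bowtie-build"):
        if not (Path(pomp_detect_dir) / tool).is_file():
            missing.append(tool)
    return missing
```

Calling missing_prerequisites(".") from inside the POMP-DETECT folder returns an empty list when everything was found.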

Install and run POMP-DETECT:
----------------------------
>g++ -O3 contig_generator.cpp
>javac Preprocess.java Utilities.java CallBackTest.java CallBackTask.java
>java -Xmx10g -cp . Preprocess
The output of POMP-DETECT is junctions_information.info, written to the OUT/ folder.

Install and run POMP-PRUNE:
---------------------------

POMP-PRUNE must be installed in the OUT/ folder, and the OUT/ folder must be inside the POMP-DETECT folder.

>javac Statistics.java
>java -Xmx5g -cp . Statistics
An R script named "script.r" can be found in the POMP-PRUNE directory. It is run in R to perform the
classification. Please change the first line of this script to suit your environment. The output of POMP-PRUNE
is predicted_junctions_list_<chromosome_name>.txt

------------------------------------------------------------------------------------------------

Properties file (properties.prop) of POMP-DETECT
-------------------------------------------------
(1) INDEX FOLDER PATH WITH FILE PREAMBLE
    - This folder contains the index files that Bowtie created from the reference sequence; the
      preamble is the prefix of the index files. If X is the folder path and chr is the preamble,
      the parameter should be written as X/chr

(2), (3) and (4) These files will be created by POMP. Please give suitable file names with paths.

(5) Name of the FASTQ file containing the reads.

(6) This file will be created by POMP and contains the coverage information. Please give a suitable file name with path.

(7) Number of mismatches within which Bowtie will align reads.

(8) Number of mismatches within which POMP will align the unmapped reads.

(9) Maximum number of alignments per read within the reference.

(10) Number of threads to be used by Bowtie.

(11) Length of the consensus. It depends on the length of the reads. If the length of a read is |r|, the consensus length
is L = 2*|r| + x, where x > 0 is chosen so that L is divisible by 4. For example, if the read length is 50, then L = 104.

(12) Name of the reference sequence file, without extension.

(13) Pre-process switch. It must be "on" for the very first run and "off" for later runs. For example, the human genome has
24 chromosomes. To detect genome-wide splice events, POMP first pre-processes the given reads and then continues without
pre-processing the data again. So pre-process must be turned on to detect splice events in chromosome 1 and turned off
for chromosomes 2 to 24.

(14) Should be between 1.2 and 1.5.

(15) Search the genome for gapped alignments with this length.

(16) Should be between 10 and 15.

(17) Folder for the sequences.

(18) Overlap length between reads used to build representatives.

(19) Overlap Hamming distance between reads. Should be between 1 and 3.

(20) Number of threads to be used by POMP.
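
Two of the properties above, (11) and (18)/(19), can be illustrated with a short Python sketch. It is not POMP's own code, and all function names are hypothetical. For (11) it assumes x is the smallest positive value that makes L divisible by 4, which matches the example for read length 50. For (18)/(19) it assumes the overlap is taken between the end of one read and the start of the next; POMP's actual comparison may differ.

```python
def consensus_length(read_length):
    """Item (11): L = 2*|r| + x with x > 0 and L divisible by 4.

    Assumes x is the smallest value that satisfies the rule.
    """
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    base = 2 * read_length
    # base is even, so base % 4 is 0 or 2 and x is 4 or 2 (never 0).
    x = 4 - base % 4
    return base + x


def hamming_distance(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def reads_overlap(left, right, overlap_length, max_distance):
    """Items (18)/(19): True if the last overlap_length bases of left match
    the first overlap_length bases of right within max_distance mismatches.

    Assumed suffix/prefix interpretation, for illustration only.
    """
    if overlap_length <= 0 or overlap_length > min(len(left), len(right)):
        return False
    distance = hamming_distance(left[-overlap_length:], right[:overlap_length])
    return distance <= max_distance
```

For a read length of 50, consensus_length(50) gives 104, the value quoted in (11).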

---------------------------------------------------------------------------------------------------

Properties file (properties.prop) of POMP-PRUNE
-------------------------------------------------
(1) Let the sampling threshold be X. Then the number of positive samples will be X times the number of negative samples.

(2) Number of random samples drawn from the positive examples.

(3) "On" means highly accurate but with far fewer negative examples (recommended for large chromosomes).