From 848673ad5262f273d8d4b6a380d9cec22ae9bdab Mon Sep 17 00:00:00 2001
From: Subrata Saha <subrata.saha@uconn.edu>
Date: Wed, 28 Sep 2016 11:06:22 -0400
Subject: [PATCH] Create README.md

---
 README.md | 89 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 89 insertions(+)
 create mode 100644 README.md

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..af13b02
--- /dev/null
+++ b/README.md
@@ -0,0 +1,89 @@
+POMP consists of two software packages: (1) POMP-DETECT, which detects candidate splice junctions, and
+(2) POMP-PRUNE, which prunes candidate junctions via a Support Vector Machine (SVM).
+
+Requirements before you begin
+-----------------------------
+(1) Java 1.8.0_31 or higher.
+(2) R version 3.1.1 or higher with the "e1071" package.
+(3) The bowtie and bowtie-build executables must be in the POMP-DETECT folder. For
+    convenience, both executables are supplied with the package.
+(4) gcc version 4.6.1 or higher.
+(5) A Linux machine with multiple processors.
+
+Install and run POMP-DETECT:
+----------------------------
+>g++ -O3 contig_generator.cpp
+>javac Preprocess.java Utilities.java CallBackTest.java CallBackTask.java
+>java -Xmx10g -cp . Preprocess
+
+The output of POMP-DETECT is junctions_information.info, located in the OUT/ folder.
+
+
+Install and run POMP-PRUNE:
+----------------------------
+
+POMP-PRUNE should be installed in the OUT/ folder, which must be inside the POMP-DETECT folder.
+
+>javac Statistics.java
+>java -Xmx5g -cp . Statistics
+
+An R script named "script.r" can be found in the POMP-PRUNE directory. It is run in R to perform the
+classification. Please change the first line of this script accordingly. The output of POMP-PRUNE
+is predicted_junctions_list_<chromosome_name>.txt
+
+------------------------------------------------------------------------------------------------
+
+Properties file (properties.prop) of POMP-DETECT
+-------------------------------------------------
+(1) INDEX FOLDER PATH WITH FILE PREAMBLE
+-This folder contains the index files that Bowtie created from the reference sequence; the
+ preamble is the prefix of the index files. If X is the folder path and chr is the preamble,
+ the parameter should be written as X/chr
+
+(2), (3) and (4) Files that will be created by POMP. Please give suitable file names with paths.
+
+(5) Name of the FASTQ file containing the reads.
+
+(6) File that will be created by POMP. Please give a suitable file name with path. This file contains the coverage information.
+
+(7) Maximum number of mismatches Bowtie allows when aligning reads.
+
+(8) Maximum number of mismatches POMP allows when aligning unmapped reads.
+
+(9) Maximum number of alignments per read within the reference.
+
+(10) Number of threads to be used by Bowtie.
+
+(11) Length of the consensus. It depends on the length of the reads. If the length of a read is |r|, the consensus length
+is L = 2*|r| + x, where x > 0 is chosen so that L is divisible by 4. For example, if the read length is 50, then L = 104 (x = 4).
+
+(12) Name of the reference sequence file without the extension.
+
+(13) Pre-process must be "on" for the first run and "off" for every later run. For example, the human genome has 24
+     chromosomes. To detect genome-wide splice events, POMP pre-processes the given reads once and then continues without
+     pre-processing the data again. So pre-process must be turned on when detecting splice events in chromosome 1, and
+     turned off for chromosomes 2 to 24.
+
+(14) Should be between 1.2 and 1.5.
+
+(15) Search genome for gapped alignment with this length.
+
+(16) Should be between 10 and 15.
+
+(17) Folder for the sequences.
+
+(18) Overlap length between reads used to build representatives.
+
+(19) Overlap Hamming distance between reads. Should be between 1 and 3.
+
+(20) Number of threads to be used by POMP.
+
+---------------------------------------------------------------------------------------------------
+
+Properties file (properties.prop) of POMP-PRUNE
+-------------------------------------------------
+(1) Let the sampling threshold be X. Then the number of positive samples will be X times the number of negative samples.
+
+(2) Number of random samples from the positive examples.
+
+(3) "On" gives high accuracy with a greatly reduced number of negative examples (recommended for large chromosomes).