From cdb5acb53a1d6c27a8c31811827074e71aa9c1a6 Mon Sep 17 00:00:00 2001
From: Pariksheet Nanda
Date: Fri, 17 Mar 2017 00:34:41 -0400
Subject: [PATCH] DOC: Add README

---
 README.md | 118 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 118 insertions(+)
 create mode 100644 README.md

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..a43e055
--- /dev/null
+++ b/README.md
@@ -0,0 +1,118 @@
+Centromere Transcription Factor RNAi screen
+===========================================
+
+Search for genes that affect centromere establishment.
+
+Experiment background
+---------------------
+
+Centromeres are essential for cell division.
+They recruit the proteins to which the spindle fibers attach,
+and the spindle fibers segregate the sister chromatids.
+
+Centromeres require a unique sub-unit protein
+in their histone octamer called CENP-A.
+After each cell division, the CENP-A histone proteins
+need to replace the H2A-H2B dimer to reform the centromere.
+This process is thought to require transcription.
+
+
+Experiment summary
+------------------
+
+To study centromere formation
+one needs a seed for the protein complex
+that will incorporate CENP-A into histones,
+replacing the original H2A-H2B histone sub-units.
+The "seed" to trigger the events of this histone replacement
+is a Drosophila-specific protein called CAL1.
+CAL1 is brought to the ectopic centromere site
+using the LacO-LacI tethering system:
+namely, several (32) LacO repeats
+are inserted into a region of chromosome 3L
+and LacI-CAL1 will be recruited to that region
+due to the LacO-LacI affinity.
+About 25% of the time, this system creates a new centromere.
+
+The experiment searches for proteins involved
+with centromere formation by knocking down known nuclear proteins,
+which include transcription factors.
+When ectopic formation dips significantly below the typical 25%,
+we may have found a gene involved with centromere formation.
+
+The data collected are fluorescent images of multi-well plates,
+where each well corresponds to a protein being knocked down.
+
+ - A red fluorescent protein labels the ectopic centromere location
+   (of the LacO repeats).
+ - A green fluorescent protein labels CENP-A.
+ - DAPI blue fluorescence labels nuclear DNA.
+
+Usage
+-----
+
+Run the `Makefile` in this directory to generate all results.
+
+``` sh
+make
+```
+
+To tune CellProfiler's image processing,
+it is helpful to save results in a separate directory.
+One can clone this repository using a different directory name,
+and link to the processed image set of the original:
+
+``` sh
+cd ..
+git clone git@github.uconn.edu:MelloneLab/RNAi_plate_analysis_all.git rnai-screen-tf_20170314
+cd rnai-screen-tf_20170314
+rm -rf z_projection
+ln -s ../../rnai-screen-tf/results/z_projection z_projection
+```
+
+A nice feature of Makefiles is the ability to override any number of variables
+by specifying them on the command-line in the general form `variable=value`.
+For example, to use cellprofiler installed to your personal directory
+by `pip install --user ...`,
+you may specify the path to cellprofiler as:
+
+``` sh
+make CELLPROFILER=~/.local/bin/cellprofiler
+```
+
+Data processing
+---------------
+
+The raw input data consists of:
+
+ 1. Images of 5 plates, with 10 sites per well
+    and 3 z-slices per site ("April_14_2016.tar.xz").
+ 2. A spreadsheet mapping the proteins to the wells
+    ("DRSC_TF_Library_Distribution.xls").
+
+Below is the file listing of the "data" directory using `tree`:
+
+```
+data
+├── April_14_2016 [11 entries exceeds filelimit, not opening dir]
+├── April_14_2016.tar.xz
+└── DRSC_TF_Library_Distribution.xls
+```
+
+The images are 19 GB in the xz-compressed archive,
+therefore the archive is downloaded from FIXME_INSERT_DOI
+to the data directory.
+
+Broadly speaking, this data is processed as follows:
+
+ 1. CellProfiler saves image statistics to a database.
+ 2. R scripts save high-confidence wells and generate plots.
+
+CellProfiler 2 requires 2D image inputs,
+therefore a Python script creates the z-projections.
+
+CellProfiler segments the ectopic and CENP-A centromeres and
+saves the statistics into an sqlite database.
+
+The R-scripts read this CellProfiler-generated database for their calculations
+and plots.
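
The z-projection step mentioned in the README can be illustrated with a small sketch. This is not the script from the repository: it assumes a maximum-intensity projection (the README does not say which projection is used), and the function name `max_z_projection` and the tiny 2x2 example site are hypothetical. Each site is modelled as a list of z-slices, and each slice as a list of pixel rows.

```python
def max_z_projection(stack):
    """Collapse a stack of z-slices into one 2D image.

    `stack` is a list of slices; each slice is a list of rows of pixel
    intensities. Every output pixel is the maximum of that pixel across
    all slices (an assumed projection, chosen only for illustration).
    """
    if not stack:
        raise ValueError("expected at least one z-slice")
    # zip(*stack) groups the rows at the same y position across slices;
    # zip(*rows) then groups the pixels at the same x position.
    return [[max(pixels) for pixels in zip(*rows)] for rows in zip(*stack)]


# Hypothetical site with 3 z-slices of 2x2 pixels each.
slices = [
    [[1, 5], [0, 2]],
    [[4, 2], [7, 1]],
    [[3, 3], [6, 9]],
]
projection = max_z_projection(slices)
print(projection)  # [[4, 5], [7, 9]]
```

A real script would read the slices from the image files and write the 2D result into the `z_projection` directory for CellProfiler; that I/O is omitted here because the file naming scheme is not described in the README.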