DOC: Add README
Showing 1 changed file with 118 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,118 @@ | ||
Centromere Transcription Factor RNAi screen | ||
=========================================== | ||
|
||
Search for genes that affect centromere establishment. | ||
|
||
Experiment background | ||
--------------------- | ||
|
||
Centromeres are essential for cell division. | ||
They recruit the proteins to which the spindle fibers attach, | ||
and the spindle fibers segregate the sister chromatids. | ||
|
||
Centromeric nucleosomes contain a histone H3 variant | ||
called CENP-A in place of canonical H3. | ||
After each cell division, newly synthesized CENP-A | ||
must be loaded to re-form the centromere. | ||
This process is thought to require transcription. | ||
|
||
|
||
Experiment summary | ||
------------------ | ||
|
||
To study centromere formation, | ||
one needs a seed for the protein complex | ||
that incorporates CENP-A into nucleosomes | ||
in place of canonical histone H3. | ||
The "seed" that triggers this histone replacement | ||
is a Drosophila-specific protein called CAL1. | ||
CAL1 is brought to the ectopic centromere site | ||
using the LacO-LacI tethering system: | ||
namely, several (32) LacO repeats | ||
are inserted into a region of chromosome 3L | ||
and LacI-CAL1 will be recruited to that region | ||
due to the LacO-LacI affinity. | ||
About 25% of the time, this system creates a new centromere. | ||
|
||
The experiment searches for proteins involved | ||
in centromere formation by knocking down known nuclear proteins, | ||
including transcription factors. | ||
When ectopic centromere formation in a well dips significantly below the typical 25%, | ||
the knocked-down gene is a candidate for involvement in centromere formation. | ||
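
One way to make "significantly below the typical 25%" concrete is a one-sided exact binomial test per well. The sketch below is only an illustration: the function names, the per-well counts, and the `alpha` cutoff are assumptions, not the statistics used by the R scripts in this repository.

```python
from math import comb

BASELINE_RATE = 0.25  # typical ectopic centromere formation rate quoted above


def binomial_lower_tail(successes: int, trials: int, rate: float = BASELINE_RATE) -> float:
    """Return P(X <= successes) for X ~ Binomial(trials, rate)."""
    return sum(
        comb(trials, k) * rate**k * (1 - rate) ** (trials - k)
        for k in range(successes + 1)
    )


def is_candidate(successes: int, trials: int, alpha: float = 0.01) -> bool:
    """Flag a well whose ectopic formation is significantly below the baseline.

    `successes` is the number of ectopic sites that recruited CENP-A and
    `trials` is the number of ectopic sites scored (hypothetical inputs).
    """
    if trials == 0:
        return False  # nothing was scored, so there is no evidence either way
    return binomial_lower_tail(successes, trials) < alpha
```

For example, `is_candidate(5, 100)` flags a well with 5 ectopic centromeres out of 100 scored sites, while `is_candidate(24, 100)` does not. Because a screen tests many wells at once, a real analysis would likely also correct for multiple comparisons.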
|
||
The data collected are fluorescent images of multi-well plates, | ||
where each well corresponds to a protein being knocked down. | ||
|
||
- A red fluorescent protein labels the ectopic centromere location | ||
(of the LacO repeats). | ||
- A green fluorescent protein labels CENP-A. | ||
- DAPI blue fluorescence labels nuclear DNA. | ||
|
||
Usage | ||
----- | ||
|
||
Run the `Makefile` in this directory to generate all results. | ||
|
||
``` sh | ||
make | ||
``` | ||
|
||
To tune CellProfiler's image processing, | ||
it is helpful to save results in a separate directory. | ||
One can clone this repository using a different directory name, | ||
and link to the processed image set of the original: | ||
|
||
``` sh | ||
cd .. | ||
git clone git@github.uconn.edu:MelloneLab/RNAi_plate_analysis_all.git rnai-screen-tf_20170314 | ||
cd rnai-screen-tf_20170314 | ||
rm -rf z_projection | ||
ln -s ../../rnai-screen-tf/results/z_projection z_projection | ||
``` | ||
|
||
A nice feature of Makefiles is the ability to override any variable | ||
by specifying it on the command line in the general form `variable=value`. | ||
For example, to use a CellProfiler installed in your home directory | ||
by `pip install --user ...`, | ||
you may specify the path to `cellprofiler` as: | ||
|
||
``` sh | ||
make CELLPROFILER=~/.local/bin/cellprofiler | ||
``` | ||
|
||
Data processing | ||
--------------- | ||
|
||
The raw input data consists of: | ||
|
||
1. Images of 5 plates, with 10 sites per well | ||
and 3 z-slices per site ("April_14_2016.tar.xz"). | ||
2. A spreadsheet mapping the proteins to the wells | ||
("DRSC_TF_Library_Distribution.xls"). | ||
|
||
Below is the file listing of the "data" directory using `tree`: | ||
|
||
``` | ||
data | ||
├── April_14_2016 [11 entries exceeds filelimit, not opening dir] | ||
├── April_14_2016.tar.xz | ||
└── DRSC_TF_Library_Distribution.xls | ||
``` | ||
|
||
The images take up 19 GB even in the xz-compressed archive, | ||
so the archive is downloaded from FIXME_INSERT_DOI | ||
into the data directory. | ||
|
||
Broadly speaking, the data are processed as follows: | ||
|
||
1. CellProfiler saves image statistics to a database. | ||
2. R scripts save high-confidence wells and generate plots. | ||
|
||
CellProfiler 2 requires 2D image inputs, | ||
so a Python script first creates z-projections of the z-slices. | ||
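
As a rough illustration of what a z-projection does, the sketch below collapses a stack of z-slices into a single 2D image. It assumes a maximum-intensity projection and a `(z, y, x)` array layout; the README does not say which projection the repository's script uses, and `z_project` is a hypothetical name.

```python
import numpy as np


def z_project(stack: np.ndarray) -> np.ndarray:
    """Collapse a (z, y, x) stack into a 2D (y, x) image.

    Assumption: maximum-intensity projection, i.e. each output pixel is the
    brightest value found at that position across all z-slices.
    """
    if stack.ndim != 3:
        raise ValueError(f"expected a (z, y, x) stack, got shape {stack.shape}")
    return stack.max(axis=0)


# Toy example: 3 z-slices of a 2x2 image.
toy_stack = np.array(
    [
        [[1, 5], [0, 2]],
        [[3, 4], [7, 1]],
        [[2, 6], [1, 1]],
    ]
)
projection = z_project(toy_stack)  # 2D array with shape (2, 2)
```

A maximum projection keeps bright, punctate signals such as centromere foci visible regardless of which slice they fall in; a mean or sum projection would be a one-line change (`stack.mean(axis=0)` or `stack.sum(axis=0)`).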
|
||
CellProfiler segments the ectopic sites and the CENP-A-labeled centromeres and | ||
saves the statistics into a SQLite database. | ||
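
To give a feel for how such a database can be inspected, here is a minimal Python sketch that counts segmented objects per image. The table name `Per_Object` and the column `ImageNumber` are assumptions modeled on a common CellProfiler export layout; the actual schema depends on the pipeline settings in this repository.

```python
import sqlite3

# Assumed table and column names; check them against the real database
# (for example with `.schema` in the sqlite3 shell) before relying on this.
QUERY = """
    SELECT ImageNumber, COUNT(*) AS n_objects
    FROM Per_Object
    GROUP BY ImageNumber
    ORDER BY ImageNumber
"""


def objects_per_image(connection: sqlite3.Connection) -> dict:
    """Map each image number to the number of segmented objects recorded for it."""
    return {image: count for image, count in connection.execute(QUERY)}
```

With a connection opened via `sqlite3.connect(path_to_database)`, the returned dictionary gives a quick sanity check that segmentation found objects in every image before any downstream analysis runs.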
|
||
The R scripts read this CellProfiler-generated database for their calculations | ||
and plots.