Centromere Transcription Factor RNAi screen
Search for genes that affect centromere establishment.
Centromeres are essential for cell division. They recruit the proteins to which the spindle fibers attach, and the spindle fibers segregate the sister chromatids.
Centromeres are defined by a centromere-specific histone variant, CENP-A, which takes the place of histone H3 in the nucleosome's histone octamer. After each cell division, newly made CENP-A must be incorporated into centromeric nucleosomes to re-form the centromere. This process is thought to require transcription.
To study centromere formation, one needs to seed the protein complex that incorporates CENP-A into nucleosomes in place of histone H3. The "seed" that triggers this histone replacement is a Drosophila-specific protein called CAL1. CAL1 is brought to the ectopic centromere site using the LacO-LacI tethering system: an array of 32 LacO repeats is inserted into a region of chromosome 3L, and LacI-CAL1 is recruited to that region through the LacO-LacI affinity. About 25% of the time, this system creates a new centromere.
The experiment searches for proteins involved in centromere formation by knocking down known nuclear proteins, including transcription factors. When the rate of ectopic centromere formation dips significantly below the typical 25%, the knocked-down gene is a candidate for involvement in centromere formation.
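The idea of a "significant dip below 25%" can be illustrated with a one-sided binomial test. The following is a minimal sketch, not the method used by this repository's R scripts: the function names, the per-well counts, and the 0.01 threshold are all made up for illustration, and a real screen would also need a multiple-testing correction.

```python
from math import comb


def binomial_lower_tail(successes: int, trials: int, p: float = 0.25) -> float:
    """Return P(X <= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes + 1)
    )


def flag_wells(counts, baseline=0.25, alpha=0.01):
    """Return {well: p_value} for wells whose ectopic rate is unusually low.

    counts maps a well name to (cells_with_ectopic_centromere, cells_scored).
    Both the data layout and the alpha threshold are assumptions.
    """
    flagged = {}
    for well, (hits, total) in counts.items():
        p_value = binomial_lower_tail(hits, total, baseline)
        if p_value < alpha:
            flagged[well] = p_value
    return flagged


# Made-up counts: only A02 sits far below the 25% baseline.
example = {"A01": (26, 100), "A02": (8, 100), "A03": (22, 100)}
print(sorted(flag_wells(example)))
```

Wells A01 and A03 stay close to the baseline and are not flagged; only a well such as A02, with 8 ectopic centromeres in 100 scored cells, would be reported as a candidate.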
The data collected are fluorescent images of multi-well plates, where each well corresponds to a protein being knocked down.
- A red fluorescent protein labels the ectopic centromere location (of the LacO repeats).
- A green fluorescent protein labels CENP-A.
- DAPI blue fluorescence labels nuclear DNA.
All programs are run using a single Makefile in this directory to generate all results. Run make help to see a list of options:
    Usage: make [TARGET] ...
    Targets:
      all           (Default) Run full pipeline from image processing to plots.
      help          Show this help.
      z-projection  Generate maximum intensity projection images.
      cellprofiler  Collect statistics about all images.
      gui-cp        Interactively run CellProfiler.
      gui-cpa       Interactively run CellProfiler Analyst.
      stats         Find significant wells from cellprofiler measurements.
      clean-all     Delete all output.
To tune CellProfiler's image processing, it is helpful to save results in a separate directory. One can clone this repository using a different directory name, and link to the processed image set of the original:
    cd ..
    git clone email@example.com:MelloneLab/rnai-screen-tf.git rnai-screen-tf_20170314
    cd rnai-screen-tf_20170314
    rm -rf z_projection
    ln -s ../../rnai-screen-tf/results/z_projection z_projection
A nice feature of Makefiles is the ability to override any number of variables by specifying them on the command line, in the general form make VARIABLE=value [TARGET].
For example, to use a cellprofiler installed to your personal directory with pip install --user ..., you may pass the path to that cellprofiler executable as such a variable (see the Makefile for the variable name).
The raw input data consists of:
- Images of 5 plates, with 10 sites per well and 3 z-slices per site ("April_14_2016.tar.xz").
- A spreadsheet mapping the proteins to the wells ("DRSC_TF_Library_Distribution.xls").
Below is the file listing of the "data" directory using tree with a file limit:
    data
    ├── April_14_2016 [11 entries exceeds filelimit, not opening dir]
    ├── April_14_2016.tar.xz
    └── DRSC_TF_Library_Distribution.xls
The xz-compressed image archive is 19 GB, so it must be downloaded separately from FIXME_INSERT_DOI into the data directory.
Broadly speaking, this data is processed as follows:
- A Python script generates maximum intensity z-projections of the images.
- CellProfiler saves image statistics to a database.
- R scripts save high-confidence wells and generate plots.
CellProfiler 2 requires 2D image inputs, so a Python script first collapses the z-slices of each site into a maximum intensity z-projection.
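A maximum intensity projection keeps, for every pixel position, the brightest value found across the z-slices. The sketch below shows the operation on small nested lists so that it runs without dependencies; it is not the repository's script, and the function name and pixel values are made up. With NumPy arrays the same operation is np.max(stack, axis=0).

```python
def max_projection(z_slices):
    """Collapse z-slices into one 2D image by taking the per-pixel maximum.

    z_slices is a list of 2D images (lists of rows), all of the same shape.
    """
    if not z_slices:
        raise ValueError("need at least one z-slice")
    return [
        [max(pixels) for pixels in zip(*rows)]  # same pixel across all slices
        for rows in zip(*z_slices)              # same row across all slices
    ]


# Three made-up 2x3 z-slices for a single site.
site = [
    [[0, 5, 1], [2, 0, 0]],
    [[3, 1, 1], [0, 9, 0]],
    [[1, 1, 7], [0, 0, 4]],
]
projection = max_projection(site)
print(projection)  # [[3, 5, 7], [2, 9, 4]]
```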
CellProfiler segments the ectopic centromere sites and the CENP-A foci, and saves their statistics into an SQLite database.
The R scripts read this CellProfiler-generated database for their calculations and plots.
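To give a feel for how per-well numbers can be pulled out of such a database, here is a minimal sketch using Python's built-in sqlite3 module instead of R. The table name Per_Image and the columns Plate, Well, ectopic_sites and ectopic_sites_with_cenpa are hypothetical stand-ins; the actual names depend on the CellProfiler pipeline settings, and the rows are made up.

```python
import sqlite3

# In-memory database standing in for the CellProfiler output file.
# Table and column names are hypothetical, not taken from the real pipeline.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Per_Image (
        ImageNumber INTEGER PRIMARY KEY,
        Plate TEXT,
        Well TEXT,
        ectopic_sites INTEGER,
        ectopic_sites_with_cenpa INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO Per_Image VALUES (?, ?, ?, ?, ?)",
    [
        (1, "P1", "A01", 40, 10),
        (2, "P1", "A01", 60, 15),
        (3, "P1", "A02", 50, 4),
        (4, "P1", "A02", 50, 4),
    ],
)


def well_fractions(connection):
    """Return (plate, well, fraction of ectopic sites with CENP-A) per well."""
    query = """
        SELECT Plate, Well,
               SUM(ectopic_sites_with_cenpa) * 1.0 / SUM(ectopic_sites)
        FROM Per_Image
        GROUP BY Plate, Well
        HAVING SUM(ectopic_sites) > 0
        ORDER BY Plate, Well
    """
    return connection.execute(query).fetchall()


for plate, well, fraction in well_fractions(conn):
    print(plate, well, round(fraction, 3))
```

Summing counts over all images of a well before dividing weights each site by how many ectopic sites it contains, which is one reasonable choice among several.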