Centromere Transcription Factor RNAi screen
===========================================
Search for genes that affect centromere establishment.
Experiment background
---------------------
Centromeres are essential for cell division.
They recruit the proteins to which the spindle fibers attach,
and the spindle fibers segregate the sister chromatids.
Centromeres require a unique sub-unit protein
in their histone octamer called CENP-A.
After each cell division, the CENP-A histone proteins
need to replace canonical histone H3 to re-form the centromere.
This process is thought to require transcription.
Experiment summary
------------------
To study centromere formation,
one needs a seed for the protein complex
that will incorporate CENP-A into histone octamers,
replacing the original histone H3 sub-units.
The "seed" that triggers this histone replacement
is a Drosophila-specific protein called CAL1.
CAL1 is brought to the ectopic centromere site
using the LacO-LacI tethering system:
namely, several (32) LacO repeats
are inserted into a region of chromosome 3L
and LacI-CAL1 will be recruited to that region
due to the LacO-LacI affinity.
About 25% of the time, this system creates a new centromere.
The experiment searches for proteins involved
with centromere formation by knocking down known nuclear proteins,
which include transcription factors.
When ectopic formation dips significantly below the typical 25%,
we may have found a gene involved with centromere formation.
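
One way to make "significantly below the typical 25%" concrete is a one-sided binomial test per well. The sketch below is only an illustration under stated assumptions, not the statistics the `stats` target actually runs: the function names, the `alpha` cutoff, and the example counts are all hypothetical.

```python
from math import comb

BASELINE_RATE = 0.25  # typical ectopic centromere formation rate stated above


def lower_tail_p_value(successes: int, trials: int, rate: float = BASELINE_RATE) -> float:
    """Return P(X <= successes) for X ~ Binomial(trials, rate)."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    return sum(
        comb(trials, k) * rate**k * (1 - rate) ** (trials - k)
        for k in range(successes + 1)
    )


def is_candidate(successes: int, trials: int, alpha: float = 0.01) -> bool:
    """Flag a well whose formation count is significantly below baseline.

    alpha is a hypothetical cutoff chosen for illustration only.
    """
    return lower_tail_p_value(successes, trials) < alpha


# Hypothetical counts: ectopic sites carrying CENP-A / ectopic sites scored.
print(is_candidate(successes=5, trials=100))   # far below 25 expected
print(is_candidate(successes=24, trials=100))  # close to 25 expected
```

Because a screen tests many wells at once, a real analysis would also need a multiple-testing correction on top of a per-well test like this one.
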
The data collected are fluorescent images of multi-well plates,
where each well corresponds to a protein being knocked down.
- A red fluorescent protein labels the ectopic centromere location
(of the LacO repeats).
- A green fluorescent protein labels CENP-A.
- DAPI blue fluorescence labels nuclear DNA.
Usage
-----
All programs are run using a single `Makefile`.
Type `make help` to see a list of options:
``` sh
Usage: make [TARGET] ...
Targets:
all (Default) Run full pipeline from image processing to plots.
help Show this help.
z-projection Generate maximum intensity projection images.
cellprofiler Collect statistics about all images.
gui-cp Interactively run CellProfiler.
gui-cpa Interactively run CellProfiler Analyst.
stats Find significant wells from cellprofiler measurements.
clean-all Delete all output.
```
Run the `Makefile` in this directory to generate all results.
``` sh
make all
```
To tune CellProfiler's image processing,
it is helpful to save results in a separate directory.
One can clone this repository using a different directory name,
and link to the processed image set of the original:
``` sh
cd ..
git clone git@github.uconn.edu:MelloneLab/rnai-screen-tf.git rnai-screen-tf_20170314
cd rnai-screen-tf_20170314
rm -rf z_projection
ln -s ../../rnai-screen-tf/results/z_projection z_projection
```
A nice feature of Makefiles is the ability to override any number of variables
by specifying them on the command line in the general form `variable=value`.
For example, to use cellprofiler installed to your personal directory
by `pip install --user ...`,
you may specify the path to cellprofiler as:
``` sh
make CELLPROFILER=~/.local/bin/cellprofiler
```
Data processing
---------------
The raw input data consists of:
1. Images of 5 plates, with 10 sites per well
and 3 z-slices per site ("April_14_2016.tar.xz").
2. A spreadsheet mapping the proteins to the wells
("DRSC_TF_Library_Distribution.xls").
Below is the file listing of the "data" directory using `tree`:
```
data
├── April_14_2016 [11 entries exceeds filelimit, not opening dir]
├── April_14_2016.tar.xz
└── DRSC_TF_Library_Distribution.xls
```
The images are 19 GB in the xz-compressed archive,
so the archive is downloaded from FIXME_INSERT_DOI
into the data directory.
Broadly speaking, the data are processed as follows:
1. CellProfiler saves image statistics to a database.
2. R scripts save high-confidence wells and generate plots.
CellProfiler 2 requires 2D image inputs,
therefore a Python script creates the z-projections.
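
A maximum intensity projection keeps, for every pixel position, the brightest value found across the z-slices of a site. The following is a minimal standard-library sketch of that idea, not the repository's script: the function name `max_projection` and the nested-list image representation are assumptions made for illustration.

```python
from typing import List, Sequence

Image = List[List[int]]  # one 2D image as rows of pixel intensities


def max_projection(z_slices: Sequence[Image]) -> Image:
    """Collapse a z-stack into one 2D image by taking the per-pixel maximum."""
    if not z_slices:
        raise ValueError("need at least one z-slice")
    height = len(z_slices[0])
    width = len(z_slices[0][0]) if height else 0
    for image in z_slices:
        if len(image) != height or any(len(row) != width for row in image):
            raise ValueError("all z-slices must share the same dimensions")
    # zip(*z_slices) groups row i of every slice; zip(*rows) then groups
    # the pixels at column j of those rows.
    return [
        [max(pixels) for pixels in zip(*rows)]
        for rows in zip(*z_slices)
    ]


# Hypothetical 2x2 site imaged at 3 z-slices.
stack = [
    [[1, 5], [0, 2]],
    [[3, 4], [9, 1]],
    [[2, 2], [2, 7]],
]
print(max_projection(stack))  # [[3, 5], [9, 7]]
```

With the slices stacked into a single NumPy array, the same operation is `stack.max(axis=0)`.
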
CellProfiler segments the ectopic and CENP-A centromeres and
saves the statistics into an sqlite database.
The R scripts read this CellProfiler-generated database for their calculations
and plots.
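
To illustrate the database step, the sketch below aggregates per-image counts into a per-well formation rate using Python's built-in `sqlite3` module. It is only an illustration: the table name `per_image`, its columns, and the rows are hypothetical stand-ins, because the real schema depends on how the CellProfiler pipeline's database export is configured, and the repository itself does this analysis in R.

```python
import sqlite3

# Hypothetical schema and data; an in-memory database stands in for the
# sqlite file that CellProfiler writes.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE per_image ("
    "plate TEXT, well TEXT, ectopic_sites INTEGER, cenpa_positive INTEGER)"
)
connection.executemany(
    "INSERT INTO per_image VALUES (?, ?, ?, ?)",
    [
        ("plate1", "A01", 40, 10),
        ("plate1", "A01", 60, 15),
        ("plate1", "A02", 50, 2),
    ],
)


def formation_rate_per_well(conn: sqlite3.Connection) -> dict:
    """Sum counts over all images of a well and return {(plate, well): rate}."""
    rows = conn.execute(
        """
        SELECT plate, well, SUM(cenpa_positive), SUM(ectopic_sites)
        FROM per_image
        GROUP BY plate, well
        ORDER BY plate, well
        """
    )
    # Skip wells with no scored ectopic sites to avoid dividing by zero.
    return {
        (plate, well): positive / total
        for plate, well, positive, total in rows
        if total
    }


rates = formation_rate_per_well(connection)
connection.close()
print(rates)  # {('plate1', 'A01'): 0.25, ('plate1', 'A02'): 0.04}
```

Summing the counts before dividing weights every scored site equally, rather than averaging per-image rates from images with very different numbers of cells.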