Summer Research Fellowship Programme of India's Science Academies

Validating the enrichment of PRC and TRX complex proteins at PRE-PIK3C2B associated long range interactor sequences

Tathagata Bhattacharya

Second year BS-MS, Indian Institute of Science Education and Research Pune, Dr. Homi Bhabha Road, Pashan, Pune 411008

Prof. Vani Brahmachari and Dr. Jayant Maini

Epigenetics and Developmental Regulation, Dr. B. R. Ambedkar Center for Biomedical Sciences, Delhi, University Enclave, New Delhi 110007


In a whole-genome study, the first intron of the human PIK3C2B gene (phosphatidylinositol-4-phosphate 3-kinase catalytic subunit type 2 beta) was identified as a putative Polycomb Response Element (PRE) containing 25 repeats of the YY1 binding motif (Bengani, et al, 2013). This element is referred to as hPRE-PIK3C2B, and in transgenic Drosophila it recruits both activating and repressing complexes in a concentration-dependent manner under different conditions (Maini, et al, 2017). Two key findings were that the repressive complex localizes to the element in the default state, and that enrichment of Trithorax is observed in some PcG mutants. In this project, we look at the abundance of PcG and TrxG proteins at the DNA sequences that interact with hPRE-PIK3C2B. The long-range interactor sequences are identified by 4C (Circularized Chromosome Conformation Capture) and validated by 3C (Chromosome Conformation Capture). During the project I examined the binding of a selected Trx protein (e.g. MLL), PRC proteins such as YY1 and SUZ12, and an ETP protein (hINO80) at the region where PRE-PIK3C2B maps and at the regions where its interactors map. The method used is Chromatin Immuno-Precipitation (ChIP) followed by quantitative real-time PCR for the interactor DNA sequences.

Keywords: epigenetics, PcG, TrxG, regulation


PRC   Polycomb Repressive Complex
PcG   Polycomb Group
PRE   Polycomb Response Element
TRE   Trithorax Response Element
ChIP  Chromatin Immunoprecipitation
ETP   Enhancer of Trithorax and Polycomb
FBS   Fetal Bovine Serum
DMSO  Dimethyl Sulfoxide
PBS   Phosphate Buffered Saline
PCR   Polymerase Chain Reaction


The life of a multicellular organism starts as a single cell, the zygote. The zygote divides to produce a group of cells with different fates. Over time, each of these cells gives rise to more cells, which become progressively more specialized to form a certain kind of tissue at a particular position in the body at a particular time during development. This is where gene expression patterns come in. All the cells in the body of an organism ideally carry the same copy of the genome, so how do they function differently? The answer is not in genetics, as the difference is clearly not in the DNA sequence. The answer lies in the mechanisms that regulate gene expression.

Expression of different genes is selectively activated or silenced in different cells at different points in time. Gene expression is regulated at several levels, e.g. transcriptional, post-transcriptional, translational and post-translational. Here, we will concentrate on transcriptional regulation.

Transcriptional regulation controls the rate at which transcripts are produced from a gene. This is achieved mostly by changing the compaction of chromosome structure, or by modifying the recruitment and functioning of RNA Polymerase through other proteins or factors. Both chromosome compaction and changes in the rate of RNA Polymerase recruitment are intimately connected to epigenetic modifications. Epigenetic modifications include post-translational modifications of the histone proteins that help package the DNA.


The term epigenetics is defined as 'the study of changes in gene function that are mitotically and/or meiotically heritable and that do not entail a change in DNA sequence' (​Dupont, et al, 2009​). A lot of work has been done in the field of epigenetics since Conrad Waddington's paper gave rise to this field more than sixty years ago. Chromatin level modifications can affect expression of a particular gene in two ways: either the gene is activated or it is repressed. These modifications were analogous to 'decisions' that 'canalised' the cell towards a particular fate (Noble D, 2015​).

      Waddington's developmental landscape diagram. The landscape itself and the ball at the top are from his original diagram. The subsequent positions of the ball have been added to illustrate his point that development can be canalised to follow different routes (A and B). The plasticity to enable this to happen already exists in the wild population of organisms (modified diagram by K. Mitchell). (Source given at the end)

    This was made clearer in the diagram (Fig 1) representing the chromatin landscape, where environmental variables (e.g. temperature, chemical stimuli) determine the slope of the landscape (Waddington, 2014) (Noble D, 2015). This landscape represents a form of cellular memory that maintains previous changes through chromatin plasticity (Yadav T and Quivy JP and Almouzni G, 2018). Subsequent research revealed epigenetic mechanisms controlling histone and DNA modifications, which in turn decide the fate of a gene. The protein complexes that carry out these modifications, as well as those that recognize them, then emerged as the main players in chromatin architecture and gene regulation. The protein complexes responsible for modifying histones were first identified in Drosophila and are known as the Trithorax and Polycomb group protein complexes.

    Polycomb and Trithorax Group Proteins

    In the 1940s, the first Polycomb group mutations were identified in Drosophila. These were esc (extra sex combs) and Pc (Polycomb) (Ringrose L and Paro R, 2004). These mutations caused sex combs to appear on the second and third pairs of legs, where they do not normally occur. The mutations altered the expression of Hox genes, which play a major role in maintaining the identity of different body segments, so that one segment took on the identity of another. This was found to result from expression of Hox genes outside their normal expression domain (Lewis EB, 1978). Later, more Polycomb mutations were identified, and the proteins corresponding to the mutated genes were called Polycomb Group proteins (PcG proteins). Proteins antagonistic to the Polycomb proteins were subsequently found and placed in the Trithorax Group of proteins.

    Polycomb group proteins help maintain the repressed state of Hox genes but are not responsible for establishing the repression (Struhl G and Akam M, 1985) (Müller and Kassis, 2006). Polycomb and Trithorax proteins maintain the transcriptional state of Hox genes as multi-subunit protein complexes in organisms across kingdoms, i.e. they are conserved proteins (Fig 2) (Schuettengruber, et al, 2017). As these proteins work in complexes, a number of the complexes are also conserved across species.

        Phylogenetic distribution of PcG and TrxG complexes. A broad spectrum of eukaryotes has been investigated, with an emphasis on holozoans, among which are the five major metazoans lineages. (Source given at the end)

      Polycomb Group proteins bind to and modify chromatin in the form of complexes (​​Eckert, et al, 2011​​). Trithorax Group proteins also form complexes which function antagonistically to the PcG complexes. The most important PcG complexes are PRC1, PRC2 (​​Fig 3​​) and PhoRC. The most important TrxG complexes are COMPASS and SWI/SNF.

          Simplified description of PcG regulation of transcription. The PRC2 complex is recruited to chromatin, and the Ezh2 protein of this complex, a histone methyltransferase, catalyzes formation of H3K27me3 through its SET catalytic site. H3K27me3 then functions as a binding site for the CBX protein of the PRC1 complex. CBX contains a chromodomain that has a high affinity for H3K27me3. The interaction of CBX with H3K27me3 anchors the PRC1 complex to chromatin, and the RING1B subunit of the complex, an E3 ubiquitin ligase, catalyzes formation of H2A-K119-ub. Ultimately these events lead to chromatin folding and compaction and cessation of transcription. (Source given at the end)

        PcG and TrxG protein complexes need regulatory regions on the DNA that they can recognize. These regulatory regions are called Polycomb Response Elements (PREs) and Trithorax Response Elements (TREs). The mechanisms by which these complexes recognize and bind PREs/TREs are still being investigated; however, they mostly use protein domains that recognize prior epigenetic marks to find and bind their targets. They are also recruited to specific DNA sequences (naked DNA) through transcription factors, as shown by the discovery of the DNA binding activity of Pho in Drosophila (Brown JL and Mucci D and Whiteley M and Dirksen ML and Kassis JA, 1998).

        How We Find PREs/TREs

        By definition, PREs and TREs are regions in the genome where Polycomb complexes and Trithorax complexes bind, respectively. So, to find PREs or TREs, one has to identify the parts of the DNA where these proteins are bound. The most direct way to do that is a technique called ChIP (Chromatin Immuno-Precipitation), in which the regions of the genome bound to a particular protein are pulled down using antibodies against that protein.

          A diagrammatic representation of the ChIP protocol, showing the various uses of the purified ChIP DNA. (Source given at the end)

          So, we use antibodies against PcG and TrxG complex proteins to pull down the DNA bound to these proteins by immuno-precipitation. We then either perform PCRs to test for the presence of a putative PRE/TRE, or sequence the DNA to find out which sequences are present in the pulled-down DNA.


          The objective is to validate the enrichment of PRC and TRX complex proteins at the long range interactors of the PRE-PIK3C2B in the HeLa epigenome.


          The techniques that were used throughout the project are mentioned and described below:

          Cryopreservation of Cells at -80℃

          Adherent HeLa cells in flasks are washed using 1x PBS buffer. The cells are trypsinized using 500μl 5x trypsin solution which is spread over the surface containing the cells. The cells are kept in trypsin solution for 2 minutes and the excess trypsin is drained and the flask is kept at 37℃ in a 5% CO2 incubator for another 2-3 minutes. The flasks are tapped to verify that trypsinization has been effective and the cells are collected in 1ml cryostock and put in a cryo-vial for preservation at -80℃.

          Cryostock composition: 80% FBS, 10% High Glucose Media, 10% DMSO.
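
          As a small worked example of the composition above, the following Python sketch splits a chosen total volume of cryostock into its three components. Only the 80/10/10 percentages come from the protocol; the 10 ml batch size and the function name are hypothetical values chosen for illustration.

```python
def cryostock_volumes(total_ml):
    """Split a total cryostock volume (ml) into its three components.

    Percentages follow the composition stated in the protocol:
    80% FBS, 10% high glucose media, 10% DMSO.
    """
    percentages = {"FBS": 80, "high_glucose_media": 10, "DMSO": 10}
    return {name: total_ml * pct / 100 for name, pct in percentages.items()}


if __name__ == "__main__":
    # Hypothetical batch: enough cryostock for ten 1 ml cryovials.
    for name, volume_ml in cryostock_volumes(10.0).items():
        print(f"{name}: {volume_ml} ml")
```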

          Revival of Cells from -80℃ Cryopreservation

          The cryovials are collected from -80℃ storage on ice. The cells are thawed by placing the vials directly in a water bath kept at 37℃. After the cells are completely thawed, 1ml of complete media is added and the cells are centrifuged at 1000 rpm at 4℃. After decanting the supernatant, the cells are resuspended in 1 ml of complete media. The resuspended cells are then added to a T-25 flask containing 4ml of complete media and the flask is kept in a 5% CO2 incubator at 37℃.

              HeLa cells adhered to the surface of flask. (Source given at the end)

            Genomic DNA Isolation from Tissue Sample

            500μl of the lysis buffer is added to 200μl of compact cell volume and vortexed for 2 minutes. Then 15μl of proteinase K is added and gently mixed by inverting the tube 5 times. The tube is incubated at 65℃ for 30 minutes and vortexed every 10 minutes. The sample is centrifuged at 10,000rpm for 10 minutes at room temperature and the supernatant is collected. 300μl of Binding Buffer is added to the supernatant, and the mixture is poured into a spin column and centrifuged at 10,000rpm for 2 minutes. The flow through is discarded, 500μl of washing buffer I is added to the column, and the column is centrifuged again for 2 minutes at 10,000rpm. The flow through is again discarded and 500μl of washing buffer II is added. The column is centrifuged once more at 10,000rpm for 2 minutes and the flow through is discarded. 30μl of elution buffer is then added on top of the silica column and left to incubate for 3 minutes at room temperature. Finally, the DNA is eluted by spinning the column for 2 minutes at 10,000rpm.
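
            The centrifugation steps in these protocols are specified in rpm, which depends on the rotor used. A minimal sketch of the standard conversion to relative centrifugal force (RCF = 1.118 x 10^-5 x r x rpm^2, with the rotor radius r in cm) is given below. The 7 cm rotor radius is an assumed value for illustration, since the rotor is not specified in this report.

```python
RCF_CONSTANT = 1.118e-5  # standard constant for radius in cm and speed in rpm


def rpm_to_rcf(rpm, rotor_radius_cm):
    """Convert rotor speed (rpm) to relative centrifugal force (x g)."""
    return RCF_CONSTANT * rotor_radius_cm * rpm ** 2


def rcf_to_rpm(rcf, rotor_radius_cm):
    """Convert relative centrifugal force (x g) back to rotor speed (rpm)."""
    return (rcf / (RCF_CONSTANT * rotor_radius_cm)) ** 0.5


if __name__ == "__main__":
    assumed_radius_cm = 7.0  # hypothetical microcentrifuge rotor radius
    for rpm in (500, 1000, 2000, 10000):  # speeds used in the protocols
        print(f"{rpm} rpm ~ {rpm_to_rcf(rpm, assumed_radius_cm):.0f} x g")
```

            Reporting the RCF alongside the rpm would make the spins reproducible on a different centrifuge.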

            Splitting or Passaging Cells

            First, the media is discarded using a pipette. The cells in the flask are given a 1x PBS wash (Milli-Q H2O can be used instead of PBS buffer) with 1ml of the buffer. Cells are trypsinized using 500μl 5x trypsin solution and kept for 2 minutes. After draining the extra trypsin, the cells are kept for 2-3 minutes in a 5% CO2 incubator at 37℃. Following this, 1ml complete media is added and the cells are re-suspended in the media. Now the cell suspension is divided into different flasks and then appropriate amount of media is added to the flasks and the flasks are kept in 5% CO2 incubator at 37℃.

            Chromatin Immuno-Precipitation (ChIP)

            Chromatin immunoprecipitation comprises four major steps: cross linking, DNA shearing, precipitation, and finally reversal of cross linking and purification of DNA. The aim is to selectively isolate the DNA fragments that a particular protein interacts with. This is done by fixing the protein onto the DNA, breaking the DNA into small pieces, and precipitating the DNA along with the protein using antibodies against the protein (Fig 4).

            Cross linking

            Firstly, the media is drained from flasks and cells are given a PBS (or water) wash. Then the excess solution is drained and cells are trypsinized using a similar protocol. The cells are collected in 500μl 1x PBS buffer and poured in a 1.5ml eppendorf tube. The tube is centrifuged at 1,000rpm for 5 minutes at 4℃. The supernatant is decanted and the cells are resuspended in PBS again and centrifuged in the same manner. After decanting the supernatant, the cells are now resuspended in 1% formaldehyde in PBS. The tube is kept on a rotor for 10 minutes so that it keeps being inverted end-to-end. After 10 minutes are over, glycine is added to the tubes to a final concentration of 0.125M and kept on the rotor for another 10 minutes. Next, the tube is now centrifuged at 2,000rpm for 5 minutes at 4℃. The supernatant is decanted, cells are resuspended in PBS and centrifuged again at 2,000rpm for 5 minutes at 4℃. The supernatant is decanted and the pellet stored at -80℃.
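
            The quenching step adds glycine to a final concentration of 0.125M. A minimal sketch of the underlying dilution arithmetic is given below: for a stock of concentration C_stock added to a sample of volume V, the volume to add is v = C_final x V / (C_stock - C_final). The 1.25M stock concentration and the 1000μl sample volume are assumptions for illustration; neither is stated in the protocol above.

```python
def stock_volume_to_add(sample_volume_ul, stock_molar, final_molar):
    """Volume of stock (ul) to add to a sample to reach a final concentration.

    Accounts for the increase in total volume caused by adding the stock:
    stock_molar * v = final_molar * (sample_volume_ul + v)
    """
    if final_molar >= stock_molar:
        raise ValueError("Stock must be more concentrated than the target.")
    return final_molar * sample_volume_ul / (stock_molar - final_molar)


if __name__ == "__main__":
    # Hypothetical values: 1000 ul of cells in formaldehyde/PBS, 1.25 M glycine stock.
    glycine_ul = stock_volume_to_add(1000.0, stock_molar=1.25, final_molar=0.125)
    print(f"Add {glycine_ul:.1f} ul of glycine stock")
```

            The same helper applies to any other step that specifies only a final concentration, provided the stock concentration is known.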

            DNA shearing by sonication

            The pellet is resuspended in a 0.1% SDS lysis buffer containing 1% proteinase inhibitor cocktail and kept at 4℃. The mixture is then sonicated for 50 minutes, with a sample collected every 10 minutes to determine the length of the sheared DNA fragments as a function of time. This step lyses the cells as well as shearing the DNA. The duration of sonication needed to obtain DNA fragments of the required length should be standardized. After sonication, NaCl is added to the 10 minute samples to a final concentration of 0.2M and the lysate is stored at -20℃. DNA is isolated from the NaCl treated samples and run on a 1.5% agarose gel to check the size of the sheared DNA fragments.

            Precipitation of protein-bound DNA

            The DNA concentration of the lysate is measured using a spectrometer, and antibodies against the proteins of interest (MLL, YY1, hINO80 and SUZ12) are added such that the DNA:Ab mass ratio is 10:1. A 10% input and a 50% IgG negative control are used for each antibody. After addition of the antibody, the samples are kept overnight on a rotor at 4℃, during which the antibodies bind to their target proteins in the sample. Protein A/G beads are then added and the samples are again kept overnight on a rotor at 4℃, during which the antibodies, already bound to their target proteins, bind to the beads. The tubes are centrifuged at 500 rpm for 2 minutes at 4℃ and the supernatant is decanted. The beads are then given three washes with 0.1% SDS lysis buffer: for each wash, 800 μl of buffer is added, the tube is kept on the rotor for 5 minutes at room temperature, centrifuged at 500 rpm for 2 minutes at 4℃, and decanted. After the third wash, two washes with 800 μl of Immune-complex buffer A are carried out in the same manner (the compositions of the buffers are given at the end), followed by two washes with Immune-complex buffer B. Finally, after decanting, 800 μl of T.E. buffer is added and two more washes are given in the same way.
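
            The 10:1 DNA:Ab mass ratio can be turned into an antibody volume as in the sketch below. The example uses the lysate concentration of about 830ng/μl reported in this work; the 30 μl of chromatin per reaction and the 1 μg/μl antibody stock are hypothetical values chosen only to illustrate the arithmetic.

```python
def antibody_for_ip(dna_ng_per_ul, chromatin_volume_ul, antibody_ug_per_ul,
                    dna_to_ab_ratio=10.0):
    """Return (DNA mass in ug, antibody mass in ug, antibody volume in ul).

    dna_to_ab_ratio is the DNA:antibody mass ratio (10:1 in the protocol).
    """
    dna_ug = dna_ng_per_ul * chromatin_volume_ul / 1000.0  # ng -> ug
    antibody_ug = dna_ug / dna_to_ab_ratio
    antibody_ul = antibody_ug / antibody_ug_per_ul
    return dna_ug, antibody_ug, antibody_ul


if __name__ == "__main__":
    # 830 ng/ul is the measured lysate concentration; the other two
    # arguments are assumed values for illustration.
    dna_ug, ab_ug, ab_ul = antibody_for_ip(830.0, 30.0, 1.0)
    print(f"DNA: {dna_ug:.2f} ug, antibody: {ab_ug:.2f} ug ({ab_ul:.2f} ul)")
```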

            Reverse crosslinking

            After the washes are done, 100 μl of 10% Chelex solution is added per tube and the tubes are kept in a dry bath at 100℃ for 10 minutes. They are then allowed to cool to room temperature, and 20 μg of proteinase K (1 μl of 20mg/ml) is added per tube, followed by incubation at 55℃ in a thermomixer at 1400 rpm for 30 minutes. The tubes are then again kept in a dry bath at 100℃ for 10 minutes, after which they are stored at -20℃.

            Analysis of the ChIP DNA

            After ChIP, the DNA concentration of the samples is measured, and quantitative PCRs are to be set up using the ChIP DNA as template with primers that bind the regions of interest (the interactors of PRE-PIK3C2B). These PCRs verify the presence, and quantify the amount, of the sequences of interest in each sample. This amount serves as a proxy for the strength of the interaction between the DNA and the protein in question, i.e. the enrichment of the protein at that region of the genome.
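
            One common way to express such qPCR results is the percent-input method, optionally together with fold enrichment over the IgG control. The sketch below illustrates the arithmetic only; it is not the analysis carried out in this project. It assumes 100% amplification efficiency (a doubling per cycle), a 10% input as used here, and that the IP and IgG reactions are compared on an equal-chromatin basis. The Ct values in the example are invented for illustration.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction=0.10):
    """Percent of input recovered in the IP, from raw Ct values.

    The input Ct is first adjusted to what 100% input would give:
    a 10% input is diluted 10-fold, i.e. log2(10) ~ 3.32 cycles later.
    """
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


def fold_over_igg(ct_ip, ct_igg):
    """Fold enrichment of the specific IP over the IgG negative control."""
    return 2.0 ** (ct_igg - ct_ip)


if __name__ == "__main__":
    # Hypothetical Ct values for one primer pair and one antibody.
    ct_input, ct_ip, ct_igg = 24.0, 27.0, 30.0
    print(f"Percent input: {percent_input(ct_ip, ct_input):.3f}%")
    print(f"Fold over IgG: {fold_over_igg(ct_ip, ct_igg):.1f}")
```

            If primer efficiencies deviate noticeably from 100%, the base of 2 would need to be replaced by the measured efficiency for each primer pair.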


            The sonicated DNA samples obtained had DNA concentration of around 830ng/μl.

            The ChIP DNA samples obtained had DNA concentration in the range of 16.3-95.4 ng/μl.

            PCRs to standardize some of the primers for the interactors of PRE-PIK3C2B were done on genomic DNA isolated from HeLa cell lines. However, the PCRs that I attempted showed no bands (even the positive controls did not show bands), except the one where I checked amplification of the positive control primers using HeLa genomic DNA. The analysis of the ChIP DNA is therefore yet to be done and the results are awaited.


            PRE-PIK3C2B is one of the first human PREs that has been seen to interact with TRX complex proteins in transgenic fly models (Maini, et al, 2017). Much remains to be learned about the exact mechanisms by which Polycomb/Trithorax Response Elements recruit Polycomb/Trithorax complex proteins. Learning more about hPRE-PIK3C2B will give more insight into this larger question, as it recruits members of both classes of protein complexes.
            The presence of PcG and/or TRX complex proteins at the long-range interactors of PRE-PIK3C2B (identified using 4C: Circularized Chromosome Conformation Capture, and validated using 3C: Chromosome Conformation Capture) would suggest that the interactors are also PREs/TREs. It would also shed light on the mechanism of communication/interaction between PREs/TREs.


            The work done in the duration of the project did not give the answer to the exact question that was asked. The final analysis (quantitative analysis of the sequences present in the ChIP pulldown DNA) is yet to be done. The PCRs that I had set up in the final stage of the project (except the first) showed no bands, not even primer dimers when low annealing temperature was used. The most probable cause might have been a DNase contamination in one of the common reagents used or a mistake in the PCR that has been repeated in all of the PCRs except the first, however the exact reason still needs to be investigated. The nature of the approach that was taken limits my ability to draw any conclusion without the final analysis.


            I thank the IAS-INSA-NASI authorities for selecting me for their Summer Research Fellowship Programme. I thank Dr. B. R. Ambedkar Centre for Biomedical Research for providing me with an extremely nice working environment. I thank Prof. Vani Brahmachari for giving me the opportunity to work in her lab and guiding me throughout the project. I thank my mentor in the lab, Dr. Jayant Maini, immensely for helping me design and carry out all the experiments and for teaching me all the techniques necessary for the project. I thank my co-worker, Srinivas Patil, for helping me with most of the experimental setups. I thank all the members of the Vani Brahmachari Lab for the discussions we have had, from which I drew inspiration. Last but not least, I thank my parents and my friends for the constant support and motivation they have given me throughout the course of this project.


            • Bengani H, Mendiratta S, Maini J, Vasanthi D, Sultana H, Ghasemi M, Ahluwalia J, Ramachandran S, Mishra RK, Brahmachari V (2013). Identification and Validation of a Putative Polycomb Responsive Element in the Human Genome. Vol. 8.

            • Maini J, Ghasemi M, Yandhuri D, Thakur SS, Brahmachari V (2017). Human PRE-PIK3C2B, an intronic cis-element with dual function of activation and repression. Vol. 1860.

            • Dupont C, Armant D, Brenner C (2009). Epigenetics: Definition, Mechanisms and Clinical Perspective. Vol. 27.

            • Noble D (2015). Conrad Waddington and the origin of epigenetics. Vol. 218.

            • Waddington CH (2014). The Strategy of the Genes.

            • Yadav T, Quivy JP, Almouzni G (2018). Chromatin plasticity: A versatile landscape that underlies cell fate and identity. Vol. 361.

            • Ringrose L, Paro R (2004). Epigenetic regulation of cellular memory by the Polycomb and Trithorax group proteins. Vol. 38.

            • Lewis EB (1978). A gene complex controlling segmentation in Drosophila. Vol. 276.

            • Struhl G, Akam M (1985). Altered distributions of Ultrabithorax transcripts in extra sex combs mutant embryos of Drosophila. Vol. 4.

            • Müller J, Kassis JA (2006). Polycomb response elements and targeting of Polycomb group proteins in Drosophila. Vol. 16.

            • Schuettengruber B, Bourbon HM, Di Croce L, Cavalli G (2017). Genome Regulation by Polycomb and Trithorax: 70 Years and Counting. Vol. 171.

            • Eckert RL, Adhikary G, Rorke EA, Chew YC, Balasubramanian S (2011). Polycomb Group Proteins Are Key Regulators of Keratinocyte Function. Vol. 131.

            • Brown JL, Mucci D, Whiteley M, Dirksen ML, Kassis JA (1998). The Drosophila Polycomb group gene pleiohomeotic encodes a DNA binding protein with homology to the transcription factor YY1. Vol. 1.


            • Fig 1: Conrad Waddington and the origin of epigenetics, Denis Noble, Journal of Experimental Biology  2015  218: 816-818;  doi: 10.1242/jeb.120071
            • Fig 2: Genome Regulation by Polycomb and Trithorax: 70 Years and Counting, Schuettengruber, Bourbon, Di Croce, Cavalli, 2017
            • Fig 3: Polycomb Group Proteins Are Key Regulators of Keratinocyte Function, Richard L. Eckert, Gautam Adhikary, Ellen A. Rorke, Yap Ching Chew, Sivaprakasam Balasubramanian, 2011
            • Fig 4: https://commons.wikimedia.org/wiki/File:ChIP_procedure.jpg
            • Fig 5: https://www.atcc.org/~/media/Attachments/Micrographs/Cell/CCL-2.ashx