Proximity RNA Sequencing: Breaking New Ground in Spatial Transcriptomics

By Sangeeta Chakraborty, Ph.D.


In an extraordinary feat of compaction, the entire human genome, roughly 2 meters of DNA, is squeezed into a micrometer-sized nucleus as a knot-free mass. Remarkably, this genomic globule remains accessible to a complex range of transcriptional machinery through chromatin loops, which bring cis-acting DNA elements into proximity with transcriptional units and allow the coordinated action of millions of molecules to drive the processes of the central dogma.

Nuclear functions such as transcription, recombination, and replication rely on a well-compartmentalized genome and are confined to designated sub-nuclear territories rather than spread randomly throughout the nucleus. Understanding these spatial territories, and the interactions and networks of DNA and RNA they harbor, is paramount to understanding how this massive nucleic acid architecture facilitates the dynamic processes of gene regulation, cellular behavior, and the development of cell fate.

Biological organization at this level of complexity needed to be decoded and mapped. In the past decade, researchers have developed a slew of advanced technologies that spatially map higher-order genome structure, as well as the transcriptome, within the confines of the three-dimensional (3D) nuclear space. With the advent of high-throughput sequencing and in situ imaging technologies, scientists are finally able to map the entire repertoire of DNA and RNA interactions at single-cell resolution.

Current techniques for spatiotemporal mapping of the genome and transcriptome include:

  1. Fluorescent in situ hybridization (FISH)-based imaging, in some cases coupled with sequencing: transcripts are labeled directly in cells or tissue sections, so their single-cell (and even subcellular) spatial information can be visualized.
  2. Proximity ligation combined with high-throughput sequencing (conformation-capture-based assays such as Hi-C): two nucleotide sequences that may be far apart in the linear genome but co-locate in 3D space are ligated, and the chimeric junctions are sequenced to map the pairwise interactions. This identifies distal sequences that interact in close spatial proximity.
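
As a rough illustration of how sequenced chimeric junctions become a map of pairwise interactions, the sketch below bins the two ends of each ligated pair and counts contacts per bin pair. The bin size, the simplified input format, and the function name are assumptions made for this example; it is not the procedure of any particular Hi-C pipeline.

```python
from collections import Counter

BIN_SIZE = 1_000_000  # assumed 1 Mb bins; real analyses pick this per experiment


def contact_counts(junctions, bin_size=BIN_SIZE):
    """Count contacts between genomic bins.

    `junctions` is an iterable of ((chrom_a, pos_a), (chrom_b, pos_b)) pairs,
    one per sequenced chimeric junction (a hypothetical, simplified input).
    """
    counts = Counter()
    for (chrom_a, pos_a), (chrom_b, pos_b) in junctions:
        bin_a = (chrom_a, pos_a // bin_size)
        bin_b = (chrom_b, pos_b // bin_size)
        # Sort the two ends so (a, b) and (b, a) count as the same contact.
        key = tuple(sorted((bin_a, bin_b)))
        counts[key] += 1
    return counts


# Toy junctions with made-up coordinates.
junctions = [
    (("chr1", 1_200_000), ("chr1", 48_700_000)),
    (("chr1", 48_100_000), ("chr1", 1_900_000)),
    (("chr2", 5_000_000), ("chr1", 1_500_000)),
]

for pair, n in sorted(contact_counts(junctions).items()):
    print(pair, n)
```

Because every junction joins exactly two loci, the result is inherently a table of pairs, which is the limitation discussed next.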

Although these techniques have enormously advanced our understanding of the spatial organization of genomic and transcriptomic profiles, they have their limitations. For example, FISH-based assays frequently suffer from noisy fluorescence signals caused by molecular crowding within cells when more than a few hundred transcripts are visualized simultaneously. Proximity ligation-based methods, on the other hand, are limited to detecting pairwise interactions between sequences that are close enough to be ligated. They fail to identify structures that are spread out over large surface areas and interact over long-range distances, such as DNA structures around nuclear bodies (speckles and nucleoli).

In particular, ligation-based assays fail to pick up signals from nucleoli, which are relatively DNA-sparse and rich in repeat sequences. Proximity ligation applied to RNA interactomes has similar drawbacks: it identifies only paired interactions between two RNA molecules, or complex secondary structures (base-paired associations) within one RNA. Grouped interactions of several RNAs in a nucleoprotein complex can, however, be deciphered by pulling down a known protein component of the complex.

None of these approaches, however, reveals the positioning and organization of transcripts in the nucleus. Specifically: where are the nascent transcripts located relative to major nuclear landmarks (structures like nuclear speckles and nucleoli); where do they move for processing or storage; and what company do they keep in their specific nuclear neighborhoods?


Proximity RNA Sequencing, the Answer to Spatial RNA Organization

A team of bioinformaticians from the Babraham Institute, Cambridge, recently developed a ligation-independent method, proximity RNA sequencing (proximity RNA-seq), to map closely interacting nuclear RNA molecules and answer some of these questions. The new method identifies spatial assemblies of two or more RNAs that colocalize in a small area, regardless of actual base-paired interactions. In simple terms, RNA neighborhoods are encapsulated in uniquely barcoded emulsion droplets, so that members of a single neighborhood share the same barcode. Barcoded transcripts are later identified by sequencing.
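
The idea that members of one neighborhood share a barcode can be sketched in a few lines. This is only a toy illustration: the record format, the barcode labels, the transcript names, and the function name are all hypothetical, and the authors' actual analysis is not described here.

```python
from collections import defaultdict


def rna_neighborhoods(records, min_members=2):
    """Group transcripts by shared barcode.

    `records` is an iterable of (barcode, transcript_id) pairs (hypothetical
    input). Only barcodes with at least `min_members` distinct transcripts
    are returned, since a neighborhood needs two or more RNAs.
    """
    groups = defaultdict(set)
    for barcode, transcript in records:
        groups[barcode].add(transcript)
    return {bc: rnas for bc, rnas in groups.items() if len(rnas) >= min_members}


# Made-up records; labels are placeholders, not real data.
records = [
    ("BC01", "rRNA_A"), ("BC01", "snoRNA_B"), ("BC01", "rRNA_A"),
    ("BC02", "mRNA_X"),
    ("BC03", "mRNA_Y"), ("BC03", "lncRNA_Z"),
]

for barcode, rnas in sorted(rna_neighborhoods(records).items()):
    print(barcode, sorted(rnas))
```

Using a set per barcode means repeated reads of the same transcript do not inflate the size of a neighborhood, and a barcode seen with only one RNA carries no proximity information.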

Technically, proximity RNA-seq derives from the methodologies of both single-cell RNA-seq and conformation capture assays, but it dispenses with the microfluidics equipment (as used in Drop-seq) for capturing sub-nuclear particles. Instead, it uses a simple vortexing protocol to create millions of emulsion droplets that encapsulate the nuclear fragments.


It’s All in the Method

Chemically crosslinked nuclear fragments are first encapsulated into millions of water-in-oil emulsion droplets, each containing a uniquely barcoded bead. A crosslinking time of 20 min ensures a homogeneous preparation of nuclear fragments without any cytoplasmic contamination. The beads are drawn from a library of up to a trillion (10^12) barcodes, with one bead per droplet; this is well in excess of the estimated 10^9 nascent transcripts that can be captured, and exceeds the size of a typical library used for single-cell RNA-seq. Each bead carries a unique 26-nucleotide DNA sequence called the barcode, in the form of thousands of identical oligos that all contain the same 26-nucleotide sequence.
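
The numbers in this paragraph can be checked with simple arithmetic. The sketch below uses only the figures quoted above (26-nucleotide barcodes, up to 10^12 beads, about 10^9 transcripts) and assumes four possible bases at each barcode position; the variable names are illustrative.

```python
BARCODE_LENGTH = 26    # nucleotides per barcode (from the text)
BEADS = 10**12         # upper bound on barcoded beads (from the text)
TRANSCRIPTS = 10**9    # estimated nascent transcripts (from the text)

# Assumption: each position can be any of A, C, G, T.
possible_barcodes = 4 ** BARCODE_LENGTH

# How many beads there are for every transcript to be captured.
beads_per_transcript = BEADS // TRANSCRIPTS

# Share of the theoretical barcode space that 10^12 beads would occupy.
fraction_of_space_used = BEADS / possible_barcodes

print(f"possible 26-nt barcodes: {possible_barcodes:.2e}")
print(f"beads per transcript:    {beads_per_transcript}")
print(f"fraction of space used:  {fraction_of_space_used:.2e}")
```

A 26-nucleotide sequence allows about 4.5 x 10^15 distinct barcodes, so a trillion beads occupy only a small fraction of the available space, and the beads outnumber the estimated transcripts by a factor of 1,000.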

A tail of 15 random bases is appended to the end of the barcode to prime reverse transcription of the RNA species trapped in the droplet, followed by amplification and sequencing. Transcripts that cluster together or interact in a dense group are likely to be captured by a single bead and therefore receive the same barcode. After sequencing and identifying the reads, Morf and colleagues could determine which groups of RNAs hung out together in specific nuclear zones; in other words, they could identify RNA-rich and RNA-sparse neighborhoods in the nuclear landscape.
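
To make the barcode-plus-random-tail design concrete, here is a minimal sketch of how such reads might be taken apart and tallied. It assumes a read layout of 26 barcode bases, then the 15 random bases, then the cDNA; that layout, the function names, and the toy inputs are assumptions for illustration, and alignment of the cDNA to transcripts is not shown.

```python
from collections import Counter
from itertools import combinations

BARCODE_LEN = 26  # from the text
RANDOM_LEN = 15   # from the text


def split_read(read):
    """Split a read into (barcode, random_tail, cdna) under the ASSUMED layout."""
    if len(read) <= BARCODE_LEN + RANDOM_LEN:
        raise ValueError("read too short for the assumed layout")
    barcode = read[:BARCODE_LEN]
    tail = read[BARCODE_LEN:BARCODE_LEN + RANDOM_LEN]
    cdna = read[BARCODE_LEN + RANDOM_LEN:]
    return barcode, tail, cdna


def pair_counts(neighborhoods):
    """Count how many barcodes each pair of transcripts shares.

    `neighborhoods` is an iterable of sets of transcript ids, one set per
    barcode (hypothetical input produced after alignment).
    """
    pairs = Counter()
    for transcripts in neighborhoods:
        for a, b in combinations(sorted(transcripts), 2):
            pairs[(a, b)] += 1
    return pairs


# Toy read: 26 barcode bases, 15 random bases, then a short cDNA stretch.
read = "A" * 26 + "C" * 15 + "GATTACA"
print(split_read(read))

# Toy neighborhoods with placeholder transcript names.
print(pair_counts([{"a", "b", "c"}, {"a", "b"}, {"c"}]))
```

Pairs that share a barcode more often than their abundance alone would predict are candidates for true spatial neighbors; estimating that expectation is a statistical step outside the scope of this sketch.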


What Does the Nuclear Landscape Look Like?

On a transcriptome-wide level, proximity RNA-seq demonstrates a non-random positioning of RNA molecules within the nucleus. Transcripts of all kinds (precursor, mature, regulatory, non-coding) were found at distinct distances from the nucleolus, a fixed nuclear landmark. For example, small nucleolar RNAs (snoRNAs), which are non-coding, partitioned close to nucleoli along with a few protein-coding transcripts and the bulk of ribosomal RNAs (rRNAs).

Based on RNA density and local transcriptional activity, the team visualized the nuclear landscape as two distinct compartments: compartment 1, the nucleolus and its vicinity, containing RNA polymerase I (RNA pol I) transcripts; and compartment 2, the nucleoplasm and nuclear periphery, containing RNA pol II transcripts. Specific locations in compartment 1 (primarily the nucleolar periphery), previously suspected of low transcriptional activity because of densely packed heterochromatin, turned out to contain actively transcribed RNAs. Spatial mapping with this technique revealed that this RNA-dense region predominantly contained nascent rRNAs and housekeeping snoRNAs, whereas compartment 2 included most of the tissue-specific protein-coding RNAs.



Since proximity RNA-seq does not rely on proximity ligation, it accurately measures spatial distances across nuclear bodies in 3D space. Systematic application of such 3D spatial maps to resolving the spatial transcriptome or genome could have a profound impact in many fields. It can provide insights into the complex transcriptional networks in diseased and healthy cells; it can be extended beyond the nucleus to subcellular RNA-containing structures and their functional implications (e.g., RNA phase-separated bodies); and it can enrich our systems-level understanding of how the relationship between genome structure and the associated transcriptome coordinates overall cellular functions.


