Web supplement to
"RNA-binding proteins that lack canonical RNA binding domains are rarely sequence-specific"

Debashish Ray1,7, Kaitlin U. Laverty1,2,7, Arttu Jolma1, Kate Nie1,2, Reuben Samson1,2, Sara E. Pour1,2, Cyrus L. Tam5,6, Niklas von Krosigk1,2, Syed Nabeel‐Shah1,2, Mihai Albu1, Hong Zheng1, Gabrielle Perron3,4, Hyunmin Lee1, Hamed Najafabadi3,4, Benjamin Blencowe1,2, Jack Greenblatt1,2, Quaid Morris1,2,5,6* & Timothy R. Hughes1,2*

1Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada.
2Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada.
3Department of Human Genetics, McGill University, Montréal, QC H3A 0C7, Canada.
4McGill Genome Centre, Montréal, QC H3A 0G1, Canada.
5Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
6Tri-Institutional Training Program in Computational Biology and Medicine, Weill Cornell Medicine, New York, NY, USA.
7These authors contributed equally: Debashish Ray and Kaitlin U. Laverty.

* To whom correspondence should be addressed:

Abstract

Thousands of RNA‐binding proteins (RBPs) crosslink to cellular mRNA. Among these are numerous unconventional RBPs (ucRBPs): proteins that associate with RNA but lack known RNA‐binding domains (RBDs). The vast majority of ucRBPs have uncharacterized RNA‐binding specificities. We analyzed 492 human ucRBPs for intrinsic RNA‐binding in vitro and identified 23 that bind specific RNA sequences. Most (17/23), including 8 ribosomal proteins, were previously associated with RNA‐related function. We identified the RBDs responsible for sequence‐specific RNA‐binding for several of these 23 ucRBPs and surveyed whether corresponding domains from homologous proteins also display RNA sequence specificity. CCHC‐zf domains from seven human proteins recognized specific RNA motifs, indicating that this is a major class of RBD. For Nudix, HABP4, TPR, RanBP2‐zf, and L7Ae domains, however, only isolated members or closely related homologs yielded motifs, consistent with RNA‐binding as a derived function. The lack of sequence specificity for most ucRBPs is striking, and we suggest that many may function analogously to chromatin factors, which often crosslink efficiently to cellular DNA, presumably via indirect recruitment. Finally, we show that ucRBPs tend to be highly abundant proteins and suggest their identification in RNA interactome capture studies could also result from weak nonspecific interactions with RNA.

Supplementary Tables

Array Information, Raw and Processed Data

Z-scores

Motifs

RNAcompete Normalization and Classifier Code