Supplementary Materials: Supplementary Information 41467_2017_39_MOESM1_ESM.

Here we propose a computational method termed reCAT (recover cycle along time) for unsynchronized single-cell transcriptome data. We independently test reCAT for accuracy and reliability using several data sets. We find that cell cycle genes cluster into two major waves of expression, which correspond to the two well-known checkpoints, G1 and G2. Moreover, we leverage reCAT to reveal methylation variance along the recovered cell cycle. Thus, reCAT shows the potential to elucidate diverse profiles of the cell cycle, as well as other cyclic or circadian processes (e.g., in liver), at single-cell resolution.

Introduction

Cell cycle studies, a long-standing research area in biology, are supported by transcriptome profiling with traditional technologies, such as qPCR [1], microarrays [2], and RNA-seq [3], which have been used to quantify gene expression during the cell cycle. However, these strategies either require a large number of synchronized cells (microarrays and bulk RNA-seq) or cannot observe the whole transcriptome (qPCR). Moreover, in the absence of elaborate and efficient cell cycle labeling methods, a high-resolution whole-transcriptome profile along an intact cell cycle remains unavailable.

Recently, single-cell RNA-sequencing (scRNA-seq) has become an efficient and reliable experimental technology for fast and low-cost transcriptome profiling at the single-cell level [4, 5]. The technology extracts mRNA molecules from single cells and amplifies them to a quantity sufficient for sequencing [6]. Single-cell transcriptomes make it possible to examine temporal, spatial and micro-scale variations of cells.
This includes (1) exploring the temporal progress of single cells and their relationship with cellular processes, for example, transcriptome profiling at different time points after activation of dendritic cells [7], (2) characterizing spatial-functional associations at single-cell resolution, which is essential to understanding tumors and complex tissues, such as the spatial orientation of different brain cells [8], and (3) unraveling micro-scale differences among homogeneous cells, inferring, for example, axonal arborization and action potential amplitude of individual neurons [9].

One of the major difficulties of scRNA-seq data analysis involves separating biological variation from high-level technical noise, and dissecting the multiple intertwined factors that contribute to biological variation. Among all these factors, determining the cell cycle stages of single cells is critical and central to other analyses, such as determination of cell types and developmental stages, quantification of cell-cell differences, and stochasticity of gene expression [10]. Related computational methods have been developed to analyze scRNA-seq data sets, including identifying oscillating genes and using them to order single cells along the cell cycle (Oscope) [11], classifying single cells into specific cell cycle stages (Cyclone) [12], and scoring single cells in order to reconstruct a cell cycle time-series manually [13]. In addition, several computational models have been proposed to reconstruct the time-series of differentiation processes, including principal curve analysis (SCUBA) [14], construction of minimum spanning trees (Monocle [15] and TSCAN [16]), nearest-neighbor graphs (Wanderlust [17] and Wishbone [18]) and diffusion maps (DPT) [19]. In fact, even before scRNA-seq came into popular use, the reconstruction of cell cycle time-series was accomplished using, for example, a fluorescent reporter and DNA content signals (ERA) [20], and images of fixed cells (Cycler) [21].
However, despite these efforts, accurate and robust methods to elucidate the time-series of the cell cycle transcriptome at single-cell resolution are still lacking.

Here we propose a computational method termed reCAT (recover cycle along time) to reconstruct cell cycle time-series using single-cell transcriptome data. reCAT can be used to analyze almost any kind of unsynchronized scRNA-seq data set to obtain a high-resolution cell cycle time-series. In the following, we first show that a single marker gene is not sufficient to give reliable information about cell cycle stages in scRNA-seq data sets. Next, we give an overview of the design of reCAT, followed by an illustration of applying reCAT to a single-cell RNA-seq data set called mESC-SMARTer, and a demonstration of the robustness and accuracy of reCAT. Finally, we give detailed analyses of several applications of reCAT. All data sets used in this study are listed in Table 1.