The Discovery Of The Specificity Of Entorhinal Cortex Cells In Alzheimer’s Disease Based On K-Graph Clustering Method For Single Cell Sequencing Data

  • Yixi Li#, Wei Kong, Shuaiqun Wang


Single-cell sequencing is a new technology for high-throughput sequencing and analysis of the genome, transcriptome and epigenome at the level of individual cells. It deepens our understanding of brain nerve cells by revealing the gene structure and gene expression status of single cells and the heterogeneity among them, and it plays an important role in research on Alzheimer's disease (AD) and other areas of neuroscience. Graph-based clustering methods partition a weighted undirected graph into two or more subgraphs so that the nodes within each subgraph are as similar as possible while different subgraphs are as far apart as possible. Compared with traditional clustering algorithms, graph-based clustering can work in any space and cluster data of any shape. However, it requires an eigenvector decomposition and a neighbor search for every cell in order to build the graph, which gives it high time complexity and makes it difficult to apply in the era of big data. To meet the needs of big data, this study applied a new graph-based clustering optimization method (the k-graph clustering method) to address the long execution time and low efficiency of processing single-cell transcriptome profiles of the entorhinal cortex in AD. First, k-means is used to obtain key centroids; because each centroid represents its adjacent cells, this step compresses the data. Then the graph-based clustering method is applied to the centroids, and the cells around each centroid are carried along with it and inherit its cluster. The results show that the two methods give almost the same classification, but the k-graph clustering method is significantly faster than the graph-based clustering method.
By this method, we classified different cell types, including neurons and non-neuronal brain cells such as oligodendrocytes, astrocytes and microglia. Each of these cell types showed significant gene expression differences in AD, and the regulation of these differentially expressed genes reveals heterogeneous responses to AD pathology across cell types.
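
The two-stage procedure summarized above (k-means compression followed by graph-based clustering of the centroids) can be illustrated with a minimal sketch. It is not the authors' implementation: scikit-learn's `KMeans` and `SpectralClustering` stand in for the k-means and graph-based steps, the function name `k_graph_cluster` and its parameters `n_centroids`, `n_clusters` and `gamma` are hypothetical, and the input is synthetic two-dimensional data rather than single-cell transcriptome profiles.

```python
import numpy as np
from sklearn.cluster import KMeans, SpectralClustering


def k_graph_cluster(X, n_centroids, n_clusters, gamma=1.0, seed=0):
    """Sketch of a two-stage clustering: compress with k-means, then
    cluster the centroids with a graph-based (spectral) method.

    All names and defaults here are illustrative assumptions.
    """
    # Stage 1: compress the cells into a small set of centroids.
    km = KMeans(n_clusters=n_centroids, n_init=10, random_state=seed).fit(X)
    centroids = km.cluster_centers_

    # Stage 2: graph-based clustering on the centroids only, so the
    # affinity graph and eigendecomposition have size n_centroids
    # instead of the number of cells.
    sc = SpectralClustering(
        n_clusters=n_clusters, affinity="rbf", gamma=gamma, random_state=seed
    )
    centroid_labels = sc.fit_predict(centroids)

    # Stage 3: every cell inherits the cluster of its own centroid.
    return centroid_labels[km.labels_]


# Synthetic stand-in data: three well-separated groups of 100 points each.
rng = np.random.default_rng(0)
blob_centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
truth = np.repeat(np.arange(3), 100)
X = blob_centers[truth] + rng.normal(scale=0.5, size=(300, 2))

labels = k_graph_cluster(X, n_centroids=30, n_clusters=3)
```

In this sketch the expensive graph step operates on 30 centroids rather than 300 points, which is the source of the speed-up the abstract describes; how many centroids to keep is a trade-off between speed and fidelity that the sketch does not attempt to tune.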