Multi-Viewpoints Based Visual Method by Latent Dirichlet Allocation for Efficient Tweets Data Clustering
Abstract
Data clustering methods are widely used to partition data into clusters based on the similarity features of the data objects. Cluster tendency assessment, which estimates a suitable number of clusters in the data, is a crucial step before applying a clustering technique. Existing cluster tendency visualization methods, such as visual assessment of tendency (VAT) and spectral VAT (SpecVAT), use Euclidean or cosine measures to compute similarity features. In text mining domains, cosine-based visual methods assess clusters more accurately than the alternatives and are therefore recommended for cluster assessment of tweet data. However, such visual methods determine the clusters with respect to a single viewpoint. Multiple viewpoints are required for a more informative cluster assessment of tweet documents. In this paper, we propose a multi-viewpoints based visual method using Latent Dirichlet Allocation (MVP-VM-LDA) to assess the quality of clusters (or topics) and to visualize cluster tendency. Real-time tweet datasets were extracted to demonstrate the efficiency of the proposed method in the experimental study.
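To make the single-viewpoint baseline concrete, the following is a minimal sketch of a cosine-based VAT reordering, assuming documents are already represented as term-count vectors. It illustrates the standard VAT idea only and is not the proposed MVP-VM-LDA method; the function names `cosine_dissimilarity`, `vat_order`, and `vat_image` are hypothetical and chosen for illustration.

```python
import numpy as np


def cosine_dissimilarity(X):
    """Pairwise dissimilarity 1 - cos(x_i, x_j) between the rows of X."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0  # guard against empty documents (all-zero rows)
    U = X / norms
    D = 1.0 - U @ U.T
    np.fill_diagonal(D, 0.0)  # a document has zero dissimilarity to itself
    return np.clip(D, 0.0, 2.0)


def vat_order(D):
    """VAT ordering: a Prim-like traversal of the dissimilarity matrix D."""
    n = D.shape[0]
    # Start from one endpoint of the largest dissimilarity in D.
    i, _ = np.unravel_index(np.argmax(D), D.shape)
    order = [int(i)]
    remaining = [k for k in range(n) if k != i]
    while remaining:
        # Pick the unvisited object closest to any already visited object.
        sub = D[np.ix_(order, remaining)]
        _, col = np.unravel_index(np.argmin(sub), sub.shape)
        order.append(remaining.pop(int(col)))
    return order


def vat_image(D):
    """Return the reordered dissimilarity matrix and the ordering used."""
    order = vat_order(D)
    return D[np.ix_(order, order)], order
```

Displaying the reordered matrix as a grayscale image would show dark blocks along the diagonal, and the number of such blocks suggests the number of clusters. This sketch uses one reference point for every comparison, which is the single-viewpoint limitation the abstract describes.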