Dimensionally Reduced Optimal Skyline Evaluation by sampling in Big Data analytics
The rapid growth of data creates challenges in storage, processing, and maintenance. Skyline computation is effective for dimensionality reduction of data, which supports intelligent decision-making over complex data. This paper proposes a Dimensionally Reduced Optimal Skyline Evaluation (DROSE) algorithm for big data. The optimal skylines are obtained in two phases, namely DROSE_skyline and DROSE_k-means, using Lloyd’s sampling. The proposed framework improves computational efficiency by reducing the number of comparisons. The results show that, as dimensionality increases, the framework reduces computation time and achieves higher accuracy in determining the skylines. The results also show higher reliability and fault tolerance.
Keywords: dimensionality reduction, MapReduce, k-means clustering, Lloyd’s sampling, DROSE, skyline
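
For readers unfamiliar with the skyline operator, the following is a minimal sketch of the standard skyline (Pareto-dominance) definition, not the DROSE algorithm itself. It assumes that smaller values are preferred in every dimension; the names `dominates`, `skyline`, and the sample `hotels` data are hypothetical and chosen only for illustration. The naive version compares every pair of points, which is the comparison cost that frameworks of this kind aim to reduce.

```python
from typing import List, Sequence

Point = Sequence[float]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p dominates q.

    Assumption: smaller is better in every dimension. p dominates q when
    p is no worse than q in all dimensions and strictly better in at least one.
    """
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Return the points not dominated by any other point (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative data: (price, distance to the beach); lower is better for both.
hotels = [(50, 8), (60, 3), (80, 2), (70, 5), (90, 9)]
print(skyline(hotels))
```

In this example, (70, 5) is dominated by (60, 3) and (90, 9) is dominated by (50, 8), so neither belongs to the skyline; the remaining three points each offer a trade-off that no other point beats in both dimensions.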