A Novel Measure for Dynamic Outlier Detection from Large Databases

Authors

  • K. Ashesh, Dr. G. AppaRao

Abstract

In applications such as data mining, where business intelligence (BI) is the expected outcome, the
presence of outliers can have a negative impact. Outlier detection has therefore become a significant
research area within data mining. Existing outlier detection algorithms are effective and are used in many
real-world applications. However, those algorithms are less efficient; in particular, density-based
algorithms perform poorly on datasets with varying densities or on datasets that exhibit clusters of
different shapes. To remove these barriers, we propose a methodology that efficiently captures the density
variations around a test point in order to detect outliers with high accuracy. It is a two-fold process: first, a
potential influence space for a dataset is determined; then, a novel measure computes outlierness based on
the absolute density associated with the potential influence space and on the neighbourhood rank difference.
An Outlier Score-based Detection (OSD) algorithm is proposed and implemented. A prototype application
has been built to validate the hypothesis. Experiments on UCI datasets and synthetic datasets evaluate the
efficiency of the proposed method in comparison with existing state-of-the-art methods.

Published

2020-05-30

Section

Articles