Web Page Segmentation In Weblog Using KMedoid Clustering With Sorting And Kernel Distance Function

Authors

  • G. Vijaiprabhu, Dr. K. Meenakshi sundaram

Abstract

Web mining uses data mining techniques to extract knowledge from web log data,
which contains fields such as User ID, Login Time, URL, Status and Bytes Transferred.
These fields are analyzed to obtain information about user identification, session
identification and user navigation behavior. Web mining proceeds through three stages:
Preprocessing, Pattern Discovery and Pattern Analysis. Data mining methods such as
Classification, Clustering and Association mining are used to identify patterns in web
log data. Web mining consists of three domains, namely Web Usage Mining, Content Mining
and Structure Mining, where the first domain gives details about the user log, the second
contains the analysis of the user's access pattern and the third assesses successive
hyperlinks in the web pages. This research work focuses on Web Content Mining to gain
in-depth knowledge about user navigation behavior by segmenting the web pages or URLs
with K-Medoid clustering. Generally, a clustering technique uses a distance function as
its primary criterion to segment the data. This work compares existing distance functions
in the K-Medoid algorithm with the proposed kernel Euclidean distance function combined
with a sorting algorithm. The work is implemented in the RapidMiner tool and assessed
with suitable evaluation metrics.
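
The idea of K-Medoid clustering with a kernel-based distance and a sorting step can be illustrated with a minimal Python sketch. It is not the authors' RapidMiner implementation, and it rests on several assumptions that the abstract does not state: the kernel is taken to be Gaussian (RBF) with a hypothetical width parameter gamma, the "kernel Euclidean distance" is read as the distance induced by that kernel in feature space, sqrt(k(x,x) - 2k(x,y) + k(y,y)), and the sorting step is assumed to order the points by their total distance to all other points in order to choose the initial medoids. The function names and the toy feature vectors are illustrative only.

```python
import math

def kernel_distance(x, y, gamma=0.01):
    """Distance induced by a Gaussian (RBF) kernel (assumed kernel choice).

    For this kernel k(x, x) = k(y, y) = 1, so
    k(x, x) - 2 k(x, y) + k(y, y) reduces to 2 - 2 k(x, y).
    """
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.sqrt(max(0.0, 2.0 - 2.0 * math.exp(-gamma * sq)))

def k_medoids(points, k, gamma=0.01, max_iter=100):
    """Plain alternating K-Medoid clustering on a precomputed distance matrix."""
    n = len(points)
    dist = [[kernel_distance(points[i], points[j], gamma) for j in range(n)]
            for i in range(n)]

    # Assumed sorting step: order points by total distance to all others,
    # then take k evenly spaced positions in that order as initial medoids.
    order = sorted(range(n), key=lambda i: sum(dist[i]))
    medoids = [order[(c * n) // k] for c in range(k)]

    def assign(current):
        # Each point joins the cluster of its nearest medoid.
        return [min(range(k), key=lambda c: dist[i][current[c]])
                for i in range(n)]

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                # Keep the old medoid if a cluster lost all its points.
                new_medoids.append(medoids[c])
                continue
            # The new medoid minimizes total distance to its cluster members.
            new_medoids.append(
                min(members, key=lambda m: sum(dist[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids

    return medoids, assign(medoids)

# Toy numeric features standing in for encoded web log records (hypothetical).
pages = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, labels = k_medoids(pages, k=2)
print(medoids, labels)
```

In practice, web log fields such as URL or Status would first have to be encoded as numeric features before a distance of this kind can be applied; the encoding is outside the scope of this sketch.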

Published

2020-12-01

Section

Articles