October 2017 Issue Vol.7 No.10

Part–Time Ph.D (Category – B), R&D Centre, Bharathiar University, Coimbatore & Assistant Professor, Department of Computer Science, Navarasam Arts and Science College for Women, Erode, Tamil Nadu, India.
Ph.D., Research Scholar, Dept of Computer Science, Erode Arts And Science College (Autonomous), Erode, Tamil Nadu, India.
A.Venkatesh Kumar
Technology Specialist, Cognizant Technology Solution, Coimbatore, Tamil Nadu, India.

Abstract: This paper delineates the evolution of dis-similarity percentage calculation using a grouping technique. None of the existing algorithms produces the dis-similarity percentage between a pair of strings. There are two types of evolution model for duplicate detection: duplicate detection without grouping and duplicate detection with grouping. This research demonstrates that duplicate detection with grouping is more powerful and also performs much better than duplicate detection without grouping. It introduces a new technique that combines the merits and features of a clustering algorithm and a de-duplication algorithm to improve performance and accuracy.
Keywords: Duplicate Detection, De-Duplication, Dis-Similarity, Grouping, Clustering.

