Duplicate Elimination: Difference between revisions

From Algorithm Wiki
Jump to navigation Jump to search
No edit summary
No edit summary
Line 42: Line 42:
[[File:Duplicate Elimination - Space.png|1000px]]
[[File:Duplicate Elimination - Space.png|1000px]]


== Pareto Frontier Improvements Graph ==  
== Time-Space Tradeoff ==  


[[File:Duplicate Elimination - Pareto Frontier.png|1000px]]
[[File:Duplicate Elimination - Pareto Frontier.png|1000px]]

Revision as of 15:46, 15 February 2023

Description

SQL does not eliminate duplicates implicitly. It allows to enter duplicate values on columns other than candidate key or if did not specified any keys. If the user wants to eliminate duplicate records, he has to use DISTINCT keyword in the query.

Databases, therefore, can have duplicate entries. The problem deals with identifying and removing duplicates from a database.

Parameters

No parameters found.

Table of Algorithms

Name Year Time Space Approximation Factor Model Reference
Sorting based (Merge Sort) 1964 $O(nlogn)$ $O(n)$ Exact Deterministic
Sorting based (Merge Sort) + real-time elimination 1964 $O(nlogn)$ Exact Deterministic
BST Algorithm 1999 $O(nlogn)$ $O(\log n)$ Exact Deterministic
Priority Queue Algorithm 1976 $O(n^{2})$ $O(n)$ Exact Deterministic
Sorted Neighborhood Algorithm (SNA) 1998 $O(n^{2})$ $O(n)$ Exact Deterministic Time
Duplicate Elimination Sorted Neighborhood Algorithm (DE-SNA) 2002 $O(nlogn)$ Exact Deterministic
Adaptive Duplicate Detection Algorithm (ADD) 2003 $O(n^{3})$ $O({1})$ Exact Deterministic Time

Time Complexity Graph

Duplicate Elimination - Time.png

Space Complexity Graph

Duplicate Elimination - Space.png

Time-Space Tradeoff

Duplicate Elimination - Pareto Frontier.png