Intelligent Discovery Management
PROCESSING & ANALYSIS : NEAR DUPLICATE DETECTION

Efficiently Reduce Document Sets in a Defensible Manner

In typical document collections, often 30% to 60% of all documents are similar to one another, differing only by incremental changes in content or formatting. IDM's Near Duplicate Detection groups near duplicate documents by percentage of similarity so reviewers can quickly review and code similar documents together for responsiveness or privilege.

IDM's process uses Context Triggered Piecewise Hashing (CTPH) to identify similar documents. Traditional de-duplication computes a single hash value for each entire file and compares it against the hash values of other entire files, so it matches only exact copies. CTPH instead builds a fuzzy hash value from distinct segments within each file, with the segment boundaries determined by the file's own content. Each file's fuzzy hash value is then compared against those of other files. If a specified similarity threshold is met, the files are identified as duplicates or near duplicates.
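To make the idea concrete, the following is a minimal Python sketch of piecewise hashing with content-triggered boundaries. It is an illustration under stated assumptions, not IDM's implementation: the names `segment_hashes`, `similarity`, and `is_near_duplicate`, the window size, the boundary mask, and the 0.8 threshold are all hypothetical. It also swaps in a simple set-overlap (Jaccard) score for the comparison step, whereas production CTPH tools typically build a compact signature string and compare signatures by edit distance.

```python
import hashlib
import zlib

WINDOW = 7            # trailing bytes that decide whether a segment ends here
BOUNDARY_MASK = 0x0F  # illustrative tuning: a boundary roughly every 16 bytes


def segment_hashes(data: bytes) -> set[str]:
    """Cut data at content-defined boundaries and hash each segment."""
    hashes: set[str] = set()
    start = 0
    for i in range(len(data)):
        window = data[max(0, i - WINDOW + 1): i + 1]
        # The "context trigger": a boundary depends only on nearby bytes,
        # so an edit elsewhere in the file does not move it.
        if (zlib.crc32(window) & BOUNDARY_MASK) == 0:
            hashes.add(hashlib.sha1(data[start:i + 1]).hexdigest())
            start = i + 1
    if start < len(data):
        # Whatever follows the last boundary is the final segment.
        hashes.add(hashlib.sha1(data[start:]).hexdigest())
    return hashes


def similarity(a: bytes, b: bytes) -> float:
    """Share of distinct segments two files have in common, from 0.0 to 1.0."""
    ha, hb = segment_hashes(a), segment_hashes(b)
    if not ha and not hb:
        return 1.0  # two empty files are trivially identical
    return len(ha & hb) / len(ha | hb)


def is_near_duplicate(a: bytes, b: bytes, threshold: float = 0.8) -> bool:
    """Apply a similarity threshold (0.8 is an assumed value, not IDM's)."""
    return similarity(a, b) >= threshold
```

Because each boundary is chosen from local content rather than from a fixed offset, inserting or deleting text in one place changes only the segments around the edit. The remaining segments keep the same hashes, which is what allows two revisions of a document to score as similar even though their whole-file hashes differ.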

The results of IDM's near duplicate detection can be loaded into any server-based or web-based review application, so you can sort your documents into near duplicate groups for review. Your review team will quickly become familiar with the content of similar documents and can code them efficiently and consistently.
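As a rough illustration of what sorting into near duplicate groups can look like, the Python sketch below assigns each document to the first group whose pivot document it resembles closely enough, and records the percentage of similarity alongside it. Everything here is an assumption made for illustration: the function name `group_near_duplicates`, the 0.9 threshold, the greedy pivot strategy, and the use of the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure. It does not describe IDM's grouping logic or load file format.

```python
from difflib import SequenceMatcher


def group_near_duplicates(docs: dict[str, str], threshold: float = 0.9) -> list[dict]:
    """Greedily group documents around pivot documents.

    Returns a list of groups; each group holds the pivot's ID and a list of
    (document ID, percent similarity to the pivot) pairs.
    """
    groups: list[dict] = []
    for doc_id, text in docs.items():
        for group in groups:
            pivot_text = docs[group["pivot"]]
            score = SequenceMatcher(None, pivot_text, text).ratio()
            if score >= threshold:
                group["members"].append((doc_id, round(score * 100)))
                break
        else:
            # No existing group is close enough, so this document
            # becomes the pivot of a new group.
            groups.append({"pivot": doc_id, "members": [(doc_id, 100)]})
    return groups


if __name__ == "__main__":
    sample = {
        "DOC-001": "The quick brown fox jumps over the lazy dog.",
        "DOC-002": "The quick brown fox jumps over the lazy dog!",
        "DOC-003": "Quarterly revenue increased by ten percent.",
    }
    for group in group_near_duplicates(sample):
        print(group["pivot"], group["members"])
```

A reviewer working through one group at a time sees the pivot first and then its close variants, which is what makes consistent coding across similar documents practical.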

IDM's near duplicate detection is charged per file, at rates as much as 90% lower than those of other service providers: just pennies per file. Processing is fast and will not hold up your time-sensitive review.

© 2011 IDM