
Clustify 3.1 Adds Ability to Automatically Ignore Email Headers and Footers for Improved Document Clustering

HAVERTOWN, Pa. — (PRLog) — July 16, 2012 — Hot Neuron LLC announces the release of version 3.1 of its Clustify™ software, featuring the ability to automatically ignore email headers and footers to produce cleaner results that are more useful during the document review phase of e-discovery.

Clustify groups related documents into labeled clusters, providing an overview of the document set and allowing the user to review and categorize related documents together for greater efficiency and consistency. The user chooses whether to group documents that are conceptually similar, near-duplicates, or elements of an email thread. Version 3.1 adds the option to automatically ignore email headers, footers, email addresses, and other clutter that can reduce the quality of the results.

Email can be problematic for text analytics because the substantive part of the email is often short, but it may be accompanied by a large amount of unimportant text in the headers and footers. The extraneous text can result in software choosing less informative labels for the clusters, and can cause emails to be grouped together because they share long disclaimers in the footers, rather than because they have something important in common. Replies in an email thread can have header and footer text from parent emails embedded in the middle of the body, making it difficult to identify and remove the clutter. Clustify 3.1 handles this automatically. It ignores the clutter when analyzing documents, and ghosts it when displaying the documents for the user, so a reviewer can see everything with the proper emphasis.
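
To illustrate the general idea of separating clutter from substantive email text, here is a minimal Python sketch. It is not Clustify's implementation, which the release does not describe. The function name `split_clutter` and all three patterns are hypothetical, chosen only for illustration, and real email would need far more robust detection.

```python
import re

# Hypothetical patterns for illustration only (not Clustify's rules).
# Header lines, optionally prefixed by ">" quoting from a parent email.
HEADER_LINE = re.compile(
    r"^\s*>*\s*(From|To|Cc|Bcc|Sent|Date|Subject):", re.IGNORECASE
)
EMAIL_ADDRESS = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# Assumed opening phrase of a confidentiality disclaimer.
DISCLAIMER_START = re.compile(
    r"this (e-?mail|message) .*confidential", re.IGNORECASE
)


def split_clutter(text):
    """Return (substantive_lines, clutter_lines) for one email.

    Clutter is kept rather than discarded so that a viewer could
    still display it in a de-emphasized form.
    """
    substantive, clutter = [], []
    in_disclaimer = False
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends a disclaimer paragraph.
            in_disclaimer = False
            continue
        if DISCLAIMER_START.search(line):
            in_disclaimer = True
        if in_disclaimer or HEADER_LINE.match(line):
            clutter.append(line)
        else:
            # Drop stray email addresses from otherwise useful lines.
            cleaned = EMAIL_ADDRESS.sub("", line).strip()
            if cleaned:
                substantive.append(cleaned)
    return substantive, clutter


sample = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Budget\n"
    "\n"
    "Please review the budget draft.\n"
    "\n"
    "> From: bob@example.com\n"
    "> Sent: Monday\n"
    "Can you send the draft?\n"
    "\n"
    "This email is confidential and intended\n"
    "only for the named recipient.\n"
)

substantive, clutter = split_clutter(sample)
print(substantive)
```

In this sketch only the substantive lines would be passed on to text analysis, while the clutter lines, including the parent email's headers embedded in the reply, are set aside for display.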

"We constantly push Clustify to produce better results with minimal effort by the user, and this is a solid step in that direction," says Bill Dimm, the CEO of Hot Neuron. "Version 3.1 makes it even easier to get good, clean results without a lot of tweaking or data polishing by the user."

About Hot Neuron

Hot Neuron LLC is an information retrieval software and services company located in Havertown, Pa. Its Clustify software (www.cluster-text.com) does conceptual clustering, near-duplicate detection, content-based email threading, and automatic categorization / predictive coding. Clustify is used in litigation to make the document review phase of e-discovery more efficient and consistent.

Clustify is a trademark of Hot Neuron LLC. Hot Neuron is a registered service mark of Hot Neuron LLC.

  Hot Neuron LLC
  Bill Dimm, 610-581-7702