Case study: Seddons uses DocsCorp cleanDocs and contentCrawler to support GDPR goals
You tell us that client case studies are valuable. As with all our editorial, we don’t charge to publish them; they are published only on editorial merit. Here is a case study from DocsCorp, describing how London-based Seddons uses cleanDocs and contentCrawler software to ensure optimal data security and searchability and to support their GDPR compliance goals. Note that one key factor was the ability of cleanDocs to prevent misaddressed emails.
Seddons are a top 200 law practice based in London’s West End, with a strong reputation for both commercial and private client work. They advise their UK and international clients on personal and commercial disputes, real estate, family law, estate planning, corporate law, employment law and more. Seddons achieve the best possible outcome, with the professional and interpersonal skills needed for sensitive matters, complex technical work and cross-practice issues.
When Seddons began its journey toward GDPR compliance in early 2017, the firm’s IT Director James Temple discovered gaps in the functionality of the firm’s existing software applications.
By purchasing cleanDocs and contentCrawler from DocsCorp, Seddons has moved closer to achieving its goals for GDPR readiness, which will bring greater protection, data security and risk management confidence to the firm.
The business challenge
In early 2017, Seddons launched an initiative to ensure the firm would be compliant with the GDPR before the May 2018 deadline. In conjunction with Seddons’ Data Protection Officer, the firm hired an outside consultant to advise on the GDPR compliance project.
Seddons already have several software products installed that assist with efficiency and productivity, namely case management system Proclaim® from Eclipse, and pdfDocs and compareDocs from DocsCorp.
To mitigate any potential data breaches where users have accidentally mistyped email addresses, the firm had installed an email recipient domain name checking software product. However, Seddons’ license agreement for that product was expiring, providing an opportune time to seek a more all-encompassing solution, which would also include metadata cleaning.
As part of the GDPR compliance project, the firm embarked on a data mapping exercise to pinpoint where the firm’s data was being stored. The data map revealed that the firm had accumulated a large amount of “dark data”: unsearchable files such as scanned PDFs, TIFFs and other image and graphic files like JPGs and BMPs. The firm realised it needed a product that could continuously audit the dark data, determine whether those files already contained searchable text, and perform OCR (Optical Character Recognition) on those that did not to create a searchable text layer for each.
Seddons were already using DocsCorp products including pdfDocs for PDF creation and editing, and compareDocs document comparison software. compareDocs was already integrated with their Proclaim case management system. The firm saw that DocsCorp had established a GDPR compliance package including pdfDocs as well as cleanDocs and contentCrawler. cleanDocs removes metadata from Microsoft Office documents and verifies email addresses before messages are sent. contentCrawler bulk processes documents in the content repository using its OCR and compression modules.
Seddons noted that cleanDocs integrated with Microsoft Outlook email and handled domain name checking better than the firm’s incumbent software. cleanDocs categorises emails into groups and provides metadata cleaning of attachments prior to sending emails. Both features provide an extra layer of security for the firm. When users send out an email message, cleanDocs prompts them to double check that the recipient’s email address is correct and scrubs potentially harmful metadata from any attachments before the message is sent out.
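The kind of recipient domain check described above can be illustrated with a minimal Python sketch. This is not how cleanDocs is implemented; the function names and the `seddons.example` internal domain are hypothetical, chosen only to show the idea of grouping recipients by domain and flagging anything outside the firm for the sender to confirm.

```python
from collections import defaultdict

# Hypothetical: the domains the firm treats as internal.
INTERNAL_DOMAINS = {"seddons.example"}


def group_recipients_by_domain(recipients):
    """Group recipient addresses by their lower-cased domain."""
    groups = defaultdict(list)
    for raw in recipients:
        address = raw.strip()
        local, sep, domain = address.rpartition("@")
        if not sep or not local or not domain:
            # No usable "@" part: keep it visible instead of dropping it.
            groups["<invalid>"].append(address)
        else:
            groups[domain.lower()].append(address)
    return dict(groups)


def external_domains(recipients, internal=INTERNAL_DOMAINS):
    """Return the sorted domains the sender should confirm before sending."""
    groups = group_recipients_by_domain(recipients)
    return sorted(domain for domain in groups if domain not in internal)
```

A mail add-in built along these lines would show the sender the external groups and hold the message until each one is confirmed.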
contentCrawler works automatically behind the scenes, sifting through data and documents in the document repository looking for and assessing image-based documents for conversion to text-searchable PDFs. Once converted to PDF, these documents are indexed by the Proclaim PMS, which makes them appear in relevant user searches.
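As a rough illustration of the first step in such a crawl, the Python sketch below walks a folder tree and lists files that are likely to need OCR. It is an assumption-laden simplification, not contentCrawler’s method: it classifies files by extension only, the `find_ocr_candidates` name is hypothetical, and PDFs are merely listed as needing a text-layer check, which would require a PDF library that is not shown here.

```python
from pathlib import Path

# Image formats named in the case study as typical "dark data".
IMAGE_EXTENSIONS = {".tif", ".tiff", ".jpg", ".jpeg", ".bmp"}


def find_ocr_candidates(root):
    """Return (image_files, pdf_files) found under root, sorted by path."""
    images, pdfs = [], []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix in IMAGE_EXTENSIONS:
            images.append(path)  # always image-based, so always an OCR candidate
        elif suffix == ".pdf":
            pdfs.append(path)  # may be scanned; still needs a text-layer check
    return images, pdfs
```

Running a scan like this on a schedule, and passing the results to an OCR step, approximates the continuous audit the firm was looking for.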
James and the firm’s leadership were impressed with cleanDocs and contentCrawler, but they decided to run a pilot installation first to get a good feel for the configuration before buying. The cleanDocs and contentCrawler pilots were conducted simultaneously over several weeks. Both products performed well during the trial implementation and contentCrawler integrated seamlessly with Proclaim. The firm decided to purchase both cleanDocs and contentCrawler.
Seddons has been incredibly pleased with cleanDocs and contentCrawler and the products have provided the firm with a complete set of tools which will help on their journey to GDPR compliance. cleanDocs integrates tightly with Microsoft Outlook to prevent users from sending messages to unintended recipients, and the attachments are stripped of any potentially harmful metadata. In the future, Seddons plan to take advantage of some of the more advanced features of cleanDocs such as PDF conversion and file zipping.
Invisible to users but working 24/7 in the background, contentCrawler will uncover huge amounts of dark data stored in Seddons’ Proclaim system, ultimately OCR’ing the documents to make them text-searchable. Seddons were impressed with contentCrawler because it was easy to set up and configure during testing, could be scheduled to work in batches rather than all at once, and was flexible in dealing with many file formats including image files and PST files one level deep to include email attachments. Seddons’ next step will be to complete the task of indexing all the dark data.
“cleanDocs gives us an extra layer of protection so our staff now think twice before sending emails and attachments that could cause data breaches. contentCrawler will give us confidence that no data will slip through the net,” said James Temple, IT Director at Seddons.
Armed with the right software solutions, Seddons is now well on its way to preparing for the GDPR deadline in May 2018. James remarks that cleanDocs and contentCrawler were welcome additions because both products integrate seamlessly with existing applications and will make the firm more secure and airtight. DocsCorp software is poised to help Seddons reach an ideal level of data security, searchability, and efficiency.
• Comply with the General Data Protection Regulation (GDPR) for data protection and privacy.
• Change staff email behaviour to prevent mis-sends and other types of inadvertent data breaches as a key part of GDPR compliance.
• Replace a legacy email add-in.
• Perform a data mapping exercise to pinpoint where data was being stored and uncover ‘dark data’ – unsearchable files such as scanned PDFs and TIFFs.
• Deploy a product that could continuously monitor a content repository, find dark data, and OCR it to add a searchable text layer.
cleanDocs & contentCrawler as a Solution
• cleanDocs removes metadata from Microsoft Office documents and verifies email addresses before messages are sent, reducing the risk of a breach occurring.
• contentCrawler finds image-based documents, OCR’ing them to create searchable documents.
• contentCrawler bulk processes documents using either or both of its available modules: OCR and compression.
• contentCrawler is flexible and versatile, easy to set up and configure.
• The firm partnered with DocsCorp, a company that had established a GDPR compliance package including cleanDocs and contentCrawler.
• cleanDocs integrated with Microsoft Outlook and handled domain name checking better than the firm’s incumbent software.
• cleanDocs categorises emails based on security risk and removes metadata from attachments prior to sending emails. Both features provide an extra layer of security for the firm.
• contentCrawler works automatically behind the scenes, sifting through documents in the content repository looking for and assessing image-based documents for conversion to text-searchable PDFs.
• contentCrawler was easy to set up and configure during testing, and could be scheduled to work in batches rather than all at once.
• Once converted to PDF by contentCrawler, image-based documents are indexed by the Proclaim PMS, which makes them display in relevant searches by users.