Zuva and Litera partner with SALI to release open-source document classification taxonomy 

Zuva and Litera have built a document classification taxonomy covering 225 document types, which they are open-sourcing via the Standards Advancement for the Legal Industry (SALI) Alliance.

Zuva is today (3 October) releasing an API implementation of the document classifier, which will primarily help with classifying documents in firms’ document management systems (DMS). This means anyone can implement it and automatically classify documents within their DMS. Zuva is a spin-off of Kira Systems, which Litera acquired in 2021. It focuses on adding contract intelligence to workflows and apps.

Speaking to Legal IT Insider, Kira and Zuva founder Noah Waisberg said: “In the document management system you have a lot of documents but not much metadata. With more metadata you can do better searches.  

“Firms try to add metadata, such as which type a document is. They can either use consultants or train their people to add metadata, but the problem with the latter is that fee-earners often don’t do it. I spoke to one firm that provided a drop-down menu for document type and the most popular document was ‘admiralty’ – they don’t do admiralty law, it was just the first option.”

The classifier being released now will compete with the work iManage is doing to enable customers to undertake the same exercise. It is built on work done at Kira starting in 2016/17. Waisberg said: “We hired a lady at Kira to put together a taxonomy and get it to work. Building the underlying taxonomy is a lot of work. She built it and added documents in the taxonomy that fit the categories.

“The announcement today is unusual because this is something that we have co-developed with Kira and then also that we have decided to make all of the information public.” Zuva’s own information is all public and Waisberg said: “You can try it for free and don’t have to go via a salesperson or NDA, so our logic was ‘we might as well put all of this on our website.’”

SALI, which is driving towards standardisation across the industry, has not itself created a document management classification taxonomy. Waisberg said: “The benefit of doing this through SALI is that it will become interchangeable with other systems. Vendors can use our standard and it should become interoperable.”

SALI leader Damien Riehl said: “For Large Language Models (LLMs), an important method of increasing accuracy and reducing hallucinations is Retrieval Augmented Generation (RAG), and SALI’s 13,000+ tags can helpfully curate that document subset — for LLMs to summarize, analyze, and synthesize.” 
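As a rough illustration of the curation Riehl describes, the minimal Python sketch below narrows a document set by document-type tag before anything is passed to an LLM. It is not Zuva's API or SALI's tag format: the `Document` class, the `filter_by_type` and `rank_by_overlap` helpers, the tag labels and the sample texts are all hypothetical, and simple word overlap stands in for a real retrieval step.

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    doc_type: str  # hypothetical tag label, not an actual SALI identifier
    text: str


def filter_by_type(docs, wanted_types):
    """Keep only documents whose type tag is in the wanted set."""
    wanted = set(wanted_types)
    return [d for d in docs if d.doc_type in wanted]


def rank_by_overlap(docs, query, top_k=2):
    """Stand-in for retrieval: rank by shared words, drop zero-overlap documents."""
    query_terms = set(query.lower().split())
    scored = [(len(query_terms & set(d.text.lower().split())), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:top_k] if score > 0]


# Made-up corpus for illustration only.
corpus = [
    Document("a1", "Lease Agreement", "tenant shall pay rent monthly"),
    Document("a2", "Employment Agreement", "employee shall receive monthly salary"),
    Document("a3", "Lease Agreement", "landlord may terminate on notice"),
]

# Step 1: use the type tag to curate the subset.
candidates = filter_by_type(corpus, ["Lease Agreement"])

# Step 2: retrieve from the curated subset only; this becomes the LLM's context.
context = rank_by_overlap(candidates, "when must the tenant pay rent")
```

In this sketch the employment agreement never reaches the retrieval step, which is the point of tagging: the model is only shown documents of the relevant type.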

Standardising document classification ensures consistency in how contracts are categorised and processed, making contract review and analysis more reliable. Corinne Geller, director of legal knowledge engineering at Litera, said: “Organizations are faced with an ever-growing volume of contracts; moving towards uniform document type classification both enables scalability and facilitates collaboration, allowing legal teams to develop better insights on larger sets of documents with accuracy and confidence.”