We spoke to Joel Hron, chief technology officer of Thomson Reuters, about the strategy behind the acquisition
Thomson Reuters yesterday (21 August) announced the somewhat unusual acquisition of UK pre-revenue startup Safe Sign Technologies (SST), which is developing legal-specific large language models (LLMs) and, as of just eight months ago, was operating in stealth mode.
The 15-strong SST team will report directly to Thomson Reuters chief technology officer Joel Hron, and will be working closely with the Thomson Reuters Labs team.
SST was founded in 2022 by CEO Alexander Kardos-Nyheim, a Cambridge law graduate and former trainee solicitor at A&O Shearman. In late 2023, Harvard University research fellow and former DeepMind senior research scientist Dr. Jonathan Richard Schwarz joined SST as co-founder and chief scientist.
There isn’t an awful lot of public information available about the company. Speaking to Legal IT Insider about the acquisition, Hron explained that SST is focused in part on deep learning research as it pertains to training large language models, and specifically legal large language models. The company as yet has no customers and has focused exclusively on developing the technology and the models.
In terms of why SST and why now, Hron explained: “We have been investing and doing research in this space for at least a year and a half. We have tremendous content and expertise and we believe that SST and their expertise and approaches are really complementary and accelerate the work that we’re doing in that domain.”
SST is creating a proprietary legal LLM and training it on legal tasks and use cases. Hron said: “Our belief is that with our expertise and content we can really fuel the progress of training in both the legal domain and, in the future, across other domains in which Thomson Reuters operates.”
While SST’s legal LLM is being fed on a purer data diet than the average mainstream LLM, Hron says experience shows that you need many types of training data, commenting: “When a model gets trained on code it gets better at legal reasoning. There is good reason to say that training it on a variety of data is better for generic model capability and once you’re past that and want it to accurately undertake skills or tasks, that’s where intense focus on specific data is important.”
One of the key drivers for proprietary legal LLMs is that they are fed on clean, legally permissible data. The first legal large language model came from 273 Ventures: Kelvin Legal DataPack was unveiled at ILTACON23 by 273 heads Daniel Katz and Michael Bommarito.
Hron said he expects this acquisition to help accelerate Thomson Reuters’ ability to provide its customers with a professional-grade AI experience through CoCounsel AI Assistant.
While it’s unusual for Thomson Reuters to buy a company at such an early stage, Hron said: “Our focus is to be excellent at applying new technology and deeply involved in the research and development of the technology. That’s something that Thomson Reuters Labs has always been present in but we’re emphasising that and doubling down on it with this acquisition.”
caroline@legaltechnology.com