The generative AI benchmarking meeting hosted by CMS’ chief innovation and knowledge officer John Craske on 1 August generated significant engagement, with some key takeaways to shape the next, larger virtual meeting. The work is being done under the Legal IT Innovators Group (LITIG) umbrella, and the initiative aims to agree an industry-wide methodology for measuring the output of generative AI technologies.
The Legal Industry AI Benchmarking Collaboration is gathering views from suppliers, law firms and academia on how to ensure a sufficient level of transparency and standardisation among suppliers and tools.
The concern is that there is currently no way to compare and measure the output of generative AI tools, and that law firms need metrics that they can understand, trust, and track over time.
Craske said of the meeting on 1 August: “We had around 40 people at the CMS office and there was really strong engagement and a collective view that this is an important initiative where, as an industry, we need to do something. There was also an acknowledgement that there are some challenges we need to address as any AI benchmark or standard needs to work for all of the stakeholders in the industry and not act as an inhibitor to innovation or make it more difficult for small startups (or law firms) to get involved.”
Here are a few highlights from the feedback at the end of the workshop:
- As part of the workshop, the group looked at four levels of potential standard or benchmark. These were: (1) a transparency commitment; (2) agreed methodologies; (3) defined use cases; (4) kitemark / third-party verification. Different groups discussed each of these options at the workshop. Establishing a transparency commitment got the most votes in the end-of-workshop survey, with a potential build towards defining agreed methodologies and defined use cases as the next steps. Craske said: “I don’t think this reflects a lack of ambition though as in the room the feeling was our aspirations should, at least, be to set some agreed methodologies – but more as a place to start.”
- Everyone agreed that the group should work towards developing a consultation paper that is published and seeks views from across the market.
Craske said: “In terms of next steps, we are now digesting all of the notes, feedback and flip chart scribbles and we will pull those into something we can all use. As promised, we are also planning to host a virtual session for those people who couldn’t make it to London – as soon as we have a date pencilled in, we’ll let you all know. At the virtual session I would like to playback the output from the in-person workshop, sense-check what we found and build on the work we did in terms of a potential consultation paper.”
Following the virtual session, Craske will be forming a working group (perhaps ‘core’ and ‘halo’ groups) to work on drafting a consultation paper.
LITIG is the UK’s leading legal IT industry body and its members represent around 90,000 users and £20bn of turnover.
Asked how LITIG will respond to any potential suggestions from suppliers that they can’t be expected to give away competitive secrets, Craske said: “We’re not asking how you got to where you got. It’s about testing the products available to customers and testing the output and how that compares with what is promised.”
Craske has published a post on LinkedIn here: https://www.linkedin.com/posts/johncraske_aibenchmarking-artificialintelligence-ai-activity-7226199034946818048-Ymg4