LITIG’s benchmarking initiative, led by CMS chief innovation and knowledge officer John Craske, is gaining momentum with the hire of Jo Owen, formerly CIO of Cripps, as project manager. David Wood, head of legal data and engineering at Simmons Wavelength, has joined the core project group driving the initiative, which aims to create an open-source benchmark for legal AI systems that builds transparency, trust, and responsible use across the board. Wood is on the board of LITIG, the UK’s leading legal IT industry body, which represents around 90,000 users and £20bn of turnover. Also helping to drive the initiative is Gideon Green, an associate at CMS.
LITIG, with support from Artificial Lawyer and Legal IT Insider, kicked off the AI Benchmarking Initiative this summer with an in-person workshop in August.
Craske said in an email update to AL and LITI: “To help give the initiative some momentum, we have recently appointed Jo Owen (who was CIO at Cripps until recently) to help us with the organisation, project management and getting stuff done. The project team has also been joined by David Wood (David is also on the Litig Board, as well as having a leadership role at Simmons Wavelength) to help drive things forward, and Gideon Green who is an associate at CMS (some of you might recognise Gideon from the first workshop).”
Craske has published the key takeaways and findings from the August workshop, which include:
- Transparency Commitment: AI vendors must be transparent about their testing methodologies, data, and scenarios to build trust and allow users to verify claims.
- Agreed Methodology: Establishing a common testing methodology to make metrics easier to understand and compare, though flexibility is needed due to the complexity of legal tasks.
- Defined Use Cases: Creating agreed use cases with example questions and criteria for valid answers to help users understand where AI tools are most effective.
- Kitemark Certification: A trusted seal of approval from an independent organization could reassure users about AI tool quality but may be expensive and time-consuming.
- Common Themes: consistent definitions of key terms; balancing speed, cost, accuracy, and trust; the importance of high-quality data; flexibility in benchmarks; and shared responsibility among all stakeholders.
You can read the report in full HERE. The next virtual workshop session will be at 4pm (UK time) on 24 October 2024.
Craske said: “The workshop is open to all (subject to a sensible limit that still allows for a workshop style discussion) but is really aimed at those of you who couldn’t attend the in-person workshop. We are likely to only take one participant from each organisation, so if there are a few of you on the mailing list, please agree who is going to come along. Once we know how many of you are interested, we will either invite everyone or just a carefully selected group of the people interested! Please don’t email me … register using the link below.”
The objective of the second workshop will be to test the outputs from the August session and to build a set of key considerations for the next stage: a working group to help create the benchmark. Craske said: “Of course, even if you don’t come along, we will also share the outputs from the virtual session with the whole group. Based on engagement and feedback so far, we have seen considerable interest in this initiative, it is quite likely that we also set up special interest sub-groups who help direct and inform the working group.”
The registration form will be open until 5pm UK time on Friday 18 October.