As several cameramen recorded events at Kennedys’ City office on Friday evening (27 October), the latest Man v Machine challenge reached its conclusion. But there was no real surprise when the results were announced: the machine, aka CaseCruncher Alpha, won hands down, securing victory by a clear margin of 24.3 percentage points. Its accuracy rate of 86.6% compared with a more modest 62.3% achieved by the men and women challengers – every one a lawyer.
CaseCrunch, which builds systems that predict legal decisions using AI technology, devised the challenge. The tech start-up is run by four Cambridge law students. Three are still undergraduates: Nadia Abdul, Ludwig Bull, and Jozef Maruscak – respectively, Romanian, German and Slovakian. The team’s final member, Rebecca Agliolo, who graduated this summer, hails from Glasgow.
This legal quartet began their AI adventure in November 2016 with LawBot, a chatbot designed to offer students free legal advice on 26 serious criminal offences. After a successful four-month beta trial, LawBot rebranded as Elexis, part of Elixirr Technologies, providing free legal counselling via Facebook Messenger. Most recently, it morphed into CaseCrunch Systems, whose Twitter profile reads: ‘We’re Solving Law. On a Mission to Find Objective Truth in Law.’
The challenge idea was originally conceived ‘over a beer, in a bar, in Berlin,’ according to Agliolo. More than 110 lawyers eventually agreed to take on the machine. Participants comprised barristers, in-house counsel, and lawyers from more than 20 law firms, including Allen & Overy, Berwin Leighton Paisner, Bird & Bird, DLA Piper, DWF, Norton Rose Fulbright and Pinsent Masons.
Lawyers faced the same challenge task as the machine, albeit on a lesser scale: to evaluate the outcome of genuine complaints that had been made about Payment Protection Insurance (PPI) mis-selling. Following an FOI request, the Financial Ombudsman Service (FOS) provided CaseCrunch with data on decisions that had previously been made relating to 23,291 PPI complaints.
Revealing characteristics were then stripped away in every case before an identical question was put to both the participating lawyers and the machine’s predictive algorithms: is the claimant likely to have their claim upheld? Two judges oversaw the proceedings – technical judge Ian Dodd, UK director of Premonition Analytics, and legal judge Dr Felix Steffek from the law faculty at Cambridge.
CaseCruncher Alpha correctly predicted the outcome in 20,174 cases (86.6%). Meanwhile, participating lawyers were allowed to research any relevant information that they wanted about PPI and its impact in an unsupervised environment. The lawyers’ combined efforts resulted in 775 predictions, of which 483 (62.3%) were correct.
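For readers who want to check the arithmetic, the short sketch below recomputes both accuracy rates and the winning margin from the counts reported above. The variable and function names are illustrative only, and rounding to one decimal place is an assumption made to match the figures as published.

```python
# Counts as reported in the article.
machine_correct, machine_total = 20_174, 23_291
lawyer_correct, lawyer_total = 483, 775


def accuracy_pct(correct: int, total: int) -> float:
    """Return accuracy as a percentage, rounded to one decimal place."""
    return round(100 * correct / total, 1)


machine_acc = accuracy_pct(machine_correct, machine_total)
lawyer_acc = accuracy_pct(lawyer_correct, lawyer_total)

# The margin is a difference of two percentages, i.e. percentage points.
margin_points = round(machine_acc - lawyer_acc, 1)

print(f"Machine: {machine_acc}%  Lawyers: {lawyer_acc}%  Margin: {margin_points} points")
```

The results agree with the published 86.6%, 62.3% and 24.3-point margin. Note that the two rates rest on very different sample sizes: 23,291 cases for the machine against 775 predictions for the lawyers.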
‘Evaluating these results is tricky,’ said CaseCrunch in a website statement afterwards, adding cautiously that ‘machines are able to compete with and sometimes outperform human lawyers. The main reason for the large winning margin seems to be that the network had a better grasp of the importance of non-legal factors than lawyers. This experiment also suggests that there may be factors other than legal factors contributing to the outcome of cases.’
The statement concluded with a bold claim: ‘The use case for systems like CaseCruncher Alpha is clear. Legal decision prediction systems like ours can solve legal bottlenecks within organisations permanently and reliably.’
Among practitioners who attended the event, opinion was generally positive, although they were less certain of the outcome. Timothy Leeson, a partner at Lewis Silkin who advises automotive clients using AI, said: ‘CaseCrunch met my expectations. But then I think I was pre-programmed to expect this result – as night follows day, it’s a product that’s coming on the market. I’m not completely clear what it will look like yet: an end-to-end product that means 5% of the workforce of private practice law firms is no longer required, or a product which allows us to deliver a much better service to clients.’ However, he suggested that systems like CaseCrunch ‘are better marketed to the clients of law firms with a greater propensity to invest.’
Ralph Cox, a patent litigation partner at Clyde & Co, was similarly circumspect: ‘I keep an open mind. However, it struck me that the computer was given all the database information and therefore had an unfair advantage. Lawyers who took part were from outside this area of expertise without any real experience of PPI; they had to do it in a limited time. But the really interesting part is not the human bit, but the percentage that the computer got right – 86% is not bad at all.’
The unfair advantage point was ‘one of our concerns when we were setting it up,’ conceded Abdul. ‘Initially, we were going to cover many areas, but we narrowed it down to one which gave everyone the chance to learn about that area as much as the system. Whereas the system does not have the experience of a lawyer who has worked in the field of PPI.’
Bull added: ‘We did point lawyers towards the Financial Conduct Authority so that they could learn something about PPI – they could have gone there and seen sample cases, for example. We struggled to find an area because lawyers specialise in so many niche areas. So we had to find something that was relatively easily intelligible to most lawyers and where they could also understand the underlying principles relatively quickly. It was as fair as we could make it. As to the advantages that the machine had – it is not wrong, these are inherent advantages that the machine will always have.’
So where does CaseCrunch go from here? As managing director and the majority shareholder, Bull made it clear that he is not seeking any further investment. ‘The ultimate goal is to be like (Ronald) Dworkin’s Judge Hercules who can make perfect predictions about every single case,’ he said. ‘In terms of the immediate goal, we think we have shown the use case; now we want to build systems for a number of our clients. We also want to extrapolate to build end-to-end systems, not just niches – that’s the path we have to take in order to build a system that has a more general application.’