Cutting through the noise: The impact of GPT/large language models (and what it means for legal tech vendors)

A little knowledge can be a dangerous thing, and as the legal industry has become savvier about artificial intelligence, there is a temptation for some to write off OpenAI’s generative pre-trained transformer (GPT) technology as just another big fuss that will be talked about incessantly before things calm down and return to normal (hashtag #bringbackboring). On the other end of the spectrum, there are those forecasting the end of lawyers, again.

In the interest of cutting through some of the noisier noise, I sought a few answers from AI expert and law professor Daniel Katz, who teaches a variety of courses at Illinois Tech’s Chicago-Kent College of Law, and is director of The Law Lab at Illinois Tech. He co-founded legal analytics company LexPredict, which was acquired by Elevate in 2018. The conversation happened at the end of January, and I’ll come back to reflect on some of the developments that have happened in just a few short, crazy weeks since then, and how they fit in with what Katz predicted. Spoiler alert: Katz says that if you are a legal tech company that is based on a Word plug-in, he would be very concerned, but more on that in a minute.

Is GPT the real deal?

You may or may not be aware that in December, Katz and fellow legal scholar Michael Bommarito put GPT-3.5 through the US Bar Exam, which candidates ordinarily sit only after at least seven years of post-secondary education, including three years at an accredited law school and months of exam-specific preparation. Approximately one in five test takers still score below the pass mark on their first try. GPT-3.5 achieved an overall score of 50.3% (significantly above the 25% baseline guessing rate) and a passing score for Evidence and Torts. Katz and Bommarito’s conclusion is that in the near future, GPT will pass the Bar Exam.

Given that we have spent time over the past few years telling people not to overestimate the capability of AI, is this the real deal?

“Yeah, I think it’s the real thing because if you look at why legal technologies have not had the adoption rate historically, language has always been the problem,” Katz said. “Language has been hard for machines historically to work with, and language is core to law. Every road leads to a document, essentially.”

People have built AI systems to try to process language, but Katz says “they are medium-good at best, is probably the best you can say, including stuff I’ve built. It solves some tasks, it helps some people, but we always said ‘Gee, this isn’t quite the level of understanding we need.’”

In the background, the quality of machines’ ability to process language has been improving, however. One graphic that didn’t initially make it into the Bar Exam paper, but was subsequently added, looks back on language models’ progress over the past few years in getting questions right. “Three years ago, these language models on, say, the Bar Exam task, couldn’t get any of the questions right – it did worse than chance because it improperly processes the questions so does worse than guessing. Then it guesses and gets 35%. And now it’s 50%. It is clearly climbing in the quality of being able to process and understand language.

“The fundamental problem that we’ve had this whole time is that you’re always trying some clever hack to work with language but it’s not really working as a frontal assault. With these large language models, which we’re interested in both commercially and as an academic, now we’re seeing something where you’re really trying to work on the true language problem.

“I lived this dualism where I trained students and the neural network between their ears all day about how to use language in this very careful way. And then you see these machines that historically can’t even come close to doing that. And now they can. And so that’s why I think it’s a big deal.”

Scepticism and the Rorschach ink blot test

Katz has been working with language models for years but the explosion of interest in GPT was sparked by the launch of OpenAI’s readily accessible chatbot ChatGPT. Since then, that’s all anyone really wants to talk about.

In my early conversations with legal heads of innovation, people are clearly nervous about trusting GPT technology for legal matters. It doesn’t always work. ChatGPT gets things wrong. Worse still, it hallucinates, meaning it invents things – such as caselaw – that haven’t happened.

Without wanting to sound too strong, Katz says: “I don’t think they’ve spent enough time really thinking about it all,” adding, “If you say it’s got to do something exactly the right way or it is of no utility, have you really thought carefully of what is being made possible by the advent of this technology?

“Yes, there are limitations. Yes, keep a human in the loop. But I think this has been a bit of a Rorschach inkblot test for people. Some people see there is an innovation. It’s making certain things more possible. And then others say, ‘It can’t do this, and it can’t do that.’ It reflects people’s personality a little bit.”

What does GPT mean for legal tech development and vendors?

Given all of the latest advances, plus the fact that OpenAI will in the near future be releasing the even more mind-blowing GPT-4, what does this mean in practice for the legal sector?

Katz says: “There are two types of things here. They would call general models GPT one through four, and then there’s domain models, so a legal specific large language model.

“I would bet the domain models will beat the general models. But then the question is, how do you combine them?

“GPT is on a diet of all this information from the big corpus of the Internet and other sources. And some of that includes legal sources, but then you go and try to do these legal problems and you say, ‘One of the ways we think we could do better with the Bar is if I could retrain the last few steps of the neural net on actual bar.’ Just like a human would, you get better if I train you right before you go do the task on how to do the task.

“What we’re going to see are large-ish, albeit not the largest model that’s heavily domain tailored is going to beat a general model in the same way that a really smart person can’t beat a super specialist. That’s where the value creation and the next generation of legal technology is going to live.”

Katz does not believe for a second that GPT will mark the end of the need for dedicated legal technology. He says: “I don’t believe that, but you hear people say this stuff. It’s not reasonable and it doesn’t even make sense from a common-sense perspective.”

However, legal technology vendors are going to have to figure out what this vast improvement in large language models means for them going forward.

Katz says: “I would expect some of the incumbent legal technology companies are going to buy some of these, let’s call them generative AI companies and figure out how to combine that with their tech, because this might just be one of the capabilities in a broader suite of things. They can either figure out how to develop it themselves, or they could just buy the thing and integrate it into what they’re doing.

“I think a lot of them will opt for that because they will say, ‘Well, why would we just try to build it and then maybe we don’t know what we’re doing, we do it wrong and we waste a bunch of time and money when we can just acquire the thing that’s going to solve the problem for us.’

“I think you’ll see that, by the way, across the entire economy, forget about law per se, but in healthcare and all these areas, you’re going to see that type of thing happen.”

For document management system vendors, this could be something of an existential moment, and Katz says: “They had better come up with a plan. It seems to me that there’s this massive capability out there – you’re sitting on this huge pile of documents and people would like to be able to organise them and search the documents better and have a tool that allows you to connect the dots.”

He adds: “If you’re a legal tech company that is based on a Word plug-in, I would be very concerned about what this is all going to mean for your company going forward. Now, one version of this is that Microsoft could create a kind of apps-based environment, where they allow these other components to sit within the broader Azure environment. Or does Microsoft try to basically close ranks and say ‘No, you’re going to use these toolkits’ and do their own build outs of these things. That’s a strategic decision they’re going to have to make, but the risk is that they close down the shop on this, a bit like the day that Apple got rid of their authorised resellers.”

Infosec questions

Some of the big questions around how law firms use GPT or other large language models include where it will live and how much information they will share. Katz said: “There are a bunch of questions about where OpenAI is going to live. Is the system going to be within your firewalled environment? Am I going to dictate into GPT and have it called outside and come back? I think firms would be hesitant to do something like that. That’s why I’m interested in the next 12 months – is there going to be a firewalled version of this that you can instal or access that is basically like on prem. Then you can do these tasks that everybody dreams up, that you would want to be able to do without having to sign up for everything going out to some third party.

“Infosec and privilege and confidentiality type questions still loom over all of this but people are really interested in how GPT can help with all the administrative tasks that are necessary to get the job done, but if we could move that along we would have more time for valuable tasks.”

The future comes to pass, immediately

While much of the conversation around large language models has been dominated by GPT-3 and its developer OpenAI – in which Microsoft invested a further $10bn in January – Katz pointed out that this is not going to be a one-horse race. “DeepMind are not a bunch of slouches,” he said. “They are not going to allow Microsoft to take the entire orange. This is going to be a lot of fun, it’s going to be a very exciting year.”

Within a week of our conversation, Google launched new AI language model Bard, which is a rival to ChatGPT.

Also following our conversation, Allen & Overy came out with its bombshell announcement that it has partnered with ‘co-pilot’ Harvey, a platform that leverages OpenAI’s technology (and investment) to help lawyers edit documents and perform legal research. This was followed by an announcement on 1 March that US legal research provider Casetext has launched a legal assistant powered by OpenAI’s technology. There are plenty of vendors announcing ChatGPT integrations.

With regard to Microsoft, in our conversation Katz said: “Strategically it’s quite a genius thing they have done in positioning themselves so there is this independent entity (OpenAI) but they have the ability to invoke it into their own environment.” He pointed out that Microsoft will be looking at ways it can integrate GPT within Word, Outlook and PowerPoint.

Microsoft on 6 March launched a Microsoft Dynamics-based Copilot across its business apps portfolio, touching on the Power Platform and Dynamics 365. It is, you guessed it, powered by OpenAI tech and built using the Azure OpenAI Service.

“Over the last four years, we’ve been on a journey to bring generative AI and foundation models to the workplace,” Charles Lamanna, CVP of business apps and platform at Microsoft, told TechCrunch via email, adding, “And we’ve now reached the point where the tech and product can enable transformative outcomes for customers.”

It is, without a doubt, going to be an exciting year.

caroline@legaltechnology.com