Guest post: Unlock the potential of generative models with semantic search

By Yannic Kilcher and Aashna Majmudar from DeepJudge

Recent advancements in large language models (LLMs) have already transformed how the legal industry perceives its workflows. The next big question on everyone’s mind is whether their ability to generate legal content such as contracts, arguments and research will turn out to be the cherry on top that truly revolutionizes the field. As with any emerging technology, there are concerns about the privacy, accuracy and reliability of these tools. This article breaks down how large language models and generative AI work, how they can be safely implemented in your law firm, and the promise they hold for the future of legal practice.

The Potential of Generative Models

Generative models, a subset of machine learning, have the capacity to create new, original data based on patterns they learn from existing data. For law firms and legal departments, these models can streamline a multitude of tasks, such as document automation, analysis and research, and even provide basic legal advice through virtual legal assistants. The result? More time for you and your colleagues to focus on high-value tasks that require human expertise, like strategic decisions and negotiations. The legal sphere is truly buzzing with excitement at the impact such technology promises.

How do LLMs work, really?

Let’s take a step back to understand the mechanism behind LLMs. At their core, LLMs are autoregressive, meaning they generate text one piece at a time, with each prediction based on everything that came before it, including their own previous outputs. This self-dependency enables them to generate cohesive and contextually relevant responses. Training these models involves an approach known as next-token prediction. This task is a bit like a sophisticated game of fill-in-the-blanks, where the model is presented with a sequence of tokens (words or fragments of words) and must predict the one that follows. To do this, LLMs are trained on massive amounts of text and learn to match and emulate the statistical patterns within that dataset.
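To make the idea of autoregressive next-token prediction concrete, here is a minimal sketch. It is not how an LLM is built: it assumes a toy "model" that only counts which word follows which in a made-up sentence, and the names `train_bigram` and `generate` are illustrative. What it shares with an LLM is the loop, in which each predicted token is appended to the sequence and becomes the context for the next prediction.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token is followed by each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts, start, max_new_tokens):
    """Greedy autoregressive generation.

    Every predicted token is appended to the sequence and then used
    as the context for predicting the next one.
    """
    sequence = [start]
    for _ in range(max_new_tokens):
        followers = counts.get(sequence[-1])
        if not followers:
            break  # this token was never followed by anything in training
        next_token = followers.most_common(1)[0][0]  # most frequent follower
        sequence.append(next_token)
    return sequence


# Made-up training text, purely for illustration.
corpus = (
    "the tenant shall pay the rent and "
    "the tenant shall pay the deposit"
).split()

model = train_bigram(corpus)
generated = generate(model, "the", 4)
print(" ".join(generated))
```

A real LLM replaces the counting table with a neural network that looks at the whole preceding sequence rather than only the last word, and it usually samples from the predicted probabilities instead of always taking the most frequent option.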

The effectiveness of LLMs is greatly attributed to the underlying architecture that powers them: the transformer. Transformer models are designed around the concept of ‘attention’, which allows them to weigh the importance of different parts of the input sequence when generating an output. This means they can focus on the most relevant parts of a sentence or paragraph when making their predictions, a capability that is crucial for understanding context and producing meaningful responses.
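The weighing described above can be sketched with scaled dot-product attention for a single query. This is a simplified illustration under stated assumptions: the vectors are invented by hand, whereas a real transformer learns them, applies learned projections, and runs many attention heads in parallel. The function names `softmax` and `attend` are chosen here for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    largest = max(scores)  # subtracted for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Each key is scored by its similarity to the query; the output is
    the average of the values, weighted by those scores.
    """
    dim = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
        for key in keys
    ]
    weights = softmax(scores)
    output = [
        sum(w * value[i] for w, value in zip(weights, values))
        for i in range(len(values[0]))
    ]
    return weights, output


# Hand-made toy vectors: three input positions, two dimensions each.
keys = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [1.0, 0.0]

weights, output = attend(query, keys, values)
print(weights)
print(output)
```

The first position, whose key points in the same direction as the query, receives the largest weight, and the second, which is unrelated to the query, receives the smallest. That is the sense in which the model "focuses" on the most relevant parts of its input.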

These models are trained using a process known as backpropagation. This is an algorithm that works out how much each of the model’s internal parameters, or weights, contributed to the error of its predictions, so that each weight can be adjusted accordingly. In essence, backpropagation enables the model to learn from its mistakes.

This, in combination with a method called gradient descent, optimizes the model’s performance over time. Gradient descent is an iterative optimization algorithm that’s used to find a minimum of a function by repeatedly taking small steps in the direction that reduces it. In this context, that function is the model’s loss, a measure of how far its predictions are from the training data, and the goal of gradient descent is to minimize that loss.
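The training loop described in the last two paragraphs can be sketched in its simplest possible form. The example assumes a model with a single weight `w` that predicts `w * x`, and made-up data in which the correct answer is always twice the input. The gradient is worked out with the chain rule, which is what backpropagation does at scale across billions of weights; the function name `train` and the settings used are illustrative.

```python
def train(xs, ys, learning_rate=0.05, steps=200):
    """Fit the one-weight model y = w * x by gradient descent."""
    w = 0.0  # start from an uninformed guess
    for _ in range(steps):
        gradient = 0.0
        for x, y in zip(xs, ys):
            prediction = w * x
            error = prediction - y
            # Chain rule: derivative of error**2 with respect to w.
            gradient += 2 * error * x
        gradient /= len(xs)  # average over the training examples
        # Step against the gradient to reduce the loss.
        w -= learning_rate * gradient
    return w


# Made-up training data: the target is always twice the input.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

learned_w = train(xs, ys)
print(learned_w)
```

Each pass nudges the weight a little closer to 2, the value at which the prediction error disappears. An LLM follows the same recipe, with the squared error replaced by a loss that measures how well the model predicts the next token.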

Generative AI, of which LLMs are a part, refers to a branch of artificial intelligence (AI) that focuses on creating new content. These AI systems are not limited to understanding and interpreting data; they can also generate data that was not part of their original training set. This could include anything from writing a poem or composing a piece of music to creating an image or predicting the next word in a sentence. These systems learn the underlying patterns in the data they are trained on and can then generate new, similar data based on these patterns.

The Trouble with Generative Models – Accuracy, Auditability and Privacy

Generative models, while promising, are not without their flaws. Recent evidence of this lies in the story of a lawyer who submitted a 10-page brief that cited more than half a dozen court decisions, every single one of which had been convincingly concocted by ChatGPT. Technically speaking, we call that “hallucinating”, meaning a generative model can generate false facts or information – a huge concern in the legal sector. This happens because LLMs are statistical systems. They don’t generate outputs based on inherent understanding, but rather on the statistical patterns they’ve learned from training data. When these patterns indicate a need for a specific element, like a reference or a fact, LLMs can easily invent one to suit the context.

As it stands, it is also incredibly difficult to know where generated information comes from. How can we verify that all generated output is accurate, reliable and reproducible? Unfortunately, that remains a largely unanswered question, due again to the fact that information is produced statistically, and isn’t necessarily based on concrete reference materials or context. This is referred to as a lack of auditability.

The Solution: Generative Models Built on Semantic Search Engines

How can we tackle these concerns? By building private generative models that source input from enterprise semantic search engines. These search engines use NLP technology to understand the context and meaning of queries, and then retrieve relevant documents from a firm’s internal document collection without requiring exact keywords.

In terms of accuracy, semantic search engines provide a firm basis for generative models. The combination grounds generated output in the firm’s own factual documents, significantly reducing the risk of hallucinations. It can also alleviate the issue of auditability. A semantic search engine’s fundamental purpose is to understand the intent and contextual meaning of search queries in order to provide more accurate and relevant results. Because the model’s answer is then built from the retrieved documents rather than from statistical patterns alone, the origins of the generated information can be traced. Such a generative model can even refer back to the search engine to access, verify, and cite the source data it used to inform its output.
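The retrieve-then-generate pattern described above can be sketched as follows. Several things here are assumptions made for illustration: the three documents and their ids are invented, the function names are hypothetical, and `embed` is only a word-count stand-in for a learned embedding model. A real semantic search engine would match on meaning even when the query and the document share no words, which this stand-in cannot do. No language model is called; the sketch stops at the prompt that would be sent to one.

```python
import math
import re
from collections import Counter


def embed(text):
    """Stand-in for a learned embedding model: a bag-of-words vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, top_k=2):
    """Return the ids of the documents most similar to the query."""
    query_vector = embed(query)
    scored = [
        (cosine(query_vector, embed(text)), doc_id)
        for doc_id, text in documents.items()
    ]
    scored.sort(reverse=True)
    return [doc_id for score, doc_id in scored[:top_k] if score > 0]


def build_prompt(query, documents, doc_ids):
    """Assemble a prompt that restricts the model to cited sources."""
    sources = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id in doc_ids)
    return (
        "Answer using only the sources below and cite their ids.\n"
        f"{sources}\n"
        f"Question: {query}"
    )


# Hypothetical internal documents, invented for this example.
documents = {
    "lease-2021": "The tenant shall pay rent on the first day of each month.",
    "nda-2022": "The receiving party shall keep confidential information secret for five years.",
    "memo-17": "Notice of termination of the lease must be given in writing.",
}

question = "When must the tenant pay rent?"
hits = retrieve(question, documents)
prompt = build_prompt(question, documents, hits)
print(prompt)
```

Because every passage in the prompt carries its document id, the model can be asked to cite those ids in its answer, and a reviewer can follow each citation back to the underlying document.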


While there’s a keen interest in generative models in the legal sector, the issues of accuracy and auditability are pressing concerns. However, pairing generative models with semantic search engines can provide an accurate and accountable solution for law firms. What’s more, there are several different ways to design a semantic search engine, the most promising and thorough approach being the combination of semantic capabilities with traditional keyword searches. This powerful technology promises a future where legal professionals can leverage AI to increase efficiency, minimize routine work, and better serve their clients.

Yannic Kilcher, who is co-founder and CTO at DeepJudge, has over two hundred thousand subscribers to his YouTube channel on machine learning research and also played a leading role in the development of OpenAssistant, an open source alternative to ChatGPT. Aashna Majmudar is a business development associate at DeepJudge.
