Guest post: Get your data right

In this post on data & legal engineering, Wavelength Law’s chief scientific officer, Dr Ben Gardner, discusses how utilising new data structures provides the foundations for legal engineering and transformational innovation.
In a previous post we looked at how Google is combining a range of new tools that exploit and manage digital information. By combining these tools in innovative ways, Google has been able to realise synergies between them, allowing it to build a model of the world wide web which includes contextual ‘understanding’, not just of the pages on the web but also the information on those pages and the links between them.
We believe this transformation in the way we can structure data is a key factor in driving the information revolution we are living through. Data may not get the buzzword attention that AI currently receives, but it’s a fact that data is the fundamental fuel that drives AI and the innovation we are interested in as legal engineers. If we are to realise the value from the data held within organisations (in particular, law firms and legal departments of large corporates) then it is imperative to learn how to unlock that potential.
In this post we will look at how we can apply the new data structures used by the likes of Google to the challenge of combining internal data into a contextual network, an Enterprise Knowledge Map, which we use to underpin innovation.
In the guts of an organisation – a fragmented information landscape
The diagram below sets out the systems and information repositories that we would typically expect to find within a law firm or legal department. These are a mix of the tools used to run the organisation, a document management system and one or more knowledge management databases. If we were to map where information about matters, clients, colleagues, etc. is held, we would find it fragmented across multiple repositories. For example, in the diagram, bits of information about a client are held across the matter opening portal, engagement database, Client Relationship Management (CRM) system, Practice Management System (PMS), Document Management System (DMS), finance system and a knowledge management database containing precedents etc.

This fragmentation of information is common to many law firms and legal departments. It is generally recognised that anyone wanting to build a complete profile of a client would have to log in to multiple systems and manually extract, manipulate and combine the data. The effort is so high that in many cases it is not attempted, which means organisations do not realise the value trapped within their systems.
To illustrate this point, consider an example where you need to identify a New York Bar qualified lawyer who speaks Mandarin and has experience working on M&A deals in the transport sector. This information is held within the organisation’s systems. The HR system holds data on qualifications and language skills, the time recording system holds the time recorded against each matter, and the practice management system holds information about each matter, including transaction type, client and sector. You could extract from the HR system the set of people who are New York Bar qualified and speak Mandarin, but you would then need to extract all the time records for those people in order to identify all the matters they have worked on. Next, you would need to obtain all the metadata about that large list of matters from the practice management system. Finally, you would have to hope you can join all this information from the different systems back together and then identify the few lawyers who meet your original profile. The one-off cost of performing this type of analysis probably outweighs the value of the individual task. If the effort were reduced to a trivial one, i.e. composing a simple query, then significant value would be realised by enabling expertise location based on the reuse of information that is already being captured by the organisation’s various systems.
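The expertise search described above can be sketched in a few lines once the three extracts sit side by side. This is a minimal illustration only: the record layouts, field names, IDs and people below are all hypothetical, and a real HR, time recording or practice management system would expose its data differently.

```python
# Hypothetical extracts from three separate systems. All field names,
# IDs and values are invented for illustration.
hr_records = [
    {"employee_id": "E1", "name": "A. Chen",
     "qualifications": {"New York Bar"}, "languages": {"English", "Mandarin"}},
    {"employee_id": "E2", "name": "B. Patel",
     "qualifications": {"New York Bar"}, "languages": {"English"}},
    {"employee_id": "E3", "name": "C. Wong",
     "qualifications": {"England and Wales"}, "languages": {"Mandarin"}},
]
time_records = [
    {"employee_id": "E1", "matter_id": "M10", "hours": 42.0},
    {"employee_id": "E1", "matter_id": "M11", "hours": 5.5},
    {"employee_id": "E2", "matter_id": "M10", "hours": 12.0},
    {"employee_id": "E3", "matter_id": "M12", "hours": 30.0},
]
matters = [
    {"matter_id": "M10", "transaction_type": "M&A", "sector": "Transport"},
    {"matter_id": "M11", "transaction_type": "Financing", "sector": "Energy"},
    {"matter_id": "M12", "transaction_type": "M&A", "sector": "Transport"},
]


def find_experts(qualification, language, transaction_type, sector):
    """Return names of people matching the profile, joined on shared IDs."""
    # Practice management data: which matters fit the deal profile?
    matching_matters = {
        m["matter_id"] for m in matters
        if m["transaction_type"] == transaction_type and m["sector"] == sector
    }
    # Time recording data: who recorded time on those matters?
    worked_on_them = {
        t["employee_id"] for t in time_records
        if t["matter_id"] in matching_matters
    }
    # HR data: which of those people hold the qualification and language?
    return sorted(
        p["name"] for p in hr_records
        if p["employee_id"] in worked_on_them
        and qualification in p["qualifications"]
        and language in p["languages"]
    )


print(find_experts("New York Bar", "Mandarin", "M&A", "Transport"))
```

The point of the sketch is that the join itself is simple when the employee and matter IDs line up across systems; the cost in practice lies in getting the data out of each silo in the first place.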
Introducing the Enterprise Knowledge Map
In order to unlock the potential of the various data held within an organisation, it is necessary to aggregate all the fragments of information based on the things of interest, rather than aggregation by the processes involved. Essentially, organisations need to move from the fragmented model illustrated in the figure above to an environment similar to the one presented in the diagram below.

In this diagram, the fragments of information scattered across the silos have been joined together to create connections between systems. The image presents this connected data as a metaphorical tube map, where the lines are the things of interest and the stations are the systems. The analogy is that by getting on the ‘Matter line’ you can pass through all the ‘system stations’ on that line, collecting the data with minimal effort. Additionally, because this is a network, it is easy to change focus by ‘changing’ lines at any station, for example moving from a Matter-centric view to a Client-centric one. Data re-structured in this way forms an Enterprise Knowledge Map.
The problem here is, in the abstract, similar to that faced by Google on the web, where information about things (people, places, movies, etc.) is captured in many places. Google needed to bring that information together from those disparate sources and build profiles of people, places, movies, etc. Google had the advantage of dealing with information that was already in a network, but faced the challenge of the seemingly infinite diversity and number of things that make up the internet. In most organisations we do not have networked data, because everything exists in isolated silos, but the universe of things that matter to a law firm or legal department is much smaller: the number of reference points, such as colleagues, matters and clients, may seem large but is small and finite compared to the challenge faced by Google.
Furthermore, law firms and legal departments have significantly more consistency in their data than is found on the web. For example, organisations have unique IDs (e.g. matter number, employee number) that are shared across systems. Even where there are no unique IDs but instead a controlled vocabulary, for industry sectors say, the number of instances (e.g. the types of sector) tends to be small enough that manual mapping between systems can be used. This means that whereas Google needed to add context to an established network, law firms and legal departments can create context by building a network. Such organisations can join data together using unique IDs and controlled vocabularies that are shared across systems to create an Enterprise Knowledge Map.
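As a rough sketch of how shared IDs and a manually mapped vocabulary could be stitched into a small network, consider the following. Everything here is an assumption made for illustration: the system extracts, the field names, the sector labels and the `SECTOR_MAP` lookup are hypothetical, and a production Enterprise Knowledge Map would more likely live in a graph database than in an in-memory dictionary.

```python
from collections import defaultdict

# Hypothetical manual mapping from each system's sector labels to one
# agreed vocabulary. Feasible by hand because the list of sectors is short.
SECTOR_MAP = {
    "Transportation": "Transport",         # label used in the PMS extract
    "Transport & Logistics": "Transport",  # label used in the CRM extract
    "Energy": "Energy",
}

# Hypothetical extracts; matter, client and employee IDs are shared across systems.
pms_matters = [
    {"matter_id": "M10", "client_id": "C1", "sector": "Transportation"},
    {"matter_id": "M11", "client_id": "C1", "sector": "Energy"},
]
crm_clients = [
    {"client_id": "C1", "name": "Acme Rail", "industry": "Transport & Logistics"},
]
time_records = [
    {"employee_id": "E1", "matter_id": "M10"},
    {"employee_id": "E2", "matter_id": "M11"},
]


def build_knowledge_map():
    """Join the extracts into an undirected graph of (kind, id) nodes."""
    graph = defaultdict(set)

    def link(a, b):
        graph[a].add(b)
        graph[b].add(a)

    for m in pms_matters:
        matter = ("matter", m["matter_id"])
        link(matter, ("client", m["client_id"]))
        link(matter, ("sector", SECTOR_MAP[m["sector"]]))
    for c in crm_clients:
        link(("client", c["client_id"]), ("sector", SECTOR_MAP[c["industry"]]))
    for t in time_records:
        link(("employee", t["employee_id"]), ("matter", t["matter_id"]))
    return graph


def neighbours(graph, node, kind):
    """Follow one 'line' out of a node: connected nodes of the given kind."""
    return sorted(n for n in graph.get(node, set()) if n[0] == kind)


knowledge_map = build_knowledge_map()
# Matter-centric view: who worked on matter M10?
print(neighbours(knowledge_map, ("matter", "M10"), "employee"))
# 'Change lines' to a Client-centric view: all matters for client C1.
print(neighbours(knowledge_map, ("client", "C1"), "matter"))
```

Keying every node on a `(kind, id)` pair is one simple way to let the same matter or client be recognised whichever system mentions it, which is what makes changing from one view to another cheap.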
In the next post, we will delve deeper into Enterprise Knowledge Maps and the power of data in innovation.
Ben Gardner was formerly Linklaters’ data and information architect. At Wavelength, which recently won the Online Courts Hackathon, he specialises in developing data strategies that enable clients to exploit structured & unstructured information sources.