This article was first published in the CIO Review India magazine.
What is Natural Language Processing? How can we derive value from unstructured text? Read how to “Put Your Money Where Your Word Is”!
The Big Picture
The advent of a new genre of technology always creates a lot of buzz, and typically everyone wants a piece of the action. It becomes a fad, and the Fear Of Missing Out (FOMO) drives implementation but not necessarily adoption, and it certainly does not generate an adequate Return on Investment (RoI).
Against this backdrop, let us examine the field of Natural Language Processing (NLP) and its potential implications in the field of Legal Tech and beyond.
The Direction of Legal Tech
The legal fraternity has traditionally been slow to adopt the latest technological tools. Law firms have been slow to take up digitization, ERPs or practice management software, workflow systems and knowledge management systems. The focal points have typically been capturing time and generating bills. However, with traditional commercial models under threat and client expectations changing, this approach is woefully inadequate. In in-house legal departments, technology budgets for solving legal problems have traditionally been minimal, and this segment did not evolve much for several decades.
The Natural Language Promise
While many factors have recently created excitement in the legal tech space, we will focus on processing unstructured text through Natural Language Processing (NLP). This is often combined with the power of AI through the deployment of Machine Learning (ML) and Deep Learning models. Before we dive deeper, let us examine the macro factors that are leading to an unprecedented interest in text computation:
- Traditional computing architecture is designed to process binary information, and translating quantitative information from the decimal system to binary comes naturally. However, humans as a species do not communicate in numbers; well, mostly not! Consequently, we have generated far more unstructured text data than structured quantitative data. Even the most sophisticated quants did not know, until very recently, how to process this information.
- The rise of computing power and the availability of data have fueled the Intelligence Revolution, whether we call it BI or Big Data or AI or any other new jargon that I am sure we will conjure up very shortly. These advances have also contributed to the field of computational linguistics. We can now ingest text and do all sorts of computational ‘magic’ with it. Imagine unleashing the power of data science by encoding the meaning of a billion words in a 300-dimensional vector space. The good news has only just started to unfold. Such vectors, derived from Neural Networks that have been painstakingly trained on massive datasets such as Google News and Wikipedia, are available for our use.
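To make the idea of words as vectors a little more concrete, here is a minimal sketch in Python. It assumes nothing beyond the standard library. The three 4-dimensional vectors are invented purely for illustration (real pre-trained embeddings have hundreds of dimensions and are learned from data), and the names TOY_VECTORS, cosine_similarity and most_similar are hypothetical. The sketch shows one common way of comparing word vectors, cosine similarity, under which words with related meanings score close to 1.

```python
import math

# Toy 4-dimensional vectors, invented for illustration only. Real embeddings
# are learned from large corpora and typically have hundreds of dimensions.
TOY_VECTORS = {
    "contract": [0.9, 0.8, 0.1, 0.0],
    "agreement": [0.8, 0.9, 0.2, 0.1],
    "banana": [0.0, 0.1, 0.9, 0.8],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, vectors):
    """Return the other word whose vector is closest to that of `word`."""
    target = vectors[word]
    scores = {
        other: cosine_similarity(target, vec)
        for other, vec in vectors.items()
        if other != word
    }
    return max(scores, key=scores.get)


print(most_similar("contract", TOY_VECTORS))  # prints "agreement"
```

With pre-trained vectors in place of the toy dictionary, the same comparison is what lets a machine treat ‘contract’ and ‘agreement’ as near neighbours without anyone hand-coding a list of synonyms.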
The Business of Text
With such exciting developments, the techie in us can very easily get carried away. However, first things first – let us look at some of the underlying problems that we are trying to solve. These problem statements will lead us to applications in the legal tech space and beyond.
- Information explosion. This needs no introduction, and certainly no more information to add to the explosion! Businesses need to cull relevant nuggets from a pile of information, often within short time frames, which renders a manual approach impractical. This has led to the rise of eDiscovery tools and solutions in the legal fraternity.
- Meaning-based computation. For decades, we have relied exclusively on human intelligence to derive the meaning of words, and all further processing has depended on our grey matter rather than silicon chips. Since such solutions were not around, we are yet to articulate the problem statements in this domain. Pause for a minute and think about it. You might get more answers specific to your domain than I can explain in the confines of this space. As an example, sourcing of candidate profiles can become intelligent, since machines can now ‘understand’ what is mentioned in the CVs.
- Look who’s talking. Let’s change focus from the input layer to the output layer. Machines can not only ingest and process information, but they can also generate natural language outputs. In the case of the legal fraternity, this has led to contract creation tools. Chatbots that use some of these principles are becoming ubiquitous. There are innumerable possibilities, since these are relatively ‘newer’ capabilities.
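As a rough illustration of the first problem above, culling relevant nuggets from a pile of documents, here is a minimal Python sketch. It uses TF-IDF term weighting, a long-established keyword technique chosen here only for brevity; it is not a description of how any particular eDiscovery product works, and it does not capture meaning the way word vectors do. The function names and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def rank_documents(query, documents):
    """Return document indices ordered from most to least relevant to the query."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: the number of documents in which each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for index, tokens in enumerate(tokenized):
        counts = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if counts[term]:
                tf = counts[term] / len(tokens)  # how prominent the term is here
                idf = math.log(n_docs / doc_freq[term]) + 1.0  # how rare it is overall
                score += tf * idf
        scores.append((score, index))

    # Highest score first; ties keep their original order.
    return [index for score, index in sorted(scores, key=lambda pair: -pair[0])]


# Hypothetical sample documents, invented for illustration.
DOCS = [
    "The supplier shall indemnify the buyer against patent infringement claims.",
    "Lunch will be served in the cafeteria at noon.",
    "Minutes of the quarterly board meeting.",
]

print(rank_documents("patent infringement", DOCS)[0])  # prints 0
```

A keyword ranker like this only finds documents that share the query's exact words. The meaning-based approaches described above aim to go further, for instance by also surfacing a document that speaks of an ‘IP violation’.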
What’s the Good Word?
So, how do we convert the heap of unprocessed and unstructured text into a gold mine? How can unused words become the good word, in a business and economic sense?
To get bang for the buck, one needs to approach this field differently. Borrowing an analogy from Daniel Kahneman’s ‘Thinking, Fast and Slow’, one needs to bypass the hard-wired neural circuitry (‘fast thinking’) and deliberately engage ‘slow thinking’ to discover more possibilities with these new-found capabilities of processing unstructured text.
Rajiv Maheshwari is the CEO of Anand and Anand. The article provides an overview of Natural Language Processing, drawing from his experience of creating innovative products recognized at the Financial Times Innovative Lawyers Asia-Pacific 2018.