Data-driven prediction tools are being used across a variety of fields to help professionals make more accurate and less biased decisions. In a meta-analysis of studies covering a wide range of fields, William M. Grove and Paul E. Meehl found that in 128 of 136 studies, formal algorithmic predictions (ranging from simple counting rules to advanced statistical techniques) matched or outperformed human judgment based on informal contemplation and discussion with others. They also found that the more advanced the statistical technique, and the greater the amount of data used to generate the algorithm, the more likely the algorithm is to outperform human judgment. (See Grove and Meehl, “Comparative Efficiency of Informal (Subjective, Impressionistic) and Formal (Mechanical, Algorithmic) Prediction Procedures: The Clinical-Statistical Controversy,” 2 Psychology, Public Policy, and Law 293.)
Can data-driven tools be used in law with similar efficacy? The question is not as new as it might seem. Writing in the Harvard Law Review in 1897, Oliver Wendell Holmes Jr. observed, “For the rational study of law the blackletter man may be the man of the present, but the man of the future is the man of statistics and the master of economics.” (See Oliver Wendell Holmes Jr., “The Path of the Law,” 10 Harvard Law Review 457, at 469.)
Technological advancements in the last 10 years have made it possible to explore the application of data-driven tools to the legal field. The problem, however, is that law is full of grey areas. Tax law in particular can be a very uncertain field, often precisely because of the many rules and standards that are supposed to provide lawyers and judges with guidance. Rules are a challenge because of their intricate specificity. Standards, on the other hand, are difficult because of their amorphousness. In terms of standards, tax lawyers must characterize relationships, instruments, and entities to help their clients comply with the law. Prediction regarding characterization is an essential legal skill: in order to advise on compliance or prepare for litigation, tax lawyers must be able to predict how the courts will characterize their client’s situation.
However, legal predictions are limited by human judgment. Even the most careful lawyer’s predictions can be inaccurate in any number of ways: they may be based on overly broad rules of thumb, biased by individual experiences, or influenced by the interests of clients. But recent advances in machine learning provide lawyers with an opportunity to use these powerful new tools to support their predictions. By analyzing the facts and outcomes of past cases, machine learning algorithms can find hidden patterns in the existing data to predict the outcome of new scenarios.
Take, for example, the question of whether funds supplied to a business are best characterized as debt or equity. This question has a range of important implications including the tax treatment of a reimbursement, the deductibility of interest payments on purported debt, and the availability of the bad debt deduction. The characterization exercise can be complex and depends on the extent to which the transaction complies with arm’s-length standards and normal business practices. In many cases, assessing the substance of a particular transaction requires a detailed examination of up to 16 different factors. (See Estate of Mixon v. United States and Dixie Dairies Corp. v. Commissioner.)
Debt or Equity Example
Consider the following scenario: a privately owned medical technology company seeks funds to complete a restructuring. An arm’s-length individual experienced in the field assesses the business and determines that $5 million is required to make the company successful. Despite a relatively high debt-to-equity ratio, the company is able to obtain $4 million in funding from an institutional lender, and the individual advances $1 million personally. This $1 million contribution is evidenced by a demand note bearing interest at 6 percent annually. At first, the company’s plans lead to promising results, but the company begins to falter when a competitor enters the market. The individual contributes an additional $500,000 to keep the company afloat. The company ceases paying interest on the purported debt. Despite having a security interest in the company’s technology, the individual does not demand or enforce payment. Eventually the company goes bankrupt, and the individual claims a bad debt deduction. The IRS disallows the deduction on the basis that the advances were not legitimate indebtedness but rather capital contributions.
Imagine you are a lawyer for the individual in this case: how would you determine the likely outcome of this scenario in court? You might start your analysis by identifying the factors that support your preferred debt characterization:
- The obligations were documented as debt (both through the original note and in the company’s records)
- There was a certain promise to repay, along with a specified, market-based interest rate
- The security interest constitutes a definite right to enforce payment
- The company was financially sound enough to obtain funding from an institutional lender
Other factors in this situation, however, point to an equity characterization:
- The advances made by the individual were subordinate to the loan made by the institutional lender
- The investment was risky, given the company’s thin capitalization and the missed interest payments
- The individual did not enforce the obligation to repay
How might you weigh all of these factors to determine the most likely outcome? The most common approach is to search for similar cases. But with over 250 cases on this issue from the past 50 years, finding relevant cases is no simple task. After hours of research, you might successfully isolate similar cases such as Owens v. Commissioner and Magee v. Commissioner. And while there are many similarities, the facts do not match completely. In addition, the results of these two previous cases are different. The obligation in Owens was characterized as debt, and the obligation in Magee was characterized as equity.
Since the courts have stated that no single factor can be determinative and we do not know precisely how the various factors in these cases interact with one another, we cannot simply count the factors that point in each direction. And even if we were to attempt to apply traditional methods of statistical analysis to the facts of a large number of cases, we would need to avoid standard regression techniques that would require us to impose on the data an assumed relationship between the variables and the outcome.
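To make the limits of factor counting concrete, here is a minimal Python sketch of a naive tally applied to the scenario above. The factor labels simply restate the two lists in this example, and the function name `tally` is a hypothetical helper, not a method used by any court or tool.

```python
from typing import Dict


def tally(factors: Dict[str, str]) -> str:
    """Naive rule: whichever characterization has more factors wins."""
    debt = sum(1 for side in factors.values() if side == "debt")
    equity = sum(1 for side in factors.values() if side == "equity")
    if debt == equity:
        return "tie"
    return "debt" if debt > equity else "equity"


# Factors from the scenario, labelled by the characterization they support.
scenario = {
    "documented as debt": "debt",
    "promise to repay at a market-based rate": "debt",
    "security interest": "debt",
    "institutional lender was willing to lend": "debt",
    "subordinated to the institutional loan": "equity",
    "thin capitalization and missed interest": "equity",
    "no enforcement of repayment": "equity",
}

print(tally(scenario))  # 4 debt factors vs. 3 equity factors -> "debt"
```

The tally returns the same answer for any fact pattern with a 4-to-3 split, regardless of which factors are present or how they interact. That is why it cannot explain two similar cases that were decided differently.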
Rather than having to choose between one option that is too simplistic and another that is potentially bias-prone, machine learning can detect connections between factors and produce predictive algorithms that capture more nuanced patterns than traditional statistical techniques. These trained algorithms can then be applied to the facts of a new situation in order to produce a prediction.
But how could we know that the machine-learning predictions are correct? One way to evaluate the accuracy of the model is to follow an out-of-sample testing process. For instance, if we had 1,000 cases on a particular legal question, we would train the algorithm on the facts and outcomes of 700 of the cases and test it on the remaining 300. If any of the out-of-sample cases produces an unexpected result, we can go back and reevaluate both our data and the weights assigned by the machine learning to particular factors. The more training and testing we apply to the system, the more confidence we can have in its accuracy.
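The 700/300 split described above can be sketched in a few lines of Python. Everything in this sketch is an assumption made for illustration: the cases are synthetic, the five yes/no factors and the `hidden_outcome` rule are invented stand-ins for real case data, and the "model" is a simple majority-vote lookup table rather than any particular machine learning algorithm.

```python
import random
from collections import Counter, defaultdict

random.seed(0)  # fixed seed so the split is reproducible

N_FACTORS = 5  # hypothetical yes/no factors recorded for each case


def hidden_outcome(factors):
    """Synthetic stand-in for past decisions; NOT a real legal rule."""
    documented, fixed_rate, secured, enforced, thin_cap = factors
    # Interaction: a security interest only counts if repayment is enforced.
    score = documented + fixed_rate + secured * enforced - thin_cap
    return "debt" if score >= 2 else "equity"


def make_case():
    factors = tuple(random.randint(0, 1) for _ in range(N_FACTORS))
    outcome = hidden_outcome(factors)
    if random.random() < 0.10:  # 10% noise: some cases go the other way
        outcome = "equity" if outcome == "debt" else "debt"
    return factors, outcome


cases = [make_case() for _ in range(1000)]
random.shuffle(cases)
train, test = cases[:700], cases[700:]  # 700 to train, 300 held out


def fit(training_cases):
    """Learn the majority outcome for each fact pattern seen in training."""
    votes = defaultdict(Counter)
    for factors, outcome in training_cases:
        votes[factors][outcome] += 1
    overall = Counter(o for _, o in training_cases).most_common(1)[0][0]
    table = {f: c.most_common(1)[0][0] for f, c in votes.items()}
    return table, overall


def predict(model, factors):
    table, default = model
    # Fall back to the overall majority for fact patterns never seen before.
    return table.get(factors, default)


model = fit(train)
correct = sum(predict(model, f) == o for f, o in test)
accuracy = correct / len(test)
print(f"held-out accuracy: {accuracy:.1%} on {len(test)} cases")
```

The key design point is that the 300 test cases play no part in `fit`, so the reported accuracy reflects how the model handles cases it has not seen.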
Another advantage of machine learning is that it lets us assess the impact of individual factors by changing the input facts and re-running the scenario. For example, what if the investment described above had a different risk profile? What would happen if the company had not been thinly capitalized when the obligation was entered into, or if it could not have obtained a loan from the institutional lender? Either change might alter the predicted outcome or lower the confidence level of the prediction.
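A what-if analysis of this kind might look like the following Python sketch. The weights and bias are hypothetical numbers chosen by hand to stand in for a trained model; they are not learned from real cases and carry no legal meaning. The names `debt_probability` and `what_if` are likewise illustrative.

```python
import math

# HYPOTHETICAL weights standing in for a trained model.
# Positive values push toward debt, negative values toward equity.
WEIGHTS = {
    "documented_as_debt": 0.8,
    "fixed_interest_rate": 0.6,
    "security_interest": 0.7,
    "outside_lender_available": 0.9,
    "subordinated": -0.7,
    "thinly_capitalized": -1.0,
    "repayment_enforced": 1.1,
}
BIAS = -1.2  # hypothetical baseline leaning before any facts are considered


def debt_probability(facts):
    """Logistic score: estimated probability of a debt characterization."""
    score = BIAS + sum(WEIGHTS[name] for name, present in facts.items() if present)
    return 1 / (1 + math.exp(-score))


def what_if(facts, **changes):
    """Re-run the prediction with one or more facts changed."""
    return debt_probability({**facts, **changes})


# Facts of the scenario described in this article.
baseline = {
    "documented_as_debt": True,
    "fixed_interest_rate": True,
    "security_interest": True,
    "outside_lender_available": True,
    "subordinated": True,
    "thinly_capitalized": True,
    "repayment_enforced": False,
}

print(f"baseline:               {debt_probability(baseline):.2f}")
print(f"not thinly capitalized: {what_if(baseline, thinly_capitalized=False):.2f}")
print(f"no outside lender:      {what_if(baseline, outside_lender_available=False):.2f}")
```

Under these assumed weights, removing thin capitalization raises the estimated probability of a debt characterization, while removing the institutional lender lowers it. A real system would derive the effect of each change from the decided cases rather than from hand-set numbers.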
Systems powered by machine learning would allow lawyers to make predictions more confidently and efficiently based on all of the relevant information. And while there is considerable anxiety about the disruptive potential of AI for the legal field, it is important to recognize that machine learning is not a replacement for the judgment of human lawyers. Instead, it is a powerful new tool that could augment their professional knowledge and instincts.
Benjamin Alarie holds the Osler Chair in Business Law at the University of Toronto Faculty of Law and is the CEO of Blue J Legal.