AI is changing the practice of tax law. This series examines the ethical, legal, and practical implications of AI across key areas of tax practice.
Part 1 of this contribution established that generative artificial intelligence creates structural risks in tax law: epistemic instability through hallucinated authority and substantive mismatch through the inability to comprehend anti-abuse doctrines. Part 2 turns to the professional response. Because AI lacks agency, it must be treated as an instrumentality for tax practitioners rather than as a legal assistant, placing the burden of verification squarely on the attorney under the “non-delegable duty” standard articulated in United States v. Boyle. This article identifies a critical “privilege gap” where AI tools fail to qualify for Kovel protection, risking the waiver of attorney-client confidentiality, and concludes by proposing a governance framework—including judicial certification and revised Circular 230 standards—that mandates “human-in-the-loop” verification to preserve the integrity of the tax system.
In a post-Loper Bright environment, where the judiciary must interpret statutes without agency deference, the injection of hallucinated legislative history or case law is not merely a drafting error; it is a corruption of the record.
If AI cannot be trusted to self-regulate, the tax profession must regulate its use through liability rules, privilege safeguards, and governance protocols.
Instrumentality for Tax Practitioners
Generative AI functions as neither a junior associate nor a paralegal. See Artificial Intelligence (AI) Guidance for Judicial Office Holders, Cts. & Tribunals Judiciary, 5 (2025). It is a highly persuasive tool. The danger is that it lowers the psychological barrier to accepting a favorable result because a “smart machine” produced it. Courts and regulators must recognize this ontological status to properly assign liability. Because the machine operates probabilistically yet writes with fluent confidence, the untrained practitioner is prone to “automation bias,” accepting algorithmic outputs as authoritative. See Mata v. Avianca, Inc., 678 F. Supp. 3d 443, 448 (S.D.N.Y. 2023); ABA Comm. on Ethics & Professional Resp., Formal Op. 512, 4 (2024).
Rejecting the agent analogy. Some have suggested treating AI like a supervised assistant. This analogy fails. AI is not supervised; it is prompted. See Eliza Mik, Caveat Lector: Large Language Models in Legal Practice, Rutgers Bus. L. Rev., 12 (2024) (observing that large language models solve a next-word prediction problem rather than engaging in the cognitive process of understanding). It does not learn from correction in the manner of a human associate. See id. at 55 (describing the phenomenon of “model collapse” and the inability of models to acquire functional grounding from text alone). It does not reason, and it has no duty of loyalty. See id. at 12; see also Artificial Intelligence (AI) Guidance for Judicial Office Holders, Cts. & Tribunals Judiciary, 7 (2025) (warning that AI chatbots lack the capability for legal analysis and do not produce convincing reasoning). The agent model, central to vicarious liability, is doctrinally inapposite. See Mik, supra, at 57; see also Victor Habib Lantyer, The Phantom Menace: Generative AI Hallucinations and Their Legal Implications, SSRN No. 5167036, 1 (2025).
If an AI prepares a brief that cites a fictional Tax Court decision, who bears the responsibility? The practitioner does. The legal profession should abandon the “agent” analogy for AI, which presupposes supervision. Relying on AI output without verification is not a failure of supervision; it is a failure of personal competence.
In such a framework, the AI is simply an instrument. Like a calculator or a word processor, it amplifies human input. However, unlike those tools, AI generates legal claims. If a calculator fails, it outputs “ERROR.” If an AI fails, it outputs a plausible-sounding falsehood. That functional leap demands heightened vigilance. See Standing Order on AI Usage, Judge Brantley Starr (N.D. Tex. May 30, 2023); see also ABA, Formal Op. 512, 4.
Non-delegable duty to verify. In United States v. Boyle, the US Supreme Court held that reliance on an agent does not excuse a taxpayer’s failure to meet a non-delegable statutory duty. 469 U.S. 241, 252 (1985). The Supreme Court distinguished between substantive legal advice (which may constitute reasonable cause) and the performance of ministerial acts. Verifying the existence and accuracy of legal citations is a ministerial function, not a matter of judgment.
Delegating this task to an AI without independent confirmation is not reliance on legal judgment; it constitutes a failure of ordinary business care. See Boyle, 469 U.S. at 252; see also Mata, 678 F. Supp. 3d at 448. In tax litigation, Federal Rule of Civil Procedure 11 and Tax Court Rule 33 impose a non-delegable gatekeeping function on the signing attorney. Fed. R. Civ. P. 11(b); Tax Ct. R. 33(b); see also Thomas v. Commissioner, No. 10795-22, Order (T.C. Oct. 23, 2024). Verification of authorities remains an essential condition of the lawyer’s personal responsibility to the tribunal. Mata, 678 F. Supp. 3d at 448; Thomas, Order. Practitioners cannot outsource such gatekeeping functions to an algorithm.
Circular 230 and negligence. Circular 230, §10.22 requires practitioners to exercise due diligence in determining the correctness of representations made to the Department of the Treasury. 31 C.F.R. §10.22(a) (2024). Its companion safe harbor, §10.22(b), permits reliance on the work product of another person, and that provision assumes the existence of human communicative agency. A large language model, or LLM, is not a “person” capable of being supervised within the meaning of the regulation. See 31 C.F.R. §10.2 (2024).
Current regulations target individualized fraud rather than systemic negligence. While a human practitioner commits isolated errors, an AI system can generate thousands of erroneous tax positions instantly. The aggregate capacity of distributed AI systems presents unprecedented regulatory challenges for tax enforcement. The Treasury Department should clarify that the “reliance on others” defense in §10.22(b) does not extend to generative AI tools, effectively creating a strict liability standard for the submission of AI-hallucinated authority.
Confidentiality and privilege. Beyond threatening legal accuracy, generative AI compromises the structural integrity of attorney-client confidentiality. ABA, Formal Op. 512, 6; CCBE Guide on the Use of Generative AI by Lawyers, 16 (2025). The risks are structural, built into how these systems train on, store, and grant access to data. See CCBE Guide, 14; see also Considerations When Using ChatGPT and Generative AI Software Based on Large Language Models, Bar Council, ¶15 (Jan. 2024). In tax practice, where sensitive financial data is routine, these risks are amplified. See OECD, Governing with Artificial Intelligence (2024).
The third-party problem. Most generative AI tools are cloud-based. See Artificial Intelligence Use in the Federal Court of Australia 26 (2025). Inputting client data into these systems transmits it to third-party servers. ABA, Formal Op. 512, 7. Model Rule 1.6 prohibits unauthorized disclosure of information relating to representation. ABA, Model Rules of Professional Conduct, R. 1.6 (2023). If the AI provider logs, stores, or reuses that data, even in anonymized form, privilege may be lost. CCBE Guide, 17. Current service agreements confirm the practical reality of these data risks. Jack Turner, 7 Things You Should Never Share with ChatGPT, Tech.com (2025). Many AI terms of service reserve the right to use prompts for product improvement, a practice that destroys confidentiality. State Bar of Cal., Practical Guidance for the Use of Generative Artificial Intelligence in the Practice of Law (2024); ABA, Formal Op. 512, 11. The legal consequence is more than an ethical breach; it may constitute waiver of the attorney-client privilege. See United States v. Heppner (S.D.N.Y. 2026); Considerations When Using ChatGPT, ¶19.
When applying confidentiality rules, tax professionals should not regard AI as just another passive, outsourced cloud service; these systems actively ingest, and may reuse, what they receive.
The Kovel gap. Under United States v. Kovel, privilege extends to non-lawyers (such as accountants) employed to assist in legal advice. 296 F.2d 918 (2d Cir. 1961). The Kovel doctrine protects agents who effectively translate complex factual or financial information for the attorney.
AI does not fit the model, as Kovel requires an agency relationship and an expectation of confidentiality. See Mik, 12. AI systems lack agency, operate independently of direct attorney oversight, and—if utilizing public models—lack the expectation of confidentiality. If the AI’s terms of service state that content is used to improve services, the lawyer has effectively disclosed the client’s confidential tax strategy to a third-party commercial provider. That disclosure destroys the expectation of confidentiality required for privilege. Without a confidentiality agreement and a clear agency relationship, courts should not extend Kovel to AI tools. Kovel, 296 F.2d at 922. This gap is fatal in high-risk contexts such as tax shelter planning, where privilege is paramount.
Data persistence and IRC §7216. Even where providers disclaim retention, data persistence is opaque. In some systems, data may be incorporated into future training sets. Matthew Dahl et al., Large Legal Fictions: Profiling Legal Hallucinations in Large Language Models, 16 J. Legal Analysis 64, 85 (2024). This raises compliance risks under IRC §7216, which imposes criminal penalties on return preparers for the knowing or reckless unauthorized disclosure of tax return information. See IRC §7216 (2024); see also 31 C.F.R. §10.51(a)(15) (2024). A practitioner who shares client income data or return structure with a public LLM may inadvertently trigger liability. Unlike a traditional research tool, the AI does not forget; it transforms inputs into permanent statistical weights.
Governance Proposals
The profession cannot await external remedies; the burden of risk management lies squarely with practitioners. See Mata, 678 F. Supp. 3d at 448. Regulatory evolution requires a tripartite approach: clarification of existing standards, procedural mandates, and hardened internal controls. See Buckeye Tr. v. PCIT, ITA No. 1051 (ITAT Bengaluru Bench Dec. 30, 2024). AI in tax practice is not a future concern but a present liability. Thomas v. Commissioner, No. 10795-22, Order (T.C. Oct. 23, 2024). The objective is not prohibition but disciplined integration, because human discernment remains essential. Cross-border practice heightens the stakes: regulatory expectations diverge across jurisdictions, and what is permissible in one market may violate professional standards in another. A resilient compliance posture therefore requires anticipating conflicting rules and documenting governance choices.
Regulatory action: amending Circular 230. Circular 230 is silent on generative AI, but Treasury should amend §10.22 to clarify that reliance on generative AI without human verification violates the duty of diligence. Submitting AI-generated legal content that has not been manually confirmed should be treated as a breach. Circular 230, §10.33 (Best Practices), should also encourage concrete protocols, including usage logs, verification records, and version histories for AI-assisted work. These controls operationalize diligence. They also reduce the probability that fabricated authority enters the tax record. Verification remains non-delegable.
Reliance on generative AI without independent confirmation of primary authority should be treated as a failure of diligence under existing standards, including Circular 230 and Rule 11-type certification regimes. The premise is simple: If the machine can hallucinate, the practitioner must verify. No tool substitutes for judgment, and Circular 230 should state that expressly.
Ex-ante reform: competence and ethics certification. Even if AI were to reach superhuman-level capability, Model Rule 5.3, which governs the supervision of non-lawyer assistants, presumes human assistants. ABA, Model Rules of Professional Conduct, Rule 5.3 (2023). With AI, the duty of competence under Rule 1.1 becomes the primary doctrinal standard. ABA, Model Rules of Professional Conduct, Rule 1.1 (2023); ABA, Formal Op. 512, 3. The duty is personal; it is not enough to adopt a firm-wide AI policy. ABA, Model Rules of Professional Conduct, Rule 1.1 comment 8 (2023); see also AI & Legal Ethics: A Guide to ABA Rules in 2025, NexLaw Blog (2025). Each lawyer must verify outputs before use. The test is not whether the tool was “generally reliable”; competence now includes the ability to identify hallucinations and to understand where AI can fail. ABA, Formal Op. 512, 4; see also Tahir Khan, Oops! AI Made a Legal Mistake: Now What? AI Hallucinations, Professional Responsibility, and the Future of Legal Practice, The Barrister Group (Oct. 21, 2025).
Courts have also begun to act: in the Northern District of Texas, Judge Brantley Starr requires litigants to certify that any AI-assisted filing has been verified by a human. See Standing Order on AI Usage. The Tax Court should follow with a uniform standing order requiring counsel to certify that citations and quotations have been verified against primary sources. This is not a rejection of innovation; it codifies what ethical rules already require. A short certification requirement forces a pause before submission, and that simple “nudge” can be enough to stop the rushed filing of fabricated precedents.
This point is no longer theoretical. In February 2026, the Fifth Circuit sanctioned counsel for using AI to draft a substantial portion of a reply brief, failing to verify the accuracy of the output, and responding evasively to the court’s show-cause order. Fletcher v. Experian Info. Solutions, Inc., No. 25-20086, slip op. at 2, 13–15 (5th Cir. Feb. 18, 2026). The court also explained that it declined to adopt a special AI rule because existing sanctions doctrines already require accuracy and verification. Id. at 3–5. The lesson is straightforward: even absent an AI-specific certification rule, the duty of human verification already exists, and courts are enforcing it with increasing clarity.
Professional practice: human-in-the-loop by default. The central safeguard is structural: no AI-generated work should reach a client, court, or agency without experienced human review. Verification must be routine, with each cited source reviewed in context and each quotation matched against the original. The human tax practitioner must test each legal proposition against controlling authority and remains the gatekeeper.
Effective risk management requires a structural redesign of firm protocols:
- Engagement letters should disclose whether AI is used, under what terms, and for which tasks. The duty of confidentiality in the AI era requires proactive design, not reactive disclaimers. Clients should understand when automation is used and where human judgment begins. This aligns with Model Rule 1.4 and reduces the risk of misunderstanding. AI & Legal Ethics: A Guide to ABA Rules in 2025, NexLaw Blog (2025); State Bar of Cal., Practical Guidance for the Use of Generative Artificial Intelligence in the Practice of Law (2024).
- Firms should prohibit the use of public LLMs for client matters; any AI tool that touches client data must be sandboxed behind confidentiality firewalls. See Artificial Intelligence Use in the Federal Court of Australia, ¶19; see also Khan, supra.
- Contracts with any AI tool, cloud vendor, or third-party processor must include “zero retention” clauses and verification that no input is used for training. CCBE Guide, 18; see also AI & Legal Ethics: A Guide to ABA Rules in 2025, NexLaw Blog (2025).
- For drafting tools, retrieval-augmented generation models tied to verified databases offer a more defensible foundation than open-ended tools. Open systems that hallucinate are neither reliable nor efficient, especially when practitioners must spend billable time hunting down authorities that do not exist. ABA, Formal Op. 512, 7; Cal. Bar Guidance, 2; see also Varun Magesh et al., Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools, 22 J. Empir. Leg. Stud. 216, 218 (2025); Dahl et al., 66.
Governance should make the submission of fabricated authority structurally difficult, and professionally sanctionable when it occurs. See Buckeye Tr., ITA No. 1051 (2024) (recalling an order based on fictitious judicial precedents); see also QWYN v. Comm’r of Taxation [2024] AATA 1579, ¶35.
Look Ahead
Tax law is built on trust, but that trust must be earned through diligence. In the new era of algorithms and probabilistic output, human adjudication remains the terminal value in tax administration. AI can expedite the workflow, but it cannot supplant the decision.
Tax professionals bear an affirmative obligation to verify algorithmic outputs rather than merely query them. The efficiency gains of AI are real, but they become illusory if purchased at the cost of legal integrity. If lawyers abandon their gatekeeping function for the speed of automation, they risk not merely sanctions but the destabilization of the self-assessment system itself. As the courts have now made clear, there is no algorithm for accountability.
This article does not necessarily reflect the opinion of Bloomberg Industry Group, Inc., the publisher of Bloomberg Law, Bloomberg Tax, and Bloomberg Government, or its owners.
Author Information
Pramod Kumar Siva is an international tax practitioner with over 25 years of cross-border experience spanning North America, Europe, the Middle East, and Asia. He is also a visiting adjunct faculty member at Texas A&M University.
To contact the editors responsible for this story: Soni Manickam at smanickam@bloombergindustry.com.