Why legal AI still struggles with ‘why’ and ‘how’

As firms rely more heavily on AI tools, understanding their architectural limits is becoming a professional necessity
Lawyers seldom stop to ponder the complex architecture that underlies the AI they use on a daily basis – but maybe they should, because it has some serious implications for the work they do.
For the past couple of years, the foundation of legal AI tools has been something called Retrieval Augmented Generation (RAG) architecture, which has proven effective at locating information and answering ‘what’ questions, but struggles with the deeper ‘why’ and ‘how’ inquiries.
This limitation stems from RAG’s flat retrieval architecture, which – while it excels at finding information – fails to capture the complex interconnections and relationships that underpin meaningful understanding and knowledge, particularly in legal work.
How will AI data architecture need to evolve to better reveal critical relationships across multiple data points at scale and answer the ‘why’ and ‘how’ questions more effectively – and how can law firms ensure they are best positioned to leverage these new capabilities?
The reasoning gap
This inflection point has been taking shape over the past couple of years. After some (embarrassing) missteps of relying on the ‘internal knowledge’ of Large Language Models (LLMs), which resulted in hallucination-laden work product, the legal space found great success with the RAG approach.
By coupling LLMs with so-called vector databases (more on this in a bit), firms could suddenly interrogate vast repositories of case law and contracts, providing control over the data that AI used to formulate its answers. If you asked what a specific clause in a merger agreement stated, for instance, the system could find it and provide an accurate answer.
AI’s utility in these types of scenarios led to a slightly misleading assumption: that ‘access to information’ and ‘retrieval of information’ are synonymous with ‘intelligence.’ Part of the blame here lies with the humanlike interaction we have with AI models: we can ask questions in natural language, and they reply in fluent prose – leading users to believe that the AI is, at some level, ‘reasoning.’
However, there’s a big gap between answering ‘what’ and answering ‘why’ or ‘how’ – and this ‘reasoning gap’ isn’t going away with the current architecture alone.
Into the weeds
A quick primer is in order here. RAG systems rely on Vector Space Models. When a lawyer uploads a document, the AI chunks the text and converts it into numerical vectors (basically, lists of numbers representing semantic meaning). When a query is posed, the system retrieves chunks that are mathematically close to the query – but the architecture treats each chunk as an isolated, disconnected data point, with no structural awareness of how different pieces of information relate to one another.
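For the technically curious, here is a minimal Python sketch of that retrieval step, with a simple word-count vector standing in for a real embedding model and invented contract snippets as the corpus. The detail to notice is that every chunk sits in the index as an isolated vector: nothing records that two clauses belong to the same deal, or that one contradicts another.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: a simple bag-of-words count vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Similarity between two vectors; RAG retrieval ranks chunks this way.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Each chunk is stored as an isolated vector: nothing captures that the two
# termination clauses conflict, or how clauses within one contract relate.
chunks = [
    "Contract A: either party may terminate on 30 days written notice",
    "Contract B: termination requires 90 days notice and board approval",
    "Contract A: governing law is England and Wales",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

query = embed("what notice period applies to termination?")
ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
for text, _ in ranked[:2]:
    print(text)  # the closest chunks come back, but not how they relate
```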
Let’s translate this arcana into real-world terms. A vector representing a clause in Contract A, for instance, has no inherent architectural connection to a conflicting clause in Contract B, nor does it inherently understand the hierarchical relationship between, say, a High Court judgment and a subsequent statutory instrument.
The result? The architecture can’t make the leap to the genuine intelligence of ‘how’ and ‘why’, or perform the causal reasoning that is essential for legal work. Amongst other things, it cannot reliably trace precedent chains, reason about regulatory dependencies, or explain legal conclusions – it can only retrieve relevant documents.
The next evolution in legal AI, then, is not about bigger or more powerful models – that’s a bit like putting a more powerful engine in your sports car and expecting it to take flight like an airplane. Power isn’t the constraint – the design is.
That said, the emergence of so-called ‘reasoning models’ and dramatically larger context windows are partially closing the reasoning gap from within the model itself. They don’t eliminate the need for architectural change at the data layer: a more powerful engine still needs the right airframe if the goal is flight.
What’s required then is a fundamental transformation in data architecture – one that moves from the flat, probabilistic retrieval that RAG specialises in towards architectures that can support reasoning and make AI more ‘intelligent.’ Several promising approaches are now emerging.
Option 1: Creating relationships with graph RAG
One approach is called graph RAG, which involves constructing entity-relationship graphs with hierarchical summarisation. To simplify somewhat, this means that the AI identifies the relationships that exist between different pieces of information.
For instance, a contract might have a start date – that’s one entity. The first party to the contract, who has an obligation to pay within 30 days, is another. Each of these entities, along with others within the contract, becomes a node on a graph that the AI can link together, with edges capturing how the different pieces of information relate to one another.
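As a rough sketch of the idea, the following Python snippet uses the networkx library and a handful of invented contract entities. Answering a relational question becomes a traversal over typed edges rather than a nearest-neighbour lookup.

```python
import networkx as nx

# Hypothetical entities extracted from two contracts; names are illustrative.
g = nx.DiGraph()
g.add_node("Contract A", type="contract", start_date="2024-01-01")
g.add_node("Contract B", type="contract", start_date="2024-06-01")
g.add_node("Acme Ltd", type="party")
g.add_node("Pay within 30 days", type="obligation")
g.add_node("Pay within 60 days", type="obligation")

g.add_edge("Contract A", "Acme Ltd", relation="party_to")
g.add_edge("Contract B", "Acme Ltd", relation="party_to")
g.add_edge("Contract A", "Pay within 30 days", relation="imposes")
g.add_edge("Contract B", "Pay within 60 days", relation="imposes")

# A relational question – which payment obligations does Acme Ltd face,
# and under which contracts? – becomes a walk across typed edges.
for contract in g.predecessors("Acme Ltd"):
    for _, obligation, data in g.out_edges(contract, data=True):
        if data["relation"] == "imposes":
            print(f"{contract} -> {obligation} (via {data['relation']})")
```

The entity and relation names here are purely illustrative; a production system would extract them automatically and at far greater scale, which is precisely where the cost comes in.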
This approach is a promising way to close the reasoning gap, and recent research – including work on ontology-driven legal knowledge graphs – has shown it can achieve near-deterministic accuracy for tasks like point-in-time legislative retrieval. The downside is cost: graph RAG architecture is significantly more expensive to build and maintain than regular RAG, and it is also computationally intensive, which adds a further layer of expense.
Option 2: A neuro-symbolic approach that blends logic and learning
Another promising option, which is still in the nascent stages, is the neuro-symbolic approach. This architecture combines neural networks (i.e., LLMs) with symbolic AI – a rules-based ‘if, then’ type of AI that has actually been around for decades. Putting the two together makes the whole greater than the sum of its parts, because the LLMs can serve as a translator that helps overcome some of the inherent rigidity of symbolic logic.
Here’s how that works. Let’s say that a lawyer asks the AI, ‘Can our client terminate this contract early, given force majeure?’ The LLM translates this natural language into formal logic or a structured query that the symbolic system understands. The symbolic engine reasons over the question using explicit rules (such as contract terms, legal definitions, or logical inference), and then the LLM interprets the symbolic output back into natural language for the lawyer.
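A stripped-down illustration of that pipeline, in Python, might look like the sketch below. The rules are invented for the example (they are not real law), and the LLM translation layer is stubbed out as hard-coded facts; the part to focus on is that the symbolic function returns both an answer and the chain of reasoning behind it.

```python
from dataclasses import dataclass

# --- Symbolic layer: explicit, auditable rules (illustrative, not real law) ---
@dataclass
class ContractFacts:
    force_majeure_clause: bool
    notice_given_within_days: int
    notice_period_days: int

def may_terminate_early(facts: ContractFacts) -> tuple[bool, list[str]]:
    trace = []
    if not facts.force_majeure_clause:
        trace.append("No force majeure clause -> early termination not available")
        return False, trace
    trace.append("Force majeure clause present")
    if facts.notice_given_within_days <= facts.notice_period_days:
        trace.append("Notice given within the contractual period -> condition satisfied")
        return True, trace
    trace.append("Notice given too late -> condition not satisfied")
    return False, trace

# --- Neural layer (stubbed): in a real system an LLM would translate the
# lawyer's question and the contract text into these structured facts. ---
facts = ContractFacts(force_majeure_clause=True,
                      notice_given_within_days=10,
                      notice_period_days=14)

answer, reasoning = may_terminate_early(facts)
print("Can the client terminate early?", answer)
for step in reasoning:
    print(" -", step)   # every inference step is visible and auditable
```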
One of the main advantages of this approach is that the symbolic reasoning component is deterministic – meaning there's only one possible outcome for a given set of rules and facts – and it’s auditable, thanks to the ‘old-school’ symbolic logic. An end user can audit the entire reasoning process that the AI used to reach a certain conclusion rather than having to guess at what kind of calculations took place inside a black box.
The downside here? The LLM translation layer is still probabilistic, meaning the overall system isn’t immune to error. And the approach works best where legal rules can be clearly formalised – it’s less proven for areas involving discretion or open-textured concepts like ‘reasonableness.’ The neuro-symbolic approach is still mainly in the research phase. But early, promising seeds have been planted.
Option 3: Tweaking the existing technology
A third possibility for closing the reasoning gap that bedevils current RAG architecture is something broadly known as agentic RAG. This approach enables existing RAG architecture to get significantly closer to causal reasoning by incorporating autonomous agent elements that can plan, evaluate, and iterate.
With agentic RAG, an AI ‘agent’ is put in charge of answering a query and performing ‘quality control’ along the way. Rather than just handing a result to the user, it evaluates the search results multiple times – rejecting answers that seem hallucinated or widening the search as needed to get the best results.
For instance, if a lawyer asks, ‘Can a UK company lawfully avoid liability for supply chain delays caused by post-Brexit customs checks?’, the AI agent gathers information while continually asking: are these sources authoritative? Are there gaps or contradictions? Is the answer too shallow or too confident? If it detects uncertainty or conflicting signals, it widens or deepens the search accordingly.
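In code, the control loop behind that behaviour might look something like the following Python sketch. The retrieve and evaluate functions are placeholders for calls to a vector store and an LLM critique step, and the confidence threshold is invented, but the plan, evaluate, and iterate pattern is the essence of the approach.

```python
# Hypothetical helpers: in practice these would call a vector store and an
# LLM; here they are placeholders that simply demonstrate the control loop.
def retrieve(query: str, breadth: int) -> list[str]:
    corpus = {
        1: ["Post-Brexit customs guidance summary"],
        2: ["Post-Brexit customs guidance summary", "Force majeure case note",
            "Model supply contract clause bank"],
    }
    return corpus.get(breadth, corpus[2])

def evaluate(query: str, sources: list[str]) -> float:
    # Stand-in for an LLM-based critique: are the sources authoritative,
    # consistent, and sufficient to answer the question?
    return min(1.0, len(sources) / 3)

def agentic_rag(query: str, threshold: float = 0.8, max_rounds: int = 3) -> list[str]:
    breadth = 1
    for _ in range(max_rounds):
        sources = retrieve(query, breadth)
        confidence = evaluate(query, sources)
        if confidence >= threshold:
            return sources          # good enough: hand the results to the user
        breadth += 1                # otherwise widen the search and try again
    return sources                  # best effort after the final round

print(agentic_rag("Can a UK company avoid liability for post-Brexit customs delays?"))
```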
Going beyond ‘single pass RAG’ – giving the AI agency to keep refining and reformulating results at the moment that a question is being asked – provides greater agility and can significantly improve the AI’s reasoning capabilities while reducing hallucinations. It’s an achievable shift in design that opens the door to a much bigger shift in capability.
Whatever’s next requires the right partner
Ultimately, it’s early days for all of these approaches, and the most likely outcome is that the future architecture will be a hybrid – combining elements of graph-based knowledge, symbolic reasoning, agentic iteration, and increasingly powerful models. RAG itself isn’t going away; rather, it’s being augmented and extended by these new capabilities.
Firms that want to ensure they aren’t left behind as the fundamentals of AI technology evolve will want to make sure they are partnering and collaborating with AI vendors who are already investing in these next-generation capabilities.
Such foresight is increasingly essential as the ground continues to shift beneath the industry. Ultimately, that close collaboration will be the best way for firms to close a reasoning gap that’s becoming impossible to ignore with current AI architecture – and smoothly transition from the ‘what’ to the richer ‘why’ and ‘how’ that their legal work demands.

