You Can’t AI Your Way Out of a Data Problem

[Image: abstract geometric blocks representing data layers]

There is a recurring conversation in enterprise portfolio management right now. It goes something like: "We have a data quality problem. We're going to fix it with AI." The logic feels sound. AI is good at processing messy data. Portfolio data is messy. Therefore AI should help.

The problem is that this gets the causality backwards. AI does not fix messy portfolio data. It operationalises it. Whatever inconsistencies, gaps, and ambiguities exist in your portfolio information today, AI will process them faster, surface them more confidently, and present them in a format that looks authoritative. The mess does not go away. It gets a better suit.

A faster signal from a flawed source

AI is genuinely good at processing volume: surfacing patterns, identifying anomalies, reducing manual effort, accelerating the production of portfolio-level insight. That capability is precisely why the underlying data quality matters so much.

If status criteria are inconsistent across initiatives, AI will reflect those inconsistencies at greater volume. If data is delayed by manual consolidation, AI will surface stale information more efficiently. If delivery methodologies vary, with waterfall programmes tracked one way and product-led initiatives tracked another, AI summaries will aggregate that variation without resolving it.

The portfolio view that reaches the governance forum will still reflect fragmented local judgement, only now it arrives faster and looks more polished. That is not an improvement. It makes the existing problem harder to detect and harder to challenge.

The problem is not data quality. It is context.

The instinct when AI fails to deliver useful portfolio insight is to blame the data. And data quality matters. But the deeper problem is more specific, and it is the same one tripping up AI deployments across every enterprise domain right now: the data lacks context.

The data infrastructure works. The pipelines run. The warehouses are populated. But when an AI system tries to answer even a straightforward question, it hits a wall because nobody has codified what the data actually means. In analytics, a question as simple as "what was revenue growth last quarter?" breaks down immediately: which revenue definition applies, which fiscal quarter, which source table is authoritative. The definitions, business logic, and institutional knowledge that sit between raw data and meaningful answers have never been written down in a way machines can use.

Portfolio management has the same gap, with higher stakes. When someone asks "which initiatives are at risk?", the answer depends on what "at risk" means, and that varies by organisation, by methodology, and often by who you ask within the same organisation. Is it RAG status? Cost variance against baseline? Schedule slippage beyond a threshold? Dependency exposure? The data to answer any of those questions probably exists somewhere in the estate. The context to interpret it does not.
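The point is easier to see in code. Here is a minimal sketch of what codifying "at risk" might look like: the thresholds and field names below are illustrative assumptions, not a recommendation, but once they are written down the question has one answer instead of one per person asked.

```python
from dataclasses import dataclass

@dataclass
class Initiative:
    name: str
    cost_variance_pct: float        # actuals vs baseline; positive = overspend
    schedule_slip_days: int         # days behind the agreed baseline
    open_critical_dependencies: int

# Hypothetical thresholds -- each organisation sets its own. What matters is
# that they are agreed once and applied uniformly, not re-judged per initiative.
def at_risk(i: Initiative) -> bool:
    return (
        i.cost_variance_pct > 10.0
        or i.schedule_slip_days > 20
        or i.open_critical_dependencies > 0
    )

portfolio = [
    Initiative("Core Banking Migration", 2.0, 0, 0),
    Initiative("Digital Onboarding", 4.0, 28, 0),   # two sprints behind
    Initiative("Regulatory Reporting", 1.0, 5, 1),  # unresolved dependency
]

flagged = [i.name for i in portfolio if at_risk(i)]
```

With an explicit predicate like this, an AI system can be pointed at the definition rather than asked to guess what "at risk" means from inconsistent usage.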

In data infrastructure, practitioners have started calling this the "context layer": the structured, maintained body of business logic, definitions, and rules that sits between raw data and the AI systems consuming it. It is, essentially, the codification of tribal knowledge into something machines can reason over. Portfolio management needs exactly the same thing, and almost nobody has built it.

What a context layer looks like in practice

For portfolio management, the context layer has three components.

Consistent measures with explicit definitions. Not a governance document that nobody reads. A binding standard for what each measure means, how it is assessed, and what thresholds trigger action. RAG criteria are the obvious starting point, because they are where the gap between perceived and actual portfolio health originates. Where status is self-assessed without agreed definitions, the same colour means different things across different initiatives, and those differences compound silently as they roll up into portfolio views.
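A binding RAG standard can be expressed as a deterministic rule: the same inputs always produce the same colour, and stale data is barred from reporting green. This is a sketch under assumed thresholds, not a prescription; the actual numbers belong to the governance function.

```python
def rag_status(cost_variance_pct: float, schedule_slip_days: int,
               status_age_days: int) -> str:
    """Deterministic RAG: the same inputs always yield the same colour.

    Thresholds are illustrative. The binding standard is whatever the
    governance function agrees and writes down.
    """
    # Severe variance is red regardless of anything else.
    if cost_variance_pct > 15 or schedule_slip_days > 30:
        return "red"
    # Data older than the freshness threshold can never report green.
    if status_age_days > 14:
        return "amber"
    if cost_variance_pct > 5 or schedule_slip_days > 10:
        return "amber"
    return "green"
```

Self-assessment disappears: a project manager can dispute the inputs, but not privately redefine what green means.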

Here is how the same status colour can mean four entirely different things:

| Initiative | Status | What it actually reflects | The governance problem |
| --- | --- | --- | --- |
| Core Banking Migration (Waterfall · IT) | 🟢 | Genuine confidence. Milestones met. Budget on track. Risks identified and owned. | Green means what it says. This is the exception. |
| Digital Onboarding (Agile · Product) | 🟢 | Optimism. Velocity is down. Two sprints behind. Reported green because the deadline is still "technically achievable." | No agreed definition of green. PM applies their own judgement. |
| Regulatory Reporting (Waterfall · Compliance) | 🟢 | Avoidance of scrutiny. A key dependency is unresolved. Reported green to avoid escalation before the next governance cycle. | RAG used as a communication tool, not a factual signal. |
| Cloud Infrastructure (Hybrid · Platform) | 🟢 | Stale data. Status was updated three weeks ago. Conditions have changed. Nobody has flagged it because reporting is manual. | Manual consolidation means the governance forum sees last cycle's reality. |

Most enterprise portfolios look like this right now. AI will not fix it. AI will make it harder to see, because the summary will read as coherent even though the underlying signals are not.

Direct data flows, not manual assembly. Portfolio data should flow directly from delivery and financial systems rather than being assembled by hand each cycle, because every manual consolidation step introduces delay, interpretation, and silent data loss. Organisations still running monthly reporting packs assembled in spreadsheets don't have a technology problem. They have an architecture gap. The data already exists in source systems. It is the connection that is missing, and that connection is what allows a context layer to function in real time rather than being rebuilt every reporting cycle.
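What the automated flow buys you is freshness you can verify. A minimal sketch, assuming status records arrive from scheduled source-system pulls (the record shapes and the seven-day threshold are invented for illustration): consolidation becomes a function that timestamps everything and flags staleness instead of silently presenting last cycle's reality as current.

```python
from datetime import date

# Hypothetical extracts as they might arrive from source systems
# (Jira/ADO, finance) via scheduled pulls rather than manual re-keying.
status_records = [
    {"initiative": "Cloud Infrastructure", "status": "green",
     "as_of": date(2024, 5, 1)},
    {"initiative": "Digital Onboarding", "status": "amber",
     "as_of": date(2024, 5, 20)},
]

def consolidate(records, today, max_age_days=7):
    """Build the portfolio view, marking anything older than the
    freshness threshold instead of presenting it as current."""
    view = []
    for r in records:
        age_days = (today - r["as_of"]).days
        view.append({**r, "stale": age_days > max_age_days})
    return view

view = consolidate(status_records, today=date(2024, 5, 22))
```

The Cloud Infrastructure record from three weeks ago now arrives flagged, which a spreadsheet pack assembled by hand would never do on its own.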

Maintained institutional logic. Even with consistent definitions and automated data flows, there is a layer of institutional knowledge that exists only inside people's heads: which programme's financials need to be read differently because of a contract structure, which team's velocity numbers are misleading because they changed their estimation approach mid-year, which dependency is technically resolved on paper but practically still a risk.

This is the component that makes data quality programmes stall, because it cannot be solved with a standard or a pipeline. In analytics, practitioners are finding that AI can bootstrap a significant portion of the context layer by crawling documentation, ingesting data models, and analysing query history to identify the most referenced tables and common joins. But the institutional knowledge, the conditional, historically contingent logic that experienced practitioners carry around and apply instinctively, has to be captured through deliberate human input. Someone has to say: "for this programme, ignore the baseline figures before Q2 because we rebased after the scope change." No amount of automation surfaces that. The context layer only works when the formal and the informal are maintained together, continuously, as a living system rather than a one-off documentation exercise.
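Captured deliberately, that kind of knowledge becomes data the pipeline can apply. A sketch with invented field names and figures, showing the rebasing example from above recorded as an override with its rule, scope, and reason, rather than living in someone's head:

```python
# Institutional knowledge captured as data, not tribal memory. Each override
# records what to adjust, where it applies, and -- crucially -- why.
overrides = [
    {
        "programme": "Core Banking Migration",
        "rule": "ignore_baseline_before",
        "value": "2024-Q2",
        "reason": "Rebased after scope change; earlier baselines not comparable.",
        "owner": "portfolio-office",
    },
]

def baselines_for(programme, baselines, overrides):
    """Apply recorded overrides before figures reach any AI summary.

    Quarter keys like '2024-Q2' compare lexically within a year, which is
    sufficient for this sketch.
    """
    for o in overrides:
        if o["programme"] == programme and o["rule"] == "ignore_baseline_before":
            baselines = {q: v for q, v in baselines.items() if q >= o["value"]}
    return baselines

clean = baselines_for(
    "Core Banking Migration",
    {"2024-Q1": 1.2e6, "2024-Q2": 2.0e6, "2024-Q3": 2.1e6},
    overrides,
)
```

The pre-rebase quarter never reaches the summary, and the `reason` field means the exclusion survives the departure of the person who knew why.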

Where the context layer fits

Most organisations try to jump straight from raw data to AI-generated portfolio insight. The missing layer in between is why it fails.

Without a context layer, AI amplifies the mess. Raw data from the source systems (Jira/ADO, finance systems, PPM tools, spreadsheets) flows directly into AI; the context layer of definitions, data flows, and institutional logic is skipped, so business logic is assumed rather than codified. The output looks authoritative but reflects inconsistent source data. The governance forum receives:

- An AI-generated summary that looks coherent but reflects four different definitions of "green"
- Stale data presented with false confidence
- Portfolio health overstated, unchallenged

With a context layer, AI delivers real intelligence. The same raw data sources feed a codified layer of explicit definitions, direct data flows, and maintained institutional logic before anything reaches the AI. The governance forum receives:

- A consistent, current portfolio view with a single definition of status across all initiatives
- Real-time data from source systems, not last cycle's manual pack
- Capacity recovered for strategic judgement

Accountability before technology

None of this works without ownership.

Delivery teams will maintain their operational views. Finance will maintain financial reporting. But someone has to own the coherence of the whole: what gets measured, what gets retired, and what the numbers actually mean when they sit alongside each other in a portfolio view. In most organisations, that belongs with the portfolio governance function, not to produce more reports, but to protect the integrity of portfolio information and maintain the standards that make it trustworthy.

Standards without ownership are just documentation. Ownership without standards is just authority.

This is the part that cannot be automated. AI can maintain a context layer once it exists. It cannot create the organisational will to build one.

Then AI delivers

Get the context layer right and the payoff is not incremental. The governance forum stops receiving a reporting pack that reflects last month's reality and this cycle's formatting preferences and starts receiving current, consistent portfolio intelligence. The portfolio function recovers capacity for the work that actually matters: interrogating investment decisions, stress-testing strategic trade-offs, connecting delivery performance to outcomes.

AI handles the data processing. Humans handle the judgement. That is the correct division of labour, and it only works when the foundation is right.

Sequence over speed

The organisations pulling ahead are not the ones that moved fastest on AI. They are the ones that fixed the foundation first.

The technology is ready. The context is not. Organisations that invest in codifying their business logic, standardising their definitions, and connecting their data flows before layering AI on top will extract compounding value. Those that skip straight to deployment will get polished summaries of unreliable data, and they will struggle to explain why their expensive AI investment hasn't improved decision quality.

The sequence matters more than the speed. And the risk is not that your portfolio data is poor. It is that the people presenting it believe it is good, and that belief goes unchallenged.

Hybrid Portfolios Hybrid Lifecycle Prioritization Resource Management Capacity Planning OKRs CapEx / OpEx Status Reporting Jira Best Practice Budgeting & Forecasting Business Cases Governance & Guardrails Roadmaps & Planning Strategic PMO Benefit Realization RFPs