Mapping project complexity with AI and the TOE framework

How I turned the TU Delft TOE framework for project complexity into two AI workflow skills for diagnosis and practical project management interventions.

After completing the DelftX project management course in TU Delft’s Engineering Project Management series, the idea that stayed with me was not a template, a checklist, or a new way to make a Gantt chart look more impressive. It was simpler than that: before you manage a project, you need to understand what kind of complexity you are actually dealing with.

That sounds obvious until you look at how many projects are managed as if all complexity were the same thing. More tasks? Add more planning. More uncertainty? Add more meetings. More stakeholders? Add a RACI and hope it survives contact with reality.

Sometimes that works. Often it does not. The problem is that “this project is complex” is not a diagnosis. It is only a warning light.

The TOE framework developed by TU Delft researchers gave me a better way to think about that warning light. TOE stands for Technical, Organizational, and External complexity.

The framework was developed to help project teams grasp complexity in large engineering projects and adapt the front-end development phase accordingly. The published paper behind it is "Grasping project complexity in large engineering projects: the TOE framework".

I kept thinking about how useful this would be inside an AI-assisted workflow. Not because AI should manage the project for you. Because AI is useful when it is forced to structure messy thinking, and project complexity is usually very messy thinking.

So I built two new skills:

  • project-complexity-mapper, which diagnoses where complexity comes from.
  • project-complexity-action-planner, which turns that diagnosis into a practical intervention plan.

They live in the same my-ai-workflow repository as the rest of my AI workflow skills.

The core idea: not all complexity asks for the same response

The mistake I see often, including in my own thinking, is treating complexity as a single number. We say a project is “simple”, “complicated”, or “very complex”, then we jump straight to a management response.

That loses the interesting part.

A project can be technically complicated but organizationally stable. Another project can be technically simple but politically unpredictable. Another one can have both: many technical interfaces and stakeholders who are still negotiating what success even means.

Those are not the same project. They should not be managed in the same way.

The TOE framework helps by separating complexity into three families.

Technical complexity is about the content of the project: goals, scope, technology, tasks, dependencies, quality requirements, disciplines involved, and technical risks. In software terms, this is where we find things like legacy integrations, unclear architecture boundaries, strict performance requirements, many work packages, unfamiliar infrastructure, or unknown delivery methods.

Organizational complexity is about the internal project organization: team size, resource availability, contracts, trust, roles, methods, tools, funding sources, schedules, and coordination across disciplines or locations. In software teams, this often shows up as unclear ownership, unavailable specialists, misaligned delivery methods between teams, pressure to hit a date that was chosen before discovery, or contract structures that reward the wrong behavior.

External complexity is about the context around the project: external stakeholders, approvals, political influence, market instability, regulatory pressure, dependencies outside the team, strategic pressure, public trust, and other external risks. In a product or platform context, this may be customers with conflicting expectations, compliance deadlines, procurement constraints, partner APIs, legal uncertainty, or leadership changes that alter priorities halfway through the work.
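
To make the separation usable downstream, each finding needs to carry its category explicitly. Here is a minimal sketch of that shape in Python; the class and field names are my own illustration, not the schema the skills actually use.

```python
# Minimal sketch: tagging a finding with its TOE category.
# Names are illustrative, not the skills' actual schema.
from dataclasses import dataclass
from enum import Enum


class ToeCategory(Enum):
    TECHNICAL = "technical"            # content: goals, scope, tech, interfaces
    ORGANIZATIONAL = "organizational"  # internal: team, roles, contracts, methods
    EXTERNAL = "external"              # context: stakeholders, regulation, market


@dataclass
class Finding:
    category: ToeCategory
    description: str  # e.g. "nobody fully owns the old reporting logic"
    evidence: str     # what in the project description supports it
```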

The useful part is not the labels themselves. The useful part is that they stop you from pretending a technical solution can fix an organizational problem, or that a weekly status report can resolve an unstable external context.

Detail complexity and dynamic complexity

The other distinction I wanted the skills to preserve is the difference between detail complexity and dynamic complexity.

Detail complexity is about the number of parts and the structure between them. Many components, many interfaces, many work packages, many actors, many dependencies, many contracts. This kind of complexity often benefits from control: decomposition, interface registers, baselines, dependency maps, quality criteria, clear ownership, decision gates.

Dynamic complexity is about change, uncertainty, and unpredictability. Stakeholders shift position. Scope moves. The market changes. The regulator clarifies something late. The team learns that the original solution path does not work. The facts are incomplete and the project cannot simply be decomposed into stable pieces.

Dynamic complexity needs more than control. It needs interaction: shared problem framing, short feedback cycles, scenario exploration, visible assumptions, alignment sessions, escalation paths, and repeated reassessment.

This creates four broad management postures:

  • Low detail and low dynamic complexity: keep it simple.
  • High detail and low dynamic complexity: use systems management and control.
  • Low detail and high dynamic complexity: use interactive management and connecting work.
  • High detail and high dynamic complexity: use dynamic management, combining control with interaction.

That last category is where many real projects live. The risk is overcorrecting in one direction. Too much control, and the project becomes rigid exactly when it needs to learn. Too much interaction, and everyone keeps talking while decisions never land.
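
Reduced to code, the posture choice is a two-bit decision. Here is a deliberately crude sketch of the quadrant logic; in practice the mapper scores hotspots on a spectrum and assigns quadrants per hotspot, not only per project.

```python
def management_posture(high_detail: bool, high_dynamic: bool) -> str:
    """Map the two complexity dimensions to one of the four postures above."""
    if high_detail and high_dynamic:
        return "dynamic management: combine control with interaction"
    if high_detail:
        return "systems management: decomposition, baselines, interface control"
    if high_dynamic:
        return "interactive management: feedback cycles, alignment, reassessment"
    return "keep it simple"
```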

Why this belongs in an AI workflow

AI-generated project plans are often too tidy. They look coherent because they are formatted well, not because the underlying diagnosis is correct.

That is the danger. A model can produce a confident plan from a weak description. It can turn uncertainty into bullet points so smoothly that you forget the uncertainty was still there.

The TOE structure gives the model friction. It forces the conversation to separate the type of complexity from the action taken in response to it. It also makes confidence visible. A low-confidence hotspot is not ignored just because the model cannot fully prove it. If it could affect approval, trust, compliance, delivery feasibility, or project value, it deserves attention.

This is the same principle behind my earlier AI-assisted workflow: the AI can accelerate the production of artifacts, but the judgment remains yours. For project management, the artifact is not code, a PRD, or an issue list. It is a clearer picture of the project before the plan hardens.

Skill 1: project-complexity-mapper

The first skill is diagnostic. It takes a rough project description, messy notes, or a partially formed idea and turns it into a structured complexity assessment.

By default it runs as a fast complexity scan. That means it does not walk through every TOE element one by one. It reads the description, infers likely hotspots, asks only the few questions that would materially change the diagnosis, and marks confidence where evidence is thin.

If the project is high-risk, expensive, politically sensitive, or already showing signs of failure, it can switch to a complete TOE assessment. That mode uses the full 47-element TOE checklist and scores each element from 1 to 5: none, little, some, substantial, or very much.
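
The rating scale itself is small enough to show directly, using the five labels above:

```python
from enum import IntEnum


class ToeScore(IntEnum):
    """The 1-to-5 rating used in the complete TOE assessment."""
    NONE = 1
    LITTLE = 2
    SOME = 3
    SUBSTANTIAL = 4
    VERY_MUCH = 5
```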

The mapper is deliberately not an action planner. It produces a diagnosis, not a backlog.

The output includes:

  • A summary of the assessment mode and top complexity hotspots.
  • Findings by Technical, Organizational, and External category.
  • Scores for TOE contribution, detail complexity, dynamic complexity, severity, impact, and confidence.
  • An overall quadrant for the project.
  • Separate hotspot quadrants, because one average score hides too much.
  • A Mermaid quadrant chart to visualize where the main hotspots sit.
  • Management-fit guidance.
  • A handoff table that another skill can consume.

The handoff table matters. It makes the next step traceable. The action planner should not invent a generic project-management checklist. It should act on the diagnosed hotspots.
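
To make "consumable" concrete, a handoff row could look something like this. The column names are my guess at a reasonable shape, mirroring the scores listed above, not the skill's verbatim output:

```python
# Hypothetical handoff rows; field names mirror the scores listed above.
handoff = [
    {
        "hotspot": "legacy reporting logic has no owner",
        "category": "organizational",
        "detail": 2, "dynamic": 4,  # 1-to-5 scale from the assessment
        "severity": 4, "confidence": "medium",
        "quadrant": "low detail / high dynamic",
    },
    {
        "hotspot": "two internal APIs plus one external vendor feed",
        "category": "technical",
        "detail": 4, "dynamic": 2,
        "severity": 3, "confidence": "high",
        "quadrant": "high detail / low dynamic",
    },
]
```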

Skill 2: project-complexity-action-planner

The second skill starts where the mapper stops. It reads the diagnosis, especially the handoff table, management fit, hotspot quadrants, and findings by category. Then it converts those hotspots into a practical intervention plan.

The important word is intervention.

This is not a full project schedule. It is not a Gantt chart. It is not a ticket breakdown.

An intervention is a management action chosen because a specific hotspot needs it. If a project has high technical detail complexity around integrations, an intervention might be an interface register, a prototype for the riskiest API, or a dependency map with clear owners. If the hotspot is external and dynamic, the intervention might be a stakeholder alignment session, a scenario review, or an approval map. If the hotspot is both detailed and dynamic, the intervention should combine control and connection, for example a requirements baseline with a controlled discovery window, or decision gates paired with assumption reviews.
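
A sketch of that selection logic, reusing the hypothetical handoff shape from the mapper section; the intervention lists are the examples from this post, not a complete catalogue:

```python
# Illustrative quadrant-to-intervention mapping, using examples from this post.
INTERVENTIONS = {
    "high detail / low dynamic": [
        "interface register",
        "dependency map with clear owners",
        "prototype for the riskiest API",
    ],
    "low detail / high dynamic": [
        "stakeholder alignment session",
        "scenario review",
        "approval map",
    ],
    "high detail / high dynamic": [
        "requirements baseline with a controlled discovery window",
        "decision gates paired with assumption reviews",
    ],
}


def plan_for(hotspot: dict) -> list[str]:
    """Pick candidate interventions for one diagnosed hotspot."""
    return INTERVENTIONS.get(hotspot["quadrant"], ["keep it simple"])
```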

The action planner sequences work into three horizons:

  • Now: actions that clarify uncertainty, prevent blockage, or create needed control immediately.
  • Next: actions that depend on first facts, decisions, or alignment.
  • Later: actions to reassess after a phase change, pilot, procurement decision, design freeze, or other major shift.

It also assigns owner roles, not names, unless names are provided. That keeps the output useful across teams: project manager, sponsor, product owner, technical lead, integration lead, contract manager, legal or compliance lead, stakeholder lead, steering committee.

The final output includes an intervention backlog, sequencing logic, decision gates, reassessment triggers, risks in the plan, and a short handoff for execution.

Why I split it into two skills

I could have made one skill that diagnoses complexity and creates a plan in one pass. That would have been faster. It would also have been worse.

Diagnosis and action are different mental moves. When they happen at the same time, the model tends to rush toward advice. The format becomes a polished recommendation before the project has been understood.

Splitting the workflow keeps the first skill honest. Its job is to say, “Here is where complexity appears to be coming from, here is how severe it looks, here is how confident we are, and here is the management posture that seems to fit.”

Only after that does the second skill ask, “Given that diagnosis, what should we do first?”

That separation mirrors how I want to work with AI in general. Do not let the model skip the thinking artifact. Make the thinking artifact reviewable. Then use it as input to the next step.

How I would use it

I would start with a project description written in normal language. No template. No forced structure.

Something like:

```text
We need to migrate a legacy reporting system into a new platform.
The old system is used by three departments, each with different reporting definitions.
The source data comes from two internal APIs and one external vendor.
The deadline is connected to a contract renewal, but the scope is still moving.
The engineering team understands the new platform, but nobody fully owns the old reporting logic.
```

Then I would run the mapper and read the output critically. The important questions are not “Do I like this report?” but:

  • Did it classify the real hotspots?
  • Did it confuse technical complexity with organizational complexity?
  • Did it miss an external dependency?
  • Are the low-confidence findings actually important?
  • Are we averaging away a hotspot that needs separate attention?

If the diagnosis looks useful, I would pass the handoff into the action planner in a new session with a clean context window. Then I would review the intervention backlog with the same skepticism.

Every action should trace back to a hotspot. If an action does not trace back to a hotspot, it is probably generic advice. If a high-severity hotspot has no action, the plan is probably incomplete. If a dynamic hotspot only gets control artifacts, the plan is probably too rigid. If a high-detail hotspot only gets workshops, the plan is probably too vague.
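
The first two of those checks are mechanical enough to sketch, again assuming the hypothetical handoff shape from earlier and an action record that names the hotspot it serves:

```python
def review_plan(handoff: list[dict], actions: list[dict]) -> list[str]:
    """Flag actions with no hotspot and high-severity hotspots with no action.

    Assumes each action dict has "name" and "hotspot" keys; that shape is
    my assumption for this sketch, not the skill's guaranteed output.
    """
    warnings = []
    hotspots = {row["hotspot"] for row in handoff}
    for action in actions:
        if action.get("hotspot") not in hotspots:
            warnings.append(f"possibly generic advice: {action['name']}")
    covered = {action.get("hotspot") for action in actions}
    for row in handoff:
        if row["severity"] >= 4 and row["hotspot"] not in covered:
            warnings.append(f"no action for high-severity hotspot: {row['hotspot']}")
    return warnings
```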

The output should become a conversation artifact for the team, not a decree from the machine.

When this is useful

I see this being useful before a project execution plan, before a large PRD, before a replatforming effort, before a vendor integration, before a compliance-heavy feature, or when a project already feels stuck but the reason is not obvious.

It is especially useful when people disagree about what the problem is. One person says the project is technically hard. Another says the scope keeps changing. Another says the real blocker is approvals. Another says the team does not have the right people.

All of those can be true. The TOE framework gives you a way to hold those truths separately instead of collapsing them into one vague complaint.

It also helps with AI-assisted development because it sits upstream of implementation. If the project is organizationally unstable, breaking work into better tickets will not fix it. If the external context is volatile, a perfect architecture diagram will not make the decision process predictable. If the technical interfaces are unclear, more stakeholder alignment will not define the API contract by itself.

Different complexity asks for different management behavior.

When I would not use it

I would not use this for every task.

If the work is small, local, and well understood, this is overhead. If one person can explain the scope, implement it, test it, and ship it without meaningful dependencies, a TOE scan is probably unnecessary.

I also would not use it to justify a decision that has already been made. That is one of the easiest ways to misuse AI: ask for a framework, feed it selective context, and receive a professional-looking confirmation.

And I would be careful using it without the people who actually understand the project. Complexity is subjective and dynamic. Different actors see different risks because they sit in different parts of the system. The skill can structure the conversation, but it cannot replace the conversation.

The human part

The most valuable output of these skills is not the table. It is the disagreement the table makes visible.

If a stakeholder says a hotspot is scored too high, good. Ask why. If the technical lead says the integration risk is understated, good. Update the diagnosis. If the sponsor thinks the external pressure is a value opportunity rather than only a risk, good. That belongs in the management fit.

The goal is not to make project complexity disappear. Some complexity cannot be removed. Some complexity should not be removed, because it is tied to project value.

The goal is to stop managing every project with the same reflex.

Control where the project needs control. Connect where the project needs interaction. Reassess when the context changes. And keep the judgment with the humans who are accountable for the work.

The skills

The two skills are available in the my-ai-workflow repository:

  • project-complexity-mapper
  • project-complexity-action-planner

They are not meant to sell a methodology. They are my attempt to encode a useful way of thinking so I can reuse it, challenge it, and improve it over time.
