Staff Engineering: Making Decisions That Outlast You

The hardest part of the Staff level isn't writing better code. It's making architectural decisions that remain legible and defensible after you've left the team.

I’ve left three long-term engineering engagements in my career. Each time, I’ve gone back and looked at what survived — not just what worked technically, but what remained understandable and intentional to the teams that inherited it.

The results were humbling.

A system I was proud of had become “legacy” within months, not because the technology was wrong, but because the team didn’t know why we’d made the key tradeoffs. When the constraints behind those tradeoffs changed, they couldn’t adapt the architecture because they didn’t know what it was protecting against.

This is the central problem of Staff-level engineering: you’re not writing code for today’s team. You’re writing it for a team that doesn’t exist yet, with requirements that haven’t been stated.

The difference between a decision and a default

Most architectural decisions aren’t made — they accumulate. Nobody decided that all API calls should go through a global axios instance. One person did it, it worked, others followed the pattern, and now it’s “how we do things here.”

Accumulated defaults are invisible. Nobody can explain why they exist, which means nobody can evaluate whether they’re still appropriate. When a new engineer joins and asks “why do we do it this way?” the answer is “that’s how it’s always been done” — which is exactly the answer that precedes a costly refactoring effort six months later.

A real decision is different. A real decision has:

  • Context: what was true at the time that made this the right choice
  • Rationale: why this option over the alternatives
  • Consequence: what we gave up by choosing this

When all three are documented, the decision is legible. Someone inheriting the system can read it and say “the context has changed, so the decision should be revisited” or “the context still applies, so we should defend this boundary.”

Architecture Decision Records

ADRs are the most underrated tool in a Staff engineer’s toolkit. They’re short documents — typically one page — that capture a single architectural decision in a standard format.

At Scotiabank CCAU, I introduced ADRs at the start of the OneBank component library work. Here’s what a real one looked like:


ADR-012: Use discriminated unions for form schema typing

Status: Accepted

Context: The loan application form has conditional field requirements based on applicant type (natural person vs legal entity). Initial implementation used optional fields with runtime checks. This created TypeScript types that were too permissive and allowed invalid combinations to compile.

Decision: Use Zod’s z.discriminatedUnion keyed on personType. All conditional schemas extend a base schema and are composed at the union level.

Consequences:

  • TypeScript narrows the type correctly in conditional branches (positive)
  • The schema structure must match the UI’s discriminant field: renaming personType or changing its existing values requires updating all schemas (tradeoff)
  • Adding a new person type is additive — no existing schemas need modification (positive)

Alternatives considered: Optional fields with manual type guards; separate form components per person type. Both rejected because they move conditional logic into the UI layer.
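
Outside the ADR itself, the narrowing behaviour it relies on can be sketched in a few lines. ADR-012 uses Zod’s z.discriminatedUnion; the sketch below swaps that for plain TypeScript types to stay dependency-free, so it shows only the compile-time side, not runtime validation. Only personType comes from the ADR. The field names, the "natural" and "legal" values, and the describeApplicant helper are hypothetical.

```typescript
// Hypothetical fields; only the `personType` discriminant is named in the ADR.
interface BaseApplicant {
  email: string;
}

interface NaturalPerson extends BaseApplicant {
  personType: "natural";
  nationalId: string;
}

interface LegalEntity extends BaseApplicant {
  personType: "legal";
  taxId: string;
  legalRepresentative: string;
}

// The union is composed at the top level, keyed on `personType`.
type Applicant = NaturalPerson | LegalEntity;

function describeApplicant(applicant: Applicant): string {
  switch (applicant.personType) {
    case "natural":
      // Narrowed to NaturalPerson: `taxId` is not accessible here.
      return `person:${applicant.nationalId}`;
    case "legal":
      // Narrowed to LegalEntity.
      return `entity:${applicant.taxId} (rep: ${applicant.legalRepresentative})`;
    default: {
      // Adding a third person type without handling it fails to compile here.
      const unreachable: never = applicant;
      return unreachable;
    }
  }
}

const person: Applicant = {
  personType: "natural",
  email: "a@example.com",
  nationalId: "12345678-9",
};

const company: Applicant = {
  personType: "legal",
  email: "b@example.com",
  taxId: "76543210-K",
  legalRepresentative: "Jane Doe",
};

console.log(describeApplicant(person));
console.log(describeApplicant(company));
```

With the optional-field approach the ADR rejected, an object with personType "natural" and no nationalId would compile. Here it is a type error, which is the first consequence the ADR lists.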


The ADR took thirty minutes to write. It has prevented three “why do we do this?” conversations from turning into refactoring proposals.

The format doesn’t matter much. What matters is: context, decision, consequences, alternatives. The alternatives section is the one people skip and the one that’s most valuable, because it shows the decision was deliberate, not accidental.
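
If a team keeps ADRs as structured data rather than free text, the four sections can be made hard to skip. The following is a minimal sketch under that assumption. The DecisionRecord shape and the missingSections helper are hypothetical and not part of any ADR tooling.

```typescript
// Hypothetical shape: the four sections named above, plus an id and a title.
interface DecisionRecord {
  id: string;
  title: string;
  context: string;
  decision: string;
  consequences: string[];
  alternatives: string[];
}

type Section = "context" | "decision" | "consequences" | "alternatives";

// Returns the sections that were left empty so a reviewer can ask for them.
function missingSections(record: DecisionRecord): Section[] {
  const missing: Section[] = [];
  if (record.context.trim() === "") missing.push("context");
  if (record.decision.trim() === "") missing.push("decision");
  if (record.consequences.length === 0) missing.push("consequences");
  if (record.alternatives.length === 0) missing.push("alternatives");
  return missing;
}

const draft: DecisionRecord = {
  id: "ADR-012",
  title: "Use discriminated unions for form schema typing",
  context: "Conditional field requirements depend on applicant type.",
  decision: "Use a discriminated union keyed on personType.",
  consequences: ["Types narrow correctly in conditional branches."],
  alternatives: [], // the section people tend to skip
};

// Flags "alternatives" as the only missing section.
console.log(missingSections(draft));
```

A check like this only catches empty sections. Whether the alternatives were considered seriously still has to be judged by a reader.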

Influencing without authority

Staff engineers typically don’t have direct reports. Their influence comes from technical credibility, clear communication, and timing.

The mistake I made early at this level: I thought being technically right was sufficient. If my argument was correct, the decision would follow. This is wrong. Engineering organizations don’t make decisions based on technical correctness alone. They make decisions based on risk, trust, relationships, and timing.

What actually works:

Make the tradeoffs explicit, not the conclusion. Instead of “we should use a monorepo,” write a document that says “here are the options, here are the tradeoffs, here’s my recommendation and why.” People who are skeptical of your conclusion will engage with the tradeoffs. That engagement is where alignment happens.

Timing matters as much as content. A proposal to restructure the component library landed very differently in sprint 1 (when nobody had opinions) versus sprint 12 (when everyone had commitments). The same proposal, same content, completely different reception. Architectural proposals need to land before people are invested in the current approach.

Write the RFC, even if you don’t need to. A written document forces you to steel-man the alternatives. It creates a shared artifact that people can comment on asynchronously. It signals that you’ve thought carefully about the problem. And it creates the paper trail that ADRs later reference.

Onboarding speed as an architecture quality signal

There’s a metric I’ve started treating as a proxy for architectural quality: time to productive for a net-new engineer on the team.

Not “time to set up the dev environment” — that’s infrastructure. Time to productive means: how long until a new engineer can make a non-trivial change to the system with reasonable confidence that they understand what they’re doing?

A well-architected system has clear entry points, obvious conventions, documented decisions, and strong constraints enforced by the type system. A new engineer can follow the patterns without needing to deeply understand the whole system.

A poorly architected system requires extensive context before any change is safe. Every modification has hidden implications. The team carries the context in their heads, which means it leaves when they do.

If a new engineer needs more than two weeks to reach productive contribution, that’s a signal: not necessarily that the system is bad, but that it’s not communicating itself clearly enough. The fix might be documentation, better abstractions, or naming. It’s rarely obvious, which is why the metric is useful: the measurement forces you to find out.

What outlasts individuals

Looking back at what actually survived across the engagements I’ve been part of: it’s never the clever code. It’s the boring code with clear names, the data models that matched the domain vocabulary, and the decisions that were written down with their context.

At BHP, we built mining operations software that was in production for eight years across two countries. The parts that aged well weren’t the technically interesting parts. They were the parts where someone had taken the time to write down what the system was, why it worked the way it did, and what it was not supposed to do.

The systems that are still legible five years later are the ones where someone cared enough to explain what they were doing and why. That’s 90% of the Staff engineer’s job.

The code is almost the least important part.