A head I once worked with led an oversubscribed independent girls’ school in suburban Surrey. By every available measure she was a successful head. Results were strong. Parents were satisfied. Inspection outcomes were good. She had a reputation for bluntness, for plain speaking, and for expectations of her staff that bordered on the unreasonable. She was trusted from above to deliver results. Those below her, however, knew she did not always have their best interests at heart. But the numbers were strong, and the numbers were what the CEO saw, himself a numbers man.
So when the opportunity arose, she was promoted to oversee all the independent schools in the trust. The logic was simple enough: if she could do it once, she could do it again and again.
She brought with her the whole epistemologically flawed apparatus of measurement that had appeared to drive the Surrey result: target grades, predicted grades, value-added tracking at student level. With this, she arrived in mixed schools, in less selective intakes, in less affluent catchments, and applied the same instruments with the same conviction, expecting the same results. Spoiler alert: the results did not follow. What followed, though, was friction, resistance, and heads who stopped questioning her approach because pushing back was pointless. Heads came and went. And heads rolled.
Dylan Wiliam used to caution against overstretching the research evidence in education. If you want a school to improve, he said, the most reliable advice is to make it Catholic, move it to the suburbs, and expel the boys. Most of what we attribute to leadership practice in successful schools is in fact attributable to the conditions in which that practice was deployed.
This is the question I keep returning to: why do we keep promoting the wrong people?
Simon Sinek tells an illuminating story about US Navy SEAL selection: when he asked the SEALs how they choose candidates for their most elite teams, they drew him a simple graph. One axis: performance. The other: trust.
Nobody, they said, wants the low performer of low trust. Everybody wants the high performer of high trust. The interesting question is the top-left quadrant, the person who produces results but cannot be trusted. And here is what the highest-performing military organisation on the planet told him: they would rather have a medium performer of high trust. Sometimes, even a low performer of high trust. Because the high performer of low trust is not a strong team member. They are a toxic one. (You can watch the full clip from Simon Sinek here.) And the damage they do outlasts any result they deliver.
The point is not that results don’t matter. The point is that results delivered by someone who cannot be trusted are mortgaged against the future, in the way short-termist political wins so often are: banked now by the people in charge, paid for later by whoever inherits the consequences.
Now notice the question the matrix forces us to ask: trusted by whom? The Surrey head was high-trust with her CEO. The board of trustees supported her. The numbers vouched for her. But trust is directional, and the trust that matters most is the trust of the people closest to the coalface. A leader can sit comfortably in the top-right quadrant of one relationship and the top-left of another. We promote on the basis of the first and live with the consequences of the second.
So the question isn’t whether we can see the failure. We usually can, eventually. The question is why we keep selecting for it.
Rory Sutherland has a useful distinction here: efficiency versus efficacy. Cost reduction versus value creation. Organisations consistently optimise for what they can measure, and in schools we can measure an enormous amount of output. Exam results. Progress scores. Attendance data. Budget models. Inspection outcomes. Contact time.
What we cannot measure, at least not with anything like the same precision, is trust, candour, psychological safety. The willingness of a teacher early in their career to challenge an SLT decision. The retention of the brilliant head of department who holds three subjects together without ever asking for credit. The quality of attention in a staff meeting. The wider context matters.
So we build selection processes, appraisal systems, and promotion panels that reward the measurable and ignore the rest. We promote on individual performance and discover, too late, that the performance was partly a function of the setting. We promote on the basis of the trust the board can see, and miss or, worse, ignore the lack of trust elsewhere.
We promote toxicity and call it meritocracy. We promote contextual luck and call it talent.
Where this goes wrong in school leadership selection
Trustees and governors typically see performance data in abundance and culture data not at all. The board sees the headline figures, the SEF, the inspection outcome. It does not see the eyes rolling at the staff meeting, the middle leaders who stop asking questions, the deputy who starts to eye up a sideways move at a different school.
When the numbers look good, the board is structurally disinclined to ask whether they are costing too much, or whether they were ever really the leader’s to take credit for in the first place. So it signs off on the promotion, renews the contract, awards the bonus. The conditions that produced the result remain invisible and thus unconsidered. The leader who misreads them gets promoted on the strength of factors they were never fully responsible for. And the cycle continues.
What to select for when promoting school leaders
Here is the reframe:
We have been selecting for individual performance. We should be selecting for multiplied performance: the capacity to make the people around you better, in conditions that may be nothing like the ones you inherited.
Three practical moves.
Ask different questions in interview. Most panels are optimised to confirm competence. Competence is necessary and already well covered. What panels consistently miss is relational and contextual evidence. Try asking:
- Tell me about someone you developed who went on to surpass you.
- When did you last change your mind because of pushback from a junior colleague?
- Who on your current team disagrees with you most often, and how do you know?
- Walk me through a time the conditions changed under you. What did you stop doing?
These are hard to answer well without real evidence of practice. A candidate who cannot name the person who challenged them last month is telling you something important. A candidate who has never had to abandon a method that was working is also telling you something.
Change what you ask references for. Most reference conversations optimise for performance confirmation: did they hit targets, did they improve outcomes, would you describe them as effective? Ask instead: Would you want to work with this person again? Would you want your best colleague to serve under them? What happened to the people who worked most closely with them: did they flourish, or did they leave? Would their approach work in a school unlike yours? Retention of good people is a trust metric hiding in plain sight. Transferability is a contextual one. This is as important as grades on a spreadsheet, if not more so.
Measure the invisible by proxy. You cannot directly measure trust. But you can measure staff retention by line manager rather than by school; upward feedback trends; who speaks, who is interrupted, and who stops bothering in meetings; how often a junior colleague actually changes a decision. None of these metrics is perfect. But all of them beat measuring nothing, which is what most schools currently do. We confuse fluency and speed with strategic thought. We can do better, with a little patience.
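For readers who want to see what “retention by line manager rather than by school” means in practice, here is a minimal sketch. The staff records and manager names are invented for illustration; a real school would pull equivalent fields from its HR or MIS system.

```python
# Hypothetical illustration: retention grouped by line manager, not by school.
# All names and data below are invented for the sake of the example.
from collections import defaultdict

staff = [
    # (teacher, line_manager, still_employed_two_years_later)
    ("A", "Head of Science", True),
    ("B", "Head of Science", True),
    ("C", "Head of Science", True),
    ("D", "Head of English", False),
    ("E", "Head of English", False),
    ("F", "Head of English", True),
]

def retention_by_manager(records):
    """Return {line_manager: fraction of direct reports retained}."""
    totals = defaultdict(int)
    retained = defaultdict(int)
    for _teacher, manager, stayed in records:
        totals[manager] += 1
        if stayed:
            retained[manager] += 1
    return {m: retained[m] / totals[m] for m in totals}

print(retention_by_manager(staff))
```

The whole-school figure here would be 4/6 and would look unremarkable; split by line manager, one team retains everyone and another loses two-thirds. That disaggregation is the point of the proxy.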
If the only thing you measure is output, output is the only thing you will get. And eventually you will not even get that, because the people who made the output possible will have left. Silently, politely, for jobs in schools that did notice them.
The real cost of promoting the wrong people
Return to the scene at the top of this piece. The most uncomfortable detail of the story is the one I have not yet told you. The head in question is a good person; she is not overtly toxic. She continues to succeed, in her way. The metrics still back her up. The CEO is still pleased. The board still sees the dashboard and signs off on the strategy. The cost has been paid, and continues to be paid, by the heads who left, by the deputies who stopped speaking up, by the staff who withdrew their discretionary effort, by the children whose teachers felt they had no influence on their own practice. None of that appears in any trust-level report. And I doubt it ever will.
This is the part of the argument that I find most challenging. We tend to assume that systems eventually correct themselves, that bad selection decisions surface in the numbers, that the truth comes out. Sometimes it does. Often it does not. The lopsided metrics are not a temporary blind spot waiting to be addressed. They are the structure within which the leader continues to be successful, by the only measure the structure can see.
We do not promote leaders to produce results. We promote them to create the conditions in which other people produce results, over a long time, in settings that may bear little resemblance to the ones in which they first succeeded. That is what a healthy culture actually looks like and what emotionally mature leadership actually does. It is harder to see, harder to measure, and almost never the thing we get rewarded for choosing.
In politics they say we get the leaders we deserve. In schools, we get the leaders we measure for, on the schedule by which we measure them.