Europe is about to make human-first software non-optional – Steady


In May 2026, Pope Leo XIV released his first encyclical, "Magnifica humanitas," a text devoted to safeguarding the human person in the time of artificial intelligence (Leo XIV, 2026). He chose to present it in the Vatican alongside an unexpected guest, Chris Olah, a co-founder of Anthropic and one of the people building the frontier systems the document examines.
Olah used the occasion to say something a technology executive is not expected to say. Every frontier lab, his own included, operates inside incentives that can conflict with doing the right thing, and no amount of good intention escapes them. For that reason, he argued, the world needs people outside those incentives, governments and civil society among them, to set the limits that the labs cannot be trusted to set for themselves. "We need informed critics who will tell the labs when we are failing," he said. "We need moral voices that the incentives cannot bend" (Olah, 2026).
When a co-founder of a leading AI company travels to Rome to ask the outside world to regulate his own industry, the familiar claim that regulation is the enemy of progress has lost its most credible narrator.
The decisive question for software over the next decade is not which model your team adopts. It is whether the people who build software stay in authority over it as artificial intelligence spreads through the work, and well-designed regulation is how Europe is about to make the human-first answer binding rather than aspirational.
The dominant industry story says the opposite, treating the European Union's regulatory programme as a brake on ambition that cedes the future to firms operating under looser rules. Olah's testimony points the other way. When every commercial incentive pushes toward letting AI quietly decide more about how engineers work and how they are judged, regulation is the one force currently strong enough to hold the line on a human-first path, because that path is rarely the cheapest one in the short term, and the market will not protect it unaided.
This is not an argument against measuring software work, and it is not a complaint about developer-experience tooling. Good measurement, done at the level of the team and used to improve how people work, is part of a human-first path. The concern is narrower and sharper: the growing class of systems that use AI to evaluate, rank, and decide about individual engineers, with the human reduced to someone who approves a number the machine produced. The AI Act draws a line precisely there, and the line it draws is the clearest statement in current law of what human-first software actually demands.
 
Start with what the commitment means in practice, because the phrase is easy to wave at and hard to honour. The Copenhagen Manifesto sets out the test directly: generative AI in software engineering must be evaluated first by what people need, and only then by what the technology can do (Russo et al., 2024). A tool earns its place not by raw speed but by its effect on the wellbeing, learning, and autonomy of the engineers who use it. That ordering, people first and technology second, is the whole of the human-first idea.
The evidence on adoption explains why the ordering is not sentimental. Russo's study of generative AI adoption across software engineering, built from structured interviews with one hundred software engineers and a mixed-methods design, found that uptake is governed by individual, technological, and social factors acting together rather than by any property of the tool in isolation (Russo, 2024). The resulting Human-AI Collaboration and Adaptation Framework treats adoption as something people do inside a context, weighing whether a tool fits how they already work and who they understand themselves to be. A system that designs the person out of that context, treating the engineer as an input to be optimised rather than a professional to be supported, is working against the only mechanism by which adoption succeeds.
A human-first path, then, is not a slogan about being nice to engineers but a design discipline. It keeps a competent person in genuine authority over consequential decisions, it preserves the autonomy that lets expert judgment operate, and it refuses to let efficiency erase accountability. The question for European software is whether anything ensures teams actually build this way when the cheaper alternative is always available. Left to incentives alone, the answer is no.
This is where the AI Act becomes concrete rather than abstract. Annex III of Regulation (EU) 2024/1689 lists the uses the law treats as high-risk, and point 4(b) names AI intended to evaluate and monitor the performance and behaviour of people in work relationships, to allocate tasks based on individual behaviour or traits, or to inform promotion and termination (European Commission, 2024). A system that scores engineers on responsiveness, reliability, or output, and routes those scores toward a decision about pay, ranking, or continued employment, falls inside that definition and inherits the full obligation set that comes with high-risk status.
Two design choices in the law make this hard to evade, and both express the human-first principle in legal form.
The first concerns profiling. Some Annex III systems can argue out of high-risk status by showing they perform only a narrow preparatory function, but the Commission's draft classification guidelines, published for consultation in May 2026, remove that escape for systems that profile a person under the General Data Protection Regulation (European Commission, 2026). Evaluating a developer's reliability and behaviour is profiling by definition, so the exemption that rescues other workplace tools does not reach this use. The law treats judging a person as categorically more serious than processing a thing.
The second choice addresses the comfortable belief that a human reviewer makes an automated scoring system safe. The guidelines reject that reasoning at its root.
"Since human involvement cannot change the purpose and area in which a system is intended to be used, it has no effect on the classification of the system as high-risk under Article 6(2)." (European Commission, 2026, para. 70)
The implication is the human-first principle stated as a rule. A human placed in the loop is there to exercise real authority over the outcome, which is a duty the law imposes under Article 14, and not a token whose presence lets the classification disappear. A manager who rubber-stamps an algorithmic ranking has not made the system safe; he has demonstrated why oversight was required to begin with. The guidelines extend the same logic to architecture, assessing a pipeline split across separate telemetry, scoring, and reporting components as a single system where its combined outputs shape a decision about a person (European Commission, 2026, para. 75). A modular design cannot launder an evaluative purpose into something the law overlooks.
Read carefully, the rule is permissive about exactly the practices a human-first team would already accept. Systems that surface neutral, objective factors such as availability or location stay outside the high-risk category. Analytics that aggregate to the team level, without ranking identifiable individuals and without feeding pay, promotion, or task allocation, also stay outside it. The boundary the regulation draws is the boundary between measuring work to help a team improve and using AI to pass judgment on a person.
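The boundary described above can be sketched as a small decision function. This is a minimal illustration of the reading given in this article, not a legal test: the `SystemProfile` type, its field names, and `is_high_risk` are hypothetical, and a real classification would rest on the text of Annex III and the final guidelines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemProfile:
    """Hypothetical description of a developer-facing AI system."""
    name: str
    evaluates_individuals: bool      # scores or ranks identifiable engineers
    feeds_employment_decision: bool  # routes to pay, promotion, tasks, termination
    has_human_reviewer: bool         # recorded, but deliberately ignored below


def is_high_risk(system: SystemProfile) -> bool:
    """Apply the article's reading of Annex III point 4(b).

    Assumption: a system is treated as high-risk when it evaluates
    identifiable individuals or feeds an employment decision. The
    presence of a human reviewer is never consulted, mirroring the
    draft guidelines' position that it has no effect on classification.
    """
    return system.evaluates_individuals or system.feeds_employment_decision


team_dashboard = SystemProfile("team-flow-analytics", False, False, False)
reviewed_ranker = SystemProfile("engineer-ranker", True, True, True)

print(is_high_risk(team_dashboard))   # team-level analytics stay outside
print(is_high_risk(reviewed_ranker))  # a reviewer does not change the result
```

The design choice worth noticing is that `has_human_reviewer` is carried in the record but plays no part in the outcome, which is the point the guidelines make about human involvement.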
If the human-first path were also the cheapest path, no regulation would be needed. It is not, and that gap is the honest case for the law.
Individual scoring is attractive precisely because it is cheap, legible to executives, and easy to automate. A dashboard that ranks engineers on commits or closed tickets produces a number a leader can act on without understanding the work behind it. The trouble is that the number is a poor measure and an active hazard. Software value is a joint product of people, code, and process under uncertainty, and decomposing that system into per-engineer counters discards the interaction effects that explain most of the variance in outcomes. Worse, a metric attached to evaluation becomes a target within about a quarter, and from that point it measures the behaviour it rewards rather than the work it was meant to track. The cheap path is also the path that quietly degrades the thing it claims to manage.
So the market, left alone, drifts toward automation-first measurement because the costs of doing so are deferred and diffuse, while the savings are immediate and visible. This is the failure Olah named from the Vatican, an incentive structure that no single firm escapes by good intention alone. A binding floor changes that calculation. By making individual developer evaluation a high-risk, accountable act, the AI Act raises the cost of the cheap path to the point where the human-first path competes on equal terms. This is what regulation is for in a domain where the externalities fall on people who have no say in the tooling imposed on them.
The objection that this burdens European firms against less-regulated competitors mistakes a liability for an advantage. A firm that builds AI to manage its engineers without accountability has not found a shortcut; it has accumulated exposure that the revised Product Liability Directive, which applies across the Union from December 2026, will eventually price (European Parliament and Council, 2024). The constraint that looks like a cost is the discipline that keeps the people who build software sovereign over their own work, and that sovereignty is the asset European software should most want to protect.
Welcoming the direction of the law does not mean the preparation is trivial, and the timing deserves precision because confusion about it is the most cited objection. High-risk obligations under Annex III were originally set to apply from 2 August 2026. A political agreement on the Digital Omnibus, reached in May 2026, would defer them to 2 December 2027, but that deferral takes legal effect only once it is published in the Official Journal, which had not happened at the time of writing. A disciplined organisation builds to the earlier date and re-baselines if the deferral is enacted. The AI literacy duty under Article 4 and the penalty regime are already live, so the readiness work is not a problem for some distant year.
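The build-to-the-earlier-date rule can be written down in a few lines. The sketch below uses only the two dates given above; the constant and function names are hypothetical, and the single boolean stands in for whatever process an organisation uses to confirm publication in the Official Journal.

```python
from datetime import date

# Dates as stated in the text above.
CONSERVATIVE_HIGH_RISK_DATE = date(2026, 8, 2)  # original application date
PROPOSED_DEFERRED_DATE = date(2027, 12, 2)      # political agreement only


def effective_date(deferral_published_in_oj: bool) -> date:
    """Return the planning date for Annex III high-risk obligations.

    The deferred date is used only once the deferral has been published
    in the Official Journal; until then the conservative date applies.
    """
    if deferral_published_in_oj:
        return PROPOSED_DEFERRED_DATE
    return CONSERVATIVE_HIGH_RISK_DATE


def days_remaining(today: date, deferral_published_in_oj: bool = False) -> int:
    """Days left until the planning date (negative if it has passed)."""
    return (effective_date(deferral_published_in_oj) - today).days


print(days_remaining(date(2026, 6, 1)))  # 62
```

Defaulting `deferral_published_in_oj` to `False` keeps the conservative date in force unless someone makes an explicit, recorded decision to re-baseline.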
Developer-evaluation systems also sit inside a wider regulatory stack that a European software organisation cannot treat in isolation. The Digital Operational Resilience Act applies to firms serving financial entities, the Cyber Resilience Act brings vulnerability reporting from September 2026 and full obligations from December 2027, and NIS2 governs secure development and incident reporting. Each regime asks for the same underlying thing, which is evidence: classification records, technical documentation, automatic logs retained for at least six months, oversight reports, and impact assessments. An organisation that builds this evidence capability once, around the human-first commitment, can serve several regimes from it rather than treating each as a separate scramble.
Run this against any AI system your organisation uses to observe, measure, or manage developers. Each item maps to a specific obligation, and each gap is a documented risk.
1. Write a classification record for every developer-facing AI system, stating its intended purpose, the people it touches, and whether its output feeds a decision about an identifiable individual. The record is itself a required artefact.
2. Decide whether the system profiles individuals under the GDPR definition. If it evaluates reliability, behaviour, or performance, treat it as profiling, and treat the preparatory-task exemption as unavailable.
3. Trace where each output goes. If anything routes toward pay, promotion, task allocation, or termination, treat the system as high-risk, however many people review it.
4. Confirm that automatic event logging is enabled and that logs are retained for at least six months inside your own perimeter, not only in a vendor's cloud.
5. Document your human-oversight design under Article 14: who oversees the system, what competence they hold, and how their overrides are recorded.
6. Verify that workers' representatives and affected developers were informed before the system went into use, as Article 26(7) requires.
7. Produce a data protection impact assessment for systematic employee monitoring, and prepare a fundamental-rights impact assessment where the deployer scope applies.
8. Record dated AI-literacy training for the staff and contractors who operate or oversee the system, since that obligation is already in force.
9. Maintain a per-obligation effective-date register, defaulting high-risk dates to 2 August 2026 until any Omnibus deferral is published in the Official Journal.
10. Set retention rules deliberately: logs for at least six months, core technical documentation for ten years, with data-minimisation overrides where employee-monitoring proportionality calls for a lighter footprint.
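One way to run the checklist is to record, per system, which items have evidence behind them and list the rest as gaps. The sketch below is an illustration only: the item keys are hypothetical shorthand for the ten steps above, and a true or false flag is a stand-in for a link to the actual artefact.

```python
# Hypothetical shorthand keys for the ten checklist items, in order.
CHECKLIST = (
    "classification_record",
    "profiling_assessed",
    "output_routing_traced",
    "logs_retained_six_months_in_perimeter",
    "oversight_design_documented",
    "workers_informed_before_use",
    "impact_assessments_done",
    "ai_literacy_training_recorded",
    "effective_date_register",
    "retention_rules_set",
)


def find_gaps(evidence: dict[str, bool]) -> list[str]:
    """Return checklist items without recorded evidence, in checklist order.

    Items missing from `evidence` count as gaps, so an incomplete record
    can never pass by omission.
    """
    return [item for item in CHECKLIST if not evidence.get(item, False)]


# Example: a system with only the first two steps documented.
partial = {"classification_record": True, "profiling_assessed": True}
for gap in find_gaps(partial):
    print("GAP:", gap)
```

Treating an absent entry as a gap is a deliberate choice here, since each gap is meant to surface as a documented risk rather than disappear.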
Find out what every AI tool in your workflow records about you and where those records travel. You now have a recognised stake in the answer.
When a tool's output could shape a decision about you, ask in writing for review by a named person with the authority to overturn it, and treat anything less as the rubber stamp the law was written to prevent.
Keep a regular window where you write hard code unaided. Authority over your own craft is a capability you maintain through practice, not a status the organisation grants you.
When a compliance policy is unclear to you, write down your interpretation and circulate it, since forcing the ambiguity into the open is itself an act of governance.
Decline to supply per-engineer rankings upward, and offer team-level evidence across several dimensions instead. The regulation now gives you legal cover for a position the evidence already supports.
Translate any central compliance frame into a short operating note your team can act on within two weeks of receiving it. If you cannot, the policy is not implementable, and that finding is worth sending back upward.
Complete the worker-notification step before deploying any monitoring or measurement system, treating it as a trust-building act rather than a formality.
Protect review and thinking time as the definition of done shifts under AI assistance, so that measured speed does not consume the cognitive space on which good judgment depends.
Treat developer-measurement AI as a high-risk system in every build-or-buy decision, and require vendors to show how their product generates the evidence the law demands inside your perimeter rather than in a dashboard you cannot audit.
Stand up a per-obligation effective-date register now, default it to the conservative dates, and assign an owner to re-baseline it the day any deferral publishes.
Fund the compliance evidence as a product capability rather than an afterthought, since the classification records, logs, and oversight reports are the same artefacts that will defend you against a future product-liability claim.
Audit your engineering operating model for practices that strip autonomy, including centralised approval bottlenecks, mandated AI tooling without an opt-out, and metric-driven performance review, each of which now carries regulatory weight on top of its cultural cost.
The regulation reframes a question many organisations have preferred to avoid. The version worth asking is not how to measure developers more precisely, but which of your current practices would survive being classified as a decision about a person, and what that says about those that would not. The teams that read this as paperwork will produce paperwork. The teams that read it as a chance to deliberately commit to a human-first path will end up with both compliance and better engineering. 
Daniel Russo, Ph.D., is a Professor of Software Engineering whose research examines the intersection of human cognition and artificial intelligence. Through "Software Insights," he translates empirical research into actionable guidance for software practitioners and organizations.
If this issue surfaces a problem your organisation has been trying to name, I work with engineering leaders to diagnose exactly that kind of challenge, using the same methods behind the research you just read. No frameworks. No opinion without evidence.
danielrusso.org/advisory
 
European Commission. (2024). Regulation (EU) 2024/1689 of the European Parliament and of the Council laying down harmonised rules on artificial intelligence (Artificial Intelligence Act). Official Journal of the European Union.
European Commission. (2026). Draft guidelines on the classification of high-risk AI systems under Regulation (EU) 2024/1689 (targeted consultation, May 2026).
European Parliament and Council. (2016). Regulation (EU) 2016/679 (General Data Protection Regulation). Official Journal of the European Union.
European Parliament and Council. (2024). Directive (EU) 2024/2853 on liability for defective products. Official Journal of the European Union.
Leo XIV. (2026). Magnifica humanitas: On safeguarding the human person in the time of artificial intelligence [Encyclical letter]. Holy See.
Olah, C. (2026). Remarks on Pope Leo XIV's encyclical "Magnifica humanitas." Anthropic.
Russo, D. (2024). Navigating the complexity of generative AI adoption in software engineering. ACM Transactions on Software Engineering and Methodology, 33(5), Article 135.
Russo, D., Baltes, S., van Berkel, N., Avgeriou, P., Calefato, F., Cabrero-Daniel, B., et al. (2024). Generative AI in software engineering must be human-centered: The Copenhagen Manifesto. Journal of Systems and Software, 216, 112115.
