Friday, March 16, 2018

The Tyranny of Metrics

One of the great themes of Aristotle’s work on ethics and politics is the need for human judgment. In the Politics, when he describes the virtues that the “master craftsman” (architekton) of the state must have, chief among them is practical wisdom (phronesis). In the Ethics he points out that no matter how carefully laws are written, they will always be incomplete by virtue of their generality — their relevance to a given case will always have to be determined by judges, judges who therefore need to possess the virtue of equity (epieikeia) in their decision-making: that is, the ability to decide, with tact and shrewdness, just how the law should be applied in a given case.

There are few things that the Modern Moral Order despises more than human judgment. One could argue that the chief energies of the MMO have been devoted to the elimination of such judgment, to rendering phronesis and epieikeia wholly unnecessary. What drives the MMO is what Taylor calls “code fetishism” or “normolatry.” In our time, one of the primary manifestations of code fetishism is, in the title of Jerry Z. Muller’s important new book, The Tyranny of Metrics. From the Introduction:

Schemes of measured performance are deceptively attractive because they often “prove” themselves by spotting the most egregious cases of error or neglect, but are then applied to all cases. Tools appropriate for discovering real misconduct become tools for measuring all performance. The initial findings of performance measurement may lead poor performers to improve, or to drop out of the market. But in many cases, the extension of standardized measurement may be of diminishing utility, or even counterproductive — sliding from sensible solutions to metric madness. Above all, measurement may become counterproductive when it tries to measure the unmeasurable and quantify the unquantifiable.

Concrete interests of power, money, and status are at stake. Metric fixation leads to a diversion of resources away from frontline producers toward managers, administrators, and those who gather and manipulate data.

When metrics are used by managers as a tool to control professionals, it often creates a tension between the managers who seek to measure and reward performance, and the ethos of the professionals (doctors, nurses, policemen, teachers, professors, etc.). The professional ethos is based on mastery of a body of specialized knowledge acquired through an extended process of education and training; autonomy and control over work; an identification with one’s professional group and a sense of responsibility toward colleagues; a high valuation of intrinsic rewards; and a commitment to the interests of clients above considerations of cost. 

It is noteworthy — and from where I sit very interesting — that Muller came to write this book because of his experience as the chair of an academic department. Much of a department chair’s job in the American academy today involves manipulating the metrics of assessing “learning outcomes” — as described in this essay by Molly Worthen. (There are advocates for more nuanced and humane models of assessment — Kate Drezek McConnell, for instance — but if you’re a professor and you get to deal with someone who thinks the way McConnell does, you’re very lucky.)

Of course, the reign of metrics extends far beyond the academy. Muller shows it at work in law enforcement — How many arrests is a police department making in relation to what the metrics say the number should be? Is the DA’s office meeting its expected conviction rate? — and in medicine — Hey surgeons, don’t take on difficult cases that might lower your success rate. And I vividly recall the moment several years ago when the gifted designer Douglas Bowman left Google because he wasn’t allowed to design, only oversee A/B testing.

Where do metrics succeed? Among other places, in sports. The analytics revolution has affected almost all sports, and has been wonderfully illuminating. Sometimes advanced analytics tells you that what you believed all along is indeed correct (there are no analytical models of basketball success that don’t put Michael Jordan at the top of the heap), and sometimes you discover that your observations of the game have led you to dramatically overrate some players and underrate others. (The latter discoveries are especially fun.) But all sports are, in one way or another, counting games: you count wins and losses, and count the actions that lead to wins and losses: made and missed shots, strikeouts, completed passes, unforced errors, and so on.

You can sort much of the rest of life that way if you want, I suppose. For instance, in evaluating the design of a website you can ignore such fuzzy notions as “beauty” and simply count the number of clicks associated with various shades of blue. (That’s why Bowman left Google.) You can “teach to the test,” ignoring every aspect of education except the ones that produce higher test scores — and if your job depends on your students’ test scores, teaching to the test is what you’d damn well better do.

And wherever it’s possible to make the metrics better, we should. Something that is not measurable now may become at least partially measurable in the future. The problem is not the use of metrics; it’s the tyranny of metrics. And perhaps the worst consequence of that tyranny is its tendency to make us give up altogether on the cultivation of judgment, of phronesis and epieikeia. Mistrusting judgment, believing that it can never be accurate, our technocracy figures that the best available option is to use whatever metrics we have, torquing our questions and thoughts and concerns in the direction of existing techniques of measurement and assessment. The fear is that human judgment will never be anything more than emotionally driven opinion. And you know what? Untrained judgment always will be emotionally driven opinion. This is what we call a self-fulfilling prophecy.


  • I like to draw a distinction between assessment, part of the normal course of academic improvement, and Assessment, the title of the meaningless exercise imposed on us by a bunch of online MBAs.

  • I agree almost entirely but I think one caveat is necessary: the MMO despises *individual* human judgment. The more market-oriented among them are perfectly happy to harness "the wisdom of crowds."

    Of course the "wisdom of crowds" approach often reduces individual judgment to a stimulus response, a piece of evidence about whatever someone's mental state happens to be rather than an action that person freely performs. I'm not sure if that reduction is necessary to the best version of the approach.

  • The question is always, which metrics and for what purpose? Take the SAT. Despite years and years of complaints, the test is a strikingly effective predictor of college performance: not just first year but second, third, and fourth year, and graduation rates. In fact, SAT scores of precocious 13-year-olds are an effective predictor of whether they will ever secure a patent or write a best-seller in adulthood. Some people find that level of predictive accuracy creepy, which in effect is saying that the metric is too good.

    The problem with metrics like the SAT is that people always attack their strengths (their validity and reliability) instead of their potential weaknesses (their social and cultural costs). People want to tear down the tyranny of the number, but it's perhaps better to think of tearing down the tyranny of the use of the number.

  • Freddie, I think your point is consonant with Muller’s argument. For him “metrics” is not a set of techniques so much as a whole subculture.

  • If measurement is always of outcomes, metrics will cause problems whenever the primary good of the activity evaluated is not an outcome. But that there are activities worth doing for their own sake, and not merely for the sake of an outcome, is something people conveniently forget. It's part of a deep-seated fear of leisure.

    What insulates some things from metricism? People seem content to consult evaluations of restaurants, musical performances, and yoga classes without insisting that their outcomes be measured. As an educator, I sometimes envy the chef, the performer, and the yoga instructor (who doubtless face their own, other challenges).

  • Alan, your post made me think of that passage in "1984" where Winston follows an old man into a pub, wanting to ask him about what it was like before the revolution. The old prole, horribly outdated and British, asks for a pint of beer and the barman has no idea what he's talking about. All they have are liter and half-liter mugs.

    'And what in hell's name IS a pint?' said the barman, leaning forward with
    the tips of his fingers on the counter.

    ''Ark at 'im! Calls 'isself a barman and don't know what a pint is! Why,
    a pint's the 'alf of a quart, and there's four quarts to the gallon.
    'Ave to teach you the A, B, C next.'

    'Never heard of 'em,' said the barman shortly. 'Litre and half
    litre--that's all we serve. There's the glasses on the shelf in front
    of you.'

    'I likes a pint,' persisted the old man. 'You could 'a drawed me off a pint
    easy enough. We didn't 'ave these bleeding litres when I was a young man.'

    'When you were a young man we were all living in the treetops,' said the
    barman, with a glance at the other customers.


    Always thought that was a great illustration of the modern rationalistic attitude, which permeates so much of our culture. And it seems relevant to your post.

    There's a great clip of Peter Hitchens reading that passage on YouTube somewhere. It's perfect :)

