Commentary on the Lewis Paper

by Jonathan Kaplan

April 26, 2013 at 3:59 pm GMT • 800 Words

I know I’m quite late to this party, but just in case you still check the comments, I wanted to ask you to rethink your condemnation of Gould.

You write that: “…whose fraud on race and brain-size issues, presumably in service to his self-proclaimed Marxist beliefs, last year received further coverage in the New York Times. Science largely runs on the honor system, and once simple statements of fact—in Gould’s case, the physical volume of human skulls—are found to be false, we cannot trust more complex claims made by the particular scholar.”

This is, I think, a very distorted view of the Lewis et al article. It is worth noting that Lewis et al themselves do not accuse Gould of fraud (nor, and this is important, did Gould ever accuse Morton of fraud!). Rather, Lewis et al argue that Gould’s preferred statistical analysis of Morton’s data was no better justified than Morton’s, and in some cases, seemed less well justified. That’s not “fraud” — that’s a disagreement about how best to interpret a particular data-set!

First, it is important to note that Gould never claimed to have actually measured any of Morton’s skulls, and explicitly stated that, after Morton had switched from using mustard seed to using lead shot, Morton’s results — the actual measurements of skull volumes — were likely accurate and reliable. Gould never claimed that, once Morton switched to using shot, Morton’s skull measurements themselves were wrong.

Gould in fact credited Morton with having recognized that his original method of measurement was unreliable, and credited him with the integrity to switch to a better, much more reliable method, despite the fact that the new results Morton got when he switched were *less* in line with what Gould presumed Morton’s assumptions were. (It is interesting that Gould’s review of Morton’s work credits Morton with great personal and scientific integrity: Gould repeatedly stresses that Morton *tried* to avoid acting on his biases, and stresses that in leaving us all of his original data, as well as explaining, in his work, what choices he made and why, Morton was acting just as a good scientist ought to act! Gould thought it was interesting and important that, in his view, Morton still ended up coming to biased conclusions. Lewis et al think that Morton did not in fact come to biased conclusions, and that is where the argument really is…)

Lewis et al acknowledge a) that Gould never made any skull measurements himself, nor claimed to, and b) that Gould explicitly stated that Morton’s shot-based measurements were likely perfectly accurate and reliable. (Now, why they felt it necessary, given that, to remeasure a bunch of skulls is a bit of a mystery; their stated reason, that they were searching for signs of bias on Morton’s part that even Gould explicitly stated he didn’t expect to be there, is at best odd.) So, here I just want to say that, no, Gould never claimed that Morton’s actual measurements of individual skulls were inaccurate, did not remeasure any skulls himself, did not claim to do so, and hence could not himself be “wrong” about the skull measurements themselves.

Whether Gould’s choices regarding his statistical analysis were better justified than Morton’s, less well justified than Morton’s, or just about as well justified as Morton’s turns out to be a tricky question to answer, trickier than Lewis et al’s analysis suggests. (A typical example from Lewis et al goes like this: Gould argues that, following Morton’s reasoning about what data to exclude in the case of population A, say, we ought to exclude similarly situated data from population B; Lewis et al claim that the exclusion in the case of B is different from that in the case of A, and that in any event, the better thing to do is to include the data in both cases, not exclude it. You get different answers given those two ways of making things consistent; Lewis et al argue that Gould’s way is worse, and that if you take their way, the answer you get is closer to Morton’s than to Gould’s…)

But in any event, once again, Gould was *explicit* about what decisions he was making regarding what data to include in his analysis, and what data to exclude, and gave reasons for those decisions. We might think, on reflection, that those reasons were poor ones, but he didn’t try to hide what he was doing, or lie about his analysis, or anything of the sort. So “fraud” hardly seems to be a fair description.

I know this is a bit of a tangent, but I think it is important not to overstate what Lewis et al actually showed. Gould may well have been wrong about Morton, but he wasn’t dishonest about it.

April 30, 2015 at 6:39 pm GMT • 800 Words [Reply to @Pincher Martin]

by Jonathan Kaplan

The reason I care about this is that Lewis et al’s remeasurements are often interpreted, as they were by Trivers, as showing that Gould was wrong about there having been bias in Morton’s original measurements of the skulls. For this to be true, the remeasurements would have to show that Morton’s original seed-based measurements were not racially biased. But the remeasurements do not do that, and cannot do that. Given that, the remeasurements were, at best, a stupid stunt, completely irrelevant to the argument in Lewis et al. At worst, they were meant to be misunderstood and were grossly intellectually dishonest. But either way, it was that stupid, completely irrelevant stunt that got them attention.

I don’t like people getting credit for stupid, irrelevant stunts that border on the intellectually dishonest, especially when the point of their paper was to attack someone else for not being as careful and intellectually rigorous as they should have been.

As for what is suspicious, again, I reiterate that Gould’s argument was that the *difference* in what happened to the averages in the different races was problematic, and I reiterate that the explanation for that difference is *not* chance (the larger variance of the seed-based measurements and smaller sample size).

The argument, again, is as follows. Gould noted that Morton recognized that the initial measuring system he used (using seeds rather than lead shot, and making use of an assistant to do some of the measuring) was unreliable, changed it, and remeasured the skulls he’d originally measured badly (doing all the measurements himself, with lead shot). So Gould credits Morton with recognizing that he had a problem, finding a way to fix the problem, and then redoing his measurements with the new, no longer problematic, system. But when Morton remeasured the skulls, something odd happened: the skulls Morton associated with “African” and “African-American” populations increased in size much more than the skulls Morton associated with “Caucasian” populations. Gould speculated that the earlier method, using seeds, permitted Morton’s unconscious bias against Blacks to influence his measurements. When Morton switched to a more reliable method, Gould hypothesized, his bias was no longer able to influence the results; the room for an unconscious bias to skew the results was eliminated by the new system. The difference between the measurements when a less reliable system and a more reliable system were used was what suggested, to Gould, that bias might be at play, and was what was responsible for skewing the results when the less reliable system was used. Gould’s speculation, quoted above, about how this bias might work in practice, emerges from this line of reasoning. (Now, Gould might have been wrong: bias might *not* be the correct explanation. Other explanations are possible. But whatever the explanation is, it isn’t just that the original measurements were unreliable in a way that was random with respect to race.)

An analogy might be helpful. If I initially grade students in my class based on my “overall impression” of their ability, and then switch to using a multiple choice test, and one finds that the scores of women in my class suddenly improve markedly compared to the men with the introduction of the new testing method, one might, justifiably, think that my initial grading method (“overall impression”) was biased against women. (Of course, one might also think that bias had nothing to do with it, and some other explanation was the right one.) But — and this is the key! — the way to test the hypothesis that bias might have been at play in my earlier measurements isn’t to regrade the multiple choice tests! If you do that, and find that I generally scored the multiple choice tests accurately, and that my grading of the multiple-choice tests wasn’t biased against women, this provides no evidence whatsoever that my initial “overall impression” based system was similarly fair! And yet that is precisely the argument that Lewis et al spend almost a third of their paper developing, and precisely the results that were reported as proving that Gould was wrong. That’s just stupid.
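To make the logic of the analogy concrete, here is a minimal simulation sketch in Python. Every number in it is invented for illustration (the group labels, the size of the bias, the noise levels); none of it is Morton’s, Gould’s, or Lewis et al’s data, and the function names are hypothetical. It assumes an old method that is noisy and scores group B systematically low, and a new method that is precise and blind to group.

```python
import random
import statistics


def simulate(seed=0, n=200, bias=4.0, noise_old=3.0, noise_new=0.5):
    """Return (group, old, new, recheck) rows for n hypothetical items.

    All parameters are assumptions chosen for illustration only.
    """
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        group = "A" if i % 2 == 0 else "B"
        true_value = rng.gauss(80.0, 5.0)
        # Old method: noisy, and systematically low for group B (assumed bias).
        old = true_value + rng.gauss(0.0, noise_old)
        if group == "B":
            old -= bias
        # New method: precise, with no group-dependent term.
        new = true_value + rng.gauss(0.0, noise_new)
        # Independent re-check carried out with the new method.
        recheck = true_value + rng.gauss(0.0, noise_new)
        rows.append((group, old, new, recheck))
    return rows


def mean_by_group(rows, value):
    """Average value(row) separately for groups A and B."""
    return {
        g: statistics.mean(value(r) for r in rows if r[0] == g)
        for g in ("A", "B")
    }


rows = simulate()
# How much each group's average moved when the method changed (new - old).
shift = mean_by_group(rows, lambda r: r[2] - r[1])
# How far the re-check lands from the new-method result (recheck - new).
recheck_err = mean_by_group(rows, lambda r: r[3] - r[2])

print("shift old -> new by group:", shift)
print("recheck minus new by group:", recheck_err)
```

Under these assumptions the shift from old to new differs sharply between the two groups, while the re-check of the new method shows essentially no group difference. The re-check would look just as clean if the old method had been unbiased, so it cannot tell the two cases apart; only the comparison between old and new can.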

I think it is obvious that this is a serious problem with Lewis et al’s paper, and with the way that their paper has been interpreted. Does it matter much in the grand scheme of things? No, probably not. Gould made a lot of mistakes in his interpretation of Morton (and so, for that matter, did Lewis et al). Had Lewis et al focused only on the mistakes Gould actually made, their paper would not have been as popular, but it would have been more honest, and better for it.

April 29, 2015 at 11:07 pm GMT • 400 Words [Reply to @Pincher Martin]

by Jonathan Kaplan

There is an additional contrast between Morton and Gould worth noting. To conjure up Morton’s mistakes, Gould lovingly describes the action of unconscious bias at work: “Morton, measuring by seed, picks up a threateningly large black skull, fills it lightly and gives a few desultory shakes. Next, he takes a distressingly small Caucasian skull, shakes hard, and pushes mightily at the foramen magnum with his thumb. It is easily done, without conscious motivation; expectation is a powerful guide to action.” Indeed it is, but careful re-measures show that Morton never made this particular mistake—only three skulls were mis-measured as being larger than they were and these were all either Amerindian or African.

Please repeat after me: Gould never claimed that Morton’s shot-based measurements (which is what Lewis et al compared their remeasurements to) were inaccurate. Never. Not at all. Gould in fact claimed, repeatedly, that after Morton switched from seeds to lead shot, and did all the measurements himself (rather than letting his assistant do some of them), his measurements were accurate and reliable. Gould just flat out wrote that, OK? Flat out wrote in Mismeasure that after Morton switched to lead shot, and switched to making all the measurements himself, Morton “achieved consistent results that never varied by more than a single inch for the same skull” (Gould, 1981, p. 53).

So, what’s up with the quoted passage? Well, if you’d bothered to actually read Gould, you’d realize that what Gould noticed was that when Morton switched from measuring with seed to using shot, the average measurements for the different races changed by different amounts, in a way that seemed to imply that the previous measurements (which Morton himself figured out were inaccurate) were biased by race.

As Jonathan Weisberg noted as well, the difference in the change is indeed suspicious. (Lewis et al claim it may just be random, but that’s a lousy hypothesis: my colleagues and I tested it statistically, and it is massively improbable, impossible really.)

Now, as it turns out, the hypothesis that seed is inherently easier to mismeasure than shot is problematic, too — so Gould’s preferred hypothesis isn’t that great, either ([John] Michael has done some great preliminary work on this). But something was wrong with Morton’s initial seed-based measurements, and that thing wasn’t random with respect to race.

Gould got some stuff very wrong re: Morton’s work. But the remeasurement of the skulls was a total waste of time, and completely irrelevant to Gould’s arguments.

May 7, 2015 at 3:43 pm GMT • 500 Words [Reply to @Pincher Martin]

by Jonathan Kaplan

The main point I was making here was correcting a misconception about what Lewis et al actually showed, a misconception that was repeated by Trivers as part of his story about Gould. So, naturally, I was focused on that misconception. What Trivers claims Lewis et al’s remeasurements show, those remeasurements do not, and cannot, show.

I remind you that if everyone agreed that after Morton switched to shot, his measurements of the skulls were accurate, then an entire third of Lewis et al’s paper was completely pointless and the thing they are mostly famous for — remeasuring a bunch of skulls — was an utter waste of time.

Further, I would argue (and, in the above cited co-authored paper, did argue) that Lewis et al’s interpretation of how Morton is understood in the literature is badly misguided; Gould did not accuse Morton of conscious manipulation, and in fact stressed that Morton was a careful, honest researcher trying to get the right answer, and *not* trying to manipulate his data (but see Jake Michael’s post, above, on whether Gould ought to have argued that!). (An aside of sorts: Another “grievance” I have against Lewis et al and their paper is that, for example, they make claims about how Morton is understood in the literature, and provide several references, none of which, when one goes and reads them, actually support the claims they are making. So Lewis et al are either terrible readers (unlikely), or didn’t care that the claims they were making were unsupported by the references they were using to support those claims. As a reader, I find that intellectually dishonest, or unforgivably sloppy. Again, see our argument in that co-authored paper.)

As for Gould’s mistakes: yup, he made a number of them (not surprising, I think; you’ll get no robust defense of Gould from me!). But Jonathan Weisberg, in a paper also cited above, argues (compellingly, I think) that Lewis et al’s defense of Morton’s analysis of his data fails in a number of key areas (again, see that paper), and that Gould’s criticisms land rather more often than Lewis et al’s analysis would suggest. I would stress that this doesn’t make Gould “right”: my co-authors and I argued that Gould was at best foolish to attempt to reanalyze Morton’s data, and suggested that it was because he was able to get an answer that he liked that he was unable or unwilling to see that there could be no “correct” summary of the data, and that many of the assumptions he was making and methodologies he was deploying were no better justified than Morton’s.

But had Lewis et al written an honest paper, one that fairly criticized the mistakes Gould actually made, and distorted neither Gould’s claims, nor the place of Gould’s analysis in the literature, their paper wouldn’t have gotten written up in the NYT. We shouldn’t reward people for being dishonest or sloppy. That goes for Gould, and it goes for Gould’s critics, too.

2011-07-30 09:10 AM [Comment]

by John S. Michael

I am John S. Michael, the undergraduate from Macalester College who measured the Morton skulls 26 years ago. I went on to a non-academic career in environmental land planning and was not aware that my paper had even been read up until a month or so ago. I am now undergoing a kind of Rip Van Winkle experience, where I am trying to figure out what happened while I was gone. Fortunately, I kept my original notes, which I have been reading through. I should note that I am not an anthropologist, biologist, or evolutionary expert.

I started re-measuring the Morton skulls on January 3, 1988. One of the notes I wrote down on the very first day of my lab work was: “While looking at the Native African group, it struck me that there may be many juveniles. I think this is also true of Inca Peruvians. I can’t recall any in the Modern Caucasians. Morton said he didn’t use any under 16. I wonder. Perhaps I should get Janet to look at it, or some morning get her to look at it with me.”

I am coming to realize that over the last 26 years, there have been numerous publications that discussed Morton’s bias, and now Gould’s bias, regarding the Morton Collection. But to my mind, as one of the very few people who actually spent time with it, a more pressing issue is: why does that collection include so many Africans and Peruvians who, to my undergraduate eyes, appeared to be younger and, yes, smaller than many of the other specimens?

Many of Morton’s African skulls came from Cuba, and I had always assumed that many of them were teen or pre-teen slaves who died while being transported quite inhumanly to the New World. At the time I wondered if they might be from just one or two ships, and represent young people who may have come from but one province where everyone just happened to be short. I could be totally wrong about this, and I understand that a Penn grad student has already done some work with these slave skulls.

My point is that somebody should check this out, and look into the demographics of the Morton Collection’s other ethnic groups as well. In the past few weeks I have read all kinds of articles about the Morton-Gould affair, and how it reflects on subconscious bias. To my mind the first rule of science is to observe. In regard to the Morton Collection, Gould never did that, and I only spent as much time as I could in one semester. It just seems to me that an awful lot of energy has been spent speculating as to who was biased or not, when it could have been spent with the skulls, trying to figure out why this collection is the way it is.

If someone does not like the results of my study, or that of Lewis or Morton, or they suspect that one of us was biased, they should just re-measure the skulls. I understand that the collection has been CT-scanned and anyone can look at the scans online. So, you don’t even need to go to Philadelphia. I would love to have somebody duplicate my work, which Lewis more or less did, rather than just comment on it. This would verify that I did a good job, or it would uncover errors I made, which would help others avoid repeating my mistakes. I am willing to be proven wrong, or right, or something in between. As an outsider, I have no reputation to defend. Either way it would make me feel like I had contributed something of value, which is what I wanted to do all along.

SEPTEMBER 16, 2015 MASSIMO [Stephen Jay Gould and the Morton controversy]

by Massimo Pigliucci

The puzzling thing here is that we sent the paper to the journal (PLOS Biology) that originally published the Lewis et al. article which, in our opinion, unfairly criticized noted evolutionary biologist Stephen Jay Gould’s treatment of the Morton-skulls affair in his popular book, The Mismeasure of Man. That journal’s editor responded that while our paper was well written etc., it wasn’t worth publishing there because it wasn’t novel enough. Since we were critical of a paper previously published in the same journal, I would have thought “novelty” shouldn’t be the overriding criterion, with better candidates perhaps to be found in integrity and intellectual honesty. So be it. Our paper is now out for everyone to see, hopefully helping to rectify Lewis et al.’s misleading treatment of Gould (while at the same time pointing out Gould’s own mistake in the original piece).

