All Statistical Knowledge Is Built upon Historical Knowledge

Sometimes we encounter the contention that statistical studies in the social sciences are "rigorous," as opposed to the kind of "soft" knowledge we get from "merely" narrative history. This is mistaken in several ways, but probably the most fundamental is that the validity and significance of any statistical study in the social sciences are themselves based upon historical understanding.

For instance, if we are studying industrial output in the Soviet Union, we have to know that data on such things was systematically doctored, and knowing that is a matter of historical understanding. Similarly, if we want to correct for this false reporting and try to get at the true figures, we must examine plant records, diaries, post-Soviet interviews, and so on: again, an historical inquiry.

"Ah," you ask, "but what about where there wasn't such data distortion?" Well, we can only determine that there wasn't through... historical understanding.

There have even been cases where a "researcher" was found to have simply made up data without doing a study at all. His methods may have been "rigorous" in processing this "data," but the data itself was sheer fabrication. How can we expose this? By examining lab notes and similar studies, interviewing research assistants, and looking into the researcher's biography, i.e., through an historical inquiry.

Claiming that statistical social science has a higher epistemic value than "mere" history is like claiming "I can stand up a lot longer than my legs can."


  1. Excellent points, sir! A blogger by the name of Clarissa who lived under the Soviet Union, specifically in Ukraine, told me about how she went to Cuba with her sister to receive the kind of healthcare that normal Cubans receive. It turned out to be a terrible experience, and it cost quite a lot of money for the dismal service they received at the clinic they went to. She also said that they purposefully manipulate their data to make it seem that their quality of healthcare is better than that of, say, the United States, when in reality only the wealthiest of the wealthy in Cuba receive such quality healthcare and the unemployment rate is similar to other countries in Latin America.

    I'm also questioning the Corruption Perceptions Index of Cuba according to this website. In comparison to other Latin American countries, Cuba seems to be less corrupt than, say, Venezuela.

  2. Garbage in, garbage out.

  3. I'm not sure this is correct. I agree with GIGO, but I'm not sure that's the relevant concept here. What is the "historical understanding" involved in Peter Norvig's speech translation project?

    It looks like he's taking a lot of data and trying to distill purely statistical information from it. If there are additional refinements [filtering the text, whatever], then this may be done on a grammar basis rather than on Peter's "special historical knowledge." I don't think he *wants* to override the frequent pattern X, Y, Z with his own "recommendation" that the speech recognizer give more weight to X, Y, W.

    1. I have now made this a little more clear with an edit, but already at the end I had mentioned: "statistical social science." I just added "statistical studies in the social sciences."

      I am not referring to engineering projects!