November 15, 2009

The rise and fall of American journalism in one picture

It looks like the NYT's search engine now lets you search by date all the way back to the paper's founding in 1851. Searching for the word "the" returns the number of "articles" in a given period -- I use quotes because an "article" may be only a few paragraphs or several columns long. Using this count as a measure of output, here's the annual picture over time:

Output of articles has been stagnant, though with oscillations, since the early 1970s. The really sharp drop began in 1952, perhaps reflecting the spread of TV as a news source from then until TV was close to universal in the 1970s. WWII had a visibly negative impact, although the Great Depression did not -- growth stalled, but output stayed near its all-time peak throughout, with the most articles written in 1936. The period of fastest growth was -- you guessed it -- the Roaring Twenties (leaving aside a brief dip that may reflect the early '20s recession). Before that, growth was pretty steady except for the early 1890s and the middle of the 1900s decade (the banking panics of those years may have disrupted output).

I'm not concerned here about the financial health of the industry but rather the quality of the product. Quality is hard to quantify, but newspaper articles are probably like books, movies, and TV shows: the rare gems are so rare that you need a very large sample to find any. Small samples give you a decent picture of a normal distribution, but quality in these fields more likely follows a fat-tailed or "superstar" distribution, where the best items sit far out in the tail.
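To make the sample-size point concrete, here is a rough simulation sketch. It is purely illustrative: the Pareto shape of 1.5, the article counts, and the function names are all my own assumptions, not anything measured from the NYT archive. It compares how much the single best "article" improves when output rises from 100 to 10,000 pieces, under a normal quality distribution versus a fat-tailed one.

```python
import random
import statistics


def typical_best(draw, n_articles, n_trials=50, seed=0):
    """Median (over repeated trials) of the best quality among n_articles draws."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    bests = [max(draw(rng) for _ in range(n_articles)) for _ in range(n_trials)]
    return statistics.median(bests)


def normal_quality(rng):
    # Thin-tailed case: quality is a standard normal draw.
    return rng.gauss(0.0, 1.0)


def fat_tailed_quality(rng):
    # Fat-tailed case: Pareto with shape 1.5 (an assumed, illustrative value).
    return rng.paretovariate(1.5)


def growth_in_best(draw, small=100, large=10_000):
    """How many times better the typical best article gets with more output."""
    return typical_best(draw, large) / typical_best(draw, small)


if __name__ == "__main__":
    print("normal:    ", round(growth_in_best(normal_quality), 2))
    print("fat-tailed:", round(growth_in_best(fat_tailed_quality), 2))
```

Under these assumptions the best normal draw barely moves as output grows a hundredfold, while the best fat-tailed draw improves many times over, which is the sense in which more output should "discover" more superstars.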

So, knowing very little about the history of newspapers, and using only this incredibly crude measure of quality, I predict that true journalism lovers will revere the 1920s and 1930s the most, with the '40s and '50s in second place (the absolute level is slightly lower, and output was trending downward rather than upward, and we don't like downward trends). Output was greatest in the '20s and '30s, so those decades must have turned up more superstar articles than any other period. As a rough check, judging from bibliographies the '20s and '30s look like the peak periods of activity for both Walter Lippmann and H.L. Mencken.

Who said quality is so hard to measure?


  1. I love the NYT search engine, and I love your meta-analyses that use it. Two thoughts:

    1) NYC has grown sixteen-fold in population during the NYT's lifetime. Insofar as the NYT is both a product of NYC and a reflection of life in NYC, it'd be fun to chart the number of articles vs. population over time. Eyeballing, the two tracked closely for the first half of the paper's history, then output fell increasingly during the second half.

    2) I can understand the NYT not growing forever with NYC's population -- after all, there are only so many pages a physical newspaper can hold. But the last decade has seen a steep decline (roughly 20%), precisely at the same time the internet has freed the NYT from the constraints of physical media.

    In short, a decline in both quantity and quality with no good explanation beyond organizational failure.


  2. I don't think the number of articles is the right thing to divide by population -- it's not as if, when the population doubles, readers are going to require twice as many articles before they'll buy a newspaper.

    What you're thinking about is the number of hard copies -- how has that changed as the New York population size has changed? When population doubles, then you do need twice as many hard copies.

  3. The analysis amounts to parlor speculation. Yours would be the simplest explanation, but perhaps the more hum-drum low-quality pieces of news were no longer published with the spread of TV.

    Another interesting possibility: Is radio a complement and TV a substitute for newspapers?

