Reading Interests
How popular are the words we use frequently?
No doubt you’ve noticed that many destination running events these days (such as your-favorite-city Marathon) offer several races of different distances. For instance, the Christie Clinic Illinois Marathon in Champaign-Urbana offers not only a marathon but also a marathon relay, half-marathon, 10K, 5K, and a youth run. By offering several race distances at such an event, the appeal is broadened and more folks run.
In recent years I’ve noticed a couple of interesting things about these destination-running events. One is the peculiar question I’m asked about an upcoming marathon: “So are you running the full or the half?” Excuse me? Did I not say I was running the marathon? To avoid confusion when talking to nonrunners, I sometimes find myself taking a proactive stance and saying I’m running the full marathon. Don’t get me wrong, I like the half-marathon distance, a lot. But if people say they’re running a marathon, how can there be any confusion about whether they are running a half-marathon?
The other observation I’ve made is that, of all the different races at a destination-running event, the 26.2-mile race often has the fewest runners. There is probably a marketing strategy behind retaining the name “marathon” in the event title. You know, the “Illinois 5K,” with marathon and half-marathon events, doesn’t sound as inspiring. But what does it mean that the 26.2-mile race has a small field of runners compared with most of the other race distances? According to a recent issue of Running Times, the half-marathon is the fastest-growing race distance in the United States. Are we collectively losing interest in the marathon?
Being the hypothesis-driven, data-querying enthusiast that I am, I decided to find out. Assuming that our interests are reflected in the printed word, I decided to look at what we’re reading. Where better to start than Google?
You may have heard about the Google Books Library Project, the industrial-scale scanning of books from several large libraries and publishers since 2004. The goal is to scan some 130 million books when all is said and done, including those published as far back as 600 years ago. The database being built is searchable (we are talking Google), so we can look at the frequency of specific words or phrases over time. Google is so kind as to provide access to part of its database in the online tool called “Ngram Viewer” (http://books.google.com/ngrams). The online database is built from the text of over 5 million books, about 4 percent of the books ever printed.
In the Ngram Viewer, we can find out how frequently individual words or phrases occur. To reduce the volume of the database, the word has to appear in each book 40 or more times. Also, frequencies are normalized to adjust for “linguistic inflation,” the expansion of language as more books are published each year.
I first had to decide what running-related words or phrases to use. Obviously, “marathon” and “half-marathon” were going to be two of them. I assume that their use in a book more than 40 times probably describes a 26.2- or 13.1-mile race. But what else? Some of the shorter races, such as the 5K and 10K, just don’t work. Variations of the words (such as “5 kilometer,” “5K race,” and so forth) could describe all sorts of things other than running events. I needed to keep words or phrases that were specific to running events. I could think of two other events that seemed appropriate. One word was “ultramarathon”; because of the specificity of the root word, it seemed like a word that would refer mostly to running events. Another word that describes an athletic event that includes running is “triathlon.” Sure, the triathlon also involves swimming and biking, but it’s pretty much guaranteed to include running. There may well be other words or phrases that describe running events, but I struggled to find more that were exclusive to running (for example, “track,” “sprint,” and even “run” can describe something different from what we’re talking about).
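For readers who like to see the arithmetic, here is a minimal sketch of what normalizing for “linguistic inflation” could look like: divide the number of times a word appears in a given year by the total number of words printed that year. The function name and the counts below are my own inventions for illustration; they are not Ngram Viewer data, and this is not necessarily how Google computes its figures.

```python
from typing import Dict


def relative_frequency(word_counts: Dict[int, int],
                       total_counts: Dict[int, int]) -> Dict[int, float]:
    """Return, per year, the share of all printed words taken up by one word."""
    freqs = {}
    for year, count in word_counts.items():
        total = total_counts.get(year, 0)
        if total > 0:  # skip years with no text at all
            freqs[year] = count / total
    return freqs


# Toy numbers, invented for illustration only (NOT real Ngram Viewer data).
toy_word_counts = {1960: 50, 2000: 400}                # uses of one word
toy_total_counts = {1960: 1_000_000, 2000: 4_000_000}  # all words printed

freqs = relative_frequency(toy_word_counts, toy_total_counts)
for year in sorted(freqs):
    print(year, f"{freqs[year]:.6f}")
```

In this made-up case the raw count grows eightfold between the two years, but because four times as many words were printed, the normalized frequency only doubles. That is the kind of distortion the adjustment is meant to remove.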
When we consider a sampling of all the books ever published in English, how frequently is “marathon” mentioned? The figure on page 104 shows that the word didn’t show up much until early in the 20th century. (For ease of presentation, I deleted the uneventful years between 1400 and 1899.) The figure was developed with the online tool Ngram Viewer, which queries a database of more than 5 million books digitized as part of the Google Books Library Project.
Frequency picked up a bit between 1920 and about 1935. For the next two decades, little change was observed in the use of the word “marathon.” But look at what happens from 1960 to about 1980. All sorts of new books mention the word “marathon.” The rate of increase then tapered off, but the frequency continued an overall climb between 1980 and 2000 and has remained steady since then.
What accounts for this trend in the use of the word “marathon”? I’m not a historian, but I’m sure there were some interesting phenomena during the 20th century that fueled the explosive popularity of the marathon since 1960.
How does this trend compare to “half-marathon” and other running-related words? The word “triathlon” is the next to show up, in the early 1980s. This
This article originally appeared in Marathon & Beyond, Vol. 17, No. 3 (2013).