The rooting of historical linguistics

Most of the harder problems in the methodology of historical linguistics seem to come from it being a fairly “high-order” discipline, and a relatively isolated one at that.

To an extent, this is true of all the humanities. With the levels of computational power currently available to us, it’s not possible to start with a couple of known physical laws and derive exact predictions about human behavior from them. The best we can do along these lines is to establish boundary conditions. And of course, most of these are sufficiently obvious from our daily experience as humans that they sound more banal than profound when spelled out: e.g. language families usually have a distribution limited to the surface of the planet, and they fail to extend up to the stratosphere, or down to the oceanic crust. :ɪ

But the historical angle complicates things. Most of the historical sciences rely heavily on evidence preserved from the past: history itself is based on written sources, and the “auxiliary historical sciences” such as archaeology on other objects preserved from the past.

And yes, historical linguistics also builds on preserved evidence from the past, mainly via philology and epigraphy. But this has only been a small initial inspiration. Most of our historical insights are instead derived from the observation and analysis of attested modern languages, and the application of a general theory of linguistic evolution. This model is, I think, quite alien to all other humanities. (Even plenty of non-historical lines of humanities research seem to remain stuck in a pre-scientific “there are no theories, only paradigms of discourse” mire.) In this sense historical linguistics has much more in common with evolutionary biology, although I suspect that that discipline, too, would not be doing as well as it is without the more direct evidence from an extensive fossil record. [1]

The inevitable implication is that nothing in historical linguistics can be understood without a good grasp of the underlying theory. And yet, it seems to me that many of its premises have rarely even been stated aloud. No doubt this is because the theoretical foundation has been developed on a need-to-know basis by its users as the discipline has expanded, not by any separate class of theoreticians. Yes, starting from the Neogrammarians, many of the surface phenomena have been described, from old’uns like “regularity of sound laws” to innumerable newer achievements like “typology of semantic change in body part terminology”… but the nuts and bolts of it, the ones that really “root” historical linguistics to its sociolinguistic foundations, not so much. There has been so much work in cataloguing the “whats” that we have not yet had much success in uncovering the “whys”.

I am not sure if my term “root” is readily understandable, or if there might be a better term available. It seems likely that this could be confused with a discipline’s internal history, at least. That is not what I mean: I refer here to how the various sciences can be ordered by how far removed they are from the basic laws of the universe. The typical example is how all biological processes can be broken down to individual biochemical processes; all biochemical processes to chemical ones; all chemical processes to particle physical processes. The reason that biology looks very different from chemistry, or from particle physics, is that studying the behavior of macroscopic masses of particles requires very different methods from studying 10 or even 1000 of them. A phenomenon such as embryonic development could in principle be modelled in terms of individual protons and electrons, but this would require enormous amounts of effort wasted on reiterating problems like “how does a water molecule hold together” or “what happens to a protein when it encounters a water molecule”, which have already been solved to sufficient precision for us to instead model an embryo as being built from cells that are built from cellular organelles that are built from macromolecules. A biologist (or a geologist, or a cosmologist) is not interested in the whereabouts of individual particles, but rather in their patterns of distribution at a specific scale in space and time.

The exact same principle holds for the humanities. Say, all psychology is at a certain fundamental level about neurons; but in analysing the overall behavior of the brain, built from a hundred billion neurons, the beliefs, feelings, etc. that they encode can (and must) be treated as entities in their own right. And similarly, while the speech of one human can be studied by phonetics, neurolinguistics, and similar disciplines, it again takes different tools to study the speech of a hundred million humans sprinkled across five thousand years. We need concepts such as “isoglosses” and “etymologies” that exist only as generalizations about the idiolects of individual speakers.

Our tools, however, do not seem to decompose easily into insights about smaller and smaller groups. How exactly does sociolinguistic variation in speech end up producing clean and neat sound laws, or patterns of loanword dispersal, or language areas sharing grammatical features? I do not think we have much more than loose guesses about the workings of these processes, so far.

This type of disconnect is, of course, quite common at the biology/humanities interface, and can sometimes be found elsewhere as well (e.g. in the absence of a working theory of quantum gravity). But to see it within a single discipline, linguistics, seems to me like a situation that ought to be resolvable.

This also means that historical linguistic knowledge rests, to an extent, on questionable ground. If we do not name our implicit starting assumptions, and end up making little effort to justify them on the basis of the more elementary phenomena they emerge from, is there not a risk that our edifice of knowledge stands askew, and ends up being an exercise in the construction of an essentially abstract theory, rather than a real description of the past?

Some philosophers would at this point certainly retort that all historical inquiry, being both unverifiable and unfalsifiable in the absence of a time machine, does not exist for the purpose of creating a real description of the past, but to create compelling stories about it. OK, I say, but some of us happen to consider truth an essential component of what makes a story “compelling”. Moreover… any model of the past will also make predictions about some parts of the present that we have not examined yet, which grants all historical theories a limited degree of falsifiability.

I do not claim to have a dossier of answers to issues of this sort prepared. Perhaps one or two sketches of solutions. But, of course, questions have to be asked before anyone can even begin to answer them.

[1] Arguably, though, one could claim that the majority of our planet’s biodiversity exists at the microscopic level, and that most of biological history must thus similarly be approached via comparative reconstruction. But in my understanding this is a relatively new approach in evolutionary biology, while historical linguistics dove headfirst into reconstruction already back in the 19th century.


