Inheritance in Phonology

It occurred to me that there’s one concept I have never seen anyone else define or use, although I’ve been working with it in my own research for a while now: that of an inheritance phoneme.

This is in effect the polar opposite of the well-known case of the loanword phoneme. As the audience of this blog probably mostly knows, a loanword phoneme refers to a sound that is absent from the native lexicon of a language, but occurs in one or more of its contact languages, and has been taken on from there into the language itself. Clear examples include /b g f ʃ/ in modern Finnish.

But sometimes we can, by contrast, find in a language a phoneme that is absent from its contact languages and is only found in the native-enough lexicon. [1] In Finnish a recent example might be the labial opening diphthongs /uo/, /yö/. Although found as reflexes of earlier *oo, *öö even in some not especially old loanwords from e.g. Swedish (including tuoli ‘chair’, kyöpeli ‘kobold’; yet more recently also fluori ‘fluorine’), they appear to have become, within roughly the last 200 years, a “closed class” that, for now, is no longer acquiring new members. [2] Of course, this is not “closed” in the same sense as a morphological word class might be: the diphthongs remain entirely possible in new ideophones and onomatopoeia (blyögh ‘barf!’), blends (Suomalia ‘an area in Finland with a relatively large current or predicted Somali population’), and derivatives based on pre-existing roots.

Better examples can probably be found, from languages having some more strongly marked phonemes. For example, I’d expect Czech ř or German pf to be not very common in current loanwords, and to have been so for a good while; or the nasal vowels in French to be absent from most modern loanwords, with the exception of those from Portuguese or sub-Saharan African languages.

Even then, this concept seems less clearly defined than the loanword phoneme. While a loanword phoneme is established by its one-time inadmissibility in the language altogether, there is nothing in a language’s internal structure at any given time that could prevent a given phoneme from appearing in loans. This situation can only be an incidental fact about its contact languages — and if the contact situation changes, anything’s possible again. (Put a Czech speaker community in regular contact with speakers of Toda, and I for one would bet that ř would then start regularly turning up in some loanwords.) A phoneme could also be only “partially inherited”, in being found in some loan strata but not in others — as I hypothesized to be the case with French nasal vowels.

On the other hand, what is interesting here is that while words containing loanword phonemes allow setting up a terminus post quem for their acquisition into the language (if we know that Finnish circa 1600 had no /ʃ/, then all modern Finnish words with the consonant must be more recent, even if their etymology were unknown), inheritance phonemes may allow establishing a terminus ante quem. This seems like a fairly powerful tool; usually we can backdate a word only by the comparative method, and even then not watertightly. But, given a word like Fi. tuoksua ‘to smell’ (of unknown origin, not attested before the end of the 17th century, and in contrast to the more widespread native Finnic synonym haista), we can regardless consider it probable from its diphthong that this is not an especially young word, perhaps dating at least to the Middle Ages. The absence of known loan etymologies from any obvious candidates for a loangiver (Swedish, Russian etc.) would furthermore suggest that we can, with slightly lower confidence, add a couple of centuries more yet. [3]
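
For readers who think more easily in code, the dating logic above can be put as a minimal sketch. Everything in it is an illustrative assumption rather than an established method: the `Marker` class, the `date_bounds` function, the use of spelling as a stand-in for phonemes, and the cutoff years. The year 1600 for /ʃ/ follows the hypothetical in the text, and 1800 for /uo/ is only a rough reading of the 200-year estimate given earlier. Each marker found in a word tightens one of the two bounds and nothing more.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class Marker:
    segment: str  # orthographic stand-in for the phoneme or cluster
    kind: str     # "loanword" or "inheritance"
    year: int     # assumed cutoff year (illustrative only)


# Hypothetical cutoffs, not established datings.
MARKERS = (
    Marker("š", "loanword", 1600),      # /ʃ/ assumed absent until c. 1600
    Marker("uo", "inheritance", 1800),  # /uo/ assumed closed since c. 1800
)


def date_bounds(
    word: str, markers: Iterable[Marker] = MARKERS
) -> Tuple[Optional[int], Optional[int]]:
    """Return (terminus post quem, terminus ante quem) suggested by markers.

    None means that no marker constrains that side.
    """
    earliest: Optional[int] = None  # word entered the language after this year
    latest: Optional[int] = None    # word entered the language before this year
    for m in markers:
        if m.segment not in word:
            continue
        if m.kind == "loanword":
            # A loanword phoneme pushes the earliest possible date forward.
            earliest = m.year if earliest is None else max(earliest, m.year)
        elif m.kind == "inheritance":
            # An inheritance phoneme pulls the latest possible date back.
            latest = m.year if latest is None else min(latest, m.year)
    return earliest, latest


print(date_bounds("tuoksua"))  # bounded only from above
print(date_bounds("šakki"))    # bounded only from below
```

Such bounds are only probabilistic hints. Naive substring matching would also wrongly flag a recent loan like fluori, which is exactly why the result should guide further etymological work rather than replace it.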

We can also define similar concepts such as loan cluster and inheritance cluster. The former, although to my knowledge never explicitly named, is again a known phenomenon. Finnish continues to work as an example: while Modern Finnish clearly allows e.g. word-initial consonant clusters, it is not too hard to find phonological analyses that dismiss them as non-native and proceed to posit a “basic” syllable structure (C)V(V/C)(C). Jorma Koivulehto has also made good use of this approach in research on early loanwords, having e.g. shown that all Finnic word roots with the medial cluster *-rt- are ultimately Indo-European loans, and not of Uralic inheritance. [4] (This, however, is not to be confused with the occurrence of *rt in word stems, where it can well result from inherited *r + a suffix such as causative *-ta-; as in Fi. vieri ‘side’ → vier-tä- ‘to be or go beside smth.’)

It seems similarly possible to consider e.g. Finnish tk for the most part an inheritance cluster that indicates relatively native vocabulary. No examples of this cluster in old loans are known; and given that already in Late Proto-Indo-European, the inherited “thorn” clusters of dental + velar were metathesized or otherwise reduced, it seems likely that none will be found anytime soon either, at least not from an Indo-European direction. (Much newer examples can be found though, e.g. Atkinsin dieetti, votka; and in far-northern dialects, e.g. vietka ‘adze’, from Sami.)

I could explore various further examples here, but for now, this post should do for a point of reference for later use.

[1] “Nativeness” is a relative concept, of course, not an absolute one. E.g. Finnish kauppa ‘store’ can be considered a “native” counterpart of the more recent loans puoti (← Swedish), lafka (← Russian), basaari (ultimately ← Persian) etc., but ultimately it is a Germanic loanword as well. Similarly, even words reconstructible back to Proto-Uralic can in principle be loans at some deeper time-level yet (e.g. we can suspect on semantic grounds that pata < *pata ‘pot’ might be one).
[2] The illabial opening diphthong /ie/ remains possible in loans, e.g. fiesta, siesta, DJ Tiësto.
[3] For some speculation though, something could be perhaps made of the similarity to Swedish doft, German Duft ‘smell’. If these could be analyzed as earlier *duf-t-, perhaps in turn some kind of a labial-stop extension of PIE *dʰewh₂- ‘to smoke’ (PG *dup-?? Svensk Etymologisk Ordbok connects here also Greek τυφος ‘smoke’), then we might be able to assume that the Finnish word derives from pseudo-PF *tupa/*tupo ‘smell’ → *tuβa-ks-u-/*tuβo-ks-u- ‘to put out smell’ > *tu.aksu-/*tu.oksu-, with a similar late contracted diphthong as in words like siellä < *si.ällä < *siɣällä < *sigä-llä ‘there’, or haukka < havukka (attested dialectally) < *haβukka < *habukka ‘hawk’.
[4] See in particular: Koivulehto, Jorma (1979): Baltisches und Germanisches im Finnischen: die finn. Stämme auf -rte und die finn. Sequenz VrtV. In: Schiefer, Erhard F. (ed.), Explanationes et tractationes Fenno-Ugricae in honorem Hans Fromm, pp. 129–164. München.

9 comments on “Inheritance in Phonology”
  1. M. says:

    The problem with the term “inheritance/loanword phoneme” is its high degree of provisionality. If we don’t know anything further about a word’s origin, an apparent “inheritance/loanword phoneme” in this word will always have an alternative explanation: the word could be from an otherwise-vanished donor language that possessed this phoneme, or the phoneme could have arisen through a coalescence of older, lost phonemes (or still-existing ones — e.g. a cluster like -tk- could result from syllable contraction). You might be able to say that a phoneme or cluster “looks” Uralic, or Germanic, or otherwise, but if you *label* a phoneme as an “inheritance/loanword phoneme”, it seems to me that you are closing off avenues of investigation that might (if new evidence is found, or noticed) shed more light on the origin of a given form.

    A couple of side questions:

    1) Are we really sure that “fluori” is of Swedish origin? German also has “Fluor” as the name of this element.

    2) Re: the designation of -rt- as a “loanword cluster”: I have a recent etymological dictionary of Finnish and at least the loan-etymology of varsi/varte- “stem, stalk” doesn’t seem undisputed. And, even if all known words with -rt- were demonstrated to be of IE origin, I don’t see how this would disqualify -rt- as a *possible* inheritance cluster: word-medial consonant clusters are found in many ancient Finnic words (like the *-kt- in kaksi/kahte-), and I don’t know of any phonological grounds on which “resonant + stop”-clusters would be excluded. The only scenario for this that comes to mind is that there could have been a sound change that turned older *-rt- into something else, in which case all current “-rt-”s must have a different origin.

    • j. says:

      — Yes, you are correct. We cannot take inheritance phoneme arguments as a demonstration of a word’s origin. Likewise, though this is rarer, loanword phonemes can sometimes be introduced in relatively-native vocabulary as e.g. hypercorrections. The concepts are inherently statistical: what they are useful for is suggesting probable directions to further investigate. As soon as we actually know a word’s origin, there will be no need to talk about its minor foreign or native features in detail anyway.

      — Re fluori: most older scientific vocabulary in Finnish has obviously been transmitted thru Swedish, given that no university education in Finnish existed until the 1850s. The identification of fluorine as a separate element dates to a couple decades earlier; and its compounds, where the pseudo-root fluor- first was used, have been known far longer.

      — Re varsi: taken together, the competing Baltic / Germanic loan etymologies seem to me sufficient for establishing loan origin. Given the cognates in Mari, this could well be one of the oldest cases though.

      — As for the status of *rt in more general: it is obviously disqualified from being an inheritance cluster, since several clear loanword examples are known, e.g. parta ‘beard’. But yes, it is not disqualified from being a possibly native cluster solely due to a lack of known inherited examples. (Some examples of PU clusters for which we have no known Finnic reflexes include *pd₁, *wj.) To make that case, we will need to also note that none of the other Uralic languages show known inherited, reasonably widely distributed, non-loanword and non-derived cases of *rt either. I.e. we do not have sufficient evidence to suppose that the cluster existed in (early) Proto-Uralic. But indeed, even this is not as strong a case as could be. There have been various speculations for why homorganic clusters like *rt, *lt, *st were rare or absent from PU while heterorganic clusters like *rp, *lk are clearly attested, but I don’t think any actual evidence this way or the other has been uncovered. (Though sometimes I wonder if the Uralic words for ‘heart’ should be compared with Indo-European, in which case they might indicate that pre-Uralic *rt > PU *d₁.)

      • M. says:

        – Re: the lack of Finnish-language university education before 1850: I don’t follow how this excludes all languages except Swedish as the source of “fluori”. Again, German and French use the same term for this element as Swedish (“Fluor”/“fluor”; the Swedish term may be based on the German or French one), and it seems possible (though probably impossible to prove either way) that the first Finnish-speakers to use the term had some knowledge of German as well as Swedish.

        The context through which international technical terms are acquired (scientifically-educated groups of people) seems strongly correlated with literacy in international “languages of science” (i.e. the languages in which technical papers tend to be written): in the 19th century, or at least the later part of the century, German would almost certainly have been one of these languages, possibly also French or English.

        – From what I have read, Germanic origin for “varsi” is unlikely because it has cognates as far as Mordva. As for Baltic origin, the proposed Baltic cognates that I’ve seen (Lithuanian “vìrdis” and “(apý)varde”) don’t match the semantics of the Finnic word (“vìrdis” is a horizontal crossbar or pole in a barn; “apývarde” refers to the pole around which a hop plant grows, not the stalk or stem of the plant itself).

        – Lack of evidence (in this case for -rt- in Uralic) is not the same thing as evidence of lack, especially given the small size of the data set in this case: if I recall right, there are about 400 roots reconstructed for Uralic currently?

        • j. says:

          The people first acquiring knowledge of fluorine in Finland would’ve been predominantly speakers of Swedish, though. Presumably, words like “fluorine” made their debut into Finnish thru the translation of study materials or thru popular science writing, not by Finnish-speaking scholars encountering them in scientific literature.

          …On the other hand, investigating closer, Kaisa Häkkinen in Nykysuomen etymologinen sanakirja suggests that the first attestation of fluori may have been a glossary of chemistry from 1862 — written by Julius Krohn, a native speaker of German! Looks like I’ll have to grant this particular case.

          The cognates that e.g. UEW and SSA list for varsi are from Mari, not Mordvin: Hill Mari wurðə, Meadow Mari wurðo ‘shaft’. It’s in principle possible that these are from Baltic, while the Finnic words are of Germanic origin. Though I would consider it more likely that a formation more similar to what we find in Germanic also once existed in Baltic. After all, the old Baltic loanwords in Finnic/Samic/Mordvinic/Mari have not been acquired from the ancestors of modern Eastern Baltic in particular, but rather from the extinct Baltic varieties once spoken further north and east. There are even a few cases where the only attestation for a given Baltic word with Finnic descendants is from Old Prussian (e.g. *hirvi < *širvi ‘elk’, comparable to OPr. širwis ‘roe deer’), which surely never was in direct areal contact with Uralic.

          There is moreover at least one assured loanword of Germanic origin in Mari anyway: pundo ‘money, capital’, thru PG *punda-. Cf. also Erzya pondo, Moksha ponda ‘pound’. Etymological references normally attribute this to contact with the Goths in the 4th century, though I am not sure if we can rule out trade contacts with Vikings. (Transmission through Finnic should be in principle possible too, but the word is not attested from Veps, or, indeed, even Karelian.)

          • M. says:

            There’s something I find disturbing about the assumption that international technical vocabulary (such as the names of chemical elements) was, for some prolonged period, acquired into Finnish primarily through pedagogical materials.

            This amounts to saying (unless I am missing something) that Finnish speakers were able to access the terminology of scientists, but not as peers (i.e. as fellow scientists), only as “pupils” following a course laid out by other people than them.

            Maybe this was true to some extent during the period when the first generation of Finnish speakers went through university, but once this generation had been educated, why would Swedish have been necessary any longer as a “conduit” through which to receive technical vocabulary? Even during this first generation, as long as students gained some command of German, French or English, they would have been able to read technical (or pedagogical) literature in these languages and pick up the terminology they encountered.

            Re: varsi — I checked Häkkinen’s dictionary and you’re right, the only cognate mentioned outside Finnic is in Mari, not Mordva. My mistake.

            Regardless, the proposed Germanic etymology of varsi seems to involve a connection to varras (“spit, skewer”), for which the proposed Germanic cognates (at least those mentioned by Häkkinen) mean “bundle”, “cluster” or similar, which seems no closer semantically to varsi than the Baltic words mentioned earlier.

            • j. says:

              It’s of course reasonable to expect that university-educated L1 speakers of Finnish would have largely first learned a language such as German and then dealt with materials written in it. But this is irrelevant as long as no specifically Finnish-language scientific community existed, and Latin, French, German, Swedish etc. remained the working languages in technical matters.

              If we’re rather tracing the first introductions of a term into Finnish in general, not into the language palettes of individual Finnish speakers, then we’ll have to account for materials closer to the general public as well, e.g. newspapers, almanacs, and study materials in pre-university schooling. In proportion they would for long have outnumbered any use of Finnish in higher education. It’s possible though that I’m overestimating the degree to which scientific terminology could penetrate into these prior to the mid-1800s.

              I certainly agree that Finnish gaining a steady foothold in the learned spheres would lead to the establishment of direct Finnish-international links. We can note e.g. Agathon Meurman’s early one-man encyclopedia project Sanakirja yleiseen siwistykseen kuuluwia tietoja varten from the 1880s, quite probably the earliest recoverable Finnish attestation of a great deal of learned vocabulary, and which was largely based on the German Meyers Konversations-Lexikon.

              • M. says:

                “But this is irrelevant as long as no specifically Finnish-language scientific community existed, and Latin, French, German, Swedish etc. remained the working languages in technical matters.”

                OK, but such a community is exactly what one would expect to develop once a generation of Finnish speakers had gone through university.

                “If we’re rather tracing the first introductions of a term into Finnish in general, not into the language palettes of individual Finnish speakers, then we’ll have to account for materials closer to the general public as well, e.g. newspapers, almanacs, and study materials in pre-university schooling.”

                This brings up the question of when a word can be said to be part of a language “in general”. For example, 95%-99.9% of English speakers have probably never heard or used the term xanthelasma (a skin condition) — does this make it problematic to say that xanthelasma is a “part of” English, even though it appears in a dictionary when you look under “x”?

                I’ve wondered sometimes how much the vocabulary of scientific registers (chemistry, medicine, etc.) is really “moored” in particular languages to begin with. Maybe it makes more sense to speak of various international “bodies” of scientific vocabulary, which speakers adjust in various ways to the sound patterns and morphologies of their native languages, just as I pronounce foreign placenames in a way that reflects my own language’s phonetics.

                We would still have to distinguish between the scientific vocabularies used in Europe versus those used in (e.g.) China. But we would save a lot of space and time otherwise spent in the labeling and codifying of minutiae (at least it often seems like minutiae to me: e.g., “akkusatiivi is a Finnish word”, “accusatif is a French word”, “Akkusativ is a German word”, “akkusativ is a Bokmål Norwegian word”, etc. etc.).

  2. David Marjanović says:

    Given that *punda- is a Latin loanword, I wonder if it should really be reconstructed as Proto-Germanic, or if it was borrowed by Proto-NW-Germanic and then passed on to the East…

    *lightbulb moment* Actually, I need to look up if there’s any factor in that word that would have blocked the NW Germanic umlaut of *u…a to *o…a. If not, the word must have been borrowed after morphological levelling had made *o phonemic.

    “Likewise, though this is rarer, loanword phonemes can sometimes be introduced in relatively-native vocabulary as e.g. hypercorrections.”

    Or even, though that must be at least as rare, by taboo avoidance. This has happened with clicks in the Bantu languages of South Africa, apparently due to an Australian-like taboo against uttering the names of the dead.

    • _j. says:

      Coda *n indeed regularly blocks *a-umlaut. Cf. bundle < *bundą, dung < *dungaz, hound < *hundaz, etc.

      (Or, as I believe this should be reinterpreted: the raising of mid vowels before coda nasals is later than *a-umlaut in NWG, and probably never occurred in Gothic.)

