Proto-Yukaghir voiced stops (and their implications)

One of the more popular proposals for external relationships of the Uralic family is the Uralo-Yukaghir hypothesis. By certain measures it might even count as the most popular one. The idea has been around for a long while, but in an infuriatingly entrenched state: views are divided between mainstream specialists who dismiss everything as speculation, and macro-comparativists and several outsiders who take the relationship more or less for granted. [1] For an example from the humbler and more “professionally credible” end of the latter group, consider Michael Fortescue’s 1998 monograph Language Relations Across Bering Strait: the book makes no attempt to explore the possibility that any Uralic/Yukaghir similarities result from anything but genetic inheritance. This is a particularly jarring omission, since he does cover other contact influences relevant to his idea of relating Uralic, Yukaghir, Chukotko-Kamchatkan and Eskimo-Aleut: those between Y + CK, CK + EA, and even between the individual branches of CK and EA.

Research into the hypothesis seems to be finally picking up these days, though. Much of this must have been enabled by Irina Nikolayeva’s ongoing work on the Yukaghir side, culminating in her 2006 monograph, A Historical Dictionary of Yukaghir. After an apparent latency period of diffusion and digestion, a bunch of new views on U/Y relations have emerged here in Finland within the last few years in particular:

  • Häkkinen, Jaakko (2012): Early contacts between Uralic and Yukaghir. [Appendix.] In: SUST 264.
    — An attempt to model lexical correspondences as several strata of loanwords, and to determine what this would imply for Uralic and Yukaghir prehistory in geographical and archeological terms.
  • Piispanen, Peter S. (2013): The Uralic-Yukaghiric connection revisited: Sound Correspondences of Geminate Clusters. In: SUSA 94.
    — A more optimistic take, presuming a relationship and suggesting some new lexical comparisons requiring rather wild new soundlaws.
  • Luobbal Sámmol Sámmol Ante (Ante Aikio): The Uralic-Yukaghir lexical correspondences: genetic inheritance, language contact or chance resemblance? [Preprint.] To appear in: FUF 62.
    — A detailed, conservative review, suggesting that the currently known material is too scarce to establish regular sound correspondences, and that therefore many lexical comparisons may turn out to be simply accidental similarities.

According to the word on the grapevine, there is also at least one further paper in the works on the topic.

I have yet to subscribe to any particular hypothesis on the topic (though of course the burden of proof should lie on those claiming a particularly close U/Y relationship). But it seems to me that any assessment of the situation is going to depend strongly on our general understanding of Uralic and Yukaghir prehistory. One of the aims of my various ongoing work on Proto-Uralic is indeed to allow a better assessment of the various external relationships that have been proposed. I present here one proposal for amending Proto-Yukaghir as well.


The presence of voiced spirant consonants (at minimum *ð, *ɣ) has been listed by Fortescue as one of the better phonological markers of his “Uralo-Siberian” group of language families. The phonetic character of at least the Proto-Uralic “spirants” is however anything but clear… And on closer examination, I believe that for Proto-Yukaghir they’re probably a mistaken assumption.

The modern Yukaghir languages — Kolyma Yukaghir and Tundra Yukaghir — do not have any systematic series of voiced spirants. These only show up in Proto-Yukaghir as reconstructed by Nikolayeva. She posits PY word-medial *w, *ð, *ɣ [2] behind the following three sound correspondences:

  • Kolyma /b/ ~ Tundra /w/
  • Kolyma /d/ ~ Tundra /r/
  • Kolyma /g, ʁ/ ~ Tundra /g, ʁ/ (depending on the PY vowel backness)

This is not an immediately obvious reconstruction. Several changes are required here to derive the modern sound values: across-the-board spirant fortition in Kolyma, rhotacism of *ð + sporadic fortition of *ɣ in Tundra. It seems to me it would be more parsimonious to reconstruct here PY voiced stops *b, *d, *g (~ [ʁ]), and to assume only the lenition of *b and *d in Tundra. Note also that the change *d > *r can easily occur directly, without any intermediate *ð stage.
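The parsimony argument can be made concrete with a rough sketch in Python that counts how many language-specific changes each reconstruction needs in order to reach the attested reflexes. The table layout and the function name are illustrative assumptions, and the velar is simplified to plain /g/ (the /g/ ~ /ʁ/ alternation is ignored). A raw count like this also says nothing about how natural each individual change is.

```python
# Attested medial reflexes, taken from the three correspondences above.
# Assumption: the velar is simplified to /g/ in both languages.
ATTESTED = {
    "labial": {"Kolyma": "b", "Tundra": "w"},
    "dental": {"Kolyma": "d", "Tundra": "r"},
    "velar": {"Kolyma": "g", "Tundra": "g"},
}

# The two candidate Proto-Yukaghir reconstructions under comparison.
SPIRANTS = {"labial": "w", "dental": "ð", "velar": "ɣ"}
STOPS = {"labial": "b", "dental": "d", "velar": "g"}


def changes_needed(proto):
    """Return (language, proto-segment, reflex) wherever the reflex differs."""
    return [
        (lang, proto[place], reflex)
        for place, reflexes in ATTESTED.items()
        for lang, reflex in reflexes.items()
        if reflex != proto[place]
    ]


for name, proto in [("spirants", SPIRANTS), ("stops", STOPS)]:
    changes = changes_needed(proto)
    print(name, len(changes), changes)
```

Under these simplifications the spirant reconstruction requires five changes and the stop reconstruction two (*b > /w/ and *d > /r/, both in Tundra).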

*w is also reconstructed word-initially for Proto-Yukaghir: again reflected as Tundra /w/, but instead lost in Kolyma. This is an odd asymmetry. Normally, glide or spirant fortition is more likely to occur word-initially; cf. for example Spanish and Selkup. [3] On the other hand, *b is not a consonant that is commonly lost word-initially, so reconstructing that here, too, would not help either. I suggest accepting the asymmetry instead of trying to explain it away: reconstructing initial *w- but medial *-b-. This state of affairs still technically allows identifying these two as the same proto-phoneme, which would provide a motivation for my newly assumed shift *b > /w/ in Tundra (and yet not *g > ˣ/ɣ/, which is a more common first step of voiced stop lenition).

Perhaps there was also an earlier original word-internal *-w-, which was vocalized/lost in all attested Yukaghir varieties; either already in Proto-Yukaghir, or even slightly later on, in which case it might explain some of the numerous irregular vowel correspondences between Tundra and Kolyma.

The history of PY consonant clusters can furthermore be streamlined here. Nikolayeva sets up a set of nasal + voiceless stop clusters such as *mt, *ŋć, *ŋk, and has to assume later voicing to yield the actually attested /md/, /ŋď/, /ŋg/, etc. However, if voiced stops and not spirants are posited for PY, they can easily be reconstructed here as well. Nikolayeva also reconstructs liquid + stop clusters, and notes that the stops “mostly” remain unvoiced in these; yet with some exceptions. It seems these alleged exceptions, which correlate neatly between Tundra and Kolyma, could have been in place already in Proto-Yukaghir.

The overall phonotactic pattern here — voiced stops that are restricted to word-medial positions and only contrast with voiceless stops between vowels (and, perhaps, after liquids?) — still suggests that some pre-Yukaghir stage only had voiceless stops; which were then voiced in some medial positions; followed by the introduction of new medial voiceless stops from some secondary source (e.g. geminate voiceless stops, loanwords). Some variation of this history has occurred widely among the Uralic languages, for one. But this is no reason to assume that the change is recent! Dialects of Mokša and Mari have resisted initial voiced stops in loanwords until fairly modern times (18th-20th century), despite medial voiced stops having existed already in Proto-Mordvinic and Proto-Mari times (somewhere around the 1st millennium CE).
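The three-stage scenario can be illustrated with a toy derivation in Python. Everything here is an assumption made for illustration: the input strings are invented, not actual pre-Yukaghir reconstructions, the segment inventory is reduced to a five-vowel system with stops /p t k/, and geminates are picked as the secondary source of new medial voiceless stops.

```python
import re

VOWELS = "aeiou"
VOICE = str.maketrans("ptk", "bdg")


def voice_medial(word):
    """Stage 1: voice a single stop between vowels or after a nasal."""
    pattern = rf"(?<=[{VOWELS}mnŋ])([ptk])(?=[{VOWELS}])"
    return re.sub(pattern, lambda m: m.group(1).translate(VOICE), word)


def degeminate(word):
    """Stage 2: shorten geminate voiceless stops, creating new medial /p t k/."""
    return re.sub(r"([ptk])\1", r"\1", word)


def derive(word):
    """Apply the two changes in the order assumed in the text."""
    return degeminate(voice_medial(word))


# Invented toy forms, not real reconstructions.
for form in ["pata", "patta", "tampa"]:
    print(form, ">", derive(form))
# pata > pada
# patta > pata
# tampa > tamba
```

The output shows the distribution described above: initial stops stay voiceless, /t/ and /d/ contrast only between vowels, and the post-nasal stop is voiced. The ordering matters: if degemination applied first, the toy forms "pata" and "patta" would merge as "pada".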

Lexical correspondences with the Uralic languages also appear to support this model. I will refer here to Proto-Yukaghir roots by their index numbers in the Historical Dictionary, following Aikio’s paper linked above (it includes a useful appendix of Nikolayeva’s U/Y comparisons).

Considering the labial consonants other than *m, three recurring patterns involving these seem to be attested:

  • PU *w ~ PY ∅ (#620, ‘tree’ ~ ‘birch’; #1112, ‘vapor’ ~ ‘smoke’; ? #2050, ‘to hear’ ~ ‘sound’)
  • PU *(m)p ~ PY *w (#139, ‘older sister’; #1048, ‘warm’)
  • PU *pp ~ PY *p (#362, ‘sharp’; #1038, ‘to tear’; #2150, ‘to hit’)

Medial *-w-, *-p-, *-pp- are actually fairly rare in PU, so even though some of the Uralic roots involved here are uncertain and there are some semantic differences, I find this a not quite trivial tally.

The correspondence *w ~ *w also seems to be absent (#806 ‘to leave’ is a clearly rejectable comparison since the supposed “Uralic” root is now known to be a Germanic loan). While the material is scarce and so this could be an accidental gap, it seems regardless preferable to interpret the material as reflecting the following developments:

  • (pre-)PU *w → pre-Y *w > PY ∅
  • (pre-)PU *(m)p → PY *b (voiced either in pre-Yukaghir or in some loaning Uralic branch)
  • (pre-)PU *pp → PY *p (shortened either in pre-Y or in some loaning Uralic branch)

…which also implies that we should indeed not expect any examples of the correspondence *-w- ~ *-b- to turn up. [4]
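As a minimal sketch, the labial tally and the proposed developments can be checked against each other in Python. The tuple layout and variable names are illustrative assumptions. The index numbers are the ones cited above (with the uncertain #2050 included), and Nikolayeva’s medial *w is renotated as *b in line with the proposal made here.

```python
from collections import Counter

# (PU medial, PY medial in Nikolayeva's notation, dictionary index number)
COMPARISONS = [
    ("w", "∅", 620), ("w", "∅", 1112), ("w", "∅", 2050),  # 2050 is uncertain
    ("(m)p", "w", 139), ("(m)p", "w", 1048),
    ("pp", "p", 362), ("pp", "p", 1038), ("pp", "p", 2150),
]

# The three proposed developments, PU source mapped to PY outcome.
PROPOSED = {"w": "∅", "(m)p": "b", "pp": "p"}

# Renotation assumed here: medial PY *w is rewritten as *b.
RENOTATE = {"w": "b"}

tally = Counter((pu, RENOTATE.get(py, py)) for pu, py, _ in COMPARISONS)
predicted = set(PROPOSED.items())

print(tally)
print("attested patterns match the proposal:", set(tally) == predicted)
print("examples of *-w- ~ *-b-:", tally[("w", "b")])
```

With this small data set, the attested patterns are exactly the three predicted ones, and the pair *-w- ~ *-b- has no examples.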

This does not seem to generalize to the other places of articulation, though. There do not seem to be any recurring correspondences involving intervocalic dental obstruents (or, even more suspiciously, any comparisons involving *-t- on either side [5]); and the only recurring intervocalic velar correspondence is PU *x ~ PY *g (#1480, ‘guard’ ~ ‘hunt’; #2599, ‘lead, take’). There is also one example each of *k ~ *g (#1302, ‘hill(s)’) and of *w ~ *g (#1019, ‘to eat’). These bring to mind the East Uralic development of *-k-, *-w- to *-ɣ-, which suggests that if these comparisons are correct, they probably represent loans rather than inheritance.


Additionally, I wonder if the current issue has partly also been an issue of terminology. Nikolayeva’s model of the history of Yukaghir includes not only the Proto-Yukaghir stage, but also an “Old Yukaghir” stage, which would already have e.g. featured voiced stops in clusters. This is mainly used as a cover term for early historical records prior to the mid-19th century, but perhaps her underlying mental model in full detail actually looks like this:

Proto-Yukaghir > Old Yukaghir > dialectified Old Yukaghir > modern Kolyma Yukaghir & Tundra Yukaghir

Under this scenario, the first “Old Y.” stage would be the actual last common ancestor of the recorded Yukaghir varieties, while “Proto-Y.” would be an internally reconstructed entity. It would not be the first time a historical linguist has abused terminology in this way.

This is not a random guess. There are a couple of other hints for this interpretation, e.g. the treatment of long vowels. Nikolayeva does not reconstruct these in certain positions where they do not contrast with short vowels, even though they appear in all records. She assumes that they must hence ultimately be somehow secondary even in other positions. This does not necessarily follow: consider e.g. Modern English, where “vowel length” (or rather: tenseness) fails to be contrastive in open monosyllables, and in most dialects also before /r/. Regardless of this, and even regardless of numerous reconstructible processes of compensatory lengthening (e.g. light /laɪt/ ~ German Licht /lɪçt/), the vowel length contrast in English is absolutely ancient: it can be traced back all the way to Proto-Indo-European!

(English incidentally and probably coincidentally works as a typological parallel also for my idea that medial *-w- could have been lost earlier on while initial *w- still remained.)

Finally, I can’t help noticing that the long vowel issue and the reconstruction of spirants rather than voiced stops both swerve “Proto-Y.” typologically closer to standard-issue Proto-Uralic. Is this perhaps not an accident, but rather a general bias that has resulted from Nikolayeva’s working hypothesis of a Uralo-Yukaghir relationship?

[1] Incidentally I find it an interesting question why this particular hypothetical relationship is so pervasively accepted by Nostraticists and the like. There is no shortage of competing proposals, such as Indo-Uralic or Uralo-Dravidian; and neither does Uralo-Yukaghir have a history of recognition by the general public, unlike e.g. the Ural-Altaic or Uralo-Sumerian hypotheses. Is it perhaps that the relative obscurity of Yukaghir has made it more difficult to notice weaknesses of the idea?
[2] Yes, I am aware that /w/ is a semivowel, not a spirant, though frequently it may pattern as one (or, perhaps better: phonologically isolated voiced spirants may pattern as dental/velar glides).
[3] Even more so for geminate glides actually, with some precedents being North Germanic + Gothic (*ww > *ggw, *jj > *ddj ~ *ggj); Northern Sami (*jj > /dj/); Votic (*jj > /ďď/); various Prakrits including Pāli (e.g. *vv > /bb/); and several Berber varieties (e.g. *ww > /ggʷ/). This doesn’t seem to come into question here, though.
[4] There is a development *w > *b in most Samoyedic languages that could allow this, but being post-Proto-Samoyedic (absent from Nenets and Selkup), this might have been too late to be relevant.
[5] This is particularly curious since PU *-t- has, by contrast, Indo-European correspondences in abundance. Any macrocomparativist model that proposed common ancestry for all three, or even just for Y+U, would be hard-pressed to explain why Yukaghir has lost such words so consistently.

Posted in Reconstruction
2 comments on “Proto-Yukaghir voiced stops (and their implications)”
  1. David Marjanović says:

    A fascinating topic!

    It’s not true that Nostraticists accept Uralo-Yukaghir across the board. The latest I’ve seen from the Moscow School has a tree with Indo-Uralic in it and doesn’t even mention the U-Y hypothesis, IIRC. I’ll try to look it up (not now, I should go to bed).

    numerous reconstructible processes of compensatory lengthening (e.g. light /laɪt/ ~ German Licht /lɪçt/)

    You happen to have picked a bad example. This is shortening of a long vowel in front of a consonant cluster which followed the so-called “New High German monophthongization”. It’s limited to Central German. At least the latter process hasn’t happened in my Upper German dialect (a Central Bavarian one, meaning that vowel length isn’t phonemic), so it retains the original diphthong, and the word is pronounced [lɪɐ̯xt] as if spelled *lircht.

    Hence Liechtenstein.

    I think right, German recht, is an example of what you mean; it has [e] in my dialect and [æ] somewhere in Scotland or thereabouts. Most likely, night, German Nacht, also counts, though the vowel correspondence is really surprising.

  2. David Marjanović says:

    …both recht/Recht and richtig
