Freelance reconstruction

*ü > *i, *ü in Samoyedic


I have noted before that Proto-Uralic *ü, whose reconstruction has at times been opposed by various scholars, has never received a truly detailed defense.

Arguments contra have never been very detailed either, but one recurring claim has been that the contrast *i | *ü might not be reflected in Samoyedic. People who subscribe to a primary division between Finno-Ugric and Samoyedic (I do not, but I recognize that this is not universally held) will therefore be able to propose that the contrast did not yet exist in Proto-Uralic proper, and only emerged later on in the Finno-Ugric group, perhaps in different ways in different descendants. [1]

The most common opinion around is that the principal reflex of *ü in Samoyedic is indeed *i. This is claimed e.g. by Steinitz (1944), Sammallahti (1979) and Janhunen (1981). [2] The known examples all behave well enough:

However, Proto-Samoyedic is also reconstructed with an *ü (retained roughly as is in the southern languages: Selkup, Kamassian & Mator). And at least since Sammallahti, cases with apparent retention from Proto-Uralic have also been recognized. The data is rather sparse, but the following look reliable to me:

This kind of two-fold reflection would not be unique in Uralic. In fact, it resembles quite a bit the situation found in at least three other nearby Uralic dialect groups: Southern Khanty, Northern Khanty and Southern Mansi. In all three of these, we find “conditional retention”:

The exact same thing seems to be what is going on in Samoyedic. None of the cases in the first group occur next to an original velar consonant; all cases in the second group do. The contrast *i | *ü would therefore appear to be preserved in Samoyedic after all, although only in this one particular environment.

Phonetic details

Both Mansi and Khanty also suggest that this development is probably not conditional retention, but more likely something I could call “double cheshirization”. First *ü indeed develops to *i, but in the process it colors an adjacent velar consonant, e.g. *k > *kʷ. This phase is attested in Mansi, e.g. *künčə > *kʷinčə >> Western/Eastern /kʷäš/, Northern *kʷas > †/kʷos/ > /kos/ [6] ‘nail’; in the case of medial velar consonants, most often *-ɣ-, also in Surgut Khanty, e.g. *sükśə > *süɣəs > *siɣʷəs > /sewəs/ ‘autumn’. Then, *kʷ and *ŋʷ develop back to *k and *ŋ, but in the process they color an adjacent *i (or the like) either back to /ü/ (Southern Mansi), or to a back vowel, /o ~ u ~ uu/ (Khanty). So altogether: *kü- > *kʷi- > /kü-/, while e.g. *tü- > *ti- (and not > **tʷi- > **tü-). Expected *ɣʷ however merges with *w, which then generally remains. [7]
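The two-stage mechanism can be sketched as a pair of ordered rewrite rules. This is only a toy illustration, assuming the Southern Mansi-type outcome (re-coloring back to /ü/) and ignoring all later vowel developments; the function names `labialize` and `recolor` are my own labels, and *ɣ is left out because *ɣʷ is said above to merge with *w instead.

```python
import re

LAB = "\u02b7"  # modifier letter small w, as in kʷ

def labialize(form: str) -> str:
    """Stage 1: *ü > *i, coloring an adjacent *k or *ŋ on the way."""
    form = re.sub(r"([kŋ])ü", r"\1" + LAB + "i", form)   # kü > kʷi
    form = re.sub(r"ü([kŋ])", r"i\1" + LAB, form)        # ük > ikʷ
    return form.replace("ü", "i")                        # elsewhere plain ü > i

def recolor(form: str) -> str:
    """Stage 2: *kʷ, *ŋʷ lose their labialization and round an adjacent *i."""
    form = re.sub(r"([kŋ])" + LAB + "i", r"\1ü", form)   # kʷi > kü
    form = re.sub(r"i([kŋ])" + LAB, r"ü\1", form)        # ikʷ > ük
    return form

for start in ["künčə", "ńüktä", "tü"]:
    mid = labialize(start)
    print(start, ">", mid, ">", recolor(mid))
```

Run in this order, forms with a velar next to *ü come out looking like “retention”, while schematic *tü- ends up as *ti-, since there was never a labialized consonant to restore the rounding.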

One benefit of this approach is that we can assume some amount of vowel rotation to intervene. As seen from the above examples, the de-labialized reflexes of *ü in Mansi and Khanty are not /i/, unlike in Samoyedic. In Mansi, *ü still merges with inherited *i, but they both end up further lowered to *ä (which can then be e.g. secondarily lengthened to /ää/ or backed to /a/). In Khanty, *ü de-labializes at a late stage, by which time the reflexes of *ü and *i in Khanty had already drifted out of sync. The development at this time is also probably more [ʏ] > [ɪ] than [y] > [i]. The eventual outcome can be a reduced close ~ mid vowel /e/ (in EKh and SKh; phonetically approx. [ɪ]), a reduced open vowel /ä/ (in the Obdorsk dialect of NKh) or even reduced back open /a/ (in the most innovative dialects of NKh where earlier *a mostly > /o/, such as that of Kazym). By contrast /i/ gives mainly tense /ee/ (~ open reduced /ä/ in Surgut).

For illustration, here are a few of the words from my first list again, with Proto-Uralic *ü in a non-velar environment, now with their reflexes in Southern Mansi, Southern Khanty and Northern Khanty:

(These two appear to be the only examples retained in both Samoyedic and in all relevant Ob-Ugric dialects.)

Additional parallels

While there is clearly a widely parallel set of innovations involved, it is however not possible to assume *(k)ü- > *(kʷ)i- as a general East Uralic innovation. After all, *ü remains as a rounded front vowel in Eastern Khanty on one hand, Hungarian on the other, regardless of the environment. [8]

But even more impressively, the similarities do not end here! Northern Khanty (but again, not Eastern Khanty) and Mansi (this time in general) also share a related sound change, by which original *wi- becomes *wü-. In the latter this is followed by loss of *w-, and due to this the change can be identified as common Mansi rather than exclusively Southern Mansi: e.g. *wittə ‘5’ > *wütə > *ütə >> Proto-Mansi *ätə > SMs /ät/, NMs ат /at/ ‘5’. [9] Compare EKh (Vakh-Vasjugan) and NKh /weet/.

This split, as well as the Mansi-style loss of *w, occurs even in Hungarian, where the resulting secondary *ü is retained as a labial vowel, just like primary inherited *ü: öt ‘5’, öl- ‘to kill’.

Samoyedic again follows suit:

While there are only two examples of this precise development, it can be identified as a more general shift *i > *ü next to labiovelars, with a total of five examples in Samoyedic after all.

I can indeed find no clear examples of Proto-Samoyedic words beginning with *wi-. Most cases that have earlier been reconstructed as such can now be identified as rather continuing *we-; notably *wet pro **wit ‘water’, where original *e is assured both by Old Nganasan †be’, and western cognates such as Fi. vesi. Probably a similar case is PSmy *wi/eŋü ‘son-in-law’, cf. Fi. vävy with open /ä/. The reconstruction of *e is not verifiable, though: Old Nganasan †biŋi has undergone a (regular) assimilation *e-i > *i-i, as also in e.g. *kettä (? *käktä) ‘2’ > *ketä > *śetä > †śiti.

Clear examples of *ü next to a former labiovelar also include *jüjə ‘beard moss’, *kürə ‘rope’ < PU *jäwjə, *käwd₁ə; but here we seem to have a distinct coloring process, perhaps with something like *äw > *öw as its first stage.

Implications within Samoyedic?

Another interesting fact is that the Nganasan reflex of PSmy *ü is *i. It is in principle possible that this actually reflects a more archaic stage than the rest of Samoyedic: if PSmy *ü normally develops by “re-coloring” from an intermediate *i, then perhaps this last stage of the shift never happened in Nganasan.

The contrast *i | *ü is nominally reflected in Nganasan, in that the former palatalizes a preceding *k, while the latter doesn’t (e.g. *kimä > /śimi/ ‘coal’ | *küntə > /kintə/ ‘smoke’). However, this could also be analyzed as reflecting instead an intermediate contrast *ki | *kʷi: just as in e.g. western Romance, plain *k would have been palatalized, yielding /ś/, while labialized *kʷ would have resisted, and later lost its labialization.

Regardless, this seems less likely to me than the usual explanation. The shift *ü > /i/ would not be isolated in Nganasan: it is instead part of an extensive vowel chain shift, whereby also *u > /ü/, *o > /u/, *å > /o/. (And though I have not seen this mentioned in sources, presumably also the raising of 2nd syllable open “non-neutral” stem vowels *ä, *å to /i/, /u/ is a part of this.)
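What makes this a chain shift rather than a series of mergers is that each vowel moves into the slot its neighbor has just left. A minimal sketch of that, assuming the four correspondences listed above and applying them to every matching vowel in a form (a simplification on my part); `chain_shift` and `naive_sequential` are hypothetical helper names, and the input "tu" is purely schematic:

```python
import unicodedata

# The four steps named above, applied simultaneously:
# *ü > i, *u > ü, *o > u, *å > o
NGANASAN_CHAIN = str.maketrans({"ü": "i", "u": "ü", "o": "u", "å": "o"})

def chain_shift(form: str) -> str:
    """Map every vowel once, so no vowel is carried through two steps."""
    # NFC makes sure ü and å are single code points before translating.
    return unicodedata.normalize("NFC", form).translate(NGANASAN_CHAIN)

def naive_sequential(form: str) -> str:
    """Counter-example: feeding order (*u > ü first, then *ü > i) causes a merger."""
    for old, new in [("u", "ü"), ("ü", "i")]:
        form = form.replace(old, new)
    return form

print(chain_shift("küntə"))      # kintə, as in 'smoke' above
print(chain_shift("tu"))         # tü: schematic *u lands in the vacated slot
print(naive_sequential("tu"))    # ti: wrongly merged with original *ü
```

The contrast between the last two lines is the point: the chain only preserves distinctions if *ü has already vacated its slot (or all steps are treated as simultaneous) by the time *u is fronted.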

Another possible counterargument is that the case of *ńüktä- > *ńüt- shows that at least the re-cheshirization *ikʷ > *ük must be earlier than the loss of *k from consonant clusters. But the latter is reflected in all Samoyedic languages, and would be best dated to Proto-Samoyedic proper.

Exceptions & more

A few additional complications to the scheme above exist as well.

Firstly, there are some cases of *ü in Proto-Samoyedic that do not seem to occur next to a velar consonant.

In loanwords this is a non-issue, e.g. *jür ‘fat’ ← Turkic *jür₂. Many other cases could also be explained in a similar way as ‘to pull’, ‘rope’: from a pre-form with a velar consonant, later regularly lost. Some examples of this type would be e.g. *jü ‘knot’ < pseudo-PU ? *jüKə; *čürə ‘ski pole’ < ? *čäwrV-. There also seem to be cases of *ü that come from the fronting of former *u, such as *jürə- ‘to get lost’, most likely from PU *jurə- (> Samic *jorë-, etc.). In principle *ju- > *jü-, as seen here, could even be regular! There are no examples of PSmy *ju- in vocabulary of Uralic origin.

A second complication is one apparent counterexample. ‘Snake’ in Proto-Uralic is usually reconstructed as *küjə. Several reflexes however point to a secondary derivative along the lines of *küjə-wä. These include Erzya /kijov/, Hungarian kígyó — and in Samoyedic: *kiwä, rather than expected *kʷiwä > *küwä. However, I suspect that the reconstruction with original *ü is erroneous. It seems to be based on Finnic *küü (> Fi. kyy, etc.) in the first place, secondarily also on Udmurt /kɨj/. But if instead of assuming variation *küjə ~ *küjə-wä we reconstructed only a single variant, then perhaps the source of labialization in the Finnic form is instead the “suffix” *-wä, and we can make do with *kijəwä. The later development in Finnic would then seem to be something along the lines of > *kiiwä > *küüwä > *küü.

Maybe we could even reconstruct a simple bisyllabic form *kejwä. This has the benefit of regularly explaining Erzya /i/ (*e-ä > /i/). In Samoyedic, ‘snake’ is not attested from Nganasan, so also *kewä is an equally possible reconstruction; and in Hungarian, í /iː/ usually indicates earlier *e, not *i (e.g. *wetə > víz ‘water’; *me > mi ‘we’; *ke > ki ‘who’). Contraction *ijV > í would be another option in principle, but hardly here: *-j- seems to be instead fortited to yield medial -gy-. This is again less compatible with Udmurt, however, where from *e we would expect Proto-Permic **koj > ˣ/kuj/; something like a Finnic-style assimilation *ej > *ij or a late fronting *u > /ɨ/ (a common enough phenomenon in Udmurt) would have to be assumed.

Also challenging is Tundra Nenets /ṕud/ ‘rope’, which has known cognates in Mordvinic (/piks/), Hungarian (obsolete fiu) and Khanty (*püüɣəL). The Nenets form suggests PSmy *pütə. On first look, this fits well enough into the picture so far: *-t- comes from PU *-ks-, and hence *ü could be conditioned by the former coda velar. But the cognates do not suggest PU *ü; they look more like PU *peksä, or, in principle, *pexsV. [10] Is labialization triggered here by the initial *p-, or is this an example of something more complex along the lines of PU #pewVksV? Or maybe, as Janhunen (1981) suggests, /u/ in Nenets could also be some kind of a late secondary development? There seems to be a parallel of sorts in ‘liver’: PU *mëksa > PSmy *mïtə > TNe /mid/ (literary мыд) ~ /mud/, with /i/ > /u/ between a bilabial and *t. But this could also be purely a coincidence… *mï- > /mu-/ seems to be regular in Enets, and maybe the TNe variant with /u/ for ‘liver’ is simply borrowed from there.

[1] The most consistent defender of this approach has probably been Gyula Décsy. His proposal has been free variation *[i ~ y] in early Finno-Ugric in various labializing environments, later semi-randomly fossilized as separate phonemes. (E.g. 1969: “Die Streitfragen der finnougrischen Lautforschung”, Ural-Altaische Jahrbücher 41.) This is gratuitously vague, though. Any reconstructions involving free variation are probably unverifiable even in principle, and I’d like to see some actual precedents for the alleged mechanism of “fossilization of free variation” before I buy any non-explanation of this sort.
[2] The first two of these have been by now added to a new section on my Bibliography page.
[3] The proposed Hungarian cognate nyél is multiply irregular — unexpected palatalization, unexpected vowel height and roundedness — and might not belong here at all.
[4] A new comparison, to my knowledge. The derivative *ńüktä- has previously only been attested from western branches of Uralic: Finnic (Fi. nyhtää ‘to pull (off)’), Mordvinic and Mari. The base root *ńükə-, meanwhile, has cognates also in Ugric, e.g. Hungarian dialectal nyű. The sound correspondences are mostly unproblematic, though SW reconstructs *nüt¹- (= *nüt- or *nüč-) rather than *ńüt- for Proto-Samoyedic. The only reliable reflexes are however from Nenets and Kamassian, which do not distinguish *n- from *ń- before front vowels. Southern Selkup /nüš-/ ‘to tear in half’, with unexpected /š/ as well, is perhaps best left unrelated.
[5] Phonetically the Southern Khanty reflex is actually centralized [ɵ̆] (traditionally transcribed ȯ̆). I follow Honti in analyzing this as an allophone of /o/ next to a velar consonant. Also the opposite interpretation has been proposed though, by Edith Vértes in her editorial comments in the 2nd volume of K. F. Karjalainens Südostjakische Textsammlungen (1997; SUST 225): phonemic /ö/ which conditions a velar ([-back]) allophone [k] of /k/, distinct from phonemic /o/ which conditions the uvular ([+back]) allophone [χ]. However, it is necessary to consider [k] versus [χ] to be phonemic in some other positions (one minimal pair is /keečə/ ‘knife’ versus /xeečə/ ‘mold’; from Proto-Khanty *keečää | *kïïčəɣ), and I think this means that the same would be the preferable analysis also for the contrast of [kɵ̆] versus [χŏ]. Vértes also notes cases of [ɵ̆] that occur in other environments, though; so Honti’s analysis may have to be back-dated to proto-Southern Khanty, followed by the phonemicization of /ö/ in the separate SKh dialects due to loanwords & such.
[6] /kʷo-/ for PU *kü- > PMs *kʷä- was still recorded by Bernát Munkácsi in his field records from 1888. The field records of Artturi Kannisto from 1905, however, already have just /ko-/, as still in today’s Northern Mansi.
[7] Phonetic [ɣʷ] is attested from Tremjugan Khanty, but this can be interpreted as a post-vocalic allophone of /w/. Márta Csepregi’s chrestomathy gives [ɣʷ] also for /w/ before /uu/.
[8] I suppose a theoretical realignment would be to reconstruct some kind of a different secondary labializing factor for word roots of the ‘glue’, ‘handle’ type (e.g. **d₂imwä, **nid₁wə?), but this does not seem to offer any clear benefits: we would have to assume that this labializing factor gets lost everywhere, but also manages to cause the same kind of rounding effect even in Finnic (which definitely has never been a neighbor of Eastern Khanty specifically).
[9] A final vowel is attested in early Mansi records, though by the late 1800s lost from all varieties. I reconstruct *-ə for such cases; they can be moreover secondarily identified by vowel lengthening in open syllables in Western and Eastern Mansi, thus here e.g. /äät/ and not ˣ/ät/.
[10] Janhunen (1981) suggests *piksi (= my *piksə), but this does not actually match any of the descendants.