On *ü in Mari vs. Proto-Uralic

It is always a low note of sorts when a scientific dispute gets resolved by quietly shifting consensus (e.g. due to proponents of one side passing away) rather than by actual discussion.

One such case seems to be the status of Proto-Uralic *ü. In literature up to about the mid-1900s, various skeptical viewpoints can be found on whether a contrast between *i and *ü should be reconstructed at all. They dwindle away in later times, however, and the modern researcher only really encounters any trace of the issue when perusing the UEW, which still provides proto-forms with *ü only as an alternative to proto-forms with *i. Even so, I have so far been unable to locate any turning-point source that argues in detail in favor of establishing *ü after all.

To be sure, all major overviews of comparative Uralic vocalism (Steinitz 1944, Collinder 1960, Sammallahti 1988) still reconstruct contrastive front rounded *ü (or, in the case of Steinitz, largely equivalent reduced *ö̆), and give what they see as the regular later development in most individual languages. It is thus fairly simple to reverse-engineer a rough argument for when to reconstruct *ü. In particular, the following three contrasts appear to be relatively robust and in etymological correspondence with each other:

 • Finnic *i : *ü
 • Hungarian ë : ö
 • Khanty *e : *ö (perhaps rather *[ɪ] : *[ʏ])

Also the *i : *ɨ contrast in Permic correlates well with this (though *ɨ can also derive from PU *u and *ä).

Numerous further conditional developments, including also indirect traces in several Uralic languages that lack front rounded vowels, have also been identified. Collating these in one place would probably amount to an almost full answer to old skeptical viewpoints, which mostly have focused on the possibility that the contrasts seemingly pointing to *i : *ü have separately developed in each language.


I think one subgroup remains an open problem though. A phonetically equivalent contrast also appears in Mari, between *ĭ (> generally /ə/, in a couple of dialects /ɪ/ or /i/) and *ü̆ (> Hill Mari /ə̈ ~ ʏ/, Meadow Mari /y/). But this particular contrast seems to do a poor job at matching with the Proto-Uralic *i : *ü contrast, as could be reconstructed on the basis of the other languages. While reflexes with “correct” labiality seem to be in the lead, an abundance of counterexamples is also apparent: [1]

 • PU *i > Ma *ĭ: 14 cases
  *ićä ‘father’ > *ĭćä ‘older brother’, *kičək > *kĭčək ‘fresh snow’, *kirä- > *kĭre- ‘to hit’, *kiśkə- > *kĭške- ‘to throw’, *minä > *mĭńə ‘I’, *ńičkä- > *jĭčke- ‘to pluck’, *pićlä > *pĭćle ‘rowan’, *pilwə > *pĭl ‘cloud’, *pištä- > *pĭšte- ‘to put’, *pitä- > *pĭće- ‘to hold’, *śikšta (← II) > *šĭštə ‘beeswax’, *śilmä > *šĭnćä ‘eye’, *tinä > *tĭńə ‘thou’, *wittə > *wĭć ‘5’
 • PU *i > Ma *ü̆: 6 cases
  *kiwə > *kü ‘stone’, *piŋə > *pü ‘tooth’, *nimə > *lü̆m ‘name’, *śixələ > *šülə ‘hedgehog’, *šikšna (← Baltic) > *šü̆štə ‘strap’, *sitV- ‘to bind’ > *šüðəš ‘bind’
 • PU *ü > Ma *ĭ: 9 cases
  *küjə > *kĭškə ‘snake’, *külmä > *kĭlmə ‘cold’, *küńärä > *kĭńer ‘elbow’, *kütkə- > *kĭćke- ‘to harness’, *mükkä > *mĭk ‘mute’, *ńüktä- > *ńĭktä- ‘to pluck’, *süjə > *šĭjä ‘year ring’, *sükəśə > *šĭžə ‘autumn’, *śüklä (← Turkic) > *šĭɣəľə ‘wart’
 • PU *ü > Ma *ü̆: 12 cases
  *d₂ümä > *lü̆mə ‘glue’, *künčə > *kü̆č ‘nail’, *künčä- > *kü̆nče- ‘to dig’, *küsV > *kü̆žɣə ‘thick’, *kütV > *kü̆ðäl ‘middle’, *sülə > *šü̆lə ‘fathom’, *süskV- > *šü̆škä- ‘to cram’, *śüd₁ə > *šü ‘coal’, *śülkə > *šüwəl ‘spit’, *türə > *tü̆rəś ‘full’, [2] *tüŋə > *tü̆ŋ ‘base’, *wülä > *wü̆l- ‘over’

PU *e also mostly yields Ma *ĭ or *ü̆, again split fairly evenly.

 • PU *e > Ma *ĭ: 15 cases
  *e- > *ĭ- ‘negative verb’, *elä- > *ĭle- ‘to live’, *eštə- ‘to be in time’ > *ĭšte- ‘to do’, *jećə > *ĭške ‘self’, *jekä > *i ‘year’, *keltä- > *kĭlðe- ‘to bind’, *kenčV- > *kĭčälä- ‘to search’, *neljä > *nĭl ‘4’, *le- > *liä- ‘to be’, *leštə > *lĭštäš ‘leaf’, *peljä > *pĭləkš ‘ear’, *penä > *pi ‘dog’, *pesä > *pĭžäkš ‘nest’, *repäś (← II) > *rĭwəž ‘fox’, *śerV > *sĭr ‘character, nature’
 • PU *e > Ma *ü̆: 12 cases
  *jetV > *jü̆t ‘night’, *kejə- > *küä- ‘to boil’, *kerə > *kü̆r ‘bast’, *pečä > *pü̆nčə ‘pine’, *pečkV- > *pü̆čkä- ‘to cut’, *sesar (← IE) > *šü̆žar ‘sister’, *śečä > *čü̆čə ‘uncle’, *śepä > *šü ‘neck’, *tejnəš (← II) > *tü̆əž ‘pregnant’, *terä (← II) > *tü̆r ‘blade’, *werə > *wü̆r ‘blood’, *wetə > *wü̆t ‘water’

I have included here cases with Proto-Mari *i and *ü only in stems of the shape CV(V-), where the appearance of “full” rather than “reduced” vowels is regular. Some other examples exist as well though, such as *ik ‘one’ (< *ü?), *üpš ‘smell’ (< *i?).

Existing literature does not seem to tackle the issue, and often I get the feeling that authors essentially try to sweep the problem under the carpet. Sammallahti leaves the history of Mari vocalism untreated. Collinder offers, for the cases with *e > *ü̆, only the slightly ad hoc rule that this development occurs “in the vicinity of *w and *r”, while he does not comment on the cases with *i > *ü̆ or *ü > *ĭ. Steinitz’ approach posits a late development *ĭ > *ü̆ again in the vicinity of labial consonants (and raises the possibility that it applies only to Meadow Mari and not even Proto-Mari), but leaves the other cases untreated.

I have not seen any specialized studies that would have fared better either. E. Itkonen in his major 1954 article on the history of Mari and Permic vocalism even explicitly notes that labiality assimilations that he posits next to *w, *p, *r cannot be considered regular. Contrast indeed e.g. ‘blood’ (*we- > *wü̆-) vs. ‘five’ (*wi- > *wĭ-), ‘tooth’ (*pi- > *pü-) vs. ‘cloud’ (*pi- > *pĭ-), ‘blade’ (*-er- > *-ü̆r) vs. ‘to hit’ (*-ir- > *-ĭr-). — Also, since when is *r a labial consonant anyway?


I suspect that already the basic assumptions underlying earlier research on this are incorrect. Instead of the developments *i > *ü̆ and *ü > *ĭ being some kind of exception cases to be explained away, the old skeptic contingent has been right this time: the contrast between Proto-Mari *ĭ and *ü̆ is unrelated to the contrast between Proto-Uralic *i and *ü. Rather, PU *i, *ü and *e merged in the early history of Mari, and this merged phoneme (I will mark it simply as *i) later secondarily split into *i > *ĭ and *ü > *ü̆ again — without regard for its PU origins.

The best single conditioning factor instead appears to be stem type:

 • *i-ä > *ĭ: 23 cases
  *elä- > *ĭle-, *ićä > *ĭćä, *jekä > *i, *külmä > *kĭlmə, *keltä- > *kĭlðe-, *küńärä > *kĭńer, *kirä- > *kĭre-, *minä > *mĭńə, *mükkä > *mĭk, *neljä > *nĭl, *ńičkä- > *jĭčke-, *ńüktä- > *ńĭktä-, *pićlä > *pĭćle, *peljä > *pĭləkš, *penä > *pi, *pesä > *pĭžäkš, *pištä- > *pĭšte-, *pitä- > *pĭće-, *repäś > *rĭwəž, *śüklä > *šĭɣəľə, *śikšta > *šĭštə, *śilmä > *šĭnćä, *tinä > *tĭńə
 • *i-ä > *ü̆: 9 cases
  *d₂ümä > *lü̆mə, *künčä- > *kü̆nče-, *pečä > *pü̆nčə, *sesar > *šü̆žar, *śečä > *čü̆čə, *śepä > *šü, *šikšna > *šü̆štə, *terä > *tü̆r, *wülä > *wü̆l-
 • *i-ə > *ĭ: 11 cases
  *eštə- > *ĭšte-, *jećə > *ĭške, *kičək > *kĭčək, *küjə > *kĭškə, *kiśkə- > *kĭške-, *kütkə- > *kĭćke-, *leštə > *lĭštäš, *pilwə > *pĭl, *süjə > *šĭjä, *sükəśə > *šĭžə, *wittə > *wĭć
 • *i-ə > *ü̆: 15 cases
  *kejə- > *küä-, *künčə > *kü̆č, *kerə > *kü̆r, *kiwə > *kü, *nimə > *lü̆m, *piŋə > *pü, *sülə > *šü̆lə, *śüd₁ə > *šü, *śülkə > *šü̆wəl, *śixələ > *šülə, *tejnəš > *tüəž, *türə > *tü̆rəś, *tüŋə > *tü̆ŋ, *werə > *wü̆r, *wetə > *wü̆t
 • unclear/inapplicable > *ĭ: 4 cases
  *e- > *ĭ-, *kenčV- > *kĭčälä-, *le- > *liä-, *śerV > *sĭr
 • unclear > *ü̆: 6 cases
  *jetV > *jü̆t, *kütV > *kü̆ðäl, *küsV > *kü̆žɣə, *pečkV- > *pü̆čkä-, *süskV- > *šü̆škä-, *sitV- > *šüðəš
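As a rough illustration, the grouping above can be sketched as a small classifier that looks only at the second-syllable vowel of a PU reconstruction. The function names and the "A" / "schwa" / "unclear" labels are my own shorthand, and the sketch assumes the transcription conventions used in the lists (with "V" for an unspecified vowel); it does not check that the first-syllable vowel is one of *i, *ü, *e.

```python
import unicodedata
from typing import Optional

# Vowel symbols as used in the PU reconstructions above (normalized to NFC).
VOWELS = set(unicodedata.normalize("NFC", "aäeioöuüəɨë"))
A_STEM = set(unicodedata.normalize("NFC", "aä"))  # second-syllable *a / *ä


def stem_type(pu_form: str) -> str:
    """Return 'A', 'schwa' or 'unclear' based on the second-syllable vowel."""
    form = unicodedata.normalize("NFC", pu_form)
    # "V" (unspecified vowel) counts as a syllable nucleus of unknown quality.
    nuclei = [ch for ch in form if ch in VOWELS or ch == "V"]
    if len(nuclei) < 2:
        return "unclear"  # monosyllabic roots such as *e-, *le-
    second = nuclei[1]
    if second in A_STEM:
        return "A"
    if second == "ə":
        return "schwa"
    return "unclear"  # e.g. *kenčV-, *jetV


def predicted_reflex(pu_form: str) -> Optional[str]:
    """Predicted Proto-Mari vowel class under the stem-type hypothesis."""
    return {"A": "illabial", "schwa": "labial"}.get(stem_type(pu_form))


for form in ["*elä-", "*wetə", "*kenčV-", "*d₂ümä"]:
    print(form, stem_type(form), predicted_reflex(form))
```

A form like *d₂ümä comes out as an A-stem with a predicted illabial reflex, which is exactly why it is listed among the exceptions.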

The raw accuracy of the maintenance hypothesis (*i > *ĭ, *ü > *ü̆) seems to be 26 cases predicted correctly out of 41 ≈ 63.4% (worse if we also wanted to presume *e > *ĭ). Assuming the typical reflexation to be *i-ä > *ĭ, *i-ə > *ü̆ instead reaches 38 correctly predicted out of 58 ≈ 65.5%. Which is so far only marginally better… But there is room for fine-tuning here as well.
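For anyone wanting to recheck the arithmetic, here is a minimal sketch that recomputes the percentages from the case counts given in the lists. The variable names are mine; the counts for the *e cases (15 and 12) are those of the *e lists above, and counting *e > *ĭ as a hit is how I read "presuming *e > *ĭ".

```python
def accuracy(hits: int, total: int) -> float:
    """Share of correctly predicted cases, as a percentage with one decimal."""
    return round(100 * hits / total, 1)


# Maintenance hypothesis: *i > *ĭ and *ü > *ü̆ count as hits.
maintenance = accuracy(26, 41)

# Same hypothesis extended with *e > *ĭ (15 hits, 12 misses among the *e cases).
maintenance_with_e = accuracy(26 + 15, 41 + 15 + 12)

# Stem-type hypothesis: counts per (stem type, Proto-Mari vowel class).
stem_counts = {
    ("A", "illabial"): 23,
    ("A", "labial"): 9,
    ("schwa", "illabial"): 11,
    ("schwa", "labial"): 15,
}
stem_hits = stem_counts[("A", "illabial")] + stem_counts[("schwa", "labial")]
stem_total = sum(stem_counts.values())
stem_type_accuracy = accuracy(stem_hits, stem_total)

print(maintenance, maintenance_with_e, stem_type_accuracy)
```

The unclear/inapplicable stems are left out of the stem-type total, as in the text.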

Some of the apparent exceptions in verb roots can be readily interpreted as indicating a shift of stem type in pre-Mari. *ĭšte- ‘to do’, *kĭške- ‘to throw’ and *kĭćke- ‘to harness’ show 2nd syllable *e, which normally corresponds to PU *A-stem verbs; thus I would reconstruct pre-Mari *ist-ä-, *kiśk-ä- and *kitk-ä-. Here *-ä- is probably some kind of transitivizing suffix, well known in Mari (the classic example being /koða-/ ‘to stay’ : /koð-e-/ ‘to leave’) and likely dating to earlier times already (reconstructible in a small number of PU doublets such as *künčə ‘nail’ ~ *künč-ä- ‘to plough/dig’; *ipsə ‘smell’ ~ *ips-ä- ‘to smell’). We could also take the final *-e of *ĭške ‘self’, rather rare in nominals, as grounds to reconstruct pre-Mari *(j)iś-kä.

Similarly, *pü̆čkä- ‘to cut’ and *šü̆škä- ‘to cram’ show 2nd syllable *ä, which normally corresponds to PU *ə-stems; I would therefore reconstruct pre-Mari *pičkə-, *siskə-. The former thus turns out to be better comparable with Mordvinic *pečkə- ‘to cut’ than with Samic *peackē- ‘to cut (off)’ (< *pečk-ä-), and the latter with Samic *sëskë- ‘to rub against’ than with Fi. sysä-, Es. süska- ‘to push into’.

(This on the other hand creates new problems for *kĭčälä- ‘to search’, *liä- ‘to be’, *ńĭktä- ‘to pluck’, which now start pointing to earlier *ə-stems…)

I would also take *kü̆žɣə ‘thick’ as pointing to earlier *kizəgV < *küsəkV (akin to Proto-Samic *kësëkV > Northern Sami gassat etc.), rather than the bare root *küsä that most sources report. Perhaps even *kĭškə ‘snake’ should be taken as pointing to PU *küjəwä (> Erzya /kijov/, Hung. kígyó, Smy. *kiwä) > pre-Mari *kiwä(-skV) rather than the bare root *küjə (> PF *küü, Udm. /kɨj/ [3]).

Nominal derivation phenomena could lie behind some of the other exceptions as well, though due to the non-maintenance of the PU stem vowel contrasts in Mari nominals, this will have to be more speculative. For example, Finnic *kidek ‘snowflake’ has a number of parallel derivatives etc. in the descendant languages, and the original root may well have been *kičä rather than *kičə. It would also be possible to assume PU *kičäk, and date the development *-Ak > *-Ek (as seen in cases such as Fi. jauha- ‘to grind’ ~ jauhe ‘powder’; jättä- ‘to leave behind’ ~ jäte ‘trash’) as inner-Finnic.

Consonant environment conditioning does not need to be ruled out entirely either. E.g. *šü ‘neck’ could be taken back to pre-Mari *siw(ä), and *šĭjä ‘year ring’ to pre-Mari *sijə, with the natural developments *iw > *ü̆ and *ij > *ĭ bleeding the usual stem type conditioning. (This provides also another possible line of explanation for ‘snake’.) The latter rule could be even generalized slightly to also capture *wĭć ‘5’.
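The rule ordering suggested here can be sketched as follows: the two consonant-environment rules are checked first and, where they apply, pre-empt ("bleed") the stem-type default. This is only an illustration under my own simplifying assumptions: it works on pre-Mari forms written with the merged vowel *i, treats *iw and *ij as plain substrings, and uses hypothetical function names.

```python
import unicodedata
from typing import Optional

A_VOWELS = set(unicodedata.normalize("NFC", "aä"))
ALL_VOWELS = A_VOWELS | set(unicodedata.normalize("NFC", "eioöuüəɨ"))


def stem_type_default(form: str) -> Optional[str]:
    """Default rule: *i-ä > illabial *ĭ, *i-ə > labial *ü̆."""
    nuclei = [ch for ch in form if ch in ALL_VOWELS]
    if len(nuclei) < 2:
        return None
    if nuclei[1] in A_VOWELS:
        return "illabial"
    if nuclei[1] == "ə":
        return "labial"
    return None


def predict(pre_mari: str) -> Optional[str]:
    """Consonant-environment rules first, stem-type conditioning second."""
    form = unicodedata.normalize("NFC", pre_mari)
    # Drop morpheme hyphens and the brackets around optional segments.
    for ch in "-()":
        form = form.replace(ch, "")
    if "iw" in form:
        return "labial"    # *iw > *ü̆ regardless of stem type
    if "ij" in form:
        return "illabial"  # *ij > *ĭ regardless of stem type
    return stem_type_default(form)


for form in ["*siw(ä)", "*sijə", "*kitk-ä-", "*pičkə-"]:
    print(form, predict(form))
```

With the two environment rules removed, *siw(ä) and *sijə would receive the opposite predictions, which is the sense in which the rules bleed the stem-type conditioning.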


The phonetics of this hypothesis do not have to be left arbitrary either: a kind of palatal umlaut mechanism seems to work. The root structure *i-ä > *ĭ(-e) remains consistently front-vocalic and illabial; while the root structure *i-ə would probably have been first retracted to something like *[ɨ]-[ə]. After this, I would suppose central *ɨ was labialized to [ʉ], and then re-fronted > [y] > [ʏ]. This development appears internally unmotivated (it could possibly be attributed to areal influence from Turkic) — but it has a good precedent in the fact that Mari is the only Uralic language with a front rounded reflex of PU *ë, for which we must then reconstruct the exactly parallel development [ɤ~ɜ] > [ɵ] > [ø] > [y].
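The proposed chain for first-syllable *i in *ə-stems can be written out as an ordered list of stages. This is a schematic sketch only: the stage labels are my own (in particular "laxing" for the last step), and the stem-type condition is reduced to a single boolean.

```python
from typing import List

# Ordered stages of the proposed development, each a one-symbol rewrite.
STAGES = [
    ("retraction before a schwa-stem", {"i": "ɨ"}),
    ("labialization", {"ɨ": "ʉ"}),
    ("re-fronting", {"ʉ": "y"}),
    ("laxing", {"y": "ʏ"}),
]


def derive(vowel: str, schwa_stem: bool) -> List[str]:
    """Return the sequence of intermediate vowel qualities."""
    history = [vowel]
    if not schwa_stem:
        return history  # *i-ä stays front and illabial throughout
    for _label, rewrite in STAGES:
        vowel = rewrite.get(vowel, vowel)
        history.append(vowel)
    return history


print(" > ".join(derive("i", schwa_stem=True)))
print(" > ".join(derive("i", schwa_stem=False)))
```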

Later vowel harmony between /a ~ ä/, as attested in Hill Mari (but not Meadow Mari) was likely not yet in effect by this stage. This appears to be shown by the straggling cases of Proto-Mari *ĭ-ä: where *ĭ is further reduced and retracted to /ə/ in Hill Mari, the stem vowel surfaces as /a/, not as /ä/. Cf. e.g. /kəčala-/ ‘to search’, /ńəkta-/ ‘to skin’, /šəja/ ‘year ring’.

[1] This selection has been datamined from both older and newer literature. Individual referencing would go beyond the purposes of this blog post. Various dubious or difficult-to-reconstruct comparisons have been omitted, including e.g. most cases where some or most other reflexes point to original *ä rather than *e.
[2] To my knowledge, this comparison has not been previously presented, though it seems self-evident. The identity of the “suffix” is unclear to me however.
[3] Even this might derive from the longer form *küjəwä: contrast *süjə > /si/ ‘year ring’. Perhaps thus: *süjə > *süj > *si, but *küjəwä > *küjə > *kɨj?

Posted in Commentary, Reconstruction
One comment on “On *ü in Mari vs. Proto-Uralic”
 1. Ante Aikio says:

  This problem is a tough nut to crack. When analyzing problems like this it is important that all the etymologies in the material are as certain as possible. So I’ll post a couple of etymological comments here; I think a couple of cases should be removed from the list, and in a couple of others the reconstruction should be rethought.

  PMari *šü̆ðəkš ‘hoop’ (not *šüðəš ‘bind’!) can hardly be related to Fi sito-, Md. sodo- because also the Malmyzh and Bolshoj Kil’mez dialects have forms with š- (from PU *ś-), not ś- (from PU *s-).

  PMari *šü̆škä- ‘to cram’ also has š- in Malmyzh and Bolshoj Kil’mez, so it does not match Saami (North Saami saskat). The comparison to Finnic *süskä- would work, but then the Saami verb would have to be of another origin. Moreover, the verb has an initial affricate in Võro (tsüskä- ~ tsuska-), which would have to be regarded as a secondary irregular development.

  PMari *śĭr ‘character, nature’ looks like a loanword due to its initial palatalized sibilant.

  Note the PMari sibilant system and its reflexes:

  PMari *s (from PU *s)
  – Malmyzh /ś/ (in front vocalic words), /s/ (in back vocalic words)
  – Bolshoj Kil’mez /ś/ (in front vocalic words), /š/ (in back vocalic words)
  – in other dialects /š/

  PMari *š (from PU *ś and *š)
  – in all dialects /š/

  PMari *ś (only in loanwords, as a reflex of Chuvash ś):
  – in most dialects /s/, in some eastern dialects /ś/
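The sibilant correspondences listed above can be sketched as a small lookup, together with the reverse inference used in the etymological remarks (which Proto-Mari sibilants are compatible with an attested dialect form). All names here are hypothetical helpers; the loanword-only *ś is left out because the list does not say which eastern dialects keep /ś/.

```python
from typing import List

MALMYZH = "Malmyzh"
BOLSHOJ_KILMEZ = "Bolshoj Kilmez"
OTHER = "other"


def reflex(pmari: str, dialect: str, front_vocalic: bool) -> str:
    """Dialect reflex of native Proto-Mari *s or *š."""
    if pmari == "š":
        return "š"  # *š (from PU *ś and *š) is /š/ in all dialects
    if pmari == "s":
        if dialect == MALMYZH:
            return "ś" if front_vocalic else "s"
        if dialect == BOLSHOJ_KILMEZ:
            return "ś" if front_vocalic else "š"
        return "š"  # *s and *š have merged in the other dialects
    raise ValueError(f"unsupported Proto-Mari sibilant: {pmari!r}")


def possible_sources(dialect: str, observed: str, front_vocalic: bool) -> List[str]:
    """Native Proto-Mari sibilants compatible with an attested reflex."""
    return [p for p in ("s", "š") if reflex(p, dialect, front_vocalic) == observed]


# A front-vocalic word with š- in Malmyzh can only continue PMari *š.
print(possible_sources(MALMYZH, "š", front_vocalic=True))
# In the merged dialects the same reflex is ambiguous.
print(possible_sources(OTHER, "š", front_vocalic=True))
```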

  The word for ‘night’ seems to go back to a back-vocalic form *jŭt, cf. Upsha jŭt, Northwest jǝ̑t, West jǝ̑t (~ jǝt). So probably the form *jü̆t in other dialects is a result of irregular palatalization caused by *j-. Consequently, the Uralic etymology suggested for this word is probably false.

  The PU word for ‘year ring’ cannot be reconstructed with word-internal *-j- (*süjə) because it has *-w- in Mansi (cf. Pelymka, Vagilsk tǟw, Upper Lozva (jīw-)taw, Sosva (jīw-ńaɣ-)taw ‘year ring’ from PMs *täwǝ) and possibly also in Saami (South Saami sïeve ‘stripe’, even though the vowel -ïe- is anomalous). Moreover, the Permic forms (Komi and Udm si) do not point to *ü. So I’d rather reconstruct *siwi or something like that. (The Saami word presupposes *säwi – cf. Finnish sävy and sävel? Maybe it is an etymologically distinct word.) In any case, the West Mari form šǝ̑j, šǝ̑ja with a back vowel (*ŭ) is completely obscure. The ending -ja must be a suffix. In the shorter form šǝ̑j, the -j could have emerged as a hiatus filler before suffixes beginning with a vowel, and then been analogically generalized as a part of the stem.

  PMari *kü̆dǝl- does not mean ‘middle’ but ‘near’ instead. Isn’t it semantically more plausible to consider PMari *kĭdäl ‘waist; middle (of an oblong object)’ the reflex of PU *kütV ‘middle’, or are both words ultimately formed from the same PU root?
