Proto-Uralic *ë in Mari

Mari is one of the key languages for the reconstruction of Proto-Uralic *ë, in having a mostly unique reflex *ü > Hill Mari /ü/ ~ Meadow Mari /ü/. The only other known regular source of this vowel correspondence is would-be *ü̆ (from earlier *ü, *i, *e) in roots of the shape *CV, such as Hill Mari /šü/ ~ Meadow Mari /šüj/ ‘neck’, from PU *śepä.

The development *ë > *ü was first explicitly proposed by Wolfgang Steinitz in his Geschichte des finnisch-ugrischen Vokalismus (1944) (in his notation: *i̮ > *ü). This proposal was later essentially forgotten, though. For example, fifty years later (1994), Gábor Bereczki in Grundzüge der tscheremissischen Sprachgeschichte recognizes only two examples that would theoretically fall under it: /šüm/ ‘bark, crust’ (< *śëmə ‘scales’) and /nölə-pikš/ ‘blunt-tipped arrow’ (< *ńëlə ‘arrow’), which he furthermore explains, following Erkki Itkonen’s views from 1954, instead as “sporadic” fronting from *u and *o. [1]

The grounds would have been ripe for a reassessment of the historical vocalism of Mari already since the rehabilitation of *ë by Janhunen and Sammallahti in the 80s. It has taken a bit longer, though. The next source after Steinitz that is on board with his theory seems to be a footnote by Ante Aikio in his 2006 article “Etymological nativization of loanwords”, [2] hence adding up to a blackout period of more than 60 years. I also believe this was an independent rediscovery rather than a revival. Aikio notes also that the conditions for the change are unclear, and it is indeed the case that PU *ë as reconstructible per the evidence of the other languages is often enough reflected also as Mari *å (Hill /a/ ~ Meadow /o/) or *o (Hi. Me. both /o/). So how should we deal with these cases? [3] By now we have at least one initial suggestion, by Mikhail Zhivlov from 2014, that *ü would be the default reflex, *å ~ *o the reflex before the velars *k and *ŋ (but not *x).

The split *å ~ *o remains unclear for now as well, but this is also the typical development of *a, hence we seem to be dealing with an early lowering *ë > *a, as also in many other Uralic branches. This is general in West Uralic; in Permic it seems like the most common development (followed then by *a > *o > *u), versus retention *ë > *ë mostly before sonorants; and Hungarian has “a-umlaut”: *ë-a > *a. I suspect on the other hand that *ë-ə > *aa in Khanty is the result of relatively late lowering from earlier *ëë, which could be connected to the same change in Northern and partly Eastern Mansi. (The development *a > *aa is attested too, but rare, and more common developments like *a > *a, *a > *oo, *a > *uu seem to require room for maneuvering in pre-Khanty.)

After having looked over the data [4] once more though, I have settled on a different view: the primary conditioning seems to be instead syllable closure. (This is one of what I think of as the “stock” conditioning features for divergent vowel developments, along with metaphony and labial/palatal coloring due to neighboring consonants. [5])

1. *ë > *ü in open syllables (before a single consonant):

  • *ëla > *üla > *ül ‘under’
  • *ëŋas(V) > *üŋəSə ‘rested’ [6]
  • *jëxə- > *jüä- ‘to drink’
  • *lëCə-ta- > *lüðä- ‘to fear’ [7]
  • *lëčə- > *lüčä- ‘to get wet’
  • *mëxə > *mü-ländə ‘land’, *mü-ðe- ‘to bury’
  • *ńëlə > *nülə ‘arrow’
  • *ńërə > *nürə ‘flexible’
  • *sënə > *sünə > *Sün ‘sinew’
  • *sëtə > *Süðər ‘spindle’
  • *śëmə > *śümə > *Süm ‘bark, crust’
  • *śëta > *Süðə ‘100’
  • *wajə ~ *wëjə > *ü(j) ‘butter’

2. *ë > *a > *å ~ *o in closed syllables:

  • *ëkta- > *akta- > *opte- ‘to put, place’
  • *lëkśə- > *lakśə- > *lokSə-ńća- ‘to chop’
  • *lëntV > *lantV > *lånda-ka ‘valley’
  • *mëksa > *maksa > *mokS ‘liver’
  • *ńëčkə > *načkə > *nočkə ‘wet’
  • *ńëkćəmə > *nakćəmə > *nåSmə ‘palate’
  • *pëŋka > *paŋka > *poŋgə ‘mushroom’
  • *tëktə > *taktə > *toktə ‘loon’
  • *wëlkətə > *walkətə > *wålɣəðə ‘light’

Set #2 certainly shows a lot of velars following either immediately (*kt *kś *ks *kć *ŋk) or as the second member of the cluster (*čk *lk), but this probably doesn’t need any explanation other than the general abundance of *k in the PU consonant cluster inventory. There is also one case with *nt. Set #1 meanwhile has only one example with *-ŋ-, but this is likewise unsurprising, since in CVCV roots *-k- is by contrast rather rare.
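As a rough illustration, the closure condition behind sets #1 and #2 can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the function names, the one-character-per-segment treatment of the reconstructions, and the vowel inventory (with V as a cover symbol for an unknown vowel) are hypothetical scaffolding for this example, and the output is the intermediate pre-Mari vowel (*ü or *a), not the final Mari form.

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so that each segment is a single code point."""
    return unicodedata.normalize("NFC", text)


# Assumed vowel inventory for the reconstructions; "V" = unknown vowel.
VOWELS = set(nfc("aäeëiouüəV"))
TARGET = nfc("ë")
FRONT = nfc("ü")   # proposed reflex in open syllables
LOW = "a"          # proposed reflex in closed syllables (later *å ~ *o)


def first_syllable_closed(pu_form):
    """True if *ë is followed by a consonant cluster or a word-final consonant."""
    # Assumption: one character per segment; *, hyphens and brackets are ignored.
    segs = [ch for ch in nfc(pu_form) if ch not in "*-()"]
    i = segs.index(TARGET)
    cluster = []
    for ch in segs[i + 1:]:
        if ch in VOWELS:
            break
        cluster.append(ch)
    word_final = i + 1 + len(cluster) == len(segs)
    return len(cluster) >= 2 or (len(cluster) == 1 and word_final)


def predicted_reflex(pu_form):
    """Intermediate pre-Mari vowel predicted by the syllable-closure hypothesis."""
    return LOW if first_syllable_closed(pu_form) else FRONT


# PU forms of sets #1 and #2 as listed above (only the *ë variant of 'butter').
SET_1 = ["ëla", "ëŋas(V)", "jëxə-", "lëCə-ta-", "lëčə-", "mëxə", "ńëlə",
         "ńërə", "sënə", "sëtə", "śëmə", "śëta", "wëjə"]
SET_2 = ["ëkta-", "lëkśə-", "lëntV", "mëksa", "ńëčkə", "ńëkćəmə",
         "pëŋka", "tëktə", "wëlkətə"]

if __name__ == "__main__":
    for form in SET_1 + SET_2:
        print("*" + form, "->", "*" + predicted_reflex(form))
```

The check looks only at what follows *ë within the root as reconstructed, so it says nothing about the later cluster simplifications and suffixations that may have changed the syllable structure.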

This pattern is complicated though by suffixation and consonant cluster simplification processes in Mari. In these cases we find both *üC and *aC, and I would hypothesize that this means that the split of *ë dates in-between some of these.

3. *ë > *ü also in secondarily open syllables:

  • *ëptə > ? *ëpə > *üp ‘hair’
  • *mërja > ? *mëra > *mür ‘strawberry’
  • *sëntə- > ? *sëtə- > *Süðä- ‘to clear woodland’

4. *ë > *a also in secondarily closed syllables:

  • *ďëmə → *ďëmə-pawə > ? *ďëmpa(w) > *lampa > *lombə ‘bird cherry’ (a compound with the word for ‘tree’ as the second member)

5. *ë > *a in “tertiarily open” syllables?:

  • *ëppə > ? *appə > *owə ‘father-in-law’
  • *këččə > ? *kaččə > *kåčə ‘bitter’
  • *lëmpə > ? *lampə > *lop ‘depression in ground’
  • *wëlka- > ? *walka- > *wåle- ‘to go down, descend’

6. *ë > *ü in “tertiarily closed” syllables?:

  • *ńërə(-ka) > ? *nürə-kA > *nürɣə ‘cartilage’

But it’s also possible that there are a few other, smaller conditioning factors here as well. It seems somewhat dubious to me in particular to end up dating *mp > *p (in *lop) as younger than *nt > *t (in *Süðä-). In principle most cases here could also be further confirmed or falsified by other results on the chronology of consonant cluster simplification in Mari.

This hypothesis also points towards a different line of explanation for some other instances of Mari *ü. There is at least one case where we find *ü in a closed syllable clearly retained from PU, in what seems like an original back-vocalic environment. This is *jükSə ‘swan’, for which I have earlier sided with reconstructing *ë … but since none of the other languages show evidence particularly in favor of *ë, maybe a development *o > *u > *ü or the like will be a better explanation for this one case after all.

[1] I don’t think I can dismiss strongly enough, in polite company at least, the notion of reconstructing “sporadic” sound changes. As some readers know, my (hopefully soon-to-be-wrapped-up) Master’s thesis treats the research history and reconstruction of the Proto-Finno-Ugric long vowels. One meta-result of this work has been that, by now, I see Itkonen’s insistence on sporadic sound changes as having prevented substantial progress in the reconstruction of comparative Uralic vocalism for just about half of the entire 20th century (to some extent even up to today). This device is not much more than a license to stop thinking: to avoid placing a given language group’s phonological structure in a general comparative context, and therefore, to be unable to discover more parsimonious explanations such as properly conditional splits. Closer to the topic though: I cannot blame Bereczki very much for not seeing /ö/ and /ü/ as etymologically equivalent, since the lowering of *ü to /ö/ (perhaps better: retention?), as later unraveled in detail by Aikio, has at least somewhat complex conditioning.
[2] In Diane Nelson & Ida Toivonen (eds.): Saami Linguistics, pp. 17–52.
[3] Steinitz did not have any trouble with these exceptions, since he postulated extensive original “ablaut” variation such as *a ~ *i̮ as a data-cleaning deus ex machina of his own.
[4] Three of the cases in section 1 are absent from all three recent overviews of the development of *ë either in general or in Mari in particular, i.e. Aikio 2014, 2016 and Zhivlov 2014 (see Bibliography). (1) The reconstruction *ëŋas(V) ‘rested’, reflected also in Samic *vōŋēs, is from Aikio’s PhD thesis (2009: 289). (2) *sëtə (> Mo Ma P) can be found already in UEW (as *setɜ, and with Erzya /sad/ ‘stem, trunk’ rejected, though it fits perfectly under *ë; this may be a better etymology for the Mari word than the comparison with Finnic *kecrä, Mordvinic #kšťəŕə ‘spindle’ ← pre-II *ketstra-). (3) The ‘butter’ word has been consistently reconstructed only as *wajə (*waje, *wōje etc.) so far. Aikio 2016 notes that Samic and Mordvinic point to *ë, but so do also Mari as well as Udmurt /vɤj/. Finnic *voi (regular from both) and Komi /vɨj/ (irregular from both, though possibly less so from *ë) don’t allow disambiguating; therefore it is only the Ugric reflexes that point to *wajə, and perhaps it is they that have innovated here, not the western languages. An additional similar case is (4) *lëčə-, appearing already in UEW as *lače-, and covered by Zhivlov, but not Aikio.
[5] I do not rule out other consonant-environment-related changes, of course. For just one of my favorite examples of something less obvious, there is how the labialization of earlier /wa/ to /wɔ/ in Early Modern English (later > /wɒ/, /wɑ/, /wɔːɹ/ etc.) (dwarf, quarter, swan, swap, walk, war, was, what, etc.) is blocked before velars (quack, twang, wag, wank, wax, whack etc. instead have the usual development /a/ > /æ/). But I would be hesitant to apply this type of explanation too liberally. At its worst this can turn into over-fitted sound laws where each specific environment applies to no more than one or two words.
[6] I’m leaving aside here the only marginally dialectally retained contrast between PU *s, *ś and *š, which is irrelevant for the present issue.
[7] A trisyllabic reconstruction with a lost middle syllable (all of *lëjə-, *lëwə-, *lëxə- and even *lëkə- would work) seems to be required to account for the correspondence between Mari /-ð-/ (normally < *-t-, *-tt-, *-d₂-) and Samoyedic *-r- (normally < *-r-, *-d₁-). The lenition of *-t- to *-ð- > *-r- in the latter, regular after noninitial syllables, seems to have taken place also in “contracted” roots of this type. Compare *jëxə- → *jëxə-ta- > *ë-r- ‘to drink’.

Posted in Reconstruction
18 comments on “Proto-Uralic *ë in Mari”
  1. David Marjanović says:

    I was wondering how *ë > *ü was supposed to be possible, but during footnote 1 it struck me that ë > ö should be trivial. German transcriptions of Chinese /ɤ/ routinely used ö in the early 20th century, and we still talk about Pjöngjang.

    At the next opportunity I’ll need to listen to Despacito in Udmurt again to see if I can hear whether /ɤ/ is [ɤ]…! It’s spelled ö, right?

    • j. says:

      Compare also õ /ɤ/ > */ʌ/ > /œ/ in Insular Estonian (contrasts with ö /ø/); *ë > ü /y/ in Salaca and Eastern Western Courland Livonian; and the spelling of /ɤ/ as ö in pre-modern Estonian and most early Votic records.

      It’s spelled ö, right?

      Yes. I tend to use the symbol /ɤ/ more to emphasize that this is a non-reduced, stressable vowel than to show that it’s actually a back vowel. The usual UPA transcription is simply , which would correspond to any of IPA /ɘ ~ ə ~ ɜ/ (but is contrasted with reduced ə̑). Maybe I should cram ‹ɘ› somewhere in my keyboard layout already, one of these days.

      Both Komi and Udmurt are reported to have also dialects where ы and ӧ are rounded [ʉ], [ɵ] (not that I can claim to verify this though).

  2. David Marjanović says:

    des tscheremissisches

    Oh, amid all this fascination with [ɤ] I forgot: der tscheremissischen, “weak” feminine.

  3. CRCulver says:

    “The reconstruction *ëŋas(V) ‘rested’, reflected also in Samic *vōŋēs, is from Aikio’s PhD thesis (2009: 289).”

    Aikio sounds more tentative about this equation than you suggest. But my question would be, if the Mari is from that posited PU form, why doesn’t the *-s- voice intervocalically?

    “sëtə (> Mo Ma P) can be found in already in UEW (as *setɜ, and with Erzya /sad/ ‘stem, trunk’ rejected, though it fits perfectly under *ë; this may be a better etymology for the Mari word than the comparison with Finnic *kecrä, Mordvinic #kšťəŕə ‘spindle’ ← pre-II *ketstra-).”

    All attested forms of Mari ‘spindle’ have initial *š- (note the Malmyzh form šüẟü̆r in TschWb and Beke, not *śüẟü̆r). That makes Bereczki’s comparison with Permian items in his etymological dictionary problematic, and it also makes it difficult to derive the word from a root *sëtə. Perhaps Asko Parpola is right in the footnote in his recent JFSOu paper where he wonders if the Mari was borrowed from Mordvin.

    • j. says:

      if the Mari is from that posited PU form, why doesn’t the *-s- voice intervocalically?

      Good question. Aikio after all reconstructs only *ëŋa- as the common base. The *s could be still a part of the proto-form if this were originally word-final and the final vowel in Mari a later suffix. Alternately they could be two parallel suffixes that only coincidentally both have a sibilant. You can probably tell if either of these makes more morphological sense?

      All attested forms of Mari ‘spindle’ have initial *š- (…). That makes Bereczki’s comparison with Permian items in his etymological dictionary problematic

      Ah, sounds like I was a bit hasty in claiming that the sibilant details would not matter this time around.

      Do you happen to know by the way if there are reasonably comprehensive references on this data that would be newer than Wichmann and Beke’s original articles in FUF 6 and 22? Most later sources appear to be happy enough to mention a few examples and then call it a day.

      • CRCulver says:

        “Do you happen to know by the way if there are reasonably comprehensive references on this data that would be newer than Wichmann and Beke’s original articles in FUF 6 and 22?”

        Gruzov’s 1965 Фонетика диалектов марийского языка в историческом освещении treats the issue. (I have the Kaisatalo copy borrowed until the autumn, but I intend on scanning the whole thing this summer and I’ll send you a PDF if you remind me.) Učaev also has some things to say in his paper “Согласные s, ś и š в малмыжском диалекте марийского языка”.

        Mostly, though, my own familiarity with this phenomenon is just picked up from dictionaries (besides TschWb and Beke, also Veršinin’s dictionaries have forms from dialects that somehow preserve original *s-) and 18th-century manuscripts that sometimes reflect the Malmyzh dialect. The manuscripts are especially important in this regard as there are instances where they preserve a Malmyzh form that either disappeared by the time of later documentation, or Malmyzh speakers began using a form with š- borrowed from another dialect.

        • j. says:

          Thank you! I may get in touch with you about Gruzov, yes. It also sounds like a small summary paper by someone (hmm…) on updating the evidence base would be about due, especially if some of this manuscript evidence has not been included in the discussion before.

          • CRCulver says:

            Unfortunately, I think that depends entirely on our colleagues in Russia. There is a real need to simply digitize the entire collection of manuscripts cited by Sergeev 2002. I think it would lead to huge breakthroughs in the history of the Mari lexicon and even offer a few good insights for Uralic etymology more generally. But I have heard enough recently about harassment of Western researchers working with archival materials in Moscow that I wouldn’t want to attempt any such work myself. (In fact, I think I am done with traveling to Russia in general.) In the meantime, as long as a few complete manuscripts aren’t yet scanned, I feel it would be premature to write such a paper as you suggest.

            • David Marjanović says:

              In fact, I think I am done with traveling to Russia in general.

              …OK, that’s scary.

  4. David Marjanović says:

    *y > [ɨ] in Welsh briefly mentioned here, followed by *y > [ɯ] in Quanzhou Chinese. Also, /ɪ/ and /ʏ/ merge as [ʉ] before sonorants in much of northern German.

  5. Ante Aikio says:

    The data actually seems to suggest that BOTH syllable structure and the following consonant play a role in the split. To me it seems that the simplest solution is to assume that PMari *å / *o appears as a reflex of PU *ë under two conditions:
    1) before a velar or *č in the syllable coda (but not in the onset of the second syllable!)
    2) after *w-

    This would also make sense phonetically because velar (*k *ŋ) and retroflex consonants (*č) would plausibly have a backing effect on the preceding vowel, and could thus have prevented the fronting of *ë to *ü. The logic behind rule 2) may be different: here there could have been a shift of *wë- to *wo- (and further to PMari *wå-).

    This formulation explains the following words as fully regular: *kåčə ‘bitter’, *nürgə ‘cartilage’, *wåle- ‘to go down, descend’, *mür ‘strawberry’, *südä- ‘to clear woodland’, *üp ‘hair’. I.e., there is no need to invoke any special rule to account for any of these words.

    This leaves only *låndaka ‘valley’, *owə ‘father-in-law’ and *lombə ‘bird cherry’ as irregular. But the first two words also show irregular consonant correspondences (*nd instead of *d, *w instead of *p).
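For readers who want to check the two-condition formulation above against the forms it is said to cover, here is a minimal Python sketch. It is an illustration only, under stated assumptions: the function name, the one-character-per-segment treatment, the reading of "coda" as "first member of a medial cluster", and the use of the post's own PU notation (with *ə rather than *i in the second syllable) are all hypothetical choices made for this example.

```python
import unicodedata


def nfc(text):
    """Normalize to NFC so that each segment is a single code point."""
    return unicodedata.normalize("NFC", text)


# Assumed vowel inventory; "V" is a cover symbol for an unknown vowel.
VOWELS = set(nfc("aäeëiouüəV"))
TARGET = nfc("ë")
# Coda consonants assumed to block the fronting: velars and *č.
BACKING_CODA = set(nfc("kŋč"))


def predicts_back_reflex(pu_form):
    """True if the two conditions predict Mari *å / *o instead of *ü."""
    segs = [ch for ch in nfc(pu_form) if ch not in "*-()"]
    i = segs.index(TARGET)
    cluster = []
    for ch in segs[i + 1:]:
        if ch in VOWELS:
            break
        cluster.append(ch)
    # Condition 1: velar or *č as the first member of a cluster, i.e. in the
    # coda; a single intervocalic consonant counts as an onset instead.
    coda_backing = len(cluster) >= 2 and cluster[0] in BACKING_CODA
    # Condition 2: *ë immediately preceded by *w.
    after_w = i > 0 and segs[i - 1] == "w"
    return coda_backing or after_w


# PU form -> whether Mari actually shows the back reflex (forms from the post).
REGULAR = {
    "këččə": True,      # *kåčə 'bitter'
    "ńërə(-ka)": False, # *nürɣə 'cartilage'
    "wëlka-": True,     # *wåle- 'to go down'
    "mërja": False,     # *mür 'strawberry'
    "sëntə-": False,    # *Süðä- 'to clear woodland'
    "ëptə": False,      # *üp 'hair'
}
# Forms with an attested back reflex that the two conditions do not predict.
IRREGULAR = ["lëntV", "ëppə", "ďëmə-pawə"]

if __name__ == "__main__":
    for form, attested_back in REGULAR.items():
        print("*" + form, predicts_back_reflex(form) == attested_back)
    for form in IRREGULAR:
        print("*" + form, "predicted back:", predicts_back_reflex(form))
```

The sketch does not model word-final codas or the later Mari developments; it only tests the environment of *ë in the reconstructed root.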

    The original geminate in *këčči- ‘bitter’ explains why its vowel development was different from that in PMari *lüčä- ‘soak’ (from *lëči-).

    The word *üp ‘hair’ can hardly go back to an intermediate form *ëpǝ. Instead, the unique change *pt > *p probably resulted from cluster simplification in word-final position and/or consonant stems, after syncope of the final vowel (and was then analogically extended to the entire paradigm of the noun). Other similar cases, though with a different cluster, are also found in Mari, e.g. *küč ‘nail’ from earlier *künč (PU *künči).

    The word *üj ‘butter’ can hardly provide any evidence of the conditions of the shift, because no other branch suggests the reconstruction of *ë for this word in the first place; the Ugric forms clearly point to *waji. I don’t see why Udmurt /vïj/ would suggest PU *ë instead of *a here, because the vowel is the same as in Udmurt /vïr/ ‘hill, highlands’ (from PU *wari; cf. Khanty *war ‘ridge’, Mansi *waar ‘forest’) and /vïj-/ ‘sink’ (from PU *wajV-; cf. Mansi *uj- < *waja/o-).

    It is also possible to adduce some more examples, at least the following:

    PMari *påčkǝńć- ‘twist (thread)’ from PU *pëčka- (cf. Komi pučkï- ‘twist (thread), spin (yarn)’, Khanty *pïïč ‘thread, string’, *pïïč-tǝ- ‘twine, braid, weave’)

    PMari *pokte- ‘pursue, chase’ from PU *pëk-ta- (cf. Saa *puoktē- ‘bring’); derived from *pëki- ‘flee’ (Finnish pak-o), cf. also *pëki-ni- (Finnish pakene- ‘flee’, Khanty *paakǝṇ- / *pååkǝṇ- ‘get frightened’)

    (Note that both of the previous words must be reconstructed with *pë-, because PU *pa-(a-) is regularly reflected as Khanty *puu-.)

    MariE šüja ‘becomes firmer, becomes dry (e.g., of porridge); absorbs water, bloats up’ from PU *sëki- ‘thicken (of liquids)’ (cf. SaaN suohkat, Finn sakea ‘thick, dense’, Nenets te- ‘стечь, натечь, набежать (о жидкости); осесть на дно’; a new etymology I have not published yet)

    Note also Upsha Mari /lüpše/ ‘cradle’ from PU *lëpći; in this case there is a closed syllable, but it is problematic that other Mari dialects show an irregular form /lepš/.

    • j. says:

      Thanks for the comments! A few notes for now:

      1) the Permic words for ‘butter’ show the irregular correspondence Udm. вӧй ~ Komi вый, not the opposite (вый ~ вӧй), even though it could be perhaps expected as more regular. As I would hope is fairly generally agreed by now, ӧ ~ ӧ and ы ~ ӧ also cannot go back to a single Proto-Permic vowel (thus Harms, Sammallahti, Zhivlov; contra Itkonen, Lytkin, Csúcs). The former correspondence appears as a reflex of PU *o or *ë (maybe even *ä, in e.g. йӧл ‘milk’ ~ Fi. jälsi? which also shows /ɵ/ and not /ʉ/ in Jazva); the latter as a reflex of PU *o or *(w)a. Udmurt ӧ from *o is also very rare, seemingly only before *kC (*oksə- ‘to vomit’, *koktə ‘abdominal cavity’, *kokšə ‘dry’), so we can hardly be dealing with the same here. So IMO вӧй indeed points unambiguously to *ë. — So do the Samic and Mordvinic reflexes of course, for which you’ve supposed *aj > *ëj yourself.

      2) there does not seem to be anything irregular about *-pp- > /-w-/ in ‘father-in-law’. Both primary inherited *-p- and secondary *-p- from *-pp- *-mp- are lenited in Mari to /-w-/ when intervocalic. Cases of still surfacing /-p-/ seem to be always newer, maybe e.g. due to later suffixation. Other positive examples of *-pp- > /-w-/ are not that common either, but e.g. *kŭwəlćə ‘capercaillie hen’ and *leweðä- ‘to cover’ seem reliable. This also fits into the general pattern of medial lenition in Mari: /-ð-/ can continue any of *-t- *-tt- *-nt- *-d₂-, and /-ɣ-/ consistently only continues *-kk- (while *-k- > ∅).

      3) phonetical interpretation of this change as either coloring by adjacent consonants, or as a kind of shortening in open syllables (comparable to e.g. /ʌ ʌɹ/ > /ʌ ʌː/ > /a ɜː/ in Australian and some British English, and also recalling Steinitz and Katz’ reconstruction of allophonic vowel length already in PU) could probably make an argument in favor of either approach … if we had other evidence for these conditioning factors elsewhere in the development of the Mari vowel system. As it stands, I’m not sure if either has been established anywhere else.

      Your given Khanty cognates for *pëčka- or *pëkə- are new to me. You do not mention them as unpublished, so is there some recent or upcoming literature they appear in already? Your recent SUSA article still has *pakta- and not *pëk-ta-, and no sign of the former.

      • Ante Aikio says:

        My mistake – ‘butter’ in Udmurt is of course /vëj/, not /vïj/. Still, I don’t see why this would point to PU *ë either. The same vowel occurs in /vëlï-/ ‘carve (with a knife), whittle’, which must go back to *wali- and not *wëli-. The latter reconstruction would only be possible if one rejected the Ob-Ugric cognates and Saami *oaloo- (while maintaining *vuolë- as the Saami cognate, not as a Finnic loan).

        I could find at least 4 examples of the development *pp > Mari *p:

        Mari *kŭpa ‘gets mouldy’ < *koppi- (~ Saami *kuoppë-, Võro kopõq; hardly a loan from Germanic *xwapja-, as suggested by Koivulehto)

        Mari *lep(ǝ) ‘spleen’ < *leppä ‘spleen’

        Mari *lŭpǝ ‘driftwood’ < *loppV

        Mari *(wuj-)lep ‘fontanelle’ < *läppi ‘soft spot’ (a new etymology: cf. Lule Saami liehppa, Võro läpeq ‘hole in the ice’, MdM l’äpä ‘soft’, l’äpä vasta ‘fontanelle’, literally “soft place”; as for the semantics, cf. also SaaN suddi ‘hole in the ice; fontanelle’)

        Then there is *šåpǝ ‘sour’ from *šappa-, which shows an irregular -w- instead of -p- in some eastern dialects.

        *lewedä- ‘cover, put a roof on’ (and the parallel der. *lewäkš ‘shelter, canopy, cover’) could better reflect *läpi-, cf. Ume Saami liahpa ‘Schutzdach in der Wildnis für den Renhirten’, Lule Saami liehpa-bielle ‘half of a tent cover’ with single *-p-. And then there is also Khanty *läpǝŋ ‘hallway’ which could reflect *-p- as well as *-pp-. So this leaves only *kŭwǝlćǝ as a clear counterexample; but bird names often contain irregularities.

        I'm not aware of any convincing examples of the reflexes of either PU *-tt- or *-kk- in Mari.

        And right, I didn't remember that the Khanty reflex of *pëki- was also apparently my own idea I hadn't published… Zhivlov (Studies in Uralic Vocalism III p. 143) connects Mari pok(-)te- with Finnish pake-ne-, pak-o, but does not mention the Khanty verb.

        As regards *pëčka-, UEW 342 connects the Mari and Komi verbs but rejects the appurtenance of the Khanty forms. Still, I don't see a problem in including them.

        If by *koktə ‘abdominal cavity’ you refer to Udmurt /kët/, I don't think it can go back to a form with *-kt- (as claimed in UEW 670); this cluster was regularly preserved in Permic. I'd rather reconstruct something like ?*kotti and connect it with Hungarian hát etc. (cf. UEW 225).

        • j. says:

          Your counterexamples seem to all involve consonant-stem nouns, where only /p/ is expected. ‘Fontanelle’ is clearly such a case, and beside *kŭpa- ‘to get moldy’ UEW gives also /kup/ ‘mold’, which would allow both levelling and the verb being derived secondarily as explanations. For ‘driftwood’ UEW gives both /ləpə/ and /ələp/. (TschWB does not seem to have either of the previous two words.) ‘Spleen’ is also a consonant stem in Eastern Mari. Probably the stem type *-ə₂ = West /-Cə/ ~ East /-C/ does not trigger lenition; I don’t think any of them involve correspondences of the type /-wə/ ~ /-p/. I’m not sure what this all adds up to, though. Is the western final vowel really secondary in some fashion? Or, since Mari regardless has today /-p- -t- -k-/ around too, was lenition to /-w- -ð- -ɣ-/ conditioned by some specific stem type?

          In the ‘sour’ case, I wonder if *šåpə ~ *šåwə represents confusion between two originally distinct words, e.g. *šåpə ‘sour’ versus *šåwə ‘kvass’. At least the adjective sense likely goes back to a derived stem of some sort, e.g. earlier *šapmə?, since Finnic *happoin : *happama- and Mordvinic *šapamə both suggest earlier *šappa- ‘to be/go sour’ → *šappama ‘sourness > sour’.

          (Mansi and Hungarian also point to a base verb *čaka- ‘to go sour’ vs. a derived adjective *čakama ‘sour’. *-k- in contrast to *-pp- in “Finno-Volgaic” remains mysterious of course. FWIW Permic шӧм ~ шом ‘sour’ must surely also be a part of the puzzle in some fashion: this suggests a “hybrid” proto-form with a consonant skeleton *šVkVmA or *šVpVmA.)

          For *-kk- > /-ɣ-/ there are *ćukkV- > /cəɣərɣe-/ ‘to bend’ (with *ć- so probably not PU proper, but also one of the few cases of this being reflected regularly in all descendants) and *d₂okka- > /loɣe-/ ‘to push with horns’. For *tt the clearest cases are surely the predicative forms of ‘5’ and ‘6’: *wĭźət, *kuðət; maybe also (UEW: *ko/uttV-) *kŭðala- ‘to run, ride fast’ ~ Komi котӧрӧн, котӧрт-, though with quite irregular vocalism. Word-finally we again expect and find only /-k/, /-t/, as in the diminutive suffix *-k, and the numerals’ attributive forms *wĭć, *kut.

          • Ante Aikio says:

            For Mari nouns that show dialectal variation between a vowel stem in *-ə and a consonant stem, I think the simplest explanation is mere analogical shift of stem type: the final vowel could have been reanalyzed as a part of the stem from the vowel *-ǝ- surfacing before suffixes in oblique forms (as in *kit : ACC *kid(-)ǝ-m ‘hand’). Or even vice versa, an original vowel stem noun could have been reanalyzed as a consonant stem noun of the *kit : *kid(-)ǝ- type. Since these kinds of stem type correspondences in Mari nouns are rather rare, this hypothesis seems more natural than reconstructing one more Proto-Mari stem vowel with an obscure or unspecified phonetic value (*ə2).

            Regarding PU *pp, one can of course speculate that the dual reflexes (*-p- ~ *-w-) were originally conditioned by consonant vs. vowel stems. But this just begs the question why *owə ‘father-in-law’ is a vowel stem in the first place. If PU *ëpti ‘hair’ and *läppi ‘soft spot’ gave PMari *üp and *lep, respectively, then one would expect PU *ëppi also to be a consonant stem (PMari *op – or, rather, *üp).

            In any case I’m sceptical as to whether *-mp- is ever reflected as PMari *p. The word *lop ‘valley’ (supposedly from *lëmpi ‘pond / bog’) also has an obscure doublet *lap, with the vowel *a that does not occur in inherited vocabulary.

            Regarding *kk and *tt, both *ćukkV- and *d’okkV- are weak etymologies due to the anomalous sound correspondences. The numerals ‘five’ and ‘six’ are, of course, rather obscure because the Finnic and Saami forms point to single *t instead. Moreover, in both cases there are indications that the stem originally had some kind of more complex, trisyllabic shape: cf. Samoyed *wüət ‘ten’ (from *wiCit(t)i?) and Permic *kvat’ ‘six’ (from Pre-PPerm *ku(C)at(t)i?).

            The dual development of PU *nt (to PMari *d ~ *nd) is interesting. The following examples, at least, seem convincing to me:

            *šåndǝ ‘urine / shit’ (PU *śonta)
            *kånde- ‘carry, bring’ (PU *kanta-)
            *mende- (W mende-) ‘dawdle, procrastinate’ (PU *mäntä-; a new etymology. Cf. SaaN meaddit ‘miss (a shot); make a mistake’, MdM mäńďǝ- ‘let go, let away, let escape’, Komi me̮d- ‘go, start moving, set off’, Udm medi̮- ‘intend to, be going to; procrastinate’. Apparently a causative of PU *mäni- ‘escape, flee’)

            *lŭdǝ ‘duck’ (PU *lunta)
            *südä- ‘clear (road, field, forest)’ (PU *sënti-)
            *jĭdäŋ ‘bowstring’ (PU *jänti(ŋ))
            *lådǝ ‘cut, notch, nick, mark’ (PU *lonti; a new etymology; cf. SaaN luodda ‘tracks; trace, mark (left by something); road, way’)
            *pĭdala- ‘defend, stand up for, protect’ (PU *pintili-; a new etymology; cf. Khanty *päntǝl- ‘demand (for oneself), claim (as one’s property); defend, stand up for, protect’)

            Here one could hypothesize that the latter type was conditioned by stem type / the PU second-syllable vowel *i, but this leaves *lŭdǝ ‘duck’ as irregular. On the other hand, the former type could have been conditioned by the Mari open / non-closed vowel (*e, *å) in the first syllable, but this leaves *lådǝ ‘cut, notch, nick, mark’ as irregular.

            PU *lënti ‘lowland’ could perhaps be compared to Mari E liδa ‘valley’, liδe ‘hollow, dip’, W liδǝ ‘сухой овраг, ложбина’ instead of W landaka ‘small valley, depression (especially in a forest)’. Of course, the vowel *i (instead of expected *ü) is irregular, but the former comparison to landaka seems likewise irregular.

            Mari *kudala- ‘run, ride fast’ could probably be better compared to Khanty *kuntā- ‘run away, flee’ (PU *kunti-?); I don’t know if this etymology has been proposed earlier.

            • j. says:

              For Mari nouns that show dialectal variation between a vowel stem in *-ə and a consonant stem, I think the simplest explanation is mere analogical shift of stem type

              Probably possible, but I don’t think this explains why this supposed analogy would then only happen in Western Mari and never in Eastern. A comparative survey of non-initial vowels across the Mari varieties would be a good thing to have at this point.

              FWIW my earlier hypothesis has been that Proto-Mari had a contrast between only two qualities of non-initial vowels (let’s say *-A versus *-ə), and these could also be stressed vs. unstressed; then stressed *-Á *-ə́ give modern /-e/ and /-a ~ -ä/, while unstressed *-À *-ə̀ give what I marked in my 2016 post as *-ə₁ and *-ə₂. But this is not very compatible with my above hypothesis that *-ə₁ conditions lenition while *-ə₂ doesn’t!

              this just begs the question why *owə ‘father-in-law’ is a vowel stem in the first place.

              Not really. It is a vowel stem in any case, so this fact is available as a conditioning factor for /-w-/, regardless of the stem type’s prehistory. Generalization from possessed forms could be possible, similar to Hungarian (but this is clearly weaker for Mari, since in Hungarian reflexes like ipa the -a can be analyzed as directly continuing the earlier 3PS Px).

              I’m not sure what the alternative to inheritance from PU is for this word, anyway, it’s not as if e.g. borrowing through any other attested Uralic branch could explain the /-w-/ either.

              — The issue of geminates in general is getting expansive enough that I don’t think I can fully treat it here in the comments section. I may branch into at least a full blog post later on. I also have a few related points under work in my drafts already.

              On the treatment of *-mp- and *-nt-, actually I inversely don’t have strong opinions myself; I am instead referring in part to a paper Niklas Metsäranta has in the works. He has the same rough conclusions as you about *-nt-, but also e.g. additional examples with *-mp- leading to Mari /-p/.

    • CRCulver says:

      MariW landaka ‘depression’ should probably be seen as a derivational form of the same root as in MariE lomẟem NW landem ‘threshold; bump on road’. The definitions of lomdem etc. in TschWb suggest a landscape feature of a convex shape, but there is 19th-century material attesting to the use of the word for features of a concave shape instead. From the dialectal forms in TschWb, these words seem to go back to *låŋd-, which would explain the lack of a shift *nd > .

      This is part of a paper I am writing that criticizes several of the etymologies proposed by UEW or Bereczki, but it is somewhat far down the list of what I would like to publish, so it’ll probably still be a few months. I’ll eventually open it up to comments on
