Notes on Mari stem vowels

Though I often enough blog here about issues of consonantism too, it is clear that the largest challenges remaining in Uralic historical phonology concern vocalism.

Our current standard model of Uralic vowel history is mainly rooted in Samic, Finnic, and Mordvinic (the West Uralic group) on one hand, Samoyedic on the other. The evidence of these languages allows sketching a “canonical” system of eight stressed vowels in the 1st syllable, vs. a two-way contrast in the 2nd syllable. The later development in the other languages has also been surveyed in good enough detail to tell that the system probably is not going to need any fundamental uprooting. Perhaps we’ll eventually end up adding some further unstressed vowels; perhaps we can identify a ninth stressed vowel phoneme. But at least the former kind of updates will probably end up being based on these same key languages all the same. [1] Unstressed vowels are almost always lost in Permic, Hungarian, Mansi and Khanty, so positing new ones without any direct evidence would be quite questionable.

Mari is however a curious intermediate case. The original trochaic *CV(C)CV stem structure of Proto-Uralic is still partly preserved, though in a more reduced shape than in Mordvinic. But the development of stem vowels seems to diverge according to their parts of speech.


Verb roots in Mari are vocalic without exception, dividing into two stem classes: e-verbs (e.g. *ĭle- ‘to live’, *kånde- ‘to carry’, *pĭšte- ‘to put’) and a/ä-verbs (e.g. *kola- ‘to hear’, *lektä- ‘to leave’, *nelä- ‘to swallow’, *tola- ‘to come’). This distinction quite neatly corresponds to the West Uralic distinction between *a-verbs and *ə-verbs (cf. e.g. the Finnish cognates of the above verbs: elä-, kanta-, pistä-; kuule-, lähte-, niele-, tule-). There are still a number of exceptions; for many of them I could outline some lines of explanation, but in any case they don’t seem to rock the big picture.
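The correspondence just described can be written out as a small lookup. This is only an illustrative sketch: it assumes that the final vowel of a Finnish verb stem is a fair proxy for the West Uralic stem class (a/ä for *a-verbs, e for *ə-verbs), the function name expected_mari_verb_class is made up for the purpose, and the exceptions mentioned above are not modelled at all.

```python
import unicodedata


def expected_mari_verb_class(finnish_stem):
    """Guess the Mari verb class from a Finnish verb stem such as 'elä-'.

    Assumption of this sketch: Finnish a/ä-stems continue *a-verbs and
    Finnish e-stems continue *ə-verbs.
    """
    # Compose characters so that 'ä' is a single code point, then drop the hyphen.
    stem = unicodedata.normalize("NFC", finnish_stem).rstrip("-")
    final = stem[-1] if stem else ""
    if final in ("a", "ä"):
        return "e-verb"      # open stem vowel, raised to *e in Mari
    if final == "e":
        return "a/ä-verb"    # *ə, lowered to *a / *ä in Mari
    return None              # anything else is outside this sketch


if __name__ == "__main__":
    for stem in ["elä-", "kanta-", "pistä-", "kuule-", "lähte-", "niele-", "tule-"]:
        print(stem, "->", expected_mari_verb_class(stem))
```

Run on the seven Finnish stems cited above, the first three come out as e-verbs and the last four as a/ä-verbs, matching the Mari forms listed.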

As Mari /e/ in initial syllables regularly reflects PU *ä, it seems necessary to assume that inherited open stem vowels first merged as *ä, and were regularly raised after this. This would be quite similar to Samic, where the distinction between *ä and *a was similarly lost in the 2nd syllable, and the merged sound was in neutral environments eventually raised to *-ē.

The lowering of *ə to *a/*ä is not as trivial to understand, as Mari *a and *ä in initial syllables have no regular origin. Perhaps this is an additional piece of evidence that PU *ə was indeed a vowel quality that did not occur in the 1st syllable? The shift *[ə] > *a / *ä would be itself simple enough.


Nominal roots (including besides nouns also adjectival and numeral roots) are a different story. Almost no full-vowel-stem nominals occur in Mari, recent loanwords aside. The main types are instead consonantal and *ə-stems. Both types show only a simple reduced vowel /ə/ between the root and inflectional suffixes. In the latter stem type, this remains in the nominative; and is written е, ӧ, о in the orthography of Meadow Mari, though unlike actual /e ö o/ it remains unstressed. [2]

Vexingly, this distinction does not appear to correlate at all to the *a : *ə distinction recoverable from the West Uralic material. And unlike Mordvinic (where a class of consonant stems has emerged by loss of *-ə after single consonants and velar + sibilant clusters), the consonant environment does not seem to explain the duality, either. Final vowels can be either lost or retained after both heavy and light syllables; and this does not change if we look at the situation in Proto-Mari rather than Proto-Uralic. This adds up to a full set of no fewer than 12 different stem type correspondences between Mari and standard-issue Proto-Uralic:

  1. Light *a-nominal to vocalic stem:
    e.g. *kota > *kuðə ‘house’, *muna > *mŭnə ‘egg’, *śečä > *čü̆čə ‘uncle’
  2. Light *ə-nominal to vocalic stem:
    e.g. *kaśə > *kužə ‘long’, *ńëlə > *nülə ‘arrow’, *sülə > *šü̆lə ‘fathom’
  3. Heavy *a-nominal to light vocalic stem:
    e.g. *aška > *ošə ‘white’, *mërja > *mürə ‘berry’, *tälwä > *telə ‘winter’
  4. Heavy *a-nominal to heavy vocalic stem:
    e.g. *külmä > *kĭlmə ‘frozen’, *sonta > *šondə ‘dung’, *täštä > *tištə ‘sign’
  5. Heavy *ə-nominal to light vocalic stem:
    e.g. *ëppə > *owə ‘father-in-law’, *këččə > *kåčə ‘bitter’, *läwlə > *lelə ‘heavy’, *tammə > *tumə ‘oak’
  6. Heavy *ə-nominal to heavy vocalic stem:
    e.g. *oŋkə > *oŋgə ‘fishing hook’, *kośkə > *kåškə ‘rapids’, *wartə > *wŭrðə ‘shaft’
  7. Light *a-nominal to consonant stem:
    e.g. *ora > *ur ‘squirrel’, *kala > *kol ‘fish’, *pata > *påt ‘pot’
  8. Light *ə-nominal to consonant stem:
    e.g. *kätə > *kit ‘hand’, *lomə > *lŭm ‘snow’, *sënə > *šün ‘sinew’, *werə > *wü̆r ‘blood’, *wetə > *wü̆t ‘water’
  9. Heavy *a-nominal to light consonant stem:
    e.g. *ojwa > *wuj ‘head’, *jalka > *jål ‘foot’, *neljä > *nĭl ‘4’
  10. Heavy *a-nominal to heavy consonant stem:
    e.g. *oksa > *ukš ‘branch’, *lupsa > *lŭpš ‘dew’, *mëksa > *mokš ‘liver’
  11. Heavy *ə-nominal to light consonant stem:
    e.g. *ëptə > *üp ‘hair’, *künčə > *kü̆č ‘nail’, *pučkə > *pŭč ‘hollow stem, tube’, *śarwə > *šur ‘horn’
  12. Heavy *ə-nominal to heavy consonant stem:
    e.g. *mekšə > *mükš ‘bee’, *soksə > *šukš ‘worm’
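For anyone wanting to sort a longer word list into these twelve types mechanically, here is a rough sketch of how that could go. The assumptions are all mine and fairly crude: diacritics are stripped before counting, "heavy" simply means two or more consonants directly after the first vowel, *a and *ä are lumped together as the "a" class, and a Mari form counts as vocalic if it ends in a vowel. The names skeleton, weight and stem_type_number are made up for this sketch.

```python
import unicodedata

VOWELS = set("aeiouə")  # base vowel letters left once diacritics are stripped


def skeleton(form):
    """Strip the asterisk and all diacritics, e.g. '*śečä' -> 'seca'."""
    decomposed = unicodedata.normalize("NFD", form)
    return "".join(c for c in decomposed if c.isalpha())


def weight(skel):
    """'heavy' if two or more consonants directly follow the first vowel."""
    first = next(i for i, c in enumerate(skel) if c in VOWELS)
    count = 0
    for c in skel[first + 1:]:
        if c in VOWELS:
            break
        count += 1
    return "heavy" if count >= 2 else "light"


def build_types():
    """Enumerate the combinations in the same order as the list above."""
    types = []
    for outcome in ("vocalic", "consonant"):
        for pu_weight in ("light", "heavy"):
            for pu_vowel in ("a", "ə"):
                # A light PU root can only yield a light Mari stem.
                mari_weights = ("light", "heavy") if pu_weight == "heavy" else ("light",)
                for mari_weight in mari_weights:
                    types.append((pu_weight, pu_vowel, mari_weight, outcome))
    return types


TYPES = build_types()


def stem_type_number(pu_form, mari_form):
    """Return the type number 1-12, or None if the pair fits no listed type."""
    pu, mari = skeleton(pu_form), skeleton(mari_form)
    pu_vowel = "ə" if pu[-1] == "ə" else "a"
    outcome = "vocalic" if mari[-1] in VOWELS else "consonant"
    key = (weight(pu), pu_vowel, weight(mari), outcome)
    return TYPES.index(key) + 1 if key in TYPES else None


if __name__ == "__main__":
    for pu, mari in [("*kota", "*kuðə"), ("*ëppə", "*owə"),
                     ("*ojwa", "*wuj"), ("*mekšə", "*mükš")]:
        print(pu, ">", mari, "-> type", stem_type_number(pu, mari))
```

The enumeration also makes the count of 12 explicit: four combinations for light roots (two stem vowels × two outcomes) plus eight for heavy roots (two stem vowels × two outcomes × two resulting weights).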

I get the feeling that this mess cannot (and shouldn’t) be resolved starting from just the canonical PU root structure and designing sound changes fine-tuned for exact vowel and consonant environments. E.g. supposing that *ə remains after *l, as in ‘arrow’ and ‘fathom’, will make it difficult to explain the consonant stem in ‘fish’. Probably at least one stem type distinction has been retained here that does not systematically survive in West Uralic.

This doesn’t mean that there couldn’t still be minor conditional sound laws involved, of course; e.g. heavy consonant stems seem to involve only plosive + *š clusters, and probably a similar conditional loss of *ə has occurred here as did in Mordvinic. (Altho there are still words like *kukšə ‘dry’, *upšə ‘hat’ to be found as well.)

On the other hand: the fact that only nominals are this much of a mess suggests another avenue of explanation. Probably some parts of the situation can be cleaned up by distinguishing inherited words from loanwords. Loans are typically nominals, and this can easily lead to a larger number or proportion of unetymological root shapes appearing in them. Consider e.g. Baltic *kerta → Finnish kerta, Mordvinic *kirda ‘time, instance’. If we naively equated the distribution of this word with its age, we might end up reconstructing a common West Uralic proto-form *kertä. But the expected reflexes of this should rather be *ä/*ə-stem forms: ˣkertä, **kiŕďə.

Verbs by contrast are somewhat less likely to be loaned. Modern Finnish makes a particularly striking example: in underived verb roots the etymological vowel combinations /e-ä/, /i-ä/ remain still more numerous than the loanword combinations /e-a/, /i-a/, although in nominals the battle has been lost ages ago (perhaps already in Proto-Finnic).

My next step in untangling this issue would probably be to tabulate how 1) widespread Uralic roots, 2) areal possibly-Uralic roots, and 3) known loanwords of various age are distributed in Mari between the 12 classes. Preliminarily, it seems that at least type #2 (*CVCə > *CVCə) is numerically much overshadowed by type #8 (*CVCə > *CVC). And here at least *nülə could be suspected of being a family-internal loanword from some direction, since this actually has an unexpectedly specific sense ‘arrowhead made of bone’, while the neutral Mari word for ‘arrow’ is instead *pikš. It would have to be a very old loan though, since it shows the expected proto-Mari sound changes *ń- > *n-, *ë-ə >  *ü, *ü > *[ö] / _R. [3]

Additionally, a second bisyllabic nominal stem type might have to be set up for Proto-Mari, for words where Hill Mari has a consonant stem but Meadow Mari has a vowel stem. It does not seem immediately clear if this correspondence can be always derived from an original vowel stem by apocope. Examples of this correspondence among the words mentioned above include ‘fathom’ (-lə₂), ‘berry’ (-rə₂), ‘oak’ (-mə₂) and ‘hat’ (-pšə₂); but not ‘house’ (-ðə₁), ‘egg’ (-nə₁), ‘uncle’ (-čə₁), ‘father-in-law’ (-wə₁), ‘bitter’ (-čə₁), ‘long’ (-žə₁), ‘dry’ (-kšə₁). This could add some extra resolution as well.


Finally, I’ll note that Mari also allows monosyllabic nominal stems. These regularly reflect roots with earlier medial semivowels or spirants, regardless of the original stem type: e.g. *kiwə > *kü ‘stone’, *luka > *luɣa > *lu ’10’, *śüd₁ə > *šü ‘coal’, *täjə > *ti ‘louse’. [4] But, interestingly, and further highlighting the stark split in stem type behavior between verbal and nominal roots in Mari, there are no monosyllabic verbs to go along with these. Candidates for monosyllabicity end up as bisyllabic CV.V stems instead, again with exactly the expected stem vowel. E.g. *jëxə- > *jü.ä- ‘to drink’, *kajwa- > *ko.e- ‘to dig’. Does this perhaps indicate that monosyllabic nouns should be considered a subtype of consonant-stem nouns, even though no nominals of a shape **CVə seem to occur?

[1] A few good candidates are indeed already provided by two kinship terms:
– PS *nōtōj ‘husband’s sister’ ~ PF *nato ‘spouse’s sister’ ~ PSmy *nåto ‘spouse’s younger sibling’
– PS *kālōj ‘husband’s brother’s wife’ ~ PF *kälü ‘(husband’s) brother’s wife’ ~ PSmy *kälü ‘sister’s husband’
The argument for reconstructing a “kinship suffix” *-w for these (*nataw, *käläw?) appears to be circularly motivated by the belief that PU did not allow any 2nd syllable labial vowels. On the other hand, the unstressed labial vowels in Proto-Samoyedic are a relatively new discovery as well, and before that, words like these could have well been counted among the words that have innovated 2nd syllable labial vowels in Proto-Finnic and Proto-Samic. — On the third hand, I also wonder if the problematic sound correspondences in a third similar word: PS *vivë ~ PF *vävvü ~ PSmy *weŋü ‘son-in-law’ should be attempted to resolve by constructing something like PU #weŋäwə, with Samic *-vë not corresponding to PF *-vvü and PSmy *-ŋü, but instead only to their final labial vowel.
[2] I have seen phonological descriptions of Meadow Mari that attempt to follow the orthography and identify final unstressed е, ӧ, о with /e, ö, o/ (e.g. Eeva Kangasmaa-Minn’s description of Mari in the 1998 reference book Uralic Languages by Routledge), but this seems like a terrible idea to me: it clashes with regular stress assignment on the rightmost full vowel, and requires setting up a rule by which final /ə/ becomes one of the full mid vowels, depending on vowel harmony. I sort of wonder if the analysis lingers out of some kind of attachment to vowel harmony, which this schwa-fortition rule would be the only example of in Meadow Mari.
[3] Another option might be to assume that the final vowel represents some kind of a fossilized derivational element.
[4] It also appears to be the case that just about all of these cases have close tense vowels *i, *ü, *u.

Posted in Reconstruction
24 comments on “Notes on Mari stem vowels”
  1. M. says:

    This is a broader issue that applies beyond this particular blog post, but I’m curious why you include only reconstructions for *both* original and resultant forms here?

    For example, in “*tälwä > *telə“, both the starting point and the end point are reconstructions, whereas a formulation like “*tälwä > *telə (> Hill Mari tel, Meadow Mari tele)” shows the data that the reconstruction is based on.

    • j. says:

      Principally for brevity, really. In the case of Mari this isn’t too much of an issue, but when dealing with Finnic, Samic, Mansi, Khanty and Samoyedic data, citing all the direct evidence would be prohibitively time-consuming (esp. since we don’t even have up-to-date phonemic analyses available for several varieties). There certainly are broader issues in this, in that an incorrect mid-level reconstruction can risk giving a more regular (or a more irregular!) picture of the situation than it really is, but I think it should be understood in general that reconstructions are not infallible and can be subject to revision.

      I also definitely prefer listing reconstructions only, that explicitly admit to being reconstructions not data, to only listing selected “key variety” data, which risks giving the picture that there really isn’t any room for revision. Contrary to what some people have sometimes wanted to say about e.g. Veps, there is no such thing as “the Sanskrit of Finnic”, not even a “the Gothic”.

      (I also believe that IE studies overemphasize the ancientmost languages anyway, even though e.g. Avestan is not Proto-Iranic, Vedic is not Proto-Indo-Aryan, OCS is not Proto-Slavic etc.)

      For Permic I often make an exception of sorts on the other hand, since there still is no such thing as a generally accepted Proto-Permic reconstruction, especially as comes to vocalism.

      • M. says:

        Contrary to what some people have sometimes wanted to say about e.g. Veps, there is no such thing as “the Sanskrit of Finnic”, not even a “the Gothic”.

        You mean that there is no Finnic language that shows very little cluster reduction, syllable loss or sound mergers compared to reconstructed Proto-Finnic? Or, were you comparing the attested Finnic languages to reconstructed Proto-Uralic?

        I also believe that IE studies overemphasize the ancientmost languages anyway

        Maybe so, but isn’t reconstruction (at least traditional reconstruction) mainly a process of *ranking* the “ancientness” of attested sounds? For example, by reconstructing the Proto-Finnic form *nooli “arrow”, one is taking the position that the monophthong seen in Estonian nool is older than the diphthong seen in Finnish nuoli, but the final vowel in Finnish is older than the lack of one in Estonian.

        Of course, there are cases where no attested sound is clearly older than others (as in the case of IE syllabic nasals: Latin centum, Greek hekatón “100” < IE *kmtóm), but I think that the more conservative one’s reconstruction methodology is, the smaller the number of such cases will be.

        • M. says:

          but I think that the more conservative one’s reconstruction methodology is, the smaller the number of such cases will be.

          Or rather, the more conservative one’s methodology, the smaller the number of nowhere-attested sounds (such as IE syllabic nasals) that will be found in the reconstruction.

        • j. says:

          I mean that there is no Finnic variety (dividing the dialect continuum into “languages” is largely arbitrary) that would have been actually attested early enough for us to assign high probability to its features being generally archaic. Veps for example is archaic with respect to some key features, phonologically e.g. in retaining *b, *d, *g (it might even have had these as plosives all along rather than reverting them from *β, *ð, *ɣ) and also *h between unstressed syllables, but it’s also highly innovative with respect to medial lenition, syncope, cluster simplification, expansion of the case system, Slavic lexical influence etc. In many of these respects its northern neighbor Ludic actually fares better.

          isn’t reconstruction (at least traditional reconstruction) mainly a process of *ranking* the “ancientness” of attested sounds?

          It is, but if we take e.g. Sanskrit being obviously archaic with respect to 85% of its features, and non-trivially but still demonstrably archaic with respect to an additional 10%, as a licence to also treat the remaining 5% as archaic, this is still going to likely end up with a number of wrong analyses. (Especially if we do not even bother examining what the state of these features is across the later-attested Indo-Aryan languages. Modern dialectology for these still has much work to be done.)

          I’ve recently actually attempted assembling a brief overview of historical sound changes from PIE to PII to PIA to Sanskrit, and at least on an outsider’s look, there are numerous features that it doesn’t seem to make sense to analyze as innovations later on. E.g. in terms of clusters — Sanskrit does not only abstain from cluster assimilations, it moreover actually dissimilates e.g. //-s.s-//, //-ṣ.ṣ-// and //-ś.ṣ-// into -ts-, -kṣ- and -kṣ- respectively, a change that appears to have no support in the Prakrits. Which makes sense as a “regular hypercorrection”, if and given that Sanskrit existed as an acrolect alongside more geminate-happy mesolects for a good while.

          Of course, there are cases where no attested sound is clearly older than others (as in the case of IE syllabic nasals: Latin centum, Greek hekatón “100” < IE *kmtóm), but I think that the more conservative one’s reconstruction methodology is, the smaller the number of such cases will be.

          Yes. I also consider it demonstrably true (by the analysis of later linguistic radiations such as Germanic, Finnic, Slavic) that there is no reason to attempt minimizing the number of such cases. It is in fact even possible to find examples where a single innovation has at a post-protolang date spread to all modern descendants. A clear example would be fate of short vowels in Slavic, where short *a *ä *i *u are by default nowadays found as /o e ∅ ∅/ all over the Slavic-speaking area; yet the loanwords into both Finnic and Hungarian (and I would presume also other neighbors like Baltic and Romanian) still show the former as low vowels, the latter as close. Thereby we get correspondence sets like X ~ X ~ X ~ X ~ X that regardless reconstruct to *Y and not *X.

          • M. says:

            I mean that there is no Finnic variety (dividing the dialect continuum into “languages” is largely arbitrary) that would have been actually attested early enough for us to assign high probability to its features being generally archaic.

            Why is the date of attestation significant here? Even if, for example, Icelandic had no records older than 100 years ago, it would still be pretty easy to call it more archaic than mainland Scandinavian in many or most of its phonetic features.

            Or, is the situation in Finnic such that it is hard to rank most features on a scale of archaicness, unless criteria such as the earliness/lateness of attestation are brought in?

            Yes. I also consider it demonstrably true (by the analysis of later linguistic radiations such as Germanic, Finnic, Slavic) that there is no reason to attempt minimizing the number of such cases.

            I’m not well-versed in the reconstruction of Slavic or Finnic, but I can’t think of a single major feature of traditionally-reconstructed Proto-Germanic that is not present in at least one attested language. Even non-initial stress, though not demonstrably present in attested Germanic, is found in cognates in other branches of IE (e.g. English and : Greek antí) and can be inferred for Proto-Germanic because it correlates with attested features.

            If we had less Germanic data to work with, then we simply wouldn’t be able to reconstruct as much: e.g. without testimony from other Germanic and IE languages, we probably wouldn’t be able to tell whether the vowels of English bean and Scandinavian bønne / böna reflected an earlier diphthong or a monophthong.

            • j. says:

              Why is the date of attestation significant here? Even if, for example, Icelandic had no records older than 100 years ago, it would still be pretty easy to call it more archaic than mainland Scandinavian in many or most of its phonetic features.

              If we agree that “archaicity” refers to actual chronological stratification, a situation where e.g. attestations of Sanskrit date from thousands of years earlier than attestations of the Dardic languages is going to give the former a substantial head in the race. It would not be impossible for some archaisms to be regardless found in Dardic of course (just as Albanian or Lithuanian or Ossetic retains non-Sanskrit archaisms). But this is a very different situation from one where all Finnic varieties have only been first attested some 1500-2000 years after their breakup. In this case it is both a priori likely and, after some work, a posteriori verifiable that the protolanguage was not especially close to any of the attested varieties. Not only will it be asinine to use any one variety (whether with internal reconstruction applied or not) as a fully general stand-in for Proto-Finnic, but also using a narrow selection of “key” varieties, say {Finnish, Veps}, is also still likely to miss various details.

              Whereas within Indo-Aryan, already Sanskrit alone in fact still gets us much closer to Proto-Indo-Aryan.

              I should check Ahlqvist’s original mid-1800s comments sometime, though. I suspect he intended his assessment of Veps as “the Sanskrit of Finnic” more in terms of its original “this newly discovered language provides valuable comparative evidence” news value rather than anything along the lines of Sanskrit’s slightly later notoriety as “this is pretty much the original language right here”.

              I can’t think of a single major feature of traditionally-reconstructed Proto-Germanic that is not present in at least one attested language

              I believe a few vowel developments might work for that (e.g. the Gothic general shift *e > i makes it difficult to date the early i-umlaut processes), but more important evidence appears within the Germanic subgroups. Not necessarily even in the form of innovations that are universal but areal, but in the form of secondary innovations with wide but non-universal spread. For a simple example, effectively all modern continental Germanic languages show *w > v, and if English (+ Scots etc.) and some minor dialects weren’t attested, the change would be dateable as Proto-Germanic, though we know that it’s substantially later.

              This principle is crucial for deeper reconstruction levels, when we can only trace the dialectal diversity of the initial radiation thru the intermediate protolanguages of groups that also managed to latch on later success. Descendants of English most likely will still be around e.g. 1000 or 5000 years from now, but what do you think the chances are that descendants of varieties like Elfdalian or Gottscheerish will be?

              • M. says:

                This principle is crucial for deeper reconstruction levels, when we can only trace the dialectal diversity of the initial radiation thru the intermediate protolanguages of groups that also managed to latch on later success.

                What do you mean by “through the intermediate protolanguages”?

                Are you saying that e.g. it would be justified to posit *w for Proto-Germanic even if all attested Germanic had v instead?

                • j. says:

                  Since no “too early to be subgrouped” IE languages have been attested, almost all of our knowledge comes from languages that were parts of later radiations. Proto-Germanic, Proto-Indo-Iranian, Proto-Greek etc. have all undergone a decent amount of post-PIE development, and their “original” sister dialects in early post-PIE times are lost for good under these subgroups’ own expansions. Which amounts to also plenty of evidence lost.

                  So while there are hundreds of individual data points available for reconstructing Proto-Germanic (or Proto-Finnic) — for the reconstruction of PIE (or Proto-Uralic), most information is not independent, and must have once passed thru the bottlenecks of Proto-Germanic etc.

                  And so the lesson from looking at later radiations is that our current reconstruction methods don’t allow clearly justifying reconstructing PGmc *w (that we, armed with fuller data, know was clearly there) from just a handful of varieties that all now have /v/. When reconstructing something like PIE, similar sources of error will be very likely lurking left and right.

                  What if a para-Germanic dialect that still retained evidence of *h₅ got steamrollered by Celtic already 3000 years ago? What if an intermediate Iranian-Balto-Slavic variety used to maintain both *péh₂ur and *Hn̥gʷnis with a clear semantic distinction between them? Or even stuff like: what if Tocharian actually distinguished uvulars from velars, but none of the scripts captured this? Etc.

              • M. says:

                Replying to this comment:

                And so the lesson from looking at later radiations is that our current reconstruction methods don’t allow clearly justifying reconstructing PGmc *w (that we, armed with fuller data, know was clearly there) from just a handful of varieties that all now have /v/

                I agree with you, but what puzzled me was your earlier statement that there is no reason to minimize the number of unattested sounds in a reconstruction. How is this supported by the dialectal radiation of Germanic, Finnic, etc., as you seemed to be saying?

                Maybe what I would like to see is a system of indicating the relative probability of reconstructed forms. For example, one could prefix a reconstruction with more asterisks the more speculative one considers this reconstruction to be. I would say that the reconstruction of two velar series for IE (or a velar and uvular series), while not an implausible inference from the data, is not at the „one-asterisk“-level, whereas the reconstruction of syllabic nasals is.

                There might be varying opinions on how high or low to rank a given reconstruction, but at least this ranking would make it clearer how much credibility a person is giving to a particular reconstructed form.

                • j. says:

                  [W]hat puzzled me was your earlier statement that there is no reason to minimize the number of unattested sounds in a reconstruction. How is this supported by the dialectal radiation of Germanic, Finnic, etc., as you seemed to be saying?

                  Within Finnic, we know by now from studies on the relative chronology of innovations that strictly speaking Proto-Finnic still had at minimum *š, *č and *d, probably also the full set of Proto-Uralic palatalized consonants + *x. They’ve changed to other stuff in all varieties, but the loss is posterior to branch-specific developments. The traditional Finnish-like palatal-less, postalveolar-less reconstruction applies at most to the protolanguage of the varieties other than South Estonian. (I believe Livonian also needs to be excluded, though that argument would take a while to fully outline, and even the Northern Finnic / Central Finnic split could perhaps turn out to go further back on sufficiently close inspection.)

                  Within Germanic there do not seem to be too many direct precedents for this exact kind of a thing, but it still demonstrates the underlying reasons that cause these issues (the sometimes quite extensive diffusability or reoccurability of sound changes, given enough time).

                  Of course I don’t mean “anything goes in reconstruction, go wild”. The overall weight of assumptions about sound changes still should be minimized; just not their bare count. If assuming new but natural and easily spreadable sound changes (perhaps ones that are even internally reconstructible) to have occurred in every attested language of a family would help explain other weird phenomena as e.g. archaic shunts, I think this will be a good idea.

            • David Marjanović says:

              Even non-initial stress, though not demonstrably present in attested Germanic, is found in cognates in other branches of IE (e.g. English and : Greek antí) and can be inferred for Proto-Germanic because it correlates with attested features.

              That depends on what you mean by “Proto-Germanic”. You’re almost certainly right if you mean the stage right after Grimm’s law happened, for example; but if you mean the last common ancestor of the attested Germanic languages, that stage already had resolutely initial stress, and that’s the sense in which “Proto-Slavic” is meant here: the *a > o shift, common to all attested varieties without exception (not counting the loans and foreign transcriptions hinted at above), appears to have happened after the branch-specific changes to the sequence *ar.

      • David Marjanović says:

        (I also believe that IE studies overemphasize the ancientmost languages anyway, even though e.g. Avestan is not Proto-Iranic, Vedic is not Proto-Indo-Aryan, OCS is not Proto-Slavic etc.)

        Oh, definitely. Shortly after Jones, Schlegel believed that Sanskrit was PIE; IEistics since then have been on a slow drift away from this position. :-)

        The perhaps most noticeable feature which shows that even Rgvedic is not Proto-Indo-Aryan is the megamerger of a whole lot of plosive + plosive and plosive + fricative clusters into kṣ, which is not shared by Prakrit or apparently the modern varieties. I have encountered the claim that Vedic is directly ancestral to the Dardic languages; unfortunately I have no idea what evidence this is based on or if there’s any to the contrary, because IEists and most other people have generally ignored Dardic altogether…

        the loanwords into both Finnic and Hungarian (and I would presume also other neighbors like Baltic and Romanian) still show the former as low vowels, the latter as close.

        Also a few Austrian placenames, and there’s an early OHG document where someone’s name is registered as Tagazino – evidently *togo synъ of conventional reconstruction, the son of the man mentioned just above in the list, with already shifted *y but unshifted *o (or largely unshifted – I don’t know how, say, [ɒ] would have been interpreted).

        However, the mere geographic distribution of these things doesn’t mean Proto-Slavic had already split up before the Slavic expansion. That this did happen is indicated by branch-specific developments which have to be ordered before at least this last phase of the Great Slavic Vowel Shift.

  2. If not a typo, what are your grounds for reconstructing Proto-Uralic ‘snow’ as *lomə and not *lumə? I understand that such a Mari reflex would be plausible as one of the instances of Proto-Mari *ŭ < PU *o adjacent to labial consonants (Aikio 2014: 157), but what is the evidence from elsewhere in the Uralic family for *lomə?

    Also, what motivates reconstructing the vowel of ‘berry’ as *mërja and not *marja? Couldn’t one explain the Mari form as follows: PU *marja > *märjV (fronting of initial-syllable vowel in anticipation of /j/) > *mör(jV) (raising and rounding of *ä after *p, with /ö/ as the allophone before /r/) > mör. The development of the word would then be comparable to MariW pülä ‘ziemlich, beträchtlich’ < *paljз (UEW 350–1), though I have no idea how widely that etymology is accepted these days outside of Bereczki (2013: 211).

    • j. says:

      what are your grounds for reconstructing Proto-Uralic ‘snow’ as *lomə and not *lumə?

      I follow Janhunen’s (1981) observation that in stems of the shape *CVCə, both Proto-Samoyedic *o and *u regularly correspond to western Uralic *u, e.g. PSmy *jom ‘snow’, and that we should assume a raising *o > *u in the west. The other examples are

      • PSmy *por- ‘to bite’ ~ Fi. purra etc.
      • PSmy *so ‘mouth’ ~ Fi. suu etc.
      • PSmy *toj- ‘to come’ ~ Fi. tulla etc.
      • PSmy *kot- ‘to cough’ ~ Northern Sami gossat etc.

      Other possible cases are PSmy *lë ‘bone’ ~ Fi. luu etc., where we could assume PU *lëwə > *lowə > *luwə; and the Indo-European loanword *onə > Fi. uni. Contrast e.g. PSmy *tuj ‘fire’ ~ Fi. tuli etc., PSmy *uə- ‘to swim’ ~ Fi. uida etc., PSmy *kuj ‘spoon’ ~ Fi. kuiri etc. which point to original *u.

      I’m more skeptical about the analysis by Sammallahti (1988) that *o > *u in these words would be common already to all traditional Finno-Ugric languages (you might notice that e.g. Mari (*)tola- ‘to come’ does not seem to indicate pre-Mari *u), but going into the details on that topic would be an entire blog post, or perhaps paper, of its own. (Will be? It’s one of the things in my pile of drafts to work on.)

      Also, what motivates reconstructing the vowel of ‘berry’ as *mërja and not *marja?

      This being a case of the well-attested development *ë > *ü (admittedly in any reasonably recent works the case for it does not seem to have been reviewed in any detail, only mentioned in passing) seems more straightforward than what you propose. E.g. *mälkə > (*)mel ‘breast’ and *pälä > *pelə ‘half’ would moreover seem to argue against your second sound change. The proposal by Itkonen, as referred to in the UEW, through first *a > *o, then *oCj > *öCj would probably fare slightly better, although his other example: Mari *nörə ‘soft, moist, flexible’ has by now been argued (Aikio 2012: 234) to be better derived from PU *ńërə, and cognate to Fi. nuori ‘young’ rather than Fi. norja ‘flexible’.

      UEW proposes some cognates in Ob-Ugric for both ‘berry’ and ‘much’ that would not be compatible with reconstructions with *ë, but I find them rather uncertain — especially if Fi. paljo is actually an old Slavic loanword (cf. Saarikivi 2009, SUST 258). Its proposed Hill Mari cognate seems likely to also somehow derive from PIE *pelh₁u-, though I’ve yet to see a detailed etymology worked out.

      • crculver says:

        E.g. *mälkə > (*)mel ‘breast’ and *pälä > *pelə ‘half’ would moreover seem to argue against your second sound change.

        The words I was thinking of when I suggested rounding of *ä after labials were MariE pükš ‘nut’ < PU *päški- and MariE βüẟem ‘to lead’ < *wätä-. (The reconstruction of ‘to lead’ as *wätä- and not *wetä is from Aikio 2014: 155.)

        But I misremembered things as being so simple. Rounding of the vowel in βüẟem is obviously late as it is limited to MariE. Also, looking at the larger set of comparanda in Aikio 2014, it’s remarkable just how varying the treatment of *ä is: we find rounding in the two words mentioned above, but not in pište ‘linden’ < *päkšnä. The three words that Aikio lists as reflecting Mari ü < PU *ä – MariE jükšem ‘get cold’ < *jäkši, MariE pükš ‘nut’ < *päški, and MariE šükšö ‘rotten’ < *säskä – have in common only a medial cluster of a velar followed by a spirant and then (setting these apart from *päkšnä) a vowel, and I don’t see any phonetic motivation for rounding there. I suppose the most parsimonious development would be to propose a merger of *ä and *i̮ in this environment, because *i̮ also undergoes rounding for no particular reason.

        (Speaking of *i̮, looking at Aikio’s list of Mari ü < *i̮, all but two examples involve the high vowel *i or *j in the second syllable. If *i̮ was phonetically a mid unrounded vowel, and it was raised under a form of umlaut, would it, to Proto-Mari ears, have merged more readily with a rounded vowel *ü than an unrounded vowel *i? In Ossetian, *a was raised before nasals a few centuries ago, and it became o, not æ, in spite of no labial environment in words like don ‘water’ < *dan).

        I do wonder if, looking at the data in Aikio’s article and elsewhere, one can already construct an elaborate scheme of relative chronologies and posit the sort of stages for Pre-Proto-Mari that I admire so much from Korhonen’s textbook on the history of Saami. I’d be interested in trying to do that, but I suspect I would be beaten to it.

        • j. says:

          I suppose the most parsimonious development would be to propose a merger of *ä and *i̮ in this environment, because *i̮ also undergoes rounding for no particular reason.

          Sounds sensible, especially since I just proposed in another post that *ä may have been [ɛ]; and since I already support a mid vowel *ë (UPA *e̮) in place of Janhunen and Sammallahti (and Aikio, and also already Steinitz)’s *i̮. So we would be dealing roughly with a simple retraction development [ɛ] > [ɜ] before (some) velar consonants.

          Speaking of *i̮, looking at Aikio’s list of Mari ü < *i̮, all but two examples involve the high vowel *i or *j in the second syllable. If *i̮ was phonetically a mid unrounded vowel, and it was raised under a form of umlaut, would it, to Proto-Mari ears, have merged more readily with a rounded vowel *ü than an unrounded vowel *i?

          I believe the conditioning there has more likely been a-umlaut (*ë-a > *a-a > *å/*o). Remaining — or newly incoming, as in the Indo-Iranian loanword *śëta ‘100’ — instances of *ë would probably have first drifted to *ö, then *ü *ö > *ü̆ *ü just as also *u *o > *ŭ *u.

          I do wonder if, looking at the data in Aikio’s article and elsewhere, one can already construct an elaborate scheme of relative chronologies and posit the sort of stages for Pre-Proto-Mari that I admire so much from Korhonen’s textbook on the history of Saami.

          One probably could. I have some sketches of my own already outlined. And the standards might not be quite as high as you think: Korhonen’s form of presentation is certainly attractive, but a number of his calls on relative chronology and phonetic specifics do not seem to have been defended in full detail anywhere.

      • M. says:

        and the Indo-European loanword *onə > Fi. uni

        You mean “possible IE loan”, right? While it’s not inconceivable that Finnic un(i) and Mordva on “sleep” come from a cognate of Greek ónar “dream” (which I think is what you’re referring to here), this seems like a rather weak proposal (at least based on the information I have thus far). Some issues:

        – Only two segments match (*on-) between the Finno-Volgaic and the IE words

        – The IE word seems to have been disyllabic, but Finno-Volgaic shows no trace of the second IE syllable

        – The only known reflexes of the IE word, according to the dictionaries I have checked, are in branches that are relatively geographically close (Greek, Armenian and Albanian), so the word may be a regionalism that doesn’t go back to common IE to begin with; and, Uralic languages have not been in direct contact with any of these three IE branches

        – The IE words mean “dream” specifically; the Finno-Volgaic words mean “sleep” in general (I’m not claiming that the difference between these two meanings cannot be explained through semantic change, but the mere fact that such a change is possible doesn’t prove that it occurred in this case)

        • j. says:

          Most loanwords this far back are merely “possible”, yes. We know though that heteroclitic *-r̥ / *-n- in PIE roots generally fails to appear in possible Uralic counterparts, as also in *kesä ‘summer’ ← *h₁es-r̥/n- (and in *wetə ~ *wod-r̥, *wed-n- ‘water’, however this should be analyzed), so that does not seem like an especial weakness at least.

          Semantics-wise, we seem to have evidence for a shift ‘dream’ > ‘sleep’ in how the Finnic verbal derivatives such as *uneksi- (Finnish, Karelian, Ludian), *unista- (Veps, Estonian, Livonian) mean ‘to dream’. The neutral Finnic word for ‘to sleep’ is instead *nukku-, *tukku-. Better yet on the Mordvinic side, now that I re-check: *on actually does mean ‘dream’, in contrast to native Uralic *udəmə ‘sleep’.

          The narrow IE distribution is perhaps the worst problem. Are there competing PIE words for ‘dream’ that would have a wider distribution, though? *swépno- will have to be ‘sleep’ instead, and e.g. Gmc. *draumaz and Slavic *mьčьta don’t seem to go back to PIE as such.

          • M. says:

            We know though that heteroclitic *-r̥ / *-n- in PIE roots generally fails to appear in possible Uralic counterparts, as also in *kesä ‘summer’ ← *h₁es-r̥/n- (and in *wetə ~ *wod-r̥, *wed-n- ‘water’,

            Actually, I don’t accept that we know this. As far as I can currently see, the etymological proposals you mention for kesä and vesi/vete- are *also* unconvincing, for similar reasons to the *un- /*on-r. proposal.

            And, even if we did accept this deletion of the *-r/*-n suffix as a principle, there would remain the problem that *un- /*on-r. entails only a two-segment match (VC : VC), which seems rather close to the threshold of coincidence.

            Semantics-wise, we seem to have evidence for a shift ‘dream’ > ‘sleep’ in how the Finnic verbal derivatives such as *uneksi- (Finnish, Karelian, Ludian), *unista- (Veps, Estonian, Livonian) mean ‘to dream’.

            Why can’t this just as easily be evidence for the reverse shift, “sleep” > “dream”? Cf. Latin somnium “dream” and Slovenian sanje “dream”, both from a root *swep-n- that seems to have meant “sleep” (Greek húpnos “sleep”, Icelandic svefn “sleep”, etc.)

            Better yet on the Mordvinic side, now that I re-check: *on actually does mean ‘dream’, in contrast to native Uralic *udəmə ‘sleep’.

            What dictionary are you using for this? The UEW translates Mordva on with the Russian words “сон, сновидение”, which according to Google Translate mean “sleep” and “dream”.

            It appears that Russian сон can itself mean both “sleep” and “dream” (whereas мечта seems to more unambiguously mean “dream”), so perhaps this duality of meaning has influenced the Mordva word, given that Mordva speakers have been surrounded by Russian speakers for some generations now.

            The narrow IE distribution is perhaps the worst problem. Are there competing PIE words for ‘dream’ that would have a wider distribution, though?

            It isn’t just the narrow distribution of the IE word that’s problematic — it’s that this word is only attested in IE branches (Greek, Armenian, Albanian) that are well outside the range of where Uralic speakers are known (as far as I’m aware) to have historically lived.

            • j. says:

              Actually, I don’t accept that we know this. As far as I can currently see, the etymological proposals you mention for kesä and vesi/vete- are *also* unconvincing, for similar reasons to the *un- /*on-r. proposal.

              I don’t disagree entirely. E.g. ‘summer’ seems to be the only suggested case of *k ← *h₁, which to me seems even phonetically less likely than the more securely attested group with *k ← *h₂. I’m only saying that this etymology for ‘dream’ is no worse than these cases, and I’d consider it at least the best lead available so far (preferable to projecting this root to PU).

              What dictionary are you using for this?

              Paasonen’s dialect dictionary, which glosses on ‘Traum’, versus udomo ‘Schlaf’.

              It isn’t just the narrow distribution of the IE word that’s problematic — it’s that this word is only attested in IE branches (Greek, Armenian, Albanian) that are well outside the range of where Uralic speakers are known (as far as I’m aware) to have historically lived.

              Obviously it’s not a loan from pre-Greek or pre-Albanian specifically, no, and probably not even pre-Armenian. But if we do end up having to reconstruct it for PIE or even just barely post-PIE, then it’s a possibility that the root was lost from Indo-Iranian or Balto-Slavic (I would be less optimistic about Germanic, which lexically leans more towards Italo-Celtic) late enough that it still was able to be handed over to Uralic before that. Prime candidates for transmission would be the unattested early Eastern Baltic(ish) varieties of the Fatyanovo-Balanovo culture, whom we probably also need to blame for the Baltic loans in the Volga-Kama zone languages: some dozens in Mordvinic, a handful or two in Mari, sporadic cases in Permic and even Ob-Ugric (the latter most likely thru early Permic).

              • M. says:

                Paasonen’s dialect dictionary, which glosses on ‘Traum’, versus udomo ‘Schlaf’.

                Thanks for the link. I notice that about half the citations in the entry for on seem to be variations on the phrase on ńäjəms, which (if my minimal grasp of Mordva serves me here) are etymologically and semantically equivalent to Finn. nähdä unta “to have a dream”, literally “to see dream/sleep”. I wonder if the “dream” meaning of on/uni preceded this phrase, or whether it developed by extrapolation from this and similar phrases?

                Obviously it’s not a loan from pre-Greek or pre-Albanian specifically, no, and probably not even pre-Armenian.

                I wasn’t thinking that either, but, if there’s no strong evidence that *oner- goes back to IE (its distribution is consistent with that of a regionalism), and if the similarities between *oner- and the Finno-Mordvinic words are not especially strong to begin with, then I don’t see a reason to rank an IE etymological proposal higher (in terms of plausibility) than a Uralic one.

                A somewhat-equivalent situation would be if we had two moderately similar-looking words in Slovakian and the Sami languages (but nowhere else in attested Finno-Ugric), and someone advocated an etymological connection between them on the logic that Sami is Finno-Ugric, and Slovakian speakers are known to have had extensive contact with at least one Finno-Ugric branch.

                • M. says:

                  “are etymologically equivalent” –> “is etymologically equivalent”

                  (Also, to correct what I wrote above about Russian мечта, this word seems to mean “dream” in the sense of “desire” — somewhat like Finnish haave — and not “dream” in the sense of “something you see during sleep”.)

                • j. says:

                  A somewhat-equivalent situation would be if we had two moderately similar-looking words in Slovakian and the Sami languages (but nowhere else in attested Finno-Ugric), and someone advocated an etymological connection between them on the logic that Sami is Finno-Ugric, and Slovakian speakers are known to have had extensive contact with at least one Finno-Ugric branch.

                  That’s exaggerating the distance a bit, but you could substitute let’s say Székely for Sami here to get a somewhat equivalent scenario (Proto-Slavic ∶ Old Hungarian ∷ PIE ∶ PU). Or, say, Icelandic and Kola Sami? I would suspect that there is probably at least one Old Norse word that survives in common use only in these two.
