Primary vs. secondary *ë

I claimed in my post “Two Lemmata” that the reconstruction of Proto-Uralic *ë rests on quite firm ground by now. Regardless, it is still not too rare to see studies which fail to recognize the idea. [1] Apparently the existence of this proto-vowel cannot yet be considered to have reached the status of general consensus. Why is this?

Assuming that the relevant literature has simply gone unread might be a bit too uncharitable. I believe a better reason for why doubts persist would be that no single unified source discussing the reconstruction of this vowel is available; the information needs to be pieced together from disparate sources. I hope to have previously provided a brief overview, though, and in this post I will explore some additional complications.

Probably one obstacle has been that the evidence for *ë is not trivial. For all other PU vowels, the evidence of Finnic, which has been presumed highly archaic, can generally be taken as direct: PF *a < PU *a, PF *o < PU *o, PF *ü < PU *ü, etc. (with only minor conditional shunts). The PF vowels also generally remain intact in the descendants. And only in Finnic does the contrast *a/*ë seem to be irrecoverably lost. Hence, one necessary precondition for accepting PU *ë is to accept that the Finnic vowel system does too contain innovations, even major ones.

(You’d think there should be no need to explicitly spell out something this basic, but alas, long-outdated ideas about “key languages” have proved persistent in Uralic studies. Better safe than sorry…)

The direct evidence in East Uralic

The best evidence for the reconstruction of *ë comes instead from the quite distinct reflexes in the easternmost branches: Mansi (*ë > *ëë), Khanty (*ë > *ïï) and Samoyedic (*ë > *ë, *ï). Hungarian *ï, though it has in the modern language merged with the front vowels i ~ í, is also quite distinct in its refusal to adhere to vowel harmony. However, in general the vowel systems of these groups have been subject to much innovation, and it takes care to wring out evidence from here.

The single most important observation, I believe, is to look beyond individual details and to note that among all these four branches (i.e. across the East Uralic group in its entirety) the general categories of non-open unrounded back vowels appear cognate to each other. Thus we can find correspondence sets such as the following:

  • H ín (: ina-) ~ Ms *tëën ~ Kh *ɬaan ~ Smy *čën ‘vein, sinew’
  • H nyíl (: nyila-) ~ Ms *ńëël ~ Kh *ńaaL ~ Smy *ńëj ‘arrow’
  • H nyír (: nyira-) ~ Ms *ńëërəɣ ~ Kh *ńaarəɣ ~ Smy *ńër ‘cartilage’
  • Ms *ëët ~ Kh *aapət/ɔɔpət ~ Smy *ëptə ‘hair’
  • H al- ~ Ms *jal- ~ Kh *ïïL ~ Smy *ïlə ‘under’
  • H máj ~ Ms *mëëjt ~ Kh *muukəL ~ Smy *mïtə ‘liver’
  • Ms *tëët ~ Kh *ɬïïkəL ~ Smy *tïtə ‘Swiss pine (Pinus cembra)’
  • Kh *ïïkət- ~ Smy *ïtå- ‘to put up (e.g. a net)’

The alignment is not perfect, but it’s far better than we’d expect to happen randomly. It’d take some odd coincidences to end up with this situation from an original system containing no “ë-type” vowels. [2] I suppose there is the theoretical possibility of proposing *ë to have been an East Uralic innovation, or proposing a set of similar but not identical parallel innovations in the four groups, but I have not seen this done convincingly. [3]

The individual details of course still need examination as well. A 1st-degree correction factor is to note the mainly stem-vowel conditioned split developments in Hungarian (*ë-ə > i ~ í vs. *ë-a > a ~ á) and Khanty (*ë-ə > *aa vs. *ë-a > *ïï). There is very little direct evidence for the original stem vowels in any of the Ugric languages, and the Samoyedic evidence has its limitations as well, but their western relatives help here: cf. e.g. Finnish suoni, nuoli, hapsi vs. ala-, maksa, ahtaa. You may also notice that the H and Kh splits run in largely opposite directions, and indeed I do not think any examples are known where H í or i would correspond to Kh *ïï. There are, however, also some apparent exceptions with *ë-a > *aa in Khanty, so the exact analysis of this split may require further fine-tuning.
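As a rough sketch, the two splits can be written down as a lookup keyed by branch and original stem vowel. This is only a toy illustration in Python: the names SPLIT and reflex_of_e are made up for the example, and the apparent exceptions are deliberately ignored.

```python
# Hypothetical table of the stem-vowel-conditioned reflexes of primary *ë.
# Keys: branch name, then the original 2nd-syllable vowel (ə or a).
SPLIT = {
    "Hungarian": {"ə": "i ~ í", "a": "a ~ á"},
    "Khanty": {"ə": "*aa", "a": "*ïï"},
}


def reflex_of_e(branch: str, stem_vowel: str) -> str:
    """Return the expected reflex of *ë for a given branch and stem vowel."""
    return SPLIT[branch][stem_vowel]


# 'vein, sinew' is an ə-stem (cf. Finnish suoni); 'under' is an a-stem (cf. ala-).
print(reflex_of_e("Hungarian", "ə"), reflex_of_e("Khanty", "ə"))
print(reflex_of_e("Hungarian", "a"), reflex_of_e("Khanty", "a"))
```

Laid out this way, the opposite direction of the two splits is easy to see: the stem type that gives the close vowel in Hungarian gives the open one in Khanty, and vice versa.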

Secondary *ë in Hungarian and Mansi

As 2nd-degree corrections, it also seems to be the case that East Uralic *ë-type vowels can regardless in some cases represent conditional developments from different PU vowels altogether.

One prominent source of secondary *ë is cheshirization in Mansi. In what seems likely to be a late change, expected Proto-Mansi *oo followed by a velar consonant develops to *ëë followed by a labialized velar. Typical examples include *čaŋa- > *čooŋk- > *čëëŋkʷ- ‘to hit’; *ńoxə-lə- > *ńooɣl- > *ńëëwl- ‘to follow’. (Contrast Samoyedic *čåŋå-, *ńo-.) This is a fairly self-evident change on account of being one of the only regular sources of labiovelars in Mansi (together with similar effects triggered by other labial vowels). It has previously even inspired claims that perhaps all cases of *ëë in Mansi are similarly secondary, as in Erkki Itkonen’s mid-1900s model of Finno-Ugric vocalism. [4] However, the other cases resist explanation by similarly simple conditioning. “Redistributionary” splits, which do not lead to the creation of any new phonemes or even allophones, do happen! Being able to condition the appearance of a sound in one environment is not sufficient evidence for concluding that its appearance in other positions would therefore have to be conditioned by something as well.

And indeed, we can even find contrasts (near-minimal pairs) between primary and secondary *ëë in Mansi. Consider e.g. *këŋkə- ‘to climb’ > Ms *këëŋk-; but *aŋa- ‘to open’ > Ms *ëëŋkʷ- ‘to undress’. Since the shift *oo > *ëë / _K has normally left a trace in the form of the labialization of the following velar consonant, roots like the first could only be accommodated into a system where all *ëë is secondary by abandoning regularity and switching to a much weaker model running on “sporadic” sound changes.
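The conditioning of *oo > *ëë / _K can be sketched as a small rewrite rule. The following Python toy is an assumption-laden illustration, not an implementation of any published formalism: the function name cheshirize and the LABIALIZED table are hypothetical, and the k > kʷ entry is extrapolated from the ŋk > ŋkʷ and ɣ > w cases seen in the examples.

```python
import re

# Assumed labialized counterparts of the velars. Only ŋk > ŋkʷ and ɣ > w are
# directly illustrated by the examples in the text; k > kʷ is an extrapolation.
LABIALIZED = {"ŋk": "ŋkʷ", "k": "kʷ", "ɣ": "w"}

# *oo followed by a velar (or the cluster ŋk).
OO_BEFORE_VELAR = re.compile(r"oo(ŋk|k|ɣ)")


def cheshirize(form: str) -> str:
    """Apply *oo > *ëë before a velar, moving the rounding onto the velar."""
    return OO_BEFORE_VELAR.sub(lambda m: "ëë" + LABIALIZED[m.group(1)], form)


for pre_form in ["čooŋk-", "ńooɣl-", "këëŋk-", "wooš"]:
    print(pre_form, ">", cheshirize(pre_form))
```

Run on *čooŋk- and *ńooɣl- this gives *čëëŋkʷ- and *ńëëwl-, while primary *këëŋk- passes through unchanged, since the rule only ever looks for *oo. A plain velar after *ëë is therefore exactly what the rule cannot produce.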

Another sound law responsible for secondary *ë-type vowels also seems to be identifiable. This is a type of “illabiality assimilation”: *o > *ë / _jC.

This development has long been recognized for Hungarian. E.g.:

  • *kojə-ma > *kojmV > *këjmV > hím ‘male’ (cf. Skolt Sami kuõjj ‘husband’ < PS *kōjë)
  • *pojə-ka > *pojɣV > *pëjɣV(-w) > fiú ‘boy’ (cf. Finnish poika)
  • *kojɜ-ta- > *kojðV- > *këjðV- > hízik (hízo-) ‘to become fat’ (cf. Mordvinic *kuja ‘fat’)
  • *tojə-ntV > *tojdV > *tëjdV(-w) > tidó ‘birch bark’ (cf. unsuffixed Udmurt /tuj/, Komi /toj/) [5]

The first two cases are well-known and relatively clear. I am not sure if the latter two have been previously noted, but they seem to work equally well. A fifth case might additionally be *kojə-ra > *kojrV > *këjrV > here ‘drone; testicle’ (cf. Finnish koiras ‘male’) — though it is unclear why we get here a mid vowel e, instead of the expected i ~ í. [6] It’s also interesting how hím (hime-) and here follow vowel harmony; yet the shift *k- > h- still indicates them descending from back-vocalic originals.

It is also fairly clear that the change only occurred in closed syllables: this is shown by e.g. *kojɜ > háj ‘fat’, *pojə > faj ‘species’ (though the semantic development here seems questionable), *śojə > zaj ‘noise’.
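The closed-syllable condition is easy to state as a rule over the syncopated intermediate forms. Below is a minimal Python sketch under my own assumptions: the function name illabialize, the vowel inventory in VOWELS, and treating the cover symbol V as a vowel are all hypothetical choices made for illustration.

```python
import re

# Assumed vowel symbols, including the cover symbols ɜ and V used in the
# reconstructions; anything else that counts as a letter is taken as a consonant.
VOWELS = "aäeëiïoöuüəɜV"

# *o > *ë / _jC: an o followed by j plus a consonant, i.e. with j in the coda.
O_BEFORE_JC = re.compile(rf"o(?=j(?![{VOWELS}])\w)")


def illabialize(form: str) -> str:
    """Unround *o to *ë before a cluster beginning with *j."""
    return O_BEFORE_JC.sub("ë", form)


# Closed syllables (after syncope) undergo the change; open syllables do not.
for pre_form in ["kojmV", "pojɣV", "kojðV-", "tojdV", "kojɜ", "pojə", "śojə"]:
    print(pre_form, ">", illabialize(pre_form))
```

The first four inputs come out with *ëj, while *kojɜ, *pojə and *śojə are left alone because their *j is followed by a vowel. Note that the rule has to apply to the forms after the loss of the 2nd-syllable vowel, not to the PU originals.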

Interestingly there seems to be evidence of this change having extended to Mansi as well. At least three promising and two potential examples can be found:

  • *kojə-ra > *kojrV > *këjrV > *këër ‘male animal’ (cf. Fi. koiras)
  • *kojwV-lV > *kojlV > *këjlV > *këëĺ ‘birch’ (cf. Fi. koivu)
  • *soja-tV > *sojtV > *sëjtV > *tëëjt ‘sleeve’ (cf. Skolt Sami suäjj < PS *soajē; unsuffixed *soja > ujj in Hungarian)
  • ? *poskə > *poɣɬV > *pojɬV > *pëjɬV > *pëëjt ‘cheek’ (cf. Fi. poski)
  • ? *ńojta > *ńëjtV > *ńëëjt > *ńääjt ‘shaman’ (cf. Fi. noita)

The 4th has a kind of a chicken-and-egg problem: after primary *ë there is some evidence for a shift *ɣ > *j (e.g. *mëksa > *mëëjt ‘liver’; *wëlka- > *wëëɣl- ~ *wajt- ‘to rise’) [7], but we obviously cannot use both *ëë to condition the *j and *j to condition the *ëë. A possible ad hoc solution would be to reconstruct something like #pojsəkə, but let’s not.

The 5th requires a shift from *ëë to *ää, seemingly due to the influence of two flanking palatal/ized consonants. It is not clear though if this should be dated to the Proto-Mansi level, or perhaps later. Northern Mansi /ńaajt/ and Southern Mansi /näjt/ could actually regularly reflect PMs *ńëëjt as well: the former thru the regular lowering *ëë > *aa, the latter thru the regular fronting *ëë > *ee adjacent to palatalized consonants + vowel shortening to /ä/. For these changes a perfect parallel is PMs *ńëëraa > *ńeerää > SMs /ńärää/ ‘legwear’; [8] a word not of Uralic inheritance, but here the regular back vowel is still found in Eastern Mansi /ńëërə/, Northern Mansi /ńaara/. It is only the Eastern and Western reflexes of ‘shaman’ that point to older *ää specifically.

It’s moreover possible that the 2nd case actually indicates instead a fairly similar change: *o > *ë / _ĺ. In this light two further interesting words are PU *śod₁ka > *soĺɣV > ? *sëĺɣV > Ms *sëëĺ ‘goldeneye’ (cf. Finnish sotka); and Ms *këëĺt- ‘to peel (e.g. hemp)’, which has been compared to Mari *kŭðaša-, Komi /kuĺ-/, Udmurt /kɨĺ-/ ‘to undress’, and behind which a PU root *kod₂V- could be reconstructed. [9]

There is no clear evidence on how *-od₂- is reflected in Hungarian — this has not been a frequent sound sequence. However, one old lexical comparison (that the UEW rejects) might be rehabilitable if we assumed that also this change occurred in Hungarian: *śod₂a ‘war, fight’ (cf. Finnish sota ‘war’; Mari *šuðala- ‘to scold’) > *śod₂a-nta- > *soĺdV- > *sëĺdV- > szid ‘to scold’? A cluster simplification *ĺd (? > *ɟd) > *d would also have to be assumed though.

However, even though these changes are highly similar, there is a strange complication that seems to preclude an analysis as a common Hungarian-Mansi innovation. In most words where Hungarian points to this kind of a secondary *ë, the Mansi development differs — we see a loss of *-j- instead:

  • *kojə-ma >> *kum ‘man’
  • *kojə-ta- >> *kaat- ‘to become fat’
  • *pojə-ka >> *piw ~ NMs /piɣ/ ‘boy’
  • *tojə-ntV >> NMs /toont/ ‘birch bark’

At least the 2nd and 3rd of these are clearly irregular: *-jt- is a perfectly valid consonant cluster in Mansi (cf. ‘sleeve’ and ‘shaman’ above), and there are no parallels for a vowel development from *o (or for that matter, any other back vowel) to Ms *i. The 1st brings to mind the developments *kojə > *kuj ‘male’, *śojə > *suj ‘sound’. Was ‘man’ perhaps derived in Mansi from a vowel-stem variant *kojəma > *kujəmV > *kujm?

Perhaps it is relevant that the irregular loss of *-j- in these words extends also to Khanty: *kaatLə- ‘to become fat’, *pak ‘son’, *tontəɣ ‘birch bark’. A fourth example of this is also known, the word for ‘louse’: Ms *tääkəm, Kh *teeɣtəm (also Hungarian tetű); contrasting with Finnic *täi, Udmurt /tej/, Komi /toj/. [10] We could perhaps suppose a loss of *j before a consonant cluster to explain the last two… Though *-ktV is not really a typical Uralic noun formant, and so I also wonder if the Ugric words for ‘louse’ are not perhaps instead somehow related to the quite similar root *tikte found in Tungusic.

In Mansi, further examples of apparent secondary *ë can still be found as well. The residue includes e.g. Fi. os-ta- ‘to buy’ ~ Ms *wëëtaa ‘ware’; Fi. otta- ‘to take’ ~ Ms *wëët- ‘to pluck’. [11] Itkonen in his critique has claimed that *ëë would be even the most frequent correspondence of West Uralic *o, and this seems to still hold up pretty well even once we remove the words showing Finnic *oo (< *a/*ë via Lehtinen’s Law) from the count. It might still be possible that there has indeed been a default development *o > *ë in Mansi, only one bled by several conditional developments. — Regardless: this type of secondary *ë must still be distinguished from primary *ë, which is instead normally reflected as *a in West Uralic, and is further supported by the Samoyedic evidence.

[1] For just one example, no mention of this result appears in what I believe is the newest overview of Hungarian historical phonology available: the fifty-odd page appendix in Róna-Tas, András & Berta, Árpád (2011): West Old Turkic: Turkic Loanwords in Hungarian. Wiesbaden: Harrassowitz.
[2] This can be contrasted with the Western end of the family. “Ë-type” vowels are not at all unknown here either. However, these show no relation to each other. E.g. Ter Sami has the vowels /ï/ and /ïë/, from Proto-Samic *ō and *oa < PU *a and *o. Skolt Sami has õ [ɘ] and â [ɜ] plus the long versions, under various conditions from PS *ë < PU *i, *ü, *e-ə. And the various languages of the Southern Finnic areal have õ [ɤ ~ ɨ], mostly from *e, though in some cases from *o.
[3] At least Reshetnikov & Zhivlov (2011; see Bibliography) have attempted an analysis to this effect, but they do not analyze Hungarian or Khanty, and they exclude some material previously reconstructed with original *o that turns out to be quite relevant. A recent follow-up in Zhivlov (2014) has abandoned the idea.
[4] He has presented some detailed critique against the reconstruction of *ë (“Vokaaliston kysymyksiä”, 1988, Virittäjä 92 pp. 325–329), though it seems this never led to much further discussion of the matter, and after Itkonen’s death in 1992 no one else seems to have had much interest in defending his system of vowel reconstruction.
[5] An alternate reconstruction *tejɜ- would also work for the 1st syllable vocalism, but this would predict a vowel-harmony-compliant **tidVw > ˣtüdő in Hungarian.
[6] It would be possible to hypothesize e.g. that inherited *ë had already been split to Old Hungarian *i vs. *a at this date, and that *oj first yielded not *ëj, but rather *ej, which was later assimilated to *i; and that in ‘drone’, *j was then lost early, leaving a mid vowel. The Mansi evidence seems to support an earlier shift specifically via *ë, though.
[7] There are other words as well with a more limited distribution; cf. Honti 1982: 29–30. These words mostly feature an alternation between a base form with *-ëëɣ- and an oblique stem with *-aj-. I would assume that this *j was later generalized to the nominative in the body part terms ‘liver’ and ‘cheek’, which will only rarely occur as subjects.
[8] On a slightly off-topic note, I am not sure if the Southern Mansi long open stem vowels should be taken as original. They don’t seem to contrast with the corresponding short full vowels, and indeed, they correspond to short stem vowels in the other Mansi dialects. They also regularly condition shortening of 1st syllable vowels. I suspect some sort of a prosodic effect here: e.g. ˈV₁-V₂ > V₁-ˈV₂ when V₂ was a full vowel, followed by lengthening of the newly stressed V₂, and if applicable, shortening of the newly unstressed V₁.
[9] The shift *u > /ɨ/ seems to be regular in Udmurt before coda /ĺ/. Other examples include *kad₂a- > PP *koĺ- > *kuĺ- > kɨĺ- ‘to stay’ (cf. Komi koĺ-); *kod₂ka > PP *kuĺ > kɨĺ ‘disease, evil spirit’ (cf. Komi kuĺ); *neljä > PP *ńoĺ > *ńuĺ > ńɨĺ ‘4’ (cf. Komi ńoĺ). Contrast though retention before intervocalic /ĺ/ in muĺɨ ‘berry’, tuĺɨm ‘topmost yearly growth of tree’.
[10] Mari *ti is ambiguous: this could also derive from e.g. *täkV or *tikV. Samic *tikē is though probably an unrelated loan from Germanic (or perhaps from the same pre-Indo-European source as the Germanic words).
[11] These two might suggest a dissimilation *wo- >> *wëë- at first glance, but a counterexample is *woča > Fi. ota-va ‘fish trap’ ~ Ms *wooš ‘weir; fence; city’.

Posted in Reconstruction
4 comments on “Primary vs. secondary *ë”
  1. Your etymology of the Ugric words for ‘birch bark’ and ‘to become fat’ is very interesting. Both words have irregular vowel correspondences within Ugric, so an attempt to (at least partially) explain this irregularity through postulation of protoforms like *tojə-ntV and *kojɜ-ta is welcome. Two problems remain: 1) vowel reflexes in Khanty are different in the two cases; 2) why did *-j- disappear in Ob-Ugric, when it is preserved in Ob-Ugric reflexes of words like *nojta ‘shaman’ and (with another PU vowel) *ajta (or *ëjta) ‘fence’? I have no ready solution for the first problem. As for the second, I think the simplest way is to postulate the regular disappearance in Ob-Ugric of intervocalic *j in stems longer than two syllables. This would also explain Ob-Ugric reflexes of *pojə-ka and *kojə-ra. The preservation of *j in *nojta and *ajta is then explained by the fact that these words were disyllabic from the start. The word for ‘sleeve’ (it has *j in Mansi, but not in Khanty) is irregular anyway.

  2. crculver says:

    Could I get the full reference for Zhivlov (2014), please?

    It’s hard enough to keep up with publications advancing new reconstructions, sometimes one misses the arguably even more important publications taking claims back.

  3. crculver says:

    Thanks, I hadn’t previously noticed those other pages linked from the top.
