Currently I’m looking a bit into older research on Mansi. Coverage of the language has not been optimal in the past, mainly because most of the existing field research materials were rather slow to be released. The main sources appeared on a delay of no less than 100+ years: Bernát Munkácsi’s 1880s records came out in dictionary form in 1986, Artturi Kannisto’s 1900s records in 2014, and of Antal Reguly’s 1840s records I’ve not seen any decent edition at all. I think this has left etymological research in particular in limbo. Mansi specialists with direct access to one or more of these field research corpora (e.g. Steinitz, Liimola, Kálmán, Honti, and of course Munkácsi and Kannisto themselves) have long been able to dig out comparisons and publish their findings, but we more general Uralicists not so much.

Many of these Mansi specialists have also been working with Khanty, whose primary comparative lexical source, K. F. Karjalainen’s dialect dictionary, likewise built on 1900s field research, came out already in 1948, making the language more accessible for investigation. This has, I believe, led to a kind of “overlooked middle sibling” status for Mansi, creating a more Khanty-colored picture of the language’s history than is warranted. Comparisons between the two languages are much more readily apparent than more distant cognates. Yet it can also be suspected that many of these are not common Ob-Ugric inheritance, but rather newer loans (Ms → Kh, Kh → Ms, or from some common third source). We also know of a cautionary example from the western end of the Uralic family: untangling Finnic loans from true cognates, with the help of more distant relatives, has been integral to working out the history of Sami. This line of work has by now revealed that just about all special commonalities between Finnic and Samic are either archaisms, loans, or areal, and that from a proper cladistic point of view, a Finno-Samic subgroup is really no better supported than some different hypotheses such as Finno-Mordvinic would be.

For Mansi and Khanty, this work has so far not been done … but I strongly suspect the results would have a similar lean. Extensive areal sharing of some secondary isoglosses is already well-documented along the Mansi–Khanty contact zone. There are also a number of known Mansi–Hungarian and even Khanty–Hungarian isoglosses, as well as several “Proto-Ob-Ugric innovations” that appear essentially out of the blue.

These considerations suggest some steps for going forward. One that could be done without too much trouble with just the existing materials would be to “re-root” the historical phonology of Mansi in Proto-Uralic. E.g. as has been established at least since Sammallahti (1988) (more debatably already since Steinitz 1944), the regular reflex of Proto-Uralic *ä in Mansi is *ää — a development that surely represents simple qualitative retention, and not a detour through a Proto-Ob-Ugric *ee (as per Honti) or *eä (as per Sammallahti). Corresponding mid *ee in Khanty is most likely an independent innovation (likely even post-Proto-Khanty, as per the reanalysis due to Tálos of Surgut Khanty /ä̆/ as more original than other varieties’ /e/).

But etymology will require work too. A Mansi analogue of Steinitz’ comparative-etymological dictionary of Khanty would be quite desirable, now that the main sources are finally out and available for easy consultation. This would doubtless take a good while longer to assemble though. Also, from the comparative Uralicist’s view, this would involve a lot of work being spent on clearly secondary material: compounds, derivatives, relatively recent Russian and Tatar loans, etc.

I have at this point a shortcut of sorts in mind. The Munkácsi and Kannisto materials have been the main sources for comparative research on Mansi for the last 140 years, and we might assume they have already been reasonably well mined for comparative purposes. They’re far from the only materials on Mansi though. Older collections could still be expected to have some archaisms in them that have been lost in later times. We again know from precedent that this line of research is likely to bear some fruit. On historical phonology, the 1970s–80s “Hungarian school” (L. Honti, K. Rédei, E. Sal) revamp of Proto-Mansi reconstruction has been based on 18th-century records that show some retained word-final vowels, pointing to stem-type contrasts CVCə | CVC and CVCCə | CVCəC (from the 19th century on, collapsed to just CVC and CVCəC). This then can be leveraged for some reanalysis. On etymology, there is so far at least a small 1991 article by Katz: “Altsüdwogulisches” (FUF 50), [1] which identifies from 18th-century records previously unknown Mansi reflexes for PU *kota ‘hut, house’ and Indo-Iranian → Ugric *täjɜ ‘milk’.

The 18th-century materials are, alas, still not well-documented in print. The Hungarians mainly refer to a manuscript Altwogulische Dialekte by J. Gulya, which I believe ended up never being published (though some of the data is briefly covered in his articles in NyK 60 and 62). So I’m pinning my hopes on the 19th century instead. There is also at least one smaller primary source that was released in a relatively timely manner: A. Ahlqvist’s materials, collected starting from the late 1850s, a wordlist of which was released in 1891 as the second SUST volume Wogulisches Wörterverzeichnis (and by now available digitally, including in a scan that is IMO of better quality than the National Library of Finland version). The usability of this data is limited somewhat by various dialect forms being given without specifics (perhaps Ahlqvist’s original records would have this info?), but with modern Mansi dialectology in hand, the big picture is clear enough. I am not aware of any later reappraisal of this material, and it seems likely that a close look could turn up some new etymological insights.

As a promising initial result, from the A section I have already run into an entry aidentantqtam ‘to vomit’. As Ahlqvist seems to render unstressed schwas varyingly as a, e, i, , [2] as well as coda /ɣ/ often as a vowel i or , we can thus see this as a reflex of PU *oksənta- ‘to vomit’ > PMs *aaɣtəntə- (showing several regular developments: *o-ə > *aa, *s > *t, *kC > *ɣC).
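
The derivation above can be replayed mechanically. The following is only a minimal sketch: the first three rules are the ones named in the text, while the consonant inventory, the restriction of the first rule to word-initial *o, the rule ordering, and the last rule (final *a > *ə, which I add to account for the final vowel of the PMs form) are simplifying assumptions of mine. The intermediate stages are not claimed to be real chronological stages.

```python
import re

# Consonants used only to state rule contexts in this sketch (assumption).
C = "kɣtsnŋmplrjwðśńšč"

# (label, regex pattern, replacement), applied top to bottom.
# Rules 1-3 are the developments named in the text; rule 4 is assumed.
RULES = [
    # Simplified: only handles word-initial *o before a second-syllable *ə.
    ("*o-ə > *aa", rf"^o(?=[{C}]+ə)", "aa"),
    ("*s > *t", r"s", "t"),
    ("*kC > *ɣC", rf"k(?=[{C}])", "ɣ"),
    ("final *a > *ə (assumed)", r"a$", "ə"),
]

def derive(form):
    """Return the input form followed by the output of each rule in turn."""
    stages = [form]
    for _label, pattern, replacement in RULES:
        form = re.sub(pattern, replacement, form)
        stages.append(form)
    return stages

print(" > ".join(derive("oksənta")))
# oksənta > aaksənta > aaktənta > aaɣtənta > aaɣtəntə
```

Keeping the rules as an ordered list makes it easy to test alternative orderings or added rules against other etyma.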

In overall phonology it is also interesting to note how, while most of Ahlqvist’s data seems to be Western Mansi, he also has numerous forms showing the Northern Mansi development *ä > /a/ (e.g. mań ~ mäń ‘daughter-in-law’, ńäl ~ ńal ‘handle’; notice also the inconsistent lemmatization), sometimes quite tellingly combined with the likewise typically Northern *š > /s/ (sam ~ šäm ~ šem ‘eye’). Yet his examples of the combination *kʷä- uniformly show only küä-. In newer Northern Mansi this has undergone a shift to /o/, starting from Munkácsi’s materials, but no sign of this appears in Ahlqvist’s materials. Perhaps this is then indeed independent from the usual NMs shifts *ä > /a/ and *a > /o/ (it could be otherwise routed through either), and has instead proceeded as something like /kʷä/ > *[kʷɞ] > /kʷo/ > /ko/?
Edit 2019-01-11: nope: one doublet, jelpi̮l-küäl ~ jalpi̮l-kol ‘church’ (lit. ‘holy house’), already seems to show the native NMs reflex. There is also plain kol, though given separately, not coordinated into the same entry with the WMs form küäl.

[1] Why specifically “süd” is unclear to me, given that some of his forms are clearly Northern Mansi.
[2] Theoretically some of this variation could represent real vowel contrasts, neutralized in later times, but that will require a more systematic look at the data, maybe with dialect division included.

  1. Howl says:

    “Corresponding mid *ee in Khanty is most likely an independent innovation (likely even post-Proto-Khanty, as per the reanalysis due to Tálos of Surgut Khanty /ä̆/ as more original than other varieties’ /e/).”

    I do want to note that the regular reflex of PU *i is also *ee in Khanty, and ä in Mansi. There is also a Proto-Khanty *öö that is always said to be in complementary distribution with *ee. But if there are any sound-laws behind this distribution, they are not obvious to me.

    Some Mansi dialects have öä as a reflex of PMs *ää. Perhaps PKh *öö also came from this *ää and was originally in alternation with PKh *ee. There is some evidence of such an alternation in Eastern Khanty (see Steinitz, Geschichte des ostjakischen Vokalismus, p. 112).

    • j. says:

      The basic rule is *öö next to velars (*k *ɣ *ŋ), *ee otherwise, though there is still a small handful of exceptions, mainly *A-stem nouns (*keerää ~ *kerää ‘bundle’, *keeLää ~ *kelää ‘dew’).

      • Ante Aikio says:

        I don’t think these “A-stem” nouns are an exception at all. I think the Tálos/Helimski/Zhivlov reconstructions are correct in reconstructing a quite different Proto-Khanty vowel for these cases: Proto-Khanty *e (which is reflected as a short vowel everywhere except for Far East Khanty) vs. Proto-Khanty *ä (which was lengthened everywhere except for Surgut Khanty). From this starting point it then turns out that the Far East Khanty rounding of *ee to *öö next to velars seems to have affected only *ee deriving from Proto-Khanty *ä (but not *ee from Proto-Khanty *e).

        This seems to leave only one remarkable exception: V Vj teɣǝn (not **töɣǝn!) ~ Sur tȧ̆ɣwǝn, Irt tewen, tewin, Ni Kaz O tewǝn ‘windless, calm’. But here there might be a different explanation. Helimski proposes that after back vowels one can establish an opposition between Proto-Khanty *ɣ and *w (the latter yields /w/ but the former /x/ in South and North Khanty). This conclusion appears to be supported by external etymologies, as Helimski’s *ɣ and *w in back-vocalic stems have different Uralic sources.

        But there is no direct trace of an opposition *ɣ : *w in front-vocalic stems. But maybe the lack of vowel rounding in V Vj teɣǝn is an indirect trace, and the word did not have *ɣ but *w in Proto-Khanty – which would be in line with the Uralic proto-form *tiwini (cf. Finnish tyven, tyyni). This would contrast with a case like V Vj wöɣ ‘strength’ (from PU *wäki with an original velar). Note by the way that V Vj köɣ ‘stone’ (from PU *kiwi) is ambiguous in regard to this solution, as its vowel would in any case have been rounded because of its initial velar.

  2. Howl says:

    Duh! Thanks, now I get it. There are not so many exceptions to this rule, and the alternations between *ee and *öö are all next to velars and mostly between dialects or derivatives. So I can agree that PKh *öö is a secondary development from PKh *ee.

  3. Ante Aikio says:

    As regards Ahlqvist’s aidentantqtam ‘vomit’, its root *ī̮jt- is also well attested in other later sources, and it corresponds to Khanty *āɣǝt(-) ‘vomit’ (see no. 15 in the word-list in Honti’s “Geschichte des obugrischen Vokalismus…”). This creates a problem for the comparison to PU *oksi-nta- ‘vomit’, as Khanty *t of course cannot reflect Proto-Uralic *s.

    • j. says:

      Good reminder, I had missed this somehow when checking my index of Honti.

      Loaning from Mansi to Khanty would be theoretically possible for explaining the *-t-, but there might be another way of tying these together too. Namely quite a few of the reflexes (F Mo Ma, maybe Komi-Jazva) point to the trisyllabic form *oksənta-, with no evidence of being derived from a shorter stem; including also Ahlqvist’s form, which has both a “formant” -nt- and normal factitive-frequentative + reflexive -nt-q(ə)t-, ruling out even a hypothetical analysis of the first -nt- as a secondary suffix. (It probably still is a frequentative originally, but in this stem seems to have fossilized already in Proto-Uralic. †) So perhaps the later Ob-Ugric forms come from a contraction of this trisyllabic base, something like *ëkɬ[ə]nt- > *ëkɬt- > *ëkt-.

      The newer Mansi vocalism might be more problematic here really, apparently requiring filing this word rather among the poorly understood correspondence type with West *o ~ (Ob-)Ugric *ë (as also in *śodka ~ *śëd₂(kɜ) ‘goldeneye’, or the loanword *wosa ~ *wësɜ ‘ware’.)

      † As an aside, I now wonder if the Finnish river name Vuoksi (: Vuokse-) could be from the same original root, routed through Sami, and with ‘spew’ as the base meaning. The usual derivation from vuo ‘flow’ seems a bit too tame, and the alternative speculation I’ve seen, on connecting this to either *wiksə ‘connecting river’ or *uktɜ ‘isthmus’, looks phonologically and semantically too far off.

      • Ante Aikio says:

        By the way, also Saami points to the same derivative: cf. North Saami vuovssadit (Proto-Saami *vuokse̮nte̮-). The bisyllabic form (North Saami vuoksit, etc.) must be secondary, because its stem vowel (Proto-Saami *vuoksē-) shows that it cannot possibly be a direct reflex of PU *oksi-. It probably arose as a back formation of *vuokse̮nte̮-, where *-e̮nte̮- was identified with the productive frequentative verb suffix. But still, I don’t see any particularly plausible way to connect the verbs with Ob-Ugric *ïktV- ‘vomit’ even though the similarity is very intriguing – I myself remember noting it a few years ago already, but could not find a good explanation for the phonological discrepancy.

        As regards place-name typology, a “tame” explanation is the most likely one to be correct, other things being equal. There is another major river called Väylä (nowadays better known by the name Tornionjoki), from the appellative väylä ‘major watercourse’ (a Saami loanword), so why couldn’t Vuoksi be derived from vuo ‘stream, current’? In contrast, I am not aware of a single river name in any language based on a word meaning ‘vomit’.

        • j. says:

          I know about the Sami reflex too, but deriving *vōksē- as a back-formation from *vōksëntë- appears to still leave the former being an *ē-stem unexplained. *-(ë)ntë- attaches to *ë-stem verbs as well (bodnjat → bonjadit, čáhkat → čágadit, etc.), so surely back-formation would also predict just **vōksë-? One possible explanation would be that the base was indeed *oksa-, preserved only in Sami, and all other branches are built on derived *oks-əntə- (though then we’ll still need to assume something along the lines of additional levelling to explain why the former doesn’t give **oaksē-). I.e. unlike the other cases, here we seem to have some evidence suggesting that *oksəntə- is still segmentable.

          (I have actually a slightly different analysis still in mind for *vōksē-, a bit lengthy though to go into here; probably to be covered in a future article.)

          By “tame” I do not mean that the vuo etymology is simpler; I mean that it is too superficial to assign an etymology as a “normal, flowing river” to one that is anything but. The Vuoksi’s most distinctive feature is easily the Imatra Falls, which also rather impede long-distance navigability. Also, to reiterate, I have not suggested ‘vomit’ > ‘river’, but rather ‘spew’ (> ‘water-spewing’ or the like) > ‘river with falls’ on one hand, ‘spew’ > ‘vomit’ on the other. The point is that if *oksəntə- is indeed originally a derivative, then the bare root does not need to have also meant ‘to vomit’. (Also cf. e.g. English throw up, Swedish kasta upp, French rejeter…)

          Note moreover that I’m also claiming that the proper noun Vuoksi is primary compared to the fairly rare common noun vuoksi ‘large river’ (as also seen in Kymi ~ kymi). On reflection though, the sense ‘tidal flow’ would still seem to be better considered a derivative *voo-ksi. Rivers on the Baltic do not have tidal waves, so there seems to be a missing semantic link between the senses ‘river’ and ‘tidal flow’, while ‘flow’ → ‘tidal flow’ is trivial.

          While we’re on this, do you have an opinion on the comparison (of at least Ob-Ugric) with Hu. utál ‘to hate’? This seems way off-base of ‘to vomit’, but I’m not aware of any better etymology either. Finnic *uhka ‘threat’ comes close, but not really close enough to work (*uktə- > *uhtë- → *uht-ka > *uhka?? but uho ‘bolster’ suggests that the root for this is just *uhV).

