This summer I’ve finished digitizing the main bulk of comparative data from László Honti’s Geschichte des obugrischen Vokalismus der ersten Silbe (1982): his 724 Proto-Ob-Ugric reconstructions and their descendants in the individual Mansi and Khanty varieties. Before making this available in any form, though, I’m planning on eventually cross-checking at least a few other key sources. For one, there are Steinitz’ DEWOS, Kannisto’s recently released Vogulisches Wörterbuch, and some other materials for additional dialect coverage; for two, there are UEW and similar sources covering inherited vocabulary that has only been retained in one of Mansi and Khanty; for three, I will also be adding the data Honti includes but considers uncertain (this part is already underway). A potential fourth extension could be the known loanwords of Komi / Turkic / Tungusic / older Russian origin, at least whenever attested in both Mansi and Khanty: they should be able to offer substantial evidence for constraining speculation on historical phonology.
However, even at this stage, the data can be assumed to contain a substantial part of the inherited lexicon of Mansi and Khanty. So I have taken the opportunity to do some preliminary comparative analysis.
One interesting underresearched topic is second-syllable vocalism, and this holds even for the basic groundwork within Mansi or Khanty. This might have importance for Uralic comparison in general, since our current understanding of Proto-Uralic word stem types comes mostly by extrapolation from Finnic and Samic. Although the basic division into *A-stems (~ *a-stems & *ä-stems) and *ə-stems (~ *e-stems or *i-stems) finds some substantial confirmation from Mordvinic and Samoyedic, it fares substantially more poorly with Mari, and within Permic and Ugric, there is not much direct evidence of second-syllable vowel contrasts to work with in the first place. Attempting to reconstruct other second-syllable contrasts from conditional vowel developments in the first syllable is theoretically possible (I believe Zhivlov (2014) is still the most recent example of this), but this often carries a risk of circular logic, and if low on data, may also run into accidental correspondences between unrelated phenomena.
There is regardless some direct evidence of second-syllable vocalism in Ugric. Looking in the rest of this post at Khanty in particular: the Khanty evidence was explored in some respects in the 1960s by Gerhard Ganschow and Gert Sauer, but for the most part the topic has gone without detailed research. Steinitz’ Geschichte des ostjakischen Vokalismus (1950) does not treat the subject and focuses only on the first-syllable system.
A few overview notes on unstressed syllables, without detailed analysis of the data, are given by Sauer in Die Nominalbildung im Ostjakischen (1967) and Honti in Chrestomathia Ostiacica (1984). These outline a division into five stem categories:
- Basic consonant stems (the most common).
- *A-stems, with an open full vowel (*ää, *aa). Decently preserved in inlaut (verb roots, CVCAC and other longer stem types), but in absolute auslaut in the nominative of noun roots, the vowel is widely reduced and possibly lost entirely.
- *I-stems, with a close full vowel (*ii, *ïï). Preserved somewhat more widely, again better in inlaut than in auslaut.
- A third vocalic stem type, yielding *I-stems in Eastern Khanty but *A-stems in Western Khanty.
- *əɣ-stems: these behave as ordinary consonant stems in EKh, but vocalize in WKh to merge with the *I-stems.
This certainly covers most of the bases. A close look at the comparative data, however, suggests that this picture should probably be modified and perhaps also expanded.
The *I-stems, reinterpreted
I would propose as an initial adjustment that the *I-stems are to be reinterpreted as a part of the consonant stems: as *əj-stems. This is indirectly suggested by the absence of stems of the shape *CVCəj from the Proto-Khanty lexicon, even though *CVj is well-attested (e.g. *ɬöj ‘pus’, *pooj ‘ice crust’, *saaj ‘goldeneye’) and examples of *CVjəC occur too (*kaajəm ‘ash’, *waajəɣ ‘animal’). Direct support is provided by at least *ńooɣïï ‘meat’, *sooɣïï ‘clay’, cognate to Mansi *ńaawľ, *suwľ respectively. Instead of parallel suffixation, these can be analyzed as reflecting the typical sound correspondence Mansi *ľ ~ Khanty *j (< PU *ď; the intermediates are not obvious, but that question is irrelevant for now). Dating *-əj > *-I as a proto-Khanty innovation does not seem to be possible either, since in verb stems, Southern Khanty still retains /-əj-/: e.g. ‘to break’, Far Eastern (Vakh-Vasyugan) /aarïï-/ ~ Southern /oorəj-/. (And see below for some related considerations concerning the *əɣ-stems.)
This also accounts for a minor typological paradox. Why is second-syllable *-I better retained than *-A in the Khanty dialects, even though we would expect a close vowel to be more readily subject to reduction? A promising answer would be that the vocalization *-əj > *-I, despite being reflected in all Khanty varieties, is more recent than the partial reduction of *-A in some varieties. Sound changes *ej > /iij/ ~ /ij/, *je > /jii/~ /ji/ are also well-known in Northern Khanty,  and I suspect this is additionally a part of the same wave of vowel coloring, in these varieties further generalized to the first syllable. This would date *-əj > *-I as at minimum more recent than the Southern / Northern split.
I have seen the sound change *-əj > *-I mentioned in various works already (Sauer, Honti, Helimski…), but not anyone willing to bite the bullet and note that this can be taken as the definitive original source of this stem type.
Also, one secondary sound change. It appears that in Obdorsk Khanty, word-final *-I > /-aa/ after /x/: *ńalkïï > /ńalxaa/ ‘Siberian fir’; *ńooɣïï > */ńoxaa/ ‘meat’. This appears related to loss of vowel harmony. In NKh, first-syllable *ïï > *ee before velars instead of > *ii elsewhere, and I suspect something similar is involved here. I would assume first *-kïï > *-xïï > *-xëë, then *-ëë lowers to /-aa/ instead of backness neutralization to **-ee.
This would then seem to show that yes, Western Khanty too (or at least Obdorsk Khanty) has gone through a vowel-harmonic stage with *-ii ~ *-ïï, instead of directly vocalizing *-əj to front *-ii everywhere.
Reconstructing *əj-stems also sheds light on the fourth stem category in the outline above. I would side with Honti in reconstructing these as *Aj-stems. Southern Khanty provides clear evidence in favor: e.g. /xašŋääj/ ‘ant’ ~ Far Eastern /koočŋïï/. Sauer considers /j/ in SKh to be instead epenthetic, generalized from inflected forms where a vowel-initial suffix followed, but we can again appeal to comparative evidence from Mansi, where we find e.g. Northern /xooswoj/. I would add that with correct relative chronology, the development *-Aj > *-I in EKh drops right out of the other attested sound laws, with no need to posit any additional changes particular to this stem category: start with the reduction *A > *ə, follow up with *-əj > *-I.
The origin of the *Aj-stems also appears to be clarifiable. Words such as ‘ant’ point in the direction that they often originate in compounds. I believe that in many cases, their second member is likely the root seen in Ms *wuuj ‘animal’, though found independently in Khanty only in the suffixed form *waajəɣ. Some other *AAj-stems in Khanty that seem to have this origin include: *jeetərɣääj ‘black grouse’; *kaaməɭkaaj ‘water beetle’ (maybe with a first component akin to *koomɭəŋ ‘bubble’); #karŋaaj ‘woodpecker’ (and thus, contra Honti, not segment-for-segment identifiable with Hungarian harkály); #wuurŋaaj ‘crow’.
Many of these words also show irregular vacillation between medial *-ŋ- and *-ɣ-. My hypothesis is that this might be a trace of the PU genitive suffix *-n, and e.g. what I write as approximate #karŋaaj (Obdorsk metathesized /xaŋraa/; Konda spirantized /xaxrääj/; Surgut /kajaarŋïï/ and Far Eastern /kajərkïï/, maybe by metathesis and dissimilation: < *kaɣərKəj < *karɣəKaaj?) should be thus reconstructed as something like #karkə-n_waaj > #karɣəŋɣaaj, reflecting an original genitive attribute construction: ‘animal of the beak’, or something to that effect.
Compound origin would also explain the complete absence of *Aj-stems among verbs.
It’s also possible I am late to the scene here. I’ve seen references to a 2003 paper by Anna Widmer “Zur Geschichte des obugrischen Tiersuffixes”,  and it sounds like this covers this same topic, but I do not (currently) have access to it.
Among the *əɣ-stems, an interesting complementary distribution appears that I have not seen remarked on before. Many sources note that the reflexation in Northern Khanty in nouns is somewhat inconsistent: in some cases we find Kazym /-i/, Obdorsk /-ii/, the same as in *I-stems; but, in others, we find Kazym and Obdorsk zero. (Southern Khanty and the “transitional” Nizyam dialect have consistently /-ə/ in both cases.) Verbs also only show the development to *-I-.
This split distribution seems to be conditioned by the preceding consonant: *-əɣ > *-I appears after obstruents, *-əɣ > ∅ after sonorants. Some examples of the former:
- ‘owl’: Vakh /jewəɣ/ ~ Kazym /jipi/
- ‘Khanty’: Vakh /kantəɣ/ ~ Kazym /xanti/
- ‘birch bark’: Vakh /tontəɣ/ ~ Kazym /tonti/
- ‘barbel’: Vakh /mööɣtəɣ/ ~ Kazym /meewti/
- ‘duck’: Vakh /wääsəɣ/ ~ Kazym /waasi/
- ‘knife’: Vakh /kööčəɣ/ ~ Kazym /keeši/
- ‘pine’: Vakh /ɔɔɳčəɣ/ ~ Kazym /wooɳši/
And some examples of the latter:
- ‘song’: Vakh /äärəɣ/ ~ Kazym /aar/
- ‘roach’: Vakh /läärəɣ/ ~ Kazym /ɬaar/
- ‘crane’: Vakh /taarəɣ/ ~ Kazym /tɔɔr/
- ‘bowl’: Vakh /ääɳəɣ/ ~ Kazym /aaɳ/
- ‘lightweight’: Jugan /köńəɣ/ ~ Kazym /keeɳ/
- ‘bog’: Vakh /kɔ̈ɔ̈ɭəɣ/ ~ Kazym /kaaɭ/
- ‘animal’: Vakh /waajəɣ/ ~ Kazym /wɔɔj/
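The complementary distribution in the two lists above can be checked mechanically. The following is only a minimal sketch: the obstruent and sonorant sets are my own assumption, restricted to consonants that occur in the cited forms, and the consonant class is read off the Kazym form (Vakh /w/ in ‘owl’ corresponds to Kazym /p/, so the Vakh form would be misleading there). The names `KAZYM`, `stem_final`, `consonant_class` and `exceptions` are purely illustrative.

```python
import unicodedata


def nfc(s):
    """Normalize to NFC so that letters like š are single code points."""
    return unicodedata.normalize("NFC", s)


# Assumed natural classes, limited to consonants found in the cited forms.
OBSTRUENTS = set(nfc("ptkčšsx"))
SONORANTS = set(nfc("mnɳńŋlɭrjw"))

# Kazym reflexes of the *əɣ-stem nouns cited in the two lists above.
KAZYM = {
    "owl": "jipi", "Khanty": "xanti", "birch bark": "tonti",
    "barbel": "meewti", "duck": "waasi", "knife": "keeši",
    "pine": "wooɳši", "song": "aar", "roach": "ɬaar",
    "crane": "tɔɔr", "bowl": "aaɳ", "lightweight": "keeɳ",
    "bog": "kaaɭ", "animal": "wɔɔj",
}


def stem_final(form):
    """Return (stem-final consonant, whether the form ends in /-i/)."""
    form = nfc(form)
    has_i = form.endswith("i")
    stem = form[:-1] if has_i else form
    return stem[-1], has_i


def consonant_class(c):
    """Classify a consonant according to the assumed sets above."""
    if c in OBSTRUENTS:
        return "obstruent"
    if c in SONORANTS:
        return "sonorant"
    raise ValueError(f"unclassified consonant: {c!r}")


def exceptions(forms=KAZYM):
    """List glosses where /-i/ does not line up with an obstruent."""
    out = []
    for gloss, form in forms.items():
        c, has_i = stem_final(form)
        if has_i != (consonant_class(c) == "obstruent"):
            out.append(gloss)
    return out
```

Calling `exceptions()` on these fourteen forms returns an empty list, which is all the sketch is meant to show. Forms with a stem-final consonant outside the two sets (such as /ɬ/) raise an error on purpose, since their classification is exactly what is at issue.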
There is only one example involving Proto-Khanty *L (a cover symbol representing both *ɬ and *l, which are medially neutralized everywhere).  It appears to align with the sonorants:
- ‘rope’: Nizyam /keetə/ ~ Kazym /keeɬ/
Inconveniently, here *-L- continues PU *-d-. It is therefore not possible to clearly tell if we are dealing with Proto-Khanty *-l- or *-ɬ-, since both paths of development have been suggested. In principle, though, this example would support a claim that the development was in fact first to *-l- (a sonorant), as also in Permic / Mansi / Hungarian.
I am not sure how the split development here should be interpreted phonetically, either. The core motivation seems to be a general cross-linguistic one at least: sonorant codas are more licensable than obstruent codas. But at least secondary loss of /-i/ after sonorants is ruled out, since in genuine Proto-Khanty *I-stems (*əj-stems) this remains. Examples are not numerous (by far most occur following /r/), but they exist:
- ‘riverbed’: Vakh /uurïï/ ~ Kazym /woori/, Nizyam /uurə/
- ‘sturgeon’: Vakh /köörii/ ~ Kazym /kari/, Nizyam /karə/
- ‘scab’: Vakh /kaľïï/ ~ Kazym /xaɬ´i/, Nizyam /xaťə/
This thus ends up further supporting my above-suggested chronology, where *-əj > *-ij > /-i/ took place only after the separation of Northern Khanty: the *-əɣ > ∅ group likely never went through an *-əj-stage. In other words, whatever the exact split development here was, it would have predated the common (but not Proto-!) Western Khanty shift *-əɣ > *-əj.
Maybe this could even be equated with the development of post-tonic (“non-stem”) *ɣ to /j/ in Obdorsk Khanty under certain conditions (e.g. ‘father’: EKh /jeɣ/, Nizyam /jiɣ/, Kazym /jiw/ ~ Obdorsk /jiij/; ‘power’: Vakh /wööɣ/, Nizyam & Kazym /weew/ ~ Obdorsk /weej/). This would then require rather early separation between Obdorsk and the other NKh dialects though, perhaps early enough to invalidate the concept of “Northern Khanty” as a genetic group altogether, and turning it into merely an areal subset of Western Khanty varieties.
I would not take this last corollary as a huge problem though, since I actually suspect the same already on other grounds as well… For just two examples:
- The word for ‘grass’. Far Eastern and Obdorsk have /paam/, while the other dialects have reflexes pointing to *pɔɔm. This surely involves an irregular (“non-provably regular”?) labialization between two bilabial consonants;  and yet this labialization cuts across the conventionally accepted grouping of the Khanty dialects.
- The treatment of supposed Proto-Khanty *ɔ̈ɔ̈ and *öö. These yield in some contexts /oo/ in Obdorsk, but *ää and *ee respectively in the rest of Western Khanty. Yet, the elimination of front rounded vowels is pan-WKh, and e.g. Honti and Steinitz claim it as indeed proto-WKh.  But if so, we have to route Obdorsk /oo/ differently. I wonder if another early shunt will work: if, following Helimski etc. we reconstruct lax open *ä, *a instead of *ee, *öö, *oo, then it will be possible to re-route “*öö > /oo/” as *ä > *a > /oo/, involving a pre-Obdorsk conditional retraction of *ä to *a in some environments.
— For some reason, nearly all words of the *-əɣ > ∅ group also involve Proto-Khanty low *aa, *ää, or mid *ee, *öö, *oo (= *ä, *a?). Perhaps there is also something more going on in here. This is also suggested by one example with a close vowel, where in Northern Khanty we find metathesis instead, viz. ‘eight’: Vakh /ńïïləɣ/ etc. ~ Nizyam /ńiwtə/, Kazym /ńiwəɬ/, Obdorsk /ńiijəl/ (< virtual PNKh *ńiiɣəɬ).
I also wonder how the changes *-əɣ > *-əj > *-I would interact with another innovation common to all of Khanty: the cluster contraction *-jt- > *-ć- (often involving the PU verbalizing suffix *-ta-, e.g. in *uj-ta- > PKh *ɔɔć- ‘to swim’). The more economical approach (that *-jt- > *-ć- was Proto-Khanty while *-əj > *-I was post-PKh) would predict that we should find cases where an *I-stem noun or intransitive verb has a corresponding intransitive or transitive verb (respectively) ending in *-əć-. Offhand I cannot locate any such cases, however. But maybe this type of derivation was morphotactically impossible in the pre-PKh period? For comparison, in Finnic *-i < pre-PF *-j is a common suffix of diminutive nouns, and *-i- < *-j- is a common suffix for iterative verbs, but these generally do not form further verbal derivatives: any corresponding verbs are instead formed from the underived root.
At least one word also suggests the possibility of *əj > *-I being earlier than the contraction to *-ć-: ‘to split’, Vasyugan /ɭaaŋkïït-/ ~ SKh /laaŋxət/ ~ Kazym /ɭooŋkit-/, where we would seem to have PKh *ɭaaŋkəjt-. However, this could also be a later derivative, formed after *-jt- > *-ć- had ceased to operate.
There also seems to be a lack of PKh words ending in coronal + *-I, that is, earlier *-təj, *-səj, *-nəj, *-Ləj. (There are a few examples with a /Ct/ consonant cluster though, e.g. *aŋtïï < *aŋtəj ‘horn’; *maartïï < *maartəj ‘mythical land of birds’.) Maybe this indicates a parallel palatalization, and pre-Khanty *-Cəj or *-CjV resulted in a stem-final palatal instead of an *I-stem. Stems of the shape CVĆ are not very common in the current dataset either, though. But maybe any examples of this simply have not been connected with their equivalents in Mansi or elsewhere in Uralic yet?
Since it turns out that close second-syllable vowels in Khanty are secondary, from the Proto-Khanty perspective I should probably be talking about vocalizable stems, not “vowel stems”. This then suggests that a sixth category should also be distinguished: PKh *Aɣ-stems. These would then fill up a neat 2×3 system:
- vowel stems: *-A(C), *-Aj, *-Aɣ
- consonant stems: *-∅/-əC, *-əj, *-əɣ
A few words ending in *-Aɣ are indeed reconstructed by Honti, and they indeed also show distinctive development of their own. A representative example would be the adverb *koɳčaaɣ ‘on back’: Far Eastern /koɳčaaɣ/, Surgut /koɳɣïï/, Southern /xončää/, Nizyam & Kazym /xonšaa/, Obdorsk /xonsaa/. So we have here:
- loss/vocalization of *-ɣ in WKh, versus its retention in EKh (same as in *əɣ-stems);
- retention of *-A in not just EKh but also WKh, presumably protected by the earlier word-final consonant (partially same as in *Aj-stems);
- a strange development to /-ɣïï/ in Surgut, perhaps through metathesis (*-aaɣ > *-ïïɣ > *-ɣïï)?
Kind of paralleling *Aj-stems being mainly animal names, all of Honti’s examples seem to be adverbs. The other two are *koomtaaɣ ‘overhead’, *pertääɣ ‘back’. I would add to this group also *maakaaɣ ‘previous’, which he reconstructs as *maakaaj, despite SKh /maxaa/ and not ˣ/maxääj/.
Moving onto the main bulk of *A-stems, these may also need to be analyzed as partially secondary. This, however, requires taking a few steps back to look at the wider context.
While the modern Khanty varieties and also most reconstructions of Proto-Khanty abound in consonant stems of the shape CVC, CVCC or CVCəC, it is clear that this is an innovation, and that in Proto-Uralic the dominant root structure was bisyllabic *CV(C)CV. It is also clear that the transition towards consonant stems across a wide central area among the Uralic languages has taken place mostly as areal drift, not as a diagnostic subgroup innovation. Marginal languages of this type, such as Estonian, Nenets and Skolt Sami, still remain at a “thematic inflection” stage, showing consonantal nominative singular forms but vocalic inflectional stems. A good example would be Estonian nom.sg. silm : gen.sg. silm-a ‘eye’, where the latter form is at least from a historical point of view better viewed as silma-∅ (and thus structurally identical to Finnish silmä-n). Verbal roots, which generally cannot stand alone, also generally retain original second-syllable vocalism. And due to the lucky fact that the largest clear subgroups of Uralic all occur near the edges (Finnic, Samic, Samoyedic), in all of these cases we will be able to compare these languages with close relatives that remain at a firmly vowel-stem-centric inflection type (e.g. Votic, Inari Sami, Nganasan, respectively).
A transitional stage, one of several possible, is represented by Hungarian, where nouns retain a trace of thematic inflection (nom.sg. hal : plural hal-a-k ‘fish’; but nom.sg. dal : pl. dal-o-k ‘song’). However, in adjectives and verbs, presumable earlier lexically determined stem vocalism has been levelled entirely, and in most word forms second-syllable vocalism is now better analyzed as morphologically determined. Constantly vocalic stems have also been reintroduced among nouns, primarily in loanwords (e.g. balta : baltá-k ‘axe’, from Turkic), but also in derivatives (e.g. apa ‘father’, where -a has been interpreted as a fossilized possessive suffix).
Sauer’s old work proposes that *A-stems would be a retention from Proto-Uralic in one environment specifically: stem-finally in nominals, as suggested by a few equations like PU *neljä > PKh *ńeLää ‘4’. This would imply that elsewhere they aren’t retentions. The PKh situation as currently reconstructed therefore seems to derive from something close to the Hungarian situation, where original stem vowels have first been almost always phonetically reduced or analogically reshuffled away; then new ones are introduced.
Loanwords can of course fill in new second-syllable vowels, e.g. EKh /aarkaan/ ‘thick rope’, from Turkic; *ajaa > EKh /ajaa/ ~ /ajə/, WKh /aj/ ~ /oj/ ‘luck’, from Tungusic. In native vocabulary though, the most natural source for new second-syllable vowels are original third-syllable vowels. Given the original trochaic stress pattern of Proto-Uralic (as still continued in Samic, Finnic, partly Hungarian and Samoyedic), foot-final vowels would be expected to be the first ones to fall. After this, earlier 3rd-syllable vowels will move one syllable forward, becoming new unreduced 2nd-syllable vowels.
In at least some of the examples I’ve discussed above, 2nd syllable *-A clearly derives from an original 3rd syllable. *koomtaaɣ ‘overhead’, for example, is probably a derivative of PU *kuma- ‘overturned’, i.e. descends from pseudo-PU *kuma-takV. The entire animal name group also falls under this.
Now, the crucial question is — at what point in the history of Khanty was the distinction between “primary” 2nd syllable vowels, retained since PU, and “secondary” 2nd < 3rd syllable vowels lost for good? I think there’s reason to think that this, too, was post-Proto-Khanty.
Relatively poor retention of absolute final *-A is maybe best attributed to specifically word-final reduction/loss. The numeral ‘4’ for example, does not surface with a final full vowel anywhere: the reflexes are Far Eastern /ńelə/, Surgut /ńeɬə/, SKh /ńetə/, Nizyam /ńitə/, Kazym /ńaɬ/, Obdorsk /ńiil/. In many other cases, only the Vasyugan dialect delivers: e.g. *paraa ‘raft’ > Vy. /paraa/, Vakh, Surgut & Demyanka (SKh) /parə/, Obdorsk & most SKh /par/, Nizyam & Kazym /por/.
(It’s unclear at least to me what’s up with the loss of *-A in SKh and Nizyam in ‘raft’, versus its retention as /ə/ in ‘4’. Both patterns have further examples; retention is more common. I’m not sure if I would want to utilize a “primary/secondary” distinction just for these.)
A bigger problem though is that “primary” *-A is mostly lost also in verbs, even though in these the vowel would have been always protected by an inflectional ending. For example *kalaa- ‘to die’ yields Far Eastern /kalaa-/, Surgut /kaɬ-/, SKh & Nizyam /xat-/, Kazym /xaɬ-/, Obdorsk /xal-/. This is in clear contrast to “secondary” *-A in words such as ‘height’: VVy /peläät/, Tremjugan (Surgut) /peɬiit/ (?), Nizyam /pataat/, Kazym /paɬaat/, Obdorsk /päläät/ — which, again, clearly comes from a longer proto-form, being a derivative from PU *pidə > PKh *peL ‘tall’ (and probably further cognate to also e.g. Fi. pituus : pituude- ‘length’, allowing a PU reconstruction #pidə-(w)Otə).
There seems to be some evidence for a “primary/secondary” distinction to be found in *-AC nominals, too. A good example might be *raɣaam ‘relative’ > Vakh /raɣaam/, but Tremjugan /raɣəm/, WKh /raxəm/; derived from a base verb ‘to approach, be near’ — only attested in WKh, and it could be from PKh *raɣaa- rather than simply *raɣ-.
Even if Proto-Khanty had a contrast between two types of *A-stems, trying to reconstruct this in the original 2nd syllable / 3rd syllable fashion seems like the wrong approach, though. In cases like ‘height’, this would lead to awkward vowel-cluster reconstructions such as **peLəäät. In cases like ‘overhead’, nothing would immediately stand out typologically in reconstructing **koomətaaɣ, but this still has at least one undesirable consequence: we can no longer treat *ə as a purely epenthetic vowel in PKh, inserted to resolve consonant clusters (reconstructions like *waajəɣ ‘animal’ are in fact better taken as phonologically */waajɣ/), and at least some cases would have to be assumed underlying.
I have another hypothesis in mind: the distinction may have been prosodic. 3rd syllable vowels in PU would have originally borne secondary stress, and this might have been retained in some form even after the loss of a preceding 2nd syllable. It’s not clear if an outright iambic stress pattern should be assumed though (*peˈLäät), or if something like a monosyllabic initial stress group followed by secondary stress will suffice (*ˈpeL|ˌäät). In principle it would also be possible to leverage the tenseness distinction, well-attested in initial syllables: *peLäät with tense *-ää, versus *ńeLä with lax *ä? For now, I will notate this distinction as *-À (“primary”, “unstressed”; individually *-a, *-ä) versus *-Á (“secondary”, “stressed”; individually *-aa, *-ää). Regardless of the phonetic specifics, later on *-À would have been generally reduced (*raɣam > /raɣəm/), while *-Á would have remained (*peLäät > /peLäät/).
The stress hypothesis finds some amount of direct confirmation as well: cases of fully iambic second-syllable stress have been reported at least from Eastern Khanty (Far Eastern /peˈläät/, Surgut /peˈɬäät/).
Stress in EKh does not appear to be a direct archaism, however. Per all descriptions I have seen, the attested distribution is purely phonological: stress is primarily initial, except when the 1st syllable contains a lax vowel and the 2nd syllable a tense one. This also rakes in cases of “unstressed” *-À; e.g. Far Eastern /kaˈlaa-/ ‘to die’. This seems like another point in favor of some kind of a more subtle distinction in PKh. I would suppose that in varieties of EKh, *-À was early on partly tensed to merge with *-Á, and could have actually acquired stress only later. Wherever this change failed to take place (including in all varieties of WKh), *-À was then reduced/lost.
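The phonological stress rule described above is simple enough to state as a few lines of code. This is a rough sketch, assuming that in the transcription used in this post a doubled vowel letter marks a tense (full) vowel and a single letter a lax (reduced) one; the function names and the vowel inventory in `BASE_VOWELS` are my own illustrative choices, not part of any published description.

```python
import re
import unicodedata

# Base vowel letters after NFD decomposition; front rounded and centralized
# vowels such as ä, ö, ï then appear as base letter + combining diaeresis.
BASE_VOWELS = "aeiouɔə"
NUCLEUS = re.compile("(?:[" + BASE_VOWELS + "]\u0308?)+")


def nuclei(form):
    """Return the vowel sequences (syllable nuclei) of a transcribed form."""
    decomposed = unicodedata.normalize("NFD", form)
    return NUCLEUS.findall(decomposed)


def is_tense(nucleus):
    """Assumption: a nucleus written with two vowel letters is tense."""
    return sum(ch in BASE_VOWELS for ch in nucleus) >= 2


def stressed_syllable(form):
    """Return 1 for initial stress, 2 for second-syllable stress.

    Stress is initial unless the first syllable has a lax vowel
    and the second syllable a tense one.
    """
    n = nuclei(form)
    if len(n) >= 2 and not is_tense(n[0]) and is_tense(n[1]):
        return 2
    return 1
```

For instance, `stressed_syllable("peläät")` and `stressed_syllable("kalaa")` both give 2, while `stressed_syllable("waajəɣ")` gives 1. The second case is the point made above: the rule is blind to whether the second-syllable vowel is “primary” or “secondary”.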
Altogether, I propose the following general chronology for the development of second-syllable vocalism in the Khanty varieties:
- The partial merger of *-À and *-Á in Eastern Khanty (with variable conditions); including *-Áj > *-Àj.
- The reduction of remaining *-À across all of Khanty; loss of *-əɣ in Kazym and Obdorsk after sonorants.
- *-əɣ > *-əj across all of Western Khanty.
- *-əj > /-I/ across all of Khanty (with variable conditions); in parallel, *-Aj > /-A/ in Northern Khanty.
All of these changes are very heavily areal, and do not seem to define any substantial genetic subgroups. The main divisions of Eastern Khanty, the Far Eastern and Surgut groups, would have to be assumed to have split already before step 1 (*kala- > *kalaa- vs. *kal-); the Nizyam / Kazym / Obdorsk dialects of Northern Khanty, already before step 2 (*äärəɣ > *äärəɣ vs. *äär). The split of Nizyam and Southern Khanty could be in principle delayed until step 4 (making Nizyam a “Northernized Southern” rather than a “Southernized Northern” dialect after all), but this seems like a poor idea, even if for now I cannot refute it explicitly.
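To make the ordering of the four steps explicit, here is a toy model that pushes an abstract stem ending through them for a given dialect. It is only a sketch under heavy simplification: endings are handled as cover symbols (*À, *Á, *I), the “variable conditions” of steps 1 and 4 are reduced to a single hypothetical flag `tensed_in_ekh`, and the `Dialect` class with its three sample dialects is invented for illustration.

```python
from dataclasses import dataclass

A_PRIM = "À"  # "primary", unstressed *-À
A_SEC = "Á"   # "secondary", stressed *-Á


@dataclass(frozen=True)
class Dialect:
    name: str
    eastern: bool = False
    western: bool = False
    northern: bool = False
    drops_eg_after_sonorant: bool = False  # Kazym and Obdorsk


VAKH = Dialect("Vakh", eastern=True)
SOUTHERN = Dialect("Southern", western=True)
KAZYM = Dialect("Kazym", western=True, northern=True,
                drops_eg_after_sonorant=True)


def derive(ending, dialect, after_sonorant=False, tensed_in_ekh=False):
    """Apply steps 1-4 in order to a Proto-Khanty stem ending."""
    # Step 1: partial merger of *-À and *-Á in Eastern Khanty,
    # including *-Áj > *-Àj.
    if dialect.eastern:
        if ending == A_SEC + "j":
            ending = A_PRIM + "j"
        elif ending == A_PRIM and tensed_in_ekh:
            ending = A_SEC
    # Step 2: reduction of remaining *-À; loss of *-əɣ after sonorants
    # in Kazym and Obdorsk.
    if ending.startswith(A_PRIM):
        ending = "ə" + ending[len(A_PRIM):]
    if ending == "əɣ" and dialect.drops_eg_after_sonorant and after_sonorant:
        return ""
    # Step 3: *-əɣ > *-əj across Western Khanty.
    if ending == "əɣ" and dialect.western:
        ending = "əj"
    # Step 4: *-əj > *-I everywhere; *-Aj > *-A in Northern Khanty.
    if ending == "əj":
        ending = "I"
    elif ending == A_SEC + "j" and dialect.northern:
        ending = A_SEC
    return ending
```

With these definitions, `derive("əɣ", KAZYM, after_sonorant=True)` gives an empty ending while `derive("əɣ", KAZYM)` gives `"I"`, and `derive(A_SEC + "j", VAKH)` gives `"I"` without any rule specific to the *Aj-stems. Swapping steps 2 and 3 would wrongly predict /-i/ after sonorants in Kazym as well, which is the main reason for the ordering.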
Areality seems to be further proven by how most parts of this scheme have parallels also in Mansi (e.g. *-əɣ > Northern and Pelymka (Western) Mansi /-iɣ/, Eastern and rest of Western Mansi /-i/; *-A > EMs, WMs -∅). But a detailed look into this will be a task for later.
So what can we do with this?
The above analysis leads to at least one more general interesting corollary for Khanty historical phonology. If PKh *À-stems were in the early common Khanty period reduced en masse — then this opens the possibility that several cases could have been lost entirely from the data. Already Sauer notes that all inherited word-final cases of PKh *A-stems seem to occur either following the PKh lax vowels (*e *ö *o *a), or the traditionally reconstructed tense mid ones (*ee *öö *oo). Other cases could have existed as well … we may just be currently unable to directly distinguish them from consonant stems.
There may be, however, indirect evidence to draw such distinctions. The notorious Khanty “ablaut” system (which I am afraid I cannot explain in detail in this post) has for a while now been explained as being instead a partly morphologized system of former umlaut.  Per this hypothesis, alternations like EKh (*)ɬɔɔj ‘finger’ ~ (*)ɬuuj ‘thimble’ would continue something like earlier *ɬɔɔj(A) ~ *ɬuuj-(i), either with i-umlaut of *ɔɔ to *uu in the derivative ‘thimble’; or a-umlaut of *uu to *ɔɔ in the base root ‘finger’. I am more inclined to side with the latter (Honti’s view) than with the former (Helimski’s). If close/open ablaut in Khanty is fundamentally based on a-umlaut, the assumed umlaut trigger could be then identified as *-À, and we could then amend ‘finger’ to PKh *ɬɔɔja instead. This in turn also accords fairly well with the PU reconstruction: *suwd₂a (with Samic *čuvðē, Samoyedic *təjå clearly indicating an original *A-stem). By contrast, Helimski’s assumed *I-stems seem to be nowhere supported by actual data: they are simply circularly inserted into proto-forms where a close-grade vowel eventually surfaces.
Perhaps even un-umlauted *ɬuuja is a possibility for PKh. Vowel alternation in many cases occurs only in EKh, not WKh, and I would not dismiss offhand the possibility that this reflects unstressed vowel isoglosses in early common Khanty. In this case we indeed find WKh *ɬuuj (SKh /tüüj/, Kazym /ɬuj/, etc.) and not *ɬɔɔj > **ɬooj. Instead of assuming levelling from ‘thimble’, or from possessed forms (Vakh /luujəm/ ‘my finger’), maybe no umlaut took place here to begin with, and the discrepancy between EKh *ɬɔɔj ~ WKh *ɬuuj goes back to already earlier *ɬuuja ~ *ɬuuj(ə), with some kind of an early conditional loss of *-À in WKh.
Some other cases of “umlaut” might turn out to be illusory entirely. I am on board with the “Helimski school” reanalysis of “Steinitz school” PKh *ee, *öö, *oo as lax open vowels, and PKh *e, *ö, *o as lax close vowels (though I would be content to keep on using the symbols *e, *ö, *o for the latter). However, the associated reanalysis of Steinitz’ lax open *a as close *ï seems unsatisfactory. In most cases, this continues PU open *a; it is also continued as lax open /a/ in most Khanty varieties. Moreover, we can identify numerous instances where this occurs in an *À-stem instead. The clearest evidence comes from “thematic verbs” such as ‘to die’, where at least in Eastern Khanty the surface alternation is between /oo/ (/kool-/) and /a-aa/ (/kalaa-/). Since Helimski considers *ï to be the i-umlaut counterpart of *a, he ends up proposing the phonetically nonsensical solution that *A-stems would have triggered i-umlaut!
Instead of a back-and-forth development *a > *ï > /a/, purely for the sake of making way for *a > /oo/, I would propose that the rewriting of *ee, *öö, *oo as *ä, *a does not reflect mechanical identity. Rather, the alternation of the sort /oo/ ~ /a-aa/ is again perhaps post-Proto-Khanty entirely. PKh lax *a and *ä were only tensed and raised to /oo/, /ee/ ~ /öö/ when stressed; when unstressed, they were left as is (and not umlauted to anything at all). The first-syllable alternation /oo/ ~ /a-aa/ should be taken back to an earlier stress alternation /á(-ə)/ ~ /a-á/, in turn going back to earlier *á-ə ~ *á-a, through the Eastern Khanty stress retraction shift *-À > *-Á.
Filling up the details on this hypothesis (and possible similar approaches to other ablaut patterns) will need a much closer analysis, though. But ultimately, it may be able to reduce the somewhat sprawling Proto-Khanty vowel system into a more manageable shape.
 Infuriatingly, he does not provide any comments on what has motivated the division of the data. There are hints, of course. Much of the “second-tier” data seems to have relatively limited dialect distribution on one or both sides, e.g. only in Northern Mansi, or only in Southern Khanty; or relatively irregular sound correspondences. I get the impression that he considers it likely that some of this data is either unrelated; are parallel loans from some third source; or consists of loans from Khanty to Mansi (or perhaps vice versa). On the other hand, I think even the main part of the data likely contains a number of cases of this kind. Are these oversights, or does he have any actual reasons in mind to consider some initially spotty-looking cases stronger than others?
 In their respective C2IFU contributions “Zur Geschichte der Nominalstämme in den ugrischen Sprachen”; “Nominalstämme auf *-a/*-ä im Ostjakischen”.
 Bear in mind that Proto-Khanty had a contrast between full and reduced vowels, not in vowel length, and e.g. “long” *aa *uu should be read simply as [ɑ] [u]. “Short” *e is then a reduced vowel, [ə] or [ɪ], and is traditionally indeed transcribed ə in close transcription by fieldworkers on Khanty. Thus, *ej > /iij/ does not involve seemingly unmotivated lengthening, but rather tensing: [jɪ] > [ji].
 Published in László Honti’s Festschrift (Ünnepi kötet Honti László tiszteletére). The University of Helsinki library does have a copy, but it’s on loan currently. If by any chance the culprit happens to be reading this, please feel welcome to get in touch with me…
 The overall rarity of roots ending in *-Ləɣ in Khanty is not a mystery: it is due to the common (Proto-?) Ob-Ugric metathesis of PU *-lk-, *-sk- > East Uralic *-lɣ-, *-ɬɣ- > OUg *-ɣl-, *-ɣɬ-.
 At least two other examples exist of *aa > *ɔɔ before bilabials. 1) ‘Bird cherry’: *jɔɔm in place of expected *jaam, from PU *ďëmə. 2) ‘Hair’: Far Eastern *aawət < *aapət regularly continues PU *ëptə, but other dialects, including Obdorsk, indicate *ɔɔpət. On the other hand, there are counterexamples against assuming a regular change, e.g. *kaam ‘coffin’ (~ Mansi *kaməl), *kaap ‘boat’ (~ Mansi *këëpə), *saam ‘scales’ (~ Mansi *sëëmə, < PU *sëmə).
 To be exact, Steinitz and Honti only claim this about tense *üü, *öö, *ɔ̈ɔ̈. PKh reduced *ö has labial reflexes more widely in WKh, including fronted [ɵ] in SKh. However, this is only the case adjacent to velars; elsewhere we see the expected delabialization to *e. I would propose that this development involves “double cheshirization” (and is areally connected to the same in Southern Mansi): *kö > *kʷe, then re-coloring: *kʷe > South [kɵ] (= phonemically /ko/), North /kuu/.
 For a starting point, see e.g. E. Helimski (1999): “Umlaut in Diachronie – Ablaut in Synchronie: Urostjakischer Umlaut und ostjakischer Ablaut.” — Diachronie in der synchronen Sprachbeschreibung. Mitteilungen der Societas Uralo-Altaica 21: pp. 39–44.