A little thing that has been vexing me pretty much ever since I first for some fateful reason decided to take a look at Uralic vowel history is the distribution of *a versus *o in Mari.
These two Proto-Mari vowels are only distinguished in Hill (Western) Mari, while Meadow (Eastern) Mari merges both to /o/. Or is it a merger after all, or perhaps instead a split? The puzzle is that in the wider etymological context, no real conditioning appears. Both vowels seem to go back to similar Proto-Uralic vowel combinations (mainly *a_ə, *a_a, *ë_a, and *o_ə). E.g. *kala “fish” → *kol, but *pala “bit” → *pal (EDIT: misquoted, see below; cf. instead *jalka “foot” → /jal/); or *poŋə “bosom” → *poŋəš, but *joŋsə “bow” → *jaŋəš. The rest is just about as ugly.
I finally seem to have hit on some reliable results, though: namely, some partial soundlaws conditioned by the Proto-Mari initial consonant.
- After *w, only *a appears. This is completely regular, based on about 20 examples. An impressive number for a specific two-phoneme combination, at least in a language like Mari that does not exactly abound in archaic inherited vocabulary.
- In the absence of an initial consonant, *o appears. This is mostly regular (both in good PU words such as *apta- → *opte- “to bark”, and areal ones such as *ožə “stallion”, which only has relatives in Permic). I’ve gotten three exceptions together, though: *anće- “to blink”, *aškəl “step”, *ažnə “early”. While this is not too much to generalize upon, all three seem to contain a following palatal consonant (palatals are frequently depalatalized in Mari, but cf. Komi: /addźɨ-/, /voćkol/, /vodź/). Perhaps this, then, is further evidence for an allophonic distinction between *[a] and *[å] (IPA [ɑ] vs. [ɒ], if you will) in Proto-Uralic.
- After *p, *a appears. This is nicely parallel to the case of *w, and again mostly regular (*pontə → *pandə “stick”, *par(ə)ma → *parma “gadfly” etc.) but with three exceptions. Here the conditioning seems in a way the inverse of the previous case: all cases appear before velar consonants, namely *pokte- “to hunt”, *poŋgə “mushroom”, and the above-mentioned *poŋəš “bosom”. I have no idea if this should be interpreted as a backing effect, or perhaps as a fronting effect of dental/(post)alveolar/palatal medials. (No labial medials appear here, since we already have one initially; a common phonotactic restriction in the Uralic languages, if not wider across the world.)
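To make the three partial soundlaws easier to test against further data, here is a minimal Python sketch of them. Everything about the encoding is my own assumption for illustration: the names `predict_first_vowel`, `EXAMPLES` and `mismatches` are hypothetical, the medial classes ("velar", "palatal", "other") are assigned by hand, and "palatal" means etymologically palatal as suggested by the Komi cognates, not something readable off the Mari form. The sketch also folds the exceptions in as sub-conditions, which so far is only a tentative proposal.

```python
# Hypothetical encoding of the partial soundlaws listed above.
# initial: Proto-Mari initial consonant ("" when the word is vowel-initial).
# medial:  hand-assigned class of the following consonant(s):
#          "velar", "palatal" (etymological, per Komi) or "other".

def predict_first_vowel(initial, medial):
    """Return "a" or "o" as predicted, or None where no pattern is known."""
    if initial == "w":
        # After *w only *a appears.
        return "a"
    if initial == "":
        # Vowel-initial words show *o, except before an old palatal.
        return "a" if medial == "palatal" else "o"
    if initial == "p":
        # After *p we find *a, except before velars.
        return "o" if medial == "velar" else "a"
    # Other initials (*m, *n, *l, *r, *j, ...): no clear pattern yet.
    return None


# (form, initial, medial class, attested first-syllable vowel);
# the forms are the Proto-Mari examples cited in the list above.
EXAMPLES = [
    ("opte-", "", "other", "o"),
    ("ožə", "", "other", "o"),
    ("anće-", "", "palatal", "a"),
    ("aškəl", "", "palatal", "a"),
    ("ažnə", "", "palatal", "a"),
    ("pandə", "p", "other", "a"),
    ("parma", "p", "other", "a"),
    ("pokte-", "p", "velar", "o"),
    ("poŋgə", "p", "velar", "o"),
    ("poŋəš", "p", "velar", "o"),
]


def mismatches(examples):
    """List the forms whose attested vowel differs from the prediction."""
    return [
        form
        for form, initial, medial, attested in examples
        if predict_first_vowel(initial, medial) != attested
    ]


if __name__ == "__main__":
    for form in mismatches(EXAMPLES):
        print("unexplained:", form)
```

Since the rules were read off these very examples, an empty mismatch list proves nothing by itself; the point would be to extend `EXAMPLES` with further Proto-Mari words and see which ones fall out.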
It’s a start, though I can already see this won’t be the entire solution… I’ve also checked the remaining sonorants (*m, *n, *l, *r, *j), and lamentably, no clear pattern emerges yet for any of these. *m failing to abide by the pattern set by the other two labials is a particular bummer.
If the previous observations are anything to go by, the following consonants probably need to be taken into account too. E.g. a zero medial appears to condition *o (examples include *koe- “to dig”, *moa- “to find”, *roe- “to hack”, *šoe “duck”); or, to be exact, a unique correspondence Hill /o/ ~ Meadow /u/, which however seems best explained by a raising *o → *u in the latter.
Also, since this all seems to be basically a Mari-internal phenomenon after all (and I’m definitely taking the results so far as a licence to treat *a and *o as interchangeable for Uralic reconstruction), getting the full story will probably require looking into Proto-Mari words for which no Uralic cognates are known. Late loanwords will then probably interfere though… so finishing this job will basically have to wait for having a Mari etymological dictionary to consult. I’ve no interest in accidentally basing conclusions on what would turn out to be recent Russian loanwords for all I know…