A Problem Statement for Uralic vocalism

As noted in my previous post, I have by now nailed down as my next professional milestone a hunt for previously unnoticed innovative features within the Finnic vowel system.

Besides individual surface questions about what the vowel system of Proto-Uralic may have looked like (harmony this, stem vowels that, long vowels yay or nay…), there is also a second, more methodological theme involved that may be less apparent. This is the question of how to reconstruct a proto-language when faced with extensively overlapping correspondences.

Uralic vowel reconstruction is not really constrained by data. The etymological pool has been sitting at around 1000 items already since the late 19th century. Etymology has kept progressing over time, but almost as much of this has involved discarding old poor comparisons as adding new better ones, and hence the quantitative impact has been surprisingly small (an optimistic count would put us as having gone from about 900 to about 1200). Yet it still remains the case that effectively no two etyma fully agree in their correspondences! Even seemingly perfect rhyme series found in just about all languages show divergence in one or two languages. E.g. for PU *kala ‘fish’, *pala(-) ‘bit, to bite’ and *sala- ‘to steal’ it is Khanty and Selkup that diverge, both in different items even: *kuuL, *puuɭ, *ɬaaL-; *qwëlɨ, *poolɨ-, *twëlɨ-. [1] More often, numerous gaps in data prevent assigning correspondences definitively together as series: a given sparse correspondence set may be simultaneously compatible with three other sets, which however all disagree with one another. In other words we are saddled with too many correspondences to straightforwardly tackle. This all has already been noted before too, e.g. by Kaisa Häkkinen in 1983, in her PhD thesis Suomen kielen vanhimmasta sanastosta ja sen tutkimisesta, pp. 120–151.
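The problem of sparse sets that fit several mutually incompatible series can be sketched in a few lines of Python. This is only an illustration: the language labels `L1`–`L4`, the vowel symbols and the helper `compatible` are invented placeholders, not actual Uralic data or an established procedure.

```python
from itertools import combinations

# Each correspondence set maps a language label to the vowel reflex attested
# there; None marks a gap (no cognate known). All values are invented
# placeholders for illustration, not real Uralic data.
LANGS = ["L1", "L2", "L3", "L4"]

def compatible(a, b):
    """Two sets are compatible if they agree wherever both are attested."""
    return all(a[l] is None or b[l] is None or a[l] == b[l] for l in LANGS)

full_sets = {
    "A": {"L1": "a", "L2": "o", "L3": "u", "L4": "a"},
    "B": {"L1": "a", "L2": "o", "L3": "o", "L4": "a"},
    "C": {"L1": "a", "L2": "o", "L3": "u", "L4": "e"},
}
sparse = {"L1": "a", "L2": "o", "L3": None, "L4": None}

# The sparse set fits every one of the full sets...
matches = [name for name, s in full_sets.items() if compatible(sparse, s)]
# ...even though no two of the full sets fit each other.
clashes = [(x, y) for x, y in combinations(full_sets, 2)
           if not compatible(full_sets[x], full_sets[y])]

print(matches)
print(clashes)
```

With only two languages attested, the sparse set cannot be assigned to any one of the three series, which is the situation described above.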

My reading of the research history moreover seems to reveal that it’s tackling this issue that has been driving all the major debates on Uralic vowel reconstruction thru the years. Roughly four approaches have been considered, all of them in principle admissible per known processes of language change:

1. Reconstruct different vowels for each correspondence (the “trivial reconstruction” approach). This was briefly attempted in the late 1890s, within the West Uralic (Samic–Finnic–Mordvinic) group. E. N. Setälä proposed at this time (see p. 839– here) reconstructions such as the following:

  • *ȧ > S *ā ~ F *a ~ Mo *a
  • *å > S *uo ~ F *a ~ Mo *a
  • *ɔ > S *oa ~ F *o ~ Mo *u
  • *o > S *uo ~ F *o ~ Mo *o, *u
  • *ɔ̄ > S *oa ~ F *oo ~ Mo *u
  • *ō > S *uo ~ F *oo ~ Mo *a

However, any attempt to extend this method wider out will turn out to require further and further splintering, and by the time we end up with a triple-digit number of different proto-vowels, this idea will be clearly untenable.
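As a rough sketch of why the trivial approach splinters, the six correspondences listed above can be put in a small table and counted. The function names here are hypothetical helpers introduced for illustration; the data is copied from Setälä's list as given.

```python
# Setälä's six West Uralic correspondences as listed above:
# (Samic, Finnic, Mordvinic) reflexes keyed by his proto-vowel.
setala = {
    "*ȧ": ("*ā", "*a", "*a"),
    "*å": ("*uo", "*a", "*a"),
    "*ɔ": ("*oa", "*o", "*u"),
    "*o": ("*uo", "*o", "*o, *u"),
    "*ɔ̄": ("*oa", "*oo", "*u"),
    "*ō": ("*uo", "*oo", "*a"),
}

def trivial_inventory(correspondences):
    """Trivial reconstruction: one proto-vowel per distinct reflex tuple."""
    return len(set(correspondences))

def distinct_reflexes(table, column):
    """Number of different reflexes found in one branch (one column)."""
    return len({row[column] for row in table.values()})

n_proto = trivial_inventory(setala.values())
per_branch = [distinct_reflexes(setala, i) for i in range(3)]

print(n_proto, per_branch)
```

Each branch shows only three different reflexes in this sample, yet the trivial method already needs six proto-vowels, since every new combination of reflexes across branches demands its own symbol.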

2. Assume original vowel alternations, with levelling in each descendant. This idea was also initiated by Setälä very shortly afterwards, indeed already explored in the same article I linked, and was developed, maybe in its purest form, by T. Lehtisalo in the 1930s. [2] In his work e.g. what Setälä above reconstructs as *ȧ becomes *ā; but *å is transformed into *ā ~ *ò, *ɔ into *ò ~ *ù, *o into *ò ~ *ō ~ *ū, *ɔ̄ into *ō ~ *ū, and *ō into *ō ~ *ā. Most of his various proto-vowels actually never exist outside such pseudo-ablaut patterns.

After WW2 the “locus” of this line of reconstruction moved from Finland to Germany, with W. Steinitz defending his own variant of the idea extensively in the 40s thru 60s. No real research on the topic has occurred since his death however. (Amazingly enough, it still lingers though in some overviews of Uralic penned by people who evidently ignore all research published outside of the German-speaking world.) I see this as unlikely to be effectively revived either: the only Uralic language showing somehow productive evidence for “ablaut” is Khanty, while everywhere else alleged evidence for vowel alternation is either due to transparently secondary changes, or is really based on sound correspondences rather than language-internal evidence.

3. Assume sporadic vowel changes, per ad hoc influence of varying surrounding phonetic environments: anything adjacent to labials might be labialized or delabialized, anything adjacent to velars might be backed or labialized, anything adjacent to /r/ might be lowered or backed, etc. No one has treated this as the sole explanation for various vowel correspondences across Uralic, but this was considered a major mechanism first by E. Itkonen, whose work ended up repealing the Setälä school “gradation” model in Finland, yet ended up enshrining a very Finnocentric image of Uralic vocalism (cf. before).

Today this approach most strongly still persists in Hungary. One reason surely is that this is the model that has been adopted in the UEW, often treated as the crown jewel of Hungarian Uralistics, and whose tentative reconstructions are then sadly often treated as ex cathedra truth. I suspect a second reason is moreover found in language-internal history: the Modern Hungarian vowel system cannot be derived from that of Old Hungarian by regular sound changes — if taken at face value. However the very limited inventory of Old Hungarian vowel graphemes (in first sources just ‹a e i o u›, slightly later expanded to ‹a e i o u ü›, etc.) very likely hides unwritten distinctions. [3]

4. Attempt to reconstruct conditional vowel shifts. First explored already by A. Genetz contemporarily with Setälä, and by now universally adopted e.g. for much of the West Uralic data: Setälä’s *ɔ, *o turn out to be in complementary distribution with respect to stem type (*o-a versus *o-ə, split only in Samic). Recent wisdom shows this to be mainly the case for his *å versus *ō likewise (*a-a, *aCCə, *aTə versus *aRə, split only in Finnic). This then also explains extremely naturally the identical reflexes in Finnic and Mordvinic for the former, in Samic and Mordvinic for the latter: they don’t just coincide, they’ve always had the same vocalism.

More generally, this approach is adopted to some extent by everyone more recent than Lehtisalo (including Steinitz and Itkonen), but often only partially. I believe it can and should be still pushed further to reach new results.
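The two West Uralic cases mentioned under the fourth approach can be checked mechanically against Setälä's correspondences. The following is a minimal sketch: `split_branches` is a hypothetical helper, and the two environment tables assume that the stem types in the text are listed in the same order as the vowels they go with (*ɔ with *o-a, *o with *o-ə; *å with *a-a, *aCCə, *aTə; *ō with *aRə).

```python
BRANCHES = ("Samic", "Finnic", "Mordvinic")

# Reflex sets per branch for four of Setälä's proto-vowels, copied from the
# correspondences listed under approach 1.
setala = {
    "*å": ({"*uo"}, {"*a"}, {"*a"}),
    "*ō": ({"*uo"}, {"*oo"}, {"*a"}),
    "*ɔ": ({"*oa"}, {"*o"}, {"*u"}),
    "*o": ({"*uo"}, {"*o"}, {"*o", "*u"}),
}

def split_branches(v1, v2):
    """Branches in which the two vowels share no reflex at all."""
    return [b for b, r1, r2 in zip(BRANCHES, setala[v1], setala[v2])
            if not (r1 & r2)]

# Conditional restatement: one proto-vowel each, with the reflex chosen by
# stem type. The pairing of environment and reflex is an assumption based on
# the order in which the text lists them.
SAMIC_REFLEX_OF_O = {"o-a": "*oa", "o-ə": "*uo"}
FINNIC_REFLEX_OF_A = {"a-a": "*a", "aCCə": "*a", "aTə": "*a", "aRə": "*oo"}

print(split_branches("*ɔ", "*o"))
print(split_branches("*å", "*ō"))
```

Once the stem type is known, two of Setälä's proto-vowels collapse into one in each pair, and the branches that never split need no further explanation.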

It must be also noted that these are not methodologically equal approaches.

The first approach does make exact predictions, and is to an extent obligatory: we do need to assume some number of vowel phonemes in Proto-Uralic, and some unconditional/elsewhere reflexes for them in the daughter languages. [4] But the vast number of correspondences demands also some other mechanisms to account for the large number of non-core cases (not really “edge cases” when they may be the majority altogether). While a few of them could be in principle again accounted for by setting up new Proto-Uralic vowel phonemes, this method ends up as awfully arbitrary: we have no clear grounds to prioritize any single case of variance in reflexes as inherited from Proto-Uralic, while leaving other cases of variance to be explained by other methods. In fact I think by now that reconstructing any proto-language contrasts at all from only a single branch among several (i.e. at least in largely polytomous-looking dialect-continuum/linkage situations such as Uralic) is methodologically illegitimate — while such cases can obviously happen in principle, only when a contrast is continued by more than one line of evidence is it possible to securely privilege a particular reconstruction.

The second and third mechanisms however are poor patches to the problem: they end up as unfalsifiable “just-so phonology”. Both irregular sound change and paradigmatic levelling are singular events that can be only assumed, never defended in detail, and never clearly shown to be incorrect by additional data. Usually it also becomes nearly impossible to then establish the real proto-language starting point. For the former the main issue is one of directionality, especially for supposedly irregular correspondences widely across a family, but also since local archaisms are in principle possible. For the latter the typical problem has been treating “alternation” merely as a free-floating excuse to mix and match vowel reflexes, without giving it any original morphological or phonological distribution. Sometimes we may fall back to these, but they’re no more than band-aids for etymologies that otherwise seem to work and which we don’t feel like discarding for but a single irregular feature. (There are further similar mechanisms available to the historical linguist too, but they start to get outside phonology entirely. [5])

It is only the fourth approach that has real explanatory power for exception cases. Reliably established conditional sound changes allow accounting for the development of multiple words by a single explanation, reaching a more parsimonious historical scenario than anything built of one-off changes. Conditional sound changes make fairly exact predictions as well about what correspondences future etymological research may find. Though this should not be overstated: etymology is not a black box that feeds us experimental data, it’s made of scientists who are able to read work on historical phonology and might use it to hunt for new etymologies, in principle risking confirmation bias. New data rarely outright falsifies conditional sound changes either: more common responses in my impression are to either narrow down the conditioning further yet, or to seek explanations through relative chronology, so that apparent exceptions may turn out to be accountable as being due to counterfeeding sound changes.
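How relative chronology can produce apparent exceptions is easy to show with a toy model. The two rules and the words below are invented purely for illustration and are not proposed Uralic sound changes; the point is only that the same two regular changes give different results depending on their order.

```python
import re

# Toy rules, invented for illustration only (not actual Uralic changes).
def raise_a_before_r(word):
    """Conditional change: a > o before r."""
    return re.sub(r"a(?=r)", "o", word)

def lower_e(word):
    """Unconditional change: e > a."""
    return word.replace("e", "a")

def derive(word, rules):
    """Apply sound changes in chronological order."""
    for rule in rules:
        word = rule(word)
    return word

words = ["kara", "kera"]

# Counterfeeding order: e > a comes too late to feed a > o / _r,
# so the new "kara" looks like an exception to the conditional change.
counterfeeding = [derive(w, [raise_a_before_r, lower_e]) for w in words]

# Feeding order: e > a applies first, and its output also undergoes a > o / _r.
feeding = [derive(w, [lower_e, raise_a_before_r]) for w in words]

print(counterfeeding)
print(feeding)
```

In the counterfeeding order the output contains both "kora" and "kara", which on the surface looks like an irregular split even though both changes are fully regular.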

As I’ve stated already in the intro slides to my CIFU 12 presentation: “one who seeks, shall find”. For several years now, a large proportion of new discoveries in Uralic historical phonology have precisely been conditional sound changes, either entirely new ones, or new and improved conditions for known sound correspondences. This includes also almost all results I am “sitting on”. Hence it seems evident that this represents a major underresearched area.

This is all the more surprising since ample preliminary work has regardless already been done! With just a bit more rigor, many minor “sporadic” sound changes assumed by mid-20th-century researchers like Itkonen, Collinder or Rédei (to an extent also even 19th-century pioneers) could probably be transformed into more regular shape. This goes beyond the big names too: many minor articles may yet turn out to have the seeds of important insights, as maybe best exemplified by Lehtinen 1967; but also (staying still within Finnic) e.g. Bergsland 1968 as the inspiration for my idea of more general palatal unpacking *AĆ > *AjC, or the various loanword studies to have first discussed the idea of a sound change *ej > *ii. I already have enough material for a PhD from ideas I’ve uncovered or developed on my own, but going onward from there, compiling and reassessing proposed sound changes from earlier research seems to me like an important desideratum for Uralic studies in the early 21st century.

[1] At least the Selkup development is perfectly explainable: there is no **pwë- in Proto-Selkup, and evidently diphthongization of Proto-Samoyedic *å to *wë was blocked after labials. Terentyev in СФУ 16 suggests *å > *o between a labial and a resonant (thus also e.g. PU *pončə ‘tail, back’ > PSmy *pånčə > PSk *ponč-ar ‘hem’, (? *parka >) *pårkå > *porqɨ ‘coat, parka’), *å > *u between a labial and an obstruent (thus also e.g. *mośkə- > *måsə- > *musɨ- ‘to wash’, *poskə > *påtə > *putɨ-la ‘cheek’). There are also cases of *å > *o/u not preceded by a labial though. I wonder if syllable closure and/or if PSmy *å goes back to PU *a or *o should also be taken into account.
[2] Most extensively in: Lehtisalo, T. 1933. “Zur geschichte des vokalismus der ersten silbe im uralischen vom qualitative standpunkt aus” [sic: no caps]. Finnisch-Ugrische Forschungen 21: 5–55.
[3] E.g. ‹i› when giving modern Hu. ë/ö is likely to have been a shorter/laxer *ɪ, while ‹i› when giving modern Hu. i/í is likely to have been longer/tenser *i ~ *iː, as can be confirmed by different Uralic sources for the two — and hence these correspondences do not involve “sporadic” lowering of †i, but rather quite regular lowering of */ɪ/.
[4] It is in theory however possible, given long enough phonological development, that many conditional sound changes bleed a proto-phoneme such as *a on its way to some default reflex *A in a sub-branch, and that this is then bled by additional conditional sound changes in several environments, including all the retention ones, on its way to some modern reflex like /a/, so that there aren’t actually any cases left at all where *a > /a/. In such a case all reflexes of original *a in this modern variety would be conditional one way or the other. An almost-example is the fate of PU *k in Tundra Nenets: as a singleton it was palatalized to /sʲ/ before front vowels and (? backed-then-)lenited to /x/ before back vowels; as the first member of a cluster it was debuccalized to /ʔ/ in coda position; and as the 2nd member of a cluster it was almost always lost. The “default” development *k > /k/ is thus only really found in the original cluster *kk. On average Uralic is still phonologically compact enough though that nothing like this usually happens.
[5] One other common option is “find a root that does work phonologically, then go hog wild with semantics”. This has given us many such great etymologies as Kari Liukkonen’s infamous derivation of Finnic *noki ‘soot’ from Baltic *nagis ‘nail’, allegedly through an unattested sense ‘dirt under fingernails’ (I wish I were kidding). — When in need of a patch, I seem to tend towards phono-semantic contamination the most for some reason. Arguably this is also an underresearched area, but again, semantic change is singular and cannot be actually usefully reconstructed all by itself. At most it seems that we could collect examples and try to look for typological generalizations, hardly a project to have lasting impact very soon.
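The Selkup conditioning summarized in footnote [1] can be written out as a small function. This is a sketch under stated assumptions: the consonant classes `LABIALS` and `RESONANTS` are my own rough groupings, vowel length is ignored, and the cases of *å > *o/u without a preceding labial that the footnote mentions are not covered.

```python
# Consonant classes are assumptions made for this sketch.
LABIALS = {"p", "m", "w"}
RESONANTS = {"m", "n", "ń", "ŋ", "l", "r", "j", "w"}

def selkup_reflex(before, after):
    """Reflex of first-syllable PSmy *å, given the neighbouring consonants."""
    if before not in LABIALS:
        return "wë"   # default diphthongization
    if after in RESONANTS:
        return "o"    # labial + resonant
    return "u"        # labial + obstruent

# Examples from footnote [1]: (preceding C, following C) of the PSmy forms.
examples = {
    "*pånčə": ("p", "n"),
    "*pårkå": ("p", "r"),
    "*måsə-": ("m", "s"),
    "*påtə": ("p", "t"),
}
results = {form: selkup_reflex(*ctx) for form, ctx in examples.items()}
print(results)
```

A single conditional statement thus accounts for all four of the cited words at once, where a sporadic-change account would need four separate stipulations.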

Posted in Methodology, Reconstruction
3 comments on “A Problem Statement for Uralic vocalism”
  1. David Marjanović says:

    [sic: no caps]

    An English-like usage of capital letters was fashionable in historical linguistics written in German in the 19th century, but I had thought this ended when German got its first officially regulated orthography in 1901. I guess Lehtisalo was a living fossil in 1933.

    Incidentally, I can’t recommend it.
    Ausländer, die deutschen Boden verkaufen “foreigners who sell German soil”
    Ausländer, die Deutschen Boden verkaufen “foreigners who sell soil to Germans”
    Helft den armen Vögeln! “Help the poor birds!”
    Helft den Armen vögeln! “Help the poor to boink!”
