Another Phonological Relict in South Estonian

Some days ago, I decided to go for a re-reading of Setälä’s classic Yhteissuomalainen äännehistoria (1891) (that’s “Common Finnic Historical Phonology”, for the non-Finnish-reading people in the audience). This proved a good idea: it yielded not just confirmation of some issues I had been wondering about, but also various observations of detail, new to me, that seem to support a theory of mine in the works.

I mean the thesis introduced at the end of my last post: the characteristic Finnic sound change *š > *h did not take place in unitary Proto-Finnic, or even in unitary Core Finnic (following the splitting-off of South Estonian and Livonian) but spread across the Finnic language area even later, after its splitting into dialects entirely.

One of these details appears in the Finnic word for ‘goose’, normally reconstructed as *hanhi (> e.g. Fi. hanhi, Es. hani). We are quite sure that this goes back to earlier *šanši, given that it’s a long-known loanword from PIE *ǵʰans- (most likely thru Baltic); and also given the recent observation that it could be traced back to even earlier *šänšä, allowing treating Erzya /šenže/ ‘duck’ as a “non-native cognate”.

Since the word fails to show up in Samic — or rather, shows up there in an entirely different form *ćōńëk, allegedly from a pre-Germanic alternative formation *ǵʰan-ut- according to an etymology from Koivulehto [1] — we probably still shouldn’t assume loaning into common West Uralic. Another point in favor of this seems to be given by the Finnic sound change *-ńć- ~ -ńś- > *-ć- ~ *-ś-: that this early denasalization only applies before a palatalized sibilant seems best explained by assuming that the clusters *-nš- and *-ns- had not yet even entered the language by this point (neither of them occurs in material inherited from Proto-Uralic). [2]

Denasalization before sibilants is a fairly natural sound change though. A second round of the same has later taken place in the southern Finnic area, this time with compensatory lengthening, affecting *-ns- found in innovated Proto-Finnic vocabulary (as in Es. põõsas ~ Fi. pensas < PF *pënsas ‘bush’) or developing thru *-nc- from the assibilation of *-nt- (as in Es. kaas ~ Fi. kansi < *kansi < PF *kanci < *kanti < PU *kamtə ‘lid’). And the interesting fact is: in South Estonian this affects ‘goose’ as well, yielding haah’ instead of the expected ˣhahn’.

You might protest that surely the loss of a nasal should be just as natural before /h/. This is also the mechanism Setälä appeals to. Crucially though: words showing *-nh- of some other origin are not denasalized. As just mentioned in my last post, they instead metathesize, yielding e.g. *tenho > tehn ‘thank’, *vanha > vahn ‘old’ (again, just like other sonorant+h clusters, regardless of whether they go back to *-Rš- or not). ‘Goose’ appears to be the only example of this denasalization development. [3] I would not brush off as a coincidence the fact that it is also the only example that can be securely traced back to *-nš-.
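
The ordering argument here can be made concrete with a toy derivation. What follows is only a minimal sketch in Python, under my own simplifying assumptions: the rule functions, their names and the vowel inventory are hypothetical, the rules operate on plain strings, and the later final apocope (haah’, vahn) is left out.

```python
import re

# Assumed vowel inventory, just enough for the examples in this post.
VOWELS = "aeiouäöëõ"

def denasalize(word):
    # *VnS > *VVS: n is lost before a sibilant (s or š),
    # with compensatory lengthening of the preceding vowel.
    return re.sub(rf"([{VOWELS}])n(?=[sš])", r"\1\1", word)

def s_to_h(word):
    # *š > *h, unconditionally.
    return word.replace("š", "h")

def metathesize(word):
    # South Estonian *-nh- > -hn-.
    return word.replace("nh", "hn")

def derive(word, rules):
    # Apply the rules one after another, in the given order.
    for rule in rules:
        word = rule(word)
    return word

# Two hypothetical relative chronologies.
EARLY_H = [s_to_h, denasalize, metathesize]   # *š > *h first
LATE_H = [denasalize, s_to_h, metathesize]    # denasalization first

print(derive("šanši", EARLY_H))  # hahni
print(derive("šanši", LATE_H))   # haahi
print(derive("vanha", LATE_H))   # vahna
```

With *š > *h ordered first, ‘goose’ falls together with *vanha and metathesizes; with it ordered after the denasalization, the nasal is already gone by the time /h/ appears, which is the attested South Estonian outcome.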


This situation might not be obvious, as two other Finnic words with *-nh- have still been proposed to come from *-nš-. Yet newer research appears to have shown by now that neither example holds water.

*vanha ‘old’ is the first case with alleged earlier *-nš-, traditionally compared with Udmurt /vuž/, Komi /važ/, of the same meaning. Komi /a/ would be irregular as a counterpart of Finnic *a, though, and a recent proposal from Mikhail Zhivlov [4] identifies a better etymology for the Permic words: borrowing from Baltic *wetuša- ‘old’ (cf. Lithuanian vetušas). The development *e > /u ~ a/ seems to be regular before a lost medial consonant, as in PU *wetə > Udm. /vu/ ~ K. /va/ ‘water’. [5] A different etymology for Finnic *vanha has been proposed too: borrowing from Germanic *wanhaz ‘bent, crooked, bad’. This seems uncertain due to the semantic difference, but if the Permic connection fails, it appears to be the explanation we will have to default to. LÄGLOS is of the opinion that it would be exactly the existence of Permic cognates that shows this etymology to be unviable, not any formal flaw.

The second is *inhiminen ‘human’, which has been traditionally compared with Mordvinic *inžə ‘guest’. A loan etymology by Koivulehto derives these from PIE √ǵenh₁- ‘to beget’. Disassembling this requires a bit more analysis though. Given that the usual sound substitution for Indo-European *ǵ has been Uralic *j, Koivulehto suggests that the words continue the zero-grade *ǵn̥h₁-, with the sequence *ǵn̥- substituted as *in- (rather than *jVn-). Since we still have /i-/ and not the expected **e- in Mordvinic, the word would then have to have been loaned fairly late — but my soundlaw *je- > *i- for Finnic seems to “get in the way” of this: Koivulehto’s reconstruction could be quite well amended to a common proto-form *jenšä-, derived instead from the IE full grade.

Other considerations still chafe against this analysis. Firstly, Koivulehto also assumes a sound substitution *H → *š, but as has been recently argued by Adam Hyllested, [6] this is likely mistaken, and we should instead assume *H → *h straight away. Most of Koivulehto’s alleged examples are restricted to Finnic, and thus show no direct evidence for *š at all. For a few others, with cognates in e.g. Samic that explicitly point to *š, alternative etymologies have been suggested. If I were doing a more detailed review, I would consider also the possibility that they represent “etymological misnativization”, with IE *H → Finnic *h substituted as *š either in the other Uralic languages involved, or already in an archaic mediating Finnic variety.

Secondly, in Finnic we have no evidence for a bare root **inhä, only for the longer stem *inhimV- (mostly further suffixed with the adjectival/diminutive ending *-inen, but a few forms like Ludian inahmoi could in principle be parallel rather than “suffix-switched” derivatives). This seems to not match at all with the usual patterns of Finnic nominal derivation. We would expect something ending in *-imV- to be either a nominalization (in *-mA-) from a frequentative verb (in *-i-), or a superlative. Instead the Indo-European derived noun *ǵenh₁mn̥ ‘offspring’ (> Latin genimen, Sanskrit janiman, etc.) seems to provide a better morphological match: it even provides half of the ending *-inen, whose presence in the neutral word for ‘human’ is otherwise a bit puzzling. In Mordvinic we see no signs of this though, which would seem to suggest that the ‘guest’ word has a different etymology entirely.

(Thirdly… in South Estonian only Northern-type reflexes inemine ~ inimene seem to be attested, so even if the history here had really been *ǵenh₁- > *jenšV- > *inhV-, it would not affect my analysis of ‘goose’ anyway.)


How late this reanalysis requires pushing *š > *h exactly is not clear. The terminus post quem on show is after the Southern Finnic denasalization (or perhaps concurrently with it: earlier in North Estonian vs. later in South) — but this is itself difficult to date. At minimum this would have to be later than the splitting-off of Northern Finnic, which in principle might however go quite deep into the Proto-Finnic period.

There is some weak evidence for some dialect diversity within the future Estonian area at this time as well. Another minor observation of Setälä’s is that, in a few central Estonian dialects, *Vns > *VVs postdates the diphthongization of original *aa and *ää to /ua/ and /iä/. This won’t have to mean that the entire denasalization development is this late, though: a nasal vowel stage *ṼṼs would make a very believable intermediate, with full loss of nasality only later.

The form haah’ also does not even appear to be common across the entire South Estonian dialect area, but is rather limited to its southernmost fringes. To some extent this probably means that the literary / North Estonian form hani has simply displaced the native form in some parishes… but a very similar distribution also seems to hold for tehn and vahn. In principle it would be possible that also the southwesternmost area of South Estonian had already split off by the time of *š > *h, and that the general Central Finnic soundlaw *nh > *n is the regular development elsewhere in the SE area.


This analysis may also raise a few methodological questions. Is it really legitimate to suppose a development *Vnš > *VVš for pre-South Estonian only on the basis of a single etymology? On one hand, it is clear that granting a blank check for positing single-example sound changes with highly specific conditioning would allow rewriting the historical phonology of any language completely to taste. On the other hand, in this particular case we have some very strong constraints to avoid this failure mode: aside from the bare output (haah’), we can independently establish also all three of the input (*šanši), the specific conditioning environment (loss of *n before a sibilant) and the general phonetic motivation (the articulatory complexity of a nasal-sibilant transition) of the sound change I’m assuming.

Much seems to depend on how we model sound change phonologically. Do changes target, or are they conditioned by atomic phonemes — or by the features of neighboring segments? If the former, then we will be forced to treat *Vns > *VVs and *Vnš > *VVš as two parallel changes that have only incidental similarity; if the latter, then it will become possible to treat them as the one and the same sound change *VnS > *VVS, and to proceed to infer early dialect diversity within the Finnic languages.
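
The difference between the two models can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the feature table is a hypothetical fragment, the helper names are mine, and words are treated as strings of single-character segments.

```python
# Hypothetical feature table: only what the examples need.
FEATURES = {
    "s": {"sibilant"},
    "š": {"sibilant"},
    "h": set(),
    "n": {"nasal"},
}
VOWELS = set("aeiouäöëõ")

def denasalize(word, triggers):
    """Drop n between a vowel and a segment accepted by `triggers`,
    doubling the vowel (compensatory lengthening)."""
    segs = list(word)
    out = []
    for i, seg in enumerate(segs):
        prev = segs[i - 1] if i > 0 else ""
        nxt = segs[i + 1] if i + 1 < len(segs) else ""
        if seg == "n" and prev in VOWELS and triggers(nxt):
            out.append(prev)  # the vowel takes over the nasal's slot
        else:
            out.append(seg)
    return "".join(out)

def only(phoneme):
    # Atomic model: the rule is conditioned by one specific phoneme.
    return lambda seg: seg == phoneme

def is_sibilant(seg):
    # Feature model: the rule is conditioned by a natural class.
    return "sibilant" in FEATURES.get(seg, set())

# Atomic model: two separate rules, each blind to the other's input.
print(denasalize("kansi", only("s")))   # kaasi
print(denasalize("šanši", only("s")))   # šanši (untouched)
print(denasalize("šanši", only("š")))   # šaaši

# Feature model: one rule *VnS > *VVS covers both, and still spares *-nh-.
print(denasalize("kansi", is_sibilant))  # kaasi
print(denasalize("šanši", is_sibilant))  # šaaši
print(denasalize("vanha", is_sibilant))  # vanha
```

In the atomic version the rule for *-nš- has to be stated separately and rests on ‘goose’ alone; in the feature version it comes for free from the rule already needed for *-ns-.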

[1] I am on the skeptical side though, and would expect anything showing Samic *ć ← PIE *ḱ to have been adopted from a Satem variety.
[2] The same relative dating is similarly suggested by how this sound change seems to extend to Mordvinic as well. None of the textbook examples such as PU *kuńćə ‘urine’ have known reflexes in Mordvinic; but one binary comparison, Erzya /saźi-/ ‘to gain, get’ ~ Permic *sudź- ‘to reach’ seems best reconstructed as *sëńćV-.
— It might be additionally a good idea to assume that the heterorganic clusters *-ŋs- and *-ŋš-, known in one word each (*joŋsə > PF *jousi ‘bow’; *jaŋša- > PF *jauha- ‘to grind’) had already changed to *-xs-, *-xš- in Finnic before the denasalization of *-ńć-.
[3] ‘Thank’ and ‘old’ are moreover the only two examples of *-nh- > -hn- that I can get together on a quick search.
[4] I do not know of a more substantial publication on this yet, but an initial release has been in the proceedings of the 2008 conference Языковые контакты в аспекте истории. (My thanks to André Nikulin for the reference.)
[5] Rather than setting up a separate marginal Proto-Permic vowel *å, I would prefer explaining this correspondence as a conditional development in Komi from Proto-Permic *o (normally > Udm. /u/ ~ K. /o/). Finding a phonetically reasonable account of the development regardless remains to be done. A few possibilities that would initially seem plausible are blocked e.g. by how both *-ej- and *-at- still yield the expected /o/ in Komi (cf. /voj/ ‘night’, /śo/ ‘100’).
[6] In a conference paper to be found in his PhD thesis Word Exchange at the Gates of Europe. Again, I do not know of a “more proper” published version.

Posted in Etymology, Reconstruction
15 comments on “Another Phonological Relict in South Estonian”
  1. M. says:

    One of these details appears in the Finnic word for ‘goose’, normally reconstructed as *hanhi (> e.g. Fi. hanhi, Es. hani). We are quite sure that this goes back to earlier *šanši, given that it’s a long-known loanword from PIE *ǵʰans- (most likely thru Baltic)

    Maybe I’m missing something, but shouldn’t this be formulated as “It is a long-known loanword from Baltic *žansi-”?

    The stem of hanhi is nearly identical to that of modern Lithuanian žąsis “goose”, the only difference being that Lith. has (somewhat recently, I think) lost the nasalization in the vowel.

    Plus which, is there any known pattern wherein PIE *g / ǵ (i.e. a voiced velar or palatal stop) would surface as Finnic *š > *h? As far as I know, we should only expect this outcome in Finnic *after* the IE consonant had become a coronal fricative (Lithuanian ž, etc.), which it never did in the western IE branches.

    • j. says:

      Not in western IE, no, but we know of a variety of Iranian loanwords with its depalatalized *c or *dz substituted as the Uralic non-palatal affricate *č, which would word-initially give pre-Finnic *š as well. And already in Mordvinic the usually recognized situation seems to be that Indo-Iranian loans are probably more common than Baltic ones. (Though perhaps some of the Iranian ones could be reinterpreted as being from sufficiently early or slightly divergent Baltic, if we for some reason wanted to do so.)

      And yes, “long-known Baltic loanword” would be more in line with the current consensus position for sure, this possibility of Iranian routing is my own observation entirely.

  2. M. says:

    The stem of hanhi is nearly identical to that of modern Lithuanian žąsis “goose”

    “Nearly identical” when one applies the sound change *ž / š > *š > *h, that is.

  3. M. says:

    One more thing that should have gone in my first comment:

    A different etymology for Finnic *vanha has been proposed too: borrowing from Germanic *wanhaz ‘bent, crooked, bad’. This seems uncertain due to the semantic difference, but if the Permic connection fails, it appears to be the explanation we will have to default to.

    The etymology we would have to default to in that case would be “origin unknown”, as far as I can see, not “of Germanic origin”. By what principle can semantic difficulties simply be waved off (i.e. “shelved” pending further research), while any phonetic irregularity immediately negates a proposed connection? (Several well-known IE cognates are not held to the latter standard, at least not routinely: Germanic *auǥon- “eye”, *xauƀuþ- “head” and *seƀun “7” all have an unexplained or missing segment that prevents a fully regular correspondence with phonetically-similar and semantically-matching words outside of Germanic.)

    • j. says:

      The phonetic irregularities have not sufficed to negate the Permic connection, it’s the new superior loan etymology (incompatible with Finnic) that I think does so. Semantics-wise, I’m going off of LÄGLOS’ assessment that it’s just the Germanic ~ Permic comparison that gets too difficult. Perhaps I should look up the actual arguments for this too though: I agree that just a formal resemblance does not cut it, and so I would hope Ritter and Koivulehto have e.g. managed to dig up Germanic reflexes with a meaning closer to ‘old’, or at least parallels for a semantic change ‘bent’ > ‘old’.

      On closer thinking, it might be relevant that vanha is used as an epithet of the Devil in Finnish (in expressions such as vanha vihtahousu); but this could still also be a homonymic Germanic loan rather than an actual part of the ‘old’ group. I do not know if similar usage appears elsewhere in Finnic.

      • M. says:

        The phonetic irregularities have not sufficed to negate the Permic connection, it’s the new superior loan etymology (incompatible with Finnic) that I think does so.

        The vetušas : vuž etymology may be superior to vanha : vuž in the sense of having more regular sound correspondences, but since it doesn’t seem like a persuasive etymology to me thus far (only a plausible one), I don’t think its superiority justifies banishing vanha : vuž from further consideration.

        Semantics-wise, I’m going off of LÄGLOS’ assessment that it’s just the Germanic ~ Permic comparison that gets too difficult.

        What does a Germanic-Permic connection have to do with the semantic gap between the Germanic and *Finnic* words under comparison?

        • j. says:

          No one is “banishing” anything. Since we cannot prove a negative, the only way in which we can possibly claim an etymology as a whole to have been negated (as opposed to claiming a particular argument for one to have been refuted) is in the sense that it will end up having to be considered less likely than some other one. At some point unlikely enough to not stand out from dozens of other implausible just-so etymologies though, but we don’t currently have models qualitatively sophisticated enough to measure such a thing.

          The semantic argument I’m so far making is essentially an ex silentio one. LÄGLOS normally does not shy away from calling attention to poor semantic correspondences; it does not do so here, from which I infer that some further arguments in favor of the alleged connection probably exist. That they omit listing these seems explainable by how at the time the Finnic-Permic connection remained uncontested, providing for them (but not necessarily for us) an understandable reason to dismiss the Germanic loan etymology anyway.

          Let’s see if this changes once I actually look up the references, though.

          • M. says:

            I’ve read snippets of LÄGLOS online, and some of its statements about semantics puzzle me. E.g. when discussing the etymology of laine “wave” and its Finnic cognates, the editors point out that the main meaning of the Finnic words is absent in the proposed North Germanic cognates (Icelandic hlein “ledge”, Norwegian lein “slope, inclination”), while Finnic has no trace of the NGerm. words’ meanings. They call this fact “auffällig” (= ”noteworthy”?), but right after mentioning this, they conclude that laine is a “germanische Lehnwort” with no question mark or other qualification.

            Perhaps they meant to imply that they find the semantics of Gothic hlain- „hill“ (the only non-North Germanic cognate, I think) sufficiently similar to those of laine to close the case, but if that’s what they meant, I think they should have said so explicitly.

            • j. says:

              Laine is an interesting case. They refer somewhat discreetly to a paper on the etymology of kumpu as a parallel for similar semantic development, but that one is probably rather a case of two unrelated etymologies: Fi. kummuta ‘to well up’ from PU *kompa ‘wave’, versus the group of kumpu ‘hill’, as also per LÄGLOS itself, from Germanic hump etc., instead of being an example of the inverse development ‘wave’ > ‘hill’.

              The variation between *lainëh < *lainëš (most of Finnic) and *lainis (Ludian / Veps) very much suggests loan origin to me, but yes, I would be skeptical about positing *hlainiz specifically as the origin.

      • David Marjanović says:

        I would hope Ritter and Koivulehto have e.g. managed to dig up Germanic reflexes with a meaning closer to ‘old’, or at least parallels for a semantic change ‘bent’ > ‘old’.

        I can very tentatively offer one: my East Central Bavarian dialect has an isolated word /vax/ (vowel length is not phonemic), which occurs exclusively in the outraged question /bistˈvax/ “are you completely out of your mind now”. The vowel fits if and only if we assume umlaut, and I wouldn’t know where that would come from in *wanhaz. (OHG/MHG /a/ and /aː/ become /ɒ/; the umlaut of that phoneme is sometimes /ɛ/, sometimes /a/, for reasons I have no idea of.)

        The obvious alternative is that it’s the cognate of Standard German weich “soft” (cognate of E weak). That is testable, just not by me, because my particular subdialect has, under Viennese influence, recently merged the reflex of PGmc *ai into /a/. If /vax/ is weich, there must be plenty of people out there who say */voɐ̯x/ instead; I have no idea if that’s the case.

        Central Bavarian lost the length contrast of word-final consonants just before the end of the MHG period. Given the Inderior German Gonsonand Weagening, I’m not optimistic about North Bavarian preserving the contrast. I don’t know if there are any South Bavarian dialects that preserve it, though. The short /x/ of *wanhaz should still be intact, while the /k/ of weak* should have been turned into long /xː/ by the HG consonant shift.

        * Wiktionary reconstructs *waikwaz without giving any hint about what the second *w is inferred from.

        • David Marjanović says:

          just before the end of the MHG period

          Sorry, that’s just the terminus post quem (the loss of the 1sg verb ending -e). Can’t have been much later, though, because long word-final consonants resulting from later rounds of apocope (like the loss of the noun plural ending -e) are preserved and have even been extended by analogy.

    • j. says:

      Alright, the literature checkup indeed comes up negative: nothing more detailed than an observation on the similar shapes of the Finnic and the Germanic words seems to have been proposed. I’m adjusting my degree of trust in LÄGLOS downward accordingly…

  4. I have a slightly crazy speculation about the case of vanha. I do not believe that the Germanic etymology is valid at all. (Like most Indo-Europeanists who occasionally dabble in Uralic, I’m totally confused when I see some of the claims of borrowing that do not look like the supposed source languages. There is a Gmc *wanxaz, which, though, would have been *wããhaz if not *waahaz by the time Finnish was in contact with Germanic. However, the earliest meanings are ‘maimed’, cf. OE wóh, and ‘blameworthy’, cf. Gothic unwahs ‘blameless’. Not the best source for a word meaning ‘old’, I think.) What if it really is related to Hu vén, from a protoform *wäšänä? Depending on details of the sound law that takes Pre-Finnic ä…ä to a…e, this would yield intermediate Finnic *wašena, the e should syncopate, then *vašna > *vahna > vanha. The Hungarian is of course straightforward from this preform. Excluding the Permic word, which really is better explained via *vetušas, there do not seem to be other cognates to consider. Under this hypothesis, the South Estonian forms with -hn- would be simple retentions, while the rest of Finnic metathesized, but I think that is plausible. The only potential difficulty I see, which I am not qualified to judge, is the plausibility of a Uralic word with the shape *wäšänä. You tell me.

    • j. says:

      There is a Gmc *wanxaz, which, though, would have been *wããhaz if not *waahaz by the time Finnish was in contact with Germanic.

      This sounds like you’re underestimating the contact timescale: the current understanding is that IE-Uralic contacts in the Northwest predate both Proto-Germanic proper and Proto-Finnic proper, going back easily 2500 years, quite possibly 3000. We call particular early loans as being from “Germanic” (or “Baltic” or even “Slavic”) (or for that matter, as coming into “Finnic” or “Samic”) mainly after the languages where the involved words surface nowadays; but they continue essentially smoothly all the way back to (late) Proto-Uralic / (late) Proto-Indo-European, really.

      What if it really is related to Hu vén, from a protoform *wäšänä?

      Does not look entirely impossible (and the comparison has been invented many times), but I would have additional concerns:
      – the Hungarian word is these days usually instead considered cognate with Upper Vyčegda Komi /vener/ ‘old, worn’ (which, admittedly, also suggests probably *ä);
      – the metathesis *hn > *nh outside South Estonian would seem fairly ad hoc (we’d perhaps have to then try shunting off remaining cases of Proto-Finnic *hn as having been for example *sn at the time);
      – positing instead *wänäšä would not work either, since this would be rather expected to yield Finnic **voonha or even *voonëh;
      – from original *ä, I’d expect in Hungarian the short-vowel oblique stem ˣvene-, as also in e.g. keze- ‘hand’ or fele- ‘half’ (but perhaps this alternation could have been levelled in adjectives, as at least the earlier A/O-stem distinction has been). I’d also want to check if Hungarian dialect data indicates Old Hungarian *ee or *ää.

      And yes, *wäCäCä would also be a very unusual shape for an underived root word in Proto-Uralic. Perhaps still not really so unusual as to work as a counterargument all by itself though; there are numerous other unusual-looking trisyllabic PU reconstructions (that have often been assumed to be “derivatives” without strong evidence), even if none look specifically like this one.

    • David Marjanović says:

      There is a Gmc *wanxaz, which, though, would have been *wããhaz if not *waahaz by the time Finnish was in contact with Germanic.

      Oh, I overlooked that last time… if *x > *h is supposed to be a sound change rather than a change in transcription conventions, it didn’t happen. Between vowels, *x is still [x ~ ç] in Upper German, short as opposed to the long one that comes (mostly) from the High German Consonant Shift.
