A Fourth Laryngeal in PIE

The Proto-Indo-European laryngeals seem to form, in most people’s thinking, a kind of phonological subsystem. Usually they end up as a class of back fricatives, or at least some kind of weaker back consonants. They certainly have similar diachronic behavior… but whether this also implies a unique synchronic similarity is not immediately obvious. After all, there is a rather wide range of consonants that can be easily lost from a language (in the “merges with zero” sense). And inversely: even if many members of some natural class are lost, not every one of them has to be. E.g. transient voiced spirants in various Uralic languages: early pre-Permic *β *ð *ɣ are all lost by late Proto-Permic, while out of late Common Finnic *β *ð *ɣ in Eastern Finnish/Karelian, only the latter two are lost, and *β instead gives /v/.

Occasionally PIE internal reconstructors will go further still, and point out that the most widespread reconstruction with three laryngeals would be tempting to compare with the three series of velar consonants, suggesting rewriting *h₁ *h₂ *h₃ as *x́ *x *xʷ. The analogy is clearly imperfect though. E.g. the laryngeals do not show many signs of a centum / satem isogloss, not along the usual dividing line at least; [1] there are no parallels to the conditional neutralizations among the velar stops, such as *ḱr > *kr; the labiovelar stops *kʷ *gʷʰ *gʷ do not show any *o-coloring effects (for *k *gʰ *g some *a-coloring effects have been proposed though). A still more common objection, however, seems to be that there is a widely held alternate hypothesis: many mainstream IEists think that *h₃ is better mapped as a voiced fricative: [ɣ], [ʁ] or [ʕ], and *h₁ as a glottal consonant: [h] or [ʔ].

This semi-consensus view still assigns *h₂ as a voiceless back fricative: [x] or [χ], as the direct Anatolian evidence also strongly suggests. The occasionally suggested pharyngeal [ħ] can be IMO ruled out per arguments such as those in Michael Weiss’ recent paper. (I have already opted to use *x and not *h₂ in my index of the LIV roots, and will mostly do so in the rest of this post too.) However, this leaves an opening for an objection that does not seem to be commonly made, but to me feels quite relevant. If *h₁ and *h₃ are really something like *h and *ɣ, would *h₂ = *x then really be an isolated voiceless velar fricative, without palatovelar and labiovelar counterparts? [2]

A brief typological survey shows that such gaps among back fricative systems are indeed not common. In particular, any language that has both /kʷ/ and /x/ is rather likely to also have /xʷ/. [3] A look at the PHOIBLE data turns up the following results:

  • all of /k kʷ x xʷ/: 35 languages
    (Bilin, Buwal, Central Atlas Tamazight, Central Siberian Yupik, Chipaya, Chipewyan, Comox, Cupeno, Dghwede, Gavar, (Paraguayan) Guarani, Gwandara “4 and 6”, (Northern) Haida, Iraqw, Jicarilla Apache, Kumiai, Lagwan, Lamang, Luiseno, Mezquital Otomi, Nootka, Quileute, Seri, Serrano, Shuswap, Tachelhit, Tera, (Southern) Tiwa, Tlingit, Tolowa, Tonkawa, Wamey, Wichi Lhamtes Nocten, Yuqui)
  • only /k kʷ x/: 14 languages
    (Awing, Ese Ejja, Kwasio, Nizaa, Nuclear Daba, Purepecha, Saliba, Sui, Taushiro, Tilquiapan Zapotec, Uru, Ute-Southern Paiute, Yala, Yurok)
  • near misses: Haka-Chin with /k kʷ x w̥/, Izi-Ezaa-Ikwo-Mgbo with /k kʷ χ/, Wuzlam with /k kʷ χ hʷ/.
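The sorting into these groups is mechanical enough to sketch in code. The following is only a rough illustration in Python, not the query I actually ran: the function name `classify` is made up for the example, and the sample entries list just the segments quoted in this post, not full PHOIBLE inventories.

```python
# Sort inventories by how they fill the /k kʷ x xʷ/ square.
# SAMPLE holds only the segments quoted in the post (partial inventories).
SAMPLE = {
    "Tlingit": {"k", "kʷ", "x", "xʷ"},
    "Yurok": {"k", "kʷ", "x"},
    "Wuzlam": {"k", "kʷ", "χ", "hʷ"},
    "Quechan": {"k", "kʷ", "ç", "çʷ"},
}

def classify(inventory):
    """Return 'symmetric', 'asymmetric', or None if /k kʷ x/ is not all present."""
    if not {"k", "kʷ", "x"} <= inventory:
        return None  # outside the comparison (e.g. uvular or palatal fricatives only)
    return "symmetric" if "xʷ" in inventory else "asymmetric"

for name, segments in SAMPLE.items():
    print(name, classify(segments))
```

A strict test like this does not recognize near misses such as /w̥/ or /hʷ/ standing in for /xʷ/; those still have to be judged by hand.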

So a language that has /kʷ/ and /x/ is about 2.5 times more likely to have /xʷ/ than not; a very substantial result, when otherwise only some 3.2% of the languages in the world PHOIBLE sample have /xʷ/.
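For anyone who wants to check the arithmetic, here is a small Python sketch using only the three figures quoted above. The variable names are mine, and the near misses are left out of both groups.

```python
symmetric = 35    # languages with all of /k kʷ x xʷ/ (count from the post)
asymmetric = 14   # languages with only /k kʷ x/ (count from the post)
baseline = 0.032  # overall share of PHOIBLE languages with /xʷ/ (from the post)

# Odds of having /xʷ/ among languages that have /k kʷ x/
odds = symmetric / asymmetric
# The same thing expressed as a share of that group
conditional = symmetric / (symmetric + asymmetric)
# How much higher that share is than the worldwide baseline
lift = conditional / baseline

print(f"odds: {odds:.1f} to 1")
print(f"share with /xʷ/ given /k kʷ x/: {conditional:.1%}")
print(f"lift over baseline: {lift:.1f}x")
```

In other words, roughly 71% of the languages with /k kʷ x/ also have /xʷ/, against a baseline of about 3%.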

There are moreover plenty of languages that have /k kʷ/ and some non-velar pair of ±labialized back fricatives. The most popular setup by far is /k kʷ h hʷ/ (Amharic, Arabela, Argobba, Cherepon, Fwe, Gikyode, Guinea Kpelle, Gwandara “2”, Hausa, Ikwere, Inor, Iyive, Kamayura, Kawaiisu, Kistane, Mbuko, Merey, Mesqan, Mofu-Gudur, Moloko, Nuclear Igbo, Piaroa, Sebat Bet Gurage, Siona, Suya “2”, Vame, Wandala, Wari, Wolaytta, Yeyi). Now, /k kʷ h/ is also very common; but given that x > h is a common sound change, it seems likely that many of this group of languages have come about from earlier *x *xʷ. In three cases /h hʷ/ also combines with an unpaired buccal back fricative: /k kʷ x h hʷ/ (Mfumte, Nyam), /k kʷ χ h hʷ/ (Tewa). [4] Other similar inventories are:

  • /k kʷ ç çʷ/ (Quechan)
  • /k kʷ χ χʷ/ (Bana, Kabyle, Xamtanga)
  • /k kʷ ħ ħʷ/ (Bade)
  • /k kʷ χ χʷ ħ ħʷ/ (Moroccan Arabic)

Lastly there is also the notable Pacific Northwest cluster of languages (Bella Coola, Coeur d’Alene, Lushootseed, Spokane, Squamish, Straits Salish, Upper Chehalis) with either just /kʷ xʷ/ (no plain velars; all have non-labialized uvulars though) or /k kʷ xʷ/ (with /k/ looking like a recent reintroduction by loans). This is tangential to the question, though.

Remarkably, this typological trend continues even within Indo-European! Nowadays Hittite is analyzed as having indeed phonemic /xʷ/ ḫu ~ uḫ beside plain /x/ (for a recent detailed review see Suter (2014) [5]). Per correspondences like Lycian /kʷ/ q, the same is also thought to have been the case already in Proto-Anatolian. This *xʷ corresponds to traditional PIE *h₂w, and is usually considered to come about by simple cluster coalescence. It would be however also quite feasible to set up *xʷ already for PIE itself, so that there wouldn’t be any asymmetry in stop versus fricative labialization. (This idea is supported already by Suter, whose article I only found after coming up with the idea myself.)

This will require a slight change in thinking: the concepts of “laryngeal” as “a consonant that is deleted” and “laryngeal” as “a back fricative” will need to be uncoupled. *xʷ will be a “laryngeal” in the second sense, but not in the first: it leaves at minimum a *w behind in core IE, after all. I think this sharpening of concepts would be beneficial, as Indo-European studies already suffers from treating the laryngeals as excessively phonetically vague.

I believe additional evidence for *xʷ can also be found in PIE root structure. Clusters of (plain) velar + *w are often set up for PIE, but they’re much rarer than the labiovelars proper. LIV has the following counts: *kʷ 15 + 18 (root-initial + root-final), *gʷʰ 8 + 14, *gʷ 17 + 16; versus *kw 7 (root-initial only), *gʰw 0, *gw 2. For *xw there are however 18 cases initially + 7 finally, which would make this both the most common *Cw cluster and by far the most common *HR cluster in PIE. [6]
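Summing these figures makes the imbalance easier to see. A quick Python tally, using only the LIV counts quoted above; the grouping into `labiovelars` and `velar_w_clusters` is just my own bookkeeping for the example.

```python
# Totals per onset/coda type, from the LIV counts quoted in the post
# (root-initial + root-final where both are given).
totals = {
    "kʷ": 15 + 18,
    "gʷʰ": 8 + 14,
    "gʷ": 17 + 16,
    "kw": 7,
    "gʰw": 0,
    "gw": 2,
    "xw": 18 + 7,
}

labiovelars = sum(totals[c] for c in ("kʷ", "gʷʰ", "gʷ"))
velar_w_clusters = sum(totals[c] for c in ("kw", "gʰw", "gw"))

print("labiovelar stops:", labiovelars)
print("plain velar + w clusters:", velar_w_clusters)
print("xw:", totals["xw"])
```

With 25 cases, *xw patterns with the individual labiovelars (22 to 33 each) rather than with the velar + *w clusters (9 in total).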

Even more interesting are the verb root *xwyedʰ- ‘to strike dead or injured’, and the noun *xwl̥h₁néx ‘wool’: these appear to have a very rare *CRR- onset structure, unparalleled elsewhere in PIE to my knowledge. Reconstructing a monophoneme *xʷ and not a cluster **xw would however reduce these to the usual *CR-. Labiovelar stop + resonant clusters are rare as well, but at least attested, e.g. *kʷles- ‘to furrow’, *gʷyeh₃- ‘to live’, *gʷʰreh₁- ‘to smell smth’.

I would even suggest that some further internal reconstruction can be applied here. The typical onset structure in PIE is *(F)(T)(R)- (with F = fricatives, T = stops, R = resonants). In traditional reconstructions this is however violated by a number of cases of *w + resonant (attested in LIV: *wl- 1–3, *wr- 10–12, *wy- 3). However, many of these could be probably replaced by *xʷR-. Even the development to attested /wr-/, /vr-/ in a few descendants such as Germanic and Indo-Aryan would not have to be common core IE: it could represent independent developments, versus direct loss (or maybe *xʷ > *x > *h > ∅) in branches like Italic. — For *wy- some cases seem to be attested almost solely in zero-grade. They could probably be also reconstructed with *i as an original non-zero-grade root vowel, and an analogical full grade in some sporadic Indo-Iranian reflexes, similar to the case of *bʰux- ‘to grow’.
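The effect of the reanalysis on the onset template can be shown with a toy checker. This is a simplified sketch in Python: the segment classes are my own rough assignment (laryngeals counted as fricatives, as throughout this post), the name `fits_template` is invented for the example, and onsets are given as already-segmented lists so that nothing hinges on how *xw vs. *xʷ is tokenized.

```python
# Rough segment classes; counting laryngeals under F is the post's assumption.
FRICATIVES = {"s", "h₁", "x", "h₃", "xʷ"}
STOPS = {"p", "t", "k", "kʷ", "b", "d", "g", "gʷ", "bʰ", "dʰ", "gʰ", "gʷʰ"}
RESONANTS = {"m", "n", "l", "r", "w", "y"}

def fits_template(onset):
    """True if the segment list matches (F)(T)(R): each slot optional, in this order."""
    slots = [FRICATIVES, STOPS, RESONANTS]
    i = 0
    for seg in onset:
        # Skip forward to the first remaining slot that can hold this segment.
        while i < len(slots) and seg not in slots[i]:
            i += 1
        if i == len(slots):
            return False  # no slot left for this segment
        i += 1  # this slot is now used up
    return True

print(fits_template(["x", "w", "y"]))  # cluster analysis of *xwyedʰ-
print(fits_template(["xʷ", "y"]))      # monophonemic analysis *xʷyedʰ-
print(fits_template(["w", "r"]))       # traditional *wr-
print(fits_template(["xʷ", "r"]))      # proposed *xʷr-
```

The cluster readings *xwy- and *wr- fail the template because they need two resonant slots, while the readings with *xʷ pass.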

The above is just structural reanalysis, so far. It is less clear to me whether setting up a PIE *xʷ will also have implications for the routing of the reflexes in the daughter languages; whether some cases will regardless have to be retained as a cluster *xw; or even whether this could also be set up in a few additional positions.

Suter proposes one readjustment of this type: reconstructing ‘to wash’ as *lexʷ-, and not anything like *leh₃w- or *lewh₃- (and with intervocalic *-xʷ- > -[ɣʷ]- in Hittite, same as with plain *-x- > -[ɣ]-). This promisingly enough seems to cut out some ad hoc “laryngeal metathesis” rules. However, it also suggests an odd property for *xʷ: a-coloring in Latin (lavō) but o-coloring in Greek (λοέω).

How does this fit together with the seven examples I mentioned that have already been earlier reconstructed as *Ceh₂w-?

  • *deh₂w- ‘to roast on a spit’: Sanskrit dunóti < *du-ne-H-, Greek δαίω, δέδηε < *daw-ye-, *de-dāw-, OHG †zuscen < *du/ū-sḱe-, Irish dóïd < *do/ōw-eye- etc.
  • *geh₂w- ‘to be glad’: Greek γαίω, γάνυμαι < *gaw-ye-, *ga-n-u-, Latin gaudeō < *gāwedʰ-, and perhaps also some reflexes that LIV splits as a separate root *geh₂dʰ-.
  • *ḱeh₂w- ‘to set on fire’: Greek *kaw-ye-, *kāw-s-, Lith. kūles ‘Brandpilze’ (?!), Albanian than ‘to dry’ < *ća-, Tocharian *kaum ‘sun’. Kind of a weak-looking semantic grab-bag root etymology.
  • *keh₂w- ‘to hit’: reconstructed in LIV with *-h₂w- per Tocharian *kɐw- : *kåw- < *kəw- : *kāw-, even though most reflexes (Latin, Germanic, Balto-Slavic, Greek) point instead to *kuH- : *kewH-. If ad hoc metatheses are going to be assumed, why not in Tocharian rather than in all the other languages?
  • *kleh₂w- ‘to cry’: Greek + Albanian *klaw-ye-.
  • *melh₂w- ‘to grind’ — probably not with *xʷ, but rather an extended stem *melx-w/u-, from the more common *melx- ‘to grind’.
  • *peh₂w- ‘to stop, finish’: only Greek παύω < *paw-.

It seems that the behavior here is rather different from the ‘wash’ case, with several examples confirming a-coloring in Greek. But they also all seem to involve more complex constructions; maybe the difference could be one between coda *xʷ (retained until a-coloring?) and medial *xʷ (leniting to *w earlier?). Many also seem to involve reflexes that point to *CuH-, instead of expected *ā(w) : *aw from *ex(w) : *əx(w). And does dunóti involve o < *aw < *exʷ, maybe coming about by some kind of a *dux- > *duxʷ- development?

Nowadays lengthened grades are usually thought to be secondary, so I even wonder if instances of ā that surface here are that, instead of from *aH < *ex. The (partial) late PIE ablaut scheme for roots in *xʷ would then be *āw : *aw : *u (lengthened grade : *e-grade : zero grade). Eichner’s Law (*ēx > *ē and not **ā) on the other hand still seems to require that a-coloring is usually younger than the rise of lengthened grade.

Latin lavō can be of course also explained through Thurneysen-Havet’s Law: *o > a / _wV́. And so, if this and λοέω are *o-grades after all, there will be no trouble in assuming that *xʷ is leftwards a-coloring, just like plain *x.

So far, in summary: introducing *xʷ gets rid of several typological-phonotactic anomalies in PIE. These include at least all *CRR- roots, a large group of *CeCR- roots, possibly numerous *RR- roots, the strange abundance of the cluster *h₂w, and the unusual /k kʷ x/ inventory.

The second of these issues is, however, not exhausted by this reanalysis. *CeCR- roots regardless remain a suspicious feature of laryngeals in particular: there are no roots in anything like **-sw-, **-dy-, only things like *-h₁w-, *-xy-. One can wonder if *xʷ is maybe only the tip of an iceberg, and a few additional “laryngeals” of this kind (back fricatives that do not get deleted entirely) should also be assumed.

But there will be many other options available too, especially with laryngeals other than *x that cannot be easily grounded in direct Anatolian evidence. For very quick offhand speculation for the sake of example… since laryngeals’ presence is in some ways easier to determine than their exact position, and since in particular *-Hy- clusters are often assumed to be subject to metathesis, we could rewrite these as the more typical *-wH-, *-yH-, and simultaneously then rewrite the roots currently reconstructed as *CewH-, *CeyH- as being instead “close-vowel roots” *CuH-, *CiH- (with ablaut only secondarily by analogy).

[0] Thanks to various members of the Zompist Bulletin Board for a number of discussions on this topic.
[1] It is true that *h₂e > *a and *h₃e > *o merge often, and conceivably this could even have gone through an early merger of *h₂ and *h₃. But this happens also in the non-satemic Germanic, while failing in the satemic Armenian. The corresponding “centum” merger of *e and *a as distinct from *o also seems to be unattested entirely.
[2] The same could be asked of *h₃ as *ɣ too, but there happens to be a very easy answer here — just identify “missing” *ɣ́ and *ɣʷ with the semivowels *y and *w, or at least assume that the fricatives merged with the semivowels at some early stage.
[3] The situation for palatalized velars seems similar, but the controversy over whether *ḱ was [kʲ], or whether *ḱ *k were perhaps instead [k q], makes this question harder to survey.
[4] How these cases have come about seems harder to figure out from just general principles. Some hypotheses I can think of would be asymmetric debuccalization, i.e. *x ≡ but *xʷ > hʷ; and later secondary lenition, such as *q > χ or *ɸ > x, some time after the introduction of contrastive labialization. Loanword phonemes could be involved, too: for a not quite exact parallel, Udmurt has /k kʷ/ natively (the latter is usually, but IMO unconvincingly, analyzed as a cluster) versus /x/ only in recent loans from Russian.
[5] He also refers to the same typological sound inventory argument as I do, but working with an earlier stage of PHOIBLE, he only gets together 25 examples of symmetric /k kʷ x xʷ/ versus 11 of asymmetric /k kʷ x/.
[6] The other *H + glide clusters come in at *h₁w at 7 + 4, *h₁y at 1 + 1–5 (with lots of cases where it seems to be unclear if *y is a part of the root that gets deleted, or a widespread suffix), *xy at 0 + 1–6, *h₃w at 2 + 2, *h₃y at 1 + 0. All *H + liquid or *H + nasal clusters occur initially only, with *xm- the most common at 7 examples. Other *Cw clusters are likewise root-initial only: *sw- 21 (in this position more common than alleged *xw, but not altogether), *tw- 8, *dʰw- 7, *dw- 5, *ḱw- 5, *ǵʰw- 3–4, *ǵw- 1–2.

Posted in Reconstruction
32 comments on “A Fourth Laryngeal in PIE”
  1. An additional benefit of this approach is that it reduces an anomalously high frequency of *h₂ in PIE lexicon.

  2. Y says:

    This opens the question, why did *xʷ leave some traces, while *ɣʷ disappeared completely?

    • j. says:

      Which *ɣʷ would that be — the one I suggest in endnote 2, or *h₃ itself? If the former, then this can be just an issue of relative chronology. If the latter, I don’t think there is much reason at all to think it was rounded: you may recall my view of the PIE vowel system is that traditional *a *o can be reassigned as *ɜ *a (and therefore *h₃ is not actually [o]-coloring). Still more to come on this though, e.g. at least the implications for Indo-Iranian should be easily guessable.

  3. Crom Daba says:

    I dislike laryngealism as it is today, because laryngeals can be combined with ablaut and analogy to fit most any data. But having said that, this seems promising.

    It reminds me of a post on linguistics.stackexchange (now deleted apparently) by an uneducated but imaginative user that suggested alternation between *h₃ and *w in pre-Indo-European although I take it you suggest *xʷe > *xʷa.

    • j. says:

      Actually no, insofar as the starting point is to simply identify *h₂w as *xʷ, the vowel-coloring rules would seem to be coming out asymmetrical: *exʷ > *aHw > *āw with a-coloring, but *xʷe > *we with no coloring.

  4. A note on Quechan. PHOIBLE gives the following inventory of palatal, velar and postvelar stops and fricatives: /c̟ c cʷ k kʷ ç çʷ/. PHOIBLE’s source (Halpern, A.M. 1944. Yuma. In Cornelius Osgood (ed.), Linguistic Structures of Native America), lists these as follows: /kʸ/ – prepalatal stop, /k/ – palatal stop, /kʷ/ – labialized palatal stop, /q/ – velar stop, /qʷ/ – labialized velar stop, /x/ – palatal fricative, /xʷ/ – labialized palatal fricative. One may compare a more modern source (Bryant, George and Miller, Amy. Xiipúktan (First of All): Three Views of the Origins of the Quechan People. Cambridge, UK: Open Book Publishers, 2013), where these consonants are described in the following way: /kʸ/ – “like the ky in backyard”, /k/ – “like the k in sky”, /kʷ/ – “the same sound, but made with rounded lips. It sounds like the kw in backward”, /q/ – “a sound similar to k but pronounced farther back in the mouth”, /qʷ/ – “the same sound, but made with rounded lips”, /x/ – “like the ch in German ach, or like Spanish j as in jota”, /xʷ/ – “the same sound, but made with rounded lips”. Halpern followed an outdated Americanist usage, where uvular consonants are called “velar” and velar consonants are called “palatal”. This usage was predominant in the 19th century and in the first half of the 20th century, at least in the Americanist literature I am familiar with. So the PHOIBLE editors fell prey to an outdated terminology. Actually, I suspect that the Indo-Europeanists of the 19th century used the same terminology, which means that M. Kümmel’s interpretation of PIE *k as /q/ and *kʸ as /k/ is not so new after all.

  5. David Marjanović says:

    Quite convincing – it hadn’t occurred to me that the labialized velar plosives imply labialized back fricatives, but they clearly do.

    [3] The situation for palatalized velars seems similar, but the controversy over whether *ḱ was [kʲ], or whether *ḱ *k were perhaps instead [k q], makes this question harder to survey.

    I’m definitely for [kʲ k], for a number of reasons: it’s not as typologically odd as some claim, indeed early IE seems to have been in contact with West Caucasian, which has the same arrangement (in addition to [q]); many claimants of oddness seem to assume that the traditional claim is for actual [c], which is a quite unnecessary assumption; the chain shift would be strange; *k and *kʷ seem to have had a single-feature difference; while [q] is common, [ɢ] is very rare and unstable, and [ɢʱ] seems to be wholly unattested worldwide; and finally, [q], [χ] or [ʔ] would be expected to be common reflexes of *k, but they’re not; a fortiori [ʁ] for *g and *gʰ.

    That leaves us with velar plosives but uvular fricatives (velar ones wouldn’t color so much, among other things). And that’s actually common. Swiss German, Scottish Gaelic and some Mandarin topolects come to mind. In the northern half or so of Germany, most of the ach-Laut is [χ], too; Weiss shows the same for at-least-Luwian.

    This disparity could then explain where the missing palatalized “laryngeals” are: palatalizing a uvular is really hard. And so, Pinault’s law deleted *h₂ and *h₃ immediately before every *j.

    retained until a-coloring?

    I rather think coloring happened immediately and inevitably as soon as a sound system with uvulars and few vowels was established in pre-PIE.

    Eichner’s Law (*ēx > *ē and not **ā) on the other hand still seems to require that a-coloring is usually younger than the rise of lengthened grade.

    Or that it was still active during the rise of the lengthened grade; or that vowel quality was leveled through the paradigm later.

    • M. says:

      A problem with *Kʲ that I have rarely seen pointed out is that it implies a full system of palatalized velars, in all phonotactic environments, but *no* other palatalized consonants in any other articulatory position: there is no evidence for a contrast in PIE between *T : *Tʲ, S : *Sʲ, etc. It seems extremely rare for palatalization to be distributed so selectively.

      For example, even though there are some NW Caucasian languages that have palatalized velars but not alveolars, all such languages that I’m aware of feature a contrast between alveolars, palatoalveolars and alveopalatals (both fricatives and stops) that could either be the result of earlier *Tʲ and *Sʲ, or have served to block the development of such sounds. By contrast, if we accept the traditional reconstruction of PIE, then the only palatal articulation of any sort is the unattested *Kʲ (plus *, if we accept the idea presented in this post) and the palatal glide *j.

      Reconstruction of a full set of uvulars [q, G, Gh] only seems necessary if we assert that satemization was purely the result of a phonemic chain shift (Q > K followed by K > Š or similar). Instead, the process may have been an interaction between a chain shift on the one hand, and a paradigm-leveling process on the other. For example, it may have started with voiceless *q > *k and *k > *š (or some other coronal consonant) before front vowels: this *š could then (via ablaut linkage) have been analogized to back-vowel and preconsonantal environments for most voiceless *k, and some (but not all) instances of voiced and breathy *g could then have been caught up in the same tide. Correct me if I’m wrong, but aren’t there noticeably fewer unambiguous cases of voiced ”plain velars” (both breathy and non-breathy) than of their voiceless counterparts?

      Aside from all the above, the traditional reconstruction of palatalized *Kʲ also implies that the kentum languages experienced a massive, but somehow 100% “clean” depalatalization, with no attested traces of earlier palatal articulation whatsoever remaining in these languages. By contrast, the satem branches show scattered inconsistencies in their reflexes of this consonant series (e.g. Lithuanian klaušyti vs. Russian zaslushivat’). This supports the idea that the palatalization that caused *K > *Š did not exist at the PIE stage, but was instead confined to the satem branches.

      • David Marjanović says:

        It seems extremely rare for palatalization to be distributed so selectively.

        Modern northern/mainstream/standard Greek has [kʲ nʲ lʲ xʲ~ç ʝ] and a rare [gʲ], and no other palatalized or palatal consonants, though [ts] and [dz] exist and the phonemic status of everything is controversial.

        Unfortunately, however, Robert Woodhouse’s 1995 paper on reducing the 3 velar series to 2 doesn’t seem to exist anywhere on teh whole wide intarwebz, so I’m not able to form an opinion on that idea. Woodhouse’s other papers don’t explain it in sufficient detail.

        Correct me if I’m wrong, but aren’t there noticeably fewer unambiguous cases of voiced ”plain velars” (both breathy and non-breathy) than of their voiceless counterparts?

        I don’t know; that sounds like it’s worth looking into.

        Aside from all the above, the traditional reconstruction of palatalized *Kʲ also implies that the kentum languages experienced a massive, but somehow 100% “clean” depalatalization, with no attested traces of earlier palatal articulation whatsoever remaining in these languages.

        Yup. I agree this is unlikely as a native development. However, it makes perfect sense as a substrate effect by almost any substrate. Fitting this, we find the kentum merger in branches known to have thick substrates and to be spoken far from the Urheimat (West IE, Greek, Tocharian, Hittite), while Balto-Slavic and Indo-Iranian, which “stayed at home” (at first), instead increased the distance between the palatalized and the plain series by letting the former drift into the empty space for coronal consonants. Admittedly, I’m reduced to guessing when trying to explain why Hittite should have a stronger net substrate effect than Luwian.

        Exactly the same, BTW, holds for a Q-K merger (unless maybe if a vowel harmony system develops and redistributes velars and uvulars). As a native development, I’d expect Q to either remain or turn into back fricatives, or for [q] to yield [ʔ] (or [ʡ] perhaps, if located close enough to the Caucasus…).

        By contrast, the satem branches show scattered inconsistencies in their reflexes of this consonant series

        I bet these are all due to Weise’s law (the tautosyllabic *KʲR > *KR shift mentioned in the second paragraph of the OP), analogical levelings thereafter, loanwords from Celtic into Balto-Slavic or loanwords from Pre-Germanic into Balto-Slavic. Oh, and then there’s this extinct “Crotonian” branch. Much work on all this remains to be done.

        • j. says:

          However, it makes perfect sense as a substrate effect by almost any substrate. Fitting this, we find the kentum merger in branches known to have thick substrates and to be spoken far from the Urheimat

          This seems to fit poorly in with other big tendencies in phonological simplification: Albanian, Armenian and Iranian have the heaviest cluster reduction going on; most stop-system simplifying branches are satem, not centum; ditto most *a/o-merging ones.

          I am now however reminded that I have seen a proposal according to which initial dental + *y simplifies to a plain dental in Germanic. What if palatovelars first break into — or why not, originally come from — a cluster *Ky, and this glide then soundlawfully drops? Hmm.

          On the count of voiced plain velars, IIRC the LIV data comes out as about as widespread as could be expected. It is clear voiced labiovelars that seem to be the rarest. In particular initial *gʷʰ- only has six appearances. This still includes a few very widespread roots though: *gʷʰer- ‘warm’, *gʷʰen- ‘to strike (dead)’.

          • David Marjanović says:

            Good points. Of course, breaking into *Ky also makes sense as a substrate effect.

            • David Marjanović says:

              Oh. Are any *Ky clusters currently reconstructed for PIE?

              • j. says:

                *ǵyewH- ‘to chew’ and *gʷyeh₃- ‘to live’ are the main contenders seen around. LIV also has *kyex- ‘sieben’, *kyexp- ‘verfaulen’, *kʷyeh₁- ‘ausruhen’, *kʷyew- ‘sich in Bewegung setzen’; from II only: *ǵyeH- ‘berauben’, *ḱyeH- ‘gefrieren’. So not a lot.

                ‘To live’ plus most others could be also from zero grade *KiH(C)- by laryngeal breaking, and could be then rather reconstructed with *KeyH(C)- as the full grade. Back when I was first collecting the LIV data, I commented on how *(C)y- seems to appear very often before back consonants — this is probably a big reason for it. (I by now also suspect that *Ceyw- is probably a valid root structure, similar to how *wy- seems to be valid at least in late PIE; and that *dyēws is reshaped, while it is instead the more widespread *deyw-os that retains the original root structure.)

        • M. says:

          Yup. I agree this is unlikely as a native development. However, it makes perfect sense as a substrate effect by almost any substrate.

          I’m not sure how we can confidently say “almost any” here. Do you know of any specific examples in which palatalized velars that were abundant or reasonably common in the lexicon have been depalatalized without a trace due to substratal effects?

          As for clean [q]/[k] mergers, a possible example (albeit recent, and perhaps mediated by literacy) is Biblical Hebrew to modern Hebrew.

          • David Marjanović says:

            Do you know of any specific examples in which palatalized velars that were abundant or reasonably common in the lexicon have been depalatalized without a trace due to substratal effects?

            No, but palatalized velars aren’t common outside of Russian, Gaelic and West Caucasian… I need to check up on Hausa, which has a lot of second-language speakers.

            Biblical Hebrew to modern Hebrew

            Anything in modern Hebrew is either a substrate effect or a spelling-pronunciation.

    • j. says:

      while [q] is common, [ɢ] is very rare and unstable, and [ɢʱ] seems to be wholly unattested

      Right, I agree that this seems fishy. Some version of this problem comes up in many other PIE stop system revisions too: e.g. Kümmel’s implosives for the *D series work nicely in terms of correspondences, but /ɠ/ is much rarer than /ɓ ɗ/ (this is, as far as I can tell, a substantially stronger tendency than a missing /pʼ/ from an ejective series) and /ɠʷ/ even more so. (/ʛ/ starts being kinda common again, but that’s almost solely because of Mayan languages, which for some reason tend to have a mixed glottalized series, with /tʼ tʃʼ kʼ/ etc. but /ɓ ʛ/.)

      One approach that would solve this is Kloekhorst’s “all-voiceless” reconstruction with /tː t tʼ/ for standard *t *dʰ *d, but that brings all sorts of other problems along.

      It seems extremely rare for palatalization to be distributed so selectively.

      Re-hitting PHOIBLE, palatalization only on velars (or velars + /hʲ/) looks reasonably common actually, though most examples are from Africa or the Americas (Abishira, Bissa, Budu, Cherepon, Furu, Guerrero Amuzgo, Gwandara, Gweno, Hausa, Inor, Irantxe, Kinyarwanda, Lakkia, Mesqan, Nuclear Tsimshian, Pagibete, Saya, Sebat Bet Gurage, Siriono, Western Karaboro, Wichi Lhamtes Nocten, Yanuwa, Yuqui). OTOH I think most of these languages have a postalveolar or pure palatal series that effectively claims the slot of palatalized dentals. At least in Hausa this is a productive phonological rule, even.

      /xʲ/ also seems to be extremely rare for some reason (just 3 examples in the entire database).

      • David Marjanović says:

        /xʲ/ also seems to be extremely rare for some reason

        [xʲ] is almost exactly the same as [ç]; I bet no language on this planet distinguishes these two. How common is /ç/ in PHOIBLE?

        (Same for [ɣʲ] and [ʝ] of course. In fact, I don’t think I’ve ever seen the transcription ɣʲ at all.)

        For that matter, how often does /hʲ/ surface as [ç]? The English /hj/ does for easily half of all speakers.

        • j. says:

          Right, /ç/ is much more common, with 108 examples. Some non-negligible proportion of these should be /ɕ/ though, e.g. Komi, Mandarin, Maxakali.

          • David Marjanović says:

            In the Mandarin prestige accent it’s actually the dorso-palatal sibilant, which seems impossible to represent in the IPA… I’ve heard [ɕ] only from southerners who also don’t retroflex. But I’ve only spent two weeks in China, both in the north.

            (On top of that, young girls go for [sʲ] as a more sociolectal phenomenon.)

  6. Howl says:

    If there was a *xʷ/*χʷ in PIE then I would also expect it to be subject to the boukólos rule. It would really strengthen the case for *xʷ if there are examples of that (*wxʷ → *wx).

    As for typology, how many languages are there with at least 3 back fricatives but only one front fricative (like PIE *s)?

    • j. says:

      I guess old compounds would be expected to have *Cu-a- < *Cu-xe- in Greek instead of *Cu-awe- < *Cu-xʷe-, but since *xʷ was lost pretty early, this rule could not have remained productive into the daughter languages. Outside of Greek, I'm not sure how well this could be identified at all.

      The unusual rarity of front consonants in PIE is actually apparent already from just the stop system: just about all languages that have back stops at ≥ 2 POAs and ≥ 2 phonations also have an affricate such as /tʃ/. PHOIBLE doesn’t seem to allow looking for languages that lack a given sound, but the UPSID results for languages without any affricates are dominated by (1) languages with basic /p t k/, (2) languages with /p t k kp/, and (3) Dravidian ~ Australian-type languages with /p t ʈ c k/ or the like. The closest to PIE I can find is Southern Nambikwara: /p t k kʷ/ × plain/aspirated/ejective and additional /ɓ ɗ/. Next best matches are Hopi, with /p t k kʷ q/, and Yupik, with /p t k kʷ q qʷ/, but no phonation contrasts in either; or Wantoat and Yessan-Mayo, both with roughly /p t k kʷ mb nd ŋg ŋgʷ/. Also, Wikipedia’s inventories for different dialects of Hopi, Nambikwaran and Yupik all have at least one affricate.

      Even just about all IE languages either develop affricates (Satem languages, Proto-Greek, Tocharian, Anatolian) and/or get rid of the labiovelars (Satem langs again, later Greek, P-Celtic and Sabellic, arguably breaking *Kʷ > *Kw in NW Germanic). I think the only definite exceptions are Old Irish, Latin and Gothic. Still widespread enough though that I think this really was a legitimate unusual feature of PIE.

      • Howl says:

        I knew about the general objection against laryngeal theory of having so many back fricatives and only one front fricative. But I did not know it could also be applied to all the consonants in PIE. And that would at least be a strong argument for reducing the 3 velar stop POAs to 2 or 1 in pre-PIE.

        • KathTheDragon says:

          I’d have to agree, it seems very unlikely that the PIE system with all its dorsals is much older than PIE itself (though it goes without saying that whatever the previous system is, it’s not presently accessible to us).

          • Howl says:

            Everything I have seen leads me to believe that the 3 dorsal POAs of PIE existed for a long time in some form. It is possible that pre-PIE had an additional coronal POA that merged in PIE. It is also possible that these 3 sets reflect cheshirization from pre-PIE vowels/glides. I think more research into potential sister families of PIE like Uralic may make the previous system accessible to us. But it is also possible that PIE just had this configuration. Typology often gets overrated.

      • defseg says:

        PHOIBLE doesn’t seem to allow looking for languages that lack a given sound

        So it doesn’t – but I’m writing a thing that does. It isn’t ready for formal release yet, but why not, here’s a preview.

        Languages with only one front fricative but at least three back fricatives:

        1 -sonorant;+continuant;-dorsal >2 -sonorant;+continuant;+dorsal and
        Páez (pbb) s sʲ x ɸʲ βʲ xʲ
        Yuqui (yuq) s ʝ x xʷ xʲ

        Annoyingly, palatalized consonants are all +dorsal in PHOIBLE’s featural model, so we get a spurious result. But Yuqui is interesting…
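
        As a rough illustration of why the Páez hit is spurious, here is a minimal sketch that counts fricatives by primary place only, ignoring secondary palatalization. It is not the preview tool's code or PHOIBLE's feature model: the names FRONT_BASES, BACK_BASES, count_by_primary_place and one_front_many_back are made up for this example, and the front/back split of base symbols is a hand-made assumption. The two fricative lists are copied from the results above.

```python
# Hand-made split of fricative base symbols (an assumption for illustration,
# not PHOIBLE's featural model): labial/coronal = "front", palatal and
# further back = "back".
FRONT_BASES = set("ɸβfvθðszʃʒ")
BACK_BASES = set("çʝxɣχʁħʕh")

# Fricative lists copied from the query results quoted above.
FRICATIVES = {
    "Páez": ["s", "sʲ", "x", "ɸʲ", "βʲ", "xʲ"],
    "Yuqui": ["s", "ʝ", "x", "xʷ", "xʲ"],
}


def count_by_primary_place(segments):
    """Return (front, back) counts, judged only by each segment's base symbol."""
    front = sum(1 for seg in segments if seg[0] in FRONT_BASES)
    back = sum(1 for seg in segments if seg[0] in BACK_BASES)
    return front, back


def one_front_many_back(inventories, min_back=3):
    """Names of inventories with exactly one front and >= min_back back fricatives."""
    hits = []
    for name, segments in inventories.items():
        front, back = count_by_primary_place(segments)
        if front == 1 and back >= min_back:
            hits.append(name)
    return hits


if __name__ == "__main__":
    for name, segments in FRICATIVES.items():
        print(name, count_by_primary_place(segments))
    print(one_front_many_back(FRICATIVES))
```

        Under this classification the palatalized labials and /sʲ/ of Páez count as front, so only Yuqui is left.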

        Languages with uvulars and no affricates:
        any +consonantal;+dorsal;-high;-low no -continuant;+delayed_release and
        Ket (ket) q
        Kunimaipa (kup) ɢ
        Moroccan Arabic (ary) q χʷ χ ʁ ʁʷ qː χː qʷ ʁː
        French (fra) ʀ

        OK, let’s try that again. Languages with uvular non-continuants, no affricates, and no /c/-like consonants:
        any +consonantal;-continuant;+dorsal;-high;-low no -continuant;+delayed_release and no /%c%/ and
        Ket (ket) q
        Kunimaipa (kup) ɢ
        Moroccan Arabic (ary) q qː qʷ
        YUPIK (ess) q qʷ
        HOPI (hop) q
        JAPANESE (jpn) ɴ
        KET (ket) q
        KUNIMAIPA (kup) ɢ
        NIVKH (niv) q qʰ
        TAMASHEQ (thv) q
        sooninke (Senegal) (snk) q qː
        Lebanese Arabic (ayl) q
        Tamazight Berber (tzm) q qʷ
        Tachelhit (shi) q qʷ

        (The percent sign is an undocumented feature, which exists because none of this stuff is sanitized when it’s turned into a query. That’s why it’s slow — sanitization is hard, so for now it’s running sqlite compiled to JS in the browser. There’ll be a backend eventually.)

        Is non-emphatic /t/ affricated in Lebanese Arabic and these Berber languages? Wikipedia says it is in Moroccan Arabic. It also says Yupik, Hopi, Nivkh, and Soninke have affricates. So it might be that the only language in PHOIBLE that really has uvulars and no affricates is Ket.

        How about languages with at least two labiovelar plosives and no affricates? This is a little trickier, since we have to distinguish labiovelars from labial-velars (which are -round — this means PHOIBLE doesn’t distinguish labiovelars from rounded labial-velars at all) and palatalized labials (which are +front), but:

        >1 +consonantal;-continuant;+labial;+dorsal;-front;+round;-nasal;+high no -continuant;+delayed_release and no /%c%/ and
        (23 results - snip)

  7. KathTheDragon says:

    One minor problem with including “wash” here is that it’s *lewh₃-. Proto-Celtic *lowatro- “bath” virtually necessitates it. Both de Vaan and Beekes uphold this reconstruction too, in their respective etymological dictionaries of Latin and Greek. Against this, the connection with Hittite lāhui “to pour” is impossible, so unless new cognates can be found, core IE has nothing to say about *lexʷ- “to pour”.

    • j. says:

      Cases like *lowatro- (~ Latin lavābrum, Greek λοετρόν) strictly speaking only get us up to *lowə-. I think schwa primum is epenthetic and not a direct continuation of laryngeals; it probably existed at least phonetically already in PIE, and should not be mechanically rewritten as *-CH-. A possibility could be something like *lexʷ- > *lexw[ə]- → *loxwətron > *lowətron. This would require a few auxiliary assumptions, such as that *w ~ *u was no longer productive by the time of *xʷ > *Hw, therefore requiring schwa epenthesis; and that *Hw- was a valid onset and therefore did not lead to vowel lengthening, unlike most cases of *-H.C-.

      This second assumption could also have other benefits, such as covering some supposed cases of Pinault’s Law (not *Hy > *y but syllabification as *V.HyV instead of **VH.yV), or allow deriving prothetic vowels in Greek also through *ə.HR- and not just *HəR-.

      • David Marjanović says:

        Where does the long vowel in lavābrum come from?

        • j. says:

          I would presume by analogy from lavāre ‘to wash oneself’, which according to de Vaan is a derivative *lava-ē- (or, derived within Latin itself, since this has wrong voicing in the instrument suffix).

  8. David Marjanović says:

    “close-vowel roots” *CuH-, *CiH-

    I just discovered this paper from 1998 which proposes preconsonantal *eiX > *ī (once preconsonantal laryngeals were lost in individual branches), *ouX > *ū and *oiX > *ō, where *X is *h₂ and *h₃, as well as preconsonantal *eih₁ > *ē. This explains several words currently reconstructed with *iH or *uH and no ablaut as having ablaut after all.

    Before I got to the part where a phonetic mechanism is proposed (it’s a bit complex and involves a short-lived marginal /ç/), I came up with my own (where X for *h₂ and *h₃ should be interpreted as “any uvular consonant”):

    Start from a sound system without long vowels.
    1) Coloring: **/ɛjX/ > “**[ɛɛ̯X]” = **[ɛːX]; **/ɔwX/ > “**[ɔɔ̯X]” = **[ɔːX]; **[ɔjX] > **[ɔɛ̯X]; also **/ɛwX/ > **[ɛɔ̯X], which turns out to be irrelevant. This is very similar to the Old High German monophthongization. The new long vowels are already exempt from further coloring (Eichner’s “law”).
    2) Long mid vowels do what long mid vowels do: **[ɛːX] > **[eːX] > *[iːX], **[ɔːX] > **[oːX] > *[uːX].
    3) New long mid vowels from Szemerényi’s & Stang’s laws, from reduplication + dissimilation (e.g. Narten presents) and eventually from morphologization of the resulting “lengthened grade”. **[ɔɛ̯X] > *[ɔːX] joins them, as already known from all the thematic *-o-e- > *-ō- phenomena.

    Probably this still works if *o was [ɑ] as long as there was no sound between [w] and [ɑ] in the system.
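
    The relative ordering of steps 1 to 3 can be sketched as ordered string rewrites. This is only a toy model under stated assumptions: forms are plain phonetic strings, X stands for the uvular as above, the names STAGES and derive are invented for the example, and the intermediate [eː] and [oː] of step 2 are collapsed into a single rewrite each.

```python
NONSYL = "\u032f"  # combining inverted breve below: marks a non-syllabic vowel

# Each stage is a label plus (old, new) rewrites, applied in the listed order.
# The rules follow steps 1-3 above; everything else here is an assumption.
STAGES = [
    ("1 coloring", [
        ("ɛjX", "ɛːX"),
        ("ɔwX", "ɔːX"),
        ("ɔjX", "ɔɛ" + NONSYL + "X"),
        ("ɛwX", "ɛɔ" + NONSYL + "X"),
    ]),
    # Step 2 in one jump: the intermediate [eː], [oː] stages are skipped.
    ("2 raising", [("ɛːX", "iːX"), ("ɔːX", "uːX")]),
    # Step 3 applies after raising, so its output stays a long mid vowel.
    ("3 new long mid vowels", [("ɔɛ" + NONSYL + "X", "ɔːX")]),
]


def derive(form):
    """Apply the stages in order; return the input plus the form after each stage."""
    history = [form]
    for _label, rules in STAGES:
        for old, new in rules:
            form = form.replace(old, new)
        history.append(form)
    return history


if __name__ == "__main__":
    for start in ["mɔwXs", "ɛjX", "ɔjX"]:
        print(" > ".join(derive(start)))
```

    The point of running the stages strictly in sequence is that the [ɔːX] created in step 3 arrives too late to be raised by step 2, while the [ɔːX] from step 1 does end up as [uːX].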

    This scenario has interesting phonological implications that may not be testable. To pick one of the “[n]umerous” examples from p. 77 of the paper, the nom. sg. and gen. sg. of the “mouse” word would have been *|mowXs-s| = */mowXs/ = *[muːXs] and *|mowXs-ós| = */mwXˈsos/ = *[muXˈsɔs], i.e. the ablaut pattern would still have been phonemically *o ~ *0 in PIE. Alas, both *[uːX] and *[uX] gave *[uː] in every daughter branch, so the ablaut was leveled everywhere. Likewise, the scenario implies that PIE lacked /iː/ and /uː/, because all cases of *[iː] and *[uː] would still have been */ej/ and */ow/ (before */X/ = *[X]) or */j/ and */w/ (lengthening of final vowels in stressed monosyllabic words, e.g. */tw/ = *[tu] ~ *[ˈtuː]) – and this, too, seems untestable.
