The fate of *w in Altaic

A fairly striking typological commonality among the “micro-Altaic” language groups, Turkic, Mongolic and Tungusic (Tk, Mg, Tg), is the lack of a labial glide such as /w/.

This is clearly out of line both with the world’s languages in general and with Eurasia in particular. /w/ is one of the most common phonemes in the world’s languages, and can usually be found even in languages with seriously impoverished consonant inventories such as Hawaiian (at ZBB we [1] once compiled stats on this thing). Nor is there any shortage of *w in any of the other major language families hanging out nearby: IE, Uralic, Semitic, Dravidian, Sino-Tibetan, Austronesian, Eskaleut, you name it. Even in languages that lack /w/ precisely (Finnic, Slavic, most non-English Germanic…), it has usually not gotten too far off-field and has merely become a more frontal labial continuant such as /β/, /v/, /ʋ/. Yet none of these can be found in Turkic / Mongolic / Tungusic either. This clearly means that any long-range relationship hypotheses like Nostratic, Eurasiatic, Ural-Altaic will need to explain whatever happened to *w in Altaic.

There are two main hypotheses going around that I know of: *w > ∅ versus *w > *b. The former is the stance of some old-school Ural-Altaicists like Räsänen, among Nostraticists apparently Bomhard [2] and I gather also Illič-Svityč. The latter is the stance of, at minimum, Dolgopolsky. (He proposes also *w > ∅ before labial vowels in Turkic. [3])

I think the actual answer is neither of these, and the demise of *w is only post-common Altaic (if such a thing existed at all) — since comparison with Uralic seems to be able to show a fair number of good examples of both developments, yet strongly split according to their distribution. It does not really matter for this purpose if the comparanda are real cognates or loans … but see below for a hypothesis.

In the following, I have stuck to the clearest data, where comparison with Uralic seems, usually on semantic grounds, preferable to, or at least as good as, the proposed Altaic connections. Checking up on the non-EDAL lexicon of the languages would probably also turn up something, but I will leave that for later.

1. Turkic: *w > *b

(1.1) *bāj ‘rich, noble’ ~ Samic-Finnic *wäjä- ‘to be able, have power’, Hung. vív ‘to fight’
(not worse than ~ Mg ‘strong’, Tg ‘many’, Jp ‘to surpass’)

(1.2) *bakɨr ‘copper’ ~ PU #wäśkä ‘(reddish) metal, ? copper’ > Khanty *wăɣ ‘iron’
(rather than ~ Mg ‘patina’, Jp ‘dust’)

(1.3) *balk- ‘to shine’ ~ PU *wëlkəta ‘light, white’
(rather than ~ Mg *mel-, Tg *mial- with no **-k-; Ko *mark- may or may not belong; maybe here also Tg *beli ‘pale’, rather than ~ Mg ‘dark’)

(1.4) *bań ‘fat’ ~ PU *wajə ‘id.’
(rather than ~ Mg ‘churn’, Tg ‘storage’)

(1.5) *bek ‘firm, stable’ ~ Samic-Finnic *waka ‘id.’
(rather than ~ Mg Tg ‘big’)

(1.6) *bejŋi ‘brain’ ~ PU *wajŋə ‘breath, spirit’ > Selkup *kȫŋə ‘brain’
(rather than ~ Mg ‘forehead’)

(1.7) *bij- ‘sharp edge’ ~ Samic-Finnic *wijə- ‘to be sharp’
(rather than ~ Mg ‘to crush’, Tg ‘to mince’)

(1.8) *(b)ōl-, Mg *bol- ‘to become’, Japonic *wər- ‘to be’ ~ Uralic *(w)alə- ‘to be’ > Ob-Ugric ‘to be, to become’

(1.9) *burun ‘nose’ ~ PU *wara ‘mountain’ > Hung. orr ‘nose, †peak’
(rather than ~ Jp Ko ‘beak’)

(1.10) *būt ‘leg’ ~ Samoyedic *utå ‘hand’
(Tg *begdi may or may not belong)

(1.11) *dabul ‘wind’ ~ PU #tɜwlə ‘id.’
(rather than ~ Mg ‘typhus’, Tg ‘to be infected’)

(1.12) *debe ‘camel’ ~ Samoyedic *tëə < ? *tëwə ‘(tame) reindeer’
(rather than ~ Mg *temeɣen; long compared also with isolated Karelian tevana ‘elk cow’ (often mis-cited as Finnish))

(1.13) *sib- ‘to spin thread, pull out fibre’, Tg *sib- ‘id.’ ~ PU *siwə ‘fibre’
(rather than ~ Mg ‘to tuck up’)

I suspect most of these to be loans into Turkic from early Ugric, and in the case of *bōl-, thence into Mongolic. At least #wūta is probably better taken as a loan in the opposite direction, since this is innovative vocabulary replacing PU *kätə (and Samoyedic does not tolerate **wu-). Perhaps likewise for #dewe.

For a few IE parallels, I can moreover mention e.g. Tk *basu ‘hammer’ ~ II *wadźra- ‘hammer, mace’; Tk *ebin ‘grain’ ~ IE *yewo- ‘id.’; Tk *gēb- ‘to chew’, Mg *gebi- ‘id.’, Tg *keb- ‘to bite’ ~ IE *ǵyew- ‘to chew’. Comparison with Japonic would also immediately provide examples for *w > *b. There has been some debate on if *b or *w should be reconstructed for Proto-Japonic, but as far as I gather, *b has been assumed for ease of Altaic comparison, while most of the actual data clearly sides with *w. [4]

*w > *b also has good areal parallels, being found in both the north(west)ern and south(west)ern neighbors of Turkic: on one hand widely in Samoyedic, viz. in Enets, Nganasan, Kamassian and Mator (partly even in Yurats and eastern dialects of Tundra Nenets), on the other, in East Sakan (Khotanese and Tumshuqese).

There is also one notable exception where Turkic seems to have *w > ∅ instead: *öl- ‘to die’ ~ PU *widə- > Hung. *ül- > öl- ‘to kill’. This isolated example could be, however, merely an accidental similarity, esp. since the semantics are off. (‘Die’ and ‘kill’ are close enough concepts, but usually do not interchange without causative / anticausative morphology.) Contrast also ‘nose’, where we seem to have *wu > *bu in Turkic but *wu > *u > o in Hungarian.

All in all, the details may use further fine-tuning, but I think there is good evidence to assume that earlier *w develops into *b in Turkic. Contrary to what I earlier commented on this topic though, it is also easy enough to find equally good-looking cases of Turkic *b ~ Uralic *p (e.g. *bas- ‘to press’ ~ *puńćə- ‘to press, squeeze’, *beliŋ ‘panic’ ~ *pelə- ‘to fear’, *bɨč- ‘to cut’ ~ *päčkä- ‘id.’, *bulun ‘cloud’ ~ *pilw/ŋə ‘id.’), so probably this was still a merger with a pre-existing *b.

2. Mongolic: *w > ∅

Supported by less data, but even fairly tight reins on semantics still allow finding some evidence.

(2.1) *oŋgi ‘hole’, Tg *uŋgV ‘id.’ ~ PU *woŋkə ‘id.’
(rather than ~ Tk ‘to dig’)

(2.2) *ök/g- ‘to give’ ~ PU *wexə- ‘to take somewhere’ > Samoyedic *ü- ‘to drag’
(rather than ~ Tk, Tg ‘to heap up’; maybe here better Tg *bū- ‘to give’?)

(2.3) *udu- ‘to lead’ ~ PU *we/ätä- ‘to pull, lead’, PIE *wedʰ- ‘id.’
(rather than ~ Tk ‘to send’)

(2.4) *usu ‘water’ ~ PU *wetə ‘id.’
(rather than ~ Tk *sɨb)

(2.5) *üdže- ‘to see’, Tg *edže- ‘to understand’ ~ PU *weńćä- ‘to look, watch’ [5]
(rather than ~ Tk ‘to think, understand’; ‘understand’ is surely secondary in both etymological groups, and ‘think’ ~ ‘see’ does not match)

(2.6) *ündü-sü ‘root’ ~ PU *wanča ‘id.’
(possibly suggests that PU *č < *ts or *tU; Tg *ŋǖŋte may or may not belong)

A possible IE parallel that looks like it could have been transmitted thru Uralic: *ös ‘revenge, hate’ ~ II *dwiša- ‘hate’ (→ Permic, Finnic #wiša) (not worse than ~ Tk, Tg ‘bad, evil’). This is not attested in Ugric or Samoyedic, though, unlike all of the above examples.

The different treatment here is possibly however simply due to geography / relative chronology and not due to an actually different native development. Mongolic is a more eastern family, and may have gotten rid of *w already before contact with Uralic or some flavor of Para-Uralic — perhaps still indeed by > *b as per comparison with Turkic. So the correspondence here might indicate that in later loans, *w was substituted as zero.

I have not managed to find any reasonable-looking cases of Mongolic *b ~ Uralic *w (other than ‘to become’, see under Turkic).

The loanword layer interpretation is also supported by the fact that for Tungusic I cannot, on a quick look-around, find any clear etymologies of either type at all (i.e. where comparison with Uralic would be clearly preferable to supposed Altaic origin). You can find some Tg cognates above under both my Turkic and Mongolic comparisons, but they might be loans. I could still add a few word-internal cases suggesting *w > *b, though: *dolba ‘night’ ~ Samoyedic *tålwə ‘dark(ness)’ (no worse than ~ Mg ‘to stay up overnight’); *nebi ‘new’ ~ PIE *new-.
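For anyone who wants to check the split for themselves, the onset correspondences in the two numbered lists above can be tallied mechanically. The following is only a minimal sketch: the names `TURKIC`, `MONGOLIC`, `onset_type` and `tally` are made up for illustration, the forms are copied from the comparisons above with the reconstruction marks dropped, and I have left out the items with an uncertain or missing onset (1.8, 1.10) as well as the word-internal cases (1.11 to 1.13). Since the input is a hand-picked list, the tally merely restates the selection and does not test the hypothesis.

```python
from collections import Counter

# (Altaic-side form, Uralic-side form), copied from the numbered comparisons
# above with the leading reconstruction marks (* and #) dropped.
TURKIC = [
    ("bāj", "wäjä-"), ("bakɨr", "wäśkä"), ("balk-", "wëlkəta"),
    ("bań", "wajə"), ("bek", "waka"), ("bejŋi", "wajŋə"),
    ("bij-", "wijə-"), ("burun", "wara"),
]
MONGOLIC = [
    ("oŋgi", "woŋkə"), ("ök/g-", "wexə-"), ("udu-", "we/ätä-"),
    ("usu", "wetə"), ("üdže-", "weńćä-"), ("ündü-sü", "wanča"),
]


def onset_type(altaic: str, uralic: str) -> str:
    """Label how a Uralic initial w- is matched on the Altaic side."""
    if not uralic.startswith("w"):
        return "no initial w"
    if altaic.startswith("b"):
        return "w ~ b"
    if altaic.startswith("w"):
        return "w ~ w"
    # Simplifying assumption: anything else counts as a zero onset.
    # This holds for the vowel-initial forms listed here, not in general.
    return "w ~ zero"


def tally(pairs):
    """Count the onset correspondence types in a list of form pairs."""
    return Counter(onset_type(a, u) for a, u in pairs)


print("Turkic:  ", dict(tally(TURKIC)))
print("Mongolic:", dict(tally(MONGOLIC)))
```

Adding further candidate pairs to either list, e.g. from the non-EDAL lexicon, would show at a glance whether the distribution stays this lopsided.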

[1] “We” being at least 90% the OP “Nortaneous” (lingblr yeli-renrong); myself probably not more than 5%, and a handful of remaining people suggesting single datapoints.
[2] He does not explicitly say so, and in his book leaves the Altaic column empty in the overview of Nostratic sound correspondences; but the few examples he has of a root with *w- being reflected in Altaic show zero onset.
[3] Well, “with rounding of the adjacent vowel”, but I would not touch any current claims about Proto-Nostratic vowel reconstruction with a ten-foot pole.
[4] As for Korean, the modern language has /w/, but I have the impression it mostly occurs due to vowel breaking or in loanwords from Chinese. I admit knowing very little about Middle or Old Korean though, and hence I am skipping over Korean in this post entirely.
[5] IMO better thus than UEW’s *wića-. Permic *dź ~ Hung. gy clearly proves *ńć, and front-harmonic cognates in these clearly prove *-ä and not *-a. Hung. front-harmonic í is also almost always from *e, not *i. Finnic can be routed as “*weŋ́śä-” > *wejśä- > *viisä-, and for Permic I suspect early *e > *i next to palatals in certain cases.

Posted in Reconstruction
19 comments on “The fate of *w in Altaic”
  1. David Marjanović says:

    Very promising!

    (rather than ~ Mg ‘forehead’)

    Well, the Bavarian dialects use “onto the brain” as a dysphemism for “onto the forehead” often enough that Wiktionary actually claims “forehead” as a meaning of Hirn.

  2. Crom Daba says:

    Interesting. One should investigate Paleosiberian words to see if there are any further cognates obeying this.

    Obligatory anti-EDAL nitpicks:
    – PTr *sib- looks more like *süg-, Chuvash term is possibly cognate with Siberian Turkic sööm ‘quarter’ according to Egorov (perhaps Turkish and Turkmen süyüm are too?) and other forms are Oghuz exclusive.
    – PM *udu(rï)- is from PTr *ud-, perhaps even from causative *uduŕ-
    – PM *usun is possibly analyzable as *u-sun, which would make the comparison a phoneme long.
    – PM *öš/č is loan from CTr *öč (there are no coda *š nor č in native Mongolic)
    – PTn *nebi is probably three etymons packed into one, none of which shows unequivocal *b, and the one meaning ‘new’ is found only in Even.

    For Tungusic, you’ve already mentioned *ŋuiŋte, initial *ŋ reconstructed based on Evenki results (unpredictably to me) in w- in Nanaic and sometimes in Manchu (g- being the regular reflex), so maybe *w could be original in some of these cases.

    I tried fishing for some additional cognates, and PU *wirka ‘noose, snare’ is interesting to compare to PM *huraka ‘lasso, noose, snare, trap’ and *uxurga ‘pole with a noose for catching horses’, two separate etymons of unclear relation.
    (I wonder if there was some sort of regular syncope operating on tetrasyllables; trisyllables are extremely common while tetrasyllables are vanishingly rare. Gotta check if clusters are more common than expected when comparing tri- to disyllables.)

    • j. says:

      *usun ~ *wetə seems already a bit poor due to requiring *s ~ *t, but cutting it further down to just one phoneme would obviously put things beyond any reliability.

      I actually did a brief scan for some other reasonable hypotheses for Tungusic (*g-, *ŋ-), but didn’t get enough together for those either.

      Finnic *virka ‘snare’ has a few suggested loan etymologies from Baltic and Germanic, which look better to me than Uralic derivation (already per disharmony). This still leaves Nenets *ẃārkā ‘lasso’ though.

  3. Among the languages in contact with “Altaic”, at least Tibetan did not have initial w- at a certain stage, see

  4. Howl says:

    When scanning EDAL for Mongolic words with possible PIE *w ~ Mg *b, it’s easy to find examples. These are mostly generic Altaic etymologies. But some work best with Mongolian.

    Mg. *balgu ‘to swallow, to gulp’ ~ PIE *swelk ‘to gulp’ rather than Tg. *bilga ‘throat’

    Mg. *kebi ‘to chew’ ~ PIE *ǵyewh₁ ‘to chew’ ~ Tk. *gēb ‘to chew’ rather than Tg *keb ‘to bite’

    Mg. *baka ‘to wish’ ~ PIE *weḱ ‘to wish, to want’ rather than Tk. *bẹ̄ken ‘to appreciate’

    Mg. *sebe ‘to wave, to sway’ ~ PIE *sweh₁- ‘to sway’ ~ Tk. *sapɨ ‘to wave, to sway, to shake up’

    Mg. *göbi ‘to strike’ ~ PIE *kewh₂ ‘to hit, to strike, to forge’ ~ Tk. *gȫp ‘to hit, to pound’

    • David Marjanović says:

      Unrelatedly, let me express my admiration for your crazy idea. It looks crazy enough to work!!!

      • David Marjanović says:

        “Liver”, though, might not work. The Greek cognate of the Germanic one might not be hépar (“liver”) after all, but liparós (“fatty”), and the Armenian one is just weird…

        • Howl says:

          I got the idea for the liver word from Roland Pooth who consistently reconstructs it as *io/ekʷr. But I haven’t looked any further into this idea of a PIE ɬ, and a lot would need to be done to make it work. Right now I am frustrating myself with the Uralic vowel correspondences.

          • j. says:

            frustrating myself with the Uralic vowel correspondences

            Welcome to the club. IMO the state of the field is still nowhere near a good enough understanding for external comparison.

            FWIW *-l- in Uralic ‘hear’ and ‘be’ is probably a suffix though: cf. Samoyedic *kåw or perhaps *kåwɜ ‘ear’; Finnic *o-ma ~ *o-n ‘be.3PS’, *o-ma ‘own’. Even that in ‘pass by’ could be: cf. #mu or #mow ‘other’, possibly allowing an analysis as #mu-lə- ‘to go elsewhere’ > ‘to be going elsewhere’ > ‘to pass by’.

            • Howl says:

              I don’t know if the state of the field will ever be good enough for external comparison. I get the sense that there is a certain hostility in this field towards external comparison. But right now something like a Historic Phonology of the Uralic Languages v2.0 with all the deltas in Aikio’s papers would be a huge improvement.

              When it comes to the -l-, yes it could be a suffix. For now that explanation is good enough. But it just makes me suspicious that I keep seeing this in the same phonetical environment. And if this is not some sound correspondence, then in many cases I only have one-and-a-half phoneme to compare.

              • j. says:

                Macro-comparison of any kind seems to be getting more popular these days, though, with e.g. Adam Hyllested raising a new generation of Indo-Uralicists at Copenhagen, occasional Leiden IEists giving a nod for Indo-Uralic too, the post-Soviet Moscow School putting more effort into outreach and defending their methodologies, or Martine Robbeets & co. trying to assemble a less overengineered Proto-Altaic.

                Historic Phonology of the Uralic Languages v2.0

                It’s going to be done sooner or later; I’m planning on starting on essentially that myself if no-one’s done it by the time I wrap up my PhD (ETA 5-7 years). Before that I have maybe a dozen papers to write, though.

                — For the *h₃ versus *l case: instead of reconstructing a distinct *ɬ, note that PIE doesn’t allow root-final *-Cl-, so maybe worth considering is *l > *ɫ > *ɣ / R_.

                • Howl says:

                  I do have to say your information at is awesome. The consonant table and the vowel correspondence tables are a big help.

                  I didn’t know Hyllested had an Indo-Uralic program at Copenhagen. I do hope Hyllested teaches his students better Uralic than what he used in his paper ‘Internal Reconstruction versus External Comparison’.

                  I am fully an amateur. I’ll probably lose interest in this before I would consider a career-change. And such a thing as a career-change would be very difficult in my situation anyway. But if I am still busy with this next year, I might give the Leiden summer school a chance. It’s just 140km from where I live.

                • j. says:

                  Belatedly: no, Hyllested is not running an Indo-Uralic program, but I am seeing Indo-Europeanist Master’s theses and such coming out with an Indo-Uralic bent.

          • David Marjanović says:

            A lot would indeed need to be done, and I hope someone does it. :-) However, the *h₁ in the PIE “liver” word is needed to explain the Greek h.

          • David Marjanović says:

            Seeing as this thread has now been linked to from a discussion that has branched out to Uralic definite vs. indefinite conjugations, I recommend on that topic a part of Kirill Babaev’s long paper here, which is summarized in English here. Is the presentation of the Hungarian, Selkup and Unspecified Permic situations correct?

            • j. says:

              (Link to the discussion.)

              Doesn’t sound very promising so far…
              – 1PS definite -k in Northern Selkup comes from Proto-Selkup *-ŋ.
              – While Permic does not have -g as a general 1PS ending, *og is used as the 1PS form for the negative verb in both languages. Udmurt however uses this also for 3PS and 3PP in the present, and these have apparently been suggested as being the more original function. In any case though, all medial post-tonic and word-final original single stops are lost in Permic, so there should not be any possibility for this *-g to come directly from a PU *-k. If originally specifically a suffix on the negative verb *e-, it would require *-ŋkV. (Permic is just about the most morphologically innovative and phonologically worn-down branch of Uralic, and should mostly not be appealed to at face value.)
              – Hungarian -k seems unlikely to be from a plain *-k: the cases of 2PS -d and the word ideg ‘nerve’ (~ Finnic *jändek, Mansi *jääntəɣ/ŋ, Khanty *jöntəɣ ‘sinew’) suggest that final single stops end up as voiced.

              Given not too many assumptions, some of these could be maybe still united as coming from a common heavier ending such as *-kkV or *-ŋkV, but in this case one would like to know what was the function of this exactly. Mostly they regardless look like a grab-bag of secondary suffixes.

  5. Blasius B. Blasebalg says:

    Interestingly, some but apparently not most of the “Altaic” daughter languages have refilled the gap, e.g. Turkmen and Bashkir have developed /w/ from certain vocalic auslauts, Western Oghuz has /v/ from final /b/, (some) instances of intervocalic g, k have developed into /v/ in Chuvash and /ɰ/ in standard Turkish. w also occurs in Khalkha, Nanai, Manchu …

    This confirms that a situation without either w or v (or ɰ) is _somewhat_ unstable. On the other hand, the distribution in modern languages, somewhat restrained at least in Turkic, helps not to overestimate that effect.

    Besides, one of the peculiar features of the (micro-)Altaic languages is two-dimensional vowel harmony, including rounding harmony. Two-dimensional harmony seems extremely rare elsewhere (I am only aware of Hungarian as an example). I wonder, might the disappearance of /w/ and the development of rounding harmony have been mutually reinforcing?

    I do not necessarily assume a causal relationship in one way or other. However, combinations like /wi/ and /we/ may prove a roadblock for establishing rounding harmony, as this keeps a rounded element in unrounded morphemes. Other languages might assimilate that sequence to /wü/ or /wø/, but that would destroy or prevent palatal harmony with other syllables in the same word (a similar argument can be made for ATR harmony). So the only way to end up with rounding harmony _and_ one other harmonic feature is to get rid of /w/ at least in unrounded contexts. On the other hand, once rounding harmony is established, there is a strong motivation to avoid (or remove) additional rounded elements like /w/ in [-rounded] words.

  6. […] original **w being one source of *b was recently proposed at Freelance reconstruction, but to me a more attractive proposition is that most of it derives from earlier *m-, which is […]
