A recent blog post from Christopher Culver brings to my attention an apparent family of Turkic word roots showing irregular variation in form: *künäš ~ *qujaš ‘sun, day, heat’. Aside from the alternation *n ~ *j (for which *ń seems to be a standard explanation), these seem to make up a neat pair of front/back variants.

I am wondering however if this relationship might be illusory, and if there might be an old Uralic loanword in Turkic involved here instead. There are a few Uralic word roots (themselves probably in some sort of an obscured correlative relationship) that seem quite relevant here:

  • *kaja ‘sun, to shine’ (> Finnic *kajasta- ‘to dawn, to shine’, Lule Sami guojijdit ‘to rise (of sun or moon)’, Samoyedic *kåjå ‘sun’, etc.)
  • *kojə ‘dawn’ (> Finnic *koi ‘dawn’, Hungarian hajnal ‘dawn’, Mansi *kuj ‘dawn’, etc.)

Of particular interest is the Hungarian word, which seems to show the exact same “suffixal” elements as Turkic. This even has a formal equivalent in Khanty: *kuuńəɬ´ ‘dawn’ (apparently showing a change *jn > *ń, in neat parallel to the change *jt > *ć that was proposed by Aikio recently [1]), coming closer yet to the Proto-Turkic form.

It’s hard to say though what the dangling element -nal is here. It’s neither an independent word root on its own, nor a regular derivational affix. If I had to speculate, a compound *kojə-n‿alŋV > *kojnal- ‘beginning of sun’ could be assembled… but this seems a bit contrived semantically. Also I am not convinced that Khanty *aaLəŋ ~ Mansi *aaɣəl ‘beginning, end, point’ is an inherited root at all. [2]

And while phonetically the Khanty form in particular seems like a prime loan original, the semantics are a bit off. Is the meaning ‘dawn’ in Hungarian and Khanty perhaps secondary, from earlier ‘sun’ or the like? Or was there instead a shift ‘dawn’ > ‘sun’ in some transmission language along the way?

Some Turkologists, I’m sure, could also see it as an obstacle that this etymology seemingly requires adhering to sigmatism (reconstructing a Proto-Turkic “lateral” *l₂ that later shifts to *š in Common Turkic) over lambdaism (reconstructing PT *š that shifts later to *l in Oghur Turkic). Now, yes, from what evidence I’ve seen, I lean towards the view that sigmatism is the better solution [3]… But it is not an entirely inescapable assumption here. Say we instead assumed that early Oghur maintained *l₂ for some time apart from *l (perhaps indeed as a lateral fricative [ɬ]? [4]). Then we could posit an etymological sound substitution to have occurred during propagation to the other Turkic languages: Khanty *kuuńəɬ´ → Oghur #qujal₂ → Common Turkic *qujaš.

Independent loaning to different Turkic varieties might also be chronologically preferable to assuming loaning already to unitary Proto-Turkic. Christopher notes that *qujaš seems to have a kind of northerly-leaning distribution across the Turkic languages… not bad news for an attempted Uralic loan etymology, I’m sure.

[1] Aikio, Ante (2014): Studies in Uralic etymology II: Finnic etymologies. Linguistica Uralica 50:1.
[2] There is, yes, a rather similar word root in Finnic: *alka- ‘to begin’ — but this does not quite correspond regularly to the Ob-Ugric words, esp. on account of the discrepancy between *ŋ and *k. The vowel correspondence Kh *aa ~ Ms *aa is not typical of inherited Uralic vocabulary either.
[3] But note that this does not compel me to take a stance on the similar rhotacism/zetacism debate, nor to consider *l₂ of “Altaic” inheritance.
[4] Which even brings to mind the East Uralic shift *š > *ɬ, rather similar to the shift *š > *l posited by the lambdaist side of the Turkic debate.

5 comments on "Some sunny words"
  1. David Marjanović says:

    In case you’re wondering, it has indeed been proposed that Proto-Turkic *l₂ was [ɬ]. Bizarrely, the only reference for this I know is this paper (pdf), which seems to present the idea as new (obvious, but new). Without taking a stance on the question of whether the chain [l] <> [ɬ] <> [ʃ] must necessarily have begun in the middle rather than at one end (lambdatism) or the other (sigmatism)*, I wonder what – apparently – took the Turkologists so long to propose the very possibility.

    * That, I suppose, can only be decided by comparing Turkic to its closest relatives, whichever those may be.

    • j. says:

      In my understanding the idea that any nontrivial correspondence *A ~ *B might go instead back to *C, a different segment yet — i.e. that there might have been, in a language family, a drift generally away from the state of the proto-language — has not been embraced very widely in historical linguistics. And sure, one has to be very careful in doing that! It’s a bit too easy to generate random spam hypotheses of this sort — *tɬ, *ʒ, *ç, *ɻ? … But on the other hand, yes, I fail to see how reconstructing *l ~ *ʃ as *ɬ would be typologically much worse at all than reconstructing it as something like *ʎ.

      On the other hand, I don’t think there’s too much explicit evidence for specifically a lateral fricative value. For one, I ought to have written *Ľ and not *ɬ´ for the Khanty segment here. It is not clear what its original value was, due to the later merger in Khanty of lateral approximants and lateral fricatives (note to self: I really should finish Part III of my “Laterals and Spirants” post series one of these days); but it usually corresponds to *ľ in Mansi, so that is perhaps a marginally more probable choice than a lateral fricative.

      I also don’t think we necessarily need to know what Turkic is related to to make progress on questions like this, as loanwords from elsewhere are going to be helpful as well. According to a review of the matter by Anna Dybo I’ve seen (I’ll see if I can dig up a link to it later), there are e.g. cases where PT *l₂ corresponds to *ľ in pre-Samoyedic (later > *j, as also in inherited words like PU *lomə > pre-Smy *ľom > PSmy *jom ‘snow’). The lambdaist solution would be to claim loaning from Oghur Turkic specifically — but apparently there is at least one word of this sort that has not even been attested from Chuvash: *bal₂maq ‘shoe’ → pre-Smy *paľma > PSmy *pajmå ‘shoe’. Chronologically, positing already Proto-Oghur to have been at least slightly older than Proto-Samoyedic, and by implication Proto-Turkic older yet, also feels a bit problematic.

      • David Marjanović says:

        there are e.g. cases where PT *l₂ corresponds to *ľ in pre-Samoyedic

        Huh. Maybe that’s part of why the apparently traditional Altaicist solution, from which the Mudrak paper I cited explicitly departs, was that *l₂ and *r₂ were [lʲ] and [rʲ].

        • j. says:

          Hmm. On one hand, it is a relatively recent observation that the default reflex of *l in Samoyedic is *j. A decent part of the early 20th century, prior to the development of decent comparative Samoyedic references, was spent thinking that it would be necessary to reconstruct PU *ľ for the inherited words in this group, despite some handwaving being required for why the affected words have an unpalatalized /l/ in other Uralic languages where it would be expected to remain palatal (e.g. Permic *lɨmɨ ‘snow’). On the other hand, I have no idea if any of these Turkic-Samoyedic loan etymologies were known that far back.

          On the third hand, PU *kalma > PSmy *kålmå ‘death’ seems to demonstrate that there was no palatalization of *l in the sequence *-alm- (a surprisingly specific environment: contrast *kalə- > *kåjə- > *kåə- ‘to die’, *śilmä > *sɪjmä > *səjmä ‘eye’), and so we regardless might have to assume a palatalized lateral in the Turkic original of ‘shoe’?

  2. crculver says:

    Berta rejects any connection of the Turkic word with Hungarian hajnal, though presumably he was arguing against considering it a Turkic loan into Hungarian.

    You might enjoy reading Helimski 1995 (“Samoyedic loans in Turkic: check-list of etymologies”) if you haven’t yet.

