12 + 1 old Indo-European loan etymology sketches

Most of the following are not-fully-polished thinking-out-loud analyses. Feel free to point out any inconsistencies, unadmitted weaknesses, and other general plotholes that you may spot.

1. peni

No clear Proto-Uralic root for ‘dog’ is known. We instead have one eastern and one western candidate: Ugric #ämpɜ on one hand (though close /e/ in Hungarian ëb raises suspicions about whether the words involved are really a common inheritance after all), Finno-Permic *penä(j) on the other. Samoyedic has yet a third root, *wën, but this has been explained as an early loan from Tocharian.

The Finno-Permic root has been often incorrectly reconstructed as *penə (UEW: *pene); but Samic *peanëk and Mordvinic *pińɜ both indicate *penä, while the Finnic *i-stem *peni- (not **pene-!) can derive from either earlier *penə-j- or *penä-j- equally well.

An IE loan origin can be suggested for this as well. Getting from the usual PIE word *ḱwō : *ḱun- to the Uralic form may seem difficult, for one because the substitution *Kw → *p does not really have credible parallels (while examples with something like *Kʷe- → *ko-, *ku-, *kü- are better attested). We can however find secondary /p/ developing in a suitably close-by branch: Central Iranian [1], where *ḱw > *ćw (> ? *cβ) > *sp.

The front vowel in Uralic creates some problems. If I were Jorma Koivulehto, this would be my cue to propose an alternate *e-grade protoform for Indo-Iranian and to date the common Indo-Iranian sound change *e > *a to at least this late, a manoeuvre that he has previously used to account for some other II loanwords as well. Or, in principle, another option would be to assume an intermediate dialect group of Indo-European, featuring a mix of Iranian and more archaic features. [2]

These are not especially parsimonious lines of approach, though. Instead, I have begun to suspect that not all such “e-loans” are archaisms retaining PIE *e at all. They seem to be disproportionally western in distribution, contrary to what we’d expect from ancient loans acquired before *e > *a in II (at least if we still wanted to hold some later loans as essentially Proto-Uralic — though this is perhaps not warranted either).

An explanation could perhaps lie inside Uralic. One of the more heavily Iranian-influenced branches of Uralic is Permic. Here, PU *e and, under some conditions, PU *a happen to have the same reflex: *o (thus Komi /pon/ ‘dog’, but also e.g. *aśkəl > /vośkol/ ‘step’). Most accounts have assumed that the trajectory of the development *e > *o was straight backwards drift, something along the lines of *e > *ö > *ȯ > *o. It however seems difficult to find any precedents at all for an unconditional labialization *e > *ö (even if the later steps seem plausible). I therefore wonder if this was rather a centralization development along the lines of *e > *ə̈ > *ɜ > *a, which would then have been followed by a general shift *a > *o, as a part of the late Proto-Permic back chainshift (where also *o > *u, *u > *ʉ > /ɨ/). And then, perhaps, pre-Permic Indo-Iranian loanwords with *a could have been by default nativized with *e in more western Uralic dialects: e.g. Iranian *spān- (accusative stem) → pre-Permic *panV → western Uralic *penä?
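To make the ordering of the proposed Permic changes explicit, here is a toy sketch in Python. It is only an illustration under loud simplifications: it looks at first-syllable vowels alone, collapses the intermediate steps of the centralization into a single *e > *a, and ignores the conditions under which PU *a actually yields *o. The function name `permic_vowel` is made up for this example.

```python
# Toy model only: first-syllable vowels, ignoring the conditions on PU *a.
# Stage 1 is the hypothetical centralization *e > ... > *a suggested above.
CENTRALIZATION = str.maketrans({"e": "a"})
# Stage 2 is the back chain shift. translate() maps each vowel exactly once,
# so the output of *a > *o is not fed into *o > *u.
CHAIN_SHIFT = str.maketrans({"a": "o", "o": "u", "u": "ɨ"})


def permic_vowel(pu_vowel: str) -> str:
    """Return the modelled Permic reflex of a PU first-syllable vowel."""
    return pu_vowel.translate(CENTRALIZATION).translate(CHAIN_SHIFT)


if __name__ == "__main__":
    for vowel in "eaou":
        print(f"*{vowel} > {permic_vowel(vowel)}")
```

The point of the sketch is simply that *e and *a end up merged as *o only if the centralization precedes the chain shift.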

Even disharmonic *pana → *pena could be an option. As noted, in Finnic we only find the *j-derivative *penVj > *penej > *peni ‘dog’ (SSA mentions Savonian pena ‘brat’, but due to narrow distribution this seems more likely to be a late descriptive backformation than the original root); while Samic, Mordvinic and Mari fail to show the loanword-introduced distinction between *e-a and *e-ä.

Accounting for Hungarian fene ‘wild’, which in the past has occasionally been considered a reflex that has semantically drifted out of sync, seems more difficult under this scenario. I would be content to leave it out of this etymology.

2. kero

I’ve identified another new “e-loan” candidate as well. This is the root traditionally reconstructed as PU (PFP) *kerV ‘throat’, reflected in e.g. Finnish kero, Estonian kõri, Samic *kërës, Permic *gor. However, resemblance with PIE *gʷel- ‘throat’ is unavoidable, even more so once we factor in early Indo-Iranian sound changes to reach *ger-.

As also in a couple of other cases [3], the “sporadic” initial voiced stop in Permic appears to simply continue the initial voiced stop on the IE side. It follows that loaning into unitary Proto-Finno-Permic cannot be assumed: we’re probably rather dealing with separate loaning in Permic and Samic/Finnic. Perhaps then again in the latter through the former? The Finnic and Samic words seem to each point to different stem shapes too, namely preF *kera- vs. preS *kerəs — the latter retaining the characteristic IE masculine nominative singular ending, the former showing disharmony characteristic of loanwords. This would go well with a late date of the word’s introduction too.

3. äimä

A Proto-Uralic word *äjmä ‘needle’ has been supposed for long, with reflexes in several branches (Samic, Finnic, Mari, Permic & Samoyedic). There are some reasons to be suspicious of this reconstruction, though, despite the seemingly perfect match between e.g. Finnic äimä (only attested in Finnish + Karelian) and Samoyedic *äjmä.

Firstly, this word constitutes one of the exceptions to *ä-backing in Finnic, as recently identified. An initial suggestion (Kallio 2012, Zhivlov 2014) has been that the change was blocked before syllable-final *j. The other relatively clear example of this (*päivä < ? *päjwä ‘sun, day’) has been suspected of being a possible derivative from a root of the shape *päjə, though, and I’ve proposed reconstructing original trisyllabic *päjəwä. The third example that could perhaps show blocking before coda *j is PF *äjjä ‘big; grandfather’ (with cognates only in Samic + Komi), but this can also be suspected to be secondary. Nowhere else is there evidence for geminate *-jj- in Proto-Uralic; moreover, the term’s distaff counterpart, PF *ämmä ‘grandmother’, seems to be derived from PF/PU *emä ‘mother’ by some kind of iconic intensifying gemination. [4] This could have been the case for *äjjä as well. Perhaps its pre-Finnic ancestor had only plain *-j-, and maybe also different vocalism altogether.

Since the evidence for this alleged exception development is starting to look questionable, it’s worth considering if the reasons for the absence of *ä-backing in äimä could lie elsewhere as well. Since the root has a medial consonant cluster, a phonetically natural explanation would be to trace this, too, back to an earlier derivative *äj-mä < *äjə-mä.

A second reason to suspect that PU *äjmä might not have been a basic word root is that the PU cluster *-jm- also seems to be otherwise unattested in primary word roots! Most examples are clear derivatives in *-mA; e.g. *kojma ‘man’ (in P, H, Ms + Selkup) ← *kojə ‘male’; *wajma ‘heart, spirit’ (in F, Mo) ← *wajŋə ‘breath, spirit’; alleged *kejmä ‘lust’ (in S, F, P) ← *kixə- ‘to rut’ (and thus better: *kixəmä); alleged *śajma ‘manger’ (in F, Mo) ← *sewə- ‘to eat’ (and thus better: *sewəmä).

Thirdly, a derivative analysis actually also makes good semantic sense. *äjmä is one of the clearest-reconstructible Proto-Uralic tool terms, and the suffix *-mA is regularly used to form instrumentals in Finnic (as *-in : *-imE-), with occasional cognates in or close to this function also elsewhere in Uralic (e.g. Mordvinic *kundamə ‘handle’; Tundra Nenets /sædoʔmā/ ‘thread’).

Altogether I therefore find it quite likely that the PU term for ‘needle’ was originally a derivative, and should perhaps be amended to *äjəmä. The basic root **äjə- does not appear to otherwise survive, but this analysis suggests a meaning such as ‘to pierce, (to be) sharp’.

— Unexpectedly, this exercise in internal reconstruction has now brought us quite close to the PIE root for ‘sharp’: *h₂aḱ-. The sound correspondences (*h₂ ~ ∅, *a ~ *ä, *ḱ ~ *j) do not suggest loaning directly from PIE, but Indo-Iranian *Hać- would make a more promising candidate for this (compare PIE *h₂aǵ- > PII *Hadź- → PU *aja- ‘to drive’).

One issue remains: we would expect PU to have rather substituted Indo-Iranian *ć by its own voiceless palatals, *ć or *ś (as also in previously known loanwords like *śëta ‘100’; *waśara ‘hammer’). Phonotactics may have interfered, though. There are almost no examples in widespread Uralic vocabulary of *-ć- or *-ś- as a single word-medial consonant; I only know of one truly good example (*kośəw or *kośəkV ‘long’), while most other cases that have been posited can be suspected to be instead from a cluster *-ńć-, from a geminate *-ćć-, or to be post-PU areal vocabulary. Perhaps this fact can have motivated a substitution *-ć- → *-j-.

4. kangertaa

Earlier this year, in a talk (slides in Finnish) at the XLIII Kielitieteen päivät conference, I introduced a new model of the *ë/*ï split in Eastern Uralic. To summarize in brief, earlier research has supposed three essentially unrelated splits:

  • PU *ë > Samoyedic *ë in closed syllables, *ï in open ones (thus Janhunen)
  • PU *ë > Khanty *ïï, from which by the Khanty “ablaut” > *aa in several words (thus Steinitz); or, *aa by default and *ïï as an unexplained exception development (thus Sammallahti)
  • PU *ë > Hungarian i or a, with unclear conditioning (possibly initially *a, with i as a back-development in palatal environment)

My suggestion is that all three are in fact related, and conditioned by the original stem type:

  • PU *ë-a > Smy. *ï ~ Kh. *ïï ~ Hu. a (e.g. *ïlə- ~ *ïïL- ~ al- ‘under’ < PU *ëla, cf. Fi. ala)
  • PU *ë-ə > Smy. *ë ~ Kh. *aa ~ Hu. i (e.g. *ńëj ~ *ńaal ~ nyíl ‘arrow’ < PU *ńëlə, cf. Fi. nuoli)

(A few facets of this model I have already mentioned in some earlier blog posts.)
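The stem-type conditioning in the two bullets above amounts to a small lookup table, which can be sketched as follows. This is only an illustration: the function name `e_back_reflexes` is invented here, the stem type is read off the final vowel of the reconstructed root, and nothing beyond the two listed stem types is covered.

```python
import unicodedata

# Reflexes of first-syllable PU *ë by stem vowel, as listed in the two bullets.
REFLEXES = {
    "a": {"Samoyedic": "*ï", "Khanty": "*ïï", "Hungarian": "a"},
    "\u0259": {"Samoyedic": "*ë", "Khanty": "*aa", "Hungarian": "i"},  # ə-stems
}


def e_back_reflexes(pu_root: str) -> dict:
    """Predict the reflexes of *ë from the final (stem) vowel of a PU root."""
    # Normalize so that precomposed and decomposed diacritics compare equal.
    root = unicodedata.normalize("NFC", pu_root).lstrip("*")
    if "\u00eb" not in root:  # the root must contain ë
        raise ValueError(f"no *ë in {pu_root!r}")
    stem_vowel = root[-1]
    if stem_vowel not in REFLEXES:
        raise ValueError(f"stem vowel {stem_vowel!r} is not covered by this sketch")
    return REFLEXES[stem_vowel]


if __name__ == "__main__":
    for root in ("*ëla", "*ńëlə"):
        print(root, e_back_reflexes(root))
```

For *ëla the lookup returns the *a-stem row (Hungarian a), and for *ńëlə the *ə-stem row (Hungarian i), matching the examples in the bullets.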

The conditioning appears to have later been blurred by Indo-European loanwords, which rather point to a development *ë-a > Kh. *aa. Four examples of this correspondence are known from earlier research:

  • alleged PU *śëta > Kh. *saat ‘100’ (cf. Fi. sata)
    ← Indo-Iranian *ćata-
  • alleged PU *śëlka(w) > Kh. *saaɣəL ‘pole’ (cf. Fi. salko)
    ← (pre-)Balto-Slavic *dźalga-
  • alleged PU *kënta(w) > Kh. *kaant ‘foundation for a storehouse on a post’ (cf. Fi. kanta ‘basis’, kanto ‘tree stump’)
    ← Indo-Iranian *skandʰa-
  • alleged PU *pëŋka > Kh. *paaŋk ‘fly agaric’ (cf. Smy. *pëŋkå- ‘to get drunk’)
    ← PIE *(s)pongo- ‘mushroom’; or Indo-Iranian *bʰanga- ‘hemp, ? intoxicant plant’ (only in Indo-Aryan)

I propose that all of these have simply been borrowed late enough to escape the *ë/*ï split in native vocabulary. They do not even seem to point to common East Uralic *ë: in Hungarian we find száz ‘100’ (not ˣszíz), and szálka ‘splinter’, szálfa ‘log’ (not ˣszílka, ˣszílfa).

A fifth case can be added to the tally. A recent etymological comparison from Aikio [5] connects Finnic *kangërta-, Samic *kōŋkërtē- ‘to crawl, move with difficulty’ with the long-known Ugric verb root *këŋkV-. We see here quite similar vowel correspondences as above: in particular, long á in Hungarian hág ‘to step (up on)’, *ëë in Mansi *këëŋk- ‘to climb’. In Western Khanty we find an “u-ablauted” reflex *xooŋx- ‘to climb’ (possibly < PKh *kɔɔŋk- ← ? #kaaŋku-), while Far Eastern /kɑŋət-/ and Western *xaaŋteep ‘stairs, ladder’ point to a stem variant *kaaŋt- (presumably < earlier *kaaŋk-t-). This time the West Uralic cognates do not require an earlier *a-stem, but they also do not necessarily speak against it. While *-ər- is a rather rare verbal derivational suffix, a well-attested precedent is *pu(ń)ća- (> Samic *počē- ‘to squeeze’ etc.) → *puć-ər- ‘id.’ (> Fi. pusertaa, Hu. facsar ‘id.’ etc.).

The various Uralic words appear likely to derive from the IE verb root *ǵʰengʰ- ‘to step’. Hungarian and the Khanty words for ‘stairs’ would remain semantically the most archaic, with ‘to climb’ developing as a later meaning (whether within Uralic or in some loangiving IE variety is not obvious), ‘to crawl’ probably even later. To account for the lack of satemization, we would need to reckon with very early loaning from just about PIE; or, as seems a tad more likely to me, secondary diffusion to Ugric through early West Uralic and pre-Germanic.

UEW’s hesitant comparison of Komi /kaj-/ ‘to climb’ with this word group does not seem to be really feasible.

5. ilo

Finnic *ilo ‘joy, mirth’ has no accepted etymology. A few Samic counterparts are known, but these are limited to the central dialects, and can be easily analyzed as loans from Finnic. Possibly in more than one layer though; forms pointing to Proto-Samic *ë < *ɪ and showing a more divergent meaning, such as Pite âllo ‘inclination’, can plausibly have been earlier loans than forms retaining /i/, such as North illu ‘joy’.

Since the word has word-initial *i-, it’s possible to ask if this might go back to earlier *je-, as I’ve proposed to be the case for several other words in Finnic as well. This seems to allow finding a promising loan original in Indo-European: the root *ǵelh₂- ‘to laugh’. IE *ǵ⁽ʰ⁾ → Uralic *j is well enough attested in some early loanwords of both Indo-Iranian and Balto-Slavic origin. This particular root does not happen to be reflected in either branch, but perhaps the next best thing is still available, namely Armenian. [6] We are not limited to bare root comparison, either: it appears possible to match the ending in the derived noun *ǵélh₂-ōs ‘laughter’ (> Greek γέλως, ? Armenian ծաղր) with *-o in Finnic.

Another Finnic noun, *ilka ‘tease, (mean) trick, practical joke’ could be perhaps analyzed as a parallel loanword from this PIE root. This would then involve a seemingly more archaic sound substitution *h₂ → *k, though I’m sure this and *h₂ → ∅ can have coexisted for a while (compare etymology #10 below). On the other hand, the older explanation as some kind of a backformation from *ilkëda ‘bad, mean’ (of Germanic origin) remains entirely feasible as well, and perhaps semantically preferable. It also looks phonologically more straightforward, since in an old enough loanword an ä-stem **jelkä > **ilkä would be more expected than a disharmonic a-stem.

6. keev

One of the more obscure Finno-Samic etymological comparisons, though still well captured by the usual major sources, is an animal husbandry term surviving only in Livonian and Eastern Samic: Liv. keev ‘mare’ (borrowed also into Latvian: ķēve) ~ Inari Sami kiäváš, Skolt ǩiõvv etc. ‘reindeer cow’ (< PS *kēvë). The traditional reconstruction has been *keewe. Following the abandonment of vowel length in pre-Finnic reconstruction stages, this probably needs to be amended to *käwə, with lengthening *ä > *ää > *ee due to Lehtinen’s Law in Finnic (and as business as usual in Samic).

This adds up to an interestingly symmetric behavior of low vowel + glide roots in Finnic: “homorganic” *-äjə, *-awə apparently remain unaffected (as in Fi. täi ‘louse’, savi ‘clay’), while “heterorganic” *-äwə, *-ajə are lengthened.
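The symmetry described above can be stated as a one-line rule, sketched here in Python. This is a hedged illustration of the generalization in this paragraph only, not a full statement of Lehtinen’s Law; the function name `is_lengthened` and the frontness classes (*ä and *j front, *a and *w back) are assumptions made for the example.

```python
import unicodedata

# Frontness classes assumed for this sketch: *ä and *j front, *a and *w back.
LOW_VOWELS = {"\u00e4": "front", "a": "back"}  # ä, a
GLIDES = {"j": "front", "w": "back"}


def is_lengthened(vowel: str, glide: str) -> bool:
    """True if a low vowel + glide root is 'heterorganic' and thus lengthened."""
    v = unicodedata.normalize("NFC", vowel)
    if v not in LOW_VOWELS or glide not in GLIDES:
        raise ValueError("sketch only covers *ä/*a followed by *j/*w")
    return LOW_VOWELS[v] != GLIDES[glide]


if __name__ == "__main__":
    for v, g in [("ä", "j"), ("a", "w"), ("ä", "w"), ("a", "j")]:
        print(f"*-{v}{g}ə lengthened: {is_lengthened(v, g)}")
```

Under these assumptions *-äjə and *-awə come out as unaffected, while *-äwə (as in *käwə) and *-ajə come out as lengthened.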

One other example of *-äw- is known too though, without lengthening, and it’s even a perfect minimal pair: *käü- ‘to go, walk’ (~ frequentative *käv-ele-), suggesting likewise earlier *käwə-. However, as this is nowadays normally considered a Germanic loanword (← *skēwjan-) [7], it could be assumed to have arrived only after inherited *-äwə- >> *-eewe-. Despite some searching, I know no clear examples of vowel lengthening due to LL among the Baltic and Germanic loanwords in Finnic. (It ranks as one of the earliest Finnic sound changes also in relative chronology, and I would presume it has taken place already during the initial dialect diversification of West Uralic, somewhere around the upper Volga watershed.)

Back to *käwə: as a cultural term with narrow distribution, loan origin is likely already a priori. And indeed, at this point, resemblance to Indo-Iranian starts again being apparent: cf. *gāwš ‘cow’ (< PIE *gʷōw-). The meaning ‘mare’ in Livonian is a little bit off, but surely no more of an issue than e.g. the long-accepted comparison Finnic *lehmä ‘cow’ ~ Mordvinic *ľišmɜ ‘horse’. We also know of at least one precedent of an II loanword from the same semantic field: the common western Uralic words for ‘reindeer’ (approx. *počaw, if we wanted to set up a single proto-form [8]) derive from PII *paću ‘cattle’ (< PIE *peḱu-).

It is not clear to me if *ā → *ä should be cause for worry. The typical frontness/backness development across Iranian appears to be for *a to front vs. *ā to back (including in Ossetian, which suggests that this split has taken root early). However, loaning from the oblique stem *gaw- would be possible as well.

7. seaibi

The common Samic word for ‘tail’ is reconstructed as *seajpē. For pre-Samic (≈ proto-West Uralic), *sejpä or *šejpä would be implied. The word sports an unusual medial cluster *-jp- and has no reliable cognates elsewhere in Uralic; it can be easily suspected to be a loanword.

Indo-Iranian again offers a good loan original candidate. Indeed, several of them… Late Avestan xšuuaēpā-, Sanskrit śepa- and Prakrit cheppā- (all ‘tail’) fail to point to any clear common proto-form (though some ad hoc cluster could surely be set up [9]). They all regardless suggest, at minimum, the same consonant skeleton *S-jp- as in Samic, which seems a bit too good to be a complete coincidence.

As we’re again dealing with an “e-loan”, but now without Permic cognates, initially the explanation options would seem to be positing early loaning (which however seems unlikely per inner-II irregularities), or à la Koivulehto, late retention of *e. However, the II diphthong *ai likely could have later developed separately to a form close enough to *ej. Indeed, *ai monophthongizes in most (if not all?) later Iranian languages, even though per Avestan and Old Persian we know this development to have been firmly post-Proto-Iranian.

8. oksi

Attempts at reconstructing a PU word for ‘bear’ are most likely futile, due to ubiquitous taboo circumlocutions being used for the animal even by several groups of modern-day Uralic speakers. In the southwesternmost branches, Finnic and Mordvinic, one common root is identifiable though: *oktə, giving F. *okci / *oht-o (> standard Fi. hypercorrect otso) and Mo. *ovtə (? *oftə).

PIE *h₂r̥tḱos ‘bear’ may at first glance look quite far-removed from this. Factor in laryngeal loss and *tK-metathesis though, to reach *r̥ḱtos: rather closer already. A three-consonant cluster **-rkt- could not have been retained in early Uralic, so substitution as simply *-kt- seems possible. Initial *o could represent a variety of histories — e.g. direct substitution for syllabic *r̥, an early IE dialectal feature (cf. Latin ursus?), or even a word-initial development *a- > *o- in Uralic.

Unexpected retention of *o in Mordvinic (compare e.g. *oksə-nta- > *uksnə- ‘to vomit’) might also receive an explanation through this etymology. Aikio (2013) (see again footnote 5) reports one apparent environment where the development *o > *o is regular: before *ŋ, as in e.g. *joŋsə > *joŋs ‘bow’, *poŋə > *poŋ(gə) ‘bosom’. This could be further generalized to the environment before a velar sonorant: *o > *o appears to be regular also before *w (*powa > *pov ‘knob’, *śawə > *śowa > *śovə-ń ‘clay’); and even before *lk (*olkə > *olgə ‘straw’, *ńolkə > *nolgə ‘snot, slime’), where *l may have been at the time realized as *[ɫ]. If so, then perhaps an early pre-Mordvinic *orktə was similarly realized with [rˠ], which could have triggered *o > *o, before the full nativization of the root as *oktə?
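The retention environments listed above can be collected into a small predictor, which also shows why *oktə is unexpected. This is a rough sketch under stated assumptions: the function name `mordvinic_o` is invented, the default development is taken to be *o > *u as in the ‘to vomit’ example, and only the three environments named in this paragraph (before *ŋ, *w and *lk) count as retaining.

```python
import re

# Retention environments named above: before *ŋ, before *w, and before *lk.
RETAINING = re.compile("o(?:\u014b|w|lk)")  # ŋ is U+014B


def mordvinic_o(pu_root: str) -> str:
    """Predict the Mordvinic reflex of *o in a reconstructed root."""
    root = pu_root.lstrip("*")
    if "o" not in root:
        raise ValueError(f"no *o in {pu_root!r}")
    # Assumption of this sketch: *o > *u unless a retaining environment follows.
    return "o" if RETAINING.search(root) else "u"


if __name__ == "__main__":
    for root in ("*joŋsə", "*powa", "*olkə", "*oksə", "*oktə"):
        print(root, "->", mordvinic_o(root))
```

The sketch predicts retained *o for the ‘bow’, ‘knob’ and ‘straw’ roots, but *u for *oktə, which is exactly the mismatch with attested Mordvinic *ovtə that the suggested [rˠ] stage is meant to explain.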

This is all fairly complicated though, and other explanations are surely possible: e.g. that by the time of loaning, PU *u had already been reduced to [ʊ] in pre-Mordvinic; and *[ʊr] was then used as a substitute for Indo-European *r̥. Assuming that epenthesis to [ər] had already taken place in the latter would help too.

This time, loaning from Indo-Iranian seems to be out of the question, since I gather that nowadays the prevailing analysis is that Sanskrit kṣ in ṛ́kṣa- ‘bear’ does not result from metathesis, but from (hypercorrect?) dissimilation from *tś < *tć < *tḱ. This seems to be confirmed by how Prakrits have riccha ~ accha with cch, rather than expected kkh < *kṣ.

It may be somewhat of an issue that direct descendants of *h₂r̥tḱos have not been attested from our next most likely loangivers: Balto-Slavic and Germanic. However, as their attested words for ‘bear’ are analyzable as taboo circumlocutions as well (“brown one”, “honey-eater” etc.), it is probably reasonable to assume that the older word was still around as well up until some point, instead of self-destructing as soon as PIE split into dialects. The Finnic word later shows a rather similar history: *okci has been mostly eclipsed by its substitute *karhu (which has later been still felt strong enough to require circumlocution), and it only survives as diminutives in Finnish and Estonian; in some place names; and in Livonian okš.

Or indeed: we would seem to have little reason to assume *oktə having been the earlier main term for ‘bear’ on the Uralic side. It could also have spent its history mostly as a circumlocution term, and risen to a new neutral term only in Mordvinic and Livonian separately.

9. xaws

Northern Mansi /χɑws/ ‘ash-gray’ ~ Southern Khanty /χɑ̆wəs/ ‘gray-haired’ is a part of the common Ob-Ugric lexicon with no known Uralic or Ugric origin. There are also phonological reasons to assume that this is indeed an innovation: Southern Khanty word-medial /-w-/ in a back-vocalic environment is highly rare.

If you’ll bear with me for another historical phonology tangent: the canonical analysis by Steinitz is that no Proto-Khanty medial **-w- is to be reconstructed at all, and that medial *-ɣ- developed in Western (= Southern + Northern) Khanty to /-w-/, when stem-final and following either *o, *oo, or a front vowel (but not following other labial back vowels: *ɔɔ, *uu). The latter condition sounds awfully arbitrary, though. There seems to be no good reason why labialization should happen only after close-mid vowels specifically. The words reconstructed with his *-ooɣ or *-oɣ also fail to align with expected vowel correspondences. For regular examples, compare Southern /joχət/ ~ Far Eastern /joɣət/ ‘bow’ (< *jooɣət) or Southern /tŏχət/ ~ Far Eastern /tŏɣəl/ ‘feather’ (< *toɣəL). In the cases with /w/, we instead find correspondences such as Southern /taw/ (with a front vowel!) ~ Far Eastern /loɣ/ ‘horse’ (< ? *loɣ), or Southern /ŏw/ ~ Far Eastern /oɣ/ ‘stream’ (< ? *ŏɣ).

In Western Khanty, any exceptional vowel developments can in principle be explained as being conditioned by /-w-/, regardless of how this first arose. But if /-ɣ-/ in Eastern Khanty is supposed to be a retention, it would be rather bizarre for it to condition exceptional vowel developments exactly in those word roots where a WKh /-w-/ also exceptionally develops.

What I consider more likely is that a distinction between *-w- and *-ɣ- should be reconstructed for Proto-Khanty after all, although we can only clearly identify it in back-vocalic words in Western Khanty. [10] This finds support from etymology, too. In a few cases, (Western) Khanty words with /-w-/ derive from Proto-Uralic roots that also have *-w- (e.g. ‘stream’ above < PU *uwa; compare e.g. Northern Sami avvit ‘to leak’), and seem to have simply retained the consonant; while words of the shape /(C)OɣəC/ generally derive from words with an earlier cluster *-kC- or *-Ck- (compare e.g. NS juoksa ‘bow’, dolgi ‘feather’).

The ‘gray’ word seems to provide corroboration for this reanalysis of Proto-Khanty. The traditional reconstruction scheme cannot really accommodate Southern Khanty words of the shape /COwəC/; at best they could be secondary derivatives from a root of the shape *COɣ. And while Northern Mansi is known to have several loanwords from Northern Khanty, in this case no Northern Khanty reflex appears to exist. Hence the NMs cognate would seem to show that the word cannot be considered a late innovation in Southern Khanty: the word should be traced in its entirety at least back to the common Ob-Ugric period.

Going further back from there, though, runs into difficulties again. Reconstructible Proto-Uralic clusters of the shape *-wC- are in Khanty regularly simplified to just *-C- (e.g. *lewlə ‘spirit’ > PKh *liiL; *kowsə ‘spruce’ > PKh *kooL), while those of the shape *-Cw- seem to give *-Cəɣ (e.g. *tälwä > *teləɣ ‘winter’). This leaves us with no plausible inherited source for apparent Ob-Ugric *kaws ‘gray’.

There may be some grounds for attempting to set up a concrete loan etymology, as the adjective shows intriguing resemblance to PIE *ḱyeh₁wós ‘gray, dark’. Phonetics remain problematic though. Loaning from Indo-Iranian (Sanskrit śyāva- etc.) is again not an option, due to the retained initial velar: the routing would either have to be from just about PIE, or from a specifically Centum variety. Tocharian B kwele, with syncope of the original root vowel and an additional suffix, is however not really close enough either. The second problem is the back vowel *a in Ob-Ugric, matching poorly with PIE *-ye-. I could of course speculate if this word was derived not directly from Indo-European, but instead from whatever substrate preceded Ob-Ugric in western Siberia… but this contributes nothing productive.

For the time being, in the absence of phonetic parallels or other clarifications, this comparison seems to be stuck in the limbo of “possible but not probable”.

10. aač

An alleged Proto-Uralic (Proto-Finno-Ugric) word for ‘sheep’ has been for long reconstructed as approximately *učə (UEW: *uče). The reflexes however show a tremendous amount of irregularities (more on this to come later in a separate post of its own), and I am convinced that this etymon is mostly erroneous: the words might be instead separate IE loans of varying ages.

The case seems to be the clearest for Ob-Ugric. Mansi *aaš ~ Khanty *aač is, in itself, a very regular comparison. This is however just about the only allegedly inherited word where the vowel correspondence *aa ~ *aa appears. Most others are either of unknown origin, Indo-Iranian loans, or even late Komi loans. The raising *aa > /oo/ in non-southern Mansi is as late as the 18th century, and the same change in Southern Khanty could be fairly recent as well. All the way up to this terminus ante quem, loanwords of any origin could easily have been adopted with *aa everywhere across Ob-Ugric.

A natural loan origin is provided by Proto-Iranian *adz- ‘goat’ (< PII *Hadź- < PIE *h₂aǵ-), whose unpalatalized *dz would have been substituted on the Uralic side by *č (as also e.g. in ‘reindeer’, tangentially mentioned above). The minor semantic difference seems like a lesser obstacle than the numerous phonetic difficulties in connecting these words to their western Uralic equivalents (such as Fi. uuhi, Erzya /uća/); and could be even related to sheep-rearing faring generally better than goat-rearing in the colder taiga zone.

In the absence of phonetic or other faultlines to dig into, I do not take any stance here on whether we should assume loaning into already separated (pre-)Mansi and (pre-)Khanty, or into unitary (pre-)Proto-Ob-Ugric, which does not seem to make a difference on the viability of the etymology either way.

11. hajt

The Hungarian verb hajt comes with numerous meanings. Analyses normally break these into two homonymous groups, one with a rather polysemic range of meanings such as ‘to drive, to herd, to move, to repeat’; the other with the more restricted range ‘to fold’.

The first cluster has been equated with Mansi *kujt- ‘to chase’. As the correspondence *-t- : *-t- normally goes back to a cluster *-tt- or *-pt-, these verbs probably need to be analyzed as derivatives from a root *kajV- or *kojV-; indeed also UEW’s reconstruction approach.

This root however looks quite similar to the other, better-known and wider-distributed (S, F, P, Ms) Uralic root for ‘to drive, chase’, which is *aja-. I believe this is not an accident. The latter has been long since considered a loanword derived from, as mentioned above, PIE *h₂aǵ- ‘to drive’. The H-Ms root can be analyzed as a parallel loan from the same as well: the initial *k- is straightforwardly accountable by the reasonably well-attested word-initial substitution pattern *h₂ → *k. Whether this should be taken as chronologically earlier (it probably requires a relatively un-weakened sound value for *h₂ at the time) or simply a competing nativization strategy is not obvious, but will not create any significant trouble either way.

12. jam

The Proto-Samoyedic word for ‘sea’ has been reconstructed as *jam (yielding, among other reflexes, Old Nganasan jam, Nenets jām). An etymology suggested by Helimski derives this, through earlier *ľam < *lamə, as a loanword from Proto-Tungusic *lāmu ‘id.’

The notion of Proto-Tungusic loanwords in Proto-Samoyedic strikes me as unexpected, though. There are several thousand kilometers separating the Sayan mountains (the likely Samoyedic homeland, or at least close by to it) and the lower Amur (the likely Tungusic homeland). It might be possible to reckon with adjustments of various kinds of course, such as adoption from early Evenki (the only Tungusic variety that has clearly been in contact with most of the Samoyedic-speaking area), combined with pushing the pan-Samoyedic development *l- > *j- substantially forward.

However, another etymology seems to be available too. The Tocharian A word for ‘sea’ is lyam, which would work as a loan original about as well as the Tungusic word. Loaning from Samoyedic into Tocharian is apparently ruled out, since this is a word with a good Indo-European pedigree (akin to e.g. Greek λίμνη).

There are a few phonetic kinks to work out. Both the IE etymology (through earlier *lim-, the zero-grade of √(s)leym- ‘slime etc.’) and Tocharian B lyäm /lʲɨm/ seem to get in the way of straightforward loaning from Proto-Tocharian into Proto-Samoyedic: we’d instead expect something like **ľïm > **jïm or **ľɪm > **jə̈m in that case. Even the Toch. A vowel transcribed ‹a› was likely something in the *[ɐ ~ ə] region, in contrast to ‹ā› being the cardinal /a/, and so we might instead expect to see PSmy **ľəm > **jəm?

The chronological point brought along by having to prefer loaning from Toch. A specifically may provide a solution, though. If we again assumed that *ľ- > *j- took place late across Samoyedic (a slightly weaker assumption than postdating both this and the earlier change *l- > *ľ-), it will be relevant that Southern Samoyedic regularly shifts *ə > *a. After this, ‘sea’ would presumably be loaned from Tocharian as *ľam; and upon diffusion of the term into more northern dialects, the vowel could well be retained. — Alternately, late loaning would also allow assuming that Tocharian */lʲ/ was substituted as *j.

It might even be possible to tie both etymological groups together, and to suggest a borrowing chain Tocharian → Samoyedic → Tungusic. [11] Tungusic has no palatal lateral **ľ, so early South Samoyedic *ľ- would be naturally substituted as *l-. (Whether the vowel correspondences check out in this direction, too, seems however like a more precarious question that I am not currently equipped to address.)


That’s all I have on early loanwords from Indo-European into Uralic, for the time being. I have one going in the opposite direction too, though:

1. blow

Germanic *blewwan- ‘to beat up’ has no known Indo-European etymology. Etymological dictionaries sometimes set up a PIE preform *bʰlewH-, but without any other comparative evidence backing this up.

This root shows clear similarity though to the widespread Uralic root for ‘to hit’, usually reconstructed as *lewə-. As the latter is attested as far as Mansi and Samoyedic, loaning from Germanic is right out of the question. Loaning from PIE would be theoretically feasible, but this does not really seem like sufficient grounds for projecting the Germanic verb that far back, either. If this resemblance is onto something, we would seem to have to instead consider the direction Uralic → Germanic.

Initial *bl- may look like an obstacle. However, this could be accounted for by a fossilized prefix *b- < *bi- ‘at’ (much like can be seen in German bleiben, Swedish bli vs. dialectal English belive). Semantically this works perfectly: “to beat” is precisely “to hit at, to keep hitting at”. Loss of the prefix vowel would probably have to have happened here already in PGmc, though.

The geminate *-ww- looks a bit trickier to account for. Nothing would strictly speaking prevent taking this as evidence for instead reconstructing Uralic *lewwə-; but again, since there is no substantial evidence for geminate glides in PU otherwise, this would be firmly an obscurum per obscurius explanation. Perhaps the proposed pre-Germanic reconstruction with *-wH- is the key instead. It would be quite possible to also reconstruct Uralic *lexə-, and assume that *-wH- represents the substitution of the early Finnic reflex of *-x-, which I believe was at one point likely a back unrounded glide, roughly [ɰ] or [ɣ]. Pre-Germanic *-w- could continue the velar glide aspect of this sound, *-H- the fricative aspect.

All of this matches poorly though with my earlier hypothesis that we should instead reconstruct Uralic *lüwä- or *lüxä-, from which Germanic **(b)li- or **(b)lu- would surely be expected instead…

[1] I.e. all Iranian languages other than the Persid and Saka groups.
[2] This possibility is especially suggested by how Iranian and its closest surviving Western relative, Slavic, seem to share a decent number of characteristic innovations that are missing either from Indic or from Baltic: e.g. the alveolarization of palatals (*ḱ > *ć > *c), secondary palatalization of the common Satemic velars, the shift *kh₂ > *x, the *B / *Bʰ merger, the *ā / *ō merger, or monophthongization of all diphthongs. Some of these could be independent, but the number seems a bit high for none of these to have been areally transmitted from one to the other.
[3] I do not aim for a full review in this post, but cf. e.g. Udmurt /bord/, Komi /berd/ ‘wall’ < “PU *pärtä” ← PIE *bʰr̥dʰ- ‘board’.
[4] For “intensive gemination” in family terms in Finnic, cf. also *ukko ‘old man’, likely an irregular derivative from *uros ~ *uroi ‘male’.
[5] Mentioned tangentially in the recent paper “The Finnic ‘secondary e-stems’ and Proto-Uralic vocalism”, in SUSA 95, and findable even in the handouts of his associated talk in 2013. — I would however continue to derive Finnic *kankëda ‘stiff’ from the noun *kanki ‘bar’, as per the analysis in SSA.
[6] Given the modern theory that the PIE “palatovelars” and “plain velars” should be reanalyzed as plain velars and back velars / uvulars, and that the former were only ever fronted in the Satem languages, loaning from any Centum group would be unconvincing for sound correspondences such as this, I think. I do not think loaning from pre-Armenian specifically is feasible either, but attestation there seems to suggest that the root may have once existed in early Indo-Iranian or Balto-Slavic as well.
[7] Germanic long *ē being reflected as short *ä in this word may seem mysterious. This is still perfectly accountable though by the original account given by Koivulehto upon presenting this etymology: it likely indicates a stage of development in Finnic where *ää had already been raised to *ee, while pre-Northwest Germanic still had open front *ǣ (later > *ā). This leaves just short *ä as a qualitatively faithful substitution option. — A couple of cases with *ā → *a seem to show similar development as well: the main candidates are *apila ‘clover’, *lapida ‘spade’, from Baltic *ābilis, *lāpetā, where the appearance of medial *-i- indicates somewhat late loaning.
[8] Though *o ← *a < *e worries me somewhat. If Finnish poro (< *podoi?) were a very early loanword from Samic, we might be able to get away with *pačəw instead.
[9] Lubotsky in Indo-Aryan ‘six’ proposes *pćw-. Would this mean the word being originally a derivative or a compound based on *peḱu-?
[10] I believe some indirect evidence for this contrast in other positions can be uncovered as well, but that would be a discussion for another time.
[11] Also Mongolic *lamug ‘swamp’ (> literary namuɣ), which has been proposed as an Altaic cognate of the Tungusic word, might then belong in this cluster.

Posted in Etymology
23 comments on “12 + 1 old Indo-European loan etymology sketches”
  1. Kathryn Spence says:

    Kroonen suggests a connection with Avestan mruta- “crushed” and Greek ἀμβλύς “blunt” for the Germanic word. This would probably be from *melh₂-w-, a u-extension of *melh₂- “grind”. We would have to assume development from the zero-grade, as is found in the cognates, yielding pre-Germanic *mlu- > *blu-. This is then naturally reinterpreted as the zero-grade of a root *blew-. Assuming this reinterpretation happened before the loss of contrastive accent in Germanic, we might well obtain an oxytone thematic present, whence gemination of the *w by Holtzmann’s law.

  2. Daba says:

    Perhaps rather than *ḱyeh₁wós, you could try connecting number 9 with Persian خاک xâk ‘earth’ ~ خاکستر xâkestar ‘ash’ ~ خاکستری xâkestari ‘grey’

  3. M. says:

    I may have more to say later, but I wanted to comment on this point from #4:

    A second reason to suspect that PU *äjmä might not have been a basic word root comes from that also the PU cluster *-jm- seems to be otherwise unattested in primary word roots!

    If roughly 400 roots (which I think is now considered a moderate estimate) are reconstructible for Proto-Uralic given our current evidence, then I fail to see how the rarity of the sequence *-jm- casts any strong doubt on the reconstruction of *äjmä.

    Compare Indo-European, which has perhaps 1,500 reconstructible roots, but which has only a few nouns whose reconstructed stem ends in e.g. a labial nasal (*gHijem- „winter“) or a lateral approximant (*sal- „salt“). Neither of these rarities is used (at least not widely) as justification for loan hypotheses or for rejecting/doubting the aforementioned reconstructed words.

    Broader patterns can certainly be discerned from a 400-root sample – for example, the lack of CC-initial roots in Uralic – but the lack of *-jm- in such a sample doesn’t seem striking at all.

    • j. says:

      If roughly 400 roots (which I think is now considered a moderate estimate) are reconstructible for Proto-Uralic given our current evidence, then I fail to see how the rarity of the sequence *-jm- casts any strong doubt on the reconstruction of *äjmä.

      You’re correct — this is not particularly strong evidence all by itself. (And the cluster type glide+nasal seems to be in general established, even if not particularly common.)

      There would be a more detailed argument on this from root structure considerations, though. Perhaps I should put out a more detailed post about my thoughts on this at some point, but in brief: it’s been observed ever since rough PU reconstructions became available (IIRC the first papers on this came out in the 60s, in the wake of Collinder’s Comparative Grammar) that PU roots of the shape *CVCV strongly favor certain voiced medial consonants, in particular *-l-, *-r-, *-j-, *-d₁-, *-d₂- (interestingly enough not particularly *-w-, though); while word roots of the shape *CVCCV strongly favor cluster-final consonants that are favored also in word-initial position (especially *·k-, to an extent also *·w- and *·m-). Since PU is usually thought to have had trochaic stress, this has been taken as grounds to suspect that word roots with clusters of a type “-RK-” would come from a pre-PU contraction of word stems with two stress groups, something like *ˈCVRV|ˌKV(RV) > *CVRKV. In cases where we know we are dealing with derivatives, the final steps of this could even have been post-PU, with a contraction *CVRəKV > *CVRKV taking place independently in various descendants.

  4. David Marjanović says:

    Concerning number 9, isn’t there a Germanic “gray” word *xas-/*xaz-?

    Could *blewwan- be contaminated with “blue”? In modern German, the reflex (ein)bläuen is a straightforward fossilized causative of blau, indicating that the word was reinterpreted as “beat black and blue” at some point (or green and blue in German). That would certainly account for the *b-…

    *ḱwō : *ḱun-

    I thought *ḱwn-, as in *ḱwn̩bʰis > Skt. śvabʰis rather than **śumbʰis?

    “brown one”

    Rather “wild/ferocious one”: *ǵʰwēr-/*ǵʰwer- “wild animal, beast” > thematic adjective like Latin ferus > n-stem noun.

    Given the modern theory that the PIE “palatovelars” and “plain velars” should be reanalyzed as plain velars and back velars / uvulars

    Conversely, I wouldn’t be surprised at all if [gʲ] and [gʲʰ] don’t sound enough like plosives to someone who isn’t used to voiced ones, and are therefore interpreted as /j/.

    the initial *k- is straightforwardly accountable by the reasonably well-attested word-initial substitution pattern *h₂ → *k. If this should be taken as chronologically earlier (it probably requires a relatively un-weakened sound value for *h₂ at the time) or simply a competing nativization strategy is not obvious, but will not create any significant trouble either way.

    Obviously I’m not qualified to talk about this particular word (or, well, most others), but I wonder if some cases of this *h₂ – *k correspondence are cognates, and the common ancestor was *q.

    • David Marjanović says:

      I thought *ḱwn-, as in *ḱwn̩bʰis > Skt. śvabʰis rather than **śumbʰis?

      To answer my own question… both, thanks to the weird PIE preferences in syllable structure! */kʲwnbʰjs/ = *[kʲwn̩.bʰis], but genitive singular */kʲwnos/ = [kʲu.nos] > Skt. śunaḥ. (I forgot where the stresses go.)

      and the common ancestor was *q

      Obviously I can expand this to a theory, which is mine, to the effect that in some beginning or other there was a typologically Caucasian sound system with *qʰ, *qʼ, *ɢ, which stayed distinct from each other (and from the velars) in PIE as *h₂, *h₁, *h₃ (presuming that *h₁ was [ʔ] at some point), but merged in PU as *k. Testing this with further Nostratic correspondences could make a nice PhD thesis… 🙂

      Too bad that the Moscow School mostly doesn’t like IE laryngeals. They’re so much into uvulars otherwise.

  5. *äjmä ‘needle’ looks strangely similar to Baltic *aišma- (Lith. iešmas, OPruss. aysmis) and Greek aikhmē ‘(roasting) spit’. I don’t know if these forms warrant a PIE reconstruction (only two branches), but it would probably be *h2aik’-s-mo/ah2. Anyway, probably coincidental similarity to the Uralic word.

  6. M. says:

    It may be somewhat of an issue that direct descendants of *h₂r̥tḱos have not been attested from our next most likely loangivers: Balto-Slavic and Germanic. However, as their attested words for ‘bear’ are analyzable as taboo circumlocutions as well (“brown one”, “honey-eater” etc.), it is probably reasonable to assume that the older word was still around as well up until some point, instead of self-destructing as soon as PIE split into dialects.

    Can you expand a bit on why you consider this assumption reasonable? This kind of logic seems to imply that early Germanic “probably” had reflexes of most other widespread IE roots (such as *kWer- “do, make” > Hindi kar “to do”, Lithuanian kurti “create”, etc.) that are not attested in any actually extant Germanic language.

    I would agree that it’s reasonable to *speculate* that Germanic/Baltic might have had a reflex of *Hrtkos, but there is no principle of lexical loss/replacement (that I know of) that supports turning “speculate” into “assume” (or “strongly suspect”).

    • j. says:

      Can you expand a bit on why you consider this assumption reasonable?

      Sure. To be exact: I do not assume that this word ever occurred in Germanic proper or Balto-Slavic proper. But as long as we consider Gmc. and BSl. to be subgroups of Indo-European, this implies that they had undergone a period of independent development – between the point of separating from their closest relatives, and their breakup into separate daughter lineages. Any linguistic innovations that characterize either subgroup as a whole can have happened at any given point during these phases. E.g. if the independent development phase of Germanic was 1000 years long, then there should be roughly 50% odds that the loss of *h₂r̥tḱós took place during the later half of this period; 75% odds that it took place somewhere during the last 750 years; etc. The closer we go to a group’s point of separation from its relatives, the more likely it is that some later changed (in this case: lost) trait was still around in its original form.

      An additional key point is that we know the Uralic languages to have already been in contact with “Germanic” and “Balto-Slavic” during the pre-Germanic / pre-Balto-Slavic eras. Several loanwords are known that e.g. preserve PIE *o as separate; preserve “proto-Satem” *ć as a palatal sibilant rather than postalveolar *š; or preserve the PIE laryngeals. A couple of them regardless already show some morphological or semantic traits that are only attested e.g. in Germanic.

      (At present, we indeed have no indication whatsoever for Uralic-IE contacts to have begun at some point. It seems entirely possible to me that they have been neighbors as long as distinct “Indo-European” and “Uralic” have existed at all.)

      • M. says:

        Thanks for the further explanation.

        E.g. if the independent development phase of Germanic was 1000 years long, then there should be roughly 50% odds that the loss of *h₂r̥tḱós took place during the later half of this period; 75% odds that it took place somewhere during the last 750 years; etc. The closer we go to a group’s point of separation from its relatives, the more likely it is that some later changed (in this case: lost) trait was still around in its original form.

        The problem as I see it is that, the more we increase the time depth and linguistic distance (i.e. the differences between Baltic/Germanic and the IE branches that have retained the *rtk- root), the more uncertainty there is about exactly what an unattested language was like at a given point in time, and the less accordingly persuasive an argument depending on this unattested language becomes.

        Contrast this with a case like that of Finn. hirvi and Old Prussian sirwis (which I think you mentioned a while back). Even though Finnic has never been in contact with Prussian, Prussian is both geographically and linguistically close to languages spoken in regions where there has been such contact; given this smaller window of variation, it is accordingly easier to posit an unattested dialectal Baltic *širwi- as the source of hirvi.

        • M. says:

          I.e., while you are probably right that the likelihood of vocabulary loss increases with time, it is not a process of simple linear decay. Vocabulary can be lost (or at least fall into unpopularity) quite rapidly, sometimes in a single generation. We don’t know when (within a ~2-3 thousand-year range) pre-Baltic or pre-Germanic lost the word in question, where their speakers lived when this happened, and what degree (if any) of contact they had with pre-Finnic speakers at the time.

          • j. says:

            Hmm. If you don’t think a linear probability distribution for loss dates over e.g. the pre-Germanic period is a reasonable prior, what would you suggest? I see no reason to privilege a particular point in time.

            Perhaps you want to note that I’m not assuming the word to have gradually drifted out of use. I’m assigning here probabilities exactly for a relatively rapid loss event, as you suggest too. (We could add the possibility of more gradual loss to the model, but it would, I think, not actually change anything. The boundary conditions are the same in any case.)

            There’s a separate issue over corner cases such as that the word could in principle have been lost even at a post-Proto-Germanic date, in parallel in the different Germanic varieties; or that it could have been an intrusion into Late Indo-European that never existed in the pre-Germanic dialects to begin with; but I don’t think you’re making a point about possibilities like this.

            We don’t know (…) where their speakers lived when this happened, and what degree (if any) of contact they had with pre-Finnic speakers at the time.

            I suspect this is where we might actually disagree. The question of location is largely irrelevant: as long as we’re staying in the same corner of the world, issues like these are largely settled through lines of evidence such as known language contacts in the first place. Archeology may provide some candidates for population movements and flows to be matched up with languages, but cannot actually provide direct linguistic evidence on what was spoken where. Same goes in turn for the existence of contacts themselves. If loanword etymologies do not provide evidence for these, what would? (And I stress again that we already know other pre-Germanic or pre-Balto-Slavic loanwords in Uralic.)

            • M. says:

              Hmm. If you don’t think a linear probability distribution for loss dates over e.g. the pre-Germanic period is a reasonable prior, what would you suggest? I see no reason to privilege a particular point in time.

              I don’t think I fully understand you here, but where am I privileging a particular point in time?

              My only point is that there are a lot of variables that potentially go into the process of vocabulary loss (i.e. what causes the most influential speakers in a community to have a preference for certain words over others), and it seems unlikely to me that these variables behave simply enough for us to say, as you did, “50% odds that the loss of *h₂r̥tḱós took place during the later half of this period; 75% odds that it took place somewhere during the last 750 years;”, or something along such lines.

              The question of location is largely irrelevant: as long as we’re staying in the same corner of the world, issues like these are largely settled through lines of evidence such as known language contacts in the first place.

              I’m not sure I follow. If we know (via loanword evidence) that there were language contacts between Uralic and IE languages at *some* points during the relevant time period (the ~2 millennia before the first Baltic/Germanic records), how does this allow us to assume that *this* particular instance of contact occurred, at the right time and place for **(H)rtko- to be acquired into pre-Finnic as *okte-?

              Granted, if the *Hrtko- : oksi-comparison were convincing on its own terms, then it would itself serve as evidence for such contact. But (respectfully) it isn’t convincing, at least based on the evidence in this post: at best, it involves a two-segment match (*-kt- : *-tk-/-kt-), with no precedent for the correspondence between the initial segments.

              • David Marjanović says:

                If we know (via loanword evidence) that there were language contacts between Uralic and IE languages at *some* points

                Not so much at points, as continuously since ever.

                • M. says:

                  You’re free to interpret the evidence that way. Not everyone does.

                • M. says:

                  (Apologies if that sounded too curt, I just meant that the evidence doesn’t force us to conclude this.)

              • j. says:

                it seems unlikely to me that these variables behave simply enough for us to say, as you did, “50% odds that the loss of *h₂r̥tḱós took place during the later half of this period; 75% odds that it took place somewhere during the last 750 years;”, or something along such lines.

                That argument is not about modelling lexical loss. It is simple Bayesian reasoning. Whenever there is no specific hypothesis to be treated as the null hypothesis, we will instead default to an even probability distribution over the entire hypothesis space. (See principle of indifference.) Whatever detailed scenario actually went down, we have no reason to expect a particular time for it to have occurred, so we should assign equal probability to all possible times.

                (It actually sounds like you might be objecting to the applicability of probabilistic inference in the first place?)

                If we know (via loanword evidence) that there were language contacts between Uralic and IE languages at *some* points during the relevant time period (the ~2 millennia before the first Baltic/Germanic records), how does this allow us to assume that *this* particular instance of contact occurred (…)?

                Language contacts are roughly speaking continuous, though after the fact, we only can see direct evidence at individual points. If given a Burmese loanword into English that’s first attested in 1900, then an English loanword into Burmese that’s first attested in 1980, we should not infer from this data that English and Burmese broke off their contacts for the entirety of the early 1900s, only to re-establish contact several decades later. (Nor should we infer that contacts only started in 1900; cf. Signor–Lipps effect.) Similarly – the spottiness of the evidence for early Uralic-IE contacts is not best explained by some zigzagging process where IE and Uralic speakers only made contact once every other century. More likely several other loanwords that took ground back then are simply no longer identifiable to us (due to the dialects adopting them having later gone extinct, the words having been later lost from extant lineages, quirks of semantic drift, misdating, incomplete research, etc.)

                There also already is a “long tail” of various increasingly improbable hypotheses, especially due to Katz (2003). For just one example, consider a proposal to derive Finno-Mordvinic *lešmä ‘cow/horse’ from an early Indo-Iranian (“Proto-Satem”) *Hlekš-man ‘guarded one’, allegedly reflected in Sanskrit lakṣman- ‘mark’. It seems clear that some of his more imaginative proposals of this kind are probably correct, even if most probably aren’t. (He also already has this same IE–Uralic ‘bear’ comparison, but with different and IMO less convincing analysis.)

                the *Hrtko- : oksi-comparison (…) isn’t convincing, at least based on the evidence in this post: at best, it involves a two-segment match (*-kt- : *-tk-/-kt-), with no precedent for the correspondence between the initial segments.

                Note also the weaker third segment match: zero onset in both Uralic and in later IE. But yes, I agree I would need to defend in more detail the vocalism issues (and finding other examples of *rCC → *CC would be nice too). I have some work done on this already, though it would take too long to go into it just here. But in brief, there already are numerous long-accepted precedents for unexpected word-initial *o- in western Uralic: e.g. *ora ‘awl’ from Indo-Iranian *ārā < *ēlā.

                • M. says:

                  Language contacts are roughly speaking continuous, though after the fact, we only can see direct evidence at individual points.

                  Again, I don’t see what principle allows us to use this as a default hypothesis. Since English has both older Scandinavian vocabulary in English (nay < an ancestor of Icelandic nei) and newer such vocabulary (smorgasbord < Swedish smörgåsbord), then wouldn’t this logic imply that there has been “continual” contact between English- and North Germanic speakers over the past 1,000 years, rather than sporadic contacts that only became regular and frequent in the past 1-2 centuries?

                  Examples from the present day are misleading, I think, because modern global communication genuinely is an ongoing process: large numbers of Burmese people and English people talk to one another every day, or on a highly regular basis. Plus, due to modern-day conditions of political stability, this ongoing contact will probably not lead (over the decades or even centuries) to the total linguistic assimilation of people in the smaller and less powerful countries to the larger and more powerful ones. Neither of these things can be taken for granted when talking about inter-tribal contacts in ancient times.

                  But in brief, there already are numerous long-accepted precedents for unexpected word-initial *o- in western Uralic: e.g. *ora ‘awl’ from Indo-Iranian *ārā < *ēlā.

                  Are you implying that West Uralic had a tendency to use *o- as a “default” replacement for unfamiliar vowels? Otherwise I’m not sure what the outcome of syllabic *r has to do with that of *ā?

                  (And incidentally, why do we need to assume that *ora is a direct reflex of the specific form *āra to begin with — couldn’t it just as probably be from a “stray” dialect of early IIr. that pronounced the *ā as a higher vowel, or from a third mediating language spoken by a group that traded “Indo-Iranian” awls with early Uralic speakers? As far as I can see, the same kind of qualification applies to many other lexical items, such as porsas, that are often claimed as evidence for antiquity of contact, or for new loan-substitution patterns, when they could simply indicate less common contexts of contact.)

                • j. says:

                  Since English has both older Scandinavian vocabulary in English (nay < an ancestor of Icelandic nei) and newer such vocabulary (smorgasbord < Swedish smörgåsbord), then wouldn’t this logic imply that there has been “continual” contact between English- and North Germanic speakers over the past 1,000 years (…)?

                  It would, and there indeed have been continuous contacts. Intensity can vary, of course, but it would not be reasonable to claim that a proposed loanword between English and Scandinavian that otherwise appears to date to somewhere between the known periods of stronger contacts (say, Sw. rom ‘rum’, first attested 1711) should be rejected because existence of contacts “hasn’t been established / cannot be assumed”.

                  Perhaps we simply have a terminological confusion here. Note first that “language contact” does not mean “proximity”. Likewise, by “continuous”, I mean roughly “frequently enough that we would not be able to discern any substantial gaps from a complete loanword etc. record”, i.e. something like speakers of two language groups meeting once per year would already suffice. Languages are not strictly nailed down to geographic positions, and cultural loanwords can in principle be transmitted through nothing more than a single travelling trader/missionary/etc. For a word like ‘bear’, the likely cultural context is fur trade (either as primary merchandise, or as a means of payment for other valuables). I would think it’s only with more basic vocabulary like ‘no’ or ‘water’ that a loanword proposal would start calling for evidence of stronger, daily-level contacts.

                  Are you implying that West Uralic had a tendency to use *o- as a “default” replacement for unfamiliar vowels?

                  I’m suggesting a development from word-initial *a to *o, within Uralic. Covering conditioning and other evidence would be too much off-tangent in this comments section, though.

                • M. says:

                  Tangential, but one other thing I wanted to ask about: you said in regards to the *Hrtko- : okte- comparison,

                  Note also the weaker third segment match: zero onset in both Uralic and in later IE.

                  I don’t quite understand the (purported) evidential value of “zero”, i.e. the absence of a segment, in contexts like this. It’s true that the absence of a segment can detract from an etymology (when this absence is unexpected, or when a segment is present contrary to expectation), but this does not mean that it can *increase* (even weakly) the persuasiveness of an otherwise-unpersuasive etymology.

                  Suppose you were comparing a word pronounced [pstrax] in one language with a word pronounced [ru] in another. If the loss of the onset pst- and the coda x were phonologically expected in the latter language, then would you consider this to be a one-segment match (r : r), or a five-segment match (pstr_x : 000r_0)?

  7. I didn’t see anyone mentioning Lith. irštvė ‘bear den’ (with variants) analyzed by Karaliūnas as a derivative of the PIE word for ‘bear’. If correct, it attests to the survival for some time of this word.

  8. M. says:

    Likewise, by “continuous”, I mean roughly “frequently enough that we would not be able to discern any substantial gaps from a complete loanword etc. record”, i.e. something like speakers of two language groups meeting once per year would already suffice.

    OK. The evidence I’m familiar with seems consistent with a scenario where early Uralic branches and early IE dialects came into contact every few centuries. Plenty of vocabulary could have been lost/replaced on either side during the “dry periods” over this stretch of time.
