How to (not) report a lack of etymology: Samic *keaðkē

I have been having a simmering discussion with commenter “M.” under the post on what’s important for what in historical Uralistics. One general topic there that I keep pushing back hard against is the idea of “etymology unknown” as anything like a fallback explanation or default hypothesis. This is not a hypothesis at all, it is the absence of one. At worst it might end up being elevated to a curiosity stopper, an excuse to not keep looking.

At the same time, I still want to stress that this doesn’t mean that anything at all, any kind of nonsense thrown out, makes an acceptable etymology. I’m already on record in favor of more attention being paid to “anti-etymologies”. “Etymology unknown” sometimes really is what should be reported. But I think that this is essentially always too little detail by itself and should be combined with telling what, exactly, it is that we have ruled out as not being known. Basically no language on Earth is at a point of etymological research so widely practiced and thoroughly scoured that we would have grounds to assume that “etymology unknown” means actually having exhausted all possibilities. Words reported as “etymology unknown” in some sources have new good etymologies coming out for them all the time, sometimes even from older literature that was neglected by the compilers of the reference work in question. They will keep coming too, if my own backlog of unpublished etymologies is anything to go on.

So what should it look like when a word’s etymology really remains firmly unknown, not just underresearched? For an example, let us consider Samic *keaðkē ‘stone’.

Step one: check the semantic equivalents in all known relatives and main contact languages. In the meaning ‘stone’, we can find clear non-cognates in all reasonable directions:

  • Most Uralic languages reflect Proto-Uralic *kiwə. There is some phonological overlap here (initial *k, front vocalism) but the correspondence *ðk ~ *w seems unbridgeable without massive speculation. *ea ~ *i doesn’t have any good precedents for it either. It’s not literally impossible that these could be some day solved, especially as long as no traces of *kiwə are otherwise found in Samic, but for the time being this is a non-match.
  • Samoyedic reflects instead *pəj, an even worse phonological fit. *ðk ~ *j would be actually regular (< PU *ďk?), but this observation conflicts with the proposal to treat the Samoyedic word as cognate with Finnic *pii-kivi ‘flintstone’, both if reconstructed back to a separate PU root *pijə, or if treated as a semantically and phonologically divergent reflex of PU *piŋə ‘tooth’ (> Finnic *pii ‘tine’), e.g. by back-formation from the same or a similar compound, plus irregular lenition of *ŋ. [1]
  • Per Nikolayeva, Yukaghir has *kïj ‘stone’ (plausibly ~ *kiwə [2]), Kolyma Yukaghir also /pē/ ‘rock, big stone’ (plausibly ← Samoyedic), Tundra Yukaghir also /jeďi/ < ? *jenći ‘stone’ (no idea about the etymology of this), all still nowhere near *keaðkē and now also way off geographically and genealogically, hence a priori weaker than anything found in languages securely known to be related to Samic.
  • Germanic reflects *stainaz; Baltic reflects *ákmō and Slavic *kamy, both going back to PIE *h₂akmon- whence also e.g. Sanskrit áśman. No chance here for a loan from any known non-Uralic language of Northern Europe, no evidence for an ancient Indo-Uralic archaism either.

In known loanword sources a bit further off, we could try looking more into Indo-Iranian, where words for ‘stone’ seem to diverge quite a bit. A quick trawl thru Wiktionary nets at least Persian and Balochi /sang/, Pashto /kāɳaj/, Kurdish /bird ~ berd/, Wakhi /wurt/, Ossetic /dur/, Hindi, Kashmiri etc. /pattʰar/… but again none of this initial haul really gets us any closer to Samic.

Step two: check for morphological analyses. For words that don’t look like basic word roots, this probably should be step one. There is something that can be done here too though: *-kē < *-kA is a widespread Uralic nominal suffix, and we probably shouldn’t stress too much if this in particular fails to correspond in an otherwise decent cognate. Still, a shorter #keað- (suggesting pre-Samic #keð-) does just as poorly among the non-cognates above. We also don’t have anything within Samic that would particularly point to such a division. The most phonologically similar words reconstructible for Proto-Samic are *(s)keaðē- ‘temple (of head)’ and *kiðë ‘spring’, both semantically miles off from ‘stone’. In more narrowly distributed words from Northern Sami I can find geađđat ‘amicable’, geađđi ‘dimness’ (+ Lule skädot ‘to dim (of eyes)’, Skolt ǩieđâš [3]) which don’t help either. Relaxing phonological similarity even further allows reaching a different substance term *čëðë ‘coal’ (< PU *śüďə), but even allowing for irregular *č > *k would not suffice to set up any morphological relationship. Unless we are also wrong about the development of PU *ü(-ə) to PS *ë(-ë), and this somehow first merged with PU *e-ə rather than the phonetically expected *i(-ə)? If so, then we might consider *śüďə > *ćeðə > *ćeð-kä > *čeaðkē > ? *keaðkē. But I feel a semantic shift ‘coal’ > ‘stone’ remains nonsensical despite a vaguely shared semantic field. A connection between these meanings probably should rather start from something more generic like ‘nugget, pellet, grain’. Even ‘small stone’ perhaps, but that would be a poor match with ‘stone’ just in a supposedly derived Samic reflex vs. ‘coal’ all across Uralic.

Step three: check for phonological matches and see if their semantic difference can be bridged. We have done some of this already in the previous step. Looking more widely for PU roots, even of the very rough shape *k + front vowel + *d/ď, again fails to turn up anything good though. Besides ‘spring’ (with cognates in Mordvinic) our options are *keďə ‘skin’, *käďwä ‘female; ermine?’, *küdV ‘brother-in-law’ (unless the proposed Ob-Ugric cognates of Finnic *kütü are just divergent reflexes of *käləw ‘sister-in-law’), all again no-go. Germanic could be scanned as well, though for the time being I have no good resources for doing this thoroughly (anyone want to link me to a digital dictionary of Old Norse?). Balto-Slavic and Indo-Iranian we can probably leave aside, as there are no examples of Samic *ð or PU *d/*ď that originate from these.
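The rough shape search in this step can be sketched as a simple pattern match over a root list. This is a minimal illustration only: the list contains just the handful of PU forms quoted so far in this post (a real search would run over a full etymological database), and the names PU_ROOTS, SHAPE and rough_match are made up for the example.

```python
import re
import unicodedata


def nfc(text):
    """Normalize to NFC so that letters like ď compare as single characters."""
    return unicodedata.normalize("NFC", text)


# PU roots quoted earlier in this post, with the glosses given there.
# "V" stands for an unspecified vowel, as in the reconstructions.
PU_ROOTS = {
    "kiwə": "stone",
    "piŋə": "tooth",
    "śüďə": "coal",
    "keďə": "skin",
    "käďwä": "female; ermine?",
    "küdV": "brother-in-law",
    "käləw": "sister-in-law",
}

# Very rough shape: *k + front vowel + *d/*ď at the start of the root.
SHAPE = re.compile(nfc("^k[ieäü][dď]"))


def rough_match(roots):
    """Return the roots whose beginning fits the rough shape, in list order."""
    return [root for root in roots if SHAPE.match(nfc(root))]


for root in rough_match(PU_ROOTS):
    print(f"*{root} '{PU_ROOTS[root]}'")
```

On this toy list only *keďə, *käďwä and *küdV pass the filter, i.e. exactly the semantically hopeless candidates already discussed.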

Step four: check for semantic near-matches. This is somewhat harder to do rigorously. In recent times the CLICS database offers one handy tool at least: charts of typical colexification relationships between concepts in the world’s languages. Their concept map for STONE provides us with the rough gist that the options are limited. So far the only attested colexifications are with ‘mountain’, ‘egg’, ‘hill’ (mostly in Pama-Nyungan) and ‘seed’ (mostly in Austronesian; Finnish kivi as ‘pit of fruit’ might count too). Only the first, as observable already in e.g. English rock, has substantial amounts of evidence backing it.

However, it turns out that we are now in luck! PU *muna > PS *monē ‘egg’ is right out, and no PU or PS word for ‘seed’ is known at all. The proposed PU words for ‘mountain’ or ‘hill’ number a handful, and the best-attested cases like *wärä (> Samic *vārē) or *mäkə are also way off. But one less firmly attested example is *kaďV — continued in Hungarian hegy and a Samoyedic word family that might reconstruct as *koəjə (if we take Nganasan †koaja as recorded by Castrén as representative and not a later derivative from something shorter). This turns out to match well indeed with the morphological analysis *keað-kē that I have already hypothesized above, and the two root consonants match regularly. The vowel development *a > *ea is not the usual one, but can be tentatively explained: this turns up in Samic also in other cases before palatalized consonants, especially syllable-final ones, including *kaća > *keačē ‘point, end’, *kaććV- > *keaččë- ‘to look’, *laśkV- → *leaškō- ‘to pour (out)’, *waćara > *veačērē ‘hammer’ (cf. Finnic *kaca, *kacco-, *laskë-, *vasara); and perhaps the common Uralic Wanderwort *waśkV > *veaškē ‘copper’, back-vocalic also in Finnic *vaski, Mari *wåž, Hungarian vas, Khanty *wăɣ (but then front-vocalic also in Mordvinic *viśkə, Permic *-veś, Samoyedic *wäsa). The conditioning of this probably could use more research though.
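The shared context of these parallels can be checked mechanically. The following is a rough sketch under my own simplifying assumptions: it only looks at the PU-side forms cited above, treats *ć, *ś and *ď as the palatalized consonants, and counts a consonant as syllable-final when another consonant follows directly. The names EXAMPLES and environment are hypothetical.

```python
import re
import unicodedata

PALATALIZED = unicodedata.normalize("NFC", "ćśď")
VOWELS = unicodedata.normalize("NFC", "aäeiouüəV")  # V = unspecified vowel

# PU-side forms of the parallels cited above, plus the new candidate *kaďV.
EXAMPLES = ["kaća", "kaććV-", "laśkV-", "waćara", "waśkV", "kaďV"]


def environment(form):
    """Return (palatalized consonant, is_syllable_final) if the first-syllable
    vowel is *a followed by a palatalized consonant; otherwise return None."""
    form = unicodedata.normalize("NFC", form).strip("*-")
    match = re.match(rf"^[^{VOWELS}]*a([{PALATALIZED}])(.?)", form)
    if match is None:
        return None
    consonant, following = match.group(1), match.group(2)
    # Syllable-final here just means: directly followed by another consonant.
    syllable_final = following != "" and following not in VOWELS
    return consonant, syllable_final


for form in EXAMPLES:
    print(form, environment(form))
```

All six forms pass the check, and in a suffixed *kaď-ka the *ď would additionally be syllable-final, as in *laśkV- and *waśkV. This only confirms that the examples share a context, of course, not that the change is regular in it.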

Regardless it seems we can, after all, propose an etymology: PU *kaďə ‘(rocky?) mountain’ > early pre-Samic *kaď-ka ‘rock (object)’ > late pre-Samic *keďkä > *keðkä > Proto-Samic *keaðkē ‘stone (substance)’. A very nice result I feel, for explaining such a basic vocabulary item that has so far gone unetymologized! [4]
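For readers who prefer to see the chain spelled out step by step, here is a toy replay of it as ordered string rewrites. To be clear about the assumptions: the replacements are hard-coded to reproduce the stages quoted above for this one word, they are not general Samic sound laws, the morpheme boundary is left unmarked, and the names STEPS and derive are invented for the example.

```python
# Each step is (label, rewrite function), applied in order to the previous stage.
STEPS = [
    ("suffixation with *-ka, stem vowel dropped",
     lambda w: w.rstrip("ə") + "ka"),
    ("fronting before *ď, with vowel harmony",
     lambda w: w.replace("aď", "eď", 1).replace("ka", "kä")),
    ("depalatalization *ď > *ð",
     lambda w: w.replace("ď", "ð")),
    ("Proto-Samic vowel shifts",
     lambda w: w.replace("e", "ea", 1).replace("ä", "ē")),
]


def derive(word):
    """Apply every step in order and return the list of all stages."""
    stages = [word]
    for _label, step in STEPS:
        stages.append(step(stages[-1]))
    return stages


print(" > ".join("*" + stage for stage in derive("kaďə")))
```

Note that the order of the second and third steps matters in this toy version: the fronting rule looks for *ď, so it has to apply before depalatalization, which is the relative chronology discussed in note [4].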

At this point I must emphasize that this result was not pre-decided. This etymology does not come from my above-alluded stash of unpublished discoveries. Right up to looking up the CLICS concept map, I was laboring under the assumption that *keaðkē indeed is a word of unknown etymology; certainly that’s the only thing I’ve seen reported for it, and certainly it also fits my typological expectations of substrate vocabulary (which is, in the absence of features like consistently recurring phonetic irregularities, generally fairly unknowable speculation in the case of any one particular word). And yet it turns out … if we just diligently explore the options, instead of worshipping our ignorance and writing words off as “unknown-therefore-unknowable”, a lot of the time we can make progress on their etymology. Wir müssen wissen, wir werden wissen.

Probably there are indeed words for which the four steps above are still insufficient for putting together an etymology; then again, it would also be possible to sketch out a few further steps. And I think I have regardless demonstrated not just an apparent etymology for *keaðkē after all, but also how and why the first few directions that we could think of for seeking its etymology do indeed fail.

[1] A hypothesis that would work decently here is that first *iŋ > *iń, which is not contradicted by any data (is nonprovably regular) and is within Uralic even paralleled by Permic *piń; followed by regular *ń > *j in most reflexes. Only Selkup really conflicts with this. — The reconstruction of *ə seems unclear too (actually given by Janhunen as *ə¹ = *ə/*å). Only the correspondence Nganasan /hᵘalə/ < †fala ~ Nenets *pæ points to this, while we have a seemingly preserved /i/ in Kamassian /pʰi/ and Mator hilä, and a close vowel also in Enets /pū/ < †puj ‹пуи›. Maybe some of these could even reflect a heavily contracted *pijwə < PS *pińwə < pre-PS *pińkiwə < PU *piŋə-kiwə (with loss of *k from a secondary cluster *ńk, but intervocalic *w preserved in a no longer posttonic position)?
[2] Considering the main etymology I discover here, another possibility could be to derive this thru some flavor of Samoyedic #kVj ‘(rocky?) mountain’.
[3] Related to Germanic *skadwaz ‘shadow’ somehow…? The front vowel seems like a poor match, though.
[4] One further phonologically interesting feature in this is that the Samic-specific fronting *a > *e seems to take place earlier than the common West Uralic depalatalization *ď > *d (or > *ð). I’m not concerned though. This seems to be proven as an areally-spread change already by the fact that Mari also shows *ď > /ð/ while differing from West Uralic in showing *d > ∅. Actually in principle nothing rules out either that palatalization before *ď was more widespread, since we lack Finnic and Mordvinic reflexes, but I don’t see much benefit in this assumption over the previous.

Posted in Etymology, Methodology
7 comments on “How to (not) report a lack of etymology: Samic *keaðkē”
  1. Y says:

    In North Germanic (ON fjall, etc.), the direction of semantic change is stone > mountain. For English “rock” the direction is the opposite. I wonder if one direction is significantly commoner than the other.

  2. David Marjanović says:

    -stein crops up a fair bit in the names of Austrian mountains; probably all of them extend beyond the treeline.

    • David Marjanović says:

      …and particularly large pits are called Stein as well, and the fruit with such are collectively called Steinobst.

      • j. says:

        Could be in origin a Low German calque in Finnish really, culinarily important pit fruits are mostly imports in Finland after all. (Still, seems like a very natural metaphor and this is attested also of some minor native fruits like the bird cherry, stone bramble or mezereum.)

  3. Y says:

    FWIW, the Journal of Negative Results in Biomedicine had a respectable 15-year run. It ceased publication because, as it says, it “has succeeded in its mission and there is no longer a need for a specific journal to host these null results. For authors seeking an alternative outlet for the publication of null results, a number of other BioMed Central journals will consider this content.”

    May all sciences welcome the publishing of negative results.

    • David Marjanović says:

      The open-access megajournals already do; I don’t know if Glossa counts as one, or if I remembered its name correctly.

  4. Niklas says:

    For what it’s worth, at one time I contemplated the possibility of connecting Proto-Sami *keað-kē with Komi kol’ ‘cone, nut’ and Udmurt kul’ï ‘penis’ dial. ‘cone’. I thought about a morphological analysis similar to what you propose here in that the Permic word would reflect an underived word PU *ked’ə and the Sami word a derivation *ked’-kä. The Permic words could in fact also reflect PU *kad’ə so phonologically they could very well belong here as well. I can’t remember why I didn’t pursue the connection further at the time, perhaps I couldn’t find a convincing semantic parallel or just forgot about it.

