An old etymology: aisti ~ ész

I find it interesting how modern advances in Uralic historical phonology can occasionally turn out to vindicate old sketchy etymological proposals, dating from the earliest phases of scientific comparison of the word stocks of the Uralic languages.

One of these cases appears to be a connection between Finnish aisti ‘sense(s)’ and Hungarian ész ‘reason’. This is a comparison that appears in 1800s work by the likes of Budenz. By the 1900s it had mostly been dropped, however (for decent reasons, as we shall see). But regardless, nowadays it seems that the two can be regularly connected after all.

Let’s start from the Finnish side. Any possibility of a comparison with Hungarian can only involve the first syllable, ais-. Once we factor in the other cognates within Finnic though, already internal reconstruction turns out to point towards the rest of the word being suffixal material.

“What other cognates”, you ask? Yes, there are no known cognates in the other Finnic languages, which usually doesn’t bode well for native origin. The Finnish dialects, however, make up a good reserve of lexical diversity (recall that “Finnish” in its widest sense is only a geographical collection of Finnic varieties, not an actual subgroup thereof). We can find in these some interesting parallel formations that allow some deeper exploration of this connection. Standard Fi. only provides aistin ‘sense organ’ and aistia ‘to sense’, both of which could be accounted for as derived from aisti itself. More interesting, however, are aisto ‘intent’ (Southwestern dialects) and astaita ‘to observe’ (Tavastian and Far Northern dialects).

The aisti ~ aisto doublet points relatively clearly to an earlier lost verb stem *aistaa. For an exact parallel, cf. paistaa ‘to bake’ → paisti ‘roast’, paisto ‘baking’. Also aistia can be then better analyzed as an iterative aist-i- derived from this stem, not as a zero-derivation of aisti.

The heavy stem structure, in turn, suggests a segmentation of this verb as *ais-ta-. Almost all Finnish verbs of the shape CVXCtA- (where X is any segment: vowel length, semivowel, or consonant) are derivatives, most often of this kind. [1] Likely semantics at this point will be something like *ais(i) ‘senses, observation’ (in any case a nominal) → *ais-ta- ‘to sense, observe’.

So far this reconstructed *aisi does not need to be any older than medieval Finnish. e-stem inflection **aisi : **aise- would usually be a good sign for dating a word back to at least Proto-Finnic, but we only have evidence for √ais- as purely a root element, not as an independent stem. It is true that consonant stems such as √ais- usually take e as their stem vowel, when required — but this kind of derivation can at least occasionally be based on other stem types as well. E.g. the i-stems kaali(-) ‘cabbage’, viini(-) ‘wine’ have regardless the analogical consonant-stem partitive singulars kaalta, viintä in colloquial Finnish; and verbs in -tA- derived from trisyllabic A-stem adjectives are quite regularly based on a consonant stem, e.g. kavala ‘treacherous’, kumara ‘slouched, bent’, matala ‘low’, viherä ‘green’ [2] → kavaltaa ‘to embezzle’, kumartaa ‘to bow’, madaltaa ‘to make shallower’, vihertää ‘to be verdant’.

The key evidence for projecting the root fairly far back comes instead from astaita. Why do we have as- and not ais- in here? I believe the answer is that this is a very old parallel derivative, already from a Proto-Finnic *aisi. The overheavy stem structure *CVXCtA- is innovative in Finnish. In certain very old cases, we instead see consonant cluster simplification to a regular heavy stem CVCtA-. At least the following three cases are still apparent:

  • *kanci > kansi ‘lid’, stem *kant(ə)- > kant(e)-
    → *kant-ta- > *katta- > kattaa ‘to cover’;
  • *nowsə- > nouse- (infinitive nousta) ‘to rise’
    → *nows-ta- > *nos-ta- > nosta- (inf. nostaa) ‘to lift, raise’;
  • *vejcci > veitsi ‘knife’, stem *vejcc(ə)- > veitse-, veis- (partitive veistä)
    → *vejc-tä- > *vec-tä- > dialectal vestä- (inf. vestää) ‘to whittle’.

*ntt > *tt, seen in the first example, has long been known, and has further support from inflectional morphology (e.g. in the ordinals: *kolmanci : *kolmant-ta > *kolmac : *kolmatta > kolmas : kolmatta ‘third’). Yet another instance of this same sound change, easiest formulated simply as *C₁C₂C₃ > *C₂C₃, is probably the loss of a stop before /st/. This is not evidenced in derivational morphology, but is quite regular in consonant-stem partitives or infinitives (the types lapsi : *laps-ta > lasta ‘child’; juokse- : *jooks-tak > juosta ‘to run’). The cases with loss of a semivowel do not build up a consistent picture at all, but I think the cases with similar loss of *n, *p, *k etc. allow putting them on firmer ground. — Standard Finnish has analogical veistää for ‘to whittle’, but most other Finnic languages still retain the soundlawful ⁽*⁾vestä-.
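The simplification *C₁C₂C₃ > *C₂C₃ can be checked mechanically against the examples cited above. The following Python is only a rough sketch: the function name `simplify_cluster`, the vowel inventory, and the hyphen-free input forms are all assumptions made for illustration, and the rule is applied blindly, without the chronological limits the actual sound change must have had.

```python
import re
import unicodedata

# Assumed vowel inventory for the reconstructed forms used here;
# every other letter (including the semivowels j, w) counts as a consonant.
VOWELS = "aeiouyäöəë"
CONSONANT = f"[^{VOWELS}\\W\\d_]"

# A consonant that is immediately followed by two more consonants.
FIRST_OF_THREE = re.compile(f"{CONSONANT}(?={CONSONANT}{CONSONANT})")

def simplify_cluster(form: str) -> str:
    """Apply *C1C2C3 > *C2C3: drop the first member of a three-consonant cluster."""
    form = unicodedata.normalize("NFC", form)
    return FIRST_OF_THREE.sub("", form)

# Pre-forms taken from the examples in the text, written without hyphens.
for pre in ["kantta", "nowsta", "vejctä", "lapsta", "kolmantta", "ajsta"]:
    print(f"*{pre} > *{simplify_cluster(pre)}")
```

On these inputs the sketch gives katta, nosta, vectä, lasta, kolmatta and asta, in line with the forms given above. It would of course also mangle later "overheavy" formations such as *aista-, which is exactly why the rule has to be dated before those arose.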

Therefore, I would reconstruct here PF *ajs-ta- → *as-ta-, an earlier doublet of the later *ais-ta-. Further derivational extension towards astaita may well be later, however.

If an earlier Finnic stem *aisë- < *ajsə- can be therefore assumed, it turns out that this will be an exact equivalent to Hungarian ész. The PU form can be reconstructed as *äśä: in Finnic we have first ä-backing to yield *aśə; followed by palatal breaking to yield *ajśə, and finally depalatalization to *ajsə. The modern Hungarian nominative could just as well continue earlier **eśV, but the short-vocalic (and vowel-stem) plural/accusative/possessive stem esze- clearly requires Proto-Hungarian *ä < PU *ä, just as in e.g. tél : tele- ‘winter’ < PU *tälwä > Fi. talvi.

(My proposal that *aĆV > *ajCV in Finnic still remains without a published defense. Anyone who is skeptical of this is welcome to reconstruct instead *äjśä in the meanwhile and assume cluster simplification in Hungarian.)
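The Finnic side of this derivation is just an ordered cascade of sound laws, which can be sketched as a list of rewrite functions applied in sequence. This is a minimal illustration only: the function names and the much simplified conditioning of each rule are my assumptions, tailored to the single form *äśä, and not a general model of Finnic historical phonology.

```python
import re
import unicodedata

VOWEL = "[aeiouyäöəë]"

def a_backing(form: str) -> str:
    """*ä-ä > *a-ə in a disyllabic stem (simplified conditioning)."""
    return re.sub(r"^([^aeiouyäöəë]*)ä([^aeiouyäöəë]+)ä$", r"\1a\2ə", form)

def palatal_breaking(form: str) -> str:
    """A glide *j develops between *a and a palatal consonant before a vowel."""
    return re.sub(f"a([ść])(?={VOWEL})", r"aj\1", form)

def depalatalization(form: str) -> str:
    """*ś > *s."""
    return form.replace("ś", "s")

# The order matters: breaking must apply while the sibilant is still palatal.
SOUND_LAWS = [a_backing, palatal_breaking, depalatalization]

def derive(form: str) -> list[str]:
    """Return the input form followed by the output of each sound law in turn."""
    form = unicodedata.normalize("NFC", form)
    stages = [form]
    for law in SOUND_LAWS:
        form = law(form)
        stages.append(form)
    return stages

print(" > ".join("*" + stage for stage in derive("äśä")))
```

Run on *äśä this prints the chain *äśä > *aśə > *ajśə > *ajsə. Swapping the last two rules would bleed the breaking and leave *asə, which is one way to see that the relative chronology is part of the proposal.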

While this seems to work in principle, a look into any relatively modern etymological dictionary of Hungarian will present a different, simpler etymology of ész: borrowing from Proto-Turkic *äs ‘memory, mind’. Does this show that the Finnish-Hungarian parallel is only an elaborate coincidence?

I could argue that the loan etymology being “more simple” is mostly cosmetic. An etymology is in fact not more unlikely just because it involves a larger number of sound laws, as long as those sound laws are established well enough in the first place. The entire point of reconstructing sound laws is to group the phonetic development of multiple words under a single assumed event. New examples of a known soundlaw do not constitute new assumptions by themselves. As for morphological complications, my internal reconstruction of pre-Finnic *ajsə < ? *äśä is based solely on the Finnish data, therefore has no bearing on how we analyze the Hungarian.

However, there is also a better option! The connection of Hungarian with Finnish does not mean we have to discard the Turkic comparison entirely: we can simply invert the direction of loaning, and analyze this as a Hungarian loanword in Turkic.

Many of the numerous word comparisons between Hungarian and Turkic originate quite clearly from the Turkic side. Identifying features are common enough, even among the oldest layer of loanwords:

  • unetymological sound structure in Hungarian, e.g. bölcső ‘cradle’ < *belćöw < *belćəɣ ← Turkic *belčik; initial /b-/ is clearly non-native, and there are also no clear precedents for clusters of liquid + affricate in Proto-Uralic.
  • with a loan origin further away than in Turkic, e.g. gyöngy ‘pearl’ < *ďinďü ← Turkic *jinjü ← Chinese (Mandarin zhēnzhū)
  • replacing an established Proto-Uralic term, e.g. hattyú ‘swan’ << *qottVŋ ← Turkic *qotaŋ; contrast PU ? *jëxćə.

But in the absence of any evidence of this sort, it does not seem clear that we would have to continue to simply assume the direction Turkic → Hungarian. In the current case we indeed have equivalent evidence in favor of the other direction (an unproblematic cognate in Finnic, which moreover requires PU *ś > Ugric *s, which change would then be reflected also in Turkic). There is reason to expect more symmetry going on anyway: some of these loanwords go quite far back, to the early 1st millennium CE, when “Turkic” would have still been barely more than a single language (if likely with incipient dialect divisions), while “Hungarian” (maybe less anachronistically: “Magyaric”) would already have been an established branch of Uralic. The fact that Turkic today is a major language family stretching from Anatolia to the Lena, while Hungarian is a single language isolated within its family, is a much later development, from around the 2nd millennium CE.

I expect that a closer look at Hungarian-Turkic lexical parallels will reveal also other cases that can be analyzed as Hungarian loans in Turkic at least equally well as in the opposite direction.

A layer of early Hungarian loans in Turkic could moreover account also for a number of the known “Ural-Altaic” lexical parallels. I’ve posted before about *qujaš ‘sun’. Two quick further examples:

  • Turkic *al- ‘lower, below’: often compared with PU *ëla ‘under, below’. This seems to show the common-in-Uralic sound change *ë > *a, as well as apocope; both of these can be seen also in Hu. al-.
  • Turkic *tāla- ‘to rob, plunder’: well comparable with PU *sala- ‘to steal, hide’. The phonetic development lines up best with Mansi or Samoyedic, where *s > *t. However, this could be perhaps derived also from a stage of Hungarian where *s > *ɬ had taken place, but further development > *h > ∅ had not. This would be then comparable to how /ɬ/ in Khanty tends to be borrowed as /t/ into Russian or other nearby languages that lack the sound. — EDAL compares the Turkic also with Korean and Japanese verbs for ‘to lure’, but this is a worse semantic match than comparison with Uralic (or, for that matter, with PIE *tsel- ‘to sneak’, whence Germanic *stela- ‘to steal’).

Elsewhere in Uralic, there are no clear inherited cognates that I would know of for my assumed *äśä. There is a Samic reflex though: PS *āj(c)cë- > NS áicat ‘to observe’, but the vowel correspondence *ā-ë ~ *a-e, and the unpalatalized sibilant, clearly point to a loanword from Finnic. (This also seems to have good chances of being one of the pseudo-PS reconstructions that never occurred in Proto-Samic proper.)

For a small tangent — the affricate -c- (= IPA [ts]) is interesting here. It would be possible to explore the possibility that this is somehow metathetical, and based on the suffixed verb *aista- (then further somehow contaminated with the *ë-stem noun to yield an *ë-stem verb). I suspect a different explanation, however.

Namely, the Samic languages are known to fortite inherited prevocalic *ś- to *č-. This unusual sound change could probably be reversed: taken to indicate that *ś- was originally an affricate *ć-, retained in Samic all along vs. normally deaffricated in all other Uralic languages. [3] The same is also suggested by the long-known Indo-Iranian loanwords like *śëta (*ćëta) ‘100’, *śarwə (*ćarwə) ‘horn’ (> PS *čuotē, *čoarvē > NS čuohti, čoarvi). Per the current understanding, these still had an affricate *ć in Proto-Indo-Iranian, retained as c in Nuristani. The fricatives ś in Indic and s in attested Iranian are therefore parallel innovations. Even Proto-Iranian may still require an affricate *c, to account for the development to /θ/ in Old Persian (though perhaps laminal [s̻] would work just as well). There also does not seem to be any reason to assume that any of the old II loans in Uralic would have come from the Indic branch specifically.

In Finnic, there is however no need to assume especially early deaffrication in all positions. We know by now that PF had an affricate *c, partly preserved in South Estonian, but later mostly deaffricated elsewhere. It would seem to be possible to assume that at least word-internal *-ś- in fact yields Proto-Finnic *-c-, not *-s-, and that this is only deaffricated later on, together with *c from other sources (such as the type *wetə > *veti > *veci > vesi ‘water’). — Since SE consistently only indicates s- for PU *ś- (sada ‘100’, sarv ‘horn’, silm ‘eye’, sälg ‘back’, süä ‘heart’, etc. [4]), I would however still assume early word-initial deaffrication: PU *ć- > *ś- > PF *s-. This would run in parallel to the often assumed development of PU *č- > *š- > PF *h-.

Therefore, to immediately correct what I wrote above, it may be preferable to assume an original preserved affricate: PU *äćä > *aćə > *ajćə > PF *aici, borrowed at this point into Proto-Samic.

[0] This post is an extended version of an etymology I have presented before at one of the University of Helsinki etymology workshops, in case anyone feels like the basic gist is sounding overly familiar.
[1] E.g. kieltää ‘to deny’ ← kieli : kiel(e)- ‘tongue’; saartaa ‘to surround’ ← saari : saar(e)- ‘island’; köyttää ‘to tie’ ← köysi : köyt(e)- ‘rope’; varttaa ‘to graft’ ← varsi : vart(e)- ‘stem’; haistaa ‘to smell (tr.)’ ← haista : hais(e)- ‘to smell (intr.)’. Perhaps also paistaa ← *pais(e)- ‘to be baked’, given paisua ‘to swell, expand’, perhaps originally used of dough. (This shorter stem in turn has been explained as deriving from PIE √*bʰeh₁- ‘to heat’.) A few “overheavy” verb stems have instead been formed by a suffix -stA- plus a contracted first syllable, though, such as maustaa ‘to season’ < *maɣusta- ← maku ‘taste’.
[2] Mostly replaced in standard Finnish by the metathetic reshaping vihreä.
[3] Without going into too much detail, this is a proposal that has already been made by various people, such as Abondolo, Janhunen and Katz. It does have the implication that something needs to be done with traditional PU *ć, though. The only reliable instances seem to be the clusters *-ńć- and *-ćć-. All other words are perhaps better considered later loanwords, diffused between the Uralic varieties. This is also suggested by how the candidates are disproportionally represented in Permic and Ugric anyway, and how they also often show vacillation between traditional *ć and *ś (say, ‘to break’: Permic *ćeg- < *ć-, but Ugric *säŋk- < *ś-).
[4] Võro-eesti synaraamat does have a nursery term tsimmä ‘eye’, but the affricate in here is probably better considered affective variation than an ancient retention. Secondary *ci- < *ti- can however remain; the usual example is tsiga ‘pig’ (~ Fi. sika).

8 comments on “An old etymology: aisti ~ ész”
  1. David Marjanović says:


  2. Crom daba says:

    Interestingly, the word isn’t attested in Old Turkic (contra EDAL), but only its presumed derivation ‘esirge-‘ “to pity, to begrudge”.

    There’s also Middle Mongol ‘esi’, a dis legomenon probably meaning “memory”*. The final vowel here shouldn’t be paragogic and so needs to be explained.

    * Mostaert reads “instruction” after Kowalewski’s 19th century dictionary, but since the sentence goes “This stele which [the Emperor] has erected, having become forever and always a _____, let the descendants of his descendants go upward” and “memory” appears to be the primary sense in Turkic, I’d go with “memory”.

    • j. says:

      Oldest Hungarian reflects at least some cases of PU stem-final *-a as -u (standard example: had ‘army’ < hodu < *konta, akin to Fi. kunta). We do not know the exact fate of *-ä at this time, but -i is conceivable.

      This would require some kind of an apocope cycle in Turkic, though, so that pre-PT *-I *-U are lost but later PT *-I *-U come from corresponding long vowels, or roots with final stress, or some kind of consonant-final roots, etc.

      • Crom daba says:

        Turkic apocope is very likely, there are a number of presumably Turkic words in Mongolic with final vowels that are not present in attested Turkic. Also, final vowels were probably phonetically long in Old Turkic.

        P.S. Is o > a in Hungarian regular? I couldn’t find any Hungarian Turkicisms with a *a ~ o correspondence which I kinda expected from Old Chuvash loans, were they just shifted back to a?

        • j. says:

          Yes, *o > a or á is just about exceptionless. I suspect the change may have been quite old, too. Old Hungarian had no short /e/ or /ö/, and so also o in cases like hodu may have been just a graphic device to spell [ɒ].

          Modern o is mainly from Old Hungarian u, partly also a before /l/+consonant (as in hal ‘to die’ : holt ‘dead’); long ó mainly from *aw (tó : tava- ‘lake’) and *uw > *ow.

          • Crom daba says:

            Good to know, I also erroneously thought that ‘a’s in Slavicisms are retentions from before the a > o change.

            Are you planning to make a Hungarian page at FrathWiki at some point? Other Uralic pages were pretty informative.

            • j. says:

              I do think a’s in Slavic loans in Hungarian are retentions of a kind. The bulk of the cases of *o > *a date to already common Ugric or common East Uralic (e.g. *kota ‘house’ > Hung. ház, Khanty *kaat (Far Eastern /kaat/ etc.), Mansi *kaatə (Old Northern ‹хотъ›)). They can well enough stand for *[ɔ] → [ɒ] though, so they do not require a fully open or unrounded value of *o in Slavic, they merely need to be older than the medieval u > o change.

              I’ve been hesitating writing a FrathWiki page on Hungarian; there’s a lot more literature on its history than on smaller Uralic groups such as Mari or Khanty, and most of it in Hungarian.

  3. David Marjanović says:

    PIE *tsel- ‘to sneak’, whence Germanic *stela- ‘to steal’

    Also still “sneak”: literary German sich davonstehlen “to sneak out”, verstohlen “shyly”.

