Three observations on Bactrian

As a part of my ongoing quest to get a better handle on the Indo-Iranian languages (mostly, yes, but not only due to their important early contact influence on the Uralic languages), some time ago I caught wind of Saloumeh Gholami’s PhD thesis Selected Features of Bactrian Grammar (2010) and have given it a thorough-ish read. Bactrian has been and probably continues to be one of the more poorly documented Iranian languages, and Gholami provides what seems like a good summary of the newer ongoing research.

Already at this point there are a few interesting observations to be made. I hope you will not be too disappointed to find out that my thoughts so far mostly involve the historical phonology of Bactrian. The syntax and morphology no doubt have interesting phenomena going on too, but I probably won’t be able to say anything intelligible about those before knowing much better how they work in the other Iranian languages from the same period and/or area (Sogdian, Xwareshmian, Middle Persian, Pashto etc.).

Gholami’s overview of the phonology of Bactrian is introductory in nature but still very historically grounded: she gives a pre-Bactrian etymology for almost every example word mentioned. These are not sourced, so it is hard to tell how far back they are supposed to go (all the way to Proto-Iranian?), but I get the impression that they’re based on earlier groundwork on Bactrian by Nicholas Sims-Williams, whom she mostly refers to for basics.

The thesis also does not contain any kind of a word index, so I’ve had to comb the initial chapters by hand for examples, getting a bit over 400 of them together. Further vocabulary would appear in the grammatical chapters with their extensive interlinear glosses, but generally without proto-forms. If we nevertheless suppose her pre-Bactrian reconstructions to be reliable, they seem to allow for the following observations.

One: there seems to be a rule of non-open vowel shortening.

Middle Iranian *ē (in Bactrian from Proto-Iranian *ai, *aya, *iya, *ā-i) is in Bactrian spelled varyingly as ‹η› (likely /eː/) or ‹ι› (likely either /i/ or /iː/). Gholami suggests that *ē develops to ‹ι› before a nasal, on the basis of the following data: *waina- > ‹οιν-› ‘to see’, *kainā- > ‹κινο› ‘revenge’, *abi-dayanā > ‹αβδδινο› ‘custom’, *abi-dayana-ka > ‹αβδδιγγο› ‘way, manner’, *xrayanā > ‹αρχινο› ‘purchase’. Raising of long vowels before nasals is common across Iranian, sure enough. However, Bactrian shows no signs of the parallel developments *ōN > **ūN (*gauni-čiya- > ‹γωνζο› ‘basket’, *čiyāt-gauna > ‹σαγωνδο› ‘as, like’) or *āN > **ōN (*bāmušn > ‹βαμοϸνο› ‘queen’, *gawāna > ‹γαοανο› ‘fault’, *nāma > ‹ναμο› ‘name’, *fra-māna > ‹φρομανο› ‘command’, *fšupāna > ‹χοβανο› ‘shepherd’…)

An assumption of pre-nasal raising also does not exhaust the cases with *ē > ‹ι›: this also occurs in *ziyakā > ‹ζιγο› ‘damage’, *waignā > ‹οιγνο› ‘famine’ (unless phonetically with [-ŋn-]?), *-iyaθwa > ‹-ιλφο› ‘a suffix’ (thanks Gholami, very illustrative glossing).

I would instead suggest the following rules:

  1. *ē gives ‹ι› before an original unstressed *ā. This handles ‘damage’ and ‘famine’, but also ‘revenge’, ‘custom’ and ‘purchase’. This is likely primarily a shortening *ē > *e, with raising *e > /i/ following only secondarily.
    • This does not seem to apply to /ē/ from i-umlaut of *ā: *dāraya- > ‹ληρ-› ‘to have’, *wādaya- > ‹οηλ-› ‘to lead’, *wādžaya > ‹οηζο› ‘ability, power’, *wi-čāraya- > ‹οισηρ-› ‘to purchase’. These could suggest either that implicit intermediate unstressed *ē (*dārē- > *dērē-, *wādē- > *wēdē- etc.) did not trigger shortening; or, alternately, maybe i-umlaut of *ā initially led to a distinct low front vowel *ǣ, which was only raised to ‹η› after the shortening/raising of *ē from *ai, *aya, *iya. The latter might be preferable in light of one case with *au > *ō > ‹ο› (rather than ‹ω›) before *aya > *ē: *tauxmaya > ‹τοχαμηιο› ‘relationship’ (here *ē is not lost; thanks to further suffixation?). As a vowel, ‹ο› probably mostly stands for /u/, as is suggested by its use also for /w/ (‹οηζο› = /wēdz/, etc.) and the general typology of vowel systems across Iranian: Old and Middle Iranian languages mostly do not have short /o/. [1]
  2. *ē gives ‹ι› also before word-final consonant clusters. (NB: ubiquitous final ‹-ο› is thought to be only a Greek-derived orthographic device.) This handles ‘way’, as well as the ‹-ιλφο› suffix, and maybe also *fšuyantīčī > ‹φινζο› ‘lady’ (though here we instead have *-uya-, which I suppose could have contracted to *ī rather than *ē already to begin with).
    • This is again applicable also to the development of *ō: *aitat-gaunaka > ‹δαγογγο› ‘such, in this way’, *bawanta > ‹βονδο› ‘completely’.
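The two rules for *ē can be stated mechanically, which makes it easier to check them against the data. The following is only a rough sketch: the function name `e_reflex` is made up for illustration, and the input forms are hypothetical, simplified intermediate stages (after contraction to *ē and loss of final vowels) that I have normalized by hand, not reconstructions taken from Gholami. "Unstressed *ā" is also simplified here to "the next vowel is *ā".

```python
import unicodedata

E_LONG = "\u0113"  # ē
A_LONG = "\u0101"  # ā
# Short vowels plus ā ī ū ē ō
VOWELS = set("aiueo\u0101\u012b\u016b\u0113\u014d")


def e_reflex(pre_form):
    """Predict the Bactrian spelling of *ē in a (hypothetical) intermediate form.

    Rule 1: iota if the next vowel after *ē is *ā.
    Rule 2: iota if *ē is followed only by a word-final consonant cluster.
    Otherwise: eta.
    """
    form = unicodedata.normalize("NFC", pre_form).replace("*", "").replace("-", "")
    tail = form[form.index(E_LONG) + 1:]
    next_vowel = next((ch for ch in tail if ch in VOWELS), None)
    if next_vowel == A_LONG:
        return "ι"  # rule 1
    if next_vowel is None and len(tail) >= 2:
        return "ι"  # rule 2
    return "η"


# ASSUMPTION: hand-normalized intermediate forms, for illustration only.
PRE_FORMS = {
    "revenge": "*kēnā",
    "custom": "*abdēnā",
    "purchase": "*xrēnā",
    "damage": "*zēgā",
    "famine": "*wēgnā",
    "way, manner": "*abdēng",
    "a suffix": "*-ēlf",
}

for gloss, form in PRE_FORMS.items():
    print(gloss, form, e_reflex(form))
```

In this narrow form, the rules predict ‹η› for a verb form such as *wēn-an, since the next vowel there is short *a.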

These rules only seem to leave the verb root ‘to see’ unaccounted for. However, a more general version of rule 1 might cover some inflected forms (*wēn-ēd > ‹οινηδο› ‘see.2PL’), and actually also an allomorph with retained *ē exists (*wēn-an > ‹οηνανο› ‘see.subjunctive-1PS’). Gholami thinks these are chronologically separated versions before and after the sound change from ‹η› to ‹ι› (early /wēn-/ > late /win-/?), but if there is a chronological difference, maybe this rather involves levelling-away of the /wēn-/ allomorph.

Rule 1 then suggests that before the onset of root stress and the reduction of all suffix and prefix syllables, Bactrian went through a stage of mobile stress attracted rightwards by long vowels, as I believe occurs in several other Indo-Iranian languages (though don’t ask me about the exact details on this).

Two, a few notes on vowels in prefixes. These are mostly reduced heavily, and are spelled varyingly with ‹α› or ‹ο›, which Gholami interprets as [ə]. E.g. *fra-gāwa > ‹φρογαοο› /frəɣāw/ ‘profit’, *ni-kanta- > ‹νακανδο› /nəkand-/ ‘to dig’, *uz-bara- > ‹αζβαρο› /əzvar-/ ‘to bring forth’. There is also epenthetic /ə/ before some consonant clusters: *spāsV > ‹σπασο ~ ασπασο› /spās ~ əspās/ ‘service’. Despite some cases of variation like this, schwa seems to be still an underlying phoneme, however: consider *xšayanta- > ‹αχανδ-› /əxānd-/ ‘to control’, with first *xš- > *əxš-, followed by *š > ∅ (if not rather > *hx > /xː/, spelled simply as ‹χ›?); and *upa-stāna > ‹αβαστανο› /əvastān/ or /əvəstān/ ‘support’. There doesn’t seem to be much evidence against considering [ə] an unstressed allophone of /a/, though. (Gholami takes no stance on questions about the phoneme inventory of Bactrian and operates only with orthographic vs. surface phonetic levels of analysis.)

There are also some cases where *ni- is still spelled as ‹νι-›. Gholami suggests that these would be retentions. I think they might however be secondary umlaut developments: in the data given, they occur mostly preceding a palatal root vowel ‹ι› or ‹η›, as in *ni-štaya- > ‹νιττι-› /nihti-/ ‘to send (a message)’; or preceding a palatal sibilant (possibly itself originally conditioned by *i through RUKI), as in *ni-šadman > ‹νιϸαλμο› /nišalm/ ‘seat’. There are also examples of ‹ι› continuing earlier prefixal *a in a similar context: *waz-antiyaka > *wəzindēg (with umlaut in the root: *a-i > *i) > ‹οιζινδδιγο› /wizindiɣ/ ‘current’. Gholami attributes this last example to a supposed development of *a to ‹ι› before /s z/, which would also be seen in *dasta > ‹λιστο› /list/ ‘hand’. There are however plenty of counterexamples, say *aspa > ‹ασπο› ‘horse’, *ā-xasa- > ‹αχασ-› ‘to quarrel’, *basta- > ‹βαστο› ‘to bind’, *dasa > ‹λασο› ‘10’; *azam > ‹αζο› ‘I’, *azdā > ‹αζδο› ‘knowledge’, *gazna > ‹γαζνο› ‘treasury’, *waza- > ‹οαζ-› ‘to use’. I don’t know what is up with ‘hand’; theoretically, some kind of suffixation to *dasta-ya- would work. [2]

Lastly, one case with the development of *fra- ‘pre-’ suggests that vowel reduction was actually fairly early, first turning this prefix into *fr̥-, which then in unstressed position mostly unpacks again to *frə-. Consider *fra-stāya- > ‹φοϸτιι-› ‘to send’: this exemplifies the sound change *rs > /š/ (compare e.g. *kr̥sta- > *kirsta- > ‹κιϸτο› /kišt/ ‘to detain’), and therefore requires *fr̥stēy- > *fštīy- > /fəštīy-/.

Three, the development of *š shows double treatment. Gholami notes that in some cases, *š is retained as ‹ϸ› /š/; in others, it develops to ‹υ› /h/, which can be further lost (or perhaps only unwritten in various consonant clusters, I wonder?). This does not appear to be a simple case of dialect mixture or whatever, since both outcomes can sometimes occur in the same word: *ni-šašta- > ‹ναυαϸτο› /nəhašt/ ‘to settle’.

Examining the data, to me the distribution does not appear to be entirely unpredictable, though. *š > *h seems to be the main development for *š originating by RUKI:

  • *is, *us > *iš, *uš > *ih, *uh
    • *awa-gta > ‹ωγοτο› /ōɣu(h)t/ ‘to conceal’
    • *d-manyu > ‹λρουμινο› /lruhmin/ ‘enemy’ [3]
    • *fra-ta-ka > ‹φρητογο› /frē(h)təɣ/ ‘messenger’
    • *kasta- > ‹κισατο› /kisə(h)t/ ‘youngest’
    • *ni-gaa- > ‹ναγαυ-› /nəɣāh-/ ‘to hear’
    • *ni-šašta- > ‹ναυαϸτο› /nəhašt/ ‘to settle’
    • *ni-štaya- > ‹νιττι-› /nihti-/ ‘to send (a message)’
    • *snā > ‹ασνωυο› /əsnōh/ ‘daughter-in-law’
    • *wi-šmāra- > ‹οαυμαρ› /wəhmār/ ‘to account’
    • *wrta-ka > ‹ροτιγο› /ru(h)tiɣ/ ‘rope’
  • *rs > *rš > *r(h)
    • *ā-pr̥št- > ‹βαρτ-› /var(h)t-/ ‘to be necessary’
    • *gr̥šta- > ‹γιρτο› /ɣir(h)t/ ‘to complain’ (past stem)
    • *hr̥šta- > ‹υιρτο› /hir(h)t/ ‘to leave’ (past stem)
    • *wi-xwata- > ‹οοχορτο› /wəxur(h)t/ ‘to quarrel’
  • *ḱs, *k⁽ʷ⁾s > *ćš, *kš > *š, *xš > *h, *x(h)
    • (PII *ćš >) *pašman > ‹παμανο› /pa(h)man/ ‘wool’
    • *āθriya > ‹χαρο› /x(h)ār/ ‘ruler’
    • *xšayant- > ‹χανδ-› /x(h)ānd-/ ‘to control’
    • *apā- > ‹χαβρωσο› /x(h)avrō(t)s/ ‘night-and-day’
    • *nauθra > ‹(α)χνωρο› /(ə)xnōr/ ‘satisfaction’
    • (PII *ćš >) *xšwašti > ‹χοατο› /xʷa(h)t/ ‘60’ [4]
    • *waa > ‹οαχο› /wax(h)/ ‘interest’

In one case I’m not sure if RUKI or *ćt > *št is involved: *paršti-čī- > ‹παρσο› /parts/ ‘backwards’.

Meanwhile, retention of *š seems to be entirely regular in the position *a_V, *ā_V. In these positions *š would perhaps be most likely to continue PII *sč < *sk(e), though *ćš is also an option, and some could be innovative Iranian vocabulary from somewhere else entirely:

  • *dāšinV > ‹λαϸνο› /lāšn/ ‘gift’
  • *fra-xāšaya- > ‹φριχηϸ-› /frixēš-/ ‘to seduce’
  • *paga-šaka- > ‹παχϸιιο› /paxšiy/ ‘in-law’
  • *uz-gaša- > ‹αζγαϸ-› /əzɣaš-/ ‘to dissent’
  • *xāša-ka > ‹χαϸιγο› /xāšiɣ/ ‘clothing’
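As a quick sanity check, the *a_V / *ā_V environment can be tested mechanically against the proto-forms as cited. This is a minimal sketch, with the helper name `s_between_a_and_vowel` invented for illustration; it treats morpheme-boundary hyphens as irrelevant and counts the placeholder "V" as a vowel, both of which are my assumptions.

```python
import re
import unicodedata

# Short and long vowels; "V" is the placeholder for an unspecified vowel.
VOWELS = unicodedata.normalize("NFC", "aāiīuūeēoō") + "V"
ENVIRONMENT = re.compile("[a\u0101]\u0161[" + VOWELS + "]")  # a/ā + š + vowel


def s_between_a_and_vowel(proto):
    """True if the proto-form contains *š in the position *a_V or *ā_V."""
    form = unicodedata.normalize("NFC", proto).replace("*", "").replace("-", "")
    return ENVIRONMENT.search(form) is not None


retained = ["*dāšinV", "*fra-xāšaya-", "*paga-šaka-", "*uz-gaša-", "*xāša-ka"]
shifted = ["*ni-šašta-", "*ni-štaya-", "*wi-šmāra-"]

for form in retained + shifted:
    print(form, s_between_a_and_vowel(form))
```

The check only says where retention is predicted by this environment; it says nothing about retained /š/ in other positions.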

A few clear cases of retained /š/ from RUKI also appear:

  • *kr̥šāka > ‹κιϸαγο› /kəšāɣ/ ‘plough-ox’ (<< PIE *kʷels- ‘to plough’)
  • *ni-šādman > ‹νιϸαλμο› /nišalm/ ‘seat’ (<< PIE *sed- ‘to sit’)

In most cases of retention I am not sure about the pre-Iranian origin of *š (but RUKI is conceivable in many of them):

  • *a-xwašn- > ‹αχοαϸνο› /axwašn/ ‘unpleasantness’ (any relation to Ir. *xwad- ‘to make pleasant’ < PIE *sweh₂d-?)
  • *bāmušn- > ‹βαμοϸνο› /vāmušn/ ‘queen’ (any relation to Persian بانو /bānu/ ‘lady’?)
  • *daxštana > ‹λαχϸατανιγο› /laxšətaniɣ/ ‘crematory’ (from pseudo-PIE *dʰegʷʰ-sth₂no-?)
  • *hāwišta-ka > ‹υαϸκο› /hāšk/ ‘pupil’
  • *pitr̥-šti- > ‹πιδοριϸτο› /piðurišt/ ‘ancestral estate’ (from pseudo-PIE *ph₂tēr-steh₂-?)
  • *ni-šašta- > ‹ναυαϸτο› /nəhašt/ ‘to settle’ (maybe *š-s > *š-š, if from PIE *steh₂-?)
  • *škara- > ‹αϸκαρ-› /əškar-/ ‘to follow’
  • *wi-xwarša- > ‹οοχωϸ› /wəxōš/ ‘quarrel’
  • *xšāya- > ‹ϸιι-› ? /šīy-/ ‘to be able’ (< PII *kšaH-? but cf. /x(h)/ in the derivative ‘ruler’)
  • *xšidža-ka- > ‹ϸιζγο› /šidzɣ/ ‘good’

I could suggest at least that before a vowel, *rš > /š/ (‘plough-ox’, ‘quarrel’), while before a consonant, *rš > /r(h)/ (‘to be necessary’, ‘to complain’, ‘to leave’, ‘to quarrel’ and ‘backwards’).
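This conditioning is simple enough to write out as a small function. The sketch below is illustrative only: `rs_reflex` is a made-up name, it looks at nothing but the segment immediately after *rš or *r̥š in the cited proto-form, and it ignores hyphens.

```python
import re
import unicodedata

VOWELS = unicodedata.normalize("NFC", "aāiīuūeēoō")
# r, optionally syllabic (combining ring below), then š, then the next segment
RS = re.compile("r\u0325?\u0161(.)")


def rs_reflex(proto):
    """Predict the outcome of *rš: /š/ before a vowel, /r(h)/ before a consonant.

    Returns None if the form has no *rš followed by another segment.
    """
    form = unicodedata.normalize("NFC", proto).replace("*", "").replace("-", "")
    match = RS.search(form)
    if match is None:
        return None
    return "š" if match.group(1) in VOWELS else "r(h)"


for form in ["*kr̥šāka", "*wi-xwarša-", "*ā-pr̥št-", "*gr̥šta-", "*hr̥šta-", "*paršti-čī-"]:
    print(form, rs_reflex(form))
```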

The cases with *št from PII *ćt seem to be rather evenly split. *š > *h appears in:

  • *aštā > ‹αταο› /a(h)tā/ ‘8’ (<< PIE *oḱtōw)
  • *ham-gašta- > ‹αγγιτι› /angi(h)ti/ ‘to receive’ (past stem) (< East Iranian *gādz- ‘to receive’, of unknown earlier origin per Cheung)
  • *ni-pixšta- > ‹νιβιχτο› /nəvix(h)t/ ‘to write’ (past stem) (<< PIE *peyḱ- ‘to paint, decorate’)

while retention appears in:

  • *pašti- > ‹παϸτο› /pašt/ ‘agreement’  (<< PIE *peh₂ḱ-; cf. pact)
  • *rašta- > ‹ραϸτο› /rašt/ ‘true, loyal’ (<< PIE *h₃reǵ-; cf. right)

Same goes for cases with *fš, though examples are rather rare:

  • *š > *h in *pati-fšarV > ‹πιδοφαρο› /piðəf(h)ar/ ‘honour’; *fšuyantīčī > ‹φινζο› /f(h)indz/ ‘lady’
  • retention in *kafši > ‹καφϸο› /kafš/ ‘shoe’
  • and even: *fš > /x/ in *fšupāna > ‹χοβανο› /xuvān/ or /xəvān/ (or /xʷvān/?) ‘shepherd’.

Is it perhaps relevant that ‘shepherd’ comes from PII *pću-, while the others are more likely to be from *ps with Iranian “second RUKI” to *fš? Maybe additionally *fš- > /f-/ root-initially versus retained medially.

It’s also worth pondering that *š > *h fits somewhat poorly into the phonological big picture of Bactrian. Usually *š >> /h/ correspondences go through *x (thus in e.g. Finnic or Spanish; Pashto remains at the /x/ stage), but Bactrian retains Proto-Iranian *x just fine. Two other possibilities come to mind, but they both would require Bactrian to have split off from the other Iranian languages relatively early:

  1. Perhaps *kʰ > /x/ and *kC > /xC/ are fairly late in at least some parts of Iranian: they are, after all, not reflected in some of the languages, such as Balochi and Wakhi. *š > *h could then have passed through a transient *x state already earlier.
  2. Perhaps the path was here rather *š > *s > *h, the second change being common (but not Proto-) Iranian. But this leaves many cases unexplained: original *-st- for example does not develop into **-ht-, but *-št- still does in many cases (‹νιττι-›, ‹φρητογο›, ‹ωγοτο›, etc.)

Having more clarity on this issue would likely require also examining the cognates elsewhere in Iranian, and not necessarily taking Gholami’s pre-Bactrian reconstructions as a given. But this remains difficult as long as there is no general Iranian Etymological Dictionary to consult.

[1] Gholami suggests /o/ for cases with *a-u > ‹ο›, such as *madu > ‹μολο› ‘wine’. Other eastern Iranian languages with this assimilation, though, end up with *u, e.g. Ossetian муд. I-umlaut of *a-i also gives ‹ι›, not ‹ε›, e.g. *kanyā > ‹κινο› ‘canal’.
[2] Sometimes it is proposed that ‘hand’ in Iranian would be native only in Persian, and borrowed from there to most of the other varieties, since this has PIE *ǵʰ- and is expected to give /d-/ only in Persian, but /z-/ elsewhere (and Avestan indeed has that). In this case the widespread Middle Iranian fronting of short *a to *æ, which appears to be absent from Bactrian, might result in *destV > *ðistV > /list/ in Bactrian. However I think that dissimilation before syllable-final *s is perhaps more likely: PIr *dzast- > *dast- (this proposal I’ve seen from Martin Kümmel). There is however the fact that ‘hand’ contains original PIE *s, while my counterexamples like ‘horse’, ‘10’ and ‘I’ mostly have secondary *s *z from PIr. *c *dz < PII *ć *dź⁽ʰ⁾ < PIE *ḱ *ǵ⁽ʰ⁾. This could perhaps be leveraged, if wanted, but I don’t see what phonetic sense this would make, and so I don’t feel like doing a full check-up on the matter.
[3] The (rather funky!) consonant cluster /lr-/ presumably by folk etymology from *drauga > ‹λρωγο› /lrōɣ/ ‘false(hood), wrong’.
[4] In principle pre-epenthesis *swašti > *šwašti could also work, with *š > *h then feeding into common Iranian *hw > *xw?

  1. Rastorgueva & Edelman’s “Etymological Dictionary of the Iranian Languages” (in Russian) now covers letters from A to N. Here is the pdf of volumes 1-3 (there are 5 volumes by now).

