From Oxford Reference. (c) Copyright Oxford University Press, 2013. All Rights Reserved.

South Asian Languages

International Encyclopedia of Linguistics

Colin P. Masica

A major linguistic area, the South Asian languages are centered in the Indian subcontinent, hence this is often called the “Indian linguistic area.” It covers India, Pakistan, Bangladesh, Nepal, and Sri Lanka, with extensions into Afghanistan, Tibet, and Burma (see Map 1). It includes languages belonging to eight genetic groups:

Map 1. Distribution of Language Families of South Asia

  • (a) Dravidian: all languages

  • (b) The I[ndo-]A[ryan] sub-branch of I[ndo-]E[uropean]: all languages except Romany, which, having left the region, has lost many areal features

  • (c) The Munda branch of Austro-Asiatic: most languages (see below)

  • (d) The Iranian sub-branch of IE: those languages on the immediate margins of the subcontinent—whether originally of the Eastern Iranian group, like Pashto, or intrusive, like Western Iranian Baluchi

  • (e) The Nuristani sub-branch of IE, mainly in northeast Afghanistan

  • (f) T[ibeto-]B[urman], particularly languages within the mountain rim that defines the subcontinent—but also Tibetan and, in part, Burmese

  • (g) Tai: only the Khamti language, intrusive in Assam

  • (h) The language isolate Burushaski, in the far north of Pakistan

Mere physical presence within the geographic confines of the area, without much interaction with speakers of other languages, does not entail typological participation in it. Thus the Khasi language (belonging to the Mon-Khmer branch of the Austro-Asiatic family)—self-sufficient on a mountaintop in the middle of north-eastern India (now the State of Meghalaya)—does not really participate; and Sora of the South Munda group, hidden in rugged hills on the Orissa/Andhra border, does so only to a limited extent. (Although the Nicobar and Andaman Islands are politically part of India, their languages stand outside the area, both geographically and typologically.)

The morpheme-by-morpheme intertranslatability of local varieties of Marathi, Kannada, and Urdu—reported by Gumperz and Wilson 1971 for the village of Kupwar in the Maharashtra/Karnataka border area—suggests some of the processes that went into the formation of the South Asian area; however, this is not the general norm. More typically, South Asian languages show agreement at some points, while retaining language-specific or group-specific characteristics at others. Like all large linguistic areas with a complex history, South Asia shows many sub-areas of more intense or special convergence, or of partial convergence with other areas. It also has features that place it within larger areal configurations.

For reference, see Weinreich 1957, Andronov 1964, Kuiper 1967, Edel'man 1968, Vermeer 1969, Southworth 1971, Pandit 1972, Southworth and Apte 1974, Masica 1976, 2001, Shapiro and Schiffman 1981 (especially chaps. 4–8), Klaiman 1986, and Krishnamurti 1986 (especially Part II, pp. 123–285).

1. Features of larger areas

Pan-areal features defining South Asia as part of significantly larger areas include the following.

1.1. O[bject] V[erb] word order

This is dominant in a wide contiguous area of Asia, including Iran, Burma, Central Asia, native Siberia, Korea, and Japan—not, however, in further Southeast Asia or the Arab Middle East, where VO order prevails. A VO exception within South Asia, besides Khasi, is Kashmiri, where this is a recent innovation.

1.2. So-called absolutive or conjunctive participle

This is a non-finite verbal form in which one or more successive predications are subordinated to a final one containing a finite verb. It is largely co-extensive with OV syntax in Asia, occurring in Altaic and Iranian as well as in South Asia (including Kashmiri); but it extends farther, in a diminished role, in the west.

1.3. Explicator compound verb construction

Here a finite verb, one of a limited set of special auxiliaries, “completes or specifies the sense” of an immediately preceding main verb in the absolutive form. The two verbs refer to a single event (cf. Hook 1977). The distribution is narrower than that of the absolutive itself: it is characteristic in varying degrees of Central Asian Altaic and of Japanese and Korean. The resultative verbs of Chinese and mainland SE Asian languages are somewhat analogous. As a device, this contrasts with the use of prefixes in Slavic, German, Hungarian, and Greek to express similar meanings (with important differences).

1.4. Reduplication

As a syntactic (rather than word-forming) device, reduplication is found in all languages to some degree, but is highly developed, with special grammaticalized functions, in both South and SE Asia. Whether any of these functions is distinctively South Asian, and hence definitive of that area, is not clear (see Gonda 1949, Abbi 1987); they are generally least characteristic of Malayalam, according to Abbi.

2. Features of South Asia

Pan-areal features defining or characteristic of South Asia include the following.

2.1. Retroflexion

A contrast of retroflexion (some prefer the term “retraction”) in the apical stops ṭ/t, ḍ/d (and in some languages, also in nasals, laterals, and flaps, thus ṇ/n, ḷ/l, ṛ/r) is found in all the areal stocks: Drav., IA, Munda, TB (Tibetan itself and a number of Sub-Himalayan languages), Nuristani, Burushaski, Eastern Iranian (Pashto, Ormuri, some Pamir languages), and intrusive Western Iranian Baluchi, and even in Indian English. However, this clearest “South Asian” feature is not present in all languages of the area: it is absent from Assamese (easternmost IA) and from nearby eastern TB dialects (Bodo, Garo, Meithei, and the Naga and Kuki groups), as well as from the Munda languages Sora and Korku (the latter is located far to the west of other Munda languages, in the Satpura Range north of Maharashtra). These two Munda languages preserve a peculiar asymmetric system (dental voiceless t vs. retroflex voiced ḍ), which seems to have been characteristic of Proto-Munda. (On this and other phonological features, see Ramanujan and Masica 1969.) The retroflex opposition is most strongly developed in a band of languages stretching from eastern Iranian, Nuristani, and Burushaski in the northwest, southward through western IA to Drav., and around to Oriya on the east coast. These languages have ṇ or ḷ, or both, and, at the northwestern end, the additional oppositions c̣/c, ṣ/s, ẓ/z; they also typically have greater lexical and textual frequency of retroflexes.

2.2. Postpositions

rather than prepositions occur in all areal stocks, except Eastern Iranian and Nuristani on the western border of the area, which have both constructions. (Several optional prepositions have wandered farther east into Hindi.) This feature is shared with Altaic—with which the South Asian distribution is not contiguous, because of intervening Iranian and Nuristani. The pattern has been linked typologically with OV word order; but the two features are independent, as shown by their disassociation in Persian, which has OV with prepositions, transitional to “normal” VO with prepositions further west.

2.3. The “echo-word,”

repetition of a stem with change of the first consonant or syllable (typically to a labial in IA, a velar in Drav.), yields the meaning ‘and/or things like that’: Hindi caay-vaay ‘tea, etc.’, Telugu paalu-giilu ‘milk, etc.’ The function of the feature may be called ‘generalization’; it is also found in Munda and Eastern Iranian (with m-). Heston 1980 has pointed out that it extends to colloquial Iranian (using labials, e.g. cai-mai); its status in TB, Nuristani, and Burushaski is not clear.
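The echo-word rule described above can be sketched in a few lines of Python. This is only an illustration built on the three romanized examples in the text, not a description of any language's full morphophonology. The function names (`split_onset`, `echo_labial`, `echo_velar`) are hypothetical, the vowel inventory is simplified to plain Latin letters, and the assumption that the velar replacement gi- copies the length of the stem's first vowel is inferred from paalu-giilu alone.

```python
# Illustrative sketch of echo-word formation on romanized stems.
# Assumptions (not from the source): vowels are written with the plain
# letters a, e, i, o, u; a doubled or two-letter vowel counts as "long".

VOWELS = set("aeiou")


def split_onset(stem: str) -> tuple[str, str]:
    """Split a stem into its initial consonant(s) and the remainder."""
    i = 0
    while i < len(stem) and stem[i] not in VOWELS:
        i += 1
    return stem[:i], stem[i:]


def echo_labial(stem: str, labial: str = "v") -> str:
    """IA-style echo: the initial consonant(s) are replaced by a labial."""
    _, rest = split_onset(stem)
    return f"{stem}-{labial}{rest}"


def echo_velar(stem: str) -> str:
    """Drav.-style echo: the first syllable's onset and vowel become gi-/gii-."""
    _, rest = split_onset(stem)
    # Count the vowel letters that open the remainder (the first nucleus).
    j = 0
    while j < len(rest) and rest[j] in VOWELS:
        j += 1
    # Assumed rule: a long (two-letter) nucleus is echoed as long "gii".
    replacement = "gii" if j > 1 else "gi"
    return f"{stem}-{replacement}{rest[j:]}"


print(echo_labial("caay"))             # Hindi example from the text
print(echo_velar("paalu"))             # Telugu example from the text
print(echo_labial("cai", labial="m"))  # m- variant cited from Heston 1980
```

The labial is left as a parameter because the text reports both v- and m- as echo consonants in different languages.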

2.4. Phonesthetic or expressive words

form a large lexical category. These are no doubt found in many languages (Eng. zigzag, flipflop), but are highly developed in South Asia (a statistical criterion might be invoked), where they have specific formal characteristics. Reduplication occurs on the pattern CVC-CVC or CVCV-CVCV, along with partial reduplication under certain rules, a characteristic suffix -k, and verb-forming propensities. The pattern also yields many “areal etymologies” (Emeneau 1980). Possible connections with SE Asia, via Munda, need investigation.

2.5. Adj[ective] + N[oun] order

(a variable independent of OV/VO) occurs in Drav., IA (including Kashmiri), Munda, Eastern Iranian, Nuristani, and Burushaski, but generally not in TB, with some exceptions (Balti in Kashmir; Newari, Magari, Rai, and Sunwar in Nepal; Kanauri). This pattern is also characteristic of Uralic and Altaic, with which South Asia is narrowly connected via the Pamirs, and of (northern) Chinese; it contrasts with N + Adj order in SE Asia, Persian, and Arabic.

2.6. Dative subject construction

The experiencer of an act or condition is expressed as a “Subject” (or topic) in an oblique case, most often the dative. Although found in other languages (e.g. German Mir ist kalt), this construction seems to be developed to an extraordinary degree in most South Asian languages (statistical confirmation may be possible); it may be correlated with an areal semantic feature of volitionality (Klaiman 1986). It is also related to the absence of a verb ‘have’.

2.7. Quotative construction

is a quote or onomatopoeic expression with a postposed marker which is either equivalent to ‘having said’ or to ‘thus’, the latter paralleled by the Sanskrit element iti. This is Pan-Dravidian; in IA it is not now universal, but is widely distributed (Eastern group, Dakhani Hindi-Urdu, Nepali, Marathi, Gujarati). It is also found in TB (Newari, Ladakhi), but its status in Munda is not clear.

3. Sub-areal features

Cross-genetic features defining major sub-areas within South Asia include the following.

3.1. An opposition of nasalized versus oral vowels

is found in IA, Nuristani, Munda, TB, and apparently in Baluchi, but not in Drav. languages except the northern Kurukh (and allophonically in Telugu and Tamil); also lacking in Burushaski, and in Pashto and other Iranian languages. It has been lost in IA Sinhalese and Marathi. The nearest languages with similar oppositions are very distant.

3.2. Exclusively suffixing agglutinative morphology

(except for some lexical prefixes recently introduced via Persian and Sanskrit vocabulary) is typically Aryo-Dravidian, though other areal stocks, as well as Dravidian Brahui, have both prefixes and suffixes. (Munda even makes extensive use of infixes.) This feature sets modern IA apart from the rest of IE, including neighboring Iranian and Nuristani, and from ancestral Sanskrit. It is shared by Altaic and Uralic languages to the north; but these are cut off from the Aryo-Dravidian area by Iranian, Tibetan, and Burushaski, in which prefixes play important roles.

3.3. Elaboration of suffixal causative morphology

(many verbs have second causatives) is found in Drav., West/Central IA, and Baluchi. It is surrounded by languages (including Eastern IA and Sinhalese) which are limited to morphological first causatives. In Munda (originally), TB, and Burushaski, causative morphology is prefixal; but Northern Munda has developed suffixal devices. Second-degree morphological (suffixal) causatives are also characteristic of Altaic and Uralic, which are not strictly contiguous with the Aryo-Dravidian area.

4. Partial area features

Features attaching some South Asian languages to languages outside the area include those listed below.

4.1. Contrastive aspiration in consonants

is absent from Drav. (except in careful pronunciation of borrowed words in Telugu and Kannada), Iranian, and Nuristani (also from IA Sinhalese as a sub-areal feature), and is present only through borrowing in Munda; but it is a major feature of IA, Burushaski, TB, and of the Chinese and Tai languages beyond. While voiceless aspirates kh, ch, ph, etc. are widespread, voiced aspirates gh, jh, bh etc. are a peculiarity of IA. Their absence from a few IA languages (Punjabi, East Bengali dialects) correlates with the presence of tone and related phenomena (glottalization, etc.), which are also present in TB, Burushaski, Tai, and Chinese. Neither aspiration nor tone is characteristic of Altaic or Uralic.

4.2. Numeral classifiers

(as in ‘two-piece thing’) have developed mainly in Eastern IA (especially Assamese) and adjoining NE Drav., TB, and Munda languages, but they are peripheral to a center of the phenomenon in SE Asia. Classifiers are also found (independently?) in Iranian, according to Heston 1980.

4.3. Pronominal suffixes

link some northwestern IA languages and Drav. Brahui with Iranian generally, with Burushaski (prefixes), and with Nuristani.

4.4. The ergative construction

has a distribution now interrupted by the spread of Persian and Turkish, perhaps linking much of IA plus Tibetan, Burushaski, and various Iranian languages to the ergative languages of the Caucasus.

5. Area and history

The history of these and similar developments (the list above is not exhaustive) is varied and complex. Attention has long focused on the basic retroflexion feature which sets off the South Asian area from neighboring areas. Though clearly non-IE, it is found in the earliest Sanskrit. The opposition appears to be recent in Munda and Tibetan, but must be reconstructed for Proto-Dravidian. The debate has accordingly been over whether the developments in early Sanskrit were motivated purely internally (Hock 1984), or owe something to Drav. loanwords, or to the carryover of habits involved in language shift from a Dravidian or other substrate (Thomason and Kaufman 1988). Generally overlooked in this controversy is the high development of the opposition in the northwestern corner of the area, where it affects sibilants and affricates. Such sounds may have triggered the Sanskrit developments long ago in the immediate geographic vicinity.

It is not disputed that the formation of the area has involved a basic morphological, syntactic, and semantic remodeling of IA on a Drav. model, occasionally tempered by other influences. The next question is that of the similarity of Drav. to the Uralic and Altaic languages. ‘Typological’ explanations are not completely satisfying. Genetic relationship has long been proposed—earlier to Altaic, and more recently to Uralic. The similarity may reflect ancient areal association.

The South Asian area is important for linguistic area studies generally, since the genetic boundaries are clear within the subcontinent. There is no question that convergence has taken place across them, both locally and on a larger scale.

See also Areal Linguistics; Austro-Asiatic Languages; Burmese; Dravidian Languages; Indo-Aryan Languages; Indo-Iranian Languages; Munda Languages; Pashto; Tai Languages; Tibetan; and Tibeto-Burman Languages.


Abbi, Anvita. 1987. Reduplication structures in South Asian languages: A phenomenon of linguistic area. New Delhi: Center for Linguistics and English, Jawaharlal Nehru University.

Andronov, Mixail Sergejevič. 1964. On the typological similarity of New Indo-Aryan and Dravidian. Indian Linguistics 25.119–126.

Edel'man, Džoj Josifovna. 1968. Osnovnye voprosy lingvističeskoj geografii: Na materiale indoiranskix jazykov. Moscow: Nauka.

Emeneau, Murray B. 1980. Language and linguistic area. Stanford, Calif.: Stanford University Press.

Gonda, Jan. 1949. The functions of word duplication in Indonesian languages. Lingua 2.170–197.

Gumperz, John J., and Robert Wilson. 1971. Convergence and creolization: A case from the Indo-Aryan/Dravidian border in India. In Hymes 1971, pp. 151–167.

Heston, Wilma. 1980. Some areal features: Indian or Irano-Indian? International Journal of Dravidian Linguistics 9.141–157.

Hock, Hans Henrich. 1984. (Pre-)Rig-Vedic convergence of Indo-Aryan with Dravidian? Another look at the evidence. Studies in the Linguistic Sciences, 14:1.89–108.

Hook, Peter E. 1977. The distribution of the compound verb in the languages of North India and the question of its origin. International Journal of Dravidian Linguistics 6.336–349.

Hymes, Dell, ed. 1971. Pidginization and creolization of languages. Cambridge: Cambridge University Press.

Klaiman, M. H. 1986. Semantic parameters and the South Asian linguistic area. In Krishnamurti 1986, pp. 179–194.

Krishnamurti, Bh., et al., eds. 1986. South Asian languages: Structure, convergence, and diglossia. Delhi: Motilal Banarsidass.

Kuiper, Franciscus B. J. 1967. The genesis of a linguistic area. Indo-Iranian Journal 10.81–102.

Masica, Colin P. 1976. Defining a linguistic area: South Asia. Chicago: University of Chicago Press.

Masica, Colin P. 2001. The definition and significance of linguistic areas: Methods, pitfalls, and possibilities (with special reference to the validity of South Asia as a linguistic area). In Singh 2001.

Pandit, Prabodh B. 1972. India as a sociolinguistic area. (Dr. P. D. Gune memorial lectures, 3.) Poona: University of Poona.

Ramanujan, A. K., and Colin P. Masica. 1969. Toward a phonological typology of the Indian linguistic area. In Current trends in linguistics, vol. 5, South Asia, edited by Thomas A. Sebeok, pp. 543–577. The Hague: Mouton.

Shapiro, Michael C., and Harold F. Schiffman. 1981. Language and society in South Asia. Delhi: Motilal Banarsidass.

Singh, Rajendra, ed. 2001. The yearbook of South Asian languages and linguistics 2001. Delhi: Sage.

Southworth, Franklin C. 1971. Detecting prior creolization: An analysis of the historical origins of Marathi. In Hymes 1971, pp. 255–273.

Southworth, Franklin C., and Mahadev L. Apte, eds. 1974. Contact and convergence in South Asian languages. (International Journal of Dravidian Linguistics, 3:1–2.) Trivandrum: Dravidian Linguistic Association.

Thomason, Sarah Grey, and Terrence Kaufman. 1988. Language contact, creolization, and genetic linguistics. Berkeley and Los Angeles: University of California Press. Repr. 1991.

Vermeer, Hans J. 1969. Untersuchungen zum Bau zentral-südasiatischer Sprachen. Heidelberg: Groos.

Weinreich, Uriel. 1957. Functional aspects of Indian bilingualism. Word 13.203–233.

Colin P. Masica