Bolo!

Sanskrit Pronunciation: A Practical Overview

My goal with this guide is not merely to explain how to sound the letters of the Sanskrit alphabet. There are countless pronunciation guides online for that! Rather, I want to give a deeper understanding of how the sounds are formed in the mouth, and why almost all those pronunciation guides online are wrong. I will also show why the IAST transliteration system is by far the best.

The word “Sanskrit” — or saṃskṛta in IAST transliteration — means “well-made” or “perfected”. And it earns that name. It is well thought out, logically structured, and precise. This makes it relatively easy for non-native speakers like me to understand how to produce the sounds correctly (for the most part — there are a couple of tricky ones!).

Note: For now, don’t worry about how to pronounce the saṃskṛta words. The lines above and the dots below the letters might seem intimidating, at first. By the end of this guide you will understand what all those marks represent.

Pronunciation: The Theory

In many European languages, a single letter can represent many different sounds. For example, the “g” in “garage” makes two different sounds within the same word, as does the “c” in the Italian “cucina” (kitchen). This creates the need for extensive memorization of how to pronounce words. This is not the case with saṃskṛta, where every letter makes only one sound, and every sound has only one letter. This allows for easy pronunciation of written words. There are no confusing homographs or homophones like “buffet” (buff-ay or buff-it), “through / threw”, or “to / too / two”. Some English words even have several correct pronunciations. The following three words, “ante”, “anti”, and “auntie”, can all be pronounced “ant-ee”. The listener needs context clues to infer which word was said. These words can also be pronounced differently: “ante” as “ant-ee”, “anti” as “ant-eye”, and “auntie” as “awn-tee”. This isn’t possible in saṃskṛta, where every word is spelled and pronounced in only one way1. If you hear a well-pronounced word, you’ll know how to spell it. If you see a correctly spelled word, you’ll know how to say it. This logical simplicity comes down to understanding how the mouth forms sounds.

The Five Mouth Positions

To understand saṃskṛta pronunciation, we have to visualize the mouth and how it creates different sounds. There are five distinct mouth positions where the energy of a particular saṃskṛta sound originates. From back to front: velar, palatal, retroflex, dental, and labial.

The five mouth positions, the focal points of saṃskṛta sounds.

Throughout this guide, I will use the color coding shown above for the mouth positions. Wherever I reference the positions, I will highlight them with these colors. My hope is this will help emphasize the importance of learning and using the mouth positions.

Each of these mouth positions produces both a short and a long vowel sound, along with five stop consonants2. Each mouth position except velar also produces a semivowel, and each position except labial produces a fricative (I’ll explain these terms later).

The tongue and lips in each of the five mouth positions


Vowels

This will sound very obvious, but saṃskṛta is built with syllables. All languages are, of course. But saṃskṛta pays special attention to its syllabic nature by appending a vowel sound to every consonant by default. Every letter, unless otherwise specified, forms a complete syllable. For example, the consonant क is not transliterated or pronounced as “k”, but rather as “ka”, with a short a vowel built right in3. Saṃskṛta’s syllabism manifests in its various alphabets used over the past three millennia. From the ancient Brahmi script to the modern-day Devanāgarī, saṃskṛta’s predominant alphabet has been an abugida, or syllabic script. This means every syllable is written and treated as a single unit. Vowels are not written as distinct letters (unless the word begins with a vowel). They are written as diacritical marks appended to consonants.

A syllable is a packet of sounds that contains exactly one vowel. A syllable must have a vowel and it can have only one vowel. Vowels are the energy that gives syllables life. They are often referred to as the mātṛkā or śakti (“powers” or “energies”) of saṃskṛta. The short and long versions of each vowel sound the same; only the length is different. The long vowels are twice the length of short vowels.
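The one-vowel-per-syllable rule is simple enough to check mechanically. Here is a minimal Python sketch, offered only as an illustration, that counts the syllables of a word written in IAST by counting its vowels, and reports each vowel as one unit (short) or two units (long). The function names are hypothetical, and the greedy matching of ai and au assumes that an a followed by i or u always forms a diphthong.

```python
import re
import unicodedata


def nfc(text: str) -> str:
    """Normalize to precomposed characters so that a letter like 'ā' is one code point."""
    return unicodedata.normalize("NFC", text)


# The diphthongs "ai" and "au" are listed first so they match before a lone "a".
VOWEL = re.compile(nfc(r"ai|au|[aāiīuūṛṝḷḹeo]"))
SHORT_VOWELS = set(nfc("aiuṛḷ"))


def vowels(word: str) -> list[str]:
    """Return the vowels of an IAST word, in order."""
    return VOWEL.findall(nfc(word.lower()))


def count_syllables(word: str) -> int:
    """One vowel per syllable, so counting vowels counts syllables."""
    return len(vowels(word))


def vowel_lengths(word: str) -> list[int]:
    """Relative length of each vowel: 1 for short, 2 for long (diphthongs included)."""
    return [1 if v in SHORT_VOWELS else 2 for v in vowels(word)]


if __name__ == "__main__":
    for word in ("saṃskṛta", "devanāgarī", "dīrgha"):
        print(word, count_syllables(word), vowel_lengths(word))
```

Note that this measures only the vowels themselves. It says nothing about how surrounding consonants affect the weight of a syllable in poetic meter.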

Simple Vowels

The tables below show the five basic vowel sounds in their short and long forms. First is the Devanāgarī “independent” form, used only when the vowel begins a word. Next is the IAST transliteration of the vowel. Last is the Devanāgarī diacritical form attached to the consonant क (ka).

These five simple vowels are the pure sounds created from each of the five mouth positions. In fact, they define the mouth positions used for the rest of the alphabet.

Diphthongs

Diphthongs are compound vowels that combine two simple vowels to create a new vowel. All diphthongs in saṃskṛta are long vowels (dīrgha). They are called saṃdhyakṣara, which means “combined letters”.

Anusvāra and Visarga

The anusvāra (ं, transliterated as ṃ) and visarga (ः, transliterated as ḥ) are grouped with the vowels, but they are neither vowels nor consonants. They are not vowels because they can never follow a consonant. They are not consonants because they can never begin a syllable. They can only appear immediately after a vowel, and cannot precede a vowel. They serve to “close” the vowel.

Anusvāra (ṃ) closes the vowel with a resonant (nasal) sound from the mouth position of the consonant that follows it4.

Visarga (ḥ) closes the vowel with an unvoiced breathy (aspirate) sound through the vowel’s mouth position. If the visarga comes at the end of a sentence, it is common to add a voiced echo of the vowel after the breath. For example, aḥ would be pronounced “aha”. This is not a standard rule, and many traditions end the word with the unvoiced breath sound. As for which method is more correct, I don’t think there is a definitive answer.


Consonants

Consonants (vyañjana) in saṃskṛta are also called “stops”. This is because they stop the flow of air (and thus the sound) by means of contact within the mouth by either the tongue or lips. Full stops (spṛṣṭa) are made by complete contact that blocks the air. Partial stops (īṣatspṛṣṭa, also called semivowels) are made by partial contact that does not stop the air, but suppresses it through the various mouth positions.

Stop Consonants

There are twenty-five stop consonants, five for each of the mouth positions. The stop consonants have the following three characteristics:

The voiced (ghoṣa) and unvoiced (aghoṣa) sounds have two variants:

The table below shows all twenty-five stop consonants, sorted by mouth position and type of sound.
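As a rough stand-in for that grid, here is a small Python sketch that lays out the twenty-five stops by mouth position (rows) and type of sound (columns), and looks up where any one of them belongs. The names `POSITIONS`, `TYPES`, `STOPS`, and `describe` are hypothetical, chosen only for this illustration.

```python
# Rows: the five mouth positions, back to front.
POSITIONS = ["velar", "palatal", "retroflex", "dental", "labial"]

# Columns: unvoiced and voiced stops, each plain and aspirated, then the resonant.
TYPES = [
    "unvoiced unaspirated",
    "unvoiced aspirated",
    "voiced unaspirated",
    "voiced aspirated",
    "resonant (nasal)",
]

STOPS = [
    ["ka", "kha", "ga", "gha", "ṅa"],
    ["ca", "cha", "ja", "jha", "ña"],
    ["ṭa", "ṭha", "ḍa", "ḍha", "ṇa"],
    ["ta", "tha", "da", "dha", "na"],
    ["pa", "pha", "ba", "bha", "ma"],
]


def describe(stop: str) -> tuple[str, str]:
    """Return (mouth position, type of sound) for one of the 25 stops."""
    for position, row in zip(POSITIONS, STOPS):
        if stop in row:
            return position, TYPES[row.index(stop)]
    raise ValueError(f"{stop!r} is not a stop consonant")


if __name__ == "__main__":
    for position, row in zip(POSITIONS, STOPS):
        print(f"{position:<10}", "  ".join(f"{s:<4}" for s in row))
    print(describe("ja"))
```

Reading across a row keeps the mouth position fixed and changes only voicing, aspiration, or nasality. Reading down a column keeps the type of sound fixed and moves the point of contact forward in the mouth.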

Semivowels, Sibilants, and Aspirate

Transliterating with IAST

Before we move on to the practical application of what we’ve covered so far, I want to discuss transliteration.

The International Alphabet of Sanskrit Transliteration is the only transliteration system I use. It is very easy to learn and understand. It uses strict 1:1 character substitution, making it unambiguous and reversible. This means that IAST can be transliterated back into Devanāgarī or other Indic scripts with 100% accuracy.
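To show what character-for-character substitution looks like in practice, here is a minimal Python sketch that converts lower-case IAST into Devanāgarī. It is an illustration under stated assumptions, not a complete transliterator: the function names are hypothetical, two-letter units such as kh or ai are always read as a single sound, and things like avagraha, numerals, and accents are left out.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalize to precomposed characters so each IAST letter is one code point."""
    return unicodedata.normalize("NFC", text)


VIRAMA = "\u094d"  # removes the inherent "a" from a consonant

# IAST vowel -> (independent letter, sign attached to a consonant)
VOWELS = {nfc(k): v for k, v in {
    "a": ("अ", ""), "ā": ("आ", "ा"), "i": ("इ", "ि"), "ī": ("ई", "ी"),
    "u": ("उ", "ु"), "ū": ("ऊ", "ू"), "ṛ": ("ऋ", "ृ"), "ṝ": ("ॠ", "ॄ"),
    "ḷ": ("ऌ", "ॢ"), "ḹ": ("ॡ", "ॣ"), "e": ("ए", "े"), "ai": ("ऐ", "ै"),
    "o": ("ओ", "ो"), "au": ("औ", "ौ"),
}.items()}

CONSONANTS = {nfc(k): v for k, v in {
    "k": "क", "kh": "ख", "g": "ग", "gh": "घ", "ṅ": "ङ",
    "c": "च", "ch": "छ", "j": "ज", "jh": "झ", "ñ": "ञ",
    "ṭ": "ट", "ṭh": "ठ", "ḍ": "ड", "ḍh": "ढ", "ṇ": "ण",
    "t": "त", "th": "थ", "d": "द", "dh": "ध", "n": "न",
    "p": "प", "ph": "फ", "b": "ब", "bh": "भ", "m": "म",
    "y": "य", "r": "र", "l": "ल", "v": "व",
    "ś": "श", "ṣ": "ष", "s": "स", "h": "ह",
}.items()}

MARKS = {nfc("ṃ"): "\u0902", nfc("ḥ"): "\u0903"}  # anusvāra, visarga


def tokenize(text: str) -> list[str]:
    """Split IAST into sounds, preferring two-letter units such as 'kh' or 'ai'."""
    text = nfc(text.lower())
    tokens, i = [], 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in CONSONANTS or pair in VOWELS:
            tokens.append(pair)
            i += 2
        else:
            tokens.append(text[i])
            i += 1
    return tokens


def to_devanagari(text: str) -> str:
    """Convert lower-case IAST to Devanāgarī, one sound at a time."""
    out = []
    bare_consonant = False  # True while the last consonant still awaits its vowel
    for tok in tokenize(text):
        if tok in VOWELS:
            independent, sign = VOWELS[tok]
            out.append(sign if bare_consonant else independent)
            bare_consonant = False
            continue
        if bare_consonant:
            out.append(VIRAMA)  # no vowel followed, so cancel the inherent "a"
            bare_consonant = False
        if tok in CONSONANTS:
            out.append(CONSONANTS[tok])
            bare_consonant = True
        else:
            out.append(MARKS.get(tok, tok))  # spaces and punctuation pass through
    if bare_consonant:
        out.append(VIRAMA)
    return "".join(out)


if __name__ == "__main__":
    for word in ("saṃskṛta", "jña", "namaḥ"):
        print(word, "->", to_devanagari(word))
```

The short a gets an empty sign because Devanāgarī builds it into every consonant, which is exactly the “ka, not k” behavior described earlier.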

The Diacritics

IAST employs a handful of diacritical marks.

Comparing Systems

Saṃskṛta does not differentiate between upper- and lower-case letters. Proper nouns are not identified by capitalization, but by grammatical rules (noun declension). IAST uses only lower-case letters, the exception being if a saṃskṛta word comes at the beginning of an English sentence. This makes IAST far more readable than some other transliteration schemes. Some systems use a mixture of upper- and lower-case letters or interior punctuation to represent saṃskṛta characters.

Here is an example using two famous saṃskṛta names. I give them first in an Anglicized spelling (not a transliteration), followed by several popular transliteration systems, beginning with IAST.

It is immediately clear that IAST is the most concise and legible of these examples. Mixing upper- and lower-case letters, or sprinkling punctuation into the middle of a word, wreaks havoc on readability. When reading a full sentence, or a whole stotram, or conducting a long pūjā ceremony, quick and easy legibility is a must.
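To make the readability point concrete, here is a minimal Python sketch that converts lower-case IAST into Harvard-Kyoto, a widely used ASCII-only scheme that relies on capital letters to mark distinct sounds. Harvard-Kyoto is picked here purely as an illustration of a mixed-case system, and the function name is hypothetical.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalize to precomposed characters so each IAST letter is one code point."""
    return unicodedata.normalize("NFC", text)


# Only letters with diacritics change; plain ASCII letters are shared by both schemes.
_IAST_TO_HK = {
    "ā": "A", "ī": "I", "ū": "U",
    "ṛ": "R", "ṝ": "RR", "ḷ": "lR", "ḹ": "lRR",
    "ṃ": "M", "ḥ": "H",
    "ṅ": "G", "ñ": "J",
    "ṭ": "T", "ḍ": "D", "ṇ": "N",
    "ś": "z", "ṣ": "S",
}
HK_TABLE = str.maketrans({nfc(k): v for k, v in _IAST_TO_HK.items()})


def to_harvard_kyoto(word: str) -> str:
    """Convert a lower-case IAST word to Harvard-Kyoto."""
    return nfc(word).translate(HK_TABLE)


if __name__ == "__main__":
    for word in ("kṛṣṇa", "gaṇeśa", "śāntiḥ"):
        print(f"{word:<8} {to_harvard_kyoto(word)}")
```

Words like kṛṣṇa and śāntiḥ come out as kRSNa and zAntiH. Nothing is lost, but capitals now appear mid-word, which is the legibility cost described above.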

There are some very similar systems that use all the same diacritical marks as IAST. These include National Library at Kolkata romanization and the ISO 15919 standard. These systems were not designed for saṃskṛta alone, but for all Indian language scripts. These include Bengali, Tamil, Malayalam, Kannada, Telugu, Arabic, Nastaliq, Oriya, Gujarati, and others. Unlike saṃskṛta, some of these languages have both short and long diphthongs, meaning the e and o have a macron: ē and ō. Since I only work with saṃskṛta, these macrons are superfluous, and I would rather not have to type or read them. This is why I prefer IAST, and use it exclusively.

IAST at a Glance

This chart shows the full saṃskṛta alphabet in IAST. It is color-coded by mouth position, and the letters with diacritical marks are outlined. Until you become accustomed to what each mark means, use this chart as a reference.

A Brief History of IAST

Modern IAST is derived from a system developed and adopted at the 1894 International Congress of Orientalists in Geneva. The scholars of the time saw a growing need to standardize a system of transliteration. The goal was to eliminate confusion caused by differing systems in use across Europe, as well as to streamline the printing of transliterated texts. They took inspiration from the most prevalent systems of the day, as well as leading saṃskṛta scholars like Monier Monier-Williams.

The system they arrived at is very similar to modern IAST, with a few distinctions. Namely, despite a desire to use a macron for all the long vowels, due to printing limitations of the day, they were not able to place a macron over the letter “l” for the long dental vowel. Since the short vowel had a dot below, they opted for an “l” with two dots below for the long vowel (l̤). Other minor differences include “m with dot above” for anusvāra (ṁ), and “m with candrabindu” (m̐) for anunāsika. There were also two additional variants of visarga: jihvāmūlīya (ẖ), used when ka or kha immediately follows visarga, and upadhmānīya (ḫ), used when pa or pha immediately follows visarga.

Modern IAST is essentially a simplification of the 1894 system. We are now able to put a macron above the letter “l”, so the long dental vowel is ḹ. The candrabindu mark for anunāsika is superfluous, so ṁ is used. This leads to the anusvāra being ṃ. And the two visarga variants occur so rarely in saṃskṛta that modern IAST ignores them, using ḥ in all cases.

If you are so inclined, you can read the Report of the Transliteration Committee from the 1894 Congress. It has some interesting insights into how they decided which diacritical marks to use where.


Pronunciation: The Practice

So, how are all these things actually pronounced? We’ll examine that in detail, but first I want to cover some of the most common mistakes people make.

(Mis)Pronunciation

First, I want to address something that, to me, seems straightforward, yet is the subject of controversy. It has to do with the retroflex and dental vowels. How the heck do you pronounce those things?

Pronouncing ṛ and ḷ

Many saṃskṛta pronunciation guides will tell you to pronounce ṛ like “ri” as in “crisp”, but with a trilled “r”. This leads to the instruction that ṝ is pronounced as “ree” in “creek” (again with a trilled “r”). They will then tell you to pronounce ḷ like ṛ, but starting with an “l” sound, leading to the truly bizarre “lri” (and “lree” for ḹ). Some other guides and teachers will say that ṛ and ṝ are a rolled “r”, but for different lengths.

These guides have lost sight of the fact that these are simple vowels. They are the pure sound created from each mouth position. One could sound these vowels without variation for the length of a full breath. The notion that ṛ sounds like “ri” changes it from a simple vowel to, at best, a diphthong, and at worst, a consonant. And “lri” for ḷ is just wacky.

The simple vowel ṛ sounds very like an English “r” (ṝ being the same, only longer), but with the tip of the tongue pointed straight up (retroflex position). Likewise, ḷ sounds very much like an English “l” (with ḹ being the same, only longer), with the tip of the tongue right behind the teeth (dental position).

Charles Wikner — author of A Practical Sanskrit Introductory — offers the following exercises for how to pronounce these vowels:

To get to the correct pronunciation of ṛ, begin by sounding a prolonged i and slowly raise the tip of the tongue so that it is pointing to the top of the head, approaching but not touching the roof of the mouth. Do not try to hold the back of the tongue in the i position, nor try to move it out of that position: simply have no concern with what is happening at the back of the tongue, just attend to the tip of the tongue and listen. Repeat the exercise a few times until comfortable with the sound of ṛ then practice directly sounding ṛ for a full breath.

Similarly for ḷ start sounding with a prolonged i and slowly raise the tip of the tongue to behind the upper front teeth without touching them. Continue the exercise as for ṛ.

Wikner closes this section with the following explanation of how the mispronunciation of ṛ as “ri” came to be (emphasis added):

In practice when either of these vowels is followed by a consonant whose mouth position requires that the tip of the tongue be at a lower position, a vestigial i will emerge due to the bunching of the muscle at the back of the tongue when moving the tip downwards. For example ṛk tends to produce rik, but a word like kṛṣṇa should produce no i sound at all.

Troubles With jña (ज्ञ)

If you have even a passing familiarity with saṃskṛta, you have likely encountered the compound consonant jña. You’ll have seen words like jñāna (wisdom), yajña (sacrificial worship), and ājñā (the “third-eye” cakra located behind the point between the eyebrows). And someone probably told you to pronounce these letters as “gya”, i.e., gyāna, yagya, and āgyā. Other common pronunciations of this character are “nya” (as in the Spanish word “mañana”), “dnya”, and “gna”. In modern Hindi, the character ज्ञ (jña) is, indeed, pronounced “gya”. In Marathi, it becomes “dnya”. And in South Indian dialects, we have “gna”. But for saṃskṛta these are all incorrect.

Jña is a compound of two consonants: ja and ña. Ja is the voiced, unaspirated palatal consonant, and ña is the palatal resonant. Both sounds originate from the palatal position, so their combined sound must also originate there.

Again citing Wikner:

The pronunciation of this is similar to the French “J” as in “Jean-Jacques”, or as in the “zh” sound in the English words “mirage”, “rouge”, “measure”, or “vision”; but in all cases it is sounded through the tālavya (palatal) mouth position, and is strongly nasalized.

He gives the following exercise to practice this sound:

Now with the tongue in the palatal position, sound a prolonged śa (the palatal sibilant). And then repeat the sound but allowing the vocal cords to vibrate — with some imagination, this is beginning to sound like a prolonged ja (which is of course, impossible to sound). Now repeat this voiced sound allowing it to be strongly nasalized. This is about as close as one can get to describing the sound of jña.

To summarize, jña is pronounced as a strongly nasalized “zha” sound (as in “mirage” and “vision”) that originates from the palatal position. This is a very tricky sound to master, which is most likely why it devolved over centuries to the much simpler “gya”.

Silent Letters

Saṃskṛta has no silent letters. You will often hear Westerners pronounce brahmā as “braw-muh”, treating the “h” as silent. But every letter in saṃskṛta is important and needs proper articulation. Some people, knowing that the “h” is not silent, will pronounce it as “brum-haw”, with the “h” after the “m”. But this is also wrong. The letters must be sounded in the order they are written.


Pronunciation Guide

If you read everything up to this point and you’re still with me, I commend you and your commitment to refining your skills. Now we’re going to get into the practical stuff.

Pronouncing the Vowels

The table below lists the vowels in order by mouth position. An English approximation is given with the relevant sound highlighted.

The ai and au diphthongs are “moving sounds” (not a technical term). The ai vowel begins with a short a sound and moves into the i position. The au vowel begins with a short a and moves into the u position (this is different from the English sound, which begins with an “a” like in “cat”, and moves to the u position).

The e and o diphthongs are “constant sounds”, so the tongue and lips don’t move during the vowel.

Pronouncing the Anusvāra and Visarga

As explained earlier, even though they are neither vowels nor consonants, the anusvāra (ṃ) and visarga (ḥ) are typically grouped with the vowels. Therefore, I will include them here, as well. These sounds are grouped with the vowels because they can only appear immediately after a vowel, thus “closing” the vowel.

The anusvāra closes its preceding vowel with a resonant (nasal) sound that changes depending on what consonant comes next. An anusvāra followed by any velar consonant would be pronounced as the velar resonant (ṅ). An anusvāra followed by any labial consonant would be pronounced as the labial resonant (m).

Take, for example, this famous mantra for the elephant-headed deity gaṇeśa: oṃ gaṃ gaṇapataye namaḥ. This is often mispronounced as “gum ganapataye”. But because the consonant after the anusvāra is the velar “ga”, the anusvāra must be pronounced as a velar resonant: “gung”.
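The rule for sounding the anusvāra can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and leaving the anusvāra untouched before anything other than a stop consonant is my assumption for the sketch, since the rule above only covers the five groups of stops.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalize to precomposed characters so each IAST letter is one code point."""
    return unicodedata.normalize("NFC", text)


ANUSVARA = nfc("ṃ")

# Each resonant, paired with the first letters of the stops made at its mouth position.
# Aspirated stops (kh, gh, ...) begin with the same letter as their plain partners.
RESONANT_FOR = {}
for resonant, initials in [
    ("ṅ", "kgṅ"),  # velar
    ("ñ", "cjñ"),  # palatal
    ("ṇ", "ṭḍṇ"),  # retroflex
    ("n", "tdn"),  # dental
    ("m", "pbm"),  # labial
]:
    for letter in nfc(initials):
        RESONANT_FOR[letter] = nfc(resonant)


def sound_anusvara(text: str) -> str:
    """Replace each anusvāra with the resonant of the stop that follows it."""
    text = nfc(text)
    out = []
    for i, ch in enumerate(text):
        if ch == ANUSVARA:
            # Look past any space to the first letter of what comes next.
            following = text[i + 1:].lstrip()[:1]
            ch = RESONANT_FOR.get(following, ANUSVARA)
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    for phrase in ("gaṃ gaṇapataye", "śaṃkara", "saṃjaya", "saṃskṛta"):
        print(phrase, "->", sound_anusvara(phrase))
```

The output is a guide to pronunciation, not a change in spelling: gaṃ is still written with an anusvāra, but before gaṇapataye it is sounded as gaṅ.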

The visarga closes its preceding vowel with a breathy sound resulting from a small puff of air. The visarga uses the same mouth position as its preceding vowel. If the visarga is at the end of a sentence, it is very common to follow the breathy sound with a voiced echo of the vowel. Take, for example, the universal closing mantra oṃ śāntiḥ śāntiḥ śāntiḥ. The last word is often pronounced “shawn-ti-hi”. The gaṇeśa mantra from the previous example can end with either “namah” with a puff of air, or as “namaha” with a voiced a. The correct way is not definitively known. I come down on the side of the unvoiced puff of air, because a voiced echo adds a syllable to the word, which changes the poetic meter (if there is one). That being said, I still often say “namaha”, simply out of decades-long habit.
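Here is a small Python sketch of the echoed ending, to show how it adds a syllable. It is illustrative only: the function name is hypothetical, it handles only a final visarga after a, i, or u, and echoing a long vowel with its short form is an assumption of the sketch.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalize to precomposed characters so each IAST letter is one code point."""
    return unicodedata.normalize("NFC", text)


VISARGA = nfc("ḥ")

# Vowel before the visarga -> vowel used for the echo (assumed to be the short form).
ECHO = {nfc(v): e for v, e in {
    "a": "a", "ā": "a", "i": "i", "ī": "i", "u": "u", "ū": "u",
}.items()}


def with_echo(word: str) -> str:
    """Spell out a final visarga as 'h' plus a voiced echo of the preceding vowel."""
    word = nfc(word)
    if len(word) < 2 or not word.endswith(VISARGA):
        return word
    echo = ECHO.get(word[-2])
    if echo is None:
        return word  # vowel not covered by this sketch
    return word[:-1] + "h" + echo


if __name__ == "__main__":
    for word in ("namaḥ", "śāntiḥ"):
        print(word, "->", with_echo(word))
```

The word namaḥ has two vowels, while the echoed “namaha” has three. That extra vowel is the added syllable that can disturb the meter.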

Pronouncing the Stop Consonants

The twenty-five stop consonants are separated below by mouth position. Each table gives English approximations with the relevant sound highlighted, followed by specific guidance for pronouncing the sounds of that position.

Velar Mouth Position

The velar consonants are created with the back of the tongue and the soft palate, exactly as in English.


Palatal Mouth Position

The palatal consonants are created with the middle of the tongue against the hard palate. In English, it is common to make these sounds with the tip of the tongue placed at the palatal ridge behind the teeth, but this is incorrect for saṃskṛta.

Say the word “chai”, and pay attention to the tip of your tongue. If the tip of your tongue is raised and / or touching the roof of the mouth, you will have to make some adjustments to properly get to the saṃskṛta sound. Relax the tip of the tongue, and bunch up the middle of the tongue. Now try saying “chai” again, and keep the tip of the tongue relaxed. Practice this tongue position until it becomes very easy, as this applies to all the sounds from this mouth position.


Retroflex Mouth Position

Dental Mouth Position

I have grouped the retroflex and dental consonants together because neither of these mouth positions occurs in English.

In English these sounds occur as alveolar consonants, because the tip of the tongue touches the palatal ridge behind the teeth, called the “alveolar process” or “alveolar ridge”. The saṃskṛta versions of these consonants are pronounced similarly to the English, but with the tip of the tongue pointed straight up (touching the roof of the mouth) for the retroflex sounds, or straight forward (touching the teeth) for the dental sounds.


Labial Mouth Position

The labial consonants are created with the lips, precisely as they are in English.

Pronouncing the Semivowels

Every mouth position besides velar has a semivowel. The semivowels are similar to their simple vowel counterparts, but they act as the boundary of a syllable rather than the heart of the syllable (the “nucleus”, in phonetic terms). If you sound a simple vowel and immediately follow it with “a” to form a syllable, you get the semivowel for that mouth position.
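That relationship between simple vowels and semivowels fits in a small lookup table. The Python sketch below is only an illustration, and the names in it are hypothetical.

```python
import unicodedata
from typing import Optional


def nfc(text: str) -> str:
    """Normalize to precomposed characters so each IAST letter is one code point."""
    return unicodedata.normalize("NFC", text)


# Simple vowel (short or long) -> semivowel of the same mouth position.
SEMIVOWEL_OF = {nfc(v): s for v, s in {
    "i": "ya", "ī": "ya",  # palatal
    "ṛ": "ra", "ṝ": "ra",  # retroflex
    "ḷ": "la", "ḹ": "la",  # dental
    "u": "va", "ū": "va",  # labial
}.items()}


def semivowel_for(vowel: str) -> Optional[str]:
    """Return the semivowel matching a simple vowel, or None for the velar a / ā."""
    return SEMIVOWEL_OF.get(nfc(vowel))


if __name__ == "__main__":
    for vowel in ("a", "i", "ṛ", "ḷ", "u"):
        print(vowel, "->", semivowel_for(vowel))
```

The velar vowels return nothing because the velar position is the one position without a semivowel.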

Palatal — as explained above in the “chai” example, this semivowel is pronounced with the tip of the tongue relaxed and the middle of the tongue bunched up. Take care that you are not engaging the tip of your tongue when sounding this letter.

Retroflex — pronounced like an English “R”, but with the tongue-tip vertical at the roof of the mouth.

Dental — pronounced the same as an English “L”, taking care that the tip of the tongue is touching the upper teeth.

Labial — pronounced similarly to an English “W”, but the lips do not protrude as far forward (as when whistling).

For many years, before I truly understood the mouth positions, I made the mistake of pronouncing the labial semivowel (va) as an English “V” sound (as in “very” or “vertical”). But the English “V” is not a labial sound. It, along with its unvoiced counterpart “F”, is called a labiodental fricative: labiodental because it involves both the lips and the teeth (specifically, the lower lip curls under and touches the upper teeth), and fricative because it is created by the friction of air through a very narrow passage (i.e. a hissing sound)7. But the saṃskṛta “va” is a labial sound, and does not involve the teeth at all. For example, where I used to pronounce viṣṇu as “Vishnu”, I now pronounce it more akin to “Wishnu”. After training myself for years to always say “V”, even in words where it is more awkward like svāmi or tattva (both of which are much easier to say with the correct sound: “swami” and “tattwa”), the habit is deeply ingrained. If I don’t pay close attention I still absentmindedly default to a “V” sound. Don’t get frustrated by small errors. No matter how much we practice, we will still make simple mistakes. Just keep at it, and get a little better every day.

Pronouncing the Sibilants and Aspirate

The sibilants (hissing sounds) and aspirate (breathy sound) are called ūṣman, meaning “heated”. In English they are called fricatives.

Every mouth position except labial has a fricative. They are all unvoiced, and result from passing a stream of air through the mouth.

The aspirate has the same open throat and mouth as the simple vowel a, and a puff of air creates an “H” sound, just as in English.

The palatal and retroflex sibilants are similar to the English “SH”, but distinctly different from it. The English sound is a voiceless postalveolar fricative. The tongue tip is between the teeth and the alveolar ridge, and the teeth are clenched. For both the saṃskṛta sounds the tongue tip is relaxed, and the teeth are open. The palatal sibilant can be especially tricky for an English speaker. It is similar to the German “ich”.

The dental sibilant is the easiest, as it is a simple “S” sound. To quote linguist W. D. Whitney:

…it is the ordinary European S — a hiss expelled between the tongue and the roof of the mouth directly behind the upper front teeth.


In Review

Everything in this section is redundant. We covered it all in the previous sections. But presenting the same information in a different, more compact way can be quite helpful. Think of this section as a quick reference guide if you need a refresher on a particular mouth position.

The Saṃskṛta Alphabet

The table below lists all the sounds created from each of the five mouth positions. Excluded are the diphthong vowels, as they don’t belong to any particular mouth position8.
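As a compact recap of the same material, here is a Python sketch that stores the sounds of each mouth position in one structure and checks the two exceptions mentioned earlier: no velar semivowel and no labial fricative. The names `ALPHABET` and `sounds_at` are hypothetical and exist only for this illustration.

```python
# Sounds of each mouth position, back to front.
# Diphthongs, anusvāra, and visarga are left out, as in the table.
ALPHABET = {
    "velar":     {"vowels": "a ā", "stops": "ka kha ga gha ṅa", "semivowel": None, "fricative": "ha"},
    "palatal":   {"vowels": "i ī", "stops": "ca cha ja jha ña", "semivowel": "ya", "fricative": "śa"},
    "retroflex": {"vowels": "ṛ ṝ", "stops": "ṭa ṭha ḍa ḍha ṇa", "semivowel": "ra", "fricative": "ṣa"},
    "dental":    {"vowels": "ḷ ḹ", "stops": "ta tha da dha na", "semivowel": "la", "fricative": "sa"},
    "labial":    {"vowels": "u ū", "stops": "pa pha ba bha ma", "semivowel": "va", "fricative": None},
}


def sounds_at(position: str) -> list[str]:
    """List every sound of one mouth position: vowels, stops, semivowel, fricative."""
    entry = ALPHABET[position]
    sounds = entry["vowels"].split() + entry["stops"].split()
    sounds += [s for s in (entry["semivowel"], entry["fricative"]) if s]
    return sounds


if __name__ == "__main__":
    for position in ALPHABET:
        print(f"{position:<10}", " ".join(sounds_at(position)))
    print("total:", sum(len(sounds_at(p)) for p in ALPHABET))
```

Counting this way gives ten simple vowels, twenty-five stops, four semivowels, and four fricatives.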
