A Pre-Homo sapiens Vowel Inventory Preserved in the Śivasūtras: Evidence from Comparative Primatology and Hominin Vocal Tract Evolution
Keyur Joshi
Independent Researcher, Amdavad, Gujarat, India
Abstract
This paper estimates the terminus post quem and terminus ante quem for the phonemic inventory (varṇamālā) preserved in the Śivasūtras using paleoanthropological evidence of vocal tract evolution.
The provisional terminus post quem (~7 Ma) is established by the phylogenetic emergence of consonantal production capacity at the Homo–Pan divergence. The terminus ante quem (~430 ka) is constrained by the loss of the anatomical ability to produce the vowels ऋ and ऌ as described in the Pāṇinīya Śikṣā and Ṛgveda–Prātiśākhya. These vowels require tongue root articulation at the hard palate and teeth, respectively, which became anatomically impossible following laryngeal descent in Homo sapiens.
The phonemic system preserved in the Śivasūtras thus reflects vocal tract constraints of pre-sapiens hominins, likely Homo erectus or related species. This hypothesis generates falsifiable predictions through acoustic modeling of reconstructed Homo erectus sensu lato vocal tracts and accounts for otherwise anomalous features including the absence of contrastive आ and retention of dual ह distinctions.
Cultural transmission of this inherited phonemic system through at least 430,000 years of linguistic continuity was maintained through structured teaching practices, with the Śivasūtras representing a systematic formalization of humanity’s phylogenetically deepest language substrate.
Scope & Methodology
The Śivasūtras comprise two distinct phonetic sections: vowels (svaras) and consonants (vyañjanas). Chronological bounds for the Śivasūtras are established by evaluating the following for each section:
- Terminus post quem: Determined by the phylogenetic emergence of the anatomical and articulatory capacities required for the production of individual vowels (svaras) in the hominin lineage. For consonants (vyañjanas), the relevant baseline is the Homo–Pan divergence, as no evidence exists for the independent production of speech-like consonants prior to this split in the hominin lineage. The composite phonemic inventory of the Śivasūtras (vowels + consonants) requires the prior existence of anatomical capacity for both classes of phonemes.
- Terminus ante quem: Inferred from comparison with articulatory descriptions in the Pāṇinīya Śikṣā and related texts, specifically:
- (a) Unique svara articulations are evaluated against vocal tract anatomy to determine the terminus ante quem from paleoanthropological constraints on relevant anatomy.
- (b) Svaras that are producible with the modern Homo sapiens vocal tract but are not included in the Śivasūtras inventory are identified and evaluated as additional constraints.
It is currently not possible to establish a terminus ante quem for the consonantal inventory independent of Homo sapiens, due to the lack of sufficiently detailed and reliable vocal tract reconstructions for earlier Homo species (e.g., Homo erectus). Therefore, the terminus ante quem for the overall (composite) phonemic inventory is evaluated primarily with reference to the vowel section.
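The bounding logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Constraint` class and `composite_bounds` function are hypothetical names introduced here, and the ages are the approximate figures quoted elsewhere in this paper (taking the older end of each quoted range), not independent estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    label: str
    age_ka: float  # age in thousands of years before present


# Approximate ages as quoted in this paper (older end of each range).
POST_QUEM = [
    Constraint("open-tract VLS capacity (LCA with Cercopithecoidea)", 25_000),
    Constraint("controlled vocal sequencing (Hominidae divergence)", 20_000),
    Constraint("consonantal capacity (Homo-Pan divergence)", 7_000),
]
ANTE_QUEM = [
    Constraint("laryngeal descent (Sima de los Huesos)", 430),
]


def composite_bounds(post_quem, ante_quem):
    """Return (terminus post quem, terminus ante quem) for the composite inventory.

    Every required capacity must already exist, so the youngest
    post-quem constraint is the binding one. The earliest loss of a
    required capacity is the binding ante-quem constraint.
    """
    tpq = min(post_quem, key=lambda c: c.age_ka)
    taq = max(ante_quem, key=lambda c: c.age_ka)
    if taq.age_ka > tpq.age_ka:
        raise ValueError("ante quem is older than post quem: empty interval")
    return tpq, taq


tpq, taq = composite_bounds(POST_QUEM, ANTE_QUEM)
print(f"post quem: {tpq.age_ka / 1000:g} Ma ({tpq.label})")
print(f"ante quem: {taq.age_ka:g} ka ({taq.label})")
```

Under these inputs the binding post-quem constraint is the consonantal one, which is why the composite inventory is bounded by the Homo–Pan divergence rather than by the older vowel-related dates.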
The vowel–consonant distinction is not a prerequisite for linguistic structure itself, as evidenced by sign languages, which function as complete systems without this duality. However, when language is instantiated in the vocal–auditory modality, this distinction becomes relevant. The focus of this study is not language per se but only the vocal–auditory modality of language.
Accordingly, this analysis treats the Pāṇinīya Śikṣā and the Śivasūtras not only as artifacts of a historical or grammatical language (Sanskrit), but also as a sophisticated phonetic-phonological systematization that preserves the fundamental articulatory categories of an inherited speech system. The study views these articulatory categories as internally consistent phonetic constraints that reflect the vocal tract capabilities of their original speakers, independent of subsequent historical authorship or modification.
Phonetic Framework & Evolutionary Inference
Vowels
The vowel inventory in the Śivasūtras comprises three categories—quantal vowels (अ, इ, उ, ए, ओ), derived vowels (ऐ, औ), and unique vowels (ऋ, ऌ)—whose production requires specific articulatory and acoustic mechanisms traceable through hominin and primate evolution. Phylogenetic mapping of these characters (presence/absence of vowel types and combinatory abilities) constrains the terminus post quem for the vowel system’s compilation to the emergence of these capacities in the last common ancestor with relevant taxa.
Quantal vowels (अ, इ, उ, ए, ओ)
Quantal vowels exhibit stable formant patterns (F1/F2 dispersion-focalization) that minimize acoustic-perceptual confusion, as modeled by Stevens (1989) and quantified in human supralaryngeal vocal tract (SVT) simulations (Lieberman et al., 1969; Lieberman, 2012). Lieberman postulated that vocalic speech is uniquely human and became available only after the descent of the larynx. More recent studies, however, indicate that vowel-like segments can be produced without a descended larynx. Acoustic analyses of Guinea baboon (Papio papio) vocalizations reveal homologous vowel-like segments (VLSs) matching human [ɨ, æ, ɑ, ɔ, u], organized along two axes: horizontal (tongue advancement: [æ] ⇔ [u, ɔ]) and vertical (tongue height: [ɑ] ⇔ [ɨ]) (Boë et al., 2017).
These VLS acoustic regions—[ɑ], [ɨ], [u], [æ], and [ɔ]—represent the biological-acoustic substrates for the Sanskrit vowels अ, इ, उ, ए, and ओ respectively, illustrating that the primary coordinates of the Śivasūtras vowel space are anchored in deep-time articulatory capacities. Baboons achieve quantal VLSs despite a high larynx, using tongue musculature (genioglossus, hyoglossus, styloglossus) comparable to early hominins. This demonstrates that a stable, quantal-like acoustic substrate can be generated without the descended larynx characteristic of Homo sapiens.
Such vowel-like segments (VLSs) are not vowels in the linguistic sense, but represent the biological-acoustic precursors from which the five primary human quantal vowels later evolved (Boë et al., 2017). Accordingly, the last common ancestor with Cercopithecoidea (~25 Ma) establishes a terminus post quem based on phylogenetic availability of open-tract VLS capacity, a prerequisite for the emergence of the quantal vowel substrate underlying the Śivasūtras’ vowel inventory. This bound applies to articulatory–acoustic capability alone and does not date the appearance of human-like vowels or symbolic phonological systems, both of which could arise only after this articulatory–acoustic capacity had emerged.
Derived vowels (ऐ, औ)
The diphthongs ऐ (/ai/) and औ (/au/) in the Śivasūtras inventory require coordinated sequencing of distinct articulatory targets within a single syllabic nucleus. This capacity for rapid, controlled transitions between vowel-like segments has been documented in non-human primates.
Campbell’s monkeys (Cercopithecus campbelli) produce context-specific call sequences through principled concatenation (Ouattara et al., 2009), demonstrating combinatorial capacity at the level of discrete call units. More recently, Girard-Buttoz et al. (2025) documented systematic call combinations in chimpanzees that expand semantic range through sequencing. While these combinations operate at the level of discrete calls rather than sub-phonemic transitions, they demonstrate the phylogenetic availability of sequencing mechanisms that could be repurposed for intra-syllabic diphthongal transitions.
The relevant capacity for diphthong production is not the presence of linguistic diphthongs in primate calls, but rather the motor control and perceptual discrimination abilities required to produce and recognize rapid formant transitions within controlled vocal gestures. These abilities are present in extant great apes and cercopithecoids.
The terminus post quem for diphthongal capacity therefore extends to the last common ancestor with taxa exhibiting controlled vocal sequencing, conservatively estimated at the Hominidae divergence (~15–20 Ma). This does not imply that diphthongs existed at this point, but rather that the articulatory-perceptual substrate for their later evolution was phylogenetically available.
Unique vowels (ऋ, ऌ)
Unlike the other vowels discussed so far, ऋ and ऌ are not represented as vowels in the IPA cardinal vowel chart. In IPA-based analysis, they are treated as syllabic liquids (consonants) rather than vowels. Their systematic phonological treatment as members of the vowel class is distinctive to the Vaidic Sanskrit traditions.
The Pāṇinīya Śikṣā (sūtras 4, 17–18) and Ṛgveda–Prātiśākhya (sūtra 41) describe ऋ and ऌ as non-contact (aspṛśya) sounds produced by the tongue root (jihvāmūla) at the hard palate (mūrdha) and teeth (dantya) respectively. On this description, the vowels are produced when the tongue root approaches but does not touch the hard palate (for ऋ) or the teeth (for ऌ).
In anatomically modern humans, the posterior third of the tongue (tongue root) occupies the oropharyngeal cavity rather than the oral cavity proper (Standring, 2020). At rest, it lies posterior to the hard palate and inferior to the soft palate, forming part of the anterior wall of the oropharynx. This spatial arrangement places the posterior third behind the posterior margin of the hard palate and separated from it by both anatomical distance and the intervening soft palate complex.
During normal physiological movement, contraction of the palatoglossus elevates the posterolateral margins of the tongue toward the soft palate and narrows the oropharyngeal isthmus. Its line of action is directed predominantly superomedially, with minimal anterior translational component. Accordingly, the posterior third of the tongue is not mechanically configured to approximate the hard palate (mūrdha) under normal articulatory conditions. Palatal articulations involve the tongue body or blade rather than the posterior third.
Crucially, the anatomical impossibility emerges only when the articulatory descriptions from multiple texts are synthesized. The Ṛgveda–Prātiśākhya specifies the articulator (tongue root, jihvāmūla), while the Pāṇinīya Śikṣā specifies the target (palate, mūrdha). Each description individually is anatomically achievable; the combination—tongue root approaching palate in non-contact articulation—is not. This distribution suggests that each text preserved a different component of an original unified articulation that had become impossible to produce as a complete gesture.
Compatibility with pre-sapiens vocal tract anatomy:
The earliest hyoid fossil evidence from the hominin lineage comes from Australopithecus afarensis (~3.3 Ma), which exhibits a bulla-shaped hyoid bone (Alemseged et al., 2006), consistent with the presence of laryngeal air sacs and a high (non-descended) laryngeal position similar to extant great apes. In this configuration, the tongue root was positioned more anteriorly relative to modern humans (Steele et al., 2013; Ekström & Edlund, 2023). This geometry would have made it anatomically feasible for the tongue root to approach the posterior hard palate in a non-contact manner.
In the absence of hyoid fossils from early Homo species including Homo erectus, phylogenetic inference suggests retention of the ancestral high-larynx configuration observed in A. afarensis and shared with the great ape outgroup. This inference is supported by cranial base morphology in H. erectus indicating reduced flexion compared to modern humans (Laitman et al., 1979).
The first evidence of a descended larynx similar to modern humans appears at Sima de los Huesos (~430 ka), where hyoid morphology indicates loss of air sacs and laryngeal descent (Martínez et al., 2008; de Boer, 2012). Thus, we place the terminus ante quem around 430 ka based on loss of anatomical ability to produce ऋ and ऌ in the manner prescribed in Vaidic traditions.
The above discussion focuses on ऋ. However, the same conclusions follow for ऌ when the analysis considers the tongue root approaching the molars instead of the hard palate.
Falsifiable predictions:
The pre-sapiens hypothesis generates testable acoustic predictions: reconstructed Homo erectus sensu lato vocal tracts with high larynx position and air sacs should produce ऋ-like and ऌ-like acoustic outputs when tongue root articulations approaching the hard palate (ऋ) and teeth (ऌ), respectively, are modelled.
Direct testing of this prediction requires hyoid fossil evidence from H. erectus to enable accurate vocal tract reconstruction. The discovery of such fossils would allow falsification of the hypothesis if the predicted acoustic properties are not obtained. Until such fossils become available, the hypothesis remains consistent with available evidence but awaits direct empirical testing.
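As a first-order illustration of the kind of quantity such modeling computes, the sketch below uses the standard quarter-wavelength approximation for a uniform tube closed at the glottis and open at the lips, F_n = (2n − 1)c / 4L. This is a textbook simplification and not the method of any study cited here: it ignores air sacs, tongue-root constrictions, and every other detail the prediction depends on. The function name, the speed-of-sound value, and the tract lengths are assumptions chosen for illustration only.

```python
SPEED_OF_SOUND_M_S = 350.0  # assumed value for warm, humid air


def uniform_tube_formants(length_m, n_formants=3, c=SPEED_OF_SOUND_M_S):
    """Resonances (Hz) of a uniform tube closed at one end, open at the other.

    Uses F_n = (2n - 1) * c / (4 * L), the quarter-wavelength formula.
    """
    if length_m <= 0:
        raise ValueError("tube length must be positive")
    return [(2 * n - 1) * c / (4.0 * length_m) for n in range(1, n_formants + 1)]


# Illustrative lengths only; these are not reconstructions of any fossil tract.
for label, length_m in [("17.5 cm tube", 0.175), ("14 cm tube", 0.14)]:
    formants = uniform_tube_formants(length_m)
    print(label, [round(f) for f in formants])
```

A shorter tube shifts all resonances upward by the same ratio. An actual test of the prediction would need an area function with a posterior constriction and an air-sac side branch, which in turn requires the fossil evidence discussed above.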
Terminus ante quem from laryngeal descent:
The loss of laryngeal air sacs and full laryngeal descent in Homo sapiens occurred approximately 430 ka (de Boer, 2012). This anatomical reconfiguration establishes a terminus ante quem for the productive use of ऋ and ऌ as described in the Ṛgveda–Prātiśākhya and Pāṇinīya Śikṣā texts. After this point, these sounds would have become increasingly difficult or impossible to produce as anatomically specified, though they could be approximated through compensatory articulations.
A substantial interval must separate the 430 ka terminus ante quem, after which ऋ and ऌ could no longer be produced as jihvāmūlīya-mūrdhanya sounds, from the earliest Vaidic texts. The abundance of ऋ, and the scarce but attested ऌ, in the Vaidic texts is possible only if linguistic and cultural continuity transmitted these phonemes as components of words, with ऋ and ऌ fossilized inside some of them. Because the jihvāmūlīya-mūrdhanya mechanism of production is itself preserved, the continuity must have included some form of formalized teaching and continuity of the language proper, not merely of individual words. We do not claim that these exact texts have a terminus ante quem of 430 ka; rather, the tradition from which they inherit their content does.
Their preservation in the Śivasūtras, Pāṇinīya Śikṣā, and Ṛgveda–Prātiśākhya despite anatomical obsolescence reflects the conservative nature of the teaching system through which they are inherited and subsequent deliberate orthoepic practices of the Vaidic tradition in the same spirit.
Additional Vowel System Features Supporting Pre-Sapiens Origin
Absence of आ as a contrastive phoneme
The Śivasūtras lack आ as a distinct phoneme. The later traditional explanation is that आ is the same as दीर्घ अ, treating vowel length as a phonemic rather than prosodic distinction. This is inconsistent with Aṣṭādhyāyī 8.4.68, where दीर्घ अ is categorically mentioned as distinct from आ. The absence is also puzzling from the perspective of modern Indo-Aryan languages, where the /a/–/aː/ contrast is phonemic.
However, this absence is consistent with pre-sapiens vocal tract constraints. The low back vowel /aː/ requires:
- Substantial pharyngeal expansion
- Low and retracted tongue position
- Stable low-frequency F1 production
These acoustic-articulatory requirements are optimized by the descended larynx and expanded pharynx of Homo sapiens but would be difficult to achieve with the higher larynx and reduced pharyngeal space of pre-sapiens hominins. The systematic absence of आ thus reflects the articulatory constraints of the original speakers rather than an arbitrary phonological choice.
Retention of dual ह distinctions
The Śivasūtras and related texts preserve distinctions in the articulation and acoustic quality of ह that are not perceptually contrastive for modern speakers. These distinctions involve differences in laryngeal configuration and airflow that would be more perceptually salient with the different laryngeal anatomy of pre-sapiens hominins.
The retention of these non-contrastive distinctions in a highly systematized phonemic inventory suggests that they were functionally contrastive in the original system and were preserved as conservative features despite loss of perceptual salience.
Incomplete prosodic grid (lack of full 3×3×2 structure)
Vaidic Sanskrit has a complex tone and length system having a 3×3×2 prosodic grid (three tones × three lengths × two nasalization variations). The Śivasūtras lack this full elaboration. Brevity is considered the primary reason for this omission. However, if the phonemic system has roots in pre-sapiens anatomy, the absence of the prosodic grid may suggest a system that predates the fine-grained temporal and pitch control enabled by full modulation and breath control capacity.
This absence of prosodic structure is not evidence in itself, but it is consistent with the intermediate phonetic capabilities of pre-sapiens hominins, whose vocal tracts would have supported basic phonemic contrasts but not the full prosodic range available to modern humans.
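The size of the 3×3×2 grid mentioned above can be made concrete with a short enumeration. The axis labels below (the three Vedic accents, the three traditional vowel lengths, and an oral/nasalized pair) are supplied here for illustration and are not taken from the text of this paper; `prosodic_grid` is a hypothetical helper name.

```python
from itertools import product

# Assumed axis labels, for illustration only.
TONES = ("udātta", "anudātta", "svarita")   # three pitch accents
LENGTHS = ("hrasva", "dīrgha", "pluta")     # short, long, prolated
NASALITY = ("oral", "nasalized")            # two nasalization variants


def prosodic_grid():
    """Enumerate every tone x length x nasality combination for one vowel."""
    return list(product(TONES, LENGTHS, NASALITY))


grid = prosodic_grid()
print(len(grid))  # 3 * 3 * 2 combinations
for tone, length, nasality in grid[:3]:
    print(tone, length, nasality)
```

Each vowel quality would therefore admit 18 prosodic variants under the full grid, none of which the Śivasūtras list individually.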
Consonants of Śivasūtras
While the vowel inventory provides the primary evidence for pre-sapiens origin, the consonantal inventory also reveals phylogenetic stratification.
Boë et al. (2017) demonstrate that vowel-like segments (VLSs) in baboons represent an ancient inheritance traceable to ~25 Ma, but Demolin and Delvaux (2006) establish a distinct boundary for consonantal production. In their study of bonobos (Pan paniscus), they report no speech-class consonants, attributing this absence to limited neural control and lack of dynamically configurable vocal tract shaping required for controlled strictures.
This phylogenetic boundary places the terminus post quem for the consonantal section at the Homo–Pan divergence (5–7 Ma), indicating that the consonantal inventory could not have originated prior to the hominin lineage proper. This does not prove that it emerged with the Homo–Pan split, but it indicates that it was not present before the split, as extant Pan are unable to produce consonants.
The terminus post quem for stable, contrastive consonantal systems is necessarily provisional, as the emergence of voluntary fine-grained supralaryngeal motor control cannot be directly inferred from the current fossil record.
The case of ळ (retroflex lateral)
ळ is absent from the Śivasūtras and the Dhātupāṭha, yet it appears in later Vaidic usage, where it is accepted in sandhi but not in lexical stem formation. This distribution suggests that ळ emerged or was phonologized after the Śivasūtra-level consonantal inventory had stabilized, supporting the hypothesis that the Śivasūtras preserve an earlier phonemic stratum. It also indicates that the Dhātupāṭha, the root stock for word genesis, had already stabilized before ळ evolved. The Dhātupāṭha is therefore the prime candidate for the lexical stock through which fossilized phonemic strata were preserved.
Although the time frame is unknown and difficult to estimate anatomically or phylogenetically, the evolution of the ability to produce ळ, or its acceptance as a phoneme in earlier periods, is a terminus post quem for the Vaidic period.
Synthesis: Phylogenetic Stratification in the Śivasūtras
Language has evolved over time. Phonemes are the building blocks of spoken language. The terminus post quem and terminus ante quem can be estimated for the evolution of phonemes used in human languages through paleoanthropology of vocal tract anatomy, comparative speech ability of primates mapped in a phylogenetic framework, and linguistic records when available.
The evidence presented reveals a phylogenetically stratified phonemic inventory:
Layer 1: Deep-time acoustic substrate (~25 Ma)
The quantal vowel space (अ, इ, उ, ए, ओ) reflects vowel-like segment (VLS) capabilities present in the last common ancestor with cercopithecoids, providing the biological-acoustic foundation for vowel production.
Layer 2: Hominin-specific capabilities (7–5 Ma)
The consonantal inventory reflects articulatory control capacities that emerged in the Homo lineage after divergence from Pan, including the ability to produce consonants. This is the terminus post quem and not a definitive timeframe for consonants.
Layer 3: Pre-sapiens phonemic elaboration (~430 ka or earlier)
The specific phonemic system preserved in the Śivasūtras—including ऋ, ऌ, the absence of आ, dual ह, and the incomplete prosodic grid—reflects a system optimized for pre-sapiens vocal tract anatomy, likely in Homo erectus or related species.
Layer 4: Cultural transmission and formalization (430 ka or earlier–present)
The inherited phonemic system was transmitted from pre-sapiens populations to Homo sapiens during their co-existence period through normal language acquisition processes, despite increasing anatomical mismatch. Subsequent formalization in the Vaidic grammatical tradition preserved this system through deliberate orthoepic practices.
Note on the Transmission Model
Because the Śivasūtras varṇamālā indicates an anatomical configuration with a high hyoid bone, and because it has been preserved, along with instructions on how to pronounce each varṇa, there must have been some form of transmission from Homo erectus sensu lato to Homo sapiens in which phonemic inventory and some form of instructions on its pronunciation were transmitted.
We propose a three-stage transmission model:
Stage 1: Phylogenetic inheritance and language emergence (430 ka or earlier)
Language emerged in Homo erectus or closely related hominin species, whose vocal tract anatomy differed significantly from that of modern Homo sapiens. This ancestral language system was optimized for the articulatory capacities of its speakers, including the presence of laryngeal air sacs and a higher laryngeal position. This anatomy left its footprint in the phonemes constituting the words formed in this era.
Stage 2: Cultural transmission during hominin co-existence (300–100 ka)
During the period of Homo erectus and Homo sapiens coexistence, anatomically modern humans must have acquired this pre-existing language system through cultural transmission, most likely through intergenerational teaching of children, similar to contemporary language acquisition. This cultural transfer must have occurred despite the anatomical mismatch between the inherited phonemic system and the modern human vocal tract. It is likely that descent of the larynx was a slow evolutionary process. At any given point, a population would have included individuals with different laryngeal configurations, much as a modern classroom includes students with different eye colors.
Language preservation is the same type of cultural transmission as that of tool-making techniques; it is simply harder to observe because it leaves no stone artifacts.
It is worth noting that if we consider Homo erectus ऋ and ऌ as real vowels producible with tongue root articulation at the hard palate, Homo sapiens ऋ and ऌ must have been approximations constrained by the descended larynx. This accounts for the preservation of articulatory descriptions in the Pāṇinīya Śikṣā that describe the original articulation, even as anatomically modern speakers could only produce acoustic approximations.
Stage 3: Vaidic formalization and conservative preservation (some time post-co-existence to present)
Following the demographic replacement of archaic hominins, the inherited phonemic system was preserved through the conservative orthoepic practices of the Vaidic tradition. The Śivasūtras represent a systematic formalization of this inherited inventory, capturing articulatory patterns that had become increasingly difficult or impossible for anatomically modern speakers but were maintained through deliberate cultural transmission.
This model explains why the Śivasūtras preserve features that are suboptimal or impossible for modern human vocal tracts: they reflect an inherited system rather than one that evolved de novo in anatomically modern human populations.
Falsifiable Predictions
This hypothesis generates several testable predictions:
Models of reconstructed Homo erectus sensu lato vocal tracts with laryngeal air sacs should produce ऋ-like and ऌ-like vowels when tongue root articulations are applied to positions described in the Pāṇinīya Śikṣā and Ṛgveda–Prātiśākhya.
Conclusion
The terminus ante quem for the phonemic inventory preserved in the Śivasūtras is approximately 430 ka (laryngeal descent and air sac loss in Homo sapiens) and the terminus post quem is approximately 7 Ma (Homo–Pan divergence, establishing consonantal production capacity).
Independent lines of evidence support this conclusion:
- The tongue root articulation of ऋ and ऌ at the palate, as described in the Pāṇinīya Śikṣā and Ṛgveda–Prātiśākhya, requires a tongue configuration incompatible with modern human anatomy but consistent with pre-sapiens vocal tract organization.
- Acoustic substrates for the quantal vowels appear in cercopithecoid vocalizations, while combinatorial sequencing capacities predate quantal stability in extant primates.
- The absence of contrastive आ and the incomplete prosodic grid indicate a stage before stabilization of low-back distinctions and fine temporal/tonal modulation.
- Retention of dual ह distinctions, lost in Homo sapiens perception, points to perceptual contrasts that were functionally significant in pre-sapiens populations.
- Consonantal production boundaries postdate the Homo–Pan split, as no speech-like consonants are attested in the Pan lineage, placing the system firmly within the Homo lineage.
These constraints place the origin of the preserved phonetic system between approximately 7 Ma and 430 ka. The proposal generates falsifiable acoustic predictions, most notably that air sac-mediated vowels in reconstructed pre-sapiens vocal tracts would produce rhotic-like and lateral-like vowel qualities approximating ऋ and ऌ, which can be tested through articulatory-acoustic modeling and future reconstructions of early Homo vocal tracts.
The Śivasūtras thus represent not merely a grammatical artifact, but a phonetic time capsule: a systematic preservation of humanity’s first language, inherited from our hominin ancestors and maintained through hundreds of thousands of years of cultural transmission.
References
- Alemseged, Z., Spoor, F., Kimbel, W. H., Bobe, R., Geraads, D., Reed, D., & Wynn, J. G. (2006). A juvenile early hominin skeleton from Dikika, Ethiopia. Nature, 443(7109), 296–301. https://doi.org/10.1038/nature05047
- Boë, L. J., et al. (2017). Evidence of a vocalic proto-system in the baboon (Papio papio) suggests pre-hominin speech precursors. PLOS ONE, 12(1), e0169321. https://doi.org/10.1371/journal.pone.0169321
- de Boer, B. (2012). Loss of air sacs improved hominin speech abilities. Journal of Human Evolution, 62(1), 1–6. https://doi.org/10.1016/j.jhevol.2011.07.007
- Demolin, D., & Delvaux, V. (2006). A comparison of the articulatory parameters involved in the production of sound of bonobos and modern humans. In A. Cangelosi, A. D. M. Smith, & K. Smith (Eds.), The evolution of language: Proceedings of the 6th International Conference (EVOLANG6) (pp. 67–74). World Scientific. https://www.worldscientific.com/doi/abs/10.1142/9789812774262_0009
- Ekström, A., & Edlund, A. (2023). Evolution of the human tongue and emergence of speech biomechanics. Frontiers in Psychology, 14, Article 1150778. https://doi.org/10.3389/fpsyg.2023.1150778
- Girard-Buttoz, C., et al. (2025). Versatile use of chimpanzee call combinations promotes meaning expansion. Science Advances, 11(3), eadq2879. https://doi.org/10.1126/sciadv.adq2879
- Laitman, J. T., Heimbuch, R. C., & Crelin, E. S. (1979). The basicranium of fossil hominids as an indicator of their upper respiratory systems. American Journal of Physical Anthropology, 51(1), 15–34. https://doi.org/10.1002/ajpa.1330510103
- Lieberman, P., et al. (1969). Vocal tract limitations on the vowel repertoires of rhesus monkey and other nonhuman primates. Science, 164(3884), 1185–1187. https://www.science.org/doi/10.1126/science.164.3884.1185
- Lieberman, P. (2012). Vocal tract anatomy and the origin of speech. In The Oxford handbook of language evolution (pp. 208–220). Oxford University Press.
- Martínez, I., et al. (2008). Human hyoid bones from the middle Pleistocene site of the Sima de los Huesos (Sierra de Atapuerca, Spain). Journal of Human Evolution, 54(1), 118–124. https://www.sciencedirect.com/science/article/abs/pii/S0047248407001960
- Ouattara, K., et al. (2009). Campbell’s monkeys concatenate vocalizations into context-specific call sequences. Proceedings of the National Academy of Sciences, 106(51), 22026–22031. https://doi.org/10.1073/pnas.0908118106
- Pāṇini. Aṣṭādhyāyī. Classical text, no modern edition specified.
- Pāṇinīya Śikṣā. sūtras 4, 17–18. Classical text, no modern edition specified.
- Ṛgveda–Prātiśākhya. Classical text, no modern edition specified.
- Standring, S. (Ed.). (2020). Gray’s Anatomy: The Anatomical Basis of Clinical Practice (42nd ed.). Elsevier.
- Steele, J., Clegg, M., & Martelli, S. (2013). Comparative morphology of the hominin and African ape hyoid bone, a possible marker of the evolution of speech. Human Biology, 85(5), 639–672. https://doi.org/10.3378/027.085.0501
- Stevens, K. N. (1989). On the quantal nature of speech. Journal of Phonetics, 17(1–2), 3–45. https://doi.org/10.1016/S0095-4470(19)31520-7