# Why do languages sound different?



## Ben Jamin

Listening to different languages and different dialects of the same language, I noticed that they differ not only in aspects of vowel articulation such as high/low, front/back and open/closed, and in pitch, but also in a quality that I have never found described or classified in any publication. I have given this quality two working names: "basso continuo" and "basic sound colour". Has anyone met any description of this sound quality?
It is similar to the difference in sound quality between musical instruments, for example between a flute and a trombone, or a guitar and a lute. This sound "colour" varies between individuals, but people speaking the same dialect share an easily recognizable common quality.


----------



## Hulalessar

I think that the "quality" you have in mind probably comes under prosody which covers features of speech other than individual phonemes. See this Wikipedia page for an introduction: Prosody (linguistics) - Wikipedia


----------



## S.V.

Ben Jamin said:


> Has anyone met any description of this sound quality?


For ex. "It is the even arrangement of syllables in Spanish that determines its syllable-timed rhythm type which is popularly known as the ‘machine gun’ rhythm or the so-called ‘staccato’ rhythm" (Odisho). 

Then for a simplification on _dialects_. After p. 290 in _Transcription of Intonation of the Spanish Language_ (UPF), you'd see some Argentinian Spanish. Along a kid's _My name is Homer_ → _He Fred_ ("a memorized pattern and [...] the learner's own rule"), you have the _merging_ of patterns with Italian's H+L*... This is, back when ...2nd-generation & non-Italian kids would grow up together.

That _centrifugal_ figure (Odisho) would give you a similar 'potential' in ES & IT, along the underlying morphology, etc. (that last _-o_ or _-a_ which marks 'hey, this is a noun'; cf. "syllable arrangement"+"stress-timed"). Then every generation's Venice or La Habana comes with a history book. On the language level, that red line fits our cousins from Portugal.


----------



## Ben Jamin

Hulalessar said:


> I think that the "quality" you have in mind probably comes under prosody which covers features of speech other than individual phonemes. See this Wikipedia page for an introduction: Prosody (linguistics) - Wikipedia


As far as I know the term _prosody_ is used for the syllable length, rhythm and pitch of speech. What I meant here is the quality of the sound, like playing exactly the same melody on different musical instruments, for example piano and clavichord, or oboe and clarinet. I am aware that persons speaking different dialects will also vary in prosody, so it is difficult to isolate the "sound colour" from pitch and the rest, but it is not impossible.
I am surprised that I am the only person in this forum that has noticed this phenomenon.
Living in Norway I noticed how much voice quality varies in various Norwegian dialects. The speech of some dialect speakers is reminiscent of, for example, the squeaking of an old door that has long gone unoiled, while others' voices are smooth, and still others have a raspy sound. Some dialects prefer falsetto tones, while others are spoken with a deep diaphragm voice.


----------



## Ben Jamin

S.V. said:


> For ex. "It is the even arrangement of syllables in Spanish that determines its syllable-timed rhythm type which is popularly known as the ‘machine gun’ rhythm or the so-called ‘staccato’ rhythm" (Odisho).
> 
> Then for a simplification on _dialects_. After p. 290 in _Transcription of Intonation of the Spanish Language_ (UPF), you'd see some Argentinian Spanish. Along a kid's _My name is Homer_ → _He Fred_ ("a memorized pattern and [...] the learner's own rule"), you have the _merging_ of patterns with Italian's H+L*... This is, back when ...2nd-generation & non-Italian kids would grow up together.
> 
> That _centrifugal_ figure (Odisho) would give you a similar 'potential' in ES & IT, along the underlying morphology, etc. (that last _-o_ or _-a_ which marks 'hey, this is a noun'; cf. "syllable arrangement"+"stress-timed"). Then every generation's Venice or La Habana comes with a history book. On the language level, that red line fits our cousins from Portugal.


Please, see my post #4.


----------



## entangledbank

One feature I have noticed, and which was not mentioned in phonetics classes or books (that I can recall), is that some accents are characterized by a constriction of the throat: some German speakers do this, and so do some Irish English speakers, though the effects are not the same. I am not musical, but this might have the effect of a general lowering or change of timbre.


----------



## Hulalessar

Ben Jamin said:


> As far as I know the term _prosody_ is used for the syllable length, rhythm and pitch of speech. What I meant here is the quality of the sound, like playing exactly the same melody on different musical instruments, for example piano and clavichord, or oboe and clarinet. I am aware that persons speaking different dialects will also vary in prosody, so it is difficult to isolate the "sound colour" from pitch and the rest, but it is not impossible.
> I am surprised that I am the only person in this forum that has noticed this phenomenon.
> Living in Norway I noticed how much voice quality varies in various Norwegian dialects. The speech of some dialect speakers is reminiscent of, for example, the squeaking of an old door that has long gone unoiled, while others' voices are smooth, and still others have a raspy sound. Some dialects prefer falsetto tones, while others are spoken with a deep diaphragm voice.


Voice quality, phonation or timbre comes under pitch, which is part of prosody.

In a musical instrument the string or column of air vibrates. The length of the string or column of air gives the fundamental note which establishes, for example, if it is middle C or the G immediately above middle C. However, vibration also occurs at integer fractions of the length of the string or column of air, at 1/2, 1/3, 1/4, 1/5 and so on, producing frequencies two, three, four, five and so on times that of the fundamental. The subsidiary vibrations, called overtones or harmonics, vary in intensity at different pitches according to the instrument and change as the tone continues. The subsidiary vibration may not be perfect. You can add to the mix noise, that is the indeterminate unpitched sounds, created by the instrument. The varying intensity and imperfections of the harmonics, the way they change after onset and noise are what give each instrument its characteristic timbre.
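
As a rough illustration of the point about harmonics, here is a minimal Python sketch. It assumes idealised harmonics at exact integer multiples of the fundamental, and the two amplitude profiles (`mellow` and `bright`) are made-up values, not measurements of any real instrument. The function names are hypothetical too.

```python
import math


def harmonic_frequencies(fundamental_hz, count):
    """Frequencies of the first `count` harmonics (the fundamental is n = 1)."""
    return [n * fundamental_hz for n in range(1, count + 1)]


def synthesize(fundamental_hz, amplitudes, sample_rate=8000, duration_s=0.01):
    """Sum sine waves at the harmonics, each weighted by its amplitude."""
    n_samples = round(sample_rate * duration_s)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        value = sum(
            a * math.sin(2 * math.pi * (k + 1) * fundamental_hz * t)
            for k, a in enumerate(amplitudes)
        )
        samples.append(value)
    return samples


MIDDLE_C = 261.63  # Hz, approximate

# Hypothetical amplitude profiles for the first four harmonics.
mellow = [1.0, 0.2, 0.05, 0.0]
bright = [1.0, 0.8, 0.6, 0.4]

tone_a = synthesize(MIDDLE_C, mellow)
tone_b = synthesize(MIDDLE_C, bright)
```

Both tones share the same fundamental, so they would be heard at the same pitch; only the relative strength of the harmonics differs, which is one ingredient of timbre. Onset behaviour and noise are left out of the sketch.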

The human speech organs can be compared to a musical instrument, but not to a carefully crafted one. In particular, at least in speech, the vocal cords do not produce a neat set of harmonics. Humans have some measure of control over the sounds produced by their speech organs and can vary the timbre just as they can vary other elements of speech. Every language variety has not only its inventory of phonemes but also its suprasegmentals, that is the elements of speech covered by prosody. It is often the case that getting the prosody wrong, rather than mispronouncing the phonemes, is what identifies a non-native speaker of a language. Indeed, if the speaker applies the prosody of his mother tongue it may identify what the mother tongue is.


----------



## Sobakus

No, voice quality and phonation don't come under pitch. They're not exactly part of prosody either, i.e. are not suprasegmental features. _Phonation type_ is just one dimension along which voice quality can vary, and is another term for _laryngeal features_; but these same features are also what distinguish voiced, voiceless, aspirated and breathy voiced stops for instance, which means they serve as segmental features.

One can speak with different voice qualities throughout, for example one can whisper, which obfuscates differences in voicing; but a whispered voice still retains pitch modulation that allows you to distinguish statements from questions by intonation. So voice quality is something overlaid both on segments (phonemes) and on prosody (vowel length, intonation), a third layer.

Here's a definition from the article _Voice quality_ by Esling, John H. (2006):


> The term ‘voice quality’ refers to those features of speech that are present more or less all the time that a person is speaking: the background characteristics perceived as the most constant or persistent over time. [...] It can either represent characteristics acquired as a part of ‘accent,’ the regional indicators or social markers of a person’s origins, or idiosyncrasies that in their particular and ultimately unique combination signal the identity of an individual speaker.



This version of the article has an intro section which highlights the 'third, overlaid' nature of voice quality by even calling it 'extralinguistic or paralinguistic'. Here's an excellent website based on that article that illustrates a large number of voice qualities in English. Here's a website exclusively on phonation types.

On the former website in particular, you can see and hear that voice quality actually makes use of most if not all the same types and places of articulation that languages use to distinguish phonemes from each other. This has interesting implications for sound change: an example scenario is that the _palatalised voice_ that used to characterise some dialect gets dropped in favour of the _modal voice_, but some consonants become re-interpreted as phonemically palatalised. This could happen to French in the near future. (Is there such a thing as a _cul de poule_ voice?)


----------



## Hulalessar

You can say three things about any sound: what frequencies it has; how loud it is; how long it lasts. If the aspect you are considering does not involve intensity or duration you must be concerned with pitch. There are two types of pitch: definite and indefinite, which rather than being mutually exclusive are on a continuum. No naturally occurring sound perceived to have pitch lacks overtones. Sounds perceived to have indefinite pitch have a sort of relative pitch; an empty tin can being hit with a stick will be perceived to have a higher pitch than an empty oil drum being hit with a stick, but it will be difficult to discern a fundamental tone. Very roughly, definite pitch involves regular sound waves and indefinite pitch turbulence. The primary distinction between vowels and consonants is that vowels have regular sound waves and consonants turbulence; both have pitch.

Intensity, duration and pitch can occur at two levels. They can be phonemic. In English intensity is phonemic; in Italian duration is phonemic; in Chinese pitch is phonemic. They are also features of speech and the elements of speech other than its phonemes; in that respect they are the concern of prosody.


----------



## Sobakus

@Hulalessar a few questions I'd like to receive definite answers for before addressing your reply.

1) Are you familiar at all with the notions of segment and autosegment (= a suprasegmental)?
2) Are you claiming that English stress, Chinese tones and Italian consonant length are phonemes or belong to one and the same level of phonology? What is that level?
3) Are you using "pitch" in the same sense when you say that "vowels and consonants have pitch," "Chinese phonemic pitch" and "pitch as the concern of prosody"? What does it mean for "pitch" to occur at two levels?


----------



## Hulalessar

Sobakus said:


> @Hulalessar a few questions I'd like to receive definite answers for before addressing your reply.
> 
> 1) Are you familiar at all with the notions of segment and autosegment (= a suprasegmental)?
> 2) Are you claiming that English stress, Chinese tones and Italian consonant length are phonemes or belong to one and the same level of phonology? What is that level?
> 3) Are you using "pitch" in the same sense when you say that "vowels and consonants have pitch," "Chinese phonemic pitch" and "pitch as the concern of prosody"? What does it mean for "pitch" to occur at two levels?


1) Yes.

2) When I say that stress in English, tone in Chinese and consonant length in Italian are "phonemic" I am using the word as defined by Random House Kernerman Webster's College Dictionary: "concerning or involving the discrimination of distinctive speech elements of a language".

3) I use pitch to mean the frequency of sound as perceived. What the frequencies of a sound are is something which can be determined by measurement. Pitch is dependent on frequency, but is more subjective in the sense that some sounds (classed as having definite pitch) have harmonic or near harmonic waves where the fundamental pitch can be identified, while other sounds (classed as having indefinite pitch) have inharmonic waves. Pitch is essentially about being able to distinguish "high" from "low", but not necessarily in the straightforward sense that where two different notes are sounded on, say, a piano one can easily be heard to be higher than the other.

Pitch can operate at two levels. In Chinese the pitch assigned to a word establishes its meaning. In Spanish it operates at the prosodic level to, for example, distinguish a statement from a question. Statements and "yes or no" questions may contain the same words in the same order, but the rise in pitch at the end tells you when it is a question.


----------



## Ben Jamin

Hulalessar said:


> ...; in Italian duration is phonemic;


Wikipedia: In Italian there is *no* phonemic distinction between long and short vowels, but vowels in stressed open syllables, unless word-final, are long at the end of the intonational phrase (including isolated words) or when emphasized.


----------



## Ben Jamin

I see that I have sparked a debate on a very high detail level of phonology. Actually I wanted only to get some comments on my assumption that languages and dialects differ in the sound quality, which is something above the standard features acknowledged by phonology (length, pitch, stress and rhythm), in a similar way as different musical instruments vary in sound quality playing the same melody. Can anybody confirm or disprove this hypothesis based on research (or personal observations)?
Here are recordings of three languages as illustration: Japanese, Standard Italian and Sicilian.


----------



## Hulalessar

Ben Jamin said:


> Wikipedia: In Italian there is *no* phonemic distinction between long and short vowels, but vowels in stressed open syllables, unless word-final, are long at the end of the intonational phrase (including isolated words) or when emphasized.


I was referring to the gemination of consonants.


----------



## Hulalessar

I do not think research is needed as it is plain that every language variety (whether considered a language or dialect) has a distinctive sound. The sound of any language is a combination of its phonemes and prosody. All sound is ultimately reducible to three elements: duration, pitch and intensity which all play their part at different levels. A traditional Liverpool accent is high pitched compared to most other accents of England. However, when a Liverpudlian speaks he still uses pitch to, for example, indicate a question.

The possible combinations of duration, pitch and intensity are endless. Bear in mind that not only can different language varieties be identified but so can individual speakers. The number of possible variations account for the diversity of different languages. However, whatever terms (such as "nasal", "squeaky", "dark" or "staccato") you may use to describe the overall sound of a language, it comes down in the end to duration, pitch and intensity.


----------



## S.V.

Ben Jamin said:


> Living in Norway I noticed how much voice quality varies in various Norwegian dialects


You also find that example https://www.let.rug.nl/gooskens/pdf/publ_comphum_2003.pdf#page=16 (_formants_ above).


----------



## Sobakus

Hulalessar said:


> 1) Yes.
> 
> 2) When I say that stress in English, tone in Chinese and consonant length in Italian are "phonemic" I am using the word as defined by Random House Kernerman Webster's College Dictionary: "concerning or involving the discrimination of distinctive speech elements of a language".


The second of these points disproves the first.

Generative linguistics separates speech into two distinct levels, the segmental and the suprasegmental (= autosegmental). "Distinctive speech element" = segment, corresponding to the traditional notion of the phoneme. Stress and pitch *are not segments*, but _suprasegmentals overlaid on segments_, and are the concern of prosody.




Hulalessar said:


> 3) I use pitch to mean the frequency of sound as perceived.


*Neither vowels nor consonants are perceived as pitches by humans*, they're perceived as speech sounds. Vowels are composed of four formant frequencies which on their own would be perceived as pitches, but together are perceived as speech segments, phonemes. Of these four formant frequencies, F0 is called the fundamental frequency, and it's this formant that is perceived as intonational pitch - it's the frequency at which vocal cords vibrate. All other frequencies that compose a vowel or a consonant are *not perceived as pitch*, but as vowel height, backness, labialisation, nasalisation and so on.
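
To make the separation between F0 and the rest of the spectrum concrete, here is a small sketch in plain Python. It is only an illustration under simplifying assumptions: the "voiced sounds" are synthetic sums of harmonics, the amplitude lists are invented stand-ins for two different vowel-like spectra, and the estimator is a bare autocorrelation search rather than what any real phonetics tool does. All names are hypothetical.

```python
import math


def synthesize_voiced(f0_hz, harmonic_amplitudes, sample_rate=8000, n_samples=832):
    """Sum of harmonics of f0_hz; the amplitudes shape the spectrum."""
    return [
        sum(
            a * math.sin(2 * math.pi * (k + 1) * f0_hz * i / sample_rate)
            for k, a in enumerate(harmonic_amplitudes)
        )
        for i in range(n_samples)
    ]


def estimate_f0(samples, sample_rate=8000, min_hz=60, max_hz=400):
    """Pick the lag with the largest autocorrelation and convert it to Hz."""
    min_lag = sample_rate // max_hz
    max_lag = sample_rate // min_hz
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(
            samples[i] * samples[i + lag] for i in range(len(samples) - lag)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Same F0, two different (invented) spectral shapes.
sound_a = synthesize_voiced(125, [1.0, 0.5, 0.3])
sound_b = synthesize_voiced(125, [0.4, 1.0, 0.7])

f0_a = estimate_f0(sound_a)
f0_b = estimate_f0(sound_b)
```

The two signals have different waveforms, yet the estimator recovers the same F0 for both, which mirrors the claim that intonational pitch tracks F0 while the remaining spectral shape is heard as something else.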


Hulalessar said:


> Pitch can operate at two levels. In Chinese the pitch assigned to a word establishes its meaning. In Spanish it operates at the prosodic level to, for example, distinguish a statement from a question. Statements and "yes or no" questions may contain the same words in the same order, but the rise in pitch at the end tells you when it is a question.


Both of these levels are levels of prosody; moreover, Chinese absolutely does have sentence intonation to distinguish statements from questions, just like Spanish does; conversely, Spanish uses pitch modulation over individual syllables, as well as words and groups of words.

Here's a good representation of the basic levels of Prosodic Phonology:



*All the levels* listed here, from Syllables to Utterance, are studied holistically by Prosodic Phonology. The level not labelled here is the lowest level, the mother node which represents individual speech segments, called the Segmental or Phoneme Tier; again, it's not a concern of prosody; although Autosegmental Phonology integrates it with prosody in order to explain segmental feature spreading processes such as consonant assimilation.

Below that you can see the Metrical Grid, which represents domains of stress or prominence - a specific concern of Metrical Phonology. *Pitch movement* operates on *all these prosodic levels* - the fundamental frequency (F0) changes over individual Syllables, Feet, Prosodic Words, Phonological and Intonational Phrases, as well as whole Utterances. All of these levels are characterised by an opposition of stressed and unstressed, higher and lower pitch, more and less prominent, head and dependent elements; here prominence is marked by X. None of these levels or their elements are segmental, but prosodic. The English and Spanish contrastive stress belongs to the Prosodic Word level and is an element of prosody. The Chinese tones are not fundamentally any different from these - they're simply an additional, separate suprasegmental tier, which not all languages possess. Other suprasegmental tiers have been postulated to successfully handle the Finnish vowel harmony or the Danish _stød_ (which is analogous to Swedish/Norwegian tones). All of these tiers are the concern of prosody. The Italian *geminate consonants are not the concern of prosody* - they're geminate segments.

*No level of prosodic hierarchy* nor any extant theory of prosody handles voice quality - this is a fact. Voice quality is relatively constant and neither lends itself to nor requires a prosodic description. It *isn't involved* in conveying *linguistic meaning*, but mood, emotion and so on. It's a third, paralinguistic level.


----------



## Sobakus

Ben Jamin said:


> I see that I have sparked a debate on a very high detail level of phonology. Actually I wanted only to get some comments on my assumption that languages and dialects differ in the sound quality, which is something above the standard features acknowledged by phonology (length, pitch, stress and rhythm), in a similar way as different musical instruments vary in sound quality playing the same melody. Can anybody confirm or disprove this hypothesis based on research (or personal observations)?


This is what I tried to do by mentioning voice quality. If this isn't "different musical instruments vary in sound quality playing the same melody", I don't know what is. I would like to know whether this is what you mean, or how much of a role do you think it plays in the overall perception that you're referring to here. Please listen to the example on this website.


----------



## dojibear

Hulalessar said:


> In English intensity is phonemic; in Italian duration is phonemic; in Chinese pitch is phonemic.


In American English, "stress" is pitch (not intensity) for most native speakers. Most 2-syllable words have 2 levels. A few longer words have 3 levels. 

English sentences also have clause-level pitch patterns (3 levels), that express a variety of emotions: anger, consoling, surprise, mockery, indignation, questioning, doubt, etc. I am not sure if that is "phonemic", since I'm not a linguist.

Mandarin ("official Chinese") has tones, but they are really pitch patterns. In theory (in classrooms, or in slow formal speech), each syllable has one of five pitch patterns: high, rising, low (down-up), falling, and neutral. In full-speed normal speaking, no syllable is pronounced long enough to change pitch. Each vowel has exactly one pitch. But it isn't as simple as 5 pitches. There are patterns: pitch levels between two adjacent syllables. For example, if two adjacent syllables are tone 1 (the "high" tone), they are both high, but the second syllable is a little lower in pitch than the first one. 

Chinese sentences also have clause-level pitch patterns, that express a variety of emotions (see above). As Sobakus says:


Sobakus said:


> Chinese absolutely does have sentence intonation


----------



## Ben Jamin

dojibear said:


> In American English, "stress" is pitch (not intensity) for most native speakers. Most 2-syllable words have 2 levels. A few longer words have 3 levels.


Do you mean that American dialects have no intensity stress? It sounds strange.


----------



## Sobakus

From the first Google result by Xingrong Guo (2022):


> English lexical stress primarily involves an emphasis on individual syllables in a polysyllabic word. It is usually manifested by fundamental frequency (F0), duration, and intensity. Stressed syllables are usually produced with relatively higher F0, greater intensity, and longer duration compared to unstressed syllables. [...] *Although researchers disagree on the weight of acoustic correlates of English stress, they all agree that stress is “not a single mechanism” and consists of three main cues (F0, duration, and intensity).*
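
For anyone who wants to see what the three cues look like as numbers, here is a toy sketch in plain Python. Everything in it is an assumption made for illustration: the "syllables" are pure sine tones with invented F0, amplitude and duration values, and the F0 estimate from zero crossings only works for such clean tones, not for real speech. The function names are hypothetical.

```python
import math


def make_syllable(f0_hz, amplitude, duration_s, sample_rate=8000):
    """A pure tone standing in for a syllable (a gross simplification)."""
    n_samples = round(duration_s * sample_rate)
    return [
        amplitude * math.sin(2 * math.pi * f0_hz * i / sample_rate)
        for i in range(n_samples)
    ]


def measure_cues(samples, sample_rate=8000):
    """Return the three cues: F0 (Hz), duration (s) and RMS intensity."""
    duration_s = len(samples) / sample_rate
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    # Indices where the signal crosses zero going upwards.
    crossings = [
        i for i in range(1, len(samples)) if samples[i - 1] < 0 <= samples[i]
    ]
    # One period lies between consecutive upward crossings.
    periods = len(crossings) - 1
    f0_hz = periods * sample_rate / (crossings[-1] - crossings[0])
    return {"f0_hz": f0_hz, "duration_s": duration_s, "rms": rms}


# Invented values: the "stressed" tone is higher, louder and longer.
unstressed = measure_cues(make_syllable(100, 0.5, 0.10))
stressed = measure_cues(make_syllable(125, 0.8, 0.20))
```

The sketch only shows that the three cues are separately measurable; how listeners weight them is exactly what the quoted passage says researchers disagree on.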


----------



## dojibear

Thanks for the link. "relatively higher F0" is the pitch difference that I hear. The other parts (intensity, duration) are less obvious to me. But maybe I just haven't noticed them.

I grew up learning (in school and everywhere else) that "stress" meant "intensity". Years later I noticed that "stressed" syllables were always pronounced at a higher pitch (in real sentences, not in 1-word audioclips). That changed my understanding. But my new view (expressed in post #19) is apparently also over-simplified. I'll read the article.


----------



## Hulalessar

As often happens in a thread there is confusion because people are using words with different ideas of what they mean. My definition of "pitch" is set out in post 11 and I distinguish between definite and indefinite pitch. Sobakus on the other hand uses "pitch" to refer solely to definite pitch.

The thread is headed: Why do languages sound different? The answer is that all sounds can be analysed in terms of: how loud they are; how long they last; and their frequencies. It is the different ways that loudness, duration and frequencies combine that explains the difference between one language and another.

The elements of loudness and frequency at the syllable level are relative. In a tonal language the pitch of the tones is not fixed, but varies according to the speaker and to the intonation of the utterance. (I never said that Chinese does not have sentence intonation.) A Spanish speaker distinguishes between _gano_ "I win" (stressed on the first syllable) and _ganó_ "he won" (stressed on the last syllable) whether he speaks softly or shouts and whether or not he is emphasising the word. Whilst every utterance is a complete entity it can still be analysed at both the syllable and utterance levels depending on what you are looking at.


----------



## S.V.

People can also think of images. With the _Addams Family_ (1964) set, you ignore 'color' with the _shape_ of that line.












But there are different ways of measuring an image. Cf. '_designed to approximate human vision_'1 & '_chosen to mimic human hearing_'2 (for the middle ones in #16). Then if we imagine a _surface_ for a dialect, Jamin may want an 'extant theory' that assigns a different name, if they are slightly bluer and 'harsher' in the colder regions. Or a bit less vivid in a certain city.


----------



## Sobakus

Hulalessar said:


> As often happens in a thread there is confusion because people are using words with different ideas of what they mean. My definition of "pitch" is set out in post 11 and I distinguish between definite and indefinite pitch. Sobakus on the other hand uses "pitch" to refer solely to definite pitch.


It's clear to me that, unfortunately, you have insufficient linguistic knowledge to talk about speech sound perception on linguistic science's terms; as a result, you're trying to fill that gap by introducing into it *musical pitch*, which leads to wide-ranging confusion. The goal of my replies was to show this and to counteract this confusion. I care deeply about explaining things in a precise, understandable and scientifically correct way, and the effect of your replies in this thread has unfortunately been the opposite on all these counts. I think people who talk with authority on scientific matters on a public forum should take extra caution so as not to spread confusing or confused information.

I must emphatically repeat that speech sounds are not perceived as pitch, neither absolute nor indefinite. They're perceived as combinations of articulatory features when one's grammar allows their perception; otherwise they're perceived as strange, foreign speech sounds that the listener will struggle to repeat; nevertheless they aren't heard as pitch.

Again, a number of *frequencies, not pitches* can be distinguished in vowels. These individual *frequencies* will be heard as pitches only when artificially synthesised in isolation. These perceived pitches might be absolute (definite) or indefinite. But when produced in combination, human listeners perceive them as a vowel that they can try and reproduce.


Hulalessar said:


> The thread is headed: Why do languages sound different? The answer is that all sounds can be analysed in terms of: how loud they are; how long they last; and their frequencies. It is the different ways that loudness, duration and frequencies combine that explains the difference between one language and another.


*Sounds as physical phenomena* can indeed be analysed in this way. This is a subject that physics deals with; *linguistics* deals with this only when conducting instrumental studies. Humans are unable to perceive absolute loudness, duration and frequencies. Birds potentially do that when they imitate human speech, phone rings, dogs and car alarms.

This is what my own not insignificant knowledge of phonetics and phonology tells me. If you disagree with this thesis, please provide any linguistic work that describes human speech perception using these physical terms. It is my firm and well-founded understanding that humans perceive combinations of loudness, duration and frequencies as *speech segments.*


Hulalessar said:


> The elements of loudness and frequency at the syllable level are relative. In a tonal language the pitch of the tones is not fixed, but varies according to the speaker and to the intonation of the utterance. (I never said that Chinese does not have sentence intonation.)


You cannot deny saying here that Spanish has something that Chinese doesn't, that being "prosodic pitch". Otherwise you would have gone on talking about Chinese instead of introducing Spanish:


Hulalessar said:


> Pitch can operate at two levels. In Chinese the pitch assigned to a word establishes its meaning. In Spanish it operates at the prosodic level to, for example, distinguish a statement from a question. Statements and "yes or no" questions may contain the same words in the same order, but the rise in pitch at the end tells you when it is a question.


—


Hulalessar said:


> A Spanish speaker distinguishes between _gano_ "I win" (stressed on the first syllable) and _ganó_ "he won" (stressed on the last syllable) whether he speaks softly or shouts and whether or not he is emphasising the word. Whilst every utterance is a complete entity it can still be analysed at both the syllable and utterance levels depending on what you are looking at.


I have to say it bothers me when I'm being talked past in this manner after I have gone out of my way to explain the issue and illustrate it with diagrams and links to references and publications. Please re-read this:


Sobakus said:


> *All the levels* listed here, from Syllables to Utterance, are studied holistically by Prosodic Phonology. […] *Pitch movement* operates on *all these prosodic levels* - the fundamental frequency (F0) changes over individual Syllables, Feet, Prosodic Words, Phonological and Intonational Phrases, as well as whole Utterances.



You're trying to draw a distinction between "prosodic pitch" which according to you is "prosodic", and "syllable pitch" which isn't prosodic but "phonemic". I've been trying to show that the theory of Prosodic Phonology draws no such distinction. If you wish to contest this statement - and like any scientific-theoretical statement, it *can be contested* - you should do so by adducing evidence and referencing scientific publications.

I would be very happy to have that discussion, but I would ask that you stop talking past your interlocutor and trying to introduce musical theory into a linguistic discussion in order to fill your gaps in understanding. I don't pretend for a moment that Prosodic Phonology is the be-all and end-all descriptive framework for this, but the alternative framework must be a linguistic one, and you must rely on your knowledge and understanding of both the alternative theory and of Prosodic Phonology in order to challenge someone who operates with knowledge and understanding of PPh.

One example of evidence that would disagree with PPh's treatment of lexical tone together with utterance intonation is if you're able to demonstrate that tone processing inside the brain for speakers of Chinese happens in a completely separate manner from intonation processing for speakers of Spanish. I have found a couple of papers that point to a directly opposite conclusion and support PPh's holistic treatment of tone and intonation as part of Prosody. I will quote from the abstract of Chien, Friederici, Hartwigsen, Sammler (2020). _Neural correlates of intonation and lexical tone in tonal and non‐tonal language speakers:_


> Tone processing overlapped with intonation processing in left fronto‐parietal areas, in both groups, but evoked additional activity in bilateral temporo‐parietal semantic regions and subcortical areas in Mandarin speakers only. Together, these findings confirm cross‐linguistic commonalities in the neural implementation of intonation processing but dissociations for semantic processing of tone only in tonal language speakers.


Li et al. (2010) is a study that finds clear separation between tones and segments, disagreeing with your treatment of tones as "phonemic":


> In direct contrasts between phonological units, tones, relative to consonants and rhymes, yield increased activation in frontoparietal areas of the right hemisphere. This finding indicates that the cortical circuitry subserving lexical tones differs from that of consonants or rhymes.


Both of these studies are of perception. Left-hemisphere damage leading to aphasia doesn't appear to lead to difficulty in tone production in speakers of tonal languages (Müller 2015), which is evidence that this happens in the right hemisphere, same as utterance intonation and singing.

I would again ask that you avoid referencing musical pitch perception in our linguistic discussion.


----------



## Sobakus

To remind the readers why this entire discussion is even necessary: as mentioned in message #8, it's not the case that Voice Quality is part of Prosody. Voice Quality is a paralinguistic phenomenon and relatively constant. It's my goal to demonstrate that disagreeing with this statement, as Hulalessar has been doing, necessarily presupposes an incorrect understanding of Prosody. Voice Quality has a physical expression (frequency and sound pressure, but not duration) insofar as all speech has a physical expression. Citing musical timbre to support the statement that Voice Quality is part of Prosody is effectively making the differences between the timbres of musical instruments a concern of the linguistic science of Prosody, and potentially all other perceptible sounds as well.

In fact, the musical instrument analogy illustrates precisely the opposite: the timbre of musical instruments is a characteristic constant of each particular musical instrument, just like Voice Quality is of each particular individual. In both cases there are commonalities between individual instruments and individual people (grouping humans together on the social, national, situational or some other level), and in both cases this constant can be changed in the relatively long term (no doubt more readily with humans). Prosody on the other hand studies the melody played on anyone's vocal cords regardless of their inherent timbre.


----------



## dojibear

Sobakus said:


> I must emphatically repeat that speech sounds are not perceived as pitch, neither absolute nor indefinite. They're perceived as combinations of articulatory features when one's grammar allows their perception; otherwise they're perceived as strange, foreign speech sounds that the listener will struggle to repeat; nevertheless they aren't heard as pitch.


I *do* perceive pitch in speech sounds. I *do not* "struggle to repeat" foreign speech sounds.



Sobakus said:


> I would ask that you stop talking past your interlocutor and trying to introduce musical theory into a linguistic discussion


This is not a linguistic discussion. Check the title on this thread. Nobody said this was a thread for linguists only, or a thread to discuss some linguistic theory. There is no earthly reason not to discuss music. If you want a linguistic discussion, start a thread for it. Don't tell other people what to say in this thread.



Sobakus said:


> it's not the case that Voice Quality is part of Prosody.


How is that relevant to languages sounding different?


----------



## Sobakus

dojibear said:


> I *do* perceive pitch in speech sounds. I *do not* "struggle to repeat" foreign speech sounds.


Please elaborate. What speech sounds do you perceive as pitch and how does this manifest? What makes you call that perception pitch perception? Is this in parallel to perceiving them as speech sounds, or instead of it?

Are you saying there are no foreign speech sounds that you struggle to repeat, i.e. that you have the perfect ability to repeat any speech sound? What about non-speech sounds? Are you able to reproduce any pitch perfectly?


dojibear said:


> This is not a linguistic discussion. Check the title on this thread. Nobody said this was a thread for linguists only, or a thread to discuss some linguistic theory. There is no earthly reason not to discuss music. If you want a linguistic discussion, start a thread for it. Don't tell other people what to say in this thread.


This is a thread about languages sounding different, on a forum about languages, on a sub-forum called _Etymology, History of languages, and Linguistics._ You're mistaken when you believe that I have to start a special thread to talk about linguistics here. Linguistics is the science of discussing languages.

I was objecting to Hulalessar discussing language prosody and linguistic pitch by confusingly mixing in music theory and physics. I explain why this is a very bad idea in the first of the last two messages. Hulalessar notes the results of this when he remarks that "there is confusion because people are using words with different ideas of what they mean". The ideas of music pitch and descriptions of physical sound vibrations in musical instruments are not helpful in explaining how humans perceive different languages, as this thread exemplifies. They're doubly unhelpful when discussing Prosody, which is a specialised idea that only makes sense in the framework of the science of linguistics. Arbitrarily changing the framework from language to music and back makes it impossible to talk about Prosody, or other ideas that are part of each individual framework.

Most importantly, as mentioned in the last message, arguments from music theory do not show that Voice Quality is part of Prosody, rather the opposite. It's this that I've been trying to show by explaining Prosody and its actual parts. I was hoping that the links I provided in my first reply to this thread would prove sufficient evidence for that assertion. Indeed they should have, but they didn't. It was never my intention to make this discussion as technical as it ended up being.

Note that the discussion I'm referring to is between me and Hulalessar, even if he doesn't directly quote my messages as is customary for him. I'm not stopping some abstract other people from discussing anything they want - albeit I reserve the right to ask people to stick to the topic of the thread and of the forum, as forum rules as well as common considerations require them to.


dojibear said:


> How is that relevant to languages sounding different?


Please read message #7, which message #8 is a reply to.


----------



## dojibear

Sobakus said:


> I was objecting to Hulalessar discussing language prosody and linguistic pitch by confusingly mixing in music theory.


I understand. He was mixing linguistic jargon with the jargon of another field. I agree: that's confusing.



Sobakus said:


> This is a thread about languages sounding different, on a forum about languages. You're mistaken when you believe that I have to start a special thread *to talk about linguistics* here. Linguistics is the science of discussing languages.


My concern (which I edited out, while "toning down" my post) was the use of jargon. There is a big difference between:

(1) a discussion (using general English) between English speakers (who might include Linguists) about language sounds

(2) a discussion (using Linguistic "jargon" instead of English) between trained Linguists about the detailed theories, terminology and concepts of Linguistics that relate to language sounds.

That was my objection: turning the thread into a jargon discussion that most people can't follow. *However*, you did not start that. Hulalessar and others used linguistic jargon. Even post #2 talks about "Prosody (linguistics)". So it was inevitable that the discussion would end up being incomprehensible to folks like me. Peace.


----------



## dojibear

dojibear said:


> I *do* perceive pitch in speech sounds. I *do not* "struggle to repeat" foreign speech sounds.





Sobakus said:


> Please elaborate. What speech sounds do you perceive pitch in


Speech in general. High-pitched Japanese female salesclerks. High-pitched Chinese females acting "cute". Males and females with unusually high (or low) voice pitch.



Sobakus said:


> and how does this manifest?


I don't understand this question.



Sobakus said:


> What makes you call that perception pitch perception?


What else would I call it? "Pitch" is a common English word.


----------



## Sobakus

dojibear said:


> Speech in general. High-pitched Japanese female salesclerks. High-pitched Chinese females acting "cute". Males and females with unusually high (or low) voice pitch.


Right, I understand. I explain this in message #17 using technical jargon. Of course, we all easily hear this type of pitch, which is the same as intonational pitch (also referred to in this thread as "utterance pitch" and "prosodic pitch"). In technical jargon that's F0 or the fundamental frequency, the one created by the vocal cords and the general shape of one's throat (and even the rest of the body).

However, this pitch is perceived as separate from speech sounds or segments or phonemes, which are consonants and vowels. Vowels in particular consist of at least 3 more distinct pitches, which when combined together sound like vowels to us. When musical instruments sound like human voice, you're hearing combinations of pitches that resemble those of human vowels.

We hear differences in these individual pitches as differences in vowel frontness (ee-oo), height (ee-ah), and lip rounding. But we don't hear 3 different pitches (+ intonational pitch) when listening to a vowel - we just hear a vowel with intonational pitch overlaid onto it. Birds, when they imitate human speech, presumably hear them individually (together with the many overtones), which is how they can match them so accurately.

My argument goes that because we don't hear individual pitches, whether they're definite or indefinite pitches is irrelevant. Speech perception functions on a different level from music pitch perception. The F0 frequency, the intonational pitch, is obviously definite, but it seems to me that's all there is to say about pitch definiteness and speech.
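The separation between intonational pitch and the pitches that make up a vowel can be illustrated with a toy model. The Python sketch below is only an illustration under stated assumptions: the formant values in `VOWELS` are rough figures of the kind usually given for an adult male voice, and the function names, the resonance shape and the `bandwidth` parameter are all hypothetical simplifications, not a real speech model. It builds the harmonics of a chosen F0, boosts those that fall near each formant, and then reports F0 and the strongest low-frequency harmonic separately.

```python
# Approximate formant frequencies in Hz (F1, F2, F3) for three vowels.
# Illustrative assumptions only, roughly typical of an adult male voice.
VOWELS = {
    "ee": (270, 2290, 3010),
    "ah": (730, 1090, 2440),
    "oo": (300, 870, 2240),
}


def harmonic_amplitudes(f0, formants, bandwidth=100.0, max_freq=4000):
    """Return [(frequency, amplitude), ...] for the harmonics of f0.

    The voice source is reduced to harmonics at integer multiples of f0;
    the vocal tract is reduced to one simple resonance peak per formant.
    """
    spectrum = []
    k = 1
    while k * f0 <= max_freq:
        freq = k * f0
        # Harmonics close to a formant are boosted, distant ones are not.
        amp = sum(1.0 / (1.0 + ((freq - f) / bandwidth) ** 2) for f in formants)
        spectrum.append((freq, amp))
        k += 1
    return spectrum


def fundamental(spectrum):
    """The intonational pitch (F0) is the lowest harmonic."""
    return spectrum[0][0]


def strongest_in_band(spectrum, low, high):
    """Frequency of the loudest harmonic between low and high Hz."""
    in_band = [(amp, freq) for freq, amp in spectrum if low <= freq <= high]
    return max(in_band)[1]


for f0 in (100, 200):
    for vowel in ("ee", "ah"):
        spec = harmonic_amplitudes(f0, VOWELS[vowel])
        print(vowel, "at F0 =", fundamental(spec), "Hz:",
              "strongest harmonic in the F1 range =",
              strongest_in_band(spec, 200, 1000), "Hz")
```

In this toy model, raising F0 from 100 to 200 Hz changes the intonational pitch, while the strongest low harmonic for "ee" stays well below the one for "ah" at both pitches. That is the sense in which vowel identity and intonational pitch vary independently here.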


----------



## dojibear

Sobakus said:


> However, this pitch is perceived as separate from speech sounds or segments or phonemes, which are consonants and vowels.


Yes, it seems to have nothing to do with understanding speech. It doesn't change the "meaning".



Sobakus said:


> Speech perception functions on a different level from music pitch perception.


I hear "harmony" in music (two or more melodies) that I never hear in speech.

Of course most speakers just speak -- they don't know *how* they speak. I'll let you linguists figure that out.


----------



## LeifGoodwin

Ben Jamin said:


> Listening to different languages and different dialects of the same language I noticed that they differ not only by the different aspects of articulation of vowels like high/low, frontal/back, open/closed and different pitch, but also by a quality that I never found described or classified in any publication. I have given this quality two working names: "basso continuo" and "basic sound colour". Has anyone met any description of this sound quality?
> It is similar to different sound quality of musical instruments for example between a flute and a trombone, or guitar and lute. This sound "colour" varies between individuals, but people speaking the same dialect have an easily recognizable common quality.


I think I understand what you mean by this, and generally someone whose native language is standard French will retain traces of their native tongue which will colour their English even if they speak it very well. I know an Austrian who speaks English extremely well, but there is a tightness in the throat which I assume is characteristic of his German speech. Arnold Schwarzenegger has a more extreme Austrian accent than my friend, but it has that tightness. When we speak a given dialect or language perfectly, the mouth adopts a particular shape to suit that dialect. Thus posh Brits are said to speak with a bag of plums in the mouth, which is a way of indicating a particular way of forming phonemes, especially vowels. French people who speak excellent English often sound quite posh, presumably because their natural mouth shape corresponds to eating plums, so to speak. Then of course there are other aspects such as intonation which may carry over into another language, but they are for me not really ‘colour’ though they do form part of the accent. Bilingual people in my experience (I have known a lot) can speak a language without obvious colouration, apart of course from colouration arising from their physiology. I knew an Indian who could speak Indian languages, as well as perfect English with an educated English accent, or an upper class English accent, or an American accent, according to the social situation. It is, though, very hard for an adult to train the mouth in order to remove the colouration associated with one's native dialect or accent. Clearly it can be done, by building up muscle memory through repetition, as some polyglots manage it.

My late brother lived in Finland and spoke perfect Swedish. After mum died, he came to England and initially he spoke English with a Swedish accent, which was quite amusing. After a few weeks his original English accent reappeared, and he started speaking like a Worzel, i.e. with a rural Devon accent from 60 years ago. That was also quite amusing. In his case he could speak either language without colouration, but he could not consciously swap between the two, and required a period of adaptation.


----------



## Ben Jamin

LeifGoodwin said:


> I think I understand what you mean by this, and generally someone whose native language is standard French will retain traces of their native tongue which will colour their English even if they speak it very well. I know an Austrian who speaks English extremely well, but there is a tightness in the throat which I assume is characteristic of his German speech. Arnold Schwarzenegger has a more extreme Austrian accent than my friend, but it has that tightness. When we speak a given dialect or language perfectly, the mouth adopts a particular shape to suit that dialect. Thus posh Brits are said to speak with a bag of plums in the mouth, which is a way of indicating a particular way of forming phonemes, especially vowels. French people who speak excellent English often sound quite posh, presumably because their natural mouth shape corresponds to eating plums, so to speak. Then of course there are other aspects such as intonation which may carry over into another language, but they are for me not really ‘colour’ though they do form part of the accent. Bilingual people in my experience (I have known a lot) can speak a language without obvious colouration, apart of course from colouration arising from their physiology. I knew an Indian who could speak Indian languages, as well as perfect English with an educated English accent, or an upper class English accent, or an American accent, according to the social situation. It is, though, very hard for an adult to train the mouth in order to remove the colouration associated with one's native dialect or accent. Clearly it can be done, by building up muscle memory through repetition, as some polyglots manage it.
> 
> My late brother lived in Finland and spoke perfect Swedish. After mum died, he came to England and initially he spoke English with a Swedish accent, which was quite amusing. After a few weeks his original English accent reappeared, and he started speaking like a Worzel, i.e. with a rural Devon accent from 60 years ago. That was also quite amusing. In his case he could speak either language without colouration, but he could not consciously swap between the two, and required a period of adaptation.


Thank you! A very interesting review! Especially your pointing out the use of the throat as the basic source of voice coloration. I could add a couple of other parts of the vocal system as possible culprits: the diaphragm, the lungs, the nose. Alteration of those four instruments can give a large spectrum of possible colorations.


----------



## LeifGoodwin

Ben Jamin said:


> Thank you! A very interesting review! Especially your pointing out the use of the throat as the basic source of voice coloration. I could add a couple of other parts of the vocal system as possible culprits: the diaphragm, the lungs, the nose. Alteration of those four instruments can give a large spectrum of possible colorations.


Yes I had not considered those other organs. Interesting!


----------

