# Aspiration vs. voicing with plosives



## Forero

Split from here.



berndf said:


> Some languages, like German, English or Chinese, do not distinguish between unvoiced and voiced plosives but only between aspirated and unaspirated ones. In languages which do differentiate between voiced and unvoiced plosives, like all Romance languages, there is a difference between a "d" (the vocal cords start to vibrate several 10s of milliseconds before the plosive release) and an unaspirated "t".


Actually, Chinese stop consonants are very different from English ones.  In Mandarin Chinese, an unaspirated stop consonant normally becomes voiced in an unstressed syllable, but in some dialects, an aspirated stop consonant can also become voiced in an unstressed syllable.  The phonemic distinction is always between aspirated and unaspirated, not between voiced and unvoiced.

In English, initial unvoiced stop consonants are usually aspirated, and voiced stop consonants can be too, but the phonemic distinction is always between voiced and unvoiced, not between aspirated and unaspirated.

For example, we believe we hear a _t_ sound after the _s_ in _stake_, but to an untrained Chinese speaker, _stake_ has a "d" sound, very different from the _t_ in _take_.

Italian is much less likely to aspirate a stop consonant, so an Italian /t/ nearly always sounds like a "d" to the untrained Chinese speaker.  But an untrained Italian speaker hears "t" in both English _stake_ and English _take_, just like a native English speaker.  An English /d/ sounds like a "d" to the same Italian speaker.

In the case of final stop consonants, we perhaps need to be talking about the end of voicing, not the onset.  Of course, if voicing resumes after a final unvoiced stop consonant, we English speakers consider that another syllable.


----------



## sokol

The presence of voicing with stops does of course _not_ necessarily mean that voicing is the distinctive feature. Aspiration may well be what is distinctive (that is, what matters most to native speakers in telling sounds apart) even though voicing is present, as you described for Chinese, Forero.

A distinctive feature in phonology is what matters most in distinguishing two sounds, so you can test what English speakers need in order to recognise the final "d" in "bad" correctly.
I am sure that *English* native speakers will interpret these as *"bad"*:
- [bæˑd] with audible release and voicing that continues just a few milliseconds after release (thus, much more voiced than would be usual, in English)
- [bæˑd̚] with no audible release
- [bæˑd̥] or [bæˑd̥̚] - devoiced "d" (with or without audible release; this I think would be the most typical pronunciation, but correct me if I'm wrong)
And I think that they would identify the *following, pronounced with the vowel length of "bad",* as wrong, though *not* as "bat" [bæt̥], because the vowel in "bat" is significantly shorter; due to the vowel length, native speakers should still accept these as "bad" but say that the pronunciation is wrong or has an "accent":
- [bæˑtʰ] with aspiration
- [bæˑt] no aspiration, but audible release of the plosive
- [bæˑtə] with schwa (this they might not accept at all)
(Vowel length in English phonology is considered "intrinsic" here, that is, an additional feature [along with voiced/unvoiced] which helps to distinguish two phonemes, with the main distinctive feature still being aspirated/non-aspirated.)

I might not be right here (or not entirely, concerning how English native speakers will hear those) - but still, that is the method for working out what *is* the distinctive feature.

And the *point* here is: especially in word-final position, where the release of plosives is difficult to pronounce (much more difficult than between vowels, or in consonant clusters followed by vowels), many cross-linguistic misunderstandings occur.
As is proved by the posts so far.

In German standard language, like Forero described for Chinese, the distinctive feature is aspiration (and definitely not voicing). In Romance and Slavic languages, the distinctive feature very clearly is voicing.


----------



## berndf

sokol said:


> In German standard language, like Forero described for Chinese, the distinctive feature is aspiration (and definitely not voicing). In Romance and Slavic languages, the distinctive feature very clearly is voicing.


To my understanding, the issue with English is that aspiration vs. voicing is not a question of black and white but of shades of gray. In languages which distinguish only two kinds of plosives (hard vs. soft), there is a VOT (voice onset time) value below which the sound is perceived as "soft" and above which it is perceived as "hard". For German and Chinese, this threshold is distinctly positive; for Romance and Slavic languages, it is distinctly negative. For English it is somewhere around zero: I think that, at least for BE, it is still positive but smaller than the threshold for German or Chinese. English speakers will therefore pronounce "soft" plosives with a negative VOT of very small magnitude, which is why English soft plosives are called "partially voiced" (see table).
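
The threshold idea in the paragraph above can be sketched in a few lines of Python. This is only an illustration: the dictionary `BOUNDARY_MS`, the function `perceive` and all the numbers are assumptions chosen to match the signs described above (distinctly positive, near zero, distinctly negative), not measured values, and French and Russian merely stand in for "Romance" and "Slavic".

```python
# Illustrative category boundaries in milliseconds. The numbers are assumptions
# picked only to match the signs described in the post, not measurements.
BOUNDARY_MS = {
    "German": 25.0,    # distinctly positive
    "Mandarin": 25.0,  # distinctly positive
    "English": 5.0,    # around zero, still slightly positive
    "French": -25.0,   # distinctly negative (stand-in for Romance)
    "Russian": -25.0,  # distinctly negative (stand-in for Slavic)
}


def perceive(language: str, vot_ms: float) -> str:
    """Return 'hard' if the VOT lies above the language's boundary, else 'soft'."""
    return "hard" if vot_ms > BOUNDARY_MS[language] else "soft"


# The same plosive with a VOT of zero lands in different categories.
for language in BOUNDARY_MS:
    print(language, perceive(language, 0.0))
```

With these assumed boundaries, a plosive with zero VOT comes out as "soft" for the German, Mandarin and English entries and as "hard" for the French and Russian ones, which is the kind of cross-linguistic mismatch discussed in this thread.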
 
Thinking of it a bit more, I agree with Forero and Sokol that final plosives require some special consideration. In German and Chinese this is not an issue, because Chinese does not have any final plosives at all and in Standard (not Austrian) German final plosives are always unvoiced and aspirated, regardless of spelling.


----------



## sokol

Yes of course, it is true that "soft" plosives in English typically are voiced; but if a "soft" plosive is not voiced an English native speaker will have no problem distinguishing it from a "hard" plosive because the latter is aspirated - or if it isn't, like in AE pronunciation of "better" where aspirated "t" not only can be reduced to "d" but often even is reduced to a tap, then that sound isn't heard as "t" but as "d".

Only in word-final position, when we compare "code" to "coat" (the latter in Brian's pronunciation with no aspiration and no audible release) an unaspirated "t" might still be perceived as "t" because vowel length (and optionally slight voicing of "d") indicates that it can't be a "d".

In Romance and Slavic languages however the opposition between "t" and "d" is lost (is neutralised) in word-final position if voicing is missing with "d". Some Slavic languages apply devoicing in word-final position; I don't know if this also happens in Romance languages (I don't think so, French doesn't do it, right? - and others avoid word-final plosives).


----------



## Forero

In most native English, aspiration and vowel length are not phonemic for consonants, but voicing is.

The very end of a word tends to devoice just before a pause, but the beginning of a final consonant or consonant cluster identifies it as voiced or unvoiced.

Vowels, and _r_, _l_, _m_, or _n_ after a vowel are voiced.  If voicing continues after one of these into a final consonant (cluster), the consonant (cluster) is "voiced".  In the case of a stop consonant, if voicing stops before the closure completes, the consonant (cluster) is "unvoiced".  This is why the vowel (and the _n_ or whatever, if present) tends to be cut short before an unvoiced consonant.

If a final consonant (cluster) is not released, its phonemic identity is the same.  If it is released with aspiration, or without, the final consonant is the same phonemically.

So what is important for a final consonant (cluster) in most native English is not how it ends but how the transition is made as the vowel ends and the consonant (cluster) begins.  This transition is important in identifying both the consonant (voiced or unvoiced) and the vowel (the offglide does more to identify an English falling diphthong than the onset).


----------



## Forero

sokol said:


> Some Slavic languages apply devoicing in word-final position; I don't know if this also happens in Romance languages (I don't think so, French doesn't do it, right? - and others avoid word-final plosives).


In most varieties of Spanish, _d_ devoices at the end of a syllable, whether word-final or part of the _ad-_ prefix before a consonant, even a voiced consonant (e.g. _usted_ and _admiración_ normally have unvoiced fricative _d_, but the _d_ of _usted_ may be "silent").  Between vowels, _d_ is either a voiced fricative or absent, depending on dialect and speed.

French does not devoice final _d_, except in loan words from languages that devoice the _d_.


----------



## berndf

Forero said:


> In most native English, aspiration and vowel length are not phonemic for consonants, but voicing is.


I found a source here from London's University College which implies that the differentiation between _fortis_ and _lenis_ plosives in (British) English is through aspiration and not through voicing. Native speakers of languages which do this often confuse lack of aspiration with voicing. I used to do this myself until Sokol alerted me to my mistake in a thread about German plosives. 
 



Forero said:


> French does not devoice final _d_, *except in loan words from languages that devoice the d*.


I can't think of an example, can you?
 
In general, a final <d> is pronounced only if followed by an <e> in writing. In this case you often hear a faint schwa, like in Italian but fainter.


----------



## Outsider

sokol said:


> In Romance and Slavic languages however the opposition between "t" and "d" is lost (is neutralised) in word-final position if voicing is missing with "d". Some Slavic languages apply devoicing in word-final position; I don't know if this also happens in Romance languages (I don't think so, French doesn't do it, right? - and others avoid word-final plosives).


I can't figure out what you mean, but I can tell you that in most Romance languages words ending with a plosive are rare. Spanish allows final [d], as has been said, but many native speakers don't even pronounce this consonant (some do). No other plosive may end a Spanish word.

In Portuguese, there are no words ending with plosives (with a handful of unrepresentative exceptions, such as proper nouns and learned words).

In French there is a phenomenon of devoicing of final plosives in liaison - e.g. _grand_ silent "d", but pronounced as [d] in the feminine, _grande_ --> _grand homme_ "d"=[t] -, which I always tended to assume was due to Germanic influence. Then again, this also seems to happen in Catalan...

Bottom line: plosives are rare at the end of words in most Romance languages. When one plosive is allowed by phonotactics, its voiced or unvoiced counterpart is typically not possible. Was this what you meant, Sokol?


----------



## Forero

In AmE, stops tend to become more _lenis_ between a stressed vowel and an unstressed vowel, but /k/ and /g/, /p/ and /b/ remain distinguishable by the degree of voicing.  In this position, /t/ is rather an exception, becoming a clear /d/ in some varieties of AmE.  Other AmE speakers pronounce such a /t/ as a voiceless flap that becomes voiced (like a flapped /d/) when speaking quickly, but is always distinguishable from a clear (i.e. carefully pronounced) /d/.  Our initial /b/, /d/, and /g/ tend to be somewhat aspirated (while still clearly being voiced) if the first syllable is stressed (e.g. _dogma_).  Our initial /b/, /d/, and /g/ are thus different from French initial /b/, /d/, and /g/.



berndf said:


> I found a source here from London's University College which implies that the differentiation between _fortis_ and _lenis_ plosives in (British) English is through aspiration and not through voicing.


I did not find the information about British consonants following this link. (My computer is not very good at web pages that run scripts.) Could you summarize please?





berndf said:


> Forero said:
>
> > French does not devoice final _d_, except in loan words from languages that devoice the _d_.
>
> I can't think of an example, can you?

No, and my dictionary is no help because it sorts words from first letter to last letter, not the other way around.  I checked _apartheid_ in Larousse and was surprised to see it with final /d/, not /t/.





Outsider said:


> Spanish allows final [d], as has been said, but many native speakers don't even pronounce this consonant (some do). No other plosive may end a Spanish word.


Mexican Spanish has no problem with words ending in _c_ (/k/) such as _Chapultepec_, and _mamut_ ends in a /t/ sound. Final voiced stops (/b/ and /g/) are pronounced less consistently, usually as fricatives with devoicing. For some Mexican speakers, a final /g/ in a loan word may become a voiced uvular stop. I can't see any reason for the latter phenomenon, but I have heard it.

_Woolworth_ gets really garbled in (Mexican) Spanish, something like /gul'gort/, but the final _t_ is clear.  _Ford_ however ends with an _rr_ sound that may be devoiced at the end.


----------



## berndf

Forero said:


> I did not find the information about British consonants following this link. (My computer is not very good at web pages that run scripts.) Could you summarize please?


It states that in English plosives count as _unvoiced_ if the VOT is positive and as _voiced_ if the VOT is zero, whereas in French plosives with zero VOT count as _unvoiced_ and a plosive counts as _voiced_ only if its VOT is negative.

In my words: In English aspiration is phonemic (although regarded as voicing by speakers themselves) while in French voicing is phonemic.

According to my own amateurish observations, a typical BE speaker pronounces "d" slightly voiced, i.e. a BE "d" sounds a bit different from a German "d", which has a VOT indistinguishable from zero. But the demarcation threshold below which the sound is considered _lenis_ and above which it is considered _fortis_ is roughly the same in English and German. You often find statements in books saying that English has aspirated and unaspirated unvoiced plosives, e.g. _*t*ea_ (aspirated) and _fer*t*ile_ (unaspirated). Again in my amateurish view, this distinction is, in BE, between strong aspiration (~60ms) and weak aspiration (~30ms).




Forero said:


> In AmE, stops tend to become more _lenis_ between a stressed vowel and an unstressed vowel...


I think you are talking here about the same thing I did in the preceding paragraph. The description I am familiar with is that plosives become more _fortis_ word-initially and at the beginning of stressed syllables. I think this holds for both AE and BE, but the definitions of _more lenis_ and _more fortis_ differ.


----------



## sokol

Outsider said:


> I can't figure out what you mean, but I can tell you that in most Romance languages words ending with a plosive are rare.


I know, and I was thinking about French in particular but I wasn't sure if devoicing of word-final plosives happens with French. As for that case:


Outsider said:


> In French there is a phenomenon of devoicing of final plosives in liaison - e.g. _grand_ silent "d", but pronounced as [d] in the feminine, _grande_ --> _grand homme_ "d"=[t] -, which I always tended to assume was due to Germanic influence. Then again, this also seems to happen in Catalan...


This is devoicing due to liaison = due to the fact that in French a "phonemic unit" always is the utterance and that assimilation happens also between words.
Romanian also has word-final plosives, but I don't know if the opposition between voiced and unvoiced plosives is maintained in word-final position.

Anyway, we were both talking about the same thing basically, only that I wasn't sure about the situation in French.


----------



## berndf

Outsider said:


> In French there is a phenomenon of devoicing of final plosives in liaison - e.g. _grand_ silent "d", but pronounced as [d] in the feminine, _grande_ --> _grand homme_ "d"=[t] -, *which I always tended to assume was due to Germanic influence*. Then again, this also seems to happen in Catalan...


I am not sure about the origin. I think it is to distinguish between liaison consonants and word-initial consonants. E.g. the audible difference between "grande âme" and "grande dame" (unless the speaker pronounces the final "e") is the devoicing of "d".

It is typical for French to deviate from normal pronunciation rules to maintain audible distinctions: e.g. the "s" in _plus_ is pronounced if and only if it is not part of _ne...plus_ (_ne_ is deleted in spoken French), and _aurai_ is pronounced as if written _auré_ to distinguish it from _aurais_.

It would be nice if a native speaker could confirm or correct.


----------



## phosphore

sokol said:


> This is devoicing due to liaison = due to the fact that in French a "phonemic unit" always is the utterance and that assimilation happens also between words.
> Romanian also has word-final plosives, but I don't know if the opposition between voiced and unvoiced plosives is maintained in word-final position.


 
What do you mean by "devoicing due to liaison"? The thing is that word-final consonants were formerly subject to devoicing (and there still are some results of this sound change, such as _neuf_, _bref_ (masculine) - _neuve_, _brève_ (feminine), the previously mentioned _grand_ [t] _homme_ or _sang_ [k] _impur_), but they are not anymore.


----------



## berndf

phosphore said:


> What do you mean by "devoicing due to liaison"? The thing is that word-final consonants were formerly subject to devoicing (and there still are some results of this sound change, such as _neuf_, _bref_ (masculine) - _neuve_, _brève_ (feminine), the previously mentioned _grand_ [t] _homme_ or _sang_ [k] _impur_), but they are not anymore.


_Liaison_ is the contraction of two words if the second starts with a vowel. E.g. _grand homme_ is pronounced [gRɑ̃tɔm] rather than [gRɑ̃ ?ɔm]. In this example the normally mute "d" of _grand_ is restored but unvoiced.

Things like _neuf_ vs. _neuve_ are possibly due to Germanic influence as Outsider surmised. But in my mind those are unrelated issues.


----------



## CapnPrep

We can observe both "devoicing" of plosives and "voicing" of fricatives in French liaison, but it is highly questionable whether either one (especially the former) should be considered to be a rule-governed phenomenon (I think the only truly productive case is /s/ → [z]). Liaison in [t] corresponding to a written -_d_ only really happens with two words (_grand_ and _quand_), -_g_ pronounced as [k] is even rarer and more variable, and as far as I know there are no cases of silent -_b_ realized as [p] in liaison. And outside of fixed expressions plenty of speakers will simply produce the voiced consonant in all of these cases (of course in large part due to orthographic influence).

However, the general point about French is that with the loss of final -_e_ (in the relevant varieties), voice is fully distinctive for final plosives (and for all other consonants, except of course sonorants and ).


----------



## berndf

The [t] liaison is productive in case of subject-verb inversions. E.g. _répond-il_ is pronounced as if written _répond-t-il_. In this case the assumed "t" is even normally aspirated although aspiration is otherwise insignificant in French.


----------



## CapnPrep

That is not a case of liaison, since (i) it does not occur between two independent words in a free syntactic combination (the pronoun is an enclitic, perhaps even a bound suffix), and (ii) the [t] is found with verb forms that do not allow a liaison consonant in other contexts (_parle_, _parla_, _parlera_, …)

Furthermore—with respect to the devoicing hypothesis—the "underlying" final consonant of _répond_ is not necessarily /d/; it could simply be /t/, just like the liaison form of _grand_ could just be /gRãt/. All of the forms in the paradigm don't have to contain the same consonant synchronically (cf. _frais/fraîche_, _beau/belle_, etc.)


----------



## Forero

Is "neuf années" pronounced with /f/ or /v/?


----------



## CapnPrep

Normally [f]. See this message, for example.


----------



## berndf

Forero said:


> Is "neuf années" pronounced with /f/ or /v/?


I am not sure; I think I have heard both, but it is definitely [v] in "neuf ans". As CapnPrep said, it is difficult to give a general rule.


----------



## Outsider

berndf said:


> I think it is to distinguish between liaison consonants and word-initial consonants. E.g. the audible difference between "grande âme" and "grande dame" (unless the speaker pronounces the final "e") is the devoicing of "d".


Just for the record, those phrases are pronounced [ɡʀɑ̃*d̪*ˈam(ə)] and [ɡʀɑ̃*d̪(ə)ˈd̪*am(ə)], respectively. The difference is not one of voicing.


----------



## berndf

Outsider said:


> Just for the record, those phrases are pronounced [ɡʀɑ̃*d̪*ˈam(ə)] and [ɡʀɑ̃*d̪(ə)ˈd̪*am(ə)], respectively. The difference is not one of voicing.


Are you saying _grand homme_ is pronounced [gRɑ̃tɔm] and _grand âme_ [gRɑ̃d?am] rather than [gRɑ̃tam]? I'd be surprised if this were true. For speakers who pronounce the final "e" I could imagine [gRɑ̃də?amə] (as I mentioned).


----------



## Outsider

Well, _âme_ is a feminine noun, so normally it would be associated with the feminine adjective _grande_, as you wrote initially (recall that _grand_ without _e_ at the end is the masculine version of this adjective).

But now you've left me wondering. We do say _mon âme_ instead of _*ma âme_, because the noun starts with a vowel. Does this happen with adjectives, too? I can't recall...


----------



## phosphore

berndf said:


> _Liaison_ is the contraction of two words if the second starts with a vowel. E.g. _grand homme_ is pronounced [gRɑ̃tɔm] rather than [gRɑ̃ ?ɔm]. In this example the normally mute "d" of _grand_ is restored but unvoiced.
> 
> Things like _neuf_ vs. _neuve_ are possibly due to Germanic influence as Outsider surmised. But in my mind those are unrelated issues.


 
I know what _liaison_ is; I just think that the [t] instead of [d] in _grand homme_ is not *due* to _liaison_ (either [d] or [t] would be there for that reason) but to the historic devoicing of word-final consonants.



CapnPrep said:


> We can observe both "devoicing" of plosives and "voicing" of fricatives in French liaison, but it is highly questionable whether either one (especially the former) should be considered to be a rule-governed phenomenon (I think the only truly productive case is /s/ → [z]). Liaison in [t] corresponding to a written -_d_ only really happens with two words (_grand_ and _quand_), -_g_ pronounced as [k] is even rarer and more variable, and as far as I know there are no cases of silent -_b_ realized as [p] in liaison. And outside of fixed expressions plenty of speakers will simply produce the voiced consonant in all of these cases (of course in large part due to orthographic influence).


 
I do not remember any French word ending in <b> which could be devoiced in _liaison_.

On the other hand, [z] instead of [s] in _liaison_ is due to the historic voicing of intervocalic [s] and has nothing to do with this topic.


----------



## CapnPrep

Outsider said:


> But now you've left me wondering. We do say _mon âme_ instead of _*ma âme_, because the noun starts with a vowel. Does this happen with adjectives, too? I can't recall...


No, it does not, in the feminine.

But speakers who pronounce final -_e_ elide it before another vowel in the same phonological phrase, so [grãdə(ʔ)ɑmə] (imagined by berndf) would be very strange (it would sound like you were saying _grande hâme_, whatever that is). As Outsider indicated, there is no devoicing in _grande âme_.

There is a voice contrast in, for example, _grand ami_ [t] vs _grande amie_ [d]. And there are apparently some reliably measurable phonetic differences between fixed final consonants, liaison consonants, and initial consonants, but I don't really know if this should be encoded at the level of phonemic features. E.g. the 3 [t]'s in _petit acte_ vs _petit tact_ vs _petite actée_ (sorry for the nonsense examples) have slightly different VOT, length, amplitude, etc.


----------



## berndf

CapnPrep said:


> there is no devoicing in _grande âme_.


How do you then characterize the difference between _grande âme_ and _grande dame_? Would you say it is the difference between final and initial voiced plosives?


----------



## CapnPrep

I am with Outsider on this one, too: there are two pronounced [d]'s in _grande dame_ (or one long [d]). (Also, the vowels in _âme_ and _dame_ are different, but that's not the point here.)


----------



## berndf

CapnPrep said:


> I am with Outsider on this one, too: there are two pronounced [d]'s in _grande dame_ (or one long [d]). (Also, the vowels in _âme_ and _dame_ are different, but that's not the point here.)


You are probably right.


----------



## Hulalessar

What exactly is a devoiced consonant? Is it a voiced consonant that becomes unvoiced completely or only partially? And if it is partial is it sort of half unvoiced, or does it start off voiced and trail off into unvoiced?


----------



## phosphore

I think that a devoiced consonant would be a realisation of a phoneme, which is usually represented by its voiced counterpart, in some particular phonetic context, such as word-final position; it is voiceless by its phonetic nature.


----------



## Forero

I am still not understanding about the +, -, or zero VOT.  I listened from a computer where I work, and to me the sounds in the "negative" column are voiced and not particularly aspirated, the ones in the "zero" column seem to be a rather successful attempt at unvoiced unaspirated consonants by someone who "wants" to voice them, and the ones in the "positive" column sound unvoiced aspirated.

What are they supposed to sound like?  And what would be the VOT of natural unvoiced unaspirated consonants like a French /t/ or English /t/ after /s/?  Does that make sense?

Do they have something similar for final consonants?  I have noticed that we tend to devoice before silence but not enough to make a voiced consonant into an unvoiced one.


----------



## berndf

In German VOT>0 counts as unvoiced, VOT<=0 as voiced. In French VOT>=0 is unvoiced and VOT<0 is voiced, where VOT=0 means "below the threshold of audibility", which is about +/-30ms. English is, depending on the regional variety (BE tends towards higher VOTs, AE towards lower VOTs), somewhere in between. In this respect, your description of what you hear is what one would expect given that you are an AE speaker. I as a native German speaker perceive VOT=0 plosives as voiced and VOT<0 plosives as _excessively_ voiced. When a French speaker says _quatre_ I hear _guatre_ two times out of three, and if an AE speaker "writes a letter" I often hear that he "rides a ladder". Having lived in the French-speaking world for quite some time now, I can adjust, but it still isn't natural for me.
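
The German and French rules stated above can be written out as a small Python sketch. The names `AUDIBILITY_MS`, `vot_sign` and `category` are made up for this illustration, and treating the roughly 30 ms audibility band as a hard cutoff is a simplifying assumption; English is left out because the post only places it "somewhere in between".

```python
# Band within which a VOT counts as "zero" (about +/-30 ms according to the post).
# Using it as a sharp cutoff is a simplification for illustration.
AUDIBILITY_MS = 30.0


def vot_sign(vot_ms: float) -> int:
    """Collapse a VOT to -1, 0 or +1, treating anything inside the band as zero."""
    if abs(vot_ms) < AUDIBILITY_MS:
        return 0
    return 1 if vot_ms > 0 else -1


def category(language: str, vot_ms: float) -> str:
    """Apply the rule given for German (VOT<=0 voiced) or French (VOT<0 voiced)."""
    sign = vot_sign(vot_ms)
    if language == "German":
        return "unvoiced" if sign > 0 else "voiced"
    if language == "French":
        return "unvoiced" if sign >= 0 else "voiced"
    raise ValueError(f"no rule sketched for {language!r}")


# Hypothetical VOT values in milliseconds.
for vot in (-80.0, 10.0, 60.0):
    print(vot, category("German", vot), category("French", vot))
```

Under these assumptions the two rules only disagree inside the "zero" band, e.g. a plosive with a VOT of 10 ms counts as voiced by the German rule and unvoiced by the French one, which is the _quatre_/_guatre_ effect.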


----------



## Hulalessar

phosphore said:


> I think that a devoiced consonant would be a realisation of a phoneme, which is usually represented by its voiced counterpart, in some particular phonetic context, such as word-final position; it is voiceless by its phonetic nature.


 
You mean like in German where in _bad_ the "d" is realised as /t/ but in _baden_ as /d/? Or in Russian in idti (идти) where the "d" is realised as /t/ because it is followed by /t/?

I am having a little difficulty seeing how all this applies to English, probably because it is my native language. I have read the Wikipedia article on voice onset time and am not sure I quite understand what it is. Will someone explain, please?


----------



## berndf

Hulalessar said:


> I am having a little difficulty seeing how all this applies to English, probably because it is my native language. I have read the Wikipedia article on voice onset time and am not sure I quite understand what it is. Will someone explain, please?


Think of the English word _tee_. You start the _t_ blocking the airflow through your mouth by pressing the tongue against the front of your palate and build up pressure inside your mouth. Now you articulate the _t_ by suddenly lowering the tip of your tongue and thereby removing the blockage. This is called the _plosive release_. Now at some time your voice has to set in (i.e. your vocal cords start to vibrate) in order to pronounce the _ee_. The time difference between these two events is called VOT. If your voice starts immediately with the plosive release, the VOT is zero. If the voice sets in a bit later allowing an _h_-like sound to occur for a brief period of time VOT is said to be positive and the _t_ is said to be aspirated. During the buildup of the air pressure while the airflow is blocked it is possible to let the vocal cords vibrate, i.e. it is possible to start voicing already a short time before the plosive release. If you do this the VOT is said to be negative. In all languages we have discussed here VOT>0 is considered _fortis_ and VOT<0 _lenis_. There is a difference between languages or dialects whether VOT=0 is considered _fortis_ or _lenis_.
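
The definition above amounts to a subtraction of two event times. Here is a minimal Python sketch of it; the function names and the example times are hypothetical and serve only to show the sign convention (positive when voicing starts after the release, negative when it starts before).

```python
def voice_onset_time_ms(release_ms: float, voicing_onset_ms: float) -> float:
    """VOT = time of voicing onset minus time of the plosive release."""
    return voicing_onset_ms - release_ms


def describe(vot_ms: float) -> str:
    """Name the three cases from the explanation above."""
    if vot_ms > 0:
        return "positive VOT: aspirated"
    if vot_ms < 0:
        return "negative VOT: voicing starts before the release"
    return "zero VOT: voicing starts at the release"


# Hypothetical event times in milliseconds, measured from the start of a recording.
examples = {
    "voicing 60 ms after release": (100.0, 160.0),
    "voicing at release": (100.0, 100.0),
    "voicing 80 ms before release": (100.0, 20.0),
}
for label, (release, onset) in examples.items():
    vot = voice_onset_time_ms(release, onset)
    print(f"{label}: VOT = {vot:+.0f} ms ({describe(vot)})")
```

Whether a zero VOT then counts as _fortis_ or _lenis_ is not decided by this calculation; as said above, that depends on the language or dialect.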


----------



## sokol

phosphore said:


> I think that a devoiced consonant would be a realisation of a phoneme, which is usually represented by its voiced counterpart, in some particular phonetic context, such as word-final position; it is voiceless by its phonetic nature.


Well put.
Usually, we speak of devoiced consonants if we speak of a phoneme which usually would be voiced but is phonetically realised without voicing in specific contexts: this is devoiced as used in phonology.
But devoiced sounds can also be only "partly" devoiced. For example, take the pronunciation of "Paris" by some French people (I think this is more typical for Northern France but I'm not sure): here a vowel (!) is partly devoiced, phonetically that would sound like [parii̥] (with the ring below the second "i" standing for devoicing) - the final "i" is really a long "i" where the vowel is devoiced at the end. An untrained ear might hear this as [parih], that is, perceive the unvoiced part of "i" as [h].

And Hulalessar, concerning VOT: what is so very difficult about VOT is that every speaker is conditioned by his mother tongue - when you learn a foreign language, one of the most difficult parts is to adjust both perception and pronunciation of different VOT values.
That means that you, as a native English speaker, will always hear (and pronounce) a foreign language with your native VOTs, at least until you have either been exposed to that foreign language long enough to pick up the different VOT through experience or trained as a phonetician (and even for phoneticians that is difficult).

I think Bernd has explained it well enough, but if you still don't understand please say so and we'll try again. I am sorry that I've introduced the concept of VOT here (I felt it was necessary); it is really quite specific, technical and difficult to explain, and even phoneticians in training take some time to learn and comprehend it.


----------



## Forero

I listened to the nine samples again on the computer where I work, and wrote down a little more information, to wit:

First column-
 Top sample: _ba_ but slightly "implosive".
 Middle sample: _da_ slightly aspirated.
 Bottom sample: _ga_ almost no aspiration.

My breath-group-initial /b/, /d/, and /g/ in English are usually more aspirated than these samples.  When I speak Spanish, my breath-group-initial /g/ is similar to the sample.

Middle column-
 Top sample: _ba_ slightly voiced, no aspiration, sounds very similar to Chinese "ba" third tone (hold/grip).  Closer to French _pa_ than French _ba_, but not quite a French _pa_ because of the slight voicing.
 Middle sample: _da_ very similar to Chinese "da" fourth tone (big).  Almost like French _ta_.
 Bottom sample: _ga_ less like a French _ca_ than a "ga" with a Chinese "g".

I don't hear French unvoiced consonants as voiced, but these samples are slightly voiced and lenis, more suitable for Chinese than for French, Italian, or Spanish.

Third column-
 Top sample: _pa_ unvoiced, strongly aspirated, like Chinese "pa" fourth tone (handkerchief).  As I imagine a British "pa" in "wild party".
 Middle sample: _ta_ unvoiced, medium aspiration, like English "ta" in "tardy" but pronounced as if "ta" were a word itself. A little like Chinese "ta" (stamp/step), but less aspirated.
 Bottom sample: _ka_ as in my pronunciation of "cot", slight to medium aspiration.

In short, I don't think these particular samples show an equivalence between devoicing and aspiration since the middle-column consonants are slightly voiced, unlike French unvoiced stops and unlike English and German _p_ or _t_ in combination with _s_ (_sp-_, _st-_), and in the first column only _da_ has any aspiration to speak of.


----------



## TitTornade

Bonjour,
This is a very interesting debate. I read it with passion. And I learnt many things.

And I think I now understand why French people "can not" pronounce /h/... They (i.e. I) can't manage to have a positive VOT... 

Thus "aspiration", isolated as in "h" or associated with a consonant as in t', p', has no meaning in French (as it does in Chinese). The only distinction between plosives is voiced/unvoiced, as in the following pairs:
pas-bas (/pɑ/ - /bɑ/);
quand-gant (/kã/ - /gã/);
pont-bon (/põ/ - /bõ/);
temps-dent (/tã/ - /dã/);
Aude-haute (/o:d/ - /o:t/);
vente-vende (/vã:t/ - /vã:d/).

And, except in some liaisons, there is never devoicing of consonants at the end of a word (as far as I know), as shown in the last 2 examples.


----------



## Forero

If a French sentence ends in a /t/, for example, _haute_, or _huit_ to take an example without _e_ muet, the final /t/ is aspirated, if I remember correctly.


----------



## TitTornade

To my mind, /t/ won't be aspirated at the end of a word or of a sentence. But I think a schwa can appear... Is that the same as an aspiration?

In another post in the French Forum, some people who were learning French asked the French community: "Why is there an "h" at the end of many words in French?" I think it was about words ending in a vowel sound. But the answer of all the French was: "What "h"? Never heard such an "h"!"

What I can say (after trying to say _huit_ and _haute_) is: the /t/ at the end of a word is, how to say, *open* and not closed as in Cantonese: the air seems to continue to flow (do you understand what I mean?)...
So perhaps /t/ is aspirated. I've never been aware of that till now...


----------



## Forero

Perhaps it depends on the variety of English.

Where I live, we usually don't release final stops in normal conversation, but we are taught, when speaking publicly (e.g. in front of a crowd or in a broadcast), to maximize understanding without being obtrusive by releasing our final stops clearly (with some aspiration) but with less aspiration than with our initial stops.

The aspiration on the French final /t/ probably depends on the variety of French, but I believe final /t/ is aspirated in Parisian French very similarly to the way I would aspirate it when speaking publicly.


----------



## Hulalessar

sokol said:


> And Hulalessar, concerning VOT: what is so very difficult about VOT is that every speaker is conditioned by his mother tongue - when you learn a foreign language, one of the most difficult parts is to adjust both perception and pronunciation of different VOT values.
> That means that you, as a native English speaker, will always hear (and pronounce) a foreign language with your native VOTs - at least until you have either been exposed to that foreign language long enough to pick up the different VOT through experience or become a trained phonetician (and even for them that's difficult).
> 
> I think Bernd has explained it well enough, but if you still don't understand please say so and we'll try again. I am sorry that I've introduced the concept of VOT here (I felt it was necessary); it is really quite a specific and technical concept and difficult to explain; even phoneticians in training take some time to learn and comprehend it.


 
I think I've got the gist of it now, thank you both very much. I think Bernd should revise or add to the Wikipedia article!

As you suggest, one is obviously going to be influenced by one's own language in how one perceives the sounds of another language. English speakers are not aware of aspiration as it has no phonemic value. Interestingly, a lot of older books when explaining the pronunciation of Chinese consonants do so in terms of voiced versus unvoiced, even if they describe the two sets as aspirated and unaspirated. Like this:

Aspirated: Read _p'_, _t'_, _k'_, _ch'_, and _ts'_ as in _p_in, _t_ip, _k_ilt, _ch_in and bi_ts_

Unaspirated: Read _p_, _t_, _k_, _ch_ and _ts_ (or _tz_) as in _b_in, _d_ip, _g_ilt, _g_in, and bi_ds_

Is Thai unusual in having a three-way phonemic contrast: aspirated unvoiced, unaspirated unvoiced and unaspirated voiced?

The Wikipedia article seems to suggest that VOT is only something that came to be considered properly (if it had previously been considered at all) when technology was developed allowing sounds to be graphically represented and accurately measured and described. Indeed, so far as I understand it, technology showed that there are in fact more allophones than a human ear can consciously detect. By this I mean that every segment of speech (a vowel or consonant being considered a segment) contains the seed of the segment that follows or precedes it. This is apparently true to such an extent that 50% or more of a discourse can be removed without it becoming unintelligible. It explains why in telephone conversations, where not all sounds are actually transmitted, the missing sounds are supplied and "heard". This has led some linguists to suggest that the natural segments of speech are in fact syllables (or even something longer) rather than vowels or consonants, and that considering the natural segments to be vowels and consonants is just the result of prejudice by those brought up with alphabetic writing.


----------



## Outsider

I've noticed that in phone conversations it's very common to mistake voiced consonants for their voiceless counterparts, or vice-versa.


----------



## Athaulf

Outsider said:


> I've noticed that in phone conversations it's very common to mistake voiced consonants for their voiceless counterparts, or vice-versa.



It might have something to do with the fact that telephone lines filter out frequencies under 300Hz and above 3,400Hz. The fundamental frequency of the vocal cords in normal speech is something like 100-150Hz for men and 200-250Hz for women, so the fundamental tone will be cut out (and for speakers with deeper voices, even the first overtone). Thus, it's not surprising that phonemic voicing contrasts are attenuated over the phone.
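The arithmetic behind this can be sketched roughly as follows. This assumes an idealised brick-wall filter at exactly 300Hz and 3,400Hz (real telephone channels roll off gradually), and the function name `harmonics_in_band` is made up for illustration.

```python
def harmonics_in_band(f0_hz, low_hz=300.0, high_hz=3400.0):
    """Return the harmonic numbers (1 = fundamental, 2 = first overtone, ...)
    whose frequencies fall inside an idealised pass band.

    Assumption: a perfect brick-wall filter; real channels are not this sharp.
    """
    highest = int(high_hz // f0_hz)  # highest harmonic that can fit under high_hz
    return [n for n in range(1, highest + 1) if low_hz <= n * f0_hz <= high_hz]


# A deep voice at 100 Hz: fundamental (100 Hz) and first overtone (200 Hz) are cut.
print(harmonics_in_band(100.0)[:3])
# A higher voice at 250 Hz: only the fundamental is cut.
print(harmonics_in_band(250.0)[:3])
```

On these assumptions the fundamental never gets through for either range of voices, but plenty of higher harmonics do.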


----------



## sokol

Hulalessar said:


> Is Thai unusual in having a three-way phonemic contrast: aspirated unvoiced, unaspirated unvoiced and unaspirated voiced?
> (...)
> Indeed, so far as I understand it, technology showed that there are in fact more allophones than a human ear can consciously detect.


To distinguish between /tʰ t d/ phonemically is not too unusual, many languages do this - but as far as I know languages that distinguish only between two kinds of plosives (either /tʰ t/ or /t d/ = either voicing or aspiration being the distinctive feature) are more common by far.

And I didn't know that over the phone the distinctive frequencies for voiced plosives are filtered out, that's very interesting.  Thus English native speakers easily could check if voicing or aspiration is distinctive for them; my guess would be that for English native speakers aspiration should be the most important feature: and that English native speakers wouldn't have a problem distinguishing plosives on the phone except in positions where aspiration is lost - like in AE "better" where "tt" is replaced with a tap.

As for what we can _distinguish_, concerning VOT: our perception here is very much influenced by our native language competence and foreign languages we know; it is very difficult to train your hearing to different VOT's.
I guess only a few (if any) languages distinguish between more than two different VOTs (I don't know of any that do); those that have more than two different plosives use aspiration, ejective/implosive or emphatic articulations to differentiate them.


----------



## Forero

I listened to BBC radio this morning, and I noticed that all the speakers' final stops were the same as those we Americans are taught for public speaking.  At this point, I seriously doubt that aspiration is any more distinctive phonemically in BrE than in AmE.  BrE does aspirate initial unvoiced stops more strongly than AmE, but we still recognize each other's consonants. Misunderstandings occur, but they involve misinterpretation of the stronger aspiration as emphasis where it was not intended as such by the speaker.  For example, I noticed once when a BrE speaker said he was not a pacifist, with a strongly aspirated initial _p_, that several AmE speakers complained about his very negative idea of pacifism when in fact he meant only to be speaking about his own tendencies, not to be judging others'.





sokol said:


> And I didn't know that over the phone the distinctive frequencies for voiced plosives are filtered out, that's very interesting.  Thus English native speakers easily could check if voicing or aspiration is distinctive for them; my guess would be that for English native speakers aspiration should be the most important feature: and that English native speakers wouldn't have a problem distinguishing plosives on the phone except in positions where aspiration is lost - like in AE "better" where "tt" is replaced with a tap.


It is interesting that voicing, and even musical sounds, can still be perceived without the fundamental. We can even imagine a symphonic sound coming from a speaker too small to reproduce the lower frequencies.

So perception of sounds over the telephone would not be a good test of the distinctiveness of voicing vs. aspiration.

It takes sophisticated artificial intelligence techniques for a machine to "hear" the melody lines from the instruments in an orchestra as reproduced through a tiny speaker, as both children and adults so readily do, and voicing may be just as difficult for a machine, but voicing is real acoustically and is a distinctive feature in English, American and British.

It is common for the second _d_ in _didn't_ not to be released, being followed by the syllabic _n_.  Some Americans devoice the unreleased _d_ in _didn't_, and to those who don't devoice it, this makes the word sound as if it were spelled "_dittn't_".  Since the stop is not released, aspiration is not an issue.

The flapped _tt_ in _latter_ varies between voiced and unvoiced.  When voiced, it becomes ambiguous since _ladder_ is also a word and the _dd_ may also be flapped.  When unvoiced, whether aspirated or not, it sounds unambiguously like _tt_.  If the flap in _latter_/_ladder_ is changed to a voiced aspirated stop, the word becomes unambiguously _ladder_.


----------



## Athaulf

sokol said:


> And I didn't know that over the phone the distinctive frequencies for voiced plosives are filtered out, that's very interesting.



Not entirely, though. The higher overtones still get through, and based on them, your brain will fill in the missing fundamental tone automatically. (This is why you can understand speech and recognize melodies over the phone or through cheap speakers that are incapable of producing bass frequencies -- you get the impression that the sound is somehow thin and hollow, but it's still easily recognizable.) Thus, the voicing contrasts are attenuated, but not completely obliterated. 
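As a toy illustration of why the surviving overtones still pin down the pitch: harmonics are whole-number multiples of the fundamental, so their common spacing gives the fundamental back. The sketch below assumes perfectly harmonic partials at whole-number frequencies in Hz, and `implied_fundamental` is a made-up name; it is not a model of how hearing actually works.

```python
from functools import reduce
from math import gcd


def implied_fundamental(harmonic_freqs_hz):
    """Greatest common divisor of the surviving harmonic frequencies (integer Hz).

    Assumption: the partials are exact integer multiples of one fundamental.
    """
    if not harmonic_freqs_hz:
        raise ValueError("need at least one harmonic")
    return reduce(gcd, harmonic_freqs_hz)


# Harmonics of a 100 Hz voice that survive a 300-3,400 Hz channel
print(implied_fundamental([300, 400, 500, 600]))
```

Note that two adjacent harmonics are enough in this idealised case, since their difference is the fundamental itself.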

It's a pretty interesting phenomenon -- the Wikipedia articles on virtual pitch and missing fundamental have some interesting discussion and links. It's also fortunate, or otherwise it would be impossible to establish even the basic telephony and radio without expensive high-fidelity sound transmission and reproduction systems. It also helps with understanding live speech in non-ideal circumstances, of course.


----------



## berndf

Forero said:


> I listened to BBC radio this morning, and I noticed that all the speakers' final stops were the same as those we Americans are taught for public speaking.


Contrary to AE, this is also the colloquial pronunciation in BE.


> At this point, I seriously doubt that aspiration is any more distinctive phonemically in BrE than in AmE. BrE does aspirate initial unvoiced stops more strongly than AmE, but we still recognize each other's consonants.


Most of the time the different VOT thresholds do not create misunderstandings; not even between, say, French and German.


> When unvoiced, whether aspirated or not, it sounds unambiguously like _tt_.


This really depends on how your ear is trained. We would need to ask native BE speakers to decide the matter. If a speaker realizes the "t" in _tee_ with a VOT of +60ms and "d" in _dee_ with a VOT of -30ms you cannot tell how he/she perceives VOT=0.


----------



## berndf

Forero said:


> I listened to BBC radio this morning, and I noticed that all the speakers' final stops were the same as those we Americans are taught for public speaking.  At this point, I seriously doubt that aspiration is any more distinctive phonemically in BrE than in AmE.  BrE does aspirate initial unvoiced stops more strongly than AmE, but we still recognize each other's consonants. Misunderstandings occur, but they involve misinterpretation of the stronger aspiration as emphasis where it was not intended as such by the speaker.  For example, I noticed once when a BrE speaker said he was not a pacifist, with a strongly aspirated initial _p_, that several AmE speakers complained about his very negative idea of pacifism when in fact he meant only to be speaking about his own tendencies, not to be judging others'.


Have a look here. When I listened to this video I had to think of this old thread. While it claims to be explaining the difference between unvoiced and voiced /t/ and /d/, it is clearly explaining the difference between aspirated and unaspirated /t/ and /d/. In my experience, the situation in Standard BE is much the same as in German: lack of aspiration alone causes the sound to be perceived as /d/, irrespective of voicing. Most speakers are unaware of this and think they hear voicing whenever a plosive is unaspirated, and for this reason the video talks of voicing when in reality it is describing non-aspiration.


----------



## JuanEscritor

berndf said:


> In German VOT>0 counts as unvoiced, VOT<=0 as voiced.


I may be wrong, but I was under the impression that most all the Germanic languages used a similar set of criteria for distinguishing their plosive pairs, those criteria in all of them being comparable to what is used in English, i.e., high (around 40–60ms) VOT for /p/, /t/, /k/ (class 1)*, and lower (less than 35ms, but _usually_ not less than 0ms) VOT for /b/, /d/, /g/ (class 2), at least in initial position.  Furthermore, research that I've done has shown that in many cases final class 2 plosives lack voicing in English, as they do in German.



> In French VOT>=0 is unvoiced and VOT<0 is voiced where VOT=0 means "below the threshold of audibility" which is about +/-30ms. English is, depending on regional variety (BE tends towards higher VOTs, AE towards lower VOTs), somewhere in between.


I have yet to see research that bears this out.  English VOT for class 1 plosives is often very high.  My research (on AmE) has shown the value easily climbs into the 60ms mark and going upwards of 80ms is not unheard of (word-initially, that is).  

Perhaps you would be willing to link to some samples of your dialect of German as well as to those dialects of English with such low class 1 VOTs.  I do not necessarily think you are wrong, but what you are saying is contrary to what I've known and observed myself.
 
JE

__________
* I am using these terms to avoid calling them 'voiceless' and 'voiced', because I see that the use of the classical terminology has already caused great confusion throughout this thread.


----------



## JuanEscritor

Forero said:


> I listened to the nine samples again on the computer where I work, and wrote down a little more information, to wit:
> 
> First column-
> Top sample: _ba_ but slightly "implosive".
> Middle sample: _da_ slightly aspirated.
> Bottom sample: _ga_ almost no aspiration.
> 
> My breath-group-initial /b/, /d/, and /g/ in English are usually more aspirated than these samples.  When I speak Spanish, my breath-group-initial /g/ is similar to the sample.
> 
> Middle column-
> Top sample: _ba_ slightly voiced, no aspiration, sounds very similar to Chinese "ba" third tone (hold/grip).  Closer to French _pa_ than French _ba_, but not quite a French _pa_ because of the slight voicing.
> Middle sample: _da_ very similar to Chinese "da" fourth tone (big).  Almost like French _ta_.
> Bottom sample: _ga_ less like a French _ca_ than a "ga" with a Chinese "g".
> 
> I don't hear French unvoiced consonants as voiced, but these samples are slightly voiced and lenis, more suitable for Chinese than for French, Italian, or Spanish.
> 
> Third column-
> Top sample: _pa_ unvoiced, strongly aspirated, like Chinese "pa" fourth tone (handkerchief).  As I imagine a British "pa" in "wild party".
> Middle sample: _ta_ unvoiced, medium aspiration, like English "ta" in "tardy" but pronounced as if "ta" were a word itself. A little like Chinese "ta" (stamp/step), but less aspirated.
> Bottom sample: _ka_ as in my pronunciation of "cot", slight to medium aspiration.
> 
> In short, I don't think these particular samples show an equivalence between devoicing and aspiration since the middle-column consonants are slightly voiced, unlike French unvoiced stops and unlike English and German _p_ or _t_ in combination with _s_ (_sp-_, _st-_), and in the first column only _da_ has any aspiration to speak of.



Are you doing these analyses by ear?

JE


----------



## Forero

JuanEscritor said:


> Are you doing these analyses by ear?
> 
> JE


Yes, they are my own descriptions of what the samples sound like to me.  I would not call them analyses but auditory impressions. I am looking for evidence about how perception compares with actual measurements (see also post #31).

I don't have the means to download and analyze the samples in the format provided, the way I might with wav format. What are the actual VOTs of the nine samples, and how were they determined? Are better samples available? How is VOT=0 defined when there is no video or other measurable evidence of the time of separation at the point of articulation?


----------



## JuanEscritor

Forero said:


> Yes, they are my own descriptions of what the samples sound like to me.  I would not call them analyses but auditory impressions. I am looking for evidence about how perception compares with actual measurements (see also post #31).


In this case of perception, the answer is complicated.  As others have pointed out, there are more factors than merely VOT that listeners use to judge the class of a consonant.  Word-finally, for example, it has been my experience that VOT has absolutely no impact on interpretation in the case of minimal pairs.  A pair such as ‹bad› and ‹bat›, for example, that vary only in the voicing characteristic of their final consonant will be interpreted as the same word.  To adequately convey the /d/-/t/ distinction, a speaker will alter instead the vowel hold duration (which, honestly, I believe is just a form of prevoicing) and so indicate the consonant's value that way.

I might be able to show some more examples and values in a day or two.  I'm quite exhausted, though, so I cannot type anymore.  I hope what I said already was at least minimally sensible.

JE


----------



## berndf

JuanEscritor said:


> I may be wrong, but I was under the impression that most all the Germanic languages used a similar set of criteria for distinguishing their plosive pairs, those criteria in all of them being comparable to what is used in English, i.e., high (around 40–60ms) VOT for /p/, /t/, /k/ (class 1)*, and lower (less than 35ms, but _usually_ not less than 0ms) VOT for /b/, /d/, /g/ (class 2), at least in initial position.  Furthermore, research that I've done has shown that in many cases final class 2 plosives lack voicing in English, as they do in German.
> 
> I have yet to see research that bears this out.  English VOT for class 1 plosives is often very high.  My research (on AmE) has shown the value easily climbs into the 60ms mark and going upwards of 80ms is not unheard of (word-initially, that is).
> 
> Perhaps you would be willing to link to some samples of your dialect of German as well as to those dialects of English with such low class 1 VOTs.  I do not necessarily think you are wrong, but what you are saying is contrary to what I've known and observed myself.
> 
> JE
> 
> __________
> * I am using these terms to avoid calling them 'voiceless' and 'voiced', because I see that the use of the classical terminology has already caused great confusion throughout this thread.


I was talking about perception, not articulation, here. A plosive must be audibly aspirated to count as fortis; in all other cases it is perceived as lenis. In this context VOT=0 is obviously shorthand for "neither audibly voiced nor aspirated", i.e. -30ms<VOT<30ms.

AE speakers do perceive VOT=0 plosives as fortis at the end of a syllable or at the beginning of unstressed syllables. The lenis/fortis perception threshold should hence be lower than in BE or German. On the other hand there are some German dialects which know unvoiced /t/s as well but they distinguish /d/ and /t/ only lexically, i.e. without context the words "Dorf" and "Torf" could not be distinguished. It would be interesting to know if in colloquial AE "bad" and "bat" can be distinguished without context. Asking an AE speaker to say the two words in isolation wouldn't work because he would probably aspirate the t. One would have to record entire sentences (preferably by a speaker who doesn't know what we are after) and cut out the words.

PS: My personal guess: Yes, it could be distinguished but the distinction is less clear than in Romance or Slavic languages.
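The perception thresholds discussed in this thread can be sketched roughly as below. It is only an illustration under stated assumptions: the ±30ms audibility band is the approximate figure given above, the two "systems" are simplified labels for the German-type (aspiration-based) and French-type (voicing-based) perception described earlier, and the names `vot_band` and `perceived_class` are made up.

```python
AUDIBILITY_MS = 30.0  # assumed threshold: |VOT| below this counts as "VOT=0"


def vot_band(vot_ms):
    """Sort a VOT value (in ms) into prevoiced / neutral / aspirated."""
    if vot_ms <= -AUDIBILITY_MS:
        return "prevoiced"
    if vot_ms >= AUDIBILITY_MS:
        return "aspirated"
    return "neutral"  # neither audibly voiced nor audibly aspirated


def perceived_class(vot_ms, system):
    """Return 'fortis' or 'lenis' for a listener using the given system.

    'aspiration': only audibly aspirated plosives count as fortis (German-type).
    'voicing':    only audibly prevoiced plosives count as lenis (French-type).
    """
    band = vot_band(vot_ms)
    if system == "aspiration":
        return "fortis" if band == "aspirated" else "lenis"
    if system == "voicing":
        return "lenis" if band == "prevoiced" else "fortis"
    raise ValueError(f"unknown system: {system!r}")


for vot in (-60, 0, 60):
    print(vot, perceived_class(vot, "aspiration"), perceived_class(vot, "voicing"))
```

The two systems only disagree in the neutral band, which is exactly where an unaspirated Romance /t/ falls.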


----------



## Forero

berndf said:


> I was talking about perception, not articulation, here. A plosive must be audibly aspirated to count as fortis; in all other cases it is perceived as lenis. In this context VOT=0 is obviously shorthand for "neither audibly voiced nor aspirated", i.e. -30ms<VOT<30ms.
> 
> AE speakers do perceive VOT=0 plosives as fortis at the end of a syllable or at the beginning of unstressed syllables. The lenis/fortis perception threshold should hence be lower than in BE or German. On the other hand there are some German dialects which know unvoiced /t/s as well but they distinguish /d/ and /t/ only lexically, i.e. without context the words "Dorf" and "Torf" could not be distinguished. It would be interesting to know if in colloquial AE "bad" and "bat" can be distinguished without context. Asking an AE speaker to say the two words in isolation wouldn't work because he would probably aspirate the t. One would have to record entire sentences (preferably by a speaker who doesn't know what we are after) and cut out the words.
> 
> PS: My personal guess: Yes, it could be distinguished but the distinction is less clear than in Romance or Slavic languages.


Yes, there are American dialects in which _bad_ sounds like _bat_, but in most American dialects, there is a clear difference (to us) between _bad_ and _bat_ in isolation with unreleased final consonants (the usual case in casual speech).


----------



## berndf

Here we go. The words "bad" and "bat" cut from an arbitrary text spoken by a General American speaker. So, do you hear a difference and which is which and how would you describe the difference? As I know the answer to the second question I won't say anything more.

Good luck.


----------



## Forero

berndf said:


> Here we go. The words "bad" and "bat" cut from an arbitrary text spoken by a General American speaker. So, do you hear a difference and which is which and how would you describe the difference? As I know the answer to the second question I won't say anything more.
> 
> Good luck.


It's _bat_, then _bad_. The final consonant of the first is unvoiced, and the final consonant of the second is voiced.

If I had to guess, I would say the speaker is from the Northeastern U.S., probably New York. The speaker's _a_ of _bad_ sounds almost like _e_ of _bed_ to me.


----------



## clevermizo

berndf said:


> Here we go. The words "bad" and "bat" cut from an arbitrary text spoken by a General American speaker. So, do you hear a difference and which is which and how would you describe the difference? As I know the answer to the second question I won't say anything more.
> 
> Good luck.




The stop in bad and bat is unreleased. The difference between these two is the length of the vowel. 

This is a great example of where vowel length (and I mean real length, not tense/lax) is phonemic in modern (American) English. If you say [ba:t] instead of [bat] you will be perceived as having said "bad," unless of course you actually release the "t" and then you'll just sound silly overall. 

As to the speaker in the clip, I admit it sounds garbled to me. I can't really make it out.


----------



## berndf

clevermizo said:


> The stop in bad and bat is unreleased. The difference between these two is the length of the vowel.
> 
> This is a great example of where vowel length (and I mean real length, not tense/lax) is phonemic in modern (American) English. If you say [ba:t] instead of [bat] you will be perceived as having said "bad," unless of course you actually release the "t" and then you'll just sound silly overall.
> 
> As to the speaker in the clip, I admit it sounds garbled to me. I can't really make it out.


Ok, I reduced the length of the "a" in the second word by 40ms by cutting out something in the middle. What do you hear now?
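For anyone wanting to repeat this kind of edit, here is a minimal sketch of removing a fixed duration from the middle of a segment. It assumes the vowel has already been isolated as a plain list of samples at a known sample rate; `cut_from_middle` is a hypothetical helper, not the tool actually used for the clip.

```python
def cut_from_middle(samples, sample_rate_hz, cut_ms):
    """Return a copy of `samples` with `cut_ms` milliseconds removed from the centre.

    Assumption: a plain splice is good enough; a careful edit would cut at
    zero crossings or whole pitch periods to avoid an audible click.
    """
    n_cut = round(sample_rate_hz * cut_ms / 1000)
    if n_cut >= len(samples):
        raise ValueError("cannot cut more than the segment contains")
    start = (len(samples) - n_cut) // 2  # centre the removed stretch
    return samples[:start] + samples[start + n_cut:]


# 125 ms of dummy "vowel" at 8 kHz, shortened by 40 ms (320 samples)
vowel = list(range(1000))
shorter = cut_from_middle(vowel, 8000, 40)
print(len(vowel), len(shorter))
```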


----------



## clevermizo

berndf said:


> Ok, I reduced the length of the "a" in the second word by 40ms by cutting out something in the middle. What do you hear now?



Really, the audio doesn't sound like anything. Is this an encoding problem? I'm playing it with VLC which should be able to play anything. I really can't make it out.


----------



## berndf

I downloaded it again and played it with Windows Media Player. Was ok.


----------



## Nino83

berndf said:


> Some languages, like German, English or Chinese, do not distinguish between unvoiced and voiced plosives but only between aspirated and unaspirated ones. In languages which do differentiate between voiced and unvoiced plosives, like all Romance languages, there is a difference between a "d" (the vocal cords start to vibrate several 10s of milliseconds before the plosive release) and an unaspirated "t".



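This is confirmed by John Peter Sloan (you can find this video on YouTube - "pronuncia inglese per italiani", at 2'10"), who says: 

"Poi c'è il "t". Perché vostra "t" a noi sembra un "d". Infatti io mi ricordo, che io per tanti anni pensavo che quando tutti dicevano "troppo", era scritto con un "d", e quanto l'ho visto scritto con "tr", non riconoscevo la parola. We hear your "t" come un "d". So in English it's [tʰ], e se tu dovessi dire, mi sto sputando ma...gli ombrellini, magari serve anche per gli altri, quando fate questi esercizi. Quando fai il "t", it's [tʰ], con attack, ok?." 

(In English, roughly: "Then there's the "t". Because your "t" sounds like a "d" to us. In fact I remember that for many years I thought that when everyone said "troppo", it was written with a "d", and when I saw it written with "tr", I didn't recognize the word. We hear your "t" as a "d". So in English it's [tʰ], and if you had to say, I'm spitting on myself but... little umbrellas, maybe they're useful for the others too, when you do these exercises. When you make the "t", it's [tʰ], with attack, ok?")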

So, Mr. Sloan, from Birmingham, says that he doesn't hear any difference between Italian [t] and Italian [d].


----------



## gburtonio

Nino83 said:


> This is confirmed by John Peter Sloan (you can find this video on YouTube - "pronuncia inglese per italiani", at 2'10"), who says:
> So, Mr. Sloan, from Birmingham, says that he doesn't hear any difference between Italian [t] and Italian [d].



That's a possible interpretation but it's not really what he's saying. His point is that he mistook an Italian [t] for a [d], but it might well be that when he heard Italian [d] he knew it was a [d] and not a [t] (so the confusion may well have been only in one direction).

Personally (as a NS of English) I don't have any trouble hearing the difference in the initial consonants, for example, of 'dramma' and 'trama'.


----------



## berndf

gburtonio said:


> Personally (as a NS of English) I don't have any trouble hearing the difference in the initial consonants, for example, of 'dramma' and 'trama'.


That's a bad example as it isn't a minimal pair.

Of course it is possible for speakers of one language to tune into the phonology of a different language. I have learned, e.g., to hear the difference between _bed, bad, bet_ and _bat_ in English although for an untrained mono-lingual German speaker these four words would be impossible to distinguish.

The same is true with Romance plosives, to some degree at least. It is also easier to distinguish minimal pairs, if you are aware of them. A more interesting test would be, if you were asked to write down a word you don't know and you've heard pronounced by a native speaker. E.g., I had never made the connection between Italian _pagare_ and French _payer_ before I saw _pagare_ written for the first time (I had French at school but much of my (little) knowledge of Italian I picked up in the street) because from hearing the word it had never even occurred to me that it might have been spelled _pagare_ and not _bagare_.


----------



## gburtonio

berndf said:


> That's a bad example as it isn't a minimal pair.
> 
> Of course it is possible for speakers of one language to tune into the phonology of a different language. I have learned, e.g., to hear the difference between _bed, bad, bet_ and _bat_ in English although for an untrained mono-lingual German speaker these four words would be impossible to distinguish.
> 
> The same is true with Romance plosives, to some degree at least. It is also easier to distinguish minimal pairs, if you are aware of them. A more interesting test would be, if you were asked to write down a word you don't know and you've heard pronounced by a native speaker. E.g., I had never made the connection between Italian _pagare_ and French _payer_ before I saw _pagare_ written for the first time (I had French at school but much of my (little) knowledge of Italian I picked up in the street) because from hearing the word it had never even occurred to me that it might have been spelled _pagare_ and not _bagare_.



No, sure – you're absolutely right that it's possible to 'tune in' as you say. I really only mentioned my own perception because I know the speaker in the video has been in Italy a long time and seemed to be giving the impression that it's impossible for an English NS to hear the difference. He can surely hear the difference now. But I think my main point stands – it's not so much that an English speaker can't hear the difference between Italian [t] and [d], it's just that s/he may well hear an Italian [t] and think it's a [d]. That was the case for you, too, as a German NS with 'pagare'. But do you think you could hear an Italian [d] and think it was a [t]? That's what seems unlikely to me.


----------



## berndf

gburtonio said:


> But do you think you could hear an Italian [d] and think it was a [t]? That's what seems unlikely to me.


No, that would certainly not occur. To a German speaker, both Italian /t/ and /d/ would sound like /d/, if he didn't pay attention. There are, by the way, some German dialects where aspiration of plosives is lost and as a consequence, the phonemic distinction is lost and words like _Dorf_ and _Torf_ are homophones.


----------



## gburtonio

berndf said:


> No, that would certainly not occur. To a German speaker, both Italian /t/ and /d/ would sound like /d/, if he didn't pay attention. There are, by the way, some German dialects where aspiration of plosives is lost and, as a consequence, the phonemic distinction is lost and words like _Dorf_ and _Torf_ are homophones.



So in that case I do think it's an exaggeration to suggest that an English speaker wouldn't hear a difference between /t/ and /d/ in all cases (sorry, I should really have been using /…/ rather than […] ). The problem is perceiving a /d/ when the sound used is a /t/, but it's probably unlikely that an English speaker would perceive a /t/ when the sound produced was a /d/.

I'm interested to hear about the loss of aspiration in plosives in some dialects. So in these cases there is really no compensatory change in either sound? There's really no difference at all, no earlier voicing for example?


----------



## Nino83

gburtonio said:


> His point is that he mistook an Italian [t] for a [d], but it might well be that when he heard Italian [d] he knew it was a [d] and not a [t] (so the confusion may well have been only in one direction).



Ciao, gburtonio. 
But it's exactly the object of this thread  

English (particularly British) speakers hear the difference between _aspirated_ and _unaspirated_ stops, so Romance unaspirated [t] and [d] can both be perceived as a [d].


----------



## berndf

gburtonio said:


> So in that case I do think it's an exaggeration to suggest that an English speaker wouldn't hear a difference between /t/ and /d/ in all cases (sorry, I should really have been using /…/ rather than […] ). The problem is perceiving a /d/ when the sound used is a /t/, but it's probably unlikely that an English speaker would perceive a /t/ when the sound produced was a /d/.


The issue is that for initial /b/-/d/-/g/ (for non-initials things get more complicated and dialect dependent) VOT ranges from significantly negative values to zero without any systematic pattern. English speakers are therefore trained to ignore this variation, as it carries no useful information and would only confuse. But it happens to be crucial for the distinction between Romance voiced and unvoiced plosives. Of course, you can "unlearn" this conditioning.



gburtonio said:


> I'm interested to hear about the loss of aspiration in plosives in some dialects. So in these cases there is really no compensatory change in either sound? There's really no difference at all, no earlier voicing for example?


There might have been voicing at one stage but today the phonemic distinction is completely lost (at least for /d/-/t/ and /b/-/p/; /g/-/k/ is a bit more complex, but mostly too). Numerous spelling mistakes in those areas demonstrate this, e.g. many Austrian restaurants offer _*G*ordon bleu _instead of _Cordon bleu_.


----------



## gburtonio

Dear Nino83 and berndf,

It might just be that I'm not actually saying anything useful, but … my feeling is that the 'problem' a Brit encounters with Romance languages is purely with /t/. I think that to suggest we (Brits) 'can't hear any difference between Italian [t] and Italian [d]' is perhaps overstating things. Certainly an Italian /t/ will confuse an inexperienced Brit at the beginning, but a /d/ won't be perceived as a /t/. I guess that's kind of obvious, but I'm just trying to say that it's not total chaos! We don't just perceive a /t/ or a /d/ at random – /d/ will always be perceived as a /d/, but /t/ will (sometimes?) also be perceived as a /d/. I say 'sometimes' with a question mark because I'm not totally sure all Brits would hear 'troppo' as 'droppo' but I'm too far gone now – I started learning Italian ten years ago and I learnt Modern Greek (where plosives are more or less as in Italian) before that, so it's too long ago to remember what it was like for me at the beginning!


----------



## Nino83

gburtonio said:


> I'm just trying to say that it's not total chaos! We don't just perceive a /t/ or a /d/ at random – /d/ will always be perceived as a /d/, but /t/ will (sometimes?) also be perceived as a /d/.



Yes, I agree, it's what I'm saying. 
Italian [t] and [d] are not aspirated and are perceived as a [d]. The Italian [d], being unaspirated, will be perceived as a [d]. 
In other words, an Englishman could think that there is no /t/ in Italian (at the beginning, we hope).


----------



## merquiades

Nino83 said:


> Yes, I agree, it's what I'm saying.
> Italian [t] and [d] are not aspirated and are perceived as a [d]. The Italian [d], being unaspirated, will be perceived as a [d].
> In other words, an Englishman could think that there is no /t/ in Italian (at the beginning, we hope).



I don't think English-speakers normally fail to hear and understand a Romance /t/.  It's the first time I heard about it causing a big problem.  People don't hear "Dorino" for "Torino".  What is true is some speakers find it difficult to pronounce it as a dental and not alveolar sound.

Germans sometimes cannot understand my /k/.  It's harsher in German.  I've had a few examples of that, like "Ein Kaffee, bitte" or "Kartoffel", which they fail to understand the first time I say it; when they finally get it, they repeat it with a raspy sound.


----------



## Nino83

merquiades said:


> I don't think English-speakers normally fail to hear and understand a Romance /t/.  It's the first time I heard about it causing a big problem.  People don't hear "Dorino" for "Torino".  What is true is some speakers find it difficult to pronounce it as a dental and not alveolar sound.



I've just reported what Mr. Sloan said. 
It's true that Germans often fail to pronounce voiced stops well (it is a trademark of Germans when they speak Italian). English speakers don't have any problem when they pronounce voiced stops (yes, they often keep pronouncing [tʰ] when they speak Italian, instead of [t]), but sometimes they can hear the Italian [t] as a [d] (at the beginning, if they are not accustomed to the language).


----------



## berndf

Nino83 said:


> I've just reported what Mr. Sloan said.
> It's true that Germans often fail to pronounce voiced stops well (it is a trademark of Germans when they speak Italian). English speakers don't have any problem when they pronounce voiced stops (yes, they often keep pronouncing [tʰ] when they speak Italian, instead of [t]), but sometimes they can hear the Italian [t] as a [d].


The difference between German and English initial stops is that German does not have voiced stops at all while in English voiced initial stops exist but they are not distinguished from unaspirated unvoiced ones.


----------



## merquiades

The University of Iowa has made an audio phonetics guide in several languages showing mouth position, point of articulation with several  words pronounced in different positions.   Check out the German and English here and compare.  For me it sounds like the biggest difference is that aspiration is stronger in German.  But when I hear "Lüke" and "Glas", they do sound like they are approaching /k/ with no aspiration.


----------



## berndf

To make it absolutely clear: Standard German has no voiced plosives whatsoever.


----------



## Forero

berndf said:


> To make it absolutely clear: Standard German has no voiced plosives whatsoever.


German is not my native language, and my exposure to it has been limited, but I have to disagree with this "whatsoever". You keep saying it, but I need to see it to believe it.

To me, German initial /d/, English initial /d/, and French initial /d/ are (at least very nearly) the same thing, with the larynx already sounding when the following vowel is begun. It is not like a Romance /t/. The voicing is there. I hear it, and I (used to) see it in spectrograms. (Unfortunately I have yet to be able to synch audio and video in Windows well enough to demonstrate.)


----------



## merquiades

Forero said:


> German is not my native language, and my exposure to it has been limited, but I have to disagree with this "whatsoever". You keep saying it, but I need to see it to believe it.
> 
> To me, German initial /d/, English initial /d/, and French initial /d/ are (at least very nearly) the same thing, with the larynx already sounding when the following vowel is begun. It is not like a Romance /t/. The voicing is there. I hear it, and I (used to) see it in spectrograms. (Unfortunately I have yet to be able to synch audio and video in Windows well enough to demonstrate.)



Forero, could you listen to the plosive "d", "b", "g" on the German link I gave before and tell us if you hear a voiced consonant?  I think I do but now I'm wondering if seeing the spelling is making me believe so.


----------



## Forero

merquiades said:


> Forero, could you listen to the plosive "d", "b", "g" on the German link I gave before and tell us if you hear a voiced consonant?  I think I do but now I'm wondering if seeing the spelling is making me believe so.


Yes, these are definitely what I call voiced plosives, just like my "d" in "dog", "b" in "bed", "g" in "gab", not unvoiced unaspirated plosives like French "t", "p", "k", and some of them are rather strongly aspirated, unlike the usual French "d", "b", "g".

Berndf tells me I cannot trust my own ears, and it has been thirty years since I could do a good spectrogram, but this is what I hear and I am sure it is not an artifact of seeing how the words are spelled.


----------



## merquiades

Forero said:


> Yes, these are definitely what I call voiced plosives, just like my "d" in "dog", "b" in "bed", "g" in "gab", not unvoiced unaspirated plosives like French "t", "p", "k", and some of them are rather strongly aspirated, unlike the usual French "d", "b", "g".
> 
> Berndf tells me I cannot trust my own ears, and it has been thirty years since I could do a good spectrogram, but this is what I hear and I am sure it is not an artifact of seeing how the words are spelled.



Danke.  My ears are off too then!   The German "d", "g", "b" always sound a bit on the aspirated side but  they sound voiced to me.
I'm curious about the spectrogram; I've never seen one or used one.


For those who can't get enough, here is a link to people pronouncing David on forvo in English, German, French, Spanish etc.


----------



## berndf

Forero said:


> Yes, these are definitely what I call voiced plosives, just like my "d" in "dog", "b" in "bed", "g" in "gab", not unvoiced unaspirated plosives like French "t", "p", "k", and some of them are rather strongly aspirated, unlike the usual French "d", "b", "g".
> 
> Berndf tells me I cannot trust my own ears, and it has been thirty years since I could do a good spectrogram, but this is what I hear and I am sure it is not an artifact of seeing how the words are spelled.


I admit that I exaggerated. Of course voicing of initial lax plosives occurs. But it is a rare variant and what I insist on is that voicing is never, not even intervocalically, relevant for phoneme separation.

If you say that the German b, d or g sound just like yours, it is probably right. Voicing of initial b, d, g is more the exception than the rule in English as well. I remember research papers that showed statistics for RP speakers with about 20% voiced pronunciations. This might well match the relative frequency among German speakers though I would expect a lower percentage from my own experience.


----------



## Forero

berndf said:


> I admit that I exaggerated. Of course voicing of initial lax plosives occurs. But it is a rare variant and what I insist on is that voicing is never, not even intervocalically, relevant for phoneme separation.
> 
> If you say that the German b, d or g sound just like yours, it is probably right. Voicing of initial b, d, g is more the exception than the rule in English as well. I remember research papers that showed statistics for RP speakers with about 20% voiced pronunciations. This might well match the relative frequency among German speakers though I would expect a lower percentage from my own experience.


I don't speak RP, so I really don't know about that, but it sounds to me like your definition of "voiced" must be different from mine. For me, an initial stop is voiced if the stop is released with the larynx already vibrating. It does not have to be voiced during the entire time of closure. A voiced stop can have its own musical pitch, in my case in the baritone range, whereas an unvoiced stop cannot.


----------



## berndf

There are considerable differences between RP and GA d/t separation but I don't think it affects utterance-initial plosives very much.
I don't know how you pronounce your _d_s. It may well be that you consistently voice them but it is certainly not a general characteristic of GA. Even most German speakers would produce a beautifully voiced [d] if you asked them to explain how a German /d/ is or ought to be produced. But whether they do it in real life is a completely different story. Below you find a wave pattern taken from this pronunciation of _dog_ (left) by an American speaker (sugardaddy, the one at the top with the most votes) and of _dolche_ (right) produced by an Italian speaker. The highlighted bit on the right side is the pre-release voicing (about 100ms; in real life this would probably be shorter but at least 30ms). You can see very clearly that this is missing on the left hand side. The voice onset is roughly simultaneous (within 5ms) with the release.


----------



## Forero

berndf said:


> There are considerable differences between RP and GA d/t separation but I don't think it affects utterance-initial plosives very much.
> I don't know how you pronounce your _d_s. It may well be that you consistently voice them but it is certainly not a general characteristic of GA. Even most German speakers would produce a beautifully voiced [d] if you asked them to explain how a German /d/ is or ought to be produced. But whether they do it in real life is a completely different story. Below you find a wave pattern taken from this pronunciation of _dog_ (left) by an American speaker (sugardaddy, the one at the top with the most votes) and of _dolche_ (right) produced by an Italian speaker. The highlighted bit on the right side is the pre-release voicing (about 100ms; in real life this would probably be shorter but at least 30ms). You can see very clearly that this is missing on the left hand side. The voice onset is roughly simultaneous (within 5ms) with the release.
> 
> View attachment 14617


Thank you.

I am having trouble finding my way around without some sort of motion picture, a "talky" that connects the audio with video. Where exactly is the release on the left side?

Both "d"s here are voiced, but the Italian speaker does drag his "d" out, like a "dd".


----------



## berndf

Forero said:


> I am having trouble finding my way around without some sort of motion picture, a "talky" that connects the audio with video. Where exactly is the release on the left side?


Well the _Bang!_ of the release is easy to identify in both charts.



Forero said:


> Both "d"s here are voiced, but the Italian speaker does drag his "d" out, like a "dd".


According to your definition ("an initial stop is voiced if the stop is released with the larynx already vibrating"), the first one is not. There is no "vibration" before the release. The voice onset happens at the same time as the release. What you call "the Italian speaker does drag his 'd' out", that *is* voicing (by your own definition). It doesn't have to be quite that long but the presence of the highlighted wave form (an audible, i.e. sufficiently strong and sufficiently long to be noticeable, voicing *ahead* of the release) is necessary (or at least an important characteristic) in Romance languages to identify an initial _d_ as a _d_ rather than as a _t_. In Germanic languages it is not. In Germanic languages, almost simultaneous voice onset, i.e. lack of noticeable aspiration, is sufficient.


----------



## Forero

berndf said:


> Well the _Bang!_ of the release is easy to identify in both charts.


Then I must be pretty dumb to not know what a "bang" looks like. 


> According to your definition ("an initial stop is voiced if the stop is released with the larynx already vibrating"), the first one is not. There is no "vibration" before the release. The voice onset happens at the same time as the release. What you call "the Italian speaker does drag his 'd' out", that *is* voicing (by your own definition). It doesn't have to be quite that long but the presence of the highlighted wave form (an audible, i.e. sufficiently strong and sufficiently long to be noticeable, voicing *ahead* of the release) is necessary (or at least an important characteristic) in Romance languages to identify an initial _d_ as a _d_ rather than as a _t_. In Germanic languages it is not. In Germanic languages, almost simultaneous voice onset, i.e. lack of noticeable aspiration, is sufficient.


This is frustrating. What I hear is not what you see, and I cannot hear your graph. When I hear a voiced stop, you say the voice is not there. When I hear aspiration at the release of a voiced stop, you say it seldom happens.

I suspect we are each holding on to different parts of the elephant and neither of us sees the whole thing.


----------



## berndf

Forero said:


> Then I must be pretty dumb to not know what a "bang" looks like.


A sudden dramatic increase in amplitude. In the left image, there is complete silence to the left of the high amplitude burst, i.e. no voice is heard before the release.
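To make "a sudden dramatic increase in amplitude" concrete, here is a minimal sketch of how one might locate the release in a list of samples and check whether anything audible precedes it. The function names, the threshold and noise-floor values, and both sample lists are invented for illustration; they are not taken from the recordings discussed here, and a fixed threshold is only a crude stand-in for reading a real waveform.

```python
def find_release(samples, threshold):
    """Return the index of the first sample whose magnitude reaches
    `threshold` (taken here as the release burst), or None if none does."""
    for index, sample in enumerate(samples):
        if abs(sample) >= threshold:
            return index
    return None


def has_prevoicing(samples, release_index, noise_floor):
    """True if any sample before the release rises above the noise floor,
    i.e. something is audible before the burst."""
    return any(abs(sample) > noise_floor for sample in samples[:release_index])


# Synthetic, hypothetical waveforms: 50 samples of closure, then a burst.
BURST = [0.9, -0.8, 0.7, -0.6]
english_like = [0.0] * 50 + BURST        # silence before the release
italian_like = [0.1, -0.1] * 25 + BURST  # low-amplitude voicing before it

for name, samples in (("english_like", english_like),
                      ("italian_like", italian_like)):
    release = find_release(samples, threshold=0.5)
    print(name, release, has_prevoicing(samples, release, noise_floor=0.05))
```

Both toy signals have the burst at the same position; only the second one has energy before it, which is the highlighted pre-release voicing in the right-hand chart.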


Forero said:


> This is frustrating. What I hear is not what you see, and I cannot hear your graph. When I hear a voiced stop, you say the voice is not there. When I hear aspiration at the release of a voiced stop, you say it seldom happens.
> 
> I suspect we are each holding on to different parts of the elephant and neither of us sees the whole thing.


Ok, then let's put it another way that may be easier for you to relate to: It depends on the definition of the very term "voicing". In Romance languages, "voicing" can be described as you did, viz. that at the point of the release the vocal cords must already have started to vibrate audibly. In Germanic languages, it is sufficient that the vocal cords start to vibrate together with the release for an initial plosive to be perceived as "voiced".


----------



## Forero

berndf said:


> A sudden dramatic increase in amplitude. In the left image, there is complete silence to the left of the high amplitude burst, i.e. no voice is heard before the release.
> Ok, then let's put it another way that may be easier for you to relate to: It depends on the definition of the very term "voicing". In Romance languages, "voicing" can be described as you did, viz. that at the point of the release the vocal cords must already have started to vibrate audibly. In Germanic languages, it is sufficient that the vocal cords start to vibrate together with the release for an initial plosive to be perceived as "voiced".


That makes sense. If I hear vocal vibration at the beginning of the release, I call that a voiced consonant, even if there is only silence before the release. But voicing (vibration of the larynx) and aspiration (audible turbulence near the point of articulation) are still independent factors, are they not?


----------



## sumelic

The way I understand it, aspiration is simply when voicing starts significantly after the point of release. In voiced stops, the voicing starts before the release of the stop. As such, these two characteristics are incompatible, since they lie at different points on a one-dimensional spectrum that goes from prevoiced, to simultaneously voiced, to aspirated.


----------



## berndf

Exactly. The context we are discussing here is an initial plosive followed by a vowel. At some point voicing has to start. There are three cases to be distinguished: voicing starts before the release, at the release or after the release. This is measured by what is called VOT (Voice Onset Time). It is measured in ms. Positive values mean that the voice onset point is after the release and negative values mean that the voice onset is before the release. In Germanic languages d has a VOT <= 0 and in Romance languages d has a VOT < 0. Consequently, for the plosive to be identified as a t, VOT has to be > 0 in Germanic languages and >= 0 in Romance languages. I.e. the languages separate d and t at different VOT values. By convention, when we have a negative VOT we speak of a voiced plosive, when we have a positive VOT of an aspirated unvoiced plosive and when VOT is close to zero we speak of an unaspirated unvoiced plosive.
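The three-way convention can be written down as a small function. This is only a sketch: the name `classify_vot` and the 5 ms tolerance used for "close to zero" are assumptions made for the example, not standard values.

```python
def classify_vot(vot_ms, tolerance_ms=5.0):
    """Label an initial plosive by its voice onset time in milliseconds.

    Negative VOT: voicing starts before the release.
    Positive VOT: voicing starts after the release.
    `tolerance_ms` is an assumed width for "close to zero".
    """
    if vot_ms < -tolerance_ms:
        return "voiced"
    if vot_ms > tolerance_ms:
        return "aspirated unvoiced"
    return "unaspirated unvoiced"


# Illustrative values only, not measurements.
for vot in (-100, 0, 60):
    print(vot, classify_vot(vot))
```

A language then picks one boundary on this scale to separate its d from its t, which is why the same token can fall on different sides in different languages.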

The cut-off point between d and t is somewhere in the negative range in Italian and in the positive range in English and German. A German t generally has a longer aspiration period than an English t. As a consequence, the cut-off point is higher in German than in English. This explains why Nino perceives that Germans pronounce a d like a t and English speakers don't. The cut-off points differ only a little between Italian and English and between English and German, so in both of these pairs the likelihood of a d/t confusion is low. But between Italian and German the distance between the cut-off points is the sum of the distances between Italian and English and between English and German, and therefore the likelihood of a confusion (Italians perceiving a German d as a t and Germans perceiving an Italian t as a d) is higher than in the other two pairs.
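A toy model of the cut-off idea, assuming each listener applies a single fixed boundary. The three boundary values and the three stimulus VOTs below are invented purely for illustration (they only respect the ordering Italian < 0 < English < German) and should not be read as measurements.

```python
# Hypothetical d/t cut-off points in ms, chosen only to respect the ordering
# described above; these are NOT measured values.
CUTOFF_MS = {"Italian": -15.0, "English": 15.0, "German": 30.0}


def perceived(vot_ms, listener):
    """Return "t" if the VOT lies above the listener's cut-off, else "d"."""
    return "t" if vot_ms > CUTOFF_MS[listener] else "d"


# Hypothetical stimuli: (description, VOT in ms).
stimuli = [
    ("Italian /d/", -80.0),
    ("German /d/", 0.0),
    ("Italian /t/", 20.0),
]

for description, vot in stimuli:
    heard = {listener: perceived(vot, listener) for listener in CUTOFF_MS}
    print(description, vot, heard)
```

With these made-up numbers the prevoiced Italian /d/ is a d for everyone, the German /d/ is heard as t only by the Italian listener, and the Italian /t/ is heard as d only by the German listener, which is the asymmetry between the Italian and German pair described above.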

Having said all that, there is also the phenomenon of what is described as "voiced aspirated" plosives in Indic languages. But in European languages, this doesn't occur and voiced and aspirated can indeed be seen as the end points of a single one-dimensional scale.


----------



## Forero

Well if you define aspiration as a function of VOT, what do you call audible turbulence?





berndf said:


> Having said all that, there is also the phenomenon of what is described as "voiced aspirated" plosives in Indic languages. But in European languages, this doesn't occur and voiced and aspirated can indeed be seen as the end points of a single one-dimensional scale.


Doesn't occur, or isn't phonemic within a given European language?


----------



## sumelic

I found some discussion of the motivation of using voice onset timing as the criterion for aspirated stops early in this paper from 1964: http://www.haskins.yale.edu/Reprints/HL0053.pdf. The main justifications they list are: it can explain in a more phonetically motivated way what was previously called "fortis-lenis" contrast in languages such as English and German, and it is easy to measure and synthesize for use in experiments. It also seems that no language has been found that has a contrast between delayed VOT stops with turbulence and without turbulence; I assume that the turbulence might be considered a side-effect of the delayed onset time, the way that features like lower tone are a side-effect following voiced stops.


----------



## berndf

sumelic said:


> I assume that the turbulence might be considered a side-effect of the delayed onset time, the way that features like lower tone are a side-effect following voiced stops.


So do I. I would be curious to know if it is possible at all to produce audibly positive VOTs without turbulence and without creating other artefacts like, e.g., ejectiveness.


----------



## gburtonio

sumelic said:


> it can explain in a more phonetically motivated way what was *previously called *"fortis-lenis" contrast in languages such as English and German



The fortis/lenis contrast is a distinction still very much in current use. 

As for the delayed VOT stops with and without turbulence, it's been shown that NSs of English still perceive a voiceless stop sound when played recordings where the aspiration has actually been removed (the VOT is unchanged but the turbulence removed from the recording, replaced with silence). So certainly we're not using the aspiration in order to decode the sound – the VOT is everything.
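The manipulation can be sketched on a token represented as labelled segments. The representation, the function names and the durations are all hypothetical, chosen only to show that swapping the aspiration noise for silence of the same length leaves the VOT untouched; the helper handles only tokens where voicing follows the burst.

```python
def vot_ms(segments):
    """Time from the start of the release burst to the start of voicing.

    `segments` is a list of (label, duration_ms) pairs in temporal order.
    Only tokens where voicing follows the burst are supported.
    """
    elapsed = 0
    release_at = None
    for label, duration in segments:
        if label == "burst" and release_at is None:
            release_at = elapsed
        elif label == "voicing":
            if release_at is None:
                raise ValueError("voicing starts before any burst")
            return elapsed - release_at
        elapsed += duration
    raise ValueError("no voicing segment found")


def silence_aspiration(segments):
    """Replace every aspiration segment with silence of equal duration."""
    return [("silence", duration) if label == "aspiration" else (label, duration)
            for label, duration in segments]


# Hypothetical aspirated token: 5 ms burst, 55 ms turbulence, then the vowel.
original = [("burst", 5), ("aspiration", 55), ("voicing", 200)]
edited = silence_aspiration(original)

print(vot_ms(original), vot_ms(edited))
```

Both calls give the same VOT, so any change in what listeners report would have to come from the turbulence itself rather than from the timing.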


----------



## berndf

gburtonio said:


> So certainly we're not using the aspiration in order to decode the sound – the VOT is everything.


As I understand it, most authors use the term "aspiration" today as a mere label for "positive VOT".


----------



## Forero

gburtonio said:


> The fortis/lenis contrast is a distinction still very much in current use.
> 
> As for the delayed VOT stops with and without turbulence, it's been shown that NSs of English still perceive a voiceless stop sound when played recordings where the aspiration has actually been removed (the VOT is unchanged but the turbulence removed from the recording, replaced with silence). So certainly we're not using the aspiration in order to decode the sound – the VOT is everything.


This is why I have kept saying that voiced vs. unvoiced is more important in English than aspirated vs. unaspirated.

I for one use the turbulence (what I have been calling aspiration) in English to signal stress and to indicate morpheme boundaries, and of course it is a big help in detecting what language is being spoken. (This may be why I tend to have trouble understanding BrE and Indian English.)

So apparently I was right that our disagreements about AmE phonology are due to differences in basic definitions. Voicing and aspiration are now both defined as functions of VOT, and the fortis-lenis dimension is now reserved for what I have always called amount of aspiration. What is the tense-lax dimension for consonants, or is tense not the opposite of lax?


----------



## sumelic

Fortis-lenis and tense-lax are fairly vague terms that aren't very well defined phonetically, but when they are used for a phonetic characteristic it is usually called "force of articulation". In practice, they seem to mostly be used to refer in an impressionistic way to the difference between a pair of phonemes, one of which seems "stronger" and the other "weaker". So "fortis" consonants might be associated with greater consonant length, or glottalization. It doesn't refer to aspiration specifically, since I believe there are some cases where authors describe a set of fortis and lenis stops but don't describe either as aspirated.
 I'm also not sure if we really have been talking about different things with regard to English: it's true that turbulence and delayed voice onset time are theoretically distinct, but it appears that English speakers don't generally hear a difference between these features, and they usually go together. This would suggest that you wouldn't be able to hear a difference specifically between aspirated and delayed-voice onset. 
I guess one thing that sounds like "aspiration" that isn't based on voice-onset time is the release of word-final stops. This can occur with voiceless stops at the ends of words in English in careful speaking, but I don't think it is quite the same phonetically as an aspirated stop before a vowel.


----------



## berndf

Forero said:


> This is why I have kept saying that voiced vs. unvoiced is more important in English than aspirated vs. unaspirated.


As I said:





berndf said:


> As I understand it, most authors use the term "aspiration" today as a mere label for "positive VOT".



In this context,
1. _voiced vs. unvoiced_ *is defined as* _VOT < 0 vs. VOT >= 0_.
2. _aspirated vs. unaspirated_ *is defined as* _VOT > 0 vs. VOT <= 0_.

2. is more important in English and German. 1. is more important in Italian.


Forero said:


> I for one use the turbulence (what I have been calling aspiration) in English to signal stress and to indicate morpheme boundaries, and of course it is a big help in detecting what language is being spoken. (This may be why I tend to have trouble understanding BrE and Indian English.)


This is an issue mainly with final stops. Obviously, VOT is not applicable there. I would propose the delay between release and subsidence of air pressure. Negative values would mean "unreleased" and positive values "aspirated".


----------



## gburtonio

I would sum up the situation as follows:

– In plosives in English, the voiced and voiceless distinction isn't considered (by some, at least) to be useful as what would be classified as voiced plosives aren't actually voiced. That's why many phonologists simply don't use this terminology when talking about English plosives.

– In terms of perception, VOT is what allows us to distinguish between /t/ and /d/, /p/ and /b/ and /k/ and /g/.

– The physical technique we use to achieve the difference in VOT in these pairs of plosives with same place of articulation is aspiration. But apparently the presence or otherwise of 'aspiration' does not actually play a role in perception.


----------



## Forero

berndf said:


> As I said:
> 
> In this context,
> 1. _voiced vs. unvoiced_ *is defined as* _VOT < 0 vs. VOT >= 0_.
> 2. _aspirated vs. unaspirated_ *is defined as* _VOT > 0 vs. VOT <= 0_.
> 
> 2. is more important in English and German. 1. is more important in Italian.
> 
> This is an issue mainly with final stops. Obviously, VOT is not applicable there. I would propose the delay between release and subsidence of air pressure. Negative values would mean "unreleased" and positive values "aspirated".


Then what do you do with implosive consonants, those for which the air pressure is negative?

Anyway, the little puff of air at release in English is significant to me. It is involved in identifying the language being spoken, and it is important in perceiving stressed syllables and in finding word and morpheme boundaries (e.g. "Pike speaks" vs. "Pike's peaks"). Just because a /b/ does not occur just after an initial /s/ does not mean the /p/ in "speak" is a /b/. It is just a different type of /p/ than in "peak". In fact, redundancy is useful even when not theoretically necessary. A mispronounced word or phrase may not take on another meaning, but it may lose intelligibility. Even if it just sounds odd though "understandable", that itself interferes with smooth communication.

And the little puff of air is essentially the same after an initial stop or after a released final stop. Its exact frequency distribution depends on the configuration of the vocal apparatus (lips, tongue, etc.) at the time, but then so does that of, for example, an /l/ or a /z/.


----------



## sumelic

Actually, I find it quite difficult to hear a difference between the unaspirated [p] in "speak" and a word-initial English b sound. I remember when I first learned about the difference between unaspirated and aspirated consonants, I kept trying to pronounce "speak without the s" and thought I must be doing it wrong, because it sounded just like "beak" to me! As another example, I don't hear or pronounce any consistent difference between the words "discussed" and "disgust". In my opinion, the idea that this sound must be an allophone of /p/ is due more to historical and orthographical influences than analysis of the synchronic phonological system of English. But in any case, this serves as an example of how disparate sounds that are considered to be "the same" phonologically can be in their phonetic characteristics.
That's part of why I'm not sure that word-final released unvoiced stops really share an independent phonetic characteristic of "aspiration" or "a puff of air" with the word-initial delayed-voice-onset stops of English, or at least, I'm not sure that that phonetic feature is actually important for perception of the sounds. This quote from Gburtonio suggests that it is not:


> As for the delayed VOT stops with and without turbulence, it's been shown that NSs of English still perceive a voiceless stop sound when played recordings where the aspiration has actually been removed (the VOT is unchanged but the turbulence removed from the recording, replaced with silence). So certainly we're not using the aspiration in order to decode the sound – the VOT is everything.


Forero, do you disagree with this? It sounds like you're still saying that you hear aspiration as something phonetically distinct from delayed voice onset. I'm curious why you seem so certain that what you're hearing is the puff of air in particular, and not just the difference in voice onset (for word-initial stops) or release (for word-final stops). These two types of stops are grouped together in English, but I don't know if this is a universal association.


----------



## berndf

sumelic said:


> Actually, I find it quite difficult to hear a difference between the unaspirated [p] in "speak" and a word-initial English b sound.


Germanic languages don't distinguish between fortis and lenis plosives in syllables starting with a cluster /s/+plosive, i.e. there are only three combinations (sp, st, sk (the latter is merged to [ʃ] in English and German)) and not six (sp, *sb, st, *sd, sk, *sg). The plosives are unaspirated, yet phonemically perceived as fortis. That is a well-known idiosyncrasy and is valid in all Germanic languages, even in German, which otherwise aspirates fortis plosives very strongly.


----------



## sumelic

Yes, that's the historical explanation. But I doubt anybody argues that [ʃ] in modern English is in any way a realization of the cluster /sk/; there have been changes in the phonology over time. Is there evidence that current speakers perceive the second consonant in these clusters as fortis? I feel like an equally good explanation synchronically would be that any consonants immediately following voiceless consonants in the same syllable are lenis but devoiced; this would give the same phonetic results [sp] [st] [sk], but they would be analyzed as /sb/ /sd/ /sg/. This even takes advantage of the pre-existing phonetic rule in English that consonants that would normally be voiced, like [l], become (at least partially) devoiced after voiceless consonants (we could also write [sb̥] [sd̥] [sg̊] on the phonetic level), and it allows a more parsimonious explanation of the dental verb suffixes that also show progressive voicing assimilation. I doubt I'm the first person to propose this analysis, but I don't think I've actually encountered any discussion of it and why it might or might not be plausible; the consensus in phonemic transcriptions of these clusters, for whatever reason, seems to be /sp/ /st/ /sk/.
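
To make the comparison concrete, here is a minimal sketch in Python of the two competing analyses. It is a toy model built on my own assumptions: the `Stop` class, the three features, the underlying feature values and both function names are hypothetical illustrations, not an established formalism. It only shows that "fortis stop that loses its aspiration after /s/" and "lenis stop that is devoiced after /s/" end up as the same surface segment.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Stop:
    """Toy feature bundle for a plosive (hypothetical, for illustration only)."""
    place: str       # "labial", "alveolar" or "velar"
    voiced: bool     # vocal-fold vibration during the closure
    aspirated: bool  # long positive VOT after the release


PLACES = ("labial", "alveolar", "velar")

# Assumed underlying forms: fortis = voiceless aspirated, lenis = voiced unaspirated.
FORTIS = {p: Stop(p, voiced=False, aspirated=True) for p in PLACES}
LENIS = {p: Stop(p, voiced=True, aspirated=False) for p in PLACES}


def surface_conventional(place: str) -> Stop:
    """Conventional analysis /sp st sk/: a fortis stop loses aspiration after /s/."""
    return replace(FORTIS[place], aspirated=False)


def surface_alternative(place: str) -> Stop:
    """Alternative analysis /sb sd sg/: a lenis stop is devoiced after voiceless /s/."""
    return replace(LENIS[place], voiced=False)


for place in PLACES:
    # Both analyses predict the same voiceless unaspirated stop.
    assert surface_conventional(place) == surface_alternative(place)
    print(place, surface_conventional(place))
```

Since both rules yield an identical voiceless unaspirated stop, the surface forms alone cannot decide between the two transcriptions in this model; any argument has to come from elsewhere in the system.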


----------



## merquiades

Forero said:


> This is why I have kept saying that voiced vs. unvoiced is more important in English than aspirated vs. unaspirated.


  That is the impression I have too, and I also follow pretty much everything you have said.   
After being in contact with German speakers, I have learned very well that the aspiration is paramount.  I have anecdotes but I won't derail the discussion.


----------



## gburtonio

sumelic said:


> I doubt I'm the first person to propose this analysis, but I don't think I've actually encountered any discussion on it and why it might or might not be plausible; the consensus in phonemic transcriptions of these clusters, for whatever reason, seems to be /sp/ /st/ /sk/.



I think 'convention' might be a better word than 'consensus' here. It has certainly already been suggested that /sb/, /sd/ and /sg/ would be equally valid transcriptions, and I don't think there are any conclusive arguments on either side. The main reason these alternative transcriptions are not used is most likely purely down to spelling.


----------



## berndf

sumelic said:


> Yes, that's the historical explanation. But I doubt anybody argues that [ʃ] in modern English is in any way a realization of the cluster /sk/; there have been changes in the phonology over time. Is there evidence that current speakers perceive the second consonant in these clusters as fortis? I feel like an equally good explanation synchronically would be that any consonants immediately following voiceless consonants in the same syllable are lenis but devoiced; this would give the same phonetic results [sp] [st] [sk], but they would be analyzed as /sb/ /sd/ /sg/. This even takes advantage of the pre-existing phonetic rule in English that consonants that would normally be voiced, like [l], become (at least partially) devoiced after voiceless consonants (we could also write [sb̥] [sd̥] [sg̊] on the phonetic level), and it allows a more parsimonious explanation of the dental verb suffixes that also show progressive voicing assimilation. I doubt I'm the first person to propose this analysis, but I don't think I've actually encountered any discussion of it and why it might or might not be plausible; the consensus in phonemic transcriptions of these clusters, for whatever reason, seems to be /sp/ /st/ /sk/.



Of course, many phonological features have changed. But this feature, viz. that there is no fortis-lenis contrast in /s/+plosive clusters, has been completely stable.


----------



## berndf

merquiades said:


> That is the impression I have too, and I also follow pretty much everything you have said.
> After being in contact with German speakers, I have learned very well that the aspiration is paramount.  I have anecdotes but I won't derail the discussion.



Also in German, it is positive VOT that matters and not "puff of air".

In syllable-initial /t/, positive VOT is optional in *unstressed* syllables in AmE but neither in BrE nor in German.

Syllable-final plosives are a completely different story. German has no fortis-lenis distinction there (_bunt_ and _Bund_ are completely homophonous). Presence or absence of aspiration is therefore erratic, though it is usually present.


----------



## merquiades

berndf said:


> Also in German, it is positive VOT that matters and not "puff of air".


I'm not doubting that positive VOT matters (voice onset timed after the release), but at least in Saarland, if you don't pronounce stops with a strong puff of air you are not understood. That can happen even with obvious words like TUTE or KAFFEE.
Incidentally, older regional speakers of French near the border do this in French: "ton café" sounds like "thon khafé".



> In syllable-initial /t/, positive VOT is optional in *unstressed* syllables in AmE but neither in BrE nor in German.


 Do you have an example of how this is optional?



> Syllable-final plosives are a completely different story. German has no fortis-lenis distinction there (_bunt_ and _Bund_ are completely homophonous). Presence or absence of aspiration is therefore erratic, though it is usually present.


That apparently has become official. German language courses specifically instruct learners to pronounce t, k, p for d, g, b in final position.


----------



## berndf

No, it is not "strong puff of air". It is longer VOT. Typically, German VOTs of fortis plosives are about 1/2 longer than in English (something like 45ms in English vs. 65ms in German).
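
As a quick check of the "about 1/2 longer" figure, here is a small sketch in Python. It uses only the two approximate values quoted above, which are rough illustrative numbers rather than measurements; the variable and function names are my own.

```python
# Approximate fortis-plosive VOTs from the post above (rough figures, not measurements).
english_vot_ms = 45.0
german_vot_ms = 65.0


def relative_increase(base: float, other: float) -> float:
    """Return how much longer `other` is than `base`, as a fraction of `base`."""
    return (other - base) / base


increase = relative_increase(english_vot_ms, german_vot_ms)
print(f"German VOT is about {increase:.0%} longer than English VOT")
```

With these inputs the increase is 20/45, i.e. somewhat under one half, so "about 1/2 longer" is a fair round figure.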

In AmE the p in "operation" usually remains unaspirated, i.e. its VOT is below the threshold of audibility.


----------



## merquiades

berndf said:


> No, it is not "strong puff of air". It is longer VOT. Typically, German VOTs of fortis plosives are about 1/2 longer than in English (something like 45ms in English vs. 65ms in German).


 Longer not stronger...  I suppose that explains why there are so many /pf/ words in initial position.



> In AmE the p in "operation" usually remains unaspirated, i.e. its VOT is below the threshold of audibility.


 So does that mean you hear "oberation"?


----------



## berndf

It actually happened to me once that I had to get subtitles from the Internet because I couldn't understand why Captain Sisko of Star Trek DS9 said "aberration" in a context where it didn't make sense. In the end I found out that he had said "operation". Two peculiarities of AmE were involved, both of which I know in principle: unrounding of the short o and lack of aspiration of p. They still made me misunderstand what he said.


----------

