Analysis
Prominence & intonation (3 units)
Language pedagogical analysis
These three units cover the broader topic of pitch movement, with special emphasis on the sub-topics of nuclear stress, thought grouping, and levels of stress & pitch (termed “prominence”) because of their interrelationship with connected speech, while minor emphasis is put on tones, i.e. utterance-final pitch movement (which can be defined as “intonation” proper).
The whole topic of prominence/intonation/pitch is introduced with resources 1-3 of the first unit. Resource 1 (the text is adapted from Baker/Goldstein 2008: 22) combines the analysis with the listening discrimination phase organically, which is a very useful procedure since speech perception is a prerequisite to speech production and since pitch cannot really be analyzed in the classroom in absolute terms. While task 1 serves as a lead-in technique to sensitize students to listen for stress, task 2.1 starts the analysis stage proper in that students are now required to listen for nuclear stress (pedagogically referred to as “primary sentence stress” in Avery/Ehrlich (1992) or the “focus word” in Gilbert (2009)), which, in English, depends significantly on pitch (see phonological analysis). This activity is quite difficult for students since they have probably never consciously had to listen for pitch before, but doing this activity several times clearly helps. Task 2.2 then specifically addresses the issue of stress/pitch movement to signal old vs. new information, which is practiced in a controlled and guided manner in the following tasks. Resource 2, in addition to the topic at hand, also makes mention of unreleased terminal obstruents, and the whole dialogue and this task make extensive use of the /æ/ vowel. These issues are not systematically explained, but the articulation is modeled, and especially for the /æ/ vowel students receive some formative, articulatory feedback that can help prepare for when the topic is introduced properly. By that time a mental category may already have been established subconsciously (through tasks such as the present one) and students would be able to draw on these initial impressions when trying to relate to the topic when it comes (Roth 2009: 66). This might also trigger some monitoring (Krashen 2003: 2f).
This topic is closely related to the concept of tone units/intonation units/intonation groups/intonation phrases/sense groups/breath groups (pedagogically referred to here as thought groups), since each tone unit normally has only one nucleus in unmarked utterances. In order to introduce this, students are given the following two sentences:
Do you remember (/) when we used to stay up all night (/) studying for exams?
Do I ever! Finals week was so bad / that we’d drink coffee (/) by the gallon.
Using a deductive technique, these sentences are provided without slashes (marking (possible) thought groups) together with the following 4 rules and the instruction to identify possible thought groups (the rules and sentences are taken from Celce-Murcia et al. 2011: 223).
A thought group
1) is set off by pauses before and after;
2) contains one prominent element;
3) has an intonation contour of its own;
4) usually has a grammatically coherent internal structure.
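The slash notation used in the two example sentences can be made explicit with a minimal Python sketch. It assumes that "/" marks a thought group boundary and "(/)" an optional one, as in the examples above; the function name `split_thought_groups` and the `include_optional` parameter are hypothetical and serve only as illustration.

```python
import re


def split_thought_groups(marked: str, include_optional: bool = True) -> list[str]:
    """Split a slash-marked sentence into thought groups.

    Assumption: "/" marks a boundary and "(/)" marks an optional boundary.
    """
    if include_optional:
        # Treat optional boundaries like ordinary ones.
        text = marked.replace("(/)", "/")
    else:
        # Ignore optional boundaries and keep only the plain slashes.
        text = marked.replace("(/)", " ")
    # Normalize whitespace inside each group and drop empty groups.
    groups = [re.sub(r"\s+", " ", part).strip() for part in text.split("/")]
    return [group for group in groups if group]


sentence = "Finals week was so bad / that we'd drink coffee (/) by the gallon."
for group in split_thought_groups(sentence):
    print(group)
```

With the optional boundary included, the sentence yields three thought groups; with `include_optional=False` it yields two, which may correspond to a faster delivery.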
Resources 4 and 5, then, combine the topics of thought groups, stress and prominence in listening discrimination and some inherent controlled practice. Resource 5 adds a significant bit of authenticity (ibid: 374) in that real songs and film excerpts are used, as opposed to texts designed to make a point. This is significant since, as Penny Ur (1987: 10) cautions, “[s]tudents who do not receive instruction or exposure to authentic discourse are going to have a rude awakening when [trying] to understand native speaker speech in natural communicative situations” (see also e.g. Ricard 1986: 244, Couper 2003: 56). Ur argues this is particularly so because of students’ difficulty in deciphering word boundaries and coping with fast (i.e. connected) speech. Therefore the use of such materials is a strong component of any approach that strives for authenticity.
One more element is added here before going into free practice, namely what happens to the pitch level at the end of a tone unit (in neutral sentences). To introduce this topic in a simple way, the following sentences are provided: “I visited the museum, the library and the park” and “We ordered soup, salad and coffee”. Students are asked what they notice when the teacher models the sentences, perhaps accompanied by corresponding hand movements. Once students have realized that pitch slightly rises or stays the same after each item before it drops at the very end, they are asked to complete the following sentences with a list of items as prompted: “They saw a... [cars]”, “The shirt is... [colors]”, “The zoo has... [animals]”, “The suitcase contains... [items]”. Resource 6, then, puts everything together, while resource 7 necessitates the “listing” of items in extended discourse and therefore requires the use of all the topics covered. This particular didactic sequence is meant to put several individual prosodic aspects together in an organic and pedagogically conducive manner. The whole topic of pitch was started with a rather tangible aspect, old and new information (linguistically referred to as information structure), and introduced the notion of primary sentence stress at the same time. From here the tone unit was added because one now wonders where and at which intervals these primary stresses occur. Once these concepts are established, it makes sense to enquire what happens to pitch at thought group boundaries, which is again introduced in a more tangible way (listing items), before going into more complex discourse. All these aspects together establish a solid foundation of the prosodic structure of utterances, added to the somewhat more intangible topic of rhythm before. In the top-down process (from large to small), this is stage two, while the following unit takes one more step down and looks at pitch and stress per se.
While students at this stage have a pretty clear, if phonologically very basic, idea of the function of pitch in sentences and discourse, one aspect that still causes some confusion but also builds expectation (which can be channeled and used for motivation) is how much or how little pitch is used where. This is addressed in the first part of the following unit, in which vocal coaching exercises proposed in Archibald (1992) are employed (each number is said on a higher or lower pitch level). After some more analysis to gain complete clarity of how the aspects introduced thus far operate in the language system (2.1), task 2 of resource 2 plays with pitch to express meaning (what are you listening to? who is listening? why aren’t you listening?).
Then, in the context of further pitch practice, rising and falling tones are brought in, for which tag questions provide an easy-to-use, illustrative and at the same time meaningful context. Resource 4, finally, brings this together with prominence and highlights the complex meaning-distinguishing function of stress and pitch (the 2.2 sentences are adapted from Celce-Murcia et al. 2011). In this, task 1 also specifically addresses the intonation difference in open and closed questions (or, pedagogically, wh- questions and yes/no questions). This particular aspect can easily be elicited by the teacher during the previous tasks and practiced in a short supplementary information exchange activity. This stands in direct relation to the previous tasks in that both question types follow one of the two tones just introduced. Task 1 of resource 4 is very difficult because pitch has to be changed on one syllable, i.e. within one vowel. In order to illustrate this beyond modeling it, Ann Cook’s (2000) model of “staircase intonation” can be employed (see picture on the bottom of this page, adapted from Cook 2000: 97, illustrating were), in which the second “stress step” on the intonation staircase is filled with a small schwa and the following consonant, thus splitting the vowel into two parts; in the task this leads to “/brɪ-əŋ/”, with higher and lower pitch on the respective elements. Saying this artificially slowly and then increasing the speed seems, indeed, very helpful with this simple visualization.
The Relationship unit can be used either in a focus on meaning manner (task-based teaching, covering form at the very end) or in a focus on form manner (doing form practice intermittently) and is designed to practice everything further with communicative activities while students’ attention is drawn to prosodic features in further authentic listening texts, in this case longer video scenes from a TV show. This unit also further introduces processes in connected speech in an impressionistic manner and thereby builds some perception skills and concepts that can be used as a cognitive reference point in the following lessons (Roth 2009: 66). Task 3, after analysis has been completed, can very nicely be used as a shadow reading technique, which means that students read the text along with the speaker in the video, which, in a way, forces them to employ stress and connected speech in as authentic a way as possible. Task 4, finally, serves as a specific revision of rhythm, and some focused reading-out is also employed here.
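The listing pattern described above (pitch slightly rising or staying level after each non-final item, then dropping on the last item) can be written out as a small Python sketch. The function name `list_contour` and the labels "rise-or-level" and "fall" are illustrative assumptions, not notation taken from the teaching materials.

```python
def list_contour(items: list[str]) -> list[tuple[str, str]]:
    """Pair each list item with a schematic tone label.

    Assumption: non-final items carry a slight rise or level pitch,
    and the final item carries a fall (neutral, unmarked delivery).
    """
    if not items:
        return []
    non_final = [(item, "rise-or-level") for item in items[:-1]]
    final = [(items[-1], "fall")]
    return non_final + final


for item, tone in list_contour(["soup", "salad", "coffee"]):
    print(f"{item}: {tone}")
```

Such a schematic labelling captures only the direction of pitch movement at each boundary; it says nothing about how large the movement is, which is the subject of the following unit.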
Phonological analysis:
Most pedagogical concepts have already been related to their respective linguistic terms and their functioning in the language system, so that this section will deal with the development of intonation theories and their relation to pedagogical models.
Contemporary theories of intonation were developed chiefly between the 1940s and 1990s, with significant differences between the first approaches developed in Britain and North America, respectively. In contemporary pedagogical approaches, many bits and pieces of these theories are put together in an attempt to create a processable and communicatively functional system for the teaching of English prosody, with the overall goal of helping students to achieve communicative competence (Celce-Murcia et al. 2011: 269). However, such a language-pedagogical treatment of prosody blurs a number of distinctions that have been made in general linguistics. 't Hart et al. (1990: 2) set an agenda for what aspects a theory of intonation should address: (my emphasis)
[I]ntonation can be approached from a variety of angles, all of which are equally indispensable if one wishes ultimately to understand how speech melody functions in human communication. Ideally, a theory of intonation should comprise a phonetic and a linguistic component. The phonetic part of the theory should account for the physiological, acoustic, and perceptual aspects of intonation, and elucidate the relation between them. The linguistic component of the theory should aim at a phonological interpretation of the phonetic facts and at a pragmatic explanation of how intonation functions in the communicative interaction between speaker and listener. Finally, the theory should comprise a natural link between the linguistic and phonetic components: it should clarify how the melodic performance of the language user results from the interaction between his communicative intent and the peripheral means of his vocal and perceptual apparatus.
Pedagogic accounts of intonation usually strongly emphasize the functional aspect of intonation because of its clear communicative value. This was clearly the purpose of the unit on sentence stress & thought groups and the second half of the following unit ("basics of intonation"). However, the first part of that unit ("pitch range") and, by implication, the second half of it specifically addressed the aspect of phonetic form, for which an analogy to music was drawn to make this more feasible.
In the phonological classification of intonation phenomena British scholars have traditionally preferred a tune or contour approach, in which pitch contours, sentence types and emotional states were used to build a coherent model. In this, Sweet’s (1890: 3) classical distinction between 5 tones (level, rising, falling, falling-rising and rising-falling) was dominant. Developments of such a five or six tone system were O’Connor & Arnold’s (1961) tonetic approach (which put some special emphasis on attitudinal functions like doubt or certainty), and Halliday’s (1963, 1967) categories approach, which stressed the grammatical function of intonation (open and closed questions, incompleteness). In North America, theories of pitch levels were more prominent. In the American scholarly tradition contours were typically analyzed into sequences of four pitch levels (treated as prosodic ‘phonemes’) and three terminal junctures, falling, rising and level (together with preceding levels treated as prosodic ‘morphemes’) (Cruttenden 1997: 28). In this, Pike’s (1945) “pitch levels” approach was extremely influential, resulting in a notational system as depicted (Chun 2002: 26).
He wanted to do it (but couldn’t)
3- °2 -3 / 4- °2- -4 //
Generative phonologists like Pierrehumbert (1980/1987) developed this model into what came to be known as the autosegmental-metrical model of intonation, using a two-level approach (high and low) (see Gussenhoven 2004 and Ladd 2008 for contemporary accounts). It was also this model that led to the ToBI transcription system (see Beckman/Hirschberg/Shattuck-Hufnagel 2005). While this model avoided some previously criticized shortcomings (e.g. the point that the 4 pitch levels were seen as random and that transitions were not accounted for), it is still not unchallenged (Cruttenden 1997: 64ff). Also, it is exceedingly complicated in its notational system and therefore does not seem to have explicitly influenced pronunciation pedagogy much. This does not mean, however, that a technique like Archibald’s vocal coaching exercise (unit on pitch range), with which human speech pitch can be classified as being between level 4 and -2, is not useful. Such pedagogical exercises, while theoretically inadequate (because stress also involves loudness and duration in various ways and because they do not account for transition from one level to the other, not to speak of the relative character of such levels), establish very real perceptual categories (emphatic stress (4), primary stress (3), sentence stress (2), unstress (+/-1), drop to creaky voice (-2)). Therefore, one has to distinguish between what is expressible in phonological theory and what is perceptually observable (and teachable).
In Britain David Brazil developed a system of discourse intonation in the 1970s and 80s in which he kept the classical distinction of 5 tones, combined with three keys (the pitch level of an entire tone group). Brazil’s model seems to have been very influential, but is rather impressionistic and complicated, so that it has been further developed by scholars like Elizabeth Couper-Kuhlen (e.g. 1986), who extends and clarifies the discourse intonation model and develops a number of different functions of intonation (informational, grammatical, illocutionary, attitudinal, textual/discourse, indexical) (see Wells 2006 for a contemporary account of intonational functions). Couper-Kuhlen herself later played a great role in the development of a new tradition in intonation studies: prosody in interactional linguistics (building on the “interactive function” in her discourse intonation model) (e.g. Couper-Kuhlen 1993, Couper-Kuhlen & Selting 1996, Couper-Kuhlen 2007, see also e.g. Szczepek-Reed 2011).
As has been said before, applied linguistic accounts took bits and pieces from the systems that were proposed over the years and that seemed particularly conducive from a pedagogical standpoint. This led Levis (1999: 37), in comparing intonation in theory and practice, to the observation that “intonation as currently presented in North American textbooks bears a strong resemblance to textbook treatments from 30-50 years ago”. Indeed Avery & Ehrlich (1992), a book still widely used today, only cover aspects like word tones, much like the system proposed by Halliday in the 1960s, while Celce-Murcia et al. (1st ed. 1996, 2nd ed. 2011) still use, as one part of their discussion of intonation, a system much like Pike’s pitch levels approach from the 1940s. However, while the discourse function of intonation certainly deserves to be emphasized (Clennell 1997), as e.g. Gilbert (2009) or Celce-Murcia et al. (2011) certainly do, and as can be seen as necessary in serious CLT, the explicit treatment of pitch levels (as dealt with in the above unit) cannot be neglected, and student feedback reveals that specific pitch practice is perceived as extremely valuable. In language teaching, the fact that such levels are quite subjective is rather irrelevant because the teacher can and has to assess whether a necessary contrast is achieved. Rather than ignoring the teaching of form altogether in favor of certain theoretical paradigms (interaction, discourse, communicative competence), as is often done today, a sensible, needs-oriented approach should be taken. The lesson sequence proposed here, with its sequential structure of individual topics, is meant to address communicative needs, but also to supplement some theory in order to achieve not only functional, but also more authentic production in the long run.
A final aspect bringing together theoretical and L2 phonological concerns has to be mentioned: the role of pitch in marking stress. As, for example, Cruttenden (1997: 13) notes, there are three aspects that contribute to stress: pitch, duration and loudness, which operate on a scale of importance with pitch clearly being most significant, duration being of intermediate importance and loudness playing only a minor role, especially in nuclear stress (see also e.g. Giegerich 1992: 179 or Chun 2002, who shows this in various contexts). It is known that the fundamental frequency (F0) of male speakers is between 60 and 200 Hz on average, and that of female speakers between 180 and 400 Hz (Cruttenden 1997: 3), and that many languages use pitch in quite a similar manner (Ohala 1983). Still, the exact blend of pitch, length and loudness in English and the correlation between a certain pitch level and this level’s role in the language system is difficult to acquire. Just as, for example, speakers from different L1s substitute the /æ/ vowel in their own ways[1], speakers of different L1 backgrounds seem to overcompensate with either pitch, duration or loudness while somewhat neglecting the others. Bertha Chela-Flores (e.g. 1997) stresses training students in duration because this is the primary problem of Spanish L1 speakers. Colleagues who have worked extensively with Japanese students inform me that in their context pitch is used to overcompensate for the others.
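Since the pitch range unit draws an analogy to music, the F0 ranges quoted above can be re-expressed as musical intervals. The following Python sketch is an illustration only: the conversion to semitones (12 times the base-2 logarithm of the frequency ratio) is the standard equal-temperament formula and is an addition here, not something taken from Cruttenden; the function name `span_in_semitones` is hypothetical.

```python
import math


def span_in_semitones(low_hz: float, high_hz: float) -> float:
    """Return the interval between two frequencies in semitones.

    Assumption: equal temperament, where one octave (a doubling of
    frequency) equals 12 semitones.
    """
    return 12 * math.log2(high_hz / low_hz)


# F0 ranges as quoted in the text (Cruttenden 1997: 3).
male_span = span_in_semitones(60, 200)
female_span = span_in_semitones(180, 400)

print(f"male range:   {male_span:.1f} semitones")    # about 20.8
print(f"female range: {female_span:.1f} semitones")  # about 13.8
```

Under these assumptions both quoted ranges span more than an octave, which gives a rough idea of the space within which the relative pitch levels practiced in class have to be realized.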
With German and (if generalizable) Slavic L1 speakers I have found that loudness, which is the least significant factor in English, is heavily used to compensate for pitch. This once showed very nicely in the controlled practice task on page 2 of the sentence stress unit. When asked to read out the dialogues after practice, one student, concentrating on getting it right, overused loudness immensely, which caused laughter from everybody because the point was to use pitch, not loudness. The student himself broke off even before the laughter started because he had noticed. This means that depending on the L1 group, a certain aspect of stress must be particularly emphasized. Finally, it should be highlighted that prosodic acquisition seems to depend on developmental problems at least as much as on transfer problems[2]. Still, as Cook (2000: 173) or Van Dommelen & Husby (2009: 314ff) have shown, it is possible for the teacher to utilize the L1 if it has similar features: in an empirical evaluation of these units one of the test groups had two Italian students, who initially struggled like everybody else, but learned pitch movement with significantly greater ease because Italian makes heavy use of pitch in its sentence melody as well.
[1] I have found that speakers of French or German substitute /ɛ/, while speakers of Japanese, Spanish, Italian or Polish substitute /a/, even though /ɛ/ as well as /a/ occur in all of these languages.
[2] It is my experience that even though certain features may exist in students’ native languages, these factors are not simply transferred 1:1 to a second language. Also, pitch range and exact use of pitch are different even among varieties of English (see e.g. Meier 2011). Cook (2000: 173) notes that it is not typically the case that Chinese students acquire English intonation on their own, despite Mandarin being a tone language, but that they can be trained very effectively due to this advantage. It has also been shown that Chinese students can be instructed more effectively in Norwegian word tones than German students (Van Dommelen & Husby 2009: 314ff). Still, the exact degree of positive transfer of prosody and how to utilize it in L2 teaching is not clear. Likewise, Chun (2002, Ch. 6) shows instrumentally how rhythm differs in English and German, though both are stress-timed Germanic languages, so that it is not clear to what extent the similarity is actually helpful. It should further be noted that prosody is very much dependent on fluency (Derwing, Munro & Wiebe 1998), and therefore generally difficult to realize in an L2.
Prominence & intonation (3 units)
Language pedagogical analysis
These three unis cover the greater topic of pitch movement, with a special emphasis put on the sub-topics of nuclear stress, thought grouping, and levels of stress & pitch (termed “prominence”) because of their interrelationship with connected speech, while minor emphasis is put on tones, i.e. utterance final pitch movement (which can be defined as “intonation” proper).
The whole topic of prominence/intonation/pitch is introduced with resources 1-3 of the first unit. Resource 1 (the text is adapted from Baker/Goldstein 2008: 22) combines the analysis with the listening discrimination phase organically, which is a very useful procedure since speech perception is a prerequisite to speech production and since pitch cannot really be analyzed in the classroom in absolute terms. While task 1 serves as a lead-in technique to sensitize students to listen for stress, task 2.1 starts the analysis stage proper in that students are now required to listen for nuclear stress (pedagogically referred to as “primary sentence stress” in Avery/Ehrlich (1992) or the “focus word” in Gilbert (2009)), which, in English, depends significantly on pitch (see phonological analysis). This activity is quite difficult for students since they probably never consciously had to listen for pitch before, but doing this activity several times clearly helps. Task 2.2 then specifically addresses the issue of stress/pitch movement to signal old vs. new information, which is practiced in a controlled and guided manner in the following tasks. Resource 2, in addition to the topic at hand, also makes mention of unreleased terminal obstruents, while, in addition, the whole dialogue and this task make excessive use of the /æ/ vowel. These issues are not systematically explained, but the articulation is modeled and especially in the /æ/ vowel students receive some formative, articulatory feedback that can help prepare for when the topic is introduced properly. By that time a mental category may already have been established subconsciously (though tasks as the present on) and students would be able to draw on these initial impressions when trying to relate to the topic when it comes (Roth 2009: 66). In addition, this might also potentially trigger some monitoring (Krashen 2003: 2f).
This topic is closely related to the concept of tone units/intonation units/intonation groups/intonation phrases/sense groups/breath groups (pedagogically referred to here as thought groups), since each tone unit normally has only one nucleus in unmarked utterances. In order to introduce this, students are given the following two sentences:
Do you remember (/) when we used to stay up all night (/) studying for exams?
Do I ever! Finals week was so bad / that we’d drink coffee (/) by the gallon.
Using a deductive technique, these sentences are provided without slashes (marking (possible) thought groups) together with the following 4 rules and the instruction to identify possible thought groups (the rules and sentences are taken from Celce-Murcia et al. 2011: 223).
A thought group is
1) set off by pauses before and after;
2) contains one prominent element
3) has an intonation contour of its own
4) usually has a grammatically coherent internal structure
Resources 4 and 5, then, combine the topics of thought groups, stress and prominence in listening discrimination and some inherent controlled practice. In that, resource 5 adds a significant bit of authenticity (ibid: 374) in that real songs and film excerpts are used, as opposed to texts designed to make a point. This is significant since, as Penny Ur (1987: 10) cautions, “[s]tudents who do not receive instruction or exposure to authentic discourse are going to have a rude awakening when [trying] to understand native speaker speech in natural communicative situations” (see also e.g. Ricard 1986: 244, Couper 2003: 56). Ur agues this is particularly so because of students’ problem to decipher word boundaries and to cope with fast (i.e. connected) speech. Therefore the use of such materials is a strong component of any approach that strives for authenticity.
One more element is added here before going into free practice and this is what happens to the pitch level at the end of a tone unit (in neutral sentences). To introduce this topic in a simple way, the following sentences are provided: “I visited the museum, the library and the park” and “we ordered soup, salad and coffee”. Students are asked what they notice when the teacher models the sentences, perhaps accompanied by respective hand movements. Once students realized that pitch slightly rises or stays the same after each item before it drops at the very end, they are asked to complete the following sentences with a list of items as prompted: “They saw a... [cars]”, “The shirt is... [colors]”, “The zoo has... [animals]”, “The suitcase contains... [items]”. Resource 6, then, puts everything together, while resource 7 necessitates the “listing” of items in extended discourse and therefore requires the use of all the topics covered. This particular didactic sequence is meant to put several individual prosodic aspects together in an organic and pedagogically conducive manner. The whole topic of pitch was started with a rather tangible aspect, old and new information (linguistically referred to as information structure), and introduced the notion of primary sentence stress at the same time. From here the tone unit was added because one, now, wonders where and in which intervals these primary stresses occur. Once these concepts are established, it makes sense to enquire what happens to pitch at thought group boundaries, which is again introduced in a more tangible way (listing items), before going into more complex discourse. All these aspects together establish a solid foundation of the prosodic structure of utterances, added to the somewhat more intangible topic of rhythm before. In the top-down process (from large to small), this is stage two, while the following unit takes one more step down and looks at pitch and stress per se.
While students at this stage have a pretty clear, if phonologically very basic idea of the function of pitch in sentences and discourse, one aspect that still causes some confusion but also builds expectation, which can be channeled and used for motivation !, is how much, or how little pitch is used where. This is addressed in the first part of the following unit, in which vocal coaching exercises proposed in Archibald (1992) are employed (each number is said on a higher or lower pitch level). After some more analysis to gain complete clarity of how the aspects introduced thus far operate in the language system (2.1), task 2 of resource 2 plays with pitch to express meaning (what are you listening to? who is listening? why aren’t you listening?).
Then, in the context of further pitch practice, rising and falling tones are brought in, an easy-to-use, illustrative and at the same time meaningful context for which being tag questions. Resource 4, finally, brings this together with prominence and highlights the complex meaning-distinguishing function of stress and pitch (the 2.2 sentences are adapted from Celce-Murcia et al. 2011). In this, task 1 also specifically addresses the intonation difference in open and closed questions (or, pedagogically, wh- questions and yes/no questions). This particular aspect can easily be elicited by the teacher during the previous tasks and practiced in a little supplemented information exchange activity. This stands in direct relation to the previous tasks in that both question types follow one of the two tones just introduced. Task 1 of resource 4 is very difficult because pitch has to be changed on one syllable, i.e. within one vowel. In order to illustrate this beyond modeling it, Ann Cook’s (2000) model of “staircase intonation” can be employed (see picture on the bottom of this page, adapted from Cook 2000:97, illustrating were), in which the second “stress step” on the intonation staircase is filled with a small schwa and the following consonant, thus splitting the vowel into two parts – in the task leading to “/brɪ-əŋ/”, with higher and lower pitch on the respective elements. Saying this artificially slowly and then increasing the speed seems, indeed, very helpful with this simple visualization.
The Relationship unit can be used either in a focus on meaning (task-based teaching, covering form at the very end) or in a focus on form manner (doing form practice intermittently) and is designed to practice everything further with communicative activities while students attention is drawn to prosodic features in further authentic listening texts, in this case longer videos scenes from a TV show. This unit also further introduces processes in connected speech in an impressionistic manner and in that builds some perception skills and concepts that can be used as a cognitive reference point in the following lessons (Roth 2009: 66). Task 3, after analysis has been completed, can very nicely be used as a shadow reading technique, which means that students read the text along with the speaker in the video, which, in a way, forces them to employ stress and connected speech in as authentic a way as possible. Task 4, finally, serves as a specific revision of rhythm, and some focused reading-out is also employed here.
Phonological analysis:
Most pedagogical concepts have already been related to their respective linguistic terms and their functioning in the language system, so that this section will deal with the development of intonation theories and their relation to pedagogical models.
Contemporary theories of intonation were developed chiefly between the 1940s and 1990s, with significant differences between the first approaches developed in Britain and North America, respectively. In contemporary pedagogical approaches, many bits and pieces of these theories are put together in an attempt to create a processable and comunicatively functional system for the teaching of English prosody, with the overall goal of helping students to achieve communicative competence (Celce-Murcia et al. 2011: 269). However, such an language pedagogical treatment of prosody blurs a number of distinctions that have been made in general linguistics. 't Hart et al. (1990: 2) set an agenda for what aspects a theory of intonation should address: (my emphasis)
[I]ntonation can be approached from a variety of angles, all of which are equally indispensable if one wishes ultimately to understand how speech melody functions in human communication. Ideally, a theory of intonation should comprise a phonetic and a linguistic component. The phonetic part of the theory should account for the physiological, acoustic, and perceptual aspects of intonation, and elucidate the relation between them. The linguistic component of the theory should aim at a phonological interpretation of the phonetic facts and at a pragmatic explanation of how intonation functions in the communicative interaction between speaker and listener. Finally, the theory should comprise a natural link between the linguistic and phonetic components: it should clarify how the melodic performance of the language user results from the interaction between his communicative intent and the peripheral means of his vocal and perceptual apparatus.
Pedagogic accounts of intonation usually strongly emphasize the functional aspect of intonation because of its clear communicative value. This was the purpose of the unit on sentence stress & thought groups and of the second half of the following unit ("basics of intonation"). However, the first part of that unit ("pitch range") and, by implication, the second half of it specifically addressed the aspect of phonetic form, for which an analogy to music was drawn to make it more accessible.
In the phonological classification of intonation phenomena, British scholars have traditionally preferred a tune or contour approach, in which pitch contours, sentence types and emotional states were used to build a coherent model. In this, Sweet’s (1890: 3) classical distinction between five tones (level, rising, falling, falling-rising and rising-falling) was dominant. Developments of such a five- or six-tone system were O’Connor & Arnold’s (1961) tonetic approach (which put special emphasis on attitudinal functions like doubt or certainty) and Halliday’s (1963, 1967) categories approach, which stressed the grammatical function of intonation (open and closed questions, incompleteness). In North America, theories of pitch levels were more prominent. In the American scholarly tradition, contours were typically analyzed into sequences of four pitch levels (treated as prosodic ‘phonemes’) and three terminal junctures, falling, rising and level (together with preceding levels treated as prosodic ‘morphemes’) (Cruttenden 1997: 28). In this, Pike’s (1945) “pitch levels” approach was extremely influential, resulting in a notational system as depicted (Chun 2002: 26).
He wanted to do it (but couldn’t)
3- °2 -3 / 4- °2- -4 //
Generative phonologists like Pierrehumbert (1980/1987) developed this model into what came to be known as the autosegmental-metrical model of intonation, using a two-level approach (high and low) (see Gussenhoven 2004 and Ladd 2008 for contemporary accounts). It was also this model that led to the ToBI transcription system (see Beckman/Hirschberg/Shattuck-Hufnagel 2005). While this model avoided some previously criticized shortcomings (e.g. the point that the four pitch levels were seen as random and that transitions were not accounted for), it is still not unchallenged (Cruttenden 1997: 64ff). Also, its notational system is exceedingly complicated, and it therefore seems not to have explicitly influenced pronunciation pedagogy much. This does not mean, however, that a technique like Archibald’s vocal coaching exercise (unit on pitch range), with which human speech pitch can be classified as lying between level 4 and -2, is not useful. Such pedagogical exercises are theoretically inadequate because stress also involves loudness and duration in various ways and because they do not account for the transition from one level to the other, not to speak of the relative character of such levels. Nevertheless, they establish very real perceptual categories (emphatic stress (4), primary stress (3), sentence stress (2), unstress (+/-1), drop to creaky voice (-2)). Therefore, one has to distinguish between what is expressible in phonological theory and what is perceptually observable (and teachable).
In Britain, David Brazil developed a system of discourse intonation in the 1970s and 80s in which he kept the classical distinction of five tones, combined with three keys (the pitch level of an entire tone group). Brazil’s model seems to have been very influential, but is rather impressionistic and complicated, so that it has been further developed by scholars like Elizabeth Couper-Kuhlen (e.g. 1986), who extends and clarifies the discourse intonation model and develops a number of different functions of intonation (informational, grammatical, illocutionary, attitudinal, textual/discourse, indexical) (see Wells 2006 for a contemporary account of intonational functions). Couper-Kuhlen herself later played a great role in the development of a new tradition in intonation studies: prosody in interactional linguistics (building on the “interactive function” in her discourse intonation model) (e.g. Couper-Kuhlen 1993, Couper-Kuhlen & Selting 1996, Couper-Kuhlen 2007, see also e.g. Szczepek-Reed 2011).
As has been said before, applied linguistic accounts took bits and pieces from the systems that were proposed over the years and that seemed particularly conducive from a pedagogical standpoint. This led Levis (1999: 37), in comparing intonation in theory and practice, to the observation that “intonation as currently presented in North American textbooks bears a strong resemblance to textbook treatments from 30-50 years ago”. Indeed, Avery & Ehrlich (1992), a book still widely used today, only cover aspects like word tones much like the system proposed by Halliday in the 1960s, while Celce-Murcia et al. (¹1996, ²2011) still use, as one part of their discussion of intonation, a system much like Pike’s pitch levels approach from the 1940s. The discourse function of intonation deserves to be emphasized (Clennell 1997), as e.g. Gilbert (2009) or Celce-Murcia et al. (2011) do, and as can be seen as necessary in serious CLT. However, the explicit treatment of pitch levels (as dealt with in the above unit) cannot be neglected, and student feedback reveals that specific pitch practice is perceived as extremely valuable. In language teaching, the fact that such levels are quite subjective is rather irrelevant because the teacher can and has to assess whether a necessary contrast is achieved. Rather than ignoring the teaching of form altogether in favor of certain theoretical paradigms (interaction, discourse, communicative competence), as is often done today, a sensible, needs-oriented approach should be taken. The lesson sequence proposed here, with its sequential structure of individual topics, is meant to address communicative needs, but also to supplement some theory in order to achieve not only functional, but also more authentic production in the long run.
A final aspect bringing together theoretical and L2 phonological concerns has to be mentioned: the role of pitch in marking stress. As, for example, Cruttenden (1997: 13) notes, there are three aspects that contribute to stress: pitch, duration and loudness. These operate on a scale of importance, with pitch clearly being most significant, duration being of intermediate importance and loudness playing only a minor role, especially in nuclear stress (see also e.g. Giegerich 1992: 179 or Chun 2002, who shows this in various contexts). It is known that the fundamental frequency (F0) of male speakers is between 60 and 200 Hz on average, and that of female speakers between 180 and 400 Hz (Cruttenden 1997: 3), and that many languages use pitch in quite a similar manner (Ohala 1983). Still, the exact blend of pitch, length and loudness in English and the correlation between a certain pitch level and this level’s role in the language system are difficult to acquire. Just as, for example, speakers from different L1s substitute the /æ/ vowel in their own ways[1], speakers of different L1 backgrounds seem to overcompensate with either pitch, duration or loudness while somewhat neglecting the others. Bertha Chela-Flores (e.g. 1997) stresses training students in duration because this is the primary problem of Spanish L1 speakers. Colleagues who have worked extensively with Japanese students inform me that in their context pitch is used to overcompensate for the others.
With German and (if generalizable) Slavic L1 speakers I have found that loudness, which is the least significant factor in English, is heavily used to compensate for pitch. This once showed very nicely in the controlled practice task on page 2 of the sentence stress unit. When asked to read out the dialogues after practice, one student, concentrating on getting it right, overused loudness immensely, which caused laughter from everybody because the point was to use pitch, not loudness. The student himself broke off even before the laughter started because he had noticed. This means that, depending on the L1 group, a certain aspect of stress must be particularly emphasized. Finally, it should be highlighted that prosodic acquisition seems to depend at least as much on developmental problems as on transfer problems[2]. Still, as Cook (2000: 173) or Van Dommelen & Husby (2009: 314ff) have shown, it is possible for the teacher to utilize the L1 if it has similar features: in an empirical evaluation of these units, one of the test groups had two Italian students, who initially struggled like everybody else, but learned pitch movement with significantly greater ease because Italian makes heavy use of pitch in its sentence melody as well.
[1] I have found that speakers of French or German substitute /ɛ/, while speakers of Japanese, Spanish, Italian or Polish substitute /a/, even though /ɛ/ as well as /a/ occur in all of these languages.
[2] It is my experience that even though certain features may exist in students’ native languages, these factors are not simply transferred 1:1 to a second language. Also, pitch range and exact use of pitch are different even among varieties of English (see e.g. Meier 2011). Cook (2000: 173) notes that it is not typically the case that Chinese students acquire English intonation on their own, despite Mandarin being a tone language, but that they can be trained very effectively due to this advantage. It has also been shown that Chinese students can be instructed more effectively in Norwegian word tones than German students (Van Dommelen & Husby 2009: 314ff). Still, the exact degree of positive transfer of prosody and how to utilize it in L2 teaching are not clear. Likewise, Chun (2002, Ch. 6) shows instrumentally how rhythm in English and German differs, though both are stress-timed Germanic languages, so that it is not clear to what extent the similarity is actually helpful. It should further be noted that prosody is very much dependent on fluency (Derwing, Munro & Wiebe 1998), and therefore generally difficult to realize in an L2.