
Validating Analytic Rating Scales: A Multi-Method Approach to Scaling Descriptors for Assessing Academic Speaking. New edition [Hardback]

  • Format: Hardback, 395 pages, height x width: 210x148 mm, weight: 630 g
  • Series: Language Testing and Evaluation 37
  • Publication date: 28-Nov-2015
  • Publisher: Peter Lang AG
  • ISBN-10: 3631666918
  • ISBN-13: 9783631666913
This book presents a unique inter-university scale development project, focusing on the validation of two new rating scales for the assessment of academic presentations and interactions. The use of rating scales for performance assessment has increased considerably in educational contexts, but empirical research investigating the effectiveness of such scales remains scarce. The author reports on a multi-method study designed to scale the level descriptors on the basis of expert judgements and performance data. The salient characteristics of the scale levels offer a specification of academic speaking, adding concrete detail to the reference levels of the Common European Framework. The findings suggest that validation procedures should be mapped onto theoretical models of performance assessment.
Acknowledgements 9(2)
List of figures 11(2)
List of tables 13(2)
List of abbreviations 15(2)
1 Introduction 17(8)
1.1 Background to the study 17(3)
1.2 Statement of the problem 20(1)
1.3 Purpose of the study 21(1)
1.4 Research questions 22(1)
1.5 Structure of the book 23(2)
2 Performance assessment of second language speaking 25(36)
2.1 Introduction to performance assessment 26(3)
2.2 The speaking construct in performance assessment 29(20)
2.2.1 Pre-communicative approaches 29(2)
2.2.2 Models of communicative competence 31(7)
2.2.3 Approaches to speaking 38(11)
2.3 Models of performance assessment 49(8)
2.3.1 McNamara (1996) 51(1)
2.3.2 Skehan (1998, 2001) 52(2)
2.3.3 Bachman (2002) 54(1)
2.3.4 Fulcher (2003) 55(2)
2.4 Rating scales in performance assessment 57(4)
3 Rating scales 61(18)
3.1 General characteristics 61(2)
3.2 Types of rating scales 63(2)
3.3 Theoretical and methodological concepts in rating scale development 65(9)
3.3.1 Intuitive approaches 66(1)
3.3.2 Theory-based approaches 67(1)
3.3.3 Empirical approaches 68(2)
3.3.4 Triangulation of approaches 70(4)
3.4 Controversy over rating scales 74(5)
4 Rating scale validation 79(12)
4.1 Validity and validity evidence 79(4)
4.2 Rasch-based rating scale validation 83(3)
4.3 Dimensionality 86(2)
4.4 Conclusion 88(3)
5 The ELTT rating scales 91(38)
5.1 The development process 91(9)
5.1.1 Intuitive phase 91(4)
5.1.2 Qualitative phase 95(5)
5.2 The ELTT construct 100(22)
5.2.1 Lexico-grammatical resources and fluency 100(8)
5.2.2 Pronunciation and vocal impact 108(2)
5.2.3 Structure and content 110(5)
5.2.4 Genre-specific presentation skills: formal presentations 115(1)
5.2.5 Content and relevance (interaction) 116(3)
5.2.6 Interaction 119(3)
5.3 Descriptor formulation 122(2)
5.4 ELTT speaking ability 124(3)
5.5 Conclusion 127(2)
6 Descriptor sorting 129(24)
6.1 Validating the ELTT scales 129(4)
6.2 Rationale 133(1)
6.3 Methodology 134(2)
6.3.1 Participants 134(1)
6.3.2 Instruments and procedures 134(2)
6.4 Analysis 136(2)
6.5 Results and discussion 138(8)
6.5.1 Inter-rater reliability 138(2)
6.5.2 Match between intended and empirical scale 140(2)
6.5.3 Descriptor analysis 142(4)
6.6 Preliminary conclusions 146(5)
6.6.1 Level allocation 146(1)
6.6.2 Specificity of proficiency levels 147(1)
6.6.3 Descriptor wording 148(3)
6.6.4 Recommendations for scale revision 151(1)
6.7 Conclusion 151(2)
7 Descriptor calibration 153(52)
7.1 Rationale 153(1)
7.2 Analysis 154(11)
7.2.1 Rasch measurement 154(3)
7.2.2 Specification of a measurement model and FACETS output 157(1)
7.2.3 Measurement quality control 158(3)
7.2.4 Descriptor analysis 161(4)
7.3 Results and discussion 165(38)
7.3.1 Measurement quality control 165(5)
7.3.2 Dimensionality of descriptors 170(6)
7.3.3 The proficiency continuum 176(8)
7.3.4 Cut-off points and content integrity 184(19)
7.4 Conclusion 203(2)
8 Descriptor-performance matching 205(58)
8.1 Rationale 205(1)
8.2 Methodology 206(13)
8.2.1 Participants 206(1)
8.2.2 Instruments and procedures 207(11)
8.2.3 Data collection 218(1)
8.3 Analysis 219(1)
8.3.1 Specification of a measurement model 219(1)
8.3.2 Measurement quality control 219(1)
8.4 Results and discussion 220(34)
8.4.1 Measurement quality control 220(5)
8.4.2 Dimensionality of descriptors 225(8)
8.4.3 The proficiency continuum 233(3)
8.4.4 Cut-off points and content integrity 236(18)
8.5 Conclusion 254(3)
8.6 Comparison of methods 257(6)
9 Revision of the ELTT scales 263(36)
9.1 Establishing a quality hierarchy of descriptor units 264(7)
9.2 The quality of descriptor units 271(7)
9.3 Constructing the revised scales 278(7)
9.4 Common points of reference 285(5)
9.5 The modified versions of the ELTT scales 290(9)
10 Conclusion 299(30)
10.1 Summary 300(4)
10.2 Theoretical implications 304(10)
10.3 Practical recommendations 314(7)
10.4 Limitations of the study 321(4)
10.5 Suggestions for further research 325(2)
10.6 Concluding statement 327(2)
11 References 329(20)
12 Appendix 349
12.1 Appendix 1: Original ELTT rating scales 349(4)
12.2 Appendix 2: Sorting task questionnaire 353(6)
12.3 Appendix 3: Consensual scales based on descriptor sorting 359(3)
12.4 Appendix 4: Descriptor unit measurement report (descriptor calibration) 362(7)
12.5 Appendix 5: All facet vertical ruler (sorting task) 369(1)
12.6 Appendix 6: Speaking tasks 370(2)
12.7 Appendix 7: Rating sheets 372(11)
12.8 Appendix 8: Rater guidelines 383(3)
12.9 Appendix 9: Student measurement report (descriptor-performance matching) 386(2)
12.10 Appendix 10: All facets vertical ruler (descriptor-performance matching) 388(1)
12.11 Appendix 11: Descriptor unit measurement report (descriptor-performance matching) 389
Armin Berger is a Senior Lecturer in English as a Foreign Language in the English Department at the University of Vienna. His main research interests are in the areas of teaching and assessing speaking, rater behaviour, language assessment literacy, and foreign language teacher education.