Dr. Jim’s speaking and listening class covered two full chapters of the Ear book this evening; discussion time was generally lively–sometimes bordering on boisterous but never actually crossing that delicate line–and marked by moments of enlightenment, when some concept was either finally understood or found to bear some personal meaning for one of the group. Tonight “my group” was Chunmei ( from Taiwan ), Alan ( from China ), Yae ( from England and Japan ), Tokiko ( from Japan ) and me ( from the U.S. ). Hiroko ( in the picture below ) was in the group at the next table, but I wanted to show her bright smile here anyway.
We had been plowing through different listening-related concepts and negotiating for meaning with mixed success, when near the end of class we hit on an interesting phenomenon that was directly related to the topic of the hour: assessment.
I’d like to elaborate on the interesting conversation that ensued, but first, let me activate your schemata and talk a little bit about assessment. We encounter it on a daily basis: our health is assessed by doctors, our cars are assessed by mechanics, our financial situation is assessed by city officials ( who will tax us accordingly ), our skin type is assessed by flawlessly groomed saleswomen ( who will sell us the appropriate beauty cream ), situations are assessed, damages are assessed, our personalities are assessed, and so on. Assessment can be objective ( hopefully your doctor will do a thorough and professional examination before he pronounces you either ill or healthy ) or highly subjective ( say you poke your head into a restaurant, assess the atmosphere, and quickly decide it’s not for you, based on the feel of the place ), and either method of assessment can be valid, depending on the context.
So let’s talk about assessing second language learners, and why it’s such a tricky business. First of all, assessment involves judgement, and judgement implies consequences. The results of high-stakes tests ( such as SAT and GRE exams, or TOEFL and TOEIC exams ) can decide learners’ educational or career paths, and the fair and accurate assessment of the exams is a huge–I would say grave–responsibility. At least once a year in Japan, there is a testing scandal involving students who actually pass their high school or college entrance exams, but are denied admittance. After the fact, it is discovered that the test results had been wrongly calculated ( and the average listener never learns whether this was an accident or a deliberate mistake ), apologies are extended, and appropriate punishment is meted out. Again, the average listener never learns what happened to the ill-fated student in the end. I always wonder where they end up, and what kind of emotional baggage they’re saddled with as a result of the “mistake”.
Outright mistakes in assessment–whether deliberate or otherwise–are one thing, but the assessment tool itself ( in most cases, a test ) must be valid for it to be an accurate reflection of the learner’s proficiency. Some believe that no test can ever accurately measure a language learner’s ability. Lee Cronbach, known as the “Father of Construct Validity”, argued that this was the case. Still, in most formal learning situations, learners must be assessed, and tests must be both valid ( defined as accurately measuring what they purport to measure ) and reliable ( defined by fair and consistent testing conditions ), or at least as valid and reliable as possible. The tests that Japanese learners are most familiar with are called criterion-referenced: this means that certain scores are equated with certain standards of proficiency or behavior in the subject matter being tested. For instance, a learner who scores in the 9th band ( top level ) of the International English Language Testing System is one who “has fully operational command of the language: appropriate, accurate and fluent with complete understanding.”
Think about those four components ( appropriateness, accuracy, fluency, and understanding ) and what they mean for a second language learner. Accuracy, understanding, and fluency are, to me, more straightforward than appropriateness, since the latter involves pragmatics, which is never straightforward. Some who define fluency as not just speed and facility of production but as “natural use of language” include an element of pragmatics there as well. Here is where the class became interesting, as our group exchanged stories and gave some thought to the issue of pragmatics and assessment.
The first story that surfaced was my own, and here it is. Eight years ago, I received a call from a friend who had a part-time job with a testing company ( to my dismay, I cannot remember the name of his employer, but perhaps that’s for the best ). His mission was to track down Japanese speakers who would potentially fit the “top band” of his company’s speaking proficiency standard chart, interview them, and submit the results for assessment. The company was in the process of re-evaluating their standards and looking to see if, in fact, those standards were realistic for Japan. “How about your husband? Do you think he’s proficient enough? Would he do an interview?” my friend asked me with some hesitation. “Well, sure!” I said confidently, with all the faith in the world that my clever husband would ace the interview. My husband, one of those rare people who likes tests, was immediately agreeable, and the three of us met at my school on the appointed day.
The first half of the test went smoothly, as the questions were fairly innocuous. My husband was in his element, speaking confidently and accurately, and I was sure he’d pass with flying colors. My clever husband, and clever me, to have found and married such a clever man! But then–but then–my friend the tester threw a curve ball. “Here’s the situation,” he said. “I’m your co-worker and friend, and I have a smoking problem. It really bothers you. I want you to convince me to stop smoking.” And everything fell apart. My husband is Japanese. Smoking was still not politically incorrect eight years ago in Japan, and even if it had been, one’s habits in this country are one’s own business. The workplace is not where men have heart-to-heart talks about alcohol or smoking habits. Most men here don’t have those talks anyway! I knew instinctively that things were not going to go well with this question, and I was dead right. My husband very compliantly said a few words, smiled, and dropped the ball. “No, no! Pick it up again! Don’t stop there, be persuasive!!” I thought, burning inside.
The interviewer was encouraging: “Oh, you can come at me stronger than that! The smoking really bothers you–come on!” But to no avail. At that point, my husband capitulated completely, proposing ( while still smiling politely ) that if the smoker really wanted to smoke it was okay, and he wouldn’t bother him anymore. And that was that. The interview ended, we ate a box of donuts together, and talked about our respective families. My friend called some weeks later to report that my husband had not fit the criteria for the top band of proficiency. And it was because of the smoking question. “He just didn’t use the language like a native speaker,” was the way my friend explained it. As I understood it, then, he didn’t use his English appropriately.
And yet the problem was not my husband’s English skill. The problem was that he was put in the uncomfortable position of engaging in a culturally unnatural and unfamiliar type of discourse. He was sideswiped by his cultural identity rather than by any lack of vocabulary, fluency, or comprehension. Literally left tongue-tied. Too bad, I say, but at least the situation was voluntary, with no “high stakes” involved. Not counting, of course, my husband’s pride and my own high expectations.
As soon as I had finished my story, Chunmei chimed in with her own. “I can never get my students to do role plays about smoking, either!” she exclaimed. “I ask them to persuade me to stop smoking and they clam right up and give me a funny look!” We then agreed that skipping the role play ( the book lesson ) and getting students involved in a discussion about the topic would be much more valuable. And we wondered about pragmatics and assessment.
In the case of my husband, meeting the criteria for a “top level of proficiency” required him to behave in a decidedly un-Japanese way. He was judged to be a non-native level English language speaker because his reactions were Japanese. And this, of course, is the heart of the matter: is such a language proficiency assessment valid? Recall the IELTS Band 9 criteria: appropriateness, accuracy, fluency, and comprehension. Again, “appropriateness” is the tricky component, and ( according to my friend ) it was a component of the test my husband took as well. As I see it, his approach to the smoking question was perfectly appropriate within his own culture. The interviewer did not specifically say, “You’re in the U.S. Address this situation like an American!” Of course, if he had, it might not have made much difference in my husband’s answer, but at least the expectation or the context might have been clearer. And perhaps, given the ambiguity of the question and the informality of the interview in general, the problem was one of reliability rather than validity.
The Ear book ( chapter 10 ) presents a chart of assessment models that I found both interesting and enlightening. Specifically, Rost gives a visual overview of criteria for assessment according to the purpose of second language learning ( EFL vs ESL vs English for Young Learners vs English as a Lingua Franca ). According to this chart, my husband could probably get top scores in an assessment of English as a Lingua Franca, since this is the only category that specifies “expected to maintain national identity through English”. Still, if not for the pragmatic component, I believe he would qualify as a top-level speaker in an ESL assessment as well.
So what’s the moral of the story? I really don’t know. I consider my husband a native-level English speaker, but that’s my own subjective assessment. In the long run, I’m engaged in an ongoing attempt to understand the delicate balance between mastering a second language ( which means tackling the pragmatics and taking on a new identity ) and retaining the foundations of one’s own cultural identity at the same time. Happily, there’s no rush, because I think I’ll be busy with this one for some time to come.