
The Uni-Files - Grading Archive

A candid look at EFL life and lessons from a university teacher's perspective.

July 24, 2009

The Catheter of Damocles and the Perils of Grading 'Effort'

Hospital-ity

First, thanks to those who wrote showing concern about my condition. Although the hernia was very painful, it was not a danger and I'll be released from hospital tomorrow. Others also wrote words supporting my previous blog entry about not feeling responsible for student success or failure, but were unable to post comments for reasons that are still unknown to me (although it may have been me that screwed something up- I'm working on it).

Here's a little anecdote from the hospital, which is directly attached to the medical school I teach at, that may make you think twice about getting on a student's case or failing them- it may come back to haunt you:
During my convalescence I woke up one morning with absolutely no feeling in my private regions, including my entire butt. What I could feel though was a nasty pressuring pain from the left side of my abdomen. I called the nurse, who told me that my bladder was backed up with urine- but with complete numbness down below I could do nothing about it. So, she called in the doctor, who just happened to be- you guessed it- an ex-student, and hardly one of the most diligent I have known. Not only that, but I remember haranguing him in class a few times for his lackadaisical attitude, and I believe I made him do three re-tests before (reluctantly) passing him. Now here he was with all the power in the world over sensei, about to insert a catheter into my nether regions, and all I could think was "He wants revenge!"
And that he could get it!

Well, as it turned out, he did his job well enough. He inserted the catheter (due to the numbness it was painless, although still disconcerting, as any male will testify), relieving the bladder pain (in fact, it had built up to a critical point). It turns out that the numbness had not been caused by a rupture or other orthopedic complication but by the anesthetic node (placed in my spine) becoming slightly dislodged, a much less serious condition.

Unfortunately, he then proceeded to drop many of the brownie points he had earned by asking me how to say ‘shinkei’ (nerve) in English (I communicated with all but one doctor in Japanese). That an orthopedist didn’t remember such a basic term was unnerving (pun intended).

So, if someday you are in the operating room and that student you harangued and badgered now looms over you with a scalpel, you would be right to be afraid, to be very afraid. Those of you who teach at police academies or who teach public officials may also want to keep this in mind. Of course, the other side of the coin is that I was able to get some special 'recognition' treatment from doctors and nurses too (and in the case of the former, the interesting register problem of exactly who should call whom 'Sensei' comes into play).
But that's another entry.

Points for attendance, effort and participation- a dilemma

I bet most of you give out points for the above, right? Most teachers I know make these account for anywhere from 10-30% of a student's final grade. After all, any type of communicative English class is based upon process: carrying out tasks, facing challenges, using the language within dynamic contexts. But I've noticed a problem. Let me illustrate it using two students as models, Kimiko and Shohei.

Kimiko is a genki chatterbox. She is always cheerful and perky. She sits at the front and makes eye contact with you. She responds to your jokes, in fact any comment. She calls you over and enjoys asking questions in (often broken) English. She greets you in the hallways. You can hear her carrying out the tasks in the classroom because her enthusiastic voice rings out above most others. You learned her name in the first class.

You notice though that she has been absent twice, late once, and has forgotten both her homework and textbooks on occasion. You also know that she has the habit of finishing tasks rather quickly and then incessantly chatting in Japanese to friends. When you chastise her she bats her eyelashes.

Then there is Shohei. He is pasty-faced and rather disheveled. He sits at the back of the class. He rarely makes any facial expression. On the rare occasions you hear him speak his voice is monotone and he does not make eye contact. He probably doesn’t greet you in the halls and I say probably because you’ve never really remembered who he is. When monitoring an activity in class he participates, but not so audibly. But you’ve never seen him sleeping or acting as if he has a divine get-out-of-this-activity-free card in his pocket. You also note that his attendance is 100%.

So, you have your tests. One is a role-play test, so more dynamic and unpredictable language skills are being tested. The other is a paper test, a little more discrete-point focused, with more writing, showing understanding at a more detached level.

And... Shohei whoops Kimiko’s butt on both tests. In fact, Kimiko makes some pretty fundamental mistakes (albeit while batting her eyelashes). Obviously, for all of Shohei’s standoffishness, diffidence, or anti-social personality, at some level he is making the effort to soak it in, while Kimiko is hit or miss.

I’m willing to bet most teachers would give the higher participation/effort points to Kimiko because of her engaging personality (no, she need not be a cutie pie), but in fact it may be Shohei that, in his awkward unsocial way, is making the greater effort to learn and master the subject. It’s something that the teacher will almost never be able to see, let alone gauge.
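For the sake of argument, here's the dilemma in rough numbers- a quick Python sketch in which the scores are entirely invented, just to show how the weighting plays out:

```python
# Hypothetical illustration: how a participation/effort weighting
# can reorder two students whose test scores point the other way.
# All scores below are invented for the sake of the example.

def final_grade(test_score, participation_score, participation_weight):
    """Weighted average: participation_weight is the fraction (0-1)
    of the final grade given to participation/effort points."""
    return (participation_score * participation_weight
            + test_score * (1 - participation_weight))

# Invented scores for the two model students (out of 100):
kimiko = {"test": 62, "participation": 95}   # lively, but hit or miss on tests
shohei = {"test": 85, "participation": 60}   # invisible, but solid on tests

for weight in (0.10, 0.30):
    k = final_grade(kimiko["test"], kimiko["participation"], weight)
    s = final_grade(shohei["test"], shohei["participation"], weight)
    print(f"participation worth {weight:.0%}: Kimiko {k:.1f}, Shohei {s:.1f}")

# At 10% Shohei is comfortably ahead; at 30% the gap narrows sharply,
# and with slightly different invented scores the ranking flips outright.
```

The point isn't these particular numbers- it's that the 'effort' column, the one we can barely observe, can carry enough weight to decide the grade.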

Food for thought.


November 12, 2009

Q. When is a Native Speaker not good enough to be a Native Speaker? A. On the TOEIC test

Have you ever wondered if you might fail an English test that your students take, even though you are a native speaker of English? And if you did fail, what would that say about the test (assuming that you were giving it your best shot)? It could happen. It has happened...

One of the more memorable ELT presentations I've attended recently was given by Terry Fellner, Associate Professor at Saga University. Terry is not only a native speaker of English (duh!), he's a particularly well-educated and articulate one. But earlier this year Terry, out of curiosity, and as a means of 'testing the test', decided to take on the new Speaking/Writing TOEIC test (also known as 'Walmart').

I won't keep you in suspense here regarding the results. You can probably guess what's coming:
Terry's score was not in the highest percentile but in the second rank, which meant that Terry was judged not to be a proficient speaker/writer of his native language. Among the weaknesses cited were:
- errors when using complex grammar
- imprecise use of vocabulary
- minor difficulties with pronunciation, intonation, or hesitancy
In none of these categories (or rubrics, if you will) did the highly articulate Terry Fellner deliberately or willfully fall short.

Hmmm.

Terry was also judged to have some problems regarding the relevancy of his responses (readers should note that the test was all done online in real-time but obviously in a depersonalized manner). And this is where it gets interesting.

Terry decided to test the pragmatic preconceptions of the test by giving slightly unexpected responses but responses that were nonetheless, given the questions and tasks, logical, orderly, comprehensive and, of course, expressed with fluency. Let's take a look at some of these...

1. Terry was asked to describe a photograph. What he chose to do, though, was not start with an explanation of the foreground image (apparently a cart and horse) but rather focus upon the surrounding qualities of the picture: the weather, the background scenery etc. In one sense his choice might seem facetious, deliberately subverting the evaluator's expectations, but why should examinees be expected to conform to certain narrow Western notions of centrality or importance on what is purportedly a test of INTERNATIONAL communication?

Not only is it considered by many to be a cultural trait to focus on background and surroundings before articulating the 'center', but some personalities may also have this attribute (a skirt-chasing friend long ago displayed an incredible ability to spot, focus upon, and remember any 'hot babe' at locales such as The Parthenon or Notre Dame cathedral while forgetting what city he was in). A better example may be to look at a classical Chinese landscape painting. While there may be a hermit/poet scrawled in a lower corner somewhere, this is not what the painting is 'about'. More central is the background- the atmosphere of the mountains, the textures and shapes that surround the 'subject'. Notions of background and foreground are blurred, mixed.

This can happen with music too. In many non-Western musical styles the 'melody' is not the foreground or center; rather, tonality, timbre, texture, and polyrhythm are. Fans of modern classical music and most modern jazz, which tend to incorporate non-Western tonalities, will also be familiar with this aesthetic.

In other words, the test assumed a Eurocentric model of both viewing and description- again for a test of INTERNATIONAL communication. But wait, there's more...

For one speaking task, which required Terry to propose a solution to the problem of high office expenses, he suggested the use of clay tablets to replace computers and paper. This he supported logically and consistently (in keeping with the demands of the task), arguing that clay tablets were 'proven technology' made from cheap, easily accessible, environmentally friendly materials, and that they could also recoup costs by being resold as housing material.
Maybe not what the test evaluators were expecting, but still expressed with relevance, logical consistency, sufficient support, and of course, fluency.

Terry also assumed a very familiar stance with his superiors in the task, going as far as to apologize for his lateness as being the result of a hangover. Here he was testing the TOEIC's notion of appropriacy- what kind of appropriacy? Whose standard?

Similarly, in a writing task in which he was required to respond to an email from a real estate agent with two requests and a question, Terry complied by not only thanking her for the email but expressing surprise that she was now out of jail. One request was for a location that would be near, among other things, a nunnery and a German bakery. The other was for her to pass along a hefty 'gift' to a police sergeant, while his question asked how the business license was coming along.

Socially inappropriate? In some (but certainly not all) cases. Does his response display a lack of knowledge or understanding of English discourse? Not at all. Was it unconnected to the demands of the task? No. Was it expressed in an intelligible manner? Certainly. So where did Terry go 'wrong' on the test? What was the scoring rubric? And was it geared towards certain localized 'norms' that do not reflect a flexible, or international, standard? Seems likely.

It goes on...

In an essay writing task in which he had to explain the qualities that Customer Service Representatives need to be successful, he responded by expounding upon the ability to project false sincerity and the willingness to work for a low salary. Again, he met the demands of the question and utilized his English skills, but not exactly in a way that the evaluators would be looking for. Perhaps then the place where they are looking is too narrow. Too pragmatically focused upon a North American model. Culturally loaded. Morally loaded too.

Apparently, a large number of TOEIC test takers end up being placed in the same percentile where Terry was rated. So, if a native speaker ends up there, what does this say about the accuracy and relevance of the scoring? A huge variety of skill levels seem to converge onto this one evaluation slot. It becomes rather meaningless.

And what about the feedback he received? Doesn't it sound a little like an o-mikuji bought at your local shrine at New Year's, where one's allotment of good luck or bad luck is already printed on the paper, prefabricated 'fortunes' completely independent of the individual actually buying them?

Now this isn't meant to rag on the TOEIC people. I know how hard it is to make a comprehensive test that is completely valid (it tests what it claims to be testing) and reliable (the result will be unaffected by happenstance). In many ways the TOEIC is admirable and comprehensive, but it is very, very far from being foolproof- especially on the new Speaking/Writing version. I'll go even further. It is still very far away from being an accurate measure of a student's ability to speak or write English for international communication.

Recently, many universities have been getting all hot 'n sweaty about the alleged 'objective' value of TOEIC scores, which are supposed to represent an evaluation better than that of a trained in-house English teacher. Terry Fellner has shown, though, that this is still an illusion. Universities who think that a TOEIC orientation should replace normal communicative English learning had better think twice.


December 17, 2009

Failing- and failing to fail

One of the more persistent and widespread beliefs about Japanese universities is that all students pass their classes as a matter of course. Students who sleep or don't hand in any work are still given the green light to pass through the system. Apparently, administrative pressure and/or teacher apathy are the root causes. Hmmm.

I say this with some hesitancy because I haven't met any teachers who actually admit to being in this situation, so, while I'm certainly not saying that it doesn't happen, the extent of the behavior might well be overstated- something of an educational urban legend. In this way, it's similar to the widespread NJ notion that Japanese English teachers primarily teach grammar-translation lessons (which I've blogged about previously and with the same caveat that I've not actually met any Japanese teachers who admit to doing so). In short, it seems to be only second-hand 'common knowledge'. Most university teachers I've met have shown an almost defiant willingness to fail the laggards.

Now please realize I'm not talking about high schools here. I have heard regularly from very trustworthy sources that auto-passing is indeed a common practice in high schools. To some extent, this is understandable. If high schools fail students it looks as if they have failed to motivate or educate them properly (putting emphasis here on the phrase 'looks as if'). After all, student stewardship is a big part of a high-school teacher's role. This will therefore look bad on their records and any stats or data used to woo the public for recruiting purposes- which is, of course, a special concern for private high schools in particular. So, in order not to give off the appearance of creating 'failures' high school grades or standards might well be gerrymandered.

But universities? First, universities have almost nothing to gain from automatically passing students. After all, public perception of quality is based primarily upon entry standards. The fact that a student may take six years to do four years' work is unlikely to enter any meaningful record that would influence public perception of the institution (and it might even enhance the university's reputation for being tough).

Not only that, but having students do an extra year or two means more revenue- not a small concern these days. And then there are the professors themselves- they will not in any way cause damage to their standing or reputations by failing students. There is also no 'teacher's room' or all-uni meetings where pressure to pass students (for what purpose I do not know) would be applied. And office administrators do not and cannot lord it over professors on such matters.

Most university professors I've met in Japan (both J and NJ) are in fact quite at home with the idea of failing students who do not meet expectations. It's no skin off their noses (although the big disadvantage may be that the laggards might be back in your class next year). At the university level, it is understood that professors are no longer responsible for motivating these young adults (it's university after all) and therefore generally do not feel that they have been derelict in their duties should a student get a failing grade.

Personally, I have never felt any pressure whatsoever here at Miyazaki University to automatically pass students. In fact, when some dicey pass/fail situations have come into play in the past administrators have been more than supportive of the failing option. I teach part-time at a nearby liberal arts university as well and they too have a similar policy (with the exception of soon-to-graduate students who have already secured jobs).

In the MU faculty of medicine (my home base) we have a year-fail ratio of about 15-20%. By 'year-fail' I mean that students fail three courses within a certain year and thereby have to repeat that year (although they will be obliged to retake only the classes they failed, plus electives). Moreover, in their first two years, if a student fails ANY required course (and Communication English is numbered among these) they will be duly dropped a year (this can be traumatic for many students, as they tend to build quite strong bonds with year-mates). Over six years in this medical school about 90% of students will fail some individual class at some time. I fail a few each year myself. I allow that this should be the norm when you are educating future doctors. Medicine, of all faculties, should not be a walk-through.

So how do students fail? Well, attendance policies for one thing. More than three non-medical absences means an automatic zero. A total score of under 60% is the other criterion. No one in the administration will question how or why a student got under 60% (the professor's word is all that matters- it is unthinkable that any administrators, aside from the head professor's committee- the Kyouju kai, would interfere in this process).

There is a small catch though- and a good one, I think. When preliminary grades are entered into the system, those with a grade of 30-59% must be offered a chance at some type of re-test (in the case of incorrigibly bad students a 29% score will conveniently offer no further re-testing opportunities). On the whole though, re-tests are a good thing. After all, the idea of education is to help the student learn the skill, complete the tasks, master the knowledge- and if that means they get their asses in gear a little late, well, at least they will have fulfilled the basic requirements. (Of course, if the re-test consists of little more than the pithy 'writing a report', the re-testing system is meaningless.)
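For what it's worth, the rules described above boil down to a tiny decision procedure. Here's a sketch in Python (the function name and framing are mine, not the university's actual system):

```python
# Sketch of the pass/fail/re-test rules described above.
# "RETEST" means the student must be offered some form of re-test.

def grade_outcome(score, non_medical_absences):
    """Return 'PASS', 'RETEST', or 'FAIL' under the stated rules."""
    if non_medical_absences > 3:
        return "FAIL"          # more than three absences: automatic zero
    if score >= 60:
        return "PASS"          # under 60% is the other failing criterion
    if 30 <= score <= 59:
        return "RETEST"        # must be offered a chance at a re-test
    return "FAIL"              # 29% or below: no further re-testing

print(grade_outcome(75, 1))   # PASS
print(grade_outcome(45, 0))   # RETEST
print(grade_outcome(29, 0))   # FAIL
print(grade_outcome(80, 4))   # FAIL (on attendance alone)
```

Note the design point the convenient 29% expresses: the re-test window has a floor as well as a ceiling.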

And here's where testing, content, and methodology come into play. If a student sleeps through all the classes, contributes nothing, and studies nothing, there should be no way that they can achieve the necessary 60%, even with a re-test. This is not so much a moral policy as a logical one. What I mean is that the course should NOT be measured only by a singular final test based on discrete knowledge (akin, in many ways, to some entrance exams). Since education (especially at the tertiary level) should be a process- a process that involves carrying out tasks and developing specialized skills- students should be graded on the completion of these tasks and skill areas: things that are learned and practiced only in that class and cannot possibly be attained by a last-minute cramming of the textbook.

In other words, a returnee student who does nothing but easily fill in a discrete-point English test form at the end of the semester would end up getting a passing 60% for doing nothing. This would indicate that there is something wrong with the class content, methodology and grading policy (pretty much the three strikes as to what constitutes a good class). In my 1st year English Communication classes I can categorically state that it would be impossible for such a student to get 60%, because the medical discourse and related skills I teach- and they subsequently practice in process-based tasks- are NOT something they will have encountered in high school or by living/studying abroad.

As for sleeping students, that is a matter of the individual professor's responsibility and/or policy. I keep mine awake because the classes are task-based, not receptive 'lectures'. Pair and groupwork forces them into action. If they did sleep for any length of time, they simply would not know what to do and this would lead to- at the very least- two or three nasty re-tests. The students learn this very quickly (sometimes the hard way) and therefore avoid both lazy absences and sleeping.

Teachers who measure the course with a single year-end (or semester-end) test will likely not have this luxury. Students will know (from their seniors) that all they have to do is get the basic attendance, study the textbook just before the big exam, and focus on a few points that will be tested (all university students can get hold of old exams). Basically this is not only a recipe for sloppy student attitudes but pretty much a blueprint for meaningless education. If teachers prepare tests/grades this way they are basically shooting themselves in the foot. (Again, I don't know of anyone who actually admits to doing this.)

But, if passing is contingent upon actively participating in class-related tasks, learning something new and unique to the particular class, or manifesting a new skill (or best, all three of the above), then students will involve themselves accordingly. Not only that, but professors will feel that this makes their classes meaningful, that they are involved in the process of education, and not merely 'completing a course'.

In which case passing actually means something; and failing is a real option.


January 21, 2010

'Misses' and 'objectivity'

The Center Shiken (National University Entrance Exam) took place a week back and I'm sure many readers were involved at some level, most likely by proctoring. And if you were proctoring, (even if you were a back-up proctor, yes, there are benchwarmers in Japan's Center Shiken proctoring world) you will know the intricate protocols, steps, conditions, and general hoop jumping that is involved in what many might mistakenly think of as an easy process.

The key notion is of course that the Center Shiken must be fair and fully objective. That's why it is held nationwide with the same subjects being tested at the same time in over a thousand locales Japan-wide with over 500,000 students taking part. In order to maintain this integrity the surrounding system has to be airtight. Details are meticulous and must be adhered to under threat of your photo appearing in newspapers regarding a breach of Center Shiken protocol. No compromises. Nothing slipshod is allowed.

Lengthy protocol explanation sessions, complete with instructional CD ROMS, are prepared for proctors. The instruction booklet is the size of a small telephone book and, as far as I can read, contains provisions regarding appropriate actions to take if an examinee freaks out, becomes physically ill, if an alien lands in the testing room, and if an examinee suddenly morphs into The Dave Clark Five.

You know, the Japanese are generally very good with this type of thing. One old-school generalization about Japan that I hold on to is that the country is pretty risk averse and great lengths will be taken to ensure that there are no 'misses' ('miss' being the standard abbreviation for 'mistake', and the default term used in Japanese). If you've ever been involved in, or merely watched, a kindergarten or elementary school undo-kai (sports day) you can see the meticulous, orderly planning manifested in a seamless- but somewhat tense and regimented- performance. (Whether people actually ENJOY it is another matter.)

The thing is though, the more you try to avoid 'misses' by fine-tuning, tightening the screws, or devising manuals that try to cover every contingency, the more likely it is that a 'miss' will occur- precisely because you've created a huge checklist of protocols that can now go wrong. As analogies, think of pure-bred dogs and how finicky they are. Think of the guy (it's almost always a guy) who tweaks his computer to a T, but it's always malfunctioning when any new software is introduced. Think of body builders, where each muscle teeters on the brink of both 'perfection' and complete physical breakdown. The fact is, the tighter you build the foundation, and the more pieces that you use, the greater the likelihood that one piece will falter and lead the whole thing to collapse.
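If you'll forgive a little arithmetic, this intuition can be made numeric. Assuming (idealistically) that each protocol step goes wrong independently with some small probability, the chance of at least one 'miss' somewhere grows quickly with the number of steps:

```python
# The 'more pieces, more chances to fail' intuition in numbers.
# Assumes each of n protocol steps independently goes wrong with
# probability p- an idealization, but it makes the trend clear.

def p_any_miss(n_steps, p_per_step):
    """Probability that at least one of n independent steps fails."""
    return 1 - (1 - p_per_step) ** n_steps

for n in (10, 50, 200):
    print(f"{n:>3} steps at 1% each -> {p_any_miss(n, 0.01):.1%} chance of a miss")
```

Ten steps at a 1% failure rate leave you fairly safe; a telephone-book-sized manual of them makes some 'miss' close to a sure thing.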

Hence, the near-fetishistic emphasis upon 'miss' avoidance can actually induce scenarios where more misses are likely to occur. At the Center Shiken we proctors were quite tense, with almost every second accounted for and formally backed up in some way, making sure that the myriad steps were taken in precise order, with military obedience to the manual. This meant that we had to act with speed and efficiency, but it also meant that any screw-ups would lead to delays or claims from examinees of some breach of norm. And the more nervous, cluttered, and time-constrained you are, the more likely it is that a 'miss' will occur. (There was also a ubiquitous stretcher placed outside the examination area, as if to underscore the severity of it all.)

Now, here's the twist.
A miss in the test-administering protocol is considered a huge black mark. Therefore, about 95% of the pre-test information sessions and meetings focus upon the avoidance of a 'miss'. But, as an English teacher, I am more concerned about 'misses' at a larger level. Let me explain.

At the orientation sessions for teachers making the second-stage university entrance exams (NOT the Center Shiken orientation sessions) the overwhelming emphasis is also placed upon not having any 'misses' in the test. There is, in my opinion, too little emphasis placed upon producing a test that is valid and reliable. In other words, the overriding rubric is negative: "Don't have any mistakes on the test. That's all we ask". The endless fix-up and follow-up sessions are designed to make sure that no misses get through.

A big, get-called-before-a-committee mistake would be something like the following:
Match the four paraphrased sentences below with the underlined sentences (1,2,3,4) in the passage.
a.
b.
d.
e.

Although the lack of a 'c' answer should not really confuse students or cause them to answer incorrectly, this would be a huge black mark for the test makers.

Anyway, administrators usually want 'objective' style tests because objectivity, it is believed, reduces the likelihood of mistakes. So, in order to meet the heavy 'no-miss' criterion you could make discrete English language test questions like the following:
1. The Montreal Canadiens last won the Stanley Cup in [ ].
a. 1998
b. 1984
c. 1993
d. 2004

2. Hitler's [ ] regime led to the restructuring of Europe's political boundaries
a. nebulous
b. soporific
c. pernicious
d. sententious

As you can see, there are officially NO misses in the above questions. But they are clearly absolutely crap questions for an English test. (I've exaggerated the samples to make a point- I can't imagine any exam actually posing such questions, although some came close in the not-too-distant past.)

The first question does not measure English skill in any way but rather tests localized knowledge which happens to be presented in English. And even if it were accompanied by a passage containing the answer (c), it still would not be indicative of English skill, especially in terms of measuring suitability for university entrance. Also, if the answer were contained in the passage, 99.9% of the examinees would get it correct, which renders the stratifying force of the question meaningless. So, while there are technically no 'misses' in the question, it is nonetheless both invalid (it doesn't measure what an English entrance exam is supposed to be measuring) and unreliable (it's either too hard, based on chance specialist knowledge, or- if the answer is in the passage- too easy) and thus cannot have any stratifying function for placing examinees.

But it IS 'objective'. It contains no 'misses'. Also, the answers can be immediately measured numerically: 2 out of 2. Administrators love this type of thing and consider it somehow more 'objective' because the results can easily be rendered as numbers- even though these numbers basically indicate NOTHING about actual English ability. "Hey, if it's mathematical it must be objective!"

In the second example, the vocabulary choices are obviously way over the students' heads which means that if the correct answer is chosen it will almost certainly be chosen randomly (and of course a trained chimpanzee has a 25% chance of getting the correct answer on a 4-item multiple choice question).
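The chimpanzee's odds are worth spelling out. With four choices a blind guess succeeds 25% of the time, and the binomial distribution tells you what guessing yields over a whole set of such questions (the question count below is invented for illustration):

```python
import math

# Chance that pure guessing gets exactly k of n four-choice
# questions right: binomial with p = 0.25.
def p_exactly(k, n, p=0.25):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def p_at_least(k, n, p=0.25):
    """Chance of k or more correct answers by guessing alone."""
    return sum(p_exactly(i, n, p) for i in range(k, n + 1))

n = 20  # an invented question count
print(f"expected score from guessing alone: {0.25 * n:.0f} of {n}")
print(f"chance of scoring 12+ of {n} by luck: {p_at_least(12, n):.4%}")
```

In other words, guessing buys every examinee a quarter of the marks on average, which is exactly why such questions stratify nothing.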

Hey, but it is still 'objective' and contains no 'misses'--- despite the fact that it is thoroughly invalid and unreliable.

OK- I can't imagine any university entrance exam test maker making such egregious errors (in fact, in my research I have found that many second-stage entrance exams and recent Center Shiken are quite valid and reliable). But the point is that an inordinate focus upon avoiding misses and maintaining this surface, shallow notion of objectivity can obscure the bigger picture- that of making valid and reliable tests that accurately and reasonably measure a wide range of student English skills.

Questions that demand deep thinking or skills such as making inferences, reading between the lines, predicting, summarizing and so on tend to be both more complex and more nebulous than simple kigou questions (so called because they can be answered with a letter mark- a, b, c, d). This complexity or lack of clarity can often lead to what overseeing committees think of as 'misses'. Overseeing committees don't like the alleged 'subjectivity' or interpretive element that such questions demand. Hence the safety factor in making more discrete TOEIC-type questions.

I find this fear of alleged subjectivity odd. After all, as trained professionals it is precisely we who should be expected to be able to discern which students display the greatest ability on a subjective or essay-type question. By taking away the subjective evaluation element from a trained, experienced pro (who is supposed to be an expert in the field- that's why you've hired them to teach at a university), you've basically narrowed the scope of the test. You're no longer measuring extensive English skills but discrete-item knowledge. You're no longer testing English ability but knowledge about English.

An emphasis on 'no misses' at the expense of greater test validity, and an artificial sense of objectivity that in fact often reduces test reliability, mean that you've messed up the bigger picture of measuring holistic student English ability.

And that's the biggest 'miss' of all.

A QUICK FUNNY- My all-time greatest classroom mistake

A long time back, when I was new to Japan, I had a small class in which I asked the students to tell me about the Japanese person who they admired most. One of the students answered 'I admire Chiyonofuji'. At that time I had no idea who Chiyonofuji was, so I asked. "He is a small restaurant," came the reply. "No, no," I responded. "He OWNS a small restaurant or he runs a small restaurant. Not 'He IS a small restaurant'". The student looked both frustrated and amused. "But he IS a small restaurant," he insisted. A few seconds later another student spoke up. "Chiyonofuji is a sumo wrestler," he explained.

Oh (blush).
But come to think of it, some sumo wrestlers are actually like small restaurants.


March 03, 2010

Putting together a half-decent achievement test

If you work at a JHS, HS, college, senmon gakko, or university in Japan you have probably just completed several year-end or semester-end achievement tests. After all, you need grades for your students, so some kind of evaluation is required. But this is an area in which a lot of mistakes are made, and a lot of educational principles violated...

I'd like to think that testing is something I know a little about, an area that I've become at least a little sophisticated with. It was one of my specializations during my MA days as well as one of those areas in which I've kept up the research level, so I'm hoping that a few of the things I mention below might carry some weight above and beyond the 'some guy on the internet' level of credibility.

First point-
Achievement tests are not placement tests nor, usually, are they proficiency tests.
In an achievement test you are evaluating the students' course work. That means the focus of test content must be upon what students have, or were supposed to have, covered in the course. This means that any content that was not dealt with in the course should not be part of the test. It means that the skill emphasis should match the skills that you were trying to teach in your class. Test tasks should resemble those tasks which were practiced during the course. You are not gauging the students' overall English ability or general skill- which would be more representative of a placement or proficiency test- so don't try to. The test should measure a student's ability to meet the specific course goals as set out in the syllabus.

Second point-
If you are an educator the test should have an educational function.
It should have a pedagogical purpose as well as an evaluative function. Students should be learning from their tests. This means that students must know what they did right, what they did wrong and be given a chance to fix it. In other words a good achievement test has a diagnostic function. This has several administrative implications:
1. You must give the test back to the students. It belongs to them.
2. There must be some type of review or feedback for the students.
3. You shouldn't give the test in the final class or else you can't review it.
4. Students should be able to find out what the correct or model answers are.
5. Students who did poorly should be made to do a re-test, or two, until they show that they have learned the material (or skill).
6. Why not have students obtain good or correct answers on those sections where they did poorly by checking with peers? I do a 'test interview' where students ask one another those questions they didn't answer correctly and if the partner knows the proper answer, they can teach (not just 'tell') it to the other student.

Third point-
You can and should diagnose your own teaching effectiveness from the test results.
If students do poorly on the test, or on specific items on the test, it is very likely because either 1) the question, task, or entire test was invalid (the test didn't actually test what it was supposed to) or unreliable (if a similar test were given to similar students at a different time and place, scores would be very different- meaning that happenstance affected the test results, usually as a result of poor test design), or
2) you didn't teach whatever it is that you were testing well enough.
This should be telling you something. After all, tests test the teacher's effectiveness as well as the students'.

Fourth point-
You need to test more than just recognition (memory) and discrete-item knowledge.
Memory is a limited skill. Not only that, but memory is not just recognition (the most passive, receptive aspect of memory) but also recall (contextual understanding) and reproduction (application). If you were teaching a class that was expected to focus on developing productive skills but you give a test that measures only memory-recognition, you have an invalid test.

Likewise, language is not just a collection of discrete-item knowledge. It is a dynamic system that involves numerous social and pragmatic considerations. So again, if your class was expected to develop student skills in using English within meaningful and/or practical contexts, and you focus mainly (or solely) on discrete items, you will have made an invalid test, since the skills you are supposedly trying to inculcate will have escaped the net of evaluation.

Fifth point-
The test can easily be used as a study and/or review experience
Open-book tests are great. Students can once again review material and find those things that the teacher wants them to understand. Open-book test success also relies more on a general comprehensive understanding of a subject as opposed to memorizing discrete items. Of course, given that the test is open-book we should also expect standards to be high. I have come to notice that students who are well-organized and think actively succeed at these tests while the laggards who weren't paying much attention or making much of an effort all year rarely rise above their 'stations'- at least on the first test. This doesn't always happen on discrete-point knowledge-based TOEIC-type tests.

Providing students with the test tasks or questions or old exams in advance (they'll usually get them from their seniors anyway) can help too. By letting students know what to study for, you focus their energies on those things you really want to inculcate and leave less to random chance, circumstance or wasted/misguided student effort.

Sixth point-
Ongoing evaluation, especially if you are using a variety of evaluative means and measures, is more effective than the traditional 'one final paper exam' format.
Language learning is a process and so the evaluation should be process-based, focusing less on the one, final 'this-is-your-official-result' mode of testing. Using a variety of testing methods and means allows students who respond differently to different challenges to strut their stuff. Not all 'good' students are sharp at paper tests; some may do much better on a role-play, report, or some type of visual/tactile task. Ideally, using all test types you can get a panoramic view of their all-round skills, and therefore a more accurate reading of their English abilities (assuming that you are trying to educate them in a holistic way, that is).

Weighting tests is also important. Putting something like 80% on a final test might not be a good indicator of actual student ability over the entire course of the class. Breaking evaluation up into 20% increments allows for more types of evaluation and widens the range of criteria. It also tends to keep students alert and focused.
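For anyone who likes to see the arithmetic spelled out, here is a minimal sketch of such a weighted scheme. The five component names and the even 20% split are my own hypothetical example, not a prescribed formula:

```python
# Hypothetical scheme: five assessments at 20% each,
# instead of one final exam worth 80%.
WEIGHTS = {
    "paper_test": 0.20,
    "role_play": 0.20,
    "report": 0.20,
    "participation": 0.20,
    "peer_interview": 0.20,
}

def final_grade(scores):
    """Combine component scores (each 0-100) into a weighted final grade."""
    return sum(WEIGHTS[name] * score for name, score in scores.items())

grade = final_grade({
    "paper_test": 70,
    "role_play": 85,
    "report": 60,
    "participation": 90,
    "peer_interview": 75,
})
print(round(grade))  # 76
```

The point of the even split is visible in the numbers: a weak paper test (70) no longer sinks the grade the way it would at an 80% weighting.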

Seventh point-
Let students have some say in the test content
Productive, open-ended tasks are to be encouraged, as these allow for some self-expression and variety, letting students use the language while actively thinking and engaging with it. Most teachers will tell you that these tasks and problems are easier to grade- and they tend to provide a more comprehensive view of actual student abilities. Even better, allow students to make some tests themselves. This allows for a good review of content and also shows the teacher what students have learned (or not), or feel is important (or not). And what a teacher learns from this can be applied to next year's lesson plans.

I allow my students to appeal their test grades too- as long as they do so in English. If they feel that the grade on a 'subjective' test or item was unfair, they have the opportunity to explain to me why their score should be higher, a process which demands that they consider not only the test result and content but also how they will plead their case in front of me.

Reader suggestions on testing are more than welcome in the comments section.


October 26, 2010

Two items: 1. Nobel Prizes, Research and REAL WORK 2. How to avoid a test (and fail!)

Two mini-posts today…

1. Nobel prizes, the office concept, and research in Japan

Much was made in Japan of Prof. Akira Suzuki of Hokkaido Univ. being awarded the 2010 Nobel Prize in Chemistry. There is no doubt that Nobel Prizes provide a boost for national egos, even if the winner is usually more a product of individual genius than a product of that society. Oddly though, when a Japanese academic wins a Nobel prize it is usually accompanied by an equal amount of hand-wringing about shortcomings in the nation's educational and research environments.

I say 'oddly' because you'd think that achieving the ultimate academic recognition would serve as a vindication of an educational system- but not in Japan. One reason is that co-winner Ei-ichi Negishi is based at Purdue University, and has been for almost all of his research career (and he is not the first Japanese researcher who has been able to flourish abroad while being critical of the research setting in his country of birth).

The criticism is that university research institutes in Japan are static and rigid. That there is a stifling hierarchy which discourages the type of open environment necessary for innovation and success (although I would argue that most countries would like to have Japan’s –ahem- lack of academic/innovative success).

Not working in a research lab I cannot confirm all of this firsthand but the fact that even young Japanese researchers (among them some that I’ve met on my own campus) seem discouraged certainly lends some credence to the notion. But I’d like to raise another factor that inhibits the pursuit of excellence in almost all of Japanese educational institutions but is rarely mentioned as a factor....

OK. When you think of the term “Japanese worker” what comes to mind? The guy in the blue suit who sits at a cubicle (or a shared table) in a company office 8AM-8PM, right? Mr. Salaryman (or Ms. OL in the case of women). This seems to be the set model for ‘working’ in Japan. Therefore, if you are not somehow engaging in office work of some sort you are not really working.

Now you might think that primarily teachers should teach, doctors should treat patients, and researchers should do research, right? And perhaps the occasional bit of paper work might come their way for inputting grades and the like. But not in Japan.

An enormous amount of my working time, concentration, and effort is taken up by requests from various offices in the university. Elaborate questionnaires have to be filled in, meaningless committees have to write vapid reports, databases are changed and have to be re-inputted, the Student Affairs bureau wants you to keep a record of student visits to your office and the purposes thereof- I could go on and on but you get the point. It seems like almost every day the secretary comes to me with something to fill out, prepare, input, or comment on.

To be perfectly honest, I've come to feel that if I read an academic book on EFL in my office for more than 5 minutes I’m screwing around, indulging in a personal hobby. If I work on an academic paper on my computer I’m somehow cheating the university time-wise. Help! They’ve gotten to me!

I often get the impression that administrative office staff think that if we are not on our actual teaching contract hours then we aren't really working, and therefore have to fill our idle hands with some nefarious tasks to legitimize receiving our paychecks. And yes, I have heard researchers here claim the same thing- that they are always busy with 'zatsuyo' (paperwork) and thus are forced to delay the very research that the 'zatsuyo' is based upon, or work until the wee hours. The surrounding, peripheral work has supplanted the real work. It seems that the most important thing is to dance through the hoops created by someone in the office downstairs, not to produce actual research of worth. Your research could be total crap and you'd still be rewarded for it as long as you completed your online 'Research Report- reflective impressions of the allotted travel funds section' correctly. And only in 12-point MS font.

As I work next to an attached hospital (plus the fact that my wife is an MD), I know that this afflicts doctors (and nurses) too. Doctors complain of rushing patient visits in order to complete the ever-increasing pre- and post-visit paperwork demanded by the paper pervert powers in those dusty cubicles.

Maybe this is why research is usually more practical and productive at Japanese companies than at universities. The expectation inside a company seems to be that office workers do office work and the lab people stay in the lab and there are a sufficient number of clerks and secretarial go-betweens to bridge the two. Less so for universities and hospitals. Secretaries and clerks have their roles here to be sure, but the more they do on behalf of the teaching/research staff, the more the bureaus downstairs make up because- well we have to do some real work, right? And real work of course means filling in online forms and shuffling more and more papers…

2. How to avoid a test: An almost true account of where my class apparently ranks in the student life hierarchy

(Setting- My classroom with 32 2nd year English communication students)

Me: OK. Next week we’ll start the role-play tests based on what we’ve been working on over the last five weeks. You’ll be doing the role-play in pairs- 12 minutes per pair. Even numbered students will come next week, odd numbered students the week after.

Everybody: Ehhhhh!!??

Me: What do you mean, ehhhh???!!! It’s a university. We have tests here, right?

Yamada: But we have a test the day right after that in Anatomy! We have to study hard for it!

Me: Perhaps then you should ask the anatomy teacher to postpone his test- because you have an English test the day before and you have to study for that!

Watanabe: But it’s not fair because the students like me who come next week have the anatomy test as well as your test, but the students who come in two weeks don’t!

Sato: But it’s not fair for students like me who come in two weeks either!

Me: Ummm, why not Sato?

Sato: The rugby team is playing a tournament that weekend and we have practices!

Me: You don’t have practices Thursday morning, when our test is held!

Kobayashi: But we’re having a drinking party on Wednesday night to celebrate the tournament.

Me: Now why on earth did you schedule a drinking party on a weeknight?!

Hayashi: Our club seniors decided. So we have to go, and then we won't be able to study for your test. Plus it’ll be hard to get up in the morning for this class!

Me: Well that’s a choice you make. Please your seniors or get a failing grade on the test.

Suzuki: Give the test in three weeks! It’s better!

Yamamoto: No way! In three weeks the orchestra is doing a concert the day after English class and we in the orchestra have to focus on that. I may have to miss English that day anyway to set up seats in the concert hall.

Me: If I listened to you guys we would never have a test at all. Or even classes for that matter.

Setoguchi: Why don’t you do the tests in the final test season, like other teachers?

Me: Because it’s not suited to two weeks of role-play testing AND I can’t give you proper feedback. Plus, we use ongoing evaluation in English class. It's not just a pile of knowledge that we’re testing.

Abe: Yeah, Setoguchi, shut up! If we had the test in the usual testing season we couldn’t study for it anyway because we have three other tests scheduled then. So we wouldn’t be able to study for the English test at all.

Me: All right. I hear you. The only solution it seems is to do the test right here, right now in the next 30 minutes. Take out one pen and one piece of paper everyone. Here we go. This test, or should I say pop quiz, will account for 60 percent of your grade. Good luck!

Everybody: Ehhhh!!!???


January 12, 2011

Student opinions on entrance exams- a good idea but...

I noticed this item in the Daily Yomiuri on Dec. 30th (2010) about how some high schools are now including questions which allow examinees to express their opinions on entrance exams. I encourage you to read the article. Closely.

At first it is hard to argue with the intent. I have long been an advocate of avoiding discrete-item, passive, receptive test taking as being the sole determiner of entrance scores, since they capture only a small percentage of English skill and ability and, as we all know, tend to have a negative pedagogical washback. And I have long argued that most second-stage university entrance exams in Japan have moved more and more in this direction over the past decade. Essay writing, open ended writing tasks and other productive, active testing modes are now so routine that most high school and juku teachers will address these skills- obviously a good thing.

So, the fact that high schools are starting to take note and apply the same principles to their own entrance exams would seem to be cause for applause. But.. take a closer look at the article.

The main idea of this new approach is that 'independent thinking' should be encouraged and rewarded. Fine. But then in the article's test-item examples we see that 'correct answers' include very specific concepts and content (in the first example, students had to note that mankind had appeared on earth very recently, and in the second the term 'mutual assistance' had to be included in the answer).

So, hold on a second. We are asking for independent thinking, self-expression, and opinions and yet we have these very set, particular correct answers. Isn't this a contradiction?

An official from the Osaka Board of Education quoted in the article says, "These kind of questions test students' ability to choose important information, develop their own opinions and express their views intelligibly", except... the answers must include mention of specific items.

Here's the problem- it is entirely plausible that a student could address points raised in the text, write in an orderly and intelligible manner, express an opinion with merit and justify it, and still not receive due credit for not having mentioned the 'key' concepts.

In other words, if the Osaka official really wants students to choose important information, develop their own opinions, and express their views intelligibly- if these are the criteria- then you have to drop the notion of a correct answer entirely. Instead you have to evaluate essay-writing skills: Did the student actually address and understand the text? Was the response stylistically sound in rhetoric, organization, register and so on? Did the student present a meaningful opinion, and were they able to justify it?

I think I know why the test makers still want to maintain the notion of a set answer. For one thing, it makes the test papers easier to grade. Look for the keyword and if it appears, credit is given. No keywords = no credit. It also removes the dreaded notion of subjectivity in grading and the related possible charge of bias or imbalance in scoring. But arbitrarily assigning a 'correct' response to what is ostensibly an opinion-based writing task is worse than any aspect of subjective grading, as it renders the test item invalid- you are not grading what the question/task is actually asking.

And what's so bad about subjective grading anyway? Teachers do it on every classroom essay, report, or other assignment that doesn't feature fill-in-the-blanks or multiple choice (kigou) answers. We assume they can do so because they are trained professionals who, like judges, are expected to be specialists in evaluating the skills and abilities of their students. If they have no confidence in doing so on entrance exams, why are they teachers?

There's also a way to create more balance in scoring: employ two scorers for any open-ended question. Have skill criteria (general ones, not too detailed) established between the two of you and then mark separately. If the task is worth 20 points and you give one examinee a 17 while the second scorer gives a 13, you make the final total for that question a 15. That seems fair.
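The two-scorer arrangement amounts to a simple average, which can be sketched in a few lines (the function and parameter names here are mine, purely for illustration):

```python
def combined_score(scorer_a, scorer_b, max_points=20):
    """Average two independent marks for one open-ended question."""
    for s in (scorer_a, scorer_b):
        if not 0 <= s <= max_points:
            raise ValueError(f"score {s} out of range 0-{max_points}")
    return (scorer_a + scorer_b) / 2

# The example from the text: one scorer gives 17, the other 13.
print(combined_score(17, 13))  # 15.0
```

The averaging step is trivial; the real work is in agreeing on the shared criteria beforehand so that the two marks are rarely far apart in the first place.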

Finally, I have to take issue with the seemingly automatic, but unnecessary, association teachers (both Japanese and native English speakers) make between productive writing/speaking tasks and 'expressing one's opinion'. First, opinions, with all their value-laden baggage, can be difficult to grade; second, self-expression includes so much more than just giving opinions. Summarizing, narrating, predicting, creative writing, and commentary are all valid and important modes of self-expression that can also be tested. The easy fallback on 'giving your opinion' tasks fosters the unfortunate binary paradigm that if a text is not a cold hard fact it must be an opinion- or that if you are not just regurgitating facts you must/should be indulging in expressing your opinion.

Some people just don't have strong opinions on certain topics, especially when an authority figure has chosen the topic. Cultural and even personal factors can come into play here too. Some cultures and some individuals are more indirect, opaque, restrained in their approach to offering opinions. They may not be comfortable with artificially forming a clear opinion in a certain number of words on a topic not of their choice, and yet they may understand the content perfectly well and likewise be adept at self-expression. Not everybody wants to be a Glenn Beck or a Michael Moore, nor should they. Students shouldn't be punished for this.

There is much more to productive, active, intelligence-engaging self-expression tasks than 'giving my opinion' (which seems to me to be a very post-sixties American value), just as there are ways to grade such tasks without resorting to set answers.


January 27, 2011

Commentary: Keio's dropping of the Center Shiken and the (non-existent) Monkasho English 'word list'

1. Keio University drops the Center Shiken criteria for entry- Good!

Since it is exam season, and also because the aura surrounding exams is impossible to escape in Japan, I bring your attention to: this recent news item
... which informs us that prestigious Keio Univ. will drop the Center Shiken from its entrance requirements from next year.

This is, in my opinion, a good thing. I can well understand the argument made by Keio officials- that the Center Shiken did not sufficiently stratify student results, at least not enough so as to make it a meaningful or reliable indicator of suitability for entrance.

This is bound to happen of course when over 400,000 people take the exact same test. And at the higher-ranking institutions, entrance or non-entrance can be based upon a minuscule 1-point difference- hardly a reliable basis for determining whether you've got the right students, and definitely less so as a reliable measurement of intelligence or commitment.

If a university decides to use only its own 'niji shiken' (second-stage) test plus an interview as the criteria for entrance (most now apply some weighted combination of the Center Shiken plus their own 'niji'), it can more effectively streamline the procedure and judge students on their individual merit. Moreover, on a test made by Keio people, the element of anonymity would be reduced, making it more relevant to the specific goals or aims of the university.

This is not to say that there is something wrong with the content of the Center Shiken- it is quite well-written and reliable. It is simply the concept, this massive machinated mammoth that defaces the candidates and can make entrance to a specific university and department a matter of a computer spilling out numerical results somewhere in Tokyo.

Just think of the washback effect it would have on high school education if more universities chose to streamline or personalize their exams and bypass the goliath that is the Center Shiken.

2. There is no Monkasho English 'word list'. Sort of.

File this one under 'you learn something new everyday', or at my age, about once every three years.

I had long assumed, and not without good reason, that Monkasho (the Japanese Ministry of Education and A Whole Pile of Other Stuff) had a set list of English words that high school students could/should be expected to 'know' (whatever that may mean) upon graduation and in preparation for entrance exams. I had assumed this until a reader asked me to locate the list- and I couldn't. Then I started asking questions and no one seemed to know for sure- until I contacted a certain Mr. Big (not his real name, in case you were wondering) from a nearby campus.

I had assumed this because senior Japanese people around me had long made mention of a set list of words that were deemed suitable on entrance exams without a gloss. In other words, if 'catapult' or 'solenoid' appeared in your exam text (as they should!), you were pretty much required to mark them with a * and add glosses at the end. Or at least edit them in some way.

So, you might well ask, how did one know if 'catapult' or 'solenoid' were 'off-the-list' words that warranted the gloss treatment? Well, every educator worthy of his/her title in Japan has a large Shogakukan dictionary strategically placed at their right hand side (the 'Progressive' version being the closest to a standard- although Kenkyuusha is also widely used) in which words that are expected to be known at different levels of JHS and HS education were duly marked. No mark meant that we could not reasonably expect examinees to know the word.

Now, you might also well ask how the dictionaries set their asterisk criteria. This is where I had previously assumed that Monkasho had set the standard. After all Monkasho does have a required list which you can see by scrolling around on this page. But, as you will soon note, this is only a short beginner's list. A further careful reading of this Monkasho document reveals the number of words to be incrementally learned at each stage but no actual list of words. Thus, the JHS/HS teacher can use one of the 'marked' dictionaries as a reliable guideline.

But no one seems to know exactly how the compilers of the dictionaries set their standards, although it is widely believed that their choices are based upon the vast (and somewhat secretive, plus hard/expensive to obtain) Tokyo Eigo Kenkyuu (English Research) Corpus. Apparently, most of these marked items make up the bulk of the handiest reference available for such teachers and prospective examinees, this being the JACET 8000 , which is available in any bookstore that caters to dealing with entrance exams (meaning 99% of all bookstores in Japan).

So now you know. Like I didn't.

Any further insights would be appreciated- and questions welcomed.


August 04, 2011

The Problem with Numbers- Grading Follies

The passing grade for any course at my university is 60%. Actually, an initial grade of 30-59% means that we are required to give the student a re-test, while a provisional grade of under 30% disqualifies them from taking even a re-test-- an outright fail. Simple, right? Not really. A 60% (or a 30%, for that matter) can mean very different things to different people-- which is why some students of mine recently took their failing scores (under 60% as a final grade) to a university ombudsman (men?) committee for mediation. Let me explain.
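Stated mechanically, the bands above look like this. This is just a sketch of the rules as I've described them, not an official university formula:

```python
def provisional_status(score):
    """Map an initial percentage score to the outcomes described above:
    60%+ passes, 30-59% earns a re-test, under 30% is an outright fail."""
    if score >= 60:
        return "pass"
    elif score >= 30:
        return "re-test required"
    else:
        return "fail (no re-test)"

print(provisional_status(72))  # pass
print(provisional_status(45))  # re-test required
print(provisional_status(25))  # fail (no re-test)
```

Of course, as the rest of this post argues, the interesting question is not the mapping itself but what the number fed into it is supposed to represent.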

What do the numbers mean? Two approaches
One way of looking at the meaning of 60% as a pass is as follows:
If I, the teacher, feel that the student has attained the required knowledge and skills or has completed the syllabus in a manner that indicates he or she deserves a credit and should advance to the next level, that student should get 60% or higher. If not, I as a teacher, will give them a grade under 60%. In other words, I assign the grade according to whether I think they should gain a course credit and advance or not. The resultant pass or fail will never come as a surprise to me the teacher. In short the teacher, not a number, decides. The number is simply the representation of the teacher’s decision.

Another way of looking at 60% is in a thoroughly numerical (which should never be confused as 'objective') way. If the student has attained a 60% average on all assessments and tests, then that student, by virtue of having achieved that number, should logically pass with a cumulative 60% or better final grade. This is the opposite of the manner described earlier. In such cases, the teacher simply calculates all the relevant scores and the math does the rest. The teacher may be surprised by the results- “I didn’t think student X would be good enough to pass but his total is actually 63% so I’m obliged to pass him”.

Matching grading with teaching methods
Both methods can make sense depending upon the content of your teaching, the teaching methodology employed in your classroom, and the types of assessment employed. If your course consists largely of transacting discrete points, and your assessment requires that students know these as discrete facts, then it is easy to establish a baseline of 100% and discover whether your students have retained 60% of what they should know (note- this requires that your assessment be comprehensive, not a random sampling of those discrete points). Most standardized testing forms, such as TOEIC and TOEFL, take a similar approach, although these are obviously not (in most cases at least) course exit (read: achievement) tests, but proficiency tests. The Center Shiken, more of a placement test than anything, is another example where this works.

However, most ESL/EFL classroom pedagogy isn’t, or *shouldn’t be*, of this type. Something as holistic, organic and dynamic as language skills in a full classroom course can rarely be reduced to a set of discrete items and attempts to do so on a test will almost certainly put your assessment validity in question. If you are trying to develop holistic student skills and competencies you can’t break this down into meaningful numerical values very easily (see more on ‘analytic scoring’ in the footnotes). When you assign a number value to some classroom achievement assessment it will likely be holistic and somewhat subjective. It should never be a surprise to you. You’ll never say: “Wow! She got a 20 out of 30 on the role-play! Obviously I think she’s good enough to pass!” You gave the number *after* you decided that her performance merited a passing grade!

My Josef K. experience- the Tribunal
So what happened in my case with the students going to the student affairs ombudsman committee? I was thereafter asked to meet with the three senior professors who make up the committee, in a regular classroom. It had the air of a war crimes tribunal at first glance- the three professors sitting up at the front on the raised platform and me before them, like a defendant, down at a student's desk. Why it couldn't have taken place at a regular seminar table in an office or small classroom I don't know. However, the attitudes of the three professors were not tribunal-like. I knew two of the professors quite well and the mood was reasonably friendly, almost apologetic.

The problem was this- the students had received scores of over 60% on my two main class assessments but were still required to do a re-test (my decision). Then, although they scored more than 60% (in the 70’s actually) on the re-test I deemed them unfit for passing and gave them a failing grade (there were other factors involved but for the sake of student privacy I will not divulge those here).

So, now you might be saying to yourself that the students’ confusion seems perfectly plausible. After all, if they got over 60% on three decisive evaluations and the course passing grade is 60% then what’s the problem? One of the senior professors (the one I was not very familiar with) echoed this proposition.

The argument from the defendant
First let’s look at my ‘paper’ test. It’s an open book and open note/open handout test (see the appendix below for a justification of this practice). Not only that, but students can look at previous (last year’s) tests in advance (not during the test though) and know that this year’s will be at least somewhat similar. I also give a full preparation class one week in advance where I tell students exactly where the test focus will be (including specific textbook page references). I make sure that everything that appears on the test was explicitly covered and even emphasized in the classes. As a result I expect 90% correct. That doesn’t seem unreasonable. Given all these advantages getting a mere 65% is just sloppy.

Obviously test preparation, validity and difficulty will affect the grade. A spot test on esoteric items demanding only memorization skills might make for a more meaningful ‘60% as pass’ criterion but pedagogically speaking, that doesn’t sound like a good achievement test (more like a proficiency or placement test).

Then there’s the role-play assessment. Again, students have two classes to prepare theirs, including time for checking with partners, peers and myself, plus revisions. And with three heads working together, I expect that some thought should go into it, and some practice too. On top of that, we all know that a poor student can be pulled up by being in a group with strong students. So if a student can’t pull off an 80% here, he or she needs to do some extra work.

And then there’s the re-test. Given everything I’ve written above about the other assessments, plus the fact that I always have a test follow-up review lesson with feedback, where students who did poorly can check their answers with those who did well--- well, there is no reason I should expect anything under 90% on the re-test (for justification of re-testing, see also my footnotes below).

So if students score 70% on the paper test and 70% on the role-play, they haven’t shown me enough to gain a credit yet. And if they still get only 70% on the re-test I start to throw my hands up in the air (especially when I am confident that the student’s problem is not one of basic English skill or comprehension- which among national university medical students it never is).

The verdict
So, what was the committee’s response? Two of the senior professors understood my logic regarding numbers and leaned towards the “Give ‘em the see-you-next-year boot” option, giving me the big benefit of any doubt. The other professor though seemed to struggle with the numbers. Wasn’t my 60% an objective measurement? (No. It was my subjective judgment based on my qualifications and experience as a teacher). I mentioned that perhaps I should make my paper test so difficult that 60% would be an achievement but that wouldn’t be sound pedagogically. I mentioned that I could simply adjust my numbers accordingly and give scores like 40% and 50% for role plays that I was not satisfied with, but numerically that just seems harsh. Either way, I reiterated that a final 60%+ grade from the teacher was never meant to be a composite of in-class scoring but a reflection of the fact that in my estimation the student had passed the course, completing the requirements to my satisfaction.

Citing that the students in question may have been confused about this criterion (as he was), he recommended that I give them a re-re-test. So, to make everyone happy, I did--- and eventually passed them, but not until I was fully satisfied that they had achieved what was necessary to gain the credit.

Appendix 1: The brilliant and sudden transformation of students in peril
It is interesting to see how students react when told they will fail. Remember, these are the same students who, during the regular classes, thought nothing of missing the maximum number of classes, spoke mostly in Japanese during pair or group work or did their best to ignore other members, carried out the language activities with all the reluctance of Lindsay Lohan being sent off to Sunday School, and spent most of their in-class energy concentrating on pulling stray bits of rubber off of those shower sandals they always seem to wear. But when a repeat year looms, suddenly postures straighten, formal letters or speeches of extreme regret are proffered, and the suits- well, the suits remind me of what bad guys wear in court to make an impression on the jury (Hey! If I can afford a Brooks Bros. get-up I must be OK!). Entire sports clubs may visit your office to vouch for their man (very rarely a woman). Those deep bows that we usually associate with Japanese securities company presidents whose secretaries 'unknowingly' sold huge amounts of stock just before the market plunged are performed. Sometimes even tears make a guest appearance. But giving students a 'get out of repeat year free' card at this point makes gaining a credit appear to be a matter of showing good 'hansei' (self-reflection) form, and has little or nothing to do with the larger educational picture.

Appendix 2: Analytical versus holistic scoring
Let me add a bit here about analytical vs. holistic scoring. Holistic scoring is when you look at the student's entire performance first and immediately, some might say instinctively, give it a grade. You can later break down your score and justify it skill by skill, function by function, but basically you're grading the whole picture. If possible it is good to have two raters in such cases to avoid excessive subjectivity.

With analytic grading you set your criteria beforehand, a rubric so to speak, and then assign a score for each item in the rubric. The final score is a composite of the item scores. The problem here is creating a valid rubric: should all items have the same weight or value? Has something been forgotten? Are there issues of competency that fall through the mesh of even the most detailed rubrics? And there's still the problem of subjectivity, since each rubric item grade is based on the whim of the teacher, or his/her understanding of what that item means and how success in that particular item might be manifested in the test.
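To make the analytic approach concrete, here is a minimal sketch of how a weighted rubric composite might be computed. The rubric items, weights, and scores below are invented for illustration, not taken from any actual assessment of mine:

```python
# Hypothetical analytic-scoring sketch: each rubric item gets its own
# score (0-100) and a weight, and the final grade is the weighted mean.

def composite_score(scores, weights):
    """Combine per-item rubric scores into a single weighted grade."""
    if set(scores) != set(weights):
        raise ValueError("every rubric item needs both a score and a weight")
    total_weight = sum(weights.values())
    return sum(scores[item] * weights[item] for item in scores) / total_weight

# A made-up three-item rubric for a role-play, with fluency counted double:
weights = {"fluency": 2, "accuracy": 1, "task completion": 1}
scores = {"fluency": 80, "accuracy": 60, "task completion": 70}

print(composite_score(scores, weights))  # (80*2 + 60 + 70) / 4 = 72.5
```

Note that the weights themselves are a judgment call, which is exactly the validity problem described above: the arithmetic is objective, but the rubric it runs on is not.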

As you can probably guess by now- I'm a holistic grader.

Appendix 3: How and why re-testing can work
I have no problem with re-tests. But I never give them as punishment. I always give them to help the student master whatever it is they are having trouble with. The goal is for the students to learn what they need before going on to the next level. It serves a remedial function and works on the same playground principle that if your kid falls off the monkey bars, the best thing you can do is help them get right back up there again until they gain confidence and competence.

By the way, my re-tests include 90% of the same questions or tasks as the main test. Why? Because that initial test contained exactly what I wanted them to know or be able to do. Making up a whole new set of questions or tasks would imply that the class had in fact focused on skills or knowledge other than those covered in the first test, making it less valid. Making up entirely new items may be OK for math tests and English placement tests, but not for a course-exit achievement test.

Appendix 4: How and why open book, open note testing works
Almost all my testing is open-book, open-note testing. The rationale is that I want to avoid emphasizing memory as the sole or major learning skill, and to keep it from becoming the operative determiner of a student's grade. Open-book tests allow for a consolidation and review of what we have been studying or practicing thus far. Organizational skills, good note-taking, and the ability to join the strands of knowledge into a larger whole are rewarded. These are academic skills and provide a good framework for developing (or at least enhancing) learner autonomy.

Two items I don't allow at the test site are old tests (from their seniors) where answers may be copied wholly with all the mental effort of single-celled organisms, and dictionaries, which can lead students astray as well as leading to simply copying down definitions. In short, I want my tests to be a part of the learning process, to maximize student understanding.

Comments? Here are some things I'm interested in hearing about-- What are the re-test and final failure grades at your institution? How do your students respond to re-tests or failing? On what basis do you decide the passes or fails? Do you do re-tests and open-book testing? Why? Why not?


November 29, 2011

On Demo Lessons, English Contests, Heroism, Coddling, and Mental Illness

A number of issues to discuss today.

1. English Teacher as Hero?

Let me start by suggesting that you watch this life-affirming, heartwarming video showing an 8-month-old baby boy with a cochlear implant hearing his mother's voice for the first time. I'm linking this because one of the doctor/professors at my university (one whom I know quite well, having helped with his English publications and played golf with him) played a pivotal role in the development of this device. The man's a hero.

There's a part of me, of everybody I assume, that wants to be a hero too-- something that would let you look back on your life and say that you contributed to humanity, so that you will be fondly remembered. Actually, I'd settle for being one of those veteran teachers who receive a batch of flowers, a teary speech of thanks from the graduates, and a Sensei-with-the-students memorial slide show at the annual year-end Thank You party.

But this always happens to someone else. I'm jealous. And even though these affairs are inevitably maudlin and a bit contrived, sometimes I want to be that special teacher who the students hold dearly in their hearts-- who they refer to as an inspiration later when they are inventing, oh, even better cochlear implants.

But the reality is that teachers who try too hard to be loved by their students can often be seen as saps, pushovers-- 'pashiri' in Japanese. It's a bit like that overly needy guy at the singles bar-- the eligible ladies can smell need like an investment banker can smell an unearned bonus. And, yes, sometimes the most feted teachers have reputations as hardasses.

And while the teacher who gains plaudits has often done something way, way, way above and beyond the call of duty, a real self-sacrifice of time and effort for his or her charges, it is also possible that even if you do go all out you may still earn little more recognition than an o-tsukare-sama from one of your peers.

While I think I am generally quite liked by my students (knock on wood) I just can't imagine myself being a life-changing force for them. Correct me if you think I'm wrong, but there seems to be little that an English teacher can do (at least with university students) to become that hit movie-inspiring catalyst; the To Sir With Love type of mentor. Perhaps English teachers shouldn't strive to be heroes but merely aim at doing a good, solid 9-to-5 job and have no expectations beyond a basic appreciation from the students (and a half-decent salary).

But what I'm wondering is-- have any readers been, or seen, English teachers lauded as heroes by students? How and why? I'm curious.

2. Demonstration Lessons and American Idol

Ok, admit it. You have watched American Idol, even though it is to music appreciation what Greece is to fiscal responsibility. Since the candidates are given about 15 seconds to strut their stuff, the talented ones are pretty much required to indulge in a bout of vocal histrionics the whole time to show range and, I suppose, 'soul' (even if the tune would be more effective sung in a near monotone- I'm still waiting for some Celine Dion-esque diva to cover 'Autobahn'). It's basically a display of surface showmanship designed to impress celebrity judges, and is hardly indicative of what being a fully-fledged 'vocalist' entails.

This reminds me a bit of English class demonstration lessons (which fortunately, we are not required to do here at Miyadai since we don't have to actively recruit, being a national university and all). The problem with demonstration lessons is that you are expected to do an appealing, representative, and educationally sound lesson-- but in 20 minutes, and with a bunch of students who don't know you, the school, nor each other.

Now, generally speaking, one's best lessons tend to be those that have the following properties:
1. The lesson is connected to the one before and will connect to the one after. It fits naturally into the overall curriculum and stated purpose of the course.

2. There is a balance between teacher talk and student talk.

3. There has been sufficient introduction, presentation, or other groundwork laid before the meatiest part of the lesson-- the main task for the students-- is introduced.

4. As mentioned earlier, the students are at ease with the teacher and with each other. And the teacher knows what the students' abilities are, as well as what they have or haven't studied previously.

5. There is at least 60 minutes to pace and flesh the lesson out, especially to reinforce key teaching points at the end.

And yet none of these conditions is available in the standard 20-minute song-and-dance demonstration lesson.

So, my question to those readers who do demos is-- How exactly do you manage it?

3. English Contests in Japan-- And who should really be eligible?

As most readers know, in Japan there are numerous English speech or debate contests. Theoretically, any student enrolled at a Japanese school can enter (am I right?).

So what about Pete? Pete is Canadian and has been in Japan only two years as his parents have been temporarily placed in the Nagoya office. He is, in every sense, a native English speaker. If Pete enters the contest would it demotivate other students? Does it somehow detract from the meaning or purpose of the competition? So, do you rule Pete out? If so, on what grounds?

Then what about Tatianna? She's from Poland and has been in Japan for six years, but has a pretty good facility with English due to her family's past and some education in Poland, not to mention that her father's international business is conducted in English. But she's not a native speaker, so should she be eligible? If you were a judge and you saw her Western face, would you judge her more harshly even though she's not really a native English speaker?

Would you judge her more harshly than you would Ryo? Ryo is as Japanese as miso soup but he spent six years in the U.S. so his English is pretty close to native. Other students might feel disadvantaged by Ryo's appearance in the contest given his lengthy sojourn abroad, but it would be hard to disqualify him. Or would it?

Then what about Izumi? Izumi's case will dovetail with many Uni-files readers', I imagine. Izumi is half-Japanese half-whatever, and of course a Japanese citizen, and has grown up almost exclusively in Japan. However Izumi speaks English to her Australian father at home so her English is native-like. And she looks more Western than Asian. Izumi has an advantage to be sure... but is it an unfair one?

Is it any more unfair than the student who excels in science contests in no small part due to the fact that her mother is a Professor of Biochemistry at a prestigious university?

If you were a judge, would you treat all of these contestants equally and objectively? And if not, shouldn't we tell the contestants who might not get equal treatment that they shouldn't waste their time because they have no chance of winning from the outset?

I understand how a judge might think it's unfair for Pete to compete against your regular Yusuke or Sayuri in an English speech contest but where and how would you draw the line for participation and equal assessment? I can understand that it might feel 'unfair' or against the spirit of the competition if Pete wins the English speech contest, Tatianna is 2nd, Izumi 3rd and all others, your regular Yusukes and Sayuris, just also-rans. And it might further foster the notion that 'English is for foreigners'.

But I'd like to know how you would handle this...because otherwise we might be wasting Pete, Tatianna, and Izumi's time and effort.

4. Mental illness? Anti-social? Or just weak-willed?

We've all come across students who appear to have mental disorders and, in some cases, clinically confirmed mental disorders. The big question is, how do you handle this in terms of grading and credits?

In some cases, you don't have to. The student with the disorder may be as intellectually capable and hard-working as anyone else in the class, and their effort and test grades end up reflecting this. At the other extreme, students who display full-blown psychosis and simply can't function properly probably shouldn't be in class and need more intensive treatment. But I'm talking about that middle ground.

You know, someone suffering from diagnosed depression or PTSD that is affecting performance. Do we cut them some slack in grading their performance? Or, while remaining considerate of their situations, are we bound to grade only the actual class performance, regardless of external factors, because anything else would be unfair to the other students, whose grades are connected only to performance and not to personal issues? And if we choose to fail the afflicted student, shouldn't we be worried about the adverse effect this will have on their already fragile state?

The choice to fail, or at least defer a passing grade, might seem callous, but if we make allowances for students with depression, we can start making them for any number of students in the class: the anti-social, the impossibly shy, the permanently sleepy, the perpetually bored. After all, it is arguable that they too are suffering from some disorder, even if it hasn't been clinically diagnosed. Mental disorders exist on a continuum-- having had a doctor check yours off a list doesn't make it any more real than the problems of the person who never thinks to visit the psych ward.

Claiming some sort of exemption due to depression could become a convenient excuse. Even if the disorder has been clinically diagnosed, well, that may not mean much. These days the mere suggestion that you feel depressed is often sufficient to draw a get-out-of-work letter and/or meds from psychiatrists (I read Jon Ronson's The Psychopath Test recently on these matters-- I also know from Japanese doctors that this practice is much more common in Japan than it used to be). The problem is that since Fred feels depressed (as we all do at times), gets an official diagnosis and medication, we feel like we should go easy on him-- while Betty, who might have the same degree of depression as Fred, simply toughs it out and goes on with her work, home life, and social life despite how much of a struggle it all is. But we don't treat Betty with the kid gloves-- nor is she asking for them.

This raises another issue for me-- should depression be an excuse for rude and anti-social acts? Should we look the other way when students with diagnosed depression walk into class 30 minutes late, immediately put their heads down on their desks, are unresponsive to the teacher or peers, and leave whenever they feel like it because, hey, they're depressed, dammit?

It seems to me that depression should never be an excuse for anti-social or just plain rude, inconsiderate behaviour-- the pathology of being a sociopath is hardly a standard by-product of depression. The depressive is rarely psychotic and so can still judge the merits of their own actions. You and I both know enough people who have suffered from quite severe mental illnesses who still maintain a certain amount of social grace and persevere with duties and requirements even though they feel like zombies. (And yes, I've been subject to extreme changes where my spirit seems to be running out of my hands like water, where the real world almost appears like an apparition, and death and life do not seem so distinct-- thankfully much less so now than when I was younger).

So, the question once again is, how do you deal with students with diagnosed mental disorders?

5. Is it coddling?

As some readers may know, I advocate giving students as much information, help, detailed outlining, and guidance as possible before they do tests or graded assignments-- with the goal of (hopefully) helping them produce the best possible result. This includes giving them successful old tests or assignments to look at, lists of textbook pages for study, graphic outlines of what I expect them to do, practice runs, prep classes, etc.

But, after a recent presentation in which I mentioned this approach, one attendee suggested that this might be coddling students too much. This seems to me to be a reasonable argument-- that by giving them too many preparatory pointers I may actually be making them more dependent on the teacher, inhibiting the development of their autonomy, and not letting them use their own academic study skills to work things out.

So, the question (yet again) is... where do you stand on this?

