
The Uni-Files

A candid look at EFL life and lessons from a university teacher's perspective.

August 04, 2011

The Problem with Numbers- Grading Follies

The passing grade for any course at my university is 60%. Actually, an initial grade of 30-59% means that we are required to give them a re-test but a provisional grade of under 30% disqualifies them from taking even a re-test-- an outright fail. Simple right? Not really. The meaning of 60% (or 30% for that matter) can mean very different things to different people-- which is why some students of mine took their failing scores (under 60% as a final final) to a university ombudsman (men?) committee for mediation recently. Let me explain.

What do the numbers mean? Two approaches
One way of looking at the meaning of 60% as a pass is as follows:
If I, the teacher, feel that the student has attained the required knowledge and skills or has completed the syllabus in a manner that indicates he or she deserves a credit and should advance to the next level, that student should get 60% or higher. If not, I as a teacher, will give them a grade under 60%. In other words, I assign the grade according to whether I think they should gain a course credit and advance or not. The resultant pass or fail will never come as a surprise to me the teacher. In short the teacher, not a number, decides. The number is simply the representation of the teacher’s decision.

Another way of looking at 60% is in a thoroughly numerical (which should never be confused with 'objective') way. If the student has attained a 60% average on all assessments and tests, then that student, by virtue of having achieved that number, should logically pass with a cumulative 60% or better final grade. This is the opposite of the manner described earlier. In such cases, the teacher simply calculates all the relevant scores and the math does the rest. The teacher may be surprised by the results- “I didn’t think student X would be good enough to pass but his total is actually 63% so I’m obliged to pass him”.
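
To make this second, purely numerical approach concrete, here is a minimal sketch in Python. It uses the thresholds described at the top of this post (60% to pass, 30-59% for a re-test, under 30% an outright fail). The function names and the equal weighting of every assessment are assumptions made only for illustration, not a description of any real grading system.

```python
def average(scores):
    """Plain unweighted mean of percentage scores (equal weighting is an assumption)."""
    return sum(scores) / len(scores)


def numerical_outcome(scores):
    """Map an average onto the three outcomes: pass, re-test, or outright fail."""
    total = average(scores)
    if total >= 60:
        return "pass"
    if total >= 30:
        return "re-test"
    return "fail"


# Three hypothetical assessment scores whose average is 63.0
print(numerical_outcome([63, 70, 56]))
```

Notice that the teacher's judgment appears nowhere in the code: the number decides, which is exactly why the result can come as a surprise.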

Matching grading with teaching methods
Both methods can make sense depending upon the content of your teaching, the teaching methodology employed in your classroom, and the types of assessment employed. If your course consists largely of transacting discrete points and your assessment requires that students know these as discrete facts then it is easy to establish a baseline of 100% and discover whether your students have retained 60% of what they should know (note- this requires that your assessment be comprehensive and not randomized regarding all these discrete points). Most standardized testing forms, such as TOEIC and TOEFL, take a similar approach although these are obviously not (in most cases at least) course exit (read: achievement) tests, but proficiency tests. The Center Shiken, more of a placement test than anything, is another such example where this works.

However, most ESL/EFL classroom pedagogy isn’t, or *shouldn’t be*, of this type. Something as holistic, organic and dynamic as language skills in a full classroom course can rarely be reduced to a set of discrete items and attempts to do so on a test will almost certainly put your assessment validity in question. If you are trying to develop holistic student skills and competencies you can’t break this down into meaningful numerical values very easily (see more on ‘analytic scoring’ in Appendix 2 below). When you assign a number value to some classroom achievement assessment it will likely be holistic and somewhat subjective. It should never be a surprise to you. You’ll never say: “Wow! She got a 20 out of 30 on the role-play! Obviously I think she’s good enough to pass!” You gave the number *after* you decided that her performance merited a passing grade!

My Josef K. experience- the Tribunal
So what happened in my case with the students going to the student affairs ombudsman committee? I was thereafter asked to meet with three senior professors who make up the committee, in a regular classroom. It had the air of a war crimes tribunal upon first glance- the three professors sitting up at the front on the raised platform and me before them, like a defendant, down at a student’s desk. Why it couldn’t have taken place at a regular seminar table in an office or small classroom I don’t know. However, the attitudes of the three professors were not tribunal-like. I knew two of the professors quite well and the mood was reasonably friendly and almost apologetic.

The problem was this- the students had received scores of over 60% on my two main class assessments but were still required to do a re-test (my decision). Then, although they scored more than 60% (in the 70’s actually) on the re-test I deemed them unfit for passing and gave them a failing grade (there were other factors involved but for the sake of student privacy I will not divulge those here).

So, now you might be saying to yourself that the students’ confusion seems perfectly plausible. After all, if they got over 60% on three decisive evaluations and the course passing grade is 60% then what’s the problem? One of the senior professors (the one I was not very familiar with) echoed this proposition.

The argument from the defendant
First let’s look at my ‘paper’ test. It’s an open book and open note/open handout test (see the appendix below for a justification of this practice). Not only that, but students can look at previous (last year’s) tests in advance (not during the test though) and know that this year’s will be at least somewhat similar. I also give a full preparation class one week in advance where I tell students exactly where the test focus will be (including specific textbook page references). I make sure that everything that appears on the test was explicitly covered and even emphasized in the classes. As a result I expect 90% correct. That doesn’t seem unreasonable. Given all these advantages getting a mere 65% is just sloppy.

Obviously test preparation, validity and difficulty will affect the grade. A spot test on esoteric items demanding only memorization skills might make for a more meaningful ‘60% as pass’ criterion but pedagogically speaking, that doesn’t sound like a good achievement test (more like a proficiency or placement test).

Then there’s the role-play assessment. Again, students have two classes to prepare theirs, including time for checking with partners, peers and myself, plus revisions. And with three heads working together I expect that some thought should go into it-- and some practice too. On top of that, we all know that a poor student can be pulled up by being in a group with strong students. So if a student can’t pull off an 80% here, he or she needs to do some extra work.

And then there’s the re-test. Given everything I’ve written above about the other assessments plus the fact that I always have a test-follow up review lesson with feedback where students who did poorly can check their answers with those who did well--- well there is no reason I should expect anything under 90% on the re-test (for justification of re-testing see also Appendix 3 below).

So if students score 70% on the paper test and 70% on the role-play, they haven’t shown me enough to gain a credit yet. And if they still get only 70% on the re-test I start to throw my hands up in the air (especially when I am confident that the student’s problem is not one of basic English skill or comprehension- which among national university medical students it never is).

The verdict
So, what was the committee’s response? Two of the senior professors understood my logic regarding numbers and leaned towards the “Give ‘em the see-you-next-year boot” option, giving me the big benefit of any doubt. The other professor though seemed to struggle with the numbers. Wasn’t my 60% an objective measurement? (No. It was my subjective judgment based on my qualifications and experience as a teacher). I mentioned that perhaps I should make my paper test so difficult that 60% would be an achievement but that wouldn’t be sound pedagogically. I mentioned that I could simply adjust my numbers accordingly and give scores like 40% and 50% for role plays that I was not satisfied with, but numerically that just seems harsh. Either way, I reiterated that a final 60%+ grade from the teacher was never meant to be a composite of in-class scoring but a reflection of the fact that in my estimation the student had passed the course, completing the requirements to my satisfaction.

Citing that the students in question may have been confused about this criterion (as he was) he recommended that I give them a re-re-test. So, to make everyone happy, I did--- and eventually passed them, but not until I was fully satisfied that they had achieved what was necessary to gain the credit.

Appendix 1: The brilliant and sudden transformation of students in peril
It is interesting to see how students react when told they will fail. Remember, these are the same students who, during the regular classes, thought nothing of missing the maximum number of classes, spoke mostly in Japanese during pair or group work or did their best to ignore other members, carried out the language activities with all the reluctance of Lindsay Lohan being sent off to Sunday School, and spent most of their in-class energy concentrating on pulling stray bits of rubber off of those shower sandals they always seem to wear. But when a repeat year looms suddenly postures straighten, formal letters or speeches of extreme regret are proffered and the suits- well the suits remind me of what bad guys wear in court to make an impression on the jury (Hey! If I can afford a Brooks Bros. get up I must be OK!). Entire sports clubs may visit your office to vouch for their man (very rarely a woman). Those deep bows that we usually associate with Japanese securities company presidents whose secretaries 'unknowingly' sold huge amounts of stock just before the market plunged are performed. Sometimes even tears make a guest appearance. But giving them a 'get out of repeat year free' card at this point makes gaining a credit appear to be a matter of showing good 'hansei' (self-reflection) form and has little or nothing to do with the larger educational picture.

Appendix 2: Analytical versus holistic scoring
Let me add a bit here about analytical vs. holistic scoring. Holistic scoring is when you look at the students' entire performance first and immediately, some might say instinctively, give it a grade. You can later break down your score and justify it skill by skill, function by function but basically you're giving the whole picture scenario. If possible it is good to have two raters in such cases to avoid excessive subjectivity.

With analytic grading you set your criteria beforehand, a rubric so to speak, and then assign scores for each item in the rubric. The final score is a composite of each item's score. The problem here is creating a valid rubric- should they all have the same weight/value? Has something been forgotten? Are there issues of competency that fall through the mesh of most detailed rubrics? And there's still the problem of subjectivity- since each rubric item grade is based on the whim of the teacher or his/her understanding of what that item means and how success in that particular item might be manifested in the test.
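
As a rough illustration of what 'a composite of each item's score' means in practice, here is a minimal sketch in Python. The rubric items and their weights are entirely hypothetical, invented for this example; choosing them is precisely the validity problem described above.

```python
# Hypothetical rubric: the item names and weights are invented for illustration.
RUBRIC_WEIGHTS = {
    "fluency": 4,
    "accuracy": 3,
    "interaction": 3,
}


def analytic_score(item_scores, weights=RUBRIC_WEIGHTS):
    """Weighted composite of per-item percentage scores."""
    total_weight = sum(weights.values())
    weighted_sum = sum(item_scores[item] * w for item, w in weights.items())
    return weighted_sum / total_weight


# One student's role-play, scored item by item (each score is still a judgment call)
role_play = {"fluency": 70, "accuracy": 80, "interaction": 60}
print(analytic_score(role_play))
```

Changing the weights changes the final number without the performance itself changing at all, which is one way of seeing that the rubric does not remove subjectivity but only relocates it.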

As you can probably guess by now- I'm a holistic grader.

Appendix 3: How and why re-testing can work
I have no problem with re-tests. But I never give them as punishment. I always give them to help the student master whatever it is they are having trouble with. The goal is for the students to learn what they need before going on to the next level. It serves a remedial function and works on the same playground principle that if your kid falls off the monkey bars the best thing you can do is help him/her get right back up there again until they gain confidence and competence.

By the way, my re-tests include 90% of the same questions or tasks as the main test. Why? Because that initial test contained exactly what I wanted them to know or be able to do. Making up a whole new set of questions or tasks would imply that the class in fact focused upon other skills/knowledge than was covered in the first test- making it less valid. Making up entirely new items may be OK with Math tests and English placement tests but not with a course exiting achievement test.

Appendix 4: How and why open book, open note testing works
Almost all my testing is open-book, open-note testing. The rationale is that I want to avoid emphasizing memory as the sole or major learning skill, and thereby making it the operative determiner of a student's grade. Open-book tests allow for a consolidation and review of what we have been studying or practicing thus far. Organizational skills, good note-taking, and the ability to join the strands of knowledge into a larger whole are rewarded. These are academic skills and provide a good framework for developing (or at least enhancing) learner autonomy.

Two items I don't allow at the test site are old tests (from their seniors) where answers may be copied wholly with all the mental effort of single-celled organisms, and dictionaries, which can lead students astray as well as leading to simply copying down definitions. In short, I want my tests to be a part of the learning process, to maximize student understanding.

Comments? Here are some things I'm interested in hearing about-- What are the re-test and final failure grades at your institution? How do your students respond to re-tests or failing? On what basis do you decide the passes or fails? Do you do re-tests and open-book testing? Why? Why not?

Good one, Mike. Responding to some of your questions and some other comments.
Yes, I've had an open-book, open-note test policy for quite a few years now. For the same reasons as you wrote. My students also have a pretty good idea what's going to be on the test and know that they can't complete the test properly without prior preparation for it, and without having organised their notes into a kind of portfolio with their opinions, responses, experiences. I have continued to do this because students perform well, do prepare appropriately, and respond that although it forces them to do more than for other tests they feel they learn more.
Re re-tests, generally my uni is open to it, and encourages it for 4th year students, but I strongly believe in the students' RIGHT to fail, and they support me on this. Have never had pressure to give a re-test I didn't want to.
My bottom line about failing is that I would be prepared to give re-tests to anyone who doesn't know WHY they have EARNED an F. Your students - national university, I know, are very different to mine - small private women's university. If one of my students complained about a grade, it's not because she is trying to push the system, but because she truly doesn't know why she didn't get a better grade. I believe that is the responsibility of the teacher.
I wonder if you have a difference between purely Pass/Fail classes, and letter grade classes. It seems to me that your principle of 90% or more on the test would be best applied in a Pass/Fail type class. I'm sometimes surprised when my students don't do as well as I expect on my tests, and sometimes the reason is clear and simple - they forgot to bring their notes, or they came late for the test, or they just didn't do the preparation and were happy if they just passed the class, meaning they were happy to accept a C. That is their right, if it's a letter grade class. So I fulfil it.
Re the sudden transformation, similar to prisoners on death row transforming into remorseful, ideal prisoners. While I do not believe in the death penalty, it seems to me that some people will not do the right thing or change until they're faced with these drastic consequences of their actions.

Thanks for dropping by Dexter.
The 90% criterion I hold depends on the particular test or assessment. If I've done this type of assessment enough in the past I will have a very good idea as to what a good score will be, how easy or difficult the tasks/questions are.

With very new tests I will wait until marking to establish a re-test criterion. If about 90% of the students get an 80%+ but there are 4 or 5 students hovering in the 75 and under range, the latter will be the ones I call for revision- wherever that big break falls. If too many students have low numbers then my test/grading criteria or expectations have to be called into question.
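
For anyone who wants to see the 'big break' idea mechanically, here is a minimal sketch in Python. Treating the single largest gap between adjacent sorted scores as the break is an assumption made for this example, not a rule; the student labels and scores are invented.

```python
def retest_group(scores):
    """Return the students who fall below the largest gap in the sorted scores.

    `scores` maps a student label to a percentage. Using the single largest
    gap as "the big break" is an assumption made for illustration.
    """
    ranked = sorted(scores.items(), key=lambda pair: pair[1])
    if len(ranked) < 2:
        return []
    # gaps[i] is the distance between ranked[i] and ranked[i + 1]
    gaps = [ranked[i + 1][1] - ranked[i][1] for i in range(len(ranked) - 1)]
    cut = gaps.index(max(gaps))
    return [name for name, _ in ranked[: cut + 1]]


# Invented scores: most of the class clusters above 80, two students sit lower
scores = {"A": 92, "B": 88, "C": 85, "D": 84, "E": 72, "F": 70}
print(retest_group(scores))
```

A sketch like this still needs a human check: if the largest gap happens to sit near the top of the class, it would call almost everyone in, which is the point at which the test or the expectations, rather than the students, should be questioned.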

However, the form and content of my re-tests have changed drastically over the years. My MO now is to have each of the struggling students come to my office for a one-on-one and go over what went wrong until I am confident that they have understood it or have shown a greater grasp/competency with the skill I was measuring or trying to develop. The students seem to respond very positively to this type of one-on-one discussion, seeing it (correctly, I believe) as helpful guidance and not as jumping through hoops or some other type of punishment.

A couple of years ago I had an instance where I had to give serious thought to how a pass/fail not simply based on numbers works.

A student of mine had gone through the semester mostly under the radar, turning in assignments and seemingly doing fine. Then that student got under 15% on the final exam (class average some 80%). Didn't fall asleep during the final, wasn't sick, etc.--just had no idea what was going on. But this student's semester grade with the exam (only given a weight of 20% of the final grade) was in the 70 or 80% range. It just wasn't right to pass this student. I had a grading system with a basically analytical approach as you call it, but it didn't work for this particular situation and had to be adjusted.
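
The arithmetic behind that result is easy to reproduce. Here is a minimal sketch in Python, assuming the other 80% of the grade was coursework scored at around 90% (a hypothetical figure; only the 20% exam weight and the exam score of under 15% come from the case above).

```python
def weighted_final(coursework, exam, exam_weight_pct=20):
    """Combine coursework and exam percentages using the exam's weight in percent."""
    coursework_weight_pct = 100 - exam_weight_pct
    return (coursework * coursework_weight_pct + exam * exam_weight_pct) / 100


# Hypothetical coursework average of 90 with an exam score of 15
print(weighted_final(90, 15))
```

With a 20% weight, even a near-zero exam can pull the total down by at most 20 points, so strong coursework numbers carry the student over 60% regardless of what the exam revealed.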

Regarding some of your questions:
I don't do retests. My exams are only worth 20% anyway, homework and participation are weighted heavier. My main reason is that my classes are communication classes and mostly based on target language and other patterns that they pretty much have to pick up and practice in class. It's not a course where you can just study the text and pass a test. Studying the text helps but it only gives examples and I expect students to pick up sentence patterns and be able to personalize them and use them flexibly. If students are not up to speed on things by exam time, they really need to repeat the course. Retesting is an option at my university, but I always opt out and students haven't complained so far. But, I'm not inflexible to changing my policy someday in the future.

Open-book? No. I expect students to have internalized the language more. The exam is exactly like in class activities and homework and they know very well what is expected of them on the exam. I have considered having an open-book or open-note test, though. One problem, however, is that I do a few questions with concrete items that would be compromised with open-notes, such as nationalities, country names, and so on. Also, I sometimes ask the same exact questions they have previously answered for homework, especially open ones like "Tell me about yourself." That would encourage simply copying if they had open-homework. Open-text? Maybe, though I don't think it would help them very much in a short, timed exam situation.

Overall, pass/fails are pretty clear cut. When it's close, I often give the benefit of the doubt, but it's a judgement call, holistically-based.

One thing related to grades that came as a complete surprise to me was something unrelated to the course or the student's performance. It had to do with "the system." I'm not an expert and am still in disbelief, so please correct me if I'm wrong here.

Apparently, there's no such thing as a part-time student (at least in most situations for most universities). So, people can't just sign up for a course or two (at lower tuition) like they could at my university in the States. This has very serious consequences for graduating students who fail courses.

If a student scheduled to graduate fails one class and has to repeat it, it's bad. Really bad. It's not just about the course. It means they have to pay for an entire full-year's tuition and stay in school to repeat that one class. On top of that, seniors commonly have a job lined up already due to the common practice of job hunting, starting up to 2 years before they graduate, which they would have to forfeit. Since that qualifies as cruel and unusual punishment, I think many "tribunals" opt for finding a way to pass the student. It means quite a lot of work and a pain in the butt for the teacher, who would normally be on vacation or focusing on other work. This is quite a challenging situation that I mostly want to blame on the system, but one teacher's blaming doesn't help much here. Yes, there is student laziness (said student was very capable, just didn't show up and didn't do any work) and senioritis. And I think they would have "appreciated" and learned from failing. But the system here makes it basically impossible to fail.

Hi Tristan.

You raise a good point about graduating students. The final year often seems to be all about shuushoku (jobhunting) so students seem to get a near free-ride in terms of the doing-the-actual-classwork part. Failing can be traumatic and I understand that some companies, expecting their new grads to be showing up on April 1st, berate universities when students don't graduate as expected.

Of course many universities reserve the final year for seminars and grad theses, and in my case practica, which tends to be less of a pass/fail scenario.

For several reasons, teachers should fail students who do not make the effort to reach the stated goals of the language class. Failing is a very powerful learning experience that often leads to positive change for the failed students and for their future classmates. I have repeatedly experienced students coming into my higher level classes with almost no language skills or study ethic at all. They have no language skills or study ethic because they were allowed to pass other classes on the basis of attendance. If they had failed, they probably would have learned a valuable lesson and would have improved. (I admit that there are some students with emotional or psychological problems and these students may not react well to failure, but these students need special counseling, which should be given by properly trained specialists.) Students who do not study waste the time of the students who do study. Students who do not bother to remember vocabulary or review cannot effectively communicate with or practice the target language points with others. Too many teachers and administrators in private and public colleges pass students who are unfit to pass. As a result, students who have not mastered basic language skills compete on an equal basis for jobs against students who have mastered those skills, a situation which I consider unfair. In addition, companies end up hiring workers who are incompetent. Afterwards, the reputation of the college that passed the incompetent student suffers. Several teachers I know in both public and private colleges, worried about enrollment at their colleges, have expressed the argument that if they fail students, their colleges will get a bad reputation among high school students and will not attract students. 
I would rather that teachers take pride in providing an education that is based on professional educational standards, and I hope that an educational institution that facilitates the mastering of academic goals will attract students more than a college that provides a four-year break between high school and employment.

Well said, Greg. I couldn't agree more and I am happy to say that my uni takes a very hard stance against students who don't study and earn the credits required to advance in their studies. In fact, I have to teach the students who fail themselves in my courses the second time around (I don't like to say that "I am failing the students") and I am often impressed by the 180 degree change in their attitude. I guess living alone at 18 or 19 and moving to Tokyo (as is the case for many at my uni) is too full of distractions for some. Like you say failing is not always a bad thing and for many it is the wake-up call they sorely need.
