Suppose that I am grading students on a scale from 1 to 10, and that I have no prior knowledge of the ability of the students but have obtained their test scores (say, from 1 to 100). Suppose these test scores have only ordinal meaning: a student with a higher score can be said to have achieved better mastery of the course, but having double the score does not mean that one student has achieved twice as much as another. We may also assume that a (statistically speaking) large number of students, representative of the student population, took the exam. What should the optimal grading distribution be? We may also assume that the grading distribution/scale should achieve two goals: a) it should be informative about students' grasp of the material, and b) it should incentivize students to study the material. Regarding a), from an information-theory perspective we may want to maximize the information (entropy) of the grade distribution; thus we would choose a scale which yields a uniform distribution. However, in practice most teachers implement distributions which are peaked. What is the motivation behind this? Researchers on education must have thought about this, so I hope to find some directions here.

Assigning grades to fit some "optimal distribution" is misguided. We don't want to maximize the entropy of the grades in a particular course; it's not a very useful measure of a "good" set of grades. To quote from an answer by Anonymous Mathematician:

> Strictly speaking, Shannon entropy pays no attention to the distance between scores, just to whether they are exactly equal. I.e., you can have high entropy if every student gets a slightly different score, even if the scores are all very near to each other and thus not useful for distinguishing students.

(Note that this was in answer to a question that asked about using the entropy of exam scores as an indication of how good an exam is at distinguishing between different levels of mastery. It wasn't suggesting that grades should be curved a posteriori to maximize entropy.)

What we actually want is for grades to signal as closely as possible the students' mastery of the course material. If every student in the course has achieved truly excellent mastery of the course material, they should all get high scores. That grade then seems to carry very little information. But in reality, it is much more useful to say that all students in this particular class achieved excellence and deserve a 10/10 than it would be to maximize the "information" carried by the grade and give some students a 1/10 because their performance was slightly less excellent than the highest level of excellence achieved by a student that year. This scenario (where all students achieve excellent or very good grades) is not even that unusual, as Michael Covington points out:

> In advanced courses, it can be quite proper for all students to get A's and B's, because weak students would not take the course in the first place.

For an individual student, the grade should depend on that student's demonstrated mastery of the course material, and hopefully not at all (or as little as possible) on the other students in the class. If you insist on thinking about it from an information-theoretic perspective, what we really want is to minimize the Kullback–Leibler divergence between the distribution of students' achievements and the distribution of students' grades.

Edit: Suppose these test scores have only ordinal meaning. If you have no information about the exam and what it measures, you cannot assign meaningful grades based on that exam score. If you do know about the exam, you can assign meaningful grades based on exam scores, but not according to any optimal distribution: you would assign grades based on how much of the exam students are expected to know to demonstrate various levels of mastery, not by mathematically shaping the grades into some predetermined "optimal distribution."
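To make the question's "choose a scale which yields a uniform distribution" concrete, here is a minimal sketch of rank-based curving under the question's ordinality assumption. The helper `curve_to_uniform` and the sample scores are hypothetical, not part of the question:

```python
def curve_to_uniform(scores, n_grades=10):
    """Assign grades 1..n_grades by rank, so that the resulting
    grade distribution is (approximately) uniform. Scores are
    treated as purely ordinal: only their order matters."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    grades = [0] * len(scores)
    for rank, i in enumerate(order):
        # Each consecutive block of ranks maps to one grade.
        grades[i] = 1 + (rank * n_grades) // len(scores)
    return grades

# Nine students scored 90-98, one scored 55.
scores = [55, 97, 96, 95, 94, 93, 92, 91, 90, 98]
print(curve_to_uniform(scores))  # → [1, 9, 8, 7, 6, 5, 4, 3, 2, 10]
```

Note how this illustrates the objection quoted from Anonymous Mathematician: the scores 90–98 are all very near one another, yet curving to a uniform (maximum-entropy) distribution spreads them across grades 2 through 10.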
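The entropy and Kullback–Leibler quantities discussed here can also be sketched numerically. The distributions below are illustrative assumptions (a 10-point grade scale, a class in which every student mastered the material), not data from the question:

```python
import math

def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits.
    Assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical grade distributions over a 10-point scale.
uniform = [0.1] * 10
peaked = [0.0, 0.0, 0.0, 0.05, 0.1, 0.2, 0.3, 0.2, 0.1, 0.05]
print(entropy(uniform))  # log2(10) ≈ 3.32 bits: uniform maximizes entropy
print(entropy(peaked))   # a peaked distribution carries fewer bits

# A class where every student truly mastered the material:
achievement = [0.0] * 9 + [1.0]
print(kl_divergence(achievement, achievement))  # 0.0: grades match achievement
print(kl_divergence(achievement, uniform))      # ≈ 3.32 bits: curving misrepresents
</pre>
```

The last two lines show the answer's point: when all achievement mass sits at the top, giving everyone a 10/10 yields zero divergence between achievements and grades, while forcing the maximum-entropy uniform grade distribution makes the divergence large.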