Assessment Methods in Core Classes

Mission

To have professors understand and implement grading structures that create a positive work environment and promote learning.

 

Executive Summary

Our team identified that students are dissatisfied with how they are evaluated in courses. We have found that professors and students alike share strong desires for fairness and consistency in grading, and that these are the two motivating factors for assigning and understanding grades. Combining student and professor perspectives, we have developed recommendations for improving course structures such that incentive systems promote learning and student satisfaction.

 

Student Perspective

Introduction

We surveyed a sample of 130 students across class years to understand the perspective of the general student body on the topic of grading structures in core classes. We asked questions designed to target student perception of grading across core classes, certain breakdowns of course assessments, the general impact of grading structures at Wharton, and the structure of the Core. To better understand these results, we ran a focus group with twelve students of varying academic standing and involvement. The raw survey data can be found in Appendix III.

An underlying component of our work was also to understand why students care about grades. Students explained that they care most about whether or not grades adequately reflect performance and differentiation among students. Students felt that if grades did not adequately reflect such measures, they would then serve as a poor signal to recruiters.

 

Grading Across Core Classes

Students were first asked about grading across core classes. We found that 81.7% of students believe that grading is inconsistent across core classes; in other words, if students received the same grade in two different core classes, they felt that the grades did not imply a similar level of course material understanding. Thus, students feel that grades fail to serve as consistent signals to stakeholders and to fairly differentiate students. This conclusion was bolstered by the 77.1% of students who believe that grades do not adequately reflect student mastery of the course material. Similarly, 76.7% of students believe that their grade in a multi-section core class was impacted by the teaching style of their specific instructor. In other words, because each professor may present the material differently, students feel that they are at an advantage or disadvantage on the standardized assessments.

 

Type and Weighting of Assignments

The type and weighting of assignments were also of interest. In the focus group, students favored an increased number of graded assignments, which incentivizes continual learning, rewards effort in class, gives students an understanding of their position throughout the semester, and results in grades that are more representative of performance and knowledge learned. Students feel that core courses where grades are based on one or two exams lend themselves to unfair grading. Correspondingly, given that many core classes are participation- and exam-based, students who do particularly poorly or particularly well through that medium will receive a grade disproportionate to their understanding of course material. In our survey, 63.3% of students preferred to have multiple types of assessments, e.g., homework, quizzes, projects, and papers.

 

Impact of Grading

Students were finally asked more holistically about the impact of grading at Wharton. 76.7% of students believe that grading at Wharton does not support collaboration in team-based classes, group studying, or interpersonal relationships between undergraduates. One student commented, “Curved grading structure creates an overly competitive environment. I have had students not want to collaborate and directly tell me that the reason is that they don't want to hurt their chances of being on the good end of the curve.” Concerns about grading also undermine the broadening purpose of the Core: 53.3% of students reported delaying taking a core class purely on the basis of its poor or unfair grading structure.

In our focus group discussion and the student responses to an optional form for additional remarks on our survey, we frequently heard students reporting that current grading structures serve as a disincentive for true learning. Instead, students are motivated to “cram” or memorize the majority of the course material to perform well on exams. Under current grading structures, students perceive a high mark in a course as a competitive gain rather than an achieved mastery of material. Finally, student data from both the survey and the focus group demonstrate overall frustration with the overly competitive environment. All students who chose to write comments in the optional additional remarks section of our survey expressed dissatisfaction with the curve, and 20.5% of these students specifically tied the curve to mental depression or extreme stress.

 

Academic Research

Academic research on incentives, assessment, and collegiate-level instruction studies the intrinsic and extrinsic foci of various grading structures.

 

Criterion-Based

On the achievement end of the spectrum, a professor using a criterion referencing system would determine grades by comparing a student’s achievements with predetermined criteria for each achievement level (A, B, C, etc.). When evaluated under a criterion-based scheme, students tend to formulate mastery goals, which are “focused on the development of competence and task mastery.” Studies on criterion-based grading have shown that under this system, students achieve the optimum outcomes on evaluated assignments and are the most strongly intrinsically motivated, given that they have a clear definition of the meaning of expected achievement levels.

 

Normative

Academics explain normative grading as the practice of assessing students based on ranking and determining final grades based on a predetermined distribution, often a bell curve. In normative settings, students cultivate performance goals, which are defined as goals that are focused on the performance of others. There are two types of performance goals: approach and avoidance. A student who has formed an approach performance goal seeks to achieve in order to garner appreciation or respect from peers and professors. On the other hand, a student who has formed an avoidance performance goal seeks to achieve to avoid disdain. Studies of normative environments have demonstrated that normative curves negatively affect performance outcomes and in particular, deter certain students from creating and working towards mastery goals. Unfortunately, this effect is not distributed uniformly across individuals; students with certain personality types have been shown to respond less negatively, leading to a final distribution of grades that is based less on differentiation or achievement than on personality. Both the approach and avoidance subsets of performance goals have been linked to negative outcomes including low levels of persistence in the face of failure, the avoidance of challenges, and low levels of intrinsic task motivation with respect to mastery goals.
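
As a rough illustration of the difference between the two schemes, the sketch below grades the same set of scores both ways. The cutoffs, quotas, and function names are hypothetical, chosen only for illustration, and are not drawn from any Wharton course or from the studies cited.

```python
# Hypothetical values for illustration only (not from any actual course).
CRITERION_CUTOFFS = [(90, "A"), (80, "B"), (70, "C")]  # score >= cutoff earns the letter
CURVE_QUOTAS = [(30, "A"), (40, "B"), (30, "C")]       # percent of class per letter; sums to 100


def criterion_grades(scores):
    """Grade each student against fixed thresholds, independent of peers."""
    grades = {}
    for student, score in scores.items():
        grades[student] = next(
            (letter for cutoff, letter in CRITERION_CUTOFFS if score >= cutoff),
            "D",  # assumed fallback letter below the lowest cutoff
        )
    return grades


def normative_grades(scores):
    """Grade each student by rank, filling a predetermined distribution."""
    # Highest score first; tied scores keep their original order.
    ranked = sorted(scores, key=scores.get, reverse=True)
    grades = {}
    start = 0
    cumulative_pct = 0
    for pct, letter in CURVE_QUOTAS:
        cumulative_pct += pct
        end = round(cumulative_pct * len(ranked) / 100)
        for student in ranked[start:end]:
            grades[student] = letter
        start = end
    return grades
```

In a class where every student scores 90 or above, criterion_grades awards an A to everyone, while normative_grades still assigns B and C grades to the lower-ranked students. This is the sense in which a normative grade depends on the performance of others.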

Relevant literature also addresses the incorporation of non-educational standards into students’ grades, including but not limited to class participation and attitude (both of which play a role in at least one Wharton core class [see Appendix I]). According to Lawrence H. Cross and Robert Frary of Virginia Polytechnic Institute and State University, class participation is “as much a function of personality as it is indicative of knowledge possessed.” It follows that incorporating such information into a final grade will distort the grade’s ability to reflect either differentiation or knowledge gained.

 

Benchmarking

To better understand grading structures at peer institutions, our team benchmarked the existing policies at Princeton University, Columbia University, Stanford University, and Harvard College.

 

Princeton University

From 2004 through 2014, Princeton University faculty had a common grading expectation for every department and program: A-range grades (A+, A, A-) were to account for no more than 35 percent of the grades given in undergraduate courses and less than 55 percent of the grades given in junior and senior independent work. This policy was put to rest in the 2013-2014 academic year; the President, Christopher Eisgruber, has expressed a desire to move away from “grades” and instead towards “providing accurate feedback.” This change was the result of conclusive research by faculty showing that the new system of criterion-based grading would be a more effective and positive means of assessment for the students. Furthermore, the University is making a strong push towards grading equality such that students in one academic department can expect to be graded according to the same standards as students in another academic department. The faculty also sees the adoption of this new grading structure as a way to signal clearly to students the difference between levels of personal achievement. That is, students will be able to understand their grades across departments as reflective of their individual achievement ability.

 

Columbia University

Standards at Columbia University are largely similar to those at Wharton, with the exception that an A+ grade at Columbia is weighted as a 4.33. There are no set grading distributions at Columbia. To address the issue of students taking classes in easier departments to optimize their chances of receiving an A, transcripts include an extra column to reveal the percentage of a class or seminar that received a grade in the A-range so that external stakeholders can glean the difficulty of a particular class.

 

Stanford University

While grades at Stanford University are computed in the same way as they are at Wharton (e.g., A+ and A convert to 4.0, A- converts to 3.7, etc.), it is interesting to note that the University does not have a failing (F) grade. Further, grade point average (GPA) and class rank are not officially computed under the general grading system, meaning that GPA does not appear on students’ official transcripts and is not released to external stakeholders. That said, employers can still calculate students’ GPAs from their transcripts, and current students indicated, through an informal interview, that most students share their GPA on their resumes and employment applications regardless.

 

Harvard College

Finally, we consider grading at Harvard. Harvard is well known and often critiqued by the media for its grade inflation. In 2013, Harvard’s Dean of Undergraduate Education, Jay M. Harris, stated that “the median grade in Harvard College is indeed an A-…[and] the most frequently awarded grade in Harvard College is actually a straight A.” As of late, Harvard has not taken action to adjust for its grade inflation.

 

Discussion

As discussed previously, academic research disfavors the curve for reasons primarily relevant to learning optimization and motivation theory. While the alternative, a criterion-based grading structure, would answer these concerns, we recognize that such a structure is hard to implement in practice for several reasons. First, professors have communicated the difficulty of writing tests that perfectly reflect the expected level of course mastery. Further, the curve allows course administrators to standardize grading across graders, an issue that is particularly pertinent to multi-section courses like the Wharton core. Finally, to reference the student perspective, students were strongly concerned about consistency across time (i.e. semester to semester), graders, and professors. In practice, grades are distributed more consistently across graders and across time under a curve than under a criterion referencing system.

 

We also acknowledge that in discussing the most favorable class structures or frustrations with grading, student opinions may be colored by their own personal performance in the class at hand. Therefore, we center our thoughts and recommendations moving forward on the most strongly felt areas of concern as determined through the survey. Finally, we note that grading is a difficult issue to tackle, as it may always be possible to find unfairness in grading outcomes. This is countered by the overwhelming buy-in to the optional question on students’ favorite or most preferred classes/grading structure, which underscored several classes that students enjoyed or felt had equitable grading structures.

 

Recommendations

Our team focused on the breakdown of syllabi and apt matching of course style and grading structures as areas for improving assessment at Wharton. We have developed numerous recommendations to optimize the classroom experience for students and professors alike.

 

Assess students through several assessments.

Students report that having multiple assessments spaced throughout a semester incentivizes continual learning, rewards effort, serves as an “insurance policy” against a particularly over-stretched week or bad day, and helps students understand how they are performing throughout the semester. Courses that are primarily composed of one or two exams fail to adequately capture student learning throughout the semester and promote “cramming” as opposed to learning in tandem with the presentation of course material. In addition, having multiple assessments over the course of a semester implicitly benefits students who attend lecture and stay involved throughout the course, rewarding effort and engagement without explicitly enforcing it as a portion of the grade. Further, students in our focus group explained that exam-based grading structures put an undue amount of stress on students to perform extremely well on the one or two exam dates, especially since core class exams often fall within the same few weeks. Finally, having multiple assessments allows students to monitor their understanding of each part of the material, enabling them to seek help in real time and to adjust study methods accordingly.

 

Assess students through multiple types of assessments.

It is also important that students are assessed through multiple types of assessments. Most core classes at Wharton are exam- and participation-based. Students who are poor test-takers or who feel uncomfortable speaking in class may then do poorly in the course not because of poor understanding of the material, but because of a biased assessment medium. We recommend incorporating homework assignments, projects, papers, and small quizzes into graded output to give students more varied opportunities to demonstrate knowledge learned. Ultimately, this creates grading outcomes that are more equitable and more representative of student understanding.

 

Create appropriate incentive structures for teamwork-based classes.

In classes where teamwork is required or encouraged, we recommend that professors strongly consider removing the class from a departmental curve and developing criterion-based grading schemes. Research has shown that if students know that a class will use normative grading, they will withhold information from teammates and peers to better their own chances of receiving a higher grade (see Student Demand and Appendix II for further explanation). To further contextualize this recommendation, we consider MGMT 101. To mitigate an over-competitive environment and instead promote teamwork, MGMT 101 was removed from a normative-based grading scheme, a move that has been met with high satisfaction on the part of both professors and students. To account for a particularly difficult test or assessment, the professors set a minimum number of A grades at the beginning of the semester and bump grades up if this minimum threshold is not met.
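
The minimum-A safeguard described above can be sketched as follows. This is only one possible reading of "bumping grades up": we assume the shortfall is filled by promoting the highest-scoring students who fell below the cutoff. The function name, cutoff, and minimum count are hypothetical and do not describe the actual MGMT 101 procedure.

```python
def students_receiving_a(scores, a_cutoff, min_a_count):
    """Return the set of students who receive an A.

    Students at or above a_cutoff earn an A outright (criterion-based).
    If fewer than min_a_count students qualify, the next-highest scorers
    are promoted until the minimum is met (assumed interpretation).
    """
    earned = {student for student, score in scores.items() if score >= a_cutoff}
    shortfall = min_a_count - len(earned)
    if shortfall <= 0:
        return earned  # the minimum is already met; no one is bumped

    # Rank the remaining students from highest to lowest score.
    remaining = sorted(
        (student for student in scores if student not in earned),
        key=scores.get,
        reverse=True,
    )
    return earned | set(remaining[:shortfall])
```

Because the floor only ever adds A grades, no student's grade is lowered by a peer's performance, which keeps the scheme compatible with the teamwork incentives discussed above.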

 

Fit curves by professor sections rather than the course overall.

Students in our focus group also made the point that professors teaching the same course present material differently. In core classes with multiple sections led by different professors, students are given the same exam and are expected to understand the material at the same level as their peers. However, students report that they have felt at an advantage or disadvantage depending on the teaching style of their specific professor. If grades were to be determined within each section or within each professor’s sections, grading outcomes would more accurately reflect expected student learning and would fairly account for differences in presented material and teaching style.
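
To make the difference concrete, the sketch below standardizes exam scores either across the whole course or within each professor's section. Standardized scores (z-scores) are our own choice of illustration, not a method any department is known to use, and the section names and student identifiers in the usage example are hypothetical.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Map each student's score to its distance from the group mean, in standard deviations."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return {student: 0.0 for student in scores}  # everyone scored the same
    return {student: (score - mu) / sigma for student, score in scores.items()}


def standardize_course_wide(sections):
    """Pool every section together before standardizing."""
    pooled = {}
    for section_scores in sections.values():
        pooled.update(section_scores)
    return standardize(pooled)


def standardize_by_section(sections):
    """Standardize each section separately, so students are compared only with section peers."""
    result = {}
    for section_scores in sections.values():
        result.update(standardize(section_scores))
    return result


# Hypothetical scores on a common exam; student identifiers are assumed unique.
sections = {
    "professor_x": {"s1": 90, "s2": 80, "s3": 85},
    "professor_y": {"s4": 75, "s5": 65, "s6": 70},
}
course_wide = standardize_course_wide(sections)
by_section = standardize_by_section(sections)
```

Here student s4 is below the course-wide average but is the strongest student in professor_y's section, so the same exam score lands on the low side of a course-wide curve and the high side of a per-section curve.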

 

Emphasize individual goal-setting in communicating grading structures.

Finally, these recommendations are dependent upon strong communication. Students in our focus group explained that by introducing a grading structure with the breakdown of students receiving each grade, professors drive the sense of competitiveness and create a situation most conducive to setting performance goals. By instead framing grading structures in ways conducive to setting mastery goals, the proposed changes will have the maximum possible benefit on assessment mechanisms and student satisfaction with grading at Wharton. Professors can accomplish this by emphasizing that there are objective thresholds of course mastery that students should strive to achieve and that achieving this benchmark will be rewarded with the requisite grade, regardless of peer achievement.