Association for Institutional Research in the Upper Midwest
1999 Fall Conference, St. Paul, Minnesota

Trends in Undergraduate Grades

Bruce Beck

Policy & Planning Analyst

Office of Budget, Planning & Analysis

University of Wisconsin-Madison

Introduction

During the fall semester of 1997, the University of Wisconsin-Madison (UW-Madison) was invited to contribute data to an analysis of undergraduate grades to be conducted by statistics Professor Val Johnson at Duke University.  Professor Johnson had constructed an analytical method of adjusting student grade point averages (GPAs) to take account of the extent to which a student had enrolled in easy or difficult courses.  Professor Johnson argued that, in the absence of such an adjustment, variations in grading patterns were "unfair" to students who chose to enroll in challenging courses, where they faced a higher probability of receiving a lower letter grade.

 

The Committee on Undergraduate Education recommended UW-Madison decline to participate in the Duke University project, for a number of reasons related primarily to the high cost of the project and low expected benefits.  Discussion of the underlying issue of variations in grading patterns, however, continued in the months that followed.  During the 1998-99 academic year, the Provost, Deans and the University Academic Planning Council repeatedly examined the issue of variations in grades, as well as increasing GPAs over time.  In particular, the institution has been interested in whether the trend toward increasing GPAs is accounted for by an increase in the academic preparation of its students.  During the course of their deliberations, they reviewed data on trends and patterns of undergraduate grades at UW-Madison.

Method

There were two data sets available for this analysis.  The first data set was used to assess the rate of grade inflation.  This contained data pertaining to UW-Madison undergraduates enrolled in fall semesters during the period from 1990 through 1998.  For each student enrolled during this period, the data available included gender, state residency, and ACT score at the time of original entrance into the university.  Beginning in 1989, new freshmen entering UW-Madison who were Wisconsin residents were required to submit ACT scores.  While some non-resident new freshmen also submitted ACT scores, they were more likely to submit SAT scores instead.  Students who did not submit ACT scores were excluded from the analysis data set.  For each fall semester of their enrollment during 1990 to 1998, the data available for each student included their semester grade point average, the school/college in which they were enrolled that semester, and their student level (freshman, sophomore, junior, senior).  If, for a given semester, a student did not have a semester grade point average, they were excluded for that semester.  An additional 350 students were lost to the analysis because changes in their campus IDs prevented the construction of a complete record for them.

 

The second data set was used to assess variations in grade distributions across academic departments, and between multiple sections (classes) within the same course.  This contained data pertaining to each of the grades received by all UW-Madison students for all courses they completed in the Fall semester of the 1998-99 academic year.  Data for each grade include the value of the grade received, course credits taken, department offering the course, the course ID, and the specific course section in which the student was enrolled.  For some courses, students enroll in multiple sections within the same course.  In these cases, only one of the student's sections was identified in the data set:  the section designated as the "grade-giving" section.

In 1973, the UW-Madison Faculty Senate expanded the scale of letter grades to include the grades of "AB" and "BC".  Since that time, there have been no changes to the scale of letter grades.  The resulting scale of letter grades, together with associated grade points, is shown above.  At UW-Madison, a letter grade never includes a plus (+) or minus (-).

 


Other grades are possible if a course is taken on a credit/no-credit or pass-fail basis, or if an "incomplete" is taken.  A grade of "NW" stands for No Work.  It indicates that the student never attended class and the instructor chose not to assign a failing grade; NW is a non-punitive grade that carries no grade points or credit.

 

These ancillary grades are excluded from this analysis.  Only the letter grades shown above are included in the calculation of the semester grade point averages reported below.
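As a minimal sketch of how a semester GPA could be computed under these rules: the grade-point values below (A = 4.0, AB = 3.5, B = 3.0, BC = 2.5, C = 2.0, D = 1.0, F = 0.0) and the weighting by course credits are assumptions made for illustration, and the function and variable names are hypothetical rather than taken from the analysis.

```python
# Assumed grade-point values for the letter-grade scale (illustrative only).
GRADE_POINTS = {"A": 4.0, "AB": 3.5, "B": 3.0, "BC": 2.5,
                "C": 2.0, "D": 1.0, "F": 0.0}


def semester_gpa(records):
    """Credit-weighted GPA over (grade, credits) pairs.

    Ancillary grades such as "NW" are skipped. Returns None when no letter
    grade is present, mirroring the exclusion of students who have no
    semester GPA in a given semester.
    """
    points = 0.0
    credits = 0.0
    for grade, course_credits in records:
        if grade not in GRADE_POINTS:  # ancillary grade: excluded
            continue
        points += GRADE_POINTS[grade] * course_credits
        credits += course_credits
    return points / credits if credits else None


# Hypothetical semester: three letter grades plus one "NW".
example = [("A", 3), ("BC", 4), ("NW", 3), ("B", 3)]
print(semester_gpa(example))
```

In this sketch the "NW" course contributes neither points nor credits, so it has no effect on the average.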


Results


Semester grade point averages were gradually increasing during the period from 1990 to 1998.  As shown in Table 4, the average semester GPA in Fall 1990 (for undergraduates with ACT scores) was 2.90.  By Fall 1998, the average GPA for the comparable group was 3.11.   During the period from 1990 to 1998, the average semester GPA increased each year.   The magnitude of these annual increases varied.

To interpret this trend, it is important to take account of changes in the student population.  The trend in the level of academic preparation of undergraduates, at the time of their entrance into the university, may be a contributing factor to the trend in grades.  Table 4 includes the average ACT composite scores of the fall semester undergraduates having ACT scores.  The average ACT score in Fall 1990 was 24.3.  By Fall 1998, the average ACT for the comparable group was 26.3.  During the period from 1990 to 1998, the average ACT score increased every year, along with the average semester GPA.

 

There are at least two alternative models that faculty use to define grades.  For faculty and instructors who define grades in reference to a "bell curve", the trend in average levels of academic preparation may be irrelevant.  Chart A gives a graphical representation of the frequency distributions of the semester GPA for the fall semesters of 1990, 1994 and 1998.  During this period, the distribution shifted upward on the GPA scale.  For those who define grades using the "bell curve", Chart A is sufficient proof of grade "inflation", that is, a lowering of grading standards.  In their view, an increase in the average academic preparation of undergraduates should have no impact upon the position of the bell curve on the GPA scale.

 


However, many other faculty and instructors use an alternative grading model in which grades are defined in reference to a fixed standard or level of subject matter achievement, without regard to the number of students who may or may not meet those standards.  Chart B gives a graphical representation of the frequency distributions of the composite ACT score for the fall semesters of 1990, 1994 and 1998.  During this period, the distribution shifted upward on the ACT scale.  If the ACT score and semester GPA are both assumed to reflect fixed standards of subject matter achievement over time, the increase in semester GPA may be explained by a rising level of academic preparation upon entrance to the university, as measured by the ACT test.


 


However, further analysis demonstrates that grade inflation was occurring even when academic preparation (ACT score) is held constant.  Among undergraduates with the same ACT score who were enrolled in fall semesters at UW-Madison during the nine-year period from 1990 through 1998, average semester grade point averages increased gradually and relatively steadily.  For example, this pattern is evident in Chart C, which shows semester GPA trends for students with ACT scores of 22, 26 and 30.  The three ACT scores included in Chart C are scores held by a relatively large number of students (see Chart B).  Aside from the grade inflation just noted, Chart C also demonstrates the expected relationship between ACT and GPA: students with higher ACT scores consistently earn higher semester GPAs, at least on average.  As discussed below, there is considerable variation around these averages.
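The idea of holding ACT score constant can be sketched as a simple tabulation: average the semester GPA separately within each combination of ACT score and year, then read the trend across years for one ACT score at a time. The record layout, function name and sample rows below are hypothetical and are not the UW-Madison data.

```python
from collections import defaultdict


def mean_gpa_by_year_and_act(rows, act_scores=(22, 26, 30)):
    """Average semester GPA for each (ACT score, year) cell.

    rows: iterable of (year, act_composite, semester_gpa) tuples.
    Returns {act: {year: mean_gpa}} restricted to the requested ACT scores.
    """
    sums = defaultdict(lambda: [0.0, 0])  # (act, year) -> [gpa total, count]
    for year, act, gpa in rows:
        if act in act_scores:
            cell = sums[(act, year)]
            cell[0] += gpa
            cell[1] += 1
    result = defaultdict(dict)
    for (act, year), (total, count) in sorted(sums.items()):
        result[act][year] = total / count
    return dict(result)


# Hypothetical records, for illustration only.
rows = [(1990, 22, 2.6), (1990, 22, 2.8), (1998, 22, 2.9),
        (1990, 26, 3.0), (1998, 26, 3.2), (1998, 27, 3.5)]
trend = mean_gpa_by_year_and_act(rows)
for act, by_year in trend.items():
    print(act, by_year)
```

A rising average within a single ACT score is the pattern described as grade inflation with preparation held constant.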


 

During the period 1990 to 1998, the student level (freshman, sophomore, junior, senior) composition of this subject population was changing, due in part to the fact that Wisconsin resident new freshmen who entered the university prior to Fall 1989 were not required to submit ACT scores.  The advent of the ACT requirement for Wisconsin resident new freshmen caused the pool of undergraduates who had submitted ACT scores to include increasing numbers of juniors and seniors over the nine-year period (see Table 1).  The higher student levels have higher average semester GPAs.  Chart D tests the possibility that the trends in semester GPA shown in Chart C are the result of the change in student level mix.  While the result shown in Chart D tends to confirm the expected relationship between student level and semester GPA, it also provides evidence of grade inflation even when both student level and ACT score are held constant.

 


The rate of grade inflation during the period from 1990 through 1998 varied among the schools and colleges comprising UW-Madison.  Table 2 above provides information on the relative sizes of these units.  Chart E provides a comparison of selected schools and colleges, restricted to the semester GPAs earned by juniors and seniors only, since freshmen and sophomores are typically concentrated in the College of Letters & Science until they are accepted into programs offered by the other schools and colleges.  Chart E includes only undergraduates whose ACT scores were 26 or 27.  Based on this comparison, grade inflation in the School of Nursing was rapid compared to the other units.  For nearly all schools and colleges, however, there is evidence of grade inflation.  The data presented in Chart E also provide an example of how average grades typically differ between the UW-Madison schools and colleges.  The lowest grades are issued by the College of Agricultural and Life Sciences.  The School of Education and, now, the School of Nursing issue the highest grades.

Dependent Variable: GPA

Analysis of Variance
                                         Sum of         Mean
                Source          DF      Squares       Square      F Value       Prob>F
                Model           23  12076.56591    525.06808     1311.829       0.0001
                Error       162442  65018.48624      0.40026
                C Total     162465  77095.05214

                    Root MSE       0.63266     R-square       0.1566
                    Dep Mean       3.02816     Adj R-sq       0.1565
                    C.V.          20.89251

Parameter Estimates
                               Parameter      Standard    T for H0:
              Variable  DF      Estimate         Error   Parameter=0    Prob > |T|
              INTERCEP   1      1.279976    0.01274286       100.446        0.0001
              1991       1     -0.004343    0.00722086        -0.601        0.5476
              1992       1      0.013849    0.00712252         1.944        0.0518
              1993       1      0.026458    0.00707396         3.740        0.0002
              1994       1      0.050728    0.00707224         7.173        0.0001
              1995       1      0.048659    0.00703523         6.916        0.0001
              1996       1      0.051926    0.00699749         7.421        0.0001
              1997       1      0.062779    0.00693784         9.049        0.0001
              1998       1      0.062421    0.00691446         9.028        0.0001
              FEMALE     1      0.185768    0.00334324        55.565        0.0001
              SOPH       1      0.085285    0.00447874        19.042        0.0001
              JUNIOR     1      0.210184    0.00452513        46.448        0.0001
              SENIOR     1      0.337380    0.00454531        74.226        0.0001
              ACT_COMP   1      0.056548    0.00044696       126.516        0.0001
              NONRES     1     -0.041586    0.00503717        -8.256        0.0001
              MINNCOMP   1      0.033748    0.00547232         6.167        0.0001
              CALS       1     -0.115676    0.00591931       -19.542        0.0001
              BUS        1      0.121862    0.00784174        15.540        0.0001
              EDUC       1      0.264624    0.00559042        47.335        0.0001
              ENGR       1     -0.054370    0.00491094       -11.071        0.0001
              SOHE       1      0.096126    0.01138302         8.445        0.0001
              MED        1     -0.009871    0.01743604        -0.566        0.5713
              NURS       1      0.171122    0.01410826        12.129        0.0001
              PHARM      1     -0.087371    0.01688832        -5.173        0.0001

 

A regression model was used to further quantify the relationship between GPA and student preparation.  The results for the subject population are shown above.  The dependent variable in this model is the semester GPA.  The model includes all of the variables for the population shown above in Table 1 and Table 2.  These include student level (freshman, sophomore, junior, senior), composite ACT score at the time of the student's entrance, the student's school/college affiliation, state residency, and gender, together with the calendar year in which the semester GPA was earned.  The ACT score is the only independent variable that is a continuous variable.  All remaining independent variables are dummy variables representing nominal categories.  Each of the dummy variables has only two possible values: 1 or 0.
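To illustrate how the dummy coding works, the sketch below applies the parameter estimates printed in the regression output above to a hypothetical student. The reference categories (1990, male, freshman, Wisconsin resident, and a school/college with no dummy of its own, presumably Letters & Science) are inferred from the absence of corresponding dummy variables and are not stated in the output; the function name is illustrative.

```python
# Parameter estimates copied from the regression output above.
COEF = {
    "INTERCEP": 1.279976, "ACT_COMP": 0.056548,
    "1991": -0.004343, "1992": 0.013849, "1993": 0.026458, "1994": 0.050728,
    "1995": 0.048659, "1996": 0.051926, "1997": 0.062779, "1998": 0.062421,
    "FEMALE": 0.185768, "SOPH": 0.085285, "JUNIOR": 0.210184,
    "SENIOR": 0.337380, "NONRES": -0.041586, "MINNCOMP": 0.033748,
    "CALS": -0.115676, "BUS": 0.121862, "EDUC": 0.264624, "ENGR": -0.054370,
    "SOHE": 0.096126, "MED": -0.009871, "NURS": 0.171122, "PHARM": -0.087371,
}


def predicted_gpa(act_comp, dummies=()):
    """Intercept + ACT term + the estimate of every dummy variable set to 1.

    Dummies left out of the list are 0 and contribute nothing, which is
    what places the student in the (assumed) reference category.
    """
    return (COEF["INTERCEP"]
            + COEF["ACT_COMP"] * act_comp
            + sum(COEF[name] for name in dummies))


# Reference-category student with an ACT composite of 26.
baseline = predicted_gpa(26)
# Same ACT score, but a female senior enrolled in Fall 1998.
senior_1998 = predicted_gpa(26, ["1998", "FEMALE", "SENIOR"])
print(round(baseline, 3), round(senior_1998, 3))
```

Because every variable other than ACT is coded 0 or 1, each estimate can be read directly as the expected difference in semester GPA relative to the reference category, with the other variables held fixed.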

 

The results show that the specific calendar year in which a semester GPA was earned is often, but not always, a statistically significant factor.  Aside from a few minor exceptions, the parameter estimate for each calendar year included in this analysis generally increases from 1991 to 1998.  While Table 4 shows the average GPA had increased by about 0.2 by 1998, the regression model attributes an increase of only 0.06 to the year 1998, after the other independent variables are taken into account.

 

The regression model, however, is able to account for only a small portion of the variation in semester GPA, yielding an adjusted R-square of only about 0.16.  To more fully understand how undergraduate grades vary, much further analysis will be required.
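The fit statistics in the regression output can be reproduced from the sums of squares and degrees of freedom in its analysis-of-variance table. The short calculation below is only a consistency check on the printed figures; the variable names are illustrative.

```python
# Sums of squares and degrees of freedom from the analysis-of-variance table.
ss_model, df_model = 12076.56591, 23
ss_error, df_error = 65018.48624, 162442
ss_total, df_total = 77095.05214, 162465

mean_square_error = ss_error / df_error

# Share of total variation in semester GPA accounted for by the model.
r_square = ss_model / ss_total
# Same idea, but comparing variances, which penalizes extra parameters.
adj_r_square = 1 - mean_square_error / (ss_total / df_total)
# Overall F statistic: model mean square over error mean square.
f_value = (ss_model / df_model) / mean_square_error
# Typical size of a residual, in GPA points.
root_mse = mean_square_error ** 0.5

print(round(r_square, 4), round(adj_r_square, 4))
print(round(f_value, 1), round(root_mse, 5))
```

The root mean square error of roughly 0.63 grade points indicates how widely individual semester GPAs scatter around the values the model predicts.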

 

To observe the way in which grades may vary from department to department, a second data set, containing undergraduate course grades in Fall 1998, was examined.  These source data are described in more detail in the Method section above.  Table 5 below offers a comparison of those academic departments that issued at least 1,000 undergraduate course grades in Fall 1998.  An average course GPA, weighted by credits, is tabulated for each department.  The departments are sorted in descending order on GPA.  Table 5 includes departments from the School of Business, the School of Education, the College of Engineering and the College of Letters & Science.
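The tabulation just described (a credit-weighted average per department, limited to departments above a minimum number of grades and sorted in descending order) might be sketched as follows. The grade-point values, record layout, function name and department names are assumptions for illustration only.

```python
from collections import defaultdict

# Assumed grade-point values for the letter-grade scale (illustrative only).
GRADE_POINTS = {"A": 4.0, "AB": 3.5, "B": 3.0, "BC": 2.5,
                "C": 2.0, "D": 1.0, "F": 0.0}


def department_gpas(grades, min_grades=1000):
    """Credit-weighted course GPA per department, highest first.

    grades: iterable of (department, letter_grade, credits) tuples.
    Departments with fewer than min_grades letter grades are omitted.
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # points, credits, count
    for dept, grade, credits in grades:
        if grade not in GRADE_POINTS:  # ancillary grades are excluded
            continue
        entry = totals[dept]
        entry[0] += GRADE_POINTS[grade] * credits
        entry[1] += credits
        entry[2] += 1
    ranked = [(dept, points / credits)
              for dept, (points, credits, count) in totals.items()
              if count >= min_grades and credits > 0]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical grades; the threshold is lowered so the tiny sample qualifies.
sample = [("Music", "A", 3), ("Music", "AB", 1),
          ("Math", "B", 4), ("Math", "C", 4),
          ("Art", "A", 3)]
print(department_gpas(sample, min_grades=2))
```

Weighting by credits means that a grade in a 4-credit course counts four times as much toward the department average as a grade in a 1-credit course.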


 

Table 5 shows that departments in the humanities tend to grant the highest average grades, social science departments tend to fall into the middle of the list, while departments in more heavily quantitative subjects tend to grant the lowest average grades.  Because some departments are large "service" departments serving students pursuing majors outside the department, the analysis was repeated based upon grades received only by undergraduates in their senior year.  This step was taken on the assumption that seniors would be comparable across departments in the sense that they would be equally likely to be taking courses in the department in which they were majoring.  While grades received by seniors were generally higher, the resulting changes in the rank order of the departments are few.  Charts F, G, H and I compare the course grade distributions of four departments, for all undergraduates and for seniors only.


 


One potential explanation of these differences may be differences in the academic preparation and abilities of students in the various departmental programs.  While this possibility should be explored, departmental differences in how course grades are defined would seem a more likely explanatory factor.  Quantitative disciplines appear more likely to use grades to describe a "bell curve" of student performance, while the humanities disciplines use grades to describe student performance relative to a fixed level of academic achievement.

 



Grade distributions can also vary considerably between sections (classes) within the same course.  If student enrollment into course sections is essentially a random process, average grades in individual sections can be expected to differ to some extent.  In some cases, however, differences between sections exceed the differences that would be expected from the random selection of students into sections.  For example, Chart J illustrates the grade distributions of three separate lecture sections within the 4-credit Economics course 101, entitled "Principles of Microeconomics."  These sections were large: sections 1, 2, and 3 had 434, 529, and 435 undergraduates respectively.  None of the three sections were honors sections.  The same instructor taught sections 1 and 2, while a different instructor taught section 3.  The course GPA of students in section 3 was 3.31, significantly higher than the 2.68 course GPA of the students in the other two sections.  A chi-square test of this distribution of students into sections and letter grades shows a high level of statistical significance.  When the chi-square test is repeated excluding section 3, the result is no longer statistically significant.
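A chi-square test of this kind compares the observed count in each section-by-grade cell with the count expected if grades were unrelated to section. The sketch below computes the Pearson statistic and its degrees of freedom. The row totals match the section sizes reported above, but the individual cell counts and the grade groupings are hypothetical, since the actual grade distributions appear only in Chart J.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts.

    table: list of rows (one per section), each a list of counts (one per
    grade category). Assumes every row and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if section and grade were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


# HYPOTHETICAL counts. Rows: sections 1, 2, 3. Columns: assumed grade groups.
counts = [[60, 70, 130, 80, 94],
          [75, 85, 160, 95, 114],
          [150, 110, 100, 45, 30]]
statistic, df = chi_square_independence(counts)
print(round(statistic, 1), df)
```

With 8 degrees of freedom, a statistic above 15.507 would be significant at the 0.05 level. Dropping a row and recomputing corresponds to repeating the test with one section excluded.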

 


As another example, Chart K illustrates a situation in which two sections within a course deviate from the course average in different directions.  This is a 4-credit course in the Philosophy department, numbered 101 and entitled "Introduction to Philosophy", where the average student received a course GPA of 3.06.  The course is available only to freshmen and sophomores.  None of the four sections were honors sections.  Again, the course sections are relatively large: sections 1, 2, 3 and 4 had 76, 94, 84, and 92 undergraduates respectively.  Different instructors taught each of the four sections.  The course GPAs of students in sections 1 and 4 were 2.69 and 3.41, respectively.  These averages are significantly different from the 3.11 course GPA of the students enrolled in the other two sections.  A chi-square test of this distribution of students into sections and letter grades shows a high level of statistical significance.  When the chi-square test is repeated excluding sections 1 and 4, the result is no longer statistically significant.


 

Conclusions

This analysis suggests the following conclusions: