Saturday, December 4, 2010

Coming Soon to a Classroom Near You?

Some rambling thoughts on a fascinating set of articles about measuring teaching.

Today's NYT carried two stories -- one on page 1 -- about new techniques being used to evaluate K-12 teachers.  The news in the stories concerns two things: the existence of a very large program for measuring educational effectiveness in schools and the central role in that program of video-taping teachers as they teach.

Local readers might ponder the resonance between programs like this and higher education assessment, higher education "learning and teaching centers," and the individuals and organizations who live off, rather than for, education.

The first story ("Teacher Ratings Get New Look, Pushed by a Rich Watcher") highlights Bill Gates' (via the Gates Foundation) interest in a gigantic project measuring the "value added" by teachers through multi-mode assessment. Among other tools: videos of instruction that are scored by experts.

"Interesting" is the fact that one of the movers and shakers in the project is none other than Educational Testing Service. And so this represents yet another opportunity for that organization to live off, rather than for, education in the U.S. Other contractors are mentioned in the story too -- as has been true of the assessment movement more generally, a big part of the driving force seems to be entrepreneurs who, after persuading you that you need to do something, are more than happy to sell you the equipment needed to collect the data and then the expertise to evaluate it.

The second article, "Video Eye Aimed at Teachers in 7 School Systems," describes some 3,000 teachers who are a part of the first phase of this search for new methods to evaluate teachers. Each will have several hours of teaching video-taped and the tapes will be assessed by experts using a number of carefully validated protocols.

The first article, describing the scope of the project, notes that the rating of 24,000 video-taped lessons will come to something like 64,000 hours of video watching. On a full-time basis (about 2,000 hours a year) that represents 32 person-years of work. At 180 eight-hour school days per year, that's about 44 years of teaching.  The article suggests the costs to a school district will be about $1.5 million up front and then $800,000 per year.
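For anyone who wants to check the back-of-envelope arithmetic, here is a minimal sketch. The 64,000 hours, 24,000 lessons, and 180 days/year come from the paragraph above; the 2,000-hour full-time year and the 8-hour school day are my assumptions, chosen because they reproduce the 32 and 44 figures.

```python
# Figures quoted above (from the article).
HOURS_OF_VIDEO_RATING = 64_000
LESSONS_RATED = 24_000
SCHOOL_DAYS_PER_YEAR = 180

# Assumptions, not from the article: a full-time year of
# 50 weeks x 40 hours, and an 8-hour school day.
FULL_TIME_HOURS_PER_YEAR = 2_000
HOURS_PER_SCHOOL_DAY = 8


def years_of_work(total_hours: float, hours_per_year: float) -> float:
    """Convert a number of hours into years at a given annual workload."""
    return total_hours / hours_per_year


# Rating effort expressed as full-time person-years.
full_time_years = years_of_work(HOURS_OF_VIDEO_RATING, FULL_TIME_HOURS_PER_YEAR)

# The same hours expressed as school years of a single teacher.
teaching_hours_per_year = SCHOOL_DAYS_PER_YEAR * HOURS_PER_SCHOOL_DAY
teaching_years = years_of_work(HOURS_OF_VIDEO_RATING, teaching_hours_per_year)

# Average rating time implied per video-taped lesson.
hours_per_lesson = HOURS_OF_VIDEO_RATING / LESSONS_RATED

print(f"Full-time person-years: {full_time_years:.0f}")
print(f"Years of teaching:      {teaching_years:.1f}")
print(f"Rating hours per lesson: {hours_per_lesson:.2f}")
```

Under those assumptions the rating works out to a bit under three hours of expert viewing per lesson; change the two assumed constants and the totals shift accordingly.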

I wonder if anyone has assessed the value of the information produced.

In the middle of the report there is a line about how this is a step forward because rather than having the principal observe once or twice during the year, outside experts (using scientific protocols) can observe up to a half dozen times.  This suggests an interesting phenomenon: in the name of standardization and objectivity, we deskill and depersonalize (among other things).

In one paper on value-added modeling (VAM), by an ETS staff person (Braun 2004, 17), one finds this argument: (1) quantitative evaluation of teaching is here to stay; (2) evaluation of gains is preferable to just measuring year-end performance; (3) we have to think about what would get used if not this; (4) therefore, use VAM even if it has real limitations. Another, by a Michigan State University economist, concludes (about VAM):
We are looking at the educational system through a poor quality lens. The real world is probably more orderly than it appears from the analyses of noisy data (Reckase 2004, 7).


Amrein-Beardsley, Audrey. 2008. "Methodological Concerns About the Education Value-Added Assessment System." Educational Researcher, Vol. 37, No. 2, pp. 65–75.


Rand Corporation. 2007. "The Promise and Peril of Using Value-Added Modeling to Measure Teacher Effectiveness."

Reckase, Mark D. 2004. "Measurement Issues Associated with Value-added Methods."

Wikipedia. "Value Added Modeling."

Monday, September 6, 2010

Closing the Loop in Practice: Does Assessment Get Assessment?

At a liberal arts college with which I am familiar, the administration recently distributed "syllabus guidelines" with 34 items for inclusion on course syllabi. Faculty leaders balked and asked for clarification: which of the 34 items were mandates (and from whom, on what authority) and which were someone's "good idea"? The response was that guidelines are merely guidelines and that most of the items were indeed good ideas. Most were.

A subsequent examination of a sample of syllabi revealed that most syllabi did not contain all 34. More specifically, there was not universal inclusion of several that, apparently, are important for accreditation purposes. 

The semester has begun.  The syllabi are printed.  The administration disseminated the guidelines -- their obligation is fulfilled.  If faculty choose not to comply, that's their decision. Overall, the situation is alarming because the school could appear to be non-compliant to its accreditors.  And it's the faculty's fault.  And folks are wondering how to fix it.

THIS COULD BE TURNED INTO SOMETHING POSITIVE, a shining example of assessment, closing the loop, and evidence-based change.

But first, WAIT A MINUTE! Do faculty get to say "We told them what to do; if they can't comply and don't learn, it's not our fault"?  Of course not.  If students aren't learning, faculty are doing something wrong.  Lack of learning = feedback, and feedback must lead to change.

Here we have a case of an institution ignoring unambiguous feedback. The feedback is simple: the promulgation of a list of 34 things one should do on a syllabus does not produce the uniform inclusion of the small handful of actually really important things to include on a syllabus.  That's it; that's what the evidence tells you.  It doesn't tell you faculty are bad; it tells you that this method of changing what syllabi look like was ineffective.

Never mind that any good teacher knows that you cannot motivate change with a list of 34 fixes.

The correct response? Close the loop: listen, learn, change the way syllabus guidelines are handled.

The unfortunate thing here is that folks who know (faculty) brought this immediately to the attention of the folks in charge. Faculty noted that the list was too long, its provenance ambiguous, its authority unclear, its applicability variable, its tone insulting. A solution was suggested. All this was met with, basically, a brush-off -- they're just guidelines, not requirements, so what's the big deal?

And, it turns out, that is precisely how faculty understood them. No need for alarm.  Some adjusted their syllabi to some of the suggestions in the guidelines. But apparently, the faculty didn't all implement a few of the guidelines that really do matter (to someone).  Arrrrrrrgh.

And now for a little forward-looking fantasy of what the outcome of this situation COULD be.

Since administrations and the assessment industry are apparently NOT really ready to adopt the underlying premise of assessment -- pay attention to feedback and change accordingly -- the faculty will.

From now on, only the faculty will disseminate syllabus guidelines. They will very clearly distinguish between legally mandated content, accreditation-relevant functionality, college-specific custom and standards, and good pedagogical practice in general. They will invite all parties who become aware of syllabus-related mandates (or new good ideas) to communicate them to the faculty's educational policy committee for consideration for inclusion in the next semester's guidelines.

Those guidelines will explicitly articulate general goals (exactly which ones to be determined), such as that syllabi are to be interesting documents that are useful to students and that permit colleagues to get a sense of what a course is about and at what level it is being taught. They will also offer suggestions of particular features, boilerplate, and examples that might be useful, along with fully explained required items. They will include an array of sample syllabi that demonstrate a variety of forms that meet their standards. And all suggestions will be referenced where possible, and each requirement will be documented with the authority that makes it an obligation.

For assessment purposes the faculty will adapt any externally supplied "rubrics" to their own intellectually and pedagogically defensible standards and practices and encourage their colleagues to make use of these college-specific tools in developing their syllabi.

Educators really committed to the stated goals of assessment would see in this affair an opportunity for an achievement they could boast about.  Those committed to one-directional, top-down, assessor-centered, non-interactive, deaf-to-feedback approaches will see in it only faculty reluctance to get with the program.

One lesson learned here is that institutional processes need adjustment. The faculty and administrative time and emotional energy that this little thing has consumed, and the frustration and mistrust it has engendered, amount to a phenomenal waste of precious institutional resources. Alas, accountability for THIS is unlikely ever to be reckoned.

Monday, August 9, 2010

How Academic Assessment Gets it Backwards

In a letter to the NYT about an article on radiation overdoses, George Lantos writes:

My stroke neurologists and I have decided that if treatment does not yet depend on the results, these tests should not be done outside the context of a clinical trial, no matter how beautiful and informative the images are. At our center, we have therefore not jumped on the bandwagon of routine CT perfusion tests in the setting of acute stroke, possibly sparing our patients the complications mentioned.

This raises an important, if nearly banal, point: if you don't have an action decision that depends on a piece of information, don't spend resources (or run risks) to obtain the information.

Consider, for a moment, the trend toward "assessment" in contemporary higher education. A phenomenal amount of energy (and grief) is invested to produce information that (1) is of dubious validity and (2) does not, in general, have a well-articulated relationship to decisions.

Now the folks who work in the assessment industry are all about "evidence based change," but they naively expect that they can, a priori, figure out what information will be useful for this purpose.

They fetishize the idea of "closing the loop" -- bringing assessment information to bear on curriculum decisions and practices -- but they confuse the means and the ends. The result: to show that we are really doing assessment, we have to go find a decision that can be based on the information that has already been collected.

A much better approach (and one that would demonstrate an appreciation of basic critical thinking skills) to improving higher education would be to START by identifying opportunities for making decisions about how things are done and THEN figure out what information would allow us to make the right decision. Such an approach would involve actually understanding both the educational process and the way educational organizations work. My impression is that it is precisely a lack of understanding of, and interest in, these things on the part of the assessment crowd that leads them to get the whole thing backwards.