assessment-blog

Assessment

This is a post about assessment of learning -- finding out what people know. I know what you're thinking: how boring! But let me try to convince you otherwise. My main points are that there is a great opportunity to make assessment more meaningful; that this is already happening in lots of places every day, not just in universities but also on web sites like Stack Overflow; and that we can hugely increase the value of open educational resources if we get it right.

I will briefly write about two things: the new types of assessment that are enabled by tools and practices common to the web, and the role of assessment within a broader badges infrastructure (where badges are signals for achievements or learning). At the end, I will mention two initiatives that are starting to experiment and build prototypes based on these ideas for new assessments, and invite everyone to get involved.

Is it going to be on the test?

The best type of assessment is implicit - it is a by-product of meaningful activity, not an activity in itself. Assuming we consider physical exercise a meaningful activity, if we want to assess athletic achievement, we can think of ways to do so that most people would agree are objective measures. For example, when a measurement of 8.90 m recognizes a long jumper's outstanding achievement, the measurement takes place independently of the jump itself, and the connection between the measurement and the achievement is clearly established. A multiple-choice exam that tries to evaluate a student's ability to communicate in French creates a much more artificial setting, and consequently it is much harder to establish a clear link between the assessment and the ultimate learning goal (to speak French).

Too often in education, assessment is not connected to authentic learning, but has instead become its own purpose. Everyone with teaching experience in formal education will have heard the question, "Is this going to be on the test?" and has most likely struggled to find the appropriate answer, because this simple question so easily highlights a fundamental disconnect. We "teach" in order to prepare students for life and work, but they "learn" to succeed on the test.

There is another fundamental problem with current assessment practices. A model where a few experts assess the work of many non-experts doesn't scale very well. Experts create bottlenecks. If open educational resources are to be useful to lots and lots of learners, then we need assessment practices that are meaningful and that scale as part of the learning.

Wouldn't it be wonderful if we could identify certain meaningful tasks that people engage in, and recognize them in an authentic (non-artificial) way that scales? The good news is that this is not just wonderful, but already common practice in online communities, where lots of users have to evaluate each other's contributions in order to achieve a common goal. The bad news is that as of today, these practices are rarely found in formal education.

An example from the web

[this section is good but needs one killer point that summarizes the idea and that you can bold]

Stack Overflow is an online question-and-answer platform for software developers. It is built on software that tracks every user's activity and awards badges (yes, like the Boy Scouts) for all kinds of contributions. There is a badge for the first question a user asks, a badge for answering a certain number of questions, a badge that is awarded when the community determines one's answer best addresses a particular question, and so on. Answers are voted up or down by the community, and an answer's score determines whether it is shown at the top of a long thread, where it is easy to find, or near the bottom. Users earn reputation from these votes, and a higher reputation gives them more privileges on the site, such as the ability to vote on other people's answers. New and less experienced developers benefit from the expertise and skills of their more experienced peers, as they learn from the voting and the discussion that typically accompanies the separation of good from bad answers. It's important to note that Stack Overflow doesn't use these peer review and peer assessment practices, or its badges, to recognize learning achievements. It uses them to identify high-quality answers. What we think of as assessment is simply a way for Stack Overflow to improve what it does: help software developers get good answers to their questions. Nevertheless, by looking at the badges users have collected, we can already learn quite a lot about their participation and interests, and draw some conclusions about their expertise.
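To make the mechanics concrete, here is a minimal sketch of how community voting and reputation can work together to rank answers. It is not Stack Overflow's actual algorithm: the names (User, Answer, vote, ranked), the voting threshold and the reputation points are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical threshold: the reputation a user needs before their vote counts.
MIN_REPUTATION_TO_VOTE = 15


@dataclass
class User:
    name: str
    reputation: int = 1


@dataclass
class Answer:
    author: User
    text: str
    score: int = 0
    voters: set = field(default_factory=set)  # names of users who already voted


def vote(answer, voter, up=True):
    """Apply one vote to an answer and return True if it was counted."""
    if voter.reputation < MIN_REPUTATION_TO_VOTE:
        return False  # the community does not yet trust this voter
    if voter.name == answer.author.name or voter.name in answer.voters:
        return False  # no self-votes and no repeat votes
    answer.voters.add(voter.name)
    answer.score += 1 if up else -1
    # Hypothetical point values: the author's reputation follows the votes.
    change = 10 if up else -2
    answer.author.reputation = max(1, answer.author.reputation + change)
    return True


def ranked(answers):
    """Return the answers with the highest community score first."""
    return sorted(answers, key=lambda a: a.score, reverse=True)
```

The point of the sketch is that nobody "grades" anything: the ranking and the reputation are by-products of people helping each other find good answers.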

[add something else about the fact this also happens in open source community informally -- people do (peer) code review -- which is both mentoring and assessment?]

What are the key messages for learning assessment?

Communities are able to evaluate quality internally, using peer review, aggregating opinions, and considering reputation and different levels of trust.
Feedback loops enable improvements and learning for everyone: not just the active participants, but also lurkers.
Assessment can be much more granular, and is driven by the needs or goals of the community (not a separate activity).
Because there can be many more and smaller achievements, new forms of recognizing and signaling them to others are needed.
It scales.

Storming the academy

Not only can these forms of assessment improve the way we recognize skills, they also enable a completely new and distributed way of certifying learning and expressing it to potential employers. What is needed for this to happen is a badges infrastructure that is decentralized, controlled by users and driven by the types of assessment discussed above.

The badge becomes a signal of learning achievements, and within a secure badges infrastructure users can control how they manage these signals. Learners collect badges from different communities (think of this as individual credits from different universities). They control where to store and how to display these badges, for example to potential employers only or on a public website. And there is a way to authenticate badges, to make sure that claims about achievements are in fact true. See Mark's post about the details of this badges infrastructure for more information: http://commonspace.wordpress.com/2010/08/12/badges-identity-and-you/
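One way to picture the authentication step is a badge that carries a signature only the issuing community could have produced. The sketch below is an assumption made for illustration, not the design described in Mark's post: it uses a shared-secret signature (HMAC) from the Python standard library, which means a third party such as an employer would have to ask the issuer to check a badge. The function and field names are hypothetical.

```python
import hashlib
import hmac
import json


def issue_badge(issuer_secret: bytes, learner: str, badge: str) -> dict:
    """Create a badge record signed with the issuing community's secret key."""
    claim = {"learner": learner, "badge": badge}
    # Serialize with sorted keys so the same claim always yields the same bytes.
    payload = json.dumps(claim, sort_keys=True).encode("utf-8")
    signature = hmac.new(issuer_secret, payload, hashlib.sha256).hexdigest()
    return {"claim": claim, "signature": signature}


def verify_badge(issuer_secret: bytes, record: dict) -> bool:
    """Return True only if the claim was signed with this issuer's key."""
    payload = json.dumps(record["claim"], sort_keys=True).encode("utf-8")
    expected = hmac.new(issuer_secret, payload, hashlib.sha256).hexdigest()
    # compare_digest does a constant-time comparison of the two signatures.
    return hmac.compare_digest(expected, record["signature"])
```

The learner can store and display the record wherever they like; changing the learner name or the badge name afterwards makes verification fail.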

The P2PU / Mozilla School of Webcraft will take these ideas about assessment and certification and test them for web developer training. We plan to use community voting and discussion models similar to those created by sites like Stack Overflow, and connect them to learning achievements. We have started identifying different types of activities and behaviors - those that express skills relevant to web developers - and are now working on formalizing them as badges, similar to the way Stack Overflow recognizes types of contributions.

In late September, Peer 2 Peer University -- in collaboration with the Carnegie Foundation, Mozilla, and the Shuttleworth Foundation -- is organizing a workshop about "Learning assessment on the open web" to identify other mechanisms used by open source communities that might be applicable to the assessment of learning achievements. And a first prototype of the badges infrastructure will be presented at the Mozilla Drumbeat Festival in November. We plan to roll it out more broadly, with more partners, in early 2011.

Add links:

http://pad.p2pu.org/assessment-workshop
https://www.drumbeat.org/festival

============================================================

Additional paragraphs I might use in connection with the above somewhere else. This would be the introduction. Probably not necessary to read ... unless you want to!

Information

As effective users of the web, we constantly review and assess information, using a myriad of clues to help us determine how relevant something is to us and how much we trust it. Much of this happens automatically, and we don't think about our sophisticated built-in information filtering systems. Let's look at a simple example. We generally trust a friend's recommendation of a new band or movie more than one made by a stranger. And even among our friends there are some whose taste in film we have especially learned to trust (or distrust) over time. The same is true for all the pieces of information, content and knowledge we deal with. Depending on the importance of the information, we might use different filtering mechanisms and employ them more consciously, but the basic mechanisms to determine usefulness and relevance are the same.

Moving things digital and online makes these skills more important, as we now have to filter and evaluate many more pieces of content and information than before, not only deciding which ones we value more than others, but even deciding which ones to access in the first place. When there was one newspaper that covered all the news of the world, we could conceivably (pretend to) absorb the world's events on a daily basis. But even a cursory glance at the news sources we can access online overwhelms.

Fortunately, many of the mechanisms we use in the offline world still work online, for example the association of trust with certain brands or individuals. We may believe (or not) that journalists of the New York Times adhere to high professional standards, online the same way as offline. But there are new sources of information and knowledge that cannot rely on a similar trusted brand. When Wikipedia first made waves as a user-edited encyclopedia, many questioned its ability to ensure the quality of content and contributions. A new medium requires us to develop new ways of trusting it. This can create tension when institutional ways of doing things are challenged and new norms are developed (some professors allow their students to reference Wikipedia articles, and others don't). But it is not just the old ways that have to adapt. Sometimes completely new opportunities to determine the usefulness of information emerge.

The emergence of some of these patterns online is creating a buzz about assessment and accreditation opportunities in the open education space. When communities like Wikipedia are able to define, test and recognize the quality work of their contributors, should the same not be true for communities of learners? Or, when a community of software developers recognizes someone's expertise and good judgement by entrusting her with the power to make changes to the group's central software repository (so-called "commit" rights), is that less of a recognition of expertise and skill than a degree in computer science? If the Internet lets us learn in the wild, what are the interfaces to the formal institutions that make sense, both for the informal learners who need some form of institutional validation and for the institution that is reshaping its space in the open education landscape? These are just some of the exciting questions that the study of online communities sparks.