  • Research purpose:
    • sponsoring our own research projects
    • support outside researchers

  • Ways of sharing data:
    • APIs
    • regular data dumps

  • Key question:
    • how much can we release
    • to whom do we give access

  • Initial proposal (modified after discussion):
    • all user data and interaction data are completely open

  • Two kinds of data to be shared:
    • user contributions (forum posts, etc)
    • interaction data (clickstream)
      • log of all the actions of a logged-in user, including when they log in, what they read, etc.
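
To make the interaction-data idea concrete, here is a minimal sketch of what one clickstream entry could look like. The field names (`username`, `course`, `action`, `target`, `timestamp`) are assumptions for illustration only and do not describe the actual P2PU log format.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class ClickEvent:
    """One logged action of a logged-in user (hypothetical schema)."""
    username: str
    course: str
    action: str          # e.g. "login", "read", "post"
    target: str          # page or resource the action refers to
    timestamp: datetime


def to_record(event):
    """Convert an event to a plain dict that can be serialized, e.g. as JSON."""
    record = asdict(event)
    # Store the timestamp as an ISO 8601 string so it survives serialization.
    record["timestamp"] = event.timestamp.isoformat()
    return record


event = ClickEvent(
    username="alice",
    course="intro-to-python",
    action="read",
    target="/week1",
    timestamp=datetime(2011, 1, 15, 9, 30, tzinfo=timezone.utc),
)
record = to_record(event)
```

A flat record like this would work both for an API response and for a line in a regular data dump.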

  • Links to outside platforms:
    • currently only feasible to collect data from P2PU platform
    • a lot of interactions happen on other platforms
    • part of (my) proposed architecture for future platforms: enable the development of "bridges" to external platforms such as Blogger, Twitter, Flickr, YouTube, and GitHub, which would connect user identities, retrieve interaction data, and enable import or linking of content (important for portfolios, peer assessment, badges, etc.)

  • Data to exclude:
    • e-mail addresses
    • personal messages
    • application information
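
As a rough illustration of the exclusion list above, a dump or API layer could strip the excluded fields before anything is released. This is only a sketch: the key names `email`, `personal_messages`, and `application` are hypothetical stand-ins for wherever that data actually lives.

```python
# Hypothetical field names for the three excluded kinds of data.
EXCLUDED_FIELDS = {"email", "personal_messages", "application"}


def scrub_user(record):
    """Return a copy of a user record without the excluded fields."""
    return {key: value for key, value in record.items()
            if key not in EXCLUDED_FIELDS}


user = {
    "username": "alice",
    "email": "alice@example.org",
    "posts": ["Hello everyone"],
    "personal_messages": ["private note"],
    "application": {"goal": "learn Python"},
}
public_user = scrub_user(user)
```

Filtering with an explicit exclusion set keeps the policy in one place, so it can be changed without touching the export code. An allow-list of releasable fields would be the more conservative alternative.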

  • Could we make application information public as well?  
    • only make successful applications public
    • include one field that is clearly marked as private, e.g. "message to course organizer"
    • make it very clear to applicants that this will happen
    • being able to see everyone's course applications will scaffold "introductions"
    • could also be used to express your own learning goals, etc., which you can later refer back to
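
The rules proposed above (publish only successful applications, keep one private field) could be sketched as follows. The `status` value `"accepted"` and the field name `message_to_organizer` are assumptions made for this example, not the platform's real schema.

```python
# Hypothetical name of the single field that stays private.
PRIVATE_FIELD = "message_to_organizer"


def public_applications(applications):
    """Keep only accepted applications and drop the private field from each."""
    public = []
    for application in applications:
        if application.get("status") != "accepted":
            continue  # unsuccessful applications are never published
        public.append({key: value for key, value in application.items()
                       if key != PRIVATE_FIELD})
    return public


applications = [
    {"username": "alice", "status": "accepted",
     "goals": "build a web app", "message_to_organizer": "I may be late"},
    {"username": "bob", "status": "rejected",
     "goals": "learn basics", "message_to_organizer": ""},
]
visible = public_applications(applications)
```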

  • Current situation:
    • there is an API that gives access to
      • the click stream for an entire course, by username
      • all the applications for a course (all the application data, which users were accepted etc)
    • this API is only accessible to a few admins
    • this restriction makes it difficult even for those admins to script automatic retrieval of logs, whether for a single course or across all courses
    • there is no feature for creating data dumps, public or not
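
A data-dump feature could be as simple as looping over courses and writing one JSON line per event. The sketch below deliberately does not call the existing API: `fetch_clickstream` is a hypothetical callable that the caller supplies, and the shape of the events it returns is an assumption.

```python
import json


def write_dump(course_ids, fetch_clickstream, out):
    """Write one JSON line per event for every course; return the event count.

    fetch_clickstream is a caller-supplied function (hypothetical) that takes
    a course id and returns an iterable of event dicts. out is any object
    with a write() method, such as an open file.
    """
    count = 0
    for course_id in course_ids:
        for event in fetch_clickstream(course_id):
            line = json.dumps({"course": course_id, **event}, sort_keys=True)
            out.write(line + "\n")
            count += 1
    return count
```

Passing the fetch function in as a parameter keeps the dump logic independent of how access to the logs is eventually granted.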

  • Uses for data:
    • formative (real-time or near-real time, while courses are running)
      • for individual course organizers
        • could develop visualizations and analyses that support course organizers in running the course, e.g. by identifying struggling students and others who need extra support
        • would need a lot of thought about how to use this in the best possible way
      • for students
        • there have been experiments on other platforms (for example Knowledge Forum, by Dr Scardamalia) on the instructional value of releasing statistics and visualizations (latent semantic analysis, social network analysis, etc.) to all students
        • again something that has to be done with care
      • for administrators / community
        • with a rising number of courses, it is very difficult for the support team to get the "pulse" of the community and of individual courses
        • "dashboard" that shows the "health" of individual courses, enabling us to quickly provide extra support to courses that are struggling (or highlight courses that are highly successful)
    • summative/research
      • there are lots of open research questions that could be answered with this data, both internal evaluation questions for P2PU itself and more rigorous research studies of interest to the broader community
      • ability to probe questions of student engagement, depth of learning, and social networks, and to track student engagement over time (even across different courses)
      • these could have a huge impact on improving the quality of learning at P2PU
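
The "dashboard" idea mentioned under formative uses could start from a very simple health signal, such as the number of distinct users active in each course over the past week. The sketch below assumes events are dicts with `course`, `username`, and `timestamp` keys. Both that shape and the thresholds are illustrative assumptions, not a validated measure of course health.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def course_health(events, now, window_days=7):
    """Count distinct users active in each course within the recent window."""
    cutoff = now - timedelta(days=window_days)
    active = defaultdict(set)
    seen = set()
    for event in events:
        seen.add(event["course"])
        if event["timestamp"] >= cutoff:
            active[event["course"]].add(event["username"])
    # Courses with no recent activity still appear, with a count of 0.
    return {course: len(active[course]) for course in seen}


def struggling(health, min_active=3):
    """List courses whose recent active-user count is below the threshold."""
    return sorted(course for course, n in health.items() if n < min_active)
```

The same per-course counts, computed for successive windows, would also give a first view of engagement over time for the research questions above.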