Wednesday, January 11, 2023

ChatGPT vs. Online Formative Self-Assessments and Summative Multiple-Choice Exams

by Daniel Hickey and Qianxu (Morgan) Luo


This is the third post in a row considering ChatGPT’s implications for academic integrity, particularly in online education. The first post analyzed the media predictions. The second post tried to use ChatGPT to complete a graduate course on learning and cognition and write a literature review paper. We concluded that ChatGPT is indeed a threat to the academic integrity of many conventional educational practices, but that it would be difficult to use ChatGPT to complete those particular assignments, interactions, and literature reviews, without being detected and without learning quite a bit of the content.


ChatGPT vs. Summative  Assessments

Thanks to modern learning management systems and Google Forms, online educational assessment has become widespread, even for face-to-face classes. The popular Canvas LMS offers ten item types and four assessment formats for 40 combinations and can automatically score selected response and short-answer items. Time limits can thwart students from scouring the web for answers when scores matter. These allow all instructors to make powerful and efficient online assessments.

The most obvious question for us (and many others) is whether ChatGPT can thwart efforts to preserve the validity of online exam scores as evidence of what someone has learned in a course or knows about the field. One less obvious question concerns formative assessments: might ChatGPT give accurate feedback and relieve instructors of this burden? Another less obvious question concerns the role of digital proctors: Is ChatGPT going to require more of these intrusive and racially-biased monitors? (We are currently studying whether the approach below can help an online high school stop using them)

We scoured the research literature and uncovered a few preprints of recent studies exploring the first two questions. On the first question, Susnjak (2022) concluded that ChatGPT is “capable of exhibiting critical thinking skills and generating highly realistic text with minimal input, making it a potential threat to the integrity of online exams, particularly in tertiary education settings where such exams are becoming more prevalent.” Susnjak suggested we might need to return to in person and oral exams. Gilson et al. (2022) concluded that ChatGPT performed on the US Medical Licensing Exam (USMLE) “comparable to a third-year medical student”, while Kung et al. (2022) concluded that ChatGPT “performed at or near a passing threshold” on all three USMLE exams. On the second question Kung et al. concluded that ChatGPT “demonstrated high concordance and insight in its explanations,” which means it “may have the potential to assist with medical education…” We found no studies exploring the third question. But these studies were all conducted within the last month, so more are surely underway.


ChatGPT vs. Participatory Assessment

For the last decade, our lab has been exploring a situative approach to assessment that addresses long-standing challenges in assessment  As introduced in Hickey and Zuiker (2012) and elaborated in Hickey (2015) and Hickey and Harris (2021), this approach “aligns” learning across multiple levels of increasingly formal assessments:


  1. Public instructor assessment of student work and annotations positions students as accountable authors and audience for peer authors.

  2. Engagement reflections summatively assess prior engagement while formatively assessing understanding and future engagement.

  3. Formative self-assessments allow students to check and enhance their understanding and indirectly prepare for exams without requiring laborious private instructor feedback.

  4. Automated time-limited multiple-choice exams with unsearchable “best-answer” items provide valid evidence of achievement of learning outcomes.


These features free up instructor time to focus on efficient and powerful public assessment and engagement at the first level. The features support understanding and achievement while leaving digital “traces” of learning that bolster integrity.  The previous post examined a course that only included the first two levels and where students drafted a literature review paper week by week.  This post presents an initial analysis of a course called Assessment and Schools that include all four levels.


ChatGPT vs. Our Summative Exams

The course includes three summative exams at the end of each of the three modules. Illustrating the situative assessment design principle measure achievement discreetly (as needed) the exams are each only worth 10% of the final grade. The items are:


  • Drawn from the item banks for multiple textbooks, to help document transfer.

  • Never directly included in the assignments or formative assessments.

  • Include challenging “best answer” and are rewritten regularly in an effort to ensure that correct answers can’t be quickly located in search.

  • Regularly scrutinized to ensure that they have a high discrimination index (d), which means that students who get the item correct scored relatively well overall.

  • Regularly scrutinized to ensure that no more than one student gets a perfect score and the scores are normally distributed around an average difficulty (p) between 0.7 and 0.8 (70-80%)


Students are given forty minutes to complete each twenty-minute exam. To further ensure integrity, students are only shown their overall score after completing each exam and do not see the correct answer to each item.

Morgan entered each stem and four responses into ChatGPT (it recognizes them as multiple-choice items). She found that ChatGPT returned the correct answer for just 51% of the items. In contrast, the average score across students across exams was 80%. She then gave herself two minutes to see if she could locate that correct answer in google and concluded that she located the correct answer for 45% of the items (though this was likely inflated because students would not know the correct answer they were searching for).

We are currently analyzing the items and searching for conclusions regarding which item types ChatGPT is better or worse at.  The correlation between ChatGPT and p was surprisingly low.  The average difficulty of the items that ChatGPT missed was .67 while the average difficulty of the items it answered correctly was .77. ChatGPT correctly answered this item, which 93% of the students scored correctly:


Question/Prompt Which of the following assertions is most accurate?

Group of answer choices (inserted by ChatGPT)

Criterion-referenced score-based inferences are relative interpretations.

Norm-referenced score-based inferences and criterion-referenced score-based inferences are not essentially interchangeable.

Norm-referenced score-based inferences are typically more useful to classroom teachers than criterion-referenced score-based inferences.

Norm-referenced score-based inferences are absolute interpretations.

ChatGPT Reply: The most accurate assertion out of the options provided is:

"Norm-referenced score-based inferences and criterion-referenced score-based inferences are not essentially interchangeable."

Criterion-referenced score-based inferences are based on a specific set of standards or criteria and are used to determine whether a student has met those standards. They are relative to the specific criteria and are used to evaluate the student's performance relative to the criteria.

Norm-referenced score-based inferences, on the other hand, are based on the student's performance relative to a group of peers. They are typically used to rank the student's performance relative to others. Norm-referenced score-based inferences can be considered as absolute interpretation but is not necessary.

Criterion-referenced score-based inferences and norm-referenced score-based inferences are different types of inferences and are used for different purposes. They cannot be used interchangeably.

It can also depend on the context and the goals of an assessment, whether norm-referenced or criterion-referenced score-based inferences are more useful for a specific assessment. It can be helpful for teachers to understand the difference between the two and when to use them.


Both the answer and the answer explanations were factually correct.  In contrast. this item more relatively difficult for students (.75) 


Question/Prompt: A rubric is a scoring guide to be employed in judging students’ responses to constructed-response assessments such as a performance test.  Which one of the following elements is the least necessary feature of a properly constructed rubric?

Group of answer choices (inserted by ChatGPT)

An indication of whether a holistic or analytic scoring approach is to be used

An identification of the evaluative criteria to be used in appraising a student’s response

A designation of a performance standard required for skill-mastery (correct response)

Descriptions of different quality levels associated with each evaluative criterion

ChatGPT’s (incorrect)  reply: The least necessary feature of a properly constructed rubric among the options provided is:

"An indication of whether a holistic or analytic scoring approach is to be used"

Rubrics are often used to score constructed-response assessments such as performance tests, essays, and projects. A well-constructed rubric should have the following elements:

  1. An identification of the evaluative criteria to be used in appraising a student’s response: The rubric should clearly identify the specific criteria that will be used to evaluate the student's performance on the assessment.

  2. A designation of a performance standard required for skill-mastery: The rubric should include a description of the level of performance required for mastery of the skill or content being assessed.

  3. Descriptions of different quality levels associated with each evaluative criterion: The rubric should include a clear description of the different levels of quality that can be achieved for each criterion, such as excellent, good, fair, and poor.

The indication of whether a holistic or analytic scoring approach is to be used is something that is more associated with the choice of scoring method and it can depend on the type of assessment and the goal of the assessment. A holistic scoring approach means that an evaluator assigns a single score to the student's response, whereas an analytic scoring approach means that an evaluator assigns a separate score for each criterion or dimension of the assessment.

Rubrics can be used for both types of scoring methods, and the rubric itself doesn't dictate the scoring approach.

While all of the feedback provided is correct, ChatGPT selected the wrong answer. And there is nothing in the explanation that would point a student to the correct answer.  Its very high d index (.75) indicates that nearly all students who missed it were in the bottom half of the class in terms of their overall score.  

We are now further analyzing the items and searching for patterns. But clearly, more systematic study is needed, likely employing measurement scholars who have access to large pools of items created by professionals with detailed performance data. For now, we concluded that these exams do indeed boost academic integrity in the course. Some students may well boost GPT’s performance by scouring the output and checking answers but doing so likely would require knowledge about that concept and might have limited formative value. 

But it is worth noting that marginally engaged students sometimes score nearly as poorly on these exams as ChatGPT (i.e., 60%). Because each exam is only worth 10% of the grade, ChatGPT would have only lost 15% of a 100-point final grade, or a B if all other points were earned.


ChatGPT vs. Our Formative Self-Assessments

Unsurprisingly, the research literature on “cheating” on formative assessments is quite small.  Arnold (2016) used sophisticated techniques to find that doing so is “negatively related to academic progress” and “does not seem to pay off” in terms of summative exam performance. Our broader situative view of learning leads us to a rather different way of thinking about the formative functions of assessment. We view all assessments as “peculiar” forms of disciplinary discourse. We view learning primarily as increasingly successful participation in such discourse and only secondarily as the relatively static knowledge represented by assessment items. We acknowledge that this is unorthodox and continues to baffle our constructivist assessment colleagues and our empiricist measurement colleagues. But it has led us to innovative responses to the “conundrum” of formative assessment described in Bennett (2011) and Hickey (2015).

Building on Duschl and Gitomer (1997)Hickey et al. (2003) initiated an extended program of formative assessment research using situative theory. This work has gone in a number of directions. In the online Assessment course, each of the eleven assignments concludes with 5-10 formative self-assessment items.  These are constructed response items drawn from supplemental digital material from the textbook and the assignments.  Because they are more formal and objective than the assignments, they require a degree of transfer. Students are instructed to:


  • Try to draft a coherent/complete response from memory.

  • Consult with the textbook and/or student work as necessary to draft a complete/coherent response.

  • Compare their response with the “expert” answer displayed only after they submit something (anything).


 The key assessment design principles here are that (a) the items should require the transfer of knowledge, (b) students should believe that completing them as suggested will support exam performance, and (c) such students are well-positioned and motivated at that moment to understand feedback beyond the correct answer to that item.  

Students are instructed to complete the self-assessments after completing each assignment. But they are not mandatory and 80-90% of students typically attempt them. Naturally, we wondered what would happen if students consulted ChatGPT in addition to or instead of the textbook when completing these items.  Morgan tried to use ChatGPT to find the correct answer for each of the 78 items and compared that answer with the expert answer that the items provided. She also tried to find the correct answer on the web (no time limit) and tested both the chat GPT response and the expert response using the AI Content Detector at Writer.com.

She concluded that ChatGPT returned a clearly correct answer for 31% of the items and a partly correct answer for 11% of the items. The ChatGPT responses averaged 80% human-generated at Writer.com while the expert answers averaged 98% human generated. It is worth noting that many instructors might elect to use similar items in summative assessments, suggesting that ChatGPT is not a large threat to integrity in such a case.

As expected, ChatGPT was quite proficient and basic definitional items like this one:

Assessment question/prompt: In What-to-Assess Considerations, what are Relevant Curricular Configurations?

ChatGPT’s (correct) response: In the context of educational assessment, "relevant curricular configurations" refers to the ways in which the curriculum is organized and structured in a particular school or program. This can include things like the specific courses that are offered, the content and themes that are covered in those courses, and the way that instruction is delivered.

When designing assessments, it is important to consider the relevant curricular configurations so that the assessment can accurately measure student learning and progress. For example, if a school has a strong emphasis on hands-on, project-based learning, then an assessment that primarily consists of multiple choice questions may not be the best fit. Instead, the assessment should include opportunities for students to demonstrate their knowledge and skills through hands-on activities or projects.

Similarly, if a school focuses on particular themes or content areas, the assessment should reflect that focus. For example, a school that emphasizes environmental science should include questions or tasks related to environmental science in their assessment.

Relevant curricular configurations also can be linked to regional or national frameworks, like Common Core standards, or other standards or frameworks like the Next Generation Science Standards or International Baccalaureate program, these would also need to be considered when designing assessments.

It is important to keep in mind that different schools and programs may have different curricular configurations, so assessments should be tailored to the specific context in which they will be used.

Course’s “expert” answer: This concerns the extent to which students' mastery of curricula and performance on classroom assessments will contribute to mastery of some external state, national, or professional achievement test.  In other words how related is it to relevant external standards.


In fact, we see that ChatGPT provides a much more complete and comprehensive response than the expert response provided by the LMS (which was a rewording of an entry in the textbook, and probably should be extended). A student who relied on ChatGPT to complete the item as instructed might learn more using ChatGPT than the textbook, so long as the student paraphrased the information rather than simply copying and pasting. Not surprisingly, Mogan readily located an answer to the item on the web as well.

In contrast, we had dozens of items like this one that ChatGPT failed to answer correctly:


Assessment question/prompt: If a classroom teacher actually computed a Kuder-Richardson coefficient for a final exam, this would be an example of an:


stability reliability coefficient

internal consistency reliability coefficient

content validity coefficient

construct validity coefficient


ChatGPT’s (incorrect) response The dominant shortcoming of this item is that it confuses different types of reliability coefficients.


The Kuder-Richardson coefficient is a type of internal consistency reliability coefficient. It is used to measure the consistency of a test or questionnaire by comparing the responses of different items within the test.


Stability reliability coefficients, such as test-retest reliability coefficient or the equivalent-forms coefficient, are used to measure the consistency of test scores over time. They look at the consistency of scores obtained by a group of individuals when they take the same test on two different occasions.


Content validity coefficient and construct validity coefficient are not types of reliability coefficients but rather are forms of validity. Content validity coefficient evaluates if a test or measure is measuring the content it should be measuring, is it covering the whole domain, and if so, in an appropriate way. Construct validity coefficient evaluates if the test or measure is actually measuring what it is supposed to measure.


So to summarize, the statement is incorrect because it describes the Kuder-Richardson coefficient as a stability reliability coefficient, instead of an internal consistency reliability coefficient.

The use of the article “an” renders only Choice C a grammatically correct option.

 

Course’s “expert” answer: The use of the article “an” renders only Choice C a grammatically correct option

 

This response illustrates the title of Ian Bogosts' article in the special section of The Atlantic entitled “ChatGPT is dumber than you think.”  The open question is whether students would recognize that ChatGPT simply misunderstood this relatively easy item. 

The following self-assessment item illustrates a different point about the kinds of mistakes that ChatGPT is prone to make: 

 

Assessment question/prompt: What are the three primary deterrents to formative assessment according to Popham?

 

ChatGPTs (incorrect) response. According to James Popham, a leading expert on educational assessment, there are three primary deterrents to formative assessment:

 

  1. Time constraints: One of the biggest challenges to formative assessment is finding the time to design, administer, and analyze assessments, as well as provide feedback to students. Teachers are often already stretched for time and may not see the value in taking the time for formative assessment.

  2. Limited assessment literacy: Many teachers lack the knowledge and skills needed to effectively implement formative assessment, which can lead to a lack of confidence in their ability to design and use assessments.

  3. Lack of infrastructure support: Another deterrent to formative assessment is the lack of infrastructure support from schools and districts, such as a lack of resources to support assessment, technology and professional development to ensure teachers are prepared to use formative assessment in the classroom.

 

These are the three primary deterrents to formative assessment that Popham commonly highlights in his works, addressing these deterrents will help to promote the use of formative assessment in the classroom, which can lead to improved student learning and achievement.

 

Courses “expert” response: The three deterrents include:

  • Educators' misunderstanding about the nature of formative assessment: It is a test rather than a process; They only help teachers adjust their teaching (rather than helping students adjust their learning); Conflating commercial interim tests with classroom formative assessments

  • Educators' reluctance to change their ways 

  • Insensitivity of external achievement tests to the impact of formative assessment  (real or perceived).

 

This response nicely illustrates the points by Von Hippel in Education Next: “The viral chatbot is often wrong, but never in doubt.” ChatGPT knows that Popham is an assessment expert and it knows a good bit about formative assessment. But it does not know the answer to this specific question even though the answer is plausible in that those are indeed deterrents.

Finally, we uncovered a few questions and responses that illustrated perhaps the most worrisome concern that many have expressed about ChatGPT

 

Assessment question/prompt: In your own words, write the four steps for building a learning progression:

Building a learning progression involves four steps:

  1. Identify key concepts and skills: The first step in building a learning progression is to identify the key concepts and skills that students should learn at each grade level or developmental stage. This involves reviewing the existing curriculum and research on student learning to identify the most important and relevant content.

  2. Sequence the concepts and skills: The second step is to sequence the concepts and skills in a logical and developmentally appropriate order. This involves creating a clear progression of learning that builds on what students already know and can do, and prepares them for more advanced concepts and skills.

  3. Create assessment tasks: The third step is to create assessment tasks that align with the key concepts and skills identified in step one, and that can be used to measure student progress along the learning progression. These tasks should be designed to be challenging, but achievable for students at each stage of the progression.

  4. Analyze and interpret student data: The final step is to use the assessment tasks from step three to gather data on student progress, and then analyze and interpret that data to identify areas of strength and weakness in student learning. This information can then be used to inform instruction, adjust the learning progression as needed, and target additional support to students who are struggling.

 

Overall, the goal of building a learning progression is to provide a clear roadmap for student learning and to ensure that students are making progress towards long-term academic goals by providing them with well-defined and sequenced learning objectives, formative assessment and well aligned instruction.

 

Courses “expert” answer The following are the four steps written in Popham’s words. Check your own list against these steps to make sure yours has the same meaning.

 

Step 1: Acquire a thorough understanding of the target curricular aim.

Step 2: Identify all the requisite precursory subskills and bodies of enabling knowledge.

Step 3: Determine the measurability of each preliminarily identified building block.

Step 4: Arrange all the building blocks in an instructionally sensible sequence.


This response illustrates Roose’s point in an article about ChatGPT’s “brilliance and weirdness” in the New York Times. Roose observed that ChatGPT is“ominously good at answering the types of open-ended analytical questions that frequently appear on school assignments” but “prone to giving wrong answers.”  

Conclusion

ChatGPT is certainly impressive. We concluded that a score of 53% shows that ChatGPT is a modest threat to the integrity of these particular summative assessments. As with the Learning and Cognition course, we concluded that it would be difficult for a student to use ChatGPT to complete and discuss the weekly e-portfolio assignments in this course without being detected and without learning a substantial amount of content. 

We further concluded that ChatGPT might supplement the textbook as a resource for formative self-assessments but can’t effectively supplant it. We also concluded that a score of 31% means that ChatGPT is a very small threat to the integrity of these particular formative assessment items that were used in a summative assessment. 

This likely concludes this round of posts on ChatGPT.  Our heads are spinning imagining the possible research designs where we might use these two courses to explore some of the many suggestions for exploring ChatGPTs' value when instructors and students work together to exploit it in support of learning (such as Mollick and Mollick, 2022). We are both participating in an advanced topical seminar on theories of online learning this semester and look forward to tracking and systematically analyzing the explosion of research and media accounts.


Friday, January 6, 2023

ChatGPT vs. Participatory Online Learning? Participatory Learning (Mostly) Wins!

 by Daniel Hickey


Can ChatGPT hack my graduate course on learning and cognition? This post got very long. So here is the summary

  • I tried to hack my graduate course on Learning and Cognition in Education using the powerful new chatbot ChatGPT.
  •  ChatGPT could easily generate a “personally relevant” instructional goal and educational context to frame my “learning” in the course (in this case, in cybersecurity education)
  • ChatGPT could easily generate plausible social annotations and responses to peer discussion questions
  • ChatGPT could NOT generate engagement reflections, but these would not be hard to draft by a student who hacked
  • ChatGPT was able to write a marginally acceptable literature review paper, but fabricated some quotes and references. With more work such as including paper abstracts in the prompts, GPT is scarily good at referencing research literature, perhaps as well as a first-year graduate student.
But my students write their papers section by section each week and interact extensively with peers and the instructor via threaded comments on readings and drafts. My paper shows that ChatGPT knows about as much about learning and cognition as a typical student learns in my class. Perhaps, more importantly, ChatGPT readily authored new knowledge at the intersection of that knowledge and ChatGPT's knowledge of cybersecurity education. I concluded that it would be nearly impossible to use ChatGPT to complete the weekly assignments, interact with peers, reflect, and write a satisfactory literature review but do so undetected and without learning and retaining a significant amount of the course content along the way. Readers are encouraged to examine Ethan and Lilach Mollick's inspiring new paper that demonstrates three novel assignments that use ChatGPT to support deep content learning.

I have revised this post and the title several times as I have grown increasingly optimistic about the potential for using ChatGPT to support learning.  In particular, I was struck by the amount and value of feedback I got when I asked ChatGPT to analyze drafts according to Strunk and White's handbook. I really learned to write in tech writing courses at SDSU where my instructor proofread our drafts and required us to write out in longhand the Strunk and White rule we violated every time we violated it.  I can easily see using ChatGPT to streamline and personalize this relatively expensive educational practice

Sorry about the formatting problems. I should learn HTML!

ChatGPT vs. Learning

I spent much of my holiday break exploring just how powerful the new chatbot ChatGPT really is.  Like many, I was prompted to look into it by a viral essay in The Atlantic by a high school English teacher named Daniel Herman entitled The End of High School English.  Herman wrote:

Let me be candid (with apologies to all of my current and former students): What GPT can produce right now is better than the large majority of writing seen by your average teacher or professor. Over the past few days, I've given it a number of different prompts. And even if the bot's results don't exactly give you goosebumps, they do a more-than-adequate job of fulfilling a task

Herman went on to describe how ChatGPT excelled at an assignment that he had long considered to be "unhackable."  It is a specific version of the "compare and contrast" essay that many educators turned to once the Internet made it simple to locate summaries of almost any single work:

In January, my junior English students will begin writing an independent research paper, 12 to 18 pages, on two great literary works of their own choosing—a tradition at our school. Their goal is to place the texts in conversation with each other and find a thread that connects them. Some students will struggle to find any way to bring them together. We spend two months on the paper, putting it together piece by piece. 

Herman fed ChatGPT pairs of works that students had worked with previously and found that GPT "brought them together instantly, effortlessly, uncannily." He further reported how GPT instantly cleaned up a student's messy first draft: "It kept the student's words intact but employed them more gracefully; it removed the clutter so the ideas were able to shine through. It was like magic."

posted previously about my analysis of the nearly 30 media accounts of this issue so far. Some commentators were as worried as Herman.  But others argued that the risk was actually due to "mindless" assignments and that educators will just need to adapt. I am not an English teacher, but I don't imagine that many would dismiss Herman's assignment as mindless.  While most of the media accounts were mixed, most agree that the impact on education will be much larger than most previous technologies.  So I set out to explore whether my own online courses are similarly hackable.

ChatGPT vs. Expansive Framing and Participatory Learning and Assessment

For over a decade, my doctoral advisees and colleagues and I have been refining and studying a model of online education we call Participatory Learning and Assessment (PLA).  At PLA's core are the design principles for expansive framing that emerged in the situative design-based research of Randi Engle and Colleagues (1965-2012).  The principles suggest that students "problematize" learning from their own perspective. The goal is to position students as "authors" of new knowledge about the ways course concepts are related to their own experiences.  These expansively framed assignments are embedded within multiple levels of increasingly formal assessments. These assessments are intended to ensure "generative" learning that transfers readily and widely. But they also thwart cheating by leaving a clear "trace" of student learning, while avoiding expensive and intrusive digital proctors.

We have adapted PLA to a wide range of online course contexts, including secondary, undergraduate, and graduate courses, for-credit and open courses, and semi-synchronous and self-paced courses.  PLA first emerged in a graduate-level course called Learning and Cognition in Education.  Students learn about the three "grand theories" of knowing and learning (cognitive-associationist, cognitive-constructivist, and situative/sociocultural).  They learn how assumptions about knowing, learning, and transfer are tightly linked to each other, and then learn about the different implications of those assumptions for designing instruction, motivating engagement, and assessing learning.

Personalized Elements of Learning and Cognition in Education

Each student first generates a unique, personally-relevant instruction goal and an educational setting where that goal might be accomplished. They then engage with carefully selected readings from that perspective and locate additional references.  Students are pushed to identify elements of reading or educational resources that are "most relevant" and to justify those selections. Each week students add an entry to their "g-portfolio" (a google doc shared with the class), which they gradually turn into a literature review paper. 

In recent years we have adapted the Perusall social annotation platform.  This makes it simple for students to comment on and discuss assigned readings and threaded comments.  The reading prompts are typically expansively framed to help avoid mindless discussion threads.  

After each assignment, students submit a brief reflection.  The reflections are intended to "proleptically" shape future engagement because students come to expect that they will have to reflect on those aspects of their engagement.  But the reflections are also a summative assessment of prior engagement in that it is difficult to respond to them coherently without engaging in the assignment.

My Expansively Framed Course vs ChatGPT

Naturally, I wondered whether ChatGPT could be used complete all of these course elements.  While many elements are optional, most students engage with gusto (though some find the workload relatively demanding). I set out to explore whether ChatGPT could generate plausible examples of the following:

  • A personally relevant instructional goal and context?
  • Social annotations to reading passages and responses to peer comments?
  • Engagement reflections?
  • A complete literature review paper?

In other words, would I as an instructor recognize that students were using ChatGPT to cut corners in my class? I further wondered whether the ChatGPT output would be detectable by Turnitin or other plagiarism detectors or the newly available chatbot detectors such as the one at Writer. 

 Methods and Results

 I tried to complete each of the different elements of the course using ChatGPT while "learning" as little about learning and cognition as possible.  I used the topic of cybersecurity because I have been working in that space a lot for the last few years. 

 ChatGPTs Instructional Goal and Context

I used the topic of cybersecurity because I have been working in that space for the last few years.  My first prompt was rather broad: 

My Prompt: write an example instructional goal in undergraduate cybersecurity education for someone who teaches cybersecurity

 ChatGPT's response: Goal: Students will be able to identify and explain the fundamental principles of cybersecurity, including confidentiality, integrity, and availability.

Indeed, in designing an online self-paced course for K-12 network analysts a few years ago, I learned that the "CIA" principles are the starting point of most introductory cybersecurity courses.  GPT actually generated a dozen more lines, but I just went with the first sentence above.  I decided to run with that context for engaging with this goal:

My Prompt: Describe a self-paced online course for K-12 cybersecurity analysts where they might learn about confidentiality, integrity, and availability

ChatGPT's response: Course Title: "Fundamentals of Cybersecurity"

Course Description: In this course, K-12 cybersecurity analysts will learn about the fundamental principles of cybersecurity and how to apply them in practice. Topics will include confidentiality, integrity, and availability, as well as common types of cyber threats and best practices for protecting against them.

Course Outcomes:

·       Define confidentiality, integrity, and availability in the context of cybersecurity.

·       Explain the role of these principles in protecting against cyber threats.

·       Describe common types of cyber threats, including malware, phishing attacks, and network vulnerabilities.

·       Identify best practices for protecting against cyber threats, such as using strong passwords, keeping software up to date, and using antivirus software.

Course Format:

·       The course will be self-paced and delivered entirely online, with no in-person meetings required.

·       The course will consist of a series of modules, each covering a different topic. Each module will include a combination of video lectures, readings, and interactive exercises.

·       The course will also include quizzes and a final project, allowing students to demonstrate their knowledge and skills.

·       There will be opportunities for students to ask questions and interact with the instructor and other students through a discussion forum.

This self-paced online course provides K-12 cybersecurity analysts with a flexible and convenient way to learn about the fundamental principles of cybersecurity. By working through the course modules and completing quizzes and a final project, students will be able to demonstrate their understanding of the material and apply it in practice. The course also includes opportunities for interaction and support through a discussion forum. 

That response was breathtaking in its detail.  It is far more detailed than students are expected to write,  but it does demonstrate how much background knowledge GPT brings and how it can readily customize responses.  However, when I fed the post to the AI Content Detector at Writer.com it deemed it 68% human generated. In contrast, student generated response to this assignment were consistenty judged to over 90% human-generated.

In summary, GPT was entirely capable of generating a realistic instruction goal for a specific discipline and then describing a context where that goal might be met. In fact it created an entire course description. However the AI Content Detector recognized the text as machine-generated.

Social Annotation via ChatGPT

An initial ungraded assignment has students socially annotate the course syllabus.  The annotation instructions first ask each student to indicate which of seven course objectives is most relevant and which is least relevant in a way that says something about them as an educator and/or graduate student: The first response was not in first person, so I revised the prompt:

My prompt: In the first person, explain which one of these goals is most relevant to a graduate student who teaches cybersecurity online: (with course goals copied and pasted in)

ChatGPT's response. As a graduate student who teaches cybersecurity online, I believe that the goal of "describing the implications of cognitive and sociocultural theories for designing instruction, motivating engagement, and assessing learning, in a specific learning context" is most relevant to me. Understanding these theories and their implications is important for effectively designing and delivering online instruction in the field of cybersecurity.

By studying cognitive and sociocultural theories of knowing, learning, and transfer, and how they differ from behavioral theories, I can gain a deeper understanding of how students learn and how to design instruction that is effective and engaging. This knowledge can also help me to motivate student engagement and assess learning in a way that is aligned with these theories.

In addition, by considering the role of culture and context in different theories of knowing and learning, I can design inclusive instruction that takes into account the diverse backgrounds and experiences of my students. This is particularly important in an online setting where students may come from a variety of cultural and geographic locations.

Overall, this goal is highly relevant to my role as a graduate student who teaches cybersecurity online, as it will help me to design and deliver effective and inclusive instruction in this field.

That response was certainly plausible and personalized, though a bit longer than typical annotations, and could easily be shortened. However, the AI Content Detector deemed it to be 92% human-generated conten. ChatGPT similarly handled the "least" relevant by explaining that "learning APA formatting and referencing" would not be as important for someone teaching cybersecurity.

The first graded assignment has students socially annotate a landmark handbook chapter that systematically contrasts associationist, constructivist, and sociohistoric theories of knowing and learning and their implications for education. One of the annotation prompts asks students to post a comment indicating which of the three perspectives is more personally and professionally relevant. The discussion of the third emphasized the early Soviet theorist Lev Vygotsky, so I picked that one:

My prompt: Explain why Vygotsky's sociohistoric learning theory is most relevant for learning the cybersecurity principles of confidentiality, integrity, and availability.

ChatGPT's response: Vygotsky's sociohistorical learning theory, which emphasizes the role of social interactions and cultural experiences in learning, is particularly relevant for learning the cybersecurity principles of confidentiality, integrity, and availability because these principles are highly dependent on the context in which they are applied.

According to Vygotsky, learning occurs through social interactions with more knowledgeable others, such as teachers, mentors, or peers. In the context of cybersecurity, these interactions might include discussions with instructors or colleagues about the importance of confidentiality, integrity, and availability in protecting against cyber threats, or participating in hands-on activities to practice applying these principles.

Vygotsky's theory also emphasizes the importance of cultural experiences in learning, as these experiences shape an individual's understanding of the world and their place in it. In the context of cybersecurity, cultural experiences might include exposure to different norms and expectations around data protection and privacy, or the influence of laws and regulations related to cybersecurity.

By considering the social and cultural contexts in which the principles of confidentiality, integrity, and availability are applied, learners can develop a deeper understanding of these concepts and how to apply them effectively in practice. 

That was certainly a convincing response, certainly with enough detail to serve as a thesis for a literature review paper. In particular, the reference to the contextual nature of sociohistoric theory was eerily prescient. Notably, the AI Content Detector deemed the comment to 83% human-generated.  Indeed, over and over again, I found that ChatGPT was able to generate plausible responses to a wide variety of expansively framed annotation prompts.

Initially, ChatGPT refused to generate a response to a student’s annotation asking a  question about a later reading that collapsed associationist and constructivist perspectives into a single “cognitive” pespective.  But it responded so when I gave it more information; it revised it when asked to write more informally

My Prompt (the peer question): Why did the authors omit empiricist and cognitive associationist perspectives when they so clearly included these perspective later on? It seems like they would have at least mentioned when the cognitivists overlapped.

ChatGPT's failed response: Without more context, I am unable to accurately answer the question. Can you provide more information or clarify the question?

My revised prompt: these authors lumped cognitive associationist and cognitive constructivist theories into a single perspective and that did not really make sense to me

ChatGPTs revised response. It is possible that the authors grouped these two perspectives together because they believe that they share some commonalities or that they are closely related. It is also possible that the authors chose to group these two perspectives together for the sake of simplicity or clarity.

It is important to note that different authors and researchers may have different ways of categorizing and organizing theories, and it is not uncommon for different authors to group theories together in different ways. It is always a good idea to carefully read and consider the arguments and reasoning that the authors provide for their classification of theories, rather than making assumptions based on how theories are labeled or grouped.

My prompt to simplify the response: but write that answer more informally like a student

ChatGPT's simplified response: It's possible that the authors lumped these two perspectives together because they think they're similar or related in some way. It could also be that they just wanted to keep things simple or clear by grouping them together. It's worth noting that different authors can have different ways of organizing theories, and it's not uncommon for them to group things differently. When you're reading about theories, it's always a good idea to pay attention to the arguments and reasoning the authors give for how they've classified things, rather than just assuming based on labels or groupings.

The simplified response still read more like a professor's reply than a peer's.  But it was certainly convincing and plausible. The AI Content Detector deemed the simplified response to be 83% human-generated.

In summary, GPT is certainly capable of generating plausible social annotation comments and responses to peer discussion questions. And those responses are judged to be mostly human-generated

Can ChatGPT Generate Engagement Reflections? (NO!)

At the end of each assignment, students are asked to reflect on the following aspects of their engagement:

·       Contextual engagement. How well-suited were your current interests and goals as a graduate student for this assignment?

·       Collaborative engagement. What did you learn from others and what did others learn from you?  Be specific and name people and what they contributed that was helpful.

·       Cultural engagement. What did you learn this week that might help make education more culturally relevant for diverse learners? In what ways did factors associated with culture, race, gender, orientation, language, etc., impact how YOU engaged with this assignment or these ideas?

·       Conceptual engagement.  Were there any concepts that you encountered this week that seemed important but that you struggled to understand?

·       Consequential engagement. Did you learn anything this week that might have specific consequences for you in the future?

Over and over again, ChatGPT refused to generate a response to these prompts or any versions of them. Instead, ChatGPT repeatedly stated, “as a machine learning model, I do not have personal experiences or interactions with people.” 

In summary, this aspect of my course may be unhackable with ChatGPT. However, it would not be difficult for someone who hacked an assignment to write a plausible reflection.

Can ChatGPT Write an Entire Literature Review Paper? (YES!)

Finally I set out to write a literature review paper.  Actually I started the paper before I started this post.  Instead of the CIA principles ChatGPT generated for me, I want with cybersecurity in general and use cyber “ranges” and digital “twins.” These hyper-realitistic simulations are taking cyber education by storm and I have been learning and writing about them a lot lately. 

Here is a link to the 5000 word paper.  Here is the my prompt and GPT’s abstract:

My prompt: Write an abstract for a paper that uses cybersecurity education to contrast cognitive associationist, constructivist, and sociohistoric theories of knowing, learning, and transfer and their implications for designing cyber ranges and digital twins, making education more inclusive, motivating learning, and assessing learning. Conclude the cogntive assoctionist theory is least relevant and that sociohistoric theory is most relevant

ChatGPT’s response: This paper investigates the use of cybersecurity education to contrast cognitive associationist, constructivist, and sociohistoric theories of knowing, learning, and transfer. Through the analysis of existing literature and case studies, the paper explores the implications of these theories for the design of cyber ranges and digital twins, with a focus on making education more inclusive, motivating learning, and assessing learning outcomes. The results of this analysis reveal that while cognitive associationist theory is least relevant in the context of cybersecurity education, constructivist and sociohistoric theories offer valuable insights and approaches. In particular, socio historic theory, which emphasizes the role of cultural and social context in shaping knowledge and understanding, is found to be the most relevant for the design and implementation of effective cybersecurity education. Overall, this paper contributes to the understanding of how different theories of knowing and learning can be applied in the field of cybersecurity education, and provides practical guidance for educators looking to optimize the effectiveness of their instructional methods.

Here are some observations from the process. Most importantly, when I asked it to reference readings from class, it faked it really well. For example, I asked GPT to explain constructivist approaches to motivating engagement and to quote Sinatra, Heddi, and Lombardi (2015) it generated a plausible quote (page 8) and plausible reference. But quote and the reference were entirely fabricated:

Sinatra, G. M., Heddi, A., & Lombardi, D. (2015). Motivating constructivist learning: A review of the literature. Journal of Educational Psychology, 107(1), 1-17.

This was the actual article we read:

Sinatra, G. M., Heddy, B. C., & Lombardi, D. (2015). The challenges of defining and measuring student engagement in science. Educational Psychologist50(1), 1-13.

Because students add a section to their paper each week that references the assigned article, this might thwart hackers. But when I asked GPT to explain constructivist approaches to motivating cybersecurity learning using the abstract of Sinatra et al., GPT generated a plausible paper section:

Motivating constructivist learning in cybersecurity can be challenging due to the complex and constantly evolving nature of the field. However, engagement is a key factor in student motivation and achievement, and there are several ways to promote engagement in cybersecurity education.

According to Sinatra et al. (2015), engagement can be understood in four dimensions: behavioral, cognitive, emotional, and agentic. Behavioral engagement refers to the extent to which students are actively participating in their learning, such as through hands-on activities or discussions. Cognitive engagement refers to the level of mental effort and attention students are putting into their learning. Emotional engagement refers to the feelings of interest and enjoyment students experience while learning. Agentic engagement refers to the sense of agency and control students have over their own learning.

To promote engagement in cybersecurity education, it is important to create a supportive and interactive learning environment that encourages students to take an active role in their own learning. This can include a variety of activities such as hands-on simulations, problem-based learning, and collaborative projects.

Here are some other observations I took away from this activity.

  •  ChatGPT generated everything but the headings. I certainly knew what I was looking for but any student in my class would surely be able to figure out the prompts.
  • It took me about three hours.  I certainly had to do some trial and error with prompts.  For example when I used “empiricist” instead of “cognitive associationist,” it assumed I meant behaviorism. 
  • It wrote fine descriptions of cyber ranges and digital twins and even gave citations from Wikipedia. 
  • GPT made some correct references.  When I told it to cite John Anderson’s ACT-R theory, it did so correctly on page three. It got the book right but added Lynn Reder as the second author but that was a mistake (perhaps caused by Anderson, Reder, and Simon's widely-cited 1996 Educational Researcher paper)
  • When I asked GPT for APA references, it found some correct ones (e.g., Greeno, 1998; Lave & Wenger, 1991).  Others it just made up, but they looked very real (Brown et al., 2019) 
  • Perhaps most stunning was GPTs ability to generate a plausible explanation for why cognitive associationist theory was least relevant for my goal and context (page 6) and why sociohistoric theory is most relevant (pages 12-13).
  • To write the paper summary. I pasted in the first half of the abstract and asked GPT to “say more about this” and then did the same thing with the second half. 
  • I analyzed all sections in the AI Content Detector (which has a 350 word limit). Scores ranged from 23% to 100% human-generated with an average of 81%.  I tested 30 random paragraphs from 15 student papers and got an average of 88%, ranging from 17% to 100% human-generated. Because these papers were written section by section across the Fall 2022 semester, it is very unlikely they used ChatGPT.

 In summary, ChatGPT was remarkably good at some aspects of writing a graduate-level paper but not others. ChatGPT's ability to contextualize concepts fundamentally thwarts some aspect of expansive framing but not others. But its scores in the AI Content Detector were similar to actual student papers. 

The main finding is that requiring students to reference specific sources in their papers looks like a promising way of thwarting ChatGPT. There are workarounds. But it would likely be difficult to write a paper that references most of the course readings and other self-selected references without learning a significant amount of the content. 

A worthwhile thought experiment is imagining what would happen if every student used ChatGPT to generate all of their annotations, paper sections, and peer comments each week. As shown in my paper, ChatGPT knows roughly as much about learning theories and educational practice as I would expect an average graduate student to learn in my class. GPT also knew quite a bit about cybersecurity education, including the hottest new trends. Most importantly GPT was remarkably able to author new knowledge at this intersection. It seems reasonable to assume that these students might retain half of their new intersectional knowledge and perhaps 10% of each of their classmates' intersectional knowledge. That would likely be more than some of the less ambitious students are currently taking away from the experience. Dissertation study anyone?

 

 

Top of Form

Bottom of Form

 

Thursday, January 5, 2023

What Does the Media Say about ChatGPT and Education?

Daniel Hickey & Qianxu Morgan Luo

 Like millions of others, we have been quite impressed by the power of ChatGPT.  Numerous media accounts argue that education will never be the same. It is remarkably capable of generating original prose that is not detectable by the current generation of plagiarism detectors like Turnitin. Many have noted that ChatGPT is particularly good at writing "compare and contrast" essays that many educators presumed were difficult or impossible to hack by rewriting information located on the web.

ChatGPT really exploded in December 2022.  We suspect that many educators saw a massive improvement in the depth and quality of take-home exams and end-of-semester essays at that time.  We predict that many of us are going to find our existing approaches to instruction and assessment upended once the new semester begins.

What Does the Media Say So Far?

We are systematically analyzing the accounts as they come out. As of today, we are up to 27, which includes both objective reports and editorials. Here is what we have found so far:

  • Eight of them were classified as "worried" or "very worried." These included Stephen Marche's prescient 2021 article in The New Yorker that examined an earlier writing bot. Recognizing the rapid, massive improvement, Marche wrote that "the undergraduate essay, the basic pedagogical mode of all humanities, will soon be under severe pressure."
  • Fifteen were classified as "mixed." Many of these were media accounts that aimed to be objective. New York Times technology columnist Kevin Roose pointed out that ChatGPT is "ominously good at answering the types of open-ended questions that frequently appear on school assignments" but is "prone to giving wrong answers." Many suggested in-class handwritten essay exams or asking students to give impromptu presentations on the assignments they ostensibly authored.
  • Four were classified as "positive" or "very positive." Most reminded readers of similar concerns with prior technologies like spell checkers and blamed the risk on shallow instruction. For example, English professor Blaine Greteman's editorial in Newsweek argued that it is "time for a new final exam, one that demands students find out something about themselves, and to tell you in a voice that is their own."

Here are our observations of the media coverage so far:

  • As proponents and scholars of online learning, we were surprised by the lack of discussion of the specific consequences for online education. Such settings are likely to preclude in-class essays or impromptu presentations as a short-term response.
  • Given all the suggestions that educators will need to assign in-class handwritten essays, we are surprised that no one has mentioned that many young people are incapable of writing legibly at the speed that would be needed for this to be practical.
  • As learning scientists, we worry that there has not been enough attention to the crucial role that writing plays in learning. As Marlene Scardamalia and Carl Bereiter convinced us in the late 1980s, skilled writers engage in knowledge transformation, using text to overcome the limits of short-term memory. In contrast to more novice knowledge-telling writers, skilled writers typically know a lot more when they complete an essay, article, or assignment.
  • To the advocates who liken ChatGPT to other innovations (from the slide rule to graphing calculators to Google Translate), a week of experimentation has convinced us that ChatGPT is already more powerful than all of the other technologies combined.  And it is only going to get more powerful.
Where Are We Going Next?

This is the first of several posts exploring the implications of ChatGPT. The next post will share a paper that ChatGPT wrote for Dan's online graduate course on Learning and Cognition.

PS. We missed the excellent article in The Chronicle of Higher Education by Beth McMurtrie. The title captures its insight: "AI and the Future of Undergraduate Writing: Teaching Experts Are Concerned, but Not for the Reasons You Think." It makes several important points.
  •  Typical high school English and the five-paragraph essay are responsible for training a generation of knowledge tellers.  
  • Many of the suggestions for thwarting ChatGPT are very labor-intensive.  We are currently writing another blog post that will dig more deeply into this issue.
  • It linked to the public page from Anna Mills compiling suggestions for essay prompts that might thwart chatbots.


Thursday, August 26, 2021

New Article about Situative Assessment

The awesome Diane Conrad of Athabasca University guest-edited a special issue of Distance Education on assessment and was kind enough to accept our proposal to present our situative approach to online grading, assessment, and testing:

Hickey, D., & Harris, T. (2021). Reimagining online grading, assessment, and testing using situated cognition. Distance Education, 42(2), 290-309.

The first part of the paper reframes, for online settings, the "multilevel" model of assessment introduced in a 2012 article in the Journal of Research in Science Teaching and a 2013 article in the Journal of the Learning Sciences:

  1. Immediate-Level Ungraded Assessment of Online Discourse via Instructor Comments
  2. Close-Level Graded Assessment of Engagement via Informal Reflections
  3. Proximal Formative Self-Assessments
  4. Automated Distal Summative Achievement Tests

The second part of the article introduces ten new assessment design principles:
  1. Embrace Situative Reconciliation over Aggregative Reconciliation
  2. Focus on Assessment Functions Rather than Purposes
  3. Synergize Multiple Complementary Types of Interaction
  4. Use Increasingly Formal Assessments that Capture Longer Timescales of Learning
  5. Embrace Transformative Functions and Systemic Validity
  6. Position Learners as Accountable Authors
  7. Reposition Minoritized Learners for Equitable Engagement
  8. Enhance Validity of Evidence for Designers, Evaluators, and Researchers
  9. Enhance Credibility of Scores and Efficiency for Educators
  10. Enhance Credibility of Assessments and Grades for Learners

I was particularly pleased with the new ideas under the seventh principle. We were able to use Agarwal and Sengupta-Irving's (2019) critique of Engle and Conant's (2002) Productive Disciplinary Engagement framework and their new Connective and Productive Disciplinary Engagement framework. It forms the core of our Culturally Sustaining Classroom Assessment framework, which we will be presenting for the first time at the Culturally Relevant Evaluation and Assessment conference in Chicago in late September 2021.

Thursday, May 13, 2021

Articles, Chapters, and Reports about Open Badges

by Daniel Hickey

Thanks to Connie Yowell and Mimi Ito at the MacArthur Foundation's Digital Media and Learning Initiative, I had the pleasure of being deeply involved with digital badges and micro-credentials starting in 2010. While we no longer have any funding for this work, my colleagues and I are continuing to engage with the community. I am thrilled to see the continued growth and the wide recognition that micro-credentials offer new career pathways to non-traditional learners.

I get occasional requests for copies of chapters, articles, and reports that we produced, as well as some general "where do we begin" queries. Given that we were funded to provide broad guidance from 2012 to 2017, we produced some things that beginners and advanced innovators have found quite useful. We continued to publish after MacArthur ended the DML initiative and the funding ran out. Here is an annotated list of resources. We hope you find them useful!

Getting Started

If you are new to badges and microcredentials, this might be a good place to get some basic background:

Where Badges Work Better

We studied the 30 badge systems that MacArthur funded in 2012 to uncover the badge system design principles that might guide the efforts of innovators.  This included general principles and principles for recognizing, assessing, motivating, and studying learning. These findings were collected in a short report at EDUCAUSE and our longer report:

We also did a follow-up study two years later to determine which systems resulted in a "thriving" badge-based ecosystem. Most of the constructivist "completion-badge" systems and associationist "competency-badge" systems failed to thrive; many never got past piloting, and some never issued any badges. It turned out that wildly optimistic plans for assessing completion or competency undermined those projects. In contrast, most of the sociocultural "participation-badge" systems were still thriving, in part because they relied on peer assessment and because they assessed social learning rather than individual completion or competency:

Endorsement 2.0 and Badges in the Assessment BOOC

An important development is "endorsement" in the Open Badges 2.0 standard. It allows a "BadgeClass" to carry an endorsement (e.g., from an organization, after reviewing the standards) and each "assertion" of that badge class to carry an endorsement (e.g., from a member of that organization, after reviewing the evidence in the badge). A rough sketch of how these two levels relate appears below. Nate Otto and I summarized this feature in EDUCAUSE Review and predicted its impact in the Chronicle:
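
Here is a minimal sketch, as plain Python dictionaries, of the two levels of endorsement just described. The field names are simplified, the URLs are invented, and this is a paraphrase of my reading of the 2.0 vocabulary rather than verbatim JSON-LD, so consult the Open Badges 2.0 documentation before building on it.

# Illustrative sketch only: simplified fields, invented URLs, and a paraphrase
# of the Open Badges 2.0 vocabulary rather than verbatim JSON-LD.

# An organization endorses the BadgeClass itself, after reviewing its criteria.
badge_class_endorsement = {
    "type": "Endorsement",
    "issuer": "https://example.org/professional-association",
    "claim": {
        "id": "https://example.org/badges/network-defense",  # the BadgeClass
        "comment": "The criteria align with our professional standards.",
    },
}

# A member of that organization endorses one learner's assertion of the badge,
# after reviewing the evidence packaged inside it.
assertion_endorsement = {
    "type": "Endorsement",
    "issuer": "https://example.org/members/jane-reviewer",
    "claim": {
        "id": "https://example.org/assertions/12345",  # one learner's badge
        "comment": "The evidence demonstrates the claimed competency.",
    },
}

badge_class = {
    "type": "BadgeClass",
    "id": "https://example.org/badges/network-defense",
    "name": "Network Defense Fundamentals",
    "endorsement": [badge_class_endorsement],
}

assertion = {
    "type": "Assertion",
    "id": "https://example.org/assertions/12345",
    "badge": badge_class["id"],
    "evidence": "https://example.org/portfolios/learner-42",
    "endorsement": [assertion_endorsement],
}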

This chapter describes the Google-funded "Big Open Online Course" ("BOOC"), which really pushed the limits of open badges, including one of the first examples of "peer endorsement" and "peer promotion." It also showed that our asynchronous model of participatory learning and assessment (PLA) could be used at scale to support highly interactive learning with almost no instructor engagement with open learners:

The Varied Functions of Open Badges

This chapter used the BOOC badges to illustrate the range of functions of open badges. It shows how badges support the shift (a) from measuring achievement to capturing learning, (b) from credentialing graduates to recognizing learning, (c) from compelling achievement to motivating learning, and (d) from accrediting schools and programs to endorsing learning:

This chapter used example badges from sustainability education to illustrate the same four functions of digital badges. The badges came from Ilona Buchem's EU-funded Open Virtual Mobility project and the e-Learning Academy of the UN's Food and Agriculture Organization (FAO). BTW, the e-Learning Academy features some of the best self-paced open courses I have ever seen. The assessments are great, and you really can't prank them. If it says the course will take two hours, it is really impossible to earn the badges without spending two hours learning (I tried!):

This 2017 chapter presents, in the context of open badges, the situative model of assessment that was first published in Hickey (2003). It is my response to people like Mitch Resnick who claim that open badges will undermine intrinsic motivation. I agree with him that they will if you use them as meaningless tokens. So don't do that, Mitch! Instead, take advantage of the fact that badges contain meaningful information and can circulate in social networks and gain more meaning, which has consistently been shown to enhance free-choice engagement:

Validity vs. Credibility

Early in my journey with digital badges, Carla Casilli blew my mind when her blog post explained how the "open" nature of open badges forced us to rethink validity in assessment and testing. The ability of a viewer to interrogate the evidence contained in a badge or micro-credential means that the credibility of that evidence is more important than the validity of the credential in the traditional sense. So I was happy to write with her about this important issue:

Wednesday, May 12, 2021

New articles on Participatory Learning and Assessment (including inclusion)

Yikes, it has been a long time since we have posted. Partly what happened is that we redirected our energy from blogging to publishing. Starting in 2019, we began translating the theory-laden design principles into practical steps for readers who may or may not be grounded in sociocultural theories. This was serendipitous in light of the pandemic and the explosion of interest in asynchronous online learning.

In contrast to our earlier articles, these new articles reflect the influence of current research on power and privilege in the learning sciences. Each includes design principles and/or steps that are intended to "reposition" minoritized learners. In particular, the changes reflect the influence of papers by Priyanka Agarwal and Tesha Sengupta-Irving on Connective and Productive Disciplinary Engagement (CPDE; 2019). Each of the descriptions below is hotlinked to a copy of the article.

This first article is a very gentle introduction to online participatory learning and assessment (PLA). It was written for educators who have no experience teaching online and are not grounded in any particular theory of learning.

This article describes how we translated the PLA principles into fourteen steps, focusing on engagement routines. It was written for instructional designers and others who are grounded in more conventional cognitive-associationist and cognitive-constructivist theories of learning.

This one introduces ten new situative assessment design principles, building on the "multi-level" assessment model in Hickey and Zuiker (2012).  While it includes the theoretical grounding, it was written for readers who might not be grounded in situative theory.