re-mediating assessment: "Prompt Engineering" in Generative AI: Scholarly Referencing in Bing

By Daniel Hickey and Qianxu (Morgan) Luo

In our previous post, we examined a crucial question in the exploding debate over the impact of generative AI on school learning. Our analysis of hundreds of media accounts uncovered dozens of problematic suggestions that educators should ask students to reference specific information that is presumably not part of a given platform's large language model (such as specific scholarly references). We argue that this overlooks the simple "prompt engineering" workaround of including the abstract (or sometimes the introduction) of the article in the prompt.

We have repeatedly found that ChatGPT is quite proficient at comprehending, analyzing, and applying an article when the abstract is provided. In our most recent post we found that in one example, ChatGPT was much better at this task than Google's new Bard. It appeared that Bard just ignored the abstract and dug into whatever it already knew about the topic.
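
For readers who want to try the workaround themselves, here is a minimal Python sketch of how such a prompt might be assembled. The function name build_engineered_prompt and its parameters are our own illustrative choices, not part of any platform's API. The example values come from the prompt we test later in this post, with the abstract shortened to its first two sentences for brevity.

```python
def build_engineered_prompt(task: str, citation: str, abstract: str) -> str:
    """Combine the task, the citation, and the pasted abstract into one prompt.

    Hypothetical helper for illustration only; it just builds a string.
    """
    # Collapse the line breaks and repeated spaces that often come along
    # when an abstract is copied from a PDF or a web page.
    cleaned = " ".join(abstract.split())
    return f"Use the following article by {citation} to {task}: {cleaned}"


task = (
    "motivate engagement by cybersecurity learners who are using "
    "a digital range and use APA citations"
)
citation = "Sinatra, Heddy, & Lombardi (2015)"

# Only the first two sentences of the abstract are shown here for brevity.
abstract = """Engagement is one of the hottest research topics in the field of
educational psychology. Research shows that multifarious benefits occur when
students are engaged in their own learning, including increased motivation
and achievement."""

prompt = build_engineered_prompt(task, citation, abstract)
print(prompt)
```

The resulting string is simply pasted into the chat box of whichever platform is being tested. Nothing about it is specific to ChatGPT, Bard, or Bing.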

Generative Scholarly Referencing in Microsoft's Bing

We just got access to Microsoft's New Bing and, of course, wanted to put it to the same test. The big difference with Bing is that it has access to the Internet. When we gave New Bing that same initial prompt, we got a VERY different response.

Initial prompt: Use Sinatra, Heddy, and Lombardi (2015) to motivate engagement by cybersecurity learners who are using a digital range and use APA citations.

New Bing's "more balanced" response:

Searching for: Sinatra, Heddy, & Lombardi (2015) cybersecurity learners digital ranges APA citations

I found an article by Sinatra, Heddy, & Lombardi (2015) that talks about how engagement is one of the hottest research topics in the field of educational psychology 1. They suggest that engagement is a key factor in successful learning outcomes and that it can enhance motivation and achievement in cybersecurity learners who are using a digital range 1.
Here’s an example of how you can use their ideas to motivate engagement and use APA citations: “Research in educational psychology has consistently shown that engagement is a key factor in successful learning outcomes” (Sinatra, Heddy, & Lombardi, 2015)1.
I hope this helps! Let me know if you have any other questions.

Learn more: 1. veletsianos.com 2. files.eric.ed.gov 3. frontiersin.org

Suggested follow-up prompts: What is a digital range? What are some other ways to motivate engagement?

This result was astonishing on several levels. First, all of the superscript 1's and the first link are to a post from YESTERDAY from the awesome George Veletsianos about our earlier post. I have never met George, but I follow his work (though I was not subscribed to his blog). We are honored that he characterized our earlier post as "worthwhile and expansive" and then went on to extend our initial effort using more "incremental prompts." Demonstrating the value of this more nuanced form of prompt engineering, George asked ChatGPT to (a) explain more about cybersecurity, (b) explain what a cyber range is, (c) explain how to create a supportive learning environment in a cyber range, and (d) give an encouraging example of a message to send to cybersecurity learners. ChatGPT's responses were great and this example was super helpful. I used a similar approach to prompt engineering to write a 3500-word literature review paper on the topic in another post and reached a conclusion similar to the one George seems to imply: ChatGPT "knows" a lot and can "learn" even more when prompted. I think we all agree that with the right assignments and support, students can learn a lot from generative AI.

Our second observation is that Bing did indeed locate the actual article. We know because the line about engagement being one of the "hottest research topics" is the opening sentence of the article's abstract. But Bing did not do much with the article, and the "quote" it provided was really a paraphrase that never appeared in the article.

Our third observation is that Bing offered suggested incremental prompts not unlike the ones George explored. For example, it gave a pretty decent (albeit brief) answer to the question "what is a digital range?" that included a link to a detailed web resource page that we have referenced in the past.

Our fourth observation is that Bing is a bit, well... cranky. The first response to our query offered three different versions, including one "creative" version and one "detailed" version. When I asked Bing the same question again, it told me "I think I already provided you the information you requested," and restated the hallucinated quote. I tried and tried to get it to offer me the three different versions, but it just kept giving me the first one without the other two.

More Engineered Prompt in Bing

Next, we gave Bing the more engineered prompt that included the abstract of the article.

Our more engineered prompt: Use the following article by Sinatra, Heddy, & Lombardi (2015) to motivate engagement by cybersecurity learners who are using a digital range and use APA citations: Engagement is one of the hottest research topics in the field of educational psychology. Research shows that multifarious benefits occur when students are engaged in their own learning, including increased motivation and achievement. However, there is little agreement on a concrete definition and effective measurement of engagement. This special issue serves to discuss and work toward addressing conceptual and instrumentation issues related to engagement, with particular interest in engagement in the domain of science learning. We start by describing the dimensional perspective of engagement (behavioral, cognitive, emotional, agentic) and suggest a complementary approach that places engagement instrumentation on a continuum. Specifically, we recommend that instrumentation be considered on a “grain-size” continuum that ranges from a person-centered to a context-centered orientation to clarify measurement issues. We then provide a synopsis of the articles included in this special issue and conclude with suggestions for future research.
Bing's response: Here’s an example of how you can use the article by Sinatra, Heddy, & Lombardi (2015) to motivate engagement by cybersecurity learners who are using a digital range and use APA citations:
Sinatra, Heddy, & Lombardi (2015) suggest that engagement is one of the hottest research topics in the field of educational psychology. They state that research shows that multifarious benefits occur when students are engaged in their own learning, including increased motivation and achievement. However, there is little agreement on a concrete definition and effective measurement of engagement. They recommend that instrumentation be considered on a “grain-size” continuum that ranges from a person-centered to a context-centered orientation to clarify measurement issues.
Here’s an example of how you can use their ideas to motivate engagement and use APA citations: “Research in educational psychology has consistently shown that engagement is a key factor in successful learning outcomes” (Sinatra, Heddy, & Lombardi, 2015).
I hope this helps! Let me know if you have any other questions.

We see that Bing learned and applied some of the information from the abstract. While Bing correctly quoted the crucial term "grain-size," it also again provided the same hallucinated quote. While Bing used more of the abstract than Bard, it used much less than ChatGPT. Somewhat frustratingly, Bing again did not offer the three different versions like it did in the first offering.
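
One quick way to catch this kind of hallucinated quote is to test whether the quoted words occur verbatim in the text that was supplied to the chatbot. The following is a minimal Python sketch, assuming a simple case-insensitive, whitespace-insensitive substring match. The helper names are hypothetical, and the excerpt contains only three sentences of the abstract, so checking a quote against the whole article would require its full text.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.split()).lower()


def appears_verbatim(quote: str, source: str) -> bool:
    """Return True if the quote occurs word for word in the source text."""
    return normalize(quote) in normalize(source)


# Three sentences taken from the abstract in our engineered prompt
# (the other sentences are left out here for brevity).
abstract_excerpt = (
    "Engagement is one of the hottest research topics in the field of "
    "educational psychology. Research shows that multifarious benefits occur "
    "when students are engaged in their own learning, including increased "
    "motivation and achievement. Specifically, we recommend that "
    "instrumentation be considered on a “grain-size” continuum that ranges "
    "from a person-centered to a context-centered orientation to clarify "
    "measurement issues."
)

# The sentence Bing presented inside quotation marks.
bing_quote = (
    "Research in educational psychology has consistently shown that "
    "engagement is a key factor in successful learning outcomes"
)

print(appears_verbatim("grain-size", abstract_excerpt))  # True
print(appears_verbatim(bing_quote, abstract_excerpt))    # False
```

An exact-match test like this is deliberately strict: it flags any paraphrase presented as a direct quote, which is precisely the problem here.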

What is the Big Deal?

Of course, the massive media coverage of the educational implications of generative AI suggests that these are indeed important questions. We have participated in several workshops and conference presentations where audience questions mirror the debate that is raging in the media. Some worry about students cheating on exams or bypassing the conventional learning that occurs when students explore and write about topics on their own. Others argue that the problem is really "shallow" assignments and assessments and suggest that instructors need more sophisticated prompts that ask about "local, recent, personal, and specific" knowledge that the LLMs behind platforms don't "know."

This initial pair of posts shows that Bing's widely cited access to the Internet does indeed mean it "knows" more than ChatGPT or Bard. But the previous post shows you can provide ChatGPT with specific knowledge that is not in its LLM and it appears to do much more with that information than Bing. For us, this shows that this common recommendation for educators is shortsighted AND that the obvious workaround functions very differently in the three widely available generative AI platforms.

As far as the "big debate," we contend that it largely misses the important point. Certainly, we all want to trust that most students will "do the right thing." Frankly, we worry that our work here may come off as overly suspicious. But it is difficult to overlook a central finding in the research summarized in James Lang's 2013 book Cheating Lessons: If students assume they are unfairly disadvantaged by cheating classmates and they assume they will not get caught, many, if not most, will also cheat.

What is our Goal Here?

When we presented at the Digital Learning Annual Conference (an awesome new conference and organization for K-12 online; a version of our talk is here), the classroom teachers were looking for practical responses they could implement now. One of our most "liked" observations in our presentations is that many of the suggested responses forget that most educators are already maxed out. They simply do not have the time or bandwidth to "review interim artifacts" or implement other complicated suggestions. Meanwhile, as we are showing, many of the simpler suggestions are easily bypassed.

In the immediate term, we are trying to help educators avoid simplistic responses whose workarounds are instantly shared massively on Reddit and elsewhere. In the near term, we are attempting to extend our suggestions to consider ways that teachers can help students learn from and with generative AI.