A Philosopher’s Diary, #9–How Much Does The Chatbot Brouhaha Affect Anarcho-Philosophical Teaching and Learning?

“The Sleep of Reason Produces Monsters” by Francisco Goya (Los Caprichos, #43, 1799)

The descriptive sub-title of this blog—Against Professional Philosophy, originally created and rolled out in 2013—is “A Co-Authored Anarcho-Philosophical Diary.”

Now, ten years later, after more than 300,000 views of the site, this series, A Philosopher’s Diary, finally literally instantiates that description by featuring short monthly entries by one or another of the members of the APP circle, in order to create an ongoing collective philosophical diary that records the creative results of critical, synoptic, systematic rational reflection on any philosophical topic or topics under the sun, without any special restrictions as to content, format, or length.

In this ninth installment, Boethius looks closely and critically at the current flap about the use of “chatbots” in higher education, from a specifically anarcho-philosophical point of view.


PREVIOUS INSTALLMENTS

#1 Changing Social Institutions From Without Or Within

#2 The Vision Problem

#3 Against Perfectionism

#4 Respect For Choices vs. Respect For Persons

#5 Thirty-Six Philosophical Precepts of Martial Arts Practice

#6 Enlightenment, Education, and Inspiration

#7 Rigged and Lucky: The Myth of Meritocracy in Professional Academic Philosophy

#8 Ambition and Mortality


How Much Does The Chatbot Brouhaha Affect Anarcho-Philosophical Teaching and Learning?

Maybe I shouldn’t use ‘brouhaha’ in the title. Maybe ‘mass panic’ would have been better. Whatever we call it, we find no shortage of attention in the academic news to the new open-access AI text-generator ChatGPT. We have interesting pieces in The Atlantic, The New York Times, and The Chronicle of Higher Education.[i] Academics interested in the scholarship of teaching and learning will produce articles soon enough (e.g., Teaching Philosophy has a CFP out for a special issue). Is the brouhaha justified? I say yes and no. That’s not a hedge, but an argument that we have a prime case here of technology forcing some reflections that in the end will be good, while at the same time conceding that the threats are real and that some action from many of us educators needs to happen pretty quickly. But the impact will depend on how one currently teaches and what one’s teaching philosophy or goals happen to be. Since our platform at APP generally has an anarcho-philosophical bent, and so does my teaching, I want to speak to that focus here. That teaching and learning mode might not have as much to fear. The anarchism might even be an antidote to the impact from the bots.

Why the fuss? The threat from the bots is that for many essay prompts, students can now use a free application that generates serviceable text for writing assignments at all levels, “serviceable” in the sense of at least sliding by with a B, a C at worst, and maybe even an A. Any student interested in cheating can simply use the bot, maybe with some tweaking or some trial-and-error prompting of the AI, and be done. The AI generates text that is unique and thus (apparently) uncatchable by plagiarism-detection software like Turnitin or our own web searches. On the presumption that many students will take advantage, “the essay is dead” since we can’t knowingly allow cheating to happen. In the past, students could always purchase essays, “recycle” old ones from acquaintances, or copy from a “sample essay” at some online cheater site. But now, the sheer ease and security of using an AI has brought about a perhaps-justifiable panic.

I actually doubt we’re all doomed or that the essay is dead. But the immediate reaction will probably be bad. I’ll speak only to the college level here, but I expect the same in high schools and middle schools. Expect overreach, more coercion, and less trust. Teachers will expect students to cheat, even more than they already expect it, and many instructors will surely take the easiest path in a war against the bots. The in-class essay will return, with the blue books, the handwriting-reading challenge, the mass anxiety over writing within a set time limit, and the inequality between students who think and write quickly and those who can’t. So much also for students revising their thoughts and text before submitting. If we give essay prompts ahead of time (I’ll give you five topics, and I’ll select two for the exam), then some students will just use the chatbot to generate some essays, then try to memorize and regurgitate. In-class essays tend to be far worse in quality too, and for good reason: they’re done on the fly. Since we’re not in the business of flunking everyone, we’ll grade them much as we would better-quality essays written, as normal, with plenty of time. Or we’ll weight the in-class essays less. Overall course grades need not be affected. But if the performance was worse with in-class essays, then either the grades will lie or we won’t have good information about students’ knowledge and their writing skills.

A deeper fear is that the essay really will die, in that most professors will go further and just eliminate essays or writing altogether, at least at the lower levels. I admit my first thought on all this was, “I’m not reading their handwriting, and in-class essays are hardly ever good anyway. Multiple-choice exams, here we come!” There’s a place for that type of so-called objective assessment. But students already don’t get enough writing practice. And even if we removed or minimized writing at the lower levels, we’d still face, at the higher levels, both the worry “Did they get help from a bot in writing this?” and the task of doing introductory college-level writing instruction, just conducted at the 300 level. Writing skills and communication skills then get depressed further, with a corresponding depressing effect on what college graduates typically know and can do.

Another deep fear is that professors will keep the essay, in class or not, but then impose even more coercive measures on students to combat the cheating potential. Many teachers already require that software be used to “lock down” web browsers during online tests. Some schools have testing rooms with watchful cameras in addition to proctors. Expect more use of such tools, and expect new ones to be developed. We’ll see more use of honesty pledges, signed or orally stated promises to work with integrity: “I promise I didn’t use a bot to write this paper.” Expect hyper-fine-grained questions on essay exams and paper assignments, all designed to subvert those intending to use a bot for help. Expect a more general air of suspicion overall, and finally, expect more outright accusations of cheating, whether on good grounds or not.

All of this betrays a lack of trust in students, and that carries significant costs to learning. We expect students to trust us: to help them learn, treat them fairly, tailor instruction to their abilities, and stay cognizant of the fact that students aren’t experts and are still learning. They trust we’ll accommodate any special needs and circumstances they might have, and, cutting across all this, they should be able to trust that we’ll treat them like human beings with dignity. When we assume they’re a bunch of cheats, that undermines their trust in us. But if they don’t trust us, everyone’s job gets harder, theirs and ours, and the whole educational enterprise gets undermined. By and large, too, they’re not cheats. It’s the small number of really egregious cases that we remember, and I find most plagiarism cases happen because, e.g., copying from online was fine in high school, or because the student just didn’t know what to do for the assignment. I find most students do have integrity, but they’re learning that too, and some students need to develop it more than others.

I for one also tend to wonder which direction the causation goes here: Is it that students cheat, so we then don’t trust them? Or is it that we don’t trust them, in that we assume they’re all tempted to cheat and will do so if it’s easy, and then, when students perceive that assumption on our part, they go ahead and try to cheat? For they might well think, at least some of them, “Look, if you assume I’m going to cheat, and I see how to do it without getting caught, then I’ll play that game and just cheat. I can win that game.” What if it’s us, then, with our coercive and untrusting measures, who actually increase the chances of students trying to cheat?

But whatever the order of causation, if the new bots spur less trust from us in students, and less trust from students in us, that is a deep problem. To solve it, some cues from anarchism’s resistance to coercion will help. Some reflection and analysis of what we ask students to value in our classes will help too. And yes, we can’t be completely naïve either: We need to analyze our teaching tools and assignments in light of the bots. But distrusting our students and extending the surveillance state into our classes further simply won’t do. Per anarchism’s general moral thesis against coercion, any of that must be minimized.

Take the point about analyzing our teaching tools and assignments first. At least at this stage, given what I see of what ChatGPT can do, I think we have less to fear than some of the brouhaha suggests. Most articles give examples of bot-generated text, but I won’t do that here. I invite readers instead to play with the software (it’s free with an account), to ask it some questions, to try out some possible prompts for essays or exam questions, and see. In my case, it was all interesting, but when I tried thinking like a die-hard plagiarizer, looking for it to generate text for an assignment, including some of my own, I got a mixed bag.

For some questions, the bot banged out perfectly serviceable answers. I tried “Can you give me 500 words on John Locke on personal identity and how he might be wrong?” That was excellent, even if it did sound somewhat like the Stanford Encyclopedia of Philosophy. (I say this because for a different question, I asked a follow up on what web sources would be good. The SEP was the top response.) I then asked, “How might Locke reply to those objections?” and again got reasonable answers. So if I were to ask on a take-home test to state Locke’s theory of personal identity, what an objection to it might be, and how Locke might respond, would-be cheaters have a ready resource. I might suspect something, but I couldn’t really catch it.

I did receive some erroneous answers. I asked about Derk Pereboom’s defense of hard incompatibilism, and I got an argument for hard determinism instead. If I were grading, I’d give it a C or something. Someone cheating might not mind that. The responses to my queries for longer texts didn’t come with introductions either, a blemish if someone’s looking to get a complete essay. But again, I might only dock the grade a little if I’m grading, so again it’s serviceable. The bot had some “tells” too. For instance, it tended to sum up with “Ultimately…” or “Overall…,” and it readily offered relativist-sounding conclusions along the lines of “There are arguments pro and con, and people need to decide for themselves about…” The writing quality was also too good: even when I asked it “Can you say all that at the elementary school level?,” it still gave text a little too strong on the grammar, usage, and spelling front for what (unfortunately) I often see.

This of course is from an early iteration of an AI that can be used to generate text that sounds human- (or student-) generated. These bots will obviously get better with more information to draw from (ChatGPT has little information about anything after 2021), more computing power, and more learning of their own. But for just looking for information, for some summary material in the form of some short paragraphs that are generally accurate, I found the bot to be very good. I found it a little like asking Google for information, except you get paragraphs for your answers instead of specific web hits.

The bot flails, though, if you ask its opinion on something: “Do we have free will?,” for instance. It replies that it’s an AI with no opinions of its own, and then it summarizes some arguments pro and con. Yes, that could assist our would-be cheater. But now an important set of examples: For any kind of question about what I think myself (and imagine a student asking the AI this), you rightly get a response along the lines of “Well, I don’t know what you think; only you know that.” And for any metacognitive question, again about myself, it gave (rightly) a similar response. I have many such metacognitive questions on exams and assignments now. These are questions like “What do you plan to learn from this course?,” “What progress have you made on the course goals so far?,” and “What challenges have you faced in your learning so far, and what have you done to meet them?” The point is that for writing involving one’s own perspective, one’s own mental states and dispositions, and one’s own learning, the bot has no access to any of that and thus generates no answer for you. This is obvious. Perhaps there’s some creative way to ask the bot metacognitive questions and get some text generated. But for such reflective questions, I’m not seeing how using a bot will be much help in putting text on a page for an assignment. So a suggestion for teachers: Assign more writing on metacognition.

But I’m skirting the main question so far. If the current bots don’t always generate accurate, well-written essay text now, they soon will. If students are going to write essays, not in a classroom or under some watchful electronic eye, what to do? Here the bots should force some hard thinking about the very value of what we’re asking students to learn and do. We usually say we’re helping students learn to think better, more rationally, more critically, more open-mindedly, etc. etc., and all of that about philosophical issues, questions, views, etc. etc. But oftentimes, all we really ask students to do is simply know what those issues and questions are, and what the possible answers to them are, with our lectures and assigned readings telling them all of that, and we test it by seeing what they say in an expository essay. Unless we have in-class essay exams, the bots destroy this form of teaching. If we really want to help students think better, they’ll have to practice thinking, again obviously, and they’ll have to practice putting their own ideas and arguments and questions together. Writing is the rightly time-honored way to do that practice. If the bots wind up making exposition easier, maybe especially at the entry level for philosophy, then students can focus more on what they think themselves about free will, God, knowledge, enlightenment, and all the rest.

One might object here that we all need to be able to read something, something challenging like philosophy, and understand what’s being said. When we do expository writing, we practice that skill, and the writing helps us toward understanding. We can’t simply leapfrog over that level of understanding and jump straight into the critical project, especially at the entry level. Since the bots cut this corner for demonstrating basic understanding, teachers still face a large problem.

But I see two strands of solutions: First, basic expository skills could easily be practiced in class, and students need more practice at those skills anyway. If the bots force that change on teachers, good. If that means less lecture and more workshop, less content and more skill-building, then good. Second, if we’re doing an intro-level class, we might have to just wait on assigning more heavy-duty reading and understanding and analysis of original texts. At most institutions, including mine, assigning all of Mill’s Utilitarianism in 101, for example, isn’t going to have much success. Many students don’t have the skills (yet) for it, and they’ll use Google, YouTube, or (now) a bot to get enough of it for class or the test. But most importantly, most of them don’t (yet) see the value in reading independently, without electronic crutches, just to understand something challenging and to come to one’s own views on it. And just for the philosophical content, they don’t (yet) see the value in understanding it. Or they do see the value, but they have no strong motivation to do the heavy lifting to get it. Since intro students are still learning to see all this value, we can’t assign something that essentially requires that heavier-duty appreciation in order to complete it. The bots just help force this realization. We needed it all along.

So how do you help people see the intrinsic value of philosophy and its associated skills, so that they’re more likely to pursue philosophy for its own sake? I suggest a three-way strategy: First, remove the barriers to intrinsic motivation. Aristotle tells us humans by nature desire to know. Kant tells us human nature includes independent-mindedness. Empirical work from the last 30 years tells us that extrinsic motivation doesn’t work well for learning generally, and that extrinsic motivation undermines intrinsic motivation. Extrinsic motivators include trophies (and punishments) like grades. Remove grades, and a cascade of good effects follows.[ii] I have been a so-called “ungrader” for some time now. The good effects aren’t universal, I can confirm, and the workload on instructors is significant (among other things, extensive narrative feedback and revise-and-resubmits take the place of one-and-done assignments), but students’ attitudes and affect with respect to intrinsic motivation for learning philosophy see massive improvement. How do I know? I’m just one professor, but students’ metacognitive reflections at the midterm and end of the course confirm it for large proportions of them.

Ungrading forms an anarcho-philosophical piece of pedagogy, at least in the broad sense of resisting coercive authoritarianism wherever possible. And grades are indeed coercive: If you want to keep your scholarship, then get good grades and avoid bad ones. If you want good grades, do X, Y, and Z. If you want to avoid bad grades, don’t do A, B, or C. We then get a grading game of doing the least work for the most reward, and while this is rational given the game we’ve set up for education, it’s not at all good for learning. It’s not at all good for our relationships with students either. The students aren’t the pigeons, and we’re not the trainers. The dignity of both parties gets lost here. Grades perhaps can’t be completely eliminated (institutions still require final course grades), but some power for judging overall performance can be passed to students, and teachers and students can collaborate on that judgment. But for the activities in one’s course, the assignments and papers, grading isn’t needed for learning or motivation, and removing grading looks in fact to improve them.

As a second element of a strategy for improving intrinsic motivation, use more exemplars of independent-minded philosophical thinkers. Exemplars of good conduct can serve as role models for the type of person (or at least the set of mental habits) that seeks the intrinsic goods of philosophy. All philosophers or philosophically-minded people presumably have some independent-mindedness, but some have more than others, and in some that intellectual virtue shows through very easily. Easy examples include Socrates, various Cynics and Stoics, Martin Luther King Jr., James Baldwin (exemplars need not be philosophers per se), bell hooks, and many others. Earlier in my career, I thought the people giving the arguments didn’t matter, only the arguments mattered. For getting to the truth, that still seems right. But for being motivated to get to the truth, having exemplars of the intellectual virtues required might help far more than one might think.

For a third element, we need to identify the philosophical topics, questions, figures, and activities that fit best with introducing philosophy to the students one has. “Meet them where they are” is the maxim in pedagogy circles, and this will be far different from what one might think is necessary for, say, giving a picture of philosophy as it’s practiced at the “top” levels of the field. I find my students very interested in the nature of enlightenment, intellectual virtue, environmental ethics and environmental justice, personal identity, race and gender, free will, Buddhist philosophy, and classical Skepticism. These aren’t even that non-standard, but I don’t hit Gettier, functionalism, a priori knowledge, or much extensive ethical theory any longer. Those teaching at technical colleges, at elite SLACs, or to high proportions of first-generation college students would each need something different.

The upshot is that with some more intrinsic motivation, we’ll get better learning, and with respect to the chatbots, they become tools rather than means for academic misconduct, or they’ll just not get used at all. If they’re tools, they give us background or things we could look up easily enough anyway. They won’t tell us how to think. As tools for writing, what about that? Here I’ll have to think more, but for basic exposition of a position or argument (“Can I have 100 words on what Stoicism is, please?”), we can’t stop people from using the chatbots that way. But I’m not sure that’s bad. For we can still analyze that text, compare it to others, and learn about good and bad writing as a result. And as with existing tools like Grammarly, we might use the bots to shape our own text for the better. “Take this text and put it all in passive voice,” we might say, and then analyze the readability of the result. “Take the same text and put it in active voice,” and we get something different, and maybe students learn good things about clarity. Or, as I tried with ChatGPT when I got an answer that sounded more like an academic article, we might ask “Can you say all that at the elementary-school level?” and get a more readable text. We might then analyze both versions in class to see what’s so readable about the latter compared to the former, and again we learn good things about clarity.
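
For instructors who would rather script that kind of rewriting exercise than paste text into the chat window, here is a minimal sketch. It assumes OpenAI’s Python client (version 1 or later) and an API key set in the environment; the model name, prompts, and sample passage are illustrative assumptions of mine, not a fixed recipe:

from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

passage = ("Locke holds that personal identity consists in continuity of "
           "consciousness, not in sameness of substance.")

def rewrite(text, instruction):
    # Ask the model to rewrite `text` according to `instruction` and return the result.
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice; any chat model would do
        messages=[
            {"role": "system", "content": "Rewrite the user's text exactly as instructed; change nothing else."},
            {"role": "user", "content": instruction + "\n\n" + text},
        ],
    )
    return response.choices[0].message.content

# Generate the two versions students can compare for readability and clarity.
active = rewrite(passage, "Rewrite this entirely in the active voice.")
passive = rewrite(passage, "Rewrite this entirely in the passive voice.")
print("ACTIVE:\n" + active + "\n\nPASSIVE:\n" + passive)

The two outputs can then be set side by side in class for exactly the kind of readability comparison described above.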

“But what then about using the bots for students’ own writing,” one might ask, “for that’s where inappropriate things are going to happen?” Indeed, but there we all need to find the good in having a source of serviceable prose out there that can be both useful and usable with integrity. Our standards on the latter might have to shift a bit. It’s one thing to turn in prose straight from the bot. That would be wrong, I’d agree. But imagine a future where we get expository text from a bot, mix our own labor with it to generate the summary we want for setting up our own argument, and from there on the text is our own. Or maybe a text of our own reasoning started with us, got analyzed by a bot, suggestions got made (again, Grammarly can do this already, and even Word, really), and the finished text thus got “assistance” from a computer but still defends a position with an argument that’s ours. Now, how to help students do that is another question, but it’s all going to be part of learning how to write and think in philosophy.

So do we need brouhaha or mass panic over the chatbots? For those teaching in a mode of coercive grading for assignments and tests on topics their students see little value in, whether it really is valuable to learn or not, and thus where the game is on to do the least work with the least risk for the most reward, good luck. But in a more minimally coercive environment, with more compassion and attention to finding the value in philosophy, and the value for oneself in studying it, I predict the bots won’t be as disruptive. We can use them and show students how to use them as tools for their own improvement. The bots aren’t going away. We ignore them at our peril. But taking a page from some anarcho-philosophical thinking about pedagogy might do a lot of good, not just at adapting to them, but at making changes that seem long overdue anyway.

NOTES

[i] For example, in The Atlantic, we have “The college essay is dead” from Stephen Marche (Dec. 6), and Daniel Herman, “The end of high-school English” (Dec. 9). The NYT has Frank Bruni, “Will ChatGPT make me irrelevant?” (Dec. 15); Zeynep Tufekci, “What would Plato say about ChatGPT?” (Dec. 15); and others. The Chronicle has Beth McMurtrie, “AI and the future of undergraduate writing” (Dec. 13). I only cite these because that’s all I’ve read so far. Readers can find many more easily enough.

[ii] Alfie Kohn is a good source for much of this, and for one-stop shopping for articles, see Alfiekohn.org, especially “From degrading to de-grading” (1999), “The trouble with rubrics” (2006), and “The case against grades” (2011), as well as Punished by Rewards: The Trouble with Gold Stars, Incentive Plans, A’s, Praise, and Other Bribes (New York: Mariner, 1993/2018). A good anthology on ungrading is Susan Blum’s Ungrading: Why Rating Students Undermines Learning (and What to Do Instead) (Morgantown, WV: West Virginia University Press, 2020).


Against Professional Philosophy is a sub-project of the online mega-project Philosophy Without Borders, which is home-based on Patreon here.

Please consider becoming a patron!