niffyjiffy

What We Have in Common with Generative AI

The Test

Being a teacher, I’m often asked in casual conversation for my opinion on standardized testing. Once the conversation progresses past the usual banter about Bill Gates, Dan Quayle, and No Child Left Behind, I tend to present a couple of example problems. Consider the following two:

• Fred has 6 apples and eats 2 of them. How many apples does he have now?
• Miss Flannery O’Shaughnessy owns an apple preserves shop. The apples arrive at a rate A(t) = 2500sin(2πt) + 3000 and are sold at a rate of 2000cos(2πt) + 3000, where t is in years since January 1, 2000. How much capacity does Flannery need in order to store the maximum surplus of apples she will accumulate?

On the face of it the problems are quite different in complexity and in subject matter; however, their similarities are much more illuminating. I note in particular:

• Both problems include all information necessary to solve the problem
• The answer to both problems can be calculated to infinite accuracy
• The solution path is unique and prescribable
• To achieve the above three, the problem had to be written by somebody who already knows the solution and can be checked by someone who doesn’t
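For the curious, the second problem can be worked in a few lines. What follows is a minimal sketch rather than an answer key. It assumes the shop's stock bottoms out at exactly zero once a year (the problem states no opening stock), so that the capacity needed equals the yearly swing in stock. The names `net_rate` and `stock` are hypothetical labels chosen for illustration.

```python
import math

def net_rate(t):
    """Arrival rate minus sale rate, in apples per year (the 3000s cancel)."""
    return 2500 * math.sin(2 * math.pi * t) - 2000 * math.cos(2 * math.pi * t)

def stock(t):
    """An antiderivative of net_rate; the constant (opening stock) is an assumption."""
    return (-2500 * math.cos(2 * math.pi * t)
            - 2000 * math.sin(2 * math.pi * t)) / (2 * math.pi)

# Closed form: the net rate is a single sinusoid of amplitude R,
# so the stock swings by R / pi between its yearly low and its yearly high.
R = math.hypot(2500, 2000)
closed_form = R / math.pi

# Numerical check: sample the stock finely over one year.
samples = [stock(k / 100_000) for k in range(100_001)]
numeric = max(samples) - min(samples)

print(round(closed_form, 2), round(numeric, 2))
```

Under that assumption both figures come out at roughly 1,019 apples, which is 500√41/π in exact form.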

I find this formulation steers the conversation in a productive direction. People soon recall the absurd paint-by-numbers feeling they dealt with in school. It’s injurious to one's sense of creativity to realize that tests and homework involve reproducing an answer key that existed before you and that will outstay you by many years. The thought gives words to a student's nagging suspicion that they haven't learned to do anything at all, that our knowledge only exists if it is observed. It is a much better explanation for Impostor Syndrome; the pressure on students is immense, and that pressure is precisely to replicate a model student.

Although a canned math problem is the most obvious prototype of this phenomenon, this problem pervades most grade-level subjects. Biology classes are often criticized for their reliance on memorization, but this isn’t precise. Any application of biology, be it field research or medicine, requires even more memorization, so of course prerequisite classes must begin to build this base of knowledge. The problem is not that memorization is required, but that it is flowcharted. Questions are written suggestively, and students learn to react to words in the questions with a sentence they have learned by brute force. This is the reason that people respond so viscerally to the phrase “The mitochondrion is the powerhouse of the cell.” Nobody uses the word “mitochondrion” (or “powerhouse” for that matter) outside of this context, and students who learned biology just from school struggle to imagine a mitochondrion as anything but an answer to bubble in.

Even English classes suffer from this. Although no essay stands a chance of replicating an answer key word for word, several innovations in essay technology have brought us perilously close. Rubrics now tend to specify a structure both for the external and internal organization of paragraphs, so if a student is struggling to find a voice, they can reduce the problem to filling information into a predetermined outline. Even the information itself is systematized; most English classes come with a list of themes, devices, and motifs that make for good body paragraphs. Thus, students who didn’t at all connect with a poem or novel they read can still write five paragraphs about how this effect was achieved.

Seen this way, a grade-school education amounts to measurement against an existing standard until success is achieved. Students are repetitively taught to a standard and small adjustments are made until they are close enough to merit a passing grade. No feedback is acquired directly from a student; their voice is their data. This ought to remind you of something.

ChatGPT

In ChatGPT, the modern student has found a strange but useful cellmate with whom they have a lot in common. Like a student, a generative AI is trained by showing it what it ought to produce, then making direct adjustments until it reaches that aim. A perfect AI algorithm will replicate patterns in style, content, and form by any means necessary. Naturally, AI is frighteningly good at school. For any assignment that comes with a clear scoring system, it is just a matter of time before AI pulverizes itself into a process that scores well. Despite this efficacy, AI is not always a great role model. One of the key problems an AI engineer must solve is the proficiency with which AI can cheat. If the goal is simply to get a good grade with as few adjustments as possible, copy-paste is a pretty good first attempt. Efforts are made to prune these emergent schemes, but as the AI’s thought lacks a visible essence, these illicit strategies always lurk somewhere beneath the surface.

Furthermore, the AI lacks any inherent confidence in its generations. Each algorithm is only worth as much as its ability to survive being measured. If it scores poorly, it will be crudely adjusted, whatever the cost, to get the number to go back up. Failing this, the algorithm will be pruned, or perhaps randomly overwritten until it starts to improve. This aspect of AI saddens me in a way that doesn’t generally reflect my conservative stance on AI personhood. Sensitivity to failure is something I see in a lot of my math students, even those who are generally in the habit of succeeding. Any value they calculate or inference they make is either an answer or not, and numbers that are not immediately graded lack meaning and create anxiety. Students’ individuality and logic are extremely fragile, and students experiencing failure are more likely to discard what they tried than to build on it. Likewise, the hilarious creativity of early AI art was worth something, but this boldness and humor have receded as the AI gets closer to replicating DeviantArt with mathematical exactitude.

What is to Be Done

It is a disconcerting but ultimately reassuring discovery that the school that AI has mastered is not a one-to-one fit to knowledge or education. Had technology not thrown down this gauntlet before us, I believe that Bill Gates and George Bush would be patting themselves on the back for their revamp of the education system. However, AI made clear that our students must become something more. The jobs built around subservient workers who respond well to Skinner-box-style management and restrictive goals are effectively ready to be replaced by AI. But AI has learned to shirk the strength of its convictions and to fear the playful stubbornness that creates invention and imagination. That’s there for our children to unlearn so that they may inherit the earth.

I first learned in elementary school of a proverb: the greatest quality in a mathematician is laziness. When I introduce newcomers to pure maths, laziness is the first concept I explain. It comes as a surprise to most—many react with “if I was lazy I would simply not bother to do maths.” The fact is that until one learns to like maths, it is impossible to do it lazily.

As a mathematics tutor, I am enthusiastic but fearful to bring this proverb to the classroom. The mathematics which my students bring to class can be quite lazy in a sense, generally prepared so as to minimise setup and get straight to the calculator. The calculator is becoming a much more prominent part of the mathematics curriculum. At school, students are told which calculator to buy, and whole lectures are dedicated to which buttons to press to solve all your problems. In later years, students are introduced to terrifically powerful tools such as Desmos and WolframAlpha which trivialise the problems they've been solving for years.

When I try to deprive them of these tools, most students appeal to what their teachers permit them to use. However, students who hope to win argue that in the real world, nobody would go to the trouble of working a problem out on paper if the internet can solve the problem as fast as it can be typed. It's certainly the best counterargument, but it's the one I'm the most prepared to deal with.

In reality, most adults are far too lazy to use a calculator, and rightly so. If any power is desired beyond the four basic functions, calculators suck. They cost an unjustifiable amount of money. Expressing problems more complicated than trivial computation takes practice, practice which doesn't pay off unless you're completing math problems as often as a high-school maths student. Put it this way: if in the middle of a conversation you became intrigued by a simple derivation of a sports stat, pulling out a calculator would totally kill the conversation. Even among my mathematically inclined friends, calculators are avoided by referring to tedious-to-compute numbers as “some number”.

What place, then, does mathematics have in the real world? To illustrate the kind of problem that can, and should, be solved in daily life, I'd like to introduce one of my students, who is not called Mark. Mark is a student that I can easily bait into attempting math puzzles, mainly because he enjoys taunting his teachers with problems they can't solve. He came to me with the problem in the illustration below (thanks to the Scriptorium for inspiring me to put some damn illustrations in these things), bragging that it could only be solved using calculus. Mark is an Algebra II student, but the funny thing is, I'm quite sure a person who had taken calculus would have the same reaction: that it could be solved using calculus, but that they couldn't do it. I'm damn sure that no farmer I've met would set up linear equations to represent the path to the river and the path to the turkey, then take the derivative of the lengths of the lines in order to determine the optimal strategy.

[Illustration: the problem]

So I showed Mark what he hadn't seen: the beautiful, lazy pig sunning herself across the river. This pig is going to save us many thousand years of mathematical rigor. She will let us be lazy with her. The reason is that she is just as easy to water as the turkey. As long as we assume the river is simple to ford, any path that reaches the turkey can be reflected to reach the pig, as seen in the example.

[Illustration: the transformation]

But the quickest path to the pig is simple to find—it is simply the straight line which passes through the river. Seeing this, we observe that the river and the path are two straight lines, and the path meets the river at two congruent angles. We reflect back the portion of the path after the crossing, and the problem from here is simple geometry.

[Illustration: the solution]
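The reflection argument can also be checked numerically. This is a minimal sketch under assumed coordinates, since the illustration's actual measurements are not given: the river is taken as the x-axis, the farmer stands at (0, a), the turkey at (d, b) on the same bank, and the values a = 3, b = 5, d = 6 are made up for illustration. The function names are hypothetical.

```python
import math

def reflected_path(a, b, d):
    """Reflect the turkey at (d, b) across the river to the 'pig' at (d, -b).

    The straight line from the farmer at (0, a) to the pig crosses the
    river (y = 0) at the best place to draw water.
    """
    x = a * d / (a + b)            # where the straight line meets the river
    length = math.hypot(d, a + b)  # farmer-to-pig distance = total walk
    return x, length

def walk_length(x, a, b, d):
    """Total walk: farmer -> river at (x, 0) -> turkey."""
    return math.hypot(x, a) + math.hypot(d - x, b)

def brute_force(a, b, d, steps=100_000):
    """Try many crossing points between 0 and d and keep the shortest walk."""
    candidates = ((d * k / steps, walk_length(d * k / steps, a, b, d))
                  for k in range(steps + 1))
    return min(candidates, key=lambda pair: pair[1])

x_geo, len_geo = reflected_path(a=3.0, b=5.0, d=6.0)
x_num, len_num = brute_force(a=3.0, b=5.0, d=6.0)
print(x_geo, len_geo)
print(x_num, len_num)
```

With these made-up numbers the straight line to the pig crosses the river 2.25 units along and the whole walk is 10 units. The brute-force search agrees to within its grid spacing, with no derivative in sight.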

Let's compare the two approaches. In particular, we have showcased two different kinds of lazy. The calculus approach minimises the amount of work done: the student sees that it is a calculus problem, and quickly determines that it is not worth doing. In comparison, the paragraph I've written and illustrations I've drawn took real work. The only reason I cared to do it is that I could see the pretty picture in my head. I would never have finished this problem if it looked like a miserable slurry of algebra and calculus. Unless the problem shows some promise, the hard work is not worth it. Instead I got to draw pictures of piggies, and the monkey-work[1] took only a brief moment. Rigor, worksheets, and formulas tend not to survive in the real world, but I like to think this piece of reasoning would survive a casual conversation. I ask, therefore, what this other “laziness” has to show for its hundreds of hours of work.[2]

Footnotes

[1] I would like to apologise to the monkey community, who are very capable of problem solving, and very incapable of algebra.

[2] I will answer this soon.

It is essential that any self-respecting internet poindexter have at least one 8-bit pastime. Ever since I started playing in college, I have been completely addicted to Baseball, an NES release title. While friends struggle to grasp why I even care about it, I play it any chance I can get—any friend of mine with a Nintendo online subscription has at least heard of it. Although it is technically the best-selling baseball game on the NES, Baseball has the dubious honour of being the second most popular, behind Namco’s R.B.I. Baseball. R.B.I. Baseball is actually quite an incredible game, including a team manager game-mode that was decades ahead of its time.

In contrast, Baseball was not a minute ahead of its time. It is a game that permanently burns white and green shapes into your TV screen thanks to a total lack of visual variety. Pitchers and hitters from six teams are totally fungible but for the colour of their uniforms and their left- or right-handedness. Its mechanics can be listed on a postcard: the batter can shift in the box and choose when to swing, the pitcher can choose between three speeds and steer the ball’s direction, and fielders can choose which base to throw to once they decide (completely randomly) how long they will take to get to the ball. Apart from the odd charming detail, the first ten minutes of playing the game make one wonder if it’s really a game at all.

But an experienced player can quickly demonstrate that this game has many levels. When I played in university, it was immediately apparent that the “cutter,” a fast pitch which creeps in towards the batter, was a danger. It moved subtly enough to still be called a strike, but any contact it made with the bat was low quality, often leading to double plays. To make matters worse, Baseball does not have a hit-by-pitch mechanic, removing the real-life risk of throwing cutters. Although there is precedent for a dominant one-pitch pitcher in the remarkable career of Mariano Rivera (to wit, a cutter specialist), it simply didn’t seem fair. It became clear that the only remedy was to alter the batter’s position in the box, normally something decided on well before striking the ball, in a dynamic way, even juking the pitcher with lateral moves.

It is normally at this point that the pitcher discovers how powerful he is. I said before that he can “steer” the ball as it approaches the box, but this does not communicate the full range of control. On a slower ball, he can loop about ten degrees in towards the batter then change direction halfway to return for a strike. The ball handles like a supercar. These manoeuvres match even the most astute fine-tuners with virtually unreactable 50-50 mixups and lead batters to unexplainable whiffs on pitches that seem nowhere near home plate. Although the odds favour the batter getting a hit or two, it’s not nearly enough to reliably get runs on the scorecard.

Once again, I reveal another mechanic I previously hid from the reader: base stealing. The hitter can individually command each runner to run, or return to base, at any time. An important principle of human baseball is that expert base-stealers do not choose random times to run; instead, they choose a time when the baseball’s journey will be unreasonably slow. A runner might read that a pitcher is going to throw a looping curveball, and take that opportunity to run as soon as the pitch leaves the pitcher’s hand. Even if the pitcher throws a fastball, the runner could well be saved by the batter making contact. This even privileges the runner, who now has a head start around the bases. It is this mechanic which reins in the pitcher’s arsenal[1]. If the pitcher wants any hope of throwing a runner out, they had better throw the ball fast, and it had better avoid the batter. At the start of the inning, you have no choice but to play the pitcher’s game. As soon as you find a single hit, anything can happen. Every arrangement of base-runners is slightly different, too. For instance, if first base is occupied, you have to watch out for double plays, but second base is the easiest to steal due to its distance from the catcher.

These mechanics comprise only a small fraction of what would be available in a modern baseball video game. Yet the spirit of dastardly trickery and exploitation in which they are combined reflects baseball better than any game I’ve ever played. At its heart, baseball is a Randian fever dream in which trained specialists search for tiny exploits in an ostensibly dull landscape, forging a more optimal order from destructive chaos. There is no better way to express this than a simple but deceptively broken piece of kuso.

Footnote

[1] I fondly remember overhearing the following conversation between two parents of rival high schools at a baseball game: “The game is all chaos and base-stealing at this level!” “Yeah, but at least it stops them from throwing that bullshit curveball!”

The Oxford English Dictionary does not deign to define the term “wordcel” as “a person who has replaced worldly knowledge with a crippling dependence on their verbal reasoning.” Nor does it define his adversary, the “shape-rotator,” an idealised being whose thinking is neatly contained within the realm of Platonic forms. Any internet-poisoned individual could glance at these terms, and from the suffix “-cel” immediately recognise the smug character which defined these terms and began labelling friends, posters, and thinkers. Pop psychology will never go out of fashion, and this dichotomy being both pop and psychology, it is now recognisable to millions of internet users.

The lot of wordcels and shape-rotators is much more than a topic of brief intrigue in my friend group. Having been subjected to a middle-class upbringing, we were informed with dangerous regularity of the strengths and limitations of our individual brain chemistry. Parents, counsellors, and psychiatrists would read the writing on the wall and augur which careers would suit us best, which hobbies we would excel at, and what kinds of lives we could lead. Once these new terms were introduced to us, we began implementing them immediately in conversation, as though they appeared in a word-a-day calendar. Instead of saying “my working memory is a real problem if I’m going to improve at poker,” one could simply say “I’m such a fucking wordcel.”

I wish I could say it ended there. Over the past few years I have worked in education. While it is prudent to pretend that a teacher is simply carrying out a duty—this is certainly the philosophy with which I write term reports and discuss students with my boss—teaching is a deeply expressive endeavour, and these unpresentable brain-worms have had a major impact on the way I understand my profession. Take for instance the fast talker, a student who possesses excellent mental math skills and precocious problem-solving intuition but who thinks too quickly to speak or write clearly. While their thinking capabilities are a great asset in theoretical mathematics and science classes, they are persistently hindered by their sloppy nomenclature, their proclivity for simple mistakes, and their difficulties presenting knowledge. Isn’t it obvious that this person is a shape-rotator, struggling to adapt their Platonic machinery to a world of wordcels? The epithet is irresistible.

Consider alternatively the hard-working bookworm. These students are verbally mature beyond their years, often able to talk charismatically to their elders and extract the kind of information that teachers ordinarily keep quiet about. Even their math work spreads out evenly across the page, with perfect handwriting and clear progression. Yet they rarely succeed in making the connection between the two mediums: turning a conceptual problem into an algebra problem. Taking the liberty of reflecting on a real case, I remember a conversation (naturally in which I defined “wordcel” and “shape-rotator” without naming either term) in which a student told me she had “dyscalculia,” an inability to draw abstract connections between mathematical symbols. She claimed to dream completely in audio, like some sort of Jedi. I needn’t say which of the two definitions she identified with.

This distinction is not new to academics; nothing could be further from the truth! Perhaps the most compelling polemic is the Ernest Rutherford quote: “All science is physics or stamp-collecting.” The snobbery here is bawdily funny; not even “soft sciences” are safe from this partition into real ideas and frivolous literature. Indeed, physics is a beautiful playground for the shape-rotator. I think back to a physics class in college, in which a whole auditorium chuckled at me for failing to understand that an infinite length of wire is identical in voltage to an infinitely wide loop of wire. Yet the wordcels have had the last laugh, as quantum physics has proven too disgusting to reason with abstractly. Einstein spent years in denial of this innovation, declaring that “God does not play dice.” Even famously brilliant physicists are forced to blindly trust in the languages of rote algebra and erudite thought-experiment, abandoning the beautiful images which characterise the physics of centuries prior. Feynman once exclaimed that “if you think you understand quantum physics, you don’t.”

Indeed, outside of the paradigm of fatalistic IQ-judging that motivates these terms, no academic discipline is quite so simple. These days I treat my investment in wordcels and shape-rotators as a foray into Learning Styles[1], a concept which classifies each student under a particular medium from a choice of Visual, Auditory, Written, and Kinesthetic. Identifying shape-rotators as primary Kinesthetes and allotting the remaining three domains to wordcels creates natural bridges using each student’s secondary style. I encourage my frenetically abstract thinkers to “think with their pencil,” or to verbally explain concepts to me as though I were an idiot (as in the classic Rubber Duck test). Likewise, proficient note-taking and well-structured conversations do wonders to help verbally gifted students break down complex networks of concepts (I am quite sure that this is what Rutherford considers stamp-collecting).

I leave it as an exercise to the reader to ponder where the celebrated Whitman poem, “When I Heard the Learn’d Astronomer,” fits into all this. I am simply unable to decide; perhaps words are beautiful too.

Footnote

[1] I had a conversation with an editor about learning styles—in recent times they have fallen under considerable scrutiny. The “meshing hypothesis,” which states that students learn best when taught in a way that reflects their learning style, consistently fails to be verified by studies. I do believe that the spirit in which I have used the concept here avoids this problem.