
The Difference Between a Tool and a Weapon Is the Hand That Holds It

While universities convene committees to discuss artificial intelligence, one Northeastern lab has already built hundreds of things with it

Universities have historically been better at producing scholarship about tools than producing people who can use them. They write papers on the sociology of the hammer. They do not, as a rule, teach carpentry.

OpenAI invited me to make a video. Five minutes, show and tell, talk about what we do with AI for education at Northeastern and at Humanitarians AI, the nonprofit I founded. I said yes, then spent a week trying to figure out what to say — not because we don’t do enough, but because we do too much. Hundreds of projects. Dozens of students building things for causes that matter. A book that couldn’t exist without AI. Songs engineered from neuroscience. Bots that answer calculus questions at 2 a.m. in the voice of a professor who has spent twenty years figuring out how to make students feel smart instead of stupid.

Five minutes. I ran out of time before I ran out of material. That gap — between what we’ve built and what I can explain — is not a communication problem. It’s a productivity problem of the best kind.

I made a webpage so people could find the rest.


The Bot That Replaced Office Hours

There’s a design problem at the center of every classroom that nobody talks about because it’s embarrassing to admit: the teacher is also the source of judgment. You can’t ask the person grading you a question without also revealing what you don’t know. So students go silent. They write down wrong answers and submit them. They fail quietly rather than ask loudly.

That’s not a student problem. It’s a system design problem.

ADA is a custom GPT Abby Williams and I built for her introductory calculus course at Northeastern — maybe two hours of work. Students upload photographs of handwritten problems. ADA checks the work. What makes ADA not just a calculator but a teaching tool is what Abby insisted on: scaffolding. The term comes from Vygotsky’s zone of proximal development — learning happens at the edge of what you can almost do, with structured support that gradually withdraws as competence builds. A good scaffold doesn’t do the work for you. It shows you the next step, checks whether you took it, and adjusts.

What Abby’s prompt engineering did — and this is the technical work people underestimate — is encode a pedagogical philosophy into a language model’s behavior. She didn’t just write “be helpful.” She specified: if the answer is wrong, identify the first point of error. If the answer is right, explain why it’s right. Maintain an encouraging tone. Don’t reveal the full solution unprompted. This is prompt engineering as instructional design. The same model, differently prompted, would just give answers. The same model, prompted by someone who understood pedagogy, became a scaffold.
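Abby's actual prompt isn't public, but the rules described above can be sketched as a system prompt assembled into a chat payload. Everything below — the rule wording, the `SCAFFOLD_RULES` list, the `build_messages` helper — is illustrative, not her implementation:

```python
# Hypothetical sketch of ADA-style scaffolding rules encoded in a system
# prompt. The exact wording of the real prompt is not public; these rules
# paraphrase the pedagogy described in the text.
SCAFFOLD_RULES = [
    "If the student's answer is wrong, identify only the FIRST point of error.",
    "If the answer is right, explain why the reasoning works.",
    "Never reveal the full solution unless the student explicitly asks.",
    "Maintain an encouraging tone; treat every question as reasonable.",
    "After each hint, ask the student to attempt the next step themselves.",
]

def build_messages(student_work: str) -> list[dict]:
    """Assemble a chat-completion payload with the pedagogy in the system role."""
    system_prompt = (
        "You are ADA, a calculus tutor that scaffolds rather than solves.\n"
        + "\n".join(f"- {rule}" for rule in SCAFFOLD_RULES)
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": student_work},
    ]
```

The design point is that the pedagogy lives in the system role, not the model: swap the rules and the same model behaves like an answer machine instead of a scaffold.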

The students rated ADA 4.8 out of 5. The number that matters more is a different kind of data point. Abby told me about a student with math anxiety — afraid to ask questions in class because asking meant revealing ignorance to the person doing the grading. She didn’t ask Abby. She asked ADA. She learned calculus.

The bot isn’t smarter than Abby. It’s lower-stakes than Abby. Lower stakes turns out to unlock learning in ways that competence alone cannot.

For my Coursera algorithms course, I built Grace. The first week, a few students came to office hours. By week three, almost none. Not because they stopped having questions — because Grace answered them at the moment of confusion, always available, never impatient, able to handle the same question five different ways without sighing. The technology isn’t replacing the relationship between teacher and student. It’s extending the conditions under which that relationship can function.

The tool is not the teacher. Abby is the teacher. ADA is the tool. That distinction has to remain intentional.


Build Fast, Verify Rigorously

The project with Shri is different in scale but identical in logic. Professor Srinivas Sridhar is a University Distinguished Professor of Physics, Biomedical Engineering, and Chemical Engineering at Northeastern, a Lecturer on Radiation Oncology at Harvard Medical School, the Director of the NIH CaNCURE program, a Fellow of the American Physical Society and the National Academy of Inventors, the author of more than 450 journal articles and patents. His 2003 paper in Nature was named among Science magazine’s Breakthroughs of the Year. He has trained over 1,200 students across 14 PhD programs. His lab spans 26 countries and $24 million in grant funding. His CaNCURE program specifically targets experiential cancer nanomedicine training for undergraduates from minority-serving institutions.

The book he and his student Aan are writing isn’t a side project. It’s the written record of one of the most productive research and education programs in nanomedicine anywhere — cancer treatment, nanotechnology, a global audience including regions where the latest oncology paper isn’t arriving by journal subscription.

Pre-AI, a book like this takes a decade. Not because the knowledge doesn’t exist, but because synthesizing it, structuring it, and making it teachable requires sustained intellectual labor that depletes even the most dedicated scholars. A nanotechnology book that takes a decade to write is outdated by the time it arrives. With AI tools — language models to draft chapters, automated systems to search the literature — it takes months.

But generation without verification is just faster hallucination. So I built Popper — named deliberately for Karl Popper, whose philosophy of science rested on falsifiability, on the principle that a claim is only scientific if it can in principle be shown wrong. Popper combs through draft chapters automatically. Every factual assertion gets flagged: where is the evidence? The tool searches the literature, pulls relevant papers, returns a citation audit. Is this claim about a tumor suppressor gene actually supported by the research?

That epistemological standard is built into the process, not bolted on afterward. AI is fast at synthesis and unreliable about truth. Use it for speed, then verify the claims. Popper is what makes the speed trustworthy.
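The audit loop described above can be sketched in miniature. This is not Popper's actual code: the claim detector below is a deliberately crude regex heuristic, and `search_literature` is a stub where the real tool would query an external literature database.

```python
import re

# Minimal sketch of a Popper-style citation audit. Every name and heuristic
# here is illustrative, not the real implementation.
ASSERTIVE_VERBS = re.compile(
    r"\b(inhibits|suppresses|causes|increases|reduces|demonstrates)\b", re.I
)

def extract_claims(chapter: str) -> list[str]:
    """Flag sentences that assert a factual relationship (crude heuristic)."""
    sentences = re.split(r"(?<=[.!?])\s+", chapter)
    return [s for s in sentences if ASSERTIVE_VERBS.search(s)]

def search_literature(claim: str) -> list[str]:
    """Stub: a real implementation would query PubMed or a similar index."""
    return []  # pretend no supporting papers were found

def audit(chapter: str) -> list[dict]:
    """Return one record per flagged claim: the claim plus its evidence."""
    records = []
    for claim in extract_claims(chapter):
        evidence = search_literature(claim)
        records.append({"claim": claim, "evidence": evidence,
                        "supported": bool(evidence)})
    return records

report = audit("TP53 suppresses tumor growth. The chapter then reviews history.")
```

The shape is the point: generation produces sentences, the audit pass turns each assertive sentence into a falsifiable record that either carries evidence or is flagged for a human.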

We also built Madavi, an intelligent textbook interface that presents the same book differently depending on who’s reading. A physician doesn’t need the molecular biology primer. A high school student does. The same knowledge, routed differently based on who needs what. This isn’t personalization as a marketing feature. It’s personalization as epistemology.
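One way to picture Madavi's routing: treat a chapter as tagged sections and a reader profile as a set of tags to render. The tags, profiles, and section text below are invented for illustration; the real system presumably does something richer than set intersection.

```python
# Hedged sketch of audience-aware routing: same chapter, different views.
# All tags, profiles, and section text are hypothetical.
CHAPTER = [
    {"tags": {"primer"},   "text": "Molecular biology basics: DNA, RNA, proteins."},
    {"tags": {"core"},     "text": "Nanoparticle drug delivery to tumor tissue."},
    {"tags": {"clinical"}, "text": "Dosing considerations for oncologists."},
]

PROFILES = {
    "physician":   {"core", "clinical"},  # skips the primer
    "high_school": {"primer", "core"},    # skips clinical dosing detail
}

def render_for(reader: str) -> list[str]:
    """Return only the sections whose tags overlap the reader's profile."""
    wanted = PROFILES[reader]
    return [s["text"] for s in CHAPTER if s["tags"] & wanted]
```

The knowledge base stays single-sourced; only the view changes per reader, which is what makes it personalization of access rather than personalization of content.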


The Cost Collapse and What It Actually Means

Here’s the number that changed everything: a professional music track used to cost between $75,000 and $150,000 to produce. With AI tools, we produce them for approximately five dollars in API credits.

The cost collapse doesn’t sound like an educational story. It is one.

At Humanitarians AI, we run Lyrical Literacy — songs engineered from neurobiological research for children who learn differently, because every brain is different and the music industry has never had a financial incentive to produce content for children who don’t fit the average. The two-hertz rhythmic pattern that optimizes infant speech processing. Phonemic diversity that predicts reading ability. Narrative resolution that triggers dopaminergic reward and positive affect. These are documented mechanisms from developmental neuroscience, and for decades the knowledge existed while the music didn’t — because making the music cost too much to produce for a population that couldn’t pay premium prices.

Now it costs five dollars.

We’ve produced songs used by families at Homes of Hope India, an organization running thirty-three orphanages. We’ve produced protest music that reached a million views. We’ve produced lullabies for a grandmother’s family who wanted her voice back, songs for a son who needed to hear his dead father’s voice sing the theology that made him run unarmed onto a battlefield.

The tools that produce engagement-optimized streaming wallpaper can be pointed at the people who actually need music. The difference is not the tools. The difference is who controls the intent.


What “Just Do It” Actually Requires — And What It Misses

We have projects I didn’t get to show in the OpenAI talk: DayOff, using AI agents for protein structure prediction in computational biology. Madison, extending reinforcement learning into agentic AI for branding and marketing. Raman Effect, building SERS spectroscopy tools for wastewater surveillance and public health. Each is a student or fellow or recent graduate building something real for a real problem. The projects are the curriculum.

I tell students: start building on the first day you’re on campus. Don’t wait until you know enough. You never know enough until you’ve built something that required you to know more than you did.

The orthodox answer in most universities is study the theory, understand the mathematics, then apply. This is a coherent position. It also has a problem in the current moment: the field moves faster than the curriculum. A student who graduates with thorough grounding in last year’s models and no experience building with this year’s tools is at a systematic disadvantage. A student who finished their degree two years ago missed every AI course that’s been developed since. The solution isn’t to retrofit their education. It’s to build a culture of continuous building — where the appropriate response to a new tool is to deploy it on a real problem immediately, not to wait for the certification.

Custom GPTs make this possible in ways that matter for underfunded institutions. One ChatGPT Plus account deploys a custom chatbot to hundreds of students. Not after a year of committee meetings. Today. The bureaucratic timeline I’m describing — a year to secure modest funding for an educational project at a university — isn’t hyperbole. It’s a lived constraint I solved by simply not waiting.

Here’s the honest evaluation of what this approach optimizes for, and what it doesn’t.

We optimized for reach over depth. A custom GPT tutoring hundreds of students simultaneously has less understanding of any individual student than a one-on-one human tutor. The GPT doesn’t know that this specific student has a conceptual gap around derivatives tracing back to a shaky algebra foundation. A human tutor might catch that in twenty minutes of conversation.

But the human tutor is unavailable at 11 p.m. before the exam. ADA is available. And availability at the moment of need turns out to matter more than depth for a large portion of educational interactions.

The design succeeds at democratizing access to patient, always-available, subject-specific support. It fails at deep diagnostic understanding of individual learning trajectories. If schools use tools like ADA as a replacement for human teaching rather than a supplement, they’ll optimize for something that looks like learning — high engagement metrics, good survey scores — while missing the diagnostic relationship that catches students before they’re too far behind.

The question I’m still working on: how do you encode the diagnostic capability into a system serving hundreds of students simultaneously? Popper does this for books. We don’t yet have Popper for students.

That’s what we’re building next.

Every institution that waited for consensus on how to handle AI education watched another semester pass. Every committee convened to discuss the appropriate role of generative tools produced students who graduated without knowing how to use them. The cost of institutional caution is measured in students who needed a tool that nobody got around to building.

I got around to building it.


Tags: educational AI Northeastern University, custom GPT scaffolding Vygotsky, Humanitarians AI nonprofit, Popper falsifiability computational skepticism, Srinivas Sridhar nanomedicine CaNCURE

