Don’t Think of an Elephant: LLMs, Search Engines, and Reasoning by Analogy in ChatGPT


Research Series on AI Benchmarking

This research note observes that AI work is fundamentally experimental, and raises points about technology-assisted research, user purpose and search object in AI prompting, and distinctions between large language models and search engines.

This research note is part of a series on AI benchmarking and cross-pollinated methods.


Over the last six months I've immersed myself in AI-augmented research. That is, I've been doing research-y things, the kinds of professional activity that I've always done, but incorporating LLM usage into as much of my process and workflow as possible. My objective was to play and learn. Quick thoughts and obvious points on the experience thus far:

****

It’s all experimental. I think it's fair to say that everything we do with LLMs right now is experimental, and we’re all contributing to a worldwide R&D project. With that in mind, my purpose was to test out this new(ish) set of tools. I wanted to learn as much as I could about what they can and can't do, and specifically what I could or couldn't do with them in my work, using my academic and professional training, competencies, and experience.

  • Humans use tools. Technology-assisted research and analysis has been around for a while. Everyone appreciates that point, I think. Basic productivity software and personal applications, for example, are built into the basic conditions of work and play almost by default, and enable them. Professional activities are a different matter, and different disciplines have their own specialist resources and requirements. Some are more complex or sophisticated than others.

  • Pencil and paper. In my case, my first training used pencil and paper (I know, I know) and basic principles, applied to specific problem sets using precisely defined types of evidence. I've since been exposed to clusters of sector-specific applications. I've developed varying degrees of user comfort and competency with them, over the course of a career featuring multiple academic and non-academic professional tracks.

  • Shiny new kit. What really comes out of this is a sense that adjusting and adapting to new technologies, and developing enough proficiency to integrate them almost organically into work routines, is a major part of the game. It is a technology skillset in its own right. Specific tools or categories of tool aren't magic, even if sometimes it's nice to sit back and enjoy the wonder of shiny new kit and capabilities.

Search engine UX. I want to say something about how "AI is different", but it’s a notion tempered by the point that every innovation differs from the building blocks of knowledge that preceded and informed its development. So I won't generalise too much on this next bit, and limit myself to a basic and immediate observation: the more immersed I become as a tester and operator of LLMs, the more I tend to think of them as just another class of search engine.

  • End-user testing. That’s not a new observation, but it is one I came to on my own through this exercise. LLMs and search engines are distinct, but views differ on how distinct they actually are. Conceptually, the search engine label has been applied in creative and disconcerting ways to real issues. I’m a lay end-user, not a STEM-side developer who designs and builds the tools, so bear with me. My point: in every way that matters, my experience has been that LLMs are engines that power my search.

  • Purpose and object. I don’t know if that makes them search engines in technologically meaningful terms, so YMMV on that point. It will mean different things to different user communities with varying levels of research competency. Thinking it through a little more constructively, the better observation might be that the purpose and object of my research are the proper differentiators, more so than assumptions about how fast or extensive or processed or creatively articulated I want my prompt responses to be.

A second opinion. Setting aside user purpose and search object - points worth a much more extended discussion - I wanted a quick second opinion. As I write this, my children are still asleep and my wife is crewing a yacht on its way back from Muscat. So I anthropomorphized a little and asked ChatGPT what it thought about the search engine claim. (Note: I was just asking it for information. Really. Bartholomew ChatGPT isn’t a real person.)

  • Do lazy prompts… My prompt was simple and more or less unstructured: “what’s the difference between an LLM and a search engine?” That felt a bit lazy, but more detailed RCT or COSTAR prompt instructions would have been overkill for this. The response was a little superficial but accurate enough, and provided a basic run-down of common and differentiating characteristics.

  • …Beget lazy responses? The more interesting part of the response was its concluding note, under the simple heading “Analogy”. Here it is: “Think of an LLM as a knowledgeable writer who synthesizes what they know and answers in natural language. A search engine is like a librarian who fetches books and articles but doesn’t summarize them for you.”

  • Don’t reason like an elephant. Tsk. Everyone knows a comparison using “like” or “as” is a simile. Analogies are something else. The writer and librarian examples were pretty lightweight, and not at all robust enough to withstand pushback; a child could dismantle both in short order. I didn’t offer correction or feedback, or try to refine the prompt in any way, but doing so might have made for some interesting results. I wonder what George Lakoff or Yuen Foong Khong would say.

****

Author: Dr. Michael A. Innes is a Visiting Senior Research Fellow in the Department of War Studies at King’s College London, where he founded and directs the Conflict Records Unit. He maintains an active consultancy portfolio as managing director and lead consultant at Craighead Kellas SAAR.

Declaration: This research note summarises self-initiated research conducted independently by the author. The author drafted the article by hand, without the use of AI drafting support.


RESEARCH SERIES ON AI BENCHMARKING.

Exploration and discovery of AI, from a user perspective.

© Craighead Kellas 2024. All Rights Reserved.

CRAIGHEADKELLAS.COM
