Applying COSTAR to Daily Reporting in ChatGPT
Research Series on AI Benchmarking
This research note is part of a series on AI benchmarking and cross-pollinated methods. The note addresses COSTAR prompting, selecting appropriate prompt configurations, modifying COSTAR to prioritise information targets, and the pitfalls of putting the cart before the horse.
I’ve spent the last few days testing and modifying an AI-augmented approach to passive data collection. The starting point was Michael Buehler’s research exercise using AI to collect online news and information about cybercrime. A specialist in comparative politics at the University of London’s School of Oriental and African Studies, he’s shared his work publicly in a series of explainers and publications, and outlined compelling use cases for AI as an enabler of traditional research activities.
I work on similar issues in my consultancy practice, and it’s an area I’ve started digging into in my academic research - so the cybercrime exercise immediately caught my attention. My first impressions of it appear below, along with points about research design, prompt modification, and best practice. Further down the line, I’ll release a closer look at the exercise, prompt design for specific forms of data collection and reporting, the effectiveness of social science methods in prompt design, and lessons learned… from working too closely with ChatGPT.
**
The exercise. The gist of Michael’s design, as I understand it, is this: it prompts ChatGPT to identify online information relevant to his research requirements, using preset selection criteria. Clippings are compiled and organised following more presets contained in the prompt. ChatGPT sends a daily notification at 8am that the current day’s report is ready - a requirement set out in the prompt in fine detail, using clear, precise language.
The approach. A key component of the design involves instructions organised in the COSTAR prompt format: a well-regarded, widely used, and prize-winning approach to prompt configuration whose acronym stands for Context, Objective, Style, Tone, Audience, Response. I spent a weekend’s worth of hours working with it, and I can see its benefits as a mental construct for thinking through and articulating research requirements.
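To make the structure concrete, here’s a minimal sketch of a COSTAR-formatted prompt assembled in Python. The field contents are my own illustrative placeholders for a daily cybercrime news digest - assumptions for demonstration, not Michael’s actual prompt.

```python
# Minimal sketch of a COSTAR-structured prompt. Field contents are
# illustrative placeholders, not the actual prompt used in the exercise.
costar = {
    "Context": "I am a researcher tracking open-source reporting on cybercrime.",
    "Objective": "Identify today's relevant online news items, applying my preset selection criteria.",
    "Style": "Concise news-digest entries: headline, source, date, two-sentence summary.",
    "Tone": "Neutral and factual.",
    "Audience": "An academic research team; assume subject-matter familiarity.",
    "Response": "A single daily report, organised by theme, with a notification at 8am.",
}

# Join the labelled sections into a single prompt string.
prompt = "\n\n".join(f"# {field}\n{instruction}" for field, instruction in costar.items())
print(prompt)
```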
The categories. Still, COSTAR’s categories, on the face of it, seem a little lopsided and could be better conceived. There’s some fungibility between them, which is fine, but if clarity makes the prompt, then we can do better. Consider “targets”, which sits in a weird sort of conceptual limbo between “Objective” (“what I want to achieve”) and (target) “Audience” (“who I’ll be submitting the results of my work to”).
The modification. I replaced “tone” with “targets”, addressed tone from within the “style” category, and used the new targets category to make target information the first priority. The resulting response was solid. Simple adjustments that maintain overall prompt integrity and upgrade its function - zeroing in on the categories of data and items of information that I want the AI agent to identify and collate.
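Here’s the same sketch with that adjustment applied - “Tone” folded into “Style”, and a “Targets” category placed first so the information targets lead the prompt. Again, the field contents are placeholder assumptions, not the prompt I actually ran.

```python
# Modified configuration: "Tone" folded into "Style", and a "Targets"
# category placed first so target information leads. Placeholder content.
costar_targets = {
    "Targets": ("Priority information targets: ransomware incidents, arrests and "
                "indictments, new malware strains, cross-border law-enforcement action."),
    "Context": "I am a researcher tracking open-source reporting on cybercrime.",
    "Objective": "Identify today's relevant online news items matching the targets above.",
    "Style": "Concise entries in a neutral, factual tone: headline, source, date, two-sentence summary.",
    "Audience": "An academic research team; assume subject-matter familiarity.",
    "Response": "A single daily report, organised by target category, with a notification at 8am.",
}

prompt = "\n\n".join(f"# {field}\n{instruction}" for field, instruction in costar_targets.items())
```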
**
Red herrings. I’ve focused on COSTAR because it was a central piece of the cybercrime data collection exercise. My advice to researchers still developing their chops is to shop around and be critical when selecting a pre-existing prompt configuration. Be aware of and appreciate alternatives, but the real priority at this stage of the process is to work out the source and information targets that correspond to a problem or question before chasing down ways to operationalise them. Settling on one prompt configuration and learning to apply it well is good - certainly better than nothing. But doing so to the exclusion of other approaches, or precluding the opportunity to try alternatives, could mean missing something that a different prompt would showcase more clearly.
Stray dogs. Role-Context-Task (RCT), for example, is an approach I’ve been using for a little while now. I don’t use it because it’s the best - I don’t know if it is. I use it because it’s the first structured prompt format I was introduced to. It’s simple and clear, and it was easy enough to remember and use that I adopted it very quickly. I like that it’s not over-categorised, which gives me a bit of flexibility to conceptualise and order requirements into whatever set-up I like. Is that a good thing? COSTAR at least suggests I have more to learn.
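For comparison, here’s an RCT prompt built the same way - again with placeholder content of my own, just to show how lean the format is.

```python
# Role-Context-Task (RCT) sketch, with illustrative placeholder content.
rct = {
    "Role": "You are a research assistant monitoring open-source cybercrime reporting.",
    "Context": "I compile a daily digest of online news items for an academic project.",
    "Task": ("Find today's relevant items, apply my selection criteria, and return "
             "a short, clearly organised report."),
}

prompt = "\n\n".join(f"# {field}\n{instruction}" for field, instruction in rct.items())
```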
Magic prompts. Preferences will vary. Ultimately, I think effective prompting has less to do with calcified acronyms and more to do with juiced-up synapses. Here’s the argument: trained researchers should be engaged and competent enough to puzzle through and adopt whatever categories and concepts they need to satisfy different kinds of research problems or objectives. Prompt-ready research requirements are the result of serious thought, good habits, and basic competencies in research methods. They’re coherent, conceptually sound, logically ordered, and articulated in clear, plain language.
—
Author Bio: Dr. Michael A. Innes is a Visiting Senior Research Fellow in the Department of War Studies at King’s College London, where he founded and directs the Conflict Records Unit. He maintains an active consultancy portfolio as managing director and lead consultant at Craighead Kellas SAAR.
Declaration: This research note summarises self-initiated research conducted independently by the author. No AI was involved in the drafting of this document.
RESEARCH SERIES ON AI BENCHMARKING.
Exploration and discovery of AI, from a user perspective.
© Craighead Kellas 2024. All Rights Reserved.