Why do we need ‘shots’ when prompting Large Language Models (LLMs)? Of course, AI needs a shot, lest it wake up and take over the universe!
But let’s clear the air here – these aren’t the ‘shots’ we’re discussing.
We’re exploring a technique for prompting LLMs, aimed at generating responses that are more precisely tailored to what you want.
You may have experienced this scenario: while posing a question to ChatGPT, you unintentionally press Enter midway, leaving your query incomplete. Suddenly, the unexpected occurs. ChatGPT dutifully completes your question before proceeding to answer it.
In the world of instant messaging, a friend would undoubtedly wait or prompt you to finish your thought. So, why do LLMs behave differently?
The answer lies in their roots. LLMs like GPT were initially trained for text completion, that is, predicting what comes next, not for conversation. However, with additional training on dialogue data, models like ChatGPT have evolved into capable conversationalists.
To coax desired outputs from these models, users often present them with examples, or ‘shots’. By following these patterns, the model can deliver answers in a more predictable and desired format.
‘Few-shot’ prompting provides the model with several examples, guiding it towards a specific output style.
The concept extends to ‘one-shot’ and ‘zero-shot’ prompting. ‘One-shot’ prompting provides a single example.
Meanwhile, ‘zero-shot’ prompting tasks the model without supplying any examples or guidance on the format of the output.
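To make the three styles concrete, here is a minimal sketch that assembles zero-, one-, and few-shot prompts as plain text. The sentiment-labelling task, the sample reviews, and the `build_prompt` helper are all hypothetical, invented purely for illustration; actually sending the string to a model is left out.

```python
def build_prompt(instruction, examples, query):
    """Assemble a prompt from an instruction, zero or more worked
    examples (the 'shots'), and the new query to be answered."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # End on an open slot so the model's natural move is to fill it in.
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)


# Hypothetical task and examples, made up for this sketch.
INSTRUCTION = "Classify the sentiment of each review."
SHOTS = [
    ("The battery lasts all day.", "positive"),
    ("It broke within a week.", "negative"),
    ("Does what it says, nothing more.", "neutral"),
]
QUERY = "Setup was quick and painless."

zero_shot = build_prompt(INSTRUCTION, [], QUERY)        # no examples
one_shot = build_prompt(INSTRUCTION, SHOTS[:1], QUERY)  # one example
few_shot = build_prompt(INSTRUCTION, SHOTS, QUERY)      # several examples

print(few_shot)
```

The only difference between the three prompts is how many worked examples sit between the instruction and the final, unanswered query. Ending on an open `Sentiment:` slot is a design choice that leans on the completion behaviour described earlier.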
However, if LLMs can produce answers without ‘shots’, why bother with them? Especially when conversational models like ChatGPT and Claude 2 are readily available?
The keyword here is ‘format’. Formats are best understood through examples. As a rule of thumb, the more ‘shots’ supplied, the more closely the model tends to adhere to the format. Often, the format of the information is as crucial as the information itself.
Consider a desired output format with its own quirks of ordering and punctuation. How would we articulate it, even to a fellow human? In such instances, examples outshine verbal explanations in illustrating specific requirements.
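As one hypothetical illustration of that point, the sketch below asks for a quirky record format. Two shots convey it at a glance, whereas spelling it out precisely takes a regular expression. The record layout, the `format_prompt` and `follows_format` helpers, and the sample inputs are assumptions made up for this sketch, not a standard of any kind.

```python
import re

# Hypothetical shots: free text paired with the exact record we want back.
SHOTS = [
    ("Alan Turing, born 1912, was a mathematician.",
     "[TURING, Alan] (b. 1912) :: mathematician"),
    ("Grace Hopper, born 1906, was a computer scientist.",
     "[HOPPER, Grace] (b. 1906) :: computer scientist"),
]

# The same format, described precisely rather than shown by example.
RECORD = re.compile(r"\[[A-Z]+, [A-Z][a-z]+\] \(b\. \d{4}\) :: [a-z ]+")


def format_prompt(shots, query):
    """Lay out each shot as an Input/Output pair, then leave the last Output open."""
    blocks = [f"Input: {src}\nOutput: {dst}" for src, dst in shots]
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)


def follows_format(reply):
    """Check whether a model reply matches the record format exactly."""
    return RECORD.fullmatch(reply.strip()) is not None


prompt = format_prompt(SHOTS, "Ada Lovelace, born 1815, was a writer.")
print(prompt)
```

A check like `follows_format` could be run on whatever the model returns, so that a reply which drifts from the pattern can be caught and retried.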
In conclusion, while not essential, ‘shots’ serve as handy tools for shaping the responses of large language models. They guide these AI marvels, enabling them to deliver not just any answer, but the right answer in the right format. So, the next time you prompt an LLM, remember the power of a well-placed ‘shot’!