Prompting: Chain-of-thought and Skeleton-of-thought
Here we discuss two techniques currently finding success in prompting: chain-of-thought (CoT) and skeleton-of-thought (SoT).
If you are here, I am assuming you are already familiar with zero-shot and few-shot prompting.
I wanted to discuss two common but skilful techniques that have helped my LLM hallucinate less and stop acting like my toddler nephew. My nephew is an annoying little menace, much like a hallucinating LLM :), but I love him for the good in him and the impact he will hopefully have on the world, again like an LLM. Still, my sister has to instil the right knowledge in him, just as we do with CoT and SoT.
Chain-of-thought
Introduced by Wei et al. (2022), chain-of-thought (CoT) prompting elicits complex reasoning from an LLM by asking it, through the prompt, to produce intermediate reasoning steps before the final answer. There are multiple flavours of CoT.
Zero shot CoT
The basic idea of zero-shot CoT (Kojima et al., 2022) is that simply adding "Let's think step by step" to the original prompt helps the LLM "think" before "responding".
For example, for a basic maths problem:
prompt = f"""
Let's think step by step for the following problem before answering:
Problem: {problem}
"""a more detailed version
prompt = f"""
Let's solve the following math problem step-by-step:
Problem: {problem}
1. First, identify the operation needed to isolate the variable.
2. Perform the operation step-by-step.
3. Simplify the equation to find the solution.
"""Few shot CoT
Here we simply augment the zero-shot CoT prompt with a few worked examples and nothing else.
prompt = """
Let's solve some math problems step-by-step.
Example 1:
Problem: Solve for x in the equation 3x + 2 = 11.
1. Subtract 2 from both sides of the equation.
- 3x + 2 - 2 = 11 - 2
- 3x = 9
2. Divide both sides by 3.
- 3x / 3 = 9 / 3
- x = 3
Example 2:
Problem: Solve for y in the equation 4y - 5 = 7.
1. Add 5 to both sides of the equation.
- 4y - 5 + 5 = 7 + 5
- 4y = 12
2. Divide both sides by 4.
- 4y / 4 = 12 / 4
- y = 3
Now, let's solve a new problem step-by-step.
Problem: {problem}
"""Auto CoT
I have not dug into this one as deeply, but here is my understanding. In zero-shot CoT, the model gets a single prompt instructing it to solve the problem step by step without any prior examples, so it has to produce the intermediate reasoning on its own. Auto-CoT instead has the model autonomously construct those intermediate reasoning steps, potentially over multiple passes or iterations and in combination with self-consistency or prompt chaining, so that it can both create and verify its own reasoning paths.
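One concrete way to make this hands-on is the recipe from the Auto-CoT paper (Zhang et al., 2022): run zero-shot CoT over a handful of sample questions to auto-generate reasoning demonstrations, then prepend those demonstrations to the real question. The sketch below is only a rough illustration of that flow; the model name and questions are placeholders, and the paper's diversity-based question clustering is omitted.
# A minimal Auto-CoT-style sketch, assuming the OpenAI v1 Python client.
# Model name, sample questions, and the new question are placeholders.
from openai import OpenAI

client = OpenAI(api_key="your-api-key")
MODEL = "gpt-3.5-turbo-1106"

def zero_shot_cot(question: str) -> str:
    # Generate a reasoning chain for one question with plain zero-shot CoT
    response = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user",
                   "content": f"Q: {question}\nA: Let's think step by step."}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

# 1. Build demonstrations automatically by running zero-shot CoT on a few
#    sample questions (Auto-CoT proper also clusters questions for diversity,
#    which is skipped here); no hand-written rationales are needed.
sample_questions = [
    "Solve for x in the equation 3x + 2 = 11.",
    "Solve for y in the equation 4y - 5 = 7.",
]
demos = "\n\n".join(
    f"Q: {q}\nA: Let's think step by step. {zero_shot_cot(q)}"
    for q in sample_questions
)

# 2. Prepend the auto-generated demonstrations to the new question, so the
#    final call is effectively few-shot CoT built without manual effort.
new_question = "Solve for z in the equation 5z - 4 = 16."
final_prompt = f"{demos}\n\nQ: {new_question}\nA: Let's think step by step."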
Self-consistency
Self-consistency prompting is an advanced technique that builds on chain-of-thought prompting. Instead of relying on a single reasoning path, it generates multiple diverse reasoning paths and then selects the most consistent answer among them, which improves accuracy and reliability on complex reasoning tasks. Proposed by Wang et al. (2022), self-consistency aims "to replace the naive greedy decoding used in chain-of-thought prompting".
import re
from collections import Counter

from openai import OpenAI

# Set your OpenAI API key
client = OpenAI(api_key='your-api-key')

def generate_solutions(prompt, n=5):
    # Chat Completions API (the legacy text-davinci-003 completion endpoint is retired)
    response = client.chat.completions.create(
        model="gpt-3.5-turbo-1106",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=300,  # enough room for the full reasoning chain
        temperature=0.7,
        n=n,  # Generate n different reasoning paths
    )
    solutions = [choice.message.content.strip() for choice in response.choices]
    return solutions

def extract_final_answer(solution):
    # Pull the last number out of the completion; full step-by-step texts
    # rarely match verbatim, so we vote on the extracted final answer instead
    numbers = re.findall(r'-?\d+(?:\.\d+)?', solution)
    return numbers[-1] if numbers else solution

def select_most_consistent_solution(solutions):
    # Count occurrences of each final answer and select the most common one
    answers = [extract_final_answer(s) for s in solutions]
    most_common_answer = Counter(answers).most_common(1)[0][0]
    return most_common_answer

# Example problem
problem = "Solve for x in the equation 2x + 3 = 11."
prompt = f"""
Solve the following math problem step-by-step and explain your reasoning.
Problem: {problem}
"""

# Self-consistency process
solutions = generate_solutions(prompt, n=10)  # Generate 10 solutions for better consistency
print("Generated Solutions:", solutions)
most_consistent_solution = select_most_consistent_solution(solutions)
print("Most Consistent Solution:", most_consistent_solution)
1. Generating multiple solutions:
• Set n=10 to generate 10 different completions, giving diverse reasoning paths to vote over.
2. Automated selection:
• Use Counter from the collections module to count how often each extracted final answer appears and automatically pick the most frequent one.
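As a tiny standalone illustration of that voting step (the final answers here are made-up strings, not model output):
from collections import Counter

# Pretend four sampled reasoning paths ended with these final answers
final_answers = ["x = 4", "x = 4", "x = 5", "x = 4"]
print(Counter(final_answers).most_common(1)[0][0])  # prints "x = 4"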
Skeleton-of-thought
Here we instruct the LLM to first generate a crisp skeleton/outline for the given topic, and then let it expand on that outline point by point.
The 3 steps involved are:
Prompt to generate the skeleton outline
Prompt to elaborate each point
Concurrent execution of each point
import json
import concurrent.futures

from openai import OpenAI

OPEN_AI_API_KEY = 'your-api-key'

class Gpt4Turbo:
    # Despite the class name, this example uses gpt-3.5-turbo-1106;
    # swap in a GPT-4 model name if you have access.
    def __init__(self):
        self.MODEL = 'gpt-3.5-turbo-1106'
        self.TOKEN_LIMIT = 4000
        self.client = OpenAI(api_key=OPEN_AI_API_KEY)
        self.temperature = 0.7
        self.streaming = False

    def gptCall_json(self, temperature, messages: list):
        try:
            response = self.client.chat.completions.create(
                model=self.MODEL,
                messages=messages,
                temperature=temperature,
                max_tokens=self.TOKEN_LIMIT,
                stream=False,
                response_format={"type": "json_object"})  ## Enforce JSON output format
            return response.choices[0].message.content
        except Exception as e:
            print(e)
            return ""

    def generate_skeleton(self):
        question = self.question
        outline_prompt = f'''
        You're an organizer responsible for only giving the skeleton (not the full content) for answering the question.
        Provide the skeleton as a JSON to answer the question. Instead of writing a full sentence, each skeleton point should
        be very short with only 2~5 words. Generally, the skeleton should have 3~10 points. The skeleton is an outline that would be expanded later.
        Don't elaborate on the point in the skeleton.
        Example:
        \n\nQuestion:\nWhat are the typical types of Chinese dishes?\n Response: {{"answer": ["Dumplings", "Noodles", "Dim Sum", "Hot Pot", "Wonton", "Ma Po Tofu", "Char Siu", "Fried Rice"]}}
        \n\nQuestion:\nWhat are some practical tips for individuals to reduce their carbon emissions?\n Response: {{"answer": ["Energy Conservation", "Efficient Transportation", "Home Energy Efficiency", "Reduce Water Consumption", "Sustainable Diet", "Sustainable Travel"]}}
        \n\nNow, please provide the skeleton for the following question.\n{question}\n Response: {{"answer": [...]}}
        '''
        ## Build the messages
        message = []
        message.append({"role": "system", "content": "You are a helpful assistant. You respond in JSON format."})
        message.append({"role": "user", "content": outline_prompt})
        self.result = self.gptCall_json(self.temperature, message)

    def elaborate_point(self, point):
        question = self.question
        point_prompt = f'''
        You help elaborate on the point the user wants. Your input is a question and one possible answer to the question, also called <point>. You will elaborate on the <point> and give a 2-3 sentence response
        on how the <point> helps answer the question. Start your response with the <point>, then a colon, and then your elaboration.
        Your response will be in JSON format. Example: {{"answer": "{point}: your response"}}
        \n\nNow, please elaborate on the following point. Question: {question}\n <Point>: {point} \n Response: {{"answer": "..."}}
        '''
        ## Build the messages
        message = []
        message.append({"role": "system", "content": "You are a helpful assistant. You respond in JSON format."})
        message.append({"role": "user", "content": point_prompt})
        result = self.gptCall_json(self.temperature, message)
        point_elaborate = json.loads(result)
        return point_elaborate['answer']

    def concurrent_results(self, question):
        self.question = question
        self.generate_skeleton()
        result = json.loads(self.result)
        num_points = len(result["answer"])
        # Create a thread pool with one worker per skeleton point
        with concurrent.futures.ThreadPoolExecutor(max_workers=num_points) as executor:
            # Submit the elaboration API calls to the executor
            outputs = [executor.submit(self.elaborate_point, point) for point in result['answer']]
            # Collect results in skeleton order (iterating over `outputs` rather
            # than as_completed preserves the original point order)
            results = [future.result() for future in outputs]
        # Enumerate the elaborated points and join them into the final answer
        string_list = [f"{i+1}. {record}\n" for i, record in enumerate(results)]
        final_output = ''.join(string_list)
        return final_output
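To tie it all together, here is a quick usage sketch (assuming the Gpt4Turbo class above and a valid API key; the question is just an illustrative placeholder):
# Hypothetical usage of the skeleton-of-thought helper defined above
sot = Gpt4Turbo()
answer = sot.concurrent_results(
    "What are some practical tips for individuals to reduce their carbon emissions?")
print(answer)  # A numbered list with one short elaboration per skeleton point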
That's all, folks!
#PromptEngineering #ChainOfThought #SkeletonOfThought #AIResearch #MachineLearning

