First drafts are rarely perfect. The key to high-quality outputs is iteration—generating, critiquing, and refining. In this lesson, you’ll learn how to make LLMs critique and improve their own outputs, creating a systematic refinement process that dramatically improves quality.
Problem: A store offers 20% off, then an additional 10% off the discounted price. What's the total discount on a $100 item?

Generate 3 independent solutions:

Solution 1:
- First discount: $100 × 0.20 = $20
- Price after first discount: $100 - $20 = $80
- Second discount: $80 × 0.10 = $8
- Final price: $80 - $8 = $72
- Total discount: $100 - $72 = $28 (28%)

Solution 2:
- Combined discount: 1 - (0.80 × 0.90) = 1 - 0.72 = 0.28
- Total discount: 28%
- Final price: $100 × 0.72 = $72

Solution 3:
- First discount: 20% off $100 = $20 off, leaving $80
- Second discount: 10% off $80 = $8 off
- Total saved: $20 + $8 = $28
- Total discount: 28%

Consensus: All three solutions agree on 28% total discount and $72 final price.

Answer: 28% total discount, final price $72
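The two routes used in the solutions above (applying the discounts one after another, and multiplying the remaining fractions) can be checked with a short script. This is a minimal sketch: the function names are illustrative and not part of the lesson, and results are rounded to avoid floating-point noise.

```python
def sequential_discount(price: float, discounts: list[float]) -> float:
    """Apply each discount in turn to the running price; return the final price."""
    for d in discounts:
        price -= price * d
    return round(price, 2)


def combined_discount(discounts: list[float]) -> float:
    """Total discount as a fraction: 1 minus the product of the remaining fractions."""
    remaining = 1.0
    for d in discounts:
        remaining *= 1 - d
    return round(1 - remaining, 4)


final_price = sequential_discount(100.0, [0.20, 0.10])
total = combined_discount([0.20, 0.10])
print(final_price, total)  # 72.0 0.28
```

Both routes give a $72 final price and a 28% total discount, matching the consensus answer. The total is 28% rather than 30% because the second discount applies to the already reduced price.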
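Research Finding: Self-consistency (sampling several independent reasoning paths and taking the most common answer) can improve accuracy by 20-30% on reasoning tasks compared to single-path solutions. The size of the gain varies by model and task.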
Problem: If 5 machines make 5 widgets in 5 minutes, how long does it take 100 machines to make 100 widgets?

Solution 1:
- 5 machines make 5 widgets in 5 minutes
- So 100 machines make 100 widgets in 5 minutes
Answer: 5 minutes

Solution 2:
- 5 machines make 5 widgets in 5 minutes
- Rate: 1 widget per machine per 5 minutes
- 100 machines × 1 widget each = 100 widgets in 5 minutes
Answer: 5 minutes

Solution 3:
- 5 machines make 5 widgets in 5 minutes
- 100 machines make 20× more = 100 widgets
- Time needed: 5 minutes × (100/5) = 100 minutes
Answer: 100 minutes ❌

Analysis: Solutions 1 and 2 agree (5 minutes). Solution 3 made an error in reasoning.
The correct answer is 5 minutes (majority vote).

Verification: Each machine makes 1 widget in 5 minutes, so 100 machines make 100 widgets in 5 minutes simultaneously.
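The majority-vote step in the example above can be sketched in a few lines. This assumes the final answer has already been extracted from each sampled solution as a short string; `majority_vote` is an illustrative name, not a library function.

```python
from collections import Counter


def majority_vote(answers: list[str]) -> tuple[str, float]:
    """Return the most common answer and the share of samples that agree with it."""
    if not answers:
        raise ValueError("need at least one answer")
    # Normalize so that trivial differences in case or spacing do not split the vote.
    normalized = [a.strip().lower() for a in answers]
    winner, count = Counter(normalized).most_common(1)[0]
    return winner, count / len(normalized)


# Final answers extracted from the three sampled solutions above.
answers = ["5 minutes", "5 minutes", "100 minutes"]
winner, agreement = majority_vote(answers)
print(winner, round(agreement, 2))  # 5 minutes 0.67
```

A majority is evidence, not proof. A low agreement share is a reasonable signal to sample more solutions or to run a separate verification step, as the example does.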
Step 1: Generate Initial Response
[Create first draft]

Step 2: Critique the Response
Identify weaknesses:
- What's missing?
- What could be clearer?
- What's incorrect?
- What could be improved?

Step 3: Generate Improved Version
[Create refined version addressing critiques]
Task: Write a professional email declining a job offer.

STEP 1: Initial Draft

Subject: Job Offer

Hi,

Thanks for the offer but I've decided to go with another company.

Best,
John

STEP 2: Critique

Issues identified:
- Subject line too vague
- Lacks gratitude and professionalism
- Too brief and abrupt
- Doesn't maintain relationship
- No specific reason given
- Missing proper closing

STEP 3: Refined Version

Subject: Re: Software Engineer Position - Decision

Dear [Hiring Manager Name],

Thank you so much for offering me the Software Engineer position at [Company Name]. I truly appreciate the time you and your team invested in the interview process and the opportunity to learn more about your innovative work.

After careful consideration, I have decided to accept a position with another organization that aligns more closely with my current career goals. This was a difficult decision, as I was impressed by [Company Name]'s mission and the talented team I met.

I hope we can stay in touch, and I wish you and the team continued success.

Warm regards,
John Smith
Task: Explain quantum computing to a 10-year-old.

ITERATION 1:
Quantum computers use quantum mechanics to process information using qubits instead of bits, leveraging superposition and entanglement for parallel computation.

Critique 1:
- Too technical (quantum mechanics, qubits, superposition, entanglement)
- No relatable examples
- Assumes too much prior knowledge

ITERATION 2:
Quantum computers are super powerful computers that work differently from regular computers. They can solve really hard problems much faster.

Critique 2:
- Better but still vague
- Doesn't explain HOW they're different
- Needs concrete example
- Could use analogy

ITERATION 3:
Imagine you're in a maze trying to find the exit. A regular computer would try one path at a time until it finds the way out. A quantum computer is like having a magical ability to try ALL paths at the same time! That's why it can solve really tricky puzzles much faster than regular computers. Scientists use them for things like creating new medicines and understanding space.

Critique 3:
- Clear analogy (maze)
- Age-appropriate language
- Concrete examples (medicines, space)
- Explains the key difference (parallel vs sequential)
- Good! Minor improvement: could add one more relatable example

FINAL VERSION:
Imagine you're in a maze trying to find the exit. A regular computer would try one path at a time until it finds the way out. A quantum computer is like having a magical ability to try ALL paths at the same time! That's why it can solve really tricky puzzles much faster.

Scientists use quantum computers for amazing things like:
- Creating new medicines to help sick people
- Understanding how stars and planets work
- Making super-secure passwords that no one can crack

It's like having a super-brain that can think about many things at once!
Task: [TASK DESCRIPTION]

Step 1: Generate initial response
[RESPONSE]

Step 2: Critique your response
Evaluate based on:
- Accuracy: Is the information correct?
- Completeness: Is anything missing?
- Clarity: Is it easy to understand?
- Relevance: Does it address the task?

Identified issues:
[LIST ISSUES]

Step 3: Generate improved response
[IMPROVED RESPONSE]
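The three steps of this template can also be driven from code as a loop. The sketch below makes several assumptions: `llm` stands for any function that takes a prompt string and returns the model's reply (no specific provider API is implied), `fake_llm` is a canned stand-in so the example runs offline, and the `NO ISSUES` stop phrase is a convention invented for this example.

```python
from typing import Callable


def refine(task: str, llm: Callable[[str], str], rounds: int = 2) -> str:
    """Generate a draft, then alternate critique and revision for up to `rounds` passes."""
    draft = llm(f"Task: {task}\nGenerate an initial response.")
    for _ in range(rounds):
        critique = llm(
            f"Task: {task}\nResponse:\n{draft}\n"
            "Critique this response for accuracy, completeness, clarity, and relevance. "
            "List the issues, or reply NO ISSUES if there are none."
        )
        if "NO ISSUES" in critique.upper():
            break  # stop early once the critique finds nothing to fix
        draft = llm(
            f"Task: {task}\nResponse:\n{draft}\nIssues:\n{critique}\n"
            "Generate an improved response that addresses every issue."
        )
    return draft


def fake_llm(prompt: str) -> str:
    """Stand-in model for demonstration: canned replies keyed on the prompt text."""
    if "Generate an initial response" in prompt:
        return "draft v1"
    if "Critique this response" in prompt:
        return "NO ISSUES" if "draft v2" in prompt else "- Too brief"
    return "draft v2"


print(refine("Write a short greeting", fake_llm))  # draft v2
```

Capping the number of rounds keeps cost bounded. The substring check for the stop phrase is deliberately simple, and a real pipeline would likely want a stricter signal, such as a structured verdict field.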
Task: Write a Python function to check if a string is a valid email address.

STEP 1: Initial Code

def is_valid_email(email):
    return '@' in email and '.' in email

STEP 2: Self-Critique

Accuracy Issues:
- Too simplistic - "@." would pass but isn't valid
- Doesn't check for proper format
- Doesn't validate domain
- Allows multiple @ symbols

Completeness Issues:
- No input validation (None, empty string)
- No docstring
- No error handling
- Missing edge cases

Clarity Issues:
- Function name is good
- But logic is too simple to be clear about what it actually validates

STEP 3: Improved Code

import re

def is_valid_email(email):
    """
    Validate email address format.

    Args:
        email (str): Email address to validate

    Returns:
        bool: True if valid email format, False otherwise

    Examples:
        >>> is_valid_email("user@example.com")
        True
        >>> is_valid_email("invalid.email")
        False
    """
    if not email or not isinstance(email, str):
        return False

    # Simplified pattern: a loose approximation of RFC 5322, not a full implementation
    pattern = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'
    # fullmatch requires the whole string to match, so a trailing newline is rejected
    return bool(re.fullmatch(pattern, email))
Statement: "The Great Wall of China is visible from space and is over 13,000 miles long."

Step 1: Identify verifiable facts
Fact A: The Great Wall is visible from space
Fact B: The Great Wall is over 13,000 miles long

Step 2: Check each fact

Fact A: Visible from space
Verification: FALSE
Reasoning: This is a common myth. The Great Wall is not visible from space with the naked eye. NASA astronauts have confirmed this. While it's visible from low Earth orbit with aid, it's not visible from the Moon or deep space.

Fact B: Over 13,000 miles long
Verification: TRUE
Reasoning: According to archaeological surveys, the Great Wall (including all branches) measures approximately 13,171 miles (21,196 km). This was confirmed by Chinese surveys in 2012.

Step 3: Overall assessment
PARTIALLY VERIFIED: The length claim is accurate, but the visibility from space claim is false.

Corrected statement: "The Great Wall of China is over 13,000 miles long, making it one of the longest structures ever built, though contrary to popular belief, it is not visible from space with the naked eye."
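Once each fact has its own verdict, the overall assessment in Step 3 is a simple aggregation. The sketch below shows one way to represent this. The `Claim` class and the `VERIFIED` / `NOT VERIFIED` labels are assumptions made for illustration; only `PARTIALLY VERIFIED` appears in the example above. The code does not check facts itself, it only combines verdicts produced in Step 2.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    text: str
    verified: bool
    reasoning: str


def overall_assessment(claims: list[Claim]) -> str:
    """Summarize per-claim verdicts into a single label."""
    if not claims:
        raise ValueError("no claims to assess")
    true_count = sum(c.verified for c in claims)
    if true_count == len(claims):
        return "VERIFIED"
    if true_count == 0:
        return "NOT VERIFIED"
    return "PARTIALLY VERIFIED"


claims = [
    Claim("The Great Wall is visible from space", False,
          "Common myth; not visible to the naked eye."),
    Claim("The Great Wall is over 13,000 miles long", True,
          "Surveys report about 13,171 miles (21,196 km)."),
]
print(overall_assessment(claims))  # PARTIALLY VERIFIED
```

Keeping the reasoning next to each verdict makes it easier to write the corrected statement afterwards, since every change can be traced back to a specific claim.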
Initial Output: [CONTENT]

Refinement Focus: [SPECIFIC ASPECT]

Critique focused on [ASPECT]:
[TARGETED CRITIQUE]

Refined version (improving [ASPECT]):
[IMPROVED CONTENT]
Example:
Initial: "Our product is good and customers like it."

Refinement Focus: Specificity and evidence

Critique: Too vague. "Good" is subjective. "Customers like it" needs evidence.

Refined: "Our product has a 4.8/5 star rating from over 10,000 customers, with 94% reporting they would recommend it to others."
Refine based on different stakeholder perspectives.
Initial Output: [CONTENT]

Perspective 1: [STAKEHOLDER TYPE]
Concerns: [ISSUES FROM THIS PERSPECTIVE]
Refinement: [ADJUSTMENTS]

Perspective 2: [STAKEHOLDER TYPE]
Concerns: [ISSUES FROM THIS PERSPECTIVE]
Refinement: [ADJUSTMENTS]

Balanced Final Version:
[CONTENT ADDRESSING ALL PERSPECTIVES]
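If the same template is reused with different numbers of stakeholders, it can be generated rather than edited by hand. The helper below is a minimal sketch that fills in the content and stakeholder types and leaves the remaining bracketed slots for the model to complete. `build_perspective_prompt` and the stakeholder names are hypothetical.

```python
def build_perspective_prompt(content: str, stakeholders: list[str]) -> str:
    """Fill the multi-perspective template with one section per stakeholder."""
    if not stakeholders:
        raise ValueError("provide at least one stakeholder")
    lines = [f"Initial Output: {content}", ""]
    for i, who in enumerate(stakeholders, start=1):
        lines += [
            f"Perspective {i}: {who}",
            "Concerns: [ISSUES FROM THIS PERSPECTIVE]",
            "Refinement: [ADJUSTMENTS]",
            "",
        ]
    lines += ["Balanced Final Version:", "[CONTENT ADDRESSING ALL PERSPECTIVES]"]
    return "\n".join(lines)


prompt = build_perspective_prompt(
    "Our product is good and customers like it.",
    ["Customer", "Sales manager"],  # hypothetical stakeholder types
)
print(prompt)
```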
Refine this customer service email through 2 iterations.

Initial: “We got your complaint. The problem is being looked at. We’ll let you know.”
Sample Solution
ITERATION 1 - Critique:
- Too informal and brief
- Lacks empathy
- No timeline
- Passive voice
- No specific action mentioned

ITERATION 1 - Refined:
"Thank you for contacting us about your issue. We're currently investigating the problem and will update you soon."

ITERATION 2 - Critique:
- Better but still vague ("soon")
- Could be more empathetic
- Should acknowledge specific issue
- Could offer interim solution

ITERATION 2 - Final:
"Thank you for bringing this to our attention. I sincerely apologize for the inconvenience you've experienced with [specific issue]. Our technical team is actively investigating and we'll have an update for you within 24 hours. In the meantime, [interim solution if applicable]. Please don't hesitate to reach out if you have any questions."
def calculate(a, b, op):
    if op == '+':
        return a + b
    elif op == '-':
        return a - b
    elif op == '*':
        return a * b
    elif op == '/':
        return a / b
Sample Solution
# CRITIQUE:
# - No input validation
# - Division by zero not handled
# - No docstring
# - Limited operators
# - No type hints
# - No error handling for invalid operators

# REFINED VERSION:
import operator

def calculate(a: float, b: float, operation: str) -> float:
    """
    Perform basic arithmetic operations.

    Args:
        a: First number
        b: Second number
        operation: Operation to perform ('+', '-', '*', '/')

    Returns:
        Result of the operation

    Raises:
        ValueError: If operation is invalid or division by zero
        TypeError: If inputs are not numbers
    """
    # Input validation
    if not isinstance(a, (int, float)) or not isinstance(b, (int, float)):
        raise TypeError("Both operands must be numbers")

    # Map each supported symbol to its arithmetic function
    operations = {
        '+': operator.add,
        '-': operator.sub,
        '*': operator.mul,
        '/': operator.truediv,
    }

    if operation not in operations:
        raise ValueError(f"Invalid operation: {operation}")

    # Check for division by zero explicitly instead of raising from inside a lambda
    if operation == '/' and b == 0:
        raise ValueError("Division by zero")

    return operations[operation](a, b)
Refine this technical explanation for a non-technical audience.

Initial: “The API uses RESTful architecture with JSON payloads over HTTPS, implementing OAuth 2.0 for authentication.”
Sample Solution
ITERATION 1 - Critique:
- Too many technical terms
- No context for why this matters
- Assumes knowledge of REST, JSON, OAuth
- Doesn't explain benefits

ITERATION 1 - Refined:
"The API is a way for different software programs to communicate securely. It uses industry-standard methods to keep your data safe."

ITERATION 2 - Critique:
- Better but too vague
- Could use analogy
- Should mention specific benefits
- "Industry-standard" is still jargon

ITERATION 2 - Final:
"Think of our API as a secure messenger between different apps. When one app needs information from another, the API delivers it safely, like a courier with a locked briefcase. This means your data stays private, and different tools you use can work together seamlessly."