ChatGPT Sucks at Testing - James Bach

Mahathee Dandibhotla
May 7, 2024


In this talk, James Bach critiques the effectiveness of ChatGPT in software testing, arguing that despite widespread claims that the model is revolutionizing the field, ChatGPT has significant limitations in real-world testing scenarios.

Here are the key points from the video:

1. ChatGPT’s Limitations in Testing: Despite reports and presentations touting ChatGPT’s benefits in testing, Bach argues that these claims are often based on simplistic examples that don’t reflect real-world complexity. ChatGPT’s ability to write code does not equate to an ability to test software effectively.
2. Hallucinations: ChatGPT tends to hallucinate, meaning it fabricates information that seems plausible but is inaccurate. Bach demonstrates this by asking Bing’s ChatGPT-enabled search to summarize a website’s functionality, only to find that the AI fabricated details about a non-existent tool. This behavior makes it unreliable for fact-based tasks and raises concerns about its use in testing.
3. Issues with Generating Test Cases: When ChatGPT is asked to generate test cases from a real user story, the output often contains errors and irrelevant details, and it omits negative test cases unless they are explicitly prompted for. This indicates that ChatGPT lacks the critical thinking required for effective test design and fails to ask the questions a competent human tester would (see the sketch after this list).
4. Inconsistency and Lack of Curiosity: ChatGPT gives inconsistent answers and fails to ask clarifying questions when providing test cases or solutions. It tends to comply with whatever the user demands without exploring the context, which leads to incorrect or incomplete results.
5. Additional Shortcomings: Bach also outlines several other weaknesses of ChatGPT, including incuriosity (it doesn’t ask questions), incongruence (its responses contradict one another), trouble with numbers, laziness (it repeats earlier answers), opacity (it cannot explain its reasoning), unteachability (it doesn’t adapt to correction), and an inability to process diagrams or other visual information.
6. ChatGPT’s Role in Software Testing: Given these limitations, Bach suggests that ChatGPT’s current capabilities fall far short of what software testing requires, especially the complex problem-solving, thorough investigation, and critical thinking involved. He advises caution with AI-based testing tools, noting that they may not deliver the revolution being promised for the field.
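To make point 3 concrete, here is a minimal sketch (not from Bach’s talk) of prompting an LLM for test cases from a user story via the OpenAI Python client. The user story, the model name, and the prompt wording are illustrative assumptions; the point is simply that negative and edge cases tend to appear only when the prompt demands them explicitly.

```python
# Minimal sketch, not from the talk: compare a bare request for test cases
# with one that explicitly asks for negative cases. Assumes the `openai`
# Python package (v1+) is installed and OPENAI_API_KEY is set in the env.
from openai import OpenAI

client = OpenAI()

# Hypothetical user story, used only for illustration.
USER_STORY = (
    "As a registered user, I want to reset my password via an emailed link "
    "so that I can regain access to my account."
)

def ask_for_test_cases(prompt: str) -> str:
    """Send a single prompt and return the model's text reply."""
    response = client.chat.completions.create(
        model="gpt-4o",  # model name is an assumption; substitute as needed
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Bare prompt: in Bach's experience the output tends to stay on the happy path.
bare = ask_for_test_cases(
    f"Write test cases for this user story:\n{USER_STORY}"
)

# Explicit prompt: negative and edge cases usually appear only when demanded.
explicit = ask_for_test_cases(
    "Write test cases for this user story, including negative tests such as "
    f"expired links, reused links, and invalid email addresses:\n{USER_STORY}"
)

print("--- bare prompt ---\n", bare)
print("--- explicit prompt ---\n", explicit)
```

Comparing the two outputs side by side makes Bach’s point visible: much of the real test design still has to come from the human writing the prompt.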

The talk serves as a warning about the limitations of ChatGPT and similar AI tools in software testing. Despite their potential, these models are prone to hallucination, lack curiosity, and struggle with complex reasoning, which undercuts exactly the qualities effective software testing depends on.

If you’d like to watch the video recording, here’s the link: https://www.youtube.com/watch?v=Dh0K2X9AnG8
