AI is quickly changing how developers write code, debug issues, and even design systems. But one question many testers and developers are starting to ask is: can AI actually write useful API tests?
As someone working in test automation, I was curious to see how far AI has come in this space. Instead of manually writing API tests, I decided to run a small experiment. I gave an AI assistant an OpenAPI specification and asked it to generate automated API tests. Then I executed those tests using Playwright inside Visual Studio Code to see how well they worked in practice.
The idea was simple:
- Provide AI with an OpenAPI specification
- Ask it to generate automated API tests
- Run those tests using Playwright
- Analyze the results
Sounds straightforward, but the results were more interesting than I expected.
In this article, I’ll walk through the full experiment step by step, from setting up the environment and generating tests with AI to running them and evaluating how reliable they actually are. If you’re a QA engineer, developer, or someone curious about the future of AI in testing, this experiment might give you a practical look at where things stand today.
Let’s see what happens when AI becomes your API test engineer.
✅ 1. Create a Playwright Project and Set Up AI in VS Code
For this experiment, I didn’t need to create a new Playwright project since I’m using an existing project. If you were starting fresh, you could run:
npm init playwright@latest
and follow the prompts to set up a basic project structure.
Since I already had a project ready, I focused on setting up AI inside Visual Studio Code using GitHub Copilot:
- Open Extensions in VS Code
- Search for GitHub Copilot Chat
- Install the extension
After installing, sign in with your GitHub account when prompted. Once signed in, the Copilot Chat panel is ready to generate code directly inside your project.
With the existing Playwright project and Copilot set up, the environment is ready to start generating AI-powered API tests.

✅ 2. Add the OpenAPI Spec (Petstore Swagger)
For this experiment, I used the Swagger Petstore OpenAPI specification, which provides a simple example API that’s perfect for testing.
I placed the Petstore OpenAPI JSON file in a folder called api-spec inside my project:
project-root/
└── api-spec/
    └── petstore.json
Having the OpenAPI spec in the project allows AI to read it and generate Playwright tests based on the endpoints and responses defined in the file.
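Before asking AI for tests, it can help to know which operations the spec actually defines for the pet object, so the generated suite can be checked against that list. Below is a minimal sketch, assuming the spec follows the usual OpenAPI layout of `paths` mapping to HTTP methods. The names `OpenApiSpec`, `listOperations`, and `sampleSpec` are hypothetical, and the sample is a shortened excerpt rather than the real file.

```typescript
// Minimal slice of an OpenAPI document: path -> HTTP method -> operation.
type OpenApiSpec = {
  paths: Record<string, Record<string, { operationId?: string }>>;
};

// Path items can also hold non-operation keys such as "parameters",
// so only keys that are HTTP methods are counted.
const HTTP_METHODS = ["get", "put", "post", "delete", "patch", "head", "options"];

// Return one "METHOD /path" entry per operation at or under the given prefix.
function listOperations(spec: OpenApiSpec, prefix: string): string[] {
  const operations: string[] = [];
  for (const [path, item] of Object.entries(spec.paths)) {
    if (path !== prefix && !path.startsWith(prefix + "/")) continue;
    for (const method of Object.keys(item)) {
      if (HTTP_METHODS.includes(method)) {
        operations.push(`${method.toUpperCase()} ${path}`);
      }
    }
  }
  return operations;
}

// Hypothetical excerpt; the real petstore.json defines more paths than this.
const sampleSpec: OpenApiSpec = {
  paths: {
    "/pet": { put: { operationId: "updatePet" }, post: { operationId: "addPet" } },
    "/pet/{petId}": { get: { operationId: "getPetById" }, delete: { operationId: "deletePet" } },
    "/store/order": { post: { operationId: "placeOrder" } },
  },
};

const petOperations = listOperations(sampleSpec, "/pet");
// → ["PUT /pet", "POST /pet", "GET /pet/{petId}", "DELETE /pet/{petId}"]
```

In a real project the spec object could be loaded with `JSON.parse(readFileSync("api-spec/petstore.json", "utf8"))` from `node:fs` instead of the inline sample.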
✅ 3. Let AI Generate the Tests
With the OpenAPI spec ready, I asked AI to generate Playwright tests specifically for the pet object.
I used GitHub Copilot Chat in VS Code and provided the following prompt:
You are a Playwright API testing expert.
Using petstore.json OpenAPI spec given in the api-spec folder, generate Playwright API tests for pet object only in a new directory.
Requirements:
- Use Playwright test runner
- Use request fixture
- Validate response status
- Validate response structure
- Create one test per endpoint
Within a few seconds, the AI generated a test suite in a new folder (for example tests/pet).

This gave a ready-to-run test suite for the pet endpoints, saving a lot of time compared to writing all tests manually.

Next, it was time to run the AI-generated tests. Since the goal was to see how well the AI did, I decided to run all the generated tests without any modifications.
In the terminal, from the project root, I executed:

Playwright ran all the tests in the generated folder. Of the 8 generated tests, 5 passed and 3 failed. At this stage, I didn’t change anything; this was purely to see the AI’s first attempt.
We’ll analyze the results in the next step to understand why some tests passed, why some failed, and what can be improved.
✅ 4. Reviewing the Results
After running the tests, I reviewed the generated test suite to see how well the AI handled the API specification. Overall, the results were quite interesting.
Here are some observations from the generated tests:
- One test per endpoint: As requested in the prompt, the AI generated one test for each endpoint, which made the test suite clean and easy to understand.
- Basic validations were implemented: The tests mainly validated the response status code and the response body, and they included a reasonable number of assertions for the returned data.
- Some tests failed as expected: Out of the generated tests, three tests failed — delete pet, update pet, and get pet by id. This was somewhat expected because the tests used static IDs, and those records may not always exist in the system under test.
- Request payload coverage: For endpoints that required a request body, the AI generated payloads that covered most of the available fields defined in the OpenAPI specification with sample test data.
- Good starting point for automation: While the tests were not perfect, they provided a solid starting point that could easily be improved by adding better data handling, dynamic IDs, and additional validations.
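The response-structure checks mentioned above can be sketched as a small type guard. This is not the code the AI produced. It is an illustration under assumptions: the `Pet` fields are taken from the public Petstore schema (with `name` and `photoUrls` required), and the names `Pet`, `PET_STATUSES`, and `isPet` are hypothetical.

```typescript
// Assumed shape of a pet, based on the public Petstore schema.
type Pet = {
  id?: number;
  name: string;
  photoUrls: string[];
  status?: "available" | "pending" | "sold";
};

const PET_STATUSES = ["available", "pending", "sold"];

// Return true only when the parsed response body looks like a Pet.
function isPet(body: unknown): body is Pet {
  if (typeof body !== "object" || body === null) return false;
  const pet = body as Record<string, unknown>;

  // Optional fields may be absent, but must have the right type when present.
  const idOk = pet.id === undefined || typeof pet.id === "number";
  const statusOk =
    pet.status === undefined ||
    (typeof pet.status === "string" && PET_STATUSES.includes(pet.status));

  // Required fields must always be present.
  const nameOk = typeof pet.name === "string";
  const photosOk =
    Array.isArray(pet.photoUrls) &&
    pet.photoUrls.every((url) => typeof url === "string");

  return idOk && statusOk && nameOk && photosOk;
}
```

Inside a Playwright test, a guard like this could be asserted with `expect(isPet(await response.json())).toBe(true)`, which keeps the structural rules in one place instead of repeating field-by-field assertions in every test.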
✅ 5. Fixing a Failed Test with AI
Next, I wanted to try something different. Instead of manually fixing the failed tests, I asked AI to help improve one of them.
One of the failed tests was the delete pet test. The failure was expected because the test used a static pet ID, and that pet might not exist in the system when the test runs.
To fix this, I asked AI to update the test so that it creates a new pet before running the delete test, and then uses the returned ID when deleting it.
I used the following prompt in GitHub Copilot inside Visual Studio Code:
Improve the Playwright delete-pet-test.
Add a beforeEach hook that creates a new pet using the POST /pet endpoint.
Extract the pet id from the response and use it in the delete test.
Use that generated pet id in the DELETE /pet/{petId} test instead of using a static id.
Requirements:
Use Playwright test runner
Use request fixture
Validate response status
Keep the test structure simple
Impressively, the AI updated the test exactly as requested by adding the setup step to create a new pet before running the delete test. The test is now dynamic, using the generated pet ID instead of relying on a static value.
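The updated test itself is not reproduced here, but the create-then-delete pattern it follows can be sketched as two small helpers. This is an illustration under assumptions, not the AI's output: `ApiClientLike` and `ApiResponseLike` are hypothetical minimal interfaces that Playwright's `request` fixture is expected to satisfy structurally, `baseUrl` is a hypothetical parameter for the API root, and the payload fields and the 200 status for POST /pet are assumed from the Petstore spec.

```typescript
// Minimal structural types covering only the calls used below.
// Assumption: Playwright's request fixture (APIRequestContext) fits this shape.
interface ApiResponseLike {
  status(): number;
  json(): Promise<unknown>;
}

interface ApiClientLike {
  post(url: string, options?: { data?: unknown }): Promise<ApiResponseLike>;
  delete(url: string): Promise<ApiResponseLike>;
}

// Create a pet via POST /pet and return the id found in the response body.
async function createPet(
  client: ApiClientLike,
  baseUrl: string,
  name: string,
): Promise<number> {
  const response = await client.post(`${baseUrl}/pet`, {
    data: { name, photoUrls: [], status: "available" },
  });
  if (response.status() !== 200) {
    throw new Error(`POST /pet returned ${response.status()}`);
  }
  const body = (await response.json()) as { id?: unknown };
  if (typeof body.id !== "number") {
    throw new Error("POST /pet response did not include a numeric id");
  }
  return body.id;
}

// Delete the given pet and return the status code for the test to assert on.
async function deletePet(
  client: ApiClientLike,
  baseUrl: string,
  petId: number,
): Promise<number> {
  const response = await client.delete(`${baseUrl}/pet/${petId}`);
  return response.status();
}
```

In a Playwright spec file, `createPet` would be called from `test.beforeEach(async ({ request }) => { ... })` to store the new id, and the delete test would then assert on the value returned by `deletePet`. Because each run creates its own record, the test no longer depends on a pet that may or may not exist.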
🧭 Conclusion
- Using GitHub Copilot with Playwright in Visual Studio Code made it very quick to generate API tests from an OpenAPI specification.
- AI successfully created one test per endpoint, including response status validations and several useful assertions.
- Of the 8 generated tests, 5 passed without any changes, showing that AI can produce a good starting point for API automation.
- Some tests failed due to static test data, which is a common issue in automated tests.
- With a simple prompt, AI was also able to improve a failing test by making it dynamic, demonstrating how AI can help refine tests as well.
- Overall, AI is not a replacement for testers, but it can significantly speed up test creation and reduce repetitive work.
Discussion — How This Helps Testers
AI-generated tests are not meant to replace testers, but they can significantly change how testers work. Instead of spending time writing repetitive test code, testers can use AI to quickly generate a starting point and then focus on improving test quality.
Here are a few ways this approach can help in a tester’s day-to-day work:
- Faster test creation: Testers can use AI to quickly generate initial test cases or automation scripts from API specifications, documentation, or user stories.
- Better use of documentation: Specifications such as OpenAPI can be used directly with AI to generate tests, making documentation more valuable in the testing process.
- Focus more on test strategy: Instead of writing every test manually, testers can spend more time thinking about edge cases, negative scenarios, and overall test coverage.
- Improve existing tests: AI can help refactor tests, add assertions, or make tests more dynamic, as we saw when fixing the delete pet test.
- Assist non-automation testers: Testers who are new to automation can use AI as a guide to generate code examples and learn automation frameworks faster.
- Shift in the tester’s role: The role of testers may shift from writing every line of code to reviewing, guiding, and improving AI-generated tests.
In other words, AI can act as a testing assistant, helping testers move faster while still relying on human expertise to design meaningful and reliable tests.

