[Banner: ClearSpec — Agentic Pipeline]
Agentic AI · April 2026

From Jira Ticket to Test Code —
Building an Agentic Pipeline in Two Weeks

AI has changed how we build, test and ship software — and if you work in quality engineering, you've felt it. This is the story of two weeks, four certificates and one system that ties it all together.


The Shift I Couldn't Ignore

Over the past year I've been layering AI into my testing workflow piece by piece: Playwright MCP for browser automation, GitHub Copilot for test scaffolding and agentic flows to orchestrate the repetitive parts of a test cycle. Each addition saved time. But I wanted to understand the concepts underneath them — so I went deep.


Two Weeks, Four Certificates, One System

I carved out two weeks to properly learn the foundational concepts shaping modern AI engineering: agents, agentic workflows, RAG, MCP (Model Context Protocol) and Skills. I worked through Claude's Skilljar courses and a stack of YouTube deep-dives, completing four certificates along the way.

The concepts clicked. I built small experiments — MCP servers, standalone agent bots, multi-agent subagent chains. But the real test was putting them together into something that actually does something useful.


The Project: Jira to Test Code, End to End

I designed a sequential agentic pipeline that takes a Jira ticket ID as input and outputs production-ready test files — no manual handoffs, no copy-pasting, no context switching. The pipeline has four stages, each owned by a dedicated stateless agent.

Stage                 What it does                                                  Output
AC Extraction         Connects to Jira via MCP, pulls acceptance criteria           requirements-{id}.md
AC Refinement         Probes for gaps — edge cases, error states, empty states      refined-requirements-{id}.md
Test Planning         Designs a test plan (unit ~70%, integration ~20%, E2E ~10%)   test-plan-{id}.md
Test Implementation   Writes and runs Playwright + Jest test files                  test-cases-{id}.md

Between every stage sits a Human-in-the-Loop gate — the pipeline pauses, surfaces the artefact, and waits for your explicit approval before advancing. You can edit the output before signing off. Nothing moves without you.

Well-scoped agents with clear handoffs outperform one monolithic model trying to do everything at once.

The Core AI Concepts at Work

This wasn't just gluing tools together. Every design decision maps to a specific agentic concept.

Skills
Each stage is a markdown slash command that orchestrates the sequence and delegates reasoning to subagents.
Stateless subagents
Every agent starts cold, does one job, writes one file. No shared context, no bleed between stages.
Model tiering
Haiku handles Q&A loops and validation; Sonnet runs only for synthesis. Same output quality, a fraction of the token cost.
Chained Inversion
Stages 2 and 3 use a one-question-at-a-time Q&A loop. The human answers each gap; the synthesiser reads the completed log at the end. Context stays small, quality stays high.
Tool use via MCP
Stage 1 calls the Atlassian Rovo MCP server directly to fetch live Jira data. No copy-pasting.
Agent memory
Q&A answers persist to disk so incremental reruns skip already-resolved questions.
[Diagram: architecture of the agentic pipeline, with each stage as a separate agent and the flow of data between them]
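Three of these concepts — the one-question-at-a-time loop, model tiering and on-disk memory — fit together in a few lines. A hedged Python sketch; ask_model is a hypothetical stand-in for the actual LLM calls, and the JSON log format is my assumption, not necessarily the pipeline's:

```python
import json
from pathlib import Path

def ask_model(model: str, prompt: str) -> str:
    """Placeholder for an LLM call; tagged with the model tier used."""
    return f"[{model}] response to: {prompt[:40]}"

def refine(ticket_id: str, gaps: list[str], answer_fn=input) -> str:
    log_path = Path(f"qa-log-{ticket_id}.json")
    log = json.loads(log_path.read_text()) if log_path.exists() else {}
    for gap in gaps:
        if gap in log:            # agent memory: skip already-resolved questions
            continue
        # Cheap tier (Haiku) drafts one question at a time.
        question = ask_model("haiku", f"Ask one question about: {gap}")
        log[gap] = answer_fn(question)          # human answers each gap
        log_path.write_text(json.dumps(log))    # persist after every answer
    # Synthesis runs once, on the stronger tier, over the completed log.
    return ask_model("sonnet", f"Refine the spec using answers: {json.dumps(log)}")
```

Rerunning with the same ticket ID finds the log on disk and skips straight to synthesis, which is what makes incremental reruns cheap.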

What This System Achieves

You trigger each stage individually and approve every handoff. But what the pipeline actually solves is a problem every QA engineer knows well: the gap between what a ticket says and what actually needs to be tested.

Most acceptance criteria are written from a developer's perspective — happy path, basic flows, nothing about empty states, error messages, boundary conditions, or accessibility. By the time a tester picks up the ticket, those gaps are invisible until something breaks in production. This pipeline makes them visible early. The AI probes the spec one question at a time; you answer each one, and the result is a refined, test-ready specification before a single line of test code is written.

That refined spec then drives a structured test plan — coverage intentionally spread across unit, integration, and end-to-end layers — and from there, the test generator scaffolds real Playwright and Jest files and runs them. You're not starting from a blank file; you're reviewing working code that came from a clear spec that came from a clarified ticket.
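The 70/20/10 split is a target rather than an exact count, but the planner's allocation amounts to simple arithmetic: take the candidate test cases from the refined spec and distribute them across layers. A small sketch (the rounding scheme here is my assumption):

```python
# Target coverage ratios from the pipeline's test-planning stage.
RATIOS = {"unit": 0.7, "integration": 0.2, "e2e": 0.1}

def allocate(n_cases: int) -> dict[str, int]:
    """Split n_cases across test layers, preserving the total."""
    plan = {layer: int(n_cases * r) for layer, r in RATIOS.items()}
    plan["unit"] += n_cases - sum(plan.values())  # rounding remainder goes to unit
    return plan
```

For 20 candidate cases this yields roughly 14 unit, 4 integration and 2 E2E tests — heaviest where tests are fastest and cheapest to run.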

The pipeline is reproducible, resumable and cheap — proof that well-scoped agents with clear handoffs outperform one monolithic model trying to do everything at once.


What's Next

This was a demo, but the patterns are real and the problem space is large. Vague acceptance criteria, missing edge cases, slow test cycles and late-stage defects are problems every QA team deals with daily. Agentic pipelines like this one are a practical way to push that quality work earlier in the process — at the spec stage, not the bug-report stage.

I want to take these patterns further. A few areas I'm interested in exploring next: using agents to detect when existing tests drift from the acceptance criteria they were written against, flagging untested code paths from production logs and building smarter triage tools that categorise failing tests automatically after a deployment.

The broader goal is the same throughout — use AI to do the structured, repeatable work, keep the engineer in control of the judgement calls and ship with more confidence.

Built with  Claude Code  ·  Subagents  ·  Skills  ·  Atlassian Rovo MCP  ·  Playwright  ·  Jest