CyDRA: Diagnosis-Guided Zero-Shot Prompting for Text-to-Cypher Generation

June 2026

ICML 2026 Workshop - Failure Modes in Agentic AI

Abstract

Execution accuracy alone is a terminal-score metric: it reveals that an LLM-generated Cypher query is wrong, but not why or in what systematic way. We present CyDRA, a Cypher Diagnostic Rule Alignment framework that operationalizes Text-to-Cypher failure as a taxonomy of 11 structurally grounded error categories covering graph pattern construction, aggregation, deduplication, and return projection, each with deterministic triggering conditions derived from AST-lite comparison against gold queries. This taxonomy serves simultaneously as a trace-level diagnostic and as the input to a mitigation strategy: CyDRA distills offline failure signal into a frozen natural-language prompt policy that requires no retrieved examples. On the CypherBench benchmark across seven held-out graph domains, CyDRA improves execution accuracy by 8.92 percentage points on average over a zero-shot baseline across five LLMs, whereas a documentation-only policy lacking diagnostic evidence yields no consistent gain, offering controlled evidence of both effective and ineffective intervention directions. We offer this framework as a case study in how failure taxonomies with operational definitions can close the loop from trace-level diagnostics to verifiable prompt interventions.
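
To make the idea of a deterministic triggering condition concrete, the following is a minimal sketch of how a predicted query might be compared against a gold query to emit error labels. It is not CyDRA's implementation: the regex-based feature extraction stands in for the AST-lite comparison, and the function names (`features`, `diagnose`) and the three label strings are hypothetical, chosen only for illustration. The abstract does not list the names of the 11 categories.

```python
import re

# Aggregation functions to look for in the RETURN clause (illustrative subset).
AGG_FUNCS = ("count", "sum", "avg", "min", "max", "collect")


def features(cypher: str) -> dict:
    """Extract a few coarse structural features from a Cypher query string.

    This is a regex approximation, assumed here in place of a real parser.
    """
    text = cypher.strip()
    # Treat everything after the last RETURN keyword as the projection clause.
    parts = re.split(r"\bRETURN\b", text, flags=re.IGNORECASE)
    ret = parts[-1] if len(parts) > 1 else ""
    # Drop trailing ORDER BY / SKIP / LIMIT so they do not affect the features.
    ret = re.split(r"\b(?:ORDER\s+BY|SKIP|LIMIT)\b", ret, flags=re.IGNORECASE)[0]
    return {
        # True only when DISTINCT directly follows RETURN.
        "distinct": bool(re.match(r"\s*DISTINCT\b", ret, flags=re.IGNORECASE)),
        "aggregates": sorted(
            f for f in AGG_FUNCS
            if re.search(rf"\b{f}\s*\(", ret, flags=re.IGNORECASE)
        ),
        # Relationship types such as [:ACTED_IN] or [r:ACTED_IN].
        "rel_types": sorted(set(re.findall(r"\[\s*\w*\s*:\s*(\w+)", text))),
    }


def diagnose(pred: str, gold: str) -> list[str]:
    """Return hypothetical error labels whose conditions fire for pred vs. gold."""
    p, g = features(pred), features(gold)
    labels = []
    if g["distinct"] and not p["distinct"]:
        labels.append("missing_deduplication")
    if g["aggregates"] != p["aggregates"]:
        labels.append("aggregation_mismatch")
    if g["rel_types"] != p["rel_types"]:
        labels.append("pattern_mismatch")
    return labels


gold = "MATCH (p:Person)-[:ACTED_IN]->(m:Movie) RETURN DISTINCT m.title"
pred = "MATCH (p:Person)-[:ACTED_IN]->(m:Movie) RETURN m.title"
print(diagnose(pred, gold))
```

Because each rule is a pure function of the two query strings, the same pair always yields the same labels, which is the property that lets such diagnostics be aggregated offline across many failures.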

Type

Conference paper

Publication

ICML 2026 Workshop - Failure Modes in Agentic AI