Add detailed diagnostics to SIARD XML validation errors #756

Draft
Copilot wants to merge 7 commits into development from copilot/improve-validation-diagnostics

Copilot AI commented Feb 5, 2026

SIARD XML validation failures previously reported only the generic SAXException message, with no location or context information, which made debugging difficult.

Changes

New validation error infrastructure:

  • SiardValidationErrorHandler - Custom SAX ErrorHandler capturing errors with line/column numbers and error types
  • SiardValidationErrorFormatter - Extracts SIARD database context (schema/table/view) by parsing the XML up to the error location with a SAX Locator; reformats common XSD validation patterns into actionable messages
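
For readers unfamiliar with SAX error handling, the following is a minimal sketch of what a collecting ErrorHandler can look like. It is not the code from this PR: the class name CollectingErrorHandler, the Severity enum, and the Entry type are hypothetical and only illustrate capturing the error type together with line and column numbers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical sketch; SiardValidationErrorHandler in the PR may be structured differently.
class CollectingErrorHandler implements ErrorHandler {

  enum Severity { WARNING, ERROR, FATAL }

  // One captured problem: severity, position, and the parser's raw message.
  static final class Entry {
    final Severity severity;
    final int line;
    final int column;
    final String message;

    Entry(Severity severity, SAXParseException e) {
      this.severity = severity;
      this.line = e.getLineNumber();     // -1 when the parser has no location
      this.column = e.getColumnNumber(); // -1 when the parser has no location
      this.message = e.getMessage();
    }
  }

  private final List<Entry> entries = new ArrayList<>();

  // Record every callback instead of throwing, so validation can keep going
  // and later problems are collected as well.
  @Override public void warning(SAXParseException e) { entries.add(new Entry(Severity.WARNING, e)); }
  @Override public void error(SAXParseException e) { entries.add(new Entry(Severity.ERROR, e)); }
  @Override public void fatalError(SAXParseException e) { entries.add(new Entry(Severity.FATAL, e)); }

  List<Entry> entries() { return Collections.unmodifiableList(entries); }

  // Warnings alone do not make a document invalid.
  boolean hasErrors() {
    for (Entry entry : entries) {
      if (entry.severity != Severity.WARNING) return true;
    }
    return false;
  }
}
```

Not rethrowing from error() is what allows more than one validation error to be collected in a single pass; a parser may still abort on its own after a fatal error.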

Updated validators:

  • Modified MetadataXMLAgainstXSDValidator (SIARD 2.1 and 2.2) to use custom ErrorHandler
  • Reports up to 5 errors with full context and summarizes the rest
  • Backward-compatible fallback to the original exception message
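
As an illustration of the wiring described above, here is a hedged sketch that attaches an ErrorHandler to a javax.xml.validation.Validator, caps the report at five entries, and falls back to the exception message when nothing was collected. The class XsdValidationSketch, its validate and describe methods, and the wording of the summary line are assumptions made for this example; the real validators read from the SIARD package and report through setError().

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical sketch, not the validator code from the PR.
class XsdValidationSketch {
  static final int MAX_REPORTED_ERRORS = 5;

  /** Returns null when the document is valid, otherwise a combined report. */
  static String validate(String xsd, String xml) {
    final List<String> problems = new ArrayList<>();
    try {
      SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
      Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
      Validator validator = schema.newValidator();
      validator.setErrorHandler(new ErrorHandler() {
        @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
        @Override public void error(SAXParseException e) { problems.add(describe("ERROR", e)); }
        @Override public void fatalError(SAXParseException e) throws SAXException {
          problems.add(describe("FATAL", e));
          throw e; // the document is not well-formed, so stop here
        }
      });
      validator.validate(new StreamSource(new StringReader(xml)));
    } catch (SAXException | IOException e) {
      // Fallback: keep the original exception message if nothing was collected.
      if (problems.isEmpty()) problems.add(e.getMessage());
    }
    if (problems.isEmpty()) return null;

    StringBuilder report = new StringBuilder();
    int shown = Math.min(problems.size(), MAX_REPORTED_ERRORS);
    for (int i = 0; i < shown; i++) {
      if (i > 0) report.append('\n');
      report.append(problems.get(i));
    }
    if (problems.size() > shown) {
      report.append("\n... and ").append(problems.size() - shown).append(" more error(s)");
    }
    return report.toString();
  }

  private static String describe(String level, SAXParseException e) {
    return "[" + level + "] " + e.getMessage()
        + " at line " + e.getLineNumber() + ", column " + e.getColumnNumber();
  }
}
```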

Error message transformation

Before:

cvc-complex-type.2.4.a: Invalid content was found starting with element 'view'.

After:

[ERROR] Invalid or unexpected element: 'view' at line 30, column 10 (in schema: 'mySchema', table: 'myTable')

Pattern matching reformats:

  • Cannot find the declaration of element 'X' → Missing or undeclared element: 'X'
  • Expected elements 'A B C' → Missing required element(s): A B C
  • Invalid content...element 'X' → Invalid or unexpected element: 'X'
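
The three rewrites listed above could be expressed with regular expressions roughly as follows. This is a sketch under assumptions: the class MessageReformatSketch and the exact patterns are hypothetical and were written from the descriptions in this comment, not copied from SiardValidationErrorFormatter.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the pattern-based rewriting; patterns are assumptions.
class MessageReformatSketch {
  private static final Pattern UNDECLARED =
      Pattern.compile("Cannot find the declaration of element '([^']+)'");
  private static final Pattern EXPECTED =
      Pattern.compile("Expected elements '([^']+)'");
  private static final Pattern INVALID =
      Pattern.compile("Invalid content.*?element '([^']+)'");

  /** Rewrites a raw validator message, or returns it unchanged if no pattern applies. */
  static String reformat(String raw) {
    if (raw == null) return null;
    Matcher m = UNDECLARED.matcher(raw);
    if (m.find()) return "Missing or undeclared element: '" + m.group(1) + "'";
    m = EXPECTED.matcher(raw);
    if (m.find()) return "Missing required element(s): " + m.group(1);
    m = INVALID.matcher(raw);
    if (m.find()) return "Invalid or unexpected element: '" + m.group(1) + "'";
    return raw; // unknown pattern: keep the original text rather than lose detail
  }
}
```

Returning the raw message when nothing matches keeps unfamiliar validator output visible instead of hiding it behind a generic label.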

Implementation notes

  • Context extraction uses SAX parser with early termination at error line
  • Debug logging on extraction failures
  • Named constant MAX_REPORTED_ERRORS = 5 to prevent excessive message length
  • No changes to core validation logic
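
To make the early-termination idea concrete, the sketch below re-parses the metadata with a SAX handler, remembers the most recent schema, table, and view names, and aborts once the Locator moves past the error line. Everything here is illustrative: the class ContextExtractionSketch and its describe method are hypothetical, and the sketch assumes unprefixed element names with a name child directly under schema, table, and view, as in SIARD metadata.xml.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch of context extraction; not the formatter code from the PR.
class ContextExtractionSketch extends DefaultHandler {
  private final int errorLine;
  private final Deque<String> path = new ArrayDeque<>();
  private final StringBuilder text = new StringBuilder();
  private Locator locator;
  private boolean stopped;
  private String schema, table, view;

  ContextExtractionSketch(int errorLine) { this.errorLine = errorLine; }

  @Override public void setDocumentLocator(Locator locator) { this.locator = locator; }

  // Abort the parse as soon as an event lies beyond the error line.
  private void stopIfPastErrorLine() throws SAXException {
    if (locator != null && locator.getLineNumber() > errorLine) {
      stopped = true;
      throw new SAXException("reached error line");
    }
  }

  @Override
  public void startElement(String uri, String localName, String qName, Attributes atts)
      throws SAXException {
    stopIfPastErrorLine();
    path.push(qName);
    text.setLength(0);
  }

  @Override public void characters(char[] ch, int start, int length) {
    text.append(ch, start, length);
  }

  @Override
  public void endElement(String uri, String localName, String qName) throws SAXException {
    stopIfPastErrorLine();
    path.pop();
    String parent = path.peek(); // null at the document root
    if ("name".equals(qName)) {
      String value = text.toString().trim();
      if ("schema".equals(parent)) schema = value;
      else if ("table".equals(parent)) table = value;
      else if ("view".equals(parent)) view = value;
    } else if ("schema".equals(qName)) schema = null; // context ends with its element
    else if ("table".equals(qName)) table = null;
    else if ("view".equals(qName)) view = null;
  }

  /** Returns e.g. "schema: 'a', table: 'b'", or "" if no context was found. */
  static String describe(String xml, int errorLine) {
    ContextExtractionSketch h = new ContextExtractionSketch(errorLine);
    try {
      SAXParserFactory.newInstance().newSAXParser()
          .parse(new InputSource(new StringReader(xml)), h);
    } catch (ParserConfigurationException | SAXException | IOException e) {
      if (!h.stopped) return ""; // extraction failed; real code might log at debug level
    }
    StringBuilder out = new StringBuilder();
    append(out, "schema", h.schema);
    append(out, "table", h.table);
    append(out, "view", h.view);
    return out.toString();
  }

  private static void append(StringBuilder out, String label, String value) {
    if (value == null) return;
    if (out.length() > 0) out.append(", ");
    out.append(label).append(": '").append(value).append('\'');
  }
}
```

Checking the line in both startElement and endElement keeps a closing tag after the error line from clearing the context that was active at the error itself.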

Original prompt

Improve validation diagnostics in MetadataXMLAgainstXSDValidator's validateXMLAgainstXSD() for SIARD XML validation. Currently, only generic error messages are returned (from SAXException). Enhancement requirements:

  • Add a custom ErrorHandler to the Validator to catch warnings, errors, and fatal errors, reporting line and column information, the specific error message, and file context
  • Format error messages to be more actionable, especially for missing elements (e.g., "Missing required element: 'view' at line 30, column 10")
  • For SIARD metadata XML, attempt to extract database context (schema/table) when reporting validation errors, by parsing the XML up to the error location

Update the code so that validation errors reported via setError() include as much context as possible, e.g. error type, line, column, element, schema, and table if detectable, instead of just exception.getMessage(). Add utility methods as needed for error extraction and message formatting.

This pull request was created from Copilot chat.


Copilot AI changed the title from "[WIP] Improve validation diagnostics in MetadataXMLAgainstXSDValidator" to "Add detailed diagnostics to SIARD XML validation errors" on Feb 5, 2026
Copilot AI requested a review from hmiguim on February 5, 2026 12:35