@MichaelRFairhurst
Collaborator

@MichaelRFairhurst MichaelRFairhurst commented Aug 23, 2025

Description

Implement naming package.

Change request type

  • Release or process automation (GitHub workflows, internal scripts)
  • Internal documentation
  • External documentation
  • Query files (.ql, .qll, .qls or unit tests)
  • External scripts (analysis report or other code shipped as part of a release)

Rules with added or modified queries

  • No rules added
  • Queries have been added for the following rules:
    • RULE 5-10-1
  • Queries have been modified for the following rules:
    • Rules that use macro arguments now handle variadic parameters correctly (the ellipsis and empty parameter names are no longer treated as parameter names)

Release change checklist

A change note (development_handbook.md#change-notes) is required for any pull request which modifies:

  • The structure or layout of the release artifacts.
  • The evaluation performance (memory, execution time) of an existing query.
  • The results of an existing query in any circumstance.

If you are only adding new rule queries, a change note is not required.

Author: Is a change note required?

  • Yes
  • No

🚨🚨🚨
Reviewer: Confirm that the format of shared queries (not the .qll file, the
.ql file that imports it) is valid by running them within VS Code.

  • Confirmed

Reviewer: Confirm that either a change note is not required or the change note is required and has been added.

  • Confirmed

Query development review checklist

For PRs that add new queries or modify existing queries, the following checklist should be completed by both the author and reviewer:

Author

  • Have all the relevant rule package description files been checked in?
  • Have you verified that the metadata properties of each new query are set appropriately?
  • Do all the unit tests contain both "COMPLIANT" and "NON_COMPLIANT" cases?
  • Are the alert messages properly formatted and consistent with the style guide?
  • Have you run the queries on OpenPilot and verified that the performance and results are acceptable?
    As a rule of thumb, predicates specific to the query should take no more than 1 minute, and for simple queries be under 10 seconds. If this is not the case, this should be highlighted and agreed in the code review process.
  • Does the query have an appropriate level of in-query comments/documentation?
  • Have you considered/identified possible edge cases?
  • Does the query not reinvent features in the standard library?
  • Can the query be simplified further (not golfed!)?

Reviewer

  • Have all the relevant rule package description files been checked in?
  • Have you verified that the metadata properties of each new query are set appropriately?
  • Do all the unit tests contain both "COMPLIANT" and "NON_COMPLIANT" cases?
  • Are the alert messages properly formatted and consistent with the style guide?
  • Have you run the queries on OpenPilot and verified that the performance and results are acceptable?
    As a rule of thumb, predicates specific to the query should take no more than 1 minute, and for simple queries be under 10 seconds. If this is not the case, this should be highlighted and agreed in the code review process.
  • Does the query have an appropriate level of in-query comments/documentation?
  • Have you considered/identified possible edge cases?
  • Does the query not reinvent features in the standard library?
  • Can the query be simplified further (not golfed!)?

@MichaelRFairhurst
Collaborator Author

Note that the Unicode data came from advanced-security/codeql-qtil#13

I should definitely finish Unicode support in qtil, publish it, and then use that here. That should probably be done before merge, but it is not strictly necessary.

@MichaelRFairhurst
Collaborator Author

Relevant qtil pull request: advanced-security/codeql-qtil#13

@mbaluda mbaluda requested review from mbaluda and removed request for lcartey December 11, 2025 19:02
Copilot AI review requested due to automatic review settings December 11, 2025 19:03
Contributor

Copilot AI left a comment

Pull request overview

This PR implements a comprehensive naming validation package for MISRA C++ RULE-5-10-1, which enforces proper identifier formation in C++ code. The implementation introduces a sophisticated identifier tracking system that validates identifiers against multiple constraints including Unicode normalization, reserved names, namespace restrictions, and macro naming conventions.

Key changes:

  • Introduces the IdentifierIntroduction abstraction that systematically captures all identifier declarations across various C++ constructs (variables, functions, types, macros, namespaces, templates, etc.)
  • Implements Unicode support with UAX#44 compliance checking and NFC normalization validation using extensible predicates with external YAML data
  • Adds MISRA C++ RULE-5-10-1 query to detect poorly formed identifiers including underscore violations, lowercase in macros, reserved names, and reserved namespace usage
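To illustrate why the NFC normalization check listed above matters, here is a minimal sketch that is not taken from the PR. It shows that the same visible character can be spelled with two different code point sequences, which compare unequal byte for byte. The names `kPrecomposed`, `kDecomposed`, and `same_bytes` are assumptions made for this illustration; a real normalization check needs Unicode data such as the `NFC_QC` property.

```cpp
#include <cassert>
#include <string>

// Hypothetical illustration: two UTF-8 spellings of the letter "é".
// Precomposed form: the single code point U+00E9 (already in NFC).
const std::string kPrecomposed = "\xC3\xA9";
// Decomposed form: 'e' followed by U+0301 COMBINING ACUTE ACCENT (not NFC).
const std::string kDecomposed = "e\xCC\x81";

// Plain byte-wise comparison with no normalization applied.
bool same_bytes(const std::string& a, const std::string& b) {
    return a == b;
}

// Self-check: the two spellings have different lengths and different bytes.
void check_spellings_differ() {
    assert(kPrecomposed.size() == 2);
    assert(kDecomposed.size() == 3);
    assert(!same_bytes(kPrecomposed, kDecomposed));
}
```

Two identifiers that differ only in this way would look identical in an editor while being distinct strings, which is the kind of confusion a normalization requirement is meant to rule out.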

Reviewed changes

Copilot reviewed 17 out of 18 changed files in this pull request and generated 11 comments.

Show a summary per file
| File | Description |
| --- | --- |
| cpp/common/src/codingstandards/cpp/Identifiers.qll | Introduces comprehensive IdentifierIntroduction class hierarchy that systematically tracks all identifier declarations across various C++ constructs |
| cpp/common/src/codingstandards/cpp/Unicode.qll | Implements Unicode property checking (NFC_QC, XID_Start, XID_Continue) and unicode escape sequence handling for identifier validation |
| cpp/common/src/codingstandards/cpp/Macro.qll | Fixes variadic macro parameter extraction to properly exclude ellipsis and empty parameter names |
| cpp/misra/src/rules/RULE-5-10-1/PoorlyFormedIdentifier.ql | Implements the main query that validates identifiers against MISRA C++ RULE-5-10-1 constraints |
| cpp/common/src/codingstandards/cpp/exclusions/cpp/Naming2.qll | Autogenerated metadata for Naming2 package query registration |
| cpp/common/src/codingstandards/cpp/exclusions/cpp/RuleMetadata.qll | Registers Naming2 package in the rule metadata system |
| rule_packages/cpp/Naming2.json | Defines query metadata for RULE-5-10-1 including severity, precision, and tags |
| cpp/misra/test/rules/RULE-5-10-1/test.cpp | Comprehensive test file with 189 lines covering Unicode, normalization, underscores, macros, namespaces, and reserved names |
| cpp/misra/test/rules/RULE-5-10-1/PoorlyFormedIdentifier.expected | Expected query results showing 48 violations across various identifier validation rules |
| cpp/misra/test/rules/RULE-5-10-1/PoorlyFormedIdentifier.qlref | Query reference file for test execution |
| cpp/common/test/library/codingstandards/cpp/identifiers/* | Library test suite with 666 lines testing identifier extraction across all C++ constructs |
| cpp/common/test/includes/standard-library/utility.h | Adds pair and tuple support for structured binding tests |
| cpp/common/src/qlpack.yml | Registers unicode.yml data extension |
| change_notes/2025-08-22-function-like-macro-param-name-bug-fixes.md | Documents bug fixes in function-like macro parameter handling |
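The Macro.qll entry describes excluding the ellipsis and empty names when extracting macro parameters. The sketch below shows that idea in C++ under stated assumptions: `macro_param_names` is a hypothetical helper invented for this illustration, it takes the text between the parentheses of a function-like macro definition, and it does not handle the GNU named-variadic form (`args...`). It is not the logic used in Macro.qll.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: split a macro parameter list into named parameters,
// skipping the variadic ellipsis and any empty entries.
std::vector<std::string> macro_param_names(const std::string& param_list) {
    std::vector<std::string> names;
    std::string current;
    auto flush = [&]() {
        // Trim surrounding spaces and tabs from the current entry.
        const auto first = current.find_first_not_of(" \t");
        const auto last = current.find_last_not_of(" \t");
        if (first != std::string::npos) {
            const std::string name = current.substr(first, last - first + 1);
            if (name != "...") {
                names.push_back(name);  // keep only real parameter names
            }
        }
        current.clear();
    };
    for (char c : param_list) {
        if (c == ',') {
            flush();
        } else {
            current.push_back(c);
        }
    }
    flush();  // handle the final entry
    return names;
}

// Self-check: `X, ...` has exactly one named parameter.
void check_variadic_extraction() {
    assert(macro_param_names("X, ...").size() == 1);
}
```

For `#define LOG(FMT, ...)`, the parameter list `FMT, ...` yields only `FMT`, and an empty list `()` yields no names at all.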

}

/**
* An identifier introduced as a template function name or as a parameter of a function-like macro.

Copilot AI Dec 11, 2025

The class documentation incorrectly describes this as "An identifier introduced as a template function name or as a parameter of a function-like macro." However, the implementation shows this class handles Macro identifiers (macro names and their parameters), not template functions. The documentation should be corrected to accurately describe that this class handles identifiers introduced by macros (both the macro name itself and any parameters of function-like macros).

Suggested change
- * An identifier introduced as a template function name or as a parameter of a function-like macro.
+ * An identifier introduced by a macro, including both the macro name itself and any parameters of function-like macros.

Copilot uses AI. Check for mistakes.
Comment on lines +97 to +100
exists(Function func | func = intro.getElement().(FunctionDeclarationEntry).getFunction() |
isUserDefinedLiteralSuffixNonCompliant(func) and
message = "User-defined literal suffix '" + ident + "' is malformed."
)

Copilot AI Dec 11, 2025

This condition appears unreachable. The query checks if the element is a FunctionDeclarationEntry with a Function that has a malformed user-defined literal suffix, and then tries to use 'ident' in the message. However, for user-defined literal suffixes, the identifier extracted on line 53 via 'intro.unescapeUnicode()' will be the suffix without the 'operator ""' prefix (e.g., '_foo'), not the full function name. This means this branch would never match the conditions in 'isUserDefinedLiteralSuffixNonCompliant' which checks for patterns in the full function name like 'operator""%'. This clause should either be removed as unreachable or the logic should be corrected to properly handle this case.

Suggested change
- exists(Function func | func = intro.getElement().(FunctionDeclarationEntry).getFunction() |
- isUserDefinedLiteralSuffixNonCompliant(func) and
- message = "User-defined literal suffix '" + ident + "' is malformed."
- )

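For readers unfamiliar with the distinction the comment above draws between the literal suffix and the full function name, here is a small C++ sketch. The `_km` suffix and the conversion it performs are assumptions chosen for illustration and do not come from the PR's test file.

```cpp
#include <cassert>

// Hypothetical user-defined literal. The function's full name is
// `operator""_km`, but the identifier the programmer chose is only
// the suffix `_km`.
constexpr long double operator""_km(long double kilometres) {
    return kilometres * 1000.0L;  // convert kilometres to metres
}

// Self-check: the literal is written with the suffix alone.
void check_km_literal() {
    assert(1.5_km == 1500.0L);
}
```

A check that looks at the suffix sees `_km`, while a check that looks at the function name sees `operator""_km`; the review comment is about which of the two strings each predicate receives.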
Comment on lines +3 to +13
/**
* Provides properties of a Unicode code point, where the property is of 'enumeration', 'catalog',
* or 'string-valued' type, however, the only supported property is `NFC_QC`.
*
* For example, `Block` is an enumeration property, `Line_Break` is a catalog property, and
* `Uppercase_Mapping` is a string-valued property.
*
* For boolean properties, see `unicodeHasBooleanProperty`, and for numeric properties, see
* `unicodeHasNumericProperty`.
*/
extensible predicate unicodeHasProperty(int codePoint, string propertyName, string propertyValue);

Copilot AI Dec 11, 2025

The documentation states that this predicate provides properties of type 'enumeration', 'catalog', or 'string-valued', but then says "however, the only supported property is NFC_QC". This is confusing because it first suggests broad support and then limits it. Consider rephrasing to be more direct, such as: "Provides the NFC_QC property value for a Unicode code point. This is the only Unicode property currently supported."

* This has to be treated specially. The member predicate `getName()` on a `FriendDecl` returns the
* string "foo's friend", which is not an identifier in the program.
*
* The elements returned by the `getFriend()` member predicate often do not have a correspending

Copilot AI Dec 11, 2025

Typo: 'correspending' should be 'corresponding'.

/**
* @id cpp/misra/poorly-formed-identifier
* @name RULE-5-10-1: User-defined identifiers shall have an appropriate form
* @description Identifiers shall not conflict with keywords, reserved name, or otherwise be poorly

Copilot AI Dec 11, 2025

Grammar issue: 'reserved name' should be plural 'reserved names' to match the pattern of listing multiple items.

Suggested change
- * @description Identifiers shall not conflict with keywords, reserved name, or otherwise be poorly
+ * @description Identifiers shall not conflict with keywords, reserved names, or otherwise be poorly

}

bindingset[s]
predicate hasDoubleUnderscore(string s) { s.matches("%\\_\\_%") }

Copilot AI Dec 11, 2025

The pattern used in this predicate relies on backslash escaping that may be unclear. The pattern "%\\_\\_%" matches two consecutive underscores anywhere in the string. Consider using a more readable pattern like ".*__.*", or adding a comment explaining that this matches identifiers containing double underscores anywhere in the string.

Suggested change
- predicate hasDoubleUnderscore(string s) { s.matches("%\\_\\_%") }
+ /** Matches identifiers containing two consecutive underscores anywhere in the string. */
+ predicate hasDoubleUnderscore(string s) { s.regexpMatch(".*__.*") }

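Some background on the two patterns in this thread: QL's `matches` takes a LIKE-style pattern in which `%` matches any sequence and `_` matches any single character, so literal underscores have to be escaped, whereas `regexpMatch` takes a regular expression that must match the whole string. The C++ sketch below expresses the same "contains two consecutive underscores" test in two ways; the function names are assumptions for illustration only.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Substring search: a direct way to ask whether `s` contains "__".
bool has_double_underscore(const std::string& s) {
    return s.find("__") != std::string::npos;
}

// Regex form, mirroring the `.*__.*` pattern from the suggested change.
// std::regex_match requires the whole string to match the pattern.
bool has_double_underscore_regex(const std::string& s) {
    static const std::regex pattern(".*__.*");
    return std::regex_match(s, pattern);
}

// Self-check: both forms agree on a simple positive and negative case.
void check_double_underscore() {
    assert(has_double_underscore("foo__bar"));
    assert(has_double_underscore_regex("foo__bar"));
    assert(!has_double_underscore("foo_bar"));
    assert(!has_double_underscore_regex("foo_bar"));
}
```

Both forms accept `foo__bar` and reject `foo_bar`, so the choice between them is about readability.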
or
intro.isFromMacro() and
not ident.regexpMatch("^[a-zA-Z0-9_]+$") and
message = "Identifier '" + ident + "' contains invalid characters. "

Copilot AI Dec 11, 2025

The error message ends with an unnecessary trailing space: "contains invalid characters. " (note the space before the closing quote). This should be removed for consistency with other error messages in this query.

Suggested change
- message = "Identifier '" + ident + "' contains invalid characters. "
+ message = "Identifier '" + ident + "' contains invalid characters."

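The clause under review rejects macro-derived identifiers that fall outside `^[a-zA-Z0-9_]+$`. As a rough C++ analogue, assuming a hypothetical helper name `has_only_basic_identifier_chars`, the same character-class test looks like this:

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if `ident` is non-empty and consists only of
// ASCII letters, digits, and underscores.
bool has_only_basic_identifier_chars(const std::string& ident) {
    static const std::regex pattern("^[a-zA-Z0-9_]+$");
    return std::regex_match(ident, pattern);
}

// Self-check using a name in the style of the test file's MACRO$DOLLAR case.
void check_identifier_chars() {
    assert(has_only_basic_identifier_chars("MACRO_OK"));
    assert(!has_only_basic_identifier_chars("MACRO$DOLLAR"));
}
```

This sketch only covers the character-class test; it says nothing about leading digits, reserved names, or case, which the query handles in separate clauses.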
#define macro_ALL_CAPS 49 // NON_COMPLIANT - starts with lowercase
#define MACRO$DOLLAR 54 // NON_COMPLIANT - contains dollar sign
#define FUNCTION_LIKE_MACRO(x) \
((x) + 1) // NON_COMPLIANT - lower case argument name

Copilot AI Dec 11, 2025

Inconsistent comment: the comment says "NON_COMPLIANT - lower case argument name" but the actual issue is that the macro parameter 'x' violates the rule requiring macros to use only uppercase characters. The comment should more accurately describe the violation.

Suggested change
- ((x) + 1) // NON_COMPLIANT - lower case argument name
+ ((x) + 1) // NON_COMPLIANT - macro parameter 'x' is not uppercase

#define FUNCTION_LIKE_MACRO(x) \
((x) + 1) // NON_COMPLIANT - lower case argument name
#define FUNCTION_LIKE_MACRO2(X) \
((X) + 1) // NON_COMPLIANT - lower case argument name

Copilot AI Dec 11, 2025

Inconsistent comment: the comment says "NON_COMPLIANT - lower case argument name" but line 102 shows the parameter is 'X' which is uppercase. Based on the expected results, this line is actually compliant (no error is expected for it). The comment should be corrected or removed.

Suggested change
- ((X) + 1) // NON_COMPLIANT - lower case argument name
+ ((X) + 1) // COMPLIANT

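The two macro comments above turn on whether a parameter name contains lowercase letters. A minimal sketch of that test in C++ follows; `contains_lowercase` is a hypothetical helper for illustration, not a predicate from the query, and it considers ASCII letters only.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: true if the identifier contains any ASCII lowercase
// letter (in the default "C" locale, std::islower accepts only a-z).
bool contains_lowercase(const std::string& ident) {
    for (unsigned char c : ident) {
        if (std::islower(c)) {
            return true;
        }
    }
    return false;
}

// Self-check using the parameter names from the two macros under review.
void check_macro_parameter_names() {
    assert(contains_lowercase("x"));   // parameter of FUNCTION_LIKE_MACRO
    assert(!contains_lowercase("X"));  // parameter of FUNCTION_LIKE_MACRO2
}
```

Under this test `x` is flagged and `X` is not, which matches the corrected annotations: the first macro's parameter is non-compliant and the second's is compliant.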
d instanceof ClassTemplateSpecialization
}

private newtype TIndentifierIntroduction =

Copilot AI Dec 11, 2025

Typo in the type name: 'TIndentifierIntroduction' should be 'TIdentifierIntroduction' (there is an extra 'n' after the 'I').

3 participants