String scalar in output has no pattern defined
Issue ID: graphql-data-output-string-scalar-pattern-needed
Average severity: Critical
Description
A string scalar used in an output position has no pattern defined. While GraphQL guarantees that a value is a string, it does not define any structural or semantic format.
For more details, see the GraphQL constraints specification.
Possible exploit scenario
Without a pattern or equivalent constraint, the API contract does not explicitly specify the expected format of returned values.
String outputs frequently represent structured data, such as email addresses, usernames, or URLs. If no pattern constraint is defined, the API may return:
- Unexpected formats
- Internal system values
- Corrupted or injected content
- Values leaking implementation details
- Data inconsistent with documented behavior
Attackers try to make your APIs behave in unexpected ways to learn more about your system or to cause a data breach. Well-defined data in the schemas used for API responses makes it possible to reliably validate that the contents of outgoing responses are as expected.
While filtering API responses does not block a specific kind of attack, it serves as a damage control mechanism if an attack succeeds: it allows blocking the response and prevents attackers from retrieving data they should not access.
In the vast majority of cases, with the notable exception of denial-of-service (DoS and DDoS) attacks, attacks are conducted because attackers want to access data or resources they should not have access to. Often, this means that the structure or the size of the API response changes as a result of a successful attack, compared to a normal API response.
Validating that API responses are as expected can be achieved through proper schema validation of the API responses. The accuracy of this depends on the quality of the response schemas: the better defined your schemas are, the easier it is to detect when something is not right.
In security-sensitive environments, clearly defined output formats serve two important purposes:
- Contract enforcement: Consumers can rely on predictable formats.
- Damage containment: If a backend component is compromised or misconfigured, schema-level constraints help detect abnormal or malicious output before it propagates.
If a backend service is compromised or incorrectly implemented, it may return:
- Internal file paths
- Stack trace fragments
- Raw database keys
- Malformed identifiers
For example, if a field is expected to return a UUID but instead returns a database primary key, stack trace fragment, or injected string, a pattern constraint would make this deviation detectable. Without a defined output pattern constraint:
- The deviation may go unnoticed
- Monitoring systems may fail to detect anomalies
- Downstream systems may process unexpected data
- Sensitive information may be exposed
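
The UUID example above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the names `UUID_PATTERN`, `is_valid_uuid`, and `filter_response` are hypothetical, and the pattern assumes the common 8-4-4-4-12 hexadecimal UUID layout.

```python
import re

# Assumed UUID layout (8-4-4-4-12 hexadecimal digits); adjust to your own contract.
UUID_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def is_valid_uuid(value: str) -> bool:
    """Return True only if the whole string is a UUID."""
    # fullmatch requires the pattern to cover the entire string.
    return UUID_PATTERN.fullmatch(value) is not None


def filter_response(data: dict, field: str) -> dict:
    """Return the data unchanged, or an error object if the field deviates."""
    value = data.get(field)
    if not isinstance(value, str) or not is_valid_uuid(value):
        # Do not echo the offending value back to the client.
        return {"errors": [{"message": f"Field '{field}' failed output validation"}]}
    return data
```

With this check in place, a raw database key such as `"42"` or a stack trace fragment in the `id` field produces an error object instead of reaching the consumer.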
In federated GraphQL environments, lack of format constraints may also introduce inconsistencies across subgraphs, increasing the risk of cross-service ambiguity and data leakage.
Because structured output values often represent security-relevant identifiers or sensitive metadata, missing format constraints significantly weaken the API contract and are therefore a critical issue.
Remediation
Define a strict pattern constraint for structured string scalars in output. This ensures that only strings matching the set pattern are exposed through your API. We recommend that you:
- Define patterns for all structured string types
- Prefer whitelist-style regular expressions (allowed characters) over blacklist rules
- Keep patterns aligned with business requirements
- Avoid overly permissive patterns such as `.*`
Explicit structural constraints strengthen API contract integrity, improve anomaly detection, and reduce unintended data exposure.
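
To illustrate why an allowlist-style pattern is preferable to a permissive one, the following Python sketch compares the two on a few sample values. The username contract (3 to 32 lowercase letters, digits, or underscores) is an assumption made up for this example, as are the names `ALLOWLIST`, `PERMISSIVE`, and `accepted`.

```python
import re

# Hypothetical username contract: 3 to 32 lowercase letters, digits, or underscores.
ALLOWLIST = re.compile(r"[a-z0-9_]{3,32}")
# Overly permissive: any run of non-newline characters, including the empty string.
PERMISSIVE = re.compile(r".*")

samples = [
    "alice_01",
    "/etc/passwd",
    "<script>alert(1)</script>",
    "",
]


def accepted(pattern: re.Pattern, value: str) -> bool:
    """Return True if the whole value matches the pattern."""
    return pattern.fullmatch(value) is not None


# Map each sample to (accepted by allowlist, accepted by permissive pattern).
results = {value: (accepted(ALLOWLIST, value), accepted(PERMISSIVE, value)) for value in samples}
```

Only `alice_01` passes the allowlist, while the permissive pattern accepts every sample, so it constrains nothing.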
We recommend that you think carefully about what kind of regular expression best matches your needs. Do not blindly copy the pattern from the code example.
Remember to include the anchors ^ and $ in your regular expression, otherwise the overall length of the pattern could be considered infinite. If you include the anchors in the regular expression and the pattern only has fixed or constant quantifiers (like {10,64}, for example), you do not have to define the property maxLength separately for the object, as the length is fully constrained by the pattern. However, if the regular expression does not include the anchors or its quantifiers are not fixed (like in ^a.*b$), it can be considered to be just a part of a longer string and the property maxLength is required to constrain the length.
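
The effect of anchors and quantifiers can be checked with a short Python sketch. The patterns and sample values are illustrative only, and `MAX_LENGTH` is a hypothetical limit that plays the role of maxLength.

```python
import re

MAX_LENGTH = 64  # hypothetical limit, standing in for maxLength

# Unanchored: can match a fragment anywhere inside a longer string.
unanchored = re.compile(r"[a-z0-9]{10,64}")
# Anchored with fixed quantifiers: the total length is bounded by the pattern.
anchored = re.compile(r"^[a-z0-9]{10,64}$")
# Anchored, but .* is unbounded: the length needs a separate limit.
open_ended = re.compile(r"^a.*b$")


def matches_open_ended(value: str) -> bool:
    """Apply the explicit length limit before the unbounded pattern."""
    return len(value) <= MAX_LENGTH and open_ended.fullmatch(value) is not None


leaky = "id=abcdefghij0123456789; internal path /srv/app"
huge = "a" + "x" * 100_000 + "b"

fragment_found = unanchored.search(leaky) is not None  # a fragment of the string matches
whole_rejected = anchored.search(leaky) is None        # the string as a whole does not
```

Note that in Python, `$` also matches just before a trailing newline, so `re.fullmatch` is the stricter choice when validating whole values.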
For more information on regular expressions, see the following:
- Language-agnostic information on regular expressions at the Base Definitions page on regular expressions
- OWASP Validation Regex Repository
- RegExr, an online tool for building and testing regular expressions
The following are examples of regular expressions for some common elements:
| Element | Examples of regular expressions | Examples with escape |
|---|---|---|
| Alphanumeric string | | - |
| Base64-encoding (for an image) | | `^data:image\\/(?:gif\|png\|jpeg\|bmp\|webp)(?:;charset=utf-8)?;base64,(?:[A-Za-z0-9]\|[+/])+={0,2}$` |
| Date and time | | |
| Duration | | `^\\d+:\\d{2}:\\d{2}$` |
| Email address (common format) | | `^([a-z0-9_\\.-]+)@([\\da-z\\.-]+)\\.([a-z\\.]{2,5})$` |
| File | | |
| IP address | | |
| Numbers | | |
| Password constraints | Password that has: | `^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[*.!@$%^&(){}[]:;<>,.?/~_+-=\|\\]).{10,64}$` |
| Phone number | International phone number, country code optional: Use libraries instead of regular expressions to validate phone numbers whenever possible. | `^(?:(?:\\(?(?:00\|\\+)([1-4]\\d\\d\|[1-9]\\d?)\\)?)?[\\-\\.\\ \\\\\\/]?)?((?:\\(?\\d{1,}\\)?[\\-\\.\\ \\\\\\/]?){0,})(?:[\\-\\.\\ \\\\\\/]?(?:#\|ext\\.?\|extension\|x)[\\-\\.\\ \\\\\\/]?(\\d+))?$` |
| URL/URI (with protocol optional) | | `^(https?:\\/\\/)?(www\\.)?[-a-zA-Z0-9@:%._\\+~#=]{2,256}\\.[a-z]{2,6}\\b([-a-zA-Z0-9@:%_\\+.~#?&//=]*)$` |
| UUID | | - |
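
The escaped examples double every backslash. Assuming that this form is meant for embedding in a string literal where the backslash is itself an escape character (JSON is used here as one such format), the Python sketch below decodes the duration example and applies the resulting regular expression. The names `ESCAPED_DURATION`, `DURATION_REGEX`, and `is_duration` are hypothetical.

```python
import json
import re

# Duration pattern in its escaped form, with doubled backslashes.
ESCAPED_DURATION = r"^\\d+:\\d{2}:\\d{2}$"

# Decoding it as a JSON string literal collapses each doubled backslash into one.
DURATION_REGEX = json.loads('"' + ESCAPED_DURATION + '"')

duration = re.compile(DURATION_REGEX)


def is_duration(value: str) -> bool:
    """Return True if the whole value looks like hours:minutes:seconds."""
    return duration.fullmatch(value) is not None
```

Here `is_duration("1:05:30")` is true, while `is_duration("1:5:30")` is false because the minutes field needs exactly two digits.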