Built-in string scalar used in output instead of a custom scalar

Issue ID: graphql-data-output-custom-scalar-string-needed

Average severity: Critical

Description

The schema exposes data through the built-in String scalar in an output position instead of through a domain-specific, constrained custom scalar.

For more details, see the GraphQL specification.

Possible exploit scenario

The built-in GraphQL String scalar can carry any Unicode text of arbitrary length. It does not define:

Maximum length
Minimum length
Allowed character set
Structural format
Encoding expectations
Domain semantics

An unconstrained String in an output position increases the risk of:

Accidental exposure of sensitive information
Leakage of internal system details
Contract instability (format changes across versions)
Downstream injection risks in consuming applications
Oversized responses leading to performance or availability issues
Inconsistent encoding behavior across services

For example, returning internal error messages as a built-in String may expose stack traces, returning file paths may leak infrastructure layout, and returning unrestricted free text may propagate unvalidated content into downstream systems (such as logs, HTML rendering, or report generators). In federated environments, uncontrolled string outputs may propagate across subgraphs and amplify exposure risk.
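The first of these cases can be illustrated with a minimal Python sketch. It is not tied to any GraphQL library, and the resolver name, the helper, and the file path are hypothetical. It only shows that a raw traceback is a perfectly valid String value, so the schema does nothing to stop it from reaching the client:

```python
import traceback


def load_report(path: str) -> str:
    # Hypothetical internal helper; fails when the file is missing.
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def resolve_status_message(path: str) -> str:
    """Hypothetical resolver for a field declared as `statusMessage: String`."""
    try:
        load_report(path)
        return "ok"
    except OSError:
        # Anti-pattern: the formatted traceback is just text, so it passes
        # through an unconstrained String field unchanged.
        return traceback.format_exc()


# The returned text includes the internal function name and the file path.
leaked = resolve_status_message("/srv/reports/internal/q3-draft.txt")
```

A client that receives `leaked` learns the internal function name and the server-side path, neither of which belongs in the API contract.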

Because strings often carry business-critical or security-sensitive information, modeling them as unconstrained output increases the risk of data exposure. Domain-specific custom scalars for strings let you explicitly define the maximum length, allowed formats, sensitivity classification, controlled value sets, and contract guarantees.
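As a rough sketch of what such a constrained scalar could enforce at serialization time, the following Python example builds a serializer from a maximum length, an optional format, and an optional controlled value set. The names `constrained_string`, `ScalarSerializationError`, `CountryCode`, and `OrderStatus` are assumptions made for illustration, not part of any library:

```python
import re
from typing import Callable, FrozenSet, Optional


class ScalarSerializationError(ValueError):
    """Raised when an output value violates the scalar's contract."""


def constrained_string(
    name: str,
    max_length: int,
    pattern: Optional[str] = None,
    allowed: Optional[FrozenSet[str]] = None,
) -> Callable[[object], str]:
    """Build a serializer for a hypothetical custom scalar called `name`."""
    compiled = re.compile(pattern) if pattern is not None else None

    def serialize(value: object) -> str:
        # Error messages deliberately omit the offending value so that the
        # check itself cannot leak it.
        if not isinstance(value, str):
            raise ScalarSerializationError(f"{name}: expected a string")
        if len(value) > max_length:
            raise ScalarSerializationError(f"{name}: exceeds {max_length} characters")
        if compiled is not None and compiled.fullmatch(value) is None:
            raise ScalarSerializationError(f"{name}: format mismatch")
        if allowed is not None and value not in allowed:
            raise ScalarSerializationError(f"{name}: value outside the allowed set")
        return value

    return serialize


# Hypothetical scalars: a two-letter country code and a closed status set.
serialize_country_code = constrained_string("CountryCode", 2, pattern=r"[A-Z]{2}")
serialize_order_status = constrained_string(
    "OrderStatus", 16, allowed=frozenset({"PENDING", "SHIPPED", "DELIVERED"})
)
```

With a library such as graphql-core, a function of this shape could plausibly be supplied as the `serialize` argument of `GraphQLScalarType`. Other servers expose a similar hook under a different name.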

Remediation

Replace the built-in String with constrained custom scalars that explicitly define validation rules. We recommend that you:

Define explicit scalars for output strings
Enforce maximum length for all output strings
Avoid exposing raw internal values
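One way to apply the last recommendation is to translate internal failures into a small, closed set of public values before they reach the schema. The sketch below assumes a hypothetical `PublicErrorCode` scalar backed by an enum. The exception-to-code mapping is illustrative only:

```python
from enum import Enum


class PublicErrorCode(Enum):
    """Closed set of values a hypothetical `PublicErrorCode` scalar may emit."""

    NOT_FOUND = "NOT_FOUND"
    FORBIDDEN = "FORBIDDEN"
    INTERNAL = "INTERNAL"


def to_public_error_code(exc: BaseException) -> str:
    """Map an internal exception to a public code, discarding its message."""
    if isinstance(exc, FileNotFoundError):
        return PublicErrorCode.NOT_FOUND.value
    if isinstance(exc, PermissionError):
        return PublicErrorCode.FORBIDDEN.value
    # Anything unrecognized collapses to a generic code; the original text
    # (paths, queries, credentials) stays in server-side logs only.
    return PublicErrorCode.INTERNAL.value
```

Because every possible output is one of three short constants, the field has a known maximum length and a stable format, and nothing from the exception message can appear in the response.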

Explicit modeling of output strings strengthens contract clarity, reduces data leakage risk, and improves security governance across API ecosystems.