Inefficient Regular Expression Complexity in re Module
Patterns in Python's re module that are susceptible to catastrophic backtracking. Such patterns can lead to performance issues and may cause a Denial-of-Service (DoS) condition in applications by consuming an excessive amount of CPU time on certain inputs Vulnerability Explanation
Catastrophic backtracking occurs in regex evaluation when the engine tries to match complex patterns that contain nested quantifiers or ambiguous constructs. In certain cases, especially with maliciously crafted input, this can lead to an exponential number of combinations being checked, severely impacting application performance and potentially causing it to hang or crash.
Examples
import re
pattern = re.compile('(a+)+$')
result = pattern.match('aaaaaaaaaaaaaaaaaaaa!')
Remediation
When using Python's re module to compile or match regular expressions, ensure that patterns are designed to avoid ambiguous repetition and nested quantifiers that can cause catastrophic backtracking. Regular expressions should be reviewed and tested for efficiency and resistance to DoS attacks.
import re
pattern = re.compile('a+$')
result = pattern.match('aaaaaaaaaaaaaaaaaaaa!')
See also
- re — Regular expression operations
- Regular expression Denial of Service - ReDoS OWASP Foundation
- CWE-1333: Inefficient Regular Expression Complexity
New in version 0.3.14