From 3f7f811530dac6806199ec0e1b39e162ed0ac220 Mon Sep 17 00:00:00 2001 From: Josh Date: Mon, 15 Sep 2025 11:20:21 -0400 Subject: [PATCH 1/5] PCRE2: Clean-up "Perl Differences" docs Removed the VERY outdated list of differences (it's from Perl Differences Differences From Perl - The differences described here are with respect to Perl 5.005. - - - - By default, a whitespace character is any character that - the C library function isspace() recognizes, though it is - possible to compile PCRE with alternative character type - tables. Normally isspace() matches space, formfeed, newline, - carriage return, horizontal tab, and vertical tab. Perl 5 no - longer includes vertical tab in its set of whitespace characters. - The \v escape that was in the Perl documentation for - a long time was never in fact recognized. However, the character - itself was treated as whitespace at least up to 5.002. - In 5.004 and 5.005 it does not match \s. - - - - - PCRE does not allow repeat quantifiers on lookahead - assertions. Perl permits them, but they do not mean what you - might think. For example, (?!a){3} does not assert that the - next three characters are not "a". It just asserts that the - next character is not "a" three times. - - - - - Capturing subpatterns that occur inside negative - lookahead assertions are counted, but their entries in the - offsets vector are never set. Perl sets its numerical - variables from any such patterns that are matched before the - assertion fails to match something (thereby succeeding), but - only if the negative lookahead assertion contains just one - branch. - - - - - Though binary zero characters are supported in the subject string, - they are not allowed in a pattern string because it is passed as a - normal C string, terminated by zero. The escape sequence "\x00" can - be used in the pattern to represent a binary zero. - - - - - The following Perl escape sequences are not supported: - \l, \u, \L, \U. 
In fact these are implemented by - Perl's general string-handling and are not part of its - pattern matching engine. - - - - - The Perl \G assertion is not supported as it is not - relevant to single pattern matches. - - - - - Fairly obviously, PCRE does not support the (?{code}) and (??{code}) - construction. However, there is support for recursive patterns. - - - - - There are at the time of writing some oddities in Perl - 5.005_02 concerned with the settings of captured strings - when part of a pattern is repeated. For example, matching - "aba" against the pattern /^(a(b)?)+$/ sets $2 to the value - "b", but matching "aabbaa" against /^(aa(bb)?)+$/ leaves $2 - unset. However, if the pattern is changed to - /^(aa(b(b))?)+$/ then $2 (and $3) get set. - In Perl 5.004 $2 is set in both cases, and that is also &true; - of PCRE. If in the future Perl changes to a consistent state - that is different, PCRE may change to follow. - - - - - Another as yet unresolved discrepancy is that in Perl - 5.005_02 the pattern /^(a)?(?(1)a|b)+$/ matches the string - "a", whereas in PCRE it does not. However, in both Perl and - PCRE /^(a)?a/ matched against "a" leaves $1 unset. - - - - - PCRE provides some extensions to the Perl regular - expression facilities: - - - - Although lookbehind assertions must match fixed length - strings, each alternative branch of a lookbehind assertion - can match a different length of string. Perl 5.005 requires - them all to have the same length. - - - - - If PCRE_DOLLAR_ENDONLY - is set and PCRE_MULTILINE is - not set, the $ meta-character matches only at the very end of the - string. - - - - - If PCRE_EXTRA is - set, a backslash followed by a letter with no special meaning is - faulted. - - - - - If PCRE_UNGREEDY is - set, the greediness of the repetition quantifiers is inverted, - that is, by default they are not greedy, but if followed by a - question mark they are. - - - - - - - + Both Perl and PCRE2 are continually changing. 
Refer to PCRE2's + latest documentation covering the + differences between PCRE2 + and Perl. The version of the PCRE2 library in-use is also + a relevant factor. - - The PCRE library is a set of functions that implement regular - expression pattern matching using the same syntax and semantics - as Perl 5, with just a few differences (see below). The current - implementation corresponds to Perl 5.005. - &reference.pcre.setup; From 1f259f5069f6e6956bef707da25d7408b38c6b67 Mon Sep 17 00:00:00 2001 From: Josh Date: Mon, 15 Sep 2025 12:18:56 -0400 Subject: [PATCH 3/5] Revise PCRE extension installation instructions - Remove the outdated configuration option; replace it with the supported option - Tidy up the opening paragraph(s) - Update the required PCRE2 external version to match the config implementation - Tidy up JIT paragraph - Tidy up "changes" paragraph; replace changelog link to point at the one for PCRE2 rather than the deprecated/unmaintained PCRE1 one - Add 8.3.0-8.5.0 entries to bundled PCRE library history --- reference/pcre/configure.xml | 45 ++++++++++++++++++++++++------------ 1 file changed, 30 insertions(+), 15 deletions(-) diff --git a/reference/pcre/configure.xml b/reference/pcre/configure.xml index 5ce103488977..f888fe4d1265 100644 --- a/reference/pcre/configure.xml +++ b/reference/pcre/configure.xml @@ -3,27 +3,27 @@
&reftitle.install; - The PCRE extension is a core PHP extension, so it is always enabled. - By default, this extension is compiled using the bundled PCRE - library. Alternatively, an external PCRE library can be used by - passing in the - configuration option where DIR is the location of - PCRE's include and library files. It is recommended to use PCRE 8.10 or newer; - as of PHP 7.3.0, PCRE2 is required. + The PCRE extension is a core PHP extension and is always enabled. - PCRE's just-in-time compilation is supported by default, which - can be disabled with the - configuration option as of PHP 7.0.12. + By default, the extension uses a bundled version of the PCRE2 library. + An external PCRE2 library can be used instead by passing the + configuration + option. The minimum supported version is 10.30. + + + PCRE's just-in-time (JIT) compilation is enabled by default. + It can be disabled with the + configuration option. &windows.builtin; - PCRE is an active project and as it changes so does the PHP + PCRE2 is an active project and as it changes so does the PHP functionality that relies upon it. It is possible that certain parts - of the PHP documentation is outdated, in that it may not cover the - newest features that PCRE provides. For a list of changes, see the - PCRE library changelog - and also the following bundled PCRE history: + of the PHP documentation are outdated. For a list of changes, see the + PCRE2 library changelog. 
+ Also, if using the bundled library, refer to the following bundled PCRE library + history: @@ -37,6 +37,21 @@ + + 8.5.0 (upcoming) + 10.46 + + + + 8.4.0 + 10.44 + + + + 8.3.0 + 10.42 + + 8.2.0 10.40 From 5bfc132c3cd0bb6772d470260d71fb816b6f2bf0 Mon Sep 17 00:00:00 2001 From: Josh Date: Mon, 15 Sep 2025 16:45:37 -0400 Subject: [PATCH 4/5] Fix formatting in pattern.differences.xml --- reference/pcre/pattern.differences.xml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/reference/pcre/pattern.differences.xml b/reference/pcre/pattern.differences.xml index f5cb37c4b6ac..b6c9b6da58dc 100644 --- a/reference/pcre/pattern.differences.xml +++ b/reference/pcre/pattern.differences.xml @@ -8,7 +8,7 @@ Both Perl and PCRE2 are continually changing. Refer to PCRE2's latest documentation covering the differences between PCRE2 - and Perl. The version of the PCRE2 library in-use is also + and Perl. The version of the PCRE2 library in-use is also a relevant factor. From a22e8bd1381431e3ce90d800c3e87ac7ccad3fad Mon Sep 17 00:00:00 2001 From: Josh Date: Mon, 15 Sep 2025 16:46:19 -0400 Subject: [PATCH 5/5] Fix formatting of warning message in book.xml --- reference/pcre/book.xml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/reference/pcre/book.xml b/reference/pcre/book.xml index ca044a996d9b..86cda9587085 100644 --- a/reference/pcre/book.xml +++ b/reference/pcre/book.xml @@ -31,7 +31,7 @@ - There are some size and other limitations + There are some size and other limitations in PCRE2 that can occasionally be relevant.