Named Capture Groups
The syntax for a named capture group is (?P<identifier>...), where identifier is the name assigned to the group and ... is the pattern to match. A named group behaves exactly like a standard capturing group: it still receives a numeric index, and its match can additionally be retrieved by name with the match object's group('identifier') method.
import re
regex = r"(?P<prefix>foo)-(?:bar)-(baz)"
text = "foo-bar-baz"
result = re.match(regex, text)
if result:
    print(result.group("prefix"))
    print(result.groups())
foo
('foo', 'baz')
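Names can also be used in a few other places besides group(). The following is a minimal sketch, assuming a hypothetical year-month pattern and sample strings chosen only for illustration. It shows groupdict(), the (?P=name) backreference inside a pattern, and the \g<name> reference in a replacement string.

```python
import re

# Hypothetical pattern and sample text, chosen only for illustration.
pattern = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})")
match = pattern.match("2024-05")

# groupdict() maps each group name to its matched text.
fields = match.groupdict()
print(fields)  # {'year': '2024', 'month': '05'}

# A named group keeps its numeric index as well.
same = match.group("year") == match.group(1)
print(same)  # True

# (?P=name) is a backreference to a named group inside the pattern.
doubled = re.search(r"(?P<word>\w+) (?P=word)", "the the cat")
repeated = doubled.group("word")
print(repeated)  # the

# \g<name> refers to a named group inside a replacement string.
swapped = pattern.sub(r"\g<month>/\g<year>", "2024-05")
print(swapped)  # 05/2024
```

Referring to groups by name tends to keep both the pattern and the replacement string readable when groups are later added or reordered.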
Non-Capturing Groups
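Defined by the syntax (?:...), non-capturing groups match the pattern without storing the matched text. They do not consume a group index, so the text they match cannot be retrieved by number or name afterward, nor referenced by a backreference elsewhere in the expression. The examples below also cover the related lookahead and lookbehind assertions, which likewise capture nothing and, in addition, consume no characters.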
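The effect on group numbering is easiest to see side by side. This is a minimal sketch, assuming hypothetical sample strings and patterns. It compares a capturing and a non-capturing version of the same expression, and shows how the choice changes what re.findall returns.

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "2024-05-17"

capturing = re.match(r"(\d{4})-(\d{2})-(\d{2})", text)
non_capturing = re.match(r"(\d{4})-(?:\d{2})-(\d{2})", text)

all_groups = capturing.groups()
kept_groups = non_capturing.groups()
print(all_groups)   # ('2024', '05', '17')
print(kept_groups)  # ('2024', '17')

# The day is now group 2, because (?:...) took no index.
day = non_capturing.group(2)
group_count = non_capturing.re.groups  # number of capturing groups
print(day, group_count)  # 17 2

# With a capturing group, findall returns the group's text;
# with a non-capturing group, it returns the whole match.
only_groups = re.findall(r"(ab|cd)\d+", "ab12 cd34")
whole_matches = re.findall(r"(?:ab|cd)\d+", "ab12 cd34")
print(only_groups)    # ['ab', 'cd']
print(whole_matches)  # ['ab12', 'cd34']
```

The findall difference is a common reason to prefer (?:...) when a group exists only to apply alternation or a quantifier.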
import re
# 1. (?:pattern): Matches the pattern but omits the result from capture.
txt = "apple_pieapple_turnover"
matches = re.findall(r"apple(?:_pie|_turnover)", txt)
print(matches) # ['apple_pie', 'apple_turnover']
# 2. (?=pattern): Positive lookahead. Asserts that the pattern immediately follows, without consuming it.
txt = "version3version4version5"
updated = re.sub(r"version(?=4|5)", "release", txt)
print(updated) # version3release4release5
# 3. (?!pattern): Negative lookahead. Asserts that the pattern does not immediately follow, without consuming it.
updated = re.sub(r"version(?!4|5)", "release", txt)
print(updated) # release3version4version5
# 4. (?<=pattern): Positive lookbehind. Asserts that the pattern immediately precedes, without consuming it.
txt = "cat_dog_cat_fish"
updated = re.sub(r"(?<=cat_)fish", "bird", txt)
print(updated) # cat_dog_cat_bird
# 5. (?