Python Regular Expression Symbols Explained

Python Notes

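Regular expressions in Python are a powerful tool for matching patterns in strings and manipulating the matched text. Text processing often means finding or transforming content that follows some rule, and a regular expression can state such a rule compactly, which usually keeps the surrounding code short and readable. This article walks through the meaning of the commonly used regular expression symbols, with a small example for each.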

I. Basic symbols

1. . : matches any single character except a newline (unless the re.DOTALL flag is set).

import re

text = "cat hat bat\nmat"
pattern = r".at"

print(re.findall(pattern, text))  # ['cat', 'hat', 'bat', 'mat']
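
The example above never places a newline where the dot has to match, so it does not actually exercise the "except a newline" rule. The following sketch (the sample string is made up for illustration) puts a newline in that position and compares the default behavior with re.DOTALL:

```python
import re

text = "a\nc abc"

# By default "." refuses to match the newline between "a" and "c".
default_matches = re.findall(r"a.c", text)

# With re.DOTALL, "." matches the newline as well.
dotall_matches = re.findall(r"a.c", text, flags=re.DOTALL)

print(default_matches)  # ['abc']
print(dotall_matches)   # ['a\nc', 'abc']
```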

2. ^ : matches the start of the string. With the re.MULTILINE flag it also matches at the start of each line.

import re

text = "hello world"
pattern = r"^hello"

print(re.findall(pattern, text))  # ['hello']

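3. $ : matches the end of the string, or just before a newline that is the last character of the string. With the re.MULTILINE flag it also matches at the end of each line.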

import re

text = "hello world"
pattern = r"world$"

print(re.findall(pattern, text))  # ['world']
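
On a single-line string, ^ and $ behave the same with or without flags. The difference only shows up on multi-line text, as in this small sketch (the two-line sample string is an assumption chosen for illustration):

```python
import re

text = "first line\nsecond line"

# Without flags, ^ anchors only at the very start of the string
# and $ only at the very end.
start_default = re.findall(r"^\w+", text)
end_default = re.findall(r"\w+$", text)

# With re.MULTILINE, ^ and $ also anchor at every line break.
start_multiline = re.findall(r"^\w+", text, flags=re.MULTILINE)
end_multiline = re.findall(r"\w+$", text, flags=re.MULTILINE)

print(start_default)    # ['first']
print(start_multiline)  # ['first', 'second']
print(end_default)      # ['line']
print(end_multiline)    # ['line', 'line']
```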

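4. \b : matches a word boundary, that is, the position between a word character (\w) and a non-word character or the start or end of the string. It matches a position and consumes no characters.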

import re

text = "hello world"
pattern = r"\bhello\b"

print(re.findall(pattern, text))  # ['hello']
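
In the example above the result would be the same without \b. To see what the boundary actually buys, it helps to search text where the word also occurs inside longer words. A minimal sketch, with a made-up sample string:

```python
import re

text = "cat concat category"

# Without boundaries, "cat" is found inside the longer words too.
anywhere = re.findall(r"cat", text)

# \b on both sides keeps only the standalone word.
whole_word = re.findall(r"\bcat\b", text)

# \b on the left only: words that start with "cat".
prefixed = re.findall(r"\bcat\w*", text)

print(anywhere)    # ['cat', 'cat', 'cat']
print(whole_word)  # ['cat']
print(prefixed)    # ['cat', 'category']
```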

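5. \d : matches a digit character. For str patterns this covers any Unicode decimal digit, not only 0-9, unless the re.ASCII flag is set.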

import re

text = "1234"
pattern = r"\d"

print(re.findall(pattern, text))  # ['1', '2', '3', '4']

6. \D : matches any character that is not a digit.

import re

text = "hello 123 world"
pattern = r"\D"

print(re.findall(pattern, text))  # ['h', 'e', 'l', 'l', 'o', ' ', ' ', 'w', 'o', 'r', 'l', 'd']

II. Quantifiers

1. * : matches the preceding element zero or more times.

import re

text = "A B AB ABC ABBBC"
pattern = r"AB*"

# "C" is not part of the pattern, so "ABC" contributes 'AB' and "ABBBC" contributes 'ABBB'.
print(re.findall(pattern, text))  # ['A', 'AB', 'AB', 'ABBB']

2. + : matches the preceding element one or more times.

import re

text = "A B AB ABC ABBBC"
pattern = r"AB+"

# "C" is not part of the pattern, so "ABC" contributes 'AB'.
print(re.findall(pattern, text))  # ['AB', 'AB', 'ABBB']

3. ? : matches the preceding element zero or one time.

import re

text = "A B AB ABC ABBBC"
pattern = r"AB?"

print(re.findall(pattern, text))  # ['A', 'AB', 'AB', 'AB']
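
The ? symbol has a second meaning that the list above does not cover: placed directly after another quantifier (*?, +?, ??), it makes that quantifier lazy, so it matches as few characters as possible instead of as many as possible. A short sketch with a made-up tag-like string:

```python
import re

text = "<a><b>"

# Greedy: ".+" grabs as much as it can, so the match runs to the last ">".
greedy = re.findall(r"<.+>", text)

# Lazy: ".+?" stops at the first ">" that lets the pattern succeed.
lazy = re.findall(r"<.+?>", text)

print(greedy)  # ['<a><b>']
print(lazy)    # ['<a>', '<b>']
```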

4. {m} : matches the preceding element exactly m times.

import re

text = "A B AB ABB ABBB ABBBB"
pattern = r"AB{3}"

# findall scans the whole text, so the "A" plus the first three B's of "ABBBB" also match.
print(re.findall(pattern, text))  # ['ABBB', 'ABBB']

5. {m,n} : matches the preceding element at least m and at most n times, taking as many repetitions as possible.

import re

text = "A B AB ABB ABBC ABBBBC"
pattern = r"AB{2,4}"

# "ABB" and "ABBC" each contribute 'ABB'; "ABBBBC" contributes 'ABBBB' (four B's).
print(re.findall(pattern, text))  # ['ABB', 'ABB', 'ABBBB']
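
Because re.findall searches inside longer strings, it can blur what each quantifier accepts. As a complementary sketch, re.fullmatch requires the whole string to match, which makes the five quantifiers easy to compare side by side. The candidate strings and the accepted dictionary are names chosen here for illustration:

```python
import re

candidates = ["A", "AB", "ABB", "ABBB", "ABBBBB"]
patterns = [r"AB*", r"AB+", r"AB?", r"AB{3}", r"AB{2,4}"]

# For each pattern, keep the candidates that match in their entirety.
accepted = {
    pattern: [s for s in candidates if re.fullmatch(pattern, s)]
    for pattern in patterns
}

for pattern, strings in accepted.items():
    print(f"{pattern:8} {strings}")
```

Read as a table, the output shows that AB? accepts only "A" and "AB", AB{3} accepts only "ABBB", and AB{2,4} accepts "ABB" and "ABBB" but rejects the five-B string.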

III. Character sets

1. [] : matches any one of the characters listed inside the brackets.

import re

text = "A B AB AC AD AE"
pattern = r"A[BDE]"

print(re.findall(pattern, text))  # ['AB', 'AD', 'AE']

2. [^] : matches any one character that is not listed inside the brackets.

import re

text = "A B AB AC AD AE"
pattern = r"A[^BDE]"

# The space after the first "A" is also "not B, D or E", so 'A ' is a match.
print(re.findall(pattern, text))  # ['A ', 'AC']
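
One detail that is easy to miss: a negated set still has to consume exactly one character, and whitespace qualifies like any other character. If that is not wanted, the whitespace class \s can be added to the excluded characters. A minimal sketch on a shortened sample string:

```python
import re

text = "A B AC"

# The space after the first "A" is not B, D or E, so it matches.
with_space = re.findall(r"A[^BDE]", text)

# Excluding whitespace as well leaves only "AC".
without_space = re.findall(r"A[^BDE\s]", text)

print(with_space)     # ['A ', 'AC']
print(without_space)  # ['AC']
```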

3. \w : matches a word character: a letter, a digit, or an underscore.

import re

text = "hello_123 world"
pattern = r"\w"

print(re.findall(pattern, text))  # ['h', 'e', 'l', 'l', 'o', '_', '1', '2', '3', 'w', 'o', 'r', 'l', 'd']

4. \W : matches any character that is not a letter, a digit, or an underscore.

import re

text = "hello_123 world"
pattern = r"\W"

# The text contains a single non-word character: the space.
print(re.findall(pattern, text))  # [' ']
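
In practice \w and \W are usually combined with a quantifier rather than used one character at a time. The sketch below, on a made-up sample string, extracts whole words with \w+ and, equivalently for this input, splits on runs of non-word characters with \W+:

```python
import re

text = "hello_123 world, again"

# Runs of word characters are the "words".
words = re.findall(r"\w+", text)

# Runs of non-word characters are the separators.
pieces = re.split(r"\W+", text)

print(words)   # ['hello_123', 'world', 'again']
print(pieces)  # ['hello_123', 'world', 'again']
```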

IV. Special symbols

1. | : matches either the expression on its left or the expression on its right.

import re

text = "cat bat hat"
pattern = r"cat|hat"

print(re.findall(pattern, text))  # ['cat', 'hat']

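2. () : marks the start and end of a group (subexpression). The text matched by the group is captured and can be retrieved after the match. Note that when the pattern contains exactly one capturing group, re.findall returns the text of that group rather than the whole match.

import re

text = "cat bat hat"
pattern = r"(c|h)at"

# findall returns the captured group, not the full match 'cat' / 'hat'.
print(re.findall(pattern, text))  # ['c', 'h']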
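
Since re.findall returns group contents whenever the pattern contains a capturing group, getting the full match back takes a small change. Two common options are sketched here: a non-capturing group (?:...), or re.finditer with match objects, where group(0) is the whole match and group(1) the first group. The variable names are chosen for illustration:

```python
import re

text = "cat bat hat"

# One capturing group: findall returns what the group captured.
captured = re.findall(r"(c|h)at", text)

# Non-capturing group: findall returns the whole match.
whole = re.findall(r"(?:c|h)at", text)

# Match objects expose both the whole match and the group.
pairs = [(m.group(0), m.group(1)) for m in re.finditer(r"(c|h)at", text)]

print(captured)  # ['c', 'h']
print(whole)     # ['cat', 'hat']
print(pairs)     # [('cat', 'c'), ('hat', 'h')]
```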

3. (?i) : an inline flag, placed at the start of the pattern, that makes the match case-insensitive.

import re

text = "Cat bat hat"
pattern = r"(?i)cat"

print(re.findall(pattern, text))  # ['Cat']
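
The same effect can be had without touching the pattern by passing the re.IGNORECASE flag (short form re.I) to the function or to re.compile. A brief sketch comparing the three spellings on a made-up string:

```python
import re

text = "Cat bat CAT"

# Inline flag inside the pattern.
inline = re.findall(r"(?i)cat", text)

# Flag passed as an argument.
flagged = re.findall(r"cat", text, flags=re.IGNORECASE)

# Flag baked into a compiled pattern, handy when the pattern is reused.
compiled = re.compile(r"cat", re.IGNORECASE).findall(text)

print(inline)    # ['Cat', 'CAT']
print(flagged)   # ['Cat', 'CAT']
print(compiled)  # ['Cat', 'CAT']
```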

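4. (?=) : positive lookahead. It asserts that the enclosed pattern matches immediately after the current position but consumes no characters, so the asserted text is not part of the match.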

import re

text = "hello world"
pattern = r"h(?=e)"

print(re.findall(pattern, text))  # ['h']

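5. (?<=) : positive lookbehind. It asserts that the enclosed pattern matches immediately before the current position but consumes no characters, so the asserted text is not part of the match. In Python's re module the lookbehind pattern must have a fixed width.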

import re

text = "hello world"
pattern = r"(?<=h)ello"

print(re.findall(pattern, text))  # ['ello']
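
A typical use of both assertions is pulling out a value that is identified by a neighboring marker without including the marker itself. The sketch below uses a made-up price string: a lookbehind selects digits preceded by a dollar sign, and a lookahead selects digits followed by a percent sign.

```python
import re

text = "price: $42, discount: 15%"

# Digits that come right after "$"; the "$" itself is not returned.
amounts = re.findall(r"(?<=\$)\d+", text)

# Digits that come right before "%"; the "%" itself is not returned.
percents = re.findall(r"\d+(?=%)", text)

print(amounts)   # ['42']
print(percents)  # ['15']
```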

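That covers the meaning of the commonly used symbols in Python regular expressions. Hopefully it helps readers get a firmer grasp of how regular expressions work and put them to good use in day-to-day work.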