Python Standard Library: the string module (to be continued)

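Introduction (the examples in this article use Python 2 syntax):
>>> import string
>>> s='hello rollen , how are you '
>>> string.capwords(s)
'Hello Rollen , How Are You'          # capitalizes the first letter of each word
>>> string.split(s)
['hello', 'rollen', ',', 'how', 'are', 'you']     # splits into a list; the default separator is whitespace
>>> s='1+2+3'
>>> string.split(s,'+')                 # split on '+'
['1', '2', '3']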
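
The string.split() function used above exists only in Python 2. As a minimal sketch, assuming a Python 3 interpreter, the same results can be obtained with the str.split() method, while string.capwords() is still available:

```python
import string

s = 'hello rollen , how are you '

# string.capwords() is still part of the string module in Python 3
capitalized = string.capwords(s)

# str.split() with no argument splits on runs of whitespace
words = s.split()

# Passing a separator splits on that exact string
parts = '1+2+3'.split('+')

print(capitalized)
print(words)
print(parts)
```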

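maketrans() creates a translation table that translate() can use to change a whole series of characters in one pass. This is more efficient than calling replace() once for every character.

For example, the following replaces 'a' with 1, 'b' with 2, and 'c' with 3 in the string s: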

>>> leet=string.maketrans('abc','123')
>>> s='abcdef'
>>> s.translate(leet)
'123def'
>>> leet=string.maketrans('abc','123')
>>> s='aAaBcC'
>>> s.translate(leet)
'1A1B3C'
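
In Python 3 the module-level string.maketrans() no longer exists; the table is built with the static method str.maketrans() instead. A minimal sketch of the same substitution, assuming Python 3 (the drop_vowels table is only an illustration and is not part of the original example):

```python
# Build the translation table with str.maketrans (Python 3)
leet = str.maketrans('abc', '123')

first = 'abcdef'.translate(leet)
second = 'aAaBcC'.translate(leet)   # translation is case-sensitive

# A dict can also be passed; mapping a character to None deletes it
drop_vowels = str.maketrans({'a': None, 'e': None, 'i': None, 'o': None, 'u': None})
third = 'hello rollen'.translate(drop_vowels)

print(first, second, third)
```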

A small example of string.Template:

import string
values = { 'var':'foo' }
t = string.Template("""
Variable : $var
Escape : $$
Variable in text: ${var}iable
""")
print 'TEMPLATE:', t.substitute(values)
s = """
Variable : %(var)s
Escape : %%
Variable in text: %(var)siable
"""
print 'INTERPOLATION:', s % values

The example above prints:

TEMPLATE: 
Variable : foo 
Escape : $ 
Variable in text: fooiable

INTERPOLATION: 
Variable : foo 
Escape : % 
Variable in text: fooiable

However, substitute() raises an exception when a value needed by the template is missing. safe_substitute() is the safer alternative, as follows:

import string
values = { 'var':'foo' }
t = string.Template("$var is here but $missing is not provided")
try:
    print 'substitute() :', t.substitute(values)
except KeyError, err:
    print 'ERROR:', str(err)
print 'safe_substitute():', t.safe_substitute(values)

The example above prints:

substitute() : ERROR: 'missing' 
safe_substitute(): foo is here but $missing is not provided
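
The `except KeyError, err` form above is Python 2 syntax. A rough Python 3 equivalent, shown only as a sketch, uses `except ... as` and keeps both results in variables so they can be compared:

```python
import string

values = {'var': 'foo'}
t = string.Template("$var is here but $missing is not provided")

# substitute() raises KeyError for any placeholder without a value
try:
    strict_result = t.substitute(values)
except KeyError as err:
    strict_result = 'ERROR: %s' % err

# safe_substitute() leaves unknown placeholders untouched
safe_result = t.safe_substitute(values)

print('substitute()     :', strict_result)
print('safe_substitute():', safe_result)
```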

 

Next, some more advanced uses of Template:

import string
template_text = '''
Delimiter : %%
Replaced : %with_underscore
Ignored : %notunderscored
'''
d = { 'with_underscore':'replaced',
      'notunderscored':'not replaced',
      }
class MyTemplate(string.Template):
    delimiter = '%'
    idpattern = '[a-z]+_[a-z]+'
t = MyTemplate(template_text)
print 'Modified ID pattern:'
print t.safe_substitute(d)

Output:

Modified ID pattern:

Delimiter : % 
Replaced : replaced 
Ignored : %notunderscored

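In this example the class attributes delimiter and idpattern define custom rules: % replaces the dollar sign $ as the delimiter, and a variable name is only recognized if it contains an underscore. That is why only one of the two variables above is replaced.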
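
The same subclass can be checked on a single line of text. This is a sketch assuming Python 3; the one-line template string is made up for illustration and is not from the original example:

```python
import string

class MyTemplate(string.Template):
    delimiter = '%'                 # use % instead of $
    idpattern = '[a-z]+_[a-z]+'     # names must contain an underscore

t = MyTemplate('Delimiter: %% | Replaced: %with_underscore | Ignored: %notunderscored')
result = t.safe_substitute({
    'with_underscore': 'replaced',
    'notunderscored': 'not replaced',
})
print(result)
```

Because `%notunderscored` does not match idpattern, safe_substitute() leaves it in the text even though the dictionary contains that key.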

 

import textwrap

sample_text = '''
The textwrap module can be used to format text for output in
situations where pretty-printing is desired. It offers
programmatic functionality similar to the paragraph wrapping
or filling features found in many text editors.
'''

print 'No dedent:\n'
print textwrap.fill(sample_text, width=50)



Output:

No dedent:

The textwrap module can be used to format text 
for output in situations where pretty-printing is 
desired. It offers programmatic functionality 
similar to the paragraph wrapping or filling 
features found in many text editors.

The example above sets the width to 50. The next example removes the indentation:

import textwrap

sample_text = '''
The textwrap module can be used to format text for output in
situations where pretty-printing is desired. It offers
programmatic functionality similar to the paragraph wrapping
or filling features found in many text editors.
'''

dedented_text = textwrap.dedent(sample_text)
print 'Dedented:'
print dedented_text




Dedented:

The textwrap module can be used to format text for output in 
situations where pretty-printing is desired. It offers 
programmatic functionality similar to the paragraph wrapping 
or filling features found in many text editors.
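
The sample text as shown here carries no leading whitespace, so the effect of dedent() is hard to see. A small sketch with a deliberately indented string (an illustrative value, assuming Python 3) shows that only the whitespace common to all lines is removed:

```python
import textwrap

indented = '''
    first line
      nested line
    last line
'''

# dedent() strips the 4 spaces shared by every line and keeps
# the 2 extra spaces in front of "nested line"
dedented = textwrap.dedent(indented)
print(repr(dedented))
```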


Now a comparison of two different widths:

import textwrap

sample_text = '''
The textwrap module can be used to format text for output in
situations where pretty-printing is desired. It offers
programmatic functionality similar to the paragraph wrapping
or filling features found in many text editors.
'''

dedented_text = textwrap.dedent(sample_text).strip()
for width in [ 45, 70 ]:
    print '%d Columns:\n' % width
    print textwrap.fill(dedented_text, width=width)
    print





The example above prints:

45 Columns:

The textwrap module can be used to format
text for output in situations where pretty-
printing is desired. It offers programmatic
functionality similar to the paragraph
wrapping or filling features found in many
text editors.

70 Columns:

The textwrap module can be used to format text for output in
situations where pretty-printing is desired. It offers programmatic
functionality similar to the paragraph wrapping or filling features
found in many text editors.


We can also set the indentation of the first line and of the remaining lines separately:

import textwrap

sample_text = '''
The textwrap module can be used to format text for output in
situations where pretty-printing is desired. It offers
programmatic functionality similar to the paragraph wrapping
or filling features found in many text editors.
'''

dedented_text = textwrap.dedent(sample_text).strip()
print textwrap.fill(dedented_text,
                    initial_indent=' ',
                    subsequent_indent=' ' * 4,
                    width=50,
                    )

Output:

The textwrap module can be used to format text 
    for output in situations where pretty-printing 
    is desired. It offers programmatic 
    functionality similar to the paragraph 
    wrapping or filling features found in many 
    text editors.

The example above indents the first line by 1 space and the remaining lines by 4 spaces.
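
As a hedged Python 3 sketch of the same call, the wrapped result can be split into lines to confirm that each indent lands where expected and that no line exceeds the width (the sample text here is indented on purpose so that dedent() has something to remove):

```python
import textwrap

sample_text = '''
    The textwrap module can be used to format text for output in
    situations where pretty-printing is desired.  It offers
    programmatic functionality similar to the paragraph wrapping
    or filling features found in many text editors.
    '''

dedented_text = textwrap.dedent(sample_text).strip()
wrapped = textwrap.fill(
    dedented_text,
    initial_indent=' ',          # applied to the first line only
    subsequent_indent=' ' * 4,   # applied to every following line
    width=50,                    # indents count toward the width
)
lines = wrapped.splitlines()
print(wrapped)
```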

 

Finding patterns in text:

import re
pattern = 'this'
text = 'Does this text match the pattern?'
match = re.search(pattern, text)
s = match.start()
e = match.end()
print 'Found "%s"\nin "%s"\nfrom %d to %d ("%s")' % \
(match.re.pattern, match.string, s, e, text[s:e])

start() and end() return the indexes in the string where the match begins and ends.

Output:

Found "this" 
in "Does this text match the pattern?" 
from 5 to 9 ("this")
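
Note that re.search() returns None when the pattern is not found, so calling start() on the result would fail in that case. A minimal Python 3 sketch that guards against this (find_span is a hypothetical helper name, not part of the re module):

```python
import re

def find_span(pattern, text):
    """Return (start, end, matched_text), or None if there is no match."""
    match = re.search(pattern, text)
    if match is None:
        return None
    s, e = match.start(), match.end()
    return s, e, text[s:e]

text = 'Does this text match the pattern?'
print(find_span('this', text))
print(find_span('that', text))
```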

 

re includes module-level functions for working with regular expressions as text strings, 
but it is more efficient to compile the expressions a program uses frequently. The compile() function converts an expression string into a RegexObject.

import re
# Precompile the patterns
regexes = [ re.compile(p) for p in [ 'this', 'that' ]]
text = 'Does this text match the pattern?'
print 'Text: %r\n' % text
for regex in regexes:
    print 'Seeking "%s" ->' % regex.pattern,
    if regex.search(text):
        print 'match!'
    else:
        print 'no match'

Text: 'Does this text match the pattern?'

Seeking "this" -> match! 
Seeking "that" -> no match 

The module-level functions maintain a cache of compiled expressions. However, 
the size of the cache is limited, and using compiled expressions directly avoids the 
cache lookup overhead. Another advantage of using compiled expressions is that by 
precompiling all expressions when the module is loaded, the compilation work is shifted 
to application start time, instead of to a point when the program may be responding to 
a user action.
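
One way to shift the compilation work to load time, sketched here for Python 3, is to compile the expressions at module level and reuse them inside functions. The constant names and the which_match helper are hypothetical:

```python
import re

# Compiled once, when the module is imported
THIS_RE = re.compile('this')
THAT_RE = re.compile('that')

def which_match(text):
    """Return the patterns of the precompiled expressions found in text."""
    return [regex.pattern for regex in (THIS_RE, THAT_RE) if regex.search(text)]

print(which_match('Does this text match the pattern?'))
```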

 

So far, the example patterns have all used search() to look for single instances of 
literal text strings. The findall() function returns all substrings of the input that 
match the pattern without overlapping.

import re
text = 'abbaaabbbbaaaaa'
pattern = 'ab'
for match in re.findall(pattern, text):
    print 'Found "%s"' % match

Found "ab" 
Found "ab" 

finditer() returns an iterator that produces Match instances instead of the 
strings returned by findall().

import re
text = 'abbaaabbbbaaaaa'
pattern = 'ab'
for match in re.finditer(pattern, text):
    s = match.start()
    e = match.end()
    print 'Found "%s" at %d:%d' % (text[s:e], s, e)

Found "ab" at 0:2 
Found "ab" at 5:7
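
The two functions can be compared side by side. The following is a Python 3 sketch of the examples above; the extra 'aa' search is added only to illustrate what "without overlapping" means:

```python
import re

text = 'abbaaabbbbaaaaa'

# findall() returns the matched substrings
found = re.findall('ab', text)

# finditer() yields match objects, which also carry positions
spans = [(m.start(), m.end()) for m in re.finditer('ab', text)]

# Matches never overlap: five a's contain only two separate 'aa' matches
non_overlapping = re.findall('aa', 'aaaaa')

print(found, spans, non_overlapping)
```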
