To split a sentence into words, one simple approach is to use split():
a = "This is an apple. Do you like apple?"
b = a.split()  # split on runs of whitespace
print(b)  # ['This', 'is', 'an', 'apple.', 'Do', 'you', 'like', 'apple?']
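The split looks reasonable, but punctuation marks are treated as part of the words. A regular expression can split the sentence instead, using as the delimiter any run of characters that are not letters or digits (strictly, \W also treats the underscore as a word character).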
import re

a = "This is an apple. Do you like apple?"
b = re.split(r'\W+', a)  # split on runs of non-word characters
print(b)  # ['This', 'is', 'an', 'apple', 'Do', 'you', 'like', 'apple', '']
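The resulting word list no longer contains punctuation, but it does contain an empty string (the sentence ends with a delimiter), and the words are a mix of upper and lower case. Filtering out empty strings and lowercasing each word gives: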
import re

a = "This is an apple. Do you like apple?"
b = re.split(r'\W+', a)
# drop empty strings and normalize case
c = [word.lower() for word in b if len(word) > 0]
print(c)  # ['this', 'is', 'an', 'apple', 'do', 'you', 'like', 'apple']
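As an alternative sketch, the same result can be obtained by matching the words directly with re.findall rather than splitting on the delimiters, which avoids the empty string in the first place. The helper name tokenize is hypothetical and chosen only for illustration; it is not part of the original post.

```python
import re

def tokenize(sentence):
    """Return the lowercase words of a sentence (hypothetical helper)."""
    # \w+ matches runs of letters, digits and underscores,
    # so no empty strings can appear in the result
    return re.findall(r"\w+", sentence.lower())

a = "This is an apple. Do you like apple?"
print(tokenize(a))  # ['this', 'is', 'an', 'apple', 'do', 'you', 'like', 'apple']
```

Note that, like the split-based version, this treats an apostrophe as a delimiter, so "don't" would come out as 'don' and 't'.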