Strip full-width square brackets 【】 before adding words to the wordbook

Bug540-XiongJiaming
Lan Hui 2024-08-28 08:08:31 +08:00
parent 47a359c798
commit 62dd580974
1 changed file with 1 addition and 1 deletion

@@ -66,7 +66,7 @@ def file2str(fname):  # convert a file to a string
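 def remove_punctuation(s):  # s is a formal parameter; it only receives a value when the function is called.
-    special_characters = '\_©~<=>+/[]*&$%^@.,?!:;#()"“”—‘’{}|,。?!¥……()、《》:;·'  # every character in this string is removed
+    special_characters = '\_©~<=>+/[]*&$%^@.,?!:;#()"“”—‘’{}|,。?!¥……()、《》【】:;·'  # every character in this string is removed
     s = html.unescape(s)  # convert HTML entities to their characters, e.g. &lt; becomes the less-than sign
     for c in special_characters:
         s = s.replace(c, ' ')  # replace with a space so that "apple,apple" does not become "appleapple" once the comma is removed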
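
The effect of the change can be checked with a small standalone sketch. It is not the project's actual module: the character set is copied from the added line of the diff, while the module-level constant name `SPECIAL_CHARACTERS`, the raw-string literal, and the final `return s` are assumptions made for illustration, since the hunk does not show the rest of the function.

```python
import html

# Character set copied from the added line of the diff. A raw string is used
# here (an assumption, not part of the commit) so that the leading backslash
# is an ordinary character rather than an invalid escape sequence.
SPECIAL_CHARACTERS = r'\_©~<=>+/[]*&$%^@.,?!:;#()"“”—‘’{}|,。?!¥……()、《》【】:;·'


def remove_punctuation(s):
    """Replace every character listed in SPECIAL_CHARACTERS with a space."""
    s = html.unescape(s)  # decode HTML entities first, e.g. "&lt;" becomes "<"
    for c in SPECIAL_CHARACTERS:
        s = s.replace(c, ' ')  # a space keeps neighbouring words apart
    return s  # assumed: the hunk ends before the function's return statement


if __name__ == '__main__':
    # The full-width brackets no longer survive into the cleaned text.
    print(remove_punctuation('【apple】,banana').split())
```

Because each removed character becomes a space rather than an empty string, callers that split on whitespace still see `apple` and `banana` as separate words.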