11.5 Part-of-Speech Tagging

Part-of-speech (POS) tagging is the process of assigning a grammatical category (for example, verb or noun) to each word in a text.

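Tag   Meaning
NNP   Proper noun, singular: the name of a specific person, place, or organization
VBZ   Verb, third-person singular present
VBG   Verb, gerund or present participle
IN    Preposition: links words, phrases, or clauses, e.g. "at" and "for"
CD    Cardinal number: a word expressing a quantity or numeric value
$     Currency symbol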
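import nltk

# Requires the NLTK tokenizer and tagger data, installed via nltk.download();
# the resource names vary by NLTK version.
text = 'Google is looking at buying U.K. startup for $1 billion'
tokens = nltk.word_tokenize(text, language='english')  # split the sentence into word tokens
nltk.pos_tag(tokens)  # returns a list of (token, tag) pairs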
[('Google', 'NNP'), ('is', 'VBZ'), ('looking', 'VBG'), ('at', 'IN'), ('buying', 'VBG'), ('U.K.', 'NNP'), ('startup', 'NN'), ('for', 'IN'), ('$', '$'), ('1', 'CD'), ('billion', 'CD')]
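
The result is an ordinary Python list of (token, tag) pairs, so it can be post-processed without NLTK. The following is a minimal sketch of that idea, not part of the original example: it hard-codes the output shown above, and the names `TAG_MEANINGS` and `words_with_tag` are hypothetical helpers introduced for illustration. The `NN` entry (common noun, singular) is added because that tag appears in the output but not in the table.

```python
from collections import Counter

# Tagged pairs copied from the nltk.pos_tag output above.
tagged = [('Google', 'NNP'), ('is', 'VBZ'), ('looking', 'VBG'), ('at', 'IN'),
          ('buying', 'VBG'), ('U.K.', 'NNP'), ('startup', 'NN'), ('for', 'IN'),
          ('$', '$'), ('1', 'CD'), ('billion', 'CD')]

# Short descriptions based on the tag table (hypothetical lookup dict).
TAG_MEANINGS = {
    'NNP': 'proper noun, singular',
    'VBZ': 'verb, third-person singular present',
    'VBG': 'verb, gerund or present participle',
    'IN': 'preposition',
    'CD': 'cardinal number',
    '$': 'currency symbol',
    'NN': 'common noun, singular',  # assumption: added, not in the table
}


def words_with_tag(pairs, tag):
    """Return the tokens carrying the given tag, in sentence order."""
    return [word for word, t in pairs if t == tag]


# Count how often each tag occurs in the sentence.
tag_counts = Counter(t for _, t in tagged)

print(words_with_tag(tagged, 'NNP'))  # ['Google', 'U.K.']
print(tag_counts['VBG'])              # 2

# Print each token next to a readable description of its tag.
for word, tag in tagged:
    print(f"{word:10} {tag:4} {TAG_MEANINGS.get(tag, 'unknown tag')}")
```

Filtering on `NNP` like this is a simple way to pull candidate names out of a sentence, although it is not a substitute for proper named-entity recognition.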