
Python Recipe 10: Counting Word Frequencies

Updated 2018-07-25 18:21

At work we often need to find the most frequent words in an article or web page, or to rank words by how often they occur. How can this be done?

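For example, given the input “Hello there this is a test. Hello there this was a test, but now it is not.”, we want the following result in ascending order. Words are split on whitespace only, so punctuation stays attached and 'test.' and 'test,' count as different words: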

[[1, 'but'], [1, 'it'], [1, 'not.'], [1, 'now'], [1, 'test,'], [1, 'test.'], [1, 'was'], [2, 'Hello'], [2, 'a'], [2, 'is'], [2, 'there'], [2, 'this']]

In descending order, the result is:

[[2, 'this'], [2, 'there'], [2, 'is'], [2, 'a'], [2, 'Hello'], [1, 'was'], [1, 'test.'], [1, 'test,'], [1, 'now'], [1, 'not.'], [1, 'it'], [1, 'but']]
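
The ordering in both lists follows from how Python compares lists: element by element, so the count decides first and the word breaks ties. Strings compare by code point, which is why uppercase 'Hello' sorts before lowercase 'a'. A minimal sketch with a few hand-picked pairs (the name `pairs` and its contents are chosen only for illustration):

```python
# Hypothetical sample of [count, word] pairs, chosen for illustration.
pairs = [[2, 'a'], [1, 'was'], [2, 'Hello'], [1, 'but']]

# Lists compare element by element: the count first, then the word.
ascending = sorted(pairs)

# Reversing the ascending list gives the descending order.
descending = ascending[::-1]

print(ascending)
print(descending)
```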

The following code produces these results:

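class Counter(object):
    """Tally how many times each item has been added."""

    def __init__(self):
        # Maps each item to the number of times it has been seen.
        self.dict = {}

    def add(self, item):
        # Unseen items start at 0; then increment the stored count.
        count = self.dict.setdefault(item, 0)
        self.dict[item] = count + 1

    def counts(self, desc=None):
        # Return [count, item] pairs sorted by count, then by item.
        result = [[val, key] for (key, val) in self.dict.items()]
        result.sort()
        if desc:
            result.reverse()
        return result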

if __name__ == '__main__':

    # Produces:
    #
    # Ascending count:
    # [[1, 'but'], [1, 'it'], [1, 'not.'], [1, 'now'], [1, 'test,'], [1, 'test.'], [1, 'was'], [2, 'Hello'], [2, 'a'], [2, 'is'], [2, 'there'], [2, 'this']]
    # Descending count:
    # [[2, 'this'], [2, 'there'], [2, 'is'], [2, 'a'], [2, 'Hello'], [1, 'was'], [1, 'test.'], [1, 'test,'], [1, 'now'], [1, 'not.'], [1, 'it'], [1, 'but']]

    sentence = "Hello there this is a test.  Hello there this was a test, but now it is not."
    # split() with no arguments splits on any run of whitespace.
    words = sentence.split()
    c = Counter()
    for word in words:
        c.add(word)
    # print() with a single argument runs on both Python 2 and Python 3.
    print("Ascending count:")
    print(c.counts())
    print("Descending count:")
    print(c.counts(desc=True))
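
For comparison, the same tallies can be obtained with the standard library's `collections.Counter`, which happens to share its name with the recipe's class. This is only a sketch of an alternative, not the recipe's own method; the names `word_counts`, `ascending`, and `descending` are introduced here for illustration.

```python
from collections import Counter

sentence = "Hello there this is a test.  Hello there this was a test, but now it is not."

# Counter tallies each whitespace-separated token in one pass.
word_counts = Counter(sentence.split())

# Build [count, word] pairs and sort them, mirroring the recipe's counts().
ascending = sorted([count, word] for word, count in word_counts.items())
descending = ascending[::-1]

print("Ascending count:")
print(ascending)
print("Descending count:")
print(descending)
```

If only the top few words are needed, `word_counts.most_common(n)` returns the `n` most frequent `(word, count)` tuples without building the full sorted list by hand.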
