
chinaxuzw's personal space https://www.aboutyun.com/?31226


Python: count how many times each word occurs and find the most frequent word

2461 views | 2015-12-30 09:52 | Category: python&go | Tags: python, words, statistics

#coding=utf-8

# Given a plain-English text, count how many times each word occurs and find the word that occurs most often
import re

def ConvertAllNoLetterToSpace(inString):
    # Replace every run of non-letter characters with a single space
    return re.sub(r'[^a-zA-Z]+', ' ', inString)

def ConstructWordsDict(inFile):
    wordsDict = {}
    for line in inFile:
        line = ConvertAllNoLetterToSpace(line)
        listWord = line.split()
        for word in listWord:
            if word not in wordsDict:
                wordsDict[word] = 1
            else:
                wordsDict[word] = wordsDict[word] + 1
    return wordsDict

def FindMostWordInDict(inDict):
    retWord = ''
    retNum = 0
    for word in inDict:
        if inDict[word] > retNum:
            retWord = word
            retNum = inDict[word]
    return retWord, retNum

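if __name__ == '__main__':
    inFilePath = r'D:\eclipse\workspace\PyStudy\words.txt'
    # Open with a context manager so the file is closed once it has been read
    with open(inFilePath) as inFile:
        wordsDict = ConstructWordsDict(inFile)
    retword, retNum = FindMostWordInDict(wordsDict)

    # print() with parentheses so the script also runs under Python 3
    print(retword, retNum)
    print(wordsDict)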
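
For comparison, the same counting can probably be written more compactly with the standard library's collections.Counter. The following is only a minimal sketch, not the method used in this post: the names count_words and most_common_word and the sample lines are made up for illustration. It keeps the script's behaviour of treating words case-sensitively, so "the" and "The" are counted separately.

```python
import re
from collections import Counter


def count_words(lines):
    """Count runs of ASCII letters in an iterable of text lines (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        # findall returns every maximal run of letters, so no manual
        # replacement of punctuation or digits is needed
        counts.update(re.findall(r'[a-zA-Z]+', line))
    return counts


def most_common_word(counts):
    """Return (word, count) for the most frequent word, or ('', 0) if there are none."""
    top = counts.most_common(1)
    return top[0] if top else ('', 0)


# Illustrative input; a file object opened with open() can be passed the same way
sample = ["the cat and the hat", "The end."]
print(most_common_word(count_words(sample)))  # ('the', 2)
```

Because count_words accepts any iterable of strings, it works on an open file as well as on a list. If "The" and "the" should count as the same word, lowercasing each line before matching would be one option.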
    
    
