Search results: resource list
WordFrequenceCount
- Text-based word frequency counter: tallies the words in a text, handles tens of thousands of words, and outputs all results in one pass.
turrenv__fdequency
- Reads text files in the current directory and counts word frequencies.
iwnim
- Reads text files in the current directory and counts word frequencies.
mair
- Reads text files in the current directory and counts word frequencies.
WordCount
- Hadoop-based parallel word frequency counting; input and output settings are described in readMe.md.
pqniexd
- Reads text files in the current directory and counts word frequencies.
Thx-Rhe
- Reads text files in the current directory and counts word frequencies.
svmcls
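- A text classifier written by Li Ronglu. Feature selection can be global or per category; probability estimation supports document-based (Boolean) counts and word-frequency counts; three feature weighting schemes are supported; feature evaluation functions include information gain, mutual information, expected cross entropy, the chi-square (X^2) statistic, weight of evidence for text, and right-half information gain; classification methods include support vector machines (SVM) and k-nearest neighbors (KNN).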
keyword
- Keyword extraction algorithm based on TextRank that takes word frequency, part of speech, and word position into account.
abpszph
- Reads text files in the current directory and counts word frequencies.
romplexity-import
- Reads text files in the current directory and counts word frequencies.
词频分析
- Searches a document for given keywords, counts the occurrences of each, and ranks the keywords.
EMR
- Uses a Bayesian algorithm for text classification and word frequency counting.
Main
- Uses Java string tokenization to count word frequencies in English text and output the results.
str终版
- Counts how many times each word occurs and sorts the words by count.
练习demo
- C++: reads a file, stores the words in a linked list, bubble-sorts them by word frequency, and outputs the result.
pvuxx
- Reads text files in the current directory and counts word frequencies.
课程设计作业
- Segments text into words with a word segmentation package and counts how many times each word occurs.
主题模型
- Segments a given document into words, converts it to a term-frequency matrix, and performs topic classification.
learning-spark-master
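- Applies logistic regression to binary classification, using spam detection as the example (spam or not spam). The text of each file is converted to a feature vector based on word frequency, and a logistic regression model is trained to separate the two classes of messages and judge whether an input test sentence is spam. Spark MLlib (Java). Input: spam.txt, normal.txt, and a test sentence. Output: 1.0 (spam) or 0.0 (normal email).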
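
Several entries in this list describe the same task: read the text files in the current directory, count word frequencies, and sort by count. The Python sketch below shows one way that could look. It is not taken from any of the listed resources; the function names `count_words` and `count_words_in_dir`, the `*.txt` file pattern, and the rule that a word is a run of ASCII letters (optionally with one internal apostrophe) are all assumptions made for illustration.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed tokenization rule: lowercase ASCII letters, with an optional
# internal apostrophe so that "don't" stays one word.
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def count_words(text):
    """Return a Counter mapping each lowercase word in text to its count."""
    return Counter(WORD_RE.findall(text.lower()))


def count_words_in_dir(directory=".", pattern="*.txt"):
    """Sum word counts over every file in directory that matches pattern."""
    totals = Counter()
    for path in sorted(Path(directory).glob(pattern)):
        # Undecodable bytes are skipped rather than raising an error.
        text = path.read_text(encoding="utf-8", errors="ignore")
        totals.update(count_words(text))
    return totals


if __name__ == "__main__":
    # Print the 20 most frequent words, most frequent first.
    for word, n in count_words_in_dir().most_common(20):
        print(f"{word}\t{n}")
```

`Counter.most_common` does the sorting by frequency here; Chinese text would need a word segmentation step in place of the regular expression.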