Search results: resource list
kaggle-quora-question-pairs
- Top 1% solution code for the Kaggle Quora Question Pairs competition (question similarity matching).
CRF++-0.58
- A classic CRF (conditional random field) demo, commonly used in NLP (natural language processing).
Computational Linguistics - Chang Baobao
- Lecture slides for the Peking University course "Computational Linguistics", organized by chapter.
jieba_plus
- Fixes some bugs in jieba word segmentation, including the handling of full-width letters and digits; still being updated.
Natural Language Processing (NLP) - Maximum-Probability Word Segmentation Algorithm
- A maximum-probability word segmentation algorithm for natural language processing (NLP), with detailed documentation.
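
For readers unfamiliar with the technique, maximum-probability segmentation can be sketched as a dynamic program over a unigram word-probability table. This is a minimal illustration, not the code in the package above: the function name `max_prob_segment`, the `unknown_prob` fallback for out-of-dictionary characters, and the toy probabilities are all assumptions made for the example.

```python
import math

def max_prob_segment(text, word_probs, max_len=4, unknown_prob=1e-8):
    """Split text into the word sequence with the highest product of unigram probabilities.

    word_probs maps word -> probability (hypothetical values in the demo below).
    Single characters missing from word_probs fall back to unknown_prob (an assumption).
    """
    n = len(text)
    # best[i] = (log-probability of the best split of text[:i], start index of its last word)
    best = [(0.0, 0)] + [(-math.inf, 0)] * n
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word in word_probs:
                p = word_probs[word]
            elif len(word) == 1:
                p = unknown_prob
            else:
                continue  # multi-character strings must be dictionary words
            score = best[start][0] + math.log(p)
            if score > best[end][0]:
                best[end] = (score, start)
    # Follow the stored start indices backwards to recover the words.
    words = []
    end = n
    while end > 0:
        start = best[end][1]
        words.append(text[start:end])
        end = start
    words.reverse()
    return words

# Toy probabilities, invented for illustration only.
probs = {"有": 0.018, "有意": 0.0005, "意见": 0.001, "见": 0.0002, "分歧": 0.0001}
print(max_prob_segment("有意见分歧", probs))  # ['有', '意见', '分歧']
```

Working in log space avoids underflow on long inputs; a real implementation would estimate the probabilities from a corpus rather than hard-code them.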
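N-gram model word segmentation and statistics algorithms
- N-gram (sometimes also called the N-gram model, N元模型) is a very important concept in natural language processing. In NLP, an N-gram model built from a corpus is typically used to predict or evaluate whether a sentence is plausible. N-grams can also be used to measure how much two strings differ, which is a common technique in fuzzy matching. The article starts from these two uses and goes on to show a range of powerful applications of N-grams in NLP.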
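
The two uses described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the code from the listed resource: the string comparison uses a Dice coefficient over character bigrams, and the sentence scorer is a bigram model with add-one smoothing; the names `ngrams`, `ngram_similarity`, and `BigramModel` are hypothetical.

```python
from collections import Counter
import math

def ngrams(seq, n):
    """Return the list of n-grams (as tuples) in a string or token list."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

def ngram_similarity(a, b, n=2):
    """Dice coefficient over the n-gram multisets of a and b, from 0.0 to 1.0."""
    ga, gb = Counter(ngrams(a, n)), Counter(ngrams(b, n))
    total = sum(ga.values()) + sum(gb.values())
    if total == 0:  # both inputs shorter than n
        return 1.0 if a == b else 0.0
    overlap = sum((ga & gb).values())  # multiset intersection
    return 2.0 * overlap / total

class BigramModel:
    """Bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, sentences):
        self.context = Counter()  # how often each token appears as a left context
        self.bigrams = Counter()
        for tokens in sentences:
            padded = ["<s>"] + list(tokens) + ["</s>"]
            self.context.update(padded[:-1])
            self.bigrams.update(ngrams(padded, 2))
        self.vocab_size = len(set(self.context) | {"</s>"})

    def log_prob(self, tokens):
        """Log-probability of a token sequence; higher means more plausible."""
        padded = ["<s>"] + list(tokens) + ["</s>"]
        return sum(
            math.log((self.bigrams[(prev, word)] + 1) / (self.context[prev] + self.vocab_size))
            for prev, word in ngrams(padded, 2)
        )

print(ngram_similarity("night", "nacht"))  # 0.25 (one shared bigram, "ht", out of 4 + 4)

model = BigramModel([["the", "cat", "sat"], ["the", "dog", "sat"]])
print(model.log_prob(["the", "cat", "sat"]) > model.log_prob(["sat", "cat", "the"]))  # True
```

Add-one smoothing is the simplest choice that keeps unseen bigrams from scoring zero; it is used here only to keep the sketch short.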
Seq2Seq
- Example code for building a Seq2Seq model with an LSTM neural network for natural language processing.
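Research and Implementation of Wikipedia-Based Named Entity Disambiguation_Yang Xue
- Named entity disambiguation involves many key techniques, including feature extraction, ranking, and clustering. As information technology has developed, large volumes of unstructured data have been generated on the web, and extracting useful information from that data has become a problem the NLP field needs to solve.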
wenben
- An introductory text-analysis example written in R; the required packages must be downloaded first.
NamedEntityRecognition
- NamedEntityRecognition
HanLP-master
- NamedEntityRecognition github
img
- A basic NLP program.
inp
- Sample NLP code.
split
- Classifies SMS messages into categories such as spam, important, ordinary, and bulk (group-sent) messages.
ltp4csharp
- A C# implementation of LTP's basic language processing. Download the models from https://ltp.ai and compile the lib yourself for your platform. Compatible with LTP 3.4.0.
GAMSUsersGuide
- GAMS is general-purpose computing software for quickly solving NLP problems (in GAMS, NLP means nonlinear programming, not natural language processing). This book is the GAMS user guide.
jiebacut.py
- Handles Chinese word segmentation with jieba: segments the text and computes word-frequency statistics.
chinese_seg_update
- Chinese word segmentation implemented with the reverse maximum matching method, using a dictionary as the index.
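
As a rough illustration of reverse maximum matching, the sketch below scans the text from right to left and, at each position, takes the longest dictionary word that ends there. It assumes the dictionary is a Python set (hash lookup standing in for the "dictionary as index") and that unmatched single characters become words of their own; the function name and the toy dictionary are hypothetical, not taken from the package.

```python
def reverse_max_match(text, dictionary, max_len=None):
    """Segment text right to left, always taking the longest dictionary word that fits."""
    if max_len is None:
        max_len = max((len(w) for w in dictionary), default=1)
    words = []
    end = len(text)
    while end > 0:
        # Try the longest candidate ending at `end` first, then shrink it.
        for size in range(min(max_len, end), 0, -1):
            candidate = text[end - size:end]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                end -= size
                break
    words.reverse()  # words were collected from the end of the text
    return words

# Toy dictionary, invented for illustration only.
toy_dict = {"研究", "研究生", "生命", "的", "起源"}
print(reverse_max_match("研究生命的起源", toy_dict))  # ['研究', '生命', '的', '起源']
```

On this input, scanning from the left would greedily pick "研究生" first; scanning from the right avoids that particular mistake, which is the usual argument for the reverse variant.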
wiki_100
- 100-dimensional word vectors trained on Chinese Wikipedia.
02727464
- An implementation of the Viterbi algorithm for NLP: it processes a corpus, builds a model, and can then apply syntactic tagging to new text.
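
For context, Viterbi decoding over a hidden Markov model can be sketched as follows. This is a generic textbook formulation, not the code in the package above; the function signature, the two-tag model, and all probabilities are hypothetical values chosen for the example.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely hidden-state sequence."""
    if not observations:
        return 1.0, []
    # Each column maps state -> (probability of the best path ending there, previous state).
    columns = [{
        s: (start_p[s] * emit_p[s].get(observations[0], 0.0), None) for s in states
    }]
    for obs in observations[1:]:
        prev = columns[-1]
        column = {}
        for s in states:
            # Pick the predecessor state that maximizes the path probability into s.
            column[s] = max(
                ((prev[r][0] * trans_p[r][s] * emit_p[s].get(obs, 0.0), r) for r in states),
                key=lambda pair: pair[0],
            )
        columns.append(column)
    # Start from the best final state and follow the back-pointers.
    last = max(states, key=lambda s: columns[-1][s][0])
    prob = columns[-1][last][0]
    path = [last]
    for column in reversed(columns[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return prob, path

# Toy two-tag model with invented probabilities.
states = ["NOUN", "VERB"]
start_p = {"NOUN": 0.8, "VERB": 0.2}
trans_p = {"NOUN": {"NOUN": 0.3, "VERB": 0.7}, "VERB": {"NOUN": 0.6, "VERB": 0.4}}
emit_p = {"NOUN": {"dogs": 0.6, "bark": 0.1}, "VERB": {"dogs": 0.1, "bark": 0.7}}

prob, path = viterbi(["dogs", "bark"], states, start_p, trans_p, emit_p)
print(path)  # ['NOUN', 'VERB']
```

For long sequences the products underflow, so practical implementations usually sum log-probabilities instead of multiplying raw probabilities.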