lkf0217 posted on 2013-2-7 03:55:06

Software Tools for NLP

Software Archive

[*]CMU Artificial Intelligence Repository
[*]Resources Available Through CRL
[*]SIL Computing Resources
[*]Linguistics Tools at the University of Vaasa in Finland
[*]Leeds University, Natural Language Processing Research Group: RESOURCES
[*]ICOT Free Software
[*]Netlib Repository (mirror in Japan)
 
General Information


[*]Sourcebank - a search engine for programming resources.
[*]Resources related to content analysis and text analysis - Software
[*]Some publicly available NLP packages
[*]SAL (Scientific Applications on Linux): Artificial Intelligence

[*]Public Domain Generic Tools: An Overview - a paper written by Tomaz Erjavec
[*]A collection of online interactive CL tools (Computational Linguistics Group, University of Zurich)
[*]The LINGUIST List: Software
[*]The Natural Language Software Registry
[*]Language Software Helpdesk

[*]Frequently Asked Questions

[*]PennTools - Computational Linguistics Resources At Penn.
[*]Parsing Resources
[*]Taggers online, email message containing addresses
[*]Parsers and Taggers Information (by Steven Paul Abney)
[*]Relator Language Processing Resources
[*]Corpus Search Tools
[*]Neural Networks & Statistics: Software
 
Tagger, Morphological Analyzer


[*]A Perl/Tk text tagger
[*]Conexor
[*]Cogilex R&D inc - Makers of expert tools for natural language processing
[*]CLAWS part-of-speech tagger
[*]TnT - Statistical Part-of-Speech Tagging
[*]POS tagger for Spanish
[*]Tagging and Parsing tools
[*]AUTASYS - A Fully Automatic English Wordclass Analysis System
[*]TOSCA/LOB tagger
[*]Relaxation Labelling Based Multi-Tagger
[*]The QTAG Part of Speech Tagger
[*]QTAG: A portable Parts of Speech Tagger
[*]The Alvey Natural Language Tools
[*]The XTAG Project
[*]TreeTagger - a language independent part-of-speech tagger
[*]Xerox Part-of-Speech Tagger
[*]The Edinburgh/Cambridge Morphological Analyser System
[*]Winbrill - An adaptation of Brill’s tagger to Windows 95/98.
[*]Eric Brill’s Part of Speech Tagger
[*]Software Plaza: Brill’s Tagger
[*]Morphy - An integrated tool for German morphology and statistical part-of-speech tagging.
[*]Korean Morphological Analyzer
[*]Natural Language Tools - Japanese morphological analyzer (JUMAN) and parser (KNP) developed by Nagao Lab. at Kyoto University, Japan.
[*]WordSmith Tools - Wordsmith Tools is the Swiss Army knife of lexical analysis - an integrated suite of programs for looking at how words behave in texts. It is intended for linguists, language teachers, and anyone who needs to examine language.

[*]Mike Scott’s Home Page
[*]Oxford University Press

[*]A Lexical Analyzer for HTML and Basic SGML
[*]ARIES Natural Language Tools - Lexical platform for the Spanish language.
 
Stemmer


[*]Porter stemmer
[*]Porter stemmer
[*]Dutch Porter stemmer
[*]IRIS stemmer
[*]Iterated Lovins stemmer
 
Collocation


[*]Xtract - Frank Smadja’s Collocation Extractor.
 
Parser


[*]Malaga - a system for automatic language analysis
[*]Attribute-Logic Engine (ALE) System and Grammars - A freeware logic programming and grammar parsing system.
[*]CG Parser - Natural deduction categorial grammar and lambda-calculus parser.
[*]Head-Corner Parser (by Gertjan van Noord)
[*]A basic parser written to illustrate the bottom up parsing algorithms in Natural Language Understanding, Second Edition
[*]Cass Partial Parser
[*]CHILL: An empirical parser acquisition system using inductive logic programming
[*]ISSCO Tools - Left-head-corner Island Parser Compiler, etc.
[*]Georgetown University Natural Language Processing: Parser Modularity Demo page
[*]PC-PATR: A syntactic parser
[*]IMS Stuttgart: The CUF Web Page - Comprehensive Unification Formalism
[*]Apple Pie Parser - The Apple Pie Parser is a bottom-up probabilistic chart parser that finds the best-scoring parse tree using a best-first search algorithm.
[*]Link Grammar Parser
 
Corpus Tools


[*]WebCorp
[*]Concordances: Producing and Using them
[*]XCES: Corpus Encoding Standard for XML
[*]RST Tool - An RST (Rhetorical Structure Theory) Markup Tool.
[*]RST Annotation Tool
[*]Qwick - corpus browser
[*]Linguistic Annotation - This page describes tools and formats for creating and managing linguistic annotations.
[*]Alembic Workbench - a suite of tools for the analysis of a corpus, along with the Alembic system to enable the automatic acquisition of domain-specific tagging heuristics.
[*]The System Quirk - Workbench for Terminology, Lexicography and Text Analysis.
[*]Multext: Multilingual Text Tools and Corpora
[*]XCorpus - An Environment for Managing Corpus and Multilingual Web Server
[*]The IMS Corpus Toolbox Webpage
[*]Kobe Phoenix Laboratory - Corpus Wizard program.
[*]Concordance - A program for Windows NT 4.0 and Windows 95/98 which makes wordlists, concordances, and Web Concordances from your electronic texts.
[*]MonoConc (concordance program)
[*]MonoConc for Windows (concordance program)
[*]Text Analysis Computing Tools (TACT)
[*]The Lingua Project: The World of MultiLingual Parallel Concordancing (http://prune.loria.fr/~bonhomme/lingua/) - Sentence alignment tool for multilingual corpora.
[*]The Lingua Project: The World of MultiLingual Parallel Concordancing (http://www.loria.fr/exterieur/equipe/dialogue/lingua/)
[*]Textual Corpora and Tools for their Exploration
 
Language Modeling


[*]Maximum Entropy Modeling
[*]Maximum Entropy Modeling Toolkit
[*]CMU-Cambridge Statistical Language Modeling Toolkit
[*]CMU Statistical Language Modeling Toolkit by Roni Rosenfeld

[*]Program
[*]Document

[*]Trigger Toolkit
[*]Simple Good-Turing Smoothing
[*]Smoothing tools software by Joshua Goodman and Stanley Chen
[*]Language modeling tools
[*]Statistical Decision Trees
 
HMM


[*]A HMM mini-toolkit (by Anand Venkataraman)
[*]HMM Software
see also: Exercise: Using a Hidden Markov Model
[*]Discrete HMM Toolkit
[*]Hidden Markov Model (HMM) Toolbox
[*]Meta-MEME: Motif-based Hidden Markov Models of Biological Sequences
 
Language Identification


[*]Ted E. Dunning’s program
[*]Gertjan van Noord’s program
[*]Doug Beeferman’s program
 
FSA Tools


[*]Finite State Utilities
[*]Automata Learning from Theory to Practice

[*]Downloadable Software

[*]Index to finite-state machine software, products, and projects
[*]FSA utilities

[*]FSA Utilities: A Toolbox to Manipulate Finite-state Automata

[*]Grail - a symbolic computation environment for finite-state machines, regular expressions, and other formal language theory objects.
[*]AMoRE - A program for the computation of Automata, Monoids, and Regular Expressions.
 
Speech


[*]HTK: Hidden Markov Model Toolkit
[*]CSLU Toolkit
[*]The Epos Speech Synthesis System
[*]ISIP public domain speech to text system

[*]The ISIP Automatic Speech Recognition Toolkit

[*]CSLU Toolkit (Center for Spoken Language Understanding, Oregon Graduate Institute of Science and Technology)
[*]Computer generation of accent marks
[*]Spoken Natural Language Processing Group Software
[*]CMU Error Analysis Toolkit
[*]Audio Tools
[*]VOICEBOX: Speech Processing Toolbox for MATLAB
 
Mathematical Software


[*]NIST Guide to Available Mathematical Software
 
Statistics


[*]Bayesian inference Using Gibbs Sampling
[*]CoCo - A statistics package for analysis of associations between discrete variables.
 
Machine Learning


[*]Machine Learning Toolbox (MLT)
[*]The Machine Learning Programs Repository
[*]The RIPPER rule learner
[*]mFOIL - An ILP system designed to handle noisy examples.
 
Support Vector Machine


[*]SVMLight
[*]SVM package by William Noble Grundy
[*]Kernel Machines Web Site
 
Information Retrieval & Filtering


[*]seft - a Search Engine For Text
[*]MG - Managing Gigabytes
[*]Isearch - software for indexing and searching text documents.
[*]SMART Software and test collections (Cornell University)

[*]see also SMART links

[*]Doug Oard’s Research Software Page - SMART Modifications
[*]Bow: A Toolkit for Statistical Language Modeling, Text Retrieval, Classification and Clustering
[*]ifile - A general mail filtering system.
[*]IR-STAT-PAK - A program to compute descriptive and analytic statistics for the TREC IR trials.
[*]Yavi - A visual interface to textual information.
[*]Labeled data sets for information extraction
 
String/Pattern Matching


[*]Online Approximate String Matching
[*]Strmat package (exact string matching and suffix trees)
 
Sentence Boundary Detector


[*]SATZ: An Adaptive Sentence Boundary Detector
[*]Adwait Ratnaparkhi’s MXTERMINATOR
 
Clustering/Classification


[*]FCLUSTER - A tool for fuzzy cluster analysis
[*]LNKnet Pattern Classification Software
[*]Principal Direction Divisive Partitioning
[*]k-means clustering
 
WWW


[*]w3mir - HTTP copying and mirroring tool.
[*]HTTrack - The Web mirror utility.
[*]HTML Conversion, Shareware and Freeware
 
Other Tools


[*]German Morphology Browser (online service)
[*]‘mat2D’ Matrix/Vector Library in C
[*]Content Analysis Resources - for quantitative analyses of texts, transcripts, and images.
[*]SNoW learning program
[*]The µ-TBL Homepage - Logic Programming Tools for Transformation-Based Learning
[*]ROOT: An Object-Oriented Data Analysis Framework
[*]CAQDAS Networking Project - Computer Assisted Qualitative Data Analysis Software
[*]Suffix sort
[*]Nb - a graphical user interface for annotating the discourse structure of spoken dialogue, monologue, and text.
[*]GATE - General Architecture for Text Engineering.
[*]TiMBL: Tilburg Memory Based Learner
[*]MtRecode - The Multext character translation program
[*]Evalb - A bracket scoring program. It reports precision, recall, non-crossing brackets, and tagging accuracy for the given data.
[*]The OC1 decision tree software system
[*]IND Version 2.0 - creation and manipulation of decision trees from data
[*]Paai’s text utilities
[*]Shoebox 3.0 for Windows and Macintosh - A database program oriented to the needs of a field linguist’s dictionary.
[*]Teaching materials for statistical NLP by Chris Brew, Language Technology Group, Human Communication Research Centre, University of Edinburgh
[*]Introducing environmentalism and post-fordism into NLP (NeuroTran)
[*]Tools for Estonian Language
[*]Dan Melamed’s Page - Simulated Annealing Program, XTAG morpholyzer post-processors for English Stemming, Good-Turing Smoothing Software, 150 miscellaneous text processing tools, 75 text statistics and bitext geometry tools.
[*]TOOLDIAG: Pattern recognition toolbox
[*]The DN2 Home Page - DN2 is an intelligent, self-relating, free-format database system that accepts data in human text format and retrieves it in response to human requests such as "Where is London?"
[*]Software Announcements
[*]Tools for drawing and graphically editing trees
[*]Paul Nation’s vocabulary programs
[*]Syllable prediction code (a simple Lisp function)
[*]Pratt - a pattern discovery tool
[*]XGobi - A system for multivariate data visualization.
[*]NODElib - Neural Optimization Development Engine library
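
The Evalb entry in the list above mentions bracket precision and recall. As a rough illustration of what labeled bracket scoring measures, here is a minimal Python sketch. It is not Evalb's implementation and ignores its parameter files. The nested-tuple tree format and the names labeled_spans and bracket_scores are assumptions made for this example, and part-of-speech preterminals are left out of the trees so that only phrase brackets are counted.

```python
from collections import Counter


def labeled_spans(tree, start=0):
    """Collect (label, start, end) spans from a tree.

    A tree is a tuple (label, child, ...); a child is either another
    tuple or a plain string standing for one word.
    Returns (list_of_spans, end_position).
    """
    label, children = tree[0], tree[1:]
    spans = []
    pos = start
    for child in children:
        if isinstance(child, str):
            pos += 1  # a word advances the position by one
        else:
            child_spans, pos = labeled_spans(child, pos)
            spans.extend(child_spans)
    spans.append((label, start, pos))
    return spans, pos


def bracket_scores(gold_tree, test_tree):
    """Return (precision, recall) over labeled brackets."""
    gold = Counter(labeled_spans(gold_tree)[0])
    test = Counter(labeled_spans(test_tree)[0])
    matched = sum((gold & test).values())  # multiset intersection
    precision = matched / sum(test.values()) if test else 0.0
    recall = matched / sum(gold.values()) if gold else 0.0
    return precision, recall


# Hypothetical example trees for "the dog saw a cat".
gold = ("S", ("NP", "the", "dog"), ("VP", "saw", ("NP", "a", "cat")))
test = ("S", ("NP", "the", "dog"), ("VP", "saw", "a", "cat"))

precision, recall = bracket_scores(gold, test)
print(precision, recall)
```

In this example the test parse misses the object NP bracket, so every bracket it proposes is correct while one of the four gold brackets goes unfound.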
 