Text mining
From Free net encyclopedia
Text mining, also known as intelligent text analysis, text data mining, unstructured data management, or knowledge discovery in text (KDT), refers generally to the process of extracting interesting and non-trivial information and knowledge (usually converted to metadata elements) from unstructured text (i.e. free text). Text mining is a young interdisciplinary field that draws on information retrieval, data mining, machine learning, statistics and computational linguistics. Because most information (over 80%, by a commonly cited estimate) is stored as text, text mining is believed to have high commercial potential.
Text mining has recently received much attention. Several research groups around the world, as well as the R&D departments of large companies such as IBM and Microsoft, are doing research in the field. One of the largest text mining applications in existence is probably the classified ECHELON surveillance system.
Until recently, websites mostly used text-based lexical searches. Text mining will allow more "semantic" searches. For example, a search for "car company" would return the Toyota company home page even if the page does not explicitly contain the words "car company".
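The difference between a lexical search and a more "semantic" search can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: the two sample pages, the hand-written `CONCEPTS` map, and the function names. A real text mining system would derive such term associations from data rather than hard-code them.

```python
import re

# Hypothetical toy corpus: page title -> page text.
PAGES = {
    "Toyota Company home page": "Toyota builds sedans, trucks and hybrid vehicles.",
    "Gardening tips": "How to grow tomatoes in a small garden.",
}

# Hypothetical hand-written concept map (an assumption for this sketch).
# It links a query phrase to terms that signal the same concept.
CONCEPTS = {
    "car company": {"toyota", "sedans", "trucks", "vehicles"},
}


def tokenize(text):
    """Split text into a set of lowercase alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def lexical_search(query, pages):
    """Return titles of pages whose text contains the query as a literal phrase."""
    q = query.lower()
    return [title for title, text in pages.items() if q in text.lower()]


def semantic_search(query, pages, concepts):
    """Return titles of pages that contain the phrase or any related term."""
    q = query.lower()
    related = concepts.get(q, set())
    hits = []
    for title, text in pages.items():
        # A page matches on the literal phrase or on any concept-related word.
        if q in text.lower() or tokenize(text) & related:
            hits.append(title)
    return hits


if __name__ == "__main__":
    print(lexical_search("car company", PAGES))
    print(semantic_search("car company", PAGES, CONCEPTS))
```

In this toy setup the lexical search finds nothing, because no page text contains the phrase "car company", while the concept-expanded search returns the Toyota page through the related terms. Query expansion from a fixed dictionary is only one simple way to approximate a semantic search and is not claimed to be how any particular product works.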
External links
- Nstein Technologies Text Mining & Business Intelligence Solutions
- UIMA standard An Open, Industrial-Strength Platform for Unstructured Information Analysis and Search
- Text-Mining.org Good reference of the text mining community
- Kmining List of text mining, data mining and KDD scientific conferences
- SAS Text Miner: Suite of tools for discovering and extracting knowledge from text documents
- SPSS: Predictive Text Analytics solutions
- KDNuggets Data Mining, Web Mining, and Knowledge Discovery Guide
- Text mining summit 2006
- Eruditionhome Directory site for data mining and web mining resources
- unstruct.org Latest news about the industry
- Semantic-Knowledge Semantic Search Engine and Text Mining
- TEMIS Text Mining Solutions
- Topicalizer - A text analysis tool
- Text analysis tool
- Textengines Quick guide: Text analysis explained
- Text Mining Search Engine: Specialized search engine for the "text mining" knowledge domain
- Language and Computing (L&C) L&C has ontology-based Natural Language Processing capabilities and advanced mining solutions.
- YALE (Yet Another Learning Environment): free open-source software for knowledge discovery, data mining (including text mining), machine learning, etc. Together with its plugin WordVectorTool, which is also free and open source, YALE offers a complete software environment for many text mining tasks
- GATE (A General Architecture for Text Engineering): free open-source software for natural language processing (NLP), text mining, and information extraction, that partially uses YALE
- Bow: A Toolkit for Statistical Language Modeling, Text Retrieval, Classification and Clustering
- Knewron: Knowledge Discovery Platform: automatic knowledge ETL capabilities, Web Service Platform support, XML result formats, and support for source archival and document linking
See also
- Data mining
- Information retrieval
- Natural language processing
- Computational linguistics
- Business intelligence