Text mining

From Free net encyclopedia

Revision as of 11:01, 12 April 2006

Text mining, also known as intelligent text analysis, text data mining, unstructured data management, or knowledge discovery in text (KDT), refers generally to the process of extracting interesting and non-trivial information and knowledge (usually converted to metadata elements) from unstructured text (i.e. free text). It is a young interdisciplinary field that draws on information retrieval, data mining, machine learning, statistics and computational linguistics.

Because most information (over 80%, by a commonly cited estimate) is stored as text, text mining is believed to have high commercial potential. The field has recently received much attention: several research groups around the world, as well as the R&D departments of large companies such as IBM and Microsoft, are doing research on it. One of the largest text mining applications in existence is probably the classified ECHELON surveillance system.

Until recently, websites mostly used text-based lexical searches, which match only the literal words of a query. Text mining will allow more "semantic" searches. For example, a search for "car company" would return the Toyota company home page even if the page does not contain the words "car company" explicitly.
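The difference between a lexical search and a more "semantic" search can be sketched with a toy example. The following Python is only an illustration under stated assumptions: the page texts, the CONCEPTS table and all function names are invented here, and a real text mining system would derive the related terms from data rather than from a hand-written table.

```python
import re

# Invented sample pages (hypothetical text, not taken from any real site).
PAGES = {
    "toyota": "Toyota Motor Corporation official home page: vehicles, hybrids and dealers",
    "bakery": "Fresh bread and pastries baked daily by a family company",
}

# Hypothetical hand-written knowledge: query phrase -> terms that signal the concept.
CONCEPTS = {
    "car company": {"toyota", "vehicles", "automaker", "motor"},
}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def lexical_search(query, pages):
    """Return ids of pages whose text contains the query phrase verbatim."""
    q = query.lower()
    return [pid for pid, text in pages.items() if q in text.lower()]


def semantic_search(query, pages, concepts):
    """Return ids of pages containing the phrase or any term linked to it."""
    q = query.lower()
    related = concepts.get(q, set())
    hits = []
    for pid, text in pages.items():
        if q in text.lower() or tokenize(text) & related:
            hits.append(pid)
    return hits


lexical_hits = lexical_search("car company", PAGES)
semantic_hits = semantic_search("car company", PAGES, CONCEPTS)
```

With this sample data the lexical search finds nothing, because no page contains the literal phrase, while the concept-expanded search returns the Toyota page because its text shares terms with the "car company" concept.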
