KCleaner: the author's own tool for cleaning keyword lists
Refunds: 0
Uploaded: 23.07.2010
Content: kcleaner_1_0.rar 122,48 kB
Product description
KCleaner is a command-line utility for Windows that cleans keyword lists using a list of stop words.
I am the author of this program.
Its main difference from other tools of this kind is the ability to process large volumes
of data while staying fast. For example, on a fairly modest hardware configuration
(AMD Sempron 2500 1.4GHz, 512 MB RAM), the utility processes a keyword file
containing ~500,000 keys against a stop-word file containing ~50,000 words
in 7-8 seconds.
kCleaner.exe takes 4 parameters:
- The input file containing the complete list of keywords to filter (one per line);
- The input file of stop words (one per line; each entry must be
a single word, not a phrase, that is, it must contain no spaces!);
- The output file that will contain the list of "good" keywords
(keywords that passed the stop-word filter). The file is created during the run.
- The output file that will contain the list of "bad" keywords (keywords that did not
pass the stop-word filter). This file is also created during the run.
Both input files must be encoded in Windows-1251.
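If the keyword or stop-word lists are stored in another encoding, they need to be converted first. Below is a minimal Python sketch of such a conversion. It is not part of KCleaner; the function names are hypothetical, and it assumes the source files are UTF-8 (with or without a BOM).

```python
def to_cp1251(data: bytes, source_encoding: str = "utf-8-sig") -> bytes:
    """Re-encode text bytes as Windows-1251 (Python codec name: cp1251)."""
    # "utf-8-sig" decodes plain UTF-8 and also drops a leading BOM if present.
    text = data.decode(source_encoding)
    # Characters that Windows-1251 cannot represent are replaced with "?".
    return text.encode("cp1251", errors="replace")


def convert_file(src_path: str, dst_path: str,
                 source_encoding: str = "utf-8-sig") -> None:
    """Convert a whole file; binary mode keeps line endings untouched."""
    with open(src_path, "rb") as src:
        converted = to_cp1251(src.read(), source_encoding)
    with open(dst_path, "wb") as dst:
        dst.write(converted)
```

For example, convert_file("in_utf8.txt", "in.txt") would produce a Windows-1251 copy of a hypothetical in_utf8.txt. Replacing unrepresentable characters with "?" is a design choice of this sketch; passing errors="strict" instead would raise an error so that such lines can be fixed by hand.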
Example utility call:
> Kcleaner.exe in.txt stop.txt good.txt bad.txt
where
in.txt - the input file with keywords / phrases,
stop.txt - the input file with stop words,
good.txt and bad.txt - the names of the output files for the "good" and "bad"
keywords, respectively.
The utility works by going through the keywords and filtering them
by the following rule: if a keyword contains any stop word (as a separate
word, not as a substring of a word!), it goes to the output
file of "bad" keywords. If the keyword contains none of the stop words,
it goes to the file of "good" keywords.
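The filtering rule described above can be illustrated with a short Python sketch. This is not KCleaner's actual code, which is not shown here; the function names are hypothetical, and the sketch assumes that words are separated by whitespace and compared exactly (how the utility treats letter case and punctuation is not stated).

```python
def split_keywords(keywords, stop_words):
    """Return (good, bad): a keyword is "bad" if any of its
    whitespace-separated words is equal to a stop word."""
    # A set gives constant-time lookups, which matters for large stop lists.
    stop_set = {w.strip() for w in stop_words if w.strip()}
    good, bad = [], []
    for keyword in keywords:
        # Whole-word comparison: "free" does not match inside "freeware".
        if any(word in stop_set for word in keyword.split()):
            bad.append(keyword)
        else:
            good.append(keyword)
    return good, bad


def filter_files(in_path, stop_path, good_path, bad_path, encoding="cp1251"):
    """Mirror the four-parameter call on Windows-1251 files."""
    with open(in_path, encoding=encoding) as f:
        keywords = [line.strip() for line in f if line.strip()]
    with open(stop_path, encoding=encoding) as f:
        stop_words = [line.strip() for line in f if line.strip()]
    good, bad = split_keywords(keywords, stop_words)
    with open(good_path, "w", encoding=encoding) as f:
        f.writelines(k + "\n" for k in good)
    with open(bad_path, "w", encoding=encoding) as f:
        f.writelines(k + "\n" for k in bad)


good, bad = split_keywords(
    ["buy phone", "free phone download", "freeware list"],
    ["free", "download"],
)
# good == ["buy phone", "freeware list"]
# bad == ["free phone download"]
```

Note that "freeware list" stays among the "good" keywords: the stop word "free" occurs in it only as a substring, not as a separate word.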
The utility was tested on Windows XP, but it will most likely work
on other versions of Windows as well.