|
You can use the Statistics functions to analyse your texts and create wordlists.

Figure 1: the Statistics Menu.
The first function is the Collocates list, shown in Figure 2.

Figure 2: the Collocates View Dialog

The second menu item is the Word Profile, which works on an open file or on the files in a directory. It provides the total word count of a file together with the percentage of that total accounted for by a specified word. After you specify the word to profile, the Word Profile dialog is displayed while the words are being counted, as illustrated in Figure 3.

Figure 3: The Word Profile dialog
When the word count is finished, the word is displayed as a percentage of the total, as shown in Figure 4.

Figure 4: The Word Profile final count
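The calculation behind the Word Profile can be sketched in Python as follows. This is only an illustration and not the program's own code: the function name `word_profile` and the rule for what counts as a word (a run of letters, optionally with internal apostrophes) are assumptions.

```python
import re


def word_profile(text, target):
    """Return (total_words, target_count, percentage) for `target` in `text`."""
    # Assumption: a word is a run of letters, optionally joined by apostrophes.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    total = len(words)
    count = words.count(target.lower())
    # Guard against an empty text to avoid dividing by zero.
    percentage = 100.0 * count / total if total else 0.0
    return total, count, percentage


total, count, pct = word_profile("The cat sat on the mat. The end.", "the")
print(f"{count} of {total} words ({pct:.1f}%)")  # prints "3 of 8 words (37.5%)"
```

A different tokenisation rule (for example, one that treats hyphenated forms as single words) would give slightly different totals, so the figures reported by the program may not match this sketch exactly.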
The third menu item is the Unique Words List, illustrated in Figure 5. This provides a total word count and a count of the word types (unique words), together with the number of instances and the frequency of occurrence (expressed as a percentage of the total) for each unique word. The illustration shows the Unique Words List for the Hong Kong Corpus of Spoken English after the Frequency Sort button has been clicked.

Figure 5: The Unique Words View Dialog

Finally, the fourth menu item is the 'Match Texts with Word Lists' function, which compares all the words in an open file with two word lists. The file is printed at the end as an HTML document, and should not be longer than about 100,000 words. Two word lists are provided with the program: the first (mfwords.txt) contains the 2000 most frequent words in English, and the second (mfwords2-5K.txt) contains the next most frequent words, from 2,000 up to 5,000. These lists were created using the Brown Corpus, which consists of about one million words. Words in List 1 are printed in black, words in List 2 in red, and words in neither list in blue. The result for the file 'langtch.txt', also provided with the program, is shown in Figure 8.

Figure 8: The display for the 'Match Texts with Word Lists' function

The text of Word List 1 (mfwords.txt), which comprises the 2000 most frequent words, is shown in Figure 9.

Figure 9: Word List 1 (mfwords.txt)

This function is the same as the Text Analyser function at http://www.edict.com.hk, and can be used in conjunction with the Lookup in Net Dictionary function if you have an internet connection.
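The colour-coding performed by 'Match Texts with Word Lists' can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the program's implementation: the function name `match_text` is hypothetical, the two word lists are passed in as sets of lowercase words (the on-disk format of mfwords.txt is not described here, so loading is left out), and a word is assumed to be a run of ASCII letters.

```python
import html
import re


def match_text(text, list1, list2):
    """Return an HTML fragment with each word coloured by list membership."""
    parts = []
    # Alternate between word tokens (group 1) and everything else (group 2),
    # so that spacing and punctuation are preserved in the output.
    for m in re.finditer(r"([A-Za-z]+)|([^A-Za-z]+)", text):
        word, other = m.group(1), m.group(2)
        if word is None:
            parts.append(html.escape(other))
            continue
        key = word.lower()
        if key in list1:
            colour = "black"  # found in List 1
        elif key in list2:
            colour = "red"    # found in List 2
        else:
            colour = "blue"   # found in neither list
        parts.append(f'<span style="color:{colour}">{word}</span>')
    return "".join(parts)


# Tiny stand-in lists for demonstration only; the real lists are far longer.
fragment = match_text("The cat <purrs>", {"the"}, {"cat"})
print(fragment)
```

Using sets for the two lists keeps each membership check fast, which matters for texts approaching the suggested 100,000-word limit.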
|