PreLab 08

Due in class on Wednesday, November 1

In Lab 8 you will create two programs. In the first program you will ask the user for the name of a text file and a number n (which is typically 10 or so). Your program will then print the text file to the screen, omitting the n most common words. This is called "distilling" the text; it is a useful step if you are trying to get a computer to understand text. In the second program you will ask the user for strings and look for anagrams of those strings made up of words from a dictionary file. An anagram is a rearrangement of the letters of the string. For example, if the string is "oberlin student" one anagram is "let none disturb".
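To make the definition of an anagram concrete, here is a minimal Python sketch that checks whether two strings are rearrangements of each other. The function name is_anagram is hypothetical, and the choice to ignore spaces and letter case is an assumption made to match the example above; the lab itself may specify different rules.

```python
from collections import Counter

def is_anagram(first, second):
    """Return True if the two strings use exactly the same letters.

    Assumption: spaces are ignored and letters are compared
    case-insensitively, as in the "oberlin student" example.
    """
    letters_first = Counter(first.replace(" ", "").lower())
    letters_second = Counter(second.replace(" ", "").lower())
    return letters_first == letters_second

print(is_anagram("oberlin student", "let none disturb"))
```

This only tests a candidate anagram; finding anagrams built from dictionary words is the harder problem that the lab addresses.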

This prelab deals only with the distill program. Here is a sample of text:

This is a sample of text.
Sometimes words are repeated.
Sometimes words are unique.
The text is distilled by removing the
words that are the most frequently repeated.

Somewhere on your own paper make a table of the words in this text sample. You don't need to hand in this table. Do hand in answers for the following questions.

  1. How many words appear in this sample exactly 3 times? Exactly twice?
  2. List the words that appear more than once in order of how frequently they appear. If two words have the same frequency you can list them in either order.
  3. Rewrite the text sample omitting the three most frequently used words in the sample.

That is all for the prelab, but spend some time thinking about how you will code this. How will you find the words in a text file and count their frequencies? After you have done that, suppose I give you a number n. How will you make a list of the n most frequent words in the file? After you have done that, how will you print the file omitting the n most frequent words?
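One possible way to organize these three steps is sketched below in Python. It is only a sketch under stated assumptions, not the required design: the names normalize, count_words, and distill are hypothetical, and it assumes that words are separated by whitespace, that surrounding punctuation is not part of a word, and that "The" and "the" count as the same word. It works on a list of lines rather than a file so that it is easy to try on small inputs.

```python
import string
from collections import Counter

def normalize(token):
    # Assumption: strip surrounding punctuation and ignore case,
    # so "Text." and "text" are treated as the same word.
    return token.strip(string.punctuation).lower()

def count_words(lines):
    # Step 1: find the words and count how often each one appears.
    counts = Counter()
    for line in lines:
        for token in line.split():
            word = normalize(token)
            if word:
                counts[word] += 1
    return counts

def distill(lines, n):
    # Step 2: collect the n most frequent words.
    counts = count_words(lines)
    skip = {word for word, _ in counts.most_common(n)}
    # Step 3: rebuild each line, leaving out the words to skip.
    return [
        " ".join(tok for tok in line.split() if normalize(tok) not in skip)
        for line in lines
    ]

for line in distill(["a b a", "c a b."], 1):
    print(line)
```

To apply this to a file, the lines could be read with something like `open(name).read().splitlines()`, where name is the file name supplied by the user. Note that when several words are tied in frequency, which of them ends up among the n most frequent depends on the order in which they were first seen.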

Honor Code

If you followed the Honor Code in this assignment, write the following sentence attesting to the fact at the top of your homework.

I affirm that I have adhered to the Honor Code in this assignment.