You can download the assignment instructions by clicking on this link
Random Seeds and Randomizing the Data Set
The purpose of using a random seed is to be able to replicate the "randomness" created by the (pseudo-)random number generator used by the computer. That is, when you specify a random seed, the generator produces exactly the same sequence of numbers, in the same order, as on any other run that used the same seed. Different random seeds will generate different sequences of "random" numbers.
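To make this concrete, here is a minimal Python sketch using the standard library's random.Random class. The helper name draw and the seed values 42 and 43 are arbitrary choices for illustration and are not part of the assignment.

```python
import random


def draw(seed, count=5):
    """Return the first `count` integers from a generator seeded with `seed`."""
    rng = random.Random(seed)  # independent generator; leaves the global one untouched
    return [rng.randint(0, 99) for _ in range(count)]


first = draw(42)
second = draw(42)
other = draw(43)

print(first == second)  # True: same seed, same sequence
print(first, other)     # compare the sequences from two different seeds
```

Creating a separate random.Random(seed) object, rather than calling random.seed(), keeps the sequence isolated from any other code that also draws random numbers. Either approach is reproducible as long as the seed and the order of calls stay the same.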
Why are we concerned about this? In science, all experiments should be reproducible. By specifying a random seed, we make it possible for our program to still use randomness to split the training and test sets (so that each instance is just as likely as any other to be used for training or testing), and yet we can also verify the results of the experiment by running it again to duplicate its results.
This means that if you run your program a second time with a particular random seed (for a given algorithm and data set), the output should be exactly the same as the first time you ran it with the same random seed. Whether your program does indeed produce the same results each time the same random seed is used might be considered when grading your assignment.
Following are some code snippets that you might use to randomly shuffle a list based on a random seed:
import random

random.seed(yourRandomSeed)      # seed the global generator
shuffled = list(yourInstances)   # copy, so the original order is preserved
random.shuffle(shuffled)         # shuffle the copy in place
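import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// "Instance" stands in for whatever element type your list holds.
Random rng = new Random(yourRandomSeed);
List<Instance> shuffled = new ArrayList<>(yourInstances);  // copy, so the original order is preserved
Collections.shuffle(shuffled, rng);                        // shuffle the copy using the seeded generator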
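Building on the Python shuffle snippet above, the following sketch shows one way a seeded shuffle could feed a training/test split, and how you might check that two runs with the same seed agree. The function name split_instances, the 75% training fraction, the seed 12345, and the stand-in data are all assumptions made for illustration; use whatever your assignment instructions specify.

```python
import random


def split_instances(instances, seed, train_fraction=0.75):
    """Shuffle a copy of `instances` using `seed`, then return (train, test)."""
    rng = random.Random(seed)
    shuffled = list(instances)  # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    cutoff = int(len(shuffled) * train_fraction)
    return shuffled[:cutoff], shuffled[cutoff:]


instances = list(range(20))  # stand-in for real data instances

train_a, test_a = split_instances(instances, seed=12345)
train_b, test_b = split_instances(instances, seed=12345)

print(train_a == train_b and test_a == test_b)  # True: same seed, same split
print(len(train_a), len(test_a))                # 15 5
```

Note that reproducibility also depends on the instances arriving in the same order each time. If you read them from a file, the order in which they are loaded must be the same on every run before the shuffle is applied.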