CSCI 241 - Homework 2:
Shell Scripting

Due by 11:59.59pm Tuesday, February 21, 2017 (originally Sunday, February 19, 2017)

You may work with a partner on this assignment. If you choose to do so, it is expected that you work together and equally contribute to the development of your solution. Also, you are both responsible for understanding how your solution works. You need only submit one assignment per group, but clearly indicate your partnership in the README and comments for files.

Introduction

For this assignment you will be creating a number of shell scripts.

Part 1 - URL Testing

Write a shell script called testurl.sh that reads a list of URLs from a separate file and tests whether each website is up. You might find it useful to check out the curl, wget and tail commands.

rhoyle@clyde$ cat urls
http://cs.oberlin.edu/~ncare/cs241/labs/lab8.html
https://occs.cs.oberlin.edu/~rhoyle/17s-cs241/assignments/hw02.html
http://no.such.url
http://occs.cs.oberlin.edu
rhoyle@clyde$ ./testurl.sh u
Not found: http://no.such.url

This script should also handle errors. If the user doesn't provide any URLs to the script, it should print a usage message.
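
One possible shape for the core of this script is sketched below. It assumes curl is installed; url_is_up and check_urls are hypothetical helper names, and the 10-second timeout is an arbitrary choice. wget would work just as well.

```shell
# url_is_up URL: succeed if the URL can be fetched.
# (Hypothetical helper name; assumes curl is available.)
url_is_up() {
    # -s: no progress output, -f: treat HTTP errors (400 and up) as failure,
    # -o /dev/null: discard the page body, --max-time: give up after 10 seconds
    curl -s -f -o /dev/null --max-time 10 "$1"
}

# check_urls FILE: print "Not found: URL" for every URL in FILE that fails.
check_urls() {
    if [ $# -ne 1 ] || [ ! -r "$1" ]; then
        echo "Usage: testurl.sh URLFILE" >&2
        return 1
    fi
    local url
    # The extra [ -n "$url" ] keeps a final line that lacks a trailing newline.
    while IFS= read -r url || [ -n "$url" ]; do
        [ -z "$url" ] && continue          # skip blank lines
        url_is_up "$url" || echo "Not found: $url"
    done < "$1"
}
```

A script built on this would end with something like check_urls "$@". Whether a redirect or a slow server counts as "up" is a design decision the sketch does not settle.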


Part 2 - Back it up a step

Next, I want you to create a script called backup.sh. The script should take as arguments a directory to back up into (with "./backup" as the default) and a list of one or more files to copy to the backup directory.

Your script should only copy a file if its timestamp is more recent than that of the copy already in the backup directory when the script is run. You might find it helpful to check bash's test (i.e. [ ]) syntax. Additionally, you should make your script executable using chmod. That is, the command should be runnable as follows:

$ backup.sh
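
The timestamp check can be sketched with the -nt ("newer than") operator of test. The helper names needs_backup and backup_file are hypothetical, and how the script tells the directory argument apart from the file arguments is left open here.

```shell
# needs_backup SRC DEST: succeed if SRC should be copied over DEST.
# (Hypothetical helper names; only the timestamp logic is shown.)
needs_backup() {
    local src=$1 dest=$2
    # Copy when there is no backup yet, or when the source is newer.
    [ ! -e "$dest" ] || [ "$src" -nt "$dest" ]
}

# backup_file SRC DIR: copy SRC into DIR if the backup is missing or stale.
backup_file() {
    local src=$1 dir=$2
    local dest
    dest="$dir/$(basename "$src")"
    mkdir -p "$dir"
    if needs_backup "$src" "$dest"; then
        # -p preserves the modification time, so an unchanged file
        # is not copied again on the next run.
        cp -p "$src" "$dest"
    fi
}
```

For example, backup_file notes.txt ./backup copies notes.txt only if ./backup/notes.txt is missing or older.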

For extra credit, your script should keep a list of the five most recent backup directories and store copies as symlinks.


Part 3 - Diskhogger

Third, I want you to create a shell script called diskhog.sh that lists the 5 largest items (files or folders) in the current directory in decreasing order of size. You should output the sizes in a human-readable format like so:

% cd ~/pub/cs241
% ./diskhog.sh
3.9M week03
572K old
348K hw06
152K week06
112K week05

Check out the man pages for du, cut, sort, xargs and head (or tail).
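
One way these tools can be chained is sketched below. The function name largest_items is hypothetical, and the sketch assumes an xargs that supports -0 (GNU and BSD versions do). It ignores hidden files and does not handle an empty directory or file names containing newlines.

```shell
# largest_items [N]: list the N largest items in the current directory
# (default 5), largest first, with human-readable sizes.
largest_items() {
    local count="${1:-5}"
    # du -sk       one summary line per item, sizes in KiB so they sort numerically
    # sort -rn     largest first
    # head -n      keep only the top N lines
    # cut -f2-     drop the numeric size, keeping the name
    # tr / xargs   pass the names, NUL-separated, to a second du
    # du -sh       report those items again with human-readable sizes
    du -sk -- * | sort -rn | head -n "$count" | cut -f2- |
        tr '\n' '\0' | xargs -0 du -sh --
}
```

Sorting on the raw KiB values first avoids having to compare strings such as 3.9M and 572K directly.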

For extra fun, have your script take a flag to change the number of items to display and another to limit it to files or directories.


Part 4 - linecount

Create a shell script called linecount that by default will report the total number of lines in all of the files in the current working directory (recursively).

If a glob is specified, use it as a filter for the files to scan. So, if you wanted to know how many lines of Java code were in your folder, you might run:

% ./linecount '*.java' 

You'll want to take a look at wc, pushd/popd or cd, find, and test.
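
As a rough sketch of the idea, find can select the files and wc can total their lines. The function name count_lines is hypothetical. Note that the glob is quoted on the command line so that find sees it, rather than the shell expanding it against the current directory first.

```shell
# count_lines [GLOB]: total number of lines in all regular files under the
# current directory whose names match GLOB (default: every file).
count_lines() {
    local pattern="${1:-*}"
    # -type f skips directories; -exec cat {} + concatenates the matching
    # files so that wc -l prints one grand total.
    find . -type f -name "$pattern" -exec cat {} + | wc -l
}
```

Called as count_lines '*.java', this counts only the Java files, including those in subdirectories.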

Part 5 - Retro-grade Scripting

I want you to write a script called gradeit.sh that will test your pyramid and rot128 submissions for lab 1.

The script should analyze a student's submissions for correctness and warn if the output of a program differs from that of the reference implementation, which is located in ~/pub/cs241/lab1.

You must decide what to test for. You will be graded on how thorough your test is. Explain, in comments in your script, what you are testing for and why you are running that particular test. After your script has finished, you should clean up any temporary files created by the testing process.

You'll want to take a look at wc, pushd (and popd), find, and diff.
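
A single test case might be structured along these lines. This is only a sketch: compare_output is a hypothetical helper name, and it assumes that both programs read their input from standard input, which may not match how pyramid and rot128 are actually invoked.

```shell
# compare_output REF PROG INPUT: run the reference program and the submission
# on the same input file and warn if their outputs differ.
compare_output() {
    local ref=$1 prog=$2 input=$3
    local expected actual status=0
    expected=$(mktemp) || return 2
    actual=$(mktemp) || { rm -f "$expected"; return 2; }

    # Capture both stdout and stderr so that error messages are compared too.
    "$ref"  < "$input" > "$expected" 2>&1
    "$prog" < "$input" > "$actual" 2>&1

    if ! diff "$expected" "$actual" > /dev/null; then
        echo "WARNING: output differs for input $input"
        status=1
    fi

    # Clean up the temporary files whether or not the outputs matched.
    rm -f "$expected" "$actual"
    return "$status"
}
```

A full grading script would call a helper like this once per test input, with a comment beside each call explaining what that input is meant to exercise.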


Part 6 - Data file analysis

I often find myself using shell tools to answer questions about a data file that I'm working on. Here is a data file from a machine learning dataset that I'd like you to download and unzip: adult.data.zip. The fields in the data set are described at http://archive.ics.uci.edu/ml/datasets/Adult.

Answer the following questions in your README file (and give the commands used to find the answer):

  1. How many entries are marked "Male" and how many are marked "Female"?
  2. The last column is the label that is applied to the entry. How many of each label type are there?
  3. Give the counts for each label used for "race" in decreasing order.
  4. Give the counts for a combined "race"/"sex" attribute in decreasing order.

Potentially useful commands to look at include cut, sort, and uniq. If you include the commands you used to generate your answers, it might be possible to give you partial credit. Once you have answered the questions, you should delete the adult.data and adult.data.zip files so that you don't hand them in.
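
The usual pattern for this kind of question is to extract a column, sort it so that equal values become adjacent, and count the runs. The sketch below shows that pattern on any comma-separated file. count_column is a hypothetical helper name, and the column number to pass for a given attribute has to be looked up in the field description.

```shell
# count_column FILE N: count how often each value occurs in comma-separated
# column N of FILE, most frequent value first.
count_column() {
    local file=$1 column=$2
    # cut -d, -f N   extract the N-th comma-separated field
    # sort           group identical values together (uniq needs this)
    # uniq -c        collapse each group into "count value"
    # sort -rn       order by count, largest first
    cut -d, -f"$column" "$file" | sort | uniq -c | sort -rn
}
```

Passing a list such as 2,3 as the column argument makes cut keep both fields, which is one way to count a combined attribute.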

Extra Credit

Programming Hints


handin

README

Create a file called README that contains

  1. Your name
  2. A description of the programs
  3. Your answers to the "Data File Analysis" questions and commands
  4. An estimate of the amount of time it took to complete each part
  5. Any known bugs or incomplete functions
  6. Any interesting design decisions you'd like to share

Now you should clean up your folder (remove test case detritus, etc.) and hand in your folder containing your scripts and README.

% cd ~/cs241
% handin -c 241 -a 2 hw2

% lshand

Grading

Here is what I am looking for in this assignment:


Last Modified: February 12, 2017 - Roberto Hoyle and Nick Care. Some material based on work by Benjamin Kuperman.