CSCI 151 - Prelab 7: Picking the very best webpage

Due in class on Monday, April 3

In this prelab, you will familiarize yourself with some of the design and implementation issues in the upcoming Lab 7. Please write or type up your solutions, and hand in a paper copy before class on Monday. Remember, late prelabs receive zero credit.

Overview

In the previous lab, you created a WebPageIndex class that represents the data from a single document (either a local file or a URL). In this lab, you will create a collection of those indexes and then determine which page best matches what a user is searching for.

MyPriorityQueue

In this section of the lab, you will implement your own version of a priority queue based on a binary heap. To begin, look over the Java documentation for java.util.PriorityQueue<E> and for java.util.AbstractQueue<E>, which it extends.
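As a quick orientation to the behavior you will be reproducing, here is a minimal sketch that uses the standard java.util.PriorityQueue (not your own MyPriorityQueue). The class name PriorityQueueDemo and the helper drain are made up for illustration; offer, peek, poll, and isEmpty are the real library methods.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical demo class; only the java.util calls come from the library.
public class PriorityQueueDemo {
    // Removes every element from the queue and returns them in removal order.
    static List<Integer> drain(PriorityQueue<Integer> pq) {
        List<Integer> out = new ArrayList<>();
        while (!pq.isEmpty()) {
            out.add(pq.poll()); // poll() removes and returns the head
        }
        return out;
    }

    public static void main(String[] args) {
        // With no comparator, the smallest element is the head.
        PriorityQueue<Integer> minFirst = new PriorityQueue<>();
        // With a reversed comparator, the largest element is the head.
        PriorityQueue<Integer> maxFirst = new PriorityQueue<>(Comparator.reverseOrder());
        for (int x : new int[] {5, 1, 4, 2}) {
            minFirst.offer(x);
            maxFirst.offer(x);
        }
        System.out.println(minFirst.peek()); // peek() returns the head without removing it
        System.out.println(drain(minFirst)); // [1, 2, 4, 5]
        System.out.println(drain(maxFirst)); // [5, 4, 2, 1]
    }
}
```

Notice that the comparator alone decides which element counts as the head; the queue itself does not change.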

Bridging the generational gap

You will be using an ArrayList as an efficient implementation of a complete binary tree. To do so, you will need to be able to move up and down the tree, that is, from a node to its parent or to its children.

  1. Assuming that your ArrayList is a class variable named "heap", answer the following:

    1. At what index will you store the root of the tree?
    2. At what index is the parent of the node at index i?
    3. At what index is the left child of the node at index i?
    4. At what index is the right child of the node at index i?

Comparators

In order to make these heaps work, you will need to create Comparators of various sorts. Begin by looking over the documentation for java.util.Comparator<T>. Pay special attention to the compare() method you are required to implement.

  1. Give the Java code for a comparator class StringComparator that compares two Strings without regard to case. Hint: you might want to let String's compareToIgnoreCase() method do all the heavy lifting.
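As a reminder of the general shape of a Comparator class (this is a different comparison from the one asked for above), here is a minimal sketch that orders Strings by length. The class name LengthComparator is made up for illustration.

```java
import java.util.Comparator;

// Hypothetical example class: orders Strings by length only.
public class LengthComparator implements Comparator<String> {
    // Returns a negative number if a is shorter than b,
    // zero if they have the same length, and a positive number if a is longer.
    @Override
    public int compare(String a, String b) {
        return Integer.compare(a.length(), b.length());
    }

    public static void main(String[] args) {
        Comparator<String> byLength = new LengthComparator();
        System.out.println(byLength.compare("heap", "queue") < 0); // true
        System.out.println(byLength.compare("tree", "heap") == 0); // true
    }
}
```

Only the sign of the value returned by compare() matters, not its magnitude.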

 

Weighting web queries

In the application portion of this lab, you will be reading in and creating a number of WebPageIndex objects (from Lab 6), storing them in your heap, and then processing user search queries on those objects. You should review the WebPageIndex methods before answering the following questions.

Simple queries

Explain how you compute the "score" of a particular web page given a String that represents a user query of one or more words under the following conditions (pseudocode or just a concise description is fine, but you should refer to the WebPageIndex methods):

  1. Based on just the sum of the word counts of the page for the words in the query.
  2. Based on just the sum of the word counts, but requiring every word in the query to be present in the page.
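Both conditions start from the individual words of the query. A minimal sketch of that first step, splitting a query String into words, might look like the following. The class name QueryWords and the method words are made up for illustration, and converting to lowercase is an assumption; check how your WebPageIndex stores its words before relying on it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for breaking a query into words.
public class QueryWords {
    // Splits a query on runs of whitespace and lowercases each word.
    // Lowercasing is an assumption about how the index stores words.
    static List<String> words(String query) {
        List<String> result = new ArrayList<>();
        for (String w : query.trim().toLowerCase().split("\\s+")) {
            if (!w.isEmpty()) { // an empty query yields one empty token; skip it
                result.add(w);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(words("  Pancakes   bacon ")); // [pancakes, bacon]
    }
}
```

Each word in the resulting list can then be looked up using the WebPageIndex methods you wrote in Lab 6.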

 

Advanced queries

Our WebPageIndex objects allow us to also search for phrases in our web pages.

  1. Explain how you would process a user query to identify phrases that are set off by double quotes, and then score the various pages. For example,

    pancakes "maple syrup" bacon

    is looking for pages that contain the words "pancakes" and "bacon" as well as the phrase "maple syrup".

Last Modified: March 2017 - Bob Geitz