Overview
On a UNIX system, the last command provides information about recent users: when they logged in, how long they stayed, where they came from, and their tty (terminal). Part #1 of this assignment asks you to write ttylast, which uses AWK to produce a version of the last output sorted by tty.
Part #2 asks you to use PERL to interact with a Web site. In particular, you are asked to implement a tool, getimage, to search for and download images from www.flickr.com. The lab is designed to reinforce your understanding of PERL as well as regular expressions. It is also designed to introduce you to using libraries in PERL.
Part 1, ttylast: The Details
The format of the report should exactly match the format of the sample report. Since this script is to be standalone, it might be easier as a shell script that can pipe the results of "last" into awk, rather than a pure awk script. It is important to note that the output of ttylast does not contain multiple uses of a tty by a user -- only the most recent. This makes your life much easier.
In order to keep the assignment fun, you are allowed to use only the awk (awk, nawk or gawk) and last commands -- well, and, of course, a shell.
Some Hints
First, begin by looking at the output from my program. Then, take a look at the output of last. Make sure you understand the name of the game. To help you out, I included a copy of last's output captured from within my script, as well as my script's output.
There are many ways of solving this problem. As a hint, I'll tell you how I did it. As I worked my way through last's output one line at a time, I kept three arrays:
- one of users (by user)
- one of ttys (by tty)
- one of the "whole lines" indexed by their "userid,tty" pair
The first two arrays amount to nothing more than a list of users and a list of ttys, respectively. I cared only about the keys, not the values.
The third array gave me the mapping. By building the "userid,tty" key, I gave myself an easy way to solve the problem. Essentially, I could look up any user,tty pair. Invalid pairs would yield the empty string "". So, if I looked up "gkesden,pts/8" and gkesden didn't use pts/8, the result would be "".
So, to produce the output, I used two nested loops. The outer loop walked through the list of ttys. The inner loop walked through the list of users. The code inside printed out the line of last's output mapped to "user,tty", unless that lookup yielded the empty string ("").
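The bookkeeping described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the three input lines are made up (they are not taken from last.out), the names sample_last, by_tty_sketch, users, ttys and lines are my own, and the sketch prints the stored lines as-is rather than in the required report format. It tests for a missing pair with awk's "in" operator instead of comparing against "", because merely referencing a missing element creates it. The order in which a for-in loop visits keys is unspecified, so any ordering requirement still has to be handled separately.

```shell
#!/bin/sh
# Hypothetical sample in the general shape of the output of last (not real data).
sample_last() {
cat <<'EOF'
alice    pts/1    host-a    Mon Sep 11 10:00 - 10:30  (00:30)
bob      pts/2    host-b    Mon Sep 11 09:00 - 09:45  (00:45)
alice    pts/2    host-c    Sun Sep 10 08:00 - 08:10  (00:10)
EOF
}

by_tty_sketch() {
    awk '
    {
        users[$1] = 1            # only the keys matter
        ttys[$2]  = 1            # only the keys matter
        lines[$1 "," $2] = $0    # whole line, keyed by "userid,tty"
    }
    END {
        for (t in ttys)              # outer loop: every tty seen
            for (u in users) {       # inner loop: every user seen
                key = u "," t
                if (key in lines)    # skip pairs that never occurred
                    print lines[key]
            }
    }'
}

sample_last | by_tty_sketch
```

Because the outer loop runs over ttys, lines for the same tty come out next to each other; which tty comes first is up to the awk implementation.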
The last detail was the annoying extraneous line in last's output:
wtmp begins Tue Sep 12 04:06:14 2006
To avoid this, I guarded the "action" with a pattern so that it doesn't run when the first field contains "wtmp".
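One way such a guard might look, sketched on made-up input: a pattern placed in front of the action decides whether the action runs for a given line. The sample lines and the names sample_with_trailer and skip_wtmp are hypothetical, and the extra NF > 0 test, which also drops empty lines, is my own addition rather than part of the hint.

```shell
#!/bin/sh
# Made-up input: two login lines, an empty line, and the trailing wtmp line.
sample_with_trailer() {
cat <<'EOF'
alice    pts/1    host-a    Mon Sep 11 10:00 - 10:30  (00:30)
bob      pts/2    host-b    Mon Sep 11 09:00 - 09:45  (00:45)

wtmp begins Tue Sep 12 04:06:14 2006
EOF
}

skip_wtmp() {
    # The pattern in front of the braces decides whether the action runs:
    # NF > 0 drops empty lines, and $1 !~ /wtmp/ drops the trailer line.
    awk 'NF > 0 && $1 !~ /wtmp/ { print }'
}

sample_with_trailer | skip_wtmp
```

In a real script the action would do the array bookkeeping instead of a plain print.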
One detail that I haven't given you: how to keep the most recent, rather than the least recent, use. Think about this one a bit. One technique is to reverse the output of last. This can be done with sed. Or, as an alternative, you can use nl to number the lines, sort in reverse, and then whack the line numbers using, for example, awk or cut. Another technique is to use an "if statement" within AWK to add the entry if, and only if, there isn't already one. But you surely don't want to sort in AWK since, so far as I know, there is no built-in sort function.
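The three techniques mentioned above might be sketched like this. Everything here is illustrative: the input is made up, the function names are hypothetical, and the sed expression is a commonly used line-reversal idiom, not necessarily what my script does. The sketch relies on last listing the newest entries first, and the nl variant assumes the lines themselves begin with no tab that cut could mistake for the number separator.

```shell
#!/bin/sh
# Made-up input, newest first, the way last prints it: alice used pts/1 twice.
sample_newest_first() {
cat <<'EOF'
alice    pts/1    host-new    Tue Sep 12 10:00 - 10:30  (00:30)
bob      pts/2    host-b      Mon Sep 11 09:00 - 09:45  (00:45)
alice    pts/1    host-old    Sun Sep 10 08:00 - 08:10  (00:10)
EOF
}

# Technique 1: reverse the lines with sed, so later (newer) lines overwrite.
reverse_with_sed() { sed '1!G;h;$!d'; }

# Technique 2: number every line, sort numerically in reverse, drop the numbers.
reverse_with_nl() { nl -ba | sort -rn | cut -f2-; }

# Technique 3: no reversal; store a line only if its key is not yet present.
keep_first_seen() {
    awk '{
        key = $1 "," $2
        if (!(key in lines)) {   # first time this user,tty pair appears
            lines[key] = $0
            order[++n] = key     # remember the order of first appearance
        }
    }
    END { for (i = 1; i <= n; i++) print lines[order[i]] }'
}

sample_newest_first | reverse_with_sed
sample_newest_first | reverse_with_nl
sample_newest_first | keep_first_seen
```

With either reversal, a plain assignment into the "userid,tty" array ends up holding the most recent line; with the if-guard, the first line seen is already the most recent one.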
Sample Files
Since these files came from a real live unix system, you'll have to get them from AFS -- I don't want them to get crawled on the Web:
- /afs/andrew/course/15/123-kesden/handout/last.out
- /afs/andrew/course/15/123-kesden/handout/ttylast.out
Part 2, getimage: The Details
NAME
    getimage - get images by keyword from www.flickr.com

SYNOPSIS
    getimage [OPTION]... KEYWORD [KEYWORD ...]

DESCRIPTION
    Query Flickr's Web site, by keyword, for images. Download images matching all provided keywords, saving them into the current working directory.

    Saved files maintain their extension, but are named after the keywords, and enumerated, beginning with 0, in the order downloaded. For example, a query via the keywords "pony" and "hat" might result in the creation of the files "pony_hat_0.jpg" and "pony_hat_1.jpg".

    The use of multi-word keywords is not supported. For example, "George Washington" may only be expressed as two and'ed keywords, not an ordered tuple.

    -n n
        Optional argument limiting the number of results to, at most, n images. The default limit, absent this argument, is 10.

    -t target_directory
        Optional argument forcing files to be saved into the target_directory instead of the current working directory.

RETURN VALUE
    Returns 0 on success, non-zero on any other condition. For example, returns 1 if the target directory does not exist.

EXAMPLES
    getimage pony hat
        Download up to 10 images matching both of the keywords "pony" and "hat", save them into the current working directory, and name them using the convention as follows: pony_hat_0.jpg, pony_hat_1.jpg, &c. The limit of at most 10 matching images is the default imposed in the absence of the -n flag, which defines a user-specified limit.

    getimage -n3 pony hat
        As above, but save no more than three (3) matches.

    getimage -n3 -timage_files pony hat
        As above, but place results into the "image_files" subdirectory.
The LWP Library
We'll be using the get() and getstore() functions from the LWP library to interact with the Web site. The get() function requests a page by URL. The resulting object is returned as a string. The getstore() function takes two arguments: the URL, and the name of the file to which the object should be saved on disk.
Implementation Hints
The examples below illustrate these uses of get() and getstore(). In each of the examples, please notice the "use LWP::Simple;" line. It tells PERL that we'll be using the Simple module from within the LWP library. The same form is used to include other libraries in PERL.
Example 1: Returning the page as a string
    #!/usr/bin/perl
    use LWP::Simple;

    # get() returns the body of the page as a string
    my $url = "http://www.google.com";
    print get($url);
Example 2: Storing the page into a file
    #!/usr/bin/perl
    use LWP::Simple;

    # getstore() saves the object at the URL into the named file
    getstore('http://www.cmu.edu', 'cmu.html');
Hints:
- In your browser of choice go to http://www.flickr.com/ and search for something. On the page with your results, what happened to the URL? What happens when you change the URL directly? What happens to the URL when you go to a second page of results?
- Search the Web for "PERL LWP get" and "PERL LWP getstore". Figure out how to specify the name of the output file for your image files.
- Consider what happens if Flickr displays the results on multiple pages: you'll need to traverse the pages. See the first hint for advice on figuring out how to do this.
- Consider the following example HTML image tag:
<img ... src="http://www.foo.com/bar.jpg"/>
Which components of this expression are going to change? Which components will remain static and can therefore be used as landmarks?
- Are all the images on the page related to the search term? What about the banner and buddy icons? Not so much, right? Look at the HTML and see how the page is structured to get an idea of how you can parse it (hint: class=DetailPic).
- Perl is a magical language and the Internet is your friend. Look for pre-built modules that do what you're interested in before reinventing the wheel. The "use" statements below may be useful in your code:
- use Getopt::Long;
- use LWP::Simple;
We're Here To Help!
As always -- remember, we're here to help!