Hack 83 API Searches


Perform reliable searches with GetSearchResults.

In [Hack #17], a Perl script is used to perform an automated eBay search and then email new listings as they're discovered. Although the script serves a valuable function, it has the notable handicap of relying entirely on "scraping" (via the WWW::Search::eBay module) to retrieve its search results.

Scraping involves parsing standard web pages in order to retrieve the desired data. As you might expect, any changes to eBay's search pages, even minor ones, will break the script until the WWW::Search::eBay module on which it relies is updated to work with the new version.

The API, on the other hand, provides an officially supported interface to eBay's search engine, which means that scripts based on the API will be much more robust and nearly invulnerable to changes in eBay's search pages.

8.3.1 A Simple Search

Here's a simple Perl script, search.pl, that performs a search and displays the results.

#!/usr/bin/perl
require 'ebay.pl';

use Getopt::Std;
getopts('d');
$keywords = shift @ARGV or die "Usage: $0 [-d] keywords";

PAGE:                                                  # [1]
while (1) {
  my $rsp =  call_api({ Verb => 'GetSearchResults',    # [2]
                 DetailLevel => 0,
                       Query => $keywords,
         SearchInDescription => $opt_d ? 1 : 0,
                        Skip => $page_number * 100,
  });
  if ($rsp->{Errors}) {
    print_error($rsp);
    last PAGE;
  }
  foreach (@{$rsp->{Search}{Items}{Item}}) {           # [3]
    my %i = %$_;
    ($price, $time, $title, $id) = @i{qw/CurrentPrice EndTime Title Id/};   # [4]
    print "($id) $title [\$$price, ends $time]\n";
  }
  last PAGE unless $rsp->{Search}{HasMoreItems};       # [5]
  $page_number++;
}

Given that searches can return hundreds or even thousands of results, the GetSearchResults API call (line [2]) divides the results into pages, not unlike the search pages at eBay.com. The loop, which begins on line [1], repeatedly resubmits the call, downloading a maximum of 100 results each time, until $rsp->{Search}{HasMoreItems} is false, on line [5]. That means that if there are 768 matching listings, you'll need to retrieve 8 pages, which takes 8 API calls.
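The page arithmetic can be checked with a few lines of Perl. This is only an illustrative sketch: pages_needed and skip_values are hypothetical helper names, not part of ebay.pl or the eBay API, and the script above never needs to know the total in advance because it relies on HasMoreItems instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Number of API calls needed to fetch $total results, $per_page at a time.
# (Hypothetical helper, for illustration only.)
sub pages_needed {
    my ($total, $per_page) = @_;
    return 0 if $total <= 0;
    return int(($total + $per_page - 1) / $per_page);   # ceiling division
}

# The Skip values the loop would send, one per call: 0, 100, 200, ...
sub skip_values {
    my ($total, $per_page) = @_;
    return map { $_ * $per_page } 0 .. pages_needed($total, $per_page) - 1;
}

print pages_needed(768, 100), "\n";               # prints 8
print join(', ', skip_values(768, 100)), "\n";    # prints 0, 100, ..., 700
```

The last call, with Skip set to 700, returns only the remaining 68 listings.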

Your script might not need to retrieve all matching search results, as this one does. Instead, you may be content to search until a single auction is found, or perhaps to search only the auctions that have started in the last 24 hours. See the API documentation for more ways to limit the result set.

For each page that is retrieved, a secondary loop, line [3], iterates through the result set for the current page, extracts relevant data (line [4]), and prints it out.
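To see the extraction step in isolation, here is a minimal sketch that runs the same hash slice against a hand-built data structure instead of a live API response. The field names are the ones used in search.pl; the listings themselves are invented, and format_items is a hypothetical helper, not something ebay.pl provides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Mock of the structure search.pl expects back from call_api.
# The two listings below are made-up sample data.
my $rsp = {
    Search => {
        HasMoreItems => 0,
        Items => {
            Item => [
                { Id => '1111111111', Title => 'Wool mittens, red',
                  CurrentPrice => '4.99', EndTime => '2005-03-15 18:23:45' },
                { Id => '2222222222', Title => 'Wool mittens, blue',
                  CurrentPrice => '7.50', EndTime => '2005-03-16 09:00:00' },
            ],
        },
    },
};

# Build one display line per item, using a hash slice to pull out
# four fields at once (hypothetical helper, for illustration only).
sub format_items {
    my ($rsp) = @_;
    my @lines;
    foreach my $item (@{ $rsp->{Search}{Items}{Item} || [] }) {
        my ($price, $time, $title, $id) =
            @{$item}{qw/CurrentPrice EndTime Title Id/};
        push @lines, "($id) $title [\$$price, ends $time]";
    }
    return @lines;
}

print "$_\n" for format_items($rsp);
```

The slice `@{$item}{qw/.../}` does the same job as copying the item into %i and slicing that, but without the intermediate copy.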

8.3.2 Performing a Search

Run the script to perform a title search, like this:

search.pl keyword

where keyword is the word you're looking for. To search for multiple keywords, enclose them in single quotes, like this:

search.pl 'wool mittens'

Or to search titles and descriptions, type:

search.pl -d 'wool mittens'

But the real beauty of API searches is how they can be used in an automated fashion.

8.3.3 Revising the Robot

Now, if we tie the API search into the script from [Hack #17], we get the following new, more robust search robot script:

#!/usr/bin/perl
require 'ebay.pl';

$searchstring = "railex";
$searchdesc = 0;
$localfile = "search.txt";
$a = 0;

# *** perform search ***
PAGE:
while (1) {
  my $rsp =  call_api({ Verb => 'GetSearchResults',
                 DetailLevel => 0,
                       Query => $searchstring,
         SearchInDescription => $searchdesc ? 1 : 0,
         Skip                => $page_number * 100,
  });
  if ($rsp->{Errors}) {
    print_error($rsp);
    last PAGE;
  }
  $current_time = $rsp->{eBayTime};

  foreach (@{$rsp->{Search}{Items}{Item}}) {
    my %i = %$_;
    $a++;    # results are stored starting at index 1
    ($title[$a], $itemnumber[$a]) = @i{qw/Title Id/};
  }
  last PAGE unless $rsp->{Search}{HasMoreItems};
  $page_number++;
}

# *** eliminate entries already in file ***
open (INFILE,"$localdir/$localfile");
  while ( $line = <INFILE> ) {
    for ($b = $a; $b >= 1; $b--) {
      if ($line =~ $itemnumber[$b]) {
        splice @itemnumber, $b, 1;
        splice @title, $b, 1;
      }
    }
  }
close (INFILE);
$a = @itemnumber - 1;
if ($a == 0) { exit; }

# *** save any remaining new entries to file ***
open (OUTFILE,">>$localdir/$localfile");
  for ($b = 1; $b <= $a; $b++) {
    print OUTFILE "$itemnumber[$b]\n";
  }
close (OUTFILE);

# *** send email with new entries found ***
open(MAIL,"|/usr/sbin/sendmail -t");
  print MAIL "To: $selleremail\n";
  print MAIL "From: $selleremail\n";
  print MAIL "Subject: New $searchstring items found\n\n";
  print MAIL "The following new items have been listed on eBay:\n";
  for ($b = 1; $b <= $a; $b++) {
    print MAIL "$title[$b]\n";
    print MAIL "http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&item=$itemnumber[$b]\n\n";
  }
close(MAIL);

Note that the main difference in the search portion of this script is that the title and item number of each listing are stored in the @title and @itemnumber arrays instead of being printed out.
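The step that eliminates entries already in the file can also be written with a hash lookup rather than nested loops and splice. The following is a sketch of that alternative technique, not the script's own method: new_items is a hypothetical helper, the sample data is invented, and it compares item numbers exactly instead of pattern-matching each line of the file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the items whose Id does not appear in @$seen_ids.
# (Hypothetical helper, for illustration only.)
sub new_items {
    my ($seen_ids, $items) = @_;
    my %seen = map { $_ => 1 } @$seen_ids;
    return grep { !$seen{ $_->{Id} } } @$items;
}

# Item numbers that would normally be read from search.txt (made up here).
my @seen = ('1001', '1003');

# Listings that would normally come from GetSearchResults (made up here).
my @found = (
    { Id => '1001', Title => 'Railex folder, used' },
    { Id => '1002', Title => 'Railex file box'     },
    { Id => '1003', Title => 'Railex binder'       },
    { Id => '1004', Title => 'Railex labels, new'  },
);

print "$_->{Id}: $_->{Title}\n" for new_items(\@seen, \@found);
```

Only items 1002 and 1004 are printed, since the other two are already recorded.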

If you end up scheduling this script as described in [Hack #17], you may not need to retrieve all matching search results each time. Probably the best approach is to set the Order input value to MetaStartSort, which retrieves newly listed items first. Then, assuming you've scheduled your search robot to run every 24 hours, you can stop retrieving results as soon as an auction older than 24 hours is encountered. Use the $yesterday variable, introduced in [Hack #85], to do your date calculations.
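The early-exit idea might look something like the sketch below. It rests on several assumptions that the text above does not confirm: that each item carries a StartTime field, that timestamps are GMT strings of the form YYYY-MM-DD HH:MM:SS (so they compare correctly as plain strings), and that results arrive newest first. The cutoff computed here merely stands in for $yesterday from [Hack #85]; cutoff_24h and take_new_items are hypothetical helpers, and the listings are invented.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);

# Timestamp for 24 hours before $now, as "YYYY-MM-DD HH:MM:SS" in GMT.
# (Assumed format; adjust to whatever the API actually returns.)
sub cutoff_24h {
    my ($now) = @_;
    return strftime('%Y-%m-%d %H:%M:%S', gmtime($now - 24 * 60 * 60));
}

# Collect items until the first one that started before the cutoff.
# Assumes @items is sorted newest first and has a StartTime field.
sub take_new_items {
    my ($cutoff, @items) = @_;
    my @new;
    foreach my $item (@items) {
        last if $item->{StartTime} lt $cutoff;   # older than 24 hours: stop
        push @new, $item;
    }
    return @new;
}

# Fixed "current time" so the example is repeatable:
# 1104537600 is 2005-01-01 00:00:00 GMT. A real script would use time().
my $cutoff = cutoff_24h(1104537600);

my @items = (    # made-up listings, newest first
    { Id => '3003', StartTime => '2004-12-31 21:10:00' },
    { Id => '3002', StartTime => '2004-12-31 02:45:00' },
    { Id => '3001', StartTime => '2004-12-30 19:00:00' },
);

print "$_->{Id}\n" for take_new_items($cutoff, @items);
```

In the robot itself, the same comparison would sit inside the foreach loop, and hitting an old listing would trigger last PAGE so that no further pages are requested.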