These plugins fetch information about items in the default collections from websites and fill the fields with everything they could retrieve. This page describes how to create such a plugin. Please keep the Coding conventions in mind when writing one.
The easiest way to begin a new plugin is to copy an existing one. Plugins are found in lib/gcstar/GCPlugins/GCxxx, where xxx is the kind of collection the plugin handles. For example, plugins that fetch information for movies are in lib/gcstar/GCPlugins/GCfilms.
You could also use the template provided with the GCstar sources: GCSiteTemplate.pm in the templates directory.
Rename your file to something explicit. The first two letters must always be GC, and the suffix has to be .pm.
The first line contains something like this:
package GCPlugins::GCxxx::GCyyy;
Change yyy so that it matches your file name (without its suffix). A few lines below, there is this text:
package GCPlugins::GCxxx::GCPluginyyy;
Don’t change GCPlugin in the last part, but replace yyy with the same value as previously.
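As an illustration of the naming rule, here is a minimal sketch for a hypothetical file named GCExample.pm placed in the GCfilms directory. The name Example and the placeholder constructor are assumptions made for this sketch; the real template contains more than this.

```perl
use strict;
use warnings;

# Outer package: matches the (hypothetical) file name GCExample.pm without ".pm".
package GCPlugins::GCfilms::GCExample;

{
    # Inner package: "GCPlugin" is kept, followed by the same "Example" part.
    package GCPlugins::GCfilms::GCPluginExample;

    # Placeholder constructor so the sketch runs on its own;
    # the real template provides its own constructor.
    sub new
    {
        my ($class) = @_;
        return bless {}, $class;
    }
}

package main;

my $plugin = GCPlugins::GCfilms::GCPluginExample->new;
print ref($plugin), "\n";
```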
Here is the list of methods your plugin should implement. The design is object-oriented, so the first parameter of each method is always a reference to an object. This object is an instance of your package and does the actual work. The same instance may be reused throughout a user session, but there is no guarantee of that, so your package should be ready for either case. In practice, this means you should clear any internal values that must be reset between two fetches, and avoid relying on values stored by a previous fetch.
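One way to keep an instance safe for reuse is to gather all per-fetch state in one place and clear it before each fetch. The sketch below is only an illustration: the package name and the resetFetchState helper are hypothetical and not part of GCstar, and the set of fields shown is an assumption.

```perl
use strict;
use warnings;

package GCPluginResetSketch;    # hypothetical package name

sub new
{
    my ($class) = @_;
    my $self = bless {}, $class;
    $self->resetFetchState;
    return $self;
}

# Hypothetical helper: drop everything a previous fetch may have left behind.
sub resetFetchState
{
    my ($self) = @_;
    $self->{itemsList} = [];    # search results collected so far
    $self->{itemIdx}   = -1;    # index of the result being filled
    $self->{curInfo}   = {};    # fields of the item being parsed
    $self->{inside}    = {};    # tags currently open
    return $self;
}

package main;

my $plugin = GCPluginResetSketch->new;

# Simulate a value left over from an earlier fetch, then reset.
$plugin->{curInfo}->{title} = 'Left over from an earlier fetch';
$plugin->resetFetchState;

my $remaining = scalar keys %{ $plugin->{curInfo} };
print "fields left after reset: $remaining\n";
```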
sub getSearchUrl
{
    my ($self, $word) = @_;
    return ('http://www.example.com/search.php', ['query' => $word, 'type' => 'movies']);
}
In this example, the website receives two parameters, query and type, with the corresponding values.
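To make the shape of the return value concrete, the following sketch calls the same function outside GCstar and unpacks the result. The search word is arbitrary, and passing undef in place of the plugin object is an assumption that only works because this example never uses $self.

```perl
use strict;
use warnings;

# Same shape as the example above; the URL and field names are placeholders.
sub getSearchUrl
{
    my ($self, $word) = @_;
    return ('http://www.example.com/search.php', ['query' => $word, 'type' => 'movies']);
}

# $self is unused here, so undef stands in for the plugin object.
my ($url, $params) = getSearchUrl(undef, 'Blade Runner');

# The array reference holds name/value pairs; a hash makes them easy to read.
my %fields = @$params;

print "url: $url\n";
print "$_ = $fields{$_}\n" for sort keys %fields;
```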
The plugins are event-based HTML parsers: they go through an HTML page, and specific methods are called when certain events occur.
When a tag (such as <p> or <a href=...>) starts, the start method is called. When there is some textual content, the text method is called. When a tag ends, the end method is called. Refer to the HTML::Parser documentation for more information, as it is the base package of your package (provided you didn’t remove the use base clause during preparation). We assume here that you have the reference to the current object in a variable $self.
Inside these methods, there are two main blocks depending on the value of $self->{parsingList}. If it is true, we are parsing a results page (the list of items that match a query). If it is false, we are parsing the information for a given item.
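The sketch below shows the three handlers and the two blocks selected by $self->{parsingList}. It does not load HTML::Parser; instead it replays by hand the calls a parser would make for `<h1>Blade Runner</h1>`, so it runs with core Perl only. The package name and the log field are hypothetical, and the argument lists are assumed to follow the HTML::Parser subclass interface.

```perl
use strict;
use warnings;

package GCPluginSketch;    # hypothetical package name

sub new
{
    my ($class, $parsingList) = @_;
    # "log" is a hypothetical field used only to record what was called.
    my $self = { parsingList => $parsingList, inside => {}, log => [] };
    return bless $self, $class;
}

sub start
{
    my ($self, $tagname, $attr, $attrseq, $origtext) = @_;
    $self->{inside}->{$tagname}++;    # remember that this tag is open
    push @{ $self->{log} }, "start $tagname";
}

sub text
{
    my ($self, $origtext) = @_;
    if ($self->{parsingList})
    {
        # Results page: search results would be collected here.
        push @{ $self->{log} }, "list text: $origtext";
    }
    else
    {
        # Item page: the fields of one item would be collected here.
        push @{ $self->{log} }, "item text: $origtext";
    }
}

sub end
{
    my ($self, $tagname) = @_;
    $self->{inside}->{$tagname}--;    # the tag is closed again
    push @{ $self->{log} }, "end $tagname";
}

package main;

# Replay the same three events once per mode.
my @logs;
foreach my $parsingList (1, 0)
{
    my $plugin = GCPluginSketch->new($parsingList);
    $plugin->start('h1', {}, [], '<h1>');
    $plugin->text('Blade Runner');
    $plugin->end('h1');
    push @logs, join(' | ', @{ $plugin->{log} });
}
print "$_\n" for @logs;
```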
When parsing search results, you have to fill an array named $self->{itemsList}. Each element of this array is a reference to a hash. Each key of the hash is the name of a field (the same names as those in $self->{hasField}, initialized in the new method). The values are the ones extracted from the parsed page.
Some websites don’t return a search list page but go straight to the item description page when there is exactly one search result. The plugin needs to detect this and tell GCstar to treat the page as the item description page rather than as a results page. The following code from GCImdb.pm is an example of how to do this. For IMDb, this is achieved by checking that the page heading doesn’t include “Title Search”.
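As a rough sketch of how $self->{itemsList} can be filled, the handlers below add one hash per result link. Everything about the page is assumed for the example: result links are taken to start with /item/, the link text is taken to be the title, and the isTitle flag and package name are hypothetical. The events are replayed by hand rather than coming from a real parser.

```perl
use strict;
use warnings;

package GCPluginListSketch;    # hypothetical package name

sub new
{
    my ($class) = @_;
    my $self = { parsingList => 1, itemsList => [], itemIdx => -1, isTitle => 0 };
    return bless $self, $class;
}

sub start
{
    my ($self, $tagname, $attr, $attrseq, $origtext) = @_;
    return if !$self->{parsingList};

    # Assumption: every result is a link whose address starts with /item/.
    if ($tagname eq 'a' && defined $attr->{href} && $attr->{href} =~ m|^/item/|)
    {
        $self->{itemIdx}++;
        $self->{itemsList}[$self->{itemIdx}]->{url} = $attr->{href};
        $self->{isTitle} = 1;    # the next text event carries the title
    }
}

sub text
{
    my ($self, $origtext) = @_;
    return if !$self->{parsingList};

    if ($self->{isTitle})
    {
        $self->{itemsList}[$self->{itemIdx}]->{title} = $origtext;
        $self->{isTitle} = 0;
    }
}

sub end
{
    my ($self, $tagname) = @_;
}

package main;

my $plugin = GCPluginListSketch->new;

# Replay the events for two result links.
$plugin->start('a', { href => '/item/1' }, ['href'], '<a href="/item/1">');
$plugin->text('Blade Runner');
$plugin->end('a');
$plugin->start('a', { href => '/item/2' }, ['href'], '<a href="/item/2">');
$plugin->text('Alien');
$plugin->end('a');

print "$_->{title} => $_->{url}\n" for @{ $plugin->{itemsList} };
```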
if (($self->{inside}->{h1})
    && ($origtext !~ m/IMDb\s*Title\s*Search/i))
{
    $self->{parsingEnded} = 1;
    $self->{itemIdx} = 0;
    $self->{itemsList}[0]->{url} = $self->{loadedUrl};
}
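The heading test can be tried on its own. The sketch below wraps the pattern in a hypothetical helper, looksLikeItemPage, and applies it to two made-up headings; it only illustrates how the pattern behaves and is not code taken from GCImdb.pm.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when the heading does NOT look like
# a search results heading, 0 otherwise.
sub looksLikeItemPage
{
    my ($heading) = @_;
    return $heading !~ m/IMDb\s*Title\s*Search/i ? 1 : 0;
}

# Made-up headings for illustration.
my $resultsPage = looksLikeItemPage('IMDb Title Search');
my $itemPage    = looksLikeItemPage('Blade Runner (1982)');

print "results heading treated as item page: $resultsPage\n";
print "item heading treated as item page: $itemPage\n";
```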
When parsing an item description, the values have to be stored in $self->{curInfo}->{fieldName}, where fieldName is the name used for the field in the .gcm file.
(from a forum post by zombiepig) For drag and drop support to work correctly, there are basically two parts that need to be fulfilled within the getItemUrl function.
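Here is a minimal sketch of storing values in $self->{curInfo} while parsing an item page. The page layout is invented for the example: the title is assumed to be in an h1 tag and the year in a span with class "year". The field names title and date, the wanted flag, and the package name are likewise assumptions; use the names from the relevant .gcm file. The events are replayed by hand.

```perl
use strict;
use warnings;

package GCPluginItemSketch;    # hypothetical package name

sub new
{
    my ($class) = @_;
    my $self = { parsingList => 0, curInfo => {}, wanted => '' };
    return bless $self, $class;
}

sub start
{
    my ($self, $tagname, $attr, $attrseq, $origtext) = @_;
    return if $self->{parsingList};

    # Decide which field the next text event belongs to (assumed markup).
    if ($tagname eq 'h1')
    {
        $self->{wanted} = 'title';
    }
    elsif ($tagname eq 'span' && ($attr->{class} // '') eq 'year')
    {
        $self->{wanted} = 'date';
    }
}

sub text
{
    my ($self, $origtext) = @_;
    return if $self->{parsingList} || !$self->{wanted};

    $self->{curInfo}->{ $self->{wanted} } = $origtext;
    $self->{wanted} = '';
}

sub end
{
    my ($self, $tagname) = @_;
}

package main;

my $plugin = GCPluginItemSketch->new;

# Replay the events for: <h1>Blade Runner</h1><span class="year">1982</span>
$plugin->start('h1', {}, [], '<h1>');
$plugin->text('Blade Runner');
$plugin->end('h1');
$plugin->start('span', { class => 'year' }, ['class'], '<span class="year">');
$plugin->text('1982');
$plugin->end('span');

print "$_: $plugin->{curInfo}->{$_}\n" for sort keys %{ $plugin->{curInfo} };
```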
Here’s an example, from the boardgamegeek plugin:
sub getItemUrl
{
    my ($self, $url) = @_;
    if (!$url)
    {
        # If we're not passed a url, return a hint so that gcstar knows what type
        # of addresses this plugin handles
        $url = "http://www.boardgamegeek.com";
    }
    elsif (index($url,"xmlapi") < 0)
    {
        # Url isn't for the bgg api, so we need to find the game id
        # and return a url corresponding to the api page for this game
        $url =~ m|/([0-9]+)[/]*|;
        my $id = $1;
        $url = "http://www.boardgamegeek.com/xmlapi/boardgame/".$id;
    }
    return $url;
}
So there are two parts to it. First, if getItemUrl is called without a url, the plugin needs to return a sample url showing the domain the plugin handles (in this case www.boardgamegeek.com). This is so that when a url is dragged and dropped, GCstar can correctly determine which plugin is able to handle it. The second part only applies sometimes, mostly for plugins that use an api, where the page with the details to parse is different from the page the user will drag and drop. In the example above, the bgg plugin checks whether the url is for the xmlapi. If it isn’t, it extracts the game id from the url and returns the url of the actual page to parse. If those two conditions are met, drag and drop should work fine. For scraped pages, there’s mostly nothing special required here. E.g., for imdb, the function is only:
sub getItemUrl
{
    my ($self, $url) = @_;
    return $url if $url =~ /^http:/;
    return "http://www.imdb.com".$url;
}
This is the main criterion for the “update” button to work (I’m going to change that string to “refresh” I think). It uses the same routines as the drag and drop code to change the stored item url into the url that the plugin needs to parse. So if drag and drop works, refresh should work as well.
While GCstar only performs the same operations a web browser would, it is courteous to inform the webmaster of what you are doing. Look for the contact information on the website you are writing a plugin for, and send them an email about it. You may include a link to the page with Information for webmasters.
Since it may be unclear what GCstar is, you will probably have to stress that GCstar is only for personal use. Also, users always know where the information they fetch comes from. The application in no way aims to hide which website is used, as these websites do useful and great work.
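The two parts of the boardgamegeek getItemUrl logic can be exercised outside GCstar with a small standalone function. This is a sketch, not the plugin's code: bggItemUrl is a hypothetical name, the dropped address is a made-up example, and the guard for addresses without a numeric id is an addition that the original function does not have.

```perl
use strict;
use warnings;

# Hypothetical standalone version of the boardgamegeek logic.
sub bggItemUrl
{
    my ($url) = @_;

    # Part 1: no url given, so return a hint showing the handled domain.
    return 'http://www.boardgamegeek.com' if !$url;

    # Already an api address: nothing to convert.
    return $url if index($url, 'xmlapi') >= 0;

    # Part 2: convert a browser address into the api address for the same game.
    if ($url =~ m|/([0-9]+)|)
    {
        return 'http://www.boardgamegeek.com/xmlapi/boardgame/' . $1;
    }

    # Assumption added for this sketch: leave the url alone if no id is found.
    return $url;
}

my $hint    = bggItemUrl(undef);
my $dropped = bggItemUrl('http://www.boardgamegeek.com/boardgame/13/catan');
my $api     = bggItemUrl('http://www.boardgamegeek.com/xmlapi/boardgame/13');

print "$_\n" for ($hint, $dropped, $api);
```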