User:Irkm/BuildArchive/Tools/UserHistoryScript

Documentation
This is a small script I use to extract user names from article histories for crediting.

The script should run under any modern Perl environment that has the LWP::UserAgent and URI modules installed.

Usage: ./getEditUsersFromWiki.pl <pagetitle>

Example: ./getEditUsersFromWiki.pl "Build:W/D_Zealous_Decapitater"

Caveats
The script is not extensively tested; it just works for me.

Edit histories longer than 500 entries are not handled correctly: the oldest entries are ignored, so the original author would be reported incorrectly. Since I have not yet encountered a build article with more than 500 entries, this should not be a problem. I can fix this if necessary.
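If longer histories ever need handling, one option might be to fetch the history in pages. The following is only a minimal sketch, assuming the wiki's history view accepts an `offset` timestamp next to `limit`, as MediaWiki does. The helper name `historyURL` and its arguments are hypothetical and not part of the script below.

```perl
use strict;
use warnings;

# Hypothetical helper: build the history URL for one page of results.
# $offset is an optional timestamp (YYYYMMDDHHMMSS); it is assumed to
# select entries older than that time, as in MediaWiki.
sub historyURL {
    my ($pageTitle, $limit, $offset) = @_;

    # Percent-encode everything except unreserved characters, "/" and ":"
    (my $escaped = $pageTitle) =~ s/([^A-Za-z0-9\-_.~\/:])/sprintf("%%%02X", ord($1))/ge;

    my $url = "http://gw.gamewikis.org/wiki?title=$escaped&action=history&limit=$limit";
    $url .= "&offset=$offset" if defined $offset;
    return $url;
}

# First page, then a follow-up page starting at an example timestamp
print historyURL("Build:W/D_Zealous_Decapitater", 500), "\n";
print historyURL("Build:W/D_Zealous_Decapitater", 500, "20070101000000"), "\n";
```

A caller would take the timestamp of the oldest entry on one page and pass it as the offset for the next request, stopping once a page returns fewer than `limit` entries.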

Minor edits are ignored. I can add support for minor edits if someone needs it. Just leave me a message.
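Support for minor edits could be made optional with a command-line switch. This is a rough sketch only: the `--minor` switch and the helpers `skipLine` and `parseArgs` are hypothetical names, and the markup used to detect a minor edit (an "m" closing a span, such as `<span class="minor">m</span>`) is an assumption about the history page's HTML.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical check: should this history line be skipped?
# Assumes a minor edit is marked by an "m" right before </span>.
sub skipLine {
    my ($line, $includeMinor) = @_;
    return 0 if $includeMinor;
    return $line =~ /\bm<\/span>/ ? 1 : 0;
}

# Parse a hypothetical --minor switch; returns the flag and the
# remaining arguments (the page title).
sub parseArgs {
    my @args = @_;
    my $includeMinor = 0;
    GetOptionsFromArray(\@args, 'minor' => \$includeMinor)
        or die "Bad options\n";
    return ($includeMinor, @args);
}

my ($includeMinor, @rest) = parseArgs('--minor', 'Build:W/D_Zealous_Decapitater');
print "include minor edits: $includeMinor, page: $rest[0]\n";
```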

Script

#!/usr/bin/perl

use warnings;
use strict;

# Exactly one argument (the page title) is expected
if ($#ARGV != 0) { usage(); exit(0); }

my $pageTitle = $ARGV[0];

my $url = "http://gw.gamewikis.org/wiki?title=$pageTitle&action=history&limit=500";

my $content = getURL($url);
die "Could not retrieve $url\n" unless defined $content;

my ($originalAuthor, $users) = extractUserNames($content);

print "\n\n== Original Authors ==\n";
print "Article created by: ". "" . $originalAuthor . " \n";
print "Additional contributions by: \n";
foreach (keys %$users) {
    next if $_ eq $originalAuthor;
    print "" . $_ . ", ";
}
print "\n";

sub extractUserNames {
    my $content = shift;

    my @lines = split(/\n/, $content);
    print "Lines= $#lines\n";

    my $count = 0;

    # Skip leading lines
    while ($count < @lines && $lines[$count] !~ /^\(cur\)/) {
        $count++;
    }

    my %users;
    my $originalAuthor;

    while ($count < @lines && $lines[$count] =~ /^/) {
        if ($lines[$count] =~ / m<\/span>/) {
            # minor edit -> skip
            $count++;
            next;
        }
        $lines[$count] =~ /\(.*?)\<\/span\>/;
        my $part = $1;
        if ($part =~ /(.*?)<\/a>/) {
            my ($pagename1, $pagename2, $userlink, $username) = ($1, $2, $3, $4);

            if (!exists($users{$username})) {
                if (defined($pagename1)) {
                    #print "Pagename = $pagename1, Userlink=$userlink, Username=$username\n";
                    $users{$username}{pagename} = $pagename1;
                    $users{$username}{userlink} = $userlink;
                } else {
                    #print "Pagename = $pagename2, Userlink=$userlink, Username=$username\n";
                    $users{$username}{pagename} = $pagename2;
                    $users{$username}{userlink} = $userlink;
                }
            }
            # remember author, last in list is original author
            $originalAuthor = $username;
        } else {
            print "unparsed line: $part\n";
        }
        $count++;
    }

    return $originalAuthor, \%users;
}

sub usage {
    print "Usage: getEditUsersFromWiki.pl <pagetitle>\n";
    print "Example: getEditUsersFromWiki.pl \"Build:W/D_Zealous_Decapitater\"\n";
}

sub getURL {

    use LWP::UserAgent;
    use URI;

    my ($url, $ret) = @_;

    # Default to 5 retries
    if (!defined $ret) { $ret = 5; }

    my ($ua) = LWP::UserAgent->new(env_proxy  => 1,
                                   keep_alive => 1,
                                   timeout    => 60,
                                   # agent => "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; T312461)",
                                  );
    my ($furl) = URI->new($url)->canonical;
    print "Getting URL : $url\n";
    my ($response) = $ua->get($furl);
    my $count = 0;
    while (!$response->is_success) {
        $count++;
        warn "Error getting $furl, retrying ($count).";
        sleep 5;
        if ($count == $ret) {
            warn "Can't get $furl, aborting after $ret retries.";
            return undef;
        }
        # Issue the request again; without this the loop never retries
        $response = $ua->get($furl);
    }

    return $response->content;
}
__END__
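The crediting rule used by the script (the history lists the newest edit first, so the last entry is the original author) can be tried in isolation. This is a simplified sketch that works on a plain list of names rather than on fetched HTML. The function name `creditUsers` and the user names are made up for illustration.

```perl
use strict;
use warnings;

# Given editor names in history order (newest edit first), return the
# original author and a reference to the other contributors, each once.
sub creditUsers {
    my @editors = @_;
    return (undef, []) unless @editors;

    my $originalAuthor = $editors[-1];   # oldest entry is last in the list
    my (%seen, @others);
    foreach my $name (@editors) {
        next if $name eq $originalAuthor;
        push @others, $name unless $seen{$name}++;
    }
    return ($originalAuthor, \@others);
}

# Hypothetical user names, for illustration only
my ($author, $others) = creditUsers('Carol', 'Bob', 'Carol', 'Alice', 'Bob', 'Alice');
print "Article created by: $author\n";
print "Additional contributions by: ", join(", ", @$others), "\n";
```

Unlike the script, which iterates over hash keys in no particular order, this sketch keeps contributors in the order they first appear in the history.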