User talk:Dr ishmael/Caps

Do you want me to do the : thing for you next time? and maybe match (Arena), too? --◄mendel► 23:25, 23 July 2008 (UTC)


 * I can handle it from here, thanks. I should have 99% of all this taken care of by next Wednesday, anyway.  &mdash;Dr Ishmael [[Image:Diablo_the_chicken.gif]] 00:38, 24 July 2008 (UTC)


 * lol, that was pathetic, the bot changed all the links here :P --Gimmethegepgun 02:20, 24 July 2008 (UTC)

Quite some progress! --◄mendel► 07:16, 24 July 2008 (UTC)


 * The majority of the Talk:, User:, and User talk: pages were false positives, where the hit was on plain text or a  section instead of inside an active link.  Everything else has been corrected.  &mdash;Dr Ishmael [[Image:Diablo_the_chicken.gif]] 15:44, 24 July 2008 (UTC)


 * Yeah, well, it is obviously hard to tell automatically whether that is historic information that needs no correction, or whether it is e.g. style guide info that ought to be adjusted. So for general quality control, I ought to limit my script to just checking links now (i.e. only inside [[ ]])?


 * If you could post the script for that here, that would be great - I haven't taken the time to learn awk yet. I can take care of running it on future db dumps.  &mdash;Dr Ishmael [[Image:Diablo_the_chicken.gif]] 17:27, 24 July 2008 (UTC)

July 31 update
The latest update is from the July 31 dump. The script now only looks at text inside double brackets, which means:
 * it doesn't match any page titles any more
 * it still matches images

For the images, if you don't want to reupload them, you can make a redirect from the lowercase image page to the uppercase image; that works. A cursory glance at some of the mainspace articles still found some actual links that need updating, so it's not all false positives. :-P

The new script is below. Awk reads the file one line at a time. If a /regular expression/ matches the current line, the { action } is carried out. Unless the action contains a "next", awk keeps working through the script to the bottom. An action can also be triggered by a boolean (such as "skipme"). END only triggers after the whole file has been read. $0 is the current input line. --◄mendel► 18:47, 1 August 2008 (UTC)

/<title>/ { match($0, /<title>(.*)<\/title>/, thisTitle); skipme = 0; printme = 0; }

skipme && /preserve">#REDIRECT/ { print thisTitle[1] > "redirect.txt" }

skipme {next;} / / {next;}

/\[\[[^\]]*\((Location|Quest|Region)\)[^\.]/ { i++; print ":[[" thisTitle[1] "]]"; skipme = 1; }

END { print "* " i " pages matched."; }
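For anyone who would rather not learn awk first, here is a rough Python sketch of the same per-line logic: remember the current page title, report a page once when a matching link turns up, and note pages whose text is a redirect after they have already matched. It is only an approximation under some assumptions: the link pattern is a guess at what the awk regular expression is meant to match (a double-bracket link ending in a capitalised "(Location)", "(Quest)" or "(Region)" tag), the second skip rule from the script is left out, and the names scan_dump, report and SAMPLE are made up for this example.

```python
import re

# Page title line as it appears in the XML dump.
TITLE_RE = re.compile(r"<title>(.*)</title>")

# ASSUMPTION: a [[...]] link whose target ends in a capitalised tag,
# followed by any character other than a period.
LINK_RE = re.compile(r"\[\[[^\]]*\((Location|Quest|Region)\)[^.]")


def scan_dump(lines):
    """Return (matched_titles, redirect_titles) for an iterable of dump lines."""
    title = None
    skipme = False  # set once the current page has been reported
    matched, redirects = [], []
    for line in lines:
        m = TITLE_RE.search(line)
        if m:  # a new page starts: remember its title and reset the flag
            title = m.group(1)
            skipme = False
        if skipme and 'preserve">#REDIRECT' in line:
            redirects.append(title)  # already-matched page that is a redirect
        if skipme:
            continue  # same effect as awk's "next"
        if LINK_RE.search(line):
            matched.append(title)
            skipme = True
    return matched, redirects


def report(matched):
    """Format the result as the wiki list the awk script prints."""
    lines = [":[[%s]]" % t for t in matched]
    lines.append("* %d pages matched." % len(matched))
    return lines


# Hypothetical miniature dump, for illustration only.
SAMPLE = [
    "<title>Ascalon City</title>",
    '<text xml:space="preserve">See [[Old Ascalon (Location)]] for details.',
    "Also [[Old Ascalon (Location)]] again.</text>",
    "<title>Style guide</title>",
    '<text xml:space="preserve">Plain text (Location) is not a link.</text>',
    "<title>Foo</title>",
    "<comment>moved [[Foo (Quest)]] here</comment>",
    '<text xml:space="preserve">#REDIRECT [[Bar]]</text>',
]

if __name__ == "__main__":
    matched, redirects = scan_dump(SAMPLE)
    print("\n".join(report(matched)))
```

On a real dump you would pass an open file object instead of SAMPLE, since scan_dump only needs something it can iterate over line by line.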