Wikipedia talk:WikiProject U.S. Roads/Redirects/Texas/Farm to Market and Ranch to Market Roads

Writing a script

I could probably write a script to check each of them and remove the ones that aren't listed. Options for automatic removal might include <!-- commenting them out -->, striking them out, or moving them to a separate section or page to be manually examined, just in case they are false negatives. -- Apr. 25, '06 [12:53] <freakofnurxture|talk>
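A minimal sketch of the "commenting out" option, assuming the list entries look like `*{{txfm|123}}` or `*{{txrm|123|u}}` and that the confirmed routes are already known. The helper name `comment_out_unlisted` and the `fm1`/`rm3` key format are hypothetical, chosen only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: wrap list entries in an HTML comment when their route
# is not in the set of confirmed routes. Entries are assumed to look like
# "*{{txfm|123}}" or "*{{txrm|123|u}}"; anything else is passed through.
sub comment_out_unlisted {
    my ($lines, $confirmed) = @_;
    my @out;
    foreach my $line (@$lines) {
        # $1 is the route type (fm or rm), $2 the route number
        if ($line =~ /^\*\{\{tx(fm|rm)\|(\d+)/ && !$confirmed->{"$1$2"}) {
            push @out, "<!-- $line -->";
        }
        else {
            push @out, $line;
        }
    }
    return @out;
}

# Example data (made up for illustration)
my @list      = ('*{{txfm|1}}', '*{{txfm|2}}', '*{{txrm|3|u}}');
my %confirmed = (fm1 => 1, rm3 => 1);

print "$_\n" for comment_out_unlisted(\@list, \%confirmed);
```

Commenting out keeps the unconfirmed entries in the page source, so a false negative can be restored by hand rather than retyped.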

Script

I wrote a script to validate this list against TxDOT's highway records. Because I don't expect to continue maintaining this page, here is the Perl script in case anyone wants to re-check in the future.--Svgalbertian (talk) 17:58, 19 October 2010 (UTC)

#!/usr/bin/perl
use strict;
use LWP::UserAgent;
use HTTP::Request;

# Highway record URLs
my %roadURL = (
        'FM' => 'http://www.dot.state.tx.us/tpp/hwy/fm%subdir%/fm%route%.htm',
        'RM' => 'http://www.dot.state.tx.us/tpp/hwy/rm/rm%route%.htm',
        'UR' => 'http://www.dot.state.tx.us/tpp/hwy/ur/ur%route%.htm',
);

# Output file
open(TXTFILE, '>', 'wikitext.txt') || die("Cannot open wikitext.txt for writing: $!");

# Cycle through all the road pages
for (my $i = 1; $i < 4000; $i++) {

 my %title = ();
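 # FM pages for routes 500 and up sit in subdirectories named after the
 # lower bound of each block of 500 (fm0500, fm1000, ...)
 my $subdir = '';

 if ($i >= 500) {
   $subdir = sprintf("%04d", int($i / 500) * 500);
 }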

 # Grab all the highway records from the internet
 my $paddednum = sprintf("%04d", $i);

 foreach my $routetype (keys (%roadURL)) {
   my $URL = $roadURL{$routetype};

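   (my $currenturl = $URL) =~ s/\%route\%/$paddednum/g;
   $currenturl =~ s/\%subdir\%/$subdir/g;
   my $contents = &getPage($currenturl);

   # Only read $1 after a successful match; otherwise it still holds the
   # capture from an earlier request. A page without a title is treated
   # as missing.
   if ($contents =~ m/<title>(.*?)<\/title>/s) {
     $title{$routetype} = $1;
   }
   else {
     $title{$routetype} = 'Page Not Found';
   }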
 }

 # Does the FM route exist? If so is it also an UR?
 if ($title{'FM'} !~ m/Page Not Found/) {
   if ($title{'UR'} =~ m/Page Not Found/) {
     print "*{{txfm|$i}}\n";
     print TXTFILE "*{{txfm|$i}}\n";
   }
   else {
     print "*{{txfm|$i|u}}\n";
     print TXTFILE "*{{txfm|$i|u}}\n";
   }
 }

 # Does the RM route exist? If so is it also an UR?
 if ($title{'RM'} !~ m/Page Not Found/) {
   if ($title{'UR'} =~ m/Page Not Found/) {
     print "*{{txrm|$i}}\n";
     print TXTFILE "*{{txrm|$i}}\n";
   }
   else {
     print "*{{txrm|$i|u}}\n";
     print TXTFILE "*{{txrm|$i|u}}\n";
   }
 }
}

# Close the files
close (TXTFILE);

# Fetch a web page and return its body. Error pages are returned too,
# because the checks above rely on their "Page Not Found" title.
sub getPage {
  my ($currenturl) = @_;

  my $browser = LWP::UserAgent->new();
  $browser->timeout(10);

  my $request  = HTTP::Request->new(GET => $currenturl);
  my $response = $browser->request($request);

  return $response->content();
}
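
For anyone re-running the check, the URL construction and title parsing can be tried offline before hitting the TxDOT site. The following is only a sketch: the helper names `route_url` and `page_title` are hypothetical and not part of the script above, and the URL pattern is copied from it without any guarantee that the site still uses that layout.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# URL pattern copied from the script; %subdir% and %route% are placeholders.
my $FM_PATTERN = 'http://www.dot.state.tx.us/tpp/hwy/fm%subdir%/fm%route%.htm';

# Hypothetical helper: fill in the placeholders for one route number.
sub route_url {
    my ($pattern, $num) = @_;
    my $route = sprintf('%04d', $num);
    # Routes below 500 get no subdirectory suffix; others use the lower
    # bound of their block of 500 (0500, 1000, 1500, ...).
    my $subdir = $num >= 500 ? sprintf('%04d', int($num / 500) * 500) : '';
    (my $url = $pattern) =~ s/%route%/$route/g;
    $url =~ s/%subdir%/$subdir/g;
    return $url;
}

# Hypothetical helper: return the page title, or undef when there is none,
# so a failed match can never reuse an earlier capture.
sub page_title {
    my ($html) = @_;
    return $html =~ m{<title>(.*?)</title>}s ? $1 : undef;
}

print route_url($FM_PATTERN, 1), "\n";     # .../hwy/fm/fm0001.htm
print route_url($FM_PATTERN, 1234), "\n";  # .../hwy/fm1000/fm1234.htm
```

Keeping these two steps as pure functions means they can be checked against a handful of known routes without any network access.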