Wikipedia:Link rot/URL change requests/Archives/2023/October
This is an archive of past discussions on Wikipedia:Link rot. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current main page.
.asp vs .aspx
I would like to report broken basketball URLs. Some work, some don't. I summarized the URL pattern as follows, part by part:
- http:// or https://
- www. followed by one of: eurobasket, usbasket, latinbasket, asia-basket, australiabasket, afrobasket
- .com/ then country_name then /basketball then nothing, or -National-Team, or maybe something else
- .asp or .aspx
- end_of_URL or ? anything
For example
http://www.australiabasket.com/Australia/basketball-National-Team.asp?women=1 doesn't work,
https://www.australiabasket.com/Australia/basketball-National-Team.aspx?women=1 works.
So, version with https and .aspx is correct. (European URLs seem to work for both .asp & .aspx)
Another problem is that some African URLs use "africabasket" instead of correct "afrobasket". Maiō T. (talk) 19:52, 25 September 2023 (UTC)
- I think this search finds them all:
insource:basketball insource:/www[.](eurobasket|usbasket|latinbasket|asia-basket|australiabasket|afrobasket|africabasket)[.]com\/[^\/]+\/basketball[^.]*[.]*aspx?/
(1,821). Does it look right to you? -- GreenC 20:12, 25 September 2023 (UTC)
- OK GreenC, those search results look pretty good. So, if you would be so kind, please change every "asp" to "aspx" and "http" to "https". But I'm afraid it will take hours and hours. Maiō T. (talk) 13:45, 26 September 2023 (UTC)
- True for the computer, but for me it is an hour or less of work; this job looks easy. We'll see, though, whether any problems come up, like bot blockers or new URLs that are soft-404s. -- GreenC 01:36, 27 September 2023 (UTC)
Maiō T.: the bot edited about 1,400 pages. It modified 1,872 links and added 261 archive URLs and 55 {{dead link}} tags, i.e. where the conversion to .aspx did not work. -- GreenC 03:57, 5 October 2023 (UTC)
- Good job, GreenC. Thank you. Maiō T. (talk) 12:07, 5 October 2023 (UTC)
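The rewrite requested in this thread (http to https, .asp to .aspx, africabasket to afrobasket) can be sketched in Python as below. This is only an illustrative sketch, not the bot's actual code: the function name `normalize_basket_url` and the exact pattern are assumptions based on the URL breakdown and search regex above, and it does not check whether the rewritten URL is live or a soft-404.

```python
import re

# Hosts named in the thread; "africabasket" is the misspelling to correct.
_HOSTS = (
    "eurobasket|usbasket|latinbasket|asia-basket|"
    "australiabasket|afrobasket|africabasket"
)

# Assumed pattern: scheme, www.<host>.com/, country, /basketball<suffix>,
# .asp or .aspx, then an optional remainder such as a query string.
_URL_RE = re.compile(
    r"https?://www\.(?P<host>" + _HOSTS + r")\.com/"
    r"(?P<country>[^/\s]+)/(?P<page>basketball[^.\s]*)\.aspx?"
    r"(?P<rest>\S*)"
)


def normalize_basket_url(url: str) -> str:
    """Return the https/.aspx form of a matching URL, else the input unchanged."""
    m = _URL_RE.fullmatch(url)
    if m is None:
        return url
    host = "afrobasket" if m["host"] == "africabasket" else m["host"]
    return f"https://www.{host}.com/{m['country']}/{m['page']}.aspx{m['rest']}"


if __name__ == "__main__":
    print(normalize_basket_url(
        "http://www.australiabasket.com/Australia/basketball-National-Team.asp?women=1"
    ))
```

Applied to the non-working example from the original report, this yields the working https and .aspx form quoted there. URLs that do not match the pattern are returned untouched, so the function is safe to run over arbitrary links.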
statestimesreview.com
Please add this domain to the list of usurped domains. There are only a handful of links here on enwiki from this site, and I have adjusted those references. – robertsky (talk) 06:19, 22 September 2023 (UTC)
- This domain as well: pofmaed.com – robertsky (talk) 08:18, 22 September 2023 (UTC)
User:Robertsky, will be done Special:Diff/1174167580/1179566502, thank you. -- GreenC 00:03, 11 October 2023 (UTC)
Ever since its merger with National Film Development Corporation of India, many old URLs are no longer accessible. Like this doesn't redirect to this. And this doesn't lead us to this. Kailash29792 (talk) 05:52, 10 October 2023 (UTC)
- Fortunately the new URL can be determined from the old. For any where it can't, the bot will add an archive URL or mark the link dead. -- GreenC 00:14, 11 October 2023 (UTC)
This is done. It was only 22 pages, but I tried some new code and it didn't go well. I ended up manually fixing 17 pages. -- GreenC 16:00, 11 October 2023 (UTC)
MusicIndiaOnline
Seems this site is completely dead. This proves it. Existing URLs must be tagged accordingly. --Kailash29792 (talk) 05:16, 11 October 2023 (UTC)
- Ok for now I updated IABot to permadead status. I'll probably run WaybackMedic later to make sure they are all processed (there can be gaps in IABot coverage). It's in 530 pages including File and Template. -- GreenC 16:20, 11 October 2023 (UTC)
I noticed that several links at the bottom of the Epic! article were broken. Gfeissweet (talk) 18:16, 14 October 2023 (UTC)
- This is probably true for most articles. But you have some options; start by reading Wikipedia:Link rot. Run the IABot at https://iabot.org. If you see some domains that you or the bot can't fix, report them here. -- GreenC 22:24, 14 October 2023 (UTC)
bfi.org.uk soft-404s
Request at meta:User_talk:InternetArchiveBot#Dealing_with_redirect to check for soft-404s in bfi.org.uk -- GreenC 21:18, 11 October 2023 (UTC)
- User:Tobyhoward. Conversion has begun. Example. I wish there was a way to map this to this automatically. The page is there. We may be left with no option but to convert to archive URLs. I suspect BFI (or a contractor) started a new DB from scratch and they didn't have a way to map old to new. There are a lot of pages for the old DB on Wikipedia, probably 10k or more. -- GreenC 00:56, 14 October 2023 (UTC)
- Thanks @GreenC. Just to clarify, when you say "Conversion has begun" -- do you mean manual conversion or by bot? I think it's sensible to let BFI know that they have broken so many WP links, and ask if they can provide any kind of map. You never know. They must have had some kind of mapping when they built the new DB from the old. I'll go back to them -- they responded promptly to my query last time. If they are unable/unwilling to help, perhaps someone senior in WP could talk to them? I am a mere gnome :-) Tobyhoward (talk) 08:09, 14 October 2023 (UTC)
- By bot. I am going slowly to discover soft-404s ie. pages that are status live, but redirect to a content-less or poor-content page. For example [1] in The Borrowers (1992 TV series) was archived by the bot: Special:Diff/1171436771/1180023855. If you get a map from BFI (good luck!), I can rerun the bot with the map. The work I am doing now can be undone and replaced with new info. It would be a big help. -- GreenC 16:46, 14 October 2023 (UTC)
Done:
- Pages edited 11,015 - out of 16,864 containing bfi.org.uk
- Add new archive URL. 12,627 cites.
- Flip |url-status=live to dead. 1,082 cites.
- Move URL based on redirect. 1,451 cites.
- IABot database updated. Propagate to 300+ other wikis.
@Tobyhoward:. -- GreenC 21:49, 15 October 2023 (UTC)
Further work:
- Fixed all File: pages.
- Converted all instances of {{BFI}} to normal links and nominated the template for deletion since it no longer works.
- Further discussion at Wikidata bottom of page, in terms of discovery of new BFI IDs and what to do with the old IDs. -- GreenC 17:36, 17 October 2023 (UTC)
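The soft-404 condition described in this thread (a page whose status is live but which redirects to a content-less or poor-content page) could be approximated with a heuristic like the Python sketch below. It is not the bot's actual logic: the function name `looks_like_soft_404`, the "landed on the site root" rule, and the `min_chars` threshold are all assumptions chosen for illustration, and the caller is expected to have fetched the page already.

```python
from urllib.parse import urlsplit


def looks_like_soft_404(requested_url, final_url, status_code, body_text,
                        min_chars=200):
    """Heuristic guess at whether a live response is really a soft-404.

    requested_url: the URL as cited.
    final_url:     the URL after following redirects.
    status_code:   the final HTTP status code.
    body_text:     visible text of the final page.
    min_chars:     assumed threshold below which a page counts as "thin".
    """
    if status_code != 200:
        return False  # hard failures are a different case
    req, fin = urlsplit(requested_url), urlsplit(final_url)
    # Ignore the scheme so a plain http -> https upgrade is not a redirect.
    redirected = (req.netloc.lower(), req.path, req.query) != (
        fin.netloc.lower(), fin.path, fin.query)
    if not redirected:
        return False
    landed_on_root = fin.path in ("", "/")
    thin_body = len(body_text.strip()) < min_chars
    return landed_on_root or thin_body
```

A heuristic like this will produce both false positives and false negatives, which is consistent with the thread's approach of going slowly and checking results before converting links to archive URLs.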
Because the site https://filmcompanion.in was blacklisted for over a year, many IP users took to fudging by typing filmcompanion.com, which does not exist. Hopefully these fake links can be rectified. Kailash29792 (talk) 08:00, 23 October 2023 (UTC)
- User:Kailash29792: There are 15 instances. It would be easier to remove them by hand than to develop bot code. Could you do it? Should take 5 minutes. -- GreenC 14:23, 23 October 2023 (UTC)
- That could do. I just thought there were countless, but this is stunningly short. Kailash29792 (talk) 14:25, 23 October 2023 (UTC)
- Great thanks. Blacklist? But then they add with .io or whatever. I wonder why they add non-working URLs. Possibly they can then sell the domain to spammers since it has presence on Wikipedia. -- GreenC 14:55, 23 October 2023 (UTC)
- I've fixed all the links. filmcompanion.in was blacklisted indefinitely until I, Krimuk2.0 and TrangaBellam fought to make it usable once more, albeit with abuse filter. So I think IPs and autoconfirmed users still can't add filmcompanion.in and may still take to link fudging. Kailash29792 (talk) 15:34, 23 October 2023 (UTC)
- Yes, I recall you posted here about 6 weeks ago about filmcompanion. I developed a technique to discover old/deleted redirects in the Wayback Machine, and from that was able to save many links. Example. I'm proud of that work; it's never been done before as far as I know, and it opens new possibilities for fixing link rot whenever that condition occurs, i.e. pages at one time had redirects but the redirects were deleted/expired over time. -- GreenC 15:45, 23 October 2023 (UTC)
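The thread does not describe how old/deleted redirects were discovered in the Wayback Machine. One possible approach, offered purely as an assumption and not as GreenC's method, is to ask the Wayback CDX server for captures of the old URL that were recorded with a redirect status code, and then inspect those snapshots to learn where the page once pointed. The Python sketch below only builds the query and turns a decoded JSON response into snapshot URLs; the function names are hypothetical and no network request is made.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_redirect_query(url):
    """Build a CDX query for captures of `url` recorded as a 301 or 302."""
    params = [
        ("url", url),
        ("output", "json"),
        ("fl", "timestamp,original,statuscode"),
        ("filter", "statuscode:30[12]"),
    ]
    return CDX_ENDPOINT + "?" + urlencode(params)


def redirect_snapshots(rows):
    """Turn decoded CDX JSON (first row is the header) into snapshot URLs."""
    if not rows:
        return []
    header, *data = rows
    ts = header.index("timestamp")
    orig = header.index("original")
    return [f"https://web.archive.org/web/{r[ts]}/{r[orig]}" for r in data]


if __name__ == "__main__":
    # Made-up sample row, shaped like CDX JSON output, for illustration only.
    sample = [
        ["timestamp", "original", "statuscode"],
        ["20200101000000", "http://example.org/old", "301"],
    ]
    print(build_redirect_query("example.org/old"))
    print(redirect_snapshots(sample))
```

Whether a given snapshot still exposes the original redirect target depends on what the Wayback Machine captured, so any recovered target would need to be verified before replacing a link.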