Wikipedia talk:Requests for comment/Archive.is RFC 3

Darkwarriorblake argument #1

  1. Hasteur, your insults are not required, Matthias asked a legitimate question and you've responded with more hearsay and speculation. Whatever Rotlink and Rotlinkbot were doing, that has no bearing on what Archive.is is doing and what functionality it serves. It is a far more flexible, powerful, and robust archiving service than the other two major contributors. Anytime you wish to retract your insult directed at Hasteur would be a good time to do so.Darkwarriorblake / SEXY ACTION TALK PAGE! 17:47, 30 June 2014 (UTC)Reply
    Wow Darkwarriorblake, you really must be blind. It's not a matter of hearsay, and what Rotlink did is relevant. Rotlink is the owner of archive.is. His use of a bot net (I've never seen a legal use of a bot net spanning such a wide geographic area) is proven. It's not hearsay; just because we don't have the person admitting it doesn't mean that it's not fact. Usage of malware to propagate itself and its willingness to ignore websites' policies is very questionable at best. If you really want a good solution just ask the WMF for one. Taking over website or going into partnership with archive.org is a far better solution than using a site that is a known source of abusive behavior. Werieth (talk) 17:54, 30 June 2014 (UTC)
    I would think the person who ignores the dozens of talk messages from users telling him to stop what he is doing because it is detrimental, would be the last person to call anyone blind. Even if any of it were provable beyond speculation, the site has been sanctioned since the original RFC and the first time it attempted to do something of a similar nature it would be gone forever from Wikipedia. The important part is that it does what is necessary, and it is a far superior archiving service to anything else on offer, and it continues to operate without Wikipedia. It isn't some waiting harbinger looking for the slightest weakness in our defenses to inundate us with useful archives. Darkwarriorblake / SEXY ACTION TALK PAGE! 18:13, 30 June 2014 (UTC)Reply
    So let's ignore the fact that they use illegal bot nets, probable malware and abusive behavior, just because you think that they are useful? Werieth (talk) 18:18, 30 June 2014 (UTC)

DWB, your personal attacks on editors who don't support your viewpoint are unacceptable. Next time you level a personal attack, I will bring your conduct to the attention of one or more administrators to determine if corrective action should be taken. Not a threat, but a promise/warning of what the next action will be. Hasteur (talk) 18:23, 30 June 2014 (UTC)Reply

Which personal attack are you talking about, since one wasn't made, while you've called Matthias intentionally dense. Feel free to report yourself while you are at it. Darkwarriorblake / SEXY ACTION TALK PAGE! 18:30, 30 June 2014 (UTC)Reply

Hasteur, do not remove my comments from the page; they are part of the discussion and a point is being made. You are the only person who has used a direct personal attack and you are accusing others of it. I object to my comments being moved; do not do it again. Darkwarriorblake / SEXY ACTION TALK PAGE! 18:34, 30 June 2014 (UTC)

Your mischaracterization of my statements is provably false, and if Matthias had educated themselves prior to asking the question I would not have had to use first level links from the RFC's page to prove it. Therefore the only assumption I could have made is that, much like yourself, Matthias was going for the intentional denseness argument. It's not hearsay/speculation if it's proven that A) Rotlink operated RotlinkBot B) Rotlink was an editor who claimed they were operating Archive.is C) That RotlinkBot was caught operating outside of their own userspace without an appropriate BRFA approval D) That Rotlink, after beginning to secure approval, decided to stop working with the Bot Approval Group and instead push forward with their bot-like actions. E) That after the bot/Rotlink's accounts were blocked we had 2 waves of IP addresses come through and continue the same behavior as the RotlinkBot. F) That these IP addresses were conducting edits so fast that they could be nothing but automated programs G) That a large portion of these IPs terminate at residential grade internet access points in a wide geographic distribution H) That Wikipedia decided by consensus that we are better off not allowing any additions of Archive.is to the site. Hasteur (talk) 18:40, 30 June 2014 (UTC)
So Rotlink only claimed to be operating Archive.is. Honestly the original RFC seems like a massive case of overreach, blocking an entire site over hysteria and paranoia. Darkwarriorblake / SEXY ACTION TALK PAGE! 18:43, 30 June 2014 (UTC)Reply
So you accept my other statements in the chain? Good. Because Rotlink decided to cut communication with the Bot Approval Group and push forward with the automated editing (which is against WP's terms of service), the bot was blocked and Rotlink was blocked for operating a bot without approval after having partially sought it. When multiple geographically disparate IP addresses started exhibiting the same characteristics as the blocked bot, we could only conclude that the bot had been farmed out to a botnet to circumvent the blocks, so a Ban could be instituted in addition to Wikipedia deciding by consensus to prohibit the usage of Archive.is/Archive.today as a site for caching a webpage. It's not an overreach if by consensus we determine not to use Archive.is. Hasteur (talk) 18:55, 30 June 2014 (UTC)
I accept that Rotlink and Rotlinkbot were up to no good; I don't accept that this undermines the purpose or reliability of archive.is. A lot of the previous RFC paranoia seems to stem from "a botnet is adding links, therefore archive.is intends to infect all our computers". I feel that this is hyperbole. Rotlink and Rotlinkbot were rightfully blocked and if Rotlink is still blocked I don't really see an issue with that. I also don't see an issue with preventing mass addition of links, even from Archive.is, but a lot of time has passed since the original RFC and the fears of archive.is have not manifested, and the moment it happened it'd just be blocked again, without any possible recourse next time. There is no reason not to allow the use of archive.is links by individual Wikipedians. Darkwarriorblake / SEXY ACTION TALK PAGE! 19:01, 30 June 2014 (UTC)
Hi all. As a suggestion for future readers, it wouldn't be a bad idea to use www.archive.org as a first-level caching service for Wikipedia, and then eventually use the archive.today website as a second-level backup copy of the archived web page.
Secondly, FOSS software like HTTrack may be parameterized as a web service, in order to copy the address links existing in a webpage (and in the linked ones) from a website A to a caching website B. Such a feature would be useful in relation to censorship, where users can access mirrored contents which are saved or moved among servers distributed over a wide geographic area.
Thirdly, dbpedia.org processes structured information in order to create an open knowledge graph (OKG) among different Wikimedia projects, whose contents are available to everyone on the Web. DBpedia makes use of faceted browsers, ontologies, and SQL-like queries, and operates on local copies of the external sources, without any data synchronization which a single wiki project may choose to introduce. Micheledisaverio (talk) 02:56, 23 July 2018 (UTC)

I knew I had seen the evidence before. But here it is where Rotlink states that changes are going to happen to archive.is before they are done. Given those facts (s)he either owns the site, or works for it. Werieth (talk) 22:12, 30 June 2014 (UTC)Reply

Or happened to get that very general information in another manner (e.g. a separate inquiry to archive.is). Or just made up a response which might come true, or not. Either way – it happening, or not – is easy to continue: Happens: "see I told you"; doesn't happen: "I did not like the changes so I did not have them go live"; "still working on it"; "maybe in a month or two".
Is there a link that shows the changes mentioned actually happened and were A. at least reasonably close to the quite generic description of the changes being worked on and B. near to the time that he stated this was being done? Obviously, there were changes to their FAQ after that, but it appears to have been on or after 2013-04-17 (assumed from example dates used, which probably would have been in the past when written) which was 5 months later. Other dates imply that there may have also been a change in January 2013, which would be a month and a half after the post by Rotlink.
The only other one along these lines that I recall, but have not re-found, was one where he said changes had occurred (i.e. mentioned them after the fact). As such, that one is a null data point. Given that, I did not actually put that much effort into trying to re-find it. — Makyen (talk) 01:12, 1 July 2014 (UTC)Reply
Due to the nature of their site I can't get diffs or change histories to provide bulletproof evidence; however, I do recall changes being made within a ~2 week period after that. Werieth (talk) 01:14, 1 July 2014 (UTC)
Ok, thanks.
Do you have any idea how often such changes were made in that time-frame? Knowing something along these line would help reduce the possibility of it just happening to be a change that could have been expected by someone familiar with the site. — Makyen (talk) 01:31, 1 July 2014 (UTC)Reply
IMO what User:Lexein said in response to my comment here Wikipedia:Archive.is RFC is most revealing. When Lexein contacted Rotlink, they got a response from someone with the same first name as that in the WHOIS report. It wasn't an archive.is domain but still it's quite suspicious. It also sounds like Lexein has had difficulties getting any info. I'm a bit unclear if Lexein has tried contacting archive.is (rather than Rotlink) and asked them about Rotlink and the botnet. If they have, then this is even more suspicious to me. While a false flag operation is possible, if someone is possibly trying to impersonate you and spam your site elsewhere, it seems very weird if you don't respond when people ask you about it. Ultimately we can't help those who don't help themselves and if a site seems to be getting spammed on Wikipedia, and the site owners don't say anything when you ask them about it, it's IMO reasonable to assume the owners or people working for them are doing it. Nil Einne (talk) 07:29, 1 July 2014 (UTC)

Misleading statements in Background section?


I see the claim "Archive.is does use advertising" in the Background section, but according to the archive.is FAQ, it does not use advertising, and there are no plans to introduce advertising at least until the end of 2014 (if at all); and I have not been able to find any adverts on archive.is, except for archived adverts that were on the page being archived. The content of the section would be assumed by readers to be fact, and I think that the continued, unchallenged presence of this unsubstantiated claim in the section may skew the commenters' views against archive.is. --Joshua Issac (talk) 14:22, 18 July 2014 (UTC)Reply

It should not show any advertisement, not even those on the archived websites. The archive.is FAQ is not trusted because it had broken its promise, and it still refuses to explicitly state that it will never advertise (unlike some fund-dependent alternatives, which have declared ad-free policies like Wiki itself). Moreover, the unapproved bot adding archive.is links has demonstrated the archive.is owners' intention (significance does not matter) to use Wiki as a promotion tool, which is a complete violation of Wiki policies, guidelines, Terms of Use, and the CC-BY-SA & GFDL requirements. Honestly, WMF should take office action to defunct archive.is links. Forbidden User (talk) 14:51, 18 July 2014 (UTC)
Are you suggesting that an archiving service should intentionally tamper with pages to remove certain content (adverts in this case) before it adds it to the archive? I do not think that such modification of pages would be appreciated by those who expect an archiving service to archive a page exactly as it is (or was). It is absurd to suggest that the non-modification of archived pages somehow breaks archive.is's promise of not using adverts.
There is a very big difference between having adverts, and not promising never to have them. The Background section claims the former, when that is not the case. Moreover, some services we do allow, such as WebCite, have made promises to the contrary, that they will introduce "content-specific ads".[1] Note that your characterisation of Wikipedia as an alternative to archive.is is inaccurate, because under the current policies and guidelines, Wikipedia would only be able to archive websites licensed under the CC-BY-SA or the GFDL, which most websites we use as references are not. See also this article on a previous Orange-Wikimedia partnership.
Finally, Rotlink's use of an unapproved bot (which did break the WP:NOTPROMOTION policy, and possibly the T&C, but definitely not the CC-BY-SA or the GFDL as you claimed) is completely irrelevant to whether the Background section of this RfC should make claims that mislead commenting editors. --Joshua Issac (talk) 15:46, 18 July 2014 (UTC)Reply
The last thing is only some additional comments. "The archive.is FAQ is not trusted because it had broken its promise" is based on others' comments that the site does not have a good reputation. "Are you suggesting that an archiving service should intentionally tamper with pages to remove certain content (adverts in this case) before it adds it to the archive?" - personally yes. Again, I post the thing here because I feel it is more personal. "It is absurd to suggest that the non-modification of archived pages somehow breaks archive.is's promise of not using adverts." - if it had a good reputation that'd be absurd. The point is that many have lost faith in it, and that should be respected. Forbidden User (talk) 16:04, 18 July 2014 (UTC)
Tampering with pages would serve to damage an archiving service's reputation for providing an accurate archive, so it is good that archive.is is not doing that. Which comments about the site's reputation (for what?) are you referring to? --Joshua Issac (talk) 16:31, 18 July 2014 (UTC)Reply
For the record, using the template or changing the preamble of this RFC when a statement has remained stable since the launching of the RFC is not the appropriate way to challenge a pre-conception. I have removed Joshua Issac's second attempt to modify. The statement has been incorporated into others' responses. Discussing it here is the right way to challenge it. Also, you might want to ping Darkwarriorblake, who put the statement in at the beginning of the RFC, or KWW, who contributed significantly to this RFC. Hasteur (talk) 18:12, 18 July 2014 (UTC)
Disputed statements are challenged by tagging them with inline dispute templates, so that readers know that they are disputed, and linking to the relevant discussion on the talk page, so that readers know where the discussion is taking place (see Wikipedia:Manual of Style/Biographies, Wikipedia:Naming conventions (languages), Wikipedia:Public domain, etc. for such usage of templates). This is what I did, before my change was reverted, and I was attacked with unwarranted accusations of being a POV-pushing pedant.
I am not going to edit war to restore the tag to the disputed claims, and will instead notify of this discussion @Darkwarriorblake and @Kww as you suggested, as well as all of the other editors who explicitly mentioned advertising in their comments (@PaleAqua, Nil Einne, Ceyockey, Damotclese, and Wbm1058).
For clarity's sake -- in my opposition to the first RFC element, I did not say they currently use advertising; I referred to their FAQ indicating that they reserve the right to start including advertising. --User:Ceyockey (talk to me) 13:40, 19 July 2014 (UTC)Reply
Just to make it clear, my proposal is that we either provide evidence for the advertising claim, or remove it. I hope that the editors notified will contribute to this discussion. --Joshua Issac (talk) 19:19, 18 July 2014 (UTC)Reply
That was KWW's contribution based on the original RFC; you'll note that my reply is the first on the page and my comments note that I don't see any advertising on the site. Darkwarriorblake / SEXY ACTION TALK PAGE! 20:40, 18 July 2014 (UTC)
Since I was pinged: I don't think the advertising bit is a big issue, and it is a bit of a red herring. Yes, that part of the RfC might not be completely neutral, but I don't see that many who use it as a sole reason for opinions one way or the other. I only mentioned it a couple of times, most recently to explicitly discount it and the first time to challenge a comment that looks like it was a response to an RfC comment request talk page notice and appeared to be placed in the wrong subsection. PaleAqua (talk) 00:01, 19 July 2014 (UTC)

Actually the claim that blocking the botnet "was merely procedual" is wrong, as the "user" clearly violates policies, and would have been rejected anyway. "An effort to get a bot approved to implement the RFC result stalled, indicating that the community may no longer believe the block to be warranted." - stalled by filibustering, thanks. Meanwhile, two people against it cannot indicate any WP:CCC. "Archive.is is an archiving service similar to sites like Webcite and the Wayback Machine, offering different levels of service up to and including snapshots that are retained regardless of modern changes in a sites robots.txt file, which the Wayback Machine can abandon (potentially delaying rather than removing the potential for LinkRot), while Webcite has presented itself as having an uncertain long term future tied to funding. No issues have been found with the quality of the snapshots provided at archive.is." tries to imply that its service is superior to others and that it is the only "good choice" for Wiki archiving, giving a first impression that "archive.is is something beneficial", which is disputed.Forbidden User (talk) 07:34, 20 July 2014 (UTC)Reply

It provides information that the two major alternatives are provably not reliable and that we need access to as many effective archive sites as possible. Darkwarriorblake / SEXY ACTION TALK PAGE! 18:31, 20 July 2014 (UTC)Reply
This is disputed, which is why it skews people to support. Forbidden User (talk) 14:50, 21 July 2014 (UTC)
That the Wayback Machine deletes entries from its database is easily verifiable, and the procedures for removal are listed on the Internet Archive website. WebCite's uncertain future has also been confirmed by the website owner, whose fundraising campaign has still not attracted enough money to secure the website's future, after nineteen months of campaigning. The FAQ has also said that there will be ads on WebCite. The skew is not in support of archive.is, but in support of link removal, and it results from the archive.is ads claim, as well as the hypothetical botnet, which probably did not exist. --Joshua Issac (talk) 19:50, 29 July 2014 (UTC)Reply
About the advertising thing, I challenged the claim a few days after the RFC started, pinging the appropriate people etc. From what I can tell, no evidence was ever provided in support of the claim that archive.today is currently including their own advertising. I considered complaining about the misleading RFC claim but ultimately decided the RFC was already messy enough, with enough of a messy history (and this was before Werieth was blocked as a sockpuppet), that it would be best to just let it be. Any admins closing this RFC will, I presume, recognise that the claim is unsupported and therefore put zero weight on such claims and consider the appropriate weight to give any support or opposition for some action when part of the reason has no weight. Yes, it's a mess for admins, but I don't see how it can be helped. By now, and even when this thread was first raised, IMO changing the wording of the RFC wasn't as important as mentioning to anyone who had made a comment in support or opposition based on the claim of advertising that the claim appears to be unsupported, so they could reconsider their comment. Nil Einne (talk) 16:35, 1 August 2014 (UTC)

{{Not a ballot}}?


As I've spotted two suspected SPAs, should we add this tag to avoid any canvassing or the like, as occurred in the previous RFCs? Or is it a little bit late? Forbidden User (talk) 15:26, 21 July 2014 (UTC)

Blatant ad


181.21.133.83 (talk · contribs) made a single edit to add "Alternatives" sub-head and wrote "You can use Permamarks instead. It run by respectable US company and obey DMCA and robots.txt." - This is highly suspicious for a first and only edit by an IP on something really unusual. Not only is it complete bunk - but I think something fishy is going on in this RFC. ChrisGualtieri (talk) 04:57, 28 July 2014 (UTC)Reply

I've noticed that as well, and also have a something not quite right feeling about some of the stuff related to this RfC. It's one of the reasons I took a step back from my original position and examined stuff more closely. I've got some suspicions of what it might be, but hard to prove. PaleAqua (talk) 05:11, 28 July 2014 (UTC)Reply
Also the IP is a proxy on several known blacklists, and it's from the deep south of Argentina. The company chosen is also brand new, and the entire "ad" is a total lie. It doesn't respect robots.txt as "claimed". We are being played, and the thinking that Wikipedia was being used to launch an attack is just different from the fact that there seem to be some bizarre and artificial SEO optimizations (for page ranking) and high page views to really unusual stubs like that pornographic actress with only a first name and a disambiguated title. I think a much smaller "round table" type discussion should be held which deals with the problem until it can be hashed out and a solution or two can be found, but I think it might be best to let the real issue be resolved by someone more familiar with this particular problem. ChrisGualtieri (talk) 05:49, 28 July 2014 (UTC)

What ads? I don't see any ads!


People keep on calling this commercial and continue complaining about advertisements, but I have yet to see any advertisements with this service. Perhaps "private" would be a better way to put it? Regardless, I don't understand complaints about something that isn't there. Dustin (talk) 17:37, 1 August 2014 (UTC)Reply

Probably they mean those on the archived pages. You know, those ads cause unnecessary concern about ad replacement and so on. The "ad" could also be RotLink's act, which some consider advertising. P.S. You are right that it is privately funded. Forbidden User (talk) 15:37, 3 August 2014 (UTC)
@Forbidden User: I still haven't seen any ads, and how do we know that Rotlink owns the service? Anyone could make that claim, but that does not mean it is true. I am still yet to see any archive.is-placed advertisements, so I will wait before making that judgement. The only ads which might be there would be ones hosted by whatever website itself and on that same page. In that case, it would be that the ad itself was archived, not the service placing the ads. Dustin (talk) 15:51, 3 August 2014 (UTC)Reply
I'm still looking at archive.is. Anyway, RotLink can promote archive.is even if they are unrelated. It has not been addressed, as the links added by RotLink are not yet removed. Forbidden User (talk) 16:09, 3 August 2014 (UTC)
Without placing its own ads, which it doesn't and we should not assume it does without any proof, you cannot say this is all some terrible for-profit scam. (not applying to you, but to many of the people on the actual RfC page) Dustin (talk) 16:46, 3 August 2014 (UTC)Reply
By the way, can you access archive.is? I can't. Forbidden User (talk) 17:06, 3 August 2014 (UTC)

I noticed that Rotlink is still replacing lots of dead links on Wikia websites, especially Military Wiki with over 9,000 edits. Does anyone have thoughts about this? Dustin (talk) 03:37, 21 September 2014 (UTC)Reply

Also, I noticed that most of the links on Military Wiki being fixed by Rotlink are pointing to archive.org, not archive.is/archive.today. Interesting... Dustin (talk) 03:57, 21 September 2014 (UTC)Reply
Wikia is a for-profit commercial website to begin with, so I really don't think Jimbo Wales and Angela Beesley would really mind someone running an unauthorised bot to replace deadlinks. Perhaps it would be much less controversial there or something, since they're not supposed to be a "free content" website. --benlisquareTCE 07:18, 21 September 2014 (UTC)Reply

Statistics from dewiki

URL pattern | Count
* Links to archive.today/[:timestamp:]/[:url:] | 2328
Links to archive.today/[:url:] | 12
Links to archive.today/ (namespace) | 2
Links to archive.today/[:short-key:] | 0 (Editfilter 182 log)
Links to archive.is/* | 0
Links where a memento of archive.today is used and web.archive.org is available but not useful | 280
Links without an available alternative in other archives | 2048
Links to *.archive.org/*/[:timestamp:] | ~71100
Links to webcitation.org | ~12900

With a restrictive workflow, the usage rose by 250 links since June 2014, 56 of them with an available but not useful memento in other archives. Boshomi (talk) 07:59, 26 October 2014 (UTC)
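
For anyone wanting to reproduce counts like the ones above on another wiki, here is a minimal sketch in Python of how external links could be sorted into the same URL categories. It is not the method used for the dewiki figures, which is not described here. The regular expressions are assumptions about what a timestamp path and a short key look like, and the names `ARCHIVE_HOSTS`, `classify` and the sample links are hypothetical.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumption: both host names belong to the same archiving service.
ARCHIVE_HOSTS = {"archive.today", "archive.is"}

# Assumed path shapes (the real URL grammar of the service may differ):
#   /<timestamp>/<original url>  e.g. /20140630174700/http://example.com/a
#   /<original url>              e.g. /http://example.com/b
#   /<short key>                 e.g. /AbCdE
TIMESTAMP_URL = re.compile(r"^/[\d.\-]{4,}/https?:/")
URL_ONLY = re.compile(r"^/https?:/")
SHORT_KEY = re.compile(r"/[A-Za-z0-9]{4,6}")


def classify(link: str) -> str:
    """Return a category label for one external link."""
    parsed = urlparse(link)
    host = parsed.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    if host not in ARCHIVE_HOSTS:
        return "other"
    path = parsed.path
    if TIMESTAMP_URL.match(path):
        return "timestamp+url"
    if URL_ONLY.match(path):
        return "url-only"
    if SHORT_KEY.fullmatch(path):
        return "short-key"
    return "other"


# Hypothetical sample links, only to show the classification.
sample_links = [
    "http://archive.today/20140630174700/http://example.com/a",
    "http://archive.today/http://example.com/b",
    "http://archive.today/AbCdE",
    "http://archive.is/AbCdE",
    "https://web.archive.org/web/20140630174700/http://example.com/a",
]

counts = Counter(classify(link) for link in sample_links)
for category, number in sorted(counts.items()):
    print(category, number)
```

The sketch only buckets links by shape. Deciding whether another archive holds a usable memento for the same page, as in the 280 and 2048 figures, would need a separate lookup that is not attempted here.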