User:Stefan2bot

From Wikimedia Commons, the free media repository

Operator: Stefan4 (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: Adding original upload logs to pages.

Automatic or manually assisted: Automatic with some manual elements.

Edit type (e.g. Continuous, daily, one time run): To be run once in a while when I feel that a new run is needed.

Maximum edit rate (eg edits per minute): 6.

Bot flag requested: (Y/N): Yes.

Programming language(s): Python.

See Commons talk:Wikitravel Shared transfer task force#Time for a new solution?. Some users are copying files from Wikivoyage to Commons without adding an original upload log. In the near future, the Wikivoyage image repositories will go offline, so it won't be possible to check the source for upload information, and it might be harder to tell if a file is a copyright violation or not if the original upload time and date are unknown. I request permission to add original upload logs to files by bot. The plan is that the bot will run semi-automatically:

  • If the file has {{Flickrreview}} or certain other templates, it is assumed that Wikivoyage isn't the original source, so no original upload log is added.
  • If the file was uploaded to Commons before it was uploaded to Wikivoyage, it is assumed that Wikivoyage isn't the original source, so no original upload log is added.
  • If the file already seems to have an original upload log, it is assumed that no additional upload log is needed.

In other cases, an original upload log is added automatically if the files are identical (same hash value). If the files are different, the bot asks if an original upload log should be added or not. --Stefan4 (talk) 21:19, 23 November 2012 (UTC)
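The checks above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the bot's actual code: the `FileInfo` record, the `decide` function, the SHA-1 choice, and the way an existing upload log is detected (a plain text search) are all hypothetical, and the template list contains only the one template named in the request.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime

# Only {{Flickrreview}} is named in the request; the "certain other
# templates" are not listed, so this tuple is deliberately incomplete.
EXTERNAL_SOURCE_TEMPLATES = ("flickrreview",)


@dataclass
class FileInfo:
    """Hypothetical record describing one copy of a file."""
    content: bytes       # raw file bytes
    uploaded: datetime   # time of the first upload on that wiki
    wikitext: str = ""   # text of the file description page


def sha1_hex(data: bytes) -> str:
    """Hash used to compare the two copies (SHA-1 is an assumption)."""
    return hashlib.sha1(data).hexdigest()


def decide(commons: FileInfo, wikivoyage: FileInfo) -> str:
    """Return 'skip', 'add' or 'ask' for one Commons/Wikivoyage pair."""
    text = commons.wikitext.lower()
    # 1. A review template suggests Wikivoyage is not the original source.
    if any("{{" + name in text for name in EXTERNAL_SOURCE_TEMPLATES):
        return "skip"
    # 2. Uploaded to Commons first: Wikivoyage is not the original source.
    if commons.uploaded < wikivoyage.uploaded:
        return "skip"
    # 3. A log already seems to be present (assumed heuristic: text search).
    if "original upload log" in text:
        return "skip"
    # 4. Identical files get the log automatically; otherwise ask the operator.
    if sha1_hex(commons.content) == sha1_hex(wikivoyage.content):
        return "add"
    return "ask"
```

For example, two byte-identical copies where the Wikivoyage upload is older and the Commons page has no review template would yield `"add"`, while differing bytes would yield `"ask"` and leave the decision to the operator.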

Discussion

Does the bot operate hash based all the way? Or are you looking for filename matches first (this will obviously miss uploads with changed names)? --Dschwen (talk) 07:57, 24 November 2012 (UTC)
I forgot to mention that. There is a template which is added to files if they are available on Commons. Files with this template are categorised on Commons status ([1][2]). The bot goes through those two categories and obtains Commons file names from the NowCommons template. The checks mentioned above are just meant to verify whether the tag has been applied correctly (same/different hash value), whether it is likely that Wikivoyage is the original source and whether an original upload log is needed. --Stefan4 (talk) 10:35, 24 November 2012 (UTC)
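As a rough illustration of the step described above, reading the Commons file name out of a NowCommons tag might look like the sketch below. The template's exact parameter syntax is an assumption (first unnamed parameter or `1=`, falling back to the local file name when no parameter is given), and `commons_name` is a hypothetical helper, not part of the actual script.

```python
import re
from typing import Optional

# Assumed syntax: {{NowCommons}}, {{NowCommons|Name.jpg}} or
# {{NowCommons|1=File:Name.jpg}}. The lookahead rejects other templates
# whose names merely begin with "NowCommons".
NOWCOMMONS_RE = re.compile(
    r"\{\{\s*NowCommons\s*(?=[|}])(?:\|\s*(?:1\s*=\s*)?([^|}]*))?[^}]*\}\}",
    re.IGNORECASE,
)


def commons_name(wikitext: str, local_name: str) -> Optional[str]:
    """Return the Commons file name given by the tag, or None if untagged."""
    match = NOWCOMMONS_RE.search(wikitext)
    if match is None:
        return None
    name = (match.group(1) or "").strip()
    if not name:
        # Assumption: no parameter means the same name is used on Commons.
        return local_name
    # Drop an optional namespace prefix such as "File:" or "Image:".
    for prefix in ("file:", "image:"):
        if name.lower().startswith(prefix):
            name = name[len(prefix):]
            break
    return name.strip()
```

A page tagged `{{NowCommons|File:Foo.jpg}}` would give `Foo.jpg`, which can then be passed to the hash and upload-date checks.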
Please go ahead with a ~30 file test run. --99of9 (talk) 13:43, 24 November 2012 (UTC)
There should be a test run with exactly 30 edits now. See Special:Contributions/Stefan2bot. I noticed one problem: There's no interwiki prefix for the Wikivoyage file repositories, so I had to use a normal link, and since the bot account isn't autoconfirmed yet, I had to enter a CAPTCHA for every edit. Since I also had some other things to do, I sometimes waited a while before entering the CAPTCHA, so there may be long gaps between edits. However, the script seems to work properly (assuming that I get confirmed or autoconfirmed status on the account). --Stefan4 (talk) 21:05, 24 November 2012 (UTC)
Thanks. I've given the bot the confirmed status. --99of9 (talk) 22:06, 24 November 2012 (UTC)
When they shut down the Wikivoyage image repository, are all these links going to break? --99of9 (talk) 22:08, 24 November 2012 (UTC)
All links will break when the old servers are shut down, yes.
I could drop the userpage links if you prefer since it wouldn't make much sense to have a broken link. Other tools, such as tools:~magog/fileinfo.php and User:MGA73bot2, provide broken user links (example, discussion), which is worse than my links which do at least work for the moment.
{{Original description page}} forces a link to the file information page and I would prefer to use that template whenever possible since it provides localisation of the source notice.
Also note that there are two image repositories, Wts and Shared, which are used for different language versions. Shared uses a different URL structure which isn't compatible with {{Original description page}} at all. --Stefan4 (talk) 23:12, 24 November 2012 (UTC)
I confess. My bot provides bad links. My plan was to do a search and replace on all 40.000 (?) files when the transfer is complete. I think that Stefan should run his bot as fast/soon as possible, because once the file is deleted on the local project it is a nightmare to fix the problem. So fix all the easy problems and run the bot... --MGA73 (talk) 14:02, 3 December 2012 (UTC)
Yes, I'm just hoping that this will be approved. Any changes to links can be fixed later, as long as the information is available somewhere. --Stefan4 (talk) 14:10, 3 December 2012 (UTC)
As I see it the task IS approved: "Thanks. I've given the bot the confirmed status.". So I see no reason not to start the bot. This is not en-wiki. We do not block bots that do good edits. We talk to the operator if we think there is a problem. --MGA73 (talk) 16:42, 3 December 2012 (UTC)
See Special:Log: User:99of9 added the bot to the user group "confirmed users". This was to remove the need to enter a CAPTCHA for each edit. I thought that a bureaucrat would still have to close the request and add a bot flag. Or did I misunderstand something? --Stefan4 (talk) 17:34, 3 December 2012 (UTC)
Yes, I was waiting for input from others. Do you have a sense of how urgent this is? --99of9 (talk) 12:39, 7 December 2012 (UTC)
My sense of the urgency here is that a week seems like a long time. It's already difficult to check some files from the trial run, as they've since been deleted from wikivoyage WTS. --Avenue (talk) 14:46, 8 December 2012 (UTC)
Yes, it would be nice if the bot could run as soon as possible, before too many files have been deleted. It would be more work if I have to change the bot to read deleted files too, and deleted files aren't categorised on WTS either.
I didn't realize they were deleting already. I'll move to speedy close this.--99of9 (talk) 10:15, 9 December 2012 (UTC)
One question: Apparently, there are two different sorting orders for original upload logs (see discussion at User talk:MGA73#Please fix your bot). Sometimes, the oldest revision is first, and sometimes the latest revision is first. Is there some official rule here, and which order is preferred? --Stefan4 (talk) 22:56, 8 December 2012 (UTC)
Go with whatever people want in that discussion. I'm not aware of an encoded standard.--99of9 (talk) 10:15, 9 December 2012 (UTC)
All the trial run edits look fine to me, although I can't check the upload history for the 5 files that have already been deleted from wikivoyage WTS. --Avenue (talk) 02:14, 9 December 2012 (UTC)

Given the urgency, and apparent lack of problems, I propose to flag the bot if there are no further objections within 24 hours. --99of9 (talk) 10:15, 9 December 2012 (UTC)

Approved --99of9 (talk) 00:13, 11 December 2012 (UTC)