User talk:Dexbot

From Wikimedia Commons, the free media repository
Welcome to Wikimedia Commons, Dexbot!

-- Wikimedia Commons Welcome (talk) 05:59, 5 August 2012 (UTC)

Special:Diff/138740297

Hi Dexbot. Why don't you add a line break? --Leyo 22:50, 5 November 2014 (UTC)

Hi, thank you for telling me. I fixed it. Amir (talk) 07:42, 6 November 2014 (UTC)

Adding Wikidata link

Hi,

Your bot had a problem editing this page: Creator:Arthur Corvisy. It forgot the Q in the Wikidata reference. Might this have happened on other pages? Aristoi (talk) 00:44, 8 November 2014 (UTC)

Hi, it will fix them very soon. Amir (talk) 00:51, 8 November 2014 (UTC)
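The missing "Q" reported above can be caught before saving with a small validation step. This is a minimal sketch, not the bot's actual code: the function name `normalize_qid` is hypothetical, and it assumes only that Wikidata item IDs have the form "Q" followed by a positive integer.

```python
import re

def normalize_qid(value):
    """Return a well-formed item ID such as "Q42", or None if the value is unusable.

    Hypothetical helper: accepts "42", "q42" or "Q42" and always returns the
    canonical "Q42" form, so a bare number is never written to the page.
    """
    match = re.fullmatch(r"[Qq]?([1-9]\d*)", value.strip())
    return "Q" + match.group(1) if match else None
```

A caller would skip the page whenever the function returns None instead of writing a guessed value.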

Wikidata field in Institution pages

Hi, there are about 50 edits like this in the Institution namespace and they all end up in Category:Pages using duplicate arguments in template calls. It is great that you are adding the links to Wikidata, but please remove the existing empty wikidata fields first. I will clean up the templates. --Jarekt (talk) 13:00, 17 November 2014 (UTC)

Hi, thank you for doing it. I added lots of checks, but since the institution template accepts both |wikidata = and |Wikidata =, it broke some of my checks. Amir (talk) 13:45, 17 November 2014 (UTC)
Ok, that makes sense, and the wikidata/Wikidata confusion is sort of my fault, so it serves me right. By the way, most infobox templates on Commons accept a wide variety of alternative field names (mostly for historic reasons and to make them backwards compatible), so this is something to be careful about. Thanks for all your work. --Jarekt (talk) 15:04, 17 November 2014 (UTC)
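The duplicate-argument problem discussed above comes down to checking for every accepted spelling of a parameter before adding it. The following is a minimal sketch under stated assumptions: the alias list contains only the two spellings confirmed in this thread, the function names are hypothetical, and the regular expression ignores nested templates, which a real bot would need to handle with a proper wikitext parser.

```python
import re

# Only these two spellings are confirmed by the discussion; other templates
# may accept more aliases, so this tuple is an assumption to be extended.
WIKIDATA_ALIASES = ("wikidata", "Wikidata")

def find_wikidata_params(wikitext):
    """Return (name, value) pairs for every wikidata-like parameter, empty or not."""
    names = "|".join(re.escape(alias) for alias in WIKIDATA_ALIASES)
    pattern = re.compile(r"\|\s*(" + names + r")\s*=[ \t]*([^|}\n]*)")
    return [(m.group(1), m.group(2).strip()) for m in pattern.finditer(wikitext)]

def can_add_wikidata(wikitext):
    """True only when no alias of the parameter exists, so adding one cannot duplicate it."""
    return not find_wikidata_params(wikitext)
```

When `find_wikidata_params` reports an existing empty field, the safer edit is to fill that field in place rather than append a second one.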

Incorrectly cleaning "see below"

Hi Ladsgroup, your bot is cleaning too much when "Removing redundant 'See below' in permission", see Special:Diff/142186006. Regards, --Patrick87 (talk) 00:27, 11 December 2014 (UTC)

I concur with the above and even add that the message "See below" under "| permission =" is never redundant as it clearly fills in that important field of {{Information}} and refers unambiguously to the license-header section. Removing it when the file page includes a license section, while strictly parsimonious, reduces clarity and may contribute to false positives. -- Tuválkin 01:16, 11 December 2014 (UTC)
Darn, that is right. Here the bot removed a link with the see below that should probably have stayed. I pulled the emergency brake and blocked the bot for a day. --Dschwen (talk) 03:06, 11 December 2014 (UTC)
I agree, and will be reverting all of these on my files, if the bot doesn't. The license template has always been ambiguous in stating that attribution must be made "in the manner specified by the author or licensor" without including any field for the author to give any such specification. The place where I specify that is in the "Permission" field, but the bot has been removing not just "see below" from that field but the entire "Permission" field, including statements I've placed there in my hundreds of uploads, such as "user must attribute the photographer on any reuse" – something which, surprisingly, is actually not stated in the license template. (The template does not say clearly that "attribution to the author is required", even though a lot of uploaders seem to believe it says that. It only says attribution is required "in the manner specified", but if no manner is specified, some people infer that the author in those cases doesn't care about attribution. So, I make it clear when I upload photos.) Steve Morgan (talk) 08:04, 11 December 2014 (UTC)

I stopped the bot. I will use a stricter approach: the bot will only remove the permission field if it contains only "see below" or similar (like "See below."). I will revert the edits that have been made otherwise. Amir (talk) 10:35, 11 December 2014 (UTC)

Thanks Amir! I will unblock the bot account immediately. --Dschwen (talk) 14:39, 11 December 2014 (UTC)
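The stricter rule agreed on above can be expressed as a whole-value match rather than a substring search. This is a minimal sketch, not the bot's actual code: the function name is hypothetical, and which variants count as "similar" (case, surrounding whitespace, a trailing "." or "!") is an assumption.

```python
import re

# Assumed variants: any letter case, extra whitespace, optional trailing "." or "!".
SEE_BELOW_ONLY = re.compile(r"\s*see\s+below\s*[.!]?\s*", re.IGNORECASE)

def permission_is_redundant(value):
    """True only if the whole permission value is a bare "see below" variant.

    fullmatch() is what makes the check strict: any extra text, link or
    attribution statement in the field means the field is left untouched.
    """
    return SEE_BELOW_ONLY.fullmatch(value) is not None
```

Using `fullmatch` instead of `search` is the design choice that matters here: a field such as "See below. User must attribute the photographer on any reuse" is then never treated as redundant.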

bad Information templates

Amir, here is an example of a correct bot edit that resulted in an incorrect Information template: [1], due to a garbage in, garbage out problem. Maybe the bot should count {} and [] brackets and make sure they match in the initial or final wiki code. --Jarekt (talk) 20:59, 29 December 2014 (UTC)

Hey, I will add a check for them, but I also suggest that we check on which pages my bot deleted the category but still no valid Information template exists, so we can find errors easily. Amir (talk) 08:52, 30 December 2014 (UTC)
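The bracket check suggested above could look like the sketch below. It is an illustration only: the function names are hypothetical, and it uses a stack rather than plain counting, which is slightly stricter than the suggestion because it also rejects brackets that close in the wrong order. It does not exempt brackets inside nowiki or math markup, which a real check might need to do.

```python
def brackets_balanced(wikitext):
    """Return True if every { and [ in the text is closed in the right order."""
    closers = {"}": "{", "]": "["}
    stack = []
    for char in wikitext:
        if char in "{[":
            stack.append(char)
        elif char in closers:
            # A closer with no opener, or with the wrong opener, means broken markup.
            if not stack or stack.pop() != closers[char]:
                return False
    return not stack

def edit_is_safe(before, after):
    """Skip pages whose markup is already broken, and edits that would break it."""
    return brackets_balanced(before) and brackets_balanced(after)
```

Checking the text both before and after the edit covers the garbage in, garbage out case: a page that is already unbalanced is skipped rather than rewritten.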

mistake in the bot edit

The bot made a mistake when implementing the int:filedesc of File:Koh Samui Lipa Noi.jpg. It took the name of the uploader - who transferred the image from de.wikipedia 11 years ago - as the author's name. No big deal for this file since I am still active here and I corrected it. But it might be interesting for you and the code of the bot concerning other files that are not on anybody's watch list. --Tsui (talk) 22:36, 20 February 2015 (UTC)

Hey, you are right. I'm getting a list of them and will check them by hand (it'll be about 50-100) or revert them all. Amir (talk) 12:40, 21 February 2015 (UTC)
All possible errors were here and I fixed them. Amir (talk) 17:18, 26 February 2015 (UTC)

{{Information}} with only "description" field

Lately, I have run a couple of times into files using {{Information}} with only "description" fields. After some digging it seems like each time it is due to your bot edit, like here. In my opinion it is totally pointless to add the {{Information}} template if you did not detect source, date or author information and cannot populate those fields. I would rather skip such files and wait for algorithms that can detect something, or for humans. --Jarekt (talk) 13:22, 25 February 2015 (UTC)

Hey, I was trying to work on the pattern of users in this list, but it seems the regex wasn't perfect. I'll fix them soonish. Amir (talk) 19:03, 25 February 2015 (UTC)

@Jarekt: I started to work on the errors. There are 142 cases where the bot worked this way, and in some cases like this the bot made no mistake IMHO, since the source is not defined at all and these types of edits help us find them. But in some cases, like your example, the source is defined and the bot mistakenly put it in the description. I can split the list and fix the latter cases or revert all 142 cases. What do you suggest? Amir (talk) 06:36, 4 March 2015 (UTC)

I am not much concerned with already done edits, but am thinking more about the 500+k files still without an information template. In my opinion, if a bot does not detect author or source then it should skip the file, so other bots or people can work on them in the second round. Maybe add a hidden category like Category:Media missing infobox template: author and source not detected, so it is easier to keep track of them, but maybe it is not necessary. If we manage to reduce the number of "easy" files in Category:Media missing infobox template then it will be easier to work on the harder cases. One set of files I was thinking about is the ~156k files in both Category:Media missing infobox template and Category:Self-published work, like File:060703 Kattegatsilo 02.jpg. It seems to me that for those (if the file was not transferred from somewhere else) we can use the original uploader as the author, the EXIF date as date (with Template:According to EXIF data) and {{own}} as source. Unfortunately I am not sure how to access the original uploader and EXIF, but it might be similar to what you are already doing. --Jarekt (talk) 14:01, 4 March 2015 (UTC)
Yes, of course I won't work on "PD old" files and I leave them to humans. About self-published works, I wrote the script to take care of this job but there are two main concerns: 1- There are cases where the uploader uses {{Self}} but is not the photographer; the uploader can be the copyright holder instead of the photographer, and distinguishing these cases from simple ones (author, copyright holder and uploader are the same person) is not easy, but if I can distinguish them I will skip the complex cases. 2- The description in a huge number of cases is not standard and doesn't use language templates (I use a language detection system but it has issues). I appreciate your comments on these concerns. Amir (talk) 15:39, 4 March 2015 (UTC)
Yes, this script should only be applied to files using non-PD licenses, like CC and/or GFDL. I also assume that source and author are not easily detectable. About the author or copyright holder dilemma, we could create a wrapper template similar to Template:According to EXIF data, that would add text saying that the author is presumed based on the original uploader using the {{self}} template. I can create one if you think it would be useful. As for descriptions, I would grab the full text of the page, detect categories and place them at the bottom, detect licenses and place them in the license section, permanently delete interwiki links (like [[en:foo]]), and place the rest in the "description" field. You might also need to strip old section headers ("summary" and "license") and place some safety triggers that would skip the file if the "description" field is too big or complex. You might want to first only allow "descriptions" without any wikimarkup (no templates or wikilinks = no "[[" or "{{"), just to keep it simple on the first pass. I would not worry about language detection. I cannot imagine a robust way to do it. --Jarekt (talk) 18:54, 4 March 2015 (UTC)
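The first-pass procedure outlined above (pull out categories and licenses, drop interwiki links and old headers, then skip anything still containing wikimarkup) can be sketched as follows. Everything specific here is an assumption for illustration: the function name `split_page`, the short list of license template names, the two- or three-letter interwiki prefix pattern, and the 500-character size limit are hypothetical, not values taken from the discussion.

```python
import re

CATEGORY = re.compile(r"\[\[\s*Category\s*:[^\]\n]+\]\]", re.IGNORECASE)
# Assumed interwiki shape: a 2-3 letter lowercase language prefix, as in [[en:foo]].
INTERWIKI = re.compile(r"\[\[[a-z]{2,3}:[^\]\n]+\]\]")
# Hypothetical, deliberately short list of license templates.
LICENSE = re.compile(r"\{\{\s*(?:self|GFDL|cc-[\w.\-]+)\s*(?:\|[^{}]*)?\}\}", re.IGNORECASE)
OLD_HEADER = re.compile(r"^=+\s*(?:summary|licen[sc]e|licensing)\s*=+\s*$",
                        re.IGNORECASE | re.MULTILINE)

def split_page(wikitext, max_len=500):
    """Return (description, licenses, categories), or None if the page should be skipped."""
    categories = CATEGORY.findall(wikitext)
    licenses = LICENSE.findall(wikitext)
    rest = wikitext
    for pattern in (CATEGORY, LICENSE, INTERWIKI, OLD_HEADER):
        rest = pattern.sub("", rest)
    description = " ".join(rest.split())  # collapse leftover whitespace
    # Safety triggers: any remaining wikimarkup, or an oversized description, means skip.
    if "[[" in description or "{{" in description or len(description) > max_len:
        return None
    return description, licenses, categories
```

Returning None for anything unusual keeps the first pass conservative: such files stay in the maintenance category for a later round or for manual work.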
@Jarekt: please bring this discussion in a more general place and if people tend to agree on having a wrapper template we can start fixing files. Thanks for your efforts Amir (talk) 05:21, 28 March 2015 (UTC)
  • I consider these assumptions absolutely useless when the author is clearly mentioned. Moreover, the result is a wilful misrepresentation. --Figure19 (talk) 15:57, 10 October 2015 (UTC)

Metadata in description of {{Information}}

Hi Dexbot, is there any strategy for moving metadata like date, etc., which after your bot run is now contained in the description instead of the date parameter? --Arnd (talk) 08:05, 25 October 2015 (UTC)

PS: Btw, maybe it is interesting for debugging/improving purposes to see which templates are broken after the bot run: Category:Pages using Information template with incorrect parameter. --Arnd (talk) 08:28, 25 October 2015 (UTC)

Hey, mostly we are working on moving these data from the description based on common patterns in them. At first we need more descriptions there, and then we can work on moving them to the authors part. Amir (talk) 11:14, 26 October 2015 (UTC)

Category:Pages using Information template with incorrect parameter

Hi. Could you have a look at the files in that maintenance category? Your bot is responsible for the majority of them. --Leyo 21:52, 26 October 2015 (UTC)

Hey, thank you for telling me. It happened because tables got moved into the information template by mistake. I will fix most of them pretty soon (probably by bot). Would it be okay for you if it takes a week or two? Amir (talk) 10:23, 27 October 2015 (UTC)
One option would be to fix tables by replacing the following:
  • {| → {{{!}}
  • | → {{!}} (inside of tables only)
  • |} → {{!}}}
--Leyo 13:48, 27 October 2015 (UTC)
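The three replacements listed above can be applied mechanically, as in the sketch below. It is an illustration under stated assumptions: the function names are hypothetical, and the regular expression treats everything from the first "{|" to the next "|}" as one table, so it does not handle nested tables, and it escapes every pipe inside the table, including those in links or templates.

```python
import re

# Non-greedy match from a table opener "{|" to the nearest closer "|}".
TABLE = re.compile(r"\{\|.*?\|\}", re.DOTALL)

def _escape_table(match):
    body = match.group(0)[2:-2]  # text between "{|" and "|}"
    # "{|" becomes "{{{!}}", every inner "|" becomes "{{!}}", "|}" becomes "{{!}}}".
    return "{{{!}}" + body.replace("|", "{{!}}") + "{{!}}}"

def escape_tables(wikitext):
    """Escape pipes inside tables only; text outside tables is left unchanged."""
    return TABLE.sub(_escape_table, wikitext)
```

Restricting the replacement to matched table spans is what keeps template parameter separators outside the tables intact.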
Amir, since the number of broken {{Information}} templates is not so big, I would prefer to repair them manually. This way I can also extract the missing metadata. --Arnd (talk) 05:40, 30 October 2015 (UTC)
That would be great, thanks. Amir (talk) 11:56, 30 October 2015 (UTC)

Duplicate description

Hi Amir. I noticed that after your bot's edit, there was a duplicate “Description”. Then, I found out that there are more such cases. --Leyo 22:06, 3 November 2015 (UTC)

Leyo, I guess that the bot just takes all existing content and uses it for the description parameter. --Arnd (talk) 05:46, 4 November 2015 (UTC)
Yes, but the current approach might be improved. --Leyo 10:17, 4 November 2015 (UTC)
Leyo, two sections earlier I had a similar question. As I understand it, they first just want to have the {{Information}} template in place, and later extract data such as date, author etc. from it. --Arnd (talk) 12:06, 4 November 2015 (UTC)

I understand that you like my work

If you have any question I will be very happy to answer you Yair-haklai (talk) 16:06, 5 November 2015 (UTC)

Reverted edit on monument image.

I reverted this edit -- I am not the author, so the new version was not correct. It will have to be updated manually by someone who understands the new guidelines better than me and can read better than the bot. -- Phyzome is Tim McCormack 21:00, 8 November 2015 (UTC)

So why did you use Template:PD-self? Amir (talk) 10:32, 9 November 2015 (UTC)
10 years later, your guess is as good as mine. The File Upload Service is long since defunct, and I don't recall the guidelines it used. Regardless, the metadata will have to be updated by hand for that page, not by bot. -- Phyzome is Tim McCormack 00:13, 10 November 2015 (UTC)

OK, just reverted another edit to the same page. Would it make sense to block the bot from editing that page until we're done talking about it? -- Phyzome is Tim McCormack 13:33, 12 November 2015 (UTC)

I'm finished with images for now. Please note that these errors are rare and unlikely to happen. Amir (talk) 06:29, 13 November 2015 (UTC)

strange result

https://commons.wikimedia.org/w/index.php?title=File:Acido_benzoico_struttura.png&diff=177922557&oldid=144586053 --Itu (talk) 09:08, 25 February 2016 (UTC)

It's because of the "==" in the math expression, which made the bot think it's a new section. I will find any possible mistakes and fix them ASAP. Thank you for notifying me. Amir (talk) 11:59, 25 February 2016 (UTC)
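One way to avoid mistaking "==" inside an expression for a section heading is to require the full heading shape: a line that both starts and ends with the same run of equals signs. This is a minimal sketch, not the bot's actual fix; the function name is hypothetical and the pattern is a simplification of how wikitext headings are recognised.

```python
import re

# A heading must occupy a whole line and open and close with the same run of "=".
HEADING = re.compile(r"^(={2,6})[ \t]*([^=\n].*?)[ \t]*\1[ \t]*$", re.MULTILINE)

def find_headings(wikitext):
    """Return the titles of lines that look like real section headings."""
    return [match.group(2) for match in HEADING.finditer(wikitext)]
```

With this pattern, a bare "==" in the middle of a line, or at the start of a line without a matching closer, is not reported as a heading.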

Strange result - 2

Please note - Revision #168659732. This led to a broken description. --Kaganer (talk) 12:45, 30 June 2016 (UTC)

Hey, issues like this are inevitable when you run at a very large scale. Fortunately, finding issues like this is easy [2]. I will fix all of them one by one. Amir (talk) 15:54, 2 July 2016 (UTC)
All issues are fixed now. Amir (talk) 16:02, 2 July 2016 (UTC)
NP ;) Many thanks! --Kaganer (talk) 21:18, 3 July 2016 (UTC)