Commons talk:Structured data/Computer-aided tagging/Blocklist


Please leave your suggestions for additions to the blacklist. The development team is open to ideas on how to process requests. If there is an appropriate existing template system that might help here or other request models that can be emulated, please feel free to make those suggestions as well. Keegan (WMF) (talk) 17:30, 1 April 2020 (UTC)[reply]

Maybe just put ✓ Done or  Not done after the items listed here. --GPSLeo (talk) 18:12, 1 April 2020 (UTC)[reply]

List of items that should be added to the blacklist

Translation/same-word mismatches

The same word has multiple meanings and got matched with the wrong item:

Theoretical concepts

Things that cannot be seen in any image because they are only concepts:

Here I would disagree. I think that is a good suggestion if that is really in the image. --GPSLeo (talk) 20:34, 1 April 2020 (UTC)[reply]
@GPSLeo: But an estate is a theoretical space, just like a nature reserve -- the problem is that it's one of those items that can be interpreted in dozens of different ways. Sadads (talk) 20:49, 1 April 2020 (UTC)[reply]
The item is a subclass of architectural structure (Q811979) and this is definitely touchable. --GPSLeo (talk) 21:30, 1 April 2020 (UTC)[reply]
@GPSLeo: Yeah, I strongly disagree with that in the data model -- and it was recently added -- there is probably a better way to describe that, because in English that concept usually means the collective property, not a particular part/element that you can reach out and touch. Sadads (talk) 23:15, 2 April 2020 (UTC)[reply]
building interior (Q30062422) shouldn't be part of the blacklist. It's useful for specifying, for example, otherwise unspecified rooms. --XRay talk 05:19, 3 April 2020 (UTC)[reply]
I agree -- what I am proposing is that the interior design one is actually just describing building interiors: as a general concept I think this is hugely valuable. Sadads (talk) 12:44, 6 April 2020 (UTC)[reply]

Modes of producing media

These could theoretically be applied to another property (like the colors) Sadads (talk) 19:43, 1 April 2020 (UTC)[reply]

Time concepts

These can only be evaluated within the context of when and where the photo was taken, and should be a different property if used. If someone wants to depict it -- I would rather that be added more sparingly, instead of these assumptions being made wholesale. Sadads (talk) 22:46, 1 April 2020 (UTC)[reply]

Very generic

These items could be added to so many files that it would not make sense to add them to any:

Colors

Overly specific items applied to many different things

  • @XRay: The tool is only suggesting these for interiors of art museums, where for the most part the images depict objects in the collection. As for the actual buildings, art museums should be tagged with the specific museum itself, since most of those are in Wikidata at the moment. Very rarely are we going to be in a position where the suggestions from the tool add value. Sadads (talk) 14:00, 3 April 2020 (UTC)[reply]

Parts being applied to the whole -- or identifying the unnecessary tops of things

  • @XRay: So this is only for the computer-aided tagging: the vast majority of the tags being applied by new users from the machine suggestions are being applied this way. This is not meant to be an edict on the use of the property more generally -- it's just that the vast majority of them are bad from this tool. Sadads (talk) 13:32, 3 April 2020 (UTC)[reply]

Parts being applied to the whole -- or identifying parts of the face without a clear reason

Other

Blacklist update pending April 2, 2020

Thanks for the suggestions so far. Many of the above items are now pending update (marked with  Half done). The remaining items require a bit more investigation for long-term effect. I'll update again when the changes are in place on production. RIsler (WMF) (talk) 23:33, 2 April 2020 (UTC)[reply]

Shouldn't all the suggestions be investigated first? Adding to the blacklist within only two days and with nearly no comments is very fast. --XRay talk 05:15, 3 April 2020 (UTC)[reply]
I think for now it is better to add them fast and maybe remove them later. --GPSLeo (talk) 08:27, 3 April 2020 (UTC)[reply]
+1 -- I think at this point we should be moving off the tool as many of the items that don't show high consistency coming out of new users on the tool. The software has been added to the beta Wikipedia app as a "suggested edits" function: we will get more and more content that is mixed in with good things (like some of the edits from you, @XRay:, swept up in my work to reduce redundancy on these tags in the last couple of days). Sadads (talk) 13:55, 3 April 2020 (UTC)[reply]
Updates from April 2 are now on production and marked with ✓ Done. This will prevent the suggestions from appearing on newly uploaded files. The patch to apply the blacklist to all files in any past queue is being worked on. RIsler (WMF) (talk) 18:37, 7 April 2020 (UTC)[reply]

Propose excluding Order, Family, Genus as a generic practice

So for the generic tool, I think it's highly risky (and probably really bad for positive identifications) to leave these items in the results: most of the examples that I have seen so far have been wrongly depicted, and it suggests to the user that it's good enough to use that statement with the higher classification, when most of the time the photographer has already specifically identified what species it is.

There is probably a great opportunity for building a separate tool for folks of the iNaturalist bent to go through and sort the huge categories of unidentified species, but in the short term I don't think we want to do that here. Sadads (talk) 20:40, 1 April 2020 (UTC)[reply]

Another idea/categories

A good approach might be to look at the categories of the photograph and possible corresponding items. --XRay talk 16:42, 4 April 2020 (UTC)[reply]

Racial bias

Hi, I have been redirected here by Astinson (WMF), after an email exchange. My concern follows:

In the context of the Visible Wiki Women campaign, I have uploaded images of Brazilian women onto Commons. I did not initially care too much about including structured info on the pics, but today I was redirected to the suggested label editor on Commons and decided to give it a try. This is where the issue started: on images of a black woman, and only when the woman was black, I was given suggestions to describe physical traits, especially her hair. This was never the case with non-African Brazilian women. Apparently, there is some sort of racial bias in the way SDC labeling is working. The gallery below shows some pics I have uploaded; only in the case of Silvana Bahia was I offered labels that referred to physical traits.

Hopefully, this can be sorted out and fixed. Cheers. --Joalpe (talk) 00:42, 8 April 2020 (UTC)[reply]

This is appalling and deeply troubling to see in a Wikimedia tool, even as we know it's part of a wider and systemic form of racial bias in tech tools and platforms. Please look into this urgently, team, especially as a key focus of #VisibleWikiWomen is to challenge the biases of both gender and race on Wikimedia projects. And thank you, Joalpe for your work on this. Anasuyas (talk) 09:32, 9 April 2020 (UTC)[reply]
I think this is part of a broader Anglo-Saxon regional bias. Many grasslands get marked as prairie (Q194281), and every lake with mountains around it is supposedly a loch (Q1172903). But, for example, the typical central European deciduous forest (Q1211122)/temperate deciduous forest (Q1556311) is not known at all. --GPSLeo (talk) 09:30, 8 April 2020 (UTC)[reply]
GPSLeo, while I understand (and generally agree with) a broader Anglo-Saxon bias in many things tech and automation, I find it troubling that you would equate persistent racial bias against categories of human beings with the naming of geographical features. They are epistemically related, but they are not equivalent. Thanks, Anasuyas (talk) 09:32, 9 April 2020 (UTC)[reply]
I said this problem is part of the general bias resulting from the AI training data. That was not meant as an excuse for it. I think this is technically the same problem and could have the same solutions. --GPSLeo (talk) 10:53, 9 April 2020 (UTC)[reply]

@Joalpe: Thank you for your attention to this matter. After reviewing the system, specifically the recent uploads from the VisibleWikiWomen campaign, we've found that the algorithm focuses on the *prominence* of hair in the image, not race/ethnicity. In a small sample of images, this may not be immediately apparent. However, an analysis of a few dozen images in the campaign reveals that the algorithm equally suggests hairstyle/hair color across a variety of ethnicities and body types.

Appending "?action=info" text to any image file page URL will show some of the "behind the scenes" data for that image, including the suggested tags (straight from the machine vision API, before a user made any selections). Just scroll to the bottom or search for "Suggested labels". Doing this just for VisibleWikiWomen images from the last few days yields many instances of hair/hairstyle/hair color suggestions for photos of many types of people:

Hair/black hair:

https://commons.wikimedia.org/wiki/File:Tashi_Malik_on_March_08,_2020_(cropped).jpg?action=info

https://commons.wikimedia.org/wiki/File:Tashi_Malik_(cropped).jpg?action=info

https://commons.wikimedia.org/wiki/File:Kaushiki_Chakroborty_(cropped).jpg?action=info

https://commons.wikimedia.org/wiki/File:Seema_Rao_(cropped).jpg?action=info

https://commons.wikimedia.org/wiki/File:RSK_photo_crop_copy.jpg?action=info

https://commons.wikimedia.org/wiki/File:Dannielle-Engle-Headshot.jpg?action=info

https://commons.wikimedia.org/wiki/File:Bik_Tye1.jpg?action=info

Blonde/brown hair:

https://commons.wikimedia.org/wiki/File:LisaMonteggia.jpg?action=info

https://commons.wikimedia.org/wiki/File:Chris_Murphy_with_Wanda_Vasquez_(cropped).jpg?action=info

https://commons.wikimedia.org/wiki/File:Janelle-ayres-0X8C9916-rt.jpg?action=info

We are very aware of problems with bias in machine learning systems and we continuously evaluate our solutions for those issues. As the tool evolves, we'll develop with feedback and discussions like this in mind. RIsler (WMF) (talk) 14:59, 9 April 2020 (UTC)[reply]
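For anyone who wants to repeat the "?action=info" spot check described above, here is a minimal sketch -- not an official WMF or Commons tool -- that fetches a file's info page and looks for a hair-related term after the "Suggested labels" heading. It assumes the rendered page contains that literal heading, as stated above; the file titles and the keyword are placeholders taken from the example links.

```python
# Sketch only: automate the manual "?action=info" spot check described above.
# Assumes the rendered info page contains a literal "Suggested labels" section;
# adjust the marker string if the page layout changes.
import requests

FILES = [
    "File:Tashi_Malik_(cropped).jpg",   # examples taken from the links above
    "File:LisaMonteggia.jpg",
]

def suggested_labels_mention(file_title, keyword="hair"):
    """Return True if the 'Suggested labels' section of the file's
    action=info page mentions the given keyword."""
    url = "https://commons.wikimedia.org/wiki/" + file_title
    resp = requests.get(url, params={"action": "info"}, timeout=30)
    resp.raise_for_status()
    html = resp.text
    marker = html.find("Suggested labels")
    return marker != -1 and keyword.lower() in html[marker:].lower()

if __name__ == "__main__":
    for title in FILES:
        print(title, suggested_labels_mention(title))
```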

I came here after a discussion in Telegram in the Wikifeminist group. Apparently the faults of this tool are endless. Moreover, I wonder how this is actually a "good" response in any shape, way or form: "we've found that the algorithm focuses on the *prominence* of hair in the image, not race/ethnicity. In a small sample of images, this may not be immediately apparent. However, an analysis of a few dozen images in the campaign reveals that the algorithm equally suggests hairstyle/hair color across a variety of ethnicities and body types." What is the relevance at all of classifying people by the PROMINENCE of their hair, especially when the prominence of hair is typically a characteristic of women -- suggesting a feature like "hairstyle" on the first photo and, on and on in the rest of the photos, things like "makeover"? Aside from the discussion around racism, it is probably sexist to add categories such as "hairstyle" or "makeover" to depictions of women. So, no, if you "are very aware of the problems with bias in ML" it really doesn't show -- and unless you have a woman, a black person, a Latino, and several other groups of traditionally excluded people working as developers on your team, or a committee of all those people actively participating in discussions around the development and deployment of this tool, it's not enough to just make a statement about awareness (especially after making this mistake). There is TONS of research around algorithmic bias in AI, and there are lots of organizations that could have been brought in as consultants before deploying this tool, like the Algorithmic Justice League; even a check from any of these organizations, or following these simple guidelines, would have been enough to avoid making this mistake. This goes against all the ethics of a not-for-profit like the WMF, and it's pretty much a very Silicon Valley approach to the problem. Scann (talk) 18:57, 9 April 2020 (UTC)[reply]

Hello @Scann: . Thanks for your input. To address a few of your concerns above:

  1. The team working on this tool is not your typical Silicon Valley setup. I am African American. Both Program Managers who work on this are women. One of the primary developers is a woman. We take this work seriously and have first-hand experience with the issues involved.
  2. The tool is not attempting to classify people by the prominence of their hair. What it does try to do is identify prominent elements of a photograph. It's more about the characteristics of the photo than the person. This is relevant because Commons does actually have a fair amount of good imagery where hair is the primary focus/topic. Just to list a few easy examples: Man_with_long_brown_hair,_rear_view.jpg, Man_with_long,_dark_hair,_black_and_white_head_portrait.jpg, and Man_with_long_wavy_pepper_and_salt_hair_and_top_hat_at_Peach_Festival_2015.jpg. Our aim is to make imagery easier to discover using structured data across many languages so students, journalists, researchers, and Wikimedians can find just the right images for their use cases, which can be rather difficult to do at the moment.
  3. We've looked at some data for hair-related items and found consistency across gender and ethnicity. This image of a long-haired bearded man, for instance, had the suggested tag "hair" in addition to tags identifying his beard and mustache. We will continue to monitor this and will also talk with the machine vision API provider to make any needed tweaks.
  4. Speaking of tweaks, with all that said, you're right that some additional suggested tags probably aren't right for the task, may be applied with a bias, and we could do without them. That's why we filter things out, both proactively and when we discover them (which is a constant process, as the algorithm is changed in efforts to improve). The 'makeover' item is one of those things that's popping up now, and it's not something we want, so we will make sure that item is filtered out going forward.
  5. Finally, we absolutely value all feedback and we hope our prompt responses and actions reflect that. Algorithmic equity is a complex and sensitive topic and one we take seriously, which is why we value the opinions and perspectives of as many community members as we can get. RIsler (WMF) (talk) 23:25, 9 April 2020 (UTC)[reply]

Hello @RIsler (WMF): . I'm sorry if I came across wrong or too harshly. It's just that this tool has been a consistent problem (and a very annoying one, especially when I couldn't figure out where to shut it off), and learning about this today in the Wikifeminist chat made me very frustrated with the tool. I shouldn't have made those ad hominem arguments or assumptions, and I'm sorry if that offended you or your team. I do feel that this tool has a lot of problems, and there should have been more thought/training before making it public. For example, when it started showing me suggestions there was no way to say "this tag is not useful", it didn't allow you to go ahead and search for a more useful tag or double-check others' suggestions, and it kept showing me my own images for which I had already rejected a tag, with the same tag, instead of a better one. Also, I still think that even if the team behind this is not the typical Silicon Valley set-up, a committee or working group monitoring these things closely would still be beneficial and would make sense, especially advising and beta-testing the tool before it starts massively suggesting tags that might be potentially problematic. This is a timing issue. I don't think this is an issue of whether you or your team takes it more or less seriously; it's simply that the topic is complex (and sensitive) enough to merit more thought than the way in which it has been deployed. As of now, it looks like a mix between a tool full of bugs and an insensitive tool. And, by the way, I understand the potential of the tool -- but I don't think this way of deploying it is the best way to convince users to see that potential. I would have gladly contributed to a "gaming" thing like this (because it's fun), but the way the tool has been working is no fun at all. An improvement of Andrew's AI identification tools could have been a good idea, or a more progressive deployment (for example, with artworks). Scann (talk) 00:00, 10 April 2020 (UTC)[reply]

Hello @Scann: . Thanks for the thoughtful comments. We hear you, we know you're frustrated, and we're working hard to make things better. We would have liked to start with artwork like other tools have, but the Commons community made it clear that artwork data should live primarily on Wikidata wherever possible, which doesn't address our Commons-specific use cases. So we were left to start with photography right at the beginning. We did a "soft launch" in December, with a small number of users, after a period of beta testing with other users, and we did our best to scale up responsibly, leading to a wider release in late February.
But machine vision analysis on photography is very challenging. We did run into some surprising bugs and, to your point about timing, as the COVID-19 crisis unfolded we had less capacity/productivity than we originally planned to have in March. We hoped to have everything ironed out much faster.
Still, this month, we're releasing features and fixes that have been planned and worked on for a while. Just today we released the ability to add your own tags (which we've wanted to do since the beginning). We are also prioritizing new bugs that have come up, and working to improve our filtering (which is the purpose of this page). There are still community differences in opinion on which data modeling approach is better, and there are still assumptions about the tool that are often not accurate, but we're determined to work with community to clear things up. RIsler (WMF) (talk) 01:21, 10 April 2020 (UTC)[reply]
@RIsler (WMF): Thanks for looking into this. I am glad the WMF dev team is taking potential biases seriously in working on this tool. I am not familiar with querying the tool: is it possible to query all descriptors that have been used on a specific category, i.e., images that were contributed via the Visible Wiki Women campaign? Thanks. --Joalpe (talk) 19:31, 9 April 2020 (UTC)[reply]
Hello @Joalpe: . We're still working on providing better ways for community members to query everything. For now, you can track all images that include the Visible Wiki Women 2020 template AND have at least one "depicts" (tag) value with this search query. Click on any file in that search result, scroll down a bit in the resulting page, and click on the Structured Data tab to see the depicts statements (which is where computer aided tagging places tags users selected). Please keep in mind that some results may include tags/depicts statements that were added manually via UploadWizard or File pages. Let me know if you have further questions. RIsler (WMF) (talk) 23:48, 9 April 2020 (UTC)[reply]

Response to community on Suggested Tags bias issue

Thank you for identifying this issue, bringing it to our attention quickly, and explaining it so thoroughly.

Racial and gendered bias is inherently problematic wherever it is found. In the Wikimedia movement in particular, we’ve made an explicit commitment in our 2030 Strategic Direction to work against these structural biases, and toward knowledge equity.

Bias in software development, particularly ML tools, is a widely acknowledged problem. In the development of this program we disqualified a provider over unethical AI, but we knew that we would see this problem again. Bias is a deep structural problem in our cultures and societies that is often replicated in the things we create, and it is our responsibility to remain vigilant in identifying and addressing this harm. In this case, a tool we deployed appears to perpetuate, rather than reduce, bias. We take full responsibility for not conducting more thorough bias testing prior to deployment. We are truly sorry for the harmful way in which women, and especially black and African women, were categorized by this tool.

In order to address this specific issue, we are doing the following:

  1. We are adjusting the Suggested Tag system’s analysis criteria so that images of people will be excluded from the tag approval queues (both “Popular” and “User Uploads”). We will monitor and continue to make improvements to the initial implementation to make sure images of people rarely (if ever) appear in the approval queues until we have had a chance to review our systems further. We are currently discussing engineering feasibility.
  2. We are reviewing our internal processes for identifying and mitigating bias in our algorithms.
  3. We will not re-enable the functionality described in the first step until we have investigated the system’s analysis criteria for issues of bias. We have more work to do around planning this investigation and determining success criteria.

We have also received feedback about the name of this feature, “blacklist”, and have done some research on the background of the term.[1] It will be changed to “blocklist”.

We will have an update on our plans and progress next week.

1. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6148600/ RIsler (WMF) (talk) 18:59, 11 April 2020 (UTC)[reply]

The system changes mentioned above have been made and are now in effect on Commons. RIsler (WMF) (talk) 21:11, 14 April 2020 (UTC)[reply]

Canonical list

@Keegan (WMF): Which page is the canonical list? The main namespace page and this talk page list completely different pages as being blacklisted, which is confusing. Kaldari (talk) 01:54, 10 April 2020 (UTC)[reply]

Also this page is almost impossible to discover unless Ramsey tells you about it :) Seems like it should be linked from other places. Kaldari (talk) 01:58, 10 April 2020 (UTC)[reply]
@Kaldari: Hello! The true canonical list is at https://github.com/wikimedia/operations-mediawiki-config/blob/master/wmf-config/InitialiseSettings.php if you want the latest up to the minute update. The main project page does have a delay of up to ~72 hours after the update hits production because we have to make sure it lands, works with no issues, find time to update the page, etc. FYI, I've just updated it to reflect the latest additions. Keegan did post this blacklist page in at least one CAT-related place, but I'll let him touch on that :) RIsler (WMF) (talk) 02:24, 10 April 2020 (UTC)[reply]
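As an aside for anyone who wants to compare the on-wiki page against that config file, below is a rough sketch (my own, not an official script) that downloads the raw file from the repository linked above and lists every Q-id-looking token in it. The exact configuration key for the blocklist isn't quoted in this thread, so this will over-match Q-ids used elsewhere in the file, and the branch name may have changed since this discussion.

```python
# Rough sketch: pull the raw InitialiseSettings.php named above and collect
# every "Q<number>" token. This over-matches (other Wikidata ids live in the
# same config), so treat the output as a starting point, not the blocklist.
import re
import requests

RAW_URL = (
    "https://raw.githubusercontent.com/wikimedia/operations-mediawiki-config/"
    "master/wmf-config/InitialiseSettings.php"  # branch name may have changed
)

def qids_in_config():
    text = requests.get(RAW_URL, timeout=60).text
    seen, ordered = set(), []
    for qid in re.findall(r"\bQ\d+\b", text):
        if qid not in seen:          # deduplicate, keep first-seen order
            seen.add(qid)
            ordered.append(qid)
    return ordered

if __name__ == "__main__":
    ids = qids_in_config()
    print(len(ids), "distinct Q-ids found; first few:", ids[:10])
```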
@Kaldari: I've linked to this page in several places, and I intend to add more to navigation templates next week. This page was "soft launched" because there is no process from Commons yet on how to actually agree upon and approve items to send along to the development team, and I didn't want the page to be overwhelmed with lists and requests all at once with no clear way to process them. Keegan (WMF) (talk) 15:27, 10 April 2020 (UTC)[reply]
Makes sense. Hopefully it can be transitioned into an on-wiki blacklist in the future and then y'all can just let the community worry about it :) Kaldari (talk) 15:33, 10 April 2020 (UTC)[reply]
Sorry, late circling back to this, but I hope we can get to that point as well. Keegan (WMF) (talk) 15:52, 14 April 2020 (UTC)[reply]

Potential changes to blocklist methodology

Hello all. WMF staff were finally able to meet with Google devs yesterday to discuss more robust options for filtering/improving labels going forward. Talks are still in progress, but we expect some changes to happen on their side soon. On our side, we wanted to be careful about making too many filtering changes too quickly so we could adapt to performance concerns (applying lots of these retroactively can be costly), especially after implementing the change that filters out photos of people. We're in the process of adding more items to the list; however, some won't be added to the blocklist right now as we're still investigating the search/discovery impact of those labels. I'll mark those with  On hold. RIsler (WMF) (talk) 23:34, 27 May 2020 (UTC)[reply]

List is out of date

@Keegan (WMF): The list at Commons:Structured data/Computer-aided tagging/Blocklist#Blocklist is out of date. Could you update it? Also, it would be nice if you used the {{Q}} template, rather than just listing the item number. That will also give you the label for free. Kaldari (talk) 19:08, 16 July 2020 (UTC)[reply]

Updated. RIsler (WMF) (talk) 17:54, 17 July 2020 (UTC)[reply]
I might be able to get the list converted to the template if I can find a way to do it quick 'n easy with regex. We simply haven't had the time to get to something like templates for a list on that page. Keegan (WMF) (talk) 17:00, 20 July 2020 (UTC)[reply]
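For what it's worth, here is a minimal sketch of the kind of regex pass mentioned above, assuming the page lists bare Q-ids and that the {{Q}} template accepts the full id. It skips ids that already sit after a pipe or equals sign (i.e. already inside {{Q|...}} or a template parameter), and it is meant for previewing a conversion, not for saving unreviewed.

```python
# Quick 'n easy regex pass (sketch only): wrap bare Q-ids in the {{Q}} template.
# The negative lookbehind skips ids that already follow "|" or "=", so ids
# inside {{Q|Q123}} or template parameters are left alone. Preview before saving.
import re

def wrap_qids(wikitext: str) -> str:
    return re.sub(r"(?<![|=])\bQ(\d+)\b", r"{{Q|Q\1}}", wikitext)

if __name__ == "__main__":
    sample = "* Q194281 prairie\n* {{Q|Q1172903}} already templated"
    print(wrap_qids(sample))
```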