Commons talk:Structured data/Get involved/Feedback requests/Computer-aided tagging designs

Please leave your observations and feedback here. Keegan (WMF) (talk) 20:27, 8 October 2019 (UTC)

How to test it?

I thought I would be able to test this tool from the UploadWizard, but that is not happening. I don't see any change in my UploadWizard and there are no suggestions. Reading "Submitting images for suggestions":

  • "Users have the option to opt-in to receive a notification to review their images for depicts suggestions when using the UploadWizard." -- Where is this opt-in? I don't see it. What does it look like?
  • "The opt-in option appears after publication in the UploadWizard, when the user is presented with links to their files." -- Nothing appears for me. How do I switch it on to test it?
  • "This opt-in for notification can be also selected in Special:Preferences, and users can opt-out from receiving notifications there as well if they change their mind." -- I haven't found this in my Preferences. Where should I find it, and what does it look like?

Juandev (talk) 21:42, 8 October 2019 (UTC)

Right, these changes do not exist yet. The opt-in will look like a check box with a descriptive line of text; it will be very plain. I'll figure out how to make the main page clearer. Keegan (WMF) (talk) 21:44, 8 October 2019 (UTC)
To be clear, we're looking for feedback about how the design and the workflow looks before we build it, at which point you can tell us how it feels to use. Keegan (WMF) (talk) 21:46, 8 October 2019 (UTC)
I see. I thought we could already test the feature. Juandev (talk) 21:59, 8 October 2019 (UTC)

Feedback

So how much time will it take to process an image? Will it take longer when a batch of images is sent for processing? I think it would be useful to be able to submit both new and existing images for tag suggestions: new images before saving, if the processing time is short, and existing images on a click. Could this service be used externally, by tools like VicunaUploader?

What if there is no translation from a tag to a Wikidata item? I guess such a tag is not displayed, right?

What are the popular upload tabs? Would it be possible to create a pagePile or select a category for such tagging?

It would be nice to test it. A live version always brings up new problems and thoughts. Juandev (talk) 22:12, 8 October 2019 (UTC)

  • Hello, from what we can see in the screenshots and recordings provided, it looks pretty good both in terms of design and use. Regards, Christian Ferrer (talk) 04:20, 10 October 2019 (UTC)
@Juandev: I do not know how long it will take until the notification is sent out, but I do not think there will be a difference between batch uploads and smaller, individual file uploads. It will not work with other tools. The tagging tool is multilingual, and tags and translations should obey the fallback language chain that Wikidata uses (so yes, a tag should still be displayed...). Popular images are taken from an algorithm based on tagged assessment (quality/featured images, etc) and reuse on other wikis. It will not be possible to create lists or use categories with this tool. Keegan (WMF) (talk) 18:02, 11 October 2019 (UTC)
@Juandev: Oh, and for category and pagePile tagging, you can use the AC/DC tool. Keegan (WMF) (talk) 18:04, 11 October 2019 (UTC)
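
The fallback behaviour described in the reply above (a tag is still shown when there is no label in the reader's language) can be pictured with a minimal sketch. The function name, the labels, and the fallback chain are illustrative assumptions, not the actual Wikidata or MediaWiki implementation or configuration.

```python
def pick_label(labels, chain):
    """Return (language, label) for the first language in `chain`
    that has a label, or None if no language in the chain has one."""
    for lang in chain:
        if lang in labels:
            return lang, labels[lang]
    return None


# Hypothetical labels for one item; real items may have many more.
labels = {"en": "zebra jumping spider", "la": "Salticus scenicus"}

# Assumed chain for a Czech-speaking user; the real chain is defined
# by the wiki's language configuration.
chain = ["cs", "sk", "en"]

print(pick_label(labels, chain))
```

With these example inputs there is no Czech or Slovak label, so the English one is used and the tag is still displayed.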

Tag popular images

Is it possible to tag/suggest tags for images of others? How will this work? This is not clear from the screenshots. Will this feature also be released at the end of October? John Samuel (talk) 19:36, 9 October 2019 (UTC)

This service is one-way: the computer suggests tags to the user. To suggest tags for other users' images, I'd suggest simply editing the structured data for the file. Statements can be removed, undone, or rolled back as needed, like any other edit. And yes, the goal is to have the tool ready by the end of this month (October). Keegan (WMF) (talk) 19:38, 10 October 2019 (UTC)
Just to add to what Keegan said above - the Popular Images tab in the UI is for a pre-populated list of images gathered via an algorithm that only includes images that meet one of the following criteria: a.) have been assessed as quality, valued, or featured, or b.) have been used on numerous wikis. Any autoconfirmed user can confirm tags that have been suggested for any images in that tab. As Keegan mentioned, if an image does not meet those criteria, it is best to tag it manually by editing the structured data directly on the image's file page. RIsler (WMF) (talk) 20:52, 10 October 2019 (UTC)

generic vs specific tags

Currently, the depicts guidelines say that users should add the most specific tags possible and not add more generic tags. For example, you should tag a photo of a Zebra Jumping Spider as "Salticus scenicus", but not "jumping spider" or "spider" or "animal" or "life". I imagine that in many cases (perhaps most cases), Google's API will suggest more generic tags than what a human could figure out from the description (or it will suggest both specific and generic tags, as in your examples). Should we still discourage the addition of such generic tags? If so, can we incorporate some kind of guidance to that effect into the interface? Like "Only choose the most specific tags that apply to the image.". Or do we want to change the policy at this point and say add as many tags as possible? Kaldari (talk) 00:55, 10 October 2019 (UTC)

I agree that Google's AI will probably not be able to identify species and breeds, but once the software has the proper Wikidata item, could it offer drop-down menus with some parent items? This might work for species. Juandev (talk) 06:42, 10 October 2019 (UTC)
@Kaldari: I personally think that the depicts guidelines could start to evolve to be more liberal than they are now. If I remember correctly, the current guidelines were written to be narrow at my own suggestion at launch, to avoid people over-tagging images when the tool was brand new and before there was any guidance. I think enough time has passed to start broadening the statements. But again, this is my personal opinion. Keegan (WMF) (talk) 15:08, 10 October 2019 (UTC)
This is also my (professional) opinion :) RIsler (WMF) (talk) 16:01, 10 October 2019 (UTC)
@Keegan (WMF), RIsler (WMF): I think this mainly hinges on how y'all are planning on integrating structured data into search. If someone does a search for a specific tag, will the search engine also include items further down the Wikidata item hierarchy (even if they aren't explicitly tagged as such)? For example, if a photo is just tagged as "Chihuahua", would a search for the tag "dog" find that photo? If not, it seems like we would want to include tags for all levels of specificity (Chihuahua, dog, carnivore, mammal, animal), but if the search is going to know about item relationships, we would only want the most specific tag (Chihuahua). Kaldari (talk) 17:34, 10 October 2019 (UTC)
We've been talking about the reliability of the Wikidata concept hierarchy for quite a while now, and it just won't support that level of functionality. It'll work for some things, but for every example that works there's two others that lead to unexpected and/or unreliable results. We've brought the issue up with WMDE and the Wikidata community, with no resolution in sight. Unless something dramatic happens (always a possibility!), our best option for the foreseeable future is to handle it at the tag level. RIsler (WMF) (talk) 17:57, 10 October 2019 (UTC)
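
The two options weighed in this thread (search that knows about item relationships versus tagging every level explicitly) can be contrasted in a small sketch. The parent map below is a toy, hand-written hierarchy based on the Chihuahua example above; it is not real Wikidata data, and the function names are hypothetical.

```python
# Toy hierarchy (assumption): each tag points to one more generic parent.
PARENT = {
    "Chihuahua": "dog",
    "dog": "carnivore",
    "carnivore": "mammal",
    "mammal": "animal",
}


def ancestors(tag, parent=PARENT):
    """Return the chain of more generic tags above `tag`, nearest first.
    Stops if the chain would revisit a tag, so a cyclic map cannot loop forever."""
    seen = []
    while tag in parent and parent[tag] not in seen:
        tag = parent[tag]
        seen.append(tag)
    return seen


def matches(query, file_tags, expand):
    """True if a search for `query` should find a file tagged with `file_tags`.
    With expand=False only explicit tags count (tag-level handling);
    with expand=True the search also walks up the hierarchy."""
    if query in file_tags:
        return True
    if not expand:
        return False
    return any(query in ancestors(t) for t in file_tags)


print(matches("dog", {"Chihuahua"}, expand=True))
print(matches("dog", {"Chihuahua"}, expand=False))
print(matches("dog", {"Chihuahua", "dog"}, expand=False))
```

Hierarchy-aware search only needs the most specific tag, but it is only as reliable as the hierarchy itself. Handling it at the tag level means the generic tags have to be present on the file.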

Identifying Gene Names

Overall, it's looking good!

I have a specific use case that I'd be eager to help work on. There are many biological pathway diagrams like File:Signal transduction pathways.png, containing gene names like FADD. But if you search for FADD, you won't get that diagram, because the word "FADD" is just in the PNG, not anywhere in the text. (I just now manually added a note to link FADD in the image to the corresponding Wikidata entry.)

It would be helpful to researchers if we could identify the genes these diagrams mention. This could be done via a combination of OCR, post-processing to match results to known gene names, and human curation. How could I get started helping to make this happen? --Ariutta (talk) 04:28, 10 October 2019 (UTC)

The word is just in the PNG? Where? I am probably blind. Juandev (talk) 06:45, 10 October 2019 (UTC)
No, it's not your fault. This is actually a good example of the problem I hope we can address :) Look for "FADD" near the lower left of the diagram (up from the gray "Death factors" symbol). Ideally, you would be able to use ctrl-f to find it on the page, and additionally, you could create a link that would display the image and highlight FADD, similar to this, where "ATP6AP2" is in yellow. --Ariutta (talk) 19:59, 10 October 2019 (UTC)
This use case shouldn't hold up the release of the work already completed; it's probably an item for a future release. I just wanted to start the conversation. --Ariutta (talk) 20:12, 10 October 2019 (UTC)
Hello! Although the Google Cloud Vision system does have a feature for OCR (which we already use on Wikisource), it requires an additional cost that prevents us from using it at scale along with the tagging feature. So we're not looking at this as a feature for Commons at the moment, but it is possible a future release will support it. RIsler (WMF) (talk) 20:46, 10 October 2019 (UTC)
OK, I had assumed it was the same cost to get "LABEL_DETECTION" vs. "LABEL_DETECTION + TEXT_DETECTION". Even without OCR, the tagging of gene mentions could definitely be done manually. But if the computer-aided tagging doesn't include TEXT_DETECTION, I should move this discussion somewhere else. Any suggestions for where? Thanks! --Ariutta (talk) 18:16, 11 October 2019 (UTC)
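
The post-processing step proposed in this thread (matching OCR output against known gene names before human curation) might look roughly like the sketch below. It assumes the OCR text is already available as a plain string; the gene list is a tiny hypothetical subset built from the two names mentioned above, and the function name is made up for illustration.

```python
import re

# Hypothetical subset of known gene symbols; a real list would come
# from a curated source.
KNOWN_GENES = {"FADD", "ATP6AP2"}


def find_gene_mentions(ocr_text, known=KNOWN_GENES):
    """Return known gene symbols found in OCR text, in order of first
    appearance and without duplicates. Matching ignores case."""
    tokens = re.findall(r"[A-Za-z0-9-]+", ocr_text)
    found = []
    for tok in tokens:
        candidate = tok.upper()
        if candidate in known and candidate not in found:
            found.append(candidate)
    return found


# Example OCR output (invented for illustration).
print(find_gene_mentions("Death factors FADD Caspase 8 fadd"))
```

Case-insensitive matching can produce false positives when an ordinary word happens to match a gene symbol, which is one reason the candidates would still need human review.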

Do we have a tool for mass tagging?

This is probably not the right discussion venue, but although I am excited about computer-aided tagging, I feel like what we need now is a mass-tagging tool: something like the cat-a-lot tool, but for adding "depicts" tags. Are there any tools for that? Perhaps an extension of cat-a-lot for a GUI and the QuickStatements tool for a text interface? --Jarekt (talk) 13:56, 11 October 2019 (UTC)

I feel like we do not have proper tools for allowing people to do the job efficiently, and a computer-aided tagging tool should come after some more basic tools. --Jarekt (talk) 14:03, 11 October 2019 (UTC)
@Jarekt: Actually, there are two of them. See Commons:Depicts#Gadgets. Ayack (talk) 14:08, 11 October 2019 (UTC)
And by the way, QuickStatements supports Commons editing. Ayack (talk) 14:24, 11 October 2019 (UTC)