Commons talk:Structured data/Media search

Please leave your feedback about the interface and experience of using Special:MediaSearch. Keegan (WMF) (talk) 18:13, 28 May 2020 (UTC)

Thanks for the first round of feedback

The development team greatly appreciates the feedback left by everyone here, thanks for taking the time. There will be a new round in the near future with design changes to the prototype based on what's been received about the tool, and I'll get a post along with it to explain the influence and purpose of the changes. Keegan (WMF) (talk) 16:28, 22 June 2020 (UTC)

I'll be updating the page tomorrow with mockups of design changes based on feedback that you can expect to see in the next version of the prototype. Keegan (WMF) (talk) 21:15, 1 July 2020 (UTC)

Update posted

I've updated the page, please have a look. Keegan (WMF) (talk) 16:58, 2 July 2020 (UTC)

Copying over what I put on the project page:

Vue.js and the next three weeks or so (13 July 2020)

MediaSearch is being ported to a new software library, vue.js. During the next few weeks while the port takes place, three features will briefly be removed from MediaSearch that will be restored when the port is complete:

  • Autocomplete–this will return stand-alone.
  • Audio/video playback–this will return as part of the Quick View feature.
  • Filters–these will return as part of the new Filters feature.

I'll update when changes go live. Keegan (WMF) (talk) 18:31, 13 July 2020 (UTC)

Media Viewer

It would be useful when clicking an image in the grid... that a Media Viewer were displayed, delivering additional data without exiting the search. Now when you click a thumbnail you get redirected to the file page. Strakhov (talk) 02:16, 25 June 2020 (UTC)

Interesting, thanks for the feedback. Keegan (WMF) (talk) 16:30, 29 June 2020 (UTC)

New tabs–"Other" and "Categories and Pages"

I've updated the status section with screenshots of new designs. The team is building in a new tab, "Other," to handle file types such as .pdf, .djv, and .stl. The "Categories" tab has been expanded to "Categories and Pages," to cover text pages like talk pages and other non-mainspace areas of Commons. You can expect to see these new tabs live within the next few weeks after the vue.js port is completed.

Speaking of the vue.js port, the changes related to that should be going live later next week (week of 27 July). These changes will temporarily remove a couple of features as I've previously posted about. The features will be returning soon with these new tabs. Thanks for following along, I'll have more information next week after the vue.js version is live. Keegan (WMF) (talk) 17:53, 24 July 2020 (UTC)

Odd result

Hey, found one that is interesting -- searched for "Native American" (was comparing it with the new Google search's handling of a sensitive topic) and surfaced a few odd results not explainable by text or concepts on the page (including this fish and one of the American Gothic copies). Is there any way to understand/expose why something is in a result to, for example, either improve the structured data, or give more specific feedback on the search results, etc? Sadads (talk) 22:52, 2 September 2020 (UTC)

@Sadads: This actually isn't exclusively a MediaSearch problem. It's an inherent, long-standing issue with text matching algos with our search backend as you can see from this example with default Commons search. As you can see there, the fish image shows up because it matches both the words "native" and "America" in the description. It's a somewhat similar situation with "American gothic" - compound terms are tricky for the Commons current search algorithm and it tries to match the component words separately as well as combined. Current search relies on text matching algorithms which are usually okay but make key assumptions. MediaSearch makes some improvements, but will work much better with structured data on more files. MediaSearch, as you can see, actually does a better job of surfacing relevant content as it tries to rank/prioritize files with structured data plus a few other tweaks. But, despite having some unique logic of its own, MediaSearch still utilizes the algorithms of the old search too so it will inherit some of that behavior. MediaSearch is still alpha level software and we'll continue to tweak and build upon the improvements we've already made, but by its nature search is imperfect and some inaccurate results are bound to show up. We'll do our best to keep those to a minimum. RIsler (WMF) (talk) 19:58, 3 September 2020 (UTC)
@RIsler (WMF): Oh that makes sense -- and I totally think this is a huge improvement. I think more what I was asking is along the lines that in the old search you get a hint for the text that "matches" that content (bolded text) -- is that something that you would be getting in the Quickview? -- i.e. a little widget highlighting the text or feature (i.e. structured data) that is being used to drive the result? or even a hover over feature that revealed some of the elements of the algorithm in the back that play heavily in that result. I am going to go in and tweak the language on that result for instance. Sadads (talk) 21:30, 3 September 2020 (UTC)
For example, I was able to get the fish removed from the result (American Gothic is kind of a bizarre one there and in the main results -- it's not registering any use of the word "Native" there). Sadads (talk) 21:41, 3 September 2020 (UTC)
@Sadads: Ah yes. Quickview is actually available now, but it's behind a flag. You can add &quickview=1 to any search result URL (like this) to enable it. That will show you both the filename and the description, which is where there are most likely to be text matches. There are some technical limitations keeping us from highlighting the matching terms but we're looking into it. Not sure how that will play out at the moment. RIsler (WMF) (talk) 22:50, 3 September 2020 (UTC)
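As a rough illustration of the flag described above, the sketch below appends quickview=1 to a search URL's query string. Only the quickview=1 parameter comes from the comment above; the example URL, its other query parameters, and the helper name with_quickview are assumptions made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_quickview(url: str) -> str:
    """Return url with quickview=1 set in its query string."""
    parts = urlsplit(url)
    # Keep the existing parameters, dropping any earlier quickview value.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "quickview"
    ]
    params.append(("quickview", "1"))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical search URL; the "q" parameter name is an assumption.
example = "https://commons.wikimedia.org/wiki/Special:MediaSearch?q=lion"
print(with_quickview(example))
```

Rebuilding the query string rather than concatenating "&quickview=1" avoids a duplicated parameter when the flag is already present.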
I think for me, less than highlighting (which would be great), it would be great to expose the elements of the search algorithm that weighed the most in it matching so that there could be an advanced user interface element (like the various ORES tools on English that expose which topic to the user) so that if something is over-weighting an image toward that search result it would be easy to zoom in and fix it -- i.e. if it's the Structured data, or caption, or random bits of the text or a category, etc. -- I don't know if this is something that the search interface exposes anywhere. For example, I discovered that this file was featuring prominently in the search for "lion" and it wasn't immediately obvious why until I shifted into the Structured Data tab. Sadads (talk) 12:15, 4 September 2020 (UTC)
That does indeed sound like a cool feature. When it comes to weighting, we have to defer to the expertise of the Search Platform team and I'm not sure how much we can expose the inner workings of CirrusSearch, but I'll mention it to them and see if we can formulate a plan for that kind of advanced curation tool. RIsler (WMF) (talk) 19:37, 8 September 2020 (UTC)
@Sadads, RIsler (WMF): as far as I know the "text" field is one of the most important fields for the search engine to work on. If you look at the search contents for the American Gothic English Wikipedia article, you'll notice that the "text" field contains the article.
The text field here is a mess. Take the example file: the only content of the text field is "English", so it seems to fall back to "auxiliary_text", which does contain both "Native" and "American". Time should be spent on improving the search indexing. This whole tool is built on top of a standard search API query, so if the index contains garbage, the results will contain garbage. Multichill (talk) 18:48, 23 September 2020 (UTC)
@Multichill: Thanks for pointing that out. We plan to look into some longstanding Commons search issues in addition to improving the methodology by incorporating structured data into the weighting as we build out MediaSearch. RIsler (WMF) (talk) 21:20, 24 September 2020 (UTC)
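To make the behaviour discussed in this thread concrete, here is a toy sketch of how matching a query's component words separately, combined with a fallback from a nearly empty "text" field to "auxiliary_text", can surface a file unrelated to the phrase as a whole. This is not CirrusSearch's actual algorithm: the index entry, the matches helper, and the fallback rule are all assumptions made up for illustration.

```python
import re


def tokens(s):
    """Return the set of lowercase word tokens in a string."""
    return set(re.findall(r"\w+", s.lower()))


def matches(query, doc):
    """True if every query word appears in the searched field.

    Toy rule (assumed): search "text"; if it contains none of the query
    words, fall back to "auxiliary_text". Words are matched independently,
    so the phrase as a whole never has to appear.
    """
    query_words = tokens(query)
    text_words = tokens(doc.get("text", ""))
    if query_words & text_words:
        field = text_words
    else:
        field = tokens(doc.get("auxiliary_text", ""))
    return query_words <= field


# Made-up index entry, loosely modelled on the fish example.
fish = {
    "text": "English",
    "auxiliary_text": "A fish native to rivers of the American Midwest",
}
print(matches("Native American", fish))  # True
print(matches("American Gothic", fish))  # False
```

In this toy model the fish matches "Native American" only because both words happen to occur somewhere in the fallback field, which is the kind of result a cleaner index would avoid.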

trying the tool with Polish terms

I was trying the tool using Polish terms. Some observations:

  • I tried "Księżyc" (moon) and was shown a bunch of photos seemingly unrelated to the moon. The issue was that the images were cropped, sometimes removing most of the photo, and the moon was often in the cropped part. It would be nice to be able to opt for no photo cropping.
  • I tried "glowa żyrafy" (giraffe head) and got 2 head shots of giraffe head
  • I tried "łysa głowa" (bald head) and got nothing. I tried "Lysy" or "łysy" (bald) and got files by User:Lysy. I tried "Łysina" (baldness) and got a lot of places in Poland with that word in the filename. I tried "bald head" and got a lot of images but very few of bald heads. I tried "shaven head" and got a few more images. "Ogolona glowa" gave me nothing. So the search for bald heads works only in English.
  • I tried "pasikonik" (grasshopper) and got a lot of files with grasshoppers (without word "pasikonik" in the file page) and some images by User:Pasikonik1979
  • I tried "komorka" (cell phone, cell in a tissue, shed) and got a lot of images with that word in the filename. No other cell phones or sheds.

--Jarekt (talk) 01:15, 24 September 2020 (UTC)

Great, thanks for the details. Keegan (WMF) (talk) 16:56, 24 September 2020 (UTC)

ChristianKl's thoughts

  • I searched for "Baum" both with the setting of German and English. In both cases it lists images made by Dein Freund der Baum that don't seem to match my search intent. I would expect that it would make sense to downrank the user name in the relevancy search. ChristianKl (talk) 18:30, 24 September 2020 (UTC)
  • My search for "femur" brought up images that are NSFW when searching in German. Part of the issue seems to be that (Femur) is missing from the aliases of the relevant item, but it's unclear to me why those images end up in the search results. While NSFW images aren't of great concern to myself, some people have a problem with seeing NSFW images that they didn't ask for and it might be worth thinking about how to deal with the issue. ChristianKl (talk) 18:30, 24 September 2020 (UTC)
  • When it comes to search suggestions it would be nice to have search suggestions that propose ways to clarify what sense of a word is meant. When I search "apple" it would be nice if the search suggestions would show "apple (fruit)", "Apple Inc", "Apple (family name)". I do understand that this is a more complex feature request but if it would be possible to implement such functionality in the search it would be great. ChristianKl (talk) 18:30, 24 September 2020 (UTC)

Display error on mobile

For whatever reason when I use this feature it doesn't display thumbnails on mobile, I don't get them on the "desktop version" of Wikimedia Commons either. Does anyone else have this? --Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 18:50, 24 September 2020 (UTC)

@Donald Trung: Thanks for reporting this. This is a bug with our lazy loading code on some mobile browsers. It will be addressed in an upcoming update. RIsler (WMF) (talk) 21:29, 24 September 2020 (UTC)
Thanks for the response, I think that I'll then try it on another device. The graphic user interface looks great so far. --Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 21:41, 24 September 2020 (UTC)

How it looks

It looks a lot like a typical Ecosia, Microsoft Bing, Google, Verizon's Yahoo!, DuckDuckGo, etc. image search. I really like this design because it would make it familiar to most internauts (or whatever you call people that use the internet). Overall I would say kudos to the team for getting the design right immediately; it looks like it would be a built-in advanced search engine and a welcome upgrade of the standard search bar. Keep up the good work everyone. --Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 19:04, 24 September 2020 (UTC)

Text excerpts in Categories and Pages, and order of tabs

Overall I really like the UX of this. The one change I think is needed is adding in the text excerpts to the "Categories and Pages" results. That would then cover all my needs when searching for policy/help pages and for discussions, as well as for media files.

I also tentatively suggest moving the "Other" tab next to the "Video" tab, because the "Other" file-types are still media, whereas the "Categories and Pages" are media-adjacent. Quiddity (talk) 17:39, 25 September 2020 (UTC)

Medium size and other inputs

Hi, firstly I tried Media Search in French, and at first glance I saw no issues. Secondly I have a strange display of the results for the "medium" size selection. E.g. when searching for this DOI 10.3897/zookeys.740.20458, I have a fine relevance of the results, both for the "images" and for the "Categories and Pages". However when I select size "medium" I have 3 images that are selected, which is in itself perfectly fine, but the first two images are zoomed in (maybe even upscaled), which is not visually pleasing. Thirdly, when clicking on one of the thumbnails within the results, the window that opens with the details is very good; maybe in this "window" little arrows on each side of the selected image, to go to the previous or next result, could be a good thing. Christian Ferrer (talk) 21:10, 26 September 2020 (UTC)

Thanks for the feedback! We are aware of the blurry images you mentioned and are working to improve how we display images in the grid to resolve that. Also, good idea about the arrows allowing you to quickly navigate between images. MWilliams (WMF) (talk) 17:16, 30 September 2020 (UTC)

Audio and Video

It could be interesting to add the following criteria, as the size for pictures:

  1. size for video
  2. file duration for audio and video

Djiboun (talk) 21:24, 26 September 2020 (UTC)

New filter: assessment

Hi, it could be a good idea to add a filter "Assessment" with 3 possible choices: Quality, Featured and Valued. In order to highlight, in one click, and within the results, the images that have Wikimedia Commons valued image (Q63348040), Wikimedia Commons quality image (Q63348069) or Wikimedia Commons featured picture (Q63348049) in Structured Data. A bit the same principle as it was done with Help:FastCCI, but now using Structured Data. Christian Ferrer (talk) 07:07, 29 September 2020 (UTC)

I just would like to second this request – IMHO it is a very good idea. When search results include many many media, it can be very useful for the user to limit the results e.g. to Quality images. Thank you very much, --Aristeas (talk) 17:24, 2 March 2021 (UTC)
Thanks both for the feedback. I've created phab:T276257 to track this feature request. CBogen (WMF) (talk) 18:38, 2 March 2021 (UTC)

Still useless

I enter "Vojníkov" and I expect images of or from the village called Vojníkov, but I get a lot of crap. Why isn't a structured-data-only search in use? --Juandev (talk) 16:43, 30 September 2020 (UTC)

A "structured data only" filter sounds interesting, I'll pass it along. Keegan (WMF) (talk) 17:09, 2 October 2020 (UTC)
+1. Please. Strakhov (talk) 14:42, 3 October 2020 (UTC)
I'm looking forward to using this feature in Wikipedia, but it's still flawed. When entering a Wikidata ID in the search box, for example Q90, results may include:
  • "Structured data results". Those using that Wikidata id in Structured Data (P180, P170,...). These ones should be prioritized.
  • "Category results". Those files included in a Category sitelinked from the Wikidata ID (or P373-ed). Categorisation depth could be 2, 3, 4,.... These would be useful when there are not many files using the Wikidata ID yet.
  • "String results". Those files including "Paris" somewhere (description, caption, filename...).
The third ones IMHO should be disposable when indicated (for example or something like that). Strakhov (talk) 13:45, 10 October 2020 (UTC)
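The three kinds of results listed above could be ordered along these lines. This is only a sketch of the suggestion, not how MediaSearch ranks anything: the Hit record, the tier and rank helpers, the include_strings switch, and the sample files are all hypothetical.

```python
from dataclasses import dataclass, field

# Tier order follows the suggestion above: structured data first,
# then category membership, then plain string matches.
TIER_ORDER = {"structured_data": 0, "category": 1, "string": 2}


@dataclass
class Hit:
    title: str
    statement_ids: set = field(default_factory=set)  # Q-ids used in statements
    category_ids: set = field(default_factory=set)   # Q-ids of linked categories
    text: str = ""                                    # description, caption, ...


def tier(hit, qid, label):
    """Return the best tier a hit qualifies for, or None if it does not match."""
    if qid in hit.statement_ids:
        return "structured_data"
    if qid in hit.category_ids:
        return "category"
    if label.lower() in hit.text.lower():
        return "string"
    return None


def rank(hits, qid, label, include_strings=True):
    """Order matching hits by tier, optionally dropping string-only matches."""
    tiered = [(tier(h, qid, label), h) for h in hits]
    kept = [
        (t, h)
        for t, h in tiered
        if t is not None and (include_strings or t != "string")
    ]
    # sorted() is stable, so hits keep their original order within a tier.
    return [h.title for t, h in sorted(kept, key=lambda pair: TIER_ORDER[pair[0]])]


# Made-up sample files for the Q90 ("Paris") example.
hits = [
    Hit("Hotel Paris sign.jpg", text="Sign of the Hotel Paris"),
    Hit("Street in the 5th.jpg", category_ids={"Q90"}),
    Hit("Eiffel Tower at dusk.jpg", statement_ids={"Q90"}),
]
print(rank(hits, "Q90", "Paris"))
print(rank(hits, "Q90", "Paris", include_strings=False))
```

Passing include_strings=False corresponds to the request that string-only results be disposable when the user asks for that.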

Intersection category/statement

Hi, maybe it could be considered to give the possibility to use Media Search within a specific category (including (or not?) the subcategories), for example all the images within Category:Dogs and its subcats that have a depicts statement with ball (Q18545). It may be an interesting thing to develop. Christian Ferrer (talk) 08:51, 3 October 2020 (UTC)


For curiosity I tried Brazil [1], and Chesus, this is bad!

Prostitution, homeless people, beaches (that are not even in Brazil), an elephant!! This search is a reinforcement of stereotypes. Most of the photos were not made by Brazilians either.

Move on to more important topics:

  • This "popularity" is quite bad.
It will shift the curve toward photos that already have some attention, and create a bigger distance to other photos. It would be better to finally embed FP, VI, and QI in the search, and to prioritize the new ones, creating better variety. Also, being popular is not necessarily good; letting a machine choose which photos to bring us is not working.
  • Two clicks $%#$¨¨¨&!!
Again, a new feature that increases the steps! What is wrong with you guys?
Seriously, all the recent changes increase steps to go to some place, the "contribution" now I have to click to open the search bar, the structured data, I need to click in a bar to open it... the UX here was always terrible, but you are increasing the issue!
Why not the photo, the author and license at the same place?
Now I'm one click from the page that I want, I do not need to click two times, and in the first 3 I put as examples I have a link to the highest resolution, the author, the license, and a download link, all in the same place.
You want to keep the info and enlarge, okay, but create a clickable link below the image. Not everyone will use Ctrl+click, and on mobile, well, TIFFs do not work and these two clicks are also mandatory.
Make the things easier.
  • What's large?
Large for me are 30 MB pictures, 4K videos... my 5-year-old cheap mobile phone produces 4032x3024 pictures, so how is 1920×1080, the standard screen size, big?
It would be better to have a way for us to specify the size that we want to find.
  • Mapped search
I ran the WLE this year, and the community requested not to include the protected area as depicts. But "located at" was a nice and good idea that worked for this, and we already have tons of geolocated images.
But we do not have a search that shows us photos near a location.
We also do not have a map with all entries for a particular subject, which would be great for science, for example showing the distribution of a bird based on our photos.

That is it for now. -- Rodrigo Tetsuo Argenton m 11:55, 3 October 2020 (UTC)

Hello and thank you for your feedback! Regarding your concern about two clicks, I can explain why we made this choice. We ran user research studies and usability tests on this experience without the "quick view" (the panel with more information that comes out on the right). We also tested this design with more information below the images like you've suggested. The majority of users were frustrated by needing to load an entirely new page to see the image larger and get the information they needed. There also wasn't a consensus on what information was the most important for each user, especially for the overwhelming majority of users who are new to Commons. Putting a direct link to download the image before visiting the file page also can be problematic as many Commons contributors want users to read over the information on the file page before downloading. In general, putting everything that everyone wanted below the image became messy and hard to scan, especially when that information is very inconsistent across Commons. The quick view panel is a common user experience pattern for an overwhelming majority of image search traffic across the internet and was repeatedly asked for due to these expectations. So for many users this saves time and bandwidth to quickly access a majority of the metadata for each image while being able to continue searching on the same page. That being said, I’d love to run a few experiments in the future around putting information below the image as you have in your example and appreciate your opinion and passion around this project. We have plenty of room for improvement and iteration as more people begin to use this. MWilliams (WMF) (talk) 18:35, 5 October 2020 (UTC)

Add the Total Number of results available to UI

In the results UI, I'd like to see a "Results 1 – 20 of 23,867" indication like we get with a standard search. That way, in some circumstances if there are too many results for my needs or expectations, I will know as soon as the page has first loaded whether I need to refine my search criteria. That will also give me a clue about how many more results are yet to be loaded when the "Load more" button appears. Thanks! Quiddity (talk) 21:14, 16 November 2020 (UTC)
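A minimal sketch of the requested indication, assuming the interface knows the total hit count and how many results are loaded so far. The function names results_summary and remaining are hypothetical and not part of MediaSearch.

```python
def results_summary(first: int, last: int, total: int) -> str:
    """Format a 'Results 1 – 20 of 23,867' style line."""
    # Never report a range end beyond the total number of hits.
    last = min(last, total)
    return f"Results {first:,} – {last:,} of {total:,}"


def remaining(loaded: int, total: int) -> int:
    """Number of results still to come behind a 'Load more' button."""
    return max(total - loaded, 0)


print(results_summary(1, 20, 23867))
print(remaining(20, 23867))
```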

Search preference survey

I've posted a quick survey for users to take about which search experience they prefer on Commons. Please take a moment to look it over and participate if possible, it will be open for about three weeks. Keegan (WMF) (talk) 21:08, 17 December 2020 (UTC)


There is still time to take the quick Media Search survey on which search experience you prefer using on Commons, Special:Search or Special:MediaSearch. The survey is only one question–which search do you prefer–and will just take a moment to fill out if you're interested. Thanks! Keegan (WMF) (talk) 19:07, 5 January 2021 (UTC)

@Keegan (WMF): Can we change our answer after we submit it? I'd like to set my answer to "no preference" for now, but I plan to test out both search tools someday before the deadline. Thanks, pandakekok9 05:51, 14 January 2021 (UTC)
Ah nevermind, I didn't see this: The survey can only be taken once, and it will not appear again after being taken. pandakekok9 08:00, 14 January 2021 (UTC)

MediaSearch does not respect file name / description changes

copied from mw:Help talk:MediaSearch

See the file File:Aus der Hörnlihütte.jpg, which is the result of a rename, as the name and the description were wrong. But it is still found by MediaSearch, although as far as I can see, only the redirect could be the reason for that.

Sometimes a file is renamed, because the name indicates the wrong thing. After such a rename, MediaSearch should not find the file by the old term. I would skip the evaluation of redirects / renames from MediaSearch, as there always should be a reason to rename, renaming normally is a refinement, and SDC are not touched automatically by a file rename. --Herzi Pinki (talk) 18:13, 14 January 2021 (UTC)

Huh, interesting find, thank you for that. Keegan (WMF) (talk) 20:20, 26 January 2021 (UTC)

organizational non-content categories; wikidata

copied from mw:Help talk:MediaSearch

Under tab Categories and pages I find User:OgreBot/Uploads by new users/2019 February 07 18:00. This is silly.

And I do not find Wikidata entries at all. --Herzi Pinki (talk) 18:13, 14 January 2021 (UTC)

@Herzi Pinki: sorry for the delay in reply. There are no current or future plans to make searching Wikidata within MediaSearch possible, as that is too big of a technical challenge. The Commons Query Service may be able to suit some of your needs if you know what it is you're looking for in a query, as opposed to a search.
However, to your first point, there is a namespace selector filter that will be deployed soon that should take care of the issue that you're describing here. You'll have the ability to exclude these sorts of results if they're not what you're looking for. Keegan (WMF) (talk) 18:48, 18 February 2021 (UTC)

justification and cropping images

MediaSearch Wurtenkees.png crops the images to justify the line with 3 images. IMHO there is no need to crop the images (in that extreme way) just for the imagination of justification. I'm on a wide screen. --Herzi Pinki (talk) 18:25, 14 January 2021 (UTC)

This was an intentional design decision. While it's true this cropping may not be needed for wide screens, most users (including myself) are not yet using large monitors. Cropping and justifying the results allows for more results to fit on a page and makes the tool much more usable on "average" size screens and laptops. It's always possible to revisit these design decisions in the future, but this is purposeful. Keegan (WMF) (talk) 18:53, 18 February 2021 (UTC)
Problem is it renders it very poor as an image search tool since you have no idea without clicking if the poor composition is an issue with the photo or the search engine. Geni (talk) 13:27, 26 February 2021 (UTC)

Updates and moving towards a default state

@Strakhov, Sadads, Jarekt, Donald Trung, ChristianKl, Christian Ferrer, Multichill, Djiboun, Juandev, Rodrigo.Argenton:@Herzi Pinki, GPSLeo, PKM, Syced, Mike Peel, GerardM, Ayack, Spinster, Kaldari, EugeneZelenko:@Jmabel, Julle, Quiddity:


Thanks to all of you for leaving comments, questions, and concerns over the past year as this tool has been developed.

There are some new feature updates to Special:MediaSearch, and only a few left to implement. Thanks in part to the feedback from everyone that's been left here, the team thinks that Media Search is fast approaching the point of being able to replace Special:Search as the default search for Commons (Special:Search will remain available, and there will be a preference to keep that page as the primary search experience). Please let us know if there are any outstanding usability or design issues you think might need to be addressed as we move forward. I expect to be able to give more information about plans to the broader Commons community next week. Keegan (WMF) (talk) 18:51, 19 February 2021 (UTC)

  • Suggestion, I think that references to the old search engine should be somewhere in the GUI so people are aware of the alternative option rather than just "the people in the know", default search engine options should probably be a cookie for users without a Wikimedia SUL account. --Donald Trung 『徵國單』 (No Fake News 💬) (WikiProject Numismatics 💴) (Articles 📚) 19:00, 19 February 2021 (UTC)
  • No time to study this, but will it still have the problem of offering a ton of suggestions that turn out to have no hits on Commons? Or has that been solved? - Jmabel ! talk 21:40, 19 February 2021 (UTC)
  • You should find the auto-suggest to be much improved. Keegan (WMF) (talk) 22:02, 19 February 2021 (UTC)
    • @Keegan: So are you saying you believe this is fixed, or just somehow mitigated? - Jmabel ! talk 09:52, 24 February 2021 (UTC)
      • I'm saying that while I have no idea exactly what you're referencing in regards to your own personal experience, I do know that there have been several updates to the search backend over this past year since you last left feedback here and whatever your previous experience was will be different now. Keegan (WMF) (talk) 19:42, 25 February 2021 (UTC)
  • @Keegan (WMF): It looks ... rather complex and segregated. If you want it to replace the main search engine, what's the basic result that will be returned when you enter a search query and press return? If I try searching for Lovell Telescope, I get images, but they are rather random. I'm normally interested in getting to the category, but that's the 5th option? How can I see a set of results that says 'Here's the most relevant images, here's some audio files, and here's the category where you can find more'? Thanks. Mike Peel (talk) 20:57, 22 February 2021 (UTC)
  • Yes, the segregation is part of the point, it vastly raises the usability of the interface. Compare the query you're using to the old search. The category link in the search results is halfway down the page as the eleventh option, sandwiched between small media thumbnails, in addition to the link at the top, which is similar in function to the tabs replicated in the new media search. Generally, when using Special:Search here on Commons, you're hit with a barrage of image files, pdfs, wikipages, and any number of other results that are jumbled together. Keegan (WMF) (talk) 20:07, 23 February 2021 (UTC)
    Unless I am mistaken, once upon a time, searching for “Lovell Telescope” would have sent Mike straight to the category: back in May 2013, the SearchExtraNS extension was activated, and as far as I remember it stayed for years. I am not sure when nor why it was eventually disabled, but to my recollection it seemed to be at the same time as the depicts search was enabled in the search box. Jean-Fred (talk) 23:51, 23 February 2021 (UTC)
Thanks for the feedback, @Jean-Frédéric: it seems like you're referring to the change made in 2019 in phab:T235263, which removed redirects directly to a matching page title. Those redirects were often sending users to gallery pages which did not accurately represent the breadth of available files. There is still a preference to allow users to enable that redirect. CBogen (WMF) (talk) 16:40, 24 February 2021 (UTC)
(Village pump announcement about this change, Search help talk page post) mw:Extension:SearchExtraNS is still installed and enabled here, FWIW, serving up search results from a few Commons-specific namespaces. Keegan (WMF) (talk) 17:40, 24 February 2021 (UTC)
Thanks for the clarifications @CBogen (WMF), Keegan (WMF): The rationale given to not send people to Dog made sense to me at the time, and still does to a degree. However, I had not fully grokked that this would mean that searching for “Lovell Telescope” would no longer send me straight to Category:Lovell Telescope − which I think would still be a desirable behaviour. Jean-Fred (talk) 11:01, 25 February 2021 (UTC)
@Jean-Frédéric: the only thing that was removed was search results redirecting to main namespace pages; so if Lovell Telescope existed on Commons then search would have taken you there instead of the search results page. We didn't touch category pages. I'm not familiar with a time that a search result would take me directly to a category page, I can't recall having had that experience personally with my (mostly) vanilla Commons preferences. Keegan (WMF) (talk) 21:00, 25 February 2021 (UTC)
@Jean-Frédéric: I had a similar memory to @Keegan (WMF): it worked for galleries but not very well for categories (which was a problem with the old system). You could (and in fact, still can) search for "Category:Lovell Telescope" and find the category that way - but the new search returns images by default, even with 'Category' in the search term, which isn't so good. Perhaps it will still display the category link in the pop-down menu in that case, though? Thanks. Mike Peel (talk) 08:36, 26 February 2021 (UTC)
  • @Keegan (WMF): That makes sense, but it makes the (unintentional?) impression that Commons is for images, rather than for all media files. I think it's *good* that we say "we have PDFs, we have audio files, we have other media related to this search as well" rather than just "here are the images, click on the links above and we may or may not have other files". I'd definitely like to see categories highlighted more (appearing in the 11th place isn't great), but this goes the other way and completely hides them from sight from most search users. That may be a good thing in the long term - it would be great if search and depicts could replace categories entirely - but we're not there yet.
    I'd much prefer the basic search returns a mix of contents, with obvious links to just show images etc., but that goes against your aims I guess. Failing that, could you put a number next to the links to the other options, to clearly demonstrate that they have relevant results as well. And I would really like to see a prominent note saying that there is a relevant category (and if available, gallery) for the search term, so that it's more obvious that they are available. At least using exact name matching, but ideally displaying the top category in a search result so that it benefits from multilingual support in the infobox metadata. Thanks. Mike Peel (talk) 19:12, 25 February 2021 (UTC)
Thanks for this perspective. We'll keep an eye on the MediaSearch metrics to ensure that we continue to see improvements in the ability for users to find what they're looking for. Meanwhile, I've filed phab:T275900 to track the request to show whether the other tabs have results relevant to the search query. CBogen (WMF) (talk) 19:15, 26 February 2021 (UTC)

If we are to know its effectiveness in other languages, we need metrics that show the use of Commons based on the language used in search criteria. Thanks, GerardM (talk) 05:43, 23 February 2021 (UTC)

Commons is an English-language website. Its future search engine will support all our languages. Images are needed in any and all of our languages. Without a plan, this aspect of the search engine will be either an item on a tick list or of profound importance to all our projects. With proper attention, we will get more pictures found in Commons itself instead of being copied from articles in other languages. We will have students from all over the world looking for images, and freely licensed ones at that.

Release announcement posted[edit]

I've provided information over at the Village pump. Keegan (WMF) (talk) 20:37, 23 February 2021 (UTC)

@Keegan (WMF): "fast approaching" and "this will be made live next month" are quite different things, one implies that the process is still open for iteration, the other gives a deadline by which it has to be acceptable. The first approach is much better. Thanks. Mike Peel (talk) 19:16, 25 February 2021 (UTC)
@Mike Peel: understandable observation. However, the two are not mutually exclusive. The team is at a point where they believe it's ready to serve as the default landing for search. They also believe that they can continue to iterate and make improvements as needed before, during, and after the software launches. I hope that clarifies things. Keegan (WMF) (talk) 19:38, 25 February 2021 (UTC)
@Keegan (WMF): That makes sense, but in general, 'believe' is often not the same as reality. Please don't feel afraid to say 'it's not ready yet, let's wait a bit longer' or 'we want to implement this first' rather than targeting deadlines. Thanks. Mike Peel (talk) 19:44, 25 February 2021 (UTC)

Media size[edit]

The media sizes (All image sizes, Small, Medium, Large) don't appear to be very useful. I selected Large and it found images of only around 1.6MP. I'm guessing it was just looking for images with any dimension > 1000. Compare with Google's Advanced Image Search, which in addition has MP options going up to >70MP. What are the use-cases you considered for size? Given that it is possible to downsize or crop a larger image, the main case for small sizes would, I think, be to focus on those looking for icons in PNG or GIF format (which is exactly what Google calls its smallest size). I suggest you consider some options for larger image JPGs.

  • HDTV (1920 × 1080)
  • Ultra HDTV (3840 × 2160) (aka 4k)
  • > 5MP (about the resolution you need to print adequate quality on A4/US letter/magazine page)
  • > 10MP (high quality printing)
  • > 20MP (high resolution image)

At the moment, the highest setting, Large, is so low it doesn't filter Commons JPGs to any useful degree. PNGs or GIFs are a different matter, if they are mainly used for icons or web art. Video files likely deserve their own size options (standard def, HDTV, Ultra HDTV) - perhaps similar to what YouTube offers for viewing choices. -- Colin (talk) 11:15, 24 February 2021 (UTC)

Thanks for this feedback, @Colin: currently, as you suggested, the image size filter categories correspond to pixel size. Small is < 500px, Medium is 500-1000px, and Large is >1000px. We modeled these categories after the categories in Google's Image Search (though not the Advanced version). We're open to making changes here and I'd love to hear from more folks about what image size filter categories would be useful to them. CBogen (WMF) (talk) 20:03, 24 February 2021 (UTC)
Megapixel is a better metric than number of pixels on a side as it copes better with panoramas. I'd suggest something like small being <3Mpix, medium 3-10Mpix, large >10Mpix. Most photos by current generation cameras (and previous generation good cameras) would fall under 'large', so hopefully there would still be a reasonable number of results for all options. Thanks. Mike Peel (talk) 19:22, 25 February 2021 (UTC)
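
For illustration only, here is a minimal Python sketch comparing the two bucketing schemes mentioned in this thread: the current pixel-size thresholds (Small < 500px, Medium 500-1000px, Large > 1000px) and the suggested megapixel thresholds (small < 3Mpix, medium 3-10Mpix, large > 10Mpix). It is not MediaSearch code. The function names are hypothetical, and applying the pixel threshold to the longest side is an assumption, since the thread does not say which dimension is measured.

```python
def classify_by_side(width, height):
    """Bucket by pixel size, using the thresholds quoted in this thread.

    Assumption: the threshold is applied to the longest side.
    """
    longest = max(width, height)
    if longest < 500:
        return "small"
    if longest <= 1000:
        return "medium"
    return "large"


def megapixels(width, height):
    """Total pixel count expressed in megapixels."""
    return width * height / 1_000_000


def classify_by_megapixels(width, height):
    """Bucket by total megapixels, using the thresholds suggested above."""
    mp = megapixels(width, height)
    if mp < 3:
        return "small"
    if mp <= 10:
        return "medium"
    return "large"


# Hypothetical example dimensions, chosen to show where the schemes disagree.
examples = {
    "1.6MP photo": (1600, 1000),
    "HDTV frame": (1920, 1080),
    "Ultra HDTV frame": (3840, 2160),
    "wide panorama": (12000, 900),
}

for name, (w, h) in examples.items():
    print(name, round(megapixels(w, h), 2),
          classify_by_side(w, h), classify_by_megapixels(w, h))
```

Under these assumptions, a 1600 × 1000 image counts as "large" by side length but "small" by megapixels, which matches the observation that the Large filter returns images of only around 1.6MP.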

Faceted search for location of creation[edit]

I filed phab:T275787. Multichill (talk) 16:26, 25 February 2021 (UTC)

May need to de-prioritise usernames in searches[edit]

For example, "hilsea" mostly produces decent results but also pulls in a couple of rather random images due to author names. Geni (talk) 17:40, 26 February 2021 (UTC)