Commons talk:Categories

From Wikimedia Commons, the free media repository
This is the talk page for discussing improvements to Commons:Categories.

LANGVAR in category names?

Should a long-established and stable name in one variant of English be renamed without discussion to a US variant spelling, to be "Consistent with other such categories with American spelling."?

There is no COM:LANGVAR although I believe en:WP:LANGVAR would be relevant and illuminating. The intention there is to reduce "churn" in language-variant naming, and avoid edit-warring. There is also no reason that category names have to be consistent with each other: their membership is defined explicitly, not implied through names. Navigation between categories is primarily done by following links. If a reader is typing category names, they're more likely to be doing so in a local variant, not on the assumption "Everyone uses American spelling".

See [1] @Soumya-8974: Andy Dingley (talk) 19:15, 22 December 2021 (UTC)

@User:Andy Dingley: Support the rename in this case. This is NOT a general preference for US over UK. Taylor 49 (talk) 22:32, 22 December 2021 (UTC)
  • IIRC, early discussion was to prefer UK spelling/terms in UK related categories and US spelling/terms in US related cats. "Consistency" should not be used as an excuse to rename things contrary to what locals call them. -- Infrogmation of New Orleans (talk) 22:45, 22 December 2021 (UTC)
    I know this is the case in English Wikipedia, as I found it written in its style guide. However, I couldn't find the same thing in Commons and tried to give categories a consistent American spelling. --Soumya-8974 (he) (talk • contribs) 05:33, 23 December 2021 (UTC)
  • Obviously, but why would you do that? Just to land-grab for America? Some of us do find that pretty offensive, and even if (yet again) Commons doesn't have an explicit policy, why do you choose to act so inconsistently with Wikipedia, and then justify that as "for great consistency"? Andy Dingley (talk) 14:30, 23 December 2021 (UTC)
  • @Andy Dingley: I don't know what you mean here. Is not some "style guide" better than none at all to justify a spelling? Krok6kola (talk) 16:22, 23 December 2021 (UTC)
  • My point is that the nearest we have to a style guide is that from en:WP. Which is very strong on "don't make pointless changes" to avoid this problem, rather than the false claim that we need "consistency" to make categorization work. Commons already goes too far for consistency, such that we have "(ship)" in every category name for marine vessels, even for barges, boats, floating cranes and a whole slew of things that definitely aren't ships. Andy Dingley (talk) 16:55, 23 December 2021 (UTC)

I have resolved to use relevant national spelling variants for categories, following the Wikipedia policy. --Soumya-8974 (he) (talk • contribs) 07:35, 8 January 2022 (UTC)

Wikipedia guidelines are not in effect here, but the reasoning in and about them can be enlightening. Category names should also not usually be changed unilaterally.
The consistency becomes obviously problematic in categories that combine aspects: American spelling might have been consistent for space programmes, but I seriously doubt it would be consistent for the "European" part of Category:European space programmes, which this concerned. Any category can ultimately be a subcategory of a category in a different scheme, so when aiming for consistency, one cannot decide based only on one aspect, but should have a discussion with a broad audience. At some point the consistent names will clash badly with e.g. the principle of most commonly used name.
LPfi (talk) 16:42, 8 January 2022 (UTC)
@Andy Dingley: The short answer to the original question is no. Any category name change that is not clearly non-controversial should be raised as a CfD. Josh (talk) 12:02, 31 March 2022 (UTC)

@Andy Dingley, Taylor 49, Infrogmation, Soumya-8974, and LPfi:

Proposed addition to Commons:Categories#Category names
When creating a new category, the following principles apply to determining the correct variant of English to use in the category name in certain cases:
  1. Proper names of organizations, works, etc. should use the spelling of the subject in their category name, regardless of variant.
  2. Regional topics should use a variety of English predominant in the topic's region in their category name.
  3. Industries and academic/scientific fields with internationally-recognized standard naming conventions should reflect these standards in category names.

In all other cases, there is no automatic preference as to which variant of English should be used. Category names should not be changed solely in an attempt to standardize on a particular variety of English. It is acceptable to have sub-categories with names in a different variety of English than their parent category in cases where the parent is a general category while the child category is region-specific (e.g. Category:Gas stations in the United States being under Category:Petrol stations by country).

Proposed change to Commons:Categories#Universality principle (removed text struck; added text in italics)
Identical items should have identical names for all countries and at all levels of categorization. Categorization structure should be as systematical and unified as possible, local dialects and terminology should be supressed in favour of universality if possible though regional sub-categories may be named in accordance with the appropriate regional English variant. Analogic categorization branches should have analogic structure.
Reason: There is no policy (currently) on COM:CAT or COM:LP that gives any guidance as to which form of English to use. In general, I have found that most support in discussions is for the enwiki-ish approach of using the predominant dialect for the given region or topic if applicable. Don't get hung up on exact wording; I'm happy to change that based on discussion. Hopefully this will give some basic guidance to users without being too prescriptive, while also defusing some attempts to 'convert' all categories over to anyone's particular preferred variant of English. It is particularly important to revamp the text under the universality principle, as the current policy does appear to strictly forbid regional names, though it is current common practice to permit such. Josh (talk) 12:02, 31 March 2022 (UTC)
  • There are problems with not being consistent. One problem is that guessing category names becomes difficult. The names need to be guessed e.g. in the Upload Wizard, Hotcat and Cat-a-lot, unless you open a parent category in another tab and start exploring, slowing you down radically. When uploading a batch of files, one doesn't want to use many seconds on adding categories to individual images, in addition to those common for the batch. This is aggravated for non-native English speakers, who don't necessarily recognise the local words. It is hard enough for me to decide whether something is a sledge or a sleigh (especially when it is something not often discussed in English). If which one it is depends on the region, that'd be real confusion. –LPfi (talk) 16:04, 31 March 2022 (UTC)
    • Nothing wrong with having category redirects from variants. Agree that categories are in some cases needlessly difficult for non-native English speakers, and that should be kept in mind. (For example, I don't see the reason in constructing date categories with names like "October 2021" rather than "2021-10", but that is perhaps a discussion for another day.) -- Infrogmation of New Orleans (talk) 17:27, 31 March 2022 (UTC)
Oppose the policy changes presented in the two boxes above ... sorry but the Commons:Categories#Universality principle is more useful than a right to your dialect. As for UK vs. US, the choice should not be based on ideological reasoning ("US is bigger and better than UK"), but on practical usefulness:
  • "flat" or "apartment" ? "apartment" is better, because the word "flat" has 17'000 meanings whereas "apartment" has only one
  • "proctor" or "invigilator" ? "examination supervisor" is most universal and understandable
Dates should of course be YYYY-MM-DD (details can be discussed), consistent with HH:MM:SS, for practical reasons. The messed-up US system is simply not useful.
Taylor 49 (talk) 20:54, 31 March 2022 (UTC)
"Universality" as opposed to "right to your dialect"? Were there such a thing as standardized international English, I'd likely agree to implement it on Commons, but at present there is no such thing. There is nothing but "dialects". For global parent categories where there are significant differences, yes, we need to pick one, but for country specific subcategories I think using country specific language is reasonable (e.g. Indian English for "in India" categories, similarly US etc) - otherwise we risk using names for common objects that are completely unfamiliar to hundreds of millions of people, the people most likely to be contributing and categorizing media related to that topic. -- Infrogmation of New Orleans (talk) 00:52, 1 April 2022 (UTC)
Is there any work being done on making categories multilingual? We have been asked to keep them English only to avoid creating a mess that would be difficult to sort out once a good technical solution is implemented. That sounds reasonable, but I have been waiting for 10+ years, and not heard about any work being done on it. While Indian English may seem reasonable for Indian categories, why aren't Chinese names reasonable for Chinese categories? I think a mechanism for translation cannot be postponed much longer. –LPfi (talk) 12:18, 1 April 2022 (UTC)
@LPfi: As far as I am aware, there has been no real progress on this goal over recent years. I agree that I would love nothing more than to have a system such as Wikidata uses, which would be identifying each category with a 'behind-the-curtain' ID number and presenting the user with a label (and description) from a list by language based on their region or user settings. This would also simultaneously solve a lot of DAB issues by eliminating the need for such labels to be unique. However, as great as that sounds, I do not have any reason to believe it is close or even on its way. I laud anyone with the tech skills to lend their support to such an effort. In the meantime, we are stuck with the real limitations of the current system with no end in sight. Avoiding parallel categories for different languages isn't just to make such a future solution easier to implement, but also to allow our current category scheme to work in the meantime. Maintaining parallel categories would be an absolute nightmare, and inevitably would go against the most basic objective of categories: to get all 'images of X' in one organized place. However, to your point of 'Chinese for China', I actually agree. The argument that it would make it difficult for Latin-script language speakers to access, understand, or maintain such categories also, in my opinion, means that as a matter of logic, the English/Latin-script naming rules also exclude anyone who does not read Latin-script languages from participating in the project. However, this is my personal position and I do not see it having much support project-wide (but would love to see it discussed), and changing that is beyond this particular proposal. Josh (talk) 10:38, 3 April 2022 (UTC)
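For illustration only, the ID-plus-label idea described in the comment above could be sketched roughly as follows. This is a minimal sketch in Python, not an existing MediaWiki or Wikidata API: the Category class, the label_for method, and the IDs such as "C1001" are all hypothetical names made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Category:
    """Hypothetical category record: an opaque ID plus per-language labels."""
    cat_id: str                                   # assumed stable 'behind-the-curtain' identifier
    labels: Dict[str, str] = field(default_factory=dict)

    def label_for(self, preferences: List[str]) -> str:
        """Return the first label matching the user's ordered language preferences."""
        for lang in preferences:
            if lang in self.labels:
                return self.labels[lang]
        # No preferred label exists: fall back to the raw ID.
        return self.cat_id


# One category, shown under different labels depending on the reader's settings.
stations = Category(
    cat_id="C1001",
    labels={"en-US": "Gas stations", "en-GB": "Petrol stations"},
)
us_label = stations.label_for(["en-US", "en-GB"])
gb_label = stations.label_for(["en-GB", "en-US"])
fallback_label = stations.label_for(["fi"])

# Two distinct categories may share a label, because the ID (not the label)
# identifies them; this is why disambiguation suffixes would become unnecessary.
mercury_planet = Category(cat_id="C2001", labels={"en-GB": "Mercury"})
mercury_element = Category(cat_id="C2002", labels={"en-GB": "Mercury"})
```

In this sketch, the same category is displayed as "Gas stations" or "Petrol stations" depending on the preference list, and the two "Mercury" categories stay distinct through their IDs alone.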
  • Support Better to have a guideline than simply editors working at cross purposes. -- Infrogmation of New Orleans (talk) 00:52, 1 April 2022 (UTC)
  • Support A personal opinion. "Streetcars" where I live were all changed to "trams", a word not used in my locality, a place very proud of its "streetcars". In fact, nowhere I have lived in the U.S. (several places with streetcars) have they been called "trams", including San Francisco (see San Francisco cable car system). I don't categorize images in my current location anymore because buildings and such have been given names not used locally or by tourists. For whom is categorization "consistency" aimed? Krok6kola (talk) 12:39, 3 April 2022 (UTC)
    Having a policy hinder people from using their best judgement is not good. Like Taylor 49 writes above, it is good to choose the least ambiguous name (and perhaps the most widely understood), regardless of variant. This is especially important for the main category – and I think the main category (and any main X by Y category) also should have a description (and a rationale in many cases), and should be linked for that description from all subcategories. –LPfi (talk) 05:50, 3 April 2022 (UTC)
@LPfi and Taylor 49: That is exactly what current policy does -- hinders people from using their best judgement. According to the current wording, the category Category:Gas stations in the United States must be renamed to Category:Petrol stations in the United States due to the name of its main and sibling categories being 'petrol stations'. Of course calling a gas station a petrol station in the US is as silly as calling a petrol station a gas station in the UK. This of course does not make sense, and so we often just ignore the actual wording of the Universality Principle during CfDs. One of the main points of my proposal is to better fit the policy to what we are actually doing in practice. It actually gives more flexibility by allowing regionalization that makes sense versus the current policy's top down rigidity, and lays the groundwork for going farther in the future to allow broader language localizations (per your 'Chinese for China' suggestion above). Josh (talk) 10:48, 3 April 2022 (UTC)
It'd make sense to have category redirects either from the common names or from the local ones. I don't know which one would be better. Having such redirects also involves risks: Are the redirects fixed on category moves? Will there be clashes between the synonyms (such as LNG gas stations in the UK in a Category:Gas stations in UK)? The latter risk increases with the number of languages. I think we need to figure out how internationalisation of category names should be done properly. I assume the WMF thinks structured data will solve the issue, but as they haven't really worked with the community I have no hope it will be anything but a big mess in the foreseeable future. –LPfi (talk) 07:13, 4 April 2022 (UTC)
@LPfi: I am a big fan of redirects in general. While they can at times become a maintenance headache, the majority of the time I believe that the help they offer outweighs the possible overhead issues that go with them. As for using structured data, users can already link WD items in the structured data section, and this theoretically allows categories to be automatically populated based on the classification scheme in WD. However, there are a couple of big problems here. First, this requires a massive amount of work to attach such links to existing and new media, but as you mention, they have not done much to get buy-in from the Commons community to embrace this work, so few files actually have such structured data added. Second, the development of the categorization tools is nowhere to be seen and will take a lot of development time to be ready for reliable employment. Thus in the end, I don't think we can count on WD and the WMF to solve the problem for us anytime in the foreseeable future. Thus it is on us in the now to figure out how to do things right. The reason I see a need for evolving our policy is that I see a steady stream of CfDs re-litigating what should be covered by guidelines. This not only consumes significant time and breeds antagonism as discussions devolve into argument over which language is better, but results in a patchwork of approaches depending on which users happen to be active and participate in any given discussion. I think one of the first policy questions that we need to answer is: Should sub-category names have to be an exact match with their parents (as per the current Universality Principle), or should subs be allowed to be named differently to conform to local variation as appropriate for a given region (as is often the consensus in real CfDs)? Josh (talk) 23:40, 4 April 2022 (UTC)


Is there some kind of rule for when it is OK or not to abbreviate something when creating a category, for instance "Company" versus "Co."? Thanks. --Adamant1 (talk) 00:10, 3 February 2022 (UTC)

@Adamant1: It is strongly preferred to avoid abbreviations in category names. However, this is not an absolute rule. For certain proper names where the abbreviation is almost always the normal usage of the name for the subject (e.g. Category:IBM), it can be okay. This is a high bar though, and abbreviations that are merely 'very common', but where the spelled out version is reasonably common to see as well, are probably better in spelled-out form (e.g. Category:United States). Since this is a gray zone, borderline cases should be discussed in a CfD before changes are made. If you are making a new category, feel free to use your best judgement but be okay with discussing it later should another user raise a CfD on it. Happy categorization! Josh (talk) 09:51, 31 March 2022 (UTC)