User:Dysmorodrepanis/Categorizing

From Wikimedia Commons, the free media repository

Categorizing the Tree of Life, Wiki-style

The state of categorization and logical, hierarchical organization of organism content on Commons is dismal at best. Nothing even close to a widely-accepted (not only in word but in deed) SOP ever existed. While there are some hard-and-fast rules that at least ensure a basic structure, the result in many cases is bordering on the unworkable.

The following discussion is limited to the scope of the Commons project Tree of Life. Nothing that is said here needs to affect other projects in any way. This may be considered jeopardization of a "common SOP", but considering the fact that scientists have been working on a hierarchical classification of life on Earth for (as of 2008) 250 years and quite successfully at that - even though half of the time they had to work under a false doctrine, creationism - the question arises a) whether such an overarching SOP may ever exist and be successful and b) what good is done by throwing a quarter-millennium of rigorous work out of the window.

Hence, the notion that the Tree of Life project's content-organizing scheme needs to follow the same rules as the rest of Commons is explicitly rejected.

You may note that most of the examples deal with birds. This is for two reasons:

  • I am most familiar with these
  • At least in Neornithes, systematics and nomenclature have probably progressed farther towards resolving - and reproducing in classification - the true evolutionary tree than in any other similar-sized (c. 10,000 species) group of organisms. The experience gained in reconstructing the avian tree of life by far exceeds that for any other organism group.

Excuse, furthermore, my harshness. Over at en:Wikiproject Birds we spent countless man-hours to make for example probably the single best freely available source in the world from a scientific standpoint, as regards taxonomy and systematics. Especially today, when there is significant effort to slander evolutionary theory, IMHO it is a major error at best and dangerous at worst to cut corners in respect of biological systematics. The work of taxonomists - themselves a kind of "endangered lifeform", sadly - does not deserve to be slighted: no matter that Sibley & Ahlquist got about one major point right and otherwise created a pernicious mess, judging in retrospect, 18 years ago they were certainly trying to do their best. And neither does the robust and expedient "folk taxonomy" that after all was the ultimate origin of scientific taxonomy deserve to be dismissed out of hand. Wikimedia Commons is, after all, a resource primarily for a non-scientific audience.

Summary

Definitions:

  • Taxonomy: the science of giving names to the constituent parts (taxa) of biological diversity. Here, "taxa" is used in the sense of "named taxa".
"A taxonomy" (as a countable noun) means a specific proposal for systematic and taxonomic treatment of organism groups (e.g. "Clements taxonomy" for birds). The term "system" is preferable but only widely used in botany (e.g. "APG II system").
  • Systematics: The science of arranging taxa in a tree-like structure reproducing the evolutionary tree of life.

Observations:

  • It is noted that scientific classification demands a scientifically rigorous system insofar that a taxon unequivocally corresponds to certain groups of organisms (as defined by the type).
  • It is noted that scientific classification also demands a flexible system, because the relationship between taxon and phylogenetic entity is not bijective and subject to revision, change and uncertainty.
  • It is noted that most users think in paraphyletic, phenetic categories and that the present phylogenetic approach is too tough on the average user.
  • It is noted that a rudimentary vernacular-name category tree already exists (though part of it has recently been chopped down), because some "form taxa" figure too prominently in human perception to be done away with (for example fishes).

Assumptions:

  • Presumably, scientifically-minded users will want a tree that allows them to locate narrowly defined content: for example, the contact call of some bird species, a photo of a young of a certain mammal, a distribution map for a certain fish genus, etc.
  • Presumably, nonscientific users will want a tree that allows them to reach a certain "kind" or "sort" of animal quickly and provides them with an overview of the collected material so that they can pick the picture that is, essentially, "the prettiest". For example, a photo of a singing "warbler", or of a blue butterfly, or a drawing of the stuff that makes fireflies glow, or an elephant's vocalizations, or a photo of a lion male with a long mane. For most such purposes, it is irrelevant whether there is doubt about the precise identity of the organism in question etc.

Proposals:

  • It is proposed to immediately halt the dismantling of the vernacular tree to reassess what parts might be useful.
  • It is proposed to untangle and possibly restore the scientific from the vernacular tree and expand them to two interfaced but (all but) non-intersecting systematic schemes.
  • It is proposed that the scientific tree uses only gallery pages at the species-group level, for the reasons already discussed at length and a few more.
  • It is tentatively proposed that species pages are not categorized to the genus. (it is unnecessary and cumbersome under the proposed system)
  • It is proposed that the vernacular tree would adhere to the general Commons nomenclature, and that it would use only categories.
  • It is proposed to abolish gallery pages for any other purposes altogether, for the reasons already discussed at length and a few more.

Miscellanea:

  • A means is described how bots and scripts can gather the necessary placement information from content description pages, even if the content is not placed in a scientific-name category.
  • Examples are given to show how certain problematic cases can be resolved by the proposed system to yield a result that few if any users may find lacking in usability.

Background

The present system allows for organism content on Commons to be organized both via categories (tagging) and gallery pages (linking).

Categories allow for easier automated handling of content. Batch file handling is considered prohibitively cumbersome when dealing with content linked to pages but not tagged to categories. Also, hierarchical, tree-like systems are easily and flexibly established by nesting categories within categories, and such a tree does not even need to be monophyletic - this allows for proper expression of systematic and taxonomic uncertainties. Yet 3 major and 1 minor problems exist regarding content organized in categories, namely when there is much content in a category:

  • Categories are limited to 200 linked pages or subcategories per displayed page. Once this number is exceeded, the category will be automatically split into several consecutive pages, and content is assigned dynamically. However, this affects both subcategories and pages in a category in the same way, though usually only one of the two has that many entries. Even if there is a single subcategory starting with "Z", if there are 2001 pages starting with "A" in the same category, the subcategory will never show up on the category page sequence's starting page. This problem is less frequent on Commons than on Wikipedia (see for example en:Category:Salticidae) but it is potentially severe. It is often considered to be a bug, but it is not certain when this will be fixed, if it is indeed a bug.
  • Categories, while easier to handle by software, are harder to handle by humans, i.e., intelligently. Moving and renaming is not easily done. In addition, categories can be made to redirect, but this will not prevent content from being placed in them and this content is then lost from view and may in fact become completely forgotten - there is at present no automated system checking for redirecting categories and moving content out of these, I think. (Note that simply changing systematics - as opposed to changing taxonomy, i.e. renaming - is easily achieved with categories: simply change the category it is tagged to)
  • Categories do not allow for tagging, annotating or subdividing content. Everything tagged to a category will automatically be placed in a single alphanumerically-sorted gallery. Image names are not - and certainly, given the sheer amount of content and the need for manual review, never will be - logical. Some actually cannot be logical on a larger scale because they have to be logical in their particular circumstances, because of non-duplicate media having duplicate content (showing the same thing in different ways) for example, or because some media are derivative (cropped or cleaned) versions of others. In plants, scientific names are too often ambiguous without the taxonomic authority - what, pray tell, will happen to any images titled "Cassia lanceolata"? An automated system that is able to resolve such cases is technically impossible (if the photo contains simply the flowers, for example, there is no way to tell whether it is of a shrub or small tree - Senna alexandrina - or a herb - the other two species).

Thus, content in a large category will inevitably be devoid of any meaningful structure. Categories such as Category:Anas platyrhynchos force a painfully high workload upon the user to locate content. This can be avoided by using subcategories, but this will in the end probably result in further deterioration as consistent placement of this content would require user intervention to a degree I think is impossible to maintain. Certainly, such "fine-categorization" cannot be handled by bots, as media content must be intelligently evaluated.

  • The minor problem, a direct consequence of the preceding, is that categories containing much content will have loading times that are prohibitive for those on dial-up connections. Users that cannot afford a commercial encyclopedia - a main focus group of Wikipedia - are discriminated against and prevented from contributing to the fullest.
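The paging effect described in the first point above can be sketched in a few lines. This assumes, as the text describes, that subcategories and pages share one alphabetically sorted listing cut into chunks of 200; the function name `first_page_showing` and the sample titles are made up for illustration, and nothing here claims to mirror how MediaWiki is actually implemented.

```python
PAGE_SIZE = 200  # entries per displayed category page, as stated above


def first_page_showing(entries, wanted, page_size=PAGE_SIZE):
    """Return the 1-based listing page on which `wanted` appears.

    `entries` mixes subcategories and pages in one sorted sequence,
    which is the behaviour described in the text (an assumption here).
    """
    ordered = sorted(entries)
    return ordered.index(wanted) // page_size + 1


# 201 hypothetical pages starting with "A", plus one subcategory under "Z".
pages = [f"A{n:04d}" for n in range(201)]
entries = pages + ["Zebra subcategory"]

print(first_page_showing(entries, "A0000"))
print(first_page_showing(entries, "Zebra subcategory"))
```

Under these assumptions the lone "Z" subcategory already drops off the starting page once 201 entries sort ahead of it.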

Gallery pages allow for more flexible hands-on handling. They can contain basically (with a few exceptions such as some templates) any means of organizing content that is possible on Wikipedia, such as sections. For example for plant taxa with innumerable varieties, subspecies, forms and cultivars but moderately much content in total, they are far superior to categories, which would in this case either do away with any meaningful way of discriminating these sub-specific taxa at all, or create many categories with just one or two media in each, which is also not very satisfying. Galleries also allow for easy identification of taxa: as opposed to the galleries autocreated by category tagging, they allow for manual annotations. Important information from the media description page can thus be incorporated without the user having to access the medium (which may also have high loading times, particularly in upright hi-res photos as are becoming more and more common). However, there is also at least one significant problem with gallery pages:

  • Gallery pages do not give categorization information via tags. That is, content only placed in a gallery of their taxon lacks the usual way for bots and scripts to recognize its taxonomic assignment for automated maintenance work. Essentially, any work dealing with gallery-only content must presently be done manually, resulting in a workload that also can be prohibitively high.

(Note that category pages can also contain galleries. This is rather confusing however, because content in categories is expected to be tagged thus, and hence best avoided under almost any circumstance.)

Present problems

Probably the most annoying problem of the present state of affairs is imperfect redundancy. With this term I describe the phenomenon that categories and pages exist for the same taxon, but are identical neither in scope nor in content. For a crass example, see Category:Thraupidae and Thraupidae. Though they may not look like it, dealing with the same taxon as they do, they ought to be 100% redundant as regards media accessible from them.

At present, probably 30% of the ToL-specific Commons code is, for almost all purposes, dead weight. Cruft. Overhead. HD space, computing power and bandwidth that is essentially wasted. The system I am going to propose will need about the same amount of resources but make it work and non-redundant.

This particular example also well illustrates another problem. The Thraupidae have undergone serious revision recently and the taxon page would make any specialist cringe in pain - compare en:Thraupidae. In some way or another, there has been much reshuffling in songbirds in particular in the last few years as the actual phylogenetic (and corresponding best-fit systematic) relationships are being resolved. In a nutshell, any "warbler", "babbler", "finch", "grosbeak", "seedeater", "sparrow", etc is liable to belong to a different family today compared to a mere 8 years ago. Most don't (except "warblers" and "babblers"), perhaps 25% on average do.

As a very crude estimate, recent systematic changes (changes in placement) affect many hundreds of bird taxa alone; perhaps a low 4-digit number.

Taxonomic changes (changes in valid name) still affect several hundreds of bird species alone; for example, the genera Category:Otus and Category:Megascops were recently recognized to be well distinct. The genus Category:Lagopus as well as many dozens of others recently had a change in grammatical gender of the species names.

As regards Commons organization, systematic changes can usually be handled best via categories. It still needs to be done manually, as much uncertainty still remains.

On the other hand, there is no easy and good way to implement taxonomic changes using categories. Automated handling would in most cases still need so much human input initially as to make it pointless. Human handling is prohibitively complex. Redirecting the obsolete taxon to the now-valid one creates "ghost" categories where inevitably (as the obsolete name will still be around in older sources) content will wind up that, if it does not have an informative name (it will certainly have a taxonomically wrong name), will effectively be lost until someone happens to stumble across it.
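To make the "ghost" category problem concrete, here is a minimal sketch of a check for content stranded in redirected categories. It works on plain in-memory data and calls no real Commons or MediaWiki API; the function name `stranded_files`, the file names and the single redirect entry are hypothetical. It also presupposes that somebody has already compiled the redirect table by hand, so it does not remove the manual effort described above.

```python
def stranded_files(file_categories, redirects):
    """Map each file tagged to a redirected category to its intended target.

    file_categories: dict of file name -> set of category names it is tagged with
    redirects: dict of obsolete category name -> currently valid category name
    """
    stranded = {}
    for name, categories in file_categories.items():
        for category in sorted(categories):
            if category in redirects:
                stranded.setdefault(name, []).append((category, redirects[category]))
    return stranded


# Hypothetical data: one obsolete species category redirecting to the valid name.
redirects = {"Otus asio": "Megascops asio"}
file_categories = {
    "Owl at dusk.jpg": {"Otus asio"},            # tagged with the obsolete name
    "Screech owl portrait.jpg": {"Megascops asio"},
}

report = stranded_files(file_categories, redirects)
for name, moves in report.items():
    for old, new in moves:
        print(f"{name}: {old} -> {new}")
```

A file whose own name is uninformative would surface in such a report only because of its category tag, which is exactly the information a redirect otherwise hides from view.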

Given that most taxonomic changes affect species-group taxa (species, subspecies etc), it can be bluntly stated:

Whoever uses categories to deal with species-group taxa does a Very Bad Thing and is liable to force upon the community a workload that can not be handled easily, will not be handled to any meaningful extent ever (as reality clearly shows), and make a bad situation worse.

The problem: should we tag all media with the genus only? No, and a simple example shows why:

Category:Anas holds the linked subcategories and species, plus media of Anas species which have too little content on Commons to justify a page/category of their own. There exists an image for Anas gibberifrons, the Sunda Teal. Try to locate it. Easy, isn't it?

Now assume Category:Anas platyrhynchos would be abolished and everything therein be placed in Category:Anas. Also assume that the uploader did not choose the name well; Image:Naumann gruenspecht.jpg shows that this may happen.

Trying to find the nice litho of A. gibberifrons would be a futile undertaking. Had it been named "KeulemansSundaTeal.jpg" for example - a reasonable enough name - and not provided enough descriptive information - as is often the case in batch uploads - the search function will not find it. There would be no easier way to find it than to manually screen the many hundreds of images in this category. One by one. Wasting hours on end by checking images where the thumbnails do not allow for a precise ID.

The conclusion is clear:

The present system does not work - cannot work - in a satisfying way, and that cannot be corrected. Hence, it must be expanded into an entirely new direction.

Desiderata

The Wikimedia aim at organizing and providing knowledge discriminates not according to creed or background. This sets it apart from all other such undertakings in the history of humanity. Ultimately, knowledge organized via Wikimedia projects is to be presented in a fashion that fulfils the desiderata of a junior high school student and a Nobel prize winner, and everybody in between. And although the default language is English, a fundamental level of accessibility to non-English speakers must not be compromised.

Any categorization system used for organism media on Commons ought to be compatible with this approach.

As regards biological systematics, the desiderata of the ends of this user spectrum can hardly be farther apart:

  • The expert wants a system that is both rigorous in its application and flexible technically:
Content must be unequivocally assigned yet this assignment must allow for easy change at a moment's notice: content must be deposited at a taxon category/page, but this assignment needs to be non-static.
If by any means feasible, there must not be redundancy, but there must be leeway for ambiguity, to handle taxa incertae sedis: both having each media item in the tree once and only once, as well as being able to get to this one location by several ways, is at least highly desirable.
Experts can be expected to bring along the patience to wade through taxonomic hierarchies without seriously losing their way, on the other hand. If Piranga olivacea were not in Thraupidae, an expert would be unfazed and check Emberizidae and Cardinalidae as the most sensible "defaults", and presto, there in the latter it is.
  • Layfolk want a system that uses easy-to-recognize categories. A purely phenetic classification, essentially.
Most people are not aware that the very color of the Scarlet "Tanager" kind of gives away that it is not Thraupidae, looking at it with the benefit of hindsight. To them, it's called a tanager so a "tanager" it is. It certainly behaves like one!
Yet despite Category:Crows and Category:Ravens being in the very same genus - Category:Corvus -, layfolk recognize them as different "sorts" of animals due to long-standing lore associated with each.
Non-experts cannot be expected to bring with them the patience to deal with the plethora of suborders, superfamilies, tribes, sections, we do. It would be nice, but it is impossible. But this must not result in locking the doors to the ivory tower from within. Any junior high student, IMHO, has a fundamental right to know that "crows" and "ravens" are biologically all but identical, and certainly not natural, evolutionary groups. But to get there, they need a path that leads through familiar terrain.

Hence,

A possible solution

I propose to stop trying to build the existing mess into what probably never will be a system that suits the needs of anyone involved even moderately well.

Rather, the present tree (starting at Category:Phylogenetic tree of life) should be cleaned up, and a second tree should be fully established. A tree that in part already exists, starting at Category:Organisms. I will subsequently call these trees "scientific" and "vernacular" as a shorthand.

I do not claim that this system is failsafe or can handle any taxon. But it will be able to handle almost all taxa, and at that it will be better than anything that has been proposed before I think. This is not a theoretical consideration, as the examples given below will show. It is based on the lessons learned by building a categorization scheme for birds on the English Wikipedia.

The scientific tree

This uses categories for all taxa except species-group. These will use gallery pages exclusively. The only significant change to the present guideline (as far as there is one) affects the treatment of species-group taxa.

Content on species pages will have no category tag with the scientific name. They will appear in no category whose name is a taxon whatsoever. The role of this tag is taken over by the common-name tag discussed in the next section.

Category:Buteo albicaudatus would be scrapped as it is superseded by Category:White-tailed Hawk. The scientifically minded have ways and means to pick out that one photo from a batch of 1-2 dozen in the genus category. Layfolk browsing Commons would be hard pressed to make sense of a term like Buteo albicaudatus. If they are English-speakers, "White-tailed Hawk" will be more familiar. If they are non-English speakers, even those who don't know that the gavião-de-rabo-branco is B. albicaudatus might still remember it's a Buteo and thus both will arrive at the place where the content is located.

I am not sure whether species pages should be categorized into the genus categories. Might help botwork OTOH, might muck up layout when there are many OTOH.

In any case, the genus pages get a nice visual gallery like Category:Buteo. Note especially the gallery entry for Buteo rufofuscus.

How it works
Content gets piped into the tree at any point and moved down through the categories (towards species) until it can go no further. Often this will be the genus category. If a species page exists, the image will be placed there, into the gallery that is most appropriate, and given some basic information (sex, locality, etc) if required.

If there is no species page (genus category etc) already, the content will for the time being end up in the category of some higher-level taxon. This will in time accumulate content. When there is sufficient content - say 3-5 items or so - the appropriate subcategory - or in the case of genus categories, a species page - will be created and the content in question moved there.

Thus, categories will only contain the following content:

  • taxa for which very little content exists
  • unidentified content
  • content pertaining to the higher-level taxon as a whole (e.g. Image:Tyrannen.png)
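The placement rule described under "How it works" can be sketched roughly as follows. This is an illustration under stated assumptions, not a bot design: the class `Tree`, its method names and the file names are hypothetical, a taxon is represented simply as a tuple of names from the top down, species pages and categories are both treated as plain nodes, and the threshold uses the lower end of the "3-5 items" suggested above.

```python
from collections import defaultdict

THRESHOLD = 3  # "say 3-5 items or so"; the lower bound is used here


class Tree:
    def __init__(self, existing):
        # existing: lineage tuples that already have a category or species page
        self.existing = set(existing)
        self.content = defaultdict(list)  # lineage tuple -> items held there

    def deepest_existing(self, lineage):
        """Longest prefix of `lineage` that already exists in the tree."""
        for end in range(len(lineage), 0, -1):
            if tuple(lineage[:end]) in self.existing:
                return tuple(lineage[:end])
        raise ValueError("no ancestor of this lineage exists in the tree")

    def place(self, item, lineage):
        """File `item` as far down as possible, then promote if warranted."""
        node = self.deepest_existing(lineage)
        self.content[node].append((item, tuple(lineage)))
        self._promote(node)
        return self.deepest_existing(lineage)  # where the item now sits

    def _promote(self, node):
        """Create a child node once THRESHOLD items are waiting for it."""
        depth = len(node) + 1
        waiting = defaultdict(list)
        for item, lineage in self.content[node]:
            if len(lineage) >= depth:
                waiting[lineage[:depth]].append((item, lineage))
        for child, items in waiting.items():
            if len(items) >= THRESHOLD:
                self.existing.add(child)
                for entry in items:
                    self.content[node].remove(entry)
                    self.content[child].append(entry)
                self._promote(child)


tree = Tree({("Anatidae",), ("Anatidae", "Anas")})
teal = ("Anatidae", "Anas", "Anas gibberifrons")
for n in (1, 2, 3):
    print(tree.place(f"teal-{n}.jpg", teal))
print(tree.place("unidentified-duck.jpg", ("Anatidae",)))
```

In this run the first two teal images wait in the genus node, the third triggers creation of the species node and all three move there, and the unidentified image stays at the family level.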

The "200 bug" is thus avoided. In theory, it may still occur, but it is very unlikely that the content outlined above will collectively be more than 200 items, or that more than 200 taxa are included in a category. In such cases, an expedient solution is also easy (for example, taxa can be subdivided into intermediate ranks: species groups in Drosophila, for example... should we ever get media of more than 200 droso spp.).

This is the tree to use if one has a clear idea of the content desired. See Junco hyemalis or Buteo buteo for different ways of organizing content, or, for how to organize some 150 items and NOT stink, Anas platyrhynchos.

Getting rid of species categories will provide a safeguard of content of questionable accuracy: if species categories suddenly appear or if the caption of photos has erroneous italicization, includes the authority in italics, has the specific name capitalized, ... it is wise to check the new content for errors. The leaf bit is intentionally a bit tough to handle correctly for people who don't know what they're doing. It is very easy on the other hand if you know the difference between Linné and Linen.

The vernacular tree

This is a common-names category-only tree. Except for species, the names are used in plural form (Category:Crows is distinct from Category:Crow), according to Commons SOP (e.g. Category:Automobiles). No pages are used whatsoever (but see below).

See Category:Animals by species for what already exists. I note that the entomo folks just got rid of the common-name categories. But as there is little apart from Category:Hawk moths around at present, little harm is done. (Category:Moths would need to be not deleted. The rest can be moved back by and by)

One will also not be bullshitting people with patently false redirects like Category:Eagles (So Old World vultures are eagles?) and solve problems like that of "warblers", "vultures" etc.

Category:Animals and Category:Plants give a rough idea already. Note we have no Category:Plants by species yet and I wonder whether this is not better delegated to Category:Trees and so on.

For a more complete overview of the power of the system, see en:Category:Birds by common name. Note that content in some subcategories has not yet been fully sorted. Try "Perching birds" for the deepest structure. You will note that the tree is still rather shallow, especially considering the shortcut to "Songbirds".

Interfacing

The ultimate root category is Category:Organisms. The Category:Animal species should be Category:Animals by species to agree with Category:Animals SOP

Pages otherwise do not generally interface by category-tagging. There are a few exceptions however, such as Category:Primates.

Pages would rather interface by an initial statement or paragraph. At the higher levels, this is usually as simple as a "see also the scientific name [[:category:..." and in the phylogenetic tree just "see also [[:..." (It cannot be expected from the public at large to know that "Passeri" is the scientific term for "songbirds".)

For an example, see Category:Elephantidae and Category:Elephants. Note that content may be placed in both categories under a different subcategorization scheme. Especially when Category:African elephants and Category:Asian Elephant are added. (The latter would replace Category:Elephas maximus but Elephas maximus the page would remain.)

In some cases interfacing is not straightforward. A longer statement explaining why could be added. Thus, Category:Thraupidae would only contain taxa that the latest (2005-2007) reviews accept in the Thraupidae.

Everything that was placed there in the oldschool view - and what else might have been called "tanager" by common name even before the mid-late 20th century - would be assembled in Category:Tanagers. This would then get a warning statement like en:Category:Warblers or en:Category:Seedeaters.

There is no need to change anything on an ad hoc basis. If the present mess is switched to the system described above, this can be achieved by and by. The only change involves switching the species to vernacular categories. This can even be handled automatically.

There is also no need for a schedule, as the system will only improve; there is no transition period in which it is utterly broken and thus transition can be indefinitely long.

Problems

  • Many species have no common names.
Most of the families have, and that is probably enough even for beetles.
See for example Category:Buprestidae. There are around 100 or so items of content subject to this category. People who simply seek a pretty photo of Category:Jewel beetles would probably be grateful to be able to without knowing the difference between Chrysobothris and Chrysochroa.
It may be more of a problem for plants. But as noted in a previous discussion, there is no necessity to use English common names. On the other hand, plants are probably those organisms where the present system is least incompatible with scientific desiderata - see for example the synonymy of en:Senna obtusifolia: scientific names in plants can simply not be considered reliable without associated taxonomic authority - information that is de facto inaccessible to probably as much as 95% of uploaders. So even though one may need to cut corners in common names, scientific names in plants are far from being unambiguous either as it is presently handled. In fact, for the subtribe Cassiinae, only the English common names can be expected to be unequivocal to any significant degree. That is because the entire subtribe was at one time included in a single genus, necessitating name changes over and over again.
As regards non-English names: since the search function will not directly forward to categories, this can be handled without problems. Simply make the non-English term redirect to the appropriate vernacular category.
  • Common names have not been standardized.
They have in birds. The rules are very simple and would probably ultimately be expanded to mammals.
Herpetology, ichthyology and entomology have informal conventions which are somewhat different but still pretty consistent.
  • Maintenance bots need categories.
Do they? Try the following: go to any image page linked in a gallery - say Image:Senna‗obtusifolia.jpg which has no taxonomic category - and view the HTML source (in browser, not in Wiki editor). Find this string: "the fo".
No, I am almost 100% certain that bots do NOT need categories.
I can't code, but for anyone who can I have a concept for a workflow around that would probably extract "linked by" information and use it like categories are used in maintenance. It would be slower, because you need to grab the HTML code first and strip everything but the page info from that.
Still, the only case where this would need to be used is when comparing content in vernacular-name species categories and scientific-name species gallery pages.
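For anyone who can code, the "grab the HTML and strip everything but the page info" step might look roughly like the sketch below. It uses only Python's standard `html.parser` and runs on an inline sample rather than fetching anything. The class name `mw-imagepage-linkstoimage` for the list of linking pages is an assumption about the page markup that would need checking against a real description page, and `UsageLinkParser`, `linking_pages` and the sample HTML are made up for illustration.

```python
from html.parser import HTMLParser


class UsageLinkParser(HTMLParser):
    """Collect link titles found inside one marked-up list of an HTML page."""

    def __init__(self, list_class):
        super().__init__()
        self.list_class = list_class
        self.depth = 0      # > 0 while inside the wanted <ul> (or lists nested in it)
        self.titles = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "ul":
            classes = (attrs.get("class") or "").split()
            if self.depth or self.list_class in classes:
                self.depth += 1
        elif tag == "a" and self.depth and attrs.get("title"):
            self.titles.append(attrs["title"])

    def handle_endtag(self, tag):
        if tag == "ul" and self.depth:
            self.depth -= 1


def linking_pages(html, list_class="mw-imagepage-linkstoimage"):
    """Return the titles of pages listed as linking to the file (assumed markup)."""
    parser = UsageLinkParser(list_class)
    parser.feed(html)
    return parser.titles


# Hypothetical fragment of a file description page.
SAMPLE = """
<p><a href="/wiki/Help" title="Help">unrelated link</a></p>
<ul class="mw-imagepage-linkstoimage">
  <li><a href="/wiki/Senna_obtusifolia" title="Senna obtusifolia">Senna obtusifolia</a></li>
</ul>
<p><a href="/wiki/Other" title="Other">another unrelated link</a></p>
"""

print(linking_pages(SAMPLE))
```

The returned titles would then play the role that category tags play in ordinary maintenance work, at the cost of one extra page fetch per file.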

Examples

A ">" denotes a category tag (in the page/category to the left)

A "+" denotes a wikilink (in the page/category to the right)

Note how much shorter the vernacular trees are by comparison. Every person capable of basic English should be able to arrive at the content they are seeking with minimal fuss. Moreover it can be expanded with little trouble or content getting lost to support any number of languages.

Jerusalem artichoke (Helianthus tuberosus)

Vernacular categories for plants

Image:Tobinambur1.jpg

          ,+Helianthus tuberosus+Category:Helianthus->Category:Heliantheae->Category: Asteroideae->Category:Asteraceae->Category: Asterales->Category:Asterids->Category:Eudicots->Category:Angiosperms->Category:Plantae
Image file          +                        +                                                             +                                                                                                 +
          `>Category:Jerusalem artichoke->Category:Sunflowers------------------------------------->Category:Composites->Category:Annuals---------------------------------------------------------------->Category:Plants

("Composites" would also be accessible from "Trees" and "Shrubs". It may be better to create a category "Plants by species", as in Category:Animals)

Steps from Plantae to Helianthus tuberosus: 9

Steps from Plants to Jerusalem artichoke: 5

Dickcissel (Spiza americana)

How to express updated placement

Image:Spiza americana male 231051626 13e01e8125 o.jpg

This bird is a cardinal related to the "American buntings", but was formerly considered a true bunting or a tanager.

          ,+Spiza americana+Category:Cardinalidae->Category:Passeroidea->Category:Passerida->Category:Passeri------->[...]------->Category:Aves and so on
Image file             +                                                                                                            +
          `->Category:Dickcissel->Category:Tanagers->Category:Songbirds--------------------------->Category:Birds by common name->Category:Birds->Category:Animals by species->Category:Animals
                                ´>Category:Buntings-7                  `->Category:Perching birds-7

(There would also be a link from "Perching birds" and "Songbirds" to the scientific tree)

Steps from Aves to Spiza americana: 8 (6 after 2 superfluous categories have been deleted)

Steps from Birds to Dickcissel: 5-6
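The step range for the vernacular side can be checked with a small sketch that encodes the diagram above as a "tagged into" map and enumerates every upward path. The encoding is my reading of the ASCII diagram (in particular that "Buntings" rejoins at "Songbirds" and "Perching birds" at "Birds by common name"), and the names `PARENTS` and `paths` are made up for illustration. Counting hops from the image file reproduces the 5-6 quoted above; that the original figures were counted this way is an assumption.

```python
# Hypothetical encoding of the vernacular side of the diagram:
# each key is placed in, or tagged into, the listed parents.
PARENTS = {
    "Image file": ["Dickcissel"],
    "Dickcissel": ["Tanagers", "Buntings"],
    "Tanagers": ["Songbirds"],
    "Buntings": ["Songbirds"],
    "Songbirds": ["Birds by common name", "Perching birds"],
    "Perching birds": ["Birds by common name"],
    "Birds by common name": ["Birds"],
}


def paths(start, goal, parents=PARENTS):
    """Yield every upward path from `start` to `goal` as a list of names."""
    if start == goal:
        yield [start]
        return
    for parent in parents.get(start, []):
        for rest in paths(parent, goal, parents):
            yield [start] + rest


all_paths = list(paths("Image file", "Birds"))
hop_counts = sorted({len(p) - 1 for p in all_paths})
print(hop_counts)
for p in all_paths:
    print(" > ".join(p))
```

Because a category may have several parents, the tree is really a small directed graph, and the alternative routes via "Buntings" or "Perching birds" are what let a user arrive at the same content from different starting ideas.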

Black-capped Chickadee (Poecile atricapillus)

How to express uncertainty of placement

Image:Poecile atricapillus 03.jpg

Tits and chickadees are a very basal branch of the Sylvioidea, diverging from the "warblers" and "babblers" so early that Paridae + relatives might arguably be moved to a distinct superfamily Paroidea. It is unrealistic to expect a really robust consensus on that before the year 2010 however.

Note that the bird was until recently Parus atricapillus.

          ,->Category:Black-capped Chickadee------>Category:Titmice---------------------->Category:Songbirds-> etc as above
Image file             +                               +                                         +
          `+Poecile atricapillus+Category:Poecile->Category:Paridae---------------------->Category:Passerida-> etc as above
                                                                   `>Category:Sylvioidea-7

(This is already being used as it only affects categories for which there is consensus)

Steps from Aves to Poecile atricapillus: 8-9 (6-7 after 2 superfluous categories have been deleted)

Steps from Birds to Black-capped Chickadee: 5-6

Zebra Blue (Tarucus plinius)

How to overcome complicated systematic structures

Image:Tarucus.plinius.jpg

Many arthropod lineages, especially insect orders, are highly speciose. Ongoing research has resulted in much better systematic arrangements than only 10 years ago, but the sheer diversity of butterflies, beetles, mites etc. still leaves much work to be done. Notwithstanding, for Lepidoptera for example there exists a rather well-accepted and highly subdivided systematic framework.

Given the sheer volume of content and the exclusive use of scientific names, it is almost impossible for non-specialists to locate for example a certain butterfly's page on Commons.

          ,+Tarucus plinius->Category:Polyommatini->Category:Polyommatinae->Category:Lycaenidae->Category:Papilionoidea->Category:Rhopalocera->[...]->Category:Insecta and so on
Image file          +                                           +                                                               +                            +
          `->Category:Zebra Blue------------------->Category:Blues (butterflies)---------------------------------------->Category:Butterflies-------->Category:Insects----------------------->Category:Animals by species->Category:Animals
                                                                                                                                                                      `->Category:Arthropods-7

(T. plinius is presently the only member of its genus for which Commons content exists. Hence, a genus category is not necessary.)

Steps from Lepidoptera to Tarucus plinius: presently 4, but Rhopalocera is missing

Steps from Butterflies to Zebra Blue: 2

Steps from Animals to Zebra Blue: 5-6
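The step counts for the vernacular tree can be checked mechanically. The following is only a minimal sketch, assuming the category links are exactly those drawn in the diagram above and that the tree contains no loops; the names `PARENTS` and `step_counts` are made up for this illustration and are not part of any Commons tool.

```python
# Hypothetical model of the vernacular tree from the diagram above:
# each category maps to the parent categories it is placed in.
PARENTS = {
    "Zebra Blue": ["Blues (butterflies)"],
    "Blues (butterflies)": ["Butterflies"],
    "Butterflies": ["Insects"],
    "Insects": ["Animals by species", "Arthropods"],  # direct link plus the "Arthropods" detour
    "Arthropods": ["Animals by species"],
    "Animals by species": ["Animals"],
    "Animals": [],
}


def step_counts(parents, top, target):
    """Return the sorted, distinct numbers of links on all paths from target up to top.

    Assumes the category graph is acyclic, as a category tree should be.
    """
    lengths = set()

    def climb(node, steps):
        if node == top:
            lengths.add(steps)
            return
        for parent in parents.get(node, []):
            climb(parent, steps + 1)

    climb(target, 0)
    return sorted(lengths)


print(step_counts(PARENTS, "Butterflies", "Zebra Blue"))  # [2]
print(step_counts(PARENTS, "Animals", "Zebra Blue"))      # [5, 6]
```

The two values for "Animals" correspond to the direct route and the route via "Arthropods".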

Many people without an advanced scientific education routinely do not distinguish between insects and arthropods. Note how the "Arthropods" detour will both a) allow people to get to insects even if they have no idea what arthropods are and b) allow for a warning statement at Insects that links to Arthropods, to the effect that e.g. spiders are not insects.

Note that Category:Butterflies at present redirects to Category:Lepidoptera. This is about as correct as making Category:Primates redirect to Category:Mammalia. From a scientific standpoint this is an outright lie. It is IMHO not acceptable to suggest to users that Butterflies = Lepidoptera. The only statement to the contrary is easily overlooked and is itself false (Hesperioidea are butterflies too by just about anyone's account).

Also, "Lepidoptera" now contains a lot of stuff like Category:Butterflies by country. Would it not be better if these were in another category one link away? Are they needed by people who regularly surf through the scientific category tree, or by people who don't know that "moths" are paraphyletic? The present jumbled morass of pages and subcategories unnecessarily bogs down both the average and the scientifically minded user.

The proposed system would

a) allow almost any user to locate arthropod content easily
b) allow for a scientifically accurate way of organizing content.
The status quo fails to do either.

Lithops karasmontana ssp. bella

How to overcome lack of common names - I

An easy example.

Image:Lithops karasmontana bella.jpg

          ,+Lithops karasmontana+Category:Lithops->Category:Aizoaceae->Category:Caryophyllales->Category:Eudicots->Category:Angiosperms->Category:Plantae
Image file                      \      +                                                                                                     +
          `----------------------`>Category:Pebble plants->Category:Succulents->Category:Perennials------------------------------------->Category:Plants

(This means that all Lithops content would be categorized under "Pebble plants", and that the species pages would be categorized there also, in addition to the usual treatment as non-categorized(?) linked gallery under the "Lithops" category.)

Steps from Plantae to Lithops karasmontana: 6

Steps from Plants to Lithops karasmontana: 4

Secondary categorization has been left out (e.g. "Aizoaceae" would also be in "Plantae by family", and "Succulents" also in "Annuals", "Shrubs" and "Trees").
Alternatively, "Pebble plants" could go in a category "Succulent perennials" which is accessible both from "Succulents" and "Perennials"; or see next example.

"Living stones" etc would redirect to the "Pebble plants" category.

Phidippus workmani

How to overcome lack of common names - II

A more complicated case

Image:Phidippus workmani face.jpg

          ,+Phidippus workmani+Category:Phidippus->Category:Salticidae->Category:Araneomorphae->Category:Araneae->Category:Arachnida->Category:Chelicerata->Category:Arthropoda->Category:Protostomia->Category:Eumetazoa->Category:Animalia
Image file                                       \         +                                          +                                                                                                                          +
          `---------------------------------------´>Category:Jumping spiders------------------->Category:Spiders----------------------->Category:Animals by species------------------------------------------------------->Category:Animals
                                                                                                                `->Category:Arthropods-7

(This means that all salticid content would be categorized under "Jumping spiders" and that salticid genus categories and species pages like Saitis barbipes - only one species of Saitis has Commons content at present - would be categorized there too.)

Steps from Animalia to Phidippus workmani: 10

Steps from Animals to Phidippus workmani: 5-6

Common liver fluke (Fasciola hepatica)

How to handle a speciose group with little content

Image:F. hepatica adults in bile duct.jpg

Presently there is an ugly mess of vernacular and scientific categories under Trematoda.

          ,+Fasciola hepatica->Category:Fasciolidae->Category:Trematoda->Category:Platyhelminthes->Category:Protostomia->Category:Eumetazoa-------->Category:Animalia
Image file           +                                      +                     +                                                                     +
          ´->Common liver fluke--------------------->Category:Flukes---->Category:Flatworms->Category:Worms (animals)->Category:Animals by species->Category:Animals

Steps from Animalia to Fasciola hepatica: 7 (As more content is added, Category:Echinostomida, Category:Digenea etc. will eventually be created, increasing step number to 9.)

Steps from Animals to Common liver fluke: 5

Category:Liver flukes may also be included, but at present it has a mere 4 items that could not be categorized elsewhere.

Other examples

  • How many Americans would be able to find on Commons a photo of those vultures they saw on their way home? How many would be able to add such a photo to Commons and correctly categorize it?
The latter would be placed in Category:Ducks, and Canard colvert, Ánade real, Pato-real, Stockente, Кряква etc. would redirect to it.
The result would be increased usability, because a) the content is easier to find for non-experts and b) experts can locate highly specific content more easily (e.g. a photo of a Mallard-Domestic duck hybrid drake, or a photo of Greenland mallards - if such a thing exists on Commons - showing one of the rare examples of Allen's Rule in birds, etc). In addition, scores of unsortable and unannotatable content items would no longer clog up the scientific tree.
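The redirect idea used in these examples (vernacular names in any language pointing at one category) can be illustrated with a small lookup. This is only a sketch and says nothing about how MediaWiki handles redirects internally; the function `resolve` and the table `REDIRECTS` are hypothetical, and the target name "Mallard" is an assumption, since the text above does not spell out the category name. Only the "Living stones" to "Pebble plants" redirect is taken directly from the Lithops example.

```python
# Hypothetical redirect table. The target "Mallard" is an assumed name;
# "Living stones" -> "Pebble plants" comes from the Lithops example.
REDIRECTS = {
    "Canard colvert": "Mallard",
    "Ánade real": "Mallard",
    "Pato-real": "Mallard",
    "Stockente": "Mallard",
    "Кряква": "Mallard",
    "Living stones": "Pebble plants",
}


def resolve(name, redirects):
    """Follow redirects until a name that is not itself a redirect is reached."""
    seen = set()
    while name in redirects:
        if name in seen:
            # A loop would mean the redirect table is broken.
            raise ValueError(f"redirect loop at {name!r}")
        seen.add(name)
        name = redirects[name]
    return name


print(resolve("Stockente", REDIRECTS))      # Mallard
print(resolve("Living stones", REDIRECTS))  # Pebble plants
print(resolve("Mallard", REDIRECTS))        # Mallard
```

A user typing any of the vernacular names ends up at the same category, while a name that is not a redirect is returned unchanged.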