Commons talk:Project scope/Proposal

From Wikimedia Commons, the free media repository

This discussion is now closed and archived. Please do not modify it. Further comments can be made on the main talk page, at Commons talk: Project scope.

The existing scope page has a number of serious problems, in particular:

  • 1. It includes extraneous material (eg recommendations on file size) which should be elsewhere
  • 2. It is vague and unclear
  • 3. It provides little practical guidance in areas of particular concern (eg nudity) which results in arguments being continually re-run in deletion requests.
  • 4. It does not acknowledge as policy unwritten rules which admins frequently use to close deletion requests, the most important of which is the principle that it is not up to Commons to decide on content disputes and that if a file is in bona fide use on any Wiki project then it, by definition, cannot fail the 'educational' test. (The way I have enunciated this principle derives from comments made by Rocket000).

The purpose of this proposal is to try to deal with these issues. Feedback is encouraged from as wide a range of editors as possible.

I suggest that, for clarity, comments should be made under the section headings below. General comments should come at the end. --MichaelMaggs (talk) 06:34, 25 June 2008 (UTC)


Aims of Wikimedia Commons

Scope of Commons

Excluded content

Scope part 1: files

Must be a media file

  • While to commons editors the gist of this section is obvious, it seems to me that this can imply that "Collections of Images" a.k.a. Galleries with explanatory text are excluded. I think one needs a little something about the role of Galleries and categories here. --Inkwina (talk contribs) 08:11, 25 June 2008 (UTC)
Not sure I quite follow: perhaps as I wrote it I can't see any other implication that might be visible to other readers. What sort of wording did you have in mind? --MichaelMaggs (talk) 16:46, 25 June 2008 (UTC)
This is basically what Herbythyme (talk · contribs) is saying below: that since Articles and Galleries are not referred to explicitly in the document, they run the risk of being slapped with an "out of scope" label.
  • I would not specify that explanatory text has to be "a little". e.g. Some complex graphics or graphs have attached the source code of the program that created them. This is obviously not just "a little explanatory text" but still pertinent. --Inkwina (talk contribs) 08:11, 25 June 2008 (UTC)
  • I am presuming that this refers to the Image: namespace only. This needs to be made explicit. --Inkwina (talk contribs) 08:11, 25 June 2008 (UTC)
  • I think entire book scans are useful as historical documents. Also such scans are definitely sharable between different projects. So I prefer to allow entire book scans on Commons. --EugeneZelenko (talk) 15:01, 25 June 2008 (UTC)
Interesting! Definitely worth discussing, as it seems that there is currently no consensus. --MichaelMaggs (talk) 16:29, 25 June 2008 (UTC)
In any case the djvu and pdf formats are allowed, so there seems to be an implication there. --Inkwina (talk contribs) 18:47, 25 June 2008 (UTC)
It has been pointed out that we already host this type of thing for Wikisource. That would seem well within our scope. But what about raw text? Would that be OK, too? Traditionally not, I think. --MichaelMaggs (talk) 19:48, 25 June 2008 (UTC)
I think raw text from original books is OK (but not from modern reprints or just PDFs). I think fonts and embedded illustrations are important for preserving books. --EugeneZelenko (talk) 15:05, 26 June 2008 (UTC)
I think it is important that the Commons allows text files for both PDF and DJVU files. This is very important for Wikisource. We use these files for proofreading text. Also, if we can't host these files on the Commons we would have to do a massive move from the Commons to Wikisource. --Mattwj2002 (talk) 11:04, 27 June 2008 (UTC)
Absolutely agree. I don't think anyone is realistically advocating that these be disallowed. We just need to make sure we get the scope right. Eugene makes good points, Commons itself is not gutenberg nor wikisource. But we should facilitate Wikisource in every reasonable way. Because it's nifty. Especially what has been done with .djvu and side by side editing. ++Lar: t/c 19:05, 27 June 2008 (UTC)
  • ✓  Done Distinction made between a file that has raw text only (eg ASCII files), and files with scanned text of out-of-copyright books etc, where the scan includes eg the original layout, pictures if any. I do think we have to exclude the former as otherwise we just become another Gutenberg. Please say if you don't agree. --MichaelMaggs (talk) 14:19, 30 June 2008 (UTC)

The wording relating to encyclopedia articles etc has been moved to a new section on "excluded content". --MichaelMaggs (talk) 14:51, 30 June 2008 (UTC)

  • I do not like the "Computer programs in any format, including source code listings" part, since the source code of computer programs used to create graphics is encouraged. For example as in Image:Mandelbrot Creation Animation.gif. Also some people could classify svg files as computer source code, but this might be a stretch to worry about. --Jarekt (talk) 15:07, 29 July 2008 (UTC)
    • ✓  Done . I think we can ignore the .svg issue as those are explicitly permitted. --MichaelMaggs (talk) 19:06, 4 August 2008 (UTC)

Must be an allowable free file format

Please consider that the term "open Source file format" is awkward. puts it this way "Use of a free format: For digital files, the format in which the work is made available should not be protected by patents, unless a world-wide, unlimited and irrevocable royalty-free grant is given to make use of the patented technology. While non-free formats may sometimes be used for practical reasons, a free format copy must be available for the work to be considered free." --Inkwina (talk contribs) 19:04, 25 June 2008 (UTC)

Would "Must be an allowable free file format" be better? --MichaelMaggs (talk) 20:14, 25 June 2008 (UTC)

I think that the precise file formats and other related details and constraints (maximum sizes, versions of the standards such as mp*...) should be documented on a separate page as they are technology (and wikimedia version) dependent and might evolve much quicker than the scope. --Foroa (talk) 09:10, 30 June 2008 (UTC)

Must be freely licensed or public domain

Now includes the requirement (confirmed by Mike Godwin) that every file must be free both under US law and under the law of the source country. --MichaelMaggs (talk) 14:51, 30 June 2008 (UTC)

Required licensing terms

Non-allowable licence terms

Okay, we should really factor {{CopyrightByWikimedia}} into this, because I find it stupid that a large amount of free screenshots are depicting non-free content, thus making the entire picture non-free. We should do something about that. ViperSnake151 (talk) 18:14, 16 July 2008 (UTC)

I'm not sure we need to make a special rule for that within the Project Scope. If the logos are non-free and are used in a way which is not De minimis then the images have to be deleted anyway under these rules. It seems to me we should educate users that the Wikipedia logo is not free. A separate help page might be the place for that. Uploaders will also get the message soon enough when their efforts are deleted. --MichaelMaggs (talk) 19:29, 16 July 2008 (UTC)

Allowable licence terms


It is incumbent on the uploader to supply suitable evidence, where necessary, to demonstrate either that the file is in the public domain or that the copyright owner has released it under a free licence. Typically that requires at least that the source of the file be specified, along with the original source where the file constitutes a derivative work.

As a complete layman, I would assume "source of the file" should include the country in which the image was taken, because the copyright laws do differ. This is probably common-sense but I have seen discussions which assume a particular country is meant.

Not sure it would be good to enforce that, as there are many images where that information is simply not known, especially very old images that we can realistically accept as PD whatever the country. --MichaelMaggs (talk) 16:48, 25 June 2008 (UTC)
Of course. I should have written "where the country affects the license decision", if I could write clearly. For example, I am now reviewing my own photos and a few would be OK if in Switzerland, but not if in Belgium, Italy or France. But I was also thinking of very recent photos of people who have not given consent - that seems to be country dependent. United States and United Kingdom opinion seems to me to be less strict (or more open to "assuming consent" without evidence) than Germany, say. -Wikibob (talk) 18:39, 25 June 2008 (UTC)

Otherwise, I welcome this clearer wording of the scope in this proposal. Related to this section - I was about to post a request for a clearer definition of public event, but decided to leave it in my notes page here (a lengthy way to reconcile People who have given their consent with people that are either public figures or are taken at public events.) -Wikibob (talk) 15:14, 25 June 2008 (UTC)

I would prefer to stick to general principles of "expectation/no expectation of privacy". Once you start to legislate for every possible situation, things start to get impossibly complex. A portrait taken at a bar/pub, for example, might in some circumstances be OK and in others not, depending on who is shown, what they are doing, whether indoors or in the garden and many others. Also, exactly the same situation may be dealt with differently in different countries depending on local law, customs and expectations. --MichaelMaggs (talk) 14:27, 30 June 2008 (UTC)

Precautionary principle

Must be realistically useful for an educational purpose

  • I find the term educational a bit too domain specific. The term "Educational" is too often interpreted in a sense too narrow for the meaning you want to give it here. While I think that most of us have a clear idea of what we want here, there does not seem to be a single term that encompasses it. I think "Informational" would be closer to current practice. i.e. A Media file which can "impart useful information". The best phrase I can think of is "Media with added knowledge value"; but using such a term in real sentences sounds very kludgy. --Inkwina (talk contribs) 08:25, 25 June 2008 (UTC)
Yes, I had similar concerns. Perhaps using "informational content" throughout would be better? --MichaelMaggs (talk) 16:19, 25 June 2008 (UTC) ✓  Done --MichaelMaggs (talk) 16:24, 26 June 2008 (UTC)
The keyword is "useful". Rocket000 (talk) 00:50, 26 June 2008 (UTC)
Useful can be hard to judge but yes, it's the key word all right. "Educational or informational purpose"? Something can be very beautiful and still useful. ++Lar: t/c 03:58, 26 June 2008 (UTC)
By "useful" I simply mean that it's currently used or reasonably likely to ever be used. That's it. I'm not saying "useful" as in the more subjective sense where even if an image is being used you still may say that it adds no value to the article. That's the hard part. That's where people disagree, but that's up to the writers of the article. Also, think about all the user images and images used on project pages. There's a lot of stuff that would never come close to being in any project's mainspace (well, except Commons), but we're here for that too. Rocket000 (talk) 05:59, 26 June 2008 (UTC)
We're on the same page. It's just sometimes hard to pin down in words that won't be ambiguous later. :) ++Lar: t/c 13:51, 26 June 2008 (UTC)
True. Maybe "usable" would be better. Rocket000 (talk) 21:07, 26 June 2008 (UTC)

Having made the change from "educational" to "informational", and reading through the resulting text a few times, I am not sure I like it. Does "informational" really convey our purpose to a new reader, and will it reduce the number of DR arguments? I find myself envisaging interminable arguments like this: "Why can't I upload this picture of my friend Bill? It provides all the information anyone could possibly need about what he looks like, so why have you deleted it for being non-informational"? --MichaelMaggs (talk) 06:07, 27 June 2008 (UTC)

I think that the keywords would be "useful" and "informational", i.e. it is the information imparted that has to be useful. There are two other ways to go around this:

  1. Define what is meant by educational at the beginning, then stick with it. This is what laws tend to do. Once it is clear that educational does not mean "relating to the practice of teaching" there should be no problems, hopefully.
  2. Create a new term+concept, and explain it in detail elsewhere. I would think that images have to be "CommonsWorth" i.e. worthy of being hosted on commons. The definition of "CommonsWorth" would be left a bit self-referring, what is on Commons is CommonsWorth and what is CommonsWorth is allowed to come onto commons. But this would be explained and supplemented by a definition page that would highlight the points we are making here. This would be different than scope, as scope includes such considerations as licensing, format, legality etc ...--Inkwina (talk contribs) 10:42, 27 June 2008 (UTC)
I like that a lot, and will work up something along those lines. No 1 seems the simpler option. --MichaelMaggs (talk) 20:26, 27 June 2008 (UTC)


Done "Educational" is now defined. --MichaelMaggs (talk) 14:15, 30 June 2008 (UTC)

I think for Commons to remain useful for hosting media used on various Wikibooks and Wikiversity projects, Commons needs to also allow media that can be useful for purposes of teaching, learning and doing research. I'm not sure "providing knowledge; instructional or informative" as a broad definition of educational is enough to include these types of media, based on Inkwina's comment above about excluding media "relating to the practice of teaching" from the definition of educational. I'm not sure what is intended to be excluded by not allowing media related to the practice of teaching, but I think such an exclusion could be bad news for Wikibooks and Wikiversity where books and learning resources are more than just about providing knowledge, providing instructions and being informative. Books and learning resources are about learning a subject by teaching it, and media that help in that should not be excluded from Commons. Wikiversity also encourages (original) research and media that help in that should also not be excluded from Commons. --darklama 15:38, 4 August 2008 (UTC)

I don't think Inkwina was suggesting that media "relating to the practice of teaching" should be excluded, merely that such an interpretation of "educational", on its own, would not be broad enough. The wording "providing knowledge; instructional or informative" seems to cover everything as it would be hard to find an example of something relating to the practice of teaching that is not at the very least instructional and informative. --MichaelMaggs (talk) 19:18, 4 August 2008 (UTC)

File in use in another Wikimedia project

This section mentions files used in "an article (the “mainspace”)" of one of the other projects as automatically conforming to this requirement. Not every project refers to the mainspace as an article and not every project limits educational content to a single namespace. English Wikibooks also has educational content in a Cookbook, Wikijunior, and Transwiki namespace, and English Wikiversity also has educational content in a Topic: namespace. Also sometimes users choose to work first in their userspace on content that will eventually be moved to a content namespace. In other words, with the current wording it seems Commons could delete media even if they were still being used appropriately as far as a project is concerned. --darklama 15:38, 4 August 2008 (UTC)

  • ✓  Done Reference to "article" and "mainspace" removed. Thanks for the comment. --MichaelMaggs (talk) 19:09, 9 August 2008 (UTC)

File in use on Commons only

This one looks menacing - if it's not used elsewhere, let's nuke it. NVO (talk) 20:51, 1 July 2008 (UTC)

Not intended to be: You have to read the wording above which says that the file is OK if realistically useful for an educational purpose. All this section says is that "realistically useful for an educational purpose" cannot be automatically inferred merely on the grounds that a file is in use on Commons, for example in a category. If otherwise, an uploader could always ensure that his/her files, however useless, would have to be kept simply by categorizing them. Maybe this could be made clearer in some way. --MichaelMaggs (talk) 21:47, 1 July 2008 (UTC)
✓  Done . Now says "Non-educational files ...". --MichaelMaggs (talk) 06:40, 3 July 2008 (UTC)

File not legitimately in use

  • There is another role that Commons fulfils that is not covered; that of a Stock Repository for Free Content Authors. While this use might be beyond the original intentions of the project, it is now in practice. e.g. the dozens of Images of Sunrises and Fireworks do not fall within the scope as defined here. Would this new scope imply that no more such images would be accepted unless they are of exceptional quality? Yet such content can be useful (e.g. as backdrops) to someone laying out a free content educational poster to be put up in a classroom, or even a free content music video or comic book etc... --Inkwina (talk contribs) 08:25, 25 June 2008 (UTC)
It is not the intention to exclude files that are potentially useful for other free content authors (in an educational/informational context), even if the file is not used on any WMF wiki. I think it may be useful to say that explicitly. --MichaelMaggs (talk) 16:24, 25 June 2008 (UTC)
✓  Done --MichaelMaggs (talk) 16:25, 26 June 2008 (UTC)

Must not contain only excluded content

New section defining non-allowed content such as encyclopedia articles. Previously this came under the definition of a "media file", which confused file type and content and which conflated uploaded files containing encyclopedia articles and encyclopedia articles posted to a page. --MichaelMaggs (talk) 14:51, 30 June 2008 (UTC)

I must congratulate MichaelMaggs for this extremely well written article. I have a couple of problems with the article.

1. Excluded: Self-created artwork without obvious educational purpose
This is a conflict-generating exclusion, as artwork by definition does not necessarily have an educational purpose and is very subjective by nature. It equally implies that an artist cannot upload his own works. (Thank you for not trying to introduce the notion of "notable person", as this is another source of never-ending debate.)
2. There are requirements concerning educational purposes and quality that are quite subjective and that will open the door for long discussions (possibly inspired by underlying censorship--Foroa (talk) 17:01, 9 July 2008 (UTC)). Because of the subjective arbitration on conflicts during deletion requests, the deletion nomination procedure might need adaptation and quick deletion procedures should be excluded for scope related deletions.
3. I have been wondering many times, when categorising seemingly "non educational" images, that the fact that an image is easily inserted in a category system is often a proof of its educational value.

--Foroa (talk) 17:01, 9 July 2008 (UTC)

Hi Foroa, thanks for your comments.
1. On the exclusion of "Self-created artwork without obvious educational purpose" I think you are right and that it would be better to avoid "purpose" in that context as it might imply we are concerned with the purpose the artist had in mind when the work was created, which will normally not be known. I have changed this to "Self-created artwork without obvious educational use". Remember that the definition of "educational" is very broad, so most self-created works of art should qualify in the same way that most self-taken photographs will. The intent is to exclude only self-made artworks where it is difficult to envisage anyone other than the uploader ever wanting to use them. Self-made logos for unknown societies, groups or rock bands of which the uploader is a member come to mind. These are often probably uploaded as advertising, but that's not always easy to demonstrate. It would always be open for an uploader to explain what educational use could be made of the image, to counter this.
2. It's inevitable that some of the criteria are quite subjective, and I see no way round that. This proposal is not intended to change current practice significantly, and I doubt that experienced admins will see much difference in practice. Speedy deletions are already only allowed in clear cases, and issues of quality will have to go to DRs as they already do.
3. I hope it is clear that use in a Commons category will not in itself convert an otherwise non-educational image into an educational one. --MichaelMaggs (talk) 18:41, 9 July 2008 (UTC)

We could include a list of problems which by themselves do not put the image out of scope, but in combination with other problems might outweigh its educational value. Some of those problems might be:

  • images too small to be useful
  • images of very poor quality
  • images with hard to remove watermarks
  • images uploaded to the commons with the main purpose of finding a place to host the image for hot-linking by outside websites

More can be added here. --Jarekt (talk) 15:36, 29 July 2008 (UTC)

Scope part 2: general pages, galleries and categories

Allowable page/gallery/category content

Non-allowable page/gallery/category content

Scope part 3: user pages, galleries and categories

Allowable user page/gallery/category content

Non-allowable user page/gallery/category content

A word on some areas of particular concern


"Neutral point of view"

I have added some comments on filenames and on text. Comments please. --MichaelMaggs (talk) 17:01, 7 July 2008 (UTC)

Hate-related images Deleted --MichaelMaggs (talk) 14:51, 30 June 2008 (UTC)

I think a little clarification is needed in this section. I know what is meant by "user-created" but some may interpret that as, well, any hate-related image that's made by users. Technically, most of our images are "user-created" in the sense that they are made with Inkscape or other software. Right now I can't think of the right terminology but it should be something like "user-made up" logos.

The section states, "On the other hand, user-created hate-symbols are not educational and should be deleted as out of scope, even where they have since been added to one or more wiki articles." This slightly contradicts the general idea that being used in an article itself (or more correctly, another project's mainspace) implies it's within our scope. This, of course, doesn't apply to articles where the uploader simply added the image themselves and would be removed as soon as others see it. Rocket000 (talk) 08:02, 25 June 2008 (UTC)

I don't see the value of this section. Smacks of censorship, and the lack of any way to categorize an image as "hate". Makes me extremely uncomfortable. en.Wikipedia is not censored, neither should 'useful' images on commons be (within legal boundaries). Megapixie (talk) 09:07, 25 June 2008 (UTC)
This is indeed hard to define in advance. I had in mind some deletion requests a few months back (which I now can't find) where various users had created personal anti-something images. For example, if I create a personal anti-Israel image such as a crossed-out map of Israel, is that OK? I think not, as it's intended purely as a hate image, and can't be used in any bona fide way to illustrate any Wiki article, even one that discusses the various anti-Israel political groups. It is something I have made up to make a political point. Is there any way of clarifying the wording to deal with personal user-created images like that, or is it simply not worth trying to legislate in advance? If we delete this section, on what other basis could we reject such images once they are actually in use (assuming of course that we think we should be doing so)? --MichaelMaggs (talk) 16:45, 25 June 2008 (UTC)
I'm thinking of something more along the lines of WP's attack page policy. If an image was clearly uploaded for the purpose of attacking others (or vandalism or spam), then it should be deleted. In this way, we look at the intent rather than the content and thus avoid the slippery slope of censorship. Rocket000 (talk) 00:36, 26 June 2008 (UTC)
Intent can be hard to judge, especially a priori. We need a basis to reject problematic images but using uploader intent may not be the way to go. Perhaps usage or possible usage? ++Lar: t/c 03:53, 26 June 2008 (UTC)


Done In view of the comments, I have deleted this section. Some of the wording is re-used in the educational use section, but as an encouragement to keep not delete. --MichaelMaggs (talk) 14:14, 30 June 2008 (UTC)

General discussion

In General, this is a VAST improvement on what we currently have --Inkwina (talk contribs) 08:26, 25 June 2008 (UTC)

User Pages

While we are doing this can we take a look at the subject of user pages.

There are a number of issues that I have seen in patrolling them.

  1. They become "articles" that look like they have been deleted from en wp.
  2. They are used for "advertising" in some form.
  3. They have a number of personal links on - not an issue if they are a contributor. However I find some who "contribute" such pages across wikis & make no other contribution - these seem then to be vanity/personal pages rather than having any relevance to Commons.

I'm sure I will come up with more but I am short on time for a few days. The whole review is great work & appreciated, thanks. --Herby talk thyme 11:40, 25 June 2008 (UTC)

Is this in scope for the scope discussion? :) How does/should our user page policy differ from that at other wikis? I fear if too much policy is in this, it may bog down getting it done. ++Lar: t/c 03:51, 26 June 2008 (UTC)
Yes, let's keep these issues separate. We can always work on COM:USER too. Rocket000 (talk) 05:41, 26 June 2008 (UTC)


Done See new sections on general pages, galleries and categories, and also on user pages, galleries and categories. New headings listed above, so feedback can go in the relevant place. --MichaelMaggs (talk) 14:51, 30 June 2008 (UTC)

Wider discussion

As this potentially impacts all wikis, should we make an effort to encourage cross-wiki participation, eg by posting a notice on various Village Pumps and on MetaWiki? --MichaelMaggs (talk) 16:34, 25 June 2008 (UTC)

Do you want this to become policy or not ;) Though you may be correct! --Herby talk thyme 16:41, 25 June 2008 (UTC)
I can see pros and cons. What about keeping the discussion to Commons until the most obvious flaws have been ironed out, then seeking wider input at that point? --MichaelMaggs (talk) 16:58, 25 June 2008 (UTC)
I think this is our decision to make but that getting input and feedback from elsewhere might be good. The approach of getting this to where we think it's very close to done and then seeking wider input seems good. ++Lar: t/c 03:49, 26 June 2008 (UTC)

Gallery pages generally

I think we should be more explicit about non media pages generally. I am always rather shocked at the number of deletions that I have and I reckon around a third of them will be "out of scope" pages in the sense that they have no media. The out of scope tag mainly makes reference to the deletion of images rather than pages (though I guess I could go & change that!). --Herby talk thyme 16:44, 25 June 2008 (UTC)

Agreed. I'll think about some wording. --MichaelMaggs (talk) 18:16, 25 June 2008 (UTC)
Don't forget about COM:G! This policy should focus mainly on what kind of media should be uploaded, IMO. Let's not cram everything in here. :) Rocket000 (talk) 00:53, 26 June 2008 (UTC)


Done See above. --MichaelMaggs (talk) 14:51, 30 June 2008 (UTC)


I would like to take this opportunity to say that I think the new page is marvellous! It summarises the many discussions and deletion requests which we have had on Commons very well. It is a clear help for admins in their decision-making while at the same time it probably doesn't change the way that the most experienced admins already work. The proposal is balanced, clear, and verbose enough that it can be understood by almost everyone. Most importantly, it shows concern for improving this project and its approach is practical and not at all extremist or fundamentalist in any direction. Samulili (talk) 08:22, 27 June 2008 (UTC)

Absolutely agree. Very good work so far. ++Lar: t/c 19:05, 27 June 2008 (UTC)
I'll incorporate the feedback so far over the weekend. --MichaelMaggs (talk) 20:21, 27 June 2008 (UTC)


Done --MichaelMaggs (talk) 14:51, 30 June 2008 (UTC)

Post weekend rewrite

In general looking much better. Minor grumbles - "File not legitimately in use as discussed above" is a bit too verbose/long. Good stuff. Megapixie (talk) 15:28, 30 June 2008 (UTC)

I have shortened the heading. Does that help? --MichaelMaggs (talk) 16:03, 30 June 2008 (UTC)

Promotional material

Your work on this has been nothing short of awesome Michael - the community owes you for this. I'm not sure how this might be phrased or work but there are an increasing number of images being uploaded that are intended for promotional purposes. In some cases that may be of some "educational use" but the intention is to spam. I deleted some pictures of watches recently (uploaded by someone selling watches). There is a sense in which these images (obviously from a catalogue/brochure) might have been useful but...? Not at all sure how to incorporate anything I'm afraid. --Herby talk thyme 07:47, 3 July 2008 (UTC)

I see what the problem is. Give me a day or two to think up some wording. --MichaelMaggs (talk) 21:43, 3 July 2008 (UTC)
✓  Done but I think that if an image is of real potential value we ought to be able to keep it (if need be after deleting any promotional text on the image page) even if the uploader's intent was not educational. --MichaelMaggs (talk) 16:51, 7 July 2008 (UTC)

Wikimedia social content

The current draft does a good job of recognizing (and both permitting and confining) 'user' images. It recognizes their usefulness for lubricating our social machine, but it is not so permissive that it risks creating the argument that commons is a free photo hosting service. ... but it misses out on some other classes of useful but non-educational images. For example, there are several commons galleries of Wiki(p|m)edia meetups. Useful for our purposes, and few enough in number to never be a problem... but not educational. They often aren't used on user pages but are instead linked from discussions and meetup pages. Perhaps the policy should be made more general to cover images which facilitate our work but which are not themselves all that educational. --Gmaxwell (talk) 18:36, 2 July 2008 (UTC)

✓  Done . --MichaelMaggs (talk) 06:41, 3 July 2008 (UTC)

Further copyedits

For ease of reference, I have added a couple of new definitions, namely "Aims of Wikimedia Commons" and "Excluded content" which I have then cross-referred to throughout. Please comment on those new sections under the corresponding headings, above. --MichaelMaggs (talk) 16:56, 7 July 2008 (UTC)

Excellent proposal

This is the first time that I've had a chance to look since I emailed you mid-June. This proposal more than meets my expectations. You have done a fabulous job breaking it down into easily digestible bites of information. FloNight♥♥♥ 20:22, 7 July 2008 (UTC)

I really like this as well. It is easy to understand while still outlining all inclusion criteria. This way Commons:Licensing can be a more in-depth document with details while this page gives the basics of it. -- Bryan (talk to me) 21:39, 7 July 2008 (UTC)

Feedback, please, on Excluded content section

I would like more feedback on the following wording, which may I think allow some educational .pdf and .djvu files to remain that are commonly objected to and indeed are being deleted under the current rules as "out of scope" (though there seems no really clear basis for this at present):

Files that contain nothing educational other than raw text. Purely textual material such as plain-text versions of recipes, lists of instructions, poetry, fiction, quotations, dictionary definitions and the like are better hosted elsewhere, for example at Wikibooks, Wikiquote, Wiktionary or Wikisource. However, Commons can be used to host such material if included in a media file that embodies something of educational value over and above raw text. For example, files consisting of scans of out-of-copyright books, newspapers and the like which preserve original font, layout, embedded images and the like are within scope. Text useful to the operation of Commons is permitted.

Here are some examples (please assume for simplicity that all of these have actually been released under a free licence):

1. Long academic paper on Copyright issues. Contains more than raw text (footnotes, page layout etc) but essentially the educational content here is really just the text. See for example Commons:Deletion requests/Image:Aaaaa.pdf (file visible to admins)

2. Would the answer be the same if this were a paper in some other academic discipline such as physics?

3. What about a scientific paper that includes more than textual information content, eg a paper that includes graphs?

4. If 3 is OK, presumably we should accept even large .pdf files such as copies of PhD theses?

Why should we not accept all of these, assuming that they are all "educational" - as they appear to be? There will never be large numbers, I suspect, so issues of storage space are probably not important even though some files like this may be quite large.

--MichaelMaggs (talk) 19:08, 9 July 2008 (UTC)

Hm. "educational beyond the text" is relevant, but I don't think it's the most important test... Commons should contain material which supports our wikis, not material which replaces them. A DJVU with fancy formatting included because it's a source document is good, a DJVU with fancy formatting included because some user didn't want to take the time to learn Wiki markup is bad. A DJVU file used because the user wanted his text to be harder to edit is excruciatingly bad, even block worthy behavior. Even if the documents were the same and all had useful formatting the usage and intention is highly relevant. --Gmaxwell (talk) 20:34, 9 July 2008 (UTC)
Any thoughts on some wording which would capture that? --MichaelMaggs (talk) 20:40, 9 July 2008 (UTC)

Well, again, I find myself drifting back to the "usage" thing. For example, there may be texts that lack images, graphs, or any historically significant aspects worth preserving (such as the typography), yet if they are being used by, say, Wikisource then it shouldn't be an issue. I think our scope gradually expanded naturally. From pretty much just images to videos and awesome djVu files. That's a good thing, but when we start getting PhD theses or instruction manuals or cookbooks, it's time to draw the line. We should make a distinction between book scans (almost certainly old due to copyrights and worth preserving) and modern machine-typed texts such as those found in some PDFs (haven't seen any in djVu). Here are my personal criteria when I run across text deletion requests (in order of significance, excluding licensing):

  1. Is it used?
  2. Does it contain anything besides text? (photos, drawings, graphs)
  3. Is there anything remarkable about the typeface, layout, or medium? Basically, is there anything besides what the text says that has educational value?
  4. Is there any historical or otherwise external significance where preservation may be desired?
  5. Is it in some way related to Wikimedia? (Many exceptions.)

Regarding #2, I have seen some scientific texts with graphs (or simple B&W logos) that I definitely consider out of scope, so there's some discretion there. Many times, the graphs or photos should be extracted, then uploaded in a more appropriate format. On a side note, it's interesting to look back at the first description (oldid=15!) of our project. Also the first draft of this page (on Commons anyway). It's about time we got around to updating it. :-) Rocket000 (talk) 21:01, 9 July 2008 (UTC)

  • Usage seems to be a key point. It is outside of Commons' scope to include text with the main use being comprehension of the concepts in the content. Exceptions for Wikimedia-related text can be made. A document needs historical significance, stylistic representation, or other significance beyond the idea to be in Commons' scope. For example, cookbooks, directories, indexes, and essays would be outside of scope unless they meet another criterion. FloNight♥♥♥ 23:37, 9 July 2008 (UTC)

Careful when considering "is it used" in the "is it transcluded into an article" sense. There are many images which are very much in *use* by being placed in galleries which are widely linked by the other projects. Some of the galleries add considerably to a reader's understanding of a subject. I don't think we can solve this point by simply creating a free pass for content in use. ;) I'll sleep on the question of proposed language. --Gmaxwell (talk) 01:56, 10 July 2008 (UTC)

Thank you for asking me to comment on the scope of Commons files, in particular, Pdf Files. I have built up most of the Category:Pdf files as there are over 12,000 entries if one searches pdf and about 800 if one searches pdf files. There is a pattern that appears throughout most of the files and, apart from the content, less than 10% is linked or connected to anything and the predominant amount of pdf files are in Spanish. To answer three of the above points: I have placed most of the pdf files in Category:Education Pdf files and they have yet to be subdivided and organized. Most of these I felt were useful and tried to link them to appropriate (real) Categories or articles of a similar nature or they were already done that way.
  • Firstly in regards to Theses and higher education pdf files. Most were not linked to anything and were fairly lengthy. I can see why initially the uploader used this method rather than putting it into a Wikipedia article because it couldn't be edited or changed in any way and their educational integrity would be intact. Some were quite interesting and useful I am sure, so if a policy is put in place there has got to be a more clear guideline that they link pdf files to the Category:Pdf files or subcategories or galleries and also to a Commons Category that is relevant. Also many create totally unlinked and unusable Categories because the uploader page I think, just says create a Category or Place images in Categories or some such wording that is not leading the uploader to doing a search for an existing Category or Gallery.
  • Files that contain photos and graphs someone said can be extracted? I think the uploader should be contacted somehow and told (nicely) that just the images should be placed in Commons and should put the original uploads up for deletion and stick to this or whatever policy is in place for photos and art in pdf files Category:Art and related media Pdf files. I believe they are thinking their images won't be used and they still have the copyright so a clearer explanation and usage guideline should be put in place. Images and art uploaded this way do not contain any labels or explanations etc and are useless unless linked properly.
  • Historical, Biographies, Books etc should remain and be correctly linked. I have only put up for deletion those that are vanity or CV bios and really short histories less than one or two pages. I think these uploaders are probably not aware that their material should be on Wikipedia as a short article. Category:Books (literature) in Pdf Category:Genealogy Pdf files Category:Biography Pdf files Category:Law related Pdf files

I will have to gather my thoughts and continue after some of your comments and suggestions. WayneRay (talk) 16:46, 11 July 2008 (UTC)WayneRay

This section needs still more feedback. I am unable to discern from the discussion any consensus that I can convert into a written generic principle :( --MichaelMaggs (talk) 06:31, 15 July 2008 (UTC)

PDF File discussion

Concerning pdf files, I feel that we have to think on some more dimensions of the problem.
  • If we consider that commons will last for eternity (minus one), then it is "the" repository for all sorts of reference files and reports (for example annual reports of Amnesty International, Greenpeace, ...)
  • It should be possible to keep all sorts of informatics files that cannot be altered (such as reports, invitations, posters, announcements, pamphlets) as reference documents. This might require a special license policy.
  • Maybe it might be needed to save some pdf files with an encrypted option so that it can never be altered.
--Foroa (talk) 07:57, 15 July 2008 (UTC)
With reference files in pdf format, I mean reports, files, announcements, theses ... in their original pdf format that cannot be altered and are potentially protected against alterations. This is completely different with wikisources, which must be a nightmare to ensure coherence with the original texts. --Foroa (talk) 14:51, 15 July 2008 (UTC)
I understood that you were referring to pdf files of documents that need to be left intact to keep them true. But my concern is that we have no way of knowing if these documents are accurate and truly represent the correct version. Based on my experience working with texts, I'm not willing to take this leap of faith for the thousands or millions of documents that we might quickly acquire if we include documents that are not already notable or being scrutinized on other Foundation projects for accuracy. FloNight♥♥♥ 15:19, 15 July 2008 (UTC)
FloNight, I think we are discussing different issues. I see the use of PDF (Portable Document Format) as some modern facsimile/photocopy format that cannot be altered by a third party (if need be, it can be encrypted so that there is no possibility to change it). So, once you get the right version, it remains right for ever (or could be locked to prevent changes). --Foroa (talk) 16:45, 15 July 2008 (UTC)
Yes, that is a benefit. And for texts that easily fall within the scope of Commons, the use of PDF files might be perfectly correct, and maybe the best way to capture some items. But I feel that including a large number of pdf files with the assumption that they are accurate because they are already in the pdf format is a faulty assumption. FloNight♥♥♥ 17:01, 15 July 2008 (UTC)
Yes, I think we need to discuss this more in order to get on the same page. There are definite pros and cons to storing these files. Unfortunately, I think the potential problems outweigh the advantages.
My work on several other Foundation projects, as well as my work at Commons, makes me extremely reluctant to accept many, many large files that can not be verified for accuracy of content versus the original source. The inability to alter them raises problems, as well. This particular type of upload seems like a massive project that is different enough from Commons' other uploads that it warrants careful consideration. FloNight♥♥♥ 11:31, 15 July 2008 (UTC)
You are right on all counts regarding Pdf files. There should be a place to store, and a use for, large non-image files such as theses, reports etc. Right now the main thing I see is that nothing on the main Commons and Wikipedia pages or the Upload page states "all pdf files go to the Category Pdf files". 90 percent of the ones I have gone through just sit there with no links whatsoever, so that point is not getting through to uploaders, particularly for the large number of pdf files in Spanish. I can create better subcategories but that doesn't help it at your end for Policy and also for getting the word out to the world.
Here is a modified Upload form if someone can make the appropriate upload summary page that automatically includes the Category:Pdf files link. Is that simple or what? Translated into the other languages

Where is the work from? (Click on the appropriate link)

  • It is entirely my own work
  • It is a Pdf file
  • It is from Flickr (more information on uploading Flickr images)
  • It is a derivative work of a file from Commons

WayneRay (talk) 13:39, 15 July 2008 (UTC)WayneRay

My experience working with texts on Wikisources makes me think that there are issues with the accuracy in the content of text documents as well as verifying the sourcing that can not be resolved easily by asking the uploader to better identify or categorize the material. Errors in transcription are common, as are mistakes about the original source. And unfortunately, it is not unheard of for some people to deliberately falsify texts. The main purpose of Wikisource is to upload texts. I'm not convinced that Commons should on a large scale be duplicating this work with less control on accuracy. FloNight♥♥♥ 14:37, 15 July 2008 (UTC)
If I may step back a bit; there are pdf files with only one sentence or a paragraph, as well as ones with photographs that should not be in pdf but uploaded as images, up to larger ones as you are discussing. My main concern before all that new categorizing etc is as above: there is nothing on the upload page or anywhere to tell uploaders about or where to put Pdf files. Can the upload form be changed to recognize this?
Secondly, I agree that duplication is not needed. If Wikisource is for texts then why not move all the Pdf books and texts over there and delete the duplicates here? I don't think I have been there so I should check it out first. PS I am originally from just near you in Alabama, been in Canada for a long time WayneRay (talk) 18:20, 15 July 2008 (UTC)WayneRay

Pdf examples to discuss

Please comment on the allowability of these pdf files that are currently up for deletion. There is a divergence of opinion on all of them.

This is a previously published and credited article and in the possession of the author after one year is up. It is highly informative and educational and can be used. This is one of the reasons I created some of the sub categories: because there are so many useless and useful pdf files being uploaded, the better ones could be kept, like this one, and the one sentence one paragraph ones just deleted. WayneRay (talk) 09:34, 16 July 2008 (UTC)WayneRay
This one I read as well, and although now it seems similar to the one below, it is educational in its scope and not an advertisement for anything. I see now it is not complete and should be in a different Category but someone may learn from it. It should be an article on Wikipedia though. WayneRay (talk) 09:26, 16 July 2008 (UTC)WayneRay
I remember this one and read through the whole thing. It appears to be educational and logically written and because of the accompanying graphics I can see where it might not work as a Wikipedia article. I voted Keep and placed it in the Educational pdf files area because it could be used somewhere. WayneRay (talk) 09:20, 16 July 2008 (UTC)WayneRay

I'd like to hear opinions on the general principles on which you think these should be deleted/kept. --MichaelMaggs (talk) 06:25, 16 July 2008 (UTC)

I'm of the view that PDFs are better hosted on our sister projects. If it is textbook-like, move to Wikibooks; if it is a source text, move to Wikisource; etc. If there are some PDFs which fall outside the purview of the other projects (unlikely, given v:WV:WIW), it likely does not belong on any of them - including Commons (if there are counterexamples, I'd be interested). It should also be noted that most PDFs should be converted to wikitext and placed in the wiki proper rather than moved only as a PDF file (scans of source texts are the only exception I can think of).
Applying those principles to the examples mentioned:
  • MINDSPACE - The Stream of Consciousness - Final.pdf moves to Wikisource, and gets converted to wikitext (might also fit at Wikiversity, which doesn't ban original research)
  • LessonsLearned.pdf moves to Wikisource, and gets converted to wikitext.
  • MolecModPaper.pdf moves to Wikisource and/or Wikiversity, and gets converted to wikitext.
 — Mike.lifeguard | @en.wb 10:57, 16 July 2008 (UTC)

Jumping into this discussion, I see only 3 reasons to use PDF format here, and only the first likely:

  1. as an image format, for example a scan of a public domain manuscript (TIFF or JPG might be better, but the original scan may be in PDF format, so should be the first version uploaded)
  2. when the original work is in PDF format, and contains formatting worth keeping, over and above pasting the text into Wikisource or another wikimedia project
  3. if and when a wikimedia project has a reason to link to media in PDF format (perhaps a slide show or animation) if that is not better achieved with a more free format

In general, I see PDF as problematic, as, like TIFF and JPG, it is not source code: instead it is probably generated from source code, such as an underlying word processor document, TeX markup, HTML or plain text, which the author should be strongly encouraged to contribute. Free content where the author has more freedom to change it than the Commons user, because the author has the source code, is not very free, and not in the spirit of our project.

--InfantGorilla (talk) 13:51, 16 July 2008 (UTC)

OK But what of the 8-900 I have organized already, have you or anyone gone through the subcategories? Please delete every single one if you like or move all the relevant ones to other Wikis. Also get it into uploaders' heads that PDFs are not supposed to be here. Mine are in a Publisher article and I would be more than happy to donate every single book I have published or written to Wikibooks or Wikisource etc. I don't mind organizing and linking as many as I can but if your proposal is to be finalized let me know because I see about 5 a day coming in new and unlinked. I agree the text-ish ones should be converted or re-uploaded as text but there are many that don't fit, particularly scientific and historical materials. Thanks for listening WayneRay (talk) 20:48, 16 July 2008 (UTC)WayneRay

I have been away for a while, so haven't been able to take this forward. Before going live I would like to see more comment on the pdf issue, as there are opposing views with no obvious consensus. Since this is probably of most concern to Wikibooks and Wikisource I propose to invite comments from users of those Wikis in particular. --MichaelMaggs (talk) 19:19, 3 August 2008 (UTC)

My 2c. I think it is ridiculous to exclude PDF/DJVU files that are useful to Wikibooks, Wikiversity or Wikisource. Rejecting these files just because they contain text instead of pictures is silly and reduces Commons' strength as the central resource for uploaded material. It is far easier if projects can just direct all uploads to Commons, instead of having to have some arbitrary rule about text-dominated files being uploaded locally. What exactly is the benefit to us? We serve them better if we take everything they want to accept. (copyright forcing the exclusion of fair use items.) --pfctdayelise (说什么?) 12:49, 4 August 2008 (UTC)
Most projects have local uploads, so Commons is actually not any more useful to them if it accepts PDFs than if it doesn't. The reason we have a central media repository is for media which are potentially useful to all projects; there is no reason to have media which are useful to a single project only. While we're happy to encourage uploaders of free media to do so at Commons, and we are happy to move free media here too, we are not going to require that media is uploaded here. Wikibooks content should be hosted at Wikibooks; where free media may be useful to other projects we will make an effort to move it here. Hosting Wikibooks content on Commons just doesn't make sense to me. This is not about the practical issues so much as the principle. Wikibooks content belongs on Wikibooks; common content belongs on Commons.
As a practical matter, many PDFs on Wikibooks include some fair use element(s), and would be unacceptable here. I do not know that it is worth looking in each file (some of which are massive) and verifying the copyright of each element.  — Mike.lifeguard | @en.wb 14:22, 4 August 2008 (UTC)
This conversation sounds like it's getting to be about more than just PDFs. The question that seems to be under discussion now is, "Should Commons host media that is useful for only one Wikimedia Project?" And AFAIK the answer to that question has always been yes. For example, scans of a PD-old book written in German will really only be useful for German Wikisource, but those scans have always been hosted here at Commons. Indeed, German Wikisource's policy is that the scans must (or at least, really really ought to) be uploaded to Commons and not locally. If Commons continues to accept JPGs and PNGs that are useful at only one project, there's no reason it shouldn't accept PDFs that are useful at only one project. There is incidentally another reason why text PDFs may be useful to projects like Wikisource: readability. Works that use a large number of special characters that are only found in a few specialist fonts (e.g. works on advanced mathematics, linguistics, etc.) may not be easily readable in their online form for people who don't have the specialist fonts installed on their computer. By having the work available as a PDF too, it is accessible for more people. PDFs can also be formatted nicely for printing out, while printouts from web pages generally look like crap. —Angr 16:17, 4 August 2008 (UTC)
Agreed. There may also be other free-content users for whom the files could be valuable. The draft scope includes the statement that Commons "makes available public domain and freely-licensed educational media content to all". --MichaelMaggs (talk) 16:31, 4 August 2008 (UTC)


Here is a University Thesis that I would normally put up for deletion Image:Regularização Fundiária da Terras Indígenas no Brasil.pdf or move to Education. In Spanish, well written, documented with annotations and references. Way too long for Wikipedia, is it usable at Wikisource (not in English though)? And my way earlier point: why are all these Spanish Pdf's coming in, who and where is this message getting out to the world that text can go in pdf and University theses can be uploaded to Commons?? WayneRay (talk) 10:08, 16 July 2008 (UTC)WayneRay

No one answered the question and just went ahead and deleted the file before it could be looked at? WayneRay (talk) 15:06, 29 July 2008 (UTC)WayneRay


Would anybody object to me doing some bold editing to this to make it less TL;DR? I don't think this is a substantial improvement over the original, simply because nobody is ever going to read something this long. It can be made much, much shorter and readable while still preserving the same information. Lewis Collard! (lol, internet) 22:41, 14 July 2008 (UTC)

The idea of making this much much shorter scares me, as I think this is surprisingly short, for the ground it covers; mind putting it on a subpage?  — Mike.lifeguard | @en.wb 23:02, 14 July 2008 (UTC)
Well, you could always roll me back if I'm wrong... Lewis Collard! (lol, internet) 23:15, 14 July 2008 (UTC)
I don't think it is possible to make this much shorter without losing the sense. It tries to capture what admins actually do at the moment in closing DRs, almost none of which is mentioned on the current Scope page. Taking a hatchet to the text cannot avoid losing the subtlety or the examples, both of which I believe are useful. You need to bear in mind that this is a reference page of the type that users and admins will pore over and look for holes when they want to prove a point in years to come. It's not a quick-and-simple-summary type page. If it's useful to have a summary for quick-reference, let's have that by all means but not at the expense of losing this. I don't know to what extent you have been following this page, but it is the result of many tens of hours of drafting with input on the detailed wording from many users and I have to confess a real unease at the prospect of sweeping changes, especially as you are coming from the viewpoint that you do not think it a substantial improvement on the original. --MichaelMaggs (talk) 06:26, 15 July 2008 (UTC)
Collard, the point of the re-write was to document the unwritten policies, customs, and practices related to the scope of Commons. These ideas have long been understood and followed by the regulars once they were acclimated to Commons. But with the growth of Commons, it is now helpful to put the ideas in writing. I agree with MichaelMaggs that this document needs to be detailed in order to capture the nuances of our policy and practices. Some users and administrators are looking for specifics on this topic and it is extremely labor intensive to repeatedly explain the details. For these reasons, I'm reluctant to shorten the document primarily for the purpose of brevity. FloNight♥♥♥ 11:48, 15 July 2008 (UTC)
I'm fully aware that "it is the result of many tens of hours of drafting with input on the detailed wording from many users", and that the project scope page as it is pretty much sucks. The reason I say that it is not a substantial improvement over the original is the sheer length of it, and how it is written and formatted like a legal document (which, and I mean this in a good-hearted and admiring way, does not surprise me given that MichaelMaggs is one of the primary authors). If you write something this long you cannot expect the vast majority of people to read it, and that especially includes new contributors (the ones who need the most education about what Commons is about!). I can see ways in which the proposal could be re-written without even losing any information.
With all that said, I do have concerns about whether codifying this sort of thing is beneficial. That's an issue I never raised above, but I can TL;DR you to death about that one if you really want. Lewis Collard! (lol, internet) 15:48, 15 July 2008 (UTC)
The primary reason for codifying is to reduce the huge amounts of time wasted by admins repeatedly having to re-explain and re-argue the same points over and over again on DRs, and to reduce the uncertainty for users from other projects who legitimately have a desire for consistency of approach to deleting files. Brevity can come in a separate summary page for new users. --MichaelMaggs (talk) 16:18, 15 July 2008 (UTC)
I agree, a summary page is worthwhile, but not at the expense of clarity. Commons' scope is a major issue currently, so being clear and comprehensive is of great value to us. However, the danger of such coverage being overwhelming can be overcome by making a summary. Lewis: If you think you can make a good summary, Commons:Project scope/Proposal/Summary is waiting for you!  — Mike.lifeguard | @en.wb 18:52, 15 July 2008 (UTC)


Are there any serious objections to the current proposal? Unless there are I propose to move this proposal to the main page. -- Bryan (talk to me) 13:30, 29 July 2008 (UTC)

It appears to me that a conclusion or at least a temporary position on pdf files is to be taken before publishing the proposition. --Foroa (talk) 15:48, 29 July 2008 (UTC)
I agree, and will be inviting more views on that. --MichaelMaggs (talk) 19:24, 3 August 2008 (UTC)

I have been waiting for someone to summarize and, preferably, simplify it. Until that happens, Commons:Project scope is far more useful than this proposal. I am certain that such a complex scope will scare away editors of other wikimedia projects, who will find it easier to host media locally with a scope they already know, rather than try to understand a long, complex scope. --InfantGorilla (talk) 16:27, 29 July 2008 (UTC)

See Commons:Project scope/Proposal/Summary. --MichaelMaggs (talk) 16:24, 4 August 2008 (UTC)

In my opinion, the last point in the Excluded content paragraph isn't clear enough regarding Wikisource. Especially German-language Wikisource with its strict policies relies heavily on scans of original publications (preferably first editions) for proofreading and creating a reliable text. For example, Faust - Der Tragödie erster Teil is faithfully reproducing the first edition text of Goethe's Faust and needs the scan with pages like Image:Faust I (Goethe) 005.gif for verifiability (see Category:Faust I (Goethe)). It is stated in the proposal:

However, Commons can be used to host such material if included in a media file that embodies something of educational value over and above raw text. For example, files consisting of scans of out-of-copyright books, newspapers and the like which preserve original font, layout, embedded images and the like are within scope.

However, Wikisource is not using the scans stored here for their "original font, layout, embedded images and the like", this is not the point at all. The point is having a reliable, verifiable text base. Category:De Wikisource book is full of scans where there's nothing remarkable about "font, layout, embedded images and the like" but that are still needed for Wikisource purposes. And I think that this is still within scope of Commons - quite like images that are needed for Wikipedia purposes, we can host book scans that are needed for Wikisource purposes, can't we? Whether they are in the form of individual GIF/JPEG etc. files or PDF. Therefore I would like to see proofreading purposes explicitly mentioned. Gestumblindi (talk) 21:14, 29 July 2008 (UTC)

Thanks, that's useful feedback (I'm not very familiar with the procedures at Wikisource). I agree that we can/should be broader than the existing proposal. I will try to work up some revised text during the next few days that should, I hope, satisfy you. --MichaelMaggs (talk) 19:24, 3 August 2008 (UTC)

One last doubt

I have been away for a while and come back to find this very much improved version, but I still have one worry: I get the feeling that current practice is that newly added images are treated somewhat differently to images that have been around for a while. This is so particularly in reference to the sentence "New and existing files of poor or mediocre quality may or may not be realistically useful for an educational purpose depending on what they illustrate and what other holdings we have of the same subject". New files that add nothing new are treated differently from old files that are made obsolete by new files. (e.g. vide Commons_talk:Deletion_requests/Superseded.) I'm afraid that the current formulation might provoke a huge amount of deletion requests of people wanting to "clean up" Categories with too many poor images. On the other hand some such cleaning up is desirable and appeal to the "educationally useful" concept here can help. --Inkwina (talk contribs) 18:09, 30 July 2008 (UTC)

I don't expect wholesale deletion requests since a file that is in bona fide use on another Wiki can never be made obsolete by a later-uploaded, better file. That is the same as current unwritten practice. An unused file could in theory be deleted as non-educational if Commons has plenty of better images of the same subject, but in practice I would expect that deletions will happen on that basis fairly rarely, and that such files will be approved for deletion only if they are pretty near useless. --MichaelMaggs (talk) 19:38, 3 August 2008 (UTC)
    • ✓  Done . Can you check if the revised wording meets your needs? --MichaelMaggs (talk) 18:03, 20 August 2008 (UTC)

On a different note, where it is said "by custom the uploading of small numbers of images (eg of yourself) for bona fide use on a personal Commons user page is considered legitimate", I do not think there is any need to appeal to custom here. Such images are operationally useful in their own right, as lubricant to the social interactions that drive Commons and other wikis. --Inkwina (talk contribs) 18:09, 30 July 2008 (UTC)

You may be right, but it is worth saying explicitly. --MichaelMaggs (talk) 19:38, 3 August 2008 (UTC)

Actually not the last

So does the "educational test" imply that the following are out of scope: 1) a chemical diagram of a structure that is physically impossible, 2) a flag with the wrong colors/placement of symbols, 3) a graph based on bad data? I am referring to images that are wrong due to mistakes or inaccurate/outdated data, not to alternative interpretations. --Inkwina (talk contribs) 17:38, 1 August 2008 (UTC)

The answer in all cases, both now and under the new Scope, is "no" if the image is in bona fide use on another Wiki. Where a file is in use we do not enquire into its accuracy nor try to impose our view about what is "right" onto other Wikis. If the files are not in use, and there appears to be no realistic reason for them to be used, they could be deleted as non-educational. Again, the new Scope is not changing anything here, simply writing down what experienced admins do every day when dealing with deletion requests. --MichaelMaggs (talk) 19:38, 3 August 2008 (UTC)

Freely licensed

"A file is considered to be in the public domain if either all copyright has expired or if the copyright owner(s) has voluntarily placed the content of the file into the public domain by irrevocably renouncing all copyright" ... or if the work is ineligible for copyright protection. I think that ineligibility should be mentioned, though we need to avoid inviting people to declare ineligibility for every image they really want to upload. Under the existing laws ineligibility is somewhat unusual. --Gmaxwell (talk) 15:31, 1 August 2008 (UTC)

Pdf and Djvu files

I have added text dealing with these based on the discussion above. Users are by no means unanimous about which files should be allowed, and I have tried to follow the majority opinion. Thus, the suggestion is that if a Pdf or Djvu file is educationally useful even to a single other Wiki it should be kept. Comments below please. --MichaelMaggs (talk) 20:19, 11 August 2008 (UTC)

Either that or we make Wikisource a shared repository like Commons. I'm good with that. ViperSnake151 (talk) 22:22, 11 August 2008 (UTC)
I think this text does a good job of balancing the competing concerns in this area, and I support it. In particular we know there are projects that make use of these formats and we should support them in their usage. I would NOT want to see us require, or even encourage other projects to become shared repositories of media files. ++Lar: t/c 22:24, 11 August 2008 (UTC)
Raw-text-equivalent should not be hosted here, but on the appropriate project, IMO. However, there is reason to have source texts which are not raw-text-equivalent hosted here, and that is fine by me. I'm not sure if that wording is more or less precise to others, or if that can be agreed upon.  — Mike.lifeguard | @en.wb 23:12, 11 August 2008 (UTC)
MichaelMaggs, thanks for letting me know of your most recent stab at rewording. This wording works for me. My concern was storing on this project solely for educational purpose with no specific target use on another project. The key is the usage factor. I feel that usage on another wikiproject, within the scope of that project, makes it appropriate for Commons to host. FloNight♥♥♥ 23:24, 11 August 2008 (UTC)
Agree w/ above, the current wording and usage as laid out by MichaelMaggs (talk · contribs) at Commons:Project_scope/Proposal#Pdf_and_Djvu_formats, particularly the list of allowable reasons and the text stating that if the file could have a potential use on Wikisource/Wikibooks it should be kept. Cirt (talk) 01:03, 12 August 2008 (UTC)
Added a bit to cover random badness like promotional stuff. If it would be excluded were it in some "media" format (ie png, jpg, whatever) then we don't want it if it's pdf either. Hopefully my wording is clear - if not, please re-word, keeping the intent.  — Mike.lifeguard | @en.wb 01:11, 12 August 2008 (UTC)
Looks good. --MichaelMaggs (talk) 18:03, 20 August 2008 (UTC)
While I've only looked at the section quickly, it does cover the territory well. Thanks, John Vandenberg (chat) 01:42, 12 August 2008 (UTC)

I also had concerns over "educational" when it comes to this area. To me it's still about usage and potential usage. Commons isn't an educational media repository but a Wikimedia media repository, which means if it's within their scope, as far as media goes, then it's also in ours. To me, raw-text documents, like encyclopedia articles, are not "media" in this sense of the word, so they are treated differently. However, even if they belong elsewhere, that shouldn't be the sole reason for deletion, which I think this covers well. I'm happy with the current wording and balance. Rocket000(talk) 02:51, 12 August 2008 (UTC)

In my opinion, PDF and djvu files should be considered as very different, since djvu files are much more "transparent" than pdf. As a user of a couple of Wikisource projects, I find djvu files excellent both as a container of original scans of pages for the proofreading procedure, and as containers for images from derived work, i.e. the collection of sets of images (drawings, pictures...) from a specific book. Single pages of a djvu file can be addressed with the usual Image: wiki tag, so any wiki user can use any single page of a djvu file. It's a pity that, to my present knowledge, images from a djvu file cannot be listed in a gallery; but it is far from difficult to build up a "pseudo-gallery" (a table of thumbs) and this should be encouraged IMO, mainly for djvu files that contain images only (without scanned text). See Image:Equitation.djvu (original content of a book), Image:Equitation images.djvu (collection of images of the book) and Equitation images djvu (categorized pseudo-gallery of images).
So, my suggestion is to split the discussion about PDF and djvu files since they are very different. --Alex_brollo Talk|Contrib 04:18, 12 August 2008 (UTC)
Can you suggest precisely what wording you would like for each? --MichaelMaggs (talk) 05:46, 12 August 2008 (UTC)
PDFs can be treated like DjVu files once someone gets around to looking at bugzilla:11215. John Vandenberg (chat) 04:52, 12 August 2008 (UTC)
I have tried to avoid being too specific about which formats are allowed in which circumstances, as there are too many complications. Now, there is a more general appeal to Commons' aims. --MichaelMaggs (talk) 18:03, 20 August 2008 (UTC)
Following on from what Alex has said, I disagree with this language:
"Uploaded documents in these formats sometimes include embedded images that may be useful in their own right, but only once extracted from the file and converted into a more usable format. Users should in such cases be encouraged to re-upload in a better format, and Pdf and Djvu files should not be kept on the sole basis that someone might theoretically wish to extract the images at some future date."
and unacceptable reason
"Content is an image that would be more appropriate in a standard image format such as jpeg, gif or svg"
There is no well qualified reason when or why a "standard image format" would be more appropriate, and many people have no clue what a DjVu file is. It is quite likely that people will say that a set of scans should be separate images as individual images are more accessible :- unless this is clarified, it leaves the meaning open to wild interpretation. Often there is no obvious functional difference between a set of PNGs and a DjVu, unless some of the features of a DjVu are used.
Also, Image:Equitation images.djvu, being a bundle of images, could be extracted into a more usable format, so the current language would exclude it. The bundle of images isn't currently defensible under "it provides technical advantages to at least one other Wikimedia project" because it doesn't provide any advantages to Wikisource that I can think of. The only benefit I can think of is that it is easier to download the bundle than it is to download them all individually; however, tools like DownThemAll! might be able to meet that need. It would be possible to argue that p.39 of Image:Equitation_images.djvu could use the "text layer" feature of DjVu files. I personally wouldn't use a DjVu for these images, but it is worth considering that the current wording would result in at least one deletion request. I'm keen to hear the reasons Alex thinks that djvu is useful; better to hear them now rather than at a deletion request.
✓  Done That wording has been removed and replaced with more general guidelines. --MichaelMaggs (talk) 18:03, 20 August 2008 (UTC)

John Vandenberg (chat) 05:49, 12 August 2008 (UTC)

Well, I found a djvu collection of images extremely comfortable because they are "implicitly categorized" (i.e. Image:Equitation images.djvu is implicitly a "Category:Images contained into the book Image:Equitation.djvu") with a highly standardized name (page=1, page=2...), which is really comfortable when they have to be used in templates and so on. I can't calculate how much server space is saved, considering pages, indexes, and so on, when a bundle of 100 jpgs is converted and uploaded as a single djvu file, but I guess it's not so small. The only objections I could imagine are:
  • single images can't be categorized
  • so far there is no means of having an "automatic gallery" of them, just to take a look at the whole djvu file content. --Alex_brollo Talk|Contrib 13:41, 12 August 2008 (UTC)
Two other disadvantages I thought of just now: 1) the edit history contains changes to any image, so it is harder to see what improvements have been made to an individual image, and 2) people are less likely to edit a djvu file to clean up an image. John Vandenberg (chat) 14:06, 12 August 2008 (UTC)

Thanks for the pointer, MichaelMaggs; I agree with the current wording. Gestumblindi (talk) 19:59, 17 August 2008 (UTC)

Non-allowable reasons for Pdf and Djvu formats

  • Format has apparently been selected by an author to prevent or to discourage the creation of derivative works (inappropriate fixation of content)

What about university theses or papers where there should not be any editing or changes? In the case of important medical, scientific or historical theses, editing would be detrimental. I didn't see these on the list you made. WayneRay (talk) 13:50, 12 August 2008 (UTC)WayneRay

Theses and the like are unacceptable on all WMF projects except Wikiversity (and possibly not even there) - I would say these should not be hosted on Commons.
As for historically important documents (regardless of their nature), those are already covered as being acceptable since they are not text-equivalent (if they are, they should be text, not a PDF).  — Mike.lifeguard | @en.wb 14:15, 12 August 2008 (UTC)
Mike - In view of what appears to be general consensus that we should allow files that may be useful on only a single project, and that theses are allowed on Wikisource, I have stated that theses are to be considered within Commons scope. Sorry. --MichaelMaggs (talk) 18:03, 20 August 2008 (UTC)
FWIW, Wikisource considers any published & peer reviewed work to be within scope; this is a perennial question/discussion on Wikisource. These quotes below are from our only 'crat at the time, and admins.
As the people quoted above are the most prolific admins, and would have been the ones to see and act, I am guessing it is safe to assume that speedy deletions of works of this nature are rare, as they would have been out of line with common practice and the admins' stated opinions. Here are a few actual cases I can quickly put my hands on, where the inclusion policy was put into practice:
Just now, I have created s:Category:Dissertations to collect work of this class together. John Vandenberg (chat) 15:10, 12 August 2008 (UTC)
Very helpful, thanks. I have tried to ensure that Commons allows all files of this type that are within Wikisource scope. --MichaelMaggs (talk) 18:03, 20 August 2008 (UTC)
I just looked there. Now, how do I place the Pdf files in there if it is not linked to Commons, or should there be one in the Commons Pdf files category? WayneRay (talk) 16:48, 14 August 2008 (UTC)WayneRay

Everything else is fine by me, clear and concise. However, as 90% of the uploads are not connected to anything, what of my earlier suggestion of having a Pdf file upload notice on the "Upload file" page telling uploaders to use Category:Pdf files and/or place the file in an appropriate category? If uploaders are made aware of the subcategories I have created, maybe things will run smoother. WayneRay (talk) 13:59, 12 August 2008 (UTC)WayneRay

Please see my note below. --MichaelMaggs (talk) 18:03, 20 August 2008 (UTC)

Last few comments

After another full read I have the following few comments:

  1. In section Must be a media file I would propose the following rewording:
The following are not considered media files, and may not be hosted here:
  • Computer programs in any format including binary executable files and raw source code listings. Source code may where relevant form part of the file description or metadata, eg a graphics file may include as descriptive text the code used to create it.
  • Files which are representative merely of raw text (eg ASCII files, the already mentioned raw source code listings, etc).
  2. There are 6 uses of the term bona fide in the Must be realistically useful for an educational purpose section, and I find them rather confusing. bona fide means [in] Good faith, but the way it is used in this section does not make much sense to me. I would propose to remove all or most of them as they make the sentences cryptic.
  3. In the Scope part 3: user pages, galleries and categories/Non-allowable user page/gallery/category content section, the sentence Non-allowable content includes: Excluded content, as described above should be expanded to specifically allow JavaScript source code in monobook.js files and the like.
  4. The sentence Advertising/promotional material does not advance Commons' aims. could be moved from "Scope part 1: files/Must be a media file" to "Scope of Commons/Excluded content"
    • Prefer not to do that, but to emphasise that the excluded content sections refers to excluded educational content. --MichaelMaggs (talk) 18:03, 20 August 2008 (UTC)
  5. In the Must be realistically useful for an educational purpose/Discussion section I would add a paragraph:
Image quality is just one of the factors that limit the educational usefulness of a file; other limiting factors include small size and hard-to-remove watermarks.

Otherwise I think it is great. --Jarekt (talk) 14:26, 12 August 2008 (UTC)

I strongly support the revised text concerning PDFs and DJVU files. Useful to any Wikimedia project = appropriate for Commons. --pfctdayelise (说什么?) 07:17, 13 August 2008 (UTC)

  • ✓  Done . The text now more clearly covers that. --MichaelMaggs (talk) 18:03, 20 August 2008 (UTC)

Thanks for all the comments on Pdf and Djvu. I will post a version 2 of the text in the next few days. Nearly there now, I think. --MichaelMaggs (talk) 19:42, 13 August 2008 (UTC)

And I support "Useful to any Wikimedia project = appropriate for Commons." as our "in a nutshell" or does that simplify it too much? Rocket000(talk) 00:23, 14 August 2008 (UTC)

I haven't forgotten this but have been rather busy recently. Maybe next week now. --MichaelMaggs (talk) 21:54, 15 August 2008 (UTC)

Can Main Page be changed


  • Upload file
    • Upload Pdf file (with separate wording for uploads and redirection to Wikisource etc.)
  • Recent changes
  • Latest files
  • Random file
  • Help
  • Contact us
  • Donate

WayneRay (talk) 16:43, 14 August 2008 (UTC)WayneRay

You mean the side bar, not the main page. Rocket000(talk) 16:47, 14 August 2008 (UTC)
Actually it is the Side Bar on the Main Page on the left hand side. Thanks for the clarification; I see it appears here also. I was just seeing it as a whole, not as the individual sum of its parts (it's not about the glass being half full or half empty, it's about the water LOL). WayneRay (talk) 18:53, 14 August 2008 (UTC)WayneRay
 :) The sidebar's actually part of the interface like the Commons logo and search bar. It will be there no matter what page you're on. I'm not sure if we should change the interface but maybe add something to the upload form? Rocket000(talk) 22:12, 15 August 2008 (UTC)
Yes, I had made that suggestion on another discussion page with a recommendation but never heard back on it. Who can change the Upload form? WayneRay (talk) 16:17, 16 August 2008 (UTC)Wayneray
You might want to ask Lupo about that. It is not really within the range of what we are trying to do in updating the Scope page, useful though the idea may well be. --MichaelMaggs (talk) 18:03, 20 August 2008 (UTC)

Release candidate now ready

I have made some further revisions based on the suggestions made above, and think (hope) we may now be ready to go live with something like this text. The main changes are:

  • Copyedits as proposed by various users
  • Make it clear that pdf and Djvu files are always in scope if another WMF project such as Wikisource or Wikibooks finds it useful for Commons to host them on their behalf. That means keeping an eye on the scope of other projects, which perhaps makes our own rules a little more fuzzy but which ensures we are always here to support the other projects. If their rules change and evolve, ours should too.

Is there a consensus that we can now go live with essentially this text? Also, given the length of this page now, would it be useful for me to divide it up into separate sub-pages dealing with different aspects? --MichaelMaggs (talk) 18:03, 20 August 2008 (UTC)

  • Yes go live WayneRay (talk) 04:27, 22 August 2008 (UTC)WayneRay
  • Subpages, please. And I think part 2 and 3 can be combined. So you would have "Part 1: Files" and "Part 2: Pages". Rocket000(talk) 07:10, 22 August 2008 (UTC)

I have gone live with the text today. --MichaelMaggs (talk) 08:57, 31 August 2008 (UTC)