From Wikimedia Commons, the free media repository

Hi there.

Current tasks

  • Reviewing new PDF uploads as they show up.
  • Cleaning up Category:Flags, moving files into subcategories and/or nominating for deletion as appropriate.

Why care about PDF files?

  • Inadequate review. PDFs are often uploaded in large batches; thumbnails of PDF files often fail to load in Special:NewFiles; and these files often require a fairly close look to determine whether they're in scope. As a result, PDF uploads often get overlooked during new page patrols.
  • PDFs are indexed by search engines. The content of PDF uploads on Commons (including embedded links!) is indexed by search engines, but is not checked against the spam blocklist. There is evidence that some malicious users may be attempting to exploit this as a means of black-hat SEO. Even if this isn't the case, users may be misled by fake encyclopedia articles in uploaded PDFs.
  • Increased scope of copyright violations. PDF uploads can include entire books or large sets of images; the potential for financial damages to a content owner from the distribution of these files is considerably greater than for a single image.

Okay, so what are some good use cases for PDF uploads?

Glad you asked!

  • Storing scans of historical documents which are being transcribed or translated at Wikisource.
  • Reproducing documents published by official sources (like governments and research journals) which are not copyrighted, or which have been released under free licenses.
  • Similarly, storing internal documents produced by Wikimedia or its chapters, like presentations or how-tos.
  • Begrudgingly, as a container for single-page images and diagrams which haven't been converted to raster images or SVG yet. (This isn't ideal, but it's better than not having the image at all.)

What about PDFs for infographics?

Infographics are an unusual problem.

  • The educational value of many user-uploaded infographics is fairly low. Many of them simply repeat basic facts about a topic and surround them with clip art; these images provide very little added value over an encyclopedia article.
  • The clip art is often an issue as well. A lot of infographics are created with Canva, which allows users to use stock images from Pexels (among other sources). These images are not freely licensed, and cannot be used in CC-BY content on Commons.
  • Finally, PDF files cannot easily be edited. If a PDF infographic contains inaccurate or outdated information, it is difficult to correct.