User:SVGbot

From Wikimedia Commons, the free media repository
This user account is a bot operated by Sakurambo.

It is not a sock puppet, but rather an automated or semi-automated account for making repetitive edits that would be extremely tedious to do manually.
Administrators: if this bot is malfunctioning or causing harm, please block it.



Currently inactive
Note: Until and unless a bot flag has been granted, this bot will not make any edits to Commons pages. It is expected to generate GET requests at a rate of no more than 1 every 10 seconds. If it generates excessive traffic, please contact its owner and request disconnection.


{{#babel|en|ja-3}}

What

This bot account has the following purposes:

  • Checking uploaded SVG images for embedded raster graphics.
  • Tagging these images with the {{BadSVG}} template so that they can be recreated or deleted if necessary.
  • At the moment, SVGbot is only collecting data. It will not attempt to edit anything until it has finished scanning all the SVG images and has obtained a bot flag.

SVGbot is programmed in PHP by its creator and maintainer, User:Sakurambo.

Why

In addition to vector primitives such as lines, polygons and circles, SVG images can also contain embedded raster images such as JPEG and PNG images. This is generally an inefficient way of storing such images because it results in a larger file size with no improvement in image quality and none of the scalability provided by vector primitives. When deleting raster images that have been superseded by SVG images, it is important to ensure that the new version really is scalable.

Since Wikimedia Commons already contains approximately 90,000 SVG images, it would be very difficult to check all these images manually. However, a bot can do this job quite easily by looking for <image> tags in the SVG source files. That's what this bot is programmed to do.
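As a rough illustration of the check described above, here is a minimal sketch in Python. It is not the bot's actual PHP code; the function name find_image_hrefs is hypothetical, and it assumes that the SVG source is already available as a string.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"


def find_image_hrefs(svg_source: str) -> list[str]:
    """Return the href of every <image> element in an SVG document."""
    # ET.fromstring raises ET.ParseError if the markup is not well-formed XML.
    root = ET.fromstring(svg_source)
    hrefs = []
    for element in root.iter():
        # Match <image> whether or not the file declares the SVG namespace.
        if element.tag in (f"{{{SVG_NS}}}image", "image"):
            href = element.get(f"{{{XLINK_NS}}}href") or element.get("href") or ""
            hrefs.append(href)
    return hrefs
```

An empty list means that no <image> tags were found. A file that raises ET.ParseError would fall under the rudimentary parse errors mentioned in the Progress section.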

Progress

Of the 70,000 or so images checked by this bot since 25 March, 838 contain <image> tags. These fall into three categories:

  1. GIF, PNG and JPEG images that have simply been saved in SVG format without adding any vector content (e.g., Image:Sampleandholdgraph.svg). It would be more efficient to replace these images with their embedded artwork.
  2. Files containing a mixture of vector and raster elements. Some of these are high-quality images (e.g., Image:Mexico COA large.svg) that do not need to be replaced. Some are not so good.
  3. Artwork consisting wholly or partly of embedded images that only exist on the creator's filesystem (e.g., Image:Louisiana 631.svg, which references a file called C:\Documents and Settings\Robert Sheffield\My Documents\My Pictures\Louisiana 631.PNG).
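
The three categories above could be told apart roughly as follows. This is only a sketch under stated assumptions, not the bot's own logic: the category labels, the VECTOR_TAGS list and the rule that any href not starting with "data:" counts as a reference to an outside file are all hypothetical heuristics.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"
# Assumed list of element names that count as vector content.
VECTOR_TAGS = {"path", "rect", "circle", "ellipse", "line",
               "polyline", "polygon", "text"}


def local_name(tag: str) -> str:
    """Strip an ElementTree '{namespace}' prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def classify_svg(svg_source: str) -> str:
    """Return a hypothetical category label for an SVG document."""
    root = ET.fromstring(svg_source)
    hrefs = []
    has_vector = False
    for element in root.iter():
        name = local_name(element.tag)
        if name == "image":
            hrefs.append(element.get(XLINK_HREF) or element.get("href") or "")
        elif name in VECTOR_TAGS:
            has_vector = True
    if not hrefs:
        return "vector_only"
    # Category 3: the raster data is referenced, not embedded in the file.
    if any(not href.startswith("data:") for href in hrefs):
        return "external_reference"
    # Category 2 if vector content is present, otherwise category 1.
    return "mixed" if has_vector else "raster_only"
```

A label of "mixed" says nothing about quality; as noted in category 2, those files would still need to be judged by a person.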

There also seem to be a lot of SVG images with invalid markup (e.g., Isle_of_Man_map-en.svg). Although this bot can check for rudimentary parse errors, it isn't able to validate documents at present.

Operation

This bot is currently operating on my desktop computer. It can be identified by the following User-Agent string:

User:SVGbot (http://commons.wikimedia.org/wiki/User:SVGbot)
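
For illustration, a client that sends this User-Agent string and keeps to the rate of one GET request every 10 seconds stated in the note near the top of this page might be sketched in Python as follows. The bot itself is written in PHP, so the Throttle class and build_request function are hypothetical names and not part of its code.

```python
import time
import urllib.request

USER_AGENT = "User:SVGbot (http://commons.wikimedia.org/wiki/User:SVGbot)"
MIN_INTERVAL = 10.0  # seconds between GET requests, as stated on this page


class Throttle:
    """Make successive calls to wait() at least `interval` seconds apart."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self.clock = clock  # injectable so the class can be tested without waiting
        self.sleep = sleep
        self.last = None

    def wait(self):
        now = self.clock()
        if self.last is not None:
            remaining = self.interval - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last = now


def build_request(url):
    """Create a GET request carrying the bot's User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
```

In use, a single Throttle(MIN_INTERVAL) would be shared by the whole scan, with wait() called immediately before each urllib.request.urlopen(build_request(url)).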