User:Quibik/Cleaning up SVG files manually

From Wikimedia Commons, the free media repository

Written as a response to User:Ahnode.

Editing SVG files manually in a text editor gives greater control over the format and content of the file. The major benefits are smaller files, simpler and more concise code, and possibly code that conforms to the SVG standard. In this article I will try to introduce a few useful techniques for accomplishing this.

About text-editing

Since the process involves editing SVG files by hand, a good text editor is a must. The minimal functionality should include undo-redo, search and replace (preferably with regular expression support), syntax highlighting, and automatic indentation. For Windows, my preferred application would be Notepad++. On Linux platforms: Kate or Geany. This is of course only a personal preference – a great variety of good alternatives is available. Remember to set syntax highlighting to XML.

Code validation

An SVG file can be checked for standards compliance with the W3C validator: http://validator.w3.org/. Commons has ValidSVG and InvalidSVG tags that can be added to an image to indicate either status.
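
Before sending a file to the validator, it can help to confirm locally that it is at least well-formed XML, because leftovers such as an i: prefix without a matching xmlns declaration make a file unparseable. The following is a minimal Python sketch using only the standard library. The function name is_well_formed is made up for this illustration, and the check says nothing about validity against the SVG DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(svg_bytes: bytes) -> bool:
    """Return True if the data parses as XML (well-formedness only, not SVG validity)."""
    try:
        ET.fromstring(svg_bytes)
    except ET.ParseError:
        return False
    return True


# A tiny well-formed file.
good = b'<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100"><g/></svg>'
# A <g> element that is never closed.
unclosed = b'<svg xmlns="http://www.w3.org/2000/svg"><g></svg>'
# An attribute using the i: prefix although no xmlns:i declaration exists.
unbound = b'<svg xmlns="http://www.w3.org/2000/svg"><g i:extraneous="self"/></svg>'

print(is_well_formed(good), is_well_formed(unclosed), is_well_formed(unbound))
```

A file that passes this check can still fail the W3C validator, so it only serves as a quick first filter.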

Before editing

A good amount of work in later editing can be avoided by simply using the correct settings when saving the file inside the image-editing program. A good idea is to try resaving a problematic file with recommended settings before starting manual editing.

Inkscape

In Inkscape, prior to saving, the File → Vacuum Defs command should be used. This removes unused definitions from the file, reducing its size. Next, the image should be saved as 'Plain SVG' rather than 'Inkscape SVG'. This avoids saving Inkscape-specific metadata, which might be useful when editing the same file later but is useless when the image is only displayed.

Adobe Illustrator

Many issues can be avoided altogether by using the correct settings in Adobe Illustrator (CS3). When saving an SVG image for use on Commons, the following settings should be used:

  • SVG profile: 'SVG 1.1'
  • Fonts: 'SVG' or 'Convert to outline', if the used fonts aren't supported by Wikimedia software (see meta:SVG fonts).
  • Images: 'Embed'
  • Preserve Illustrator editing capabilities: off. When this option is enabled, a lot of non-standard code is generated and hundreds of kilobytes' worth of metadata are added to the file. Unchecking this option cures most of the problems with the SVG file.

Editing


I will be using the image File:MR conditional sign.svg as an example in this tutorial. (Link to revision used) The file is huge (297 KB) considering its content: it consists of only 6 fairly simple objects, so it should not be larger than a few kilobytes. By looking at the code, we can see that the image was saved with Adobe Illustrator CS3. Most issues could be solved by resaving the file with the correct settings, but for the purposes of this tutorial that option will be ignored.

Next I will go through the steps needed to produce a clean and well-formatted file. The most important idea to keep in mind when editing SVG files is: remove everything you can, but no more. Image-editing programs often add a lot of generic information that a particular image does not need.

Header

Original Header:

<?xml version="1.0" encoding="utf-8"?>
<!-- Generator: Adobe Illustrator 13.0.1, SVG Export Plug-In . SVG Version: 6.00 Build 14948)  -->
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN" "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd" [
	<!ENTITY ns_extend "http://ns.adobe.com/Extensibility/1.0/">
	<!ENTITY ns_ai "http://ns.adobe.com/AdobeIllustrator/10.0/">
	<!ENTITY ns_graphs "http://ns.adobe.com/Graphs/1.0/">
	<!ENTITY ns_vars "http://ns.adobe.com/Variables/1.0/">
	<!ENTITY ns_imrep "http://ns.adobe.com/ImageReplacement/1.0/">
	<!ENTITY ns_sfw "http://ns.adobe.com/SaveForWeb/1.0/">
	<!ENTITY ns_custom "http://ns.adobe.com/GenericCustomNamespace/1.0/">
	<!ENTITY ns_adobe_xpath "http://ns.adobe.com/XPath/1.0/">
]>
<svg version="1.0" id="Layer_2" xmlns:x="&ns_extend;" xmlns:i="&ns_ai;" xmlns:graph="&ns_graphs;"
	 xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" x="0px" y="0px" width="100px" height="100px"
	 viewBox="0 0 100 100" enable-background="new 0 0 100 100" xml:space="preserve">
...

The header usually consists of the XML declaration, a comment naming the program that generated the image, the doctype (which can be left out), and the svg start tag. Adobe Illustrator adds entity declarations for its own namespaces to the doctype, which can be removed without much consideration. I personally leave the generator information intact, since it is helpful to other editors. The header remains pretty much the same regardless of the image, so most of the time a generic header can be used:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"
	width="XXXpx" height="YYYpx" xml:space="preserve">

Only the width and height values (XXX and YYY above) need to be filled in.
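
The header swap can also be sketched as a small script. The Python below is only an illustration under a few assumptions: the names HEADER_LINES and replace_header are invented here, the input is expected to contain exactly one svg start tag with width and height attributes, and the first comment found before that tag is taken to be the generator comment and is kept. Everything else in the old header, including the entity declarations and any viewBox, is dropped, as in the generic header above.

```python
import re

# The generic header from the text, with placeholders for the two sizes.
HEADER_LINES = [
    '<?xml version="1.0" encoding="utf-8"?>',
    '<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">',
    '<svg version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"',
    '\twidth="{width}" height="{height}" xml:space="preserve">',
]


def replace_header(svg_text: str) -> str:
    """Replace everything up to and including the <svg ...> start tag."""
    svg_tag = re.search(r"<svg\b[^>]*>", svg_text)
    if svg_tag is None:
        raise ValueError("no <svg> start tag found")
    width = re.search(r'\swidth="([^"]*)"', svg_tag.group(0))
    height = re.search(r'\sheight="([^"]*)"', svg_tag.group(0))
    if width is None or height is None:
        raise ValueError("the <svg> tag has no width/height to copy")

    lines = HEADER_LINES[:3]
    lines.append(HEADER_LINES[3].format(width=width.group(1), height=height.group(1)))

    # Keep the first comment of the old header (assumed to be the generator note).
    comment = re.search(r"<!--.*?-->", svg_text[:svg_tag.start()], re.DOTALL)
    if comment is not None:
        lines.insert(1, comment.group(0))

    return "\n".join(lines) + svg_text[svg_tag.end():]


# Shortened version of the original header, followed by a dummy path.
original = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<!-- Generator: Adobe Illustrator 13.0.1, SVG Export Plug-In . SVG Version: 6.00 Build 14948)  -->\n'
    '<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN" "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd" [\n'
    '\t<!ENTITY ns_ai "http://ns.adobe.com/AdobeIllustrator/10.0/">\n'
    ']>\n'
    '<svg version="1.0" id="Layer_2" xmlns:i="&ns_ai;" xmlns="http://www.w3.org/2000/svg"\n'
    '\t x="0px" y="0px" width="100px" height="100px" viewBox="0 0 100 100">\n'
    '<path d="M0,0h10"/>\n'
    '</svg>'
)

result = replace_header(original)
print(result)
```

Because the i: namespace declaration disappears with the old header, any remaining i: elements or attributes in the body must be removed as well before the file parses again.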

Removing unnecessary content

The first object to remove from this file is obviously the large block of image metadata added by Illustrator, beginning on line 983:

...
<i:pgf  id="adobe_illustrator_pgf">
	<![CDATA[
	eJzsvW2vJMlxZvm9gf4Pdz8IIIFlKd5fuIsB7quGO6JEkNSMFsKAKHWXqB51VxHFbmm5v37dzJ9j
7hZ5m2qKHEmrYQW6ujwyMzIz3OOEmYWfjD/5337ysx/cf/7hb9/9YH4z3H36yZ/8yePHd2+//vDx
[snip]
xyzboLO92eZEe2vuo7X//d/B9puuCaAauh5sbhYbLbU8bLS7ZM/RGjX+Qw02erD8jbH6BR8FW0N1
NO4P1eDovf83KSE3mTfADqYAwdj/AneFH2w=
	]]>
</i:pgf>
</svg>

Remove everything between (and including) the <i:pgf></i:pgf> tags. After this, the file has lost about 200 KB in size. With about 100 KB remaining, the file is still much too large, so something else must be wrong. Looking further around the file, we find a <pattern id="Polka_Dot_Pattern"></pattern> block taking up a massive 939 lines! Glancing at the image, we don't see any polka-dot patterns, or really any patterns at all. As long as nothing else in the file refers to the pattern (for example through fill="url(#Polka_Dot_Pattern)"), this block can be safely removed.
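
The two removals above can be approximated with regular expressions, the same kind of search and replace a good text editor offers. The following Python sketch is an assumption-laden illustration, not a general SVG tool: strip_block and is_referenced are hypothetical helper names, elements of the same name are assumed not to be nested inside each other, and the reference check only looks for the id preceded by # and followed by a closing parenthesis or quote.

```python
import re


def strip_block(svg_text: str, tag: str) -> str:
    """Remove every <tag ...>...</tag> element together with its content."""
    name = re.escape(tag)
    # (?=[\s>]) makes sure "i:pgf" does not also match "i:pgfRef".
    pattern = re.compile(r"<%s(?=[\s>])[^>]*>.*?</%s>\s*" % (name, name), re.DOTALL)
    return pattern.sub("", svg_text)


def is_referenced(svg_text: str, element_id: str) -> bool:
    """Rough check for url(#id) or href="#id" style references."""
    needle = "#" + element_id
    return any(needle + end in svg_text for end in (")", '"', "'"))


# Miniature stand-in for the example file; the real blocks are far longer.
sample = (
    '<svg xmlns="http://www.w3.org/2000/svg" xmlns:i="http://ns.adobe.com/AdobeIllustrator/10.0/">\n'
    '<pattern id="Polka_Dot_Pattern"><circle r="1"/></pattern>\n'
    '<path d="M0,0h10"/>\n'
    '<i:pgf  id="adobe_illustrator_pgf">\n<![CDATA[\neJzsvW2v\n]]>\n</i:pgf>\n'
    '</svg>'
)

cleaned = strip_block(sample, "i:pgf")
# This sample holds a single <pattern>; strip_block would remove all of them.
if not is_referenced(cleaned, "Polka_Dot_Pattern"):
    cleaned = strip_block(cleaned, "pattern")
print(cleaned)
```

Checking for references first is the scripted equivalent of searching the file for the pattern's id before deleting it by hand.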

The file is now small enough to be shown here.

<?xml version="1.0" encoding="utf-8"?>
<!-- Generator: Adobe Illustrator 13.0.0, SVG Export Plug-In . SVG Version: 6.00 Build 14948)  -->
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"
	 width="100px" height="100px" xml:space="preserve">
<switch>
	<foreignObject requiredExtensions="&ns_ai;" x="0" y="0" width="1" height="1">
		<i:pgfRef  xlink:href="#adobe_illustrator_pgf">
		</i:pgfRef>
	</foreignObject>
	<g i:extraneous="self">
		<g>
			<g>
				<polygon fill="#FCB034" points="91.662,88.909 49.654,88.909 7.646,88.909 49.654,15.621 91.662,88.909 				"/>
				<path d="M49.653,10.591l46.324,80.818H49.653H3.329L49.653,10.591 M49.653,20.646L11.958,86.409h37.695h37.695L49.653,20.646
					L49.653,20.646z"/>
				<path fill="#FCB034" d="M49.653,10.591l46.324,80.818H49.653H3.329L49.653,10.591 M49.653,6.569l-1.735,3.027L1.593,90.415
					l-1.716,2.994h3.452h46.325h46.324h3.451l-1.716-2.994L51.389,9.597L49.653,6.569L49.653,6.569z"/>
			</g>
			<g>
				<path d="M29.3,82.314V61.556h6.273l3.766,14.16l3.725-14.16h6.287v20.759h-3.895V65.974l-4.12,16.341H37.3l-4.106-16.341v16.341
					H29.3z"/>
				<path d="M53.527,82.314V61.556h8.821c2.219,0,3.831,0.187,4.836,0.56c1.006,0.373,1.811,1.036,2.414,1.989
					c0.604,0.953,0.906,2.044,0.906,3.271c0,1.559-0.457,2.845-1.373,3.859s-2.284,1.654-4.106,1.918
					c0.906,0.529,1.654,1.109,2.244,1.742s1.386,1.756,2.386,3.37l2.535,4.05h-5.013l-3.03-4.518
					c-1.076-1.613-1.813-2.631-2.209-3.051s-0.816-0.708-1.26-0.864c-0.444-0.155-1.147-0.233-2.11-0.233h-0.85v8.666H53.527z
					 M57.719,70.335h3.101c2.012,0,3.267-0.085,3.768-0.255c0.5-0.17,0.892-0.463,1.175-0.878s0.425-0.935,0.425-1.558
					c0-0.698-0.187-1.263-0.56-1.692s-0.899-0.7-1.579-0.813c-0.34-0.048-1.359-0.071-3.059-0.071h-3.271V70.335z"/>
			</g>
		</g>
	</g>
</switch>
</svg>

Next, a couple of unnecessary items remain: the <switch> tags and the <foreignObject> item. Both are legitimate SVG elements, but here they only exist to point at the Illustrator data that was just deleted, and they rely on the &ns_ai; entity and the i: namespace prefix, which the new header no longer declares. They can therefore be removed. The <g> group tag carries another Illustrator-specific attribute, i:extraneous="self", which can again be safely removed. I find the double level of group tags unnecessary anyway, so I removed both. The only thing left to do now is to reduce the indentation so that the code starts at the beginning of the line, and we are done!
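
For files with many such wrappers, the same clean-up can be sketched with a few regular expressions. This Python example is a rough illustration with an invented function name, remove_illustrator_wrappers. It assumes that every switch and foreignObject element in the file is an Illustrator wrapper of the kind shown above, which is not true for SVG files in general. It leaves nested groups and indentation alone.

```python
import re


def remove_illustrator_wrappers(svg_text: str) -> str:
    """Drop foreignObject blocks, unwrap switch, and delete i: attributes."""
    # Remove each <foreignObject>...</foreignObject> block with its content.
    text = re.sub(r"<foreignObject\b.*?</foreignObject>\s*", "", svg_text, flags=re.DOTALL)
    # Remove the <switch> start and end tags but keep what is between them.
    text = re.sub(r"</?switch\b[^>]*>\s*", "", text)
    # Remove attributes from the i: namespace, such as i:extraneous="self".
    text = re.sub(r'\s+i:[\w.-]+="[^"]*"', "", text)
    return text


# Reduced version of the body shown above, with a dummy path.
body = (
    '<switch>\n'
    '\t<foreignObject requiredExtensions="&ns_ai;" x="0" y="0" width="1" height="1">\n'
    '\t\t<i:pgfRef  xlink:href="#adobe_illustrator_pgf">\n'
    '\t\t</i:pgfRef>\n'
    '\t</foreignObject>\n'
    '\t<g i:extraneous="self">\n'
    '\t\t<path d="M0,0h10"/>\n'
    '\t</g>\n'
    '</switch>\n'
)

unwrapped = remove_illustrator_wrappers(body)
print(unwrapped)
```

Flattening the remaining groups and fixing the indentation is left to manual editing, as described above.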

Resulting file:

<?xml version="1.0" encoding="utf-8"?>
<!-- Generator: Adobe Illustrator 13.0.0, SVG Export Plug-In . SVG Version: 6.00 Build 14948)  -->
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"
	 width="100px" height="100px" xml:space="preserve">
<g>
    <polygon fill="#FCB034" points="91.662,88.909 49.654,88.909 7.646,88.909 49.654,15.621 91.662,88.909"/>
    <path d="M49.653,10.591l46.324,80.818H49.653H3.329L49.653,10.591 M49.653,20.646L11.958,86.409h37.695h37.695L49.653,20.646 L49.653,20.646z"/>
    <path fill="#FCB034" d="M49.653,10.591l46.324,80.818H49.653H3.329L49.653,10.591 M49.653,6.569l-1.735,3.027L1.593,90.415
        l-1.716,2.994h3.452h46.325h46.324h3.451l-1.716-2.994L51.389,9.597L49.653,6.569L49.653,6.569z"/>
</g>
<g>
    <path d="M29.3,82.314V61.556h6.273l3.766,14.16l3.725-14.16h6.287v20.759h-3.895V65.974l-4.12,16.341H37.3l-4.106-16.341v16.341 H29.3z"/>
    <path d="M53.527,82.314V61.556h8.821c2.219,0,3.831,0.187,4.836,0.56c1.006,0.373,1.811,1.036,2.414,1.989
        c0.604,0.953,0.906,2.044,0.906,3.271c0,1.559-0.457,2.845-1.373,3.859s-2.284,1.654-4.106,1.918
        c0.906,0.529,1.654,1.109,2.244,1.742s1.386,1.756,2.386,3.37l2.535,4.05h-5.013l-3.03-4.518
        c-1.076-1.613-1.813-2.631-2.209-3.051s-0.816-0.708-1.26-0.864c-0.444-0.155-1.147-0.233-2.11-0.233h-0.85v8.666H53.527z
         M57.719,70.335h3.101c2.012,0,3.267-0.085,3.768-0.255c0.5-0.17,0.892-0.463,1.175-0.878s0.425-0.935,0.425-1.558
        c0-0.698-0.187-1.263-0.56-1.692s-0.899-0.7-1.579-0.813c-0.34-0.048-1.359-0.071-3.059-0.071h-3.271V70.335z"/>
</g>
</svg>

As we can see, the code is now fully standards-compliant and has been reduced from 297 KB to 1.7 KB in only a few simple steps.


Some of the code is still redundant; it is enough to write:

<?xml version="1.0" encoding="utf-8"?>
<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
<polygon fill="#FCB034" points="91.662,88.909 49.654,88.909 7.646,88.909 49.654,15.621 91.662,88.909"/>
<path d="M49.653,10.591l46.324,80.818H49.653H3.329L49.653,10.591 M49.653,20.646L11.958,86.409h37.695h37.695L49.653,20.646 L49.653,20.646z"/>
<path fill="#FCB034" d="M49.653,10.591l46.324,80.818H49.653H3.329L49.653,10.591M49.653,6.569l-1.735,3.027L1.593,90.415
 l-1.716,2.994h3.452h46.325h46.324h3.451l-1.716-2.994L51.389,9.597L49.653,6.569L49.653,6.569z"/>
<path d="M29.3,82.314V61.556h6.273l3.766,14.16l3.725-14.16h6.287v20.759h-3.895V65.974l-4.12,16.341H37.3l-4.106-16.341v16.341H29.3z
 M53.527,82.314V61.556h8.821c2.219,0,3.831,0.187,4.836,0.56c1.006,0.373,1.811,1.036,2.414,1.989
 c0.604,0.953,0.906,2.044,0.906,3.271c0,1.559-0.457,2.845-1.373,3.859s-2.284,1.654-4.106,1.918
 c0.906,0.529,1.654,1.109,2.244,1.742s1.386,1.756,2.386,3.37l2.535,4.05h-5.013l-3.03-4.518
 c-1.076-1.613-1.813-2.631-2.209-3.051s-0.816-0.708-1.26-0.864c-0.444-0.155-1.147-0.233-2.11-0.233h-0.85v8.666H53.527z
 M57.719,70.335h3.101c2.012,0,3.267-0.085,3.768-0.255c0.5-0.17,0.892-0.463,1.175-0.878s0.425-0.935,0.425-1.558
 c0-0.698-0.187-1.263-0.56-1.692s-0.899-0.7-1.579-0.813c-0.34-0.048-1.359-0.071-3.059-0.071h-3.271V70.335z"/>
</svg>

More simplification is possible but needs some consideration.