Commons:Convert tables and charts to wiki code or image files

From Wikimedia Commons, the free media repository

Note. Commons: ('commons' followed by a colon) is a namespace for internal Commons work, and for resource pages on the Commons. See: {{resources}} and Category:Commons resources. One can search the Commons namespace via the advanced tab at Special:Search.

Here are some tools, resources, tips, and instructions for converting tables and charts to wiki code or image files. Most of the tools and resources are free. For more tools see Commons:Create charts and graphs online.

Some starting points

HTML to wiki code

Convert HTML (from web pages) to wiki code.

Online tools

  • HTML to Wiki Converter. "This is a slightly altered mirror of a script by Borislav Manolov." Hosted at Wikimedia Tool Labs on Magnus' tools. See User:Magnus Manske.
  • tab2wiki. Converts tables (tab-delimited, e.g. copied from Excel, LibreOffice Calc) to Wikitext tables. Hosted at Wikimedia Tool Labs on Magnus' tools.

Web to LibreOffice Calc to tab2wiki

See also: phab:T88694 "Be able to re-order the columns or rows of a table by dragging them to another position in VisualEditor". And phab:T108245 "Fully support basic table editing in the visual editor".

This is the easiest method. This method works with plain tables from web pages. It does not usually work well with complicated tables on web pages (JavaScript based, etc.). Launch freeware LibreOffice Calc. Go to the web page with the table. Select and copy it right off the web page. Do not go into the HTML. Paste it into Calc.

If you happen to have a spreadsheet for the table use it instead and open it directly with Calc.

Then select and copy right off the Calc page. You can select the whole table, or just the columns you want by clicking the top of the desired columns in Calc (ctrl-click for each additional column). Paste it into tab2wiki. Copy the wikitext, and paste it into a wiki page. This method is very fast, and produces very clean and compact table wikitext.

The output is compact when "First element in a row is a header" is left unchecked (the default setting) in tab2wiki and "Compress table" is checked. Each table row will then be on one line of wikitext. Otherwise every cell in the table will have its own line of wikitext. This is one reason why pasting the table into tab2wiki is better than pasting it into VisualEditor. It is also much faster. Pasting a table into VisualEditor (at least in Firefox) can lock it up for a few minutes as it does its translation, and you may have to tell the browser to let the script keep going. In contrast, tab2wiki is almost instant.
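For readers who prefer a script, here is a minimal Python sketch of the same idea: turning tab-delimited text (as copied from Calc) into compact table wikitext with one line per table row. It is not tab2wiki's actual code. The function name `tsv_to_wikitext` and the choice to treat the first line as the header row are assumptions made for illustration.

```python
def tsv_to_wikitext(tsv_text, table_class="wikitable sortable"):
    """Convert tab-delimited text to a compact wikitext table.

    Assumption: the first non-empty line is the header row.
    Every later line becomes one table row on a single line of wikitext.
    """
    rows = [line.split("\t") for line in tsv_text.splitlines() if line.strip()]
    if not rows:
        return ""
    header, body = rows[0], rows[1:]
    out = ['{| class="%s"' % table_class]
    out.append("! " + " !! ".join(cell.strip() for cell in header))
    for row in body:
        out.append("|-")  # start of a table row
        out.append("| " + " || ".join(cell.strip() for cell in row))
    out.append("|}")
    return "\n".join(out)


sample = "Country\tRate\nAfghanistan\t74\nAlbania\t192"
print(tsv_to_wikitext(sample))
```

Each data row comes out in the form `| Afghanistan || 74`, which is the compact layout described above.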

Sometimes a table can be copied and pasted directly from a web page into tab2wiki.

Calc can be used to easily remove columns and rows. Or use VisualEditor to do that. In either case click the column head, and then right click and delete. Columns can easily be moved around in Calc. Use cut and paste. For dragging columns around see this thread.

Default Calc settings will remove trailing zeros after the decimal point. If desired, this can be fixed after pasting into Calc. Select the relevant data rows or columns from the table. Then click on "cells" from the format menu. Choose the number of decimal places to show. This will return the trailing zeros. They make sortable data columns easier to scan.

Sometimes you do not want anything after the decimal point. Click the column head to select that column. Then right click it, and click on "format cells" from the context menu that pops up. Click on the example number showing no decimal point. Then click OK.
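If the numbers are handled in a script instead of Calc, the same effect (a fixed number of decimal places, with trailing zeros kept) can be sketched in Python. The helper name `fixed_decimals` is hypothetical, and this is only an illustration of the formatting step, not part of the Calc workflow.

```python
def fixed_decimals(values, places=1):
    """Format each value with a fixed number of decimal places."""
    return [f"{float(v):.{places}f}" for v in values]


rates = ["74", "192.46", "80.0"]
print(fixed_decimals(rates, places=1))  # ['74.0', '192.5', '80.0']
print(fixed_decimals(rates, places=0))  # ['74', '192', '80']
```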

Initial alphabetical order

While in Calc, put the table in alphabetical order. Alphabetical tables are the easiest to keep up to date over the long term. In Calc click on any cell in the column you want to sort the table by. Then click one of the sort options (ascending or descending) from the data menu at the top. The table is then ready to be copied to tab2wiki to convert to table wikitext. For more info see the initial alphabetical sorting section at Help:Sorting.
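The sorting step can also be sketched in Python for tables that are already held as rows of cells. This is an illustration only. The function name `sort_table_rows` is hypothetical, and it assumes the first row is the header and should stay in place.

```python
def sort_table_rows(rows, column=0, descending=False):
    """Sort data rows by one column, ignoring case; the header row stays first."""
    header, body = rows[0], rows[1:]
    body = sorted(body, key=lambda row: row[column].casefold(), reverse=descending)
    return [header] + body


rows = [["Country", "Rate"], ["Albania", "192"], ["Afghanistan", "74"]]
for row in sort_table_rows(rows):
    print(row)
```

The comparison is a plain case-insensitive string sort, so it does not apply language-specific collation rules.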

Wiki readers can use sort buttons to put data or text columns in ascending or descending order. A fixed row-number column on the left can rank the items in a table whether the items are in ascending or descending order. See Help:Sorting sections on row numbering.

Quickly link long lists of countries

See this Stack Overflow discussion with examples and pictures. See the answer by Erutan409. It explains how to instantly add [[wikitext brackets]] around all the country names. The example below is for freeware Notepad++. It has Unicode capability, and will work with more languages. The regex code below has also been successfully used in another freeware program, NoteTab Light. It does not have Unicode capability.

Copy the table wikitext to a Notepad++ page. Go to: Search menu, Replace. Check the boxes for "regular expression" and "wrap around" and "down". Click the mouse cursor at the top of the table wikitext so that all the country names going down the page have brackets added. Copy this regular expression (regex) code to the find and replace forms:

Find what: ^.*?\|\h*\K(.*?)(?=\h*\|)
Replace with: [[$1]]

Then click "Replace all". Just clicking "Replace" one at a time may not work.

Brackets are added around both ends of country names, including country names consisting of more than one word. This is only done in the first cell of each row, that is, the country name before the first set of double bars (||) when the table wikitext is in the compact format (one line per table row). This is very useful in Wikipedia tables since it saves a lot of time versus manually linking a couple hundred country names. Some names may be red-linked. They need redirects to the main country name.
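The same replacement can be sketched in Python. Python's `re` module does not support `\K` or `\h`, so this version is an adaptation, not the expression from the Stack Overflow answer: it captures the leading bar and spaces in one group and puts them back in the replacement. The function name `link_first_cells` is hypothetical.

```python
import re

# Group 1: everything up to and including the first "|" on the line, plus spaces.
# Group 2: the first cell's text, up to the spaces before the next "|".
FIRST_CELL = re.compile(r"^(.*?\|[ \t]*)(.*?)(?=[ \t]*\|)", re.MULTILINE)


def link_first_cells(wikitext):
    """Wrap the first cell of each compact table row in [[ ]]."""
    return FIRST_CELL.sub(r"\1[[\2]]", wikitext)


table = "\n".join([
    '{| class="wikitable sortable"',
    "! Country !! Rate",
    "|-",
    "|Afghanistan ||74",
    "|-",
    "|Albania ||192",
    "|}",
])
print(link_first_cells(table))
```

Lines such as `{|`, `!`, `|-`, and `|}` are left alone because they contain no second bar after the first one. As with the editor version, this assumes the compact format and should be run before any other formatting is added to the rows.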

The redirects save time later when updating the table since the redirects only need to be done once. Redirects are kept as long as they are in use somewhere on Wikipedia. So try to avoid the urge to "correct" the country name in the table as long as the name is clear. The names are often the way they are so that alphabetization works better, or to fit in limited space.

Align cell data to right of cells

Add style="text-align:right;" to the top of the table wikitext.

For example:

{| class="wikitable sortable" border="1" style="text-align:right;"

Align country names to left of cells

Add style="text-align:left" to each country name cell.

For example, in List of countries by incarceration rate:

| style="text-align:left" | Afghanistan || 74

This can be done quickly in many text editors by finding and replacing wikitext all at once. Note the pattern below:

|[[Afghanistan]] ||74
|[[Albania]] ||192

In freeware NoteTab Light replace |-^p|[[ with |-^p| style="text-align:left" | [[

^p is the NoteTab code for line breaks. Click "replace all" to end up with:

| style="text-align:left" | [[Afghanistan]] ||74
| style="text-align:left" | [[Albania]] ||192

If there are no links other than country links in the table then just replace [[ with style="text-align:left" | [[
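The find-and-replace above can also be sketched in Python. This is an illustration under assumptions, not the NoteTab method itself: it assumes the compact format, where each data row is a single line that starts with "|" and contains "||". The function name `left_align_first_cell` is hypothetical.

```python
def left_align_first_cell(wikitext):
    """Add style="text-align:left" to the first cell of each compact data row."""
    out = []
    for line in wikitext.splitlines():
        is_data_row = (
            line.startswith("|")
            and "||" in line
            and not line.startswith(("|-", "|}", "|+"))
        )
        if is_data_row:
            # Drop the leading bar and re-add it with the style in front.
            line = '| style="text-align:left" | ' + line[1:].lstrip()
        out.append(line)
    return "\n".join(out)


rows = "|-\n|[[Afghanistan]] ||74\n|-\n|[[Albania]] ||192"
print(left_align_first_cell(rows))
```

Unlike the plain replacement of `[[`, this version does not depend on the country names being the only links in the table.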

Conversion examples

List of countries by incarceration rate

The short version: Full rapid update of List of countries by incarceration rate. Use freeware LibreOffice Calc and tab2wiki (as described elsewhere). Please do not delete the country redirects. This allows the table to be fully updated much more often. It only takes half an hour to fully update the table if redirects do not also have to be recreated each time (which can take hours). The source page uses these country, territory, and subnational names. The country links are created all at once. In freeware NoteTab Light (with checkmark in regular expression box):

Find what: ^.*?\|\h*\K(.*?)(?=\h*\|)
"Replace all" with: [[$1]]

This creates a link out of anything in the first cell of each row. This must be done before other formatting, such as adding the text-align styles. Those styles are added all at once too (as described elsewhere).

Below is an alternative to the LibreOffice Calc and tab2wiki method discussed elsewhere on this page. LibreOffice Calc and tab2wiki are easier and faster when they can be used, but sometimes alternatives are required. That is why this info has been kept below, even though LibreOffice Calc and tab2wiki will work for this particular table.

See the long table at en:List of countries by incarceration rate. It has over 200 rows. It is too difficult to maintain the table in highest-to-lowest order, because that order can change with a single country update by any editor. Besides, readers can get highest-to-lowest order by clicking the rate header. Alphabetical order is easy to maintain, because it is not affected by multiple small changes in the stats. To add row numbering see en:Help:Sorting.

Source: Highest to Lowest. World Prison Brief. International Centre for Prison Studies. Use dropdown menu to choose lists of countries by region, or the whole world. Use menu to select highest-to-lowest lists of prison population totals, prison population rates, percentage of pre-trial detainees / remand prisoners, percentage of female prisoners, percentage of foreign prisoners, and occupancy rate. Column headings in tables can be clicked to reorder columns lowest to highest, or alphabetically.

Use the menus to create a whole world list of prison population rates. Then click on the country column to put it in alphabetical order. Then copy and paste it into an online HTML WYSIWYG editor such as this one:

Click inside the editing area. To paste in the table use the edit menu of your browser, and then click "paste". After pasting in the table remove the rank column. To do this click anywhere in the rank column. Then click on the toolbar button that says (tooltip) "remove column". Then go into source mode to get the HTML. The table HTML starts with <table and ends with </table>.

There are many free HTML editors that can remove table columns and do other table editing, for example KompoZer. Place the cursor in the column heading. Then go to the table menu and click "delete column". You can also select all, right click, and "remove all text styles".

Copy only the table HTML, and paste it into the "HTML markup" form of the online HTML to Wiki Converter hosted by Magnus Manske. It is here. Click "convert" at the bottom of the converter page. It converts the table HTML to wiki code (wikitext). Paste it into a sandbox or one of your user pages (or subpages) to see what it looks like as a MediaWiki table.

The wikitext needs to be cleaned up. To do so quickly, copy the wikitext into a blank page of a good free text editor such as NoteTab Light. Save this as a text file. Remove the CSS styling at the top of the table wikitext. It is unnecessary. Then use the text editor's find-and-replace tool to remove all the classes. Replace double spaces with single spaces. Repeat until no double spaces remain. Fix any text laddering. Some of it can be done with find-and-replace. For example; replace <a ^p with <a. Save the text file.

Add column header text as necessary. Test in one of your user sandboxes as necessary to get things working right. The easiest way is to remove all the header wikitext and styling; and then replace it with the header wikitext at en:List of countries by incarceration rate. For more info see en:Help:Sorting, meta:Help:Table, mw:Help:Tables, and en:Help:Tables.

The header has two columns: "Country (or dependent territory, subnational area, etc.)" and "Incarceration rate (Prisoners per 100,000 population)". The header wikitext is:
{| class="wikitable sortable" border="1" 
! Country (or dependent territory,<br> subnational area, etc.)
! Incarceration rate<br>(Prisoners per<br>100,000 population)

Note: "c." (circa) indicates "approximately." It should be put after the number, or numerical sorting of the rate column will not work, even with data-sort-type="number" in the column header. It is no longer necessary to put "c." in a separate column, though, since the numerical sort order of a column can now be forced by adding data-sort-type="number" to the column header. See: en:Help:Sorting. After moving "c." save again. Keep a separate copy of this. It may be needed for the country linking option explained in the shaded box farther down.

Next, click the replace command from the search menu of NoteTab. It pulls up the find-and-replace form. It can be used to quickly find and remove any remaining HTML. To add country linking and flag icons go directly to the shaded box farther down, and skip this section.

First replace </a> with nothing.

Then remove the rest of the HTML used for the country links. Put a checkmark in the regular expression checkbox, and then remove everything between HTML tag brackets: <>

To do that paste this regular expression in the "find what" line:


In the "replace with" line put this:

style="text-align:left" |

Click "replace all." This instantly removes nearly all HTML code. It replaces it with styling that aligns the country names to the left. The names are easier to read that way.

Adding country links and flags
Alternatively, you can add country linking and flag icons (see: en:Template:Flaglist). To do so replace </a> with template end brackets: }}

Then replace the remaining HTML using the regular expression method. Put a checkmark in the regular expression checkbox, and then remove and replace everything between HTML tag brackets <>

To do so replace
with this:
style="text-align:left" |{{flaglist|

Those two changes should turn all the country names into wikilinks and add flag icons in front of them. They also align the country text to the left. For example, see en:User:Timeshifter/Sandbox42. In some country lists created this way, some of the country links may have to be created manually, because redirects to the country names used for these links do not exist yet. Fortunately, nearly all the redirects have been created for this country list. For the combined UK number for England and Wales add this manually to keep both flags:

{{flagicon|England}} & {{flagicon|Wales}} [[England and Wales]] ([[United Kingdom]])

Also add this for Sint Maarten:

{{flaglist|Sint Maarten}} ([[Netherlands]])

For more info see: en:Wikipedia:WikiProject Flag Template, en:Category:Country data templates and en:Category:Country data redirects. Country links and flag icons are not absolutely necessary in country lists. Timely updating of the list is more important. If you want flags and country links, and you want to be able to rapidly update a list you may need to create some redirects. To do so search here for the templates to redirect to. Add the country or territory name to the search.

Do some additional manual cleaning, if necessary, to get rid of the rest of the HTML. There shouldn't be a need to do so for en:List of countries by incarceration rate.

Adding a column for notes

A column for notes (if necessary) is instantly added by doing a regular find-and-replace. In NoteTab Light replace

  • ^p is the NoteTab code for line breaks.
  • |- is the wikitext for a table row.
  • | is the wikitext for a table cell. See meta:Help:Table.

For consistency use the same reference for all the countries and territories possible. That reference link is placed above the table. If rates for additional countries or territories are found they can be entered in the table, and info about the sources can be put in the notes section below the table.

Do not put references or long notes in the table. They will break the alignment of the row numbering. See en:Help:Sorting sections about row numbering. Point to the notes section below the table. For example, see List of countries by incarceration rate.

PDF to image files

See: Commons:PDF to image files

Add text to chart images

Freeware IrfanView (like many other image editors) is useful for adding text to charts and other images. Open a chart in IrfanView. Crop the chart as needed. The smaller the margins, the more clearly the chart shows up in Wikipedia articles at smaller sizes. Click on the chart where you want the text to begin, and drag out a large rectangle where the text will be inserted. Make it large since text will not extend outside the box. The box lines do not remain after you finish editing the image. After you have drawn the box, go to the edit menu and click on "insert text". For ease of use pick left alignment in the dialog box that pops up. This way the text will start at the top left of the large box you drew on the image. Pick your font and color from the dialog box. Then click "preview" in the dialog box. The text will be inserted temporarily in the large box. The background will not be affected if the background is set to "transparent" in the dialog box. Change your settings until you get what you want in the preview. Then click "OK" in the dialog box. Save the image when you are done.

PDF to HTML, wikitext

Convert PDF charts and graphs to wikitext, or to HTML.

Some PDF charts can be converted to HTML charts. Copying some PDF charts produces comma-separated values (CSV) when they are pasted into some freeware text editors such as NoteTab Light. See the next section for tools to convert CSV data into HTML charts and wikitext charts.

Sometimes you can paste a PDF chart into a KompoZer page. After doing so, select the chart text on the KompoZer page. Then click "create table from selection" in the table menu. Choose between commas or spaces (depending on what was used) to separate the columns. Save the page as an HTML web page. Then use one of the HTML to wiki converters to convert the HTML chart to wikitext.

PDF tools

Some of the previous sections refer to PDF files. Wikipedia has a comprehensive list of PDF software. Much of it is free and open source.

CNET has a category: PDF Software Downloads for Windows. Much of it is free, or free to try. Click "free" link in the left sidebar for freeware. Initial sort is by downloads last week. Recent popularity is usually a good comparative guide to ease of use.

Appropedia has notes on porting PDF files to MediaWiki.

For tables and charts, converting PDF to SVG is usually the optimal choice if you have the skills and tools to do so. Two programs that can do this are:

Convert wiki tables to Excel

Maybe you need to work on some tables, and prefer to work in Excel. See:

Convert Excel to CSV

Convert xls, xlsx, etc. to CSV. CSV files create clean tables since CSV contains only values, headers, rows, and columns, with no other formatting. So they are great for converting to clean wikitext tables.

Online tools

Convert CSV, DSV, or Excel to HTML or wikitext

Convert Excel, comma-separated values (CSV), tab-separated values, delimiter-separated values (DSV), etc. to table wiki code.
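As a rough illustration of what such a conversion involves, here is a minimal Python sketch that turns CSV text into compact table wikitext using the standard `csv` module. It is not the code of any of the tools mentioned on this page. The function name `csv_to_wikitext` is hypothetical, the first row is assumed to be the header, and the sample row is a made-up placeholder that shows a quoted field containing a comma.

```python
import csv
import io


def csv_to_wikitext(csv_text, delimiter=","):
    """Convert delimiter-separated text to a compact wikitext table.

    Assumption: the first non-empty row is the header row.
    """
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    rows = [row for row in reader if row]
    if not rows:
        return ""
    lines = ['{| class="wikitable sortable"']
    lines.append("! " + " !! ".join(rows[0]))
    for row in rows[1:]:
        lines.append("|-")
        lines.append("| " + " || ".join(row))
    lines.append("|}")
    return "\n".join(lines)


# Placeholder data: the quoted field keeps its comma as part of one cell.
sample_csv = 'Country,Rate\n"Example, Territory",1\n'
print(csv_to_wikitext(sample_csv))
```

Passing `delimiter="\t"` handles tab-separated values in the same way. Cell text that itself contains "|" would need escaping, which this sketch does not attempt.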

Online tools

Microsoft Word to wiki code

Microsoft Word converters. See:

OpenOffice to wiki code

OpenOffice can export to wiki code. First install the Sun Wiki Publisher extension. Then make your table and choose Export to MediaWiki (.txt).

Print screen, and then edit chart image

The charts can be converted to images by using the "Print Screen" key on a keyboard. Image editors can then be used to capture and crop the chart found on the screenshot.

There are many free image editors. See:

For example, there is the free, popular, easy-to-use IrfanView. Use it to resize charts, remove watermarks, crop unused space from around the edges, and so on. IrfanView can also losslessly compress PNG images so that they use fewer kilobytes without any loss in image quality. Install the IrfanView plugin pack too. It installs instantly and includes even better PNG compression, PNGOUT, which is easy to use in IrfanView.

See also