Help:Creating a DjVu file

This page explains how to create a DjVu file. The format can be used in galleries or categories like any supported image format, and reduces the size and number of the files that need to be copied.

If you have difficulties, you can ask for help at the Commons:Help desk.

Example: [[File:Alice in Wonderland.djvu|thumb|Alice in Wonderland, page 9|page=9]].

Introduction

The aim is to create a DjVu file from bitmap images (jpg, tif, etc.) found on the Internet or scanned yourself. This has several advantages:

all pages of a book are available in a single file
all pages can be seen from the file page of the DjVu file
every page can be used in the "Page" namespace
DjVu files are small
only a single file needs to be copied, compared to hundreds of pages in bitmap format
creating a DjVu file is quicker than uploading hundreds of bitmap files

Drawbacks:

The numbering of the pages does not seem to be freely configurable
Loss of quality can occur

Within MediaWiki projects

Pages of DjVu files can be navigated in MediaWiki installations that have the ProofreadPage extension installed. This is the case on all language versions of Wikisource.

Once a DjVu file is uploaded to Commons, an index page needs to be created. Navigation is done by using the name of the file prefixed by "page:" and followed by "/X", where "X" is the page number.
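
As a minimal sketch of the naming rule above, the following shell function builds the page title for a given file name and page number. The function name page_title is hypothetical, and the prefix is written with a capital P, as it usually appears on Wikisource.

```shell
#!/bin/sh
# page_title FILE N: print the title of page N of FILE in the Page namespace.
# (page_title is a hypothetical helper name, not part of any tool.)
page_title() {
    printf 'Page:%s/%s\n' "$1" "$2"
}

# Example: page 9 of the file shown at the top of this page.
page_title "Alice in Wonderland.djvu" 9
```

The function only formats a string; it does not check that the file or the page exists.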

Page numbering

The DjVu format creates a default page numbering, which is displayed in a drop-down menu (see Image:Wind in the Willows.djvu). It is advisable to have the page numbering match that of the original book, for easier use. This can be problematic when some pages (such as those of an introduction) are numbered in Roman numerals. In this case, one solution is to create a second DjVu file for these pages.

Converting a PDF file

Please see Help:Converting PDF to DjVu.

Other formats

TIFF files from Gallica can be opened in FineReader (even after the evaluation period is over). After exporting the pages as TIFF (the same format), it is possible to crop the margins with XnView and to load the pages into DjVu Solo. Page numbering is automatic.

MS Windows

Conversion through DjVu Solo

DjVu Solo is a very simple and effective tool. The software is no longer updated, but it is stable. It is also possible to use LizardTech Virtual Printer, which is available at no charge and converts documents through a printing process. The procedure is more or less the same for all conversion programmes: load the bitmap files into the programme, check their order, and launch the conversion process.

By default, DjVu Solo is set to convert pages with a 300 dpi resolution. This is usually a good value.

Conversion through DjVuLibre

DjVuLibre doesn't have a GUI for converting files, but with an appropriate script you can do it automatically with next to no user input. See the scripts to create a single, collated DjVu file.

On Mac OS

Using MacPorts, one can install a number of DjVu programs for use on Macintosh computers:

# port list '*djvu*'
djvu2pdf                       @0.9.2          graphics/djvu2pdf
djvulibre                      @3.5.25         graphics/djvulibre
minidjvu                       @0.8            graphics/minidjvu
pdf2djvu                       @0.7.18         graphics/pdf2djvu
py-djvubind                    @1.2.1          python/py-djvubind
py31-djvubind                  @1.2.1          python/py-graveyard
py32-djvubind                  @1.2.1          python/py-graveyard
py33-djvubind                  @1.2.1          python/py-djvubind
py34-djvubind                  @1.2.1          python/py-djvubind
zathura-plugin-djvu            @0.2.3          office/zathura-plugin-djvu

On Linux, FreeBSD, etc.

You need the DjVuLibre software, a collection of command-line tools for creating, modifying, and viewing DjVu files. You will probably also need the ImageMagick or GraphicsMagick software if you need to convert page scans from bitmap formats.
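
Before starting, it can help to check that the tools used in the following sections are on your PATH. This is only a sketch: missing_tools is a hypothetical helper, and the list of names (cjb2, djvm, convert) is taken from the examples on this page.

```shell
#!/bin/sh
# missing_tools NAME...: print each NAME that is not found on PATH.
# (missing_tools is a hypothetical helper, not part of DjVuLibre.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# cjb2 and djvm come with DjVuLibre; convert comes with ImageMagick.
missing=$(missing_tools cjb2 djvm convert)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Not installed or not on PATH:" $missing
fi
```

If you use GraphicsMagick instead of ImageMagick, check for gm rather than convert.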

Converting page scans

The tool cjb2 is used to create a DjVu file from a PBM or TIFF file. You therefore need to convert your scans if they are not already in one of these formats. (The examples below use the convert tool from ImageMagick, but they will also work with GraphicsMagick's gm convert command.)

Conversion from PNG format to PBM format with convert:

 convert rig_veda-000.png rig_veda-000.pbm

Depending on the quality of the original scans, you may find it useful to process them with the unpaper utility, which deletes black borders around the pages and aligns the scanned text squarely on the page. Unpaper is also capable of extracting two separate page images where facing pages of a book have been scanned into a single image.
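
As a rough sketch of the two-page case described above, unpaper can be asked to read double-page scans and write two output pages per input. The function name split_spreads, the run helper, and the file name prefixes are hypothetical; the exact options that suit your scans may differ, so check the unpaper documentation before relying on this.

```shell
#!/bin/bash
# run CMD...: execute CMD, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# split_spreads IN OUT: split numbered double-page scans matching IN%03d.pbm
# into single pages written to files matching OUT%03d.pbm.
# (split_spreads and both prefixes are hypothetical names.)
split_spreads() {
    run unpaper --layout double --output-pages 2 "$1%03d.pbm" "$2%03d.pbm"
}

# Preview the command without touching any files.
DRY_RUN=1
split_spreads spread page
```

Set DRY_RUN=0 (or leave it unset) to actually run unpaper.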

Creation of a DjVu file from a PBM file:

 cjb2 -clean rig_veda-000.pbm rig_veda-000.djvu

Adding the DjVu file to the final document:

 djvm -i rig_veda.djvu rig_veda-000.djvu

You need to repeat these steps with a script for each page of the book. Example:

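#!/bin/bash
# Assumes the scans are named rig_veda-1.png ... rig_veda-9.png
for n in $(seq 1 9)
do
        j="rig_veda-$n"
        convert "$j.png" "$j.pbm"
        cjb2 -clean "$j.pbm" "$j.djvu"
        # djvm -i inserts into an existing document, so create it from the first page
        if [ -f rig_veda.djvu ]; then
                djvm -i rig_veda.djvu "$j.djvu"
        else
                djvm -c rig_veda.djvu "$j.djvu"
        fi
done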
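
The single-page commands above use a zero-padded name (rig_veda-000.png), whereas the loop counts without padding (rig_veda-1.png). If your scans are zero-padded, the names can be generated with printf. A minimal sketch: page_name is a hypothetical helper, and the three-digit width is assumed from the example file name.

```shell
#!/bin/bash
# page_name PREFIX N: print PREFIX-NNN, with N zero-padded to three digits.
# (page_name is a hypothetical helper name.)
page_name() {
    printf '%s-%03d\n' "$1" "$2"
}

# List the PNG names for pages 0 to 9 of the example book.
for n in $(seq 0 9); do
    echo "$(page_name rig_veda "$n").png"
done
```

Zero-padded names also sort correctly with a shell wildcard such as rig_veda-*.png, which unpadded names do not once there are ten or more pages.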

Alternatively, you may use a makefile and run the steps in parallel with 'make -j':

UNPAPER_OPTS_COMMON= --mask-scan-threshold 0.01 --dpi 600 --mask-scan-size 100
UNPAPER_OPTS_ST1= --deskew-scan-size 5000 -dv 0.5
UNPAPER_OPTS_ST2= --no-noisefilter --no-blackfilter --no-grayfilter --no-blurfilter --no-deskew -S 3600,5250 --border-align top --border-margin 150
IMGS = $(wildcard *.png)
DJVUS = $(sort ${IMGS:.png=.djvu})
DJVU = __out.djvu

all: ${DJVUS} ${DJVU}

%.raw.pbm: %.png
	convert $< $@

# stage 1: clean borders, apply filters, rotate
%.stage1.pbm: %.raw.pbm
	unpaper --overwrite $(UNPAPER_OPTS_COMMON) $(UNPAPER_OPTS_ST1) $< $@ > $@.log

# stage 2: place in the center of the page, set page size
%.pbm: %.stage1.pbm
	unpaper --overwrite $(UNPAPER_OPTS_COMMON) $(UNPAPER_OPTS_ST2) $< $@ > $@.log

# Compress to .djvu
%.djvu: %.pbm
	cjb2 -clean $< $@

# Assemble final djvu
${DJVU}: ${DJVUS}
        # files that unpaper does not process well can be put in
        # a different directory and assembled into the final djvu
	# cp ./_manfix/*.djvu ./
	djvm -c $(DJVU) ${DJVUS}
        # With parallel make, it will be impossible to tell which error
        # comes from which file, so save all output in one big log
	cat *.log > _one_big.log


clean:
	rm -f *.raw.pbm
	rm -f *.pbm
	rm -f *.djvu
	rm -f *.log
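
To choose a job count for 'make -j', one option is to ask the system how many processors are online. This is a sketch under assumptions: job_count is a hypothetical helper, and getconf _NPROCESSORS_ONLN is widely available but not guaranteed on every system, so the function falls back to 1.

```shell
#!/bin/sh
# job_count: print the number of online processors, or 1 if it cannot be found.
# (job_count is a hypothetical helper name.)
job_count() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    case "$n" in
        ''|*[!0-9]*) n=1 ;;   # empty or non-numeric answer: fall back to 1
    esac
    echo "$n"
}

# Print the make invocation rather than running it, so nothing is built here.
echo "make -j$(job_count)"
```

Run the printed command in the directory that holds the makefile and the PNG scans.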

Converting PostScript files (PDF, PS, EPS)

DjVuLibre includes djvudigital, a tool that uses Ghostscript to directly convert PDF and other PostScript files to DjVu format. However, it requires rebuilding Ghostscript from source code to include a special driver needed by djvudigital (the driver is part of the DjVuLibre distribution, but because of conflicting open-source licenses, it cannot be distributed legally as a binary). Once built, though, it is a very convenient tool to use; it can even convert PDF files from Google Books without any extra work. It's as easy as:

 djvudigital --words some_book.pdf

The --words option should be included to copy any searchable text that exists in the PDF file over to the final DjVu file. This also allows words to be highlighted in searches. (To eke out a little savings in file size, you could use --lines instead of --words, which would record the position of each line instead of each individual word; text could still be searched by word, but entire lines would be highlighted in search results instead of the individual words. This probably won't matter in maps, illustrations, etc. where words are scattered all over the page.)
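
To convert several files in one go, the same command can be wrapped in a loop. The following is a sketch only: djvu_name is a hypothetical helper that derives the output name, and the loop prints the commands instead of running them, so it is safe to try without djvudigital installed.

```shell
#!/bin/sh
# djvu_name FILE: print FILE with a .pdf, .ps or .eps suffix replaced by .djvu.
# (djvu_name is a hypothetical helper, not part of DjVuLibre.)
djvu_name() {
    case "$1" in
        *.pdf) printf '%s.djvu\n' "${1%.pdf}" ;;
        *.eps) printf '%s.djvu\n' "${1%.eps}" ;;
        *.ps)  printf '%s.djvu\n' "${1%.ps}" ;;
        *)     printf '%s.djvu\n' "$1" ;;
    esac
}

# Print one djvudigital command per PDF in the current directory.
# Remove the "echo" to run the conversions for real.
for f in *.pdf; do
    [ -e "$f" ] || continue   # skip the unexpanded pattern when there are no PDFs
    echo djvudigital --words "$f" "$(djvu_name "$f")"
done
```

Replace --words with --lines in the loop if the smaller file size matters more than word-level highlighting.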

Conversion through DjVu Solo

DjVu Solo can be run on Linux using WineHQ. The installation procedure for WineHQ depends on the distribution and is described in the WineHQ documentation. Once WineHQ is installed, simply download the DjVu Solo installer and run it using Wine. Open the folder containing the installer in a terminal and run:

 wine djvusolo3.1-noncom.exe

Usually, no tricks are necessary. To access the file system, use the corresponding drive inside WineHQ (Z by default).

External links

Software

DjVuLibre package: open source, for Mozilla, Firefox, Konqueror, Netscape, Galeon, and Opera, Linux/Unix.
Lizardtech DjVu Browser Plug-in: for Win/Mac
WinDjView and MacDjView Desktop Viewers: open source, for Win/Mac
DjVuOutline: DjVu outline (contents, bookmarks) editor, open source, Windows only
STDU Viewer: for Windows
DjVu Viewer: for Windows
DjVu Solo 3.1, DjVuVersion Command Line Utility, DjVu ActiveX Control for Microsoft Office 2000 (exe, 1.5 Mb): can be found at djvu.org
PDF2DJVU command line utility to convert Adobe PDF to DjVu files, for Windows, Mac, and *NIX/Linux
PDF to DjVu GUI, a graphical interface for PDF2DJVU command line utility, for Windows
Djvu-Spec Pdf 2 Djvu Converter, for Windows
Spacemacs djvu-layer: for Linux/Mac (it should be possible to get it to work on Windows as well)

Websites

Downloads & Resources at djvu.org
Any2DjVu: online DjVu compression server
MiniDjVu: open source DjVu compressor for Linux/Unix and Windows.
DjVu file extension File-Extensions.org Library
