Help:Creating a DjVu file/JPG-DJVU conversion scripts

From Wikimedia Commons, the free media repository

The scripts on this page automatically convert a set of .jpg files into a single .djvu file using DjVuLibre. DjVuLibre must be installed for them to work, but its folder does not need to be added to the PATH environment variable, because the scripts call the tools by their full path.

Before conversion, the files need to be numbered sequentially with leading zeros (e.g. "001, 002, ..., 010" rather than "1, 2, ..., 10"). Without the leading zeros the names do not sort properly, and the .djvu file's pages end up in the wrong order.
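The sorting problem can be seen in a few lines of Python. This is only an illustration; the file names are made up for the example.

```python
# File names as they might look without leading zeros (hypothetical names).
unpadded = ["page 1.jpg", "page 2.jpg", "page 10.jpg"]
# The same names with the number padded to three digits.
padded = ["page 001.jpg", "page 002.jpg", "page 010.jpg"]

# Plain string sorting compares character by character, so "10" sorts before "2".
print(sorted(unpadded))  # ['page 1.jpg', 'page 10.jpg', 'page 2.jpg']
# With leading zeros, string order and page order agree.
print(sorted(padded))    # ['page 001.jpg', 'page 002.jpg', 'page 010.jpg']
```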


Python script

The script below can rename files with names in the format "Name (1).jpg, Name (10).jpg, etc." (the automatic renumbering provided by Windows) to "Name 0001.jpg, Name 0010.jpg, etc." (which sort correctly). When it starts, it asks whether this renaming is needed.

You only need to set the location of the image folder (IMGDIR) and the location of the DjVuLibre tools (EXEDIR) in the script; the rest is automatic.

The -decibel 48 option selects a high quality. If your file ends up too big, you'll need to split it into sections, or reduce the quality until it fits (the current limit is 100MB).
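As a rough sketch of how the quality setting and the size limit could be handled in Python 3, the helpers below build a c44 call for a chosen decibel value and check a file size against the limit. The names c44_command and exceeds_limit are hypothetical, and treating 100MB as 100 × 1024 × 1024 bytes is an assumption.

```python
import os

def c44_command(exe_dir, infile, outfile, decibel=48):
    """Build the argument list for one c44 call at the given quality (hypothetical helper)."""
    return [os.path.join(exe_dir, "c44.exe"), "-decibel", str(decibel), infile, outfile]

def exceeds_limit(size_bytes, limit_mb=100):
    """Return True if a size in bytes is over the limit (assumes 1MB = 1024 * 1024 bytes)."""
    return size_bytes > limit_mb * 1024 * 1024

# A lower decibel value gives a smaller, lower quality file.
print(c44_command("C:\\djvulibre", "page 001.jpg", "TMP.djvu", decibel=40))
```

The list returned by c44_command can be passed directly to subprocess.call, and exceeds_limit can be given os.path.getsize(OUTDJVU) once the collated file exists.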

This script was written by Inductiveload: contact him for help, bugs, suggestions or anything else.

Note that it is written for an older version of Python (2.x, before 3.0).

import os, glob, subprocess, re

def renumber( filename ):
    m = re.search(r'(.*)\((\d+)\)', filename) #raw string keeps the regex backslashes intact
    index = int(m.group(2)) #this is the numeric index
    index = '%04d' % index  #pad with zeros
    return m.group(1) + index +'.jpg'#append to name

#Change these to suit your situation=========================
IMGDIR="C:\\documents and settings\\YOU\\my documents\\images\\" #directory of images to be converted
EXEDIR="C:\\program files\\djvuzone\\djvulibre\\" #directory of DjVuLibre tools (specifically c44 and djvm) 
OUTDJVU = IMGDIR + 'OUT.djvu'

#Don't change these ==========================================
TMPDJVU = IMGDIR + 'TMP.djvu'

rename = raw_input('Rename files for proper sorting? [y/n]')
if rename == 'y':
    #rename files to have leading zero numbers so they sort properly
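    for infile in glob.glob( os.path.join(IMGDIR, '*.jpg') ):
        #skip files without a "(number)" part, which renumber() cannot handle
        if re.search(r'\(\d+\)', infile):
            newname = renumber(infile)
            os.rename(infile, newname)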

#convert jpg to djvu and collate to a single file   
if os.path.exists(OUTDJVU):
    os.remove(OUTDJVU)

#glob.glob does not guarantee any order, so sort the names to keep the pages in sequence
for infile in sorted(glob.glob( os.path.join(IMGDIR, '*.jpg') )):
    print 'Processing ' + infile

    #convert jpg to a temp djvu file
    cmd = '"'+EXEDIR+'c44.exe" -decibel 48 ' + '"'+infile+'"' + ' "'+TMPDJVU+'"'
    subprocess.call(cmd)
    
    if os.path.exists(OUTDJVU):
        #Add the djvu file to the collated file
        cmd = '"'+EXEDIR+'djvm.exe" -i ' + '"'+OUTDJVU+'"' + ' "'+TMPDJVU+'"'
    else:
        # Create the collated file
        cmd = '"'+EXEDIR+'djvm.exe" -c ' + '"'+OUTDJVU+'"' + ' "'+TMPDJVU+'"'
    subprocess.call(cmd)

#Delete the temporary file (it only exists if at least one image was converted)
if os.path.exists(TMPDJVU):
    os.remove(TMPDJVU)

print '\nAll files converted and collated successfully'
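For readers on Python 3, the renaming step of the script above might be sketched as follows. This is not the author's code: it is a small variant that also requires the name to end in ".jpg" and returns None for names without a "(number)" part, so the caller can skip them.

```python
import re

def renumber(filename):
    """Turn 'Name (12).jpg' into 'Name 0012.jpg'; return None if there is no '(number)' part."""
    m = re.search(r'(.*)\((\d+)\)\.jpg$', filename, re.IGNORECASE)
    if m is None:
        return None
    # Pad the numeric index to four digits and re-append the extension.
    return '%s%04d.jpg' % (m.group(1), int(m.group(2)))

print(renumber('Name (1).jpg'))   # Name 0001.jpg
print(renumber('Name (10).jpg'))  # Name 0010.jpg
print(renumber('cover.jpg'))      # None
```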


Windows

Alternatively, this batch script can be used on Windows:

@echo off
rem Usage:
rem   script.cmd <SOURCE_PATH> <OUTPUT_NAME>

rem Location of the source image files (the first argument after the script name).
set img=%1
rem Location of c44.exe
set CMD1="C:\Program Files (x86)\DjVuZone\DjVuLibre\c44.exe"
rem Location of djvm.exe
set CMD2="C:\Program Files (x86)\DjVuZone\DjVuLibre\djvm.exe"
rem Location of the temporary file; change it to whatever you want.
set WORK="D:\TEMP\tmp.djvu"
rem Location of the output file, named after the second argument.
set OUT="D:\TEMP\%2.djvu"
rem /d also switches drive when the images are on a different one
cd /d %IMG%
FOR  %%G IN (*.jpg) DO (
	%CMD1% -decibel 48 "%%G" %WORK% 
	IF EXIST %OUT% (
		%CMD2% -i %OUT% %WORK% 
	) ELSE (
		%CMD2% -c %OUT% %WORK% 
	)
)
PAUSE


Enhanced Python Script (With Sub-directories)

This is an edit of the first Python script above. It was written by Sylvertech: contact him for help, bugs, suggestions or anything else.

Differences:

  1. Processes sub-directories by default.
  2. Uses a newer version of Python (3.x).
  3. Does not yet have the renaming (sorting) functionality of the original.
  4. Is more flexible regarding the images folder (by default it starts in "./", the directory the script is run from).
  5. Contains disabled, unfinished code left for others to continue from, e.g. a menu for changing settings and the sorting function.
'''
This script goes into a specified directory and its sub-directories,
and compiles all the images in each directory into a single DJVU file.
'''


import os, glob, subprocess, re

#Change these to suit your situation=========================

#root directory of images to be converted; currently set to the directory the script is run from.
IMGDIR= "./"
#directory of DjVuLibre tools (specifically c44 and djvm) 
EXEDIR="C:\\djvulibre\\"

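#remnant of original script, disabled by wrapping it in a string
#(a raw string, so the regex backslashes are not read as escape sequences)
r''' 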
def renumber( filename ):
    m = re.search('(.*)\((\d+)\)', filename)
    index = int(m.group(2)) #this is the numeric index
    index = '%04d' % index  #pad with zeros
    return m.group(1) + index +'.jpg'#append to name

     
rename = raw_input('Rename files for proper sorting? [y/n]')
if rename == 'y':
    #rename files to have leading zero numbers so they sort properly
    for infile in glob.glob( os.path.join(IMGDIR, '*.jpg') ):
        newname = renumber(infile)
        os.rename(infile, newname)'''
 

def SpewLots(peht = IMGDIR):
    #os.walk already visits every sub-directory, so no manual recursion is needed
    for dirpath, dnames, fnames in os.walk(peht):
        print("\n" + dirpath + " has been accessed.")
        TMPDJVU = os.path.join(dirpath, 'TMP.djvu')
        OUTDJVU = os.path.join(dirpath, 'OUT.djvu')
        if os.path.exists(OUTDJVU):
            os.remove(OUTDJVU)
            print("\nOUT removed")
        #sort the names so the pages are added in sequence
        for f in sorted(fnames):
            #only convert .jpg images; skip the script itself and any other files
            if not f.lower().endswith('.jpg'):
                continue
            infile = os.path.join(dirpath, f)
            print("\nProcessing " + f + " in " + dirpath)

            #convert jpg to a temp djvu file
            cmd = '"'+EXEDIR+'c44.exe" -decibel 48 ' + '"'+infile+'"' + ' "'+TMPDJVU+'"'
            subprocess.call(cmd)
            print("\nCreating TMP")

            if os.path.exists(OUTDJVU):
                #Add the djvu file to the collated file
                cmd = '"'+EXEDIR+'djvm.exe" -i ' + '"'+OUTDJVU+'"' + ' "'+TMPDJVU+'"'
                subprocess.call(cmd)
                print("\nAdding TMP to OUT")
            else:
                # Create the collated file
                cmd = '"'+EXEDIR+'djvm.exe" -c ' + '"'+OUTDJVU+'"' + ' "'+TMPDJVU+'"'
                subprocess.call(cmd)
                print("\nCreating OUT File")

            if os.path.exists(TMPDJVU):
                os.remove(TMPDJVU)
                print("\nTMP removed")

    print('\nAll files converted and collated successfully')

#Menu code. lol menues.
'''def Menu():
    maniacMode = 0 
    print("

    \n1 - Compile in root and sub-directories.

    \n2 - Compile only in root directory.
    
    ")
    choice = input('Pick a number.')
    if choice == 2: maniacMode = False
    elif choice == 1: maniacMode = True
    else: print("Error choosing mode")
    return maniacMode


Mode = Menu()'''


#foolproof switch            
if input("Start? [y/n]")== "y": SpewLots()
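
The sorting functionality that this script still lacks could be approached without renaming any files, by sorting the names with a "natural" key that compares the numbers by value. The sketch below is one possible way to do it, not part of the script above; natural_key is a hypothetical helper name.

```python
import re

def natural_key(name):
    """Split a name into text and number chunks so that numbers compare by value."""
    chunks = re.split(r'(\d+)', name)
    # re.split with a capturing group alternates text (even index) and digits (odd index).
    return [int(c) if i % 2 else c.lower() for i, c in enumerate(chunks)]

names = ['Name (10).jpg', 'Name (2).jpg', 'Name (1).jpg']
print(sorted(names, key=natural_key))  # ['Name (1).jpg', 'Name (2).jpg', 'Name (10).jpg']
```

In SpewLots, the loop over fnames could then iterate over sorted(fnames, key=natural_key) instead.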