ukopp user guide   v. 3.3

concepts

first tryout              (1-page primer for those with RTFM problems)

toolbar buttons

file menu
backup menu
verify menu

report menu

restore menu
format menu
editing backup jobs
technical notes


License and Warranty
ukopp is a free program licensed under the GNU General Public License, V2 (Free Software Foundation). ukopp is not warranted for any purpose, but if you find a bug, I will try to fix it.


Origin and Contact

ukopp originates from the author's web site at: http://kornelix.squarespace.com/ukopp
Other web sites may offer it for download. Modifications could have been made.

If you have questions, suggestions or a bug to report:
kornelix@yahoo.de



Introduction


ukopp is a Linux program for copying or backing-up disk files to a separate storage device, e.g. a USB drive or SD memory card. Any disk directory may be used as a backup location. You can select files to be copied using a GUI. You can navigate through the file system and select files or directories to include or exclude at any level in the directory hierarchy. These choices can be saved in a job file to automate recurring backups. If new files appear in an included or excluded directory, they are automatically taken into account. You need to revise the job file only if you change the directories or make new exceptions within those directories.

ukopp copies only new and modified files: files that have not changed since the last backup are bypassed in microseconds. A typical daily backup of personal files can be done in less than a minute. ukopp can optionally retain previous versions of backup files instead of overwriting them with newer versions. You can specify the retention time and / or the number of versions to retain, and these can be specified once for all files, or separately for each group of included files. You can see these versions in the backup directories and recover them if needed.

ukopp has a synchronize function, which is a simple method to keep files in two computers synchronized using a USB stick or other portable memory. ukopp copies the newest version of a file from one device to the other.

Backups can be verified three ways: full, incremental, and compare. A full verify reads all the backup files and reports any files having read errors. An incremental verify reads only those files that have been newly written by a preceding backup job. This is very fast and provides a high level of security. A compare verify reads all backup files and compares them with their corresponding disk files. This is normally not necessary, but provides an effective check that all hardware and software is working correctly.

You can report all files in a backup job, or all files in a backup directory. You can search for file names using wildcards. You can report the differences between backup files and their corresponding disk files: files that have been created, deleted, or modified since the backup was made. These reports are available in three levels of detail: a list of all changed files, total file and byte counts per directory, and overall totals.


For disaster recovery or file transfer, ukopp has a file restore capability. You can select and restore backup files to their original directories or anywhere else. Owner and permissions are also restored, even if the backup device uses a Microsoft file system.


Concepts

The files in a backup job are specified with include and exclude records. These have filespecs with optional wildcards placed almost anywhere.

Examples:

    include /home/*                 # add all user files
    include /root/*                 # add all root files
    include /shared/*/documents/*   # add shared document files
    exclude */mp3/*                 # remove files in mp3 directories
    exclude */.Trash/*              # remove trash files

The first include adds all files owned by users in their home directories and sub-directories. The second include adds all files owned by root. The third include adds all files under the /shared top directory that also have an intermediate directory named /documents. The two exclude records remove all files within all /.Trash and /mp3 directories.

GUI interface:
The above records are normally generated using a file selection dialog.
This is documented in a following section: editing backup jobs.


File Selection Logic

    loop:
        get next control record

        if EOF, done

        if include: add all matching files to backup file set

        if exclude: remove all matching files from backup file set
    loop-end


Note that excludes are effective only against prior includes. They have no effect on following includes, which are processed afterwards. See the section on editing backup jobs.
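
The selection loop above can be sketched in Python. This is an illustration only, not ukopp's actual code: the function name select_files is hypothetical, and the use of fnmatchcase (where "*" also matches "/") is an assumption chosen to mirror the documented wildcard behavior.

```python
from fnmatch import fnmatchcase

def select_files(records, disk_files):
    """Apply include/exclude records in order (hypothetical helper, not ukopp code)."""
    selected = set()
    for action, filespec in records:
        # Assumption: "*" may span directory separators, as fnmatchcase allows.
        matches = {f for f in disk_files if fnmatchcase(f, filespec)}
        if action == "include":
            selected |= matches   # add all matching files
        elif action == "exclude":
            selected -= matches   # remove matches added by prior includes
    return selected

disk_files = [
    "/home/joe/notes.txt",
    "/home/joe/mp3/song.mp3",
    "/home/joe/mp3/keep.mp3",
]
records = [
    ("include", "/home/*"),
    ("exclude", "*/mp3/*"),
    ("include", "/home/joe/mp3/keep.mp3"),  # follows the exclude, so it is kept
]
print(sorted(select_files(records, disk_files)))
```

The last include shows why record order matters: it comes after the exclude, so the exclude has no effect on it.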

Restriction: include records must include at least the first directory name (top-level) without wildcards (the GUI file-chooser does this automatically).


Retaining multiple file versions: if this option is elected, existing backup files that need updating are renamed with a version number instead of being overwritten. If the backup file "foo.bar" is updated, it is renamed to "foo.bar (1)", and "foo.bar" becomes the newest backup. If it is updated again, "foo.bar" is renamed to "foo.bar (2)", and so forth. Newer versions have higher numbers, and the unversioned file is always the current or latest version. The section on editing job files explains how to specify old version retention policies.
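
The renaming scheme can be sketched as follows. The helper next_version_name is hypothetical, and it assumes that numbering continues from the highest version still present in the backup directory.

```python
import re

def next_version_name(name, existing):
    """Name that the current backup file would be renamed to (hypothetical helper)."""
    pattern = re.compile(re.escape(name) + r" \((\d+)\)")
    numbers = []
    for candidate in existing:
        match = pattern.fullmatch(candidate)
        if match:
            numbers.append(int(match.group(1)))
    # Assumption: the next number is one above the highest retained version.
    return f"{name} ({max(numbers, default=0) + 1})"

print(next_version_name("foo.bar", ["foo.bar"]))
print(next_version_name("foo.bar", ["foo.bar", "foo.bar (1)", "foo.bar (2)"]))
```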

ukopp limitations

    max. 200,000 files in a backup job (compile time constant)
    max. file retention is 9999 days and 9999 versions
    must run as root user or use sudo to copy protected files
    not useful for disk imaging (operating system backup)


ukopp first tryout

After installing ukopp, please perform the following exercise. This may be all you need at first. You can enhance your file security and ultimately save time if you read this whole document. 

The following short exercise will check that ukopp functions correctly on your system and help you become familiar with ukopp usage.

  1. Choose a backup device or directory. If using a pluggable device (e.g. USB stick), plug it in and wait for it to be mounted, or mount manually if this is your system policy.
  2. Start ukopp: click the desktop launcher or input a terminal command:
    - if no privileges needed: $ ukopp
    - if privileges are needed: $ sudo ukopp
  3. Select button [ target ]. The drop-down list shows mounted disk devices and their mount points. You can choose one of these, or input your chosen backup directory.
  4. Select button [ mount ]. Check that the selected target device/directory mounts OK.
  5. Select button [ edit job ]
  6. Erase the default backup job shown (select and delete, or use the [ clear ] button)
  7. Select the button [ browse ] at the bottom
  8. Navigate through the directories and select the directories and files to be copied
  9. Select the [ done ] button when finished selecting files
  10. Inspect the generated include and exclude records
  11. Add an optional verify record at the end of the list. The format is "verify xxxx" where xxxx is one of: none, incremental, full, compare. Use compare until you are confident that everything is working, then speed things up later by changing to incremental.
  12. Select button [ done ] when finished editing the job
  13. If there are errors shown, select [ edit job ] and fix them (remember that exclude records must follow relevant include records - excludes are exceptions to prior includes)
  14. Select menu: Report > get disk files. Inspect the counts. Be sure the total byte count is within capacity. Look for zero counts, indicating possible errors. Re-edit if needed.
  15. Select button: [ run job ]. Backup and verify should run automatically.
    Check that the error count is zero.
  16. Save the job file if desired: menu: File > save job
  17. Select button: [ quit ]
  18. Next steps: play with the report and restore functions

Detailed Usage Instructions

Toolbar buttons

target
The drop-down list displays all drives that are visible to ukopp, with their mount points and descriptions. Choose one of these to set the target device and location for a subsequent backup. You may also type-in a directory directly. This must be a valid directory for which you have write permission, and of course there should be enough space for the backup files. Choose one of the two options for flushing the memory cache to the physical device, which is done between a backup and verify. "sync" will use the Linux sync command, and "remount" will cause the backup device to be unmounted and remounted. For remount, the backup target directory must match the mount point of a mounted device.


edit job
Shortcut to the backup job editor (same as menu File > edit job)


run job
The current job is executed.


pause / resume
The currently running job or menu function may be paused and resumed. Use this to inspect output on the fly.


kill job
The currently running function is killed.

clear
The main window, where messages and reports are written, is cleared.


quit
Exit ukopp.



File Menu

open job

Open a previously saved backup job file for re-use (edit, run). Default location is the hidden directory /home/user/.ukopp (or /root/.ukopp).


edit job
Opens an edit dialog for the current backup job (the last job file opened, or from a prior edit). If no file has been opened, internal default data will be used as a starting point.


show job

List the current backup job data and diagnose any errors.


save job

Save current backup specs in a job file. Default is the same file that was last opened, but you may select any file. The data includes any edits that were made to the job.


run job

The current backup job is executed. Backup and verify modes are taken from the job.



Backup menu

backup
The backup job is run without verify. You can then run whatever verify you want.


synchronize

This is a bi-directional copy. Files present on one side only (disk or backup location) are copied to the other side. Files that are present on both sides will get the newest version copied to the other side. "Newest" is based on the time of the last file update.


Assume you normally use computer A, but you need to use B while traveling. You can use a portable memory device (SD card, USB stick, etc.) to keep the computer files synchronized.
  1. A and B must have identical backup job files, naming the same set of backup files.
  2. Initial synchronization: backup A, move the memory device to B, restore to B.
  3. Work with B: create and modify some files.
  4. Run synchronize on B, move the memory device to A, run synchronize on A.
  5. The modifications done on B are now carried over to A.
  6. You can update files on both A and B in parallel, as long as you work on different files between synchronizations. Synchronize A, then B, then A. Now both will have the same set of files, and these will be the newest ones present on either A or B.
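
The copy decision made by synchronize can be sketched like this. It is a simplified model, not ukopp's code: plan_sync is a hypothetical name, and each side is represented as a dictionary mapping a file path to its last modification time.

```python
def plan_sync(disk, backup):
    """Return (copy_to_backup, copy_to_disk) lists for a bi-directional copy."""
    to_backup, to_disk = [], []
    for path in sorted(set(disk) | set(backup)):
        d, b = disk.get(path), backup.get(path)
        if b is None or (d is not None and d > b):
            to_backup.append(path)   # only on disk, or disk copy is newer
        elif d is None or b > d:
            to_disk.append(path)     # only in backup, or backup copy is newer
        # equal modification times: nothing to copy
    return to_backup, to_disk

disk = {"a.txt": 100, "b.txt": 200, "c.txt": 50}
backup = {"b.txt": 300, "c.txt": 50, "d.txt": 10}
print(plan_sync(disk, backup))
```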

Verify menu

full

All backup files are read and checked for errors.


incremental
New backup files are read and checked for errors. "New" means any files written by an immediately prior backup. Files not modified are not checked.


compare

All backup files having the same modification time and size as their corresponding files on disk are read and compared with the disk. There should be no differences. This verifies that ukopp is working correctly. Other files are read and checked, but not compared to disk.
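
A minimal sketch of the two kinds of check, assuming nothing about ukopp's internals: verify_readable reads a file to the end to surface read errors, and verify_compare does a byte-for-byte comparison. Both function names are hypothetical, and the sketch skips the modification time and size pre-check described above.

```python
import filecmp
import os
import tempfile

def verify_readable(path, chunk_size=1 << 20):
    """Read the whole file; return False if any read error occurs."""
    try:
        with open(path, "rb") as f:
            while f.read(chunk_size):
                pass
        return True
    except OSError:
        return False

def verify_compare(backup_path, disk_path):
    """Byte-for-byte comparison of a backup file with its disk file."""
    return filecmp.cmp(backup_path, disk_path, shallow=False)

with tempfile.TemporaryDirectory() as tmp:
    disk_file = os.path.join(tmp, "disk.txt")
    backup_file = os.path.join(tmp, "backup.txt")
    for path in (disk_file, backup_file):
        with open(path, "wb") as f:
            f.write(b"same content")
    readable = verify_readable(backup_file)
    identical = verify_compare(backup_file, disk_file)
    missing_ok = verify_readable(os.path.join(tmp, "missing.txt"))

print(readable, identical, missing_ok)
```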



Report menu

get disk files

The backup job include and exclude records are listed, along with the file and byte counts that are added or removed. Look for zero counts, indicating a possible error. The disk directories are read at the time this command is executed, and the list of files included in the backup job is retained in memory. This data is used to determine which backup files are now out of date and must be copied again from disk. The file list is static and is not updated by disk activity. The list of "new" files for a subsequent incremental verify is also reset.


diffs summary
Report the total number of files in each category:

    new         disk files with no corresponding backup file

    modified    both files exist, but are not identical

    deleted     backup files with no corresponding disk file

    unchanged   both files exist and are identical


Differences between the disk and the backup files may be caused by disk updates (file additions, deletions, updates, or moves), or by changes to the backup job file itself.
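
The four categories can be illustrated with a short sketch. The function classify is hypothetical. It assumes each side is a dictionary mapping a file path to a (size, modification time) pair, and treats two files as identical when those pairs are equal.

```python
def classify(disk, backup):
    """Count files per difference category (illustrative, not ukopp code)."""
    counts = {"new": 0, "modified": 0, "deleted": 0, "unchanged": 0}
    for path in set(disk) | set(backup):
        if path not in backup:
            counts["new"] += 1         # disk file with no backup file
        elif path not in disk:
            counts["deleted"] += 1     # backup file with no disk file
        elif disk[path] == backup[path]:
            counts["unchanged"] += 1   # same size and modification time
        else:
            counts["modified"] += 1
    return counts

disk = {"/a": (10, 1000), "/b": (20, 2000), "/c": (30, 3000)}
backup = {"/b": (20, 2000), "/c": (31, 3500), "/d": (40, 4000)}
print(classify(disk, backup))
```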


diffs by directory
The above counts are reported for each directory having any differences between the disk and backup files.


diffs by file

List all different files, grouped in the first three categories above. If a file is present on both the disk and the backup location, and the backup file is newer than the disk file, then the file is flagged in a way that is easy to see. This can be normal if you use the synchronize function.


version summary

List backup files having old versions retained, with the range of versions and file ages (days) available. File age is days since the file was modified.


expired versions

List backup file versions that are expired and will be purged from the backup medium or location with the next backup run.


list disk files
All files in the backup file set are listed in alphabetic sequence. Use this to check that the correct files are being backed-up.


list backup files
All backup files are listed in alphabetic sequence. A summary of the space used for prior file versions is also provided.


find files
Enter a search pattern with optional wildcards (e.g. /home/dir*name/file*name).
All matching disk files and backup files are listed.


save screen

The main window, where messages and reports are written, is saved as an ordinary text file.
 

Restore menu

setup restore job

Specify the copy-from location (in the backup files), the copy-to location (disk), and the files to be restored.

The copy-from location is the topmost directory of a tree of files to be restored.

    example: /home/joeblow/documents        # mount point is omitted


The copy-to location is an existing disk directory where the tree of files will be copied-to.

    example 1: /home/joeblow/documents

    example 2: /home/joeblow/documents/restored

In example 1, the restored files will go back to the same place they were when backed-up.
In example 2, they will go to a new place.
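
The mapping from a backup file to its restored location can be sketched as follows. This is an assumption about how the two locations combine (the tree below copy-from is re-created below copy-to), and restored_path is a hypothetical helper, not part of ukopp.

```python
import posixpath

def restored_path(backup_file, copy_from, copy_to):
    """Map a file under copy_from to its destination under copy_to."""
    rel = posixpath.relpath(backup_file, copy_from)
    if rel == ".." or rel.startswith("../"):
        raise ValueError(f"{backup_file} is not under {copy_from}")
    return posixpath.join(copy_to, rel)

src = "/home/joeblow/documents/letters/a.txt"
print(restored_path(src, "/home/joeblow/documents", "/home/joeblow/documents"))
print(restored_path(src, "/home/joeblow/documents", "/home/joeblow/documents/restored"))
```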

Files to be restored are specified the same way as in a backup job (see the section below on using the file selection dialog).

If you need to restore multiple trees of files, you can do this in multiple runs, or you can simply begin the tree at a higher level and use the file selection dialog to specify multiple sub-trees, with included and excluded branches.

list restore files
After performing the file restore setup above, use this function to list all matching files that will be restored, at the locations where they will be restored. You should check this list carefully to be sure you are restoring the correct files to the intended locations.


restore files
When you are satisfied with the restore job specification, use this menu to start the restore. You will see a running log of the activity. The file owners and permissions are automatically restored, even if the backup files are on a FAT file system.


Format menu


format device

This is a convenient way to initialize a portable memory device such as a USB stick or SD card for use with ukopp. You may select the vfat (Microsoft) or ext2 (Linux) file system. You may choose from all known devices, mounted or unmounted. You may also choose a device label which will show under the device desktop icon if automatic mounting is enabled. Before format begins, you are shown which device will be formatted and given an opportunity to stop. Be sure you format the correct device, since all data on this device will be lost!


Microsoft vfat works somewhat faster than ext2 for USB devices, for reasons not clear to me. The disadvantage is that some of the strange file names typically found in Linux hidden directories are not vfat compatible and will not copy (error messages are produced and the backup job continues). Use ext2 if you must copy these files. Use vfat if you must exchange files with a Windows computer.


Editing backup jobs

The [edit job] button starts the job edit dialog. See screenshot below.


include and exclude records

You may edit the backup job (the include and exclude records) directly in the text window. You may also use the browse button to start a file selection dialog. This dialog has the following buttons: hidden, include, exclude. The [hidden] button toggles the display of hidden files. Select one or more directories or files, using left-mouse or Ctrl+left-mouse, then press the [include] or [exclude] button. The selected directories or files will be written into the text window as include or exclude records. If you select a directory, the entry is modified to add a wildcard at the next level, e.g. selecting /aaa/bbb/ccc and then pressing [include] generates include /aaa/bbb/ccc/*.


You may alternate between editing the text window and using the file-chooser dialog. When you are done, press [done] to accept. The include / exclude records will be validated to the extent possible. Re-edit to fix any problems. To change the sequence, cut and paste in the text window. When you are done, use the report functions "get disk files" and "list disk files" to verify that you have the correct files!

The include and exclude records allow precise control of the backup file set, allowing you to quickly converge on the desired results:

    include /aaa/bbb/*             # include file tree under /aaa/bbb/

    exclude /aaa/bbb/ccc/*         # exception: exclude /ccc/ subtree

    include /aaa/bbb/ccc/xxx.yyy   # exception: include file /ccc/xxx.yyy


Because of wildcards, newly added files within the scope of existing include or exclude records are automatically comprehended. In the above example, if a new file is added in /aaa/bbb/* then it will be automatically included in the next backup job.


retain records
With no version retention, a modified disk file replaces the corresponding backup file, and a deleted disk file causes the corresponding backup file to be deleted. If you wish to retain previous file versions, place retain records anywhere in the sequence of include and exclude records.

The format is "retain ddd vvv filespec", where ddd is the number of days that old versions should be retained, and vvv is the number of versions that should be retained. If no filespec is present, the retain specs apply to all following files. If present, there must be at least one wildcard in the filespec, and the retain specs apply only to matching files. Files selected with an include record will inherit their retain policy from the last preceding retain record with a matching filespec. Place one at the beginning to apply a default policy to all files. Place more retain records within the list to change the policy for different files.

The retain logic is not obvious, so please pay attention: old file versions are deleted only when they are older than BOTH retain rules, i.e. older than ddd days, and not within the most recent vvv versions. You can disable either of the limits by using zero (retain zero versions or days).

Here are some examples that will hopefully make this clear:

    retain 10 3            # delete versions older than 10 days and older than the 3 newest versions

    retain 10 0            # delete all versions older than 10 days
    retain 0 8             # delete versions older than the 8 newest versions
    retain 30 9 *.c        # retain all C program files at least 30 days and 9 versions

    retain 5 2 */docs/*    # applies to all files within a directory named docs
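
The "both rules" logic can be checked with a small sketch. The function expired_versions is hypothetical. It assumes old versions are given as a dictionary mapping version number to age in days, with higher numbers being newer.

```python
def expired_versions(versions, days, count):
    """Return version numbers that fail BOTH retain rules and may be deleted."""
    newest_first = sorted(versions, reverse=True)
    kept_by_count = set(newest_first[:count])   # the 'count' newest versions
    return sorted(
        number for number, age in versions.items()
        if age > days and number not in kept_by_count
    )

# version number -> age in days (version 5 is the newest old version)
versions = {1: 40, 2: 20, 3: 12, 4: 5, 5: 1}
print(expired_versions(versions, 10, 3))   # retain 10 3
print(expired_versions(versions, 10, 0))   # retain 10 0
print(expired_versions(versions, 0, 8))    # retain 0 8
```

With "retain 10 3", version 3 is older than 10 days but survives because it is among the 3 newest versions.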


target record
This optional record can be used to specify a target device and/or directory associated with the job file.
This will be overridden if a backup device is selected using the [target] button. After the directory, you can specify an optional method for flushing the memory I/O cache between backup and verify. The two methods are "sync" (use the Linux sync command) and "remount" (unmount and remount the device containing the backup target directory). The format of the target record is as follows:
   target /dev/sdf1 /directory  sync   (or)   target /dev/sdf1 /directory  remount           (v.3.2)
If no flush option is given,  sync will be used by default. Either the device or directory may be omitted and ukopp will find and use a given device's existing mount point, or the mounted device corresponding to a given directory, if this exists. If the target record has insufficient information, you must use the [target] button to select a device and directory.
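
A sketch of how such a record could be split into its parts. This is not ukopp's parser: parse_target is a hypothetical name, a device is recognized by a "/dev/" prefix, and directories containing blanks are not handled.

```python
def parse_target(record):
    """Split a target record into (device, directory, flush method)."""
    tokens = record.split()
    if not tokens or tokens[0] != "target":
        raise ValueError("not a target record")
    device = directory = None
    flush = "sync"                       # documented default
    for token in tokens[1:]:
        if token in ("sync", "remount"):
            flush = token
        elif token.startswith("/dev/"):  # assumption: devices live under /dev
            device = token
        else:
            directory = token
    return device, directory, flush

print(parse_target("target /dev/sdf1 /directory  remount"))
print(parse_target("target /dev/sdf1"))
```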

verify record
Place a verify record at the end. Format: "verify xxxx" where xxxx is one of the following:
 
    none          no automatic verify after backup (use the verify menu instead)
    incremental   verify all files copied by the backup job (i.e. new and modified files)
    full          read all backup files to check data integrity
    compare       full + compare all backup files to corresponding disk files (if present)


ukopp job edit dialog



The [ edit job ] toolbar button pops up the left box. This can be edited directly: click anywhere in the text area and start writing. The right box is the choose files dialog, which is started with the browse button in the left box. Choose files using the right box, and the left box records your choices. You can navigate around the directory hierarchy and select any number of files or directories. The hidden button toggles the display of hidden files. Click one of the include or exclude buttons to get the selected files added to or removed from the backup list. Selecting a directory is an implied selection of all its contained files, thus the selection appears as directory/* in the list of selected files. To make an exception, go down one level, select some files, and select the opposite include or exclude button. You can refine the file selections manually if desired. It is sometimes handy to use wildcards in the directories to make more general and compact selection criteria, e.g. "exclude *thunderbird*/Trash" will omit trashed mail even if the overlying directories change (they do) and even for multiple users.


You can add comments (or disable a record) by putting # in column 1.


Annotated example of a backup job file

This is an example of what one might do to backup all personal files. In this example, we avoid backing up stuff that is not important (browser cache) or stuff that can be automatically regenerated (gnome thumbnails). Old file versions should be retained for the periods and version counts specified. All files copied during this run should be read and verified. Files not copied (because they have not changed since the last backup) are not verified. The backup target or location is a USB disk that, when plugged-in, mounts at /media/disk (which can be changed at run time if desired).

    retain 30 3                 # retain at least 3 versions for at least 30 days

    retain 60 9 */docs/*        # for this directory, longer retentions

    include /home/rosi/*        # include all of Rosi's personal files

    exclude */.thumbnails/*     # omit gnome thumbnail files

    exclude */firefox/*Cache/*  # omit the browser cache files

    verify incremental          # verify all files copied by this run only

    target /dev/sdf1            # use removable USB thumb drive sdf1

The above backup job can be created using the following steps:
        include /home/rosi/*
        exclude /home/rosi/.thumbnails/*
        exclude /home/rosi/.mozilla/firefox/xxxxx.default/Cache/* ("xxxxx" means the random directory name that firefox generates for a user)
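
For readers who want to process job files with their own scripts, here is a minimal reader. It is a sketch under stated assumptions, not ukopp's parser: parse_job is a hypothetical name, and stripping trailing " # ..." annotations (as they appear in the examples in this guide) is an assumption about the file format.

```python
def parse_job(text):
    """Return a list of (keyword, arguments) records from job file text."""
    records = []
    for line in text.splitlines():
        if line.startswith("#"):                # comment or disabled record
            continue
        line = line.split(" #", 1)[0].strip()   # assumption: drop trailing annotation
        if not line:
            continue
        keyword, _, rest = line.partition(" ")
        records.append((keyword, rest.strip()))
    return records

job_text = """\
retain 30 3                 # default retention
# exclude */old/*
include /home/rosi/*        # all of Rosi's personal files
exclude */.thumbnails/*
verify incremental
"""
for record in parse_job(job_text):
    print(record)
```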


Technical Notes

Symlink files:
starting with version 3.0, symlink files are no longer discarded, but treated like regular files. They are copied if included in the backup job. The target file of an included symlink is NOT automatically included. A target file is included only if its own file name is included in the backup job. Symlinks are verified by checking they are readable using function readlink(). If the target file system is vfat, symlinks will not copy and will be reported as errors.
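
The readlink() check can be illustrated in Python, whose os.readlink wraps the same system call. The helper name symlink_is_readable is hypothetical. Note that the check succeeds even when the link's target does not exist, since only the link itself is read.

```python
import os
import tempfile

def symlink_is_readable(path):
    """Return True if path is a symlink whose contents can be read."""
    try:
        os.readlink(path)
        return True
    except OSError:        # not a symlink, missing, or unreadable
        return False

with tempfile.TemporaryDirectory() as tmp:
    link = os.path.join(tmp, "link")
    regular = os.path.join(tmp, "regular.txt")
    os.symlink("target-that-may-not-exist", link)
    open(regular, "w").close()
    link_ok = symlink_is_readable(link)
    regular_ok = symlink_is_readable(regular)

print(link_ok, regular_ok)
```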

Running ukopp as root: ukopp will only copy files for which the user has read access. If files belonging to root or other users are to be copied, you must run ukopp as root. Use "su" or "sudo", or log in as root (see the note below about making a launcher to handle this).


Command line arguments:

    $ ukopp -job jobfile      # load job file

    $ ukopp jobfile           # load job file

    $ ukopp -run jobfile      # load job file and run it

If the jobfile name contains blanks, quotes are required, e.g. $ ukopp -job "my ukopp job"
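
If ukopp is started from a script, the quoting can be generated rather than typed. The sketch below only builds the command string using the -job and -run options shown above. The helper name ukopp_command is hypothetical.

```python
import shlex

def ukopp_command(jobfile, run=False):
    """Build a shell-safe ukopp command line for the given job file."""
    option = "-run" if run else "-job"
    return shlex.join(["ukopp", option, jobfile])

print(ukopp_command("my ukopp job"))
print(ukopp_command("daily.ukopp", run=True))
```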


File type association:
I suggest using the extension .ukopp for job files and specifying ukopp as the "start with" program. Then you can click on a job file and launch ukopp.


Desktop launcher:
a desktop icon / launcher may contain a command like this:

    gksu /usr/local/bin/ukopp -job myjob.job
"gksu" will ask for the root or administrator password and run the job as root.


Incremental backups:
a backup file is considered identical to its corresponding disk file if their lengths and modification times are the same. Incremental backups exclude such files. If the modification times differ by less than 1 second they are considered equal. 1 second is the time resolution for a Microsoft vfat (FAT32) file system, usually present by default on detachable drives.
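
The rule can be written as a one-line test. This is a sketch of the comparison as described above, with a hypothetical function name. Each file is represented by a (length, modification time in seconds) pair.

```python
def is_unchanged(disk_stat, backup_stat, tolerance=1.0):
    """True if lengths match and modification times differ by less than tolerance."""
    (disk_size, disk_mtime), (backup_size, backup_mtime) = disk_stat, backup_stat
    return disk_size == backup_size and abs(disk_mtime - backup_mtime) < tolerance

print(is_unchanged((1024, 1000.7), (1024, 1000.0)))   # within tolerance
print(is_unchanged((1024, 1002.0), (1024, 1000.0)))   # modified later
print(is_unchanged((2048, 1000.0), (1024, 1000.0)))   # length differs
```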


Restoring file owner and permissions:
A detachable drive file system may not support Linux file owner and permissions (e.g. Microsoft FAT). The ukopp backup function copies a special file to the backup location, with the data needed to restore file owner and permissions. The ukopp restore and synchronize functions use this file.
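
The idea of saving and re-applying owner and permissions can be sketched as follows. The data layout here is purely hypothetical and does not reflect the format of ukopp's own file. The function names are also assumptions.

```python
import os
import stat
import tempfile

def save_attributes(paths):
    """Record (uid, gid, permission bits) for each path."""
    saved = {}
    for path in paths:
        st = os.lstat(path)
        saved[path] = (st.st_uid, st.st_gid, stat.S_IMODE(st.st_mode))
    return saved

def restore_attributes(saved):
    """Re-apply recorded owner and permissions."""
    for path, (uid, gid, mode) in saved.items():
        try:
            os.chown(path, uid, gid)
        except PermissionError:
            pass                    # changing ownership generally requires root
        os.chmod(path, mode)

with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.txt")
    open(sample, "w").close()
    os.chmod(sample, 0o640)
    saved = save_attributes([sample])
    os.chmod(sample, 0o600)         # simulate permissions lost in transit
    restore_attributes(saved)
    restored_mode = stat.S_IMODE(os.lstat(sample).st_mode)

print(oct(restored_mode))
```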


Special ukopp files:
A directory named ukopp-data is written to the backup location.
It contains the following three files:

    datetime            backup date-time

    poopfile            owner and permissions data for all files

    jobfile             a copy of the backup job file used

These are ordinary text files which you can view with an editor.


Special file types:
pipes, devices, and sockets are not copied. Symlinks are handled as described under "Symlink files" above.


Duplicate files:
If job file "include" records overlap, resulting in duplicate files in the backup set, this is reported and the backup does not proceed.


GTK thread locking:
the functions zlock() and zunlock() are used to surround GTK function calls and make them thread-safe. These locking functions have also been coded to do nothing if called from the main() thread, and to detect and avoid redundant locking (a fatal bug) if there are nested calls.


Finding disk drives:
the Linux utility udevinfo is used to find block devices with the characteristics "disk". The file /etc/mtab is used to find mount points.
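
Looking up mount points in /etc/mtab can be sketched as below. This is an illustration, not ukopp's code: parse_mtab is a hypothetical helper that keeps only entries whose device name starts with "/dev/" and decodes the octal escapes (such as \040 for a blank) used in that file.

```python
import re

def decode_octal(field):
    """Decode \\NNN octal escapes as used in mtab fields."""
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)

def parse_mtab(text):
    """Return a dict mapping device name to mount point."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].startswith("/dev/"):
            continue                # skip proc, tmpfs and similar entries
        mounts[decode_octal(fields[0])] = decode_octal(fields[1])
    return mounts

sample = "\n".join([
    "/dev/sdf1 /media/disk vfat rw 0 0",
    "proc /proc proc rw 0 0",
    r"/dev/sdg1 /media/my\040stick ext2 rw 0 0",
])
print(parse_mtab(sample))
```

On a real system the text would come from reading /etc/mtab.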


Removing detachable drives:
To remove a detachable drive, right click on its icon, select "eject", and wait for the "OK to remove" message, or the LED on the drive to stop blinking. Pulling the drive out without doing this can result in data corruption or total loss.

File system cache: The library function sync() is called after a backup and before the verify begins. This causes all the cached data in memory to be written to the backup medium before the program can proceed. This is done to assure that the verify function is reading from the medium and not from cache memory, which would be pointless for verifying the medium.

(poor) Linux error codes:
Linux error codes can be misleading. If an attempt is made to open a file that is already open and is therefore locked, the error text is "no such file or directory". There are several such screwups in Linux. This will hopefully improve over time.

Funny file names: Disk drives formatted with the vfat file system (Microsoft FAT) will not accept some Linux file names. Notably, file names containing " : " or " ? " or ending with a blank will fail to copy, and this will be reported in the backup job. Unless you need Microsoft compatibility, format the drive with ext2, or avoid trying to copy all the really strange file names you can find among the hidden files in your home directory.
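
To find such names before running a backup, a check like the following could be used. It covers only the cases named above (":", "?", and a trailing blank). The function name is hypothetical, and vfat rejects further characters that this sketch does not test.

```python
def vfat_problem(file_name):
    """True if the name contains ':' or '?' or ends with a blank."""
    return ":" in file_name or "?" in file_name or file_name.endswith(" ")

names = ["notes.txt", "12:30 meeting.txt", "what?.doc", "trailing "]
print([name for name in names if vfat_problem(name)])
```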

Retention and version limits:
The retention limits are 9999 days and 9999 versions. As an example, if the version limit were set to 100, retained versions for a file could reach 9899 to 9999 before ukopp stopped working. These limits are easy to increase, but performance would start to deteriorate long before this. If you reach 1000 retained versions it is time to start over (erase the medium).