$BLOGNAME: ComicCheck - Extension checker for cbr and cbz files

Having recently moved city, I left behind all of my dead-tree copies of books and comics/manga. I'm working around it with my old-ish Sony Reader and an iPad 1 I won. I did a quick write-up of using the iPad in Gentoo on the new wiki, but if you have a recent version of iOS, you'll need to build from version control (ebuilds to do this are available from my overlay).

I use CloudReaders to read comics, as it's nice and fast, even for images-only PDFs. The price (free) is right too! I use it both for textbooks and for comics. The only problems that I've come across are that it gets slow on launch if there are many files for it to parse through, and that there's no way to remove labels from its dictionary (even if there are no labelled files left). I'm also stuck with a blank label, which I now can't get rid of. Aside from that, it does the job perfectly well, with good responsiveness, zooming and auto-scaling. It's actually the best software PDF/comic reader I've ever used (although the tablet form factor is a major plus for it).

There are two further problems that I've encountered using it, which are only somewhat its fault:

It uses the archive order for zip and rar files, not the sorted order.
It relies on the file extension, and many cbr and cbz files have the wrong one. (That's comicbook-rar and comicbook-zip, but you probably knew that.)

I tend to realise that a cbr should be a cbz only at the moment I try to read it, and there is no way to rename files on the go, so I wrote a script that checks comics and renames them if the appropriate unarchiver can't open them. Note that it depends entirely on the messages printed by app-arch/unrar and app-arch/unzip as they exist today, so it may well fail in the future. Using exit codes would be nicer, and I no longer recall why I didn't; perhaps unrar didn't feel like co-operating. The script doesn't overwrite existing files, and has the endearing property of renaming completely broken files on every run (so keep an eye on its output).
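As an alternative to parsing tool output, the container type can be read from the first few bytes of the file. This is a minimal sketch rather than what the script does: the function name detect_type is made up for illustration, and it only recognises the usual signatures ("PK\003\004" for zip, "Rar!" for rar).

```shell
#!/bin/bash

# detect_type FILE (hypothetical helper, not part of comiccheck)
# Prints "zip", "rar" or "unknown" based on the file's first four bytes.
detect_type() {
 local magic
 # hex-dump the first four bytes with all whitespace stripped
 magic=$(head -c 4 -- "$1" | od -An -tx1 | tr -d ' \n')
 case "${magic}" in
  504b0304) echo zip ;;      # "PK\003\004", zip local file header
  52617221) echo rar ;;      # "Rar!"
  *) echo unknown ;;
 esac
}
```

An empty or truncated file comes out as "unknown", which would let a script leave completely broken files alone instead of renaming them back and forth.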

The people packing these archives don't seem to be aware that an archive has its own internal order, which isn't necessarily lexical, so sometimes the pages come out in whatever order the directory entries happened to be in when the archive was created. CloudReaders doesn't sort the extracted files before rendering them. I might patch the script to re-pack the archives in future, depending on how prevalent this is.
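Whether a given archive is affected can be checked by testing its listing for sortedness. A rough sketch: in_sorted_order is a name invented here, it reads one entry name per line on stdin, and it uses plain byte-wise ordering, which may not match what you'd consider the right page order (page10 sorts before page2, for instance).

```shell
#!/bin/bash

# in_sorted_order (hypothetical helper): succeeds if the names on stdin,
# one per line, are already in byte-wise sorted order.
in_sorted_order() {
 LC_ALL=C sort -c 2>/dev/null
}

# Possible use for a zip-based comic, assuming Info-ZIP's unzip is installed
# (-Z1 lists entry names only, in archive order):
#   unzip -Z1 foo.cbz | in_sorted_order || echo "foo.cbz: pages out of order"
```

Running something like this over a collection would give a rough idea of how prevalent the problem is before bothering with re-packing.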

Oh, and the usual security caveats apply (although what you're doing managing comics on your iPad on a secure server is entirely your business... see this for a security flaw in usbmuxd which would allow someone with a specially-rigged iPad to run arbitrary code).

Available to view through Google Code or below.

Normal Operation:

$ ls
bar.cbz  foo.cbr
# foo.cbr is really a zip, bar.cbz is really a rar

$ comiccheck
!cbr: foo.cbr
!cbz: bar.cbz

$ ls
bar.cbr  foo.cbz

$ comiccheck
$

#!/bin/bash

# Patterns that match nothing expand to nothing, not to themselves.
shopt -s nullglob

# swap_type FROM TO (eg. swap_type cbr cbz)
# Rename every *.FROM file that the FROM unarchiver rejects to *.TO.
swap_type() {
 local from="${1,,}" to="${2,,}" i failtxt toname

 case "${from}" in
  cbr|cbz) ;;
  *) echo "I don't know that type!" ; return 1 ;;
 esac

 for i in *."${from}" *."${from^^}" ; do
  # Detection relies on the tools' error text, not on their exit codes.
  if [ "${from}" == cbr ] ; then
   failtxt=$(unrar l "${i}" | grep -i 'is not')
  else
   failtxt=$(unzip -l "${i}" 2>&1 | grep 'cannot find zipfile')
  fi

  # non-empty error text: not a FROM archive, so rename it to TO
  if [ -n "${failtxt}" ] ; then
   echo "!${from}: ${i}"
   # strip the old extension whatever its case, then add the new one
   toname="${i%.*}.${to}"
   if ! [ -e "${toname}" ] ; then
    mv "${i}" "${toname}"
   else
    echo "${i} is not a ${from}, but not overwriting ${toname}"
   fi
  fi
 done
}

swap_type cbr cbz
swap_type cbz cbr

As usual, comments and patches are welcome.

$BLOGNAME

2012-03-12
