I bought the Ricoh SP112 printer when it was on sale, thinking that it would meet all of my printing needs. In the past, printer support in GNU/Linux could be outright terrible: vendors offered no drivers at all, or, if they did, the drivers were closed-source software requiring outdated libraries and the like. This was in 2016, and by then most printers did actually work with GNU/Linux out of the box without additional fuss. In my naivety, I thought the SP112 would also work out of the box in GNU/Linux. Well, I was wrong.
The Issue
Vendors sometimes decide to use GDI for their (lower-end) printers instead of PostScript or PCL, which are otherwise standard for printers. I think it is a cost-cutting measure: with GDI, the print rendering is done on the computer, whereas otherwise the printer does the rendering itself. Wikipedia describes GDI for printers like this:
The host computer does all print processing: the GDI software renders a page as a bitmap which is sent to a software printer driver, usually supplied by the printer manufacturer, for processing for the particular printer, and then to the printer. The combination of the GDI and the driver is bidirectional; they receive information from the printer such as whether it is ready to print, if it is out of paper or ink, and so on.
Ricoh only supports Windows for its GDI printers, so my printer was essentially a paperweight. Faced with this, I decided to research whether I would be able to write a driver for my printer on my own.
Virtualization And Wireshark To The Rescue
The SP112 is a USB-only printer, so I had to dump the USB data packets to be able to see what was going on. Dumping USB data packets on Windows was not a pleasant experience (and surely isn't today either), so I resorted to GNU/Linux and chose to use VirtualBox. It's easy to attach USB ports to VirtualBox, so that went fine. My next issue was to actually be able to read the USB data sent to the printer.
Luckily there is usbmon, a Linux kernel module that can capture USB traffic. Wireshark is well known as a network packet inspector, but it is less well known that it can also capture from usbmon. Getting it to work required some minor hassle.
The usbmon devices are numbered by the respective USB buses and I figured out the relevant USB bus for the printer with the help of lsusb:
[ghaglund@pc ~]$ lsusb
[...]
Bus 001 Device 008: ID 05ca:0447 Ricoh Co., Ltd SP 112
[...]
[ghaglund@pc ~]$
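The mapping from the lsusb line to a capture interface can be sketched in a few lines of Python. This is only an illustration: the function name `usbmon_interface` is made up here, and it assumes the capture interface is named `usbmon` followed by the bus number without leading zeros.

```python
import re

# Matches lines such as "Bus 001 Device 008: ID 05ca:0447 Ricoh Co., Ltd SP 112".
LSUSB_LINE = re.compile(
    r"^Bus (?P<bus>\d+) Device (?P<dev>\d+): "
    r"ID (?P<vid>[0-9a-fA-F]{4}):(?P<pid>[0-9a-fA-F]{4})"
)


def usbmon_interface(lsusb_output, vendor_id, product_id):
    """Return the usbmon interface name for the first matching device, or None.

    Hypothetical helper: assumes interfaces are named usbmon<bus number>.
    """
    for line in lsusb_output.splitlines():
        match = LSUSB_LINE.match(line.strip())
        if not match:
            continue
        same_vendor = match["vid"].lower() == vendor_id.lower()
        same_product = match["pid"].lower() == product_id.lower()
        if same_vendor and same_product:
            # int() drops the leading zeros: "001" becomes 1.
            return f"usbmon{int(match['bus'])}"
    return None


sample = "Bus 001 Device 008: ID 05ca:0447 Ricoh Co., Ltd SP 112"
print(usbmon_interface(sample, "05ca", "0447"))
```

For the line shown above this gives usbmon1, which is the interface to pick in Wireshark.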
With all this sorted out, I could finally start to reverse engineer the Ricoh-supplied printer driver.
The Data Sent To The Printer
\1b%-12345X@PJL\r\n
@PJL SET TIMESTAMP=[...]\r\n
@PJL SET FILENAME=[...]\r\n
@PJL SET COMPRESS=JBIG\r\n
@PJL SET USERNAME=ghaglund\r\n
@PJL SET COVER=OFF\r\n
@PJL SET HOLD=OFF\r\n
@PJL SET PAGESTATUS=START\r\n
@PJL SET COPIES=1\r\n
@PJL SET MEDIASOURCE=AUTO\r\n
@PJL SET MEDIATYPE=PLAINRECYCLE\r\n
@PJL SET PAPER=A4\r\n
@PJL SET PAPERWIDTH=[...]\r\n
@PJL SET PAPERLENGTH=[...]\r\n
@PJL SET RESOLUTION=[...]\r\n
@PJL SET IMAGELEN=[...]\r\n
[jbig...]
@PJL SET DOTCOUNT=[...]\r\n
@PJL SET PAGESTATUS=END
@PJL EOJ\r\n
%-12345X\r\n
Well, it's PJL wrapped around the binary raster that is passed to the printer. I was unfamiliar with PJL at first; it's apparently a standard among printers alongside PCL (which is another language).
Wikipedia describes PJL like this:
Printer Job Language (PJL) is a method developed by Hewlett-Packard for switching printer languages at the job level, and for status readback between the printer and the host computer. PJL adds job level controls, such as printer language switching, job separation, environment, status readback, device attendance and file system commands.
It's not particularly convenient to constantly send pages to the printer, so I chose "print to a file" in Windows to get what the Ricoh driver wanted to pass to the printer. The rest of the examination of the print data took place in GHex, a simple hex editor I like to use.
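A printed-to-file job can also be taken apart programmatically. The following is a minimal Python sketch, not a definitive parser: it assumes that IMAGELEN holds the byte count of the JBIG data and that the data starts right after the CRLF ending the IMAGELEN line, neither of which is confirmed by the capture alone. The function name `split_pjl_job` and the sample job (including its fake payload and DOTCOUNT value) are made up for illustration, and only the first page of a job is handled.

```python
def split_pjl_job(data: bytes):
    """Split a captured job into (settings, raster_payload) for its first page.

    Assumption: IMAGELEN is the payload length in bytes and the payload
    follows directly after the IMAGELEN line.
    """
    prefix = "@PJL SET "
    settings = {}
    pos = 0
    while True:
        end = data.find(b"\r\n", pos)
        if end == -1:
            raise ValueError("no IMAGELEN line found")
        line = data[pos:end].decode("ascii", errors="replace")
        pos = end + 2  # skip the CRLF
        if line.startswith(prefix) and "=" in line:
            key, value = line[len(prefix):].split("=", 1)
            settings[key] = value
            if key == "IMAGELEN":
                length = int(value)
                return settings, data[pos:pos + length]


# Synthetic job for demonstration only; the payload is not real JBIG data.
fake_payload = b"\x00\x01\r\n\x02\x03"
job = b"".join([
    b"\x1b%-12345X@PJL\r\n",
    b"@PJL SET COMPRESS=JBIG\r\n",
    b"@PJL SET PAPER=A4\r\n",
    b"@PJL SET IMAGELEN=" + str(len(fake_payload)).encode("ascii") + b"\r\n",
    fake_payload,
    b"@PJL SET DOTCOUNT=0\r\n",
])

settings, raster = split_pjl_job(job)
print(settings["COMPRESS"], len(raster))
```

Reading the payload by length rather than searching for the next @PJL line avoids being fooled by raster bytes that happen to look like a line ending.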
JBIG is a lossless image compression algorithm, like GIF, but designed for bi-level (black-and-white) images. I wasn't sure if it was JBIG or JBIG2, so I had to test various utilities. Finally, jbgtopbm seemed to work with the raw binary data. It spat out an image in the PBM format, which curiously enough is a text-based format.
The PBM format was invented by Jef Poskanzer in the 1980s as a format that allowed monochrome bitmaps to be transmitted within an email message as plain ASCII text, allowing it to survive any changes in text formatting.
I had to investigate this PBM image, I thought. There seemed to be nothing special about the PBM file, as it contained only colour descriptions.
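For a quick look at such a file without a hex editor, the header can be read in Python. This is a small sketch with a made-up function name, `read_pbm_header`. It relies on the Netpbm convention that a PBM file starts with a magic number (P1 for the plain ASCII variant, P4 for the packed binary variant), followed by the width and height as decimal text, with `#` starting a comment.

```python
def read_pbm_header(data: bytes):
    """Return (magic, width, height) read from the start of a PBM file."""
    tokens = []
    pos = 0
    while len(tokens) < 3:
        if pos >= len(data):
            raise ValueError("truncated PBM header")
        ch = data[pos:pos + 1]
        if ch == b"#":
            # A comment runs to the end of the line.
            newline = data.find(b"\n", pos)
            pos = len(data) if newline == -1 else newline + 1
        elif ch.isspace():
            pos += 1
        else:
            start = pos
            while pos < len(data) and not data[pos:pos + 1].isspace():
                pos += 1
            tokens.append(data[start:pos])
    magic = tokens[0].decode("ascii", errors="replace")
    if magic not in ("P1", "P4"):
        raise ValueError(f"not a PBM file: {magic!r}")
    return magic, int(tokens[1]), int(tokens[2])


plain = b"P1\n# tiny example\n3 2\n0 1 0\n1 0 1\n"
print(read_pbm_header(plain))
```

In the plain variant, everything after the header is just a sequence of 0 and 1 characters, one per pixel.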
What about the JBIG data then? With the -d option, jbgtopbm prints the properties stored in the header of a JBIG file (the BIH), so I ran it on the JBIG file.
[ghaglund@pc]$ jbgtopbm -d jbig.jbig
BIH:
DL = 0
D = 0
P = 1
- = 0
XD = 4961
YD = 7016
L0 = 128
MX = 0
MY = 0
order = 3 ILEAVE SMID
options = 72 LRLTWO TPBON
55 stripes, 1 layers, 1 planes => 55 SDEs
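The same fields can be read directly from the first 20 bytes of the JBIG data. The sketch below assumes the header layout of the JBIG1 standard (ITU-T T.82): four single-byte fields, three big-endian 32-bit fields and four more single-byte fields. The flag names are the ones printed in the output above, with bit values as JBIG-KIT defines them to the best of my knowledge. The function name `parse_bih` is made up, and the sample header is packed by hand from the values in the dump rather than taken from a real file.

```python
import struct

# Bit flags of the "order" and "options" bytes (assumed JBIG-KIT values).
ORDER_FLAGS = {0x08: "HITOLO", 0x04: "SEQ", 0x02: "ILEAVE", 0x01: "SMID"}
OPTION_FLAGS = {
    0x40: "LRLTWO", 0x20: "VLENGTH", 0x10: "TPDON", 0x08: "TPBON",
    0x04: "DPON", 0x02: "DPPRIV", 0x01: "DPLAST",
}


def flag_names(value, table):
    """List the names of all flags set in value."""
    return [name for bit, name in table.items() if value & bit]


def parse_bih(header: bytes):
    """Decode the 20-byte bi-level image header (BIH) of a JBIG stream."""
    if len(header) < 20:
        raise ValueError("BIH needs 20 bytes")
    dl, d, p, _fill, xd, yd, l0, mx, my, order, options = struct.unpack(
        ">BBBBIIIBBBB", header[:20]
    )
    # Height of the lowest resolution layer: halve YD (rounding up) D times.
    lowest_height = (yd + (1 << d) - 1) >> d
    stripes = -(-lowest_height // l0)  # ceiling division
    layers = d - dl + 1
    return {
        "DL": dl, "D": d, "P": p, "XD": xd, "YD": yd, "L0": l0,
        "MX": mx, "MY": my,
        "order": flag_names(order, ORDER_FLAGS),
        "options": flag_names(options, OPTION_FLAGS),
        "stripes": stripes, "layers": layers, "planes": p,
        "SDEs": stripes * layers * p,
    }


# Header rebuilt from the values in the dump above.
sample_bih = struct.pack(">BBBBIIIBBBB", 0, 0, 1, 0, 4961, 7016, 128, 0, 0, 3, 72)
info = parse_bih(sample_bih)
print(info["stripes"], info["order"], info["options"])
```

The stripe count follows from YD and L0: 7016 lines in stripes of 128 lines gives 55 stripes, the last one only partly filled. The dimensions are also consistent with an A4 page at 600 dpi (210 mm and 297 mm converted to inches and multiplied by 600 round to 4961 and 7016), though the resolution is elided in the capture above, so that is only an inference.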