Wednesday, March 26, 2014

Scanning from an eSCL Device Using Command Line

eSCL is the scan protocol used by HP and Apple. (IETF standards track even?) It uses XML over HTTP.
xmllint is from the libxml2-utils package (the Debian/Ubuntu package name).

Get Scanner Status

% curl -s http://localhost:8080/eSCL/ScannerStatus | xmllint --format -

Get Scanner Capabilities

% curl -s http://localhost:8080/eSCL/ScannerCapabilities | xmllint --format -

Start a scan job

% curl -v -X POST -d @scansettings.xml http://localhost:8080/eSCL/ScanJobs

The device should respond with a 201 and a Location header pointing at the new job. The Location ends with the job ID, which is either an integer or a UUID.
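
When scripting this, the job URL has to be pulled out of the response headers. A minimal sketch, assuming the device sends a standard Location header; the function name job_location is my own:

```shell
# Print the value of the Location header from HTTP response headers on stdin.
# Header names are case-insensitive and header lines end in CRLF, so strip
# the carriage returns and compare the name in lower case.
job_location() {
    tr -d '\r' | awk 'tolower($1) == "location:" { print $2; exit }'
}
```

Usage would look something like the following, where -D - makes curl print the response headers to stdout and -o /dev/null discards the body:

url=$(curl -s -o /dev/null -D - -X POST -d @scansettings.xml http://localhost:8080/eSCL/ScanJobs | job_location)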

Retrieve the scanned document (in this example, 208 is the job ID from the Location in the 201 response to the POST)

% curl -s http://localhost:8080/eSCL/ScanJobs/208/NextDocument > out.dat

The 'out.dat' file should be the scanned image: a JPEG, a PDF, or some other image format. JPEG is most likely.

% file out.dat
out.dat: JPEG image data, JFIF standard 1.01
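
Since the device may hand back JPEG or PDF, a script can pick the file extension from the detected MIME type. A small sketch; the function name ext_for_mime and the fallback to "dat" are my own choices, not anything eSCL defines:

```shell
# Map a MIME type (as printed by `file -b --mime-type`) to a file extension.
# Unknown types fall back to "dat" so the data is still kept.
ext_for_mime() {
    case $1 in
        image/jpeg)      echo jpg ;;
        application/pdf) echo pdf ;;
        image/png)       echo png ;;
        image/tiff)      echo tif ;;
        *)               echo dat ;;
    esac
}
```

For example: mv out.dat "scan.$(ext_for_mime "$(file -b --mime-type out.dat)")"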

Simple(ish) scansettings.xml

<?xml version="1.0" encoding="UTF-8"?>
<scan:ScanSettings xmlns:pwg="http://www.pwg.org/schemas/2010/12/sm" xmlns:scan="http://schemas.hp.com/imaging/escl/2011/05/03">
  <pwg:Version>2.0</pwg:Version>
  <pwg:ScanRegions>
    <pwg:ScanRegion>
      <pwg:Height>3300</pwg:Height>
      <pwg:ContentRegionUnits>escl:ThreeHundredthsOfInches</pwg:ContentRegionUnits>
      <pwg:Width>2550</pwg:Width>
      <pwg:XOffset>0</pwg:XOffset>
      <pwg:YOffset>0</pwg:YOffset>
    </pwg:ScanRegion>
  </pwg:ScanRegions>
  <pwg:InputSource>Platen</pwg:InputSource>
  <scan:ColorMode>Grayscale8</scan:ColorMode>
</scan:ScanSettings>
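
The region above is US Letter (8.5 x 11 inches) expressed in units of 1/300 inch: 8.5 x 300 = 2550 and 11 x 300 = 3300. For other paper sizes, a pair of helpers (hypothetical names) can do the conversion with integer shell arithmetic. This is only a sketch; whether a device accepts a given region depends on what it reports in ScannerCapabilities.

```shell
# Convert millimetres to 1/300 inch units, rounded to the nearest unit.
# 1 inch = 25.4 mm, so units = mm * 300 / 25.4. To stay in integers,
# multiply by 3000, add half the divisor (127), then divide by 254.
mm_to_units() {
    echo $(( ($1 * 3000 + 127) / 254 ))
}

# Convert tenths of an inch to 1/300 inch units (85 means 8.5 inches).
tenths_in_to_units() {
    echo $(( $1 * 30 ))
}
```

For A4 (210 mm x 297 mm), mm_to_units 210 prints 2480 and mm_to_units 297 prints 3508.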

Monday, March 24, 2014

Scanning from Apple AirPrint + AirScan

TL;DR: AirPrint/AirScan with Image Capture + AirScanScanner requires pdl=application/octet-stream in the mDNS TXT record. Otherwise, AirScanScanner silently fails on grayscale scans.


AirPrint Requires AirScan.


The Apple AirPrint specification 1.4 requires MFPs (Multi-Function Printers; printers with an attached scanner) to also allow scanning over the AirPrint connection. The required protocol is eSCL, an XML-over-HTTP protocol originally created by HP.

AirScan/eSCL devices advertise themselves over mDNS as _uscan._tcp. One of the fields in the TXT record is "pdl".

The documentation shows an example: "pdl=application/pdf,image/jpeg"

The AirPrint documentation says PDF and JPEG scanning are required. That's all that's mentioned.  (The pdl field becomes critically important later.)

OS X Image Capture finds network scanners through mDNS. The actual scanning is done by an executable called AirScanScanner:
/System/Library/Image Capture/Devices/AirScanScanner.app

Only Mavericks can successfully scan from an AirScan/eSCL device. Mountain Lion connects but fails.


Adding AirScan to Existing AirPrint Device.


I was tasked with adding AirScan support for our scanners.

We have an HP X576 that supports eSCL. Image Capture successfully scans grayscale and RGB JPEG from the HP through eSCL. However, when scanning from my code, Image Capture would only scan RGB. Grayscale would silently fail, with no output image.

The only clue was a log message:

Mar 14 16:22:24 latches.local Image Capture[5359]: ImageIO: CGImageSourceCreateWithURL url parameter is nil


The Plot Thickens.


After mimicking as much of the HP's HTTP+XML eSCL traffic as possible, I attacked AirScanScanner with the debugger and dtruss. AirScanScanner writes its temporary images to /var/folders/hw/<longname>/T. The incoming image is written to a temporary file, then moved to the user's Pictures folder.

A color scan (for both the HP and me):
rename("/var/folders/hw/vb001gqj3zv8vrlbr4tn8cqw0000gp/T/temp.awDE7YxI\0", "/Users/davep/Pictures/Scan 93.jpeg\0")         = 0 0

In the case of a grayscale scan, AirScanScanner's behavior deviates. From the HP, AirScanScanner writes a .ica file (perhaps an Image Capture Application intermediate format?):

rename("/var/folders/hw/vb001gqj3zv8vrlbr4tn8cqw0000gp/T/temp.K7RsxSBk\0", "/var/folders/hw/vb001gqj3zv8vrlbr4tn8cqw0000gp/T/Image Capture_TempScan.LLCnftRz/Scan 91.ica\0")         = 0 0

However, the dtruss traces showed AirScanScanner writing my incoming JPEG image as a TIFF:

rename("/var/folders/hw/vb001gqj3zv8vrlbr4tn8cqw0000gp/T/temp.jlIExPds\0", "/var/folders/hw/vb001gqj3zv8vrlbr4tn8cqw0000gp/T/Image Capture_TempScan.N8yDdG7e/Scan 96.tiff\0")         = 0 0

The .tiff is never moved to Pictures, and the scan's entire temp directory is never deleted (it leaks).


The PDL.


Eventually I found I was advertising pdl slightly differently than the HP. The HP response is shown below:

% dns-sd -L "HP Officejet Pro X576dw MFP [F7816C]" _uscan._tcp
Lookup HP Officejet Pro X576dw MFP [F7816C]._uscan._tcp.local
DATE: ---Fri 21 Mar 2014---
 9:41:34.872  ...STARTING...
 9:41:35.026  HP\032Officejet\032Pro\032X576dw\032MFP\032[F7816C]._uscan._tcp.local. can be reached at HP843497F7816C.local.:8080 (interface 4)
 txtvers=1 vers=2.0 pdl=application/octet-stream,application/pdf,image/jpeg ty=HP\ Officejet\ Pro\ X576dw\ MFP adminurl=http://HP843497F7816C.local. note= UUID=1c852a4d-b800-1f08-abcd-843497f7816c representation= rs=/eSCL cs=binary,color,grayscale is=platen,adf duplex=T

The "pdl=application/octet-stream,application/pdf,image/jpeg" turns out to be the key.

When I changed my advertisement to match the HP's, monochrome scanning suddenly began working.

If I changed my advertisement to only "pdl=application/octet-stream", scanning still worked.

For reasons unknown, "application/octet-stream" is required in the pdl field of the mDNS advertisement.
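
A quick way to check a device's advertisement is to split the TXT record and look for the type in the pdl list. A minimal sketch, assuming the TXT record is available as a single line of space-separated key=value pairs the way dns-sd -L prints it; the function names are my own:

```shell
# Print each MIME type from the pdl= field of a TXT record, one per line.
# Reads the space-separated key=value pairs on stdin.
pdl_types() {
    tr ' ' '\n' | sed -n 's/^pdl=//p' | tr ',' '\n'
}

# Succeed if the TXT record in $1 advertises the MIME type in $2.
# grep -x matches whole lines only and -F treats the type as a fixed string.
pdl_has_type() {
    printf '%s\n' "$1" | pdl_types | grep -qxF -- "$2"
}
```

For example, with the TXT line from the HP stored in $txt, pdl_has_type "$txt" application/octet-stream succeeds, while a record carrying only "pdl=application/pdf,image/jpeg" fails the same check.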



EnableLogging and SaveIntermediateFiles.


The AirScanScanner executable contains two very interesting strings: EnableLogging and SaveIntermediateFiles. I would love to be able to enable those debug features. I tried a few tricks with the AirScanScanner.plist but nothing happened. The Image Capture utility launches AirScanScanner, so I wasn't able to inject environment variables into it.


In Conclusion.


I'm mostly writing this to help the next firmware engineer tasked with adding scan support to an AirPrint device. eSCL is a beautifully simple protocol that simplifies scanning.