
I'm working on an application that allows users to generate certain PDF documents using templates, then sign those documents and store the signed documents for later verification.

With a completely external (detached, say RSA) signature, I can apply the following optimization: since each document is generated from a small structured dataset using a template, I can discard the document itself and store only the structured data along with the signature.

This greatly reduces disk space and throughput requirements, as a PDF file is 50-100 times bigger than its source data.
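
To make the detached variant concrete, here is a minimal sketch using only the JDK. It assumes the signature is computed over the regenerated document bytes; `render` is a hypothetical stand-in for the template engine, and `SHA256withRSA` is just one possible algorithm choice, not necessarily what the application uses.

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

final class DetachedSignatureSketch {

    // Hypothetical stand-in for the template engine. It must be deterministic:
    // the same source data always yields exactly the same bytes.
    static byte[] render(String sourceData) {
        return ("DOCUMENT[" + sourceData + "]").getBytes(StandardCharsets.UTF_8);
    }

    // Signs the rendered document; only sourceData and the result need storing.
    static byte[] sign(String sourceData, PrivateKey key) throws Exception {
        Signature signer = Signature.getInstance("SHA256withRSA");
        signer.initSign(key);
        signer.update(render(sourceData));
        return signer.sign();
    }

    // Re-renders the document from the stored data and checks the signature.
    static boolean verify(String sourceData, byte[] signature, PublicKey key) throws Exception {
        Signature verifier = Signature.getInstance("SHA256withRSA");
        verifier.initVerify(key);
        verifier.update(render(sourceData));
        return verifier.verify(signature);
    }

    public static void main(String[] args) throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair pair = generator.generateKeyPair();

        byte[] signature = sign("name=Alice", pair.getPrivate());
        System.out.println("original data verifies: " + verify("name=Alice", signature, pair.getPublic()));
        System.out.println("changed data verifies:  " + verify("name=Bob", signature, pair.getPublic()));
    }
}
```

The whole scheme depends on `render` being byte-for-byte repeatable, which is exactly the property that becomes hard to keep once the signature has to live inside the PDF.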

However, we recently got a requirement to use signatures embedded in the PDF files themselves. This is where iText comes to the rescue!

The idea is to have the following process:

  1. Generate the PDF on the server. This includes getting the source data, transforming it to PDF using a template, then using iText to insert an empty signature container into it.
  2. Send the PDF to the client for signing.
  3. Extract the user's signature from the PDF. Store it separately along with the source data, the template version, and the byte ranges used for signing.
  4. When a document is requested, generate the same document from the source data, then insert the stored signature using the previously recorded byte ranges.
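
Step 4 can be sketched without iText at all, as plain byte manipulation. This is only an illustration under assumptions: the recorded byte range has the usual PDF form [offset1, length1, offset2, length2], the gap between the two ranges holds the whole /Contents hex string including its `<` and `>` delimiters, and the regenerated PDF is byte-identical to the signed one outside that gap. The class and method names are made up for this example.

```java
import java.nio.charset.StandardCharsets;

final class SignatureReinsertionSketch {

    private static final byte[] HEX = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII);

    // Writes the stored signature, hex-encoded and zero-padded, into the gap
    // left between the two signed byte ranges of a regenerated PDF.
    static byte[] embed(byte[] pdf, long[] byteRange, byte[] signature) {
        if (byteRange.length != 4 || byteRange[2] + byteRange[3] != pdf.length) {
            throw new IllegalArgumentException("byte range does not match the regenerated PDF");
        }
        int gapStart = (int) (byteRange[0] + byteRange[1]);
        int gapEnd = (int) byteRange[2];
        int hexCapacity = gapEnd - gapStart - 2; // minus the '<' and '>' delimiters
        if (signature.length * 2 > hexCapacity) {
            throw new IllegalArgumentException("signature does not fit into the reserved space");
        }

        byte[] out = pdf.clone();
        int pos = gapStart;
        out[pos++] = '<';
        for (byte b : signature) {
            out[pos++] = HEX[(b >> 4) & 0x0F];
            out[pos++] = HEX[b & 0x0F];
        }
        while (pos < gapEnd - 1) {
            out[pos++] = '0'; // pad the rest of the reserved space
        }
        out[pos] = '>';
        return out;
    }

    public static void main(String[] args) {
        // Toy stand-in for a PDF: 4 signed bytes, a 10-byte placeholder, 4 signed bytes.
        byte[] pdf = "AAAA<00000000>BBBB".getBytes(StandardCharsets.US_ASCII);
        long[] byteRange = {0, 4, 14, 4};
        byte[] signature = {0x0A, (byte) 0xFF};

        byte[] signed = embed(pdf, byteRange, signature);
        System.out.println(new String(signed, StandardCharsets.US_ASCII)); // prints AAAA<0aff0000>BBBB
    }
}
```

The check on the byte range is what catches a regenerated document whose length differs from the one the client signed; it does not catch differences in content, which only the signature verification itself would reveal.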

So far, so good.

However, the problem is getting a repeatable result from the transformation chain Source Data -> PDF -> PDF + Signature Container.

The step Source Data -> PDF obviously works fine. The problem is with inserting the signature container. Each time I run the same code on the same PDF, I get a different result (in bytes) for the resulting PDF + Container.

I'm using roughly the following code to prepare the document for signing:

// Open the generated PDF and create a stamper in signature mode.
PdfReader reader = new PdfReader(resultDocument);
ByteArrayOutputStream baos = new ByteArrayOutputStream();
PdfStamper stamper = PdfStamper.createSignature(reader, baos, '\0');

// Describe the visible signature field; the sign date comes from storage.
PdfSignatureAppearance sap = stamper.getSignatureAppearance();
sap.setReason("Sign reason");
sap.setLocation("Sign location");
sap.setVisibleSignature(new Rectangle(36, 748, 144, 780), 1, "sig");
sap.setSignDate(externallyStoredSignDate);

// Build the signature dictionary for a detached PKCS#7 signature.
PdfSignature dic = new PdfSignature(
        PdfName.ADOBE_PPKLITE, PdfName.ADBE_PKCS7_DETACHED);
dic.setReason(sap.getReason());
dic.setLocation(sap.getLocation());
dic.setContact(sap.getContact());
dic.setDate(new PdfDate(sap.getSignDate()));
sap.setCryptoDictionary(dic);

// Reserve space for /Contents: 8192 bytes hex-encoded plus the two delimiters.
HashMap<PdfName, Integer> exc = new HashMap<PdfName, Integer>();
exc.put(PdfName.CONTENTS, Integer.valueOf(8192 * 2 + 2));
sap.preClose(exc);

InputStream data = sap.getRangeStream(); // Whooops - different every time!
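
One way to narrow this down is to compare the bytes from two runs and look at what surrounds the first difference. The sketch below is plain JDK code and does not call iText; the two sample inputs in `main` are made-up fragments, and in practice the arrays would be the bytes read from two separate runs of the code above.

```java
import java.nio.charset.StandardCharsets;

final class PdfDiffSketch {

    // Returns the index of the first differing byte, or -1 if the arrays are equal.
    // If one array is a prefix of the other, the length of the shorter one is returned.
    static int firstDifference(byte[] a, byte[] b) {
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        return a.length == b.length ? -1 : common;
    }

    // Returns the bytes around an offset as text, so the PDF object or key
    // that contains the difference can be identified by eye.
    static String context(byte[] data, int offset, int radius) {
        int from = Math.max(0, offset - radius);
        int to = Math.min(data.length, offset + radius);
        return new String(data, from, to - from, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Made-up fragments standing in for the output of two runs.
        byte[] firstRun = "trailer << /Size 12 /ID [<aa11>] >>".getBytes(StandardCharsets.ISO_8859_1);
        byte[] secondRun = "trailer << /Size 12 /ID [<bb22>] >>".getBytes(StandardCharsets.ISO_8859_1);

        int offset = firstDifference(firstRun, secondRun);
        if (offset < 0) {
            System.out.println("runs are identical");
        } else {
            System.out.println("first difference at byte " + offset);
            System.out.println("run 1: " + context(firstRun, offset, 12));
            System.out.println("run 2: " + context(secondRun, offset, 12));
        }
    }
}
```

Seeing which key the differing bytes belong to shows which value has to be pinned or stored alongside the signature.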

Any help on how to get repeatable PDFs with prepared space for signatures is greatly appreciated! Thanks in advance.

execc
  • To me, the (most important?) feature of a PDF with an embedded signature is to prove that THIS document is unchanged. So I think that part of the "embedding process" is a random component that is newly generated each time a signature is embedded. BTW: the iText7 example "C2_01_SignHelloWorld.java" increases the file size from 7kb (unsigned but prepared pdf "hello_to_sign.pdf") to 19kb (signed "hello_signed1.pdf") [but I agree, the "source data" 'Hello World!' is only a few bytes long :-)]. – Michael Fehr Jul 16 '20 at 11:49
  • Sure, but exactly THE SAME document is generated each time, and bytes do not have a concept of origin. So any other digital signature mechanism works for such a process. Also, the increase to 19kb is due to the fact that iText reserves 8kb of 'space' in the PDF for the digital signature – execc Jul 16 '20 at 12:16
  • As far as I can tell from the iText documentation, the regular way is to add the signature container when signing is done. There seems to be a way to "construct" a signature container without knowing the certificate in advance - here is the link to the answer by mkl (a developer of iText ?): https://stackoverflow.com/a/58806524/8166854, maybe this helps you in preparing the pdf. Good luck! – Michael Fehr Jul 16 '20 at 15:19
  • When you create a PDF with iText, some variable data is involved. In particular, a PDF contains metadata like the **LastModifiedAt** timestamp. Also, each time a PDF is generated or manipulated, it gets a new random ID. Furthermore, in the case of encrypted PDFs, some encryption algorithms imply the use of random seed values. One can patch these sources of randomness in iText, but by default they're there. – mkl Jul 17 '20 at 05:08
