
Sample PDF download: https://drive.google.com/file/d/12wv1Pb7gh4vCKOGhX4cZ3aOrLSiOo4If/view?usp=sharing

When the PDF is opened in Adobe Reader (Continuous release), it reports the certificate as invalid with the message "Changes have been made to this document that rendered the signature invalid."

But I can't see what was changed or where. Only a signature (certificate) was added, using our own application, which adds correct signatures to thousands of other PDFs. No other changes were made. Verifying the hash with our own code, or with PDFBox 2 using the following code, reports the signature as valid (true).

So why is Adobe Reader complaining?

Any help is much appreciated, as I've been banging my head against the wall for some days now...

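import java.io.File;
import java.io.IOException;
import java.security.cert.CertificateException;

import org.apache.commons.io.FileUtils;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.interactive.digitalsignature.PDSignature;
import org.bouncycastle.cert.X509CertificateHolder;
import org.bouncycastle.cms.CMSException;
import org.bouncycastle.cms.CMSProcessableByteArray;
import org.bouncycastle.cms.CMSSignedData;
import org.bouncycastle.cms.SignerInformation;
import org.bouncycastle.cms.SignerInformationVerifier;
import org.bouncycastle.cms.jcajce.JcaSimpleSignerInfoVerifierBuilder;
import org.bouncycastle.jce.provider.BouncyCastleProvider;
import org.bouncycastle.operator.OperatorCreationException;

public static void main(String[] args) throws IOException, CMSException, OperatorCreationException, CertificateException
{
    System.out.println("\nValidate signature in SignatureValidationTest.pdf; original code.");
    byte[] pdfByte;
    PDDocument pdfDoc = null;
    SignerInformationVerifier verifier = null;
    try
    {
        // Raw file bytes are needed to extract the signed byte ranges; the document itself is parsed with PDFBox 2.
        pdfByte = FileUtils.readFileToByteArray(new File(FOLDEROUT, "102089-5913E701-5EE6-AC3F-7B03-A8D27A7CD9FA.pdf"));
        pdfDoc = PDDocument.load(new File(FOLDEROUT, "102089-5913E701-5EE6-AC3F-7B03-A8D27A7CD9FA.pdf"));
        // pdfDoc = Loader.loadPDF(new ByteArrayInputStream(pdfByte));
        PDSignature signature = pdfDoc.getSignatureDictionaries().get(0);

        // Rebuild the detached CMS container from the signature bytes and the signed byte ranges.
        byte[] signatureAsBytes = signature.getContents();
        byte[] signedContentAsBytes = signature.getSignedContent(pdfByte);
        CMSSignedData cms = new CMSSignedData(new CMSProcessableByteArray(signedContentAsBytes), signatureAsBytes);
        SignerInformation signerInfo = (SignerInformation) cms.getSignerInfos().getSigners().iterator().next();
        X509CertificateHolder cert = (X509CertificateHolder) cms.getCertificates().getMatches(signerInfo.getSID())
                .iterator().next();
        verifier = new JcaSimpleSignerInfoVerifierBuilder().setProvider(new BouncyCastleProvider()).build(cert);

        // The result is true for this file, although Adobe Reader reports the signature as invalid.
        boolean verifyRt = signerInfo.verify(verifier);
        System.out.println("Verify result: " + verifyRt);
    }
    finally
    {
        if (pdfDoc != null)
        {
            pdfDoc.close();
        }
    }
}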
mkl
user1391606
  • Maybe something in the page structure; the only thing I could find is that in the original page `/MediaBox[ 0 0 595.3 841.9]`, in the revised page `/MediaBox [0 0 595.300 841.900 ]` maybe Adobe considers this as different numbers? – Tilman Hausherr Feb 25 '22 at 18:24
  • Hi, thx for the feedback. I need to check it but I doubt it as in other PDFs we have similar changes and there this error does not occur. I think it must be something else... – user1391606 Feb 25 '22 at 21:37
  • So I can see that the PDF has three revisions, the base one and two via incremental updates. The last revision is the one containing the signature. Since there is no other incremental update done on top of that one, there can be no changes. My guess is that Adobe Reader complains about something else. The used certificate seems to be missing the "Digital Signature" key usage extension, so that might be a problem for Reader. In fact, when you view the certificate and go to details, a red exclamation mark is shown for the key usage. – gettalong Feb 25 '22 at 22:58
  • I cut off the signed part and signed with the PDFBox example, and got the same problem. Then I cut off so that there is only one revision (which shows some private data) and have the same problem. And I don't have a /MediaBox in the incremental segment. – Tilman Hausherr Feb 27 '22 at 05:34
  • @gettalong: Thx for the feedback, but I do not have a red exclamation mark in Adobe Reader, and the same certificate is used for certifying other documents which do not have this problem, so I guess it is not a certificate-related issue? – user1391606 Feb 27 '22 at 13:18
  • @Tilman, not sure how to read your comment but the Java code example is just to verify a signature, not to place a signature in a PDF. Also if you removed the last revision (defining the signature) then you can't have 'the same problem' as there is no signature left in the PDF. Or do I misread your comment ? – user1391606 Feb 27 '22 at 13:33
  • The first revision of the document has a broken cross reference table. This is known to cause issues during validation. Usually these issues only surface if there are incremental updates to the original document, but essentially this means this broken original PDF is not suitable for signing. – mkl Feb 27 '22 at 14:14
  • @user1391606 One can get the unsigned document by using an editor like NOTEPAD++, by removing stuff after "%%EOF". I tried signing that one. – Tilman Hausherr Feb 28 '22 at 09:11
  • @mkl Thx for your message. Just to be sure I understand it correctly. You mean the xref table starting at line 1758 when I open the PDF in Notepad++. Correct ? If so, what/where is the error 'broken'? Can you elaborate on that? Do you use any software to check this? Thx! – user1391606 Feb 28 '22 at 10:05
  • @Tilman OK, now I understand but it seems mkl is getting the root cause of the problem. Hope I can get more info from mkl. – user1391606 Feb 28 '22 at 10:07
  • I'll explain the error in the cross references in an answer. **But** while that is *one* reason you'll run into issues with signing with your PDF, it is *not the only one*. The cross reference problem usually only causes problems if an arbitrary incremental update is added after signing but not if there is only one signature with no incremental updates thereafter. Thus, there most likely is yet another issue in your full example PDF. – mkl Feb 28 '22 at 10:16
  • By the way, *"You mean the xref table starting at line 1758 when I open the PDF in Notepad++. Correct ?"* No, the cross references of the initial revision are at line 1014 in Notepad++. You found the delta cross references of an intermediary revision. – mkl Feb 28 '22 at 11:11
  • What happens if you sign only the first revision of the PDF? And what if you sign a PDF without that cross reference issue? – mkl Feb 28 '22 at 12:26
  • @mkl Thx for your extensive explanation. We did not generate the original document (with the broken xref table). We will try to detect such an error and, if found, do a 'rewrite' of the document to remove this broken xref table. – user1391606 Feb 28 '22 at 15:00
  • Beware, if the original document with the cross reference table error already is signed, you cannot repair without breaking that earlier signature. Consider rejecting in that case. (Also, if an answer post sufficiently answers your question, you may consider marking it as accepted answer.) – mkl Mar 01 '22 at 08:33

1 Answer

Broken Cross References in First PDF Revision

The cross reference table at the end of your first revision looks like this:

xref
0 19
0000000000 65535 f
0000000018 00000 n
0000000348 00000 n
0000000422 00000 n
0000000481 00000 n
0000000776 00000 n
0000003138 00000 n
0000032630 00000 n
0000033308 00000 n
0000033489 00000 n
0000033723 00000 n
0000033932 00000 n
0000056202 00000 n
0000056645 00000 n
0000056837 00000 n
0000070988 00000 n
0000071312 00000 n
0000071521 00000 n
0000071543 00000 n
20 26
0000071844 00000 n
0000080069 00000 n
0000080373 00000 n
0000080556 00000 n
0000097791 00000 n
0000097813 00000 n
0000097833 00000 n
0000097853 00000 n
0000097876 00000 n
0000097899 00000 n
0000097922 00000 n
0000097945 00000 n
0000097968 00000 n
0000097991 00000 n
0000098014 00000 n
0000098037 00000 n
0000098059 00000 n
0000098083 00000 n
0000104407 00000 n
0000104444 00000 n
0000104483 00000 n
0000104565 00000 n
0000104704 00000 n
0000104728 00000 n
0000111035 00000 n
0000111072 00000 n
48 1
0000111098 00000 n
50 2
0000111296 00000 n
0000113066 00000 n

As you can see it consists of multiple subsections with object numbers 0..18, 20..45, 48, and 50..51. In particular there is no mapping for object numbers 19, 46, 47, and 49.

This is disallowed for two reasons:

  • For a file that has never been incrementally updated, and so in particular for the first revision of each PDF file, the cross-reference section shall contain only one subsection, whose object numbering begins at 0.

  • The cross-reference table (comprising the original cross-reference section and all update sections) shall contain one entry for each object number from 0 to the maximum object number defined in the file, even if one or more of the object numbers in this range do not actually occur in the file.

(ISO 32000-1 section 7.5.4 "Cross-Reference Table")

Thus, the first cross reference table of a regular PDF must consist of only a single subsection. And even if that was not required, gaps with unmapped object numbers are not allowed.
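
To make the two rules concrete, here is a minimal sketch in plain Java (no PDF library) that reads the subsection headers of a classic cross reference table and lists the object numbers that no subsection covers. The class and method names are hypothetical, the sample table in main is a shortened, made-up one rather than the table from your file, and the parsing assumes a well-formed "xref" keyword followed by "start count" header lines; it is an illustration, not a validator.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox or any other library.
final class XrefSubsectionCheck {

    /** Returns one {start, count} pair per subsection header found in the table text. */
    static List<int[]> subsections(String xrefText) {
        List<int[]> result = new ArrayList<>();
        String[] lines = xrefText.trim().split("\\R");
        int i = 0;
        if (i < lines.length && lines[i].trim().equals("xref")) {
            i++; // skip the keyword line
        }
        while (i < lines.length) {
            String line = lines[i].trim();
            if (!line.matches("\\d+\\s+\\d+")) {
                break; // "trailer" or anything else ends the table
            }
            String[] header = line.split("\\s+");
            int start = Integer.parseInt(header[0]);
            int count = Integer.parseInt(header[1]);
            result.add(new int[] {start, count});
            i += 1 + count; // skip the header and its entries
        }
        return result;
    }

    /** Returns all object numbers from 0 to the highest mapped number that have no entry. */
    static List<Integer> missingObjectNumbers(List<int[]> subsections) {
        int max = -1;
        for (int[] s : subsections) {
            max = Math.max(max, s[0] + s[1] - 1);
        }
        boolean[] covered = new boolean[max + 1];
        for (int[] s : subsections) {
            for (int n = s[0]; n < s[0] + s[1]; n++) {
                covered[n] = true;
            }
        }
        List<Integer> missing = new ArrayList<>();
        for (int n = 0; n <= max; n++) {
            if (!covered[n]) {
                missing.add(n);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Shortened, made-up table with the same kind of defect as described above.
        String table = "xref\n"
                + "0 3\n"
                + "0000000000 65535 f\n"
                + "0000000018 00000 n\n"
                + "0000000348 00000 n\n"
                + "4 2\n"
                + "0000000422 00000 n\n"
                + "0000000481 00000 n\n"
                + "8 1\n"
                + "0000000776 00000 n\n"
                + "trailer\n";
        List<int[]> subs = subsections(table);
        System.out.println("Subsections: " + subs.size());
        System.out.println("Unmapped object numbers: " + missingObjectNumbers(subs));
    }
}
```

For a first revision, more than one subsection or a non-empty list of unmapped numbers would indicate the kind of defect described above. Cross reference streams (PDF 1.5 and later) are not covered by this sketch.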

Normally Adobe Reader ignores violations of these requirements, but in the context of signature validation it is stricter. Usually this shows in situations where the PDF in question is signed and then some arbitrary incremental update is added.

For example, I took your first revision (the first 114510 bytes of your file) and signed them and then extended them to LTA:

[Two screenshots: "only signed" and "signed and extended"]

This has been the topic of multiple questions here on Stack Overflow.
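
If you want to repeat the experiment of signing only the first revision, the cut-off point can be located programmatically instead of in a text editor. The following is a rough sketch in plain Java with hypothetical class and method names. It assumes that every "%%EOF" marker ends a revision, which does not hold in general: the marker may also appear inside stream data, and linearized files behave differently. Treat the offsets it finds as candidates to check by hand.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of PDFBox or any other library.
final class RevisionSplitter {

    private static final byte[] EOF_MARKER = "%%EOF".getBytes(StandardCharsets.US_ASCII);

    /** Returns the offset just past each "%%EOF" marker, including one trailing end-of-line. */
    static List<Integer> revisionEnds(byte[] pdf) {
        List<Integer> ends = new ArrayList<>();
        for (int i = 0; i + EOF_MARKER.length <= pdf.length; i++) {
            if (matchesAt(pdf, i)) {
                int end = i + EOF_MARKER.length;
                if (end < pdf.length && pdf[end] == '\r') {
                    end++;
                }
                if (end < pdf.length && pdf[end] == '\n') {
                    end++;
                }
                ends.add(end);
                i = end - 1; // continue searching after this marker
            }
        }
        return ends;
    }

    private static boolean matchesAt(byte[] data, int offset) {
        for (int j = 0; j < EOF_MARKER.length; j++) {
            if (data[offset + j] != EOF_MARKER[j]) {
                return false;
            }
        }
        return true;
    }

    /** Returns the bytes up to and including the first "%%EOF" marker. */
    static byte[] firstRevision(byte[] pdf) {
        List<Integer> ends = revisionEnds(pdf);
        if (ends.isEmpty()) {
            throw new IllegalArgumentException("no %%EOF marker found");
        }
        return Arrays.copyOf(pdf, ends.get(0));
    }

    public static void main(String[] args) {
        // Made-up stand-in for a PDF with two revisions.
        byte[] fake = "%PDF-1.4\nrev1\n%%EOF\nrev2\n%%EOF\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println("Revision end offsets: " + revisionEnds(fake));
        System.out.println("First revision length: " + firstRevision(fake).length);
    }
}
```

With a real file, the input could be read with java.nio.file.Files.readAllBytes and the result of firstRevision written to a new file before signing it.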

Additional Problems

There most likely are other issues still to be found in your example PDF, though. As mentioned above, cross references like in your first revision usually only cause problems after adding an incremental update to the signed PDF. That is not the case for your example PDF. Thus, I would expect other oddities in it.

Remarks

Some additional observations:

  • Starting from the first revision you have a number of extra entries in the document information dictionary: SIG_PAGE, SIG_LLX, SIG_LLY, SIG_URX, and SIG_URY. IMO this is not the appropriate place: although conforming readers may store custom metadata in the document information dictionary, they may not store private content or structural information there. Such information shall be stored in the document catalog instead (ISO 32000-1 section 14.3.3 "Document Information Dictionary"). Those entries look like processing instructions private to your workflow, not metadata of public interest.

  • Your signature dictionary contains an R value. Since PDF 1.5 this entry shall not be used, and the information shall be stored in the Prop_Build dictionary. (ISO 32000-1 table 252 "Entries in a signature dictionary")

mkl