Reading PDF file attachment annotations with iTextSharp

I have the following problem. I have a PDF with an XML file attached inside it as an annotation: not as an embedded file, but as an annotation. I am now trying to read it with the code from the following link:

iTextSharp – how to open / read / extract a file attachment?

That works for embedded files, but not for files attached as annotations.

I googled for extracting annotations from a PDF and found the following link: Reading PDF Annotations with iText

So the annotation type is "file attachment annotation".

Could someone show a working example?

Thanks in advance for your help

As is often the case with questions about iText and iTextSharp, you should first look at the keyword list on itextpdf.com. There you will find File attachment, extract attachments, which references two Java samples from iText in Action – 2nd Edition:

part4.chapter16.KubrickDvds
part4.chapter16.KubrickDocumentary

The analogous Webified iTextSharp examples are

KubrickDvds.cs
KubrickDocumentary.cs

KubrickDvds contains the following method, extractAttachments / ExtractAttachments, to extract file attachment annotations:

Java:

    /**
     * Extracts attachments from an existing PDF.
     * @param src the path to the existing PDF
     */
    public void extractAttachments(String src) throws IOException {
        PdfReader reader = new PdfReader(src);
        PdfArray array;
        PdfDictionary annot;
        PdfDictionary fs;
        PdfDictionary refs;
        for (int i = 1; i <= reader.getNumberOfPages(); i++) {
            array = reader.getPageN(i).getAsArray(PdfName.ANNOTS);
            if (array == null) continue;
            for (int j = 0; j < array.size(); j++) {
                annot = array.getAsDict(j);
                if (PdfName.FILEATTACHMENT.equals(annot.getAsName(PdfName.SUBTYPE))) {
                    fs = annot.getAsDict(PdfName.FS);
                    refs = fs.getAsDict(PdfName.EF);
                    for (PdfName name : refs.getKeys()) {
                        FileOutputStream fos = new FileOutputStream(
                            String.format(PATH, fs.getAsString(name).toString()));
                        fos.write(PdfReader.getStreamBytes((PRStream) refs.getAsStream(name)));
                        fos.flush();
                        fos.close();
                    }
                }
            }
        }
        reader.close();
    }

C#:

    /**
     * Extracts attachments from an existing PDF.
     * @param src the path to the existing PDF
     * @param zip the ZipFile object to add the extracted images
     */
    public void ExtractAttachments(byte[] src, ZipFile zip) {
        PdfReader reader = new PdfReader(src);
        for (int i = 1; i <= reader.NumberOfPages; i++) {
            PdfArray array = reader.GetPageN(i).GetAsArray(PdfName.ANNOTS);
            if (array == null) continue;
            for (int j = 0; j < array.Size; j++) {
                PdfDictionary annot = array.GetAsDict(j);
                if (PdfName.FILEATTACHMENT.Equals(annot.GetAsName(PdfName.SUBTYPE))) {
                    PdfDictionary fs = annot.GetAsDict(PdfName.FS);
                    PdfDictionary refs = fs.GetAsDict(PdfName.EF);
                    foreach (PdfName name in refs.Keys) {
                        zip.AddEntry(
                            fs.GetAsString(name).ToString(),
                            PdfReader.GetStreamBytes((PRStream) refs.GetAsStream(name))
                        );
                    }
                }
            }
        }
    }

KubrickDocumentary contains the following method, extractDocLevelAttachments / ExtractDocLevelAttachments, to extract document-level attachments:

Java:

    /**
     * Extracts document level attachments
     * @param filename a file from which document level attachments will be extracted
     * @throws IOException
     */
    public void extractDocLevelAttachments(String filename) throws IOException {
        PdfReader reader = new PdfReader(filename);
        PdfDictionary root = reader.getCatalog();
        PdfDictionary documentnames = root.getAsDict(PdfName.NAMES);
        PdfDictionary embeddedfiles = documentnames.getAsDict(PdfName.EMBEDDEDFILES);
        PdfArray filespecs = embeddedfiles.getAsArray(PdfName.NAMES);
        PdfDictionary filespec;
        PdfDictionary refs;
        FileOutputStream fos;
        PRStream stream;
        for (int i = 0; i < filespecs.size(); ) {
            // The Names array alternates name strings and file specifications; skip the name.
            filespecs.getAsString(i++);
            filespec = filespecs.getAsDict(i++);
            refs = filespec.getAsDict(PdfName.EF);
            for (PdfName key : refs.getKeys()) {
                fos = new FileOutputStream(
                    String.format(PATH, filespec.getAsString(key).toString()));
                stream = (PRStream) PdfReader.getPdfObject(refs.getAsIndirectObject(key));
                fos.write(PdfReader.getStreamBytes(stream));
                fos.flush();
                fos.close();
            }
        }
        reader.close();
    }

C#:

    /**
     * Extracts document level attachments
     * @param PDF from which document level attachments will be extracted
     * @param zip the ZipFile object to add the extracted images
     */
    public void ExtractDocLevelAttachments(byte[] pdf, ZipFile zip) {
        PdfReader reader = new PdfReader(pdf);
        PdfDictionary root = reader.Catalog;
        PdfDictionary documentnames = root.GetAsDict(PdfName.NAMES);
        PdfDictionary embeddedfiles = documentnames.GetAsDict(PdfName.EMBEDDEDFILES);
        PdfArray filespecs = embeddedfiles.GetAsArray(PdfName.NAMES);
        for (int i = 0; i < filespecs.Size; ) {
            // The Names array alternates name strings and file specifications; skip the name.
            filespecs.GetAsString(i++);
            PdfDictionary filespec = filespecs.GetAsDict(i++);
            PdfDictionary refs = filespec.GetAsDict(PdfName.EF);
            foreach (PdfName key in refs.Keys) {
                PRStream stream = (PRStream) PdfReader.GetPdfObject(
                    refs.GetAsIndirectObject(key)
                );
                zip.AddEntry(
                    filespec.GetAsString(key).ToString(),
                    PdfReader.GetStreamBytes(stream)
                );
            }
        }
    }

(For whatever reason, the C# examples put the extracted files into a ZIP file, whereas the Java versions write them to the file system ... oh well ...)
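If you would rather have the Java variant behave like the C# one, the extracted bytes could go into a ZIP archive instead of individual files. The following is only a minimal sketch using java.util.zip from the JDK. The class AttachmentZipper and its method zipAttachments are hypothetical names and not part of iText. The sketch assumes you have already collected each file name and the bytes returned by PdfReader.getStreamBytes(...) in a map.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper (not part of iText): packs already extracted attachments into a ZIP. */
final class AttachmentZipper {

    private AttachmentZipper() { }

    /**
     * Writes each (file name, content) pair as one ZIP entry.
     * Assumption: file names are unique; ZipOutputStream rejects duplicate entry names.
     *
     * @param attachments file names mapped to the attachment bytes
     * @return the bytes of the resulting ZIP archive
     */
    static byte[] zipAttachments(Map<String, byte[]> attachments) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the ZipOutputStream writes the central directory, so the
        // buffer is only read after the try-with-resources block has finished.
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            for (Map.Entry<String, byte[]> attachment : attachments.entrySet()) {
                zip.putNextEntry(new ZipEntry(attachment.getKey()));
                zip.write(attachment.getValue());
                zip.closeEntry();
            }
        }
        return buffer.toByteArray();
    }
}
```

Inside the inner loops above you would then put fs.getAsString(name).toString() and the stream bytes into a LinkedHashMap instead of opening a FileOutputStream, and call AttachmentZipper.zipAttachments(map) once after reader.close().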