Free PDF metadata editor
Edit PDF metadata: title, author, subject, keywords and more. Runs entirely in your browser.
What is PDF metadata?
PDF metadata is information about the document that does not appear in its visible content. It includes the title, author, subject, keywords, creation date and other properties. This information makes documents easier to organise, search and identify.
Why edit PDF metadata?
- Organisation · set consistent metadata across your documents for better classification and search.
- Professionalism · make sure your documents show the correct author and title.
- Indexing & discovery · keywords in the metadata help documents get found.
- Property correction · fix incorrect or missing author, title or subject information.
Frequently asked questions
Does editing the metadata change the PDF's content?
No. Only the metadata is modified. The PDF's content, pages and formatting remain exactly the same.
Can I edit the metadata of an encrypted PDF?
If the PDF is password-protected, you cannot edit its metadata with this tool. The file must be unlocked first.
What is the file size limit?
This tool supports PDFs up to 10 MB. Larger files may take longer to process.
What PDF metadata actually is
A PDF file can carry document-level metadata in two places at once. The original mechanism, present since PDF 1.0 (1993), is the Document Information Dictionary (called "DocInfo" or /Info): a key/value object referenced from the PDF trailer. PDF 1.4 (2001) added a second, richer mechanism: the XMP metadata stream, an XML packet (RDF/XML conforming to Adobe's eXtensible Metadata Platform) embedded as a stream object attached to the document catalog. XMP became an open ISO standard in 2012 (ISO 16684-1).
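To make the XMP side concrete, here is a minimal sketch of reading two Dublin Core fields out of an XMP packet with Python's standard library. The sample packet, the names SAMPLE_XMP and read_dublin_core, and the field values are all made up for illustration; real packets are larger and are wrapped in xpacket processing instructions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed XMP packet used only for this illustration.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>
        <rdf:Alt><rdf:li xml:lang="x-default">Quarterly report</rdf:li></rdf:Alt>
      </dc:title>
      <dc:creator>
        <rdf:Seq><rdf:li>A. Example</rdf:li></rdf:Seq>
      </dc:creator>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_dublin_core(xmp_xml: str) -> dict[str, list[str]]:
    """Collect the rdf:li values stored under dc:title and dc:creator."""
    root = ET.fromstring(xmp_xml)
    found = {}
    for field in ("title", "creator"):
        items = root.findall(f".//dc:{field}//rdf:li", NS)
        found[field] = [li.text or "" for li in items]
    return found
```

Note that dc:title is a language-alternative array and dc:creator is an ordered list, which is why both come back as lists rather than single strings.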
The two stores are independent and may disagree. Adobe's reference and the ISO 32000 standards both say XMP is preferred when present, and that DocInfo should be treated as a legacy mirror. In ISO 32000-2 (PDF 2.0), the older DocInfo dictionary is formally deprecated for everything except CreationDate and ModDate (which signature handlers still use). In practice, many readers (Foxit, Preview on macOS, browser viewers) still read DocInfo by default and fall back to XMP only for fields like copyright that DocInfo never supported.
The standard DocInfo fields are Title, Author, Subject, Keywords, Creator (the application that originated the document, e.g. "Microsoft Word"), Producer (the application that produced the actual PDF, e.g. "Adobe PDF Library 17.0"), CreationDate and ModDate (both in PDF date format, like D:20240315093000-04'00'), and Trapped. XMP organises fields into namespaces: Dublin Core's dc:title, dc:creator, dc:rights and dc:language; XMP-MM's DocumentID, InstanceID and History editing log; PDF/A and PDF/UA conformance markers; and any custom namespaces a tool wants to add. This editor exposes the most-used DocInfo fields directly; XMP-only fields require a more specialised editor.
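The PDF date format is easy to misread, so here is a small sketch of turning a string like D:20240315093000-04'00' into a Python datetime. The function name parse_pdf_date is an assumption for this example, and the regex covers the common forms (optional D: prefix, optional trailing fields, Z or a signed offset) rather than every malformed variant found in the wild.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYYMMDDHHmmSS followed by an optional offset such as -04'00', +01'00' or Z.
# Only the year is mandatory in the PDF date syntax; missing parts get defaults.
_PDF_DATE = re.compile(
    r"^(?:D:)?(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:(Z)|([+-])(\d{2})'?(\d{2})?'?)?$"
)


def parse_pdf_date(value: str) -> datetime:
    """Parse a PDF date string; returns a naive datetime if no offset is given."""
    m = _PDF_DATE.match(value.strip())
    if m is None:
        raise ValueError(f"not a PDF date string: {value!r}")
    year, month, day, hour, minute, second = (
        int(part) if part is not None else default
        for part, default in zip(m.groups()[:6], (0, 1, 1, 0, 0, 0))
    )
    zulu, sign, off_h, off_m = m.groups()[6:]
    if zulu:
        tz = timezone.utc
    elif sign:
        offset = timedelta(hours=int(off_h), minutes=int(off_m or 0))
        tz = timezone(-offset if sign == "-" else offset)
    else:
        tz = None  # no offset given: the relationship to UTC is unknown
    return datetime(year, month, day, hour, minute, second, tzinfo=tz)


print(parse_pdf_date("D:20240315093000-04'00'").isoformat())
# 2024-03-15T09:30:00-04:00
```

The offset is worth keeping rather than normalising to UTC, since it is itself a piece of metadata about where the document was produced.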
A short history
PDF began with John Warnock's 1991 internal Adobe memo (the "Camelot" paper) proposing a portable document format that preserved visual fidelity across devices. Adobe shipped PDF 1.0 with Acrobat 1.0 in 1993; the DocInfo dictionary was there from day one. Through the 1990s and early 2000s the format added encryption, hyperlinks, forms, JavaScript, transparency, tagged-PDF accessibility (PDF 1.4, 2001), and the XMP metadata mechanism (also PDF 1.4). PDF/A (the archival subset that mandates embedded XMP and forbids encryption) was ratified as ISO 19005-1 in 2005. Adobe transferred PDF to ISO in 2008, where PDF 1.7 became ISO 32000-1:2008. PDF 2.0 was published as ISO 32000-2:2017; its major metadata change was the deprecation of DocInfo in favour of XMP. The 2020 revision and the PDF Association's free release of the spec in April 2023 mean the standard is now openly accessible.
The privacy problem: what PDFs leak
A PDF created by typical office software broadcasts substantially more about its provenance than most users realise. From a single PDF you can usually extract:
- Author's full name. Microsoft Word writes Author from the user's Office account or the registered Windows username at install time. LibreOffice writes the user's first/last name from the user-data settings. Pages on macOS uses the system "Full Name." A PDF saved from any of those inherits the embedded value automatically.
- Editing history. XMP's xmpMM:History records each save and conversion event with a timestamp, software name, and instance UUID, producing a partial revision log of the document.
- Software identification down to version and build. The Producer field typically reads like "Microsoft® Word for Microsoft 365" or "Adobe PDF Library 17.00.6" or "Skia/PDF m120" (Chrome's print-to-PDF). This can hint at the workstation's OS and patch level.
- Creation timestamp + modification timestamp + the gap between them. A 4-second gap suggests a print-to-PDF; a 45-minute gap suggests substantial editing. Combined with the author field and the timezone offset, these can suggest when, roughly where and by whom a document was authored.
- Embedded image EXIF. When an image carrying EXIF GPS coordinates is dragged into a Word or InDesign document and exported to PDF, the underlying image stream often retains the EXIF tags, including latitude and longitude. ExifTool will pull them out even from "embedded" images.
- Track-changes annotations. PDFs exported from Word with "Show Markup" enabled embed reviewer initials and timestamps in annotation streams (technically content rather than metadata, but often invisible until a reader expands the comments panel).
Notable real-world cases
- Manafort court filing (January 2019): Paul Manafort's defence attorneys filed a court document using PDF redaction rectangles drawn over text. The text itself was untouched in the content stream and was extracted within hours by reporters using basic copy-paste, exposing claims that Manafort had shared US polling data with a Russian intelligence-linked associate. The accompanying metadata also named the law-firm machine and software that produced it.
- UK government "dodgy dossier" (February 2003): the document "Iraq, Its Infrastructure of Concealment, Deception and Intimidation" was published as a Word file whose revision log named four government staff who had handled it. Separately, large parts of the text were found to have been copied from a US-based graduate student's 2002 article. The Word document's hidden authorship trail became a central part of the story.
- TSA security manual (December 2009): TSA published a redacted version of its passenger-screening Standard Operating Procedures. The redactions were black boxes drawn on top of the original text in a PDF; the underlying text was extractable. The full document, including the list of countries whose passport-holders received elevated screening, became public.
- "Author: opposing-counsel firm name": repeated incidents at law firms where outgoing PDF briefs include the opposing-counsel firm name in the Author field, because someone copy-pasted from a discovery PDF into a new Word document and the destination document inherited the source's author. Many firms now require Word's "Document Inspector" or Acrobat's "Sanitize Document" before any external send.
Honest scope of this tool
This editor lets you view and overwrite the standard DocInfo fields. It is genuinely useful for cleaning up author names before sending a document externally, fixing wrong title metadata that's confusing your document-management system, or stripping a workstation fingerprint from a press release. It is not a complete sanitiser. Specifically:
- Image EXIF inside embedded photos may still carry GPS coordinates and camera details.
- Track-changes and reviewer comments stored as annotations are not removed.
- Hidden text under "redaction" rectangles is still extractable: drawing a black rectangle over text doesn't remove the text from the PDF's content stream. This is the most common source of accidental disclosure.
- The xmpMM:History editing log in the XMP stream is not necessarily cleared.
- Embedded font subsets can help identify the originating workstation if unusual fonts were used.
- Printer tracking dots (yellow microdot patterns that most colour laser printers embed, and that survive in scanned printouts) are content-level and unaffected by metadata editing; they were reported to have played a role in the Reality Winner case (June 2017).
For a complete sanitisation pass on a sensitive document, the right tools are Adobe Acrobat Pro's "Sanitize Document" command, the open-source cpdf command-line utility's -remove-metadata option, or ExifTool's -all= directive followed by manual inspection. Note that ExifTool's own documentation describes its PDF edits as reversible, because the old metadata stays in the file until the PDF is rewritten (for example with qpdf --linearize). Sensitive workflows often re-create the document from extracted plain text rather than trying to scrub the original.
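The redaction caveat in the list above is the one most worth seeing in action. The sketch below uses a made-up, uncompressed content stream in which a line of text is shown and a filled black rectangle is then painted over it; a simple pattern match still recovers the text. The stream contents and the function name literal_strings_shown are hypothetical, and real streams are usually compressed and may encode text differently, so treat this as an illustration rather than a text extractor.

```python
import re

# Hypothetical, uncompressed page content stream:
#   BT ... Tj ET     shows a line of text
#   0 0 0 rg         sets the fill colour to black
#   x y w h re f     draws and fills a rectangle over the text
CONTENT_STREAM = b"""
BT /F1 12 Tf 72 700 Td (Witness name: J. Doe) Tj ET
0 0 0 rg
70 695 160 16 re f
"""


def literal_strings_shown(stream: bytes) -> list[str]:
    """Return simple literal strings passed to the Tj (show text) operator."""
    matches = re.findall(rb"\(([^()\\]*)\)\s*Tj", stream)
    return [m.decode("latin-1") for m in matches]


print(literal_strings_shown(CONTENT_STREAM))
# ['Witness name: J. Doe']
```

The rectangle only changes what is painted on screen; the text operator and its string are still in the file, which is why proper redaction tools rewrite the content stream instead of drawing over it.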
Tools to view metadata
- Adobe Acrobat: File → Properties. Shows the DocInfo fields and a separate "Additional Metadata" panel for the XMP packet.
- ExifTool (Phil Harvey): the command-line gold standard. exiftool file.pdf prints everything; exiftool -all= file.pdf strips everything.
- pdfinfo (part of poppler-utils): quick CLI dump of DocInfo plus page-level details.
- pdf.js / PDF.js (the library Firefox uses to render PDFs): exposes metadata via doc.getMetadata() for browser-side reading.
- pdf-lib: the JavaScript library powering this tool's edit pass; exposes setTitle(), setAuthor(), etc., and writes a fully-conformant PDF back.
When you'd reach for this
- Cleaning up author/creator names before sending a document outside your organisation.
- Setting consistent title metadata for a batch of documents that will end up in a document-management system or library catalogue.
- Adding keywords for internal full-text-search systems that use them as a discovery boost.
- Fixing the wrong title when "save-as PDF" inherited a misleading filename.
- Asserting copyright / licence via the Author and (for tools that handle XMP) dc:rights field.
- Quick privacy sanitisation for routine documents, though see the scope caveat above for high-stakes cases.
More questions
Why do my edits sometimes appear in DocInfo but not XMP (or vice versa)?
Because PDFs carry both stores and they can disagree. This editor writes to DocInfo (the field every reader inspects). XMP is updated for fields that have a clear DocInfo equivalent. Some viewers (Adobe Acrobat in particular) read XMP first; if you see "stale" metadata after editing, open the document with a different reader to confirm whether the issue is XMP-only or whether your reader is just caching the old version.
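If you have already extracted both stores with another tool, a comparison like the following sketch can show which fields disagree. The mapping reflects the commonly used DocInfo-to-XMP crosswalk for text fields; the function name find_mismatches and the flat dictionary inputs are assumptions made for this example, and date fields are left out because the two stores use different date formats.

```python
# DocInfo key -> the XMP property that conventionally mirrors it.
# CreationDate and ModDate are omitted: DocInfo uses PDF date strings while
# XMP uses ISO 8601, so a plain string comparison would always differ.
DOCINFO_TO_XMP = {
    "Title": "dc:title",
    "Author": "dc:creator",
    "Subject": "dc:description",
    "Keywords": "pdf:Keywords",
    "Creator": "xmp:CreatorTool",
    "Producer": "pdf:Producer",
}


def find_mismatches(docinfo: dict[str, str], xmp: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return {DocInfo key: (DocInfo value, XMP value)} for fields that differ."""
    mismatches = {}
    for info_key, xmp_key in DOCINFO_TO_XMP.items():
        info_value = docinfo.get(info_key)
        xmp_value = xmp.get(xmp_key)
        if info_value is None or xmp_value is None:
            continue  # field missing from one store: skipped in this sketch
        if info_value.strip() != xmp_value.strip():
            mismatches[info_key] = (info_value, xmp_value)
    return mismatches


docinfo = {"Title": "Final report", "Author": "A. Example"}
xmp = {"dc:title": "Draft report", "dc:creator": "A. Example"}
print(find_mismatches(docinfo, xmp))
# {'Title': ('Final report', 'Draft report')}
```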
Will this tool break a digital signature?
Yes, almost always. A digital signature on a PDF protects the entire document including the metadata; modifying any byte breaks the signature's cryptographic verification. If you need to edit metadata on a signed PDF, you'll either need to remove the signature first (with the signer's permission), edit the metadata, and have it re-signed; or apply the metadata changes before signing in the original workflow.
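The underlying reason can be shown with a plain hash. This is a deliberately simplified sketch: a real PDF signature computes its digest over the byte range named in the signature dictionary and wraps it in a cryptographic signature container, whereas here a bare SHA-256 over a made-up byte string stands in for that. The sample bytes and variable names are hypothetical.

```python
import hashlib


def digest(data: bytes) -> str:
    """SHA-256 digest of the bytes a signer would have committed to."""
    return hashlib.sha256(data).hexdigest()


# Made-up stand-in for the signed portion of a PDF.
original = b"<< /Title (Contract) /Author (A. Example) >>"
signed_digest = digest(original)

# A metadata edit changes bytes inside the signed portion...
edited = original.replace(b"A. Example", b"B. Example")

# ...so the recomputed digest no longer matches what was signed.
digest_matches = digest(edited) == signed_digest
print(digest_matches)
# False
```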
What about PDF/A archival files?
PDF/A files have additional XMP requirements (the pdfaid:part and pdfaid:conformance markers, plus required Dublin Core fields). Editing a PDF/A's DocInfo without updating the XMP packet may technically take the file out of PDF/A conformance. For archival workflows, use a PDF/A-aware editor like Acrobat Pro, and check the result with a validator such as veraPDF.
How do I make a "completely anonymous" PDF?
For routine documents: edit the DocInfo here to clear identifying fields, then run the result through Acrobat's "Sanitize Document" or cpdf -remove-metadata. For high-stakes anonymisation (whistleblowing, journalism, legal disclosure): re-create the PDF from scratch on a different machine using only extracted plain text, with no images that came from the original. Print-and-rescan also removes the original file's metadata (the OCR layer of the rescanned PDF is freshly authored), at the cost of file size and image quality; bear in mind that a colour laser printout may carry the printer tracking dots described above.
Does anything get sent to a server?
No. The PDF is parsed and rewritten by pdf-lib running locally in your browser; the modified file is downloaded straight to your device. Nothing about your PDF leaves the page, which is useful when the document contains internal author names, client information or confidential subject lines that you'd rather not upload to a third-party service. The pdf-lib library itself loads once from a public CDN with subresource-integrity verification, then is cached.