Document Metadata
The <Metadata> section inside <Document> sets PDF information fields
(author, title, dates, etc.) that are embedded in the output file.
Example
<Document>
  <Metadata>
    <DocumentTitle>Annual Report 2025</DocumentTitle>
    <Author>Docraft Team</Author>
    <Subject>Financial summary</Subject>
    <Keywords>finance, report, 2025</Keywords>
    <Creator>Docraft 1.0</Creator>
    <Producer>Docraft/libharu</Producer>
    <CreationDate>
      <Year>2025</Year>
      <Month>12</Month>
      <Day>31</Day>
    </CreationDate>
    <AutoKeywords enabled="true" max_keywords="15"
                  min_length="4" language="en"/>
  </Metadata>
  <Body>
    <!-- content -->
  </Body>
</Document>
Metadata Fields
| Element | Description |
|---|---|
| `<DocumentTitle>` | PDF title string. |
| `<Author>` | Author name. |
| `<Creator>` | Application that created the content. |
| `<Producer>` | Application that produced the PDF. |
| `<Subject>` | Document subject. |
| `<Keywords>` | Comma-separated keyword list. |
| | Trapping flag ( |
| | GTS_PDFXVersion string. |
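
To make the relationship between these elements and the PDF output concrete, the following sketch reads a `<Metadata>` block and collects the text fields under the standard PDF document information keys (`Title`, `Author`, `Subject`, `Keywords`, `Creator`, `Producer`). It is an illustration only, not Docraft's implementation: the function name `read_info_fields` and the `INFO_KEYS` mapping are hypothetical, and the pairing of `<DocumentTitle>` with `Title` is assumed from the field description above.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from metadata element names (taken from the example above)
# to the standard PDF document information dictionary keys.
INFO_KEYS = {
    "DocumentTitle": "Title",
    "Author": "Author",
    "Subject": "Subject",
    "Keywords": "Keywords",
    "Creator": "Creator",
    "Producer": "Producer",
}


def read_info_fields(xml_text):
    """Return a dict of PDF info keys for the text fields found in <Metadata>."""
    root = ET.fromstring(xml_text)      # root is the <Document> element
    metadata = root.find("Metadata")    # direct child lookup
    info = {}
    if metadata is None:
        return info
    for child in metadata:
        key = INFO_KEYS.get(child.tag)
        # Skip unknown elements (dates, AutoKeywords) and empty fields.
        if key and child.text and child.text.strip():
            info[key] = child.text.strip()
    return info


SAMPLE = """
<Document>
  <Metadata>
    <DocumentTitle>Annual Report 2025</DocumentTitle>
    <Author>Docraft Team</Author>
    <CreationDate><Year>2025</Year></CreationDate>
  </Metadata>
  <Body/>
</Document>
"""

if __name__ == "__main__":
    print(read_info_fields(SAMPLE))
```

Date elements and `<AutoKeywords>` are deliberately ignored here because they are structured rather than plain strings.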
Date Elements
<CreationDate> and <ModificationDate> contain sub-elements:
| Element | Type | Description |
|---|---|---|
| `<Year>` | int | Four-digit year. |
| `<Month>` | int | Month (1–12). |
| `<Day>` | int | Day (1–31). |
| | int | Hour (0–23). |
| | int | Minutes (0–59). |
| | int | Seconds (0–59). |
| | char | UTC indicator ( |
| | int | UTC offset hours. |
| | int | UTC offset minutes. |
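
The date parts listed above line up with the components of a PDF date string, which in PDF 1.7 has the form `D:YYYYMMDDHHmmSSOHH'mm'`, where `O` is `+`, `-`, or `Z`. The sketch below shows how such parts could be combined into that string. It is a hedged illustration rather than the library's serializer: the function and parameter names are hypothetical, and defaulting the time parts to zero and the indicator to `Z` is an assumption made for this example.

```python
def pdf_date_string(year, month, day, hour=0, minutes=0, seconds=0,
                    ind="Z", off_hour=0, off_minutes=0):
    """Format date parts as a PDF 1.7 date string: D:YYYYMMDDHHmmSSOHH'mm'."""
    # Range checks mirror the ranges given in the table above.
    if not (1 <= month <= 12 and 1 <= day <= 31):
        raise ValueError("month or day out of range")
    if not (0 <= hour <= 23 and 0 <= minutes <= 59 and 0 <= seconds <= 59):
        raise ValueError("time out of range")

    base = f"D:{year:04d}{month:02d}{day:02d}{hour:02d}{minutes:02d}{seconds:02d}"
    if ind == "Z":
        # Universal time: no offset fields follow the indicator.
        return base + "Z"
    if ind in ("+", "-"):
        # Local time ahead of (+) or behind (-) UTC by the given offset.
        return f"{base}{ind}{off_hour:02d}'{off_minutes:02d}'"
    raise ValueError("ind must be '+', '-', or 'Z'")


if __name__ == "__main__":
    print(pdf_date_string(2025, 12, 31))
    print(pdf_date_string(2025, 12, 31, 23, 59, 59, "+", 9, 0))
```
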
Automatic Keywords
Enable <AutoKeywords> to have the library automatically extract keywords
from text nodes using term-frequency analysis.
| Attribute | Type | Description |
|---|---|---|
| `enabled` | bool | Enable/disable extraction. |
| `max_keywords` | int | Maximum number of keywords (default |
| `min_length` | int | Minimum word length (default |
| `language` | string | Stop-word languages, comma-separated ( |
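
As a rough picture of what term-frequency extraction with these attributes involves, the sketch below counts words, drops short words and stop words, and keeps the most frequent terms. It is not the library's algorithm: `extract_keywords`, the tiny `STOP_WORDS` table, and the tokenization rule are all assumptions for illustration, and the parameter values `15` and `4` are simply copied from the example at the top of this page, not documented defaults.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word lists keyed by language code.
STOP_WORDS = {
    "en": {"the", "and", "with", "that", "this", "from", "for"},
}


def extract_keywords(text, max_keywords=15, min_length=4, languages=("en",)):
    """Return up to max_keywords terms, most frequent first."""
    # Merge the stop-word lists of every requested language.
    stop = set()
    for lang in languages:
        stop |= STOP_WORDS.get(lang, set())

    # Tokenize into lowercase runs of letters (digits and underscores excluded).
    words = re.findall(r"[^\W\d_]+", text.lower())

    # Count only words that are long enough and not stop words.
    counts = Counter(
        w for w in words if len(w) >= min_length and w not in stop
    )
    return [word for word, _ in counts.most_common(max_keywords)]


if __name__ == "__main__":
    sample = (
        "Revenue grew this year. Revenue and profit from the report "
        "exceed forecast; profit margins with revenue."
    )
    print(extract_keywords(sample, max_keywords=2))
```

A comma-separated `language` value such as `"en,de"` would correspond to passing `languages=("en", "de")` in this sketch.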