<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>PDF Archives - ITEC4B</title>
	<atom:link href="https://itec4b.com/category/application/file-manipulation/pdf/feed/" rel="self" type="application/rss+xml" />
	<link>https://itec4b.com/category/application/file-manipulation/pdf/</link>
	<description>Information Technology Expert Consulting</description>
	<lastBuildDate>Sat, 25 Feb 2023 17:29:09 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.1.3</generator>
	<item>
		<title>qpdf: PDF Transformation Software</title>
		<link>https://itec4b.com/qpdf-pdf-transformation-software/</link>
		
		<dc:creator><![CDATA[author]]></dc:creator>
		<pubDate>Sat, 25 Feb 2023 17:16:05 +0000</pubDate>
				<category><![CDATA[Application]]></category>
		<category><![CDATA[File Manipulation]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[PDF]]></category>
		<category><![CDATA[qpdf]]></category>
		<category><![CDATA[file manipulation]]></category>
		<category><![CDATA[linux]]></category>
		<category><![CDATA[pdf]]></category>
		<guid isPermaLink="false">https://itec4b.com/?p=1573</guid>

					<description><![CDATA[qpdf is both a free command-line program and a C++ library (open source PDF manipulation library) for structural, content-preserving transformations on PDF files. qpdf has been designed with very few external dependencies and is intentionally very lightweight. It was created in 2005 by Jay Berkenbilt. One of the main features is the capability to merge and &#8230; <p class="link-more"><a href="https://itec4b.com/qpdf-pdf-transformation-software/" class="more-link">Read more<span class="screen-reader-text"> "qpdf: PDF Transformation Software"</span></a></p>]]></description>
										<content:encoded><![CDATA[
<p><a href="http://qpdf.sourceforge.net">qpdf</a> is both a free command-line program and a C++ library (open source PDF manipulation library) for structural, content-preserving transformations on PDF files.<br>qpdf has been designed with very few external dependencies and is intentionally very lightweight.<br><br>It was created in 2005 by Jay Berkenbilt.<br><br><strong><span style="text-decoration: underline;">One of the main features is the capability to merge and split PDF files by selecting pages from one or more input files</span></strong>.<br><span style="text-decoration: underline;"><strong>It is also capable of performing a variety of transformations such as linearization (known as web optimization or fast web viewing), encryption, and decryption of PDF files</strong></span>.<br><br><a href="https://qpdf.readthedocs.io/en/stable/cli.html">qpdf Online Documentation</a><br><br><span style="text-decoration: underline;">qpdf Local Documentation</span>: /usr/share/doc/qpdf/qpdf-manual.html</p>



<h2>Portable Document Format</h2>



<p><a href="https://www.adobe.com/acrobat/about-adobe-pdf.html">PDF was created at Adobe in 1992 under Dr. John Warnock</a>, offering an easy, reliable way to present and exchange documents regardless of the software, hardware, or operating system being used.<br>Today, it is one of the most trusted file formats in the world and can be easily viewed on any operating system.<br><br><span style="text-decoration: underline;">PDF was standardized as ISO 32000, an open standard, in 2008</span>.<br>The PDF format is now maintained by the International Organization for Standardization (ISO).<br><span style="text-decoration: underline;">The ISO 32000-2:2020 edition was published in December 2020 and does not include any proprietary technologies</span>.</p>



<p>The PDF specification also provides for encryption (in which case a password is needed to view or edit the contents), digital signatures (to provide secure authentication), file attachments, and metadata.<br>PDF 2.0 defines 256-bit AES as its standard encryption.<br><br>The standard security provided by PDF consists of two different passwords:<br><br>&#8211; user password, which encrypts the file and prevents it from being opened<br><br>&#8211; owner password, which specifies operations that should be restricted even when the document is decrypted; these can include modifying, printing, or copying text and graphics out of the document, or adding or modifying text notes.</p>



<p>The user password encrypts the file. The owner password does not; it relies on client software to respect the content restrictions.<br>An owner password can easily be removed by software.<br>Thus, the usage restrictions that an author places on a PDF document are not secure and cannot be enforced once the file is distributed.</p>
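<p>As a minimal sketch of how these two passwords map onto qpdf (assuming qpdf is installed; <code>encrypt_pdf</code> and <code>decrypt_pdf</code> are hypothetical helper names, not part of qpdf), the functions below wrap the <code>--encrypt</code> and <code>--decrypt</code> options:</p>

```shell
# Hypothetical helper: encrypt a PDF with a user and an owner password
# using 256-bit keys.
# Usage: encrypt_pdf USER_PASSWORD OWNER_PASSWORD INFILE OUTFILE
encrypt_pdf() {
  qpdf --encrypt "$1" "$2" 256 -- "$3" "$4"
}

# Hypothetical helper: write an unencrypted copy of an encrypted PDF.
# Usage: decrypt_pdf PASSWORD INFILE OUTFILE
decrypt_pdf() {
  qpdf --password="$1" --decrypt "$2" "$3"
}
```

<p>They would be called as <code>encrypt_pdf userpw ownerpw infile.pdf outfile.pdf</code> and <code>decrypt_pdf userpw infile.pdf outfile.pdf</code>; the passwords and file names here are placeholders.</p>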



<p>Metadata includes information about the document and its content, such as the author’s name, document title, description, creation/modification dates, application used to create the file, keywords, copyright information, etc.</p>



<h2>Install qpdf (Debian)</h2>



<pre class="wp-block-code"><code># apt install qpdf</code></pre>



<h2>Usage</h2>



<p><code>--linearize</code><br>Create linearized (web-optimized) output file.<br>Linearized files are formatted in a way that allows compliant readers to begin displaying a PDF file before it is fully downloaded.<br>Ordinarily, the entire file must be present before it can be rendered because important cross-reference information typically appears at the end of the file.</p>



<pre class="wp-block-code"><code>$ qpdf --linearize infile.pdf outfile.pdf</code></pre>
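<p>To confirm that the output really is linearized, qpdf also offers a <code>--check-linearization</code> option. The sketch below chains the two steps; <code>linearize_and_check</code> is a hypothetical helper name, and the function stops on any non-zero exit status from the first step, which may be stricter than you need if qpdf only reports warnings.</p>

```shell
# Hypothetical helper: linearize INFILE into OUTFILE, then ask qpdf to
# verify the linearization data of the result.
# Usage: linearize_and_check INFILE OUTFILE
linearize_and_check() {
  # Stop if qpdf reports any problem while writing the output.
  qpdf --linearize "$1" "$2" || return 1
  # The exit status of the check becomes the function's exit status.
  qpdf --check-linearization "$2"
}
```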



<h2>Merge PDF files with pages selection</h2>



<p>qpdf allows you to use the <code>--pages</code> option to select pages from one or more input files.</p>



<pre class="wp-block-code"><code>$ qpdf primary_input_file.pdf --pages . &#91;--password=password] &#91;page-range] &#91; ... ] -- outputfile.pdf

Within &#91; ... ] you may repeat the following:  inputfile_N.pdf &#91;--password=password] &#91;page-range]</code></pre>



<p>The special input file <code>'.'</code> can be used as an alias for the primary input file.<br>Multiple input files may be specified, and you can select specific pages from each of them.<br>For each input file that pages should be extracted from, specify the filename, a password (if needed) to open the file, and a page range.<br>Note that <code>'--'</code> terminates parsing of page selection flags.<br><br><code>--password=password</code> specifies a password for accessing encrypted files.<br>The password option is only needed for password-protected files.<br><br>The page range may be omitted. In this case, all pages are included.<br><br>Document-level information (metadata, outline, etc.) is taken from the primary input file (in the above example, <code>primary_input_file.pdf</code>) and is preserved in <code>outputfile.pdf</code>.<br><strong><span style="text-decoration: underline;">You can use <code>--empty</code> in place of the primary input file to start from an empty file (without any metadata, outline, etc.) and just merge selected pages from input files</span></strong>.<br><br><strong><span style="text-decoration: underline;">In most cases you will use the following syntax</span></strong>:</p>



<pre class="wp-block-code"><code>$ qpdf --empty --pages inputfile_1.pdf &#91;page-range] inputfile_2.pdf &#91;page-range] inputfile_3.pdf &#91;page-range] &#91; ... ] -- outputfile.pdf</code></pre>



<p>The page-range is a set of numbers separated by commas, ranges of numbers separated by dashes, or combinations of those.<br>The character <code>'z'</code> represents the last page.<br>A number preceded by an <code>'r'</code> counts from the end, so <code>r3-r1</code> would be the last three pages of the document.<br>Pages can be specified in any order.<br>Ranges can be ascending or descending: a high number followed by a low number causes the pages to appear in reverse.<br>Numbers may be repeated in a page range.<br>A page range may optionally be followed by <code>:even</code> or <code>:odd</code> to select only the even or odd pages in the given range.<br>Note that even and odd refer to the positions within the specified range, not to whether the original page number is even or odd.<br><br><span style="text-decoration: underline;">Example page ranges</span>:<br><br>1,3,5-9,15-12<br>Pages 1, 3, 5, 6, 7, 8, 9, 15, 14, 13, and 12 in that order</p>



<p>z-1<br>All pages in the document in reverse</p>



<p>r3-r1<br>The last three pages of the document</p>



<p>r1-r3<br>The last three pages of the document in reverse order</p>



<p>1-20:even<br>Even pages from 2 to 20</p>



<p>5,7-9,12:odd<br>Pages 5, 8 and 12, which are the pages in odd positions from among the original range (pages 5, 7, 8, 9, and 12)</p>



<pre class="wp-block-code"><code>To extract pages 1 through 5 from infile.pdf while preserving all metadata associated with that file in outfile.pdf
$ qpdf infile.pdf --pages . 1-5 -- outfile.pdf

If you want pages 1 through 5 from infile.pdf without any metadata, use
$ qpdf --empty --pages infile.pdf 1-5 -- outfile.pdf

Merge all .pdf files in the current directory
$ qpdf --empty --pages *.pdf -- outfile.pdf</code></pre>
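<p>One thing to watch with the <code>*.pdf</code> form: the shell expands the pattern in sorted order, and an output file left over from an earlier run in the same directory also matches it. A minimal sketch that builds the input list explicitly and skips the output name (<code>merge_all_pdfs</code> is a hypothetical helper, not a qpdf feature):</p>

```shell
# Hypothetical helper: merge every *.pdf in the current directory into
# OUTFILE, leaving out OUTFILE itself if it already exists there.
# Usage: merge_all_pdfs OUTFILE
merge_all_pdfs() {
  local out=$1 f
  local -a inputs=()
  for f in *.pdf; do
    [ -e "$f" ] || continue          # the pattern matched nothing
    [ "$f" = "$out" ] && continue    # skip a previous output file
    inputs+=("$f")
  done
  if [ "${#inputs[@]}" -eq 0 ]; then
    echo "no input PDF files found" >&2
    return 1
  fi
  qpdf --empty --pages "${inputs[@]}" -- "$out"
}
```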



<h2>Split a PDF into separate PDF files</h2>



<p><code>--split-pages[=n]</code><br>Write each group of n pages to a separate output file.<br>If n is not specified, create single pages.<br><br>Output file names are generated as follows:<br>If the string %d appears in the output file name, it is replaced with a range of zero-padded page numbers starting from 1.<br>Otherwise, if the output file name ends in .pdf (case insensitive), a zero-padded page range, preceded by a dash, is inserted before the file extension.<br>Otherwise, the file name is appended with a zero-padded page range preceded by a dash.<br><br>Zero padding is added to all page numbers in file names so that all the numbers are the same length, which causes the output filenames to sort lexically in numerical order.<br><br>Page ranges are a single number in the case of single-page groups or two numbers separated by a dash otherwise.<br><br>Here are some examples. In these examples, infile.pdf has 20 pages</p>



<pre class="wp-block-code"><code>Output files are 01-outfile through 20-outfile with no extension
$ qpdf --split-pages infile.pdf %d-outfile

Output files are outfile-01.pdf through outfile-20.pdf
$ qpdf --split-pages infile.pdf outfile.pdf

Output files are outfile-01-04.pdf, outfile-05-08.pdf, outfile-09-12.pdf, outfile-13-16.pdf, outfile-17-20.pdf
$ qpdf --split-pages=4 infile.pdf outfile.pdf

Output files are outfile.notpdf-01 through outfile.notpdf-20
The extension .notpdf is not treated in any special way regarding the placement of the number
$ qpdf --split-pages infile.pdf outfile.notpdf</code></pre>
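<p>The naming rules can be previewed without running qpdf. The sketch below is a hypothetical bash model of the rules as described above, not qpdf code: <code>split_output_names</code> is an invented name, and the way a short final group is named (clipped to the last page, still written as two numbers) is an assumption that should be checked against real qpdf output.</p>

```shell
# Print the file names that --split-pages=GROUP would be expected to
# produce for a document with TOTAL pages and the given output name.
# Usage: split_output_names TOTAL GROUP OUTPUT_NAME
split_output_names() {
  local total=$1 group=$2 pattern=$3
  local width=${#total} before after first last range
  case $pattern in
    *%d*)                       # %d marks where the page range goes
      before=${pattern%%\%d*}
      after=${pattern#*\%d} ;;
    *.[Pp][Dd][Ff])             # insert "-range" before the extension
      before="${pattern%????}-"
      after=${pattern: -4} ;;
    *)                          # otherwise append "-range"
      before="$pattern-"
      after="" ;;
  esac
  first=1
  while [ "$first" -le "$total" ]; do
    last=$(( first + group - 1 ))
    [ "$last" -gt "$total" ] && last=$total   # assumption: clip last group
    range=$(printf '%0*d' "$width" "$first")
    [ "$group" -gt 1 ] && range="$range-$(printf '%0*d' "$width" "$last")"
    echo "$before$range$after"
    first=$(( first + group ))
  done
}
```

<p>For instance, <code>split_output_names 20 4 outfile.pdf</code> lists <code>outfile-01-04.pdf</code> through <code>outfile-17-20.pdf</code>, the same names as in the third example above.</p>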



<p>Note that metadata, outlines, and other document-level features of the original PDF file are not preserved.<br>For each output file, this option creates an empty PDF and copies the corresponding page or group of pages from the input into it.<br>If you require the document-level data, you will have to run qpdf with the <code>--pages</code> option once for each page.<br>Using <code>--split-pages</code> is much faster if you don’t require the document-level data.<br><br><span style="text-decoration: underline;">If you don’t want to split out every page, use page ranges to select only the pages you want to extract</span>.<br>The page range specifies the pages or ranges you want, <span style="text-decoration: underline;">but each extracted page is still stored in its own PDF file</span>.</p>



<pre class="wp-block-code"><code>$ qpdf --split-pages infile.pdf outfile.pdf --pages infile.pdf 4-5,8,9-13 --</code></pre>
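<p>If the document-level data does matter, the slower approach mentioned above (one <code>--pages</code> run per page) could be scripted roughly as follows. This is a sketch under assumptions: <code>split_keep_metadata</code> and the <code>PREFIX-N.pdf</code> naming are made up for illustration, and the page count is read with qpdf's <code>--show-npages</code> option.</p>

```shell
# Hypothetical helper: write each page of INFILE to PREFIX-N.pdf while
# keeping INFILE as the primary input, so its document-level
# information is carried into every output file.
# Usage: split_keep_metadata INFILE PREFIX
split_keep_metadata() {
  local infile=$1 prefix=$2 total page outfile
  total=$(qpdf --show-npages "$infile") || return 1
  page=1
  while [ "$page" -le "$total" ]; do
    # Zero-pad page numbers to the width of the page count.
    outfile=$(printf '%s-%0*d.pdf' "$prefix" "${#total}" "$page")
    qpdf "$infile" --pages . "$page" -- "$outfile" || return 1
    page=$(( page + 1 ))
  done
}
```

<p>Because qpdf is started once per page, this will be noticeably slower than <code>--split-pages</code> on large documents.</p>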
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
