PDF-KungFoo with Ghostscript & Co.
100 Tips and Tricks for Clever PDF Creation and Handling
About the Book
Introduction
This book, once finished, will contain a few rare gems. It will contain some great tricks and other stuff which you'll not find anywhere else on the Big Net, nor in any books you can buy from Amazon.
However, it is a "work in progress". It summarizes some of the practical solutions I applied to real-world problems encountered by my clients. Early buyers and readers will be able to follow its development as it progresses -- and they will be entitled to updates until the final completed version without paying any extra money.
Most of the book's chapters deal with Ghostscript commands. But sometimes I also refer to other helper utilities, which I employ when Ghostscript isn't the right tool for the job.
Each chapter is intended to be of immediate practical value, and each one can stand on its own, giving readers a basic or more advanced "recipe" that can be applied and adapted to their own situation, while at the same time giving additional background information and highlighting technical concepts in context.
While this book is still a work in progress, readers are encouraged to submit their own suggestions and questions about topics to be included in the final version.
My experience in the prepress world and in the printing industry spans more than two decades. To date, I've used Ghostscript and other Free Software tools for more than 15 years. Most of the 'problems' and practical tasks I describe here have been posed to me...
- ...either from paying customers, whom I helped through consulting, troubleshooting, training or software development activities,
- ...or from emails I received (sometimes from people I have not heard of before or after) asking me some particular question about a problem,
- ...or via some public internet forum, newsgroup or platform where people ask IT- or programming-related questions, most prominently on StackOverflow.com.
Luckily I kept a record of the most interesting and of the most commonly asked things.
This document is a condensed summary from my archives. And sometimes I didn't write paragraphs completely from scratch, but copied them straight from my old mails. So, if you come across some sentence in the "Question" or the "Answer" section of the coming chapters which sounds familiar to you: maybe it's because you sent me the question before, or because you received the same answer from me years ago. Over time, I may decide to edit, polish and straighten many of the original, still "raw" pieces in this book. However, this may also depend on readers' general feedback.
Be warned though: this document is not necessarily a comprehensive, systematic tutorial! Some of the snippets explained in different chapters may be duplicates and therefore could be seen as redundant. However, should you end up reading and working through all chapters of the booklet, you'll remember these parts better and you may have gained a rather complete picture of Ghostscript's capabilities :-)
I didn't do a precise count, but I'm pretty sure that a newbie Ghostscript user will easily find 100 different practical Ghostscript usage snippets here, even if the book currently does not (yet) contain 100 distinct chapters. Experienced users will also be able to find a 'gem of wisdom' or two.
All in all I hope you'll find my 'PDF-KungFoo -- 100 Tips + Tricks for Ghostscript & Co.' useful. I intend to expand and update this document over time. Readers will be entitled to free updates. So I hope, in a year or two, you will have a document which could rather be named '100 Chapters with 1000 Tips + Tricks for Ghostscript & Co.'
-- Kurt Pfeifle
Preliminary Plan for Table of Contents (to be expanded)
Contents
Metadata
Changelog
Introduction
100 Tips and Tricks
- 1 Where can I download the tools shown in this book?
- 2 How can I convert PCL to PDF?
- 3 How can I convert XPS to PDF?
- 4 Why doesn't Acrobat Distiller embed all fonts fully?
- 5 How can I extract fonts from PDFs as valid font files?
- 6 How can I embed fonts when generating PDFs?
- 7 How can I embed a missing font into an existing PDF?
- 8 How can I convert a font to an outline in an existing PDF?
- 9 Can I replace a font inside a PDF?
- 10 How can I make invisible fonts visible?
- 11 How can I spellcheck a scanned PDF?
- 12 How can I convert a color PDF into grayscale?
- 13 How can I convert a CMYK-based PDF into an RGB-based one?
- 14 How can I check for colored pages inside a PDF?
- 15 How can I check for all-white pages inside a PDF?
- 16 How can I use 'pdfmark' to insert bookmarks into PDF?
- 17 How can I use 'pdfmark' to change PDF metadata?
- 18 How to extract text from PDF?
- 19 How do I unit test a Python function that draws PDF graphics?
- 20 How do I determine the number of PDF pages?
- 21 How do I crop PDF pages?
- 22 How do I scale PDF pages?
- 23 How can I rotate PDF pages?
- 24 How can I open PDF “raw”?
- 25 How can I remove white margins from PDF pages?
- 26 How can I determine which pages of a PDF use color?
- 27 What are PostScript dictionaries, and how can they be accessed (in Ghostscript)?
- 28 How can I use Ghostscript to query the default settings used by an output device (such as ‘pdfwrite’ or ‘tiffg4’)?
- 29 What is the difference between PostScript and PDF?
Table of Contents
- Changelog (major changes only)
I Preliminaries
- 1 Introduction
- 2 Hints for Linux, Windows, Mac OS X and Unix Users
- 3 Downloading the tools
II 100 Tips and Tricks
- 4 How can I convert PCL to PDF?
- 5 How can I convert XPS to PDF?
- 6 How can I unit test a Python function that draws PDF graphics?
- 7 How can I compare 2 PDFs on the command line?
- 8 How can I remove white margins from PDF pages?
- 9 Using Ghostscript to get page size
III Fonts
- 10 Why doesn’t Acrobat Distiller embed all fonts fully – even when explicitly set up to do so?
- 11 How do I make Ghostscript show all fonts it can find on my local system?
- 12 How can I extract embedded fonts from a PDF as valid font files?
- 13 How can I convert fonts to outlines in an existing PDF?
- 14 How can I get Ghostscript to use embedded fonts in PDF?
- 15 How do I embed fonts when generating PDFs? (CONTENT STILL MISSING)
- 16 How can I embed a missing font into an existing PDF? (CONTENT STILL MISSING)
- 17 Can I replace a font inside a PDF? (CONTENT STILL MISSING)
- 18 How can I use invisible fonts in a PDF?
IV Scanned Pages and PDF
- 19 How can I make the invisible OCR information on a scanned PDF page visible?
- 20 How can I spellcheck a scanned PDF? (CONTENT STILL MISSING)
V Colors
- 21 How can I convert a color PDF into grayscale?
- 22 How can I convert a CMYK-based PDF into an RGB-based one? (CONTENT STILL MISSING)
- 23 How can I check for colored pages inside a PDF? (CONTENT STILL MISSING)
- 24 How can I check for all-white pages inside a PDF? (CONTENT STILL MISSING)
VI Using pdfmarks
- 25 How can I understand what this funny ‘pdfmark’ stuff is about?
- 26 How can I use pdfmark to insert bookmarks into PDF? (CONTENT STILL MISSING)
- 27 How can I use pdfmark with Ghostscript to change PDF metadata?
- 28 How can I use Ghostscript to add an annotation to a PDF?
VII Text extraction
- 29 How can I extract text from PDF? (CONTENT STILL MISSING)
VIII Miscellaneous
- 30 How can I re-order pages in a PDF?
- 31 How to recognize PDF format?
- 32 How can I let Ghostscript determine the number of PDF pages?
- 33 How can I crop PDF pages? (CONTENT STILL MISSING)
- 34 How can I scale PDF pages? (CONTENT STILL MISSING)
- 35 How can I rotate PDF pages? (CONTENT STILL MISSING)
- 36 How can I open PDF “raw”? (CONTENT STILL MISSING)
- 37 How can I use Ghostscript as a calculator inside the shell?
- 38 Do you also use non-FOSS tools for your PDF-related work? If so, which?
- 39 Why do you call Apple’s Preview.app ‘evil, malicious and ambidextrous to unsuspecting users’?
IX Some Topics in Depth
- 40 Can I query the default settings Ghostscript uses for an output device (such as ‘pdfwrite’ or ‘tiffg4’)?
Appendix
- About the Author
- Acknowledgements
Causes Supported

Electronic Frontier Foundation
Defending your civil liberties in a digital world.
https://www.eff.org/
Based in San Francisco, EFF is a donor-supported membership organization working to protect fundamental rights regardless of technology.