Merging PDF files is one of the most common document tasks in automation, backend development, and file processing. Whether you are building a web app, a desktop tool, or a command-line utility, the ability to combine multiple PDFs into one file can save time and simplify workflows.
In Python, this task is straightforward once you know the right library and approach. You can merge invoices, reports, scanned pages, contracts, e-books, or any other PDF documents into a single file with only a few lines of code. But beyond the basic example, there are many practical details to understand: how to control page order, how to merge PDFs from a folder, how to handle errors, how to work with large files, and how to build reusable scripts.
This article explains everything in a practical, beginner-friendly way. You will learn how to merge PDF files in Python using modern libraries, see multiple code examples, and understand how to make your solution reliable for real-world projects.
Why Merge PDFs in Python?
There are many situations where PDF merging is useful:
Combining several reports into one document
Joining invoice pages into a single file
Merging scanned pages after digitizing paper documents
Building a document automation system
Creating downloadable PDF bundles in web apps
Organizing chapters or sections of an e-book
Preparing files for archiving, emailing, or printing
Python is a strong choice for this because it is easy to read, easy to maintain, and works well with file automation. With the right library, you can create scripts that merge files locally, on a server, or inside a larger application.
Which Python Library Should You Use?
For PDF merging, the most common modern choice is pypdf.
It is the updated successor of the older PyPDF2 project and is widely used for reading, writing, splitting, merging, and modifying PDF files.
Install it
pip install pypdf
If you are working in a virtual environment, activate it first, then install the package.
Basic PDF Merge in Python
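The simplest way to merge PDF files is to use PdfWriter from pypdf. Older tutorials use PdfMerger for this, but that class was deprecated and then removed in pypdf 5.0; PdfWriter provides the same append(), merge(), write(), and close() methods.
Example: merge two PDF files
from pypdf import PdfWriter
merger = PdfWriter()
merger.append("file1.pdf")
merger.append("file2.pdf")
merger.write("merged.pdf")
merger.close()
How it works
PdfWriter() creates the object that collects the pages
append() adds a PDF to the output
write() saves the final file
close() releases resources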
This is the most direct solution when you already know the filenames and want a quick merge.
Merge Multiple PDF Files
You can merge many files by storing them in a list and looping through them.
Example: merge a list of PDFs
from pypdf import PdfWriter
pdf_files = [
    "intro.pdf",
    "chapter1.pdf",
    "chapter2.pdf",
    "chapter3.pdf",
]
merger = PdfWriter()
# Files are appended in list order
for pdf in pdf_files:
    merger.append(pdf)
merger.write("book.pdf")
merger.close()
This approach is useful when the order matters, such as when you are assembling chapters, reports, or form pages.
Merge PDFs from a Folder Automatically
In many projects, you will not want to hardcode filenames manually. Instead, you may want to merge every PDF in a folder.
Example: merge all PDFs in a directory
import os
from pypdf import PdfWriter
folder_path = "pdfs"
output_file = "merged_output.pdf"
# Collect every .pdf file in the folder (case-insensitive match), sorted by name
pdf_files = sorted(
    [
        os.path.join(folder_path, f)
        for f in os.listdir(folder_path)
        if f.lower().endswith(".pdf")
    ]
)
merger = PdfWriter()
for pdf_file in pdf_files:
    merger.append(pdf_file)
merger.write(output_file)
merger.close()
Why use sorted()?
Sorting keeps the merge order predictable. Without it, os.listdir() returns names in arbitrary order, which may not match what you expect.
For example, this matters if your files are named like:
001-cover.pdf
002-introduction.pdf
003-chapter-one.pdf
Because sorted() compares names as text, the zero-padded prefixes make sure the files are merged in the correct sequence.
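Names without zero padding are the usual trap: as plain strings, page10.pdf sorts before page2.pdf. The sketch below shows one possible fix, a numeric-aware sort key. The helper name natural_key is made up for this illustration and is not part of pypdf.

```python
import re

def natural_key(name):
    """Split a filename into text and integer chunks so that 2 sorts before 10."""
    return [
        int(part) if part.isdigit() else part.lower()
        for part in re.split(r"(\d+)", name)
    ]

names = ["page10.pdf", "page2.pdf", "page1.pdf"]
print(sorted(names))                   # ['page1.pdf', 'page10.pdf', 'page2.pdf']
print(sorted(names, key=natural_key))  # ['page1.pdf', 'page2.pdf', 'page10.pdf']
```

To use it in the folder example, pass key=natural_key to sorted() and apply it to the bare filenames before joining them with the folder path.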
Merge PDFs in a Specific Order
Sometimes the folder contains many PDFs, but you only want to merge selected files in a custom order.
Example: custom merge order
from pypdf import PdfWriter
# The list order is the merge order, regardless of filenames
pdf_files = [
    "cover.pdf",
    "table_of_contents.pdf",
    "chapter_3.pdf",
    "chapter_1.pdf",
    "chapter_2.pdf",
]
merger = PdfWriter()
for pdf in pdf_files:
    merger.append(pdf)
merger.write("custom_order_book.pdf")
merger.close()
You can place files in any order you need. This is particularly useful when your document structure is not alphabetical.
Insert One PDF Into Another at a Specific Position
Sometimes you do not want to append at the end. You may need to insert pages from one file into a specific location.
PdfWriter supports an append() method with a pages parameter, and also a merge() method for inserting a file at a position.
Example: insert a PDF into page position 3
from pypdf import PdfWriter
merger = PdfWriter()
merger.append("main_document.pdf")
# The position is zero-based: the inserted pages start at index 3,
# right after the first three pages of main_document.pdf
merger.merge(3, "inserted_section.pdf")
merger.write("updated_document.pdf")
merger.close()
This is useful when you need to place an appendix, annex, or additional section in the middle of a file.
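If the meaning of the position argument feels unclear, it can help to model it with plain Python lists. The sketch below does not touch pypdf at all; it assumes the position behaves like a zero-based list insertion index, and insert_at is a hypothetical helper written only for this illustration.

```python
def insert_at(pages, position, new_pages):
    """Return a new list in which new_pages start at index `position`."""
    result = list(pages)
    result[position:position] = new_pages
    return result

main = ["m1", "m2", "m3", "m4", "m5"]
section = ["s1", "s2"]
print(insert_at(main, 3, section))
# ['m1', 'm2', 'm3', 's1', 's2', 'm4', 'm5']
```

Position 0 would place the new section before everything else, and a position equal to the page count is the same as appending.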
Merge Only Certain Pages from Each PDF
You do not always need the full PDF. Sometimes you only need a few pages from each file.
Example: merge selected pages
from pypdf import PdfWriter
merger = PdfWriter()
# Pages 0 to 2 (the first three pages) of the first file; the stop index 3 is excluded
merger.append("file1.pdf", pages=(0, 3))
# Only page 4 (the fifth page) of the second file
merger.append("file2.pdf", pages=(4, 5))
merger.write("selected_pages.pdf")
merger.close()
Important note about page ranges
In pypdf, page numbers are zero-based:
page 0 = first page
page 1 = second page
page 2 = third page
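This is a common source of confusion, so it is worth remembering. The pages tuple also works like range(): (start, stop) includes the start page and excludes the stop page.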
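People usually think in 1-based page numbers ("pages 1 to 3"), so a small conversion helper can prevent off-by-one mistakes. This is a minimal sketch; to_page_range is a hypothetical name, not a pypdf function.

```python
def to_page_range(first_page, last_page):
    """Convert a 1-based, inclusive page span into a zero-based (start, stop) tuple."""
    if first_page < 1 or last_page < first_page:
        raise ValueError("expected 1 <= first_page <= last_page")
    return (first_page - 1, last_page)

print(to_page_range(1, 3))  # (0, 3)
print(to_page_range(5, 5))  # (4, 5)
```

The result can be passed straight to append(), for example merger.append("file1.pdf", pages=to_page_range(1, 3)).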
Merge PDFs and Keep Metadata
Sometimes you want to preserve or set metadata for the final document.
Metadata may include:
title
author
subject
keywords
creator
Example: set metadata after merge
from pypdf import PdfWriter
merger = PdfWriter()
merger.append("report1.pdf")
merger.append("report2.pdf")
# Metadata keys are PDF names, so each one starts with a slash
merger.add_metadata({
    "/Title": "Combined Annual Reports",
    "/Author": "Your Name",
    "/Subject": "Merged PDF documents",
    "/Keywords": "PDF, merge, reports, Python",
})
merger.write("final_report.pdf")
merger.close()
This is especially useful if the merged PDF will be shared publicly or uploaded to a document management system.
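The slash-prefixed keys are easy to mistype. One option is to keep plain field names in your own code and translate them just before calling add_metadata(). The sketch below assumes only the five fields listed earlier; PDF_KEYS and to_pdf_metadata are illustrative names, not part of pypdf.

```python
# Standard PDF document-information keys for the fields listed above.
PDF_KEYS = {
    "title": "/Title",
    "author": "/Author",
    "subject": "/Subject",
    "keywords": "/Keywords",
    "creator": "/Creator",
}

def to_pdf_metadata(fields):
    """Translate plain field names into the slash-prefixed keys used in PDF metadata."""
    unknown = set(fields) - set(PDF_KEYS)
    if unknown:
        raise KeyError(f"unsupported metadata fields: {sorted(unknown)}")
    return {PDF_KEYS[name]: str(value) for name, value in fields.items()}

metadata = to_pdf_metadata({"title": "Combined Annual Reports", "author": "Your Name"})
print(metadata)  # {'/Title': 'Combined Annual Reports', '/Author': 'Your Name'}
```

The returned dictionary can then be handed to add_metadata() unchanged.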
A Safer Merge Script with Error Handling
Real-world scripts should handle errors gracefully. Files may be missing, corrupted, encrypted, or inaccessible.
Example: robust merge function
from pypdf import PdfWriter
import os
def merge_pdfs(pdf_files, output_file):
    merger = PdfWriter()
    try:
        for pdf in pdf_files:
            if not os.path.exists(pdf):
                print(f"Skipping missing file: {pdf}")
                continue
            try:
                merger.append(pdf)
                print(f"Added: {pdf}")
            except Exception as e:
                # Corrupted or encrypted files end up here
                print(f"Could not add {pdf}: {e}")
        if len(merger.pages) == 0:
            print("No pages were added. Output file was not created.")
            return
        merger.write(output_file)
        print(f"Saved merged PDF to: {output_file}")
    finally:
        merger.close()
pdf_files = ["a.pdf", "b.pdf", "missing.pdf", "c.pdf"]
merge_pdfs(pdf_files, "merged_result.pdf")
Why this matters
This kind of script is more reliable when used in production, especially in:
backend services
scheduled jobs
file upload systems
batch document processing
Merge PDFs from User Uploads in a Web App
If you are building a website or API, users may upload PDF files that you want to merge on the server.
A basic workflow might look like this:
Receive uploaded PDF files
Save them temporarily
Merge them with Python
Return the merged file for download
Delete temporary files afterward
Example: merge uploaded files saved on disk
from pypdf import PdfWriter
def merge_uploaded_pdfs(uploaded_paths, output_path):
    merger = PdfWriter()
    try:
        for path in uploaded_paths:
            merger.append(path)
        merger.write(output_path)
    finally:
        merger.close()
# Example usage
uploaded_files = [
    "/tmp/upload1.pdf",
    "/tmp/upload2.pdf",
    "/tmp/upload3.pdf",
]
merge_uploaded_pdfs(uploaded_files, "/tmp/final_merged.pdf")
This approach works well in Flask, Django, FastAPI, or any other Python-based backend.
Merge PDFs in Django
If you are using Django, you may want to merge files after upload and return the merged PDF as a downloadable response.
Example using Django
from io import BytesIO
from django.http import FileResponse, HttpResponseBadRequest
from pypdf import PdfWriter
import os
import tempfile
def merge_pdfs_view(request):
    uploaded_files = request.FILES.getlist("pdf_files")
    if not uploaded_files:
        return HttpResponseBadRequest("No files uploaded")
    merged = BytesIO()
    with tempfile.TemporaryDirectory() as temp_dir:
        paths = []
        for index, file_obj in enumerate(uploaded_files):
            # Generated names avoid collisions between uploads that share a filename
            temp_path = os.path.join(temp_dir, f"upload_{index}.pdf")
            with open(temp_path, "wb") as destination:
                for chunk in file_obj.chunks():
                    destination.write(chunk)
            paths.append(temp_path)
        merger = PdfWriter()
        try:
            for path in paths:
                merger.append(path)
            # Write to memory so the response does not depend on the temporary directory
            merger.write(merged)
        finally:
            merger.close()
    merged.seek(0)
    return FileResponse(merged, as_attachment=True, filename="merged.pdf")
This is the general pattern for a document upload-and-merge feature.
Merge PDFs in Flask
Flask makes it easy to accept multiple PDF files and return the merged file.
Example using Flask
from io import BytesIO
from flask import Flask, request, send_file
from pypdf import PdfWriter
import os
import tempfile
app = Flask(__name__)
@app.route("/merge", methods=["POST"])
def merge_pdfs():
    files = request.files.getlist("pdf_files")
    if not files:
        return {"error": "No files uploaded"}, 400
    merged = BytesIO()
    with tempfile.TemporaryDirectory() as temp_dir:
        paths = []
        for index, file in enumerate(files):
            # Do not build paths from file.filename: it is supplied by the client
            file_path = os.path.join(temp_dir, f"upload_{index}.pdf")
            file.save(file_path)
            paths.append(file_path)
        merger = PdfWriter()
        try:
            for path in paths:
                merger.append(path)
            # Write to memory so the response does not depend on the temporary directory
            merger.write(merged)
        finally:
            merger.close()
    merged.seek(0)
    return send_file(merged, mimetype="application/pdf", as_attachment=True, download_name="merged.pdf")
This pattern is suitable for document tools, SaaS apps, or internal file utilities.
Merge PDFs with pathlib for Cleaner Code
Python’s pathlib module often makes path handling cleaner and more readable.
Example using pathlib
from pathlib import Path
from pypdf import PdfWriter
pdf_dir = Path("pdfs")
output_file = Path("merged.pdf")
# glob() yields Path objects; sorting them keeps the merge order stable
pdf_files = sorted(pdf_dir.glob("*.pdf"))
merger = PdfWriter()
for pdf_file in pdf_files:
    merger.append(str(pdf_file))
merger.write(str(output_file))
merger.close()
This version is elegant and portable across Windows, macOS, and Linux.
Merge PDFs Recursively from Subfolders
Sometimes PDFs are spread across nested folders. You can use recursive search to collect them.
Example: merge all PDFs in subdirectories
from pathlib import Path
from pypdf import PdfWriter
base_dir = Path("documents")
output_file = Path("merged_all.pdf")
# rglob() searches base_dir and all of its subfolders
pdf_files = sorted(base_dir.rglob("*.pdf"))
merger = PdfWriter()
for pdf_file in pdf_files:
    merger.append(str(pdf_file))
merger.write(str(output_file))
merger.close()
This can be useful when you are organizing document archives or combining multiple project folders.
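Two details are worth handling in a recursive merge: on case-sensitive file systems the pattern "*.pdf" misses files ending in .PDF, and if the output file lives inside the searched folder, a second run would merge the previous result into itself. The sketch below is one way to cover both; collect_pdfs is a hypothetical helper name.

```python
from pathlib import Path

def collect_pdfs(base_dir, output_file):
    """Return PDFs under base_dir in a stable order, skipping the output file itself."""
    base = Path(base_dir)
    output = Path(output_file).resolve()
    found = [
        path
        for path in base.rglob("*")
        if path.is_file()
        and path.suffix.lower() == ".pdf"   # also matches .PDF
        and path.resolve() != output        # never merge the result into itself
    ]
    # Sort by the path relative to base_dir, ignoring case
    return sorted(found, key=lambda path: path.relative_to(base).as_posix().lower())
```

The returned list can replace the sorted(base_dir.rglob("*.pdf")) line in the example above.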
Merge PDFs and Remove Temporary Files
When you create temporary files during processing, remember to clean them up.
Example: cleanup after merge
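from pypdf import PdfWriter
import tempfile
import os
with tempfile.TemporaryDirectory() as temp_dir:
    pdf_paths = []
    # Create three one-page placeholder PDFs as example temporary files
    for i in range(3):
        path = os.path.join(temp_dir, f"file{i}.pdf")
        page_writer = PdfWriter()
        page_writer.add_blank_page(width=612, height=792)
        page_writer.write(path)
        page_writer.close()
        pdf_paths.append(path)
    # Merge while the temporary files still exist
    merger = PdfWriter()
    try:
        for pdf in pdf_paths:
            merger.append(pdf)
        merger.write("final.pdf")
    finally:
        merger.close()
# Leaving the with block deletes temp_dir and everything in it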
Using TemporaryDirectory() is a good way to prevent clutter and avoid leaving unused files on disk.
Merge PDFs and Skip Invalid Files
In some folders, not every file may be a valid PDF. You may find text files, images, or damaged files with a .pdf extension.
Example: validate before merge
from pypdf import PdfWriter, PdfReader
import os
def is_valid_pdf(path):
    try:
        # Counting the pages forces pypdf to parse the file structure
        len(PdfReader(path).pages)
        return True
    except Exception:
        return False
pdf_files = ["a.pdf", "b.pdf", "bad.pdf", "c.pdf"]
merger = PdfWriter()
for pdf in pdf_files:
    if os.path.exists(pdf) and is_valid_pdf(pdf):
        merger.append(pdf)
    else:
        print(f"Skipping missing or invalid file: {pdf}")
merger.write("merged_valid_only.pdf")
merger.close()
This is helpful when you are processing user-uploaded content and want to avoid hard failures.
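Before handing a file to a PDF parser at all, a cheap first filter is to look for the %PDF- marker that PDF files begin with. The sketch below is only a pre-check, not a validity guarantee; the 1024-byte window is an assumption chosen to tolerate a little leading junk, and looks_like_pdf is a hypothetical helper name.

```python
def looks_like_pdf(path):
    """Cheap pre-check: is the %PDF- marker present near the start of the file?"""
    try:
        with open(path, "rb") as handle:
            return b"%PDF-" in handle.read(1024)
    except OSError:
        # Missing or unreadable files are treated as "not a PDF"
        return False
```

Files that pass this check can still be damaged, so it works best as a fast filter in front of a full parse.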
Merge Encrypted PDFs
Encrypted PDFs require a password. If you try to read or merge them without unlocking, the operation may fail.
Example: decrypt before merge
from pypdf import PdfReader, PdfWriter
def decrypt_pdf(input_path, password, output_path):
    reader = PdfReader(input_path)
    if reader.is_encrypted:
        # decrypt() returns a falsy value when the password does not match
        if not reader.decrypt(password):
            raise ValueError(f"Wrong password for {input_path}")
    # Copy the now-readable pages into an unprotected file
    writer = PdfWriter()
    for page in reader.pages:
        writer.add_page(page)
    with open(output_path, "wb") as f:
        writer.write(f)
decrypt_pdf("protected.pdf", "mypassword", "unlocked.pdf")
merger = PdfWriter()
merger.append("unlocked.pdf")
merger.append("other.pdf")
merger.write("merged.pdf")
merger.close()
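If your workflow includes protected documents, decrypting them first can simplify the merge process. Keep in mind that unlocked.pdf is an unprotected copy, so delete it once the merge is done. AES-encrypted files also need a cryptography backend, which you can install with pip install pypdf[crypto].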
Merge and Compress Workflow
Merging does not automatically reduce file size. In fact, the output can become larger if the PDFs contain many images or high-resolution scans.
If file size matters, you may need a separate compression step after merging. Compression usually requires another library or external tool, depending on your setup.
A common workflow is:
Merge PDFs
Optimize the combined file
Save or distribute the result
The merge step itself is usually done with pypdf, while compression may involve other tools.
Merge PDFs with Page Reordering
Sometimes you need to build a new PDF by taking pages from different files in a custom order. For example, you may want:
cover page first
appendix pages later
selected chapters from different PDFs
Example: merge files and order pages manually
from pypdf import PdfReader, PdfWriter
writer = PdfWriter()
pdf1 = PdfReader("first.pdf")
pdf2 = PdfReader("second.pdf")
# Add first page of first PDF
writer.add_page(pdf1.pages[0])
# Add second page of second PDF
writer.add_page(pdf2.pages[1])
# Add third page of first PDF
writer.add_page(pdf1.pages[2])
with open("custom_pages.pdf", "wb") as output:
    writer.write(output)
This is not a pure file merge, but it is often the next step when building advanced PDF workflows.
Build a Reusable Merge Function
It is better to wrap the merge logic in a function than repeat the same code many times.
Example: reusable function
from pypdf import PdfWriter
def merge_pdfs(input_files, output_file):
    merger = PdfWriter()
    try:
        for file_path in input_files:
            merger.append(file_path)
        merger.write(output_file)
    finally:
        # Runs even if a file fails to load
        merger.close()
# Example usage
files = ["one.pdf", "two.pdf", "three.pdf"]
merge_pdfs(files, "output.pdf")
Reusable functions make it easier to:
test your code
integrate it into larger systems
handle different file sets
keep your project clean
Add Logging to Your Merge Script
If your script is used in production, logging is more helpful than printing plain text.
Example: merge with logging
import logging
from pypdf import PdfWriter
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
def merge_pdfs(input_files, output_file):
    merger = PdfWriter()
    try:
        for file_path in input_files:
            logger.info("Adding %s", file_path)
            merger.append(file_path)
        merger.write(output_file)
        logger.info("Saved merged PDF to %s", output_file)
    except Exception:
        # logger.exception records the message together with the traceback
        logger.exception("Failed to merge PDFs")
        raise
    finally:
        merger.close()
merge_pdfs(["a.pdf", "b.pdf"], "merged.pdf")
Logging is especially useful when the script runs in a server, cron job, or background worker.
Common Mistakes When Merging PDFs in Python
Here are some mistakes that often cause problems:
1. Forgetting to close the merger
Always call close() or use try/finally. Otherwise, file handles may remain open.
2. Wrong page numbering
PDF libraries often use zero-based indexing. The first page is page 0, not page 1.
3. Merging in the wrong order
Always sort or explicitly define the order of files before merging.
4. Using invalid file paths
Make sure paths are correct and files exist before processing them.
5. Trying to merge corrupted PDFs
Some files may look like PDFs but be broken or incomplete.
6. Forgetting about encrypted files
Protected PDFs may require a password before they can be merged.
7. Assuming merge reduces file size
Merging only combines files. It does not compress them automatically.
A Full Command-Line Merge Script
Here is a practical script that merges multiple PDFs from the command line.
Example: command-line tool
import sys
from pypdf import PdfWriter
def merge_pdfs(output_file, input_files):
    merger = PdfWriter()
    try:
        for pdf in input_files:
            merger.append(pdf)
        merger.write(output_file)
    finally:
        merger.close()
if __name__ == "__main__":
    # Expect the script name, one output path, and at least two input files
    if len(sys.argv) < 4:
        print("Usage: python merge_pdfs.py output.pdf input1.pdf input2.pdf [input3.pdf ...]")
        sys.exit(1)
    output = sys.argv[1]
    inputs = sys.argv[2:]
    merge_pdfs(output, inputs)
    print(f"Merged {len(inputs)} PDFs into {output}")
Run it like this
python merge_pdfs.py merged.pdf a.pdf b.pdf c.pdf
This is a simple but very useful utility for everyday work.
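If the tool grows beyond a few positional arguments, the standard argparse module gives you help text and validation for free. The sketch below shows only the argument parsing. The -o/--output option and its default value are choices made for this illustration, and unlike the script above it accepts a single input file.

```python
import argparse

def build_parser():
    """Describe the command line: one or more inputs and an optional output path."""
    parser = argparse.ArgumentParser(description="Merge PDF files in the order given.")
    parser.add_argument("inputs", nargs="+", help="input PDF files, in merge order")
    parser.add_argument(
        "-o", "--output", default="merged.pdf",
        help="path of the merged PDF (default: merged.pdf)",
    )
    return parser

# Parsing an explicit list here; a real script would call parse_args() with no arguments
args = build_parser().parse_args(["-o", "bundle.pdf", "a.pdf", "b.pdf"])
print(args.output, args.inputs)  # bundle.pdf ['a.pdf', 'b.pdf']
```

The parsed values map directly onto the function above: merge_pdfs(args.output, args.inputs).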
A Complete Folder-Merge Script
Here is a polished script for merging all PDFs in a folder into one output file.
Example
from pathlib import Path
from pypdf import PdfWriter
def merge_folder(folder_path, output_file):
    folder = Path(folder_path)
    pdf_files = sorted(folder.glob("*.pdf"))
    if not pdf_files:
        print("No PDF files found.")
        return
    merger = PdfWriter()
    try:
        for pdf in pdf_files:
            print(f"Adding {pdf.name}")
            merger.append(str(pdf))
        merger.write(output_file)
        print(f"Created {output_file}")
    finally:
        merger.close()
merge_folder("pdfs", "merged_folder_output.pdf")
This script is easy to adapt for real projects.
When to Use pypdf and When to Use Something Else
pypdf is excellent for merging, splitting, page extraction, and metadata handling. It is usually the first library to try.
However, depending on your project, you may also need:
OCR tools for scanned documents
compression utilities for reducing file size
rendering tools for previewing pages
text extraction tools for reading content
For the merge operation itself, pypdf is usually enough.
Final Thoughts
Merging PDF files in Python is simple at the basic level, but it becomes much more powerful when you know how to handle real-world requirements such as folder processing, custom order, page selection, metadata, errors, and encrypted files.
The pypdf library gives you a clean and flexible way to combine documents with only a few lines of code. From small scripts to full document-processing systems, it is a practical tool that can save time and automate repetitive work.
If you are building a PDF utility, the merge feature is one of the most important features to implement first. It is useful, fast to develop, and easy to integrate into web apps, desktop apps, and automation scripts.
Quick Reference Example
Here is the shortest complete example again:
from pypdf import PdfWriter
merger = PdfWriter()
merger.append("file1.pdf")
merger.append("file2.pdf")
merger.write("merged.pdf")
merger.close()
That is all you need for a basic merge.
Hassan Agmir
Author · Filenewer
Writing about file tools and automation at Filenewer.