Create PDF Documents in Python With ReportLab
ReportLab is an open source toolkit for creating PDF documents from Python. It is a very extensive library with many features, from small texts and geometric figures to large graphics and illustrations, all of which can be included in a PDF. In this post we will look at its general features and the main functions for creating this type of document.
The library is installed via pip install reportlab .
The source code is hosted on this Mercurial repository .
First Steps
ReportLab includes a low-level API for generating PDF documents directly from Python, and a higher-level template language called RML, similar to HTML and the template systems used in web development. The second option is usually more convenient for those who must make exhaustive use of the capabilities of the library when generating documents. For the rest of the cases, the low-level API that we will describe in this article will suffice. You can find the official documentation for the package in its entirety at this link .
The most basic code that we can find using ReportLab is the one that generates an empty PDF document, which is the following.
The first thing we do is import the reportlab.pdfgen.canvas module, then we create an instance of the canvas.Canvas class passing it the name or path of the file we want to generate as an argument, and finally we call the Canvas.save() method that effectively saves the changes to the document.
While our c object represents the entire file we're working on, a canvas should be thought of simply as a blank sheet of paper on which to write, draw, or whatever. These writing or drawing operations will always occur between the creation of the document (line 3) and the method that saves the changes (line 4).
Let's start, then, by writing our first text in the document (remember that this line is located between the previous two).
Now when you open the hello-world.pdf file you will find our little message in the bottom left corner of the page.
As you may have guessed, the first two arguments passed to drawString() indicate the (x, y) position at which the text will appear. Unlike most popular desktop application development libraries, in ReportLab the origin of the coordinates (that is, the (0, 0) position) is at the bottom left. That means the Y-position increases as you go up the page, and the X-position increases as you move to the right. This inversion of the Y axis can be a bit confusing at first, but it does not present any additional difficulty; just keep it in mind when positioning objects.
That said, it is essential to know the measurements of each sheet when generating the document. By default a canvas uses the standard A4 size. Sheet dimensions are expressed in points , not pixels, with one point equaling 1/72 inch. An A4 sheet is approximately 595.3 points wide ( width ) and 841.9 points high ( height ).
By creating an instance of canvas.Canvas we can specify an alternate dimension for each of the sheets via the pagesize parameter, passing a tuple whose first element represents the width in points and the second, the height. We said that the default dimensions are those of the A4 standard; the reportlab.lib.pagesizes module provides the dimensions of other standards, such as letter , which is the most widely used in the United States.
Thus, to create a document with the dimensions used in the United States, we would do the following.
And to use the dimensions of standard A4:
Which results in a document equal to the first one we created, because pagesize is A4 by default.
Now that we know the height and width of our sheet, we can use them to calculate different positions within it. For example, to write our message in the upper left corner with margins of (approximately) 50 points:
In this case we have added a call to c.showPage() before saving the document. This method tells ReportLab that we have finished working on the current sheet and want to move on to the next one. Although we haven't worked with a second sheet yet (and it won't appear in the document until something has been drawn on it), it's good practice to remember to do so before calling c.save() .
We'll come back to writing later, first let's look at how to draw some basic geometric shapes and lines.
Geometric Shapes and Lines
ReportLab allows you to draw lines, rectangles, circles and other figures in a simple way. For example, to draw a line we call the line() method indicating the position of the two points of the segment: x1, y1, x2, y2 .
For a rectangle, rect(x, y, width, height) .
roundRect() operates similarly, but a fifth argument indicates the radius by which the ends are curved.
In the case of circles, the position of the center is indicated followed by the radius.
Lastly, for ellipses the arguments are similar to those for lines: x1, y1, x2, y2 , which here give two opposite corners of the rectangle that encloses the ellipse.
Putting all this together we can generate a PDF document like the following.
Other methods for generating shapes include bezier() , arc() , wedge() , and grid() . We will talk about the latter at the end of the article.
So far, both the text and the figures that we have drawn have used the default styles (basically black and white colors). You may have noticed that the functions we have been using do not support arguments such as foreground or background to indicate the color of each drawing in particular. Instead, the styles are set directly on the canvas (the sheet), and all operations on the sheet that follow this setting will use the indicated styles. When we change the sheet ( showPage() ), the styles are lost and must be set again if necessary.
So, for example, the setFillColorRGB() method sets the fill color of any object drawn on the sheet, so the following code outputs the text "Hello world!" and a square both in red.
Note that functions that draw shapes include the fill argument ( False by default) to indicate whether they should be colored.
Also, the setStrokeColorRGB() method sets the border color of shapes.
And to alter the font and size of the text drawn via drawString() , we use setFont() .
Although drawString() is sufficient for some words, it is somewhat inconvenient when drawing medium or large texts, since it is not capable of accepting line breaks. For tasks like this, ReportLab includes text objects , a more specialized way of drawing text.
First we must create the text object, indicating where we want to position the text.
Once this is done, we proceed to configure the different styles from the created object. For example, here we also have a setFont() method, but it acts on this particular object and not on the rest of the sheet.
Via the textLine() method we add lines of text to our object.
Once the text is written, we draw it on the sheet.
Other methods for formatting text include setCharSpace() , setWordSpace() , and setLeading() , which take as an argument the distance (in points) between two characters, two words, and two lines, respectively.
To insert images in a PDF document, ReportLab makes use of the Pillow library, which is simply installed via pip install Pillow .
The drawImage() method takes as arguments the path of an image (it supports multiple formats such as PNG, JPEG and GIF) and the (x, y) position at which you want to insert it.
We can shrink or enlarge the image by indicating its dimensions via the width and height arguments.
When we need to make calculations from the dimensions of an image, it is convenient to open it first via ImageReader() . For example, if we want to place an image in the upper left corner of the sheet, it will be necessary to know a priori the height of the image to calculate the position on the Y axis:
When generating grids, ReportLab makes our work easier by providing the grid() method, so we do not have to build them manually via the line() or lines() functions. It takes as its first argument a list of positions in X and as its second a list of positions in Y.
And here is the result:
As you may have noticed, xlist indicates the positions on the X axis of the start of each of the vertical lines, while ylist indicates the start (on the Y axis) of the horizontal ones. Based on this information, the library is in charge of constituting the grid in its entirety.
As an illustration, consider the following code that generates, using this method, a grid of students with their respective grades.
(If you are from the US, think of 10-1 grades as A-F grades, "Approved" as "Passing", and "Disapproved" as "Not passing".)
We have examined the main features of ReportLab, although this is only a small selection of its vast collection of functionality, as discussed at the beginning of the article. Those who need to make more exhaustive use of the library now know the basics, and I refer them once again to the official documentation to dig into the more complex tools.
A Simple Step-by-Step Reportlab Tutorial
The subtitle for this article could easily be “How To Create PDFs with Python”, but WordPress doesn’t support that. Anyway, the premier PDF library in Python is Reportlab . It is not distributed with the standard library, so you’ll need to download it if you want to run the examples in this tutorial. There will also be at least one example of how to put an image into a PDF, which means you’ll also need the Pillow package (PIL).
Reportlab supports most of the normal Python installation methods. For the old Reportlab 2.x versions you have the option of downloading the source and running “python setup.py install” or running a binary installer (on Windows).
For the newer Reportlab 3.x , you can now use pip on all platforms:
pip install reportlab
Note that Reportlab 3.x only supports Python 2.7 and Python 3.3+ . If you are on an older version of Python 2, then you have to use Reportlab 2.x.
Creating a Simple PDF
Reportlab has decent documentation. What I mean by that is that the documentation gives you just enough to get started, but when you find something slightly complex to do, you get to figure it out on your own. Just recently, they added a Code Snippets section to their website that will hopefully become a recipe book of cool tips and tricks and also help ameliorate this issue. But enough about that. Let’s see how to actually create something!
In Reportlab, the lowest-level component that’s used regularly is the canvas object from the pdfgen package. The functions in this package allow you to “paint” a document with your text, images, lines or whatever. I’ve heard some people describe this as writing in PostScript. I doubt it’s really that bad. In my experience, it’s actually a lot like using a GUI toolkit to layout widgets in specific locations. Let’s see how the canvas object works:
You should end up with a PDF that looks something like this:
The first thing to notice about this code is that if we want to save the PDF, we need to supply a file name to the Canvas object. This can be an absolute path or a relative path. In this example, it should create the PDF in the same location that you run the script from. The next piece of the puzzle is the drawString method. This will draw text wherever you tell it to. When using the canvas object, it starts at the bottom left of the page, so for this example, we told it to draw the string 100 points from the left margin and 750 points from the bottom of the page (1 point = 1/72 inch). You can change this default in the Canvas constructor by passing a zero to the bottomup keyword argument. However, I’m not exactly sure what will happen if you do that as the Reportlab user guide isn’t clear on this topic. I think it will change the start point to the top left though. The final piece in the code above is to save your PDF.
That was easy! You’ve just created a really simple PDF! Note that the default Canvas size is A4, so if you happen to be American, you’ll probably want to change that to letter size. This is easy to do in Reportlab. All you need to do is the following:
The main reason to grab the width and height is that you can use them for calculations to decide when to add a page break or help define margins. Let’s take a quick look at the constructor for the Canvas object to see what other options we have:
The above was pulled directly from the Reportlab User Guide , page 11. You can read about the other options in their guide if you want the full details.
Now let’s do something a little more complicated and useful.
A Little Form, Little Function
In this example, we’ll create a partial printable form. As far as I could tell when this was written, Reportlab didn’t support the fillable forms that were added to Adobe products a few years ago (more recent releases expose interactive form fields through canvas.acroForm). Anyway, let’s take a look at some code!
This is based on an actual receipt I created at work. The main difference between this one and the previous example is the canvas.line code. You can use it to draw lines on your documents by passing two X/Y pairs. I’ve used this functionality to create grids, although it’s pretty tedious. Other points of interest in this code include the setLineWidth(.3) command, which tells Reportlab how thick or thin the line should be; and the setFont(‘Helvetica’, 12) command, which allows us to specify a specific font and point size.
Our next example will build on what we’ve learned so far, but also introduce us to the concept of “flowables”.
Going with the Flow
If you’re in advertising or do any kind of work with form letters, then Reportlab makes for an excellent addition to your arsenal. We use it to create form letters for people who have overdue parking tickets. The following example is based on some code I wrote for that application, although the letter is quite a bit different. (Note that the code below will not run without the Python Imaging Library)
Well, that’s a lot more code than our previous examples contained. We’ll need to look over it slowly to understand everything that’s going on. When you’re ready, just continue reading.
The first part that we need to look at is the new imports:
From enums, we import “TA_JUSTIFY”, which allows our strings to have the justified format. There are a number of other constants we could import that would allow us to right or left justify our text and do other fun things. Next is the platypus (which stands for Page Layout and Typography Using Scripts) module. It contains lots of modules, but probably the most important of them are the flowables, such as Paragraph. A flowable typically has the following abilities: wrap , draw and sometimes split . They are used to make writing paragraphs, tables and other constructs over multiple pages easier to do.
The SimpleDocTemplate class allows us to set up margins, page size, filename and a bunch of other settings for our document all in one place. A Spacer is good for adding a line of blank space, like a paragraph break. The Image class utilizes the Python Imaging Library to allow easy insertion and manipulation of images in your PDF.
The getSampleStyleSheet gets a set of default styles that we can use in our PDF. ParagraphStyle is used to set our paragraph’s text alignment in this example, but it can do much more than that (see page 67 of the user guide). Finally, the inch is a unit of measurement to help in sizing and positioning items on your PDF. You can see this in action where we create the logo: Image(logo, 2*inch, 2*inch). The second and third arguments are the width and height, so the logo will be drawn two inches wide and two inches tall.
I don’t recall the reason why Reportlab’s examples use a Story list, but that’s how we’ll do it here as well. Basically you create a line of text, a table, an image or whatever and append it to the Story list. You’ll see that throughout the entire example. The first time we use it is when we add the image. Before we look at the next instance, we’ll need to look at how we add a style to our styles object:
The reason this is important is because you can use the style list to apply various paragraph alignment settings (and more) to text in your document. In the code above, we create a ParagraphStyle called “Justify”. All it does is justify our text. You’ll see an example of this later in the text. For now, let’s look at a quick example:
For our first line of text, we use the Paragraph class. As you can see, the Paragraph class accepts some HTML-like tags. In this instance, we set the font’s point size to 12 and use the normal style (which is left aligned, among other things). The rest of the example is pretty much the same, just with Spacers thrown in here and there. At the end, we call doc.build to create the document.
Now you know the basics for creating PDFs in Python using Reportlab. We didn’t even scratch the surface of what all you can do with Reportlab though. Some examples include tables, graphs, paginating, color overprinting, hyperlinks, graphics and much more. I highly recommend that you download the module along with its user guide and give it a try!
- Reportlab Home Page
- Generating Reports with Charts
- A series on Reportlab
- Getting Started with Reportlab
PDF Report example with a front-page, headers and table
Creating PDF files using Python and reportlab
Fonts (types and sizes), default page size.
Published on 2019-10-19
Author: Gabor Szabo
Data Deep Dive: Creating PDF reports with ReportLab and Pandas
ReportLab “create solutions to generate rich, attractive and fully bespoke PDFs at incredible speeds”. They provide both commercial and open source offerings. Here, I will focus on the open source Python library . This is used by MediaWiki (the platform behind Wikipedia) to create PDF versions of articles.
Things I like about ReportLab:
- Everything happens in Python and there is no need to work with multiple files
- Support for changing page size within documents
- Rendering is fast
- The Platypus layout engine
Things I don’t like about ReportLab:
- Objects get modified in place when building a document. This can become a problem when running in a notebook where objects were created in previous cells and don’t necessarily get recreated when only running the cell which builds the document.
- Errors are often difficult to interpret
- Camel case is used instead of snake case
- The documentation is only available in PDF format and it can be difficult to find what you’re looking for
But there are other tools available. For example, Plotly can be used to generate HTML pages containing graphs and tables which can then be converted to PDF. This is handy if you are already using Plotly to create your figures, however, this does not give you any control over what goes on which page as HTML has no concept of pages as such. Also, you will need to know some HTML to get this working. Another way of creating PDF reports from Python is to use PyFPDF . However, as their documentation demonstrates, PyFPDF does not provide a flexible page layout engine. Again, this means you have to specify what goes on which page. This may be manageable in your use case, but I personally find it much easier to provide a list of content and have the pages be created automatically (similar to what MS Word does).
The recommended way to install ReportLab is from PyPI, using pip install reportlab .
To create tables and figures, we will also need Pandas and Matplotlib ( pip install pandas matplotlib ).
We will use the Platypus layout engine to do most of the work of laying out our PDF.
Platypus stands for “Page Layout and Typography Using Scripts”. It is a high level page layout library which lets you programmatically create complex documents with a minimum of effort.
If you want to use ReportLab without Platypus, you will need to manually position your content. This may be appropriate in some use cases but, in the majority of cases, using Platypus will make your life easier.
The first thing you will want to do is create some frames.
A frame is a region of a page that can contain flowing text or graphics
Frames are used to decide how much content can fit on each page. I have created two frames, one for portrait pages and one for landscape:
The first and second arguments of the Frame class are the x and y coordinates of the lower left corner of the frame. These are followed by the width and height of the frame, information which can be obtained from the page size object (A4). Finally, the padding in each direction can be specified. I have set the padding to be the same in both frames, but this doesn’t always need to be the case.
Next, create a function that will be called (calling a function runs the block of code it contains) on each page. This can be used to add logos and page numbers, for example. Notice that the location of the page number and image need to be manually specified and that they don’t sit within a frame. I have created two versions of the function, one for each page size, as I wasn’t able to work out how to get the current page size within the function. The landscape function conveniently converts the A4 page size to be landscape.
This function can be combined with frames to create page templates.
A page template can contain one or more frames
It is important to specify an ID for each of these page templates as this will be used later to switch between them. Notice that different onPage functions are used in each template, as mentioned above. A single frame, created above, is used to create each template.
These page templates can be combined into a document template, which also specifies the name of the file to create. You can only have one document template per document as this is the top-level container. The best way to think about it is that there is always a single document template, which can contain one or more page templates, each of which can contain one or more frames.
Defining conversion functions
To be able to insert Matplotlib figures and Pandas DataFrames into reports, they’ll need to be converted into ReportLab Images and Tables. To convert figures in memory, I’m using the io library and creating a binary stream. I then save the figure as a PNG to this binary stream and seek to the beginning of the stream. You could also save them as vector graphics (SVG); however, ReportLab is not able to handle these out of the box and you will need to use Svglib .
The ReportLab Image class takes the image size in points. Matplotlib figures have a get_size_inches method and the output of this can be easily converted to points using the inch object from ReportLab (one inch is 72 points). Depending on the figure, it may be necessary to call tight_layout before converting it to an Image. You could also save these images to disk and then load them again, but I personally think it’s simpler and easier to do everything in memory. It also avoids having to deal with file names and potential write permission issues.
To convert DataFrames to Tables, you will first need to convert all the columns to Paragraphs. This ensures that the text can wrap in the title row, which is helpful for long column names. The title row can be added to the values from the DataFrame as a list. The result is a list of rows, with each row being a list of values. The DataFrame index is not used here, but you could incorporate it if you wanted to. The style argument can be used to set fonts, backgrounds and borders. In this example, the table will have alternating grey and white backgrounds and borders around all cells. There is a detailed description of how table styling works in the documentation but I won’t go into it here.
I have created a single function each for tables and figures but you could have multiple. This would allow, for example, certain tables to be styled differently.
To create some images and tables, we’re going to use the famous Iris dataset . This dataset describes various features of 150 iris plants. I read this dataset using Pandas and then aggregate the features of plants by type, storing this as a new DataFrame (plant_type_df) . Notice that the variable name is suffixed with _df to make it easier to distinguish from figures. I then create a figure with a unique name ( plant_type_fig ) using plt.subplots and plot the plant_type_df onto this figure as a bar plot. Notice that the figure variable name is suffixed with _fig to make it easier to distinguish from DataFrames. Both the figure and DataFrame can be used later when building the report. If you have many figures and DataFrames, it’s important to use easily understandable variable names.
We can create more figures and tables like so:
Now, let’s build the report! The figures and DataFrames that we made earlier can be converted using the appropriate conversion functions and included in a list called story . To switch between templates, set the next page template and then create a page break. You can then add titles and paragraphs using the Paragraph class. These can be styled using the sample style sheet provided by ReportLab, or you can create your own styles. Unnamed DataFrames can also be created and used within the story. For example, I call the corr method of df to get a DataFrame with the pairwise correlations and then convert this to a table.
Notice that I am not creating the ReportLab Image and Table objects earlier in my script. This is because these objects will be modified by ReportLab when the document is built and therefore if you are running in an interactive environment like a notebook, this may result in unexpected behaviour if you try and build the document without running the code which creates them.
Once you are happy with your story, pass it to the doc.build function and your report will be ready in no time!
Here is a link to the report we created.
We have built a PDF report containing figures and tables using ReportLab and Pandas. This process is easily reproducible for other datasets and could be automated for producing reports on a regular basis. While there are other Python libraries available, I believe ReportLab provides great potential for fine-grained control over your reports and adding new content is simple once everything is set up. Getting started can be tricky due to the many types of object involved (document templates, page templates, frames, images, tables, and so on), but hopefully the above example should help you on your way. If you work at a company which would like to automate their reports but doesn’t know where to start, why not run a data skills project with us?
This article was built using Quarto. You can see the source code HERE .
Dr Fergus McClean is a Data Scientist at the National Innovation Centre for Data specialising in, among other things, data engineering and visualisation. He has a background in environmental modelling and his PhD investigated the impact of using global datasets for flood inundation modelling and involved designing a cloud-based framework for running simulations.
Want to know more? Attend our Data Innovation Showcase
Join Fergus at our Data Innovation Showcase on the 27 September for a practical workshop where we'll walk you through the process of automating your first report and provide inspiration to optimise your own reporting workflows. No prior coding experience is needed. Learn more about the session by exploring the conference programme .
Generate a Simple PDF using Python ReportLab (9 Steps)
Problem formulation and solution overview.
This article shows how to generate a formatted PDF file from a CSV file using the ReportLab and Pandas libraries in conjunction with slicing .
ℹ️ Python offers numerous ways to generate a PDF file. One option is to use the ReportLab library. This article takes the approach that requires the least amount of code.
To make it more interesting, we have the following running scenario:
🚗 Example Scenario : KarTek , a car dealership in Toronto, previously asked you to import their CSV file into a Database. They would like this data converted to a PDF format to email out to their Sales Staff daily .
To follow along with this article, download and place the cars.csv file into the current working directory:
Download and save the logo file to the current working directory as car_logo.png.
💬 Question : How would we write code to generate a PDF from a CSV file ?
We can create a PDF from a CSV file by performing the following steps:
- Install ReportLab and Pandas Libraries
- Add Library References
- Read CSV File
- Calculate Total Pages
- Get Stylesheet and Template
- Create Page Header
- Paginate the Data
- Build the PDF
- Generate the PDF
Step 1: Install ReportLab and Pandas Libraries
Before moving forward, the ReportLab and Pandas libraries must be installed. To install these libraries, run pip install reportlab pandas at the command prompt.
The ReportLab library is needed to generate a PDF, and the Pandas library is required to read and manipulate the cars.csv file.
Step 2: Add Library References
To run this code error-free, references to the required modules must be added.
To add these references, navigate to the IDE. Create a file called pdf.py and place this file into the current working directory.
Copy the code snippet below and paste this code into the above-created file.
An alternate option would be to reference precisely what is needed.
The first option is much cleaner. There are pros and cons for each selection. However, the choice is up to you.
Save this file.
Step 3: Read CSV File
The next step is to read the CSV file and extract the heading row.
Copy and paste this code snippet to the bottom of the pdf.py file.
The first line in the above code snippet reads in the first 60 rows ( head(60) ) of the cars.csv file. In this file, the field separator character is a semi-colon ( ;) and must be specified, as read_csv() assumes the separator character is a comma ( , ). The results save to df_cars .
The following line casts all data types to strings ( astype(str) ) and converts the data into a list of lists , with the header row first. The result saves to df_data . If output to the terminal, the data would display as shown below (a snippet of the file).
The last line removes the header row using slicing , leaving only the data. The results save to pg_data . If output to the terminal, the data would display as shown below (a snippet of the file).
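The three lines described in this step could be sketched as follows. To keep the example self-contained it reads a small inline sample instead of cars.csv. The sample rows and column names are made up for illustration, and the exact expressions in the original snippet may differ.

```python
import io

import pandas as pd

# Made-up, semicolon-separated sample standing in for cars.csv
csv_text = (
    "Car;MPG;Origin\n"
    "Alpha;18.0;US\n"
    "Beta;30.5;Europe\n"
    "Gamma;24.0;Japan\n"
)

# Read at most the first 60 rows; the separator must be given explicitly
df_cars = pd.read_csv(io.StringIO(csv_text), sep=";").head(60)

# Header row first, then every record cast to strings
df_data = [df_cars.columns.tolist()] + df_cars.astype(str).values.tolist()

# Slice off the header row, leaving only the data
pg_data = df_data[1:]
```

With the real file, `io.StringIO(csv_text)` would simply be replaced by the path `"cars.csv"`.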
Step 4: Calculate Total Pages
This code snippet is used to calculate how many pages the PDF will be.
The first line in the above code snippet declares an empty list called elements . This list will hold all the page formatting for the PDF.
The following line declares how many records per page will display (not including the header row). The results save to recs_pg .
The last line uses the ceil() function from the math library to calculate how many pages this PDF will be. This function returns an integer value. The results save to tot_pgs . If output to the terminal, the following will display.
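A minimal sketch of this calculation is shown below. The value 39 for recs_pg comes from Step 8 of this article. The row count of 60 is an assumption based on the head(60) call in Step 3.

```python
import math

elements = []    # will hold everything that goes into the PDF
recs_pg = 39     # records per page, not counting the header row
total_rows = 60  # assumed number of data rows, from head(60)

# Round up so a partially filled last page still counts as a page
tot_pgs = math.ceil(total_rows / recs_pg)
print(tot_pgs)
```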
Step 5: Get Stylesheet and Template
This code snippet is used to set the styles and a template for the PDF.
The first line in the above code snippet calls getSampleStyleSheet() and keeps the stylesheet it returns. This stylesheet gives us access to default styles, such as Title , Heading1 , Heading2 , etc. Additional styles can be found here .
The following line creates a SimpleDocTemplate() and passes it five (5) arguments: the filename of the PDF to generate and the four page margins. The results save to doc .
Step 6: Create Page Header
This code snippet creates a header for each page of the PDF.
The first line in the above code snippet declares the function createPageHeader() . The contents of this function will appear at the top of each page.
The following line adds spacing from the top of the page using the Spacer() function and passing it two (2) arguments: the width and height. The results are appended to elements , declared at the beginning of the code.
💡To achieve the offset, the topMargin argument could also be adjusted. However, we wanted to show another way to achieve the same result.
The next line calls Image() and passes it three (3) arguments: the image file, followed by the width and height at which to draw it, respectively.
The results are appended to elements , declared at the beginning of the code.
Then, a heading of Inventory is called using the Paragraph() function and passing it two (2) arguments: the text and the style for the text. The results are appended to elements , declared at the beginning of the code.
The last line appends another Spacer() .
Step 7: Paginate the Data
This code snippet shows how the data for each page is paginated.
The first line in the above code snippet declares the function paginateInventory() , which accepts two (2) arguments: a start and stop position.
The following line creates the data for the page. The Table() function is called and passed one (1) argument: the header row from df_data ( df_data[0:1] ), plus the data for the specified page ( pg_data[start:stop] ). These results save to tbl .
The next line sets out the format for the table on the page. This line does the following:
- Changes the background color of the header row.
- Changes the font size.
- Sets up the grid lines between all the columns and rows.
The tbl is then appended to elements , declared at the beginning of the code.
Step 8: Build the PDF
This code snippet creates a function to loop through the paginated data and builds a PDF.
The first line in the above code snippet declares the function generatePDF() .
The following three (3) lines declare three (3) variables and their initial positions.
- cur_pg , which keeps track of the page we are currently on.
- start_pos , which is where the data initially starts.
- stop_pos , which is where the data initially ends ( recs_pg , or 39).
The following section does all the work! It declares a for loop, which loops through all pre-determined pages ( tot_pgs , or 2 in this case). Each iteration creates a page by:
- Executing the createPageHeader() function.
- Outputting the page data using slicing ( paginateInventory(start_pos, stop_pos) ).
- Adding a page break ( PageBreak() ).
- Re-calculating the start_pos and stop_pos variables.
This loop continues until all pages have been generated.
The last line in this code snippet builds the PDF.
Step 9: Generate the PDF
This code snippet generates the PDF built earlier.
If you ran the above code, no PDF file would be generated. This is because the following code needs to be appended to the end of the pdf.py file.
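The first line ( if __name__ == '__main__': ) marks the top-level starting point of the program: the code beneath it runs only when the file is executed directly, not when it is imported by another module.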
An alternative is to just call the function.
There are pros and cons to both options. However, the choice is up to you.
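The two options can be compared in a short sketch. The body of generatePDF() here is only a placeholder standing in for the function from Step 8.

```python
def generatePDF():
    """Placeholder for the real function built in Step 8."""
    return "inventory.pdf"


# Option 1: run only when the file is executed directly (python pdf.py),
# not when it is imported from another module.
if __name__ == "__main__":
    generatePDF()

# Option 2 (alternative): call the function unconditionally.
# generatePDF()
```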
Save and run this file.
If successful, the inventory.pdf file will be saved to the current working directory.
The Full Code
This article has shown you a compact way to generate a customized PDF file.
Good Luck & Happy Coding!
At university, I found my love of writing and coding. Both of which I was able to use in my career.
During the past 15 years, I have held a number of positions such as:
- In-house Corporate Technical Writer for various software programs such as Navision and Microsoft CRM
- Corporate Trainer (staff of 30+)
- Programming Instructor
- Implementation Specialist for Navision and Microsoft CRM
- Senior PHP Coder
Python, Pygame and Tkinter with free tutorials – on twitter I'm @pythonprogrammi on youtube GiovanniPython
Create a pdf with reportlab
The module reportlab seems very good to make pdf files. This is the first post about this library. Let’s start with making a simple new pdf with some text like our hello world script.
I have used this module to create a nice application that adds a page with text to an existing pdf file (made to add an evaluation to tests) in this post here .
This code was taken from the Pythonvsmouse blog . You can find more documentation here with some useful code snippets .
Coordinates start from the bottom of the page. To print the text near the top, you need to put something like 700 as the height.
This is what you should end up with:
In the following video there is this code above, just to show how easy it is. The audio is not so good, sorry.
The default page size is A4. If you want letter size, pass the pagesize argument to the Canvas class ( pagesize=letter ), like this:
How to set fonts
To set the size and type of the fonts, use code like this:
You can draw lines with this code.
- first you set the width of the line
- then you give the coordinates
Video 1: reportlab hello world
In the next post we will go further into this module to see what we can do with a pdf and python.
Published by pythonprogramming