Read PDF File Using Python in Robot Framework — Devstringx

Task: Read data from a PDF file and check whether the text “Testing” is present in it.

Create a function to read data from PDF File using Python

First, install the pdfminer and Pdf2TextLibrary libraries on your system by following the steps below:

  1. Open a command prompt.
  2. Run the “pip install pdfminer” command to install the pdfminer library.
  3. Run the “pip install robotframework-pdf2textlibrary” command to install Pdf2TextLibrary.
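After the install steps above, a quick check that the packages are importable can save a confusing failure later. The snippet below is a minimal sketch, not part of the original setup: `missing_modules` is a hypothetical helper name, and it assumes the installed packages expose the top-level modules `pdfminer` and `robot`.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    missing = []
    for name in names:
        try:
            # find_spec returns None when a top-level module is not installed
            found = importlib.util.find_spec(name) is not None
        except ModuleNotFoundError:
            # raised for dotted names whose parent package is missing
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed module names: pdfminer (PDF parsing) and robot (Robot Framework)
    required = ["pdfminer", "robot"]
    print(missing_modules(required) or "all required modules found")
```

If the script prints a list of names, those modules still need to be installed with pip.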

Now create a Python file. You can give the file any name; save it with the .py extension.

I have named my file Pdf2TextLibrary.py:

from io import StringIO

from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.pdfpage import PDFPage


class Pdf2TextLibrary(object):
    # Share one library instance across the whole test run
    ROBOT_LIBRARY_SCOPE = 'GLOBAL'

    def __init__(self):
        print('pdf to text library')

    def convert_pdf_to_txt(self, path):
        """Return the text of every page of the PDF at `path`."""
        rsrcmgr = PDFResourceManager()
        retstr = StringIO()
        codec = 'utf-8'
        laparams = LAParams()
        device = TextConverter(rsrcmgr, retstr, codec=codec, laparams=laparams)
        interpreter = PDFPageInterpreter(rsrcmgr, device)
        password = ""
        maxpages = 0      # 0 means no page limit
        caching = True
        pagenos = set()   # empty set means all pages
        # The with-block closes the file even if parsing fails
        with open(path, 'rb') as fp:
            for page in PDFPage.get_pages(fp, pagenos, maxpages=maxpages,
                                          password=password, caching=caching,
                                          check_extractable=True):
                interpreter.process_page(page)
        device.close()
        text = retstr.getvalue()  # renamed from str to avoid shadowing the builtin
        retstr.close()
        return text

Here, the convert_pdf_to_txt function converts the contents of a PDF file to text.
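Text extracted from a PDF often contains line breaks and repeated spaces that are not visible in the rendered page, so a plain substring check can miss a phrase that spans two lines. The sketch below shows one way to compare text while ignoring whitespace differences. `contains_text` is a hypothetical helper, not something pdfminer or Pdf2TextLibrary provides.

```python
import re


def contains_text(extracted, expected):
    """Return True if `expected` occurs in `extracted`, ignoring whitespace differences."""
    def normalize(s):
        # Collapse every run of whitespace (spaces, tabs, newlines) to one space
        return re.sub(r"\s+", " ", s).strip()

    return normalize(expected) in normalize(extracted)
```

Assuming a file named sample.pdf exists (a placeholder name), the helper could be combined with the function above as `contains_text(Pdf2TextLibrary().convert_pdf_to_txt("sample.pdf"), "Testing")`.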


Calling the Python function in Robot Framework to read PDF data

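*** Settings ***
# List Files In Directory comes from the OperatingSystem library
Library    OperatingSystem
# Import the file in which you have created the function to read data from a PDF file
Library    ../Scripts/Pdf2TextLibrary.py

*** Test Cases ***
Read PDF File Data
    # Open the downloaded PDF and read data from it
    ${file_name}=    List Files In Directory    ${EXECDIR}/Files/Downloads
    # ${DWNLDFOLDER} is assumed to be defined elsewhere and to end with a path separator.
    # List Files In Directory returns a list, so [0] selects the first file name.
    ${string}=    convert_pdf_to_txt    ${DWNLDFOLDER}${file_name}[0]
    Should Contain    ${string}    Testing    # check entered text is present in PDF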

Save this file as “TestPDF.robot”.

Run the above file by using the following command:
>robot TestPDF.robot

Output: The test passes if the text “Testing” is present in the downloaded PDF.
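The test case above reads the file name with List Files In Directory, which returns a list of names, so one entry has to be picked before the path is passed to convert_pdf_to_txt. One option is to do the selection in Python instead. The function below is a sketch under that assumption; `first_pdf_path` is a hypothetical name and is not part of Pdf2TextLibrary.

```python
from pathlib import Path


def first_pdf_path(directory):
    """Return the path of the alphabetically first .pdf file in `directory`."""
    pdfs = sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
    if not pdfs:
        # Fail with a clear message instead of passing an empty path along
        raise FileNotFoundError(f"No PDF files found in {directory}")
    return str(pdfs[0])
```

If this were added as a method of the library class, Robot Framework would expose it as the keyword First Pdf Path, and its return value could be passed straight to convert_pdf_to_txt.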

Originally published at https://www.devstringx.com on April 05, 2022.

Devstringx Technologies is a highly recommended IT company for custom software development, mobile app development, and automation testing services.
