{"id":669,"date":"2020-04-05T21:19:52","date_gmt":"2020-04-05T11:19:52","guid":{"rendered":"http:\/\/clickworks.me\/?p=669"},"modified":"2020-04-18T12:12:48","modified_gmt":"2020-04-18T02:12:48","slug":"create-unique-pdf-files-test-data-generation-for-python-selenium-end-to-end-test-automation","status":"publish","type":"post","link":"https:\/\/clickworks.me\/index.php\/2020\/04\/05\/create-unique-pdf-files-test-data-generation-for-python-selenium-end-to-end-test-automation\/","title":{"rendered":"Create  &#8220;Unique&#8221; PDF Files:  Test data generation for Python Selenium End-to-End Test Automation"},"content":{"rendered":"<p>GitHub repository: <a href=\"https:\/\/github.com\/MaksimZinovev\/pdfhandy\">pdfhandy<\/a><\/p>\n<p>This is the first article where I finally got the courage to share some code from my first Test Automation Project. I started to learn Python and Selenium in November 2019. Since then I have managed to write a few tests but, most importantly, I think I&#8217;ve built some kind of foundation for scaling and improving my tests. Probably the most satisfying part was seeing a steady growth of short snippets which I was creating to understand new concepts and get my feet wet as I learned Python, Pytest, Selenium and other areas.<\/p>\n<p>The code below is something that I tried to Google, but I found only a few answers on Stack Overflow. I used them as a starting point and after 1-2 weeks of hard work and several &#8220;a-ha&#8221; moments I finally managed to get it to work &#8211; my own PDF generator, which I could use to create &#8220;unique&#8221; pdf files.<\/p>\n<h1>The Goal<\/h1>\n<p>I got to a point where I wanted to test the web app file upload process. However, the app does not allow uploading identical files. I used to do it manually: googled pdf files, downloaded as many as I could and used them in manual tests. 
Sometimes I also opened a file manually and added some short text so that the file was then treated as a new one. That saved some time but was still quite a time-consuming activity.<\/p>\n<p>That&#8217;s why I decided to write a small script that would make my life easier. How could I automate this process so that I could use a simple pdf generator function to create files for my tests?<\/p>\n<p>With this goal in mind I drafted a list of requirements before getting to work:<\/p>\n<ul>\n<li>-can be run as a fixture before the test starts<\/li>\n<li>-multiple pdf files can be generated and stored in a dedicated directory<\/li>\n<li>-the filename is generated in the following format\/pattern: &#8220;Test_pdf_0318_1.pdf&#8221;, where &#8220;0318&#8221; is the test number and &#8220;1&#8221; is the pdf number within the current test<\/li>\n<li>-files are automatically deleted after the test<\/li>\n<li>-files are automatically archived after the test has finished<\/li>\n<li>-a base pdf file is used to generate new pdf files; the base file can be replaced<\/li>\n<li>-a pdf object (an instance of a custom PDF class) is returned. The pdf object contains useful information that can be used later in tests to upload the file and to make assertions: file_time (when the pdf was generated), file_date, file_size, file_path, etc.<\/li>\n<\/ul>\n<h1>The structure<\/h1>\n<p>I know that my implementation is probably far from good coding practices. However, &#8220;practice makes perfect&#8221;. That&#8217;s why I decided to share it even if it&#8217;s ugly and missing some things. 
At least it does the job I needed.<\/p>\n<p>Here is the structure of the function that I called &#8220;pdf_factory&#8221;:<\/p>\n<ul>\n<li>-cleanup<\/li>\n<li>-generate a one-page pdf file containing the pdf name, time\/date and calling test<\/li>\n<li>-merge the generated one-page pdf with the base pdf file<\/li>\n<li>-create a PDF instance using the PDF class<\/li>\n<li>-return a single PDF object or a list of PDF objects<\/li>\n<\/ul>\n<h1>Clean up<\/h1>\n<p>Clean up simply deletes the pdf files generated for previous tests before a new test starts. Ideally the cleanup should be done after the test. The &#8220;yield&#8221; statement would be ideal for this purpose, as it lets a fixture run some code before the test and some after it. However, it didn&#8217;t work for me. If I am correct, that&#8217;s because in the pdf_factory fixture I used the pattern where the fixture returns a reference to an inner function (I used it to make it possible to pass arguments to the fixture).<\/p>\n<h1>Generate one-page pdf<\/h1>\n<p>Once the clean up is finished, a one-page pdf is generated. Its main purposes are to make each merged pdf file &#8220;unique&#8221; and to display useful information so that you can easily identify the file in your tests. I used &#8220;reportlab&#8221; for pdf generation. 
It has\u00a0 lots of methods and it was quite easy to find examples and documentation.<\/p>\n<p>Here is the screenshot:<\/p>\n<p>&nbsp;<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-full wp-image-693\" src=\"http:\/\/clickworks.me\/wp-content\/uploads\/2020\/04\/pdfhandy-get_testid_pdf_example.png\" alt=\"\" width=\"535\" height=\"689\" srcset=\"https:\/\/clickworks.me\/wp-content\/uploads\/2020\/04\/pdfhandy-get_testid_pdf_example.png 535w, https:\/\/clickworks.me\/wp-content\/uploads\/2020\/04\/pdfhandy-get_testid_pdf_example-233x300.png 233w\" sizes=\"(max-width: 535px) 100vw, 535px\" \/><\/p>\n<p>&nbsp;<\/p>\n<h1>Merge\u00a0 two pdf files<\/h1>\n<p>Now we just need to merge base pdf file and generated one-page pdf with info. This time I used very popular PyPDF2 library. Here is the screenshot of my base pdf file which has 16 pages<\/p>\n<p>&nbsp;<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-large wp-image-694\" src=\"http:\/\/clickworks.me\/wp-content\/uploads\/2020\/04\/Screenshot-2020-04-05-19.36.38-1024x929.png\" alt=\"Merge two pdf files using PyPDF2\" width=\"640\" height=\"581\" srcset=\"https:\/\/clickworks.me\/wp-content\/uploads\/2020\/04\/Screenshot-2020-04-05-19.36.38-1024x929.png 1024w, https:\/\/clickworks.me\/wp-content\/uploads\/2020\/04\/Screenshot-2020-04-05-19.36.38-300x272.png 300w, https:\/\/clickworks.me\/wp-content\/uploads\/2020\/04\/Screenshot-2020-04-05-19.36.38-768x697.png 768w, https:\/\/clickworks.me\/wp-content\/uploads\/2020\/04\/Screenshot-2020-04-05-19.36.38.png 1190w\" sizes=\"(max-width: 640px) 100vw, 640px\" \/><\/p>\n<p>&nbsp;<\/p>\n<h1>Create PDF object<\/h1>\n<p>Now it&#8217;s time to create PDF object to store some useful information:<\/p>\n<pre><code class=\"language-python\">class PDF:\n    file_path = None\n    file_name = None\n    file_dt = None\n    file_date = None\n    file_time = None\n    file_num = None\n    file_pages = None\n    file_size = None\n    
file_tzoffset = None<\/code><\/pre>\n<h1>Return pdf or list of pdf objects<\/h1>\n<p>By default &#8220;count=1&#8221;. This is the fixture parameter that defines the number of pdf files generated for the current test. If count &gt; 1, the fixture returns a list of PDF objects; otherwise it returns a single PDF object. Here is the body of the fixture.<\/p>\n<pre><code class=\"language-python\">        # Body of _pdf_factory (excerpt)\n        pdf_list = []\n        pdf = PDF()\n\n        cleanup(folder, fname_template, archive_num)\n        for k in range(1, count+1):\n            pdf = get_testid_pdf(node, folder, testid_filename, fname_template, k)\n            pdf.file_path, pdf.file_size = write_merged_pdf(base_files, folder, testid_filename, fname_template, k)\n            pdf_list.append(pdf)\n        return pdf if count == 1 else pdf_list<\/code><\/pre>\n<p>Below is an example of a test that uses the pdf_factory fixture. Along with the &#8220;pdf_factory&#8221; fixture, a few other fixtures appear in the arguments of the test function because their values are passed to pdf_factory.<\/p>\n<pre><code class=\"language-python\">import logging\n\n\ndef test_pdf_factory_multiple(request, current_test_num, pdf_factory):\n    pdfs = pdf_factory(request.node.nodeid, current_test_num, count=3)\n    for k, pdf_obj in enumerate(pdfs):\n        logging.info(f&#039;ITERATION: {k}&#039;)\n        logging.info(f&#039;file_date: {pdf_obj.file_date}&#039;)\n        logging.info(f&#039;file_path: {pdf_obj.file_path}&#039;)\n        logging.info(f&#039;file_size: {pdf_obj.file_size}&#039;)\n        logging.info(f&#039;file_name: {pdf_obj.file_name}&#039;)<\/code><\/pre>\n<p>&nbsp;<\/p>\n<p>Here is the output:<\/p>\n<pre class=\"prettyprint lang-bsh\">-------------------------------- live log call ---------------------------------\r\n19:54:35 INFO pdf.file_num: 0319_1\r\n19:54:35 INFO pdf.file_name: Test_pdf_0319_1.pdf\r\n19:54:35 INFO pdf.file_date: 2020-04-05\r\n19:54:35 INFO pdf.file_time: 19:54:35.308970\r\n19:54:35 INFO 
pdf.file_tzoffset: 11.0\r\n19:54:35 INFO pdf.file_num: 0319_2\r\n19:54:35 INFO pdf.file_name: Test_pdf_0319_2.pdf\r\n19:54:35 INFO pdf.file_date: 2020-04-05\r\n19:54:35 INFO pdf.file_time: 19:54:35.379654\r\n19:54:35 INFO pdf.file_tzoffset: 11.0\r\n19:54:35 INFO pdf.file_num: 0319_3\r\n19:54:35 INFO pdf.file_name: Test_pdf_0319_3.pdf\r\n19:54:35 INFO pdf.file_date: 2020-04-05\r\n19:54:35 INFO pdf.file_time: 19:54:35.432892\r\n19:54:35 INFO pdf.file_tzoffset: 11.0\r\n19:54:35 INFO ITERATION: 0\r\n19:54:35 INFO file_date: 2020-04-05\r\n19:54:35 INFO file_path: \/Users\/maksim\/repos\/p4-python-aerofiler\/data\/Test_pdf_0319_1.pdf\r\n19:54:35 INFO file_size: 262\r\n19:54:35 INFO file_name: Test_pdf_0319_1.pdf\r\n19:54:35 INFO ITERATION: 1\r\n19:54:35 INFO file_date: 2020-04-05\r\n19:54:35 INFO file_path: \/Users\/maksim\/repos\/p4-python-aerofiler\/data\/Test_pdf_0319_2.pdf\r\n19:54:35 INFO file_size: 262\r\n19:54:35 INFO file_name: Test_pdf_0319_2.pdf\r\n19:54:35 INFO ITERATION: 2\r\n19:54:35 INFO file_date: 2020-04-05\r\n19:54:35 INFO file_path: \/Users\/maksim\/repos\/p4-python-aerofiler\/data\/Test_pdf_0319_3.pdf\r\n19:54:35 INFO file_size: 262\r\n19:54:35 INFO file_name: Test_pdf_0319_3.pdf\r\n<\/pre>\n<h1>Arguments<\/h1>\n<ul>\n<li>node &#8211; custom fixture that returns the name of the test so that you can always see which test generated a particular pdf file (e.g. &#8220;tests\/dashboard\/test_pdf.py::test_act_table&#8221;)<\/li>\n<li>current_test_num &#8211; custom fixture that returns the current test number. It uses pytest&#8217;s built-in &#8220;cache&#8221; fixture to store the previous test number<\/li>\n<li>count &#8211; number of pdf files generated for the current test. 
Default value is &#8220;1&#8221;<\/li>\n<li>folder &#8211; specifies the folder used to store pdf files (the &#8220;data&#8221; folder in my case)<\/li>\n<li>testid_filename &#8211; default name for the one-page pdf I described above<\/li>\n<li>base_files &#8211; file names that point to the pdf files to merge with the one-page pdf. It can be one file or multiple files<\/li>\n<li>fname_template &#8211; pattern that will be used for the names of the generated pdf files<\/li>\n<li>archive_num &#8211; argument that controls how many files will be archived. This can be handy when you do not want to delete all files generated in previous tests<\/li>\n<\/ul>\n<pre><code class=\"language-python\">def _pdf_factory(node, current_test_num,\n                      count=1,\n                      folder=&#039;data&#039;,        # Default folder: project_dir\/data\n                      testid_filename=&#039;test_id.pdf&#039;,\n                      base_files=(&#039;contract_template.pdf&#039;,),\n                      fname_template=&#039;Test_pdf_&#039;,\n                      archive_num=2):<\/code><\/pre>\n<h1>Files and project folder structure<\/h1>\n<p>Here is what my project folder looks like. 
The &#8220;data&#8221; folder serves as a place to store generated pdf files.<\/p>\n<pre><code class=\"language-shell-session\">.\n\u251c\u2500\u2500 LICENSE\n\u251c\u2500\u2500 README.md\n\u251c\u2500\u2500 __pycache__\n\u251c\u2500\u2500 __requirements\\ 2.txt\n\u251c\u2500\u2500 conftest.py\n\u251c\u2500\u2500 data\n\u251c\u2500\u2500 pages\n\u251c\u2500\u2500 pytest.ini\n\u251c\u2500\u2500 requirements.txt\n\u251c\u2500\u2500 snippets\n\u251c\u2500\u2500 tests\n\u251c\u2500\u2500 utils\n\u2514\u2500\u2500 venv<\/code><\/pre>\n<h1>Contents of &#8220;data&#8221; folder<\/h1>\n<pre><code class=\"language-shell-session\">\u25b6 tree -L 1\n.\n\u251c\u2500\u2500 Test_pdf_0319_1.pdf\n\u251c\u2500\u2500 Test_pdf_0319_2.pdf\n\u251c\u2500\u2500 Test_pdf_0319_3.pdf\n\u251c\u2500\u2500 _Test_pdf_0317_1.pdf\n\u251c\u2500\u2500 _Test_pdf_0318_1.pdf\n\u251c\u2500\u2500 contract_template.pdf\n\u2514\u2500\u2500 test_id.pdf\n<\/code><\/pre>\n<h1>Result<\/h1>\n<p>GitHub repository: <a href=\"https:\/\/github.com\/MaksimZinovev\/pdfhandy\">pdfhandy<\/a><\/p>\n<p>&nbsp;<\/p>\n<p><iframe loading=\"lazy\" src=\"https:\/\/gfycat.com\/ifr\/SplendidPoshGlassfrog\" width=\"640\" height=\"375\" frameborder=\"0\"><\/iframe><\/p>\n","protected":false},"excerpt":{"rendered":"<p>GitHub repository: pdfhandy This is the first article where I finally got courage to share some code from my\u00a0 first Test Automation Project. I started to learn Python and Selenium in November 2019. 
Since then I managed to write\u00a0 a few test, but most importantly, think I&#8217;ve built some kind of foundation for scaling and improving my tests.\u00a0 \u00a0Probably most satisfying was seeing\u00a0 a steady growth of short snippets which<\/p>\n<div class=\"read-more\"><a class=\"btn read-more-btn\" href=\"https:\/\/clickworks.me\/index.php\/2020\/04\/05\/create-unique-pdf-files-test-data-generation-for-python-selenium-end-to-end-test-automation\/\">Read More<\/a><\/div>\n","protected":false},"author":1,"featured_media":721,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[57,30],"tags":[32,53,54,56,45,55],"post_folder":[],"_links":{"self":[{"href":"https:\/\/clickworks.me\/index.php\/wp-json\/wp\/v2\/posts\/669"}],"collection":[{"href":"https:\/\/clickworks.me\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/clickworks.me\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/clickworks.me\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/clickworks.me\/index.php\/wp-json\/wp\/v2\/comments?post=669"}],"version-history":[{"count":30,"href":"https:\/\/clickworks.me\/index.php\/wp-json\/wp\/v2\/posts\/669\/revisions"}],"predecessor-version":[{"id":728,"href":"https:\/\/clickworks.me\/index.php\/wp-json\/wp\/v2\/posts\/669\/revisions\/728"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/clickworks.me\/index.php\/wp-json\/wp\/v2\/media\/721"}],"wp:attachment":[{"href":"https:\/\/clickworks.me\/index.php\/wp-json\/wp\/v2\/media?parent=669"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/clickworks.me\/index.php\/wp-json\/wp\/v2\/categories?post=669"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/clickworks.me\/index.php\/wp-json\/wp\/v2\/tags?post=669"},{"taxonomy":"post_folder","embeddable":true,"href":"https:\/\/clickworks.me\/index.php\/wp-json\/wp\/v2\/po
st_folder?post=669"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}