Building a PDF to Word Conversion Bot with ARGOS LABS: A Citizen Developer’s Guide

The Foundation: Basic Conversion
The core of the bot is a short sequence of operations. First, a “drag and drop” operation lets the user supply the PDF by dragging it onto a designated area, and it stores the full path of the dropped file in a variable. That variable is then passed to the second key operation, “pdf to doc”, which performs the conversion. If no output file path is specified, the Word file is created in the same folder as the original PDF; the user can also specify a different output path and file name.

The converted file’s full path (folder name and file name) is stored in another predefined variable. This variable can then be used, for instance, to automatically open the newly created Word file using a “run program” operation. The initial test run shows the bot successfully converting a PDF and opening the resulting Word file, highlighting the simplicity of the basic conversion.
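The default-output behavior described above can be sketched in a few lines of Python. This is only an illustration of the logic, not how the “pdf to doc” operation is implemented; the function name `resolve_output_path`, the `.docx` extension, and reusing the PDF's base name are assumptions made for the example.

```python
from pathlib import Path
from typing import Optional


def resolve_output_path(pdf_path: str, output_path: Optional[str] = None) -> Path:
    """Return the path where the converted Word file would be written."""
    if output_path:
        # The user chose an explicit folder and file name.
        return Path(output_path)
    # Assumed default: same folder and base name as the PDF, Word extension.
    return Path(pdf_path).with_suffix(".docx")
```

Calling `resolve_output_path("reports/invoice.pdf")` yields a path in the same `reports` folder, while passing a second argument overrides it.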

Enhancing Robustness: Verifying the File Type
To make the bot more reliable, a crucial step is added: verifying that the dropped file is actually a PDF. This is done using the “compare value” operation. The file path obtained from the “drag and drop” operation is passed to “compare value”. Inside “compare value,” a “wildcard equal” function is used to check if the file path ends with “.pdf”. The wildcard represents any string of characters before the “.pdf” extension.

If the comparison is true (the file ends with .pdf), the bot proceeds to the “pdf to doc” operation. If it’s false, the bot is directed to skip the conversion steps and jump to a different operation. In the example, it jumps to operation number 5, which is a “simple dialog” operation. This dialog box displays a message like “no pdf check your file”. The dialog is configured with button labels, offering options like “stop the bot” or “return to drag and drop” (which jumps back to operation number one). A test run dropping a Word file successfully triggered this error treatment, displaying the dialog. This verification step adds a decision-making capability, ensuring only correct file types are processed.
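For readers who think in code, the “wildcard equal” check is roughly equivalent to the Python sketch below. It uses the standard `fnmatch` module; the function names are hypothetical, and accepting upper-case `.PDF` extensions is an assumption of this example, since the session does not say whether the bot's comparison is case-sensitive.

```python
from fnmatch import fnmatchcase


def is_pdf(file_path: str) -> bool:
    """Wildcard check: any characters, followed by the .pdf extension."""
    # Lower-casing first is an assumption so that REPORT.PDF also passes.
    return fnmatchcase(file_path.lower(), "*.pdf")


def next_step(file_path: str) -> str:
    """Mirror the bot's branch: convert PDFs, otherwise show the dialog."""
    if is_pdf(file_path):
        return "pdf to doc"
    return "simple dialog"
```

Dropping a Word file, as in the test run, would take the second branch and lead to the “no pdf check your file” dialog.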

Adding Practicality: Creating Unique Output File Names
A limitation of simply specifying a fixed output file name (like temp.docx) is that each conversion overwrites the previous one, which isn’t practical for processing multiple files over time. To address this, the bot is modified to create unique file names every time.

A popular method for this is the “timestamp” tool, which takes the system time and returns it as a character string. The format chosen here is a simple one that includes hours, minutes, and seconds, so the string stays unique as long as two conversions do not start within the same second, even when many conversions happen on the same day. The timestamp string is stored in a new variable.

The output file path for the “pdf to doc” operation is then modified to include this timestamp variable. Instead of a fixed name like temp, the variable name enclosed in curly brackets is used (e.g., {my.timestamp}.docx). This dynamically generates a unique file name based on the timestamp for each conversion. A test run demonstrated that dropping a PDF resulted in a new Word file being created in the folder with a unique name incorporating the timestamp. This makes the bot much more practical for continuous use.
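The same idea in Python looks roughly like this. It is a sketch under assumptions: the function name `timestamped_output` and the exact format string are chosen for illustration, and the format used in the bot may differ.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def timestamped_output(folder: str, now: Optional[datetime] = None) -> Path:
    """Build an output path whose file name is the current timestamp."""
    now = now or datetime.now()
    # Assumed format: date plus hours, minutes, seconds (one-second resolution).
    stamp = now.strftime("%Y%m%d-%H%M%S")
    return Path(folder) / f"{stamp}.docx"
```

Because the resolution is one second, two conversions started within the same second would still produce the same name.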

Handling Multiple Files (Alternative Design)
The drag-and-drop bot handles one file at a time, so converting several files means running the whole sequence once per file. Converting many PDFs in one action needs a different design. This alternative uses a “folder monitor” operation to collect all files inside a specified folder, typically storing the list in an array variable. A “repeat” function then loops through the array and processes each file, with the number of iterations set dynamically from the count of files in the array. This approach allows bulk conversion. The session noted that this design came out of a discussion among programmers, so citizen developers may need their help to work through the details.
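A rough Python equivalent of this folder-and-loop design is shown below. It is a sketch, not the bot itself: `list_pdfs` and `convert_all` are hypothetical names, and the `convert` argument stands in for the “pdf to doc” operation, which is not reproduced here.

```python
from pathlib import Path
from typing import Callable, List


def list_pdfs(folder: str) -> List[Path]:
    """Collect the PDF files in a folder, like the array from the folder monitor."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def convert_all(folder: str, convert: Callable[[Path], Path]) -> List[Path]:
    """Run the conversion once per file; the repeat count is the array length."""
    files = list_pdfs(folder)
    results = []
    for index in range(len(files)):  # loop count taken from the file count
        results.append(convert(files[index]))
    return results
```

The loop count comes from `len(files)`, which mirrors setting the repeat count from the number of entries in the array variable.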

Preparing for Deployment
Before deploying the bot, some final touches are demonstrated. If the bot is intended to run continuously (e.g., monitoring a folder all day), any imposed timers might need to be removed. In addition, developers often create test steps within the same scenario while building and testing. To prevent these test steps from running during normal operation, an “end of scenario” operation can be placed after the main bot sequence. This operation tells the runner (PAM, in this context) to ignore everything that follows it.

Deployment: Saving and Packaging
Deployment involves saving the developed scenario. The scenario file is saved to a “supervisor”. The supervisor is where the bot is managed, identified by a unique bot ID. The scenario is saved with a name (e.g., “pdf to docs”) and appears in the supervisor dashboard, ready for deployment.

One deployment method shown is packaging the bot into a standalone executable (.exe) file. This is done via the “file menu” and “make exe” option. The executable file is created, often saved to the desktop, and includes the bot ID in its name. Double-clicking this .exe file prepares the running environment (which happens quickly after the first run) and starts the bot.

Advanced Error Handling: Permission Issues
During a test of the executable bot, a specific PDF file caused an error. To diagnose it, the supervisor dashboard was checked, since it records failures and the reasons for them. The supervisor logs revealed a “permission issue”: Microsoft Word reported that the document’s author had set permissions preventing content reuse. The failure also had a return code of 1 (execution failed), whereas 0 indicates success.

This led to modifying the bot to specifically handle this type of error. An action was added to the “pdf to doc” operation: if the return code is 1, the bot is configured to jump to a different operation, in this case, operation number 6. A new dialog (operation 6) was added or modified to include a message explaining the permission problem in addition to the “no pdf” message. Saving this updated bot to the supervisor overwrites the previous version. When the problematic file was run again with the updated bot, the new dialog appeared, correctly indicating the permission issue based on the return code. This process of analyzing errors via the supervisor and integrating return code handling into the bot makes it more stable and robust.
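The return-code branch can be summarized with a short Python sketch. The codes 0 and 1 and the jump to operation 6 come from the session; the function name, the step labels, and the handling of any other code are assumptions added for illustration.

```python
SUCCESS = 0           # conversion worked
EXECUTION_FAILED = 1  # e.g. the permission issue reported by Word


def next_step_after_conversion(return_code: int) -> str:
    """Decide what the bot does once "pdf to doc" has finished."""
    if return_code == SUCCESS:
        return "open converted file"
    if return_code == EXECUTION_FAILED:
        # Jump to the dialog that explains the permission problem.
        return "show dialog (operation 6)"
    # Assumption: any other code is unexpected, so the bot stops.
    return "stop the bot"
```

Keeping the success path and the failure path separate like this is what lets the dialog show a specific message instead of the bot simply failing.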

Limitations: Image-Based Text
A question arose regarding whether the bot could read text from images within a PDF. It was clarified that the specific “pdf to doc” operation used in this bot does not read image-based text; it simply takes the image and places it in the Word document. Handling image-based text requires different technology, specifically OCR (Optical Character Recognition), such as Google OCR. This was mentioned as a potential topic for a future session.

In summary, building an ARGOS LABS bot involves chaining operations, using variables to pass data, incorporating conditional logic for verification and error handling, using tools like timestamp for practicality, and leveraging the supervisor for deployment and error analysis. While the basic conversion is simple, additional steps are crucial for creating a robust and practical automation solution. The platform supports different designs for various needs, like handling single or multiple files, and emphasizes a citizen development approach with support from technical teams.

Ready to simplify your document workflows? Explore ARGOS LABS today!

Watch this video for more information: https://tinyurl.com/2n5wc3p9
