## Project Description
This project is a Python script that extracts data from Excel, Word, text, and CSV files and generates reports tailored to specific client requirements. A GUI built with PyQt5 lets users select the input files, choose the output directory for the reports, and customize the report generation process.
## Technologies Used
- **Python**: The programming language used for scripting.
- **Pandas**: For data manipulation and extraction from Excel and CSV files.
- **OpenPyXL**: For reading and writing Excel files.
- **python-docx**: For extracting data from Word documents.
- **PyQt5**: For creating the graphical user interface (GUI).
- **os and sys**: Python standard-library modules for handling file paths and system operations.
## Challenges Faced
### 1. **Data Extraction from Multiple Formats**
- **Issue**: Handling different file formats and ensuring accurate data extraction.
- **Solution**: Utilizing specialized libraries such as Pandas for Excel and CSV, python-docx for Word documents, and built-in Python functions for text files.
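
The per-format approach described above can be sketched as a small dispatcher keyed on file extension. This is only an illustrative outline, not the project's actual code: the function names (`read_text_file`, `read_word_file`, `read_table_file`, `extract`) and the `READERS` table are hypothetical, and the Word reader assumes that joining paragraph text is sufficient for the report.

```python
from pathlib import Path


def read_text_file(path):
    # Plain text needs only the standard library.
    return Path(path).read_text(encoding="utf-8")


def read_word_file(path):
    # python-docx is imported lazily so the other readers work without it.
    from docx import Document

    document = Document(str(path))
    # Assumption: paragraph text is all we need from the Word document.
    return "\n".join(paragraph.text for paragraph in document.paragraphs)


def read_table_file(path):
    # pandas is imported lazily for the same reason as python-docx.
    import pandas as pd

    if Path(path).suffix.lower() == ".csv":
        return pd.read_csv(path)
    # For .xlsx files pandas relies on OpenPyXL as the reading engine.
    return pd.read_excel(path)


# Hypothetical mapping from file extension to the reader that handles it.
READERS = {
    ".txt": read_text_file,
    ".docx": read_word_file,
    ".csv": read_table_file,
    ".xlsx": read_table_file,
}


def extract(path):
    """Pick a reader from the file extension and return its result."""
    suffix = Path(path).suffix.lower()
    try:
        reader = READERS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type: {suffix!r}") from None
    return reader(path)
```

Keeping the readers in one table means that supporting another format only requires adding a function and one dictionary entry.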
### 2. **User-friendly GUI Development**
- **Issue**: Creating an intuitive GUI that allows users to easily select files and save locations.
- **Solution**: Designing the GUI with PyQt5 to provide a straightforward interface for file selection and report generation.
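
A minimal sketch of the file and directory selection step, using PyQt5's standard `QFileDialog` dialogs, might look like the following. The function names, the file filter string, and the `_report.txt` naming scheme are assumptions made for illustration; the project's actual window layout and options are not shown here.

```python
import sys
from pathlib import Path


def report_path_for(input_file, output_dir):
    """Return a (hypothetical) report path for one input file."""
    return Path(output_dir) / f"{Path(input_file).stem}_report.txt"


def choose_files_and_output():
    """Ask the user for input files and an output directory."""
    # Imported here so the helper above can be used without PyQt5 installed.
    from PyQt5.QtWidgets import QApplication, QFileDialog

    # A QApplication must exist before any dialog is shown.
    app = QApplication.instance() or QApplication(sys.argv)

    files, _selected_filter = QFileDialog.getOpenFileNames(
        None,
        "Select input files",
        "",
        "Supported files (*.xlsx *.docx *.txt *.csv);;All files (*)",
    )
    if not files:
        # The user cancelled the file dialog.
        return [], None

    output_dir = QFileDialog.getExistingDirectory(None, "Select output directory")
    # An empty string means the user cancelled the directory dialog.
    return files, output_dir or None
```

Calling `choose_files_and_output()` opens the two dialogs in turn; each selected file can then be passed with the chosen directory to `report_path_for` to decide where its report is written.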
## Domain Knowledge Requirements
In addition to programming skills in Python, the project required specialized knowledge in the following areas:
- **Data Manipulation**: Proficiency in using Pandas for data extraction and manipulation.
- **File Handling**: Understanding how to read and write various file formats, including Excel (with OpenPyXL), Word (with python-docx), text, and CSV.
- **GUI Development**: Experience with PyQt5 for developing user-friendly graphical interfaces.