# Aires: AI Resume Screening System

## Overview
Aires is an intelligent resume screening application that leverages OpenAI's GPT models to automate candidate evaluation and shortlisting. The system uses the OpenAI Assistants API with retrieval capabilities to analyze resumes and provide detailed candidate assessments based on job requirements.
## Technologies Used

### Core Technologies
- Python 3.7+: Primary programming language
- OpenAI API: GPT-3.5-turbo models for AI-powered resume analysis
- Chainlit 1.1.202: Interactive chat interface framework
- Watchdog 2.0.1: File system monitoring for real-time resume processing
### Document Processing
- PyMuPDF 1.24.4: PDF file parsing and text extraction
- python-docx 1.1.2: DOCX file processing
- textract 1.6.5: DOC file processing and text extraction
### DevOps & CI/CD
- Azure Pipelines: Automated testing and deployment
- pytest: Unit testing framework
## Project Structure
Aires/
├── .chainlit/ # Chainlit configuration files
│ ├── config.toml # UI and app configuration
│ └── translations/ # Internationalization files
├── .vs/ # Visual Studio settings
├── _metadata/ # Assistant metadata and mappings
│ └── assistants.json # OpenAI Assistant ID mappings
├── env/ # Virtual environment (not tracked in production)
├── openai-assistant/ # Main application directory
│ ├── app.py # Main Chainlit application entry point
│ ├── assignAssistant.py # Assistant assignment logic based on query classification
│ ├── createAssistants.py # Creates OpenAI Assistants with knowledge files
│ ├── fileWatcher.py # Monitors resume directory for file changes
│ ├── parseResume.py # Parses PDF/DOC/DOCX resumes to JSON
│ ├── updateAssistant.py # Updates assistants when resumes change
│ └── config.json # Application configuration (API keys, paths, etc.)
├── azure-pipelines.yml # Azure DevOps CI/CD pipeline configuration
├── requirements.txt # Python dependencies
├── oauth process.txt # OAuth implementation notes
└── README.md # This file
## Key Features
### 1. Multi-Format Resume Processing
- Supports PDF, DOC, and DOCX formats
- Automatic text extraction and cleaning
- Converts resumes to structured JSON format
### 2. Intelligent Assistant Assignment
- Classifies queries by technology domain (.NET, Data Engineering, Drupal, Power BI, etc.)
- Routes to specialized assistants based on job requirements
- Uses GPT-3.5-turbo for query classification
### 3. Real-Time File Monitoring
- Watches resume directories for new additions or deletions
- Automatically updates assistants when resumes change
- Maintains synchronized knowledge base
### 4. Candidate Scoring System
Evaluates candidates on a 100-point scale based on:
- Skills match
- Experience level
- Certifications
- Educational background
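This rubric is enforced through the assistant's instructions rather than application code. Below is a minimal instruction-template sketch; the point weights and the `build_instructions` helper are illustrative assumptions, not the project's actual prompt:

```python
# Hypothetical instruction template for a scoring assistant. The rubric
# weights below are assumptions for illustration, not Aires' real prompt.
SCORING_INSTRUCTIONS = """\
You are a resume screening assistant for {category} roles.
Score each candidate out of 100 points:
- Skills match: up to 40 points
- Experience level: up to 30 points
- Certifications: up to 15 points
- Educational background: up to 15 points
For every candidate, report: name, score, strengths, weaknesses,
and a short summary.
"""

def build_instructions(category: str) -> str:
    """Fill the template for one job category."""
    return SCORING_INSTRUCTIONS.format(category=category)
```

Keeping the rubric in one template string makes it easy to reuse across every category-specific assistant.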
### 5. Interactive Chat Interface
- Built with Chainlit for conversational UI
- File upload support (CSV, PDF)
- Real-time streaming responses
- Image and text content display
## How It Works

### 1. Resume Parsing (`parseResume.py`)
- Processes all resumes in the specified directories
- Extracts text from PDF/DOC/DOCX files
- Creates JSON files with the resume content
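The parsing stage can be sketched as follows. The helper names (`extract_text`, `resume_to_json`) and the JSON layout are assumptions for illustration, not the actual `parseResume.py` code; the library choices mirror `requirements.txt`, and imports are deferred so each format's dependency is only loaded when that format appears:

```python
import json
from pathlib import Path

def extract_text(path: str) -> str:
    """Dispatch on file extension to the matching extraction library."""
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        import fitz  # PyMuPDF
        with fitz.open(path) as doc:
            return "".join(page.get_text() for page in doc)
    if suffix == ".docx":
        import docx  # python-docx
        return "\n".join(p.text for p in docx.Document(path).paragraphs)
    if suffix == ".doc":
        import textract
        return textract.process(path).decode("utf-8")
    raise ValueError(f"Unsupported resume format: {suffix}")

def clean_text(text: str) -> str:
    """Collapse runs of whitespace left over from extraction."""
    return " ".join(text.split())

def resume_to_json(path: str, out_dir: str) -> Path:
    """Write one {"file": ..., "content": ...} JSON per resume."""
    out = Path(out_dir) / (Path(path).stem + ".json")
    out.write_text(json.dumps({"file": Path(path).name,
                               "content": clean_text(extract_text(path))}))
    return out
```

A driver loop over `Path(resume_folder).iterdir()` would then call `resume_to_json` for each supported file.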
### 2. Assistant Creation (`createAssistants.py`)
- Creates an OpenAI Assistant for each job category
- Attaches the resume JSON files as a knowledge base
- Stores the assistant IDs in the metadata
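A hedged sketch of this stage follows. The function names are assumptions, and the retrieval/file-attachment parameters changed between Assistants API v1 and v2 (v1 used the `retrieval` tool with `file_ids`; v2 uses `file_search` with `tool_resources`), so check the version your `openai` client targets:

```python
import json
from pathlib import Path

def create_category_assistant(client, category: str, file_ids: list,
                              model: str = "gpt-3.5-turbo") -> str:
    """Create one assistant per job category with resumes attached.
    Sketch against Assistants API v1 field names; v2 renames the
    retrieval tool to file_search and moves files to tool_resources."""
    assistant = client.beta.assistants.create(
        name=f"Aires - {category}",
        model=model,
        instructions=f"Screen resumes for {category} roles and score "
                     f"each candidate out of 100.",
        tools=[{"type": "retrieval"}],  # v1 name; "file_search" in v2
        file_ids=file_ids,              # v1 field; v2 uses tool_resources
    )
    return assistant.id

def save_assistant_ids(mapping: dict, metadata_path: str) -> None:
    """Persist the {category: assistant_id} map as assistants.json."""
    path = Path(metadata_path) / "assistants.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(mapping, indent=2))

def load_assistant_ids(metadata_path: str) -> dict:
    path = Path(metadata_path) / "assistants.json"
    return json.loads(path.read_text()) if path.exists() else {}
```

The persisted mapping is what `_metadata/assistants.json` holds, so routing can look up IDs without recreating assistants.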
### 3. Query Processing (`app.py`)
- The user submits a query via the Chainlit interface
- The system classifies the query and assigns the appropriate assistant
- The assistant analyzes resumes and returns scored candidates
### 4. File Monitoring (`fileWatcher.py`)
- Monitors the resume directories continuously
- Detects new and deleted files
- Triggers automatic assistant updates
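A minimal sketch of the watcher wiring using watchdog (pinned at 2.0.1 in `requirements.txt`), with the event filtering pulled out into a testable predicate. The names and structure are assumptions, not the actual `fileWatcher.py`:

```python
from pathlib import Path

RESUME_EXTENSIONS = {".pdf", ".doc", ".docx"}

def is_resume_event(src_path: str, is_directory: bool = False) -> bool:
    """Filter filesystem events down to resume files we care about."""
    return (not is_directory
            and Path(src_path).suffix.lower() in RESUME_EXTENSIONS)

def watch_resumes(folder: str, on_change) -> None:
    """Call on_change(path) whenever a resume is added or deleted.
    Blocks the calling thread; run it in its own process or thread."""
    from watchdog.observers import Observer
    from watchdog.events import FileSystemEventHandler

    class ResumeHandler(FileSystemEventHandler):
        def on_created(self, event):
            if is_resume_event(event.src_path, event.is_directory):
                on_change(event.src_path)

        on_deleted = on_created  # react to deletions the same way

    observer = Observer()
    observer.schedule(ResumeHandler(), folder, recursive=True)
    observer.start()  # the observer runs in a background thread
    observer.join()   # block until the observer is stopped
```

`on_change` would typically re-run the parse step and then the assistant update.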
### 5. Assistant Assignment (`assignAssistant.py`)
- Classifies the user query into technology categories
- Returns the appropriate assistant ID from the metadata
- Routes to the Default assistant if no category matches
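The classification-plus-fallback flow can be sketched as below. The category list comes from the Assistant Categories section; the prompt wording and helper names are assumptions, not the actual `assignAssistant.py` code:

```python
CATEGORIES = [".NET Developer", "Data Engineer", "Drupal Developer",
              "Power BI", "Project Manager", "SharePoint Developer"]

def classify_query(client, query: str, model: str = "gpt-3.5-turbo") -> str:
    """Ask the chat-completions model to pick one category label."""
    resp = client.chat.completions.create(
        model=model,
        max_tokens=20,  # matches max_tokens in config.json
        messages=[
            {"role": "system",
             "content": "Classify the hiring query into exactly one of: "
                        + ", ".join(CATEGORIES)
                        + ". Reply with the label only."},
            {"role": "user", "content": query},
        ],
    )
    return resp.choices[0].message.content.strip()

def route_to_assistant(label: str, assistant_ids: dict) -> str:
    """Fall back to the Default assistant when the label is unknown."""
    return assistant_ids.get(label, assistant_ids["Default"])
```

Capping `max_tokens` keeps the classification call cheap, since only a short label is expected back.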
## Configuration

The application uses `config.json` for configuration:
{
  "api_key": "YOUR_OPENAI_API_KEY",
  "file_watch_path": "Path to resume JSON outputs",
  "app_name": "Cognine GPT",
  "metadata_path": "Path to metadata storage",
  "cc_model_name": "gpt-3.5-turbo",
  "assistant_model_name": "gpt-3.5-turbo",
  "max_tokens": 20,
  "json_path": "Path to JSON files",
  "resume_folder": "Path to resume folder"
}
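A small startup loader can fail fast on an incomplete config. This is a sketch (the validation and the `load_config` helper are not part of the project); the required keys mirror the example above:

```python
import json
from pathlib import Path

REQUIRED_KEYS = {"api_key", "file_watch_path", "app_name", "metadata_path",
                 "cc_model_name", "assistant_model_name", "max_tokens",
                 "json_path", "resume_folder"}

def load_config(path: str = "config.json") -> dict:
    """Load config.json, rejecting missing keys or a placeholder key."""
    config = json.loads(Path(path).read_text())
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise KeyError(f"config.json is missing keys: {sorted(missing)}")
    if config["api_key"] == "YOUR_OPENAI_API_KEY":
        raise ValueError("Replace the placeholder api_key in config.json")
    return config
```

Failing at startup is cheaper than discovering a missing path mid-run, after assistants have already been created.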
## Installation

### Prerequisites
- Python 3.7 or higher
- An OpenAI API key
- Windows environment (the current path configuration assumes Windows)
### Setup Steps
- Clone the repository:
  git clone https://github.com/manideepsp/Aires.git
  cd Aires
- Create a virtual environment:
  python -m venv env
  env\Scripts\activate  # Windows
- Install dependencies:
  pip install -r requirements.txt
- Configure the application:
  - Update openai-assistant/config.json with your API key and paths
  - Ensure the resume directories exist
- Initialize the assistants:
  cd openai-assistant
  python parseResume.py
  python createAssistants.py
- Run the application:
  chainlit run app.py
## Usage

### Starting the Application
cd openai-assistant
chainlit run app.py
### Querying Candidates
Example queries:
- "Find me the best .NET developers with 5+ years experience"
- "Who are the top Data Engineers with AWS certification?"
- "Show me Power BI experts for a senior analyst role"
### Adding New Resumes
- Place resume files in configured directories
- File watcher automatically detects and processes them
- Assistants are updated with new candidate data
## Assistant Categories
The system creates specialized assistants for:
- .NET Developer: Web development technologies
- Data Engineer: ETL and Cloud technologies
- Drupal Developer: Drupal CMS expertise
- Power BI: Business Intelligence and Reporting
- Project Manager: Project management skills
- SharePoint Developer: SharePoint technologies
- Default: General queries
## Output Format
For each candidate, the system provides:
- Candidate Name: [Name from resume]
- Candidate Score: [0-100]
- Candidate's Strengths: [Key strengths]
- Candidate's Weaknesses: [Areas for improvement]
- Candidate's Short Summary: [Brief overview]
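If downstream tooling needs these fields programmatically, the labelled lines can be parsed with a small helper. This is a sketch that assumes the assistant follows the format above verbatim; it is not part of the project:

```python
import re

# Matches lines like "Candidate Name: ..." or "Candidate's Strengths: ..."
FIELD_PATTERN = re.compile(
    r"Candidate(?:'s)? (Name|Score|Strengths|Weaknesses|Short Summary):\s*(.+)")

def parse_candidate_block(text: str) -> dict:
    """Extract the labelled fields from one candidate's report."""
    fields = {m.group(1): m.group(2).strip()
              for m in FIELD_PATTERN.finditer(text)}
    if "Score" in fields:
        fields["Score"] = int(fields["Score"])  # "87" -> 87
    return fields
```

Since the format is produced by a language model, real code should tolerate deviations rather than assume every field is always present.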
## CI/CD Pipeline

Azure Pipelines configuration (azure-pipelines.yml):
- Triggers on the qas branch
- Uses a windows-latest VM image
- Python 3.7 environment
- Runs pytest for testing
- Deploys the Chainlit application
## Security Notes

⚠️ Important:
- Never commit config.json with real API keys
- Use environment variables for sensitive data in production
- The current config contains example/placeholder values
## Future Enhancements

Potential improvements:
- OAuth 2.0 integration (notes in oauth process.txt)
- Multi-language support
- Advanced filtering and search capabilities
- Resume template standardization
- Analytics dashboard
- Batch processing improvements
## Dependencies

See requirements.txt for the complete list:
- watchdog==2.0.1
- PyMuPDF==1.24.4
- python-docx==1.1.2
- textract==1.6.5
- chainlit==1.1.202
- openai==1.30.1
## Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Submit a pull request
## License
[Specify your license here]
## Author
manideepsp
## Support
For issues and questions, please open an issue on GitHub.
**Note**: This project uses OpenAI's API, which may incur costs. Monitor your API usage accordingly.