# Aires: AI Resume Screening System

## Overview
Aires is an intelligent resume screening application that leverages OpenAI's GPT models to automate candidate evaluation and shortlisting. The system uses the OpenAI Assistants API with retrieval capabilities to analyze resumes and provide detailed candidate assessments based on job requirements.
## Technologies Used

### Core Technologies
- Python 3.7+: Primary programming language
- OpenAI API: GPT-3.5-turbo models for AI-powered resume analysis
- Chainlit 1.1.202: Interactive chat interface framework
- Watchdog 2.0.1: File system monitoring for real-time resume processing
### Document Processing
- PyMuPDF 1.24.4: PDF file parsing and text extraction
- python-docx 1.1.2: DOCX file processing
- textract 1.6.5: DOC file processing and text extraction
### DevOps & CI/CD
- Azure Pipelines: Automated testing and deployment
- pytest: Unit testing framework
## Project Structure
Aires/
├── .chainlit/ # Chainlit configuration files
│ ├── config.toml # UI and app configuration
│ └── translations/ # Internationalization files
├── .vs/ # Visual Studio settings
├── _metadata/ # Assistant metadata and mappings
│ └── assistants.json # OpenAI Assistant ID mappings
├── env/ # Virtual environment (not tracked in production)
├── openai-assistant/ # Main application directory
│ ├── app.py # Main Chainlit application entry point
│ ├── assignAssistant.py # Assistant assignment logic based on query classification
│ ├── createAssistants.py # Creates OpenAI Assistants with knowledge files
│ ├── fileWatcher.py # Monitors resume directory for file changes
│ ├── parseResume.py # Parses PDF/DOC/DOCX resumes to JSON
│ ├── updateAssistant.py # Updates assistants when resumes change
│ └── config.json # Application configuration (API keys, paths, etc.)
├── azure-pipelines.yml # Azure DevOps CI/CD pipeline configuration
├── requirements.txt # Python dependencies
├── oauth process.txt # OAuth implementation notes
└── README.md # This file
## Key Features
### 1. Multi-Format Resume Processing
- Supports PDF, DOC, and DOCX formats
- Automatic text extraction and cleaning
- Converts resumes to structured JSON format
### 2. Intelligent Assistant Assignment
- Classifies queries by technology domain (.NET, Data Engineering, Drupal, Power BI, etc.)
- Routes to specialized assistants based on job requirements
- Uses GPT-3.5-turbo for query classification
### 3. Real-Time File Monitoring
- Watches resume directories for new additions or deletions
- Automatically updates assistants when resumes change
- Maintains synchronized knowledge base
### 4. Candidate Scoring System
Evaluates candidates on a 100-point scale based on:
- Skills match
- Experience level
- Certifications
- Educational background
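This rubric is enforced through the assistant's instructions rather than application code. Below is a minimal instruction-template sketch; the point weights and the `build_instructions` helper are illustrative assumptions, not the project's actual prompt:

```python
# Hypothetical instruction template for a scoring assistant. The rubric
# weights below are assumptions for illustration, not Aires' real prompt.
SCORING_INSTRUCTIONS = """\
You are a resume screening assistant for {category} roles.
Score each candidate out of 100 points:
- Skills match: up to 40 points
- Experience level: up to 30 points
- Certifications: up to 15 points
- Educational background: up to 15 points
For every candidate, report: name, score, strengths, weaknesses,
and a short summary.
"""

def build_instructions(category: str) -> str:
    """Fill the template for one job category."""
    return SCORING_INSTRUCTIONS.format(category=category)
```

Keeping the rubric in one template string makes it easy to reuse across every category-specific assistant.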
### 5. Interactive Chat Interface
- Built with Chainlit for conversational UI
- File upload support (CSV, PDF)
- Real-time streaming responses
- Image and text content display
## How It Works

### 1. Resume Parsing (`parseResume.py`)
- Processes all resumes in the specified directories
- Extracts text from PDF/DOC/DOCX files
- Creates JSON files with the resume content
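The parsing stage can be sketched as follows. The helper names (`extract_text`, `resume_to_json`) and the JSON layout are assumptions for illustration, not the actual `parseResume.py` code; the library choices mirror `requirements.txt`, and imports are deferred so each format's dependency is only loaded when that format appears:

```python
import json
from pathlib import Path

def extract_text(path: str) -> str:
    """Dispatch on file extension to the matching extraction library."""
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        import fitz  # PyMuPDF
        with fitz.open(path) as doc:
            return "".join(page.get_text() for page in doc)
    if suffix == ".docx":
        import docx  # python-docx
        return "\n".join(p.text for p in docx.Document(path).paragraphs)
    if suffix == ".doc":
        import textract
        return textract.process(path).decode("utf-8")
    raise ValueError(f"Unsupported resume format: {suffix}")

def clean_text(text: str) -> str:
    """Collapse runs of whitespace left over from extraction."""
    return " ".join(text.split())

def resume_to_json(path: str, out_dir: str) -> Path:
    """Write one {"file": ..., "content": ...} JSON per resume."""
    out = Path(out_dir) / (Path(path).stem + ".json")
    out.write_text(json.dumps({"file": Path(path).name,
                               "content": clean_text(extract_text(path))}))
    return out
```

A driver loop over `Path(resume_folder).iterdir()` would then call `resume_to_json` for each supported file.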
### 2. Assistant Creation (`createAssistants.py`)
- Creates an OpenAI Assistant for each job category
- Attaches the resume JSON files as a knowledge base
- Stores the assistant IDs in the metadata
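A hedged sketch of this stage follows. The function names are assumptions, and the retrieval/file-attachment parameters changed between Assistants API v1 and v2 (v1 used the `retrieval` tool with `file_ids`; v2 uses `file_search` with `tool_resources`), so check the version your `openai` client targets:

```python
import json
from pathlib import Path

def create_category_assistant(client, category: str, file_ids: list,
                              model: str = "gpt-3.5-turbo") -> str:
    """Create one assistant per job category with resumes attached.
    Sketch against Assistants API v1 field names; v2 renames the
    retrieval tool to file_search and moves files to tool_resources."""
    assistant = client.beta.assistants.create(
        name=f"Aires - {category}",
        model=model,
        instructions=f"Screen resumes for {category} roles and score "
                     f"each candidate out of 100.",
        tools=[{"type": "retrieval"}],  # v1 name; "file_search" in v2
        file_ids=file_ids,              # v1 field; v2 uses tool_resources
    )
    return assistant.id

def save_assistant_ids(mapping: dict, metadata_path: str) -> None:
    """Persist the {category: assistant_id} map as assistants.json."""
    path = Path(metadata_path) / "assistants.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(mapping, indent=2))

def load_assistant_ids(metadata_path: str) -> dict:
    path = Path(metadata_path) / "assistants.json"
    return json.loads(path.read_text()) if path.exists() else {}
```

The persisted mapping is what `_metadata/assistants.json` holds, so routing can look up IDs without recreating assistants.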
### 3. Query Processing (`app.py`)
- The user submits a query via the Chainlit interface
- The system classifies the query and assigns the appropriate assistant
- The assistant analyzes resumes and returns scored candidates
### 4. File Monitoring (`fileWatcher.py`)
- Monitors the resume directories continuously
- Detects new and deleted files
- Triggers automatic assistant updates
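A minimal sketch of the watcher wiring using watchdog (pinned at 2.0.1 in `requirements.txt`), with the event filtering pulled out into a testable predicate. The names and structure are assumptions, not the actual `fileWatcher.py`:

```python
from pathlib import Path

RESUME_EXTENSIONS = {".pdf", ".doc", ".docx"}

def is_resume_event(src_path: str, is_directory: bool = False) -> bool:
    """Filter filesystem events down to resume files we care about."""
    return (not is_directory
            and Path(src_path).suffix.lower() in RESUME_EXTENSIONS)

def watch_resumes(folder: str, on_change) -> None:
    """Call on_change(path) whenever a resume is added or deleted.
    Blocks the calling thread; run it in its own process or thread."""
    from watchdog.observers import Observer
    from watchdog.events import FileSystemEventHandler

    class ResumeHandler(FileSystemEventHandler):
        def on_created(self, event):
            if is_resume_event(event.src_path, event.is_directory):
                on_change(event.src_path)

        on_deleted = on_created  # react to deletions the same way

    observer = Observer()
    observer.schedule(ResumeHandler(), folder, recursive=True)
    observer.start()  # the observer runs in a background thread
    observer.join()   # block until the observer is stopped
```

`on_change` would typically re-run the parse step and then the assistant update.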
### 5. Assistant Assignment (`assignAssistant.py`)
- Classifies the user query into technology categories
- Returns the appropriate assistant ID from the metadata
- Routes to the Default assistant if no category matches
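The classification-plus-fallback flow can be sketched as below. The category list comes from the Assistant Categories section; the prompt wording and helper names are assumptions, not the actual `assignAssistant.py` code:

```python
CATEGORIES = [".NET Developer", "Data Engineer", "Drupal Developer",
              "Power BI", "Project Manager", "SharePoint Developer"]

def classify_query(client, query: str, model: str = "gpt-3.5-turbo") -> str:
    """Ask the chat-completions model to pick one category label."""
    resp = client.chat.completions.create(
        model=model,
        max_tokens=20,  # matches max_tokens in config.json
        messages=[
            {"role": "system",
             "content": "Classify the hiring query into exactly one of: "
                        + ", ".join(CATEGORIES)
                        + ". Reply with the label only."},
            {"role": "user", "content": query},
        ],
    )
    return resp.choices[0].message.content.strip()

def route_to_assistant(label: str, assistant_ids: dict) -> str:
    """Fall back to the Default assistant when the label is unknown."""
    return assistant_ids.get(label, assistant_ids["Default"])
```

Capping `max_tokens` keeps the classification call cheap, since only a short label is expected back.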
## Configuration

The application uses `config.json` for configuration:
{
  "api_key": "YOUR_OPENAI_API_KEY",
  "file_watch_path": "Path to resume JSON outputs",
  "app_name": "Cognine GPT",
  "metadata_path": "Path to metadata storage",
  "cc_model_name": "gpt-3.5-turbo",
  "assistant_model_name": "gpt-3.5-turbo",
  "max_tokens": 20,
  "json_path": "Path to JSON files",
  "resume_folder": "Path to resume folder"
}
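A small startup loader can fail fast on an incomplete config. This is a sketch (the validation and the `load_config` helper are not part of the project); the required keys mirror the example above:

```python
import json
from pathlib import Path

REQUIRED_KEYS = {"api_key", "file_watch_path", "app_name", "metadata_path",
                 "cc_model_name", "assistant_model_name", "max_tokens",
                 "json_path", "resume_folder"}

def load_config(path: str = "config.json") -> dict:
    """Load config.json, rejecting missing keys or a placeholder key."""
    config = json.loads(Path(path).read_text())
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise KeyError(f"config.json is missing keys: {sorted(missing)}")
    if config["api_key"] == "YOUR_OPENAI_API_KEY":
        raise ValueError("Replace the placeholder api_key in config.json")
    return config
```

Failing at startup is cheaper than discovering a missing path mid-run, after assistants have already been created.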
## Installation

### Prerequisites
- Python 3.7 or higher
- An OpenAI API key
- Windows environment (the current path configuration assumes Windows)
### Setup Steps
- Clone the repository:
  git clone https://github.com/manideepsp/Aires.git
  cd Aires
- Create a virtual environment:
  python -m venv env
  env\Scripts\activate  # Windows
- Install dependencies:
  pip install -r requirements.txt
- Configure the application:
  - Update openai-assistant/config.json with your API key and paths
  - Ensure the resume directories exist
- Initialize the assistants:
  cd openai-assistant
  python parseResume.py
  python createAssistants.py
- Run the application:
  chainlit run app.py
## Usage

### Starting the Application
cd openai-assistant
chainlit run app.py
### Querying Candidates
Example queries:
- "Find me the best .NET developers with 5+ years experience"
- "Who are the top Data Engineers with AWS certification?"
- "Show me Power BI experts for a senior analyst role"
### Adding New Resumes
- Place resume files in configured directories
- File watcher automatically detects and processes them
- Assistants are updated with new candidate data
## Assistant Categories
The system creates specialized assistants for:
- .NET Developer: Web development technologies
- Data Engineer: ETL and Cloud technologies
- Drupal Developer: Drupal CMS expertise
- Power BI: Business Intelligence and Reporting
- Project Manager: Project management skills
- SharePoint Developer: SharePoint technologies
- Default: General queries
## Output Format
For each candidate, the system provides:
- Candidate Name: [Name from resume]
- Candidate Score: [0-100]
- Candidate's Strengths: [Key strengths]
- Candidate's Weaknesses: [Areas for improvement]
- Candidate's Short Summary: [Brief overview]
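If downstream tooling needs these fields programmatically, the labelled lines can be parsed with a small helper. This is a sketch that assumes the assistant follows the format above verbatim; it is not part of the project:

```python
import re

# Matches lines like "Candidate Name: ..." or "Candidate's Strengths: ..."
FIELD_PATTERN = re.compile(
    r"Candidate(?:'s)? (Name|Score|Strengths|Weaknesses|Short Summary):\s*(.+)")

def parse_candidate_block(text: str) -> dict:
    """Extract the labelled fields from one candidate's report."""
    fields = {m.group(1): m.group(2).strip()
              for m in FIELD_PATTERN.finditer(text)}
    if "Score" in fields:
        fields["Score"] = int(fields["Score"])  # "87" -> 87
    return fields
```

Since the format is produced by a language model, real code should tolerate deviations rather than assume every field is always present.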
## CI/CD Pipeline

Azure Pipelines configuration (azure-pipelines.yml):
- Triggers on the qas branch
- Uses a windows-latest VM image
- Python 3.7 environment
- Runs pytest for testing
- Deploys the Chainlit application
## Security Notes

⚠️ Important:
- Never commit config.json with real API keys
- Use environment variables for sensitive data in production
- The current config contains example/placeholder values
## Future Enhancements

Potential improvements:
- OAuth 2.0 integration (notes in oauth process.txt)
- Multi-language support
- Advanced filtering and search capabilities
- Resume template standardization
- Analytics dashboard
- Batch processing improvements
## Dependencies

See requirements.txt for the complete list:
- watchdog==2.0.1
- PyMuPDF==1.24.4
- python-docx==1.1.2
- textract==1.6.5
- chainlit==1.1.202
- openai==1.30.1
## Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Submit a pull request
## License
[Specify your license here]
## Author
manideepsp
## Support
For issues and questions, please open an issue on GitHub.
**Note**: This project uses OpenAI's API, which may incur costs. Monitor your API usage accordingly.