Project Overview
This project introduces an AI-driven resume analysis system for HR recruitment. It uses Natural Language Processing (NLP) and Machine Learning (ML) techniques to automate resume screening, with the goals of faster processing, fairer short-listing, and useful feedback for candidates.
The system parses resumes to extract key information such as skills, experience, and educational background. It then ranks candidates by their relevance to specific job descriptions, reducing manual workload and shortening the hiring pipeline. A key focus is to minimize human bias and to provide constructive, personalized feedback to applicants, making the recruitment experience more transparent and equitable.
Core Objectives
Efficiency & Automation
- Automate and speed up resume screening.
- Enable HR departments to efficiently process large volumes of resumes.
- Reduce the time-to-hire for various roles.
Fairness & Feedback
- Minimize manual effort and inherent biases in candidate short-listing.
- Provide candidates with personalized, constructive feedback and improvement tips.
- Enhance transparency in the recruitment journey for all applicants.
Key Achievements
Technical Implementation
- Developed a robust end-to-end AI screening pipeline using NLP and ML.
- Implemented advanced PDF parsing for accurate data extraction.
- Utilized Named Entity Recognition (NER) for skill and experience identification.
- Applied TF-IDF, Bag-of-Words, and domain classification for relevance scoring.
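The relevance-scoring step above can be illustrated with a small sketch. This is a dependency-free approximation of TF-IDF weighting with cosine similarity, not the project's actual code (which, per the technology list, relies on Scikit-learn). The function names `tokenize`, `tfidf_vectors`, `cosine`, and `rank_resumes` and the smoothed IDF formula are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into tokens (keeps '+' and '#' for terms like c++ or c#)."""
    return re.findall(r"[a-z0-9+#]+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # Term frequency times a smoothed inverse document frequency (assumed formula).
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_resumes(job_description, resumes):
    """Rank resumes (dict: name -> text) by similarity to the job description, best first."""
    names = list(resumes)
    vectors = tfidf_vectors([job_description] + [resumes[name] for name in names])
    job_vector, resume_vectors = vectors[0], vectors[1:]
    scored = [(name, cosine(job_vector, vec)) for name, vec in zip(names, resume_vectors)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_resumes(job_text, {"alice": alice_text, "bob": bob_text})` returns `(name, score)` pairs sorted from most to least relevant. Cosine similarity is used here because it is insensitive to resume length, which varies widely between candidates.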
User & Data Management
- Built an intuitive admin dashboard with a secure PostgreSQL backend.
- Integrated comprehensive visual analytics for insightful data representation.
- Enabled easy management and categorization of job descriptions and resumes.
Advanced Features
- Incorporated LLM-generated suggestions (e.g., from Mistral-7B) for candidate feedback.
- Designed modules for seamless resume export and detailed scoring.
- Achieved highly accurate domain prediction for diverse job sectors.
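As a rough illustration of how LLM-driven feedback might be prepared, the sketch below computes which required skills are missing from a resume and assembles a text prompt from them. The function names `skill_gap` and `build_feedback_prompt` and the prompt wording are hypothetical. The call to Mistral-7B itself is left out because the source does not describe how the model is invoked.

```python
def skill_gap(required_skills, resume_skills):
    """Return the required skills absent from the resume, in the job's original order."""
    have = {skill.lower() for skill in resume_skills}
    return [skill for skill in required_skills if skill.lower() not in have]


def build_feedback_prompt(job_title, required_skills, resume_skills):
    """Assemble a plain-text prompt asking an LLM for constructive feedback.

    The wording is an assumption for illustration, not the project's actual prompt.
    """
    missing = skill_gap(required_skills, resume_skills)
    matched = [skill for skill in required_skills if skill not in missing]
    lines = [
        f"You are reviewing a resume for the role: {job_title}.",
        f"Skills the candidate already shows: {', '.join(matched) or 'none'}.",
        f"Required skills not found in the resume: {', '.join(missing) or 'none'}.",
        "Give the candidate three short, constructive suggestions for improvement.",
    ]
    return "\n".join(lines)
```

Matching is case-insensitive, so `"python"` on a resume satisfies a requirement written as `"Python"`. The returned string can then be passed to whichever text-generation interface the deployment uses.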
Technologies Utilized
Frameworks & Languages
- Frontend: Streamlit (Python) - for interactive web applications.
- Backend: Python - for core logic and data processing.
- LLM Integration: Hugging Face (Mistral-7B) - for advanced text generation.
NLP & ML Techniques
- Text Preprocessing: Tokenisation, Lemmatisation, Stop-word Removal.
- Feature Extraction: Named Entity Recognition (NER), TF-IDF, Bag-of-Words (BoW), Count Vectoriser.
- Modeling: Various ML algorithms for classification and ranking.
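The preprocessing and Bag-of-Words steps listed above can be sketched in a few lines. This version uses only the standard library. The stop-word list is deliberately tiny, and `naive_lemma` is a crude suffix stripper standing in for a real lemmatiser, so both are assumptions for illustration rather than the project's implementation.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a production list would be much longer.
STOP_WORDS = {"a", "an", "and", "the", "in", "of", "with", "for", "to", "on"}

# Crude suffix rules standing in for a real lemmatiser (assumption for this sketch).
SUFFIX_RULES = [("ies", "y"), ("ing", ""), ("ed", ""), ("s", "")]


def tokenise(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def naive_lemma(token):
    """Apply the first matching suffix rule, keeping at least three leading characters."""
    for suffix, replacement in SUFFIX_RULES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: len(token) - len(suffix)] + replacement
    return token


def preprocess(text):
    """Tokenise, drop stop words, then reduce each remaining token to a rough base form."""
    return [naive_lemma(token) for token in tokenise(text) if token not in STOP_WORDS]


def bag_of_words(text):
    """Count how often each preprocessed token occurs in the text."""
    return Counter(preprocess(text))
```

For example, `preprocess("Built the pipelines and tested models")` drops "the" and "and" and reduces the plural and past-tense forms, while `bag_of_words` turns the result into term counts that a classifier or ranking step can consume.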
Database & Libraries
- Database: PostgreSQL - for robust and scalable data storage.
- PDF Parsing: PDFMiner, PyMuPDF - for extracting text from PDFs.
- ML Library: Scikit-learn - for implementing machine learning models.
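For the PDF parsing step, a minimal sketch using PyMuPDF might look like the following. `extract_pdf_text` and `normalise_whitespace` are hypothetical helper names, and the clean-up rules are assumptions about what extracted resume text typically needs. PyMuPDF is imported inside the function so the clean-up helper can be used without it installed.

```python
import re


def extract_pdf_text(path):
    """Return the text of every page in the PDF at `path`, joined by newlines."""
    import fitz  # PyMuPDF; imported lazily so normalise_whitespace works without it

    with fitz.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)


def normalise_whitespace(text):
    """Rejoin words hyphenated across line breaks and collapse runs of whitespace."""
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)  # "experi-\nence" becomes "experience"
    text = re.sub(r"[ \t]+", " ", text)           # collapse spaces and tabs
    text = re.sub(r"\n{3,}", "\n\n", text)        # keep at most one blank line
    return text.strip()
```

A typical call would be `normalise_whitespace(extract_pdf_text("resume.pdf"))`, where `resume.pdf` is a placeholder path. Note that the hyphen rule also joins genuinely hyphenated compounds that happen to break across lines, which may or may not be desirable for a given corpus.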