Efficient Document Retrieval with RAG for GPT-3.5

Ollama, PrivateGPT, Chroma DB

GitHub

This project explores using Retrieval-Augmented Generation (RAG) to improve document retrieval for GPT-3.5. RAG combines information retrieval with GPT-3.5's language generation: relevant passages are retrieved first and then supplied to the model as context for its answer.
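The retrieve-then-generate flow can be illustrated with a minimal, dependency-free sketch. It is not the project's implementation: it assumes a bag-of-words cosine similarity in place of real embeddings and a vector store such as Chroma DB, it stops at building the prompt instead of calling a model, and the names `tokenize`, `cosine`, `retrieve`, `build_prompt`, and `DOCS` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, top_k=2):
    """Return up to top_k documents most similar to the query.

    Assumption: word overlap stands in for embedding similarity.
    Documents with zero similarity are dropped.
    """
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(query, documents, top_k=2):
    """Assemble the text that would be sent to the language model."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, top_k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical toy corpus, for illustration only.
DOCS = [
    "Chroma DB stores document embeddings for similarity search.",
    "The cafeteria serves lunch from noon until two.",
    "Retrieval-Augmented Generation adds retrieved passages to the prompt.",
]

if __name__ == "__main__":
    print(build_prompt("How does Chroma DB store embeddings?", DOCS))
```

In a real pipeline the scoring step would typically be replaced by an embedding model plus a vector database query, and the assembled prompt would be passed to the language model.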

AG News Classification

NLP, PySpark, BERT

GitHub

A news classification model based on the BERT architecture that achieves over 91% accuracy, significantly outperforming traditional models. Trained on over 120,000 news articles, the model handles diverse topics and evolving language.

Empathy and Emotion Detection on Tweets

NLP, RoBERTa, PyTorch

GitHub

This study, presented at the WASSA 2023 workshop, investigates how RoBERTa can improve emotion classification accuracy in essays, addressing the challenges of imbalanced datasets and offering insights for future research.

LFW Face Recognition

Matplotlib, NumPy

GitHub

An evaluation of SVM and kNN classifiers using different data representation methods on the Labeled Faces in the Wild (LFW) dataset.

Resume Parsing and Classification

Pandas, Scikit-Learn, Python

GitHub

A resume parsing and classification system built in Python to categorize resumes accurately. The system combines multiple machine learning techniques, tuned through k-fold cross-validation and hyperparameter search with GridSearchCV. It identifies key skills and qualifications, streamlining candidate evaluation and improving recruitment efficiency.
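The idea behind combining k-fold cross-validation with a grid search can be shown in a small, dependency-free sketch. This is not the project's code and it does not use scikit-learn: it assumes a toy one-dimensional k-nearest-neighbours classifier as a stand-in model, and the names `k_fold_indices`, `knn_predict`, `cross_val_accuracy`, and `grid_search` are hypothetical. It illustrates the procedure that a tool like GridSearchCV automates: score each candidate hyperparameter on every fold, then keep the candidate with the best mean score.

```python
from statistics import mean


def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once.

    Folds are contiguous and unshuffled, so real data should be shuffled first.
    """
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop


def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k training points closest to x (1-D toy model)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    votes = [train_y[i] for i in nearest]
    return max(set(votes), key=votes.count)


def cross_val_accuracy(xs, ys, k, n_folds=5):
    """Mean accuracy of the toy kNN model across all folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), n_folds):
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        correct = sum(
            knn_predict(train_x, train_y, xs[i], k) == ys[i] for i in test_idx
        )
        scores.append(correct / len(test_idx))
    return mean(scores)


def grid_search(xs, ys, k_values, n_folds=5):
    """Score every candidate k and return (best_k, all_scores)."""
    results = {k: cross_val_accuracy(xs, ys, k, n_folds) for k in k_values}
    best_k = max(results, key=results.get)
    return best_k, results


# Hypothetical toy data: two well-separated classes, interleaved.
XS = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.15, 0.85, 0.25, 0.75]
YS = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]

if __name__ == "__main__":
    best_k, results = grid_search(XS, YS, k_values=[1, 3])
    print(best_k, results)
```

Odd values of k are used with two classes so that the majority vote cannot tie.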

PhotoHive - A Photo Gallery App

AWS Lambda, React.js, Flask

GitHub

PhotoHive is a gallery app that allows users to upload, search, and manage photos using a serverless backend.

Vocabulary Builder

Python, JavaScript, CLI

GitHub

Vocabulary Builder is a command-line application and web app designed to help users expand their vocabulary by reviewing and memorizing words from different stacks of GRE vocabulary.