PhD Student in Bioinformatics at UAB
AdaGradOptimizer contains code implementing a custom AdaGrad gradient descent optimizer for neural networks from scratch.
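For readers unfamiliar with AdaGrad, the update rule can be sketched in a few lines of plain Python. This is an illustrative sketch, not the code in the repository: the function name `adagrad_minimize`, the hyperparameter defaults, and the toy quadratic objective are all assumptions made for this example.

```python
import math


def adagrad_minimize(grad_fn, params, lr=0.5, eps=1e-8, steps=500):
    """Minimize a function with AdaGrad, given a callable that returns its gradient.

    All names and defaults here are hypothetical, chosen for illustration only.
    """
    params = list(params)
    accum = [0.0] * len(params)  # running sum of squared gradients, one per parameter
    for _ in range(steps):
        grads = grad_fn(params)
        for i, g in enumerate(grads):
            accum[i] += g * g
            # Each parameter gets its own step size, shrinking as its
            # accumulated squared gradient grows.
            params[i] -= lr * g / (math.sqrt(accum[i]) + eps)
    return params


def toy_gradient(p):
    """Gradient of the toy objective f(x, y) = (x - 3)^2 + 10 * (y + 1)^2."""
    x, y = p
    return [2.0 * (x - 3.0), 20.0 * (y + 1.0)]


solution = adagrad_minimize(toy_gradient, [0.0, 0.0])
```

On this toy objective the two coordinates have very different curvature, which is the situation the per-parameter scaling is meant to handle; `solution` should end up close to the minimizer at (3, -1).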
LogisticRegression contains code for training a neural-network-based logistic regression classification model from scratch in PyTorch.
MNIST contains code for implementing, training, and error-scoring a series of feedforward neural networks, in Keras, for recognizing handwritten MNIST digits (with a best macro average test F1 score of 97%).
TranscriptionFactorBinding contains code for implementing and training a neural network to predict the presence/absence of transcription factor binding sites in DNA fragments.
Alzheimers contains a data science project for predicting Alzheimer’s disease status from RNA expression data (with 5-fold cross-validation and 92% macro average F1 score), across 25k genes, for a small sample of postmortem human and mouse brain tissue.
RealEstatePrediction contains an encoded XGBoost regression model (trained with randomized grid search and 5-fold cross-validation) and an IPython notebook that implements code for predicting 2021 estimated full cash values from 2016-2020 housing data in the Tucson, Arizona real estate market (test root mean squared error of $8,568 for a dataset of 101,661 homes).
AnthemMiniProject contains a data science project for predicting the chance of insurance policy renewals with XGBoost.
CostEffectiveMaintenance contains code that demonstrates data wrangling and feature engineering, and trains an XGBoost classifier to determine whether corporate truck maintenance records from a real-life dataset (hosted on a MySQL server) are cost-effective or cost-ineffective.
KNearestNeighbors contains code that implements, from scratch, a two-dimensional k-nearest neighbors regression algorithm.
LinearRegression contains code that implements a linear regression, from scratch, without using scikit-learn.
RetailDemandForecasting contains work on feature engineering and demand forecasting of weekly companywide sales with a 2011-2012 Walmart retail dataset.
SongAnalytica contains code that prototypes using a statistical model (XGBoost) to determine whether U.S. presidents’ re-election chances are correlated with approval rating features. Inspired by the following Wikipedia article.
Alignment uses Needleman-Wunsch and Smith-Waterman algorithms to perform alignment of genetic sequences.
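As a rough illustration of the global alignment case, the Needleman-Wunsch score can be computed with a small dynamic program. This sketch is not the repository's code: the function name `needleman_wunsch_score` and the scoring values (match +1, mismatch -1, gap -1) are assumptions, and it returns only the optimal score, not the alignment itself. Smith-Waterman (local alignment) is not covered here.

```python
def needleman_wunsch_score(s, t, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences s and t.

    Scoring parameters are illustrative assumptions with a linear gap penalty.
    """
    # prev[j] holds the best score for aligning the processed prefix of s
    # against the first j characters of t.
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        curr = [i * gap]  # aligning s[:i] against an empty prefix costs i gaps
        for j in range(1, len(t) + 1):
            diag = prev[j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            up = prev[j] + gap        # gap in t
            left = curr[j - 1] + gap  # gap in s
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]
```

Keeping only two rows of the table makes the memory use linear in the length of `t`; recovering the actual alignment would require the full table (or a traceback structure) instead.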
MedianSortedArray finds the median of two sorted arrays with O(log (m+n)) time complexity.
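One common way to reach logarithmic time is a binary search over how many elements of the shorter array fall in the left half of the merged order. The sketch below follows that approach and may differ from the repository's implementation; the function name `median_sorted_arrays` is an assumption. It runs in O(log(min(m, n))) time, which is within the O(log (m+n)) bound stated above.

```python
def median_sorted_arrays(a, b):
    """Return the median of two sorted lists without merging them."""
    if len(a) > len(b):
        a, b = b, a  # binary-search over the shorter list
    m, n = len(a), len(b)
    if n == 0:
        raise ValueError("both arrays are empty")

    half = (m + n + 1) // 2  # size of the left half of the merged order
    lo, hi = 0, m
    while lo <= hi:
        i = (lo + hi) // 2  # elements taken from a for the left half
        j = half - i        # elements taken from b for the left half
        a_left = a[i - 1] if i > 0 else float("-inf")
        a_right = a[i] if i < m else float("inf")
        b_left = b[j - 1] if j > 0 else float("-inf")
        b_right = b[j] if j < n else float("inf")

        if a_left <= b_right and b_left <= a_right:
            # Valid partition: everything on the left is <= everything on the right.
            if (m + n) % 2:
                return float(max(a_left, b_left))
            return (max(a_left, b_left) + min(a_right, b_right)) / 2
        if a_left > b_right:
            hi = i - 1  # took too many from a
        else:
            lo = i + 1  # took too few from a
    raise ValueError("inputs must be sorted")
```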
PerfectSquare uses Newton’s method to evaluate whether an integer is a perfect square.
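A minimal sketch of the idea, assuming an integer-only variant of Newton's iteration (the repository may use a different formulation): starting from x = n, repeatedly replace x with the floor of (x + n / x) / 2 until x * x no longer exceeds n, then check whether x * x equals n. The function name `is_perfect_square` is hypothetical.

```python
def is_perfect_square(n):
    """Return True if the integer n is a perfect square, using integer Newton steps."""
    if n < 0:
        return False
    if n < 2:
        return True  # 0 and 1 are perfect squares
    x = n
    # Newton's iteration for f(x) = x^2 - n, kept in integers to avoid
    # floating-point rounding on large inputs. It stops at floor(sqrt(n)).
    while x * x > n:
        x = (x + n // x) // 2
    return x * x == n
```

Staying in integer arithmetic means the check remains exact for arbitrarily large Python integers, where a floating-point square root could round incorrectly.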
ConnectFour
FlappyFrogger
HeartHistory contains a Python command line tool that queries the Art Institute of Chicago’s API to display a random work of art.
pip install hearthistory
NewsAPI contains a Python script that aggregates the top 20 news titles from sources across 4 categories in over 50 countries.
QuoteAPI contains a Python script that provides a quote about happiness from famous people throughout history.
WhatShouldIReadNext contains a Python command line tool that searches for similar book recommendations based on a title or author query.