Welcome!
About
Hello, I’m Harry. Welcome to my GitHub Pages website! Here, you’ll find an overview of my completed projects as well as my CV. I am currently pursuing a Master’s in Statistics at Baruch with a track in Data Science. My interests lie in modern statistical methods, particularly machine and statistical learning, and their applications in finance, business, economics, and technology. Below, you can explore my CV, complete with links to all the projects mentioned.
HARRY SOHAL
NYC | LinkedIn | GitHub
EDUCATION
Baruch College, Zicklin School of Business | New York, NY | MS in Statistics & Data Science
- GPA 3.9/4.0
Baruch College, Zicklin School of Business | New York, NY | BBA in Business & Marketing Management
Worked approximately 30 hours per week while pursuing degree.
Honors: Cum Laude
EXPERIENCE
Annalect | NYC | Data Science Intern (June 2025 - Aug 2025)
Developed scripts and data pipelines to clean, preprocess and transform large data sets in R.
Applied Bayesian modeling and regression analysis to uncover key drivers of sales performance and optimize marketing spend.
Implemented Markov Chain algorithms to support multi-touch attribution modeling and customer journey analysis.
Conducted in-depth exploratory data analysis to identify trends, anomalies, and actionable insights.
Translated complex model outputs into clear, client-facing deliverables and presentations in PowerPoint for major stakeholders.
Led data-driven strategy for a simulated product launch by applying clustering techniques to segment audiences and inform market positioning and targeting strategies.
Collaborated with cross-functional teams to align statistical modeling with marketing objectives and campaign strategies.
Hydr8 | Brooklyn, NY | Marketing Data Analyst Intern (June 2023 - June 2024)
Conducted market and customer research using SQL, Python, and R, identifying strategic opportunities and delivering actionable insights to improve business outcomes.
Analyzed customer and sales data to uncover trends, delivering comprehensive product insights and monthly performance metrics to inform business strategy.
Designed and implemented advanced analyses and data visualizations in R, supporting data-driven decision-making and increasing stakeholder engagement by 20% during quarterly reviews.
Automated a data pipeline to scrape text data using Python, reducing manual data extraction time by over 70%.
Managed multiple data-driven projects simultaneously, improving team productivity by 20% and delivering high-quality insights.
PROJECT EXPERIENCE
Monte Carlo Analysis of Retirement Systems, Baruch College (Nov 2024)
Developed a Monte Carlo simulation to statistically evaluate retirement plans and provide insights on optimal plans.
Automated data extraction using raw API calls to Alpha Vantage and FRED, then conducted analysis and built advanced visualizations.
Implemented dynamic functions in R to simulate retirement outcomes based on inputs like age, salary, and estimated lifespan.
Built an interactive Shiny application to visualize return distributions, allowing users to adjust parameters like the number of simulations for personalized insights.
JPMorgan Chase & Co Quantitative Research Virtual Internship (Sept 2024)
Gained hands-on experience in quantitative research through a series of practical assignments.
Applied a machine learning model to predict gas prices based on relevant market data.
Developed a dynamic pricing function for commodity contracts, adjusting for factors like volume, rates, and dates.
Built a predictive credit risk model to estimate the probability of loan default and calculate expected loss.
Used dynamic programming to convert FICO scores into categorical data to predict defaults.
Stock Prediction using Machine Learning (July 2024)
Learned the fundamentals of developing ML models and neural networks in Python.
Developed an LSTM model using the Keras library to make predictions on Apple stock data.
Handled and processed data in preparation for model development.
Performed in-depth research, data analysis, and visualization.
UFC Data Analysis and Modeling (April 2024)
Cleaned, processed, and prepared large datasets using R programming.
Applied statistical methods such as regression analysis and hypothesis testing to understand correlations between variables and make forecasts.
Created detailed data visualizations to summarize insights and support strategic recommendations.
Developed and deployed a logistic regression model that predicts fight outcomes with 55% accuracy, with an interactive UI built in R Shiny.
SKILLS, CERTIFICATIONS, & EXTRACURRICULARS
Skills: R, SQL, Python, Google BigQuery, Tableau, Hypothesis Testing, Data Mining, Time Series Analysis, Regression Analysis, Monte Carlo Analysis, Statistical Modeling, Machine Learning
Certifications: JPMorgan Chase & Co. Quantitative Research Job Simulation, Google Data Analytics Professional Certificate
Other Experience
SUNY FIT | NYC | Statistics/Math Tutor (Oct 2024 - Present)
- Helping students develop a strong understanding of various subjects, including mathematics, statistics, fundamental economics, and introductory programming for machine learning and data analysis.