Andy Vu
Mathematics and Statistical Sciences Graduate
Email / GitHub / LinkedIn
Education
Financial Technology Boot Camp
(May 2023 - October 2023)
University of Toronto School of Continuing Studies, Toronto, ON
- Completed a 24-week Financial Technology Boot Camp focused on financial fundamentals, blockchain and cryptocurrency, machine learning applications in finance, and programming and financial libraries.
- Developed a strong understanding of cutting-edge financial technologies and their applications in the financial industry through rigorous coursework, hands-on projects, and collaborative teamwork.
Honours Bachelor of Science in Mathematics and Statistics
(September 2018 - November 2022)
University of Toronto (St. George), Toronto, ON
- Mathematics & Its Applications Specialist (Probability/Statistics) and Statistics program graduate.
- Graduated with Distinction (3.22 CGPA).
Technical Skills
Languages and IDEs:
- R (dplyr, ggplot2, forcats, tibble, tidyr, knitr)
◦ RStudio
- Python (pandas, scikit-learn, NumPy)
◦ PyCharm
◦ WING
◦ Visual Studio Code
◦ JupyterLab
- SQL
- Java
◦ IntelliJ IDEA
- Solidity
◦ Remix IDE
- HTML & CSS
- LaTeX
Software and Applications:
- Git/GitHub
- Amazon Web Services (Amazon Lex, Amazon SageMaker, Lambda)
- Microsoft Office (Word, Excel, Outlook, PowerPoint, OneNote, Forms, Access)
- Google Docs Editors (Docs, Sheets, Slides, Keep, Forms)
- Adobe Creative Suite (Acrobat, Photoshop, Premiere Pro, After Effects)
Experience
University of Toronto Vietnamese Students’ Association, Toronto, ON - General Member
(September 2018 - April 2022)
- Actively participated in monthly social, academic and cultural events organized by the club.
- Contributed to the success of the IGNITE XXII and IGNITE XXIII cultural shows, placing second in IGNITE XXII.
- Awarded the Most Engaged Member recognition for exceptional contributions and active participation in the club.
Projects
Blockchain Car Marketplace Blockchain Project (Solidity, Python, GitHub)
(October 2023)
- Collaborated with a team to build a car marketplace on the Ethereum blockchain, taking the lead in implementing machine learning functionality.
- Cleaned and encoded a dataset of 4000+ used cars and evaluated models using the pandas and scikit-learn packages, ultimately selecting the Gradient Boosting Regressor for its superior performance.
- Developed an Ethereum Smart Contract backend and implemented functionalities for minting NFTs, facilitating ownership transfers, and enabling users to effortlessly list, query, and purchase cars through an intuitive Streamlit interface.
Check First Machine Learning Project (Python, GitHub)
(September 2023)
- Initiated and guided a machine learning project with three peers to build an application that uses advanced models to assess the risk of common diseases, with a particular focus on Chronic Obstructive Pulmonary Disease and dementia.
- Used the scikit-learn package extensively to preprocess the data and to train, test and evaluate machine learning models based on support vector machine, k-nearest neighbors, ensemble-based and naive Bayes algorithms.
- Employed the tensorflow package to train and test a sequential neural network deep learning model.
Analyzing Weather Data for Climate Trends Data Analysis Project (Python, GitHub)
(July 2023)
- Collaborated with three peers on an analysis project to determine climate trends in the last 50 years using hurricane, drought and temperature data.
- Extended the dataset by developing a function that traverses it and initiates API calls.
- Utilized the pandas package extensively to clean and prepare the data for visualizations with the hvPlot package.
Analyzing the Gateway to Vaping Data Analysis Report (R)
(December 2021)
- Wrangled survey data collected through The Canadian Tobacco and Nicotine Survey with the tidyr and dplyr packages.
- Summarized and visualized the data in tables and plots using the ggplot2 package.
- Applied the propensity score matching method to determine the factor that affects the probability of trying vaping for the first time.
Gender Parity in Black Saber Software Consulting Report (R)
(April 2021)
- Collaborated with three peers on a consulting project that evaluated gender parity in hiring, wages and promotion at a simulated software company.
- Processed and visualized large datasets using the dplyr and ggplot2 packages to support project analyses.
- Employed linear and generalized linear mixed models to assess gender parity, including verifying assumptions and interpreting results with confidence intervals and significance testing.
- Produced a report utilizing R Markdown to ensure the report was both cohesive and reproducible.
- A full description of this project (written by my past instructor) can be found here
Exploring COVID-19 data for Toronto, Canada Data Exploration Assignment (R, GitHub)
(February 2021)
- Utilized an API to extract current data from the Open Data Toronto Portal.
- Employed the tidyr and dplyr packages to wrangle, parse and transform data on COVID-19 cases across the 140 Toronto neighborhoods.
- Developed visualizations using the ggplot2 package to effectively communicate insights from the data, such as histograms for the number of daily reported cases and thematic maps for the proportion of cases by region.
Are NBA Players Underpaid or Overpaid? Data Analysis Report (R)
(August 2020)
- Collaborated remotely with a team of three to complete a data analysis project on NBA players' salaries.
- Developed and validated linear regression models utilizing diagnostic plots, various criteria, and out-of-sample validation.
- Created a comprehensive report utilizing R Markdown, featuring tables and figures to communicate key insights to the reader.
Relevant Coursework
- CSC108 ~ Introduction to Computer Programming
- CSC165 ~ Mathematical Expression and Reasoning for Computer Science
- CSC148 ~ Introduction to Computer Science
- MAT137 ~ Calculus with Proofs
- ECO101 ~ Principles of Microeconomics
- ECO102 ~ Principles of Macroeconomics
- MAT223 ~ Linear Algebra I
- MAT224 ~ Linear Algebra II
- MAT237 ~ Multivariable Calculus with Proofs
- MAT244 ~ Introduction to Ordinary Differential Equations
- MAT246 ~ Concepts in Abstract Mathematics
- STA247 ~ Probability with Computer Applications
- STA248 ~ Statistics for Computer Scientists
- STA302 ~ Methods of Data Analysis I
- STA303 ~ Methods of Data Analysis II
- STA304 ~ Surveys, Sampling and Observational Data
- STA305 ~ Design and Analysis of Experiments
- STA347 ~ Probability
- STA355 ~ Theory of Statistical Practice
- MAT301 ~ Groups and Symmetries
- MAT309 ~ Introduction to Mathematical Logic
- MAT315 ~ Introduction to Number Theory
- MAT334 ~ Complex Variables
- MAT337 ~ Introduction to Real Analysis
- MAT344 ~ Introduction to Combinatorics
- APM346 ~ Partial Differential Equations
- STA437 ~ Methods for Multivariate Data
- STA442 ~ Methods of Applied Statistics
- STA457 ~ Time Series Analysis
Languages
English: Native
Vietnamese: Mother Tongue