João Pinheiro#
Professional with 4 years of experience in Data Science, Data Engineering, Computer Vision, and Artificial Intelligence within the banking and technology sectors. Specialist in Python, Machine Learning, Big Data, and Data Engineering, with hands-on experience in projects leveraging Spark, AWS, and Cloud Computing tools. Proven ability to transform data into actionable insights and deliver impactful business solutions.
I conduct research in robotics and Artificial Intelligence at USP, where I mentor students and lead projects with budgets exceeding R$48 million. I have also published research articles in international journals.
I also maintain a blog showcasing my Data Science and Machine Learning portfolio: joaomh.github.io. Additionally, I run a YouTube channel, “2001 Engenharia,” where I create educational programming content: youtube.com/2001engenharia.
Skills: Python | SQL | Scala | Julia | MATLAB | C++ | LaTeX | Data Science | Machine Learning | Computer Vision | Artificial Intelligence | Data Analysis | Data Engineering | Big Data | ETL | Software Development | Cloud Computing | Research | Calculus | Statistics | Mathematics | Optimization | Git | GitHub | Linux | AWS | EMR | Glue | Lambda | Athena | S3 | SageMaker | Step Functions | CI/CD | CloudFormation | Terraform | CloudWatch | Pandas | TensorFlow | PyTorch | scikit-learn | Keras | OpenCV | Optuna | NumPy | SciPy | PySpark | Spark | Hadoop | Iceberg | Hive | Presto
Projects#
Here are some of the projects I am working on or have worked on in the past.
Current projects#
Books#
An Introduction to Machine Learning#
An online book, built with Jupyter Book, that covers the main topics in machine learning, with accompanying projects.
YouTube Channel#
2001 Engenharia#
A YouTube channel offering courses in programming, data science, machine learning, and engineering.
Past projects#
Boosted Trees: XGBoost vs. CatBoost vs. LightGBM#
An overview of boosted tree algorithms, their main differences, performance comparisons, and hyperparameter optimization.
Credit Risk Prediction#
A machine learning model that classifies clients into risk profiles.
A Study on Gradient Boosting Algorithms and Hyperparameter Optimization Using Optuna#
My undergraduate thesis, ‘A Study on Gradient Boosting Algorithms and Hyperparameter Optimization Using Optuna’, is an empirical study that applies three machine learning algorithms (XGBoost, CatBoost, and LightGBM) to four datasets, uses Optuna to improve on each baseline model’s performance, and includes a literature review of each topic.
Python Data Structures and Algorithms#
Python examples of many data structures and algorithms.
MATLAB Course in PT-BR#
A MATLAB programming course on my YouTube channel, with more than 90 videos.