🧠Deep Learning | Tags

Yuki’s Blog


PPO Explained for Dummies (With Python)

2022-3-9

Proximal Policy Optimization (PPO) is one of the most widely used reinforcement learning algorithms because it balances stability and efficiency. This article breaks down how an AI agent gradually improves its decision-making through trial, error, and strategic policy updates, just like learning to ride a bike!

REINFORCE Explained for Dummies (With Python)

2022-3-8

The REINFORCE algorithm is the most basic policy gradient reinforcement learning algorithm. Imagine you’re learning to ride a bicycle without a teacher to guide you on what to do. You can only learn through "try → see the result → adjust → try again." The REINFORCE algorithm is the mathematical expression of this learning process.

MCTS Explained for Dummies (With Python)

2022-3-10

Imagine you're playing a game of chess, and there are many choices at each step. Monte Carlo Tree Search is like a smart assistant that helps you find the best move by "simulating the future."

Roboflow: Build A Coins Detection App

2025-4-5

Getting Started with Roboflow: Annotate Your Dataset and Train Models All in One Place. A Hands-On Tutorial for Building a Coin-Detection App. Computer Vision Workshop for ADSP 32023 IP01: Advanced Computer Vision with Deep Learning

Recommendation System Collection

2025-6-15

A collection of recommendation system workshops.