Problem and Context
The challenge was to learn strong strategy in a combinatorial board game where long-horizon planning and local tactics must coexist.
ML/AI
Research-driven self-play framework combining PPO and MCTS with CNN-based policy/value modeling for strategic board-game agents.
Role: ML engineer + research author
Timeline: 2025
I paired reinforcement learning with search: PPO improved policy quality through self-play, while MCTS sharpened action selection and exposed emergent strategic behavior for analysis.
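To illustrate how a learned policy can guide tree search, here is a minimal sketch of PUCT-style child selection, the rule commonly used in AlphaZero-like systems. It is an assumption that the project used this particular rule; the names `Node`, `select_child`, `visit_policy`, and the constant `c_puct` are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One search-tree node (hypothetical structure, for illustration only)."""
    prior: float                  # policy-network probability of the move leading here
    visit_count: int = 0          # number of simulations that passed through this node
    value_sum: float = 0.0        # sum of backed-up values, assumed to be from the
                                  # perspective of the player choosing at the parent
    children: dict = field(default_factory=dict)  # action -> Node

    def mean_value(self) -> float:
        # Unvisited nodes default to a neutral value of 0.
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def select_child(node: Node, c_puct: float = 1.5):
    """Return the (action, child) pair with the highest PUCT score."""
    total_visits = sum(child.visit_count for child in node.children.values())

    def score(item):
        _, child = item
        # Exploration bonus: large for high-prior, rarely visited moves.
        bonus = c_puct * child.prior * math.sqrt(total_visits + 1) / (1 + child.visit_count)
        return child.mean_value() + bonus

    return max(node.children.items(), key=score)


def visit_policy(node: Node, temperature: float = 1.0) -> dict:
    """Convert child visit counts into a move distribution (sharper as temperature drops)."""
    weights = {a: c.visit_count ** (1.0 / temperature) for a, c in node.children.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}
```

In a setup like this, the policy network supplies `prior`, the value network supplies the backed-up values, and the visit distribution from `visit_policy` is what makes search output sharper than the raw policy.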
I built training/evaluation loops, encoded board state for convolutional models, and documented methodology and learning dynamics in a research-style write-up.
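As a rough illustration of encoding board state for a convolutional model, the sketch below turns a grid into stacked binary feature planes. The game is not specified here, so this assumes a two-player grid game with cells marked `1`, `-1`, or `0` (empty); the function name `encode_board` and the three-plane layout are hypothetical, not the project's actual encoding.

```python
def encode_board(board, player):
    """Encode a board as three feature planes (channels-first nested lists).

    board:  list of rows, each cell 1, -1, or 0 (assumed representation)
    player: 1 or -1, the side to move
    Returns [own_pieces, opponent_pieces, side_to_move], each with the board's shape.
    """
    # Plane 0: cells occupied by the player to move.
    own = [[1.0 if cell == player else 0.0 for cell in row] for row in board]
    # Plane 1: cells occupied by the opponent.
    opponent = [[1.0 if cell == -player else 0.0 for cell in row] for row in board]
    # Plane 2: constant plane, all ones when player 1 is to move, else all zeros.
    to_move = [[1.0 if player == 1 else 0.0 for _ in row] for row in board]
    return [own, opponent, to_move]


# Example: a 2x2 position seen from player 1's side.
planes = encode_board([[1, 0], [-1, 0]], player=1)
```

Encoding from the perspective of the side to move lets one network play both sides, since "own" and "opponent" planes swap automatically when the turn changes.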
The project produced agents with measurable strategic improvement and gave me a strong workflow for bridging theory, implementation, and technical communication.