ML/AI

BlokusDuo ML Agents

Research-driven self-play framework combining PPO and MCTS with CNN-based policy/value modeling for strategic board-game agents.

Role: ML engineer + research author

Timeline: 2025

Problem and Context

The challenge was to learn strong play in a combinatorial board game where long-horizon planning and local tactics must coexist.

Approach

I paired reinforcement learning with search: PPO improved policy quality through self-play, while MCTS sharpened action selection and exposed emergent strategic behavior for analysis.
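
To make the PPO side concrete, here is a minimal sketch of the standard clipped surrogate objective that PPO maximizes. It is illustrative only and not the project's training code: the function name, the plain-list inputs, and the default clip_eps of 0.2 are assumptions made for this example.

```python
import math


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective for a batch of sampled actions.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current policy and the policy that collected the data (hypothetical inputs).
    advantages: advantage estimates for the same actions.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the updated and the data-collecting policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Taking the minimum gives a pessimistic bound on the improvement.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

When both policies agree the ratio is 1 and the objective reduces to the mean advantage. Once the ratio moves past the clip range in the direction the advantage favors, the objective stops increasing, which is what keeps each policy update small.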

Technical Details

I built training/evaluation loops, encoded board state for convolutional models, and documented methodology and learning dynamics in a research-style write-up.
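
As an illustration of encoding board state for a convolutional model, the sketch below turns a grid into two binary feature planes, one for the player to move and one for the opponent. The cell values (0 empty, 1 and 2 for the two players), the function name, and the two-plane layout are assumptions for this example and not the project's actual encoding. The 14x14 size is the standard Blokus Duo board.

```python
BOARD_SIZE = 14  # standard Blokus Duo board is 14x14


def encode_board(board, current_player):
    """Return [own_plane, opponent_plane] as nested lists of floats.

    board: BOARD_SIZE x BOARD_SIZE list of lists with 0 = empty,
    1 / 2 = cell owned by that player (hypothetical representation).
    current_player: 1 or 2; planes are always from this player's view.
    """
    opponent = 2 if current_player == 1 else 1
    own_plane = [[1.0 if cell == current_player else 0.0 for cell in row]
                 for row in board]
    opp_plane = [[1.0 if cell == opponent else 0.0 for cell in row]
                 for row in board]
    return [own_plane, opp_plane]


# Usage: an otherwise empty board with one cell per player.
board = [[0] * BOARD_SIZE for _ in range(BOARD_SIZE)]
board[4][4] = 1
board[9][9] = 2
planes = encode_board(board, current_player=1)
```

Encoding from the perspective of the player to move lets a single network serve both sides in self-play, since the "own" plane always comes first. A fuller encoding would likely add planes for remaining pieces or legal placement corners, which this sketch omits.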

Outcome

The project produced agents with measurable strategic improvement and gave me a strong workflow for bridging theory, implementation, and technical communication.