MS Final Oral Exam: Jessica Kinnevan
Strategy Classification in Adversarial Gameplay Modeling
Effective adversarial agents in turn-based tactical games must not only play well but also counter the specific strategies a human player employs. Real-time strategy mitigation therefore requires label-free strategy classification in real time: identifying how a player is behaving from game-state observations alone, without access to ground-truth policy labels. We present a Vector Quantized Variational Autoencoder (VQ-VAE) approach (van den Oord, Vinyals, & Kavukcuoglu, 2017; Razavi, van den Oord, & Vinyals, 2019) for unsupervised strategy discovery in a turn-based tactical puzzle game inspired by Into the Breach, played on an 8×8 grid over five turns. Each game state, paired with its preceding action, is encoded as a token and processed by a transformer encoder (Vaswani et al., 2017) with GELU activations (Hendrycks & Gimpel, 2016). Turn-level latent vectors are aggregated via mean pooling and quantized against a learned codebook of N=5 vectors, which is updated through an exponential moving average; the game-level strategy is assigned by majority vote across turns. The model is trained, without access to policy labels, on a dataset of winning games generated by five behaviorally distinct policy variants: balanced, aggressive, defensive, guardian, and efficient.
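The inference path described above (mean pooling, nearest-codebook quantization, and a majority vote across turns) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the 2-D toy codebook, and the toy latent vectors are all hypothetical, squared Euclidean distance is assumed as the quantization metric, and the transformer encoder and the exponential-moving-average codebook update are omitted.

```python
from collections import Counter


def mean_pool(token_vectors):
    """Average a turn's token vectors into one turn-level latent vector."""
    n = len(token_vectors)
    return [sum(column) / n for column in zip(*token_vectors)]


def nearest_code(vector, codebook):
    """Return the index of the codebook entry closest to `vector`
    (squared Euclidean distance, an assumed metric)."""
    def sq_dist(code):
        return sum((v - c) ** 2 for v, c in zip(vector, code))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def game_strategy(turn_token_vectors, codebook):
    """Quantize each turn, then assign the game its most frequent code."""
    codes = [nearest_code(mean_pool(turn), codebook) for turn in turn_token_vectors]
    # Ties go to the code seen first; this tie-break is an arbitrary choice.
    strategy = Counter(codes).most_common(1)[0][0]
    return strategy, codes


# Hypothetical codebook of N=5 vectors (2-D only to keep the example readable).
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]

# Hypothetical encoder outputs: five turns, two token vectors per turn.
game = [
    [[0.9, 0.1], [1.1, -0.1]],
    [[0.8, 0.0], [1.0, 0.2]],
    [[0.0, 1.0], [0.2, 0.8]],
    [[1.0, 0.0], [1.0, 0.0]],
    [[0.5, 0.5], [0.5, 0.5]],
]

strategy, codes = game_strategy(game, codebook)
print(codes)     # [1, 1, 2, 1, 4]
print(strategy)  # 1
```

In this toy game three of the five turns map to code 1, so the game as a whole is labeled with strategy 1 even though individual turns fall into other codes.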
Post-training analysis shows that the codebook identifies five spatially differentiated strategies rather than reproducing the injected policy-variant labels: an upper-board high-activity style (S0), a right-edge flanking style (S1), a compressed lower-center cluster distinguished by reduced mobility and mech spread (S2), a wide-spreading left-flank style (S3), and a center-anchored generalist style (S4). Each discovered strategy contains a near-uniform mixture of all five injected policy variants (16–23% each), demonstrating that the VQ-VAE recovers genuine spatial-behavioral structure that cuts across policy identity. These results validate the approach as a foundation for strategy-aware adversarial training.
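The mixture analysis above amounts to cross-tabulating the discovered strategy of each game against the policy variant that generated it. The sketch below shows one way such a table could be computed; the function name and the toy data are hypothetical and do not reproduce the reported results.

```python
from collections import Counter, defaultdict


def variant_mixture(assigned_strategies, policy_variants):
    """For each discovered strategy, return the fraction of its games
    that came from each injected policy variant."""
    counts = defaultdict(Counter)
    for strategy, variant in zip(assigned_strategies, policy_variants):
        counts[strategy][variant] += 1
    return {
        strategy: {v: n / sum(c.values()) for v, n in c.items()}
        for strategy, c in counts.items()
    }


# Toy data, invented for illustration only.
assigned = [0, 0, 0, 0, 1, 1, 1, 1]
variants = [
    "balanced", "aggressive", "balanced", "defensive",
    "guardian", "efficient", "guardian", "guardian",
]

mixture = variant_mixture(assigned, variants)
for strategy in sorted(mixture):
    print(strategy, mixture[strategy])
```

A strategy whose fractions sit close to 1/5 for every variant, as in the 16–23% range reported above, is one that does not track policy identity.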
Committee: Simanta Mitra (major professor)