> I just didn’t want to mix my M:tG findings with this tutorial so that readers who are into Data Science but not into the game won’t be bored.<p>I'd encourage you to add some examples here, even if they're dumbed down. Without that, the article is not telling me what's been achieved through the process.
Neat idea, but I'm not sure that using Euclidean distance on what's essentially a categorical variable is valid. Instead, try a different clustering algorithm such as K-prototypes [1], or use Gower distance instead of Euclidean.<p>[1] <a href="https://pdfs.semanticscholar.org/d42b/b5ad2d03be6d8fefa63d25d02c0711d19728.pdf" rel="nofollow">https://pdfs.semanticscholar.org/d42b/b5ad2d03be6d8fefa63d25...</a><p>Edit: Thinking about it more, you could treat the cards in each deck as a bag of words and run LDA on it. Alternatively, create an embedding (just keep in mind skip-grams aren't meaningful for decks of cards) and cluster those.
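To make the Gower suggestion a bit more concrete, here is a minimal sketch in plain Python. The deck summaries, the feature names, and the `gower_distance` helper are all made up for illustration and are not taken from the article. It assumes equal feature weights: categorical features contribute 0 on a match and 1 on a mismatch, numeric features contribute the absolute difference divided by that feature's range, and the result is the average over all features.

```python
def gower_distance(a, b, is_categorical, ranges):
    """Equal-weight Gower dissimilarity between two mixed-type records."""
    total = 0.0
    for x, y, categorical, value_range in zip(a, b, is_categorical, ranges):
        if categorical:
            # Simple matching: 0 if the categories agree, 1 otherwise.
            total += 0.0 if x == y else 1.0
        elif value_range:
            # Range-normalised absolute difference, always in [0, 1].
            total += abs(x - y) / value_range
        # A numeric feature with zero range contributes nothing.
    return total / len(a)


# Hypothetical deck summaries (illustrative only):
# (primary colour, archetype label, average mana cost, land count)
decks = [
    ("red", "aggro", 1.8, 20),
    ("red", "aggro", 2.0, 21),
    ("blue", "control", 3.4, 26),
]
is_categorical = [True, True, False, False]

# Ranges are only needed for the numeric columns; categorical ones get 0.
ranges = [
    0 if categorical else max(d[i] for d in decks) - min(d[i] for d in decks)
    for i, categorical in enumerate(is_categorical)
]

d_similar = gower_distance(decks[0], decks[1], is_categorical, ranges)
d_different = gower_distance(decks[0], decks[2], is_categorical, ranges)
print(d_similar, d_different)
```

The two red aggro decks come out much closer to each other than either is to the blue control deck, and no arbitrary numeric coding of the colours or archetypes is needed. The resulting pairwise distances could then be fed to any clustering method that accepts a precomputed distance matrix.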
I like the LDA one (<a href="https://towardsdatascience.com/finding-magic-the-gathering-archetypes-with-latent-dirichlet-allocation-729112d324a6" rel="nofollow">https://towardsdatascience.com/finding-magic-the-gathering-a...</a>), which takes a nonparametric Bayesian approach.<p>But seeing different clustering algorithms in action is very enlightening.
How many people would like to play MtG vs a good AI?<p>How dominant is the social aspect when playing online?<p>(Let's ignore copyright issues for the moment.)