3 points by legel about 6 years ago

1 comment

legel about 6 years ago

Mesh-TensorFlow (https://github.com/tensorflow/mesh) solves the problem of networks being too large to fit into a single GPU's memory. I just trained a network with more than 1 billion parameters across 4 GPUs. Besides dataset size and compute power, representational capacity (size of embeddings) is a key blocker to learning, so this library opens up new possibilities for many of us.
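
To make the memory argument concrete, here is a rough back-of-the-envelope sketch. It is not Mesh-TensorFlow code, and the numbers rest on assumptions the comment does not state: float32 values, an Adam-style optimizer that keeps two extra buffers per parameter, parameters split evenly across devices, and activations ignored. The function name `training_bytes_per_device` is made up for illustration.

```python
# Rough estimate of parameter-related memory per device (illustrative only).
# Assumptions: float32 everywhere, Adam-style optimizer, even split, no activations.
BYTES_PER_FLOAT32 = 4
GIB = 1024 ** 3


def training_bytes_per_device(num_params, num_devices, copies_per_param=4):
    """Bytes of parameter-related state held by each device.

    copies_per_param=4 assumes one copy each for the weights, the gradients,
    and two optimizer moment buffers.
    """
    total = num_params * BYTES_PER_FLOAT32 * copies_per_param
    return total / num_devices


single = training_bytes_per_device(1_000_000_000, 1) / GIB
split = training_bytes_per_device(1_000_000_000, 4) / GIB

print(f"1 device:  {single:.1f} GiB")       # 14.9 GiB
print(f"4 devices: {split:.1f} GiB each")   # 3.7 GiB each
```

Under these assumptions, a 1-billion-parameter model needs about 15 GiB for weights and optimizer state alone, which is more than many single GPUs offer once activations are added, while a four-way split brings that to under 4 GiB per device.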

Mesh-TensorFlow: Model Parallelism for Supercomputers (TF Dev Summit ‘19)
