The model architecture follows minGPT as closely as possible, and minGPT itself is used to generate the JS tests (gradients, predictions). The main advantage of implementing the model in TensorFlow.js is the ability to train or fine-tune it, for example, in a browser using WebGPU or in Node.js.

Examples in the `projects` folder include:
- sorting (basic example)
- loading GPT-2 weights
- training on large texts using streams

Feedback is really welcome! There is also an open PR for porting the model to TypeScript, which has some unresolved issues.
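
For anyone unfamiliar with the sorting task from the examples list, here is a rough sketch of how one training pair could be built, assuming the example mirrors minGPT's sort demo. `makeSortExample` and its parameters are hypothetical names for illustration, not part of the project, and the `-1` masking convention is borrowed from minGPT, so the actual code in `projects` may differ.

```javascript
// Hypothetical helper (not from the project): builds one training pair
// for the sorting task in the style of minGPT's sort demo.
function makeSortExample(length = 6, numDigits = 3, rand = Math.random) {
  // Random input sequence of digits in [0, numDigits).
  const input = Array.from({ length }, () => Math.floor(rand() * numDigits));
  // The sorted copy is what the model should learn to produce.
  const sorted = [...input].sort((a, b) => a - b);
  const full = input.concat(sorted);
  // Next-token prediction: x drops the last token, y is x shifted by one.
  const x = full.slice(0, -1);
  const y = full.slice(1);
  // Assumption: targets over the input positions are masked with -1,
  // so only the sorted half contributes to the loss.
  for (let i = 0; i < length - 1; i++) y[i] = -1;
  return { x, y };
}

// Deterministic usage with a fixed "random" source for illustration.
const fixed = [0.9, 0.1, 0.5];
let k = 0;
const sample = makeSortExample(3, 3, () => fixed[k++]);
console.log(sample.x); // [2, 0, 1, 0, 1]
console.log(sample.y); // [-1, -1, 0, 1, 2]
```

The model only ever sees next-token prediction. The sorting behaviour comes entirely from how the pairs are laid out.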