10 points by repeat_or over 3 years ago

2 comments

repeat_or over 3 years ago

We published a notebook and a GitHub repo that help you train synthetic models on high-dimensional datasets (e.g. thousands of columns and millions of records). It works by using Gretel's open source header clustering to group correlated columns, then parallelizing training of those groups across multiple GPUs.
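
To make the idea of grouping correlated columns concrete, here is a minimal sketch. It is not Gretel's header clustering code: the `cluster_columns` function, its `threshold` and `max_size` parameters, and the toy data are all assumptions made for illustration, and a simple greedy grouping on absolute Pearson correlation stands in for whatever the real implementation does.

```python
import numpy as np

def cluster_columns(data, threshold=0.5, max_size=20):
    """Greedily group column indices by absolute Pearson correlation.

    Hypothetical helper for illustration only, not Gretel's API.
    A column joins a cluster when its correlation with the cluster's
    seed column is at least `threshold`; clusters are capped at
    `max_size` columns so each one stays small enough to train on.
    """
    corr = np.abs(np.corrcoef(data, rowvar=False))
    unassigned = list(range(data.shape[1]))
    clusters = []
    while unassigned:
        seed = unassigned.pop(0)
        cluster = [seed]
        # Iterate over a copy because we remove items as we assign them.
        for col in list(unassigned):
            if len(cluster) >= max_size:
                break
            if corr[seed, col] >= threshold:
                cluster.append(col)
                unassigned.remove(col)
        clusters.append(cluster)
    return clusters

# Toy data: columns 0 and 1 share one signal, columns 2 and 3 share another.
rng = np.random.default_rng(0)
base = rng.normal(size=(500, 2))
data = np.column_stack([
    base[:, 0],
    base[:, 0] + 0.1 * rng.normal(size=500),
    base[:, 1],
    base[:, 1] + 0.1 * rng.normal(size=500),
])

clusters = cluster_columns(data)
print(clusters)
```

Each resulting group of columns could then be handed to its own training job, for example one per GPU, and the synthetic outputs joined back together afterwards.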

ag408 over 3 years ago

Thanks for posting this :)

Show HN: Training synthetic models on highly complex datasets
