I didn't see any notice that this is CUDA-only (like FlashAttention). I tried running it on my M3 Mac with Python 3.11.8, following the quickstart (the only deviation being that I ran it in a new venv), and got the following error:<p>/attention-gym/.venv/lib/python3.11/site-packages/torch/_subclasses/functional_tensor.py:258: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/utils/tensor_numpy.cpp:84.)
cpu = _conversion_method_template(device=torch.device("cpu"))
Traceback (most recent call last):
  File "/attention-gym/attn_gym/masks/document_mask.py", line 7, in <module>
    from torch.nn.attention.flex_attention import _mask_mod_signature
ModuleNotFoundError: No module named 'torch.nn.attention.flex_attention'
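<p>For anyone else hitting this, here is a rough sketch of how I'd check what the venv actually contains before blaming the platform. The helper name `has_module` is my own (hypothetical), and I'm only assuming that the installed torch build does not ship `torch.nn.attention.flex_attention`; the traceback alone doesn't tell me whether that is a version issue or a Mac issue.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if `name` can be located in the current environment.

    find_spec imports the parent packages of a dotted name, but not the
    target module itself.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised (as ModuleNotFoundError) when a parent package is missing,
        # e.g. torch is not installed or torch.nn.attention does not exist.
        return False


if __name__ == "__main__":
    # Module names taken from the warning and traceback above.
    for mod in ("numpy", "torch", "torch.nn.attention.flex_attention"):
        print(f"{mod}: {'found' if has_module(mod) else 'missing'}")
```

If `torch` shows up as found but the `flex_attention` line says missing, the problem is the torch build in the venv rather than anything in attention-gym itself. The NumPy warning looks like a separate, smaller issue: numpy just isn't installed in the fresh venv.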