Highlights
Torch.Compile support for Torch Function Modes
NVIDIA Blackwell Architecture Support
Mega Cache
PyTorch Native Context Parallel
Enhancing Intel GPU Acceleration
FlexAttention LLM first token processing on X86 CPUs
FlexAttention LLM throughput mode optimization on X86 CPUs
Foreach Map
FlexAttention for Inference
Prologue Fusion Support in Inductor