科技回声

11 comments

crest, almost 2 years ago

Hasn't AMD proven multiple times that a double pumped packed-SIMD implementation works well enough? Just the permute operations need a full width data path to get reasonable latencies. Intel already overplayed their hand with AVX-512 when they still had a stronger position. Let's hope they fail to hold back the field with their misguided attempts to increase their margin no matter the cost (even to their own bottom line).
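To make the double-pumping point concrete, here is a minimal C sketch. It is a conceptual model only, not a description of any real AMD or Intel pipeline: the names `add_256`, `add_512_double_pumped`, and `permute_512` are hypothetical, and plain arrays of 16 floats stand in for a 512-bit register.

```c
#include <stddef.h>

#define LANES 16            /* 16 x 32-bit floats = 512 bits */
#define HALF  (LANES / 2)   /* 8 x 32-bit floats = 256 bits */

/* One pass through a hypothetical 256-bit-wide adder. */
static void add_256(const float *a, const float *b, float *out)
{
    for (size_t i = 0; i < HALF; i++)
        out[i] = a[i] + b[i];
}

/* A 512-bit add issued as two 256-bit passes ("double pumped").
 * Lane i of the result depends only on lane i of the inputs, so the
 * two halves never need to see each other's data. */
void add_512_double_pumped(const float a[LANES], const float b[LANES],
                           float out[LANES])
{
    add_256(a, b, out);                       /* lower 256 bits */
    add_256(a + HALF, b + HALF, out + HALF);  /* upper 256 bits */
}

/* A full-width permute: any output lane may read any input lane,
 * including one in the other half, so it cannot be split into two
 * independent half-width passes the way the add can.
 * out must not alias src. */
void permute_512(const float src[LANES], const unsigned char idx[LANES],
                 float out[LANES])
{
    for (size_t i = 0; i < LANES; i++)
        out[i] = src[idx[i] % LANES];
}
```

Lane-wise operations split cleanly in this model, while the permute is the case where each output lane needs access to the whole source register.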

canucker2016, almost 2 years ago

From https://cdrdv2.intel.com/v1/dl/getContent/784343 ("The Converged Vector ISA: Intel Advanced Vector Extensions 10" Technical Paper PDF):

"Intel AVX10 Version 1 will be introduced for early software enablement and supports the subset of all the Intel AVX-512 instruction set available as of future Intel Xeon processors with P-cores, codenamed Granite Rapids, that is forward compatible to Intel AVX10. This version will not include the new 256-bit vector instructions supporting embedded rounding or any of the new instructions and will serve as the transition base version from Intel AVX-512 to Intel AVX10.

Intel AVX10 Version 2 will include the 256-bit instruction forms supporting embedded rounding as well as a suite of new Intel AVX10 instructions covering new AI data types and conversions, data movement optimizations, and standards support. All new instructions will be supported at 128-, 256-, and 512-bit vector lengths with limited variances. All Intel AVX10 versions will implement the new versioning enumeration scheme."

And who knows when AMD will have time to update Zen ? architecture with these new instructions.

Am4TIfIsER0ppos, almost 2 years ago

Slow down dammit! I've barely started writing avx512 since they became worth it on ice lake.

> being able to work for both P and E cores

Oh yes I forgot they were gimping their own processors.

> the converged version has a maximum vector length of 256-bits [on] the E cores while P cores will have optional 512-bit vector use

Maybe they shouldn't have made xmm and ymm "extensions" to the base set to begin with.

colejohnson66, almost 2 years ago

That’s a massive extension. 32 GPRs! And they’re finally reusing an encoding made reserved in long mode (D5 - AAD in legacy modes).

Guess Intel’s feeling the pressure from Zen 4 supporting AVX-512.

codedokode, almost 2 years ago

From the name I thought that they extended registers to 1024 bits, but it looks like instead they made 512-bit width support optional.

aperture147, almost 2 years ago

As an average developer who works on high-level interfaces, I don't really see the benefit of AVX-512. I've heard that some math calculating software MAY gain some benefit from AVX instructions (like BLAS), but I've never used it personally. Can you guys please explain?

jauntywundrkind, almost 2 years ago

This seemed really cool. I'm used to a lot of new instructions & boosts, but Intel adding new conditional load/store is a smart, interesting coupling that could help increase execution unit efficiency in a significant way.

> As out-of-order CPUs continue to become deeper and wider, the cost of mispredictions increasingly dominates performance of such workloads. Branch predictor improvements can mitigate this to a limited extent only as data-dependent branches are fundamentally hard to predict.

> To address this growing performance issue, we significantly expand the conditional instruction set of x86, which was first introduced with the Intel® Pentium® Pro in the form of CMOV/SET instructions. These instructions are used quite extensively by today’s compilers, but they are too limited for broader use of if-conversion (a compiler optimization that replaces branches with conditional instructions).

> Intel® APX adds conditional forms of load, store, and compare/test instructions, and it also adds an option for the compiler to suppress the status flags writes of common instructions.

https://www.intel.com/content/www/us/en/developer/articles/technical/advanced-performance-extensions-apx.html

I didn't understand everything about the "caller-saved volatile" new general purpose register interface & legacy compatibility. But there are some potentially really interesting optimizations, with load/store being dual register capable, and being capable of staying on the AVX unit & not having to go further out to "memory" (caches?):

> Generally, more register state will need to be managed at function boundaries. In order to reduce the associated overhead, we are adding PUSH2/POP2 instructions that transfer two register values within a single memory operation. The processor tracks these new instructions internally and fast-forwards register data between matching PUSH2 and POP2 instructions without going through memory.

Neat stuff. Very superficially reminds me of Semantic Streaming Registers on the very novel standalone-ish FPU on PULP's RISC-V based Occamy many-core chip. In that the unit is acting in a more standalone fashion.

https://www.youtube.com/watch?v=kMhdq7A3d3I#t=10m
https://pulp-platform.org/docs/BeniniSC11-22.pdf
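The if-conversion mentioned in the quoted Intel text can be sketched in plain C. This is only an illustration of the idea, not APX code: the function names are made up, and a real compiler decides on its own whether to emit a branch, a CMOV, or (on APX hardware) one of the new conditional forms.

```c
#include <stddef.h>
#include <stdint.h>

/* Branchy form: one data-dependent branch per element. With random
 * input the predictor has little to go on. */
int64_t sum_below_branchy(const int32_t *v, size_t n, int32_t limit)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] < limit)
            sum += v[i];
    }
    return sum;
}

/* If-converted form: the condition becomes data (a mask) instead of
 * control flow, so there is nothing to mispredict in the loop body. */
int64_t sum_below_ifconverted(const int32_t *v, size_t n, int32_t limit)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        /* keep is all ones when the comparison is true, 0 otherwise */
        int64_t keep = -(int64_t)(v[i] < limit);
        sum += (int64_t)v[i] & keep;
    }
    return sum;
}
```

Both functions return the same value for any input. The transformation is easy here because the guarded work is pure arithmetic; when the guarded operation is a load or store that must not happen on the untaken path, the rewrite is harder to do safely, which appears to be the gap the conditional load/store forms are aimed at.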

dathinab, almost 2 years ago

Is there someone here who understands the nitty-gritty details of AVX-512/AVX10 and could tell me what is included which current latest gen AMD processors do not support?

Because the only thing I can pick out is the 256bit AVX-512 which AFAIK recent AMD processors do support (including 512bit support) both on their normal cores and their new compact cores.

But I don't know much about AVX_ so I'm 100% sure I missed a bunch of stuff and/or limitations with current AMD cores (besides it being double pumped).

RcouF1uZ4gsC, almost 2 years ago

It seems Intel is borrowing Microsoft’s XBox naming scheme. At least they didn’t name the successor to AVX-512 AVX-One.

shmerl, almost 2 years ago

Why did it take around 10 years for AMD to implement AVX-512 and will they need to wait as long for this too? Doesn't seem to be patent related (patents are 20 years and AVX-512 was introduced in 2013?).

phkahler, almost 2 years ago

Can we get great RISC-V cores from Apple or AMD please with that vector ISA so we can shut down this whole notion of ISA as a product differentiator?


Intel AVX10: The Successor to AVX-512
