TechEcho
Introduction to AVX2 optimizations in x264

75 points by DarkShikari about 12 years ago

4 comments

jasin about 12 years ago
A couple of comments/elaborations on the "core differences" mentioned in the article:

The first difference mentioned is that the first SSE2 implementations often used 64-bit ALUs internally, yielding roughly the same performance as doing two equivalent MMX ops manually, whereas this isn't the case with AVX2. However, it may be worth noting that it largely _is_ the case with the current AVX ("AVX1", i.e. pre-Haswell) implementations.

The second cited difference is that there's a 128-bit "boundary" in many of the operations. This is effectively what can dash the hopes of getting 2x gains over SSE2 just by naïvely migrating to AVX2. For instance, you cannot do shuffles to/from arbitrary components anymore, but have to consider the 128-bit lane boundaries instead.

The third issue, i.e. the data layouts of internal formats and the assumptions of various algorithms, is probably the most significant factor in determining how large a benefit you are going to get. Typically the internal data layouts (i.e. is my pixel block size 2x2, 4x4, 16x8 or something else?) are tied to the ISA. Thus, when migrating from one instruction set to another, these may need to be reconsidered if speed is paramount. Interestingly enough, this means that when the ISA changes, you most likely want to do some higher-level algorithmic optimizations as well.
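To make the 128-bit lane boundary concrete, here is a minimal scalar sketch in C. It is not x264 code and uses no intrinsics: it models the byte interleave `punpcklbw` (SSE2) and its 256-bit AVX2 form `vpunpcklbw`, which applies the 128-bit operation to each lane independently. The function names are hypothetical, chosen only for this illustration, and the "naive" variant is what a straight port might wrongly expect the 256-bit instruction to do.

```c
#include <stdint.h>

/* Scalar model of SSE2 punpcklbw on one 128-bit register:
 * interleave the low 8 bytes of a and b. */
static void unpacklo8_128(const uint8_t a[16], const uint8_t b[16],
                          uint8_t out[16])
{
    for (int i = 0; i < 8; i++) {
        out[2 * i]     = a[i];
        out[2 * i + 1] = b[i];
    }
}

/* Scalar model of AVX2 vpunpcklbw on 256-bit registers:
 * the 128-bit operation runs separately on each 128-bit lane,
 * so bytes never cross from one lane into the other. */
static void unpacklo8_256(const uint8_t a[32], const uint8_t b[32],
                          uint8_t out[32])
{
    unpacklo8_128(a,      b,      out);       /* low lane:  bytes 0..15  */
    unpacklo8_128(a + 16, b + 16, out + 16);  /* high lane: bytes 16..31 */
}

/* What a naive port might expect (NOT what the hardware does):
 * interleave the low 16 bytes of a and b across the full 256 bits. */
static void unpacklo8_256_naive(const uint8_t a[32], const uint8_t b[32],
                                uint8_t out[32])
{
    for (int i = 0; i < 16; i++) {
        out[2 * i]     = a[i];
        out[2 * i + 1] = b[i];
    }
}
```

The two 256-bit variants agree on the low 16 output bytes and differ on the high 16: the lane-wise version takes its upper half from bytes 16..23 of the inputs, while the naive one takes it from bytes 8..15. Real code typically has to either add a cross-lane permute (for example `vpermq` or `vperm2i128`) or rearrange its data layout so that the in-lane result is the one it wants.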
lmm about 12 years ago
Anyone have a non-scribd copy?
Osiris about 12 years ago
Are there binary builds available with AVX2 support compiled in for testing? I'm curious whether the FMA (FMA3/FMA4) support available in AMD processors would increase performance. A quick Google search shows that there are some patches available for FMA support.
zobzu about 12 years ago
Nice gains. Thanks for the writeup and explanations!