TechEcho
Computing sin and cos in hardware with synthesisable Verilog

173 points by Cieplak over 6 years ago

9 comments

GeertB over 6 years ago
While CORDIC is great for fixed point, it has limitations for floating point. The original 8087 fsin and fcos instructions used CORDIC, but later versions of the architecture switched to polynomial approximations, see https://software.intel.com/sites/default/files/managed/f8/9c/x87TrigonometricInstructionsVsMathFunctions.pdf. Today it's possible to develop implementations of these elementary functions on x86 CPUs that are more precise and more performant using regular multiply/addition/fused multiply-add than even the current improved post-CORDIC fsin and fcos instructions.

The main issue is that an instruction executing a fixed-function block with a given (high) latency and little if any pipelining tends to be far worse than many more fully pipelined multiply/add instructions. The other issue is that argument reduction and approximation over the reduced domain are not independent. For some parts of the domain, such as computing the sine of a number very close to a multiple of pi, you may need to spend more cycles reducing the argument accurately to counter cancellation effects. However, as the reduced argument is then very close to zero, a simple polynomial suffices.

So, for most modern systems, I'd put the effort into efficient pipelined fused multiply-add and use that for all elementary functions. Fixed-function hardware for elementary functions has generally proved suboptimal.
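
A minimal sketch of the approach described in this comment, written in Python for readability rather than as hardware. It reduces the argument to |r| <= pi/4 and then evaluates a short polynomial in Horner form, where each step is one multiply-add (the shape a fused multiply-add unit executes). The names `sin_poly`, `cos_poly` and `sin_reduced` are hypothetical, and plain Taylor coefficients are used as an assumption for illustration. A real math library would use minimax coefficients and a far more careful reduction near multiples of pi.

```python
import math

def sin_poly(r):
    """sin(r) for |r| <= pi/4 via a degree-9 Taylor polynomial (illustrative)."""
    r2 = r * r
    p = 1.0 / 362880.0           # 1/9!
    p = p * r2 - 1.0 / 5040.0    # each line is one multiply-add
    p = p * r2 + 1.0 / 120.0
    p = p * r2 - 1.0 / 6.0
    p = p * r2 + 1.0
    return r * p

def cos_poly(r):
    """cos(r) for |r| <= pi/4 via a degree-8 Taylor polynomial (illustrative)."""
    r2 = r * r
    p = 1.0 / 40320.0            # 1/8!
    p = p * r2 - 1.0 / 720.0
    p = p * r2 + 1.0 / 24.0
    p = p * r2 - 0.5
    p = p * r2 + 1.0
    return p

def sin_reduced(x):
    """Naive reduction x = k*(pi/2) + r, then pick the polynomial by quadrant.

    The single-precision-of-pi subtraction below is the step that loses
    accuracy for large x or x near a multiple of pi; it is kept simple here.
    """
    k = round(x / (math.pi / 2))
    r = x - k * (math.pi / 2)
    q = k % 4
    if q == 0:
        return sin_poly(r)
    if q == 1:
        return cos_poly(r)
    if q == 2:
        return -sin_poly(r)
    return -cos_poly(r)
```

For moderate inputs this agrees with `math.sin` to roughly seven decimal places, limited by the truncated Taylor series rather than by the reduction.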
Y_Y over 6 years ago
What a well explained and simple article. Though it would be nice to know why each optimisation is made wrt the eventual hardware.
hatsunearu over 6 years ago
Oh wow, I literally just finished my quadrature sinusoid DDS generator using MyHDL last night. I didn't use CORDIC but rather a LUT. I found out I can optimize generating quadrature sinusoids by having two separate LUTs, where one stores the values from 0 to pi/2 and the other from pi/2 to pi. This has an advantage because when the sine output takes inputs from the first LUT, the cos output takes inputs from the second LUT and vice versa, thus saving duplicates.

I'm still cleaning up the testbench code and I plan to put out a blog post here if y'all are interested: hatsunearu.github.io
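
The two-LUT trick described in this comment can be sketched as follows. This is a Python model under stated assumptions, not the commenter's MyHDL code: the table size `N`, the names `LUT_A`, `LUT_B` and `quadrature`, and the floating-point table entries are all hypothetical. It relies on cos(t) = sin(t + pi/2), so in every quadrant the sine and cosine outputs read from different tables.

```python
import math

N = 256  # entries per quarter period (assumed size)

# LUT_A holds sin on [0, pi/2); LUT_B holds sin on [pi/2, pi).
LUT_A = [math.sin((math.pi / 2) * i / N) for i in range(N)]
LUT_B = [math.sin(math.pi / 2 + (math.pi / 2) * i / N) for i in range(N)]

def quadrature(phase):
    """Return (sin, cos) for phase index 0 <= phase < 4*N.

    Each call performs exactly one read from LUT_A and one from LUT_B.
    """
    quadrant, idx = divmod(phase % (4 * N), N)
    a = LUT_A[idx]
    b = LUT_B[idx]
    if quadrant == 0:
        return a, b
    if quadrant == 1:
        return b, -a
    if quadrant == 2:
        return -a, -b
    return -b, a
```

In hardware the two tables would map to two single-port ROMs, with the quadrant bits selecting only the sign and which ROM feeds which output.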
kkaranth over 6 years ago
Nice read! I have 2 questions:

When calculating K, the author says “It can be shown through the use of trigonometric identities that:” and proceeds to show a formula. How exactly does this happen?

After calculating K, the author assigns it to c in the cordic function, but not to s. Why?
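
One common way to see both points, sketched in Python and not necessarily how the article's code is structured: a rotation by angle a can be written as c' = cos(a) * (c - s*tan(a)), s' = cos(a) * (s + c*tan(a)). Choosing tan(a_i) = 2^-i turns the bracketed part into shifts and adds, but drops the cos(a_i) factor at every step. Since cos(arctan(t)) = 1/sqrt(1 + t^2), the dropped factors multiply to K = prod 1/sqrt(1 + 2^(-2i)), roughly 0.60725. Starting from the vector (K, 0) instead of (1, 0) applies that correction up front, which would explain why only c is set to K: the start vector lies on the x axis, so s begins at 0. The function names below are hypothetical, and floating point is used where hardware would use fixed point and a precomputed arctan table.

```python
import math

def cordic_gain(iterations=32):
    """Product of cos(arctan(2^-i)) = 1/sqrt(1 + 2^(-2i)) over all steps."""
    k = 1.0
    for i in range(iterations):
        k /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return k

def cordic_sin_cos(theta, iterations=32):
    """Rotation-mode CORDIC; valid for roughly |theta| <= 1.74 rad."""
    c, s = cordic_gain(iterations), 0.0   # pre-scaled unit vector on the x axis
    for i in range(iterations):
        d = 1.0 if theta >= 0 else -1.0   # rotate toward the remaining angle
        shift = 2.0 ** -i                 # a right shift by i bits in hardware
        c, s = c - d * s * shift, s + d * c * shift
        theta -= d * math.atan(shift)     # table lookup in hardware
    return s, c
```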
toolslive over 6 years ago
Isn't it simpler (and more efficient) to build an interpolating polynomial approximation for sin(x) over the range [0, pi/8), using Chebyshev instead of Lagrange interpolation, for example?
JoeAltmaier over 6 years ago
Is this just an example? Because if I wanted to rotate a vector, I'd do it with vectors, not trig. Which requires multiplication and addition, right? What am I missing?
man-and-laptop over 6 years ago
How is CORDIC different from e^x ~= (1+x/N)^N where N is a power of 2?
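
For comparison, here is a small sketch of the limit formula in this question applied to a complex argument, so that it yields sine and cosine: (1 + i*x/N)^N with N = 2^k, computed by k repeated squarings. The name `euler_limit_sin_cos` and the default `k` are assumptions for illustration. Two differences from CORDIC are visible in the code: every squaring is a full complex multiplication rather than a shift and add, and the result carries a magnitude error of about x^2/(2N), whereas CORDIC's gain is a fixed constant that can be folded into the start vector.

```python
import math

def euler_limit_sin_cos(x, k=20):
    """Approximate (sin x, cos x) as the parts of (1 + i*x/2^k)^(2^k)."""
    z = complex(1.0, x / 2 ** k)
    for _ in range(k):
        z = z * z          # one complex multiply per step, N doubles each time
    return z.imag, z.real
```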
gravypod over 6 years ago
How would one build an asynchronous implementation of this circuit in an HDL?
andrewflnr over 6 years ago
This is incredibly slick. Is it used in real hardware?