Here's the bottom line for anyone who doesn't want to read the whole article.

> Using a commercially available 28-nanometer ASIC process technology, we have profiled (8, 1, 5, 5, 7) log ELMA as 0.96x the power of int8/32 multiply-add for a standalone processing element (PE).

> Extended to 16 bits, this method uses 0.59x the power and 0.68x the area of IEEE 754 half-precision FMA.

In other words, interesting but not earth-shattering. Great to see people working in this area, though!
Not sure why this isn't getting more votes, but it's a good avenue of research and the authors should be commended. That said, this approach to optimizing floating-point implementations has a long history at Imagination Technologies, ARM, and similar low-power inference chipset providers. I especially like the Synopsys ASIP Designer [0] tool, which leverages the open-source (although not yet IEEE-ratified) LISA 2.0 Architecture Design Language [1] to iterate on these design issues.

Interesting times...

[0] https://www.synopsys.com/dw/ipdir.php?ds=asip-designer
[1] https://en.wikipedia.org/wiki/LISA_(Language_for_Instruction_Set_Architecture)
A bit off-topic, but I remember some studies about 'under-powered' ASICs, i.e., running with 'lower-than-required' voltage and just letting the chip fail sometimes. I recall the outcome being that you can run with 0.1x the power and get 0.9x the correctness. Usually chips are designed so that they never fail, and that requires using substantially more energy than is needed in the average case. If the application is probabilistic or noisy in general, additional 'computation noise' could be allowed for better energy efficiency.
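As a rough software analogy (not from any of those studies; the names and error rate below are made up), you could model a voltage-overscaled multiply-accumulate as an exact computation plus occasional bit flips:

```python
import random

def noisy_mac(acc: int, a: int, b: int, flip_prob: float = 1e-3) -> int:
    """Exact integer multiply-accumulate, plus an occasional random bit flip
    in the result -- a crude stand-in for timing errors when the chip runs
    below its nominal supply voltage. flip_prob is an illustrative guess."""
    result = acc + a * b
    if random.random() < flip_prob:
        result ^= 1 << random.randrange(16)  # corrupt one low-order bit
    return result

# A noise-tolerant workload (e.g. inference) may barely notice rare flips,
# while the power saving comes from the lower supply voltage.
```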
Wow! It's kind of a weird feeling to see some research I worked on get some traction in the real world! The ELMA lookup problem for 32-bit could be fixed by using the posit standard, which just has "simple" adders for the section past the Golomb-encoded section, though you may have to worry about spending transistors on the barrel shifter. Rough sketch of the regime decode I mean below (Python, bit strings for clarity; obviously not the hardware form).
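```python
def decode_regime(bits: str) -> tuple[int, int]:
    """Decode a posit regime: a run of identical bits terminated by the
    opposite bit (or the end of the word). A run of m ones means k = m - 1,
    a run of m zeros means k = -m. Returns (k, bits consumed).
    `bits` is everything after the sign bit."""
    first = bits[0]
    run = 1
    while run < len(bits) and bits[run] == first:
        run += 1
    consumed = min(run + 1, len(bits))  # include the terminating bit, if present
    k = run - 1 if first == "1" else -run
    return k, consumed

print(decode_regime("1101001"))  # (1, 3): regime '110'
print(decode_regime("0001101"))  # (-3, 4): regime '0001'
```

The variable-length regime is exactly where the barrel shifter comes in: the exponent and fraction fields land at a data-dependent offset, so you have to shift them into place.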
For those interested in the general area: I saw a good talk at CSAIL last week by Jiahao Chen about representing and manipulating floating-point numbers in Julia. The code, with some good documentation, is on his GitHub.

https://github.com/jiahao/ArbRadixFloatingPoints.jl
Caveat: I haven't finished reading the entire FB announcement yet.

Google announced something along these lines at their AI conference last September and released the video today on YouTube. Here's the link to the segment where their approach is discussed:
https://www.youtube.com/watch?v=ot4RWfGTtOg&t=330s
> Significands are fixed point, and fixed point adders, multipliers, and dividers on these are needed for arithmetic operations... Hardware multipliers and dividers are usually much more resource-intensive

It's been a number of years since I've implemented low-level arithmetic, but when you use fixed point, don't you usually choose a power-of-2 scale? I don't see why you'd need multiplication/division instead of bit shifters.
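Sketch of the kind of fixed point I mean (Python, with Q8 just as an example): the power-of-two scale factor only ever costs shifts, though I suppose the raw integer product itself is where the multiplier comes in.

```python
FRAC_BITS = 8  # Q8 fixed point: real value = raw / 2**FRAC_BITS

def fixed_add(a: int, b: int) -> int:
    """Both operands share the same scale, so addition is a plain integer add."""
    return a + b

def fixed_mul(a: int, b: int) -> int:
    """The scales multiply too (2**-8 * 2**-8 = 2**-16), so a single right
    shift rescales the result -- but the integer product a * b itself still
    needs a full multiplier."""
    return (a * b) >> FRAC_BITS

# 1.5 * 2.25 = 3.375
print(fixed_mul(int(1.5 * 256), int(2.25 * 256)) / 256)  # 3.375
```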