TechEcho

3 comments

svetly0 over 8 years ago

MIPS32 support - kudos to Vladimir Stefanovic and Imagination Technologies for making this happen. Many people from the embedded world will also greatly appreciate support for soft-float MIPS32 hardware.

0xmohit over 8 years ago

There are scores of other optimizations [0] as well:

    Optimizations:
    bytes, strings: optimize for ASCII sets (CL 31593)
    bytes, strings: optimize multi-byte index operations on s390x (CL 32447)
    bytes,strings: use IndexByte more often in Index on AMD64 (CL 31690)
    bytes: Use the same algorithm as strings for Index (CL 22550)
    bytes: improve WriteRune performance (CL 28816)
    bytes: improve performance for bytes.Compare on ppc64x (CL 30949)
    bytes: make IndexRune faster (CL 28537)
    cmd/asm, go/build: invoke cmd/asm only once per package (CL 27636)
    cmd/compile, cmd/link: more efficient typelink generation (CL 31772)
    cmd/compile, cmd/link: stop generating unused go.string.hdr symbols. (CL 31030)
    cmd/compile,runtime: redo how map assignments work (CL 30815)
    cmd/compile/internal/obj/x86: eliminate some function prologues (CL 24814)
    cmd/compile/internal/ssa: generate bswap on AMD64 (CL 32222)
    cmd/compile: accept literals in samesafeexpr (CL 26666)
    cmd/compile: add more non-returning runtime calls (CL 28965)
    cmd/compile: add size hint to map literal allocations (CL 23558)
    cmd/compile: be more aggressive in tighten pass for booleans (CL 28390)
    cmd/compile: directly construct Fields instead of ODCLFIELD nodes (CL 31670)
    cmd/compile: don't reserve X15 for float sub/div any more (CL 28272)
    cmd/compile: don’t generate pointless gotos during inlining (CL 27461)
    cmd/compile: fold negation into comparison operators (CL 28232)
    cmd/compile: generate makeslice calls with int arguments (CL 27851)
    cmd/compile: handle e == T comparison more efficiently (CL 26660)
    cmd/compile: improve s390x SSA rules for logical ops (CL 31754)
    cmd/compile: improve s390x rules for folding ADDconst into loads/stores (CL 30616)
    cmd/compile: improve string iteration performance (CL 27853)
    cmd/compile: improve tighten pass (CL 28712)
    cmd/compile: inline _, ok = i.(T) (CL 26658)
    cmd/compile: inline atomics from runtime/internal/atomic on amd64 (CL 27641, CL 27813)
    cmd/compile: inline convT2{I,E} when result doesn't escape (CL 29373)
    cmd/compile: inline x, ok := y.(T) where T is a scalar (CL 26659)
    cmd/compile: intrinsify atomic operations on s390x (CL 31614)
    cmd/compile: intrinsify math/big.mulWW, divWW on AMD64 (CL 30542)
    cmd/compile: intrinsify runtime/internal/atomic.Xaddint64 (CL 29274)
    cmd/compile: intrinsify slicebytetostringtmp when not instrumenting (CL 29017)
    cmd/compile: intrinsify sync/atomic for amd64 (CL 28076)
    cmd/compile: make [0]T and [1]T SSAable types (CL 32416)
    cmd/compile: make link register allocatable in non-leaf functions (CL 30597)
    cmd/compile: missing float indexed loads/stores on amd64 (CL 28273)
    cmd/compile: move stringtoslicebytetmp to the backend (CL 32158)
    cmd/compile: only generate ·f symbols when necessary (CL 31031)
    cmd/compile: optimize bool to int conversion (CL 22711)
    cmd/compile: optimize integer "in range" expressions (CL 27652)
    cmd/compile: remove Zero and NilCheck for newobject (CL 27930)
    cmd/compile: remove duplicate nilchecks (CL 29952)
    cmd/compile: remove some write barriers for stack writes (CL 30290)
    cmd/compile: simplify div/mod on ARM (CL 29390)
    cmd/compile: statically initialize some interface values (CL 26668)
    cmd/compile: unroll comparisons to short constant strings (CL 26758)
    cmd/compile: use 2-result divide op (CL 25004)
    cmd/compile: use masks instead of branches for slicing (CL 32022)
    cmd/compile: when inlining ==, don’t take the address of the values (CL 22277)
    container/heap: remove one unnecessary comparison in Fix (CL 24273)
    crypto/elliptic: add s390x assembly implementation of NIST P-256 Curve (CL 31231)
    crypto/sha256: improve performance for sha256.block on ppc64le (CL 32318)
    crypto/sha512: improve performance for sha512.block on ppc64le (CL 32320)
    crypto/{aes,cipher}: add optimized implementation of AES-GCM for s390x (CL 30361)
    encoding/asn1: reduce allocations in Marshal (CL 27030)
    encoding/csv: avoid allocations when reading records (CL 24723)
    encoding/hex: change lookup table from string to array (CL 27254)
    encoding/json: Use a lookup table for safe characters (CL 24466)
    hash/crc32: improve the AMD64 implementation using SSE4.2 (CL 24471)
    hash/crc32: improve the AMD64 implementation using SSE4.2 (CL 27931)
    hash/crc32: improve the processing of the last bytes in the SSE4.2 code for AMD64 (CL 24470)
    image/color: improve speed of RGBA methods (CL 31773)
    image/draw: optimize drawFillOver as drawFillSrc for opaque fills (CL 28790)
    math/big: 10%-20% faster float->decimal conversion (CL 31250, CL 31275)
    math/big: avoid allocation in float.{Add, Sub} when there's no aliasing (CL 23568)
    math/big: make division faster (CL 30613)
    math/big: use array instead of slice for deBruijn lookups (CL 26663)
    math/big: uses SIMD for some math big functions on s390x (CL 32211)
    math: speed up Gamma(+Inf) (CL 31370)
    math: speed up bessel functions on AMD64 (CL 28086)
    math: use SIMD to accelerate some scalar math functions on s390x (CL 32352)
    reflect: avoid zeroing memory that will be overwritten (CL 28011)
    regexp: avoid alloc in QuoteMeta when not quoting (CL 31395)
    regexp: reduce mallocs in Regexp.Find* and Regexp.ReplaceAll* (CL 23030)
    runtime: cgo calls are about 100ns faster (CL 29656, CL 30080)
    runtime: defer is now 2X faster (CL 29656)
    runtime: implement getcallersp in Go (CL 29655)
    runtime: improve memmove for amd64 (CL 22515, CL 29590)
    runtime: increase malloc size classes (CL 24493)
    runtime: large objects no longer cause significant goroutine pauses (CL 23540)
    runtime: make append only clear uncopied memory (CL 30192)
    runtime: make assists perform root jobs (CL 32432)
    runtime: memclr perf improvements on ppc64x (CL 30373)
    runtime: minor string/rune optimizations (CL 27460)
    runtime: optimize defer code (CL 29656)
    runtime: remove a load and shift from scanobject (CL 22712)
    runtime: remove defer from standard cgo call (CL 30080)
    runtime: speed up StartTrace with lots of blocked goroutines (CL 25573)
    runtime: speed up non-ASCII rune decoding (CL 28490)
    strconv: make FormatFloat slowpath a little faster (CL 30099)
    strings: add special cases for Join of 2 and 3 strings (CL 25005)
    strings: make IndexRune faster (CL 28546)
    strings: use AVX2 for Index if available (CL 22551)
    strings: use Index in Count (CL 28586)
    syscall: avoid convT2I allocs for common Windows error values (CL 28484, CL 28990)
    text/template: improve lexer performance in finding left delimiters (CL 24863)
    unicode/utf8: optimize ValidRune (CL 32122)
    unicode/utf8: reduce bounds checks in EncodeRune (CL 28492)

[0] https://github.com/golang/go/blob/master/doc/go1.8.txt
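For readers wondering what an entry such as the integer "in range" optimization (CL 27652) might look like in practice, here is a minimal sketch of the general trick of replacing two signed comparisons with one unsigned comparison. The function names are hypothetical and this is not the compiler's actual rewrite rule, only an illustration of the idea under the assumption that lo <= hi.

```go
package main

import "fmt"

// inRangeTwoCompares is the straightforward form: two signed comparisons.
func inRangeTwoCompares(x, lo, hi int32) bool {
	return lo <= x && x < hi
}

// inRangeOneCompare folds both checks into a single unsigned comparison.
// It assumes lo <= hi. When x < lo, x-lo wraps around to a large unsigned
// value, so the comparison fails exactly as the two-compare form would.
func inRangeOneCompare(x, lo, hi int32) bool {
	return uint32(x-lo) < uint32(hi-lo)
}

func main() {
	const lo, hi = -5, 10
	agree := true
	// Compare both forms over a window that straddles the range.
	for x := int32(-20); x <= 20; x++ {
		if inRangeTwoCompares(x, lo, hi) != inRangeOneCompare(x, lo, hi) {
			agree = false
		}
	}
	fmt.Println("both forms agree:", agree)
}
```

The single-comparison form relies on Go's defined wraparound for integer arithmetic, which is why it is safe to write by hand as well as for a compiler to generate.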

grabcocque over 8 years ago

So, the language has now spent getting on for 18 months significantly slower than it used to be, and nobody seems to have a real issue with this?

Go 1.8 toolchain improvements
