runtime: consider caching previous map hash value #5147

Open · bradfitz opened this issue Mar 28, 2013 · 25 comments
@bradfitz (Contributor) commented Mar 28, 2013

Michael Jones notes that the following pattern is common in Go code:

  if v, ok := m[key]; ok { // or !ok
      ...
      m[key] = newValue
  }

If hashing that key type is expensive (say, a large string or struct) and the map isn't empty or very small, the above code hashes the key twice.

If we kept a per-P cache of the last key & last computed hash value for that P, we
could eliminate some hash calls.

Probably worth measuring. (Probably not for Go 1.1)
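
A minimal sketch of the idea, with hypothetical names (the real runtime would hold one of these per P and would also need to distinguish maps and key types):

    // hashCache remembers the last key hashed and its hash value.
    // Illustrative only; not the runtime's actual data structure.
    type hashCache struct {
        lastKey  string  // last key hashed on this P
        lastHash uintptr // its computed hash value
        valid    bool
    }

    // hash returns hashFn(key), reusing the cached value when the
    // same key is hashed twice in a row.
    func (c *hashCache) hash(key string, hashFn func(string) uintptr) uintptr {
        if c.valid && key == c.lastKey {
            return c.lastHash // cache hit: skip rehashing
        }
        c.lastKey, c.lastHash, c.valid = key, hashFn(key), true
        return c.lastHash
    }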
@bradfitz (Contributor, Author)

Comment 2:

Example from net/textproto.ReadMIMEHeader:
                m[key] = append(m[key], value)
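
Without any caching, that one line behaves like two independent map operations, each hashing key (a sketch of the effective lowering, not the actual generated code):

    tmp := m[key]               // map lookup: hashes key
    m[key] = append(tmp, value) // map assignment: hashes key again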

@bradfitz (Contributor, Author)

Comment 3:

I've hacked this up as an exercise: https://golang.org/cl/8160043
This isn't being proposed for Go 1.1.  I just wanted to see what it'd involve, and learn
some new code.

@MichaelTJones (Contributor)

Comment 4:

Pleased to see this. It's what I would expect. Poor performance is a "death
by a thousand cuts" issue, so every place where something is done twice is
a place where one of those cuts may be healed.

@bradfitz (Contributor, Author)

Comment 5:

This issue was updated by revision ecdcec1.

R=khr, r
CC=golang-dev
https://golang.org/cl/8165044

@bradfitz (Contributor, Author)

Comment 6:

It helps when it triggers, and hurts when it doesn't, as expected.
Now the challenge would be making it cost less when it doesn't trigger:
benchmark                             old ns/op    new ns/op    delta
BenchmarkNewEmptyMap                        144          142   -1.39%
BenchmarkHashStringSpeed                     42           46  +10.21%
BenchmarkHashInt32Speed                      34           31   -8.45%
BenchmarkHashInt64Speed                      31           32   +2.52%
BenchmarkHashStringArraySpeed               112          126  +12.50%
BenchmarkMegMap                          298981           28  -99.99%  << best case
BenchmarkMegOneMap                           24           24   +0.40%
BenchmarkMegEmptyMap                          4            4   +0.44%
BenchmarkSmallStrMap                         23           23   +0.00%
BenchmarkIntMap                              17           17   +0.56%
BenchmarkRepeatedLookupStrMapKey32           45           35  -20.22%  << new test
BenchmarkRepeatedLookupStrMapKey1M       306886       154843  -49.54%  << new test

@MichaelTJones (Contributor)

Comment 7:

Strings have lengths; you could compare those before comparing the strings. I bet in real life (words in text, etc.) the serial correlation would be low enough to bring the 10% down to 1-2% (unless you're already doing that. ;-)
Same for the string array: compare all the lengths first and then the string bodies.
I wonder what the real diversity is. Maybe you should make a version that does the compares, tabulates matches/non-matches, and prints that at exit. Run the benchmarks that way, plus some utilities (build the docs, etc.), and see what the norm is. That's a good way to understand what really happens in practice.
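
A sketch of the length-first comparison being suggested (as the next reply notes, the runtime already does this; Go's == on strings also short-circuits on length):

    // strEqual compares lengths before bytes, so most non-matching
    // pairs are rejected in O(1).
    func strEqual(a, b string) bool {
        if len(a) != len(b) {
            return false // typical miss: exits here
        }
        for i := 0; i < len(a); i++ {
            if a[i] != b[i] {
                return false
            }
        }
        return true
    }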

@bradfitz (Contributor, Author)

Comment 8:

We already do that.
That's not what this bug is about.

@bradfitz (Contributor, Author)

Comment 9:

Out of curiosity, I ran the go1 benchmark before and after this CL:
ante:go1 $ ~/go/misc/benchcmp before after 
benchmark                         old ns/op    new ns/op    delta
BenchmarkBinaryTree17            6370288817   6376970261   +0.10%
BenchmarkFannkuch11              3774653509   3782454073   +0.21%
BenchmarkFmtFprintfEmpty                 94           95   +1.49%
BenchmarkFmtFprintfString               290          288   -0.69%
BenchmarkFmtFprintfInt                  189          190   +0.53%
BenchmarkFmtFprintfIntInt               319          317   -0.63%
BenchmarkFmtFprintfPrefixedInt          356          358   +0.56%
BenchmarkFmtFprintfFloat                496          503   +1.41%
BenchmarkFmtManyArgs                   1317         1311   -0.46%
BenchmarkGobDecode                 12415020     12463592   +0.39%
BenchmarkGobEncode                 14668974     14508952   -1.09%
BenchmarkGzip                     569880241    570438324   +0.10%
BenchmarkGunzip                   128886401    138989651   +7.84%
BenchmarkHTTPClientServer             72004        76433   +6.15%
BenchmarkJSONEncode                48857161     49610984   +1.54%
BenchmarkJSONDecode               111535431    112135503   +0.54%
BenchmarkMandelbrot200              4760656      4494016   -5.60% << only good result
BenchmarkGoParse                    6673246      6722963   +0.75%
BenchmarkRegexpMatchEasy0_32            122          123   +0.82%
BenchmarkRegexpMatchEasy0_1K            384          385   +0.26%
BenchmarkRegexpMatchEasy1_32            106          106   +0.00%
BenchmarkRegexpMatchEasy1_1K           1103         1106   +0.27%
BenchmarkRegexpMatchMedium_32           216          215   -0.46%
BenchmarkRegexpMatchMedium_1K         90747        92652   +2.10%
BenchmarkRegexpMatchHard_32            4405         4739   +7.58%
BenchmarkRegexpMatchHard_1K          139907       144608   +3.36%
BenchmarkRevcomp                 1026696025   1021416881   -0.51%
BenchmarkTemplate                 141715311    142208244   +0.35%
BenchmarkTimeParse                      487          482   -1.03%
BenchmarkTimeFormat                     548          552   +0.73%
The Mandelbrot result suggests that having the compiler give a hint to the runtime hash
insert/lookup functions about when to update & consult the cache would be worthwhile.

@bradfitz (Contributor, Author)

Comment 10:

Sigh, the mandelbrot test doesn't even use maps. (I didn't even think about it a second ago.)
Yay benchmarks.
Maybe with a higher benchtime it'd be more reliable.  (This isn't even on a laptop,
where I could write it off as CPU frequency scaling)

@MichaelTJones (Contributor)

Comment 11:

If nothing else, it's not a glowing endorsement. Hmm...

@randall77 (Contributor)

Comment 12:

Maybe you could instrument what fraction of the lookups would result in a hit?  It would
be interesting to know for each of the benchmarks, and that kind of measurement doesn't
suffer from timing variance issues.

@bradfitz (Contributor, Author)

Comment 13:

For go1 bench tests overall:
 204 [fast2_hit]
2674 [gen_hit]
202447 [gen_miss]
259524 [fast1_miss]
281748 [fast1_hit]
472442 [fast2_miss]
Here fast1 and fast2 are the int32/int64/string lookup fast paths, and "gen" is any map assignment, deletion, or non-specialized (not int/string) lookup.
Broken up per-test, the baseline is:
   2 [fast2_hit]
  48 [gen_hit]
  63 [fast2_miss]
 331 [gen_miss]
12823 [fast1_hit]
102468 [fast1_miss]
... for just the test infrastructure / binary start-up somehow.  
Then the only interesting ones using maps are:  (these include the baseline numbers)
BenchmarkHTTPClientServer
   2 [fast2_hit]
  48 [gen_hit]
  64 [fast2_miss]
12823 [fast1_hit]
102468 [fast1_miss]
151846 [gen_miss]
BenchmarkJSONEncode
   2 [fast2_hit]
  48 [gen_hit]
  63 [fast2_miss]
 331 [gen_miss]
102469 [fast1_miss]
281748 [fast1_hit]   <<<<< nice
BenchmarkGoParse
   2 [fast2_hit]
2674 [gen_hit]
12823 [fast1_hit]
50932 [gen_miss]
259523 [fast1_miss]
471228 [fast2_miss]
For net/textproto ReadMIMEHeader's benchmark, which has:
     m[key] = append(m[key], value)
... from the comment above, the counts are:
   25 [fast2_miss]
30050 [fast1_miss]   // m[key] lookup on 9th and 10th keys
30071 [gen_hit]      // m[key] assignment on 9th and 10th keys
60101 [BenchmarkReadMIMEHeader-loop]
541151 [gen_miss]    // m[key] assignment on 1st-8th keys (miss because no hash is done for short strings when len(m) <= 8)
It does better if the header is bigger (more unique keys) or has longer (over 32) keys.
For net/http's BenchmarkServerFakeConnNoKeepAlive:
  22 [gen_hit]
  27 [fast2_miss]
841760 [gen_miss]
JSONEncode looks to benefit the most from this.

@bradfitz (Contributor, Author) commented Apr 3, 2013

Comment 14:

Labels changed: added performance.

@bradfitz (Contributor, Author) commented Apr 3, 2013

Comment 15:

Labels changed: removed optimization.

@rsc (Contributor) commented Dec 4, 2013

Comment 16:

Labels changed: added repo-main.

@rsc (Contributor) commented Mar 3, 2014

Comment 17:

Adding Release=None to all Priority=Someday bugs.

Labels changed: added release-none.

@bradfitz (Contributor, Author) commented Dec 2, 2014

Comment 18:

I realize that nowadays (as opposed to when this bug was filed) we have the go/ssa package, which should make it easy to run an analysis over all public Go code to see how often this pattern occurs.

@axw (Contributor) commented Dec 3, 2014

Comment 19:

@bradfitz: if you do the analysis part with go/ssa, would you please share the code? I'd be interested to see how it might fit into a go/ssa-based compiler (llgo) as an optimisation. If you can statically identify the map lookups/modifications that operate on the same key value, then you should be able to do the cache/reuse statically, with minimal impact on other sites.

@adonovan (Member) commented Dec 3, 2014

Comment 20:

The doublehash program in this CL uses go/ssa to find occurrences:
  https://golang.org/cl/182420043/
% doublehash encoding/json
/home/adonovan/go/src/encoding/json/encode.go:1073:14: this map[reflect.Type]int lookup
/home/adonovan/go/src/encoding/json/encode.go:1073:14:         dominates this update with same map and key
/home/adonovan/go/src/encoding/json/encode.go:1074:17: this map[reflect.Type]int lookup
/home/adonovan/go/src/encoding/json/encode.go:1073:14:         dominates this update with same map and key
It only finds update-after-lookup patterns, but it could easily be adapted to lookup-after-lookup too, although those are rare in code that's already performance sensitive.
Since unlike gc, go/ssa doesn't have a precise interprocedural escape analysis, I've
added an -unsound flag to make optimistic aliasing assumptions about the map expression,
e.g. that in this program,
  if !o.m[k] {
    ...
    o.m[k] = true
    ...
  }
the two references o.m must alias to the same variable.
doublehash finds 77 definite occurrences in the standard library, and 176 with the
-unsound flag.  (Escape analysis would likely yield even more since the key expression
may also be subject to aliasing, e.g. m[o.k].)

@bradfitz (Contributor, Author)

Instead of caching the map value per key, what if the frontend turned map operations into separate SSA operations for hashing the key vs. using the hash to do the various map operations?

Then there would be SSA operations for map lookup and map assignment which take the hash in addition to the key.

The downside is that the subsequent map operations would still need to do == on all keys that hashed to the same bucket, but at least the hash operation itself would only be done once.
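
A sketch of the proposed split as Go-like pseudocode; the hashed-operation helpers below are hypothetical names, not existing runtime functions:

    // Today, each map operation hashes the key itself:
    //   v, ok := mapaccess(t, m, k)  // hashes k
    //   mapassign(t, m, k, v2)       // hashes k again
    // With hashing split into its own SSA op, CSE can reuse it:
    h := maphash(t, k)                      // hypothetical: hash k once
    v, ok := mapaccessHashed(t, m, k, h)    // hypothetical: lookup via h
    if !ok {
        mapassignHashed(t, m, k, h, newVal) // hypothetical: insert via h
    }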

The alternative (having map operations return a pointer into the map value) is a little more invasive in that it has to be aware of maps being resized during sets. You might need a separate SSA value for each potential layout of the hash table, with all map mutation operations returning a new SSA value for the potentially new map layout (e.g. after growth).

@randall77, do you have plans in general of moving some of the frontend's runtime call lowering later into SSA, so more of the SSA goodness can eliminate things?

@randall77 (Contributor)

No plans at the moment, although that is a possibility.

@myitcv (Member) commented Sep 18, 2016

Does copying a map fall into the same category of optimisation with respect to key hashing?

The following is, I believe, the encouraged way of writing a map copy (assuming we want a "simple", exact copy):

var m map[t1]t2

theCopy := make(map[t1]t2, len(m))
for k, v := range m {
  theCopy[k] = v
}

This results in a call to runtime.mapassign1 within the loop, which, as I understand it, hashes the key value on every iteration.

Feels like this particular scenario can be optimised heavily?

That said, whilst this copy example is another instance of avoiding unnecessary hashing, it perhaps falls into a different category of (SSA) optimisation...

@gopherbot

CL https://golang.org/cl/30815 mentions this issue.

gopherbot pushed a commit that referenced this issue Oct 12, 2016
To compile:
  m[k] = v
instead of:
  mapassign(maptype, m, &k, &v)
do:
  *mapassign(maptype, m, &k) = v

mapassign returns a pointer to the value slot in the map.  It is just
like mapaccess except that it will allocate a new slot if k is not
already present in the map.

This makes map accesses faster but potentially larger (codewise).

It is faster because the write into the map is done when the compiler
knows the concrete type, so it can be done with a few store
instructions instead of calling typedmemmove.  We also potentially
avoid stack temporaries to hold v.

The code can be larger when the map has pointers in its value type,
since there is a write barrier call in addition to the mapassign call.
That makes the code at the callsite a bit bigger (go binary is 0.3%
bigger).

This CL is in preparation for doing operations like m[k] += v with
only a single runtime call.  That will roughly double the speed of
such operations.

Update #17133
Update #5147

Change-Id: Ia435f032090a2ed905dac9234e693972fe8c2dc5
Reviewed-on: https://go-review.googlesource.com/30815
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
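
A conceptual illustration of why returning a pointer from mapassign helps the compound-assignment case mentioned in the commit (pseudocode using the helper names from the commit message, not actual generated code):

    // Before: m[k] += v needs two runtime calls, hashing k twice:
    //   tmp := *mapaccess(maptype, m, &k)    // hash #1
    //   *mapassign(maptype, m, &k) = tmp + v // hash #2
    // After: mapassign returns a pointer to the (possibly new,
    // zeroed) value slot, so one call suffices:
    p := mapassign(maptype, m, &k) // hash k once
    *p += v                        // store through the slot pointer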
@gopherbot

Change https://golang.org/cl/100838 mentions this issue: cmd/compile: avoid mapaccess at m[k]=append(m[k]..

gopherbot pushed a commit that referenced this issue Mar 20, 2018
Currently the rvalue m[k] is transformed during walk into:

        tmp1 := *mapaccess(m, k)
        tmp2 := append(tmp1, ...)
        *mapassign(m, k) = tmp2

However, this is suboptimal, as we could instead produce just:
        tmp := mapassign(m, k)
        *tmp := append(*tmp, ...)

The optimization is possible only if, during Order, the compiler can tell that m[k] is exactly the same on the left and right sides of the assignment. It doesn't apply in cases such as:
1) m[f(k)] = append(m[f(k)], ...)
2) sink, m[k] = sink, append(m[k]...)
3) m[k] = append(..., m[k], ...)

Benchmark:
name                           old time/op    new time/op    delta
MapAppendAssign/Int32/256-8      33.5ns ± 3%    22.4ns ±10%  -33.24%  (p=0.000 n=16+18)
MapAppendAssign/Int32/65536-8    68.2ns ± 6%    48.5ns ±29%  -28.90%  (p=0.000 n=20+20)
MapAppendAssign/Int64/256-8      34.3ns ± 4%    23.3ns ± 5%  -32.23%  (p=0.000 n=17+18)
MapAppendAssign/Int64/65536-8    65.9ns ± 7%    61.2ns ±19%   -7.06%  (p=0.002 n=18+20)
MapAppendAssign/Str/256-8         116ns ±12%      79ns ±16%  -31.70%  (p=0.000 n=20+19)
MapAppendAssign/Str/65536-8       134ns ±15%     111ns ±45%  -16.95%  (p=0.000 n=19+20)

name                           old alloc/op   new alloc/op   delta
MapAppendAssign/Int32/256-8       47.0B ± 0%     46.0B ± 0%   -2.13%  (p=0.000 n=19+18)
MapAppendAssign/Int32/65536-8     27.0B ± 0%     20.7B ±30%  -23.33%  (p=0.000 n=20+20)
MapAppendAssign/Int64/256-8       47.0B ± 0%     46.0B ± 0%   -2.13%  (p=0.000 n=20+17)
MapAppendAssign/Int64/65536-8     27.0B ± 0%     27.0B ± 0%     ~     (all equal)
MapAppendAssign/Str/256-8         94.0B ± 0%     78.0B ± 0%  -17.02%  (p=0.000 n=20+16)
MapAppendAssign/Str/65536-8       54.0B ± 0%     54.0B ± 0%     ~     (all equal)

Fixes #24364
Updates #5147

Change-Id: Id257d052b75b9a445b4885dc571bf06ce6f6b409
Reviewed-on: https://go-review.googlesource.com/100838
Reviewed-by: Matthew Dempsky <mdempsky@google.com>
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
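
At the source level nothing changes; the rewrite happens inside the compiler. The pattern that now compiles down to a single mapassign per iteration looks like:

    m := make(map[string][]int)
    for _, w := range []string{"a", "b", "a"} {
        m[w] = append(m[w], 1) // one runtime map call after CL 100838
    }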
@mvdan (Member) commented Mar 25, 2023

This would make declaring flags about 10% faster, which matters as cmd/go spends as much as 15% of its init time in https://pkg.go.dev/flag#FlagSet.Var. For numbers, see https://go-review.googlesource.com/c/go/+/435259. I had to abandon that CL as it was technically a breaking change, but all it tried to do was avoid accessing the same map with the same string key twice.
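
The map usage inside FlagSet.Var is the familiar lookup-then-insert shape, paraphrased here in simplified form (not the exact stdlib source):

    // One lookup to detect duplicate registration...
    if _, alreadyThere := f.formal[name]; alreadyThere {
        panic("flag redefined: " + name) // simplified error handling
    }
    // ...then an insert that hashes the same name a second time.
    f.formal[name] = flag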
