Conversation
`compact_child_nodes` allocates an array. We can skip that step by simply yielding the nodes.
Benchmark for visiting the rails codebase:
```rb
require "prism"
require "benchmark/ips"
files = Dir.glob("../rails/**/*.rb")
results = files.map { Prism.parse_file(it) }
visitor = Prism::Visitor.new
Benchmark.ips do |x|
x.config(warmup: 3, time: 10)
x.report do
results.each do
visitor.visit(it.value)
end
end
end
RubyVM::YJIT.enable
Benchmark.ips do |x|
x.config(warmup: 3, time: 10)
x.report do
results.each do
visitor.visit(it.value)
end
end
end
```
Before:
```
ruby 3.4.8 (2025-12-17 revision 995b59f666) +PRISM [x86_64-linux]
Warming up --------------------------------------
1.000 i/100ms
Calculating -------------------------------------
2.691 (± 0.0%) i/s (371.55 ms/i) - 27.000 in 10.089422s
ruby 3.4.8 (2025-12-17 revision 995b59f666) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
1.000 i/100ms
Calculating -------------------------------------
7.278 (±13.7%) i/s (137.39 ms/i) - 70.000 in 10.071568s
```
After:
```
ruby 3.4.8 (2025-12-17 revision 995b59f666) +PRISM [x86_64-linux]
Warming up --------------------------------------
1.000 i/100ms
Calculating -------------------------------------
3.429 (± 0.0%) i/s (291.65 ms/i) - 35.000 in 10.208580s
ruby 3.4.8 (2025-12-17 revision 995b59f666) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
1.000 i/100ms
Calculating -------------------------------------
16.815 (± 0.0%) i/s (59.47 ms/i) - 169.000 in 10.054668s
```
~21% faster on the interpreter, ~56% with YJIT
| def child_nodes: () -> Array[Prism::node?] | ||
| def comment_targets: () -> Array[Prism::node | Location] | ||
| def compact_child_nodes: () -> Array[Prism::node] | ||
| def each_child_node: () { (Prism::node) -> void } -> void | () -> Enumerator[Prism::node] |
There was a problem hiding this comment.
Unsure if this is correct. I'm also missing sorbet signatures which I don't know how to write for this
There was a problem hiding this comment.
This looks right to me, and it would have failed to typecheck otherwise.
There was a problem hiding this comment.
So are sorbet signatures not needed? I don't use these systems so no idea when/where it gets used. Other functions are present so I assume there is a reason for them.
| end | ||
|
|
||
| # def compact_child_nodes: () -> Array[Node] | ||
| def compact_child_nodes |
There was a problem hiding this comment.
This can just be implemented as each_child_node.to_a but performance is not so great
There was a problem hiding this comment.
Nope, you're right to keep this the same.
|
Nice, thank you! FWIW in the description you mention
but the numbers above look much better than that to me, I would say 16.815/7.278 = 2.3x "as fast" / 130% "faster" (and I think "2.3x faster" is clear for most people too). |
| <%- when Prism::Template::OptionalNodeField -%> | ||
| yield <%= field.name %> if <%= field.name %> | ||
| <%- when Prism::Template::NodeListField -%> | ||
| <%= field.name %>.each {|node| yield node } |
There was a problem hiding this comment.
| <%= field.name %>.each {|node| yield node } | |
| <%= field.name %>.each(&block) |
might be faster because it avoids an extra block (that literal block) and consumes less stack (both good for performance and to avoid stack overflow errors).
I'll give it a try.
There was a problem hiding this comment.
I tried
diff --git i/templates/lib/prism/node.rb.erb w/templates/lib/prism/node.rb.erb
index c97c029d3..76cbb2608 100644
--- i/templates/lib/prism/node.rb.erb
+++ w/templates/lib/prism/node.rb.erb
@@ -343,8 +343,8 @@ module Prism
end
# def each_child_node: () { (Prism::node) -> void } -> void | () -> Enumerator[Prism::node]
- def each_child_node
- return to_enum(:each_child_node) unless block_given?
+ def each_child_node(&_block)
+ return to_enum(:each_child_node) unless _block
<%- node.fields.each do |field| -%>
<%- case field -%>
@@ -353,7 +353,7 @@ module Prism
<%- when Prism::Template::OptionalNodeField -%>
yield <%= field.name %> if <%= field.name %>
<%- when Prism::Template::NodeListField -%>
- <%= field.name %>.each {|node| yield node }
+ <%= field.name %>.each(&_block)
<%- end -%>
<%- end -%>
endand it seems a little bit faster in interpreter but much slower in YJIT (2 runs each to get an idea of variance, for some reason the ± 0.0% seems broken in benchmark-ips):
BEFORE:
$ ruby -Ilib bench_visitor.rb
ruby 4.0.0 (2025-12-25 revision 553f1675f3) +PRISM [x86_64-linux]
Warming up --------------------------------------
1.000 i/100ms
Calculating -------------------------------------
2.720 (± 0.0%) i/s (367.65 ms/i) - 28.000 in 10.294286s
ruby 4.0.0 (2025-12-25 revision 553f1675f3) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
1.000 i/100ms
Calculating -------------------------------------
14.286 (± 0.0%) i/s (70.00 ms/i) - 143.000 in 10.009988s
$ ruby -Ilib bench_visitor.rb
ruby 4.0.0 (2025-12-25 revision 553f1675f3) +PRISM [x86_64-linux]
Warming up --------------------------------------
1.000 i/100ms
Calculating -------------------------------------
2.740 (± 0.0%) i/s (364.94 ms/i) - 28.000 in 10.218390s
ruby 4.0.0 (2025-12-25 revision 553f1675f3) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
1.000 i/100ms
Calculating -------------------------------------
14.309 (± 0.0%) i/s (69.88 ms/i) - 144.000 in 10.064832s
AFTER:
$ ruby -Ilib bench_visitor.rb
ruby 4.0.0 (2025-12-25 revision 553f1675f3) +PRISM [x86_64-linux]
Warming up --------------------------------------
1.000 i/100ms
Calculating -------------------------------------
2.790 (± 0.0%) i/s (358.37 ms/i) - 28.000 in 10.034481s
ruby 4.0.0 (2025-12-25 revision 553f1675f3) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
1.000 i/100ms
Calculating -------------------------------------
11.968 (± 0.0%) i/s (83.55 ms/i) - 120.000 in 10.026674s
$ ruby -Ilib bench_visitor.rb
ruby 4.0.0 (2025-12-25 revision 553f1675f3) +PRISM [x86_64-linux]
Warming up --------------------------------------
1.000 i/100ms
Calculating -------------------------------------
2.805 (± 0.0%) i/s (356.53 ms/i) - 29.000 in 10.339397s
ruby 4.0.0 (2025-12-25 revision 553f1675f3) +YJIT +PRISM [x86_64-linux]
Warming up --------------------------------------
1.000 i/100ms
Calculating -------------------------------------
12.116 (± 0.0%) i/s (82.54 ms/i) - 122.000 in 10.069442s
Yeah, I could have written that more clearly. benchmark/ips has https://github.com/evanphx/benchmark-ips?tab=readme-ov-file#independent-benchmarking but I was too lazy |
#3808 (comment)
This also came up someplace else before but I don't remember where. I agree this makes sense. cc @eregon
compact_child_nodesallocates an array. We can skip that step by simply yielding the nodes.Benchmark for visiting the rails codebase:
Before:
After:
~21% faster on the interpreter, ~56% with YJIT