RubyKaigi and the Path to Ruby 3

Performance, Concurrency, and Static Analysis in Ruby


Original Image by Mathias Meyer CC BY-SA 2.0

At RubyKaigi in Fukuoka, Japan, Matz spoke at length about Ruby 3—the next major version of the Ruby programming language, scheduled for release on Christmas of 2020. Many of the features planned for Ruby 3 are already well underway and RubyKaigi had a ton of talks that delved into the details. The three major areas of planned improvement for Ruby 3 are performance, concurrency, and static analysis.

Performance

“No language is fast enough.” ―Matz

One piece of feedback Matz often hears is that Ruby needs to be faster. A stated goal of Ruby 3 is to be three times faster than Ruby 2. Matz explained that memory performance needs to improve as well as CPU performance. In real-world conditions, slowdowns are often due to memory performance issues rather than the CPU being pegged.

Memory Performance

Since Ruby 1.9, Ruby core committers have done a ton of work to improve Ruby’s garbage collector (GC). The GC used to be a lazy mark and sweep collector, where living objects are marked and the rest are swept away. One of the problems with a mark and sweep collector is that it stops the world to do its work, causing jitter in a Ruby application.

Ruby 2.0 introduced bitmap marking, so the mark bits live in a separate bitmap rather than in the objects themselves. Because marking changes only the bitmap and leaves the objects untouched, forked processes have less memory to copy under copy-on-write (COW). In Ruby 2.1, generational garbage collection was added: long-lived objects are presumed to still be in use and are checked less often, so the world stops less frequently. Ruby 2.2 improved the situation further by adding tri-color incremental collection, which pauses the program a tiny bit at a time rather than all at once. Since then, numerous additional GC improvements have added up to greatly improve Ruby’s memory performance.
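You can watch the generational collector at work from Ruby itself. Here is a minimal sketch, assuming a CRuby (2.1 or later) where GC.stat reports :minor_gc_count and :major_gc_count. The gc_counts helper and the allocation size are only illustrative, and the exact numbers will vary from run to run:

```ruby
# Snapshot of how many minor (young-generation) and major (full)
# collections have run so far in this process.
def gc_counts
  { minor: GC.stat(:minor_gc_count), major: GC.stat(:major_gc_count) }
end

before = gc_counts

# Allocate many short-lived strings. Most die young, so they are
# typically reclaimed by cheaper minor collections.
200_000.times { |i| "temporary string #{i}" }

# Force one full (major) collection so the major counter moves too.
GC.start(full_mark: true, immediate_sweep: true)

after = gc_counts

puts "minor GCs during run: #{after[:minor] - before[:minor]}"
puts "major GCs during run: #{after[:major] - before[:major]}"
```

On a typical run the minor count grows faster than the major count, which is the point of the generational design: old objects are left alone most of the time.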

One of the remaining problems with Ruby’s memory usage in Ruby 2.6 is that as objects are created and then swept away they fragment the heap. This means there are empty, unused sections of the heap where swept away objects used to be. These holes in memory mean we use more heap pages than necessary. Aaron “Tenderlove” Patterson took a stab at solving this problem by writing a garbage “compactor” for Ruby.

Tenderlove’s new garbage compactor means fewer pages in the heap and better copy-on-write friendliness. The new garbage compactor considers some objects to be pinned, but the rest can be moved to defragment the heap. Things that are pinned include anything on the stack rather than the heap, Hash keys, and C-extension objects marked by rb_gc_mark. It also introduces a rb_gc_mark_no_pin that C-extension authors can now use to signify they don’t need VALUE pointer continuity.

Another interesting thing about the new garbage compactor is that it preserves object_ids and even supports ObjectSpace._id2ref. If you want more details, check out Tenderlove’s great writeup on the Ruby bug tracker.

For the moment, you have to manually trigger the garbage compactor with GC.compact, but auto-compaction may be supported in the future. Garbage compaction has been merged into Ruby trunk, and should be released with Ruby 2.7.

GC.compact
=> {:considered=>
  {:T_NONE=>9219,
   :T_OBJECT=>13,
   :T_CLASS=>75,
   :T_MODULE=>7,
   :T_FLOAT=>0,
   :T_STRING=>19905,
  #...
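To see the object_id guarantee for yourself, here is a small sketch. It assumes a Ruby where GC.compact may or may not be available, so the call is guarded. The compact_if_supported helper is a name made up for this example, not part of Ruby:

```ruby
# GC.compact only exists on Rubies and platforms that support compaction,
# so guard the call instead of assuming it is there.
def compact_if_supported
  return false unless GC.respond_to?(:compact)
  GC.compact
  true
rescue NotImplementedError
  false
end

# Keep references so these objects stay alive across compaction.
objects = Array.new(10_000) { Object.new }
ids_before = objects.map(&:object_id)

# Drop every other object to leave holes in the heap, then collect.
objects = objects.each_slice(2).map(&:first)
ids_before = ids_before.each_slice(2).map(&:first)
GC.start

compacted = compact_if_supported
ids_after = objects.map(&:object_id)

puts "compaction ran: #{compacted}"
puts "object_ids preserved: #{ids_before == ids_after}"
```

Whether or not the surviving objects were physically moved, their object_ids should come out unchanged.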

It’s great to see Ruby’s GC continue to evolve and improve as new versions of Ruby are released. We’ve come a long way since Ruby 1.8, and further GC improvements and new additions like the garbage compactor will make Ruby memory even faster and more efficient.

CPU Performance

One of the promising efforts for Ruby CPU performance is the new Ruby JIT. Check out an article I wrote about Ruby’s New JIT for some background and details about MJIT. Since Ruby 2.6, k0kubun has continued his work improving Ruby’s new JIT with frequent commits.

Another promising project is vnmakarov’s MIR JIT compiler, which is a “light-weight JIT compiler based on MIR (Medium Internal Representation).” This new internal representation (IR) could deliver gains beyond the current JIT, but it requires much larger changes to Ruby internals than the introduction of MJIT did. (MJIT was also based on vnmakarov’s work.) According to Matz, the new MIR JIT compiler might be included as part of Ruby 3, but we’ll have to wait and see!

Concurrency

“This is the ‘Year of Concurrency.’” ―Matz

Matz has repeatedly said that he wishes he hadn’t introduced Threads to Ruby. He doesn’t feel like Threads are the right level of abstraction for Rubyists to use. Threads are error-prone and often blocked from parallel computation by Ruby’s global VM lock (GVL or GIL). With Ruby 3, Matz hopes to introduce better concurrency primitives for Ruby!

“I regret adding Threads.” ―Matz

Async with Fibers

One feature that’s already been merged into Ruby is a low-level rewrite of Fibers by Samuel “ioquatix” Williams. Building on this work, he made a PR that implements a light-weight selector for deterministic I/O based on Fibers.

To see how this makes concurrency with I/O awesome in Ruby, ioquatix demonstrated his Falcon Rack webserver, which uses his async Fibers libraries under the hood. The huge benefit of Falcon over other Ruby web servers is that Falcon does not block on I/O! Commonly, the majority of the request time is spent waiting for I/O to unblock. Handling that I/O asynchronously means that your server is free to field other requests while it waits on your database, memory store or API calls.
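The core idea can be sketched with plain Fibers. This is not how Falcon or the async libraries are implemented. It is a toy round-robin scheduler, with Fiber.yield standing in for "this request is waiting on I/O", and the build_request helper is invented for the example:

```ruby
log = []

# Each "request" does a bit of work, then yields at the points where
# it would otherwise block on I/O (a database query, an API call, ...).
def build_request(name, log)
  Fiber.new do
    log << "#{name}: started"
    Fiber.yield              # pretend to wait on I/O
    log << "#{name}: got I/O result"
    Fiber.yield              # pretend to wait again
    log << "#{name}: finished"
    :done                    # final value returned by the last resume
  end
end

pending = [build_request("A", log), build_request("B", log)]

# Toy scheduler: resume each fiber in turn until every one has finished.
until pending.empty?
  fiber = pending.shift
  result = fiber.resume
  pending << fiber unless result == :done
end

puts log
```

Request B starts before request A finishes, all on a single thread. A real selector resumes a fiber only when its I/O is actually ready, rather than blindly taking turns.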

This approach means that we need to be able to handle asynchronous I/O with adapters. There are works-in-progress for popular tools including Postgres, MySQL, and Redis. Once these tools mature, Falcon might be a very compelling option when compared to Puma, Unicorn, Passenger and other popular Rack web servers in Ruby when dealing with apps that are often waiting for I/O.

Threadlets

Naming is hard, so I’ll just call this proposal “Threadlets,” even though that’s probably not what we’ll end up calling them if they’re merged into Ruby!

Threadlets are a proposal by Eric “normalperson” Wong for Ruby 3 that would introduce a new type of auto-scheduling Fiber that implements the same methods as the current Thread class. This would let existing Ruby programs, including popular webservers, just swap Threadlets in place of Threads for a much lighter primitive under the hood. On my laptop, I’m only able to spin up 4096 sleeping Threads, but using normalperson’s branch I was able to spin up over a million sleeping Threadlets before I got bored and quit the process.
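Here is a rough sketch of what "swap in place" could look like. Threadlets aren’t in any released Ruby, so this example only ever runs with Thread. The run_concurrently helper and its worker_class parameter are hypothetical, and simply show code that depends on Thread’s interface rather than on Thread itself:

```ruby
# Runs each job on its own worker and collects the results in order.
# worker_class only needs Thread's interface: .new with a block, and #value.
def run_concurrently(jobs, worker_class: Thread)
  workers = jobs.map do |job|
    worker_class.new { job.call }
  end
  # #value waits for the worker to finish and returns the block's result.
  workers.map(&:value)
end

jobs = [
  -> { sleep 0.01; 1 + 1 },
  -> { sleep 0.01; "ruby".upcase },
  -> { sleep 0.01; [3, 1, 2].sort }
]

results = run_concurrently(jobs)
p results
```

If a lighter primitive with the same methods ever lands, code written this way would only need a different worker_class.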

My personal favorite name for this feature is “Thread::Feather” since it’s a Thread as light as a feather. Alas, my suggestion was already rejected on the epic Ruby bugtracker issue for the Threadlet proposal. If you can think of a better name than those already proposed in the issue tracker, add a comment! Matz says “we need better names” and he is open to suggestions.

Guilds

Async Fibers and “Threadlets” are both focused on lightweight I/O. What about parallel computation? That’s where Koichi “ko1” Sasada’s proposal for Guilds shines. Ruby’s global VM lock (GVL) currently prevents most Ruby Threads from computing in parallel. Guilds work around the GVL by allowing Threads in different Guilds to compute at the same time. This means you could spin up as many Guilds as you have cores and be able to do fully parallel threaded computation in Ruby!

Threads within a Guild will be able to share mutable data between themselves, like Threads currently can in Ruby. On the other hand, Threads from one Guild will not be able to mutate data from another Guild unless ownership is transferred.
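Guilds aren’t available to try yet, so this sketch only shows the within-a-Guild half of the model using today’s Threads: several Threads mutating the same data, with a Mutex to keep the updates safe. The counter and lock names are just for illustration:

```ruby
counter = 0
lock = Mutex.new

# Four threads share and mutate the same counter.
threads = Array.new(4) do
  Thread.new do
    1_000.times do
      # Without the lock, concurrent read-modify-write updates can be lost.
      lock.synchronize { counter += 1 }
    end
  end
end

threads.each(&:join)
puts counter
```

This kind of shared mutation is what the proposal would rule out across Guilds unless ownership is transferred.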

Matz prefers the name “Isolates,” but ko1 prefers “Guilds,” so the naming question is still open. Matz says the gaming industry may have to now compete with Ruby for the meaning of “Guild,” but he’s also still open to new naming ideas before Guilds are finalized.

Static Analysis

“I hate tests.” ―Matz

Matz started talking about static analysis by saying, “I hate tests.” Matz hates tests because they aren’t DRY - they’re a repetition of the code. Nonetheless, we write tests, because as human beings we can’t create “correct” programs without help. So what can we do beyond tests to ensure program correctness? Static analysis!

Static analysis is a tool we can use in addition to tests to help ensure our programs have fewer bugs. Most static analysis tools these days rely on inline type annotations, where you mark up your code with types. Once again, Matz’s complaint is that type annotations aren’t DRY - they’re a repetition in the code. So just like tests, Matz hates type annotations!

“I hate type annotations.” ―Matz

So how can you have type analysis without type annotations? The solution Matz prefers is to have .rbi files with type information that are parallel to our .rb Ruby code. It’s still repetition, but it’s in a separate file, not intermingled with our code.

So where does Ruby currently stand with type analysis? The two main tools discussed at RubyKaigi were Steep and Sorbet.

Steep

Steep is a tool written by Soutaro Matsumoto, the CTO of Sider. Steep already uses parallel .rbi files for type annotations.

For example, let's add some basic type annotations to this customer.rb file:

class Customer
  attr_reader :birthday
  attr_reader :given_name
  
  def initialize(birthday:, given_name:, groups: [])
    @birthday = birthday
    @given_name = given_name
    @groups = groups
  end
end

First, we'll scaffold out a customer.rbi type annotations file:

steep scaffold customer.rb > customer.rbi

This command creates a new customer.rbi file:

class Customer
  @birthday: any
  @given_name: any
  @groups: any
  def initialize: (birthday: any, given_name: any, ?groups: Array<any>) -> any
end

Let's edit the .rbi file and replace some of the any signatures with better types:

class Customer
  @birthday: DateTime
  @given_name: String
  @groups: Array<String>
  def initialize: (birthday: DateTime, given_name: String, ?groups: Array<String>) -> Customer
end

Now, we can check our types!

steep check customer.rb
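To make the motivation concrete, here is the kind of mistake a type checker is meant to catch before the code ever runs. This sketch is plain Ruby and doesn’t invoke Steep or show its output. The call below is an invented example that contradicts the given_name: String signature:

```ruby
require "date"

class Customer
  attr_reader :birthday
  attr_reader :given_name

  def initialize(birthday:, given_name:, groups: [])
    @birthday = birthday
    @given_name = given_name
    @groups = groups
  end
end

# Ruby happily accepts this call at runtime, even though given_name is an
# Integer and the signature file declares it as a String.
customer = Customer.new(birthday: DateTime.new(1990, 1, 1), given_name: 42)

# The mistake only surfaces later, when String behavior is assumed.
begin
  customer.given_name.upcase
rescue NoMethodError => e
  puts "runtime failure: #{e.class}"
end
```

Without a checker, a bug like this shows up wherever the value is finally used, which may be far from the call that introduced it.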

This is a super simple example, but check the Steep README for more!

Sorbet

Sorbet is a tool written in C++ by Stripe engineers, who currently use it in production. Sorbet is closed source, so it’s not yet possible to get into many details. The good news is that there are plans to open source Sorbet later this year! Sorbet uses inline type annotations, unlike Steep’s parallel annotation files. Matz noted that Sorbet is quite fast but that he prefers Steep-style annotations in a separate file.

A Unified Type Standard

One of the goals for Ruby’s new type analysis is standardizing the way type annotations are written. That way, multiple type analysis tools can all use the same annotations. With Ruby 3, you may well be able to scaffold a parallel .rbi type annotation file and then use Steep, Sorbet, and other type analysis tools together to glean various things about your Ruby code.

Square at RubyKaigi

We were excited to see Square engineer Shawnee “shawneegao” Gao on stage at RubyKaigi! Shawnee gave a talk on how Square uses Ruby metaprogramming with GraphQL to avoid tedious, manual code and tests in her team’s codebase of several hundred thousand lines of Ruby.

Square uses a ton of Ruby in a wide variety of microservices, so we’re incredibly excited about Ruby 3 and the future of Ruby!
