This document introduces the structure of the MRI source code. It also introduces the minimum required knowledge for hacking on MRI.
It covers the following topics:
- Exercise: Clone the MRI source code.
- Exercise: Build MRI and install built binaries.
- Exercise: Execute Ruby programs with built Ruby.
- MRI source code structures.
- Exercise: The 1st hack. Change the version description.
The following commands assume a Unix-like environment, such as Linux or macOS. If you're using a Windows environment, you will need to refer to other resources.
NOTE: We provide an experimental docker image: docker pull koichisasada/rhc. Use the rubydev account (su rubydev) and enjoy hacking.
We assume the use of the following directory structure:
- workdir/
  - ruby/ <- git cloned directory
  - build/ <- build directory (*.o files and other compilation artifacts are stored here)
  - install/ <- install directory (workdir/install/bin/ruby is the installed binary)
The commands git, ruby, autoconf, gcc (or clang, etc.), and make are required.
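Since ruby itself is one of the required commands, a short script can confirm that the others are on your PATH before you start. The following is a minimal sketch, not part of MRI: the constant and method names are made up for illustration, and it only checks that an executable with the given name exists, not that its version is suitable.

```ruby
# Hypothetical helper: report which of the required build commands are on PATH.
REQUIRED_COMMANDS = %w[git ruby autoconf make].freeze
COMPILERS = %w[gcc clang].freeze # any one C compiler is enough

# Return the full path of +cmd+ if an executable file with that name
# exists in one of the directories listed in +path+, otherwise nil.
def find_command(cmd, path = ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?
    candidate = File.join(dir, cmd)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

missing = REQUIRED_COMMANDS.reject { |cmd| find_command(cmd) }
missing << "gcc or clang" if COMPILERS.none? { |cmd| find_command(cmd) }

if missing.empty?
  puts "All required commands were found."
else
  puts "Missing commands: #{missing.join(', ')}"
end
```

Run it with the Ruby already installed on your system, for example by saving it under a name of your choice and passing that file to ruby.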
Standard Ruby extensions (such as zlib, openssl, etc.) will be built if the libraries they depend on are available.
If you use apt-get (or apt) for package management in your environment, then you can get all dependencies with the following command:
$ sudo apt-get install git ruby autoconf gcc make zlib1g-dev libffi-dev libreadline-dev libgdbm-dev libssl-dev libyaml-dev
If you use a package manager other than apt-get, see, for example, the rbenv/ruby-build Wiki.
Use the following commands:
$ mkdir workdir
$ cd workdir
$ git clone https://github.com/ruby/ruby.git # The cloned source code will be available in workdir/ruby
Due to limited network bandwidth at the venue, please clone the source code at home.
- Check the required commands described above.
$ cd workdir/ # Move to workdir
$ cd ruby # Move to workdir/ruby
$ ./autogen.sh
$ cd ..
$ mkdir build
$ cd build
$ ../ruby/configure --prefix=$PWD/../install --enable-shared
- The prefix option specifies an install directory. You can specify the directory of your choice by supplying the full absolute path (in this case, workdir/install is specified).
- Users of Homebrew will need to add the following options: --with-openssl-dir="$(brew --prefix openssl)" --with-readline-dir="$(brew --prefix readline)" --disable-libedit
$ make -j # Run build. -j specifies parallel build.
$ make install # Tip: for a faster install, instead run make install-nodoc to install ruby without rdoc.
$ ../install/bin/ruby -v # Shows the version description of your installed ruby command.
NOTE: Running make with the V=1 option (i.e. make V=1 -j, etc.) will output the full commands that are executed during the build. By default, V=0 is specified and detailed output is suppressed.
There are several ways to run Ruby scripts on the Ruby you built.
The simplest way is to launch the installed Ruby directly, i.e. invoke workdir/install/bin/ruby. This is the same as invoking a pre-built Ruby binary. However, this means you will need to run make install every time you make a change to the Ruby source code, which can be rather time-consuming.
Here we introduce a few convenient ways to launch our version of Ruby without installing.
After building Ruby, the miniruby command is available in workdir/build. miniruby is a limited version of Ruby for building Ruby itself. The limitations of miniruby, however, are minimal: it is unable to load extension libraries and limited encodings are available. You can try most of Ruby's syntax using miniruby.
miniruby is built during the first phase of the Ruby build process. Thus, miniruby is useful for early verification of modifications made to MRI.
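To see the two limitations for yourself, a small probe script can be run under both ./miniruby and the installed ruby and the outputs compared. This is only a sketch: the method names are made up for illustration, and zlib is used simply because it is one of the extension libraries mentioned earlier (it may also be unavailable in the full ruby if its library was not found at build time).

```ruby
# Hypothetical probe script: run it with ./miniruby and with the full ruby,
# then compare the two outputs.

# Number of encodings this interpreter knows about.
def encoding_count
  Encoding.list.size
end

# True if the library +name+ can be required by this interpreter.
def can_require?(name)
  require name
  true
rescue LoadError
  false
end

puts RUBY_DESCRIPTION
puts "encodings available: #{encoding_count}"
puts "zlib loadable: #{can_require?('zlib')}"
```

The script uses only core classes, so it should start under miniruby as well; the point is the difference between the two runs rather than any particular number.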
The following development loop is very efficient:
- Modify MRI
- Run make miniruby to build miniruby (this is faster than make or make all).
- Run a Ruby script in miniruby to test the correctness of your modifications.
To support this development loop, we provide a make run rule in the Makefile. This rule does the following:
- Build miniruby.
- Run workdir/ruby/test.rb (test.rb in the source directory) with the built miniruby.
Using make run, you can test your modifications with the following steps.
- Write a test for your modifications in ruby/test.rb. Note that you can't require gems or extension libraries in test.rb.
- Modify MRI.
- Invoke $ make run in the build directory.
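As an illustration of the first step, a test.rb for make run might look like the sketch below. The checks and the method name are hypothetical; the only real constraint taken from the text above is that the file must not require gems or extension libraries, so it sticks to core classes.

```ruby
# ruby/test.rb: a hypothetical smoke test that runs under miniruby.
# It uses only core classes and syntax, so no require is needed.

# Return a list of failure messages; an empty list means every check passed.
def smoke_checks
  failures = []
  failures << "Integer#+ is broken" unless 1 + 2 == 3
  failures << "String#upcase is broken" unless "mri".upcase == "MRI"
  failures << "Array#map is broken" unless [1, 2, 3].map { |x| x * 2 } == [2, 4, 6]
  failures << "Hash lookup is broken" unless { a: 1 }[:a] == 1
  failures
end

failures = smoke_checks
puts RUBY_DESCRIPTION
if failures.empty?
  puts "all checks passed"
else
  failures.each { |msg| puts "FAILED: #{msg}" }
  exit 1
end
```

Replace the checks with ones that exercise the part of MRI you modified; a non-zero exit status makes a failure easy to spot in the make output.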
If you want to run the "normal" Ruby, which can load extension libraries, you can use make runruby. This allows you to run Ruby without the make install step, which should save you some time.
- Write in ruby/test.rb what you want to check.
- Modify MRI.
- Invoke $ make runruby in the build directory.
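Because make runruby uses the full ruby, test.rb may load extension libraries here. The sketch below assumes the zlib extension was built (it is only built when its library is available) and falls back gracefully if it was not; the method name is made up for illustration.

```ruby
# ruby/test.rb for make runruby: a hypothetical example that needs an
# extension library, which miniruby cannot load.

# Compress and decompress +text+ with zlib and return the round-tripped
# string, or nil if the zlib extension is not available in this build.
def zlib_round_trip(text)
  require "zlib"
  Zlib::Inflate.inflate(Zlib::Deflate.deflate(text))
rescue LoadError
  nil
end

result = zlib_round_trip("hello from runruby")
if result
  puts "zlib round trip: #{result}"
else
  puts "zlib is not available in this build"
end
```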
NOTE: Running gdb on macOS can be quite difficult. The following commands assume a Linux environment.
When modifying the MRI source code, you can easily introduce critical problems that result in a SEGV. To debug such problems, we provide Makefile rules to support debugging with gdb. Of course, you can also debug with break points.
- Write in ruby/test.rb what you want to check. Note that you can't use gems or extension libraries in test.rb.
- Invoke $ make gdb to run miniruby with gdb. If there are no problems, gdb finishes silently.
make gdb uses ./miniruby. If you want to debug with ./ruby, use make gdb-ruby instead.
If you want to use break points, modify the run.gdb file generated by the make gdb command. For example, the b func_name gdb command inserts a break point at the beginning of the func_name function.
There is a similar rule for lldb, $ make lldb, for using lldb instead of gdb (but Koichi doesn't know the details because he doesn't use lldb). It may be useful if you use macOS.
$ make btest # run bootstrap tests in ruby/bootstraptest/
$ make test-all # run test-unit tests in ruby/test/
$ make test-spec # run tests provided in ruby/spec
These three tests have different purposes and characteristics.
At a glance, you can observe the following directory structure:
- ruby/*.c: MRI core files
  - VM cores
    - VM
      - vm*.[ch]: VM implementation
      - vm_core.h: definitions of VM data structure
      - insns.def: definitions of VM instructions
    - compile.c, iseq.[ch]: instruction sequence (bytecode)
    - gc.c: GC and memory management
    - thread*.[ch]: thread management
    - variable.c: variable management
    - dln*.c: dll management for extension libraries
    - main.c, ruby.c: the entry point of MRI
    - st.c: Hash algorithm implementation (see https://blog.heroku.com/ruby-2-4-features-hashes-integers-rounding)
  - Embedded classes
    - string.c: String class
    - array.c: Array class
    - ... (file names show class names, such as time.c for Time class)
- ruby/*.h: internal definitions. C-extension libraries can't use them.
- ruby/include/ruby/*: external definitions. C-extension libraries can use them.
- ruby/enc/: encoding information.
- ruby/defs/: various definitions.
- ruby/tool/: tools to build MRI.
- ruby/missing/: implementations for features that are missing in some OSes.
- ruby/cygwin/, ruby/nacl/, ruby/win32, ...: OS/system dependent code.
There are two kinds of libraries.
- ruby/lib/: Standard libraries written in Ruby.
- ruby/ext/: Bundled extension libraries written in C.
- ruby/basictest/: location of the old tests
- ruby/bootstraptest/: bootstrap tests
- ruby/test/: tests written in test-unit notation
- ruby/spec/: tests written in RSpec notation
- ruby/doc/, ruby/man/: documentation
The Ruby build process is composed of several phases, including source code generation. Several of the build tools are written in Ruby, so the build process itself requires a Ruby interpreter. Release tarballs contain the generated source code, so installing Ruby from a release tarball does not require a Ruby interpreter (or other development tools such as autoconf).
If you want to build MRI from source code fetched from a Subversion or Git repository, you need a Ruby interpreter.
The following steps describe the build and install process:
- Build miniruby
  - parse.y -> parse.c: Compile syntax rules into C code with lrama
  - insns.def -> vm.inc: Compile VM instructions into C code with ruby (BASERUBY)
  - *.c -> *.o (*.obj on Windows): Compile C code into object files.
  - Link object files into miniruby
- Build encodings
  - Translate enc/... to appropriate C code with miniruby
  - Compile C code
- Build C-extension libraries
  - Generate Makefile from extconf.rb with mkmf.rb and miniruby
  - Run make using the generated Makefile.
- Build the ruby command
- Generate documentation (rdoc, ri)
- Install MRI (to the install directory specified by the configure --prefix option)
There are actually many more steps in the process. It is difficult, however, to comprehensively list all the steps (even I don't know all of them!), so the above is an abbreviated sequence of steps. If you are curious, you can see all the rules in common.mk and related files.
Let's start modifying MRI. We assume that all source code is placed at workdir/ruby/.
For your first exercise, let's modify the version description which is displayed with ruby -v (or ./miniruby -v) to display it as your own Ruby (for example, show a version description with your name included).
- Open version.c in your editor.
- Briefly skim over the entirety of version.c.
- The function ruby_show_version() seems like what we're looking for.
- fflush() is a C function that flushes the output buffer, so we can guess that adding some printing code just before the fflush() call could work.
- Add the line printf("...\n"); (replace ... with a string of your choice).
- Run $ make miniruby to build (don't forget to move to the build directory).
- Run $ ./miniruby -v and check the result.
- Run $ make install to install the built ruby.
- Run $ ../install/bin/ruby -v and check the result with the installed ruby.
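If you prefer to check the result from a script rather than by eye, something like the following sketch could be used. It is not part of MRI: the method names and the CHECK_RUBY and CHECK_MARKER environment variables are hypothetical, chosen only for this example. It runs the given ruby with -v and reports whether the output contains the string you added.

```ruby
require "rbconfig"

# Run "<ruby_path> -v" and return everything it printed (stdout and stderr).
def version_output(ruby_path)
  IO.popen([ruby_path, "-v"], err: [:child, :out], &:read)
end

# True if the version output of the given ruby contains +marker+.
def version_mentions?(ruby_path, marker)
  version_output(ruby_path).include?(marker)
end

# Assumption: CHECK_RUBY points at the ruby to inspect (e.g. ./miniruby) and
# CHECK_MARKER is the string you printed; both names are made up for this sketch.
# Without them, the ruby running this script is checked for the word "ruby".
ruby_path = ENV.fetch("CHECK_RUBY", RbConfig.ruby)
marker = ENV.fetch("CHECK_MARKER", "ruby")

puts version_output(ruby_path)
puts version_mentions?(ruby_path, marker) ? "marker found" : "marker not found"
```

Run it with the Ruby already installed on your system and point CHECK_RUBY at the binary you just built.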
Finally, instead of just inserting a printf(...) statement, it would be interesting to try replacing the entire ruby ... description with something else (such as perl ... and so on) ;p