replace master/slave naming convention with coordinator/worker #650

Open
wants to merge 4 commits into
base: master
4 changes: 2 additions & 2 deletions CHANGES.md
@@ -61,7 +61,7 @@ https://github.com/burke/zeus/compare/v0.15.6...v0.15.8
zeus Gem no longer requires native extensions and file monitoring is
much faster and more reliable.
* Track files from exceptions during Zeus actions in Ruby.
* Fix a thread safety in SlaveNode state access.
* Fix a thread safety in WorkerNode state access.

# 0.15.7

@@ -158,7 +158,7 @@ https://github.com/burke/zeus/compare/v0.13.1...v0.13.2

* Improved a few cases where client processes disconnect unexpectedly.

* Changed up the slave/master IPC, solving a bunch of issues on Linux, by
* Changed up the worker/coordinator IPC, solving a bunch of issues on Linux, by
switching from a socket to a pipe.

* Client terminations are now handled a bit more gracefully. The terminal is
2 changes: 1 addition & 1 deletion README.md
@@ -40,7 +40,7 @@ A: No. You can, but running `bundle exec zeus` instead of `zeus` adds precious s
It is common to see tests running twice when starting out with Zeus. If you see your tests/specs running twice, you should try disabling `require 'rspec/autotest'` and `require 'rspec/autorun'` (for RSpec), or `require 'minitest/autorun'` (for Minitest). (see [#134](https://github.com/burke/zeus/issues/134) for more information).


## Rails Set up
## Rails Set up

In your app's directory initialize zeus:

8 changes: 4 additions & 4 deletions contributing.md
@@ -10,9 +10,9 @@ One or two sentences giving an overview of the issue.

## System details

* **`uname -a`**:
* **`uname -a`**:

* **`ruby -v`**:
* **`ruby -v`**:

* **`go version`**: (only if hacking on the go code)

@@ -62,10 +62,10 @@ use to crosscompile multiple binaries.
### Context: How zeus is structured

The core of zeus is a single go program that acts as the coordinating process
(master, e.g. `zeus start`), or the client (called per-command, e.g. `zeus
(coordinator, e.g. `zeus start`), or the client (called per-command, e.g. `zeus
client`). This code is cross-compiled for a handful of different architectures
and bundled with a ruby gem. The ruby gem contains all the shim code necessary
to boot a rails app under the control of the master process.
to boot a rails app under the control of the coordinator process.

### Building

56 changes: 56 additions & 0 deletions docs/client_coordinator_handshake.md
@@ -0,0 +1,56 @@
# Client/Coordinator/Command handshake

    Client    Coordinator    Command
    1  ---------->                | Command, Arguments, Pid
    2  ---------->                | Terminal IO
    3            ----------->     | Terminal IO
    4            ----------->     | Arguments, Pid
    5            <-----------     | pid
    6  <---------                 | pid
                (time passes)
    7            <-----------     | exit status
    8  <---------                 | exit status


#### 1. Command & Arguments (Client -> Coordinator)

The Coordinator always has a UNIX domain server listening at a known socket path.

The Client connects to this server and sends a string indicating the command to run and the arguments to run it with (i.e. the ARGV). See message_format.md for more info.

#### 2. Terminal IO (Client -> Coordinator)

The Client then sends an IO over the server socket to be used for raw terminal IO.

#### 3. Arguments (Coordinator -> Command)

The Coordinator forwards the arguments it received from the Client in step 1 to the Command.

#### 4. Terminal IO (Coordinator -> Command)

The Coordinator forks a new Command process and sends it the Terminal IO from the Client.

#### 5. Pid (Command -> Coordinator)

The Command process sends the Coordinator its pid, using a Pid & Identifier message.

#### 6. Pid (Coordinator -> Client)

The Coordinator responds to the client with the pid of the newly-forked Command process.

The Client is now connected to the Command process.

#### 7. Exit status (Command -> Coordinator)

When the command terminates, it must send its exit code to the coordinator. This is normally easiest to implement as a wrapper process that does the setsid, then forks the command and `waitpid`s on it.

The form of this message is `{{code}}`, eg: `1`.
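
The wrapper process described in this step could be sketched roughly as follows. This is only an illustration: `run_under_wrapper` and `status_io` are hypothetical names, the block stands in for the real command, and reporting `1` when the command is killed by a signal is an assumption of the sketch rather than documented behaviour.

```ruby
# Hypothetical helper, not part of zeus: runs the given block as the "command"
# under a wrapper process and writes the command's exit code to status_io.
# Returns the pid of the wrapper process.
def run_under_wrapper(status_io, &command)
  fork do
    Process.setsid                          # the wrapper starts a new session
    command_pid = fork { command.call }     # fork the actual command
    _, status = Process.wait2(command_pid)  # waitpid on the command
    # exitstatus is nil when the command was killed by a signal; reporting 1
    # in that case is an assumption of this sketch.
    status_io.write((status.exitstatus || 1).to_s)
    status_io.close
    exit!(0)                                # leave without running at_exit hooks
  end
end
```

The wrapper exists so that something outlives the command and can still report its exit code. A caller would hand it the IO that leads back to the Coordinator, for example the write end of a pipe.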

#### 8. Exit status (Coordinator -> Client)

Finally, the Coordinator forwards the exit status to the Client. The command cycle is now complete.

The form of this message is `{{code}}`, eg: `1`.

See [`message_format.md`](message_format.md) for more information on messages.

56 changes: 0 additions & 56 deletions docs/client_master_handshake.md

This file was deleted.

44 changes: 44 additions & 0 deletions docs/coordinator_worker_handshake.md
@@ -0,0 +1,44 @@
# Coordinator/Worker Handshake

#### 1. Socket

The Worker is always started with an environment variable named `ZEUS_COORDINATOR_FD`. The file descriptor at the given integer value is a socket to the Coordinator process.

The Worker should open a UNIX Domain Socket using the `ZEUS_COORDINATOR_FD` File Descriptor (`globalCoordinatorSock`).

The Worker opens a new UNIX datagram Socketpair (`local`, `remote`).

The Worker sends `remote` across `globalCoordinatorSock`.
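
As a rough sketch of this step in Ruby, using only standard `socket` library calls: the helper name `open_worker_sockets` and its `env` parameter are hypothetical, and closing the Worker's copy of `remote` after sending it is an assumption of the sketch, not something this document specifies.

```ruby
require "socket"

# Hypothetical helper: performs the socket setup and returns
# [global_coordinator_sock, local].
def open_worker_sockets(env = ENV)
  # Wrap the inherited file descriptor in a socket object.
  fd = Integer(env.fetch("ZEUS_COORDINATOR_FD"))
  global_coordinator_sock = UNIXSocket.for_fd(fd)

  # New datagram socketpair; `remote` is the end handed to the Coordinator.
  local, remote = UNIXSocket.pair(:DGRAM)
  global_coordinator_sock.send_io(remote)

  # Assumption: the Worker does not need its own copy of `remote` afterwards.
  remote.close
  [global_coordinator_sock, local]
end
```

Everything the Worker later writes to `local` arrives on the descriptor the Coordinator received.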

#### 2. PID and Identifier

The Worker determines whether it has been given an Identifier. If it is the first-booted worker, it was booted
by the Coordinator, and will not have one. When a Worker forks, it is passed an Identifier by the Coordinator that it
passes along to the newly-forked process.

The Worker sends a "Pid & Identifier" message containing the pid and the identifier (blank if initial process).

#### 4. Action Result

The Worker now executes the code it's intended to run by looking up the action
in a collection of predefined actions indexed by identifier. In ruby this is implemented
as a module that responds to a method named according to each identifier.

If there were no runtime errors in evaluating the action, the Worker writes "OK" to `local`.

If there were runtime errors, the Worker returns a string describing them in an arbitrary but
hopefully helpful format. Normally it should match what the console would have shown had the errors
been raised and printed to stderr.

Before the server kills a crashed worker process, it attempts to read
any loaded files from `local`, until that socket is closed.
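
A minimal sketch of the lookup-and-report behaviour described in this step, assuming the `R:` prefix from message_format.md. `ExamplePlan`, `action_response`, and the exact error layout are hypothetical and chosen only for illustration.

```ruby
# Hypothetical action collection: each method name stands in for an identifier.
module ExamplePlan
  def self.boot
    # a real action would load application code here
  end

  def self.broken
    raise "boom"
  end
end

# Runs the action named by `identifier` and returns the Action response
# message: "R:OK" on success, otherwise "R:" followed by an error description.
def action_response(plan, identifier)
  plan.public_send(identifier)
  "R:OK"
rescue StandardError, ScriptError => e
  # Loosely mimics Ruby's stderr layout; the exact format is an assumption.
  "R:#{e.backtrace.first}: #{e.message} (#{e.class})"
end
```

The returned string would then be written to `local` as a single datagram.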

#### 5. Loaded Files

Any time after the action has been executed, the Worker may (and should) send, over `local`, a list of files
that have been newly-loaded in the course of evaluating the action.

Languages are expected to implement this using clever tricks.

Steps 1-4 happen sequentially and in order, but submitting files in Step 5 should not prevent the Worker from
handling further commands from the Coordinator. The Worker should be considered 'connected' after Step 4.
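
One possible trick in Ruby is to snapshot `$LOADED_FEATURES` before the action and report whatever has been added since. The sketch below assumes Feature messages take the form `F:` followed by the expanded path, in line with the prefix convention in message_format.md. `feature_messages` is a hypothetical helper and not necessarily how zeus implements this.

```ruby
# Hypothetical helper: builds one Feature message per file that appears in
# `after` but not in `before`. `after` defaults to the live list of loaded files.
def feature_messages(before, after = $LOADED_FEATURES)
  (after - before).map { |path| "F:#{File.expand_path(path)}" }
end

# Possible usage inside a Worker (`local` is the socket from Step 1):
#   before = $LOADED_FEATURES.dup
#   run_the_action
#   feature_messages(before).each { |message| local.send(message, 0) }
```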
44 changes: 0 additions & 44 deletions docs/master_slave_handshake.md

This file was deleted.

22 changes: 11 additions & 11 deletions docs/message_format.md
@@ -1,39 +1,39 @@
# Message Format

There are a number of different types of messages passed between Master and Slave processes.
There are a number of different types of messages passed between Coordinator and Worker processes.

In the interest of simplifying Slave libraries, messages are sent as single packets over a UNIX datagram socket, with a single-letter prefix, followed by a colon, indicating the message type.
In the interest of simplifying Worker libraries, messages are sent as single packets over a UNIX datagram socket, with a single-letter prefix, followed by a colon, indicating the message type.

The parenthesized values after each title are the message code and the handling module.

#### Pid & Identifier message (`P`, `SlaveMonitor`)
#### Pid & Identifier message (`P`, `WorkerMonitor`)

This is sent from Slave to Master immediately after booting, to identify itself.
This is sent from Worker to Coordinator immediately after booting, to identify itself.

It is formed by joining the process's pid and identifier with a colon.

Example: `P:1235:default_bundle`
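
A minimal Ruby sketch of building this message. The helper name is hypothetical, and the sketch assumes that a blank identifier leaves an empty field after the second colon.

```ruby
# Hypothetical helper: joins pid and identifier with a colon behind the "P" prefix.
def pid_and_identifier_message(pid, identifier = "")
  "P:#{pid}:#{identifier}"
end
```

A Worker might call it as `pid_and_identifier_message(Process.pid, identifier)` right after booting.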

#### Action response message (`R`, `SlaveMonitor`)
#### Action response message (`R`, `WorkerMonitor`)

This is sent from the Slave to the Master once the action has executed.
This is sent from the Worker to the Coordinator once the action has executed.

It can either be "OK", if the action was successful, or any other string, which should be a stderr-like
It can either be "OK", if the action was successful, or any other string, which should be a stderr-like
representation of the error, including stack trace if applicable.

Example: `R:OK`

Example: `R:-e:1:in '<main>': unhandled exception`

#### Spawn Slave message (`S`, `SlaveMonitor`)
#### Spawn Worker message (`S`, `WorkerMonitor`)

This is sent from the Master to the Slave and contains the Identifier of a new Slave to fork immediately.
This is sent from the Coordinator to the Worker and contains the Identifier of a new Worker to fork immediately.

Example: `S:test_environment`

#### Spawn Command message (`C`, `ClientHandler`)

This is sent from the Master to the Slave and contains the Identifier of a new Command to fork immediately.
This is sent from the Coordinator to the Worker and contains the Identifier of a new Command to fork immediately.

Example: `C:console`
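
Since every message starts with a single-letter code and a colon, a receiver can split on the first colon only, which keeps colons inside the payload intact (as in the `R:` error example above). The helpers below are a hypothetical sketch of that idea, not the parsing code zeus actually uses.

```ruby
# Hypothetical helper: splits a packet into [code, payload] on the first colon.
def parse_message(packet)
  code, payload = packet.split(":", 2)
  if payload.nil? || code.length != 1
    raise ArgumentError, "malformed message: #{packet.inspect}"
  end
  [code, payload]
end

# Hypothetical helper: splits the payload of a "P" message into [pid, identifier].
def parse_pid_and_identifier(payload)
  pid, identifier = payload.split(":", 2)
  [Integer(pid), identifier.to_s]
end
```

The returned code can then be used to dispatch to the handling module named in each heading.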

@@ -46,7 +46,7 @@ Example: `Q:testrb:-Itest -I. test/unit/module_test.rb`

#### Feature message (`F`, `FileMonitor`)

This is sent from the Slave to the Master to indicate it now depends on a file at a given path.
This is sent from the Worker to the Coordinator to indicate it now depends on a file at a given path.

The path is expected to be the full, expanded path.
