Explore merging development and CI environments #3946

Draft
wants to merge 1,038 commits into
base: main
from

Conversation

kocolosk
Member

@kocolosk kocolosk commented Feb 27, 2022

Overview

Just a work in progress right now, but the ideas I'm exploring include:

  1. Using the same container image for PRs in Jenkins and as the default container environment in e.g. VS Code
  2. Converting over to the official Erlang container images so we can automatically stay up-to-date with patch releases
  3. Using a separate FDB container for the FDB server and linking it to the development environment

Testing recommendations

Opening a PR to see how Jenkins handles the linked container approach. I'm intending to follow the pattern from https://www.jenkins.io/doc/book/pipeline/docker/#running-sidecar-containers

Checklist

  • Code is written and works correctly
  • Changes are covered by tests
  • Any new configurable parameters are documented in rel/overlay/etc/default.ini
  • A PR for documentation changes has been made in https://github.com/apache/couchdb-documentation

iilyak and others added 30 commits December 2, 2020 10:16
New `elixir-suite` Makefile target is added. It runs a predefined set of elixir
integration tests.

The feature is controlled by two files:
- test/elixir/test/config/suite.elixir - contains list of all available tests
- test/elixir/test/config/skip.elixir - contains list of tests to skip

To update `test/elixir/test/config/suite.elixir` when new tests are
added, run the following command:

```
MIX_ENV=integration mix suite > test/elixir/test/config/suite.elixir
```
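
As a rough illustration of how the two files interact, the tests to run are the suite with the skipped entries removed. This is a sketch in Python rather than the project's Elixir. The dictionary shape, the test names, and the `select_tests` helper are assumptions made for the example, not the actual format of `suite.elixir` or `skip.elixir`.

```python
def select_tests(suite, skip):
    """Return the tests to run: everything in `suite` that is not in `skip`.

    Both arguments map a test module name to a list of test names
    (a hypothetical shape chosen for this sketch).
    """
    selected = {}
    for module, tests in suite.items():
        skipped = set(skip.get(module, []))
        remaining = [t for t in tests if t not in skipped]
        if remaining:
            # Modules whose tests are all skipped are left out entirely.
            selected[module] = remaining
    return selected


# Hypothetical module and test names, for illustration only.
suite = {"ModuleA": ["test one", "test two"], "ModuleB": ["test three"]}
skip = {"ModuleA": ["test two"], "ModuleB": ["test three"]}
print(select_tests(suite, skip))
```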
Add ability to control which Elixir integration tests to run
Every endpoint except _session supports gzip-encoded requests, and there's no practical reason for the exception.

This commit enables gzip decoding on compressed requests to _session.
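
A minimal sketch of what decoding a compressed request involves, written in Python rather than CouchDB's Erlang. The `decode_body` helper and the header dictionary are assumptions for illustration and do not mirror the actual handler.

```python
import gzip


def decode_body(headers, body):
    """Return the request body, gunzipping it when Content-Encoding is gzip."""
    encoding = headers.get("Content-Encoding", "").strip().lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    # Any other (or missing) encoding is passed through untouched in this sketch.
    return body


# A client would compress the form body and label it with Content-Encoding.
compressed = gzip.compress(b"name=user&password=secret")
print(decode_body({"Content-Encoding": "gzip"}, compressed))
```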
1) The caching effort was a bust and has been removed. 2) chunkify can be done externally with a custom persist_fun.
* Simplify and speed up dev node startup

This patch introduces an escript that generates an Erlang .boot script
to start CouchDB using the in-place .beam files produced by the compile
phase of the build. This allows us to radically simplify the boot
process as Erlang computes the optimal order for loading the necessary
modules.

In addition to the simplification this approach offers a significant
speedup when working inside a container environment. In my test with
the stock .devcontainer it reduces startup time from about 75 seconds
down to under 5 seconds.

* Rename boot_node to monitor_parent

* Add formatting suggestions from python-black

Co-authored-by: Paul J. Davis <[email protected]>
* Add a development container config for VS Code

This creates a development environment with a FoundationDB server
and a CouchDB layer in two containers, sharing a network through
Docker Compose.

It uses the FDB image published to Docker Hub for the FDB container,
and downloads the FDB client packages from foundationdb.org to provide
the development headers and libraries. www.foundationdb.org is actually
not trusted in Debian Buster by default, so we have to download the
GeoTrust_Global_CA.pem. The following link has more details:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=962596

Once the Docker Compose setup is running, VS Code executes the
create_cluster_file.bash script to write down a cluster file containing
the IP address in the compose network where the FDB service can be
found. This cluster file is used both for a user-driven invocation of
`./dev/run`, as well as for unit tests that require a running CouchDB.
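
For readers unfamiliar with the cluster file, the following Python sketch shows the general idea. It is not the contents of create_cluster_file.bash. The `description:ID@IP:PORT` layout is the FoundationDB cluster file format, while the `docker:docker` prefix, port 4500, the helper names, and the hostname argument are assumptions made for the example.

```python
import socket


def cluster_file_contents(ip, port=4500, description="docker", cluster_id="docker"):
    """Build a single-coordinator cluster file line: description:ID@IP:PORT."""
    return f"{description}:{cluster_id}@{ip}:{port}\n"


def write_cluster_file(path, hostname, port=4500):
    """Resolve the FDB service name on the compose network and write the file."""
    ip = socket.gethostbyname(hostname)  # the file stores an IP address, not a name
    with open(path, "w") as f:
        f.write(cluster_file_contents(ip, port))


# Address chosen arbitrarily for the example.
print(cluster_file_contents("172.18.0.2"), end="")
```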

Additionally, I've got a small fix to the way we run explicitly specified
eunit tests:

* Run eunit tests for each app separately

The `eunit` target executes a for loop that appears intended to use a
separate invocation of rebar for each Erlang application's unit tests.
When running `make eunit` without any arguments this works correctly,
as the for loop processes the output of `ls src`. But if you specify a
comma-delimited list of applications the for loop will treat that as a
single argument and pass it down to rebar. This asymmetry is
surprising, but also seems to cause some issues with environment
variables not being inherited by the environment used to execute the
tests for the 2..N applications in the list. I didn't bother digging
into the rebar source code to figure out what was happening there.

This patch just parses the incoming comma-delimited list with `sed` to
create a whitespace-delimited list for the loop, so we get the same
behavior regardless of whether we are specifying applications
explicitly or not.
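
The patch itself does this with `sed` inside the Makefile. The Python sketch below only mirrors the logic: one list of applications, built the same way whether or not the caller named them. The function name and arguments are hypothetical.

```python
def apps_to_test(apps_arg, all_apps):
    """Return one entry per application, so each gets its own test invocation.

    `apps_arg` is the comma-delimited list given by the caller (may be empty);
    `all_apps` stands in for the output of `ls src`.
    """
    if not apps_arg:
        return list(all_apps)
    # Split on commas and drop empty entries such as a trailing comma.
    return [app for app in apps_arg.split(",") if app]


for app in apps_to_test("couch,chttpd", ["couch", "chttpd", "fabric"]):
    print(f"would run the unit tests for {app} in a separate invocation")
```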
…g: chunked (#3360)

When PUTing a multipart/related document + attachments, Transfer-Encoding: chunked causes the server to wait indefinitely, then issue a 500 error when the client finally hangs up.

This commit fixes that issue by adding proper handling for chunked multipart/related requests.
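
As background on the wire format involved, here is a small Python sketch of decoding a chunked body as defined by HTTP/1.1. It illustrates the framing only and makes no claim about how the CouchDB fix is implemented. `decode_chunked` is a hypothetical helper that ignores chunk extensions and trailers.

```python
def decode_chunked(data):
    """Decode an HTTP/1.1 chunked message body held entirely in memory."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # The size line is hexadecimal; anything after ';' is a chunk extension.
        size_field = data[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break  # the zero-length chunk terminates the body
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)


print(decode_chunked(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"))  # b'Wikipedia'
```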
This allows users to verify that compaction processes are suspended
outside of any configured strict_window.
These two test cases expose the subtle bug in ebtree:lookup_multi/3
where a key that doesn't exist in the tree can prevent a subsequent
lookup key from matching in the same KV node.
If one of the provided lookup keys doesn't exist in the ebtree, it can
inadvertently prevent a second lookup key from being found if the
first key greater than the missing lookup key is equal to the second
lookup key.
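
The class of bug described above can be reproduced with a merge-style scan over one sorted node. The Python below is a reconstruction for illustration, not ebtree's Erlang code. The `buggy` flag and the data are hypothetical and exist only to show the failure next to the expected behavior.

```python
def lookup_multi(node, keys, buggy=False):
    """Look up sorted `keys` in `node`, a sorted list of (key, value) pairs."""
    found = []
    keys = sorted(keys)
    i = j = 0
    while i < len(node) and j < len(keys):
        k, v = node[i]
        if k < keys[j]:
            i += 1  # node entry sorts before the lookup key: skip it
        elif k == keys[j]:
            found.append((k, v))
            j += 1  # matched: move on to the next lookup key
        else:
            j += 1  # lookup key is absent from this node
            if buggy:
                i += 1  # mistake: also discards the entry that ended the search
    return found


node = [(1, "a"), (3, "c")]
print(lookup_multi(node, [2, 3], buggy=True))  # [] because key 3 was consumed while searching for 2
print(lookup_multi(node, [2, 3]))              # [(3, 'c')]
```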
A tidier version of #3384 that
saves an unnecessary call to collate.
Previously, when an erlfdb error occurred and a recursive call to `update/3` was
made, the result of that call was always matched against `{Mrst, State}`.
However, when the call had finalized and returned the
`couch_eval:release_map_context/1` response, the result would be `ok`, which
would blow up with a badmatch error against `{Mrst, State}`.
Will and others added 22 commits December 17, 2021 10:57
* Win32-SM91 support and fixes

* spidermonkey_68 identified as spidermonkey_60 and erroneously(?) blocked by configure on aarch64 #3149

* remove unnecessary shell when setting ERL_AFLAGS

* fix foundationdb urls in github workflow

* quote AFLAGS like win echo and fix references to pwd

Co-authored-by: Will <[email protected]>
Instead of building one image with all supported Erlang versions through
kerl, this configuration looks for a specific container image for each
Erlang version. Decoupling it like this enables us to more easily adopt
newer distros for newer Erlang versions, and to build new images with
patch releases of Erlang without needing a simultaneous PR to the
CouchDB repo to pick them up in CI (although some change to Jenkins
might be needed to avoid images being cached for too long when a stable
tag changes).
This avoids the situation where a build fails with a timeout because
all the docker-based agents were busy running other jobs. Jenkins'
semantics for options.timeout is that the stage-specific timeout starts
the countdown even while waiting for an agent matching the selected
label to become available. We see occasional spurious job failures as a
result.
This makes it easier to observe the pipeline progress in the UI. We get
timings for each step in the build, and if one of the steps fails the
logs for that step will be the only ones expanded by default. We can
also label each of the steps to provide a bit more context to the
developer about what the CI pipeline is actually doing.
It seems different versions of mix cannot agree on how these lines
ought to be formatted, so let's try to help out.
This is one of those situations where you go in to make a small change,
see an opportunity for some refactoring, and get sucked into a rabbit
hole that leaves you wondering if you have any idea how computers
actually work. My initial goal was simply to update the Erlang version
used in our binary packages to a modern supported release. Along the
way I decided I wanted to figure out how to eliminate all the copypasta
we generate for making any change to this file, and after a few days of
hacking here we are. This rewrite has the following features:

* Updates to use Debian 11 (current stable) as the base image for
  building releases and packaging repos.

* Defaults to Erlang 24.2 as the embedded Erlang version in packages.

* Dynamically generates the parallel build stages used to test and
  package CouchDB on various OSes. This is accomplished through a bit
  of scripted pipeline code that relies on two new methods defined at
  the beginning of the Jenkinsfile, one for "native" builds on macOS
  and FreeBSD and one for container-based builds. See comments in the
  Jenkinsfile for additional details.

* Expands commands like `make check` into a series of steps to improve
  visibility. The Jenkins UI will now show the time spent in each step
  of the build process, and if a step (e.g. `make eunit`) fails it will
  only expand the logs for that step by default instead of showing the
  logs for the entire build stage. The downside is that if we do make
  changes to the series of targets underneath `check` we need to
  remember to update the Jenkinsfile as well.

* Starts per-stage timer _after_ agent is acquired. Previously builds could
  fail with a 15m timeout when all they did was sit in the build queue.
Credit to @nickva for the original improvements. The main branch is
already Erlang 21+ so the minimum version check is less essential, but
the performance improvements are greatly appreciated!
* Remove emilio-related Python script

The Emilio style checker was removed in #3674.

* Remove unused scripts from autotools days

* Update credo to support Elixir v1.12

* Ensure the bin directory sticks around
Added the commit message conventions from the proposal in discussion #3918, updated all links to use https, and moved all external links to the end of the file.
Long overdue, lots of build improvements and a couple of bug fixes
in that patch release.
Still a work in progress, but the idea is that developers should be
working with the same base image that we use to validate Pull Requests
in CI. I've also started to add a GitHub Action that could publish
these devcontainer images on a regularly scheduled basis to pick up
fixes and new patch releases from upstream.