Prepare Flintrock 2.1.0 release (#369)
- Tweak the license file so GitHub recognizes it.
- Fix a mistake in the manifest file so the change log is included as intended.
- Update the default Amazon Linux 2 AMI.
- Update and trim the main README a bit.
- Adopt pyproject.toml. It is "strongly recommended" and commands like python setup.py sdist bdist_wheel are deprecated in favor of python -m build (see the short sketch after this list).
- Trim outdated comments and pin of cryptography from setup.py.
- Update testing code for setting up private VPC.
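
As a rough illustration of the pyproject.toml bullet above (a sketch only, assuming a standard setuptools project; it is not part of this commit), the old and new build invocations compare as follows:

```sh
# Deprecated: invokes setup.py directly via setuptools' legacy CLI.
python setup.py sdist bdist_wheel

# Modern equivalent: the PEP 517 build frontend reads pyproject.toml
# and writes the sdist and wheel to dist/ by default.
python -m pip install build
python -m build
```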
nchammas committed Nov 27, 2023
1 parent d45a2c8 commit 5d4e653
Showing 19 changed files with 92 additions and 91 deletions.
10 changes: 7 additions & 3 deletions .github/workflows/flintrock.yaml
@@ -17,6 +17,8 @@ jobs:
- ubuntu-20.04
- macos-11
python-version:
# Update the artifact upload steps below if modifying
# this list of Python versions.
- "3.8"
- "3.9"
- "3.10"
@@ -32,14 +34,16 @@
architecture: x64
- run: "pip install -r requirements/maintainer.pip"
- run: "pytest"
- run: python setup.py sdist bdist_wheel
- run: python -m build
- uses: actions/upload-artifact@v3
if: ${{ matrix.python-version == '3.9' }}
# Use the latest supported Python to build a standalone package.
if: ${{ matrix.python-version == '3.12' }}
with:
name: Flintrock Standalone - ${{ matrix.os }}
path: dist/Flintrock-*-standalone-*.zip
- uses: actions/upload-artifact@v3
if: ${{ matrix.os == 'ubuntu-20.04' && matrix.python-version == '3.9' }}
# Use the oldest supported Python to build a wheel.
if: ${{ matrix.os == 'ubuntu-20.04' && matrix.python-version == '3.8' }}
with:
name: Flintrock Wheel
path: dist/Flintrock-*.whl
10 changes: 9 additions & 1 deletion CHANGES.md
@@ -2,20 +2,28 @@

## [Unreleased]

[Unreleased]: https://github.com/nchammas/flintrock/compare/v2.0.0...master
[Unreleased]: https://github.com/nchammas/flintrock/compare/v2.1.0...master

Nothing notable yet.

## [2.1.0] - 2023-11-26

[2.1.0]: https://github.com/nchammas/flintrock/compare/v2.0.0...2.1.0

### Changed

* [#348], [#367]: Bumped default Spark to 3.5.0 and default Hadoop to 3.3.6; dropped support for Python 3.6 and 3.7; added CI builds for Python 3.10, 3.11, and 3.12.
* [#361]: Migrated from AdoptOpenJDK, which is deprecated, to Adoptium OpenJDK.
* [#362], [#366]: Improved Flintrock's ability to cleanup after launch failures.
* [#366]: Deprecated `--ec2-spot-request-duration`, which is not needed for one-time spot instances launched using the RunInstances API.
* [#369]: Adopted `pyproject.toml` and tweaked Flintrock's Python packaging accordingly. This keeps Flintrock in line with modern Python packaging standards and should be transparent to end-users.

[#348]: https://github.com/nchammas/flintrock/pull/348
[#361]: https://github.com/nchammas/flintrock/pull/361
[#362]: https://github.com/nchammas/flintrock/pull/362
[#366]: https://github.com/nchammas/flintrock/pull/366
[#367]: https://github.com/nchammas/flintrock/pull/367
[#369]: https://github.com/nchammas/flintrock/pull/369

## [2.0.0] - 2021-06-10

7 changes: 3 additions & 4 deletions LICENSE
@@ -1,4 +1,3 @@

Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
@@ -179,15 +178,15 @@
APPENDIX: How to apply the Apache License to your work.

To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "{}"
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.

Copyright {yyyy} {name of copyright owner}
Copyright 2024 Nicholas Chammas

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
@@ -199,4 +198,4 @@
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
limitations under the License.
4 changes: 2 additions & 2 deletions MANIFEST.in
@@ -1,8 +1,8 @@
# See: https://docs.python.org/3/distutils/commandref.html
# See: https://setuptools.pypa.io/en/latest/userguide/miscellaneous.html
graft flintrock

include README.md
include CHANGELOG.md
include CHANGES.md
include COPYRIGHT
include LICENSE

78 changes: 25 additions & 53 deletions README.md
@@ -33,7 +33,7 @@ flintrock launch test-cluster \
--spark-version 3.5.0 \
--ec2-key-name key_name \
--ec2-identity-file /path/to/key.pem \
--ec2-ami ami-0aeeebd8d2ab47354 \
--ec2-ami ami-0588935a949f9ff17 \
--ec2-user ec2-user
```

@@ -123,10 +123,17 @@ without too much trouble, too.

### Release version

To get the latest release of Flintrock, simply run [pip](https://pip.pypa.io/en/stable/):
To get the latest release of Flintrock, simply install it with [pip][pip].

Since Flintrock is a command-line application rather than a library, you may prefer to
install it using [pipx][pipx], which automatically takes care of installing Flintrock to
an isolated virtual environment for you.

[pip]: https://pip.pypa.io/en/stable/
[pipx]: https://pypa.github.io/pipx/

```
pip3 install flintrock
pipx install flintrock
```

This will install Flintrock and place it on your path. You should be good to go now!
@@ -140,27 +147,14 @@ flintrock configure

### Standalone version (Python not required!)

If you don't have a recent enough version of Python, or if you don't have Python installed at all,
you can still use Flintrock. We publish standalone packages of Flintrock on GitHub with our
[releases](https://github.com/nchammas/flintrock/releases).

Find the standalone package for your OS under our [latest release](https://github.com/nchammas/flintrock/releases/latest),
unzip it to a location of your choice, and run the `flintrock` executable inside.

For example:

```sh
flintrock_version="2.0.0"

curl --location --remote-name "https://github.com/nchammas/flintrock/releases/download/v$flintrock_version/Flintrock-$flintrock_version-standalone-macOS-x86_64.zip"
unzip -q -d flintrock "Flintrock-$flintrock_version-standalone-macOS-x86_64.zip"
cd flintrock/
We used to publish standalone versions of Flintrock that don't require you to have Python
installed on your machine. Since Flintrock 2.1.0, we have stopped publishing these
standalone builds.

# You're good to go!
./flintrock --help
```
If you used these standalone packages, please [chime in on this issue][standalone] and
share a bit about your environment and use case.

You'll probably want to add the location of the Flintrock executable to your `PATH` so that you can invoke it from any directory.
[standalone]: https://github.com/nchammas/flintrock/issues/370

### Community-supported distributions

@@ -175,7 +169,7 @@ These packages are not supported by the core contributors and **may be out of da
If you like living on the edge, install the development version of Flintrock:

```sh
pip3 install git+https://github.com/nchammas/flintrock
pipx install git+https://github.com/nchammas/flintrock
```

If you want to [contribute](https://github.com/nchammas/flintrock/blob/master/CONTRIBUTING.md), follow the instructions in our contributing guide on [how to install Flintrock](https://github.com/nchammas/flintrock/blob/master/CONTRIBUTING.md#contributing-code).
@@ -203,17 +197,17 @@ There are some things that Flintrock specifically *does not* support.

Flintrock is not for managing long-lived clusters, or any infrastructure that serves as a permanent part of some environment.

For starters, Flintrock provides no guarantee that clusters launched with one version of Flintrock can be managed by another version of Flintrock, and no considerations are made for any long-term use cases.
For starters, Flintrock provides no guarantee that clusters launched with one version of Flintrock can be managed by another version of Flintrock, and no considerations are made for any long-term use cases.

If you are looking for ways to manage permanent infrastructure, look at tools like [Terraform](https://www.terraform.io/), [Ansible](http://www.ansible.com/), [SaltStack](http://saltstack.com/), or [Ubuntu Juju](http://www.ubuntu.com/cloud/tools/juju). You might also find a service like [Databricks](https://databricks.com/product/databricks) useful if you're looking for someone else to host and manage Spark for you. Amazon also offers [Spark on EMR](https://aws.amazon.com/elasticmapreduce/details/spark/).
If you are looking for ways to manage permanent infrastructure, look at tools like [Terraform](https://www.terraform.io/), [Ansible](http://www.ansible.com/), or [Ubuntu Juju](http://www.ubuntu.com/cloud/tools/juju). You might also find a service like [Databricks](https://databricks.com/product/databricks) useful if you're looking for someone else to host and manage Spark for you. Amazon also offers [Spark on EMR](https://aws.amazon.com/elasticmapreduce/details/spark/).

### Launching non-Spark-related services

Flintrock is meant for launching Spark clusters that include closely related services like HDFS, Mesos, and YARN.
Flintrock is meant for launching Spark clusters that include closely related services like HDFS.

Flintrock is not for launching external datasources (e.g. Cassandra), or other services that are not closely integrated with Spark (e.g. Tez).
Flintrock is not for launching external datasources (e.g. Cassandra), or other services that are not closely integrated with Spark (e.g. Tez).

If you are looking for an easy way to launch other services from the Hadoop ecosystem, look at the [Apache Bigtop](http://bigtop.apache.org/) project.
If you are looking for an easy way to launch other services from the Hadoop ecosystem, look at the [Apache Bigtop](http://bigtop.apache.org/) project.

### Launching out-of-date services

@@ -263,7 +257,7 @@ providers:
identity-file: /path/to/.ssh/key.pem
instance-type: m5.large
region: us-east-1
ami: ami-0aeeebd8d2ab47354
ami: ami-0588935a949f9ff17
user: ec2-user
```

@@ -283,29 +277,7 @@ flintrock launch test-cluster \

### Fast Launches

Flintrock is really fast. This is how quickly it can launch fully operational clusters on EC2 compared to [spark-ec2](https://github.com/amplab/spark-ec2).

#### Setup

* Provider: EC2
* Instance type: `m3.large`
* AMI:
* Flintrock: [Default Amazon Linux AMI](https://aws.amazon.com/amazon-linux-ami/)
* spark-ec2: [Custom spark-ec2 AMI](https://github.com/amplab/spark-ec2/tree/a990752575cd8b0ab25731d7820a55c714798ec3/ami-list)
* Spark/Hadoop download source: S3
* Launch time: Best of 6 tries

#### Results

| Cluster Size | Flintrock Launch Time | spark-ec2 Launch Time |
|---------------|----------------------:|------------------------:|
| 1 slave | 2m 06s | 8m 44s |
| 50 slaves | 2m 30s | 37m 30s |
| 100 slaves | 2m 42s | 1h 06m 05s |

The spark-ec2 launch times are sourced from [SPARK-5189](https://issues.apache.org/jira/browse/SPARK-5189).

Note that AWS performance is highly variable, so you will not get these results consistently. They show the best case scenario for each tool, and not the typical case. For Flintrock, the typical launch time will be a minute or two longer.
Flintrock is really fast. It can launch a 100-node cluster in about three minutes (give or take a few seconds due to AWS's normal performance variability).

### Advanced Storage Setup

@@ -330,7 +302,7 @@ Flintrock is built and tested against vanilla Amazon Linux and CentOS. You can e

Supporting multiple versions of anything is tough. There's more surface area to cover for testing, and over the long term the maintenance burden of supporting something non-current with bug fixes and workarounds really adds up.

There are projects that support stuff across a wide cut of language or API versions. For example, Spark supports Java 7 and 8, and Python 2.6+ and 3+. The people behind these projects are gods. They take on an immense maintenance burden for the benefit and convenience of their users.
There are projects that support stuff across a wide cut of language or API versions. For example, Spark supports multiple versions of Java, Scala, R, and Python. The people behind these projects are gods. They take on an immense maintenance burden for the benefit and convenience of their users.

We here at project Flintrock are much more modest in our abilities. We are best able to serve the project over the long term when we limit ourselves to supporting a small but widely applicable set of configurations.

3 changes: 1 addition & 2 deletions flintrock/__init__.py
@@ -1,2 +1 @@
# See: https://packaging.python.org/en/latest/distributing/#standards-compliance-for-interoperability
__version__ = '2.1.0.dev0'
__version__ = '2.1.0'
2 changes: 1 addition & 1 deletion flintrock/config.yaml.template
@@ -30,7 +30,7 @@ providers:
instance-type: m5.large
region: us-east-1
# availability-zone: <name>
ami: ami-0cabc39acf991f4f1 # Amazon Linux 2, us-east-1
ami: ami-0588935a949f9ff17 # Amazon Linux 2, us-east-1
user: ec2-user
# ami: ami-61bbf104 # CentOS 7, us-east-1
# user: centos
5 changes: 5 additions & 0 deletions pyproject.toml
@@ -0,0 +1,5 @@
# Minimal pyproject file per: https://packaging.python.org/en/latest/guides/modernize-setup-py-project/
[build-system]
# Minimum setuptools version that supports version in setup.cfg per: https://packaging.python.org/en/latest/guides/single-sourcing-package-version/
requires = ["setuptools >= 46.4.0"]
build-backend = "setuptools.build_meta"
1 change: 0 additions & 1 deletion requirements/developer.pip
@@ -36,7 +36,6 @@ coverage[toml]==7.3.2
cryptography==41.0.5
# via
# -r requirements/user.pip
# flintrock
# paramiko
exceptiongroup==1.2.0
# via pytest
1 change: 1 addition & 0 deletions requirements/maintainer.in
@@ -2,3 +2,4 @@
wheel >= 0.31.0
twine == 4.0.2
PyInstaller == 6.2.0
build >= 1.0.3, < 2.0.0
15 changes: 11 additions & 4 deletions requirements/maintainer.pip
@@ -24,6 +24,8 @@ botocore==1.32.4
# boto3
# flintrock
# s3transfer
build==1.0.3
# via -r requirements/maintainer.in
certifi==2023.11.17
# via requests
cffi==1.16.0
@@ -45,7 +47,6 @@ coverage[toml]==7.3.2
cryptography==41.0.5
# via
# -r requirements/developer.pip
# flintrock
# paramiko
docutils==0.20.1
# via readme-renderer
@@ -55,10 +56,11 @@ exceptiongroup==1.2.0
# pytest
flake8==6.1.0
# via -r requirements/developer.pip
idna==3.4
idna==3.6
# via requests
importlib-metadata==6.8.0
# via
# build
# keyring
# pyinstaller
# twine
@@ -94,6 +96,7 @@ nh3==0.2.14
packaging==23.2
# via
# -r requirements/developer.pip
# build
# pyinstaller
# pytest
paramiko==3.3.1
@@ -118,7 +121,7 @@ pyflakes==3.1.0
# via
# -r requirements/developer.pip
# flake8
pygments==2.17.1
pygments==2.17.2
# via
# readme-renderer
# rich
@@ -130,6 +133,8 @@ pynacl==1.5.0
# via
# -r requirements/developer.pip
# paramiko
pyproject-hooks==1.0.0
# via build
pytest==7.4.3
# via
# -r requirements/developer.pip
@@ -167,7 +172,9 @@ six==1.16.0
tomli==2.0.1
# via
# -r requirements/developer.pip
# build
# coverage
# pyproject-hooks
# pytest
twine==4.0.2
# via -r requirements/maintainer.in
@@ -179,7 +186,7 @@ urllib3==1.26.18
# botocore
# requests
# twine
wheel==0.41.3
wheel==0.42.0
# via -r requirements/maintainer.in
zipp==3.17.0
# via
2 changes: 1 addition & 1 deletion requirements/user.in
@@ -6,4 +6,4 @@
# See: https://caremad.io/2013/07/setup-vs-requirement/
# - The #egg= syntax is a workaround for pip-tools.
# See: https://github.com/jazzband/pip-tools/issues/204#issuecomment-550051424
-e file:.#egg=Flintrock
--editable file:.#egg=Flintrock
4 changes: 1 addition & 3 deletions requirements/user.pip
@@ -22,9 +22,7 @@ cffi==1.16.0
click==8.1.7
# via flintrock
cryptography==41.0.5
# via
# flintrock
# paramiko
# via paramiko
jmespath==1.0.1
# via
# boto3
4 changes: 4 additions & 0 deletions setup.cfg
@@ -1,3 +1,7 @@
# See: https://packaging.python.org/en/latest/guides/single-sourcing-package-version/
[metadata]
version = attr: flintrock.__version__

[tool:pytest]
norecursedirs = venv
addopts =
