Google Summer of Code

| Sub organizations | IDEAS LIST | Student guides |

NumFOCUS will be applying again as an umbrella mentoring organization for Google Summer of Code 2024. NumFOCUS supports and promotes world-class, innovative, open source scientific software.

NumFOCUS is committed to promoting and sustaining a professional and ethical community. Our Code of Conduct is our effort to uphold these values and it provides a guideline and some of the tools and resources necessary to achieve this.

Google Summer of Code is an annual open source internship program sponsored by Google. This repository contains information specific to NumFOCUS' participation in GSoC. For general information about the program, including this year's application timeline and key phases involved, please see the GSoC website.

This Git repository stores information about NumFOCUS' application for Google Summer of Code in the current and previous years.

Students

NumFOCUS is participating as an umbrella organization. This means that you will need to identify a specific project to apply to under the NumFOCUS umbrella. (Projects are listed below.)

Read this document to learn how to apply for the GSoC program with NumFOCUS. Please also check out our ideas list.

For any questions, please open an issue in our issue tracker or send an email to [email protected], our mailing list address. Please also consider subscribing to the mailing list at https://groups.google.com/a/numfocus.org/forum/#!forum/gsoc.

Sub Organizations

If you want to participate as a sub organization of NumFOCUS, please read this guide.

Organizations Confirmed Under NumFOCUS Umbrella

In alphabetical order.

aeon

aeon is an open-source scikit-learn compatible toolkit for time series tasks such as forecasting, classification, regression, clustering, anomaly detection and segmentation. It provides a broad library of time series algorithms, including efficient implementations of the latest advances in research.

Website | Ideas Page | Slack | Source Code

AiiDA

AiiDA is a Python framework for managing computational science workflows, with roots in computational materials science. It helps researchers manage large numbers of simulations (10k, 100k, 1M, ...) and complex workflows involving multiple executables. At the same time, it records the provenance of the entire simulation pipeline with the aim of making it fully reproducible.

Website | Ideas List | Discourse | Source Code

ArviZ

ArviZ is a project dedicated to promoting and building tools for exploratory analysis of Bayesian models. It currently has a Python and a Julia interface. ArviZ aims to integrate seamlessly with established probabilistic programming languages like PyStan, PyMC, Turing, Soss, emcee, or Pyro. Where the probabilistic programming languages aim to make it easy to build and solve Bayesian models, the ArviZ libraries aim to make it easy to process and analyze the results from those Bayesian models.

Website | Ideas List | Contact (Gitter) | Source Code

Bambi

Bambi (BAyesian Model Building Interface) is an open source Python package designed to make it easier for practitioners to build statistical models from a wide range of families using a formula notation similar to that found in R. It is built on top of the PyMC probabilistic programming framework and the ArviZ package for exploratory analysis of Bayesian models.

Website | Ideas List | Discussions | Source Code

biocommons

The biocommons is a community that fosters collaboration on pre-competitive, interoperable, and high-quality bioinformatics open source software and data, primarily for biological sequence analysis and interpretation. Our software is used by clinical genetics/diagnostics companies, computational biologists and scientists, and tool and database developers.

Website | Project Ideas | Getting Connected | GitHub

CB-Geo MPM

CB-Geo MPM is an HPC-enabled Material Point Method solver for large-deformation modeling. It supports isoparametric elements to model complex geometries and creates photo-realistic renderings.

Website | Ideas List | Discussions | Source Code

Colour

Colour is an open-source Python package providing a comprehensive number of algorithms and datasets for colour science.

It is freely available under the New BSD License terms.

Website | Ideas List | Contact | Source Code

CuPy

CuPy is a NumPy/SciPy-compatible array library for GPU-accelerated computing with Python. CuPy acts as a drop-in replacement to run existing NumPy/SciPy code on NVIDIA CUDA or AMD ROCm platforms.

Website | Ideas List | Chat on Gitter | Contact | Source Code

Conda Forge

A community-led collection of recipes, build infrastructure, and distributions for the conda package manager.

Website | Ideas List | Gitter | Source Code

Data Retriever

The Data Retriever ecosystem improves reproducible research through data product management. The platform takes advantage of freely available data sources in a variety of formats, standardizes them, and makes them available to scientists in a form that is ready to analyze. Data sources range from tabular data to spatial data packages and APIs. Several data packages use the ecosystem, and many projects support or rely on it.

Website | Ideas List | Contact (Gitter) | Source Code

FEniCS

FEniCS is an automated finite element library used to solve equations that arise in modeling, featuring a domain-specific language and automated code generation. Users input a problem that looks very much like mathematical notation; FEniCS then translates that into computer code. It numerically solves problems for which there is no analytical (exact) solution.

Website | Ideas List | Contact | Source Code

FluxML

FluxML is a 100%-pure Julia machine learning stack built on top of Julia's native automatic differentiation and GPU support. Our organization maintains packages for building and training neural networks, data pre-processing pipelines, standard deep learning models, automatic differentiation, and more. By writing our complete toolchain in Julia, we aim to make machine learning simple, extensible, and performant.

Website | Ideas List | Contact (Slack or Zulip) | Source Code

Gridap

Gridap is a new generation, open-source, finite element (FE) library implemented in the Julia programming language. Gridap aims at adopting a more modern programming style than existing FE applications written in C/C++ or Fortran.

Website | Ideas List | Contact (Gitter) | Source Code

JuMP

JuMP is a modeling language and collection of supporting packages for mathematical optimization in Julia. JuMP makes it easy to formulate and solve a range of problem classes, including linear programs, integer programs, conic programs, semidefinite programs, and constrained nonlinear programs.

Website | Ideas List | Contact | Source Code

JupyterLab

JupyterLab is a web-based interactive development environment for notebooks, code, and data. Its flexible interface allows users to configure and arrange workflows in data science, scientific computing, computational journalism, and machine learning. A modular design invites extensions to expand and enrich functionality.

JupyterLab is a project of the Jupyter organization: free software, open standards, and web services for interactive computing across all programming languages.

JupyterLab Website | Jupyter Website | Ideas List | Contact | Source Code

Matplotlib

Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. Matplotlib makes easy things easy and hard things possible.

Website | Ideas List | Gitter | Source Code

Mesa

Mesa allows users to quickly create agent-based models using built-in core components (such as spatial grids and agent schedulers) or customized implementations; visualize them using a browser-based interface; and analyze their results using Python’s data analysis tools. Its goal is to be the Python 3-based counterpart to NetLogo, Repast, or MASON.

Website | Ideas Page | Contact (Mailing List) | Source Code

NetworkX

NetworkX is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.

Website | Ideas Page | Contact (GitHub Discussions) | Source Code

OpenFHE

OpenFHE is an open-source Fully Homomorphic Encryption (FHE) library that includes efficient implementations of all common FHE schemes: BFV, BGV, CKKS, DM and CGGI.

Website | Ideas Page | Contact Us (Discourse) | Source Code

Open Science Labs

Open Science Labs is a global community dedicated to creating an open space for teaching, learning, and sharing information about open science and computational tools. Our community develops tools that address real-world problems and collaborates with other projects and workgroups to improve technology and create international opportunities for our community. Although our focus may seem broad, we initially prioritize supporting Research Software Engineers (RSEs) who often face computational challenges in their work.

Website | Ideas Page | Contact (GitHub Discussions) | Source Code

Optuna

Optuna is an open source hyperparameter optimization framework to automate hyperparameter search. Optuna features (1) a define-by-run interface for defining search spaces, (2) state-of-the-art algorithms to efficiently search large spaces and prune unpromising trials for faster results, and (3) easy parallelization of hyperparameter searches over multiple threads or processes without modifying code.

Website | Ideas Page | Contact ([email protected]) | Source Code

pvlib

pvlib python provides a set of functions and classes for simulating the performance of photovoltaic energy systems.

Website | Google Group Forum | Ideas Page | Source Code

PyBaMM

PyBaMM (Python Battery Mathematical Modelling) solves physics-based electrochemical DAE models by using state-of-the-art automatic differentiation and numerical solvers.

Website | Contact | Ideas Page | Source Code

PyLops

PyLops is an open-source Python library focused on providing a backend-agnostic, idiomatic, matrix-free library of linear operators and related computations. It is inspired by the iconic MATLAB Spot – A Linear-Operator Toolbox project.

Website | Slack | Ideas Page | Source Code

PyMC

PyMC is a Python module for Bayesian statistical modeling and model fitting which focuses on advanced Markov chain Monte Carlo and variational fitting algorithms. Its flexibility and extensibility make it applicable to a large suite of problems.

Website | Discourse | Ideas Page | Source Code

PySAL

PySAL is a Python library for geographical data science. It consists of 18 subpackages that cover a wide range of spatial analytical methods, including exploratory spatial data analysis, spatial interaction modeling, spatial optimization, spatial econometrics, segregation, and spatial interpolation, among others.

Website | Gitter | Ideas Page | Source Code

PyTorch-Ignite

PyTorch-Ignite is a high-level library to help with training neural networks in PyTorch.

Website | Discord | GitHub Discussions | Ideas Page | Source Code

QuTiP

QuTiP is software for simulating quantum systems. QuTiP aims to provide tools for user-friendly and efficient numerical simulations of open quantum systems. It can be used to simulate a wide range of physical phenomena in areas such as quantum optics, trapped ions, superconducting circuits, and quantum nanomechanical resonators. In addition, it contains a number of other modules to simplify the numerical simulation and study of many topics in quantum physics, such as quantum optimal control, quantum information, and computing.

Website | Contact | Ideas Page | Source Code

SciML

SciML is an open source software organization created to unify the packages for scientific machine learning. This includes the development of modular scientific simulation support software, such as differential equation solvers, along with the methodologies for inverse problems and automated model discovery. By providing a diverse set of tools with a common interface, we provide a modular, easily-extendable, and highly performant ecosystem for handling a wide variety of scientific simulations.

Website | Contact | Ideas Page | Source Code

Taskflow

Taskflow enables parallel and heterogeneous programming with both high performance and high productivity.

Website | Contact | Ideas Page | Source Code

TNL

TNL is a collection of building blocks that facilitate the development of efficient numerical solvers and HPC algorithms. It is implemented in C++ using modern programming paradigms in order to provide a flexible and user-friendly interface similar to, for example, the STL library. TNL provides native support for modern hardware architectures such as multicore CPUs, GPUs, and distributed systems, which can be managed via a unified interface.

Website | Gitter | Ideas Page | Source Code

Zarr

Zarr is a file storage format for chunked, compressed, N-dimensional arrays based on an open-source specification.

Website | Gitter | Ideas Page | Source Code