Hub distributions and features

2i2c builds and operates distributions of JupyterHubs that are tailored for particular use-cases. These services share many of the same infrastructure components, but have customizations and optimizations that are more domain- or community-specific.

Note

Our services are in an "alpha" state - we are still learning a lot about the best way that these hubs can serve communities in research and education. The infrastructure and service may change over the coming months! See our strategy page for an overview of what we're hoping to do and where we're headed next.

For more information about specific hub distributions, see the links below. Otherwise, read onward for high-level information about all of our Managed JupyterHubs.

What technology makes up each hub?

🚀 core infrastructure

At the core of each 2i2c hub is a JupyterHub. It provides interactive computing sessions for each of your users and connects to the other infrastructure in the cloud. We use Auth0 and CILogon to authenticate users; these can connect to a number of other authentication protocols (such as OAuth2).

💻 interfaces

Each 2i2c JupyterHub has two main interactive interfaces: Jupyter interfaces (Notebook and Lab), and RStudio. Each of them is accessible from your session via /tree, /lab, and /rstudio endpoints in your URL.
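As a rough illustration of how the three endpoints map onto URLs, the sketch below builds a link for each interface. It assumes the hub serves each user's session under /user/<username>/, which is JupyterHub's default layout; the hostname and username are hypothetical placeholders, and the function name `interface_url` is ours, not part of any 2i2c tooling.

```python
from urllib.parse import quote

# The three interface endpoints named in the text above.
INTERFACES = {"notebook": "tree", "lab": "lab", "rstudio": "rstudio"}


def interface_url(hub_host: str, username: str, interface: str) -> str:
    """Return the URL of one interface inside a user's running session.

    Assumption: the session lives under /user/<username>/ (JupyterHub's
    default). Percent-encoding the username is an approximation of how a
    hub escapes names with special characters.
    """
    endpoint = INTERFACES[interface]  # raises KeyError for unknown interfaces
    return f"https://{hub_host}/user/{quote(username, safe='')}/{endpoint}"


# Hypothetical hub host and username, for illustration only.
for name in INTERFACES:
    print(interface_url("staging.example.2i2c.cloud", "jovyan", name))
```

Switching interfaces within a running session then amounts to replacing the last path segment of the URL.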

🌄 environment

Your 2i2c JupyterHub has an environment that has been created for your particular use-case. It exists as a Docker image that your JupyterHub loads when a user starts a new session. These images can either be built with the tool repo2docker, or pulled directly from a Docker registry. The environment also comes pre-loaded with some tools that are helpful for working with JupyterHub, such as nbgitpuller. See Customize your user environment for more information.
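To make the nbgitpuller mention more concrete, here is a minimal sketch of how a shareable "pull this repository and open it" link might be assembled. It assumes the query parameters (`repo`, `branch`, `urlpath`) and the `/hub/user-redirect/git-pull` path used by links from the nbgitpuller link generator; check a generated link against the official generator before distributing it. The hub hostname and repository URL are hypothetical.

```python
from urllib.parse import urlencode


def nbgitpuller_link(hub_host: str, repo_url: str, branch: str = "main") -> str:
    """Build a link that pulls `repo_url` into the user's home directory
    and then opens the cloned folder in JupyterLab.

    Assumption: the hub has nbgitpuller installed and follows the link
    format produced by the nbgitpuller link generator.
    """
    # The clone is placed in a folder named after the repository.
    repo_name = repo_url.rstrip("/").split("/")[-1]
    if repo_name.endswith(".git"):
        repo_name = repo_name[: -len(".git")]

    query = urlencode(
        {
            "repo": repo_url,
            "branch": branch,
            "urlpath": f"lab/tree/{repo_name}/",  # open the folder in JupyterLab
        }
    )
    return f"https://{hub_host}/hub/user-redirect/git-pull?{query}"


# Hypothetical hub and repository, for illustration only.
print(
    nbgitpuller_link(
        "staging.example.2i2c.cloud",
        "https://github.com/example-org/example-course",
    )
)
```

A user who follows such a link is asked to log in if needed, after which the repository content is fetched into their session.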

🤖 hardware

2i2c JupyterHubs can run on most major cloud providers - the primary thing that is needed is a working Kubernetes deployment. By default, 2i2c runs its hubs on Google Cloud, but if communities wish to use a different provider, this can be accomplished as well. This also means that the hardware underlying the Kubernetes deployment is configurable.

📦 data

The data that is used by your 2i2c JupyterHub is provided by you! 2i2c JupyterHubs can connect with a variety of public data sources. We recommend using standard data structures or specifications via libraries like Intake. Note that 2i2c does not host this data itself, but can build connections between 2i2c JupyterHubs and these data sources.

Features of each hub

Here is a brief overview of the major features that are present in each hub distribution.

| name | description | research | education |
|---|---|---|---|
| Authentication 🔐 | | | |
| Access control | Hub administrators can control who has access to your hub | ✔️ | ✔️ |
| GitHub Logon | Authenticate with a list of GitHub usernames | ✔️ | ✔️ |
| Google OAuth logon | Authenticate with email addresses that use Google OAuth | ✔️ | ✔️ |
| GitHub Teams Logon | Authenticate via membership in a GitHub Team that you control | ✔️ | |
| User Environment ⚒️ | | | |
| Custom user environment | Communities may bring their own Docker images for user environments. | ✔️ | ✔️ |
| Host content in repositories | Use nbgitpuller to store content in online repositories and distribute it to users with a click | ✔️ | ✔️ |
| Jupyter Interfaces | Jupyter Lab and Notebook interfaces are designed for interactive data science environments | ✔️ | ✔️ |
| RStudio | RStudio is an integrated development environment (IDE) for R | ✔️ | ✔️ |
| Configurable resources 📈 | | | |
| User storage | Users have their own filesystem that persists between sessions. | up to 20GB | up to 20GB |
| Configurable RAM | Configure the RAM available to users from the hub UI | 2-64GB | 1-4GB |
| Configurable CPU | Configure the CPU available to users from the hub UI | 2+ dedicated CPUs | 1-2 shared CPUs |
| Shared storage | Administrators can place files in a shared folder that all users may access. | up to 100GB | up to 100GB |
| Cloud infrastructure ☁️ | | | |
| Use commercial cloud | Hubs can run on AWS, GKE, or Azure | AWS/GKE/Azure | AWS/GKE/Azure |
| Connect with cloud data | Access cloud-hosted data from your hub | ✔️ | |
| Scalable Dask Clusters | Scale your computing with Dask Gateway clusters | ✔️ | |
| Bring your own credits | Communities can run 2i2c Hubs on their cloud accounts and projects. | ask us | ask us |
| Service Level 👷‍♀️ | | | |
| Operations Support | 2i2c provides a dedicated support channel for all hubs | ✔️ | ✔️ |
| Hub Uptime | 2i2c has a team of Hub Engineers that keep the infrastructure up-to-date, upgraded, and running smoothly | 98% | 98% |
| User Privacy | Hubs follow best practices in user privacy, and 2i2c retains no user data. | ✔️ | ✔️ |
| Connect with communities | 2i2c provides a communications channel in Slack for Community Representatives to connect with one another | ✔️ | ✔️ |
| Open Source 💗 | | | |
| Right to Replicate | Hubs are designed to be replicable by anybody on their own infrastructure. | ✔️ | ✔️ |
| Open Source Stack | Hubs are built entirely with open source and community-driven tooling | ✔️ | ✔️ |
| Open Source Support | Hub fees fund open source engineers to do development and community work across the stack. | ✔️ | ✔️ |

Where are hubs accessed?

By default, each 2i2c JupyterHub gets its own URL with the following form:

<hub-name>.<community-name>.2i2c.cloud

Each 2i2c JupyterHub has a hub name (denoted by <hub-name>) and a community name (denoted by <community-name>). Communities are collections of hubs around a particular community or collaboration. Each community's infrastructure may be run by a different team. For more information, see What 2i2c provides.

It is also possible to provide your own URL that points to a 2i2c JupyterHub.
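The default naming scheme can be sketched as a pair of small helpers that compose and split a hub hostname. This is only an illustration of the `<hub-name>.<community-name>.2i2c.cloud` pattern described above; the helper names and the example hub and community names are hypothetical, and hubs served from a custom URL will not match this pattern.

```python
from typing import NamedTuple

DOMAIN = "2i2c.cloud"


class HubAddress(NamedTuple):
    hub_name: str
    community_name: str


def hub_hostname(hub_name: str, community_name: str) -> str:
    """Compose the default hostname for a hub within a community."""
    return f"{hub_name}.{community_name}.{DOMAIN}"


def parse_hub_hostname(hostname: str) -> HubAddress:
    """Split a default-form hostname into its hub and community names.

    Raises ValueError for hostnames that do not follow the default form,
    for example a custom URL pointed at a hub.
    """
    suffix = "." + DOMAIN
    if not hostname.endswith(suffix):
        raise ValueError(f"{hostname!r} is not under {DOMAIN}")
    parts = hostname[: -len(suffix)].split(".")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"{hostname!r} is not of the form <hub-name>.<community-name>.{DOMAIN}")
    return HubAddress(*parts)


# Hypothetical hub and community names, for illustration only.
host = hub_hostname("staging", "example")
print(host)
print(parse_hub_hostname(host))
```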

Data outside of the hub

If you wish to access data that exists outside of your 2i2c Hub, it is your responsibility to put this data in the cloud and manage the infrastructure around it. 2i2c does not control this data; it merely provides access to it via your hub infrastructure.

Where are hubs configured and deployed?

All of the configuration and deployment scripts for 2i2c JupyterHubs can be found in the infrastructure/ repository. This repository contains both the deployment code and the documentation that explains how it works. It should be treated as "for advanced users only", and is provided for transparency and as a guide for communities that wish to manage their own infrastructure similar to a 2i2c JupyterHub.

To learn about how the infrastructure/ repository works, we recommend checking out the infrastructure documentation.

See the next sections for more information about each hub distribution.