Deploying JupyterHub in Heroku

Jupyter notebooks play a major role in collaborative work and rapid experimentation in data science. Data scientists, ML engineers, and anyone else who works with data like Jupyter Notebook for its simple UI, its interactive runtime, and its integrations.

Most people run JupyterLab or Jupyter Notebook on their local machine for experiments. But what if you need to work with a team of developers and support multiple users? The usual answer is to deploy Jupyter in a cloud environment such as AWS, GCP, or Azure. Here is the catch: these providers can be costly to set up in the early phases of a project, and neither JupyterLab nor Jupyter Notebook supports multiple users on its own.

In this post we will deploy JupyterHub on Heroku (a cloud platform), which cuts down the cost and is also easy to deploy.

Note: There are many ways of doing this. I am describing the workflow that worked best in my experiments.

Jupyter Notebook vs JupyterLab vs JupyterHub

Jupyter Notebook is a web-based interactive computational environment for creating Jupyter notebook documents.

JupyterLab is the next-generation user interface, including notebooks. It has a modular structure, where you can open several notebooks or files (e.g., HTML, text, Markdown) as tabs in the same window.

JupyterHub is a multi-user server that lets you offer Jupyter notebooks to an entire office, classroom, or team.

Prerequisites

1. GitHub Account:

If you already have a GitHub account, create a repo with a name of your choice. If you don't, create an account first and then create the repo.

2. Heroku Account:

If you already have a Heroku account, create an app with a name of your choice. If you don't, create an account first and then create the app.

Workflow

  1. Connect the GitHub repo with the Heroku app
  2. Prepare the Docker and other config files
  3. Push the code to GitHub and build on Heroku
  4. Check the Heroku app

Connect GitHub Repo with Heroku App

  • At this stage you have a GitHub repo and a Heroku app. In the Heroku app dashboard, open the Deploy tab and, under Deployment method, select GitHub. You will see something like this.

    [Image: heroku_jupyterlab_github]

  • Enter the repo name in the Connect to GitHub field and select Connect.
  • Under Automatic deploys, select Enable Automatic Deploys. After these two steps you will see something like this.

    [Image: heroku_gitlab_connect]

Now, when we push code to GitHub, it is automatically deployed to the Heroku app (a mini CI/CD pipeline).

Prepare the Docker and config files

Clone the GitHub repo to your local machine and add the files below.

  • Create the JupyterHub config file (jupyterhub_config.py). You can add more settings if you are familiar with configuring JupyterHub.
    
    from jupyterhub.spawner import SimpleLocalProcessSpawner
    
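    # DummyAuthenticator accepts any username with this shared password;
    # use it for testing only and change the password before sharing the app
    c.JupyterHub.authenticator_class = "dummy"
    c.DummyAuthenticator.password = "admin"
    
    # SimpleLocalProcessSpawner runs each user's server as a local process
    # and does not require a matching system user
    c.JupyterHub.spawner_class = SimpleLocalProcessSpawner
    c.Spawner.cmd = ['jupyter-labhub']
    
    # for creating new system users; these settings apply only when a
    # LocalAuthenticator subclass (for example PAM) replaces "dummy"
    c.LocalAuthenticator.add_user_cmd = ['python3','/app/analysis/create-user.py','USERNAME']
    c.LocalAuthenticator.create_system_users = True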
    
  • Create the user-creation helper script (create-user.py)
    
    import crypt  # note: deprecated since Python 3.11 and removed in 3.13
    import os
    import subprocess
    import sys
    
    if __name__ == '__main__':
        if len(sys.argv) <= 1:
            sys.stderr.write('Usage : create-user.py <username>\n')
            sys.exit(1)
        # read the default password from the environment, with a fallback
        default_password = os.environ.get('DEFAULT_USER_PASSWORD', 'remember.change.it')
    
        username = sys.argv[1]
        # pass the arguments as a list so the username is never parsed by a shell
        subprocess.run(["useradd", "-p", crypt.crypt(default_password, "22"), "-m", username], check=True)
    
  • Create the Dockerfile
    
    FROM ubuntu:latest
    
    # for heroku the ports association is dynamic
    ARG port
    ENV PORT=$port
    
    # if needed you can rename the workdir
    WORKDIR /app/analysis
    
    # Install python, node, npm packages
    RUN apt-get update -y && apt-get upgrade -y && apt-get install -y python3-pip && pip3 install --upgrade pip
    RUN apt-get -y install curl gnupg
    RUN apt-get -y install git
    RUN curl -sL https://deb.nodesource.com/setup_14.x  | bash -
    RUN apt-get -y install nodejs
    RUN npm install
    RUN npm install -g configurable-http-proxy
    
    # Install Python packages; add any Jupyter extension packages you use (I added dask and git for my experiments)
    RUN pip3 install jupyterhub && \
        pip3 install --upgrade notebook && \
        pip3 install oauthenticator && \
        pip3 install pandas scipy matplotlib && \
        pip3 install "dask[distributed,dataframe]" && \
        pip3 install dask_labextension && \
        pip3 install --upgrade jupyterlab jupyterlab-git && \
        jupyter lab build
    
    # add user admin to create login for your jupyterhub
    RUN useradd admin && echo admin:change.it! | chpasswd && mkdir /home/admin && chown admin:admin /home/admin
    
    # adding python supporting scripts
    ADD jupyterhub_config.py /app/analysis/jupyterhub_config.py
    ADD create-user.py /app/analysis/create-user.py
    
    # expose the port
    EXPOSE $PORT
    
    # run JupyterHub; feel free to add any arguments you need
    CMD jupyterhub --ip 0.0.0.0 --port $PORT --no-ssl
    
  • Create the Heroku manifest, which tells Heroku how to build and run these files (heroku.yml)
    
    build:
      docker:
        web: Dockerfile
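
Coming back to the create-user.py helper: it hands the username straight to useradd, so it may be worth checking the name first. The following is only a sketch under my own assumptions. The function names and the username pattern are hypothetical and are not part of the original script.

```python
import re
import subprocess

# Conservative pattern for Linux account names.
# This is an assumption for illustration, not a rule from useradd itself.
USERNAME_PATTERN = re.compile(r"[a-z_][a-z0-9_-]{0,31}")


def build_useradd_command(username, password_hash):
    """Return the useradd argument list, refusing suspicious usernames."""
    if not USERNAME_PATTERN.fullmatch(username):
        raise ValueError(f"refusing unsafe username: {username!r}")
    # -p takes an already hashed password, -m creates the home directory.
    return ["useradd", "-p", password_hash, "-m", username]


def create_user(username, password_hash):
    """Run useradd without a shell; raises CalledProcessError on failure."""
    subprocess.run(build_useradd_command(username, password_hash), check=True)
```

Because the arguments are passed as a list and never go through a shell, a name such as "alice; rm -rf /" is rejected by the pattern instead of being executed.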
    

Git Push & Build Heroku

Now we have all the files required to build our JupyterHub image and deploy it to the Heroku app.
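
Before pushing, a quick check that all four files are really in the repo root can save a failed build. This is a small sketch, assuming the file names used in the steps above. Note that Heroku looks for its manifest under the name heroku.yml. The function name missing_files is my own.

```python
from pathlib import Path

# File names taken from the steps above; heroku.yml is the manifest
# name that Heroku reads for container builds.
REQUIRED_FILES = (
    "Dockerfile",
    "heroku.yml",
    "jupyterhub_config.py",
    "create-user.py",
)


def missing_files(repo_root="."):
    """Return the required deployment files that are absent from repo_root."""
    root = Path(repo_root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

Calling missing_files() from the root of the repo returns an empty list when everything is in place, and the names of the missing files otherwise.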

  • Navigate to root of the repo
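  • The first time, we need to register this repo with Heroku and set the app to the container stack. Before that, make sure the Heroku CLI is installed.
    
    $ heroku login
    
    # add the Heroku remote; replace <your-app-name> with the name of your app
    $ heroku git:remote -a <your-app-name>
    $ heroku stack:set container
    $ git push heroku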
    

    Important: These steps are needed only for the first push.

  • After that, you can commit and push the code to GitHub as usual, and it will be deployed automatically to the Heroku app.
    
    $ git commit -a -m "added build codes"
    $ git push origin
    

Check the Heroku App

  • You can see the build logs and the build status in the Activity tab of your Heroku app.
  • Click Open app in the dashboard, which takes you to your JupyterHub app.

    [Image: heroku_log]

  • You will see the JupyterHub login screen. With the configuration above, log in with username admin and password admin.

    [Image: jupyter_login]
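
If you prefer to check the deployment from a script instead of the browser, something like the sketch below may help. It assumes you pass in the base URL shown for your app in the Heroku dashboard. The function names are my own, and /hub/login is the path where JupyterHub serves its login page.

```python
import urllib.request


def login_url(base_url):
    """Build the JupyterHub login URL from the app's base URL."""
    return base_url.rstrip("/") + "/hub/login"


def hub_is_up(base_url, timeout=10):
    """Return True when the login page answers with HTTP 200."""
    try:
        with urllib.request.urlopen(login_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False
```

For example, hub_is_up("https://<your-app-url>") should return True once the build has finished and the dyno has started.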

After logging in, you will see a JupyterHub console like this:

[Image: jupyter_dash]

Feel free to add more features to this Dockerfile and experiment further with this procedure.
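
One experiment worth trying first is to stop hard-coding the login password in jupyterhub_config.py. The sketch below reads it from an environment variable, which on Heroku can be set as a config var. The variable name JUPYTERHUB_DUMMY_PASSWORD and the function shared_password are hypothetical names that I picked for illustration.

```python
import os

# Hypothetical variable name; set it in the Heroku app's config vars.
PASSWORD_ENV_VAR = "JUPYTERHUB_DUMMY_PASSWORD"


def shared_password(environ=os.environ):
    """Return the shared login password, failing loudly when it is unset."""
    password = environ.get(PASSWORD_ENV_VAR, "")
    if not password:
        raise RuntimeError(f"set {PASSWORD_ENV_VAR} before starting JupyterHub")
    return password


# In jupyterhub_config.py, where JupyterHub provides the `c` object:
# c.DummyAuthenticator.password = shared_password()
```

Failing at startup when the variable is missing is a deliberate choice here, so the hub never falls back to a well-known default password.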

Special thanks to Rodrigo Ancavil for his reference articles

GitHub link: https://github.com/robinreni96/jupyterhub-heroku

Docker image link: https://hub.docker.com/repository/docker/robinreni96/jupyterhub-heroku/general

This post is licensed under CC BY 4.0 by the author.
