Home

Cloud Scheduler is a cloud-enabled distributed resource manager backend.

Cloud Scheduler manages virtual machines on clouds like Nimbus, OpenNebula, Eucalyptus or EC2 to create an environment for batch job execution.

How Cloud Scheduler Fits into the Cloud / HPC Ecosystem



How a User Interacts with Cloud Scheduler

For the most part, a user doesn't interact with Cloud Scheduler at all. Everything should be automatic. Here's how it should work for the user:

  1. Jane prepares a VM image loaded with the software she needs for processing, then uploads it to an image repository. Alternatively, a colleague may already have prepared a suitable image, or she can pick a pre-built one.
  2. Jane submits a bunch of processing jobs to a Condor pool. In the Condor jobs, she specifies regular Condor parameters, but also specifies a VM image that she would like her job to run on.
  3. Jane then waits for her jobs to complete.
  4. Jane gets her results.

So to the user, the only difference from a traditional batch-queue system should be that she creates an image, and specifies it in her job description.
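To make step 2 concrete, here is a minimal sketch of how a job description that names a VM image might be generated. The submit commands (universe, executable, output, queue) and the `+Name = value` form for custom job attributes are standard Condor syntax, but the attribute name `VMImageLocation`, the helper function, and the example URL are assumptions made for illustration. They are not necessarily the names Cloud Scheduler expects.

```python
def build_submit_description(executable, arguments, vm_image_url, output_prefix="job"):
    """Return the text of a Condor submit description for a single job.

    The VM image is attached as a custom job attribute. The attribute
    name VMImageLocation is hypothetical and used only for illustration.
    """
    lines = [
        "universe   = vanilla",
        f"executable = {executable}",
        f"arguments  = {arguments}",
        f"output     = {output_prefix}.out",
        f"error      = {output_prefix}.err",
        f"log        = {output_prefix}.log",
        # A leading '+' adds a custom attribute to the job's ClassAd.
        f'+VMImageLocation = "{vm_image_url}"',
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical script name and image URL.
    print(build_submit_description(
        "analyze.sh", "run42.dat", "http://example.org/images/jane.img.gz"))
```

Everything except the image attribute is what Jane would write for a traditional batch queue, which is the point: the image is one extra line in an otherwise ordinary job description.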

So What Does Cloud Scheduler Do Again?

Cloud Scheduler acts after step 2 above. It looks at the job queue to discover which VM images are needed to complete the queued jobs, then boots VMs from those images on the clusters it has access to. These VMs run the jobs from the queue, and Cloud Scheduler shuts them down when they're no longer needed.
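The cycle described above can be sketched as a small in-memory simulation. This is a toy model, not Cloud Scheduler's actual code: the `Job`, `VM`, and `ToyCloud` classes, the `schedule_once` function, and the one-VM-per-queued-job policy are all assumptions chosen to keep the example short.

```python
from collections import Counter
from dataclasses import dataclass
from itertools import count


@dataclass
class Job:
    job_id: int
    image: str  # VM image the job asked for


@dataclass
class VM:
    vm_id: int
    image: str  # VM image this machine was booted from


class ToyCloud:
    """Stand-in for a real cloud: tracks running VMs up to a fixed capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.vms = []
        self._ids = count(1)

    def boot(self, image):
        if len(self.vms) >= self.capacity:
            return None  # no room left on this cloud
        vm = VM(next(self._ids), image)
        self.vms.append(vm)
        return vm

    def shutdown(self, vm):
        self.vms.remove(vm)


def schedule_once(queue, cloud):
    """Run one scheduling pass (assumed policy: one VM per queued job)."""
    wanted = Counter(job.image for job in queue)
    running = Counter(vm.image for vm in cloud.vms)

    # Shut down VMs whose image is no longer needed, freeing capacity first.
    for vm in list(cloud.vms):
        if running[vm.image] > wanted[vm.image]:
            cloud.shutdown(vm)
            running[vm.image] -= 1

    # Boot VMs for images that have more queued jobs than running VMs.
    for image, needed in wanted.items():
        for _ in range(needed - running[image]):
            if cloud.boot(image) is None:
                return  # cloud is full; try again on the next pass
```

Calling `schedule_once` repeatedly as the queue changes boots VMs while jobs are waiting and retires them as the queue drains. A real scheduler would also need to handle several clouds, boot failures, and VMs that are still busy with a job.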

We aim to support Nimbus, Eucalyptus, OpenNebula, and Amazon EC2 on the backend.

Who Makes This?

The University of Victoria High Energy Physics Research Computing group (HEPRC) develops Cloud Scheduler together with the CANFAR project and NRC-Sussex in Ottawa. It will be used in CANFAR and in the HEP Legacy Data Project, both of which are NEP projects funded by CANARIE.

Contact us if you're interested in knowing more.

Where are We Now?

Right now, Cloud Scheduler works with Nimbus and Eucalyptus/EC2, and OpenNebula support is in progress. We're working on scaling to thousands of simultaneous jobs. Check out our RoadMap for our plans for the next few months.

Source

We keep the source on GitHub. Feel free to take a look at it. It is dual-licensed under GPLv3 and Apache v2.