When I wanted to window shop OpenShift, I had 2 options, either set up on my local using Minishift(previously using an all-in-one VM) or using the “oc cluster up” method. While they are good options, you don’t get the real feel of trying out OpenShift without deploying it on a real cloud-ish setup.
I had a few hiccups with cloud-based setups, they were too big for my needs and came in with components I didn’t really need, and sometimes way over budget. If you check the hardware requirements for running OpenShift, you’ll know what I’m talking.
My aim was to set up a production-ish cluster, not too brittle, at the same time not big enough to power the backend of a bank. I went with a VPS provider like DigitalOcean, so that I know what costs I will incur fairly well. I wanted to get as close as possible to Heroku, which I consider as the gold standard of application deployment.
Before I jump into the costs calculation, let’s do a quick run through of OpenShift architecture. There are 3 kinds of nodes in an OpenShift cluster:
NOTE, To ensure high availability, you need more than one master node. If you are running single master and it goes down, your applications will still run, you won’t be able to control or manage any of it since the master is down.
Also, you can have the same node act as both the master and infrastructure node by tagging with appropriate roles, although it is not recommended for a real production setup.
The number of compute nodes depends on the volume of applications and deployments you are going to have. You can always start with 1 and scale up, i.e. add more nodes to the cluster when you need them.
The next piece of infrastructure in your cluster is storage. You need storage if your app needs some form of persistence(think MySQL DB) and for other stuff, like storing your container images. OpenShift comes with a multitude of options for setting up storage for the cluster. You can either wire it up with the de facto storage for that cloud(like use EBS for AWS) or use a cloud agnostic solution like GlusterFS.
For our cluster, I chose 3 VMs with 4 gigs of RAM each(way less than recommended hardware). I went with GlusterFS for storage, and because it needs a lot of space, I created 3 volumes, 100 Gb each mounted to each VM.
OpenShift runs on RedHat flavored OSes(like RHEL, Centos or Atomic) and requires quite some prerequisites. So, I chose the VM OS to be Centos 7 x64.
We deploy 3.11 version of OpenShift, which at the time of writing this, is the latest stable release. Also, OpenShift has many offerings, like ones with paid support, online multitenant version and the upstream, bleeding edge version called Origin and recently rebranded as OKD. We will be going with the community edition, i.e. OKD. Don’t let that deter you from using OpenShift as the community edition is equally performant and robust as the other editions.
We mostly use Ansible and Terraform to install and configure our OpenShift cluster. No familiarity of either is assumed but would be a bit more useful to fine-tune the setup parameters.
RedHat ships elaborate and well maintained Ansible scripts to create a cluster and manage it. But that’s after the fact that you have booted the infrastructure, setup the DNS, installed prerequisites and everything. I wanted this setup to be consistent and reproducible so that anyone wanting to use OpenShift could follow the same set of instructions and get the exact same results. I automated the infrastructure creation and prerequisites part as well.
I’m not sure how many engineers it takes to turn a light bulb, but for creating any amount of complex infrastructure in a predictable, consistent and repeatable fashion, it takes only 1 engineer, if they are using Terraform.
Terraform helps you create the machines in the cloud, alter their DNS settings, create volumes and attach it to the machines all using a single command. I could also use Ansible to get the same thing done, but Terraform has a slight edge when compared to Ansible. It stores the previous state of the system. It takes the input as the end state of the system and alters the system until the desired state is achieved. This is not doable in Ansible. I still have a few gripes around Terraform and miss Ansible templating, but it is good enough to get the job done.
In our case, Terraform does 2 things. It creates the infrastructure and generates an inventory file for the OpenShift playbooks to consume. Why would I use Terraform to generate my inventory?
The next release of OpenShift is officially adopting Terraform to provision clusters.
You first initiate the Terraform context by running:
$ terraform init
But before that, take a moment to inspect the variables which you can configure for your cluster, most notably,
The SSH key pair path you will use for this cluster.
variable "public_key_path" { default = "~/.ssh/tf.pub" description = "The local public key path, e.g. ~/.ssh/id_rsa.pub"}variable "private_key_path" { default = "~/.ssh/tf" description = "The local private key path, e.g. ~/.ssh/id_rsa"}
I recommend this to be exclusive to your setup.
Also, the VM size and region,
variable "master_size" { default = "4gb" description = "Size of the master VM"}
variable "node_size" { default = "4gb" description = "Size of the Node VMs"}...variable "region" { default = "blr1" description = "The digitalOcean region where the cluster will be spun."}
Also, you need to supply terraform with the domain you are going to use for OpenShift. We can edit the variables.tf file or pass it using environment variables.
$ export TF_VAR_domain=example.com
Note that this must be a domain you own.
Most of these are sensible defaults.
Then, you supply Terraform with the DigitalOcean API token you generated in your settings.
$ export DIGITAL_OCEAN_TOKEN=abcxyz123
You can generate the infrastructure and inventory file by running:
$ terraform apply
Terraform does the grunt work of creating the VMs, mapping domains, creating volumes and attaching them, and finally generating the inventory files. The first inventory file called the preinstall-inventory.cfg
, contains the master and compute nodes.
$ ansible-playbook -u root --private-key=~/.ssh/tf -i preinstall-inventory.cfg ./pre-install/playbook.yml
This is the same key which you specified in the Terraform variables section.
Now is the time to fetch the Ansible scripts to set up the cluster.
git clone git@github.com:openshift/openshift-ansible.git --branch release-3.11 --depth 1
Note the release-3.11
branch. The playbooks branching convention follows the cluster version closely. Also, any ansible commands going forward must be run using the Ansible version specified in this repo. I hit issues numerous times when I used a different version of Ansible other than the one specified in this repo. Let’s install Ansible recommended by this repo by creating a virtualenv.
$ cd openshift-ansible$ virtualenv venv$ source venv/bin/activate$ pip install -r requirements.txt$ which ansible # ensure that this is from the virtualenv
Let’s take a look at the second inventory file generated by Terraform.
Before you set out to install the cluster, you have to ensure that the hardware and software setup thus far meet the prerequisites recommended by OpenShift. This is not set on stone and we can take the liberty to tweak it as per our needs, as set in the inventory,
openshift_disable_check = disk_availability,docker_storage,memory_availability,docker_image_availability
We run the prerequisite check.
$ ansible-playbook -i ../inventory.cfg playbooks/prerequisites.yml
This will take around 5-10 minutes. This setup by default installs an all-in-one master node(which acts as the master, infrastructure node and a compute node).
[masters]1.2.3.4 openshift_ip=1.2.3.4 openshift_schedulable=true[etcd]1.2.3.4 openshift_ip=1.2.3.4[nodes]1.2.3.4 openshift_ip=1.2.3.4 openshift_schedulable=true openshift_node_group_name='node-config-all-in-one'
The openshift_node_group_name
is a way to tag an OpenShift node. We add 2 compute nodes in addition to the master node.
[nodes]1.2.3.4 openshift_ip=1.2.3.4 openshift_schedulable=true openshift_node_group_name='node-config-all-in-one'2.3.4.5 openshift_ip=2.3.4.5 openshift_schedulable=true openshift_node_group_name='node-config-compute'1.3.5.7 openshift_ip=1.3.5.7 openshift_schedulable=true openshift_node_group_name='node-config-compute'
The persistent storage is managed by GlusterFS. OpenShift comes with a ton of persistent storage options, but at the time of writing this, I found GlusterFS to be the best fit between ease of configuring and utility for DigitalOcean.
Now for the final playbook run to install the cluster.
$ ansible-playbook -i ../inventory.cfg playbooks/deploy_cluster.yml
Be warned that this takes a good 30 minutes to finish. An ideal time for a coffee break :)
The final Ansible report after running OpenShift installer
This cluster is configured with the HTPassword Identity provider, in other words, the plain simple username password based identity system.
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]
You can configure your cluster to include multiple identity providers. This is useful if you are installing it for an organization, for example.
You can login to the master node using the command
$ ssh -i ~/.ssh/tf root@console.<whatever domain name you gave>
Some setups recommend creating a bastion or a jump host to ssh into your cluster. This involves creating a small VM just for ssh-ing into your OpenShift cluster, implying that you can’t ssh into your cluster from any other machine. Whether you set this up or not is totally up to you. I don’t add the bastion host in this setup.
Also, you will notice that the web console and the OpenShift API server don’t have a valid HTTPS certificate. You can later install your own certificates in your cluster, but I’ve excluded that step for simplicity’s sake.
Let’s create an admin user by logging into the master node.
$ htpasswd -cb /etc/origin/master/htpasswd admin <your choice of password>
The next important thing is to configure a default storage class for the cluster. Although we install GlusterFS as part of our setup, we don’t make it the default storage option. Let’s do that.
$ kubectl get storageclassNAME PROVISIONER AGEglusterfs-storage kubernetes.io/glusterfs 16m
$ oc patch storageclass glusterfs-storage -p '{"metadata":{"annotations":{"storageclass.beta.kubernetes.io/is-default-class":"true"}}}'
OpenShift login screen when you hit console URL
Now that we’ve installed the cluster, let’s verify whether our setup works as intended. Your first check will be to ensure that OpenShift lists all the nodes.
$ oc get nodes
Next check is to ensure that oc status
shows nothing alarming.
$ oc status
Ensure that the registry and router pods are up and running.
$ oc get pods -n default
While we are at the master node, let’s create an actual application and deploy it. OpenShift comes with a set of default templates. We can fetch them using,
$ oc get templates -n openshift
First, we create a new project, or a namespace which will hold all our apps and services.
$ oc new-project myns
Then, we create a new app from the Node.js+MongoDB template, so that we can verify that features like persistence, creating and attaching volumes etc. work.
$ oc new-app --template=nodejs-mongo-persistent -n myns
You can stream the logs realtime using this command.
$ oc logs -f nodejs-mongo-persistent-1-build
Once the application is deployed successfully,
$ oc get pods -n myns # ensure that we have a pod which begins with the same name as our app$ oc get routes # to get the app URL
hit the URL in your browser. It will be of the pattern nodejs-mongo-persistent-myns-<your-app-domain>.
Congratulations! You’ve created and deployed your own OpenShift cluster and created your first app on it. You can download the source code used for this setup here.
The apps displayed in the web console
The OpenShift ecosystem is one of the best ways to get a taste of enterprise Kubernetes without much of the cognitive overhead it presents. But configuring and maintaining it can rob you of time and energy which you’d otherwise spend on shipping your product.
That’s why I created ShapeBlock, a “bring your own server” SaaS which helps you set up and use OpenShift on your own servers. I’m launching ShapeBlock early February and offering huge discounts to early adopters. Make sure you are one of them by signing up for an invite here.