Building an IaaS cloud with Red Hat OpenStack – 2 of several posts

This is the second in a series that will step through the process of building an OpenStack IaaS cloud in your datacenter. (The first post introduced Infrastructure as a Service).

What are cloud workloads? What applications are suited to the cloud?

Physical servers that run email, ERP, CRM or web applications are reasonably static and their load is predictable. ERP may have a predictable spike in traffic at month end; CRM or email may use extra compute resources when a marketing campaign is launched. To grow this infrastructure, time, labour and costs are incurred to purchase and rack additional servers, storage and networking.

Virtualization provides some gains, as existing physical servers and their apps can be consolidated and higher utilization extracted from them by running several virtual machines on the same physical hardware. To grow this infrastructure, a virtualization administrator creates additional virtual machines.

Cloud workloads are applications that are primarily elastic: they grow and shrink dynamically and automatically, with no user intervention – no racking of new servers, no creation of additional virtual machines. Furthermore, they can be deployed automatically by the end user through APIs, with no intervention required from an IT administrator.

Overview of OpenStack

OpenStack is an open-source Infrastructure as a Service project that was started by NASA and Rackspace but now has over 200 contributors. OpenStack offers an Infrastructure-as-a-Service cloud that can be deployed as a private or public offering. To fully understand OpenStack, let's dispense with some misunderstandings:

Misunderstanding: OpenStack is a product

OpenStack is not a product, per se. Comparing OpenStack to a commercial product like VMware vCloud or Microsoft Private Cloud is an apples-to-oranges comparison. OpenStack is an open-source project used to build an IaaS cloud. It is a cloud operating system that controls virtual compute, software-defined networking and storage.

You can download the source directly from GitHub and deploy, test and run it on your own, or purchase a supported distribution such as Red Hat’s, where most of the testing is done for you and deployment services are included.

Misunderstanding: OpenStack is a replacement for virtualization

OpenStack does not replace your virtualization initiative. OpenStack runs alongside the hypervisor and supports several: Xen, VMware, Hyper-V, LXC and of course KVM. Moreover, OpenStack supports container technologies such as Docker (which will be available on Red Hat Enterprise Linux 7), and OpenStack even runs on bare metal (still in development).

So how is OpenStack used? What are some use cases?

OpenStack is a cloud operating system. OpenStack provides a platform to deploy virtual servers, software defined networking and storage. It is suitable for workloads that dynamically grow (and shrink). This takes advantage of the elastic nature of a cloud infrastructure.

  1. Automatic deployments – where scripts and programs use APIs to deploy virtual servers, networking and storage. This is a faster and more efficient alternative to using a web GUI or contacting an IT admin.
  2. Reusable scenarios – where applications demand that an identical configuration be created and reused. Example: QA testing on an identical configuration.
  3. Servers and applications based on templates – a prepackaged set of services that is deployed from an image or template. Example: a web server, application server and database.
  4. Automatic scale – accommodate a growth in demand automatically. Example: an eCommerce application that has spikes during peak shopping periods.
  5. Big Data – analysis of datasets that vary in size and thus require automatic addition of nodes/servers as the data grows and the need for compute expands. Example: Hadoop performs data analysis on the end-node and not at a central server, thus as data analysis grows new nodes need to automatically spin up for analysis to be performed.
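
The automatic scale use case above can be sketched as a simple decision rule. This is a minimal illustration and not OpenStack code: the function name `desired_instances`, the 60% target utilization and the minimum/maximum bounds are all assumptions chosen for the example.

```python
import math


def desired_instances(current: int, avg_cpu_percent: float,
                      target_cpu_percent: float = 60.0,
                      minimum: int = 1, maximum: int = 10) -> int:
    """Return how many instances to run so average CPU lands near the target.

    Hypothetical rule for illustration: scale the instance count in
    proportion to the observed load, then clamp it to the allowed range.
    """
    if current < 1:
        raise ValueError("current must be at least 1")
    wanted = math.ceil(current * avg_cpu_percent / target_cpu_percent)
    return max(minimum, min(maximum, wanted))


# Peak shopping period: 4 instances running at 90% CPU -> grow to 6.
print(desired_instances(4, 90.0))
# Quiet period: 4 instances running at 15% CPU -> shrink to 1.
print(desired_instances(4, 15.0))
```

A script would call such a rule periodically and then use the cloud's APIs to add or remove instances, which is what removes the administrator from the loop.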

OpenStack offers:

  • Distributed object storage
  • Persistent block-level storage
  • Storage for provisioning virtual-machines and their images.
  • RBAC – Role-Based Access Control and authentication
  • Software-defined networking
  • Web browser-based GUI or a Command Line and APIs for users and administrators.

OpenStack components/projects

The projects that comprise OpenStack are listed in this table and described in detail below.

Service          OpenStack project code name  Description
Dashboard        Horizon                      Web-based GUI for using/administering OpenStack
Identity         Keystone                     Authentication and authorization (roles/privileges)
Networking       Neutron                      Software-defined networking for connectivity between OpenStack components
Block Storage    Cinder                       Persistent block storage/volumes/virtual disks for instances/virtual machines
Compute          Nova                         Launch and schedule instances/virtual machines on servers/nodes
Image            Glance                       A registry for virtual machine images
Object Storage   Swift                        Storage of files for users
Metering         Ceilometer                   Usage and measuring of cloud resources
Orchestration    Heat                         Template-based engine for automatically creating resources (compute/storage/networking)

OpenStack Compute service: (PROJECT NOVA)

OpenStack Compute is the heart or core of the OpenStack cloud. Compute provisions and manages on-demand virtual machines. Compute schedules virtual machines to run on a set of physical servers (nodes). Virtual machines can be started, stopped, suspended, created and deleted. The virtual machines run on hypervisors such as Xen, ESXi, Hyper-V and of course KVM.

Compute interfaces with the Identity service (project Keystone) to authenticate a user who is requesting a compute action (create/delete/suspend/copy a virtual machine). Compute interfaces with the Dashboard service (project Horizon) for the user web interface. Compute interfaces with the Image service (project Glance) to provision an image. There are security/access controls to govern which images are accessible by users, and a quota on how many instances can be created per project.

The Compute service scales horizontally on standard hardware, meaning that to grow an OpenStack cloud you add more servers/nodes (horizontal scaling), rather than adding memory, CPU and disk to an existing server (vertical scaling).
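
To make scheduling and horizontal scaling concrete, here is a minimal sketch of placing virtual machines on nodes. It is not Nova's scheduler: the `Node` class, the RAM-only capacity model and the "most free memory first" rule are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A physical server, modelled only by its RAM (illustrative)."""
    name: str
    ram_mb: int
    used_mb: int = 0
    vms: list = field(default_factory=list)

    @property
    def free_mb(self) -> int:
        return self.ram_mb - self.used_mb


def place_vm(nodes, vm_name, vm_ram_mb):
    """Place a VM on the node with the most free RAM; return the node or None."""
    candidates = [n for n in nodes if n.free_mb >= vm_ram_mb]
    if not candidates:
        return None  # no node has room: the cloud needs another node
    best = max(candidates, key=lambda n: n.free_mb)
    best.vms.append(vm_name)
    best.used_mb += vm_ram_mb
    return best


nodes = [Node("node1", 8192), Node("node2", 8192)]
place_vm(nodes, "web1", 4096)
place_vm(nodes, "web2", 4096)
if place_vm(nodes, "db1", 6144) is None:
    # Horizontal scaling: add a server rather than enlarging an existing one.
    nodes.append(Node("node3", 8192))
    place_vm(nodes, "db1", 6144)
```

When no node can hold the new virtual machine, the remedy in this model is to append another node to the list, which mirrors adding a server to the rack.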

OpenStack Image service: (PROJECT GLANCE)

The OpenStack Image service provides discovery, registration and delivery services for disk and server images. Images can be used as templates to create new virtual servers. The service can also be used to store and catalog backups.

The image service stores images in a variety of formats:

  • AMI – Amazon Machine Image
  • ISO – (virtual CDROM)
  • qcow2 (Qemu/KVM)
  • OVF (Open Virtualization Format)
  • RAW (unstructured)
  • VDI (VirtualBox)
  • VHD (Hyper-V, Xen, Microsoft, VMware)
  • VMDK (VMware)

OpenStack Object storage: (PROJECT SWIFT)

Object storage provides virtual containers that allow users to store and retrieve files (images, documents, video files, graphics etc.). Object storage replicates data asynchronously with eventual consistency, and uses the concepts of:

  • Replicas – maintain the state of objects in the case of an outage.
  • Zones – used to host replicas and ensure that each replica of a given object can be stored separately. A zone might represent a disk, a disk array, a rack of servers or an entire datacenter.
  • Regions – a group of zones sharing a location
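
The relationship between replicas and zones can be sketched in a few lines. This is a toy placement rule, not the algorithm Swift actually uses: the function `place_replicas`, the zone names and the hash-then-step strategy are assumptions for illustration.

```python
import hashlib


def place_replicas(object_name, zones, replicas=3):
    """Pick a distinct zone for each replica of an object.

    Illustrative only: hash the object name to choose a starting zone,
    then take the following zones in order so no two replicas share a zone.
    """
    if replicas > len(zones):
        raise ValueError("need at least as many zones as replicas")
    digest = hashlib.sha256(object_name.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(zones)
    return [zones[(start + i) % len(zones)] for i in range(replicas)]


zones = ["zone1", "zone2", "zone3", "zone4"]
print(place_replicas("holiday-photo.jpg", zones))
```

Because every replica lands in a different zone, losing one disk, rack or datacenter (whatever a zone represents) still leaves other copies of the object available.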

OpenStack Block storage: (PROJECT CINDER)

Block storage provides persistent volumes, or virtual hard drives, used by OpenStack virtual machines. These volumes are integrated into the Dashboard and Compute services to enable users to manage their own storage needs. Thus users can create, list or delete volumes and attach them to (or detach them from) virtual machines. Virtual machine snapshots are also stored on block storage volumes.

OpenStack Metering service: (PROJECT CEILOMETER)

The Metering service provides user level statistics that can be used for alerting, billing or monitoring. There is a plugin system to add new monitors.

OpenStack Orchestration service: (PROJECT HEAT)

The Orchestration service provides a template-based engine for the OpenStack cloud, used to create and manage cloud resources: storage, networking, instances (virtual machines), and applications as a repeatable running environment. Templates are used to create stacks, or collections of resources (instances, floating IPs, volumes, security groups, or users). The service offers access to the OpenStack core services via a single modular template.
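
One thing an orchestration engine has to do with a template is work out the order in which to create the resources. The sketch below illustrates that idea only. The dictionary layout, the `kind` and `depends_on` keys and the `creation_order` function are hypothetical and are not Heat's template syntax or API.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical, simplified stack description (not Heat's template format).
stack = {
    "private_net": {"kind": "network", "depends_on": []},
    "data_volume": {"kind": "volume", "depends_on": []},
    "web_server": {"kind": "instance",
                   "depends_on": ["private_net", "data_volume"]},
    "public_ip": {"kind": "floating_ip", "depends_on": ["web_server"]},
}


def creation_order(template):
    """Return resource names ordered so that dependencies come first."""
    graph = {name: set(spec["depends_on"]) for name, spec in template.items()}
    return list(TopologicalSorter(graph).static_order())


for name in creation_order(stack):
    print("create", stack[name]["kind"], name)
```

Because the whole stack is described in one place, the same environment can be created again simply by feeding the same description back in.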

OpenStack Networking service: (PROJECT NEUTRON)

OpenStack provides networking models to accommodate different applications. It is a scalable and API-driven system for providing network connectivity. As a software-defined network, OpenStack Networking can create networks, assign IP addresses, route traffic and connect servers. Various network types are supported: flat networks, VLANs, GRE tunnels (Generic Routing Encapsulation – a tunneling protocol), multi-tier topologies etc.

OpenStack Networking manages IP addresses, allocating static or DHCP addresses. Floating IP addresses allow traffic to be dynamically rerouted to any compute resource, for example to redirect traffic during maintenance or in the case of a failure. OpenStack Networking has a plugin extension framework to add intrusion detection systems (IDS), load balancing, firewalls and virtual private networks (VPN).
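
The floating IP behaviour described above amounts to a lookup table that can be repointed at any time. Here is a minimal sketch of that idea. The `FloatingIPTable` class and the instance names are hypothetical, and the address comes from a range reserved for documentation.

```python
class FloatingIPTable:
    """Toy model of floating IP association (illustration only)."""

    def __init__(self):
        self._owner = {}  # floating IP -> instance name

    def associate(self, floating_ip, instance):
        """Point the floating IP at an instance; return the previous owner."""
        previous = self._owner.get(floating_ip)
        self._owner[floating_ip] = instance
        return previous

    def route(self, floating_ip):
        """Return the instance currently receiving traffic for this address."""
        return self._owner.get(floating_ip)


table = FloatingIPTable()
table.associate("203.0.113.10", "web-primary")
# Maintenance on web-primary: redirect the same public address to a standby.
table.associate("203.0.113.10", "web-standby")
print(table.route("203.0.113.10"))
```

Clients keep using the same public address throughout; only the mapping behind it changes.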

The diagram below shows:

  1. Horizon – providing a web user interface to manage and use Cinder (block)/Swift (object), Glance (images), Nova (compute) and Neutron (networking; the diagram labels it by its former name, Quantum).
  2. Keystone – providing authentication/authorization to Cinder/Swift (storage), Glance (images), Nova (compute) and Neutron (networking).
  3. Nova (compute) – scheduling and provisioning virtual machines.
  4. Ceilometer – monitoring Cinder/Swift (storage), Glance (images), Nova (compute) and Neutron (networking).
  5. Cinder (block storage) – providing volumes for virtual machines and storing backups in Swift.
  6. Glance – storing images in Swift and providing them to virtual machines.
  7. Neutron (formerly Quantum) – providing network connectivity to virtual machines.
  8. Heat – orchestrating the OpenStack cloud.

[Figure: OpenStack architecture – courtesy of openstack.org]

Building an Infrastructure as a Service cloud in your datacenter – first of several articles

Infrastructure as a Service (IaaS)

IaaS is one of the three delivery methods of cloud computing (the other two are Platform as a Service and Software as a Service).

Infrastructure as a Service delivers compute, networking and storage as software on commodity hardware, typically rack-mounted servers that can be added as required to scale a cloud horizontally.

  1. Compute – virtual machines of different sizes, different number of CPUs and/or memory.
  2. Networking – software-defined networking: networks, routers and switches defined in software that also provide networking services such as load balancing, firewalls and VPNs.
  3. Storage – blocks of storage as virtual disks, or storage for saving and retrieving files.

These three components are managed using a dashboard, command-line interface or API.

OpenStack dashboard

Characteristics of IaaS:

  • Elasticity: A user can provision (add) or de-provision (remove) cloud instances to scale their cloud up or down.
  • Multi-tenancy: The cloud servers are hosted on a shared infrastructure. This means that your cloud instances co-exist on the same hardware as another user’s cloud instances. To understand multi-tenancy, think of an apartment building (or block of flats). The renters/tenants have their own apartment, but share an elevator or stairway, foundation and roof. The owner of the building rents out apartments as needed and is responsible for the plumbing etc., while each tenant is responsible for their own furniture and interior decorations. Similarly, an IaaS customer is responsible for their own applications; the cloud provider simply provides the infrastructure.
  • User self-service: Users can create their own cloud instances/virtual servers and provision their own storage and networks. This is one of the most compelling reasons to use a cloud: users are not beholden to an IT organization to provision their infrastructure for them.
  • Utility billing: The cloud provider will bill the cloud user for the resources used. Infrastructure as a Service is akin to a utility company providing and billing for electricity, water and natural gas. You share electricity with everyone on the power grid supplied by the power station, and only pay for what you use.
  • Virtual Machines: The servers, also called “cloud instances”, are delivered to customers as virtual machines. A virtual machine is a server or workstation, with operating system and applications that appears to the user as a physical server.
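
The utility billing characteristic above comes down to simple arithmetic: hours used multiplied by a rate. The sketch below shows that calculation. The instance sizes, the hourly prices and the `monthly_bill` function are invented for the example and do not reflect any provider's pricing.

```python
from decimal import Decimal

# Hypothetical hourly prices per instance size; real providers set their own.
HOURLY_RATE = {
    "small": Decimal("0.05"),
    "medium": Decimal("0.10"),
    "large": Decimal("0.20"),
}


def monthly_bill(usage):
    """Sum the charges for a list of (instance_size, whole_hours_run) records."""
    total = Decimal("0")
    for size, hours in usage:
        total += HOURLY_RATE[size] * Decimal(hours)
    return total


# One small instance running all month (720 hours) plus a large instance
# used for a 10-hour batch job: 36.00 + 2.00 = 38.00.
print(monthly_bill([("small", 720), ("large", 10)]))
```

The large instance costs four times as much per hour in this made-up price list, yet adds little to the bill because it is only paid for while it runs.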

Infrastructure as a Service is typically offered in three forms:

  1. Private cloud, also called on-premises
  2. Public cloud
  3. Hosted private cloud

An organization can build a private IaaS cloud and then provide infrastructure services to its internal departments or partners. To build a private IaaS cloud, you first need virtualization software, that is, a hypervisor.

Examples of hypervisor software are:

  • Hyper-V, VMware, Xen.
  • KVM – Kernel-based Virtual Machine is available with most Linux distributions and as open-source software. Red Hat offers KVM virtualization.

Once you have a virtualization or hypervisor layer, you need cloud software to provide the on-demand, user self-service and elasticity features of cloud computing.

Examples of IaaS private cloud software are:

  • Eucalyptus, Microsoft, VMware.
  • OpenStack: OpenStack is an open-source project with over 200 contributors.

This series of articles will focus on building a private cloud using Red Hat OpenStack, which is offered as a free version or a paid subscription.

Next: Concepts and architecture of OpenStack

Back in the saddle, galloping to secure electronic health data.

Gentle reader,

After a hiatus of a few weeks, adjusting to my new position selling this, I am back in the blogosphere.

With my new focus on security for cloud, virtualization and the general data center, I bring a new perspective and focus to healthcare IT: the security of patient data. This is ever so important if patient records are going to go electronic, especially if they are stored in the cloud. Aside from my new paid position, I have also had the privilege of volunteering under the stewardship of Arien, as the leader of the Security and Trust Workgroup of NHIN-Direct. I also have the privilege of working with the likes of Sean Nolan, who wrote a terrific compliment on my comparison of the Google and Microsoft PHRs.

So, securing electronic health data: Last week I attended a CSO (Chief Security Officer) conference in San Francisco and learnt some interesting lessons:

  1. Trust is fundamental in healthcare – patients may not disclose an embarrassing disease if they fear the data is not private.
  2. Security is required for regulatory purposes and patient safety.
  3. Computers are not personal. When IBM popularized the term PC, or Personal Computer, computer users at work came to believe that the computer they used was theirs. Thus security software that is designed to restrict the flow of data, or to prevent users from accessing certain websites, downloading specific files or copying files to disks/thumb drives, is viewed by the user as an invasion of their personal space, a restriction on their personal computer. Don’t make users too paranoid to do their job or feel that big brother is watching their every mouse click, but rather explain the highly personal nature of healthcare records and the need to secure access.
  4. Refine business processes. Often one reads of data lost when a laptop or external hard drive is stolen, for example: 600 patient records lost on a stolen laptop. A natural reaction is one of horror and surprise. While certainly justified, a more analytical reaction would be “Employees are rarely malicious or dishonest, so what business process necessitated copying patient data to a laptop?” Refine the business process that necessitated this action. Remove the individual choice of where to store patient data; instead, make a business decision and apply a policy based on the data.

More on cloud and SaaS security to follow. I was pleased to read that the VA is taking steps to tighten security.

It’s good to be back!

MUMPS anyone?

As a kid I got mumps and stayed home from school with swollen glands; today there is the MMR vaccination for children fortunate enough to live in developed countries.

I am not writing about the disease though, but rather the programming language used to create electronic medical record software, for example VistA and Epic. This is another assignment from my class, Healthcare Informatics, at the University of California, Davis.

If you were writing a new Electronic Medical Record (EMR) software solution today, would you use MUMPS, which is admittedly widely deployed?

Those in favor might argue:

  1. MUMPS is the language used by existing EMR deployments from large established EMR vendors,
  2. The MUMPS database does not waste disk space, as it uses sparse arrays, and its B-tree queries are faster than those of indexed relational databases.
  3. MUMPS based EMR systems installed today are stable and reliable.

I posit no, because:

  1. Where would you find MUMPS programmers today? Are new college graduates proficient in MUMPS, or in Java/C++?
  2. How would you interface with other EMRs today? Interoperability is one of the biggest challenges between healthcare systems today, and creating a new EMR system based on older, non-standard approaches will not result in an interoperable system.
  3. Rather than run a MUMPS based system on large monolithic hardware, a new EMR system could be written on distributed highly available hardware.

Of course there is also the option of not writing your own EMR software, but rather using a cloud computing EMR solution from vendors such as AdvancedMD or (my local favourite) Practice Fusion.

Cloud based EHRs – a response to PracticeFusion

In response to Dr Rowley’s posting

Note: I attempted to comment on the EHRBloggers blog, but there were technical glitches with the “word” verification (used to prevent spamming), so I am writing my comment here.

Dr Rowley,
Thank you for your well crafted insight into the benefits of ‘cloud’ oriented EHRs, especially for solo practitioners who may not wish to invest in in-house hardware, software and associated maintenance.
Some responses:
1. Is a solo practitioner or very small medical practice likely to have the high-bandwidth internet connection required for a SaaS-based EHR?

2. Like any other SaaS solution, does the Dr’s practice grind to a halt because an Internet connection is down (due to the fault of the ISP or any other conditions beyond his control) and the physician cannot request an EMR for a patient?

3. The ‘care co-ordination’ you write about sounds wonderful. My question is: what technical standards exist for medical practices to exchange EMR data? Or is the ‘care co-ordination’ you write about restricted to medical practices that use the PracticeFusion cloud?

Looking forward to the ongoing conversation