Planet RDO

July 18, 2018

RDO Blog

Community Blog Round-Up: 18 July

We’ve got three posts this week related to OpenStack – Adam Young shows how a reviewer can check whether a patch has test coverage, Zane Bitter takes a look at OpenStack’s multiple layers of services, and Nir Yechiel introduces us to the five things we need to know about networking on Red Hat OpenStack Platform 13. As always, if you know of an article not included in this round up, please comment below or track down leanderthal (that’s me! Rain Leander!) on Freenode irc #rdo.

Testing if a patch has test coverage by Adam Young

When a user requests a code review, the reviewer is responsible for making sure that the code is tested. While the quality of the tests is a subjective matter, their presence is not; either they are there or they are not there. If they are not there, it is on the developer to explain why.

Read more at https://adam.younglogic.com/2018/07/testing-patch-has-test/

Limitations of the Layered Model of OpenStack by Zane Bitter

One model that many people have used for making sense of the multiple services in OpenStack is that of a series of layers, with the ‘compute starter kit’ projects forming the base. Jay Pipes recently wrote what may prove to be the canonical distillation (this post is an edited version of my response):

Read more at https://www.zerobanana.com/archive/2018/07/17#openstack-layer-model-limitations

Red Hat OpenStack Platform 13: five things you need to know about networking by Nir Yechiel, Principal Product Manager, Red Hat

Red Hat OpenStack Platform 13, based on the upstream Queens release, is now Generally Available. Of course this version brings in many improvements and enhancements across the stack, but in this blog post I’m going to focus on the five biggest and most exciting networking features found in this latest release.

Read more at https://redhatstackblog.redhat.com/2018/07/12/red-hat-openstack-platform-13-five-things-you-need-to-know-about-networking/

by Rain Leander at July 18, 2018 09:05 AM

July 17, 2018

Adam Young

Testing if a patch has test coverage

When a user requests a code review, the reviewer is responsible for making sure that the code is tested.  While the quality of the tests is a subjective matter, their presence is not; either they are there or they are not there.  If they are not there, it is on the developer to explain why.

Not every line of code is testable.  Not every test is intelligent.  But, at a minimum, a test should ensure that the code in a patch is run at least once, without an unexpected exception.

For Keystone and related projects, we have a tox job called cover that we can run on a git repo at a given revision.  For example, I can code review (even without git review) by pulling down a revision using the checkout link in  gerrit, and then running tox:

git fetch git://git.openstack.org/openstack/keystoneauth refs/changes/15/583215/2 && git checkout FETCH_HEAD
git checkout -b netloc-and-version
tox -e cover

I can look at the patch using git show --stat to see what files were changed:

$ git show --stat
commit 2ac26b5e1ccdb155a4828e3e2d030b55fb8863b2
Author: wangxiyuan 
Date:   Tue Jul 17 19:43:21 2018 +0800

    Add netloc and version check for version discovery
    
    If the url netloc in the catalog and service's response
    are not the same, we should choose the catalog's and
    add the version info to it if needed.
    
    Change-Id: If78d368bd505156a5416bb9cbfaf988204925c79
    Closes-bug: #1733052

 keystoneauth1/discover.py                                 | 16 +++++++++++++++-
 keystoneauth1/tests/unit/identity/test_identity_common.py |  2 +-

and I want to skip looking at any files in keystoneauth1/tests as those are not production code. So we have 16 lines of new code. What are they?

Modifying someone else’s code, I got to:

 git show | gawk 'match($0,"^@@ -([0-9]+),[0-9]+ [+]([0-9]+),[0-9]+ @@",a){left=a[1];right=a[2];next};\
   /^\+\+\+/{print;next};\
   {line=substr($0,2)};\
   /^-/{left++; next};\
   /^[+]/{print right++;next};\
   {left++; right++}'

Which gives me:

+++ b/keystoneauth1/discover.py
420
421
422
423
424
425
426
427
428
429
430
431
432
433
437
+++ b/keystoneauth1/tests/unit/identity/test_identity_common.py
332

Looking in the cover directory, I can see if a line is uncovered by its class:

class="stm mis"

For example:

$ grep n432\" cover/keystoneauth1_discover_py.html | grep "class=\"stm mis\""

432

For the lines above, I can use seq to check them, since they are in order (with none missing):

for LN in `seq 420 437` ; do grep n$LN\" cover/keystoneauth1_discover_py.html ; done

Which produces:

420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437

I drop the grep "class=\"stm mis\"" to make sure I get output for each line, then add it back in and get no output. Since "stm mis" marks a missed statement, no output means every one of the added lines was covered.
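
Those manual steps can be strung together into a small script. Below is a rough Python sketch of the same workflow; the cover/ file naming and the id="nNNN" / class="stm mis" markers are assumptions based on the grep commands above, so treat it as a starting point rather than a finished tool.

#!/usr/bin/env python3
# Report which lines added by a patch are not covered in a tox 'cover' report.
# A sketch of the manual workflow above, not a polished tool.
import re
import subprocess


def added_lines(rev="HEAD"):
    """Return {path: [added line numbers]} for one commit, skipping test files."""
    diff = subprocess.check_output(["git", "show", rev], text=True)
    added, path, right = {}, None, 0
    for line in diff.splitlines():
        hunk = re.match(r"^@@ -\d+(?:,\d+)? \+(\d+)", line)
        if hunk:
            right = int(hunk.group(1))
        elif line.startswith("+++ b/"):
            path = line[len("+++ b/"):]
        elif path is None or "/tests/" in path:
            continue
        elif line.startswith("+"):
            added.setdefault(path, []).append(right)
            right += 1
        elif not line.startswith("-") and not line.startswith("\\"):
            right += 1
    return added


def missed(path, lines):
    # Assumes the coverage HTML names files like cover/keystoneauth1_discover_py.html
    # and marks missed statements with id="nNNN" ... class="stm mis", as in the greps above.
    html_path = "cover/" + path.replace("/", "_").replace(".", "_") + ".html"
    with open(html_path) as f:
        report = f.read()
    return [n for n in lines
            if re.search(r'id="n%d".*class="stm mis"' % n, report)]


if __name__ == "__main__":
    for path, lines in added_lines().items():
        print(path, "uncovered lines:", missed(path, lines) or "none")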

by Adam Young at July 17, 2018 05:35 PM

Zane Bitter

Limitations of the Layered Model of OpenStack

One model that many people have used for making sense of the multiple services in OpenStack is that of a series of layers, with the ‘compute starter kit’ projects forming the base. Jay Pipes recently wrote what may prove to be the canonical distillation (this post is an edited version of my response):

Nova, Neutron, Cinder, Keystone and Glance are a definitive lower level of an OpenStack deployment. They represent a set of required integrated services that supply the most basic infrastructure for datacenter resource management when deploying OpenStack. Depending on the particular use cases and workloads the OpenStack deployer wishes to promote, an additional layer of services provides workload orchestration and workflow management capabilities.

I am going to explain why this viewpoint is wrong, but first I want to acknowledge what is attractive about it (even to me). It contains a genuinely useful observation that leads to a real insight.

The insight is that whereas the installation instructions for something like Kubernetes usually contain an implicit assumption that you start with a working datacenter, the same is not true for OpenStack. OpenStack is the only open source project concentrating on the gap between a rack full of unconfigured equipment and somewhere that you could run a higher-level service like Kubernetes. We write the bit where the rubber meets the road, and if we do not there is nobody else to do it! There is an almost infinite variety of different applications and they will all need different parts of the higher layers, but ultimately they must be reified in a physical data center and when they are OpenStack will be there: that is the core of what we are building.

It is only the tiniest of leaps from seeing that idea as attractive, useful, and genuinely insightful to believing it is correct. I cannot really blame anybody who made that leap. But an abyss awaits them nonetheless.

Back in the 1960s and early 1970s there was this idea about Artificial Intelligence: even a 2 year old human can (for example) recognise images with a high degree of accuracy, but doing (say) calculus is extremely hard in comparison and takes years of training. But computers can already do calculus! Ergo, we have solved the hardest part already and building the rest out of that will be trivial, AGI is just around the corner, and so on. The popularity of this idea arguably helped create the AI bubble, and the inevitable collision with the reality of its fundamental wrongness led to the AI Winter. Because, in fact, though you can build logic out of many layers of heuristics (as human brains do), it absolutely does not follow that it is trivial to build other things that also require layers of heuristics out of some basic logic building blocks. (In contrast, the AI technology of the present, which is showing more promise, is called Deep Learning because it consists literally of multiple layers of heuristics. It is also still considerably worse at it than any 2 year old human.)

I see the problem with the OpenStack-as-layers model as being analogous. (I am not suggesting there will be a full-on OpenStack Winter, but we are well past the Peak of Inflated Expectations.) With Nova, Keystone, Glance, Neutron, and Cinder you can build a pretty good Virtual Private Server hosting service. But it is a mistake to think that cloud is something you get by layering stuff on top of VPS hosting. It is relatively easy to build a VPS host on top of a cloud, just like teaching someone calculus. But it is enormously difficult to build a cloud on top of a VPS host (it would involve a lot of expensive layers of abstraction, comparable to building artificial neurons in software).

That is all very abstract, so let me bring in a concrete example. Kubernetes is event-driven at a very fundamental level: when a pod or a whole kubelet dies, Kubernetes gets a notification immediately and that prompts it to reschedule the workload. In contrast, Nova/Cinder/&c. are a black hole. You cannot even build a sane dashboard for your VPS—let alone cloud-style orchestration—over them, because it will have to spend all of its time polling the APIs to find out if anything happened. There is an entire separate project, that almost no deployments include, basically dedicated to spelunking in the compute node without Nova’s knowledge to try to surface this information. It is no criticism of the team in question, who are doing something that desperately needs doing in the only way that is really open to them, but the result is an embarrassingly bad architecture for OpenStack as a whole.

So yes, it is sometimes helpful to think about the fact that there is a group of components that own the low level interaction with outside systems (hardware, or IdM in the case of Keystone), and that almost every application will end up touching those directly or indirectly, while each using different subsets of the other functionality… but only in the awareness that those things also need to be built from the ground up as interlocking pieces in a larger puzzle.

Saying that the compute starter kit projects represent a ‘definitive lower level of an OpenStack deployment’ invites the listener to ignore the bigger picture; to imagine that if those lower level services just take care of their own needs then everything else can just build on top. That is a mistake, unless you believe that OpenStack needs only to provide enough building blocks to build VPS hosting out of, because support for all of those higher-level things does not just fall out for free. You have to consciously work at it.

Imagine for a moment that, knowing everything we know now, we had designed OpenStack around a system of event sources and sinks that are reliable in the face of hardware failures and network partitions, with components connecting into it to provide services to the user and to each other. That is what Kubernetes did. That is the key to its success. We need to enable something similar, because OpenStack is still necessary even in a world where Kubernetes exists.

One reason OpenStack is still necessary is the one we started with above: something needs to own the interaction with the underlying physical infrastructure, and the alternatives are all proprietary. Another place where OpenStack can provide value is by being less opinionated and allowing application developers to choose how the event sources and sinks are connected together. That means that users should, for example, be able to customise their own failover behaviour in ‘userspace’ rather than rely on the one-size-fits-all approach of handling everything automatically inside Kubernetes. This is theoretically an advantage of having separate projects instead of a monolithic design—though the fact that the various agents running on a compute node are more tightly bound to their corresponding services than to each other has the potential to offer the worst of both worlds.

All of these thoughts will be used as fodder for writing a technical vision statement for OpenStack. My hope is that will help align our focus as a community so that we can work together in the same direction instead of at cross-purposes. Along the way, we will need many discussions like this one to get to the root of what can be some quite subtle differences in interpretation that nevertheless lead to divergent assumptions. Please join in if you see one happening!

by Zane Bitter at July 17, 2018 03:17 PM

July 15, 2018

Nir Yechiel

July 13, 2018

Red Hat Stack

Red Hat OpenStack Platform 13: five things you need to know about networking

Red Hat OpenStack Platform 13, based on the upstream Queens release, is now Generally Available. Of course this version brings in many improvements and enhancements across the stack, but in this blog post I’m going to focus on the five biggest and most exciting networking features found in this latest release.

Photo by Franck V. on Unsplash

ONE: Overlay network management – bringing consistency and better operational experience

Offering solid support for network virtualization was always a priority of ours. Like many other OpenStack components, the networking subsystem (Neutron) is pluggable so that customers can choose the solution that best fits their business and technological requirements. Red Hat OpenStack Platform 13 adds support for Open Virtual Network (OVN), a network virtualization solution which is built into the Open vSwitch (OVS) project. OVN supports the Neutron API, and offers a clean and distributed implementation of the most common networking capabilities such as bridging, routing, security groups, NAT, and floating IPs. In addition to OpenStack, OVN is also supported in Red Hat Virtualization (available with Red Hat Virtualization 4.2 which was announced earlier this year), with support for Red Hat OpenShift Container Platform expected down the road. This marks our efforts to create consistency and a more unified operational experience between Red Hat OpenStack Platform, Red Hat OpenShift, and Red Hat Virtualization.     

OVN was available as a technology preview feature with Red Hat OpenStack Platform 12, and is now fully supported with Red Hat OpenStack Platform 13. OVN must be enabled as the overcloud Neutron backend from Red Hat OpenStack Platform director during deployment time, as the default Neutron backend is still ML2/OVS. Also note that migration tooling from ML2/OVS to OVN is not supported with Red Hat OpenStack Platform 13, and is expected to be offered in a future release, and so OVN is only recommended for new deployments.

TWO: Open source SDN Controller

OpenDaylight is a flexible, modular, and open software-defined networking (SDN) platform, which is now fully integrated and supported with Red Hat OpenStack Platform 13. The Red Hat offering combines carefully selected OpenDaylight components that are designed to enable the OpenDaylight SDN controller as a networking backend for OpenStack, giving it visibility into, and control over, OpenStack networking, utilization, and policies.

OpenDaylight is co-engineered and integrated with Red Hat OpenStack Platform, including Red Hat OpenStack Platform director for automated deployment, configuration and lifecycle management.

The key OpenDaylight project used in this solution is NetVirt, offering support for the OpenStack Neutron API on top of OVS. For telecommunication customers this support extends to OVS-DPDK implementations. Also, as a technology preview, customers can leverage OpenDaylight with OVS hardware offload on capable network adapters to offload the virtual switch data path processing to the network card, further optimizing the server footprint.


THREE: Cloud ready load balancing as a service

Load balancing is a fundamental service of any cloud. It is a key element essential for enabling automatic scaling and availability of applications hosted in the cloud, and is required both for “three tier” apps and for emerging cloud native, microservices-based app architectures.

During the last few development cycles, the community has worked on a new load balancing as a service (LBaaS) solution based on the Octavia project. Octavia provides tenants with a load balancing API, and implements the delivery of load balancing services via a fleet of service virtual machine instances, which it spins up on demand. With Red Hat OpenStack Platform 13, customers can use the OpenStack Platform director to easily deploy and set up Octavia and expose it to the overcloud tenants, including setting up a pre-created, supported and secured Red Hat Enterprise Linux based service VM image.

Figure 2. Octavia HTTPS traffic flow through to a pool member

FOUR: Integrated networking for OpenStack and OpenShift

OpenShift Container Platform, Red Hat’s enterprise distribution of Kubernetes optimized for continuous application development, is infrastructure independent. You can run it on public cloud, virtualization, OpenStack or anything that can boot Red Hat Enterprise Linux. But in order to run Kubernetes and application containers, you need control and flexibility at scale on the infrastructure level. Many of our customers are looking into OpenStack as a platform to expose VM and bare metal resources for OpenShift to provide Kubernetes clusters to different parts of the organization – nicely aligning with the strong multi-tenancy and isolation capabilities of OpenStack as well as its rich APIs.     

As a key contributor to both OpenStack and Kubernetes, Red Hat is shaping this powerful combination so that enterprises can not only deploy OpenShift on top of OpenStack, but also take advantage of the underlying infrastructure services exposed by OpenStack. A good example of this is through networking integration. Out of the box, OpenStack provides overlay networks managed by Neutron. However, OpenShift, based on Kubernetes and the Container Network Interface (CNI) project, also provides overlay networking between container pods. This results in two, unrelated, network virtualization stacks that run on top of each other and make the operational experience, as well as the overall performance of the solution, not optimal. With Red Hat OpenStack Platform 13, Neutron was enhanced so that it can serve as the networking layer for both OpenStack and OpenShift, allowing a single network solution to serve both container and non-container workloads. This is done through project Kuryr and kuryr-kubernetes, a CNI plugin that provides OpenStack networking to Kubernetes objects.

Customers will be able to take advantage of Kuryr with an upcoming Red Hat OpenShift Container Platform release, where we will also release openshift-ansible support for automated deployment of Kuryr components (kuryr-controller, kuryr-cni) on OpenShift Master and Worker nodes.   

Figure 3. OpenShift and OpenStack

FIVE: Deployment on top of routed networks

As data center network architectures evolve, we are seeing a shift away from L2-based network designs towards fully L3 routed fabrics in an effort to create more efficient, predictable, and scalable communication between end-points in the network. One such trend is the adoption of leaf/spine (Clos) network topology where the fabric is composed of leaf and spine network switches: the leaf layer consists of access switches that connect to devices like servers, and the spine layer is the backbone of the network. In this architecture, every leaf switch is interconnected with each and every spine switch using routed links. Dynamic routing is typically enabled throughout the fabric and allows the best path to be determined and adjusted automatically. Modern routing protocol implementations also offer Equal-Cost Multipathing (ECMP) for load sharing of traffic between all available links simultaneously.

Originally, Red Hat OpenStack Platform director was designed to use shared L2 networks between nodes. This significantly reduces the complexity required to deploy OpenStack, since DHCP and PXE booting are simply done over a shared broadcast domain. This also makes the network switch configuration straightforward, since typically there is only a need to configure VLANs and ports, but no need to enable routing between all switches. This design, however, is not compatible with L3 routed network solutions such as the leaf/spine network architecture described above.

With Red Hat OpenStack Platform 13, director can now deploy OpenStack on top of fully routed topologies, utilizing its composable network and roles architecture, as well as a DHCP relay to support provisioning across multiple subnets. This provides customers with the flexibility to deploy on top of L2 or L3 routed networks from a single tool.


Learn more

Learn more about Red Hat OpenStack Platform:


For more information on Red Hat OpenStack Platform and Red Hat Virtualization contact your local Red Hat office today!

by Nir Yechiel, Principal Product Manager, Red Hat at July 13, 2018 01:28 AM

July 11, 2018

RDO Blog

Community Blog Round-Up: 11 July

I know what you’re thinking – another blog round up SO SOON?!? Is it MY BIRTHDAY?!? Maybe! But it’s definitely OpenStack’s birthday this month – eight years old – and there are an absolute TON of blog posts as a result. Well, maybe not a ton, but definitely a lot to write about and therefore, there are a lot more community blog round ups. Expect more of the same as content allows! So, sit back and enjoy the latest RDO community blog round-up while you eat a piece of cake and wish a very happy birthday to OpenStack.

Virtualize your OpenStack control plane with Red Hat Virtualization and Red Hat OpenStack Platform 13 by Ramon Acedo Rodriguez, Product Manager, OpenStack

With the release of Red Hat OpenStack Platform 13 (Queens) we’ve added support to Red Hat OpenStack Platform director to deploy the overcloud controllers as virtual machines in a Red Hat Virtualization cluster. This allows you to have your controllers, along with other supporting services such as Red Hat Satellite, Red Hat CloudForms, Red Hat Ansible Tower, DNS servers, monitoring servers, and of course, the undercloud node (which hosts director), all within a Red Hat Virtualization cluster. This can reduce the physical server footprint of your architecture and provide an extra layer of availability.

Read more at https://redhatstackblog.redhat.com/2018/07/10/virtualize-your-openstack-control-plane-with-red-hat-virtualization-and-red-hat-openstack-platform-13/

Red Hat OpenStack Platform: Making innovation accessible for production by Maria Bracho, Principal Product Manager OpenStack

An OpenStack®️-based cloud environment can help you digitally transform to succeed in fast-paced, competitive markets. However, for many organizations, deploying open source software supported only by the community can be intimidating. Red Hat®️ OpenStack Platform combines community-powered innovation with enterprise-grade features and support to help your organization build a production-ready private cloud.

Read more at https://redhatstackblog.redhat.com/2018/07/09/red-hat-openstack-platform-making-innovation-accessible-for-production/

Converting policy.yaml to a list of dictionaries by Adam Young

The policy.yaml file generated from oslo is not very useful for anything other than feeding to oslo-policy to enforce. If you want to use these values for anything else, it would be much more useful to have each rule as a dictionary, and all of the rules in a list. Here is a little bit of awk to help out:

Read more at https://adam.younglogic.com/2018/07/policy-yaml-dictionary/

A Git Style change management for a Database driven app. by Adam Young

The Policy management tool I’m working on really needs revision and change management. Since I’ve spent so much time with Git, it affects my thinking about change management things. So, here is my attempt to lay out my current thinking for implementing a git-like scheme for managing policy rules.

Read more at https://adam.younglogic.com/2018/07/a-git-style-change-management-for-a-database-driven-app/

by Rain Leander at July 11, 2018 02:58 PM

July 10, 2018

Red Hat Stack

Virtualize your OpenStack control plane with Red Hat Virtualization and Red Hat OpenStack Platform 13

With the release of Red Hat OpenStack Platform 13 (Queens) we’ve added support to Red Hat OpenStack Platform director to deploy the overcloud controllers as virtual machines in a Red Hat Virtualization cluster. This allows you to have your controllers, along with other supporting services such as Red Hat Satellite, Red Hat CloudForms, Red Hat Ansible Tower, DNS servers, monitoring servers, and of course, the undercloud node (which hosts director), all within a Red Hat Virtualization cluster. This can reduce the physical server footprint of your architecture and provide an extra layer of availability.

Please note: this is not using Red Hat Virtualization as an OpenStack hypervisor (i.e. the compute service, which is already nicely done with nova via libvirt and KVM) nor is this about hosting the OpenStack control plane on OpenStack compute nodes.

Video courtesy: Rhys Oxenham, Manager, Field & Customer Engagement

Benefits of virtualization

Red Hat Virtualization (RHV) is an open, software-defined platform built on Red Hat Enterprise Linux and the Kernel-based Virtual Machine (KVM) featuring advanced management tools.  RHV gives you a stable foundation for your virtualized OpenStack control plane.

By virtualizing the control plane you gain instant benefits, such as:

  • Dynamic resource allocation to the virtualized controllers: scale up and scale down as required, including CPU and memory hot-add and hot-remove to prevent downtime and allow for increased capacity as the platform grows.
  • Native high availability for Red Hat OpenStack Platform director and the control plane nodes.
  • Additional infrastructure services can be deployed as VMs on the same RHV cluster, minimizing the server footprint in the datacenter and making efficient use of the physical nodes.
  • Ability to define more complex OpenStack control planes based on composable roles. This capability allows operators to allocate resources to specific components of the control plane; for example, an operator may decide to split out networking services (Neutron) and allocate more resources to them as required.
  • Maintenance without service interruption: RHV supports VM live migration, which can be used to relocate the OSP control plane VMs to a different hypervisor during their maintenance.
  • Integration with third party and/or custom tools engineered to work specifically with RHV, such as backup solutions.

Benefits of subscription

There are many ways to purchase Red Hat Virtualization, but many Red Hat OpenStack Platform customers already have it since it’s included in our most popular OpenStack subscription bundles, Red Hat Cloud Infrastructure and Red Hat Cloud Suite. If you have purchased OpenStack through either of these, you already own RHV subscriptions!

Logical Architecture

This is how the architecture looks when the overcloud is split between Red Hat Virtualization for the control plane and bare metal compute nodes for the tenants’ workloads.


Installation workflow

A typical installation workflow consists of the following steps:


Preparation of the Cluster/Host networks

In order to use multiple networks (referred to as “network isolation” in OpenStack deployments), each VLAN (Tenant, Internal, Storage, …) will be mapped to a separate logical network and allocated to the hosts’ physical nics. Full details are in the official documentation.

Preparation of the VMs

The Red Hat OpenStack Platform control plane usually consists of one director node and (at least) three controller nodes. When these VMs are created in RHV, the same requirements we have for these nodes on bare metal apply.

The director VM should have a minimum of 8 cores (or vCPUs), 16 GB of RAM and 100 GB of storage. More information can be found in the official documentation.

The controllers should have at least 32 GB of RAM and 16 vCPUs. While the same amount of resources is required for virtualized controllers, by using RHV we gain the ability to better optimize that resource consumption across our underlying hypervisors.

Red Hat Virtualization Considerations

Red Hat Virtualization needs to be configured with some specific settings to host the VMs for the controllers:

Anti-affinity for the controller VMs

We want to ensure there is only one OpenStack controller per hypervisor so that in case of a hypervisor failure, the service disruption is limited to a single controller. This allows HA to be taken care of using the different levels of high availability mechanisms already built into the system. For this to work we use RHV to configure an affinity group with “soft negative affinity,” effectively giving us “anti-affinity!” Additionally, it provides the flexibility to override this rule in case of system constraints.

VM network configuration

One vNIC per VLAN

In order to use multiple networks (referred to as “network isolation” in OpenStack deployments), each VLAN (Tenant, Internal, Storage, …) will be mapped to a separate virtual NIC (vNIC) in the controller VMs and VLAN “untagging” will be done at the hypervisor (cluster) and VM level.

Full details can be found in the official documentation.


Allow MAC Spoofing

For the virtualized controllers to allow the network traffic in and out correctly, the MAC spoofing filter must be disabled on the networks that are attached to the controller VMs. To do this we set no_filter in the vNIC of the director and controller VMs, then restart the VMs and disable the MAC anti-spoofing filter.

Important Note: If this is not done DHCP and PXE booting of the VMs from director won’t work.

Implementation in director

Red Hat OpenStack Platform director (TripleO’s downstream release) uses the Ironic Bare Metal provisioning component of OpenStack to deploy the OpenStack components on physical nodes. In order to add support for deploying the controllers on Red Hat Virtualization VMs, we enabled support in Ironic with a new driver named staging-ovirt.

This new driver manages the VMs hosted in RHV similarly to how other drivers manage physical nodes using BMCs supported by Ironic, such as iRMC, iDrac or iLO. For RHV this is done by interacting with the RHV manager directly to trigger power management actions on the VMs.

Enabling the staging-ovirt driver in director

Director needs to enable support for the new driver in Ironic. This is done as you would do it for any other Ironic driver by simply specifying it in the undercloud.conf configuration file:

enabled_hardware_types = ipmi,redfish,ilo,idrac,staging-ovirt

After adding the new entry and running openstack undercloud install we can see the staging-ovirt driver listed in the output:

(undercloud) [stack@undercloud-0 ~]$ openstack baremetal driver list
+---------------------+-----------------------+
| Supported driver(s) | Active host(s)        |
+---------------------+-----------------------+
| idrac               | localhost.localdomain |
| ilo                 | localhost.localdomain |
| ipmi                | localhost.localdomain |
| pxe_drac            | localhost.localdomain |
| pxe_ilo             | localhost.localdomain |
| pxe_ipmitool        | localhost.localdomain |
| redfish             | localhost.localdomain |
| staging-ovirt       | localhost.localdomain |
+---------------------+-----------------------+

Register the RHV-hosted VMs with director

When defining a RHV-hosted node in director’s instackenv.json file we simply set the power management type (pm_type) to the “staging-ovirt” driver, provide the relevant RHV manager host name, and include the username and password for the RHV account that can control power functions for the VMs.

{
    "nodes": [
        {
            "name":"osp13-controller-1",
            "pm_type":"staging-ovirt",
            "mac":[
                "00:1a:4a:16:01:39"
            ],
            "cpu":"2",
            "memory":"4096",
            "disk":"40",
            "arch":"x86_64",
            "pm_user":"admin@internal",
            "pm_password":"secretpassword",
            "pm_addr":"rhvm.lab.redhat.com",
            "pm_vm_name":"osp13-controller-1",
            "capabilities": "profile:control,boot_option:local"
        },
        {
            "name":"osp13-controller-2",
            "pm_type":"staging-ovirt",
            "mac":[
                "00:1a:4a:16:01:3a"
            ],
            "cpu":"2",
            "memory":"4096",
            "disk":"40",
            "arch":"x86_64",
            "pm_user":"admin@internal",
            "pm_password":"secretpassword",
            "pm_addr":"rhvm.lab.redhat.com",
            "pm_vm_name":"osp13-controller-2",
            "capabilities": "profile:control,boot_option:local"
        },
        {
            "name":"osp13-controller-3",
            "pm_type":"staging-ovirt",
            "mac":[
                "00:1a:4a:16:01:3b"
            ],
            "cpu":"2",
            "memory":"4096",
            "disk":"40",
            "arch":"x86_64",
            "pm_user":"admin@internal",
            "pm_password":"secretpassword",
            "pm_addr":"rhvm.lab.redhat.com",
            "pm_vm_name":"osp13-controller-3",
            "capabilities": "profile:control,boot_option:local"
        }
    ]
}

A summary of the relevant parameters required for RHV is as follows:

  • pm_user: RHV-M username.
  • pm_password: RHV-M password.
  • pm_addr: hostname or IP of the RHV-M server.
  • pm_vm_name: Name of the virtual machine in RHV-M where the controller will be created.
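
Since the three controller entries differ only in their name and MAC address, the file can also be generated instead of hand-edited. Here is a small convenience sketch, not part of director or of the original post; the MAC addresses, credentials, and RHV-M hostname are the placeholder values from the example above and need to be replaced:

#!/usr/bin/env python3
# Generate an instackenv.json like the example above for RHV-hosted controllers.
# A sketch only: replace the placeholder MACs, credentials and RHV-M address.
import json

macs = ["00:1a:4a:16:01:39", "00:1a:4a:16:01:3a", "00:1a:4a:16:01:3b"]

nodes = []
for i, mac in enumerate(macs, start=1):
    name = "osp13-controller-%d" % i
    nodes.append({
        "name": name,
        "pm_type": "staging-ovirt",        # the new RHV driver
        "mac": [mac],
        "cpu": "2",
        "memory": "4096",
        "disk": "40",
        "arch": "x86_64",
        "pm_user": "admin@internal",       # RHV-M username
        "pm_password": "secretpassword",   # RHV-M password
        "pm_addr": "rhvm.lab.redhat.com",  # RHV-M hostname or IP
        "pm_vm_name": name,                # name of the VM in RHV-M
        "capabilities": "profile:control,boot_option:local",
    })

with open("instackenv.json", "w") as f:
    json.dump({"nodes": nodes}, f, indent=4)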

For more information on Red Hat OpenStack Platform and Red Hat Virtualization contact your local Red Hat office today!

by Ramon Acedo Rodriguez, Product Manager, OpenStack at July 10, 2018 08:31 PM

July 09, 2018

Red Hat Stack

Red Hat OpenStack Platform: Making innovation accessible for production

An OpenStack®️-based cloud environment can help you digitally transform to succeed in fast-paced, competitive markets. However, for many organizations, deploying open source software supported only by the community can be intimidating. Red Hat®️ OpenStack Platform combines community-powered innovation with enterprise-grade features and support to help your organization build a production-ready private cloud.

Through an open source development model, community leadership, and production-grade life-cycle options, Red Hat makes open source software more accessible for production use across industries and organizations of any size and type.

Photo by Omar Albeik on Unsplash

Open source development model

In order for open source technologies to be effective in production, they must provide stability and performance while also delivering the latest features and advances. Our open source development model combines fast-paced, cross-industry community innovation with production-grade hardening, integrations, support, and services. We take an upstream-first approach by contributing all developments back to the upstream community. This makes new features immediately available and helps to drive the interoperability of Red Hat products with upstream releases. Based on community OpenStack releases, Red Hat OpenStack Platform is intensively tested and hardened to meet the rigors of production environments. Ongoing patching, bug fixes, and certification keep your environment up and running.

Community leadership

We know that open source technologies can be of the highest quality and work with communities to deliver robust code. Red Hat is the top code contributor to the OpenStack community. We are responsible for 28% of the code in the Queens release and 18% of the code across all releases. We collaborate with our customers, partners, and industry organizations to identify the features they need to be successful. We then work to add that functionality into OpenStack. Over time, these efforts have resulted in enhancements in OpenStack’s availability, manageability, and performance, as well as industry-specific additions like OpenDaylight support for telecommunications.

Production-grade life-cycle options

The OpenStack community delivers new releases every six months, which can be challenging for many organizations looking to deploy OpenStack-based production environments. We provide stable branch releases of OpenStack that are supported for an enterprise production life cycle—beyond the six-month release cycle of the OpenStack community. With Red Hat OpenStack Platform, we give you two life-cycle options that let you choose when to upgrade and add new features to your cloud environment.

  • Standard release cadence. Upgrade every six to twelve months between standard releases to stay aligned with the latest features as they become available. Standard releases include one year of support.
  • Long-life release cadence. Standardize on long-life releases for up to five years. Long-life releases include three years of support, with the option to extend support for an additional two years with extended life-cycle support (ELS), for up to five years of support total. All new features are included with each long-life release.

Red Hat OpenStack Platform director—an integrated deployment and life-cycle management tool—streamlines upgrades between standard releases. And, the new fast forward upgrade feature in director lets you easily transition between long-life releases, without the need to upgrade to each in-between release. So, if you are currently using Red Hat OpenStack Platform 10, you now have an easy upgrade path to Red Hat OpenStack Platform 13—with fewer interruptions, no need for additional hardware, and simpler implementation of containerized OpenStack services.


Learn more

Red Hat OpenStack Platform can help you overcome the challenges of deploying OpenStack into production use. And, if you aren’t sure about how to build your cloud environment, don’t have the time or resources to do so, or just want some help on your cloud journey, we provide a variety of expert services and training.

Learn more about Red Hat OpenStack Platform:

by Maria Bracho, Principal Product Manager OpenStack at July 09, 2018 08:19 PM

July 08, 2018

Adam Young

Converting policy.yaml to a list of dictionaries

The policy.yaml file generated from oslo has the following format:

# Intended scope(s): system
#"identity:update_endpoint_group": "rule:admin_required"

# Delete endpoint group.
# DELETE /v3/OS-EP-FILTER/endpoint_groups/{endpoint_group_id}
# Intended scope(s): system
#"identity:delete_endpoint_group": "rule:admin_required"

This is not very useful for anything other than feeding to oslo-policy to enforce. If you want to use these values for anything else, it would be much more useful to have each rule as a dictionary, and all of the rules in a list. Here is a little bit of awk to help out:

#!/usr/bin/awk -f
# Convert an oslo-generated sample policy file into a YAML list of dictionaries.
BEGIN {apilines=0; print("---")}
# A commented-out rule line of the form #"rule_name": "check_string".
# Emit it as a continuation of the current item if a description block is
# open, otherwise start a new list item.
/#"/ {
    if (api == 1){
	printf("  ")
    }else{
	printf("- ")
    }
  split ($0,array,"\"")
  print ("rule:", array[2]);
  print ("  check:", array[4]);
  rule=0
}    
# Any other comment line means we are inside an API description block.
/# / {api=1;}
# A blank line ends the current block.
/^$/ {api=0; apilines=0;}
# The first comment line of a block becomes the description of a new item.
api == 1 && apilines == 0 {print ("- description:" substr($0,2))}
# Comment lines that document the HTTP method and path become "VERB: path".
/# GET/  || /# DELETE/ || /# PUT/ || /# POST/ || /# HEAD/ || /# PATCH/ {
     print ("  " $2 ": " $3)
}
api == 1 { apilines = apilines +1 }

I have it saved in mungepolicy.awk. I ran it like this:

cat etc/keystone.policy.yaml.sample | ./mungepolicy.awk > /tmp/keystone.access.yaml

And the output looks like this:

---
- rule: admin_required
  check: role:admin or is_admin:1
- rule: service_role
  check: role:service
- rule: service_or_admin
  check: rule:admin_required or rule:service_role
- rule: owner
  check: user_id:%(user_id)s
- rule: admin_or_owner
  check: rule:admin_required or rule:owner
- rule: token_subject
  check: user_id:%(target.token.user_id)s
- rule: admin_or_token_subject
  check: rule:admin_required or rule:token_subject
- rule: service_admin_or_token_subject
  check: rule:service_or_admin or rule:token_subject
- description: Show application credential details.
  GET: /v3/users/{user_id}/application_credentials/{application_credential_id}
  HEAD: /v3/users/{user_id}/application_credentials/{application_credential_id}
  rule: identity:get_application_credential
  check: rule:admin_or_owner
- description: List application credentials for a user.
  GET: /v3/users/{user_id}/application_credentials
  HEAD: /v3/users/{user_id}/application_credentials
  rule: identity:list_application_credentials
  check: rule:admin_or_owner

Which is valid YAML. It might be a pain to deal with the verbs in separate keys. Ideally, that would be a list, too, but this will work for starters.
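
As a possible follow-up, a few lines of Python (assuming PyYAML is installed) can fold the separate verb keys into a single list; this is only a sketch of the idea, not part of the awk above:

#!/usr/bin/env python3
# Fold the per-verb keys (GET, HEAD, ...) of each rule into an 'apis' list.
import sys

import yaml

VERBS = {"GET", "HEAD", "PUT", "POST", "DELETE", "PATCH"}

with open(sys.argv[1]) as f:
    rules = yaml.safe_load(f)

for rule in rules:
    # Collect any HTTP-verb keys on this rule into a list of method/path pairs.
    apis = [{"method": verb, "path": rule.pop(verb)}
            for verb in list(rule) if verb in VERBS]
    if apis:
        rule["apis"] = apis

yaml.safe_dump(rules, sys.stdout, default_flow_style=False)

Run against the /tmp/keystone.access.yaml generated above, each rule would then carry an apis list of method/path pairs instead of one key per verb.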

by Adam Young at July 08, 2018 03:38 AM

July 06, 2018

Adam Young

A Git Style change management for a Database driven app.

The Policy management tool I’m working on really needs revision and change management.  Since I’ve spent so much time with Git, it affects my thinking about change management things.  So, here is my attempt to lay out my current thinking for implementing a git-like scheme for managing policy rules.

A policy line is composed of two chunks of data: a Key and a Value.  The keys are in the form

  identity:create_user.

Additionally, the keys are scoped to a specific service (Keystone, Nova, etc).

The value is the check string.  These are of the form

role:admin and project_id=target.project_id

It is the check string that is most important to revision control. This lends itself to an entity diagram like this:

Whether each of these gets its own table remains to be seen.  The interesting part is the rule_name to policy_rule mapping.

Let’s state that the policy_rule table entries are immutable.  If we want to change policy, we add a new entry, and leave the old ones in there.  The new entry will have a new revision value.  For now, let’s assume revisions are integers and are monotonically increasing.  So, when I first upload the Keystone policy.json file, each entry gets a revision ID of 1.  In this example, all check_strings start off as “is_admin:True”.

Now let’s assume I modify the identity:create_user rule.  I’m going to arbitrarily say that the id for this record is 68.  I want to change it to:

role:admin and domain_id:target.domain_id

So we can do some scope checking.  This entry goes into the policy_rule table like so:

 

rule_name_id  check_string                                revision
68            is_admin:True                               1
68            role:admin and domain_id:target.domain_id   2

From a storage perspective this is quite nice, but from a “what does my final policy look like” perspective it is a mess.

In order to build the new view, we need sql along the lines of

select * from policy_rule where revision = ?

Let’s call this line_query and assume that when we call it, the parameter is substituted for the question mark.  We would then need code like this pseudo-code:

doc = dict()
for revision in range(1, max_revision + 1):
    for result in line_query.execute(revision):
        index = result['rule_name_id']
        doc[index] = result['check_string']

 

This would build a dictionary layer by layer through all the revisions.

So far so good, but what happens if we decided to revert, and then to go a different direction? Right now, we have a revision chain like this:

And if we keep going, we have,

But what happens if 4 was a mistake? We need to revert to 3 and create a new branch.

We have two choices. First, we could be destructive and delete all of the lines in revision 4, 5, and 6. This means we can never recreate the state of 6 again.

What if we don’t know that 4 is a mistake? What if we just want to try another route, but come back to 4,5, and 6 in the future?

We want this:

 

But how will we know to take the branch when we create the new doc?

It’s a database! We put it in another table.

revision_id revision_parent_id
2 1
3 2
4 3
5 4
6 5
7 3
8 7
9 8

In order to recreate revision 9, we use a stack. Push 9 on the stack, then find the row with revision_id 9 in the table, push the revision_parent_id on the stack, and continue until there are no more rows.  Then, pop each revision_id off the stack and execute the same kind of pseudo code I posted above.
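
Here is a small Python sketch of that reconstruction, using in-memory structures in place of the two tables; the data mirrors the example above, and a real implementation would run the line_query per revision instead of filtering a list:

#!/usr/bin/env python3
# Rebuild the policy document for one revision by walking the parent chain.
# A sketch only: plain dicts and lists stand in for the database tables.

# policy_rule rows: (rule_name_id, check_string, revision)
policy_rules = [
    (68, "is_admin:True", 1),
    (68, "role:admin and domain_id:target.domain_id", 2),
]

# revision_id -> revision_parent_id (revision 1 is the root and has no parent)
revision_parents = {2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 3, 8: 7, 9: 8}


def build_doc(revision):
    # Push the requested revision, then its parents, until we reach the root.
    stack = [revision]
    while stack[-1] in revision_parents:
        stack.append(revision_parents[stack[-1]])

    # Pop back down so the oldest revision is applied first and later
    # revisions overwrite earlier check strings for the same rule.
    doc = {}
    while stack:
        rev = stack.pop()
        for rule_name_id, check_string, row_revision in policy_rules:
            if row_revision == rev:
                doc[rule_name_id] = check_string
    return doc


print(build_doc(9))  # rule 68 resolves to the revision-2 check string
print(build_doc(1))  # {68: 'is_admin:True'}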

It is a lot.  It is kind of complicated, but it is the type of complicated that Python does well.  However, databases do not do this kind of iterative querying well.  It would take a stored procedure to perform this via a single database query.

Talking through this has encouraged me to take another look at using git as the backing store instead of a relational database.

by Adam Young at July 06, 2018 07:38 PM

July 04, 2018

RDO Blog

Community Blog Round-Up: 04 July

So much happened over the past month that it’s definitely time to set off the fireworks! To start, Steve Hardy shares his tips and tricks for TripleO containerized deployments, then Zane Bitter discusses the ever-expanding OpenStack Foundation, while Maria Bracho introduces us to Red Hat OpenStack Platform’s fast forward upgrades in a step-by-step overview, and so very much more. Obviously, prep the barbecue, it’s time for the Fourth of July community blog round-up!

Red Hat OpenStack Platform: Two life-cycle choices to fit your organization by Maria Bracho, Principal Product Manager OpenStack

OpenStack®️ is a powerful platform for building private cloud environments that support modern, digital business operations. However, the OpenStack community’s six-month release cadence can pose challenges for enterprise organizations that want to deploy OpenStack in production. Red Hat can help.

Read more at https://redhatstackblog.redhat.com/2018/07/02/red-hat-openstack-platform-two-life-cycle-choices-to-fit-your-organization/

CPU model configuration for QEMU/KVM on x86 hosts by Daniel Berrange

With the various CPU hardware vulnerabilities reported this year, guest CPU configuration is now a security critical task. This blog post contains content I’ve written that is on its way to become part of the QEMU documentation.

Read more at https://www.berrange.com/posts/2018/06/29/cpu-model-configuration-for-qemu-kvm-on-x86-hosts/

Requirements for an OpenStack Access Control Policy Management Tool by Adam Young

“We need a read only role.”

Read more at https://adam.younglogic.com/2018/06/requirements-for-an-openstack-access-control-policy-management-tool/

Red Hat OpenStack Platform 13 is here! by Rosa Guntrip

Accelerate. Innovate. Empower. In the digital economy, IT organizations can be expected to deliver services anytime, anywhere, and to any device. IT speed, agility, and innovation can be critical to help stay ahead of your competition. Red Hat OpenStack Platform lets you build an on-premise cloud environment designed to accelerate your business, innovate faster, and empower your IT teams.

Read more at https://redhatstackblog.redhat.com/2018/06/27/red-hat-openstack-platform-13-is-here/

Red Hat Certified Cloud Architect – An OpenStack Perspective – Part Two by Chris Janiszewski – Senior OpenStack Solutions Architect – Red Hat Tiger Team

Previously we learned about what the Red Hat Certified Architect certification is and what exams are included in the “OpenStack-focused” version of the certification. This week we want to focus on personal experience and benefits from achieving this milestone.

Read more at https://redhatstackblog.redhat.com/2018/06/24/red-hat-certified-cloud-architect-an-openstack-perspective-part-two/

Red Hat OpenStack Platform fast forward upgrades: A step-by-step overview by Maria Bracho, Principal Product Manager OpenStack

New in Red Hat®️ OpenStack®️ Platform 13, the fast forward upgrade feature lets you easily move between long-life releases, without the need to upgrade to each in-between release. Fast forward upgrades fully containerize Red Hat OpenStack Platform deployment to simplify and speed the upgrade process while reducing interruptions and eliminating the need for additional hardware. Today, we’ll take a look at what the fast forward upgrade process from Red Hat OpenStack Platform 10 to Red Hat OpenStack Platform 13 looks like in practice.

Read more at https://redhatstackblog.redhat.com/2018/06/22/red-hat-openstack-platform-fast-forward-upgrades-a-step-by-step-overview/

Red Hat Certified Cloud Architect – An OpenStack Perspective – Part One by Chris Janiszewski – Senior OpenStack Solutions Architect – Red Hat Tiger Team

The Red Hat Certified Architect (RHCA) is the highest certification provided by Red Hat. To many, it can be looked at as a “holy grail” of sorts in open source software certifications. It’s not easy to get. In order to receive it, you not only need to already be a Red Hat Certified Engineer (RHCE) for Red Hat Enterprise Linux (with the Red Hat Certified System Administrator (RHCSA) as a prerequisite) but also pass additional exams from various technology categories.

Read more at https://redhatstackblog.redhat.com/2018/06/21/red-hat-certified-cloud-architect-an-openstack-perspective-part-one/

Tips on searching ceph-install-workflow.log on TripleO by John

  1. Only look at the logs relevant to the last run

Read more at http://blog.johnlikesopenstack.com/2018/06/tips-on-searching-ceph-install.html

TripleO Ceph Integration on the Road in June by John

The first week of June I went to an upstream TripleO workshop in Brno. The labs we used are at https://github.com/redhat-openstack/tripleo-workshop

Read more at http://blog.johnlikesopenstack.com/2018/06/tripleo-ceph-integration-on-road-in-june.html

The Expanding OpenStack Foundation by Zane Bitter

The OpenStack Foundation has begun the process of becoming an umbrella organisation for open source projects adjacent to but outside of OpenStack itself. However, there is no clear roadmap for the transformation, which has resulted in some confusion. After attending the joint leadership meeting with the Foundation Board of Directors and various Forum sessions that included some members of the board at the (2018) OpenStack Summit in Vancouver, I believe I can help shed some light on the situation. (Of course this is my subjective take on the topic, and I am not speaking for the Technical Committee.)

Read more at https://www.zerobanana.com/archive/2018/06/14#osf-expansion

Configuring a static address for wlan0 on Raspbian Stretch by Lars Kellogg-Stedman

Recent releases of Raspbian have adopted the use of dhcpcd to manage both dynamic and static interface configuration. If you would prefer to use the traditional /etc/network/interfaces mechanism instead, follow these steps.

Read more at https://blog.oddbit.com/2018/06/14/configuring-a-static-address-f/

Configuring collectd plugins with TripleO by mrunge

A way of deploying OpenStack is to use TripleO. This takes the approach of deploying a small OpenStack environment, and then using OpenStack-provided infrastructure and tools to deploy the actual production environment.

Read more at http://www.matthias-runge.de/2018/06/08/tripleo-collectd/

TripleO Containerized deployments, debugging basics by Steve Hardy

Since the Pike release, TripleO has supported deployments with OpenStack services running in containers. Currently we use docker to run images based on those maintained by the Kolla project. We already have some tips and tricks for container deployment debugging in tripleo-docs, but below are some more notes on my typical debug workflows.

Read more at https://hardysteven.blogspot.com/2018/06/tripleo-containerized-deployments.html

by Rain Leander at July 04, 2018 02:00 PM

July 02, 2018

Red Hat Stack

Red Hat OpenStack Platform: Two life-cycle choices to fit your organization

OpenStack®️ is a powerful platform for building private cloud environments that support modern, digital business operations. However, the OpenStack community’s six-month release cadence can pose challenges for enterprise organizations that want to deploy OpenStack in production. Red Hat can help.

Photo by elizabeth lies on Unsplash

Red Hat®️ OpenStack Platform is an intensely tested, hardened, and supported distribution of OpenStack based on community releases. In addition to production-grade features and functionality, it gives you two life-cycle choices to align with the way your organization operates:

  • Standard releases. These releases follow the six-month community release cadence and include one year of support.
  • Long-life releases. Starting with Red Hat OpenStack Platform 10, every third release is a long-life release. These include three years of support, with the option to extend support for an additional two years with extended life-cycle support (ELS), for up to five years of support total.

Why does this matter? Different organizations have different needs when it comes to infrastructure life cycles and management. Some need to implement the latest innovations as soon as they are available, and have the processes in place to continuously upgrade and adapt their IT environment. For others, the ability to standardize and stabilize operations for long durations of time is paramount. These organizations may not need the newest features right away—periodic updates are fine.

Photo by Tristan Colangelo on Unsplash

Red Hat OpenStack Platform life-cycle options accommodate both of these approaches. Organizations that need constant innovation can upgrade to the latest Red Hat OpenStack Platform release every six months to take advantage of new features as they become available. Organizations that prefer to use a given release for a longer time can skip standard releases and simply upgrade between long-life releases every 18 to 60 months.

Here’s a deeper look into each option and why you might choose one over the other.

Standard upgrade path

With this approach, you upgrade every six to twelve months as a new release of Red Hat OpenStack Platform is made available. Red Hat OpenStack Platform director provides upgrade tooling to simplify the upgrade process. As a result, you can adopt the latest features and innovations as soon as possible. This keeps your cloud infrastructure aligned closely with the upstream community releases, so if you’re active in the OpenStack community, you’ll be able to take advantage of your contributions sooner.

This upgrade path typically requires organizations to have processes in place to efficiently manage continuously changing infrastructure. If you have mature, programmatic build and test processes, you’re in good shape.

The standard upgrade path is ideal for organizations involved in science and research, financial services, and other fields that innovate fast and change quickly.

Photo by Jordan Ladikos on Unsplash

 

Long-life upgrade path

With this approach, you upgrade every 18 to 60 months between long-life releases of Red Hat OpenStack Platform, skipping two standard releases at a time. Starting with Red Hat OpenStack Platform 13, the fast forward upgrade feature in director simplifies the upgrade process by fully containerizing Red Hat OpenStack Platform deployment. This minimizes interruptions due to upgrading and eliminates the need for additional hardware to support the upgrade process. As a result, you can use a long-life release, like Red Hat OpenStack Platform 10 or 13, for an extended time to stabilize operations. Based on customer requests and feasibility reviews, select features in later standard releases may be backported to the last long-life release (Full Support phase only), so you can still gain access to some new features between upgrades.

The long-life upgrade path works well for organizations that are more familiar and comfortable with traditional virtualization and may still be adopting a programmatic approach to IT operations.

This path is ideal for organizations that prefer to standardize on infrastructure and don’t necessarily need access to the latest features right away. Organizations involved in telecommunications and other regulated fields often choose the long-life upgrade path.

Wrapping up

With two life-cycle options for Red Hat OpenStack Platform, Red Hat supports you no matter where you are in your cloud journey. If you have questions about which path is best for your organization, contact us and we’ll help you get started.

Learn more about Red Hat OpenStack Platform:

by Maria Bracho, Principal Product Manager OpenStack at July 02, 2018 12:36 PM

June 29, 2018

Daniel Berrange

CPU model configuration for QEMU/KVM on x86 hosts

With the various CPU hardware vulnerabilities reported this year, guest CPU configuration is now a security critical task. This blog post contains content I’ve written that is on its way to become part of the QEMU documentation.

QEMU / KVM virtualization supports two ways to configure CPU models

Host passthrough
This passes the host CPU model features, model, stepping, exactly to the guest. Note that KVM may filter out some host CPU model features if they cannot be supported with virtualization. Live migration is unsafe when this mode is used as libvirt / QEMU cannot guarantee a stable CPU is exposed to the guest across hosts. This is the recommended CPU to use, provided live migration is not required.
Named model
QEMU comes with a number of predefined named CPU models, that typically refer to specific generations of hardware released by Intel and AMD. These allow the guest VMs to have a degree of isolation from the host CPU, allowing greater flexibility in live migrating between hosts with differing hardware.

In both cases, it is possible to optionally add or remove individual CPU features, to alter what is presented to the guest by default.

Libvirt supports a third way to configure CPU models known as “Host model”. This uses the QEMU “Named model” feature, automatically picking a CPU model that is similar to the host CPU, and then adding extra features to approximate the host model as closely as possible. This does not guarantee the CPU family, stepping, etc will precisely match the host CPU, as they would with “Host passthrough”, but gives much of the benefit of passthrough, while making live migration safe.

Recommendations for KVM CPU model configuration on x86 hosts

The information that follows provides recommendations for configuring CPU models on x86 hosts. The goals are to maximise performance, while protecting guest OS against various CPU hardware flaws, and optionally enabling live migration between hosts with heterogeneous CPU models.

Preferred CPU models for Intel x86 hosts

The following CPU models are preferred for use on Intel hosts. Administrators / applications are recommended to use the CPU model that matches the generation of the host CPUs in use. In a deployment with a mixture of host CPU models between machines, if live migration compatibility is required, use the newest CPU model that is compatible across all desired hosts.

Skylake-Server
Skylake-Server-IBRS
Intel Xeon Processor (Skylake, 2016)
Skylake-Client
Skylake-Client-IBRS
Intel Core Processor (Skylake, 2015)
Broadwell
Broadwell-IBRS
Broadwell-noTSX
Broadwell-noTSX-IBRS
Intel Core Processor (Broadwell, 2014)
Haswell
Haswell-IBRS
Haswell-noTSX
Haswell-noTSX-IBRS
Intel Core Processor (Haswell, 2013)
IvyBridge
IvyBridge-IBRS
Intel Xeon E3-12xx v2 (Ivy Bridge, 2012)
SandyBridge
SandyBridge-IBRS
Intel Xeon E312xx (Sandy Bridge, 2011)
Westmere
Westmere-IBRS
Westmere E56xx/L56xx/X56xx (Nehalem-C, 2010)
Nehalem
Nehalem-IBRS
Intel Core i7 9xx (Nehalem Class Core i7, 2008)
Penryn
Intel Core 2 Duo P9xxx (Penryn Class Core 2, 2007)
Conroe
Intel Celeron_4x0 (Conroe/Merom Class Core 2, 2006)

Important CPU features for Intel x86 hosts

The following are important CPU features that should be used on Intel x86 hosts, when available in the host CPU. Some of them require explicit configuration to enable, as they are not included by default in some, or all, of the named CPU models listed above. In general all of these features are included if using “Host passthrough” or “Host model”.

pcid
Recommended to mitigate the cost of the Meltdown (CVE-2017-5754) fix. Included by default in Haswell, Broadwell & Skylake Intel CPU models. Should be explicitly turned on for Westmere, SandyBridge, and IvyBridge Intel CPU models. Note that some desktop/mobile Westmere CPUs cannot support this feature.
spec-ctrl
Required to enable the Spectre (CVE-2017-5753 and CVE-2017-5715) fix, in cases where retpolines are not sufficient. Included by default in Intel CPU models with -IBRS suffix. Must be explicitly turned on for Intel CPU models without -IBRS suffix. Requires the host CPU microcode to support this feature before it can be used for guest CPUs.
ssbd
Required to enable the CVE-2018-3639 fix. Not included by default in any Intel CPU model. Must be explicitly turned on for all Intel CPU models. Requires the host CPU microcode to support this feature before it can be used for guest CPUs.
pdpe1gb
Recommended to allow guest OS to use 1GB size pages. Not included by default in any Intel CPU model. Should be explicitly turned on for all Intel CPU models. Note that not all CPU hardware will support this feature.
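
Taken together, the guidance above amounts to a little extra -cpu configuration when a named model is used. Below is a minimal sketch, assuming a Westmere-era host where none of these features are enabled by default; the +feature syntax matches the examples in the syntax section later in this post.

   # Hedged sketch: Westmere named model with pcid, spec-ctrl and ssbd
   # turned on explicitly, plus 1GB page support.
   $ qemu-system-x86_64 -cpu Westmere,+pcid,+spec-ctrl,+ssbd,+pdpe1gb ...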

Preferred CPU models for AMD x86 hosts

The following CPU models are preferred for use on AMD hosts. Administrators / applications are recommended to use the CPU model that matches the generation of the host CPUs in use. In a deployment with a mixture of host CPU models between machines, if live migration compatibility is required, use the newest CPU model that is compatible across all desired hosts.

EPYC
EPYC-IBPB
AMD EPYC Processor (2017)
Opteron_G5
AMD Opteron 63xx class CPU (2012)
Opteron_G4
AMD Opteron 62xx class CPU (2011)
Opteron_G3
AMD Opteron 23xx (Gen 3 Class Opteron, 2009)
Opteron_G2
AMD Opteron 22xx (Gen 2 Class Opteron, 2006)
Opteron_G1
AMD Opteron 240 (Gen 1 Class Opteron, 2004)

Important CPU features for AMD x86 hosts

The following are important CPU features that should be used on AMD x86 hosts, when available in the host CPU. Some of them require explicit configuration to enable, as they are not included by default in some, or all, of the named CPU models listed above. In general all of these features are included if using “Host passthrough” or “Host model”.

ibpb
Required to enable the Spectre (CVE-2017-5753 and CVE-2017-5715) fix, in cases where retpolines are not sufficient. Included by default in AMD CPU models with -IBPB suffix. Must be explicitly turned on for AMD CPU models without -IBPB suffix. Requires the host CPU microcode to support this feature before it can be used for guest CPUs.
virt-ssbd
Required to enable the CVE-2018-3639 fix. Not included by default in any AMD CPU model. Must be explicitly turned on for all AMD CPU models. This should be provided to guests, even if amd-ssbd is also provided, for maximum guest compatibility. Note that for some QEMU / libvirt versions, this must be force enabled when using “Host model”, because this is a virtual feature that doesn’t exist in the physical host CPUs.
amd-ssbd
Required to enable the CVE-2018-3639 fix. Not included by default in any AMD CPU model. Must be explicitly turned on for all AMD CPU models. This provides higher performance than virt-ssbd, so it should be exposed to guests whenever available in the host. virt-ssbd should nonetheless also be exposed for maximum guest compatibility, as some kernels only know about virt-ssbd.
amd-no-ssb
Recommended to indicate the host is not vulnerable to CVE-2018-3639. Not included by default in any AMD CPU model. Future hardware generations of CPUs will not be vulnerable to CVE-2018-3639, and thus the guest should be told not to enable its mitigations by exposing amd-no-ssb. This is mutually exclusive with virt-ssbd and amd-ssbd.
pdpe1gb
Recommended to allow guest OS to use 1GB size pages. Not included by default in any AMD CPU model. Should be explicitly turned on for all AMD CPU models. Note that not all CPU hardware will support this feature.
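
As with the Intel models, these features can be added explicitly on the QEMU command line. A minimal sketch, assuming a plain EPYC model (no -IBPB suffix) and using the same +feature syntax shown later in this post:

   # Hedged sketch: EPYC named model with ibpb, virt-ssbd and 1GB page
   # support turned on explicitly, per the guidance above.
   $ qemu-system-x86_64 -cpu EPYC,+ibpb,+virt-ssbd,+pdpe1gb ...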

Default x86 CPU models

The default QEMU CPU models are designed such that they can run on all hosts. If an application does not wish to perform any host compatibility checks before launching guests, the default is guaranteed to work.

The default CPU models will, however, leave the guest OS vulnerable to various CPU hardware flaws, so their use is strongly discouraged. Applications should follow the earlier guidance to setup a better CPU configuration, with host passthrough recommended if live migration is not needed.

qemu32
qemu64
QEMU Virtual CPU version 2.5+ (32 & 64 bit variants). qemu64 is used for x86_64 guests and qemu32 is used for i686 guests, when no -cpu argument is given to QEMU, or no <cpu> is provided in libvirt XML.

Other non-recommended x86 CPUs

The following CPU models are compatible with most AMD and Intel x86 hosts, but their usage is discouraged, as they expose a very limited featureset, which prevents guests having optimal performance.

kvm32
kvm64
Common KVM processor (32 & 64 bit variants). Legacy models just for historical compatibility with ancient QEMU versions.
486
athlon
phenom
coreduo
core2duo
n270
pentium
pentium2
pentium3
Various very old x86 CPU models, mostly predating the introduction of hardware assisted virtualization, that should thus not be required for running virtual machines.

Syntax for configuring CPU models

The examples below illustrate the approach to configuring the various CPU models / features in QEMU and libvirt.

QEMU command line

Host passthrough
   $ qemu-system-x86_64 -cpu host

With feature customization:

   $ qemu-system-x86_64 -cpu host,-vmx,...
Named CPU models
   $ qemu-system-x86_64 -cpu Westmere

With feature customization:

   $ qemu-system-x86_64 -cpu Westmere,+pcid,...

Libvirt guest XML

Host passthrough
   <cpu mode='host-passthrough'/>

With feature customization:

   <cpu mode='host-passthrough'>
       <feature name="vmx" policy="disable"/>
       ...
   </cpu>
Host model
   <cpu mode='host-model'/>

With feature customization:

   <cpu mode='host-model'>
       <feature name="vmx" policy="disable"/>
       ...
   </cpu>
Named model
   <cpu mode='custom'>
       <model name="Westmere"/>
   </cpu>

With feature customization:

   <cpu mode='custom'>
       <model name="Westmere"/>
       <feature name="pcid" policy="require"/>
       ...
   </cpu>

 

by Daniel Berrange at June 29, 2018 12:49 PM

June 28, 2018

Adam Young

Requirements for an OpenStack Access Control Policy Management Tool

“We need a read only role.”

It seems like such a simple requirement.  Users have been requesting a read-only role for several years now.  Why is it so tough to implement?   Because it calls for  modifying access control policy across multiple, disjoint services deployed at innumerable distinct locations.

“We need help in modifying policy to implement our own read only role.”

This one is a little bit more attainable.  We should be able to provide better tools to help people customize their policy.  What should that look like?

We gathered some information at the last summit, and I am going to try and distill it to a requirements document here.

Definitions

  • Verb and Path: the combination of the HTTP verb and the templated sub path that is used by the mapping engines. If I were to use curl to call https://hostname:5000/v3/users/a0123ab6, the verb would be the implicit GET, and the path would be /v3/users/{user_id} (a short curl sketch follows this list).
  • Policy key: the key in the policy.json and policy.yaml file that is used to match the Python code to the policy. For example, the Keystone GET /v3/users/{user_id} verb and path tests against the policy key identity:get_user.
  • API Policy Mapping:  the mapping from Verb and Path to Policy key.
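
To make the Verb and Path definition concrete, here is a minimal curl sketch of the call from the first bullet; the hostname, user ID, and $TOKEN value are placeholders.

# GET is implicit; the templated path is /v3/users/{user_id}, which
# Keystone maps to the policy key identity:get_user.
curl -s -H "X-Auth-Token: $TOKEN" https://hostname:5000/v3/users/a0123ab6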

The tool needs to be run from the installer. While that means TripleO for my team, it should be a tool that can be enlisted into any of the installers. It should also be able to run for day 2 operations from numerous tools.

It should not be deployed as a standard service, at least not one tied in with the active OpenStack install, as modifying policy is a tricky and potentially destructive and dangerous operation.

Input

Policy files need to be gathered from the various services, but this tool does not need to do that; the variations in how to generate, collect, and distribute policy files are too numerous to solve in a single, focused tool.  The collection and distribution fits more into Ansible playbooks than a tool for modifying policy.

External API definitions

End users need to be able to test their policy. While the existing oslo-policy command line can tell whether a token would or would not pass the checks, those are done at the policy key level. All integration is done at the URL level, even if it then passes through libraries or the CLI. The Verb and URL can be retrieved from network tools or debug mode of the CLI, and matched against the tuple of (service, verb, template path) to link back to the policy key, and thus the policy rule that oslo-policy will enforce. Deducing this mapping must be easy. With this mapping, additional tools can mock a request/response to test whether a given set of auth data would pass or fail a request. Thus, the tool should accept a simple format for uploading the mappings of Verb and Path to policy key.
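
For reference, a check with the existing oslo.policy command line looks roughly like the sketch below. The file names are placeholders, and the flags reflect my reading of the oslopolicy-checker tool rather than a verified invocation, so treat this as an assumption.

# Would the auth data in access.json pass the rule behind
# GET /v3/users/{user_id} (policy key identity:get_user)?
oslopolicy-checker --policy policy.yaml --access access.json --rule identity:get_user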

Policy.json

Policy files have several implementations.  The old Policy.json structure provides the least amount of information. Here is a sample:

"context_is_admin": "role:admin",
"default": "role:admin",

"add_image": "",
"delete_image": "",
"get_image": "",
"get_images": "",
"modify_image": "",
"publicize_image": "role:admin",
"copy_from": "",

policy.yaml

The policy-in-code structure provides the most information, including the HTTP Verbs and templated Paths that map to the rules that are the keys in the policy files. The Python code that is used by oslo-policy to generate the sample YAML files uses, but does not expose, all that data. Here is an example:

# This policy only checks if the user has access to the requested
# project limits. And this check is performed only after the check
# os_compute_api:limits passes
# GET /limits
# "os_compute_api:os-used-limits": "rule:admin_api"

A secondary tool should expose all this data as YAML, probably a modification of the oslo-policy CLI. The management tool should be able to consume this format. It should also be able to consume a document, separate from the policy, that maps the policy keys to the Verb and Path.
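
For context, the generator that produces commented samples like the one above already ships with oslo.policy. A minimal sketch, assuming the oslopolicy-sample-generator entry point and its --namespace / --output-file options:

# Generate the commented policy.yaml sample for a service (nova here)
# from its policy-in-code defaults.
oslopolicy-sample-generator --namespace nova --output-file policy.yaml.sample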

Upgrades

A new version of an OpenStack service will likely have new APIs. These APIs will not be covered by existing policy. However, if a site has put major effort into customizing policy in the past, they will not want to lose and redo all of their changes. Thus, it should be possible to upload a new file indicating either the overall API mapping or just the changes from a previous version. If an updated policy-in-code format is available, that file should merge in with the existing policy modifications. The user needs to be able to identify:

  • Any new APIs that require application of the transformations listed below
  • Any changes to base policy that the user has customized and now conflict with the assumptions.  The tool user should be able to accept the old version, the new version, or come up with a modified new, manually merged version.

Transformations

End users need to be able to describe, in simple terms, the transformations that they need to perform. Here are some that have been identified so far:

  • ensure that all APIs match against some role
  • ensure that APIs that require a role (especially admin) also perform a scope check
  • switch the role used for a given operation or set of operations
  • standardize the meaning of interim rules such as “owner.”
  • Inline an interim rule into the rules that use it
  • Extract an interim rule from all the rules that have a common fragment

Implied Roles

The Implied Roles mechanism provides support for policy. The tool should be able to help its users take advantage of implied roles.

  • Make use of implied roles to simplify complex matching rules
  • Make use of implied roles to provide additional granularity for an API
  • Make it possible to expand implied rules in the policy file based on a data model

Change sets

The operations to transform the rules are complex enough that users will need to be able to roll them forward and back, much like a set of changes to a git repository.

User Interface

While the tool should have a visible user interface, the majority of the business logic should reside in an API that is callable from other systems. This seems to imply a pattern of a REST API plus a visible UI toolkit.

The User Interface should make working with large sets of rules possible and convenient.  Appropriate information hiding and selection should be coupled with the transformations to select the set of rules to be transformed.

Datastore

The data store for the application should be light enough to run during the install process. For example, SQLite would be preferred over MySQL.

Output

The tool should be able to produce the individual policy files consumed by the APIs.

It is possible to have a deployment where different policy is in place for different endpoints of the same service.  The tools should support endpoint specific overrides.  However, the main assumption is that these will be small changes from the core service definitions.  As such, they should be treated as “service X plus these changes” as opposed to a completely separate set of policy rules.

 

by Adam Young at June 28, 2018 06:57 PM

Red Hat Stack

Red Hat OpenStack Platform 13 is here!

Accelerate. Innovate. Empower.

In the digital economy, IT organizations can be expected to deliver services anytime, anywhere, and to any device. IT speed, agility, and innovation can be critical to help stay ahead of your competition. Red Hat OpenStack Platform lets you build an on-premise cloud environment designed to accelerate your business, innovate faster, and empower your IT teams.


Accelerate. Red Hat OpenStack Platform can help you accelerate IT activities and speed time to market for new products and services. Red Hat OpenStack Platform helps simplify application and service delivery using an automated self-service IT operating model, so you can provide users with more rapid access to resources. Using Red Hat OpenStack Platform, you can build an on-premises cloud architecture that can provide resource elasticity, scalability, and increased efficiency to launch new offerings faster.

Innovate. Red Hat OpenStack Platform enables you differentiate your business by helping to make new technologies more accessible without sacrificing current assets and operations. Red Hat’s open source development model combines faster-paced, cross-industry community innovation with production-grade hardening, integrations, support, and services. Red Hat OpenStack Platform is designed to provide an open and flexible cloud infrastructure ready for modern, containerized application operations while still supporting the traditional workloads your business relies on.

Empower. Red Hat OpenStack Platform helps your IT organization deliver new services with greater ease. Integrations with Red Hat’s open software stack let you build a more flexible and extensible foundation for modernization and digital operations. A large partner ecosystem helps you customize your environment with third-party products, with greater confidence that they will be interoperable and stable.

With Red Hat OpenStack Platform 13, Red Hat continues to bring together community-powered innovation with the stability, support, and services needed for production deployment. Red Hat OpenStack Platform 13 is a long-life release with up to three years of standard support and an additional, optional two years of extended life-cycle support (ELS). This release includes many features to help you adopt cloud technologies more easily and support digital transformation initiatives.

Fast forward upgrades

With both standard and long-life releases, Red Hat OpenStack Platform lets you choose when to implement new features in your cloud environment:

  • Upgrade every six months and benefit from one year of support on each release.
  • Upgrade every 18 months with long-life releases and benefit from 3 years of support on that release, with an optional ELS bringing the total to up to 5 years of support. Long-life releases include innovations from all previous releases.

Now, with the fast forward upgrade feature, you can skip between long-life releases on an 18-month upgrade cadence. Fast forward upgrades fully containerize Red Hat OpenStack Platform deployment to simplify the process of upgrading between long-life releases. This means that customers who are currently using Red Hat OpenStack Platform 10 have an easier upgrade path to Red Hat OpenStack Platform 13—with fewer interruptions and no need for additional hardware.

Red Hat OpenStack Platform life cycle by version

Containerized OpenStack services

Red Hat OpenStack Platform now supports containerization of all OpenStack services. This means that OpenStack services can be independently managed, scaled, and maintained throughout their life cycle, giving you more control and flexibility. As a result, you can simplify service deployment and upgrades and allocate resources more quickly, efficiently, and at scale.

Red Hat stack integrations

The combination of Red Hat OpenStack Platform with Red Hat OpenShift provides a modern, container-based application development and deployment platform with a scalable hybrid cloud foundation. Kubernetes-based orchestration simplifies application portability across scalable hybrid environments, designed to provide a consistent, more seamless experience for developers, operations, and users.

Red Hat OpenStack Platform 13 delivers several new integrations with Red Hat OpenShift Container Platform:

  • Integration of openshift-ansible into Red Hat OpenStack Platform director eases troubleshooting and deployment.
  • Network integration using the Kuryr OpenStack project unifies network services between the two platforms, designed to eliminate the need for multiple network overlays and reduce performance and interoperability issues.  
  • Load Balancing-as-a-Service with Octavia provides highly available cloud-scale load balancing for traditional or containerized workloads.

Additionally, support for the Open Virtual Networking (OVN) networking stack supplies consistency between Red Hat OpenStack Platform, Red Hat OpenShift, and Red Hat Virtualization.

Security features and compliance focus

Security and compliance are top concerns for organizations deploying clouds. Red Hat OpenStack Platform includes integrated security features to help protect your cloud environment. It encrypts control flows and, optionally, data stores and flows, enhancing the privacy and integrity of your data both at rest and in motion.

Red Hat OpenStack Platform 13 introduces several new, hardened security services designed to help further safeguard enterprise workloads:

  • Programmatic, API-driven secrets management through Barbican
  • Encrypted communications between OpenStack services using Transport Layer Security (TLS) and Secure Sockets Layer (SSL)
  • Cinder volume encryption and Glance image signing and verification

Additionally, Red Hat OpenStack Platform 13 can help your organization meet relevant technical and operational controls found in risk management frameworks globally. Red Hat can help support compliance guidance provided by government standards organizations, including:

  • The Federal Risk and Authorization Management Program (FedRAMP) is a U.S. government-wide program that provides a standardized approach to security assessment, authorization, and continuous monitoring for cloud products and services.
  • Agence nationale de la sécurité des systèmes d’information (ANSSI) is the French national authority for cyber-defense and network and information security (NIS).

An updated security guide is also available to help you when deploying a cloud environment.

Storage and hyperconverged infrastructure options

Red Hat Ceph Storage provides unified, highly scalable, software-defined block, object, and file storage for Red Hat OpenStack Platform deployments and services. Integration between the two enables you to deploy, scale, and manage your storage back end just like your cloud infrastructure. New storage integrations included in Red Hat OpenStack Platform 13 give you more choice and flexibility. With support for the OpenStack Manila project, you can use the CephFS NFS file share as a service to better support applications using file storage. As a result, you can choose the type of storage for each workload, from a unified storage platform.

Red Hat Hyperconverged Infrastructure for Cloud combines Red Hat OpenStack Platform and Red Hat Ceph Storage into a single offering with a common life cycle and support. Both Red Hat OpenStack Platform compute and Red Hat Ceph Storage functions are run on the same host, enabling consolidation and efficiency gains. NFV use cases for Red Hat Hyperconverged Infrastructure for Cloud include:

  • Core datacenters
  • Central office datacenters
  • Edge and remote point of presence (POP) environments
  • Virtual radio access networks (vRAN)
  • Content delivery networks (CDN)

You can also add hyperconverged capabilities to your current Red Hat OpenStack Platform subscriptions using an add-on SKU.

Red Hat Hyperconverged Infrastructure for Cloud use cases

Telecommunications optimizations

Red Hat OpenStack Platform 13 delivers new telecommunications-specific features that allow CSPs to build innovative, cloud-based network infrastructure more easily:

  • OpenDaylight integration lets you connect your OpenStack environment with the OpenDaylight software-defined networking (SDN) controller, giving it greater visibility into and control over OpenStack networking, utilization, and policies.
  • Real-time Kernel-based Virtual Machine (KVM) support designed to deliver ultra-low latency for performance-sensitive environments.
  • Open vSwitch (OVS) offload support (tech preview) lets you implement single root input/output virtualization (SR-IOV) to help reduce the performance impact of virtualization and deliver better performance for high IOPS applications.
Red Hat OpenStack Platform and OpenDaylight cooperation

Learn more

Red Hat OpenStack Platform combines community-powered innovation with enterprise-grade features and support to help your organization build a production-ready private cloud. With it, you can accelerate application and service delivery, innovate faster to differentiate your business, and empower your IT teams to support digital initiatives.

Learn more about Red Hat OpenStack Platform:

by Rosa Guntrip, Senior Principal Product Marketing Manager at June 28, 2018 12:53 AM

June 25, 2018

Red Hat Stack

Red Hat Certified Cloud Architect – An OpenStack Perspective – Part Two

Previously we learned about what the Red Hat Certified Architect certification is and what exams are included in the “OpenStack-focused” version of the certification. This week we want to focus on personal experience and benefits from achieving this milestone.

Let’s be honest, even for the most skilled engineers the path to becoming an RHCA can be quite challenging and even a little bit intimidating!  Not only do the exams test your ability to perform specific tasks based on the certification requirements, but they also test your ability to repurpose that knowledge and combine it with the knowledge of other technologies while solving extremely complex scenarios.  This can make achieving the RHCA even more difficult; however, it also makes achieving the RHCA extremely validating and rewarding.

Photo by Samuel Clara on Unsplash

Many busy professionals decide to prepare for the exams with Red Hat Online Learning (ROLE), which allows students to access the same robust course content and hands-on lab experience delivered in classroom training from the comfort of their own computer and at their own pace. This is made even easier through the Red Hat Learning Subscription (RHLS).

RHLS provides access to the entire Red Hat courseware catalog, including video classrooms, for a single, convenient price per year. This kind of access can help you prepare for all the certifications. We found that before sitting an exam, it was important to be able to perform 100 percent of the respective ROLE lab without referring back to any documentation for help; with RHLS this is much easier to do!  

While documentation and man pages are available during an exam, they should be used as a resource and not a replacement for deep knowledge. Indeed, it’s much better to make sure you know it by heart without needing to look! We also found that applying the comprehensive reviews found at the end of each ROLE course to real world scenarios helped us better understand how what we learned in the course applied to what we would do on a day-to-day basis.  

For example, when taking the Ansible ROLE course DO407, which uses a comprehensive virtual environment and a video classroom version, we were easily able to spawn instances in our own physical OpenStack environment and apply what we had learned in the course to the real world.  By putting the courseware into action in the real world it better allowed us to align the objectives of the course to real-life scenarios, making the knowledge more practical and easier to retain.

What about formal training?

Photo by Nathan Dumlao on Unsplash

We wouldn’t recommend that anyone just show up at the examination room without taking any formal training. Even if you feel that your level of proficiency in any of these technologies is advanced, keep in mind that Red Hat exams go very deep, covering large portions of the technology. For example, you might be an ‘Ansible Ninja’ writing playbooks for a living. But how often do you work with dynamic inventories or take advantage of delegation, vaults or parallelism? The same applies for any other technology you want to test yourself in; there is a good chance it will cover aspects you are not familiar with.

The value comes from having the combination of skills.  Take the example of an auto mechanic who is great at rebuilding a transmission, but may not know how to operate a manual transmission!  You can’t be an expert at one without knowing a lot about the other.

For us, this is where Red Hat training has been invaluable. With every exam there is a corresponding class provided. These classes not only cover each aspect of the technology (and beyond) that you will be tested on, but also provide self-paced lab modules and access to lab environments. They are usually offered with either a live instructor or via an online option so you can juggle the education activities with your ‘day job’ requirements!

More information about the classes for these exams can be found on the Red Hat Training site. 

How long does it take?

It doesn’t have to take long at all. If you already have an RHCE in Red Hat Enterprise Linux and OpenStack is not a new subject to you, the training will serve as an excellent reminder rather than something that you have to learn from scratch. Some people may even be able to complete all 5 exams in less than a month.

But does everyone want to go that fast? Probably not.

Photo by Estée Janssens on Unsplash

When our customers ask us about what we recommend to achieve these certifications in a realistic timeframe we suggest the Red Hat Learning Subscription to them. As mentioned, it gives you amazing access to Red Hat courseware.

But it is more than that.

The Red Hat Learning Subscription is a program for individuals and organizations that not only provides the educational content to prepare you for the exams (including videos and lab access), but also, in some cases, may include actual exams (and some retakes) at many Red Hat certified facilities. It is valid for one year, which is plenty of time to work through all the courses and exams.

This kind of flexibility can help to shape an individual learning path.

For instance, imagine doing it like this:

With the Red Hat Learning Subscription you could schedule all the exams in advance, at two-month intervals. These exams then become your milestones and give you a good, predictable path for studying. You can always reschedule them if something urgent comes up. Sign up for the corresponding classes, but don’t take them too far ahead of each exam. Then re-take all the self-paced labs a week before the exam, without reading the guided instructions. After that you should be in a position to assess your readiness for the exams and reach the ultimate goal of an RHCA.

Don’t get discouraged if you don’t pass on the first try; it’s not unusual even for subject experts to fail at first. Simply close the knowledge gaps and retake the exam. And with RHLS, you’ve got the access and time to do so!

The benefits of becoming RHCA can be substantial. Outside of gaining open source “street cred”, the most important aspect is, of course, for your career – it’s simple: you can get better at your job.

Photo by Clark Tibbs on Unsplash

And of course, being better at your job can translate to being more competitive in the job market, which can lead to being more efficient in your current role and potentially even bring additional financial compensation!

But becoming an RHCA is so much more. It helps to broaden your horizons. You can learn more ways to tackle real life business problems, including how to become more capable of taking leadership roles through translating problems into technology solutions.

As a proud Red Hat Certified Architect you will have the tools to help make the IT world a better place!

So what are you waiting for … go get it!


Ready to start your certification journey? Get in touch with the friendly Red Hatters at Red Hat Training in your local area today to find all the ways you can master the skills you need to accelerate your career and run your enterprise cloud!


About the authors:

Chris Janiszewski is a Red Hat OpenStack Solutions Architect. He is proud to help his clients validate their business and technical use cases on OpenStack and supporting components like storage, networking or cloud automation and management. He is the father of two little kids and enjoys the majority of his free time playing with them. When the kids are asleep he gets to put the “geek hat” on and build OpenStack labs to hack crazy use cases!


Ken Holden is a Senior Solution Architect with Red Hat. He has spent the past 3 years on the OpenStack Tiger Team with the primary responsibility of deploying Red Hat OpenStack Platform Proof-Of-Concept IaaS Clouds for Strategic Enterprise Customers across North America. Throughout his 20 year career in Enterprise IT, Ken has focussed on Linux, Unix, Storage, Networking, and Security with the past 5 years being primarily focused on Cloud Solutions. Ken has achieved Red Hat Certified Architect status (RHCA 110-009-776) and holds Certified OpenStack Administrator status (COA-1700-0387-0100) with the OpenStack Foundation. Outside of work, Ken spends the majority of his time with his wife and two daughters, but also aspires to be the world’s most OK Guitar Player when time permits!

by Chris Janiszewski - Senior OpenStack Solutions Architect - Red Hat Tiger Team at June 25, 2018 02:54 AM

June 22, 2018

Red Hat Stack

Red Hat OpenStack Platform fast forward upgrades: A step-by-step overview

New in Red Hat®️ OpenStack®️ Platform 13, the fast forward upgrade feature lets you easily move between long-life releases, without the need to upgrade to each in-between release. Fast forward upgrades fully containerize Red Hat OpenStack Platform deployment to simplify and speed the upgrade process while reducing interruptions and eliminating the need for additional hardware. Today, we’ll take a look at what the fast forward upgrade process from Red Hat OpenStack Platform 10 to Red Hat OpenStack Platform 13 looks like in practice.


There are six main steps in the process:

  1. Cloud backup. Back up your existing cloud.
  2. Minor update. Update to the latest minor release.
  3. Undercloud upgrade. Upgrade your undercloud.
  4. Overcloud preparation. Prepare your overcloud.
  5. Overcloud upgrade. Upgrade your overcloud.
  6. Convergence. Converge your environment.

Step 1: Back up your existing cloud

First, you need to back up everything in your existing Red Hat OpenStack Platform 10 cloud, including your undercloud, overcloud, and any supporting services. It’s likely that you already have these procedures in place, but Red Hat also provides comprehensive Ansible playbooks to simplify the fast forward process even more.

Manual backup procedures are likewise supported by Red Hat’s Customer Experience and Engagement (CEE) group.

A typical OpenStack backup process may involve the following steps:

  1. Notify your users.
  2. Purge your databases, including any unnecessary data stored by Heat or other OpenStack services. This will help to streamline the backup and upgrade process.
  3. Run undercloud and overcloud backups. This will preserve an initial backup of the cloud – it may take some time if you don’t have an earlier backup to reference at this point.

By performing a backup before starting the upgrade, you can speed the overall upgrade process by only requiring smaller backups later on.

Step 2: Update to the latest minor release

Photo by Lucas Davies on Unsplash

Next, update your Red Hat OpenStack Platform environment to the latest minor release using the standard minor update processes. This step consolidates all undercloud and overcloud node reboots required for moving to Red Hat OpenStack Platform 13. This simplifies the overall upgrade, as no reboots are needed in later steps. For example, an upgrade from Red Hat OpenStack Platform 10 to the latest, fast forward-ready minor release will update Open vSwitch (OVS) to version 2.9, Red Hat Enterprise Linux to version 7.5, and Red Hat Ceph®️ Storage to version 2.5 in your overcloud. These steps do require node reboots, so you can live-migrate workloads prior to rebooting nodes to avoid downtime.

Step 3: Upgrade your undercloud

In this step, you’ll upgrade Red Hat OpenStack Platform director, known as the undercloud, to the new long-life release. This requires manual rolling updates from Red Hat OpenStack Platform 10 to 11 to 12 to 13, but does not require any reboots, as they were completed in the previous minor update. The same action pattern is repeated for each release: enable the new repository, stop main OpenStack Platform services, upgrade director’s main packages, and upgrade the undercloud. Note that Red Hat OpenStack Platform director will not be able to manage the version 10 overcloud during or after these upgrades.
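
As a rough illustration of that repeated pattern, each hop looks something like the sketch below (shown here for the 10-to-11 step). The repository names and package list are illustrative and vary by release, so treat this as an outline of the documented procedure rather than a copy-and-paste recipe.

   # Enable the next release's repository (names are illustrative).
   $ sudo subscription-manager repos --disable=rhel-7-server-openstack-10-rpms \
         --enable=rhel-7-server-openstack-11-rpms
   # Stop the main undercloud services, update director's packages,
   # then run the undercloud upgrade itself.
   $ sudo systemctl stop 'openstack-*' 'neutron-*' httpd
   $ sudo yum update -y python-tripleoclient
   $ openstack undercloud upgrade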

Step 4: Prepare your overcloud

Red Hat OpenStack Platform 13 introduces containerized OpenStack services to the long-life release cadence. This step goes through the process to create the container registry to support the deployment of these new services during the fast forward procedure.

Photo by Arnel Hasanovic on Unsplash

The first part of this step is to prepare the container images to accomplish this:

  1. Upload Red Hat OpenStack Platform 13 container images to your cloud environment. These can be stored on the director node or on additional hardware. If you choose to store them on your director node, ensure that the node has enough space available for the images. Note that during this part, your undercloud will be unable to scale your overcloud.
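
A minimal sketch of this image preparation, assuming the tripleoclient container image commands and a local registry on the undercloud; the namespace, registry address, and file paths are placeholders rather than a verified procedure.

   # Generate an environment file describing the OSP 13 container images
   # and a list of images to pull into the undercloud's local registry.
   $ openstack overcloud container image prepare \
         --namespace registry.access.redhat.com/rhosp13 \
         --push-destination 192.168.24.1:8787 \
         --output-images-file ~/local_registry_images.yaml \
         --output-env-file ~/templates/overcloud_images.yaml
   $ openstack overcloud container image upload \
         --config-file ~/local_registry_images.yaml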

Next, you’ll prepare your overcloud for features introduced in Red Hat OpenStack Platform 11 and 12, including composable networks and roles:

  1. Include new services in any custom roles_data files.
  2. Edit any custom roles_data files to add composable networks (new for Red Hat OpenStack Platform 13) to each role.
  3. Remove deprecated services from any custom roles_data files and update deprecated parameters in custom environment files.

If you have a Red Hat OpenStack Platform director-managed Red Hat Ceph Storage cluster or storage backends, you’ll also need to prepare your storage nodes for new, containerized configuration methods.

  1. Install the ceph-ansible package of playbooks in your undercloud and check that you are using the latest resources and configurations in your storage environment file.
  2. Update custom storage backend environment files to include new parameters and resources for composable services. This applies to NetApp, Dell EMC, and Dell EqualLogic block storage backends using cinder.

Finally, if your undercloud uses SSL/TLS for its Public API, you’ll need to allow your overcloud to access your undercloud’s OpenStack Object Storage (swift) Public API during the upgrade process.

  1. Add your undercloud’s certificate authority to each overcloud node using an Ansible playbook.
  2. Perform one last backup. This is the final opportunity for backups before starting the overcloud upgrade.

Step 5: Upgrade your overcloud

Photo by eberhard grossgasteiger on Unsplash

This step is the core of the fast forward upgrade procedure. Remember that director is unable to manage your overcloud until this step is completed. During this step you’ll upgrade all overcloud roles and services from version 10 to version 13 using a fully managed series of commands. Let’s take a look at the process for each role.

Controller nodes

First, you’ll upgrade your control plane. This is performed on a single controller node, but does require your entire control plane to be down. Even so, it does not affect currently running workloads. Upgrade the chosen controller node sequentially through Red Hat OpenStack Platform releases to version 13. Once the database on the upgraded controller has been updated, containerized Red Hat OpenStack Platform 13 services can be deployed to all other controllers.

Compute nodes

Next, you’ll upgrade your compute nodes. As with your controller nodes, only OpenStack services are upgraded—not the underlying operating system. Node reboots are not required and workloads are unaffected by the process. The upgrade process is very fast, as it adds containerized services alongside RPM-based services and then simply switches over each service. During the process, however, compute users will not be able to create new instances. Some network services may also be affected.

To get familiar with the process and ensure compatibility with your environment, we recommend starting with a single, low-risk compute node.

Storage (Red Hat Ceph Storage) nodes

Finally, you’ll upgrade your Red Hat Ceph Storage nodes. While this upgrade is slightly different from the controller and compute node upgrades, it is not disruptive to services and your data plane remains available throughout the procedure. Director uses the ceph-ansible installer, making the upgrade of your storage nodes simpler. It uses a rolling upgrade process to first upgrade your bare-metal services to Ceph 3.0, and then containerizes Ceph services.

Photo by Steve Johnson on Unsplash

Step 6: Converge your environment

At this point, you’re almost done with the fast forward process. The final step is to converge all components in your new Red Hat OpenStack Platform 13 environment. As mentioned previously, until all overcloud components are upgraded to the same version as your Red Hat OpenStack Platform director, you have only limited overcloud management capabilities. While your workloads are unaffected, you’ll definitely want to regain full control over your environment.

This step finishes the fast forward upgrade process. You’ll update your overcloud stack within your undercloud. This ensures that your undercloud has the current view of your overcloud and resets your overcloud for ongoing operation. Finally, you’ll be able to operate your Red Hat OpenStack Platform environment as normal: add nodes, upgrade components, scale services, and manage everything from director.
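
Pulling steps 5 and 6 together, the director-driven command flow looks broadly like the sketch below. The subcommand names and flags here are my assumptions about the OSP 13 fast forward tooling, not a verified sequence; always follow the release documentation for your environment.

   # Step 5: prepare and run the fast forward upgrade, then upgrade each
   # role in turn (controllers, a single test compute node, the remaining
   # computes, and finally the Ceph storage nodes).
   $ openstack overcloud ffwd-upgrade prepare --templates -e <your environment files>
   $ openstack overcloud ffwd-upgrade run
   $ openstack overcloud upgrade run --roles Controller
   $ openstack overcloud upgrade run --nodes compute-0
   $ openstack overcloud upgrade run --roles Compute
   $ openstack overcloud ceph-upgrade run
   # Step 6: converge so director regains full control of the overcloud.
   $ openstack overcloud ffwd-upgrade converge --templates -e <your environment files>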

Conclusion

Fast forward upgrades simplify the process of moving between long-life releases of Red Hat OpenStack Platform. However, upgrading from Red Hat OpenStack Platform 10 to containerized architecture of Red Hat OpenStack Platform 13 is still a significant change. As always, Red Hat is ready to help you succeed with detailed documentation, subscription support, and consulting services.

Watch the OpenStack Upgrades Strategy: The Fast Forward Upgrade video from OpenStack Summit Vancouver 2018 to learn more about the fast forward upgrade approach.

Learn more about Red Hat OpenStack Platform:

by Maria Bracho, Principal Product Manager OpenStack at June 22, 2018 09:33 PM

Red Hat Certified Cloud Architect – An OpenStack Perspective – Part One

The Red Hat Certified Architect (RHCA) is the highest certification provided by Red Hat. To many, it can be looked at as a “holy grail” of sorts in open source software certifications. It’s not easy to get. In order to receive it, you not only need to already be a Red Hat Certified Engineer (RHCE) for Red Hat Enterprise Linux (with the Red Hat Certified System Administrator (RHCSA) as a prerequisite) but also pass additional exams from various technology categories.

Photo by Vasily Koloda on Unsplash

There are roughly 20 exams to choose from that qualify towards the RHCA. Each exam is valid for 3 years, so as long as you complete 5 exams within a 3 year period, you will qualify for the RHCA. With that said, you must keep these exams up to date if you don’t want to lose your RHCA status.

An RHCA for OpenStack!

Ok, the subtitle might be misleading – there is no OpenStack-specific RHCA certification! However, you can select exams that will test your knowledge in technologies needed to successfully build and run OpenStack private clouds. We feel the following certifications demonstrate skills that are crucial for OpenStack:

Let’s take a deeper look at each one.

The first two are strictly OpenStack-based. To become a Red Hat Certified System Administrator in Red Hat OpenStack, you need to know how to deploy and operate an OpenStack private cloud. It is also required that you have a good knowledge of Red Hat OpenStack Platform features and how to take advantage of them.

A Red Hat Certified Engineer in Red Hat OpenStack is expected to be able to deploy and work with Red Hat Storage as well as have strong troubleshooting skills, especially around networking. The EX310 exam has recently been refreshed with a strong emphasis on Network Functions Virtualization (NFV) and advanced networking – which can be considered ‘must have’ skills in many OpenStack Telco use cases in the real world.

Since Red Hat OpenStack Platform comes with Red Hat CloudForms, the knowledge of it can be as crucial as OpenStack itself. Some folks even go as far as saying CloudForms is OpenStack’s missing brother. The next certification on the list, the Red Hat Certified Specialist in Hybrid Cloud Management, focuses on managing infrastructure using Red Hat CloudForms.  Where OpenStack focuses on abstracting compute, network and storage, CloudForms takes care of the business side of the house. It manages compliance, policies, chargebacks, service catalogs, integration with public clouds, legacy virtualization, containers, and automation platforms. CloudForms really can do a lot, so you can see why it is essential for certification.

But what about … Ansible?!

Photo by Jess Watters on Unsplash

For workload orchestration in OpenStack you can, of course, natively use Heat. However, if you want to become a truly advanced OpenStack user, you should consider exploring Ansible for these tasks. The biggest advantages of Ansible are its simplicity and flexibility with other platforms (not just OpenStack). It is also popular within DevOps teams for on- and off-premises workload deployments. In fact, Ansible is also a core technology behind the Red Hat OpenStack director, CloudForms, and Red Hat OpenShift Container Platform. It’s literally everywhere in the Red Hat product suites!


One of the reasons for Ansible’s popularity is the amazing functionality it provides through many reusable modules and playbooks. The Red Hat Certified Specialist in Ansible Automation deeply tests your knowledge of writing Ansible playbooks for automation of workload deployments and system operation tasks.

Virtualization of the nation

The last three certifications on this list (the Specialist certifications in Virtualization, Configuration Management, and OpenShift Administration), although not as closely related to OpenStack as the other certifications described here, extend the capability of your OpenStack skill set.

Many OpenStack deployments are complemented by standalone virtualization solutions such as Red Hat Virtualization. This is often useful for workloads not yet ready for a cloud platform. And with CloudForms, Red Hat Virtualization (RHV) and Red Hat OpenStack Platform can both be managed from one place, so having a solid understanding of Red Hat Virtualization can be very beneficial. This is why being a Red Hat Certified Specialist in Virtualization can be so crucial. Being able to run and manage both cloud native workloads and traditional virtualization is essential to your OpenStack skillset.

Photo by 贝莉儿 NG on Unsplash

Puppets and Containers

To round things off, since Red Hat OpenStack Platform utilizes Puppet, we recommend earning the Red Hat Certified Specialist in Configuration Management certification for a true OpenStack-focused RHCA. Through it you demonstrate skills and knowledge in the underlying deployment mechanism allowing for a much deeper understanding and skill set.

Finally, a popular use case for OpenStack is running containerized applications on top of it. Earning the Red Hat Certified Specialist in OpenShift Administration shows you know how to install and manage Red Hat’s enterprise container platform, Red Hat OpenShift Container Platform!

Reach for the stars!

Whether you are already an OpenStack expert or looking to become one, the Red Hat Certified Architect track from Red Hat Certification offers the framework to allow you to prove those skills through an industry-recognized premier certification program. And if you follow our advice here you will not only be perfecting your OpenStack skills, but mastering other highly important supporting technologies including CloudForms, Ansible, Red Hat Virtualization, OpenShift, and Puppet on your journey to the RHCA.

Photo by Greg Rakozy on Unsplash

So what is it like to actually GET these certifications? In the next part of our blog we share our accounts of achieving the RHCA! Check back soon and bookmark so you don’t miss it!


Ready to start your certification journey now!? Get in touch with the friendly Red Hatters at Red Hat Training in your local area today to find all the ways you can master the skills you need to accelerate your career and run your enterprise cloud!


About our authors:

Chris Janiszewski is a Red Hat OpenStack Solutions Architect. He is proud to help his clients validate their business and technical use cases on OpenStack and supporting components like storage, networking or cloud automation and management. He is the father of two little kids and enjoys the majority of his free time playing with them. When the kids are asleep he gets to put the “geek hat” on and build OpenStack labs to hack crazy use cases!



Ken Holden is a Senior Solution Architect with Red Hat. He has spent the past 3 years on the OpenStack Tiger Team with the primary responsibility of deploying Red Hat OpenStack Platform Proof-Of-Concept IaaS Clouds for Strategic Enterprise Customers across North America. Throughout his 20 year career in Enterprise IT, Ken has focussed on Linux, Unix, Storage, Networking, and Security with the past 5 years being primarily focused on Cloud Solutions. Ken has achieved Red Hat Certified Architect status (RHCA 110-009-776) and holds Certified OpenStack Administrator status (COA-1700-0387-0100) with the OpenStack Foundation. Outside of work, Ken spends the majority of his time with his wife and two daughters, but also aspires to be the world’s most OK Guitar Player when time permits!


 

by Chris Janiszewski - Senior OpenStack Solutions Architect - Red Hat Tiger Team at June 22, 2018 12:27 AM

June 21, 2018

John Likes OpenStack

Tips on searching ceph-install-workflow.log on TripleO

1. Only look at the logs relevant to the last run

/var/log/mistral/ceph-install-workflow.log will contain a concatenation of the ceph-ansible runs. The last N lines of the file will have what you're looking for, so what is N?

Determine how long the file is:


[root@undercloud mistral]# wc -l ceph-install-workflow.log
20287 ceph-install-workflow.log
[root@undercloud mistral]#

Find the lines where previous ansible runs finished.


[root@undercloud mistral]# grep -n failed=0 ceph-install-workflow.log
5425:2018-06-18 23:06:58,901 p=22256 u=mistral | 172.16.0.21 : ok=118 changed=19 unreachable=0 failed=0
5426:2018-06-18 23:06:58,901 p=22256 u=mistral | 172.16.0.23 : ok=81 changed=13 unreachable=0 failed=0
5427:2018-06-18 23:06:58,901 p=22256 u=mistral | 172.16.0.25 : ok=113 changed=18 unreachable=0 failed=0
5428:2018-06-18 23:06:58,901 p=22256 u=mistral | 172.16.0.27 : ok=38 changed=3 unreachable=0 failed=0
5429:2018-06-18 23:06:58,901 p=22256 u=mistral | 172.16.0.28 : ok=77 changed=13 unreachable=0 failed=0
5430:2018-06-18 23:06:58,901 p=22256 u=mistral | 172.16.0.29 : ok=58 changed=7 unreachable=0 failed=0
5431:2018-06-18 23:06:58,901 p=22256 u=mistral | 172.16.0.30 : ok=83 changed=18 unreachable=0 failed=0
5432:2018-06-18 23:06:58,902 p=22256 u=mistral | 172.16.0.31 : ok=110 changed=17 unreachable=0 failed=0
9948:2018-06-20 12:06:38,325 p=11460 u=mistral | 172.16.0.21 : ok=107 changed=12 unreachable=0 failed=0
9949:2018-06-20 12:06:38,326 p=11460 u=mistral | 172.16.0.23 : ok=69 changed=4 unreachable=0 failed=0
9950:2018-06-20 12:06:38,326 p=11460 u=mistral | 172.16.0.25 : ok=102 changed=11 unreachable=0 failed=0
9951:2018-06-20 12:06:38,326 p=11460 u=mistral | 172.16.0.27 : ok=26 changed=0 unreachable=0 failed=0
9952:2018-06-20 12:06:38,326 p=11460 u=mistral | 172.16.0.29 : ok=46 changed=5 unreachable=0 failed=0
9953:2018-06-20 12:06:38,326 p=11460 u=mistral | 172.16.0.30 : ok=70 changed=8 unreachable=0 failed=0
9954:2018-06-20 12:06:38,326 p=11460 u=mistral | 172.16.0.31 : ok=99 changed=10 unreachable=0 failed=0
14927:2018-06-20 23:14:57,881 p=7702 u=mistral | 172.16.0.23 : ok=118 changed=19 unreachable=0 failed=0
14928:2018-06-20 23:14:57,881 p=7702 u=mistral | 172.16.0.27 : ok=110 changed=17 unreachable=0 failed=0
14932:2018-06-20 23:14:57,881 p=7702 u=mistral | 172.16.0.34 : ok=113 changed=18 unreachable=0 failed=0
20255:2018-06-21 09:46:40,571 p=17564 u=mistral | 172.16.0.22 : ok=118 changed=19 unreachable=0 failed=0
20256:2018-06-21 09:46:40,571 p=17564 u=mistral | 172.16.0.26 : ok=134 changed=18 unreachable=0 failed=0
20257:2018-06-21 09:46:40,571 p=17564 u=mistral | 172.16.0.27 : ok=102 changed=14 unreachable=0 failed=0
20258:2018-06-21 09:46:40,571 p=17564 u=mistral | 172.16.0.28 : ok=113 changed=18 unreachable=0 failed=0
20260:2018-06-21 09:46:40,571 p=17564 u=mistral | 172.16.0.34 : ok=110 changed=17 unreachable=0 failed=0
[root@undercloud mistral]#

Subtract the last run's line number from the total file lines:


[root@undercloud mistral]# echo $(( 20260 - 14932))
5328
[root@undercloud mistral]#

Tail that many lines from the end of the file to see only the last run.
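
With the numbers above, either of the following works (a hedged sketch using this log's line counts); the second form starts printing just after the previous run's last line (14932):


[root@undercloud mistral]# tail -5328 ceph-install-workflow.log | less
[root@undercloud mistral]# tail -n +14933 ceph-install-workflow.log | less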

2. Identify the node(s) where the playbook run failed:

I know the last 100 lines of the relevant run will have failed set to true if there was a failure. Doing a grep for that will also show me the host:


[root@undercloud mistral]# tail -5328 ceph-install-workflow.log | tail -100 | grep failed=1
2018-06-21 09:46:40,571 p=17564 u=mistral | 172.16.0.32 : ok=66 changed=14 unreachable=0 failed=1
[root@undercloud mistral]#

Now that I know the host, I want to see which task that host failed on, so I grep for 'failed:'. Just grepping for failed won't help, as the log will be full of '"failed": false'.

In this case I extract out the failure:


[root@undercloud mistral]# tail -5328 ceph-install-workflow.log | grep 172.16.0.32 | grep failed:
2018-06-21 09:46:06,093 p=17564 u=mistral | failed: [172.16.0.32 -> 172.16.0.22] (item=[{u'rule_name': u'', u'pg_num': 128, u'name': u'metrics'},
{'_ansible_parsed': True, 'stderr_lines': [u"Error ENOENT: unrecognized pool 'metrics'"], u'cmd': [u'docker', u'exec', u'ceph-mon-controller02',
u'ceph', u'--cluster', u'ceph', u'osd', u'pool', u'get', u'metrics', u'size'], u'end': u'2018-06-21 13:46:01.070270', '_ansible_no_log': False,
'_ansible_delegated_vars': {'ansible_delegated_host': u'172.16.0.22', 'ansible_host': u'172.16.0.22'}, '_ansible_item_result': True, u'changed':
True, u'invocation': {u'module_args': {u'warn': True, u'executable': None, u'_uses_shell': False, u'_raw_params': u'docker exec ceph-mon-controller02
ceph --cluster ceph osd pool get metrics size', u'removes': None, u'creates': None, u'chdir': None, u'stdin': None}}, u'stdout': u'', u'start':
u'2018-06-21 13:46:00.729965', u'delta': u'0:00:00.340305', 'item': {u'rule_name': u'', u'pg_num': 128, u'name': u'metrics'}, u'rc': 2, u'msg':
u'non-zero return code', 'stdout_lines': [], 'failed_when_result': False, u'stderr': u"Error ENOENT: unrecognized pool 'metrics'",
'_ansible_ignore_errors': None, u'failed': False}]) => {"changed": false, "cmd": ["docker", "exec", "ceph-mon-controller02", "ceph",
"--cluster", "ceph", "osd", "pool", "create", "metrics", "128", "128", "replicated_rule", "1"], "delta": "0:00:01.421755", "end":
"2018-06-21 13:46:06.390381", "item": [{"name": "metrics", "pg_num": 128, "rule_name": ""}, {"_ansible_delegated_vars":
{"ansible_delegated_host": "172.16.0.22", "ansible_host": "172.16.0.22"}, "_ansible_ignore_errors": null, "_ansible_item_result":
true, "_ansible_no_log": false, "_ansible_parsed": true, "changed": true, "cmd": ["docker", "exec", "ceph-mon-controller02",
"ceph", "--cluster", "ceph", "osd", "pool", "get", "metrics", "size"], "delta": "0:00:00.340305", "end": "2018-06-21 13:46:01.070270",
"failed": false, "failed_when_result": false, "invocation": {"module_args": {"_raw_params": "docker exec ceph-mon-controller02
ceph --cluster ceph osd pool get metrics size", "_uses_shell": false, "chdir": null, "creates": null, "executable": null,
"removes": null, "stdin": null, "warn": true}}, "item": {"name": "metrics", "pg_num": 128, "rule_name": ""}, "msg":
"non-zero return code", "rc": 2, "start": "2018-06-21 13:46:00.729965", "stderr": "Error ENOENT: unrecognized pool
'metrics'", "stderr_lines": ["Error ENOENT: unrecognized pool 'metrics'"], "stdout": "", "stdout_lines": []}],
"msg": "non-zero return code", "rc": 34, "start": "2018-06-21 13:46:04.968626", "stderr": "Error ERANGE:
pg_num 128 size 3 would mean 768 total pgs, which exceeds max 600 (mon_max_pg_per_osd 200 * num_in_osds 3)",
"stderr_lines": ["Error ERANGE: pg_num 128 size 3 would mean 768 total pgs, which exceeds max 600
(mon_max_pg_per_osd 200 * num_in_osds 3)"], "stdout": "", "stdout_lines": []}
...
[root@undercloud mistral]#

So that's how I quickly find what went wrong in a ceph-ansible run when debugging a TripleO deployment.
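
If you do this often, here's a small shortcut (a sketch of mine, not part of the workflow above): every ansible-playbook run in this log carries its own PID in the p= field, as you can see in the recap lines, so you can isolate the last run by PID instead of counting lines.


log=/var/log/mistral/ceph-install-workflow.log
# PID of the most recent ansible-playbook run (last p= field in the file)
pid=$(grep -o 'p=[0-9]*' "$log" | tail -1 | cut -d= -f2)
# recap line(s) showing which host(s) failed
grep "p=$pid" "$log" | grep 'failed=1'
# the failing task for that host (substitute the host found above)
grep "p=$pid" "$log" | grep '172.16.0.32' | grep 'failed:'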

3. Extra

You may be wondering what that error is.

There was a ceph-ansible issue where creating pools before the OSDs were running made the deployment fail because of the PG overdose protection check. This is something you can still hit if your PG numbers and OSDs are not aligned correctly (use pgcalc), but it's better to fail a deployment than to put production data on a misconfigured cluster. You could also hit it because of this issue, which ceph-ansible rc9 fixed (technically it was fixed in an earlier version, but that version had other bugs, so I recommend rc9).

by John (noreply@blogger.com) at June 21, 2018 03:56 PM

TripleO Ceph Integration on the Road in June

The first week of June I went to an upstream TripleO workshop in Brno. The labs we used are at https://github.com/redhat-openstack/tripleo-workshop

The third week of June I went to a downstream Red Hat OpenStack Platform event in Montreal for those deploying the upcoming version 13 in the field. I covered similar topics with respect to Ceph deployment via TripleO.

by John (noreply@blogger.com) at June 21, 2018 03:39 PM

June 14, 2018

Zane Bitter

The Expanding OpenStack Foundation

The OpenStack Foundation has begun the process of becoming an umbrella organisation for open source projects adjacent to but outside of OpenStack itself. However, there is no clear roadmap for the transformation, which has resulted in some confusion. After attending the joint leadership meeting with the Foundation Board of Directors and various Forum sessions that included some members of the board at the (2018) OpenStack Summit in Vancouver, I believe I can help shed some light on the situation. (Of course this is my subjective take on the topic, and I am not speaking for the Technical Committee.)

In November 2017, the board authorised the Foundation staff to begin incubation of several ‘Strategic Focus Areas’, including piloting projects that fit in those areas. The three focus areas are Container Infrastructure, Edge Computing Infrastructure, and CI/CD Infrastructure. To date, there have been two pilot projects accepted. Eventually, it is planned for each focus area to have its own Technical Committee (or equivalent governance body), holding equal status with the OpenStack TC—there will be no paramount technical governance body for the whole Foundation.

The first pilot project is Kata Containers, which combines container APIs and container-like performance with VM-level isolation. You will not be shocked to learn that it is part of the Container Infrastructure strategic focus.

The other pilot project, in the CI/CD strategic focus, is Zuul. Zuul will already be familiar to OpenStack developers as the CI system developed by and for the OpenStack project. Its governance is moving from the OpenStack TC to the new Strategic Focus Area, in recognition of its general usefulness as a tool that is not in any way specific to OpenStack development.

Thus far there are no pilot projects in the Edge Computing Infrastructure focus area, but nevertheless there is plenty of work going on—including to figure out what Edge Computing is.

If you attended the Summit then you would have heard about Kata, Zuul and Edge Computing, but this is probably the first time you’ve heard the terms ‘incubate’ or ‘pilot’ associated with them. Nor have the steps that come after incubation or piloting been defined. This has opened the door to confusion, not only about the status of the pilot projects but also that of unofficial projects (outside of either OpenStack-proper or any of the Strategic Focus Areas) that are hosted on the same infrastructure provided by the Foundation for OpenStack development. It also heralds the return of what I call the October surprise—a half-baked code dump ‘open sourced’ the week before a Summit—which used to be a cottage industry around the OpenStack community until the TC was able to bed in a set of robust processes for accepting new projects.

Starting out without a lot of preconceived ideas about how things would proceed was the right way to begin, but members of the board recognise that now is the time to give the process some structure. I expect to see more work on this in the near future.

There is also a proposed initiative, dubbed Winterscale, to move governance of the foundation’s infrastructure out from under the OpenStack TC, to reflect its new status as a service provider to the OpenStack project, the other Strategic Focus Areas, and unofficial projects.

by Zane Bitter at June 14, 2018 08:26 PM

Lars Kellogg-Stedman

Configuring a static address for wlan0 on Raspbian Stretch

Recent releases of Raspbian have adopted the use of dhcpcd to manage both dynamic and static interface configuration. If you would prefer to use the traditional /etc/network/interfaces mechanism instead, follow these steps.

  1. First, disable dhcpcd and wpa_supplicant.

    systemctl disable --now dhcpcd wpa_supplicant
    
  2. You will need a wpa_supplicant configuration …

by Lars Kellogg-Stedman at June 14, 2018 04:00 AM

June 08, 2018

Matthias Runge

Configuring collectd plugins with TripleO

A way of deploying OpenStack is to use TripleO. This takes the approach of deploying a small OpenStack environment, and then using OpenStack-provided infrastructure and tools to deploy the actual production environment. This is actually done by an addition to the openstack command line client:

openstack overcloud …

by mrunge at June 08, 2018 06:40 AM

June 07, 2018

RDO Blog

Rocky Test Days Milestone 2: June 14-15

Who’s up for a rematch? Rocky Milestone 2 is here and we’re ready to rumble! Join us on June 14 & 15 (next Thursday and Friday) for an awesome time of taking down bugs and fighting errors in the most recent release. We won’t be pulling any punches.

Want to get in on the action? We’re looking for developers, users, operators, quality engineers, writers, and, yes, YOU. If you’re reading this, we think you’re a champion and we want your help!

Here’s the plan:
We’ll have packages for the following platforms:
* RHEL 7
* CentOS 7

You’ll want a fresh install with the latest updates installed so that there are no hard-to-reproduce interactions with other things.

We’ll be collecting feedback, writing up tickets, filing bugs, and answering questions.

Even if you only have a few hours to spare, we’d love your help taking this new version for a spin to work out any kinks. Not only will this help identify issues early in the development process, but you can be one of the first to cut your teeth on the latest versions of your favorite deployment methods like TripleO, PackStack, and Kolla.

Interested? We’ll be gathering on #rdo (on Freenode IRC) for any associated questions/discussion, and working through the “Does it work?” tests.

As Rocky said, “The world ain’t all sunshine and rainbows,” but with your help, we can keep moving forward and make the RDO world better for those around us. Hope to see you on the 14th & 15th!


by Mary Thengvall at June 07, 2018 02:51 PM

June 04, 2018

Steve Hardy

TripleO Containerized deployments, debugging basics

Containerized deployments, debugging basics

Since the Pike release, TripleO has supported deployments with OpenStack services running in containers.  Currently we use docker to run images based on those maintained by the Kolla project.

We already have some tips and tricks for container deployment debugging in tripleo-docs, but below are some more notes on my typical debug workflows.

Config generation debugging overview

In the TripleO container architecture, we still use Puppet to generate configuration files and do some bootstrapping, but it is run (inside a container) via a script docker-puppet.py

The config generation happens at the start of the deployment (step 1), and the configuration files are generated for all services (regardless of which step they are started in).

The input file used is /var/lib/docker-puppet/docker-puppet.json, but you can also filter this (e.g via cut/paste or jq as shown below) to enable debugging for specific services - this is helpful when you need to iterate on debugging a config generation issue for just one service.

[root@overcloud-controller-0 docker-puppet]# jq '[.[]|select(.config_volume | contains("heat"))]' /var/lib/docker-puppet/docker-puppet.json | tee /tmp/heat_docker_puppet.json
{
"puppet_tags": "heat_config,file,concat,file_line",
"config_volume": "heat_api",
"step_config": "include ::tripleo::profile::base::heat::api\n",
"config_image": "192.168.24.1:8787/tripleomaster/centos-binary-heat-api:current-tripleo"
}
{
"puppet_tags": "heat_config,file,concat,file_line",
"config_volume": "heat_api_cfn",
"step_config": "include ::tripleo::profile::base::heat::api_cfn\n",
"config_image": "192.168.24.1:8787/tripleomaster/centos-binary-heat-api-cfn:current-tripleo"
}
{
"puppet_tags": "heat_config,file,concat,file_line",
"config_volume": "heat",
"step_config": "include ::tripleo::profile::base::heat::engine\n\ninclude ::tripleo::profile::base::database::mysql::client",
"config_image": "192.168.24.1:8787/tripleomaster/centos-binary-heat-api:current-tripleo"
}

 

Then we can run the config generation, if necessary changing the tags (or puppet modules, which are consumed from the host filesystem e.g /etc/puppet/modules) until the desired output is achieved:


[root@overcloud-controller-0 docker-puppet]# export NET_HOST='true'
[root@overcloud-controller-0 docker-puppet]# export DEBUG='true'
[root@overcloud-controller-0 docker-puppet]# export PROCESS_COUNT=1
[root@overcloud-controller-0 docker-puppet]# export CONFIG=/tmp/heat_docker_puppet.json
[root@overcloud-controller-0 docker-puppet]# python /var/lib/docker-puppet/docker-puppet.py
2018-02-09 16:13:16,978 INFO: 102305 -- Running docker-puppet
2018-02-09 16:13:16,978 DEBUG: 102305 -- CONFIG: /tmp/heat_docker_puppet.json
2018-02-09 16:13:16,978 DEBUG: 102305 -- config_volume heat_api
2018-02-09 16:13:16,978 DEBUG: 102305 -- puppet_tags heat_config,file,concat,file_line
2018-02-09 16:13:16,978 DEBUG: 102305 -- manifest include ::tripleo::profile::base::heat::api
2018-02-09 16:13:16,978 DEBUG: 102305 -- config_image 192.168.24.1:8787/tripleomaster/centos-binary-heat-api:current-tripleo
...

 

When the config generation is completed, configuration files are written out to /var/lib/config-data/heat.

We then compare timestamps against the /var/lib/config-data/heat/heat.*origin_of_time file (touched for each service before we run the config-generating containers), so that only those files modified or created by puppet are copied to /var/lib/config-data/puppet-generated/heat.

Note that we also calculate a checksum for each service (see /var/lib/config-data/puppet-generated/*.md5sum), which means we can detect when the configuration changes - when this happens we need paunch to restart the containers, even though the image did not change.

This checksum is added to the /var/lib/tripleo-config/hashed-docker-container-startup-config-step_*.json files by docker-puppet.py, and these files are later used by paunch to decide if a container should be restarted (see below).
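
For example, to eyeball this for the heat service (same paths as above, using heat and step 4 purely as the example), you can compare the current checksum with the hash paunch last applied:


cat /var/lib/config-data/puppet-generated/heat.md5sum
grep -o 'TRIPLEO_CONFIG_HASH=[a-f0-9]*' /var/lib/tripleo-config/hashed-docker-container-startup-config-step_4.json

If the two values differ, paunch should restart the heat containers on its next apply.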

 

Runtime debugging, paunch 101

Paunch is a tool that orchestrates launching containers for each step, and performing any bootstrapping tasks not handled via docker-puppet.py.

It accepts json input, in this case the /var/lib/tripleo-config/docker-container-startup-config-step_*.json files, which are created based on the enabled services (the content is directly derived from the service templates in tripleo-heat-templates).

These json files are then modified via docker-puppet.py (as mentioned above) to add a TRIPLEO_CONFIG_HASH value to the container environment - these modified files are written with a different name, see /var/lib/tripleo-config/hashed-docker-container-startup-config-step_*.json

Note this environment variable isn't used by the container directly; it is used as a salt to trigger restarting containers when the configuration files in the mounted config volumes have changed.

As in the docker-puppet case it's possible to filter the json file with jq and debug e.g mounted volumes or other configuration changes directly.

It's also possible to test configuration changes by manually modifying /var/lib/config-data/puppet-generated/ then either restarting the container via docker restart, or by modifying TRIPLEO_CONFIG_HASH then re-running paunch.
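
For example, a quick iteration on a heat setting might look like this (the exact path of heat.conf under puppet-generated/ is an assumption here, shown only for illustration):


vi /var/lib/config-data/puppet-generated/heat/etc/heat/heat.conf
docker restart heat_engine
docker logs --tail 20 heat_engine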

Note that paunch will kill any containers tagged for a particular step: e.g. with --config-id tripleo_step4 --managed-by tripleo-Controller, all containers started during this step by any previous paunch apply will be killed if they are removed from your json during testing.  This is a feature which enables changes to the enabled services on update to your overcloud, but it's worth bearing in mind when testing as described here.


[root@overcloud-controller-0]# cd /var/lib/tripleo-config/
[root@overcloud-controller-0 tripleo-config]# jq '{"heat_engine": .heat_engine}' hashed-docker-container-startup-config-step_4.json | tee /tmp/heat_startup_config.json
{
"heat_engine": {
"healthcheck": {
"test": "/openstack/healthcheck"
},
"image": "192.168.24.1:8787/tripleomaster/centos-binary-heat-engine:current-tripleo",
"environment": [
"KOLLA_CONFIG_STRATEGY=COPY_ALWAYS",
"TRIPLEO_CONFIG_HASH=14617e6728f5f919b16c74f1e98d0264"
],
"volumes": [
"/etc/hosts:/etc/hosts:ro",
"/etc/localtime:/etc/localtime:ro",
"/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro",
"/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro",
"/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro",
"/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro",
"/dev/log:/dev/log",
"/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro",
"/etc/puppet:/etc/puppet:ro",
"/var/log/containers/heat:/var/log/heat",
"/var/lib/kolla/config_files/heat_engine.json:/var/lib/kolla/config_files/config.json:ro",
"/var/lib/config-data/puppet-generated/heat/:/var/lib/kolla/config_files/src:ro"
],
"net": "host",
"privileged": false,
"restart": "always"
}
}
[root@overcloud-controller-0 tripleo-config]# paunch --debug apply --file /tmp/heat_startup_config.json --config-id tripleo_step4 --managed-by tripleo-Controller
stdout: dd60546daddd06753da445fd973e52411d0a9031c8758f4bebc6e094823a8b45

stderr:
[root@overcloud-controller-0 tripleo-config]# docker ps | grep heat
dd60546daddd 192.168.24.1:8787/tripleomaster/centos-binary-heat-engine:current-tripleo "kolla_start" 9 seconds ago Up 9 seconds (health: starting) heat_engine

 

 

Containerized services, logging

There are a couple of ways to access the container logs:

  • On the host filesystem, the container logs are persisted under /var/log/containers/<service>
  • docker logs <container id or name>
It is also often useful to use docker inspect <container id or name> to verify the container configuration, e.g the image in use and the mounted volumes etc.
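
For example, to pull out just the image and bind mounts (standard docker inspect fields, filtered with jq as elsewhere in this post):


docker inspect heat_engine | jq '.[0].Config.Image, .[0].HostConfig.Binds'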

 

Debugging containers directly

Sometimes logs are not enough to debug problems, and in this case you must interact with the container directly to diagnose the issue.

When a container is not restarting, you can attach a shell to the running container via docker exec:


[root@openstack-controller-0 ~]# docker exec -ti heat_engine /bin/bash
()[heat@openstack-controller-0 /]$ ps ax
PID TTY STAT TIME COMMAND
1 ? Ss 0:00 /usr/local/bin/dumb-init /bin/bash /usr/local/bin/kolla_start
5 ? Ss 1:50 /usr/bin/python /usr/bin/heat-engine --config-file /usr/share/heat/heat-dist.conf --config-file /etc/heat/heat
25 ? S 3:05 /usr/bin/python /usr/bin/heat-engine --config-file /usr/share/heat/heat-dist.conf --config-file /etc/heat/heat
26 ? S 3:06 /usr/bin/python /usr/bin/heat-engine --config-file /usr/share/heat/heat-dist.conf --config-file /etc/heat/heat
27 ? S 3:06 /usr/bin/python /usr/bin/heat-engine --config-file /usr/share/heat/heat-dist.conf --config-file /etc/heat/heat
28 ? S 3:05 /usr/bin/python /usr/bin/heat-engine --config-file /usr/share/heat/heat-dist.conf --config-file /etc/heat/heat
2936 ? Ss 0:00 /bin/bash
2946 ? R+ 0:00 ps ax

 

That's all for today, for more information please refer to tripleo-docs, or feel free to ask questions in #tripleo on Freenode!

by Steve Hardy (noreply@blogger.com) at June 04, 2018 05:09 PM

RDO Blog

Community Blog Round-up: June 4

I’m in a bit of shock that it’s already June… anyone else share that feeling? With summer around the corner for those of us in the northern hemisphere (or Juneuary as we call it in San Francisco), there’s a promise of vacations ahead. Be sure to take us along on your various adventures — sharing about your new favorite hacks, the projects you’re working on, and the conferences you’re traveling to. We love hearing what you’re up to! Speaking of which… here’s what you’ve been blogging about recently:

TripleO deep dive session #13 (Containerized Undercloud) by Carlos Camacho

This is the 13th release of the TripleO “Deep Dive” sessions. Thanks to Dan Prince & Emilien Macchi for this deep dive session about the next step of the TripleO’s Undercloud evolution. In this session, they will explain in detail the movement re-architecting the Undercloud to move towards containers in order to reuse the containerized Overcloud ecosystem.

Read more at https://www.anstack.com/blog/2018/05/31/tripleo-deep-dive-session-13.html

Tracking Quota by Adam Young

This OpenStack Summit marks the third that I have attended where we’ve discussed the algorithms to try and record quota in Keystone but not update it on each resource allocation and free.

Read more at https://adam.younglogic.com/2018/05/tracking-quota/

Don’t rewrite your driver. 80 storage drivers for containers rolled into one! by geguileo

Do you work with containers but your storage doesn’t support your Container Orchestration system? Have you or your company already developed an Openstack/Cinder storage driver and now you have to do it again for containers? Are you having trouble deciding how to balance your engineering force between storage driver development in OpenStack, Containers, Ansible, etc? Then read on, as your life may be about to get better.

Read more at https://gorka.eguileor.com/cinderlib-csi/

Ansible Storage Role: automating your storage solutions by geguileo

Were you in the middle of writing your Ansible playbooks to automate your software provisioning, configuration, and application deployment when you realized you had to manage your storage as well? And it turns out that each of your storage solutions has a completely different Ansible module. Now you have to figure out how each module works to create ad-hoc tasks for each one. What a pain! If this has happened to you, or if you are interested in automating your storage solutions, you may be interested in the new Ansible Storage Role.

Read more at https://gorka.eguileor.com/ansible-role-storage/

Cinderlib: Every storage driver on a single Python library by geguileo

Wouldn’t it be great if we could manage any storage array using a single Python library that provided the right storage management abstraction? Well, this is no longer a beautiful dream, it has become a reality! Keep reading to find out how.

Read more at https://gorka.eguileor.com/cinderlib/

“Ultimate Private Cloud” Demo, Under The Hood! by Steven Hardy, Senior Principal Software Engineer

At the recent Red Hat Summit in San Francisco, and more recently the OpenStack Summit in Vancouver, the OpenStack engineering team worked on some interesting demos for the keynote talks. I’ve been directly involved with the deployment of Red Hat OpenShift Platform on bare metal using the Red Hat OpenStack Platform director deployment/management tool, integrated with openshift-ansible. I’ll give some details of this demo, the upstream TripleO features related to this work, and insight around the potential use-cases.

Read more at https://redhatstackblog.redhat.com/2018/05/22/ultimate-private-cloud-demo-under-the-hood/

Testing Undercloud backup and restore using Ansible by Carlos Camacho

Testing the Undercloud backup and restore It is possible to test how the Undercloud backup and restore should be performed using Ansible.

Read more at https://www.anstack.com/blog/2018/05/18/testing-undercloud-backup-and-restore-using-ansible.html

Your LEGO® Order Has Been Shipped by rainsdance

In preparation for the Red Hat Summit this week and OpenStack Summit in a week, I put together a hardware demo to sit in the RDO booth.

Read more at http://groningenrain.nl/your-lego-order-has-been-shipped/

Introducing GPUs to the CERN Cloud by Konstantinos Samaras-Tsakiris

High-energy physics workloads can benefit from massive parallelism — and as a matter of fact, the domain faces an increasing adoption of deep learning solutions. Take for example the newly-announced TrackML challenge [7], already running in Kaggle! This context motivates CERN to consider GPU provisioning in our OpenStack cloud, as computation accelerators, promising access to powerful GPU computing resources to developers and batch processing alike.

Read more at https://openstack-in-production.blogspot.com/2018/05/introducing-gpus-to-cern-cloud.html

A modern hybrid cloud platform for innovation: Containers on Cloud with Openshift on OpenStack by Stephane Lefrere

Market trends show that due to long application life-cycles and the high cost of change, enterprises will be dealing with a mix of bare-metal, virtualized, and containerized applications for many years to come. This is true even as greenfield investment moves to a more container-focused approach.

Read more at https://redhatstackblog.redhat.com/2018/05/08/containers-on-cloud/

Using a TM1637 LED module with CircuitPython by Lars Kellogg-Stedman

CircuitPython is “an education friendly open source derivative of MicroPython”. MicroPython is a port of Python to microcontroller environments; it can run on boards with very few resources such as the ESP8266. I’ve recently started experimenting with CircuitPython on a Wemos D1 mini, which is a small form-factor ESP8266 board.

Read more at https://blog.oddbit.com/2018/05/03/using-a-tm-led-module-with-cir/

ARA Records Ansible 0.15 has been released by DM Simard

I was recently writing that ARA was open to limited development for the stable release in order to improve the performance for larger scale users.

Read more at https://dmsimard.com/2018/05/03/ara-records-ansible-0.15-has-been-released/

Highlights from the OpenStack Rocky Project Teams Gathering (PTG) in Dublin by Rich Bowen

Last month in Dublin, OpenStack engineers gathered from dozens of countries and companies to discuss the next release of OpenStack. This is always my favorite OpenStack event, because I get to do interviews with the various teams, to talk about what they did in the just-released version (Queens, in this case) and what they have planned for the next one (Rocky).

Read more at https://redhatstackblog.redhat.com/2018/04/26/highlights-from-the-openstack-rocky-project-teams-gathering-ptg-in-dublin/

Red Hat Summit 2018: HCI Lab by John

I will be at Red Hat Summit in SFO on May 8th jointly hosting the lab Deploy a containerized HCI IaaS with OpenStack and Ceph.

Read more at http://blog.johnlikesopenstack.com/2018/04/red-hat-summit-2018-hci-lab.html

by Mary Thengvall at June 04, 2018 02:18 PM

RDO Duck adventures at the Vancouver OpenStack Summit

Last week, I attended the OpenStack Summit in Vancouver, as RDO Ambassador. That was a good experience to connect with the RDO community.

What does an RDO ambassador do?

As an ambassador, your role is to engage the community by answering their questions, helping them fix their issues, or helping them get started contributing. You don’t need to know everything, but it does imply that you can at least point people in the right direction.

On Sunday, we started helping with the booth setup, and for me it started with massive T-shirt folding and sorting them by size in laundry baskets. During trade shows, many people just want their swag, so efficient T-shirt distribution means more time to exchange with community members. Thanks to everyone at the booth, it was done quickly.

Massive T-shirt folding with RDO Duck

I also set up the RDO hardware demo. Rain Leander, RDO community liaison, had prepared a shiny new demo using two NUCs and a portable gaming screen running TripleO; it will be used at future events to demo RDO. Since we had limited time, I had to wait until Monday morning to reinstall everything, as the demo had been borked during the previous event.

RDO Duck busy debugging TripleO deployment

From Monday to Thursday, I did shifts at the booth to advocate for RDO and engage with the community, welcoming people doing their shifts and helping them get set up. The community pod was shared with Carol Chen (ManageIQ) and Leonardo Vaz (Ceph), so we were able to cover many topics.

RDO’s swag was ducks – everyone loves them. We had three colors (Red, Green, Blue!) and I can say they were a bit mischievous 😉

RDO Duck having fun with stickers

Most of the questions we had were related to RDO/RHOSP relations, how to contribute, TripleO, Ceph, etc. We also helped a user debug his RDO deployment and tracked down a bug in the CentOS cloud images breaking the qemu-ev repository (the $contentdir variable was set to altarch instead of x86_64).

Demo and Q/A sessions

We also had people who came to demo the cool stuff they made. We had Michael J. Turek, who demoed an Ironic deployment on ppc64le (the shiny new arch supported by RDO!).
mjturek and I

Also T. Nicholle Williams, who showed us how to deploy OpenShift on OpenStack, with her lovely puddle as a guest star. 🙂
Nicholle and I

And last but not least, David M. Simard, creator of ARA, which was the center of many questions.

David and I

RDO meetup

We had an RDO meetup jointly with the Ceph community at the Portside Pub. It was awesome to discuss over drinks and snacks. I noticed that there were local stackers among us who were not attending the summit, so we reached a wider audience.

RDO meetup as if you were there

Conclusion

That was a great Summit; it was good to connect with fellow stackers (old and new friends!) and share the love around RDO. We had exceptionally good weather in Vancouver, and I must say that the scenery in British Columbia is just breathtaking. I really enjoyed my short stay there.

Thanks to Rain for sending me, and to my team for allowing me to go 🙂 Many thanks to Leo, Carol, Jennifer, and Tracy, with whom I had a lot of fun 🙂
The Dream Team packing up

by hguemar at June 04, 2018 01:35 PM

June 01, 2018

Groningen Rain

What We’ve Learned So Far

As I’ve been rebuilding this site. In case you hadn’t noticed. I have. Cause of the hack. And this is what I’ve learned so far… Backups Aren’t Necessary UNTIL THEY ARE If I had been maintaining regular backups (which the hosting company is totally happy to do for just a few euros more per month), …

by K Rain at June 01, 2018 08:00 AM

May 31, 2018

Carlos Camacho

TripleO deep dive session #13 (Containerized Undercloud)

This is the 13th release of the TripleO “Deep Dive” sessions.

Thanks to Dan Prince & Emilien Macchi for this deep dive session about the next step of the TripleO’s Undercloud evolution.

In this session, they will explain in detail the movement re-architecting the Undercloud to move towards containers in order to reuse the containerized Overcloud ecosystem.

You can access the presentation or the Etherpad notes.

So please, check the full session content on the TripleO YouTube channel.



Please check the sessions index to have access to all available content.

by Carlos Camacho at May 31, 2018 12:00 AM

May 26, 2018

Adam Young

Tracking Quota

This OpenStack summit marks the third that I have attended where we’ve discussed the algorithms to try and record quota in Keystone but not update it on each resource allocation and free.

We were stumped, again. The process we had planned on using was game-able and thus broken. I was kinda bummed.

Fortunately, I had a long car ride from Vancouver to Seattle and talked it over with Morgan Fainberg.

We also discussed the Pig War. Great piece of history from the region.

By the time we got to the airport the next day, I think we had it solved. Morgan came to the solution first, and I followed, slowly.  Here’s what we think will work.

First, let's get a three-level-deep project setup to use for our discussion.

The rule is simple:  even if a quota is subdivided, you still need to check the overall quota of the project  and all the parent projects.

In the example structure above, let's assume that project A gets a quota of 100 units of some resource: VMs, GB memory, network ports, hot-dogs, very small rocks, whatever.  I'll use VMs in this example.

There are a couple of ways this could be further managed.  The simplest is that any resource allocated anywhere else in this tree is counted against this quota.  There are 9 total projects in the tree.  If each allocates 11 VMs, there will be 99 created and counted against the quota.  The next VM created uses up the quota.  The request after that will fail due to lack of available quota.

Let's say, however, that the users in project C23 are greedy, and allocate all 100 VMs.  The people in C11 are filled with righteous indignation.  They need VMs too.

The admins wipe everything out and we start all over.  They set up a system to fragment the quota by allowing a project to split its quota assignment and allocate some of it to subordinate projects.

Project A says, "I'm going to keep 50 VMs for myself, and allocate 25 each to B1 and B2."

Project B1 says, "I am going to keep 10 for me, and allocate 5 each to C11, C12, and C13."  And the B1 tree is happy.

B2 is a manipulative schemer and decides to play around.  B2 allocates his entire quota of 25 to C21.  C21 creates 25 VMs.

B2 now withdraws his quota from C21.  There is no communication with Nova.  The VMs keep running.  He then allocates his entire quota of 25 VMs to C22, and C22 creates 25 VMs.

Nova says “What project is this?  C22?   What is its quota?  25?  All good.”

But in reality, B2 has doubled his quota.  His subordinates have allocated 50 VMs total.  He does this again with project C23, gets up to 75 VMs, and contemplates creating yet another project C24 just to keep up the pattern.  This would allocate more VMs than project A was originally allocated.

The admins notice this and get mad, wipe everything out, and start over again.  This time they've made a change.  Whenever they check quota on a project, they will also check quota on the parent projects, counting all VMs underneath each parent.  Essentially, they will record that a VM created in project C11 also reduces the original quota on B1 and on A.  In essence, they record a table.  If a user creates a VM in project C11, the following will be recorded and checked against quota.

VM Project
VM1 A
VM1 B1
VM1 C11

 

When a user then creates a VM in C21, the table will extend to this:

VM Project
VM1 A
VM1 B1
VM1 C11
VM2 A
VM2 B2
VM2 C21

In addition, when creating the VM2, Nova will check quota and see that, after creation:

  • C21 now has 1 out of 25 allocated
  • B2 now has 1 out of 25 allocated
  • A now has 2 out of 100 allocated

(quota is allocated prior to the creation of the resource to prevent a race condition)

Note that the quota is checked against the original amount, and not the amount reduced by sub-allocating the quota.  If project C21 allocates 24 more VMs, the quota check will show:

  • C21 now has 25 out of 25 allocated
  • B2 now has 25 out of 25 allocated
  • A now has 26 out of 100 allocated

If B2 tries to play games, removing the quota from C21 and giving it to C22, project C21 will be over quota, but Nova will have no way to trap this.  However, the only people this affects are other people within projects B2, C21, C22, and C23.  If C22 attempts to allocate a virtual machine, the quota check will show that B2 has already allocated its full quota and cannot create any more.  The quota check will fail.

You might have noticed that the higher level projects can rob quota from the child projects in this scheme.  For example, if project A allocates 74 more VMs now, project B1 and its children will still have quota allocated to them, but their quota checks will fail because A is full.  This could be mitigated by having two checks for project A: total quota (max 100) and directly allocated quota (max 50).

This scheme removes the quota violation that was possible by gaming the system.  I promised to write it up so we could continue to try and poke holes in it.
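
As a rough sketch of the parent-chain check (my illustration only, not Keystone or Nova code; the project tree and limits are hard-coded from the example above):


#!/bin/bash
# Sketch: count every allocation against the project and all of its ancestors,
# and refuse it if any of them would exceed its original limit.
declare -A PARENT=( [B1]=A [B2]=A [C11]=B1 [C12]=B1 [C13]=B1 [C21]=B2 [C22]=B2 [C23]=B2 )
declare -A LIMIT=( [A]=100 [B1]=25 [B2]=25 [C11]=5 [C12]=5 [C13]=5 [C21]=25 [C22]=25 [C23]=25 )
declare -A USED

allocate_vm() {
    local project=$1 p=$1
    # check the project and every parent before allocating
    while [ -n "$p" ]; do
        if (( ${USED[$p]:-0} + 1 > ${LIMIT[$p]} )); then
            echo "quota check failed at $p"; return 1
        fi
        p=${PARENT[$p]:-}
    done
    # record the allocation against the project and every parent
    p=$project
    while [ -n "$p" ]; do
        USED[$p]=$(( ${USED[$p]:-0} + 1 ))
        p=${PARENT[$p]:-}
    done
    echo "allocated 1 VM in $project"
}

allocate_vm C21   # counted against C21, B2, and A at the same time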

EDIT:  Quota would also have to be allocated per endpoint, or the endpoints will have to communicate with each other to evaluate usage.

 

by Adam Young at May 26, 2018 05:46 AM

May 23, 2018

Gorka Eguileor

Don’t rewrite your driver. 80 storage drivers for containers rolled into one!

Do you work with containers but your storage doesn’t support your Container Orchestration system? Have you or your company already developed an Openstack/Cinder storage driver and now you have to do it again for containers? Are you having trouble deciding how to balance your engineering force between storage driver development in OpenStack, Containers, Ansible, etc? […]

by geguileo at May 23, 2018 12:06 PM

Ansible Storage Role: automating your storage solutions

Were you in the middle of writing your Ansible playbooks to automate your software provisioning, configuration, and application deployment when you realized you had to manage your storage as well? And it turns out that each of your storage solutions has a completely different Ansible module. Now you have to figure out how each module […]

by geguileo at May 23, 2018 12:05 PM

Cinderlib: Every storage driver on a single Python library

Wouldn’t it be great if we could manage any storage array using a single Python library that provided the right storage management abstraction? Imagine writing your own Python storage management software and not caring about the backend or the connection. What if your code was backend and connection agnostic so the same method was used […]

by geguileo at May 23, 2018 12:03 PM

May 22, 2018

Red Hat Stack

“Ultimate Private Cloud” Demo, Under The Hood!

At the recent Red Hat Summit in San Francisco, and more recently the OpenStack Summit in Vancouver, the OpenStack engineering team worked on some interesting demos for the keynote talks.

I’ve been directly involved with the deployment of Red Hat OpenShift Platform on bare metal using the Red Hat OpenStack Platform director deployment/management tool, integrated with openshift-ansible. I’ll give some details of this demo, the upstream TripleO features related to this work, and insight around the potential use-cases.

TripleO & Ansible, a Powerful Combination!

For anyone that’s used Red Hat OpenStack Platform director (or the upstream TripleO project, upon which it is based), you’re familiar with the model of deploying a management node (“undercloud” in TripleO terminology), then deploying and managing your OpenStack nodes on bare metal.  However, TripleO also provides a very flexible and powerful combination of planning, deployment, and day-2 operations features. For instance, director allows us to manage and provision bare metal nodes, then deploy virtually any application onto those nodes via Ansible!

The “undercloud” management node makes use of several existing OpenStack services, including Ironic for discovery/introspection and provisioning of bare metal nodes, Heat, a declarative orchestration tool, and Mistral, a workflow engine.  It also provides a convenient UI, showcased in the demo, along with flexible CLI interfaces and standard OpenStack ReST APIs for automation.

As described in the demo, director has many useful features for managing your hardware inventory – you can either register or auto-discover your nodes, then do introspection (with optional benchmarking tests) to discover the hardware characteristics via the OpenStack ironic-inspector service.  Nodes can then be matched to a particular profile either manually or via rules implemented through the OpenStack Mistral workflow API. You are then ready to deploy an Operating System image onto the nodes using the OpenStack Ironic “bare metal-as-a-service” API.

When deciding what will be deployed onto your nodes, director has the concept of a “deployment plan,” which combines specifying which nodes/profiles will be used and which configuration will be applied, known as “roles” in TripleO terminology.

This is a pretty flexible system enabling a high degree of operator customization and extension through custom roles where needed, as well as supporting network isolation and custom networks (isolated networks for different types of traffic), declarative configuration of  network interfaces, and much more!

Deploying Red Hat OpenShift Container Platform on bare metal

What was new in the Summit demo was deploying OpenShift alongside OpenStack, both on bare metal, and both managed by Red Hat OpenStack Platform  director. Over the last few releases we’ve made good progress on ansible integration in TripleO, including enabling integration with “external” installers.  We’ve made use of that capability here to deploy OpenShift via TripleO, combining the powerful bare-metal management capabilities of TripleO with existing openshift-ansible management of configuration.

Integration between Red Hat OpenStack Platform and Red Hat OpenShift Container Platform

Something we didn’t have time to get into in great detail during the demo was the potential for integration between OpenStack and OpenShift – if you have an existing Red Hat OpenStack Platform deployment you can choose to deploy OpenShift with persistent volumes backed by Cinder (the OpenStack block storage service). And for networking integration, the Kuryr project, combined with OVN from OpenvSwitch, enables the sharing of a common overlay network between both platforms, without the overhead of double encapsulation.

This makes it easy to add OpenShift managed containers to your infrastructure, while almost seamlessly integrating them with VM workloads running on OpenStack. You can also take advantage of existing OpenStack capacity and vendor support while using the container management capabilities of OpenShift.

Container-native virtualization

After we deployed OpenShift we saw some exciting demos focussed on workloads running on OpenShift, including a preview of the new container native virtualization (CNV) feature. CNV uses the upstream KubeVirt project to run Virtual Machine (VM) workloads directly on OpenShift.

Unlike the OpenShift and OpenStack combination described above, here OpenShift manages the VM workloads, providing an easier way to  transition your VM workloads where no existing virtualization solution is in place. The bare-metal deployment capabilities outlined earlier are particularly relevant here, as you may want to run OpenShift worker nodes that host VMs on bare metal for improved performance. As the demo has shown,  the combination of director and openshift-ansible makes deploying, managing, and running OpenShift and OpenStack easier to achieve!

 

by Steven Hardy, Senior Principal Software Engineer at May 22, 2018 11:56 PM

May 18, 2018

Carlos Camacho

Testing Undercloud backup and restore using Ansible

Testing the Undercloud backup and restore

It is possible to test how the Undercloud backup and restore should be performed using Ansible.

The following Ansible playbooks show how Ansible can be used to test the backup execution in a test environment.

Creating the Ansible playbooks to run the tasks

Create a yaml file called uc-backup.yaml with the following content:

---
- hosts: localhost
  tasks:
  - name: Remove any previously created UC backups
    shell: |
      source ~/stackrc
      openstack container delete undercloud-backups --recursive
    ignore_errors: True
  - name: Create UC backup
    shell: |
      source ~/stackrc
      openstack undercloud backup --add-path /etc/ --add-path /root/

Create a yaml file called uc-backup-download.yaml with the following content:

---
- hosts: localhost
  tasks:
  - name: Print destroy warning.
    vars:
      msg: |
        We are about to destroy the UC, as we are not
        moving outside the UC the backup tarball, we will
        download it and unzip it in a temporary folder to
        recover the UC using those files.
    debug:
      msg: "{{ msg }}"
  - name: Make sure the temp folder used for the restore does not exist
    become: true
    file:
      path: "/var/tmp/test_bk_down"
      state: absent
  - name: Create temp folder to unzip the backup
    become: true
    file:
      path: "/var/tmp/test_bk_down"
      state: directory
      owner: "stack"
      group: "stack"
      mode: "0775"
      recurse: "yes"
  - name: Download the UC backup to a temporary folder (After breaking the UC we won't be able to get it back)
    shell: |
      source ~/stackrc
      cd /var/tmp/test_bk_down
      openstack container save undercloud-backups
  - name: Unzip the backup
    become: true
    shell: |
      cd /var/tmp/test_bk_down
      tar -xvf UC-backup-*.tar
      gunzip *.gz
      tar -xvf filesystem-*.tar
  - name: Make sure stack user can get the backup files
    become: true
    file:
      path: "/var/tmp/test_bk_down"
      state: directory
      owner: "stack"
      group: "stack"
      mode: "0775"
      recurse: "yes"

Create a yaml file called uc-destroy.yaml with the following content:

---
- hosts: localhost
  tasks:
  - name: Remove mariadb
    become: true
    yum: pkg={{ item }}
         state=absent
    with_items:
      - mariadb
      - mariadb-server
  - name: Remove files
    become: true
    file:
      path: "{{ item }}"
      state: absent
    with_items:
      - /root/.my.cnf
      - /var/lib/mysql

Create a yaml file called uc-restore.yaml with the following content:

---
- hosts: localhost
  tasks:
    - name: Install mariadb
      become: true
      yum: pkg={{ item }}
           state=present
      with_items:
        - mariadb
        - mariadb-server
    - name: Restart MariaDB
      become: true
      service: name=mariadb state=restarted
    - name: Restore the backup DB
      shell: cat /var/tmp/test_bk_down/all-databases-*.sql | sudo mysql
    - name: Restart MariaDB to perms to refresh
      become: true
      service: name=mariadb state=restarted
    - name: Register root password
      become: true
      shell: cat /var/tmp/test_bk_down/root/.my.cnf | grep -m1 password | cut -d'=' -f2 | tr -d "'"
      register: oldpass
    - name: Clean root password from MariaDB to reinstall the UC
      shell: |
        mysqladmin -u root -p"{{ oldpass.stdout }}" password ''
    - name: Clean users
      become: true
      mysql_user: name="{{ item }}" host_all="yes" state="absent"
      with_items:
        - ceilometer
        - glance
        - heat
        - ironic
        - keystone
        - neutron
        - nova
        - mistral
        - zaqar
    - name: Reinstall the undercloud
      shell: |
        openstack undercloud install

Running the Undercloud backup and restore tasks

To test the UC backup and restore procedure, run from the UC after creating the Ansible playbooks:

  # This playbook will create the UC backup
  ansible-playbook uc-backup.yaml
  # This playbook will download the UC backup to be used in the restore
  ansible-playbook uc-backup-download.yaml
  # This playbook will destroy the UC (remove DB server, remove DB files, remove config files)
  ansible-playbook uc-destroy.yaml
  # This playbook will reinstall the DB server, restore the DB backup, fix permissions and reinstall the UC
  ansible-playbook uc-restore.yaml

Checking the Undercloud state

After finishing the Undercloud restore playbook, the user should be able to execute any CLI command again, for example:

  source ~/stackrc
  openstack stack list

Source code available in GitHub

by Carlos Camacho at May 18, 2018 12:00 AM

May 11, 2018

Groningen Rain

Your LEGO Order Has Been Shipped

In preparation for the Red Hat Summit this week and OpenStack Summit in a week, I put together a hardware demo to sit in the RDO booth. I know, I know – the title has LEGO in it and now I’m talking tech. Bait and switch, AMIRITE?!? I promise it’s relevant. So I put together …

by K Rain at May 11, 2018 08:00 AM

Your LEGO® Order Has Been Shipped

In preparation for the Red Hat Summit this week and OpenStack Summit in a week, I put together a hardware demo to sit in the RDO booth.

I know, I know – the title has LEGO in it and now I’m talking tech.

Bait and switch, AMIRITE?!?

I promise it’s relevant.

So I put together this little hardware demo…

It ended up being two NUCs – one provisioning the other to build an all-in-one cloud using TripleO quickstart.

I didn’t use all of the hardware in the original demo and this is something I’d ultimately like to do after it all ships back to me.

Originally it was a build with one router for the public network, one switch for the private network, four NUCs – one to provision, one undercloud, one overcloud compute node, and one overcloud controller – and all the necessary networking and power cables.

Then it evolved to include a Pine64 to demo power management, but that doesn’t actually belong to our project, so I need to return it to its owner in June.

Anyway, LEGOs, RIGHT?!?

Long version longer is that I wanted to rebuild this demo AND build a lego NUC rack, too.

I found instructions on the web that looked simple enough and includes every single brick needed to build a four NUC rack.

If you scroll down to the Bill of Materials, it’s… detailed.

I ran out of time for these events, but it’s something that’s still on my mind for future events, so this week I started ordering the bricks.

And HOLY FREE HOLY I had to order parts from FOUR DIFFERENT STORES.

Thankfully, I could get most of the bricks from LEGO pick a brick despite it being IMPOSSIBLE to search for specific individual bricks. Then, I got the stackable plates from Strictly Bricks.

Then the continuous arches from an obscure shop in the Netherlands. And the roof bricks which are RETIRED by LEGO from ANOTHER obscure shop in the Netherlands.

And the last two I’m not even linking because I don’t remember how I found them or if I actually remembered to ORDER those parts because it took hours of frustrating, painstaking time to find what I did find and I think by the end I totally forgot more than a few things.

Can you tell I ran into some issues?

That I’m frustrated?

But there’s another side of me that’s completely totally OVER THE MOON cause I get to play with LEGO for my JOB.

And, thankfully, that’s the bigger part.

OMG OMG OMG OMG LEGO!! !

by rainsdance at May 11, 2018 08:00 AM

May 10, 2018

Groningen Rain

And Then I Realized I Hadn’t Posted Yet

I completely forgot to write until eleven o’clock at night and now it’s past time for bed and I haven’t written.

This is how you get crap.

Or is this Not Crap ™?

I have to share this bit that happened earlier tonight. I was remotely managing the RDO portion of the RDO / ManageIQ / Ceph booth at Red Hat Summit – we were working on the hardware demo. At some point yesterday it borked out and needed to be reinstalled to work.

I was on irc with the OSAS Ambassador and she said, effectively, “I can’t do this install, I don’t have my guru.”

And without hesitation.

I replied.

“I am your guru.”

I can’t believe I wrote that because it’s so goddamn egotistical. And I’m totally wincing at the pompous ego that typed those words.

But simultaneously?

Seriously.

I AM YOUR GURU.

I know this stuff.

And if I run into something that I haven’t run into before, I know how to figure it out.

I AM THE GODDAMN GURU.

by rainsdance at May 10, 2018 08:00 AM

May 09, 2018

OpenStack In Production (CERN)

Introducing GPUs to the CERN Cloud

High-energy physics workloads can benefit from massive parallelism -- and as a matter of fact, the domain faces an increasing adoption of deep learning solutions. Take for example the newly-announced TrackML challenge [7], already running in Kaggle! This context motivates CERN to consider GPU provisioning in our OpenStack cloud, as computation accelerators, promising access to powerful GPU computing resources to developers and batch processing alike.

What are the options?

Given the nature of our workloads, our focus is on discrete PCI-E Nvidia cards, like the GTX1080Ti and the Tesla P100. There are 2 ways to provision these GPUs to VMs: PCI passthrough and virtualized GPU. The first method is not specific to GPUs, but applies to any PCI device. The device is claimed by a generic driver, VFIO, on the hypervisor (which cannot use it anymore) and exclusive access to it is given to a single VM [1]. Essentially, from the host’s perspective the VM becomes a userspace driver [2], while the VM sees the physical PCI device and can use normal drivers, expecting no functionality limitation and no performance overhead.
Visualizing passthrough vs mdev vGPU [9]
In fact, perhaps some “limitation in functionality” is warranted, so that the untrusted VM can’t do low-level hardware configuration changes on the passed-through device, like changing power settings or even its firmware! In fact, security-wise PCI passthrough leaves a lot to be desired. Apart from allowing the VM to change the device’s configuration, it might leave a possibility for side-channel attacks on the hypervisor (although we have not observed this, and a hardware “IOMMU” protects against DMA attacks from the passed-through device). Perhaps more importantly, the device’s state won’t be automatically reset after deallocating from a VM. In the case of a GPU, data from a previous use may persist on the device’s global memory when it is allocated to a new VM. The first concern may be mitigated by improving VFIO, while the latter, the issue of device reset or “cleanup”, provides a use case for a more general accelerator management framework in OpenStack -- the nascent Cyborg project may fit the bill.
Virtualized GPUs are a vendor-specific option, promising better manageability and alleviating the previous issues. Instead of having to pass through entire physical devices, we can split physical devices into virtual pieces on demand (well, almost on demand; there needs to be no vGPU allocated in order to change the split) and hand out a piece of GPU to any VM. This solution is indeed more elegant. In Intel and Nvidia’s case, virtualization is implemented as a software layer in the hypervisor, which provides “mediated devices” (mdev [3]), virtual slices of GPU that appear like virtual PCI devices to the host and can be given to the VMs individually. This requires a special vendor-specific driver on the hypervisor (Nvidia GRID, Intel GVT-g), unfortunately not yet supporting KVM. AMD is following a different path, implementing SR-IOV at a hardware level.

CERN’s implementation

PCI passthrough has been supported in Nova for several releases, so it was the first solution we tried. There is a guide in the OpenStack docs [4], as well as previous summit talks on the subject [1]. Once everything is configured, the users will see special VM flavors (“g1.large”), whose extra_specs field includes passthrough of a particular kind of gpu. For example, to deploy a GTX 1080Ti, we use the following configuration:
nova-compute
pci_passthrough_whitelist={"vendor_id":"10de"}
nova-scheduler
add PciPassthroughFilter to enabled/default filters
nova-api
pci_alias={"vendor_id":"10de","product_id":"1b06","device_type":"type-PCI","name":"nvP1080ti_VGA"}
pci_alias={"vendor_id":"10de","product_id":"10ef","device_type":"type-PCI","name":"nvP1080ti_SND"}
flavor extra_specs
--property "pci_passthrough:alias"="nvP1080ti_VGA:1,nvP1080ti_SND:1"
A detail here is that most GPUs appear as 2 pci devices, the VGA and the sound device, both of which must be passed through at the same time (they are in the same IOMMU group; basically an IOMMU group [6] is the smallest passable unit).
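For reference, the flavor itself can be created with the standard CLI along these lines (the vCPU/RAM/disk sizes are placeholders; the alias names are the ones defined in pci_alias above):
openstack flavor create --vcpus 4 --ram 16384 --disk 40 g1.large
openstack flavor set g1.large --property "pci_passthrough:alias"="nvP1080ti_VGA:1,nvP1080ti_SND:1"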
Our cloud was in Ocata at the time, using CellsV1, and there were a few hiccups, such as the Puppet modules not parsing an option syntax correctly (MultiStrOpt) and CellsV1 dropping the pci requests. For Puppet, we were simply missing some upstream commits [15]. From Pike on, and in CellsV2, these issues shouldn’t exist. As soon as we had worked around them and puppetized our hypervisor configuration, we started offering cloud GPUs with PCI passthrough and evaluating the solution. We created a few GPU flavors, following the AWS example of keeping the amount of vCPUs the same as the corresponding normal flavors.
From the user’s perspective, there proved to be no functionality issues. CUDA applications, like TensorFlow, run normally; the users are very happy that they finally have exclusive access to their GPUs (there is good tenant isolation). And there is no performance penalty in the VM, as measured by the SHOC benchmark [5] -- admittedly quite old, we preferred this benchmark because it also measures low-level details, apart from just application performance.
From the cloud provider’s perspective, there are a few issues. Apart from the potential security problems identified before, since the hypervisor has no control over the passed-through device, we can’t monitor the GPU. We can’t measure its actual utilization, or get warnings in case of critical events, like overheating.
Normalized performance of VMs vs. hypervisor on some SHOC benchmarks. First 2: low-level gpu features, Last 2: gpu algorithms [8]. There are different test cases of VMs, to check if other parameters play a role. The “Small VM” has 2 vCPUs, “Large VM” has 4, “Pinned VM” has 2 pinned vCPUs (thread siblings), “2 pin diff N” and “2 pin same N” measure performance in 2 pinned VMs running simultaneously, in different vs the same NUMA nodes

Virtualized GPU experiments

The allure of vGPUs amounts largely to finer-grained distribution of resources, fewer security concerns (debatable) and monitoring. Nova support for provisioning vGPUs is offered in Queens as an experimental feature. However, our cloud is running on KVM hypervisors (on CERN CentOS 7.4 [14]), which Nvidia does not support as of May 2018 (Nvidia GRID v6.0). When it does, the hypervisor will be able to split the GPU into vGPUs according to one of many possible profiles, such as into 4 or 16 pieces. Libvirt then assigns these mdevs to VMs in a similar way to hostdevs (passthrough). Details are in the OpenStack docs at [16].
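As a rough sketch of what this configuration looks like once supported (following the Queens documentation [16]; the vGPU type name below is hardware- and driver-specific and purely illustrative):
nova-compute (nova.conf)
[devices]
enabled_vgpu_types = nvidia-35
flavor extra_specs
--property "resources:VGPU=1"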
Despite this promise, it remains to be seen whether virtual GPUs will turn out to be an attractive offering for us. This depends on vendors' licensing costs (such as per-VM pricing), which, for the compute-compatible offering, can be significant. Added to that is the fact that only a subset of standard CUDA is supported (not supported are unified memory and the "CUDA tools" [11], probably referring to tools like the Nvidia profiler). vGPUs also oversubscribe the GPU's compute resources, which can be seen in either a positive or a negative light. On the one hand, this guarantees higher resource utilization, especially for bursty workloads such as development use. On the other hand, we may expect a lower quality of service [12].

And the road goes on...

Our initial cloud GPU offering is very limited, and we intend to gradually increase it. Before that, it will be important to address (or at least be conscious of) the security repercussions of PCI passthrough. Even more significant is addressing GPU accounting in a straightforward manner, by enforcing quotas on GPU resources. So far we haven't tested the case of GPU P2P with multi-GPU VMs, which is reportedly problematic [13].
Another direction we'll be researching is offering GPU-enabled container clusters, backed by PCI passthrough VMs. It may be that, with this approach, we can emulate a behavior similar to vGPUs and circumvent some of the bigger problems with PCI passthrough.

References

[5]: SHOC benchmark suite: https://github.com/vetter/shoc
[11]: CUDA Unified memory and tooling not supported on Nvidia vGPU: https://docs.nvidia.com/grid/latest/grid-vgpu-user-guide/index.html#features-grid-vgpu

by Konstantinos Samaras-Tsakiris (noreply@blogger.com) at May 09, 2018 05:55 PM

Groningen Rain

My Miles Are About to Skyrocket

I reached out to my colleague that used to do This Job and asked him, “What events do you typically travel to outside of the OpenStack Summits [0] / PTGs [1]?”

And he replied something witty and important and vital and I completely didn’t have logging enabled nor did I write it down because I’m made of awesome.

But I did write down what OpenStack Days [2] are the biggest ones that I should try to attend, if not this year, in the upcoming years.

OpenStack Days Israel [3]
OpenStack Days Benelux [4]
OpenStack Days Nordic [5]
OpenStack Days NYC [6]
OpenStack Days UK [7]

And then there's FOSDEM [8] and DevConf.CZ [9] and two CentOS Dojos [10] that I'm helping plan.

Oh, RIGHT! Plus RDO Test Days [11] in Brno!

This week, in particular, I have severe FOMO because it’s Red Hat Summit. And then 21-24 May is OpenStack Summit Vancouver. I’m remotely managing both. I’ve done everything I can possibly do to prepare for both, now I can only sit from afar and wait.

And put out fires, as needed.

Which means I could really use some distractions, People. Therefore, I thought it'd be nice to look at all the places I'll be travelling.

This means, over the next year, POSSIBLY, I’ll be travelling to…..

#drumrollPLEASE

June 2018
14-15 RDO Test Days Rocky M2 Brno Czech Republic

August 2018
2-3 RDO Test Days Rocky M3 Brno Czech Republic
?? OpenStack Days NYC New York USA (tentative!)

September 2018
6-7 RDO Test Days Rocky GA Brno Czech Republic
10-14 OpenStack PTG Denver Colorado USA
13 OpenStack Day Benelux Amsterdam Netherlands (conflicts with PTG – need to send someone else!)
?? OpenStack Days UK London United Kingdom (tentative!)

October 2018
09-10 OpenStack Days Nordic Stockholm Sweden
20 CentOS Dojo @CERN (tentative!)
?? OpenStack Days Israel Tel Aviv (tentative!)

November 2018
13-15 OpenStack Summit Berlin Germany

January 2019
?? DevConf.cz Brno Czech Republic (tentative!)

February 2019
?? FOSDEM Brussels Belgium (tentative!)
?? OpenStack PTG APAC (tentative!)

And, boy, are my arms tired ALREADY.

[0] https://www.openstack.org/summit/
[1] https://www.openstack.org/ptg/
[2] https://www.openstack.org/community/events/openstackdays
[3] https://www.openstack-israel.org/
[4] http://www.openstack.nl/en/events/
[5] http://stockholm.openstacknordic.org/
[6] OpenStack Days NYC was called Openstack Days East in 2016, doesn’t appear to have happened last year and I can’t find information about it anywhere.
[7] OpenStack Days UK doesn’t have any information up for this year, but last year was at https://openstackdays.uk/2017/
[8] https://fosdem.org/
[9] https://devconf.info/cz
[10] https://wiki.centos.org/Events/Dojo/
[11] https://www.rdoproject.org/testday/

by rainsdance at May 09, 2018 08:00 AM

May 08, 2018

Red Hat Stack

A modern hybrid cloud platform for innovation: Containers on Cloud with Openshift on OpenStack

Market trends show that due to long application life-cycles and the high cost of change, enterprises will be dealing with a mix of bare-metal, virtualized, and containerized applications for many years to come. This is true even as greenfield investment moves to a more container-focused approach.

Red Hat® OpenStack® Platform provides a solution to the problem of managing large scale infrastructure which is not immediately solved by containers or the systems that orchestrate them.

In the OpenStack world, everything can be automated. If you want to provision a VM, a storage volume, a new subnet or a firewall rule, all these tasks can be achieved using an easy-to-use UI or a command line interface, leveraging the OpenStack APIs. All these infrastructure needs might previously have required a ticket and some internal processing, and could take weeks. Now such provisioning can all be done with a script or a playbook, and can be completely automated.
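For example (the resource names, image and sizes below are purely illustrative), provisioning a VM, a volume, a subnet and a firewall rule from the command line could look something like this:

$ openstack server create --flavor m1.small --image rhel7 --network demo-net demo-vm
$ openstack volume create --size 10 demo-volume
$ openstack subnet create --network demo-net --subnet-range 192.168.10.0/24 demo-subnet
$ openstack security group rule create --protocol tcp --dst-port 22 demo-secgroup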

The applications and workloads can specify cloud resources to be provisioned and spun up from a definition file. This enables new levels of provision-as-you-need-it. As demand increases, the infrastructure resources can be easily scaled! Operational data and meters can trigger and orchestrate new infrastructure provisioning automatically when needed.

On the consumption side, it is no longer a developer ssh'ing into a server and manually deploying an application server. Now, it's simply a matter of running a few OpenShift commands, selecting from a list of predefined applications, language runtimes and databases, and having those resources provisioned on top of the target infrastructure that was automatically provisioned and configured.

Red Hat OpenShift Container Platform gives you the ability to define an application from a single YAML file. This makes it convenient for a developer to share with other developers, allowing them to  launch an exact copy of that application, make code changes, and share it back. This capability is only possible when you have automation at this level.
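As a minimal illustration (the template file name here is hypothetical), a developer could launch an application from such a definition with a single command:

$ oc new-app -f my-application-template.yaml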

Infrastructure and application platforms resources are now exposed differently in an easy and consumable way, and the days when you needed to buy a server, manually connect it to the network and install runtimes and applications manually are now very much a thing of the past.

With Red Hat OpenShift Container Platform on Red Hat OpenStack Platform you get:

A WORKLOAD DRIVEN I.T. PLATFORM: The underlying infrastructure doesn't matter from a developer perspective. Container platforms exist to ensure the apps are the main focus. As a developer I only care about the apps, and I want to have a consistent experience regardless of the underlying infrastructure platform. OpenStack provides this to OpenShift.

DEEP PLATFORM INTEGRATION: Networks (kuryr), services (ironic, barbican, octavia), storage (cinder, ceph), installation (openshift-ansible) are all engineered to work together to provide the tightest integrations across the stack, right down to bare metal. All are based on Linux® and engineered in the open source community for exceptional performance.

PROGRAMMATIC SCALE-OUT: OpenStack is 100% API driven across the infrastructure software stack. Storage, networking, compute VMs or even bare metal all deliver the ability to scale out rapidly and programmatically. With scale under workloads, growth is easy.

ACROSS ANY TYPE OF INFRASTRUCTURE: OpenStack can utilise bare metal for virtualization or for direct consumption. It can interact with network switches and storage directly to ensure hardware is put to work for the workloads it supports.

FULLY MANAGED: Red Hat CloudForms and Red Hat Ansible Automation provide common tooling across multiple providers. Ansible is Red Hat's automation engine for everything, and it's present under the hood in Red Hat CloudForms. With Red Hat OpenStack Platform, Red Hat CloudForms is deeply integrated into the overcloud, the undercloud, and the container platform on top. Full stack awareness means total control. And our Red Hat Cloud Suite bundle of products provides access to OpenStack and OpenShift, as well as an array of supporting technologies. Red Hat Satellite, Red Hat Virtualization, Red Hat Insights, and even Red Hat CloudForms are included!

A SOLID FOUNDATION: All Red Hat products are co-engineered with Red Hat Enterprise Linux at their core. Fixes happen fast and accurately as all components of the stack are in unison and developmental harmony. Whether issues might lie at the PaaS, IaaS or underlying Linux layer, Red Hat will support you all the way!

Red Hat Services can help you accelerate your journey to hybrid cloud adoption and realize the most value from best-of-breed open source technology platforms such as OpenShift on top of OpenStack. Want to learn more about how we can help? Feel free to reach out to me directly with any questions at slefrere@redhat.com, or download our Containers on Cloud datasheet.

by Stephane Lefrere at May 08, 2018 02:50 PM

May 04, 2018

RDO Blog

The RDO Community Represents at RedHat Summit, May 8-10

Over the past few weeks we’ve been gearing up for Red Hat Summit and now it’s almost here! We hope to see you onsite — there are so many great talks, events, and networking opportunities to take advantage of. From panels to general sessions to hands-on labs, chances are you’re going to have a hard time choosing which sessions to attend!

We’re particularly excited about the below talks, but the full schedule of talks related to RDO, RHOSP, TripleO, and Ceph is over here.

Once you're sessioned-out, come swing by the RDO booth, shared with ManageIQ and Ceph, to see our newly updated hardware demo.


OpenStack use cases: How businesses succeed with OpenStack

Have you ever wondered just how the world’s largest companies are using Red Hat OpenStack Platform?

In this session, we’ll look at some of the most interesting use cases from Red Hat customers around the world, and give you some insight into how they achieved their technical and digital successes. You’ll learn how top organizations have used OpenStack to deliver unprecedented value.

Date: Tuesday, May 8
Time: 3:30 PM – 4:15 PM
Location: Moscone West – 2007

Red Hat OpenStack Platform: The road ahead

OpenStack has reached a maturity level confirmed by wide industry adoption and the amazing number of active production deployments, with Red Hat a lead contributor to the project. In this session, we’ll share where we are investing for upcoming releases.

Date: Tuesday, May 8
Time: 4:30 PM – 5:15 PM
Location: Moscone West – 2007

Production-ready NFV at Telecom Italia (TIM)

Telecom Italia (TIM) is the first large telco in Italy to deploy OpenStack into production. TIM chose to deploy a NFV solution based on Red Hat OpenStack Platform for its cloud datacenter and put critical VNFs—such as vIMS and vEPC—into production for its core business services. In this session, learn how TIM worked with Red Hat Consulting, as well as Ericsson as the VNF vendor and Accenture as the system integrator, to set up an end-to-end NFV environment that matched its requirements, with complex features like DPDK, completely automated.

Date: Wednesday, May 9
Time: 11:45 AM – 12:30 PM
Location: Moscone West – 2002

Scalable application platform on Ceph, OpenStack and Ansible

How do you take a Ceph environment providing Cinder block storage to OpenStack from a handful of nodes in a PoC environment all the way up to an 800+ node production environment, while serving live applications? In this session we will have two customers talking about how they did this and the lessons learned! At Fidelity, they learned a lot about scaling hardware, tuning Ceph parameters, and handling version upgrades (using Ansible automation!). Solera Holdings Inc committed to modernizing the way applications are developed and deployed as its need for highly performant, redundant and cost-effective object storage grew tremendously. After a successful PoC with Ceph Storage, Red Hat was chosen as a solution partner due to its excellence in customer experience and support as well as its expertise in Ansible, as Solera will automate networking equipment (fabric, firewalls, load balancers).

In addition to committing to reducing expensive enterprise SAN storage, Solera also committed to a new virtualization strategy, building up a new IaaS to tackle challenges such as DBaaS and leveraging its newly built SDS backend for OpenStack, while using the new SDS capabilities via iSCSI to meet existing storage demands on VMware.

Solera will share why they chose Red Hat as a partner, how it has impacted and benefited developers and DevOps engineers alike, and where the road will take them. Come to this session to hear how both Fidelity and Solera Holdings Inc did it and what benefits were realized along the way!

Date: Thursday, May 10
Time: 1:00 PM – 1:45 PM
Location: Moscone West – 2007

Building and maintaining open source communities

Being successful in creating an open source community requires planning, measurements, and clear goals. However, it’s an investment that can pay off tenfold when people come together around a shared vision. Who are we targeting, how can we achieve these goals, and why does it matter to the bigger business strategy?

In this panel you'll hear from Amye Scavarda (Gluster), Jason Hibbets (OpenSource.com), Greg DeKoenigsberg (Ansible), and Leslie Hawthorn (Open source and standards, Red Hat) as they share first-hand experiences about how open source communities have directly contributed to the success of a product, as well as best practices to build and maintain these communities. It will be moderated by Mary Thengvall (Persea Consulting), who after many years of building community programs is now working with companies who are building out a developer relations strategy.

Date: Thursday, May 10
Time: 2:00 PM – 2:45 PM
Location: Moscone West – 2007

by Mary Thengvall at May 04, 2018 09:02 PM

Consuming Kolla Tempest container image for running Tempest tests

The Kolla project provides a Docker container image for Tempest.

The provided container image is available in two formats for CentOS: centos-binary-tempest and centos-source-tempest.

The RDO community rebuilds the container image in centos-binary format and pushes it to https://registry.rdoproject.org and to docker.io/tripleomaster.

The Tempest container image contains openstack-tempest and all available Tempest plugins.

The benefit of running Tempest tests from the Tempest container is that we do not need to install any Tempest package or Tempest plugin on the deployed cloud, keeping the environment safe from dependency mismatches and updates.

In TripleO CI, we run Tempest tests using the Tempest container image in the tripleo-ci-centos-7-undercloud-containers job, using featureset027.

We can consume the same image for running Tempest tests locally in TripleO deployment:

  • Follow this link for installing a containerized undercloud.

    Note: At step 5 in the above link, open undercloud.conf in an editor and set

    enable_tempest = true.

    This will pull the Tempest container image onto the undercloud.

  • If the Tempest container image is not available on the undercloud, we can pull it from Docker Hub.

    $ sudo docker pull docker.io/tripleomaster/centos-binary-tempest

  • Create two directories, container_tempest and tempest_workspace, and copy stackrc, overcloudrc, tempest-deployer-input.conf and the whitelist and blacklist files into container_tempest, so that they can be made available inside the container. The commands below do this:

    $ mkdir container_tempest tempest_workspace

    $ cp stackrc overcloudrc tempest-deployer-input.conf whitelist.txt blacklist.txt container_tempest

  • Create an alias for running Tempest within a container, mounting container_tempest at /home/stack and tempest_workspace at /home/stack/tempest inside the container:

    $ alias docker-tempest="sudo docker run -i \
    -v $(pwd)/container_tempest:/home/stack \
    -v $(pwd)/tempest_workspace:/home/stack/tempest \
    docker.io/tripleomaster/centos-binary-tempest"

  • Create tempest workspace using docker-tempest alias

    $ docker-tempest tempest init /home/stack/tempest

  • List tempest plugins installed within tempest container

    $ docker-tempest tempest list-plugins

  • Generate tempest.conf using discover-tempest-config

    Note: If tempest tests are running against undercloud then:

    $ source stackrc
    $ export OS_AUTH_URL="$OS_AUTH_URL/v$OS_IDENTITY_API_VERSION"
    $ docker-tempest discover-tempest-config --create \
    --out /home/stack/tempest/etc/tempest.conf

    Note: If tempest tests are running against overcloud then:

    $ source overcloudrc
    $ docker-tempest discover-tempest-config --create \
    --out /home/stack/tempest/etc/tempest.conf \
    --deployer-input /home/stack/tempest-deployer-input.conf

  • Running tempest tests

    $ docker-tempest tempest run --workspace tempest \
    -c /home/stack/tempest/etc/tempest.conf \
    -r <tempest test regex> --subunit

    In the above command:

    • --workspace : use the given Tempest workspace
    • -c : path to the tempest.conf file to use
    • -r : run only the tests matching the given regular expression
    • --subunit : generate the test results as a subunit v2 stream

Once the tests are finished, we can find the test output in the /home/stack/tempest folder inside the container, which is the tempest_workspace directory on the host.

Thanks to Kolla team, Emilien, Wes, Arx, Martin, Luigi, Andrea, Ghanshyam, Alex, Sagi, Gabriel and RDO team for helping me in getting things in place.

Happy Hacking!

by chkumar246 at May 04, 2018 09:47 AM

Running Tempest tests against a TripleO Undercloud

Tempest is the integration test suite used to validate any deployed OpenStack cloud.

TripleO undercloud is the all-in-one OpenStack installation that includes components for provisioning and managing the OpenStack nodes that form your OpenStack environment (the overcloud).

To validate the undercloud using Tempest, follow the steps below:

  • Using tripleo-quickstart:

    • Follow this link to provision a libvirt guest environment through tripleo-quickstart
    • Deploy the undercloud and run Tempest tests against it

      $ bash quickstart.sh -R master --no-clone --tags all \
      --nodes config/nodes/1ctlr_1comp.yml \
      -I --teardown none -p quickstart-extras-undercloud.yml \
      --extra-vars test_ping=False \
      --extra-vars tempest_undercloud=True \
      --extra-vars tempest_overcloud=False \
      --extra-vars run_tempest=True \
      --extra-vars test_white_regex='tempest.api.identity|tempest.api.compute' \
      $VIRTHOST

      The above command will:

      • Deploy an undercloud
      • Generate tempest_setup.sh script in /home/stack folder
      • Run test_white_regex tempest tests
      • Store all the results in /home/stack/tempest folder.
  • Running Tempest tests manually on undercloud:

    • Deploy the undercloud manually by following this link and then ssh into the undercloud.
    • Install the openstack-tempest RPM on the undercloud

      $ sudo yum -y install openstack-tempest

    • Source stackrc on the undercloud

      $ source stackrc

    • Append the Identity API version to $OS_AUTH_URL

      $OS_AUTH_URL as defined in stackrc does not contain the Identity API version, which will lead to a failure while generating tempest.conf using python-tempestconf. In order to fix this issue, we need to append the API version to the OS_AUTH_URL environment variable and export it.

      $ export OS_AUTH_URL="$OS_AUTH_URL/v$OS_IDENTITY_API_VERSION"

    • Create the Tempest workspace

      $ tempest init <tempest_workspace>

    • Generate Tempest configuration using python-tempestconf

      $ cd <path to the tempest_workspace>

      $ discover-tempest-config --create --out etc/tempest.conf

      The above command will generate tempest.conf in the etc/ directory of the Tempest workspace.

    • Now we are all set to run Tempest tests. Run the following command:

      $ tempest run -r '(tempest.api.identity|tempest.api.compute)' --subunit

      The above command will:

      • Run the tempest.api.identity and tempest.api.compute tests
      • Store all the Tempest test subunit results, in v2 format, in the .stestr directory under the Tempest workspace.
    • Use subunit2html command to generate results in html format

      $ sudo yum -y install python-subunit

      $ subunit2html <path to tempest workspace>/.stestr/0 tempest.html

And we are done with running Tempest on undercloud.

Currently, the tripleo-ci-centos-7-undercloud-oooq job runs Tempest tests on the undercloud in TripleO CI using featureset003.

Thanks to Emilien, Enrique, Wes, Arx, Martin, Luigi, Alex, Sagi, Gabriel and RDO team for helping me in getting things in place.

Happy Hacking!

by chkumar246 at May 04, 2018 08:22 AM

May 03, 2018

Lars Kellogg-Stedman

Using a TM1637 LED module with CircuitPython

CircuitPython is "an education friendly open source derivative of MicroPython". MicroPython is a port of Python to microcontroller environments; it can run on boards with very few resources such as the ESP8266. I've recently started experimenting with CircuitPython on a Wemos D1 mini, which is a small form-factor ESP8266 board …

by Lars Kellogg-Stedman at May 03, 2018 04:00 AM

dmsimard

ARA Records Ansible 0.15 has been released

I was recently writing that ARA was open to limited development for the stable release in order to improve the performance for larger scale users. This limited development is the result of this 0.15.0 release. The #OpenStack community runs over 300,000 continuous integration jobs with #Ansible every month with the help of the awesome Zuul. Learn more about scaling ARA reports with @dmsimard https://t.co/l8zFXHqhhc — OpenStack (@OpenStack) April 18, 2018 Changelog for ARA Records Ansible 0.

May 03, 2018 12:00 AM

May 02, 2018

RDO Blog

RDO Test Days is TOMORROW and FRIDAY!

Bust out your spoons cause we’re about to test the first batch of Rocky [road] ice cream!

Or, y’know, the first Rocky OpenStack milestone.

On 03 and 04 May, TOMORROW and Friday, we’ll have our first milestone test days for Rocky OpenStack. We would love to get as wide participation in the RDO Test Days from our global team as possible!

We’re looking for developers, users, operators, quality engineers, writers, and, yes, YOU. If you’re reading this, we want your help!

Let’s set new records on the amount of participants! The amount of bugs! The amount of feedback and questions and NOTES!

Oh, my.

But, seriously.

I know that everyone has Important Stuff To Do but taking a few hours or a day to give things a run-through at various points in the RDO cycle will benefit everyone. Not only will this help identify issues early in the development process, but you can be one of the first to cut your teeth on the latest versions with your favorite deployment methods and environments like TripleO, PackStack, and Kolla.

So, please consider taking a break from your normal duties and spending at least a few hours with us in #rdo on Freenode.

And who knows – if we have enough interest, perhaps we’ll have ACTUAL rocky road ice cream at the next RDO Test Days.

See you TOMORROW!

by Rain Leander at May 02, 2018 11:34 AM

Groningen Rain

Rocky Road Ice Cream, People. The Best Ice Cream, Obviously.

Grab your spoons, people, the first milestone of OpenStack Rocky has come and gone which can mean only one thing!

RDO Test Days!

Wait, were you expecting ACTUAL ice cream?

The ice cream is a lie.

But RDO Test Days are HERE!

RDO is a community of people using and deploying OpenStack on CentOS, Fedora, and Red Hat Enterprise Linux. At each OpenStack development cycle milestone, the RDO community holds test days to invite people to install, deploy and configure a cloud using RDO and report feedback. This helps us find issues in packaging, documentation, installation and more but also, where appropriate, to collaborate with the upstream OpenStack projects to file and resolve bugs found throughout the event.

In order to participate, though, people needed to:

* Have hardware available to install and deploy on
* Be reasonably knowledgeable / familiar with OpenStack
* Have the time to go through an end-to-end installation, test it and provide feedback

SAY NO MORE!

In an attempt to eliminate these barriers, we’re continuing the experiment started last year by providing a ready to use cloud environment. This cloud will be deployed with the help of Kolla: https://github.com/openstack/kolla and Kolla-Ansible: https://github.com/openstack/kolla-ansible which will install a containerized OpenStack cloud with the help of Ansible and Docker.
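To give a rough idea of what that deployment looks like (the inventory file name below is illustrative, not necessarily the one we will use), Kolla-Ansible drives the installation with a handful of commands:

$ kolla-ansible -i multinode bootstrap-servers
$ kolla-ansible -i multinode prechecks
$ kolla-ansible -i multinode deploy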

The cloud will be built using 5 bare metal servers – three controllers and two compute nodes.

COME TRY A REAL OPENSTACK ROCKY DEPLOYMENT!
Would you like to participate? We’d love your help!

The next test days start TOMORROW – on the third and fourth of May we will test the first milestone of the latest OpenStack Rocky release.

To sign up to use the Kolla cloud environment and for more information, please visit https://etherpad.openstack.org/p/rdo-rocky-m1-cloud.

In the meantime, visit https://www.rdoproject.org/testday/rocky/milestone1/ and join us on channel #rdo on Freenode irc where we’re available to answer any questions and troubleshoot.

And if we get enough interest, who knows?

Maybe we’ll get ACTUAL ice cream at future test days.

Interested?

Me, too!

by rainsdance at May 02, 2018 08:00 AM


April 27, 2018

Red Hat Stack

Highlights from the OpenStack Rocky Project Teams Gathering (PTG) in Dublin

Last month in Dublin, OpenStack engineers gathered from dozens of countries and companies to discuss the next release of OpenStack. This is always my favorite OpenStack event, because I get to do interviews with the various teams, to talk about what they did in the just-released version (Queens, in this case) and what they have planned for the next one (Rocky).

If you want to see all of those interviews, they are on YouTube at:

(https://www.youtube.com/playlist?list=PLOuHvpVx7kYnVF3qjvIw-Isq2sQkHhy7q) and I’m still in the process of editing and uploading them. So subscribe, and you’ll get notified as the new interviews go live.

In this article, I want to mention a few themes that cross projects, so you can get a feel for what’s coming in six months.

The interview chair. Photo: Author

I’ll start with my interview with Thierry Carrez. While it was the last interview I did, watching it first gives a great overview of the event, what was accomplished, why we do the event, and what it will look like next time. (Spoiler: We will have another PTG around early September, but are still trying to figure out what happens after that.)

One theme that was even stronger this time than at past PTGs was cross-project collaboration. This is, of course, a long-time theme in OpenStack, because every project MUST work smoothly with others, or nothing works. But this has been extended to the project level with the introduction of new SIGs – Special Interest Groups. These are teams that focus on cross-project concepts such as scientific computing, API best practices, and security. You can read more about SIGs at https://wiki.openstack.org/wiki/OpenStack_SIGs

I spoke with two SIGs at the PTG, and I’ll share here the interview with Michael McCune from the OpenStack API SIG.

Another common theme in interviews this year was that while projects did make huge progress on the features front, there was also a lot of work in stabilizing and hardening OpenStack – making it more enterprise-ready, you might say. One of these efforts was the “fast forward upgrade” effort, which is about making it easy to upgrade several versions – say, from Juno all the way to Queens, for example – in one step.

Croke Park, Dublin. PTG Venue! Photo: Author

Part of what makes that possible is the amazing work of the Zuul team, who develop and run the massive testing infrastructure that subjects every change to OpenStack code to a rigorous set of functional and interdependence tests.

And I’ll share one final video with you before sending you off to watch the full list (Again, that’s at https://www.youtube.com/playlist?list=PLOuHvpVx7kYnVF3qjvIw-Isq2sQkHhy7q). Over the years, OpenStack had a reputation of being hard to deploy and manage. This has driven development of the TripleO project, which is an attempt to make deployment easy, and management possible without knowing everything about everything. I did a number of TripleO interviews, because it’s a very large team, working on a diverse problem set.

The big storm during the PTG. Photo: Author

The video that I’ll choose here is with the OpenStack Validations team. This is the subproject that ensures that, when you’re deploying OpenStack, it checks everything that could go wrong before it has a chance to, so that you don’t waste your time.

There are many other videos that I haven’t featured here, and I encourage you to look at the list and pick the few interviews that are of most interest to you. I tried to keep them short, so that you can get the message without spending your entire day watching. But if you have any questions about any of them, take them to the OpenStack mailing list (Details at https://wiki.openstack.org/wiki/Mailing_Lists#General_List) where these people will be more than happy to give you more detail.

About Rich Bowen

Rich is a community manager at Red Hat, where he works with the OpenStack and CentOS communities. Find him on Twitter at: @rbowen.

 

by Rich Bowen at April 27, 2018 02:05 AM

April 25, 2018

John Likes OpenStack

April 20, 2018

RDO Blog

Community Blogpost Round-up: April 20

The last month has been busy to say the least, which is why we haven’t gotten around to posting a recent Blogpost Roundup, but it looks like you all have been busy as well! Thanks as always for continuing to share your knowledge around RDO and OpenStack. Enjoy!

Lessons from OpenStack Telemetry: Deflation by Julien Danjou

This post is the second and final episode of Lessons from OpenStack Telemetry. If you have missed the first post, you can read it here.

Read more at https://julien.danjou.info/lessons-from-openstack-telemetry-deflation/

Unit tests on RDO package builds by jpena

Unit tests are used to verify that individual units of source code work according to a defined spec. While this may sound complicated to understand, in short it means that we try to verify that each part of our source code works as expected, without having to run the full program they belong to.

Read more at https://blogs.rdoproject.org/2018/04/unit-tests-on-rdo-package-builds/

Red Hatters To Present at More Than 50 OpenStack Summit Vancouver Sessions by Peter Pawelski, Product Marketing Manager, Red Hat OpenStack Platform

OpenStack Summit returns to Vancouver, Canada May 21-24, 2018, and Red Hat will be returning as well with as big of a presence as ever. Red Hat will be a headline sponsor of the event, and you’ll have plenty of ways to interact with us during the show.

Read more at https://redhatstackblog.redhat.com/2018/04/13/openstack-summit-vancouver-preview/

Lessons from OpenStack Telemetry: Incubation by Julien Danjou

It was mostly around that time in 2012 that I and a couple of fellow open-source enthusiasts started working on Ceilometer, the first piece of software from the OpenStack Telemetry project. Six years have passed since then. I’ve been thinking about this blog post for several months (even years, maybe), but lacked the time and the hindsight needed to lay out my thoughts properly. In a series of posts, I would like to share my observations about the Ceilometer development history.

Read more at https://julien.danjou.info/lessons-from-openstack-telemetry-incubation/

Comparing Keystone and Istio RBAC by Adam Young

To continue with my previous investigation to Istio, and to continue the comparison with the comparable parts of OpenStack, I want to dig deeper into how Istio performs RBAC. Specifically, I would love to answer the question: could Istio be used to perform the Role check?

Read more at https://adam.younglogic.com/2018/04/comparing-keystone-and-istio-rbac/

Scaling ARA to a million Ansible playbooks a month by David Moreau Simard

The OpenStack community runs over 300 000 CI jobs with Ansible every month with the help of the awesome Zuul.

Read more at https://dmsimard.com/2018/04/09/scaling-ara-to-a-million-ansible-playbooks-a-month/

Comparing Istio and Keystone Middleware by Adam Young

One way to learn a new technology is to compare it to what you already know. I’ve heard a lot about Istio, and I don’t really grok it yet, so this post is my attempt to get the ideas solid in my own head, and to spur conversations out there.

Read more at https://adam.younglogic.com/2018/04/comparing-istio-and-keystone-middleware/

Heading to Red Hat Summit? Here’s how you can learn more about OpenStack. by Peter Pawelski, Product Marketing Manager, Red Hat OpenStack Platform

Red Hat Summit is just around the corner, and we’re excited to share all the ways in which you can connect with OpenStack® and learn more about this powerful cloud infrastructure technology. If you’re lucky enough to be headed to the event in San Francisco, May 8-10, we’re looking forward to seeing you. If you can’t go, fear not, there will be ways to see some of what’s going on there remotely. And if you’re undecided, what are you waiting for? Register today. 

Read more at https://redhatstackblog.redhat.com/2018/03/29/red-hat-summit-2018-openstack-preview/

Multiple 1-Wire Buses on the Raspberry Pi by Lars Kellogg-Stedman

The DS18B20 is a popular temperature sensor that uses the 1-Wire protocol for communication. Recent versions of the Linux kernel include a kernel driver for this protocol, making it relatively convenient to connect one or more of these devices to a Raspberry Pi or similar device.

Read more at https://blog.oddbit.com/2018/03/27/multiple-1-wire-buses-on-the-/

An Introduction to Fast Forward Upgrades in Red Hat OpenStack Platform by Maria Bracho, Principal Product Manager OpenStack

OpenStack momentum continues to grow as an important component of hybrid cloud, particularly among enterprise and telco. At Red Hat, we continue to seek ways to make it easier to consume. We offer extensive, industry-leading training, an easy to use installation and lifecycle management tool, and the advantage of being able to support the deployment from the app layer to the OS layer.

Read more at https://redhatstackblog.redhat.com/2018/03/22/an-introduction-to-fast-forward-upgrades-in-red-hat-openstack-platform/

Ceph integration topics at OpenStack PTG by Giulio Fidente

I wanted to share a short summary of the discussions happened around the Ceph integration (in TripleO) at the OpenStack PTG.

Read more at http://giuliofidente.com/2018/03/ceph-integration-topics-at-openstack-ptg.html

Generating a list of URL patterns for OpenStack services. by Adam Young

Last year at the Boston OpenStack summit, I presented on an Idea of using URL patterns to enforce RBAC. While this idea is on hold for the time being, a related approach is moving forward building on top of application credentials. In this approach, the set of acceptable URLs is added to the role, so it is an additional check. This is a lower barrier to entry approach.

Read more at https://adam.younglogic.com/2018/03/generating-url-patterns/

by Mary Thengvall at April 20, 2018 02:57 PM

April 19, 2018

Julien Danjou

Lessons from OpenStack Telemetry: Deflation

This post is the second and final episode of Lessons from OpenStack Telemetry. If you have missed the first post, you can read it here.

Splitting

At some point, the rules relaxed on new projects addition with the Big Tent initiative, allowing us to rename ourselves to the OpenStack Telemetry team and splitting Ceilometer into several subprojects: Aodh (alarm evaluation functionality) and Panko (events storage). Gnocchi was able to join the OpenStack Telemetry party for its first anniversary.

Finally being able to split Ceilometer into several independent pieces of software allowed us to tackle technical debt more rapidly. We built autonomous teams for each project and gave them the same liberty they had in Ceilometer. The cost of migrating the code base to several projects was higher than we wanted it to be, but we managed to build a clear migration path nonetheless.

Gnocchi Shamble

With Gnocchi in town, we stopped all efforts on the Ceilometer storage and API and expected people to adopt Gnocchi. What we underestimated was the unwillingness of many operators to think about telemetry. They did not want to deploy anything to have telemetry features in the first place, so adding yet another component (a timeseries database) to get proper metric features was seen as a burden – and sometimes not seen at all.
Indeed, we also did not communicate enough on our vision for that transition. After two years of existence, many operators were asking what Gnocchi was and what they needed it for. They deployed Ceilometer and its bogus storage and API and were confused about needing yet another piece of software.

It took us more than two years to deprecate the Ceilometer storage and API, which is way too long.

Deflation

In the meantime, people were leaving the OpenStack boat. Soon enough, we started to feel the shortage of human resources. Smartly, we never followed the OpenStack trend of imposing blueprints, specs, bug reports or any other process on contributors, in keeping with my list of open source best practices. This flexibility allowed us to iterate more rapidly; compared to other OpenStack projects, we were moving faster relative to the size of our contributor base.

Nonetheless, we felt like bailing out a sinking ship. Our contributors were disappearing while we were swamped with technical debt: half-baked feature, unfinished migration, legacy choices and temporary hacks. After the big party that happened, we had to wash the dishes and sweep the floor.

Being part of OpenStack started to feel like a burden in many ways. The inertia of OpenStack being a big project was beginning to surface, so we put in a lot of effort to dodge most of its implications. Consequently, the team was perceived as an outlier, which does not help, especially when you have to interact a lot with your neighbors.

The OpenStack Foundation never understood the organization of our team. They would refer to us as "Ceilometer" whereas we had formally renamed ourselves to "Telemetry", since we encompassed four server projects and a few libraries. For example, while Gnocchi was an OpenStack project for two years before leaving, it was never listed on the project navigator maintained by the foundation.

That's a funny anecdote that demonstrates the peculiarity of our team, and how it has been both a strength and a weakness.

Competition

Nobody was trying to do what we were doing when we started Ceilometer. We filled the space of metering OpenStack. However, as the number of companies involved increased, and the friction along with it, some people grew unhappy. The race to have a seat at the table of the feast and become a Project Team Leader was strong, so some people preferred to create their own project rather than play the contribution game. In many areas, including ours, that divided the effort up to a ridiculous point where several teams were doing exactly the same thing, or were trying to step on each other's toes to kill the competitors.

We spent a significant amount of time trying to bring other teams into the Telemetry scope, to unify our efforts, without much success. Some companies were not embracing open source because of their cultural differences, while others had no interest in joining a project where they would not be seen as the leader.

That fragmentation did not help us, but also did not do much harm in the end. Most of those projects are now either dead or becoming irrelevant as the rest of the world caught up on what they were trying to do.

Epilogue

As of 2018, I'm the PTL for Telemetry – because nobody else ran. The official list of maintainers for the telemetry projects is five people: two are inactive, and three are part-time. During the latest development cycle (Queens), 48 people committed to Ceilometer, though only three developers made impactful contributions. The code size has been divided by two since the peak: Ceilometer is now 25k lines of code long.

Panko and Aodh have no active developers. A Red Hat colleague and I are keeping the projects afloat so they continue to work.

Gnocchi has humbly thrived since it left OpenStack. The stains from having been part of OpenStack are not yet all gone. It has a small community, but users see its real value and enjoy using it.

Those last six years have been intense, and riding the OpenStack train has been amazing. As I concluded in the first blog post of this series, most of us had a great time overall; the point of those writings is not to complain, but to reflect.

I find it fascinating to see how the evolution of a piece of software and the metamorphosis of its community are entangled. The amount of politics that a corporately-backed project of this size generates is majestic and has a prominent influence on the outcome of software development.

So, what's next? Well, as far as Ceilometer is concerned, we still have ideas and plans to keep shrinking its footprint to a minimum. We hope that one day Ceilometer will become irrelevant – at least that's what we're trying to achieve, so we don't have anything left to maintain. That mainly depends on how the myriad of OpenStack projects choose to address their metering.

We don't see any future for Panko nor Aodh.

Gnocchi, now blooming outside of OpenStack, is still young and promising. We've plenty of ideas and every new release brings new fancy features. The storage of timeseries at large scale is exciting. Users are happy, and the ecosystem is growing.

We'll see how all of that concludes, but I'm sure it'll be new lessons to learn and write about in six years!

by Julien Danjou at April 19, 2018 11:55 AM

April 17, 2018

RDO Blog

Unit tests on RDO package builds

Unit tests are used to verify that individual units of source code work according to a defined spec. While this may sound complicated to understand, in short it means that we try to verify that each part of our source code works as expected, without having to run the full program they belong to.

All OpenStack projects come with their own set of unit tests, for example this is the unit test folder for the oslo.config project. Those tests are executed when a new patch is proposed for review, to ensure that existing (or new) functionality is not broken with the new code. For example, if you check this review, you can see that one of the CI jobs executed is “openstack-tox-py27”, which runs unit tests using Python 2.7.
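For example, assuming a project follows the standard OpenStack tox layout, the same unit tests can be run locally with something like:

$ git clone https://git.openstack.org/openstack/oslo.config
$ cd oslo.config
$ tox -e py27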

Unit tests in action

How does this translate into the packaging world? As part of a spec file, we can define a %check section, where we add scripts to test the installed code. While this is not a mandatory section in the Fedora packaging guidelines, it is highly recommended, since it provides a good assurance that the code packaged is correct.
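As a minimal sketch (the exact test runner varies per project; stestr is used here purely as an example), a %check section can look something like this:

%check
# Run the project's unit tests during the package build;
# a failing test suite aborts the build
stestr run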

In many cases, RDO packages include this %check section in their specs, and the project’s unit tests are executed when the package is built. This is an example of the unit tests executed for the python-oslo-utils package.

“But why are these tests executed again when packaging?”, you may ask. After all, these same tests are executed by the Zuul gate before being merged. Well, there are quite a few reasons for this:

  • Those unit tests were run with a specific operating system version and a specific package set. Those are probably different from the ones used by RDO, so we need to ensure the project compatibility with those components.
  • The project dependencies are installed in the OpenStack gate using pip, and some versions may differ. This is because OpenStack projects support a range of versions for each dependency, but usually only test with one version. We have seen cases where a project stated support for version x.0 of a library, but then added code that required version x.1. This change would not be noticed by the OpenStack gate, but it would make unit tests fail while packaging.
  • They also allow us to detect issues before they happen in the upstream gate. OpenStack projects use the requirements project to decide which version of their own libraries should be used by other projects. This can lead to some inter-dependency issues, where a change in an Oslo library may uncover a bug in another project, but it is not noticed until the requirements project is updated with a new version of the Oslo library. In the RDO case, we run an RDO Trunk builder using code from the master branch in all projects, which allows us to notify in advance, like in this example bug.
  • They give us an early warning when new dependencies have been added to a project, but they are not in the package spec yet. Since unit tests exercise most of the code, any missing dependency should make them fail.

Due to the way unit tests are executed during a package build, there are some details to keep in mind when defining them. If you as a developer follow them, you will make packagers’ life easier:

  • Do not create unit tests that depend on resources available from the Internet. Most packaging environments do not allow Internet access while the package is being built, so a unit test that depends on resolving an IP address via DNS will fail.

  • Try to keep unit test runtime within reasonable limits. If unit tests for a project take 1 hour to complete, it is likely they will not be executed during packaging, such as here.

  • Do not assume that unit tests will always be executed on a machine with 8 fast cores. We have seen cases of unit tests failing when run on a limited environment or when it takes them more than a certain time to finish.

Now that you know the importance of unit tests for RDO packaging, you can go ahead and make sure we use it on every package. Happy hacking!

by jpena at April 17, 2018 04:49 PM

April 13, 2018

Red Hat Stack

Red Hatters To Present at More Than 50 OpenStack Summit Vancouver Sessions

OpenStack Summit returns to Vancouver, Canada May 21-24, 2018, and Red Hat will be returning as well with as big of a presence as ever. Red Hat will be a headline sponsor of the event, and you’ll have plenty of ways to interact with us during the show.

First, you can hear from our head of engineering and OpenStack Foundation board member, Mark McLoughlin, during the Monday morning Keynote sessions. Mark will be discussing OpenStack’s role in a hybrid cloud world, as well as the importance of OpenStack and Kubernetes integrations. After the keynotes, you’ll want to come by the Red Hat booth in the exhibit hall to score some cool SWAG (it goes quickly), talk with our experts, and check out our product demos. Finally, you’ll have the entire rest of the show to listen to Red Hatters present and co-present on a variety of topics, from specific OpenStack projects, to partner solutions, to OpenStack integrations with Kubernetes, Ansible, Ceph storage and more. These will be delivered via traditional sessions, labs, workshops, and lunch and learns. For a full list of general sessions featuring Red Hatters, see below.

Beyond meeting us at the Red Hat booth or listening to one of us speak in a session or during a keynote, here are the special events we’ll be sponsoring where you can also meet us. If you haven’t registered yet, use our sponsor code: REDHAT10 to get 10% off the list price.

Containers, Kubernetes and OpenShift on OpenStack Hands-on Training
Join the Red Hat’s OpenShift team for a full day of discussion and hands on lab to learn how OpenShift can help you deliver apps even faster on OpenStack.
Date: May 20th, 9:00 am-4:00 pm
Location: Vancouver Convention Centre West – Level Two – Room 218-219
RSVP required

Red Hat and Trilio Evening Social
All are invited to join Red Hat and Trilio for an evening of great food, drinks, and waterfront views of Vancouver Harbour.
When: Monday, May 21st, 7:30-10:30 pm
Location: TapShack Coal Harbour
RSVP required 

Red Hat and Dell: Crafting Your Cloud Reality
Join Red Hat and Dell EMC for drinks and food, and take part in the Red Hat® Cloud Challenge, an immersive virtual reality game.
When: Tuesday, May 22nd, 6:00-9:00 pm
Location: Steamworks Brew Pub
RSVP required

Women of OpenStack Networking Lunch sponsored by Red Hat
Meet with other women for lunch and discuss important topics affecting women in technology and business
Guest speaker: Margaret Dawson, Vice President of Product Marketing, Red Hat
Date: Wednesday, May 23 2018, 12:30-1:50 pm
Location: Vancouver Convention Centre West, Level 2, Room 215-216 
More information

 

Red Hat Training and Certification Lunch and Learn
Topic: Performance Optimization in Red Hat OpenStack Platform
Wednesday, May 23rd, 12:30-1:30 pm
Location: Vancouver Convention Centre West, Level 2, Room 213-214 

RSVP required 

Red Hat Jobs Social

Connect with Red Hatters and discover why working for the open source leader is a future worth exploring. We’ll have food, drinks, good vibes, and a chance to win some awesome swag.
Date: Wednesday, May 23, 6:00-8:00 pm
Location: Rogue Kitchen and Wetbar
RSVP required

Red Hat Sponsored Track – Monday, May 21, Room 202-204

We’ve got a great lineup of speakers on a variety of topics speaking during our sponsored breakout track on Monday, May 21. The speakers and topics are:

Session Speaker Time
Open HPE Telco NFV-Infrastructure platforms with Red Hat OpenStack Mushtaq Ahmed (HPE) 11:35 AM
What’s New in Security for Red Hat OpenStack Platform? Keith Basil 1:30 PM
Is Public Cloud Really Eating OpenStack’s Lunch? Margaret Dawson 2:20 PM
OpenShift on OpenStack and Bare Metal Ramon Acedo Rodriguez 3:10 PM
The Modern Telco is Open Ian Hood 4:20 PM
Cloud Native Applications in a Telco World – How Micro Do You Go? Ron Parker (Affirmed Networks), Azhar Sayeed 5:10 PM

 

Breakout Sessions Featuring Red Hatters

Monday

Session Speaker Time
OpenStackSDKs – Project Update Monty Taylor 1:30 PM
Docs/i18n – Project Onboarding Stephen Finucane, Frank Kloeker (Deutsche Telekom), Ian Y. Choi (Fusetools Korea) 1:30 PM
Linux Containers Internal Lab Scott McCarty 1:30 PM
The Wonders of NUMA, or Why Your High Performance Application Doesn’t Perform Stephen Finucane 2:10 PM
Glance – Project Update Erno Kuvaja 3:10 PM
Call It Real: Virtual GPUs in Nova Sylvain Bauza, Jianhua Wang (Citrix) 3:10 PM
A Unified Approach to Role-Based Access Control Adam Young 3:10 PM
Unlock Big Data Efficiency with CephData Lake Kyle Bader, Yong Fu (Intel), Jian Zhang (Intel), Yuan Zhuo (INTC) 4:20 PM
Storage for Data Platforms Kyle Bader, Uday Boppana 5:20 PM

 

Tuesday

Session Speaker Time
OpenStack with IPv6: Now You Can! Dustin Schoenbrun, Tiago Pasqualini (NetApp), Erlon Cruz (NetApp) 9:00 AM
Integrating Keystone with large-scale centralized authentication Ken Holden, Chris Janiszewski 9:50 AM
Sahara – Project Onboarding Telles Nobrega 11:00 AM
Lower the Barriers: Or How To Make Hassle-Free Open Source Events Sven Michels 11:40 AM
Barbican – Project Update Ade Lee, Dave McCowan (Cisco) 11:50 AM
Glance – Project Onboarding Erno Kuvaja, Brian Rosmaita (Verizon) 11:50 AM
Kuryr – Project Update Daniel Mellado 12:15 PM
Sahara – Project Update Telles Nobrega 1:50 PM
Heat – Project Update Rabi Mishra, Thomas Herve, Rico Lin (EasyStack) 1:50 PM
Superfluidity: One Network To Rule Them All Daniel Mellado, Luis Tomas Bolivar, Irena Berezovsky (Huawei) 3:10 PM
Burnin’ Down the Cloud: Practical Private Cloud Management David Medberry, Steven Travis (Time Warner Cable) 3:30 PM
Infra – Project Onboarding David Moreau-Simard, Clark Boylan (OpenStack Foundation) 3:30 PM
Intro to Kata Containers Components: a Hands-on Lab Sachin Rathee, Sudhir Kethamakka 4:40 PM
Kubernetes Network-policies and Neutron Security Groups – Two Sides of the Same Coin? Daniel Mellado, Eyal Leshem (Huawei) 5:20 PM
How To Survive an OpenStack Cloud Meltdown with Ceph Federico Lucifredi, Sean Cohen, Sebastien Han 5:30 PM
OpenStack Internal Messaging at the Edge: In Depth Evaluation Kenneth Giusti, Matthieu Simonin, Javier Rojas Balderrama 5:30 PM


Wednesday

Session Speaker Time
Barbican – Project Onboarding Ade Lee, Dave McCowan (Cisco) 9:00 AM
Oslo – Project Update Ben Nemec 9:50 AM
Kuryr – Project Onboarding Daniel Mellado, Irena Berezovsky (Huawei) 9:50 AM
How To Work with Adjacent Open Source Communities – User, Developer, Vendor, Board Perspective Mark McLoughlin, Anni Lai (Huawei), Davanum Srinivas (Mirantis), Christopher Price (Ericsson), Gnanavelkandan Kathirvel (AT&T) 11:50 AM
Nova – Project Update Melanie Witt 11:50 AM
TripleO – Project Onboarding Alex Schultz, Emilien Macchi, Dan Prince 11:50 AM
Distributed File Storage in Multi-Tenant Clouds using CephFS Tom Barron, Ramana Raja, Patrick Donnelly 12:20 PM
Lunch & Learn – Performance optimization in Red Hat OpenStack Platform Razique Mahroa 12:30 PM
Cinder Thin Provisioning: a Comprehensive Guide Gorka Eguileor, Tiago Pasqualini (NetApp), Erlon Cruz (NetApp) 1:50 PM
Nova – Project Onboarding Melanie Witt 1:50 PM
Glance’s Power of Image Import Plugins Erno Kuvaja 2:30 PM
Mistral – Project Update Dougal Matthews 3:55 PM
Mistral – Project Onboarding Dougal Matthews, Brad Crochet 4:40 PM
Friendly Coexistence of Virtual Machines and Containers on Kubernetes using KubeVirt Stu Gott, Stephen Gordon 5:30 PM
Intro to Container Security Thomas Cameron 11:50 AM


Thursday

Session Speaker Time
Manila – Project Update Tom Barron 9:00 AM
Oslo – Project Onboarding Ben Nemec, Kenneth Giusti, Jay Bryant (Lenovo) 9:00 AM
Walk Through of an Automated OpenStack Deployment Using Triple-O Coupled with OpenContrail – POC Kumythini Ratnasingham, Brent Roskos, Michael Henkel (Juniper Networks) 9:00 AM
Working Remotely in a Worldwide Community Doug Hellmann, Julia Kreger, Flavio Percoco, Kendall Nelson (OpenStack Foundation), Matthew Oliver (SUSE) 9:50 AM
Manila – Project Onboarding Tom Barron 9:50 AM
Centralized Policy Engine To Enable Multiple OpenStack Deployments for Telco/NFV Bertrand Rault, Marc Bailly (Orange), Ruan He (Orange) 11:00 AM
Kubernetes and OpenStack Unified Networking Using Calico – Hands-on Lab Amol Chobe 11:00 AM
Multi Backend CNI for Building Hybrid Workload Clusters with Kuryr and Kubernetes Daniel Mellado, Irena Berezovsky (Huawei) 11:50 AM
Workshop/Lab: Containerize your Life! Joachim von Thadden 1:50 PM
Root Your OpenStack on a Solid Foundation of Leaf-Spine Architecture! Joe Antkowiak, Ken Holden 2:10 PM
Istio: How To Make Multicloud Applications Real Christian Posta, Chris Hoge (OpenStack Foundation), Steve Drake (Cisco), Lin Sun, Costin Manolache (Google) 2:40 PM
Push Infrastructure to the Edge with Hyperconverged Cloudlets Kevin Jones 3:30 PM
A DevOps State of Mind: Continuous Security with Kubernetes Chris Van Tuin 3:30 PM
OpenStack Upgrades Strategy: the Fast Forward Upgrade Maria Angelica Bracho, Lee Yarwood 4:40 PM
Managing OpenStack with Ansible, a Hands-on Workshop Julio Villarreal Pelegrino, Roger Lopez 4:40 PM

 

We’re looking forward to seeing you there!

 

by Peter Pawelski, Product Marketing Manager, Red Hat OpenStack Platform at April 13, 2018 07:12 PM

April 12, 2018

Julien Danjou

Lessons from OpenStack Telemetry: Incubation

It was mostly around that time in 2012 that I and a couple of fellow open-source enthusiasts started working on Ceilometer, the first piece of software from the OpenStack Telemetry project. Six years have passed since then. I've been thinking about this blog post for several months (even years, maybe), but lacked the time and the hindsight needed to lay out my thoughts properly. In a series of posts, I would like to share my observations about the Ceilometer development history.

To understand the full picture here, I think it is fair to start with a small retrospective on the project. I'll try to keep it short, and it will be unmistakably biased, even if I'll do my best to stay objective – bear with me.

Incubation

Early 2012, I remember discussing with the first Ceilometer developers the right strategy to solve the problem we were trying to address. The company I worked for wanted to run a public cloud, and billing resource usage was at the heart of its strategy. The fact that no component in OpenStack exposed any consumption API was a problem.

We debated how to implement those metering features in the cloud platform. There were two natural solutions: either add some resource accounting and reporting to each OpenStack project, or build a new piece of software on the side to cover for the missing functionality.

At that time there were fewer than a dozen OpenStack projects. Still, the burden of patching every project seemed like an infinite task. Having code reviewed and merged in the most significant projects took several weeks, which, considering our timeline, was a show-stopper. We wanted to go fast.

Pragmatism won, and we started implementing Ceilometer using the features each OpenStack project was offering to help us: very little.

Our first and obvious candidate for usage retrieval was Nova, where Ceilometer aimed to retrieve statistics about virtual machine instance utilization. Nova offered no API to retrieve those data – and still doesn't. Since it was out of the question to wait several months to have such an API exposed, we took the shortcut of polling libvirt, Xen or VMware directly from Ceilometer.

That's precisely how temporary hacks become historical design. Implementing this design broke the basis of the abstraction layer that Nova aims to offer.

As time passed, several leads were followed to mitigate those trade-offs in better ways. But with each development cycle, getting anything merged in OpenStack became harder and harder. It went from patches taking a long time to review, to a long list of requirements before anything could be merged. Soon, you'd have to create a blueprint to track your work and write a full specification linked to that blueprint, with that specification itself being reviewed by a bunch of the so-called core developers. The specification had to be a thorough document covering every aspect of the work, from the problem being solved to the technical details of the implementation. Once the specification was approved, which could take an entire cycle (6 months), you'd have to make sure that the Nova team would make your blueprint a priority. To make sure it was, you would have to fly a few thousand kilometers from home to an OpenStack Summit and orally argue with developers, in a room filled with hundreds of other folks, about the urgency of your feature compared to other blueprints.

An OpenStack design session in Hong-Kong, 2013

Even if you passed all of those ordeals, the code you'd send could be rejected, and you'd get back to updating your specification to shed light on some particular points that confused people. Back to square one.

Nobody wanted to play that game. Not in the Telemetry team at least.

So Ceilometer continued to grow, surfing the OpenStack hype curve. More developers joined the project every cycle – each one with their own list of ideas, features, or requirements cooked up by their in-house product manager.

But many features did not belong in Ceilometer. They should have been in different projects. Ceilometer was the first OpenStack project to pass through the OpenStack Technical Committee incubation process that existed before the rules were relaxed.

This incubation process was uncertain, long, and painful. We had to justify the existence of the project and many of the technical choices that had been made. Where we expected the committee to challenge us on fundamental decisions, such as breaking abstraction layers, it mostly nit-picked about Web frameworks and database storage.

Consequences

The rigidity of the process discouraged anyone from starting a new project for anything related to telemetry. Therefore, everyone went ahead and started dumping their ideas into Ceilometer itself. With more than ten companies interested, friction was high, and the project was at some point pulled apart in all directions. This phenomenon was happening to every OpenStack project anyway.

On the one hand, many contributions brought marvelous pieces of technology to Ceilometer. We implemented several features you still don't find in any other metering system. Dynamically sharded, automatic, horizontally scalable polling? Ceilometer has had that for years, whereas you still can't have it in, e.g., Prometheus.

On the other hand, there were tons of crappy features. Half-baked code merged because somebody needed to ship something. As the project grew further, some of us developers started to feel that this was getting out of control and could be disastrous. The technical debt was growing as fast as the project was.

Several of the technical choices made were definitely bad. The architecture was a mess; the messaging bus was easily overloaded, the storage engine was non-performant, etc. People would come to me (as I was the Project Team Leader at that time) and ask why the REST API needed 20 minutes to reply to an autoscaling request. The willingness to solve everything for everyone was killing Ceilometer. It's around that time that I decided to step out of my role as PTL and started working on Gnocchi to, at least, solve one of our biggest challenges: efficient data storage.

Ceilometer was also suffering from the poor quality of many OpenStack projects. As Ceilometer retrieves data from a dozen other projects, it has to use their interfaces for data retrieval (API calls, notifications) – or sometimes compensate for their lack of any interface. Users were complaining about Ceilometer malfunctioning while the root of the problem was actually on the other side, in the polled project. The polling agent would try to retrieve the list of virtual machines running on Nova, but just listing and retrieving this information required several HTTP requests to Nova. And those basic retrieval requests would overload the Nova API, which offers no genuine interface from which the data could be retrieved in a small number of calls. And it had terrible performance.
From the point of view of the users, the load was generated by Ceilometer. Therefore, Ceilometer was the problem. We had to imagine new ways of circumventing tons of limitations in our sibling projects. That was exhausting.

At its peak, during the Juno and Kilo releases (early 2015), the code size of Ceilometer reached 54k lines of code, and the number of committers reached 100 individuals (20 regulars). We had close to zero happy users, operators hated us, and everybody was wondering what the hell was going on in those developers' minds.

Nonetheless, despite the impediments, most of us had a great time working on Ceilometer. Nothing's ever perfect. I've learned tons of things during that period, which were actually mostly non-technical. Community management, social interactions, human behavior and politics were at the heart of the adventure, offering a great opportunity for self-improvement.

In the next blog post, I will cover what happened in the years that followed that booming period, up until today. Stay tuned!

by Julien Danjou at April 12, 2018 12:50 PM

April 09, 2018

Adam Young

Comparing Keystone and Istio RBAC

To continue my previous investigation into Istio, and to continue the comparison with the comparable parts of OpenStack, I want to dig deeper into how Istio performs RBAC. Specifically, I would love to answer the question: could Istio be used to perform the Role check?

Scoping

Let me reiterate what I've said in the past about scope checking. Oslo-policy performs the scope check deep in the code base, long after Middleware, once the resource has been fetched from the Database. Since we can't do this in Middleware, I think it is safe to say that we can't do this in Istio either. So that part of the check is outside the scope of this discussion.

Istio RBAC Introduction

Let's look at how Istio performs RBAC.

The first thing to compare is the data that is used to represent the requester. In Istio, this is the requestcontext. This is comparable to the Auth-Data that Keystone Middleware populates as a result of a successful token validation. How does Istio populate the requestcontext? My current assumption is that it makes a remote call to Mixer with the authenticated REMOTE_USER name.

What is telling is that, in Istio, you have

      user: source.user | ""
      groups: ""
      properties:
         service: source.service | ""
         namespace: source.namespace | ""

Groups, but no roles. Kubernetes has RBAC and Roles, but they are a late addition to the model. However…

Istio RBAC introduces ServiceRole and ServiceRoleBinding, both of which are defined as Kubernetes CustomResourceDefinition (CRD) objects.

ServiceRole defines a role for access to services in the mesh.
ServiceRoleBinding grants a role to subjects (e.g., a user, a group, a service)

This is interesting. Whereas Keystone requires a user to go to Keystone to get a token that is then associated with a set of role assignments, Istio expands this assignment inside the service.

Keystone Aside: Query Auth Data without Tokens

This is actually not surprising. When looking into Keystone Middleware years ago, in the context of PKI tokens, I realized that we could do exactly the same thing; make a call to Keystone based on the identity, and look up all of the data associated with the token. This means that a user can go from a SAML provider right to the service without first getting a Keystone token.

What this means is that Mixer can return the Roles assigned by Kubernetes as additional parameters in the "Properties" collection. However, with ServiceRole, you would instead get the Service Role Binding list from Mixer and apply it in process.
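
As a rough sketch of what "apply it in process" could look like, here is a small Python illustration; the data shapes used (roleRef, subjects, user, group) are assumptions for the example, not Istio's actual structures:

    # Illustrative sketch only (not Istio code): given the ServiceRoleBinding
    # list fetched from Mixer for this service, work out which roles the
    # calling subject holds.
    def roles_for_subject(bindings, user, groups):
        roles = set()
        for binding in bindings:
            for subject in binding.get('subjects', []):
                if subject.get('user') == user or subject.get('group') in groups:
                    roles.add(binding['roleRef'])
        return roles

    bindings = [
        {'roleRef': 'service-viewer',
         'subjects': [{'user': 'alice'}, {'group': 'qa'}]},
    ]
    print(roles_for_subject(bindings, 'alice', []))  # {'service-viewer'}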

We discussed Service Roles on multiple occasions in Keystone. I liked the idea, but wanted to make sure that we didn't limit the assignments, or even the definitions, to just a service. I could see specific Endpoints varying in their roles even within the same service, and certainly having different Service Role Assignments. I'm not certain if Istio distinguishes between "services" and "different endpoints of the same service" yet…something I need to delve into. However, assuming that it does distinguish, what Istio needs to be able to request is "Give me the set of Role bindings for this specific endpoint."

A history lesson in Endpoint Ids.

It was this last step that was a problem in Keystonemiddleware. An endpoint did not know its own ID, and the provisioning tools really did not like the workflow of

  1. create an endpoint for a service
  2. register endpoint with Keystone
  3. get back the endpoint ID
  4. add endpoint  ID to the config file
  5. restart the service

Even if we went with a URL-based scheme, we would have had this problem.  An obvious (in hindsight) solution would be to pre-generate the IDs as a unique hash, to pre-populate the configuration files, and to post the IDs to Keystone.  These IDs could easily be tagged as a nickname, not even the canonical name of the service.
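
A minimal sketch of that idea, assuming a hypothetical helper rather than any existing Keystone utility: derive a stable ID from the service name and endpoint URL, so the same value can be written into configuration files and posted to Keystone.

    import hashlib

    # Hypothetical helper: a deterministic endpoint ID that both the
    # provisioning tool and Keystone can agree on before the service starts.
    def pregenerate_endpoint_id(service_name, endpoint_url):
        seed = '%s|%s' % (service_name, endpoint_url)
        return hashlib.sha256(seed.encode('utf-8')).hexdigest()[:32]

    print(pregenerate_endpoint_id('compute', 'https://nova.example.com:8774'))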

Istio Initialization

Istio does not have this problem directly, as it knows the name of the service that it is protecting, and can use that to fetch the correct rules.  However, it does point to a chicken-and-egg problem that Istio has to solve: which is created first, the service itself, or the abstraction in Istio to cover it?  Since Kubernetes is going to orchestrate the Service deployment, it can make the sensible call; Istio can cover the service and just reject calls until it is properly configured.

URL Matching Rules

If we look at the Policy enforcement in Nova, we can use the latest “Policy in Code” mechanisms to link from the URL pattern to the Policy rule key, and the key to the actual enforced policy.  For example, to delete a server we can look up the API

And see that it is

/servers/{server_id}

And from the Nova source code:

    policy.DocumentedRuleDefault(
        SERVERS % 'delete',
        RULE_AOO,
        "Delete a server",
        [
            {
                'method': 'DELETE',
                'path': '/servers/{server_id}'
            }
        ]),

With SERVERS % 'delete' expanding, via SERVERS = 'os_compute_api:servers:%s', to os_compute_api:servers:delete.

Digging into OpenStack Policy

Then, assuming you can get your hands on the policy file specific to that Nova server, you could look at the policy for that rule. Nova no longer includes that generated file in the etc directory. But in my local repo I have:
"os_compute_api:servers:delete": "rule:admin_or_owner"

And rule:admin_or_owner expands to "admin_or_owner": "is_admin:True or project_id:%(project_id)s", which does not do a role check at all. The policy.yaml or policy.json file is not guaranteed to exist, in which case you can either use the tool to generate it or read the source code. From the above link we see the Rule is:

RULE_AOO = base.RULE_ADMIN_OR_OWNER

and then we need to look where that is defined.

Let's assume, for the moment, that a Nova deployment has overridden the main rule to implement a custom role called Custodian which has the ability to execute this API. Could Istio match that? It really depends on whether it can match the URL pattern '/servers/{server_id}'.

In ServiceRole, the combination of “namespace”+”services”+”paths”+”methods” defines “how a service (services) is allowed to be accessed”.

So we can match down to the Path level. However, there seems to be no way to tokenize a Path. Thus, while you could set a rule that says a client can call DELETE on a specific instance, or DELETE on /services, or even DELETE on all URLs in the catalog (whether they support that API or not), you could not say that it could call DELETE on all services within a specific Namespace. If the URL were defined like this:

DELETE /services?service_id={someuuid}

Istio would be able to match the service ID in the set of keys.

In order for Istio to match effectively, all it would really need is to identify that a URL ending in /services/feed1234 matches the pattern /services/{service_id}, which is all that the URL pattern matching inside the web servers does.
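
To make that concrete, here is a minimal Python sketch of that kind of tokenized matching (illustrative only; Istio's matcher is not implemented this way):

    import re

    # Turn a tokenized pattern such as /services/{service_id} into a regular
    # expression that matches exactly one path segment per token.
    def pattern_to_regex(pattern):
        return re.compile('^' + re.sub(r'\{[^/]+\}', '[^/]+', pattern) + '$')

    matcher = pattern_to_regex('/services/{service_id}')
    print(bool(matcher.match('/services/feed1234')))       # True
    print(bool(matcher.match('/services/feed1234/logs')))  # False: extra segment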

Istio matching

It looks like paths can have wildcards. Scroll down a bit to the quote:

In addition, we support prefix match and suffix match for all the fields in a rule. For example, you can define a “tester” role that has the following permissions in “default” namespace:

which has the further example:

    - services: ["bookstore.default.svc.cluster.local"]
      paths: ["*/reviews"]
      methods: ["GET"]

Deep URL matching

So, while this is a good start, there are many more complicated URLs in the OpenStack world which are tokenized in the middle: for example, the new API for System role assignments has both the Role ID and the User ID embedded. The Istio match would be limited to matching PUT /v3/system/users/*, which might be OK in this case. But there are cases where a PUT at one level means one role, much more powerful than a PUT deeper in the URL chain.

For example, the base role assignments API itself is much more complex. Assigning a role on a domain uses a URL fragment comparable to the one used to edit the domain-specific configuration file. Both would have to be matched with

       paths: ["/v3/domains/*"]
       methods: ["PUT"]

But assigning a role is a far safer operation than setting a domain-specific config, which is really an administrative-only operation.
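
A quick way to see the conflict, sketched with Python's fnmatch as a stand-in for Istio's wildcard matching (the concrete IDs in the paths are made up for the example):

    import fnmatch

    # Both a role assignment on a domain and a domain-specific config update
    # fall under the same wildcard path, so one rule would cover both.
    paths = [
        '/v3/domains/d1/users/u1/roles/r1',  # assign a role on a domain
        '/v3/domains/d1/config',             # edit the domain-specific config
    ]
    for path in paths:
        print(path, fnmatch.fnmatch(path, '/v3/domains/*'))  # both print True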

However, I had to dig deeply to find this conflict. I suspect that there are ways around it, and comparable conflicts in the catalog.

Conclusion

So, the tentative answer to my question is:

Yes, Istio could perform the Role check part of RBAC for OpenStack.

But it would take some work. Of course. An early step would be to write a Mixer plugin to fetch the auth data from Keystone based on a user. This would require knowing about Federated mappings and how to expand them, plus querying the Role assignments. Oh, and getting the list of Groups for a user. And the project ID needs to be communicated, somehow.

by Adam Young at April 09, 2018 05:55 PM

dmsimard

Scaling ARA to a million Ansible playbooks a month

The OpenStack community runs over 300 000 CI jobs with Ansible every month with the help of the awesome Zuul. It even provides ARA reports for ARA’s integration test jobs in a sort-of nested way. Zuul’s Ansible ends up installing Ansible and ARA. It makes my brain hurt sometimes… but in an awesome way. As a core contributor of the infrastructure team there, I get to witness issues and get a lot of feedback directly from the users.

April 09, 2018 12:00 AM

April 07, 2018

Adam Young

Comparing Istio and Keystone Middleware

One way to learn a new technology is to compare it to what you already know. I’ve heard a lot about Istio, and I don’t really grok it yet, so this post is my attempt to get the ideas solid in my own head, and to spur conversations out there.

I asked the great Google "Why is Istio important" and this was the most interesting response it gave me: "What is Istio and Its Importance to Container Security." So I am going to start there. There are obviously many articles about Istio, and this might not even be the best starting point, but this is the internet: I'm sure I'll be told why something else is better!

Let's start with the definition:

Istio is an intelligent and robust web proxy for traffic within your Kubernetes cluster as well as incoming traffic to your cluster

At first blush, this seems to be nothing like Keystone. However, let's take a look at the software definition of Proxy:

A proxy, in its most general form, is a class functioning as an interface to something else.

In the OpenStack code base, the package python-keystonemiddleware provides a Python class that complies with the WSGI contract and serves as a Proxy to the web application underneath. Keystone Middleware, then, is an analogue to the Istio Proxy in that it performs some of the same functions.
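
As a minimal sketch of that pattern (not the actual keystonemiddleware code), a WSGI proxy class wraps the application, inspects each request, and either rejects it or passes it through with auth data attached:

    # Illustrative WSGI middleware, loosely shaped like keystonemiddleware's
    # auth_token filter; header names and behavior are simplified assumptions.
    class AuthProtocolSketch(object):
        def __init__(self, app):
            self.app = app

        def __call__(self, environ, start_response):
            token = environ.get('HTTP_X_AUTH_TOKEN')
            if not token:
                start_response('401 Unauthorized',
                               [('Content-Type', 'text/plain')])
                return [b'Authentication required']
            # The real middleware validates the token against Keystone and
            # injects the resulting auth data (user, project, roles) here.
            environ['HTTP_X_IDENTITY_STATUS'] = 'Confirmed'
            return self.app(environ, start_response)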

Istio enables you to specify access control rules for web traffic between Kubernetes services

So…Keystone + Oslo-policy serves this role in OpenStack. The Kubernetes central control is a single web server, and thus it can implement access control for all subcomponents in a single process space. OpenStack is distributed, and thus the access control is also distributed. However, due to the way that OpenStack objects are stored, we cannot do the full RBAC enforcement in middleware (much as I would like to). In order to check access to an existing resource object in OpenStack, you have to perform the policy enforcement check after the object has been fetched from the database. That check needs to ensure that the project of the token matches the project of the resource. Since we don't know this information based solely on the URL, we cannot perform it in Middleware.
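
In pseudo-structure (an illustrative sketch, not actual Nova code), the check has to run after the fetch, because only then is the owning project known:

    # Illustrative only: the project check cannot run in middleware because
    # the owner of the resource is only known after the database fetch.
    def delete_server(context, server_id, db, enforcer):
        server = db.get_server(server_id)            # fetch first
        target = {'project_id': server.project_id}   # now the owner is known
        creds = {'project_id': context.project_id, 'roles': context.roles}
        if not enforcer.enforce('os_compute_api:servers:delete', target, creds):
            raise Exception('Policy does not allow deleting this server')
        db.delete_server(server_id)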

What we can perform in Middleware, and what I presented on last year at the OpenStack Summit, is the ability to perform the Role check portion of RBAC in middleware, but defer the project check until later. While we are not going to be doing exactly that, we are pursuing a related effort for application credentials. However, that requires a remote call to a database to create those rules. Istio is not going to have that leeway. I think? Please correct me if I am wrong.

I don’t think Istio could perform this level of deep check, either. It requires parsing the URL and knowing the semantics of the segments, and having the ability to correlate them. That is a lot to ask.

Istio enables you to seamlessly enforce encryption and authentication between nodes

Keystone certainly does not do this. Nothing enforces TLS between services in OpenStack. Getting TLS everywhere in TripleO was a huge effort, and it still needs to be explicitly enabled. OpenStack does not provide a CA. TripleO, when deployed, depends on the Dogtag instance from the FreeIPA server to manage certificates.

By the time Keystone Middleware is executed, the TLS layer would be a distant memory.

Keystoneauth1 is the client piece from Keystone, and it could be responsible for making sure that only HTTPS is supported, but it does not do that today.

Istio collects traffic logs, and then parses and presents them for you:

Keystone does not do this, although it does produce some essential log entries about access.

At this point, I am wondering if Istio would be a viable complement to the security story in OpenStack. My understanding thus far is that it would. It might conflict a minor bit with the RBAC enforcement, but I suspect that is not the key piece of what it is doing, and conflict there could be avoided.

Please post your comments, as I would really like to get to know this better, and we can share the discussion with the larger community.

by Adam Young at April 07, 2018 10:00 PM

March 29, 2018

Red Hat Stack

Heading to Red Hat Summit? Here’s how you can learn more about OpenStack.

Red Hat Summit is just around the corner, and we’re excited to share all the ways in which you can connect with OpenStack® and learn more about this powerful cloud infrastructure technology. If you’re lucky enough to be headed to the event in San Francisco, May 8-10, we’re looking forward to seeing you. If you can’t go, fear not, there will be ways to see some of what’s going on there remotely. And if you’re undecided, what are you waiting for? Register today!

From the time Red Hat Summit begins you can find hands-on labs, general sessions, panel discussions, demos in our partner pavilion (Hybrid Cloud section), and more throughout the week. You’ll also hear from Red Hat OpenStack Platform customers on their successes during some of the keynote presentations. Need an open, massively scalable storage solution for your cloud infrastructure? We’ll also have sessions dedicated to our Red Hat Ceph Storage product.

Red Hat Summit has grown significantly over the years, and this year we’ll be holding activities in both the Moscone South and Moscone West. And with all of the OpenStack sessions and labs happening, it may seem daunting to make it to everything, especially if you need to transition from one building to the next. But worry not. Our good friends from Red Hat Virtualization will be sponsoring pedicabs to help transport you between the buildings.   

Here’s our list of sessions for OpenStack and Ceph at Red Hat Summit:

Tuesday

Session Speaker Time / Location
Lab – Deploy a containerized HCI IaaS with OpenStack and Ceph Rhys Oxenham, Greg Charot, Sebastien Han, John Fulton 10:00 am / Moscone South, room 156
Ironic, VM operability combined with bare-metal performances Cedric Morandin (Amadeus) 10:30 am / Moscone West, room 2006
Lab – Hands-on with OpenStack and OpenDaylight SDN Rhys Oxenham, Nir Yechiel, Andre Fredette, Tim Rozet 1:00 pm / Moscone South, room 158
Panel – OpenStack use cases: how business succeeds with OpenStack featuring Cisco, IAG, Turkcell, and Duke Health August Simonelli, Pete Pawelski 3:30 pm / Moscone West, room 2007
Lab – Understanding containerized Red Hat OpenStack Platform Ian Pilcher, Greg Charot 4:00 pm / Moscone South, room 153
Red Hat OpenStack Platform: the road ahead Nick Barcet, Mark McLoughlin 4:30 pm / Moscone West, room 2007


Wednesday

Session Speaker Time / Location
Lab – First time hands-on with Red Hat OpenStack Platform Rhys Oxenham, Jacob Liberman 10:00 am / Moscone South, room 158
Red Hat Ceph Storage roadmap: past, present, and future Neil Levine 10:30 am / Moscone West, room 2024
Optimize Ceph object storage for production in multisite clouds Michael Hackett, John Wilkins 11:45 am / Moscone South, room 208
Production-ready NFV at Telecom Italia (TIM) Fabrizio Pezzella, Matteo Bernacchi, Antonio Gianfreda (Telecom Italia) 11:45 am / Moscone West, room 2002
Workload portability using Red Hat CloudForms and Ansible Bill Helgeson, Jason Woods, Marco Berube 11:45 am / Moscone West, room 2009
Delivering Red Hat OpenShift at ease on Red Hat OpenStack Platform and Red Hat Virtualization Francesco Vollero, Natale Vinto 3:30 pm / Moscone South, room 206
The future of storage and how it is shaping our roadmap Sage Weil 3:30 pm / Moscone West, room 2020
Lab – Hands on with Red Hat OpenStack Platform Rhys Oxenham, Jacob Liberman 4:00 pm / Moscone South, room 153
OpenStack and OpenShift networking integration Russell Bryant, Antoni Segura Puimedon, and Jose Maria Ruesta (BBVA) 4:30 pm / Moscone West, room 2011

 

Thursday

Session Speaker Time / Location
Workshop – OpenStack roadmap in action Rhys Oxenham 10:45 am / Moscone South, room 214
Medical image processing with OpenShift and OpenStack Daniel McPherson, Ata Turk (Boston University), Rudolph Pienaar (Boston Children’s Hospital) 11:15 am / Moscone West, room 2006
Scalable application platform on Ceph, OpenStack, and Ansible Keith Hagberg (Fidelity), Senthivelrajan Lakshmanan (Fidelity), Michael Pagan, Sacha Dubois, Alexander Brovman (Solera Holdings) 1:00 pm / Moscone West, room 2007
Red Hat CloudForms: turbocharge your OpenStack Kevin Jones, Jason Ritenour 2:00 pm / Moscone West, room 201
What’s new in security for Red Hat OpenStack Platform? Nathan Kinder, Keith Basil 2:00 pm / Moscone West, room 2003
Ceph Object Storage for Apache Spark data platforms Kyle Bader, Mengmeng Liu 2:00 pm / Moscone South, room 207
OpenStack on FlexPod – like peanut butter and jelly Guil Barros, Amit Borulkar NetApp 3:00 pm / Moscone West, room 2009

 

Hope to see you there!

 

by Peter Pawelski, Product Marketing Manager, Red Hat OpenStack Platform at March 29, 2018 02:00 PM