Essential DevOps Skills


Anyone involved in hiring DevOps engineers soon realizes that it is hard to find candidates who have all the skills listed in the job description. Most experienced applicants have deep knowledge of a few tools, and that might not make an exact match. So you look for people with the essential skills, intending to train them on the job. But what are those essential skills?

For details, read my recent article on this topic, Essential DevOps Skills.

Posted in DevOps | Leave a comment

DevOps Stack on a Shoestring Budget


There are many applications and tools DevOps engineers use on a daily basis. But without some of the essential apps it would be hard to develop and roll out automation tools and processes. Any attempt at kickstarting a DevOps culture should include plans to make those apps available to the team as core infrastructure.

What if such an exercise began with a list of solid, battle-tested open-source tools that won’t cost you a dime in license fees? You might end up with a different set of tools that fits your budget and comfort level, but make such decisions only after closely looking at the tools listed here. Please see the complete list in a recent article I contributed to

I am open to suggestions and changes to the list if you can make a good case for your favorite tool. I admit upfront that the selection of these tools is heavily influenced by my hands-on experience using them successfully in various DevOps projects.

Posted in DevOps, Monitoring, Ansible | Leave a comment

Live Tail – Loggly’s “tail -f”


Log management solutions usually provide a web UI for searching logs, along with a query language for filtering the information to suit specific needs. Powerful as those methods are, many sysadmins and operations engineers miss being able to log into a problem machine and tail the application logs for clues during an outage. Loggly’s Live Tail addresses that need: with Live Tail, you can tail the aggregated log stream and filter it much as you would in a UNIX shell.
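The shell workflow Live Tail emulates can be sketched like this (a generic UNIX example, not Loggly’s own tooling; the log path and messages are made up for illustration):

```shell
# Simulate an application log, then filter for errors -- the classic
# "log into the box and tail the log" workflow during an outage.
LOG=$(mktemp)
printf 'INFO starting up\nERROR db timeout\nINFO retrying\n' >> "$LOG"

# During a live incident you would use `tail -f "$LOG"`; here we take the
# last few lines non-interactively so the example terminates.
tail -n 3 "$LOG" | grep ERROR

rm -f "$LOG"
```

With a live stream, the same `| grep` (or `awk`, `cut`, and friends) applies to `tail -f` output, which is the filtering model Live Tail brings to aggregated logs.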

For details, see a recent piece I wrote for Loggly here. It also provides a code sample that explains how easily diverse apps can be integrated with the popular Cloud-based monitoring platform Datadog.

Posted in Monitoring, Uncategorized | Leave a comment

Proactive Monitoring

Most DevOps discussions center around automated provisioning of infrastructure, or testing, packaging, and deploying code in an automated fashion. However, stinging site-down issues will eventually force any nascent production engineering team to make monitoring a high priority.

This article I recently published is a comprehensive look at all kinds of monitoring requirements in production, and it should help you roll out monitoring proactively. Read on and let me know your feedback.

Posted in Monitoring, Uncategorized | Leave a comment

DevOps Defined

It is easy to find attempts to define DevOps, but hardly any of them are comprehensive. And that is fine, because the DevOps space is vast and there is no need for a definition that is theoretically complete. Treating infrastructure, platform, and monitoring as code, just as the application always has been, is the essence of the DevOps movement.

The link to my original LinkedIn posting is provided here:

Posted in Uncategorized | Leave a comment

A Checklist to Build DevOps Organization

With the increased use of the Cloud as infrastructure and the need to run applications on it 24×7, the DevOps movement is becoming the mantra for automation that can help companies scale production up (and, if needed, down). Link to the original posting on LinkedIn:

Posted in DevOps | Leave a comment

Putting Puppet to work

Now that we know how to set up Puppet in a network environment, it is time to build an application environment with it. For that, we will take a simple but common requirement: bring up an Apache server and install a web application. The environment thus set up should have the following software bits and settings:

– add user wwwu and group wwwg
– install Apache
– deploy a few web pages, packaged in a tarball, under the document root, and set ownership of those files to wwwu:wwwg
– start Apache
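Each of those steps will live in its own Puppet module. As a sketch, the module skeleton looks like this (the actual location depends on the modulepath in your puppet.conf, commonly under /etc/puppet):

```shell
# Create one module per deployment step; each module's entry point
# lives at modules/<name>/manifests/init.pp
mkdir -p modules/addgroup/manifests \
         modules/adduser/manifests \
         modules/install_apache/manifests \
         modules/install_webapp/manifests

# List the manifests directories just created
ls -d modules/*/manifests
```

The init.pp files shown below go into these directories.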

The deployment steps are implemented in classes, and the classes are applied to the nodes that host the web application.

node 'app-node1.domain','app-node2.domain' {
  include addgroup
  include adduser
  include install_apache
  include install_webapp
}

Take a look at how the requirements are implemented in classes.

The addgroup class makes sure that the group “wwwg” is present on the host, and if not, it is created:

$ cat modules/addgroup/manifests/init.pp

class addgroup {
  group { "wwwg":
    ensure => present,
  }
}

The adduser class makes sure that the user “wwwu” is present on the host, and if not, creates it and sets the group, home directory and default shell of the user account:

$ cat modules/adduser/manifests/init.pp

class adduser {
  user { "wwwu":
    ensure     => present,
    gid        => 'wwwg',
    shell      => '/usr/local/bin/bash',
    home       => '/home/wwwu',
    managehome => true,
  }
}

The install_apache class installs the Apache package and ensures the httpd service is enabled and running:

$ cat modules/install_apache/manifests/init.pp

class install_apache {
  package { 'httpd':
    ensure => present,
  }

  service { 'httpd':
    ensure  => running,
    enable  => true,
    require => Package['httpd'],
  }
}

The install_webapp class stages the web pages and runs the install script. The installation is done in multiple steps, as below:

– creates the directory /etc/staging for Puppet to push the webapp.gz archive to
– pushes the archive from the Puppet server’s /etc/puppet/files to the app node’s /etc/staging
– pushes the install script from server:/etc/puppet/files to app-node:/etc/staging
– completes the installation by executing the install script in app-node:/etc/staging/
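The puppet:/// URLs used below assume a [files] mount on the Puppet master’s file server. A minimal sketch of that mount, assuming the conventional /etc/puppet paths (tighten the allow rule for production):

```
# /etc/puppet/fileserver.conf -- backs the puppet:///files/... URLs
[files]
  path /etc/puppet/files
  allow *
```

Anything dropped into /etc/puppet/files on the master is then available to agents as puppet:///files/<name>.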

$ cat modules/install_webapp/manifests/init.pp
class install_webapp {
  file { "/etc/staging":
    ensure => "directory",
    owner  => "wwwu",
    group  => "wwwg",
  }

  file { "/etc/staging/webapp.gz":
    source => "puppet:///files/webapp.gz",
  }

  file { "/etc/staging/":
    mode   => '0775',
    source => "puppet:///files/",
  }

  exec { "install_webapp":
    command => '/etc/staging/',
  }
}

The webapp archive contains two static web pages:

$ tar tzf /etc/puppet/files/webapp.gz

Those files are deployed by the install script, which does the following:

– stops the Apache service
– unzips the archive in the staging area
– moves the HTML files under the Apache document root, so that they are accessible from a browser
– sets ownership on the deployed files
– removes the archive from the staging area
– starts the Apache service

$ cat /etc/staging/

#!/bin/sh
service httpd stop
cd /etc/staging/
tar xzf webapp.gz
mv *.html /var/www/html/
chown wwwu:wwwg /var/www/html/*.html
rm /etc/staging/webapp.gz
service httpd start

In the current example, it might look like overkill to go through these steps to deploy two HTML files. But in a real-life production environment, you need post-install steps to get a web application configured and working as designed, and such changes can be encapsulated in a well-written script, eliminating manual post-deployment steps.
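One caveat worth noting, offered as a sketch rather than a drop-in fix: the exec resource above runs the install script on every agent run. A common Puppet idiom is to make the exec refresh-only, so it fires only when the staged archive actually changes:

```
# Hypothetical refinement (not part of the original manifest): run the
# install script only when Puppet re-stages the archive.
exec { "install_webapp":
  command     => '/etc/staging/',
  refreshonly => true,
  subscribe   => File['/etc/staging/webapp.gz'],
}
```

Since this script deletes webapp.gz, Puppet will re-push the file on the next run; an alternative is a `creates` guard pointing at a marker file the script writes.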

To test this configuration, make sure that the targeted application nodes are configured as Puppet agents, as explained in an earlier post.

By running the agent on the application nodes, the deployment happens immediately (or you can wait for the duration set by runinterval):

$ sudo puppet agent --test
Info: Caching certificate for app-node.domain
Info: Caching certificate_revocation_list for ca
Info: Retrieving plugin
Info: Caching catalog for app-node.domain
Info: Applying configuration version '1374796394'
Notice: /Group[wwwg]/ensure: created

Notice: /Stage[main]/Install_apache/Package[httpd]/ensure: created
Notice: /Stage[main]/Install_apache/Service[httpd]/ensure: ensure changed 'stopped' to 'running'
Info: /Stage[main]/Install_apache/Service[httpd]: Unscheduling refresh on Service[httpd]
Notice: /User[wwwu]/ensure: created
Notice: /Stage[main]/Install_webapp/File[/etc/staging]/ensure: created
Notice: /Stage[main]/Install_webapp/File[/etc/staging/]/ensure: defined content as '{md5}5d828b4e08395f41386c276123a61410'
Notice: /Stage[main]/Install_webapp/Exec[install_webapp]/returns: executed successfully
Notice: /Stage[main]/Install_webapp/File[/etc/staging/webapp.gz]/ensure: defined content as '{md5}e0ffcbbfa3e690923fd477b29bd45662'
Notice: Finished catalog run in 5.78 seconds

Verify that the files have been deployed as designed:

$ grep wwwg /etc/group

$ grep wwwu /etc/passwd

$ ls -al /etc/staging
drwxr-xr-x 2 wwwu wwwg 4096 Jul 26 02:28 .
drwxr-xr-x 110 root root 12288 Jul 25 23:58 ..
-rwxrwxr-x 1 root root 186 Jul 25 23:58
-rw-r--r-- 1 root root 202 Jul 26 02:28 webapp.gz

$ ls -al /var/www/html
-rw-r--r-- 1 wwwu wwwg 39 Jul 25 08:45 index.html
-rw-r--r-- 1 wwwu wwwg 39 Jul 25 08:46 test.html

Verify that Apache is running and serving the HTML files installed by Puppet:

$ sudo service httpd status
httpd (pid 1808) is running...

$ curl http://localhost/test.html
Hello World!
Posted in Automation, DevOps, Puppet | Leave a comment