Learning from GitLab: Making Production Obvious

We recently took some time to take stock of our practices and procedures following the outage at GitLab. The GitLab outage started when a developer made the easy mistake of entering a command on the wrong server. Unfortunately, that server was part of their production environment. If this could happen to a company like GitLab, could it happen to us? What could we do to avoid falling into the same trap?

It’s easy to have many terminal windows open and forget which one belongs to which environment. We wanted to address this problem with a simple solution that made it painfully obvious that a developer was about to run a command in production. At TrackJS, we use Ansible to automate our infrastructure configuration. To make these changes, we wrote some small Ansible snippets for both Linux and Windows.

Linux

A large part of the TrackJS infrastructure runs on Linux. To make it really clear we are SSH’ed to production, we made the command prompt red! In Ubuntu, this is accomplished by modifying the .bashrc file in the user’s home directory.

The .bashrc file builds an environment variable named PS1. This variable defines what the command prompt will look like. Any color-enabled console will understand ANSI escape codes, so we added them to our prompt. To keep the change simple, we wrapped the existing PS1 value with color escapes. Wrapping the existing value also let us make the change at the end of the file, which came in handy once we automated the change with Ansible. The begin color flag for red is \e[0;31m and the end flag is \e[m; whatever is in between the two will be displayed in the specified color. Putting that together with the prompt variable gave us this at the end of .bashrc:


PS1="\e[0;31m$PS1\e[m"

After reloading bash, our prompt looks like:

Linux command prompt screenshot. That red is HOT!
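
One refinement worth considering, which is not part of the change described above: bash counts raw escape codes toward the prompt width unless they are enclosed in \[ and \], and that can cause odd line wrapping on long commands. The sketch below marks the color codes as non-printing. The red_prompt helper and the sample prompt string are hypothetical names used only for illustration.

```shell
# Hypothetical helper, not part of the original .bashrc change.
# Wraps a prompt string in red and marks the color codes as non-printing
# with \[ and \] so bash can work out the visible prompt width.
red_prompt() {
  # %s prints the argument literally, leaving the escapes for bash to expand
  printf '%s' "\[\e[0;31m\]$1\[\e[m\]"
}

# Example: build a red user@host:dir prompt
PS1="$(red_prompt '\u@\h:\w\$ ')"
```

Wrapping an existing prompt works the same way: pass "$PS1" to the helper instead of a fresh prompt string.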

Next, we set this up in Ansible. All of our Linux playbooks reference a shared role named “linux-common”. This gives us a nice place to put configuration we want everywhere. We wanted all of production to have this behavior, so we put our new change there.

The usual approach to modifying a file is to copy the whole .bashrc into Ansible as a file or template. We did not want to manage the whole file if it was not necessary. Again, in the name of simplicity, we decided to manage only the part we cared about. To accomplish this, we used a module named blockinfile, which was added in Ansible 2. The module lets us put a block of text anywhere in an existing file. One of the built-in options places the block at the end of the file. This is quick, easy, and exactly what we want!

But wait, we run the same playbooks against both development and production servers! Lucky for us, all of our servers have an Ansible variable named “env”. For production servers, this var is set as: env: prd. We used this variable to apply our change only where we wanted it. Put together, we got a single task:


- name: make command prompt red in prod
  blockinfile:
    # Modify the .bashrc file in the user's home directory
    dest: /home/your_user/.bashrc
    # Tell Ansible to make a backup copy of the file before modifying it, just in case!
    backup: yes
    # Add our change to the end of the file
    insertafter: EOF
    # A YAML multi-line text block. This will end up in .bashrc
    block: |
      # Make command prompt RED
      # Take existing PS1 command prompt value and wrap it in red color formatting
      PS1="\e[0;31m$PS1\e[m"
  # Only do this in production
  when: env is defined and env == "prd"
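
To make the behavior of the task easier to picture, here is a rough shell stand-in for what it does on a server. It is only a sketch: the add_red_prompt function and its arguments are hypothetical, and it only approximates blockinfile, which wraps its text in BEGIN/END marker comments by default and would also update the block if its contents changed. This version only appends the block once.

```shell
# Hypothetical stand-in for the Ansible task, for illustration only.
# Appends the red-prompt block to a bashrc file once, and only in production.
add_red_prompt() {
  env_name="$1"   # e.g. "prd", mirroring the "env" variable above
  bashrc="$2"     # path to the .bashrc file to modify

  # Only do this in production
  [ "$env_name" = "prd" ] || return 0

  # Skip if the block is already present, so reruns change nothing
  grep -q '^# BEGIN ANSIBLE MANAGED BLOCK$' "$bashrc" 2>/dev/null && return 0

  # Quoted heredoc delimiter keeps $PS1 and the escapes literal in the file
  cat >> "$bashrc" <<'EOF'
# BEGIN ANSIBLE MANAGED BLOCK
# Make command prompt RED
PS1="\e[0;31m$PS1\e[m"
# END ANSIBLE MANAGED BLOCK
EOF
}
```

Calling add_red_prompt prd /home/your_user/.bashrc twice leaves a single block in the file, while any other environment name leaves the file untouched.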

Windows

We use Windows for some parts of our application and occasionally connect to our Windows servers using Remote Desktop (RDP). We wanted to modify the Windows UI to make production just as obvious as in Linux. We could have made the background red, but the background can be hidden behind fullscreen windows. Instead, we made the Windows UI chrome red!

Windows Server disables many of the UI niceties, such as customizable colors, that are found in desktop versions of Windows. Most of the registry settings still exist, but must be modified directly. We modified the ColorizationColor value under the key HKCU:\Software\Microsoft\Windows\DWM to tint the chrome in the Windows UI. The default value was c055c9ed in hex, which is an alpha channel (c0) followed by a hex color (55c9ed). We changed the color to a blatant red, dd0000, and kept the same alpha value, which gave us c0dd0000. After we logged out and back in, we saw:

Windows UI Screenshot. The red, it burns.

Next, we automated it with Ansible. Our change was placed in a role named “windows-common”, which serves the same purpose as our “linux-common” role. We used Ansible’s built-in win_regedit module to modify the registry. Finally, we converted the value from hex to decimal so that it was compatible with older Ansible versions. We used this along with our environment check to produce the task:


- name: make windows ui red in prod
  win_regedit:
    # Modify this registry path: HKCU:\Software\Microsoft\Windows\DWM\ColorizationColor
    key: HKCU:\Software\Microsoft\Windows\DWM
    value: ColorizationColor
    # c0dd0000 == 3235708928 in decimal.
    # In newer Ansible versions, this would work too: "hex:c0,dd,00,00"
    data: 3235708928
    datatype: dword
  # Only do this in production
  when: env is defined and env == "prd"
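
The hex-to-decimal conversion used in the task can be checked from any shell. The snippet below is a small sketch; the argb_to_decimal function name is made up for this example. It joins the alpha byte and the color into one 32-bit hex value and prints it as a decimal number.

```shell
# Hypothetical helper: combine an alpha byte and an RGB color (both hex,
# without a prefix) and print the resulting 32-bit value in decimal.
argb_to_decimal() {
  alpha="$1"   # e.g. c0
  color="$2"   # e.g. dd0000
  printf '%d\n' "0x${alpha}${color}"
}

argb_to_decimal c0 dd0000   # prints 3235708928
```

Going the other way, printf '%08x\n' 3235708928 prints c0dd0000, which is a quick way to confirm what a decimal value in a playbook stands for.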

One Less Foot-Gun

Now we have more red than we know what to do with! With these one-task Ansible snippets, it is very obvious whether we are on a production or a test server, and one small avenue to disaster is harder to travel down. Hopefully, you can put this to good use too.

There is a lot more to learn from GitLab and we’re looking at more ways to make TrackJS even more stable and reliable. As we find ways, we’ll be sure to share them with you.
