Postmortem of a 500 internal server error!

Katherine Soto
Feb 26, 2022

On 02–15–2022 PDT at midnight, Holberton School released the Web Stack Debugging #3 project to cohort 15. We were tasked with finding the cause of the problem by the end of 02–17–2022 PDT. This is the postmortem of that bug.

Issue Summary — service interruption

A WordPress website, running on a LAMP stack, was returning a 500 status code to all GET requests. The website itself is a simple HTML page, but a problem with MySQL or PHP can still take the whole site down.

WordPress is a very popular tool: it lets you run blogs, portfolios, e-commerce sites and company websites. It powers 26% of the web, so there is a fair chance that you will end up working with it at some point in your career.

WordPress usually runs on LAMP (Linux, Apache, MySQL, and PHP), which is a very widely used set of tools.

Timeline

  • 02–15–2022, 12:00 am PDT, Project released.
  • 02–15–2022, 07:00 pm, Began working on the project with Smith Flores, Shannel Bejarano & Andrea, using strace to check every running process by PID.
  • 02–15–2022, 07:30 pm, Error logging enabled; a ‘no such file’ error is seen.
  • 02–15–2022, 07:40 pm, Typo fixed manually; requests return a 200 status code.
  • 02–15–2022, 08:00 pm, Rough draft of Puppet code written.
  • 02–15–2022, 08:20 pm PDT, Puppet code finalized.

The bug: error 500

How we found the error and fixed it!

First, we ran the command <ps auxf> to see which processes were running on the server at that moment.

After that, we started debugging each process by its PID with the command <strace -p>. At the same time, in another terminal, we sent a request to the server's IP with the command <curl> and watched what showed up in the trace. In process 771, the trace showed a ‘no such file’ error.
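The two-terminal routine above can be sketched as a pair of small shell helpers. This is only a sketch: the function names and the default URL are assumptions made for illustration, and the real session was typed by hand rather than scripted.

```shell
# Hypothetical helpers; the names and the default URL are assumptions.

# Print only the HTTP status code that a GET request returns.
#   -s silences progress output, -o /dev/null discards the body,
#   -w '%{http_code}' prints the status code after the transfer.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "${1:-http://localhost/}"
}

# Attach strace to a running process (usually needs root) and keep
# only the trace lines that report a missing file.
# strace writes its trace to stderr, hence the 2>&1.
trace_missing_files() {
    strace -f -p "$1" 2>&1 | grep 'No such file'
}

# Terminal 1:  trace_missing_files 771
# Terminal 2:  http_status http://localhost/
```

Filtering the trace for 'No such file' keeps the output short enough to read while the request is being handled.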

We read through the error in the trace and noticed that a single file was the origin of the whole error: a file with a “.phpp” extension.

Finally, we used the command <grep> to find the line that referenced “*.phpp”. We found it and fixed it.

The file that we found was this: ‘DIR=/var/www/html/wp-settings.php’
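The search step can be sketched like this. The function name is hypothetical, and the default directory is assumed from the path of the file we found.

```shell
# Hypothetical helper: list every line under a directory that mentions
# a ".phpp" file, printed as path:line-number:matching-text.
#   -r searches the directory recursively, -n adds the line number.
find_phpp_refs() {
    grep -rn '\.phpp' "${1:-/var/www/html}"
}

# Example: find_phpp_refs /var/www/html
```

The escaped dot makes grep match a literal ".phpp" instead of any character followed by "phpp".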

We replaced the text ‘OLD=phpp’ with the text ‘NEW=php’, and that fixed the problem.

After implementing the solution, a GET request returns a normal HTML page with a status code of 200.

The script

Finally, we wrote a Puppet script to fix this problem automatically in case the error shows up again, and also to repair the server that the checker uses ;)!
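The Puppet code itself is not reproduced in this post. As a rough idea of the kind of command such a script could wrap, here is a shell sketch of the same replacement. It is an assumption for illustration, not the actual manifest, and the function name is hypothetical.

```shell
# Hypothetical helper: replace the "phpp" typo with "php" in one file.
# It only touches the file when the typo is present, so it is safe to
# run more than once.
fix_phpp_typo() {
    file="${1:-/var/www/html/wp-settings.php}"
    if grep -q 'phpp' "$file"; then
        # -i.bak edits the file in place and keeps the original
        # next to it as "$file.bak".
        sed -i.bak 's/phpp/php/g' "$file"
    fi
}

# Example: fix_phpp_typo /var/www/html/wp-settings.php
```

Checking with grep first means a second run changes nothing, which is the behaviour you want from an automated fix. The backup copy lets you undo the change, and you can delete it once the site is confirmed healthy.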

Prevention

This is a very common error to find in production or dev, so most likely a developer was looking at the file and changed it by accident! We can prevent this error by saving our configuration scripts and configuration docs, and by keeping a record of logins on this server so we can check which files the developers modified. Lastly, we should do more testing and use a monitoring tool like Datadog!

I hope you enjoyed my first postmortem write-up of a project that I worked on!
