24-Feb-2021 — wolfgang
One day we got an email from one of our customers:
“I cannot retrieve some document from the database anymore.”
We tried it and, true enough, our application just showed an unexpected error.
So we tried to understand what was wrong:
- everything else worked perfectly fine
- according to logs, the database was still working fine a few minutes before
- someone changed the title of the document and it stopped working
- we could also retrieve the document without its history
- there were no errors in our server logs
- the client application got an empty response instead of the data
We didn’t quite know how to debug this, so the next thing we did was write a small Java application to simulate the client app, and it worked without a problem. The same REST call also worked fine when we fetched the document with curl.
Then we tried different browsers and Postman (an app which “simulates” browsers doing REST calls)
-> they all failed.
So what is the difference between Java / command line tools and browsers?
After sleeping for what was left of the night, the only other thing that came to mind was that modern browsers ask the web server to send the data compressed (via the Accept-Encoding request header). Command line tools don’t do this by default.
To verify that compression was the cause, we turned it off on the server, and everything worked just fine!
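The browser versus command line difference can be reproduced from a script by sending the same request twice, once asking for gzip and once not. The following is only a minimal sketch, not the tooling we actually used: the function names are made up for illustration, and any URL you pass in stands in for your own REST endpoint.

```python
import gzip
import urllib.request
from typing import Optional


def build_request(url: str, accept_gzip: bool) -> urllib.request.Request:
    """Build a GET request that asks for gzip (like a browser) or for no compression."""
    encoding = "gzip" if accept_gzip else "identity"
    return urllib.request.Request(url, headers={"Accept-Encoding": encoding})


def decode_body(raw: bytes, content_encoding: Optional[str]) -> bytes:
    """Undo gzip if the server says it applied it; otherwise return the bytes unchanged."""
    if content_encoding and content_encoding.strip().lower() == "gzip":
        return gzip.decompress(raw)
    return raw


def fetch(url: str, accept_gzip: bool) -> bytes:
    """Fetch the URL and return the decoded response body."""
    with urllib.request.urlopen(build_request(url, accept_gzip), timeout=10) as resp:
        return decode_body(resp.read(), resp.headers.get("Content-Encoding"))


def responses_differ(url: str) -> bool:
    """True if the compressed and uncompressed responses do not carry the same data."""
    return fetch(url, accept_gzip=True) != fetch(url, accept_gzip=False)
```

If `responses_differ` returns True for one document but not for others, the compression step on the server is a likely suspect.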
Now that we knew that the compression failed, we tried to figure out why:
-> We changed the compression level from the default (6) to other values and these also worked fine.
We examined gzip a bit deeper and concluded that we had found a bug in gzip, a tool used by millions of developers: apparently gzip creates a wrong (empty) result when compressing one specific input…
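A check along these lines compresses a payload at every level and confirms that it decompresses back to the original. This is a sketch under assumptions: it uses the gzip module from Python's standard library, not the implementation inside our server, so it would not necessarily reproduce the bug we hit. The function name is hypothetical.

```python
import gzip
import zlib
from typing import Iterable, List


def failing_levels(payload: bytes, levels: Iterable[int] = range(0, 10)) -> List[int]:
    """Return the compression levels at which a gzip round trip does not restore the payload."""
    bad = []
    for level in levels:
        try:
            compressed = gzip.compress(payload, compresslevel=level)
            restored = gzip.decompress(compressed)
        except (OSError, EOFError, zlib.error):
            # A stream that cannot be read back counts as a failure too.
            bad.append(level)
            continue
        if restored != payload:
            bad.append(level)
    return bad
```

Running this against the exact bytes of the failing document is a cheap way to tell whether the problem is in the compressor itself or somewhere else in the chain.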
This example shows how even very unexpected things can go wrong.
In medical device software, the answer to this is called SOUP (Software of Unknown Provenance).
We are not making medical device software, but we still document our SOUP and do a risk analysis. But that one we did not catch.
In this case the compression algorithm (gzip) is just a tiny part of a bigger entity (Tomcat), which is very commonly used. So how do you catch this kind of unexpected failure?
This is indeed very difficult. Here are our recommendations:
- know your tech stack (don’t just use some libraries you don’t really understand)
- document it in as much detail as possible (all the SOUP used)
- do your homework and check all the SOUP for published defects or security issues: read them and check if they affect you
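The recommendations above could be captured in a small inventory that also records components hidden inside other components, such as gzip inside Tomcat. This is just an illustrative sketch: the field names are assumptions, not a prescribed format, and the versions are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SoupItem:
    name: str
    version: str                    # placeholder values below, fill in your real versions
    purpose: str                    # why the product depends on it
    part_of: Optional[str] = None   # enclosing component, if it is not used directly
    issues_reviewed: bool = False   # have published defects been read and assessed?


def unreviewed(items: List[SoupItem]) -> List[str]:
    """Names of SOUP items whose published defects have not been checked yet."""
    return [item.name for item in items if not item.issues_reviewed]


inventory = [
    SoupItem("tomcat", "x.y.z", "serves the REST API", issues_reviewed=True),
    SoupItem("gzip", "x.y.z", "compresses HTTP responses", part_of="tomcat"),
]

print(unreviewed(inventory))
```

Listing nested components explicitly is the point here: a risk analysis that stops at "Tomcat" never gets to ask what happens if compression fails.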
If you wonder how to best document SOUP, contact us — we can help.
Originally published at https://us.matrixreq.com.