Critical bugs in the ‘httpd’ web server, fix now! – Bare security


Pick a random person and ask them these two questions:

Q1. Have you heard of Apache?
Q2. If so, can you name an Apache product?

We are willing to bet that you will get one of the following two answers:

A1. No. A2. (Not applicable.)
A1. Yes. A2. Log4j.

Two weeks ago, however, we would have suggested that very few people had heard of Log4j, and that even among those in the know, few would have been particularly interested in it.

Until a bunch of potentially catastrophic bugs – initially implemented as features, on the grounds that less is never more – was revealed under the bug-brand Log4Shell, the Log4j programming library was just one of those many components that have been sucked up and used by thousands, if not hundreds of thousands, of Java applications and utilities.

Log4j was just a “part of the supply chain” that was bundled into more back-end servers and cloud-based services than anyone had realized until now.

Many sysadmins, IT staff, and cybersecurity teams have spent the past two weeks rooting out this programmatic scourge from their demesnes. (Yes, that’s a real word. It is pronounced “domains”, but the archaic spelling avoids implying a Windows network.)

Don’t forget “the other Apache”

Go back to that all-too-recent pre-Log4j era, and we suggest you would have received a different pair of answers, namely:

A1. Yes. A2. Apache is a web server, isn’t it? (In fact, it’s a software foundation that makes a web server, among other things.)
A1. Yes. A2. Apache makes httpd, probably still the most popular web server in the world.

With over 3000 files totaling nearly a million lines of source code, Apache httpd is a large and capable server, with myriad combinations of modules and options that make it both powerful and dangerous.

Fortunately, the open source httpd product receives constant attention from its developers, getting regular updates that bring new features as well as critical security fixes.

So, in all the excitement of Apache Log4j, remember that:

  • You almost certainly have Apache httpd in your network somewhere. Just like Log4j, httpd has a habit of getting itself quietly included in software projects, for example as part of an internal service that works so well that it rarely draws attention to itself, or as a component built unobtrusively into a product or service you sell that is not primarily thought of as “containing a web server.”
  • Apache just released an httpd update that fixes two CVE-numbered security bugs. These bugs might not be exposed in your setup, because they are part of optional run-time modules that you might not actually be using. But if you use these modules, whether you realize it or not, you risk server crashes, data leaks, or even remote code execution.

What was fixed?

The two CVE-numbered vulnerabilities are listed in the Apache changelog as follows:

  • CVE-2021-44790: Possible buffer overflow when parsing multipart content in mod_lua of Apache HTTP Server 2.4.51 and earlier.
  • CVE-2021-44224: Possible NULL dereference or SSRF in forward proxy configurations in Apache HTTP Server 2.4.51 and earlier.

The good news about the first bug is that Apache itself warns that the mod_lua server extension (which allows you to adapt the behavior of httpd using Lua scripts instead of having to write modules in C):

… has great power over httpd, which is both a strength and a potential security risk. It is not recommended to use this module on a server shared with users you do not trust, as it can be abused to modify the internal workings of httpd.

However, as Log4j has taught us, potentially exploitable bugs, even on servers that are not internet-facing, can be troublesome if those bugs can be triggered by untrusted user data that gets passed along internally by other servers at the edge of your network.

Also, CVE-2021-44790 does not rely on introducing additional untrusted Lua scripts into the configuration.

Instead, the attack simply involves tricking the “preprocessor” that prepares untrusted user content for handing to trusted Lua scripts, so it does not depend on bugs or flaws in any add-on scripts that you may have written yourself.

Splitting messages into parts

Simply put, the CVE-2021-44790 bug exists in the code that deconstructs multipart messages, which are common in web form uploads and typically look like this:

Content-Type: multipart/form-data; boundary=VILC2R2IHFHLZZ

--VILC2R2IHFHLZZ                 <--double-dash-plus-boundary starts first part
Content-Disposition: form-data; name="name"
                                 <--blank line denotes start of first data item
Paul Ducklin
--VILC2R2IHFHLZZ                 <--double-dash-plus-boundary denotes end
Content-Disposition: form-data; name="phone"
                                 <--blank line denotes start of second data item
--VILC2R2IHFHLZZ--               <--double-dash-plus-boundary denotes end

Technically, each multipart component consists of the data after the end of each completely blank line (see above), and before each boundary line, which consists of two dashes (hyphens) followed by the unique text of the boundary marker.

In case you were wondering, the extra double hyphen at the end of the very last line above marks the last item in the list.

A blank line in the raw data shows up as two consecutive CRLFs (carriage return plus line feed), in other words as the ASCII codes (13,10,13,10), denoted in C by the text string "\r\n\r\n".
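To make that separator concrete, here is a minimal sketch in C of how you might locate the data that follows a blank line. The function name data_after_blank_line is our own invention for illustration and does not appear in httpd.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not httpd code): return a pointer to the first
 * byte of data after the first blank line in msg, or NULL if msg
 * contains no CRLFCRLF sequence at all. */
const char *data_after_blank_line(const char *msg)
{
    assert(msg != NULL);

    const char *crlfcrlf = strstr(msg, "\r\n\r\n");
    if (crlfcrlf == NULL) {
        return NULL;            /* no blank line, so no data to find */
    }
    return crlfcrlf + 4;        /* skip the 4 bytes 13,10,13,10 */
}
```

Note that the caller has to check for NULL before using the result, because a message with no blank line has no data item to point at.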

In mod_lua, the parsing of multipart messages is handled, very loosely speaking, by code that we have simplified like this:

for (start = findnext(start,boundarytext); start != NULL; start = end) {
   crlf = findnext(start,"\r\n\r\n");
   if (!crlf) break;
   end = findnext(crlf,boundarytext);
   len = end - crlf - 8;
   buff = memalloc(len+1);
   memcpy(buff,crlf+4,len);
   [. . .]

Don’t worry if you don’t know C – this code is impenetrable and quite poorly documented even if you do. (The original is much more complex and harder to follow; we’ve reduced it to its basics here.)

Basically, the code looks for a double-CRLF sequence, which denotes the next blank line; from there, it finds the next occurrence of the boundary marker text (VILC2R2IHFHLZZ in our example above).

It then assumes that the data it must extract consists of everything between these two landmarks, designated by memory addresses (pointers in C jargon) crlf and end, minus 8 bytes.

The code makes no effort to explain the meaning of this “minus 8”, nor even of the “plus 4” two lines later, although it is an easy guess that crlf+4 is there to skip the 4 bytes of the CRLFCRLF sequence itself. (The blank line is a separator and is not part of the data to use.)

Here is where the “8” comes from:

  • 4 bytes taken up by the CRLFCRLF characters at the start, which are not part of the data itself.
  • 2 bytes for the CRLF at the end of the last line of data, not included.
  • 2 bytes for the dashes (--) that denote the beginning of the boundary line, not included.

As you can see, the code allocates enough memory for the data between the exact start of the line after the CRLFCRLF separator and the exact end of the line before the boundary mark …

… plus 1 additional byte (len+1) to leave room for a NUL character (a zero byte) at the end of the buffer, to act as the terminator that text strings require in C.

The code then uses memcpy() to copy the relevant data from the incoming message to this new buffer, where it will be presented to the Lua script that is about to run.

What if there aren’t 8 bytes?

You have probably spotted the problem: what if there aren’t 8 surplus bytes to strip off? What if the CRLF at the end of the last line of data, or the -- at the start of the next line, is missing?

What if there aren’t even 8 bytes in total between the CRLFCRLF and the boundary text?
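As a rough illustration, here is a sketch in C of that length calculation done without any sanity check. The function names naive_data_length and naive_length_in are hypothetical, and we use a signed type so that a shortfall shows up as a negative number; this is not the httpd code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper mirroring the simplified "end - crlf - 8"
 * calculation. The result is signed, so too few bytes between the
 * two landmarks produces a negative length. */
ptrdiff_t naive_data_length(const char *crlf, const char *end)
{
    assert(crlf != NULL && end != NULL);
    return (end - crlf) - 8;
}

/* Find both landmarks in msg and report the naive length.
 * For this sketch we simply assume both landmarks are present. */
ptrdiff_t naive_length_in(const char *msg, const char *boundarytext)
{
    assert(msg != NULL && boundarytext != NULL);

    const char *crlf = strstr(msg, "\r\n\r\n");
    assert(crlf != NULL);
    const char *end = strstr(crlf, boundarytext);
    assert(end != NULL);

    return naive_data_length(crlf, end);
}
```

With well-formed input such as "\r\n\r\nPaul Ducklin\r\n--VILC2R2IHFHLZZ", the result is 12, the length of the name. If the boundary text follows the blank line directly, the result is -4, and a negative length converted to an unsigned size turns into an enormous byte count, which is not something you want to hand to memcpy().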

This bug would have been much more obvious if the code had been structured or commented more clearly, and would almost certainly have been avoided if the CRLF-- separator between the end of the data and the boundary text had been mentioned explicitly by the programmer, and tested for explicitly.

This bug has been fixed by adding a check to make sure the final buffer size calculation is not too small, adding a line before attempting to allocate memory:

 if (end - crlf <= 8) break;

This ensures that the calculated buffer length cannot be negative (or zero), although we still think that an explicit check for a correct data terminator, in the same way that there is an explicit check for CRLFCRLF, would make the code clearer. We would also insert a comment directing the reader to a relevant Internet RFC on multipart messages, for example RFC 2046.
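To show what we mean by an explicit terminator check, here is a minimal sketch in C. It is our own illustration, not the code that Apache shipped, and the function name checked_data_length is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not httpd code. crlf points at the CRLFCRLF
 * that precedes the data; end points at the boundary text after it.
 * On success, stores the data length in *len and returns 1.
 * Returns 0 if the CRLF-- terminator is missing, or if there is no
 * room for it after the CRLFCRLF separator.
 * For the multipart syntax, see RFC 2046, section 5.1.1. */
int checked_data_length(const char *crlf, const char *end, size_t *len)
{
    assert(crlf != NULL && end != NULL && len != NULL);

    /* Need 4 bytes of CRLFCRLF plus 4 bytes of CRLF and "--". */
    if (end < crlf || end - crlf < 8) {
        return 0;
    }
    /* The 4 bytes just before the boundary text must be CRLF, "--". */
    if (memcmp(end - 4, "\r\n--", 4) != 0) {
        return 0;
    }
    *len = (size_t)(end - crlf - 8);
    return 1;
}
```

Unlike the one-line fix above, this sketch lets an empty data item (exactly 8 bytes between the two landmarks) through with a length of zero. That is a design choice we made for the illustration, and whether empty items should be accepted is a decision for the real parser.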

Proxy issues

Fixing CVE-2021-44224 involved numerous code changes, the most obvious of which was a fix in a utility code file used by the httpd proxy module.

The fact that there are more than 5000 lines of C in proxy_util.c alone, which is the support code for just one of the many httpd modules, is a testament to the overall size and complexity of the Apache HTTP server.

The code we refer to above has been changed from this …

url = ap_proxy_de_socketfy(p, url);

… to code that checks that the called function has found a URL string with which to work:

url = ap_proxy_de_socketfy(p, url);
if (!url) {
   return NULL;
}

Before the “if no URL” error check was added to make the code bail out early, the program would carry on even if url were NULL, and would try to access memory via the url variable.

Reading from or writing to a NULL pointer is “undefined” according to the C standard, which means you need to take care never to do it.

In practice, on almost all modern operating systems, the value used for NULL, usually zero, is chosen so that any attempt to access this address, whether to read or to write, not only fails, but is trapped by the operating system, which then usually stops the offending process to avoid dangerous or unexpected side effects.
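The general pattern of checking a returned pointer before using it can be sketched in C as follows. Both functions here are hypothetical stand-ins that we made up for illustration; they are not the httpd functions mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a helper that can fail: returns the part
 * of url after "scheme://", or NULL if url contains no "://". */
const char *after_scheme(const char *url)
{
    assert(url != NULL);

    const char *sep = strstr(url, "://");
    return sep ? sep + 3 : NULL;
}

/* Returns the length of the text after the scheme, or -1 on error.
 * The NULL check is the whole point: without it, a URL with no
 * scheme would lead to strlen(NULL), which is undefined behaviour. */
long rest_length(const char *url)
{
    const char *rest = after_scheme(url);
    if (!rest) {
        return -1;              /* bail out early, as in the fix */
    }
    return (long)strlen(rest);
}
```

Here, rest_length("http://example.com/") returns 12, while rest_length("not-a-url") returns -1 instead of crashing the process.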

What to do?

  • If you are using Apache httpd anywhere, update to 2.4.52 as soon as you can.
  • If you can’t patch right away, check whether your configuration is at risk. There are many bug fixes beyond these two CVEs, so you should still patch as soon as possible. But you could decide to defer the update until a more convenient time if you do not load the Lua scripting module or the proxy module.
  • If you are a coder, don’t forget to check for errors. If there is a chance to spot errors before they get worse, such as checking that you really have enough memory to play with, or checking that the string you are looking for is really there, take it!
  • If you are a coder, assume someone else will have to figure out your code in the future. Write helpful and useful comments, on the grounds that those who do not remember the past are doomed to repeat it.

