We delved into the world of web development and discovered a bizarre creature—an elephant with the habits of a bug. While exploring the PHP project, we found the reason behind its strange behavior. Let's take a peek at some unusual cases that can lead to unexpected results.
About the project
PHP is one of the most popular scripting languages for server-side web programming. As an open-source project, it naturally caught our attention.
We've already analyzed this project a few years ago, but as time passes, the product evolves, our tools become more advanced, and developers don't stop to invent new ways to shoot themselves in the foot.
In this article, we'll talk about the C and C++ languages, where undefined behavior lurks around every corner. We even recently published an entire book on this topic: A C++ Programmer's Guide to Undefined Behavior.
Let's get back to our elephants in a bug's clothing. We checked the PHP project at the c998c36 commit using PVS-Studio analyzer and the Visual Studio Code plugin.
Dozen mind-bending bugs
Fragment N1
int fcgi_listen(const char *path, int backlog)
{
  char *s;
  int   tcp = 0;
  ....
  if (!tcp) {
    chmod(path, 0777);
  }
  ....
}
The PVS-Studio warning:
V1118 Excessive file permissions can lead to vulnerabilities. Consider restricting file permissions. fastcgi.c 761
Here is a small code fragment—what could be wrong here? Let's figure it out. The chmod function is used to set file permissions. In this case, the 0777 mask allows all users to read, modify, or execute this file. This can lead to serious consequences: attackers can exploit it and disrupt the program's operation. To avoid this, it's recommended to use more gentle mode and additional conditions for setting access rights.
By the way, this is the first error we detected using this diagnostic rule. It was introduced recently, but is already yielding fruitful results. Try PVS-Studio on your project—it's free for evaluation and can help you catch security issues in your code.
Fragment N2
This fragment contains over a thousand characters in a single line. There's no way to spot the problem like that, so we've split it up for the article. By the way, we can understand what the problem with code sausage is in a short article.
static inline zend_string* phar_get_stub(....)
{
  ....
  static const char newstub1_1[] = 
  "Extract_Phar::$temp)) 
    {
    \nheader('HTTP/1.0 404 Not Found');
    \necho 
    \"<html>
      \\n <head>
        \\n  <title>File Not Found<title>       // <=
      \\n </head>
      \\n <body>
        \\n  <h1>404 - File Not Found</h1>
      \\n </body>
    \\n</html>\"; 
    // part of the string is hidden
    ";
  ....
}
The PVS-Studio warning:
V735 Possibly an incorrect HTML. The "\" closing tag was encountered, while the "\" tag was expected. stub.h 23
We didn't even notice the elephant in the room. The PHP project has caught the warning about an incorrectly closed tag. It's such a tiny error, but it breaks the entire layout, and the browser refuses to show this page. The fix? Just swap the opening tag for the closing one:
\\n  <title>File Not Found</title>
Fragment N3
static void zend_compile_class_decl(....)
{
  ....
  if (....) {
    if (toplevel) {
      if (extends_ast) {
        zend_class_entry *parent_ce = zend_lookup_class_ex(
          ce->parent_name, NULL, ZEND_FETCH_CLASS_NO_AUTOLOAD);  // <=
        ....  
      }
    }
  }
  ....
  if (ce->parent_name) {                                         // <=
    /* Lowercased parent name */
    zend_string *lc_parent_name = zend_string_tolower(ce->parent_name);
    opline->op2_type = IS_CONST;
    LITERAL_STR(opline->op2, lc_parent_name);
  }
}
Here we can see the following pattern: the ce->parent_name pointer is passed as an argument to the zend_lookup_class_ex function, and then checked for NULL. This implies that the pointer can be NULL and used only after the check.
But here, it's passed to the zend_lookup_class_ex function without any checks. If there is an additional condition inside the function, there is nothing wrong with that. Let's take a look inside the function: the analyzer has already checked it and pointed to the line with dereference:
ZEND_API zend_class_entry *zend_lookup_class_ex(zend_string *name, 
                                                zend_string *key, 
                                                uint32_t flags)
{
  zend_class_entry *ce = NULL;
  zval *zv;
  zend_string *lc_name;
  zend_string *autoload_name;
  uint32_t ce_cache = 0;
  if ((   zval_gc_flags((name)->gc.u.type_info) & (1<<5))         // <=
       && __builtin_expect(
   !!((zend_gc_refcount(&(name)->gc)-1)/sizeof(void *) 
       <(compiler_globals.map_ptr_last)), 1)) {
    ce_cache = GC_REFCOUNT(name);
    ce = GET_CE_CACHE(ce_cache);
    if (EXPECTED(ce)) {
      return ce;
    }
  }
  ....
}
Boom! The name parameter, which received the ce->parent_name pointer, is used without any check. As we know, dereferencing a null pointer leads to undefined behavior and it could be anything.
The PVS-Studio warning:
V595 The 'ce->parent_name' pointer was utilized before it was verified against nullptr. Check lines: 'zend_execute_API.c:1169', 'zend_compile.c:9135', 'zend_compile.c:9165'. zend_compile.c 9135
Note. The analyzer warning turned out to be a false positive, unlike the others. Unfortunately, I mistakenly assumed that the allocator returns a pointer to a structure object with uninitialized data members. After looking at the allocator code more closely, I realized that this isn't the case. Even though there is no error in this fragment, such code seems a little bit tangled. Perhaps it could be simplified, for example, by explicitly specifying extends_ast in the second check. 
Nevertheless, the analyzer also highlighted a few other areas that could be reviewed and refined.
The same cases:
- V595 The 'generator->execute_data' pointer was utilized before it was verified against nullptr. Check lines: 810, 826. zend_generators.c 810
- V595 The 'parent_ce' pointer was utilized before it was verified against nullptr. Check lines: 1810, 3905, 3906. zend_inheritance.c 3905
- V595 The 'ce' pointer was utilized before it was verified against nullptr. Check lines: 94, 242, 247. zend_interfaces.c 242
- V595 The 'generator->execute_data' pointer was utilized before it was verified against nullptr. Check lines: 'zend_observer.h:97', 'zend_observer.h:115', ..., 'zend_generators.c:824', 'zend_generators.c:826'. zend_observer.h 115
- V595 The 'filename' pointer was utilized before it was verified against nullptr. Check lines: 'zend_string.h:218', 'zend_stream.c:79', 'fopen_wrappers.c:468', 'fopen_wrappers.c:470'. zend_stream.c 79
- V595 The 'new_value' pointer was utilized before it was verified against nullptr. Check lines: 590, 597. main.c 590
- V595 The 'new_break->class_name' pointer was utilized before it was verified against nullptr. Check lines: 592, 605. phpdbg_bp.c 592
Fragment N4
ZEND_API zend_result zend_register_functions(...) 
{
  ....
  while (ptr->fname) {
    ....
    reg_function = malloc(sizeof(zend_internal_function));
    memcpy(reg_function, &function, sizeof(zend_internal_function)); 
    ....
  }
  ....
}
The PVS-Studio warning:
V575 The potential null pointer is passed into 'memcpy' function. Inspect the first argument. Check lines: 3069, 3068. zend_API.c 3069
The analyzer issues a warning about undefined behavior when the null pointer is passed to the memcpy function. "But where's the NULL pointer here? We can see the memory allocation in the above line," an attentive reader might argue. Unfortunately, here's the catch: if memory allocation fails, the malloc function may return the null pointer. Therefore, after allocating memory, we need to check for NULL.
The same cases:
- V575 The potential null pointer is passed into 'strncat' function. Inspect the first argument. Check lines: 1300, 1299. zend.c 1300
- V575 The potential null pointer is passed into 'strncat' function. Inspect the second argument. Check lines: 1300, 1295. zend.c 1300
- V575 The 'memmove' function processes '0' elements. Inspect the third argument. iconv.c 2435
- V575 The 'memmove' function processes '0' elements. Inspect the third argument. filters.c 1411
Fragment N5
ZEND_API void zend_collect_module_handlers(void)
{
  ....
  modules_dl_loaded = realloc(modules_dl_loaded, 
                              sizeof(zend_module_entry*) 
                              * (dl_loaded_count + 1));
  modules_dl_loaded[dl_loaded_count] = NULL;
  ....
}
The PVS-Studio warning:
V522 There might be dereferencing of a potential null pointer 'modules_dl_loaded'. Check lines: 2527, 2526. zend_API.c 2527
Working with memory is the core of C and C++; that's why it's so easy to shoot ourselves in the foot. Just like the malloc function, the realloc function, may return the null pointer.
The same cases:
- V522 There might be dereferencing of a potential null pointer 'module_request_startup_handlers'. Check lines: 2520, 2514. zend_API.c 2520
- V522 There might be dereferencing of a potential null pointer 'class_cleanup_handlers'. Check lines: 2557, 2553. zend_API.c 2557
- V522 There might be dereferencing of a potential null pointer 'list'. Check lines: 3162, 3161. zend_API.c 3162
Fragment N6
ZEND_API void zend_collect_module_handlers(void)
{
  ....
  modules_dl_loaded = realloc(modules_dl_loaded, 
                              sizeof(zend_module_entry*) 
                              * (dl_loaded_count + 1));
  modules_dl_loaded[dl_loaded_count] = NULL;
  ....
}
The PVS-Studio warning:
V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'modules_dl_loaded' is lost. Consider assigning realloc() to a temporary pointer. zend_API.c 2526
Let's check your attention! Do you notice that this is the same code fragment, yep? But there's another issue here: developers pass the same pointer with the returned result to the realloc function. If the function returns NULL, the original pointer will be lost, leading to a memory leak.
The same cases:
- V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'zend_version_info' is lost. Consider assigning realloc() to a temporary pointer. zend.c 1299
- V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'module_request_startup_handlers' is lost. Consider assigning realloc() to a temporary pointer. zend_API.c 2514
- V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'modules_dl_loaded' is lost. Consider assigning realloc() to a temporary pointer. zend_API.c 2526
- V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'class_cleanup_handlers' is lost. Consider assigning realloc() to a temporary pointer. zend_API.c 2553
- V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'zend_flf_handlers' is lost. Consider assigning realloc() to a temporary pointer. zend_API.c 3086
- V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'zend_flf_functions' is lost. Consider assigning realloc() to a temporary pointer. zend_API.c 3087
- V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'p' is lost. Consider assigning realloc() to a temporary pointer. zend_alloc.c 3299
- V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'ce->interfaces' is lost. Consider assigning realloc() to a temporary pointer. zend_inheritance.c 1576
- V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'ce->interfaces' is lost. Consider assigning realloc() to a temporary pointer. zend_inheritance.c 2194
- V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'q->set' is lost. Consider assigning realloc() to a temporary pointer. ir_private.h 593
- V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'strtab->buf' is lost. Consider assigning realloc() to a temporary pointer. ir_strtab.c 81
- V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'blacklist->entries' is lost. Consider assigning realloc() to a temporary pointer. zend_accelerator_blacklist.c 236
- V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer '* tmphstbuf' is lost. Consider assigning realloc() to a temporary pointer. network.c 1296
- V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'php_ini_scanned_files' is lost. Consider assigning realloc() to a temporary pointer. php_ini.c 702
- V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'b->value' is lost. Consider assigning realloc() to a temporary pointer. php_ini_builder.h 65
- V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'zend_extensions' is lost. Consider assigning realloc() to a temporary pointer. phpdbg.c 1216
V701 realloc() possible leak: when realloc() fails in allocating memory, original pointer 'state->code' is lost. Consider assigning realloc() to a temporary pointer. phpdbg_prompt.c 245
Fragment N7
typedef union _znode_op {
  uint32_t      num;
  ....
}
struct _zend_op {
  _znode_op op2;
  ....
}
struct _zend_op_array {
  uint32_t num_dynamic_func_defs;
  ....
}
typedef struct _zend_op_array zend_op_array;
static void preload_remove_declares(zend_op_array *op_array)
{
  zend_op *opline = op_array->opcodes;
  ....
  while (opline != end) {
    switch (opline->opcode) {
      ....
      case ZEND_DECLARE_FUNCTION:
        ....
        if (func && func == op_array->dynamic_func_defs[opline->op2.num]) {
          ....
          if (op_array->num_dynamic_func_defs == 0) {
            dynamic_func_defs = NULL;
          } else {
            ....
            if (op_array->num_dynamic_func_defs - opline->op2.num > 0) {  // <=
              memcpy(
                dynamic_func_defs + opline->op2.num,
                op_array->dynamic_func_defs + (opline->op2.num + 1),
                sizeof(zend_op_array*) 
                * (op_array->num_dynamic_func_defs - opline->op2.num));
            }
  // the rest of the code
}
It's time to turn on our mathematical brains. The op_array->num_dynamic_func_defs and opline->op2.num variables are unsigned types, which means that the expression will also be unsigned and can never be less than 0.
What do we see here? It's a subtraction of two unsigned values. If the minuend is less than the subtrahend, the result will be a value less than zero. However, since there are unsigned values, we get the modulo arithmetic, 0u - 1 == UINT_MAX. So, the condition can only be false when the variables aren't equal to each other.
This expression should be replaced with the following:
if (op_array->num_dynamic_func_defs > opline->op2.num)
Now the code is shorter and cleaner. Maybe the developer made a logical mistake.
The PVS-Studio warning:
V555 The expression of the 'A - B > 0' kind will work as 'A != B'. ZendAccelerator.c 3915
Fragment N8
uint64_t flags
....
PHPDBG_API void phpdbg_delete_breakpoint(zend_ulong num)
{
  ....     
  if ((brake = phpdbg_find_breakbase_ex(num, &table, &numkey, &strkey))) {
      int type = brake->type;
      char *name = NULL;
      size_t name_len = 0L;
      switch (type) {
        ....
        default: {
          if (zend_hash_num_elements(table) == 1) {
            PHPDBG_G(flags) &= ~(1<<(brake->type+1));              // <=
          }
        }
      }
    ....
  }
}
The PVS-Studio warnings:
V629 Consider inspecting the '1 << (brake->type + 1)' expression. Bit shifting of the 32-bit value with a subsequent expansion to the 64-bit type. phpdbg_bp.c 1209
V784 The size of the bit mask is less than the size of the first operand. This will cause the loss of higher bits. phpdbg_bp.c 1209
Mathematicians, no time to relax. The flags variable has the unsigned long int type, while brake->type has int. The code is designed to remove a specific bit in flags. Now, let's take a closer look at what's really going on:
- The 1constant of theinttype is shifted left by a certain number of bits. Most often,intis 32 bits. We hope that the shift isn't by 32 or more bits, otherwise it leads to undefined behavior.
- The result of the shift is bitwise inverted. The result of the inversion still has the inttype.
- The result of the inversion is expanded to a 64-bit unsigned type due to the left operand. Since the original type is signed, sign extension will occur. This means that for positive numbers, the 32 most significant bits will contain zero bits, and for negative numbers, they'll contain ones.
- The conversion results are applied bitwise AND to flags. The loss of significant bits inflagswill occur when the right operand is positive. It'll only be so when there is a 31-bit shift to the left, i.e., when the 31st bit inflagsneeds to be cleared. Catch a proof.
Notice how much you need to keep in mind for such a harmless expression? The problem lies in the different operand sizes and signedness in some subexpressions. To fix it, developers just need to change the type of the 1 constant from int to unsigned long long, and everything will work as intended:
PHPDBG_G(flags) &= ~( 1uLL <<(brake->type+1));
Fragment N9
typedef signed long long timelib_sll;
....
timelib_sll timelib_get_current_offset(timelib_time *t);
....
static void php_do_date_sunrise_sunset(INTERNAL_FUNCTION_PARAMETERS, 
                                       bool calc_sunset)
{
  double latitude, longitude, zenith, gmt_offset, altitude;
  ....
  if (gmt_offset_is_null) {
      gmt_offset = timelib_get_current_offset(t) / 3600;  // <=
    }
  ....
}
The PVS-Studio warning:
V636 The 'timelib_get_current_offset(t) / 3600' expression was implicitly cast from 'long long' type to 'double' type. Consider utilizing an explicit type cast to avoid the loss of a fractional part. An example: double A = (double)(X) / Y;. php_date.c 5557
The result of dividing two integer types is an integer type. So why was the result assigned to a floating-point variable?
To preserve calculation accuracy, one of the operands should be converted to the double type:
gmt_offset = timelib_get_current_offset(t) / 3600.0;
Fragment N10
static zend_always_inline zend_result _zend_update_type_info(....)
{
  ....
  if (ssa_op->result_def >= 0) {
    tmp =   MAY_BE_ANY | MAY_BE_ARRAY_KEY_ANY 
          | MAY_BE_ARRAY_OF_ANY | MAY_BE_ARRAY_OF_REF;
    if (opline->result_type == IS_TMP_VAR) {
      if (   opline->opcode == ZEND_FETCH_R 
          || opline->opcode == ZEND_FETCH_IS) {
        /* Variable reference counter may be decremented before use */
        /* See: ext/opcache/tests/jit/fetch_r_001.phpt */
        tmp |= MAY_BE_RC1 | MAY_BE_RCN;
      } else {
        tmp |= MAY_BE_RC1 | MAY_BE_RCN;
      }
    }
  }
  ....
}
The PVS-Studio warning:
V523 The 'then' statement is equivalent to the 'else' statement. zend_inference.c 4078
This is a classic copy-paste error. Regardless of how the if statement evaluates, the code performs exactly the same operation—it adds the MAY_BE_RC1 | MAY_BE_RCN flags to the tmp variable.
If there really should be different logic here, it's now missing, and the code doesn't function as expected. If it's not, why do developers use if?
Fragment N11
#define zend_error_noreturn_impl(type, format) do { \
    zend_string *filename; \
    uint32_t lineno; \
    va_list args; \
    get_filename_lineno(type, &filename, &lineno); \
    va_start(args, format); \
    zend_error_va_list(type, filename, lineno, format, args); \
    va_end(args); \
    /* Should never reach this. */ \
    abort(); \                                           // <=
  } while (0)
....
ZEND_API ZEND_COLD ZEND_NORETURN 
  void zend_error_noreturn(int type, const char *format, ...)
{
  zend_error_noreturn_impl(type, format);
}
....
ZEND_API zend_result zend_startup_module_ex(zend_module_entry *module)
{
  .... 
  if (module->module_startup_func) {
    EG(current_module) = module;
    if (module->module_startup_func(module->type, 
                                    module->module_number)==FAILURE) {
      zend_error_noreturn(E_CORE_ERROR,
                          "Unable to start %s module", 
                          module->name);
      EG(current_module) = NULL;                         // <=
      return FAILURE;
    }
    EG(current_module) = NULL;
  }
  return SUCCESS;
}
The PVS-Studio warning:
V779 Unreachable code detected. It is possible that an error is present. zend_API.c 2437
Unreachable code after calling a function that doesn't return control flow. Such code is not only useless, but also misleading.
After digging through the logs a bit, we found an interesting commit from 10 years ago. Previously, errors were simply logged and execution continued, but then this principle was changed for critical errors. In the commit, developers replaced a call to the zend_error function, which returns control flow, with a call to zend_error_noreturn, but the surrounding code wasn't refactored accordingly.
Fragment N12
static void check_conflict(LexState*ls,struct LHS_assign*lh,expdesc*v){
  FuncState*fs = ls->fs;
  int extra = fs->freereg;
  int conflict = 0;
  for(; lh; lh = lh->prev){
    if(lh->v.k == VINDEXED){
      if(lh->v.u.s.info == v->u.s.info){ // <=
        conflict = 1;
        lh->v.u.s.info = extra;
      }
      if(lh->v.u.s.aux == v->u.s.info){  // <=
        conflict = 1;
        lh->v.u.s.aux = extra;
      }
    }
  }
  if(conflict){
    luaK_codeABC(fs, OP_MOVE, fs->freereg, v->u.s.info, 0);
    luaK_reserveregs(fs, 1);
  }
}
The PVS-Studio warning:
V778 Two similar code fragments were found. Perhaps, this is a typo and 'aux' variable should be used instead of 'info'. minilua.c 4331
There are two similar code fragments. Manually writing code instead of copying may be a bit tedious—but it helps avoid subtle bugs like this one.
In this case, not all occurrences of info were correctly replaced with aux in the condition. As a result, the program may behave incorrectly or produce invalid output.
Conclusion
That wraps up our journey through the tangled jungle of the PHP code. Let's keep working to make software safer. I would like to remind you that PVS-Studio has free licenses for open-source projects and educational purposes. For others, a trial version of the analyzer is available—give it a try!
 
Top comments (0)