Developers wield an entire army of tools, each promising to make their lives easier. Yet hidden among this vast crowd stands a true hero—the one who comes to the rescue in the toughest battles. Nomen illis CMake! However, we all know that nothing is perfect, and even the strongest have their Achilles' heel. Today, we'll spot what weaknesses lie behind the shining armor of our hero.
Intro
CMake is a tool that helps devs create projects for different platforms and compilers. It doesn't build the code directly but works as an intermediary that converts the project rules into ready-to-use files for other build programs such as GNU make, Ninja, MSBuild, etc.
Our hero sets the rhythm and direction, ensuring a smooth workflow of all these components. Thanks to CMake, developers can stay focused on writing code instead of wrestling with platform-specific build quirks. The program brings order to the chaotic build process, giving full freedom to choose tools for specific tasks.
I built the project from source following this guide. There were no problems during generation, and I owe that to our hero. For curious dev knights, I leave the full documentation to dive a little bit deeper on how to work with CMake.
After building, I opened the .sln
file and checked it with the PVS-Studio plugin. The repository was committed at the v4.1.0 tag. By the way, we had crossed paths with our hero before. For curious devs, I suggest you read the previous article.
I filtered the results by High and Medium severity and included the general-purpose warnings. Well, all preparations have been completed. Now it's time to challenge our hero!
Warnings
Fragment N1
The PVS-Studio warning: V1061 Extending the 'std' namespace may result in undefined behavior. cmList.h 1322
namespace std // <=
{
inline void swap(cmList& lhs, cmList& rhs) noexcept
{
lhs.swap(rhs);
}
}
The C++ program will have undefined behavior if developers add new functions to the std
namespace. As we can see in the example, devs have added a new overload of std::swap
directly into that namespace.
So, how can we fix it? The correct way to implement this function for a user-defined type is to declare it in the same namespace as the type itself:
class cmList
{
public:
void swap(cmList& other) noexcept { /* implementation */ }
private:
/* private data members */
};
inline void swap(cmList& lhs, cmList& rhs) noexcept
{
lhs.swap(rhs);
}
Another question: how will the compiler spot to get this particular function? Developers should call the function using an unqualified name:
template <typename T>
void foo(T &obj1, T &obj2)
{
using std::swap; // make 'std::swap' available in this scope
....
swap(obj1, obj2); // call the best 'swap' for objects of the 'T' type
....
}
In this example, using std::swap;
brings the std::swap
function into the current scope. When swap(obj1, obj2);
is unqualifiedly called, the compiler will select the most appropriate overload via the argument-dependent lookup (ADL) mechanism. If the T
template parameter is specialized as the cmList
type, the compiler will look for the swap
overload also in this class namespace, and select it as the most appropriate version.
Fragment N2
void cmLocalGenerator::GetDeviceLinkFlags(
cmLinkLineDeviceComputer& linkLineComputer,
std::string const& config,
std::string& linkLibs,
std::string& linkFlags,
std::string& frameworkPath,
std::string& linkPath,
cmGeneratorTarget* target)
{
cmGeneratorTarget::DeviceLinkSetter setter(*target);
cmComputeLinkInformation* pcli = target->GetLinkInformation(config);
auto linklang = linkLineComputer.GetLinkerLanguage(target, config);
auto ipoEnabled = target->IsIPOEnabled(linklang, config);
if (!ipoEnabled) {
ipoEnabled = linkLineComputer.ComputeRequiresDeviceLinkingIPOFlag(*pcli);
}
....
if (pcli) {
this->OutputLinkLibraries(pcli,
&linkLineComputer,
linkLibs,
frameworkPath,linkPath);
}
....
}
See the error? At first glance, it's easy to miss. The pcli
pointer is dereferenced when the ComputeRequiresDeviceLinkingIPOFlag
function is called, before it's checked for validity. If pcli
turns out to be null
, it leads to undefined behavior. But the situation isn't as easy-peasy as it looks. We've dug a little deeper to see whether this bug has been there from the start—the answer is no:
cmComputeLinkInformation* pcli = target->GetLinkInformation(config);
const std::string linkLanguage =
linkLineComputer->GetLinkerLanguage(target, config);
if (pcli)
{
// Compute the required cuda device link libraries when
// resolving cuda device symbols
this->OutputLinkLibraries(pcli, linkLineComputer, linkLibs,
frameworkPath, linkPath);
}
Developers know the pcli
pointer could be null and wisely guard themselves by checking it. But then, in commit 96bc59b1, another developer adds new logic and forgets about that safeguard when inserting their code. Here, static analysis shines and proves its worth: with merge requests, it stands watch, and the code updates automatically are checked.
Here's the simplest way to fix the problem:
auto linklang = linkLineComputer.GetLinkerLanguage(target, config);
auto ipoEnabled = target->IsIPOEnabled(linklang, config);
if (!ipoEnabled && pcli) {
ipoEnabled = linkLineComputer.ComputeRequiresDeviceLinkingIPOFlag(*pcli);
}
....
Here's how PVS-Studio warns about this error: V595 The 'pcli' pointer was utilized before it was verified against nullptr. Check lines: 1445, 1454. cmLocalGenerator.cxx 1445
The same warnings:
- V595 The 'this->Properties[index]' pointer was utilized before it was verified against nullptr. Check lines: 904, 906. cmCTestMultiProcessHandler.cxx 904
- V595 The 'cp->Commands' pointer was utilized before it was verified against nullptr. Check lines: 536, 539. ProcessWin32.c 536
- V595 The 'copy_rule' pointer was utilized before it was verified against nullptr. Check lines: 3004, 3006. cmLocalGenerator.cxx 3004V
Fragment N3
The PVS-Studio warning: V573 Uninitialized variable 'intDir' was used. The variable was used to initialize itself. cmGlobalVisualStudio7Generator.cxx 626
std::string cmGlobalVisualStudio7Generator::WriteUtilityDepend(
cmGeneratorTarget const* target)
{
std::vector<std::string> configs =
target->Target->GetMakefile()->GetGeneratorConfigs(
cmMakefile::ExcludeEmptyConfig);
....
/* clang-format on */
std::string intDirPrefix =
target->GetLocalGenerator()->MaybeRelativeToCurBinDir(
cmStrCat(target->GetSupportDirectory(), '\\'));
for (std::string const& i : configs) {
std::string intDir = cmStrCat(intDir, i); // <=
....
}
....
}
Here is an interesting bug puffed right into the master branch in commit b82a74d9. The developer loops through the strings in the configs
container and builds a path to the intermediate directory (intDir
) that will be written to the *.vcproj
file.
The problem is that the intDir
variable is used in its own initialization, which leads to undefined behavior, since the object lifetime will only begin after initialization is complete ([1], [2]). We can see that a variable with a similar name and the Prefix
suffix is created before the loop, and it should probably be used when initializing intDir
:
std::string intDirPrefix = ....;
for (std::string const& i : configs) {
std::string intDir = cmStrCat(intDirPrefix, i);
....
}
Fragment N4
The PVS-Studio warning: V778 Two similar code fragments were found. Perhaps, this is a typo and 'writeOp' variable should be used instead of 'readOp'. cmDebuggerWindowsPipeConnection.cxx 20
class DuplexPipe_WIN32
{
public:
DuplexPipe_WIN32(HANDLE read);
....
private:
HANDLE hPipe;
OVERLAPPED readOp;
OVERLAPPED writeOp;
};
DuplexPipe_WIN32::DuplexPipe_WIN32(HANDLE pipe) : hPipe(pipe)
{
readOp.Offset = readOp.OffsetHigh = 0;
readOp.hEvent = CreateEvent(NULL, true, false, NULL);
writeOp.Offset = readOp.OffsetHigh = 0;
writeOp.hEvent = CreateEvent(NULL, true, false, NULL);
}
Inside the DuplexPipe_WIN32
constructor, the readOp
and writeOp
data members of the OVERLAPPED
type are initialized. In this structure, developers initialize the Offset
, OffsetHigh
, and hEvent
data members. They start with the readOp
data member, then move on to writeOp
. To initialize it, developers decide to copy the two lines above and change the name.
Unfortunately, devs have forgotten to change readOp.OffsetHigh
as well, which could cause some tricky bugs. It seems that the fix seems obvious; however, we can use another option:
class DuplexPipe_WIN32
{
public:
DuplexPipe_WIN32(HANDLE read);
....
private:
HANDLE hPipe;
OVERLAPPED readOp {};
OVERLAPPED writeOp {};
};
DuplexPipe_WIN32::DuplexPipe_WIN32(HANDLE pipe): hPipe(pipe)
{
readOp.hEvent = CreateEvent(NULL, true, false, NULL);
writeOp.hEvent = CreateEvent(NULL, true, false, NULL);
}
According to the documentation:
All of the members of the OVERLAPPED structure must be initialized to zero unless an event will be used to signal completion of an I/O operation. If an event is used, the hEvent member of the OVERLAPPED structure specifies a handle to the allocated event object.
The default member initializers in the class definition will fill everything with nulls in OVERLAPPED
, and then readOp.hEvent
and writeOp.hEvent
will be initialized in the constructor body.
Fragment N5
The PVS-Studio warning: V1028 Possible overflow. Consider casting operands of the '_yybytes_len + 2' operator to the 'yy_size_t' type, not the result. cmExprLexer.cxx 1851
typedef void* yyscan_t;
typedef size_t yy_size_t;
void *yyalloc ( yy_size_t , yyscan_t yyscanner );
YY_BUFFER_STATE yy_scan_bytes (const char * yybytes,
int _yybytes_len ,
yyscan_t yyscanner)
{
YY_BUFFER_STATE b;
char *buf;
yy_size_t n;
int i;
// Get memory for full buffer, including space for trailing EOB's.
n = (yy_size_t) (_yybytes_len + 2);
buf = (char *) yyalloc( n , yyscanner );
if ( ! buf )
YY_FATAL_ERROR( "out of dynamic memory in yy_scan_bytes()" );
for ( i = 0; i < _yybytes_len; ++i )
buf[i] = yybytes[i];
buf[_yybytes_len] = buf[_yybytes_len + 1] = YY_END_OF_BUFFER_CHAR;
....
}
Theyybytes_len
variable has the int
type and is used to form a new dynamically allocated buffer, buf
. Its size is evaluated as a signed expression and converted to unsigned. Signed overflow, whose behavior is undefined, may occur during addition if yybytes_len
is in the range [INT_MAX – 1 .. INT_MAX]
. The conversion result may also be unexpected.
Generally, the problem falls under the category of 64-bit errors. Unfortunately, to solve this problem, we need to change the function interface: the buffer size parameter should be declared with the size_t
type, not int
. Moreover, this function is already called with an argument of the size_t
type, which has been converted to int
:
YY_BUFFER_STATE yy_scan_string (const char * yystr ,
yyscan_t yyscanner)
{
return yy_scan_bytes( yystr, (int) strlen(yystr) , yyscanner);
}
Once the function accepts the size_t
type, explicit type conversions become unnecessary. Additionally, all loop or index variables used in calculations should also switch from int
to size_t
. In our case, this is the i
local variable.
The same warnings:
Fragment N6
/**
* Append two or more strings and produce new one.
* Programmer must 'delete []' the resulting string,
* which was allocated with 'new'.
* Return 0 if inputs are empty or there was an error
*/
char* SystemTools::AppendStrings(char const* str1,
char const* str2)
{
if (!str1) {
return SystemTools::DuplicateString(str2);
}
if (!str2) {
return SystemTools::DuplicateString(str1);
}
size_t len1 = strlen(str1);
char* newstr = new char[len1 + strlen(str2) + 1];
if (!newstr) {
return nullptr;
}
strcpy(newstr, str1);
strcat(newstr + len1, str2);
return newstr;
}
Here we have the SystemTools::AppendStrings
function, which is designed to combine two passed strings in a new buffer and return a pointer to it.
What can happen if an error occurs when devs attempt to allocate memory? Let's pay attention to the comment on the function: "If an error occurs, the function should return a null pointer." Will that really happen? Not quite. The new
expression throws the std::bad_alloc
exception when the error occurs, making the following check meaningless. If no exception is thrown, the new
expression returns a valid pointer.
To fix the problem, you need to call the nothrow
version of the allocation function:
char* newstr = new (std::nothrow) char[len1 + strlen(str2) + 1];
if (!newstr) {
return nullptr;
}
This is how the analyzer warns about this problem: V668 There is no sense in testing the 'newstr' pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of memory allocation error. SystemTools.cxx 1724
The same warning:
- V668 There is no sense in testing the 'newstr' pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of memory allocation error. SystemTools.cxx 1747
Fragment N7
The PVS-Studio warning: V557 Array overrun is possible. The value of 'length - 1' index could reach 18446744073709551615. ProcessWin32.c 1942
void kwsysProcessCleanErrorMessage(kwsysProcess* cp)
{
/* Remove trailing period and newline, if any. */
size_t length = strlen(cp->ErrorMessage);
if (cp->ErrorMessage[length - 1] == '\n') {
cp->ErrorMessage[length - 1] = 0;
--length;
if (length > 0 && cp->ErrorMessage[length - 1] == '\r') {
cp->ErrorMessage[length - 1] = 0;
--length;
}
}
if (length > 0 && cp->ErrorMessage[length - 1] == '.') {
cp->ErrorMessage[length - 1] = 0;
}
}
The utility function is designed to make small adjustments to an error message. If the message ends with the .\r\n
sequence, the function will remove characters. However, before checking that the last character is \n
, developers don't check the message length. If it's empty, an unsigned overflow will occur when calculating the index, the final value will be equal to SIZE_MAX
, and it leads to buffer overflow and undefined behavior. Interestingly, later in the function, when removing the .
character, developers correctly check the length.
Here's the fixed code:
void kwsysProcessCleanErrorMessage(kwsysProcess* cp)
{
/* Remove trailing period and newline, if any. */
size_t length = strlen(cp->ErrorMessage);
if (length > 0 && cp->ErrorMessage[length - 1] == '\n')
....
}
Fragment N8
The PVS-Studio warning: V1043 A global object variable 'cmPropertySentinel' is declared in the header. Multiple copies of it will be created in all translation units that include this header file. cmStatePrivate.h 27
// cmStatePrivate.h
....
static std::string const cmPropertySentinel = std::string();
The analyzer has detected a global constant defined in the header file. When it's included in the translation units, each of them will create its own copy of this constant because it has an internal linkage due to two reasons:
- It's not
volatile
butconst
-qualified. - It's declared with the
static
specifier.
A quick glance at the codebase reveals that this header is included in:
There are two ways to fix this, depending on the standard version used in the project.
Prior to C++17:
// cmStatePrivate.h
....
extern std::string const cmPropertySentinel;
// some translation unit
#include "cmStatePrivate.h"
std::string const cmPropertySentinel = std::string();
Starting with C++17:
// cmStatePrivate.h
....
inline std::string const cmPropertySentinel = std::string();
Fragment N9
The PVS-Studio warning: V571 Recurring check. This condition was already verified in line 2528. cmQtAutoGenInitializer.cxx 2529
bool cmQtAutoGenInitializer::GetQtExecutable(
GenVarsT& genVars,
std::string const& executable,
bool ignoreMissingTarget) const
{
....
std::string err;
genVars.ExecutableFeatures =
this->GlobalInitializer->GetCompilerFeatures(
executable, genVars.Executable,
err, this->MultiConfig,
this->UseBetterGraph);
if (this->MultiConfig && this->UseBetterGraph)
{
for (auto const& config : this->ConfigsList)
{
if (!genVars.ExecutableFeatures.Config[config]) // <=
{
if (!genVars.ExecutableFeatures.Config[config]) // <=
{
print_err(err);
return false;
}
}
}
}
....
}
The analyzer has detected a possible redundant check: the !genVars.ExecutableFeatures.Config[config]
condition is checked twice in a row. Let's take a closer look, first, at all the necessary types and definitions:
Additional info
class CompilerFeatures { /* .... */ };
using CompilerFeaturesHandle = std::shared_ptr<CompilerFeatures>;
class cmQtAutoGen
{
public:
....
/** String values with per configuration variants. */
template <typename C>
class ConfigStrings
{
public:
C Default;
std::unordered_map<std::string, C> Config;
};
....
};
class cmQtAutoGenInitializer : public cmQtAutoGen
{
public:
/** Abstract moc/uic/rcc generator variables base class. */
struct GenVarsT
{
....
ConfigStrings<CompilerFeaturesHandle> ExecutableFeatures;
};
....
private:
....
bool GetQtExecutable(GenVarsT& genVars,
std::string const& executable,
bool ignoreMissingTarget) const;
....
};
The Config
data member is a container of the std::unordered_map<std::string, std::shared_ptr<CompilerFeatures>>
type. When using the []
operator, the container will search for an element by the specified key and either return a reference to an existing object or to a new one created by default. The condition will be evaluated as true
in two cases:
- The element exists in the associative container but is null.
- The element doesn't exist in the associative container, and a new null smart pointer with the specified key will be created inside it.
At the same time, no direct modification of the element occurs between two searches. It does not appear that this code is executed in any way multithreaded, since there are no synchronization mechanisms in the function. It seems that this part of the code was supposed to check the result of the GetCompilerFeatures
function: all objects by keys from ConfigsList
should be present inside the genVars.ExecutableFeatures
container and be non-null. If any of these conditions are violated, the function should log an error and return false
.
Based on this idea, let's simplify the code:
if (this->MultiConfig && this->UseBetterGraph)
{
bool ok = std::all_of(
this->ConfigsList.begin(),
this->ConfigsList.end(),
[&Config = genVars.ExecutableFeatures.Config](auto &&config)
{
auto it = Config.find(config);
return it != Config.end() && it->second;
});
if (!ok)
{
print_err(err);
return false;
}
}
The same warning:
- V571 Recurring check. The '!cm->SourceFile.empty()' condition was already verified in line 580. cmCTestBuildHandler.cxx 581
Fragment N10
The PVS-Studio warning: V728 An excessive check can be simplified. The '||' operator is surrounded by opposite expressions 'arguments.ErrorVariable.empty()' and '!arguments.ErrorVariable.empty()'. cmExecuteProcessCommand.cxx 257
bool cmExecuteProcessCommand(std::vector<std::string> const& args,
cmExecutionStatus& status)
{
....
else if ( arguments.ErrorVariable.empty() ||
(!arguments.ErrorVariable.empty() &&
arguments.OutputVariable != arguments.ErrorVariable))
{
....
}
....
}
The analyzer issues a warning that the logical expression contains redundant checks. The arguments.ErrorVariable.empty()
expression is used twice during evaluations, and the second time it's inverted. If this expression is evaluated as false
as the left operand of the ||
operator, then he second evaluation on the right is guaranteed to be true
.
Let's simplify the logical expression:
else if ( arguments.ErrorVariable.empty()
|| arguments.OutputVariable != arguments.ErrorVariable)
{
....
}
A B !A !A ∩ B A ∪ (!A ∩ B) A ∪ B 0 0 1 0 0 0 1 0 0 0 1 1 0 1 1 1 1 1 1 1 0 0 1 1 See that? The last two columns are identical, which means that the two resulting expressions are equivalent for all possible combinations of sub-expression values. Thus, the simplified expression keeps the same logic as the original; however, the simplified version is easier to understand.Proof of logical expression equivalence
If the correction doesn't seem obvious at first glance, let's prove that the two logical expressions are equivalent using a truth table:
A
expression is arguments.ErrorVariable.empty()
; B
expression is arguments.OutputVariable != arguments.ErrorVariable
.
The same warnings:
- V728 An excessive check can be simplified. The '(A && B) || (!A && !B)' expression is equivalent to the 'bool(A) == bool(B)' expression. cmGlobalGhsMultiGenerator.cxx 341
- V728 An excessive check can be simplified. The '(A && !B) || (!A && B)' expression is equivalent to the 'bool(A) != bool(B)' expression. testDynamicLoader.cxx 81
- V728 An excessive check can be simplified. The '(A && !B) || (!A && B)' expression is equivalent to the 'bool(A) != bool(B)' expression. testDynamicLoader.cxx 88
- V728 An excessive check can be simplified. The '(A && !B) || (!A && B)' expression is equivalent to the 'bool(A) != bool(B)' expression. testDynamicLoader.cxx 95
Fragment N11
The PVS-Studio warning: V501 There are identical sub-expressions '!this->MakefileVariableSize' to the left and to the right of the '&&' operator. cmLocalUnixMakefileGenerator3.cxx 1292
class cmLocalUnixMakefileGenerator3 : public cmLocalCommonGenerator
{
....
private:
int MakefileVariableSize;
....
};
std::string cmLocalUnixMakefileGenerator3::CreateMakeVariable(
std::string const& s,
std::string const& s2)
{
std::string unmodified = cmStrCat(s, s2);
if ((!this->MakefileVariableSize && // <=
unmodified.find('.') == std::string::npos) &&
(!this->MakefileVariableSize &&
unmodified.find('+') == std::string::npos) &&
(!this->MakefileVariableSize &&
unmodified.find('-') == std::string::npos)){
return unmodified;
}
}
In the cmLocalUnixMakefileGenerator3::CreateMakeVariale
function, the !this->MakefileVariableSize
condition is repeated unnecessarily before each check for .
, +
, and -
in the unmodified
string. This repetition bloats the code and makes it harder to read. Since !this->MakefileVariableSize
is common to all three checks, we can move it outside the parentheses and combine all checks into a single logical expression:
if (!this->MakefileVariableSize &&
unmodified.find('.') == std::string::npos &&
unmodified.find('+') == std::string::npos &&
unmodified.find('-') == std::string::npos) {
return unmodified;
}
Well, we've removed the same subexpression, but the condition checks the existence of three find characters in the string. We can go a step further and perform micro-optimizations. Let's replace the call to the find
member function with find_first_of
, passing all three characters in a string literal:
if (!this->MakefileVariableSize &&
unmodified.find_first_of(".+-") == std::string::npos)
{
return unmodified;
}
The same warning:
- V501 There are identical sub-expressions '!options.empty()' to the left and to the right of the '||' operator. cmVisualStudio10TargetGenerator.cxx 2912
Fragment N12
The PVS-Studio warning: V523 The 'then' statement is equivalent to the subsequent code fragment. cmComputeLinkInformation.cxx 1748
bool cmComputeLinkInformation::CheckImplicitDirItem(LinkEntry const& entry)
{
BT<std::string> const& item = entry.Item;
// We only switch to a pathless item if the link type may be
// enforced. Fortunately only platforms that support link types
// seem to have magic per-architecture implicit link directories.
if (!this->LinkTypeEnabled) {
return false;
}
// Check if this item is in an implicit link directory.
std::string dir = cmSystemTools::GetFilenamePath(item.Value);
if (!cm::contains(this->ImplicitLinkDirs, dir)) {
// Only libraries in implicit link directories are converted to
// pathless items.
return false;
}
// Only apply the policy below if the library file is one that can
// be found by the linker.
std::string file = cmSystemTools::GetFilenameName(item.Value);
if (!this->ExtractAnyLibraryName.find(file)) {
return false;
}
return false;
}
The analyzer hints that there is definitely something wrong with the CheckImplicitDirItem
function:
- the
then
branch of the lastif
statement duplicates the code below (return false;
); - every branch of the function execution ends with
return false
; - when called from
AddFullItem
, it'll never trigger an early return; - the full function body can be replaced with
return false;
, as this won't change the observed program behavior.
Note that the function is written using the early return pattern that helps reduce code nesting: the most positive result is placed at the end of the function, and the rest of the code, in case of divergence from the function's purpose, should terminate the function as early as possible.
In our example, we can assume that the most positive result of the function is that an object of the LinkEntry
type has passed all the necessary checks with a return value of true
.
Here's the option on how to fix the code:
....
std::string file = cmSystemTools::GetFilenameName(item.Value);
if (!this->ExtractAnyLibraryName.find(file)) {
return false;
}
return true;
The same warnings:
- V523 The 'then' statement is equivalent to the subsequent code fragment. cmGlobVerificationManager.cxx 128
- V523 The 'then' statement is equivalent to the subsequent code fragment. cmListCommand.cxx 92
Outro
At this point, our journey with the all-mighty CMake draws to a close. We've faced various challenges from copy-paste to redundant checks. Yet, despite these trials, CMake stands as a powerful and trustworthy companion. Mistakes happen to even the bravest of heroes, but they don't define their worth. CMake is a project that is actively developing, and its bugs are swiftly fixed. Many of the previously discovered bugs we've encountered have already been slain, confirming that CMake remains a reliable and indispensable tool in the realm of development.
If your own projects in C, C++, C#, or Java need a keen-eyed sentry, our analyzer is ready to ride alongside you and checks your code!
Top comments (0)