_TFC30ApteligentManglingExampleSwift7MyClass14memberFunctionfT_Si

If you’ve ever seen a symbol like the above, you might have wondered where it came from and how your source code ended up in an unfavorable confrontation with a cheese grater. This effect is called function name mangling, and it is a feature of many programming languages. This article will give you an introduction to the concept, why it is needed, and how to decipher it in some sample applications.

To understand why name mangling is needed, we must first go over some of the basics behind the process of building executable binaries, which consists of two main steps.

First, each file is compiled into what is called an ‘object file’. For the sample application below [Fig 1.1], the compiler will generate two intermediate files: main.o and filename.o. Since the compiler only looks at the source code of a single file at a time (in most cases), each object file may still contain unresolved references to symbols defined in other files at this point in the build.

After the intermediate files are generated, the linker takes over to generate the executable file. As the binary is being generated, the linker will attempt to resolve any references that were not handled at the compilation stage.

If the linker used only the function name to resolve function references, programming languages that support namespaces or function overloading would run into problems. Swift and C++, for example, allow more than one function with the same name. If there are multiple function implementations with the same name, how does the linker know which is the correct choice?

Function name mangling is a solution to this problem. It is a technique for resolving name collisions when compiled entities are linked together. Mangling lets the compiler encode extra information in the name of an object (for example, a function name combined with its parameter types) and pass that information to the linker. This helps the linker make the correct choice when linking together multiple object files into a single binary.

Function Overloading: An Example

Function name mangling becomes necessary when multiple procedures in the same program may have identical names in source. Any C++ application, for example, could have a function ‘functionName’ defined twice – once with the parameters (int, double), and another time, with a completely separate implementation, with just one parameter (int). As mentioned above, the linker cannot utilize just the function name to support function overloading. To address this problem, the compiler mangles the function name in a way that makes it unique within the scope of the whole program.

The example program source below has three functions with the same name, demonstrating a simple case of function overloading. The function name ‘functionName’ actually represents three different functions with three different sets of parameters. This is a perfectly legal (and useful) feature of C++ and other languages.


Fig 1.2
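As a minimal sketch of such a program, the three overloads might look like the following. Only the three signatures come from this article; the function bodies and the ‘lastCalled’ helper are assumptions added so the effect of each call can be observed.

```cpp
#include <string>

// Hypothetical helper: remembers which overload ran most recently.
std::string lastCalled;

// Three functions that share one source-level name but differ in parameters.
void functionName() {
    lastCalled = "functionName()";
}

void functionName(int val) {
    (void)val;  // unused in this sketch
    lastCalled = "functionName(int)";
}

void functionName(int val, double dbl) {
    (void)val;  // unused in this sketch
    (void)dbl;
    lastCalled = "functionName(int, double)";
}
```

The compiler picks an overload from the argument types at each call site, so a call with no arguments, one int, or an int and a double each reaches a different implementation.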

When the C++ compiler produces an object file prior to the linking process, it assigns a unique qualifier to each subprogram following a strict set of rules. This gives the linker sufficient information to correctly link object files together into a functional program.

We can use dwarfdump to peek at the ‘Debug Information Entry’ (DIE) for our sample code function ‘functionName()’. As you can see, the C++ compiler gave this DIE an attribute called ‘AT_MIPS_linkage_name’ with the value ‘_Z12functionNamev’. This is an example of a mangled function name.

void functionName();
0x0000002e: TAG_subprogram [2]
AT_low_pc( 0x00000001000011c0 )
AT_high_pc( 0x00000001000011df )
AT_frame_base( rbp )
AT_MIPS_linkage_name( “_Z12functionNamev” )
AT_name( “functionName” )
AT_decl_file( “<FILEPATH>/filename.cpp” )
AT_decl_line( 3 )
AT_external( 0x01 )

In this simple example, we can see how this C++ compiler mangles the three function names in order to create a unique identifier for each.

Source Function Name                      Mangled (Linker) Function Name
void functionName()                       _Z12functionNamev
void functionName(int val)                _Z12functionNamei
void functionName(int val, double dbl)    _Z12functionNameid

With this additional information added to the function name, the linker now has enough information to resolve a reference to the correct implementation.
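The pattern in these three symbols can be reproduced by hand. The sketch below is an illustration only: ‘mangleFreeFunction’ is a hypothetical helper, not a compiler API, and it assumes a free function outside any namespace whose parameters are built-in types with one-letter codes (‘i’ for int, ‘d’ for double, and ‘v’ when there are no parameters).

```cpp
#include <string>
#include <vector>

// Builds the mangled name of a free function that is not inside a namespace
// or class, following the pattern _Z<length><name><parameter codes>.
// paramCodes holds one code per parameter, e.g. "i" (int) or "d" (double).
// An empty parameter list is encoded as "v" (void).
std::string mangleFreeFunction(const std::string& name,
                               const std::vector<std::string>& paramCodes) {
    std::string mangled = "_Z" + std::to_string(name.size()) + name;
    if (paramCodes.empty()) {
        mangled += "v";
    }
    for (const std::string& code : paramCodes) {
        mangled += code;
    }
    return mangled;
}
```

Calling it with the name ‘functionName’ and the codes ‘i’ and ‘d’ yields _Z12functionNameid, where 12 is the length of the function name. A real compiler handles many more cases (pointers, references, templates, user-defined types) than this sketch does.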

A Tangent: The extern “C” Keyword

This section is for the curious.

While C++ supports function overloading, as seen in the example above, regular C does not. During C++ compilation, the extern “C” linkage specification forces the compiler to use the function name itself, without modification, as the linkage symbol name, just like a regular C compiler would normally do. This allows a C++ function to be called from regular C code.

We can use dwarfdump to examine the difference in the Debug Information Entry for the test function when extern “C” is used. In this case, the mangled name in the attribute AT_MIPS_linkage_name disappears and we are left with only the function name to identify the subprogram.

void functionName();

0x0000002e: TAG_subprogram [2]
AT_low_pc( 0x00000001000011c0 )
AT_high_pc( 0x00000001000011df )
AT_frame_base( rbp )
AT_MIPS_linkage_name( “_Z12functionNamev” )
AT_name( “functionName” )
AT_decl_file( “<FILEPATH>/filename.cpp” )
AT_decl_line( 3 )
AT_external( 0x01 )

extern “C” void functionName();

0x0000002e: TAG_subprogram [2]
AT_low_pc( 0x00000001000011c0 )
AT_high_pc( 0x00000001000011df )
AT_frame_base( rbp )
AT_name( “functionName” )
AT_decl_file( “<FILEPATH>/filename.cpp” )
AT_decl_line( 3 )
AT_external( 0x01 )

In our example application above [Fig 1.2], we had multiple functions with the same name. Since the C++ compiler mangles the function names, linking to the different implementations of this function is usually not a problem. When the extern “C” keyword is used, name mangling is effectively disabled. What happens if you apply it to two or more of these overloaded function declarations?


Fig 1.3

Since extern “C” disables name mangling, the two subprograms would end up with the same symbol name. This ambiguity results in a failed build, either when the compiler rejects the conflicting declarations or at the linker stage.
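The contrast can be sketched in a few lines. The names ‘scale’ and ‘scaleForC’ are hypothetical and chosen only for this illustration; the commented-out line shows the kind of declaration that is expected to break the build.

```cpp
// C++ linkage: both overloads can coexist because each one receives its own
// mangled symbol name.
int scale(int value) { return value * 2; }
double scale(double value) { return value * 2.0; }

// C linkage: the symbol is the plain, unmangled name, so only one function
// called scaleForC can exist with C linkage.
extern "C" int scaleForC(int value) { return scale(value); }

// Uncommenting the next line is expected to fail the build, because it would
// introduce a second C-linkage function with the same unmangled name:
// extern "C" double scaleForC(double value) { return scale(value); }
```

Wrapping a single entry point in extern “C” while keeping the overloaded implementations in C++ is one way to expose overloaded code to C callers without a name clash.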

Decoding the C++ Mangling Scheme

C++ (and Objective-C++ by extension) makes use of namespaces to prevent name collisions that can occur when integrating multiple libraries. C++ compilers include namespace information in the mangled symbol in order to differentiate between functions with the same name in multiple namespaces. The compiler also includes the class name, so an identical function name can be declared inside and outside of a class.

Unfortunately, name mangling is not standardized across all C++ compilers. Different compiler vendors, the same compiler on different platforms, or even different versions of the same compiler can produce differing results. The results in this article were produced with Xcode 7.3 (Apple LLVM version 7.3.0, clang-703.0.29).

Here is an example of what a mangled function name looks like in C++ and the breakdown of the components in the string.

_ZN13NamespaceName9ClassName18memberFunctionNameEid

Mangled C++ symbols always start with _Z, optionally followed by an N if the symbol is nested inside the scope of a namespace or class. Next comes a series of <length, name> pairs that give the namespace name, class name and function name in order. An E closes the nested name, and the parameter type information follows it. In the example above, our function takes an integer and a double value as parameters, so the characters ‘i’ and ‘d’ are appended to the function symbol. Note that the int return type of this function does not appear in the mangled name.

namespace NamespaceName {
    class ClassName {
        public:
            int memberFunctionName(int, double);
    };
}
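The <length, name> layout described above can be split mechanically. The following is a minimal sketch under stated assumptions: ‘DecodedSymbol’ and ‘decodeNestedName’ are hypothetical names, and the parser only understands symbols of the form _ZN...E followed by parameter codes, which it returns undecoded.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Result of splitting a nested mangled name (hypothetical type for this sketch).
struct DecodedSymbol {
    std::vector<std::string> scopes;  // namespace, class and function names, in order
    std::string parameterCodes;       // everything after the closing 'E'
    bool ok = false;                  // false if the input is not of the form _ZN...E
};

// Returns true if the character at pos exists and is a decimal digit.
static bool isDigitAt(const std::string& s, std::size_t pos) {
    return pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos])) != 0;
}

DecodedSymbol decodeNestedName(const std::string& symbol) {
    DecodedSymbol result;
    if (symbol.compare(0, 3, "_ZN") != 0) {
        return result;  // not a nested name
    }

    std::size_t pos = 3;
    while (isDigitAt(symbol, pos)) {
        // Read the decimal length prefix.
        std::size_t length = 0;
        while (isDigitAt(symbol, pos)) {
            length = length * 10 + static_cast<std::size_t>(symbol[pos] - '0');
            ++pos;
        }
        if (length > symbol.size() - pos) {
            return result;  // truncated input
        }
        // Read exactly that many characters as the name.
        result.scopes.push_back(symbol.substr(pos, length));
        pos += length;
    }

    // The nested name must be closed by 'E'.
    if (pos >= symbol.size() || symbol[pos] != 'E') {
        return result;
    }
    result.parameterCodes = symbol.substr(pos + 1);
    result.ok = true;
    return result;
}
```

Applied to the symbol above, the sketch yields the scopes NamespaceName, ClassName and memberFunctionName, with ‘id’ left over as the parameter codes.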

Here is an example of a test app with a wider variety of mangled function names with different placements in a source file.


Fig 1.4

When compiled, each of these function declarations, despite having the same name, has its own unique mangled identifier.
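To go the other way and recover a readable declaration from a mangled identifier, GCC and Clang toolchains expose ‘abi::__cxa_demangle’ in <cxxabi.h>. The sketch below wraps that call; the wrapper name ‘demangle’ is an assumption made for illustration, and the header is not available on compilers that use a different mangling scheme.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC and Clang toolchains)
#include <cstdlib>   // std::free
#include <string>

// Returns the human-readable form of a mangled C++ name, or an empty string
// if the runtime could not demangle it. Hypothetical wrapper for illustration.
std::string demangle(const std::string& mangled) {
    int status = 0;
    char* readable = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return std::string();
    }
    std::string result(readable);
    std::free(readable);  // __cxa_demangle allocates the buffer with malloc
    return result;
}
```

With this wrapper, _Z12functionNameid comes back as functionName(int, double), and the nested example from earlier comes back as NamespaceName::ClassName::memberFunctionName(int, double).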

Decoding the Swift Mangling Scheme

Swift, Apple’s new development language, uses an extensive name mangling scheme that encodes a large amount of metadata about each function in the compiled symbol name. It is similar in structure to the scheme used by C++ compilers, with a few additions that differentiate symbols even further. Interestingly, Swift does not support namespaces, but it does prepend the module name to the mangled symbol, creating a namespace-like effect. The symbol also includes the function’s return type and a function attribute flag.

We wrote a simple test app to demonstrate function name mangling in a Swift application.


Fig 1.5

Using the dwarfdump utility, we can retrieve mangled symbol names from the dSYM debug symbols file. Let’s take one of the more complex function symbols from our test app and break it down into its components.

_TFC30ApteligentManglingExampleSwift7MyClass14memberFunctionfT_Si

Table 1
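The length-prefixed part of this symbol can be split the same way as a C++ nested name. The sketch below is an illustration under stated assumptions: it accepts only the ‘_TFC’ prefix seen in this symbol (read here as a Swift symbol, a function, and a class context), the names ‘SwiftSymbolParts’ and ‘splitSwiftMemberFunction’ are hypothetical, and the trailing type information is returned undecoded.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Pieces of a mangled Swift member function symbol (hypothetical type).
struct SwiftSymbolParts {
    std::vector<std::string> names;  // for this symbol: module, class, function
    std::string typeSuffix;          // undecoded remainder, e.g. "fT_Si"
    bool ok = false;                 // false if the prefix or layout did not match
};

SwiftSymbolParts splitSwiftMemberFunction(const std::string& symbol) {
    SwiftSymbolParts parts;
    const std::string prefix = "_TFC";  // assumption: only this prefix is handled
    if (symbol.compare(0, prefix.size(), prefix) != 0) {
        return parts;
    }

    std::size_t pos = prefix.size();
    while (pos < symbol.size() &&
           std::isdigit(static_cast<unsigned char>(symbol[pos])) != 0) {
        // Read the decimal length prefix.
        std::size_t length = 0;
        while (pos < symbol.size() &&
               std::isdigit(static_cast<unsigned char>(symbol[pos])) != 0) {
            length = length * 10 + static_cast<std::size_t>(symbol[pos] - '0');
            ++pos;
        }
        if (length > symbol.size() - pos) {
            return parts;  // truncated input
        }
        // Read exactly that many characters as the name.
        parts.names.push_back(symbol.substr(pos, length));
        pos += length;
    }

    parts.typeSuffix = symbol.substr(pos);
    parts.ok = !parts.names.empty();
    return parts;
}
```

For the symbol above, this produces the names ApteligentManglingExampleSwift (30 characters), MyClass (7) and memberFunction (14), with ‘fT_Si’ left as the suffix that carries the function attribute, parameter and return type information.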

Finally, let’s look at the rest of the functions in our sample app to see how the Swift compiler mangles each one. We have a compiled example for each of the different function attributes, with a variety of parameters and return types. Swift handles more cases than we can cover here. Since Swift is now an open source project, you can read the whole mangling algorithm yourself in its source code.

Table 2

Table 3

Representative Characters    Swift Built-in Type
Sa                           Array
Sb                           Bool
Sc                           UnicodeScalar
Sd                           Double
Sf                           Float
Si                           Int
SP                           UnsafePointer
Sp                           UnsafeMutablePointer
SR                           UnsafeBufferPointer
Sr                           UnsafeMutableBufferPointer
Su                           UInt
Sq                           Optional
SQ                           ImplicitlyUnwrappedOptional
SS                           String
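When decoding symbols by hand, the table above can be turned into a small lookup. This is only a sketch: ‘swiftBuiltinName’ is a hypothetical helper, and it knows just the codes listed in this article.

```cpp
#include <map>
#include <string>

// Maps a two-character built-in code from the table above to its Swift type.
// Returns an empty string for codes that are not in the table.
std::string swiftBuiltinName(const std::string& code) {
    static const std::map<std::string, std::string> table = {
        {"Sa", "Array"},
        {"Sb", "Bool"},
        {"Sc", "UnicodeScalar"},
        {"Sd", "Double"},
        {"Sf", "Float"},
        {"Si", "Int"},
        {"SP", "UnsafePointer"},
        {"Sp", "UnsafeMutablePointer"},
        {"SR", "UnsafeBufferPointer"},
        {"Sr", "UnsafeMutableBufferPointer"},
        {"Su", "UInt"},
        {"Sq", "Optional"},
        {"SQ", "ImplicitlyUnwrappedOptional"},
        {"SS", "String"},
    };
    const auto it = table.find(code);
    return it == table.end() ? std::string() : it->second;
}
```

The codes are case-sensitive, which is why the lookup compares them exactly: ‘Sp’ and ‘SP’ refer to different pointer types.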