I apologize if you came here looking for content related to strings and cats; this article is a tutorial on the C string concatenation function strcat.
More than just the third letter of the alphabet, C is a compiled programming language - in other words, a computer can run a C program only after it has been compiled. Compilation is done by programs called compilers, which take C source code files and translate them into machine code (binary) that computers can execute.
There are many different C compilers, but in this tutorial, I will be using one called GCC, published by GNU. For instructions on how to install GCC, you can visit GNU’s installation guide here (https://gcc.gnu.org/install/).
Throughout this article, I demonstrate GCC on the command line. If you are unfamiliar with what a command line is, read more here (http://linuxcommand.org/index.php).
The function strcat (think, "string concatenation") is a C standard library function that concatenates (appends) one string to the end of another.
ASIDE - STRING REFRESHER
When working with strings in C, remember - strings are no more than arrays of ASCII-encoded characters ending with a terminating null byte (\0). A pointer to a string is merely a pointer to the first character in this array.
For a more in-depth examination on pointers, including a look at strings, I encourage you to visit one of my earlier posts (Pointers in C).
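To make the refresher concrete, here is a minimal sketch of how much memory a string occupies. The helper name bytes_needed is hypothetical (it is not part of the standard library); it only illustrates that the terminating null byte costs one byte beyond what strlen reports.

```c
#include <stddef.h>
#include <string.h>

/*
 * bytes_needed - hypothetical helper, for illustration only.
 * Returns the number of bytes a string occupies in memory:
 * its visible characters plus one terminating null byte.
 */
size_t bytes_needed(const char *str)
{
    return (strlen(str) + 1);
}
```

For example, bytes_needed("Holberton") evaluates to 10: nine letters plus the \0, which is why any array holding "Holberton" needs at least 10 elements.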
You may be familiar with a similar C string function, strcpy (if you’re not, I encourage you to check out my dedicated article on it!). Recall that strcpy copies the characters referenced by one string into the memory pointed to by another. In other words, if we were to copy str1, visualized below:
char str1[10] = "Holberton";
Address | 0x00 | 0x01 | 0x02 | 0x03 | 0x04 | 0x05 | 0x06 | 0x07 | 0x08 | 0x09 |
Variable | str1 | |||||||||
Value | H | o | l | b | e | r | t | o | n | \0 |
Into the memory pointed to by str2, visualized below:
char str2[10];
Address | 0x20 | 0x21 | 0x22 | 0x23 | 0x24 | 0x25 | 0x26 | 0x27 | 0x28 | 0x29 |
Variable | str2 | |||||||||
Value | ? | ? | ? | ? | ? | ? | ? | ? | ? | ? |
We would achieve the following:
Address | 0x00 | 0x01 | 0x02 | 0x03 | 0x04 | 0x05 | 0x06 | 0x07 | 0x08 | 0x09 |
Variable | str1 | |||||||||
Value | H | o | l | b | e | r | t | o | n | \0 |
Address | 0x20 | 0x21 | 0x22 | 0x23 | 0x24 | 0x25 | 0x26 | 0x27 | 0x28 | 0x29 |
Variable | str2 | |||||||||
Value | H | o | l | b | e | r | t | o | n | \0 |
In essence, we have copied the string "Holberton" so that it exists in two different spots in memory - once at 0x00, and again at 0x20.
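The copy pictured above can be sketched in a few lines. The function name copies_are_independent is an assumption made for this illustration; it copies str1 into str2, changes only the copy, and reports whether the original stayed intact.

```c
#include <string.h>

/*
 * copies_are_independent - hypothetical demo, for illustration only.
 * Copies str1 into str2 with strcpy, then modifies the copy.
 * Returns 1 if the original is untouched and the copy changed,
 * 0 otherwise.
 */
int copies_are_independent(void)
{
    char str1[10] = "Holberton";
    char str2[10];

    strcpy(str2, str1);  /* str2 now holds its own "Holberton" */
    str2[0] = 'h';       /* change only the copy */

    return (strcmp(str1, "Holberton") == 0 &&
            strcmp(str2, "holberton") == 0);
}
```

The function returns 1, since the two arrays live at different addresses and share nothing but their initial contents.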
Why am I wasting my time reviewing strcpy for an article on strcat, you ask? Well, it turns out that strcpy and strcat work similarly. Truly, if you understand how strcpy works, you understand how strcat works.
Backtrack a step and imagine two new strings, still named str1 and str2. This time, str1 is an array of 16 characters that begins with the string "Holberton". Meanwhile, the second string, str2, is a new array of 7 characters holding "School".
char str1[16] = "Holberton";
Address | 0x00 | 0x01 | 0x02 | 0x03 | 0x04 | 0x05 | 0x06 | 0x07 | 0x08 | 0x09 | 0x0A | 0x0B | 0x0C | 0x0D | 0x0E | 0x0F |
Variable | str1 | |||||||||||||||
Value | H | o | l | b | e | r | t | o | n | \0 | ? | ? | ? | ? | ? | ? |
char str2[7] = "School";
Address | 0x20 | 0x21 | 0x22 | 0x23 | 0x24 | 0x25 | 0x26 |
Variable | str2 | ||||||
Value | S | c | h | o | o | l | \0 |
Now, when we concatenate str2 onto str1, what we are truly doing is copying the contents of str2 to the end of the string referenced by str1, thereby achieving the following:
Address | 0x00 | 0x01 | 0x02 | 0x03 | 0x04 | 0x05 | 0x06 | 0x07 | 0x08 | 0x09 | 0x0A | 0x0B | 0x0C | 0x0D | 0x0E | 0x0F |
Variable | str1 | |||||||||||||||
Value | H | o | l | b | e | r | t | o | n | S | c | h | o | o | l | \0 |
Address | 0x20 | 0x21 | 0x22 | 0x23 | 0x24 | 0x25 | 0x26 |
Variable | str2 | ||||||
Value | S | c | h | o | o | l | \0 |
Feel familiar? Now, the string "School" exists in two separate locations in memory - once still at 0x20, and again at the end of the original string referenced by str1, starting at 0x09. We have done nothing more than copy "School", strcpy-style, starting at the memory address of str1’s original null byte (\0).
Note a key concept visualized above - when we concatenate str2 to the end of str1, we begin copying not at the last memory address in the entire str1 array, 0x0F, but specifically at str1’s original null byte, which happened to sit at address 0x09 and is overwritten by the first copied character. Also, note that the copying of str2 to the end of str1 includes the terminating null byte.
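This relationship can be written down directly. The sketch below is not how any particular library implements strcat; concat_via_strcpy is a hypothetical name, and the function simply expresses "strcpy, starting at dest's null byte" in code.

```c
#include <string.h>

/*
 * concat_via_strcpy - hypothetical helper, for illustration only.
 * Appends src to dest by copying it to the address of dest's
 * terminating null byte. dest must have room for the result.
 */
char *concat_via_strcpy(char *dest, const char *src)
{
    /* dest + strlen(dest) is the address of dest's null byte */
    strcpy(dest + strlen(dest), src);
    return (dest);
}
```

With char str1[16] = "Holberton", calling concat_via_strcpy(str1, "School") places the 'S' at index 9 (the old null byte) and the new null byte at index 15, matching the table above.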
char *dest, const char *src
The function receives two parameters, two character pointers.
The first pointer, dest (think, "destination"), references the memory buffer where characters will be concatenated. Note that strcat does not automate any memory handling - it directly tries to concatenate characters at the end of whatever string is initially referenced by dest. Because of this, you must allocate space for the destination buffer up-front, and sufficiently. Since the contents of dest change, it is not received as a constant.
The second pointer, src (think, "source"), references the string to concatenate. In contrast to dest, src is received as a constant, since its contents will merely be copied, not changed.
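Because strcat trusts the caller to size dest correctly, it can help to check the sizes before calling it. The wrapper below is a sketch under stated assumptions: checked_concat and its dest_size parameter are hypothetical (they are not part of <string.h>), and dest is assumed to already hold a null-terminated string.

```c
#include <stddef.h>
#include <string.h>

/*
 * checked_concat - hypothetical wrapper, for illustration only.
 * Calls strcat only when dest (a buffer of dest_size bytes) can hold
 * both strings plus the terminating null byte.
 * Returns dest on success, or NULL if the result would not fit.
 */
char *checked_concat(char *dest, size_t dest_size, const char *src)
{
    size_t dest_len = strlen(dest);
    size_t src_len = strlen(src);

    /* Need dest_len + src_len + 1 bytes in total */
    if (dest_len >= dest_size || src_len >= dest_size - dest_len)
        return (NULL);

    return (strcat(dest, src));
}
```

With a 16-byte buffer holding "Holberton", appending "School" succeeds (exactly 16 bytes are needed), while appending one more character afterwards returns NULL and leaves the buffer unchanged.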
char *
You will receive nothing back from strcat that you do not give it - after concatenating the received src string to the end of the string referenced by dest, the function turns around and returns a pointer to dest, the same memory address passed when you call the function.
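One practical consequence of returning dest is that calls can be nested. The sketch below uses a hypothetical helper, append_entry, to show the inner call's return value feeding the outer call; it assumes list has enough room for the result.

```c
#include <string.h>

/*
 * append_entry - hypothetical helper, for illustration only.
 * Appends ":" and then entry to list by nesting two strcat calls.
 * The inner call returns list, which becomes the outer call's dest.
 * list must have room for the result.
 */
char *append_entry(char *list, const char *entry)
{
    return (strcat(strcat(list, ":"), entry));
}
```

Given char list[16] = "/bin", the call append_entry(list, "/usr/bin") returns list itself, which now holds "/bin:/usr/bin".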
The function strcat is declared as follows:
/**
 * strcat - Concatenates the string pointed to by @src, including the terminating
 *          null byte, to the end of the string pointed to by @dest.
 * @dest: A pointer to the string to be concatenated upon.
 * @src: The source string to be appended to @dest.
 *
 * Return: A pointer to the destination string @dest.
 */
char *strcat(char *dest, const char *src);
To use the function strcat, include the C string library header <string.h>.
#include <string.h>
Once the C string library has been included, you can call the function strcat directly.
Example (note that the libraries <stdio.h> and <stdlib.h> are additionally included here for the usage of printf and EXIT_SUCCESS, respectively):
$ cat main.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    /* 15 bytes: 7 + 7 characters plus the terminating null byte */
    char dest[15] = "Brennan";
    char src[8] = "Baraban";

    /* Before concatenation */
    printf("String dest before concat: %s\n", dest);
    printf("String src before concat: %s\n", src);

    /* Concatenate src to the end of dest */
    strcat(dest, src);

    /* After concatenation */
    printf("String dest after concat: %s\n", dest);
    printf("String src after concat: %s\n", src);

    return (EXIT_SUCCESS);
}
$ gcc main.c -o strcat
$ ./strcat
String dest before concat: Brennan
String src before concat: Baraban
String dest after concat: BrennanBaraban
String src after concat: Baraban
A particularly cool, applicable use-case of strcat can be understood within the context of environment variables, specifically the PATH environment variable.
ASIDE - Environment Variables
If you are unfamiliar with environment variables, for now, take away that they are variables provided to running processes for the purposes of affecting behavior and providing context. The PATH environment variable is a string of colon-separated directories within which a process (such as a shell) can search for executable programs.
Environment variables are utilized in many ways, but to learn more about how to use them with Bash, you can start here (http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_03_02.html).
What do you do when you want to add a directory to your shell program’s PATH? Most likely, you add the new directory to the old PATH like so (recall that the $ symbol achieves variable expansion in Bash).
$ export PATH=$PATH:/new_directory
In other words, you take an existing string, the original PATH, and concatenate a new string, the new directory, onto the end of it.
Let’s mock up a basic C program that achieves this behavior using strcat. The program will receive the original path string as the first command line argument, and the new location to append as a second.
$ cat main.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *concat_directory(char *path, const char *directory)
{
    /* static, so the buffer still exists after the function returns */
    static char buffer[100];

    /* Copy original path into larger buffer */
    strcpy(buffer, path);

    /* Concatenate a : for separation */
    strcat(buffer, ":");

    /* Concatenate the new directory */
    return strcat(buffer, directory);
}

int main(int argc, char *argv[])
{
    char *path, *directory;

    /* Both command line arguments are required */
    if (argc != 3)
        return (EXIT_FAILURE);

    path = argv[1];
    directory = argv[2];

    printf("Original path: %s\n", path);

    /* Concatenate directory to path */
    path = concat_directory(path, directory);

    printf("New path: %s\n", path);

    return (EXIT_SUCCESS);
}
$ gcc main.c -o path
$ ./path "/usr/local/sbin:/usr/local/bin" "/new_directory"
Original path: /usr/local/sbin:/usr/local/bin
New path: /usr/local/sbin:/usr/local/bin:/new_directory
Before you know it, we'll have our own working shell program 😉🐚.
So, the above program is cool and all, but in truth, it's not that great. Not only is a 100-character buffer wasteful when the combined path is much shorter than that, but on the flip side, strcpy and strcat do no bounds checking, so any input that does not fit writes past the end of the buffer, which is undefined behavior. And, as any of you familiar with the PATH variable will know - it can get quite long.
Case in point, the PATH on my Windows Subsystem for Linux:
$ echo $PATH /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/mnt/c/Program Files (x86)/NVIDIA Corporation/PhysX/Common:/mnt/c/Program Files (x86)/Intel/iCLS Client/:/mnt/c/Program Files/Intel/iCLS Client/:/mnt/c/Program Files/Dell/DW WLAN Card:/mnt/c/WINDOWS/system32:/mnt/c/WINDOWS:/mnt/c/WINDOWS/System32/Wbem:/mnt/c/WINDOWS/System32/WindowsPowerShell/v1.0/:/mnt/c/Program Files/WIDCOMM/Bluetooth Software/:/mnt/c/Program Files/WIDCOMM/Bluetooth Software/syswow64:/mnt/c/Program Files/Intel/Intel(R) Management Engine Components/DAL:/mnt/c/Program Files (x86)/Intel/Intel(R) Management Engine Components/DAL:/mnt/c/Program Files/Intel/Intel(R) Management Engine Components/IPT:/mnt/c/Program Files (x86)/Intel/Intel(R) Management Engine Components/IPT:/mnt/c/Program Files (x86)/Skype/Phone/:/mnt/c/Program Files/PuTTY/:/mnt/c/WINDOWS/System32/OpenSSH/:/mnt/c/Program Files/dotnet/:/mnt/c/HashiCorp/Vagrant/bin:/mnt/c/Program Files/PowerShell/6/:/mnt/c/Users/Brennan/AppData/Roaming/nvm:/mnt/c/Program Files/nodejs:/mnt/c/Program Files (x86)/Yarn/bin/:/mnt/c/Program Files/Git/cmd:/mnt/c/Users/Brennan/.cargo/bin:/mnt/c/Users/Brennan/AppData/Local/Programs/Python/Python37/Scripts/:/mnt/c/Users/Brennan/AppData/Local/Programs/Python/Python37/:/mnt/c/Users/Brennan/AppData/Local/Microsoft/WindowsApps:/mnt/c/Users/Brennan/AppData/Roaming/npm:/mnt/c/Users/Brennan/AppData/Roaming/Dashlane/6.1907.0.17833/bin/Firefox_Extension/{442718d9-475e-452a-b3e1-fb1ee16b8e9f}/components:/mnt/c/Users/Brennan/AppData/Roaming/Dashlane/6.1907.0.17833/ucrt:/mnt/c/Users/Brennan/AppData/Roaming/Dashlane/6.1907.0.17833/bin/Qt:/mnt/c/Users/Brennan/AppData/Roaming/Dashlane/6.1907.0.17833/bin/Ssl:/mnt/c/Users/Brennan/Downloads/cmder:/mnt/c/Users/Brennan/AppData/Local/Programs/Microsoft VS Code/bin:/mnt/c/Users/Brennan/AppData/Local/Programs/Microsoft VS Code Insiders/bin:/mnt/c/Users/Brennan/AppData/Roaming/nvm:/mnt/c/Program 
Files/nodejs:/mnt/c/Users/Brennan/AppData/Local/now-cli:/mnt/c/Users/Brennan/AppData/Local/Yarn/bin:/mnt/c/Exercism:/mnt/c/Program Files/Docker Toolbox:/mnt/c/Users/Brennan/AppData/Local/Microsoft/WindowsApps:/mnt/c/Program Files/ArangoDB3 3.4.5/usr/bin/:/mnt/c/Exercism/exercism.exe:/mnt/c/Users/Brennan/AppData/Local/hyper/app-3.0.2/resources/bin
More practically, and usefully, strcat should be utilized with dynamically-allocated memory. This way, you can allocate just the space you need to concatenate one string onto another - no more, no less.
We can incorporate dynamically-allocated memory using a combination of the standard library function malloc, to allocate memory, and the string library function strlen, to determine the size of the string we are achieving through concatenation.
$ cat main.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *concat_directory(char *path, const char *directory)
{
    /* Both lengths, plus one byte for ':' and one for the null byte */
    char *buffer = malloc(strlen(path) + strlen(directory) + 2);

    /* Check if malloc failed */
    if (buffer == NULL)
        return NULL;

    /* Copy original path into larger buffer */
    strcpy(buffer, path);

    /* Concatenate a : for separation */
    strcat(buffer, ":");

    /* Concatenate the new directory */
    return strcat(buffer, directory);
}

int main(int argc, char *argv[])
{
    char *path, *directory;

    /* Both command line arguments are required */
    if (argc != 3)
        return (EXIT_FAILURE);

    path = argv[1];
    directory = argv[2];

    printf("Original path: %s\n", path);

    /* Concatenate directory to path */
    path = concat_directory(path, directory);

    /* concat_directory might have failed */
    if (path == NULL)
        return (EXIT_FAILURE);

    printf("New path: %s\n", path);

    /* One free for each malloc */
    free(path);

    return (EXIT_SUCCESS);
}
$ gcc main.c -o dynamic
$ ./dynamic "/usr/local/sbin:/usr/local/bin" "/new_directory"
Original path: /usr/local/sbin:/usr/local/bin
New path: /usr/local/sbin:/usr/local/bin:/new_directory
Note two important points. First, we must allocate two bytes more than the combined length of the strings - one because strlen does not include the terminating null byte in its returned length, and another for the separating colon character (:). Second, make sure to adhere to the golden rule - for every call to malloc, there should be a corresponding free.
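The size arithmetic is easy to get wrong by one, so it can be worth isolating. In the sketch below, joined_size is a hypothetical helper (not a library function) that spells out each byte being counted.

```c
#include <stddef.h>
#include <string.h>

/*
 * joined_size - hypothetical helper, for illustration only.
 * Returns the number of bytes to allocate for "path:directory":
 * both lengths, plus one byte for ':' and one for the null byte.
 */
size_t joined_size(const char *path, const char *directory)
{
    return (strlen(path) + 1 + strlen(directory) + 1);
}
```

For the arguments used in the run above, "/usr/local/sbin:/usr/local/bin" is 30 characters and "/new_directory" is 14, so joined_size returns 30 + 14 + 2 = 46.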
String functions are fun. I present my implementation of the function strcat.
/*
 * File: 0-strcat.c
 * Auth: Brennan D Baraban
 */

#include "holberton.h"

/**
 * strcat - Concatenates the string pointed to by @src, including the terminating
 *          null byte, to the end of the string pointed to by @dest.
 * @dest: A pointer to the string to be concatenated upon.
 * @src: The source string to be appended to @dest.
 *
 * Return: A pointer to the destination string @dest.
 */
char *strcat(char *dest, const char *src)
{
    int index = 0, dest_len = 0;

    /* Find the index of dest's terminating null byte */
    while (dest[index++])
        dest_len++;

    /* Copy src over dest, starting at that null byte */
    for (index = 0; src[index]; index++)
        dest[dest_len++] = src[index];

    /* Terminate the combined string */
    dest[dest_len] = '\0';

    return (dest);
}
Before I begin concatenating src onto dest, I first locate the end of the string referenced by dest by looping over it until I encounter a null byte, which evaluates to false (0) in a conditional. Once I have the length of this initial string, I know the index in dest at which copying must begin, and I copy the contents of src to the end of dest one character at a time. Because the copying loop stops at src’s null byte without copying it, I finish by writing the terminating null byte myself.
Of course, this is just one personal implementation of the function strcat. There are multiple ways to write it; in fact, I encourage, no, challenge, you to find another way to write this function!
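As one possible answer to that challenge, here is a sketch that walks pointers instead of indices. The name ptr_strcat is hypothetical, chosen so the function does not clash with the standard strcat; like the original, it assumes dest has room for the result.

```c
/*
 * ptr_strcat - hypothetical pointer-based variant, for illustration only.
 * Appends src, including its terminating null byte, to the end of dest.
 * Returns dest.
 */
char *ptr_strcat(char *dest, const char *src)
{
    char *end = dest;

    /* Walk to dest's terminating null byte */
    while (*end != '\0')
        end++;

    /* Copy src, including its null byte, starting at that address */
    while ((*end++ = *src++) != '\0')
        ;

    return (dest);
}
```

Because the second loop copies each character before testing it, the null byte is copied too, so no separate termination step is needed.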