
STRCAT in C


I apologize if you came here looking for content related to strings and cats; this article is a tutorial on the C string concatenation function strcat.

PREFACE - WHAT IS C?

More than just the third letter of the alphabet, C is a compiled programming language - in other words, a computer can run a C program only after it has been compiled. Compilation is performed by software called a compiler, which translates C source code files into machine code (binary) that the computer can execute.

There are many different C compilers, but in this tutorial, I will be using one called GCC, published by GNU. For instructions on how to install GCC, you can visit GNU’s installation guide here (https://gcc.gnu.org/install/).

Throughout this article, I demonstrate GCC on the command line. If you are unfamiliar with the command line, read more here (http://linuxcommand.org/index.php).

To read more on what C is and how to use it, see my earlier post on the topic (Learn C).

STRCAT - WHAT

The function strcat (think, "string concatenation") is a C standard library function that concatenates (appends) one string to the end of another.

ASIDE - STRING REFRESHER
When working with strings in C, remember - strings are no more than arrays of ASCII-encoded characters ending with a terminating null byte (\0). A pointer to a string is merely a pointer to the first character in this array.
For a more in-depth examination of pointers, including a look at strings, I encourage you to visit one of my earlier posts (Pointers in C).

You may be familiar with a similar C string function, strcpy (if you’re not, I encourage you to check out my dedicated article on it!). Recall that strcpy copies the characters referenced by one string into the memory pointed to by another. In other words, if we were to copy str1, visualized below:

char str1[10] = "Holberton";
Address   0x00  0x01  0x02  0x03  0x04  0x05  0x06  0x07  0x08  0x09
Variable  str1
Value     H     o     l     b     e     r     t     o     n     \0

Into the memory pointed to by str2, visualized below:

char str2[10];
Address   0x20  0x21  0x22  0x23  0x24  0x25  0x26  0x27  0x28  0x29
Variable  str2
Value     ?     ?     ?     ?     ?     ?     ?     ?     ?     ?

We would achieve the following:

Address   0x00  0x01  0x02  0x03  0x04  0x05  0x06  0x07  0x08  0x09
Variable  str1
Value     H     o     l     b     e     r     t     o     n     \0
Address   0x20  0x21  0x22  0x23  0x24  0x25  0x26  0x27  0x28  0x29
Variable  str2
Value     H     o     l     b     e     r     t     o     n     \0

In essence, we have copied the string "Holberton" so that it exists in two different spots in memory - once at 0x00, and again at 0x20.

Why am I wasting my time reviewing strcpy for an article on strcat, you ask? Well, it turns out that strcpy and strcat work similarly. Truly, if you understand how strcpy works, you understand how strcat works.

Backtrack a step and imagine two new strings, still named str1 and str2. This time, str1 points to an array of 16 characters, starting with the string "Holberton". In the meantime, the second string, str2, references a new array of 7 characters, "School".

char str1[16] = "Holberton";
Address   0x00  0x01  0x02  0x03  0x04  0x05  0x06  0x07  0x08  0x09  0x0A  0x0B  0x0C  0x0D  0x0E  0x0F
Variable  str1
Value     H     o     l     b     e     r     t     o     n     \0    \0    \0    \0    \0    \0    \0
char str2[7] = "School";
Address   0x20  0x21  0x22  0x23  0x24  0x25  0x26
Variable  str2
Value     S     c     h     o     o     l     \0

(The unused elements of str1 hold \0 because C sets the remaining elements of a partially initialized array to zero.)

Now, when we concatenate str2 onto str1, what we are truly doing is copying the contents of str2 to the end of the string referenced by str1, thereby achieving the following:

Address   0x00  0x01  0x02  0x03  0x04  0x05  0x06  0x07  0x08  0x09  0x0A  0x0B  0x0C  0x0D  0x0E  0x0F
Variable  str1
Value     H     o     l     b     e     r     t     o     n     S     c     h     o     o     l     \0
Address   0x20  0x21  0x22  0x23  0x24  0x25  0x26
Variable  str2
Value     S     c     h     o     o     l     \0

Feel familiar? Now, the string "School" exists in two separate locations in memory - once still at 0x20, and again at the end of the original string referenced by str1, starting at 0x09. We have done nothing more than strcpy "School" starting at the memory address of str1’s original null byte (\0).

Note a key concept visualized above - when we concatenate str2 to the end of str1, we begin overwriting not at the last memory address in the entire str1 array, 0x0F, but specifically at str1’s first null byte, which happened to occur at address 0x09. Also, note that the copying of str2 to the end of str1 includes the terminating null byte.

PARAMETERS
char *dest, const char *src

The function receives two parameters, two character pointers.

The first pointer, dest (think, "destination"), references the memory buffer where characters will be concatenated. Note that strcat does not automate any memory handling - it directly tries to concatenate characters at the end of whatever string is initially referenced by dest. Because of this, you must allocate the destination buffer up-front, with room for both strings plus one terminating null byte; otherwise, strcat writes past the end of the buffer, which is undefined behavior. Since the contents of dest change, it is not received as a constant.

The second pointer, src (think, "source"), references the string to concatenate. In contrast to dest, src is received as a constant, since its contents will merely be copied, not changed.

RETURN VALUE
char *

You will receive nothing back from strcat that you do not give it - after concatenating the received src string to the end of the string referenced by dest, the function turns around and returns a pointer to dest, the same memory address passed when you call the function.

DECLARATION

The function strcat is declared as follows:

/**
* strcat - Concatenates the string pointed to by @src, including the terminating
*          null byte, to the end of the string pointed to by @dest.
* @dest: A pointer to the string to be concatenated upon.
* @src: The source string to be appended to @dest.
*
* Return: A pointer to the destination string @dest.
*/
char *strcat(char *dest, const char *src);

STRCAT - HOW

To use the function strcat, include the C standard library's string header, <string.h>.

#include <string.h>

Once the C string library has been included, you can call the function strcat directly.

Example (note that the headers <stdio.h> and <stdlib.h> are additionally included here for the usage of printf and EXIT_SUCCESS, respectively):

$ cat main.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char dest[15] = "Brennan";
    char src[8] = "Baraban";

    /* Before concatenation */
    printf("String dest before concat: %s\n", dest);
    printf("String src before concat: %s\n", src);

    /* Concatenate src to the end of dest */
    strcat(dest, src);

    /* After concatenation */
    printf("String dest after concat: %s\n", dest);
    printf("String src after concat: %s\n", src);

    return (EXIT_SUCCESS);
}
$ gcc main.c -o strcat
$ ./strcat
String dest before concat: Brennan
String src before concat: Baraban
String dest after concat: BrennanBaraban
String src after concat: Baraban                                               

STRCAT - WHEN

A particularly cool, applicable use-case of strcat can be understood within the context of environment variables, specifically the PATH environment variable.

ASIDE - Environment Variables
If you are unfamiliar with environment variables, for now, take away that they are variables provided to running processes for the purposes of affecting behavior and providing context. The PATH environment variable is a string of colon-separated directories within which a process (such as a shell) can search for executable programs.
Environment variables are utilized in many ways, but to learn more about how to use them with Bash, you can start here (http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_03_02.html).

What do you do when you want to add a directory to your shell program’s PATH? Most likely, you add the new directory to the old PATH like so (recall that the $ symbol achieves variable expansion in Bash).

$ export PATH=$PATH:/new_directory                      

In other words, you take a destination string, the original PATH, and concatenate a source string onto it: a colon followed by the new directory.

Let’s mock up a basic C program that achieves this behavior using strcat. The program will receive the original path string as the first command line argument, and the new location to append as a second.

$ cat main.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *concat_directory(char *path, const char *directory)
{
    /*
     * static, so the buffer outlives this call - returning a pointer
     * to an ordinary local array would be undefined behavior
     */
    static char buffer[100];

    /* Copy original path into larger buffer */
    strcpy(buffer, path);

    /* Concatenate a : for separation */
    strcat(buffer, ":");

    /* Concatenate the new directory */
    return strcat(buffer, directory);
}

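int main(int argc, char *argv[])
{
    char *path, *directory;

    /* Both the original path and the new directory are required */
    if (argc != 3)
        return (EXIT_FAILURE);

    path = argv[1];
    directory = argv[2];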

    printf("Original path: %s\n", path);
    
    /* Concatenate directory to path */
    path = concat_directory(path, directory);

    printf("New path: %s\n", path);

    return (EXIT_SUCCESS);
}
$ gcc main.c -o path
$ ./path "/usr/local/sbin:/usr/local/bin" "/new_directory"
Original path: /usr/local/sbin:/usr/local/bin
New path: /usr/local/sbin:/usr/local/bin:/new_directory                     

Before you know it, we'll have our own working shell program 😉🐚.

ADVANCED 100 - DYNAMIC MEMORY

So, the above program is cool and all, but in truth, it's not that great. Not only is a 100-character buffer wasteful for paths shorter than 100 characters, but on the flip side, it will overflow for longer inputs: strcpy and strcat do no bounds checking, and writing past the end of the buffer is undefined behavior. And, as any of you familiar with the PATH variable will know - it can get quite long.

Case in point, the PATH on my Windows Subsystem for Linux:

$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/mnt/c/Program Files (x86)/NVIDIA Corporation/PhysX/Common:/mnt/c/Program Files (x86)/Intel/iCLS Client/:/mnt/c/Program Files/Intel/iCLS Client/:/mnt/c/Program Files/Dell/DW WLAN Card:/mnt/c/WINDOWS/system32:/mnt/c/WINDOWS:/mnt/c/WINDOWS/System32/Wbem:/mnt/c/WINDOWS/System32/WindowsPowerShell/v1.0/:/mnt/c/Program Files/WIDCOMM/Bluetooth Software/:/mnt/c/Program Files/WIDCOMM/Bluetooth Software/syswow64:/mnt/c/Program Files/Intel/Intel(R) Management Engine Components/DAL:/mnt/c/Program Files (x86)/Intel/Intel(R) Management Engine Components/DAL:/mnt/c/Program Files/Intel/Intel(R) Management Engine Components/IPT:/mnt/c/Program Files (x86)/Intel/Intel(R) Management Engine Components/IPT:/mnt/c/Program Files (x86)/Skype/Phone/:/mnt/c/Program Files/PuTTY/:/mnt/c/WINDOWS/System32/OpenSSH/:/mnt/c/Program Files/dotnet/:/mnt/c/HashiCorp/Vagrant/bin:/mnt/c/Program Files/PowerShell/6/:/mnt/c/Users/Brennan/AppData/Roaming/nvm:/mnt/c/Program Files/nodejs:/mnt/c/Program Files (x86)/Yarn/bin/:/mnt/c/Program Files/Git/cmd:/mnt/c/Users/Brennan/.cargo/bin:/mnt/c/Users/Brennan/AppData/Local/Programs/Python/Python37/Scripts/:/mnt/c/Users/Brennan/AppData/Local/Programs/Python/Python37/:/mnt/c/Users/Brennan/AppData/Local/Microsoft/WindowsApps:/mnt/c/Users/Brennan/AppData/Roaming/npm:/mnt/c/Users/Brennan/AppData/Roaming/Dashlane/6.1907.0.17833/bin/Firefox_Extension/{442718d9-475e-452a-b3e1-fb1ee16b8e9f}/components:/mnt/c/Users/Brennan/AppData/Roaming/Dashlane/6.1907.0.17833/ucrt:/mnt/c/Users/Brennan/AppData/Roaming/Dashlane/6.1907.0.17833/bin/Qt:/mnt/c/Users/Brennan/AppData/Roaming/Dashlane/6.1907.0.17833/bin/Ssl:/mnt/c/Users/Brennan/Downloads/cmder:/mnt/c/Users/Brennan/AppData/Local/Programs/Microsoft VS Code/bin:/mnt/c/Users/Brennan/AppData/Local/Programs/Microsoft VS Code Insiders/bin:/mnt/c/Users/Brennan/AppData/Roaming/nvm:/mnt/c/Program 
Files/nodejs:/mnt/c/Users/Brennan/AppData/Local/now-cli:/mnt/c/Users/Brennan/AppData/Local/Yarn/bin:/mnt/c/Exercism:/mnt/c/Program Files/Docker Toolbox:/mnt/c/Users/Brennan/AppData/Local/Microsoft/WindowsApps:/mnt/c/Program Files/ArangoDB3 3.4.5/usr/bin/:/mnt/c/Exercism/exercism.exe:/mnt/c/Users/Brennan/AppData/Local/hyper/app-3.0.2/resources/bin                        

More practically, and usefully, strcat should be utilized with dynamically-allocated memory. This way, you can allocate just the space you need to concatenate one string onto another - no more, no less.

We can incorporate dynamically-allocated memory using a combination of the standard library function malloc, to allocate memory, and the string library function strlen, to determine the size of the string we are achieving through concatenation.

$ cat main.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *concat_directory(char *path, const char *directory)
{
    char *buffer = malloc(strlen(path) + strlen(directory) + 2);

    /* Check if malloc failed */
    if (buffer == NULL)
        return NULL;

    /* Copy original path into larger buffer */
    strcpy(buffer, path);

    /* Concatenate a : for separation */
    strcat(buffer, ":");

    /* Concatenate the new directory */
    return strcat(buffer, directory);
}

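int main(int argc, char *argv[])
{
    char *path, *directory;

    /* Both the original path and the new directory are required */
    if (argc != 3)
        return (EXIT_FAILURE);

    path = argv[1];
    directory = argv[2];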

    printf("Original path: %s\n", path);

    /* Concatenate directory to path */
    path = concat_directory(path, directory);

    /* concat_directory might have failed */
    if (path == NULL)
        return (EXIT_FAILURE);

    printf("New path: %s\n", path);

    /* One free for each malloc */
    free(path);

    return (EXIT_SUCCESS);
}
$ gcc main.c -o dynamic
$ ./dynamic "/usr/local/sbin:/usr/local/bin" "/new_directory"
Original path: /usr/local/sbin:/usr/local/bin
New path: /usr/local/sbin:/usr/local/bin:/new_directory

Note two important points. First, we must allocate enough memory for two bytes more than the length of the combined strings - one because strlen does not include the terminating null byte in its returned length, and another for the separating colon character :. Second, make sure to adhere to the golden rule - for every call to malloc, there should be a corresponding free.

ADVANCED 101 - IMPLEMENTATION

String functions are fun. I present my implementation of the function strcat.

/*
* File: 0-strcat.c
* Auth: Brennan D Baraban
*/

#include "holberton.h"

/**
* strcat - Concatenates the string pointed to by @src, including the terminating
*      	null byte, to the end of the string pointed to by @dest.
* @dest: A pointer to the string to be concatenated upon.
* @src: The source string to be appended to @dest.
*
* Return: A pointer to the destination string @dest.
*/
char *strcat(char *dest, const char *src)
{
    int index = 0, dest_len = 0;

    /* Find the index of dest's terminating null byte */
    while (dest[index++])
        dest_len++;

    /* Copy src onto the end of dest, one character at a time */
    for (index = 0; src[index]; index++)
        dest[dest_len++] = src[index];

    /* The loop stops before src's null byte, so terminate dest here */
    dest[dest_len] = '\0';

    return (dest);
}

source: https://github.com/bdbaraban/holbertonschool-low_level_programming/blob/master/0x05-pointers_arrays_strings/0-strcat.c

Before I begin concatenating src onto dest, I first locate the end of the string referenced by dest by looping over it until I encounter a null byte, which evaluates to false (0) in a conditional. Once I have the index of that null byte, I know exactly where to begin writing, and I copy the contents of src to the end of dest one character at a time. Because the copy loop stops before it reaches src's own null byte, I finish by writing a terminating null byte to dest.

Of course, this is just one, personal implementation of the function strcat. There are multiple ways to do so; in fact, I encourage, no, challenge, you to find another way to write this function!

NOTE

Examples in this article were compiled and run on a Linux Ubuntu 18.04 LTS machine with GNU GCC version 7.3.0.
Written by:

Brennan Baraban, Cohort 7 (SF Campus)