You might not want to do this on your school exams, but in C, copying is practically a must. Today, let’s learn how to copy strings in C using the function strcpy.
More than just the third letter of the alphabet, C is a compiled programming language - in other words, a computer can run a C program only after it has been compiled. Compilation is performed by software called a compiler, which takes C source code files and translates them into machine code (binary) that the computer can execute.
There are many different C compilers, but in this tutorial, I will be using GCC, the GNU Compiler Collection. For installation instructions, you can visit GNU’s installation guide here (https://gcc.gnu.org/install/).
Throughout this article, I demonstrate GCC on the command line. If you are unfamiliar with the command line, read more here (http://linuxcommand.org/index.php).
The function strcpy (think, "string copy") is a C standard library function that copies a string.
ASIDE - STRING REFRESHER
When working with strings in C, remember - strings are no more than arrays of characters (typically ASCII-encoded) ending with a terminating null byte (\0). A pointer to a string is merely a pointer to the first character in this array.
For a more in-depth examination on pointers, including a look at strings, I encourage you to visit one of my earlier posts (Pointers in C).
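To make the refresher concrete, here is a minimal sketch of walking a string from the pointer to its first character up to the terminating null byte. The helper name count_chars is my own invention for illustration; it is not part of the standard library (which already offers strlen for this job).

```c
#include <stddef.h>

/*
 * count_chars - Hypothetical helper, for illustration only.
 * @str: Pointer to the first character of a null-terminated string.
 *
 * Walks the array one character at a time until it reaches the
 * terminating null byte.
 *
 * Return: The number of characters before the null byte.
 */
size_t count_chars(const char *str)
{
    const char *cursor = str;

    /* Advance until the cursor lands on the null byte */
    while (*cursor != '\0')
        cursor++;

    /* Distance between the null byte and the first character */
    return ((size_t)(cursor - str));
}
```

Called on "Holberton", this helper counts 9 characters, even though the string occupies 10 bytes in memory once the null byte is included.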
What exactly do I mean when I say copy? Glad you asked.
Say we have a string str1, an array of ten characters holding "Holberton" (nine letters plus the terminating null byte). In memory, this string can be represented like so:
char str1[10] = "Holberton";
Address  | 0x00 | 0x01 | 0x02 | 0x03 | 0x04 | 0x05 | 0x06 | 0x07 | 0x08 | 0x09 |
Variable | str1 (spans all ten bytes)                                          |
Value    | H    | o    | l    | b    | e    | r    | t    | o    | n    | \0   |
In addition to str1, we have a second string, str2. This is a separate array of 10 characters stored at a completely different memory address. Initially, we declare this block of memory without knowing its contents (note that, in practice, it is a good idea to at least initialize empty arrays with null bytes).
char str2[10];
Address  | 0x15 | 0x16 | 0x17 | 0x18 | 0x19 | 0x1A | 0x1B | 0x1C | 0x1D | 0x1E |
Variable | str2 (spans all ten bytes)                                          |
Value    | ?    | ?    | ?    | ?    | ?    | ?    | ?    | ?    | ?    | ?    |
Ok, now, when we copy a string, what we are truly doing is copying its characters into a separate block of memory. So, if we were to copy the contents of str1 into the memory referenced by str2, we would achieve the following:
Address  | 0x00 | 0x01 | 0x02 | 0x03 | 0x04 | 0x05 | 0x06 | 0x07 | 0x08 | 0x09 |
Variable | str1 (spans all ten bytes)                                          |
Value    | H    | o    | l    | b    | e    | r    | t    | o    | n    | \0   |

Address  | 0x15 | 0x16 | 0x17 | 0x18 | 0x19 | 0x1A | 0x1B | 0x1C | 0x1D | 0x1E |
Variable | str2 (spans all ten bytes)                                          |
Value    | H    | o    | l    | b    | e    | r    | t    | o    | n    | \0   |
Now, the string "Holberton" exists in two different spots in memory - once at 0x00, and again at 0x15. It has been copied!
Crucially, note that the source string’s terminating null byte was included when copied into the destination memory block. This extra byte is important to keep in mind when allocating enough memory for a string copy.
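The tables above can be sketched in code. The function below is a hypothetical demo of my own (the name copy_is_independent is not from any library): it copies str1 into str2, changes only the copy, and checks that the original is untouched and that the null byte made the trip.

```c
#include <string.h>

/*
 * copy_is_independent - Hypothetical demo function, for illustration only.
 *
 * Copies str1 into str2, modifies the copy, and checks that the
 * original string was left untouched.
 *
 * Return: 1 if the two strings live independently, 0 otherwise.
 */
int copy_is_independent(void)
{
    char str1[10] = "Holberton";
    char str2[10]; /* 9 letters + 1 terminating null byte */

    strcpy(str2, str1);

    /* Change only the copy */
    str2[0] = 'h';

    /* str1 keeps its 'H'; str2 was changed and is still terminated */
    return (str1[0] == 'H' && str2[0] == 'h' && str2[9] == '\0');
}
```

Had str2 been declared with only 9 bytes, the null byte would have had nowhere to go, which is exactly why the extra byte matters.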
Parameters: char *dest, const char *src
The function receives two parameters, two character pointers.
The first pointer, dest (think, "destination"), references the memory buffer where characters will be copied. Note that strcpy does not automate any memory handling - it directly tries to copy characters into whatever address is referenced by dest. Because of this, you must allocate enough space for the destination buffer up front. Since the contents of dest change, it is not received as a constant.
The second pointer, src (think, "source"), references the string to copy. In contrast to dest, src is received as a constant, since its contents will merely be copied, not changed.
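Since strcpy trusts you completely about the size of dest, one way to protect yourself is to check the fit before copying. The wrapper below is a rough sketch under my own assumptions: copy_checked and its dest_size parameter are hypothetical, not part of <string.h>.

```c
#include <stddef.h>
#include <string.h>

/*
 * copy_checked - Hypothetical wrapper around strcpy, for illustration only.
 * @dest: The destination buffer.
 * @dest_size: The total size of @dest, in bytes.
 * @src: The source string to copy.
 *
 * Copies only when @src, including its terminating null byte,
 * fits inside @dest.
 *
 * Return: @dest on success, NULL if @src would not fit.
 */
char *copy_checked(char *dest, size_t dest_size, const char *src)
{
    /* strlen does not count the null byte, so add one for it */
    if (strlen(src) + 1 > dest_size)
        return (NULL);

    return (strcpy(dest, src));
}
```

With a 7-byte buffer, copying "Source" succeeds, while copying "Sources" is refused because it needs 8 bytes.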
Return value: char *
You will receive nothing back from strcpy that you do not give it - after copying the received src string into the buffer dest, the function turns around and returns a pointer to dest, the same memory address passed when you call the function.
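Returning dest may look redundant, but it lets a call to strcpy sit inside a larger expression. Here is a small sketch of that idea; the helper name copied_length is hypothetical and exists only for this example.

```c
#include <stddef.h>
#include <string.h>

/*
 * copied_length - Hypothetical helper, for illustration only.
 * @dest: A buffer large enough for @src plus its null byte
 *        (the caller is responsible for guaranteeing this).
 * @src: The source string to copy.
 *
 * Copies @src into @dest and measures the copy in one expression by
 * passing the return value of strcpy (which is @dest) to strlen.
 *
 * Return: The length of the copied string.
 */
size_t copied_length(char *dest, const char *src)
{
    return (strlen(strcpy(dest, src)));
}
```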
The function strcpy is declared as follows:
/**
 * strcpy - Copies a string pointed to by @src, including the
 *          terminating null byte, to a buffer pointed to by @dest.
 * @dest: A buffer to copy the string to.
 * @src: The source string to copy.
 *
 * Return: A pointer to the destination string @dest.
 */
char *strcpy(char *dest, const char *src);
To use the function strcpy, include the C string library header <string.h>.
#include <string.h>
Once the C string library has been included, you can call the function strcpy directly.
Example (note that the headers <stdio.h> and <stdlib.h> are additionally included here for the usage of printf and EXIT_SUCCESS, respectively):
$ cat main.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char src[7] = "Source";
    char dest[7];

    /* Initialize dest with null bytes for good practice */
    memset(dest, '\0', sizeof(dest));

    /* Before copying */
    printf("String src before copy: %s\n", src);
    printf("String dest before copy: %s\n", dest);

    /* Copy src into dest */
    strcpy(dest, src);

    /* After copying */
    printf("String src after copy: %s\n", src);
    printf("String dest after copy: %s\n", dest);

    return (EXIT_SUCCESS);
}
$ gcc main.c -o strcpy
$ ./strcpy
String src before copy: Source
String dest before copy:
String src after copy: Source
String dest after copy: Source
The usefulness of strcpy primarily comes into play within the context of string literals.
Recall that string literals should be treated as immutable (unchangeable): they are typically stored in read-only memory, and modifying one is undefined behavior. If we try to change the contents of a string literal in C, we’ll usually encounter some unfortunate behavior.
$ cat main.c
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *src = "Holberton";

    src[0] = 'B';

    return (EXIT_SUCCESS);
}
$ gcc main.c -o strliteral
$ ./strliteral
Segmentation fault (core dumped)
The infamous segmentation fault 😭.
Yet you will often need to alter the contents of a given string, and more often than not, that string will be one you cannot safely modify in place, such as a string literal. So, how would we achieve this?
Surprise surprise - strcpy can help us out. To demonstrate, let’s write a program that receives a word as a string and capitalizes it, almost like a rudimentary auto-correct feature.
$ cat main.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char *word;
    /* Temporary buffer */
    char tmp[100];

    /* A word to capitalize must be given on the command line */
    if (argc < 2)
        return (EXIT_FAILURE);
    word = argv[1];

    /* Initialize tmp with null bytes for good practice */
    memset(tmp, '\0', sizeof(tmp));

    printf("Before capitalization: %s\n", word);

    /* Copy word into tmp */
    strcpy(tmp, word);

    /*
     * If first character of word is a lowercase letter, capitalize it.
     * In ASCII, an uppercase letter is its lowercase value minus 32.
     */
    if (tmp[0] >= 'a' && tmp[0] <= 'z')
        tmp[0] -= 32;

    /* Reassign capitalized word */
    word = tmp;

    printf("After capitalization: %s\n", word);

    return (EXIT_SUCCESS);
}
$ gcc main.c -o capitalize
$ ./capitalize brennan
Before capitalization: brennan
After capitalization: Brennan
Beautiful! With strcpy, we can alter strings at will!
The space efficiency enthusiasts among you must be trembling - 100 allocated buffer bytes for each capitalization! What a waste! And what if the received word is 100 characters or longer? With only 100 bytes available, the copy and its null byte would overflow the buffer.
I know. I’m not too proud of it myself.
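If you are stuck with a fixed-size buffer, one defensive option is to swap strcpy for a call that knows the buffer's size. The sketch below uses the standard function snprintf instead of strcpy, so it is a different technique from the one this article is about; the wrapper name copy_truncated is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * copy_truncated - Hypothetical helper, for illustration only.
 * @dest: The destination buffer.
 * @size: The total size of @dest, in bytes.
 * @src: The source string to copy.
 *
 * Uses snprintf, which writes at most @size - 1 characters followed by
 * a null byte, so @dest is never overrun (as long as @size is accurate).
 *
 * Return: 1 if the whole string fit, 0 if it was truncated or on error.
 */
int copy_truncated(char *dest, size_t size, const char *src)
{
    /* snprintf returns the length the full output would have had */
    int needed = snprintf(dest, size, "%s", src);

    return (needed >= 0 && (size_t)needed < size);
}
```

A truncated word is not a correct answer either, of course; it simply fails more gracefully than writing past the end of the buffer.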
More practically, and usefully, strcpy should be used with dynamically-allocated memory. This way, you can allocate just the space you need to create a copy of a string - no more, no less.
We can incorporate dynamically-allocated memory using a combination of the standard library function malloc, to allocate memory, and the string library function strlen, to determine the size of the string we are copying.
$ cat main.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char *word;
    char *tmp;

    /* A word to capitalize must be given on the command line */
    if (argc < 2)
        return (EXIT_FAILURE);
    word = argv[1];

    /* Dynamically allocate temporary buffer */
    tmp = malloc(strlen(word) + 1);

    /* Return error if allocation failed */
    if (tmp == NULL)
        return (EXIT_FAILURE);

    /* Initialize tmp with null bytes for good practice */
    memset(tmp, '\0', strlen(word) + 1);

    printf("Before capitalization: %s\n", word);

    /* Copy word into tmp */
    strcpy(tmp, word);

    /*
     * If first character of word is a lowercase letter, capitalize it.
     * In ASCII, an uppercase letter is its lowercase value minus 32.
     */
    if (tmp[0] >= 'a' && tmp[0] <= 'z')
        tmp[0] -= 32;

    /* Reassign capitalized word */
    word = tmp;

    printf("After capitalization: %s\n", word);

    /* One free for each malloc */
    free(tmp);

    return (EXIT_SUCCESS);
}
$ gcc main.c -o dynamic
$ ./dynamic brennan
Before capitalization: brennan
After capitalization: Brennan
Note two important points. First, we must allocate one byte more than the length of the string, because strlen does not include the terminating null byte in its returned length. Second, make sure to adhere to the golden rule - for every call to malloc, there should be a corresponding free.
Not only does this improve the space efficiency of our program, but it additionally permits us to make it more modular. Now that we are dynamically allocating memory directly on the heap, and not working on a locally-scoped temporary buffer, we can move the capitalization functionality into its own, generic function.
$ cat main.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *capitalize(const char *word)
{
    /* Dynamically allocate temporary buffer */
    char *tmp = malloc(strlen(word) + 1);

    /* Return error if allocation failed */
    if (tmp == NULL)
        return (NULL);

    /* Initialize tmp with null bytes for good practice */
    memset(tmp, '\0', strlen(word) + 1);

    /* Copy word into tmp */
    strcpy(tmp, word);

    /*
     * If first character of word is a lowercase letter, capitalize it.
     * In ASCII, an uppercase letter is its lowercase value minus 32.
     */
    if (tmp[0] >= 'a' && tmp[0] <= 'z')
        tmp[0] -= 32;

    return (tmp);
}

int main(int argc, char *argv[])
{
    char *word;

    /* A word to capitalize must be given on the command line */
    if (argc < 2)
        return (EXIT_FAILURE);
    word = argv[1];

    printf("Before capitalization: %s\n", word);

    /* Reassign capitalized word */
    word = capitalize(word);

    /* capitalize might have returned error */
    if (word == NULL)
        return (EXIT_FAILURE);

    printf("After capitalization: %s\n", word);

    /* One free for each malloc */
    free(word);

    return (EXIT_SUCCESS);
}
$ gcc main.c -o modular
$ ./modular brennan
Before capitalization: brennan
After capitalization: Brennan
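As a side note on the subtract-32 trick: it relies on the ASCII layout of letters. A possible alternative, sketched here under my own naming (capitalize_ctype is hypothetical), leans on the standard function toupper from <ctype.h> and otherwise follows the same malloc, strlen, and strcpy pattern as capitalize above.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * capitalize_ctype - Hypothetical variant of capitalize, for illustration.
 * @word: The word to capitalize.
 *
 * Return: A newly allocated, capitalized copy of @word, or NULL if
 *         allocation failed. The caller is responsible for freeing it.
 */
char *capitalize_ctype(const char *word)
{
    /* One extra byte for the terminating null byte */
    char *tmp = malloc(strlen(word) + 1);

    if (tmp == NULL)
        return (NULL);

    strcpy(tmp, word);

    /* toupper expects a value representable as an unsigned char */
    tmp[0] = (char)toupper((unsigned char)tmp[0]);

    return (tmp);
}
```

Characters that have no uppercase form, including the null byte of an empty string, pass through toupper unchanged, so no range check is needed here.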
Now we’re on our way toward a truly advanced auto-correction program!
You ask, I provide. I present my implementation of the function strcpy.
/**
 * _strcpy - Copies a string pointed to by @src, including the
 *           terminating null byte, to a buffer pointed to by @dest.
 * @dest: A buffer to copy the string to.
 * @src: The source string to copy.
 *
 * Return: A pointer to the destination string @dest.
 */
char *_strcpy(char *dest, const char *src)
{
    int index = 0;

    /* Copy every character up to, but not including, the null byte */
    while (src[index])
    {
        dest[index] = src[index];
        index++;
    }

    /* The loop stops before the null byte, so terminate dest explicitly */
    dest[index] = '\0';

    return (dest);
}
To copy a string, I loop over src, copying the character located at each index to the corresponding index in dest. I know to stop copying characters when I’ve reached the null byte in src, which evaluates to false (0) in a conditional. Because the loop exits before that null byte is copied, I write it to dest myself once the loop ends. At the end, I return the original pointer to dest, having successfully copied the string!
Of course, this is just one personal implementation of the function strcpy. There are multiple ways to write it; in fact, I encourage, no, challenge, you to find another way to write this function - bonus points if you can do so without using an index variable!
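If you would like to compare notes after trying the challenge yourself, here is one possible index-free answer. Treat it as a sketch rather than the definitive solution; the name strcpy_ptr is hypothetical.

```c
/*
 * strcpy_ptr - Hypothetical index-free version, for illustration only.
 * @dest: A buffer to copy the string to.
 * @src: The source string to copy.
 *
 * Return: A pointer to the destination string @dest.
 */
char *strcpy_ptr(char *dest, const char *src)
{
    char *cursor = dest;

    /*
     * Copy one character, then advance both pointers. The assignment
     * also copies the null byte, whose value (0) ends the loop.
     */
    while ((*cursor++ = *src++) != '\0')
        ;

    /* dest itself was never moved, so it still points to the start */
    return (dest);
}
```

Because the copy happens inside the loop condition, the terminating null byte is written before the loop exits, so no separate termination step is needed.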