Skip to content

Conversation

@ThalesBarretto
Copy link
Contributor

@ThalesBarretto ThalesBarretto commented Feb 20, 2025

Fix likely misuse of ctype.h API #4397

The standard C <ctype.h> API takes type int as argument, but when
that argument value is neither (a) a value representable by type unsigned
char nor (b) the value of the macro EOF, the behaviour is undefined. See
C11, Sec. 7.4 Character handling <ctype.h> p.200, clause 1.

The <ctype.h> functions are designed to take as arguments the
values returned by getc or fgetc, no for processing elements of an
arbitrary string stored in a char array. Safely processing arbitrary
strings requires explicit cast to unsigned char to keep the argument
values within the domain {EOF, 0, 1 ..., 255}.

OTOH, passing negative values {-128, -127, ..., -1} may trigger
undefined behaviour, and also the non-EOF 0xff can be conflated with -1,
which, although not forbidden, may give unintended results. Casting to
unsigned char avoids sign extension when implicitly converting to int.

This commit introduces an explicit cast to unsigned char when
passing argument to the standard <ctype.h> function family.

Fixes: #4397
Reported-by: riastradh
Signed-off-by: Thales Antunes de Oliveira Barretto [email protected]

@gluster-ant
Copy link
Collaborator

Can one of the admins verify this patch?

1 similar comment
@gluster-ant
Copy link
Collaborator

Can one of the admins verify this patch?

@gluster-ant
Copy link
Collaborator

CLANG-FORMAT FAILURE:
Before merging the patch, this diff needs to be considered for passing clang-format

index 208623f35..3a76d4f8b 100644
--- a/libglusterfs/src/common-utils.c
+++ b/libglusterfs/src/common-utils.c
@@ -1995,7 +1995,8 @@ valid_host_name(char *address, int length)
         goto out;
     }
 
-    if (!isalnum((unsigned char)dup_addr[length - 1]) && (dup_addr[length - 1] != '*')) {
+    if (!isalnum((unsigned char)dup_addr[length - 1]) &&
+        (dup_addr[length - 1] != '*')) {
         ret = 0;
         goto out;
     }
@@ -2013,7 +2014,8 @@ valid_host_name(char *address, int length)
     do {
         str_len = strlen(temp_str);
 
-        if (!isalnum((unsigned char)temp_str[0]) || !isalnum((unsigned char)temp_str[str_len - 1])) {
+        if (!isalnum((unsigned char)temp_str[0]) ||
+            !isalnum((unsigned char)temp_str[str_len - 1])) {
             ret = 0;
             goto out;
         }
@@ -2049,7 +2051,8 @@ valid_ipv4_address(char *address, int length, gf_boolean_t wildcard_acc)
      * delimiters.
      */
     if (length <= 0 || (strstr(address, "..")) ||
-        (!isdigit((unsigned char)tmp[length - 1]) && (tmp[length - 1] != '*'))) {
+        (!isdigit((unsigned char)tmp[length - 1]) &&
+         (tmp[length - 1] != '*'))) {
         ret = 0;
         goto out;
     }
diff --git a/tests/basic/open-behind/tester.c b/tests/basic/open-behind/tester.c
index 4ccd40cbe..1f6e4e8d3 100644
--- a/tests/basic/open-behind/tester.c
+++ b/tests/basic/open-behind/tester.c
@@ -82,7 +82,8 @@ buffer_get(context_t *ctx)
 static int32_t
 str_skip_spaces(context_t *ctx, int32_t current)
 {
-    while ((current > 0) && (current != '\n') && isspace((unsigned char)current)) {
+    while ((current > 0) && (current != '\n') &&
+           isspace((unsigned char)current)) {
         current = buffer_get(ctx);
     }
 
diff --git a/xlators/mgmt/glusterd/src/glusterd-ganesha.c b/xlators/mgmt/glusterd/src/glusterd-ganesha.c
index b568af029..8f32583bb 100644
--- a/xlators/mgmt/glusterd/src/glusterd-ganesha.c
+++ b/xlators/mgmt/glusterd/src/glusterd-ganesha.c
@@ -83,7 +83,8 @@ parsing_ganesha_ha_conf(const char *key)
         do {
             end_pointer++;
         } while (!(*end_pointer == '\'' || *end_pointer == '"' ||
-                   isspace((unsigned char)*end_pointer) || *end_pointer == '\0'));
+                   isspace((unsigned char)*end_pointer) ||
+                   *end_pointer == '\0'));
         *end_pointer = '\0';
 
         /* got it. copy it and return */
diff --git a/xlators/mgmt/glusterd/src/glusterd-volume-set.c b/xlators/mgmt/glusterd/src/glusterd-volume-set.c
index 4857f7e78..9296a7c9a 100644
--- a/xlators/mgmt/glusterd/src/glusterd-volume-set.c
+++ b/xlators/mgmt/glusterd/src/glusterd-volume-set.c
@@ -175,7 +175,8 @@ validate_uss_dir(glusterd_volinfo_t *volinfo, dict_t *dict, char *key,
     }
 
     for (i = 1; value[i]; i++) {
-        if (isalnum((unsigned char)value[i]) || value[i] == '_' || value[i] == '-')
+        if (isalnum((unsigned char)value[i]) || value[i] == '_' ||
+            value[i] == '-')
             continue;
 
         snprintf(errstr, sizeof(errstr),

@ThalesBarretto
Copy link
Contributor Author

/run regression

@ThalesBarretto ThalesBarretto force-pushed the misuse_of_ctype_h_4397 branch 3 times, most recently from b443d3f to d705363 Compare February 20, 2025 18:26
@ThalesBarretto ThalesBarretto marked this pull request as ready for review February 20, 2025 18:34
@ThalesBarretto
Copy link
Contributor Author

ThalesBarretto commented Feb 20, 2025

i think this is a follow-up to #4038 . @pranithk what do u think?

@pranithk
Copy link
Member

@ThalesBarretto What is your opinion about adding new files glusterfs-ctypes.[ch] which expose gf_isalnum() functions. These gf_xxx() functions can do the casting internally. You can add a test also that no source file is directly calling the ctype api directly other than from the file glusterfs-ctypes.c

tests/basic/symbol-check.sh is one such test so that all sys calls are executed with the functions in sys_xxx() that we defined.

@ThalesBarretto
Copy link
Contributor Author

ThalesBarretto commented Feb 28, 2025

@ThalesBarretto What is your opinion about adding new files glusterfs-ctypes.[ch] which expose gf_isalnum() functions. These gf_xxx() functions can do the casting internally. You can add a test also that no source file is directly calling the ctype api directly other than from the file glusterfs-ctypes.c

This sure can be done, as i see no safer alternative in the std library.

tests/basic/symbol-check.sh is one such test so that all sys calls are executed with the functions in sys_xxx() that we defined.

That make it a regression check, not a compile time enforcement (unless we incorporate in the build, which IMO can be tricky to do right). Ideally this better be done with static analysis.

For now i would just plug this hole and leave the improvement for another PR.

@pranithk
Copy link
Member

pranithk commented Mar 4, 2025

@ThalesBarretto What is your opinion about adding new files glusterfs-ctypes.[ch] which expose gf_isalnum() functions. These gf_xxx() functions can do the casting internally. You can add a test also that no source file is directly calling the ctype api directly other than from the file glusterfs-ctypes.c

This sure can be done, as i see no safer alternative in the std library.

tests/basic/symbol-check.sh is one such test so that all sys calls are executed with the functions in sys_xxx() that we defined.

That make it a regression check, not a compile time enforcement (unless we incorporate in the build, which IMO can be tricky to do right). Ideally this better be done with static analysis.

For now i would just plug this hole and leave the improvement for another PR.

Works

@ThalesBarretto ThalesBarretto force-pushed the misuse_of_ctype_h_4397 branch from 3b93a42 to 53c7d78 Compare June 10, 2025 13:55
@ThalesBarretto ThalesBarretto force-pushed the misuse_of_ctype_h_4397 branch from 53c7d78 to 9c67f3f Compare June 30, 2025 23:19
	The standard C <ctype.h> API takes type int as argument, but when
that argument value is neither (a) a value representable by type unsigned
char nor (b) the value of the macro EOF, the behaviour is undefined. See
C11, Sec. 7.4 Character handling <ctype.h> p.200, clause 1.

	The <ctype.h> functions are designed to take as arguments the
values returned by getc or fgetc, no for processing elements of an
arbitrary string stored in a char array. Safely processing arbitrary
strings requires explicit cast to unsigned char to keep the argument
values within the domain {EOF, 0, 1 ..., 255}.

	OTH, passing negative values {-128, -127, ..., -1} may trigger
undefined behaviour, and also the non-EOF 0xff can be conflated with -1,
which, although not forbidden, may give unintended results. Casting to
unsigned char avoids sign extension when implicitly converting to int.

	This commit introduces an explicit cast to unsigned char when
passing argument to the standard <ctype.h> function family.

Reported-by: riastradh
Signed-off-by: Thales Antunes de Oliveira Barretto <[email protected]>
#  api/src/glfs.c                                   |  2 +-
#  cli/src/cli-cmd-parser.c                         | 12 ++++-----
#  cli/src/cli-xml-output.c                         |  4 +--
#  extras/benchmarking/rdd.c                        |  2 +-
#  extras/geo-rep/gsync-sync-gfid.c                 |  2 +-
#  geo-replication/src/procdiggy.c                  |  4 +--
#  libglusterfs/src/common-utils.c                  | 31 +++++++++++++-----------
#  tests/basic/open-behind/tester.c                 |  5 ++--
#  xlators/cluster/ec/src/ec-code.c                 |  4 +--
#  xlators/debug/io-stats/src/io-stats.c            |  4 +--
#  xlators/mgmt/glusterd/src/glusterd-ganesha.c     |  5 ++--
#  xlators/mgmt/glusterd/src/glusterd-geo-rep.c     | 10 ++++----
#  xlators/mgmt/glusterd/src/glusterd-mountbroker.c |  4 +--
#  xlators/mgmt/glusterd/src/glusterd-pmap.c        |  4 +--
#  xlators/mgmt/glusterd/src/glusterd-utils.c       |  2 +-
#  xlators/mgmt/glusterd/src/glusterd-volume-set.c  |  3 ++-
#  xlators/performance/md-cache/src/md-cache.c      |  2 +-
#  17 files changed, 53 insertions(+), 47 deletions(-)
#  api/src/glfs.c                                   |  2 +-
#  cli/src/cli-cmd-parser.c                         | 12 ++++-----
#  cli/src/cli-xml-output.c                         |  4 +--
#  extras/benchmarking/rdd.c                        |  2 +-
#  extras/geo-rep/gsync-sync-gfid.c                 |  2 +-
#  geo-replication/src/procdiggy.c                  |  4 +--
#  libglusterfs/src/common-utils.c                  | 31 +++++++++++++-----------
#  tests/basic/open-behind/tester.c                 |  5 ++--
#  xlators/cluster/ec/src/ec-code.c                 |  4 +--
#  xlators/debug/io-stats/src/io-stats.c            |  4 +--
#  xlators/mgmt/glusterd/src/glusterd-ganesha.c     |  5 ++--
#  xlators/mgmt/glusterd/src/glusterd-geo-rep.c     | 10 ++++----
#  xlators/mgmt/glusterd/src/glusterd-mountbroker.c |  4 +--
#  xlators/mgmt/glusterd/src/glusterd-pmap.c        |  4 +--
#  xlators/mgmt/glusterd/src/glusterd-utils.c       |  2 +-
#  xlators/mgmt/glusterd/src/glusterd-volume-set.c  |  3 ++-
#  xlators/performance/md-cache/src/md-cache.c      |  2 +-
#  17 files changed, 53 insertions(+), 47 deletions(-)
#  api/src/glfs.c                                   |  2 +-
#  cli/src/cli-cmd-parser.c                         | 12 ++++-----
#  cli/src/cli-xml-output.c                         |  4 +--
#  extras/benchmarking/rdd.c                        |  2 +-
#  extras/geo-rep/gsync-sync-gfid.c                 |  2 +-
#  geo-replication/src/procdiggy.c                  |  4 +--
#  libglusterfs/src/common-utils.c                  | 31 +++++++++++++-----------
#  tests/basic/open-behind/tester.c                 |  5 ++--
#  xlators/cluster/ec/src/ec-code.c                 |  4 +--
#  xlators/debug/io-stats/src/io-stats.c            |  4 +--
#  xlators/mgmt/glusterd/src/glusterd-ganesha.c     |  5 ++--
#  xlators/mgmt/glusterd/src/glusterd-geo-rep.c     | 10 ++++----
#  xlators/mgmt/glusterd/src/glusterd-mountbroker.c |  4 +--
#  xlators/mgmt/glusterd/src/glusterd-pmap.c        |  4 +--
#  xlators/mgmt/glusterd/src/glusterd-utils.c       |  2 +-
#  xlators/mgmt/glusterd/src/glusterd-volume-set.c  |  3 ++-
#  xlators/performance/md-cache/src/md-cache.c      |  2 +-
#  17 files changed, 53 insertions(+), 47 deletions(-)
@ThalesBarretto ThalesBarretto force-pushed the misuse_of_ctype_h_4397 branch from 466a1ad to ec86019 Compare August 9, 2025 23:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Likely misuse of ctype.h API

3 participants