#1 2009-06-09 21:47:50

HashBox
Member
Registered: 2009-01-22
Posts: 271

String Trim in C

Hello everyone, I have coded a small string trim function in C, and I was wondering if my method is flawed in any way.
Try to go easy on me though :\
I'm mostly just curious as to whether it can be optimized further.

This probably isn't even what you would call a real string trim since it doesn't completely alter the string, it basically returns a pointer to the start of string (without the spaces), and chops the end of the string with nulls.

Oh well I'm interested to see what others come up with.

/* Yay goto :D */
char *trim(char *buffer, char *stripchars)
{
        int i = 0;

        /* Left Side */
        char *start = buffer;

left:
        for (i = 0; i < strlen(stripchars); i++) {
                if (*start == stripchars[i]) {
                        start++;
                        goto left;
                }
        }

        /* Right Side */
        char *end = start + strlen(start) - 1;

right:
        for (i = 0; i < strlen(stripchars); i++) {
                if (*end == stripchars[i]) {
                        *end = '\0';
                        --end;
                        goto right;
                }
        }

        return start;
}

EDIT: You can use it like so,

char buffer[] = "    Hello World     ";
printf("%s", trim(buffer, " "));

Last edited by HashBox (2009-06-09 21:51:13)

#2 2009-06-09 21:54:37

Peasantoid
Member
Registered: 2009-04-26
Posts: 928
Website

Re: String Trim in C

Only problem I see is that you're using goto. :|

#3 2009-06-09 22:00:13

HashBox
Member
Registered: 2009-01-22
Posts: 271

Re: String Trim in C

I knew someone had to say it :P

Consider this, without goto that function (as far as I can see) would need two new for loops (or two while loops), two new if statements, and a new variable, as that is how it was originally.

My only concern about using goto is the effect it could have when breaking out of a for loop, for example whether it might do something to the stack. I'm not terribly well versed in this.
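
On the stack question: in C a goto is a plain jump inside the same function, and jumping out of a loop is well-defined. Locals keep their storage until the function returns, so nothing needs unwinding. A minimal sketch of leaving two nested loops at once (find_first is a made-up name, not something from this thread):

```c
/* Hypothetical helper: returns the index of the first character of s that
 * appears in set, or -1 if there is none. The goto leaves both loops at once. */
int find_first(const char *s, const char *set)
{
    int i, j;
    int result = -1;

    for (i = 0; s[i] != '\0'; i++) {
        for (j = 0; set[j] != '\0'; j++) {
            if (s[i] == set[j]) {
                result = i;
                goto done;      /* plain jump; i and j simply stay in scope */
            }
        }
    }

done:
    return result;
}
```

The loop counters are ordinary locals, so the jump leaves them exactly as they were at the point of the goto.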

#4 2009-06-09 22:59:57

iza
Member
From: Toronto, Canada
Registered: 2008-12-31
Posts: 44

Re: String Trim in C

goto does have its uses, however in such a simple function it shouldn't really be necessary.

Here's another way you could do it without goto; had to add one variable (i wish C had booleans =/), and a do-while loop instead of each label/goto.

char *trim(char *buffer, char *stripchars)
{
    int i = 0;
    int flag;

    /* Left Side */
    char *start = buffer;

    do {
        flag = 0;
        for (i = 0; i < strlen(stripchars); i++) {
            if (*start == stripchars[i]) {
                start++;
                flag = 1;
                break;
            }
        }
    } while (flag);

    /* Right Side */
    char *end = start + strlen(start) - 1;

    do {
        flag = 0;
        for (i = 0; i < strlen(stripchars); i++) {
            if (*end == stripchars[i]) {
                *end = '\0';
                --end;
                flag = 1;
                break;
            }
        }
    } while (flag);

    return start;
}

Still pretty ugly -- I'm sure there is a better way of doing this.
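
One possibly tidier route is to lean on the standard library: strspn() counts the leading characters that belong to a set, and strchr() can test membership while walking back from the end. The sketch below assumes the same contract as the original (it returns a pointer into buffer and cuts the tail with a terminator); trim_span is a made-up name.

```c
#include <string.h>

/* Sketch of a goto-free trim built on strspn() and strchr().
 * trim_span is a made-up name. Like the original, it returns a pointer
 * into buffer and cuts the tail by writing a terminator. */
char *trim_span(char *buffer, const char *stripchars)
{
    char *start;
    char *end;

    /* Left side: strspn() counts the leading characters found in stripchars. */
    start = buffer + strspn(buffer, stripchars);

    /* Right side: walk back from the terminator, never going past start. */
    end = start + strlen(start);
    while (end > start && strchr(stripchars, end[-1]) != NULL)
        end--;
    *end = '\0';

    return start;
}
```

Because end is compared against start before each step, the scan cannot run off the front of the string, even when every character is in stripchars.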


\_\__     __/_/ 
       (oo) ______
       (__)\           )\
              ||‾‾‾‾\|

#5 2009-06-09 23:01:19

Peasantoid
Member
Registered: 2009-04-26
Posts: 928
Website

Re: String Trim in C

iza wrote:

(i wish C had booleans =/)

C does have booleans. It's called "int foo = 1;". ;)

#6 2009-06-09 23:14:52

HashBox
Member
Registered: 2009-01-22
Posts: 271

Re: String Trim in C

Peasantoid wrote:
iza wrote:

(i wish C had booleans =/)

C does have booleans. It's called "int foo = 1;". ;)

or

#define bool int
#define true 1
#define false 0

;)

@iza, this is basically how I was doing it before I switched to using goto, although yours is slightly cleaner than mine was.

#7 2009-06-09 23:16:18

Procyon
Member
Registered: 2008-05-07
Posts: 1,819

Re: String Trim in C

What if you give it a string with all characters in stripchars, there's a small chance especially for the end that data is replaced before the string's original start, if I see it right.

#8 2009-06-09 23:21:34

voice
Member
Registered: 2009-06-09
Posts: 8

Re: String Trim in C


Everything is file, and file is everything.

#9 2009-06-09 23:27:23

HashBox
Member
Registered: 2009-01-22
Posts: 271

Re: String Trim in C

Procyon wrote:

What if you give it a string with all characters in stripchars, there's a small chance especially for the end that data is replaced before the string's original start, if I see it right.

Lucky me, it seems to hold up against that too :)
Assuming this is what you mean:

char buffer[] = "   hello world    ";
printf("'%s'\n", trim(buffer, " abcdefghijklmnopqrstuvwxyz"));

Returns ''
Edit: Now that I look at it, if you give it an empty string for either parameter it'll probably segfault. But checking for that is pretty simple.

@voice, thanks for that, interesting to see other ways that people do it :)

Last edited by HashBox (2009-06-09 23:41:16)

#10 2009-06-09 23:46:57

iza
Member
From: Toronto, Canada
Registered: 2008-12-31
Posts: 44

Re: String Trim in C

Procyon wrote:

What if you give it a string with all characters in stripchars, there's a small chance especially for the end that data is replaced before the string's original start, if I see it right.

I think you might be right. The left side part would move the start pointer to the null at the end of the input string. Then, the end pointer would be placed one unit to the left of that (since strlen(start) is 0). The loop would then proceed back through the input string and null it all. If the memory directly to the left of the input string happens to match one of the stripchars, it would also null it. It's highly unlikely, but even the fact that it is reading memory in a place it shouldn't be is not good (possibility of unauthorized memory access error).

Last edited by iza (2009-06-09 23:47:58)



#11 2009-06-09 23:53:47

HashBox
Member
Registered: 2009-01-22
Posts: 271

Re: String Trim in C

iza wrote:
Procyon wrote:

What if you give it a string with all characters in stripchars, there's a small chance especially for the end that data is replaced before the string's original start, if I see it right.

I think you might be right. The left side part would move the start pointer to the null at the end of the input string. Then, the end pointer would be placed one unit to the left of that (since strlen(start) is 0). The loop would then proceed back through the input string and null it all. If the memory directly to the left of the input string happens to match one of the stripchars, it would also null it. It's highly unlikely, but even the fact that it is reading memory in a place it shouldn't be is not good (possibility of unauthorized memory access error).

If I'm not mistaken, this should only happen if the input is a zero length string? (strlen = 0 as you said)
Therefore, checking for a zero length string should solve this? or is there another simple check that would be better?

Edit: Even with a zero length input string and both a zero length stripchars and a stripchars with " abcdefghijklmnopqrstuvwxyz", it still does not crash, I guess that would be due to the memory before it not containing any of the stripchars though correct?
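
A zero-length check on its own would probably not be enough: a string made up entirely of strip characters leaves start on the terminator as well, which is the same situation. One possible guard is to never let the right-hand scan go below start. A minimal sketch that keeps the shape of the original (trim_checked and in_set are made-up names):

```c
#include <string.h>

/* Hypothetical helper: nonzero if c is one of the characters in set. */
static int in_set(char c, const char *set)
{
    for (; *set != '\0'; set++)
        if (*set == c)
            return 1;
    return 0;
}

/* The original trim with a lower bound on the right-hand scan. */
char *trim_checked(char *buffer, const char *stripchars)
{
    char *start = buffer;
    char *end;

    /* Left side: stops at the terminator, which is never in the set. */
    while (in_set(*start, stripchars))
        start++;

    /* Right side: end points one past the last kept character and
     * never moves below start, so nothing before buffer is touched. */
    end = start + strlen(start);
    while (end > start && in_set(end[-1], stripchars))
        end--;
    *end = '\0';

    return start;
}
```

The absence of a crash in the tests above is consistent with the memory before the buffer simply not holding a strip character at the time; that is luck rather than safety.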

Still I think that's sorted most of the problems with it, thanks for all the input everyone :)

Last edited by HashBox (2009-06-10 00:02:27)

#12 2009-06-10 06:42:47

moljac024
Member
From: Serbia
Registered: 2008-01-29
Posts: 2,676

Re: String Trim in C

HashBox wrote:
Peasantoid wrote:
iza wrote:

(i wish C had booleans =/)

C does have booleans. It's called "int foo = 1;". ;)

or

#define bool int
#define true 1
#define false 0

;)

@iza, this is basically how I was doing it before I switched to using goto, although yours is slightly cleaner than mine was.

Integer still uses double the memory a boolean does on most machines :P

Last edited by moljac024 (2009-06-10 06:46:35)


The day Microsoft makes a product that doesn't suck, is the day they make a vacuum cleaner.
--------------------------------------------------------------------------------------------------------------
But if they tell you that I've lost my mind, maybe it's not gone just a little hard to find...

#13 2009-06-10 06:59:59

HashBox
Member
Registered: 2009-01-22
Posts: 271

Re: String Trim in C

moljac024 wrote:

Integer still uses double the memory a boolean does on most machines :P

I decided to google this for the hell of it and yes it does seem to depend on the machine and also the version of GCC, interesting stuff. The more you know :)
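
For reference, C99 added a real boolean type, _Bool, with bool, true and false available through <stdbool.h>. How its size compares to int is implementation-defined, so the quickest way to settle it for a given machine is to print both. A minimal sketch (report_sizes is a made-up name):

```c
#include <stdbool.h>
#include <stdio.h>

/* Prints the sizes so they can be checked on the machine at hand.
 * report_sizes is a made-up name; the values are implementation-defined. */
void report_sizes(void)
{
    printf("sizeof(bool) = %zu\n", sizeof(bool));
    printf("sizeof(int)  = %zu\n", sizeof(int));
}
```

No expected output is given here on purpose, since the numbers depend on the compiler and target.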

#14 2009-06-10 07:44:18

Meillo
Member
From: Balmora
Registered: 2009-05-21
Posts: 10
Website

Re: String Trim in C

HashBox wrote:
Peasantoid wrote:
iza wrote:

(i wish C had booleans =/)

C does have booleans. It's called "int foo = 1;". ;)

or

#define bool int

I think

typedef int bool;

is better.


Finally, thanks go to Dennis Ritchie and Ken Thompson, who showed a generation of programmers that complexity is avoidable.
(Marc J. Rochkind in Advanced UNIX Programming)

#15 2009-06-10 08:19:49

Procyon
Member
Registered: 2008-05-07
Posts: 1,819

Re: String Trim in C

Here is an example of how it will write over other data:

#include <stdio.h>
#include <string.h>

(trim function)

int main(void) {
    char buffer[] = "BBBBBBBBBBBBBBB";
    char array[4] = { 'Q','E','E','E'};
    printf("Array Before: %4.4s\n", array);
    printf("Buffer After: %s\n", trim(buffer, "EB"));
    printf("Array After: %4.4s\n", array);
}
--> gcc -o trimtest trimtest.c 
--> ./trimtest
Array Before: QEEE
Buffer After: 
Array After:    Q

#16 2009-06-10 09:02:29

HashBox
Member
Registered: 2009-01-22
Posts: 271

Re: String Trim in C

Ouch I see what you mean, thanks for providing that example.

#17 2009-06-10 11:31:45

kumyco
Member
From: somewhere
Registered: 2008-06-23
Posts: 153
Website

Re: String Trim in C

Procyon wrote:

Here is an example of how it will write over other data:

char array[4] = { 'Q','E','E','E'};

i don't think you want to pass that array to any string related function, it's asking for a segfault
----
i thought it'd be interesting to implement this (without the goto)

char* ky_trim (char *buffer, const char *stripchars)
{
    char *buf;
    register size_t i=0, j;
    if (!buffer || !*buffer || !stripchars || !*stripchars) return buffer;
    
    do {
        for (j=0; j < strlen(stripchars) && buffer[i] != stripchars[j]; ++j){};
    } while (j < strlen(stripchars) && (i++ < strlen(buffer)));
    
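    buf = buffer + i;
    /* nothing but strip characters: return an empty string instead of scanning back past the start */
    if (!*buf) { *buffer = '\0'; return buffer; }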
    i = strlen(buffer) - 1;
    
    do {
        for (j=0; j < strlen(stripchars) && buffer[i] != stripchars[j]; ++j){};
    } while (j < strlen(stripchars) && (i-- > 0));
    
    buffer[i + 1] = '\0';
    if (buf == buffer) return buffer;
    return memmove(buffer, buf, &buffer[i] - buf + 2);
}

and using pointers

char* ky_trim_ptr (char *buffer, const char *stripchars)
{
    char *buf=buffer, *start;
    const char *stp;
    
    if (!buffer || !*buffer || !stripchars || !*stripchars) return buffer;
    
    do {
        stp = stripchars;
        while (*stp && *buf != *stp) ++stp;
        if (*stp) ++buf;
    } while (*buf && *stp);
    
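    start = buf;
    /* nothing but strip characters: return an empty string instead of scanning back past the start */
    if (!*start) { *buffer = '\0'; return buffer; }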
    while (*buf) ++buf;
    --buf;
    
    do {
        stp = stripchars;
        while (*stp && *buf != *stp) ++stp;
        if (*stp) --buf;
    } while (*buf && *stp);
    *(buf + 1) = '\0';
    
    if (start == buffer) return buffer;
    return memmove(buffer, start, buf - start + 2);
}

applied the standard -O2 and tested as-is (with the memmove), they were largely the same, though the ptr version appeared to be slightly faster than the other two.
i used memmove because for larger buffers it's faster than a straight loop. you need to memmove because if you use it on a malloc()'d buffer, you are very likely to lose the pointer, causing a memory leak
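
To illustrate the malloc() point: free() must be given the exact pointer that malloc() returned, so a trim that hands back a pointer into the middle of the block makes it easy to lose the original. Shifting the kept text to the front avoids that. A minimal sketch that strips spaces only (trim_in_place and trimmed_copy are made-up names):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-place trim of spaces: the result always begins at
 * buffer itself, so the pointer from malloc() stays valid for free(). */
char *trim_in_place(char *buffer)
{
    char *start = buffer;
    size_t len;

    while (*start == ' ')
        start++;

    len = strlen(start);
    while (len > 0 && start[len - 1] == ' ')
        len--;

    /* memmove() is used because source and destination may overlap. */
    memmove(buffer, start, len);
    buffer[len] = '\0';
    return buffer;
}

/* Usage sketch: copy, trim, and hand back a pointer that is safe to free. */
char *trimmed_copy(const char *text)
{
    char *copy = malloc(strlen(text) + 1);

    if (copy == NULL)
        return NULL;
    strcpy(copy, text);
    return trim_in_place(copy);   /* same address malloc() returned */
}
```

The caller can pass the result of trimmed_copy() straight to free() when done with it.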

#18 2009-06-10 12:14:32

vacant
Member
From: downstairs
Registered: 2004-11-05
Posts: 816

Re: String Trim in C

My C skills are very rusty, but assuming the string passed is null terminated and writeable, then a couple of "while" statements are about all that are needed for a "simple" trim (space, null, tabs are all less than ASCII 33)

#include <stdio.h>
#include <string.h>

char* trim (char* stripthis) {

    char *s = stripthis - 1, *e = stripthis + strlen (stripthis);

    while (++s < e && *s < 33);
    while (--e > s && *e < 33);
    *(++e) = (char) 0;

    return s;
    }

int main (void) {

    char buffer[] = "    Hello World     ";
    printf(">%s<", trim(buffer));
    }

edited to tidy & deal with blank string
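
One thing to watch with the "< 33" test: where plain char is signed, bytes above 127 compare as negative, so they count as whitespace too and get stripped. A possible variant uses isspace() with a cast to unsigned char instead. This is only a sketch, and trim_ws is a made-up name:

```c
#include <ctype.h>
#include <string.h>

/* Variant of the whitespace trim above using isspace(); trim_ws is a
 * made-up name. The cast matters: passing a negative char to isspace()
 * is undefined behaviour. */
char *trim_ws(char *s)
{
    char *end;

    while (isspace((unsigned char) *s))
        s++;

    end = s + strlen(s);
    while (end > s && isspace((unsigned char) end[-1]))
        end--;
    *end = '\0';

    return s;
}
```

Unlike the "< 33" version, this leaves other control characters alone and follows whatever the current locale considers whitespace.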

Last edited by vacant (2009-06-10 13:32:15)

#19 2009-06-10 13:09:28

Cerebral
Forum Fellow
From: Waterloo, ON, CA
Registered: 2005-04-08
Posts: 3,108
Website

Re: String Trim in C

kumyco wrote:
char* ky_trim (char *buffer, const char *stripchars)
...

...
applied the standard -O2 and tested as-is (with the memmove), they were largely the same, though the ptr version appeared to be slightly faster than the other two.

The number of times you used strlen there made me wince.  If you want to improve the perf of this one, try calling strlen once, and using the result everywhere else.  Like so:

char* ky_trim (char *buffer, const char *stripchars)
{
    char *buf;
    register size_t i=0, j;
    register size_t bufLen, stripLen;

    if (!buffer || !*buffer || !stripchars || !*stripchars) return buffer;

    bufLen = strlen(buffer);
    stripLen = strlen(stripchars);

    do {
        for (j=0; j < stripLen && buffer[i] != stripchars[j]; ++j){};
    } while (j < stripLen && (i++ < bufLen));
    
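    buf = buffer + i;
    /* nothing but strip characters: return an empty string instead of scanning back past the start */
    if (!*buf) { *buffer = '\0'; return buffer; }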
    i = bufLen - 1;
    
    do {
        for (j=0; j < stripLen && buffer[i] != stripchars[j]; ++j){};
    } while (j < stripLen && (i-- > 0));
    
    buffer[i + 1] = '\0';
    if (buf == buffer) return buffer;
    return memmove(buffer, buf, &buffer[i] - buf + 2);
}

#20 2009-06-10 13:12:49

kumyco
Member
From: somewhere
Registered: 2008-06-23
Posts: 153
Website

Re: String Trim in C

gcc -O2 optimizes it away x)

#21 2009-06-10 13:19:26

Cerebral
Forum Fellow
From: Waterloo, ON, CA
Registered: 2005-04-08
Posts: 3,108
Website

Re: String Trim in C

kumyco wrote:

gcc -O2 optimizes it away x)

Maybe I'm missing something here, but how can that be possible when it doesn't even know the input string at compile time?

#22 2009-06-10 13:25:45

kumyco
Member
From: somewhere
Registered: 2008-06-23
Posts: 153
Website

Re: String Trim in C

i dunno the implementation details but I'd think it can see that there are really only 2 separate calls to strlen, so it caches the first calls and reuses them -- or something like that. i didn't get it either (never even considered it) until someone pointed it out.

a quick compile with -S will show 6 calls to strlen; recompiling with -S -O2 will show only 2

#23 2009-06-10 13:35:55

vacant
Member
From: downstairs
Registered: 2004-11-05
Posts: 816

Re: String Trim in C

Cerebral wrote:
kumyco wrote:

gcc -O2 optimizes it away x)

Maybe I'm missing something here, but how can that be possible when it doesn't even know the input string at compile time?

The compiler sees that neither the address nor the contents of stripchars change within the "do...while"? Not sure about async changes (e.g. interrupts/sensor reading).
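
Whether the calls get merged presumably depends on the compiler being able to prove the string is unchanged between them, which gets harder once the loop writes through a char pointer. Caching the lengths by hand does not rely on that. A minimal sketch, not checked against any particular GCC version (replace_chars is a made-up name):

```c
#include <stddef.h>
#include <string.h>

/* Replaces every character of s that appears in set with fill.
 * Because the loop writes into s, a compiler may have to assume the
 * lengths could change; caching them makes the single call explicit. */
size_t replace_chars(char *s, const char *set, char fill)
{
    size_t s_len = strlen(s);       /* computed once */
    size_t set_len = strlen(set);   /* computed once */
    size_t i, j, replaced = 0;

    for (i = 0; i < s_len; i++) {
        for (j = 0; j < set_len; j++) {
            if (s[i] == set[j]) {
                s[i] = fill;
                replaced++;
                break;
            }
        }
    }
    return replaced;
}
```

With the lengths held in locals, the behaviour is the same at any optimization level.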

Last edited by vacant (2009-06-10 13:40:05)

#24 2009-06-10 13:43:29

Aprz
Member
From: Newark
Registered: 2008-05-28
Posts: 277

Re: String Trim in C

As for the whole true and false thing, what about bit-fields? :P

Edit: I've never used a bit-field before, but just remember reading about it.

struct boolean {
    unsigned false:    0;
    unsigned true:     1;
} bool;

Wouldn't that be smaller than using an integer?
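
A note on the struct above: a width of 0 is only allowed for an unnamed bit-field, and false/true there would be member names rather than values. A one-bit flag would look more like the sketch below. The struct still occupies at least one storage unit, so a lone flag rarely ends up smaller than an int. The names are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical flags struct: each member is a single bit. */
struct flags {
    unsigned int trimmed_left  : 1;
    unsigned int trimmed_right : 1;
};

/* Prints the struct size; the exact value is implementation-defined. */
void report_flags_size(void)
{
    printf("sizeof(struct flags) = %zu\n", sizeof(struct flags));
}
```

Bit-fields tend to pay off when many flags are packed into one struct, not for a single boolean.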

Last edited by Aprz (2009-06-10 13:49:41)

#25 2009-06-10 13:46:52

kumyco
Member
From: somewhere
Registered: 2008-06-23
Posts: 153
Website

Re: String Trim in C

don't, 4 bytes is almost nothing and in some cases it's far more efficient to use the type that matches the native word-size than 1 byte etc.
