Some of you might have caught wind of this project I'm working on. For those who haven't, its goal is to, above all else, be as small as possible. That doesn't mean I'll make myself go mad by using assembly or incomplete libraries.
I was just wondering if there was any way I could possibly get the small binaries I'm producing any smaller, without drastically rethinking the code.
At the moment, gcc is on my fail list: it seems to produce considerably larger binaries for the small amount of code in each program than tcc, which generates some of the smallest code I've seen yet.
In one test, after running gcc -Os -O3 -s on one program, then stripping it, then hacking away at what was left with objcopy -R (and accidentally rendering it unusable), I was left with a 6.7K binary, while running tcc on the same source produced a 6.6K binary without stripping or removing any sections.
That might sound small, but after running tcc, strip and objcopy -R (removing .comment and .note.GNU-stack) on the same code, I was left with a 5.9K binary.
That's one of my better examples though (my best is 4.2K for another app). I have some programs I've done this with that won't go below around 12K. Since gcc isn't in my good books and tcc won't accept -O3 et al., do you have any ideas I might apply to my binaries to make them smaller? Or perhaps a preprocessor I can apply to my source code to pre-optimize it?
-dav7
Windows was made for looking at success from a distance through a wall of oversimplicity. Linux removes the wall, so you can just walk up to success and make it your own.
--
Reinventing the wheel is fun. You get to redefine pi.
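Just a note:
"In one test, after running gcc -Os -O3 -s on one program..."
From the gcc manpage:
"If you use multiple -O options, with or without level numbers, the last such option is the one that is effective."
So gcc -Os -O3 behaves like gcc -O3: the -Os is ignored and the binary is optimized for speed, not for size.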
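If you want to check which of the -O options actually took effect, here is a minimal sketch. It relies on the __OPTIMIZE__ and __OPTIMIZE_SIZE__ macros that gcc predefines; the function name optimization_mode is made up for this example, and other compilers may not define these macros at all.

```c
/* Reports the optimization mode the compiler says it is using.
 * __OPTIMIZE_SIZE__ is checked first because gcc defines __OPTIMIZE__
 * for every optimizing build, including -Os. */
#include <stdio.h>

const char *optimization_mode(void)
{
#if defined(__OPTIMIZE_SIZE__)
    return "size (-Os)";
#elif defined(__OPTIMIZE__)
    return "speed (-O1 or higher)";
#else
    return "none, or not reported by this compiler";
#endif
}
```

Calling puts(optimization_mode()) from main() in a file built with gcc -Os -O3 should report speed rather than size, because the trailing -O3 is the option that counts.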
bender02: I see. How do you link against libraries that aren't the default? That's confused me for a little while. Also, I tried compressing with UPX: it leaves stupid info strings in the file, and I have no idea how to remove them.
string: Ok... ouch, heh.
-dav7
About linking against libs other than glibc: you'd need the whole toolchain (binutils, C compiler, libc). You can probably look at the klibc-* PKGBUILD to see how it's done (yeah, I forgot, klibc is another alternative). The dietlibc package comes with the whole thing by default; setting CC='/opt/diet/bin/diet gcc -Os' (or also LD='/opt/diet/bin/diet gcc -s') before running 'make' should be sufficient. But be aware that lots of programs need patching before they successfully compile against a different libc.
If you're really into that kind of thing, you can look at openwrt, I think they have scripts to compile the whole distro against uclibc.
I ran a test with gcc 4.3.2, tcc 0.9.24 and llvm-gcc 2.4. Here are the results (_s means stripped, _ns not stripped; getfile.c is the source file itself):
gcc_Os_ns 9.7K
gcc_Os_s 6.2K
gcc_ns 11K
gcc_s 7.1K
getfile.c 5.1K
llvm-Os_ns 11K
llvm-Os_s 6.7K
llvm-ns 11K
llvm-s 7.3K
tcc_Os_ns 7.4K
tcc_Os_s 6.8K
tcc_ns 7.4K
tcc_s 6.8K
Tested file:
/* getfile.c (c) Johannes Krampf */
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h> /* close() */
#include <arpa/inet.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/socket.h>

int main(int argc, char** argv) {
    int rv;
    int match = 0; /* number of characters of "\r\n\r\n" matched so far */
    int sockfd = 0;
    int status = 0;
    size_t reqsize = 45; /* total length of the fixed parts of the request */
    char* url = NULL;
    char* posSlash = NULL;
    char* posSearch = NULL;
    char* dynBuffer = NULL;
    FILE* outfile = NULL;
    struct addrinfo hints;
    struct addrinfo *servinfo = NULL;
    struct addrinfo *p = NULL;

    if (3 != argc) {
        puts("ERROR: Wrong number of arguments.");
        puts("\nUsage: getfile [url][file]");
        return -1;
    }
    url = argv[1];
    if (NULL == url) {
        return 0;
    }
    /* Skip a leading "http://". */
    if ( (7 < strlen(url)) &&
         (0 == strncmp("http://", url, 7)) ) {
        url += 7;
    }
    /* Copy the host part of the URL (everything before the first '/'). */
    posSlash = strchr(url, '/');
    if (NULL == posSlash) {
        dynBuffer = (char *)calloc(strlen(url)+1, sizeof(char));
        if (NULL == dynBuffer) {
            return -1;
        }
        strncpy(dynBuffer, url, strlen(url));
    } else {
        dynBuffer = (char *)calloc((size_t)(posSlash-url+1), sizeof(char));
        if (NULL == dynBuffer) {
            return -1;
        }
        strncpy(dynBuffer, url, (size_t)(posSlash-url));
        dynBuffer[posSlash-url] = '\0';
    }

    /* Resolve the host name. */
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_CANONNAME;
    if ((rv = getaddrinfo(dynBuffer, "http", &hints, &servinfo)) != 0) {
        free(dynBuffer);
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rv));
        return -1;
    }
    free(dynBuffer);
    dynBuffer = NULL;

    /* Try each returned address until one connects. */
    for(p = servinfo; p != NULL; p = p->ai_next) {
        if (AF_INET == p->ai_addr->sa_family) {
            dynBuffer = (char *)calloc(INET_ADDRSTRLEN+1, sizeof(char));
            if (NULL == dynBuffer) {
                freeaddrinfo(servinfo);
                return -1;
            }
            inet_ntop(AF_INET, &(((struct sockaddr_in *)p->ai_addr)->sin_addr), dynBuffer, INET_ADDRSTRLEN);
        } else if (AF_INET6 == p->ai_addr->sa_family) {
            dynBuffer = (char *)calloc(INET6_ADDRSTRLEN+1, sizeof(char));
            if (NULL == dynBuffer) {
                freeaddrinfo(servinfo);
                return -1;
            }
            inet_ntop(AF_INET6, &(((struct sockaddr_in6 *)p->ai_addr)->sin6_addr), dynBuffer, INET6_ADDRSTRLEN);
        } else {
            fprintf(stderr, "Error after getaddrinfo: invalid protocol type '%x', aborting.\n", p->ai_addr->sa_family);
            free(dynBuffer);
            freeaddrinfo(servinfo);
            return -1;
        }
        printf("Connecting to %s (%s)...", servinfo->ai_canonname, dynBuffer);
        fflush(stdout);
        free(dynBuffer);
        dynBuffer = NULL;
        if ((sockfd = socket(p->ai_family, p->ai_socktype,
                             p->ai_protocol)) == -1) {
            continue;
        }
        if (connect(sockfd, p->ai_addr, p->ai_addrlen) == -1) {
            close(sockfd);
            puts("failed.");
            continue;
        }
        break;
    }
    freeaddrinfo(servinfo);
    if (p == NULL) {
        return -1;
    } else {
        puts("connected successfully.");
    }

    /* Build the request: fixed parts (45 bytes) + host + path + NUL. */
    if (NULL == posSlash) {
        reqsize += strlen(url) + 1;
    } else {
        reqsize += (posSlash - url) + strlen(posSlash);
    }
    dynBuffer = (char *)calloc(reqsize, sizeof(char));
    if (NULL == dynBuffer) {
        close(sockfd);
        return -1;
    }
    strcpy(dynBuffer, "GET /");
    if (NULL != posSlash) {
        strcat(dynBuffer, posSlash+1);
    }
    strncat(dynBuffer, " HTTP/1.1\r\nHost: ", 17);
    if (NULL == posSlash) {
        strncat(dynBuffer, url, strlen(url));
    } else {
        strncat(dynBuffer, url, (size_t)(posSlash - url));
    }
    strncat(dynBuffer, "\r\nConnection: close", 19);
    strncat(dynBuffer, "\r\n\r\n", 4);
    /* Send only the request text; reqsize also counts the terminating NUL. */
    if (-1 == send(sockfd, dynBuffer, strlen(dynBuffer), 0)) {
        free(dynBuffer);
        close(sockfd);
        puts("Error sending the request, aborting.");
        return -1;
    }
    free(dynBuffer);
    dynBuffer = NULL;

    dynBuffer = (char *)calloc(1025, sizeof(char));
    if (NULL == dynBuffer) {
        close(sockfd);
        return -1;
    }
    printf("Request sent, waiting for response...");
    fflush(stdout);
    while (0 < (rv = recv(sockfd, dynBuffer, 1024, MSG_WAITALL))) {
        dynBuffer[rv] = '\0';
        /* The first chunk must start with a "HTTP/1.x 200" status line. */
        if (0 == status) {
            if ((0 == strncmp("HTTP/1.", dynBuffer, 7)) &&
                (12 < rv) &&
                (0 == strncmp("200", dynBuffer + 9, 3))) {
                status = 200;
            } else {
                close(sockfd);
                posSlash = strchr(dynBuffer, '\r');
                if (NULL == posSlash) {
                    fprintf(stderr, "\nInvalid response, aborting.\n");
                } else {
                    *posSlash = '\0';
                    fprintf(stderr, "\nResponse: '%s', aborting.\n", dynBuffer);
                }
                free(dynBuffer);
                return -1;
            }
        }
        if (4 == match) {
            /* Headers are already done: everything received is body. */
            fwrite(dynBuffer, 1, rv, outfile);
        } else {
            /* Look for the "\r\n\r\n" that ends the headers. */
            posSearch = dynBuffer;
            while('\0' != *posSearch) {
                if ( (((0 == match) || (2 == match)) && ((char)13 == *posSearch)) ||
                     (((1 == match) || (3 == match)) && ((char)10 == *posSearch)) ) {
                    match++;
                } else {
                    match = 0;
                }
                if (4 == match) {
                    outfile = fopen (argv[2],"wb");
                    if (NULL == outfile) {
                        close(sockfd);
                        free(dynBuffer);
                        fprintf(stderr, "Could not open file '%s'!", argv[2]);
                        return -1;
                    }
                    fwrite(posSearch + 1, 1, rv-(posSearch-dynBuffer+1), outfile);
                    break;
                }
                posSearch++;
            }
        }
    }
    close(sockfd);
    /* outfile is still NULL if the end of the headers was never seen. */
    if (NULL != outfile) {
        fclose(outfile);
    }
    free(dynBuffer);
    dynBuffer = NULL;
    puts("received.");
    /*
     * GET /index.html HTTP/1.1
     * Host: www.example.com
     */
    return 0;
}
bender02: I see. I'll be sure to have a look at the klibc package, but I'm not so sure about openwrt. klibc is small enough anyway - Arch uses it for its initrd.
wuischke: Wow, very interesting...!
Maybe this will help: http://users.utu.fi/tmwire/linux4k.html
Further tests based on the linked page: at least for executables that are not gzipped and are linked conventionally, not all of his advice holds for my example application. (I know, I should really get a blog for this stuff...)
How the tests were performed: as in the previous test, -ns and -s indicate not stripped and stripped (strip executable). -ss is "super-stripped" by executing strip -s -R .comment -R .gnu.version executable, which brings a notable gain. (The fopt results without a suffix are not stripped.)
fopt-gcc is gcc with -ffast-math and -fomit-frame-pointer, which did not reduce the binary size for my test application; on the contrary, it increased it.
Otherwise, each name consists of one of the compilers mentioned above plus the -O level used. The first column is the size in bytes.
A word of warning: take the results with a grain of salt. I tested only one application, which covers a limited range of C and library features.
6040 gcc-Os-ss
6268 gcc-Os-s
6284 fopt-gcc-Os-ss
6380 llvm-gcc-Os-ss
6460 gcc-O3-ss
6476 gcc-O2-ss
6488 gcc-O1-ss
6556 llvm-gcc-O1-ss
6556 llvm-gcc-O2-ss
6556 llvm-gcc-O3-ss
6688 gcc-O3-s
6704 gcc-O2-s
6716 gcc-O1-s
6720 fopt-gcc-O3-ss
6732 fopt-gcc-O1-ss
6736 fopt-gcc-O2-ss
6760 llvm-gcc-Os-s
6856 tcc-ss
6904 tcc-s
6936 llvm-gcc-O1-s
6936 llvm-gcc-O2-s
6936 llvm-gcc-O3-s
6976 gcc-O0-ss
7020 llvm-gcc-O0-ss
7204 gcc-O0-s
7396 fopt-gcc-O0-ss
7400 llvm-gcc-O0-s
7524 tcc-ns
9835 gcc-Os-ns
10181 gcc-O1-ns
10191 gcc-O3-ns
10207 gcc-O2-ns
10621 llvm-gcc-Os-ns
10797 llvm-gcc-O1-ns
10797 llvm-gcc-O2-ns
10797 llvm-gcc-O3-ns
10806 gcc-O0-ns
11261 llvm-gcc-O0-ns
12379 fopt-gcc-Os
12725 fopt-gcc-O1
12743 fopt-gcc-O3
12759 fopt-gcc-O2
13526 fopt-gcc-O0
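To see what the "super-strip" step can save before running it, it helps to look at how large sections such as .comment and .gnu.version are. The following is only a rough sketch, assuming a Linux system whose link.h provides the ElfW() macro and a file of the machine's native ELF class; the function name list_sections is made up for this example. Tools like readelf or size report the same information.

```c
/* Prints the name and size in bytes of every section in an ELF file.
 * Note that sh_size is the section's size, which for .bss does not
 * correspond to bytes stored in the file. */
#include <elf.h>
#include <link.h>   /* ElfW(): picks Elf32_* or Elf64_* for this machine */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns the number of sections printed, or -1 on any error. */
long list_sections(const char *path)
{
    ElfW(Ehdr) eh;
    ElfW(Shdr) *sh = NULL;
    char *names = NULL;
    size_t names_len = 0;
    long count = -1;
    unsigned i;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;

    /* Read and sanity-check the ELF header. */
    if (fread(&eh, sizeof eh, 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != (sizeof(void *) == 8 ? ELFCLASS64 : ELFCLASS32) ||
        eh.e_shentsize != sizeof *sh ||
        eh.e_shnum == 0 || eh.e_shstrndx >= eh.e_shnum)
        goto done;

    /* Read the whole section header table. */
    sh = malloc((size_t)eh.e_shnum * sizeof *sh);
    if (sh == NULL || fseek(f, (long)eh.e_shoff, SEEK_SET) != 0 ||
        fread(sh, sizeof *sh, eh.e_shnum, f) != eh.e_shnum)
        goto done;

    /* Read the string table that holds the section names. */
    names_len = (size_t)sh[eh.e_shstrndx].sh_size;
    names = malloc(names_len + 1);
    if (names == NULL ||
        fseek(f, (long)sh[eh.e_shstrndx].sh_offset, SEEK_SET) != 0 ||
        fread(names, 1, names_len, f) != names_len)
        goto done;
    names[names_len] = '\0';

    count = 0;
    for (i = 0; i < eh.e_shnum; i++) {
        if (sh[i].sh_name >= names_len)
            continue; /* skip entries with a bogus name offset */
        printf("%-20s %8lu\n", names + sh[i].sh_name,
               (unsigned long)sh[i].sh_size);
        count++;
    }

done:
    free(names);
    free(sh);
    fclose(f);
    return count;
}
```

Calling list_sections(argv[1]) from a small main() on the binary before and after strip shows which sections disappeared and roughly how many bytes each one accounted for.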
Last edited by wuischke (2008-12-02 21:38:20)
I'm down to 5864 bytes with gcc 4.3.2. Adding the following options to gcc "-Wl,--gc-sections -ffunction-sections -fdata-sections" (source) does make a small difference.
Using gcc 3.3 and gcc 3.4 I obtained even smaller sizes: 5780 bytes (3.3) and 5592 bytes (3.4).
Therefore my recommendations:
- use gcc 3.4
- Add the compiler options "-Os -Wl,--gc-sections -ffunction-sections -fdata-sections"
- Use strip with the options "-s -R .comment -R .gnu.version"
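For anyone wondering what the section options in these recommendations do, here is a minimal sketch; the file and function names are made up for illustration and are not part of getfile.c. With -ffunction-sections, gcc places each function in its own section, and -Wl,--gc-sections lets the linker drop sections that nothing refers to.

```c
/* sections.c (hypothetical example file).
 *
 * Assumed build line, using the options recommended above:
 *   gcc -Os -ffunction-sections -fdata-sections -Wl,--gc-sections ...
 */
#include <stddef.h>

/* Stays in the executable as long as code that is itself kept calls it. */
size_t used_length(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* If nothing in the program calls this, it sits alone in its own section
 * and the linker should be able to discard it with --gc-sections.
 * Without -ffunction-sections it shares .text with used_length() and
 * stays in the binary. */
unsigned unused_checksum(const char *s)
{
    unsigned sum = 0;
    while (*s != '\0')
        sum = sum * 31u + (unsigned char)*s++;
    return sum;
}
```

Link this with a main() that only calls used_length() and compare the binary size with and without the three options. On a program as small as getfile.c the gain is modest, which matches the small difference reported above.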