
#1 2009-01-30 16:01:17

dav7
Member
From: Australia
Registered: 2008-02-08
Posts: 674

[Sourcecode] A <100 line IRC bot skeleton in C

So over the past week or so I've wanted to make an IRC bot in C. I tried all sorts of different methods, but none really worked; they were insane and/or stupid due to my incomprehension of memory and networking, and it really made me mad/sad. So, despite every bit of me not wanting to do so... I gave up. And tonight, after letting my brain noodle over some concepts and ideas (which I'll admit came from ##c on freenode), I at last came up with the following. And it appears to work.

"Everyone has to write an IRC bot," they say. Well, maybe they're right, but mine is as KISS and hackable as possible (in my opinion at least), and it also tries to use as little RAM as is sane: it allocates 1025 bytes of RAM out of the box (the 512-byte sbuf plus the 513-byte buf) and doesn't use any more than that (beyond the stack and the memory the DNS lookup functions use).

Also, so the line count is smaller and readability is higher, I only try the first address a host returns instead of checking all of them. This is very rarely a problem in practice, though.
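
For anyone who does want the fallback, here is a minimal sketch of walking the whole address list. connect_any() is a hypothetical helper that is not part of the bot, and using AF_UNSPEC (so IPv6 addresses are tried too) is an assumption; the bot itself asks for AF_INET only.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical helper (not in the bot): try each address getaddrinfo()
 * returns until one connects. Returns a connected socket, or -1. */
int connect_any(const char *host, const char *port) {
    struct addrinfo hints, *res, *p;
    int fd = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* assumption: accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, port, &hints, &res) != 0) return -1;

    for (p = res; p != NULL; p = p->ai_next) {
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd < 0) continue;                 /* try the next address */
        if (connect(fd, p->ai_addr, p->ai_addrlen) == 0) break;
        close(fd);                            /* failed, try the next one */
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}
```

In the bot this would stand in for the getaddrinfo()/socket()/connect() trio, e.g. conn = connect_any(host, port), at the cost of a few more lines.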

It also merges the IRC nickname, username and realname into one variable so the line count is lower, but because of the design, splitting them back out into separate variables would be trivial.
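
A rough sketch of what the split could look like. format_registration() and its parameter names are hypothetical; the "0 0" fields simply mirror the USER line the bot already sends.

```c
#include <stdio.h>

/* Hypothetical helper: build the two registration lines from separate
 * nick, user and real names. Returns what snprintf() returns: the
 * number of characters the full string needs (excluding the NUL). */
int format_registration(char *out, size_t n, const char *nick,
                        const char *user, const char *realname) {
    return snprintf(out, n, "NICK %s\r\nUSER %s 0 0 :%s\r\n",
                    nick, user, realname);
}
```

In the bot itself the same thing can be done without a helper by passing three separate variables to the two raw() calls at connect time.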

Read this:
This is a bot skeleton. It doesn't actually "do" anything - it doesn't even have any commands. However:
- It connects to a server and joins a single channel
- It'll PONG when PINGed
- It reacts to PRIVMSGs and NOTICEs, both in-channel and in PMs
- It sends "JOIN <#channel>" after, and only after, it sees numeric 001. This handles servers that don't like clients joining channels before they've sent a PONG response at least once, and since practically all IRC servers send 001 once you're registered, this method should work everywhere
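
The "JOIN only after 001" rule from the last bullet can be sketched in isolation like this. join_on_welcome() is a hypothetical name, and it assumes command is already a NUL-terminated word, which is what the bot's parser produces.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: if command is numeric 001 and a channel is set,
 * write "JOIN <channel>\r\n" into out and return 1; otherwise return 0
 * and leave out untouched. */
int join_on_welcome(const char *command, const char *channel,
                    char *out, size_t n) {
    if (channel == NULL || strcmp(command, "001") != 0) return 0;
    snprintf(out, n, "JOIN %s\r\n", channel);
    return 1;
}
```

Because 001 arrives once per connection, the JOIN goes out once per connection too.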

Code explanation:

raw()
This helper function works like printf() (same format string and arguments), except that as well as printing the result to your terminal it also sends it to the server.
raw() is demonstrated at connect time, when it sends NICK and USER.
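
For anyone new to variadic functions, here is the same pattern on its own. send_fmt() is a hypothetical variant that takes the file descriptor as a parameter instead of using the global conn, which is an assumption made so it can be tried against a pipe or stdout.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical variant of raw(): format like printf(), then write the
 * result to file descriptor fd. Returns what write() returns. */
int send_fmt(int fd, const char *fmt, ...) {
    char out[512];                         /* IRC lines are at most 512 bytes */
    va_list ap;

    va_start(ap, fmt);
    vsnprintf(out, sizeof out, fmt, ap);   /* truncates to fit, always NUL-terminates */
    va_end(ap);
    return (int)write(fd, out, strlen(out));
}
```

Calling send_fmt(1, "NICK %s\r\n", "test") writes the line to stdout, which is a quick way to see what would go over the wire.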

Hacking
You can probably learn the most from the printf() line I've thrown into the code at line 82. Connect the bot to a server and join it to an unused channel, and play around with:
- noticing messages to the channel
- talking normally in the channel
- PMing the bot
- NOTICEing the bot
and watch how this line reacts in your terminal.

Variables
All the following are pointers into locations in the data received from the server, and don't need freeing or anything before the next line comes in.
user: The user that sent the message. So if I said something in a channel or PM, "dav7" would come up here.
command: Always "NOTICE" or "PRIVMSG". Optimize away unnecessary strncmp()s by just checking if command[0] is 'N' or 'P'.
where: This will either be a channel or a username.
target: So that you don't have to worry about the difference between PMs and in-channel messages, this will always be what you should insert between your PRIVMSG or NOTICE and your message - it'll be a username for a PM, and a channel name for a channel response, depending on whether the originating message came from a channel or a PM.
message: The actual message you received. This will be \r\n terminated.
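
As a small sketch of how these variables fit together when answering someone: format_reply() and its text parameter are hypothetical, while command and target are the pointers described above.

```c
#include <stdio.h>

/* Hypothetical helper: build a reply line. A NOTICE is answered with a
 * NOTICE and a PRIVMSG with a PRIVMSG, decided by command[0] alone.
 * target is already the right destination for both channels and PMs. */
int format_reply(char *out, size_t n, const char *command,
                 const char *target, const char *text) {
    const char *verb = (command[0] == 'N') ? "NOTICE" : "PRIVMSG";
    return snprintf(out, n, "%s %s :%s\r\n", verb, target, text);
}
```

Inside the bot the equivalent is a one-liner along the lines of raw("%s %s :%s\r\n", command, target, "hello").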

Executional rundown for people who want to understand this bot and/or IRC bots in general
The code itself should be easy enough to unravel given 30-45 minutes or so (probably way less for experienced C programmers), but here's a basic rundown of how the bot works in general, and also a specification of how you could reimplement my bot yourself. I must admit that I really like point #6 - it's a really easy way to distinguish and handle the difference between in-channel messages and PMs.

1. If the line starts with PING, set the second byte in the line to an 'O' and send what we just received straight back to the server. I saw this technique used a while ago in another IRC bot written in C and snapped it up.

2. If the line from the server starts with a :, it's assumed to be a possible PRIVMSG or NOTICE and parsed as such, lifting the user, command, possible message location and possible message from the line. Up to this point, the line may or may not actually be a PRIVMSG or NOTICE, but we have the command now so can check.

3. Were there fewer than 3 words in the response? We don't have a user/command/location/message combination then - skip this line.

4. Is the command "001"? If it is (and a channel is set), ignore everything else in the line and reply with JOIN <#channel>.

5. If it's not, check if it's PRIVMSG or NOTICE. If it is, check we have a location and message; if we don't, skip this line. If we do, check the user for the presence of a '!', denoting that it's a user!host combination - if it is, set the '!' to '\0' so it cuts just after the user.

6. If the location starts with #, &, + or !, the 4 IRC channel types (you may have never heard of the last 3 but they're out there), it's a channel, and we should set the target to the location, otherwise we should set the target to the source user, since it's a PM.

7. Repeat until read() says 0 bytes were read.
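
Steps 1, 5 and 6 above can each be sketched as a tiny standalone helper. The function names are hypothetical (the bot does all of this inline) and every helper assumes a NUL-terminated string.

```c
#include <string.h>

/* Step 1: turn "PING ..." into "PONG ..." in place. Returns 1 if the
 * line was a PING and is now ready to send back, 0 otherwise. */
int ping_to_pong(char *line) {
    if (strncmp(line, "PING", 4) != 0) return 0;
    line[1] = 'O';
    return 1;
}

/* Step 5: cut "nick!user@host" down to "nick" in place. */
void strip_host(char *prefix) {
    char *sep = strchr(prefix, '!');
    if (sep != NULL) *sep = '\0';
}

/* Step 6: a location starting with #, &, + or ! is a channel, so reply
 * there; anything else is a PM, so reply to the sender. */
const char *reply_target(const char *where, const char *user) {
    /* the empty-string check matters: strchr() also matches the NUL */
    if (where[0] != '\0' && strchr("#&+!", where[0]) != NULL) return where;
    return user;
}
```

Keeping each step this small is what makes the parser easy to rearrange: none of them allocate anything, they only edit or point into the line that was just read.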

The code!

This requires no additional libraries - you can just do <tcc or gcc> -o bot bot.c.

#include <stdio.h>
#include <stdarg.h>
#include <unistd.h>
#include <string.h>
#include <sys/socket.h>
#include <netdb.h>

int conn;
char sbuf[512];

void raw(char *fmt, ...) {
    char out[512]; // separate from sbuf, so sending can't clobber input we haven't parsed yet
    va_list ap;
    va_start(ap, fmt);
    vsnprintf(out, sizeof out, fmt, ap);
    va_end(ap);
    printf("<< %s", out);
    write(conn, out, strlen(out));
}

int main() {
    char *nick = "test";
    char *channel = NULL; // set this to a channel name to auto-join after 001
    char *host = "irc.dav7.net";
    char *port = "6667";
    
    char *user, *command, *where, *message, *sep, *target;
    int i, j, l, sl, o = -1, start = 0, wordcount;
    char buf[513];
    struct addrinfo hints, *res;
    
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    getaddrinfo(host, port, &hints, &res); // no error checking, and only the first address is tried
    conn = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
    connect(conn, res->ai_addr, res->ai_addrlen);
    
    raw("USER %s 0 0 :%s\r\n", nick, nick);
    raw("NICK %s\r\n", nick);
    
    while ((sl = read(conn, sbuf, 512)) > 0) { // stop on EOF (0) and on error (-1)
        for (i = 0; i < sl; i++) {
            o++;
            buf[o] = sbuf[i];
            if ((o > 0 && buf[o] == '\n' && buf[o - 1] == '\r') || o == 511) { // checking buf also catches a \r\n split across two reads
                buf[o + 1] = '\0';
                l = o;
                o = -1;
                printf(">> %s", buf);
                
                if (!strncmp(buf, "PING", 4)) {
                    buf[1] = 'O';
                    raw("%s", buf); // never pass server data as the format string
                } else if (buf[0] == ':') {
                    wordcount = 0;
                    user = command = where = message = NULL;
                    for (j = 1; j < l; j++) {
                        if (buf[j] == ' ') {
                            buf[j] = '\0';
                            wordcount++;
                            switch(wordcount) {
                                case 1: user = buf + 1; break;
                                case 2: command = buf + start; break;
                                case 3: where = buf + start; break;
                            }
                            if (j == l - 1) continue;
                            start = j + 1;
                        } else if (buf[j] == ':' && wordcount == 3) {
                            if (j < l - 1) message = buf + j + 1;
                            break;
                        }
                    }
                    
                    if (wordcount < 2) continue;
                    if (!strncmp(command, "001", 3) && channel != NULL) {
                        raw("JOIN %s\r\n", channel);
                    } else if (!strncmp(command, "PRIVMSG", 7) || !strncmp(command, "NOTICE", 6)) {
                        if (where == NULL || message == NULL) continue;
                        if ((sep = strchr(user, '!')) != NULL) user[sep - user] = '\0';
                        if (where[0] == '#' || where[0] == '&' || where[0] == '+' || where[0] == '!') target = where; else target = user;
                        printf("[from: %s] [reply-with: %s] [where: %s] [reply-to: %s] %s", user, command, where, target, message);
                        //raw("%s %s :%s", command, target, message); // If you enable this the IRCd will get its "*** Looking up your hostname..." messages thrown back at it but it works...
                    }
                }
                
            }
        }
        
    }
    
    return 0;
    
}

-dav7

Last edited by dav7 (2009-01-30 16:02:46)


Windows was made for looking at success from a distance through a wall of oversimplicity. Linux removes the wall, so you can just walk up to success and make it your own.
--
Reinventing the wheel is fun. You get to redefine pi.


#2 2009-01-30 16:39:05

GraveyardPC
Member
Registered: 2008-11-29
Posts: 99

Re: [Sourcecode] A <100 line IRC bot skeleton in C

This should be stickied immediately. I wish I would've found something like this about a year ago. Good work.


#3 2009-01-30 16:50:23

Xyne
Administrator/PM
Registered: 2008-08-03
Posts: 6,963
Website

Re: [Sourcecode] A <100 line IRC bot skeleton in C

Very nice, dav7. This will be useful when I finally get around to going through that C tutorial that I have bookmarked. Thanks for the adjoining comments too.

You should try writing some tutorials for various things. Your tendency to be thorough would be very useful.


My Arch Linux Stuff | Forum Etiquette | Community Ethos - Arch is not for everyone


#4 2009-01-31 09:56:18

dav7
Member
From: Australia
Registered: 2008-02-08
Posts: 674

Re: [Sourcecode] A <100 line IRC bot skeleton in C

Glad this'll be useful. :D

I actually thought I was going on there a bit, heh.

If anyone has any ideas on how the parser could be made a bit tidier I'm open to suggestions... it looks a bit line-noise-y, IMHO. :/

And Xyne, I've actually been considering writing a tutorial on computers in general, i.e. a data-dump of everything I know in a sequential format suitable for stuffing into a PDF or the pages of a book that someone can pick up, devote 5 years to reading, and discover they were a hacker when they were done. It'd probably be like 11,248 pages though :/

-dav7

Last edited by dav7 (2009-01-31 09:58:43)



#5 2014-09-02 12:58:43

TanmayN
Member
Registered: 2014-02-01
Posts: 3

Re: [Sourcecode] A <100 line IRC bot skeleton in C

I don't even know C, but I was able to understand this. Thank you for sharing!


#6 2014-09-04 21:30:54

Xyne
Administrator/PM
Registered: 2008-08-03
Posts: 6,963
Website

Re: [Sourcecode] A <100 line IRC bot skeleton in C

5 years old, closing
p.s. I did eventually end up learning C ;)

