
#1 2009-12-22 18:34:50

Aprz
Member
From: Newark
Registered: 2008-05-28
Posts: 277

Obtain Website Using C [SOLVED]

I am looking for a way to download a webpage using C, something similar to wget or LWP::Simple in Perl. To be clear, I don't mean mirroring a whole site; I just want to read the content of a page (as if you were looking at View Source).

Oh scratch that... Just as I was searching on the Internet I found something that I think explains everything.

Last edited by Aprz (2009-12-22 18:45:17)

Offline

#2 2009-12-26 00:41:18

jumzi
Member
Registered: 2009-02-20
Posts: 69

Re: Obtain Website Using C [SOLVED]

Care to share? :]
You marked the thread solved Mrgrrr......

Offline

#3 2009-12-28 07:51:34

zowki
Member
From: Trapped in The Matrix
Registered: 2008-11-27
Posts: 582
Website

Re: Obtain Website Using C [SOLVED]

Yes, please share this. The only solution I could think of is using w3m to pipe the webpage into my program, but then it would require w3m as a dependency.

Last edited by zowki (2009-12-28 07:52:04)


How's my programming? Call 1-800-DEV-NULL

Offline

#4 2009-12-28 11:54:54

essence-of-foo
Member
Registered: 2008-07-12
Posts: 84

Re: Obtain Website Using C [SOLVED]

Just a guess, but libsoup is probably what you are looking for. Haven't used it myself though.

Offline

#5 2009-12-31 05:14:28

vkumar
Member
Registered: 2008-10-06
Posts: 166

Re: Obtain Website Using C [SOLVED]

edit: C code for those still interested in the problem.

/*
 * web-fetch.c
 * 
 * Fetch stuff using http. Pretty fragile code, but it does Get It Done (tm).
 * Use it under the ISC license; it's like the MIT/BSD ones, but less verbose.
 *  
 * -vk
 */

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

void fetch(const char* hostname, const char* filepath) {
    int sockfd;
    struct hostent* server;
    struct sockaddr_in serv_addr;
    
    memset(&serv_addr, 0, sizeof(serv_addr));

    if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
        perror("fetch - socket");
        return;
    }
    
    if (!(server = gethostbyname(hostname))) {
        /* gethostbyname reports failures through h_errno, not errno */
        herror("fetch - gethostbyname");
        close(sockfd);
        return;
    }

    serv_addr.sin_family = AF_INET;
    memcpy(&serv_addr.sin_addr.s_addr, server->h_addr, server->h_length);
    serv_addr.sin_port = htons(80);

    if (connect(sockfd, (struct sockaddr*) &serv_addr, sizeof(serv_addr)) < 0) {
        perror("fetch - connect");
        close(sockfd);
        return;
    }

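    static char buffer[1024];

    /* HTTP lines end in CRLF; the Host header is needed by virtual-hosted servers */
    int req_len = snprintf(buffer, sizeof(buffer),
                           "GET %s HTTP/1.0\r\nHost: %s\r\n\r\n", filepath, hostname);
    if (req_len < 0 || (size_t) req_len >= sizeof(buffer)) {
        fprintf(stderr, "fetch - request too long\n");
        close(sockfd);
        return;
    }
    size_t buf_len = (size_t) req_len;

    if (write(sockfd, buffer, buf_len) != (ssize_t) buf_len) {
        perror("fetch - write");
        close(sockfd);
        return;
    }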
    
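    ssize_t ret_code;
    do {
        /* read up to a full buffer, not just the length of the request */
        ret_code = read(sockfd, buffer, sizeof(buffer));
        if (ret_code == -1) {
            perror("fetch - read");
            break;
        }

        /* write exactly the bytes received; the data may contain NUL bytes */
        fwrite(buffer, 1, (size_t) ret_code, stdout);
    } while (ret_code > 0);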

    close(sockfd);
}

int main(int argc, char** argv) {
    if (argc == 3) {
        // usage: web-fetch <hostname> <path>
        fetch(argv[1], argv[2]);
    } else {
        fetch("rfc-editor.org", "/rfc/rfc2606.txt");
    }

    return 0;
}

Last edited by vkumar (2010-01-02 00:37:12)


div curl F = 0

Offline
