#1 2021-12-24 14:32:53

RDDO
Member
From: Brazil
Registered: 2008-02-21
Posts: 23

extracting pdf images

I just used an online PDF image extractor because I couldn't find pdfimages in the repos. Is there any similar console tool? I read the descriptions of a few (pdftk, pdfpc, pdfcrack...), but none of them seems to do it.

Thanks in advance.

#2 2021-12-24 14:43:31

teckk
Member
Registered: 2013-02-21
Posts: 519

Re: extracting pdf images

I don't have a ready-made example that does exactly what you're asking for.

This is a roundabout answer. There are many ways to do what you want; it depends on which tools you want to use. This example, which uses poppler and cairo, shows how to render a single PDF page to a PNG file. Note that it rasterizes the whole page rather than extracting the embedded images.

pdftoimage.c

#include <stdio.h>
#include <stdlib.h>
#include <poppler.h>
#include <cairo.h>

#define IMAGE_DPI 150

int main(int argc, char *argv[])
{
    PopplerDocument *document;
    PopplerPage *page;
    double width, height;
    GError *error;
    const char *pdf_file;
    const char *png_file;
    gchar *absolute, *uri;
    int page_num, num_pages;
    cairo_surface_t *surface;
    cairo_t *cr;
    cairo_status_t status;

    if (argc != 4) {
        /* Wrong argument count is an error: report on stderr, exit non-zero */
        fprintf (stderr, "Usage: pdftoimage input.pdf output.png page\n");
        return 1;
    }

    pdf_file = argv[1];
    png_file = argv[2];
    page_num = atoi(argv[3]);
    error = NULL;

    if (g_path_is_absolute(pdf_file)) {
        absolute = g_strdup (pdf_file);
    } else {
        gchar *dir = g_get_current_dir ();
        absolute = g_build_filename (dir, pdf_file, (gchar *) 0);
        g_free (dir);
    }

    uri = g_filename_to_uri (absolute, NULL, &error);
    g_free (absolute);
    if (uri == NULL) {
        printf("%s\n", error->message);
        return 1;
    }

    document = poppler_document_new_from_file (uri, NULL, &error);
    g_free (uri);
    if (document == NULL) {
        printf("%s\n", error->message);
        return 1;
    }

    num_pages = poppler_document_get_n_pages (document);
    if (page_num < 1 || page_num > num_pages) {
        printf("page must be between 1 and %d\n", num_pages);
        return 1;
    }

    page = poppler_document_get_page (document, page_num - 1);
    if (page == NULL) {
        printf("poppler fail: page not found\n");
        return 1;
    }

    poppler_page_get_size (page, &width, &height);

    surface = cairo_image_surface_create (CAIRO_FORMAT_ARGB32,
                                          IMAGE_DPI*width/72.0,
                                          IMAGE_DPI*height/72.0);
    cr = cairo_create (surface);
    cairo_scale (cr, IMAGE_DPI/72.0, IMAGE_DPI/72.0);
    cairo_save (cr);
    poppler_page_render (page, cr);
    cairo_restore (cr);
    g_object_unref (page);

    cairo_set_operator (cr, CAIRO_OPERATOR_DEST_OVER);
    cairo_set_source_rgb (cr, 1, 1, 1);
    cairo_paint (cr);

    status = cairo_status(cr);
    if (status)
        printf("%s\n", cairo_status_to_string (status));

    cairo_destroy (cr);
    status = cairo_surface_write_to_png (surface, png_file);
    if (status)
        printf("%s\n", cairo_status_to_string (status));

    cairo_surface_destroy (surface);

    g_object_unref (document);

    /* status holds the result of writing the PNG */
    return status == CAIRO_STATUS_SUCCESS ? 0 : 1;
}

Compile with:

gcc pdftoimage.c -o pdftoimage $(pkg-config --cflags --libs cairo poppler-glib)
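
A side note on the scaling in the example: poppler reports the page size in PDF points (72 per inch), so the surface dimensions are computed as IMAGE_DPI*size/72.0. The sketch below isolates that arithmetic. The helper name points_to_pixels and the POINTS_PER_INCH macro are made up for illustration and are not part of poppler or cairo; the truncating cast is assumed to mirror the implicit double-to-int conversion that happens when the values are passed to cairo_image_surface_create().

```c
#include <assert.h>

/* PDF user space measures lengths in points; 1 inch = 72 points. */
#define POINTS_PER_INCH 72.0

/* Hypothetical helper (not a poppler or cairo function): convert a length
 * in points to whole pixels at the given resolution. The cast truncates
 * toward zero, like the implicit conversion in the example above. */
static int points_to_pixels(double points, double dpi)
{
    assert(points >= 0.0 && dpi > 0.0);
    return (int)(points * dpi / POINTS_PER_INCH);
}
```

For a US Letter page (612 x 792 points) at 150 DPI this gives a 1275 x 1650 pixel surface. Raising IMAGE_DPI scales both dimensions linearly, so the pixel count (and the PNG size) grows with the square of the DPI.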

#3 2021-12-24 14:48:44

seth
Member
Registered: 2012-09-03
Posts: 51,229

Re: extracting pdf images

% pacman -F pdfimages
extra/poppler 21.11.0-1 [installed: 21.05.0-1]
    usr/bin/pdfimages

?

#4 2021-12-24 15:02:26

RDDO
Member
From: Brazil
Registered: 2008-02-21
Posts: 23

Re: extracting pdf images

seth wrote:
% pacman -F pdfimages
extra/poppler 21.11.0-1 [installed: 21.05.0-1]
    usr/bin/pdfimages

?

I see. I wasn't getting any results when searching for pdfimages because it ships as part of the poppler package. Anyway, thanks for pointing that out.


teckk wrote:

[...] This is a roundabout answer. There are many ways to do what you want; it depends on which tools you want to use. This example, which uses poppler and cairo, shows how to render a single PDF page to a PNG file.

That's a very interesting solution. I learned something new, thanks!
