Hello all.
I want to download all images from a website using wget. Searching for information, I found that I should use this command:
wget -A.jpg -r -l1 -np http://www.whatever.com/whatever.htm
However, this does not work for me. It downloads only the web page, not the images.
I tried many web pages and got the same result with all of them.
I really don't know what is wrong with that command; many people say it works for them.
Please help me.
Thanks in advance.
Offline
I believe the -A.jpg option is expanded to match only files literally named .jpg, due to this passage in the man page for wget:
-A acclist --accept acclist
-R rejlist --reject rejlist
Specify comma-separated lists of file name suffixes or patterns to
accept or reject. Note that if any of the wildcard characters, *,
?, [ or ], appear in an element of acclist or rejlist, it will be
treated as a pattern, rather than a suffix.
Thus, I think *.jpg might be more what you are looking for. Also, are you sure that all of the images on the site are within one directory of the homepage? Your recursion depth is set to 1. I am not completely sure that you want the "-np" option either:
-np
--no-parent
Do not ever ascend to the parent directory when retrieving
recursively. This is a useful option, since it guarantees that
only the files below a certain hierarchy will be downloaded.
I could be wrong about your understanding of the flags that you passed, but hopefully this sheds some light on what the problem might have been. Try having a look at the wget man page yourself and formulating your own command to do exactly what you want. In Unix, there are many ways to do things.
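For instance, a variant along those lines might look like this (a sketch only: the domain is just the placeholder from the original post, and the depth of 2 is a guess at how deep the images sit):

```shell
# Sketch, not tested against a real site: accept anything ending in .jpg,
# recurse two levels deep, and drop -np so wget is not fenced in.
wget -r -l2 -A "*.jpg" http://www.whatever.com/whatever.htm
```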
"In questions of science the authority of a thousand is not worth the humble reasoning of a single individual."
- Galileo Galilei
Offline
It seems you can do it with the DownThemAll! add-on for Firefox. I haven't tried it yet.
Offline
In Firefox you may want to try the ScrapBook Plus extension, which allows fine-grained downloads of websites.
To know or not to know ...
... the questions remain forever.
Offline
This should work. But are you sure about the "-l1"? If you restrict the download to level one (besides, this works against the "-r" recursive option), there is no chance to reach any jpg file further down in the site, as images are often stored in some subdirectory.
To know or not to know ...
... the questions remain forever.
Offline
Thanks a lot for your answer.
You see, I tried using *.jpg instead of just .jpg, and it does not work either.
You are right, I should not set the -l parameter to 1.
I really don't understand what recursive means. Can someone explain it to me using a simple example, please?
I have looked at the wget man page, but first I need to know what recursive means.
Thanks. I currently use that Firefox extension, but I want to do it using wget.
Thanks for your answer, but I want to do it using wget.
Yes, I am completely sure I don't need the -l flag.
The command does not work for me. I am trying to download the images from this web page:
http://www.submitarchive.com/upload/pic … he_sky.php
It does not work. Maybe that is because the images are not contained directly in the web page; they are hosted on Flickr. Perhaps I need another flag or parameter in the command because of that.
Please, can anyone explain to me what recursive means? I need to understand it to learn to use wget...
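Since the images live on Flickr rather than on the page's own host, a plain recursive wget will not follow them by default. One possible variant (untested against this page, and the domain list is a guess) is to let wget span hosts:

```shell
# Sketch: -H lets wget follow links onto other hosts, and -D restricts
# which foreign domains it may visit. The domain list here is a guess.
wget -r -l1 -H -D flickr.com -A "*.jpg" \
     "http://www.submitarchive.com/upload/pictures_from_the_sky.php"
```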
Last edited by zuargo (2010-04-02 00:21:24)
Offline
Recursive means that the program descends into each subdirectory as it encounters it, finishes its work there, goes back up, and looks for the next subdirectory.
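As a toy illustration of that idea (this is just a shell sketch of recursive traversal in general, not what wget does internally), a function that visits a directory tree recursively could look like:

```shell
#!/bin/bash
# visit: recursively walk a directory tree, printing every file found.
visit() {
    local dir=$1
    for entry in "$dir"/*; do
        if [ -d "$entry" ]; then
            visit "$entry"          # descend into the subdirectory first
        else
            echo "found: $entry"    # a plain file: the "work" happens here
        fi
    done
}

# Build a small demo tree and walk it.
mkdir -p demo/sub1/sub2
touch demo/a.jpg demo/sub1/b.jpg demo/sub1/sub2/c.jpg
visit demo
```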
Ogion
EDIT:
I wrote a zsh script to download all the pictures. The key is that the links to the pictures are all in the HTML source of the page you gave us. (If you don't use zsh but bash, then change "zsh" in the script to "bash".)
#!/bin/zsh
# grabs the pictures from
# http://www.submitarchive.com/upload/pictures_from_the_sky.php
wget -O tmpfile http://www.submitarchive.com/upload/pictures_from_the_sky.php &>/dev/null
grep "flickr" tmpfile | while read -r i
do
    link=$(echo "$i" | cut -d '"' -f 2)   # the URL is the 2nd "-delimited field
    name=$(echo "$i" | cut -d '"' -f 4)   # the title is the 4th "-delimited field
    wget -O "${name}.jpg" "$link"
    printf '%s.jpg %s saved\n' "$name" "$link"
done
rm tmpfile
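To see what the cut calls in the loop do, here is the extraction run on a made-up line of the kind grep "flickr" would match (the URL and title are invented for illustration):

```shell
# A hypothetical matched line: splitting on " makes the href URL the
# 2nd field and the title attribute the 4th.
line='<a href="http://farm1.static.flickr.com/123/pic.jpg" title="sunset">'
link=$(echo "$line" | cut -d '"' -f 2)
name=$(echo "$line" | cut -d '"' -f 4)
echo "$link"   # http://farm1.static.flickr.com/123/pic.jpg
echo "$name"   # sunset
```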
Last edited by Ogion (2010-04-02 08:23:08)
(my-dotfiles)
"People willing to trade their freedom for temporary security deserve neither and will lose both." - Benjamin Franklin
"Enlightenment is man's leaving his self-caused immaturity." - Immanuel Kant
Offline
Thanks a lot for your answer and your script.
So, to use the command that I put in the first post, the images must be contained on the same website (in this case http://www.submitarchive.com)?
Offline
Yes, I suppose so. I think wget looks for images in the directory structure of the target page, not images linked from other domains. But I haven't used wget recursively yet.
Ogion
Last edited by Ogion (2010-04-05 19:01:33)
(my-dotfiles)
"People willing to trade their freedom for temporary security deserve neither and will lose both." - Benjamin Franklin
"Enlightenment is man's leaving his self-caused immaturity." - Immanuel Kant
Offline
Thanks a lot Ogion for your answer...
Offline