cat <(curl https://www.archlinux.org)
would do the equivalent of
mkfifo tmp.fifo
curl https://www.archlinux.org > tmp.fifo &
cat tmp.fifo
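To compare the two forms without network access, here is a rough sketch in which printf stands in for curl. The sample text, the temporary directory and the variable names via_subst and via_fifo are made up for illustration.

```shell
#!/bin/bash
# printf stands in for curl so this runs offline (assumption for illustration).
tmpdir=$(mktemp -d)
fifo="$tmpdir/tmp.fifo"

# Form 1: process substitution; cat reads the command's output like a file.
via_subst=$(cat <(printf 'hello from the command\n'))

# Form 2: the manual equivalent with a named pipe.
mkfifo "$fifo"
printf 'hello from the command\n' > "$fifo" &   # writer waits until a reader opens the fifo
via_fifo=$(cat "$fifo")
wait                                            # let the background writer finish

rm -rf "$tmpdir"

echo "process substitution: $via_subst"
echo "named pipe:           $via_fifo"
```

Both variables should end up holding the same text.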
The additional "<" (with a space) feeds this "input file" into the while loop, just as "< foo.txt" would feed the contents of the file "foo.txt" into STDIN for some command. For example:
while read line
do
echo "$line"
done < foo.txt
would loop over each line of "foo.txt". Replacing "foo.txt" with "<(curl https://www.archlinux.org)" would loop over the content of "https://www.archlinux.org" instead.
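If you want to try the redirection without a network connection or a "foo.txt", something along these lines should do. Here printf stands in for curl and the sample lines are invented; the counter is only there to show that the loop body runs once per line.

```shell
#!/bin/bash
# printf stands in for curl; the three sample lines are made up.
count=0
while read -r line
do
    count=$((count + 1))
    echo "$count: $line"
done < <(printf '%s\n' "first line" "second line" "third line")

echo "read $count lines"
```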
At first I did the same thing as you did in your previous post (temporarily changing IFS and then loading the content directly into an array), but I concluded that it is a bad idea. Unexpected characters in the description will break that code, and there are security risks as well: it could lead to remote code execution, however unlikely it may be for code to make its way into the text. I believe that my suggestion is safer and will handle all possible cases.
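As one illustration of how unexpected characters can break the IFS approach: the unquoted expansion in array=($s) also undergoes pathname expansion, so a lone "*" in the text can be replaced by file names. This is only a sketch with invented sample text (not real IMDb output), run in a scratch directory; the names unsafe and safe are hypothetical.

```shell
#!/bin/bash
# Invented sample text; the "*" plays the role of the unexpected character.
s='A quiet episode|*|The finale'

# Work in a scratch directory that contains exactly one known file.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch surprise.txt

# IFS approach: split on "|" through unquoted expansion.
OIFS="$IFS"
IFS='|'
unsafe=($s)
IFS="$OIFS"

# while/read approach: quoted lines are appended, so nothing is expanded.
safe=()
while read -r line
do
    safe+=("$line")
done < <(tr '|' '\n' <<< "$s")

echo "IFS split:  ${unsafe[1]}"   # the "*" has been replaced by a file name
echo "while/read: ${safe[1]}"     # the "*" is kept as text

cd / && rm -rf "$tmpdir"
```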
Again, note that you can replace
sed 's/<div class=\"item_description\" itemprop=\"description\">//' | sed 's/<\/div>/\|/')
with
sed 's/<div class=\"item_description\" itemprop=\"description\">//;s/<\/div>/\|/')
If you want to buy 2 things from the shop, there's no reason to walk there and back again twice to buy each thing separately.
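A quick way to convince yourself that the two forms give the same result, using an invented sample line in the shape of the markup above rather than the live page. The backslashes before the quotes and the pipe are left out here, since they are not needed inside single quotes; the variable names are made up.

```shell
#!/bin/bash
# Invented sample line shaped like the markup being filtered.
html='<div class="item_description" itemprop="description">Picard meets Q.</div>'

# Two sed processes, one substitution each.
two=$(printf '%s\n' "$html" | sed 's/<div class="item_description" itemprop="description">//' | sed 's/<\/div>/|/')

# One sed process, both substitutions separated by a semicolon.
one=$(printf '%s\n' "$html" | sed 's/<div class="item_description" itemprop="description">//;s/<\/div>/|/')

echo "two seds: $two"
echo "one sed:  $one"
```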
#!/bin/bash
OIFS="$IFS"
IFS=$'|'
s=$(curl -s http://www.imdb.com/title/tt0092455/episodes?season=1 | grep "item_description" | sed 's/<div class=\"item_description\" itemprop=\"description\">//' | sed 's/<\/div>/\|/')
array=($s)
IFS="$OIFS"
echo -e "Season 1 episode 1\n"
echo ${array[0]}
Can you explain to me how this while loop works? Is the last line of it
done < <(curl -s http://www.imdb.com/title/tt0092455/episodes?season=1 | grep "item_description" | sed 's/<div class=\"item_description\" itemprop=\"description\">/\ /;s/<\/div>/\n/')
feeding lines of text into the $line variable? As I said before, I'm not too familiar with bash syntax.
Now I suppose that if I could populate each array element with one of the resulting paragraphs, moving on to the next element every time a line break is reached, I would be golden. I'm not yet sure how to do that. I don't have much experience in bash scripting. Does anyone have any ideas of how I might go about this?
#!/bin/bash
paragraphs=()
# populate the array using a while loop and process substitution (direct piping will not work)
while read line
do
paragraphs+=("$line")
done < <(curl -s http://www.imdb.com/title/tt0092455/episodes?season=1 | grep "item_description" | sed 's/<div class=\"item_description\" itemprop=\"description\">/\ /;s/<\/div>/\n/')
echo first paragraph
echo "${paragraphs[0]}"
echo
echo fifth paragraph
echo "${paragraphs[4]}"
echo
echo all paragraphs
for paragraph in "${paragraphs[@]}"
do
echo "$paragraph"
done
edit: note that I have combined the separate sed commands into a single one using semicolons to separate the substitutions.
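About the "direct piping will not work" comment in the script: by default bash runs each part of a pipeline in a subshell, so an array filled inside a piped while loop is gone once the loop ends. A small offline sketch, with printf standing in for the curl pipeline and made-up array names:

```shell
#!/bin/bash
# printf stands in for the curl | grep | sed pipeline (assumption for illustration).

# Piped loop: the loop body runs in a subshell, so the parent's array stays empty.
piped=()
printf '%s\n' "one" "two" | while read -r line
do
    piped+=("$line")
done

# Process substitution: the loop runs in the current shell, so the array survives.
substituted=()
while read -r line
do
    substituted+=("$line")
done < <(printf '%s\n' "one" "two")

echo "piped loop:           ${#piped[@]} elements"
echo "process substitution: ${#substituted[@]} elements"
```

This assumes the default settings of a non-interactive script (in particular, the lastpipe option is not enabled).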
curl -s http://www.imdb.com/title/tt0092455/episodes?season=1 | grep "item_description" | sed 's/<div class=\"item_description\" itemprop=\"description\">/\ /' | sed 's/<\/div>/\n/'
Now I suppose that if I could populate each array element with one of the resulting paragraphs, moving on to the next element every time a line break is reached, I would be golden. I'm not yet sure how to do that. I don't have much experience in bash scripting. Does anyone have any ideas of how I might go about this?
tng 5 25
This would play season 5, episode 25. I was thinking of ways to extend the script, and I thought of adding episode summaries. I want to use curl to extract the HTML from IMDb that contains the descriptions. I've tried parsing with grep and inserting into an array, but every space creates a new element, which won't work. I've also thought of piping stdout to w3m and using text only, but I can't seem to solve this at the moment. Any suggestions?