Screen-scrape clubphoto.com
tyler
Registered Users Posts: 5 Beginner grinner
My first post got messed up due to un-escaped HTML. Here's a better version:
http://dgrin.com/showthread.php?t=21987
Comments
#!/bin/bash
# URLs of the clubphoto albums you want to copy
url[1]=http://members3.clubphoto.com/tyler256499/3736536/owner-63ee.phtml
url[2]=http://members3.clubphoto.com/tyler256499/3627315/owner-63ee.phtml
url[3]=http://members3.clubphoto.com/tyler256499/3303128/owner-63ee.phtml
# loop over albums in the url array
for index in 1 2 3
do
echo "${url[index]}"
wget -O temp.html "${url[index]}"
# get the title
TITLE=$(sed -n 's:.*<TITLE>\(.*\)</TITLE>.*:\1:p' temp.html)
echo "$TITLE"
# make a new directory using the title
mkdir -p "$TITLE"
cd "$TITLE" || continue
# get the image ID and image title and write them to one line in a file, separated by a :
egrep "new pObj(.+)" ../temp.html | awk -F , '{print $2 ":" $5}' | sed 's/"//g' > img_ids.txt
cat img_ids.txt |
while read -r line
do
# get the id and name of the image
imgID=$(echo "$line" | awk -F : '{print $1}')
imgName=$(echo "$line" | awk -F : '{print $2}')
pwd
echo "$imgID"
echo "$imgName"
# get the image and save it using the image name
wget -O "$imgName.jpg" "http://members3.clubphoto.com/_cgi-bin/getImage.pl?imgID=$imgID"
done
cd ..
done
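The awk/sed pipeline in the middle of the script is the trickiest part, so here's a standalone sketch of how it splits one "new pObj(...)" line. The sample line below is a made-up approximation of clubphoto's JavaScript, not the real markup, which may have different fields in different positions:

```shell
#!/bin/bash
# Hypothetical sample of a "new pObj(...)" line from the album page.
line='new pObj("thumb.jpg","12345","x","y","Sunset at the lake","z")'
# Split on commas: field 2 is the image ID, field 5 is the title;
# then strip the double quotes, exactly as the main script does.
echo "$line" | awk -F , '{print $2 ":" $5}' | sed 's/"//g'
# prints: 12345:Sunset at the lake
```

If clubphoto changes the argument order of pObj, the field numbers in the awk print statement are the only thing you'd need to adjust.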
Wonderful - thanks so much for posting this