I'm writing a Python script to get the current temperature for my city. Here's the current state of the script:
import urllib2
addr = "http://weather.weatherbug.com/Bulgaria/Yambol-weather.html"
page = urllib2.urlopen(addr)
for line in page.readlines():
    if line.find('<div id="divTemp" class="wXconditions-temp">') != -1:
        temp = line
print temp
The problem is that I'm getting the string
<div id="divTemp" class="wXconditions-temp">17.0°C</div>
from weatherbug.com, where 17.0°C is the temperature. I'm trying to make the script print just 17.0°C, but I can't figure out how to use a regular expression to cut it out. Can you give me a hand with this? Thanks!
PS: I've made this in bash before, but I need to do it in Python. Here's how it looks in bash:
$ lynx -dump -hiddenlinks=ignore -nolist http://weather.weatherbug.com/Bulgaria/Yambol-weather.html | grep "C" | head | tail -n1
17.0°C
You could try something like this:
#!/usr/bin/env python
import urllib2
import re
addr = "http://weather.weatherbug.com/Bulgaria/Yambol-weather.html"
page = urllib2.urlopen(addr)
for line in page.readlines():
    # The dot is escaped so it matches only a literal decimal point.
    match = re.search(r'<div id="divTemp" class="wXconditions-temp">(\d+\.\d?)\°', line)
    if match:
        temp = match.group(1)
print temp
This just captures part of the string using parentheses, then prints the captured group, which is the temperature. However, it only prints a number (like 17.0); it doesn't show the little degree sign or the C, but that wouldn't be too hard to add, I suppose. And if there isn't a match, you'll get an error because 'temp' won't be defined.
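If the "no match" case is a worry, here is a minimal sketch of the same capture-group idea wrapped in a function that returns None instead of leaving a variable undefined. It runs against the sample line from the first post rather than the live page, it is written for Python 3, and the helper name extract_temp is made up for illustration. The slightly looser pattern (optional minus sign, any attributes after the id) is an assumption about what the page might contain, not something confirmed in this thread.

```python
import re

# Sample line copied from the first post; the live page may look different.
SAMPLE = '<div id="divTemp" class="wXconditions-temp">17.0°C</div>'


def extract_temp(line):
    """Return the temperature digits from a divTemp line, or None if absent.

    extract_temp is a hypothetical helper name, not part of any library.
    """
    # The parentheses capture only the number: an optional minus sign,
    # one or more digits, and an optional decimal part.
    match = re.search(r'<div id="divTemp"[^>]*>(-?\d+(?:\.\d+)?)', line)
    if match is None:
        return None
    return match.group(1)


print(extract_temp(SAMPLE))
print(extract_temp("<p>no temperature here</p>"))
```

Because the function always returns something, the caller can test the result for None before printing or appending the degree sign.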
Last edited by BetterLeftUnsaid (2008-10-09 16:46:22)
This should work:
import urllib2, re
addr = "http://weather.weatherbug.com/Bulgaria/Yambol-weather.html"
page = urllib2.urlopen(addr)
regexp = re.compile(r'.*>(\d{2}\.\d).+C<.*')
for line in page.readlines():
    if line.find('<div id="divTemp" class="wXconditions-temp">') != -1:
        m = regexp.match(line)
        print m.groups()[0]
        break
Now try to change the regular expression so you can remove your if line.find...
Another way would be to use BeautifulSoup for the parsing.
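To give an idea of what the parser-based route looks like, here is a rough sketch that uses Python 3's standard html.parser module in place of BeautifulSoup, so nothing needs to be installed. It is not BeautifulSoup code; it only illustrates the same idea of finding the element by its id instead of by a regular expression. The class name TempParser is made up, the input is the sample line from the first post rather than the live page, and nested divs inside divTemp are not handled.

```python
from html.parser import HTMLParser


class TempParser(HTMLParser):
    """Collect the text inside the div whose id is "divTemp".

    TempParser is a hypothetical name used only for this sketch.
    """

    def __init__(self):
        super().__init__()
        self.inside = False  # True while the parser is inside the target div
        self.text = []       # text fragments found inside that div

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "div" and dict(attrs).get("id") == "divTemp":
            self.inside = True

    def handle_endtag(self, tag):
        # Assumes no nested divs: the first closing div ends the capture.
        if tag == "div" and self.inside:
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.text.append(data)


# Sample line copied from the first post; the live page may look different.
SAMPLE = '<div id="divTemp" class="wXconditions-temp">17.0°C</div>'

parser = TempParser()
parser.feed(SAMPLE)
parser.close()
temperature = "".join(parser.text)
print(temperature)
```

With a real page you would feed the decoded page text to the parser instead of SAMPLE; the rest stays the same.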
Yaaay, I made it! Woohooo. I lost my day making this stupid script and *finally* it's done. Here it is:
#!/usr/bin/env python
import urllib2, re
addr = "http://weather.weatherbug.com/Bulgaria/Yambol-weather.html"
page = urllib2.urlopen(addr)
degree_symbol = unichr(176).encode("latin-1")
for line in page.readlines():
    if line.find('<div id="divTemp" class="wXconditions-temp">') != -1:
        temp = line
        break
temp = re.sub('.*">', '', temp)
temp = re.sub('°.*\n', '', temp)
print temp + degree_symbol + "C"
And when I run it:
$ python weather.py
16.0°C
EDIT: Wow! I'm a bit slow with the typing. Thanks for the help guys! I'll check them out.
Last edited by Boris Bolgradov (2008-10-09 16:50:57)