I'm writing a Python script to get the current temperature for my city. Here's the current state of the script:
import urllib2
addr = "http://weather.weatherbug.com/Bulgaria/Yambol-weather.html"
page = urllib2.urlopen(addr)
for line in page.readlines():
    if line.find('<div id="divTemp" class="wXconditions-temp">') != -1:
        temp = line
print temp
The problem is that I'm getting the string
<div id="divTemp" class="wXconditions-temp">17.0°C</div>
from weatherbug.com, where 17.0°C is the temperature. I'm trying to make the script print just 17.0°C, but I can't figure out how to use a regular expression to cut it out. Can you give me a hand with this? Thanks!
PS: I've made this in bash before, but I need to do it in Python. Here's how it looks in bash:
$ lynx -dump -hiddenlinks=ignore -nolist http://weather.weatherbug.com/Bulgaria/Yambol-weather.html | grep "C" | head | tail -n1
   17.0°C
You could try something like this:
#!/usr/bin/env python
import urllib2
import re
addr = "http://weather.weatherbug.com/Bulgaria/Yambol-weather.html"
page = urllib2.urlopen(addr)
for line in page.readlines():
    match = re.search(r'<div id="divTemp" class="wXconditions-temp">(\d+\.\d?)°', line)
    if match:
        temp = match.group(1)
print temp
This basically just groups part of the string using parentheses, and then prints out the grouped section, which is the temperature. However, this only prints a number (like 17.0); it doesn't show the little degree sign or the C, but that wouldn't be too hard to add, I suppose. And if there isn't a match, you'll get an error because 'temp' won't be defined.
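A rough sketch of how both of those points (putting the unit back and coping with a missing match) could be handled. It's written for Python 3 rather than the Python 2 used in this thread, it runs against a hard-coded sample line instead of the live page, and the helper name extract_temp is made up for the example.

```python
import re

# Pattern adapted from the post above: the dot is escaped and the unit
# ("°C") is captured in a second group.
TEMP_RE = re.compile(
    r'<div id="divTemp" class="wXconditions-temp">(\d+\.\d?)(°C)'
)

def extract_temp(line):
    """Return e.g. '17.0°C', or None if the line has no temperature div."""
    match = TEMP_RE.search(line)
    if match is None:
        return None
    return match.group(1) + match.group(2)

# Sample line copied from the first post (not fetched from the site).
sample = '<div id="divTemp" class="wXconditions-temp">17.0°C</div>'
print(extract_temp(sample))
print(extract_temp('<div id="other">no temperature here</div>'))
```

Returning None instead of leaving a variable undefined lets the caller decide what to do when the page layout changes.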
Last edited by BetterLeftUnsaid (2008-10-09 16:46:22)
This should work:
import urllib2, re
addr = "http://weather.weatherbug.com/Bulgaria/Yambol-weather.html"
page = urllib2.urlopen(addr)
regexp = re.compile(r'.*>(\d{2}\.\d).+C<.*')
for line in page.readlines():
    if line.find('<div id="divTemp" class="wXconditions-temp">') != -1:
        m = regexp.match(line)
        print m.groups()[0]
        break
Now try to change the regular expression so you can remove your "if line.find..." check.
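One possible answer to that exercise, as a sketch only: anchor the pattern on the div's id and search the whole page text at once. This is Python 3, and page_text is a made-up stand-in for the decoded page (the divHumidity line is invented just to show that other divs are skipped).

```python
import re

# Hypothetical stand-in for the decoded text of the downloaded page.
page_text = """<html><body>
<div id="divHumidity" class="wXconditions-hum">64%</div>
<div id="divTemp" class="wXconditions-temp">17.0°C</div>
</body></html>"""

# Anchoring on the div's id makes a separate line.find() check unnecessary.
temp_re = re.compile(r'<div id="divTemp"[^>]*>([^<]+)</div>')

match = temp_re.search(page_text)
temperature = match.group(1) if match else None
print(temperature)
```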
Another way would be to use BeautifulSoup for the parsing.
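To give an idea of what parser-based extraction looks like, here is a minimal sketch. Note that it does not use BeautifulSoup: it swaps in html.parser from the Python 3 standard library so that it runs without installing anything. The class name TempParser and the sample string are made up for the example.

```python
from html.parser import HTMLParser

class TempParser(HTMLParser):
    """Collects the text inside the div whose id is 'divTemp'."""

    def __init__(self):
        super().__init__()
        self.in_temp = False
        self.temperature = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs.
        if tag == "div" and dict(attrs).get("id") == "divTemp":
            self.in_temp = True

    def handle_data(self, data):
        # Keep only the first piece of text found inside the target div.
        if self.in_temp and self.temperature is None:
            self.temperature = data.strip()

    def handle_endtag(self, tag):
        if tag == "div":
            self.in_temp = False

# Sample markup based on the line quoted in the first post.
sample = '<body><div id="divTemp" class="wXconditions-temp">17.0°C</div></body>'
parser = TempParser()
parser.feed(sample)
parser.close()
print(parser.temperature)
```

The advantage over a regex is that it doesn't care about attribute order or extra whitespace inside the tag.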
Yaaay, I made it! Woohooo. I lost my day making this stupid script and *finally* it's done. Here it is:
#!/usr/bin/env python
import urllib2, re
addr = "http://weather.weatherbug.com/Bulgaria/Yambol-weather.html"
page = urllib2.urlopen(addr)
degree_symbol = unichr(176).encode("latin-1")
for line in page.readlines():
    if line.find('<div id="divTemp" class="wXconditions-temp">') != -1:
        temp = line
        break
temp = re.sub('.*">', '', temp)
temp = re.sub('°.*\n', '', temp)
print temp + degree_symbol + "C"
And when I run it:
$ python weather.py 
16.0°C
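For anyone wondering what the degree_symbol line is doing: it builds the degree sign as a single Latin-1 byte. A small Python 3 sketch of the same idea follows; that the page and terminal use Latin-1 is an assumption here, and raw_line is a made-up byte string rather than real page data.

```python
# Code point 176 (U+00B0) is the degree sign.
degree_text = chr(176)

# The same character as bytes in two common encodings.
latin1_bytes = degree_text.encode("latin-1")   # one byte: b'\xb0'
utf8_bytes = degree_text.encode("utf-8")       # two bytes: b'\xc2\xb0'

# Hypothetical raw bytes as they might arrive from a Latin-1 page.
raw_line = b"16.0\xb0C"
decoded = raw_line.decode("latin-1")
print(decoded)
```

In Python 3 it is usually simpler to decode the page once and work with text from then on, instead of adding the symbol back as bytes.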
EDIT: Wow! I'm a bit slow with the typing. Thanks for the help guys! I'll check them out.
Last edited by Boris Bolgradov (2008-10-09 16:50:57)