python: how do I remove the first and last character from a variable?

tigrmesh · 2008-03-22 18:34:31

I've been visiting the #archlinux irc channel using weechat, and using the python program (script?) urlgrab.py to view any urls that are posted. That all works fine.

However, it doesn't recognize urls that are bracketed by <>, like this:

<http://wiki.archlinux.org/index.php/Xorg>

I located where it finds a url and modified it to recognize these as urls. Here's the code:

def urlGrabCheckMsgline(server, chan, message):
    # Ignore output from 'tinyurl.py'
    if message.startswith("[AKA] http://tinyurl.com"):
        return weechat.PLUGIN_RC_OK
    # Check for URLs
    for word in message.split(" "):
        # Parenthesized so that comments can sit between the conditions
        # (a comment cannot follow a backslash continuation)
        if (word[0:7] == "http://" or
                word[0:8] == "https://" or
                # try this 3/22/08 - phrik places urls within <>
                word[0:8] == "<http://" or
                word[0:9] == "<https://" or
                # end
                word[0:6] == "ftp://"):
            urlGrab.addUrl(word, chan, server)

But when they get passed to firefox they still have the <>. How do I strip those off?

kazuo · 2008-03-22 18:43:53

I don't know whether this is the best method to remove the <>, but to remove the first and last char in a string:

a[1:-1]
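
A minimal sketch of that slice, with a sample value made up for illustration. The slice returns a new string and leaves `a` itself unchanged, since Python strings are immutable.

```python
# Sample value is illustrative; any string works the same way.
a = "<http://wiki.archlinux.org/index.php/Xorg>"

# Slice from index 1 up to (not including) the last character.
stripped = a[1:-1]
print(stripped)

# Slicing never raises on short strings; it just returns an empty string.
short = "<>"[1:-1]
single = "x"[1:-1]
```

Note that the slice drops the first and last character whatever they are, so it is only safe once you know the word really is wrapped in <>.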

DonVla · 2008-03-22 19:00:02

hi,

I would simply use re:

grab_url = re.compile(r'((https?://|ftp://|www\.)[-A-Za-z/.?_=&0-9#]*)', re.I)
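
A rough sketch of how that pattern could be applied to a whole message rather than word by word. The helper name `extract_urls` is hypothetical and not part of urlgrab.py. Because `<` and `>` are not in the character class, the brackets fall outside the match.

```python
import re

# Pattern as posted above.
grab_url = re.compile(r'((https?://|ftp://|www\.)[-A-Za-z/.?_=&0-9#]*)', re.I)

def extract_urls(message):
    """Return every substring of message that the pattern matches (hypothetical helper)."""
    # group(0) is the whole match; findall() would return tuples here
    # because the pattern contains two capturing groups.
    return [m.group(0) for m in grab_url.finditer(message)]

urls = extract_urls("phrik: <http://wiki.archlinux.org/index.php/Xorg> has details")
print(urls)
```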

vlad

arooaroo · 2008-03-23 01:33:49

DonVla's regex does a pretty reasonable job, and you'd do well to get into re syntax. I don't think it's spot on though: for example, it stops matching at a % symbol, so URLs with percent-escapes get cut short. And it doesn't address whether Unicode URLs come into play.
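
To illustrate the % concern: the character class in DonVla's pattern has no `%`, so a match stops at the first percent-escape. The sketch below shows that next to a variant with a few extra characters added to the class. The variant (`grab_url_wider`) and the example.com URL are assumptions for illustration, not something from urlgrab.py, and the variant still does not attempt Unicode URLs.

```python
import re

# DonVla's pattern, unchanged.
grab_url = re.compile(r'((https?://|ftp://|www\.)[-A-Za-z/.?_=&0-9#]*)', re.I)

# Hypothetical wider variant: also allows %, :, ~ and + inside the URL.
grab_url_wider = re.compile(r'((https?://|ftp://|www\.)[-A-Za-z/.?_=&0-9#%:~+]*)', re.I)

sample = "<http://example.com/a%20b?x=1>"

truncated = grab_url.search(sample).group(0)
complete = grab_url_wider.search(sample).group(0)
print(truncated)
print(complete)
```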

If we try to keep to the original design, which simply checks whether the start of the string follows a URL pattern, then we could do something like this:

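import re  # at the top of the script, if not already imported

for word in message.split(" "):
    # Strip one pair of surrounding angle brackets
    if word.startswith('<') and word.endswith('>'):
        word = word[1:-1]
    # Accept only words that start with a known URL scheme
    if re.match(r'(https?|ftp)://', word):
        urlGrab.addUrl(word, chan, server)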
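
For anyone wanting to try the loop outside weechat, here is a self-contained sketch. `find_urls` is a hypothetical stand-in that collects the words into a list instead of calling `urlGrab.addUrl`, so it can run without the plugin.

```python
import re

def find_urls(message):
    """Collect URL-looking words from message (stand-in for urlGrab.addUrl calls)."""
    found = []
    for word in message.split(" "):
        # Drop one pair of surrounding angle brackets, if present.
        if word.startswith('<') and word.endswith('>'):
            word = word[1:-1]
        # re.match only anchors at the start of the word.
        if re.match(r'(https?|ftp)://', word):
            found.append(word)
    return found

result = find_urls("see <http://wiki.archlinux.org/index.php/Xorg> or ftp://example.org/pub")
print(result)
```

One limitation of this sketch: a bracketed URL followed directly by punctuation, such as `<http://...>,`, no longer ends with `>` and so is skipped.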

tigrmesh · 2008-03-23 07:43:39

Thanks, arooaroo. That works perfectly. And so elegant too! I had made such a horrible mess of ifs that you would have barfed. Even I could tell it was ugly code.

kazuo, DonVla - Thank you as well. I learned a lot from trying to implement your ideas.

barebones · 2008-03-28 16:18:04

Why not just replace the unwanted symbols with empty strings?

message = message.replace('<', '')
message = message.replace('>', '')
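
One thing worth checking with this approach is that it removes every `<` and `>` in the message, not only the ones wrapped around a URL. A small sketch with a made-up message, alongside `str.strip`, which only trims the two ends of a single word:

```python
# Made-up message for illustration.
message = "if a < b see <http://wiki.archlinux.org/index.php/Xorg>"

# replace() returns a new string, so the result has to be kept.
cleaned = message.replace('<', '').replace('>', '')
print(cleaned)

# For a single word, strip() removes the brackets only at the two ends.
word = "<http://wiki.archlinux.org/index.php/Xorg>"
unwrapped = word.strip('<>')
print(unwrapped)
```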

Last edited by barebones (2008-03-28 16:18:47)

tigrmesh · 2008-03-28 17:46:14

Good question. I'll check it out.
