
#1 2008-04-23 08:20:40

awagner
Member
From: Mainz, Germany
Registered: 2007-08-24
Posts: 191

[solved] PCRE recursive replacing

Hi all,
I have tried to understand PCRE but am ATM busy untying the knot in my brain. Maybe one of you is proficient enough that he/she can point to a solution easily.

I want to use PHP's preg_replace to replace some bbcode with valid html, but want to handle nested bbcodes.

Example:
This is [ i ]some text in [ i ]italics[ /i ] that has [ i ]italics even [ i ]inside[ /i ] of itself[ /i ][ /i ] so it can be a mess.

(with spaces so it doesn't get formatted by the forum software here.)

What I want my HTML to do is switch between '<span style="font-style: italic;">' and '<span style="font-style: normal;">'. I suppose I can nest those spans in turn, so I don't need a </span> before switching, provided that I emit the closing </span> where the closing bbcode tag is. But with all those recursions and lookaheads, let alone the replacing, the resources I could find got me lost.

Up to now (before me messing with it), the actual code looks like this:

$pattern = array(
    "/\[i\](.*?)\[\/i\]/si",
);
$change = array(
    "<span style=\"font-style: italic;\">$1</span>",
);
$string = preg_replace($pattern, $change, $string);

Any ideas? TIA,
Andreas
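
Editor's note: a minimal sketch of why the lazy pattern above breaks on nested tags. The sample string is made up for illustration. The first [i] gets paired with the nearest [/i], so the inner opening tag and the outer closing tag are left behind as plain text.

```php
<?php
// Sample input with one level of nesting (illustrative, not from the original post).
$string  = 'a [i]b [i]c[/i] d[/i] e';

// Same pattern and replacement as in the post, written with single quotes.
$pattern = '/\[i\](.*?)\[\/i\]/si';
$change  = '<span style="font-style: italic;">$1</span>';

$result = preg_replace($pattern, $change, $string);
echo $result, "\n";
// prints: a <span style="font-style: italic;">b [i]c</span> d[/i] e
```

After the first replacement the scan resumes behind the consumed [/i], where no further [i] exists, so nothing else is matched.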

Last edited by awagner (2008-04-24 21:23:39)


#2 2008-04-24 08:08:27

awagner
Member
From: Mainz, Germany
Registered: 2007-08-24
Posts: 191

Re: [solved] PCRE recursive replacing

I have made some progress, but still not what I'd like to achieve.

I found out that apparently you have to use preg_replace_callback instead of preg_replace, and I now have a separate recursive function that is called from the main text:

/*
*  from http://de.php.net/manual/en/function.preg-replace-callback.php
*/
function bbToHtmlItalics($input)
{
    $regex = '#\[i]((?:[^[]|\[(?!/?i])|(?R))+)\[/i]#';
    if (is_array($input)) {
        $input = '<span style="font-style: italic;">'.$input[1].'</span>';
    }
    return preg_replace_callback($regex, 'bbToHtmlItalics', $input);
}

Now all the tags are translated well ...  but I want to alternate italics and upright when nesting level increases!
Ideas anyone?

Also, I have noticed that the website takes quite a bit longer to display. I suppose this will not improve when I add such a function for every kind of bbcode (bold, underlined, smallcaps, sub- and superscript). Can I do anything to get it more up to speed?

Thanks,
Andreas
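
Editor's note: one possible way to alternate styles by nesting level without a recursive pattern is to visit each [i] and [/i] once and keep a depth counter. This is only a sketch under some assumptions: bbItalicsByDepth is a hypothetical name, odd depths are taken to be italic and even depths normal, stray closing tags are left as text, unclosed tags are closed at the end, and the closure and type declarations need PHP 7 or later.

```php
<?php
/**
 * Replace [i]...[/i] with spans whose style alternates by nesting depth.
 * Hypothetical helper: the name and the odd-depth = italic rule are assumptions.
 */
function bbItalicsByDepth(string $text): string
{
    $depth = 0; // number of [i] tags currently open

    $html = preg_replace_callback(
        '#\[(/?)i\]#i', // matches [i] and [/i], case-insensitively
        function (array $m) use (&$depth): string {
            if ($m[1] === '/') {
                if ($depth === 0) {
                    return $m[0]; // stray closing tag: leave it as text
                }
                $depth--;
                return '</span>';
            }
            $depth++;
            $style = ($depth % 2 === 1) ? 'italic' : 'normal';
            return '<span style="font-style: ' . $style . ';">';
        },
        $text
    );

    // Close any spans whose [/i] is missing.
    return $html . str_repeat('</span>', $depth);
}

echo bbItalicsByDepth('x [i]a [i]b[/i] c[/i] y'), "\n";
// prints: x <span style="font-style: italic;">a <span style="font-style: normal;">b</span> c</span> y
```

Because the pattern only matches the tags themselves, the text is scanned once regardless of how deep the nesting goes, which may also help with the slowdown mentioned above.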


#3 2008-04-24 21:23:22

awagner
Member
From: Mainz, Germany
Registered: 2007-08-24
Posts: 191

Re: [solved] PCRE recursive replacing

soooo. I have something. It slows display down a bit but seems to work...


class MYCLASS 
{ 
  var $currentlyItalics; 
  function bbToHtmlItalics($input) 
  { 
    $regex = '#\[i]((?:[^[]|\[(?!/?i])|(?R))+)\[/i]#'; 
    if (is_array($input)) { 
      if($this->currentlyItalics) { 
        $input = '<span style="font-style: italic;">'.$input[1].'</span>'; 
      } else { 
        $input = '<span style="font-style:normal;">'.$input[1].'</span>'; 
      } 
      if(preg_match($regex, $input)) $this->currentlyItalics = !($this->currentlyItalics); 
    } 
    return preg_replace_callback($regex, array(&$this, 'bbToHtmlItalics'), $input); 
  }

  [...]

  $data = $this->bbToHtmlItalics($data);
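
Editor's note: regarding the slowdown and the plan to add one function per bbcode, a rough sketch of handling several simple tags in one pass is given here. Everything in it is an assumption for illustration: the SIMPLE_TAGS table, the chosen HTML, the function name bbSimpleTags, and the rule that a closing tag is only honoured when it matches the innermost open tag. It needs PHP 7 or later.

```php
<?php
// Hypothetical tag table; tag names and HTML output are illustrative choices.
const SIMPLE_TAGS = [
    'b'   => ['<strong>', '</strong>'],
    'u'   => ['<span style="text-decoration: underline;">', '</span>'],
    'sub' => ['<sub>', '</sub>'],
    'sup' => ['<sup>', '</sup>'],
];

function bbSimpleTags(string $text): string
{
    $stack = []; // names of currently open tags, innermost last

    $html = preg_replace_callback(
        '#\[(/?)(b|u|sub|sup)\]#i',
        function (array $m) use (&$stack): string {
            $tag = strtolower($m[2]);
            if ($m[1] === '') {
                $stack[] = $tag; // opening tag: remember it
                return SIMPLE_TAGS[$tag][0];
            }
            // Closing tag: accept it only if it closes the innermost open tag.
            if (end($stack) !== $tag) {
                return $m[0];
            }
            array_pop($stack);
            return SIMPLE_TAGS[$tag][1];
        },
        $text
    );

    // Close whatever is still open, innermost first.
    while ($stack) {
        $html .= SIMPLE_TAGS[array_pop($stack)][1];
    }
    return $html;
}

echo bbSimpleTags('[b]x [sub]2[/sub][/b]'), "\n";
// prints: <strong>x <sub>2</sub></strong>
```

The stack is one design choice among several; it keeps the generated HTML properly nested at the cost of leaving mismatched closing tags visible in the output.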


#4 2008-04-25 22:45:03

kumico
Member
Registered: 2007-09-28
Posts: 224
Website

Re: [solved] PCRE recursive replacing

im pretty sure that regex pattern can be improved, but i don't know enough to help;
but you did mention that it's slow, so i thought maybe regex wasn't necessary since you're only matching and

<?php

$str= 'This is [i]some text in [i]italics[/i] that has [i]italics even [i]inside[/i] of itself[/i][/i] so it can be a mess.';

function clock($t)
{
    return sprintf('%f', round(microtime(TRUE) - $t, 6));
}

echo "$str<br /><br />\n\n";
class MYCLASS
{ 
  var $currentlyItalics; 
  function bbToHtmlItalics($input) 
  { 
    $regex = '#\[i]((?:[^[]|\[(?!/?i])|(?R))+)\[/i]#'; 
    if (is_array($input)) { 
      if($this->currentlyItalics) { 
        $input = '<span style="font-style: italic;">'.$input[1].'</span>'; 
      } else { 
        $input = '<span style="font-style:normal;">'.$input[1].'</span>'; 
      } 
      if(preg_match($regex, $input)) $this->currentlyItalics = !($this->currentlyItalics); 
    } 
    return preg_replace_callback($regex, array(&$this, 'bbToHtmlItalics'), $input); 
  }
}




function altbb($str)
{
    $alt = FALSE;
    $out = '';
    $tks = explode('[i]', $str);
    $end = array_pop($tks);
    foreach ($tks as $tok) {
        if ($alt) {
            $sub = 'italic';
        } else {
            $sub = 'normal';
        }
        $out .= "$tok<span style='font-style:$sub;'>";
        $alt = !$alt;
    }
    
    return str_ireplace('[/i]', '</span>', $out.$end);
}


$bb = new MYCLASS();
$t = microtime(TRUE);
echo $bb->bbToHtmlItalics($str).clock($t);
echo "<br />";
$t = microtime(TRUE);
echo altbb($str).clock($t);

?>

i just posted everything i used,
as you will see, the string method is a lot faster than the regex one and i think a lot simpler also

i haven't done much testing or anything, just implemented a thought. hope it helps
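
Editor's note: a single timed call can be noisy, so a comparison like the one above is usually more convincing when averaged over many runs. The sketch below shows one way to do that. bench() is a hypothetical helper, and the two closures only compare str_ireplace with an equivalent preg_replace on the closing tag as a stand-in; altbb() and a fresh MYCLASS instance could be passed in the same way.

```php
<?php
/**
 * Average wall-clock seconds per call of $fn over $iterations runs.
 * bench() is a hypothetical helper, not part of the code posted above.
 */
function bench(callable $fn, int $iterations = 10000): float
{
    $fn(); // warm-up call, not timed
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return (microtime(true) - $start) / $iterations;
}

$str = 'This is [i]some text in [i]italics[/i] that has [i]italics even [i]inside[/i] of itself[/i][/i] so it can be a mess.';

// Stand-in workloads: plain string replacement versus a regex doing the same job.
$stringTime = bench(function () use ($str) {
    return str_ireplace('[/i]', '</span>', $str);
});
$regexTime = bench(function () use ($str) {
    return preg_replace('#\[/i\]#i', '</span>', $str);
});

printf("str_ireplace: %.9f s, preg_replace: %.9f s\n", $stringTime, $regexTime);
```

The absolute numbers depend on the machine and PHP version, so only the ratio between the two is worth reading.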
