Goddamn fucking regexes


dv8ed
19-03-2005, 04:32:49
OK... does anybody know how to run fucking regular expressions in PHP? I have a string full of HTML with a bunch of nested tables inside, and all I want are the tables. So basically, the format is:

junk
<table>good stuff<table>good stuff</table></table>
junk

So I know I need to match everything before the first <table in the string and everything after the last </table>, and replace them both with an empty string. Right? I can't just do a simple match because the tables are nested. Except I've never quite figured regexes out, especially in PHP, so I can't write the damn search string. Especially not at 11:30 on a Friday night.

Can anybody help?
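
For what it's worth, the trim-from-both-ends idea doesn't strictly need a regex at all. Here is a minimal sketch using plain string functions; trim_to_tables() is a made-up helper name, and it assumes the string really does contain at least one complete table:

```php
<?php
// Hypothetical helper: keep everything from the first "<table"
// through the last "</table>" and throw the rest away.
function trim_to_tables($html)
{
    $start = stripos($html, '<table');     // first opening tag, case-insensitive
    $end   = strripos($html, '</table>');  // last closing tag, case-insensitive
    if ($start === false || $end === false || $end < $start) {
        return null; // no complete table found
    }
    // Length runs from $start to the end of the last closing tag.
    return substr($html, $start, $end + strlen('</table>') - $start);
}

$haystack = "junk\n<table>good stuff<table>good stuff</table></table>\njunk";
echo trim_to_tables($haystack), "\n";
```

Because it only looks for the first opening tag and the last closing tag, the nesting in between never matters.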

Qaj the Fuzzy Love Worm
19-03-2005, 04:49:09
Set up a function that gets called recursively. And for God's sake, learn to use Google :)

http://webreference.com/programming/php/regexps/

dv8ed
19-03-2005, 05:00:03
I think I've been through that exact page before...obviously, I'm stupid.

Won't it break if it's recursive, though, since the first </table> it runs into won't match up with the first <table> it runs into? I guess I don't see how to tell it to search for the last </table> in the string.

Edit: Duh! Don't drink and program! But it still seems like it would be easier to just trim from both ends than to recursively search the string...

Sir Penguin
19-03-2005, 06:12:54
Why won't "<table>.*</table>" work?

SP

Qaj the Fuzzy Love Worm
19-03-2005, 06:35:49
Because he presumably wants to strip the inner table gumf out as well, and he didn't specify how many levels of nested tables there were?

So from what I can gather, with the example

blahblah<table>goodstuff1<table>goodstuff2<table>goodstuff3</table>goodstuff2</table>goodstuff1</table>blahblah, he'd want one of the following results:

(1)
goodstuff1goodstuff2goodstuff3goodstuff2goodstuff1

in which case a single regexp for the first and last <table></table> tags would work, followed by a simple string replace on all remaining tags, or

(2)
a set of responses, thus:

<table>goodstuff1<table>goodstuff2<table>goodstuff3</table>goodstuff2</table>goodstuff1</table>

<table>goodstuff2<table>goodstuff3</table>goodstuff2</table>

<table>goodstuff3</table>

goodstuff3


It's unclear which result he's going for, though.
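
Both readings can be sketched in a few lines of PHP. This is only an illustration: table_text() and nested_tables() are made-up names, and the second one assumes the tables form a single chain of nesting, as in the example, with no sibling tables at the same level.

```php
<?php
// The example string from above.
$example = 'blahblah<table>goodstuff1<table>goodstuff2<table>goodstuff3</table>'
         . 'goodstuff2</table>goodstuff1</table>blahblah';

// Reading (1): grab everything from the first <table to the last </table>,
// then strip every remaining table tag with a simple replace.
function table_text($html)
{
    if (!preg_match('/<table[\s\S]*<\/table>/i', $html, $m)) {
        return '';
    }
    return preg_replace('/<\/?table[^>]*>/i', '', $m[0]);
}

// Reading (2): collect each nesting level, outermost first.
// Assumes one table per level (no siblings); the greedy match
// always pairs the first opening tag with the last closing tag.
function nested_tables($html)
{
    $levels = array();
    while (preg_match('/<table[^>]*>([\s\S]*)<\/table>/i', $html, $m)) {
        $levels[] = $m[0]; // the whole table at this level
        $html = $m[1];     // descend into its contents
    }
    return $levels;
}

echo table_text($example), "\n";
print_r(nested_tables($example));
```

With sibling tables at the same level, the loop in nested_tables() would lump them together, so that case would need a real parser or a properly recursive pattern.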

dv8ed
19-03-2005, 06:51:58
This worked:
preg_match("/<table[\s\S]*<\/table>/i", $haystack, $out);

Which is what I started with... but I'm using a damn XMLHttpRequest library that's apparently adding extra characters to whatever I get back from it and causing JS errors. So I had it right the first time, I just thought I didn't because I never get regexes right the first time.
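
As a rough illustration of why that pattern behaves the way it does: [\s\S] matches any character including newlines, and the greedy * runs all the way to the last </table>. A plain "." would stop at line breaks unless the s modifier is added. The sample $haystack below is made up for the demonstration:

```php
<?php
// Made-up multi-line input: junk around a nested table.
$haystack = "junk\n<table>good stuff\n<table>good stuff</table>\n</table>\njunk";

// The pattern from the post: [\s\S] crosses newlines, and the greedy *
// extends the match to the LAST </table> in the string.
preg_match('/<table[\s\S]*<\/table>/i', $haystack, $out);

// Same idea written with "." plus the s (dot-all) modifier.
preg_match('/<table.*<\/table>/is', $haystack, $outDotAll);

// Without the s modifier, "." stops at newlines, so only a table that
// opens and closes on one line is found (here, the inner one).
preg_match('/<table.*<\/table>/i', $haystack, $outPlain);

var_dump($out[0] === $outDotAll[0]);
var_dump($outPlain[0]);
```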