Xpath or Regexpr for HTML

I have an HTML page, and I want to select the data that is between some

function. But since it isn’t a closed tag, I have no idea how to select it.
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
     ... what it is about
  </head>
  <body onload="moveToCurrentItem()">
    <div>
      <table width='866' cellpadding='0' cellspacing='0px'>
          .... stuff I am interested in.....
      </table> 
     </div>
   </body>
</html>

I tried XPath (XML) but couldn't get it to work, and had no luck with a regular expression (String) either; I'm not even sure which node to use. Any suggestions would be much appreciated :)

Try

.*? <table ( .*? ) </table > .*?

or

.*?<table ( .*? ) > .*? </table > .*?

as the regular expression.
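
For anyone who wants to try the same idea outside the workflow tool, here is a minimal Python sketch. The sample page is made up to resemble the one in the question, and `table_contents` is just a hypothetical helper name. It assumes the tables are not nested; a non-greedy match stops at the first closing tag, so nested tables would be cut short.

```python
import re

# Hypothetical sample, modelled on the page in the question.
html = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body onload="moveToCurrentItem()">
    <div>
      <table width='866' cellpadding='0' cellspacing='0px'>
          <tr><td>stuff I am interested in</td></tr>
      </table>
    </div>
  </body>
</html>"""

# re.DOTALL lets "." match newlines as well; the non-greedy "(.*?)"
# stops at the first closing </table> tag it meets.
TABLE_RE = re.compile(r"<table\b[^>]*>(.*?)</table\s*>", re.DOTALL | re.IGNORECASE)


def table_contents(page):
    """Return the inner markup of every (non-nested) <table> in page."""
    return [inner.strip() for inner in TABLE_RE.findall(page)]


print(table_contents(html))
```

This is fine for a quick extraction from a page whose layout you know, but it is not a general HTML parser.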

Thanks, that worked. What did I miss here? Was it the spaces needed between the brackets… ah well… happy now :)

Or use XSLT:

<xsl:template match="table">
<xsl:copy-of select="*" />
</xsl:template>
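
The same idea (grab the child elements of each table) can be sketched in Python too. The standard library has no XSLT processor, so this uses `xml.etree.ElementTree` instead, which is a swapped-in technique rather than XSLT itself. It assumes the page is already well-formed XHTML; the sample document and the helper name `table_children` are made up for illustration. Note that the elements live in the XHTML namespace declared on the `html` tag, so the tag name has to be qualified.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical well-formed sample; real pages may need cleaning up first.
page = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div>
      <table width="866">
        <tr><td>a</td></tr>
        <tr><td>b</td></tr>
      </table>
    </div>
  </body>
</html>"""


def table_children(document):
    """Return the direct child elements of every table in the document."""
    root = ET.fromstring(document)
    children = []
    # ElementTree spells namespaced tags as "{namespace-uri}localname".
    for table in root.iter(f"{{{XHTML_NS}}}table"):
        children.extend(list(table))
    return children


for child in table_children(page):
    # Print each child's qualified tag and its concatenated text.
    print(child.tag, "".join(child.itertext()).strip())
```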

Uh, so how should I get the contents of a div tag with a specific class? E.g. this page: http://decodeunicode.org/u+4DFF has the line

<div class="title">U+4DFF HEXAGRAM FOR BEFORE COMPLETION</div>

So how do I use XPath/XQuery/XSLT to extract the text contents of this tag?

http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

As said on IRC, convert the input to proper XML and then run an XPath query:

//div[@class="myclassname"]
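
A minimal sketch of that approach in Python, assuming the input has already been converted to well-formed XHTML (that conversion step is not shown). The attribute test goes in square brackets after the element name. The sample document, the `x` prefix and the helper name `div_text` are made up for illustration; `xml.etree.ElementTree` only supports a subset of XPath, but attribute predicates like this one are covered.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed sample containing the line from the question.
xhtml = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div class="title">U+4DFF HEXAGRAM FOR BEFORE COMPLETION</div>
    <div class="other">something else</div>
  </body>
</html>"""

# The prefix "x" is arbitrary; it only has to map to the XHTML namespace.
NS = {"x": "http://www.w3.org/1999/xhtml"}


def div_text(document, class_name):
    """Return the text of every div whose class attribute equals class_name."""
    root = ET.fromstring(document)
    matches = root.findall(f".//x:div[@class='{class_name}']", NS)
    # itertext() also collects text inside nested elements, if there are any.
    return ["".join(div.itertext()).strip() for div in matches]


print(div_text(xhtml, "title"))
```

The predicate compares the whole attribute value, so a div with `class="title big"` would not match `title` here.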