Parsing Markup Languages
About parsing markup languages.
Markup language parser class
Background of the class
When an XSLT processor performs a transformation and writes X/HTML with indentation enabled in the output method, the specification does not require even the HTML output method to distinguish block elements from inline elements. Likewise, when the output is [XHTML10], specifying the media type text/html does not make the processor close empty-element tags in the way the HTML compatibility guidelines recommend. I therefore wrote this class to parse the source first, so that the transformation result can then be reformatted.
Unlike the XML parser functions in PHP's extension library, the ML Parser I wrote handles both SGML and XML structured documents. It can therefore parse SGML-based HTML as well as XML-based XHTML.
MLParser class
public class MLParser extends Object
Structure
- PHP 5
markup/MLParser.php
Constructor
public void MLParser([String $encoding]) - Parses a markup language.
- Arguments
String $encoding - Name of the output encoding (optional). Only character encodings supported by the mbstring module can be specified.
Methods
public void setSAXHandler(Object $handler) - Sets the methods of a document-parsing handler.
- Arguments
Object $handler - Object-oriented handler
public void setDocumentHandler(callback $startHandler, callback $endHandler) - Sets the document handlers.
- Arguments
callback $startHandler - Start handler function
callback $endHandler - End handler function
public void setDTDHandler(callback $handler) - Sets the handler for the document type declaration (DTD).
- Arguments
callback $handler - Handler function
public void setPIHandler(callback $handler) - Sets the handler for processing instructions (PIs).
- Arguments
callback $handler - Handler function
public void setCommentHandler(callback $handler) - Sets the handler for comments.
- Arguments
callback $handler - Handler function
public void setElementHandler(callback $startHandler, callback $endHandler) - Sets the handlers for element start tags and end tags.
- Arguments
callback $startHandler - Start handler function
callback $endHandler - End handler function
public void setCharactersHandler(callback $handler) - Sets the handler for character data.
- Arguments
callback $handler - Handler function
public void setCDATASectionHandler(callback $handler) - Sets the handler for CDATA sections.
- Arguments
callback $handler - Handler function
public void setMarkedSectionHandler(callback $handler) - Sets the handler for SGML marked sections.
- Arguments
callback $handler - Handler function
public void setDefaultHandler(callback $handler) - Sets the default handler.
- Arguments
callback $handler - Handler function
public void addPreserveSpace(String $name [, ...]) - Adds elements whose whitespace is preserved (any number may be given). If none are set, SGML preserves whitespace in no element, while XML interprets the xml:space attribute (whitespace is preserved when the attribute is undeclared or its value is 'preserve', and not preserved when its value is 'default').
- Arguments
String $name - Element name
public void addEmptyElements(String $name [, ...]) - Adds names of SGML empty elements (any number may be given).
- Arguments
String $name - Element name
public Integer getCurrentLine() - Gets the current line number.
- Return value
- Line number
public Integer getCurrentPosition() - Gets the current character position.
- Return value
- Character position
public String getNamespaceToElement(String $qname) - Gets the namespace of an element. (This method is for XML. It works only for XML documents that use namespaces.)
- Arguments
String $qname - QName
- Return value
- URI
public String getNamespaceToAttribute(String $qname) - Gets the namespace of an attribute. (This method is for XML. It works only for XML documents that use namespaces.)
- Arguments
String $qname - QName
- Return value
- URI
public void parse(String $source [, Integer $markup]) - Starts processing the document.
- Arguments
String $source - Document data to process
Integer $markup - Markup language of the document data [MLPARSER_PARSE_SGML|MLPARSER_PARSE_XML]
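The handler setters above take arguments of PHP's `callback` pseudo-type. In PHP that type covers both a function name given as a string and an `array($object, 'methodName')` pair. Whether MLParser accepts the array form depends on how it dispatches internally, which this page does not state, so treat that as an assumption. The sketch below does not load MLParser at all. It only shows the two callback forms being invoked through `call_user_func`. The names `onStartElement` and `ElementLogger` are hypothetical.

```php
<?php
// Hypothetical start-tag handler written as a plain function.
function onStartElement($name, $attrs) {
    return "function:" . $name;
}

// Hypothetical start-tag handler written as an object method.
class ElementLogger {
    public $seen = array();

    public function onStartElement($name, $attrs) {
        $this->seen[] = $name; // remember every element name we were given
        return "method:" . $name;
    }
}

$logger = new ElementLogger();

// Form 1: function name as a string.
$asFunction = 'onStartElement';
// Form 2: array(object, method name).
$asMethod = array($logger, 'onStartElement');

// call_user_func() invokes either form the same way.
$r1 = call_user_func($asFunction, 'html', array());
$r2 = call_user_func($asMethod, 'body', array('class' => 'x'));
```

If MLParser does accept the second form, a method-based handler could presumably be registered with something like `$parser->setElementHandler(array($logger, 'onStartElement'), ...)`. That call is an untested assumption.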
Class usage example
The following example is a sample that uses the MLParser class.
<?php
ini_set("include_path", "./");
// require_once("org/purl/net/osamurai/markup/MLParser.php"); // for PHP 4
require_once("markup/MLParser.php"); // for PHP 5
$filename = "data.xml";
$level = 0;
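// Start-tag handler: print the element name indented by its nesting depth.
function startElement($name, $attrs) {
    global $level;
    for ($i=0; $i<$level; $i++) {
        print " ";
    }
    print "{$name}\n";
    $level++;
}
// End-tag handler: step back out one nesting level.
function endElement($name) {
    global $level;
    $level--;
}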
$handle = fopen($filename, "r");
$contents = fread($handle, filesize($filename));
fclose($handle);
$parser = new MLParser;
$parser->setElementHandler('startElement', 'endElement');
$parser->parse($contents, MLPARSER_PARSE_XML);
unset($parser);
?>
The example above parses an XML document and prints its element structure with indentation. To do the same for a document marked up with an SGML-related standard such as HTML, change (and add to) part of the code as follows.
...
$parser = new MLParser;
$parser->setElementHandler('startElement', 'endElement');
// HTML 4.01 Strict DTD
$parser->addPreserveSpace("pre");
$parser->addEmptyElements("area", "base", "br", "col", "hr",
"img", "input", "link", "meta", "param");
// HTML 4.01 Transitional DTD
//$parser->addEmptyElements("basefont", "isindex");
// HTML 4.01 Frameset DTD
//$parser->addEmptyElements("frame");
$parser->parse($contents, MLPARSER_PARSE_SGML);
...
Notes
- 以åã®PHP4ç¨ã®ã½ã¼ã¹ã³ã¼ãã§ã¯ãå¦çæã®å
é¨è¡¨ç¾ã¯ã常ã«å®æ°'
MLPARSER_INTERNAL_ENCODING'ã®æåã¨ã³ã³ã¼ãã£ã³ã°åã§ã¨ã³ã³ã¼ãããã¾ããå¾ç¶çã®PHP5ç¨ã§ã¯ãã½ã¼ã¹ã¨ã³ã³ã¼ãã£ã³ã°ï¼åºåï¼åã³ã¿ã¼ã²ããã¨ã³ã³ã¼ãã£ã³ã°ï¼å ¥åï¼ã®éã«ã¨ã³ã³ã¼ãã£ã³ã°ãè¡ãä»çµã¿ã«æ¹åã»ä¿®æ£ãã¾ããã - æ§æè§£æã®éã«DTDã®è§£æã¯è¡ãã¾ãããã¾ãã[XML11] ã¯èæ ®ãã¦ãã¾ããã
- SGMLã®æ§æè§£æã§ã¯ãè¦ç´ ã®çç¥åãã¼ã¯ä»ãï¼ç縮ã¿ã°æ©æ§ï¼ã«ã¯å¯¾å¿ãã¦ãã¾ãããä»ã«ããå¦çå½ä»¤
<?experiment> ... <?/experiment>ã®æ¸å¼ã«ã対å¿ãã¦ãã¾ããã - ãå©ç¨ã®éã®æ³¨æäºé ãªã©ãã覧ä¸ããã
SAXHandler interface
public interface SAXHandler extends Object
Structure
Methods
public void startDocument() - Document handler (start)
public void endDocument() - Document handler (end)
public void documentTypeDefinition(String $doctypedecl) - Document type declaration handler
- Arguments
String $doctypedecl - Document type declaration
public void processingInstruction(String $target, String $data) - Processing instruction handler
- Arguments
String $target - PI target
String $data - PI content
public void comment(String $data) - Comment handler
- Arguments
String $data - Comment content
public void startElement(String $name, Array $atts) - Element handler (start tag)
- Arguments
String $name - Element name
Array $atts - Attribute specifications [array([attribute name => attribute value [, ...]])]
public void endElement(String $name) - Element handler (end tag)
- Arguments
String $name - Element name
public void characters(String $data) - Character string handler
- Arguments
String $data - Character string
public void cdataSection(String $cdsect) - CDATA section handler
- Arguments
String $cdsect - CDATA section
public void markedSection(String $markedsect) - Marked section handler
- Arguments
String $markedsect - Marked section
public void defaultHandler(String $data) - Default handler
- Arguments
String $data - Character string
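As a rough illustration of what an object-oriented handler for the methods listed above might look like, here is a minimal sketch. `OutlineHandler` and `getOutline()` are hypothetical names, not part of the library. The class mirrors only a subset of the SAXHandler method names and deliberately does not declare `implements SAXHandler`, so that it runs without the library files. With the real parser it would presumably need to implement every interface method and be registered through `setSAXHandler()`. The events are fired by hand here to stand in for the parser.

```php
<?php
// Hypothetical handler that records element names, indented by depth.
class OutlineHandler {
    private $level = 0;        // current nesting depth
    private $lines = array();  // collected outline lines

    public function startDocument() {
        $this->level = 0;
        $this->lines = array();
    }

    public function endDocument() {
        // nothing to clean up in this sketch
    }

    public function startElement($name, $atts) {
        // two spaces of indentation per nesting level
        $this->lines[] = str_repeat('  ', $this->level) . $name;
        $this->level++;
    }

    public function endElement($name) {
        $this->level--;
    }

    public function characters($data) {
        // character data is ignored in this sketch
    }

    public function getOutline() {
        return implode("\n", $this->lines);
    }
}

// Simulate the event sequence for <html><head/><body id="top"/></html>.
$h = new OutlineHandler();
$h->startDocument();
$h->startElement('html', array());
$h->startElement('head', array());
$h->endElement('head');
$h->startElement('body', array('id' => 'top'));
$h->endElement('body');
$h->endElement('html');
$h->endDocument();

// $outline holds three lines: "html", then "head" and "body" indented once.
$outline = $h->getOutline();
```

Keeping the depth counter inside the object avoids the `global $level` variable that a function-based handler needs.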
Sample
Overview
This is a sample handler that implements the SAXHandler interface, which the markup language parser class calls while processing. For now, as a sample of an object-oriented handler, I am publishing the handler this site uses for source formatting (line breaks and indentation).
Incidentally, line breaks and indentation of this site's source (the output of the XSLT transformation) are disabled by default to shorten processing time. If you would like to read tidier source, enable the indent feature in the settings (or append the query indent=on to the URI). Note that when the output is XHTML, the content model is handled differently depending on the media type.
Structure
Handler usage example
The following example is a sample that uses the source formatting (line breaks and indentation) handler.
<?php
ini_set("include_path", "./");
/*
// for PHP 4
require_once("org/purl/net/osamurai/markup/MLParser.php");
require_once("org/purl/net/osamurai/markup/IndentHandler.php");
*/
// for PHP 5
require_once("markup/MLParser.php");
require_once("markup/IndentHandler.php");
$filename = "data.xml";
$handle = fopen($filename, "r");
$contents = fread($handle, filesize($filename));
fclose($handle);
$parser = new MLParser;
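// Objects are passed by handle in PHP 5; call-time pass-by-reference (&$parser) is a fatal error since PHP 5.4.
$handler = new IndentHandler($parser);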
$handler->setMinimize(INDENTHANDLER_MINIMIZE_EMPTY_ELEMENTS);
$parser->setSAXHandler($handler);
$parser->parse($contents, MLPARSER_PARSE_XML);
unset($parser, $handler);
?>
Notes
- For the methods and other features available in the sample, see the comments in the file.