CVII. XML parser functions
Introduction
About XML
XML (eXtensible Markup Language) is a structured data format for document interchange on the Web. It is a standard defined by the World Wide Web Consortium (W3C), which publishes information about XML and related technologies.
Installation
This extension uses expat. The Makefile that comes with expat does not build a library by default; you can add the following make rule to the Makefile to build one:
libexpat.a: $(OBJS)
	ar -rc $@ $(OBJS)
	ranlib $@
An RPM package of expat is also available for download.
Note that if you are using Apache 1.3.7 or later, you already have the required expat library. Simply configure PHP using --with-xml (without any additional path) and it will automatically use the expat library built into Apache.
On UNIX, run configure with the --with-xml option. The expat library should be installed somewhere your compiler can find it. If you compile PHP as a module for Apache 1.3.9 or later, PHP will automatically use the expat library bundled with Apache. If expat is installed somewhere your compiler cannot find it, you may need to set CPPFLAGS and LDFLAGS in your environment before running configure.
Build and install PHP. Tada! That should be it.
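Once PHP is built, one quick way to confirm that the extension was compiled in is to probe for it from a script. This is only a minimal sketch: xmlSupportAvailable() is a hypothetical helper name, not something the extension provides.

```php
<?php
// Hypothetical helper: reports whether the XML parser functions
// are available in the running PHP build.
function xmlSupportAvailable()
{
    return extension_loaded('xml') && function_exists('xml_parser_create');
}

print xmlSupportAvailable()
    ? "XML parser functions are available\n"
    : "PHP appears to have been built without --with-xml\n";
```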
About This Extension
This PHP extension implements support for James Clark's expat. It lets you parse XML documents, but not validate them. It supports three source character encodings that PHP also provides: US-ASCII, ISO-8859-1 and UTF-8. UTF-16 is not supported.
This extension lets you create XML parsers and then define handlers for different XML events. Each XML parser also has a few parameters you can adjust.
The XML event handlers defined are:
Table 1. Supported XML handlers
Case Folding
The element handler functions may get their element names
case-folded. Case-folding is defined by
the XML standard as "a process applied to a sequence of
characters, in which those identified as non-uppercase are
replaced by their uppercase equivalents". In other words, when
it comes to XML, case-folding simply means uppercasing.
By default, all the element names that are passed to the handler
functions are case-folded. This behaviour can be queried and
controlled per XML parser with the
xml_parser_get_option() and
xml_parser_set_option() functions,
respectively.
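The effect of case folding can be seen by parsing the same document twice. The sketch below is written for current PHP versions (it uses closures, which the PHP 4 era examples on this page do not), and elementNames() is a hypothetical helper introduced only for illustration.

```php
<?php
// Hypothetical helper: returns the element names exactly as the start
// handler receives them, with case folding switched on or off.
function elementNames($xml, $caseFolding)
{
    $names = array();
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $caseFolding);
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attrs) use (&$names) {
            $names[] = $name;
        },
        function ($parser, $name) {
            // end tags are not needed for this illustration
        }
    );
    xml_parse($parser, $xml, true);
    return $names;
}

// Query the default setting of a freshly created parser.
$probe = xml_parser_create();
print xml_parser_get_option($probe, XML_OPTION_CASE_FOLDING)
    ? "case folding is on by default\n"
    : "case folding is off by default\n";

$doc = '<Doc><item/></Doc>';
print implode(' ', elementNames($doc, true)) . "\n";   // folded names
print implode(' ', elementNames($doc, false)) . "\n";  // names as written
```

With folding disabled, handlers see the names exactly as they appear in the document, which matters when a lookup table is keyed by element name.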
Error Codes
The following constants are defined for XML error codes (as
returned by xml_get_error_code()):
XML_ERROR_NONE
XML_ERROR_NO_MEMORY
XML_ERROR_SYNTAX
XML_ERROR_NO_ELEMENTS
XML_ERROR_INVALID_TOKEN
XML_ERROR_UNCLOSED_TOKEN
XML_ERROR_PARTIAL_CHAR
XML_ERROR_TAG_MISMATCH
XML_ERROR_DUPLICATE_ATTRIBUTE
XML_ERROR_JUNK_AFTER_DOC_ELEMENT
XML_ERROR_PARAM_ENTITY_REF
XML_ERROR_UNDEFINED_ENTITY
XML_ERROR_RECURSIVE_ENTITY_REF
XML_ERROR_ASYNC_ENTITY
XML_ERROR_BAD_CHAR_REF
XML_ERROR_BINARY_ENTITY_REF
XML_ERROR_ATTRIBUTE_EXTERNAL_ENTITY_REF
XML_ERROR_MISPLACED_XML_PI
XML_ERROR_UNKNOWN_ENCODING
XML_ERROR_INCORRECT_ENCODING
XML_ERROR_UNCLOSED_CDATA_SECTION
XML_ERROR_EXTERNAL_ENTITY_HANDLING
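A typical use of these codes is to report a failed parse without aborting the script. The following is a sketch under assumptions: checkWellFormed() and the shape of its return array are hypothetical, and the exact code and message reported for a given mistake may depend on the underlying parser library.

```php
<?php
// Hypothetical helper: parses a complete document held in a string and
// reports the outcome instead of calling die().
function checkWellFormed($xml)
{
    $parser = xml_parser_create();
    if (xml_parse($parser, $xml, true)) {
        return array('ok' => true, 'code' => XML_ERROR_NONE,
                     'message' => '', 'line' => 0);
    }
    $code = xml_get_error_code($parser);
    return array(
        'ok'      => false,
        'code'    => $code,
        'message' => xml_error_string($code),
        'line'    => xml_get_current_line_number($parser),
    );
}

// The <b> element is never closed, so the document is not well-formed.
$result = checkWellFormed("<a>\n<b></a>");
if (!$result['ok']) {
    printf("XML error %d (%s) at line %d\n",
        $result['code'], $result['message'], $result['line']);
}
```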
Character Encoding
PHP's XML extension supports the Unicode character set through
different character encodings. There are
two types of character encodings, source
encoding and target encoding.
PHP's internal representation of the document is always encoded
with UTF-8.
Source encoding is done when an XML document is parsed. Upon creating an XML
parser, a source encoding can be specified (this encoding
can not be changed later in the XML parser's lifetime). The
supported source encodings are ISO-8859-1,
US-ASCII and UTF-8. The
former two are single-byte encodings, which means that each
character is represented by a single byte.
UTF-8 can encode characters composed by a
variable number of bits (up to 21) in one to four bytes. The
default source encoding used by PHP is
ISO-8859-1.
Target encoding is done when PHP passes data to XML handler
functions. When an XML parser is created, the target encoding
is set to the same as the source encoding, but this may be
changed at any point. The target encoding will affect character
data as well as tag names and processing instruction targets.
If the XML parser encounters characters outside the range that
its source encoding is capable of representing, it will return
an error.
If PHP encounters characters in the parsed XML document that can
not be represented in the chosen target encoding, the problem
characters will be "demoted". Currently, this means that such
characters are replaced by a question mark.
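The "demotion" described above can be observed by parsing UTF-8 input with an ISO-8859-1 target encoding. This is a minimal sketch for current PHP versions; textInTargetEncoding() is a hypothetical helper, and the sample string is made up for the illustration.

```php
<?php
// Hypothetical helper: parses UTF-8 input and returns the character data
// as delivered to the handler in the requested target encoding.
function textInTargetEncoding($xml, $targetEncoding)
{
    $text = '';
    $parser = xml_parser_create('UTF-8');   // source encoding
    xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, $targetEncoding);
    xml_set_character_data_handler(
        $parser,
        function ($parser, $data) use (&$text) {
            $text .= $data;   // the handler may fire more than once
        }
    );
    xml_parse($parser, $xml, true);
    return $text;
}

// U+00E9 (e acute) exists in ISO-8859-1; the euro sign (U+20AC) does not.
$doc = "<p>caf\xC3\xA9 \xE2\x82\xAC</p>";
print bin2hex(textInTargetEncoding($doc, 'ISO-8859-1')) . "\n";
print bin2hex(textInTargetEncoding($doc, 'UTF-8')) . "\n";
```

The hex dump makes the difference visible regardless of the terminal's own encoding: the accented letter becomes a single byte, while the euro sign is replaced by a question mark.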
Some Examples
Here are some example PHP scripts parsing XML documents.
XML Element Structure Example
This first example displays the structure of the start elements in
a document with indentation.
Example 1. Show XML Element Structure
$file = "data.xml";
$depth = array();

function startElement($parser, $name, $attrs) {
    global $depth;
    for ($i = 0; $i < $depth[$parser]; $i++) {
        print "  ";
    }
    print "$name\n";
    $depth[$parser]++;
}

function endElement($parser, $name) {
    global $depth;
    $depth[$parser]--;
}

$xml_parser = xml_parser_create();
xml_set_element_handler($xml_parser, "startElement", "endElement");
if (!($fp = fopen($file, "r"))) {
    die("could not open XML input");
}
while ($data = fread($fp, 4096)) {
    if (!xml_parse($xml_parser, $data, feof($fp))) {
        die(sprintf("XML error: %s at line %d",
                    xml_error_string(xml_get_error_code($xml_parser)),
                    xml_get_current_line_number($xml_parser)));
    }
}
xml_parser_free($xml_parser);
XML Tag Mapping Example
Example 2. Map XML to HTML
This example maps tags in an XML document directly to HTML
tags. Elements not found in the "map array" are ignored. Of
course, this example will only work with a specific XML
document type.
$file = "data.xml";
$map_array = array(
    "BOLD"     => "B",
    "EMPHASIS" => "I",
    "LITERAL"  => "TT"
);

function startElement($parser, $name, $attrs) {
    global $map_array;
    if ($htmltag = $map_array[$name]) {
        print "<$htmltag>";
    }
}

function endElement($parser, $name) {
    global $map_array;
    if ($htmltag = $map_array[$name]) {
        print "</$htmltag>";
    }
}

function characterData($parser, $data) {
    print $data;
}

$xml_parser = xml_parser_create();
// use case-folding so we are sure to find the tag in $map_array
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, true);
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");
if (!($fp = fopen($file, "r"))) {
    die("could not open XML input");
}
while ($data = fread($fp, 4096)) {
    if (!xml_parse($xml_parser, $data, feof($fp))) {
        die(sprintf("XML error: %s at line %d",
                    xml_error_string(xml_get_error_code($xml_parser)),
                    xml_get_current_line_number($xml_parser)));
    }
}
xml_parser_free($xml_parser);
XML External Entity Example
This example highlights XML code. It illustrates how to use an
external entity reference handler to include and parse other
documents, as well as how PIs can be processed, and a way of
determining "trust" for PIs containing code.
XML documents that can be used for this example are found below
the example (xmltest.xml and
xmltest2.xml.)
Example 3. External Entity Example
<?php
$file = "xmltest.xml";

function trustedFile($file) {
    // only trust local files owned by ourselves
    if (!eregi("^([a-z]+)://", $file)
        && fileowner($file) == getmyuid()) {
        return true;
    }
    return false;
}

function startElement($parser, $name, $attribs) {
    print "&lt;<font color=\"#0000cc\">$name</font>";
    if (sizeof($attribs)) {
        while (list($k, $v) = each($attribs)) {
            print " <font color=\"#009900\">$k</font>=\"<font color=\"#990000\">$v</font>\"";
        }
    }
    print "&gt;";
}

function endElement($parser, $name) {
    print "&lt;/<font color=\"#0000cc\">$name</font>&gt;";
}

function characterData($parser, $data) {
    print "<b>$data</b>";
}

function PIHandler($parser, $target, $data) {
    switch (strtolower($target)) {
        case "php":
            global $parser_file;
            // If the parsed document is "trusted", we say it is safe
            // to execute PHP code inside it. If not, display the code
            // instead.
            if (trustedFile($parser_file[$parser])) {
                eval($data);
            } else {
                printf("Untrusted PHP code: <i>%s</i>",
                       htmlspecialchars($data));
            }
            break;
    }
}

function defaultHandler($parser, $data) {
    if (substr($data, 0, 1) == "&" && substr($data, -1, 1) == ";") {
        printf('<font color="#aa00aa">%s</font>',
               htmlspecialchars($data));
    } else {
        printf('<font size="-1">%s</font>',
               htmlspecialchars($data));
    }
}

function externalEntityRefHandler($parser, $openEntityNames, $base, $systemId,
                                  $publicId) {
    if ($systemId) {
        if (!list($parser, $fp) = new_xml_parser($systemId)) {
            printf("Could not open entity %s at %s\n", $openEntityNames,
                   $systemId);
            return false;
        }
        while ($data = fread($fp, 4096)) {
            if (!xml_parse($parser, $data, feof($fp))) {
                printf("XML error: %s at line %d while parsing entity %s\n",
                       xml_error_string(xml_get_error_code($parser)),
                       xml_get_current_line_number($parser), $openEntityNames);
                xml_parser_free($parser);
                return false;
            }
        }
        xml_parser_free($parser);
        return true;
    }
    return false;
}

function new_xml_parser($file) {
    global $parser_file;

    $xml_parser = xml_parser_create();
    xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, 1);
    xml_set_element_handler($xml_parser, "startElement", "endElement");
    xml_set_character_data_handler($xml_parser, "characterData");
    xml_set_processing_instruction_handler($xml_parser, "PIHandler");
    xml_set_default_handler($xml_parser, "defaultHandler");
    xml_set_external_entity_ref_handler($xml_parser, "externalEntityRefHandler");

    if (!($fp = @fopen($file, "r"))) {
        return false;
    }
    if (!is_array($parser_file)) {
        settype($parser_file, "array");
    }
    $parser_file[$xml_parser] = $file;
    return array($xml_parser, $fp);
}

if (!(list($xml_parser, $fp) = new_xml_parser($file))) {
    die("could not open XML input");
}

print "<pre>";
while ($data = fread($fp, 4096)) {
    if (!xml_parse($xml_parser, $data, feof($fp))) {
        die(sprintf("XML error: %s at line %d\n",
                    xml_error_string(xml_get_error_code($xml_parser)),
                    xml_get_current_line_number($xml_parser)));
    }
}
print "</pre>";
print "parse complete\n";
xml_parser_free($xml_parser);
?>
Example 4. xmltest.xml
<?xml version='1.0'?>
<!DOCTYPE chapter SYSTEM "/just/a/test.dtd" [
<!ENTITY plainEntity "FOO entity">
<!ENTITY systemEntity SYSTEM "xmltest2.xml">
]>
<chapter>
<TITLE>Title &plainEntity;</TITLE>
<para>
<informaltable>
<tgroup cols="3">
<tbody>
<row><entry>a1</entry><entry morerows="1">b1</entry><entry>c1</entry></row>
<row><entry>a2</entry><entry>c2</entry></row>
<row><entry>a3</entry><entry>b3</entry><entry>c3</entry></row>
</tbody>
</tgroup>
</informaltable>
</para>
&systemEntity;
<sect1 id="about">
<title>About this Document</title>
<para>
<!-- this is a comment -->
<?php print 'Hi! This is PHP version '.phpversion(); ?>
</para>
</sect1>
</chapter>
This file is included from xmltest.xml:
Example 5. xmltest2.xml
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY testEnt "test entity">
]>
<foo>
<element attrib="value"/>
&testEnt;
<?php print "This is some more PHP code being executed."; ?>
</foo>
User Contributed Notes XML parser functions

Daniel dot Rendall at btinternet dot com
07-Jul-1999 05:21
When using the XML parser, make sure you're not using the magic quotes
option (e.g. use set_magic_quotes_runtime(0) if it's not the compiled
default), otherwise you'll get 'not well-formed' errors when dealing with
tags with attributes set in them.

sam at cwa dot co dot nz
28-Sep-2000 02:39
I've discovered some unusual behaviour in this API when ampersand entities
are parsed in cdata; for some reason the parser breaks up the section
around the entities, and calls the handler repeated times for each of the
sections. If you don't allow for this oddity and you are trying to put the
cdata into a variable, only the last part will be stored.
You can get around this with a line like:
$foo .= $cdata;
If the handler is called several times from the same tag, it will append
them, rather than rewriting the variable each time. If the entire cdata
section is returned, it doesn't matter.
May happen for other entities, but I haven't investigated.
Took me a while to figure out what was happening; hope this saves someone
else the trouble.
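The append-instead-of-assign idea from this note can be tried on a one-line document. This is only a sketch for current PHP versions: collectText() is a hypothetical helper, and how many times the handler fires for a given input is up to the parser, so the code counts the calls rather than assuming a number.

```php
<?php
// Hypothetical helper: returns the full text content of a document,
// appending every piece the character data handler delivers.
function collectText($xml)
{
    $text = '';
    $calls = 0;
    $parser = xml_parser_create();
    xml_set_character_data_handler(
        $parser,
        function ($parser, $cdata) use (&$text, &$calls) {
            $calls++;
            $text .= $cdata;   // append, never overwrite
        }
    );
    xml_parse($parser, $xml, true);
    return array($text, $calls);
}

list($text, $calls) = collectText('<title>Fish &amp; Chips</title>');
print "$text (delivered in $calls call(s))\n";
```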

morgan_rogers at yahoo dot com
06-Oct-2000 08:37
There's a really good article on XML parsing with PHP at

zher at operamail dot com
31-Mar-2001 01:35
Excellent IMHO XPath library for XML manipulation. Doesn't require the
XML libraries to be installed. Take a look:

hans dot schneider at bbdo-interone dot de
24-Jan-2002 03:43
I had to TRIM the data when I passed one large string containing a
well-formed XML file to xml_parse. The string was read by CURL, which
apparently put a BLANK at the end of the string. This BLANK produced an
"XML not wellformed" error in xml_parse!

jason at NOSPAM_projectexpanse_NOSPAM dot com
26-Feb-2002 11:11
For newbies wanting a good tutorial on how to actually get started and
where to go from this listing of functions, then visit:
It shows an excellent example of how to read the XML data into a class file
so you can actually process it, not just display it all pretty-like, like
many tutorials on PHP/XML seem to be doing.

jason at N0SPAM dot projectexpanse dot com
22-Mar-2002 08:16
In reference to the note made by sam at cwa dot co dot nz about parsing
entities:
I could be wrong, but since it is possible to define your own entities
within an XML DTD, the cdata handler function parses these individually to
allow for your own implementation of those entities within your cdata
handler.

danielc at analysisandsolutions dot com
15-Apr-2002 09:23
I put up a good, simple, real world example of how to parse XML documents.
While the sample grabs stock quotes off of the web, you can tweak it to do
whatever you need.

jon at gettys dot org
14-Aug-2002 08:59
[Editor's note: see also xml_parse_into_struct().]
Very simple routine to convert an XML file into a PHP structure.
$obj->xml contains the resulting PHP structure. I would be interested if
someone could suggest a cleaner method than the evals I am using.
<?
$filename = 'sample.xml';
$obj->tree = '$obj->xml';
$obj->xml = '';

function startElement($parser, $name, $attrs) {
    global $obj;

    // If var already defined, make array
    eval('$test=isset('.$obj->tree.'->'.$name.');');
    if ($test) {
        eval('$tmp='.$obj->tree.'->'.$name.';');
        eval('$arr=is_array('.$obj->tree.'->'.$name.');');
        if (!$arr) {
            eval('unset('.$obj->tree.'->'.$name.');');
            eval($obj->tree.'->'.$name.'[0]=$tmp;');
            $cnt = 1;
        }
        else {
            eval('$cnt=count('.$obj->tree.'->'.$name.');');
        }
        $obj->tree .= '->'.$name."[$cnt]";
    }
    else {
        $obj->tree .= '->'.$name;
    }
    if (count($attrs)) {
        eval($obj->tree.'->attr=$attrs;');
    }
}

function endElement($parser, $name) {
    global $obj;

    // Strip off last ->
    for($a=strlen($obj->tree);$a>0;$a--) {
        if (substr($obj->tree, $a, 2) == '->') {
            $obj->tree = substr($obj->tree, 0, $a);
            break;
        }
    }
}

function characterData($parser, $data) {
    global $obj;

    eval($obj->tree.'->data=\''.$data.'\';');
}

$xml_parser = xml_parser_create();
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");
if (!($fp = fopen($filename, "r"))) {
    die("could not open XML input");
}
while ($data = fread($fp, 4096)) {
    if (!xml_parse($xml_parser, $data, feof($fp))) {
        die(sprintf("XML error: %s at line %d",
                    xml_error_string(xml_get_error_code($xml_parser)),
                    xml_get_current_line_number($xml_parser)));
    }
}
xml_parser_free($xml_parser);
print_r($obj->xml);
return 0;
?>
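One possible answer to the question about avoiding eval is to keep a stack of references to the currently open nodes. The sketch below is not the note author's code: it targets current PHP versions, builds nested arrays rather than objects, and xmlToTree() together with the 'name', 'attrs', 'data' and 'children' keys are hypothetical choices made for the illustration.

```php
<?php
// Hypothetical helper: builds a nested array from an XML string
// without eval(). Returns false if the document is not well-formed.
function xmlToTree($xml)
{
    $root  = array('name' => null, 'attrs' => array(),
                   'data' => '', 'children' => array());
    $stack = array(&$root);   // references to the currently open nodes

    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attrs) use (&$stack) {
            $node = array('name' => $name, 'attrs' => $attrs,
                          'data' => '', 'children' => array());
            $parent = &$stack[count($stack) - 1];
            $parent['children'][] = $node;
            // Push a reference to the stored child so that later
            // writes reach the tree itself, not a copy.
            $stack[] = &$parent['children'][count($parent['children']) - 1];
        },
        function ($parser, $name) use (&$stack) {
            array_pop($stack);
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($parser, $data) use (&$stack) {
            $stack[count($stack) - 1]['data'] .= $data;
        }
    );

    $ok = xml_parse($parser, $xml, true);
    return $ok ? $root['children'][0] : false;
}

$tree = xmlToTree('<a id="1"><b>x</b><b>y</b></a>');
print_r($tree);
```

Because character data is appended with `.=` and never interpolated into code, quotes and multi-line text in the document need no escaping.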

yohgaki at php dot net
16-Sep-2002 05:05
Just a quick note. The XML extension defines an "xml" resource, which is
created by xml_parser_create(). I don't know why the documentation doesn't
mention it...

dmarsh dot NO dot SPAM dot PLEASE at spscc dot ctc dot edu
18-Sep-2002 07:27
Some reference code I am working on as an "XML Library", which I am
folding into an object. Notice the use of the DEFINE.
Mainly Example 1 and parts of 2 & 3 re-written as an object:
--- MyXMLWalk.lib.php ---
<?php
if (!defined("PHPXMLWalk")) {
    define("PHPXMLWalk",TRUE);

    class XMLWalk {
        var $p; // short for xml parser
        var $e; // short for element stack/array

        function prl($x,$i=0) {
            ob_start();
            print_r($x);
            $buf=ob_get_contents();
            ob_end_clean();
            return join("\n".str_repeat(" ",$i),split("\n",$buf));
        }

        function XMLWalk() {
            $this->p = xml_parser_create();
            $this->e = array();
            xml_parser_set_option($this->p, XML_OPTION_CASE_FOLDING, true);
            xml_set_element_handler($this->p,
                array(&$this, "startElement"), array(&$this, "endElement"));
            xml_set_character_data_handler($this->p,
                array(&$this, "dataElement"));
            register_shutdown_function(array(&$this, "free")); // make a destructor
        }

        function startElement($parser, $name, $attrs) {
            if (count($attrs)>=1) {
                $x = $this->prl($attrs, $this->e[$parser]+6);
            } else {
                $x = "";
            }
            print str_repeat(" ",$this->e[$parser]+0). "$name $x\n";
            $this->e[$parser]++;
            $this->e[$parser]++;
        }

        function dataElement($parser, $data) {
            print str_repeat(" ",$this->e[$parser]+0).
                htmlspecialchars($data, ENT_QUOTES) ."\n";
        }

        function endElement($parser, $name) {
            $this->e[$parser]--;
            $this->e[$parser]--;
        }

        function parse($data, $fp) {
            if (!xml_parse($this->p, $data, feof($fp))) {
                die(sprintf("XML error: %s at line %d",
                    xml_error_string(xml_get_error_code($this->p)),
                    xml_get_current_line_number($this->p)));
            }
        }

        function free() {
            xml_parser_free($this->p);
        }
    } // end of class
} // end of define
?>
--- end of file ---
Calling code:
<?php
...
require("MyXMLWalk.lib.php");
$file = "x.xml";
$xme = new XMLWalk;
if (!($fp = fopen($file, "r"))) {
    die("could not open XML input");
}
while ($data = fread($fp, 4096)) {
    $xme->parse($data, $fp);
}
...
?>

guy at bhaktiandvedanta dot com
27-Sep-2002 07:01
For a simple XML parser you can use this function. It doesn't require any
extensions to run.
<?
// Extracts content from XML tag
function GetElementByName ($xml, $start, $end) {
    global $pos;
    $startpos = strpos($xml, $start);
    if ($startpos === false) {
        return false;
    }
    $endpos = strpos($xml, $end);
    $endpos = $endpos+strlen($end);
    $pos = $endpos;
    $endpos = $endpos-$startpos;
    $endpos = $endpos - strlen($end);
    $tag = substr ($xml, $startpos, $endpos);
    $tag = substr ($tag, strlen($start));
    return $tag;
}

// Open and read xml file. You can replace this with your xml data.
$file = "data.xml";
$pos = 0;
$Nodes = array();
if (!($fp = fopen($file, "r"))) {
    die("could not open XML input");
}
while ($getline = fread($fp, 4096)) {
    $data = $data . $getline;
}
$count = 0;
$pos = 0;

// Goes through XML file and creates an array of all <XML_TAG> tags.
while ($node = GetElementByName($data, "<XML_TAG>", "</XML_TAG>")) {
    $Nodes[$count] = $node;
    $count++;
    $data = substr($data, $pos);
}

// Gets information from tag siblings.
for ($i=0; $i<$count; $i++) {
    $code = GetElementByName($Nodes[$i], "<Code>", "</Code>");
    $desc = GetElementByName($Nodes[$i], "<Description>", "</Description>");
    $price = GetElementByName($Nodes[$i], "<BasePrice>", "</BasePrice>");
}
?>
Hope this helps! :)
Guy Laor

sfaulkner at hoovers dot com
04-Nov-2002 07:29
Building on... This allows you to return the value of an element using an
XPath reference. This code would of course need error handling added :-)
function GetElementByName ($xml, $start, $end) {
    $startpos = strpos($xml, $start);
    if ($startpos === false) {
        return false;
    }
    $endpos = strpos($xml, $end);
    $endpos = $endpos+strlen($end);
    $endpos = $endpos-$startpos;
    $endpos = $endpos - strlen($end);
    $tag = substr ($xml, $startpos, $endpos);
    $tag = substr ($tag, strlen($start));
    return $tag;
}

function XPathValue($XPath,$XML) {
    $XPathArray = explode("/",$XPath);
    $node = $XML;
    while (list($key,$value) = each($XPathArray)) {
        $node = GetElementByName($node, "<$value>", "</$value>");
    }
    return $node;
}

print XPathValue("Response/Shipment/TotalCharges/Value",$xml);

mreilly at ZEROSPAM dot MAC dot COM
14-Nov-2002 05:01
I wanted a way to reference the XML tree by path. I couldn't find exactly
what I wanted, but using examples here and on phpbuilder.com came up with
this. This results in a nested associative array, so elements can be
accessed in the manner:
echo $ary_parsed_file['path']['to']['value'];
<?php
// Display the print_r() output in a readable format
echo '<PRE>';

// Array to store current xml path
$ary_path = array();

// Array to store parsed data
$ary_parsed_file = array();

// Starting level - Set to 0 to display all levels. Set to 1 or higher
// to skip a level that is common to all the fields.
$int_starting_level = 1;

// what are we parsing?
$xml_file = 'label.xml';

// declare the character set - UTF-8 is the default
$type = 'UTF-8';

// create our parser
$xml_parser = xml_parser_create($type);

// set some parser options
xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, true);
xml_parser_set_option($xml_parser, XML_OPTION_TARGET_ENCODING, 'UTF-8');

// this tells PHP what functions to call when it finds an element
// these functions also handle the element's attributes
xml_set_element_handler($xml_parser, 'startElement','endElement');

// this tells PHP what function to use on the character data
xml_set_character_data_handler($xml_parser, 'characterData');

if (!($fp = fopen($xml_file, 'r'))) {
    die("Could not open $xml_file for parsing!\n");
}

// loop through the file and parse baby!
while ($data = fread($fp, 4096)) {
    if (!($data = utf8_encode($data))) {
        echo 'ERROR'."\n";
    }
    if (!xml_parse($xml_parser, $data, feof($fp))) {
        die(sprintf( "XML error: %s at line %d\n\n",
            xml_error_string(xml_get_error_code($xml_parser)),
            xml_get_current_line_number($xml_parser)));
    }
}

xml_parser_free($xml_parser);

// Display the array
print_r($ary_parsed_file);

// This function is called for every opening XML tag. We
// need to keep track of our path in the XML file, so we
// will use this function to add the tag name to an array
function startElement($parser, $name, $attrs=''){
    // Make sure we can access the path array
    global $ary_path;

    // Push the tag into the array
    array_push($ary_path, $name);
}

// This function is called for every closing XML tag. We
// need to keep track of our path in the XML file, so we
// will use this function to remove the last item of the array.
function endElement($parser, $name, $attrs=''){
    // Make sure we can access the path array
    global $ary_path;

    // Pop the last tag off the array
    array_pop($ary_path);
}

// This function is called for every data portion found between
// opening and closing tags. We will use it to insert values
// into the array.
function characterData($parser, $data){
    // Make sure we can access the path and parsed file arrays
    // and the starting level value
    global $ary_parsed_file, $ary_path, $int_starting_level;

    // Remove extra white space from the data (so we can tell if it's empty)
    $str_trimmed_data = trim($data);

    // Since this function gets called whether there is text data or not,
    // we need to prevent it from being called when there is no text data
    // or it overwrites previous legitimate data.
    if (!empty($str_trimmed_data)) {

        // Build the array definition string
        $str_array_define = '$ary_parsed_file';

        // Add a [''] and data for each level. (Starting level can be defined.)
        for ($i = $int_starting_level; $i < count($ary_path); $i++) {
            $str_array_define .= '[\'' . $ary_path[$i] . '\']';
        }

        // Add the value portion of the statement
        $str_array_define .= " = '" . $str_trimmed_data . "';";

        // Evaluate the statement we just created
        eval($str_array_define);

        // DEBUG
        //echo "\n" . $str_array_define;
    } // if
}
?>
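The nested-array assignment in characterData() can also be done without building a statement for eval, which sidesteps the problem of data that itself contains a quote. This is a sketch rather than a drop-in replacement: setByPath() is a hypothetical helper and the path and value are made-up sample data.

```php
<?php
// Hypothetical helper: stores $value in a nested array at the position
// named by $path, e.g. array('path', 'to', 'value').
function setByPath(array &$target, array $path, $value)
{
    $cursor = &$target;
    foreach ($path as $key) {
        if (!isset($cursor[$key]) || !is_array($cursor[$key])) {
            $cursor[$key] = array();
        }
        $cursor = &$cursor[$key];   // descend one level
    }
    $cursor = $value;
}

$ary_parsed_file = array();
setByPath($ary_parsed_file, array('LABEL', 'ADDRESS', 'CITY'), "O'Fallon");
setByPath($ary_parsed_file, array('LABEL', 'ADDRESS', 'ZIP'), '63366');
print $ary_parsed_file['LABEL']['ADDRESS']['CITY'] . "\n";
```

Inside a character data handler, the call would take the slice of the current path array that starts at the chosen starting level.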

wheaty at planetquake dot com
17-Dec-2002 06:51
There are some really useful pre-rolled PHP XML Classes available at

j dot h dot wester at planet dot nl
03-Jan-2003 11:20
It looks like, if
'xml_parser_set_option($parser,XML_OPTION_SKIP_WHITE,1)' is set, even
newlines between '<![CDATA[' and ']]>' are skipped.

simen at bleed dot no
11-Jan-2003 10:27
I was experiencing really weird behaviour loading a large XML document
(91k), since reading the file through a buffer of 4096 bytes doesn't
take the following into consideration:
<node>this is my value</node>
If the 4096 byte buffer fills up at "my", the string arrives split in the
handler registered with xml_set_character_data_handler().
The only solution I've found so far is to read the whole document into a
variable and then parse.

software at serv-a-com dot com
22-Jan-2003 09:08
use:
while ($data = str_replace("\n","",fread($fp, 4096))){
instead of:
while ($data = fread($fp, 4096)) {
It will save you a headache.
And in response to (simen at bleed dot no 11-Jan-2003 04:27) "If the 4096
byte buffer fills up...": please take better care of your data. Don't just
shove it into xml_parse(); check and make sure that the tags are not
sliced in the middle, and use a temporary variable between fread and
xml_parse.
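The chunk-boundary question raised in the two notes above can be tested directly. xml_parse() is designed to be fed a document in pieces, so a small experiment is to parse the same document once in 8-byte chunks and once in a single call, appending character data per element as suggested earlier on this page. This is a sketch for current PHP versions; parseInChunks() is a hypothetical helper and the chunk size is arbitrary.

```php
<?php
// Hypothetical helper: feeds $xml to the parser $chunkSize bytes at a
// time and returns array(name, text) pairs, one per closed element.
function parseInChunks($xml, $chunkSize)
{
    $closed  = array();
    $buffers = array();   // one text buffer per open element

    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attrs) use (&$buffers) {
            $buffers[] = '';
        },
        function ($parser, $name) use (&$buffers, &$closed) {
            $closed[] = array($name, array_pop($buffers));
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($parser, $data) use (&$buffers) {
            if ($buffers) {
                $buffers[count($buffers) - 1] .= $data;   // append pieces
            }
        }
    );

    $chunks = str_split($xml, $chunkSize);
    foreach ($chunks as $i => $chunk) {
        $isFinal = ($i === count($chunks) - 1);
        if (!xml_parse($parser, $chunk, $isFinal)) {
            return false;
        }
    }
    return $closed;
}

$doc = '<doc><node>this is my value</node><node>second</node></doc>';
var_dump(parseInChunks($doc, 8) === parseInChunks($doc, strlen($doc)));
```

With 8-byte chunks, both tags and text are cut mid-way in this sample, so if the comparison holds, the only thing a handler has to allow for is text arriving in more than one call.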

duerst at w3 dot org
12-Feb-2003 11:28
"anony at mous dot com" said on 20-Apr-2000 06:26: "Disable case folding
or your code will be violating the XML 1.0 specification."
This is very true, and very important. Any script that does not use
something like
xml_parser_set_option ($xml_parser, XML_OPTION_CASE_FOLDING, FALSE);
immediately after every call to xml_parser_create() is in serious danger
of creating great confusion. I cannot understand why the default for PHP
is set to casefolding, and why this option is even available in the first
place. Element and attribute names in XML are NOT case sensitive, period.

hmoulding at excite dot com
16-Feb-2003 12:23
Duerst wrote that XML is not case sensitive.
I think he misspoke.
XML is case sensitive.
<TAG />
is not the same as
<Tag />
is not the same as
<tag />
In XML you have to use consistent case.
-- Helge Moulding

software at serv-a-com dot com
17-Feb-2003 04:09
My previous XML post (software at serv-a-com dot com/22-Jan-2003 03:08)
resulted in some of the visitors e-mailing me with questions on the
carriage return stripping issue. I'll try to make the following mumble as
brief and easy to understand as possible.
1. Overview of the 4096 fragmentation issue
As you know, the following freads the file 4096 bytes at a time (that is,
4KB). This is perhaps ok for testing expat and figuring out how things
work, but it is rather dangerous in a production environment. Data may not
be fully understandable due to fread fragmentation, and improperly
formatted due to the numerous sources (formats) of data contained within
(i.e. end of line delimited CDATA).
while ($data = fread($fp, 4096)) {
    if (!xml_parse($xml_parser, $data, feof($fp))) {
Sometimes, to save time, one may want to load it all up into one big
variable and leave all the worries to expat. I think anything under 500 KB
is ok (as long as nobody knows about it). Some may argue that larger
variables are acceptable or even necessary because of the magic that takes
place while parsing using xml_parse. Our XML parser (expat) works and can
be successfully implemented only when we know what type of XML data we are
dealing with: its average size, the structure of its general layout, and
the data contained within tags. For example, if the tags are followed by a
line delimiter like a new line, we can read the file in with fgets and,
with minimal effort, make sure that no data is sent to the function that
does not end with an end tag. But this requires a fair knowledge of the
file's preference for storing XML data and tags (and a bit of code between
reading data and xml_parse'ing it).

software at serv-a-com dot com
17-Feb-2003 04:10
2. Pre Parser Strings and New Line Delimited Data
One important thing to note at this point is that the xml_parse function
requires a string variable. You can manipulate the content of any string
variable easily, as we all know.
A better approach to removing newlines than in my previous post:
while ($data = fread($fp, 4096)) {
    $data = preg_replace("/\n|\r/","",$data); //flarp
    if (!xml_parse($xml_parser, $data, feof($fp))) {...
The above works across all 3 kinds of line-delimited text files (\n, \r,
\r\n). But this could potentially (or will most likely) damage or scramble
data contained in, for example, CDATA areas. As far as I am concerned, end
of line characters should not be used _within_ XML tags. What seems to be
the ultimate solution is to pre-parse the loaded data. This would require
checking the position within the XML document and adding or subtracting
data (using a temporary variable between freads) based on conditions like
"Is within tag", "Is within CDATA" etc. before feeding it to the parser.
This of course opens up a new can of worms (as in: parse data for the
parser...). (The above procedure would take place between the fread and
xml_parse calls; this method would be compatible with the general usage
examples at the top of the page.)
3. The Answer to parsing arbitrary XML and Preprocessor Revisited
You can't just feed any XML document to the parser you constructed and
assume that it will work! You have to know what kind of methods for
storing data are used: for example, is there end of line delimited data in
the file? Are there any carriage returns in the tags? etc. XML files come
formatted in different ways: some are just one long string of characters
without any end of line markers, others have newlines, carriage returns or
both (Microsloth Windows). They may or may not contain spaces and other
whitespace between tags. For this reason it is important to do what I call
Normalizing the data before feeding it to the parser. You can perform this
with regular expressions or plain old str_replace and concatenation. In
many cases this can be done to the file itself, sometimes to string data
on the fly (as shown in the example above). But I feel it is important to
normalize the data before even calling the function that calls xml_parse.
If you have the ability to access all data before that call, you can
convert it to what you feel the data should have been in the first place
and avoid many surprises and expensive regular expression substitution (in
a tight spot) while fread'ing the data.

fred at barron dot com
23-Apr-2003 12:28
regarding jon at gettys dot org's nice XML to Object code, I've made some
useful changes (IMHO) to the characterData function... my minor
modifications allow multiple lines of data and it escapes quotes so errors
don't occur in the eval...
function characterData($parser, $data) {
    global $obj;
    $data = addslashes($data);
    eval($obj->tree."->data.='".$data."';");
}

panania at 3ringwebs dot com
20-May-2003 10:12
The above example doesn't work when you're parsing a string being returned
from a curl operation (why I don't know!) I kept getting undefined offsets
at the highest element number in both the start and end element functions.
It wasn't the string itself I know, because I substringed it to death with
the same results. But I fixed the problem by adding these lines of
code...
function defaultHandler($parser, $name) {
    global $depth;
    @ $depth[$parser]--;
}
xml_set_default_handler($xml_parser, "defaultHandler");
Hope this helps 8-}