How do I select a specific sub-node of an XML file using XML::Twig in Perl?
This must be a dumb question, but I'm a bit stuck:
I have an XML file; here is a sample of it:
<?xml version="1.0" encoding="utf-16"?>
<!DOCTYPE tmx SYSTEM "56.dtd">
<body>
<tu changedate="20130625T175037Z">
<tuv xml:lang="pt-pt">
<prop type="x-context-pre"><seg>Some text.</seg></prop>
<prop type="x-context-post"><seg>Other text.</seg></prop>
<seg>The text I'm interested.</seg>
</tuv>
<tuv xml:lang="it">
<seg>And it's translation in italian.</seg>
</tuv>
</tu>
.... followed by other <tu>'s
</body>
Since it's a huge file, I'm using XML::Twig to parse it and get the parts
I'm interested in. I'm particularly interested in the seg node's content as well
as the tu node's attribute.
Here's the code I've got so far:
use 5.010;
use strict;
use warnings;
use XML::Twig;
my $filename = 'filename.tmx';
my $out_filename = 'out.xml';
open my $out, '>', $out_filename or die "Cannot open $out_filename: $!";
binmode $out;
my $original_twig = XML::Twig->new(pretty_print => 'nsgmls', twig_handlers
=> {tu => \&original_tu});
$original_twig->parsefile($filename);
sub original_tu {
my($twig, $original_tu) = @_;
my $original_seg = $original_tu-> first_child('./tuv/seg')->text;
}
Perl (or should I say XML::Twig) tells me that I've got: wrong navigation
condition './tuv/seg' ()
Does anyone know how to access the seg node's text and, if you're not fed
up with me already, how to access the changedate attribute of the tu node?
Thank you very much.
Dasen
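A possible approach, offered as a sketch rather than a definitive answer: first_child expects a single condition (such as a tag name), not a path, which would explain the "wrong navigation condition" message for './tuv/seg'. Walking down one level at a time avoids the problem, and att returns an attribute's value. The inline sample below stands in for the real file (the XML declaration and DOCTYPE are left out so that no DTD is needed), and the names handle_tu and @results are made up for illustration.

```perl
use 5.010;
use strict;
use warnings;
use XML::Twig;

# Inline sample standing in for the real TMX file (assumption: same structure).
my $xml = <<'XML';
<body>
<tu changedate="20130625T175037Z">
<tuv xml:lang="pt-pt">
<prop type="x-context-pre"><seg>Some text.</seg></prop>
<prop type="x-context-post"><seg>Other text.</seg></prop>
<seg>The text I'm interested.</seg>
</tuv>
<tuv xml:lang="it">
<seg>And it's translation in italian.</seg>
</tuv>
</tu>
</body>
XML

# Hypothetical collector, used here only so the result can be inspected.
my @results;

my $twig = XML::Twig->new(
    twig_handlers => { tu => \&handle_tu },
);
$twig->parse($xml);    # for the real file: $twig->parsefile($filename)

sub handle_tu {
    my ($t, $tu) = @_;

    # Attribute of the tu element itself.
    my $changedate = $tu->att('changedate');

    # first_child takes a tag name, so descend one level per call:
    # first tuv child, then its first direct seg child (not the ones inside prop).
    my $tuv = $tu->first_child('tuv');
    my $seg = $tuv ? $tuv->first_child('seg') : undef;

    push @results, {
        changedate => $changedate,
        text       => $seg ? $seg->text : undef,
    };

    # Release what has been parsed so far; useful for a huge file.
    $t->purge;
}

for my $r (@results) {
    printf "%s: %s\n", $r->{changedate}, $r->{text} // '';
}
```

If a path expression is preferred, get_xpath on the element (for example $tu->get_xpath('./tuv/seg')) should accept the path that first_child rejects; it returns a list of matching elements, so the first one still has to be picked out.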