Using XML in configuration files
Level: Advanced
Techniques: XML
Compatibility: XML v.1.0
Keywords: XML, configuration, settings
Developers often have to create configuration files for programs that
use a wide variety of parameters. A large number of parameters, and
relations among them, can cause trouble when you write software that
edits the configuration through a web interface. If the configuration file
is flat, i.e. every parameter is a ‘key - value’ pair, the solution is
easy to implement. In the other case, values may be arrays, hashes,
hashes of arrays and so on, not only scalars. So our aim is to make the
configuration both easy to edit through a web interface and easy to use
from a script. At this stage there are two reasonable approaches.
a) Representing the configuration file as a Perl package. The storage format
coincides with Perl syntax, so loading the data is very simple: just put
require 'config.pl';
into your source.
b) Keeping the data in a DB file where each value is a packed Perl structure
(possibly using Storable::freeze and Storable::thaw).
However, both of these approaches have drawbacks for our purposes.
The first one is a good solution for experts, except for access through a
web interface, because parsing Perl data is a task equivalent to writing
a Perl compiler. Certainly we could pass the whole config file to
eval() and then use the already compiled values. However, the same data
can be written in different ways, and the original notation is lost after
evaluation. For example, the following arrays
@myArray1 = (1,2,3,4,5,6,7,8,9,10);
@myArray2 = (1.. 10);
are indistinguishable from Perl's point of view. The other reason
not to use this approach is that it works only for Perl programs.
The second approach could be very convenient. However, DB file storage
rules out manual editing because DB files are binary.
So our solution is the XML format.
Let us enumerate the advantages relevant to our needs:
1) XML is a text markup language, so it is convenient for
manual editing (possibly using any XML editor).
2) XML's structured format allows complex data to be organized, such
as arrays, hashes, hashes of arrays, etc.
3) Many XML-oriented libraries make the data accessible from a wide
range of programming languages.
Now we will list some modules that are very useful for our task:
XML::Simple, Data::Dumper, Storable.
XML::Simple
Editing XML data with XML::Simple is simple. The XMLin function reads
data from a source and returns a reference to the resulting structure. The
XMLout function stores a structure (possibly modified) as an XML file.
#!/usr/bin/perl
use XML::Simple;
$xml_content = XMLin('config.xml');
#.................... some manipulations
XMLout($xml_content, outputfile => 'config.xml');
The only drawback is that
XMLout(XMLin('config.xml'));
will not reproduce “config.xml” exactly as it was. But the
difference can be greatly reduced by tuning XML::Simple parameters
and by adapting the XML data accordingly.
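As an illustration of such tuning, here is a minimal sketch of a round trip. The sample XML and the variable names are hypothetical; KeepRoot, KeyAttr and NoAttr are XML::Simple options, chosen here on the assumption that the file stores every setting as a nested element.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple qw(XMLin XMLout);

# Hypothetical sample configuration, inlined so the sketch is self-contained.
my $xml = <<'XML';
<config>
  <var1>1</var1>
  <var2>abc</var2>
</config>
XML

# KeepRoot keeps the <config> wrapper; KeyAttr => [] disables array folding.
my %opts = (KeepRoot => 1, KeyAttr => []);

my $config = XMLin($xml, %opts);
$config->{config}{var2} = 'xyz';    # modify one setting

# NoAttr => 1 writes scalars as nested elements rather than attributes.
my $out = XMLout($config, %opts, NoAttr => 1);
print $out;

# Reading the output back yields the modified structure.
my $again = XMLin($out, %opts);
```

Using the same option set for reading and writing is what keeps the two representations close; element order may still differ from the original file.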
Data::Dumper
is a very helpful module for debugging and a perfect tool for viewing
the formatted content of a structure.
Storable
The Storable package brings persistence to your Perl data structures.
The ‘freeze’ and ‘thaw’ functions pack structures
into scalars and unpack them again.
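A minimal sketch of how these two modules work together is shown below. The configuration structure is a hypothetical example; freeze, thaw and Dumper are the functions the modules export.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);
use Data::Dumper;

# Hypothetical configuration structure with nested data.
my $config = {
    name  => 'demo',
    ports => [ 80, 443 ],
    paths => { log => '/var/log/demo', tmp => '/tmp/demo' },
};

my $packed   = freeze($config);    # binary scalar, suitable as a DB file value
my $restored = thaw($packed);      # independent deep copy of the structure

local $Data::Dumper::Sortkeys = 1;    # stable key order for readable output
print Dumper($restored);
```

The frozen scalar is binary, which is why this format suits storage but not manual editing.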
So let's start creating the configuration file in the form most
convenient for us. In general, scalars can be stored in the following
format
<var1>1</var1>
<var2>abc</var2>
But we advise adding at least two additional attributes: ‘type’ and ‘order’.
a) ‘order’: unfortunately XML::Simple does not preserve
element order, because it represents nodes as hash keys and
Perl hashes do not keep their order. So if elements must appear
in the web interface in the same order as
in the configuration file, you can use the following sort function:
sub order {
    if (exists($config->{$a}->{order}) && exists($config->{$b}->{order})) {
        return ($config->{$a}->{order} <=> $config->{$b}->{order});
    } elsif (exists($config->{$a}->{order})) {
        return -1;
    } elsif (exists($config->{$b}->{order})) {
        return 1;
    }
    return 0;
}
and invoke it, for example, in a ‘foreach’ loop:
foreach $key (sort {order()} keys %$config){
    #......
}
It sorts all nodes which have an ‘order’ attribute and
places the others at the end.
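The ordering can be tried without any XML at all. The sketch below uses a hypothetical hash shaped like XMLin() output; the comparator name by_order and the tie-break by key name are additions for illustration, not part of the function above.

```perl
use strict;
use warnings;

# Hypothetical structure, shaped like XMLin() output for elements
# that carry an optional 'order' attribute.
my $config = {
    title   => { order => 2, content => 'Demo' },
    host    => { order => 1, content => 'localhost' },
    comment => { content => 'no order attribute' },
};

# Comparator: ordered nodes first, by 'order'; unordered nodes last.
sub by_order {
    my ($x, $y) = map { $config->{$_}{order} } ($a, $b);
    return $x <=> $y if defined $x && defined $y;
    return -1 if defined $x;
    return  1 if defined $y;
    return $a cmp $b;    # tie-break by name (an addition, for stable output)
}

my @keys = sort by_order keys %$config;
print "@keys\n";    # prints "host title comment"
```

Passing the subroutine name directly to sort avoids the extra block and lets Perl set $a and $b for the comparator.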
b) ‘type’ is a useful attribute too. It helps to
determine how to process the value.
For example, ‘number’ shows that the value is the number 255,
not the string ‘0xff’:
<scalar type="number" value="0xff"/>
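One possible way to act on the ‘type’ attribute is sketched below. The helper typed_value is hypothetical, and treating every unknown type as a plain string is an assumption of this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: interpret a node according to its 'type' attribute.
sub typed_value {
    my ($node) = @_;
    my ($type, $value) = @{$node}{qw(type value)};
    return $value unless defined $type && $type eq 'number';
    # oct() understands the 0x (hex), 0b (binary) and leading-0 (octal) prefixes.
    return $value =~ /^0[xXbB0-7]/ ? oct($value) : $value + 0;
}

# Shaped like XMLin() output for <scalar type="number" value="0xff"/>.
my $number = typed_value({ type => 'number', value => '0xff' });    # 255
my $string = typed_value({ type => 'string', value => '0xff' });    # '0xff'
print "$number $string\n";
```

Keeping the conversion in one helper means the web interface and the script interpret typed values the same way.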
Arrays: most probably you will have to set the ‘forcearray’ option
for some elements.
This option should be set to '1', or to a list of element names, to force
nested elements to be represented as arrays even when there is only one.
<array1>
<items>Apple</items>
</array1>
Without forcearray => ['items'] this structure will be represented
as
'array1' => {
    'items' => 'Apple'
}
instead of
'array1' => {
    'items' => [
        'Apple'
    ]
}
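The difference can be checked directly. This sketch wraps the fragment above in a hypothetical <config> root element and parses it with and without ForceArray.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# The <config> root element is an assumption added for a complete document.
my $xml = '<config><array1><items>Apple</items></array1></config>';

my $plain  = XMLin($xml);                             # items => 'Apple'
my $forced = XMLin($xml, ForceArray => ['items']);    # items => ['Apple']

print ref($plain->{array1}{items})  || 'scalar', "\n";
print ref($forced->{array1}{items}) || 'scalar', "\n";
```

Listing element names instead of passing '1' forces arrays only where the program expects them and leaves true scalars alone.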
Working with hashes is a bit harder than with arrays. Everything depends
on the hash keys: if their names are valid XML element names, we will
of course use
<hash2 type="hash">
<key1>one</key1>
<key2>two</key2>
</hash2>
and XMLin will represent it as a Perl hash without any changes.
'hash2' => {
    'key1' => 'one',
    'key2' => 'two',
    'type' => 'hash'
}
However, in the general case a key cannot be used as an XML element
name, so the key is stored in an attribute:
<hash1 type="hash">
<item expression="a>1">true</item>
<item expression="b!=0">false</item>
</hash1>
and the result of XMLin(...) will be:
'hash1' => {
    'item' => [
        {
            'expression' => 'a>1',
            'content' => 'true'
        },
        {
            'expression' => 'b!=0',
            'content' => 'false'
        }
    ],
    'type' => 'hash'
}
Of course this structure is not quite like the original hash.
But it is the only way to represent such hashes, because we cannot make
XML::Simple skip our auxiliary elements, even with contentkey,
forcearray and forcecontent. These settings do, however, allow the data
to be organized in the form your program prefers. The following code
transforms this structure into a regular hash.
my %HASH;
map {$HASH{$_->{expression}}=$_->{content}}
(@{$xml->{hash1}->{item}});
You can set “keyattr => 'expression'” to reach the hash keys
quickly.
The content will then be:
'hash1' => {
    'item' => {
        'b!=0' => {
            'content' => 'false'
        },
        'a>1' => {
            'content' => 'true'
        }
    },
    'type' => 'hash'
}
Some useful settings are described below (see the man pages for more info):
forcecontent
When XMLin() parses elements which have text content as well as
attributes, the text content must be represented as a hash value rather
than a simple scalar. This option allows you to force text content to
always parse to a hash value even when there are no attributes.
keeproot
In its attempt to return a data structure free of superfluous detail
and unnecessary levels of indirection, XMLin() normally
discards the root element name. Setting the keeproot option to '1'
will cause the root element name to be retained.
keyattr
This option controls the 'array folding' feature which translates nested
elements from an array to a hash. It also controls the 'unfolding'
of hashes to arrays.
The key attribute names should be supplied in an arrayref if there is
more than one. XMLin() will attempt to match attribute names
in the order supplied. XMLout() will use the first attribute
name supplied when 'unfolding' a hash into an array.
Note 1: The default value for keyattr
is ['name', 'key', 'id']. If you do not want folding on input or unfolding
on output, you must set this option to an empty list to disable the feature.
Note 2: If you wish to use this option, you should also enable the
'forcearray' option. Without 'forcearray', a single nested element
will be rolled up into a scalar rather than an array and therefore will
not be folded (since only arrays get folded).
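Note 2 is easiest to see with a hash that has a single entry. The following sketch parses the same hypothetical document twice, once with keyattr alone and once together with forcearray; the <config> root element is an assumption.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# A hash with a single entry: the case where folding depends on ForceArray.
my $xml = '<config><hash1><item expression="a&gt;1">true</item></hash1></config>';

my $unfolded = XMLin($xml, KeyAttr => ['expression']);
my $folded   = XMLin($xml, KeyAttr => ['expression'], ForceArray => ['item']);

# Without ForceArray: item => { expression => 'a>1', content => 'true' }
# With ForceArray:    item => { 'a>1' => { content => 'true' } }
print join(',', sort keys %{ $unfolded->{hash1}{item} }), "\n";
print join(',', sort keys %{ $folded->{hash1}{item} }), "\n";
```

Without forcearray the shape of the result would change as soon as the hash grew from one entry to two, which is hard to handle in calling code.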
outputfile
The default behaviour of XMLout() is to return the XML as a string. If
you wish to write the XML to a file, simply supply the filename using
the 'outputfile' option. Alternatively, you can supply an IO handle
object instead of a filename.
forcearray
This option should be set to '1' to force nested elements to be represented
as arrays even when there is only one. This option is especially useful
if the data structure is likely to be written back out as XML and the
default behaviour of rolling single nested elements up into attributes
is not desirable. If you are using the array folding feature, you should
almost certainly enable this option. If you do not, single nested
elements will not be parsed to arrays and therefore will not be candidates
for folding to a hash.
Summary
So we have a powerful engine for data manipulation. You can read more
in the man pages for these modules. XML allows you to extend the
structure and its relations as far as you need, beyond the order and
type attributes. You can combine the solutions described here to fit your
needs, for example to store different sets of configuration parameters
for multiple users, including overriding of default values. You might
also use DB files for transaction support, where each key is a user name
and each value is that user's XML configuration packed with
Storable::freeze.
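A minimal sketch of the per-user idea follows. All names and values are hypothetical, and an in-memory hash stands in for the DB file; a real program would tie the hash to a DB file instead.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical defaults and per-user overrides, shaped like XMLin() output.
my %defaults = (theme => 'light', rows => 20);
my %users    = (alice => { rows => 50 }, bob => {});

# In-memory stand-in for a DB file: user name => frozen configuration.
my %db = map { $_ => freeze($users{$_}) } keys %users;

# Effective settings: the user's values override the defaults.
sub settings_for {
    my ($user) = @_;
    my $own = exists $db{$user} ? thaw($db{$user}) : {};
    return { %defaults, %$own };
}

my $alice = settings_for('alice');
print "$alice->{theme} $alice->{rows}\n";    # prints "light 50"
```

Storing only the overrides per user keeps each record small and lets a change to the defaults reach every user who has not overridden that value.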
In any case, the solutions described above and the XML format allow you to
implement a wide range of functionality.
Vladimir Kasatkin
Dewia