Using XML in configuration files

Level: Advanced

Techniques: XML
Compatibility: XML v.1.0
Keywords: XML, configuration, settings

Developers often have to create configuration files for programs that use a wide variety of parameters. A large number of parameters, and the relations among them, can cause trouble when writing software that edits the configuration through a web interface. If the configuration file is flat, i.e. every parameter is a simple 'key - value' pair, the solution is easy to implement. The harder case is when values can be arrays, hashes, hashes of arrays and so on, not only scalars. Our aim is a format that is both easy to edit through a web interface and easy to use from a script. At first glance there are two reasonable approaches.

a) Represent the configuration file as a Perl package. The data storage format then coincides with Perl syntax, and loading the data is very simple: just put

require 'config.pl';

into your source.

b) Keep the data in a DB file where each value is a packed Perl structure (for example using Storable::freeze and Storable::thaw).

However, both of these approaches have drawbacks for our purpose.

The first is a solid enough solution, except for access via the web interface, because parsing Perl data is a task equal to that of the Perl compiler. Certainly we could place the whole config file into an eval() block and then use the already compiled values. However, the same data can be written in different ways. For example, the following arrays

@myArray1 = (1,2,3,4,5,6,7,8,9,10);

@myArray2 = (1.. 10);

are indistinguishable once Perl has evaluated them, so the original notation cannot be restored when the file is written back. The other reason not to go this way is that it works only for Perl programs.
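A minimal sketch of this point, assuming the two notations arrive as strings read from a configuration file (the strings here are inlined for illustration): after eval() only the values remain, so an editor working on the evaluated data cannot tell which notation the author used.

```perl
use strict;
use warnings;

# Two notations for the same data, as they might appear in a Perl config file.
my @myArray1 = eval '(1,2,3,4,5,6,7,8,9,10)';
my @myArray2 = eval '(1 .. 10)';

# After evaluation only the values remain; the original notation is gone.
my $same = "@myArray1" eq "@myArray2";
print $same ? "identical\n" : "different\n";
```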

The second approach would be pleasant to use; however, DB file storage rules out manual editing because DB files are binary.

So our solution is the XML format.

Let us list the advantages relevant to our needs:

1) XML is a text markup language, so it is convenient enough for manual editing (possibly with any XML editor).

2) XML's structured format makes it possible to organize complex data such as arrays, hashes, hashes of arrays, etc.

3) Many XML-oriented libraries make the data accessible from a wide range of programming languages.

Now we will list some modules that are very useful for our task:

XML::Simple, Data::Dumper, Storable

XML::Simple

Editing XML data with XML::Simple is straightforward. The XMLin function reads data from a source and returns a reference to the resulting structure. The XMLout function stores a structure (possibly modified) as an XML file.

#!/usr/bin/perl

use strict;
use warnings;
use XML::Simple;

# Read the configuration into a Perl data structure
my $xml_content = XMLin('config.xml');

#.................... some manipulations

# Write the (possibly modified) structure back to the file
XMLout($xml_content, outputfile => 'config.xml');

The only drawback is that

XMLout(XMLin('config.xml'));

will not reproduce 'config.xml' exactly as it was. But the difference can be greatly reduced by tuning XML::Simple options and by small adaptations of the XML data itself.
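As a hedged sketch of such tuning, the options below (keeproot, an empty keyattr and forcearray) are one combination that tends to keep the parsed structure stable across a read/write cycle. The sample XML string is hypothetical and inlined only to keep the example self-contained. Formatting details such as whitespace and comments are still not preserved.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin XMLout);
use Data::Dumper;

# Hypothetical configuration, inlined as a string for illustration.
my $source = '<config><server host="localhost"><port>8080</port></server></config>';

# Keep the root element, never fold arrays into hashes, always build arrays.
my %in_opts = (keeproot => 1, keyattr => [], forcearray => 1);

my $data     = XMLin($source, %in_opts);
my $xml_out  = XMLout($data, keeproot => 1, keyattr => []);
my $reparsed = XMLin($xml_out, %in_opts);

# Compare the two structures through dumps with sorted keys.
$Data::Dumper::Sortkeys = 1;
my $round_trip_ok = Dumper($data) eq Dumper($reparsed);
print $round_trip_ok ? "structure preserved\n" : "structure changed\n";
```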

Data::Dumper

Data::Dumper is a very helpful module for debugging and a convenient tool for viewing the contents of a structure in formatted form.
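For instance, a structure can be inspected as in the following sketch. The $config hash is a hypothetical stand-in for what XMLin() might return; Sortkeys and Indent are standard Data::Dumper settings.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical structure, similar to what XMLin() might return.
my $config = {
    var1   => 1,
    array1 => { items => ['Apple', 'Pear'] },
};

$Data::Dumper::Sortkeys = 1;   # stable key order between runs
$Data::Dumper::Indent   = 1;   # simple fixed indentation

my $dump = Dumper($config);
print $dump;
```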

Storable

The Storable package brings persistence to your Perl data structures.

The 'freeze' and 'thaw' functions pack a structure into a scalar and unpack it again.
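A minimal sketch of the pack and unpack cycle; the $settings structure is made up for illustration.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical nested structure to be stored.
my $settings = { colors => ['red', 'green'], limits => { max => 10 } };

my $packed   = freeze($settings);   # binary string, e.g. a value for a DB file
my $restored = thaw($packed);       # a new, independent copy of the structure

print "max = $restored->{limits}{max}\n";
```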

So let's start creating a configuration file in the form we prefer. In general, scalars can be stored in the following format

<var1>1</var1>

<var2>abc</var2>

But we advise adding at least two additional attributes: 'type' and 'order'.

a) 'order': unfortunately XML::Simple does not preserve element order, because it represents nodes as hash keys and Perl hashes do not keep their order. So if elements must appear in the web interface in the same order as in the configuration file, you can use the following sort function:

sub order {
	if (exists($config->{$a}->{order}) && exists($config->{$b}->{order})) {
		return ($config->{$a}->{order} <=> $config->{$b}->{order});
	} elsif (exists($config->{$a}->{order}) ) {
		return -1;
	} elsif(exists($config->{$b}->{order})) {
		return 1;
	}

	return 0;
}

and invoke it, for example, in a 'foreach' loop.

foreach $key (sort {order()} keys %$config){
        #...... 
}

It sorts all nodes that have an 'order' attribute and places the others at the end.
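The same idea as a self-contained sketch that also runs under 'use strict'. The $config hash is a hypothetical parsed configuration, and the helper names order_of and by_order are introduced here for illustration. Nodes that are plain scalars (elements without attributes) are treated as having no 'order'.

```perl
use strict;
use warnings;

# Hypothetical parsed configuration: two nodes have 'order', one does not.
my $config = {
    title => { order => 2, content => 'My site' },
    debug => { content => 'off' },
    host  => { order => 1, content => 'localhost' },
};

# Return the 'order' attribute of a node, or undef if the node has none.
sub order_of {
    my ($key) = @_;
    my $node = $config->{$key};
    return ref($node) eq 'HASH' ? $node->{order} : undef;
}

# Nodes with 'order' come first, sorted numerically; the rest follow.
sub by_order {
    my ($oa, $ob) = (order_of($a), order_of($b));
    return $oa <=> $ob if defined $oa && defined $ob;
    return defined $oa ? -1 : defined $ob ? 1 : 0;
}

my @ordered = sort by_order keys %$config;
print "@ordered\n";
```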

b) 'type' is a useful attribute too. It tells the program how to process the value.

For example, 'number' shows that the value is the number 255, not the string '0xff':

<scalar type="number" value="0xff"/>
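One possible way to act on the 'type' attribute is sketched below. The helper typed_value is hypothetical and handles only 'number'. The hash passed to it is the kind of structure XMLin() would typically build for the element above when it is nested inside a root element.

```perl
use strict;
use warnings;

# Convert a raw attribute value according to its 'type' attribute.
# Only 'number' is handled here; anything else is returned unchanged.
sub typed_value {
    my ($node) = @_;
    my ($type, $value) = @{$node}{qw(type value)};
    return $value unless defined $type && $type eq 'number';
    return hex($value) if $value =~ /^0x[0-9a-f]+$/i;   # hexadecimal notation
    return $value + 0;                                   # plain decimal number
}

# Structure corresponding to <scalar type="number" value="0xff"/>
my $scalar = { type => 'number', value => '0xff' };
my $number = typed_value($scalar);
print "$number\n";
```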

Arrays: most probably you will have to set the 'forcearray' option for some elements.

This option should be set to '1' to force nested elements to be represented as arrays even when there is only one.

<array1>
<items>Apple</items>
</array1>

Without forcearray => ['items'] this structure will be represented as

'array1' => {
'items' => 'Apple'
}

instead of

'array1' => {
'items' => [
'Apple',
]
}
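The difference can be checked with a short sketch. The enclosing config root element is an assumption added here so that array1 is a nested element, and keyattr is set to an empty list to keep folding out of the picture.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Hypothetical document: array1 nested inside a root element.
my $xml = '<config><array1><items>Apple</items></array1></config>';

# Default behaviour: a single nested element becomes a scalar.
my $plain = XMLin($xml, keyattr => []);

# With forcearray for 'items': always an array reference.
my $forced = XMLin($xml, keyattr => [], forcearray => ['items']);

print "plain:  ", (ref($plain->{array1}{items})  || 'scalar'), "\n";
print "forced: ", (ref($forced->{array1}{items}) || 'scalar'), "\n";
```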

Working with hashes is a bit harder than with arrays. Everything depends on the hash keys: if their names are valid XML element names, we can of course use

<hash2 type="hash">
<key1>one</key1>
<key2>two</key2>
</hash2>

and XMLin will represent it as a Perl hash without any changes.

'hash2' => {
'key1' => 'one',
'key2' => 'two',
'type' => 'hash'
}

However, in the general case we can't use a key as an XML element name.

<hash1 type = "hash">
<item expression = "a>1">true</item>
<item expression = "b!=0">false</item>
</hash1>

and the result of XMLin(...) will be:

'hash1' => {
'item' => [
{
'expression' => 'a>1',
'content' => 'true'
},

{
'expression' => 'b!=0',
'content' => 'false'
}

],

'type' => 'hash'
}

Of course this structure isn't quite like the original hash. But it is the only way to represent such hashes, because we cannot make XML::Simple skip our auxiliary elements even with contentkey, forcearray and forcecontent. These settings do, however, let you organize the data in the form your program prefers. The following code transforms this structure into a regular hash.

# Build a plain hash: expression => content
my %HASH = map { $_->{expression} => $_->{content} } @{ $xml->{hash1}->{item} };

You can set keyattr => 'expression' to reach hash keys directly.

And the content will be:

'hash1' => {
	'item' => {
		'b!=0' => {
			'content' => 'false'
		},
		'a>1' => {
			'content' => 'true'
		}
	},
	'type' => 'hash'
}

Some useful settings are described below (see the man pages for more information):

forcecontent

When XMLin() parses elements which have text content as well as attributes, the text content must be represented as a hash value rather than a simple scalar. This option allows you to force text content to always parse to a hash value even when there are no attributes.

keeproot

In its attempt to return a data structure free of superfluous detail and unnecessary levels of indirection, XMLin() normally discards the root element name. Setting the keeproot option to '1' will cause the root element name to be retained.

keyattr

This option controls the 'array folding' feature which translates nested elements from an array to a hash. It also controls the 'unfolding' of hashes to arrays.

The key attribute names should be supplied in an arrayref if there is more than one. XMLin() will attempt to match attribute names in the order supplied. XMLout() will use the first attribute name supplied when 'unfolding' a hash into an array.

Note 1: The default value for keyattr is ['name', 'key', 'id']. If you do not want folding on input or unfolding on output, you must set this option to an empty list to disable the feature.

Note 2: If you wish to use this option, you should also enable the 'forcearray' option. Without 'forcearray', a single nested element will be rolled up into a scalar rather than an array and therefore will not be folded (since only arrays get folded).

outputfile

The default behaviour of XMLout() is to return the XML as a string. If you wish to write the XML to a file, simply supply the filename using the 'outputfile' option. Alternatively, you can supply an IO handle object instead of a filename.

forcearray

This option should be set to '1' to force nested elements to be represented as arrays even when there is only one. This option is especially useful if the data structure is likely to be written back out as XML and the default behaviour of rolling single nested elements up into attributes is not desirable. If you are using the array folding feature, you should almost certainly enable this option. If you do not, single nested elements will not be parsed to arrays and therefore will not be candidates for folding to a hash.

Summary

So we have a powerful engine for data manipulation. More information can be found in the man pages for these modules. XML lets you extend the structure and its relations as far as you need, in addition to the 'order' and 'type' attributes. You can combine the solutions described here to suit your needs, for example to store different sets of configuration parameters for multiple users, with per-user values overriding the defaults. You might also use DB files for transaction support, where each key is a user name and each value is that user's configuration packed with Storable::freeze. In any case, the techniques described above together with the XML format let you implement a wide range of functionality.
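A rough sketch of the per-user idea, under several assumptions: a plain Perl hash stands in for the DB file, the parameter names are invented, and the per-user values are stored as already parsed structures rather than as XML text.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical default parameters shared by all users.
my %defaults = (theme => 'light', per_page => 20);

# Stand-in for a DB file: user name => frozen per-user settings.
my %db = (
    alice => freeze({ theme => 'dark' }),
);

# Effective settings: defaults first, then the user's own values override them.
sub settings_for {
    my ($user) = @_;
    my $own = exists $db{$user} ? thaw($db{$user}) : {};
    return { %defaults, %$own };
}

my $alice = settings_for('alice');
my $bob   = settings_for('bob');
print "alice: $alice->{theme}, bob: $bob->{theme}\n";
```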

Vladimir Kasatkin
Dewia
