Zipf's Law Microarray Normalization

(Perl Script Installation Instructions)




Introduction

The microarray normalization technique using Zipf's Law has been implemented as a Perl script.

     Download: zipfs_normalize.pl

This implementation:



Installation

First install the Perl Data Language (PDL) module. If you want the normalization method to handle missing data values, you must set the WITH_BADVAL configuration option to 1 when PDL is built. If you install a binary version (for example from an RPM under RedHat Linux), check that missing value support is included by inspecting the Bad Status in the perldl shell (see below).

CPAN install

At the command line, type:

> perl -e 'use CPAN; install PDL'

Start the PDL shell and check if missing value support is compiled in:

> perldl
perlDL shell v1.32
PDL comes with ABSOLUTELY NO WARRANTY. For details, see the file
'COPYING' in the PDL distribution. This is free software and you
are welcome to redistribute it under certain conditions, see
the same file for details.
ReadLines, NiceSlice enabled
Reading PDL/default.perldlrc...
Found docs database /usr/lib/perl5/site_perl/5.6.0/i386-linux/PDL/pdldoc.db
Type 'help' for online help
Type 'demo' for online demos
Loaded PDL v2.3.3 (supports bad values)

The last line indicates that missing value support is compiled in. If bad values are not supported, you may have to edit the perldl.conf file yourself (it should be in your .cpan directory) and recompile the module.

Compiling it yourself

Download the PDL module source code from www.cpan.org or pdl.perl.org and unzip as usual.

Edit the perldl.conf file so that

WITH_BADVAL = 1

Run perl Makefile.PL; make; make install as usual.

Checking that missing value support is compiled in

Start the perldl shell from the command line

> perldl

Print the Bad Status variable. It should be 1:

perldl> p $PDL::Bad::Status
1
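
The same check can also be run without opening the interactive shell. The following is a minimal sketch, assuming perl is on your PATH. The function name check_pdl_badval is made up for this illustration and is not part of PDL or of zipfs_normalize.pl.

```shell
#!/bin/sh
# check_pdl_badval is a hypothetical helper name used only in this sketch.
# It prints one of three messages and returns 0 only when bad values are supported.
check_pdl_badval() {
    # Can perl load the PDL module at all?
    if ! perl -MPDL -e 1 >/dev/null 2>&1; then
        echo "PDL could not be loaded"
        return 2
    fi
    # $PDL::Bad::Status is the same variable queried in the perldl shell above.
    if perl -MPDL -MPDL::Bad -e 'exit($PDL::Bad::Status ? 0 : 1)'; then
        echo "PDL supports bad values"
        return 0
    fi
    echo "PDL was built without bad value support"
    return 1
}

# "|| true" keeps a calling script alive if it runs under "set -e".
check_pdl_badval || true
```

If the last message is printed, rebuild PDL with WITH_BADVAL set to 1 as described under "Compiling it yourself".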

Instructions

The Perl script is a command line tool with 4 command line options:

Example

Using the rat_raw.txt file from the sample data sets of the original article, type on the command line:

> ./zipfs_normalize.pl -i rat_raw.txt -o rat_normalized.txt -r 1 -c 1
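
The ./ form above only works if the script file is executable. As a hedged sketch, the wrapper below passes the script to the perl interpreter explicitly and checks that the files exist first. The function name run_zipfs and the ZIPFS_SCRIPT variable are assumptions made for this illustration. The -r 1 -c 1 values are simply copied from the example above.

```shell
#!/bin/sh
# run_zipfs is a hypothetical wrapper, not part of the distributed script.
# Usage: run_zipfs rat_raw.txt rat_normalized.txt
run_zipfs() {
    # ZIPFS_SCRIPT is an assumed override; default is the current directory.
    script=${ZIPFS_SCRIPT:-./zipfs_normalize.pl}
    input=$1
    output=$2
    # Fail early with a readable message instead of a Perl error.
    if [ ! -f "$script" ]; then
        echo "script not found: $script" >&2
        return 1
    fi
    if [ ! -f "$input" ]; then
        echo "input file not found: $input" >&2
        return 1
    fi
    # Invoking perl directly avoids the need for execute permission.
    perl "$script" -i "$input" -o "$output" -r 1 -c 1
}
```

Alternatively, running chmod +x zipfs_normalize.pl once makes the ./ form usable.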

Tips

If you want the gene ranks to be included in the output file, change line 23 of the zipfs_normalize.pl script from:

$arraydata->output_data('NORMDATA', $opt_o, 'N');

to

$arraydata->output_data('NORMDATA', $opt_o, 'Y');


Last updated: 01.05.2003