2.1 Basic Perl Data Types

Before tackling references, let's review the basic Perl data types:

Scalar

A scalar value is a string or any one of several kinds of numbers such as integers, floating-point (decimal) numbers, or numbers in scientific notation such as 2.3E23. A scalar variable begins with the dollar sign $, as in $dna.
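As a brief illustration of the kinds of scalar values just described, here is a minimal sketch. Apart from $dna, the variable names and all of the values are made up for this example.

```perl
use strict;
use warnings;

# Each scalar variable holds exactly one value: a string or a number.
my $dna      = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC';  # a string
my $count    = 42;                                   # an integer
my $fraction = 0.25;                                 # a floating-point number
my $large    = 2.3E23;                               # scientific notation

print "Sequence: $dna\n";
print "Sum: ", $count + $fraction, "\n";             # numeric addition
print "Is large bigger? ", ($large > $count ? 'yes' : 'no'), "\n";
```

Perl decides from context whether to treat a scalar as a string or a number, so no type declarations are needed.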

Array

An array is an ordered collection of scalar values. An array variable begins with an at sign @, as in @peptides. An array can be initialized by a list such as @peptides = ('zeroth', 'first', 'second'). Individual scalar elements of an array are referred to by first preceding the array name with a dollar sign (an individual element of an array is a scalar value) and then following the array name with the position of the desired element in square brackets. Thus the first element of the @peptides array is referenced by $peptides[0] and has the value 'zeroth'. (Note that array elements are given the positions 0, 1, 2, ..., n-1, where n is the number of elements in the array.)
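The indexing rules above can be sketched as follows, using the @peptides array from the text. The helper variables $first, $last, and $n are names chosen for this illustration only.

```perl
use strict;
use warnings;

my @peptides = ('zeroth', 'first', 'second');

# An individual element is a scalar, so it is written with a dollar sign.
my $first = $peptides[0];            # position 0 holds 'zeroth'

# $#peptides is the index of the last element, which is n-1.
my $last  = $peptides[$#peptides];   # position 2 holds 'second'

# In scalar context, an array yields its number of elements, n.
my $n     = scalar @peptides;

print "First element: $first\n";
print "Last element:  $last\n";
print "Element count: $n\n";
```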

Recall that printing an array within double quotes causes the elements to be separated by spaces; without the double quotes, the elements are printed one after the other without separations. This snippet:

@pentamers = ('cggca', 'tgatc', 'ttggc');

print "@pentamers", "\n";
print @pentamers, "\n";

produces the output:

cggca tgatc ttggc
cggcatgatcttggc

Hash

A hash is an unordered collection of key/value pairs, in which each scalar key is associated with a scalar value. A hash variable begins with the percent sign %, as in %geneticmarkers. A hash can be initialized like an array, except that each pair of scalars is taken as a key followed by its value, as in:
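A minimal sketch of such an initialization is given here. Only the hash name %geneticmarkers and the pair 'hairless' => 'no' are taken from the text; the other two pairs are assumptions added to fill out the example.

```perl
use strict;
use warnings;

# Each key is followed by its value.
# Only 'hairless' => 'no' comes from the text; the other pairs are illustrative.
my %geneticmarkers = ('curly' => 'yes', 'hairless' => 'no', 'pointy' => 'yes');

# A hash is unordered, so sort the keys to get a repeatable listing.
foreach my $marker (sort keys %geneticmarkers) {
    print "$marker: $geneticmarkers{$marker}\n";
}
```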

The => symbol is just a synonym for a comma that makes it easier to see the key/value pairs in such lists.[1] An individual scalar value is retrieved by preceding the hash name with a dollar sign (an individual value is a scalar value) and following the hash name with the key in curly braces, as in $geneticmarkers{'hairless'}, which, because of how it's initialized, has the value 'no'.

[1] It also forces the word on its left side to be interpreted as a string, so a simple key can be written there without quotes.
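To make the comma and => equivalence concrete, the following sketch builds the same hash both ways. The hash names %with_commas and %with_arrows and the 'curly' pair are assumptions made for this illustration.

```perl
use strict;
use warnings;

# A plain list: keys and values simply alternate.
my %with_commas = ('hairless', 'no', 'curly', 'yes');

# The same hash with =>, which also quotes the simple word on its left.
my %with_arrows = (hairless => 'no', curly => 'yes');

# Retrieve a single value: dollar sign, hash name, key in curly braces.
print "With commas: $with_commas{'hairless'}\n";
print "With arrows: $with_arrows{'hairless'}\n";
```

Both lookups return the same value; the => form only makes the pairing easier to read.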