Hi all! As you may know, I’ve started studying BioInformatics. Part of this includes learning Perl, a very versatile and widely used language to help manipulate data.
Now what am I hoping to accomplish with these posts? Hopefully some sort of understanding in my head about how Perl works and how it can be adapted. If these notes help you, please post a comment.
These notes are written for Linux-based systems, but the examples will also work on Windows; Strawberry Perl is one way to get Perl there.
Basics of Perl – Scalars and Operators
This is probably one of the wordier Perl posts. Don’t worry though: take your time and read through it.
Scalars – $
Scalars are variables that store a single value, such as a number or a string of text. They are one of the most basic things to know in Perl, as they are used almost everywhere. A scalar’s name begins with a dollar sign “$”, followed by a letter or underscore and then any mix of letters, digits and underscores. A semicolon “;” marks the end of each statement and must be included. For example:
$ascalar = 4; #Scalars can be integers or decimals.
$catname = "Socks"; #Or they can be literal strings of text.
$hex = 0x1A; #Hex numbers too (this is 26).
$binary = 0b1100; #Binary numbers (this is 12)...
$bignumber = 1e12; #Scientific notation as well.
$a_scalar = 642; #Underscores are OK to use in scalar names.
$fruit = "Pear";
$apple = $fruit; #You can copy the value of one scalar into another.
$a_scalar = 123 #No semicolon: a syntax error as soon as another statement follows.
Be careful when naming variables in Perl: some names are already used by Perl’s own special variables (for example $_, $0, $a and $b), which are listed in the perlvar documentation. This applies to all variables, not just scalars.
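To see what the different number formats above actually store, a minimal sketch like the following may help. The variable names are only illustrative, and it uses use strict, use warnings and my, which are explained near the end of these notes.

```perl
use strict;
use warnings;

# However a number is written, Perl stores its value and prints it in decimal.
my $hex       = 0x1A;    # hexadecimal literal
my $binary    = 0b1100;  # binary literal
my $bignumber = 1e12;    # scientific notation

print "hex: $hex\n";              # hex: 26
print "binary: $binary\n";        # binary: 12
print "bignumber: $bignumber\n";  # bignumber: 1000000000000
```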
So how can we use scalars in a simple program?
Open your favourite text editor (please don’t use Word, I recommend Notepad++ for Windows and Geany or Sublime Text 2 for Linux) and try this:
$hello = "Hello";
$world = "World!";
print "$hello $world";
Save this somewhere with the extension .pl and open up your terminal. Change directory to the folder and run your script with: perl scriptname.pl.
Probably the simplest of all programs: the output is just “Hello World!”.
Syntax: print
A word on the print statement while we’re at it. print writes whatever you give it (as the name suggests) to the terminal.
Using different quotation marks will yield different results:
$username = "Joe";
print "Hello there $username."; #This will print "Hello there Joe."
print 'Hello there $username.'; #However this will print "Hello there $username."
Double-quoted strings treat some characters specially (e.g. \, ", $ and @), so these must be escaped if you want them printed literally. Escaping is simple: just place a backslash in front of the special character. For example, to print a double quote inside a double-quoted string: "Hello \"World\"". Punctuation such as . and , doesn’t need to be escaped.
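The escaping rules can be tried out with a short sketch such as the one below. The strings and variable names are made up for illustration, and each string is stored in a scalar first so it is easy to inspect.

```perl
use strict;
use warnings;

my $username = "Joe";

# Escaped double quotes are printed literally; $username is still interpolated.
my $quoted = "She said \"Hello\" to $username.";

# An escaped dollar sign stops interpolation inside double quotes.
my $literal = "The variable is called \$username.";

# Single quotes interpolate nothing, so $username and \n stay exactly as typed.
my $single = 'Raw text: $username\n';

print "$quoted\n";   # She said "Hello" to Joe.
print "$literal\n";  # The variable is called $username.
print "$single\n";   # Raw text: $username\n
```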
Printing like this is all well and good, but several print statements in a row produce a single chunk of text rather than separate lines. Perl provides escape sequences for a newline (\n) and a tab (\t), which are interpreted inside double-quoted strings:
print "Hello World";
print "Hello World";
print "Hello World";
#This would print "Hello WorldHello WorldHello World"
print "Hello World\n";
print "\tHello World\n";
print "\t\tHello World\n";
#However, this would print:
#"Hello World
#        Hello World
#                Hello World
#"
Operators
Integers and real numbers
Scalars are good at storing numbers, but say you want to do something with those numbers. Perl supports the usual arithmetic operators:
$no1 = 23;
$no2 = 5;
#Addition +
print $no1 + $no2; #Would add 23 and 5 and print the result (28).
#Subtraction -
print $no1 - $no2; #Would print 18.
#Multiplication *
print $no1 * $no2; #Prints 115.
#Division /
print $no1 / $no2; #Prints 4.6
#Modulus %
print $no1 % $no2; #Prints 3.
#Powers **
print $no1 ** $no2; #Prints 6436343.
#Brackets are fine too, but wrap the whole expression in its own pair:
#print (...)/(...) would treat only the first bracket as print's argument list.
print (($no1 + $no2)/(2*$no2)); #Prints 2.8
$no3 = $no2 + $no1 / 5*$no1; #Perl follows BODMAS before the =, with / and * working left to right: 5 + (23/5)*23 = 110.8
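As a small sketch of how brackets change the order of evaluation, the same two numbers can be combined with and without them. The variable names $no_brackets and $with_brackets are invented for this example.

```perl
use strict;
use warnings;

my $no1 = 23;
my $no2 = 5;

# Without brackets: / and * are done first (left to right), then +.
my $no_brackets = $no1 + $no2 / 2 * $no2;        # 23 + ((5 / 2) * 5)

# With brackets: the sum and the product are worked out before dividing.
my $with_brackets = ($no1 + $no2) / (2 * $no2);  # 28 / 10

print "Without brackets: $no_brackets\n";  # Without brackets: 35.5
print "With brackets: $with_brackets\n";   # With brackets: 2.8
```

Storing the result in a scalar before printing also sidesteps a common surprise: print (1+2)*3 treats (1+2) as the whole argument list of print, so it prints 3.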
In loops (coming later), adjusting numbers by one can be very useful for counting things.
#Each comment below assumes $no1 is 23 before the line runs.
#Incrementing
print ++$no1; #Adds 1 BEFORE printing. Prints 24.
print $no1++; #Adds 1 AFTER printing. Prints 23, then $no1 becomes 24.
#Decrementing
print --$no1; #Subtracts 1 BEFORE printing. Prints 22.
print $no1--; #Subtracts 1 AFTER printing. Prints 23, then $no1 becomes 22.
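The difference between the prefix and postfix forms is easiest to see when the returned value is stored. The following is a minimal sketch that runs the four operations one after another on a single counter, so each line starts from the value left by the previous one; the variable names are illustrative.

```perl
use strict;
use warnings;

my $count = 23;

my $pre  = ++$count;  # $count becomes 24, and 24 is returned
my $post = $count++;  # 24 is returned, then $count becomes 25

print "pre: $pre, post: $post, count: $count\n";  # pre: 24, post: 24, count: 25

my $before = --$count;  # $count becomes 24, and 24 is returned
my $after  = $count--;  # 24 is returned, then $count becomes 23

print "before: $before, after: $after, count: $count\n";  # before: 24, after: 24, count: 23
```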
Relationships between numbers can be checked with the comparison operators:
$no1 = 23;
$no2 = 5;
$no3 = 5;
#Inequalities > >= < <=
print $no1 > $no2; #Will check if $no1 is greater than $no2 and outputs 1 if true or nothing if false.
print $no1 < $no2; #Prints nothing (false).
#<= and >= can be used for less than or equal/greater than or equal.
#Equality ==/!=
print $no1 == $no2; #Prints nothing (false).
print $no2 == $no3; #Prints 1 (true).
print $no1 != $no2; #Prints 1 (true).
#Comparison <=>
print $no1 <=> $no2; #Is $no1 bigger than $no2? Prints 1.
print $no2 <=> $no1; #Prints -1 as it is smaller.
print $no2 <=> $no3; #Prints 0 since they are equal.
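Because a false comparison prints nothing, it can look as if the print statement did not run at all. A minimal sketch, with invented variable names, that makes the true and false values visible:

```perl
use strict;
use warnings;

my $no1 = 23;
my $no2 = 5;

my $true  = ($no1 > $no2);  # a true comparison gives 1
my $false = ($no1 < $no2);  # a false comparison gives an empty string

# Square brackets make the empty string visible in the output.
print "true: [$true]\n";    # true: [1]
print "false: [$false]\n";  # false: []

# <=> gives -1, 0 or 1, which is easy to store and test later.
my $order = ($no2 <=> $no1);
print "order: $order\n";    # order: -1
```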
String operators
Strings have their own set of operators. They work like the numerical operators, but compare strings character by character in alphabetical order:
$a = "a";
$b = "b";
$c = "c";
$a2 = "a";
#Note: Perl's sort uses $a and $b itself, so pick other names in real scripts.
#Greater than (or equal to) gt/ge
print $b gt $a; #Greater than. Prints 1 (true).
print $b ge $a; #Greater than or equal to. Prints 1 (true).
#Less than (or equal to) lt/le
print $c lt $a; #Less than. Prints nothing (false): c comes after a.
print $a le $b; #Less than or equal to. Prints 1 (true).
#Equality eq/ne
print $a eq $a2; #Equal to. Prints 1 (true): a equals a.
print $a ne $c; #Not equal to. Prints 1 (true).
#Comparisons cmp
print $a cmp $a2; #Prints 0. a is the same as a.
print $c cmp $a; #Prints 1. c is more than a.
print $a cmp $b; #Prints -1. a is less than b.
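Choosing between the numeric and the string operators matters when a scalar holds digits. As a minimal sketch (the values 10 and 9 are picked only to show the effect), the same pair compares differently depending on the operator:

```perl
use strict;
use warnings;

my $x = "10";
my $y = "9";

my $as_numbers = ($x <=> $y);  # 10 is greater than 9, so this is 1
my $as_strings = ($x cmp $y);  # the character "1" sorts before "9", so this is -1

print "numeric: $as_numbers\n";  # numeric: 1
print "string: $as_strings\n";   # string: -1
```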
Repetition and Concatenation
If you need to repeat a scalar several times or join scalars into one string, the repetition operator ($scalar x n) and the concatenation operator, a period ($scalar1 . $scalar2), will help you:
$cat1 = "Socks";
$cat2 = "Milly";
$cat3 = "Poppy";
$n1 = 1;
$n2 = 2;
$n3 = 3;
#Concatenation
print $cat1.$cat2.$cat3; #Will print out SocksMillyPoppy.
print $n1.$n2.$n3; #Prints out 123.
#Repeating
print $n1 x 3; #Repeats $n1 3 times: prints 111.
print $cat1 x 3; #Repeats $cat1 3 times: prints SocksSocksSocks.
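Both operators come in handy when building test data. The sketch below is only an illustration: the sequence fragments and variable names are hypothetical and not taken from any real dataset.

```perl
use strict;
use warnings;

# Hypothetical sequence fragments, chosen only for illustration.
my $start = "ATG";
my $stop  = "TAA";

my $poly_a   = "A" x 5;                   # AAAAA
my $sequence = $start . $poly_a . $stop;  # joined left to right

# x binds more tightly than ., so the dashes are repeated before "\n" is added.
my $rule = "-" x 20 . "\n";

print $rule;
print "$sequence\n";  # ATGAAAAATAA
print $rule;
```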
A note on Syntax and good practice
When writing a Perl script, it is generally good practice to head your script with the following:
#!/usr/bin/perl -w
use strict;
Now what do these two lines do?
The first line tells Linux where to find the Perl interpreter, and -w instructs Perl to warn about questionable code, such as using a variable that has no value. The path /usr/bin/perl may differ depending on your distribution or on Windows; run which perl in your terminal to find the correct path. This line also lets you run the script directly rather than through the Perl interpreter: make it executable with chmod u+x filename.pl, then run it with ./filename.pl from the terminal.
The second line turns on Perl’s strict mode. Why is this important? When writing Perl it is all too easy to mistype a variable name, and Perl would otherwise silently create a new, empty variable. With use strict, every variable must be declared with the word my before it is first used, so a mistyped name is reported as an error:
#!/usr/bin/perl -w
use strict;
my $cat1 = "Socks"; #Correct
$cat2 = "Milly"; #Compilation fails: $cat2 was never declared with my
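To watch strict catch a mistyped name without the whole script dying, the faulty line can be compiled inside a string eval. This is only a sketch for experimenting: the variable names are invented, and the string eval is there purely so the error message can be captured and printed.

```perl
#!/usr/bin/perl
use strict;
use warnings;  # the pragma form of the -w switch

my $cat_name = "Socks";

# The snippet below misspells the variable as $cat_nmae. It is compiled inside
# a string eval only so that this script can report the error and carry on.
my $ok    = eval q{ my $greeting = "Hello $cat_nmae"; 1 };
my $error = $@;

print "strict caught the typo:\n$error" unless $ok;
```

In an ordinary script the same mistake stops Perl before any line runs, which is exactly the early warning you want.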
With all that, your brain is probably melting. The important thing is to practise until you get the hang of things. Next up are arrays and hashes, but that’s for another time.
Inspired by lecture notes from Dr Shepherd and from PerlDocs.