Wednesday, 5 October 2011

Perl (working with files)

Working with Files

This tutorial shows you how to create files and write to them, how to read them, how to delete them, and how to scan directories on your server. It also covers permissions and full paths, and includes a handy script for working out full paths. It also touches on loops, lists, and appending strings to variables. A lengthy tutorial but well worth reading!
Perl's filesystem functions let you manipulate files and directories on the server from within your CGI scripts. This means that you can:
  • store data in files on the server
  • allow visitors to upload files to your site
  • create message boards and guestbooks
...and all sorts of other useful stuff!
In this tutorial, we'll show you how to create files and write to them, how to read from them, how to delete them, and how to scan directories (folders) so that you can see what files are on the server.
We'll also discuss file and directory permissions, and the concept of full paths. At the end of the tutorial you'll find a useful script to help you work out your full paths (essential for working with files).
This tutorial will also touch on some other Perl concepts such as while loops, lists, and appending strings to variables. It's quite a long tutorial, but worth sticking with. By the end of it you'll have learnt a lot of useful concepts, and you'll be able to write your own CGI scripts to read and write files on your server!

Creating a file and writing to it

In order to access a file in Perl, we need to use the open function. The function takes the general form:

open filehandle, expression
The file handle, filehandle, is a bit like a variable - it's a reference to the file that we've just opened. You can use any name you like for the file handle, but conventionally the name is upper-case. Once the file is opened, you can use the file handle to access (write to or read from) the file.
The expression after the file handle, expression, refers to the thing we're trying to open. In this case, we're working with files, so the file name would go in here.
For example, let's suppose we want to create and write to a plain text file called colours.txt. We might use the following line of code to create the file:

open COLOURS, ">colours.txt";
Did you notice the > before the file name? This tells Perl that we want to open the file for writing. (We'll cover appending to and reading from a file later on.)
Note that opening the file for writing automatically creates the file. It's also important to realise that it will overwrite the file if it already exists! (If you need to add stuff to an existing file, you can use the append option, described below.)
Now that we've opened our file, let's write something to it! To do this, we can use the regular print function that we normally use for displaying stuff to the user:

print COLOURS "My favourite colour is red\n";
By including the file handle, COLOURS, before the string to print, we're telling the print function to write the string to the opened file. We also add a newline character (\n) to the end of the string, to indicate that we've printed a whole line.

Closing the file

Once we've finished working with a file, we close it. This tells Perl that we've finished with the file and no longer need the file handle:

close COLOURS;
And that's how to create and write to files!
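The steps above do not check whether the file could actually be opened. As a minimal sketch, assuming a reasonably modern Perl, the same create-and-write sequence can be written with the three-argument form of open, a lexical file handle (here called $fh, a name chosen for illustration), and an error check on each step:

```perl
use strict;
use warnings;

# File name taken from the example in the text.
my $file = "colours.txt";

# Three-argument open with a lexical file handle ($fh).
# The mode '>' creates the file, or empties it if it already exists.
open(my $fh, '>', $file) or die "Cannot open $file for writing: $!";

# print returns true on success, so it can be checked too.
print {$fh} "My favourite colour is red\n" or die "Cannot write to $file: $!";

# close can fail as well (for example, on a full disk).
close($fh) or die "Cannot close $file: $!";
```

If open fails, the special variable $! holds the reason (such as "Permission denied"), which is usually the quickest way to diagnose a CGI script that silently writes nothing.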

Appending to a file

As well as writing to a file, we can append to it too. When we open a file for appending, the file is not overwritten; instead, any new stuff we print to the file is added onto the end.
To open a file for appending, we place two >s before the file name, as follows:

open COLOURS, ">>colours.txt";
We can then print to the file as before:

print COLOURS "My second favourite colour is green\n";
...and finally, we close the file:

close COLOURS;
For the above example, the resulting file would now look like this:

My favourite colour is red
My second favourite colour is green
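
As a rough sketch of the difference between the two modes, the following writes the first line with '>' and then adds the second with '>>'. It uses the three-argument form of open and lexical file handles ($out and $log are names picked for this example), which is an alternative to the bareword style shown above:

```perl
use strict;
use warnings;

my $file = "colours.txt";

# Start from a known state: '>' creates or truncates the file.
open(my $out, '>', $file) or die "Cannot open $file for writing: $!";
print {$out} "My favourite colour is red\n";
close($out) or die "Cannot close $file: $!";

# '>>' opens for appending: existing content is kept and new
# output is added at the end. The file is created if it is missing.
open(my $log, '>>', $file) or die "Cannot open $file for appending: $!";
print {$log} "My second favourite colour is green\n";
close($log) or die "Cannot close $file: $!";
```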

Reading from a file

To open a file for reading, we simply give the file name without any > signs before it, as follows:

open COLOURS, "colours.txt";
Once we've opened the file, we can read stuff from it using the line input operator <> (also known as the angle operator). In normal usage, the line input operator returns the next line from our file:

$line1 = <COLOURS>;
$line2 = <COLOURS>;
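In the above example, the $line1 variable would now contain the string "My favourite colour is red\n", while $line2 would now contain "My second favourite colour is green\n". Note that each line keeps its trailing newline character; you can strip it off with the chomp function.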
As always, once we've finished with the file, we close it:

close COLOURS;
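
Here is a small sketch of reading lines one at a time, assuming the two-line colours.txt from the earlier examples (the script recreates it so that it runs on its own). The variable names and the use of a lexical file handle are choices made for this example:

```perl
use strict;
use warnings;

my $file = "colours.txt";

# Create the sample file so that the sketch is self-contained.
open(my $out, '>', $file) or die "Cannot open $file for writing: $!";
print {$out} "My favourite colour is red\n";
print {$out} "My second favourite colour is green\n";
close($out) or die "Cannot close $file: $!";

# '<' opens the file for reading.
open(my $in, '<', $file) or die "Cannot open $file for reading: $!";
my $line1 = <$in>;
my $line2 = <$in>;
my $line3 = <$in>;    # undef: there is no third line
close($in);

# Each line still ends in "\n"; chomp removes it.
chomp($line1, $line2);

print "First line:  $line1\n";
print "Second line: $line2\n";
print "No third line\n" unless defined $line3;
```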
There are a lot of other powerful ways to read files using Perl. Below you'll find a few useful tips for reading files.

While loop magic

There's a really easy way to read the entire file into one variable. This uses a while loop to move through the whole file a line at a time, appending each line to a variable as it goes:

while ( <COLOURS> )
{
  $myfile = $myfile . $_;
}
A while loop, if you haven't come across one before, simply repeats the actions inside the curly braces ({ }) for as long as the test inside the parentheses is true. The line input operator returns an undefined value when there are no more lines to read, and Perl treats this as false in the loop test, so the loop exits when we've reached the end of our file.
The magic comes in because, when you use the line input operator in a while loop like this, the next line of the file is automatically assigned to the special variable $_. (We'll cover special variables in another tutorial at some point.)
Once inside the loop, we can then append the next line (which is now in our special variable $_) to our scalar variable $myfile. To append a string to a variable, we use the concatenation operator (.):

$myfile = $myfile . $_;
Note that there's a neater, short-hand way of writing the above line:

$myfile .= $_;
This does exactly the same thing as the line above.
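
For comparison, here is a sketch of the same loop with a named loop variable instead of $_. It assumes the two-line sample file from earlier (recreated here so the script runs on its own), and the names $in, $line and $linecount are invented for the example:

```perl
use strict;
use warnings;

my $file = "colours.txt";

# Create the sample file so that the sketch is self-contained.
open(my $out, '>', $file) or die "Cannot open $file for writing: $!";
print {$out} "My favourite colour is red\n";
print {$out} "My second favourite colour is green\n";
close($out) or die "Cannot close $file: $!";

open(my $in, '<', $file) or die "Cannot open $file for reading: $!";

my $myfile    = '';    # start with an empty string
my $linecount = 0;

# Each pass reads the next line into $line; the loop ends
# when the line input operator returns undef at end of file.
while (my $line = <$in>) {
    $myfile .= $line;
    $linecount++;
}
close($in);

print "Read $linecount lines:\n$myfile";
```

Initialising $myfile to an empty string before the loop avoids an "uninitialized value" warning when warnings are switched on.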

Reading into a list

We can also read the file in one go into a list variable. A list variable differs from a scalar variable in that it can hold lots of different values at once (similar to an array in JavaScript). To specify a list variable, we use the @ character rather than the $ character - for example:

@mylist = ( 1, 2, 3 );
If we assign the value of our line input operator to a list variable, then the whole file is read into the list variable in one go, one line per list item:

open COLOURS, "colours.txt";
@mylist = <COLOURS>;
close COLOURS;
Pretty neat!

Accessing the list values

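To retrieve a single value from the list, we use square brackets. Because a single list item is a scalar value, we write it with the $ character rather than the @ character:

$line1 = $mylist[0];
$line2 = $mylist[1];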
Note that the first element of the list is at position 0, and the second element is at position 1. And so on.
You can also use a foreach loop to go through the list. As the loop works its way through the list, the specified scalar variable (in this case $listitem) is linked to the current item in the specified list (in this case @mylist):

foreach $listitem ( @mylist )
{
  print $listitem;
}
This example would output each line of the file (i.e. each element of @mylist) to the screen or Web page.
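
Putting the list techniques together, the following sketch reads the sample file into a list, strips the newlines, and prints each line with its position. It assumes the two-line colours.txt used throughout (recreated here), and the variable names are illustrative only:

```perl
use strict;
use warnings;

my $file = "colours.txt";

# Create the sample file so that the sketch is self-contained.
open(my $out, '>', $file) or die "Cannot open $file for writing: $!";
print {$out} "My favourite colour is red\n";
print {$out} "My second favourite colour is green\n";
close($out) or die "Cannot close $file: $!";

# In list context the line input operator returns every line at once.
open(my $in, '<', $file) or die "Cannot open $file for reading: $!";
my @mylist = <$in>;
close($in);

# chomp works on a whole list, removing the newline from each item.
chomp(@mylist);

# A list used in scalar context gives the number of items.
my $count = scalar @mylist;
print "The file has $count lines\n";

# $#mylist is the index of the last item.
for my $i (0 .. $#mylist) {
    print "Line ", $i + 1, ": $mylist[$i]\n";
}
```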

Deleting a file

Deleting a file in Perl is simple. We just use the unlink function. For example, to delete a file called colours.txt in the current directory:

unlink "colours.txt";
The unlink function returns the number of files successfully deleted. So if unlink returns zero, you know that your file couldn't be deleted (maybe the script didn't have permission to delete it?).
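
As a short sketch of checking that return value, the script below creates a throwaway file, deletes it, and reports what happened. The file name matches the example above; the variable names are made up for illustration:

```perl
use strict;
use warnings;

my $file = "colours.txt";

# Create a file to delete, so that the sketch is self-contained.
open(my $out, '>', $file) or die "Cannot open $file for writing: $!";
close($out) or die "Cannot close $file: $!";

# unlink returns the number of files it managed to delete.
my $deleted = unlink $file;

if ($deleted == 1) {
    print "Deleted $file\n";
}
else {
    # $! holds the reason for the failure, e.g. "Permission denied".
    print "Could not delete $file: $!\n";
}
```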

Reading a directory

It's often useful to be able to read the contents of a directory (folder) on the server, to see what files or folders are inside it. To do this, we use the opendir, readdir and closedir functions, as follows:

opendir MYDIR, ".";
@contents = readdir MYDIR;
closedir MYDIR;
In the above example, the list variable @contents would now contain a list of all the files and folders in the current directory (the "." passed to opendir tells Perl to read the current directory). Notice that MYDIR is a directory handle, in the same way as COLOURS above was a file handle.
A slightly improved version removes the "." and ".." entries from the directory listing. These dots refer to the current and parent directories, and usually just get in the way when returning a directory listing. To remove them, you can use the following amended readdir line:

@contents = grep !/^\.\.?$/, readdir MYDIR;
This uses Perl's grep function to weed out any items in the list that are exactly "." or "..".
Once you have the contents of your chosen directory in your list variable, you can use a loop such as the foreach loop above to process each item in the list, e.g.:

foreach $listitem ( @contents )
{
  print $listitem;
}
You can use a file test operator to determine if the item you're working with is a file or a directory (quite useful to know!). To do this, use the -d file test operator. For example:

if ( -d $listitem )
{
 print "It's a directory!";
}
else
{
 print "It's a file!";
}
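
One thing to watch for: readdir returns bare names, not paths, so the -d test only works as written when the directory being listed is also the current directory. The sketch below, which assumes the core File::Spec module, joins the directory and the item name before testing. The variable names ($dir, $dh, $path) are illustrative:

```perl
use strict;
use warnings;
use File::Spec;

# The directory to list; "." is the current directory, as in the text.
my $dir = ".";

opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";

# Keep everything except the "." and ".." entries.
my @contents = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
closedir($dh);

for my $item (sort @contents) {
    # Build a path that includes the directory, so the test
    # still works when $dir is not the current directory.
    my $path = File::Spec->catfile($dir, $item);
    my $kind = (-d $path) ? "directory" : "file";
    print "$item is a $kind\n";
}
```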

Setting permissions

The user account that your CGI scripts run under is often not the same as your FTP user account (the account used to create your files and directories). For this reason, it is usually necessary to give fairly wide permissions to your data files, and the directory that they're stored in, so that your CGI script can read, write, and create files.

Permissions on UNIX/Linux servers

For any data files that you create or upload with FTP, you should set the permissions of the file to 666 (rw-rw-rw-). This will allow any user, including the CGI script, to read from and write to the file.
For data directories where the CGI script will need to create, read and write files, you should set the permissions of the directory to 777 (rwxrwxrwx). On a shared server, it is a bad idea to set your CGI scripts directory (cgi-bin) to 777, as anyone can then potentially create and run CGI scripts on your Web site!
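
Permissions are normally set with your FTP program, but Perl also has a built-in chmod function. As a sketch only, and assuming a UNIX/Linux server where the script owns the file, this sets a data file to 666 and reads the result back with stat:

```perl
use strict;
use warnings;

my $file = "colours.txt";

# Create a file to work on, so that the sketch is self-contained.
open(my $out, '>', $file) or die "Cannot open $file for writing: $!";
close($out) or die "Cannot close $file: $!";

# The mode must be an octal number, hence the leading zero.
# chmod returns the number of files it changed.
chmod(0666, $file) == 1 or die "Cannot change permissions on $file: $!";

# Element 2 of stat's result is the mode; mask off the file type bits.
my $mode = (stat $file)[2] & 07777;
printf "Permissions on %s are now %04o\n", $file, $mode;
```

Writing 666 without the leading zero is a common slip: Perl would treat it as decimal 666, which is a different set of permission bits.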

Permissions on Windows servers

Usually you need to set your data files and directories to be readable and writable by the Web server user (this user's name usually begins with IUSR_...). Sometimes you may need to set them to be readable and writable by everyone. Make sure you don't do this on your CGI script directory!

About full paths and CGI scripts

In the examples above, we've assumed that the files and directories we're working with are in the working directory. Usually, if you're running a Perl script directly, then the working directory will be the directory you run the script from.
However, things are usually different when running CGI scripts. The working directory might be anywhere on the server's hard drive - it is rarely the directory containing your CGI script! Therefore we can't assume that the files we need are in the working directory. In addition, some shared Web servers will not let you write to files within your CGI scripts directory.
So, when we want to create files using our CGI scripts, we need to:
  1. Decide on a suitable directory to put our files in
  2. Use the full path to the directory in our CGI script
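
If you're curious which working directory your CGI scripts actually get, a small test script can report it. This is a sketch that assumes the core Cwd module is available on your server:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Cwd qw(getcwd);

# getcwd returns the current working directory as a full path.
my $cwd = getcwd();

print "Content-type: text/plain\n\n";
print "The working directory is: $cwd\n";
```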

A suitable directory

This will ideally be a directory that isn't under your website's document root (after all, you may not want just anyone being able to see your data files from a Web browser!). Often the cgi-bin directory itself is a good place, if your Web server will let your CGI scripts read and write files in there. You could also use FTP to create a sub-directory in there, e.g. cgi-bin/data/, which is neater and tidier. For example, your site structure might look like this:

 |
 +- cgi-bin/ +
 |           |
 |           +- data/ +
 |                    |
 |                    +- colours.txt
 |
 +- htdocs/ +
 |          |
 |          +- index.html

etc.
If you're not sure of the full path to your cgi-bin directory, see the next section for a handy script to find this out for you.
If your scripts can't create files in your cgi-bin or cgi-bin/data directory (and often the only way to tell is to try it, or ask your hosting company's tech support), then you'll need to create a directory somewhere else. A good idea is to create a data directory in your home directory (this is the directory that you see when you first connect to your server with FTP). For example:

 |
 +- data/ +
 |        |
 |        +- colours.txt
 |
 +- cgi-bin/
 |
 +- htdocs/ +
 |          |
 |          +- index.html

etc.
If you have no choice but to put your data directory under your document root (htdocs in the above example), it's a good idea to use a directory name that people won't be able to guess easily. Alternatively, password-protect the directory. That way, hackers won't be able to see your data files.

The full path

The full path is the path from the root (top level) of the Web server's hard drive. On a UNIX or Linux server, it might be something like:

/home/username/cgi-bin/data/
For a Windows server, it may be:

C:\InetPub\wwwroot\username\cgi-bin\data\
We need to use the full path to our data directory in our CGI script, because it's the easiest way for our script to find the directory. (As we've mentioned above, the current directory could be anywhere for a CGI script, so we can't rely on that.)
So, our script to create a file and write to it might now look something like this:

open COLOURS, ">/home/username/cgi-bin/data/colours.txt";
print COLOURS "My favourite colour is red\n";
close COLOURS;
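
It can be tidier to keep the data directory in one variable and build file paths from it. The sketch below assumes the core File::Spec module. So that it runs anywhere, it uses the system's temporary directory as a stand-in for the data directory; on a real server you would put your own full path (such as the /home/username/cgi-bin/data example above) in $data_dir instead:

```perl
use strict;
use warnings;
use File::Spec;

# Stand-in for the data directory (an assumption for this sketch).
# On your server, replace this with the full path to your data directory.
my $data_dir = File::Spec->tmpdir();

# catfile joins the parts with the right separator for the platform.
my $path = File::Spec->catfile($data_dir, "colours.txt");

open(my $fh, '>', $path) or die "Cannot open $path for writing: $!";
print {$fh} "My favourite colour is red\n";
close($fh) or die "Cannot close $path: $!";

print "Wrote to $path\n";
```

Keeping the directory in a single variable means only one line needs changing if you move the script to another server.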

Determining the full path

So how do you find out the full path to your website's files? Your Web hosting company should be able to tell you - check the help/support area on their website, or contact their technical support.
You could also try this handy script. It can often work out the full path to itself. From this path, you should be able to work out the full path to your data directory:

#!/usr/bin/perl

$fullpath = ( $ENV{'SCRIPT_FILENAME'} || $ENV{'PATH_TRANSLATED'} || $0 );

print "Content-type: text/plain\n\n";
print "The full path to this script is: $fullpath";
The script looks at two Web server environment variables, and also at $0 (the PROGRAM_NAME special variable). If any of these contains the full path to the current script, it will display it on the screen.
Upload the script to your server (make sure you change the path to Perl on line 1 if necessary, and also set the correct permissions for the script - usually 755), then try browsing to the URL of the script in your Web browser. If you're lucky, it will display the full path to your script!
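
As a possible extension of that script, the sketch below also works out the directory containing the script and a candidate data directory beside it. It assumes the core File::Basename and File::Spec modules, and it assumes your data directory is a sub-directory called data next to the script, which may not match your own layout:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Spec;

# Same sources as the script above, except that $0 is made absolute
# in case it only holds a relative path.
my $fullpath = $ENV{'SCRIPT_FILENAME'}
            || $ENV{'PATH_TRANSLATED'}
            || File::Spec->rel2abs($0);

# The directory that contains the script.
my $scriptdir = dirname($fullpath);

# Assumption: the data directory sits beside the script.
my $datadir = File::Spec->catdir($scriptdir, "data");

print "Content-type: text/plain\n\n";
print "The full path to this script is: $fullpath\n";
print "The script's directory is: $scriptdir\n";
print "A data directory beside it would be: $datadir\n";
```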

The next step

We've gone through all the basics of creating, reading and writing files, and reading directories. We've also looked at setting the correct permissions, and explained the concept of full paths. With these techniques, you should be able to start writing some CGI scripts that store stuff in files on your Web server. Good luck!
