Friday, October 14, 2011

How do you use a constant in Perl? *PI = \3.14; It works especially well for return status codes

DISCLAIMER: THESE PAGES ARE STILL UNDER CONSTRUCTION. NO CODE EXAMPLE HAS BEEN TESTED YET.

Perl Tips and Tricks

Constants and Typeglob References - Enumerating states or status



Making a constant

Srinivasan describes a cool trick in Advanced Perl Programming [SRIN97], p. 47. You can define a read-only variable by assigning a reference to a constant to a typeglob:
*PI = \3.14159265358979323;
(That example comes directly from his book.) That is, $PI is now a constant. Now you can access this with:
$radians = $degrees * $PI / 180.0;
but you cannot change it:
$PI = 3.0;
will result in a run-time error (Perl reports "Modification of a read-only value attempted").
In C++, you can get this behavior with const, but I think of this as being closer to the C #define. In C/C++, there is also an enum that makes it convenient to define a series of constants, such as states of a system or error flags.
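As a quick check of the read-only behavior described above, here is a minimal sketch. The use strict / use warnings lines, the our declaration, the sample value of $degrees, and the eval block that traps the error are my additions for illustration, not part of Srinivasan's example.

```perl
use strict;
use warnings;

# Declared only so that 'use strict' accepts the package variable $PI.
our $PI;

# Alias $PI to a literal. Literals are read-only, so $PI becomes read-only.
*PI = \3.14159265358979323;

my $degrees = 90;                        # hypothetical sample input
my $radians = $degrees * $PI / 180.0;
print "radians = $radians\n";

# Trap the attempted assignment so the script can report it and carry on.
my $ok    = eval { $PI = 3.0; 1 };
my $error = $ok ? '' : $@;
print "assignment rejected: $error" unless $ok;
```

Without the eval block, the assignment would simply end the script with the same message.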

Making a series of constants

Now, suppose that you're designing a module and have some error codes that you wish to return. For instance, you can extend the above to:
*SUCCESS                    = \0;
*FAILURE_UNKNOWN_STATE      = \1;
*FAILURE_UNRECOGNIZED_TOKEN = \2;
*FAILURE_UNKNOWNFIELD       = \3;
In some ways, that's similar to the enum of C, but not quite as convenient. In C, you don't have to specify the numbers explicitly. You can just define the constant names, and the compiler automatically numbers them for you, starting with 0.
Also, if you later decide that you want to define a general "FAILURE" as code 1, you then have to renumber all of them. What a pain.
Before I address those, I want to mention what I see as one of the weaknesses in C. If I have these error status codes, or if I design a finite state machine or something, at some point (usually during debugging) I will want to print them out. Well, then I'm kind of stuck. I have to define an array of strings or a bunch of if or case structures or something.
In Perl, my first instinct might be to just throw the strings themselves around, so I could just end a sub with:
return 'SUCCESS';
but for a status code or a state, I could express this as an integer, and it just feels like a waste to use up a whole string on it. Then we'd also have to use string operations to test equality.
Internally, just about everyone represents a string as an array of characters. Sometimes a little more is added to the structure, such as zero termination or an additional count of the string length, but fundamentally it's just a character array. What this means is that to see if 2 strings are equal, you need to step along each position and test for equality. For a long string, this could take a while. These days, with 750+ MHz machines, FAILURE_UNRECOGNIZED_TOKEN hardly seems like a long string, but that's still up to 26 character comparisons to worry about, not to mention the overhead of incrementing the pointers.
But why go through that if we don't have to? An int or float comparison is generally faster, and can be done in 1 comparison. Now if only we could combine the benefits of the constant scalar references so we can end our subs with the readable:
return $SUCCESS;
and be able to print out the states. Now, what we could do is define a few strings in a table, much like we would in C.
$statusText[0] = '*SUCCESS';
$statusText[1] = '*FAILURE_UNKNOWN_STATE';
$statusText[2] = '*FAILURE_UNRECOGNIZED_TOKEN';
$statusText[3] = '*FAILURE_UNKNOWNFIELD';

*SUCCESS                    = \0;
*FAILURE_UNKNOWN_STATE      = \1;
*FAILURE_UNRECOGNIZED_TOKEN = \2;
*FAILURE_UNKNOWNFIELD       = \3;
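To show how the table and the constants work together, here is a small sketch that returns a constant from a sub, tests it with a numeric comparison, and prints its name. The sub check_state, its 'ready' argument, and the our declarations (needed only because I turned on use strict) are hypothetical and exist purely for illustration.

```perl
use strict;
use warnings;

# Declared only so that 'use strict' accepts these package variables.
our ($SUCCESS, $FAILURE_UNKNOWN_STATE);
our @statusText = ('*SUCCESS', '*FAILURE_UNKNOWN_STATE');

*SUCCESS               = \0;
*FAILURE_UNKNOWN_STATE = \1;

# Hypothetical sub: succeeds only for the state named 'ready'.
sub check_state {
    my ($state) = @_;
    return $state eq 'ready' ? $SUCCESS : $FAILURE_UNKNOWN_STATE;
}

my $status = check_state('bogus');
if ($status != $SUCCESS) {                   # one numeric comparison
    print "status: $statusText[$status]\n";  # look up the readable name
}
```

The sub stays readable because it returns named constants, while the caller gets a cheap integer test and can still print the name when debugging.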
And in fact, even in C, we could do a little better by defining all the strings at once with an array initializer. In Perl, we can do the following:
@statusText = (
  '*SUCCESS',
  '*FAILURE_UNKNOWN_STATE',
  '*FAILURE_UNRECOGNIZED_TOKEN',
  '*FAILURE_UNKNOWNFIELD',
);

*SUCCESS                    = \0;
*FAILURE_UNKNOWN_STATE      = \1;
*FAILURE_UNRECOGNIZED_TOKEN = \2;
*FAILURE_UNKNOWNFIELD       = \3;
And since we just have a bunch of text that doesn't contain spaces, we could take advantage of Perl's qw construct. qw means "quote words": it takes a whitespace-separated list of words and lets you omit the quotes and the commas.
@statusText = qw(
  *SUCCESS
  *FAILURE_UNKNOWN_STATE
  *FAILURE_UNRECOGNIZED_TOKEN
  *FAILURE_UNKNOWNFIELD
);

*SUCCESS                    = \0;
*FAILURE_UNKNOWN_STATE      = \1;
*FAILURE_UNRECOGNIZED_TOKEN = \2;
*FAILURE_UNKNOWNFIELD       = \3;
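If you want to convince yourself that the qw form builds the same list as the quoted, comma-separated form, a short sketch like this should do. The variable names @quoted, @words, and $same are mine, chosen only for the comparison.

```perl
use strict;
use warnings;

# The same two entries written both ways.
my @quoted = ('*SUCCESS', '*FAILURE_UNKNOWN_STATE');
my @words  = qw(
    *SUCCESS
    *FAILURE_UNKNOWN_STATE
);

# Joining each array with spaces gives a simple way to compare them.
my $same = "@quoted" eq "@words";
print $same ? "identical\n" : "different\n";
```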
Now, there's one little problem: every time you define a new state, you need to add an entry to both the statusText table and the typeglob list. You can easily get out of sync there, and if you insert a new status in the middle, you need to renumber everything. Fortunately, in Perl, I think I found a better way.

EX 5.2.1: Defining a list of return status codes

@statusText = qw(
    *SUCCESS
    *FAILURE
    *FAILURE_UNKNOWN_STATE
    *FAILURE_UNRECOGNIZED_TOKEN
    *FAILURE_UNKNOWNFIELD
);
# Set constant typeglobs for the above statuses...
for (my $statusTextNdx=0; $statusTextNdx<scalar(@statusText); $statusTextNdx++) {
    # Builds and runs one statement per entry, e.g. *SUCCESS = \0;
    eval "$statusText[$statusTextNdx] = \\$statusTextNdx;";
    die $@ if $@;    # stop if the generated statement failed
}
Listing 5.2.1 for code_untested/statusTable.pm
Some things to note here:
  • eval is something that's available to you in Perl but not in C/C++. It lets you create code on the fly and execute it. Consider one iteration of the for loop. It starts with $statusTextNdx=0. We index into the @statusText array at position 0 and find the value '*SUCCESS'. Substituting $statusTextNdx and '*SUCCESS' into the double-quoted string gives:
    "*SUCCESS = \\0;"
    
    Inside double quotes, \\ is an escaped backslash that yields a single backslash, so what we actually eval is:
    *SUCCESS = \0;
    
    which, as discussed above, just sets a constant $SUCCESS=0.
  • I kept the * as part of the string. You could just as easily make the statusText array hold strings like 'SUCCESS' and put the * into the eval line. I can go either way on this one. I find, though, that when I print the status, the * helps separate it from other stuff I'm printing out.
  • The my $statusTextNdx locally scopes the $statusTextNdx to the for loop. Normally, I ramble on and on about not using 'my' for global variables. Well, this is not really being used as a global. As I've mentioned before, I encourage using 'my' to scope to a code block or a subroutine. In this case, it is in a code block, and the variable disappears after the for loop is done. And scoping temporary variables is always a good thing.
  • The loop runs while $statusTextNdx < scalar(@statusText), so the last index it visits is scalar(@statusText)-1. This means that you can freely add entries to @statusText, and the for loop handles the whole array. By extension, note that you can add new entries, even in the middle of the list, and not worry about renumbering.
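
Pulling the points above together, here is a shortened sketch of the same loop that also prints one of the generated constants and confirms it is read-only. I have turned on use strict, which the original listing does not use, so the our declarations are an assumption of mine; they name the variables ahead of time only to keep strict happy. The list is trimmed to three entries for brevity.

```perl
use strict;
use warnings;

our @statusText = qw(
    *SUCCESS
    *FAILURE
    *FAILURE_UNKNOWN_STATE
);
# Declared only so that 'use strict' accepts the generated names.
our ($SUCCESS, $FAILURE, $FAILURE_UNKNOWN_STATE);

# Number the constants by their position in the list.
for (my $statusTextNdx = 0; $statusTextNdx < scalar(@statusText); $statusTextNdx++) {
    eval "$statusText[$statusTextNdx] = \\$statusTextNdx;";
    die $@ if $@;
}

# The table maps the number back to its name.
print "$statusText[$FAILURE_UNKNOWN_STATE] = $FAILURE_UNKNOWN_STATE\n";

# The generated scalars alias literals, so assignment is rejected.
my $ok = eval { $FAILURE = 99; 1 };
print "still read-only\n" unless $ok;
```

Because the numbers come from list position, inserting *FAILURE as the second entry shifts every later code by one, which is fine as long as callers compare against the names rather than hard-coded numbers.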

© 2001 Steve Hwan, hostname: @pacbell.net, username: svhwan
You should probably use the word "PERL" in the subject line to get my attention.
Last Modified: Tue Mar 6 06:57:15 2001
