1. #!/usr/bin/perl
2. blank line
3. # Blosxom
4. # Author: Rael Dornfest <rael@oreilly.com>
5. # Version: 2.0.2
6. # Home/Docs/Licensing: http://www.blosxom.com/
7. # Development/Downloads: http://sourceforge.net/projects/blosxom
8. blank line
9. package blosxom;
blosxom package statement
10. blank line
11. # --- Configurable variables -----
12. blank line
13. # What's this blog's title?
14. $blog_title = "My Weblog";
$blog_title initialization
Defines the weblog title
15. blank line
16. # What's this blog's description (for outgoing RSS feed)?
17. $blog_description = "Yet another Blosxom weblog.";
$blog_description initialization
Defines the weblog description
18. blank line
19. # What's this blog's primary language (for outgoing RSS feed)?
20. $blog_language = "en";
$blog_language initialization
Defines the weblog language
The comment in the source tells us that this is used for the outgoing RSS feed.
21. blank line
22. # Where are this blog's entries kept?
23. $datadir = "/Library/WebServer/Documents/blosxom";
$datadir initialization
Specifies weblog data directory
This is the root of the directory where you will keep your posts. The value should be the complete path from the filesystem root to and including the name of the data directory itself.
The path should begin with a leading slash '/' and should not include a trailing slash '/', though the script will strip a trailing forward-slash if it finds one.
You will need to create this directory yourself and set up permissions properly. The script will not create it for you.
24. blank line
25. # What's my preferred base URL for this blog (leave blank for automatic)?
26. $url = "";
$url initialization
Defines the base URL for the weblog
This is the address a visitor would type into the address bar of her browser to get to the blosxom.cgi script itself. Without any redirection, this value will end with the name of the script file.
eg:
http://sample.net/cgi-bin/blosxom.cgi
27. blank line
28. # Should I stick only to the datadir for items or travel down the
29. # directory hierarchy looking for items? If so, to what depth?
30. # 0 = infinite depth (aka grab everything), 1 = datadir only, n = n levels down
31. $depth = 0;
$depth initialization
Defines the depth that blosxom will plumb for any directory it's given to find posts in subdirectories.
If the value is 1, blosxom will look for posts only in the requested directory.
So, if the request is for the weblog homepage, then the script will only look for posts in $datadir (specified above). Posts in subfolders will be ignored.
This same value controls the depth blosxom will explore for requests that do not start at the root.
If a visitor requests:
http://sample.net/cgi-bin/blosxom.cgi/Technology/Computer/
and $depth has a value of 1, then blosxom will look for posts only in the '../Computer/' directory.
A value of 2 will also include posts at /Computer/Apple/, assuming the subdirectory 'Apple' exists, as well as /Computer/Dell/ and /Computer/Lenovo/ if those subdirectories exist. Each additional level of depth includes every directory at that level, not a single directory.
32. blank line
33. # How many entries should I show on the home page?
34. $num_entries = 40;
$num_entries initialization
Defines the max number of posts displayed on any single page, with the exception of date-based archive requests which always display all posts that match the requested date.
Though the comment in the source suggests that this value affects only the homepage, it in fact applies to all page requests (again, with the exception of date-based archives).
eg:
http://sample.net/cgi-bin/blosxom.cgi
will display a maximum of $num_entries posts.
http://sample.net/cgi-bin/blosxom.cgi/Technology/Computer/
Also displays a max of $num_entries posts, this time starting at the subdirectory '../Computer/'
On the other hand
http://sample.net/cgi-bin/blosxom.cgi/2006/
Always displays all posts created in 2006, regardless of the value of $num_entries.
There is no way to request posts beyond $num_entries using only blosxom itself. You will find a plugin to handle this if it's something you need.
If your weblog has 1000 total posts and $num_entries is set to 40, only the 40 most recent posts are directly accessible from the homepage.
35. blank line
36. # What file extension signifies a blosxom entry?
37. $file_extension = "txt";
$file_extension initialization
Specifies the extension used for all files that blosxom should treat as posts.
$file_extension is the most basic mechanism for controlling inclusion/exclusion of files as posts.
The default, 'txt', seems like a good choice because:
html is another option, though it might prove to be somewhat confusing because:
Something like 'blosxom' or 'bsxm' can work too, but may be unwieldy depending on your platform or editor.
38. blank line
39. # What is the default flavour?
40. $default_flavour = "html";
$default_flavour initialization
Specifies which flavour (template) blosxom should use if none is specified explicitly by the browser.
For example when requesting
the homepage
http://sample.net/cgi-bin/blosxom.cgi,
a category
http://sample.net/cgi-bin/blosxom.cgi/Technology/Computer/,
or date-based archive
http://sample.net/cgi-bin/blosxom.cgi/2006/11/04/
Though you're free to use any flavour you like as the default, I highly recommend either 'html' or 'htm'.
Why?
Because this is a very typical extension in use on the web.
More important than what extension you choose is always using the same extension as your default flavour.
You may have several flavours available to blosxom at any given time and occasionally you may want to change the default.
For example, maybe you've decided to change flavours with the seasons.
Rather than changing the default flavour here from 'fall' to 'winter', I recommend that you rename the flavour corresponding to the current season to 'html', keeping the default name.
i.e.
In winter change the extension on your winter flavour components to html. Come spring, restore the .winter extension and rename your spring flavour components.
Why is this important?
When visitors access your site and do not specify a flavour, blosxom appends the default. If the visitor bookmarks the site, the link will include the default extension. When accessing your site in the future via the bookmark, the same visitor will always get that same flavour, even if you have changed the default.
Why?
Because the bookmark will include an extension, so the request will not resort to using the new default.
Moreover, if you ever remove a flavour that was a default at any time, it's possible that the visitor will be greeted with an error in the future, when blosxom is unable to find the flavour specified in the bookmark.
Finally, everything I've said so far is just as true for search engines as it is for people.
For example, if Google indexes a page at your site the address will include the default flavour. It follows that search results that Google returns in response to queries will include that extension.
Always renaming your intended default, so that the value of this variable does not change, is a great way to avoid these sorts of problems.
The discussion of persistent links and consistent naming is more involved than this. Ideally addresses should not include any extension at all. (It is possible to get this behavior with blosxom. In fact there is already a plugin that does this.)
For now, and considering only the blosxom script itself, the advice is:
avoid changing the value of $default_flavour - after the initial configuration of course.
41. blank line
42. # Should I show entries from the future (i.e. dated after now)?
43. $show_future_entries = 0;
$show_future_entries initialization
The value of this variable should be set to one of the two numeric values: 1 (one) or 0 (zero).
Think of these values as true and false respectively. This is precisely what they mean.
It is possible to postdate entries so that they appear to have been written at some future date or time.
One way to do this (there are others) is by using the Unix touch command.
See...
$ man touch
...for more info.
It is possible to use blosxom to accomplish the same thing with the help of a plugin or two.
Here you are instructing blosxom to either show (1) or hide (0) entries dated in the future.
How might you use this feature?
You might leave this value set to 0 (zero) and postdate an entry to have it automatically show up on the specified date and time, so that you do not need to remember to post it yourself in the future.
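The effect of touch can also be had from Perl itself with the built-in utime function. What follows is only a minimal sketch: it assumes, as the touch suggestion implies, that a post is dated by its file modification time, and the helper name postdate and the temporary file are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: push a file's access and modification times
# the given number of seconds into the future.
sub postdate {
    my ($path, $seconds_ahead) = @_;
    my $future = time + $seconds_ahead;
    # utime(atime, mtime, files...) returns the number of files changed
    utime($future, $future, $path) == 1
        or die "Could not change times on $path: $!";
    return $future;
}

# A throwaway file standing in for a post in $datadir
my ($tmp_fh, $post) = tempfile(SUFFIX => '.txt', UNLINK => 1);
print {$tmp_fh} "A post from the future\nBody text.\n";
close $tmp_fh;

my $when  = postdate($post, 7 * 24 * 60 * 60);   # one week ahead
my $mtime = (stat $post)[9];                     # element 9 is the mtime
print "This post is dated in the future\n" if $mtime > time;
```

With $show_future_entries left at 0, a file dated this way would be expected to stay hidden until its date arrives.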
44. blank line
45. # --- Plugins (Optional) -----
46. blank line
47. # Where are my plugins kept?
48. $plugin_dir = "/Library/WebServer/Data/Blosxom/plugins";
$plugin_dir initialization
Specifies the location of the weblog plugin directory.
This is the location where you will place any plugins you would like to use with blosxom.
There are quite a few plugins available.
Some of them are fairly specialized (i.e. probably not relevant to you and your weblog) and others are all but necessary. For example, if you consider a navigable calendar a defining feature of a weblog, then you may consider the calendar plugin necessary.
In any case, all plugins live in this directory.
You should not (in fact you cannot) place active plugins in subdirectories of $plugin_dir; they must all reside in this folder directly.
These files are not intended to be directly viewed by visitors to your site. So, it should not be possible for a visitor to use their browser to navigate to the folder containing the plugins.
Assuming permissions on your server are set up properly, it is very simple to accomplish this by making sure the plugins directory is not within the webserver's document root.
The value should be the complete path from the filesystem root to, and including, the name of the plugin directory itself.
The path should begin with a leading slash '/' and should not include a trailing slash '/', though the script will strip a trailing forward-slash if it finds one.
You will need to create this directory yourself and set up permissions properly. The script will not create it for you.
49. blank line
50. # Where should my modules keep their state information?
51. $plugin_state_dir = "$plugin_dir/state";
$plugin_state_dir initialization
Specifies the location of the plugin state directory.
It will often be the case that plugins will need to save information related to their function during their operation.
All of this information should be saved to the state directory specified here. Like the plugins directory, visitors should not be able to access $plugin_state_dir directly.
Though you can place this directory anywhere that blosxom will have access to it, making state/ a subdirectory of the plugins directory makes as much sense as anything else. It is the default, so picking this location requires absolutely no work on your part.
One of the advantages of this arrangement is that you can specify the path here in terms of $plugin_dir.
eg:
$plugin_state_dir = "$plugin_dir/state";
You should expect that plugins will be fairly tidy when it comes to the use of the state directory.
You will need to create this directory yourself and set up permissions properly. The script will not create it for you.
52. blank line
53. # --- Static Rendering -----
54. blank line
55. # Where are this blog's static files to be created?
56. $static_dir = "/Library/WebServer/Documents/blog";
$static_dir initialization
Specifies the location of the directory where blosxom generates the complete site when run in static mode.
In static mode blosxom generates a complete website, running through all posts and generating every page at once. At the very least running blosxom in static mode generates:
See line 61 for more info.
The value should be the complete path from the filesystem root to, and including, the name of the static directory itself.
The path should begin with a leading slash '/' and should not include a trailing slash '/', though the script will strip a trailing forward slash if it finds one.
You will need to create this directory yourself and set up permissions properly. The script will not create it for you.
A full discussion of static mode here is not necessary but I will talk about it briefly.
Normally any request sent to your weblog runs the blosxom.cgi script. The script takes a nonzero amount of time to execute.
This is in addition to the time it takes for:
For large sites (with lots of posts) it may take a considerable amount of time for the script to work through every post on the site.
aside: blosxom consults the filesystem for info about all posts in every subdirectory of $datadir on every request. As far as computing operations go, accessing the filesystem is slow.
Also, a large or particularly busy site may create a lot of activity for the host computer. This can slow down a site from a visitor's perspective and tax system resources from the perspective of a hosting provider.
Furthermore, there are potential security implications of running any code via a browser, and these concerns are complicated by blosxom's plugin architecture. Even if blosxom.cgi itself is safe, some poorly written or ill-conceived plugin may expose your site to risk.
Finally, it's possible that the configuration of your webserver, or some other restriction imposed by your hosting provider, precludes the use of CGI scripts like blosxom outright.
For these reasons, among others, you may prefer to run blosxom offline, forcing it to generate all of the content on your site at once.
The resulting pages can then be moved to your webserver where they can be served as static content, without the risk or overhead required for the script to operate on every request.
As described, static mode may sound appealing. Even so, I want to make a point of encouraging you to use blosxom dynamically where appropriate.
Why?
There are simply some things you can do running dynamically that cannot be done statically and you'll find many plugins that will only work in dynamic mode.
Beyond this though, static mode is a bit of a pain to deal with when it isn't absolutely necessary and, to some extent, it doesn't really speak to blosxom's strengths.
Still, static rendering is a great option when necessary.
Some combination of static and dynamic mode operation may be a perfect fit for what you want to do. However, I feel safe in saying that such a mixed configuration is the most difficult type of setup to maintain.
It should be the goal of the project to improve the efficiency of the script and its plugins so that running dynamically is a practical arrangement for the vast majority of installations (of reasonable size/scope of course).
57. blank line
58. # What's my administrative password (you must set this for static rendering)?
59. $static_password = "";
$static_password initialization
Defines the password which must be included as a parameter when calling the script to run in static mode.
A password must be defined here and then used to run blosxom in static mode.
The password serves two different purposes:
First, it is a security measure.
In theory only someone who knows the password you specify here can run blosxom in static mode.
It also acts as a means to enable or disable static mode operation.
Setting the value to "" (the default) disables static mode operation. The presence of the password as a commandline parameter overrides the default and indicates to blosxom that it should run in static mode.
The password you choose should be a good one. There are many different ideas about what makes a password 'good'.
I'll recommend the following:
60. blank line
61. # What flavours should I generate statically?
62. @static_flavours = qw/html rss/;
@static_flavours initialization
Specifies which flavours blosxom should attempt to generate statically.
You may have designed many flavours.
When in static mode blosxom must essentially generate a complete copy of the site separately for each flavour. Also, you may have flavours that depend on dynamic mode operation.
For these reasons, among others, it probably makes sense for you to generate only a small number of flavours statically.
Here you can specify which flavours blosxom will attempt to generate.
You may want to consider sticking with the default and limiting static mode output to html and rss (or whatever flavour you use for syndication feeds).
You can of course add flavours to, remove flavours from, or edit the names of the flavours that appear in this list. Simply separate the flavour names by whitespace between the '/' characters.
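If the qw// notation is unfamiliar, this short sketch shows that it is simply a quoted-word list. The 'atom' flavour is hypothetical and only there to show how a third name would be added.

```perl
use strict;
use warnings;

# qw/.../ splits its contents on whitespace and quotes each word
my @static_flavours = qw/html rss/;
my @same_list       = ('html', 'rss');      # the long-hand equivalent
my @three_flavours  = qw/html rss atom/;    # 'atom' is a made-up flavour name

print scalar(@static_flavours), " flavours: @static_flavours\n";
print scalar(@three_flavours),  " flavours: @three_flavours\n";
```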
63. blank line
64. # Should I statically generate individual entries?
65. # 0 = no, 1 = yes
66. $static_entries = 0;
$static_entries initialization
The value of this variable should be set to one of the two numeric values: 1 (one) or 0 (zero).
Think of these values as true and false respectively. This is precisely what they mean.
When running in static mode blosxom always generates:
All of these pages are generated for every static flavour you've specified.
This is already potentially a large number of pages.
Additionally, blosxom can generate a page for each individual post on your site. Realize that this has the potential to create very many pages depending on the number of posts on your site, the depth of your categorization scheme, the number of static flavours you specify in @static_flavours, and possibly other factors.
The value of this variable indicates your preference to the script.
Setting this value to '1' does mean that the process of generating the site in static mode will take longer, and your site will be larger in terms of the amount of drive space it occupies, but it will not increase the amount of work the webserver must do to serve requests for your site. The amount of time to serve any single static page is not strongly related to the total number of pages.
67. blank line
68. # --------------------------------
69. blank line
70. use vars qw! $version $blog_title $blog_description $blog_language $datadir $url %template $template $depth $num_entries $file_extension $default_flavour $static_or_dynamic $plugin_dir $plugin_state_dir @plugins %plugins $static_dir $static_password @static_flavours $static_entries $path_info $path_info_yr $path_info_mo $path_info_da $path_info_mo_num $flavour $static_or_dynamic %month2num @num2month $interpolate $entries $output $header $show_future_entries %files %indexes %others !;
The 'use vars' pragma has been superseded, as of Perl 5.6, by 'our' declarations, and its use is discouraged in new code.
Either form declares variables as package globals so that they can be used while the 'strict' pragma is in effect. Named variables may be referred to within the same file and package by their unqualified names, and from different files or packages by their fully qualified names.
By using this declaration plugins have access to any of these declared variables.
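To make the two idioms concrete, here is a small sketch. The package name demo and both variable names are invented for the example; a plugin would reach blosxom's own variables in the same fully qualified way, e.g. $blosxom::blog_title.

```perl
package demo;              # hypothetical package name
use strict;
use warnings;
use vars qw($old_style);   # the older idiom used by blosxom
our $new_style;            # the 'our' declaration available since Perl 5.6

# Inside the package the unqualified names work under 'use strict'
$old_style = 'set via use vars';
$new_style = 'set via our';

package main;
# From another package both are reachable by their fully qualified names
print "$demo::old_style\n";
print "$demo::new_style\n";
```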
71. blank line
72. use strict;
Perl pragma that introduces some basic programming restrictions to help guide the developer toward responsible and readable coding.
See..
$ man strict
..for more info.
73. use FileHandle;
Lines 73 - 77 include modules and classes that the script requires during execution.
See the relevant documentation for more info about each of these.
74. use File::Find;
75. use File::stat;
76. use Time::localtime;
77. use CGI qw/:standard :netscape/;
78. blank line
79. $version = "2.0.2";
$version initialization
80. blank line
81. my $fh = new FileHandle;
Declares a new, uninitialized FileHandle, $fh.
We will be using this filehandle to read in our posts (and write pages in static mode).
82. blank line
83. %month2num = (nil=>'00', Jan=>'01', Feb=>'02', Mar=>'03', Apr=>'04', May=>'05', Jun=>'06', Jul=>'07', Aug=>'08', Sep=>'09', Oct=>'10', Nov=>'11', Dec=>'12');
%month2num initialization
First use of %month2num defines the variable as a hash of key/value pairs where keys are short month names like 'Jan', and values are two digit strings e.g. '01' to '12'
Note that nil=>'00' is only a placeholder. It's included to make possible the next line's assignment of %month2num's keys to @num2month so that $num2month[1] is 'Jan'. The placeholder is necessary because there is no month 0.
Two digit values are specified as strings. This ensures that the two digit format is reliable. In other words blosxom prefers to always see '01', and never 1.
84. @num2month = sort { $month2num{$a} <=> $month2num{$b} } keys %month2num;
@num2month initialization
We initialize the array to contain a list of three character strings that represent months by name at corresponding index positions.
Looking at the expressions that contribute to the statement from right to left:
First we take the keys from the %month2num hash.
The result is a list of values 'nil', 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'
Next, we sort the list of keys by their value in the hash in ascending numerical order.
We're sorting by the hash values 00, 01, 02, 03, 04, 05,...,12. The strings are automatically converted to corresponding numerical values for the purposes of the comparison.
Finally, we store the sorted list of keys in the array @num2month.
$num2month[0] is 'nil', $num2month[1] is 'Jan',..., $num2month[12] is 'Dec'.
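The two tables can be tried out on their own. This sketch copies lines 83 and 84 into lexical variables (the 'my' declarations are my addition) and looks a month up in each direction.

```perl
use strict;
use warnings;

my %month2num = (nil=>'00', Jan=>'01', Feb=>'02', Mar=>'03', Apr=>'04',
                 May=>'05', Jun=>'06', Jul=>'07', Aug=>'08', Sep=>'09',
                 Oct=>'10', Nov=>'11', Dec=>'12');

# Sort the month names by their numeric value so that the array index
# of each name matches its month number
my @num2month = sort { $month2num{$a} <=> $month2num{$b} } keys %month2num;

print "$num2month[1]\n";      # name for month 1: Jan
print "$month2num{Dec}\n";    # two digit string for Dec: 12
printf "%s is month %s\n", $num2month[3], $month2num{ $num2month[3] };
```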
85. blank line
86. # Use the stated preferred URL or figure it out automatically
87. $url ||= url(-path_info => 1);
This is an example of a statement (one of many in the current code) that uses a short-circuit operator, here in its assignment form '||=', as a control structure.
$url can be set manually in the configurable variables section of the code. If it is set, then $url is true and the right-hand side is not evaluated.
If $url is not set manually, it holds the empty string, which evaluates to false, so the right-hand side is evaluated.
So, if you do not specify a $url in the user configurable variables section above (line 26), then url() is called and the result is stored in the variable.
url() is part of the CGI class. From the CGI module's manpage:
url() returns the script's URL in a variety of formats. Called without any arguments, it returns the full form of the URL, including host name and port number.
-path (-path_info) Append the additional path information to the URL. This can be combined with -full, -absolute or -relative. -path_info is provided as a synonym.
We need to look ahead to the comments that run lines 90 - 93 to understand why we're using the -path_info argument.
As it concerns this line those comments tell us that in some cases url() will always append path_info to the URL. We're not interested in the additional path_info, but we are interested in having a consistent value in $url. Following the 'go with the flow' philosophy so that we can have a dependable value, we include the argument to ensure that we will have the path_info always, not sometimes. Always is easier to deal with than sometimes.
Because we are not interested in the additional path info, we will strip it off soon, lines 94 - 96.
In any case, at this point $url should contain at least the URL of the blosxom script.
e.g.
http://example.net/cgi-bin/blosxom.cgi
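The behaviour of '||=' is easy to see in isolation. In this sketch detect_url is a hypothetical stand-in for CGI's url(-path_info => 1), so that the example runs without a webserver, and both URLs are invented.

```perl
use strict;
use warnings;

# Hypothetical stand-in for url(-path_info => 1)
sub detect_url { return 'http://example.net/cgi-bin/blosxom.cgi/Technology' }

my $configured = 'http://sample.net/blog';   # as if set by hand on line 26
my $blank      = '';                         # the shipped default

$configured ||= detect_url();   # left side is true: detect_url() is never called
$blank      ||= detect_url();   # left side is false: the detected value is assigned

print "$configured\n";
print "$blank\n";
```

Note that '||=' tests for truth rather than definedness, so a value of '0' would be replaced just as the empty string is.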
88. $url =~ s/^included:/http:/ if $ENV{SERVER_PROTOCOL} eq 'INCLUDED';
At this point we know $url contains a value. The value should be the full URL including protocol, host, and port number, possibly with additional path info if a path was specified in the request.
This line uses Perl's substitution operator, s///, to 'fix' the string at $url in a specific case.
If $ENV{SERVER_PROTOCOL} is 'INCLUDED' and the value of $url begins with the substring 'included:', then we substitute 'http:' for that prefix.
'^' in the pattern matches the start of the string.
If the value of $url does not match the specified pattern, it is left unaltered.
e.g.
included://some_host_address/cgi-bin/blosxom.cgi
becomes
http://some_host_address/cgi-bin/blosxom.cgi
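Here is the same substitution wrapped in a small function so that both cases can be tried. The function name fix_included is hypothetical, and the protocol is passed in as an argument rather than read from %ENV.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution on line 88
sub fix_included {
    my ($url, $protocol) = @_;
    # Only rewrite the scheme when the server reports the INCLUDED protocol
    $url =~ s/^included:/http:/ if $protocol eq 'INCLUDED';
    return $url;
}

print fix_included('included://some_host_address/cgi-bin/blosxom.cgi', 'INCLUDED'), "\n";
print fix_included('http://some_host_address/cgi-bin/blosxom.cgi', 'HTTP/1.1'), "\n";
```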
89. blank line
90. # NOTE: Since v3.12, it looks as if CGI.pm misbehaves for SSIs and
91. # always appends path_info to the url. To fix this, we always
92. # request an url with path_info, and always remove it from the end of the
93. # string.
94. my $pi_len = length $ENV{PATH_INFO};
We begin the work of stripping the additional path info from $url (if present) so that we end up with just the base url to the blosxom executable in $url, which is the value we want in the variable.
We get the length of the string that represents the additional path info.
%ENV is a hash that is automatically created and populated with interesting bits of information related to the environment in which the Perl script is running. PATH_INFO is one key of many in this hash of key/value pairs. The value of $ENV{PATH_INFO} is the additional path info from the URL.
e.g.
If a browser requests
http://sample.net/cgi-bin/blosxom.cgi/Technology/Computer/Apple/some_post_aboutApple
then the value of PATH_INFO will be
'/Technology/Computer/Apple/some_post_aboutApple'
$ENV{PATH_INFO} returns only the path info portion of the URL as a string.
length takes the string and returns its length.
We store that numeric length value in the variable $pi_len.
95. my $might_be_pi = substr($url, -$pi_len);
Continuing our work correcting for the inclusion of path info as part of the value at $url.
substr is a Perl function that returns a substring when given a larger string.
substr($url, -$pi_len)
$url is the larger string and we are asking for some portion of it. Specifically $pi_len number of characters from the end of the string.
The '-' in -$pi_len indicates that we want perl to start counting from the end of the string rather than the beginning.
The return value of substr is the substring itself, which we store in $might_be_pi.
Continuing with the example we started at line 94
If the browser requests
http://sample.net/cgi-bin/blosxom.cgi/Technology/Computer/Apple/some_post_aboutApple
then the value of PATH_INFO will be
'/Technology/Computer/Apple/some_post_aboutApple'
and $pi_len will be 47.
substr($url, -$pi_len) will return, and store in $might_be_pi, the last 47 characters of $url.
In our example, $url is
http://sample.net/cgi-bin/blosxom.cgi/Technology/Computer/Apple/some_post_aboutApple
and the last 47 characters are the path portion of the URL.
So after this assignment, $might_be_pi contains the path portion of the URL, which is, in this case, exactly the value contained in $ENV{PATH_INFO}.
96. substr($url, -length $ENV{PATH_INFO}) = '' if $might_be_pi eq $ENV{PATH_INFO};
Continuing our work correcting for the inclusion of path info as part of the value at $url.
This line is an example of a Perl statement modifier, which is simply a more compact way of writing conditional code, an if block in this case.
First we evaluate the condition after the if, which is
$might_be_pi eq $ENV{PATH_INFO}
and if this expression is true then we consider the rest of the statement,
substr($url, -length $ENV{PATH_INFO}) = ''
If the condition is not true, then we skip the rest of the statement.
Because this is the first time we're looking closely at a statement modifier, let's look at how this statement would look rewritten as a typical if block
if($might_be_pi eq $ENV{PATH_INFO}) {
substr($url, -length $ENV{PATH_INFO}) = '';
}
The value of $might_be_pi is compared to the value at $ENV{PATH_INFO}. If the two values are identical, then the condition is satisfied (and we are satisfied that there is path info included in $url). Having established this, we continue with the rest of the statement.
substr($url, -length $ENV{PATH_INFO}) = '';
There is a lot going on here. Taken together as an idiom, it's easy to talk about what this expression is doing. Teasing it apart to consider how the expression is constructed isn't any more difficult but does take more time and typing.
We've seen substr before, at line 95, and this use is very similar. Also we've seen length at line 94. Here we combine both in one expression.
$ENV{PATH_INFO} is the path info portion of the URL, as already discussed.
Passing this value to length returns the number of characters in the path; again, this has already been talked about.
substr targets a substring of some larger string, in this case $url.
How large a substring is determined by the second argument to substr; here it is length $ENV{PATH_INFO} ('-' as discussed means that we are interested in the substring at the end of $url rather than the beginning).
Usually substr returns the specified substring. When combined with an assignment, as it is here, the effect is to replace the substring with the assigned value in the larger string.
So this expression selects the path portion of the string in $url and replaces it with the empty string, ''. In other words, the expression strips the path info from $url.
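Lines 94 - 96 can be exercised together outside of blosxom. In this sketch the helper strip_path_info is hypothetical, and the URL and path info are passed in as arguments rather than read from $url and %ENV. The early return is my addition: the original has no such guard and relies on the eq test failing when PATH_INFO is empty.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring lines 94 - 96
sub strip_path_info {
    my ($url, $path_info) = @_;
    $path_info = '' unless defined $path_info;
    my $pi_len = length $path_info;

    # With a length of 0, substr($url, -0) would select the whole string,
    # so there is nothing sensible to strip
    return $url unless $pi_len;

    # Take the last $pi_len characters of the URL...
    my $might_be_pi = substr($url, -$pi_len);
    # ...and cut them off only if they really are the path info
    substr($url, -$pi_len) = '' if $might_be_pi eq $path_info;
    return $url;
}

my $path = '/Technology/Computer/Apple/some_post_aboutApple';
print length($path), "\n";    # 47
print strip_path_info("http://sample.net/cgi-bin/blosxom.cgi$path", $path), "\n";
print strip_path_info('http://sample.net/cgi-bin/blosxom.cgi', ''), "\n";
```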
97. blank line
98. $url =~ s!/$!!;
Another use of Perl's substitution operator.
Here the pattern we're looking for is:
/$
a forward slash followed immediately by the end of the string value.
If we find a trailing forward slash, we replace it with nothing.
In other words, this line drops a single trailing forward slash from the value in $url.
99. blank line
100. # Drop ending any / from dir settings
101. $datadir =~ s!/$!!; $plugin_dir =~ s!/$!!; $static_dir =~ s!/$!!;
Drops any single trailing '/' from any of $datadir, $plugin_dir, $static_dir.
This line simply gives the code flexibility to deal with the possibility of users including a trailing '/' in these paths, though blosxom expects that the values do not include the character.
Note that this line combines three short statements on one line.
The lines could be rewritten as
$datadir =~ s!/$!!;
$plugin_dir =~ s!/$!!;
$static_dir =~ s!/$!!;
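A quick way to convince yourself of what the substitution does, including the fact that it removes one slash only. The function name drop_trailing_slash and the '/var/blog//' path are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper applying the substitution from lines 98 and 101
sub drop_trailing_slash {
    my ($dir) = @_;
    $dir =~ s!/$!!;    # '!' is used as the delimiter so '/' needs no escaping
    return $dir;
}

for my $dir ('/Library/WebServer/Documents/blosxom/',
             '/Library/WebServer/Documents/blosxom',
             '/var/blog//') {
    print "$dir -> ", drop_trailing_slash($dir), "\n";
}
```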
102. blank line
103. # Fix depth to take into account datadir's path
104. $depth and $depth += ($datadir =~ tr[/][]) - 1;
First note the use of the short-circuit operator 'and'.
If $depth is zero then the rest of the statement is not evaluated.
$depth will be zero if the user manually set it to zero in the user configurable variables section of the script, which indicates that she wants infinite depth.
Given this meaning (i.e. 0 indicates infinite depth), it makes sense that we wouldn't want to add to the $depth if $depth is 0 (infinity + 1 doesn't make much sense).
If $depth is not zero, we are modifying the value at $depth.
tr/// is Perl's transliteration operator.
It can be used to replace characters in the first list with corresponding characters from the second.
The return value is the number of characters replaced or deleted.
When used as it is here, with an empty second list, it's used to count occurrences of characters in the first list.
So here we're counting forward slashes, '/', in $datadir
This statement adds to $depth a correction for the length of the path to the start of the data directory.
e.g.
$datadir = "/Library/WebServer/Documents/blosxom";
and
$depth = 2;
# indicating that we want blosxom
# to consider posts in the top two levels
# of the weblog hierarchy
This line results in setting $depth to 5 as follows (4 being the number of slashes in $datadir):
2 + 4 - 1 = 5.
This seems to be correct because the root of the weblog is at level 4 (relative to the filesystem root, '/'), and we want to consider the top 2 levels of the weblog hierarchy, i.e. levels 4 and 5.
Note: For the curious (or the suspicious), we add a similar correction anytime we request a path that may be further down the data directory hierarchy. So the value we use for $depth is always relative to the correct starting point.
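The arithmetic can be checked directly. This sketch wraps line 104 in a hypothetical function, corrected_depth, that takes the configured depth and the data directory as arguments.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring line 104
sub corrected_depth {
    my ($depth, $datadir) = @_;
    # tr with an empty replacement list changes nothing and returns the
    # number of '/' characters; a depth of 0 short-circuits the adjustment
    $depth and $depth += ($datadir =~ tr[/][]) - 1;
    return $depth;
}

print corrected_depth(2, '/Library/WebServer/Documents/blosxom'), "\n";   # 2 + 4 - 1 = 5
print corrected_depth(0, '/Library/WebServer/Documents/blosxom'), "\n";   # stays 0 (infinite)
```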
105. blank line
106. # Global variable to be used in head/foot.{flavour} templates
107. $path_info = '';
$path_info initialization
We're simply initializing $path_info to the empty string, ''.
108. blank line
109. $static_or_dynamic = (!$ENV{GATEWAY_INTERFACE} and param('-password') and $static_password and param('-password') eq $static_password) ? 'static' : 'dynamic';
$static_or_dynamic initialization
The statement sets the variable to one of two values, either 'static' or 'dynamic'.
This choice is made with the help of Perl's ternary operator, ?:
The expression before the ? is evaluated.
If it evaluates as true then the expression to the left of the colon, ':', is evaluated.
Otherwise the expression to the right of ':' is evaluated.
In this line the test expression is:
(!$ENV{GATEWAY_INTERFACE} and param('-password') and
$static_password and param('-password') eq $static_password)
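The expression checks:
!$ENV{GATEWAY_INTERFACE} - the GATEWAY_INTERFACE environment variable is not set, i.e. the script was not started by a web server as a CGI program.
param('-password') - a -password parameter was supplied.
$static_password - a static password has been set in the configurable variables section.
param('-password') eq $static_password - the supplied password matches the configured one.
If all of these things are true then $static_or_dynamic will be assigned the value 'static', which is the expression to the left of the colon.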
Otherwise, $static_or_dynamic will be set to the value 'dynamic', the expression to the right of the ':'.
At this point in the script we can check $static_or_dynamic to know in which of the two modes we're operating.
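As a rough illustration of how the test expression behaves, here is a sketch that takes the three inputs as arguments instead of reading %ENV and CGI's param(). The function name and its parameter names are assumptions made for this example only.

```perl
use strict;
use warnings;

# Hypothetical stand-in for line 109: the CGI environment value and the
# two passwords are passed in explicitly rather than read from %ENV and
# param(), so the logic can be tried in isolation.
sub static_or_dynamic {
    my ($gateway_interface, $given_password, $static_password) = @_;
    my $mode = (!$gateway_interface and $given_password
                and $static_password and $given_password eq $static_password)
        ? 'static' : 'dynamic';
    return $mode;
}

print static_or_dynamic(undef,     'secret', 'secret'), "\n";  # prints static
print static_or_dynamic('CGI/1.1', 'secret', 'secret'), "\n";  # prints dynamic
print static_or_dynamic(undef,     'wrong',  'secret'), "\n";  # prints dynamic
```

Because 'and' stops at the first false value, the string comparison is only reached when both passwords are present.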
110. $static_or_dynamic eq 'dynamic' and param(-name=>'-quiet', -value=>1);
This is another use of a partial evaluation operator as a control structure, in this case 'and'.
The first part of the statement
$static_or_dynamic eq 'dynamic'
tests the value of $static_or_dynamic, which we just initialized at line 109.
If the value is 'dynamic', meaning that blosxom has determined that it is running in dynamic mode, then the second part of the statement is evaluated
param(-name=>'-quiet', -value=>1)
which assigns the -quiet parameter a value 1.
This might seem to make sense, a value of 1 indicating that the script should suppress unwanted output when running dynamically, but the parameter doesn't seem to be used anywhere in dynamic mode.
The parameter is useful in the case of static mode operation.
From the documentation we have this description:
to have Blosxom's static rendering run silently -- perhaps you're running it automatically at regular intervals and you don't want all that output popping up on your screen or being mailed to you -- add -quiet=1 like so:
% perl blosxom.cgi -password='whateveryourpassword' -quiet=1
Setting -quiet to 1 suppresses output.
111. blank line
112. # Path Info Magic
113. # Take a gander at HTTP's PATH_INFO for optional blog name, archive yr/mo/day
114. my @path_info = split m{/}, path_info() || param('path');
Declaration and initialization of my @path_info
This is another use of a partial evaluation operator as a control structure, in this case '||' (pronounced 'or').
This is the first appearance of the variable my @path_info.
split operates on a single string.
It works by splitting the string it's given into a list of values at the specified (matched) character.
The specified character in this case is forward slash, '/'.
What string are we asking split to work on?
First we run path_info(), another function from the CGI module.
From the documentation for CGI:
path_info() Returns additional path information from the script URL. E.G. fetching /cgi-bin/your_script/additional/stuff will result in $query->path_info() returning
"/additional/stuff".
So path_info() returns everything in the address after the script itself.
For blosxom this part of the address corresponds to the data directory or date-based archive hierarchies.
e.g.
For the request
http://sample.net/cgi-bin/blosxom.cgi/Technology/AppleInc/Hardware/Macintosh
path_info() returns
'/Technology/AppleInc/Hardware/Macintosh'
which we split on '/' so that the array @path_info contains the list
'', 'Technology', 'AppleInc', 'Hardware', 'Macintosh'
Note that the initial list item is the empty string, ''.
This value is introduced when splitting on the leading '/', i.e. ^/Technology.
The next line takes care of removing this unwanted element.
The use of '||' means that only if path_info() returns a false value (i.e. no path was specified) will the second part of the statement be evaluated, which passes the value of the path parameter to the split operator.
If both path_info() and param('path') are empty, then @path_info is empty. This will happen whenever a browser makes a request for the root of the weblog.
e.g.
a request for
http://sample.net/cgi-bin/blosxom.cgi
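The behaviour of split and '||' described for line 114 can be tried with a small sketch. The helper split_path and its arguments are hypothetical: $from_url stands in for the value of path_info() and $from_param for param('path'). The trailing || '' is an addition for this example so that a missing value does not trigger a warning under 'use warnings'.

```perl
use strict;
use warnings;

# Hypothetical helper: $from_url plays the role of path_info() and
# $from_param the role of param('path').
sub split_path {
    my ($from_url, $from_param) = @_;
    # || binds more tightly than the comma, so split receives one string:
    # the first true value of the alternatives.
    my @parts = split m{/}, $from_url || $from_param || '';
    return @parts;
}

my @example = split_path('/Technology/AppleInc/Hardware/Macintosh');
print scalar(@example), "\n";            # prints 5 (the first element is '')
print join('|', @example), "\n";         # prints |Technology|AppleInc|Hardware|Macintosh

my @root = split_path('', undef);
print scalar(@root), "\n";               # prints 0
```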
115. shift @path_info;
shift operates on the beginning (lowest index value) of an array, such as @path_info in this case.
shift returns the value at the first index position, removing it from the array.
Here, we're doing nothing with the return value, so the effect is simply to remove the first element from the array.
Continuing the example started at line 114 above, if the array contains the list
'', 'Technology', 'AppleInc', 'Hardware', 'Macintosh'
We shift off the empty string value, leaving only the elements we want
'Technology', 'AppleInc', 'Hardware', 'Macintosh'
116. blank line
117. while ($path_info[0] and $path_info[0] =~ /^[a-zA-Z].*$/ and $path_info[0] !~ /(.*)\.(.*)/) { $path_info .= '/' . shift @path_info; }
While loop, condition and body
This is a while loop operating on the newly created @path_info array.
Yet again we have a logical partial evaluation operator involved in the control of the script's execution.
The first part of the condition,
$path_info[0]
simply checks that $path_info[0] (the first element of @path_info) does in fact contain a value. Remember that @path_info may be empty.
If it evaluates as false, because @path_info is empty, then we drop out of the loop - or skip the loop entirely if $path_info[0] is empty the first time the condition is tested.
On the other hand, if $path_info[0] is true, the next part of the condition is evaluated.
$path_info[0] =~ /^[a-zA-Z].*$/
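Here we're attempting to match the value at $path_info[0] to the given pattern.
The pattern specifies:
^ - the start of the string.
[a-zA-Z] - a single letter, upper or lower case.
.* - any number of any character (other than \n).
$ - the end of the string.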
So the pattern matches if the value at $path_info[0] starts with a letter.
If the pattern matches, we evaluate the last part of the condition.
$path_info[0] !~ /(.*)\.(.*)/
This is another pattern match, but here we succeed if we do not match the pattern provided (compare !~ and =~)
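The pattern specifies:
(.*) - any number of any character (other than \n).
\. - a literal dot, '.'.
(.*) - any number of any character (other than \n).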
The pattern matches any string that contains a dot, which when working with the elements of a path, as we are here, might loosely describe a filename.
This might lead you to suspect that category names in blosxom should not contain dots, '.', even though your operating system may allow it.
You would be right.
In summary:
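The body of the loop runs if:
$path_info[0] holds a true value (the array is not empty), and
$path_info[0] starts with a letter, and
$path_info[0] does not contain a dot, '.'.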
Note: From the documentation we know that directory names, used to categorize entries, must not start with a digit.
Now we come to the body of the while loop
$path_info .= '/' . shift @path_info;
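This statement is doing two things.
In order of evaluation:
shift @path_info removes the first element of the array and returns it, and a forward slash is prepended to that value.
.= appends the resulting string to $path_info.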
e.g.
If @path_info contains the list 'Technology', 'AppleInc', 'Hardware', 'Macintosh'
Then we remove 'Technology' and prepend a leading forward slash:
'Technology' becomes '/Technology'
@path_info is left with the list 'AppleInc', 'Hardware', 'Macintosh'.
Next, we take the string we just created, append it to the end of the current value of $path_info, and assign that new string back to $path_info with the single operator (.=).
The first time the loop is run $path_info is empty (we initialize it to the empty string in line 107).
What does the loop accomplish?
We build up in the string $path_info the category portion of the path information given to blosxom: everything in the address following the script itself, up to but not including the first date element or the name of the file, if present.
Note that because we shift the array every time we repeat the loop, the value of $path_info[0] changes each time through. Eventually we may shift off all elements of the array and drop out of the loop. But keep in mind that we will drop out of the loop if any of the subexpressions of the while condition are false. Those conditions again are:
$path_info[0] holds a true value,
$path_info[0] starts with a letter,
$path_info[0] does not contain a dot, '.'.
Note that the entire while loop, including the loop body, is written on this single line.
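To watch the loop at work, here is a minimal sketch. The wrapper take_categories is hypothetical and works on a copy of the path elements; the loop condition and body are the ones from line 117, with the variables renamed.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the loop at line 117. It returns the
# category path that was built up, followed by the elements left over.
sub take_categories {
    my @elements = @_;
    my $category_path = '';
    while ($elements[0]
           and $elements[0] =~ /^[a-zA-Z].*$/
           and $elements[0] !~ /(.*)\.(.*)/) {
        $category_path .= '/' . shift @elements;
    }
    return ($category_path, @elements);
}

my ($cats, @rest) = take_categories(qw(Technology AppleInc 2006 12 index.rss));
print "$cats\n";   # prints /Technology/AppleInc
print "@rest\n";   # prints 2006 12 index.rss
```

The loop stops at '2006' because it does not start with a letter; a filename such as 'index.rss' would stop it because it contains a dot.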
118. blank line
119. # Flavour specified by ?flav={flav} or index.{flav}
120. $flavour = '';
$flavour initialization
$flavour is a package global
121. blank line
122. if ( $path_info[$#path_info] =~ /(.+)\.(.+)$/ ) {
Start of if block and condition
After the while loop, line 117, the @path_info array may still contain elements.
It will contain a list of all elements following, and including, the first beginning with a digit, and possibly the filename of a specific post or the name of an index page (e.g. index.html) if either was included in the request.
This conditional attempts to match the last element of the @path_info array,
specified by $path_info[$#path_info],
to the pattern (.+)\.(.+)$
which will match something that looks like a filename, i.e. any string containing at least one of any character (other than \n), followed by a dot, followed by at least one of any character (other than \n).
If the last element remaining in the array is a filename (if indeed there is a last element at all - remember the array could be empty), then we execute the body of the conditional.
123. $flavour = $2;
Note the use of the parentheses in the pattern (.+)\.(.+)$
The pattern takes advantage of Perl's memory variables.
Each set of parens in the pattern corresponds to a variable containing the portion of the string that matched the pattern inside the parentheses. The variables are named $1, $2, $3... etc, one per set of parens in the pattern, in order from left to right. Note that nesting parens does affect the names assigned to these variables. In this case it's simple: $1 holds the name portion and $2 holds the extension.
This line assigns the value of $2 to the package global $flavour, which should make sense. If there is a filename provided, and it includes an extension, then the extension specifies the requested flavour.
124. $1 ne 'index' and $path_info .= "/$1.$2";
If we see something in the form of a filename (name.extension), then name may be a filename or 'index', which is a request for a listing of all entries in the given category.
Here we use another partial evaluation operator as a flow control statement as follows:
If $1 (the name portion of name.extension) does not equal (ne) 'index', then $1.$2 looks like a filename and not a request for an index page.
In this case the line appends the filename to the end of $path_info, after prepending a forward slash '/' as a delimiter.
So a legitimate filename is included as part of $path_info but not an index request.
125. pop @path_info;
Remember that shift operates at the beginning of an array (lowest index values), pop works at the end of the array.
Keep in mind that we are inside of a conditional block and only executing this line if the last element of @path_info is either a filename or index request.
We pop and discard this value. Before now we've used the value to pick up the requested flavour (if present) and append the requested filename (if present) to $path_info.
126. } else {
End of if block and start of else clause
127. $flavour = param('flav') || $default_flavour;
The else clause executes only in the case that the if does not.
The if body will not execute unless the last element of @path_info matches the pattern 'name.extension'.
If that match fails, we won't have assigned to $flavour before this point (something that happens inside the if block).
This line first looks for the 'flav' parameter passed by the browser (e.g. ?flav=html), which is an alternative (deprecated!) method of specifying the flavour, before resorting to $default_flavour, specified in the configurable variables section of blosxom.cgi itself.
At this point we know that $flavour has a value, even if it is the default.
128. }
end of else clause started at line 126
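The if/else at lines 122-128 can be sketched as a standalone function. Everything about the wrapper is an assumption for illustration: the name pick_flavour, the argument $flav_param (standing in for param('flav')), and the extra '@elements and' guard, which is added here only to avoid a warning when the array is empty.

```perl
use strict;
use warnings;

# Hypothetical wrapper around lines 122-128. It returns the chosen
# flavour, the (possibly extended) path, and the remaining elements.
sub pick_flavour {
    my ($path_info, $elements, $flav_param, $default_flavour) = @_;
    my @elements = @$elements;
    my $flavour;
    if (@elements and $elements[$#elements] =~ /(.+)\.(.+)$/) {
        $flavour = $2;                               # the extension
        $1 ne 'index' and $path_info .= "/$1.$2";    # keep real filenames
        pop @elements;                               # discard the last element
    } else {
        $flavour = $flav_param || $default_flavour;
    }
    return ($flavour, $path_info, @elements);
}

my ($flavour, $path, @left) = pick_flavour('/Technology', ['2006', 'index.rss'], undef, 'html');
print "$flavour $path @left\n";   # prints rss /Technology 2006
```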
129. blank line
130. # Strip spurious slashes
131. $path_info =~ s!(^/*)|(/*$)!!g;
Using Perl's substitution operator this line matches and then discards any/all slashes at the beginning and at the end of the string in $path_info.
s!(^/*)|(/*$)!!g
^/* - matches any number of '/' immediately following the start of the string.
| - alternation. This character in the pattern instructs Perl's regex engine to match on either of the two subpatterns here.
/*$ - matches any number of '/' immediately preceding the end of the string.
/g - this, the global modifier, tells Perl to continue with all possible substitutions rather than stopping with the first match.
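A quick sketch of the substitution in isolation; strip_slashes is a hypothetical name used only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper applying the substitution from line 131 to a copy.
sub strip_slashes {
    my ($path) = @_;
    $path =~ s!(^/*)|(/*$)!!g;
    return $path;
}

print strip_slashes('//Technology/AppleInc/'), "\n";   # prints Technology/AppleInc
```

Slashes in the middle of the string are untouched because neither alternative can match there.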
132. blank line
133. # Date fiddling
134. ($path_info_yr,$path_info_mo,$path_info_da) = @path_info;
Remember that we recently popped off the filename or index request (if present).
Previously we shifted off any element of @path_info before the first value starting with a digit.
The only (valid) elements remaining must be part of a request to blosxom's date-based archive scheme.
e.g.
If the browser request was:
http://sample.net/cgi-bin/blosxom.cgi/2006/12/31
then @path_info at this point contains the list "2006", "12", "31".
Note that it is not necessary for a browser to specify all of year, month, and day in a request, but it's also not valid to skip date values, e.g specifying year and day but not month.
There will be as few as 0 elements remaining in @path_info and as many as three.
All elements remaining will be in the order of: year, month, day
This line assigns each of the (possibly) remaining elements to corresponding variables.
If @path_info does not contain one or more of these values, the corresponding variables will be assigned the value undef.
We can check for this value later in the code.
135. $path_info_mo_num = $path_info_mo ? ( $path_info_mo =~ /\d{2}/ ? $path_info_mo : ($month2num{ucfirst(lc $path_info_mo)} || undef) ) : undef;
The next line may appear at first to be a bit confusing. This is the first time we see a nested use of the ternary operator.
Let's take it one step at a time working from left to right.
$path_info_mo_num =
Tells us we'll be assigning something to the package global $path_info_mo_num.
$path_info_mo
First we evaluate the variable $path_info_mo by itself. This evaluation will return true if the variable was assigned a value in line 134, and it will return false if the value of $path_info_mo is undef.
If true we evaluate the expression to the left of the colon, ':', and if false we evaluate the expression to the right.
Taking the false case first
Be careful not to get confused by the nested use of the ?: operator.
The ':' that pairs with the first '?' is all the way at the end of the statement.
The value to the right of ':' is simply 'undef'.
So if $path_info_mo is undef, meaning that we do not have a month as part of the browser request, then $path_info_mo_num is also assigned the value of 'undef' and we're finished with this line.
If $path_info_mo evaluates to true then we consider the expression
( $path_info_mo =~ /\d{2}/ ? $path_info_mo : ($month2num{ucfirst(lc $path_info_mo)} || undef) )
First we evaluate
$path_info_mo =~ /\d{2}/
We know that $path_info_mo has some value at this point and here we try to match that value against the pattern
\d{2}
which matches two consecutive digits anywhere in the value (the pattern is not anchored).
Taking the true case first.
If the match succeeds, then the browser request included a month containing two digits. Of course this could be any two digit value, e.g. 67, which is an unusual month.
In this case, we assign $path_info_mo to $path_info_mo_num and we're finished with the line.
If the match fails then we consider the expression:
($month2num{ucfirst(lc $path_info_mo)} || undef)
Again we have the || operator used for flow control.
$month2num{ucfirst(lc $path_info_mo)}
Working from the inside out, we first convert the value in $path_info_mo to lowercase
(lc $path_info_mo)
That string is then passed to the function ucfirst, which capitalizes the first character of the string
(ucfirst(lc $path_info_mo))
At this point, if the value of $path_info_mo is a string, we know the first letter will be capitalized and the rest of the string will be lowercase.
Now we use that string as a key in the %month2num hash we defined earlier, line 82.
The keys in %month2num look like 'Jan', 'Feb', 'Mar', etc.
If we find the key, then the expression
$month2num{ucfirst(lc $path_info_mo)}
evaluates to the value at the appropriate key in the hash; this value (a two digit month) is assigned to $path_info_mo_num and we're finished with the line.
If the key does not exist, because the value is anything other than 'Jan', 'Feb', ... 'Dec', then the expression evaluates to false and the second half of the || is considered, which results in undef being assigned to $path_info_mo_num.
Summary:
If $path_info_mo is either a two digit value or the name of a month in the form of 'Jan', 'Feb', ..., 'Dec', something that is a key in the %month2num hash, then $path_info_mo_num is assigned an appropriate two digit value, otherwise it is undef.
For example:
http://sample.net/cgi-bin/blosxom.cgi/2006/12/31
or
http://sample.net/cgi-bin/blosxom.cgi/2006/Dec/31
would both result in $path_info_mo_num being assigned the value '12' at this line.
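The nested ternary can be exercised with a short sketch. The contents of %month2num below are an assumption based on the description above (three-letter month names mapped to two digit strings); the function name month_num is hypothetical.

```perl
use strict;
use warnings;

# Assumed shape of %month2num: 'Jan' .. 'Dec' mapped to '01' .. '12'.
my %month2num = (
    Jan => '01', Feb => '02', Mar => '03', Apr => '04',
    May => '05', Jun => '06', Jul => '07', Aug => '08',
    Sep => '09', Oct => '10', Nov => '11', Dec => '12',
);

# Hypothetical wrapper around the expression on line 135.
sub month_num {
    my ($mo) = @_;
    my $num = $mo
        ? ( $mo =~ /\d{2}/ ? $mo : ($month2num{ucfirst(lc $mo)} || undef) )
        : undef;
    return $num;
}

print month_num('12'),  "\n";   # prints 12
print month_num('DEC'), "\n";   # prints 12
print defined(month_num('Smarch')) ? "defined" : "undef", "\n";   # prints undef
```

Note that '67' is passed through unchanged, since the pattern only checks for two digits.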
136. blank line
137. # Define standard template subroutine, plugin-overridable at Plugins: Template
138. $template =
This looks like any other assignment statement at this point.
We'll see on the next line that what we're assigning to the variable is actually a reference to a subroutine.
139. sub {
The start of the anonymous subroutine that will serve as the default template routine
140. my ($path, $chunk, $flavour) = @_;
Most Perl subroutines start with a line like this naming the expected parameters.
Here we see that the $template subroutine expects 3 parameters
$path, which is the path corresponding to the browser request.
The script starts looking for template files close to the requested file/directory and works up toward the root of the data directory.
$chunk, which is a particular piece of the template (i.e. one of 'date', 'content_type', 'head', 'story', 'foot').
$flavour, the requested $flavour.
Note that here we're declaring $flavour as a new lexical (my) variable, available inside this subroutine only.
This variable will mask the package global with the same name inside the subroutine's block.
General point:
We will often see the same variable name used more than once in the source. Rules of scope determine which variable, of potentially many with the same name, is used when a variable is referred to by name.
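A small sketch of the masking described above. The package global is declared with 'our' here only so that the example runs under 'use strict'; the two function names are hypothetical.

```perl
use strict;
use warnings;

our $flavour = 'html';   # package global, declared with 'our' for strict

# The lexical $flavour declared with 'my' masks the package global
# for the rest of this subroutine's block.
sub local_flavour {
    my ($path, $chunk, $flavour) = @_;
    return $flavour;
}

# No lexical here, so $flavour refers to the package global.
sub package_flavour {
    return $flavour;
}

print local_flavour('', 'head', 'rss'), "\n";   # prints rss
print package_flavour(), "\n";                  # prints html
```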
141. blank line
142. do {
Start of do/while loop
143. return join '', <$fh> if $fh->open("< $datadir/$path/$chunk.$flavour");
Because this line is part of the body of a do/while loop, it will be executed some number of times (as determined by the evaluation of the while condition that follows, line 144), but it must be executed at least once before the while condition is tested.
This line does a few things we haven't seen before.
It is an example of a Perl expression modifier, which is simply a more compact way of writing a block of code, an if block in this case.
First we tell perl what we'll do (typically the body of an if block or while loop)
return join '', <$fh>
and next we specify that we'll execute the preceding expression only if the second part of the statement evaluates as true (this is the condition)
$fh->open("< $datadir/$path/$chunk.$flavour")
The evaluation of the first expression depends on the value (true or false) of the second, so let's start with the second expression.
What is the second part of the statement doing?
$fh->open("< $datadir/$path/$chunk.$flavour")
Is piecing together $path, $chunk, and $flavour with $datadir, the manually defined user configurable variable, to form a path leading to a particular template file.
The expression attempts to open this file for reading, and will return true if the request to open the file succeeds. It will return false otherwise.
If we're able to read the template file requested, then we read the entire file
<$fh>
combines the filehandle we just opened with Perl's line input operator '< >'.
In list context this returns all of the contents of the file as a list of values, where each list element is a line from the file.
We join the list together separated by the empty string '' (i.e. nothing), and return the resulting string to the caller.
Summary:
This line will run at least once (do/while) and attempts to open the file requested for reading. If successful, the line returns the contents of the specific file requested to the caller as a string.
144. } while ($path =~ s/(\/*[^\/]*)$// and $1);
Now we get to the condition that determines the number of times we'll run the body of the do block. Again, we know the loop will run at least once.
After the first run and before every subsequent run we evaluate the condition
$path =~ s/(\/*[^\/]*)$// and $1
The statement begins with another substitution.
$path =~ s/(\/*[^\/]*)$//
Let's replace the delimiters used here to make this easier to understand
s!(/*[^/]*)$!!
Note especially the presence of '$', which tells us that we're attempting to match at the end of the string value in $path.
What exactly does this pattern match?
Any number of '/' characters followed immediately by any number of any character other than '/'
'^', when included as the first element of a character class, negates the class.
Because we're substituting nothing (!!) we're dropping the matched portion of the value in $path.
So, each time we evaluate the condition we're dropping the last part of the $path passed to the function.
If this match succeeds, the second part of the statement is evaluated,
$1
which will be true only if the portion of the pattern contained within the parentheses matched something. This is necessary because, as constructed, the pattern match will succeed even if nothing is matched.
\/*[^\/]*$
matches zero or more occurrences of '/' followed by zero or more [^\/] (any character other than '/'), and because it matches even on zero occurrences, the overall pattern match will always succeed.
However, $1 will only evaluate as true if some portion of the original string matched the pattern within the parentheses. So once $path has been exhausted, the pattern match still succeeds, but $1 is empty, evaluates as false, and we drop out of the loop.
Summary:
We start looking for a template file at the end of the path passed to the function.
If ever we succeed in finding a template file and it can be successfully opened for reading, then we immediately return from the subroutine.
Until we find and succeed at opening the requested template we work up the provided $path, trying to open the requested template file at each successive level.
If we never succeed in opening the requested template file, we'll eventually drop out of the loop after exhausting $path completely, and failing the test condition of the while loop.
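The order in which directories are tried can be seen without touching the filesystem. In this sketch the hypothetical helper lookup_order records each value of $path instead of trying to open a file; the while condition is the one from line 144.

```perl
use strict;
use warnings;

# Hypothetical helper: list the values of $path (relative to $datadir)
# that the do/while loop would try, in order, if no template were found.
sub lookup_order {
    my ($path) = @_;
    my @tried;
    do {
        push @tried, $path;
    } while ($path =~ s/(\/*[^\/]*)$// and $1);
    return @tried;
}

print join(', ', map { "'$_'" } lookup_order('Technology/AppleInc/Hardware')), "\n";
# prints 'Technology/AppleInc/Hardware', 'Technology/AppleInc', 'Technology', ''
```

The final empty string corresponds to the root of the data directory itself.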
145. blank line
146. # Check for definedness, since flavour can be the empty string
147. if (defined $template{$flavour}{$chunk}) {
If the script executes this statement, we must not have been successful in finding the requested template file along the $path, or we would have returned from the subroutine by now.
Now what do we do?
We're looking for a requested template file and we haven't found what we're looking for along $path, but we're not out of options yet.
Here, we first try to find the requested template in the %template hash, which holds a set of templates baked in to the script itself. (We haven't seen this hash defined yet - it's coming up shortly.)
%template is a hash of hashes keyed by flavour.
For example, %template may contain a number of flavours, each a hash of key/value pairs, where each pair is the name of a template component (the key) and the contents of that component (the value).
For example
If we were not successful in finding a head.html template somewhere in the $path, next we check the %template hash for $template{html}{head}.
If $template{$flavour}{$chunk} exists (is defined), then we move on to the statements in the body of the if block -see the next line.
If this hash key is not defined, because there is no corresponding entry in the %template hash, then the body of the block is skipped and we continue.
148. return $template{$flavour}{$chunk};
In the case that $template{$flavour}{$chunk} exists, this statement returns the value (the contents of the requested component from the baked-in templates) to the caller, and we're done with this call to the template routine.
149. } elsif (defined $template{error}{$chunk}) {
If the condition tested at line 147 fails, because there is no corresponding entry in the %template hash, then we move on to the elsif clause and evaluate its condition.
This expression again looks to the %template hash, but now we've given up on finding the requested flavour, and we're only looking for the specified component from a baked-in error template.
We're simply trying to return an error message specific to the component requested, indicating that the requested flavour is not available.
If $template{error}{$chunk} exists (i.e. is defined), then we move on to the statements in the body of the elsif block -see the next lines.
If even this fails, the body of the block is skipped and we continue. When will this happen? Only when the component requested ($chunk) is not one of the valid choices (head, story, foot, date, content_type).
150. return $template{error}{$chunk}
In the case that $template{error}{$chunk} exists, this statement returns the value (the contents of the requested component from the baked-in error template) to the caller, and we're done with this call to the template routine.
151. } else {
End of the block that is the body of the previous elsif clause and the beginning of the default else clause. If none of the conditions tested in the preceding if and elsif blocks evaluate as true then the statements of this block will (definitely) be evaluated.
Of course, if any of the previous conditions succeeded then the statements here are not executed, in fact we've already returned from the $template routine.
152. return '';
If even $template{error}{$chunk} is not defined, then the component requested, $chunk, is not one of the valid choices (head, story, foot, date, content_type).
This statement returns the empty string ('') to the caller.
At this point we're sure to have returned from the subroutine. At the very least, we return nothing at all, but we do return.
153. }
End of the else block
154. };
End of the $template subroutine definition
Keep in mind that this is (finally) the end of the assignment statement that began at line 138.
Summary
The subroutine expects three parameters ($path, $chunk, $flavour) and attempts to open the corresponding template file.
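The fallback order after the file lookup fails (lines 147-153) can be sketched with a tiny hash. The template strings below are made up for illustration, and builtin_template is a hypothetical name; the real %template is filled from the __DATA__ section.

```perl
use strict;
use warnings;

# Illustrative data only; not the templates shipped with blosxom.
my %template = (
    html  => { head => "<html>\n" },
    error => { head => "Error: unknown flavour\n" },
);

# Hypothetical helper covering lines 147-153 (the file lookup is omitted).
sub builtin_template {
    my ($chunk, $flavour) = @_;
    if (defined $template{$flavour}{$chunk}) {
        return $template{$flavour}{$chunk};     # baked-in template for this flavour
    } elsif (defined $template{error}{$chunk}) {
        return $template{error}{$chunk};        # baked-in error template
    } else {
        return '';                              # unknown component
    }
}

print builtin_template('head', 'html');    # prints <html>
print builtin_template('head', 'atom');    # prints Error: unknown flavour
```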
155. # Bring in the templates
156. %template = ();
Initialization of the %template hash.
It is initially empty.
157. while (<DATA>) {
We've seen while loops before, and this is just the beginning of another. On the other hand...
<DATA>
is new.
Quickly look at line 445.
You'll find a line that looks like this:
__DATA__
followed by a number of lines that specify various flavours, chunks, and text that look suspiciously like they might be the content of template components.
__DATA__ is a marker used as a pseudo-datafile.
It is used to define a group of lines treated by Perl as if they were contained in a file.
Perl can open this pseudo-datafile for reading like any other file (Of course it would make no sense to write to DATA!).
The while loop here works as if DATA were in fact an open filehandle.
The while loop will, at each evaluation of DATA, read in another of the lines following __DATA__ at line 445 and run the body of the loop.
158. last if /^(__END__)$/;
After reading a series of lines, one at a time, and working through the body of the while loop, we'll eventually encounter
__END__
at line 461 (go take a look), at which point this statement drops out of the while loop.
Specifically, the line tries to match the current line to the pattern ^(__END__)$ which specifies a line consisting of exactly the text __END__ and nothing else.
This is the second expression modifier we've encountered.
The expression
last
is evaluated only if the pattern matches.
If you're confused, just realize that this line could be rewritten as:
if(/^(__END__)$/) {
last;
}
When evaluated, last drops out of the loop.
159. my($ct, $comp, $txt) = /^(\S+)\s(\S+)(?:\s(.*))?$/ or next;
We know we're working with one line below the __DATA__ marker at line 445, and above __END__ at line 461.
If we get past line 158 then we know the current line is not yet
__END__
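Notice the three variables to the left of the assignment statement, and the three sets of capturing parentheses in the pattern.
We're attempting to assign each parenthesized portion of the matched line to one of the variables.
^(\S+)\s(\S+)(?:\s(.*))?$
Literally this pattern is:
^ - the start of the line.
(\S+) - one or more non-whitespace characters, captured as $1.
\s - a single whitespace character.
(\S+) - one or more non-whitespace characters, captured as $2.
(?:\s(.*))? - optionally, a single whitespace character followed by any number of any character, captured as $3. The (?: ) grouping itself does not capture.
$ - the end of the line.
So if, for the moment, we define a word as a sequence of non-whitespace characters then we're trying to capture:
the first word of the line,
the second word of the line, and
the rest of the line, if there is any.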
If we look at that __DATA__ section we see:
$ct will be one of 'html', 'rss', or 'error'
$comp will be one of 'content_type', 'head', 'story', 'date', or 'foot'
$txt will be the rest of the line, essentially the complete text that makes up one portion of one of the baked-in template files.
$ct is essentially a flavour (e.g. html),
$comp, a component (chunk) of one of the flavours (e.g. head), and
$txt the contents of one chunk of one of the flavours, specifically the text of component $comp of flavour $ct (e.g. the contents of the baked-in head.html).
Notice that there is no conditional here.
We know these matches will succeed for as long as the while loop continues because the lines we're working with have been constructed this way. We're not making an external call to the system or depending on user input, either of which would give us cause to doubt the format of the strings.
The last bit of the statement is:
or next
This is the logical operator 'or', used here as a partial evaluation operator. The first expression
my($ct, $comp, $txt) = /^(\S+)\s(\S+)(?:\s(.*))?$/
will evaluate as true whenever the pattern matches. In list context a successful match returns the list of captured values, and the value of a list assignment, when tested as a condition, is the number of elements on its right-hand side. If the pattern matches, that number is 3, even when the optional third group matched nothing (in which case $txt is undef). If the pattern fails to match, the match returns the empty list, the value is 0, and we evaluate the rest of the statement
next
which skips the rest of the while loop and immediately continues with the next line at the top of the loop. If we haven't matched then we don't want to continue processing the current line.
Note that this shouldn't be much of an issue unless there is a mistake in the __DATA__ section of the source file.
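To see what the list assignment evaluates to, here is a minimal sketch. The helper captured_count is hypothetical (not part of blosxom); it counts the values the match returns in list context, using the same pattern as line 159.

```perl
use strict;
use warnings;

# Hypothetical helper: count the values returned by the match in list
# context. "= () =" forces list context and yields the element count.
sub captured_count {
    my ($line) = @_;
    my $count = () = $line =~ /^(\S+)\s(\S+)(?:\s(.*))?$/;
    return $count;
}

print captured_count('html head <html>'), "\n";   # prints 3
print captured_count('html head'), "\n";          # prints 3 (third value is undef)
print captured_count('oneword'), "\n";            # prints 0, so 'or next' would run
```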
160. $txt =~ s/\\n/\n/mg;
A simple substitution, not unlike the others we've seen, with a small but significant exception.
Here we're replacing each occurrence of the two-character sequence \n (a backslash followed by the letter n, written \\n in the pattern) with an actual newline character.
The goal is to replace escaped newline sequences with real newlines (we're unescaping the newlines).
There are two modifiers used:
g - We've already seen that this modifier makes the substitution global, so that it matches every occurrence of the pattern in the string, not just the first.
m - This one is new. Normally patterns match against the entire string, for example '^' and '$' refer to the beginning and the end of the entire string, not individual lines within the string. For a string composed of multiple lines this could be a problem.
The m modifier tells Perl's regex engine to let '^' and '$' match at the beginning and end of internal lines, and not just at the beginning and end of the entire string. Because this particular pattern contains neither anchor, the modifier makes no difference to this substitution.
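A short sketch of the unescaping step; unescape_newlines is a hypothetical name. The input is written in single quotes so that it really contains a backslash followed by 'n' rather than a newline.

```perl
use strict;
use warnings;

# Hypothetical helper applying the substitution from line 160 to a copy.
sub unescape_newlines {
    my ($txt) = @_;
    $txt =~ s/\\n/\n/mg;
    return $txt;
}

my $raw = 'line one\nline two';            # 18 characters, no newline yet
print length($raw), "\n";                  # prints 18
print length(unescape_newlines($raw)), "\n";   # prints 17
```

Each two-character escape collapses to a single newline character, which is why the length drops by one.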
161. $template{$ct}{$comp} .= $txt . "\n";
Using the three variables we've just assigned, this statement goes about the business of building the %template hash structure -see discussion of line 147.
Later we'll be able to default to these baked-in templates by referring to the hash.
When looking for $chunk, and $flavour in the $template subroutine we'll be able to use those variables to get at the contents of the %template hash we're building up here.
$template{$flavour}{$chunk} in the $template routine is equivalent to $template{$ct}{$comp} here.
To the value at {$ct}{$comp} we append $txt (concatenation) and end the value with a newline.
162. }
End of while loop started at line 157.
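The loop as a whole can be sketched as follows. This is not the code from blosxom.cgi: the sample lines and the exact pattern are assumptions made for illustration, and an array stands in for the DATA filehandle.

```perl
use strict;
use warnings;

# Hypothetical sample lines in "flavour component text" form.
my @data = (
    'html head <h1>$blog_title</h1>',
    'html foot </body>\n</html>',
    'rss head <rss>',
    'this-line-has-no-fields',
);

my %template;
for (@data) {
    # Assumed pattern: two whitespace-free fields, then the rest of the line.
    # The list assignment yields 3 on a match and 0 otherwise, so a line
    # that does not match is skipped by "or next".
    my ($ct, $comp, $txt) = /^(\S+)\s(\S+)\s(.*)$/ or next;
    $txt =~ s/\\n/\n/g;
    $template{$ct}{$comp} .= $txt . "\n";
}

print join(', ', sort keys %template), "\n";
```

Perl creates the inner hash for each flavour on first use, so no explicit setup of %template is needed before the loop.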
163. blank line
164. # Plugins: Start
165. if ( $plugin_dir and opendir PLUGINS, $plugin_dir ) {
Start of conditional block
Before we can start to work with plugins, this line checks that $plugin_dir has been specified ($plugin_dir is a user configurable variable and is initially the empty string, '').
If it has any value other than the empty string or '0', whether or not the value corresponds to a valid path, $plugin_dir evaluates as true.
Next, the script attempts to open the plugin directory, associating it with the directory handle PLUGINS
opendir PLUGINS, $plugin_dir
Only if $plugin_dir is not '' and we are able to successfully open the directory do we execute the code in the body of the conditional.
If either of the two expressions fails, then we skip the block entirely but continue with the execution of the script.
166. foreach my $plugin ( grep { /^\w+$/ && -f "$plugin_dir/$_" } sort readdir(PLUGINS) ) {
Start of foreach loop and condition
This loop makes up the bulk of the conditional block. Let's take it a piece at a time, working from right to left
readdir(PLUGINS)
reads the contents of the directory returning all items found in the directory in the same order you would get from running ls -f on the directory, including dot files and subdirectories.
sort readdir(PLUGINS)
This list is sorted in ascending ASCIIbetical order (the default for sort on ASCII strings).
The sorted list is handed over to grep, which picks items from the list according to the expression in the braces that follow it:
grep { /^\w+$/ && -f "$plugin_dir/$_" } sort readdir(PLUGINS)
An entry gets past grep (and is included in the list grep returns) only if it satisfies two conditions joined by &&:
/^\w+$/ && -f "$plugin_dir/$_"
First, it must match the pattern /^\w+$/,
which specifies
^ - the beginning of the string
\w+ - one or more word characters (letters, digits, or '_')
$ - the end of the string
So, we're looking only for entries made up entirely of letters, digits, and '_' characters (i.e. characters that match \w).
This eliminates any dot files, along with other odd filenames.
This correctly implies that plugin names should only include digits, letters, and underscores.
-f "$plugin_dir/$_"
This second condition specifies that the item must be a file.
-f is one of Perl's file test operators. It returns true only if the string it's given names a plain file (as opposed to a directory or another type of filesystem item).
It's important to note that readdir returns only the names of the directory entries, not including path information.
Because of this, we prepend to each plugin name the path to the plugins directory that the user defined as $plugin_dir.
Compare
$plugin_dir/$_, which will be something like
'/path/to/plugin/dir/file_name'
and
$_
which is just 'file_name'
-f filename would look for filename in the current working directory, which is the directory containing the blosxom.cgi when the script starts.
This ensures that -f is looking in the right place for the names we're testing.
It's the difference between:
-f "interpolate_fancy"
which will fail, even if the interpolate_fancy plugin exists in the correct plugin directory, and
-f "/Library/WebServer/Data/Blosxom/plugins/interpolate_fancy"
Also note the use of $_,
which is a default variable name often used by Perl. In this case $_ is set to the value of each item being looked at by grep in turn.
grep does not modify the list it is given. It returns a new, filtered list (which will at the very least lack '.' and '..'), and the foreach loop is run on that filtered list.
foreach is another loop control structure.
The variable $plugin is assigned each value from the list returned from grep, and the block is run repeatedly until the list is exhausted.
So we see that the block is running against each plugin found in $plugin_dir.
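Here is a small sketch of the same readdir, sort, grep chain run against a throwaway directory. The file names are invented for illustration, and a lexical directory handle is used in place of the bareword PLUGINS.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Build a temporary directory holding a mix of plausible and odd names.
my $plugin_dir = tempdir(CLEANUP => 1);
for my $name (qw(02second 01first disabled_ .hidden notes.txt backup~)) {
    open my $fh, '>', "$plugin_dir/$name" or die "cannot create $name: $!";
    close $fh;
}
mkdir "$plugin_dir/subdir" or die "cannot create subdir: $!";

# Same filter as line 166: word characters only, and a plain file.
opendir my $dh, $plugin_dir or die "cannot open $plugin_dir: $!";
my @candidates = grep { /^\w+$/ && -f "$plugin_dir/$_" } sort readdir($dh);
closedir $dh;

print "$_\n" for @candidates;
```

Only 01first, 02second, and disabled_ survive: the dot file, the name with a '.', and the name with a '~' fail the pattern, and subdir fails the -f test.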
167. next if ($plugin =~ /~$/); # Ignore emacs backups
This statement causes the loop to skip all files that match the pattern ~$
You should be able to recognize this statement as an expression modifier. First we test the condition
$plugin =~ /~$/
and evaluate the expression
next
only if the condition is true.
$plugin =~ /~$/ is true if the string in $plugin matches the pattern ~$, which specifies
~ - a literal tilde character
$ - the end of the string
next
skips the evaluation of the rest of the statements in the foreach loop and immediately continues at the top of the loop with the next $plugin.
Note that '~' is not matched by \w, so the grep on line 166 has already removed any name containing '~' from the list. This test is therefore redundant as the code stands, but it is harmless and documents the intent not to treat editor backup files as plugins.
168. my($plugin_name, $off) = $plugin =~ /^\d*(\w+?)(_?)$/;
$plugin is the name of one of our plugins.
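This line matches against that name and, using parentheses in the pattern, assigns portions of the matched string to the two variables $plugin_name and $off
^\d*(\w+?)(_?)$
Matches:
^ - the beginning of the string
\d* - zero or more digits (not captured)
(\w+?) - one or more word characters, as few as possible
(_?) - an optional trailing underscore
$ - the end of the string
The question mark (?) plays two different roles here. After '_' it is a quantifier instructing the regex engine to match zero or one occurrence of the preceding character, so that portion of the pattern is optional. After '+' it makes the quantifier non-greedy, so \w+? matches as few characters as it can, which leaves a trailing '_' available for the second group.
The portion of the string matching \w+? is assigned to the variable $plugin_name.
The portion matching _? is assigned to the variable $off.
Note: because _? is optional, $off may be assigned the empty string. $plugin_name always contains at least one character when the match succeeds.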
For example, maybe we're using the interpolate_fancy plugin. Then that plugin will be included in the $plugin_dir, maybe with the name
interpolate_fancy
If you've read the project documentation then you'll know that we can enforce strict ordering by prepending plugin names with numbers. These numbers are matched by the pattern but not assigned to $plugin_name. This correctly implies that plugin names should not begin with digits. (This is covered in the documentation).
We'll keep reading to discover the significance of $off, but if you've read the documentation you might be able to guess. After the pattern match, $off will contain either '_' or the empty string, ''.
From the documentation we know that we can disable a plugin by appending '_' to the plugin's name.
$off will contain '_' if the plugin name ends in '_' and it will contain '' otherwise.
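To see how the pattern splits real names, here is a short sketch that applies it to a few file names. The names other than interpolate_fancy are made up for illustration.

```perl
use strict;
use warnings;

my %parsed;
for my $plugin (qw(interpolate_fancy interpolate_fancy_ 01first 02second_)) {
    # Same pattern as line 168: leading digits are matched but not captured,
    # and the non-greedy \w+? leaves a trailing '_' for the second group.
    my ($plugin_name, $off) = $plugin =~ /^\d*(\w+?)(_?)$/;
    $parsed{$plugin} = [ $plugin_name, $off ];
    printf "%-20s name=%-18s off='%s'\n", $plugin, $plugin_name, $off;
}
```

Note that an underscore in the middle of a name, as in interpolate_fancy, stays part of $plugin_name. Only the final underscore ends up in $off.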
169. my $on_off = $off eq '_' ? -1 : 1;
Based on the value of $off, which will be either '_' or the empty string, '', $on_off is assigned -1 if $off is '_' and 1 otherwise.
We know from the documentation that '_', when appended to the end of a plugin name, is used to indicate that a plugin should be treated as inactive.
So if the plugin is to be treated as inactive, $off is '_', and $on_off has the value '-1'.
170. require "$plugin_dir/$plugin";
require
Loads in external functions from a library, $plugin_dir/$plugin in this case. After this line we can refer to functions defined in the plugin file.
171. $plugin_name->start() and ( $plugins{$plugin_name} = $on_off ) and push @plugins, $plugin_name;
You should recognize that this line is yet another example of the use of a partial evaluation operator, this time connecting three subexpressions.
Reminder: Such statements are evaluated from left to right.
'and' is a short-circuited logical operator, meaning that evaluation of the entire statement ends as soon as the truth or falsehood of the statement is known.
Because 'and' requires all subexpressions to be true, we must evaluate all of the expressions to determine that the statement is true, but can determine that the entire statement is false as soon as we encounter a single false expression.
We can order expressions in such a way that we can depend on values in later subexpressions, if we confirm those values in earlier ones, because if the earlier statement were not true, subsequent expressions will be harmlessly ignored.
This type of partial evaluation expression is very heavily used by blosxom.
These statements can be confusing, so be careful.
OK, enough of that, what is this statement saying?
Looking at it one expression at a time from left to right:
$plugin_name->start()
We call the plugin's start() subroutine. We can do this because of the presence of the require statement above.
We know from the documentation that this routine must return 1, a true value, to inform blosxom that it should consider the plugin active.
From the documentation
The start subroutine is required.
Its purpose is to let Blosxom know that it has indeed loaded a plugin and should consider it active.
Inform Blosxom so by returning a 1 (true),
as shown in this simplest of possible examples:
sub start { 1; }
This lets Blosxom know that it should consider the plugin alive and well and should offer it the ability to act at each upcoming callback point.
If $plugin_name->start() does not evaluate as true, then we skip the rest of this long statement.
Assuming $plugin_name->start() is true, we continue evaluating the statement
and ( $plugins{$plugin_name} = $on_off )
Here we create a key/value pair in the %plugins hash, assigning the value of $on_off (either -1 or 1) to the key $plugin_name, which appropriately enough is the name of the current plugin.
The value of $on_off is determined in line 169.
Because both -1 and 1 are true values, the assignment itself always evaluates as true and the statement continues on to the push.
We can see now that the %plugins hash is keyed by $plugin_name and stores status information of all plugins in $plugin_dir (namely the $on_off value).
Note that disabled plugins (plugins with name ending in an underscore, set by the user) are included in the hash but inactive plugins (those for which $plugin_name->start() does not evaluate as true) are not.
If $plugin_name->start() does not return true, we'll never make it to this expression.
Also keep in mind that just because a plugin is represented in this hash, it does not mean the plugin is active. An $on_off value of -1 indicates that the user has disabled it.
push @plugins, $plugin_name
Finally, assuming the other two expressions are true, we push the current plugin onto the end of the @plugins array.
@plugins is apparently a list of all valid plugins for which $plugin_name->start() is true.
This list may include plugins disabled by the user.
Notice that this line is essentially an oddly written if block.
If the first expression is true, then we take two actions as defined by the 2nd and 3rd subexpression.
This could, and probably should, be rewritten as an if clause without being made any less efficient.
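As a rough sketch of what that rewrite might look like, here is the same logic as an if block. The three plugin packages are hypothetical and are defined inline rather than loaded with require, so that the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical plugins: two that agree to start and one that refuses.
{ package alive;   sub start { 1 } }
{ package muted;   sub start { 1 } }
{ package refuses; sub start { 0 } }

my (%plugins, @plugins);

# Each pair is a plugin name and the $on_off value line 169 would compute.
for my $entry ( [ alive => 1 ], [ muted => -1 ], [ refuses => 1 ] ) {
    my ($plugin_name, $on_off) = @$entry;

    # Equivalent to the chained 'and' statement on line 171.
    if ( $plugin_name->start() ) {
        $plugins{$plugin_name} = $on_off;
        push @plugins, $plugin_name;
    }
}

print "registered: @plugins\n";
```

The if form also removes a hidden dependency of the original: the chain only reaches the push because -1 and 1 both happen to be true values.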
172. }
End of foreach loop started at line 166.
173. closedir PLUGINS;
Closes the directory handle after running through the plugins directory completely.
174. }
End of if block started at line 165.
Keep in mind that we skip this entire block if $plugin_dir is not set by the user or if the directory cannot be opened.
175. blank line
176. # Plugins: Template
177. # Allow for the first encountered plugin::template subroutine to override the
178. # default built-in template subroutine
179. my $tmp; foreach my $plugin ( @plugins ) { $plugins{$plugin} > 0 and $plugin->can('template') and defined($tmp = $plugin->template()) and $template = $tmp and last; }
This is an ugly line. We'll take it in pieces.
The line starts with the statement
my $tmp;
which declares the lexical variable $tmp.
Next we see
foreach my $plugin ( @plugins ) { $plugins{$plugin} > 0 and $plugin->can('template') and defined($tmp = $plugin->template()) and $template = $tmp and last; }
This is a foreach loop collapsed to a single line. It will look more familiar to you rewritten as
foreach my $plugin ( @plugins ) {
$plugins{$plugin} > 0 and $plugin->can('template') and
defined($tmp = $plugin->template()) and $template = $tmp
and last;
}
foreach my $plugin ( @plugins ) {
This is the start of the foreach loop which runs through all of the plugins listed in @plugins. These are the active plugins, in the sense that $plugin->start() returned true, but this list may include plugins disabled by the user (e.g. interpolate_fancy_).
Each is set to $plugin in turn.
$plugins{$plugin} > 0 and $plugin->can('template') and
defined($tmp = $plugin->template()) and $template = $tmp
and last;
Yet another long 'and'ed partial eval statement.
Let's take it piece by piece, from left to right.
$plugins{$plugin} > 0
Remember that %plugins contains $on_off values for active plugins where '1' means on and '-1' means off.
We only continue considering the current plugin if it has not been disabled by the user.
If the value at $plugins{$plugin} is <= 0 we skip the rest of the statement, with the result that we do nothing with the plugin, and nothing is exactly what we want to do with disabled plugins.
If $plugin has not been disabled by the user $plugins{$plugin} > 0 will be true and we continue with the statement.
and $plugin->can('template')
Here we're testing to see if the current plugin has a template subroutine.
If the return from the can method is true, then $plugin claims to have a template() method, which means we should be able to safely refer to $plugin->template().
from perldoc
$obj->can(METHOD)
can checks if the object or class has a method called METHOD. If it does then a reference to the sub is returned. If it does not then undef is returned.
and defined($tmp = $plugin->template())
In this line we assign the reference to the anonymous subroutine returned by $plugin->template() to $tmp and then check that $tmp has a defined value.
and $template = $tmp
Next, we assign the value from $tmp to $template, overwriting the reference to the default template subroutine that was in $template before now.
Why use $tmp at all?
Because we do not want to overwrite $template until we are pretty sure we have a valid (defined) reference to a new template subroutine.
Once we know that $tmp is defined, we can pass its value to $template.
and last
Finally, the last operator drops us out of the foreach loop.
This means that we will stop looking for replacement template routines after the first one we find.
This is consistent with what the documentation tells us:
The first encountered plugin::template subroutine overrides the default.
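The override loop can be sketched in a more spread-out form as shown below. The plugin packages and the strings they return are hypothetical, and the chained 'and' has been unrolled into separate tests, so this illustrates the logic rather than reproducing the original line.

```perl
use strict;
use warnings;

# Hypothetical plugins: plug_a has no template(), plug_b and plug_c do.
{ package plug_a; sub start { 1 } }
{ package plug_b; sub template { return sub { "from plug_b" } } }
{ package plug_c; sub template { return sub { "from plug_c" } } }

my @plugins = qw(plug_a plug_b plug_c);
my %plugins = ( plug_a => 1, plug_b => 1, plug_c => 1 );

# Stand-in for the built-in default template subroutine.
my $template = sub { "built-in default" };

for my $plugin (@plugins) {
    next unless $plugins{$plugin} > 0;       # skip plugins disabled by the user
    next unless $plugin->can('template');    # skip plugins without template()
    my $tmp = $plugin->template();
    next unless defined $tmp;                # keep the old value on failure
    $template = $tmp;
    last;                                    # the first override wins
}

print $template->(), "\n";
```

Even though plug_c also offers a template routine, it is never consulted because the loop ends at plug_b.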
180. blank line
181. # Provide backward compatibility for Blosxom < 2.0rc1 plug-ins
I'm skipping the next three lines because they only provide compatibility with previous versions of blosxom.
Even the newest blosxom code is old at this point. Everyone has had plenty of time (many years) to upgrade.
No one should be using pre-2.0rc1 versions of blosxom. Upgrade to either 2.0 or 2.0.2. There is really no reason not to use 2.0.2.
182. sub load_template {
See notes at line 181.
183. return &$template(@_);
See notes at line 181.
184. }
See notes at line 181.
185. blank line
186. # Define default entries subroutine
187. $entries =
This looks like any other assignment statement at this point.
We'll see on the next line that what we're assigning to the variable is actually a reference to a subroutine.
We're assigning $entries a reference to the anonymous subroutine that follows, which is the baked in entries routine.
We have seen this once before, line 138, and we will see this same pattern again each time we set the default for one of: start (line 138), entries (line 187), head (line 328), sort (line 350), date (line 385), story (line 401), foot (line 423), filter (not defined in blosxom.cgi but available to plugins -see documentation), and end (not defined in blosxom.cgi but available to plugins -see documentation)
188. sub {
The start of the anonymous subroutine that will serve as the default entries routine.
189. my(%files, %indexes, %others);
Here we declare three lexical variables: %files, %indexes, %others
We'll discuss each of these as they're used.
The only thing we know about them now is that they are all hashes.
190. find(
This is the beginning of a call to the File::Find library's find routine.
find(\&wanted, @directories_to_search);
find expects two arguments.
The first is a reference to a subroutine that is to be run against every file and directory found as Perl recurses through the directory structure starting at @directories_to_search.
What follows this line is the definition of the subroutine passed to find as \&wanted.
Jumping ahead in the code, we can see the second argument to find on line 228 is $datadir
This makes sense, and is probably what you'd expect. We're looking for entries so we start at $datadir.
Some important things to keep in mind about the operation of find() -see man File::Find for more info.
------ From the documentation
"find()" does a depth-first search over the given @directories in the order they are given.
For each file or directory found, it calls the &wanted subroutine.
Additionally, for each directory found, it will "chdir()" into that directory and continue the search, invoking the &wanted function on each file or subdirectory in the directory.
The wanted function takes no arguments but rather does its work through a collection of variables.
- $File::Find::dir is the current directory name,
- $_ is the current filename within that directory
- $File::Find::name is the complete pathname to the file.
Don't modify these variables.
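A minimal, self-contained sketch of find() at work is shown below. The directory tree is a throwaway one created just for the example, and the names in it are made up.

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Build a small temporary tree to search.
my $root = tempdir(CLEANUP => 1);
mkdir "$root/tech" or die "mkdir failed: $!";
for my $path ("$root/hello.txt", "$root/tech/perl.txt") {
    open my $fh, '>', $path or die "cannot create $path: $!";
    close $fh;
}

my @seen;
find(
    sub {
        # find() has changed into the current directory, so the bare name
        # in $_ can be tested directly. $File::Find::name is the full path.
        push @seen, $File::Find::name if -f $_;
    },
    $root
);

print "$_\n" for sort @seen;
```

The anonymous subroutine is called for directories as well as files, which is why the -f test is needed to collect files only.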
191. sub {
This is the start of the definition of an anonymous subroutine that will serve as &wanted, a reference to which is passed to find().
See the notes for line 190 for more details.
Note that we could have defined this function elsewhere and simply passed a reference to it here, which would have avoided spreading the call to find() over nearly 40 lines.
192. my $d;
A simple variable declaration.
As we'll see, the variable holds the date string built from the values returned by nice_date(), and is used as both key and value for the %indexes hash.
193. my $curr_depth = $File::Find::dir =~ tr[/][];
$File::Find::dir
is the complete name of the current directory, from the root, as the find routine makes its way through the $datadir hierarchy.
$File::Find::dir =~ tr[/][]
We're counting the number of forward slashes, '/', (path delimiters) in the complete name and storing that value in $curr_depth. With an empty replacement list, tr leaves the string unchanged and simply returns the number of characters it matched.
We've seen this use of tr[/][] before in line 104. See the notes at line 104 for more information.
194. return if $depth and $curr_depth > $depth;
We've seen expression modifiers like this before.
Essentially this is an if statement with the condition
$depth and $curr_depth > $depth
The expression $depth evaluates to true if it is anything other than zero.
Remember that $depth is one of our user configurable variables. A value of zero indicates infinite depth.
If $depth is zero, we want blosxom to consider all directories under the requested directory.
In this case, because 0 evaluates as false, we do not evaluate the second part of the condition, which compares $curr_depth and $depth.
If $depth is not 0 then we do evaluate the second half of the condition.
and $curr_depth > $depth
This is true if $curr_depth is greater than the user configurable $depth value.
For example:
If the user has requested that blosxom consider only posts no more than 3 levels under the requested directory ($depth = 3), we should stop if $curr_depth is 4.
If both $depth and $curr_depth > $depth are true we immediately return from the function at this point, otherwise we continue.
What does all this mean?
find() will run this function on every item below $datadir but we want to give users the option of limiting the script to a maximum level under the requested directory. To accomplish this we cut the function short for every call after we have exceeded $depth, so that we essentially ignore all posts below $depth.
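The depth test can be sketched with two small helpers. Both subroutine names are hypothetical, and the sketch compares raw slash counts, so $depth here plays the role of the already-adjusted value the script works with.

```perl
use strict;
use warnings;

# Count the '/' characters in a path. tr with an empty replacement list
# counts the matching characters without changing the string.
sub slash_count {
    my ($dir) = @_;
    return $dir =~ tr[/][];
}

# Hypothetical helper mirroring the test on line 194.
# A $depth of 0 means "no limit", so nothing is ever too deep.
sub too_deep {
    my ($dir, $depth) = @_;
    return ( $depth and slash_count($dir) > $depth ) ? 1 : 0;
}

printf "%s has %d slashes\n", '/a/b/c', slash_count('/a/b/c');
```

With these helpers, too_deep('/a/b/c/d', 3) is true because the path holds four slashes, while too_deep('/a/b/c/d', 0) is false because a zero depth switches the limit off.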
195. blank line
196. if (
Start of conditional block.
The condition here will define what blosxom considers a post.
See lines 197 - 200 for more info
197. # a match
198. $File::Find::name =~ m!^$datadir/(?:(.*)/)?(.+)\.$file_extension$!
$File::Find::name is the complete path to the current file,
e.g. /some/path/foo.ext
In this line we are doing nothing more than attempting a match on $File::Find::name
m!^$datadir/(?:(.*)/)?(.+)\.$file_extension$!
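This pattern matches:
^$datadir/ - the value of $datadir at the beginning of the string, followed by '/'
(?:(.*)/)? - optionally, any run of characters followed by '/' (the directories below $datadir)
(.+) - one or more characters (the filename)
\.$file_extension$ - a literal '.' followed by the value of $file_extension at the end of the string
What about (?:
It instructs Perl's regex engine that this set of parentheses should be used for grouping only, and should not trigger the creation of a match variable.
The inner parentheses here, (.*), do trigger a match variable, and because the enclosing set of parens begins with ?:, the portion of the string matched by the pattern .* is assigned to the variable $1.
?: is intended to be used precisely for this purpose, to avoid creating unnecessary variables.
In summary, this portion of the condition is satisfied if the complete path to the current file:
- begins with $datadir/
- ends with .$file_extension
- has at least one character of filename between the two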
Important:
The quantifiers used by Perl's regex engine are greedy, so (?:(.*)/) will match as much of the string as possible.
Because .* matches any number of any character, this may include internal '/' characters.
e.g., given
$datadir/some_dir/sub_dir/sub_sub_dir/filename.extension
(?:(.*)/) will match all of the substring
'some_dir/sub_dir/sub_sub_dir/'
The two memory variables created contain:
$1 - The portion of the path after $datadir, not including a leading and trailing delimiter ('/'), and not including the filename.
$2 - The current filename, not including the $file_extension.
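The following sketch runs the pattern from line 198 against a few sample paths. The values of $datadir and $file_extension and all of the paths are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Assumed configuration values, for illustration only.
my $datadir        = '/var/blog';
my $file_extension = 'txt';

my %result;
for my $name (
    '/var/blog/hello.txt',
    '/var/blog/tech/perl/regex.txt',
    '/var/blog/tech/notes.html',
) {
    # Same pattern as line 198. $1 is the directory part below $datadir,
    # $2 is the filename without its extension.
    if ( $name =~ m!^$datadir/(?:(.*)/)?(.+)\.$file_extension$! ) {
        $result{$name} = [ $1, $2 ];
    }
}

print scalar(keys %result), " of 3 paths matched\n";
```

For a file sitting directly in $datadir the optional group does not take part in the match, so $1 is undefined rather than an empty string. That is worth remembering wherever $1 is used later.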
199. # not an index, .file, and is readable
200. and $2 ne 'index' and $2 !~ /^\./ and (-r $File::Find::name)
This line continues the conditional expression started at line 198.
and $2 ne 'index'
Assuming we've gotten to this point $2 contains a filename not including the .$file_extension.
This portion of the expression returns true if the value of $2 is anything other than 'index'.
So find(), and blosxom, skip any index files encountered.
Assuming $2 is not 'index' then we continue with the condition
and $2 !~ /^\./
This portion of the conditional is again an attempt to match against $2,
the pattern ^\. specifies
^ - the beginning of the string
\. - a literal '.' character
This part of the condition succeeds only if the match fails (compare !~ and =~)
So find(), and blosxom, skip over dot files, (because we do not match if the filename begins with a dot).
Assuming $2 does not begin with a dot, '.', we continue evaluating the condition
and (-r $File::Find::name)
Finally, this portion of the condition checks that the current file is readable.
find(), and blosxom, skip any non-readable files, (because we do not match if the file is not readable).
201. ) {
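End of the condition expression that started on line 196, and the beginning of the conditional block.
To be absolutely clear, we match files that:
- live under $datadir and have the extension $file_extension
- are not named index.$file_extension
- do not have a name beginning with '.'
- are readable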
202. blank line
203. # to show or not to show future entries
204. (
This is the beginning of a long statement spread out over lines 204 to 222.
The purpose of this statement is to determine whether the file currently being considered, which is $File::Find::name, should be output.
Note that there is no semicolon at the end of the statement (line 222).
Why?
Because a semicolon is not technically required for the last statement in a block. Use of a semicolon is still generally encouraged, especially for a statement like this, spread out over 19 lines.
205. $show_future_entries
The first subexpression is spread out over these 4 lines, 204 - 207.
Because this is part of a statement connected by logical operators, evaluation starts on the left and continues, or stops, as directed by the values of the subexpressions, and the logical operators we encounter.
Notice that this subexpression is itself composed of the 'or' of two subexpressions.
The expression evaluates to true if one, the other, or both of the subexpressions are true.
Also, as we've seen before, evaluation stops as soon as we can determine the truth or falsehood of the expression. We may not evaluate all of the subexpressions.
$show_future_entries is the user configurable variable defined at line 42.
A value of 0 indicates that the user does not want to show future entries (post-dated entries, or entries with modification times occurring at some point in the future relative to now).
A value of 1 indicates a preference to display future entries.
If the value is 1, a true value, then we're done with this subexpression, which also evaluates as true, and pick up with the next starting at line 210.
Otherwise we continue to the second subexpression...
206. or stat($File::Find::name)->mtime < time
Here we compare the modification time of the file currently being considered,
stat($File::Find::name)->mtime,
to the present time, returned by the Perl function time.
Conveniently, times returned by stat and time are in the same format (or they would not be directly comparable). Both return a value in Unix timestamp format, which is (roughly) the number of seconds since the epoch, an easy value for us to work with.
If none of that sounds familiar to you, don't worry about it too much. For our purposes it means that the number returned is a simple integer value that's perfect for comparisons like this. If I'm comparing the time now to the time from a week ago, then the time now will be a larger number because some number of seconds will have passed over the week. More specifically, time now will be time_a_week_ago + 604800 (the number of seconds in a week).
If the modification time is less than (<) the present time, that is to say if the modification time is in the past, then the comparison returns 1, a true value, and the subexpression (lines 204 - 207) is true.
At this point we've seen the first subexpression and determined its truth or falseness.
The subexpression is connected to the rest of the statement (continuing to line 222) by the logical operator 'and'.
Because all expressions connected by 'and' must be true, if this first subexpression returns false, then we are done with the entire statement that runs from 204 - 222.
Otherwise, we continue evaluating the rest of the statement.
To summarize the first subexpression
First we check the value of $show_future_entries, because if the user has instructed blosxom to display posts with future modification times then the modification time of the file is unimportant here.
On the other hand, if future entries should not be output then we must compare the modification time of the file to the current time, and we cut the statement short if the file has a future modification time.
207. )
End of the expression that started at line 204, which is part of the statement that runs to line 222.
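Here is a small sketch of the future-entry test in isolation. The helper name is_visible is hypothetical, and a temporary file with its modification time pushed forward by a day stands in for a post-dated entry.

```perl
use strict;
use warnings;
use File::stat;
use File::Temp qw(tempfile);

# Hypothetical helper: true when an entry with this mtime should be shown.
# It mirrors the 'or' on lines 205 - 206.
sub is_visible {
    my ($mtime, $show_future_entries) = @_;
    return ( $show_future_entries or $mtime < time ) ? 1 : 0;
}

my ($fh, $path) = tempfile(UNLINK => 1);
close $fh;

# Push the file's access and modification times one day into the future.
my $tomorrow = time + 86_400;
utime $tomorrow, $tomorrow, $path or die "utime failed: $!";

# With File::stat loaded, stat() returns an object with an mtime method.
my $mtime = stat($path)->mtime;
print "visible by default: ", is_visible($mtime, 0), "\n";
```

The ->mtime method call only works because File::stat has been loaded. Perl's built-in stat returns a plain list instead.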
208. blank line
209. # add the file and its associated mtime to the list of files
210. and $files{$File::Find::name} = stat($File::Find::name)->mtime
This line should always evaluate as true (a modification time is a non-zero integer unless the file is dated exactly at the epoch).
In fact this (the entire statement) is yet another example of the use of partial evaluation operators to control the execution of the script and would be written more traditionally as a conditional block. I would go so far as to say that this is a particularly good example of the problem you get into when overusing these sorts of constructions.
Anyway, remember that we have declared a lexical %files hash (scoped to this subroutine) at line 189.
Here we get our first look at what we'll be doing with that hash.
The %files hash is a collection of pairs where each key is the complete path of a file (one which has not been eliminated from consideration as a post) and the corresponding value is the file's current modification time in unix timestamp format, as returned from the call to stat.
This expression stores the modification time value keyed by the complete path and we continue on in the statement.
211. blank line
212. # static rendering bits
213. and (
Start of the next subexpression that runs to line 217.
The next bit of the statement is spread out over these lines 213 - 217.
This subexpression is itself composed of 3 subexpressions. We'll consider each in turn.
Note that because these are connected with the logical operator 'or' the expression is true as soon as any of the three subexpressions is determined to be true.
As the comment in blosxom.cgi explains, this expression deals with static rendering.
214. param('-all')
From the documentation
"To force Blosxom to regenerate all pages, add another command-line switch, -all=1 , like so:
% perl blosxom.cgi -password='whateveryourpassword' -all=1"
So if param('-all') is true then blosxom should generate all pages.
If this is determined to be true, the evaluation of this expression is complete.
Otherwise we continue...
215. or !-f "$static_dir/$1/index." . $static_flavours[0]
Note that the variable $1 refers to the match made at line 198.
From that match
$1 is the portion of the path after $datadir, not including a leading and trailing delimiter ('/'), and not including the filename.
$static_flavours[0] refers to the first element of the user configurable @static_flavours array defined at line 61.
For example if the array were defined as follows (the default)
@static_flavours = qw/html rss/;
then $static_flavours[0] is 'html'.
This string ($static_flavours[0]) is concatenated to the path $static_dir/$1/index to give us the complete path to an index page for a requested flavour in the user-configured $static_dir directory.
We check to see if this file exists.
-f is a test to determine if it is a file. It may be preferable to simply check for existence (-e) instead.
Note the use of the negation operator (!). Because of its presence, the expression evaluates to true if not -f, that is, if "$static_dir/$1/index." . $static_flavours[0] is not a file.
If an index for the first of the @static_flavours corresponding to the current path in $datadir does not exist (is not a file) at the corresponding location in $static_dir then this expression is true.
Otherwise, we continue...
216. or stat("$static_dir/$1/index." . $static_flavours[0])->mtime < stat($File::Find::name)->mtime
This is similar to the sort of time comparison we saw at line 206, but in this case we're comparing the modification time of the static 'index.' file, created sometime before now (as discussed in the note at line 210), to the modification time of the file currently being processed.
The less than operator (<) evaluates to true if the static index file is older than the current file.
Keep in mind that we know the statically generated index exists because we must have failed the previous file test (!-f) if we are considering this part of the expression.
It should make sense to you that if any current file was modified after (is newer than) the static index page (which must include the current file) then we need to recreate the static file to include the changes.
Otherwise, nothing has changed, as far as the current file is concerned.
If this expression is false, we are finished with our long statement (lines 204 - 222), which is determined to be false.
Otherwise, if this portion of the statement evaluates to true, we continue to the next subexpression.
217. )
End of the expression that started at line 213, which is part of the statement that runs to line 222.
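The three-way 'or' on lines 214 - 216 can be sketched as a single helper. The name index_is_stale and the temporary files are hypothetical. The helper takes a plain flag where the script calls param('-all').

```perl
use strict;
use warnings;
use File::stat;
use File::Temp qw(tempdir);

# Hypothetical helper: should the static index be regenerated for this entry?
sub index_is_stale {
    my ($index_path, $entry_path, $force_all) = @_;
    return 1 if $force_all;          # plays the role of param('-all')
    return 1 if !-f $index_path;     # no static index has been written yet
    my $index_mtime = stat($index_path)->mtime;
    my $entry_mtime = stat($entry_path)->mtime;
    return $index_mtime < $entry_mtime ? 1 : 0;
}

# Create an entry and an index, then make the index one minute older.
my $dir   = tempdir(CLEANUP => 1);
my $entry = "$dir/post.txt";
my $index = "$dir/index.html";
for my $path ($entry, $index) {
    open my $fh, '>', $path or die "cannot create $path: $!";
    close $fh;
}
my $now = time;
utime $now, $now - 60, $index or die "utime failed: $!";
utime $now, $now,      $entry or die "utime failed: $!";

print "stale: ", index_is_stale($index, $entry, 0), "\n";
```

The order of the tests matters in the same way it does in the original: the modification time of the index is only read once we know the index exists.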
218. and $indexes{$1} = 1
We declared the %indexes hash on line 189 (with %files and %others). Here is the first mention of the hash since that declaration.
This portion of the statement should always evaluate as true.
If the entire statement were rewritten as a conditional (it might be easier to read that way), then this would be a statement in the body and not a part of the condition.
We're simply creating a new key/value pair in the %indexes hash.
Our hash key in this case is $1.
Remember from line 198 that $1 is the portion of the path after $datadir not including a leading and trailing delimiter ('/'), and not including any filename.
And the value is simply '1'.
So what is this expression and %indexes hash doing?
If we must create a static index page for the directory containing the current file, then we indicate this by storing a value of '1' in the %indexes hash, keyed by the name of the directory itself.
Later, we can use this hash to determine what index pages we need to generate.
219. and $d = join('/', (nice_date($files{$File::Find::name}))[5,2,3])
...continuing with our long statement
Assuming we've made it this far, and we won't have if any of the preceding expressions were false, then we retrieve from the %files hash the previously saved modification time (line 210) for the current file.
$files{$File::Find::name}
and pass that value to the function nice_date(), which is defined in the blosxom.cgi itself (lines 433 - 442).
nice_date($files{$File::Find::name})
We'll talk about how nice_date() works when we get there. At this point all we need to know about nice_date() is that
We pass it just the sort of value we've previously stored in our %files hash,...
namely, the mtime value returned from a call to stat(), which is an integer representing the number of seconds elapsed since midnight UTC on the morning of January 1, 1970 (referred to as the epoch).
...and it returns a list of values corresponding to the following variables
$dw, $mo, $mo_num, $da, $ti, $yr
From this list we grab the elements at indexes 5, 2, and 3 (in that order). List indexes count from zero, so these are $yr, $mo_num, and $da.
(nice_date($files{$File::Find::name}))[5,2,3]
and join them as a single string, separating each by the delimiter '/'.
join('/', (nice_date($files{$File::Find::name}))[5,2,3])
e.g.
For a file last modified on November 24, 2006 we would end up with '2006/11/24'
We store this value at $d (declared at line 192).
$d = join('/', (nice_date($files{$File::Find::name}))[5,2,3])
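To see the slice-and-join in isolation, here is a minimal sketch. nice_date_sketch() is a hypothetical stand-in written for this note, not blosxom's own nice_date(); it is assumed only to return the same six-element list ($dw, $mo, $mo_num, $da, $ti, $yr), and it uses gmtime so that the result does not depend on the local time zone.

```perl
use strict;
use warnings;

# Hypothetical stand-in for nice_date(): returns ($dw, $mo, $mo_num, $da, $ti, $yr).
# It uses gmtime (an assumption made for this sketch) to stay time-zone independent.
sub nice_date_sketch {
    my ($unixtime) = @_;
    my @day_names   = qw(Sun Mon Tue Wed Thu Fri Sat);
    my @month_names = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
    my ($min, $hour, $mday, $mon, $year, $wday) = (gmtime($unixtime))[1 .. 6];
    return (
        $day_names[$wday],                  # index 0: $dw
        $month_names[$mon],                 # index 1: $mo
        sprintf('%02d', $mon + 1),          # index 2: $mo_num
        sprintf('%02d', $mday),             # index 3: $da
        sprintf('%02d:%02d', $hour, $min),  # index 4: $ti
        $year + 1900,                       # index 5: $yr
    );
}

my $mtime = 1164369600;    # seconds since the epoch: 2006-11-24 12:00:00 UTC

# The list slice [5,2,3] picks year, month number, day (zero-based indexes).
my $d = join('/', (nice_date_sketch($mtime))[5, 2, 3]);
print "$d\n";    # prints 2006/11/24
```

The parentheses around the function call matter: a list slice can only be applied to a parenthesized list, which is why line 219 is written (nice_date(...))[5,2,3].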
220. blank line
221. and $indexes{$d} = $d
We just saw use of the %indexes hash on line 218.
In that case we were storing the following key/value pairs:
keys: directory paths, eg 'Technology/Computer/Apple'
value: 1, a status flag indicating that we need to generate an index page for the directory named by the key.
Here we are using the same hash to store the following pairs:
keys: The date strings temporarily stored at $d (line 219), generated from the list of values returned from nice_date (line 219)
values: The value is that same $d value,
For example:
the key $indexes{'2006/11/24'} has the value '2006/11/24'.
It seems that the purpose of the %indexes hash in both cases is to keep track of the static index pages we'll need to generate during static mode operation.
Remember that Blosxom supports both category-based and date-based archive schemes.
We'll need to create indexes for both when statically rendering the site.
222. and $static_entries and $indexes{ ($1 ? "$1/" : '') . "$2.$file_extension" } = 1
We've finally arrived at the last line in our long (long) statement.
This is the final subexpression that completes the picture of the statement we've built up starting with line 204.
Remember, that we'll only get to this point in the statement if all of the previous expressions evaluate as true.
Assuming that is the case, $static_entries is a user configurable variable, a switch indicating a preference to generate static files for each individual post (1) or not (0).
What does that mean?
To this point we've only been dealing with index files.
For each directory in the category hierarchy and each level of date-based archive scheme, we generate a single file containing all of the posts that belong in that directory or date range.
Just as blosxom is capable of generating pages to specific posts when running dynamically, we can generate pages for specific posts in static mode.
Why wouldn't you want to do this in static mode? If you have a lot of posts, then this will generate many, many files. Specifically, one file per post, per category/date, per static flavor.
Realize that this may mean that a single post is replicated many times if it occurs in a deeply nested category and also once each for the year, month and day that the date-based scheme requires.
and $static_entries
If this variable ($static_entries) is assigned the value 1, then we continue past the first part of this expression and continue evaluating the statement.
The next part of this expression isn't particularly easy to read.
It uses the ternary operator, the memory variables $1 and $2 from line 198, and the %indexes hash again.
Here's how it works:
and $indexes{ ($1 ? "$1/" : '') . "$2.$file_extension" } = 1
We will be storing the value of 1 somewhere in the %indexes hash.
At what key?
The ternary operator gives us two possibilities.
$1 is evaluated.
When defined at line 198 this memory variable was part of an optional component in the regular expression.
It contains either the path after $datadir (not including a leading or trailing delimiter, '/', and not including the filename), or it is empty. (Strictly speaking, $1 is undefined when the optional group took no part in the match; the ternary operator treats undef as false, just as it would the empty string.)
In the first case, we evaluate the string "$1/", which is simply the value of $1 with the addition of a trailing forward slash (/).
Remember that if $1 is anything other than empty, it is the path from $datadir down to the directory holding the requested post. Here we append a forward slash as a delimiter. (We know it's not already present because the regular expression at line 198 leaves it outside the capture.)
In the second case $1 is the empty string and we evaluate '', which is simply ''. This will be the case if there is no path between $datadir and the requested post.
In other words, $1 will be the empty string if the requested post is at the root of blosxom's data directory.
Either way we concatenate this with $2.$file_extension and this is our key.
Remember that $2, again from line 198, is the name of the requested post, without the file extension.
We append a literal dot (.) and the value of $file_extension, the user configurable variable set at line 37.
So, the key in this case is the complete path from the root of the data directory to the requested file including the filename and extension.
The value is 1.
This is a third type of key in %indexes.
There are now three: directory paths, date strings, and paths to individual posts.
At this point we can use the %indexes hash as a 'list of ingredients', telling us what to generate when running in static mode.
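The construction of the individual-post key can be tried out on its own. The sketch below reuses the regular expression from line 198 and the ternary from line 222; the $datadir value and the two file names are made up for illustration.

```perl
use strict;
use warnings;

# Assumed values, chosen only for this sketch.
my $datadir        = '/blog/data';
my $file_extension = 'txt';

my %indexes;
for my $name ("$datadir/Technology/Computer/Apple/post.txt", "$datadir/hello.txt") {
    # Same match as line 198: $1 is the optional category path, $2 the base name.
    next unless $name =~ m!^$datadir/(?:(.*)/)?(.+)\.$file_extension$!;

    # Same key as line 222: prefix "$1/" only when there is a category path.
    my $key = ($1 ? "$1/" : '') . "$2.$file_extension";
    $indexes{$key} = 1;
}

print "$_\n" for sort keys %indexes;
```

The first name yields the key 'Technology/Computer/Apple/post.txt'; the second, which sits at the root of the data directory, yields just 'hello.txt'.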
223. blank line
224. }
End if block started at line 201.
225. else {
Beginning of else clause that pairs with the if starting at line 196.
226. !-d $File::Find::name and -r $File::Find::name and $others{$File::Find::name} = stat($File::Find::name)->mtime
This statement is saying:
If the requested name is not a directory, and it is readable, then store the modification time in the %others hash keyed by the file name.
If you look back at the if condition at lines 198 - 200 you'll see that we were looking for:
$File::Find::name =~ m!^$datadir/(?:(.*)/)?(.+)\.$file_extension$!
# not an index, .file, and is readable
and $2 ne 'index' and $2 !~ /^\./ and (-r $File::Find::name)
If any of those conditions fail we end up here, evaluating the else clause, and we do not evaluate the else clause if all of those conditions are met.
Assuming that one of those tests fails, and keeping in mind that we won't know which one has failed, what do we do next?
Let's look at the statement one expression at a time.
You'll recognize this as another statement composed of multiple partial evaluation operators.
!-d $File::Find::name
If the item ($File::Find::name) is not a directory.
-d is the directory test and so if we negate that, with the negation operator, (!), we evaluate to true if $File::Find::name is not a directory.
and -r $File::Find::name
If we return true for the first expression, then we evaluate this one, which is true if $File::Find::name is readable, -r
and $others{$File::Find::name} = stat($File::Find::name)->mtime
Finally, this expression should always evaluate to true.
The statement could easily, and probably should, be rewritten as a conditional block. It will take up more space that way but is no less efficient and will be much easier for novices to read.
In any case, here we are storing a key/value pair in the %others hash (declared at line 189, along with %files and %indexes).
The key is the filename itself.
The value is the modification time of the file.
This is exactly what we did before (line 210) with the %files hash.
Apparently, anything that is not a directory and is readable, but has failed some condition that would qualify it as a post is tracked in the %others hash.
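As a rough illustration of the rewrite suggested above, the following sketch runs the chained form from line 226 and an equivalent conditional block side by side. The temporary file and the two hash names are invented for this example; File::stat is loaded because the ->mtime call depends on it.

```perl
use strict;
use warnings;
use File::stat;                   # makes stat() return an object with ->mtime
use File::Temp qw(tempfile);

# A throwaway readable file standing in for "something that is not a post".
my ($fh, $name) = tempfile(UNLINK => 1);
print {$fh} "not a post\n";
close $fh;

my (%others_chain, %others_block);

# The chained form, as written at line 226.
!-d $name and -r $name and $others_chain{$name} = stat($name)->mtime;

# The same logic as a conditional block.
if (!-d $name and -r $name) {
    $others_block{$name} = stat($name)->mtime;
}

print "same result\n" if $others_chain{$name} == $others_block{$name};
```

Both forms stop at the first failing test, so neither calls stat() on a directory or an unreadable file.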
227. }
End of else clause
228. }, $datadir
end of subroutine passed to find() as \&wanted.
Some long argument, huh?
find() expects its first parameter to be a reference to a function that is called for each file and directory found.
This was defined as an anonymous subroutine within the call to find().
The second parameter that find() expects is a list of directories to search.
We pass find() $datadir.
Because $datadir contains all of our posts, it makes sense that we would want find() to look for posts starting at the root of $datadir.
229. );
Finally we're at the end of the call to find() that was started at line 190.
230. blank line
231. return (\%files, \%indexes, \%others);
Our return from the entries routine is a list of the references that we've built up.
The %files hash contains all of our posts.
The %indexes hash contains an indication of the pages that will be generated when blosxom is run in static mode.
From the notes at line 222, the %indexes hash contains three kinds of key: directory paths, date strings, and paths to individual posts.
We can use the %indexes hash as a 'list of ingredients', telling us what to generate when running in static mode.
Finally, the %others hash contains key/value pairs for files which are readable, but did not meet the definition of an entry for one of the reasons spelled out at lines 198 to 200.
The documentation tells us
The subroutine should return references to a hash of files, and another of indexes to be built (in the case of static rendering).
232. };
End of the default entries subroutine definition.
233. blank line
234. # Plugins: Entries
235. # Allows for the first encountered plugin::entries subroutine to override the
236. # default built-in entries subroutine
237. my $tmp; foreach my $plugin ( @plugins ) { $plugins{$plugin} > 0 and $plugin->can('entries') and defined($tmp = $plugin->entries()) and $entries = $tmp and last; }
This is an ugly line. We'll take it in pieces.
The line starts with the statement
my $tmp;
which declares the local variable $tmp.
Next we see
foreach my $plugin ( @plugins ) { $plugins{$plugin} > 0 and
$plugin->can('entries') and
defined($tmp = $plugin->entries()) and $entries = $tmp
and last; }
This is a foreach loop collapsed to a single line. It will look more familiar to you rewritten as
foreach my $plugin ( @plugins ) {
$plugins{$plugin} > 0 and $plugin->can('entries') and
defined($tmp = $plugin->entries()) and
$entries = $tmp and last;
}
foreach my $plugin ( @plugins ) {
This is the start of the foreach loop which runs through all of the plugins listed in @plugins. These are the active plugins, in the sense that $plugin->start() returned true, but note that the list still includes plugins disabled by the user (e.g. interpolate_fancy_).
Each is set to $plugin in turn.
$plugins{$plugin} > 0 and $plugin->can('entries') and
defined($tmp = $plugin->entries())
and $entries = $tmp and last;
Yet another long anded partial eval statement.
Let's take it piece by piece, from left to right.
$plugins{$plugin} > 0
Remember that %plugins contains $on_off values for active plugins where 1 means on and -1 means off.
We only continue considering the current plugin if it has not been disabled by the user.
If the value at $plugins{$plugin} is <= 0 we skip the rest of the statement, with the result that we do nothing with the plugin, and nothing is exactly what we want to do with disabled plugins.
If $plugin has not been disabled by the user $plugins{$plugin} > 0 will be true and we continue with the statement.
and $plugin->can('entries')
Here we're testing to see if the current plugin has an entries subroutine.
If the return from the can method is true, then $plugin claims to have an entries() method, which means we should be able to safely refer to $plugin->entries().
from perldoc
$obj->can(METHOD)
can checks if the object or class has a method called method. If it does then a reference to the sub is returned. If it does not then undef is returned.
and defined($tmp = $plugin->entries())
In this line we assign the reference to the anonymous subroutine returned by $plugin->entries() to $tmp and then check that $tmp has a defined value.
and $entries = $tmp
Next, we assign the value from $tmp to $entries, overwriting the reference to the default subroutine that was in $entries before now (defined at lines 186 - 232).
Why use $tmp at all?
Because we do not want to overwrite $entries until we are pretty sure we have a valid (defined) reference to a new entries subroutine.
Once we know that $tmp is defined, we can pass its value to $entries.
and last
Finally, the last operator drops us out of the foreach loop.
This means that we will stop looking for replacement entries routines after the first one we find.
This is consistent with what the documentation tells us
The first encountered plugin::entries subroutine overrides the default.
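The selection logic can be exercised without blosxom at all. In the sketch below the four plugin packages are hypothetical, invented purely for illustration, and the loop is spelled out with next/last instead of the chained 'and' form. It is roughly equivalent to line 237, not a copy of it.

```perl
use strict;
use warnings;

# Four hypothetical plugins, invented for this sketch.
{ package plugin_off;    sub entries { return sub { 'plugin_off' } } }
{ package plugin_plain;  sub story   { return 1 } }    # has no entries()
{ package plugin_first;  sub entries { return sub { 'plugin_first' } } }
{ package plugin_second; sub entries { return sub { 'plugin_second' } } }

my @plugins = qw(plugin_off plugin_plain plugin_first plugin_second);
my %plugins = (plugin_off => -1, plugin_plain => 1, plugin_first => 1, plugin_second => 1);

my $entries = sub { 'default' };    # stands in for the built-in routine

foreach my $plugin (@plugins) {
    next unless $plugins{$plugin} > 0;       # skip plugins the user disabled
    next unless $plugin->can('entries');     # skip plugins without entries()
    my $tmp = $plugin->entries();
    next unless defined $tmp;                # keep the old routine unless we got something
    $entries = $tmp;
    last;                                    # the first replacement wins
}

my $chosen = $entries->();
print "$chosen\n";    # prints plugin_first
```

plugin_off is skipped because its switch is -1, plugin_plain because it has no entries() method, and plugin_second is never consulted because the loop has already exited.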
238. blank line
239. my ($files, $indexes, $others) = &$entries();
It would be a shame to go through all of the trouble of defining the entries routine (lines 186 - 232) and then not call it.
Here we call the routine and store the return values.
Remember that the entries routine returns references to the hashes %files, %indexes, and %others.
It should be easy to remember the names of the reference variables used, because each one matches the name of the hash it refers to: $files, $indexes, and $others.
Keep in mind that we checked for entries routines in our plugins to override the default before now, so here we're calling either the default or some plugin's entries routine.
240. %indexes = %$indexes;
As just discussed $indexes is a reference to the %indexes hash built up in the entries routine.
Here we dereference it,
%$indexes
that is we're accessing the original hash through the reference.
And what are we doing with the hash?
We're making a copy of it in a NEW %indexes hash.
Take care to note that this is a different %indexes hash than the one we were dealing with in the entries routine.
They have the same name but they do not conflict because the scope of the two variables is different.
The %indexes hash, declared in the entries subroutine is known only in that routine.
The %indexes hash here is the package variable declared at line 69.
Also realize that $indexes is a reference to the same data, i.e. the same storage location, as the %indexes hash defined in the entries routine, such that making a change either through %indexes in the entries routine or through %$indexes affects the same data.
The package variable %indexes, the target of our assignment on this line, is a different storage location. We're creating a new copy of the same data at a new location. Though the data is initially the same, changing the value of %indexes outside of the entries routine does not affect the value accessible via the $indexes reference.
This could be important to keep in mind because it's possible for the code to continue working with both. Got it?
Well, it so happens that we never see the $indexes reference again in the script. Oh well, everything I've said here is no less true.
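A small sketch makes the difference between the reference and the copy concrete. The hash names and keys here are invented for illustration; only the %copy = %$ref assignment mirrors line 240.

```perl
use strict;
use warnings;

my %inner = ('Technology' => 1);    # plays the part of %indexes inside the routine
my $ref   = \%inner;                # what the routine returns
my %copy  = %$ref;                  # what line 240 does: a new, separate hash

$copy{'added_to_copy'}    = 1;      # changes only the copy
$ref->{'added_via_ref'}   = 1;      # changes the original hash

print "inner: ", join(', ', sort keys %inner), "\n";
print "copy:  ", join(', ', sort keys %copy),  "\n";
```

Note that the copy is shallow: this is enough here because the values in %indexes are plain scalars rather than nested structures.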
241. blank line
242. # Static
243. if (!$ENV{GATEWAY_INTERFACE} and param('-password') and $static_password and param('-password') eq $static_password) {
Start of conditional block
This line represents a major branch in the operation of the script.
The conditional here tests to determine if blosxom is running in static mode.
If the conditional expression evaluates as true, we are in static mode and the if block executes.
We should expect the code we find in the block to be relevant to static mode operation.
If the conditional expression fails, then we are not running in static mode. In this case, static mode execution should be skipped entirely, and we must be running in dynamic mode (the only other mode available). The if block is skipped and we pick up execution with the else clause at line 282. But that's getting a little ahead of ourselves.
Keep in mind that we're running in static mode until the end of this block (line 280).
See notes at line 109 for more info.
244. blank line
245. param('-quiet') or print "Blosxom is generating static index pages...\n";
This line either prints the message
'Blosxom is generating static index pages...\n'
to stdout, or prints nothing if the user requests quiet operation by passing -quiet=1.
As written, if param('-quiet') evaluates to true, then we skip the second expression and so do not print the message.
If on the other hand param('-quiet') is false, then the second expression is evaluated and the message is printed.
From the blosxom documentation we're told
To have Blosxom's static rendering run silently -- perhaps you're running it automatically at regular intervals and you don't want all that output popping up on your screen or being mailed to you-- add -quiet=1 like so:
% perl blosxom.cgi -password='whateveryourpassword' -quiet=1
246. blank line
247. # Home Page and Directory Indexes
248. my %done;
A simple declaration for the lexical (my) hash %done
249. foreach my $path ( sort keys %indexes) {
REPEAT FOR ALL KEYS
The start of a foreach loop that iterates over all of the keys in the %indexes hash.
Remember that %indexes was built up in the entries routine, and contains all of the info we need to know about the pages we should generate during static mode operation.
There are three types of key/value pairs in %indexes, each corresponding to the three different types of static page we may need to generate.
category-based pairs
key: directory paths, eg 'Technology/Computer/Apple'
value: 1
A status flag indicating that we will be generating an index page for the directory named by the key, because there is at least one post in the category itself or a subcategory. A placeholder really, because categories for which we will not be generating pages are not in the hash at all.
date-based pairs
key: The date strings temporarily stored at $d (line 219), generated from the list of values returned from nice_date() (defined at lines 433 - 442).
value: the value is identical to the key.
e.g. The key $indexes{'2006/11/24'} has the corresponding value '2006/11/24'.
Every date is represented that has at least one post within the period specified by the date.
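Keep in mind that date-based index pages are generated for year, year/month, and year/month/day. The entries routine stores only the full year/month/day key; the year and year/month pages come from the inner loop at line 251, which walks each leading portion of the key in turn.
So a single post, modified on 2006/04/08, stores the key 2006/04/08 and results in index pages for 2006, 2006/04, and 2006/04/08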
individual post pairs
Whether or not these pairs are present at all is dependent on the value of the configurable variable $static_entries.
key: The key is the complete path from the root of the data directory to the requested file including the filename and extension, e.g. 'Technology/Computer/Apple/post.txt'
value: 1
A status flag indicating that we will be generating a page for the specific post named by the key. A placeholder really, because pages we will not be generating are not in the hash at all.
These keys are sorted.
The default sort order is ascending ASCIIbetical.
This means that all of the date-based index keys will be ordered first. (Numbers come before all letters in ASCII.)
Also, keys that are paths that begin with capital letters will be ordered before paths that begin with lowercase letters. (Any uppercase character comes before all lowercase characters in ASCII.)
Once ordered, the list is passed to the foreach loop and each key is run through the loop in turn.
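The ordering is easy to confirm with a few sample keys. The keys below are illustrative only; 'apple/notes.txt' is included to show where a lowercase path lands.

```perl
use strict;
use warnings;

# Sample keys of each kind, made up for this sketch.
my %indexes = (
    'Technology/Computer/Apple'          => 1,
    '2006/11/24'                         => '2006/11/24',
    'Technology/Computer/Apple/post.txt' => 1,
    'apple/notes.txt'                    => 1,
);

# sort with no comparison block compares strings character by character.
my @ordered = sort keys %indexes;
print "$_\n" for @ordered;
```

The date key comes first, the capitalized paths next (with the shorter path ahead of the longer one that extends it), and the lowercase path last.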
250. my $p = '';
Declaration of the lexical variable $p, and initialization to the empty string.
251. foreach ( ('', split /\//, $path) ) {
REPEAT OVER A SINGLE KEY
Another foreach loop.
Keep in mind that this is a nested foreach.
The outer loop is running through the %indexes hash one key at a time.
This foreach loop is splitting that key into a list of values and then operating on that list.
We're iterating over a single key of the %indexes hash multiple times, having split the key at the path delimiter, '/'.
e.g.
The keys we have in %indexes look like one of the following: 'Technology/Computer/Apple', '2006/11/24', or 'Technology/Computer/Apple/post.txt'.
252. $p .= "/$_";
Remember that the foreach loop is working with one element of the list of values passed to it at a time.
Here we append that value to the end of whatever value we already have in $p, separated by the delimiter '/'.
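Note that $p is declared at line 250, and is initially empty, ''.
The list always begins with the empty string (the '' written out in line 251), so on the very first pass we append '/' to '', and line 253 strips it again, leaving $p as ''. This pass stands for the root of the blog.
Setting that pass aside, if the rest of our list is ('Technology', 'Computer', 'Apple') then the first of the remaining passes appends '/Technology' to ''.
After this line $p has the value '/Technology'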
253. $p =~ s!^/!!;
This substitution attempts the following match:
The start of the string, followed immediately by '/'
If the match succeeds then we drop the matched character, i.e. we replace it with nothing.
For those of you following along with the example started at line 252, this reduces $p to 'Technology' the first time through the loop.
254. $done{$p}++ and next;
Here we have the first use of %done since its declaration at line 248.
We're creating a key/value pair in the hash with $p as the key and a value one greater than the current value at that key.
If the key $p does not exist in the hash, it will be created and the value will be set to 0 + 1 = 1.
The line is a little tricky in its use of the postincrement operator ++, and, and next.
and next;
will return to the top of the foreach loop and begin the next iteration, using the next list value.
Remember that we are in a nested foreach loop.
next
returns us to the beginning of most immediate loop, which is
foreach ( ('', split /\//, $path) ) { # at line 251
Here, the 'next value' is the next piece split from a single key in the %indexes hash (the key itself is chosen by the outer loop at line 249).
When is next evaluated?
The use of 'and' means that next is evaluated whenever the first expression in the statement evaluates as true.
(This is the tricky part I mentioned just above.)
If $done{$p}++ evaluates as false, then we will not evaluate next and instead will continue with the rest of the code in the loop.
If this is the first time we've seen a particular key value in $p, then $done{$p}++ will evaluate as false because of the use of the postincrement operator, which returns the old value before incrementing.
If the expression were rewritten ++$done{$p}, then the increment would happen before the value is returned and this statement would always be true.
With the postincrement operator, it is only true if we're repeating a value seen before in $p.
Realize that this should happen many times.
We're working through the values in %indexes.
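Any directory in the filesystem will share its parent with siblings and descendants.
For example,
$done{'Technology'} will be encountered for every key that begins with 'Technology/', such as 'Technology/Computer/Apple' and 'Technology/Computer/Apple/post.txt',
both of which may appear in the %indexes hash.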
Every time other than the first, $done{'Technology'}++ will be true.
In these cases we'll skip the rest of the code and start the loop over again.
Also, note that $done{$p}++ will increment from 1, 2, 3,..., every time the value of $p is repeated.
Continuing with the example we've been using for several lines.
If our list (after the leading '') is ('Technology', 'Computer', 'Apple')
The list item we're working with on the second of those passes is 'Computer'
That value is stored in Perl's default variable $_
Running through all iterations of the loop results in the following:
The first time a given value of $p is seen
This is the only time that we'll continue with the rest of the code in the inner foreach loop, rather than seeing next and returning to the top of the loop at line 251.
Note that the rest of the code in this loop will be executed exactly once for each unique value of $p.
This is true across all of the values of %indexes (the outer loop).
Assuming this is our $path value on the first iteration through the outer foreach loop (line 249), then once the inner foreach loop (line 251) has finished the %done hash looks like:
( '' => 1,
'Technology' => 1,
'Technology/Computer' => 1,
'Technology/Computer/Apple' => 1, )
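The interplay of the two loops and %done is easier to follow when run on a small sample. The sketch below copies lines 249 - 254 and replaces the rest of the loop body with a push onto @visited, an array invented here to record which values of $p get past the 'and next'. The three keys are illustrative.

```perl
use strict;
use warnings;

# Sample keys, one of each kind, made up for this sketch.
my %indexes = (
    'Technology/Computer/Apple'          => 1,
    'Technology/Computer/Apple/post.txt' => 1,
    '2006/11/24'                         => '2006/11/24',
);

my (%done, @visited);
foreach my $path (sort keys %indexes) {
    my $p = '';
    foreach ('', split /\//, $path) {
        $p .= "/$_";             # grow the path by one component
        $p =~ s!^/!!;            # drop the leading slash
        $done{$p}++ and next;    # seen before: skip to the next component
        push @visited, $p;       # stands in for "generate pages for $p"
    }
}

print "'$_'\n" for @visited;
```

Eight values are visited, each exactly once: '' (the blog root), '2006', '2006/11', '2006/11/24', 'Technology', 'Technology/Computer', 'Technology/Computer/Apple', and 'Technology/Computer/Apple/post.txt'. The counters in %done keep climbing for repeated prefixes: $done{''} ends at 3 and $done{'Technology'} at 2.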
255. (-d "$static_dir/$p" or $p =~ /\.$file_extension$/) or mkdir "$static_dir/$p", 0755;
Another statement composed of multiple logical operators, 'or' in this case.
Each subsequent expression will only be evaluated if the preceding expressions are false (from left to right).
Let's look at this expression-by-expression starting on the left.
Note the use of parentheses, which affect grouping and meaning of the statement as it relates to the logical operators.
Specifically, the entire expression
(-d "$static_dir/$p" or $p =~ /\.$file_extension$/)
is an operand of the second of the two 'or' operators.
Of course to determine the value of this expression, its subexpressions must be evaluated.
(-d "$static_dir/$p" or $p =~ /\.$file_extension$/)
-d is the file test Perl uses to determine if the named file is a directory.
Remember that $p is some portion of a path that mirrors the data directory of the blog being generated.
Here we combine the user-configurable variable $static_dir with $p...
the forward slash, '/' is assumed to be the operating system's path delimiter,
...to arrive at the full pathname of some portion of the static blog we're creating, in a location that mirrors $datadir but rooted in its proper location beneath $static_dir.
If -d "$static_dir/$p" is true, then this entire expression is true, and we do not evaluate the second part of it.
Otherwise, if "$static_dir/$p" is not a directory, we evaluate
$p =~ /\.$file_extension$/
which tries to match the end of the string in $p to the user configurable $file_extension, which would indicate a post. Keep in mind that we've weeded out files which are not posts in the entries routine.
If the match succeeds, then again the entire expression evaluates as true, otherwise the expression is false.
Now that we have a value for the first expression, we know how we will process the rest of the statement.
If the first expression is true, as discussed above, then the whole statement is true and we do not evaluate the rest of it.
If however it is false, meaning that "$static_dir/$p" is not a directory and also "$static_dir/$p" is not a file ending in the proper extension, then the second part of this statement is evaluated, which creates a directory at "$static_dir/$p" and sets reasonable permissions for a web accessible directory.
mkdir "$static_dir/$p", 0755
What's the point of this?
We're building our static weblog one directory at a time. We need to build the directory hierarchy for the static blog so that we can populate it with all of the pages necessary to present the complete website, so that any valid page is available when requested.
-d "$static_dir/$p"
If $static_dir/$p is a directory then it has already been generated and we do not want to generate it again. This expression is true and we do not evaluate mkdir "$static_dir/$p", 0755. We're essentially testing for the existence of the directory $static_dir/$p.
or $p =~ /\.$file_extension$/
Here we're testing to see if the value at $p is a file and not a directory at all. If $p is a filename then we do not want to evaluate mkdir "$static_dir/$p", 0755.
Note this assumes that if /\.$file_extension$/ fails, we must have a directory.
This is most likely a valid assumption given the way %indexes has been populated in the entries routine, but (in my opinion) it's an odd way to write the code.
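For comparison, here is one way the same decision might be written as a conditional. This is a sketch, not the script's code: it works in a temporary directory, the three values of $p are invented, and it adds an 'or die' that line 255 does not have.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Assumed settings for this sketch; the real values come from the configuration.
my $static_dir     = tempdir(CLEANUP => 1);
my $file_extension = 'txt';

for my $p ('Technology', 'Technology/Computer', 'Technology/Computer/post.txt') {
    # Create the directory unless it already exists or $p names a post.
    unless (-d "$static_dir/$p" or $p =~ /\.$file_extension$/) {
        mkdir "$static_dir/$p", 0755
            or die "Couldn't create $static_dir/$p: $!";
    }
}

print "created\n" if -d "$static_dir/Technology/Computer";
```

Parents are created before children only because the paths arrive in that order, which is also how the inner loop at line 251 delivers them.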
256. foreach $flavour ( @static_flavours ) {
This is the beginning of a foreach loop, inside a foreach loop, inside a foreach loop (i.e. it's triply nested).
@static_flavours is a user-configurable list of flavours to be generated as part of static mode operation.
For example, as a user I may want static entries generated for both html and RSS (feed) flavours.
This variable is defined at line 61 (the default is in fact 'html', and 'rss').
This loop will run through each of the list elements, assigning each to the control variable $flavour.
Note that $flavour is the same package variable that we've used throughout the script to this point, and not a local loop control variable.
Here we are manually pulling values from the user-configurable list so that all requested flavours are generated.
We'll run completely through all values of @static_flavours, before dropping out of this loop and returning to the top of the enclosing loop starting at line 251.
What this means is that we'll generate all flavours for a single item before moving on to the next.
e.g.
If we have the following entry:
Technology/Computer/Apple/entry.txt and @static_flavours = qw/ html rss /;
then before moving on to the next entry, we'll generate both Technology/Computer/Apple/entry.html and Technology/Computer/Apple/entry.rss beneath $static_dir.
257. my $content_type = (&$template($p,'content_type',$flavour));
This line calls the template routine and initializes the lexical variable $content_type to the routine's return value.
The script defines the default template routine starting at line 137.
Keep in mind that plugins are allowed to replace this routine.
For the purposes of this walk-through, I'll assume the baked-in routine is being used.
As a quick sanity check, if you look at line 140, where the routine names its parameters, you'll see that the template routine is expecting to be called with the following arguments:
$path, $chunk, $flavour
If we compare that to the call we have here, it makes sense that:
$path receives the value of $p
$path is the portion of the pathname for the current item that corresponds to the structure of blosxom's data directory. It's everything after $datadir, down to the item we're currently working with.
$chunk is set to 'content_type'
This is in fact a valid chunk type.
and $flavour in the routine is set to the value of $flavour in the caller. In this case, $flavour is set to each of the values in @static_flavours as we run through the foreach loop starting at line 256.
The template routine ends at line 154.
I won't review it completely here. Of course, you're welcome to go back and review the discussion of its operation.
I will repeat the summary from that discussion:
Summary of the template routine:
The subroutine expects three parameters ($path, $chunk, $flavour) and attempts to open the corresponding template file.
258. $content_type =~ s!\n.*!!s;
Remembering that we just stored the complete contents of the content_type template for the specified flavour in $content_type,
this substitution drops the first newline it finds and every character that follows.
The /s modifier (written here !s because of the chosen delimiter) instructs Perl's regex engine that the metacharacter '.', which normally matches every character except the newline, should match '\n' as well.
It's important to realize that this means content_type template files MUST contain ONLY the string that defines the content_type on the first line.
It cannot for example contain comments or any other extraneous text before the content_type.
On the other hand, feel free to stuff it with anything you like starting with the second line, because everything starting with the newline on the first line is stripped.
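A quick sketch shows the effect of the substitution. The template text below is invented for illustration; only the s!\n.*!!s line is taken from the script.

```perl
use strict;
use warnings;

# Made-up template contents: the content type, then extra lines.
my $content_type = "text/html\nanything on later lines\nis thrown away\n";

# Same substitution as line 258: with /s, '.' also matches newlines,
# so everything from the first newline to the end of the string is removed.
$content_type =~ s!\n.*!!s;

print "[$content_type]\n";    # prints [text/html]
```

Without the /s modifier the match would stop at the second newline, and only the second line would be removed.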
259. my $fn = $p =~ m!^(.+)\.$file_extension$! ? $1 : "$p/index";
The purpose of this line is to determine which part of the string in $p is the filename (w/o the file extension).
Keep in mind that if $p is only a directory, it will not have a filename at all.
The lexical variable $fn is declared to hold the filename portion of $p.
$fn is assigned the return value of the rest of the statement,
which will be (assuming the expression is written correctly) a path leading to a filename or 'index' in the case that there is no file name.
We attempt to match m!^(.+)\.$file_extension$! against the value in $p.
This matches:
the start of the string,
followed by one or more of any character,
(note the parentheses, which will assign to the match variable $1 the portion of the string matching at this point in the regular expression,)
followed by a literal dot, '.',
followed by $file_extension
(the user-configured variable specifying the file extension used on posts,)
followed by the end of the string.
If this match succeeds, then we have something that looks like a filename
i.e. some number of characters, a dot (.), and a file extension.
More specifically, we'll have something that looks like a blosxom entry, because the extension must match $file_extension.
Note that because we're only working through values in %indexes, and because of how that hash was populated in the entries routine, any files we find should end in the proper $file_extension. %indexes should contain only paths to directories and valid posts.
If the match succeeds then we evaluate $1, which is simply the value of the match variable $1 (i.e. everything that preceded the dot character before the file extension in $p).
This will be the file name we're after.
If the match fails then $p must contain only a path and no filename.
This path corresponds to a blosxom category. We'll want to generate an index file for this category instead.
In the case that the match fails, we evaluate the expression following ':' $p/index, which inserts 'index' where the filename portion of $p would have been.
So we're essentially forcing the creation of an index file.
Note that this is a use of Perl's ternary operator.
After this line $fn will contain either
the filename of the entry we're working with (including path info but no extension),
or 'index' along with path info (again with no extension).
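To make the two outcomes concrete, here is a minimal sketch of the expression from line 259 wrapped in a small helper. The helper name output_name, the value 'txt' for $file_extension, and the sample paths are assumptions made up for illustration; they are not part of blosxom.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the user-configured extension.
my $file_extension = 'txt';

# Hypothetical helper wrapping the expression from line 259.
sub output_name {
    my ($p) = @_;
    # If $p ends in ".$file_extension", keep everything before the dot;
    # otherwise treat $p as a category and append '/index'.
    return $p =~ m!^(.+)\.$file_extension$! ? $1 : "$p/index";
}

my $entry    = output_name('Technology/Apple/post.txt');  # looks like an entry
my $category = output_name('Technology/Apple');           # looks like a category

print "$entry\n";     # Technology/Apple/post
print "$category\n";  # Technology/Apple/index
```

The first call takes the "match succeeded" branch and the second takes the "match failed" branch, which is the forced index file described above.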
260. param('-quiet') or print "$fn.$flavour\n";
Unless blosxom was invoked with the -quiet modifier
i.e. '-quiet=1' on the command line,
this line simply prints a status message to stdout so the user can track the script's progress as it works through static mode generation.
The message output is: "$fn.$flavour\n"
$fn, the current filename, which we just determined in the previous line (possibly index)
$flavour, the current flavour from the list defined in @static_flavours.
This is the file we're generating, or I should say, 'about to generate' (lines 261 - 277).
We see that blosxom names the current file just before it attempts to actually generate it.
Note that '-quiet=1' suspends all output to stdout, not only here but elsewhere in the script as well (line 245).
From the blosxom documentation we're told:
To have Blosxom's static rendering run silently -- perhaps you're running it automatically at regular intervals and you don't want all that output popping up on your screen or being mailed to you-- add -quiet=1 like so:
% perl blosxom.cgi -password='whateveryourpassword' -quiet=1
261. my $fh_w = new FileHandle "> $static_dir/$fn.$flavour" or die "Couldn't open $static_dir/$p for writing: $!";
Attempts to open the filehandle $fh_w for output, '>', to $static_dir/$fn.$flavour
Keep in mind that $fn includes complete path info from $datadir to (and including) the filename itself (w/o a file extension).
$static_dir/$fn.$flavour then is a file with the same name as the entry in $datadir (or .../index) in the same relative location but rooted at $static_dir, our user-configured static mode root directory.
Of course the extension we're using is the current flavour (eg 'html' or 'rss'), as opposed to $file_extension for example, which is the extension on the original entry.
Creating the filehandle may fail.
It is only a request of the operating system and the OS may refuse for a number of reasons. We check for failure using the same logical or construction that is so common in the script along with Perl's die facility.
Note that this is generally considered the preferred way to use die.
Simply, if the filehandle cannot be opened successfully the value returned will evaluate as false. Only in this case will the second expression be executed.
die terminates the script printing the string that follows to STDERR - typically the terminal, unless you've done something to purposefully redirect standard error.
Note the odd looking variable '$!' at the end of the line.
If opening the filehandle fails, which is the only way this expression will be executed, Perl will have populated $! with a relevant error message from the OS. This info is helpful in diagnosing problems and $! should be included for these types of errors, as it is here.
Assuming all goes well, we'll successfully open the filehandle to $static_dir/$fn.$flavour, allowing us to write to the file.
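The open-or-die construction can be tried out in isolation. The sketch below is not blosxom's code: it uses a throwaway temporary directory as a stand-in for $static_dir, and the file names are invented. It shows a successful open and then a failing one, where the filehandle comes back undefined and the right-hand side of 'or' would have run.

```perl
use strict;
use warnings;
use FileHandle;
use File::Temp qw(tempdir);

# Hypothetical stand-in for $static_dir: a temporary directory.
my $static_dir = tempdir(CLEANUP => 1);
my $target     = "$static_dir/index.html";

# Same shape as line 261: open for writing, or die with the OS error in $!.
my $fh_w = FileHandle->new("> $target")
    or die "Couldn't open $target for writing: $!";
print $fh_w "hello\n";
$fh_w->close;

# The parent directory does not exist, so this open fails and new()
# returns undef. We record the failure instead of dying.
my $bad    = FileHandle->new("> $static_dir/no_such_dir/index.html");
my $failed = defined $bad ? 0 : 1;
print "second open failed: $!\n" if $failed;
```

Recording the failure rather than calling die is only a convenience for the demonstration; in the script itself, stopping with the message from $! is the sensible choice.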
262. $output = '';
Initializing the package global $output.
Note that this is the first time $output appears in the source since its declaration.
263. if ($indexes{$path} == 1) {
You could say that there are three types of keys in the %indexes hash (as discussed in the note for line 249):
The keys we have in %indexes look like one of the following:
If $path is similar to either of 1 or 3, then the value will be 1, because of the way the %indexes hash was built-up in the entries routine. The value is essentially a status flag (see line 249 for more info).
If $path is similar to 2., then the value will be identical to the name of the key itself, e.g. ($indexes{2006/11/24} => '2006/11/24')
Here we're testing the value at the key corresponding to the current value of $path.
A value of 1 means we're dealing with the category-based scheme. It's enough to say that anything other than 1, means we're dealing with the date-based scheme.
So, if we're dealing with the category-based scheme, the condition is true and we evaluate the if block. If not we'll skip to line 269 and work through the else clause instead.
264. # category
265. $path_info = $p;
A simple assignment.
$path_info is set to the current value of $p.
Note that this is the package variable $path_info, and that prior values are inconsequential at this point.
This assignment replaces any previous value of $path_info.
Remember that $p, and now $path_info, is the complete path from the root of $datadir, including filename and extension, if there is one.
266. # individual story
267. $path_info =~ s!\.$file_extension$!\.$flavour!;
A substitution in the string at $path_info.
\.$file_extension$
Again, $path_info, assigned the value of $p at line 265, is a path rooted at $datadir and may end with a filename, if we're dealing with an individual post, as opposed to a category index.
If we do find a filename, i.e. if we match, then the extension will be $file_extension, because of how the %indexes hash was built in the entries routine. But this $file_extension will be something like 'txt', the user configurable extension for specifying files that should be treated as posts. Our output file shares the base filename but its extension is $flavour, the current one of @static_flavours we're working with in this iteration of the foreach loop starting at line 256.
Here we replace $file_extension with $flavour in $path_info so that the value we pass to the generate routine, at line 268, targets the proper file.
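Here is a small sketch of the substitution from line 267 applied to two made-up paths. The values 'txt' and 'html' and both paths are assumptions chosen for illustration only.

```perl
use strict;
use warnings;

my $file_extension = 'txt';   # hypothetical configured extension
my $flavour        = 'html';  # hypothetical current flavour

my $story    = 'Technology/Apple/post.txt';  # an individual post
my $category = 'Technology/Apple';           # a category, no filename

# Same substitution as line 267, applied to a copy of each path.
(my $story_out    = $story)    =~ s!\.$file_extension$!\.$flavour!;
(my $category_out = $category) =~ s!\.$file_extension$!\.$flavour!;

print "$story_out\n";     # Technology/Apple/post.html
print "$category_out\n";  # Technology/Apple (no match, left unchanged)
```

When the path has no extension to replace, the substitution simply does nothing, which is why the same line can be run for both categories and individual stories.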
268. print $fh_w &generate('static', $path_info, '', $flavour, $content_type);
We call &generate passing it the list of arguments you see here
and output the return value to $fh_w
&generate is defined at line 295. We'll look at &generate in detail when we get to its definition.
Briefly:
Maybe you can guess that the return value will be the complete contents of one page of our static site: either an individual entry or a category index page. We'll need to work through generate's definition before we know exactly what to expect.
'static', because we are in a block that executes only in static mode. We can assume that this string acts as a mode identifier for &generate, so that the function can be made to operate differently in static and dynamic modes.
$path_info, the value of the path starting at $datadir and ending with the filename (if there is one).
This value should allow &generate to pull in the contents of the entry to be generated.
'', keep reading for more info about this argument.
$flavour, the flavour we're currently working with from @static_flavours.
Keep in mind that, because of the construction of the nested foreach loops, we will generate all flavours for a single entry before moving on to the next entry. This certainly seems like the right approach.
$content_type, The contents of the content_type.$flavour file (see line 257).
269. } else {
else clause that pairs with the if at line 263.
As discussed in the notes at line 263, the keys we have in %indexes look like one of the following:
If $path is similar to either of 1 or 3, then the value will be 1, because of the way the %indexes hash was built-up in the entries routine.
If $path is similar to 2., then the value will be identical to the name of the key itself, e.g. ($indexes{2006/11/24} => '2006/11/24')
So the else clause deals with an entry in %indexes corresponding to a date-based archive page that we'll need to generate to complete our static site.
270. # date
271. local ($path_info_yr,$path_info_mo,$path_info_da, $path_info) =
This line declares the list of variables as local. You may recognize these names as some of the package globals declared at line 70.
It looks like we're going to initialize these variables by assignment.
The assignment statement is split across two lines for some reason. It is generally considered good style not to split statements across lines like this. See the notes at line 272 for more info.
The local declaration means that the values we assign here are 'temporary' in the sense that they last only for the duration of this block (the else clause that runs from 269 - 275), including any subroutines called from within it. 'local' squirrels away the current values in the named package variables for the remainder of this block, replacing them temporarily with the values assigned to them here. After we've run through the block, Perl restores the original global values and these local values are lost.
Note that this is quite a bit different than what happens with 'my' variables.
I'm not sure why local is used here as opposed to the my declaration, used elsewhere in the code (and far more common). My guess is that there is no good reason for the use of 'local' here. 'local' has very limited use. From the perlsub documentation we're told:
Temporary Values via local()
WARNING: In general, you should be using my instead of local, because it's faster and safer. Exceptions to this include the global punctuation variables, global filehandles and formats, and direct manipulation of the Perl symbol table itself.
I'll take this opportunity to remind you that I am only documenting blosxom 2.0.2 as it exists. I am not making any changes to the code. I'll modify the code as part of some future project.
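The difference between 'local' and 'my' is easier to see in a tiny experiment. This is only a sketch: the package variable $year and the routine show_year are hypothetical stand-ins, and nothing here is taken from blosxom itself.

```perl
use strict;
use warnings;

# Hypothetical package variable standing in for something like $path_info_yr.
our $year = 'global';

# Reads the package variable, as any called subroutine would.
sub show_year { return $year }

my ($with_local, $with_my, $after);
{
    local $year = '2006';       # saves the old value, installs a temporary one
    $with_local = show_year();  # the called routine sees '2006'
}
{
    my $year = '2006';          # a brand new lexical; the package variable is untouched
    $with_my = show_year();     # the called routine still sees 'global'
}
$after = show_year();           # the original value is back after the local block

print "$with_local $with_my $after\n";  # 2006 global global
```

So a value set with 'local' is visible to routines called from inside the block, while a 'my' variable is not. Whether that is the reason 'local' was chosen at line 271 is not something the source tells us.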
272. split /\//, $p, 4;
This line continues the statement that was started on the previous line.
The complete statement is:
local ($path_info_yr,$path_info_mo,$path_info_da, $path_info)
= split /\//, $p, 4;
We can see that we're using Perl's split function.
split does just that, it splits a string at the pattern specified and returns the resulting list of values (when used in a list context as it is here). These values are what we're assigning to the variables declared as 'local'(s) at the previous line.
split /\//, $p, 4
we're splitting on '/', which must be escaped here to distinguish it from the forward slashes that are part of split's syntax.
Remember that $p is the complete path from the root of $datadir, including filename and extension, if there is one.
The last parameter, 4, is new, seen here for the first time in the code.
It is referred to as the limit. It specifies the maximum number of fields that $p will be split into. It makes sense that 4 matches the number of variables we've declared. With a limit of 4, we'll have a max of four fields, one for each of our localized variables.
An example will make this much clearer.
Because of the way the %indexes is populated in the entries routine, and the way this code is structured, we know that $p will at least begin with a date.
See the notes at line 263 for more info.
Now for some examples, $p could look like:
To take the last example first:
'2006/11/24/Technology/Computer/Apple/post.txt' is split on '/' into a max of 4 fields
$path_info_yr = 2006
$path_info_mo = 11
$path_info_da = 24
$path_info = Technology/Computer/Apple/post.txt
Which seems sensible.
But this breaks down. Let's look at another example:
'2006/Technology/Computer/post.txt' is split into
$path_info_yr = 2006
$path_info_mo = Technology
$path_info_da = Computer
$path_info = post.txt
which doesn't look like it makes much sense at all. We'll need to see how (or if) the code deals with this appropriately.
At the very least I feel comfortable saying that the code here looks messy.
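The effect of the limit can be checked directly. The sketch below runs the same split on two invented values of $p; the paths are assumptions for illustration and do not come from a real %indexes hash.

```perl
use strict;
use warnings;

# A long path: the limit of 4 stops splitting after three separators,
# so the remainder stays together in the fourth field.
my @full = split /\//, '2006/11/24/Technology/Computer/Apple/post.txt', 4;

# A short path: fewer than four fields are produced.
my @short = split /\//, '2006/11', 4;

print scalar(@full), " fields, last is '$full[3]'\n";
print scalar(@short), " fields\n";
```

With the long path we get four fields and the last one is 'Technology/Computer/Apple/post.txt'. With the short path we get only two fields, which is why some of the localized variables can end up undefined.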
273. unless (defined $path_info) {$path_info = ""};
This line seems pretty self-explanatory.
unless works much like an if block with the condition reversed. I've also heard it described as an else clause standing on its own.
The if block would say
if(!defined $path_info)
So we'll only execute the body of the block if $path_info is not defined. Remember that this is our 'local'(ized) $path_info and it will only be defined if it was assigned a value in the last statement. That is, if there are at least 4 components to the path specified at $p. If there are three or fewer, then $path_info will not be assigned a value and so it will be undefined here. In that case, we assign $path_info the empty string.
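A short sketch of the same guard, using a made-up two-component path and lexical variables with hypothetical names ($yr, $mo, $da, $rest) in place of the localized package variables:

```perl
use strict;
use warnings;

# Only two components, so the third and fourth variables receive nothing.
my ($yr, $mo, $da, $rest) = split /\//, '2006/11', 4;

my $was_defined = defined $rest ? 1 : 0;  # 0: no fourth field was produced
unless (defined $rest) { $rest = "" }     # same guard as line 273

print "was defined: $was_defined, now: '$rest'\n";
```

Note that only the fourth variable is given a default here, just as in the script; $da is still undefined after the guard.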
274. print $fh_w &generate('static', '', $p, $flavour, $content_type);
We call &generate passing it the list of arguments you see here
and output the return value to $fh_w
You should notice the similarities between this line and line 268. The two are essentially doing the same thing. At line 268 we were dealing with blosxom's category-based scheme and here we are doing something very similar in the case that we are working with a date-based archive. Note that we will be in either one or the other block, never both on the same pass (because of the if/else construction).
&generate is defined at line 295. We'll look at &generate in detail when we get to its definition.
Briefly:
Maybe you can guess that the return value will be the complete contents of one page of our static site: either an individual entry, or an index page, or in this case an index that is part of the date-based scheme. We'll need to work through &generate's definition before we know exactly what to expect.
'static', because we are in a block that executes only in static mode. We can assume that this string acts as a mode identifier for &generate, so that the function can be made to operate differently in static and dynamic modes.
'', keep reading for more info about this argument.
$p, the hierarchical path through blosxom's date-based archive scheme, possibly including path info starting at $datadir and ending with the filename (if there is one).
e.g.
$p could look like any of:
This value should allow &generate to pull in the targeted content.
$flavour, the flavour we're currently working with from @static_flavours.
Keep in mind that, because of the construction of the nested foreach loops, we will generate all flavours for a single entry before moving on to the next entry. This certainly seems like the right approach.
$content_type, The contents of the content_type.$flavour file (see line 257).
The primary difference between the argument list at line 268 and the one on this line is that the second and third values seem to be flipped.
The second argument at line 268 is $path_info and here the third argument is $p. At line 268, $path_info is a copy of $p with $file_extension replaced by $flavour, so the two calls pass essentially the same kind of path, just in different positions.
Conversely, the third argument is the empty string, '', at line 268 and here we have the empty string as the second value.
Again we'll discuss what this means, and how the parameters are used, when we get to the definition of &generate at line 295.
275. }
End of the else clause started at line 269
Note that our 'local'(ized) variables declared at line 271 are out of scope here and the original global values are restored.
276. $fh_w->close;
After we return from &generate we've finished writing to $fh_w and so we can close the filehandle, which is a nice thing to do as soon as possible.
After this statement we cannot expect to write to $fh_w again without redefining the filehandle.
277. }
End of the inner-most foreach loop.
Execution returns to line 256 and we consider the next $flavour from @static_flavours, for the same entry.
278. }
End of the middle foreach loop.
Execution returns to line 251 and we consider the next element from the list created by splitting $path on '/', for the same key in %indexes.
279. }
End of the outer-most foreach loop.
Execution returns to line 249 and we consider the next value of $path, which is the next key from %indexes in sorted order.
Summary:
For each key of %indexes in sorted order:
We consider every level along the path from the root of the data directory working downward toward the level of the named entry (or most specific subcategory), building the necessary directory structure as we go.
When we reach the level of the current target, whether an individual post or an index page, we generate all flavours for the given item.
280. }
End of if block that started at line 243.
The end of this block means that we have come to the end of the section of the script that defines static mode operation.
At this point we've finished generating the static site completely.
The only code that follows, which is not a subroutine definition, but is relevant to static mode operation, is the one line check for plugins containing end routines at line 293.
Of course we haven't yet talked about the generate routine which, at well over 100 lines, represents the single largest chunk of functional code in the script.
281. blank line
282. # Dynamic
283. else {
This is the start of the else clause that pairs with the if block at line 243.
The if block controls what happens during static mode operation, and this else covers execution when blosxom is running dynamically.
Note that this block is much, much shorter at only 7 lines, compared to 35+ for static mode operation.
Why do you suppose this is the case?
For static mode operation we must generate the entire site, running through every directory and generating possibly many files for each entry, and index pages for every category and subcategory, as well as date-based archive pages.
On the other hand, when running dynamically, we need to return only a single page, whether that page is a category listing, date-based archive page, or a specific entry.
It takes less code to describe the process of generating a single page than it does to direct the output of the entire site.
284. my $content_type = (&$template($path_info,'content_type',$flavour));
This line is essentially the same as line 257. That was static mode and here we're concerned with dynamic mode operation.
In that case we were dealing with a local variable $p in place of $path_info, but $p was essentially a copy of the same value that we have here in $path_info.
See the discussion at line 265 for more info.
This line calls the template routine and initializes the lexical variable $content_type to the routine's return value.
The script defines the default template routine starting at line 137.
Keep in mind that plugins are allowed to replace this routine. For the purposes of this walk-through, I'll assume the baked-in routine is being used.
As a quick sanity check, if you look at line 140, where the routine names its parameters, you'll see that the template routine is expecting to be called with the following arguments:
$path, $chunk, $flavour
If we compare that to the call we have here, it makes sense that:
$path receives the value of $path_info. $path is the portion of the pathname for the current item that corresponds to the structure of blosxom's data directory. It's everything after $datadir to the item we're currently working with.
$chunk is set to 'content_type'
This is in fact a valid chunk type.
and $flavour in the routine is set to the value of $flavour in the caller. $flavour is set at lines 122 - 128 to the flavour requested with the file as part of the $path_info (if present), the value of the flav parameter (if present), or $default_flavour, one of the user-configurable variables.
See the comments at lines 122 - 128 for more info.
The template routine definition starts at line 137 and runs to line 154.
I won't review it completely here. Of course, you're welcome to go back and read through the discussion of its operation.
I will repeat the summary from that discussion:
Summary of the template routine:
The subroutine expects three parameters ($path, $chunk, $flavour) and attempts to open the corresponding template file.
285. $content_type =~ s!\n.*!!s;
This line is identical to what we saw at line 258. That was static mode and here we're concerned with dynamic mode operation.
We're simply trimming the value at $content_type so that it contains only the first line of text from the content_type flavour file.
See the discussion at line 258 for more info.
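As a quick illustration of the trimming, here is the same substitution run on a made-up template value. The sample text is an assumption; a real content_type flavour file would supply its own.

```perl
use strict;
use warnings;

# Hypothetical contents of a content_type flavour file.
my $content_type = "text/html\nanything after the first newline is dropped\n";

# Same substitution as lines 258 and 285: with the /s modifier '.' also
# matches newlines, so everything from the first newline onward is removed.
$content_type =~ s!\n.*!!s;

print "[$content_type]\n";  # [text/html]
```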
286. blank line
287. $header = {-type=>$content_type};
The package variable $header, seen here for the first time since its declaration, is defined here to be a reference to an anonymous hash containing a single key/value pair:
key, '-type'
value, $content_type
288. blank line
289. print generate('dynamic', $path_info, "$path_info_yr/$path_info_mo_num/$path_info_da", $flavour, $content_type);
Much like the statement starting at line 268 when we were talking about static mode. Here we call &generate with its expected arguments and we print the return value.
Note that because we're in dynamic mode, we're not opening a filehandle for output, as we did in static mode. print will by default use stdout, which in this case is the browser window. Because generate returns the completed webpage, all we need to do is print that value here.
Quickly going over the argument list:
'dynamic', The two values we've seen used here are 'static' and now 'dynamic'. This is most likely a mode indicator, so that &generate can tell whether it should follow static or dynamic mode generation.
$path_info, Picking up from line 241 (all other uses before now occur within the static mode block), $path_info is the path from the root of the data directory to (and including) the filename, if present.
"$path_info_yr/$path_info_mo_num/$path_info_da", We interpolate and piece together the values of the variables $path_info_yr, $path_info_mo_num, and $path_info_da.
See line 134 for the original definitions, and last mention of, these variables.
It may be important to note that any or all of these variables may be undef (see line 134).
Because an undefined value interpolates as the empty string, we may end up with a string similar to
'2006/11/02' or
'2006/11/' (no day requested), or even
'//' (no date requested at all).
These values must be present in the browser request to be defined here.
$flavour, The variable here has the same meaning as it does in the static block, i.e. it identifies the template we'll use to generate the resulting page, although it is in fact a different variable.
Here we're dealing with the package global $flavour, which is either the flavour requested by the browser or the default (defined at lines 119 - 128).
The variable in static mode was local to the innermost foreach loop and was set to each of the values of @static_flavours in turn.
This should make sense.
In static mode, we're generating the complete site and every flavour intended to be output as part of static mode operation.
When running the site dynamically, we're interested only in the flavour requested by the browser.
$content_type, This variable is set to the value just defined at line 285.
The value of $content_type comes from the 'content_type.' file for the requested flavour. Remember that we stripped all but the first line at line 285.
All of these arguments are passed to &generate.
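To see what the date argument looks like when part of the date is missing, here is a small sketch. It uses lexical variables with hypothetical names ($yr, $mo_num, $da) and invented values in place of the package variables from line 134.

```perl
use strict;
use warnings;

# Hypothetical request that names a year and month but no day.
my ($yr, $mo_num, $da) = ('2006', '11', undef);

my $date;
{
    # Interpolating undef would trigger an 'uninitialized' warning under
    # 'use warnings'; it is silenced here for the demonstration only.
    no warnings 'uninitialized';
    $date = "$yr/$mo_num/$da";
}

print "[$date]\n";  # [2006/11/]
```

The undefined day contributes nothing to the string, so the slashes are always present even when the values between them are not.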
290. }
This is the end of the else clause that began at line 283.
The end of this block means that we have come to the end of the section of the script that defines dynamic mode operation.
291. blank line
292. # Plugins: End
293. foreach my $plugin ( @plugins ) { $plugins{$plugin} > 0 and $plugin->can('end') and $entries = $plugin->end() }
This is an ugly line. We'll take it in pieces.
This is a foreach loop collapsed to a single line. It will look more familiar to you rewritten as
foreach my $plugin ( @plugins ) {
$plugins{$plugin} > 0 and $plugin->can('end')
and $entries = $plugin->end()
}
foreach my $plugin ( @plugins ) {
This is the start of the foreach loop which runs through all of the plugins listed in @plugins. These are all of the active plugins, in the sense that $plugin->start() returned true, but this list does include plugins disabled by the user. i.e. plugins that end with an underscore, '_'. For example, 'interpolate_fancy_' has been disabled.
Each is set to $plugin in turn.
Yet another long 'and'(ed) partial eval statement.
Let's take it piece by piece, from left to right.
$plugins{$plugin} > 0
Remember that %plugins contains $on_off values for active plugins where 1 means on and -1 means off.
We only continue considering the current plugin if it has not been disabled by the user.
and $plugin->can('end')
Here we're testing to see if the current plugin has an end routine (referred to as a method in documentation you'll find on can() because this is an OO interface).
If the return from the can method is true, then $plugin claims to have an end() method, which means we should be able to safely refer to $plugin->end().
$entries = $plugin->end()
Here we call the plugin's end routine, assuming we've determined that the current plugin does in fact have such a routine.
Notice that there are no arguments.
From the documentation
The end subroutine is called at the last minute, after all output has been processed and sent to the browser and before Blosxom finishes executing.
sub end { 1; }
The subroutine is not passed anything by Blosxom.
Here's where you can perform any cleanup or last-minute operations you might find useful.
Notice that the return value is being stored in the $entries variable, which was previously a reference to the anonymous entries routine.
A return value of 1 should be used to indicate that the plugin ran to completion successfully.
Notice that this line follows a pattern which we've seen before, most recently at line 237.
This is the pattern used to call on active plugins looking for a particular routine, one of (in order of execution)
Recognizing these familiar sequences can help you to understand the code.
Though similar lines related to the template and entries routines are nearly identical, there are some significant differences here.
First note that we do not drop out of the foreach loop when we encounter the first plugin with an end() routine.
Multiple plugins with end() routines can coexist.
Of course, these are called one at a time in the order they are encountered.
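The whole pattern can be exercised with a few toy packages. Everything in this sketch is hypothetical: the plugin names demo_plugin_a, demo_plugin_b and demo_plugin_c are invented, and instead of assigning to $entries we just record which plugins had their end routine called.

```perl
use strict;
use warnings;

# Three hypothetical plugins. Only a and c define an end routine.
{ package demo_plugin_a; sub end   { return 1 } }
{ package demo_plugin_b; sub start { return 1 } }
{ package demo_plugin_c; sub end   { return 1 } }

my @plugins = ('demo_plugin_a', 'demo_plugin_b', 'demo_plugin_c');

# 1 means on, -1 means disabled by the user.
my %plugins = (demo_plugin_a => 1, demo_plugin_b => 1, demo_plugin_c => -1);

my @called;
foreach my $plugin (@plugins) {
    # Same chain as line 293: each 'and' only continues if the left side is true.
    $plugins{$plugin} > 0
        and $plugin->can('end')
        and $plugin->end()
        and push @called, $plugin;
}

print "called: @called\n";  # called: demo_plugin_a
```

demo_plugin_b is skipped because it has no end routine, and demo_plugin_c is skipped because it is switched off, even though it defines one.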
After this loop is complete the script is finished.
At this point the browser should display a single completed page in dynamic mode, or the script has completely generated static mode output i.e. a complete site rooted at $static_dir.
Of course, this is not the last line of the source code. We have a couple of function definitions still to come including &generate, but this line marks the end of execution of the script.
If you are willing to simply accept that the generate routine works and properly generates each page of output, then you've seen all there is to know about the operation of blosxom.
If you knew nothing about blosxom before, then it's nice to take the time to appreciate that you've learned something new. Better yet, you've learned something new and useful.
294. blank line
295. # Generate
296. sub generate {
Start of the generate routine definition.
297. my($static_or_dynamic, $currentdir, $date, $flavour, $content_type) = @_;
Here the routine declares variables to store all of its expected parameters.
$static_or_dynamic, From what we've seen in the code before now we know that this variable will have one of two values, 'static' or 'dynamic', depending on the mode we're in when the generate routine is called.
$currentdir, From the call at line 289 (dynamic mode), we pass the package variable $path_info, the value of which is assigned to $currentdir here.
This value may include a filename in addition to path info.
This is consistent with the call to the routine in static mode at line 268, where $path_info is the argument, and the value of $path_info is a path including the filename, if present.
Note also that $currentdir may be empty, '',
if we are in static mode and the value of $indexes{$path} looks like a date
e.g.
'2006/11/02' rather than 'Technology/Computer/Apple'.
$date, From the call at line 289 (dynamic mode), we pass the string "$path_info_yr/$path_info_mo_num/$path_info_da".
Remember that this string may look like
In static mode, this value will either be a date-based path beginning with something like '2006/11/02' (the call at line 274, where $p is passed), or it will be empty, '', (the call at line 268) if the value of $indexes{$path} looks like a path
e.g.
'Technology/Computer/Apple' rather than '2006/11/02' .
$flavour, Is whatever value of $flavour is passed by the calling routine.
If we are running in static mode, this will be one of the values from @static_flavours. In dynamic mode, it will be either a requested flavor, named in the browser as part of the address, the value of the flav parameter, or the default flavor.
$content_type, Whether in static or dynamic mode, the value of this variable should be the complete contents of the 'content_type.' file corresponding to the flavour specified by the value of $flavour.
298. blank line
299. %files = %$files; %others = ref $others ? %$others : ();
Notice that this one line includes two statements. We'll consider each separately, just as Perl does.
First, we have
%files = %$files;
$files is a reference to a hash built up in the entries routine containing key/value pairs where
each key is the name of a file which has not been eliminated from consideration as a post, and
the corresponding value is the file's current modification time in unix timestamp format as returned from stat($File::Find::name)->mtime.
Here we're dereferencing it, i.e. we're accessing the original hash through the reference.
What are we doing with the original hash?
Making a copy of the hash in a new %files hash.
Take care to note that this is a different %files hash than the one we were dealing with in the entries subroutine.
They have the same name but they do not conflict because the scope of the two variables is different.
The %files hash declared in the entries subroutine is valid in that routine. In other words, the variable %files declared in entries, is local to that subroutine.
The %files hash here is the package variable declared at line 70.
There is one additional point that I want to emphasize.
The reference was in fact a reference to the same data, the same storage location, as the hash defined in the entries routine. As such, making a change to either %files or $files would affect the data.
The package variable %files, which is assigned to on this line, is a different storage location, and so we have distinct data, though initially the values are the same.
Changing the value of %files outside of the entries subroutine does not affect the values accessible via the $files reference. This may prove to be important because it's possible that the code could continue working with both.
That having been said, we never see the $files reference again in the script, so it's ok to forget about it.
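The distinction between the reference and the copy can be demonstrated in a few lines. This is a sketch with invented file names and timestamps, using lexical variables rather than blosxom's package variables; %original plays the part of the hash built inside the entries routine.

```perl
use strict;
use warnings;

# Hypothetical path and mtime, standing in for the hash built in entries.
my %original = ('/data/post.txt' => 1164326400);

my $files = \%original;  # a reference: same storage as %original
my %files = %$files;     # a copy: separate storage, same initial contents

$files{'/data/post.txt'}  = 0;           # change the copy only
$files->{'/data/new.txt'} = 1164330000;  # change through the reference

print "original post mtime: $original{'/data/post.txt'}\n";
print "original has new.txt: ", (exists $original{'/data/new.txt'} ? 'yes' : 'no'), "\n";
print "copy has new.txt: ", (exists $files{'/data/new.txt'} ? 'yes' : 'no'), "\n";
```

The write through the reference shows up in the original hash, while the write to the copy leaves the original untouched, and the copy never sees the key added later.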
Now we'll look at the second statement on this line:
%others = ref $others ? %$others : ();
Remember that the %others hash contains key/value pairs for files which are readable, but did not meet the definition of an entry for one of the reasons spelled out at lines 198 to 200.
ref is a Perl operator that returns the type of 'thing' referenced by a reference variable like $others (eg SCALAR, ARRAY, HASH). All of these are treated as true values.
ref returns false if its operand is not a valid reference of any type.
Realize that the %others hash in entries may not have been initialized.
If that is the case $others (previously assigned the uninitialized value returned from entries) is in fact not a valid reference, and so would return a false value here.
Now that we've talked about ref, notice that this is another usage of Perl's ternary operator.
If ref $others, which we've just discussed, returns true, then the expression preceding ':' is evaluated, otherwise we consider only the expression following ':'.
If $others is in fact a reference then we evaluate %$others, dereferencing $others, assigning that value to the package variable %others.
In this case we essentially have %others = %$others.
If $others is not a valid reference then we instead assign %others the empty list (). In this case %others is an empty hash.
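Both branches of this ternary can be tried with a pair of made-up values. The variable names $given and $missing, and the sample path and timestamp, are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical cases: one valid hash reference, one undefined value.
my $given   = { '/data/notes.bak' => 1164326400 };
my $missing = undef;

# Same expression as the second statement on line 299.
my %from_ref   = ref $given   ? %$given   : ();
my %from_undef = ref $missing ? %$missing : ();

print scalar(keys %from_ref), " key(s) from the reference\n";
print scalar(keys %from_undef), " key(s) from the undefined value\n";
```

Because the dereference only happens when ref returns a true value, the undefined case never attempts %$missing, which would otherwise be an error under 'use strict'.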
300. blank line
301. # Plugins: Filter
302. foreach my $plugin ( @plugins ) { $plugins{$plugin} > 0 and $plugin->can('filter') and $entries = $plugin->filter(\%files, \%others) }
Another foreach loop collapsed to a single line. Let's look at it piece by piece.
foreach my $plugin ( @plugins ) {
Start of the loop which runs through all active plugins listed in @plugins, as determined by the return value of each plugin's start routine.
Note that any of these could be disabled by the user.
Each of these is set to $plugin in turn.
$plugins{$plugin} > 0 and $plugin->can('filter') and
$entries = $plugin->filter(\%files, \%others)
Yet another long 'and'(ed) together partial evaluation statement.
From left to right.
$plugins{$plugin} > 0
Remember that %plugins contains $on_off values for active plugins where 1 means on and -1 means off.
We only continue considering the current plugin if it has not been disabled by the user.
and $plugin->can('filter')
Here we're testing to see if the current plugin has a filter subroutine (referred to as a method in documentation you'll find on can() because this is an OO interface).
If the return from the can method is true, then $plugin claims to have a filter() method, which means we should be able to safely refer to $plugin->filter().
$entries = $plugin->filter(\%files, \%others)
Here we call the plugin's filter routine, passing it references to the %files and %others hashes.
With references to these hashes, the filter routine has access to all of the entries.
From the documentation
The filter subroutine offers the plugin the chance to alter the full list of entries Blosxom has found in its data directory.
The subroutine is passed a reference ($files_ref) to the hash of files.
The hash consists of key/value pairs,
the keys being the full-path of an entry and the value being its Unix-style modification time (mtime).
...
Notice that I end my subroutine with a 1; .
Returning a true (1) value when all goes as expected is good form; return a false (0) when problems occur.
While Blosxom doesn't halt execution on a 0 or anything that severe, it does report on what happens with each plugin when run statically.
Notice that the return value is being stored in the $entries variable, which was previously a reference to the anonymous entries routine.
This return value should be a value of:
true (1) indicating that the plugin ran successfully, or false (0) indicating that some sort of problem was encountered.
You should recognize the pattern of these lines. This is the standard way blosxom cycles through its active plugins looking for a specific routine, in this case filter().
Though similar lines related to the template and entries routines are nearly identical, there are some significant differences here.
First note that we do not drop out of the foreach loop when we encounter the first plugin with a filter routine.
Multiple plugins with filter routines can coexist.
Of course, these are called one at a time, and because the purpose of the routine is to filter (i.e. potentially modify) the entries blosxom operates on, each plugin with a filter routine is affected by previous plugins and affects any that follow.
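To see how one filter routine affects the next, here is a small self-contained sketch of the same loop. The two plugin packages (hide_drafts and count_posts) and the sample file names are invented for illustration; only the loop itself follows line 302.

```perl
use strict;
use warnings;

# Hypothetical plugin 1: removes anything with "draft" in its name.
package hide_drafts;
sub filter {
    my ($pkg, $files_ref, $others_ref) = @_;
    delete $files_ref->{$_} for grep { /draft/ } keys %$files_ref;
    return 1;
}

# Hypothetical plugin 2: records how many entries it was shown.
package count_posts;
our $seen = 0;
sub filter {
    my ($pkg, $files_ref, $others_ref) = @_;
    $seen = scalar keys %$files_ref;   # sees what earlier plugins left behind
    return 1;
}

package main;

my @plugins = qw(hide_drafts count_posts);
my %plugins = (hide_drafts => 1, count_posts => 1);
my %files   = ('/data/a.txt' => 100, '/data/draft-b.txt' => 200);
my %others;
my $entries;

# Same shape as line 302: no 'last', so every filter routine runs.
foreach my $plugin (@plugins) {
    $plugins{$plugin} > 0 and $plugin->can('filter')
        and $entries = $plugin->filter(\%files, \%others);
}

print "count_posts saw $count_posts::seen entry\n";
```

Since both routines receive a reference to the same %files hash, the deletion made by the first plugin is visible to the second.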
303. blank line
304. my %f = %files;
This initialization creates a copy of the hash at %files, which contains key/value pairs where
each key is a filename, i.e. the name of a file which has not been eliminated from consideration as a post, and
the corresponding values are modification times in unix timestamp format as returned from stat($File::Find::name)->mtime.
305. blank line
306. # Plugins: Skip
307. # Allow plugins to decide if we can cut short story generation
308. my $skip; foreach my $plugin ( @plugins ) { $plugins{$plugin} > 0 and $plugin->can('skip') and defined($tmp = $plugin->skip()) and $skip = $tmp and last; }
This is an ugly line. We'll take it in pieces.
The line starts with the statement
my $skip;
which simply declares the local variable $skip. This statement could be written on its own line.
Next we see
foreach my $plugin ( @plugins ) { $plugins{$plugin} > 0 and
$plugin->can('skip') and defined($tmp = $plugin->skip()) and
$skip = $tmp and last; }
This is a foreach loop collapsed to a single line. It will look more familiar to you rewritten as
foreach my $plugin ( @plugins ) {
$plugins{$plugin} > 0 and $plugin->can('skip') and defined
($tmp = $plugin->skip()) and $skip = $tmp and last;
}
foreach my $plugin ( @plugins ) {
Start of a foreach loop which runs through all of the plugins listed in @plugins (active as determined by the plugin's start routine).
Note that any of these could be disabled by the user.
Each is set to $plugin in turn.
$plugins{$plugin} > 0 and $plugin->can('skip') and
defined($tmp = $plugin->skip()) and $skip = $tmp and last;
Yet another long 'and'(ed) partial eval statement.
Let's take it piece by piece, from left to right.
$plugins{$plugin} > 0
Remember that %plugins contains $on_off values for active plugins.
where 1 means on and -1 means that the plugin is off.
We only continue considering the current plugin if it has not been disabled by the user.
and $plugin->can('skip')
Here we're testing to see if the current plugin has a skip routine (referred to as a method in documentation you'll find on can() because this is an OO interface).
If the return from the can method is true, then $plugin claims to have a skip() method, which means we should be able to safely refer to $plugin->skip().
and defined($tmp = $plugin->skip())
In this line we assign the value returned by $plugin->skip() to $tmp and then check that $tmp has a defined value. The skip routine is not expected to return a subroutine reference, just a true value (1) when story generation should be cut short.
and $skip = $tmp
Next, we assign the value from $tmp to $skip.
Why use $tmp at all?
Because we do not want to overwrite $skip until we know that the plugin actually returned a defined value. This check is accomplished in the previous expression.
Once we know that $tmp is defined, we can pass its value to $skip.
and last
Finally, the last operator drops us out of the foreach loop.
Because the assignment $skip = $tmp is itself part of the 'and' chain, last is reached only if the assigned value is true.
This means that we stop calling skip routines as soon as one of them returns a true value.
This is consistent with what the documentation tells us,
From the documentation
The skip subroutine is called just as Blosxom starts actually generating output.
Any plugin can cut short story generation by returning a value of 1.
Of course, the first plugin to return a 1 is the last one called.
Notice that this line follows a pattern that we've seen before, most recently at line 302.
This is the pattern used to call all active plugins looking for a particular routine.
Recognizing these sorts of familiar sequences can help you to understand the code and lessen the mental burden.
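The behaviour of the skip loop can be checked with a few toy plugins. This is only a sketch: the packages quiet, redirect and never are hypothetical, and the loop is copied in shape from line 308.

```perl
use strict;
use warnings;

# Hypothetical plugins with skip routines.
package quiet;    sub skip { return undef }    # has no opinion
package redirect; sub skip { return 1 }        # asks to skip generation
package never;    our $called = 0;
                  sub skip { $called = 1; return 1 }

package main;

my @plugins = qw(quiet redirect never);
my %plugins = map { $_ => 1 } @plugins;
my ($skip, $tmp);

foreach my $plugin (@plugins) {
    $plugins{$plugin} > 0 and $plugin->can('skip')
        and defined($tmp = $plugin->skip())
        and $skip = $tmp and last;
}

print "skip is ", (defined $skip ? $skip : 'undef'), "\n";
print "third plugin called? ", ($never::called ? 'yes' : 'no'), "\n";
```

The first plugin returns undef, so the chain stops at defined(...) and the loop moves on. The second returns 1, which is assigned to $skip and triggers last, so the third plugin is never consulted.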
309. blank line
310. # Define default interpolation subroutine
311. $interpolate =
We'll see on the next line that what we're assigning to the variable is an anonymous subroutine that serves as the default interpolate routine.
312. sub {
The start of the anonymous subroutine that will serve as the default interpolate routine.
From the documentation
The interpolate subroutine offers the plugin the chance to swap in a replacement for the default interpolation subroutine
- the one that replaces variables like $title and $someplugin::somevariable with their associated values.
The first plugin whose interpolate hook returns a reference to an anonymous subroutine to be used in place of the default has that code assigned to $interpolate in blosxom.cgi itself.
The subroutine assigned to $interpolate is called for template components
e.g. head, story, foot,
being passed the contents thereof for appropriate interpolation.
313. package blosxom;
314. my $template = shift;
Here shift is shorthand for shift @_, where @_ is the array containing the subroutine's parameter list.
We're shifting off the first value and storing it in the lexical variable $template.
Review
Before now we've seen a package variable $template, which holds a reference to a routine that expects as parameters three variables
$path, $chunk, $flavour,
and returns the complete contents of the template file specified by the values of these variables, starting in the same directory as the requested entry (post) and working up toward the root of $datadir, or failing all else, a baked-in template, or as a last resort, an error.
This variable is something different!
Here $template will contain the string passed to it by the caller. Coming up we will see that the $interpolate routine is called in turn on each template component (e.g. head, story, foot),
immediately before that component is added (concatenated) to the output string ($output).
315. $template =~
The start of the substitution statement that continues on the next line.
We are trying to match and replace within the value at $template.
316. s/(\$\w+(?:::)?\w*)/"defined $1 ? $1 : ''"/gee;
This line specifies the substitution and completes the statement started on the preceding line.
Let's look at the pattern match first...
(\$\w+(?:::)?\w*)
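This specifies
a literal dollar sign, \$,
followed by one or more word characters, \w+,
optionally followed by a double colon, (?:::)?*,
followed by zero or more word characters, \w*.
*Note the use of ?: after the opening parenthesis, which instructs Perl's regex engine that these parentheses are for grouping only and should not affect match variables.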
So a string such as
"$plugin_name::variable_name"
would match, as would
$variable_name
Because the entire pattern is in parentheses, the entire matched portion of the string would be assigned to the match variable named $1.
Now for the substitution:
"defined $1 ? $1 : ''"/gee
Quickly take a look at the modifiers.
We've seen /g before.
It tells Perl's regex engine to continue with all possible substitutions rather than stopping at the first match, which is the usual behavior.
/e is new.
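It tells Perl to treat the replacement as Perl code to be evaluated, rather than as a plain string.
The fact that there are two /e modifiers means that the replacement is evaluated twice: the result of the first evaluation is itself evaluated as Perl code.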
So if we were going to execute this expression, what would it mean?
We know that $1 will be the portion of $template matched by the pattern
\$\w+(?:::)?\w*.
And that string will look like a variable that might appear in our template files
e.g.
$plugin_name::variable_name.
The first evaluation treats the replacement as a double-quoted string, so $1 is interpolated and we are left with a string of Perl code such as
defined $plugin_name::variable_name ? $plugin_name::variable_name : ''
The second evaluation runs that code.
If the variable named in the template is defined, meaning that a corresponding variable does in fact exist and its value is defined, then the test is true and we evaluate the expression preceding the ':', which yields the variable's value.
If the pattern finds a string that looks like an interpolatable variable in a template file which is not defined, i.e. either it does not exist or its value is undef,
then we evaluate the expression following ':', which in this case is '', the empty string.
$template will contain the complete contents of a template file, so this short line runs through the entire contents of the template file and either replaces each variable it finds with its value, if it is defined, or drops it from $template if it is undefined.
317. return $template;
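After completing the substitutions specified in the previous line, we return to the caller the same string, the contents of the template file under consideration, with the following changes: each defined variable has been replaced by its value, and each undefined variable has been removed.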
318. };
End of the default interpolate routine definition.
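The default interpolate routine is small enough to run on its own. The sketch below copies the routine from lines 312 to 318 and feeds it a made-up template string; the variables $blosxom::title and $myplugin::greeting and the template text are assumptions for the sake of the example.

```perl
# No "use strict" here: the eval'd code must be free to mention
# variables that were never declared, as blosxom's templates do.
package blosxom;

# Hypothetical template variables.
$blosxom::title     = 'My Weblog';
$myplugin::greeting = 'hello';

# The default interpolate routine, as on lines 312-318.
my $interpolate = sub {
    package blosxom;
    my $template = shift;
    $template =~
        s/(\$\w+(?:::)?\w*)/"defined $1 ? $1 : ''"/gee;
    return $template;
};

my $out = $interpolate->('<h1>$title</h1> $myplugin::greeting [$no_such_var]');
print "$out\n";   # <h1>My Weblog</h1> hello []
```

Here $title resolves to $blosxom::title because the routine runs in package blosxom, the plugin variable is found through its fully qualified name, and $no_such_var is undefined and so disappears from the output.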
319. blank line
320. unless (defined($skip) and $skip) {
The block that starts here continues to line 434.
This is essentially the whole of the generate routine.
If a plugin has decided that we should skip generation, all we output is either:
nothing if running in static mode, or
a header if running in dynamic mode, assuming the header exists (see line 432 for more info).
From the documentation
The skip subroutine is called just as Blosxom starts actually generating output. Any plugin can cut short story generation by returning a value of 1. Of course, the first plugin to return a 1 is the last one called.
The skip routine is useful, for example, if your plugin is to return a redirect for some reason or send a binary stream (e.g. an image) to the browser, and hasn't any reason to bother generating any blog entries.
Unless is the converse of if. We will execute the code in the block unless the condition is met (if it is not met).
In other words, if the conditional expression is true, we do not execute the entire block.
In this case, that means that execution jumps to line 434, bypassing nearly all of the generate routine.
First we test defined($skip), which will evaluate as true if the variable $skip has any value other than undef.
If the first part of the expression is true (note the use of the logical operator 'and'), we evaluate the second subexpression.
That expression,
$skip
is true if $skip has a value and that value is anything other than 0, '0', or '', (also undef but we've already confirmed that the value is not undef or we would not be executing this portion of the statement).
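As a quick check of the condition, this sketch runs the same unless test over a handful of candidate values for $skip. The list of values and the @results array are just illustration.

```perl
use strict;
use warnings;

my @results;
for my $skip (undef, 0, '', '0', 1, 'yes') {
    my $label     = defined $skip ? "'$skip'" : 'undef';
    my $generated = 0;

    # Same condition as line 320.
    unless (defined($skip) and $skip) {
        $generated = 1;   # stands in for the body of generate()
    }

    push @results, $generated;
    print "skip=$label: ", ($generated ? 'generate' : 'skip generation'), "\n";
}
```

Only the last two values, 1 and 'yes', are both defined and true, so only they prevent the block from running.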
321. blank line
322. # Plugins: Interpolate
323. # Allow for the first encountered plugin::interpolate subroutine to
324. # override the default built-in interpolate subroutine
325. my $tmp; foreach my $plugin ( @plugins ) { $plugins{$plugin} > 0 and $plugin->can('interpolate') and defined($tmp = $plugin->interpolate()) and $interpolate = $tmp and last; }
This is an ugly line. We'll take it in pieces.
The line starts with the statement
my $tmp;
which declares the local variable $tmp. The statement could be written on its own line.
Next we see
foreach my $plugin ( @plugins ) { $plugins{$plugin} > 0 and
$plugin->can('interpolate') and
defined($tmp = $plugin->interpolate()) and
$interpolate = $tmp and last; }
This is a foreach loop collapsed to a single line. It will look more familiar to you rewritten as
foreach my $plugin ( @plugins ) {
$plugins{$plugin} > 0 and $plugin->can('interpolate') and
defined($tmp = $plugin->interpolate()) and
$interpolate = $tmp and last;
}
foreach my $plugin ( @plugins ) {
This is the start of the foreach loop which runs through all of the plugins listed in @plugins. These are the active plugins in the sense that each plugin's start routine returned true, but this list may include plugins disabled by the user (e.g. interpolate_fancy_).
Each is set to $plugin in turn.
$plugins{$plugin} > 0 and $plugin->can('interpolate') and
defined($tmp = $plugin->interpolate()) and
$interpolate = $tmp and last;
Yet another long 'and'ed partial eval statement.
Let's take it piece by piece, from left to right.
$plugins{$plugin} > 0
Remember that %plugins contains $on_off values for active plugins where 1 means on and -1 indicates that the plugin has been 'turned off' by the user.
We only continue considering the current plugin if it has not been disabled by the user.
If the value at $plugins{$plugin} is <= 0 we skip the rest of the statement, with the result that we do nothing with the plugin, and nothing is exactly what we want to do with disabled plugins.
If $plugin has not been disabled by the user $plugins{$plugin} > 0 will be true and we continue with the statement.
and $plugin->can('interpolate')
Here we're testing to see if the current plugin has an interpolate subroutine.
If the return from the can method is true, then $plugin claims to have an interpolate() method, which means we should be able to safely refer to $plugin->interpolate().
from perldoc
$obj->can(METHOD)
can checks if the object or class has a method called METHOD. If it does, then a reference to the sub is returned. If it does not then undef is returned.
and defined($tmp = $plugin->interpolate())
In this line we assign the reference to the anonymous subroutine returned by $plugin->interpolate() to $tmp and then check that $tmp has a defined value.
and $interpolate = $tmp
Next, we assign the value from $tmp to $interpolate, overwriting the reference to the default interpolate routine (defined at lines 310 - 318).
Why use $tmp at all?
Because we do not want to overwrite $interpolate until we are pretty sure we have a valid (defined) reference to a new interpolate routine.
Once we know that $tmp is defined, we can pass its value to $interpolate.
and last
Finally, the last operator drops us out of the foreach loop.
This means that we will stop looking for replacement interpolate routines after the first one we find.
This is consistent with what the documentation tells us.
From the documentation
The first plugin whose interpolate hook returns a reference to an anonymous subroutine to be used in place of the default has that code assigned to $interpolate in blosxom.cgi itself.
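The "first one wins" behaviour can be sketched with three toy plugins. The package names (declines, shouty, ignored) and their replacement routines are hypothetical; the loop has the same shape as line 325.

```perl
use strict;
use warnings;

# Hypothetical plugins with interpolate hooks.
package declines;
sub interpolate { return undef }                      # keeps the default
package shouty;
sub interpolate { return sub { my $t = shift; return uc $t } }
package ignored;
sub interpolate { return sub { my $t = shift; return lc $t } }

package main;

my @plugins = qw(declines shouty ignored);
my %plugins = map { $_ => 1 } @plugins;

# Stand-in for the default routine: returns its argument unchanged.
my $interpolate = sub { my $t = shift; return $t };

my $tmp;
foreach my $plugin (@plugins) {
    $plugins{$plugin} > 0 and $plugin->can('interpolate')
        and defined($tmp = $plugin->interpolate())
        and $interpolate = $tmp and last;
}

my $result = $interpolate->('Hello');
print "$result\n";   # HELLO
```

The first plugin returns undef, so the default survives that iteration. The second returns a code reference, which replaces the default and ends the loop before the third plugin is asked.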
326. blank line
327. # Head
328. my $head = (&$template($currentdir,'head',$flavour));
This line calls the template routine passing it:
$currentdir, defined at line 297.
Note that $currentdir may be empty, '', if we are in static mode and the value of $indexes{$path} looks like a date
e.g.
'2006/11/02'
'head', we're working on the head template component here so this certainly makes sense.
$flavour, specifies the flavour of request.
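Remember from the template routine definition, beginning at line 138, that the routine expects arguments specifying a path, a chunk (the template component), and a flavour.
e.g.
A call with the path 'Technology/Computer/Apple', the chunk 'head', and the flavour 'html' would result in the template routine looking for the template file 'head.html' first at 'Technology/Computer/Apple'.
If no such file exists,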
the routine moves up the directory hierarchy toward the root of the data directory ($datadir), before looking for a baked-in template, and eventually returning an error.
The return value is the content of the file requested as a string, or the corresponding baked-in template, or an error.
We store this string at $head.
329. blank line
330. # Plugins: Head
331. foreach my $plugin ( @plugins ) { $plugins{$plugin} > 0 and $plugin->can('head') and $entries = $plugin->head($currentdir, \$head) }
Another foreach loop collapsed to a single line. Let's look at it piece by piece.
foreach my $plugin ( @plugins ) {
Start of the loop which runs through all active plugins listed in @plugins, as determined by the return value of each plugin's start routine.
Note that any of these could be disabled by the user.
Each of these is set to $plugin in turn.
$plugins{$plugin} > 0 and $plugin->can('head')
and $entries = $plugin->head($currentdir, \$head)
Yet another long 'and'(ed) partial eval statement.
Let's take it from left to right.
$plugins{$plugin} > 0
Remember that %plugins contains $on_off values for active plugins where 1 means that the plugin is on, and -1 is used to indicate that the user has 'turned it off'.
We only continue considering the current plugin if it has not been disabled by the user.
and $plugin->can('head')
Here we're testing to see if the current plugin has a head routine (referred to as a method in documentation you'll find on can() because this is an OO interface).
If the return from the can method is true, then $plugin claims to have a head() method, which means we should be able to safely refer to $plugin->head().
$entries = $plugin->head($currentdir, \$head)
Here we call the plugin's head routine, passing it the current working directory, $currentdir, and a reference to the $head variable, which contains the head.flavour source.
See notes at line 328 for more info.
From the documentation
Blosxom calls the head subroutine after reading in the appropriate head.flavour and before swapping in values for template components.
The subroutine is passed the current working directory - as defined by the path, and a reference to the raw head.flavour source.
sub head { my($pkg, $currentdir, $head_ref) = @_; 1; }
The head subroutine offers the plugin the opportunity to alter the raw header source and define or alter any variables before the header is added to the output stream.
This is also a good point to alter those custom non-story-specific template variables.
Note that the subroutine should return a value of 0 (indicating failure) or 1 (success)
Starting at line 293 $entries is used to hold these true/false return values from plugin routines.
Multiple plugins with head() routines can coexist. Each will be called according to the order of the list at @plugins, which is the order they occur in the plugins directory.
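Because the head routine receives a reference to $head, it can change the header in place. A minimal sketch follows, assuming a hypothetical plugin named add_banner and a made-up header string.

```perl
use strict;
use warnings;

# Hypothetical plugin that appends a comment to the raw head source.
package add_banner;
sub head {
    my ($pkg, $currentdir, $head_ref) = @_;
    $$head_ref .= "<!-- section: $currentdir -->\n";   # edit through the reference
    return 1;                                           # report success
}

package main;

my $currentdir = 'Technology/Computer';
my $head       = "<html>\n";

# Called the same way as on line 331.
my $entries = add_banner->head($currentdir, \$head);

print $head;
```

After the call, $head in the caller carries the extra line, and $entries holds the plugin's true return value.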
332. blank line
333. $head = &$interpolate($head);
Now that we're done running the raw head.flavour source through all head() routines in our active plugins, we pass $head to the interpolate routine, defined at line 310.
In that routine the source is shifted into a local $template variable.
A substitution is run that replaces all defined variables...
e.g.
$title and $someplugin::somevariable
...with their corresponding values. Undefined variables are stripped from the source.
After the substitutions are made, we return the string back into $head.
334. blank line
335. $output .= $head;
With this statement, we're beginning to build up a string containing the complete page for output from generate().
Not surprisingly, the first value we append to $output is $head.
Note that the string at $head has been run through all head routines found in active plugins, and through interpolate(), before this point.
336. blank line
337. # Stories
338. my $curdate = '';
This statement declares the variable $curdate, and initializes it to the empty string, ''.
339. my $ne = $num_entries;
Here we declare and initialize the variable $ne to the value of $num_entries, which is the user-configurable variable defined at line 34.
340. blank line
341. if ( $currentdir =~ /(.*?)([^\/]+)\.(.+)$/ and $2 ne 'index' ) {
Here we have a conditional dependent on the value of a pattern match against the value of $currentdir.
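(.*?)([^\/]+)\.(.+)$
matches
zero or more characters, as few as possible, (.*?)*,
followed by one or more characters other than '/', ([^\/]+),
followed by a literal dot, \.,
followed by one or more characters up to the end of the string, (.+)$.
*Note the ? following .* which makes this portion of the match non-greedy: it matches as few characters as possible.
Note the use of parentheses to populate memory variables:
(.*?), $1
([^\/]+), $2
(.+), $3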
Because we must match the end of the string, (because of the presence of $ at the end of the pattern), it's best to look at this match starting at the end.
$3, characters following a literal dot and followed by the end of the string. This will match a file extension if there is one.
$2, one or more characters preceding the dot before the extension, but not including any '/' characters. This will match the filename portion of $currentdir, not including the extension or path info.
$1, will match the complete path including the trailing '/' separating the filename from the path info.
If the match fails the entire expression fails and we skip the conditional block, otherwise we consider the second part of the expression.
and $2 ne 'index'
this portion of the expression is true if $2, the filename not including extension or path info, does not equal (ne) the literal string 'index'.
So we only enter this block if the request includes something that looks like a filename and that name is not 'index'.
e.g.
We would not enter the block on a request for a category or date, and also we would not enter the block on a request for 'index.$flavour'.
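The conditional is easy to try against a few sample values. In this sketch the four $currentdir values are invented examples of an entry request, an index request, a category request and a date request.

```perl
use strict;
use warnings;

my %enters;
for my $currentdir ('Technology/Apple/newmac.html',
                    'Technology/index.html',
                    'Technology/Apple',
                    '2006/11/02') {
    # Same condition as line 341.
    if ( $currentdir =~ /(.*?)([^\/]+)\.(.+)$/ and $2 ne 'index' ) {
        $enters{$currentdir} = "path='$1' name='$2' ext='$3'";
    }
    else {
        $enters{$currentdir} = '';
    }
    print "$currentdir: ", ($enters{$currentdir} || 'block not entered'), "\n";
}
```

Only the first value enters the block. The second matches the pattern but has the name 'index', and the last two contain no dot, so the pattern does not match at all.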
342. $currentdir = "$1$2.$file_extension";
Now we're replacing the value in $currentdir with something similar but different in one important way.
Before this line, the value in $currentdir would have been something in the form:
...path/filename.$flavour.
After this line $currentdir will instead contain
...path/filename.$file_extension
where $file_extension is the user configurable variable specifying the file extension used on posts.
After this line we can use $currentdir to refer to an entry - i.e. the file itself.
343. $files{"$datadir/$1$2.$file_extension"} and %f = ( "$datadir/$1$2.$file_extension" => $files{"$datadir/$1$2.$file_extension"} );
This statement can be described as two expressions connected by the logical 'and'.
Let's look at those two pieces separately.
$files{"$datadir/$1$2.$file_extension"}
This expression will evaluate to the value at the key "$datadir/$1$2.$file_extension" in the %files hash.
In line 210 we populated the %files hash with the following expression
$files{$File::Find::name} = stat($File::Find::name)->mtime
See notes at line 210 for more info.
This line creates a pair with the key $File::Find::name, and a value that is the modification time of the corresponding file.
$File::Find::name will be in the form of a complete path from the root of the filesystem through, and including, the filename.
Again see the code and notes around line 210 for more info.
The key in this expression is in the correct form:
$datadir will take care of the portion of the path from the root of the filesystem to the start of the data directory ($datadir).
$1 will be the path to the file starting at the root of the data directory ($datadir).
$2 is the name of the file itself, and
$file_extension adds the extension necessary to complete the filename.
Now that we have confirmed that this string is in the correct form, we can see that the script will find a key in the %files hash only if it had previously identified a valid entry in the filesystem.
If such a key exists, its value will be the modification time of the file. We're not concerned with what that time is specifically here, only that it will not be zero. Any nonzero value will evaluate as true, which is enough for the purposes of this statement.
If we have not decided previously to generate the file we're currently considering, then there will be no key "$datadir/$1$2.$file_extension" in the %files hash and the expression will evaluate to undef (false).
If we do find a key then we evaluate the next subexpression.
%f = ( "$datadir/$1$2.$file_extension" => $files{"$datadir/$1$2.$file_extension"} );
Having already determined that we should generate the item identified as $datadir/$1$2.$file_extension by consulting the %files hash, this expression assigns to the %f hash the pair,
key, $datadir/$1$2.$file_extension
The value should match the corresponding key in the %files hash.
value, $files{"$datadir/$1$2.$file_extension"}
Again, as just discussed, this value in the %files hash will be the modification time of the file named by the key (in unix timestamp format).
If this seems like a strange thing to do, note what the list assignment actually accomplishes: %f started out (at line 304) as a copy of everything in %files, and after this statement it contains only the single pair for the requested entry.
To summarize the conditional block:
If the item we're considering now in generate() looks like a file and is not an 'index.' file
then we append $file_extension to the end of the filename, replacing the requested flavour extension, and
if the resulting filename appears in the %files hash, because we have previously identified it as something we want to generate, then we replace the contents of the %f hash with that one pair (i.e. the pair that targets the file and its modification time).
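The effect of lines 341 to 343 on %f can be seen with a small stand-in data set. The values of $datadir, $file_extension and the two entries in %files below are assumptions made up for this sketch.

```perl
use strict;
use warnings;

# Hypothetical configuration and entries.
my $datadir        = '/data';
my $file_extension = 'txt';
my %files = (
    "$datadir/Technology/newmac.txt" => 1162425600,
    "$datadir/Life/garden.txt"       => 1162339200,
);

my %f          = %files;                    # line 304: copy of everything
my $currentdir = 'Technology/newmac.html';  # a request for a single entry

if ( $currentdir =~ /(.*?)([^\/]+)\.(.+)$/ and $2 ne 'index' ) {
    $currentdir = "$1$2.$file_extension";
    $files{"$datadir/$1$2.$file_extension"}
        and %f = ( "$datadir/$1$2.$file_extension"
                       => $files{"$datadir/$1$2.$file_extension"} );
}

print "currentdir is now $currentdir\n";
print "f holds: ", join(', ', sort keys %f), "\n";
```

A list assignment to a hash replaces its previous contents, so %f goes from two pairs to one.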
344. }
End of conditional block started at line 341
345. else {
Start of else block that pairs with if that begins at line 341.
346. $currentdir =~ s!/index\..+$!!;
If the conditional expression from line 341 fails, meaning that we do not have something that looks like a filename or that the filename is 'index.extension', we execute this statement, which is a substitution against the value at $currentdir.
s!/index\..+$!!
Tries to match:
a '/', followed by the literal string 'index.', followed by one or more characters up to the end of the string.
The substitution drops the matched portion of the string.
So we're looking for index.extension, and if we find one, then we remove the filename completely as well as the delimiter separating the filename from the path, '/'.
'/Technology/Computer/Apple/index.html'
becomes
'/Technology/Computer/Apple'
347. }
End of else block started at line 345
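A short sketch of the substitution from line 346, applied to a few made-up paths, shows which values it changes and which it leaves alone.

```perl
use strict;
use warnings;

my %after;
for my $dir ('/Technology/Computer/Apple/index.html',
             '/Technology/Computer/Apple',
             '/index.rss') {
    (my $copy = $dir) =~ s!/index\..+$!!;   # same substitution as line 346
    $after{$dir} = $copy;
    print "'$dir' becomes '$copy'\n";
}
```

A path with no trailing '/index.extension' passes through unchanged, and a bare '/index.rss' is reduced to the empty string.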
348. blank line
349. # Define a default sort subroutine
350. my $sort = sub {
Start of the anonymous subroutine definition that will serve as the default sort routine.
351. my($files_ref) = @_;
Declares and initializes a variable $files_ref.
The routine seems to expect a single argument, which it assigns to $files_ref here.
Judging from the name of the variable, we might guess that this variable will be a reference to the %files hash.
Note that the routine is passed &$sort(\%f, \%others) and the reference to others is ignored. In fact %others seems to be almost entirely ignored in the source, in the sense that we don't do anything useful with it.
352. return sort { $files_ref->{$b} <=> $files_ref->{$a} } keys %$files_ref;
This line
dereferences the anonymous hash at $files_ref, then
the keys operator passes a list of keys from the hash to a sort routine (defined inline right here)
sort { $files_ref->{$b} <=> $files_ref->{$a} }
sorts the keys in descending order (the larger of two values compared is ordered first), according to their corresponding values.
Remember that the values here are the modification times, in unix timestamp format, of the files named by the keys.
Finally, the list of our posts (each key is the complete pathname of a post), sorted in descending order by (modification) date, is returned from the sort routine.
353. };
End of default sort routine definition started at line 350
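The default sort routine can be exercised directly. In this sketch the three file names and their timestamps are invented; the routine itself is the one from lines 350 to 353.

```perl
use strict;
use warnings;

# Hypothetical entries: pathname => modification time.
my %f = (
    '/data/a.txt' => 100,
    '/data/b.txt' => 300,
    '/data/c.txt' => 200,
);
my %others;

# The default sort routine, as on lines 350-353.
my $sort = sub {
    my ($files_ref) = @_;
    return sort { $files_ref->{$b} <=> $files_ref->{$a} } keys %$files_ref;
};

my @ordered = $sort->(\%f, \%others);   # the second argument is ignored
print join("\n", @ordered), "\n";
```

The newest file (largest timestamp) comes first: b.txt, then c.txt, then a.txt.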
354. blank line
355. # Plugins: Sort
356. # Allow for the first encountered plugin::sort subroutine to override the
357. # default built-in sort subroutine
358. my $tmp; foreach my $plugin ( @plugins ) { $plugins{$plugin} > 0 and $plugin->can('sort') and defined($tmp = $plugin->sort()) and $sort = $tmp and last; }
This is an ugly line. We'll take it in pieces.
The line starts with the statement
my $tmp;
which declares the local variable $tmp. This statement could be written on its own line.
Next we see
foreach my $plugin ( @plugins ) { $plugins{$plugin} > 0 and
$plugin->can('sort') and defined($tmp = $plugin->sort()) and
$sort = $tmp and last; }
This is a foreach loop collapsed to a single line. It will look more familiar to you rewritten as
foreach my $plugin ( @plugins ) {
$plugins{$plugin} > 0 and $plugin->can('sort') and
defined($tmp = $plugin->sort()) and
$sort = $tmp and last;
}
foreach my $plugin ( @plugins ) {
This is the start of the foreach loop which runs through all of the plugins listed in @plugins. These are the active plugins in the sense that each plugin's start routine returned true, but the list may include plugins disabled by the user (e.g. interpolate_fancy_).
Each is set to $plugin in turn.
$plugins{$plugin} > 0 and $plugin->can('sort') and
defined($tmp = $plugin->sort()) and $sort = $tmp and last;
Yet another long 'and'(ed) partial eval statement.
Let's take it from left to right.
$plugins{$plugin} > 0
Remember that %plugins contains $on_off values for active plugins where 1 means that the plugin is on, and -1 indicates that the user has 'turned off' the plugin.
We only continue considering the current plugin if it has not been disabled by the user.
If the value at $plugins{$plugin} is <= 0 we skip the rest of the statement, with the result that we do nothing with the plugin, and nothing is exactly what we want to do with disabled plugins.
If $plugin has not been disabled by the user $plugins{$plugin} > 0 will be true and we continue with the statement.
and $plugin->can('sort')
Here we're testing to see if the current plugin has a sort subroutine.
If the return from the can method is true, then $plugin claims to have a sort() method, which means we should be able to safely refer to $plugin->sort().
from perldoc
$obj->can(METHOD)
can checks if the object or class has a method called METHOD. If it does, then a reference to the sub is returned. If it does not then undef is returned.
and defined($tmp = $plugin->sort())
In this line we assign the reference to the anonymous subroutine returned by $plugin->sort() to $tmp and then check that $tmp has a defined value.
and $sort = $tmp
Next, we assign the value from $tmp to $sort, overwriting the reference to the default sort routine (defined at lines 350 - 353).
Why use $tmp at all?
Because we do not want to overwrite $sort until we are pretty sure we have a valid (defined) reference to a new sort routine.
Once we know that $tmp is defined, we can pass its value to $sort.
and last
Finally, the last operator drops us out of the foreach loop.
This means that we will stop looking for replacement sort routines after the first one we find.
This is consistent with what the documentation tells us:
From the documentation
The sort subroutine offers the plugin the chance to swap in a replacement for the default sorting subroutine that determines in what order all those weblog postings of yours are displayed, the default being reverse-chronological order (read: newest first).
If your plugin decides not to override the default after all, simply return an undefined value, like so:
...Subsequent plugins will then be given the chance to override the default sort subroutine.
359. blank line
360. foreach my $path_file ( &$sort(\%f, \%others) ) {
Here we call the sort routine, passing it references to the %f and %others hashes.
The return value from sort is a single list of hash keys in sorted order. The precise definition of 'sorted order' depends on the sort routine you're using. The baked-in sort routine returns posts in reverse chronological order.
Note that the default sort routine seems to ignore \%others entirely.
Each value from the sorted list is assigned to $path_file in turn, and we work through the body of the foreach loop, once for each list item.
Remember that our keys are complete pathnames from the root of the filesystem to, and including, the filename and extension.
Keys in the %f hash will be either
the full pathnames including filename and extension ($file_extension), or the name of a directory.
Requests for 'index'.extension have been reduced to directory requests by this point in the code (see line 346).
All of these rules are enforced by the code we've run before now.
At this point %f should contain only readable, and otherwise valid, entries and directories for us to generate.
361. last if $ne <= 0 && $date !~ /\d/;
One of the few expression modifiers we see in the source;
This is just a more compact way to write the conditional statement
if ($ne <= 0 && $date !~ /\d/) {
last;
}
The expression modifier syntax is generally preferred.
There are two subexpressions as part of the conditional expression.
Taking each in turn. First,
$ne <= 0
We set $ne to the value of user configured variable $num_entries at line 339.
So $ne starts at the max number of posts that should be generated for any single call to blosxom.
At this point we haven't seen the code for it, but it's very likely that we'll be decrementing the value of $ne toward zero each time we process an entry.
When $ne <= 0 then we've generated all of the posts we should.
In this case, last breaks us out of the foreach loop immediately.
If $ne is greater than 0, then the first part of the conditional is false and we know we will not be exiting the foreach loop here.
If it is true that $ne <= 0 then we look at the second part of the expression
$date !~ /\d/
We declare and initialize $date at line 297, at the top of the generate() routine.
Here we attempt a match on the value of $date, which evaluates to true if the match does not succeed (!~)
All we're looking for in $date is a single digit.
If we find at least a single digit anywhere in $date, then the expression will evaluate to false and we will not execute last.
So even if $ne is <= 0, if $date contains a digit we will never exit from the foreach loop here.
What does this mean?
It means that the $num_entries limit does not apply to blosxom's date-based archive scheme.
e.g.
example.net/cgi-bin/blosxom.cgi/2006/
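To see the effect of the two subexpressions together, here is a minimal sketch. The count_generated helper and the sample entries are hypothetical, and the per-entry decrement of $ne is the assumption made in the notes above, not something we have confirmed in the source yet.

```perl
use strict;
use warnings;

# Hypothetical helper: counts how many entries the loop would get through.
sub count_generated {
    my ($num_entries, $date, @entries) = @_;
    my $ne        = $num_entries;
    my $generated = 0;
    foreach my $entry (@entries) {
        last if $ne <= 0 && $date !~ /\d/;
        $generated++;
        $ne--;    # assumption: the counter drops by one per entry
    }
    return $generated;
}

my @entries = qw(e1 e2 e3 e4 e5);

my $undated = count_generated(2, '//',     @entries);  # no digit in $date
my $dated   = count_generated(2, '2006//', @entries);  # date-based request
print "undated: $undated, dated: $dated\n";
```

With a limit of 2, the undated request stops after two entries, while the date-based request runs through all five.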
362. use vars qw/ $path $fn /;
This statement declares the package globals $path and $fn.
We've seen my $path before, in the baked-in template routine, and as a loop control variable, but this is the first time we've seen $path as a package global.
my $fn shows up in line 259, but again this is the first time we're seeing it as a package global.
363. ($path,$fn) = $path_file =~ m!^$datadir/(?:(.*)/)?(.*)\.$file_extension!;
This statement initializes the package variables $path, and $fn, just declared, returning portions of the string in $path_file matched by the pattern
^$datadir/(?:(.*)/)?(.*)\.$file_extension
$path_file is our loop control variable.
Each time through the foreach loop, it's set to one of the keys returned from sort (line 360)
These are keys from %f (and possibly %others) in sorted order.
As we've already discussed, each key is a complete filename, including path info from the root of the filesystem, to a valid entry or a directory.
Let's look at that pattern in detail:
^$datadir/(?:(.*)/)?(.*)\.$file_extension
Note the ?: immediately after the first opening paren, indicating that this set of parentheses is used for grouping only and does not affect match variables.
Because of this the (.*) inside (?:(.*)/) is captured as $1, which is assigned to $path by this statement.
(.*)
matches any number of any character other than newline, '\n'.
Next the pattern matches a literal forward slash, '/'
Note that because the delimiter used here for the pattern match is '!' we do not need to escape the forwardslash, '/', as we've seen elsewhere.
(?:(.*)/)?
the question mark indicates that this portion of the pattern match is optional, i.e. matches a single occurrence of the pattern in the string or none,
followed by
(.*)
Because of the use of parentheses, the portion of the string matched is assigned to $2, which is assigned to $fn by this statement,
followed by a literal dot
\.
followed by the value
$file_extension
Because Perl's regex engine's quantifiers are greedy, (?:(.*)/) will match as much of the string at $path_file as possible.
For example, if $path_file is:
'/Library/WebServer/Documents/blosxom/Technology/
Computers/Apple/a_random_entry.txt'
^$datadir/(?:(.*)/) # will match
'/Library/WebServer/Documents/blosxom/Technology/
Computers/Apple/'
If the value of $datadir is
'/Library/WebServer/Documents/blosxom'
$path is assigned the value
'Technology/Computers/Apple'
(note that the trailing '/' is not included.)
Continuing with this same example:
(.*)\.$file_extension # matches
'a_random_entry.txt'
and everything before the literal dot is assigned to $fn, for this example:
'a_random_entry'
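The pattern can be tried in isolation with a short script. The $datadir and $file_extension values below are the example values used in these notes; the second path (an entry at the root of the data directory) is added to show what happens when the optional group matches nothing.

```perl
use strict;
use warnings;

my $datadir        = '/Library/WebServer/Documents/blosxom';
my $file_extension = 'txt';

my @results;
foreach my $path_file (
    "$datadir/Technology/Computers/Apple/a_random_entry.txt",
    "$datadir/a_random_entry.txt",
) {
    # Same pattern as line 363; in list context the match returns ($1, $2).
    my ($path, $fn) = $path_file =~ m!^$datadir/(?:(.*)/)?(.*)\.$file_extension!;
    push @results, [ $path, $fn ];
    printf "path=%s fn=%s\n", (defined($path) ? $path : '(undef)'), $fn;
}
```

For the entry at the root of the data directory the optional group does not take part in the match, so $path comes back undefined while $fn is still filled in.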
364. blank line
365. # Only stories in the right hierarchy
366. $path =~ /^$currentdir/ or $path_file eq "$datadir/$currentdir" or next;
We've just defined $path to be the portion of $path_file between the root of the data directory and the filename, excluding both the slash that follows $datadir and the slash that precedes the filename.
For example, if the value of $path_file is:
'/Library/WebServer/Documents/blosxom/
Technology/Computers/Apple/a_random_entry.txt'
and $datadir is
'/Library/WebServer/Documents/blosxom'
then $path will be
'Technology/Computers/Apple'
This statement contains three subexpressions. We'll look at each in turn.
Notice that each is connected to the others by the logical operator 'or'.
As we consider each of these expressions from left to right, the first expression that returns true will end evaluation of the entire statement.
So, the first subexpression,
$path =~ /^$currentdir/,
will always be evaluated, but the next two will not if the first expression is true.
It only takes one true subexpression to satisfy the whole expression when working with logical 'or'.
On the other hand, all subexpressions must be considered before it can be determined that the entire expression is false, so we will keep working through the statement, as long as we do not encounter a true expression.
Starting with
$path =~ /^$currentdir/
The expression is trying to find the value of $currentdir at the beginning of the string at $path.
$currentdir is the requested path starting at the root of blosxom's $datadir, including filename and extension.
We will find $currentdir at the beginning of $path, if $path is under the requested $currentdir.
$currentdir cannot match if it contains a filename because $path does not contain a filename.
For example:
If $currentdir is 'Technology/' and $path is 'Technology/Computers/Apple' then this expression evaluates as true because the string value of $currentdir occurs at the beginning of $path.
Because we are under the requested $currentdir, we know that we want to continue. i.e. we've determined that we are in the right place.
Otherwise this expression evaluates as false and execution continues to the next subexpression in the statement.
or $path_file eq "$datadir/$currentdir"
If $currentdir contains a filename,
for example if the request is for a particular entry, say
Technology/Computers/Apple/a_random_entry.txt
then the first subexpression will fail. We know that something that looks like a filename cannot be within $path, because $path cannot include a filename.
But we could still be in the right place. We will be in this situation whenever the request is for a specific entry.
To make sure that we do generate a post that matches in this case, we take the complete $path_file value, path and filename with extension, and compare it to "$datadir/$currentdir"
Using the same example, if $path_file is:
'/Library/WebServer/Documents/blosxom/Technology/
Computers/Apple/a_random_entry.txt'
this expression will evaluate to true exactly when
$datadir is: '/Library/WebServer/Documents/blosxom'
and
$currentdir is: 'Technology/Computers/Apple/a_random_entry.txt'
Otherwise this subexpression evaluates as false and we continue with the statement.
or next;
never produces a value at all; if we reach it, next transfers control immediately.
Think of this as the body of a conditional block, where the conditional expression is composed of the first two subexpressions of the statement.
We only evaluate this subexpression if both...
$path =~ /^$currentdir/
i.e. $path is not a subdirectory of the requested directory ($currentdir),
$path_file eq "$datadir/$currentdir"
i.e. we haven't found the specific entry that was requested explicitly.
...fail.
If both of these conditions fail, then we do not want to generate the entry corresponding to the value of $path_file and next cuts short the evaluation of the rest of the loop, taking us back to the top of the foreach loop.
We begin the loop again with the next list value returned from sort.
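The three cases walked through above (a category request, a request for one specific entry, and a request for somewhere else entirely) can be checked with a small sketch. The in_hierarchy helper and the 'Cooking' category are hypothetical; the helper simply wraps the first two subexpressions of line 366 and reports whether next would be skipped.

```perl
use strict;
use warnings;

my $datadir = '/Library/WebServer/Documents/blosxom';

# Hypothetical helper: returns 1 if the entry passes the test, 0 if the
# loop would move on with "next".
sub in_hierarchy {
    my ($path, $path_file, $currentdir) = @_;
    my $keep = ($path =~ /^$currentdir/ or $path_file eq "$datadir/$currentdir");
    return $keep ? 1 : 0;
}

my $path      = 'Technology/Computers/Apple';
my $path_file = "$datadir/$path/a_random_entry.txt";

# Request for a category above the entry: the first subexpression succeeds.
my $by_category = in_hierarchy($path, $path_file, 'Technology');

# Request for the entry itself: the first fails, the second succeeds.
my $by_entry = in_hierarchy($path, $path_file,
    'Technology/Computers/Apple/a_random_entry.txt');

# Request for an unrelated category: both fail, so "next" would run.
my $elsewhere = in_hierarchy($path, $path_file, 'Cooking');

print "$by_category $by_entry $elsewhere\n";
```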
367. blank line
368. # Prepend a slash for use in templates only if a path exists
369. $path &&= "/$path";
If we take another look at line 363 where we initialize $path, we see that $path may be empty
($path,$fn) =
$path_file =~ m!^$datadir/(?:(.*)/)?(.*)\.$file_extension!;
(?:(.*)/)?
The trailing '?' is a quantifier indicating that 0 or 1 occurrences of the preceding pattern is sufficient to complete the match.
When will there be 0 occurrences of the pattern?
For example if $path_file contains
'/Library/WebServer/Documents/blosxom/a_random_entry.txt'
and $datadir is
'/Library/WebServer/Documents/blosxom'
then there is no $path because the requested filename is at the root of the data directory.
In this case, $path will hold the value undef which will evaluate as false and we will be finished with the entire statement, never evaluating the second expression.
If on the other hand $path does have a value, for example if $path_file contains
'/Library/WebServer/Documents/blosxom/Technology/
Computers/Apple/a_random_entry.txt'
then $path will contain
'Technology/Computers/Apple'
$path will evaluate as true, and we will move on to the second of the two subexpressions.
&&= "/$path"
prepends '/' to the existing value in $path and stores the resulting value back to $path.
Given the example we're working with, after this line is complete, $path will contain the value
'/Technology/Computers/Apple'
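Both cases can be seen side by side in a short sketch. The $captured loop variable is just a stand-in for whatever line 363 produced; it is not a name from the source.

```perl
use strict;
use warnings;

my @results;
foreach my $captured ('Technology/Computers/Apple', undef) {
    my $path = $captured;

    # Only when $path is true is the right-hand side evaluated and assigned.
    $path &&= "/$path";

    push @results, $path;
    print defined($path) ? "'$path'\n" : "(undef)\n";
}
```

Because the right-hand side is never evaluated for the undefined value, no "uninitialized value" warning is raised in that case, and $path is simply left as it was.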
370. blank line
371. # Date fiddling for by-{year,month,day} archive views
372. use vars qw/ $dw $mo $mo_num $da $ti $yr $hr $min $hr12 $ampm /;
This statement declares several new package globals.
We'll see some of these same names used again as local variables in the nice_date() routine, starting at line 443, but this is the first time we're seeing them as we walk through the source.
373. ($dw,$mo,$mo_num,$da,$ti,$yr) = nice_date($files{"$path_file"});
Here we call the nice_date routine, passing it the argument $files{"$path_file"}
%files is a hash returned by the entries routine containing key/value pairs, where each key is the complete name of a file (including path from the root of the filesystem) which has not been eliminated as an entry.
This is the total collection of all files available to blosxom during the rest of its operation.
%files is populated by the entries routine starting at line 186.
All entries are readable files, not directories, that fit blosxom's basic entry requirements. For example, dot files are excluded, as are files with modification times in the future if $show_future_entries is set to 0 (the default).
We'll talk about nice_date() when we come to it.
For now it's enough to talk about the return values.
nice_date returns the following list:
($dw,$mo,$mo_num,$da,$ti,$yr);
which corresponds exactly to the list of variables we've set up here to hold the list of return values.
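The values of these variables will adhere to the following formats:
$dw, day of the week, e.g. 'Thu'
$mo, month, e.g. 'Nov'
$mo_num, two digit number corresponding to $mo, e.g. 11
$da, numerical day of the month, e.g. 24
$ti, time of day including hours and minutes, e.g. 18:22
$yr, numerical year, e.g. 2006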
374. ($hr,$min) = split /:/, $ti;
As just mentioned, $ti will be in the form 18:22.
Here we initialize $hr and $min by splitting $ti at the delimiter ':'
375. ($hr12, $ampm) = $hr >= 12 ? ($hr - 12,'pm') : ($hr, 'am');
More simple date manipulation here.
Note the use of Perl's ternary operator.
We're assigning to a list of variables, $hr12, and $ampm, as follows
First we evaluate
$hr >= 12
To determine if the value of $hr is greater than or equal to 12 (noon or later).
If it is true that $hr >= 12, we evaluate only the expression preceding the ':', otherwise (if it's false) we evaluate only the expression following ':'.
Let's consider those two cases:
In the first case (12p or later), we return the two values ($hr - 12,'pm').
$hr - 12
seems correct, for example 13 - 12 = 1(pm).
There is a problem in the case that $hr is 12. In this case we return 0, (i.e. 12 - 12). We will need to deal with this special case in some way.
Excusing this slight problem for a moment, we set $hr12 to be some value between 0 and 11 and $ampm is set to 'pm'.
In the second case (earlier than 12p), we again return two values, this time it's ($hr, 'am')
Again $hr12 is set to some value between 0 and 11.
This time $ampm is set to 'am'.
All of this seems reasonable, assuming of course we deal with the special case of noon, which results in $hr12 == 0.
376. $hr12 =~ s/^0//; $hr12 == 0 and $hr12 = 12;
Here we have two statements collapsed onto a single line.
In nearly every programming language this is considered bad style, and even in the case of informal Perl scripts, I prefer to avoid it.
We'll deal with each statement separately.
$hr12 =~ s/^0//;
This line strips the leading zero from single digit hour values, which is apparently preferred. e.g. 1 rather than 01,..., 9 rather than 09.
We attempt a substitution on the string at $hr12
s/^0//
We're trying to match
the start of the string,
followed immediately by zero, i.e. 00 01 02 03 04 05 06 07 08 09
If we are successful, we drop the portion of the string that matched,
leaving one of: 0 1 2 3 4 5 6 7 8 9.
The next statement is:
$hr12 == 0 and $hr12 = 12;
This short statement is overly complicated because of how it's written.
What is the statement saying?
Let's look at each of the subexpressions.
Note the use of the logical operator 'and' for flow control.
This is essentially an oddly written version of the following if statement:
if($hr12 == 0) {
$hr12 = 12;
}
If $hr12 is equal to zero, then we assign $hr12 the value of 12.
When will $hr12 have a value equal to 0?
If the file was modified between 00:00 - 00:59 (midnight), and
If the timestamp indicates that the file was modified between 12:00 pm and 12:59 pm, in which case we set the value to 0 in line 375.
In either case, midnight or noon, we manually correct for the 0 value by assigning the value of 12.
As a result we end up with 12:00 - 12:59 am or 12:00 - 12:59 pm, which is a more human-readable/expected value than 0:00 am or 0:00 pm.
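Important: This results in a warning (if warnings are enabled) when $hr is 12 (noon). In that case $hr - 12 gives 0, the substitution strips that single 0 and leaves the empty string, and the numeric comparison then complains about it. The empty string still compares equal to 0, so $hr12 is set to 12 as intended.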
The warning produced for perl v5.8.6 is:
Argument "" isn't numeric in numeric eq (==) at ./perl_test line 10.
It's trivially easy to rewrite the code to avoid the warning, but my purpose is simply documenting the existing code, and not rewriting it. Furthermore, this shouldn't pose a significant problem if left unchanged.
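For the curious, here is one possible warning-free rewrite of lines 375 and 376. This is only a sketch and is not part of blosxom; the to_12_hour name is made up, and the modulus approach is a substitute for the subtraction and substitution used in the source.

```perl
use strict;
use warnings;

# Hypothetical replacement for lines 375-376: takes a two-digit hour
# string such as '09' or '18' and returns (12-hour value, 'am' or 'pm').
sub to_12_hour {
    my ($hr) = @_;
    my $ampm = $hr >= 12 ? 'pm' : 'am';
    my $hr12 = $hr % 12;          # 0..11, and numeric, so no leading zero
    $hr12 = 12 if $hr12 == 0;     # midnight and noon both display as 12
    return ($hr12, $ampm);
}

foreach my $hr (qw(00 09 12 18)) {
    my ($hr12, $ampm) = to_12_hour($hr);
    print "$hr -> $hr12 $ampm\n";
}
```

Since $hr12 never passes through the empty string, the numeric comparison has nothing to complain about.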
377. blank line
378. # Only stories from the right date
379. my($path_info_yr,$path_info_mo_num, $path_info_da) = split /\//, $date;
Here we're declaring the new local variables
$path_info_yr, $path_info_mo_num, $path_info_da
And initializing them by splitting the value at $date on '/'.
The value of $date is passed to generate() from the caller, see the beginning of generate()'s definition at line 297 for more info.
More info about the value of $date:
From the call at line 289 (dynamic mode), we pass the string
"$path_info_yr/$path_info_mo_num/$path_info_da".
Remember that this string may look like...
'2006/11/02'
....if all of the variables included in the string are not empty; or it may be....
'2006/11/' # , or even
'//'
If we are in static mode, line 268, this value will either be a string in the form
'2006/11/02/'
or it will be empty, '', if the value of $indexes{$path} looks like a path, e.g.
'Technology/Computer/Apple'
rather than a date.
Keep in mind that these are new variables, though there are package variables with the same name.
Again, remember that any or all of these variables may be empty.
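A quick sketch shows what split hands back for the three shapes of $date mentioned above. The %parts hash is only there to hold the results for inspection. One detail worth knowing: split drops trailing empty fields, so the missing pieces arrive as undef rather than as the empty string. Both are false, so the tests on the following lines behave the same either way.

```perl
use strict;
use warnings;

my %parts;
foreach my $date ('2006/11/02', '2006/11/', '//') {
    my ($path_info_yr, $path_info_mo_num, $path_info_da) = split /\//, $date;
    $parts{$date} = [ $path_info_yr, $path_info_mo_num, $path_info_da ];

    # Show each piece, marking the ones that were never assigned.
    print join(', ', map { defined($_) ? "'$_'" : 'undef' } @{ $parts{$date} }), "\n";
}
```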
380. next if $path_info_yr && $yr != $path_info_yr; last if $path_info_yr && $yr < $path_info_yr;
Again we have two statements written on a single line. We'll consider each statement separately.
next if $path_info_yr && $yr != $path_info_yr;
This line is fairly straight-forward.
It's an expression modifier and reads like a simple if statement written backwards.
With expression modifiers, what would be the body of the block is written before the condition, so we know what we'll do before we know what the condition is.
In this statement, what we're doing is:
next
which abandons processing the current entry, and repeats the foreach loop starting with the next value of $path_file, line 360.
Under what condition will we take this action?
$path_info_yr && $yr != $path_info_yr
Note the use of '&&', another partial evaluation logical operator, similar to the operator 'and' but with a different level of precedence. The difference in precedence between the two is not significant here.
If $path_info_yr was specified, then the first subexpression is true, and we evaluate the second. This will be the case unless $path_info_yr is unspecified, i.e. ''.
If $path_info_yr is not specified then posts are not prevented from generating on the basis of modification time. In other words, all posts are generated regardless of modification time. In this case it does not make sense to consider the second expression, which is a comparison based on the value at $path_info_yr (useless if there is no date in $path_info_yr of course).
Assuming the first part of the expression evaluates as true, we consider the second
&& $yr != $path_info_yr
If there was a year specified as part of the request, then the year from the modification time on the file ($yr) must match the request ($path_info_yr), for the script to continue with the current iteration of the loop, because if they do not match we execute
next
and skip generation of the current file being considered.
Otherwise we continue with the current iteration of the loop.
last if $path_info_yr && $yr < $path_info_yr;
This statement feels a lot like the previous one, but there are important differences.
Again we're dealing with an expression modifier. So I can say that the statement reads like a simple if statement written backwards.
With expression modifiers, what would be the body of the block comes first. So we know what we'll do before we know what the condition is.
In this case what we'll do is
last
Whereas next jumps to the next iteration of the loop, so that we do not continue with the current file but we do continue with all of the others, last abandons the foreach loop altogether.
If the condition that follows is met, we're finished processing files from %f and %others (i.e. we're through with the foreach loop beginning at 360).
Why does this make sense? How can we tell from a single file that we do not need to consider any others?
Remember that the values of $path_file we're iterating over are in sorted order by modification time.
Based on the current value, we know that every post after this one has a modification time that is earlier than the file we're looking at now.
Again the condition breaks down into two parts connected by the logical operator '&&' (similar to the operator 'and' except for precedence, see notes at the preceding statement of the current line for more info).
$path_info_yr
If $path_info_yr was specified, the first subexpression is true.
Assuming the first part of the expression evaluates as true, we consider the second,
&& $yr < $path_info_yr;
which returns true if $yr, taken from the modification date on the file, is less than (i.e. earlier than) the requested year, in $path_info_yr.
If we are looking for all posts from 2006, for example
and we have a file with a $yr value of '2005', then we know we've already seen all of the files from 2006 that we are going to see.
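In principle, then, we could stop processing entirely as soon as
$yr (the modification time on the current file) < (is earlier than) $path_info_yr (the requested date).
In practice, any $yr that is less than $path_info_yr is also not equal to it, so the next statement earlier on the same line fires first and this last is never actually reached; older entries are skipped one at a time rather than all at once.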
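The interplay of the two statements on line 380 is easy to try out. In this sketch the list of years stands in for the sorted entries (newest first), and the $examined and $last_reached counters are hypothetical additions used only to observe what the loop does.

```perl
use strict;
use warnings;

my @years        = (2007, 2006, 2006, 2005, 2004);  # newest first
my $path_info_yr = 2006;

my @kept;
my $examined     = 0;
my $last_reached = 0;
foreach my $yr (@years) {
    $examined++;

    # First statement of line 380.
    next if $path_info_yr && $yr != $path_info_yr;

    # Second statement of line 380, expanded so we can record a hit.
    if ($path_info_yr && $yr < $path_info_yr) {
        $last_reached = 1;
        last;
    }

    push @kept, $yr;
}
print "kept: @kept; examined: $examined; last reached: $last_reached\n";
```

Only the two 2006 entries are kept, as expected. Note that all five entries are examined and the last branch never runs: a year that is less than the requested year is also not equal to it, so next always gets there first.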
381. next if $path_info_mo_num && $mo ne $num2month[$path_info_mo_num];
Having considered the requested year...
We must have passed those date tests, otherwise we would have either dropped out of the foreach loop entirely (last) or started the next iteration (next), and either way we would not be executing this line
...we look at the requested month.
Again following the established pattern, we have an expression modifier that first tests to see if a specific month was requested.
$path_info_mo_num
will evaluate as true if it is not empty, ''.
Although there are other values which will also return false, given the context $path_info_mo_num will either contain some true value or the empty string, ''.
Next, we consider the second portion of the expression (note the use of '&&').
&& $mo ne $num2month[$path_info_mo_num]
This is exactly what we did on the previous lines.
We're comparing the value of $mo against the requested
$path_info_mo_num.
If they are not equal, then the current post falls outside of the requested range and we'll want to move on to the next iteration, i.e. abandon processing and output of the current entry.
There is a slight complication in this case because there is more than one acceptable way to specify a month as part of a date-based path, e.g. a two-digit number ('01') and a three-letter name ('Jan')
are both acceptable.
There are a number of ways that blosxom deals with this, see notes at line 135 for more info.
We don't need to deal with that issue here, we only need to understand that we cannot simply compare $mo and $path_info_mo_num as we did $path_info_yr.
The reason is simple,
$path_info_mo_num is a two-digit string value, e.g. '01', and $mo is in the form of a three character string, e.g. 'Jan'.
If you remember, or look back to line 84, @num2month is an array we set up just so we could do this type of comparison.
@num2month is an array, and as such it is indexed by small integers starting with 0.
The values of @num2month are all three character strings indicating the months of the year at corresponding index positions, e.g.
$num2month[1] eq 'Jan', $num2month[2] eq 'Feb',..., $num2month[12] eq 'Dec'.
By the way $num2month[0] eq 'nil', an insignificant placeholder.
So while we cannot compare $mo (e.g. 'Jan') and $path_info_mo_num ('01') directly, we can compare $mo ('Jan') and $num2month[$path_info_mo_num] ('Jan').
Other than this slight complication, the logic is as described above; we abandon processing the current file and start the next iteration of the foreach loop if $mo, from the timestamp of the current file does not equal
$num2month[$path_info_mo_num] # the requested month.
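Here is the month comparison as a standalone sketch. The @num2month array is written out literally to match the values listed above rather than built the way the script builds it at line 84, and the two sample entries are invented.

```perl
use strict;
use warnings;

# Index 0 is the 'nil' placeholder, so index 1 is 'Jan' and 12 is 'Dec'.
my @num2month = qw(nil Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Hypothetical entries, each with the month name nice_date() would give.
my @entries = (
    { fn => 'thanksgiving', mo => 'Nov' },
    { fn => 'halloween',    mo => 'Oct' },
);
my $path_info_mo_num = '11';

my @kept;
foreach my $entry (@entries) {
    my $mo = $entry->{mo};

    # The two-digit string is used as an array index, so '11' finds 'Nov'.
    next if $path_info_mo_num && $mo ne $num2month[$path_info_mo_num];

    push @kept, $entry->{fn};
}
print "@kept\n";
```

A string such as '01' works as an index as well, because Perl converts it to the number 1 before looking up the element.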
382. next if $path_info_da && $da != $path_info_da; last if $path_info_da && $da < $path_info_da;
Again we have two statements written on a single line. We'll consider each statement separately.
next if $path_info_da && $da != $path_info_da;
This line is nearly identical to the first statement at line 380, except that where that statement deals with $path_info_yr and $yr, here we're looking at $path_info_da and $da.
It's an expression modifier and reads like a simple if statement written backwards.
With expression modifiers, what would be the body of the block is written before the condition, so we know what we'll do before we know what the condition is.
In this statement, what we're doing is:
next
which abandons processing the current entry, and repeats the foreach loop starting with the next value of $path_file, line 360.
Under what condition will we take this action?
$path_info_da && $da != $path_info_da
Note the use of '&&', another partial evaluation logical operator, similar to the operator 'and' but with a different level of precedence. (The difference in precedence between 'and' and '&&' is not significant here.)
If $path_info_da was specified, then the first subexpression is true, and we evaluate the second.
Assuming the first part of the expression evaluates as true, we consider the second
&& $da != $path_info_da
If there was a day specified as part of the request, $path_info_da, then the day of the month (something like '24') from the modification time on the file ($da) must match the request ($path_info_da) for the script to continue with the current iteration of the loop.
If they do not match we execute
next
Otherwise we continue with the current iteration of the loop.
last if $path_info_da && $da < $path_info_da;
This statement feels a lot like the previous one, but there are some important differences.
Again we're dealing with an expression modifier. So I can say that the statement reads like a simple if statement written backwards.
With expression modifiers, what would be the body of the block comes first. So we know what we'll do before we know what the condition is.
In this case what we'll do is
last
Whereas next jumps to the next iteration of the loop, so that we do not continue with the current file but we do continue with all of the others, last abandons the foreach loop altogether.
If the condition that follows is met, we're finished processing files from %f and %others (i.e. we're through with the foreach loop beginning at 360).
Why does this make sense? How can we tell from a single file that we do not need to consider any others?
Remember that the values of $path_file we're iterating over are in sorted order by modification time.
Based on the current value we know that every post after this one has a modification time that is earlier than the file we're looking at now.
Again the condition breaks down into two parts connected by the logical operator '&&'.
$path_info_da
If $path_info_da was specified. This first subexpression will evaluate as true if it is not empty, ''.
Although there are other values which will also return false, given the context $path_info_da will either contain some true value or the empty string, ''.
Assuming the first part of the expression evaluates as true, we consider the second.
&& $da < $path_info_da;
This part of the expression returns true if $da, taken from the modification date on the file, is less than, i.e. earlier than, the requested day ($path_info_da).
If we are looking for all posts from the 11th, for example
http://example.net/cgi-bin/blosxom.cgi/2006/01/11
and we have a file with a $da value of '10',
then we know we've already seen all of the files from the 11th that we are going to see.
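In principle we can stop processing at the first $da value less than '11'. As with the year test at line 380, though, any $da that is less than $path_info_da is also not equal to it, so the next statement earlier on the line fires first and this last is never actually reached.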
It's important to note that to get to this point in execution all of the other conditions must have succeeded.
Importantly, we've already considered year and month before we look at the day values here, so we can limit ourselves to code that only deals with $da and $path_info_da.
383. blank line
384. # Date
385. my $date = (&$template($path,'date',$flavour));
We're calling the template routine, passing it the required arguments
We should expect to find a date.$flavour file somewhere starting at $path and working up toward the root of the data directory, or as a baked-in default, or we will return an error, assuming at least that 'date' is a valid component, which it is of course.
Just to make sure we have a proper value for $path, let's look at an example.
We've seen that if the current $path_file value (from %f) is:
'/Library/WebServer/Documents/blosxom/Technology/
Computers/Apple/a_random_entry.txt'
Then we break up $path_file according the following pattern match:
^$datadir/(?:(.*)/)
will match
/Library/WebServer/Documents/blosxom/Technology/Computers/Apple/
If the value of $datadir is
/Library/WebServer/Documents/blosxom
then
'Technology/Computers/Apple'
is assigned to $path (note that the trailing '/' is not included).
At line 369 we prepend a forward slash '/' to the value of $path.
Continuing with the current example, we end up with a value at $path similar to:
'/Technology/Computers/Apple'
The return value, which we store in $date, is the complete contents of a date file for the specified flavour.
The template routine first looks for this file at the requested $path, before walking backwards up the directory hierarchy towards the root.
If it never finds the correct template file, the script attempts to use a baked-in template before returning an error message in the case that the requested flavour cannot be found.
The important point of this line is that we should now have the complete contents of the 'date.' file for the requested flavour in the variable $date.
386. blank line
387. # Plugins: Date
388. foreach my $plugin ( @plugins ) { $plugins{$plugin} > 0 and $plugin->can('date') and $entries = $plugin->date($currentdir, \$date, $files{$path_file}, $dw,$mo,$mo_num,$da,$ti,$yr) }
Another foreach loop collapsed to a single line. Let's look at it piece by piece.
foreach my $plugin ( @plugins ) {
This is the start of the foreach loop which runs through all of the plugins listed in @plugins. These are active plugins in the sense that $plugin->start() returned true, but this list does include plugins disabled by the user (e.g. interpolate_fancy_).
Each is set to the local variable $plugin in turn.
$plugins{$plugin} > 0 and $plugin->can('date') and
$entries = $plugin->date($currentdir, \$date,
$files{$path_file}, $dw,$mo,$mo_num,$da,$ti,$yr)
Yet another long 'and'ed partial eval statement.
Let's take it piece by piece, from left to right.
$plugins{$plugin} > 0
Remember that %plugins contains $on_off values for active plugins where 1 means that the plugin should be considered on and -1 means that the user has disabled the plugin.
We only continue considering the current plugin if it has not been disabled by the user.
If the value at $plugins{$plugin} is <= 0 we skip the rest of the statement, with the result that we do nothing with the plugin, and nothing is exactly what we want to do with disabled plugins.
If $plugin has not been disabled by the user $plugins{$plugin} > 0 will be true and we continue with the statement.
and $plugin->can('date')
Here we're testing to see if the current plugin has a date routine.
If the return from the can method is true, then $plugin claims to have a date() method, which means we should be able to safely refer to $plugin->date().
from perldoc
$obj->can(METHOD)
can checks if the object or class has a method called METHOD. If it does, then a reference to the sub is returned. If it does not then undef is returned.
$entries = $plugin->date($currentdir, \$date,
$files{$path_file}, $dw,$mo,$mo_num,$da,$ti,$yr)
Here we call the plugin's date routine, passing it:
$currentdir, the current working directory
\$date, a reference to the $date variable, which contains the "date.$flavour" source. See note at line 385 for more info.
$files{$path_file}, remember that $path_file is the current file we're working on in generate(). The corresponding value from the %files hash is the file's modification time in unix timestamp format.
$dw, day of the week, e.g. 'Thu', returned from the call to the nice_date() routine at line 373.
$mo, month, e.g. 'Nov', returned from the call to the nice_date() routine at line 373.
$mo_num, two digit number corresponding to $mo, e.g. 11, returned from the call to the nice_date() routine at line 373.
$da, numerical day of the month, e.g. 24, returned from the call to the nice_date() routine at line 373.
$ti, time of day including hours and minutes, e.g. 18:22, returned from the call to the nice_date() routine at line 373
$yr, numerical year, e.g. 2006, returned from the call to the nice_date routine at line 373.
From the documentation:
Blosxom calls date for each new day, after reading in the appropriate "date.$flavour" and before swapping in values for template components.
The subroutine is passed the current directory, a reference to the raw date.flavour source, Unix-style modification time of the latest entry for the day, and some useful date bits derived from the modification time.
sub date { my ($pkg, $currentdir, $date_ref, $mtime, @date_bits) = @_; my($dw,$mo,$mo_num, $da,$ti,$yr) = @date_bits; 1;}
date has the opportunity to alter the raw date template or build an entirely new date.
You should recognize these lines as similar to others we've seen.
This is the standard way blosxom cycles through its active plugins looking for any that contain a specific routine, in this case date.
Note that we do not drop out of the foreach loop on encountering the first plugin with a date() routine.
Multiple plugins with date routines can coexist.
Of course, these are called one at a time, and each plugin with a date routine is affected by previous plugins and affects any that follow.
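The hook loop described above can be sketched as a standalone script. This is only an illustration under stated assumptions: the plugin packages upper_date, bracket_date and no_date_hook are hypothetical, and the collapsed 'and' chain is spelled out with next statements for readability.

```perl
use strict;
use warnings;

# Two hypothetical plugins, written as ordinary packages.
package upper_date;
sub date {
    my ($pkg, $currentdir, $date_ref, $mtime, @date_bits) = @_;
    $$date_ref = uc $$date_ref;          # alter the raw template in place
    return 1;
}

package bracket_date;
sub date {
    my ($pkg, $currentdir, $date_ref) = @_;
    $$date_ref = "[$$date_ref]";
    return 1;
}

package no_date_hook;                    # a plugin with no date routine at all
sub start { 1 }

package main;

my @plugins = qw(upper_date bracket_date no_date_hook);
# 1 means enabled, -1 means disabled by the user
my %plugins = (upper_date => 1, bracket_date => -1, no_date_hook => 1);

my $date = 'sunday';                     # stand-in for the raw date template
foreach my $plugin (@plugins) {
    next unless $plugins{$plugin} > 0;   # skip plugins the user disabled
    next unless $plugin->can('date');    # skip plugins without the hook
    $plugin->date('/blog', \$date, 0);   # every remaining plugin gets a turn
}
print "$date\n";
```

Only upper_date runs here: bracket_date is disabled and no_date_hook has no date routine, so the script prints SUNDAY.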
389. blank line
390. $date = &$interpolate($date);
Now that we're done running the raw "date.$flavour" source through the plugin date routines, we pass $date to the interpolate routine, defined at line 310.
After substitutions are made via interpolate, we return the string back into $date.
391. blank line
392. $curdate ne $date and $curdate = $date and $output .= $date;
Remember that we declared and initialized $curdate at line 338
my $curdate = '';
and we just defined $date as the return value from the interpolate routine on the last line.
This is yet another conditional disguised as a logical statement.
It easily could, and probably should, be rewritten as an if block or an expression modifier. For now we'll deal with the code as written.
Because all of the logical operators are 'and', we know that we'll move from subexpression to subexpression only if every expression we encounter evaluates as true.
We'll stop at the first false value we find.
Starting at the leftmost expression:
$curdate ne $date
Here we check that the value of $curdate does not equal the value of $date.
The first time we get to this line the value of $curdate will be the empty string, '' and we know that $date will not equal $curdate.
Keep in mind that we are inside the foreach loop, started at line 360, that has us running through all of the entries in the %f and %others hashes one at a time in sorted order.
We will see, later in this statement, that the value of $curdate will not always be empty.
Assuming this first expression evaluates as true, we continue to the next.
and $curdate = $date
As promised, here we change the value of $curdate.
In the last expression, we compared $curdate and $date; if they were equal, we stopped evaluating the statement.
If they are not equal, we immediately set the value of $curdate to the value of $date. The assignment evaluates as true as long as $date is neither the empty string nor '0', which holds for any realistic date heading.
Note that we've changed the value of $curdate. At this point $curdate and $date have the same value. This will be the value of $curdate the next time through the loop.
Now let's consider the final expression:
and $output .= $date
Here we append the value of $date to the output string.
So what is this statement doing?
Think about the output that Blosxom produces.
Blosxom generates a reverse chronological listing of posts.
For each date, not each post, the date is output as a heading.
What this date looks like, its contents and formatting, are determined by the date component of the current flavour.
This line takes care of outputting that date.
If the $curdate value is the same as the date of the current post under consideration, $date, then we do not want to output the date string again.
For example:
If we made two posts on Sunday Feb 04, 2007,
we would want a single heading above the first post, and then we would want both of the posts to print, without a second date header between them.
If $curdate eq $date, we know that the two posts occurred on the same day and we do not output a header.
If they are not the same, then we save the value of $date as the new $curdate and output the date header.
The result is that we will only output a header between posts occurring on different days.
In this way we'll output a header for each new day, not each new post.
Remember that the comparison is on the string returned from the template (line 385) and interpolate (line 390) routines.
If a variable in the date component changes with every post, then the date header will print between every post. This is probably not what you intended.
For example,
if you include a variable that interpolate would replace with the modification time of the current post, then for every pair of posts it would be true that $curdate ne $date, unless the modification time on two posts were in fact identical, which would most likely never be the case naturally.
You'll probably want to limit the variables you use in the date component of your templates to year, month, and day unless you have something specific in mind and you know what you're doing.
Of course any literal text you enter in your date component will always agree on every comparison.
Finally, if it is determined that we have found an appropriate place for a date heading, between two posts that occur on different dates, then
and $output .= $date;
We append the value of $date to the output.
Notice that we're building our output as the string value of $output a piece at a time, starting at the head, progressing through posts, and adding in date headings where they fit.
Finally (eventually) we'll tack on the footer to complete the output.
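The effect of line 392 can be sketched with an explicit if block. The post data and the "== ... ==" heading format below are made up for illustration; in blosxom the heading text comes from the date component of the flavour.

```perl
use strict;
use warnings;

# Hypothetical posts, already sorted newest first: [date heading, story text].
my @posts = (
    [ 'Sun, 04 Feb 2007', 'Second post' ],
    [ 'Sun, 04 Feb 2007', 'First post'  ],
    [ 'Sat, 03 Feb 2007', 'Older post'  ],
);

my $curdate = '';
my $output  = '';
foreach my $post (@posts) {
    my ($date, $story) = @$post;
    if ($curdate ne $date) {          # a new day has started
        $curdate = $date;             # remember it for the next pass
        $output .= "== $date ==\n";   # output the heading once
    }
    $output .= "$story\n";
}
print $output;
```

The two posts from Sunday share a single heading, and a second heading appears only when the date string changes.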
393. blank line
394. use vars qw/ $title $body $raw /;
This statement declares additional package globals.
Note that these names are unique in the source.
395. if (-f "$path_file" && $fh->open("< $path_file")) {
Remember that we instantiate $fh as a filehandle at line 81.
There are really two conditions here.
First, we have
-f "$path_file",
which checks that $path_file is in fact a file in the filesystem.
$path_file is the loop control variable that is set to one of the keys of the %f and %others hashes (in sorted order) each time through the main foreach loop.
Note that we have already confirmed that $path_file should be the name of a valid entry, i.e. a file blosxom should consider a post.
$path_file is one of the keys from the %f hash.
And the %f hash is composed of key/value pairs where each key is the complete filename of a valid entry, including path info from the filesystem root.
This first part of the conditional can be thought of as a sanity check but is most likely unnecessary, at least for keys in %f.
Assuming the first expression is true we evaluate the second subexpression.
$fh->open("< $path_file")
Attempts to open the file named in $path_file for input via the handle $fh.
This expression will return a true value if the open is successful, and it will return a false value if the attempt fails for some reason, for example because the script does not have sufficient permissions to open the file.
Assuming we are able to open the file we continue with the if block.
396. chomp($title = <$fh>);
Reads in the first line of the file we just opened (one of our weblog posts), to the first newline encountered.
Then we strip off the trailing newline (this is the purpose of chomp), and save the resulting string in $title.
This certainly makes sense given what we know about blosxom entries. Namely, that the first line is the title of each post.
397. chomp($body = join '', <$fh>);
This statement reads all of the remaining lines, except for the first (see the preceding line), from the file we opened (one of our posts);
joins them together, separated by nothing, join '', so that they run together as a continuous string;
and assigns the resulting string to $body.
chomp here only removes the final trailing newline on the last line of the file.
A better way to say the same thing is that chomp removes a single trailing newline character, '\n', from the end of the string.
Again, this statement makes sense given what we know about blosxom posts. Specifically, that anything other than the first line of the post is considered to be part of the body.
398. $fh->close;
This statement simply closes the file handle opened in line 395.
399. $raw = "$title\n$body";
This statement combines the values of $title and $body together, separated by a '\n' character.
Note that we removed the newline from the end of the first line, i.e. the end of the title, with chomp in line 396.
We assign the resulting string to the variable $raw, so that there is a single newline separating the $title and $body in $raw.
Note that there is no else clause. Consequently, if the conditional expression fails, because of a permissions issue for example, we do not assign new values to $title, $body, and $raw. Because these are package globals rather than lexicals scoped to the loop body, they keep whatever they held before: the values from the previous entry, or undef if this is the first entry.
400. }
End of if block started at line 395.
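Lines 396 through 399 can be tried in isolation. The sketch below assumes an in-memory filehandle in place of a real post on disk, and the sample text is invented; the three assignments mirror the ones in the source.

```perl
use strict;
use warnings;

# Stand-in for a post on disk; an in-memory filehandle keeps the sketch self-contained.
my $post = "My first post\nLine one of the body.\nLine two of the body.\n";
open(my $fh, '<', \$post) or die "cannot open in-memory file: $!";

my ($title, $body, $raw);
chomp($title = <$fh>);            # first line only, trailing newline removed
chomp($body  = join '', <$fh>);   # everything else, one trailing newline removed
close $fh;
$raw = "$title\n$body";           # title and body rejoined by a single newline

print "title: [$title]\n";
print "body:  [$body]\n";
```

The newline between the two body lines survives, because chomp only removes the single newline at the very end of the joined string.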
401. my $story = (&$template($path,'story',$flavour));
This statement is very similar to the statement at line 385.
We're calling the template routine, passing it the required arguments
We should expect to find a "story.$flavour" file somewhere starting at $path and working up toward the root of the data directory, or as a baked-in default, or we will return an error, assuming at least that 'story' is a valid component, which it is of course.
Just to make sure we have a proper value for $path, let's look at an example.
We've seen that if the current $path_file value (from %f) is
'/Library/WebServer/Documents/blosxom/Technology/
Computers/Apple/a_random_entry.txt'
Then we break up $path_file according to the following pattern match
^$datadir/(?:(.*)/)
If the value of $datadir is
/Library/WebServer/Documents/blosxom
the pattern will match
/Library/WebServer/Documents/blosxom/Technology/
Computers/Apple/
and the captured portion, which becomes $path, is
'Technology/Computers/Apple'
Note that the trailing '/' is not included.
At line 369 we prepend a forward slash '/' to the value of $path.
Continuing with the current example, we end up with a value at $path similar to:
'/Technology/Computers/Apple'
The return value, which we store in $story, is the complete contents of a story file for the specified flavour.
The template routine first looks for this file at the requested $path, before walking backwards up the directory hierarchy towards the root.
If it never finds the correct template file, the script attempts to use a baked-in template before returning an error message in the case that the requested flavour cannot be found.
The important point of this line is that we should now have the complete contents of the 'story.' file for the requested flavour in the variable $story.
402. blank line
403. # Plugins: Story
404. foreach my $plugin ( @plugins ) { $plugins{$plugin} > 0 and $plugin->can('story') and $entries = $plugin->story($path, $fn, \$story, \$title, \$body) }
Another foreach loop collapsed to a single line. Let's look at it piece by piece.
foreach my $plugin ( @plugins ) {
This is the start of the foreach loop which runs through all of the plugins listed in @plugins. These are active plugins in the sense that $plugin->start() returned true, but this list does include plugins disabled by the user (e.g interpolate_fancy_).
Each is set to the local variable $plugin in turn.
Yet another long 'and'ed partial eval statement.
Let's take it piece by piece, from left to right.
$plugins{$plugin} > 0
Remember that %plugins contains $on_off values for active plugins where 1 means that the plugin is 'on' and -1 indicates that the user has 'disabled' it.
We only continue considering the current plugin if it has not been disabled by the user.
and $plugin->can('story')
Here we're testing to see if the current plugin has a story subroutine
(referred to as a method in documentation you'll find on can() because this is an OO interface).
If the return from the can method is true, then $plugin claims to have a story() method, which means we should be able to safely refer to $plugin->story().
and $entries = $plugin->story($path, $fn, \$story, \$title, \$body)
Here we call the plugin's story routine, passing it ($path, $fn, \$story, \$title, \$body).
With these values the plugin's story routine has all of the info it needs to perform the same function as the baked-in story routine.
By way of review, the following is a list of these variables and the meaning of their values:
$path, see the notes at line 363 for a discussion of the value at $path.
It may be enough to remind you that the value at $path will be similar to
'/Technology/Computers/Apple'.
So, $path is the filesystem path starting at the root of $datadir, not including the filename.
Also the value will have a leading forward slash, '/', character but not a trailing delimiter.
Finally, keep in mind that the value of $path could be '', if the requested post is at the root of the data directory.
$fn, the value of $fn is the filename of the current request (without extension), see the notes at line 363 for more info.
For example, if the current value of $path_file is:
'/Library/WebServer/Documents/blosxom/Technology/
Computers/Apple/a_random_entry.txt',
then the value of $fn will be 'a_random_entry'.
\$story, A reference to the $story variable, which contains the complete contents of the 'story.' template component specific to the requested $flavour. See line 401 for more info.
\$title, A reference to the $title variable, which contains the title of the current entry. See line 396 for more info.
\$body, A reference to the $body variable, which contains the body of the current post. See line 397 for more info.
From the documentation
Blosxom calls story for each and every story, after reading in the appropriate story.flavour and before swapping in values for template components.
The subroutine is passed
the story's path,
filename,
reference to the raw story.flavour source,
reference to the story title,
and reference to the body of the post.
sub story { my ($pkg, $path, $filename, $story_ref, $title_ref, $body_ref) = @_; 1; }
You can alter the raw story.flavour source, alter the title or body of the post before it's pasted into the template, and so forth.
Note that we do not drop out of the foreach loop when we encounter the first plugin with a story() routine.
Multiple plugins with story routines can coexist. Of course, these are called one at a time, and each plugin with a story routine is affected by previous plugins and affects any that follow.
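Because the story hook receives references, a plugin can change the caller's variables directly. The following is a minimal sketch, assuming a hypothetical plugin package named shout_title; neither the name nor its behaviour comes from blosxom itself.

```perl
use strict;
use warnings;

# Hypothetical plugin; the name and behaviour are made up for this sketch.
package shout_title;
sub story {
    my ($pkg, $path, $filename, $story_ref, $title_ref, $body_ref) = @_;
    $$title_ref = uc $$title_ref;            # change the caller's $title
    $$body_ref .= "\n(filed under $path)";   # append to the caller's $body
    return 1;
}

package main;

my ($path, $fn) = ('/Technology/Computers/Apple', 'a_random_entry');
my $story = '$title: $body';   # single quotes keep the template placeholders literal
my ($title, $body) = ('hello', 'Some text.');

shout_title->story($path, $fn, \$story, \$title, \$body);
print "$title\n$body\n";
```

After the call, $title and $body in the main package hold the altered values, even though the plugin never returned them.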
405. blank line
406. if ($content_type =~ m{\bxml\b}) {
Here we have the beginning of an if block where the value of the conditional expression is based on the success or failure of a pattern match.
Let's look at that pattern match:
$content_type =~ m{\bxml\b}
This is the first time we're seeing the variable $content_type since its declaration as a parameter passed to the generate routine.
$content_type will contain the complete contents of the 'content_type.' component for the current flavour.
Remember that the contents of this file are very simple and must be exact. See notes at line 258 for more info.
We are attempting to match the specified pattern against the string in $content_type.
The pattern: \bxml\b
matches the literal string 'xml'
\b is the sequence Perl's regex engine uses to indicate a word boundary. By bracketing the literal string in these word boundary markers, we're insisting that we match only the literal string xml as a whole 'word'.
If we look at the baked-in rss flavour's content_type component...
text/xml
...we see that the pattern does match the string.
But xml does not stand on its own surrounded by whitespace. In what sense is xml a 'word' within the string
text/xml?
The string matches the pattern because xml is bracketed by word boundaries (in the \w sense of a word). The forward slash before xml is not a \w character, and xml is followed by the end of the string, which also counts as a boundary after a \w character. \w matches only letters, digits and underscores '_'.
None of '_xml', 'xml_', or '_xml_' would match the pattern but '/xml', 'xml/', and '/xml/' all would match.
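The word-boundary behaviour is easy to check directly. The candidate strings in this sketch are chosen for illustration; only 'text/xml' is taken from the baked-in rss flavour.

```perl
use strict;
use warnings;

my @candidates = ('text/xml', 'application/rss+xml', 'text/html',
                  'text/xml_', 'text/_xml', 'xml/text');

my %verdict_for;
foreach my $type (@candidates) {
    # same pattern as line 406
    $verdict_for{$type} = $type =~ m{\bxml\b} ? 'match' : 'no match';
    printf "%-22s %s\n", $type, $verdict_for{$type};
}
```

The strings with an underscore next to xml fail because '_' is itself a \w character, so no boundary exists at that position.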
Assuming the match does succeed, we move inside the block.
What is the purpose of the match? Let's keep reading...
407. # Escape <, >, and &, and to produce valid RSS
408. my %escape = ('<'=>'&lt;', '>'=>'&gt;', '&'=>'&amp;', '"'=>'&quot;');
Here we declare and initialize a new hash,
%escape
containing 4 key/value pairs, where each key is a 'special' character according to the RSS feed standards and the value is the HTML entity corresponding to that character.
For example,
because '<' is a part of the syntax of RSS it has a structural meaning and should not be used as part of the content of an entry for a feed or it may confuse a parser. Also, it should not be used (even if it works for your purposes/in your testing) simply because it is officially disallowed.
When we do want to use this character as part of the content of an RSS feed we replace the character with a sequence that represents the character.
This sequence is called an html entity.
Because the sequence does not use the reserved character, it simplifies the job of the parser, and because the entity is known to reliably mean the same thing as the symbol it stands for, it can later, automatically, be replaced and displayed as the intended character.
Escaping characters with special meaning like this simplifies the job of creating tools that adhere to standards and still work the way users expect them to.
We're getting a little ahead of ourselves, all we do here is set up the hash.
409. my $escape_re = join '|' => keys %escape;
With this statement we declare and initialize a new variable,
$escape_re.
To this new variable we assign the value
join '|' => keys %escape.
To evaluate this, we take the keys from the %escape hash, which we just defined, and join them together into a single string, separated by the character '|'.
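The value is a string such as '<|>|&|"'. Because Perl does not guarantee the order of hash keys, the four characters may appear in a different order, which makes no difference to the pattern.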
410. $title =~ s/($escape_re)/$escape{$1}/g;
This statement is a substitution targeting the value of $title.
First we identify and then replace the portion of the value that matches the given pattern.
Let's look at the pattern:
($escape_re)
The pattern uses the $escape_re string we just initialized.
That string is something like '<|>|&|"'
which makes our pattern
(<|>|&|")
You may recognize this as alternation.
We match any single occurrence of one of the characters '<', '>', '&' or '"' in the title and substitute it with $escape{$1}, the corresponding value from the %escape hash.
Note the use of the /g modifier.
By using this modifier we make this substitution everywhere the match occurs in the string, not stopping after the first match.
This seems reasonable. We want to replace every occurrence of one of these characters in the title of our post with the corresponding entity, and that is exactly what this statement does.
411. $body =~ s/($escape_re)/$escape{$1}/g;
You'll notice that the only difference between lines 410 and 411 is the variable we're matching against.
Here it's $body rather than $title.
In every other way the two statements are identical. See the notes at line 410 for more info.
412. }
Keep in mind that if the condition at line 406 fails, then we skip this block and the substitutions just discussed.
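The escaping block can be run on its own. The sample title below is invented; the hash, the joined pattern and the substitution follow lines 408 through 410.

```perl
use strict;
use warnings;

my %escape = ('<' => '&lt;', '>' => '&gt;', '&' => '&amp;', '"' => '&quot;');
my $escape_re = join '|' => keys %escape;   # e.g. '<|>|&|"', in no fixed order

my $title = 'Tom & Jerry <"live">';
$title =~ s/($escape_re)/$escape{$1}/g;     # replace every special character
print "$title\n";
```

The substitution makes a single pass over the original string, so the ampersands introduced by the entities are not escaped a second time.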
413. blank line
414. $story = &$interpolate($story);
We pass $story to the interpolate routine, defined at line 310.
After substitutions are made in interpolate we return the string back into $story.
Note that unless you are doing something unusual with your story template, two of the variables that will be replaced in $story by interpolate will be $title and $body.
See notes about the interpolate routine starting at line 310 for more info.
415. blank line
416. $output .= $story;
Here we append the value of $story to the output string.
Note that we'll do this once for each entry we encounter.
The story component itself contains the $title and $body variables, which are replaced by the interpolate routine; all nicely wrapped in the container that is the story component.
417. $fh->close;
Here we close the file handle $fh opened at line 395. If the open succeeded, the handle was already closed at line 398, and it is also possible to reach this statement when the open failed. Attempting to close a filehandle that is not open is not a particularly troublesome error but it is a little messy.
418. blank line
419. $ne--;
This simple statement decrements the value of the variable $ne.
Remember that we set $ne equal to $num_entries at line 339.
$num_entries is a user-configurable variable that indicates the maximum number of posts blosxom should display for any request.
Here, right before we return to the top of the foreach loop to process the next post ($path_file), see line 360, we decrement the value of $ne.
Having just output a post, we have one fewer entry left to output.
$ne will start out with the value of $num_entries, e.g. 10, and count down to 9 8 7 6...
Note that the value of $ne is decremented whether or not the open at line 395 succeeded. This should most likely be corrected. As written, blosxom may output fewer than $num_entries posts. For example, if $num_entries is 10, blosxom will attempt to open 10 posts and decrease $ne for each attempt. The number of posts generated will be 1 less than the requested $num_entries for each file that cannot be opened successfully.
After we have output 10 entries, the value of $ne will be 0 and the first line of the body of the foreach loop, line 361
last if $ne <= 0 && $date !~ /\d/;
will break us out of the foreach loop.
Important: this happens only if $date contains no digits,
$date !~ /\d/
If $date contains at least a single digit, we continue to generate all posts regardless of the value of $ne.
After we've decremented $ne to zero or output all posts, we're done with posts, but we're not yet done with generate() because we haven't yet output a complete page.
420. }
End of the foreach loop started at line 360.
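The interaction between $ne and $date can be sketched with a small loop. The entry names, the limit of 3 and the render routine are hypothetical; only the last test is copied from line 361.

```perl
use strict;
use warnings;

my $num_entries = 3;                        # stand-in for the configured limit
my @entries = map { "post$_" } 1 .. 5;      # hypothetical entries, already sorted

sub render {
    my ($date) = @_;                        # date portion of the request, '' if none
    my $ne = $num_entries;
    my @shown;
    foreach my $entry (@entries) {
        last if $ne <= 0 && $date !~ /\d/;  # same test as line 361
        push @shown, $entry;
        $ne--;                              # one fewer entry left to output
    }
    return @shown;
}

my @undated = render('');
my @dated   = render('2006/11');
print scalar(@undated), " entries without a date\n";
print scalar(@dated), " entries with a date\n";
```

Without a date the loop stops after $num_entries entries; with a date that contains a digit the limit is ignored and all five hypothetical entries are shown.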
421. blank line
422. # Foot
423. my $foot = (&$template($currentdir,'foot',$flavour));
This statement is very similar to the statement at line 401.
We're calling the template routine passing it the required arguments
We should expect to find a "foot.$flavour" file somewhere starting at $currentdir and working up toward the root of the data directory, or as a baked-in default, or we will return an error, assuming at least 'foot' is a valid component, which it is of course.
$currentdir is defined as the value of the parameter passed to the generate routine (line 297).
$currentdir is the path from the root of blosxom's data directory to, but not including, the filename.
This should be an appropriate value to pass to the template routine.
The return value, which we store in $foot, is the complete contents of a foot file for the specified flavour.
The template routine first looks for this file at the requested $currentdir before walking up the directory hierarchy towards the root of $datadir.
If it never finds the correct template file, it attempts to use a baked-in template before returning an error message in the case that the requested flavour cannot be found.
The important point of this line is that we should now have the complete contents of the foot. file for the requested flavour in the variable $foot.
424. blank line
425. # Plugins: Foot
426. foreach my $plugin ( @plugins ) { $plugins{$plugin} > 0 and $plugin->can('foot') and $entries = $plugin->foot($currentdir, \$foot) }
Another foreach loop collapsed to a single line. Let's look at it piece by piece.
foreach my $plugin ( @plugins ) {
This is the start of the foreach loop which runs through all of the plugins listed in @plugins. These are active plugins in the sense that $plugin->start() returned true, but this list does include plugins disabled by the user (e.g interpolate_fancy_).
Each is set to the local variable $plugin in turn.
Yet another long 'and'ed partial eval statement.
Let's take it piece by piece, from left to right.
$plugins{$plugin} > 0
Remember that %plugins contains $on_off values for active plugins where 1 is used to indicate that the plugin is enabled, and -1 means that the user has disabled it.
We only continue considering the current plugin if it has not been disabled by the user.
and $plugin->can('foot')
Here we're testing to see if the current plugin has a foot subroutine. (referred to as a method in documentation you'll find on can() because this is an OO interface.)
If the return from the can method is true, then $plugin claims to have a foot() method, which means we should be able to safely refer to $plugin->foot().
and $entries = $plugin->foot($currentdir, \$foot)
Here we call the plugin's foot routine, passing it the variables
$currentdir, \$foot.
With these values, the plugin's foot routine has all of the info it needs to perform the same function as the baked-in foot routine.
By way of review, the following is a list of these variables and the meaning of their values:
$currentdir, See the notes at line 423 for a discussion of the value at $currentdir. It may be enough to say that the value at $currentdir is the path from the root of blosxom's data directory to, but not including, the filename.
\$foot, a reference to the $foot variable which contains the complete contents of the 'foot.' template component. See line 423 for more information.
From the documentation
The flip-side of the head subroutine, Blosxom calls the foot subroutine after reading in the appropriate foot.flavour and before swapping in values for template components.
The subroutine is passed the current working directory - as defined by the path, and a reference to the raw foot.flavour source.
sub foot { my($pkg, $currentdir, $foot_ref) = @_; 1; }
The foot subroutine offers the plugin the opportunity to alter the raw footer source and define or alter any variables before the footer is added to the output stream.
Note that we do not drop out of the foreach loop when we encounter the first plugin with a foot() routine. Multiple plugins with foot routines can coexist. Of course, these are called one at a time, and each plugin with a foot routine is affected by previous plugins and affects any that follow.
427. blank line
428. $foot = &$interpolate($foot);
We pass $foot to the interpolate routine (defined at line 310).
After the variable substitutions are made in interpolate, we return the string back to $foot.
See notes about the interpolate routine starting at line 310 for more info.
429. $output .= $foot;
We append the value of $foot to the output string.
Notice that we're building our output as the string value in $output a piece at a time, starting at the head and then progressing through posts, adding in date headings where they fit.
Finally we tack on the foot component (here).
430. blank line
431. # Plugins: Last
432. foreach my $plugin ( @plugins ) { $plugins{$plugin} > 0 and $plugin->can('last') and $entries = $plugin->last() }
Another foreach loop collapsed to a single line. Let's look at it piece by piece.
foreach my $plugin ( @plugins ) {
This is the start of the foreach loop which runs through all of the plugins listed in @plugins. These are active plugins in the sense that $plugin->start() returned true, but this list does include plugins disabled by the user (e.g interpolate_fancy_).
Each is set to the local variable $plugin in turn.
Yet another long 'and'ed partial eval statement.
Let's take it piece by piece, from left to right.
$plugins{$plugin} > 0
Remember that %plugins contains $on_off values for active plugins where a value of 1 means that the plugin is enabled and -1 indicates that it has been disabled by the user.
We only continue considering the current plugin if it has not been disabled by the user.
and $plugin->can('last')
Here we're testing to see if the current plugin has a last subroutine (referred to as a method in documentation you'll find on can() because this is an OO interface).
If the return from the can method is true, then $plugin claims to have a last() method, which means we should be able to safely refer to $plugin->last().
$entries = $plugin->last()
Here we call the plugin's last routine, passing it no arguments.
From the documentation
The last subroutine hook is called just before the header is prepended to the output and the whole kit-and-kaboodle returned by the generate routine, either for display by the browser or saving to file (if statically rendering)
The subroutine is not passed anything by Blosxom.
Note that we do not drop out of the foreach loop when we encounter the first plugin with a last() routine. Multiple plugins with last routines can coexist. Of course, these are called one at a time, and each plugin with a last routine is affected by previous plugins and affects any that follow.
433. blank line
434. } # End skip
End of conditional block that started on line 320.
See notes at line 320 for more info.
For review:
If a plugin has decided that we should skip generation, all we output is either:
nothing if running in static mode or
a header if running in dynamic mode, assuming the header exists. See line 436 for more info.
If a plugin sets the $skip variable then we pick up execution in generate() here, skipping the majority of the routine.
435. blank line
436. # Finally, add the header, if any and running dynamically
437. $static_or_dynamic eq 'dynamic' and $header and $output = header($header) . $output;
The statement breaks down into three expressions connected by logical operators.
We'll take them one at a time.
$static_or_dynamic eq 'dynamic'
This expression evaluates as true if the value of $static_or_dynamic is 'dynamic'. If it is 'static' instead, we are done with the statement.
Assuming that expression is true, we continue to the next subexpression
and $header
We define this package global at line 287...
$header = {-type=>$content_type};
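...as a reference to an anonymous hash containing a single key/value pair:
the key '-type' and the value $content_type.
A reference is always true in Perl, so this subexpression succeeds as long as $header still holds that reference.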
and $output = header($header) . $output;
Finally, we prepend header($header) to the value we've built-up in $output to this point.
What is header($header)?
header() is a routine defined in the CGI module that produces the required HTTP header.
The argument to header() is the value of $header which we set at line 287:
$header = {-type=>$content_type};
From the documentation for the CGI module we can confirm that this is the correct form for a call to header().
The return value from header($header) will be an appropriate HTTP header, which we prepend to our output.
Note that we will only want to process this expression if we are running in dynamic mode, which we have established with the previous subexpressions in the statement.
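The logic of line 437 can be sketched without loading the CGI module. Note the assumption: fake_header below is a hypothetical stand-in that only handles the -type key and is not CGI's header(); it exists so the sketch runs anywhere.

```perl
use strict;
use warnings;

# Hypothetical stand-in for CGI's header(); it only understands the -type key.
sub fake_header {
    my ($args) = @_;
    return "Content-Type: $args->{-type}\r\n\r\n";
}

my $content_type = 'text/html';
my $header = { -type => $content_type };    # reference to an anonymous hash

my %page;
foreach my $static_or_dynamic ('dynamic', 'static') {
    my $output = "<p>page body</p>\n";
    if ($static_or_dynamic eq 'dynamic' and $header) {
        $output = fake_header($header) . $output;   # prepend, as line 437 does
    }
    $page{$static_or_dynamic} = $output;
    print "--- $static_or_dynamic ---\n$output";
}
```

Only the dynamic page gains the header lines; the static page is left exactly as it was built.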
438. blank line
439. $output;
Finally we're finished with generate() and we return the value of $output. Because $output is the last statement evaluated, it is automatically the return value.
440. }
End of the generate() routine!
441. blank line
442. blank line
443. sub nice_date {
Start of the definition of the nice_date() routine, used to return a list of date/time values when passed a date in unix timestamp format.
nice_date returns the following list:
($dw,$mo,$mo_num,$da,$ti,$yr);
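The values of these variables will adhere to the following formats:
$dw, day of the week as a three-letter abbreviation, e.g. 'Thu'
$mo, month as a three-letter abbreviation, e.g. 'Nov'
$mo_num, two-digit month number, e.g. '11'
$da, two-digit day of the month, e.g. '24'
$ti, time of day as hours and minutes, e.g. '18:22'
$yr, four-digit year, e.g. '2006'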
444. my($unixtime) = @_;
nice_date() expects a single parameter, namely a date in unix timestamp format. Here we declare a variable $unixtime to store the value passed to the routine.
445. blank line
446. my $c_time = ctime($unixtime);
ctime is a function that takes as an argument a value representing the time in seconds since epoch (i.e. unix timestamp format).
It adjusts the value for the current time zone and returns a string of the form:
Thu Nov 24 18:22:48 1986
(The C library function of the same name returns a pointer to a 26-character buffer in which this text is followed by "\n\0".)
This string is stored in the newly declared $c_time variable.
447. my($dw,$mo,$da,$ti,$yr) = ( $c_time =~ /(\w{3}) +(\w{3}) +(\d{1,2}) +(\d{2}:\d{2}):\d{2} +(\d{4})$/ );
Here we attempt a match on the string at $c_time, setting the list of variables to the portions of the string captured by the parenthesized groups in the pattern.
Looking at this sample string...
'Thu Nov 24 18:22:48 1986'
...it's easy to see how these matches and the corresponding assignments break down.
$dw - (\w{3}) - 'Thu'
$mo - (\w{3}) - 'Nov'
$da - (\d{1,2}) - '24'
$ti - (\d{2}:\d{2}) - '18:22'
$yr - (\d{4}) - '1986'
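The breakdown above can be checked with a short sketch that applies the same pattern to a hard-coded string. The routine name split_ctime() is made up for this illustration and does not appear in the script.

```perl
use strict;
use warnings;

# Apply the pattern from line 447 to a ctime-style string and return
# the five captured pieces as a list.
sub split_ctime {
    my ($c_time) = @_;
    return ( $c_time =~ /(\w{3}) +(\w{3}) +(\d{1,2}) +(\d{2}:\d{2}):\d{2} +(\d{4})$/ );
}

my ($dw, $mo, $da, $ti, $yr) = split_ctime('Thu Nov 24 18:22:48 1986');
print join('|', $dw, $mo, $da, $ti, $yr), "\n";   # prints Thu|Nov|24|18:22|1986

# Single-digit days are padded with an extra space in this format;
# the " +" in the pattern absorbs it, leaving a one-digit $da.
my @parts = split_ctime('Sat Mar  1 09:05:00 2003');
print $parts[2], "\n";                            # prints 1
```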
448. $da = sprintf("%02d", $da);
After the match on the previous line, $da may be only a single digit: this happens whenever the day of the month in the timestamp is 1 through 9.
Here we read the value from $da and, using the format %02d, pad it with a leading zero if necessary so that we have a two-digit value in every case, e.g. '14' will still be '14' but '1' will be padded to '01'.
We save the result back to $da.
What's the point of doing this?
With single-digit values normalized, we can now depend on the fact that $da will always be exactly 2 digits in length.
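A small sketch of the padding step, run over a few sample day values. The helper name pad_day() is hypothetical and exists only to make the behaviour easy to check.

```perl
use strict;
use warnings;

# Zero-pad a day of the month to two digits, as line 448 does for $da.
sub pad_day {
    my ($da) = @_;
    return sprintf("%02d", $da);
}

for my $da (1, 9, 14) {
    print "$da -> ", pad_day($da), "\n";
}
# prints:
# 1 -> 01
# 9 -> 09
# 14 -> 14
```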
449. my $mo_num = $month2num{$mo};
The pattern match on $c_time gave us the month as a three character string (eg 'Nov'). It is useful (elsewhere in the script) to have the month as a two digit value instead (eg '11').
We created the %month2num hash just for this purpose (line 83).
The hash consists of key/value pairs, where each key is the name of a month as a three character string and the values are corresponding two digit values.
Here we simply retrieve the value for the key we've pulled out of $c_time and store that value in the variable $mo_num.
We'll add this value to the list we return from nice_date().
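The lookup can be sketched as follows. The hash below is written out for illustration from the description above (three-letter month names mapped to two-digit values); it is not copied from line 83 of the script and may differ from it in detail.

```perl
use strict;
use warnings;

# Illustrative month-name to month-number table. The values are quoted
# strings so that the leading zeros survive; a bare 01 would become 1.
my %month2num = (
    Jan => '01', Feb => '02', Mar => '03', Apr => '04',
    May => '05', Jun => '06', Jul => '07', Aug => '08',
    Sep => '09', Oct => '10', Nov => '11', Dec => '12',
);

my $mo     = 'Nov';
my $mo_num = $month2num{$mo};
print "$mo -> $mo_num\n";   # prints Nov -> 11
```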
450. blank line
451. return ($dw,$mo,$mo_num,$da,$ti,$yr);
We return to the caller the list of values we've created in the expected order.
In summary, the caller passes this routine a value representing the time in seconds since the epoch, and the routine returns the list of values
($dw, $mo, $mo_num, $da, $ti, $yr)
452. }
End of nice_date() routine started at line 443.
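To see the routine working outside of blosxom, here is a self-contained sketch that follows the steps walked through above. It assumes ctime() comes from the core Time::localtime module, defines its own illustrative %month2num table, and uses the name nice_date_sketch() to make clear that it is a re-creation for experimenting rather than the script's own code. The printed values depend on the current time and local time zone.

```perl
use strict;
use warnings;
use Time::localtime;   # core module; exports ctime()

# Illustrative table; values are strings to keep the leading zeros.
my %month2num = (
    Jan => '01', Feb => '02', Mar => '03', Apr => '04',
    May => '05', Jun => '06', Jul => '07', Aug => '08',
    Sep => '09', Oct => '10', Nov => '11', Dec => '12',
);

sub nice_date_sketch {
    my ($unixtime) = @_;

    # e.g. 'Thu Nov 24 18:22:48 1986', adjusted for the local time zone
    my $c_time = ctime($unixtime);

    # Pull out weekday, month, day, HH:MM and year; seconds are dropped.
    my ($dw, $mo, $da, $ti, $yr) =
        ( $c_time =~ /(\w{3}) +(\w{3}) +(\d{1,2}) +(\d{2}:\d{2}):\d{2} +(\d{4})$/ );

    $da = sprintf("%02d", $da);      # always two digits
    my $mo_num = $month2num{$mo};    # 'Nov' becomes '11'

    return ($dw, $mo, $mo_num, $da, $ti, $yr);
}

# Usage: unpack the returned list in the same order it is returned.
my ($dw, $mo, $mo_num, $da, $ti, $yr) = nice_date_sketch(time);
print "$dw, $da $mo ($mo_num) $yr at $ti\n";
```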
453. blank line
FIN
454. blank line
455. # Default HTML and RSS template bits