Perl coding guidelines

Last modified
Monday, July 30, 2018 - 01:16

Introduction

The code base has been separated into two parts:

  1. The code in lib/
    This is the code we want to keep and improve
  2. The code in old/
    This is the code we want to replace with rewritten code into lib/

The coding guidelines apply to the code in lib/ and not so much to that in old/: we put energy in rewriting the code in old/, not in improving it. However, the "secure coding" section below obviously applies to all code and all code changes. When adding new functionalities, please use the style documented below. When fixing existing functionality, please use the style applicable to the module you're working on.

Code formatting

The following guidelines apply to formatting Perl code in our code base:

  • No TAB characters (use multiple spaces instead)
  • Indent by 4 spaces
  • Maximum width of 80 characters
  • No trailing whitespace
  • Module names are at the package level (LedgerSMB::*), so don't prefix function names

When you're touching a file which doesn't comply to these formatting rules, please reformat the file, but provide separate patches/commits between the reformatting and the code changes being applied.

As Chris Bennet details on the 80-column code width:

Scientists/writers or somebody discovered that because of the way the
human brain works with writing, it "gets lost" with sentences/code that
is too wide across. This is why newspapers and magazines and good
websites force column width to be narrow.

Naming conventions

All identifiers (modules, functions, variables) should be named clearly, in English, in lower case, with words separated by underscores. Multiple words should be avoided where possible.

Private methods and variables have names starting with an underscore.

Packages

Required pragmas

All packages should at least include the following pragmas:
use warnings;
use strict;

Requirement for "use English;"

# as we don't want to use the match variables for regexes
# to prevent the associated performance hit (and have perl critic trapping for them)
# we need to pass -no_match_vars to some modules
use English qw(-no_match_vars);

Order of "use" statements

  1. Lower case (pragma-like) use statements such as "use strict;" and "use base;"
  2. Regular (non LedgerSMB & non-Moose) module use statements such as "use Log::Log4perl" -- alphabetically sorted
  3. LedgerSMB modules (i.e. namespace starting with "LedgerSMB::") -- alphabetically sorted
  4. Moose modules and Moose  inheritance declarations ('extends "module::XYZ";')

Please make sure the 4 sections are visually separated by an empty line. Not all sections are applicable to all modules.

An example of the above looks like:

use strict;
use warnings;

use Log::Log4perl;

use LedgerSMB;
use LedgerSMB::DBObject::Account;
use LedgerSMB::DBObject::EOY;
use LedgerSMB::Template;

Class definition

LedgerSMB uses Perl's post-modern Object System Moose for all its classes. It's a powerful framework. We depend on it to make our code as declarative as we can.

When creating a new class with Moose, use the following use statements for the most clean namespaces:

use namespace::autoclean;
use Moose;
For the best performance, Moose class definitions should end with the following code:

__PACKAGE__->meta->make_immutable;

1;

POD / Documentation sections

When adding a module, please take into account that the following sections be present, in the order as specified below:

  1. NAME
  2. VERSION*
  3. SYNOPSIS
  4. DESCRIPTION
  5. INTERFACE
    1. ATTRIBUTES
    2. METHODS
    3. FUNCTIONS
  6. LICENSE AND COPYRIGHT

* Version is optional at this point, as there is currently no policy in place for numbering of individual module versions.

These sections are taken (mostly) from Damian Conway's Module::Starter::PBP module. Existing code is being migrated to this documentation scheme.

Database interaction

The entire request/response cycle runs inside one transaction on the database side. This is a design choice, which works best when taking into account to:

  • Don't touch the database connection directly, instead leaving it to specialized mapper objects and the PGObject (1.5) or DBObject (1.4/1.3) ORM mappers
  • Stay away from modifying(commit/rollback/begin) the on-going transaction at all times!
  • If you can't avoid touching the database connection directly, be sure to
    • Check for errors with every DBI call (method invocation on $dbh or $sth), because if you don't, the next checked invocation will fail leading to incorrect identification of the source of the error

ORM Classes

Generally, database interaction is handled through ORM (Object Relation Mapper) classes located in LedgerSMB/DBObject/. In case you're adding tables, it's likely you'll need to add one or more ORM mapping classes.

These classes contain code to map in-memory instances to the database through - and this is the unusual bit - invocation of stored procedures. Invocation of these stored procedures happens through PGObject(1.5)/DBObject(1.4/1.3). If you find yourself in need to write directly to the database tables, you're likely to do something wrong.

Error Handling

In 1.4 and up, calling "die" is the preferred way to signal an error: even though most of the time, our modules will be called by the web-glue in LedgerSMB/Scripts/, using die supports the use-case where modules are used outside that context, e.g. in commandline scripts.

In 1.3 and earlier, the code uses "$self->error()" [1.3] or "$form->error()" [1.2 and earlier] for generic errors. Database errors are reported through "$self->dberror()" and "$form->dberror()" respectively.

Non-conforming code

Unfortunately, there's still a lot of code around in the codebase that doesn't conform to the guidelines above. All of it will be phased out at some point. Below is a description of what non-conformities you may run into.

Legacy Code (1.2 and earlier)

In general it is best to avoid modifying this code. Even the standard pragmas are not considered generally applicable.  

Code that falls into this category is in old/.

Example code

For those working on the 1.3 codebase, the best modules to look at are probably the fixed asset handling modules.  In particular these three files:

These provide the general fixed asset functionality.  These are well-formed, well documented files which represent some of the best structured code that the 1.3 architecture really allows.

For those working on the 1.4 codebase, see the following modules:

1.5 examples will be added here after we branch and move the code.

Test cases

Secure coding

LedgerSMB is structured to minimize the security footprint that the Perl code has in the application. Nevertheless there are certain rules that should be followed in order to prevent security problems.

SQL Queries

When working with the old code, all queries should be unitary (i.e. one query with all applicable clauses in the main string) and parameterized (using ?'s as placeholders).  Many of the old legacy queries are not this way.  In general we avoid disturbing the old legacy code any more than we have to but as of 1.3, all legacy code has been reviewed for SQL injection and other issues.

In the new framework, SQL should never be called directly.  Instead create stored procedures in SQL or PL/PGSQL and call those using our declarative interface.

open()

Another important possible vulnerability is that of open statements.  Open statements should always use the three argument form, as the two argument form can be vulnerable to some sorts of user input attacks.

Dynamic function calls

Dynamic function calls are sometimes found in the old codebase but are generally to be avoided.  The only exception is the base framework code, where entry points are predefined in specific modules (and hence are explicitly restricted by namespace).  Aside from this, function names should not be dynamically created.