We recommend the use of the Apache HTTP Server, a free, open-source server which has been the most popular web server on the Internet since April of 1996. The February 2005 Netcraft Web Server Survey found that more than 68% of the web sites on the Internet are using Apache, thus making it more widely used than all other web servers combined.
We recommend the use of PHP as the programming language for this project. PHP is a widely used general-purpose scripting language that was specifically written for web page creation. PHP is free, easy to learn, and has many extensions available. PHP 5 offers object-oriented capabilities and an XSL transformation extension. PHP code is generally quick to write and to execute, and it can run unaltered on different web servers and operating systems.
Although Java could also be used, it is more difficult to learn and applications may take longer to develop in it, and Java developers typically cost more than PHP developers. There is also a lack of Java developers at UC Berkeley, which would make the application hard to maintain if it were hosted on UC Berkeley servers, even if the initial development were outsourced to an outside vendor.
Free Energy is a PHP application design methodology which allows developers to write modular and maintainable code by creating web pages in an object-oriented way. A series of nested scripts create the layout for each page, encapsulating the major pieces of each page (e.g. header, footer, navigation, dynamic content) into files, which are included on demand. Usually a single PHP script, index.php, assembles the modules to create each page. The modules fall into the following categories: action, layout, navigation, screen, and utility. We suggest that the UCBCN application developer consider Free Energy as a potential application development methodology.
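The way a single script can assemble a page from modules might be sketched as follows. This is only an illustration under stated assumptions, not the Free Energy implementation: the modules directory, the file names, the screen names, and the functions getScreenPath and getPageModules are all hypothetical.

```php
<?php
// directory that holds the modules (hypothetical location)
define("MODULE_DIR", "modules");

/*
** Function: getScreenPath
** Input: STRING screen, ARRAY allowedScreens
** Output: STRING
** Description: Maps a requested screen name to a module file,
** falling back to the first allowed screen for unknown names
*/
function getScreenPath($screen, $allowedScreens)
{
	// accept only known screens, so that user input
	// never reaches the file system unchecked
	if(!in_array($screen, $allowedScreens, true))
	{
		$screen = $allowedScreens[0];
	}

	return(MODULE_DIR . "/screen/" . $screen . ".php");
}

/*
** Function: getPageModules
** Input: STRING screen, ARRAY allowedScreens
** Output: ARRAY
** Description: Lists the module files for one page, in include order
*/
function getPageModules($screen, $allowedScreens)
{
	return(array(
		MODULE_DIR . "/layout/header.php",
		MODULE_DIR . "/navigation/menu.php",
		getScreenPath($screen, $allowedScreens),
		MODULE_DIR . "/layout/footer.php"
	));
}
```

In this sketch, the assembling script would loop over the result of getPageModules and include each file in turn, so that adding a screen means adding one file and one entry in the list of allowed screens.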
If PHP is used, we recommend investigating PEAR::DB, a fully object-oriented PHP extension that provides a database abstraction layer. It is a part of the PHP Extension and Application Repository (PEAR) Project.
If this application were being hosted off the UC Berkeley campus, we would recommend the use of a MySQL database. MySQL is the world's most popular open-source database, and it is free. MySQL is in use by many large companies with high-traffic websites (e.g. Associated Press, Yahoo, NASA, Sabre Holdings and Suzuki), and it is known for its speed, scalability, and reliability. It has consistently received the highest ratings in benchmarking tests, along with Oracle 9i, among the databases tested. See http://www.mysql.com/it-resources/benchmarks/ for more information. It should be robust enough to handle this type of application.
A MySQL database is often much less expensive to maintain than other databases. According to the MySQL® Business White Paper "A Guide to Lower Database TCO: How the Open Source Database MySQL Reduces Costs by as Much as 90%," dated 1/31/2005:
MySQL reduces the Total Cost of Ownership (TCO) of database software by:
Reducing database licensing costs by over 90%
Cutting systems downtime by 60%
Lowering hardware expenditure by 70%
Reducing administration, engineering and support costs by up to 50%
MySQL complements the use of existing corporate databases such as Oracle, IBM DB2 and Microsoft SQL Server by providing a less complicated solution suitable for widespread application deployment, including those with a high transaction volume.
There are three key reasons why MySQL is the world’s most popular open source database:
MySQL is a fast, easy-to-use and reliable database developed and marketed at a fraction of the cost of proprietary software by using an open source approach. These cost savings are passed on directly to the customer.
MySQL has been battle tested in the market place. It has over 6 million active installations and over 35,000 downloads every day.
The MySQL database is supported by MySQL AB, a second-generation open source company founded in 1995. The company is profitable, owns and supports all of its code and offers MySQL Network, a comprehensive set of proactive services that saves enterprise developers and DBAs time and effort.
However, there are currently no resources available at UC Berkeley to maintain a MySQL database. This means if the application were hosted on a campus server, it would be necessary to either a) use an Oracle database or b) hire someone to be responsible for installing, configuring, backing up, updating the software, manipulating tables or records in the database, or otherwise doing any maintenance necessary on a MySQL database. An Oracle database is generally more costly than a MySQL database, but definitely robust enough to handle this type of application.
We recommend using the XSLT processor that comes standard with the PHP XSL extension, Sablotron. Sablotron is an open-source XML toolkit implementing XSLT 1.0, DOM Level2 and XPath 1.0. We have not tested this processor; if it does not meet the requirements for quick transformations, we recommend either Xalan or Saxon 6.5.3, both of which implement XSLT 1.0 and XPath 1.0.
We do not have definitive hardware requirements for this application, as there are several different options available that would be likely to meet its needs. We recommend that the application be developed first, and optimized later, perhaps by adding additional hardware (such as extra web or database servers behind a load balancer). We do not expect that the application will need multiple web or database servers in the first roll-out. However, if possible, it would be best to have the application on a separate server from the database, as this is an extra security measure and offers more processing power. Another option is to run the complete application and database on each of two servers, which would offer redundancy in case one of the servers went down.
The server(s) used should be at least as robust as the UC Berkeley campus Arachne web server, which hosts the current campus-wide calendar. Arachne is a Sun Fire V440 with 4 1.28-GHz or 1.593-GHz UltraSPARC IIIi processors and 4 GB of RAM running Solaris. Arachne currently hosts around 150 applications, many of which are using PHP or MySQL databases, so it would appear to be an appropriate benchmark for the UCBCN application. We do not have statistics on the servers used for the Oracle databases on which IS&T offers hosted solutions, but we would expect that any server purchased to support an Oracle database be at least as robust as the campus Oracle database server.
The main options for hardware, each of which is affected by the database we choose, include:
Host application and MySQL database on Arachne
Host application on Arachne and database on campus-maintained Oracle database server
Buy own servers, use MySQL, use IS&T for server maintenance and outsource database maintenance
Buy own servers, use Oracle, use IS&T for server & database maintenance
Host application on a Managed Private Server at Verio with a MySQL database
Until the end of March, Sun is offering a 2-for-1 deal on its Sun Fire V240 servers at a cost of $7000. These servers come with a single 1.34 GHz or dual 1.5 GHz UltraSPARC IIIi processors, 8 GB of memory, and four disks. They also come with four Gigabit Ethernet ports, three PCI slots, two redundant power supplies, a system configuration card, advanced remote management, the pre-installed Solaris 9 Operating System, pre-installed Sun Java Enterprise System 2004Q2 software, an optional SSL card, and up to four hot-swap Ultra160 SCSI 73/146 GB disks. If this option were selected, it is possible that the servers could be maintained by IS&T as part of the Sakai "web farm." Additionally, these machines could support other applications. It might be helpful to migrate www.berkeley.edu from Arachne to these new servers if they were both used as web & database servers in order to increase redundancy and eliminate the need for www.berkeley.edu to go down when maintenance is required.
If a Managed Private Server at Verio is selected, it would be running FreeBSD on a Xeon 2.6 GHz processor with 1 GB RAM and 72 GB of disk space. Nightly backups to tape are included in the hosting fee, and Verio can restore to any point in the previous two weeks. There is a fee for restoring from tape, but the previous night's backup is available in a mounted directory, and using it does not require a fee.
We expect the storage needs of this application to be fairly large, and expect to need a minimum of 1.5 GB of storage space for the first roll-out. This should hold all the event information generated in the first year. If event information is stored in the database for 3-5 years as planned, we expect to need at least 3-5 times as much storage space (roughly 4.5 to 7.5 GB).
The costs of these different options are enumerated in the UCBCNCostEstimates.xls spreadsheet in Appendix D.
This section is adapted from the FreeTrade Design Document, available at http://share.whichever.com/index.php?SCREEN=freetrade.
Every file should start with a comment block that describes its purpose, version, and author, and that contains a copyright message. The comment block should be in the style below.
/*
** File: test
** Description: This is a test program
** Version: 1.0
** Created: 1/1/2000
** Author: Leon Atkinson
** Email: leon@clearink.com
**
** Copyright (c) 2003 Whichever. All rights reserved.
*/
Every function should have a block comment specifying name, input/output, and what the function does.
/*
** Function: doAdd
** Input: INTEGER a, INTEGER b
** Output: INTEGER
** Description: Adds two integers
*/
function doAdd($a, $b)
{
	return($a + $b);
}
Ideally, every while, if, for or similar block of code should be preceded by a comment explaining what happens in the block.
// get input from user char by char
while(getInput($inputChar))
{
	storeChar($inputChar);
}
Explain sections of code that aren't obvious.
// TAB is ASCII 9
define("TAB", 9);
// change tabs to spaces in userName
for($index=0; $index < count($userName); $index++)
{
	$userName[$index] = str_replace(chr(TAB), " ", $userName[$index]);
}
As previously stated, functions should have a comment block explaining what they do and their input/output. The function block should align starting at one tab from the left margin unless the function is part of a class definition. Opening and closing braces should also be one tab from the left margin. The body of the function should be indented two tabs.
<?php
	/*
	** doAdd
	** Adds two integers
	** Input: $a, $b
	** Output: sum of $a and $b
	*/
	function doAdd($a, $b)
	{
		return($a + $b);
	}
?>
Flow control primitives (if, while, for, ...) should be compound statements, even if they only contain one instruction. Like functions, compound statements should have opening braces that start at column zero relative to scope. Code within the braces forms a new scope and should be indented.
// tell the user if a is equal to ten
if($a==10)
{
	printf("a is ten.\n");
}
else
{
	printf("a is not ten.\n");
}
The names of variables, constants, and functions are to begin with a lowercase letter. In names which consist of more than one word, the words are written together and each word starts with an uppercase letter.
Constants should be written in uppercase.
This makes constants stand out. It also lets you know when you're using an undefined constant. (Recall that PHP accepts unquoted strings containing no whitespace.)
define("EMAIL_FROM", "Webmaster <webmaster@$SERVER_NAME>");
Constants that belong to a specific module should use a consistent prefix.
//text with which to label the field
define("ADDR_LABEL", 0);
//name of the form field (sans prefix of course)
define("ADDR_VAR", 1);
//error message to display for missing fields
define("ADDR_ERROR", 2);
Function names should begin with a lowercase letter and use capitals for subsequent words.
/*
** Function: getAddressFromEnvironment
** Input: $Prefix - prefix used to generate address form
** Return: array suitable for addressFields
*/
function getAddressFromEnvironment($Prefix)
{
	global $AddressInfo;
	$ReturnValue = array();
	//get list of all address fields
	//from the AddressInfo array
	foreach($AddressInfo as $field => $info)
	{
		$ReturnValue[$field] = trim($GLOBALS[($Prefix . $info[ADDR_VAR])]);
	}
	return($ReturnValue);
}
Function names should suggest an action or verb
updateAddress, makeStateSelector
Variable names should suggest a property or noun
UserName, Width
Use pronounceable names
User, not usr.
Use descriptive names for variables used globally, use short names for variables used locally.
$AddressInfo = array(...);
for($i=0; $i < count($list); $i++)
Be consistent, use parallelism.
If you are abbreviating number as 'num', always use that abbreviation. Don't switch to using 'no' or 'nmbr'.
Values that are treated as constants, that is, not changed by the program, should be declared at the beginning of the scope in which they are used. In PHP this is done with the define function. Each of these constants should be paired with a comment that explains its use. They should be named exclusively with uppercase letters, with underscores to separate words. You should use constants in place of any arbitrary values to improve readability.
Example
// maximum length of a name to accept
define("MAX_NAME_LENGTH", 32);
print("Maximum name length is " . MAX_NAME_LENGTH);
Variables are to be declared with the smallest possible scope. This means using function parameters when it’s appropriate.
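As an illustration of keeping scope small, the sketch below passes everything the function needs as parameters and keeps its working variable local. The function name formatEventTitle and the sample title are hypothetical and are not part of the UCBCN application.

```php
<?php
/*
** Function: formatEventTitle
** Input: STRING title, INTEGER maxLength
** Output: STRING
** Description: Trims a title and cuts it to at most maxLength characters
*/
function formatEventTitle($title, $maxLength)
{
	// $trimmed exists only inside this function
	$trimmed = trim($title);

	return(substr($trimmed, 0, $maxLength));
}

// the caller supplies both values, so no global is needed
print(formatEventTitle("  Spring Concert  ", 6) . "\n");
```

Because the function reads nothing from the global scope, it can be reused and tested without setting up any surrounding state.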
Lines should not exceed 78 characters. Break long lines at common separators and align the fragments in an indented block.
if(($size < 0) OR
	($size > MAX_SIZE) OR
	(isSizeInvalid($size)))
{
	print("Invalid size");
}
Write conditional expressions so that they read naturally aloud.
Sometimes eliminating a not operator (!) will make an expression more understandable.
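A small sketch of this point, using a hypothetical fitsInRoom function: the negated comparison is kept as a comment next to the positive form that replaces it.

```php
<?php
/*
** Function: fitsInRoom
** Input: INTEGER attendees, INTEGER capacity
** Output: BOOLEAN
** Description: Reports whether the attendees fit within the capacity
*/
function fitsInRoom($attendees, $capacity)
{
	// harder to read aloud: return(!($attendees > $capacity));
	return($attendees <= $capacity);
}

// seat the group only if it fits
if(fitsInRoom(40, 50))
{
	print("The group fits.\n");
}
```

Both forms return the same result for integer input; the second simply reads as "attendees is at most capacity."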
Use parentheses liberally to resolve ambiguity.
Using parentheses can force an order of evaluation. This saves the time a reader may spend remembering precedence of operators.
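For example, PHP evaluates && before ||, which a reader may not recall at a glance. The sketch below, built around a hypothetical canRegister function, writes the grouping out explicitly.

```php
<?php
/*
** Function: canRegister
** Input: BOOLEAN isMember, BOOLEAN isGuest, BOOLEAN hasInvitation
** Output: BOOLEAN
** Description: Members may always register; guests need an invitation
*/
function canRegister($isMember, $isGuest, $hasInvitation)
{
	// without parentheses PHP groups this the same way,
	// but the reader would have to recall that && binds first
	return($isMember || ($isGuest && $hasInvitation));
}
```

The parentheses do not change the result here; they only make the intended grouping visible.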
Keep each line simple.
The ternary operator (x ? 1 : 2) usually indicates too much code on one line. if..elseif..else is usually more readable.
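The difference can be seen in a short sketch. The function describeAttendance and its labels are hypothetical; the one-line conditional form is kept as a comment for comparison.

```php
<?php
/*
** Function: describeAttendance
** Input: INTEGER count
** Output: STRING
** Description: Returns a label for the number of attendees
*/
function describeAttendance($count)
{
	// dense one-line form:
	// $label = ($count == 0) ? "none" : (($count == 1) ? "one" : "many");

	// pick the label one case at a time
	if($count == 0)
	{
		$label = "none";
	}
	elseif($count == 1)
	{
		$label = "one";
	}
	else
	{
		$label = "many";
	}

	return($label);
}
```

The longer form takes more lines, but each case can be read, commented, and changed on its own.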
Don't sacrifice clarity for cleverness.
Following are idioms you ought to use when appropriate. Often there are multiple ways to perform a task. Using an idiom will make the code instantly familiar.
If using MySQL, the mysql_result function should not be used due to its inefficiency. The mysql_fetch_row function has been used extensively, but mysql_fetch_object may be the preferred method going forward.
All SQL statements should list every column used, and should never use the "*" (all) operator. This makes it clear to developers working on the system in the future which columns are being used in the statement, and makes it easier to add columns to the database without affecting the statements.
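A minimal sketch of a statement that names its columns follows. The table event, its columns, and the function buildEventQuery are assumptions made for illustration and are not taken from the UCBCN data model.

```php
<?php
/*
** Function: buildEventQuery
** Input: INTEGER eventId
** Output: STRING
** Description: Builds a SELECT that names each column it reads
*/
function buildEventQuery($eventId)
{
	// avoid: SELECT * FROM event WHERE event_id = ...
	// intval ensures that only a number reaches the statement
	$query = "SELECT event_id, title, start_time, location " .
		"FROM event " .
		"WHERE event_id = " . intval($eventId);

	return($query);
}
```

If a column is later added to the table, this statement keeps returning the same four columns, so code that reads the result does not need to change.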