Infrastructure
7th March, 2015 - Posted by david
TL;DR: The MultiViews option in Apache will automatically map e.g. /xyz/ to /xyz.php.
I was recently creating a new section of the website I work for and decided to opt for tidy URLs, for SEO purposes, instead of the standard.long?url=format URLs that we have elsewhere. Let’s say the new section was called David’s Boxes, so I wanted relative URLs like /davids-boxes/big/blue to map to davids-boxes.php?size=big&colour=blue. Purely coincidentally, there happened to be a defunct davids-boxes folder in our www directory, which contained an old WordPress install and which I promptly deleted (more on this later). Then I set up rewrite rules in our www/.htaccess to do the example mapping above.
Everything was working fine locally: /davids-boxes/ mapped to /davids-boxes.php and /davids-boxes/big/blue mapped to /davids-boxes.php?size=big&colour=blue, all as expected. However, when I put the .htaccess file onto our test server, I couldn’t get the rules to match properly: everything mapped to the basic /davids-boxes.php, i.e. with no extra GET parameters. I tried changing the order of the rules, moving them to the top of the .htaccess etc., but nothing worked. Then I simply deleted the rules from the .htaccess, expecting /davids-boxes/ not to map to anything, but strangely it still mapped to /davids-boxes.php as before. This led me to believe there was another rewrite rule somewhere else (a suspicion reinforced by the previous WordPress install). Searching the entire codebase, which includes all the ‘sub’ .htaccess files, yielded no results, so then I began thinking it might be the server…
I had a look in our sites-available Apache configs, expecting to find some sort of obvious generic rewrite mapping any e.g. /xyz/ to xyz.php; no such luck. Going through each line in the config, I noticed we had the FollowSymLinks and MultiViews options enabled in the <Directory> block. I was familiar with the former, but not the latter. Looking into MultiViews, it turned out this was the thing doing the automatic mapping I was experiencing! The documentation states “if /some/dir has MultiViews enabled, and /some/dir/foo does not exist, then the server reads the directory looking for files named foo.*, and effectively fakes up a type map which names all those files”. Such a relief to figure it out. I checked with our CTO, who didn’t know how it had got there, so after removing it on the test server and doing a quick test, we got rid of it everywhere and my problems were solved.
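To track down where the option is switched on, a recursive grep over the Apache config directory is usually enough. The sketch below is only an illustration: the function name find_multiviews is made up, and /etc/apache2 is assumed to be where the config lives (the Debian/Ubuntu default). It lists every uncommented Options line that mentions MultiViews, including lines that disable it with -MultiViews.

```shell
#!/bin/sh
# find_multiviews is a hypothetical helper, not part of Apache.
# Prints file:line:text for every uncommented "Options" directive
# that mentions MultiViews under the given directory.
find_multiviews() {
    conf_dir="${1:-/etc/apache2}"   # assumed Debian/Ubuntu layout
    # -r recurse, -n show line numbers, -i directives are case-insensitive
    grep -rniE '^[[:space:]]*Options[[:space:]].*MultiViews' "$conf_dir"
}

# Usage (exits non-zero when nothing matches):
#   find_multiviews /etc/apache2
```

Removing MultiViews from the matching Options line and restarting Apache is what switched the behaviour off in the case described above.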
30th April, 2014 - Posted by david
At work we use memcache as our local variable cache and the excellent memcache.php from Harun Yayli to give us a simple way of viewing what’s in the cache.
One use case we came up with that was missing from the original memcache.php script was a way to group similar variables and see how much of the cache they’re taking up. For example, for searches done on the site, we generate a key by concatenating search- to an md5 of the SQL, then store the result of that query in the cache with that key. Another example might be to cache an ad, so the key could be ad-1234, for the ad with ID 1234. The following code changes enable us to see how much space all the ‘search’ data, ‘ad’ data etc. take up in comparison to each other.
It works by starting off with a list of known key prefixes (i.e. search- and ad- in the examples above), then uses existing memcache commands to get a list of slabs, then queries each slab for each item it contains. From this list of items, it looks for our known keys, calculates the size of the item and adds it to a running total. Once it has all the totals, it generates a nice pie chart with a legend, using Google’s Chart API.
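The tallying step can be sketched on its own, away from memcache. The example below is only an illustration in shell and awk, not the code used in this post: it assumes the key/size pairs have already been dumped one per line as `key size`, and the function name tally_by_prefix is made up.

```shell
#!/bin/sh
# tally_by_prefix: hypothetical helper. Reads "key size" lines on stdin and
# prints "prefix total" for each known prefix, plus an "other" bucket.
# Usage: tally_by_prefix search- ad- < items.txt
tally_by_prefix() {
    awk -v prefixes="$*" '
        BEGIN { n = split(prefixes, p, " ") }
        {
            bucket = "other"
            for (i = 1; i <= n; i++) {
                # index() == 1 means the key starts with this prefix
                if (index($1, p[i]) == 1) { bucket = p[i]; break }
            }
            total[bucket] += $2
        }
        END {
            for (i = 1; i <= n; i++) print p[i], total[p[i]] + 0
            print "other", total["other"] + 0
        }
    '
}
```

Anything that matches no known prefix lands in the "other" bucket, which is the same role the 'other' entry plays in the pie chart.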
So, first up we need to add a new entry to our menu, to link to our breakdown. This is simply done by editing the getMenu function in src/display.functions.php and adding a new menu entry to it, as follows:
// after the line for Slabs
echo menu_entry(16, 'Breakdown');
Next up, we need to add the big block of code that’s going to generate our pie chart. You’ll see in memcache.php a switch block around $_GET['op']. This is where we want to add our block for our new operation 16, as follows:
<?php
switch ($_GET['op']) {
    // other code...
    case 16: // breakdown
        $cache_items = getCacheItems();
        // Running totals (in bytes) for each known key prefix.
        $variable_sizes = array(
            'search-' => 0,
            'ad-' => 0,
            // etc.
            'other' => 0 // for everything that's left over
        );
        $other = 0;
        foreach ($cache_items['items'] as $server => $slabs) {
            // Assumes $server is a "host:port" string, as used elsewhere in memcache.php.
            list($h, $p) = explode(':', $server);
            foreach ($slabs as $slab_id => $slab) {
                $items = dumpCacheSlab($server, $slab_id, $slab['number']);
                if (empty($items['ITEM'])) {
                    continue;
                }
                foreach ($items['ITEM'] as $key => $item) {
                    // Fetch the item itself to find out its size.
                    $r = sendMemcacheCommand($h, $p, 'get '.$key);
                    if (!isset($r['VALUE'])) {
                        continue;
                    }
                    $size = $r['VALUE'][$key]['stat']['size'];
                    // Add the size to the first prefix the key starts with.
                    $found = false;
                    foreach ($variable_sizes as $total_key => &$total_size) {
                        if (strpos($key, $total_key) === 0) {
                            $total_size += $size;
                            $found = true;
                            break;
                        }
                    }
                    unset($total_size); // drop the reference left behind by the loop
                    if (!$found) {
                        $other += $size;
                    }
                }
            }
        }
        $variable_sizes['other'] += $other;
        // Output the chart: the script header, one row per prefix, then the footer.
        echo <<<EOB
<script type="text/javascript" src="https://www.google.com/jsapi"></script>
<script type="text/javascript">
google.load("visualization", "1", {packages:["corechart"]});
google.setOnLoadCallback(drawChart);
function drawChart() {
    var data = google.visualization.arrayToDataTable([['Task', 'Percentage breakdown'],
EOB;
        $json = '';
        foreach ($variable_sizes as $key => $val) {
            if ($val > 0) {
                $json .= "['".$key."', ".$val."],\n";
            }
        }
        echo rtrim($json, "\n,");
        echo <<<EOB
    ]);
    var options = {
        title: 'Percentage breakdown'
    };
    var chart = new google.visualization.PieChart(document.getElementById('piechart'));
    chart.draw(data, options);
}
</script>
<div id="piechart" style="width: 900px; height: 500px; float: left;"></div>
EOB;
        // Legend explaining what each key prefix holds.
        $meanings = array(
            'ad-' => 'Specific ads',
            'search-' => 'Search results queries',
            // etc.
            'other' => 'Other small random bits of data'
        );
        ?>
<div style="float: left;">
    <h2>Key meanings</h2>
    <table style="border: none;">
        <?php
        $i = 0;
        foreach ($meanings as $key => $meaning) {
            ?>
        <tr<?php if (++$i % 2 == 0) echo ' style="background: #ddd;"'; ?>>
            <td><?php echo $key; ?></td>
            <td><?php echo $meaning; ?></td>
        </tr>
            <?php
        }
        ?>
    </table>
</div>
        <?php
        break;
So, now you should see a new menu option; clicking on it should bring up a nice pie chart, something like the screenshot below (I’ve had to blur out our cache variable names).
20th March, 2013 - Posted by david
On the website I work for, when a user uploads an image for an ad, we generally keep 3 versions of that image, each a different size, simply referred to as ‘small’, ‘main’ or ‘large’. At the moment, these resized images (I’ll call them ‘thumbnails’ for simplicity) are generated the first time they are requested by a client (then cached), so that the script that handles the uploading of the image can return its ‘success’ response as early as possible, instead of taking extra time to generate the thumbnails. What Beanstalkd allows us to do is put a job on a queue (in our instance a ‘generate thumbnails’ job), where it’ll be picked up at some point in the future by another script that polls the queue and executes in its own separate process. So, my uploading script is only delayed by, say, the 0.1 seconds it takes to put a job on the queue, as opposed to the 1 second it takes to execute the job (i.e. generate the thumbnails). This post describes how I got the whole thing to work on an Ubuntu 12.04 server, using PHP.
This post was largely inspired by an article on the blog Context With Style, which was written for a Mac. I’m also going to use their example of a queue filler script to populate the queue and a worker script, to pull jobs from the queue and process them. I recommend you read that post for a better idea.
One other thing: most of these UNIX commands need to be run as root, so I’ll assume you’re in super-user mode.
Beanstalkd
Installing Beanstalkd is pretty straightforward:
apt-get install beanstalkd
We don’t need to start it just yet, but for reference, to run it you can do
beanstalkd -l 127.0.0.1 -p 11300
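With the daemon listening, it can help to see what a client such as Pheanstalk sends over the wire. Beanstalkd speaks a plain-text protocol in which a job is queued with `put <pri> <delay> <ttr> <bytes>` followed by the job body, each terminated by a carriage return and line feed. The sketch below only builds that request; the function name, the example job body and the priority and time-to-run values are arbitrary choices for illustration.

```shell
#!/bin/sh
# beanstalk_put_request: hypothetical helper that prints a beanstalkd "put"
# request for the given job body. Priority 1024, no delay and a 60 second
# time-to-run are arbitrary example values.
beanstalk_put_request() {
    body="$1"
    # The protocol wants the body length in bytes, not characters.
    bytes=$(( $(printf '%s' "$body" | wc -c) ))
    printf 'put 1024 0 60 %d\r\n%s\r\n' "$bytes" "$body"
}

# Example, assuming netcat (nc) is installed and beanstalkd is running on
# the address used above:
#   beanstalk_put_request 'generate-thumbnails:1234' | nc 127.0.0.1 11300
```

A successful put is answered with a line of the form `INSERTED <id>`.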
Pheanstalk
Pheanstalk is a PHP package to interface with a Beanstalkd daemon. I simply downloaded the zip from GitHub and extracted it to a ‘pheanstalk’ folder in my main include folder. Then, to use it, I simply do
require_once 'pheanstalk/pheanstalk_init.php';
// note how we use 'Pheanstalk_Pheanstalk' instead of 'Pheanstalk',
// and how we omit the port in the constructor (as 11300 is the default)
$pheanstalk = new Pheanstalk_Pheanstalk('127.0.0.1');
Going by the example in the Context With Style article, we’ll call the script under the section “Pushing things into the queue” fill_queue.php, and the script in “Picking up things from the queue” worker.php. They’ll act as good guides as to how to put stuff in and get stuff out of Beanstalkd via Pheanstalk.
So, the idea is we’ll have our worker.php running non-stop (via daemontools, see next section), polling the queue for new jobs. Once we know our worker.php is ready, we can manually run fill_queue.php from the command line to populate the queue. The worker should then go through the queue, writing the data it reads to a log file in ./log/worker.txt. There may be some permissions issues here, depending on how the permissions on your project are set up.
Daemontools
First up we need to install daemontools, which is
apt-get install daemontools
You don’t actually interact with a daemontools process; you use programs that begin with ‘sv’, such as svscan or svscanboot. These work by looking in a folder called /etc/service/, which you have to create, and scanning it for project folders you add yourself. Once svscan detects that a project folder has been created in /etc/service, it adds a supervise folder to it; you in turn create a shell script called run in the project folder, which daemontools will run and monitor for you. Don’t worry, all these steps are outlined below!
Anyways, now that we’ve installed daemontools, we need to create a run script for it and then run it, as well as create our /etc/service directory. Some of these tips are thanks to this post.
# create the config file for svscan:
cd /etc/init
touch svscan.conf
# add some commands into it:
echo "start on runlevel [2345]" > svscan.conf
echo "" >> svscan.conf
echo "expect fork" >> svscan.conf
echo "respawn" >> svscan.conf
echo "exec svscanboot" >> svscan.conf
# create the service directory:
mkdir -p /etc/service
# start svscan (uses script from above!):
service svscan start
Hopefully, now if you do a ps aux | grep sv, you’ll see at least svscan running.
Next, I’m going to create my run file, which is a shell script that’ll start Beanstalkd and our worker script. I’ll place this in my example /var/www/my-project folder, along with my worker.php, fill_queue.php and log/worker.txt files. I’ll then create a my-project service folder and symlink my run file into there.
cd /var/www/my-project
touch run
# must be executable:
chmod 755 run
# single quotes, so an interactive bash doesn't treat the ! as history expansion:
echo '#!/bin/sh' > run
# to start beanstalkd process:
echo "beanstalkd -l 127.0.0.1 -p 11300 &" >> run
# to start our worker process:
echo "php /var/www/my-project/worker.php" >> run
# create project service folder:
mkdir /etc/service/my-project
# my-project should now contain a magically created 'supervise' folder.
# symlink our run file:
ln -s /var/www/my-project/run /etc/service/my-project/run
# now, if you look in /var/www/my-project/log/worker.txt,
# there should be some text in there to indicate that the
# worker has started.
# run the fill queue script:
php fill_queue.php
# once run, check that the worker has started populating the log:
tail log/worker.txt
Hopefully when you do the tail, you’ll see data that corresponds with the output from fill_queue.php. This will indicate that your worker is running, polling the queue for new jobs. If you re-run fill_queue.php, your log file should expand accordingly.
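As a variation on the echo lines above, the run file can also be written in one go with a here-document. This is only a sketch: write_run_script is a made-up helper, and using exec so that supervise watches the PHP process directly is my own choice rather than part of the steps above.

```shell
#!/bin/sh
# write_run_script: hypothetical helper that writes a daemontools run file
# for the worker into the given project directory and makes it executable.
write_run_script() {
    project_dir="$1"
    # Unquoted EOF, so $project_dir is expanded while the file is written.
    cat > "$project_dir/run" <<EOF
#!/bin/sh
# Start beanstalkd in the background, then hand over to the worker.
beanstalkd -l 127.0.0.1 -p 11300 &
exec php "$project_dir/worker.php"
EOF
    chmod 755 "$project_dir/run"
}

# Usage, matching the example paths above:
#   write_run_script /var/www/my-project
```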
2nd August, 2012 - Posted by david
So, I recently started a new job as Lead Developer on carsireland.ie and one of the first things I was tasked with was moving the codebase from a simple PC running Linux to the cloud, so that it could be accessed remotely, outside the office. Now, while I do prefer Git, SVN is still reasonably popular, especially with websites older than a few years, hence the CTO wanted to stick with it, for the time being at least! Needless to say, most of the following is best done as root, or at least with sudo privileges. Also, this is done on Ubuntu, hence the use of apt-get.
1. Setting up Apache for HTTPS
Apache was already running on the server, but it had to be enabled for HTTPS. First, you need to generate a self-signed SSL certificate. You’ll be asked for a passphrase; enter one and note it down:
openssl genrsa -des3 -out server.key 2048
openssl req -new -key server.key -out server.csr
openssl x509 -req -days 365 -in server.csr -signkey server.key -out server.crt
Copy the certificate and key to where Apache expects to find them:
cp server.crt /etc/ssl/certs
cp server.key /etc/ssl/private
Enable SSL for Apache:
a2enmod ssl
a2ensite default-ssl
/etc/init.d/apache2 stop; sleep 2; /etc/init.d/apache2 start
# this last step is how I restart Apache.
# I don't trust the 'restart' option. There are probably other/better ways of doing this.
2. SVN
Install SVN and its Apache module:
apt-get install subversion libapache2-svn
Create a new folder for the code (we’ll call the folder ‘svn’) and create the repository in it:
svnadmin create /home/svn
Tell Apache about the repository:
nano /etc/apache2/sites-available/default-ssl
This opens up the pretty simple nano editor. At the bottom of the file, before the final </VirtualHost>, add:
<Location /svn>
    DAV svn
    SVNPath /home/svn
    AuthType Basic
    AuthName "Your repository name"
    AuthUserFile /etc/subversion/passwd
    Require valid-user
</Location>
You may need to enable your SSL site, so if the files /etc/apache2/sites-enabled/000-default-ssl or /etc/apache2/sites-enabled/default-ssl don’t exist, do:
ln -s /etc/apache2/sites-available/default-ssl /etc/apache2/sites-enabled/000-default-ssl
For Apache to be able to read/write to the repository, we need to change its owner to www-data:
chown -R www-data:www-data /home/svn
Next, we need to add some login details for users, i.e. developers (you’ll be asked to enter a password):
htpasswd -c /etc/subversion/passwd user_name
# user_name should correspond with the username of someone you want to have access to the repository.
# The password entered can be different from their normal login password and is used to access the repository at all times.
For subsequent users, drop the -c flag above.
Restart Apache (however you want to do it). Following from above:
/etc/init.d/apache2 stop; sleep 2; /etc/init.d/apache2 start
You should now be able to view the initial empty repository at https://server.location/svn, where ‘server.location’ is either an IP address or a domain, depending on how you’ve set up the server.
If you have an SVN dump of your repository and you want to load it into the new one, you can simply do:
svnadmin load --force-uuid /home/svn < dumpfile
At this point, your SVN server should be up and running and ready to take commits. You may need to play around with the permissions of your /home/svn directories, making certain ones executable/writeable to Apache. If I’ve left anything else out, please let me know in the comments.