
I’ve recently been wrangling with some Perl code for a project at work, and have been putting together a Perl module that includes a number of common functions that I need. As such, I had to remind myself how to create a Perl module. During my initial development, I ran into a number of problems, but I eventually worked through all of them. In the hopes of helping myself remember how to do this, and to help any other burgeoning Perl developers, I’ve written the following little guide. Hopefully it will help shed some light on this subject.
I’ve rolled out some upgrades to the software I use here at the site for my photo albums. Here’s the list of improvements:
I want to add comments eventually, but there’s more effort involved in doing that than I want to spend right now. If you notice any problems, please let me know about them. I’m aware of a minor bug in my rewriting code (no regular users should encounter it), but I hope to fix it in the coming days.
While working on my rewrite of MonkeyAlbum, I ran into an interesting programming dilemma. In the past week or so, I’ve been introduced to the MySQLi extension in PHP. The current MonkeyAlbum implementation makes use of the PHP 4 mysql_*() calls, so I thought I’d try out the MySQLi interface to see how it works.
MySQLi includes support for what are known as “prepared statements” (only available in MySQL 4.1 and later). A prepared statement basically gives you three advantages: (1) SQL logic is separated from the data being supplied, (2) incoming data is sanitized for you which increases security, and (3) performance is increased, since a given statement only needs to be parsed a single time.
It seems to me that the performance benefit can only be seen in situations where the query is executed multiple times (in a loop, for example). In fact, an article on prepared statements confirms this suspicion; the author in fact mentions that prepared statements can be slower for queries executed only once.
So here’s the problem I face: the queries that get executed in MonkeyAlbum are, for the most part, only ever executed once. So, do I make use of prepared statements just to get the security benefit? It doesn’t seem worth it to me, since I can get the same security by escaping all user input (something I already do today). Does someone with more knowledge of this stuff have an opinion? If so, please share it.
In his seminal book The Mythical Man-Month, Fred Brooks devotes an entire chapter to one particular idea in software development: plan to throw one away. Though I failed to complete the planning part of his recommendation, I do intend to throw away the current version of my photo album package, MonkeyAlbum, which I employ here at this website. There are a number of improvements I wish to make to the package:
All of this is in the concept stage at the moment, so what suggestions might you have for improvements? Are there things you dislike about the way I present my photos here at this site? Feel free to comment.
I am in the process of improving the accessibility of my photo album here at the site. My primary goal is to make the alternate text representations of each image something worthwhile, instead of the filename cop-out that I chose a while back. Each image now has an associated alt-text data record, and entering these by hand one at a time (there is no mass-update capability at the moment) is taking quite a while. As of this writing, I have provided alt-text values for six of the ten albums that I have posted. I hope to have them all completed by tomorrow.
You will find that the alternate text for the album thumbnail is rather weak at the moment (’Thumbnail for album XYZ’). This is due to the unfortunate way I constructed the various database tables. I essentially cannot obtain the alternate text record for the album’s thumbnail image, since I store the filename for the thumbnail, not the corresponding image ID. Later on I may improve this text, but I’m going to leave it as is for the moment. The alternate text for each full size image, along with each image’s thumbnail, has been greatly improved, and that was what I set out to do.
While working on my photo album software, I ran into an interesting SQL problem. I wanted to be able to display information about my photo albums, along with the number of images in each album. The problem is that my data is broken up into two tables: an albums table and an images table. My goal was to use exactly one SQL query to access all of the data, including the count of images. And I wanted empty albums (no images) to also show up in the query’s results. But try as I might, I couldn’t get the query to return the data I wanted. I finally found a solution that works, and I present an example below.
Let’s suppose we have two MySQL tables: one that represents directories, and another that represents files. The directories table has the following columns:
And the files table has the following columns:
The Parent_ID field in the files table corresponds to the ID field in the directories table. In order to select both the count of files in each directory, as well as all of the directory information, we do a simple join. But here’s the trick: the order of your tables matters! Here’s the query that works for this scenario:
SELECT d.*, Count(f.ID) AS Count FROM directories d LEFT JOIN files f ON f.Parent_ID = d.ID GROUP BY d.ID
When the tables are reversed in the JOIN, only tables with 1 or more entries show up in the results. What a subtle change! Hopefully someone will find this tip useful. It sure took me a while to get this working.
I would like to expand a little on yesterday’s discussion of the photo album software I am in the process of developing. Specifically, I’d like to talk about the server side changes that I’m planning on making in relation to the Plogger software that I currently use. Suggestions and additional ideas would be super great, so please suggest anything that comes to mind.
Plogger allows users to add images to an album in two ways: either by uploading an image one at a time, or by importing a number of images at once. I never use the upload feature, opting instead to import one or more albums at a time via SFTP. Plogger handles this process via an upload folder on the server side. One may either place images directly in this folder, or create sub-folders to better organize things during the import process. I tend to do the latter step, creating a sub-folder for each album that I create. At import time, Plogger scans the upload folder for items to be added, presenting the user with a list of the folders containing items to be imported. After selecting the desired folder to import, the user is given the opportunity to caption the pictures and move them into an album. Physically, the files get moved from the upload folder into an images folder. The resulting file structure looks something like this:
+ images/
--+ collection_1/
----+ album_1/
------+ image_1.jpg
------+ image_2.jpg
------+ image_3.jpg
----+ album_2/
There are a few problems that I can see with this system. First, thumbnail images aren’t placed in the images folder, but in a thumbs folder instead (which is up at the same level as the images folder). Unlike its sibling, the thumbs folder has absolutely no organization whatsoever. All of the thumbnail images are simply placed in this one folder, ad hoc. I’m not entirely sure what happens if two different images in two separate albums have the same filename. I wouldn’t be surprised if the name collision is not resolved cleanly.
The second issue is that the sub-folders one creates within the upload folder don’t get cleaned up when the images get moved during the import step. As a result, a bunch of outdated, empty folders build up over time. Highly annoying for an obsessive-compulsive organizer like myself.
Finally, what happens if two albums have the same name? Again, I’m not sure that the collision is handled cleanly. The results could be potentially disastrous. Separating two intertwined albums in the database would most likely cause a great deal of headache, and is something I’d rather not have to deal with.
So here are the changes that I am proposing for my new, custom system. First of all, thumbnail images will be placed with each corresponding album, in a nested “thumbs” folder (to keep the root album folder as clean as possible). Second, the upload folder will be properly cleaned out when importing images. Empty albums will be discovered and removed as necessary. Third, albums will be date stamped. For example, if I uploaded an album today using a folder name of “eno_river”, the resulting album name in the images folder would be something like “20060829_eno_river.” This would help prevent name collisions on the album level, and would provide a nice chronological ordering on disk (not that that really matters).
This is how I am planning on proceeding with my new album package. Thoughts? Suggestions? Both are most welcome. This project is still in the early stages of development, and things are still quite malleable.
Update: I want to emphasize that I will not be using the filesystem to do logical organization and naming for each album and image. My album package will use a MySQL database to accomplish this, storing caption data, album data, and EXIF data as necessary. The organization on disk is simply a convenience, to help keep things orderly. Kip makes some good points in the comments in this post, some of which may result in modifications to my current plan.