Posts tagged ‘database’

ADODB for PHP and MySQL - a few ‘fun’ facts

I nearly burned up my entire afternoon today troubleshooting a few annoying inconveniences with the ADODB database layer, plus the design decisions of MySQL. Here’s what I’ve found out in a few ‘fun’ facts. This post is reasonably heavy on the technical content, so be warned.


You may be tempted to use the MySQL “REPLACE INTO” function for your own code. A sort of hybrid between INSERT or UPDATE operations, MySQL will seek the primary key in your query and check if it matches a database row. A match results in an UPDATE operation, while no match results in an INSERT query being run. Unfortunately, reading the documentation reveals an enraging drawback:

…except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted.

This is a poor showing from an optimization perspective, but consider a typical business application: we’re not going for any kind of ridiculous efficiency, and the overhead of a DELETE operation is minimal. The real gain achieved by going through a REPLACE statement is on the development side, where no longer are two code paths or separate queries required for add and edit operations.

Except this isn’t the case.

My specific condition was the standard implementation of a users table: a PRIMARY KEY with an auto-incrementing user identifier, username, a few other fields, and a password storage column. Unlike the chumps at MySpace, I don’t store passwords in reversible encryption/plaintext or make a habit of sending the field contents for account reset emails.

By default, an UPDATE operation only changes the requested fields in a table - so if I left out the password column, the contents would stay the same for that user. Yet somehow, even when I didn’t specify a password change for a user, their password field became blank.

Lo and behold: because of the ridiculous DELETE behaviour present behind the scenes in a REPLACE query, you lose all other contents of that row if you don’t store and specify their values again.

Strike one for REPLACE. The fact that silent data loss can occur is what I’d consider a severity one problem, and if you’re planning on using REPLACE in one of your own programs, ensure that you’ve retrieved all columns from that table first. Not only is it weak from a performance standpoint, but its ability to nuke perfectly legitimate hashed data makes it dangerous.


The second such entertaining point was what happened when I tried to use ADODB’s “AutoExecute” function. This method provides a database-agnostic call to run an insert or update statement based simply on a key/value array:

AutoExecute($table, $arrFields, $mode, $where=false, $forceUpdate=true,$magicq=false)

AutoExecute() inserts or updates $table given an array of $arrFields, where the keys are the field names and the array values are the field values to store. Note that there is some overhead because the table is first queried to extract key information before the SQL is generated. We generate an INSERT or UPDATE based on $mode (see below).

This seems like an incredibly lazy and awesome feature when you think about it: the advantages of the REPLACE INTO statement, with only one minor if/then code block based on insertion or update. However, in the infinite wisdom of ADODB, all strings end up being quoted automatically. This means that you can’t use CURRENT_TIMESTAMP to update columns, nor can you easily use DBTimeStamp since the result of that function has already been escaped. For some reason, ”2008-09-24 20:36” won’t parse inside SQL.

As a result, I’ve dropped back to writing my own parameterized queries for this application. This provides the sanitization features of escaping potentially malicious user input, while preserving the ability to run timestamp updates for rows.

Thoughts? How would you go about ensuring an optimal database interaction?

On the eve of a very difficult exam

Tomorrow I write my database exam. Now, some of you who know a thing or two about me know that:

  • I’m not half bad at using database driven applications such as forums, WordPress, etc
  • I’ve actually coded a weaksauce CMS before using MySQL

Now, taking that into account - one would assume this course would be a breeze for me. Not the case dear sirs or madams. While the assignments and even the project were relatively easy - pulling me marks between 75-100 percent - both midterms managed to completely ruin me, as I failed them both. How, you might be asking yourself, is this possible? I can’t tell you any names, but it is through a certain person that this has come to pass. A certain person who is employed by a certain university, who teaches a certain course.

Certainly.

This man/woman [I'm not narrowing it down anymore] can not teach worth shit, and they are very aware of it. As such, the course materials - including assignments, midterms, and projects - are allegedly taken almost verbatim from the website for a database course offered in Hong Kong. However, this professor was slightly ashamed of their blatant thievery - so they attempted to change the wording on our final project. Instead of doing so, they managed to horribly muddle the task’s description and cause severe confusion as to how the implementation was to be handled. Magically I ended up with my lowest passing mark in the course - the 75 - from that project.

So I have faced for the last few days, the task of studying for this exam which is sure to test me on things that my professor did not teach [though it should not be implied that my professor actually taught something else, they more so tried and failed miserably to teach something else]. As such, I’ve spent most of my time reading the textbook for the course, and hoping that somehow I’ll have achieved enough of a broad knowledge of the book to scrape by with a pass in the course.

Now that really won’t happen, but I’ll post in the comments once I finish failing the exam to fill you in. So here’s to bell-curves; saving my ass since ‘05.