Re-engineering an application

Recently I found myself trying to debug an Active Server Pages (ASP) application. It appears to be a simple application. When you go to the page, the server generates a text file which I call a data feed. Search engines use it to build links to your products. The final step in the process is to download the data feed file and then upload it to the search engine site. This is such a simple application that you could have programmed it in a variety of languages without much effort or concern. The original developer chose to develop the application as an active server page. ASP would not have been my first choice, primarily because programming it in SQL is a much simpler solution. In SQL the solution is so simple and straightforward that it approaches the holy grail of computer programming: self-documenting code.

I got involved with re-engineering the ASP application because it was not working anymore. The page was not displaying and there were no error messages. By definition, applications are no longer simple if they fail and do not produce an easy-to-understand error message. I suspected that the error might be related to a “response buffer limit exceeded” issue, so I increased the buffer limit. This worked on the development system but had no effect on the production system. That is not good! Now I was going down a path I did not want to take: fiddling with IIS parameters on a production system while trying to fix a problem. Since I am definitely “old school” and evidently SQL-centric, I decided to turn this into a batch operation and skip the human download/upload process altogether. My plan was to schedule a SQL job to write the data feeds into files and then use FTP to upload the feeds to their respective search engine sites.

I originally thought I would have this task finished in a day or two. Boy, was I wrong! The combination of ASP, XML, XSL, and SQL stored procedures spread the processing across various places and made it difficult to follow. Of course there wasn’t any program documentation, and the original programmer was unavailable. My plan was to combine everything into a SQL view that either BCP or OSQL would use to create a tab-delimited file. Using BCP, I can use the ultra-simple “Select *” query on the view.

The first big problem was to create the category field. I needed to recursively look up the category parent from a table of categories. This process was originally performed in ASP. After some effort I created a SQL table to mimic the process.
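To illustrate the parent lookup itself (this is not the ASP or SQL code from the project), here is a minimal Python sketch. The category table, its ids and names, and the " > " separator are all made up for the example; the real feed may want a different format.

```python
# Hypothetical category table: id -> (name, parent_id).
# A parent_id of None marks a top-level category.
CATEGORIES = {
    1: ("Home", None),
    2: ("Garden", 1),
    3: ("Tools", 2),
}


def category_path(category_id, categories, separator=" > "):
    """Walk from a category up through its parents and return the full path."""
    names = []
    seen = set()
    current = category_id
    while current is not None:
        if current in seen:
            # Guard against a category that is (indirectly) its own parent.
            raise ValueError("cycle in category table at id %r" % current)
        seen.add(current)
        name, parent_id = categories[current]
        names.append(name)
        current = parent_id
    # Names were collected leaf-first, so reverse them for display.
    return separator.join(reversed(names))


print(category_path(3, CATEGORIES))
```

Precomputing these paths once and storing them in a table is one way to keep the view itself a plain join.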

The next problems came in rapid succession. The description field needed the HTML tags removed and some HTML entities needed to be escaped. Then I found that some products were being listed in multiple categories and the category being used by my view was a defunct category.
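The description cleanup can be sketched in a few lines of Python. This is only an illustration under my own assumptions: it drops every tag, turns entities such as `&amp;` back into plain characters, and collapses whitespace (including tabs, which would break a tab-delimited feed). The function and class names are hypothetical, and a feed that expects entities to stay encoded would need `html.escape` applied afterwards.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text between tags; entities are decoded by the parser."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def clean_description(html_text):
    """Return html_text with tags removed and whitespace collapsed to single spaces."""
    parser = _TextExtractor()
    parser.feed(html_text)
    parser.close()
    # Joining with a space keeps words apart where a tag such as <br> separated them,
    # at the cost of also splitting text around inline tags like <b>.
    return " ".join(" ".join(parser.parts).split())


print(clean_description("<p>Fast &amp; light<br>drill</p>"))
```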

One of the nice benefits of using the “SQL View” approach was that it was easy to test and verify. I also had a backup plan if the batch process failed for some reason. Although I briefly tried OSQL, I found that BCP had a more direct way of creating tab-delimited files. Since it only takes a minute and a half to create the four feeds, processing requirements are not an issue. Once I had copied the headers to the front of the file, I was good to go. I matched the data using WinMerge on the development system, since the ASP screen still worked on it.
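The header step is easy to script. The sketch below is one possible way to do it in Python, not the method used here: it writes a tab-delimited header line and then copies the exported data behind it byte for byte, so the export's own encoding is left alone. The column names, file names, and the CRLF line ending are assumptions for the example.

```python
import shutil


def prepend_header(columns, data_path, output_path):
    """Write a tab-delimited header row, then append the exported data unchanged."""
    header = "\t".join(columns) + "\r\n"
    with open(output_path, "wb") as out:
        # Assumes the column names are plain ASCII.
        out.write(header.encode("ascii"))
        with open(data_path, "rb") as data:
            shutil.copyfileobj(data, out)


# Example call with hypothetical names:
# prepend_header(["id", "name", "price"], "feed_body.txt", "feed.txt")
```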

The data matched and now I am ready to submit the files. This minor re-engineering took a lot more time than I planned, but I think the process is very easy to explain.

The next problems were more annoying. There were permission problems with running BCP. Yahoo created FTP problems for me. They allow you to update files using FTP, but your FTP client had better support PASV. I was able to upload the file using FileZilla but not Microsoft FTP. I am searching for a command-line FTP client I can use. I think MOVEit Freely from Ipswitch might be the answer. Ipswitch is probably best known for WS_FTP. A few years back WS_FTP was the standard-bearer for FTP clients and servers.
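A small script is another way to get a passive-mode upload into a batch job. The following is a sketch using Python's standard `ftplib`, not something I have run against Yahoo; the host, login, and file names in the comments are placeholders.

```python
from ftplib import FTP


def upload_feed(ftp, local_path, remote_name):
    """Upload one feed file over an already logged-in FTP connection using PASV."""
    # Passive mode: the client opens the data connection, which tends to
    # work better through firewalls and NAT than active mode.
    ftp.set_pasv(True)
    with open(local_path, "rb") as feed:
        ftp.storbinary("STOR " + remote_name, feed)


# Example use (placeholder host and credentials):
# ftp = FTP("ftp.example.com")
# ftp.login("user", "password")
# upload_feed(ftp, "feed.txt", "feed.txt")
# ftp.quit()
```

Passing the connection in as a parameter keeps the login details out of the function and lets the same call upload several feeds in one session.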

Finally, I am not sure what happened to MSN’s product upload page, http://productupload.live.com. Suffice it to say it has had major problems every time I tried to use it. At this time I am not sure MSN wants me to update the data feeds using FTP. It is too bad they are so difficult to use. Most of our traffic comes from Google and Yahoo. Not surprisingly, they get the bulk of our advertising expenses. MSN has always been a distant third.