GetID3 analyze() function new file size parameter

You can now read ID3 (media file headers) information from mp3 and other media files using GetID3 library without having the entire media file present. The new 2nd parameter to the analyze() member function allows you to detect play time duration with only a small portion of the file present.

Years ago I added this code to the versions of the getid3 library we packaged with the Blubrry PowerPress podcasting plugin. I’ve submitted this code to the getid3 project so everyone can benefit. As of GetID3 v1.9.10, you can now pass a second optional parameter specifying the total file size. This parameter sets the file size value in the getid3 object, skipping the need for the library to call the filesize() function.

This is the secret sauce that allows PowerPress to detect the file size and duration information from a media URL of any size in only a few seconds.

Requirements

The new parameter only works if the following are true:

  • Have enough of the beginning of the media file that includes all of the ID3 header information. For a typical mp3 the first 1MB should suffice, though if there is a large image within your ID3 tags then you may need more than 1MB.
  • Have the total file size in bytes.
  • The mp3 file is using a constant bit rate. This must be true for podcasting, and is highly recommended if the media is to be played within web browsers. Please read this page for details regarding VBR and podcasting.

Example Usage

// First 1MB of episode-1.mp3 that is 32,540,576 bytes
// (approximately 32MB)
$media_first1mb = '/tmp/episode-1-partial.mp3';
$media_file_size = 32540576;
$getID3 = new getID3;
$FileInfo = $getID3->analyze( $media_first1mb, $media_file_size );

You can use an HTTP/1.1 byte range request to download the first 1MB of a media file, as well as an HTTP HEAD request to get the complete file length (file size in bytes).
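As a rough illustration of those two requests, here is a minimal sketch using PHP's cURL extension. The helper names (parse_content_length, fetch_remote_size, fetch_first_bytes) are made up for this example and are not part of getID3 or PowerPress; it also assumes the server reports a Content-Length header and honors Range requests.

```php
<?php
// Sketch only: these helper names are hypothetical, not part of getID3.

/**
 * Pull the last Content-Length value out of raw response headers.
 * The last match is used because followed redirects yield several header blocks.
 */
function parse_content_length(string $rawHeaders): ?int
{
    if (preg_match_all('/^Content-Length:\s*(\d+)\s*$/mi', $rawHeaders, $m)) {
        return (int) end($m[1]);
    }
    return null;
}

/** HEAD request: fetch headers only and read the total size in bytes. */
function fetch_remote_size(string $url): ?int
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true, // send HEAD instead of GET
        CURLOPT_HEADER         => true, // include headers in the returned string
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
    ]);
    $headers = curl_exec($ch);
    return is_string($headers) ? parse_content_length($headers) : null;
}

/** Byte range request: save only the first $bytes bytes of $url to $dest. */
function fetch_first_bytes(string $url, string $dest, int $bytes = 1048576): bool
{
    $fp = fopen($dest, 'wb');
    if ($fp === false) {
        return false;
    }
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RANGE          => '0-' . ($bytes - 1), // "Range: bytes=0-1048575"
        CURLOPT_FILE           => $fp,
        CURLOPT_FOLLOWLOCATION => true,
    ]);
    $ok     = curl_exec($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    fclose($fp);
    // 206 Partial Content means the server honored the range.
    return $ok !== false && $status === 206;
}

// Possible usage (the URL is a placeholder):
// $size = fetch_remote_size('http://www.example.com/episode-1.mp3');
// if ($size !== null && fetch_first_bytes('http://www.example.com/episode-1.mp3', '/tmp/episode-1-partial.mp3')) {
//     $getID3   = new getID3;
//     $FileInfo = $getID3->analyze('/tmp/episode-1-partial.mp3', $size);
// }
```

Note that a server which ignores the Range header will answer 200 and send the whole file, so this sketch treats anything other than 206 as a failure.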

Byte range requests and HEAD requests are safe to use for podcasting. If a service does not allow HEAD requests or accept byte range requests, then it will have bigger issues to deal with, as these features are required by iTunes.

Blubrry PowerPress podcasting plugin has been using this logic to detect mp3 (audio/mpeg), m4a (audio/x-m4a), mp4 (video/mp4), m4v (video/x-m4v), oga (audio/ogg) media since 2008.

Not all media formats support this option. You should test any format not mentioned above. For example, ogg Vorbis audio works, ogg Speex audio does not.

The Web is getting Sloppy, Why is it a Problem and Whose Fault is it?

Lately I’ve observed a few web sites that have been “re-launched”. They range from a small-time blogger and a popular car forum to a very large automotive vendor and even Google!

First, let me define what I mean by “The web is getting sloppy”. Essentially the Internet is organized by domain names, for example www.google.com. Each domain name or web site is then organized by paths and files. Between the latest additions of domain name combinations (e.g. example.cc) and the past couple of years of sloppily organized paths and files on these web sites, the web is becoming a really big mess.

Google.com

So how does this affect you? Well, let’s start from the top down and discuss Google’s latest sloppy web site changes. The new iGoogle start pages are now available, with a fresh new look meant to make them easier to use. The problem is, they did not replicate the past behavior. For example, with the old iGoogle pages, you could middle click on the Inbox link in the iGoogle Gmail widget and it would open your Gmail account in a new tab in your web browser. The new iGoogle Gmail widget doesn’t do this. Why is this sloppy, you ask? Because when I middle click, it opens a new window with a JavaScript error. Google should know better. They could even add JavaScript logic to capture the middle click and cancel it; the 1 line of code looks like this: event.preventDefault(); Sloppy!

For Google to make such a simple mistake, it really shows that anyone is susceptible to web sloppiness.

YearOne.com

Let’s look at a popular automotive parts vendor, YearOne.com. They recently launched a new web site, which is great! Unfortunately, no attempt was made to route the old paths and pages to the new paths and pages. This means that the links loyal YearOne customers have spent years posting to their favorite products on YearOne.com are now wasted. Bloggers call these links “gold” for a reason: they bring new visitors to your site on a continual basis, usually in situations where traditional means of attracting those visitors (such as advertising) are not effective. When the guy down the street recommended a steering wheel 3 years ago on your favorite Chevy forum, you take the recommendation seriously. Well, now that YearOne.com has failed to correctly redirect the old pages to the new ones, that coveted potential customer traffic is lost.

In the case of YearOne’s problem, this is something that could be solved in 1-2 days with some basic script writing and access to the old and new databases. 2 days of a programmer’s labor is definitely worth it to keep these potential customers coming and buying your products!

Unnamed Car Enthusiast Forum

I absolutely love this site, but it recently violated a number of cardinal rules, such as moving forums to different folder paths on the server and using capital letters in URLs. The forum was moved from www.example.com/smf/ to www.example.com/SOMETHING-ELSE/ and a not so friendly message is now present on the old forum with a link to the new forum. To add insult to injury, the link is just text on a page; it’s not even surrounded with the necessary HTML to make it clickable. The owner of this site missed an opportunity when I offered to help him fix the problem for free. A simple PHP script that automatically redirects traffic from the old forum to the new one would let him keep the traffic that still arrives at the old address, while also making sure old links to forum topics land on the right topics on the new forum. The CAPITAL LETTER folder name is no big deal really, but if search engine optimization techniques were ever applied, the folder really should be called “forum”.
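To give an idea of how small such a redirect script can be, here is a minimal sketch. The folder names, the function name and the assumption that topic URLs keep the same path and query string under the new folder are all mine for illustration; the real forum may need a lookup table instead.

```php
<?php
// Sketch only: folder names and map_old_forum_uri() are hypothetical.
// Assumes everything after the forum folder is unchanged on the new forum.

/**
 * Map a request URI under the old forum folder to the new folder.
 * Returns null when the URI is not under the old folder.
 */
function map_old_forum_uri(string $requestUri, string $oldBase = '/smf/', string $newBase = '/forum/'): ?string
{
    // Case-insensitive prefix check, so /SMF/ and /smf/ are both caught.
    if (strncasecmp($requestUri, $oldBase, strlen($oldBase)) !== 0) {
        return null;
    }
    return $newBase . substr($requestUri, strlen($oldBase));
}

// Only send the redirect when running under a web server.
if (PHP_SAPI !== 'cli' && isset($_SERVER['REQUEST_URI'])) {
    $target = map_old_forum_uri($_SERVER['REQUEST_URI']);
    if ($target !== null) {
        // 301 tells browsers and search engines the move is permanent.
        header('Location: http://www.example.com' . $target, true, 301);
        exit;
    }
}
```

The web server would still need to hand requests for the old folder to this script, for example with a rewrite rule.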

Capital letters are frowned upon in web development. When URLs are typed in manually, the possibility of error increases when someone has to remember to hold down shift. Furthermore, on Linux and Unix based servers, you can have separate folders with the same name, since a folder name with capital letters is treated as different from its lower case equivalent.

In this case, I fixed the problem for myself by writing my own GreaseMonkey script which automatically redirects links to the old forum to the new one. My script also rewrites links that may appear on Google to print page versions of the forum so they go to the normal readable versions.

Every Day Bloggers

So this is where I will definitely feel bad calling someone out specifically, and luckily the problem is so common I don’t have to anyway. The biggest thing I see is bloggers trying too hard with their sites, injecting every little widget and gadget into their pages till you can’t even tell what was written by the blogger from what is an advertisement. If you take yourself seriously as a blogger, keep your sidebar clean, limit the number of images you put in your blog posts and don’t overdo your site navigation. And whatever you do, don’t move sites around like checkers. If you don’t have the technical knowledge to move a database, reconfigure settings and perform 301 permanent redirects, you have no business moving sites. Hire someone who knows what they’re doing or leave it as is.

The latest generation of bloggers are unaware of the importance of their blog’s feed URL. Whatever you do, treat this as the keys to the castle! If you change this URL in any way, there will be consequences; even a properly redirected feed can lead to lost readership and subscribers. Think of your feed as your postal mailbox. You don’t put the mailbox in the back yard and you certainly don’t move it around your front yard either. Once you have a place for your feed, keep it there and never move it!

Whose fault is it?

I don’t think it’s any one person’s fault. We’re now seeing a new generation of web sites led by a new generation of web developers who are green, learning from the same mistakes that my generation had to learn from. Unfortunately, in a web world driven by advertising revenue and sales, even one lost web visitor could mean the difference between gaining and losing a great customer.

I’ve been developing web sites professionally since June of 2000. If you need a web developer who takes details like these seriously, contact me at www.mandato.com.

Integrating PHP Command Line Scripts with Existing Web Projects

After reading the post on Johan Mares’ site about the PHP command line interface, I thought I would share the details of how I’ve been using the PHP cli for some of my web based applications.

First, some of my web apps have multiple configuration files, and the one to load is determined by the $_SERVER['HTTP_HOST'] value: if( stripos($_SERVER['HTTP_HOST'], 'compiledweekly') !== false ) { // then I load the compiledweekly config file }. So with that in mind, I had to add the following line to the top of my cli scripts:

$_SERVER['HTTP_HOST'] = 'compiledweekly.com';

This required me to edit the command line script every time I used it with another site. Here’s the trick: pass arguments to your command line so your script can parse them. Here’s how I did it to also include a verbose mode.

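	$Verbose = false; // Default, so the check below never reads an undefined variable

	if( count($argv) > 1 )
	{
		for( $x = 1; $x < count($argv); $x++ )
		{
			switch($argv[$x])
			{
				case '--verbose': { // Print results to std out
					$Verbose = true;
				}; break;
				case '--host': { // The next argument is the host name to use
					if( isset($argv[$x+1]) )
					{
						$_SERVER['HTTP_HOST'] = trim($argv[$x+1]);
						$x++; // Skip over the host value we just consumed
					}
				}; break;
			}
		}
	}

	if( $Verbose ) echo "Starting script...\n\n";
	// Continue with your script below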

So with the following example I can run my script for my compiledweekly.com site with verbose information. Example:

/path/to/script.php --verbose --host compiledweekly.com

Now if you use your php script in a cron task, don't include the --verbose and make sure you check the $Verbose flag before printing any results. Don't forget to add to the end of the command line " > /dev/null 2>&1" minus the quotes, this sends any std out and std error messages to a black hole.

If your command line script is saved in a web accessible folder, here's a line you should add to the top so no web browser can execute your script:

if( php_sapi_name() != 'cli' ) die('Access denied.');

This will take your command line apps to a new level while giving you the ability to use existing web code.

PHP Function HTTP Status Code Value as String

I’ve been working with the php CURL library and found that it would not return an error if the server returned a 500 error. After looking up 3 different status codes that I wasn’t very familiar with, I created the following function. It is very complete and includes additional WebDAV, Apache and Microsoft codes.

function http_status_code_string($code, $include_code=false)
{
	// Source: http://en.wikipedia.org/wiki/List_of_HTTP_status_codes

	switch( $code )
	{
		// 1xx Informational
		case 100: $string = 'Continue'; break;
		case 101: $string = 'Switching Protocols'; break;
		case 102: $string = 'Processing'; break; // WebDAV
		case 122: $string = 'Request-URI too long'; break; // Microsoft

		// 2xx Success
		case 200: $string = 'OK'; break;
		case 201: $string = 'Created'; break;
		case 202: $string = 'Accepted'; break;
		case 203: $string = 'Non-Authoritative Information'; break; // HTTP/1.1
		case 204: $string = 'No Content'; break;
		case 205: $string = 'Reset Content'; break;
		case 206: $string = 'Partial Content'; break;
		case 207: $string = 'Multi-Status'; break; // WebDAV

		// 3xx Redirection
		case 300: $string = 'Multiple Choices'; break;
		case 301: $string = 'Moved Permanently'; break;
		case 302: $string = 'Found'; break;
		case 303: $string = 'See Other'; break; //HTTP/1.1
		case 304: $string = 'Not Modified'; break;
		case 305: $string = 'Use Proxy'; break; // HTTP/1.1
		case 306: $string = 'Switch Proxy'; break; // Deprecated
		case 307: $string = 'Temporary Redirect'; break; // HTTP/1.1

		// 4xx Client Error
		case 400: $string = 'Bad Request'; break;
		case 401: $string = 'Unauthorized'; break;
		case 402: $string = 'Payment Required'; break;
		case 403: $string = 'Forbidden'; break;
		case 404: $string = 'Not Found'; break;
		case 405: $string = 'Method Not Allowed'; break;
		case 406: $string = 'Not Acceptable'; break;
		case 407: $string = 'Proxy Authentication Required'; break;
		case 408: $string = 'Request Timeout'; break;
		case 409: $string = 'Conflict'; break;
		case 410: $string = 'Gone'; break;
		case 411: $string = 'Length Required'; break;
		case 412: $string = 'Precondition Failed'; break;
		case 413: $string = 'Request Entity Too Large'; break;
		case 414: $string = 'Request-URI Too Long'; break;
		case 415: $string = 'Unsupported Media Type'; break;
		case 416: $string = 'Requested Range Not Satisfiable'; break;
		case 417: $string = 'Expectation Failed'; break;
		case 422: $string = 'Unprocessable Entity'; break; // WebDAV
		case 423: $string = 'Locked'; break; // WebDAV
		case 424: $string = 'Failed Dependency'; break; // WebDAV
		case 425: $string = 'Unordered Collection'; break; // WebDAV
		case 426: $string = 'Upgrade Required'; break;
		case 449: $string = 'Retry With'; break; // Microsoft
		case 450: $string = 'Blocked'; break; // Microsoft

		// 5xx Server Error
		case 500: $string = 'Internal Server Error'; break;
		case 501: $string = 'Not Implemented'; break;
		case 502: $string = 'Bad Gateway'; break;
		case 503: $string = 'Service Unavailable'; break;
		case 504: $string = 'Gateway Timeout'; break;
		case 505: $string = 'HTTP Version Not Supported'; break;
		case 506: $string = 'Variant Also Negotiates'; break;
		case 507: $string = 'Insufficient Storage'; break; // WebDAV
		case 509: $string = 'Bandwidth Limit Exceeded'; break; // Apache
		case 510: $string = 'Not Extended'; break;

		// Unknown code:
		default: $string = 'Unknown';  break;
	}
	if( $include_code )
		return $code . ' '.$string;
	return $string;
}

HTTP Code values are taken from the Wikipedia entry found here: http://en.wikipedia.org/wiki/List_of_HTTP_status_codes
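For the cURL behavior mentioned at the top of this post, the status code can be read with curl_getinfo() after the request finishes. Below is a minimal sketch; the function names fetch_with_status and is_http_error are made up for this example.

```php
<?php
// Sketch only: fetch_with_status() and is_http_error() are hypothetical names.

/** True for 4xx (client error) and 5xx (server error) status codes. */
function is_http_error(int $code): bool
{
    return $code >= 400 && $code <= 599;
}

/** Returns array( body string or false, HTTP status code ). */
function fetch_with_status(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    // Alternative: setting CURLOPT_FAILONERROR to true makes curl_exec()
    // return false when the status code is 400 or higher.
    $body = curl_exec($ch);
    $code = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return [$body, $code];
}

// Possible usage, combined with the function above (URL is a placeholder):
// list($body, $code) = fetch_with_status('http://www.example.com/');
// if (is_http_error($code)) echo http_status_code_string($code, true);
```

Without CURLOPT_FAILONERROR, curl_exec() treats a 500 response as a successful transfer, which is why the status code has to be checked separately.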

CW10 – May 22, 2008 – Zend PHP 5 Certification Study Guide and Open Flash Charts

Angelo discusses the Zend PHP 5 Certification Study Guide 2nd edition and explains how to implement Open Flash Charts into your PHP project.

Don’t forget to E-mail comments and suggestions to compiledweekly AT gmail.com.

Links:

Download This Episode

Columbus PHP Meetup tonight – The Art of SQL Tuning for MySQL

If you’ve been following my Twitter (@AngeloMandato) lately, you may have heard me mention previous Columbus PHP Meetups. These meetups are great for meeting fellow PHP programmers in the Columbus area and a great way to learn about different libraries, techniques and frameworks that are available.

Columbus PHP Meetup web site: http://php.meetup.com/93/

Tonight’s meetup topic is “The Art of SQL Tuning for MySQL” presented by Jay Pipes from MySQL. I can’t wait to attend this meetup and gain some insight into how to tune MySQL. Ever since I started my career, I’ve encountered many issues with server load and/or query time due to poorly written queries. I think I’ve done a decent job deploying indexes, grouping like queries together, etc… but I know there is more to learn.

The past two Columbus PHP Meetups covered the Zend Framework and CakePHP. Both were great presentations.

The Zend Framework presentation from February was very informative. The Zend Framework was written in a way that the developer can decide how much he/she wants to use from the framework. This makes it possible to easily add the Zend Framework to an existing project. I think the word framework may not be the best word to describe it though, perhaps it should be called library and framework. Many parts of the Zend Framework are really just libraries to help with things like email, XML-RPC, OpenID, Flickr, Amazon, etc… I now plan on using parts of the Zend Framework in some of my projects.

I learned a lot from the CakePHP presentation from March as well. CakePHP is definitely a “framework”, with all of the university-taught thinking about object oriented programming and separating presentation from logic built in. What I found interesting is that CakePHP takes a somewhat Ruby on Rails-like approach to managing your SQL queries. I think this type of development is fine for small to medium size projects, but for anything where you need full control of the queries or presentation you may find yourself feeling restricted. The presentation side of things reminds me of the Smarty Template Engine; my past experience with Smarty started out great but ended in frustration that I couldn’t add the logic I wanted at the presentation level.

I would like to learn more about CodeIgniter. CodeIgniter is the application framework that Joe used for developing the registration system for PodCamp Ohio.

In related news, I purchased a copy of the Zend PHP 5 Certification Study Guide. I own a copy of the Zend PHP 4 Certification Study Guide and loved the book till the pages started falling out. It is not just for those who want to be certified in PHP; the content is perfect for a developer who already knows how to program but just wants a reference for the language. You should already have some background in C/C++/Java/PHP before you read this book though. I’m as pleased with this edition as I was with the first one. I think I may order the Guide to Programming with Zend Framework next.

So are you attending PHP meetups in your area? If so, what sorts of things are you learning?

Line-bar graphs and Pie charts for your web application

If you ever needed to display reports of information in a visual way in your web application then you’ll appreciate Open Flash Chart.

This Flash based charting library has everything: line graphs, bar graphs, pie charts, mixed line/bar graphs and more, with the ability to add hovers, custom colors, sizes and web links. The quality of these charts is remarkable. If you have ever used Google Analytics, these charts and graphs match, if not surpass, its charts in quality.

PHP code to format Program Name for 1-click zune subscription

I added the Zune 1-click subscription option to the RawVoice properties at www.blubrry.com and www.techpodcasts.com. I ran into an issue where show titles that contain special characters, such as quotes, would break the rest of the web page.

Below is the solution I implemented for the problem.

Fix

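<a href="<?php
// Strip everything except letters, digits and spaces from the show title.
// (preg_replace is used here because ereg_replace was removed in PHP 7.)
$title_for_zune = preg_replace('/[^A-Za-z0-9 ]/', '', $show_title);
$title_for_zune = str_replace(' ', '_', $title_for_zune);
echo 'zune://subscribe/?'.$title_for_zune.'='.$feed;
?>" title="Add to Zune">Add to Zune</a>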

Having spaces in the URL does not validate, so I replace spaces with the _ character. If you are not worried about HTML validation, you can remove the str_replace function call.
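The same logic can be wrapped in a small function, which also makes it easy to escape the finished link for use inside an HTML attribute. This is only a sketch; zune_subscribe_href is a name I made up for the example, and the extra htmlspecialchars() call is my addition for feed URLs that contain an ampersand.

```php
<?php
// Sketch only: zune_subscribe_href() is a hypothetical helper name.

/** Build the value for the href attribute of an "Add to Zune" link. */
function zune_subscribe_href(string $showTitle, string $feedUrl): string
{
    // Keep only letters, digits and spaces, then turn spaces into underscores.
    $title = preg_replace('/[^A-Za-z0-9 ]/', '', $showTitle);
    $title = str_replace(' ', '_', $title);

    // Escape for an HTML attribute, e.g. & in the feed URL becomes &amp;
    return htmlspecialchars('zune://subscribe/?' . $title . '=' . $feedUrl, ENT_QUOTES, 'UTF-8');
}

// Possible usage inside a template:
// <a href="<?php echo zune_subscribe_href($show_title, $feed); ?>" title="Add to Zune">Add to Zune</a>
```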

I am not sure who came up with this protocol for the add to Zune 1-click subscription, as it does not appear to be very well thought out. Most other sites like Digg only require the feed url, which will never contain characters that could break your web pages.  As Jason Van Orden points out, the format is more complex than it needs to be.

Source for Lighttpd mod_redirect rewrite module to use status code 302

Lighttpd web server, also known as Lighty, is an excellent web server and has the potential to replace Apache completely. I am slowly migrating web sites that use feature specific settings in Apache to Lighty. A few months ago I ran into a problem with Lighty’s ModRewrite alternative for rewriting URLs. Lighty uses two separate modules to handle internal rewrites and Location: redirects. The redirect module uses the common HTTP 301 Moved Permanently status code. For most circumstances this works well, but in some cases the application may require that the redirect only be temporary and return the HTTP 302 Found status code. Instead of modifying the mod_redirect.c source and changing the http_status code value from 301 to 302, I added new code to support a new url.redirect parameter, url.redirect-found.

I’ve posted the source to the Lighttpd bug tracking system in hopes it will be added to a future version of Lighty.  http://trac.lighttpd.net/trac/ticket/1446

This addition should help the Lighty web server to be capable of handling the appropriate HTTP status codes for all situations that may arise for the web site in question.

Lighty Web Server Fast with static pages

Three months ago I started looking for an alternative web server to serve URL redirects. The need arose when I found that Apache web server would consume a lot of system memory when I tested simulated traffic spikes against the server. Apache could handle between 1,200 and 1,700 requests a second. Though the number of requests per second was satisfactory, the memory usage during these simulated spikes was concerning.

I did some research and came across Lighttpd web server, also known as Lighty. Lighty took some time to figure out, but once I did I found the XML style configuration files were not hard to implement and understand. I did find the rewriting to be rather limited in comparison to the mod_rewrite module found in Apache. Nevertheless, I was able to duplicate in Lighty the rewrite that I had in Apache. For my application, I did have to modify the Lighty source code so that redirects returned a 302 HTTP response (it defaulted to 301 without any way of changing this in the configuration files).

After performing similar tests with the same server configured with Lighty, I found that Lighty could handle between 3,900 and 4,100 requests per second. On top of this, memory usage dropped to only a fraction of the total memory available on the server. Processor usage did increase, but not by enough to outweigh these gains.

I am currently experimenting with combining Lighty with Apache services on one server in order to utilize the best of both worlds.  See my post on the lighttpd.net forums: http://forum.lighttpd.net/topic/13830

Lighty may also be able to serve dynamic PHP files using FastCGI faster than Apache. I am still concerned that PHP will not function correctly since it is not multi-threaded friendly. I also have security concerns based on what I’ve seen with source code being exposed on a popular web site recently, and I am not ready to take on that much risk.