Handling huge data files in PHP
Archive - Originally posted on "The Horse's Mouth" - 2006-05-04 04:40:57 - Graham Ellis

I've handled files up to 2 GBytes in size with PHP ... but there are a number of issues to consider.
1. The size of the PHP "footprint" in memory - with a huge data file, you cannot simply read it all in with file or file_get_contents (nor with a single fread of the whole file). Instead, you need to iterate through the file in blocks - typically a line at a time with a loop of calls to the fgets function, though I've also worked in 100k blocks. There's a short sketch of this after the list below.
2. You are very likely to hit the fierce time limit that PHP imposes to stop an infinitely looping program from hogging the server. You can solve that one by increasing the limit with the set_time_limit function - again, there's an example after the list.
3. Browsers will also time out (and users of your page get bored) if you're not able to give a response fairly quickly. Options to solve this include sending out a holding page or periodic updates (see the manual page on flush for a discussion of this) and - the way I did it - running my PHP analysis of the huge data file as a command line program rather than through the browser; a sketch of the flush approach follows below too.
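Here's a minimal sketch of the line-at-a-time reading from point 1; the file name ("huge.log") and the line counting are just placeholders for whatever your real analysis does.

<?php
// Minimal sketch - "huge.log" and the line counting are placeholders.
$fh = fopen("huge.log", "r");
if ($fh === false) {
    die("Cannot open data file\n");
}

$lines = 0;
while (($line = fgets($fh)) !== false) {
    // Only this one line is held in memory at a time
    $lines++;
}

fclose($fh);
print "Processed $lines lines\n";
?>

Working in (say) 100k blocks is the same pattern, with fread($fh, 102400) in place of fgets($fh).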
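Raising the time limit (point 2) is a single call near the top of the script; the 600 seconds here is an illustrative figure only, and set_time_limit(0) removes the limit altogether.

<?php
// Allow up to 10 minutes rather than the default 30 seconds
// (600 is illustrative only; 0 would mean "no limit at all").
set_time_limit(600);

// ... long-running analysis of the huge data file goes here ...
?>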
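And for point 3, this is roughly what the flush approach looks like when the job is run through the browser - the 10000-line reporting interval is picked arbitrarily, and if output buffering is switched on you may also need an ob_flush call alongside the flush.

<?php
set_time_limit(0);               // long job - remove the PHP time limit

$fh = fopen("huge.log", "r");    // placeholder file name again
$count = 0;

while (($line = fgets($fh)) !== false) {
    // ... analysis of $line goes here ...
    $count++;
    if ($count % 10000 == 0) {
        print "Processed $count lines so far ...<br />\n";
        flush();                 // push the partial page out to the browser
    }
}

fclose($fh);
print "Done - $count lines in total\n";
?>

Running the same sort of script from the command line (php yourscript.php) sidesteps the browser timeout completely, which is what I ended up doing.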
Looking at the issue more widely, if you do have a huge data file to handle while your users visit your website, it's an excellent idea to preprocess the data as you load it onto the server (or update it), extracting all the information they may need in advance. If that's not going to work for you, put the data into a MySQL database instead - often a much more efficient way of handling huge data that needs regular analysis on the fly. A rough sketch of such a loading script follows.
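As an illustration of the MySQL route, a one-off loading script might look something like this (using mysqli); the database, table, columns, credentials and the tab separated record format are all invented for the example.

<?php
// Sketch of a one-off loader - database, table, columns and credentials are invented.
$db = new mysqli("localhost", "loguser", "secret", "weblogs");
if ($db->connect_error) {
    die("Connection failed: " . $db->connect_error . "\n");
}

$when = "";
$url = "";
$stmt = $db->prepare("INSERT INTO hits (visited, url) VALUES (?, ?)");
$stmt->bind_param("ss", $when, $url);

$fh = fopen("huge.log", "r");
while (($line = fgets($fh)) !== false) {
    $fields = explode("\t", rtrim($line), 2);
    if (count($fields) != 2) {
        continue;                // skip any malformed lines
    }
    list($when, $url) = $fields;
    $stmt->execute();            // insert one record per input line
}

fclose($fh);
$stmt->close();
$db->close();
?>

For a really big one-off load, MySQL's LOAD DATA INFILE is usually a good deal quicker than inserting row by row like this.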