Some Interesting Metrics on PHP-vs-ASP-vs-[etc.]

Due in part to the fact that our mailing list software rejects Google search URLs as spam, and also because it’s potentially of interest to someone out there, here’s a copy of an email I tried to send to Richard Heyes on the PHP General mailing list regarding market penetration and usage of PHP versus other languages.

Keep in mind that it was all typed into an email body, and is in its original form. It hasn’t been edited or cleaned up or anything, so don’t blame me if there are errors. Blame Richard. Why? Beats the hell out of me, just do it.

On Sun, Feb 8, 2009 at 09:35, Richard Heyes wrote:
> Hi,
> Can anyone point out some general statistics on PHP usage compared to
> other server languages? I’ve tried Netcraft, but they only appear (or
> I’ve only found) to have statistics on the httpd server used.


Use Google’s ‘filetype’ parameter. That’ll give you a general
idea of market penetration. It doesn’t give you any scientific stats
on the number of developers, nor does it account for files using
aliased processing or masking methods, but gives you a very basic idea
of the general shape and format of today’s World-Wide Web.

For this example, I’m only going to use *.com results (to maintain
some semblance of sanity here), and only currently-indexed files as of
this writing. I’m also only going to use extensions that make sense
(for example, you’ll find some .php7 files on Google, but we know
they’re not legitimate results). Also, understand that there will
most definitely be a margin of error, but considering most
developers’ preference for one language over another, I’d say
extension-spoofing will be minor enough to be obscured by factual
results. These figures also count pages indexed, not sites
indexed, nor servers reporting the languages as available.
Also, I’ve found that Google – for whatever reason – gives different
results when using cAsE-sEnSiTiViTy for `filetype` searches (though
the results themselves don’t appear to change), so in situations where
the numbers differ, only the HIGHEST number for that search is used.
Note the case in the search URL given.
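
For anyone who wants to repeat the exercise, here is a minimal sketch of how the searches could be put together. The exact query strings from the original email are not reproduced here, so combining `filetype:` with `site:.com` is an assumption, and the function names are made up for illustration. Nothing is fetched from Google; the result counts are meant to be read off the results page by hand.

```python
from urllib.parse import urlencode


def filetype_query_url(extension, tld="com"):
    """Build a search URL for one file extension, limited to one TLD."""
    # Assumption: "filetype:" and "site:" are combined in a single query.
    query = f"filetype:{extension} site:.{tld}"
    return "https://www.google.com/search?" + urlencode({"q": query})


def case_variants(extension):
    """Return the lower-case and upper-case spellings of an extension."""
    return sorted({extension.lower(), extension.upper()})


def highest_count(counts_by_variant):
    """Keep only the highest reported count among the case variants."""
    return max(counts_by_variant.values())


# Print one search URL per case variant of the extension.
for variant in case_variants("php"):
    print(filetype_query_url(variant))

# Placeholder counts typed in by hand for illustration only (not real results).
example_counts = {"php": 100, "PHP": 120}
print(highest_count(example_counts))
```

Taking the maximum across case variants mirrors the rule stated above: where the numbers differ, only the highest one is used.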

Total .com results in Google:
18.610 Billion (100%)

(971M + 14.7M + 35.1M + 1.98M + 1.62M + 4,260) = 1.024B (5.5%)

ASP:
(1.54B + 803M) = 2.343B (12.6%)

(300M + 22.1M + 5,450) = 322.105M (1.73%)

CGI + PL (We’re aware that .cgi can be anything, but we’ll couple
it with Perl here):
(89M + 41.3M) = 130.3M (0.7%)

RB (Other Ruby[-on-Rails] searches were too minute to mention):
231,000 (0.0012%)

PY (Python):
4.46M (0.024%)

HTML (for good measure):
2.29B (12.31%)

Total of all groups listed above:
6.114B (32.85%)
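
The arithmetic behind these percentages is easy to re-check. The sketch below is only a sanity check using the figures quoted in this post; the variable and function names are hypothetical, and rows that carry no label in the post are identified by their first number rather than by a guessed name.

```python
TOTAL_COM_PAGES = 18.610e9  # "Total .com results in Google" from the post

# (label, individual counts) in the order the figures appear in the post.
ROWS = [
    ("row starting 971M", [971e6, 14.7e6, 35.1e6, 1.98e6, 1.62e6, 4260]),
    ("row starting 1.54B", [1.54e9, 803e6]),
    ("row starting 300M", [300e6, 22.1e6, 5450]),
    ("CGI + PL", [89e6, 41.3e6]),
    ("RB", [231_000]),
    ("PY", [4.46e6]),
    ("HTML", [2.29e9]),
]


def share_of_total(counts, total=TOTAL_COM_PAGES):
    """Return the summed counts as a percentage of the total page count."""
    return 100.0 * sum(counts) / total


# Print each row's sum (in millions) and its share of all .com pages.
for label, counts in ROWS:
    print(f"{label}: {sum(counts) / 1e6:,.3f}M ({share_of_total(counts):.4f}%)")

# Flatten every individual count to re-derive the combined share.
all_counts = [count for _, counts in ROWS for count in counts]
print(f"all rows: {sum(all_counts) / 1e9:.3f}B ({share_of_total(all_counts):.2f}%)")
```

The results agree with the figures in the post to within rounding; the combined share can differ in the last digit because the post adds up already-rounded subtotals.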

Perhaps surprisingly, according to this, ASP is in the lead —
even surpassing plain HTML.

Disclaimer: By the time I send this email, the numbers will have
no doubt changed – and your results may also vary based upon your
geographic location due to Google’s network and data distribution.
This isn’t by any means scientific, yada-yada-yada, I’m not a
statistician or mathematician or even a nice guy, Google doesn’t
sponsor my work, blah, blah, blah. You know the drill.

And there you have it.

