No one is saying that “any file beginning with 11 '1's is an mp3”.
No one except you anyway:
" It is possible to determine almost any file type out there by evaluating the first few bytes (their “magic numbers” ). "
Pretty hard to read that as anything else.
From this description it is pretty easy to deduce that any audio frame of an mp3 (MPEG Version 1 Layer III) starts with 15 known bits
It's 11 bits. You can misread him if you like, but the standard says 11 bits for the magic bits. The next bits vary between valid MP3 files, since they indicate which MPEG version and layer the file uses.
Why on earth you feel like arguing what the standard says with someone who has read it, when you clearly have not, is the more interesting question.
(representing either “FF FA” or “FF FB” ). The false positive rate is therefore 1 in 32768 files
As I already explained to you, the magic numbers don’t have to be in the first 11 bits; they can be anywhere in the file. So yes, it’s 1 in 2048 files if each file is exactly 11 bits long. Of course, if your file is 1 MB long, it’s likely to contain that sequence somewhere in it ;)
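To put a number on that, here is a rough sketch of the arithmetic. It assumes the file is uniformly random bytes and that the sync word is byte-aligned (0xFF followed by a byte whose top three bits are set); the function name is just something I made up for illustration.

```python
def p_contains_sync(n_bytes: int) -> float:
    """Probability that n_bytes of uniformly random data contain a
    byte-aligned 11-bit sync pattern: 0xFF followed by a byte >= 0xE0.

    Tracks two "no match yet" states: last byte was 0xFF, or it was not.
    """
    p_prev_other = 1.0  # no match so far, last byte is not 0xFF (or no bytes yet)
    p_prev_ff = 0.0     # no match so far, last byte is 0xFF
    for _ in range(n_bytes):
        # After 0xFF, 32 of the 256 byte values (0xE0..0xFF) complete the
        # pattern; the other 224 are not 0xFF, so we fall back to "other".
        new_other = p_prev_other * 255 / 256 + p_prev_ff * 224 / 256
        new_ff = p_prev_other * 1 / 256
        p_prev_other, p_prev_ff = new_other, new_ff
    return 1.0 - (p_prev_other + p_prev_ff)


if __name__ == "__main__":
    for size in (2, 1024, 4096, 1 << 20):
        print(size, p_contains_sync(size))
```

For a 2-byte file this gives exactly the 1 in 2048 figure. Under these assumptions it should already be well above one half at 4 KB and indistinguishable from 1 at 1 MB.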
These are the numbers from my false positives test:
0 in 1972326 files are found to have the mp3 “magic number” (15 bits) accidentally.
Even the number of non-mp3 files which exhibit the frame synch pattern (11 bits) is pretty low:
47 in 1972326 files (about 1 in 42000) which is another strong indication that most files are simply not random in their first few bytes.
Short summary: false positives are not a big issue at all.
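For reference, the test quoted above only looks at the first two bytes of each file, as far as I can tell. A minimal sketch of those two checks, assuming that reading is right (the function names are mine, not from anyone's actual tool):

```python
def starts_with_mp3_header(data: bytes) -> bool:
    """15-bit check: the first two bytes are FF FA or FF FB."""
    return len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xFE) == 0xFA


def starts_with_frame_sync(data: bytes) -> bool:
    """11-bit check: 0xFF followed by a byte whose top three bits are set."""
    return len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0
```

Both only ever inspect `data[0]` and `data[1]`, so neither says anything about what happens when the pattern is looked for further into the file.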
Awesome. Now reread what I wrote and do the test correctly.
For kicks I picked a random AAC file and confirmed it’s actually an MP3 file. First try! Actually, I doubt I have many files on my hard disk bigger than a MB or so that aren’t MP3 files according to this test, since any file with random binary data more than a few KB long has the MP3 sync word in it somewhere.
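If you want to try it yourself, something along these lines will do. It is only a sketch: it assumes a byte-aligned sync word and does no validation of the rest of the frame header, and `find_sync` is a name I picked for the example.

```python
import os


def find_sync(data: bytes) -> int:
    """Return the offset of the first byte-aligned 11-bit sync pattern
    (0xFF followed by a byte with its top three bits set), or -1."""
    pos = data.find(b"\xff")
    while pos != -1 and pos + 1 < len(data):
        if (data[pos + 1] & 0xE0) == 0xE0:
            return pos
        pos = data.find(b"\xff", pos + 1)
    return -1


if __name__ == "__main__":
    # 1 MB of random bytes stands in for "any file with random binary data".
    print(find_sync(os.urandom(1 << 20)))
```

To run it on a real file, read it with `open(path, "rb").read()` and pass the bytes in; a result other than -1 means the sync word shows up somewhere in the file.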