Fuze won't find files without filename extension?

Yes, I realise that Sandisk are not here all the time, and I’ve already renamed the files and they are now recognised.

What I am struggling to understand is whether this is deliberate design, or an accidental side effect.

The essence of simplicity includes picking files for the database and filtering them as rapidly as possible.  Hence, the Sansa ignores pdf files, executable files (exe), and application-specific extensions that are not needed by the device.

The SanDisk machine is also quite useful as a flash drive in either MSC or MTP modes, in addition to being a music device.  Thus, I don’t see it as an oversight, but a design parameter.  No need to bother with those files while doing its routine “database refresh”, a point of obvious contention with the average user. 

Making normal operational tasks as rapid as possible is the goal, and filtering out files, via the file extension, is a great way to do it.  Interesting find, incidentally.  I hadn’t given it much thought.

Bob  :smileyvery-happy:

I assume however that if you navigate to those files through the Folders menu on the Fuze that you will be able to see them then, and possibly play them.

@qualityaudio wrote:
I assume however that if you navigate to those files through the Folders menu on the Fuze that you will be able to see them then, and possibly play them.

I’m afraid you won’t see them. The folder and file browsing on the Fuze is very different from a normal file manager (Windows Explorer and the like). As noted earlier, the device scans the FAT during the “database refresh” for files of known types. This is obviously done by checking file extensions. It keeps a record of these accepted files only. Moreover, it remembers a folder name only if the folder contains valid files. That means, for example, that it will not even offer the user a folder which contains only txt files.

I too believe this is done on purpose, and I think it is indeed a good idea for an mp3 player. On the other hand, it would be nice to have more precise information straight from the manufacturer …

Check for yourself to see what the folder browsing feature is really about.
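
For anyone who wants to picture the behaviour described above, here is a minimal Python sketch. It is only a guess at the logic: the extension list and the names `is_known_media` and `build_folder_index` are hypothetical, and nothing in this thread documents what the firmware actually does.

```python
import os

# Assumed list of accepted extensions; the real firmware list is not stated here.
KNOWN_EXTENSIONS = {".mp3", ".ogg", ".flac", ".wma", ".wav"}


def is_known_media(filename):
    """Accept a file only if its extension is on the known list."""
    _, ext = os.path.splitext(filename)
    return ext.lower() in KNOWN_EXTENSIONS


def build_folder_index(listing):
    """listing maps a folder path to its filenames.

    Folders without any accepted file are dropped entirely, which matches
    the observation that such folders never show up in the Folders menu.
    """
    index = {}
    for folder, names in listing.items():
        accepted = [name for name in names if is_known_media(name)]
        if accepted:
            index[folder] = accepted
    return index
```

With a listing such as `{"/Music/A": ["song.mp3", "notes.txt", "track01"], "/Docs": ["readme.txt"]}`, only `/Music/A` survives, and within it only `song.mp3`. The extensionless `track01` is skipped, which is the symptom from the first post.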

Was there a reason the files didn’t have extensions, out of interest?

@ewelot wrote:
I’m afraid you won’t see them. The folder and file browsing on the Fuze is very different from a normal file manager (Windows Explorer and the like). As noted earlier, the device scans the FAT during the “database refresh” for files of known types. This is obviously done by checking file extensions. It keeps a record of these accepted files only. Moreover, it remembers a folder name only if the folder contains valid files. That means, for example, that it will not even offer the user a folder which contains only txt files.

I too believe this is done on purpose, and I think it is indeed a good idea for an mp3 player. On the other hand, it would be nice to have more precise information straight from the manufacturer …

Check for yourself to see what the folder browsing feature is really about.

There’s no purpose for the Fuze to show a folder that contains files it cannot read.  Displaying those folders would only clutter up the menu and be useless.

@fenrir wrote:
I have to say that I find that a bit strange as all the tags are there in the files, so can someone from Sandisk tell me if the search algorithm only scans for .mp3, .ogg, .flac etc?

There’s no really good way to implement this.  Without knowing the file tag it’s extremely difficult to determine what format a file is in, particularly for MP3, which need not have any tags at all.  Not to mention that iteratively trying each file parser on every file in your DAP would make the database refresh take hours.

@mags1230 wrote:
There’s no purpose for the Fuze to show a folder that contains files it cannot read.  Displaying those folders would only clutter up the menu and be useless.

Had to test this for myself and you’re right. Makes total sense too.

@summerlove wrote:
Was there a reason the files didn’t have extensions, out of interest?

I’m not the OP but I use the same program to sync. Amarok lets you configure exactly how to rename the files when you transfer them: whether or not to add various tags in the file name and/or folder(s), how to deal with spaces, what to do with characters illegal in FAT32 filenames, etc. The file extension is also an option, and it’s pretty easy to overlook.

@peach wrote:
I’m not the OP but I use the same program to sync. Amarok lets you configure exactly how to rename the files when you transfer them: whether or not to add various tags in the file name and/or folder(s), how to deal with spaces, what to do with characters illegal in FAT32 filenames, etc. The file extension is also an option, and it’s pretty easy to overlook.

Thank you!  Was just interested

@peach wrote:
I’m not the OP but I use the same program to sync. Amarok lets you configure exactly how to rename the files when you transfer them: whether or not to add various tags in the file name and/or folder(s), how to deal with spaces, what to do with characters illegal in FAT32 filenames, etc. The file extension is also an option, and it’s pretty easy to overlook.

Just to be sure I’m understanding you correctly, are you saying that files can be saved in Amarok without file extensions at all?  I always thought that files needed an extension and that the only time a file doesn’t appear to have one is when the extension is hidden.

@mags1230 wrote:
Just to be sure I’m understanding you correctly, are you saying that files can be saved in Amarok without file extensions at all?  I always thought that files needed an extension and that the only time a file doesn’t appear to have one is when the extension is hidden.

No.  There’s nothing special about extensions.  You can have 0, 1, 2 …
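
A quick way to see this for yourself, sketched in Python with the standard `pathlib` module. The file names and the helper `count_extensions` are made up for illustration; an extension is just whatever follows a dot in the name.

```python
from pathlib import PurePosixPath


def count_extensions(filename):
    """Return how many dot-separated suffixes a file name carries."""
    return len(PurePosixPath(filename).suffixes)


# Hypothetical names with zero, one and two extensions.
for name in ["track01", "song.mp3", "archive.tar.gz"]:
    print(name, count_extensions(name), PurePosixPath(name).suffixes)
```

Nothing in the file system cares which of these forms a name takes; only programs that choose to look at the suffix do.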

Thanks for the explanation.

@saratoga wrote:
There’s no really good way to implement this.  Without knowing the file tag it’s extremely difficult to determine what format a file is in, particularly for MP3, which need not have any tags at all.  Not to mention that iteratively trying each file parser on every file in your DAP would make the database refresh take hours.

I don’t agree with that. It is possible to determine almost any file type out there by evaluating the first few bytes (their “magic numbers”). Check this page for a good list of matches. I know it is hard to imagine, especially for Windows users (and programmers) :wink:

Nevertheless, it is not a good idea to use arbitrary file name extensions. And it would certainly take a bit more processing power to analyze the first few bytes of every file.
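
To make the idea concrete, here is a rough Python sketch of a signature check on the first few bytes. The function name `sniff_audio_type` and the returned labels are hypothetical; the signatures themselves (`ID3`, `OggS`, `fLaC`, and the 0xFF byte followed by three set bits for an MPEG audio frame) are the commonly published ones.

```python
def sniff_audio_type(head):
    """Guess a file type from its leading bytes; return None if nothing matches."""
    if head.startswith(b"ID3"):
        return "mp3 (ID3v2 tag)"
    if head.startswith(b"OggS"):
        return "ogg"
    if head.startswith(b"fLaC"):
        return "flac"
    # MPEG audio frame sync: 0xFF, then the top three bits of the next byte set.
    if len(head) >= 2 and head[0] == 0xFF and (head[1] & 0xE0) == 0xE0:
        return "mpeg audio frame"
    return None
```

Note that this only ever looks at offset 0, so a file whose first frame starts later is reported as unknown (for example `b"\x00\x00\xff\xfb"` returns `None`).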

Message Edited by ewelot on 10-25-2009 09:05 PM

@ewelot wrote:
I don’t agree with that. It is possible to determine almost any file type out there by evaluating the first few bytes (their “magic numbers”). Check this page for a good list of matches. I know it is hard to imagine, especially for Windows users (and programmers) :wink:

I think you should look more carefully at your link.  The “magic number” is the ID3v2 tag header.  If you tried to use such a naive approach, any file without an ID3v2 tag would not show up as an MP3.  Looking at my collection, I have more than a few files that this method says aren’t MP3s but that decode just fine on my Sandisk player.

I know it’s hard to imagine, especially for people who aren’t embedded audio programmers, but assuming every file looks exactly like the example file you have means that your software won’t work for anyone but you :slight_smile:

@saratoga wrote:
I think you should look more carefully at your link.  The “magic number” is the ID3v2 tag header.  If you tried to use such a naive approach, any file without an ID3v2 tag would not show up as an MP3.  Looking at my collection, I have more than a few files that this method says aren’t MP3s but that decode just fine on my Sandisk player.

I know it’s hard to imagine, especially for people who aren’t embedded audio programmers, but assuming every file looks exactly like the example file you have means that your software won’t work for anyone but you :slight_smile:

I didn’t want to attack any programmer at all. Apologies if it came across like that (my English is not very good). Honestly, I regard you as an experienced programmer, especially given your excellent work on Rockbox!

Maybe you missed something on the page I linked to. To my knowledge any mp3 file starts either with an ID3 tag or with an MPEG audio frame. Their “magic numbers” are both listed correctly under the usual “MP3” extension. If you know of any mp3 file starting with something different, I’d be glad to learn from you.

Don’t worry too much. All I wanted to say is that there is another way of determining the correct file type which does not rely on the file name extension. But whilst being more accurate in a few cases, this method is more expensive in terms of computational power. If I had to choose between faster startup time (shorter database rebuild in our case) and correct file type recognition for oddly named files, I would definitely go for the first.

@ewelot wrote:
Maybe you missed something on the page I linked to. To my knowledge any mp3 file starts either with an ID3 tag or with an MPEG audio frame. Their “magic numbers” are both listed correctly under the usual “MP3” extension. If you know of any mp3 file starting with something different, I’d be glad to learn from you.

Don’t worry too much. All I wanted to say is that there is another way of determining the correct file type which does not rely on the file name extension. But whilst being more accurate in a few cases, this method is more expensive in terms of computational power. If I had to choose between faster startup time (shorter database rebuild in our case) and correct file type recognition for oddly named files, I would definitely go for the first.

I did miss that part; I actually only skimmed the site since the premise was silly.  But even so, your link is incorrect.  It is required that all MP3 frames start with a sync word.  It is not required that files start with a frame, so the sync word can be anywhere in the file.  This means that if you simply assume that any file beginning with 11 '1’s is an mp3, you will try to decode 1 in 2048 non-MP3s and fail to play all valid mp3s that do not start with a frame.  You can verify this yourself by inserting bytes in front of the sync word on your mp3s.  If your decoder is standards-compliant, it will not care.  I think that guy’s idea probably works great for his MP3 files, but probably fails for all sorts of stuff floating around out there.  Trying to cheat the standard usually just means you end up with buggy ■■■■.

Anyway, while I understand that you want to be able to say that file extensions aren’t important and that you should be able to figure out what a file is without a priori information, there is no real way to do this for MP3 short of parsing through the file, checking every byte to see if it’s the start of a header, and then parsing the header to see if it encodes a valid MPEG audio frame.  And unfortunately, since the MPEG frame header is quite short, you will more than likely encounter, with surprising frequency, random files containing what look like headers that are not, in fact, MP3 files, so I do not recommend this approach.  It’s simply not accurate, nor is it entirely standard.  The MPEG people assumed that you would know in advance what was an MP3 file, and so did not include a way to figure it out.
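
The byte-by-byte scan described above might look roughly like the following Python sketch. The names `is_plausible_header` and `find_header_candidates` are hypothetical, and the checks cover only the reserved or invalid field values in the first three header bytes (version, layer, bitrate index, sampling rate index). A real decoder would do more, for example confirming that another frame follows where the first one ends.

```python
def is_plausible_header(b0, b1, b2):
    """Check the first three header bytes of a would-be MPEG audio frame."""
    if b0 != 0xFF or (b1 & 0xE0) != 0xE0:
        return False                  # the 11 sync bits are not all set
    version = (b1 >> 3) & 0x03        # 01 is reserved
    layer = (b1 >> 1) & 0x03          # 00 is reserved
    bitrate_index = (b2 >> 4) & 0x0F  # 1111 is not allowed
    sample_rate_index = (b2 >> 2) & 0x03  # 11 is reserved
    if version == 0x01 or layer == 0x00:
        return False
    if bitrate_index == 0x0F or sample_rate_index == 0x03:
        return False
    return True


def find_header_candidates(data):
    """Return every offset in data where a plausible frame header begins."""
    return [
        i
        for i in range(len(data) - 2)
        if is_plausible_header(data[i], data[i + 1], data[i + 2])
    ]
```

Every byte of the file has to be visited, and a hit is still only a candidate, which is the cost and accuracy problem raised in the post.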

 

I agree with you on many points (especially about false negative cases). Still, I’d like to comment on a few things.

No one is saying that “any file beginning with 11 '1’s is an mp3”. The author of the linked page just states that files starting with the hex signature “FF Fx” usually represent “MPEG, MP, MP3 MPEG audio file frame synch pattern”. It doesn’t go into detail about what “x” means but provides a link to the frame description.

From this description it is pretty easy to deduce that any audio frame of an mp3 (MPEG Version 1 Layer III) starts with 15 known bits (representing either “FF FA” or “FF FB”). The false positive rate is therefore 1 in 32768 files if you assume that all files start with random bits. But they don’t! The vast majority of files use well-defined bit sequences (“magic numbers”) to distinguish their file types on a lower level, at least on the filesystems of the computers I’m working with. Therefore I expect the number of false positives to be much lower, by several orders of magnitude.

I’m currently running a test to get a real number on a file space used by several users running many apps on different OSes.

Note that false positives are even less likely with mp3 files starting with ID3 tags, or with ogg or flac files, due to a longer unique hex signature.
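
The arithmetic behind the 1 in 32768 and 1 in 2048 figures can be checked with a few lines of Python. This is only a sketch under the stated assumption that the leading bits of a file are uniformly random; the helper name `prefix_bits` is made up.

```python
from fractions import Fraction


def prefix_bits(a, b, width=16):
    """Number of leading bits shared by two width-bit values."""
    return width - (a ^ b).bit_length()


# "FF FA" and "FF FB" differ only in their last bit, so 15 bits are fixed.
shared = prefix_bits(0xFFFA, 0xFFFB)
rate_15_bits = Fraction(1, 2 ** shared)  # chance that random leading bits match
rate_11_bits = Fraction(1, 2 ** 11)      # same, for the 11-bit sync pattern only

print(shared, rate_15_bits, rate_11_bits)
```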

Message Edited by ewelot on 10-27-2009 04:59 PM

These are the numbers from my false positives test:

0 in 1972326 files are found to have the mp3 “magic number” (15 bits) accidentally.

Even the number of non-mp3 files which exhibit the frame synch pattern (11 bits) is pretty low:

47 in 1972326 files (about 1 in 42000), which is another strong indication that most files are simply not random in their first few bytes.

Short summary: false positives are not a big issue at all.

@ewelot wrote:
No one is saying that “any file beginning with 11 '1’s is an mp3”.

No one except you anyway:

"It is possible to determine almost any file type out there by evaluating the first few bytes (their “magic numbers”)."

Pretty hard to read that as anything else.


@ewelot wrote:
From this description it is pretty easy to deduce that any audio frame of an mp3 (MPEG Version 1 Layer III) starts with 15 known bits


It’s 11 bits.  You can misread him if you like, but the standard says 11 bits for the magic bits.  The next bits vary between valid MP3 files, since they indicate which MPEG version the file uses.

Why on earth you feel like arguing what the standard says with someone who has read it when you clearly have not is the more interesting question.


@ewelot wrote:
(representing either “FF FA” or “FF FB”). The false positive rate is therefore 1 in 32768 files


As I already explained to you, the magic numbers don’t have to be the first 11 bits.  They can be anywhere in the file.  So yes, it’s 1 in 2048 files if each file is exactly 11 bits long :slight_smile:  Of course, if your file is 1MB long, it’s likely to contain that sequence somewhere in it ;)


@ewelot wrote:
These are the numbers from my false positives test:

0 in 1972326 files are found to have the mp3 “magic number” (15 bits) accidentally.

Even the number of non-mp3 files which exhibit the frame synch pattern (11 bits) is pretty low:

47 in 1972326 files (about 1 in 42000), which is another strong indication that most files are simply not random in their first few bytes.

Short summary: false positives are not a big issue at all.


Awesome.  Now reread what I wrote and do the test correctly.

For kicks I picked a random AAC file, and confirmed it’s actually an MP3 file.  First try!  Actually I doubt I have many files on my hard disk bigger than a MB or so that aren’t MP3 files according to this test, since any file with random binary data more than a few KB long has the MP3 sync word in it somewhere.
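
A rough way to put numbers on that last claim, sketched in Python. The assumptions are mine: the data is uniformly random, only byte-aligned positions are considered, and neighbouring positions are treated as independent, so `chance_of_sync` is an approximation rather than an exact figure. Both function names are hypothetical.

```python
import random


def has_sync_anywhere(data):
    """True if a byte-aligned 11-bit sync pattern occurs anywhere in data."""
    return any(
        data[i] == 0xFF and (data[i + 1] & 0xE0) == 0xE0
        for i in range(len(data) - 1)
    )


def chance_of_sync(n_bytes):
    """Approximate chance that n random bytes contain the sync pattern.

    Each byte-aligned position matches with probability 1/256 * 1/8 = 1/2048.
    """
    positions = max(n_bytes - 1, 0)
    return 1.0 - (1.0 - 1.0 / 2048) ** positions


# 64 KiB of pseudo-random bytes from a fixed seed, standing in for "random binary data".
rng = random.Random(0)
blob = bytes(rng.getrandbits(8) for _ in range(64 * 1024))

print(has_sync_anywhere(blob), chance_of_sync(4096), chance_of_sync(1024 * 1024))
```

Under these assumptions a 4 KB block of random data already contains the pattern most of the time, and for a 1 MB block it is all but certain.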