How is ExpressCache 1.3.2 working?

I’ve given more thought to overprovisioning since I mentioned it as a possible way to reduce cache housekeeping delays.  If I personally bought a 64 Gb SSD cache drive, I’d probably run it in the 50-60 Gb range rather than overprovisioning by using only half of it.  In other words, I’d accept reduced longevity and occasional housekeeping delays to gain additional cached files.

Another issue is marketing.  Although one could justify overprovisioning a small SSD used as a cache due to the relatively low total cost of going from 32 to 64 Gb of NAND, what about a 1 Tb SSD?  How are you going to sell increased reliability for one product and not another?  I can see it for ‘mission critical’ enterprise-level installations where cost is relatively no object, but for the home market, “I can give you TWICE the storage for the same money” would be impossible (IMO) to beat.

Yeah, that would be an unfortunate design flaw if it’s failing because the relatively cheap controller it uses can’t keep up with all the writes when full, causing resets or whatever.

That being said I never got close to full on mine after a while, so I guess it failed.

I don’t think it’s the controller; it should easily be able to keep up with LBA requests.  I agree with AlleyViper, this is most likely a Condusiv software glitch.  My hypothesis is that some permanent or temporary data array is, under some circumstances, running out of pre-allocated or available space, or possibly some numerical value is exceeding its type’s allowed range (not likely these days).

Two things that can negatively affect caching performance:

Security software scans.  We all (hopefully) do a quick scan each boot, and probably idle time scans – and every so often I scan every single file, archive, etc.  This makes the cache record lots of LBAs, many of which are otherwise rarely read.

Defragging moves data from one LBA to another, so the cache has to adjust to new read patterns.  Active defraggers that try to move files around in real time to keep them contiguous generate ‘spurious’ (from a caching perspective) LBA reads and change which LBAs hold which data, again forcing the cache to reorder/reprioritize.

I re-formatted my PC and installed Windows 8.1 with Update x64.  I thought I would give ExpressCache 1.3.2 another shot at working without modification.  But once the cache filled up, it started causing problems like my PC taking forever to boot up.

I took the suggestion above and capped my cache at 25Gb.  That has helped.

Sandisk, when is this going to get fixed?!  It is taking way too long to release updates for this product.

Well, to finally confirm it: after more than a month and a half of use with only 16GB, this computer exhibited no ReadyCache-related problems.  No temporary freezes at the Windows logo, and no later catastrophic system freezes while running Windows with a nearly full cache.  I’ll try a 3/4 full drive now (22900 MB), again excluding dedicated storage and download drives (torrents mess up the cache badly by creating too much cached activity, which leads to a quicker purge of more important system data).  As I’ve had repeated trouble with only about 22-24GB filled, I’ll undershoot 25GB a bit more than suggested.

For a short time I’ve used a laptop with a 24GB mSATA cache drive running ReadyCache, and there seemed to be no issues.  That SSD was almost full all of the time, because one partition held hybrid boot data (maintained by another program) and the remaining ~16GB were kept close to full.

Then again, I guess overprovisioning should yield better IOPS (especially on writes) and better drive longevity through wear leveling, but startup freezes that last many seconds seem more related to a software fault (as NWGuy pointed out) than to the added latency of a non-optimized SSD or weak controller.  Hence less cached data, up to some 20-25GB, might put less strain on the software.

I also believe that many cache reset problems can be due to windows updates, file scans, scheduled defrags, hours of torrenting + moving files, that cause havoc on the LBA list, leading to an expected total resynch.

AlleyViper,  here’s why I’m thinking you can go up higher than 23Gb without delays, even though you and others have seen issues starting near that point:   As far as I know, all these reports were based on running the full available OEM 29.82 Gb size for the active partition!

Now if that is not the case, and if you have experienced cache delays with smaller partition sizes, by all means don’t go above what  works for you – and please post here.

We also need to be sure everyone reporting has upgraded to software version 1.3.2, as bug reports of all kinds have decreased significantly with that revision, and I know some of the delay issues involving partition fill percentages were posted using prior versions.

OK, here goes:  How a cache works.

You start with a certain low number of disk reads per time period or reboot cycle to initially fill the cache – let’s say you start by caching LBAs that have 5 reads.

At some point your cache gets close to full (let’s say 80%) – clearly caching all LBAs with 5 reads is going to take more space than available.   So you want to increment the required LBA read count to be cached by one, making it 6 in this case, and then delete any cached LBAs with fewer than 6 reads from the cache.
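
The threshold scheme described here can be sketched in a few lines of Python. This is only a toy model of the idea as described in this post, not ExpressCache's actual algorithm; the class name, the 80% trigger and the capacity counted in LBAs are all assumptions made for illustration.

```python
from collections import Counter


class ThresholdCache:
    """Toy model: an LBA is cached once it has been read `threshold` times."""

    def __init__(self, capacity, threshold=5, trigger=0.8):
        self.capacity = capacity      # max number of cached LBAs (assumed unit)
        self.threshold = threshold    # reads required before an LBA is cached
        self.trigger = trigger        # fill fraction that starts housekeeping
        self.reads = Counter()        # read count per LBA
        self.cached = set()

    def read(self, lba):
        self.reads[lba] += 1
        if self.reads[lba] >= self.threshold and len(self.cached) < self.capacity:
            self.cached.add(lba)
        if len(self.cached) >= self.trigger * self.capacity:
            self.housekeep()

    def housekeep(self):
        # Raise the bar by one read and evict every cached LBA below it.
        self.threshold += 1
        self.cached = {lba for lba in self.cached
                       if self.reads[lba] >= self.threshold}


cache = ThresholdCache(capacity=10, threshold=2)
for lba in range(8):          # eight LBAs, two reads each
    cache.read(lba)
    cache.read(lba)
print(cache.threshold, len(cache.cached))
```

In this toy run the eighth cached LBA reaches the 80% trigger, the required read count goes from 2 to 3, and every LBA with only 2 reads is dropped, which is the fill, shrink, refill cycle the post goes on to describe.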

In my case with a 16 Gb partition, I observed the cache filling, then decreasing, then refilling again, just as you would expect – with slight boot delays where you would expect housekeeping to be clearing out deleted low activity LBAs.   I’ve observed several cycles, the highest cache fill I’ve observed was 86.25%, the lowest (after initial fill) was 76.25%.

What people are observing using the OEM 29.82 Gb partition is that the initial cache housekeeping pass, triggered somewhere around 80% fill, is taking a lot longer than 5 seconds or so to complete  – yet initial housekeeping passes on smaller partitions finish quickly – disproportionately quickly.     It’s possible that some absolute number is being exceeded – say I did a “scan every file” virus scan over and over again to try to max out the array of LBA read counts – but my guess is that that scenario has been anticipated.   Most likely is that we’re running out of what I’ll call ‘scratch space’ when the partition is set to 29.82 Gb.

As an example, when deleting slower moving LBAs, one may wish to populate a separate array for housekeeping to work off of – perhaps there is not enough room in the 2.18 Gb (code?) space between 29.82 Gb and 32 Gb for this array.   Maybe at one time everything fit but some code change ended up taking more space.

If that’s the case, it MAY be a very near miss – maybe a 29.81 Gb partition would work.   Most likely, making another full 2.18 Gb available by reducing the active partition size to 29.82 - 2.18 = 27.64 Gb would work.  That would cover the “Oh, I thought I had the entire 2.18 Gb space for data structures!” scenario.  Not that that sort of thing ever happens :)   Or is likely happening, as everyone’s cache would be affected.

HOWEVER, there’s always the chance that something does happen as cache fill nears 23 Gb in absolute terms.  Having observed a maximum fill of 86% prior to housekeeping, let’s say 90% to be safe,   a 25 Gb partition x 0.9 = 22.5 Gb, so AlleyViper, I really think you will be ‘safe’ with a 25Gb partition.

There is another possibility too.  We’ve been addressing this problem by reformatting and repartitioning the drive using version 1.3.2 – what if that is all that’s necessary?  Has anyone tried a ‘clear out and start over’ from the command line using version 1.3.2 and the OEM partition size?

@NWGuy,

Perhaps SanDisk should simply fix the software?

ReadyCache is supposed to be transparent to the user.

I will never buy a SanDisk SSD type product again.

Could you blame me?

Flavio :angry:

@NWGuy

Btw, the partitioned 29.82GB is already the full declared ~32GB without any spare/scratch or provisioning space (that would be the case for a drive sold as 30GB with the remaining 2GB inaccessible and reserved for provisioning).  Manufacturers use gigabytes (1000^3) instead of formatted gibibytes (1024^3) for storage space.

This 32GB SSD has 62533296 available physical sectors with 512 bytes per sector, which gives 32017047552 bytes (~32000000000) or 32.02GB.  If you divide those bytes by 1024 three times, you’ll end up with 29.82GB as commonly reported by an OS that prefers base 1024 instead of 1000 (or, more precisely, 29.82GiB).
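
The arithmetic in this paragraph can be reproduced with a short Python snippet. The sector count and sector size are the figures quoted in the post; the function name is just for illustration.

```python
SECTORS = 62_533_296        # physical sectors quoted in the post
BYTES_PER_SECTOR = 512


def capacity(sectors, bytes_per_sector=512):
    """Return (bytes, decimal GB, binary GiB) for a drive."""
    total = sectors * bytes_per_sector
    return total, total / 1000**3, total / 1024**3


total_bytes, gb, gib = capacity(SECTORS, BYTES_PER_SECTOR)
print(f"{total_bytes} bytes = {gb:.2f} GB = {gib:.2f} GiB")
# prints: 32017047552 bytes = 32.02 GB = 29.82 GiB
```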

For now, I’ll keep this 3/4 partition to check for abnormal delays or freezes.  Given that this drive receives no TRIM commands from Windows because it has no assigned drive letter, a guaranteed 25% free space should be good enough for its own housekeeping and wear leveling (independently of current ReadyCache software issues).

If everything goes well, I’ll reclaim a bit more space as you suggest.

Btw, you’re right in your assumption, every freeze I’ve had with 22-24GB filled, and once even with 17GB was under a 29.82GB full sized partition. Unfortunately this testing will take some time, as I’m not always near this PC.

 @Flavio

No one here is happy with this situation; the only “good” sign is Sandisk acknowledging the issues and the hope for a fix.  As a matter of fact, I’ve retired my ReadyCache to a PC I built for a family member because of all the frustrating problems, and went with a decent-sized SSD from another brand on my desktop.  This was more than half a year ago.

If it had worked right from the start, or even the way it’s finally working now with these workarounds, that investment wouldn’t have been necessary.  Just try to reduce your cache partition size, and your experience with this drive might be less frustrating.  You have nothing to lose until there’s a proper fix.

AlleyViper, can’t believe that got by me, especially since I’ve had the SSD partition pulled up on Partition Wizard more than once – thanks for the good explanation.

I just had a cache ‘reset’ happen to me on my OEM partitioned unit at somewhere between 26.3 and 27 Gb  – 26.3 was the last reading I saw working and the cache was filling very slowly at that point.   The following shutdown took a long time (minutes) of disk activity, and the next bootup went very slowly, with Windows logo delays.  When I looked at the cache it was starting over at 0.07 Gb, but it started filling again, up to 3+ Gb in 10 minutes and 12+ Gb by the end of the day.

29.82 x 0.90 = 26.84

29.82 x 0.86 = 25.65

The delay issues I’ve seen occurred in this range

29.82 x 0.78 = 23.26

29.82 x 0.74 = 22.07

With the exception of your 17Gb observation, around 75% is the lowest cache fill reported with observed delays, with more reports around 80%.
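
For anyone who wants to redo these fill calculations for a different partition size, here is a small Python helper. The 29.82 figure is the OEM partition size discussed in this thread; the function name and the choice of percentages are only illustrative.

```python
OEM_PARTITION = 29.82   # size reported for the full OEM partition


def fill_at(fraction, partition=OEM_PARTITION):
    """Cache fill reached at a given fraction of the partition size."""
    return round(partition * fraction, 2)


for pct in (90, 86, 78, 74):
    print(f"{pct}% of {OEM_PARTITION} = {fill_at(pct / 100)}")

# The 25 Gb case mentioned earlier in the thread: 90% of 25 is 22.5
print(fill_at(0.90, partition=25))
```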

  

Both my caches are working fine overall and are definitely speeding up disk IO and reducing physical disk wear and tear.

I don’t mind occasional small housekeeping delays at bootup – maybe because that’s how I expect a cache to work.    I don’t mind leaving some working space for cache operation – that’s the easy fix for Sandisk, just reduce the OEM partition size.

AlleyViper will be trying around a 23 Gb partition.  I’m going to try 25 Gb on both of mine.   It would be great if someone could try 27 Gb, watching cache size daily once fill exceeds 23 Gb.  It may be awhile before I report back.

Well, things didn’t go as I planned – and I learned some things.

 

On the machine I had tried the 16 Gb partition on I decided to try a 27 Gb partition, and start the GUI splash screen with Windows to keep a close eye on the numbers.

 

Oddly enough, even after the command line steps including clearing the old partition and formatting the new partition, the cache seemed to start with the 12+ Gb fill it had on the 16 Gb partition – so I guess those eccmd commands don’t clear the “LBAs to be cached” table.     Wanting a fresh start, I used Clear Cache from the Options on the GUI and that started the partition over at 0.07 Gb .

 

Now on the other computer, the one with the never-modified original partition, remember it had recently reset after getting quite full and was refilling much more rapidly than it had the first time.  Well, it has been operating fine ever since, roughly in the 75% to 85% fill range – no resets.  Also, interestingly, I’ve observed it decreasing the cache size during cache operation – I’ve watched in Task Manager, and during these times both Express Cache and the Express Cache Service grab 25 to 30 Mb of memory and consume around 1.5% CPU.  Very good performance in my opinion.  Now I can’t say for sure whether the actual NAND housekeeping occurred ‘live’, but I can say there were no major boot delays either before or after cache decreases.

 

Summarizing, what I observed was a very slow cache fill (many days) for the first time through, then a cache reset to 0.07, and then a very fast cache fill and normal cache operation thereafter.

Now back to the 27 Gb machine – I observed the exact same pattern – slow fill, reset, fast fill, normal operation.

 

When the reset occurred, however, on this machine I was loading the GUI at startup, and an additional text box appeared (in my words, as I didn’t write it down):  “The cache has been reset so operations will be at normal speed.   Cache resets can be caused by . . .” and it listed several things including a Windows update.

My conclusion is, this program is currently designed so that the first time you use it, it will fill the cache slowly while generating an LBA use table, then clear the cache and start over with that table.  I don’t remember reading that anywhere, so maybe Condusiv and Sandisk need to explain a little more what normal operation is so folks aren’t surprised when the cache resets.

 

I also think Sandisk needs to make two changes to the GUI.   First, the informative text box that a cache reset has occurred should display whether or not you are starting the GUI with Windows.   Second, the small ExpressCache icon should load in the system tray by default, again whether or not you are starting the GUI with Windows.

 

For folks who, for whatever reason, want to avoid any cache reset EVER, it may not be possible, although I don’t remember ever seeing a cache reset when I had the partition size down to 16 Gb.

Both of my ReadyCaches have been running without any resets for over a month, so I’m going to end my partition size reduction experiment and return the one I had reduced to 27Gb to the OEM 29.xx Gb size.

I think that occasional cache resets may occur as part of normal cache operation during the first 30 days of use or so, but after that rarely.  I think that Sandisk/Condusiv needs to better communicate this to users so folks are not surprised when it happens.

I’ve been running Condusiv 1.3.2. version since April 30th, 2014.  ReadyCache was bought on April 29th and both the SSD and 1.3.2 were installed on the 30th.

System is an HP Compaq dx2300 business minitower with two SATA HDDs.  OS drive is SATA II, whilst the other drive is SATA III.  All system SATA ports are SATA II.

Drive C is SATA 0, DVD-ROM is SATA 1, Drive E is SATA 2, and the Sandisk SSD ReadyCache drive is SATA 3.

Original configuration was as above, and original ReadyCache setting was 28.92 GB.

Worked fine most of the first day, until Acronis ran the daily image backup @ 5:00 AM.  Then the cache went from ~6.00 GB to max several times in only 60 minutes and reduced to ~23-27 GB and then finally settled down.

I’m sorry, but removing Acronis is not an option.  There was, and is, clearly a cause and effect on system and SSD performance after the imaging software is run.

Experimentation via reducing the formatted cache size is still ongoing.  I’ve tried 8.00 GB (recommended by Sandisk tech support), 23.00 GB, and now 16.55 GB currently.  With 8 GB, no windows system logo hang was noted, 23 GB sometimes, and now, with 16.55 GB I’m hoping for some stable operation w/o system freezes or logo hangs on boot or restart.

Seems to me the issue I’m seeing is that there are no options within the Condusiv software to ignore data generated by Acronis True Image or any imaging software when a daily backup is done.  I see the same sort of uptick in total system and cache reads when the a/v program or Malwarebytes is told to do a manual scan, but this data does not need to be cached for any of these programs to run, so why is there no option in the Condusiv software to tell the drive not to cache this data?

It’s not possible to anticipate every possible configuration a consumer will have on their system, and also all data generated by any system does not have to be cached in order for a SSD cache drive to perform satisfactorily; one should be able to exclude certain types of program activity (active file scanning by antimalware programs or imaging programs in this case) from the cache drive.  Reason this is so, is because the data generated by these sort of programs always seem to have a negative performance impact on the SSD drive.  (The fact that this SSD is of older technology does not help.)

Condusiv does not, and has not, offered this option.  Wish I’d known that before I bought the product.  Offering this option would go a long way in customer perception and satisfaction and make the promise of a faster, more stable, more reliable system more possible for everyone.

It should be perfect for an older system, and when it works as it should, it’s great. 

But here one size does not fit all, and there has to be a way to fit the software to the actual system configuration and differing customer needs.

Offering this option alone would take care of the LBA backlog and tabling issues, and minimize cache resets.  It’s the logo hang that is the most bothersome to me, so for now I’m working with the 16.55 GB volume size, which should eliminate the logo hang forever.  With the proper exclusions available, it should be possible to run the SSD at 29.82 GB and have the expected speed, operation, and cache size one paid for.

As it is, it is not possible unless one formats the drive to ~16 GB, as other users have posted here.

Will Condusiv consider making such options available in the next oem release?  Will they make it happen?  I think it is certainly worth a try as a beta release.  If that works, then go oem.

Identifying every backup, defragging, etc. type program inside the Condusiv software would be very difficult.  And you can’t turn the cache off temporarily, lest the data in the cache gets out of sync with the data on the hard drive – that is, unless you flush the cache entirely and start over.

I personally can’t figure out how to avoid occasional boot time delays with an algorithmically filled cache.

Good information NWguy.  I don’t have the issues you talk about related to using an imaging program since I have a Server 2012 Essentials server backing up my PC.

But I have been running my drive at 18 GB.  I just had a cache reset.  I am going to try setting it to 16 GB, i.e. 16,384 MB, to see if that works better.

I can’t believe it’s been 6 months and Sandisk hasn’t released an update for ExpressCache.  I think they are done supporting it.  I guess the next PC I build I will have to pony up for a 512GB or 1TB SSD.  Maybe in the next year or two those sizes will be affordable.

Just got the message for a cache reset and thought I’d write down what it said: 

“The cache has been reset. The Expresscache software does this as a precaution if the cache is found to be out of sync with the primary drive. This can be caused by an unexpected power loss, ungraceful forced system shut down, Windows update, or several other reasons. The cache will rebuild itself automatically.”

Also, thought I’d join the bandwagon in saying “Yes the reset happens to me too.” I’ve repartitioned the drive to 16 GBs and I still got this message. :neutral_face:

What’s worse is that I think that the Expresscache program is causing my sound card to randomly drop noise processes, making it sound like an old skipping record. 1.3.1 was great, and still is on another PC that’s using it (-Never- updating that one! lol). Why can’t we just get a link to the 1.3.1 software? I have an old 1.0.1 file, but it doesn’t perform as great as post 1.2.1.

Anyways, thanks for all of the tests and communications going on here; I hope to see the issues resolved soon.

@nwguy wrote:

AlleyViper, I’m noticing the boot delay and what my feeble memory says is a cache operation slowdown as I approach 28Gb.  I’d like to try reducing my cache partition size as you show above, but I want to make sure I get the syntax right, as I’ve gotten used to GUI partitioning utilities.

 

For comparison, here is a method slotmonsta posted in the “ReadyCache ssd hangs on startup” thread to make an 8 GB (8192) partition:

  1. From the command line: ECCmd -format (this will clear the information out of the cache) 
  2. Delete the partition (this is done from the Disk Management pane by right-clicking on the drive and selecting “delete” for the partition)
  3. From the command line: ECCmd -partition (drive number) 8192 (this will create a partition of 8GB in size) 
  4. From the command line: ECCmd -format (this will format the new partition and make it ready for EC/RC to utilize)
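
As a dry run of the four steps above, the following Python sketch only builds and prints the command lines; it does not execute anything. The ECCmd syntax is copied from the quoted steps. The function name, the placeholder drive number and the assumption that the size argument is in MB (inferred from 8192 giving 8GB) are mine, so check them against your own system before typing anything.

```python
def repartition_steps(drive_number, size_gb):
    """Return the procedure quoted above as text lines (dry run only)."""
    size_mb = int(size_gb * 1024)     # 8 GB -> 8192, matching the quoted example
    return [
        "ECCmd -format",
        "(delete the old partition manually in Disk Management)",
        f"ECCmd -partition {drive_number} {size_mb}",
        "ECCmd -format",
    ]


# drive_number=1 is a placeholder; use the number of the cache SSD.
for step in repartition_steps(drive_number=1, size_gb=8):
    print(step)
```

Printing the steps first makes it easy to confirm the drive number and size before running anything against a real disk.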

 

So:   eccmd -partition requires a drive number, but eccmd -format does not?

 

Thanks for helping someone who started with 8" floppies but hasn’t used the command line in years and doesn’t want to partition or format their hard drive!

 

Yeah, thanks for the commands.  I’ve been watching my system just drag when the cache is full; watching perf monitor, it’s like I don’t have any cache/SSD running, it’s just single-digit reads all day long.

I will try 23GB partition

It’s a bit weird because ExpressCache only shows the total cache size rather than the partition size…

@nwguy wrote:

Thanks very much AlleyViper!   Your explanations give me the confidence to give this a try.

 

Unexpectedly, when I examined my original cache partition it showed the starting offset as 2048.  I’d always heard to start at 4096 (or multiples) to get a 4k alignment ??
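
A quick way to check the alignment question, assuming the 2048 offset is reported in 512-byte sectors (an assumption, since the post does not say which unit the partitioning tool displayed). The function name is illustrative.

```python
def is_4k_aligned(offset_sectors, bytes_per_sector=512):
    """True if the partition start falls on a 4096-byte boundary."""
    return (offset_sectors * bytes_per_sector) % 4096 == 0


print(2048 * 512)            # 1048576 bytes, i.e. 1 MiB
print(is_4k_aligned(2048))   # True
print(is_4k_aligned(63))     # False: the old 63-sector starting offset
```

Under that assumption a 2048-sector offset is a 1 MiB start, which is a multiple of 4096 bytes. If the tool was reporting bytes instead, the answer would be different.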

 

I was researching SSD partitioning and ran into the following: http://www.tomshardware.com/forum/292105-32-best-format-partition-performance-wear-leveling#6100306 .  One thing they say is that trying to use more than 80% of the allocated space on an SSD will result in performance issues.   Well, 0.8 x 29 GB = 23.2 GB – that number looks familiar!     Maybe these full cache slowdowns are just symptoms of a normal ssd issue of needing at least 20% slack space, and reducing the cache partition size is the actual fix and not just a work-around…

 

 

 Yea at this point I think they might have created a flawed product from the get go :frowning:

Though it’s weird, because SSD performance degradation when full should occur with writes, not reads…

Btw, does anyone know what the FDMap1.dat files in the ExpressCache ProgramData directory are?  In my frustration I tried deleting them to clear my cache more thoroughly, along with everything else possible.  But it seems those files are protected and reappear instantly when the service is restarted :P

So annoyed, I’m pretty sure it got slow as it hit near the 23GB partition capacity, system lags, general feeling of system slower than even without a cache drive.

dropping to 18GB partition…

we aren’t getting the product we paid for…:frowning:

Same issue here re cache resets.

Seems to be linked to two different things:

1.)  System memory may be less than optimum as system uses a 48-bit Intel memory chip.  What this means is that even though the system runs Win 7 64-bit just fine, it is a transitional system board from the migration between 32-bit and true 64-bit operation.  So it cannot see 4.0 GB RAM when that is installed, only 3.25 GB RAM.  It is currently running 3.0 GB RAM in the two slots available.

When running properly, it will display 768 MB RAM in use and 6121 Blocks in use by ReadyCache (see above) and when not, 0 MB in use and 0 Blocks in use in the case of a cache reset.  Otherwise, memory in use will vary from 768 MB RAM and 6121 Blocks in use to as low as 512 MB RAM and 4096 Blocks in use depending on what type of scanning program has just run.

2.)  I’ve seen a cache reset be triggered by manually running a Secunia PSI scanning session; 0 MB in use and 0 Blocks in use at the end of the scan.  When this happens, the system will revert to normal prior unenhanced operation, with the usual old system data displayed in the Task Manager Performance window.  It will take up to 60 minutes for these numbers to change back to the way they were before the crash when this occurs as ExpressCache recovers.

I’ve been monitoring ReadyCache performance for some time now by running cmd as admin, using the eccmd -info command and capturing the resulting output.  I’ve copy/pasted these captures into more than 50 different Notepad files, each saved in UTF-8 format, and have deliberately run this system for at least two days between reboots.

(These files are grouped by the name of the pc and # number total reboots the system has seen since the Sandisk ReadyCache SSD has been installed.)
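
The manual copy/paste routine described above could be automated along these lines. This is a hedged sketch: `eccmd -info` is the command named in the post, but the helper names and the file naming pattern are hypothetical, and because the report format is not shown in this thread the output is saved verbatim rather than parsed.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def save_snapshot(text, pc_name, reboot_count, folder="."):
    """Write one captured report to a UTF-8 file named by PC and reboot count."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    path = Path(folder) / f"{pc_name}_reboot{reboot_count}_{stamp}.txt"
    path.write_text(text, encoding="utf-8")
    return path


def capture_info():
    """Run `eccmd -info` (needs an elevated prompt) and return its raw output."""
    result = subprocess.run(["eccmd", "-info"], capture_output=True, text=True)
    return result.stdout
```

On a machine with ExpressCache installed, something like `save_snapshot(capture_info(), "dx2300", 12)` run from an admin prompt would produce one file per capture, with the PC name and reboot count as example values.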

IMO:

There has to be a way to exclude a/v scanners, antimalware scanners, and other scanning programs (like a disk imaging program) from the ReadyCache installation, as running these scans always negatively impacts the proper operation of the Sandisk ReadyCache SSD and can even cause it to crash temporarily.  Home use is different from enterprise use; one of these differences is that business users do not scan their systems on a regular basis, if at all.  IT staff will step in only when a user complains about a system issue.  As it seems as if ExpressCache software is still geared to enterprise usage and scenarios, Sandisk may want to look into how home usage of their product differs.

If this option were to be put in place in the next oem ExpressCache version, implementing this option should be a workaround that works on this system and others, as it should rarely run into the SSD maximum space and LBA limitations as it now seems to do.

I may even put this data into an Excel spreadsheet.

Hello mchain,

 

Thank you for the very thorough feedback. I will pass this along to the ExpressCache team. Would you be willing to share the notepad files and other logs that you have gathered? This may help the development team find a solution for this issue. 

 

Forum Admin

slotmonsta

Certainly.  

Just point the way on how to do it.  Expect files very soon after you answer.