06-07-2016 03:17 PM - edited 06-07-2016 03:21 PM
I started using SanDisk Extreme Pro 480GB recently on a laptop with Debian testing (Linux).
It worked OK at first, but a couple of times over the past few days the system has hung or shown kernel panic messages, reporting that the filesystem (XFS) failed to write.
This is what smartctl -a reports:
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.5.0-2-amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Marvell based SanDisk SSDs
Device Model:     SanDisk SDSSDXPS480G
Serial Number:    ######
LU WWN Device Id: ######
Firmware Version: X21200RL
User Capacity:    480,103,981,056 bytes [480 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 T13/2015-D revision 3
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Jun 7 18:13:48 2016 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (   0) seconds.
Offline data collection
capabilities:                    (0x11) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  10) minutes.

SMART Attributes Data Structure revision number: 4
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   ---    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   014   100   ---    Old_age   Always       -       14
 12 Power_Cycle_Count       0x0032   100   100   ---    Old_age   Always       -       46
166 Min_W/E_Cycle           0x0032   100   100   ---    Old_age   Always       -       0
167 Min_Bad_Block/Die       0x0032   100   100   ---    Old_age   Always       -       46
168 Maximum_Erase_Cycle     0x0032   100   100   ---    Old_age   Always       -       2
169 Total_Bad_Block         0x0032   100   100   ---    Old_age   Always       -       935
171 Program_Fail_Count      0x0032   100   100   ---    Old_age   Always       -       0
172 Erase_Fail_Count        0x0032   100   100   ---    Old_age   Always       -       0
173 Avg_Write/Erase_Count   0x0032   100   100   ---    Old_age   Always       -       0
174 Unexpect_Power_Loss_Ct  0x0032   100   100   ---    Old_age   Always       -       7
184 End-to-End_Error        0x0032   100   100   ---    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   ---    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   ---    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   059   043   ---    Old_age   Always       -       41 (Min/Max 25/43)
199 SATA_CRC_Error          0x0032   100   100   ---    Old_age   Always       -       0
212 SATA_PHY_Error          0x0032   100   100   ---    Old_age   Always       -       0
230 Perc_Write/Erase_Count  0x0032   100   100   ---    Old_age   Always       -       0
232 Perc_Avail_Resrvd_Space 0x0033   100   100   004    Pre-fail  Always       -       100
233 Total_NAND_Writes_GiB   0x0032   100   100   ---    Old_age   Always       -       40
241 Total_Writes_GiB        0x0030   253   253   ---    Old_age   Offline      -       17
242 Total_Reads_GiB         0x0030   253   253   ---    Old_age   Offline      -       11
244 Thermal_Throttle        0x0032   000   100   ---    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

Selective Self-tests/Logging not supported
You can see that it already shows 935 bad blocks. Is that normal, or is the SSD faulty?
Also, the temperature is shown as 41. Is that rather high? The laptop's original drive (an HDD) was wrapped in a foil cover. When I replaced it with the SSD, I wasn't sure whether the foil was needed, so I just transferred it onto the SSD. Maybe that's not a good idea heat-wise?
06-07-2016 04:08 PM - edited 06-07-2016 04:08 PM
Hm, I found this, which sounds very suspicious: http://www.thinkwiki.org/wiki/Category:W540
Activating the NVIDIA GPU causes memory corruption leading to crashes and file system corruption. https://github.com/Bumblebee-Project/bbswitch/issues/78
06-08-2016 10:27 AM
I don't have much experience with Debian, but regarding the bad block count: for a 480 GB drive, I wouldn't be too worried unless the count starts increasing rapidly. In other words, this is still within the acceptable range.
As for the drive temperature, it is within spec and you should not worry about it. If you do a lot of writes to the drive, the temperature will rise a little, but I don't see any problem with that.
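If you want to check whether the bad block count is actually growing, one option is to save the attribute table from time to time and compare the raw values. Below is a minimal Python sketch of that idea, not an official tool. It assumes the table has the column layout shown in the output above; the helper names (`parse_smart_attributes`, `raw_increase`) and the `SAMPLE` text are made up for illustration, and what counts as a "rapid" increase is left for you to judge.

```python
import re

# One row of the attribute table printed by smartctl, assumed layout:
# ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+\d+\s+\d+\s+\S+\s+\S+\s+\S+\s+\S+\s+(\d+)"
)

# A few rows copied from the report in this thread, used as sample input.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   ---    Old_age   Always       -       0
169 Total_Bad_Block         0x0032   100   100   ---    Old_age   Always       -       935
194 Temperature_Celsius     0x0022   059   043   ---    Old_age   Always       -       41 (Min/Max 25/43)
"""


def parse_smart_attributes(text):
    """Return {attribute_name: leading integer of RAW_VALUE} for each table row."""
    attrs = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:  # header and non-table lines are skipped
            attrs[match.group(2)] = int(match.group(3))
    return attrs


def raw_increase(old, new, name):
    """Difference in one attribute's raw value between two snapshots."""
    return new[name] - old[name]


if __name__ == "__main__":
    snapshot = parse_smart_attributes(SAMPLE)
    print(snapshot["Total_Bad_Block"])
```

In practice you would feed it the saved text of two reports taken some days apart and look at `raw_increase(old, new, "Total_Bad_Block")`.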