SSD Bad Block Management

How do bad blocks occur? How does an SSD detect and manage them? What problems exist in the bad block management strategies proposed by manufacturers, and which management method is superior? Does formatting the disk cause the loss of the bad block list? What potential risks can SSD repair introduce?

This article will help you uncover these secrets.

Overview

The design of bad block management affects both the reliability and the efficiency of an SSD. The bad block management methods offered by some SSD vendors are not always well thought out: if abnormal conditions are not considered comprehensively during product design, unexpected bad blocks will appear.

For example, after testing several SSDs with different controllers, we found that new bad blocks caused by abnormal power failure are very common. Searching for “bad blocks by power failure” or similar keywords shows that this problem not only appears during testing but is also a big headache for end users.



Who Shall Manage the Bad Blocks?

For SSD controllers without a dedicated NAND flash file system, bad blocks are managed by the controller firmware; for those with a dedicated NAND flash file system, bad blocks are managed by the file system or the driver.



Bad Blocks Fall into Three Types

1. Factory bad blocks (or early bad blocks): blocks that failed the standard set by the NAND flash factory and were marked as bad before delivery. Some factory bad blocks are erasable while others are not.

2. New bad blocks caused by wear during use.

3. Pseudo bad blocks misjudged by the SSD controller due to power abnormalities.

If the SSD does not support power failure protection, new bad blocks are not always caused by wear, because an abnormal power loss may produce either pseudo bad blocks misjudged by the controller or real new bad blocks. Without power failure protection, if the Lower Page has finished programming and power fails while the Upper Page is being programmed, data errors in the Lower Page are inevitable. When the error count exceeds the ECC capability of the SSD, reads fail, the controller judges the block as bad, and it is entered into the bad block table.

Some new bad blocks can be erased, and after erasure they may not show errors again during read/write/erase operations, because the occurrence of errors also depends on the data pattern being written: errors may appear with one pattern but not with another.

The Percentage of Factory Bad Blocks in the Whole Device

After consulting several NAND flash manufacturers, the general answer we got is that the percentage of factory bad blocks is below 2%, and that manufacturers reserve an allowance so that bad blocks still account for less than 2% even when the maximum P/E cycle count they promise is reached. Keeping this rate under 2% is not always easy, though: some new samples from one manufacturer tested above the limit, at 2.55%.



Judging Methods for Bad Blocks

1.   Judging methods for Factory Bad Blocks

A bad block scan checks whether the byte at the factory-specified location contains FFh; if it does not, the block is a bad block.

The location of the bad block flag is roughly the same for each manufacturer, but it differs between SLC and MLC NAND flash. Take Micron as an example:

1.1 For SLC with small pages (528 bytes), the location is the 6th byte of the spare area in the first page of each block; the block is judged bad if this byte is not FFh.

1.2 For SLC with large pages (2112 bytes or more), the locations are the 1st and 6th bytes of the spare area in the first page of each block; the block is judged bad if either byte is not FFh.

1.3 For MLC, the bad block flag is the 1st or 2nd byte of the spare area in the first and last pages of each block; the block is good if these bytes read 0xFF, and bad otherwise.
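The scan rules above can be sketched as follows. This is an illustrative sketch only: `read_spare_byte` is a stand-in for a real NAND driver call, and the block geometry is assumed.

```python
PAGES_PER_BLOCK = 128  # assumed geometry, varies by device

def is_factory_bad_slc_small(read_spare_byte, block):
    # SLC, 528-byte pages: the 6th spare byte (offset 5) of the first page
    return read_spare_byte(block, 0, 5) != 0xFF

def is_factory_bad_slc_large(read_spare_byte, block):
    # SLC, 2112-byte pages: the 1st and 6th spare bytes of the first page
    return any(read_spare_byte(block, 0, off) != 0xFF for off in (0, 5))

def is_factory_bad_mlc(read_spare_byte, block):
    # MLC: the 1st spare byte of the first and last pages of the block
    return any(read_spare_byte(block, page, 0) != 0xFF
               for page in (0, PAGES_PER_BLOCK - 1))

# Simulated device: block 7 carries a 0x00 bad-block mark in its first page.
marks = {(7, 0, 0): 0x00}
fake_read = lambda blk, page, off: marks.get((blk, page, off), 0xFF)

print(is_factory_bad_mlc(fake_read, 7))  # True: bad block
print(is_factory_bad_mlc(fake_read, 8))  # False: good block
```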

The following picture is from a Hynix datasheet:



What data do bad blocks contain, all 0s or all 1s? The test result we got is shown in the following picture. This may not be the whole truth: it may hold for factory bad blocks but not for new bad blocks; otherwise, how could data hiding inside bad blocks be possible?


Are Factory Bad Blocks Erasable?

Some “can be erased” while others are protected by the factory against erasure. “Can be erased” here only means the bad block mark can be cleared by sending an erase command; it does not mean the block can be used.

Manufacturers strongly recommend not erasing the bad block marks, as they cannot be restored once erased, and writing data into bad blocks is risky.



2.   Judging Methods for New Bad Blocks During Use

Whether an operation on NAND flash succeeds is judged from the feedback of the status register. During a Program or Erase, if the status register reports failure, the SSD controller marks the block as bad.

Details as follows:

2.1 An error occurs while executing an erase command.

2.2 An error occurs while executing a write command.

2.3 An error occurs while executing a read command: if the bit error rate exceeds the ECC capability, the block is judged bad.
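The three checks can be condensed into a small decision function. This is an illustrative sketch: `ECC_LIMIT` is an assumed correction capability, and real firmware would read the NAND status register rather than take a flag.

```python
ECC_LIMIT = 40  # correctable bits per codeword (assumed value)

def should_mark_bad(op, status_fail=False, bit_errors=0):
    """Return True if the block should be marked bad after this operation."""
    if op in ("erase", "program"):
        # 2.1 / 2.2: the status register reported a failure
        return status_fail
    if op == "read":
        # 2.3: only condemn the block when errors exceed ECC capability
        return bit_errors > ECC_LIMIT
    raise ValueError(f"unknown operation: {op}")

print(should_mark_bad("erase", status_fail=True))  # True
print(should_mark_bad("read", bit_errors=12))      # False
print(should_mark_bad("read", bit_errors=55))      # True
```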

Bad Block Management Methods

Bad blocks are managed by building and updating a BBT (Bad Block Table). There is no unified standard for maintaining the BBT: some engineers use one table for both factory bad blocks and new bad blocks, while others use two separate tables.

The encodings used in the BBT also differ. Some are simple, e.g. 0 for a good block and 1 for a bad block. Others are more detailed, e.g. 00 for a factory bad block, 01 for a bad block from a Program failure, 10 for one from a Read failure, and 11 for one from an Erase failure.
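One possible shape of such a table is sketched below, using a dictionary keyed by block number so that absence means “good”. The codes mirror the two-bit scheme above, but the structure is our own illustration, not any vendor's format.

```python
from enum import Enum

class BadKind(Enum):
    FACTORY      = 0b00  # factory bad block
    PROGRAM_FAIL = 0b01  # failed during Program
    READ_FAIL    = 0b10  # failed during Read (ECC exceeded)
    ERASE_FAIL   = 0b11  # failed during Erase

bbt = {}  # block number -> BadKind; a block absent from the table is good

def mark_bad(block, kind):
    bbt[block] = kind

def is_bad(block):
    return block in bbt

mark_bad(17, BadKind.ERASE_FAIL)
print(is_bad(17), is_bad(18))  # True False
print(bbt[17].value)           # 3 (the 11 code for an erase failure)
```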

The BBT is generally stored in dedicated areas (e.g. Block 0, Page 0 and Block 1, Page 1), so it can be read back directly and efficiently after each power-on. Since the NAND flash itself may be damaged and the BBT lost, manufacturers usually keep backups of the BBT: some keep 2 copies, others 8. In general, the number of backups can be derived from the majority-voting model of probability theory; in any case, 2 backups should be the minimum.
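The backup idea can be illustrated with a simple majority vote across the stored copies. This is a sketch under the assumption that each copy is read back as raw bytes, with unreadable copies returned as None.

```python
from collections import Counter

def recover_bbt(copies):
    """Pick the BBT image that a strict majority of readable copies agree on."""
    votes = Counter(bytes(c) for c in copies if c is not None)
    if not votes:
        raise RuntimeError("no readable BBT copy")
    winner, count = votes.most_common(1)[0]
    if count <= len(copies) // 2:
        raise RuntimeError("no majority among BBT copies")
    return winner

good = bytes([0x00, 0x01, 0x00])
corrupt = bytes([0xFF, 0x01, 0x00])
print(recover_bbt([good, good, corrupt]) == good)  # True: the flipped copy is outvoted
```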

Bad block management strategies generally include: Bad Block Skipping Strategy and Bad Block Replacing Strategy.

Bad Block Skipping Strategy

1.   For initial bad blocks, the BBT lets the controller skip the corresponding blocks, and data is written directly into the next good block.

2.   When a new bad block is discovered, the BBT is updated, the valid data in the bad block is moved to the next good block, and from then on this block is skipped during every Read, Program, or Erase operation.
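The skipping rule reduces to a very small piece of logic, sketched here with `bbt` as any set-like bad block table (illustrative only):

```python
def next_good_block(start, bbt, total_blocks):
    """Return the first good block at or after `start`, or None if none is left."""
    block = start
    while block < total_blocks and block in bbt:
        block += 1  # skip each block listed in the BBT
    return block if block < total_blocks else None

bbt = {3, 4}  # blocks 3 and 4 are bad
print(next_good_block(3, bbt, 10))  # 5: both bad blocks are skipped
print(next_good_block(7, bbt, 10))  # 7: already a good block
```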

Bad Block Replacing Strategy (proposed by some NAND Flash Manufacturer)

Bad block replacement means replacing new bad blocks found during use with good blocks from a reserved block area. Suppose an error is discovered in page n during a program operation. Under the replacement strategy, the data in pages 0 to (n-1) is moved to the same locations in an idle block (e.g. Block D) in the reserved block area, and the page-n data still held in the data register is then written into page n of Block D.
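The page-copy sequence described above can be sketched like this, with the flash array modeled as a dictionary keyed by (block, page); the function name and layout are our own illustration.

```python
def replace_block(flash, bad_block, spare_block, failed_page, pending_data):
    """Salvage pages 0..n-1 into the spare block, then retry page n there.

    flash: dict mapping (block, page) -> page data.
    Returns the FTL remap entry (bad block -> spare block).
    """
    for page in range(failed_page):                   # copy pages 0..n-1
        flash[(spare_block, page)] = flash[(bad_block, page)]
    flash[(spare_block, failed_page)] = pending_data  # write the held page n
    return {bad_block: spare_block}

# Programming page 3 of block 2 failed; block 9 is an idle reserved block.
flash = {(2, p): f"data{p}" for p in range(3)}
remap = replace_block(flash, 2, 9, 3, "data3-retry")
print(remap)          # {2: 9}
print(flash[(9, 3)])  # data3-retry
```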

NAND flash manufacturers suggest dividing the whole data area into two parts: a user-addressable block area for normal data operations, and a reserved block area dedicated to replacing bad blocks and storing the BBT (Bad Block Table). The reserved block area takes 2% of the whole SSD capacity.



When a bad block is discovered, the FTL maps its address to that of a good block in the reserved block area rather than skipping to the next good block. Before each program to a logical address, the SSD checks whether the target is a bad block; if so, it writes the data to the corresponding address in the reserved block area.

We have found no statement on whether the 2% reserved block area is included in the OP area or is extra, nor on whether the reserved block area is dynamic or static. If it is a separate, static area, it has the following disadvantages:

1.   Reserving 2% of the area solely for replacing bad blocks shrinks the user-addressable block area and wastes space. Meanwhile, the reduced number of user-addressable blocks increases the average wear on the usable blocks.

2.   Suppose the bad blocks in the user-addressable block area exceed 2%, i.e. the reserved block area has been used up for replacements. If new bad blocks then appear, there are no reserved blocks left and the SSD reaches the end of its life.

In fact, reserving a separate 2% area for replacing bad blocks is rarely seen in real product designs. Generally, free blocks in the OP (over-provisioning) area are used to replace new bad blocks found during use. Take garbage collection as an example: when the mechanism runs, it first moves the valid pages of the blocks to be recycled into free blocks and then erases those blocks. If at that moment the Erase status register reports a failure, the bad block management mechanism records the block's address in the new bad block list, moves its valid pages into free blocks in the OP area, and updates the BBT; on the next program, the controller simply skips this bad block to the next usable block.

Different manufacturers size the OP area differently, and different applications have different reliability requirements, so SSD designs vary in OP size. There is a trade-off between OP and reliability: the larger the OP area, the more room garbage collection has during sequential writes, the more reliably the SSD performs, and the smoother its performance curve. Conversely, the smaller the OP, the less reliably the SSD performs, though the user-addressable block area is larger, which means lower cost.

Generally, the OP size is set between 5% and 50%, with 7% being a common figure. Unlike the fixed 2% reserved block area suggested by the manufacturer, the 7% is not a set of dedicated blocks but is distributed dynamically across all blocks, which also benefits the wear-leveling algorithm.
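One reason roughly 7% is so common is the gap between decimal and binary capacity units: a drive built from binary gibibytes of raw NAND but advertised in decimal gigabytes can keep the difference as OP. A quick calculation (our own illustration) shows the gap:

```python
raw_nand = 256 * 1024**3  # 256 GiB of physical NAND on the drive
user_cap = 256 * 1000**3  # advertised "256 GB" user capacity
op_ratio = (raw_nand - user_cap) / user_cap
print(f"{op_ratio:.1%}")  # 7.4%
```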

Risks for Repaired SSD

Most SSD manufacturers without their own controller technology handle RMA items by replacing the defective components and re-flashing the firmware. The new bad block list is lost at that moment, yet the unchanged NAND flash chips probably already contain bad blocks; if the OS or sensitive data is then written into those bad blocks, a system crash is likely. Even for SSD manufacturers with their own controller technology, whether the existing bad block list is preserved depends on their attitude toward users.

Do Bad Blocks Affect SSD Read/Write Speed and Stability?

Initial (factory) bad blocks are isolated from the bit lines and thus do not affect the program/erase speed of other blocks. But once the SSD accumulates enough new bad blocks, the user-addressable blocks shrink, garbage collection runs more often, and the reduced OP capacity severely hurts garbage collection efficiency. A certain number of new bad blocks therefore affects performance stability, especially during sustained writes, where garbage collection drags performance down and the performance curve shows wide fluctuations.


reliable SSD manufacturer: www.renice-tech.com
Contact person: May Lau (may@renice-tech.com)




