SSD Bad Block Management
How do bad blocks occur? By what means does an SSD detect and manage bad blocks? What problems exist in the bad block management strategies proposed by manufacturers, and which management method is superior? Does formatting the disk lead to loss of the bad block list? What potential risks can SSD repair introduce? This article will help you uncover these questions.
Overview
The design of bad block management affects both the reliability and the efficiency of an SSD. The bad block management methods offered by some SSD vendors are not always sound: if abnormal conditions are not considered comprehensively during product design, unexpected bad blocks can result. For example, after testing several SSDs with different controllers, we found that new bad blocks caused by abnormal power failure are very common. Searching for "bad blocks by power failure" or similar keywords shows that this problem not only occurs during testing but is also a big headache for end users.
Who Shall Manage the Bad Blocks?
For SSD controllers without a dedicated NAND flash file system, bad blocks are managed by the controller firmware; for those with a dedicated NAND flash file system, bad blocks are managed by the file system or the driver.
Bad Blocks Are Classified into Three Types
1. Factory bad blocks (or early bad blocks) are blocks that failed the NAND flash manufacturer's tests and were marked as bad before delivery. Some factory bad blocks are erasable while others are not.
2. New bad blocks caused by wear during use.
3. Pseudo bad blocks misjudged by the SSD controller due to power abnormality.
New bad blocks are not always caused by wear. If the SSD does not support power failure protection, an abnormal power failure may produce pseudo bad blocks misjudged by the controller, or genuinely new bad blocks. Without power failure protection, if the lower page has finished programming and power fails while the upper page is being programmed, data errors in the lower page are inevitable. Reads fail once the error count exceeds the ECC capability of the SSD; the controller then judges the block as bad and adds it to the bad block table.
Some new bad blocks can be erased, and after erasure they may not show errors again during read/write/erase operations, because the occurrence of errors also depends on the data pattern written: errors may occur with one pattern but not with another.
The Percentage of Factory Bad Blocks in the Whole Device
After consulting several NAND flash manufacturers, we got a general answer: the percentage of factory bad blocks is less than 2%, and manufacturers also reserve a margin to ensure the bad block count stays under 2% even when the maximum P/E cycle count they promise is reached. Keeping this rate under 2% is not always easy, however: some new samples from one manufacturer tested above 2%, with an actual result of 2.55%.
Judging Methods for Bad Blocks
1. Judging Methods for Factory Bad Blocks
A bad block scan checks whether the byte at the factory-specified location contains FFh; anything other than FFh indicates a bad block.
The location where a bad block is flagged is roughly the same for each manufacturer, though it differs between SLC and MLC NAND flash. Take Micron as an example:
1.1 For SLC with small pages (528 bytes), the corresponding location is the 6th byte of the spare area in the first page of each block; the block is judged bad if this byte is not FFh.
1.2 For SLC with big pages (2112 bytes or larger), the corresponding locations are the 1st and 6th bytes of the spare area in the first page of each block; the block is judged bad if FFh is not found at these locations.
1.3 For MLC, the bad block is judged by scanning the 1st or the 2nd byte in the spare areas of the first and last pages of each block: the block is good if these bytes are 0xFF, and bad otherwise.
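The scan rules above can be sketched in Python. This is an illustration only: read_spare() and PAGES_PER_BLOCK stand in for a real NAND driver call and device geometry, and byte offsets are 0-based (the "6th byte" is offset 5).

```python
# Hypothetical factory bad-block scan following the Micron conventions
# described above. read_spare(block, page, offset) -> one spare-area byte.
SLC_SMALL, SLC_LARGE, MLC = "slc_small", "slc_large", "mlc"
PAGES_PER_BLOCK = 256  # illustrative geometry, device-specific in reality

def is_factory_bad(block, flash_type, read_spare):
    """Return True if the factory bad-block mark is present (byte != 0xFF)."""
    if flash_type == SLC_SMALL:
        # 6th byte (offset 5) of the spare area, first page of the block
        return read_spare(block, 0, 5) != 0xFF
    if flash_type == SLC_LARGE:
        # 1st and 6th bytes of the spare area, first page of the block
        return any(read_spare(block, 0, off) != 0xFF for off in (0, 5))
    if flash_type == MLC:
        # 1st or 2nd byte of the spare area, first and last pages
        last = PAGES_PER_BLOCK - 1
        return any(read_spare(block, page, off) != 0xFF
                   for page in (0, last) for off in (0, 1))
    raise ValueError(flash_type)
```

A real scan would run once over every block before the flash is used, because a normal erase destroys these marks (see below).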
The following picture is from a Hynix datasheet:
What data do bad blocks contain: all 0s or all 1s? The test result we got is shown in the following picture. This may not be the whole truth: it may hold for factory bad blocks but not for new bad blocks; otherwise, how could data hiding inside bad blocks be possible?
Are Factory Bad Blocks Erasable?
Some "can be erased" and some are prohibited from erasure by the factory. "Can be erased" here only means that the bad block marks can be changed by sending an erase command; it does not mean the bad blocks can be used. Manufacturers strongly recommend not erasing bad blocks: the bad block marks cannot be restored once they are erased, and it is risky to write data into bad blocks.
2. Judging Methods for New Bad Blocks During Use
Whether an operation on NAND flash succeeds is judged by the feedback from the status register. During a Program or Erase, if the status register reports failure, the SSD controller marks the block as bad.
Details as follows:
2.1 An error occurs while executing an erase command.
2.2 An error occurs while executing a write command.
2.3 An error occurs while executing a read command: if the bit error rate exceeds the ECC capability, the block is judged as bad.
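The three cases above can be sketched as a single decision function. The ECC_LIMIT value and the op names are illustrative assumptions, not a real controller interface.

```python
# Sketch of how a controller might classify a block as bad from operation
# results, per cases 2.1-2.3 above.
ECC_LIMIT = 40  # correctable bits per codeword; device-specific assumption

def judge_after_op(op, status_fail, bit_errors=0, ecc_limit=ECC_LIMIT):
    """Return True if the block should be marked bad."""
    if op in ("erase", "program"):
        # 2.1 / 2.2: the status register reported failure
        return status_fail
    if op == "read":
        # 2.3: bit errors exceed what the ECC can correct
        return bit_errors > ecc_limit
    raise ValueError(op)
```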
Bad Block Management Methods
Bad blocks are managed by building and updating a BBT (Bad Block Table). There is no unified standard for operating the BBT: some engineers use one table for both factory bad blocks and new bad blocks, while others use two separate tables.
The encoding of BBT entries also varies. Some encodings are simple, e.g. 0 indicates a good block and 1 a bad block. Other engineers adopt more detailed encodings, e.g. 00 for a factory bad block, 01 for a bad block from a Program failure, 10 for one from a Read failure, and 11 for one from an Erase failure.
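The two-bit encoding just described can be modeled as follows. The class and names are illustrative; real firmware would pack these codes into a flash-resident table rather than a Python dict.

```python
# Sketch of the 2-bit BBT encoding mentioned above (an illustration, not a
# standard): 00 factory bad, 01 program-fail, 10 read-fail, 11 erase-fail.
FACTORY_BAD, PROGRAM_FAIL, READ_FAIL, ERASE_FAIL = 0b00, 0b01, 0b10, 0b11

class BadBlockTable:
    def __init__(self):
        self.entries = {}            # block number -> 2-bit reason code

    def mark(self, block, reason):
        self.entries[block] = reason

    def is_bad(self, block):
        # good blocks are simply absent from the table
        return block in self.entries
```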
The BBT is generally stored in dedicated areas (e.g. Block 0 page 0 and Block 1 page 1), so it can be read directly and efficiently after each power-on. Since the NAND flash itself may be damaged and the BBT lost, manufacturers usually keep backups of the BBT: some keep 2 copies, some 8. In general, the number of backups can be derived from the voting model of probability theory; in any case, 2 backups should be the minimum.
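The voting idea above can be made concrete: with three or more copies, the firmware can take a majority vote. A minimal sketch, assuming each copy is read back as a byte string from its fixed flash location:

```python
# Recover the BBT from redundant copies by majority vote.
from collections import Counter

def recover_bbt(copies):
    """Return the copy held by a strict majority, else None.
    With only 2 copies a disagreement cannot be resolved by voting alone;
    a per-copy checksum would be needed to pick the intact one."""
    counts = Counter(copies)
    winner, votes = counts.most_common(1)[0]
    return winner if votes > len(copies) // 2 else None
```

This shows why 2 is only a minimum: two copies let you detect a conflict, but at least three let you resolve it by voting.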
Bad block management strategies generally include the Bad Block Skipping Strategy and the Bad Block Replacing Strategy.
Bad Block Skipping Strategy
1. For initial bad blocks, the BBT is used to skip the corresponding bad blocks, and data is written directly into the next good block.
2. When a new bad block is discovered, the BBT is updated, the valid data in the bad block is moved to the next good block, and from then on the bad block is skipped during every Read, Program, or Erase operation.
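Both steps can be sketched as follows. The BBT is modeled as a simple set of bad block numbers, and move_valid_pages() is an assumed primitive, not a real API.

```python
# Sketch of the skipping strategy above.
def next_good_block(start, bbt, total_blocks):
    """Skip forward from `start` to the first block not in the BBT."""
    b = start
    while b < total_blocks:
        if b not in bbt:
            return b
        b += 1
    raise RuntimeError("no good block left")

def handle_new_bad_block(block, bbt, move_valid_pages, total_blocks):
    bbt.add(block)                               # update the BBT
    dst = next_good_block(block + 1, bbt, total_blocks)
    move_valid_pages(block, dst)                 # relocate valid data
    return dst
```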
Bad Block Replacing Strategy (proposed by some NAND flash manufacturers)
The bad block replacing strategy replaces new bad blocks discovered during use with good blocks from a reserved block area. Suppose an error is discovered in page n during a program operation. With this strategy, the data in pages 0 to n-1 is moved to the same locations in an idle block (e.g. Block D) in the reserved block area, and then the data for page n, still held in the data register, is written into page n of Block D.
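The page-level replacement just described can be sketched as follows; read_page() and write_page() stand in for real NAND operations and are assumptions.

```python
# Sketch of page replacement on a program failure: copy pages 0..n-1 from
# the failing block into a reserved block, then write the page-n data
# (still in the data register) to page n of that reserved block.
def replace_on_program_fail(bad_block, n, data_register,
                            reserved_block, read_page, write_page):
    for page in range(n):                          # pages 0 .. n-1
        write_page(reserved_block, page, read_page(bad_block, page))
    write_page(reserved_block, n, data_register)   # the page that failed
    return reserved_block
```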
NAND flash manufacturers suggest dividing the whole data area into two parts: a user addressable block area for normal data operations, and a reserved block area specifically for replacing bad blocks and storing the BBT (Bad Block Table). The reserved block area takes 2% of the whole SSD capacity.
When a bad block is discovered, the FTL maps its address to the address of a good block in the reserved block area rather than skipping to the next good block. Before each program to a logical address, the SSD determines whether the target is a bad block; if it is, the SSD writes the data to the corresponding address in the reserved block area.
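The remapping lookup can be sketched as a small table; the dict-based map and names are illustrative assumptions.

```python
# Sketch of the FTL remapping described above: accesses aimed at a bad
# block are redirected to its replacement in the reserved area.
class ReplacementMap:
    def __init__(self, reserved_blocks):
        self.free_reserved = list(reserved_blocks)
        self.remap = {}                    # bad block -> reserved block

    def retire(self, block):
        if not self.free_reserved:
            raise RuntimeError("reserved area exhausted")
        self.remap[block] = self.free_reserved.pop(0)

    def resolve(self, block):
        # good blocks map to themselves; bad blocks to their replacement
        return self.remap.get(block, block)
```

Note that retire() raising once the reserved pool is empty is exactly the end-of-life condition discussed below.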
We have found no statement on whether the reserved 2% is included in the OP area or is extra, nor on whether the reserved block area is dynamic or static. If it is a separate, static area, it has the following disadvantages:
1. Reserving 2% specifically for replacing bad blocks reduces the user addressable block area and wastes space. Meanwhile, fewer user addressable blocks accelerate the average wear on the usable blocks.
2. If the bad blocks in the user addressable area exceed 2%, that is, the reserved block area has been used up for replacements, then when new bad blocks appear there are no reserved blocks left and the SSD reaches the end of its life.
In fact, reserving a separate 2% area for replacing bad blocks is rarely seen in real product designs. Generally, free blocks in the OP (over-provisioning) area are used to replace new bad blocks during use. Take garbage collection as an example: when the garbage collection mechanism works, it first moves the valid pages of the blocks to be recycled into free blocks, then erases those blocks. Suppose the erase status register reports an erase failure at that moment: the bad block management mechanism records the address of this block in the new bad block list, moves the valid pages of the bad block into free blocks in the OP area, and updates the BBT (bad block table); on the next program, it simply skips this bad block to the next usable block.
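The garbage collection scenario above can be sketched as follows; erase() and move_valid_pages() are assumed primitives, and the BBT is modeled as a set.

```python
# Sketch of recycling a block during garbage collection, retiring it into
# the BBT if the erase fails.
def recycle_block(block, free_blocks, bbt, move_valid_pages, erase):
    move_valid_pages(block, free_blocks)     # relocate valid pages first
    if erase(block):                         # erase status: True = success
        free_blocks.append(block)            # block returns to the pool
    else:
        bbt.add(block)                       # record in the new bad block
                                             # list; future programs skip it
```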
Different manufacturers set different OP sizes, and different applications have different reliability requirements, so SSD solutions vary in OP size. There is a trade-off between OP and reliability: the larger the OP area, the more usable space garbage collection has during sequential writes, the more reliably the SSD performs, and the smoother its performance curve. Conversely, the smaller the OP, the less reliably the SSD performs, but the larger the user addressable block area, which means lower cost.
Generally, OP size is set between 5% and 50%, with 7% a common figure. Unlike the fixed 2% reserved block area suggested by the manufacturer, the 7% is not a fixed set of blocks but is distributed dynamically across all blocks, which benefits the wear leveling algorithm.
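One common source of roughly 7% OP, as a worked example: raw NAND is sized in binary GiB while advertised capacity is in decimal GB, and the gap between the two is about 7%.

```python
# Why "7%" is a common OP figure: a drive built from 256 GiB of raw NAND
# but advertised as 256 GB keeps the difference as over-provisioning.
raw_bytes = 256 * 2**30        # 256 GiB of NAND
user_bytes = 256 * 10**9       # 256 GB advertised to the user
op_ratio = (raw_bytes - user_bytes) / user_bytes
print(f"{op_ratio:.1%}")       # about 7.4%
```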
Risks for Repaired SSDs
Most SSD manufacturers without their own controller technology handle RMA items by replacing the defective components and re-flashing the firmware. The new bad block list is lost at this moment, yet bad blocks are probably already present in the unchanged NAND flash; if the OS or sensitive data is written into those bad blocks, a system crash is likely. Even for SSD manufacturers with their own controller technology, whether the existing bad block list is preserved depends on their attitude towards users.
Do Bad Blocks Affect SSD Read/Write Speed and Stability?
Initial (factory) bad blocks are isolated from the bit lines and so do not affect the program/erase speed of other blocks. But if an SSD accumulates enough new bad blocks, the user addressable blocks decrease, causing more frequent garbage collection; meanwhile the reduced OP capacity severely hurts garbage collection efficiency. Therefore, a certain amount of new bad blocks will impact SSD performance stability, especially during sustained write operations, because garbage collection reduces performance; in this case the SSD performance curve shows wide fluctuations.
reliable SSD manufacturer: www.renice-tech.com
Contact person: May Lau (may@renice-tech.com)