Hi Elmar,
Yes, I believe I have a full dump of the NAND data. The chip storage actually totals 274877906944 bytes, going by the IDs of the NAND chips and the datasheet. But the drive reports, IIRC, 256 * 10^9 bytes or so. I don't want to use the suffix GB (or Gb) because drive manufacturers seem to use GB to mean 1 * 10^9 bytes, rather than the 1024*1024*1024 bytes that the rest of the computing world understands by the suffix. It's been a rather confusing point.
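(For reference, 274877906944 = 256 * 1024^3, i.e. exactly 256 GiB, while the advertised 256 * 10^9 bytes is about 7% less than that.)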
But the chips total the amount I wrote above, and I can read all of it out to a file 274877906944 bytes in size, which I now work from.
The chips also have a spare area of 128 bytes per 4096 bytes (so 512 bytes per 16 KB page), but I can't seem to access it. It's possible that important info is stored there, but I don't think so; you might see why in a minute.
If you're interested, the chip datasheet is here:
https://docs-emea.rs-online.com/webdocs/0def/0900766b80defffe.pdf

The data is arranged in chips as you know, 32 of them in my case, and the chips are organised into 'planes' and 'dies'. I had to choose some numbers to give to the firmware before I read the chips, to tell it how to arrange them for reading. I did some experimenting and settled on a configuration which seems to work. I believe this because in the 274 GB dump I mentioned above I can find many, many 16 KB chunks which I can see are not scrambled. For instance, there are 16 KB segments of HTML which I can read through and see are fully intact chunks of HTML pages I must have had in the browser cache on the disk.
So that makes me think I have configured the chips correctly, although it's possible I am wrong.
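If it helps, this is roughly the kind of scan I mean. A minimal sketch in Python; the dump file name and the "mostly printable ASCII" heuristic are my own choices for illustration, not anything the drive dictates:

```python
# Scan a raw NAND dump for 16 KB pages that look like plain text/HTML.
# Pages that are mostly printable ASCII are a good sign the dump is not
# scrambled, i.e. the chip plane/die configuration was chosen correctly.
PAGE = 16 * 1024

def looks_like_text(page: bytes, threshold: float = 0.95) -> bool:
    printable = sum(1 for b in page if 32 <= b < 127 or b in (9, 10, 13))
    return printable / len(page) >= threshold

with open("nand_dump.bin", "rb") as f:  # hypothetical file name
    index = 0
    while True:
        page = f.read(PAGE)
        if len(page) < PAGE:
            break
        if looks_like_text(page):
            print(f"page {index} looks like readable text/HTML")
        index += 1
```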
Now for the interesting bit. The data is, as I said, organised in 16 KB chunks; blocks, pages, I am not sure what to call them (the chip datasheet tends to call them pages). There are 127 blocks of what I am sure is data from my drive, i.e. my user data. This is where I find things like web pages, image header data etc. Then there is a final block (16 KB in size) which contains what I am sure is LBA data plus a bit more. Then the pattern repeats, about 131,000 times (giving the roughly 274 GB file I mentioned). Each of these 128 * 16 KB chunks is the size of an erase block (2 MB or so).
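(The arithmetic checks out: 128 * 16384 bytes = 2097152 bytes = 2 MiB per section, and 274877906944 / 2097152 = 131072 sections.)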
To describe this last block a bit more: it starts with an identifier which seems to indicate the section's type. Not all 2 MB erase-size sections are data; some carry different IDs in this field. I don't know what they all mean, but they must be drive housekeeping data. I ignore them, for now at least.
Next in this last block come 128 bits which seem to indicate whether the following LBAs are valid. A bit is set to 0 when there is a duplicate of that LBA in the same 2 MB section.
Then there are 128 numbers, all between 0 and 16777216, the perfect range for a 256 GB disk with a 16 KB block size. I assume they are the LBAs. I have written them all out to a 'map file', which should be the translation table between physical and logical addresses you mentioned.
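Roughly the layout I'm describing, as a parsing sketch. To be clear, the exact field offsets and the 4-byte little-endian encoding of the LBA entries are my assumptions for illustration; I haven't pinned them down precisely here:

```python
# Parse the final 16 KB block of a 2 MB erase-block section.
# Assumed layout: a type identifier first, then a 128-bit validity
# bitmap, then 128 LBA entries (assumed 4-byte little-endian each).
import struct

PAGE = 16 * 1024
SECTION = 128 * PAGE  # one erase block: 127 data blocks + 1 metadata block

def parse_metadata(meta: bytes):
    type_id = meta[0:4]                            # assumed 4-byte identifier
    bitmap = int.from_bytes(meta[4:20], "little")  # 128 validity bits
    lbas = struct.unpack("<128I", meta[20:20 + 128 * 4])
    valid = [(bitmap >> i) & 1 for i in range(128)]
    return type_id, valid, list(lbas)

with open("nand_dump.bin", "rb") as f:  # hypothetical file name
    f.seek(SECTION - PAGE)              # metadata block of the first section
    type_id, valid, lbas = parse_metadata(f.read(PAGE))
    print(type_id.hex(), lbas[:8])
```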
So, as I said, I have made a kernel driver which can read this map file and translate block requests on the fly. Then I mount the dump of the disk I made, and use the map file to translate requests to the kernel for a particular block into the address pointed to by the LBAs I mentioned above. This 'translated' block device appears as a virtual device in /dev.
So, if I read from the virtual device, asking for say block 0, the modified stackbd module looks up which physical block in the original dump had an LBA of 0 (perhaps it's block 12345, 54321, etc.) and modifies the request to the kernel to return that block instead of block 0.
By doing this I am translating the arrangement of blocks from the original dump I have, using the addresses that I believe are LBAs, into a device sorted by LBA.
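In user space the same lookup logic would look something like this. Just an illustration of what my stackbd module does in-kernel, with a map-file format (one 4-byte little-endian physical block number per LBA) that I've invented for the example:

```python
# User-space illustration of the remap the modified stackbd module
# performs in the kernel: logical block (LBA) -> physical block.
# The map-file format is invented for this sketch: entry N is the
# 4-byte little-endian physical block number that holds LBA N.
import struct

BLOCK = 16 * 1024

def load_map(path: str) -> list[int]:
    with open(path, "rb") as f:
        data = f.read()
    return list(struct.unpack(f"<{len(data) // 4}I", data))

def read_logical(dump, lba_to_phys: list[int], lba: int) -> bytes:
    phys = lba_to_phys[lba]   # which physical block holds this LBA
    dump.seek(phys * BLOCK)   # simplified: assumes metadata blocks are
    return dump.read(BLOCK)   # already accounted for in the numbering

lba_to_phys = load_map("map_file.bin")            # hypothetical file names
with open("nand_dump.bin", "rb") as dump:
    block0 = read_logical(dump, lba_to_phys, 0)   # 'block 0' as the FS sees it
```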
This all sounds good, but there are problems. The main one, which I mentioned earlier, is that for many LBAs there are several references to them on the disk. I.e. the data in those last blocks I believe is LBAs, and if I put those LBAs into an array as I scan the disk, I sometimes get duplicate entries. There are 16 million LBAs and I find 700k duplicates, around 5%. Somewhere the drive must store the info I need to tell which copy is current, but I can't find it yet, although as I say I wonder if I can get clues by examining the FS itself.
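For what it's worth, this is how I count the collisions; same assumed metadata layout as the parsing sketch above, same caveats:

```python
# Count how many LBAs are claimed by more than one erase-block section.
# Assumed layout as before: 128-bit validity bitmap at offset 4, then
# 128 x 4-byte little-endian LBA entries.
import struct
from collections import defaultdict

PAGE = 16 * 1024
SECTION = 128 * PAGE

seen = defaultdict(list)  # lba -> list of (section, slot) that claim it
with open("nand_dump.bin", "rb") as f:  # hypothetical file name
    section = 0
    while True:
        f.seek(section * SECTION + SECTION - PAGE)
        meta = f.read(PAGE)
        if len(meta) < PAGE:
            break
        bitmap = int.from_bytes(meta[4:20], "little")
        lbas = struct.unpack("<128I", meta[20:20 + 128 * 4])
        for i, lba in enumerate(lbas):
            if (bitmap >> i) & 1:  # only count entries marked valid
                seen[lba].append((section, i))
        section += 1

dupes = sum(1 for refs in seen.values() if len(refs) > 1)
print(f"{dupes} of {len(seen)} LBAs are referenced more than once")
```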
BTW, a couple of years ago I was asking for help on hddguru and went into some of this in more detail; the link is here:
https://forum.hddguru.com/viewtopic.php?f=10&t=35428&mobile=mobile