My NVMe experience

I bought a 512GB Toshiba XG3 NVMe module and put it on a PCIe 3.1 adapter.
I wanted to install FreeBSD to it, so I modified a FreeBSD memstick installer to add NVMe support.

I mounted my USB memstick installer on a FreeBSD laptop and added the NVMe drivers to boot/loader.conf:
Code:
nvme_load="YES"
nvd_load="YES"

Then I booted the memstick installer on my NVMe machine, checked nvmecontrol devlist, and saw my device. So I ran the installer, used nvd0, and it went fast even with the source and ports sets checked. Afterwards I edited boot/loader.conf on the new install to add NVMe support and rebooted.
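A quick sanity check after that reboot (a sketch, assuming the drivers were loaded as modules from loader.conf rather than built into the kernel):
Code:
kldstat | grep -E 'nvme|nvd'    # nvme.ko and nvd.ko should be listed
nvmecontrol devlist             # controller and namespace should show up
dmesg | grep -i nvme            # attach messages with queue/interrupt details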

Wah. The BIOS won't boot to the device; it goes straight to the BIOS screen. The device is not found in the BIOS.
Oh well, it is only one Ivy Bridge board I have, a Jetway NF9G-QM77.
I am able to benchmark by booting a USB drive with NVMe support added, but it looks like the motherboard is running the card at x1, so not full speed. I need to see if I can force it to x4; it is in an x16 slot.
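One thing I can try to check the negotiated link width (a sketch; the device naming in the listing may differ) is to look at the PCI-Express capability that pciconf reports for the NVMe controller:
Code:
# list PCI devices with their capabilities; find the nvme0 entry and check its
# PCI-Express cap line, e.g. "link x4(x1)" = x4-capable card negotiated at x1
pciconf -lc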

I have an Asrock Q77 board to test next. FreeBSD is doing well, but I need better hardware. An X79 board would be nice.
 
No boot on the Q77 board either. From some reading, it appears I need a Z97 or X99 chipset board to boot from NVMe.
There are some Supermicro server boards with an NVMe slot.

Here are my module details
Code:
root@Mushkin:~ # nvmecontrol identify nvme0
Controller Capabilities/Features
================================
Vendor ID:                  1179
Subsystem Vendor ID:        1179
Serial Number:              569S105MT5ZV
Model Number:               THNSN5512GPU7 NVMe TOSHIBA 512GB
Firmware Version:           57DA4103
Recommended Arb Burst:      1
IEEE OUI Identifier:        00 08 0d
Multi-Interface Cap:        00
Max Data Transfer Size:     Unlimited

Admin Command Set Attributes
============================
Security Send/Receive:       Not Supported
Format NVM:                  Supported
Firmware Activate/Download:  Supported
Abort Command Limit:         4
Async Event Request Limit:   4
Number of Firmware Slots:    1
Firmware Slot 1 Read-Only:   No
Per-Namespace SMART Log:     No
Error Log Page Entries:      128
Number of Power States:      5

NVM Command Set Attributes
==========================
Submission Queue Entry Size
  Max:                       64
  Min:                       64
Completion Queue Entry Size
  Max:                       16
  Min:                       16
Number of Namespaces:        1
Compare Command:             Not Supported
Write Uncorrectable Command: Supported
Dataset Management Command:  Supported
Volatile Write Cache:        Present
 
Here are some numbers.
Code:
root@Mushkin:~ # diskinfo -t /dev/nvd0
/dev/nvd0
        512             # sectorsize
        512110190592    # mediasize in bytes (477G)
        1000215216      # mediasize in sectors
        0               # stripesize
        0               # stripeoffset
        569S105MT5ZV    # Disk ident.

Seek times:
        Full stroke:      250 iter in   0.025199 sec =    0.101 msec
        Half stroke:      250 iter in   0.025865 sec =    0.103 msec
        Quarter stroke:   500 iter in   0.048629 sec =    0.097 msec
        Short forward:    400 iter in   0.039586 sec =    0.099 msec
        Short backward:   400 iter in   0.039254 sec =    0.098 msec
        Seq outer:       2048 iter in   0.100053 sec =    0.049 msec
        Seq inner:       2048 iter in   0.102854 sec =    0.050 msec
Transfer rates:
        outside:       102400 kbytes in   0.137535 sec =   744538 kbytes/sec
        middle:        102400 kbytes in   0.108459 sec =   944136 kbytes/sec
        inside:        102400 kbytes in   0.100204 sec =  1021915 kbytes/sec
 
Could you share your throughput just for comparison?

I am looking at the D1508 version of that SM board. By the time I outfit an Asus Z97, I could have a server board for not much more.
 
Sure, here you are
Code:
root@storage-smc:~ # nvmecontrol devlist
 nvme0: Samsung SSD 950 PRO 256GB
    nvme0ns1 (244198MB)

root@storage-smc:~ # nvmecontrol identify nvme0
Controller Capabilities/Features
================================
Vendor ID:                  144d
Subsystem Vendor ID:        144d
Serial Number:              S2GLNCAGB15677M
Model Number:               Samsung SSD 950 PRO 256GB
Firmware Version:           1B0QBXX7
Recommended Arb Burst:      2
IEEE OUI Identifier:        38 25 00
Multi-Interface Cap:        00
Max Data Transfer Size:     131072

Admin Command Set Attributes
============================
Security Send/Receive:       Supported
Format NVM:                  Supported
Firmware Activate/Download:  Supported
Abort Command Limit:         8
Async Event Request Limit:   4
Number of Firmware Slots:    3
Firmware Slot 1 Read-Only:   No
Per-Namespace SMART Log:     Yes
Error Log Page Entries:      64
Number of Power States:      5

NVM Command Set Attributes
==========================
Submission Queue Entry Size
  Max:                       64
  Min:                       64
Completion Queue Entry Size
  Max:                       16
  Min:                       16
Number of Namespaces:        1
Compare Command:             Supported
Write Uncorrectable Command: Supported
Dataset Management Command:  Supported
Volatile Write Cache:        Present

root@storage-smc:~ # nvmecontrol identify nvme0ns1
Size (in LBAs):              500118192 (476M)
Capacity (in LBAs):          500118192 (476M)
Utilization (in LBAs):       2541376 (2M)
Thin Provisioning:           Not Supported
Number of LBA Formats:       1
Current LBA Format:          LBA Format #00
LBA Format #00: Data Size:   512  Metadata Size:     0

root@storage-smc:~ # diskinfo -t /dev/nvd0p4
/dev/nvd0p4
        512             # sectorsize
        141733920768    # mediasize in bytes (132G)
        276824064       # mediasize in sectors
        0               # stripesize
        1048576         # stripeoffset
        17231           # Cylinders according to firmware.
        255             # Heads according to firmware.
        63              # Sectors according to firmware.
        S2GLNCAGB15677M # Disk ident.

Seek times:
        Full stroke:      250 iter in   0.009651 sec =    0.039 msec
        Half stroke:      250 iter in   0.010369 sec =    0.041 msec
        Quarter stroke:   500 iter in   0.014878 sec =    0.030 msec
        Short forward:    400 iter in   0.012632 sec =    0.032 msec
        Short backward:   400 iter in   0.013137 sec =    0.033 msec
        Seq outer:       2048 iter in   0.040222 sec =    0.020 msec
        Seq inner:       2048 iter in   0.040128 sec =    0.020 msec
Transfer rates:
        outside:       102400 kbytes in   0.086262 sec =  1187081 kbytes/sec
        middle:        102400 kbytes in   0.086097 sec =  1189356 kbytes/sec
        inside:        102400 kbytes in   0.084941 sec =  1205543 kbytes/sec
 
I am working with a PCIe 3.0 bus, so I expected less than optimal speed. I cherry-picked my best throughput numbers; they seem to fluctuate a lot. I am glad to see my best is not that far off yours, but your numbers look steady. I was hoping for 2.5 GB/s! I might only be in x2 mode.

In a PCIe 2.0 slot the device is not recognized.
 
Looking at the specs on the Samsung 950 Pro M.2, it benchmarks at around the same as my Toshiba module: 2500-2600 MB/s reads.
So I wonder what the bottleneck is.
I do see some tunables on the nvme(4) page. Have you tried tweaking them?
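For reference, the tunables I mean are the loader tunables from nvme(4); a sketch of what I would try in /boot/loader.conf (example values, not recommendations):
Code:
# nvme(4) loader tunables
hw.nvme.per_cpu_io_queues="1"   # one I/O queue pair per CPU instead of a single shared pair
hw.nvme.force_intx="0"          # leave MSI/MSI-X enabled; 1 forces legacy INTx interrupts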

I just did a quick Linux Mint 17 test:

Code:
mint@mint ~ $ sudo hdparm -t /dev/nvme0n1

/dev/nvme0n1:
 Timing buffered disk reads: 3364 MB in   3.00 seconds = 1120.78 MB/sec

So my bottleneck looks like it is the motherboard hardware.
 
Anyone managed to use a Samsung SM961 or PM961 successfully under FreeBSD 11?

I ended up in a loop of i/o errors like

Code:
nvme0: resetting controller
nvme0: aborting outstanding i/o
nvme0: WRITE sqid:8 cid:127 nsid:1 lba:5131264 len:64
nvme0: ABORTED - BY REQUEST (00/07) sqid:8: cid:127 cdw0:0
 
I cherry-picked my best throughput numbers. They seem to fluctuate a lot.

This could happen due to the thermal management of these M.2 PCIe SSDs after they reach their maximum temperature.
That's why PCIe-slot SSD solutions often have a passive heatsink.
Newer Samsung SSDs like the 960 series have a black sticker with a metal layer for better "cooling". o_O
 
This could happen due to the thermal management of these M.2 PCIe SSDs after they reach their maximum temperature.
That's why PCIe-slot SSD solutions often have a passive heatsink.

My Samsung SM951 dropped to 40-50% throughput after a few dozen GB of testing and got _extremely_ hot. Even during real-life workloads it got quite hot and showed considerable performance drops. I ended up putting the sticker on the backside and strapping a small heatsink on top of the chips. IMHO they should come with at least a small aluminium plate on top, similar to the heatsinks on most RAM modules nowadays; otherwise these drives are pretty much unusable due to the unpredictable performance.
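To see whether thermal throttling is really what is going on, the controller temperature can be read from the SMART / health log; a sketch, assuming an nvmecontrol new enough to have the logpage subcommand:
Code:
# log page 2 is SMART / Health Information: composite temperature plus
# warning/critical composite temperature time counters
nvmecontrol logpage -p 2 nvme0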
 
Anyone managed to use a Samsung SM961 or PM961 successfully under FreeBSD 11?

I ended up in a loop of i/o errors like

Code:
nvme0: resetting controller
nvme0: aborting outstanding i/o
nvme0: WRITE sqid:8 cid:127 nsid:1 lba:5131264 len:64
nvme0: ABORTED - BY REQUEST (00/07) sqid:8: cid:127 cdw0:0
I'm seeing the same problem on a 128GB SM961 here and also reproduced it under CURRENT (20161117). Booting Arch Linux 2016.11.01 on the same hardware lets me read and write the drive without errors, so I don't think it is a hardware problem / incompatibility.

I found an existing PR, 211713, on this and added my information to it. If it isn't your PR, maybe you could post a "me too" on it? With relatively inexpensive SM961s starting to show up in quantity, I expect more people are going to run into this.
 
Well, I am back with a server board with an NVMe slot. This one boots from either the NVMe slot or a PCIe adapter. This is the Toshiba XG3 on FreeBSD 11-RELEASE amd64.
This test is with the card in a PCIe adapter.
Code:
root@Gigabyte:~ # diskinfo -t /dev/nvd0
/dev/nvd0
        512             # sectorsize
        512110190592    # mediasize in bytes (477G)
        1000215216      # mediasize in sectors
        0               # stripesize
        0               # stripeoffset
        569S105MT5ZV    # Disk ident.

Seek times:
        Full stroke:      250 iter in   0.016361 sec =    0.065 msec
        Half stroke:      250 iter in   0.009457 sec =    0.038 msec
        Quarter stroke:   500 iter in   0.019461 sec =    0.039 msec
        Short forward:    400 iter in   0.015547 sec =    0.039 msec
        Short backward:   400 iter in   0.015403 sec =    0.039 msec
        Seq outer:       2048 iter in   0.131618 sec =    0.064 msec
        Seq inner:       2048 iter in   0.132954 sec =    0.065 msec
Transfer rates:
        outside:       102400 kbytes in   0.120837 sec =   847423 kbytes/sec
        middle:        102400 kbytes in   0.080591 sec =  1270613 kbytes/sec
        inside:        102400 kbytes in   0.077887 sec =  1314725 kbytes/sec
 
The same drive in Windows, booting from NVMe with the card in a PCIe slot adapter.
 

Attachment: diskmark-x16-pcie.jpg (73.6 KB)
Here are the results in the NVMe slot. Really pathetic.
Code:
root@Gigabyte:~ # diskinfo -t /dev/nvd0
/dev/nvd0
        512             # sectorsize
        512110190592    # mediasize in bytes (477G)
        1000215216      # mediasize in sectors
        0               # stripesize
        0               # stripeoffset
        569S105MT5ZV    # Disk ident.

Seek times:
        Full stroke:      250 iter in   0.017600 sec =    0.070 msec
        Half stroke:      250 iter in   0.011294 sec =    0.045 msec
        Quarter stroke:   500 iter in   0.022944 sec =    0.046 msec
        Short forward:    400 iter in   0.018209 sec =    0.046 msec
        Short backward:   400 iter in   0.017299 sec =    0.043 msec
        Seq outer:       2048 iter in   0.141943 sec =    0.069 msec
        Seq inner:       2048 iter in   0.143027 sec =    0.070 msec
Transfer rates:
        outside:       102400 kbytes in   0.195663 sec =   523349 kbytes/sec
        middle:        102400 kbytes in   0.177974 sec =   575365 kbytes/sec
        inside:        102400 kbytes in   0.176716 sec =   579461 kbytes/sec
 
So my question is: would FreeBSD diskinfo's 'Transfer rates' be considered read or write speed?

Even in Windows the performance in the NVMe slot is poor.
 

Attachment: diskmark-nvme.jpg (75.7 KB)
What is the fairest way to compare my FreeBSD diskinfo numbers with my DiskMark numbers from Windows?
For example, should I be comparing diskinfo's transfer rates to DiskMark's write or read speeds?
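My reading of diskinfo(8) is that the -t transfer test only reads from the device (it is non-destructive), so I assume those numbers line up with DiskMark's sequential reads. A rough write-side sketch I could try would be dd to a scratch file on the mounted filesystem (the /usr/ddtest path is just an example, and caching makes it only a ballpark figure):
Code:
# write a 4 GB file; dd prints bytes/sec when it finishes
dd if=/dev/zero of=/usr/ddtest bs=1m count=4096
rm /usr/ddtest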

Further reading on this seems to boil down to PCIe lanes. The E3 Xeons really don't have many PCIe lanes left over after all the motherboard peripherals. I am reading that the USB3 controller eats up 6 lanes alone. So my guess is that the NVMe slot on my board is on limited PCIe lanes.

I am really struck by the fact that modern processors are lagging in PCIe lanes for basic PCIe 3.0 usage.

You really need to move to 40-lane LGA2011 chips to get enough lanes for healthy PCIe slot usage and NVMe.
 
The manual section for this board is a joke, but at least in the specs they deliver some information:
"The MX31-BS0 comes equipped with an onboard M.2 slot, providing users PCI-Express connectivity for SSD devices. Delivering up to 10 Gb/s data transfer speeds..."
According to the CPU (Xeon E3-12xx v5) it should be a C236/C234/C232 chipset, with probably 1 PCIe 3.0 lane to the M.2 slot, but your results are half of that.

Try the latest BIOS update:
F10 (15.05 MB, 2017/01/01): "Fixed some M.2 devices not work"
 
Further reading on this seems to boil down to PCIe lanes. The E3 Xeons really don't have many PCIe lanes left over after all the motherboard peripherals. I am reading that the USB3 controller eats up 6 lanes alone. So my guess is that the NVMe slot on my board is on limited PCIe lanes.

It is a bit more complex and depends on the workloads, I would say.
A system block diagram illustrates that much better.
The M.2 slot on this board is connected with 2x PCIe 3.0.
See, for example, the X11SSH-LN4F, page 17, Figure 1-4, System Block Diagram:
http://supermicro.com/manuals/motherboard/C236/MNL-1778.pdf

Everything (besides the 16 direct PCIe 3.0 lanes from the CPU) is connected through the southbridge (chipset), which has "only" 4x PCIe 3.0, called DMI 3.0, at less than 4 GB/s: https://en.wikipedia.org/wiki/Direct_Media_Interface

So, if you want more than 4x PCIe speed, you need a board with at least two PCIe slots with a real x8 PCIe connection.
 
I did that straight out of the gate. F10, and that is how I ended up with Win8 for tests. The Intel ME stuff wouldn't update right (from the EFI prompt), so I got that working with the Windows firmware updater. Then I put the newest Aspeed BMC firmware on the board. So everything is up to date.
I also added a fan and heatsink to my NVMe module. I did notice it getting warm, as per your recommendation.

I am happy but surprised that the PCIe slots (either the x8 or the x16) are performing so much better than the native NVMe slot.

portsnap fetch extract took 20 seconds.
A full install takes under a minute.
synth took about 3 hours to run my build.list.
 
The MX31-BS0 mainboard's M.2 slot supports only 1x PCIe 3.0 lane with max. "10 Gb/s" (it should be 8 GT/s ... possibly they connect it with 2x PCIe 2.0 = 10 Gb/s ... it is confusing), or less than 1 GB/s.
The Toshiba XG3 connects with PCIe 3.1 x4, theoretically 3.9 GB/s, and its read/write speed is max. 2.4 GB/s / 1.5 GB/s.
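Back-of-envelope numbers for reference (my arithmetic, not from the manual):
Code:
PCIe 2.0: 5 GT/s per lane, 8b/10b encoding    -> ~500 MB/s per lane; x2 ~= 1 GB/s ("10 Gb/s")
PCIe 3.0: 8 GT/s per lane, 128b/130b encoding -> ~985 MB/s per lane; x4 ~= 3.9 GB/s
Either way this M.2 slot tops out around 1 GB/s, well below what the XG3 can deliver.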

If you want the full speed, you need to use an M.2 adapter card in a PCIe slot with at least a PCIe x4 connection.

There are many different solutions for how the mainboard manufacturers share the PCIe lanes and connect the peripherals,
so the customer has to check that with every new board. That's why I wonder why Gigabyte has such bad documentation for this board. The desktop board manuals are way better.
 
I almost bought an ASRock Rack LGA2011-v3 server board for M.2 NVMe usage. I drilled down in the manual and saw this: M.2 = Gen2 x2 (10 Gb/s)
http://www.asrockrack.com/general/productdetail.asp?Model=EPC612D8

So, same thing there. 10 Gb/s would be 1.25 GB/s when 2.6 GB/s is needed for the NVMe device. Shortchanged on a board with more than enough PCIe lanes. Supermicro boards also use x2 PCIe lanes for NVMe on many models.

I am still searching for my ideal 2011-v4 server board. Asus LGA2011 boards have very bad ratings on Newegg, so I ruled them out.
I want to do 10Gb LAN with Intel X540 cards, so I need plenty of lanes.
 
I guess this is just an error in the manual, because on another page they write

"* The M.2 slot shares lanes with PCIe Slot 6 . When Slot 4 PCIe M.2 is populated, Slot 4 runs at x4."

and the source of the PCIe lanes is the CPU (PCIe 3.0), so it would make no sense to limit it to 2x 2.0.
But contact ASRock to make sure.
 