All,
Looking to implement a dedicated server to perform data processing/analysis, and it needs some fairly significant I/O. The "twist" is that the data's lifespan is only the time it takes to import, process, analyze and report on it - then the database and the data are deleted. Storage need is on the order of 800GB-1TB at any given time (max).

Current notion is to use 4 x U.2 Optane (400GB) NVMe drives attached to an LSI 9500-16i [HBA] controller. That's nearly double the needed space, but under RAID0 it would maximize the I/O. Implementation would be based on adding a new rc script that checks the zpool at boot and re-creates the pool (minus any failed disks) as RAID0, followed by initializing the Postgres data directory. Otherwise, it boots as normal. This oversimplifies things, but it gives a sense of how "disposable" the data is and the target level of performance. OS and application(s) would be installed on a separate pair of NVMes (not on the LSI bus) in a [ZFS] mirror configuration.
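For discussion's sake, the rc-script idea could be sketched roughly like this. Everything here is hypothetical: the pool name ("scratch"), device names (nvd2-nvd5), and paths are placeholders, and DRYRUN=1 just prints the commands so the flow can be read without real hardware:

```shell
#!/bin/sh
# Sketch of the boot-time pool rebuild (NOT a real rc.d script; assumed
# names throughout: pool "scratch", devices nvd2..nvd5).

POOL="scratch"
DISKS="nvd2 nvd3 nvd4 nvd5"   # placeholder device names for the four U.2 drives
PGDATA="/scratch/pgdata"

# With DRYRUN=1, print each command instead of executing it.
run() {
    if [ "${DRYRUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

rebuild_pool() {
    # Keep only disks that are actually present; a dead drive simply
    # drops out of the stripe instead of blocking boot.
    present=""
    for d in $DISKS; do
        if [ -e "/dev/$d" ] || [ "${DRYRUN:-0}" = "1" ]; then
            present="$present $d"
        fi
    done

    # Tear down any stale pool, then create a plain stripe (ZFS "RAID0").
    # ashift=12 assumes 4K-sector drives; recordsize=8k matches Postgres pages.
    run zpool destroy -f "$POOL"
    run zpool create -f -o ashift=12 -O recordsize=8k "$POOL" $present

    # Fresh Postgres cluster on the new pool.
    run install -d -o postgres "$PGDATA"
    run su -m postgres -c "initdb -D $PGDATA"
}
```

In a real rc.d script, rebuild_pool would be wired up as the start method; the point is just that a missing disk shrinks the stripe rather than stopping the boot.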
Questions:
- While a RAID controller may be "easier" in some regards, I'd think that ZFS is the better option, no? (see later note about disk I/O amplification)
- Any reason to consider controller-based RAID0 over ZFS RAID0? ("sanity check" question)
- Given RAID0 and the use case - it seems that ZIL/SLOG are irrelevant and unneeded, correct?
- One significant advantage of ZFS is the ability to exactly match recordsize to the database page size (eg: an 8K recordsize for Postgres's 8K pages), and later, if Postgres is rebuilt with a larger page size, to destroy the pool and re-create it with a matching recordsize - a RAID controller might not provide an exact alignment match. This should avoid a significant amount of disk I/O amplification, which would otherwise impact the overall lifespan of the drive(s), amongst other things.
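To make the last bullet concrete, the recordsize match might look something like the following. This is a config sketch only (pool/dataset names are hypothetical, and it needs real ZFS hardware to run); the specific property values beyond recordsize are common suggestions, not requirements:

```shell
# Hypothetical dataset for the Postgres data directory on the scratch pool.
# recordsize=8k matches Postgres's 8 KiB page, avoiding read-modify-write
# amplification; atime=off avoids metadata writes on every read.
zfs create -o recordsize=8k \
           -o atime=off \
           -o compression=lz4 \
           scratch/pgdata

# If Postgres were later rebuilt with a larger page size, the pool is simply
# destroyed and re-created with a matching recordsize (e.g. 16k).
```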
Thanks!