Thursday, February 1, 2007

Virtual RAID-Z columns

The algorithm for allocating storage on the RAID-Z device is *almost* done (plus all the necessary debugging time). Instead of creating a RAID-Z of size:
MIN(disks) * (num_disks - parity)
you will be able to create a RAID-Z that uses the full capacity of the disks in the array, i.e. you will be able to make effective use of all the space on an array of mismatched disks.

The idea:
1) The length of a virtual column is the size of the largest disk in the RAID.
2) Lay out the disks, end to end, modulo the virtual column size
e.g. 120, 80, 60, 40
----------------
|    | 80 |____|
|120 |    | 40 |
|    |    |____|
|    |____|
|    |    |
|    | 60 |
----------|

3) It's the job of vdev_raidz_map_alloc() to convert an offset/length of the desired access into individual offsets and length within the leaf vdevs.
4) Care needs to be taken at the boundary points between disks; in the case above, the RAID-Z moves from being 3 cols to 2 cols at offset 90.
5) The benefits are substantial, however. Instead of creating a RAID with size 40*3 = 120, we can, with single parity, create a RAID of size 80 + 60 + 40 = 180. Furthermore, for blocks that do not cross disk divides, we save IO bandwidth by using a 3-column RAID-Z instead of a 4-column RAID-Z.
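
To make the arithmetic above concrete, here is a rough Python sketch of the idea. It is not the real vdev_raidz_map_alloc() code. It assumes the disks are laid end to end in the order listed and cut into columns the length of the largest disk. The names usable_capacity and locate are made up for illustration, and the boundaries it produces need not match those of the actual implementation.

```python
def usable_capacity(disks, parity=1):
    """Return (conventional, virtual_column) usable sizes for a disk list."""
    col_size = max(disks)          # virtual column length = largest disk
    total = sum(disks)
    full_cols, partial = divmod(total, col_size)
    # Rows [0, partial) have one extra column; the remaining rows have full_cols.
    wide = partial * max(full_cols + 1 - parity, 0)
    narrow = (col_size - partial) * max(full_cols - parity, 0)
    conventional = min(disks) * (len(disks) - parity)
    return conventional, wide + narrow


def locate(disks, column, row):
    """Map (virtual column, row) to (disk index, offset on that disk)."""
    col_size = max(disks)
    pos = column * col_size + row  # position in the end-to-end layout
    if not 0 <= row < col_size or not 0 <= pos < sum(disks):
        raise ValueError("no disk backs this column/row")
    start = 0
    for index, size in enumerate(disks):
        if pos < start + size:
            return index, pos - start
        start += size


if __name__ == "__main__":
    disks = [120, 80, 60, 40]
    print(usable_capacity(disks))  # conventional vs. virtual-column size
    print(locate(disks, 2, 0))     # where row 0 of the third column lives
```

With the 120, 80, 60, 40 example, usable_capacity gives 120 for the conventional layout and 180 for the virtual-column one, matching the figures above.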

Disaster -- Macbook hard drive failure

The SATA disk in my Macbook committed suicide two days ago. As a result, all the work I was doing under VMware has stopped (VMware Fusion disk images are incompatible with VMware Server under Linux).

Back with my old Athlon (minus the dodgy DMA on the Promise card).

Friday, January 26, 2007

Solaris Powers of 2 Macros

A blog post here describes them.

Thursday, January 18, 2007

Problems with Solaris Networking and VMware

With OpenSolaris snv_54 installed on VMware Fusion, all was well: I got the network working with MASQ'd ethernet. But after installing VMware Tools, the default interface changed from pcn0 to vmxnet0 and I was unable to get networking going again. dig www.google.com resolved the domain correctly, but pinging www.google.com didn't work. The solution involved fiddling with nsswitch.conf.

*confused*

Sunday, January 14, 2007

Creating a FreeDOS bootdisk on OS X

The BIOS on my Promise Ultra133 TX2 IDE card is a version not listed on the Promise support site (2.20.0050.10). So it seemed sensible to update to something half-recent (2003 - 2.20.0.15 *crazy versioning scheme*). Of course such an upgrade requires a DOS bootdisk. The 2 floppies I found in my room are >5 yrs old, and refused to play ball. So how hard could it be to make a Boot-CD -- on my CD-burning machine, a Mac?! *sigh* Several coasters later...

For anyone else banging their head against BIOS hell:

[You need to go through this as .iso images are read-only.]
1) Download a FreeDOS disk image -- we just need the floppy boot image, in this case named fdboot.img.
2) Create a directory [cd-dir], and place this boot-disk image into it.
3) Add the DOS BIOS flash utility and anything else you need to [cd-dir].
4) In the terminal:
cd [cd-dir]
hdiutil makehybrid -o ../foo -eltorito-boot fdboot.img -iso .
hdiutil burn ../foo.iso
5) Boot off the disc. Drive X:\ is probably the CD drive (rather than A:, which is the disk image fs...)

*phew*

No ACPI == No floppy

So having disabled ACPI, I've suddenly discovered that there is no floppy device. I'm attempting to upgrade the BIOS on my Promise Ultra133 to see if I can stop the corruption. A quick google finds this bug.

The workaround?
Add:
name="fdc" parent="isa"
dma-channels=2
interrupts=6
reg=1,0x3f0,6,1,0x3f7,1;
(you need that semicolon)

To:
/platform/i86pc/kernel/drv/fdc.conf

*sigh*

Saturday, January 13, 2007

Disable X; Corruption on the Promise controller

To disable X:
dtconfig -d (man)

Used the now "working" Promise IDE controller to create a 5-disk RAID-Z. Unfortunately, all is not well in Promise-land. Copying data to the RAID works, but a scrub shows corruption on all disks hanging off the Promise controller. A quick zpool scrub corrects the problem, however.

# zpool status -v
  pool: tank
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: scrub completed with 0 errors on Sat Jan 13 09:22:07 2007
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c1d0    ONLINE       0     0     0
            c2d0    ONLINE       0     0     6
            c2d1    ONLINE       0     0     3
            c3d0    ONLINE       0     0     5
            c3d1    ONLINE       0     0     2

errors: No known data errors

Thankfully(?!) corruption seems to be limited to the Promise Ultra133 TX2 controller. Monkeys in the DMA?

Also, with disk activity taking place, the system is perceptibly less interactive (than under UFS).