
Author Topic: Patch for hanging during boot2 on large disks  (Read 4511 times)


easternguy

  • Entrant
  • Posts: 5
Patch for hanging during boot2 on large disks
« on: May 05, 2010, 12:34:02 AM »
Heya...

I have been dogged by hangs during the boot2 step, with frozen progress spinners.

I submitted a patch, which I notice made it into Rev 139 of the latest SVN repository.  This fixed a problem reading extents that were beyond a 32-bit offset.

Things were working great for me, although the fix did not appear to help everyone with similar symptoms.

Well, this morning, my machine froze in a similar manner as before.  Argh.  I spent a grueling few hours tracking down and fixing the latest problem.

HFSGetDirEntry updates the dirIndex parameter with its result.  However, the value it uses isn't a simple offset into the directory, or a pointer to a structure, but an offset in the hard disk (curNode * nodeSize + index).  As we've seen with the ReadExtent thing, curNode*nodeSize is no longer guaranteed to fit into a "long."

In tracking this down, I spotted this value overflowing (as I half-suspected): as my disk filled up, directories ended up living at higher offsets on the disk.  Once /System/Library/Extensions or any other boot-time file has its directory living beyond 2^32, you can no longer boot.
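
To illustrate the overflow described above, here is a minimal sketch, not the actual Chameleon code. The function names and the 8192-byte node size are assumptions made for the example, and unsigned types are used so the wraparound is well defined in C (the original code uses a signed "long", which is 32 bits wide in the 32-bit boot2 environment).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: packs curNode * nodeSize + index the way a
 * 32-bit dirIndex would hold it. The arithmetic wraps modulo 2^32. */
uint32_t dir_index_32(uint32_t cur_node, uint32_t node_size, uint32_t index)
{
    assert(node_size != 0);
    return (uint32_t)(cur_node * node_size + index);
}

/* Hypothetical helper: the same value computed in 64 bits, which is
 * what widening dirIndex to "long long" makes room for. */
uint64_t dir_index_64(uint32_t cur_node, uint32_t node_size, uint32_t index)
{
    assert(node_size != 0);
    /* Widen before multiplying so the product is not truncated. */
    return (uint64_t)cur_node * node_size + index;
}
```

With an assumed node size of 8192, `dir_index_64(600000, 8192, 3)` gives 4915200003, while `dir_index_32(600000, 8192, 3)` wraps around to 620232707, so a directory walk resumed from that index would land in the wrong place. For small node numbers the two functions agree, which is why the problem only shows up once the disk fills.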

Unfortunately, sys.c's GetDirEntry, and the *GetDirEntry for other file systems all use "long *" for dirIndex.

I've updated hfs.c, sys.c, and boot2/drivers.c (which calls GetDirEntry) to all use long longs, as well as their associated header files.  I've also updated msdos.c, nbp.c, and ufs.c (and the associated headers) to also use long longs (as they all use a common interface with hfs.c). 

Warning: I have *not* tested my changes to the msdos/ufs file systems.  But since they're simply stashing either an offset or a 32-bit pointer, and getting them back out, I believe my fixes should work.

My changes have allowed my system to once again boot properly.  It might resolve some mysterious long-term hangs for others as well, with any luck.

Hopefully someone can review my changes, verify that they work and do not introduce any bugs, and integrate them into the latest source.

Attached is a patch against r141, as well as a "boot2" compiled up with these changes.  I hope it brings some relief to a few folks.  :)

-d

loc[a]lhost

  • Entrant
  • Posts: 6
Re: Patch for hanging during boot2 on large disks
« Reply #1 on: May 08, 2010, 12:25:45 PM »
Thank you for your work!

Unfortunately this didn't help me.
I have a 6TB RAID array (appears as one disk), and I have to use iDefrag to move the boot files (/System/Library/Extensions, the kext cache, kernel, etc.) to the beginning of the disk each time I modify them.

zef

  • Administrator
  • Posts: 265
Re: Patch for hanging during boot2 on large disks
« Reply #2 on: May 08, 2010, 03:09:34 PM »
Quote
Thank you for your work!

Unfortunately this didn't help me.
I have a 6TB RAID array (appears as one disk), and I have to use iDefrag to move the boot files (/System/Library/Extensions, the kext cache, kernel, etc.) to the beginning of the disk each time I modify them.

Is that an Apple soft RAID setup? In that case you should get a tiny separate boot helper partition for each RAID volume on all disks. Can you post your "diskutil list" output please?
ASUS P8Z68-V PRO/GEN3 | i5-2500k | 16GB RAM | GTX560 | Keyboard | Mouse | Devilsound DAC

loc[a]lhost

  • Entrant
  • Posts: 6
Re: Patch for hanging during boot2 on large disks
« Reply #3 on: May 08, 2010, 03:57:54 PM »
It's not a soft RAID setup, it's a hardware RAID5 array on a Highpoint 2320 RAID controller.
This is the output from diskutil list:

/dev/disk0
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:     FDisk_partition_scheme                        *1.0 TB     disk0
   1:                  Apple_HFS Storage HD              966.4 GB   disk0s1
/dev/disk1
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:     FDisk_partition_scheme                        *200.0 GB   disk1
   1:               Windows_NTFS System Reserved         104.9 MB   disk1s1
   2:               Windows_NTFS Windows HD              199.9 GB   disk1s2
/dev/disk2
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      GUID_partition_scheme                        *4.0 GB     disk2
   1:                        EFI                         209.7 MB   disk2s1
   2:                  Apple_HFS Boot HD (contains Chameleon)                 3.7 GB     disk2s2
/dev/disk3
   #:                       TYPE NAME                    SIZE       IDENTIFIER
   0:      GUID_partition_scheme                        *6.0 TB     disk3
   1:                        EFI                         209.7 MB   disk3s1
   2:                  Apple_HFS Macintosh HD (contains Snow Leopard)            4.9 TB     disk3s2


If there is more diagnostic output I could provide, please let me know.
Thank you.

easternguy

  • Entrant
  • Posts: 5
Re: Patch for hanging during boot2 on large disks
« Reply #4 on: May 10, 2010, 05:26:11 PM »
Quote
Thank you for your work!

Unfortunately this didn't help me.
I have a 6TB RAID array (appears as one disk), and I have to use iDefrag to move the boot files (/System/Library/Extensions, the kext cache, kernel, etc.) to the beginning of the disk each time I modify them.

Wow, that really sounds a lot like the case that my patches should have fixed.  (Defragging to move the files to the beginning solves the problem, and all.)  It's possible there's another edge case that is still behaving badly on large disks.

(HFS+ is quite complicated, with catalogs, btrees, overflows, and all kinds of goodies scattered around, each of which has to be addressed correctly >32 bits, or things go boom.)

The idea of a helper partition for booting isn't a bad one.  Also, creative use of mkext to build a complete Extensions.mkext that lives in a separate /Extra partition (that is built to contain all of the extensions in /System/Library) might help avoid the crash, too.  (Extensions are all loaded there, and no poking around /System/Library would be required.)

A manual mkext after changing this is far less painful than an iDefrag.

Is the symptom the same?  Just freezing up on the progress spinner?

Are you sure you're getting the right boot2 kicked in?  It sounds so much like my problem, that it's surprising the fix didn't work.

zef

  • Administrator
  • Posts: 265
Re: Patch for hanging during boot2 on large disks
« Reply #5 on: May 10, 2010, 06:46:48 PM »
Quote
Are you sure you're getting the right boot2 kicked in?  It sounds so much like my problem, that it's surprising the fix didn't work.

@loc[a]lhost:

Do you pass the boot1 stage then get stuck at the boot2 stage? I'm asking this because boot1h can't handle >32bit LBA sectors (it uses 32 bit unsigned arithmetic).
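
As a rough illustration of the 32-bit LBA limit mentioned above, here is a minimal sketch. It is not boot1h's actual code; the function names are hypothetical, and the 512-byte sector size used in the note below is an assumption.

```c
#include <stdint.h>

/* Hypothetical helper: what a 64-bit sector number becomes when it is
 * carried in 32-bit unsigned arithmetic. The high bits are dropped. */
uint32_t lba_truncated(uint64_t lba)
{
    return (uint32_t)lba;
}

/* Hypothetical helper: returns 1 if the sector number is reachable
 * with a 32-bit LBA, 0 otherwise. */
int lba_fits_32(uint64_t lba)
{
    return lba <= UINT32_MAX;
}

/* Hypothetical helper: number of bytes addressable with a 32-bit LBA
 * for a given sector size. */
uint64_t bytes_addressable_32(uint32_t sector_size)
{
    return ((uint64_t)UINT32_MAX + 1) * sector_size;
}
```

Assuming 512-byte sectors, `bytes_addressable_32(512)` is 2199023255552 bytes (2 TiB), so on a 6 TB volume most sectors are out of reach of 32-bit arithmetic; a request for sector 2^32 + 5 would silently become a request for sector 5.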

loc[a]lhost

  • Entrant
  • Posts: 6
Re: Patch for hanging during boot2 on large disks
« Reply #6 on: May 11, 2010, 07:48:27 PM »
I already have a helper hard drive (Boot HD, as you can see in my diskutil list output), with Chameleon (boot2) and /Extra. I didn't try to pack all my kernel extensions in the helper partition, though; that might help a little.

The symptoms are different for me, but that's because of the helper drive.

I get to the boot2 stage, but when it tries to load /System/Library/Caches/com.apple.kext.caches/Startup/Extensions.mkext, it fails, and then attempts to load the extensions one by one. It even finishes booting if the critical extensions are at the beginning of the drive.

I am sure I used the right boot2: I used RC4 before, and I saw the new version string.