Topic: boot0 will never execute Pass2 in find_boot [with PATCH]

KillerJK

  • Entrant
  • Posts: 7
boot0 will never execute Pass2 in find_boot [with PATCH]
« on: January 29, 2011, 10:06:15 PM »
THE PROBLEM

Due to some misplaced and missing instructions, the second pass in find_boot will never be run.

In find_boot: .continue, a 'jz' instruction is being used to enter checkGPT, and 'ret' to exit. So that 'ret' returns to the beginning, just after "call    find_boot" in start_reloc:, giving an error instead of executing the second mbr pass.

Even if you hardcode a 'jmp' to the code that entered checkGPT instead of a 'ret' instruction, the bh register will have been overwritten, because checkGPT doesn't save any registers.

Of course, the correct way would be using call and ret, otherwise you won't be able to use the procedure in several places. And the ideal situation would be using pushad and popad.

So, the goal is saving bh and returning from checkGPT to the right place, and then the code may enter the second pass, if needed.
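
A minimal sketch of why the 'ret' lands in the wrong place, written as a toy Python model rather than real assembly. The label names ("error", "find_boot.pass2") are illustrative stand-ins for return addresses, not symbols from boot0.s. It only models the return-address stack: a jump pushes nothing, so the 'ret' inside checkGPT pops the address that start_reloc pushed when it called find_boot.

```python
def run(enter_with_call):
    """Model the return-address stack around checkGPT.

    Labels are hypothetical stand-ins for addresses in boot0.s.
    """
    stack = []

    # start_reloc: 'call find_boot' pushes the address of the error path
    stack.append("error")

    # find_boot: pass 1 found nothing and a protective MBR was seen
    if enter_with_call:
        # 'call checkGPT' pushes the address of the code that starts pass 2
        stack.append("find_boot.pass2")
    # with 'jz checkGPT' nothing is pushed

    # checkGPT: nothing bootable found, so it executes 'ret'
    return stack.pop()


print(run(enter_with_call=False))  # error
print(run(enter_with_call=True))   # find_boot.pass2
```

With a jump, execution continues at the error path; with a call, it continues inside find_boot, where pass 2 can still start (provided bh survived, which is why bx has to be saved).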

I found out this wasn't working when trying to boot from a real partition in Parallels. It creates a virtual machine which has the same mbr and GPT as the real hard drive, but every partition other than the ones you import (that is, the 2 belonging to windows7) is full of zeroes. So of course, boot0hfs won't find any valid hfs sector, and it should fall back to booting from the active windows partition. The bug was preventing this setup from working, as active partitions were never being scanned.

THE FIX

First I want to say the Makefile is using the default optimization level of nasm, and for the 0.98 series, it is -O0, which produces larger code than the -Ox (the new default) used since version 2.09.

There is not enough space left in a 512-byte sector to implement the proper fix discussed above. I am assuming the code must assemble with the default flags regardless of the nasm version used (I'm seeing nasm 0.98.40 in Snow Leopard 10.6.6 with Xcode 3.2.5, so it's quite old, but you'd have a newer version on Linux). The rest of this explanation is about the tricks used to free up the bytes needed for those extra instructions, and why they are done.

Another nasm detail: the current boot0, when assembled with -O2 on Mac OS X (0.98 series), doesn't produce the right output (it makes a 1024-byte binary) because of this bug http://sourceforge.net/tracker/index.php?func=detail&aid=904694&group_id=6208&atid=106208 (I could look into that, but nasm 0.98 is obsolete so it's probably not worth it; version 2 doesn't seem to have that issue). This doesn't affect us, because the Makefile is using -O0 with 0.98 (the default). Hitting the bug looks like an unfortunate coincidence anyway, because after patching boot0.s it is gone (you get a 512-byte binary again).

I'm saying all this because at first I thought it would be a good idea to modify the makefile with -O2 in order to save some space. But it makes only 2 changes, and 0.98 has a bug with that level of optimization, so it's a better idea to implement them myself. Changes when you use -O2 are:

In find_boot: .continue

from
Code: [Select]
add     si, part_size
to
Code: [Select]
add     si, BYTE part_size
In read_lba:

from
Code: [Select]
push    WORD 16
to
Code: [Select]
push    BYTE 16
Together that's 2 bytes less (one per instruction). The change in read_lba is not dangerous, because the instruction still pushes 2 bytes; only its encoding is smaller.
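
The byte counts can be checked by writing out the 16-bit encodings by hand. The sketch below is a Python illustration, not assembler output; it assumes part_size is 16 (the size of one MBR partition entry) and uses the standard x86 opcodes 81 /0 iw and 83 /0 ib for add, and 68 iw and 6A ib for push.

```python
PART_SIZE = 16  # assumed value of part_size: one MBR partition entry


def add_si_imm(value, short_form):
    """Encode 'add si, imm' for 16-bit code."""
    modrm = 0xC6  # mod=11, reg=/0 (ADD), rm=110 (SI)
    if short_form:
        # 83 /0 ib: 8-bit immediate, sign-extended by the CPU
        return bytes([0x83, modrm]) + value.to_bytes(1, "little", signed=True)
    # 81 /0 iw: full 16-bit immediate
    return bytes([0x81, modrm]) + (value & 0xFFFF).to_bytes(2, "little")


def push_imm(value, short_form):
    """Encode 'push imm' for 16-bit code; both forms push 2 bytes."""
    if short_form:
        # 6A ib: 8-bit immediate, sign-extended to 16 bits
        return bytes([0x6A]) + value.to_bytes(1, "little", signed=True)
    # 68 iw: 16-bit immediate
    return bytes([0x68]) + (value & 0xFFFF).to_bytes(2, "little")


saved = (len(add_si_imm(PART_SIZE, False)) - len(add_si_imm(PART_SIZE, True))
         + len(push_imm(16, False)) - len(push_imm(16, True)))
print(saved)  # 2
```

The short forms only work because both immediates fit in a signed byte (-128..127).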

On the other hand, when you assemble this with nasm 2.09 or later, the default is -Ox, so you get those optimizations automatically, but not with 0.98 (used in macosx for example), which defaults to -O0. That's the reason why I added the changes to the patch: with a recent nasm they would be there anyway. This way the makefile is not changed and the bug is avoided.

Also, I didn't want to modify the strings printed when assembling with verbose=1. Making them single letters would save up a lot of space, but I thought you wouldn't like it.

So I moved the procedure initBootLoader to a position where checkGPT and find_boot can jump to by using a short jmp (find_boot wasn't using a short one). Instead of pushad/popad in checkGPT I'm only saving bx, and that gives us just enough space to write a call instruction (0 bytes left).


In addition to that I removed a couple of comments which were wrong and moved DebugChar('J') to a place where it can actually be executed (although there is not enough space to activate the DEBUG flag so it's pointless without 4KB sectors). I don't know if these changes should go in another bug report, they are minor things so I thought I could include them here.

Edit:
Another optimization that can save 1 extra byte not included in this patch:

Change
Code: [Select]
jmp    .tryToBoot
to
Code: [Select]
jmp    SHORT .tryToBoot
This one also saves 1 byte. The 0.98 series always applies it, even when you force -O0 (which is the default there anyway). The tricky thing is that version 2.09.07 does NOT apply it with -O0, but its default is -Ox, so you get it anyway. Regardless of what certain manpages say, the official documentation states that "The -Ox mode is recommended for most uses, and is the default since NASM 2.09.". -Ox didn't exist in 0.98, but it had -O3.

Change
Code: [Select]
jmp     hang
to
Code: [Select]
jmp     SHORT hang
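
Whether a jump can use the SHORT form depends only on the distance to its target. The following Python sketch picks the smaller encoding for a direct jmp in 16-bit code; the addresses in the usage lines are made up for illustration and are not taken from boot0.s.

```python
def jmp_encoding(jmp_addr, target_addr):
    """Return the smallest 16-bit-mode encoding of a direct jmp.

    Displacements are relative to the address of the next instruction.
    """
    # Short form: EB cb (2 bytes), signed 8-bit displacement
    rel8 = target_addr - (jmp_addr + 2)
    if -128 <= rel8 <= 127:
        return bytes([0xEB, rel8 & 0xFF])
    # Near form: E9 cw (3 bytes), 16-bit displacement
    rel16 = (target_addr - (jmp_addr + 3)) & 0xFFFF
    return bytes([0xE9]) + rel16.to_bytes(2, "little")


print(jmp_encoding(0x100, 0x110).hex())  # eb0e
print(jmp_encoding(0x100, 0x200).hex())  # e9fd00
```

This is also why hardcoding SHORT can break later: if code is moved and the target ends up more than 127 bytes ahead or 128 bytes behind, the 2-byte form no longer fits.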
« Last Edit: March 27, 2011, 07:58:48 AM by KillerJK »

valv

  • VoodooLabs
  • Posts: 72
    • The AnVAL Forum (fr)
Re: boot0 will never execute Pass2 in find_boot [with PATCH]
« Reply #1 on: January 30, 2011, 01:38:36 AM »
Hi KillerJK,

Your work is interesting. I'll give it a go asap and keep you informed.
Thank you.

KillerJK

  • Entrant
  • Posts: 7
Re: boot0 will never execute Pass2 in find_boot [with PATCH]
« Reply #2 on: March 16, 2011, 10:27:42 AM »
I noticed there are also some bug reports in http://forge.voodooprojects.org/p/chameleon/issues/ , so I was wondering if I should move the patch to that location or if the forum is preferred.

zef

  • Administrator
  • Posts: 265
Re: boot0 will never execute Pass2 in find_boot [with PATCH]
« Reply #3 on: March 18, 2011, 08:00:02 PM »
Hi KillerJK,

Thanks for the detailed description, info and all :) Originally boot0hfs was born because of Windows 7 sleep/hibernation issues when the active partition is not the one where bootmgr lives. In other scenarios plain boot0 does the job on pure GPT or hybrid MBR/GPT partitioned disks.

Cheers,
zef
ASUS P8Z68-V PRO/GEN3 | i5-2500k | 16GB RAM | GTX560 | Keyboard | Mouse | Devilsound DAC

KillerJK

  • Entrant
  • Posts: 7
Re: boot0 will never execute Pass2 in find_boot [with PATCH]
« Reply #4 on: March 21, 2011, 12:08:45 AM »
Maybe I didn't explain it properly. Please, I beg you to read the whole thing again, there is a "problem" in the code, even if normal pure GPT and hybrid setups are working for everyone. Fixing it, or deciding if it's something that has to be fixed, is not my decision but I want to make sure we are talking about the same thing. That's the least I can do for a project that is allowing me to boot osx. I'll try to extensively explain everything, because I realize talking about assembly can be really confusing. Also because I'm starting to forget the details and want to document it :)

Notice my only interest here is making sure we understand each other, while maybe shedding light on an obscure possible bug and the even more subtle things that show up when trying to patch it.

Regardless of the kind of setup you are using, be it hybrid, only GPT or hybrid/pure_mbr_because_of_parallels, the bug is the same. I'm not discussing how to make a hybrid configuration work, or the difference between boot0 and boot0hfs.

According to my understanding of the source and my testing, there is something that is bypassing an important part of the code. Things can work without it, but it looks like a bug (a design flaw more than a bug), because the code is there and can't be used to make other configurations work (and they should be possible, as the code has already been written).


My first post says:
Quote
Due to some misplaced and missing instructions, the second pass in find_boot will never be run.

In find_boot: .continue, a 'jz' instruction is being used to enter checkGPT, and 'ret' to exit. So that 'ret' returns to the beginning, just after "call    find_boot" in start_reloc:, giving an error instead of executing the second mbr pass.

The relevant code I'm talking about is (line 359):
Code: [Select]
.continue:
    add     si, part_size          ; advance SI to next partition entry
    loop    .loop                  ; loop through all partition entries

    ;
    ; Scanned all partitions but not found any with active flag enabled
    ; Anyway if we found a protective MBR before we still have a chance
    ; for a possible GPT Header at LBA 1
    ;   
    dec     bl
    jz     checkGPT ; found Protective MBR before

    ;
    ; Switching to Pass 2
    ; try to find a boot1h aware HFS+ MBR partition
    ;
    dec     bh
    mov     si, kMBRPartTable ; set SI to first entry of MBR Partition table
    jz      .start_scan ; scan again

After scanning all the mbr partitions and not finding anything you get to "dec bl"

bl=1 means you detected a protective mbr (partition with type 0xee). If that's the case, dec bl will set the zero flag, so with a jz instruction you'll enter checkGPT.
There's the problem. checkGPT exits with a ret (line 476), and that returns to line 251:

Code: [Select]
    call    find_boot ; will not return on success

error:
   LogString(boot_error_str) ; <- line 251

Now the question is, should it do that? I think it shouldn't. But to be completely honest I don't think it's a bug, more like a design oversight.

The way it is now, boot0 is less powerful than it could be. If the hard disk has a GPT partition table and you didn't find any active partitions in the mbr then you check the GPT. If the GPT doesn't have anything, well, you can't boot.

In the case of boot0hfs, first you check for a hfs partition. If nothing is found and the disk has a GPT partition table, it is checked. If nothing is found, you are out of luck.

I however think the code should switch to Pass2 regardless of the presence of a GPT table. The patch is trivial, but because there is not enough space, it becomes less obvious, and that's why I had to write that long first post. It has no side effects, just adds more power.

For instance, there is 1 setup that can't work without this.
Let's pretend we have a hybrid mbr.

You want to boot native osx, native windows and virtualized windows. Parallels lets you make a virtual machine that uses physical partitions. The virtual machine will see the real disk, with the real mbr, and real windows partitions. But everything else becomes zeroes (including the GPT). Suddenly, you have a pure mbr setup with no GPT and empty HFS partitions (this is important, it's what makes this scenario special)

The difference between boot0 and boot0hfs is just that the former checks for active partitions and then (if there is no gpt) for hfs partitions, while the latter checks for hfs partitions and then (if there is no gpt) for active partitions. The restriction "if there is no gpt" is the thing my patch removes. Once I understood that difference, I just "put the pieces together" so they called each other and things worked fine in my system.

Using boot0 (active partition: hfs):
-When booting native osx, it loads the active partition's (hfs) boot sector. Chameleon window shows up and you select osx.
-When booting native windows, it loads the active partition's (hfs) boot sector. Chameleon window shows up and you select windows. It then loads windows partition's boot sector.
-When booting virtualized windows, it tries to load the active partition's (hfs) boot sector. Nothing is found, because GPT (and hfs and everything but windows) is now zeroes. Because it is a protective mbr, checkGPT is called. Again, nothing is found, because GPT is zeroes. Chameleon doesn't load. In this case pass2 won't execute, and it won't matter, that wouldn't fix anything anyway.

So in this situation I prefer to use boot0hfs (active partition: windows):
-When booting native osx, it loads the hfs partition. Chameleon window shows up and you select osx.
-When booting native windows, it loads the hfs partition. Chameleon window shows up and you select windows. It then loads windows partition's boot sector.
-When booting virtualized windows, it tries to load the hfs partition. Nothing is found, because GPT (and hfs and everything but windows) is now zeroes. Because it is a protective mbr, checkGPT is called. Again, nothing is found, because GPT is zeroes. In this case, chameleon doesn't load without my patch (or without removing the 0xee mbr entry), because otherwise it won't execute pass2. The patched code will run pass2, and windows active partition will be found. Its boot sector is executed and windows loads. This is exactly how I found out.
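
The scenarios above can be replayed with a small Python model of find_boot. This is a simplification under stated assumptions, not a translation of boot0.s: each MBR entry is reduced to a (type, active, has_valid_boot_sector) tuple, the GPT is reduced to a single flag, and the return strings are invented for the example.

```python
PROTECTIVE, HFS = 0xEE, 0xAF  # MBR partition type bytes


def find_boot(mbr, gpt_bootable, hfs_first, patched):
    """Return 'mbr[i]', 'gpt' or 'error'.

    mbr: list of (type, active, has_valid_boot_sector) tuples.
    hfs_first=True models boot0hfs, False models boot0.
    """
    def scan(want_hfs):
        for i, (ptype, active, valid) in enumerate(mbr):
            wanted = (ptype == HFS) if want_hfs else active
            if wanted and valid:
                return "mbr[%d]" % i
        return None

    protective = any(ptype == PROTECTIVE for ptype, _, _ in mbr)

    hit = scan(want_hfs=hfs_first)          # pass 1
    if hit:
        return hit
    if protective:
        if gpt_bootable:
            return "gpt"
        if not patched:
            return "error"                  # ret from checkGPT skips pass 2
    return scan(want_hfs=not hfs_first) or "error"   # pass 2


# Hybrid MBR: protective entry, HFS+ partition, active Windows partition.
native = [(0xEE, False, False), (0xAF, False, True), (0x07, True, True)]
# Same disk seen from the virtual machine: HFS+ and GPT are zeroes.
virtual = [(0xEE, False, False), (0xAF, False, False), (0x07, True, True)]

print(find_boot(native, True, hfs_first=True, patched=False))    # mbr[1]
print(find_boot(virtual, False, hfs_first=True, patched=False))  # error
print(find_boot(virtual, False, hfs_first=True, patched=True))   # mbr[2]
```

Only the last line differs between the patched and unpatched behaviour: the zeroed GPT no longer ends the search, so pass 2 reaches the active Windows partition.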

There are more ways of doing it, after all it's just the game of chaining bootloaders. I could use boot0 and set the windows partition as active. Then make it so windows shows you a dialog that asks if you want windows or osx. If you select osx, it loads chameleon, that asks you the same (and if you select windows, you go back to the previous step). That would work, but looks like a hack (which I guess is not that bad if you consider this is a hack-intosh, heh). Of course you could configure chameleon so it doesn't show a menu and set a short timeout in the windows boot dialog or something like that to avoid having to look at 2 booting screens. Notice if the mbr didn't have the hfs partition, this wouldn't work as windows' bootmanager wouldn't be able to bootchain chameleon. So yeah, it can work without the patch, although not in the "standard" and "official" setup.

In my opinion the cleanest, easiest and most robust way of doing the parallels virtualized physical windows is the patched boot0, assembled with HFSFIRST (boot0hfs). You lose nothing and "unlock" a feature that was already there, anyway. So this was not about loading this or that operating system, but about executing Pass2, why I think it should be done (at first glance it looked like a bug), how to do it, things I found when doing it, and how I found it through Parallels. I hope it's clear now, I just wanted to share these thoughts, and I also did it because of the challenge to understand what was going on and to make it work.

For future reference, to avoid forgetting it and because it may help, I'm going to post an explanation of the chunks of the code that handle this. You don't have to read it.

Code:

****boot0.s:

find_boot (line 268):
Line 277:
Code: [Select]
xor     bx, bx ; BL will be set to 1 later in case of
; Protective MBR has been found
inc     bh ; BH = 1. Giving a chance for a second pass
; to boot an inactive but boot1h aware HFS+ partition
; by scanning the MBR partition entries again.

Depending on bh, you go once or twice through all the mbr partition entries.
bh is always 1, so the idea is doing 2 passes. This is hardcoded. On the other hand setting bh to 0 wouldn't make sense without extra code, because the only pass executed would be pass2, and things would be checked in the inverse order you would expect them to be (because pass2 is the oh-god-panic-nothing-was-found pass). Also, that comment is only true if assembled without HFSFIRST. Otherwise BH = 1 means "giving a chance for a second pass to boot an active partition by scanning the MBR partition entries again".


Line 308:
Code: [Select]
mov     bl, 1 ; Assume we can deal with GPT but try to scan
    ; later if not found any other bootable partitions.

Then it goes through all the mbr partitions. If one with type 0xee is found, a protective mbr has just been found and it sets bl=1

For each entry in the mbr:

Pass1:
-If you assemble with -DHFSFIRST (getting boot0hfs):

Line 317:
Code: [Select]
cmp     BYTE [si + part.type], kPartTypeHFS ; In pass 1 we're going to find a HFS+ partition
                                                  ; equipped with boot1h in its boot record
                                                  ; regardless if it's active or not.

If partition's type is hfs, try to load its boot sector (it makes sense, because the flag is called HFS FIRST, after all)

-If you assemble without -DHFSFIRST (getting boot0):

Line 323:
Code: [Select]
cmp     BYTE [si + part.bootid], kPartActive ; In pass 1 we are walking on the standard path
                                                  ; by trying to hop on the active partition.

If partition is active, try to boot its boot sector
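
For readers who want to look at the fields being compared in pass 1, here is a Python sketch that reads them from a raw 512-byte MBR sector. It relies on the standard MBR layout (partition table at offset 446, 16-byte entries, boot flag at entry offset 0, type at entry offset 4, signature 0xAA55 at offset 510); the constant and function names are invented for this example and are not the ones defined in boot0.s.

```python
import struct

TABLE_OFFSET = 446        # 0x1BE, start of the partition table
ENTRY_SIZE = 16
BOOT_SIGNATURE = 0xAA55   # stored as bytes 55 AA at offset 510
ACTIVE = 0x80
TYPE_HFS = 0xAF
TYPE_PROTECTIVE = 0xEE


def parse_mbr(sector):
    """Return the four partition entries of an MBR sector as dicts."""
    if len(sector) != 512:
        raise ValueError("expected a 512-byte sector")
    (signature,) = struct.unpack_from("<H", sector, 510)
    if signature != BOOT_SIGNATURE:
        raise ValueError("missing boot signature")
    entries = []
    for i in range(4):
        off = TABLE_OFFSET + i * ENTRY_SIZE
        start_lba, sector_count = struct.unpack_from("<II", sector, off + 8)
        entries.append({
            "active": sector[off] == ACTIVE,
            "type": sector[off + 4],
            "start_lba": start_lba,
            "sectors": sector_count,
        })
    return entries


def has_protective_mbr(entries):
    """True if any entry has type 0xEE (the case where bl is set to 1)."""
    return any(e["type"] == TYPE_PROTECTIVE for e in entries)
```

On Linux or Mac OS X the sector could be obtained by reading the first 512 bytes of the disk device, which needs root privileges.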

After Pass1

Line 368:
Code: [Select]
    dec     bl
    jz     checkGPT ; found Protective MBR before

If nothing interesting was found in the mbr and bl=1, execute checkGPT

<checkGPT does stuff not relevant to this explanation>

Line 476:
Code: [Select]
    ret ; no more GUID partitions. Giving up.
If nothing is found, checkGPT returns to line 249

Line 248
Code: [Select]
    call    find_boot ; will not return on success

error:
    LogString(boot_error_str)

An error is shown and it ends. And this is bad because executing Pass2 in this case would be a nice thing to have (why not?)


However, if checkGPT is not executed because bl=0

Line 368:
Code: [Select]
    dec     bl
    jz     checkGPT ; found Protective MBR before

The jz is not followed

Line 371:
Code: [Select]
    ;
    ; Switching to Pass 2
    ; try to find a boot1h aware HFS+ MBR partition
    ;
    dec     bh
    mov     si, kMBRPartTable ; set SI to first entry of MBR Partition table
    jz      .start_scan ; scan again

And this is executed. If bh=1, and it is 1 because it's the first time we are here, we start the second pass. So we go back to the beginning, and start going through all the mbr partitions. Again, if one with type 0xee is found, a protective mbr has just been found and it sets bl=1.

Pass2 (it does the inverse of Pass1):
-If you assemble with -DHFSFIRST (getting boot0hfs):

Line 334:
Code: [Select]
  cmp     BYTE [si + part.bootid], kPartActive ; In pass 2 we are walking on the standard path
If partition is active, try to boot its boot sector

-If you assemble without -DHFSFIRST (getting boot0):

Line 340:
Code: [Select]
cmp     BYTE [si + part.type], kPartTypeHFS ; In pass 2 we're going to find a HFS+ partition
                                                  ; equipped with boot1h in its boot record
                                                  ; regardless if it's active or not.

If partition's type is hfs, try to load its boot sector

After Pass2

Line 368:
Code: [Select]
    dec     bl
    jz     checkGPT ; found Protective MBR before

    ;
    ; Switching to Pass 2
    ; try to find a boot1h aware HFS+ MBR partition
    ;
    dec     bh
    mov     si, kMBRPartTable ; set SI to first entry of MBR Partition table
    jz      .start_scan ; scan again

.exit:
    ret ; Giving up.

If nothing interesting is found, we are here again. If it's a protective mbr, bl=1 and checkGPT will be called again. Of course nothing will be found, otherwise we wouldn't be here.
Because this time bh=0, Pass2 will be ignored, and ret brings you to

Line 476:
Code: [Select]
    ret ; no more GUID partitions. Giving up.
checkGPT returns to line 249

Line 248
Code: [Select]
    call    find_boot ; will not return on success

error:
    LogString(boot_error_str)

Where an error is shown and it ends.
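
The whole walk-through hinges on how 'dec' sets the zero flag on the 8-bit registers bl and bh. A small Python model of that decision follows (the function names are made up; the 'mov si, ...' between 'dec bh' and 'jz' is left out because mov does not change flags):

```python
def dec8(value):
    """8-bit dec: return (new_value, zero_flag). 0 wraps to 0xFF."""
    result = (value - 1) & 0xFF
    return result, result == 0


def after_scan(bl, bh):
    """Mirror the original dec/jz sequence that follows a full MBR scan."""
    bl, zero = dec8(bl)
    if zero:
        return "checkGPT"   # jz checkGPT: protective MBR was seen
    bh, zero = dec8(bh)
    if zero:
        return "pass 2"     # jz .start_scan
    return "ret"            # fall through to .exit


print(after_scan(bl=1, bh=1))  # checkGPT
print(after_scan(bl=0, bh=1))  # pass 2
print(after_scan(bl=0, bh=0))  # ret
```

In the unpatched code the first case never comes back, so with a protective MBR the 'dec bh' is never reached.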
« Last Edit: March 21, 2011, 02:17:09 AM by KillerJK »

zef

  • Administrator
  • Posts: 265
Re: boot0 will never execute Pass2 in find_boot [with PATCH]
« Reply #5 on: March 23, 2011, 11:32:19 PM »
Hi KillerJK,

Many thanks again for the extended description, just applied your patch against the trunk:

http://forge.voodooprojects.org/p/chameleon/source/commit/750/

Bye,
zef

ErmaC

  • Resident
  • Posts: 134
Re: boot0 will never execute Pass2 in find_boot [with PATCH]
« Reply #6 on: March 26, 2011, 01:40:26 PM »
Hi guys!
First, sorry for my poor English.

I downloaded and compiled the latest version (not an RC) of NASM from here: --> http://www.nasm.us/
It is easy to build:
1) ./configure
2) make
3) sudo make install


NASM version 2.09.07 compiled on Mar 26 2011

my question is:
how do I adapt the latest changes in chameleon 2 RC5 rev 750 boot0.s for the newest nasm compiler?
Code: [Select]
DIR = boot0
include ../MakePaths.dir

NASM = /Developer/usr/bin/nasm
INSTALLDIR = $(DSTROOT)/usr/standalone/i386
DIRS_NEEDED = $(SYMROOT)

all embedtheme: $(DIRS_NEEDED) boot0 boot0hfs chain0

boot0: boot0.s Makefile $(NASM)
	$(NASM) boot0.s -o $(SYMROOT)/$@

boot0hfs: boot0.s Makefile $(NASM)
	$(NASM) boot0.s -DHFSFIRST -o $(SYMROOT)/$@

chain0: chain0.s Makefile $(NASM)
	$(NASM) chain0.s -o $(SYMROOT)/$@

install_i386:: all $(INSTALLDIR)
	cp $(SYMROOT)/boot0 $(SYMROOT)/chain0 $(INSTALLDIR)
	cd $(INSTALLDIR); chmod u+w boot0

include ../MakeInc.dir

#dependencies

Fabio
« Last Edit: March 26, 2011, 02:38:18 PM by iFabio »
P6T Deluxe v1 i7 940 Quadro Fx 5600
P6T SE i7 920 GeForce GT 240

KillerJK

  • Entrant
  • Posts: 7
Re: boot0 will never execute Pass2 in find_boot [with PATCH]
« Reply #7 on: March 27, 2011, 07:21:47 AM »
I took a look at the Makefile and in my system the "make install" is installing nasm to /usr/bin/nasm

So changing a line from

Code: [Select]
NASM = /Developer/usr/bin/nasm
to

Code: [Select]
NASM = /usr/bin/nasm
should do what you want.

If you are curious, the only difference between assembling boot0 with that version instead of using the 0.98 series (the one that comes with xcode) is the default optimization level used by the newer nasm (you can get the same results if you use -O3 with the old one).

Specifically, the

Code: [Select]
jmp    .tryToBoot
is optimized to

Code: [Select]
jmp    SHORT .tryToBoot
Other than that, the binaries are exactly the same. I noticed it will give you a warning about missing ":" (old version doesn't). You can fix it yourself if you change ".switchPass2" and write ".switchPass2:". I'm providing a patch for that.

I suppose that short jump could have been included in the patch, but I forgot. Anyway hardcoding short jumps you don't need may be bad if in the future code needs to be moved around and the jumps break. The reason the patch has them is because nasm 0.98 doesn't optimize by default, the makefile doesn't specify an optimization level, there is a bug in old nasm regarding -O2, xcode has the old nasm and the makefile doesn't do a version check. But it's good to know you can get 1 extra byte from it if needed.
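
A version check of the kind the makefile lacks could look roughly like this. It is only a Python sketch: the helper names are invented, the banner format is the one quoted earlier in this thread ("NASM version 2.09.07 compiled on Mar 26 2011"), and the rule "default is -Ox from 2.09 on, -O0 before" is taken from the documentation sentence quoted in the first post.

```python
import re


def parse_nasm_version(banner):
    """Extract (major, minor, patch) from the version banner of nasm."""
    match = re.search(r"NASM version (\d+)\.(\d+)(?:\.(\d+))?", banner)
    if match is None:
        return None
    return tuple(int(part) if part is not None else 0
                 for part in match.groups())


def default_optimization(version):
    """Default optimization level, assuming -Ox since 2.09 and -O0 before."""
    return "-Ox" if version >= (2, 9, 0) else "-O0"


new = parse_nasm_version("NASM version 2.09.07 compiled on Mar 26 2011")
old = parse_nasm_version("NASM version 0.98.40")
print(new, default_optimization(new))  # (2, 9, 7) -Ox
print(old, default_optimization(old))  # (0, 98, 40) -O0
```

The banner itself would come from running the assembler with its version flag and capturing the output.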

I added a few things to the first post that I found out because of your post, so thanks :) This is stuff you either document or lose forever.
« Last Edit: March 27, 2011, 08:00:12 AM by KillerJK »

KillerJK

  • Entrant
  • Posts: 7
Re: boot0 will never execute Pass2 in find_boot [with PATCH]
« Reply #8 on: March 27, 2011, 07:25:45 AM »
Quote
Hi KillerJK,

Many thanks again for the extended description, just applied your patch against the trunk:

http://forge.voodooprojects.org/p/chameleon/source/commit/750/

Bye,
zef

iFabio's post made me aware of a warning given by the newest nasm version. It's my fault really, a colon is missing, although it assembles correctly. The fix is attached.
« Last Edit: March 27, 2011, 08:44:21 PM by KillerJK »

ErmaC

  • Resident
  • Posts: 134
Re: boot0 will never execute Pass2 in find_boot [with PATCH]
« Reply #9 on: April 03, 2011, 08:47:01 PM »
hi.
I also "found" some warnings with the new NASM.
The warnings I found and "corrected" in the chain0.s file are missing colons (":"), 9 in total:
My knowledge of assembly is very... low :P

Quote
line 125 start: ; Added colon
line 281 .found: ; Added colon
line 435 .exit: ; Added colon
line 586 print_string: ; Added colon
line 589 .loop: ; Added colon
line 596 .exit: ; Added colon
line 693 pad_boot: ; Added colon
line 696 pad_table_and_sig: ; Added colon
line 700 END: ; Added colon

now I can compile w/o warnings
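
Labels like these can be hunted down with a rough heuristic: a line that holds a single bare word (optionally followed by a comment) and no colon. The Python sketch below does that. It is not how nasm itself decides; the mnemonic list is deliberately tiny and incomplete, so an operand-less instruction or macro that is missing from it would show up as a false positive.

```python
import re

# Small, incomplete set of instructions that legitimately stand alone on a line.
NO_OPERAND_MNEMONICS = {
    "ret", "retf", "iret", "cli", "sti", "cld", "std", "hlt", "nop",
    "pusha", "popa", "pushad", "popad", "pushf", "popf",
    "lodsb", "lodsw", "stosb", "stosw", "movsb", "movsw",
}

# One identifier, optional trailing comment, nothing else (so no colon).
LONE_TOKEN = re.compile(r"^\s*([A-Za-z_.$?][\w.$?@#~]*)\s*(?:;.*)?$")


def colonless_labels(source):
    """Return (line_number, name) for lines that look like labels without ':'."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = LONE_TOKEN.match(line)
        if match and match.group(1).lower() not in NO_OPERAND_MNEMONICS:
            hits.append((lineno, match.group(1)))
    return hits


sample = "start\n    cli\n.loop:\n    lodsb\n.exit   ; done\n    ret\n"
print(colonless_labels(sample))  # [(1, 'start'), (5, '.exit')]
```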

Fabio

KillerJK

  • Entrant
  • Posts: 7
Re: boot0 will never execute Pass2 in find_boot [with PATCH]
« Reply #10 on: April 04, 2011, 07:46:44 PM »
Uploaded iFabio's modifications in .patch format

zef

  • Administrator
  • Posts: 265
Re: boot0 will never execute Pass2 in find_boot [with PATCH]
« Reply #11 on: April 29, 2011, 12:21:29 PM »