Discussion:
[Pearpc-devel] hwmmu is working (please test)
Jens von der Heydt
2006-06-02 13:15:23 UTC
Permalink
Hi,

I managed to get the hwmmu branch to boot OS X Tiger (G3!) and
Panther without problems
and can verify that it actually does work for me. I tried to
combine the normal PearPC and hwmmu
branches but ended up having to manually fix problems here and there,
and that did not make the mmu work.

I noticed that when booting with the unmodified hwmmu branch
sources, OS X would stall right after
the first blue screen appeared. I was able to track this down to a
problem with the framebuffer: the
system kept trying to write to the framebuffer even though it was
sometimes read- and/or write-protected.
I know that Sebastian used this mechanism to get the damage areas of
the framebuffer, but OS X seems
to have problems with his implementation because it constantly tried
to access the same memory position.

I tracked this down with a small code change that recorded the memory
position of the last SEGV
and counted how often consecutive ones were identical. I made it give
up after around 4096 accesses to the
same memory position and then debugged the code and jitc.log. It turned
out that it was always a
framebuffer position, regardless of the PPC opcode being used to
access memory.
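
For illustration, a minimal sketch of that kind of instrumentation (not the actual patch; the names and the recovery path are invented): a SIGSEGV handler that remembers the last faulting address and gives up after roughly 4096 repeats.

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static void *g_last_fault = NULL;
static unsigned g_fault_count = 0;

static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
    void *addr = info->si_addr;              /* faulting memory position */
    if (addr == g_last_fault) {
        if (++g_fault_count > 4096) {        /* stuck on the same address: give up */
            /* fprintf is not async-signal-safe, but fine for a debugging hack */
            fprintf(stderr, "stuck on %p, giving up\n", addr);
            abort();
        }
    } else {
        g_last_fault = addr;
        g_fault_count = 1;
    }
    /* ...normal hwmmu recovery (remap/unprotect the page) would continue here... */
}

static void install_segv_counter(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);
}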

What I did was this: remove every part of the framebuffer stuff that
was written into hwmmu and
have the system access the framebuffer via IO again. That works for me,
and reasonably fast as well.

I've uploaded my current sources to my homepage; I'd like people to
test this:

http://www.vdh-webservice.de/hwmmu-working3.zip


The code contains some private patches and would need some major work
to get it clean and up to date
with the main PearPC branch. Therefore I'm not submitting a patch but
a source zip. Please test
and report.


Jens

PS: By the way, my test system was an AMD Athlon X2, Linux, kernel
2.6.16, PearPC built for 32-bit.
Daniel Foesch
2006-06-02 15:14:40 UTC
Permalink
Post by Jens von der Heydt
Hi,
I managed to get the hwmmu branch to boot OS X Tiger (G3!) and
Panther without problems
and can verify that it actually does work for me. I tried to
combine the normal PearPC and hwmmu
branches but ended up having to manually fix problems here and there,
and that did not make the mmu work.
I noticed that when booting with the unmodified hwmmu branch
sources, OS X would stall right after
the first blue screen appeared. I was able to track this down to a
problem with the framebuffer: the
system kept trying to write to the framebuffer even though it was
sometimes read- and/or write-protected.
I know that Sebastian used this mechanism to get the damage areas of
the framebuffer, but OS X seems
to have problems with his implementation because it constantly tried
to access the same memory position.
I tracked this down with a small code change that recorded the memory
position of the last SEGV
and counted how often consecutive ones were identical. I made it give
up after around 4096 accesses to the
same memory position and then debugged the code and jitc.log. It turned
out that it was always a
framebuffer position, regardless of the PPC opcode being used to
access memory.
What I did was this: remove every part of the framebuffer stuff that
was written into hwmmu and
have the system access the framebuffer via IO again. That works for me,
and reasonably fast as well.
I've uploaded my current sources to my homepage; I'd like people to
test this:
http://www.vdh-webservice.de/hwmmu-working3.zip
The code contains some private patches and would need some major work
to get it clean and up to date
with the main PearPC branch. Therefore I'm not submitting a patch but
a source zip. Please test
and report.
Jens
PS: By the way, my test system was an AMD Athlon X2, Linux, kernel
2.6.16, PearPC built for 32-bit.
Wow, great work there Jens! It sounds like you have managed to
accomplish a lot!

I'm almost jealous... which is good, because it might make me start
working on PearPC again.
--
Daniel Foesch
Jens von der Heydt
2006-06-02 15:57:54 UTC
Permalink
Post by Daniel Foesch
Post by Jens von der Heydt
What I did was this: remove every part of the framebuffer stuff that
was written into hwmmu and
have the system access the framebuffer via IO again. That works for me,
and reasonably fast as well.
I've uploaded my current sources to my homepage; I'd like people to
test this:
http://www.vdh-webservice.de/hwmmu-working3.zip
The code contains some private patches and would need some major work
to get it clean and up to date
with the main PearPC branch. Therefore I'm not submitting a patch but
a source zip. Please test
and report.
Jens
PS: By the way, my test system was an AMD Athlon X2, Linux, kernel
2.6.16, PearPC built for 32-bit.
Wow, great work there Jens! It sounds like you have managed to
accomplish a lot!
I'm almost jealous... which is good, because it might make me start
working on PearPC again.
--
Daniel Foesch
Hehe, well, you could start by running it through your gcc and giving
some feedback.
And speaking of being jealous: wait till I fix your AltiVec stuff.
THEN you can be jealous, haha!

Jens
Alex Smith
2006-06-02 19:48:05 UTC
Permalink
Post by Jens von der Heydt
Post by Daniel Foesch
Post by Jens von der Heydt
What I did was this: remove every part of the framebuffer stuff that
was written into hwmmu and
have the system access the framebuffer via IO again. That works for me,
and reasonably fast as well.
I've uploaded my current sources to my homepage; I'd like people to
test this:
http://www.vdh-webservice.de/hwmmu-working3.zip
The code contains some private patches and would need some major work
to get it clean and up to date
with the main PearPC branch. Therefore I'm not submitting a patch but
a source zip. Please test
and report.
Jens
PS: By the way, my test system was an AMD Athlon X2, Linux, kernel
2.6.16, PearPC built for 32-bit.
Wow, great work there Jens! It sounds like you have managed to
accomplish a lot!
I'm almost jealous... which is good, because it might make me start
working on PearPC again.
--
Daniel Foesch
Hehe, well, you could start by running it through your gcc and giving
some feedback.
And speaking of being jealous: wait till I fix your AltiVec stuff.
THEN you can be jealous, haha!
Jens
Does this code offer any speed improvement (haven't got time to test
it at the minute)?

Alex
Alexandru Lazar
2006-06-02 20:32:51 UTC
Permalink
Jens von der Heydt
2006-06-02 21:00:16 UTC
Permalink
On 02.06.2006, at 22:32, Alexandru Lazar wrote:
Alexandru Lazar
2006-06-02 22:06:01 UTC
Permalink
So that's an "it does work for me", then?
Jens
Yep, it works. I tested it on a computer that's mostly similar to
yours. I'll try it asap on my FreeBSD box at home.
Jens von der Heydt
2006-06-02 22:47:56 UTC
Permalink
Post by Alexandru Lazar
So that's an "it does work for me", then?
Jens
Yep, it works. I tested it on a computer that's mostly similar to
yours. I'll try it asap on my FreeBSD box at home.
I'm not exactly sure whether the mmu sources rely on certain kernel
versions or the amount of memory available
in the PC. Maybe we should post some info regarding that. I booted OS
X with Linux 2.6.16 on an AMD box
with 2 GB of memory.

Jens

PS: Happy that it does work for you too.
Alex Smith
2006-06-03 06:09:41 UTC
Permalink
Post by Jens von der Heydt
Post by Alexandru Lazar
So that's an "it does work for me", then?
Jens
Yep, it works. I tested it on a computer that's mostly similar to
yours. I'll try it asap on my FreeBSD box at home.
I'm not exactly sure whether the mmu sources rely on certain kernel
versions or the amount of memory available
in the PC. Maybe we should post some info regarding that. I booted OS
X with Linux 2.6.16 on an AMD box
with 2 GB of memory.
Jens
PS: Happy that it does work for you too.
When I get some time today, I'll test it on my machine - a 700 MHz
Pentium 3 :) That should be good for testing the improvement.

Alex
Hugh McMaster
2006-06-03 09:31:11 UTC
Permalink
If someone could compile a Windows binary, I will happily test it and report
back. Have made the same request on PearPC.net too.
Post by Alex Smith
Post by Jens von der Heydt
Post by Alexandru Lazar
So that's an "it does work for me", then?
Jens
Yep, it works. I tested it on a computer that's mostly similar to
yours. I'll try it asap on my FreeBSD box at home.
I'm not exactly sure whether the mmu sources rely on certain kernel
versions or the amount of memory available
in the PC. Maybe we should post some info regarding that. I booted OS
X with Linux 2.6.16 on an AMD box
with 2 GB of memory.
Jens
PS: Happy that it does work for you too.
When I get some time today, I'll test it on my machine - a 700 MHz
Pentium 3 :) That should be good for testing the improvement.
Alex
Jens von der Heydt
2006-06-03 09:32:39 UTC
Permalink
Post by Hugh McMaster
If someone could compile a Windows binary, I will happily test it
and report back. Have made the same request on PearPC.net too.
The hwmmu branch is completely LINUX-ONLY at the moment. You won't be
able to build an EXE with it right now :)

jens
Hugh McMaster
2006-06-03 10:05:05 UTC
Permalink
Oh well. When the support comes out, I'll be waiting.
Post by Jens von der Heydt
Post by Hugh McMaster
If someone could compile a Windows binary, I will happily test it
and report back. Have made the same request on PearPC.net too.
The hwmmu branch is completely LINUX-ONLY at the moment. You won't be
able to build an EXE with it right now :)
jens
Alex Smith
2006-06-03 10:51:24 UTC
Permalink
Post by Hugh McMaster
Oh well. When the support comes out, I'll be waiting.
Post by Jens von der Heydt
Post by Hugh McMaster
If someone could compile a Windows binary, I will happily test it
and report back. Have made the same request on PearPC.net too.
The hwmmu branch is completely LINUX-ONLY at the moment. You won't be
able to build an EXE with it right now :)
jens
I don't think it gets along well with my Pentium 3.

It just hangs at startup. If I put OS X in verbose mode, it seems
stuck at this:

/etc/rc: line 276: 133 Hangup SystemStarter -gr
${VerboseFlag} ${SafeBoot}

I'm running on Frugalware Linux with kernel
2.6.16.cantrememberrevision, Glibc 2.4 and GCC 4.1.1

Alex
Jens von der Heydt
2006-06-03 14:24:56 UTC
Permalink
Post by Alex Smith
I don't think it gets along well with my Pentium 3.
It just hangs at startup. If I put OS X in verbose mode, it seems
stuck at this:
/etc/rc: line 276: 133 Hangup SystemStarter -gr
${VerboseFlag} ${SafeBoot}
I'm running on Frugalware Linux with kernel
2.6.16.cantrememberrevision, Glibc 2.4 and GCC 4.1.1
Alex
The predefined values might be too large for that system. How much RAM
does that box have?

Jens
Alex Smith
2006-06-03 17:24:58 UTC
Permalink
Post by Jens von der Heydt
Post by Alex Smith
I don't think it gets along well with my Pentium 3.
It just hangs at startup. If I put OS X in verbose mode, it seems
stuck at this:
/etc/rc: line 276: 133 Hangup SystemStarter -gr
${VerboseFlag} ${SafeBoot}
I'm running on Frugalware Linux with kernel
2.6.16.cantrememberrevision, Glibc 2.4 and GCC 4.1.1
Alex
The predefined values might be too large for that system. How much RAM
does that box have?
Jens
The machine has 640MB of RAM - plenty enough if you ask me

Alex
Jens von der Heydt
2006-06-03 14:30:12 UTC
Permalink
Post by Hugh McMaster
Oh well. When the support comes out, I'll be waiting.
Well, the current hwmmu branch can only work on Linux machines, since
Windows' memory allocation granularity is larger than 4 KB. Maybe we can
work around that problem in the future.
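
For illustration only, a tiny program showing that difference: GetSystemInfo reports a 4 KB page size but a much larger allocation granularity (typically 64 KB) on Windows, which is what rules out a straight 4 KB mapping scheme there.

#include <windows.h>
#include <stdio.h>

int main(void)
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    printf("page size:              %lu bytes\n", (unsigned long)si.dwPageSize);
    printf("allocation granularity: %lu bytes\n", (unsigned long)si.dwAllocationGranularity);
    return 0;
}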

Jens
Jens von der Heydt
2006-06-02 20:59:52 UTC
Permalink
Post by Alex Smith
Post by Jens von der Heydt
Hehe, well, you could start by running it through your gcc and giving
some feedback.
And speaking of being jealous: wait till I fix your AltiVec stuff.
THEN you can be jealous, haha!
Jens
Does this code offer any speed improvement (haven't got time to test
it at the minute)?
Alex
Alex, given that the hwmmu branch never actually worked before,
I'd say we should be happy to move forward a small step here.
Anyone saying that it works for them is better than what we
have had during the last 12 months (hwmmu is nearly a year old).

It's very unoptimized but offers great potential to gain speed.
It even offers certain ways to optimize the jitc code / the generated
code.
I've introduced some small patches that have yet to be tested, but I'd
tend to say
that this code works as well as the official PearPC branch.

Once people find it to be working for them, we can start to seriously
optimize this
branch and later merge it into the official branch.

Jens
Gwenole Beauchesne
2006-06-03 16:19:30 UTC
Permalink
On Friday, 2 Jun 2006, at 03:15 Pacific/Honolulu, Jens von der Heydt wrote:
Post by Jens von der Heydt
PS: By the way, my test system was an AMD Athlon X2, Linux, kernel
2.6.16, PearPC built for 32-bit.
It should not be that difficult to implement hwmmu: it's shared memory
and mmap/mprotect, and SIGSEGV recovery for new pages. The difficult
part is to decide which guest memory region to drop support for and how
much guest memory you want to have...

You have to remember a few figures:

- Typically, a 32-bit kernel can only support up to 3 GB of user
address space, thus ending at 0xc0000000. With an x86_64 kernel and
provided that you haven't set up a 32-bit personality for 3GB address
space limit, that's something like 4 GB - a page (which is very good).

- By default, the max shm segment allocatable is 32 MB. You can either
change that through /proc/sys/kernel/shmmax or split your guest regions
down to 32 MB at most (programmatically check the actual limit).
Besides, make sure someone has not decided to reduce the shm allocation
limit. Typically, that's 2 million pages which is enough
(/proc/sys/kernel/shmall).
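
A small sketch of the "programmatically check the actual limit" idea (illustrative values, not PearPC code): read /proc/sys/kernel/shmmax and derive how many segments a given amount of guest RAM needs.

#include <stdio.h>

static unsigned long long read_shmmax(void)
{
    unsigned long long shmmax = 32ULL * 1024 * 1024;   /* fall back to the 32 MB default */
    FILE *f = fopen("/proc/sys/kernel/shmmax", "r");
    if (f) {
        if (fscanf(f, "%llu", &shmmax) != 1)
            shmmax = 32ULL * 1024 * 1024;
        fclose(f);
    }
    return shmmax;
}

int main(void)
{
    unsigned long long limit = read_shmmax();
    unsigned long long guest_ram = 256ULL * 1024 * 1024;        /* example guest size */
    unsigned long long segments = (guest_ram + limit - 1) / limit;
    printf("shmmax = %llu bytes -> %llu segment(s) for %llu bytes of guest RAM\n",
           limit, segments, guest_ram);
    return 0;
}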

With that, you can be left with the following regions (for a 32-bit
kernel with default configuration -- 3 GB user address space): [--
guest --|-- host --].

[-- host --] can take up to 256 MB (depending on your JIT cache size,
executable & shared libraries, etc.). Thus 2.75 GB are left for the guest,
0x00000000 -> 0xb0000000. For [-- host --] you will have to use a linker
script to relocate the executable to 0xb0000000 and possibly have to
manually handle allocations (malloc() & friends) so that they map above
your executable & shared libs (e.g. 0xb8000000 -> 128 MB left).
Alternatively, you could reserve all the [-- guest --] address space
but [...]
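
As a hedged sketch of the "reserve the guest address space" idea (the size is illustrative, and a plain 32-bit kernel will rarely have this much contiguous free space, which is exactly the constraint above): grab one big PROT_NONE mapping up front so the heap and shared libraries cannot land inside the guest window, then remap or unprotect individual pages on demand later.

#include <sys/mman.h>
#include <stdio.h>

int main(void)
{
    const size_t guest_size = 0xb0000000UL;   /* 2.75 GB, as in the layout above */
    void *base = mmap(NULL, guest_size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (base == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    printf("guest window reserved at %p\n", base);
    munmap(base, guest_size);
    return 0;
}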

[-- guest --]: you are now left with 0x00000000 - 0xb0000000 (2.75 GB).
Depending on what your upper address range (Mac OS X) looks like (e.g.
frame buffer at 0xf0000000?), you will have to "rotate", e.g. use a
constant offset of 0x10000000 (so that 0xf0000000 is now at 0x00000000).
With the previously mentioned limit, you are bound to _not_ handle
[ 0xa0000000 - 0xefffffff ] in the Mac OS X address space.
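
The rotation itself is just a constant offset applied modulo 2^32; a sketch with the example values above:

#include <stdint.h>
#include <stdio.h>

static const uint32_t ROTATE_OFFSET = 0x10000000u;   /* example offset from above */

static uint32_t guest_to_host(uint32_t guest_addr)
{
    return guest_addr + ROTATE_OFFSET;   /* uint32_t arithmetic wraps modulo 2^32 */
}

static uint32_t host_to_guest(uint32_t host_addr)
{
    return host_addr - ROTATE_OFFSET;
}

int main(void)
{
    printf("guest 0xf0000000 -> host 0x%08x\n", guest_to_host(0xf0000000u));  /* 0x00000000 */
    printf("guest 0x00000000 -> host 0x%08x\n", guest_to_host(0x00000000u));  /* 0x10000000 */
    printf("host  0x10000000 -> guest 0x%08x\n", host_to_guest(0x10000000u)); /* 0x00000000 */
    return 0;
}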

Besides, unless you want to record the physical/virtual mappings and
move (copy) the pages accordingly, you may have to allocate your whole
MacOS X memory in the [-- host --] region, thus constraining the amount
of MacOS X memory.

BTW, if you target MacOS X for Intel as the host OS, it could be
interesting to note that Mach syscalls are pretty slow, and especially
SIGSEGV recovery (catch_exception and handle it). Hence, you might not
get an improvement. E.g. for SheepShaver, which used write-protected
pages to detect dirty frame buffer regions, performance dropped
terribly. Nowadays, I use Native QuickDraw acceleration bounding boxes,
even for the hooks we don't implement, to manually unprotect and record
the pages so that the fault doesn't occur: performance improved by a
factor of 2 on MacOS X...
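
For reference, the write-protect scheme described here looks roughly like this (a hedged sketch, not SheepShaver or PearPC code; names are invented). Every first write to a protected frame buffer page costs one fault plus the recovery path, which is where the slowdown comes from.

#include <signal.h>
#include <sys/mman.h>
#include <stddef.h>
#include <stdint.h>

static uint8_t *fb_base;          /* start of the emulated frame buffer (set elsewhere) */
static size_t   fb_size;
static volatile char *fb_dirty;   /* one flag per 4 KB page (allocated elsewhere) */

static void fb_segv(int sig, siginfo_t *info, void *ctx)
{
    uintptr_t addr = (uintptr_t)info->si_addr;
    if (addr >= (uintptr_t)fb_base && addr < (uintptr_t)fb_base + fb_size) {
        size_t page = (addr - (uintptr_t)fb_base) >> 12;
        fb_dirty[page] = 1;                   /* remember the damaged page */
        /* let the faulting store proceed by unprotecting just this page */
        mprotect(fb_base + (page << 12), 4096, PROT_READ | PROT_WRITE);
        return;
    }
    /* anything else would fall through to the normal hwmmu recovery */
}

/* Called from the redraw thread after a dirty page has been copied to the screen. */
static void fb_page_flushed(size_t page)
{
    fb_dirty[page] = 0;
    mprotect(fb_base + (page << 12), 4096, PROT_READ);   /* arm the write trap again */
}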

HTH.

Bye,
Gwenole.
Jens von der Heydt
2006-06-05 08:41:56 UTC
Permalink
Post by Gwenole Beauchesne
On Friday, 2 Jun 2006, at 03:15 Pacific/Honolulu, Jens von der Heydt wrote:
Post by Jens von der Heydt
PS: By the way, my test system was an AMD Athlon X2, Linux, kernel
2.6.16, PearPC built for 32-bit.
It should not be that difficult to implement hwmmu: it's shared memory
and mmap/mprotect, and SIGSEGV recovery for new pages. The difficult
part is to decide which guest memory region to drop support for and how
much guest memory you want to have...
Yes, in theory it's quite easy, though I do remember that Sebastian had
problems with segment register mapping. He came up with quite
a nice solution, storing register contexts and allocating the memory
in chunks. So we're not exactly working on a page (4 KB) basis here.
Post by Gwenole Beauchesne
- Typically, a 32-bit kernel can only support up to 3 GB of user
address space, thus ending at 0xc0000000. With an x86_64 kernel and
provided that you haven't set up a 32-bit personality for 3GB address
space limit, that's something like 4 GB - a page (which is very good).
I was searching for information on that topic, since I gathered from
other
sources that this is the actual problem of the hwmmu branch. I'm running
it on a 64-bit kernel, and that seems to be the reason why it's
actually working
for me - though I wonder what was wrong with the framebuffer.
As soon as I removed the FB mapping from the mmu (at 0x84000000 -
0x85000...),
it did work.
Post by Gwenole Beauchesne
- By default, the max shm segment allocatable is 32 MB. You can either
change that through /proc/sys/kernel/shmmax or split your guest regions
down to 32 MB at most (programmatically check the actual limit).
Besides, make sure someone has not decided to reduce the shm
allocation
limit. Typically, that's 2 million pages which is enough
(/proc/sys/kernel/shmall).
That's interesting. We allocate shared memory segments of 4 KB each.
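
Just to make the allocation unit concrete, a minimal sketch of one such 4 KB SysV segment (addresses are examples, not what PearPC uses). The same segment can be attached at more than one address, which is what makes SysV shm attractive for mapping guest pages.

#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdio.h>

int main(void)
{
    int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (id < 0) { perror("shmget"); return 1; }

    /* attach once wherever the kernel likes, and once at a chosen (page-aligned) address */
    void *a = shmat(id, NULL, 0);
    void *b = shmat(id, (void *)0x40000000, 0);          /* example fixed address, may fail */
    printf("attached at %p and %p\n", a, b);

    if (a != (void *)-1) shmdt(a);
    if (b != (void *)-1) shmdt(b);
    shmctl(id, IPC_RMID, NULL);                           /* release the segment */
    return 0;
}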
Post by Gwenole Beauchesne
With that, you can be left with the following regions (for a 32-bit
kernel with default configuration -- 3 GB user address space): [--
guest --|-- host --].
Do you know of a way to find the memory regions that are safe to map
on a specific kernel?
Post by Gwenole Beauchesne
BTW, if you target MacOS X for Intel as the host OS, it could be
interesting to note that Mach syscalls are pretty slow, and
especially
SIGSEGV recovery (catch_exception and handle it). Hence, you might not
get an improvement. E.g. for SheepShaver, which used write-protected
pages to detect dirty frame buffer regions, performance dropped
terribly. Nowadays, I use Native QuickDraw acceleration bounding boxes,
even for the hooks we don't implement, to manually unprotect and record
the pages so that the fault doesn't occur: performance improved by a
factor of 2 on MacOS X...
We don't target the x86 OS X version, and regarding the framebuffer /
damage region
performance you're completely right. Using SEGV to get those regions
would be
quite slow, and I do know better ways to get the damage regions from
the OS.
Benchmarks also show that PearPC is not spending most of its time in
the redraw
thread.

Thx for the info.
Post by Gwenole Beauchesne
HTH.
Bye,
Gwenole.
Jens
Sebastian Biallas
2006-06-06 15:55:15 UTC
Permalink
Post by Jens von der Heydt
Yes, in theory it's quite easy though I do remember that Sebastian had
problems with segment register mapping.
The problem is that the segment registers change for every
syscall/interrupt, which makes all mappings invalid. With the current
approach the mappings can be cached, greatly reducing the mmap overhead.

- --
Sebastian
Sebastian Biallas
2006-06-06 15:52:31 UTC
Permalink
Post by Jens von der Heydt
Hi,
I managed to get the hwmmu branch to boot OS X Tiger (G3!) and
Panther without problems
Hey, that is cool. Hope this really works and not by accident...


- --
Sebastian
Jens von der Heydt
2006-06-06 16:05:28 UTC
Permalink
Post by Jens von der Heydt
Hi,
I managed to get the hwmmu branch to boot OS X Tiger (G3!) and
Panther without problems
Hey, that is cool. Hope this really works and not by accident...
- --
Sebastian
Yes, that's really cool, I was more than happy to find it working for
me.
Jens von der Heydt
2006-06-06 19:36:42 UTC
Permalink
Post by Jens von der Heydt
Hi,
I managed to get the hwmmu branch to boot OS X Tiger (G3!) and
Panther without problems
Hey, that is cool. Hope this really works and not by accident...
- --
Sebastian
I've also published a first benchmark at pearpc.net:

http://www.pearpc.net/download.php?sid=&id=125

The left one is hwmmu. You can see that even now the memory transfer is
nearly
twice as fast.


Jens

Sebastian Biallas
2006-06-06 16:01:55 UTC
Permalink
Post by Jens von der Heydt
http://www.vdh-webservice.de/hwmmu-working3.zip
What kind of zip is this? Where does it store the file attributes?

(And BTW, you can make a snapshot with "make dist")

- --
Sebastian
Jens von der Heydt
2006-06-06 16:07:48 UTC
Permalink
Post by Jens von der Heydt
http://www.vdh-webservice.de/hwmmu-working3.zip
What kind of zip is this? Where does it store the file attributes?
(And BTW, you can make a snapshot with "make dist")
- --
Sebastian
It's a normal, standard Linux (Fedora Core 5) zip :) I could have done a
make dist but was afraid that I would break some custom changes I made
to the makefiles / configure to use NASM and a special -m32 for my
64-bit build system.
It's just a quick hack, considering the things I did to make it work
with the newer
file versions out of the main branch.

Jens