[Pearpc-devel] Making the CPU-emulation reentrant
Sebastian Biallas
2006-03-08 20:56:55 UTC
Hello!

For some time now I've been thinking about making the CPU emulation reentrant.
The idea would be to put the gCPU structure on the stack while executing
the generated code (so all register accesses etc would be %esp relative
instead of absolute addresses) and to pass a pointer to the CPU
structure to all "normal" functions.
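
A minimal sketch of the "pass a pointer" half of this idea, in C. The names CpuState, cpu_addi and run_on_stack are hypothetical and the structure is heavily reduced; the real gCPU has many more fields, and the generated x86 code would reach it through %esp rather than through a C pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily reduced CPU state (not the real gCPU layout). */
typedef struct {
    uint32_t gpr[32];   /* general-purpose registers */
    uint32_t pc;        /* program counter */
} CpuState;

/* Reentrant helper: the caller says which CPU is meant, so no global
 * state is touched and several CPUs can use the same function. */
void cpu_addi(CpuState *cpu, int rd, int ra, int32_t imm)
{
    assert(cpu != NULL);
    cpu->gpr[rd] = cpu->gpr[ra] + (uint32_t)imm;
    cpu->pc += 4;
}

/* The CPU state lives in this stack frame for the duration of the run,
 * which is the C analogue of keeping the structure on the stack. */
uint32_t run_on_stack(uint32_t start)
{
    CpuState cpu;
    memset(&cpu, 0, sizeof cpu);
    cpu.gpr[3] = start;
    cpu_addi(&cpu, 3, 3, 1);
    cpu_addi(&cpu, 3, 3, 1);
    return cpu.gpr[3];
}
```

Two calls to run_on_stack, even from different threads, never see each other's registers, because each call owns its own CpuState.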

The question is: Is this worth the effort?

cons:
- %esp relative accesses are one byte longer.
- asm code might become slightly more complicated
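
The extra byte can be seen by comparing two IA-32 encodings of the same load with a 32-bit displacement: addressing through %esp always needs a SIB byte. The sketch below just stores the two byte sequences and compares their lengths; the displacement 0x12345678, the choice of %ecx and the function name are arbitrary, picked only for illustration. (Small offsets that fit an 8-bit displacement are shorter in either form, so the real cost depends on where a field sits in the structure.)

```c
#include <stddef.h>
#include <stdint.h>

/* movl 0x12345678, %ecx
 * opcode 8B, ModRM 0D (mod=00, reg=ecx, rm=101: disp32 only), disp32 */
static const uint8_t mov_absolute[] = {
    0x8B, 0x0D, 0x78, 0x56, 0x34, 0x12
};

/* movl 0x12345678(%esp), %ecx
 * opcode 8B, ModRM 8C (mod=10, reg=ecx, rm=100: SIB follows),
 * SIB 24 (no index, base=esp), disp32 */
static const uint8_t mov_esp_relative[] = {
    0x8B, 0x8C, 0x24, 0x78, 0x56, 0x34, 0x12
};

/* Number of additional bytes the %esp-relative form costs. */
size_t extra_bytes_for_esp(void)
{
    return sizeof mov_esp_relative - sizeof mov_absolute;
}
```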

pros:
- ability to implement SMP

Comments?

Sebastian
Jens von der Heydt
2006-03-08 21:00:48 UTC
Post by Sebastian Biallas
Hello!
For some time now I've been thinking about making the CPU emulation reentrant.
The idea would be to put the gCPU structure on the stack while executing
the generated code (so all register accesses etc would be %esp relative
instead of absolute addresses) and to pass a pointer to the CPU
structure to all "normal" functions.
The question is: Is this worth the effort?
cons:
- %esp relative accesses are one byte longer.
- asm code might become slightly more complicated
pros:
- ability to implement SMP
Comments?
Sebastian
Correct me if I'm wrong, but that's not the only way to implement SMP,
is it?

We could also introduce a 2nd gCPU structure and still pass pointers to
the CPU structure (or the CPU number) to functions, right?
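
A rough C sketch of this alternative, assuming a fixed number of CPUs held in a global array. NUM_CPUS, gCPUs, cpu_by_number and the accessors are hypothetical names, not existing PearPC code.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CPUS 2   /* assumption: a fixed, small number of emulated CPUs */

/* Hypothetical, heavily reduced CPU state. */
typedef struct {
    uint32_t gpr[32];
    uint32_t pc;
} CpuState;

/* One global structure per emulated CPU instead of a single gCPU. */
static CpuState gCPUs[NUM_CPUS];

/* Translate a CPU number into a pointer to its state. */
CpuState *cpu_by_number(int n)
{
    assert(n >= 0 && n < NUM_CPUS);
    return &gCPUs[n];
}

/* "Normal" functions take the pointer rather than using a global. */
void cpu_set_gpr(CpuState *cpu, int reg, uint32_t value)
{
    cpu->gpr[reg] = value;
}

uint32_t cpu_get_gpr(const CpuState *cpu, int reg)
{
    return cpu->gpr[reg];
}
```

Each structure still sits at its own absolute address, so generated code that hard-codes one of those addresses only works for that CPU.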

Jens
Sebastian Biallas
2006-03-08 21:22:49 UTC
Post by Jens von der Heydt
Correct me if I'm wrong, but that's not the only way to implement SMP,
is it?
We could also introduce a 2nd gCPU structure and still pass pointers to
the CPU structure (or the CPU number) to functions, right?
Then they couldn't share the translation cache. Hmmm... But maybe
sharing the translation cache is a stupid thing per se.

Sebastian
Daniel Foesch
2006-03-09 02:20:23 UTC
Post by Sebastian Biallas
Post by Jens von der Heydt
Correct me if I'm wrong, but that's not the only way to implement SMP,
is it?
We could also introduce a 2nd gCPU structure and still pass pointers to
the CPU structure (or the CPU number) to functions, right?
Then they couldn't share the translation cache. Hmmm... But maybe
sharing the translation cache is a stupid thing per se.
I was thinking of something similar myself. The idea of using
stack-relative execution is quite interesting. My idea was
simply to have a pointer passed around and use global accesses anyway
(this would also help to eliminate our PIC problems)

This would allow us to avoid increasing translated code size, but I
don't think that's the biggest problem we're facing.

The big problem, though, is that both cores can't share translated code,
which might still be the right thing to do. If we did everything with
relative accesses, then we'd lose yet another register for use.
Putting it on the stack and using %esp relative accesses seems like
the perfect workaround for this.

--
Daniel Foesch
Sebastian Biallas
2006-03-09 02:37:08 UTC
Post by Daniel Foesch
Post by Sebastian Biallas
Post by Jens von der Heydt
Correct me if I'm wrong, but that's not the only way to implement SMP,
is it?
We could also introduce a 2nd gCPU structure and still pass pointers to
the CPU structure (or the CPU number) to functions, right?
Then they couldn't share the translation cache. Hmmm... But maybe
sharing the translation cache is a stupid thing per se.
I was thinking of something similar myself, also. The idea of using a
stack relative execution would be quite interesting. My idea was
simply to have a pointer passed around and use global accesses anyways
(this would also help to eliminate our PIC problems)
For the PIC problem we'd have to eliminate all global variables which
are accessed in the asm files (also gMemory, etc).
Post by Daniel Foesch
This would allow us to avoid increasing translated code size, but I
don't think that's the biggest problem we're facing.
The big problem is though that both cores can't share translated code,
which might be the right thing to do still. If we did everything with
relative accesses, then we'd lose yet another register for use.
Putting it on the stack and using %esp relative accesses seems like
the perfect work around for this.
Yes, but sharing the translation cache (tc) would be a synchronization
problem between the CPU threads. It would involve a lot of locking and
(even worse) moving a lot of data between the real CPU caches.
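
To make the locking cost concrete, here is a rough C sketch of a shared translation cache guarded by one pthread mutex. SharedTc, TcEntry, TC_SLOTS and the tc_* functions are hypothetical and far simpler than a real translation cache; the point is only that every lookup, insert and flush has to take the same lock.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define TC_SLOTS 256   /* assumption: tiny direct-mapped table */

typedef struct {
    uint32_t    guest_pc;    /* emulated address of the block */
    const void *host_code;   /* translated code for that block */
    int         valid;
} TcEntry;

typedef struct {
    pthread_mutex_t lock;    /* one lock shared by all CPU threads */
    TcEntry         slots[TC_SLOTS];
} SharedTc;

void tc_init(SharedTc *tc)
{
    pthread_mutex_init(&tc->lock, NULL);
    for (size_t i = 0; i < TC_SLOTS; i++)
        tc->slots[i].valid = 0;
}

/* Every entry into translated code pays for the lock. */
const void *tc_lookup(SharedTc *tc, uint32_t pc)
{
    const void *code = NULL;
    pthread_mutex_lock(&tc->lock);
    TcEntry *e = &tc->slots[(pc >> 2) % TC_SLOTS];
    if (e->valid && e->guest_pc == pc)
        code = e->host_code;
    pthread_mutex_unlock(&tc->lock);
    return code;
}

void tc_insert(SharedTc *tc, uint32_t pc, const void *code)
{
    pthread_mutex_lock(&tc->lock);
    TcEntry *e = &tc->slots[(pc >> 2) % TC_SLOTS];
    e->guest_pc = pc;
    e->host_code = code;
    e->valid = 1;
    pthread_mutex_unlock(&tc->lock);
}

/* A flush blocks every other CPU thread until it is done. */
void tc_flush(SharedTc *tc)
{
    pthread_mutex_lock(&tc->lock);
    for (size_t i = 0; i < TC_SLOTS; i++)
        tc->slots[i].valid = 0;
    pthread_mutex_unlock(&tc->lock);
}
```

With one cache per CPU thread, none of these locks would be needed, at the price of translating the same guest code more than once.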

Sebastian
Daniel Foesch
2006-03-09 02:56:11 UTC
Post by Sebastian Biallas
Post by Daniel Foesch
Post by Sebastian Biallas
Post by Jens von der Heydt
Correct me if I'm wrong, but that's not the only way to implement SMP,
is it?
We could also introduce a 2nd gCPU structure and still pass pointers to
the CPU structure (or the CPU number) to functions, right?
Then they couldn't share the translation cache. Hmmm... But maybe
sharing the translation cache is a stupid thing per se.
I was thinking of something similar myself, also. The idea of using a
stack relative execution would be quite interesting. My idea was
simply to have a pointer passed around and use global accesses anyways
(this would also help to eliminate our PIC problems)
For the PIC problem we'd have to eliminate all global variables which
are accessed in the asm files (also gMemory, etc).
Yeah, darn that quirky English. "help to eliminate" means it would
partially eliminate, while "help eliminate" means it would almost
completely eliminate.
Post by Sebastian Biallas
Post by Daniel Foesch
This would allow us to avoid increasing translated code size, but I
don't think that's the biggest problem we're facing.
The big problem is though that both cores can't share translated code,
which might be the right thing to do still. If we did everything with
relative accesses, then we'd lose yet another register for use.
Putting it on the stack and using %esp relative accesses seems like
the perfect work around for this.
Yes, but sharing the tc would be a sync problem between the cpu-threads.
It would involve much locking and (even worse) moving a lot of stuff
between the real cpu caches.
Ah... ick. That would stink. Because you'd have to lock on any
instruction cache flush, and any new entrance into the execution...

Yeah, having them share wouldn't be fun at all.

--
Daniel Foesch
Sebastian Biallas
2006-03-09 03:12:02 UTC
Post by Daniel Foesch
Post by Sebastian Biallas
For the PIC problem we'd have to eliminate all global variables which
are accessed in the asm files (also gMemory, etc).
Yeah, darn that quirky English. "help to eliminate" means it would
partially eliminate, while "help eliminate" means it would almost
completely eliminate.
Oh, what subtle differences :)
Post by Daniel Foesch
Post by Sebastian Biallas
Yes, but sharing the tc would be a sync problem between the cpu-threads.
It would involve much locking and (even worse) moving a lot of stuff
between the real cpu caches.
Ah... ick. that would stink. Because you'd have to lock on any
instruction cache flush, and any new entrance into the execution...
Maybe this would be OK for HT (Hyper-Threading) CPUs?

Sebastian
Daniel Foesch
2006-03-09 05:36:37 UTC
Post by Sebastian Biallas
Post by Daniel Foesch
Post by Sebastian Biallas
For the PIC problem we'd have to eliminate all global variables which
are accessed in the asm files (also gMemory, etc).
Yeah, darn that quirky English. "help to eliminate" means it would
partially eliminate, while "help eliminate" means it would almost
completely eliminate.
Oh, what subtle differences :)
Yeah, I love English. Worse is when I translate German to English,
since there are slight shades of meaning like this that I understand
but just can't express.
Post by Sebastian Biallas
Post by Daniel Foesch
Post by Sebastian Biallas
Yes, but sharing the tc would be a sync problem between the cpu-threads.
It would involve much locking and (even worse) moving a lot of stuff
between the real cpu caches.
Ah... ick. that would stink. Because you'd have to lock on any
instruction cache flush, and any new entrance into the execution...
Maybe this would be ok for HT-cpus?
HT cores function similarly to dual core CPUs. Of course, the idea
would be that one could possibly do SMP even on a single core CPU. Of
course this wouldn't make anything faster, and would in fact make
things slower. But the idea is that one could still test things.

In any case, things should be designed so that they work in
all three cases: single CPU, HT-enabled CPU, and dual-core/dual-CPU
system.
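
One way to cover all three cases is to run each emulated CPU in its own host thread and let the host scheduler place the threads on whatever cores exist. The C sketch below assumes pthreads; CpuState, cpu_thread and run_all_cpus are hypothetical names, and the loop body merely stands in for executing translated code.

```c
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define NUM_CPUS 2   /* assumption: two emulated CPUs */

/* Hypothetical per-CPU state; each thread only touches its own copy. */
typedef struct {
    int      id;
    uint32_t gpr[32];
    uint64_t steps;
} CpuState;

/* Stand-in for the emulation loop of one CPU. */
static void *cpu_thread(void *arg)
{
    CpuState *cpu = arg;
    for (int i = 0; i < 1000; i++) {
        cpu->gpr[3] += (uint32_t)(cpu->id + 1);
        cpu->steps++;
    }
    return NULL;
}

/* Starts one host thread per emulated CPU and waits for all of them.
 * Returns 0 on success, -1 on error. */
int run_all_cpus(CpuState cpus[], int count)
{
    pthread_t threads[NUM_CPUS];
    int started = 0;
    int result = 0;

    if (count < 0 || count > NUM_CPUS)
        return -1;

    for (int i = 0; i < count; i++) {
        memset(&cpus[i], 0, sizeof cpus[i]);
        cpus[i].id = i;
        if (pthread_create(&threads[i], NULL, cpu_thread, &cpus[i]) != 0) {
            result = -1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return result;
}
```

On a single-core host the threads are simply time-sliced, which is slower but still lets SMP guests be tested; on HT or dual-core hosts they can run in parallel without any change to the code.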

--
Daniel Foesch

Axel Auweter
2006-03-08 22:46:41 UTC
Hi everybody,
Post by Sebastian Biallas
Hello!
For some time now I've been thinking about making the CPU emulation reentrant.
The idea would be to put the gCPU structure on the stack while executing
the generated code (so all register accesses etc would be %esp relative
instead of absolute addresses) and to pass a pointer to the CPU
structure to all "normal" functions.
The question is: Is this worth the effort?
cons:
- %esp relative accesses are one byte longer.
- asm code might become slightly more complicated
pros:
- ability to implement SMP
Comments?
A long, long time ago, I did exactly that for the generic core we used
within SoftPear. The code should still be in the SoftPear CVS; you may
have a look at it. I used it to implement pthread support.
I haven't worked with the code since then, but feel free to ask if you
have any questions while reading through it.

Axel