Discussion:
[Pearpc-devel] Asking for Important Patches
Daniel Foesch
2005-10-12 05:06:18 UTC
What patches have people produced that we want to get into 0.4 but that
haven't made it in yet? Mostly, I'm looking for bug fixes. Also, any
pointer to the second IDE line patch would be appreciated.

I'll try to get things integrated, and we'll see about getting 0.4
out the door as soon as possible. (AltiVec is definitely not getting
fixed in time :()

--
Daniel Foesch
p***@coreytabaka.com
2005-10-13 13:35:08 UTC
This may be a little OT, but I've noticed that there is a lot of overkill in
the byte swapping: there are numerous places where a value is byte swapped
only to be swapped back in the function that the value gets passed to. This
is a considerable performance degradation, especially since it occurs most
often in the IO pipe.
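
To make the pattern concrete, here is a minimal sketch in C of the kind of
double swap I mean. The function names are hypothetical and not taken from
the PearPC source, and bswap32 is just a portable stand-in for the x86 BSWAP
instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value (portable stand-in for BSWAP). */
static uint32_t bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Hypothetical callee: receives a swapped value and swaps it back. */
static uint32_t io_callee(uint32_t swapped)
{
    return bswap32(swapped);            /* second swap undoes the first */
}

/* Hypothetical caller: swaps before handing the value over. */
static uint32_t io_caller(uint32_t value)
{
    return io_callee(bswap32(value));   /* first swap */
}

/* Debug check: the two swaps cancel, so the pair does no useful work. */
static void check_double_swap(uint32_t value)
{
    assert(io_caller(value) == value);
}
```

If the caller and callee agreed on one byte order, both calls to bswap32
could simply be dropped.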

Theoretically, byte swapping should only occur on reads/writes to memory
(except maybe in very specific circumstances). On big-endian machines, byte
swapping should rarely occur. Does anyone else agree with this?
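
A rough sketch of what "swap only at the memory boundary" could look like,
assuming a flat byte array for guest memory. guest_read32, guest_write32 and
host_is_little_endian are names made up for this example, not PearPC
functions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of a 32-bit value (portable stand-in for BSWAP). */
static uint32_t bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Nonzero when the host stores the least significant byte first (like x86). */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Read a 32-bit big-endian guest word; the only swap happens here. */
static uint32_t guest_read32(const uint8_t *mem, size_t offset)
{
    uint32_t raw;
    assert(mem != NULL);
    memcpy(&raw, mem + offset, sizeof raw);     /* plain host-order load */
    return host_is_little_endian() ? bswap32(raw) : raw;
}

/* Write a 32-bit guest word in big-endian order; again the only swap. */
static void guest_write32(uint8_t *mem, size_t offset, uint32_t value)
{
    uint32_t raw = host_is_little_endian() ? bswap32(value) : value;
    assert(mem != NULL);
    memcpy(mem + offset, &raw, sizeof raw);
}
```

Everything between a read and a write then works on host-order values, and
on a big-endian host neither function swaps at all.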

This kind of cleanup could make for a nice addition to the release.


Daniel Foesch
2005-10-13 18:25:54 UTC
Post by p***@coreytabaka.com
This may be a little OT, but I've noticed that there is a lot of overkill in
the byte swapping: there are numerous places where a value is byte swapped
only to be swapped back in the function that the value gets passed to. This
is a considerable performance degradation, especially since it occurs most
often in the IO pipe.
Theoretically, byte swapping should only occur on reads/writes to memory
(except maybe in very specific circumstances). On big-endian machines, byte
swapping should rarely occur. Does anyone else agree with this?
This kind of cleanup could make for a nice addition to the release.
It's a very good idea, and I agree in many ways. Of course, there are
issues here. For instance, when doing a memcpy with AltiVec, we byte
swap an entire 128-bit vector just to byte swap it back when writing
it to memory. Why do this? Because it's very difficult to track this
sort of thing and prevent it.

I made a preliminary patch that would not byte swap unless it had to.
Unfortunately, there's no in-register way to byte swap a vector on the
x86 (thanks, Intel/AMD). So if I just load the vector into a vector
register and fix its byte order later, I have to write the vector out
to memory, fix its byte order by moving it to a different piece of
memory, then load it back into the vector register.
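
Roughly, the round trip looks like the sketch below. This is plain C rather
than the actual patch: vec128 is a made-up stand-in for an x86 vector
register, and the two scratch buffers stand for the two pieces of memory
mentioned above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit vector register. */
typedef struct { uint8_t b[16]; } vec128;

/* Fix the byte order of a "register" after the fact, via memory. */
static void vec_bswap_via_memory(vec128 *reg)
{
    uint8_t spill[16];   /* step 1: the vector written out to memory    */
    uint8_t fixed[16];   /* step 2: reversed copy in a different buffer */
    size_t i;

    assert(reg != NULL);
    memcpy(spill, reg->b, sizeof spill);
    for (i = 0; i < 16; i++)
        fixed[i] = spill[15 - i];
    memcpy(reg->b, fixed, sizeof fixed);   /* step 3: load it back */
}
```

That is a store, sixteen byte moves and a load for every deferred fix-up,
compared with doing the reversal once as part of the original memory access.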

This all looks nice until you run into a problem such as loading two
vectors at a time. The second vector read might page fault, so I have
to write the first vector out to the register file beforehand. Now,
let's say that the second read does page fault: we'll pick up some
other code, load in the page, then go back to executing from where we
left off, at the second vector read. How do we know what byte order the
first vector was stored in? So now we have to fix the byte order on
every register write to our emulated register bank.
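
As a sketch of the bookkeeping this implies (not the real code, and all the
names here are invented for the example): each in-flight vector carries a
flag saying which order it is in, and every write-back to the emulated
register file has to check that flag and normalise.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tag for the byte order a vector value is currently in. */
typedef enum { ORDER_HOST, ORDER_GUEST } byte_order;

/* Hypothetical in-flight vector: 16 bytes plus the order they are in. */
typedef struct {
    uint8_t    b[16];
    byte_order order;
} tracked_vec;

/* Reverse 16 bytes in place. */
static void reverse16(uint8_t b[16])
{
    int i;
    for (i = 0; i < 8; i++) {
        uint8_t tmp = b[i];
        b[i] = b[15 - i];
        b[15 - i] = tmp;
    }
}

/* Write a vector back to the emulated register file (32 AltiVec registers).
   The file must always hold one known order, so every write pays for the
   check and, when needed, for the fix-up. */
static void regfile_write(tracked_vec *regfile, unsigned idx, tracked_vec v)
{
    assert(idx < 32);
    if (v.order != ORDER_HOST) {
        reverse16(v.b);
        v.order = ORDER_HOST;
    }
    regfile[idx] = v;
}
```

The check sits on every register write, whether or not a page fault ever
happens.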

It turns out that the overhead of tracking this stuff results in a
slow-down and a net loss of performance.

Byte swapping on the x86 is relatively low impact. Either we're just
executing a BSWAP (a very low-cost instruction), or we're
reading/writing 16 bytes in reverse order, which, due to the way
cache lines work, is no slower than reading/writing 16 bytes in forward
order.

Anyway, if you can point out specific places where the swapping is
absolutely useless, we can look into it after the 0.4 release. Right
now, we're trying to actually get a release out, and we need to hit
the things that matter for stability, not add features.
(The second IDE controller patch is hardly a "new feature" anymore, as
so many people have been "testing" it for us.)

--
Daniel Foesch
Sebastian Biallas
2005-10-14 15:40:15 UTC
Post by p***@coreytabaka.com
This may be a little OT, but I've noticed that there is a lot of
overkill in the byte swapping: there are numerous places where a value
is byte swapped only to be swapped back in the function that the value
gets passed to. This is a considerable performance degradation,
especially since it occurs most often in the IO pipe.
bswap is fast. IO is slow per se. PearPC has much worse bottlenecks...
Post by p***@coreytabaka.com
Theoretically, byte swapping should only occur on reads/writes to memory
(except maybe in very specific circumstances). On big-endian machines,
byte swapping should rarely occur. Does anyone else agree with this?
Yes and no. The PCI bus is little-endian. This makes things complicated.

Sebastian
Sebastian Biallas
2005-10-14 15:41:36 UTC
Post by Daniel Foesch
What patches have people produced that we want to get into 0.4 but that
haven't made it in yet? Mostly, I'm looking for bug fixes. Also, any
pointer to the second IDE line patch would be appreciated.
There is a network CRC patch which seems fine. But I haven't heard any
(positive or negative) reports about it.
Post by Daniel Foesch
I'll try to get things integrated, and we'll see about getting 0.4
out the door as soon as possible. (AltiVec is definitely not getting
fixed in time :()
Sebastian
