Shrink the i4x4_mode cost_table array
Henrik Gramner
2017-12-25 19:40:02 UTC
x264 | branch: master | Henrik Gramner <***@gramner.com> | Sat Oct 14 14:11:26 2017 +0200| [06c8f6bab0fc8fa9b2df9a1af5d10c87c515edb4] | committer: Anton Mitrofanov

Shrink the i4x4_mode cost_table array

Only 17 elements are actually used. It was originally padded to 64 bytes to
avoid cache line splits in the x86 assembly, but those haven't really been
an issue on x86 CPUs made in the past decade or so.

Benchmarking shows no performance impact from dropping the padding, so
might as well remove it and save some cache.
http://git.videolan.org/gitweb.cgi/x264.git/?a=commit;h=06c8f6bab0fc8fa9b2df9a1af5d10c87c515edb4
---

common/common.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/common/common.h b/common/common.h
index fe2b1c7f..27a56fbd 100644
--- a/common/common.h
+++ b/common/common.h
@@ -349,7 +349,7 @@ struct x264_t
     struct
     {
         uint16_t ref[QP_MAX+1][3][33];
-        ALIGNED_64( uint16_t i4x4_mode[QP_MAX+1][32] );
+        uint16_t i4x4_mode[QP_MAX+1][17];
     } *cost_table;

     const uint8_t *chroma_qp_table; /* includes both the nonlinear luma->chroma mapping and chroma_qp_offset */
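
To make the trade-off in the commit message concrete, the following is a minimal sketch (not part of the patch) that compares the two layouts. It is written under stated assumptions: the QP_MAX value is a placeholder chosen for illustration (the real definition lives elsewhere in x264), CACHE_LINE is taken to be 64 bytes, the table is assumed to start on a cache-line boundary, and the names padded_row_t, packed_row_t, table_bytes and rows_crossing_line are hypothetical helpers that do not exist in x264.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QP_MAX     81  /* assumption for illustration only, not taken from the patch */
#define CACHE_LINE 64  /* assumption: cache line size in bytes on typical x86 CPUs */

typedef uint16_t padded_row_t[32]; /* old layout: one row per QP, padded to 64 bytes */
typedef uint16_t packed_row_t[17]; /* new layout: only the 17 elements actually used */

/* Compile-time check of the per-row sizes discussed in the commit message. */
static_assert(sizeof(padded_row_t) == 64, "old row fills one 64-byte line");
static_assert(sizeof(packed_row_t) == 34, "new row is 17 * 2 bytes");

/* Total size in bytes of a table holding QP_MAX+1 rows of row_bytes each. */
static size_t table_bytes(size_t row_bytes)
{
    return (size_t)(QP_MAX + 1) * row_bytes;
}

/* Number of rows whose first and last byte fall in different cache lines,
 * assuming the table itself begins on a cache-line boundary. */
static size_t rows_crossing_line(size_t row_bytes)
{
    size_t crossing = 0;
    for (size_t qp = 0; qp <= QP_MAX; qp++) {
        size_t first = qp * row_bytes;
        size_t last  = first + row_bytes - 1;
        if (first / CACHE_LINE != last / CACHE_LINE)
            crossing++;
    }
    return crossing;
}
```

Calling table_bytes(sizeof(padded_row_t)) and table_bytes(sizeof(packed_row_t)) shows the memory saved per table, while rows_crossing_line() shows the cost: with 64-byte rows no row straddles a line, whereas with 34-byte rows some do. The commit message reports that benchmarking found no measurable impact from those splits.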