If you approach the problem by looking for the most frequently repeated values,
a histogram over 32-bit data spans up to 2^32 possible values (the 4 GB address range).
But keep in mind that not every 32-bit application actually uses the full 32-bit range; if it did, its files would be close to 4 GB
(files of 32-bit apps were never 4 GB in size), so in practice far fewer distinct values occur.
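To summarize: you can map the most repeated 32-bit values onto 16-bit entries in a comparison template (a lookup table).
Every value found in the table then takes 16 bits instead of 32, so the saving approaches 50% when most of the data hits the table; values that are not in the table still have to be stored in full.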
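
The mapping described above can be sketched roughly as follows. This is only an illustration under some assumptions that are not in the post: the input is already split into 32-bit integers, the code 0xFFFF is reserved as a hypothetical escape marker for values missing from the table, and the names build_table, encode and decode are made up for this example.

```python
import struct
from collections import Counter

# Hypothetical escape code: "the next 4 bytes are a raw 32-bit word".
ESCAPE = 0xFFFF


def build_table(words, max_entries=0xFFFF):
    """Return the most frequent 32-bit words, most frequent first.

    At most 0xFFFF entries, so codes 0..0xFFFE index the table and
    0xFFFF stays free for the escape marker.
    """
    return [w for w, _ in Counter(words).most_common(max_entries)]


def encode(words, table):
    """Replace each word by a 16-bit table index, or escape + raw word."""
    index = {w: i for i, w in enumerate(table)}
    out = bytearray()
    for w in words:
        code = index.get(w)
        if code is None:
            out += struct.pack("<HI", ESCAPE, w)  # 6 bytes for a miss
        else:
            out += struct.pack("<H", code)        # 2 bytes for a hit
    return bytes(out)


def decode(blob, table):
    """Inverse of encode: rebuild the list of 32-bit words."""
    words = []
    pos = 0
    while pos < len(blob):
        (code,) = struct.unpack_from("<H", blob, pos)
        pos += 2
        if code == ESCAPE:
            (w,) = struct.unpack_from("<I", blob, pos)
            pos += 4
            words.append(w)
        else:
            words.append(table[code])
    return words
```

Note that in this sketch a hit costs 2 bytes and a miss costs 6 bytes, and the table itself (4 bytes per entry) has to be stored next to the output, so the real saving depends on how often the data hits the table.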
Secondly, for large data close to 4 GB and above,
you don't need to build a ranked list of every value in the histogram!
That is, starting from the most repeated value,
you can rank just the first 1000.
That ranking list will hold far less data than what you normally read and write.
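
As a rough sketch of ranking only the top entries: the example below counts 32-bit words while reading a stream in chunks and then picks the k most frequent with heapq.nlargest instead of sorting the whole histogram. The function name top_words, the little-endian word layout and the chunk size are assumptions for illustration, not something stated in the post. The counter still holds every distinct value it has seen; only the ranking step is limited to k entries.

```python
import heapq
import struct
from collections import Counter


def top_words(stream, k=1000, chunk_size=1 << 20):
    """Return the k most frequent little-endian 32-bit words as (word, count).

    Assumes stream.read() returns full chunks except at the end of the
    data (true for regular files and io.BytesIO) and that chunk_size is
    a multiple of 4. A trailing partial word is ignored.
    """
    counts = Counter()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        usable = len(chunk) - len(chunk) % 4  # drop an incomplete last word
        counts.update(w for (w,) in struct.iter_unpack("<I", chunk[:usable]))
    # Partial ranking: only the k largest counts are ordered.
    return heapq.nlargest(k, counts.items(), key=lambda item: item[1])
```

With a file opened in binary mode this could be called as top_words(open(path, "rb")), where path is whatever file you want to analyze, and the returned words can then serve as the lookup table.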