PCollection<KafkaRecord<String, byte[]>> kafkaRecordPCollection =
pipeline.apply(
KafkaIO.<String, byte[]>read()
.withBootstrapServers("bootstrap-server")
.withTopic("topic")
.withConsumerConfigUpdates(props)
.withKeyDeserializer(StringDeserializer.class)
.withValueDeserializer(ByteArrayDeserializer.class));
When I create the KafkaIO read in the above format, I am able to get data in the form shown below.
KV{Struct{OC_NO=ABCDE3,PO_NO=11XA435A,S_ID=024,__dbz__physicalTableIdentifier=cdc.po.PO}, [3, 0, -93, -44, 46, 43, 44, -90, 68, 71, -96, 24, -105, 51, 107, -22, -27, 86, 0, 2, 12, 65, 66, 67, 68, 69, 51, 0, 2, 97, 22, 49, 46, 56, 46, 49, 46, 70, 105, 110, 97, 108, 20, 112, 111, 115, 116, 103, 114, 101, 115, 113, 108, 20, 99, 100, 99, 95, 115, 101, 114, 118, 101, 114, -6, -96, -63, -16, -28, 96, 0, 10, 102, 97, 108, 115, 101, 18, 75, 65, 70, 75, 65, 45, 80, 79, 67, 2, 66, 91, 34, 49, 50, 53, 53, 51, 52, 49, 56, 48, 56, 52, 53, 54, 34, 44, 34, 49, 50, 53, 53, 51, 56, 57, 55, 54, 52, 52, 56, 56, 34, 93, 26, 112, 117, 114, 99, 104, 97, 115, 101, 111, 114, 100, 101, 114, 28, 80, 85, 82, 67, 72, 65, 83, 69, 95, 79, 82, 68, 69, 82, 2, -66, -95, -57, -90, 15, 2, -112, -18, -4, -80, -119, 73, 0, 2, 117, 2, -116, -89, -63, -16, -28, 96, 0]}
But I need to deserialize the byte[] into a POJO class generated from an Avro schema, and when I try to use it in withValueDeserializer() I get errors. Is there a specific way to do this?
I have also created a custom MyClassKafkaAvroDeserializer.
@Slf4j
public class MyClassKafkaAvroDeserializer extends
        AbstractKafkaAvroDeserializer implements Deserializer<Envelope> {
    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        configure(new KafkaAvroDeserializerConfig(configs));
    }
    @Override
    public Envelope deserialize(String s, byte[] bytes) {
        return (Envelope) this.deserialize(bytes);
    }
    @Override
    public void close() {
    }
}
PCollection<KafkaRecord<String, Envelope>> kafkaRecordPCollection =
pipeline.apply(
KafkaIO.<String, Envelope>read()
.withBootstrapServers("bootstrap-server")
.withTopic("topic")
.withConsumerConfigUpdates(props)
.withKeyDeserializer(StringDeserializer.class)
.withValueDeserializer(MyClassKafkaAvroDeserializer.class)
);
This gives me an error like the one below:
Exception in thread "main" org.apache.kafka.common.errors.SerializationException: Unknown magic byte!
I think you also need to provide a coder in this case, like:
KafkaIO.<String, Envelope>read()
...
.withValueDeserializerAndCoder(
MyClassKafkaAvroDeserializer.class, AvroCoder.of(Envelope.class))
If you're having trouble with deserializers, you can always follow your byte-producing Read with a Map that does the deserialization.
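The "Unknown magic byte!" error is what Confluent's Avro deserializer reports when a value does not start with its expected header: a single 0 byte followed by a 4-byte big-endian schema ID. A minimal sketch for inspecting that header on the raw byte[] before choosing a deserializer (the class and method names are hypothetical and not part of Kafka or Beam):

```java
import java.nio.ByteBuffer;

class WireFormatProbe {
    // Marker byte that opens every record in Confluent's wire format.
    static final byte MAGIC_BYTE = 0x0;

    /** True if the payload starts with the magic byte and has room for a schema ID. */
    static boolean looksLikeConfluentAvro(byte[] payload) {
        return payload != null && payload.length >= 5 && payload[0] == MAGIC_BYTE;
    }

    /** Reads the 4-byte big-endian schema ID that follows the magic byte. */
    static int schemaId(byte[] payload) {
        if (!looksLikeConfluentAvro(payload)) {
            throw new IllegalArgumentException("payload is not in Confluent wire format");
        }
        return ByteBuffer.wrap(payload, 1, 4).getInt();
    }

    public static void main(String[] args) {
        // First bytes of the record value printed in the question.
        byte[] fromQuestion = {3, 0, -93, -44, 46, 43};
        System.out.println(looksLikeConfluentAvro(fromQuestion));
    }
}
```

In the value printed in the question the first byte is 3 rather than 0, so the records may not have been written by a Confluent serializer at all. If that is the case, decoding the byte[] in a downstream step with the producer's actual format is probably the more practical route.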
I have a base64-encoded string, for example pTWlzYwVk74RHlbhrHtYxjlmTpa1KY3LVj3X8o3PHUURfY07Qnk5wFPHP7SHDvoJSaM24DybXt20+ou3evsEmLNQfzsF2A1lfSsG2dIKf5Gmhb1qXVN7C6z1mJIRTWt99ei9A1Ozyc7et2DpKpX0SGIaKPcmf2TomYvt1Q+YWTaabUoue9BgI2VHb3L2f/UdRo5ja6beSeA=. When I decode this string in Java I get the correct result, but in Python I get a different result. Here is the code I used.
Python code:
import base64
decoded = base64.b64decode("pTWlzYwVk74RHlbhrHtYxjlmTpa1KY3LVj3X8o3PHUURfY07Qnk5wFPHP7SHDvoJSaM24DybXt20+ou3evsEmLNQfzsF2A1lfSsG2dIKf5Gmhb1qXVN7C6z1mJIRTWt99ei9A1Ozyc7et2DpKpX0SGIaKPcmf2TomYvt1Q+YWTaabUoue9BgI2VHb3L2f/UdRo5ja6beSeA=")
mylist = list(decoded)
print("mylist", mylist)
output becomes
mylist [165, 53, 165, 205, 140, 21, 147, 190, 17, 30, 86, 225, 172, 123, 88, 198, 57, 102, 78, 150, 181, 41, 141, 203, 86, 61, 215, 242, 141, 207, 29, 69, 17, 125, 141, 59, 66, 121, 57, 192, 83, 199, 63, 180, 135, 14, 250, 9, 73, 163, 54, 224, 60, 155, 94, 221, 180, 250, 139, 183, 122, 251, 4, 152, 179, 80, 127, 59, 5, 216, 13, 101, 125, 43, 6, 217, 210, 10, 127, 145, 166, 133, 189, 106, 93, 83, 123, 11, 172, 245, 152, 146, 17, 77, 107, 125, 245, 232, 189, 3, 83, 179, 201, 206, 222, 183, 96, 233, 42, 149, 244, 72, 98, 26, 40, 247, 38, 127, 100, 232, 153, 139, 237, 213, 15, 152, 89, 54, 154, 109, 74, 46, 123, 208, 96, 35, 101, 71, 111, 114, 246, 127, 245, 29, 70, 142, 99, 107, 166, 222, 73, 224]
and on the Java side I used this:
String str = "pTWlzYwVk74RHlbhrHtYxjlmTpa1KY3LVj3X8o3PHUURfY07Qnk5wFPHP7SHDvoJSaM24DybXt20+ou3evsEmLNQfzsF2A1lfSsG2dIKf5Gmhb1qXVN7C6z1mJIRTWt99ei9A1Ozyc7et2DpKpX0SGIaKPcmf2TomYvt1Q+YWTaabUoue9BgI2VHb3L2f/UdRo5ja6beSeA=";
byte[] decoded = Base64.getDecoder().decode(str.getBytes(StandardCharsets.UTF_8));
System.out.println("myarray: "+Arrays.toString(decoded));
and output becomes
[-91, 53, -91, -51, -116, 21, -109, -66, 17, 30, 86, -31, -84, 123, 88, -58, 57, 102, 78, -106, -75, 41, -115, -53, 86, 61, -41, -14, -115, -49, 29, 69, 17, 125, -115, 59, 66, 121, 57, -64, 83, -57, 63, -76, -121, 14, -6, 9, 73, -93, 54, -32, 60, -101, 94, -35, -76, -6, -117, -73, 122, -5, 4, -104, -77, 80, 127, 59, 5, -40, 13, 101, 125, 43, 6, -39, -46, 10, 127, -111, -90, -123, -67, 106, 93, 83, 123, 11, -84, -11, -104, -110, 17, 77, 107, 125, -11, -24, -67, 3, 83, -77, -55, -50, -34, -73, 96, -23, 42, -107, -12, 72, 98, 26, 40, -9, 38, 127, 100, -24, -103, -117, -19, -43, 15, -104, 89, 54, -102, 109, 74, 46, 123, -48, 96, 35, 101, 71, 111, 114, -10, 127, -11, 29, 70, -114, 99, 107, -90, -34, 73, -32]
In comparison, the positive values in the two outputs are the same, but for the negative values the Python version outputs a different result. Why is that? What am I doing wrong? I saw a similar issue here on Stack Overflow, but unfortunately it does not resolve my issue.
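Those results are the same.
In Java, byte is a signed type (-128 to 127), so values greater than 127 are printed as negatives: for example, 165 is shown as 165 - 256 = -91.
In Python, the elements of a bytes object are unsigned integers in the range 0 to 255.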
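To compare the two outputs directly, the signed Java bytes can be mapped to the unsigned values that Python prints. A minimal sketch (the class name UnsignedBytes is hypothetical; the short input is only the first four bytes of the string from the question):

```java
import java.util.Arrays;
import java.util.Base64;

class UnsignedBytes {
    /** Maps each signed Java byte (-128..127) to its unsigned value (0..255). */
    static int[] toUnsigned(byte[] bytes) {
        int[] out = new int[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            out[i] = bytes[i] & 0xFF; // equivalent to Byte.toUnsignedInt(bytes[i])
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] decoded = Base64.getDecoder().decode("pTWlzQ==");
        System.out.println(Arrays.toString(decoded));             // signed view, as Java prints it
        System.out.println(Arrays.toString(toUnsigned(decoded))); // unsigned view, as Python prints it
    }
}
```

Masking with 0xFF only changes how the value is interpreted. The underlying bits are identical in both languages.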
I send a request with the same content (including German umlauts) and the same hash to my Spring web app as I use in a JUnit test, but the HMAC computation gives different results in the two cases. In the JUnit test, the computed HMAC equals the supplied hash. The supplied hash is computed by another server and sent to my Spring web app.
Sending request to tomcat and starting hmac validator
allValues [51, 54, 49, 50, 50, 50, 57, 51, 57, 52, 49, 51, 55, 52, 46, 57, 53, 50, 51, 48, 50, 77, 97, 110, 117, 101, 108, 32, 71, 110, 101, 114, 108, 105, 99, 104, 52, 50, 54, 51, 53, 52, 120, 120, 120, 120, 120, 120, 57, 52, 54, 56, 86, 72, 97, 109, 98, 117, 114, 103, 99, 99, 68, 69, 69, 85, 82, 49, 52, 50, 49, 51, 32, 77, 111, 110, 97, 116, 101, 32, 84, 101, 105, 108, 110, 97, 104, 109, 101, 118, 101, 114, 103, -61, -68, 116, 117, 110, 103, 32, 102, -61, -68, 114, 32, 68, 71, 83, 32, 40, 100, 101, 114, 122, 101, 105, 116, 32, 76, 101, 118, 101, 108, 32, 65, 49, 41, 97, 114, 103, 111, 110, 105, 115, 116, 64, 103, 109, 97, 105, 108, 46, 99, 111, 109, 49, 53, 54, 48, 48, 57, 55, 53, 55, 51, 77, 97, 110, 117, 101, 108, 57, 51, 54, 50, 53, 49, 51, 52, 101, 57, 54, 102, 53, 56, 48, 54, 100, 99, 53, 98, 99, 57, 101, 53, 98, 98, 57, 53, 57, 55, 53, 53, 51, 56, 56, 102, 71, 110, 101, 114, 108, 105, 99, 104, 116, 101, 115, 116, 49, 55, 46, 52, 97, 99, 99, 101, 115, 115, 45, 49, 52, 50, 49, 45, 48, 45, 48, 45, 51, 51, 56, 55, 57, 55, 52, 46, 57, 53, 55, 52, 46, 57, 53, 57, 51, 54, 50, 55, 52, 46, 57, 53, 49, 52, 50, 49, 45, 56, 98, 49, 99, 55, 48, 98, 101, 49, 56, 97, 57, 48, 83, 116, 114, 105, 110, 100, 98, 101, 114, 103, 119, 101, 103, 32, 49, 50, 51, 32, 77, 111, 110, 97, 116, 101, 32, 65, 98, 111, 32, 45, 32, 109, 97, 110, 105, 109, 117, 110, 100, 111, 99, 111, 109, 112, 108, 101, 116, 101, 100, 97, 112, 112, 111, 105, 110, 116, 101, 100, 51, 50, 49, 54, 55, 51, 52, 56, 55, 49, 53, 53, 50, 49, 53, 50, 51, 55, 51, 49, 53, 55, 57, 52, 57, 57, 48, 48, 49, 57, 46, 48, 48, 50, 50, 53, 56, 55]
hmac hash 5dffe6e4b92ff855411a73eea34d74909e25678b1b768dcb8081dd764200f7fefa710beacf26b5d141c362fa5a3ea9ed
hash 65e2f454df692db7ccb34c49e71349dc351f62ca47eaca2a08ddb6c5baa634d9baa691358c69361dc01d7f73681820ed
Starting with jUnit hmac validator
allValues [51, 54, 49, 50, 50, 50, 57, 51, 57, 52, 49, 51, 55, 52, 46, 57, 53, 50, 51, 48, 50, 77, 97, 110, 117, 101, 108, 32, 71, 110, 101, 114, 108, 105, 99, 104, 52, 50, 54, 51, 53, 52, 120, 120, 120, 120, 120, 120, 57, 52, 54, 56, 86, 72, 97, 109, 98, 117, 114, 103, 99, 99, 68, 69, 69, 85, 82, 49, 52, 50, 49, 51, 32, 77, 111, 110, 97, 116, 101, 32, 84, 101, 105, 108, 110, 97, 104, 109, 101, 118, 101, 114, 103, -61, -68, 116, 117, 110, 103, 32, 102, -61, -68, 114, 32, 68, 71, 83, 32, 40, 100, 101, 114, 122, 101, 105, 116, 32, 76, 101, 118, 101, 108, 32, 65, 49, 41, 97, 114, 103, 111, 110, 105, 115, 116, 64, 103, 109, 97, 105, 108, 46, 99, 111, 109, 49, 53, 54, 48, 48, 57, 55, 53, 55, 51, 77, 97, 110, 117, 101, 108, 57, 51, 54, 50, 53, 49, 51, 52, 101, 57, 54, 102, 53, 56, 48, 54, 100, 99, 53, 98, 99, 57, 101, 53, 98, 98, 57, 53, 57, 55, 53, 53, 51, 56, 56, 102, 71, 110, 101, 114, 108, 105, 99, 104, 116, 101, 115, 116, 49, 55, 46, 52, 97, 99, 99, 101, 115, 115, 45, 49, 52, 50, 49, 45, 48, 45, 48, 45, 51, 51, 56, 55, 57, 55, 52, 46, 57, 53, 55, 52, 46, 57, 53, 57, 51, 54, 50, 55, 52, 46, 57, 53, 49, 52, 50, 49, 45, 56, 98, 49, 99, 55, 48, 98, 101, 49, 56, 97, 57, 48, 83, 116, 114, 105, 110, 100, 98, 101, 114, 103, 119, 101, 103, 32, 49, 50, 51, 32, 77, 111, 110, 97, 116, 101, 32, 65, 98, 111, 32, 45, 32, 109, 97, 110, 105, 109, 117, 110, 100, 111, 99, 111, 109, 112, 108, 101, 116, 101, 100, 97, 112, 112, 111, 105, 110, 116, 101, 100, 51, 50, 49, 54, 55, 51, 52, 56, 55, 49, 53, 53, 50, 49, 53, 50, 51, 55, 51, 49, 53, 55, 57, 52, 57, 57, 48, 48, 49, 57, 46, 48, 48, 50, 50, 53, 56, 55]
hmac hash 65e2f454df692db7ccb34c49e71349dc351f62ca47eaca2a08ddb6c5baa634d9baa691358c69361dc01d7f73681820ed
hash 65e2f454df692db7ccb34c49e71349dc351f62ca47eaca2a08ddb6c5baa634d9baa691358c69361dc01d7f73681820ed
Source code:
@Component
public class HMACValidator {
private final SevDeskProperties sevDeskProperties;
public HMACValidator(SevDeskProperties sevDeskProperties) {
this.sevDeskProperties = sevDeskProperties;
}
public void validateHMAC(TransactionPayOne transactionPayOne, String hash) {
byte[] key = transactionPayOne.getPortalid()
.equals(sevDeskProperties.getAboPortalId()) ?
sevDeskProperties.getAboKey().getBytes() :
sevDeskProperties.getSingleKey().getBytes();
String allValues = getAllValues(transactionPayOne);
logger.debug("allValues " + allValues);
String hmacSha384Encode;
try {
logger.debug("allValues" + Arrays.toString(allValues.getBytes("UTF-8")));
hmacSha384Encode = HMAC_SHA384_encode(key, allValues);
} catch (Exception e) {
throw new IllegalArgumentException("Could not encode the values",e);
}
logger.debug("hmac hash " + hmacSha384Encode);
logger.debug("hash " + hash);
if (!hmacSha384Encode.equals(hash)) {
throw new IllegalArgumentException("message is not valid ");
}
}
public static String HMAC_SHA384_encode(byte[] key, String message) throws Exception {
SecretKeySpec keySpec = new SecretKeySpec(
key, HmacAlgorithms.HMAC_SHA_384.getName());
Mac mac = Mac.getInstance(HmacAlgorithms.HMAC_SHA_384.getName());
mac.init(keySpec);
byte[] rawHmac = mac.doFinal(message.getBytes("UTF-8"));
return Hex.encodeHexString(rawHmac);
}
// Getting all values without key in json of transaction data
private String getAllValues(TransactionPayOne transactionPayOne) {
ObjectMapper objectMapper = new ObjectMapper();
Map<String, Object> mapJson;
try {
mapJson = objectMapper.readValue(transactionPayOne.getJson().getBytes("UTF-8"), new TypeReference<Map<String, Object>>() {
});
} catch (IOException e) {
e.printStackTrace();
throw new IllegalArgumentException("Could not parse json");
}
return mapJson.values().stream().map(value -> {
if (value instanceof Map) {
value = ((Map) value).values().stream().collect(Collectors.joining(""));
}
return (String) value;
}).collect(Collectors.joining(""));
}
}
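Both runs log identical message bytes, and HMAC is deterministic, so a different result implies that a different key (or algorithm) was used. Two things may be worth checking, as assumptions rather than a diagnosis: whether the portal ID comparison selects the same key in both environments, and whether getBytes() without an argument, which uses the platform default charset, yields the same key bytes under Tomcat and under JUnit. A minimal sketch using only the JDK (the class name HmacCheck is hypothetical) that makes the charset explicit for both key and message:

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

class HmacCheck {
    private static final String ALGORITHM = "HmacSHA384";

    /** Computes HMAC-SHA384 over UTF-8 bytes of key and message, returned as lowercase hex. */
    static String hmacSha384Hex(String key, String message) {
        try {
            Mac mac = Mac.getInstance(ALGORITHM);
            mac.init(new SecretKeySpec(key.getBytes(StandardCharsets.UTF_8), ALGORITHM));
            byte[] raw = mac.doFinal(message.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(raw.length * 2);
            for (byte b : raw) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC computation failed", e);
        }
    }
}
```

Logging Arrays.toString(key) next to the existing allValues output in both environments would show directly whether the key bytes match.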
I wonder, how to efficiently compute the hashCode for a BitSet-like implementation of Set<Integer>.
The BitSet#hashCode is obviously fast to compute, rather stupid(*) and incompatible with Set#hashCode().
A fast compatible implementation could go like
int hashCode() {
int result = 0;
for (int i=0; i<bits.length; ++i) {
long word = bits[i];
result += 64 * i * Long.bitCount(word) + weightedBitCount(word);
}
return result;
}
if there was an efficient implementation of
int weightedBitCount(long word) { // naive implementation
int result = 0;
for (int i=0; i<64; ++i) {
if ((word & (1L << i)) != 0) {
result += i;
}
}
return result;
}
In case most bits are unset, the naive implementation could be improved by testing word==0 or using Long.highestOneBit or alike, but these tricks don't help much and are detrimental in other cases.
Is there a clever trick to significantly speed it up in general?
I guess, some batch computation over multiple words could be more efficient. Computing Set#size at the same time would be a nice bonus.
A note concerning premature optimization: Yes, I know. I'm mostly curious (and it can be useful for things like Project Euler).
(*) There are many bits which get completely ignored (they get shifted out in the multiplication).
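For context on what "compatible" means here: the Set contract defines hashCode() as the sum of the element hash codes, and Integer.hashCode() is the value itself, so the target is simply the sum of all set bit indices. A small sketch that checks this against HashSet (the class BitSetHash and its helpers are hypothetical; the inner loop walks the set bits one by one and is not meant as the fast version being asked for):

```java
import java.util.HashSet;
import java.util.Set;

class BitSetHash {
    /** Sum of the indices of all set bits, where bit j of bits[i] stands for element 64*i + j. */
    static int hashCode(long[] bits) {
        int result = 0;
        for (int i = 0; i < bits.length; i++) {
            long word = bits[i];
            while (word != 0) {
                result += 64 * i + Long.numberOfTrailingZeros(word);
                word &= word - 1; // clear the lowest set bit
            }
        }
        return result;
    }

    /** Expands the bit words into an ordinary Set<Integer> for comparison. */
    static Set<Integer> toSet(long[] bits) {
        Set<Integer> set = new HashSet<>();
        for (int i = 0; i < bits.length; i++) {
            for (int j = 0; j < 64; j++) {
                if ((bits[i] & (1L << j)) != 0) {
                    set.add(64 * i + j);
                }
            }
        }
        return set;
    }
}
```

For any long[] the two values BitSetHash.hashCode(bits) and BitSetHash.toSet(bits).hashCode() should agree, including when the int sum overflows, since both use wrapping int addition.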
Notwithstanding hardware support (e.g., the x86 popcnt instruction), counting the bits in O(1) time is a rather well-known algorithm, generally communicated as SWAR bitcount.
However, your algorithm has a custom kernel that adds a different value based on which bit is set:
result += loop_counter_value;
Lacking a pithy algorithm for bit counting with custom kernels, a tried and true methodology is to utilize precalculated results. In this context, a lookup table. Clearly, a lookup table of all combinations of 64 bits (2^64 combinations!) is unwieldy, but you can split the difference by precalculating each byte of the n-byte variable. For 8 bytes, this is 256*8 = 2048 entries (8 KiB when stored as int, as below). Consider:
int weightedBitCount(long word) {
int result = 0;
int[][] lookups = {
{0, 0, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 6, 6, 4, 4, 5, 5, 6, 6, 7, 7, 7, 7, 8, 8, 9, 9, 10, 10, 5, 5, 6, 6, 7, 7, 8, 8, 8, 8, 9, 9, 10, 10, 11, 11, 9, 9, 10, 10, 11, 11, 12, 12, 12, 12, 13, 13, 14, 14, 15, 15, 6, 6, 7, 7, 8, 8, 9, 9, 9, 9, 10, 10, 11, 11, 12, 12, 10, 10, 11, 11, 12, 12, 13, 13, 13, 13, 14, 14, 15, 15, 16, 16, 11, 11, 12, 12, 13, 13, 14, 14, 14, 14, 15, 15, 16, 16, 17, 17, 15, 15, 16, 16, 17, 17, 18, 18, 18, 18, 19, 19, 20, 20, 21, 21, 7, 7, 8, 8, 9, 9, 10, 10, 10, 10, 11, 11, 12, 12, 13, 13, 11, 11, 12, 12, 13, 13, 14, 14, 14, 14, 15, 15, 16, 16, 17, 17, 12, 12, 13, 13, 14, 14, 15, 15, 15, 15, 16, 16, 17, 17, 18, 18, 16, 16, 17, 17, 18, 18, 19, 19, 19, 19, 20, 20, 21, 21, 22, 22, 13, 13, 14, 14, 15, 15, 16, 16, 16, 16, 17, 17, 18, 18, 19, 19, 17, 17, 18, 18, 19, 19, 20, 20, 20, 20, 21, 21, 22, 22, 23, 23, 18, 18, 19, 19, 20, 20, 21, 21, 21, 21, 22, 22, 23, 23, 24, 24, 22, 22, 23, 23, 24, 24, 25, 25, 25, 25, 26, 26, 27, 27, 28, 28},
{0, 8, 9, 17, 10, 18, 19, 27, 11, 19, 20, 28, 21, 29, 30, 38, 12, 20, 21, 29, 22, 30, 31, 39, 23, 31, 32, 40, 33, 41, 42, 50, 13, 21, 22, 30, 23, 31, 32, 40, 24, 32, 33, 41, 34, 42, 43, 51, 25, 33, 34, 42, 35, 43, 44, 52, 36, 44, 45, 53, 46, 54, 55, 63, 14, 22, 23, 31, 24, 32, 33, 41, 25, 33, 34, 42, 35, 43, 44, 52, 26, 34, 35, 43, 36, 44, 45, 53, 37, 45, 46, 54, 47, 55, 56, 64, 27, 35, 36, 44, 37, 45, 46, 54, 38, 46, 47, 55, 48, 56, 57, 65, 39, 47, 48, 56, 49, 57, 58, 66, 50, 58, 59, 67, 60, 68, 69, 77, 15, 23, 24, 32, 25, 33, 34, 42, 26, 34, 35, 43, 36, 44, 45, 53, 27, 35, 36, 44, 37, 45, 46, 54, 38, 46, 47, 55, 48, 56, 57, 65, 28, 36, 37, 45, 38, 46, 47, 55, 39, 47, 48, 56, 49, 57, 58, 66, 40, 48, 49, 57, 50, 58, 59, 67, 51, 59, 60, 68, 61, 69, 70, 78, 29, 37, 38, 46, 39, 47, 48, 56, 40, 48, 49, 57, 50, 58, 59, 67, 41, 49, 50, 58, 51, 59, 60, 68, 52, 60, 61, 69, 62, 70, 71, 79, 42, 50, 51, 59, 52, 60, 61, 69, 53, 61, 62, 70, 63, 71, 72, 80, 54, 62, 63, 71, 64, 72, 73, 81, 65, 73, 74, 82, 75, 83, 84, 92},
{0, 16, 17, 33, 18, 34, 35, 51, 19, 35, 36, 52, 37, 53, 54, 70, 20, 36, 37, 53, 38, 54, 55, 71, 39, 55, 56, 72, 57, 73, 74, 90, 21, 37, 38, 54, 39, 55, 56, 72, 40, 56, 57, 73, 58, 74, 75, 91, 41, 57, 58, 74, 59, 75, 76, 92, 60, 76, 77, 93, 78, 94, 95, 111, 22, 38, 39, 55, 40, 56, 57, 73, 41, 57, 58, 74, 59, 75, 76, 92, 42, 58, 59, 75, 60, 76, 77, 93, 61, 77, 78, 94, 79, 95, 96, 112, 43, 59, 60, 76, 61, 77, 78, 94, 62, 78, 79, 95, 80, 96, 97, 113, 63, 79, 80, 96, 81, 97, 98, 114, 82, 98, 99, 115, 100, 116, 117, 133, 23, 39, 40, 56, 41, 57, 58, 74, 42, 58, 59, 75, 60, 76, 77, 93, 43, 59, 60, 76, 61, 77, 78, 94, 62, 78, 79, 95, 80, 96, 97, 113, 44, 60, 61, 77, 62, 78, 79, 95, 63, 79, 80, 96, 81, 97, 98, 114, 64, 80, 81, 97, 82, 98, 99, 115, 83, 99, 100, 116, 101, 117, 118, 134, 45, 61, 62, 78, 63, 79, 80, 96, 64, 80, 81, 97, 82, 98, 99, 115, 65, 81, 82, 98, 83, 99, 100, 116, 84, 100, 101, 117, 102, 118, 119, 135, 66, 82, 83, 99, 84, 100, 101, 117, 85, 101, 102, 118, 103, 119, 120, 136, 86, 102, 103, 119, 104, 120, 121, 137, 105, 121, 122, 138, 123, 139, 140, 156},
{0, 24, 25, 49, 26, 50, 51, 75, 27, 51, 52, 76, 53, 77, 78, 102, 28, 52, 53, 77, 54, 78, 79, 103, 55, 79, 80, 104, 81, 105, 106, 130, 29, 53, 54, 78, 55, 79, 80, 104, 56, 80, 81, 105, 82, 106, 107, 131, 57, 81, 82, 106, 83, 107, 108, 132, 84, 108, 109, 133, 110, 134, 135, 159, 30, 54, 55, 79, 56, 80, 81, 105, 57, 81, 82, 106, 83, 107, 108, 132, 58, 82, 83, 107, 84, 108, 109, 133, 85, 109, 110, 134, 111, 135, 136, 160, 59, 83, 84, 108, 85, 109, 110, 134, 86, 110, 111, 135, 112, 136, 137, 161, 87, 111, 112, 136, 113, 137, 138, 162, 114, 138, 139, 163, 140, 164, 165, 189, 31, 55, 56, 80, 57, 81, 82, 106, 58, 82, 83, 107, 84, 108, 109, 133, 59, 83, 84, 108, 85, 109, 110, 134, 86, 110, 111, 135, 112, 136, 137, 161, 60, 84, 85, 109, 86, 110, 111, 135, 87, 111, 112, 136, 113, 137, 138, 162, 88, 112, 113, 137, 114, 138, 139, 163, 115, 139, 140, 164, 141, 165, 166, 190, 61, 85, 86, 110, 87, 111, 112, 136, 88, 112, 113, 137, 114, 138, 139, 163, 89, 113, 114, 138, 115, 139, 140, 164, 116, 140, 141, 165, 142, 166, 167, 191, 90, 114, 115, 139, 116, 140, 141, 165, 117, 141, 142, 166, 143, 167, 168, 192, 118, 142, 143, 167, 144, 168, 169, 193, 145, 169, 170, 194, 171, 195, 196, 220},
{0, 32, 33, 65, 34, 66, 67, 99, 35, 67, 68, 100, 69, 101, 102, 134, 36, 68, 69, 101, 70, 102, 103, 135, 71, 103, 104, 136, 105, 137, 138, 170, 37, 69, 70, 102, 71, 103, 104, 136, 72, 104, 105, 137, 106, 138, 139, 171, 73, 105, 106, 138, 107, 139, 140, 172, 108, 140, 141, 173, 142, 174, 175, 207, 38, 70, 71, 103, 72, 104, 105, 137, 73, 105, 106, 138, 107, 139, 140, 172, 74, 106, 107, 139, 108, 140, 141, 173, 109, 141, 142, 174, 143, 175, 176, 208, 75, 107, 108, 140, 109, 141, 142, 174, 110, 142, 143, 175, 144, 176, 177, 209, 111, 143, 144, 176, 145, 177, 178, 210, 146, 178, 179, 211, 180, 212, 213, 245, 39, 71, 72, 104, 73, 105, 106, 138, 74, 106, 107, 139, 108, 140, 141, 173, 75, 107, 108, 140, 109, 141, 142, 174, 110, 142, 143, 175, 144, 176, 177, 209, 76, 108, 109, 141, 110, 142, 143, 175, 111, 143, 144, 176, 145, 177, 178, 210, 112, 144, 145, 177, 146, 178, 179, 211, 147, 179, 180, 212, 181, 213, 214, 246, 77, 109, 110, 142, 111, 143, 144, 176, 112, 144, 145, 177, 146, 178, 179, 211, 113, 145, 146, 178, 147, 179, 180, 212, 148, 180, 181, 213, 182, 214, 215, 247, 114, 146, 147, 179, 148, 180, 181, 213, 149, 181, 182, 214, 183, 215, 216, 248, 150, 182, 183, 215, 184, 216, 217, 249, 185, 217, 218, 250, 219, 251, 252, 284},
{0, 40, 41, 81, 42, 82, 83, 123, 43, 83, 84, 124, 85, 125, 126, 166, 44, 84, 85, 125, 86, 126, 127, 167, 87, 127, 128, 168, 129, 169, 170, 210, 45, 85, 86, 126, 87, 127, 128, 168, 88, 128, 129, 169, 130, 170, 171, 211, 89, 129, 130, 170, 131, 171, 172, 212, 132, 172, 173, 213, 174, 214, 215, 255, 46, 86, 87, 127, 88, 128, 129, 169, 89, 129, 130, 170, 131, 171, 172, 212, 90, 130, 131, 171, 132, 172, 173, 213, 133, 173, 174, 214, 175, 215, 216, 256, 91, 131, 132, 172, 133, 173, 174, 214, 134, 174, 175, 215, 176, 216, 217, 257, 135, 175, 176, 216, 177, 217, 218, 258, 178, 218, 219, 259, 220, 260, 261, 301, 47, 87, 88, 128, 89, 129, 130, 170, 90, 130, 131, 171, 132, 172, 173, 213, 91, 131, 132, 172, 133, 173, 174, 214, 134, 174, 175, 215, 176, 216, 217, 257, 92, 132, 133, 173, 134, 174, 175, 215, 135, 175, 176, 216, 177, 217, 218, 258, 136, 176, 177, 217, 178, 218, 219, 259, 179, 219, 220, 260, 221, 261, 262, 302, 93, 133, 134, 174, 135, 175, 176, 216, 136, 176, 177, 217, 178, 218, 219, 259, 137, 177, 178, 218, 179, 219, 220, 260, 180, 220, 221, 261, 222, 262, 263, 303, 138, 178, 179, 219, 180, 220, 221, 261, 181, 221, 222, 262, 223, 263, 264, 304, 182, 222, 223, 263, 224, 264, 265, 305, 225, 265, 266, 306, 267, 307, 308, 348},
{0, 48, 49, 97, 50, 98, 99, 147, 51, 99, 100, 148, 101, 149, 150, 198, 52, 100, 101, 149, 102, 150, 151, 199, 103, 151, 152, 200, 153, 201, 202, 250, 53, 101, 102, 150, 103, 151, 152, 200, 104, 152, 153, 201, 154, 202, 203, 251, 105, 153, 154, 202, 155, 203, 204, 252, 156, 204, 205, 253, 206, 254, 255, 303, 54, 102, 103, 151, 104, 152, 153, 201, 105, 153, 154, 202, 155, 203, 204, 252, 106, 154, 155, 203, 156, 204, 205, 253, 157, 205, 206, 254, 207, 255, 256, 304, 107, 155, 156, 204, 157, 205, 206, 254, 158, 206, 207, 255, 208, 256, 257, 305, 159, 207, 208, 256, 209, 257, 258, 306, 210, 258, 259, 307, 260, 308, 309, 357, 55, 103, 104, 152, 105, 153, 154, 202, 106, 154, 155, 203, 156, 204, 205, 253, 107, 155, 156, 204, 157, 205, 206, 254, 158, 206, 207, 255, 208, 256, 257, 305, 108, 156, 157, 205, 158, 206, 207, 255, 159, 207, 208, 256, 209, 257, 258, 306, 160, 208, 209, 257, 210, 258, 259, 307, 211, 259, 260, 308, 261, 309, 310, 358, 109, 157, 158, 206, 159, 207, 208, 256, 160, 208, 209, 257, 210, 258, 259, 307, 161, 209, 210, 258, 211, 259, 260, 308, 212, 260, 261, 309, 262, 310, 311, 359, 162, 210, 211, 259, 212, 260, 261, 309, 213, 261, 262, 310, 263, 311, 312, 360, 214, 262, 263, 311, 264, 312, 313, 361, 265, 313, 314, 362, 315, 363, 364, 412},
{0, 56, 57, 113, 58, 114, 115, 171, 59, 115, 116, 172, 117, 173, 174, 230, 60, 116, 117, 173, 118, 174, 175, 231, 119, 175, 176, 232, 177, 233, 234, 290, 61, 117, 118, 174, 119, 175, 176, 232, 120, 176, 177, 233, 178, 234, 235, 291, 121, 177, 178, 234, 179, 235, 236, 292, 180, 236, 237, 293, 238, 294, 295, 351, 62, 118, 119, 175, 120, 176, 177, 233, 121, 177, 178, 234, 179, 235, 236, 292, 122, 178, 179, 235, 180, 236, 237, 293, 181, 237, 238, 294, 239, 295, 296, 352, 123, 179, 180, 236, 181, 237, 238, 294, 182, 238, 239, 295, 240, 296, 297, 353, 183, 239, 240, 296, 241, 297, 298, 354, 242, 298, 299, 355, 300, 356, 357, 413, 63, 119, 120, 176, 121, 177, 178, 234, 122, 178, 179, 235, 180, 236, 237, 293, 123, 179, 180, 236, 181, 237, 238, 294, 182, 238, 239, 295, 240, 296, 297, 353, 124, 180, 181, 237, 182, 238, 239, 295, 183, 239, 240, 296, 241, 297, 298, 354, 184, 240, 241, 297, 242, 298, 299, 355, 243, 299, 300, 356, 301, 357, 358, 414, 125, 181, 182, 238, 183, 239, 240, 296, 184, 240, 241, 297, 242, 298, 299, 355, 185, 241, 242, 298, 243, 299, 300, 356, 244, 300, 301, 357, 302, 358, 359, 415, 186, 242, 243, 299, 244, 300, 301, 357, 245, 301, 302, 358, 303, 359, 360, 416, 246, 302, 303, 359, 304, 360, 361, 417, 305, 361, 362, 418, 363, 419, 420, 476}
};
for (int bite = 0; bite < 8; bite++)
result += lookups[bite][ (int)(word >> (bite * 8)) & 0xff ];
return result;
}
You might clean that up a bit, move the initialization out of the function, and so on, but this crucially removes the branch from your loop, and reduces your best case of 128 instructions (for all zeros) to 56 instructions (rough numbers). The worst case is a bit more pronounced at 192 instructions to 56. Further, a smart compiler might unroll the loop entirely, reducing to 40 instructions.
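As one way of doing that cleanup, the eight tables can be generated once in a static initializer instead of being written out by hand. This is a sketch under the same layout as above (the class name WeightedBitCountTable and the naive reference method are hypothetical additions for checking):

```java
class WeightedBitCountTable {
    // LOOKUPS[b][v]: sum of the word-level bit positions set in byte value v at byte index b.
    private static final int[][] LOOKUPS = new int[8][256];

    static {
        for (int b = 0; b < 8; b++) {
            for (int v = 0; v < 256; v++) {
                int sum = 0;
                for (int bit = 0; bit < 8; bit++) {
                    if ((v & (1 << bit)) != 0) {
                        sum += 8 * b + bit;
                    }
                }
                LOOKUPS[b][v] = sum;
            }
        }
    }

    /** Table-driven version: one lookup per byte of the word. */
    static int weightedBitCount(long word) {
        int result = 0;
        for (int b = 0; b < 8; b++) {
            result += LOOKUPS[b][(int) (word >>> (8 * b)) & 0xFF];
        }
        return result;
    }

    /** Bit-by-bit reference implementation, as in the question. */
    static int naive(long word) {
        int result = 0;
        for (int i = 0; i < 64; i++) {
            if ((word & (1L << i)) != 0) {
                result += i;
            }
        }
        return result;
    }
}
```

Generating the table also makes it easy to try other splits, such as 16-bit chunks with four larger tables, without retyping any constants.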
I think it is also important to have fewer hash collisions, not just fast hashing. A faster hash calculation can make your program slower overall if it causes a large number of collisions.
It might be a better idea to use a generic hash function like Murmur3 from Google Guava instead of inventing your own.
There are many benchmarks of hash functions, for example:
http://greenrobot.org/essentials/features/performant-hash-functions-for-java/comparison-of-hash-functions/
https://www.strchr.com/hash_functions
https://github.com/Cyan4973/xxHash
I think you can do some micro-benchmarking using Google Caliper and check which hash function is better for your case.
BTW, ask yourself why you need a custom BitSet.
This is what I did:
int weightedBitCount(long word) {
return (Long.bitCount(word & 0xFFFF_FFFF_0000_0000L) << 5)
+ (Long.bitCount(word & 0xFFFF_0000_FFFF_0000L) << 4)
+ (Long.bitCount(word & 0xFF00_FF00_FF00_FF00L) << 3)
+ (Long.bitCount(word & 0xF0F0_F0F0_F0F0_F0F0L) << 2)
+ (Long.bitCount(word & 0xCCCC_CCCC_CCCC_CCCCL) << 1)
+ (Long.bitCount(word & 0xAAAA_AAAA_AAAA_AAAAL) << 0);
}
It's pretty simple: With a single bit set, e.g., the bit 10, word looks like 0x0000_0000_0000_0400L and only the masks 0xFF00_FF00_FF00_FF00L and 0xCCCC_CCCC_CCCC_CCCCL produce a bit count of 1, so we get
(0 << 5) + (0 << 4) + (1 << 3) + (0 << 2) + (1 << 1) + (0 << 0) = 10
It needs some 6*4 instructions (maybe 6 cycles on modern Intel) per 64 bits, so it's not really slow, but it's still too slow when compared with the bulk bitset operations which need a single instruction (per 64 bits).
So I'm playing with some batch computation over multiple words.
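A quick way to gain confidence in the mask-based version is to compare it with the naive loop on random words. The harness below is a sketch (the class name, the seed, and the iteration count are arbitrary choices, not from the original post):

```java
import java.util.Random;

class MaskWeightedBitCount {
    /** Each mask selects the positions whose index has a given binary digit set. */
    static int weightedBitCount(long word) {
        return (Long.bitCount(word & 0xFFFF_FFFF_0000_0000L) << 5)
             + (Long.bitCount(word & 0xFFFF_0000_FFFF_0000L) << 4)
             + (Long.bitCount(word & 0xFF00_FF00_FF00_FF00L) << 3)
             + (Long.bitCount(word & 0xF0F0_F0F0_F0F0_F0F0L) << 2)
             + (Long.bitCount(word & 0xCCCC_CCCC_CCCC_CCCCL) << 1)
             + Long.bitCount(word & 0xAAAA_AAAA_AAAA_AAAAL);
    }

    /** Bit-by-bit reference implementation. */
    static int naive(long word) {
        int result = 0;
        for (int i = 0; i < 64; i++) {
            if ((word & (1L << i)) != 0) {
                result += i;
            }
        }
        return result;
    }

    /** Returns true if both versions agree on the given number of random words. */
    static boolean agreeOnRandomWords(long seed, int count) {
        Random random = new Random(seed);
        for (int i = 0; i < count; i++) {
            long word = random.nextLong();
            if (weightedBitCount(word) != naive(word)) {
                return false;
            }
        }
        return true;
    }
}
```

The decomposition works because every bit index 0..63 is a 6-digit binary number, and each mask counts how many set bits have a particular digit of their index equal to 1.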
We are given a read-only array of n integers from 1 to n. Each integer appears exactly once, except A, which appears twice, and B, which is missing.
Return A and B.
I know my solution is not space efficient, but I am wondering why I am getting the wrong output for cases like:
389, 299, 65, 518, 361, 103, 342, 406, 24, 79, 192, 181, 178, 205, 38, 298, 218, 143, 446, 324, 82, 41, 312, 166, 252, 59, 91, 6, 248, 395, 157, 332, 352, 57, 106, 246, 506, 261, 16, 470, 224, 228, 286, 121, 193, 241, 203, 36, 264, 234, 386, 471, 225, 466, 81, 58, 253, 468, 31, 197, 15, 282, 334, 171, 358, 209, 213, 158, 355, 243, 75, 411, 43, 485, 291, 270, 25, 100, 194, 476, 70, 402, 403, 109, 322, 421, 313, 239, 327, 238, 257, 433, 254, 328, 163, 436, 520, 437, 392, 199, 63, 482, 222, 500, 454, 84, 265, 508, 416, 141, 447, 258, 384, 138, 47, 156, 172, 319, 137, 62, 85, 154, 97, 18, 360, 244, 272, 93, 263, 262, 266, 290, 369, 357, 176, 317, 383, 333, 204, 56, 521, 502, 326, 353, 469, 455, 190, 393, 453, 314, 480, 189, 77, 129, 439, 139, 441, 443, 351, 528, 182, 101, 501, 425, 126, 231, 445, 155, 432, 418, 95, 375, 376, 60, 271, 74, 11, 419, 488, 486, 54, 460, 321, 341, 174, 408, 131, 115, 107, 134, 448, 532, 292, 289, 320, 14, 323, 61, 481, 371, 151, 385, 325, 472, 44, 335, 431, 187, 51, 88, 105, 145, 215, 122, 162, 458, 52, 496, 277, 362, 374, 26, 211, 452, 130, 346, 10, 315, 459, 92, 531, 467, 309, 34, 281, 478, 477, 136, 519, 196, 240, 12, 288, 302, 119, 356, 503, 527, 22, 27, 55, 343, 490, 127, 444, 308, 354, 278, 497, 191, 294, 117, 1, 396, 125, 148, 285, 509, 208, 382, 297, 405, 245, 5, 330, 311, 133, 274, 275, 118, 463, 504, 39, 99, 442, 337, 169, 140, 104, 373, 221, 499, 413, 124, 510, 159, 465, 80, 276, 83, 329, 524, 255, 387, 259, 397, 491, 517, 23, 4, 230, 48, 349, 412, 142, 114, 487, 381, 164, 35, 67, 498, 73, 440, 108, 226, 96, 132, 144, 207, 235, 33, 69, 128, 236, 364, 198, 475, 173, 493, 150, 90, 515, 111, 68, 232, 340, 112, 526, 492, 512, 495, 429, 146, 336, 17, 350, 251, 7, 184, 76, 380, 359, 293, 19, 49, 345, 227, 212, 430, 89, 474, 279, 201, 398, 347, 273, 37, 185, 177, 102, 304, 295, 422, 94, 426, 514, 116, 183, 180, 494, 42, 305, 152, 390, 30, 247, 451, 32, 388, 331, 78, 424, 368, 394, 188, 306, 449, 8, 214, 120, 179, 280, 511, 409, 338, 153, 
507, 370, 461, 217, 161, 483, 147, 242, 86, 417, 268, 71, 462, 420, 167, 513, 379, 307, 522, 435, 113, 296, 457, 525, 45, 529, 423, 427, 2, 438, 64, 316, 46, 40, 13, 516, 367, 233, 110, 318, 250, 283, 216, 186, 310, 237, 377, 365, 175, 479, 378, 66, 414, 473, 165, 210, 50, 348, 372, 363, 339, 20, 168, 284, 415, 505, 206, 53, 223, 434, 202, 123, 399, 400, 135, 269, 428, 219, 456, 28, 464, 267, 489, 98, 391, 195, 366, 300, 484, 533, 229, 213, 149, 160, 256, 303, 530, 301, 29, 404, 344, 401, 220, 287, 9, 407, 170, 450, 523, 249, 72, 410, 3, 21, 200, 260
Expected Output:
213 87
Actual Output :
213 3
Java code I have tried so far:
public class Solution {
// DO NOT MODIFY THE LIST
public ArrayList<Integer> repeatedNumber(final List<Integer> a) {
int n=a.size();
int rep=0,b=0;
int[] arr= new int[n+1];
for(int i=0;i<n+1;i++) //value i is at index i
arr[i]=i;
arr[0]=-1;
for(int val : a)
{
if(arr[val]!=-1)
arr[val]=-1;
else
{
rep=val;
break;
}
}
for(int i=0; i<n+1; i++)
{
if(arr[i]!=-1)
{
b=i;
break;
}
}
ArrayList<Integer> ans = new ArrayList<Integer>();
ans.add(rep);
ans.add(b);
return ans;
}
}
You find a mistake like this by reasoning that you need one complete pass over the array to detect a missing value. You could stop early if you only needed the duplicate, but the same loop also marks which values are present, so it has to run to the end.
for (int val : a) {
    if (arr[val] != -1) {
        arr[val] = -1;
    } else {
        rep = val;
        // no break here: keep marking the remaining values
    }
}
Later: basically the same idea, but using a bit more of Java, making it shorter:
int dup = 0;
int abs = 0;
BitSet present = new BitSet( a.size() + 1 );
for( int x: a ){
if( present.get( x ) ){
dup = x;
} else {
present.set( x );
}
}
abs = present.nextClearBit( 1 );
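Wrapped into a complete method, the BitSet fragment above could look as follows. This is a sketch: the class name, the method name find, and the int[] return type are assumptions made to keep it self-contained.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

class RepeatedAndMissing {
    /** Returns {duplicate, missing} for a list holding 1..n with one value repeated and one absent. */
    static int[] find(List<Integer> a) {
        int dup = 0;
        BitSet present = new BitSet(a.size() + 1);
        for (int x : a) {
            if (present.get(x)) {
                dup = x;        // seen before: this is the repeated value
            } else {
                present.set(x); // mark the value as present
            }
        }
        // Bit 0 is never used, so start searching at 1.
        int missing = present.nextClearBit(1);
        return new int[] {dup, missing};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(find(Arrays.asList(3, 1, 2, 5, 3))));
    }
}
```

The loop deliberately has no early exit, for the same reason as in the corrected array version: every value must be marked before the missing one can be looked up.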
If you may modify the original array (or list), you can avoid using some extra storage:
int dup = 0;
int abs = 0;
for( int i = 0; i < a.length; ++i ){
if( a[i] <= 0 ) continue;
int val = a[i];
a[i] = 0;
while( true ){
if( a[val-1] == -val ){
dup = val;
break;
} else {
int h = a[val-1];
a[val-1] = -val;
if( h == 0 ) break;
val = h;
}
}
}
for( int i = 0; i < a.length; ++i ){
if( a[i] >= 0 ){
abs = i+1;
break;
}
}
I would begin by using List.toArray(T[]) to get an array, sorting the array and then iterating the sorted array once. Something like,
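public ArrayList<Integer> repeatedNumber(final List<Integer> a) {
    Integer[] arr = a.toArray(new Integer[0]);
    Arrays.sort(arr);
    Integer[] r = new Integer[2];
    // If the smallest value is not 1, then 1 is the missing number.
    if (arr[0] != 1) {
        r[1] = 1;
    }
    for (int i = 1; i < arr.length; i++) {
        int prev = arr[i - 1];
        if (prev == arr[i]) {
            r[0] = prev;              // two equal neighbours: the duplicate
        } else if (prev != arr[i] - 1) {
            r[1] = prev + 1;          // a gap between neighbours: the missing value
        }
    }
    // No gap found anywhere: the missing number must be n.
    if (r[1] == null) {
        r[1] = arr.length;
    }
    return new ArrayList<Integer>(Arrays.asList(r));
}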
which I tested with your input and got (as requested)
[213, 87]
Here is a solution that doesn't alter the original list and finds the missing one using only maths.
public static void main(String[] args) {
find(new int[]{389, 299, 65, 518, 361, 103, 342, 406, 24, 79, 192, 181, 178, 205, 38, 298, 218, 143, 446, 324, 82, 41, 312, 166, 252, 59, 91, 6, 248, 395, 157, 332, 352, 57, 106, 246, 506, 261, 16, 470, 224, 228, 286, 121, 193, 241, 203, 36, 264, 234, 386, 471, 225, 466, 81, 58, 253, 468, 31, 197, 15, 282, 334, 171, 358, 209, 213, 158, 355, 243, 75, 411, 43, 485, 291, 270, 25, 100, 194, 476, 70, 402, 403, 109, 322, 421, 313, 239, 327, 238, 257, 433, 254, 328, 163, 436, 520, 437, 392, 199, 63, 482, 222, 500, 454, 84, 265, 508, 416, 141, 447, 258, 384, 138, 47, 156, 172, 319, 137, 62, 85, 154, 97, 18, 360, 244, 272, 93, 263, 262, 266, 290, 369, 357, 176, 317, 383, 333, 204, 56, 521, 502, 326, 353, 469, 455, 190, 393, 453, 314, 480, 189, 77, 129, 439, 139, 441, 443, 351, 528, 182, 101, 501, 425, 126, 231, 445, 155, 432, 418, 95, 375, 376, 60, 271, 74, 11, 419, 488, 486, 54, 460, 321, 341, 174, 408, 131, 115, 107, 134, 448, 532, 292, 289, 320, 14, 323, 61, 481, 371, 151, 385, 325, 472, 44, 335, 431, 187, 51, 88, 105, 145, 215, 122, 162, 458, 52, 496, 277, 362, 374, 26, 211, 452, 130, 346, 10, 315, 459, 92, 531, 467, 309, 34, 281, 478, 477, 136, 519, 196, 240, 12, 288, 302, 119, 356, 503, 527, 22, 27, 55, 343, 490, 127, 444, 308, 354, 278, 497, 191, 294, 117, 1, 396, 125, 148, 285, 509, 208, 382, 297, 405, 245, 5, 330, 311, 133, 274, 275, 118, 463, 504, 39, 99, 442, 337, 169, 140, 104, 373, 221, 499, 413, 124, 510, 159, 465, 80, 276, 83, 329, 524, 255, 387, 259, 397, 491, 517, 23, 4, 230, 48, 349, 412, 142, 114, 487, 381, 164, 35, 67, 498, 73, 440, 108, 226, 96, 132, 144, 207, 235, 33, 69, 128, 236, 364, 198, 475, 173, 493, 150, 90, 515, 111, 68, 232, 340, 112, 526, 492, 512, 495, 429, 146, 336, 17, 350, 251, 7, 184, 76, 380, 359, 293, 19, 49, 345, 227, 212, 430, 89, 474, 279, 201, 398, 347, 273, 37, 185, 177, 102, 304, 295, 422, 94, 426, 514, 116, 183, 180, 494, 42, 305, 152, 390, 30, 247, 451, 32, 388, 331, 78, 424, 368, 394, 188, 306, 449, 8, 214, 120, 179, 280, 511, 
409, 338, 153, 507, 370, 461, 217, 161, 483, 147, 242, 86, 417, 268, 71, 462, 420, 167, 513, 379, 307, 522, 435, 113, 296, 457, 525, 45, 529, 423, 427, 2, 438, 64, 316, 46, 40, 13, 516, 367, 233, 110, 318, 250, 283, 216, 186, 310, 237, 377, 365, 175, 479, 378, 66, 414, 473, 165, 210, 50, 348, 372, 363, 339, 20, 168, 284, 415, 505, 206, 53, 223, 434, 202, 123, 399, 400, 135, 269, 428, 219, 456, 28, 464, 267, 489, 98, 391, 195, 366, 300, 484, 533, 229, 213, 149, 160, 256, 303, 530, 301, 29, 404, 344, 401, 220, 287, 9, 407, 170, 450, 523, 249, 72, 410, 3, 21, 200, 260});
}
public static void find(int[] numbers){
int sum = 0;
int duplicate = -1;
for (int i = 0; i < numbers.length; i++) {
sum += numbers[i];
if(duplicate == -1) {
for (int j = i + 1; j < numbers.length; j++) {
if(numbers[i] == numbers[j]){
duplicate = numbers[i];
}
}
}
}
int missing = triangle(numbers.length) - (sum - duplicate);
System.out.println(duplicate + " " + missing);
}
public static int triangle(int amount) {
    // n-th triangular number 1 + 2 + ... + n, in integer arithmetic
    return amount * (amount + 1) / 2;
}
I would suggest a better approach. Here are the logical steps you need to follow.
I will explain with an example.
Example: incorrect array => {20,30,40,40} and correct array => {20,30,40,60}
First, calculate the sums of the correct array and the incorrect array.
Incorrect array sum => 130 and correct array sum => 150.
Calculate the difference between these two sums (incorrect minus correct).
The difference will be -20 (negative).
Then find the integer value that is repeated.
Iterating with a loop, find the number that is repeated in the incorrect array. So the repeated number will be 40.
Now subtract this repeated number from the difference you found (between the two sums).
(-20) - 40 = -60. You have got your missing number.
Finally, you get your missing number as a negative value, so the missing number is 60. This is a more efficient way to do it.
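The steps above, applied to the original problem where the correct array is 1..n, can be sketched as follows. Two things differ from the description and are assumptions of this sketch: a HashSet replaces the nested loop for finding the repeated number, and the sign is flipped so that missing = repeated + (correct sum - incorrect sum), which avoids the negative intermediate result. The class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

class SumDifference {
    /** Returns {repeated, missing} for an array holding 1..n with one value repeated and one absent. */
    static long[] find(int[] a) {
        long n = a.length;
        long actualSum = 0;
        long repeated = -1;
        Set<Integer> seen = new HashSet<>();
        for (int x : a) {
            actualSum += x;
            if (!seen.add(x)) { // add() returns false if x was already present
                repeated = x;
            }
        }
        long expectedSum = n * (n + 1) / 2; // sum of the correct array 1..n
        long missing = repeated + (expectedSum - actualSum);
        return new long[] {repeated, missing};
    }
}
```

For the example in the text the same formula gives 40 + (150 - 130) = 60.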
Note: if you iterate over the incorrect array (n) with nested loops (n*n), and then again over the correct array (n) with nested loops (n*n), the total will be (n*n*n*n), which is very expensive. The solution given by me will take at most ((n*n)+n+n). So definitely