Avro: silent coercion of floats to ints


A few days ago, while writing tests for a class that decodes JSON payloads using Avro’s Java library, I noticed that records containing a float value in an int field did not cause the decoder to throw an exception, as my tests expected.

Example

Given the following schema:

{
  "type" : "record",
  "name" : "example",
  "namespace" : "test",
  "fields" : [ { "name" : "id", "type" : "int" } ]
}

and a record like the following:

{ "id": -1.2 }

decoding produces a record with id = -1 (note the coercion: the fractional part is silently dropped).

Investigation

I then started investigating the issue:

Is this caused by issues in my own code?

No: I can reproduce it with a code snippet that only makes use of Avro-provided classes.

Can I figure out what is causing the issue?

Yes: debugging the snippet, I found that float / double values are silently coerced to integers, unless the conversion is impossible (e.g. because of an overflow). This behaviour is “inherited” from the Codehaus Jackson library, which Avro uses to parse JSON objects.
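To make the observed behaviour concrete, here is a minimal sketch in plain Java that models it. It is not Jackson's or Avro's actual code: the class name CoercionSketch and the method coerceToInt are hypothetical, and ArithmeticException is only a stand-in for whatever error the decoder raises on overflow.

```java
public class CoercionSketch {

    // Hypothetical helper, NOT part of Avro or Jackson: it only models the
    // behaviour described above (coerce when possible, fail on overflow).
    static int coerceToInt(double value) {
        if (Double.isNaN(value)
                || value < Integer.MIN_VALUE
                || value > Integer.MAX_VALUE) {
            // Stand-in for the decoder's error when the value cannot fit.
            throw new ArithmeticException("value out of int range: " + value);
        }
        // A narrowing cast truncates toward zero, so -1.2 becomes -1.
        return (int) value;
    }

    public static void main(String[] args) {
        System.out.println(coerceToInt(-1.2)); // prints -1
        System.out.println(coerceToInt(1.9));  // prints 1
        try {
            coerceToInt(3.0e10); // too large for an int
        } catch (ArithmeticException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The point of the sketch is that nothing in the "happy path" signals that information was lost: the caller only ever sees an int.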

Is this affecting only this version of the library (1.8.2)?

No: I can reproduce it on 1.9.2 and 1.10.0 as well (the latter being the current version at the time of writing).

Is this by any chance an intended behaviour that I’m misunderstanding?

I don’t have a definitive answer, though a quick test using Avro’s JavaScript library seems to show that it’s not (the same example returns an error there).

Can I add my own validation logic, to address this?

Not viable / not a good idea: since the original value is lost after the coercion, I would have to “intercept” it in the JSON payload (essentially duplicating parsing logic) and replace it in the record after the decoding has happened.
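For illustration only, this is roughly the per-value check that such an "interception" would need, assuming the raw numeric literal had already been extracted from the JSON payload (which is the hard, duplicated part and is not shown). The class name IntLiteralCheck and the method fitsIntExactly are hypothetical; the only library call is the JDK's BigDecimal.intValueExact, which throws if the value has a non-zero fractional part or does not fit in an int.

```java
import java.math.BigDecimal;

public class IntLiteralCheck {

    // Hypothetical helper: returns true only if the JSON numeric literal
    // can be represented as an int without losing information.
    // A non-numeric literal makes the BigDecimal constructor throw
    // NumberFormatException, which is deliberately left to propagate.
    static boolean fitsIntExactly(String literal) {
        try {
            new BigDecimal(literal).intValueExact();
            return true;
        } catch (ArithmeticException e) {
            // Non-zero fractional part, or outside the int range.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(fitsIntExactly("-1.2"));       // false
        System.out.println(fitsIntExactly("-1"));         // true
        System.out.println(fitsIntExactly("3000000000")); // false
    }
}
```

Note one design choice in this sketch: a literal such as 3.0 is accepted, because its fractional part is zero. A stricter check would have to inspect the literal's text, which is yet more parsing logic to duplicate.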

Conclusion

Having exhausted my options, I warned my (technical) stakeholders about the issue and opened a ticket in Avro’s issue tracker (see AVRO-2885).