I am mostly putting this here so I can read it later, just in case the issue gets deleted. Boon got a visit from Cowboycoder.
Improvement to `JacksonSerializer.roundTrip()`

Sorry, I don't get it. The benchmark focuses on deserialization, not serialization => read from bytes, not write to bytes.

Ok sorry if that is the case: I did start with the work, and did not verify the original code base.
I will verify and close this if irrelevant.
Closed
cowtowncoder closed the issue


It has code for serializing and parsing, as you mentioned.
RE: Existing code serializes content as a String, but given that Jackson heavily optimizes byte stream case, and since parse tests take in a byte[] or InputStream, it would make more sense to use:
`byte[] content = mapper.writeValueAsBytes(alltype);`
Seems logical as a benchmark. Not sure it makes the other benchmark invalid, as there are users who do want strings. Boon is optimized to serialize with char[]. I don't have a byte[] serializer.
RE: and feed that to parser. Combination will be significantly faster than construction to and parsing from a String (esp. when using StringWriter; there is mapper.writeValueAsString() method as well that is marginally faster).
RE: The same output method probably also makes sense for serialize() method, looking at matching Boon serializer implementation.
The thought of a byte serializer never crossed my mind. Currently there is only a char[] serializer. I like the idea. Not sure it is high on my list of priorities, as most of the buffer sizes that I am dealing with in production are satisfied fine with `String.getBytes`. I'll test Jackson vs. Boon with to/from bytes only. This should put Boon at an equal disadvantage to where Jackson is now. I don't think it negates the earlier benchmark, and in fact I think the benchmark in question is a much more common case. But that is conjecture for both of us.

@RichardHightower right, it is difficult to have proper apples-to-apples tests. In this case it is not so much that output as `byte[]` is more efficient (it may be marginally so but not significantly), but that parsing is.

I think it is perfectly reasonable to have different sources/targets to test, and report these separately (esp. for parsing). For serialization, perhaps it would make sense to also have `OutputStream` output type, which should be relatively fair, not assuming that serializer has an efficient way of producing a `byte[]` or not?

For round-tripping use case it seems to me that the intermediate format should be the most efficient one for implementation; this does require some knowledge of implementation. I was thinking that this use case perhaps emulates case like writing a JSON file to disk, reading it back; or web service call (although payload would differ in directions), or maybe send-modify-return.
One last thing: I would have filed this at the forked repo, but I couldn't find an issue tracker. Maybe github does not have all the facilities for forks.

I am a GitHub neophyte. I can create a new benchmark project, or you can add the issues directly to Boon. I see your point about `mapper.writeValueAsString()`, and I am curious to know the difference. I think curiosity has been a driving factor. I suspect for larger files this will tip the scale towards Jackson.

Background: Boon was written and it needed JSON. I decided to write a JSON parser to reduce the dependencies, with the thought being you could always just use GSON or Jackson if the feature set / use case was not a direct match, but Boon has its own. I have been using Jackson since 2009 and maybe before. I have never used GSON in production.

I was aiming / optimizing for REST calls and Websocket messages around the 1k to 1mb range. Boon JSON is tuned/optimized for service calls and messaging, not really large files. Also I am using it in a somewhat reactive style, so I always have complete buffers to parse and typically only accept a buffer when it is complete. I've used it in conjunction with a file batcher that writes a JSON entry per line, and I quit benchmarking it after 300mb per second sustained writing to disk (well, disk array storage anyway). I only needed 30 mb per second for this app. The file had JSON in it but it was not a JSON file, as it was one JSON entry per line (1k or so each line).

The largest service call I have to deal with is about 500k, but this is more or less a data push update. End user wise, the service calls are typically under 1k. The returns are under 10k. This pretty much describes every SOA / REST / Websocket app that I have ever worked with, and I have worked with quite a few.

Boon has an IO lib and an in-memory no-SQL-ish lib, and will one day have a REST framework and an MVC framework and a mustache template lib. It needs JSON, but JSON is not Boon. In short, I am going to make the string/serialize parse as fair as possible. I will add a byte parse and serialize parse.
Also I realize that Jackson and Boon have different design goals, and there are features and things that Jackson does that Boon does not. I compared Boon to Jackson because that is the first question people ask me: how does it compare to Jackson? No one asks me about GSON, but I don't hang out with Android developers. I am an old Java dog.
…
@RichardHightower I thought I recognized your name from NFJS? Unless I am confusing you with another mr Hightower. :)
I appreciate your explanation on goals for Boon. It is always interesting to read about that part, since it makes it much easier to understand various implementations choices and strategies.
The part about `writeValueAsString()` is really just about how `StringWriter` works, which like `ByteArrayOutputStream` keeps on doubling up, reallocating its buffer. If the end result is String, that isn't very optimal unless initial size guess is correct. With segments, one can reduce allocations. But it's probably not a huge deal in the end. Same is true for `writeValueAsBytes`, intermediate storage is segmented, and final allocation is done when total length is known, instead of doubling up until full size is known, and then making one more copy.

Bigger impact for round-trip was simply just that the intermediate `Object` would be efficient to create & consume. I had a look at results you found, and round-trip was one where I couldn't quite see where the difference comes from.

Other cases (where Boon does very well -- impressive!) make more sense to me.
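
The segmented-buffer idea described above (fill fixed-size segments, then allocate the final array once the total length is known) can be sketched roughly as follows. This is only an illustration: `SegmentedCharBuffer`, its method names, and the 1024-char segment size are all made up for this sketch, and it is not the actual buffer class used by Jackson or Boon.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy segmented char buffer (hypothetical; not Jackson's or Boon's real class). */
public class SegmentedCharBuffer {
    // Segment size is an arbitrary choice for this sketch.
    private static final int SEGMENT_SIZE = 1024;

    private final List<char[]> fullSegments = new ArrayList<>(); // completely filled segments
    private char[] current = new char[SEGMENT_SIZE];             // segment being filled
    private int usedInCurrent = 0;                               // chars used in 'current'
    private int totalLength = 0;                                 // chars appended so far

    /** Appends s, starting a new segment whenever the current one is full. */
    public void append(String s) {
        int offset = 0;
        while (offset < s.length()) {
            if (usedInCurrent == current.length) {
                // No copying of old content: keep the full segment and start a fresh one.
                fullSegments.add(current);
                current = new char[SEGMENT_SIZE];
                usedInCurrent = 0;
            }
            int n = Math.min(s.length() - offset, current.length - usedInCurrent);
            s.getChars(offset, offset + n, current, usedInCurrent);
            usedInCurrent += n;
            offset += n;
            totalLength += n;
        }
    }

    /** Allocates the result array once, at its exact final size, and copies each segment in. */
    public char[] toCharArray() {
        char[] out = new char[totalLength];
        int pos = 0;
        for (char[] segment : fullSegments) {
            System.arraycopy(segment, 0, out, pos, segment.length);
            pos += segment.length;
        }
        System.arraycopy(current, 0, out, pos, usedInCurrent);
        return out;
    }

    /** Note: the String constructor makes its own copy of the char array. */
    public String build() {
        return new String(toCharArray());
    }
}
```

By contrast, a doubling buffer copies everything written so far each time it grows, and typically ends with spare capacity that has to be trimmed by one more copy.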
Jackson optimizes heavily for POJO data-binding case to/from raw byte input; and specifically binding as `Map`s is probably somewhat sub-optimal: I suspect that the results for that one dictionary JSON where keys are numbers are particularly tricky for Jackson due to symbol table churn. Basically, Jackson assumes that keys are mostly repetitive; but if all keys are unique, this does not hold. Which is fine, except that for larger files it starts to degrade performance.

One thing I have been curious about has been performance cost of doing range checks (to allow incremental input): that is, whether requiring all input to be in memory would allow short-cuts. I tried removing those checks once, with pre-allocated buffer, but did not see much difference. So for the way Jackson streaming parser is implemented, there isn't much benefit from requiring full input. But this could well be different with a different parsing technique.
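
The point about repetitive versus unique keys can be illustrated with a toy name cache. This is a deliberately simplified stand-in: `NameCache` and its methods are hypothetical, and Jackson's real symbol tables work differently in detail. It only shows why a cache of field names pays off when keys repeat and turns into pure overhead (and unbounded growth) when every key is unique.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy field-name cache (hypothetical; a simplification, not Jackson's symbol table). */
public class NameCache {
    private final Map<String, String> canonical = new HashMap<>();
    private int hits = 0;
    private int misses = 0;

    /** Returns the cached instance of a name, adding it on first sight. */
    public String canonicalize(String name) {
        String existing = canonical.get(name);
        if (existing != null) {
            hits++;          // repeated key: the cached instance is reused
            return existing;
        }
        misses++;            // new key: the table grows by one entry
        canonical.put(name, name);
        return name;
    }

    public int hits() { return hits; }
    public int misses() { return misses; }
    public int size() { return canonical.size(); }

    public static void main(String[] args) {
        // Typical POJO-like input: the same three field names in every record.
        NameCache repetitive = new NameCache();
        for (int i = 0; i < 1000; i++) {
            repetitive.canonicalize("id");
            repetitive.canonicalize("name");
            repetitive.canonicalize("age");
        }
        System.out.println("repetitive: hits=" + repetitive.hits() + " misses=" + repetitive.misses());

        // Dictionary-like input: every key is a distinct number.
        NameCache unique = new NameCache();
        for (int i = 0; i < 3000; i++) {
            unique.canonicalize(Integer.toString(i));
        }
        System.out.println("unique: hits=" + unique.hits() + " misses=" + unique.misses());
    }
}
```

With repeated field names almost every lookup is a hit; with all-unique numeric keys every lookup is a miss and the table keeps growing, which is the "churn" scenario described above.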
Maybe I should see how Boon does parsing: it's been a while since I have had a look elsewhere. The last one was FastJSON, which actually had very cool tricks for data-binding. Its approach is much more integrated than Jackson's (sort of like SAX + data-binding in one bundle), and that is something that could yield improvements too. Current division between streaming and data-binding is useful, but it has some non-zero cost.

RE: @RichardHightower I thought I recognized your name from NFJS?

There might be other Hightowers, but I did speak for a while on the NFJS circuit. I was as big as a house (fat). Hard to miss. I know you from Jackson. :) Someone said your name at work, and I said, who is that? Then they said cowboycoder, and I knew right away: "The guy who wrote Jackson". I think of you as cowboycoder first, and Tatu Saloranta second. I did read your blogs and posts and such wrt JSON, and have used Jackson exclusively until recently. :) I can honestly say that I am a fan of your work.

RE: The part about writeValueAsString() is really just about how StringWriter works, which like ByteArrayOutputStream keeps on doubling up, reallocating its buffer. If the end result is String, that isn't very optimal unless initial size guess is correct. With segments, one can reduce allocations. But it's probably not a huge deal in the end. Same is true for writeValueAsBytes, intermediate storage is segmented, and final allocation is done when total length is known, instead of doubling up until full size is known, and then making one more copy.

I did not know they did that, but I am not a very trusting person, so I wrote my own CharBuf and my own ByteBuf, which work like you say, which I think is how StringBuilder works underneath, or so I have heard. :) I was not trying to hamstring Jackson. I missed the writeValueAsString method when I was writing the benchmark. I have no excuse now, because I have created an interface that mirrors ObjectMapper and GSON (to a degree; the object mapping logic needs to be cleaned up, but it works). So Boon has a writeValueAsString too, and a factory that creates an interface similar to ObjectMapper and GSON. I figured the easiest way to document it was to mimic the 800 lb gorillas in the space.
…
```
Benchmark                                       Mode  Thr  Count  Sec        Mean  Mean error  Units
i.g.j.s.BoonPropertySerializer.roundTriper     thrpt    8      5    1  249719.367   19752.708  ops/s
i.g.j.s.BoonSerializer.roundTriper             thrpt    8      5    1  280467.817   27254.161  ops/s
i.g.j.s.JacksonSerializer.roundTriper          thrpt    8      5    1  189743.087    9706.744  ops/s
i.g.j.s.BoonPropertySerializer.serializeSmall  thrpt    8      5    1  711160.027  185676.527  ops/s
i.g.j.s.BoonSerializer.serializeSmall          thrpt    8      5    1  785138.997   38081.272  ops/s
i.g.j.s.JacksonSerializer.serializeSmall       thrpt    8      5    1  543163.990   31147.317  ops/s
```

I changed the test like I think you asked me to. I can't check it in right now because I am working on some other stuff that I don't want public yet, and I don't feel like merging, branching and what not (resource constrained wrt time), but it is what is in github plus these changes that I think you asked for. It seems in this configuration that Boon is faster for serialization of fields or properties. Boon does not have all of the features of Jackson, and if you enable additional features it is possible that Jackson will be faster.

```
import com.fasterxml.jackson.databind.ObjectMapper;
import org.openjdk.jmh.annotations.GenerateMicroBenchmark;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.logic.BlackHole;

import java.io.StringWriter;
import java.util.concurrent.TimeUnit;

import static org.boon.Boon.puts;

/**
 * Created by rick on 12/27/13.
 */
public class JacksonSerializer {

    private static final ObjectMapper serializer = new ObjectMapper();

    private Object serialize(AllTypes allTypes) throws Exception {
        allTypes.setMyLong(System.currentTimeMillis());
        return serializer.writeValueAsString(allTypes);
    }

    // Serialize to a String, then parse that String back into an object.
    private Object roundTrip(AllTypes alltype) throws Exception {
        String string = serializer.writeValueAsString(alltype);
        return serializer.readValue(string, AllTypes.class);
    }

    @GenerateMicroBenchmark
    @OutputTimeUnit(TimeUnit.SECONDS)
    public void serializeSmall(BlackHole bh) throws Exception {
        bh.consume(serialize(TestObjects.OBJECT));
    }

    @GenerateMicroBenchmark
    @OutputTimeUnit(TimeUnit.SECONDS)
    public void roundTriper(BlackHole bh) throws Exception {
        bh.consume(roundTrip(TestObjects.OBJECT));
    }
}
```

It looks like for the String case Boon is faster (for this particular test). byte[] coming up.
…
```
Benchmark                                        Mode  Thr  Count  Sec        Mean  Mean error  Units
i.g.j.s.BoonByteArraySerializer.serializeSmall    thrpt    8      5    1  799044.697   53333.092  ops/s
i.g.j.s.BoonSerializer.serializeSmall             thrpt    8      5    1  763242.507   39028.941  ops/s
i.g.j.s.JacksonSerializer.serializeSmall          thrpt    8      5    1  606863.367  118585.905  ops/s
i.g.j.s.JacksonByteArraySerializer.serializeSmall thrpt    8      5    1  514680.657   30010.829  ops/s
i.g.j.s.BoonSerializer.roundTriper                thrpt    8      5    1  314664.487    1792.830  ops/s
i.g.j.s.BoonByteArraySerializer.roundTriper       thrpt    8      5    1  282110.953   22238.562  ops/s
i.g.j.s.JacksonByteArraySerializer.roundTriper    thrpt    8      5    1  182594.197    5808.897  ops/s
i.g.j.s.JacksonSerializer.roundTriper             thrpt    8      5    1  161474.970   45330.686  ops/s
```

The results don't make a lot of sense for byte[]. It seems Jackson was slower for byte array serialization than it was for String serialization, which makes no sense, and Boon got faster for byte[] serialization than it did for char[] serialization, which really does not make sense. But those are the results. The delta is about the same between Jackson and Boon.
Jackson doing byte[]:

```
package io.gatling.jsonbenchmark.serialization;

import com.fasterxml.jackson.databind.ObjectMapper;
import org.openjdk.jmh.annotations.GenerateMicroBenchmark;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.logic.BlackHole;

import java.util.concurrent.TimeUnit;

public class JacksonByteArraySerializer {

    private static final ObjectMapper serializer = new ObjectMapper();

    private Object serialize(AllTypes allTypes) throws Exception {
        allTypes.setMyLong(System.currentTimeMillis());
        return serializer.writeValueAsBytes(allTypes);
    }

    // Serialize to a byte[], then parse those bytes back into an object.
    private Object roundTrip(AllTypes alltype) throws Exception {
        byte[] bytes = serializer.writeValueAsBytes(alltype);
        return serializer.readValue(bytes, AllTypes.class);
    }

    @GenerateMicroBenchmark
    @OutputTimeUnit(TimeUnit.SECONDS)
    public void serializeSmall(BlackHole bh) throws Exception {
        bh.consume(serialize(TestObjects.OBJECT));
    }

    @GenerateMicroBenchmark
    @OutputTimeUnit(TimeUnit.SECONDS)
    public void roundTriper(BlackHole bh) throws Exception {
        bh.consume(roundTrip(TestObjects.OBJECT));
    }
}
```

Now wait until you see the Boon way. You will see that this benchmark does not make sense unless the JVM is doing some sort of intrinsic operation in the background. Boon should suffer for this and it does not.
```
package io.gatling.jsonbenchmark.serialization;

import org.boon.json.JsonParser;
import org.boon.json.JsonParserFactory;
import org.boon.json.JsonSerializer;
import org.boon.json.JsonSerializerFactory;
import org.openjdk.jmh.annotations.GenerateMicroBenchmark;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.logic.BlackHole;

import java.nio.charset.StandardCharsets;
import java.util.concurrent.TimeUnit;

@State
public class BoonByteArraySerializer {

    private final JsonSerializer serializer = new JsonSerializerFactory().useFieldsOnly().create();
    private final JsonParser parser = new JsonParserFactory().create();

    // Boon serializes to chars first; the byte[] comes from an extra String -> UTF-8 conversion.
    private Object serialize(AllTypes alltype) throws Exception {
        alltype.setMyLong(System.currentTimeMillis());
        return serializer.serialize(alltype).toString().getBytes(StandardCharsets.UTF_8);
    }

    private Object roundTrip(AllTypes alltype) throws Exception {
        return parser.parse(AllTypes.class,
                serializer.serialize(alltype).toString().getBytes(StandardCharsets.UTF_8));
    }

    @GenerateMicroBenchmark
    @OutputTimeUnit(TimeUnit.SECONDS)
    public void serializeSmall(BlackHole bh) throws Exception {
        bh.consume(serialize(TestObjects.OBJECT));
    }

    @GenerateMicroBenchmark
    @OutputTimeUnit(TimeUnit.SECONDS)
    public void roundTriper(BlackHole bh) throws Exception {
        bh.consume(roundTrip(TestObjects.OBJECT));
    }
}
```

Here I only run the byte[] tests, to eliminate getting caught with whatever...

```
Benchmark                                        Mode  Thr  Count  Sec        Mean  Mean error  Units
i.g.j.s.BoonByteArraySerializer.serializeSmall    thrpt    8      5    1  773485.440   80583.732  ops/s
i.g.j.s.JacksonByteArraySerializer.serializeSmall thrpt    8      5    1  578652.557   21243.642  ops/s
i.g.j.s.BoonByteArraySerializer.roundTriper       thrpt    8      5    1  279642.553   78978.092  ops/s
i.g.j.s.JacksonByteArraySerializer.roundTriper    thrpt    8      5    1  202312.350    5241.249  ops/s
```

Notice that the roundTriper got a lot closer. And the mean of the roundTriper for boon is a lot higher.
Let me run just roundTriper.

```
Benchmark                                      Mode  Thr  Count  Sec        Mean  Mean error  Units
i.g.j.s.BoonByteArraySerializer.roundTriper     thrpt    8      5    1  309763.747    6562.801  ops/s
i.g.j.s.JacksonByteArraySerializer.roundTriper  thrpt    8      5    1  208100.320    9395.750  ops/s
```

Now just Jackson and then just Boon.

```
Benchmark                                      Mode  Thr  Count  Sec        Mean  Mean error  Units
i.g.j.s.JacksonByteArraySerializer.roundTriper  thrpt    8      5    1  201209.153   13437.474  ops/s
```

```
Benchmark                                      Mode  Thr  Count  Sec        Mean  Mean error  Units
i.g.j.s.BoonByteArraySerializer.roundTriper     thrpt    8      5    1  308146.663    4197.227  ops/s
```

It looks to me that for this set of tests Boon is faster. Take heart.

1) Boon does not have a streaming mode, so really large files are limited to available memory resources. Jackson does not have this limitation. A 2GB JSON file is not feasible with Boon. Boon is truly geared for REST/Websocket, not large streaming JSON.
2) Jackson has a lot more features than Boon. Boon has a subset of features.

I don't think the delta for InputStream or Reader will matter for this test. Boon has an optimized IO lib, and I can't see how we will be measuring anything but diluting the parser speed test. I base this on past experience of adding Reader, InputStream, etc. and benchmarking it. When I/O to disk is involved, the deltas are usually still there but harder to see due to the constant speed of the disk I/O. But I could be wrong. It did not matter much on the other tests that I have run against Jackson and Boon.

Also, one last thing: sorry if I said anything too caustic in my blog post. You have been extremely nice, and most nights I was working really late and not being as well behaved and professional as you are now. I honestly think the delta is largely due to the streaming buffer windowing logic, and just more logic due to more features of Jackson.
…
RE: Notice that the roundTriper got a lot closer. And the mean of the roundTriper for boon is a lot higher.

That should have read: Notice that the roundTriper got a lot closer. And the mean error of the roundTriper for Boon is a lot higher.

See how one word makes me look caustic versus self-deprecating. I read what I wrote and sometimes wish I could say that English is my second language, because the way I write sure looks like it is.
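
The remark about mean error can be made concrete with a small check on the figures reported above. This is a rough sketch that treats each result as the interval mean ± mean error, which is a simplifying assumption about how to read the JMH output rather than a proper statistical test; `ErrorBars` and `overlaps` are hypothetical names.

```java
/** Rough overlap check on benchmark results (assumes "mean ± mean error" intervals). */
public class ErrorBars {

    /** True if [meanA - errA, meanA + errA] and [meanB - errB, meanB + errB] intersect. */
    public static boolean overlaps(double meanA, double errA, double meanB, double errB) {
        return (meanA - errA) <= (meanB + errB) && (meanB - errB) <= (meanA + errA);
    }

    public static void main(String[] args) {
        // byte[]-only run: Boon vs. Jackson roundTriper (figures from the results above).
        System.out.println(overlaps(279642.553, 78978.092, 202312.350, 5241.249)); // prints true

        // roundTriper-only run: Boon vs. Jackson (figures from the results above).
        System.out.println(overlaps(309763.747, 6562.801, 208100.320, 9395.750));  // prints false
    }
}
```

Under that reading, the intervals overlap in the byte[]-only run, where Boon's mean error was large, and no longer overlap when roundTriper is run on its own.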
…