C#, Parallel deserialization of JSON stored in database

The scenario

A while back I had to yield entities constructed by deserializing JSON stored in a database. The first solution simply opened a single-result, sequential reader against the database, returning a one-column result set containing JSON. Each string was deserialized to the desired entity and then yield returned. Trying to tweak this, I turned to the Task Parallel Library. The idea was to read from the data reader in a separate task while the main thread, at the same time, deserializes the JSON strings and yields entities, so that deserialization overlaps with the database read.

The solution

First, let’s be clear. I’m not saying you should see this as a solution that fits all similar scenarios. In my case, reading the strings from the database was faster than deserializing them, but the difference wasn’t too big, so the BlockingCollection never grew large enough to cause much extra memory consumption. This is something you have to test and measure for your own needs, because everything depends on the scenario. How big is the JSON string? How many items are there? What does the infrastructure look like?

But let’s put that aside and have a look at the solution. The sourceData below comes from yielding the data reader, which represents a single-column result set where each row holds a JSON string. In a separate task I read from it and add each JSON string to a BlockingCollection, which by default wraps a ConcurrentQueue. At the same time, in the main thread, I take (dequeue) JSON strings from the collection through GetConsumingEnumerable and yield return each one deserialized.

When the reading from the database is done, the task completes, and the main thread goes on to deserialize whatever JSON strings are still waiting in the collection.

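// Namespaces needed: System.Collections.Concurrent, System.Collections.Generic, System.Threading.Tasks
public IEnumerable<T> DeserializeManyInParallel<T>(IEnumerable<string> sourceData) where T : class
{
	using (var q = new BlockingCollection<string>())
	{
		Task task = null;

		try
		{
			// Producer: reads the JSON strings on a separate task.
			task = new Task(() =>
			{
				try
				{
					foreach (var json in sourceData)
						q.Add(json);
				}
				finally
				{
					// Always signal completion, even if reading fails;
					// otherwise the consumer below would block forever.
					q.CompleteAdding();
				}
			});

			task.Start();

			// Consumer: deserializes on the calling thread, in the order the strings were read.
			foreach (var e in q.GetConsumingEnumerable())
				yield return JsonSerializer.DeserializeFromString<T>(e);
		}
		finally
		{
			if (task != null)
			{
				// Wait for the producer; a failure in it surfaces here as an AggregateException.
				Task.WaitAll(task);
				task.Dispose();
			}
		}
	}
}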
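
The sourceData argument is described above as coming from yielding the data reader. A minimal sketch of what that could look like follows. JsonSource and ReadJsonColumn are hypothetical names used only for illustration, and the sketch assumes the JSON sits in the first column as a non-null string.

```csharp
using System.Collections.Generic;
using System.Data;

public static class JsonSource
{
	// Hypothetical helper: lazily yields the first column of every row as a string.
	// Assumes column 0 holds the JSON and is never NULL.
	public static IEnumerable<string> ReadJsonColumn(IDataReader reader)
	{
		while (reader.Read())
			yield return reader.GetString(0);
	}
}
```

The result can be passed straight in as sourceData, for example DeserializeManyInParallel<Customer>(JsonSource.ReadJsonColumn(reader)), where Customer stands in for whatever entity type you deserialize to. Because the sequence is lazy, the reader is actually advanced on the producer task, so the reader and its connection have to stay open until the enumeration has finished.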

Again! Measure, test and try it in your own scenarios before accepting it as a solution.
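
If measuring shows that the reader outruns the deserializer by a wide margin, the collection grows and memory use goes up with it. One possible variation, which is not part of the solution above, is to give the BlockingCollection a bounded capacity so that the producer blocks when the queue is full. The sketch below assumes that approach. BoundedPipeline, Run, the capacity parameter and the cancellation handling are all illustrative additions.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedPipeline
{
	// Hypothetical helper: reads 'source' on a background task and applies
	// 'transform' on the calling thread, keeping at most 'capacity' items queued.
	public static IEnumerable<TOut> Run<TIn, TOut>(
		IEnumerable<TIn> source, Func<TIn, TOut> transform, int capacity)
	{
		using (var cts = new CancellationTokenSource())
		using (var queue = new BlockingCollection<TIn>(capacity))
		{
			var producer = Task.Run(() =>
			{
				try
				{
					foreach (var item in source)
						queue.Add(item, cts.Token); // blocks while the queue is full
				}
				catch (OperationCanceledException)
				{
					// The consumer stopped early, so stop reading.
				}
				finally
				{
					queue.CompleteAdding(); // lets the consumer loop finish
				}
			});

			try
			{
				// The default underlying ConcurrentQueue is FIFO, so with one
				// producer and one consumer the source order is preserved.
				foreach (var item in queue.GetConsumingEnumerable())
					yield return transform(item);
			}
			finally
			{
				cts.Cancel();    // unblocks a producer that is waiting in Add
				producer.Wait(); // rethrows a producer failure as AggregateException
			}
		}
	}
}
```

A call could look like BoundedPipeline.Run(sourceData, json => JsonSerializer.DeserializeFromString<T>(json), 1000), where 1000 is an arbitrary capacity to tune by measuring. The cancellation token matters once the queue is bounded: without it, a caller that stops enumerating early could leave the producer blocked in Add and the finally block waiting on it forever.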

//Daniel

4 thoughts on “C#, Parallel deserialization of JSON stored in database”

    • Hi,

      I want to be sure that the order of the items is kept intact, so that an ORDER BY clause supplied by the user doesn’t get messed up. Furthermore, I don’t want to hold everything in memory but instead yield the items back as they are read.

      //Daniel
