Filed under Entity storage

SisoDb v9.0 Released.

First, let’s make one thing clear: “No, I’m not chasing Google Chrome versioning!” The reason is that I’m trying to follow Semantic Versioning, and if there is a change that makes SisoDb more flexible and more performant and it has been asked for, it will go into the codebase. I try to have frequent releases, and if it’s a breaking change, the major version is bumped.

Before going into the meat and the reason why version 9.0 has been released, let’s look at some other stuff. For source code, documentation etc., go to: http://sisodb.com

Denali

The SQL 2008 provider has now been tested against Sql-Server 2012 Express RC0 (Denali) and it works. There will be a separate provider, “SisoDb.Sql2012”, for this, but if you want a head start, e.g. to try out the LocalDb features, you can already use the SQL 2008 provider.

Version 9.0, the Changes

This new version, v9.0, has had two major focus areas:

  • New layout in database for indexes-values
  • Adapt API for coming providers not having Transactions

New layout in database for indexes-values

One major focus has been to rewrite how the key-values that are used for queries are stored. This is the result of feedback from the community. As a step towards better querying performance, each Indexes table has been divided into seven tables, grouping data of the same type together. By doing this, more effective indexes can be designed to speed up queries while retaining insert speed. I know it sounds like a lot of tables, but hey, Siso should keep you out of the database. And for those of you who are afraid of limitations, you can read here (http://msdn.microsoft.com/en-us/library/ms143432.aspx) that tables count as objects and that the total number of objects in a SQL2008 database can be as high as 2,147,483,647.

More to come for queries and inserts

There are plans for making it possible for you to hook in a caching implementation, so that queries don’t have to touch the database. In a forthcoming release, there will be focus on moving the process of inserting values into the Indexes-tables to a background process. Hence, upon inserting items, the structure (document) will be inserted transactionally and then the indexes will be queued and inserted in the background. Having partitioned each Indexes-table into several tables, it will be easier to accomplish parallel inserts.

Text vs String

When you design your model and use string properties, there are, as of version 9, two different kinds of strings. Either you use the normal string BCL type or the custom Text type found in SisoDb. Values from the former will end up in the [Entity]Strings table and values from the latter in the [Entity]Texts table. Strings have a max length of 300 chars while Texts don’t. This semantic separation is done so that effective indexes for queries can be created for normal strings, which wouldn’t be feasible if the column were nvarchar(max).

The Text type in SisoDb is implicitly convertible to and from a string, so you don’t have to use it explicitly other than as a marker on your entity property.

Sample

public class BlogPost
{
    public Guid Id { get; set; }
    public string Title { get; set; } //Ends up in BlogPostStrings
    public Text Content { get; set; } //Ends up in BlogPostTexts
}

var post = new BlogPost 
{
    Title = "A title of max 300 chars",
    Content = "Some long text that can exceed 300 chars."
};
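To illustrate how a marker type can be implicitly convertible to and from a string, here is a minimal sketch. It is not SisoDb’s actual Text implementation; the class name LongText and its members are made up for this example.

```csharp
using System;

// Hypothetical marker type for illustration; NOT SisoDb's actual Text class.
public sealed class LongText
{
    private readonly string _value;

    public LongText(string value)
    {
        _value = value;
    }

    // string -> LongText, so a plain string can be assigned to a LongText property.
    public static implicit operator LongText(string value)
    {
        return value == null ? null : new LongText(value);
    }

    // LongText -> string, so the value can be used wherever a string is expected.
    public static implicit operator string(LongText text)
    {
        return text == null ? null : text._value;
    }

    public override string ToString()
    {
        return _value;
    }
}
```

With the two operators in place, assigning a string literal to a LongText property and reading it back into a string variable both compile without any explicit cast, which is the behavior the sample above relies on.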

Adapt API for coming providers not having Transactions

In coming releases, v9.x, SisoDb will have providers that don’t support transactions the way an RDBMS does. Having UnitOfWork and UnitOfWork.Commit in these providers would not make sense and would probably lead to confusion. Therefore UnitOfWork has changed its name to WriteSession and the QueryEngine is now named ReadSession. Furthermore, the Commit() method has been removed and auto-commit behavior is instead used on WriteSession.Dispose(); hence a WriteSession will still be transactional when you target e.g. Sql2008 and SqlCe4. Note! A commit will only be performed if no exception has been encountered.

Old code

using(var unitOfWork = db.CreateUnitOfWork())
{
    unitOfWork.Insert(x);
    unitOfWork.Insert(y);
    unitOfWork.Update(z);
    unitOfWork.Commit();
}

using(var q = db.CreateQueryEngine())
{
    var r = q.Query<Customer>().Where(c => c.CustomerNo == "123456").SingleOrDefault();
}

New code

//As long as the underlying provider supports it, a Write session is still transactional.
using(var session = db.BeginWriteSession())
{
    session.Insert(x);
    session.Insert(y);
    session.Update(z);
} //A Write session is implicitly being committed on Dispose().

using(var q = db.BeginReadSession())
{
    var r = q.Query<Customer>().Where(c => c.CustomerNo == "123456").SingleOrDefault();
}
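The “commit only if no exception has been encountered” rule can be sketched without SisoDb. The example below uses a made-up FakeWriteSession and a helper that commits when the work completes and rolls back when it throws. How SisoDb itself detects a failure inside Dispose() is not covered in this post, so treat this purely as an illustration of the semantics, not of the implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a transactional session; not a SisoDb type.
public sealed class FakeWriteSession
{
    private readonly List<string> _pending = new List<string>();
    private readonly List<string> _store;

    public FakeWriteSession(List<string> store)
    {
        _store = store;
    }

    // Buffers the item until Commit or Rollback is called.
    public void Insert(string item)
    {
        _pending.Add(item);
    }

    public void Commit()
    {
        _store.AddRange(_pending);
        _pending.Clear();
    }

    public void Rollback()
    {
        _pending.Clear();
    }
}

public static class WriteSessionRunner
{
    // Runs the work and commits only if it completes without throwing.
    public static void Run(List<string> store, Action<FakeWriteSession> work)
    {
        var session = new FakeWriteSession(store);
        try
        {
            work(session);
            session.Commit();
        }
        catch
        {
            session.Rollback();
            throw;
        }
    }
}
```

Calling WriteSessionRunner.Run with a delegate that inserts and returns normally leaves the items in the store; a delegate that throws leaves the store untouched and lets the exception propagate.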

Other changes

There have also been some other minor adjustments for v9.0, and since SisoDb has evolved rather rapidly lately, chances are that you have missed some features and changes from earlier releases; hence some of them are covered below as well.

DbSchemaNamingPolicy

You can now control the global naming of your structures by registering a specific Func against the static DbSchemaNamingPolicy class:

DbSchemaNamingPolicy.StructureNameGenerator =
    schema => string.Concat("MyPrefix", schema.Name);

UpdateMany

UpdateMany has been rebuilt to take care of update-many operations and not migration operations. As a step in this, you now have to provide a predicate, and UpdateManyStatuses isn’t used anymore to control whether an item should be kept or not. Also, the UpdateMany method of the UnitOfWork (now WriteSession) taking an old type and a new type has been removed. To get this functionality, there’s a DbStructureSetMigrator you can use instead.

var migrator = Database.GetStructureSetMigrator();
migrator.Migrate<ModelComplexUpdates.Person, ModelComplexUpdates.SalesPerson>((p, sp) =>
{
	var names = p.Name.Split(' ');
	sp.Firstname = names[0];
	sp.Lastname = names[1];

	var address = p.Address.Split(
            new[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);

	sp.Office.Street = address[0];
	sp.Office.Zip = address[1];
	sp.Office.City = address[2];

	return StructureSetMigratorStatuses.Keep;
});

Custom non generic collections

Actually, this was released as a patch to v8 but I thought it was worth mentioning. As of now you can have custom, non-generic collections.

public class MyModel
{
    public Guid Id { get; set; }
	public MyCollection Items { get; set; }
}

public class MyItem
{
    public int Value { get; set; }	
}

public class MyCollection : List<MyItem> {}

Custom naming of the Id-property

To clean up your models, as of v8 you are no longer forced to name the Id-property “StructureId”. SisoDb will look for the following names:

  • StructureId
  • [TypeName]Id
  • Id
  • I[InterfaceName]Id
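A convention-based lookup like this can be sketched with plain reflection. The snippet below is an assumption about how such a lookup could work, not SisoDb’s code: it tries the first three names in the order listed and leaves out the interface-based name, since its exact rule isn’t spelled out here. IdPropertyFinder and SampleOrder are hypothetical names.

```csharp
using System;
using System.Reflection;

public static class IdPropertyFinder
{
    // Returns the first public instance property matching one of the
    // candidate names, or null if none of them exists on the type.
    public static PropertyInfo Find(Type type)
    {
        var candidates = new[] { "StructureId", type.Name + "Id", "Id" };

        foreach (var name in candidates)
        {
            var property = type.GetProperty(name, BindingFlags.Public | BindingFlags.Instance);
            if (property != null)
                return property;
        }

        return null;
    }
}

// Hypothetical model used only to exercise the lookup.
public class SampleOrder
{
    public Guid SampleOrderId { get; set; }
    public Guid Id { get; set; }
}
```

For SampleOrder, the lookup picks SampleOrderId, because the [TypeName]Id candidate is tried before the plain Id candidate.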

Machine name specific connectionstrings

This is also a v8 feature: you can now have connection string names prefixed with a machine name. When you provide a connection string name, Siso will first try to find one with the value you pass, but prefixed with the machine name and an underscore (MachineName_).

So if you pass “MyConnectionStringName” it will try and find:

  • MachineName_MyConnectionStringName
  • MyConnectionStringName
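The lookup order above can be sketched as follows. This is a minimal illustration, assuming the connection strings are available as a simple dictionary (standing in for the configuration section); ConnectionStringLookup is a hypothetical helper, not part of SisoDb.

```csharp
using System;
using System.Collections.Generic;

public static class ConnectionStringLookup
{
    // Returns the machine-specific entry if present, otherwise the plain one.
    // The dictionary stands in for the application's connection string configuration.
    public static string Resolve(
        IDictionary<string, string> connectionStrings,
        string name,
        string machineName)
    {
        string value;

        if (connectionStrings.TryGetValue(machineName + "_" + name, out value))
            return value;

        if (connectionStrings.TryGetValue(name, out value))
            return value;

        throw new KeyNotFoundException("No connection string named '" + name + "' was found.");
    }
}
```

In a real application the machineName argument would typically be Environment.MachineName, so the same configuration file can hold a shared default plus per-developer overrides.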

Case-sensitive collations

I have tested it against a database with a case-sensitive collation setting, and all tests now pass. There was some SQL where members were written in the wrong casing, but that has now been fixed.

Removed fields

To make the database schema friendlier for manual updates, the hashed value representing an entity has been removed. The SQL queries have also been rewritten so that the RowId columns are no longer needed.

Misc

There have also been some bug fixes and performance tweaks regarding parallel generation of Ids, structures etc.

Migrate

Take a backup of your database before porting! Below is some info about the changes made to the storage layout, which could be useful. For more info just contact me.

SisoDbIdentities table

The EntityHash column has been dropped and primary-key is instead EntityName. New layout is:

Structures tables

The RowId column and any associated index has been dropped. New layout is:

Indexes tables

You should be able to just make the adjustments to the other tables and then re-save the structures. That should generate the new indexes tables and put values in them. But again, make sure you have a backup.

For manual porting
The values from the indexes tables need to be inserted into each dedicated indexes-table. So values stored in e.g. the CustomerIndexes IntegerValue column should be stored in the CustomerIntegers Value column. Note that all values (depending on the version of SisoDb) have a representation of their value in the StringValue column. Hence, when moving values from the StringValue column to CustomerStrings or CustomerTexts, you should only move the ones that represent a string, which should be the ones that have null in every other value column.

Uniques tables

As with the Indexes tables, if you drop this table and re-save all structures, the table should be recreated and populated correctly.

Other than that the only change that has been made is that the RowId column has been dropped. New layout is:

That’s it.

//Daniel


SisoDb reached v4.0.0 – now supports SqlCe4

SisoDb has now reached v4.0.0 and it has gone through some heavy changes. Let’s take a look at the release notes.

Release notes

v4.0
Major rewrite. The generation of StructureSchemas and Structures (graph of indexes) now lives in project: PineCone. http://github.com/danielwertheim/pinecone
The querying table (Indexes) is now key-value which will open up for better querying support.
Reworked internals to become more generic to simplify writing of additional providers.

[Fixed]     A connection could sometimes get left in an open state.
[Fixed]     Bug when Including other Json documents in the same result (Merged Documents).
[Updated]   ServiceStack.Text updated from v2.2.7 to v3.0.0.
[Updated]   SisoId is now StructureId both in classes as well as in storage schema.
[Updated]   Storage schema for Indexes are now key-value.
[Updated]   Storage schema for Uniques now have UqMemberPath instead of UqName
[Updated]   All members of QueryEngine and UnitOfWork taking Enumerable of Ids, now takes param array of ids instead. E.g. GetByIds, DeleteByIds etc
[Updated]   StructureSetUpdater should not be used individually, but via Database.UpdateStructureSet.
[Updated]   The way Uniques are handled has been changed. The property marked as unique is turned into a hashed string before it is stored. That way we get strings of uniform length.
[Updated]   As a security precaution, DeleteByIdInterval is now only supported when you use Identities and not Guids.
[New]       SqlCE4 support, except for TransactionScopes. Support is on its way. NORMAL TRANSACTIONS ARE OF COURSE SUPPORTED THOUGH!
[New]       The Id (previously SisoId, now StructureId) now supports Guid, int, long and Nullable<Guid>, Nullable<int>, Nullable<long>.
[New]       HashSets and Dictionaries are now supported.
[New]       Extension points for QueryEngine and UnitOfWork now exist on ISisoDataBase via ReadOnce() and WriteOnce(). These simplify working with QueryEngine/UnitOfWork BUT SHOULD ONLY BE USED WHEN DOING ONE OPERATION AGAINST QueryEngine or UnitOfWork.
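The idea behind the hashed Uniques values, that any input becomes a string of the same length, can be sketched like this. The choice of SHA-1 and hex encoding is an assumption made for the example; the release notes don’t say which hash SisoDb uses. UniqueValueHasher is a hypothetical helper.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class UniqueValueHasher
{
    // Returns a 40-character hex string regardless of the input length
    // (SHA-1 produces 20 bytes, each rendered as two hex characters).
    public static string Hash(string value)
    {
        if (value == null) throw new ArgumentNullException("value");

        using (var sha1 = SHA1.Create())
        {
            var bytes = sha1.ComputeHash(Encoding.UTF8.GetBytes(value));
            return BitConverter.ToString(bytes).Replace("-", string.Empty);
        }
    }
}
```

Because the stored value has a fixed length, the column holding it can be given a fixed size and indexed cheaply, no matter how long the original property value is.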

It’s not backwards compatible, hence v.4.0.0

No, it’s not backwards compatible. The underlying database schema, which previously had a flattened view of your structures, is now key-value based. Why? Well, I want the ability to construct SQL queries that support nested enumerables etc. Of course, the key-value solution will cause a performance hit, but if that’s really critical to you, you can turn off indexing for the members you don’t query on.
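To make the key-value idea a bit more concrete, here is a minimal sketch of turning an object graph into (member path, value) pairs, where members inside collections get “[]” in their path. It is an illustration only and not how SisoDb or PineCone does it; KeyValueFlattener is a hypothetical name, and the sketch does not guard against cyclic graphs.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Reflection;

public static class KeyValueFlattener
{
    // Walks public instance properties and returns (member path, value) pairs
    // for simple values. Items of a collection share the path "Member[]".
    public static List<KeyValuePair<string, object>> Flatten(object root)
    {
        var result = new List<KeyValuePair<string, object>>();
        Visit(root, "", result);
        return result;
    }

    private static void Visit(object node, string path, List<KeyValuePair<string, object>> result)
    {
        if (node == null) return;

        var type = node.GetType();
        if (IsSimple(type))
        {
            result.Add(new KeyValuePair<string, object>(path, node));
            return;
        }

        var items = node as IEnumerable;
        if (items != null)
        {
            foreach (var item in items)
                Visit(item, path + "[]", result);
            return;
        }

        foreach (var property in type.GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            if (property.GetIndexParameters().Length > 0) continue; // skip indexers

            var childPath = path.Length == 0 ? property.Name : path + "." + property.Name;
            Visit(property.GetValue(node, null), childPath, result);
        }
    }

    private static bool IsSimple(Type type)
    {
        return type.IsPrimitive || type.IsEnum || type == typeof(string)
            || type == typeof(decimal) || type == typeof(DateTime) || type == typeof(Guid);
    }
}
```

With one row per pair, a nested collection simply produces several rows with the same path, which is what makes querying nested enumerables possible where a one-row-per-structure layout cannot hold them.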

I will start updating the documentation shortly at SisoDb.Com

//Daniel


SisoDb – v3.0 – Now with specific provider assemblies

Since v3.0 SisoDb has specific provider assemblies. This is the result of me starting work on a provider for SQLCE4. As a result, the NuGet-package for the SQL2008 provider is now located at: http://nuget.org/List/Packages/SisoDb.SQL2008 The old package will remain public for a while.

The biggest change is that the SqlDbFactory is now called Sql2008DbFactory. Read the docs for more info: http://sisodb.com/Docs/Doc19

Fixed bug
QxAny (http://sisodb.com/Docs/Doc7) can now be used on nested enumerables, but it only supports the operators == and !=. If an unsupported operator is used, a SisoDbException will be thrown.

[Test]
public void Query_WithQxAny_WhenQxAnyOnQxAnyIsUsedWithAndNotEqOperator_ReturnsCorrectMatch()
{
    var roots = CreateRoots();

    using (var uow = Database.CreateUnitOfWork())
    {
        uow.InsertMany(roots);
        uow.Commit();

        var refetched = uow.Where<Root>(m => m.Child.Items.QxAny(i => i.Values.QxAny(i2 => i2 != 42 && i2 != 142))).ToList();

        Assert.AreEqual(1, refetched.Count);
        Assert.AreEqual("ChildTwo", refetched[0].Child.Name);
        Assert.AreEqual(20, refetched[0].Child.Items[0].Value);
        Assert.AreEqual(21, refetched[0].Child.Items[1].Value);
    }
}

private static IList<Root> CreateRoots()
{
    var roots = new[] { new Root(), new Root() };

    roots[0].Child.Name = "ChildOne";
    roots[0].Child.Items.Add(new Item { Value = 10 });
    roots[0].Child.Items.Add(new Item { Value = 11, Values = new[] { 42, 142 } });

    roots[1].Child.Name = "ChildTwo";
    roots[1].Child.Items.Add(new Item { Value = 20 });
    roots[1].Child.Items.Add(new Item { Value = 21, Values = new[] { 43, 143 } });

    return roots;
}

public class Root
{
    public Guid SisoId { get; set; }

    public FirstLevelChild Child { get; set; }

    public Root()
    {
        Child = new FirstLevelChild();
    }
}

public class FirstLevelChild
{
    public string Name { get; set; }

    public List<Item> Items { get; set; }

    public FirstLevelChild()
    {
        Items = new List<Item>();
    }
}

public class Item
{
    public int Value { get; set; }

    public int[] Values { get; set; }
}

//Daniel


Finally some criticism of SisoDb!

Finally some criticism of SisoDb: http://www.code972.com/blog/?p=201 It means that someone has looked at SisoDb and taken a stand on whether they like it or not. For me it’s a great source of input for making things better, and for you it’s a reason to rethink whether you can or should use it. First, let’s be clear. No, SisoDb is not a traditional NoSQL solution, and as the title of its official page says, it’s “a NoSQL’ish .Net implementation for SQL Server”, which should get you thinking right there. If you read the docs (http://sisodb.com/docs) you will find information about why it exists.

A while back I started to fiddle with Microsoft’s CTP edition of code first in Entity framework 4. The product is great but I wanted something else; I wanted something “more” schemaless. I turned to MongoDB and put together an open source driver targeting .Net 4.

Still, there were things that I didn’t like, so I built a document DB over Lucene. I relatively quickly discovered that I missed all the great infrastructure that SQL-server provides: security, replication, scheduler etc. So I prototyped a solution that uses JSON to create a document/structure provider over SQL-server, namely: Simple Structure Oriented Db (SisoDb).

So, I just missed SQL-Server and thought it would be interesting to see if a document-oriented solution could be put together for SQL-server. Why SQL-Server and not MySQL, Oracle, SQLite etc.? Well, I use SQL-Server in my daily job and I had to start somewhere.

What it tries to solve in that sense is getting away from complex joins, high normalization etc., but as of right now it’s missing things like sharding; hence you should look at it more as a “document oriented provider for SQL Server”.

The development of SisoDb has never been driven by replacing any existing solution. If I were to go with a purer, accepted document-oriented NoSQL solution, I would go with MongoDb or RavenDb. If I had a case leaning towards key-value, I would go with a key-value oriented data storage solution. If I needed a highly denormalized table oriented environment with good convention based mapping and complex joins in queries, I would go with Entity framework 4.1 and never NHibernate. But this is my personal taste.

Schemaless in SQL-server

Of course not. Again, it should get you thinking when you see words like schemaless or schema-free in any data storage solution, but even more so when they are used in conjunction with an RDBMS tool.

In SisoDb, the impedance mismatch between your object model and your data model is something it “tries” to keep from being your concern. This is done by looking at object graphs as documents and by handling the adding and dropping of columns when properties in your code model are introduced or dropped. When it comes to more complex changes you will have to call an UpdateStructureSet method, where you CAN provide transformation code going from an old model version to a new one. So yes, there is some effort there for you, but it’s not about updating existing mappings (either XML-based or C#) and then running synchronizing Transact-SQL. Read more here: http://sisodb.com/docs/doc13. And yes, it will need to reinsert your data since it needs to touch the JSON.

By dealing with changes in this way, you can create separate assemblies holding the old model versions and showing you in code how the model has been updated.

Id vs SisoId

OK, someone told me programmers are going to go nuts over this, so I guess I’ll have to explain the decision, especially since I first used “Id” as the name. I want to be able to take entities from some other application domain and still keep their identity. Since a lot of people use the name Id, that name would “be taken”, and since I don’t want the user to be able to provide mappings, SisoId it is. But if you really need it, I guess you could add a property named Id in your class, pointing at SisoId? Or you could send me a request for supporting it.

Int and Guids for SisoId

The reason SisoId is limited to ints and guids, and not an n-datatype, is that I don’t want someone specifying a killer long string that is then used in a join or in the page-index files stored in SQL server. So it’s about performance, which is also why I use sequential GUIDs (http://sisodb.com/docs/doc1). You could still add a constraint via an attribute and state that public string OrderNo {get;set;} should be enforced to be unique; that’s what the Uniques-table is for.

Joins

No, there are no F.K. constraints or relationships used in the schema design of the tables. When querying using the Where or Query syntax in SisoDb, SQL queries are executed against the query-table (Indexes-table), which is joined with the Structure-table on the primary key of both tables, and the JSON stored in the Structure-table is returned. But no, there’s no physical relation in the database. That’s really easy to fix if you need it. You are not going to need it to uphold referential integrity, though, since SisoDb uses the StructureId in the related tables, and since every insert is done in a transaction, you will not end up in an inconsistent state.

Create databases

Well, of course it can help you run CREATE DATABASE, but you need to execute it under an account that has the rights to do this. I’m doing it in the integration tests all the time using Db.EnsureNewDatabase.

Deep hierarchies

It does support persisting them, and it does support querying nested items, at least.

Querying

The querying is done against the flattened Querying-table (Indexes-table), and as of now it’s up to you to provide indexes on it to boost query performance. Of course, when an aggregate root contains collections of other classes/types, denormalization will happen, since the complete graph should be persisted in one row. Queries using e.g. QxAny against collections of contained complex types will use LIKE queries to match e.g. productno in a string looking like this: <1><2>

When putting together an example for showing query performance, I found a bug (thank you for reporting it). Queries like Customer.Address.AddionalValues[].Value are affected by it, but this will be corrected soon. As of now you should be able to query Customer.Address.ZipCode as well as nested collections, e.g. Order.OrderLines[].ProductNo (http://sisodb.com/Docs/Doc11)

I just did a quick test, querying a database with the same model as in the insert tests. There were 100.000 customers and I made two queries:

  • on Customer.CustomerNo:
  • on Customer.DeliveryAddress.ZipCode:

100.000 Customers – Identifying by customer.CustomerNo (int)
#1 Total seconds: 0.3312
#2 Total seconds: 0.0337

First execution takes longer since the query plan should be created and cached.

100.000 Customers – Identifying by customer.DeliveryAddress.ZipCode (string)
#1 Total seconds: 0.3371
#2 Total seconds: 0.0519

First execution takes longer since the query plan should be created and cached.

I will fix the bug above and write a more detailed querying comparison that shows what’s going on when querying, as well as timings with an actual F.K. relation in the database.

So why SisoDb?

Again, it’s not trying to be a silver bullet or to replace existing technologies. And if you hate it, hate it with all your heart. If you find a use-case where it works for you... great! Then it wasn’t just a fun project after all. If you need a pure NoSQL solution, have a look at e.g. MongoDB or RavenDB.

//Daniel


What about tweaking in SisoDb?

This could have been the shortest post I have ever written, since the question could be answered with a simple one-liner:

There are no tweaks

The whole point of SisoDb is to be simple and performant out of the box. It will never target a multitude of different scenarios that each require adaptations to perform well.

With that said, there are things you should take into consideration.

Effectively work with references between documents

A document/structure has no relations. Everything included in it will get serialized and stored. The root is indicated by adding this one member to it: public Guid|int SisoId { get; set; }, saying “this is a document and I want to be able to store it”. You can of course let your document contain other complex types/classes, both with and without the SisoId member. But there’s a huge difference.

Given an Order:

public class Order
{
	public Guid SisoId { get; set; }
}

If I add some simple attributes, it’s easy to understand that these will belong to the document.

public class Order
{
	public Guid SisoId { get; set; }
	
	public string OrderNo { get; set; }
	
	public DateTime CreatedAt { get; set; }
	
	public DateTime? ShippedAt { get; set; }
}

But what if I want a Customer linked to my Order? First, look at the Order as a pile of documents with the title “Order”. When you write a document and want to reference someone else’s writings, you put in a reference to it by providing information on where to find it, using e.g. footnotes. The same thing goes here. Include CustomerId in the Order.

public class Order
{
	...
	public Guid CustomerId { get; set; }
	...	
}

public class Customer
{
	public Guid SisoId { get; set; }
	
	public string CustomerNo { get; set; }
	...
	...
}

Now you have made a connection saying: “My Order documents can point to Customer documents”. What if I want to have a Customer instance in the Order, since I might not always want to fire off new queries to fetch the Customer for a certain Order? Easy, just add a property for the Customer.

public class Order
{
	...
	public Guid CustomerId
	{
		get { return Customer.SisoId; }
		private set { Customer.SisoId = value; }
	}
	
	public Customer Customer { get; set; }
	
	public Order()
	{
		Customer = new Customer();
	}
}

Now, when you store an Order document, the Customer will not be stored, since SisoDb knows that Customer is a document living on its own, because it has the member public Guid SisoId { get; set; }. The CustomerId will get stored, and this can be used when you are querying orders. What you can then do is say: “Hey, load Orders and include Customers.”

using(var uow = db.CreateUnitOfWork())
{
	var orders = uow.Query<Order>(q =>
				     q.Include<Customer>(order => order.CustomerId));
}

What SisoDb will do is incorporate the JSON of the included documents/structures in the same query and JSON-result. Hence you will not get any extra roundtrips, and you have gotten yourself a fully loaded Order.

Control what to make Queryable

By default every simple property is extracted from your documents/structures and made queryable in SisoDb. This is of course something that you can control, and if you have a deep graph with lots of members but only query on a few of them, you will get better performance by only making these fields queryable, which means that the Indexes table for that document type will get smaller.

db.StructureSchemas.StructureTypeFactory.Configurations.NewForType<Customer>().OnlyIndexThis(c => c.CustomerNo);

That’s it for now. Now I’m going to tweak SisoDb for you, so you don’t have to ;-)

//Daniel


SisoDb now lets you query without transactions

Before v2.0 of SisoDb (http://sisodb.com) you could only query using a UnitOfWork. All UnitOfWorks are transactional, and in some cases you might want to perform queries in the same transaction as your inserts, updates and deletes. But if you just want to query, you should use the new QueryEngine class.

using(var qe = database.CreateQueryEngine())
{
    customers = qe.Where<Customer>(c => c.Lastname == "Andersson");
}

//Daniel


SisoDb now supports TransactionScope

When using a UnitOfWork in SisoDb you are using traditional ADO.Net transactions. Let’s have a look at an example:

using(var unitOfWork = dataBase.CreateUnitOfWork())
{
    unitOfWork.InsertMany(customers);
    unitOfWork.Commit();
}

This is all fine, but suppose you want to control several UnitOfWorks. Well, the obvious solution is to use the TransactionScope class. Since we are targeting SQL Server 2008, we also don’t want to escalate to a distributed transaction, even if multiple connections are opened targeting the same DB using the same connection string. This is now supported. Hence you can do things like:

using(var ts = new TransactionScope())
{
    using(var unitOfWork = dataBase.CreateUnitOfWork())
    {
        unitOfWork.InsertMany(customers);
        unitOfWork.Commit();
    }

    ts.Complete();
}

The UnitOfWork will check if there is an ongoing Transaction from a TransactionScope, and if there is, no ADO.Net transactions will be created and the Commit and Rollback of the UnitOfWork is left to the outer TransactionScope.
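The check for an ongoing TransactionScope can be sketched with the ambient transaction exposed by System.Transactions. The class below is a made-up stand-in, not SisoDb’s UnitOfWork; it only records whether it would have to create its own transaction.

```csharp
using System;
using System.Transactions;

// Hypothetical unit of work for illustration; not SisoDb's implementation.
public sealed class SketchUnitOfWork : IDisposable
{
    // True when no ambient transaction existed at creation time, meaning
    // this unit of work would have to manage its own ADO.NET transaction.
    public bool OwnsTransaction { get; private set; }

    public SketchUnitOfWork()
    {
        // Transaction.Current is non-null inside an active TransactionScope.
        OwnsTransaction = Transaction.Current == null;
    }

    public void Commit()
    {
        if (OwnsTransaction)
        {
            // A real implementation would commit its own ADO.NET transaction here.
        }
        // Otherwise the outcome is left to the outer TransactionScope.
    }

    public void Dispose()
    {
    }
}
```

Created outside any TransactionScope, the sketch reports that it owns its transaction; created inside a using block around a TransactionScope, it reports that it does not, and Commit() becomes a no-op that defers to ts.Complete().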

You can of course have multiple UnitOfWorks as well:

using(var ts = new TransactionScope())
{
    using(var unitOfWork = dataBase.CreateUnitOfWork())
    {
        unitOfWork.InsertMany(customers1);
        unitOfWork.Commit();
    }

    using(var unitOfWork = dataBase.CreateUnitOfWork())
    {
        unitOfWork.InsertMany(customers2);
        unitOfWork.Commit();
    }

    ts.Complete();
}

Hope it helps you.

//Daniel


SisoDb vs Entity Framework 4.1 Code first – Inserts

Before making any performance comparisons I just want to state the following:

I like Entity Framework Code first and I don’t see SisoDb as a complete replacement. SisoDb should be seen as a complement. Both tools have their place. EF being an O/RM and SisoDb being a document-oriented provider.

With that said, let’s continue.

SisoDb – Simple Structure Oriented Db

I will not get into any details of what SisoDb is. If you are interested in knowing more I recommend the following:
SisoDb – Overview
http://sisodb.com/Docs/Doc0

Overview of the internal workings of SisoDb
http://daniel.wertheim.se/2011/04/14/overview-of-the-internal-workings-of-sisodb/

For this post, just keep in mind that SisoDb is a document-oriented (in a NoSQL way of seeing things) storage provider sitting on top of SQL Server. It’s not an O/RM.

Entity framework 4.1 – Code first

It’s being called the magical unicorn, but I think it’s fair to say that EF Code first is nothing more than Microsoft’s first decent O/RM giving you a model first experience in .Net, much like the one NHibernate has been giving for years.

They are not the same

With EF you get a lot of O/RM features like first level caching with identity maps, change tracking etc. It gives you a normalized database, so that your object graphs get stored in several tables. These tables need to be joined both to construct queries and to return the resulting data for reconstructing your entities. This is the normal case for traditional relational database models used in an RDBMS.

With SisoDb you get simplicity. You get object graphs stored as documents, getting rid of all those joins. There’s no way to provide mappings, and by using JSON you can of course work with base-classes, interfaces etc. You can also include/reference other documents, which are returned in the same resultset as the main query. You can see each document/structure as an isolated store with no relations.

For dealing with JSON, SisoDb relies on a very fast library from ServiceStack. More info about the performance of this lib can be found here.

Testing environment

Both the application and the database (SQL Server developer edition) are executed on the same laptop, with 6GB RAM, an i7 processor and an SSD disk, running Windows 7 Ultimate 64-bit. The application was executed from within VS2010 in release mode and without a debugger.

Performance – Simple inserts

For these simple inserts we will have a model looking like this:

Model for Simple inserts

In the aggregate root Customer there’s one difference between SisoDb and EF. In SisoDb the property must by convention be named “SisoId”. In EF, this can be mapped, but to get the convention support I’ll use “Id”.

[Serializable]
public class Customer
{
    //public int SisoId { get; set; }

    //public int Id { get; set; }

    public int CustomerNo { get; set; }

    public string Firstname { get; set; }

    public string Lastname { get; set; }

    public ShoppingIndexes ShoppingIndex { get; set; }

    public DateTime CustomerSince { get; set; }

    public Address BillingAddress { get; set; }

    public Address DeliveryAddress { get; set; }

    public Customer()
    {
        ShoppingIndex = ShoppingIndexes.Level0;
        BillingAddress = new Address();
        DeliveryAddress = new Address();
    }
}

[Serializable]
public class Address
{
    public string Street { get; set; }

    public string Zip { get; set; }

    public string City { get; set; }

    public string Country { get; set; }

    public int AreaCode { get; set; }
}

[Serializable]
public enum ShoppingIndexes
{
    Level0 = 0,
    Level1 = 10,
    Level2 = 20,
    Level3 = 30
}

Scenarios

  • 1.1) Insert 1000 customers – Take 1 – without any optimizations
  • 1.2) Insert 1000 customers – Take 2 – with some optimizations
  • 2.1) Insert 10000 customers – Take 1 – without any optimizations
  • 2.2) Insert 10000 customers – Take 2 – with some optimizations
  • 3.1) Insert 100000 customers – Take 1 – without any optimizations
  • 3.2) Insert 100000 customers – Take 2 – with some optimizations

In Scenario 1.1 and 1.2, I iterated five times and took the best value.
In Scenario 2.1 and 2.2, I iterated two times and took the best value.
In Scenario 3.1 and 3.2, I iterated two times and took the best value.
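The “iterate a few times and take the best value” approach can be sketched with a Stopwatch. This is a generic illustration, not the harness used for the numbers in this post (that one is in the linked source code); BestOf is a hypothetical helper.

```csharp
using System;
using System.Diagnostics;

public static class BestOf
{
    // Runs the action the given number of times and returns the shortest elapsed time.
    public static TimeSpan Measure(int iterations, Action action)
    {
        if (iterations < 1) throw new ArgumentOutOfRangeException("iterations");

        var best = TimeSpan.MaxValue;

        for (var i = 0; i < iterations; i++)
        {
            var stopwatch = Stopwatch.StartNew();
            action();
            stopwatch.Stop();

            if (stopwatch.Elapsed < best)
                best = stopwatch.Elapsed;
        }

        return best;
    }
}
```

Taking the best run rather than the average filters out one-off costs such as JIT compilation and cold caches, which is useful when comparing steady-state insert speed.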

Also note that EF will not handle the enumeration in the model above. There are ways to get around this problem, but that’s not what this post is about.

Performance optimizations

During Scenario 1.2, 2.2 and 3.2 I used some small performance optimizations. I was a little bit more gentle to EF, by turning off some features.

ctx.Configuration.AutoDetectChangesEnabled = false;
ctx.Configuration.ProxyCreationEnabled = false;
ctx.Configuration.ValidateOnSaveEnabled = false;

By the way, feel free to tell me more things to configure to make it perform better.

For SisoDb there really is no optimization to make. The point of SisoDb is to be simple and performant out of the box. There is, however, one feature you can take advantage of: you can tell SisoDb what to index (make queryable). In this case I selected only CustomerNo. It’s not an entirely fair comparison, but if you do have a scenario where you don’t plan on searching on just about everything, you can turn the rest off (http://sisodb.com/docs/doc15).

Simple inserts – Summary

          1000      1000      10000     10000     100000    100000
          Take 1    Take 2    Take 1    Take 2    Take 1    Take 2
EF        0.83s     0.42s     50.85s    5.12s     N/A       56.91s
SisoDb    0.09s     0.06s     0.86s     0.55s     8.71s     5.73s

Memory consumption

When inserting 100000 items with EF and the optimizations on, I got around 1GB of memory consumption. With SisoDb I got around 100MB of memory usage.

Source code

The source code for this article is hosted at Github: https://github.com/danielwertheim/SisoDbVs

Summary

This time I covered inserts of simple object graphs. The next post will be about complex inserts as well as reading/querying.

//Daniel


Overview of the internal workings of SisoDb

I thought it was time to give you an overview of how the internals of SisoDb work, so that you get some insight into “performance” considerations.

How is data stored?

Before continuing, let’s give a quick intro to SisoDb. SisoDb is a NoSql-influenced provider giving you a document-oriented solution over Sql-server. It does this by seeing your object graphs as structures (documents in a NoSql document-oriented database) where public members of simple types (strings, numbers, dates etc) in the hierarchy are made queryable. By default every property of the graph in the contract of the passed class or interface is flattened to fit one row in a special “Indexes-table”. This table is there for making queries against your structure. You can easily go in and place indexes on the columns you query a lot. All values are extracted using cached delegates generated using IL-Generator emits, hence I don’t rely on dirty, time-consuming reflection calls.
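
To make the flattening idea a bit more concrete, here is a small sketch of walking an object graph and collecting path/value pairs for the public properties of simple types. It is not SisoDb’s code: the StructureFlattener name is invented for this example, and it uses plain reflection for brevity, whereas SisoDb itself uses cached, IL-emitted delegates as described above.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical helper, for illustration only.
public static class StructureFlattener
{
    // Returns one (path, value) pair per public property of a simple type,
    // e.g. "CustomerNo" or "BillingAddress.City".
    public static List<KeyValuePair<string, object>> Flatten(object structure)
    {
        var rows = new List<KeyValuePair<string, object>>();
        if (structure != null)
            Collect(structure, "", rows);
        return rows;
    }

    // Note: this sketch does not guard against cyclic object graphs.
    private static void Collect(object node, string prefix, List<KeyValuePair<string, object>> rows)
    {
        PropertyInfo[] properties = node.GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance);
        foreach (PropertyInfo property in properties)
        {
            if (property.GetIndexParameters().Length > 0)
                continue; // skip indexers

            object value = property.GetValue(node, null);
            string path = prefix.Length == 0 ? property.Name : prefix + "." + property.Name;

            if (IsSimple(property.PropertyType))
                rows.Add(new KeyValuePair<string, object>(path, value));
            else if (value != null && !(value is IEnumerable))
                Collect(value, path, rows); // nested complex type; collections are skipped in this sketch
        }
    }

    private static bool IsSimple(Type type)
    {
        type = Nullable.GetUnderlyingType(type) ?? type;
        return type.IsPrimitive || type.IsEnum || type == typeof(string)
            || type == typeof(decimal) || type == typeof(DateTime) || type == typeof(Guid);
    }
}
```

Calling StructureFlattener.Flatten on a customer with a billing address would yield entries such as “CustomerNo” and “BillingAddress.City”, which is the kind of data that ends up in the Indexes-table.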

Json-serialization

The structure is also stored as Json in the “Structure-table”. This is done to keep an intact schemaless representation of your structure so that structures can be reindexed, and to give an effective deserialization process when performing queries.

I’m using one of the fastest Json serializers I know of in the .Net community: ServiceStack.Text. You can read a performance comparison with the popular Json.Net library here: Json.Net vs ServiceStack.Text.

Not making everything queryable?

I’m currently implementing support for this, where you will be able to specify “hey, don’t make everything queryable, since I will only query on these properties”. That way you can boost performance by making the “Indexes-table” much slimmer.

This feature is coming really soon, perhaps it’s already implemented.

Separated entities & sharding

Since data is document-oriented, each structure gets its own tables and they stand on their own legs, not having relations to other tables. This is also a mindset you need to have when working with SisoDb: it’s not an O/RM over a relational data model, it’s a document database. You could take advantage of this and shard your model. I’m planning support for this in the future, but right now you will manually need to put up a proxy accessing different SisoDb instances depending on the type of structure being consumed.

Use replication for readmodels and writemodels

Since I’m targeting SQL-server you get some built-in benefits, such as the built-in support for replicating data between databases. This way you could easily have a write store and a read store, as well as a store where you use some ETL tool to transform the data into a model more fit for reporting, warehousing etc.

How is data inserted?

When inserting entities there is a demand that you have a property named “SisoId”. That is the only demand SisoDb has on your model. That property could return either an Integer or a Guid.

Integer identities

In this scenario SisoDb looks at how many entities you are inserting, reserves a range of identities and assigns them to the model before performing the insert to the database. This way no ineffective insert + select has to be made for each row (as with Entity framework or traditional identities in NHibernate).
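
The range reservation idea could look something like the sketch below. It is an assumption-laden illustration, not SisoDb’s implementation: the IdentityRangeAllocator type is made up, and the counter lives in memory here, whereas a real provider would have to keep it in the database.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory allocator, for illustration only.
public class IdentityRangeAllocator
{
    private readonly object _lock = new object();
    private int _nextId;

    public IdentityRangeAllocator(int firstFreeId)
    {
        _nextId = firstFreeId;
    }

    // Reserves 'count' consecutive ids in one step and returns the first one.
    public int Reserve(int count)
    {
        if (count < 1) throw new ArgumentOutOfRangeException("count");

        lock (_lock)
        {
            int first = _nextId;
            _nextId = checked(_nextId + count);
            return first;
        }
    }

    // Assigns ids to all items up front, so no per-row round trip is needed afterwards.
    public void AssignIds<T>(IList<T> items, Action<T, int> setId)
    {
        if (items.Count == 0) return;

        int id = Reserve(items.Count);
        foreach (T item in items)
            setId(item, id++);
    }
}
```

With the Customer class from the benchmark post, usage would be along the lines of allocator.AssignIds(customers, (c, id) => c.SisoId = id), after which the whole batch can be inserted in one go.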

Sequential Guid identities

SisoDb doesn’t use traditional generated Guids but instead it uses sequential guids mimicking the algorithm used in SQL Server’s sequential guids.
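
To show why sequential Guids help, here is a sketch of the well-known “COMB” technique, which is a simpler stand-in and not the algorithm SisoDb or SQL Server actually uses. It relies on the fact that SQL Server sorts uniqueidentifier values by the last six bytes first, so a timestamp is written there. The CombGuid name is made up for this example.

```csharp
using System;

// Hypothetical COMB-style generator, for illustration only.
public static class CombGuid
{
    private static readonly DateTime Epoch = new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);

    public static Guid Create(DateTime utcNow)
    {
        byte[] bytes = Guid.NewGuid().ToByteArray();
        long millis = (long)(utcNow - Epoch).TotalMilliseconds;

        // Write the low 48 bits of the timestamp, big-endian, into bytes 10..15,
        // which SQL Server treats as the most significant part when ordering.
        for (int i = 0; i < 6; i++)
            bytes[15 - i] = (byte)(millis >> (8 * i));

        return new Guid(bytes);
    }
}
```

Guids created this way in different milliseconds sort in creation order, which keeps new rows at the end of a clustered index instead of scattering them across pages. Guids created within the same millisecond are not ordered relative to each other in this sketch.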

Bulkcopy

I make use of a custom data reader that reads over the structures and is consumed by the SQL bulk copy. That way there are “NO custom generated ad-hoc batch SQL inserts”, only effective inserts using the bulk copy.

Querying

When querying using uow.Where or uow.Query or uow.Get etc., your specified lambda expression is translated to parameterized SQL, executed as a plain select via the ADO.Net command and NOT executed using ad-hoc SQL and the EXEC function in SQL server.
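
As a rough illustration of turning a lambda into parameterized SQL, the sketch below translates a handful of operators into a WHERE fragment plus a parameter dictionary. It is not SisoDb’s query translator: the WhereTranslator and ParameterizedWhere names, the bracketed column naming and the @p0 style parameter names are all assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical result type, for illustration only.
public class ParameterizedWhere
{
    public string Sql { get; private set; }
    public Dictionary<string, object> Parameters { get; private set; }

    public ParameterizedWhere(string sql, Dictionary<string, object> parameters)
    {
        Sql = sql;
        Parameters = parameters;
    }
}

// Hypothetical translator supporting only ==, <, >, && and ||.
public static class WhereTranslator
{
    public static ParameterizedWhere Translate<T>(Expression<Func<T, bool>> predicate)
    {
        var parameters = new Dictionary<string, object>();
        string sql = Visit(predicate.Body, parameters);
        return new ParameterizedWhere(sql, parameters);
    }

    private static string Visit(Expression node, Dictionary<string, object> parameters)
    {
        var binary = node as BinaryExpression;
        if (binary != null)
        {
            string op;
            switch (binary.NodeType)
            {
                case ExpressionType.Equal: op = "="; break;
                case ExpressionType.GreaterThan: op = ">"; break;
                case ExpressionType.LessThan: op = "<"; break;
                case ExpressionType.AndAlso: op = "AND"; break;
                case ExpressionType.OrElse: op = "OR"; break;
                default: throw new NotSupportedException(binary.NodeType.ToString());
            }
            return "(" + Visit(binary.Left, parameters) + " " + op + " " + Visit(binary.Right, parameters) + ")";
        }

        // A property path rooted in the lambda parameter becomes a column name.
        string path = TryGetPath(node);
        if (path != null)
            return "[" + path + "]";

        // Anything else is evaluated to a value and bound as a parameter,
        // so values never end up concatenated into the SQL text.
        object value = Expression.Lambda(node).Compile().DynamicInvoke();
        string name = "@p" + parameters.Count;
        parameters.Add(name, value);
        return name;
    }

    private static string TryGetPath(Expression node)
    {
        var member = node as MemberExpression;
        if (member == null) return null;
        if (member.Expression is ParameterExpression) return member.Member.Name;

        string parent = TryGetPath(member.Expression);
        return parent == null ? null : parent + "." + member.Member.Name;
    }
}
```

The resulting Sql string and Parameters dictionary map directly onto a command text and its parameter collection in ADO.Net, which lets SQL Server reuse the query plan and avoids SQL injection through values.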

Well, that was a short overview of how the internals work. I will be glad to try and answer any questions. There is more information about it here: http://sisodb.com/docs

//Daniel


SisoDb – Getting started video

I just got my very first video production out. You will notice that English isn’t my native language, but hey, you should always leave room for improvement.

http://blog.sisodb.com/2011/03/16/video-getting-started/

//Daniel
