Hibernating Rhinos

Zero friction databases

Fixing memory leaks in RavenDB Management Studio–BindableCollection

Continuing on from my last blog post in this series, in which I talked about the WeakReference, this time I’ll talk about the BindableCollection.

One of the key goals of the new RavenDB Management Studio was to never show data in the studio that the server has already made obsolete. Say a document was deleted or updated on the server; you want to see this change in the Management Studio immediately (or at least a few seconds later). We came up with an interesting solution for that. Each model has a TimerTickedAsync method, which is called both by update events (like deleting a document in the studio) and by a 5-second timer, in order to fetch the changes that occurred on the server. Now we need to merge the new data with the old data, remove the expired items and render the new ones, but we had better not re-render the items that haven't changed on every timer tick. So this is the code that we came up with in order to do the merge:

public class BindableCollection<T> : ObservableCollection<T> where T : class
{
    private readonly Func<T, object> primaryKeyExtractor;
    private readonly KeysComparer<T> objectComparer;

    public BindableCollection(Func<T, object> primaryKeyExtractor, KeysComparer<T> objectComparer = null)
    {
        if (objectComparer == null)
            objectComparer = new KeysComparer<T>(primaryKeyExtractor);
        else
            objectComparer.Add(primaryKeyExtractor);
        
        this.primaryKeyExtractor = primaryKeyExtractor;
        this.objectComparer = objectComparer;
    }

    public void Match(ICollection<T> items, Action afterUpdate = null)
    {
        Execute.OnTheUI(() =>
        {
            var toAdd = items.Except(this, objectComparer).ToList();
            var toRemove = this.Except(items, objectComparer).ToArray();

            for (int i = 0; i < toRemove.Length; i++)
            {
                var remove = toRemove[i];
                var add = toAdd.FirstOrDefault(x => Equals(ExtractKey(x), ExtractKey(remove)));
                if (add == null)
                {
                    Remove(remove);
                    continue;
                }
                SetItem(Items.IndexOf(remove), add);
                toAdd.Remove(add);
            }
            foreach (var add in toAdd)
            {
                Add(add);
            }

            if (afterUpdate != null) afterUpdate();
        });
    }
    ...
}

Note that the Match method effectively removes and adds just what is needed.

Later on, we started using the Reactive Extensions’ Observable pattern in order to subscribe to some events, and this made some of our main models disposable.

So now we had lots of models that were created by TimerTickedAsync but never used in any manner. Those objects were held in memory, since they were subscribed to events outside of the class, but they never got disposed (Dispose unsubscribes those events).
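The shape of the problem can be sketched like this (a simplified, hypothetical model rather than the actual Studio code, using Rx’s Subscribe(Action&lt;T&gt;) extension): as long as the model is subscribed to an event source that outlives it, the model stays reachable, and only Dispose breaks that link:

```csharp
using System;

// Simplified, hypothetical sketch - not the actual Studio code.
public class DocumentModel : IDisposable
{
    private readonly IDisposable subscription;

    public DocumentModel(IObservable<string> serverNotifications)
    {
        // The subscription holds a reference back to this model, so as long
        // as the event source is alive, the model stays reachable and
        // cannot be garbage collected.
        subscription = serverNotifications.Subscribe(OnDocumentChanged);
    }

    private void OnDocumentChanged(string documentId)
    {
        // Refresh the model state from the server...
    }

    public void Dispose()
    {
        // Unsubscribe, making the model eligible for collection.
        subscription.Dispose();
    }
}
```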

So this fixed it:

public void Match(ICollection<T> items, Action afterUpdate = null)
{
    Execute.OnTheUI(() =>
    {
        var toAdd = items.Except(this, objectComparer).ToList();
        var toRemove = this.Except(items, objectComparer).ToArray();
        var toDispose = items.Except(toAdd, objectComparer).OfType<IDisposable>().ToArray();

        for (int i = 0; i < toRemove.Length; i++)
        {
            var remove = toRemove[i];
            var add = toAdd.FirstOrDefault(x => Equals(ExtractKey(x), ExtractKey(remove)));
            if (add == null)
            {
                Remove(remove);
                continue;
            }
            SetItem(Items.IndexOf(remove), add);
            toAdd.Remove(add);
        }
        foreach (var add in toAdd)
        {
            Insert(0, add);
        }
        foreach (var disposable in toDispose)
        {
            disposable.Dispose();
        }

        if (afterUpdate != null) afterUpdate();
    });
}

Now we dispose all the objects that were created on the fly and are not needed anymore.


Fixing memory leaks in RavenDB Management Studio - FluidMoveBehavior

Continuing on from my last blog post in this series, in which I talked about the WeakReference, this time I’ll talk about the FluidMoveBehavior.

The FluidMoveBehavior gives you a great transition effect for the items in your WrapPanel (which is in the Silverlight Toolkit). The FluidMoveBehavior is part of Expression Blend, and it lives in Microsoft.Expression.Interactions.dll.

When I profiled the application with a memory profiler, I found some memory leaks that were caused by the FluidMoveBehavior. Surprised, I Googled “FluidMoveBehavior memory leak”, and the first result was this thread, which effectively showed that this is a known issue with no fix yet.

So removing the FluidMoveBehavior from the Management Studio fixed a big source of memory leaks. What’s interesting is that the visual effect of the FluidMoveBehavior was barely needed, since we already repopulate the panel with items each time the panel size changes.


Fixing memory leaks in RavenDB Management Studio - WeakReference

Continuing from the last blog post in this series, in which I talked about the WeakEventListener, now I’m going to talk about using the WeakReference.

In the RavenDB Management Studio we have 4 pages that contain lots of data: the Home page, the Collections page, the Documents page and the Indexes page. Once you enter one of those pages, we fetch the data from the RavenDB database, but in order to avoid fetching it each time we navigate to that page, the data is stored in a static variable. This way, if you re-navigate to a page, you will see the data immediately, while we make a background request to RavenDB in order to give you more up-to-date data.

You can look at this code, for example:

public class HomeModel : ViewModel
{
    public static Observable<DocumentsModel> RecentDocuments { get; private set; }

    static HomeModel()
    {
        RecentDocuments = new Observable<DocumentsModel>
                          {
                            Value = new DocumentsModel
                                    {
                                        Header = "Recent Documents",
                                        Pager = {PageSize = 15},
                                    }
                          };
    }

    public HomeModel()
    {
        ModelUrl = "/home";
        RecentDocuments.Value.Pager.SetTotalResults(new Observable<long?>(ApplicationModel.Database.Value.Statistics, v => ((DatabaseStatistics)v).CountOfDocuments));
        ShowCreateSampleData = new Observable<bool>(RecentDocuments.Value.Pager.TotalResults, ShouldShowCreateSampleData);
    }

    public override Task TimerTickedAsync()
    {
        return RecentDocuments.Value.TimerTickedAsync();
    }
}

The problem is: what happens when the application consumes too much memory because of all of this static data? In that case, there are likely to be performance issues. In order to avoid that, we changed the static data to be held by a WeakReference, so we basically say to the Silverlight GC engine: if you want to collect the data, please do so. In that case, we’ll just re-initialize it when we need the data again.
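The pattern looks roughly like this (a simplified, hypothetical sketch rather than the actual Studio code):

```csharp
using System;

// Hypothetical sketch of caching static page data behind a WeakReference.
public class DocumentsModel
{
    // Holds the (expensive to fetch) page data...
}

public static class RecentDocumentsCache
{
    private static WeakReference recentDocuments;

    public static DocumentsModel RecentDocuments
    {
        get
        {
            var cached = recentDocuments != null
                ? recentDocuments.Target as DocumentsModel
                : null;
            if (cached == null)
            {
                // Either this is the first access, or the GC reclaimed the
                // data under memory pressure - just re-initialize it.
                cached = new DocumentsModel();
                recentDocuments = new WeakReference(cached);
            }
            return cached;
        }
    }
}
```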

This had a huge impact on the memory consumption of the Management Studio application, but we still had some memory leaks, which I’ll talk about in the next blog post.


Why make WeakEventListener internal?

In the previous post I described how I used the WeakEventListener from the Silverlight Toolkit in order to solve a memory leak. This class is a much-needed tool, and it’s something that is recommended for use in Silverlight applications in order to avoid memory leaks.

Since this class is internal, you must copy the code from the source (thanks to the Microsoft Public License) into your Silverlight application. I’m not sure if this is still the case in the latest version of the toolkit, which is compatible with Silverlight 5 (the source code for it was not public when I wrote this), but in any case, if you’re developing a Silverlight application, I’m pretty sure that you’ll need this class.


Fixing memory leaks in RavenDB Management Studio - WeakEventListener

After shipping the new version of the Management Studio for RavenDB, which was part of build #573, we got reports from our users that it has some memory leaks. These reports indicated that we have a huge memory leak in the Management Studio. I started to investigate this and found a bunch of problems that cause it. In this blog post series I’ll share with you what it took to fix them.

RavenDB Management Studio is a Silverlight-based application. One of the mistakes that can easily be made in a Silverlight application (as in many other UI platforms) is to attach an event handler to an object, then discard that object. The problem is that the object will never be cleaned up from memory, since we still have a reference to it – the event listener.

Consider the following code for example:

public static class ModelAttacher
{
    public static readonly DependencyProperty AttachObservableModelProperty =
        DependencyProperty.RegisterAttached("AttachObservableModel", typeof(string), typeof(ModelAttacher), new PropertyMetadata(null, AttachObservableModelCallback));
    
    private static void AttachObservableModelCallback(DependencyObject source, DependencyPropertyChangedEventArgs args)
    {
        var typeName = args.NewValue as string;
        var view = source as FrameworkElement;
        if (typeName == null || view == null)
            return;

        var modelType = Type.GetType("Raven.Studio.Models." + typeName) ?? Type.GetType(typeName);
        if (modelType == null)
            return;

        try
        {
            var modelInstance = Activator.CreateInstance(modelType);
            var observableType = typeof(Observable<>).MakeGenericType(modelType);
            var observable = Activator.CreateInstance(observableType) as IObservable;
            var piValue = observableType.GetProperty("Value");
            piValue.SetValue(observable, modelInstance, null);
            view.DataContext = observable;

            var model = modelInstance as Model;
            if (model == null) 
                return;
            model.ForceTimerTicked();

            SetPageTitle(modelType, modelInstance, view);
            
            view.Loaded += ViewOnLoaded;
        }
        catch (Exception ex)
        {
            throw new InvalidOperationException(string.Format("Cannot create instance of model type: {0}", modelType), ex);
        }
    }
    
    private static void ViewOnLoaded(object sender, RoutedEventArgs routedEventArgs)
    {
        var view = (FrameworkElement)sender;
        var observable = view.DataContext as IObservable;
        if (observable == null)
            return;
        var model = (Model)observable.Value;
        model.ForceTimerTicked();

        var viewModel = model as ViewModel;
        if (viewModel == null) return;
        viewModel.LoadModel(UrlUtil.Url);
    }
}

For information about the ModelAttacher pattern, take a look at this blog post.

What this basically means is that we create a model for each page, but never dispose of it. So each time you navigate to a page, a new view model is created for the page, but the old one never gets cleaned up.

There were more examples like that, where we have an event keeping a reference to dead objects. You can look at the RavenDB commit log if you’re interested in the details. But what is the way to solve this?

In order to solve this, I copied the WeakEventListener from the Silverlight Toolkit, which is an internal class. Using the WeakEventListener in order to attach to objects solved the above memory issue, since we no longer have a strong reference to the dead objects, and the GC can just clean them up.
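Typical usage looks roughly like this (a hedged sketch with hypothetical model and view names, based on the toolkit’s WeakEventListener&lt;TInstance, TSource, TEventArgs&gt; class; the exact API may differ slightly between toolkit versions): the long-lived event source only holds a reference to the small listener object, which in turn reaches the subscriber through a WeakReference:

```csharp
using System.Windows;

// Hedged sketch with hypothetical names - not the actual Studio code.
public class PageViewModel
{
    public void Attach(FrameworkElement view)
    {
        var listener = new WeakEventListener<PageViewModel, object, RoutedEventArgs>(this)
        {
            // Invoked with the (still alive) instance when the event fires.
            OnEventAction = (instance, source, args) => instance.OnViewLoaded(source, args),
            // Invoked so the listener can unhook itself once the instance is collected.
            OnDetachAction = weakListener => view.Loaded -= weakListener.OnEvent
        };
        view.Loaded += listener.OnEvent;
    }

    private void OnViewLoaded(object source, RoutedEventArgs args)
    {
        // React to the event without the event source keeping
        // this view model alive forever.
    }
}
```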

Stress testing RavenDB

The following is cross posted from Mark Rodseth’s blog (he also posted a follow up post with more details).

Mark is a .Net Technical Architect at a digital agency named Fortune Cookie in London. I would like to take the opportunity and thank Mark both for the grand experiment about which you are about to read and for the permission to post this in the company blog.

Please note: neither Mark nor Fortune Cookie is affiliated with Hibernating Rhinos or RavenDB in any way.

When a colleague mentioned RavenDB to me I had a poke around and discovered that it was one of the more popular open source NoSQL technologies on the market. Not only that, but it was bundled with Lucene.Net, making it a document database coupled with Lucene search capabilities. With an interest in NoSQL technology and a grudge match that hadn’t been settled with Lucene.Net, I set myself the challenge to swap out our SQL search implementation with RavenDB and then do a like-for-like load test against the two search technologies.
These are my findings, from both a programmatic and a performance perspective.


Installing RavenDB
There isn't much to installing RavenDB; it's pretty much a case of downloading the latest build and running the server application.
The server comes with a nice Silverlight management interface which allows you to manage all aspects of RavenDB, from databases to data to indexes. All tasks have a programmatic equivalent, but a decent GUI is an essential tool for noobs like myself.

Storing the Data
My first development task was to write an import routine which parsed the property data in SQL and then added it into a RavenDB database. This was fairly easy: all I needed to do was create a POCO, populate it with data from SQL and save it using the C# RavenDB API. The POCO is serialised into JSON and saved as a new document in RavenDB.
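In code, that step looks roughly like this (a hypothetical sketch; the property names are illustrative and the server URL is an assumption, using the RavenDB C# client of the era):

```csharp
using System;
using Raven.Client;
using Raven.Client.Document;

// Hypothetical sketch - class and property names are illustrative.
public class AssetDetailPoco
{
    public int AssetId { get; set; }
    public decimal AskingPrice { get; set; }
    public int NumberOfBedrooms { get; set; }
    public double AssetLatitude { get; set; }
    public double AssetLongitude { get; set; }
}

public static class PropertyImporter
{
    public static void Import()
    {
        // The server URL is an assumption (the default RavenDB port).
        using (IDocumentStore store = new DocumentStore { Url = "http://localhost:8080" }.Initialize())
        using (IDocumentSession session = store.OpenSession())
        {
            var asset = new AssetDetailPoco
            {
                AssetId = 1,
                AskingPrice = 250000m,
                NumberOfBedrooms = 3,
                AssetLatitude = 51.5,
                AssetLongitude = -0.12
            };

            // Store serializes the POCO to JSON; SaveChanges sends it to the server.
            session.Store(asset);
            session.SaveChanges();
        }
    }
}
```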

The main challenge here was changing my thinking from relational modelling to domain-driven modelling - a paradigm shift required when moving to NoSQL - which includes concepts like aggregate roots, entities and value types. Journeying into this did get a bit metaphysical at times, but here is my understanding of this newfangled schism.

Entity - An entity is something that has a unique identity and meaning in both the business and system context. In the property web site example, a flat or a bungalow or an office match these criteria.

Value Type - Part of an entity which does not require its own identity and has no domain or system relevance on its own. For example, a bedroom or a toilet.

Aggregate Root - A master entity with special rules and access permissions that relate to a grouping of similar entities. For example, a property is an aggregate of flats, bungalows and offices. This is the best description of these terms I found.

Hibernating Rhinos note: With RavenDB, we usually consider the Entity and Aggregate Root to be synonyms to a Document. There isn’t a distinction in RavenDB between the two, and they map to a RavenDB document.

In this example, I created one Aggregate Root Entity to store all property types.

C# Property POCO

Indexing the Data
Once the data was stored, it needed to be indexed for fast search. To achieve this I had to get to grips with map/reduce functions, which I had seen around but avoided like the sad and lonely looking bloke** at a FUN party.
The documentation on the RavenDB web site is pretty spartan, but after hacking away I finally created an index that worked on documents with nested types and allowed for spatial queries.
RavenDB allows you to create indexes using map/reduce functions in LINQ. What this allows you to do is create a Lucene index from a large, tree-like structure of data. Map/reduce functions give you the same capabilities as SQL joins and group-by statements. To create a spatial index which allowed me to search properties by type and sector (nested value types), I created an index using the following map/reduce function.

Index created using the Raven DB Admin GUI

Hibernating Rhinos note: a more efficient index would likely be something like:

from r in docs.AssetDetailPocos
select new
{
  sectorname = r.Sectors,
  prnlname = r.AddressPnrls,

  r.AssetId,
  r.AskingPrice,
  r.NumberOfBedrooms,
  r.NumberOfBathRooms,
  
  
  _ = SpatialIndex.Generate(r.AssetLatitude, r.AssetLongitude)
}

This would reduce the number of index entries and make the index smaller and faster to generate.

Querying the data

Now that I had data that was indexed, the final development challenge was querying it. RavenDB has a basic search API, plus a Lucene query API for more complex queries. Both allow you to write queries in LINQ. For the kind of complex queries you would require in a property-search web site, the API was a bit lacking. To work around this I had to construct my own native Lucene queries. Fortunately, the API allowed me to do so.
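Something along these lines, for example (a hedged sketch; the index name, field names and query are illustrative, and the AssetDetailPoco document class is assumed): the session's advanced API accepts a raw Lucene query string where the LINQ API falls short:

```csharp
using System.Collections.Generic;
using System.Linq;
using Raven.Client;

// Hedged sketch - the index name, field names and query are illustrative,
// and AssetDetailPoco is an assumed document class.
public static class PropertySearch
{
    public static IList<AssetDetailPoco> FindThreeBedResidential(IDocumentStore store)
    {
        using (IDocumentSession session = store.OpenSession())
        {
            // A raw Lucene query lets you express conditions that the
            // strongly typed LINQ API could not at the time.
            return session.Advanced
                .LuceneQuery<AssetDetailPoco>("PropertiesIndex")
                .Where("sectorname:Residential AND NumberOfBedrooms:3")
                .Take(20)
                .ToList();
        }
    }
}
```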

Performance Testing

All the pawns were now in place for my load test.

  • The entire property SQL database was mirrored to RavenDB.
  • The Search Interface now had both a SQL and a RavenDB implementation.
  • I created a crude web page which allowed switching the search from SQL to RavenDB via query string parameters, and output the results using paging. To ensure maximum thrashing, the load tests passed in random geo locations for proximity search and keywords for attribute search.
  • A VM was set up and ready to face the wrath of BrowserMob.

I created a ramp test scaling from 0 to 1000 concurrent users firing a single GET request with no think time at the web page, and ran it in isolation against the SQL implementation and then in isolation against the RavenDB implementation. The test ran for 30 minutes.
And for those of you on the edge of your seat, the results were a resounding victory for RavenDB. Some details of the load test are below, but the headline is: SQL choked at 250 concurrent users, whereas with RavenDB, even with 1000 concurrent users, the response time was below 12 seconds.

SQL Load Test

Transactions: 111,014 (Transaction = Single Get Request)
Failures: 110,286 (Any 500 or timeout)

SQL Data Throughput - Flatlines at around 250 concurrent users.

RavenDB Load Test

Transactions: 145,554 (Transaction = Single Get Request)
Failures: 0 (Any 500 or timeout)

RavenDB Data Throughput - What the graph should look like

Final thoughts

RavenDB is a great document database with fairly powerful search capabilities. It has a lot of pluses and a few negatives, which are listed for your viewing pleasure below.
Positives

  • The documentation, although spartan, does cover the fundamentals, making it easy to get started. In some instances I did have to sniff through the source code to fathom how some things worked, but that is the beauty of open source, I guess.
  • The Silverlight admin interface is pretty sweet.
  • The Raven community (a Google group) is very active, and the couple of queries I posted were responded to almost immediately.
  • Although the API did present some challenges, it allowed you both to bypass its limitations and even to contribute to the project yourself.
  • The commercial licence for RavenDB is pretty cheap at a $600 one-off payment.

Negatives

  • The web site documentation and content could do with a facelift. (That said, I just checked the web site and it seems to have been revamped.)
  • I came across a bug in Lucene.Net related to granular spatial queries which has yet to be resolved. Not RavenDB's fault, but a dependence on third-party libraries can cause issues.
  • I struggled to find really impressive commercial reference sites. There are some testimonials but they give little information away. 
  • Sharding scares me.

I look forward to following the progress of RavenDB and hopefully one day using it in a commercial project. I'm not at the comfort level yet to propose it, but with some more investigation and perhaps some good reference sites, this could change very quickly.


* Starry Eyed groupies sadly didn't exist, nor have they ever.
** Not me.

http://ravendb.net
