Azure, Beer Drinkin

Creating a 5 star search experience

Search is a feature that can make or break your mobile app, but it can be incredibly difficult to get right. In this blog post, I’m going to share how I’m solving search with Beer Drinkin.

There are many options for us developers looking to implement search solutions into our projects. Some of us may decide to use Linq and Entity Framework to look through a table, and the more adventurous may opt to create an instance of Elastic Search, which requires a lot of work to set up and maintain. For Beer Drinkin, I’m using Microsoft’s Azure Search service as it has proved to be easy to configure and requires zero maintenance.

The reason that Beer Drinkin uses Azure Search is simple: the BreweryDB search functionality is too limited for my needs. One example of this is that the end point often returns zero results if the user misspells a search term. If I searched for “Duval” rather than “Duvel,” BreweryDB’s search would return zero beers. Even if I were to spell the search term correctly, BreweryDB would return all beers from the Duvel Moortgat brewery. Although this is minor, I would prefer that Maredsous 6 and Vedett Extra White not be returned as these do not have “Duvel” in the name.

Screen Shot 2015-12-21 at 15.36.14

Spelling Mistakes

Another issue with using the default search functionality of BreweryDB is its inability to deal with spelling mistakes or offer suggestions. Simple off-by-one-letter spelling mistakes yield no results, something that should be easy to resolve.

Screen Shot 2015-12-21 at 15.40.40

I’ve had sleepless nights worrying that on release, users fail to find results due to simple spelling mistakes. One way to address spelling mistakes is to utilize a spell checking service like WebSpellChecker.net.

The issue with a service such as WebSpellChecker is that it has no context in which to make corrections when it comes to names of products and it also doesn’t support multiple languages.

Another way to minimize spelling mistakes is to provide a list of suggestions as the user types in a search query. You’re probably familiar with this in search engines like Google and Bing. This approach to searching is intuitive to users and significantly reduces the number of spelling mistakes.

Enter Azure Search

Azure Search aims to remove the complexity of providing advanced search functionality by offering a service that does the heavy lifting for implementing a modern and feature-rich search solution. Microsoft handles all the infrastructure required to scale as it gains more users and indexes more data. Not to mention that Azure Search supports 50 languages, which use technologies from multiple teams within Microsoft (such as Office and Bing). What this equates to is Azure Search understands the languages and words of the search requests.

Some of my favorite features

Fuzzy search – Find strings that match a pattern approximately.

Proximity search – Geospatial queries. Find search targets within a certain distance of a particular point.

Term boosting –  Boosting allows me to promote results based on rules I create. One example might be to boost old stock or discounted items.

Getting Started

The first step I took was to provision an Azure Search service within the Azure Portal. I had two options for setting up the service; I could have opted for a free tier or have paid for dedicated resources. The free tier offers up to 10,000 documents and 50MB storage, which is a little limited for what I need.

Because my index already contains over 50,000 beers, I had no option but to opt for the Standard S1 service, which comes at a cool $250 per month (for Europeans, that’s €211). With the fee comes a lot more power with the use of dedicated resources, and I’m able to store 25GB of data. When paying for Search, you’ll be able to scale out to 36 units, which provides plenty of room to grow.

Creating an index

Before I could take advantage of Azure Search, I needed to upload my data to be indexed. Fortunately, with the .NET SDK the Azure Search team provides, it’s exceptionally easy to interact with the service. Using the .NET library I wrote a few weeks ago, which calls BreweryDB, I was able to iterate quickly through each page of beer results and upload them in blocks to the search service.

Screen Shot 2016-01-04 at 10.18.02.png

Uploading documents

Parallel.For(1, totalPageCount, new ParallelOptions {MaxDegreeOfParallelism = 25}, index =>
{
    var response = client.Beers.GetAll(index).Result;
    var beersToAdd = new List();
    foreach (var beer in response.Data)
    {
        var indexedBeer = new IndexedBeer
        {
            Id = beer.Id,
            Name = beer.Name,
            Description = beer.Description,
            BreweryDbId = beer.Id,
            BreweryId = beer?.Breweries?.FirstOrDefault()?.Id,
            BreweryName = beer?.Breweries?.FirstOrDefault()?.Name,
            AvailableId = beer.AvailableId.ToString(),
            GlassId = beer.GlasswareId.ToString(),
            Abv = beer.Abv
         };

         if (beer.Labels != null)
         {
            indexedBeer.Images = new[] {beer.Labels.Icon, beer.Labels.Medium, beer.Labels.Large};
         }
         beersToAdd.Add(indexedBeer);
    }
    processedPageCount++;
    indexClient.Documents.Index(IndexBatch.Create(beersToAdd.ToArray().Select(IndexAction.Create)));

    Console.Write( $"\rAdded {beersToAdd.Count} beers to Index | Page {processedPageCount} of {totalPageCount}");
});

Other data import methods

Azure Search also supports the ability to index data stored in Azure SQL or DocumentDB, which enables you to point a crawler to your SQL table and ensures it is always up to date rather than requiring you to manually manage the document index yourself. There are a few reasons you may not want to use a crawler. The best reason for not using a crawler is that it introduces the possibility of a delay between your DB changing and your search index reflecting the changes. The crawler will only crawl on a schedule, which results in an out-of-data index.

If you opt for the self-managed approach, you can add, remove, and edit your indexed documents yourself as the changes happen in your back end. This provides you with live search results as you know the data is always up to date. Using the crawler is an excellent way to get started with search and quickly get some data in place, but I wouldn’t consider it a good strategy for long-term use.

I mentioned earlier that the free tier is limited to 10,000 documents, which translates to 10,000 rows in a table. If your table has more than 10,000 rows, then you’ll need to purchase the Standard S1 tier.

Suggestions

Before we can use suggestions, we’ll need to ensure that we’ve created a suggester within Azure.

Screen Shot 2016-01-04 at 10.27.15.png

In the current service release, there is support for limited index schema updates. Any schema updates that would require re-indexing, such as changing field types, are not currently supported. Although existing fields cannot be modified or deleted, new fields can be added to an existing index at any time.

If you’ve not checked the suggester checkbox at the time of creating a field, then you’ll need to create a secondary field as Azure Search doesn’t currently support editing the fields. The Azure Search team recommends that you create new fields if you require a change in functionality.

The simplest way to get suggestions would use the following API.

var response = await indexClient.Documents.SuggestAsync(searchBar.Text, "nameSuggester");
foreach(var r in response)
{
    Console.WriteLine(r.Text);
}

Having fun with the suggestion API

The API suggestion provides properties for enabling fuzzing matching and hit highlighting. Let’s see how we might enable that functionality within our app.

var suggestParameters = new SuggestParameters();
suggestParameters.UseFuzzyMatching = true;
suggestParameters.Top = 25;
suggestParameters.HighlightPreTag = "[";
suggestParameters.HighlightPostTag = "]";
suggestParameters.MinimumCoverage = 100;

What do the properties do?

UseFuzzyMatching – The query will find suggestions even if there’s a substituted or missing character in the search text. While this provides a better search experiance, it comes at the cost of slower operations and consumes more resources.

Top – Number of suggestions to retreive. It must be a number between 1 and 100, with its default to to 5.

HightlightPreTag – Gets or sets the tag that is prepended to hit highlights. It MUST be set with a post tag.

HightlightPostTag – Gets or sets the tag that is prepended to hit highlights. It MUST be set with a pre tag.

MinimumCoverage – Represents the precentage of the index that must be covered by a suggestion query in order for the query to be reported a sucess. The default is 80%.

How do the results look?

Simulator Screen Shot 4 Jan 2016, 10.39.25.png

Search

The Search API itself is even easier (assuming we don’t use filtering, which is a topic for another day).


var searchParameters = new SearchParameters() { SearchMode = SearchMode.All };

indexClient.Documents.SearchAsync(searchBar.Text, searchParameters);

Special Thanks

I’d like to take a moment to thank Janusz Lembicz for helping me get started with Azure Search suggestions by answering all my questions. I  appreciate your support (especially given it was on a weekend!).