Showing posts with label Index. Show all posts

Wednesday, 20 January 2016

Building Lucene Search Index - Order of templates does matter!

I've just spent a frustrating 30 minutes trying to figure out why a search of mine wasn't returning the results I expected.

Here's the JSON that gets passed to my API to query the search index:

{"pages":
    ["c159f9f7-77b6-42c2-9b35-50279001987e",
     "8c24e1a1-986e-4c01-9c2f-447b89805fd7",
     "b52b3044-f7b0-406d-b3ac-6be5b8148a7f",
     "f2726c5f-3fbc-4b25-8d53-693c0eb50b93",
     "2319ea68-74c6-4ba3-af7b-02a6b766f311",
     "a70eb547-bab6-44c0-97ca-b749ba333561",
     "2685a1f2-7aac-4e99-ba9d-5b05667140a7",
     "6385cf29-2133-4c70-a247-ec0be744db64",
     "37816c10-fb96-45b6-b123-cb815926df97"
    ],"language":"en","PageSize":10}


In my search I pass in a number of GUIDs (pages), which are matched against IDs in the index. Sounds simple enough! I had four templates in my index configuration: three were page templates, and one (the last in the list) was a template inherited by all the pages:
       
<include hint="list:IncludeTemplate">
  <!--Product Overview Page--> 
  <templateId>{1A1706B9-95AF-408D-ACF6-A158315333F4}</templateId>
  <!--Product Module Detail Page--> 
  <templateId>{1263F902-B309-47B1-BCE5-25305167E15C}</templateId>
  <!--Product Module Landing Page--> 
  <templateId>{ECA7823D-4BA9-4995-836B-85FA90F00481}</templateId>
  <!--Related Product Details--> 
  <templateId>{57F36C1B-9292-4386-96DB-8B87A2FBBE28}</templateId>
</include>


All worked fine until I added another page template id to the list (which also inherited the same template as the other pages).  I added it as the last item in the list:

         
<include hint="list:IncludeTemplate">
  <!--Product Overview Page--> 
  <templateId>{1A1706B9-95AF-408D-ACF6-A158315333F4}</templateId>
  <!--Product Module Detail Page--> 
  <templateId>{1263F902-B309-47B1-BCE5-25305167E15C}</templateId>
  <!--Product Module Landing Page--> 
  <templateId>{ECA7823D-4BA9-4995-836B-85FA90F00481}</templateId>
  <!--Related Product Details--> 
  <templateId>{57F36C1B-9292-4386-96DB-8B87A2FBBE28}</templateId>
  <!--Alternative Text--> 
  <templateId>{82B30165-9698-4E79-8445-9F2BEF315382}</templateId>
</include>


If my search JSON included only IDs for pages of the new template type, the query returned what I expected. It also worked when I added IDs for other page types, as long as the number of IDs for any one of the original templates was smaller than the number for my new one.

But as soon as I sent a request containing more IDs for one of the original template types than for the new one, the results excluded pages of my new template type. It didn't matter where in the list I put the IDs for the new template type; they were always excluded. Weird or what!

As soon as I changed the order of my templates in the index configuration it all worked as expected:

         
<include hint="list:IncludeTemplate">
  <!--Alternative Text--> 
  <templateId>{82B30165-9698-4E79-8445-9F2BEF315382}</templateId>
  <!--Product Overview Page--> 
  <templateId>{1A1706B9-95AF-408D-ACF6-A158315333F4}</templateId>
  <!--Product Module Detail Page--> 
  <templateId>{1263F902-B309-47B1-BCE5-25305167E15C}</templateId>
  <!--Product Module Landing Page--> 
  <templateId>{ECA7823D-4BA9-4995-836B-85FA90F00481}</templateId>
  <!--Related Product Details--> 
  <templateId>{57F36C1B-9292-4386-96DB-8B87A2FBBE28}</templateId>
</include>


For reference, here is the query I used to return the search results:

using (var searchContext =
    ContentSearchManager.GetIndex(RelatedProductsSearchIndexName).CreateSearchContext())
{
    var query = GetBaseQueryForOtherType<RelatedProductSearchResultsItem>(searchContext, language);

    var predicate = PredicateBuilder.False<RelatedProductSearchResultsItem>();

    var predicateCategory = PredicateBuilder.False<RelatedProductSearchResultsItem>();

    if (relatedProductsRequest.Pages.Any())
    {
        predicateCategory =
            relatedProductsRequest.Pages.Select(item => new ID(item))
                .Aggregate(predicateCategory, (current, id) => current.Or(p => p.ItemId == id));

        predicate = predicate.Or(predicateCategory);
    }

    query = query.Where(predicate);

    if (relatedProductsRequest.ProductId != Guid.Empty)
    {
        query = query.Where(p => p.ProductPageID == relatedProductsRequest.ProductId);
    }

    count = query.Count();

    return query.Page(relatedProductsRequest.PageNumber, relatedProductsRequest.PageSize)
        .GetResults()
        .Hits.OrderByDescending(h => h.Score)
        .Select(h => h.Document)
        .ToList();
}

Friday, 16 October 2015

ReSharper and Complex Content Searching

On a recent project, I had to create a more complicated content search involving a number of parameters, some of which had to be grouped with ANDs and ORs. In doing this I ran into some issues with Sitecore's implementation of Linq and tools like ReSharper.

I have a number of different filters passed into my search results method, with each search parameter being a list of GUIDs:

public static IList<ProductSearchResultItem> GetAllProductsByFilter(
       List<Guid> categories,
       List<Guid> subCategories, List<Guid> brands,
       List<Guid> agesAndStages, List<Guid> features,
       List<Guid> searchTerms, List<Guid> brandCollections,
       List<Guid> products, List<Guid> colors,
       List<Guid> fashions, List<Guid> subBrands,
       bool includeProducts, ProductFilterSortOrder sortOrder,
       int pageNumber, int pageSize, out int count)

I then need to build the predicates that make up my search expression. The issue arises when you do something like:

var query = GetBaseQuery(searchContext);

var predicate = PredicateBuilder.True<ProductSearchResultItem>();
var predicateCategory = PredicateBuilder.False<ProductSearchResultItem>();

foreach (var category in categories)
{
   var localCategory = category;
   predicateCategory = predicateCategory.Or(p => p.Categories.Contains(localCategory));
}

ReSharper will try to refactor the loop to something like:

predicateCategory = categories.Aggregate(predicateCategory, (current, localCategory) => current.Or(p => p.Categories.Contains(localCategory)));

The Aggregate extension method is not supported by Linq to Sitecore and will throw an exception when the query is evaluated (for example, on ToList()).

To stop this happening, you can tell ReSharper to ignore the loop with a comment:

// ReSharper disable LoopCanBeConvertedToQuery
foreach (var category in categories)
{
   var localCategory = category;
   predicateCategory = predicateCategory.Or(p => p.Categories.Contains(localCategory));
}

This stops ReSharper from suggesting the refactoring from this point in the file until a matching restore comment.

Another gotcha I found was when I came to evaluate the expression using:

query = Queryable.Where(query, predicate);

ReSharper will suggest changing this to use the extension method, which will break Linq to Sitecore:

query = query.Where(predicate);

You can suppress the suggestion for just the next statement:

// ReSharper disable once InvokeAsExtensionMethod
query = Queryable.Where(query, predicate);
// ReSharper restore LoopCanBeConvertedToQuery
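
The OR-building loop in this post can be tried outside Sitecore. What follows is a minimal sketch using plain System.Linq.Expressions and LINQ to Objects, not Sitecore's PredicateBuilder: the `Or` helper and the `MatchAny` method are hypothetical names written for this example only. It starts from a predicate that is always false and ORs in one comparison per ID, which is the same shape as the category loop above. The local copy inside the loop mattered on C# compilers before version 5, where every lambda created in a foreach captured the same shared loop variable; on later compilers it is harmless.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class OrPredicateSketch
{
    // Hypothetical helper (not Sitecore's): ORs two predicates by invoking
    // the right-hand one with the left-hand one's parameter.
    public static Expression<Func<T, bool>> Or<T>(
        Expression<Func<T, bool>> left,
        Expression<Func<T, bool>> right)
    {
        var invoked = Expression.Invoke(right, left.Parameters.Cast<Expression>());
        return Expression.Lambda<Func<T, bool>>(
            Expression.OrElse(left.Body, invoked), left.Parameters);
    }

    // Builds "g == id1 || g == id2 || ..." starting from an always-false seed.
    public static Expression<Func<Guid, bool>> MatchAny(IEnumerable<Guid> ids)
    {
        Expression<Func<Guid, bool>> predicate = g => false;

        foreach (var id in ids)
        {
            var localId = id; // local copy, as in the post
            predicate = Or(predicate, g => g == localId);
        }

        return predicate;
    }

    public static void Main()
    {
        var wanted = new[] { Guid.NewGuid(), Guid.NewGuid() };
        var match = MatchAny(wanted).Compile();

        Console.WriteLine(match(wanted[0]));     // one of the wanted IDs
        Console.WriteLine(match(Guid.NewGuid())); // an unrelated ID
    }
}
```

Because the seed is always false, an empty ID list produces a predicate that matches nothing, which is worth remembering before passing it to Where.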

Thursday, 15 October 2015

Sitecore indexed results and paging

In the past I often used a query like this to bring back paged results from a Sitecore index:

query.Skip(page*pageSize).Take(pageSize);

However, after some research I realized that this is inefficient and can actually cause issues, because the query is performed in memory.

N.B. Sitecore supports a reduced set of Linq operators, so be careful when refactoring Sitecore Linq statements: tools like ReSharper can break them.

Instead, the approach Sitecore suggests is to use the Page extension method, which is part of the QueryableExtensions class in Sitecore.ContentSearch.Linq.

The above would become:

query.Page(page, pageSize);

Much neater!
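
To make the page arithmetic concrete, here is a minimal sketch over an in-memory list. The `PageOf` helper is hypothetical and written for this example only; it is not the Sitecore extension method. It assumes a zero-based page number, which matches the Skip(page*pageSize) form above. It only illustrates which items land on which page and says nothing about how a Sitecore index executes either call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PagingSketch
{
    // Hypothetical stand-in for a Page-style helper.
    // Assumption: page is zero-based, so page 0 is the first page.
    public static IEnumerable<T> PageOf<T>(IEnumerable<T> source, int page, int pageSize)
    {
        return source.Skip(page * pageSize).Take(pageSize);
    }

    public static void Main()
    {
        var items = Enumerable.Range(1, 25).ToList();

        // Second page of ten items: skips items 1-10, takes items 11-20.
        var secondPage = PageOf(items, 1, 10).ToList();
        Console.WriteLine(string.Join(",", secondPage));

        // The last page is short: only items 21-25 remain.
        var lastPage = PageOf(items, 2, 10).ToList();
        Console.WriteLine(lastPage.Count);
    }
}
```

If your API exposes one-based page numbers to callers, subtract one before calling a zero-based helper like this, otherwise the first page of results is silently skipped.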