So What’s Actually In My Search Index? (Part 1)

Posted on Monday, March 10th, 2014

Part 1 of 2

So you’ve finally gotten your SharePoint 2013 Search working. You’re crawlin’, people are finding stuff, and everybody’s generally having a good time. This Search thing is pretty handy. But have you, like me, ever wondered what’s actually in your Search’s index? You know there are these things called Managed Properties, but what nuggets of goodness are actually in them? Anything useful? Any cool new refiners to add?

In this brief, two-part series, I will show you how to see everything inside that Index you worked so hard to build. Part 1 will give you the tools to see what you currently have available in your Index. Part 2 will show you how to enable all the other properties (managed and crawled) that are waiting to be mined from your Index. Hopefully you’ll be able to find even more value in your Index.

What’s Available?

So you’ve got this nice Index (or Indices) sitting in your farm’s Search Service Application(s). You’ve got all of these managed properties with gobs and gobs of data. But did you know that there’s a whole lot more in that index than the title, URL, author, and create date?

If you want to see all of the Managed Properties you have available (more on these beauties in Part 2), you can do so in two places. They are defined in each Search Service Application (SSA) at the farm level, and they’re also defined at the individual site collection level.

To view the master list of Managed Properties in the SSA:

1. In a browser, navigate to Central Administration

2. In “Application Management,” click the “Manage Service Applications” link

3. Find the SSA that you’re interested in and click on its name

4. After the SSA admin page loads, click on the “Search Schema” link on the left side of the page. These are the managed properties defined at the SSA level.

Managed Properties defined at the site collection level can be managed by simply clicking the “Search Schema” link in the Site Collection Administration section of the Site Settings page.
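If you would rather not click through Central Administration, the same SSA-level list can be pulled from the SharePoint Management Shell. This is only a minimal sketch: it assumes it runs on a farm server under an account with rights to the Search Service Application, and it simply walks every SSA in the farm. The columns picked in Select-Object are a small selection for readability, not the full set of settings.

```powershell
# Load the SharePoint snap-in if this is a plain PowerShell window.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Walk every Search Service Application in the farm.
foreach ($ssa in Get-SPEnterpriseSearchServiceApplication) {
    Write-Host "Managed Properties in SSA '$($ssa.Name)':"

    # List the SSA-level Managed Properties with a few of their flags.
    Get-SPEnterpriseSearchMetadataManagedProperty -SearchApplication $ssa |
        Sort-Object Name |
        Select-Object Name, ManagedType, Retrievable, Queryable |
        Format-Table -AutoSize
}
```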

Show Me the Data!

So now that you’ve got the list of managed properties, what’s actually in them? What information is available? What in the world is dlcDocIdOWSTEXT or owstaxidmetadataalltagsinfo? To find out, you need to actually run a query that includes these fields and then view the results. Unfortunately, there isn’t a simple way to do this through the regular SharePoint GUI. In fact, I don’t know if it’s even possible without creating and configuring custom display templates. So what’s a SharePoint admin supposed to do?

Frustrated with a lack of visibility into my Index, I turned to our dear friend PowerShell and created a function to export all of the Managed Properties from a search. The function is called, appropriately enough, Export-SPSearchResultsWithAllManagedProperties. The script is below. It does the following:

1. Retrieve all SSAs to which a given site collection belongs (through the SSA proxies). The web app a site collection belongs to may subscribe to more than one SSA.

2. Retrieve all Managed Properties marked as Retrievable. If this flag is not set, the property returns no data in the query. No point in pulling Managed Properties which will return no data, right?

3. Build a basic keyword query on the site collection. Loop through all of the Managed Properties and add them as fields to be returned from the query.

4. Execute the keyword query against the site collection.

5. Export the results as a comma-separated file (CSV) to the location specified. Note that the file will be overwritten if it already exists.
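To make the five steps concrete, here is a stripped-down sketch of the same flow. It is not the Export-SPSearchResultsWithAllManagedProperties script itself, just an illustration written under a few assumptions: it runs on a farm server with the SharePoint snap-in available, it collects properties from every SSA in the farm instead of resolving the SSA proxies for the site collection, and the four variable values at the top are placeholders borrowed from the parameter examples in this post.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Placeholder values for illustration; substitute your own.
$siteUrl   = "http://sp13main01/sites/sales"
$queryText = "sales quota"
$rowLimit  = 100
$outFile   = "D:\Output\Results.csv"

$site = Get-SPSite $siteUrl
try {
    # Steps 1-2: names of all Managed Properties flagged as Retrievable.
    # Simplification (assumption): every SSA in the farm is used here,
    # not only the ones the site collection's web app subscribes to.
    $propertyNames = Get-SPEnterpriseSearchServiceApplication |
        ForEach-Object { Get-SPEnterpriseSearchMetadataManagedProperty -SearchApplication $_ } |
        Where-Object { $_.Retrievable } |
        ForEach-Object { $_.Name } |
        Sort-Object -Unique

    # Step 3: basic keyword query that asks for every one of those properties.
    $query = New-Object Microsoft.Office.Server.Search.Query.KeywordQuery($site)
    $query.QueryText = $queryText
    $query.RowLimit  = $rowLimit
    foreach ($name in $propertyNames) {
        [void]$query.SelectProperties.Add($name)
    }

    # Step 4: execute the query and pick out the main result table.
    $executor = New-Object Microsoft.Office.Server.Search.Query.SearchExecutor
    $tables   = $executor.ExecuteQuery($query)
    $relevant = $tables |
        Where-Object { $_.TableType -eq "RelevantResults" } |
        Select-Object -First 1

    # Step 5: write the rows to CSV (an existing file is overwritten).
    if ($relevant -ne $null) {
        $columns = $relevant.Table.Columns | ForEach-Object { $_.ColumnName }
        $relevant.Table.Rows |
            Select-Object -Property $columns |
            Export-Csv -Path $outFile -NoTypeInformation
    }
    else {
        Write-Warning "The query returned no RelevantResults table."
    }
}
finally {
    # SPSite objects hold unmanaged resources and should be disposed.
    $site.Dispose()
}
```

Selecting the column names explicitly keeps the DataRow bookkeeping properties (RowState, ItemArray, and so on) out of the CSV.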

Any keyword query can be used. Be careful with the scope of the query, though: the broader the search terms, the more data will be returned and the larger the results file will be. To return all items in the index (only recommended for very small site collections), use “*” as the query text.

To use the function, open a PowerShell window as Administrator and load the script like you would any other external script

(. .\Export-SPSearchResultsWithAllManagedProperties.ps1)

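Don’t forget the dots and the space: the first dot, followed by a space, dot-sources the script so the function stays available in your session. Once the script is loaded, call it like any other PowerShell function. Export-SPSearchResultsWithAllManagedProperties is called with the following parameters: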

* siteUrl – The URL of the site collection to run the search on (ex. “http://sp13main01/sites/sales”)

* queryText – The text of the search query (ex. “sales quota”)

* rowLimit – The number of search results to return (ex. 1000)

  * Note that each SSA has a property to define the maximum number of items to return in search results. The default is 500 items, but this can be increased by changing the MaxRowLimit property on the SSA. Use caution when increasing this property (see “Software boundaries and limits for SharePoint 2013” for more).

* outFile – The filename and path to output the file to (ex. “D:\Output\Results.csv”). If the file already exists, it will be overwritten.
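Put together, a typical session might look like the sketch below. It assumes the script file sits in the current directory and that the four parameters are passed by the names listed above; the values are the examples from the list. The MaxRowLimit check is an optional extra, included only so you can see the cap on each SSA before settling on a rowLimit.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Dot-source the script so the function is defined in this session.
. .\Export-SPSearchResultsWithAllManagedProperties.ps1

# Optional: show the current row cap on each SSA in the farm.
Get-SPEnterpriseSearchServiceApplication | Select-Object Name, MaxRowLimit

# Run the export using the example values from the parameter list.
Export-SPSearchResultsWithAllManagedProperties `
    -siteUrl "http://sp13main01/sites/sales" `
    -queryText "sales quota" `
    -rowLimit 1000 `
    -outFile "D:\Output\Results.csv"
```

With the default MaxRowLimit of 500, a rowLimit of 1000 is likely to come back with no more than 500 rows unless the cap has been raised on the SSA.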

For further help, just use the standard PowerShell Get-Help cmdlet (including the -Examples, -Detailed, and -Full switches) from a PowerShell command line.

After you run the export, all you need to do is open the CSV file in Excel and start browsing. You may be surprised at what’s available. One particularly interesting column to look for is RankDetail. This Managed Property isn’t populated until a search is run, and it will give detailed information on how the item was ranked in the search results.
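With a lot of Managed Properties in play, the CSV can get very wide, and many columns may be empty for a given query. If scrolling through Excel gets tedious, a small helper like the one below can tell you which columns actually contain data. Get-PopulatedColumn is a hypothetical name made up for this illustration; it is not part of the export script, and it only uses the standard Import-Csv cmdlet.

```powershell
# Hypothetical helper: report which CSV columns contain data and in how many rows.
function Get-PopulatedColumn {
    param(
        [Parameter(Mandatory = $true)]
        [string]$Path
    )

    # Force an array so a single-row file is handled the same way.
    $rows = @(Import-Csv -Path $Path)
    if ($rows.Count -eq 0) { return }

    # Column names come from the properties of the first row.
    $names = $rows[0].PSObject.Properties | ForEach-Object { $_.Name }

    foreach ($name in $names) {
        # Count the rows where this column is not empty.
        $filled = @($rows | Where-Object { -not [string]::IsNullOrEmpty($_.$name) }).Count
        if ($filled -gt 0) {
            New-Object PSObject -Property @{ Column = $name; RowsWithData = $filled }
        }
    }
}
```

For example, `Get-PopulatedColumn -Path "D:\Output\Results.csv" | Sort-Object RowsWithData -Descending` lists the most frequently populated columns first.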

Export-SPSearchResultsWithAllManagedProperties

Download Export-SPSearchResultsWithAllManagedProperties.ps1

Here’s the function itself. Save it as a .ps1 file, or paste it directly into a PowerShell window.

Hungry for More?

Export-SPSearchResultsWithAllManagedProperties will show you all Managed Property data for the query. However, there is potentially a lot more available than just the Managed Properties. There are also a ton of Crawled Properties that can be harvested.

Crawled Properties are all of the various properties the search crawler sees as it goes about its business. The SSA knows there’s something there but hasn’t actually retrieved the data yet. That’s where Managed Properties come in. Managed Properties are amped-up versions of Crawled Properties. Once a Managed Property is created for a Crawled Property, the data in that property gets stored in the Index and becomes available for use. Out of the box, only a subset of the Crawled Properties are promoted to be Managed Properties. In one of my environments, only 104 of 512 Crawled Properties are mapped to a Managed Property (another has 113 mapped out of 573). And only a subset of the Managed Properties are configured to be returned from a search.

Have you ever been curious about what’s in all of those Crawled Properties? What goodies are being left behind? If you want to see what else you can grab, then take a look at Part 2 of this series. In it, I will show you how to temporarily make Managed Properties out of all Crawled Properties (or those properties which are not retrievable) and how to remove them when you’re done.

Conclusion

The Export-SPSearchResultsWithAllManagedProperties script will enable you to see what’s already available to you in your Index. All that data took a good amount of time and resources to fetch, and it’s probably taking up a fair amount of disk space. By being aware of what’s in your Index, you will be in a better place to take further advantage of what’s available and receive a greater return on your investment. But you won’t know what’s there until you look!

Please post in the comments any Managed or Crawled Properties you’ve found to be useful. How were you able to take advantage of them? Perhaps somebody else will find your discovery useful to them as well.


Disclaimer
The sample scripts are not supported under any Summit 7 Systems standard support program or service. The sample scripts are provided AS IS without warranty of any kind. Summit 7 Systems further disclaims all implied warranties including, without limitation, any implied warranties of merchantability or of fitness for a particular purpose. The entire risk arising out of the use or performance of the sample scripts and documentation remains with you. In no event shall Summit 7 Systems, its authors, or anyone else involved in the creation, production, or delivery of the scripts be liable for any damages whatsoever (including, without limitation, damages for loss of business profits, business interruption, loss of business information, or other pecuniary loss) arising out of the use of or inability to use the sample scripts or documentation, even if Summit 7 Systems has been advised of the possibility of such damages.
