Improving WordPress Search with Algolia

Instant search results on the letters "AERO"
The WordPress search function works pretty well, but everything can be improved.

WordPress comes with search functionality bundled into its core software. For many sites, this does everything you need it to do.

However, we often take on projects that need more. WordPress is increasingly used to manage the content of enterprise-class sites—sites with thousands of pages, or sites that comprise networks of related websites (think sample.edu, plus academics.sample.edu, plus admissions.sample.edu). In these cases, it’s often worth investigating external solutions, and Algolia is one of the best.

How do we do it?

UPDATE 23 March 2021: When we built the solution described below, there was no plugin for Algolia integration, official or otherwise. WebDevStudios, which produced the very popular Custom Post Type UI plugin, has since released an Algolia plugin. It doesn’t really support WordPress Multisite, though, so consider reading on nonetheless.

Algolia doesn’t provide a WordPress plugin, but the API is well documented, and they do provide a WordPress integration QuickStart. From a development standpoint, creating your own integration is mostly a matter of rolling up your sleeves and getting your hands into some code.

Broadly, this is how Algolia works:

  1. You index your site’s content.
  2. Algolia hosts the index.
  3. When someone performs a search on your site, they’re not really searching your WordPress database; instead, they’re searching the index, which is much, much faster.
  4. Results are returned in a custom UI.

Indexing is where things get tricky, though the documentation provides a good starting point. This is the beginning of the PHP function that indexes content on the site:

public function reindex_post($args, $assoc_args) {
  global $algolia;
  $index = $algolia->initIndex('index_name');

  $index->clearObjects()->wait();

  $paged = 1;
  $count = 0;

  do {
    $posts = new WP_Query([
        'posts_per_page' => 100,
        'paged' => $paged,
        'post_type' => 'post'
    ]);
[...]
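
The paging pattern in that loop can be sketched on its own, outside WordPress. In this minimal sketch, lh_index_in_batches() and its two callables are hypothetical names: $fetch_page stands in for the WP_Query call and $save_batch for the Algolia client's saveObjects(), so the code runs as plain PHP. It illustrates the pattern only and is not the elided remainder of the function above.

```php
<?php
/**
 * Page through a content source in fixed-size batches and pass each batch on.
 *
 * $fetch_page(int $page, int $per_page): array  hypothetical stand-in for WP_Query
 * $save_batch(array $records): void             hypothetical stand-in for saveObjects()
 *
 * Returns the total number of items handed to $save_batch.
 */
function lh_index_in_batches(callable $fetch_page, callable $save_batch, int $per_page = 100): int
{
    $paged = 1;
    $count = 0;

    do {
        $items = $fetch_page($paged, $per_page);

        if ($items) {
            // Send one request per page of results instead of one per post.
            $save_batch($items);
            $count += count($items);
        }

        $paged++;
        // A short (or empty) page means there is nothing left to fetch.
    } while (count($items) === $per_page);

    return $count;
}
```

Fetching a fixed number of posts per request keeps memory use flat however large the site is, and the loop ends on the first short or empty page.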

First, we needed to wrap that “do/while” loop in a function that would spool through all the sites in our network. Luckily, there were only six:

$blog_ids = [1, 4, 5, 6, 7, 8];
foreach( $blog_ids as $blog_id ){
  switch_to_blog( $blog_id );
[...]
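
One thing worth pairing with a loop like this is a restore_current_blog() call after each switch_to_blog(), so WordPress returns to the site it started on. A minimal sketch, assuming a hypothetical helper named lh_for_each_site(); the optional $switch and $restore parameters exist only so the helper can be exercised outside WordPress, and they default to the core functions:

```php
<?php
/**
 * Run $work once for each site ID, switching context before and restoring after.
 *
 * lh_for_each_site() is a hypothetical helper, not part of WordPress or Algolia.
 * Returns an array of $work's return values keyed by site ID.
 */
function lh_for_each_site(array $blog_ids, callable $work, ?callable $switch = null, ?callable $restore = null): array
{
    // Inside WordPress Multisite these resolve to the core functions.
    $switch  = $switch ?? 'switch_to_blog';
    $restore = $restore ?? 'restore_current_blog';

    $results = [];

    foreach ($blog_ids as $blog_id) {
        $switch($blog_id);
        try {
            $results[$blog_id] = $work($blog_id);
        } finally {
            // Always undo the switch, even if indexing this site throws.
            $restore();
        }
    }

    return $results;
}
```

Inside WordPress, a call might look like lh_for_each_site([1, 4, 5, 6, 7, 8], $reindex_one_site), where $reindex_one_site is whatever callable indexes the currently active site.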

Then we had a few custom post types, which needed to be added into the WP_Query:

$posts = new WP_Query([
  'posts_per_page' => 100,
  'paged' => $paged,
  'post_type' => array('page','post','profile','event'),
]);

Moving on: Algolia is optimized to allow rapid searching of concise bits of metadata, which is why it’s an excellent e-commerce solution. Because we were indexing all the content from all the pages and posts on all our sites, the records themselves got bigger than Algolia expects.

The API documentation provides this bit of starter code:

function algolia_post_to_record(WP_Post $post) {
  $tags = array_map(function (WP_Term $term) {
      return $term->name;
  }, wp_get_post_terms($post->ID, 'post_tag'));

  return [
  'content' => strip_tags($post->post_content),
[...]

We ended up needing to be very careful about what got pushed into that content value. We also needed to strip out HTML tags more cleanly and limit the size of the record. The code we developed includes this:

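// Pad each tag with a space so words in adjacent elements don't run together.
$content_raw = $post->post_content;
$spaceString = str_replace( '<', ' <', $content_raw );
$doubleSpace = strip_tags( $spaceString );
// Keep only the first 10,000 bytes to limit the size of the record.
$content10k = substr( $doubleSpace, 0, 10000 );
// Collapse the double spaces left behind by the padding.
$content = str_replace( '  ', ' ', $content10k );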
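
A slightly more defensive variant of the same idea is sketched below. The function name lh_clean_content() and the 10,000-byte default are assumptions for illustration, not Algolia requirements. It collapses any run of whitespace rather than only double spaces, decodes HTML entities, and truncates with mb_strcut() so that a multibyte character is never split at the cut-off (this assumes the mbstring extension is loaded).

```php
<?php
/**
 * Turn post HTML into plain text that fits a byte budget.
 *
 * lh_clean_content() and the $max_bytes default are illustrative assumptions.
 */
function lh_clean_content(string $html, int $max_bytes = 10000): string
{
    // Pad tags so "</p><p>" does not glue two words together once tags are removed.
    $padded = str_replace('<', ' <', $html);

    // Remove markup, then turn entities such as &amp; back into characters.
    $text = html_entity_decode(strip_tags($padded), ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Collapse every run of whitespace to one space; keep the text as-is
    // if preg_replace() fails (for example on invalid UTF-8 input).
    $text = trim(preg_replace('/\s+/u', ' ', $text) ?? $text);

    // Cut on a byte budget without splitting a multibyte character.
    return mb_strcut($text, 0, $max_bytes, 'UTF-8');
}
```

Truncating after the whitespace is collapsed means the byte budget is spent on words rather than on padding.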

Finally, the API as described requires the user to access the web server in a terminal and run WP-CLI (WordPress command-line interface) commands, which is … not great. We’ve come this far to provide a better user experience for our site’s visitors. Surely we can spend five minutes making the UX better for the site’s managers, too.

Screenshot of the WordPress backend of this site. Highlighting a button that the user can click to rebuild the search index.

Much better. Now we have a blazing-fast search that supports WordPress multisite and custom post types—and can be managed by someone without a technical background.
