Extending Jekyll: A Simple Ruby Program For Indexing

Last semester I tried taking my notes digitally with the help of \(\LaTeX\), and it turned out to be really helpful for me and my classmates. However, \(\LaTeX\) files need compiling, and the resulting PDF files do not integrate well with the web, which makes it hard to publish updates to the notes. Therefore, I decided to use my blog as the new platform for my notes. Composing notes with Jekyll works great, with one thing missing: indexing, which is very helpful for class notes. Since I could not find an existing Jekyll plugin that supports this feature, I had to build one on my own, in Ruby.

WARNING: I just started writing Ruby today! You may be seeing some of the worst Ruby code possible!

Set up Jekyll for custom plugins

In _config.yml, specify the plugin folder:

plugins_dir: ./_plugins

In my case, it is the ./_plugins folder. Then put your own Ruby script (e.g. my_plugin.rb) into that folder and restart Jekyll. Jekyll needs to be restarted every time the script is updated. It is worth mentioning that using the github-pages package will disable custom plugins.

Build a custom Jekyll filter

My purpose is actually very similar to jekyll-toc’s, so I referred to that package a lot. To build a filter, one needs to create a Ruby module with the following structure and register it with Liquid.

module Jekyll
  module MyModule
    # Liquid passes the value being filtered as the first argument.
    def filter_method_1(input)
      'your code...'
    end
  end
end

Liquid::Template.register_filter(Jekyll::MyModule)
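Because a Liquid filter is just a method on a module that receives the filtered value as its first argument, it can be tried out without starting Jekyll. Here is a minimal sketch, assuming a hypothetical module ShoutFilter with a hypothetical filter shout (neither is part of my plugin). The registration line is left as a comment so the snippet runs without the liquid gem:

```ruby
# Hypothetical filter module, for illustration only.
module ShoutFilter
  # Liquid passes the value on the left of the pipe as the first argument.
  def shout(input)
    "#{input.to_s.upcase}!"
  end
end

# Inside a Jekyll plugin file this line would register the filter:
# Liquid::Template.register_filter(ShoutFilter)

# Outside Jekyll, mix the module into any object to call the filter directly.
tester = Object.new.extend(ShoutFilter)
puts tester.shout('hello')
```

Once registered, a filter like this would be called in a layout as {{ page.title | shout }}.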

My code is shown below:

require 'digest/sha1'

module Jekyll

	module JekyllIndexTermFilter
		def indexedbody(content)
			pattern = /% {(.*?)} %/ # remove the spaces around the braces in actual use!
			allIndexTerms = content.scan(pattern)

			counter = Hash.new(0)
			allIndexTerms.each do |v|
				counter[v[0]] += 1
			end

			counter.each do |k, v|
				if v > 1 then
					puts "\e[33mwarning: index term \"#{k}\" appeared #{v} times!\e[0m"
				end
			end

			tagConv = Hash.new
			for term in allIndexTerms do
				hashCode = Digest::SHA1.hexdigest(term[0])
				tagConv[term[0]] = hashCode
			end
			tagStyle = '<span class="indexed-term-style"><a name="%s"></a>%s</span>'
			result = content.gsub(pattern){|s| tagStyle % [tagConv[$1], $1]}
			result
		end

		def indexedterms(content)
			pattern = /% {(.*?)} %/ # remove the spaces around the braces in actual use!
			allIndexTerms = content.scan(pattern)
			tagConv = Hash.new
			for term in allIndexTerms do
				hashCode = Digest::SHA1.hexdigest(term[0])
				tagConv[term[0]] = hashCode
			end

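			# Sort case-insensitively, but keep the original keys so the hash lookup below succeeds.
			allTags = tagConv.keys.sort_by(&:downcase)
			indexTermStyle = '<a href="#%s">%s</a>'
			allLinks = Array.new

			for item in allTags do
				# Display the term in lower case; look it up by its original spelling.
				link = indexTermStyle % [tagConv[item], item.downcase]
				allLinks.append(link)
			end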

			allLinks.join('&emsp;&emsp;')
		end

	end

end

Liquid::Template.register_filter(Jekyll::JekyllIndexTermFilter)

The spaces in the pattern above are there only to escape it on this page. Remove them in actual use.
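The two steps of the plugin, marking the terms in the body and collecting the links for the index, can also be checked from a plain Ruby script. The following is only a sketch under a few assumptions: the method names mark_terms and index_links are hypothetical, the pattern is written without the escaping spaces and with escaped braces, and the index keeps the original spelling of each term instead of lowercasing it.

```ruby
require 'digest/sha1'

# Pattern without the escaping spaces; the braces are escaped to keep the regex unambiguous.
PATTERN = /%\{(.*?)\}%/

# Replace every marked term with a highlighted span carrying a named anchor.
def mark_terms(text)
  text.gsub(PATTERN) do
    term = Regexp.last_match(1)
    format('<span class="indexed-term-style"><a name="%s"></a>%s</span>',
           Digest::SHA1.hexdigest(term), term)
  end
end

# Collect the distinct terms, sorted case-insensitively, as links to those anchors.
def index_links(text)
  terms = text.scan(PATTERN).flatten.uniq.sort_by(&:downcase)
  terms.map { |t| format('<a href="#%s">%s</a>', Digest::SHA1.hexdigest(t), t) }
end

sample = 'A %{monoid}% is a %{semigroup}% with an identity.'
puts mark_terms(sample)
puts index_links(sample).join(' ')
```

Running a script like this on a short sample is a quick way to check the pattern before restarting Jekyll.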

Usage

It is fairly easy to use. In my CSS style sheet, I added the following style to highlight the indexed words:

.indexed-term-style{
	background-color: #FFFACD;
}

And remember to add the filters into your layouts, e.g.

{{ content | indexedbody }}
<h2 id="index">Index</h2>
{{ content | indexedterms }}

Now the index functionality can be triggered by surrounding a word with %{...}%. For example, %{this word}% gives this word and %{another word}% gives another word, and the index is listed in the last section. The link names are SHA-1 digests of the indexed phrases, so the chance of a collision is very small. Enjoy!
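To see what the link names look like, the digests can be printed directly. This is a small sketch; anchor_for is a hypothetical helper name, while Digest::SHA1.hexdigest is the same call the plugin uses.

```ruby
require 'digest/sha1'

# Hypothetical helper: the anchor name for an indexed phrase.
def anchor_for(phrase)
  Digest::SHA1.hexdigest(phrase)
end

# Each phrase maps to a fixed 40-character hexadecimal name.
['this word', 'another word'].each do |phrase|
  puts "#{phrase} -> #{anchor_for(phrase)}"
end
```

The same phrase always produces the same name, so links stay stable across rebuilds. It also means that a phrase marked twice gets two anchors with the same name, which is why the plugin prints a warning for repeated terms.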

More usage examples can be found in the editing cheat sheet.

Updates

The latest version of this plugin can be found here.

Updates (07/05/2020)

The program has been rewritten for better quality. The code is here.

Index

another word · this word