# Extending Jekyll: A Simple Ruby Program For Indexing

Last semester I tried taking my notes digitally with the help of $$\LaTeX$$, and I found it really helpful for me and my classmates. However, $$\LaTeX$$ files need compiling, and the resulting PDF files are not well integrated into the web environment either, which makes it very difficult to push updates to the notes. Therefore, I decided to use my blog as the new platform for my notes. Composing notes with Jekyll works great, with only one thing missing - indexing, which is very helpful for class notes. Since I could not find an existing Jekyll plugin supporting this feature, I had to build one on my own, with Ruby.

WARNING: I just started writing Ruby today! You may be seeing some of the worst Ruby code possible!

## Set up Jekyll for custom plugins

In _config.yml, specify the plugin folder:

    plugins_dir: ./_plugins


In my case, it is the ./_plugins folder. Then put your own Ruby script (e.g. my_plugin.rb) into the folder and restart Jekyll. Jekyll needs to be restarted every time the script is updated. Note that using the github-pages gem will disable custom plugins.

## Build custom Jekyll filter

My purpose is very similar to jekyll-toc’s, so I referred to that package a lot. To build a filter, one needs to create a Ruby module with the following structure and register it with Liquid.

    module Jekyll
      module MyModule
        # A filter receives the piped value as its first argument.
        def filter_method_1(input)
        end
      end
    end

    Liquid::Template.register_filter(Jekyll::MyModule)
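
To make the skeleton concrete, here is a minimal sketch of a complete filter. The module name ShoutFilter and the shout method are hypothetical, made up for illustration only; the guard on Liquid is there just so the snippet also runs outside Jekyll.

```ruby
# Hypothetical example module; the names are for illustration only.
module Jekyll
  module ShoutFilter
    # A Liquid filter receives the piped value as its first argument.
    def shout(input)
      "#{input.to_s.upcase}!"
    end
  end
end

# Register only when Liquid is loaded (it is when Jekyll runs the plugin).
Liquid::Template.register_filter(Jekyll::ShoutFilter) if defined?(Liquid)

# Outside Jekyll, the method can be tried by mixing the module into an object.
helper = Object.new.extend(Jekyll::ShoutFilter)
puts helper.shout('hello') # prints "HELLO!"
```

Once registered, a filter like this would be called from a template as {{ page.title | shout }}.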


My code is shown below:

    require 'digest/sha1'

    module Jekyll
      module JekyllIndexTermFilter
        # Wraps every marked term in a highlighted span with a named anchor.
        def indexedbody(content)
          pattern = /% {(.*?)} %/ # spaces added for escaping only; remove them in real use
          all_index_terms = content.scan(pattern)

          # Warn about terms that are indexed more than once.
          counter = Hash.new(0)
          all_index_terms.each { |v| counter[v[0]] += 1 }
          counter.each do |k, v|
            puts "\e[33mwarning: index term \"#{k}\" appeared #{v} times!\e[0m" if v > 1
          end

          # Map each term to the SHA-1 digest used as its anchor name.
          tag_conv = {}
          all_index_terms.each { |term| tag_conv[term[0]] = Digest::SHA1.hexdigest(term[0]) }

          tag_style = '<span class="indexed-term-style"><a name="%s"></a>%s</span>'
          content.gsub(pattern) { tag_style % [tag_conv[$1], $1] }
        end

        # Returns the links that point back to the indexed terms.
        def indexedterms(content)
          pattern = /% {(.*?)} %/ # spaces added for escaping only; remove them in real use
          tag_conv = {}
          content.scan(pattern).each { |term| tag_conv[term[0]] = Digest::SHA1.hexdigest(term[0]) }

          index_term_style = '<a href="#%s">%s</a>'
          # Sort case-insensitively but keep the original spelling, so the
          # lookup in tag_conv still finds the digest.
          links = tag_conv.keys.sort_by(&:downcase).map do |item|
            index_term_style % [tag_conv[item], item]
          end
          links.join("<br>\n") # the separator is an arbitrary choice; adjust to taste
        end
      end
    end

    Liquid::Template.register_filter(Jekyll::JekyllIndexTermFilter)


I added the spaces inside the pattern just for escaping. They should be removed in actual use.
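
With the escaping spaces removed, the pattern can be tried on its own. This is a minimal sketch: the constant name INDEX_PATTERN, the sample sentence, and the explicitly escaped braces are my additions for illustration, not part of the plugin.

```ruby
# Pattern without the escaping spaces; the braces are escaped to be explicit.
INDEX_PATTERN = /%\{(.*?)\}%/

sample = 'A %{group}% is a set with an operation; a %{ring}% has two.'

# scan returns one single-element array per match (one capture group).
terms = sample.scan(INDEX_PATTERN).map(&:first)
p terms # => ["group", "ring"]

# Replacing each match with its capture strips the markers.
plain = sample.gsub(INDEX_PATTERN) { Regexp.last_match(1) }
puts plain
```

The non-greedy (.*?) matters here: a greedy (.*) would swallow everything between the first %{ and the last }% on a line.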

## Usage

It is fairly easy to use. In my CSS style sheet, I added the following style to highlight the indexed words:

    .indexed-term-style {
      background-color: #FFFACD;
    }

Then, in the layout template, the page content goes through the two filters:

    {{ content | indexedbody }}
    <h2 id="index">Index</h2>
    {{ content | indexedterms }}


The index can now be triggered by surrounding a word or phrase with %{...}%. For example, %{this word}% gives this word and %{another word}% gives another word, both highlighted, and the index is listed in the last section. The link names are the SHA-1 digests of the indexed phrases, so the chance of a collision is very small. Enjoy!
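
To see how the anchor and the index link stay in sync, here is a small sketch that builds both for a single term, using the same HTML templates as the filter code. The variable names are illustrative only.

```ruby
require 'digest/sha1'

term = 'this word' # sample term from the example above
anchor = Digest::SHA1.hexdigest(term)

# The digest is 40 hexadecimal characters and depends only on the term,
# so the anchor in the body and the link in the index always agree.
puts anchor.length # prints 40

body_html  = format('<span class="indexed-term-style"><a name="%s"></a>%s</span>', anchor, term)
index_html = format('<a href="#%s">%s</a>', anchor, term)
puts body_html
puts index_html
```

Because the digest is computed from the exact string, "This word" and "this word" get different anchors.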

More usage examples can be found in the editing cheat sheet.