A Better Guide On Producing High-Quality Figures in LaTeX Using matplotlib

In a previous post, I briefly introduced using matplotlib to generate vector graphics for scientific papers. Some of the steps in that post are unclear. This newer version is a better guide with more concrete instructions and some additional updates.

Using Huge, Heterogenous Datasets in TensorFlow

When using TensorFlow, a dataset can sometimes be so large that it cannot fit entirely in main memory. TensorFlow provides the tf.data.Dataset API to reduce the memory footprint and improve efficiency when working with large datasets. However, the examples in the documentation are built around common data types such as text and images, and it is unclear how to adapt these approaches to other kinds of huge custom datasets. In this post, I discuss a method that I developed for huge datasets containing heterogeneous data types (based on https://www.tensorflow.org/api_docs/python/tf/data/TFRecordDataset). This technique stores the dataset on the hard drive, which reduces memory consumption. It also allows one to use the dataset APIs for efficient dataset transformations and multi-GPU training.
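As a rough illustration of the general idea (a minimal sketch, not necessarily the exact method from the post), the snippet below writes a few records that mix an integer, a string and a variable-length float array into a TFRecord file, then streams them back with tf.data.TFRecordDataset. The feature names (label, name, signal) and the helper functions serialize_sample and parse_sample are hypothetical and chosen only for this example.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def serialize_sample(label: int, name: str, signal: np.ndarray) -> bytes:
    """Pack one heterogeneous sample into a serialized tf.train.Example."""
    feature = {
        "label": tf.train.Feature(
            int64_list=tf.train.Int64List(value=[label])),
        "name": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[name.encode("utf-8")])),
        "signal": tf.train.Feature(
            float_list=tf.train.FloatList(value=signal.tolist())),
    }
    example = tf.train.Example(features=tf.train.Features(feature=feature))
    return example.SerializeToString()


# Describes how to decode each (hypothetical) field when reading back.
FEATURE_SPEC = {
    "label": tf.io.FixedLenFeature([], tf.int64),
    "name": tf.io.FixedLenFeature([], tf.string),
    "signal": tf.io.VarLenFeature(tf.float32),  # length differs per sample
}


def parse_sample(record):
    """Decode one serialized record into a dict of tensors."""
    parsed = tf.io.parse_single_example(record, FEATURE_SPEC)
    parsed["signal"] = tf.sparse.to_dense(parsed["signal"])
    return parsed


# Write three small samples to disk; a real dataset would be far larger.
path = os.path.join(tempfile.mkdtemp(), "samples.tfrecord")
with tf.io.TFRecordWriter(path) as writer:
    for i in range(3):
        signal = np.arange(i + 1, dtype=np.float32)
        writer.write(serialize_sample(i, f"sample-{i}", signal))

# Records are read from disk lazily instead of being held in memory.
dataset = tf.data.TFRecordDataset(path).map(parse_sample)
labels = [int(item["label"]) for item in dataset]
signal_lengths = [int(item["signal"].shape[0]) for item in dataset]
```

Because the records live on disk and are decoded on demand, the usual tf.data transformations (shuffle, padded_batch, prefetch) can be chained after map without loading the whole dataset at once.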

Improving Pygments for Arbitrary LaTeX Escaping and Custom Highlighting

In LaTeX, the minted syntax highlighting package does not provide the ability to highlight or gray out certain code segments. The limitations of the LaTeX formatter in Pygments also make it difficult to use escaped command sequences in strings or comments. In this post, I share my solutions to these two challenges. This is based on my answer to a TeX.SE question: Resources for “beautifying” beamer presentations with source code?.

matplotlib: High Quality Vector Graphics for LaTeX Paper

This post is obsolete. A newer guide is available here.

matplotlib is an extremely useful tool for scientific plotting. Many researchers use it to create plots for their publications. However, matplotlib uses sans-serif fonts by default, which are rarely used in scientific papers. As a result, if one inserts figures from matplotlib directly, they end up with a different style from the rest of the paper, which is not very aesthetic. In this post, I discuss my ways of tweaking matplotlib so that it generates high-quality vector graphics that fit your LaTeX paper.
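To make the font mismatch concrete, here is a minimal sketch of one possible adjustment; it is not necessarily the configuration recommended in either guide. It switches matplotlib to a serif font family with Computer Modern style math text and saves the figure as a PDF, which is a vector format. The function name save_demo_figure, the output file name and the figure size are assumptions made for this example.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, so no display is required
import matplotlib.pyplot as plt

# Replace the default sans-serif look with serif text and
# Computer Modern style math, closer to a typical LaTeX paper.
plt.rcParams.update({
    "font.family": "serif",
    "mathtext.fontset": "cm",
    "font.size": 10,
    "pdf.fonttype": 42,  # embed fonts as TrueType instead of Type 3
})


def save_demo_figure(path: str) -> str:
    """Plot y = x^2 and save it as a vector PDF at the given path."""
    xs = [0, 1, 2, 3, 4]
    ys = [x ** 2 for x in xs]
    # Hypothetical size in inches, roughly one column of a two-column paper.
    fig, ax = plt.subplots(figsize=(3.3, 2.2))
    ax.plot(xs, ys, label=r"$y = x^2$")
    ax.set_xlabel(r"$x$")
    ax.set_ylabel(r"$y$")
    ax.legend()
    fig.tight_layout()
    fig.savefig(path)  # the .pdf extension selects the PDF backend
    plt.close(fig)
    return path
```

Calling save_demo_figure("demo_figure.pdf") produces a file that can be included with \includegraphics; matching the font size to the document body keeps labels consistent with the surrounding text.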

Celebrating 2021

2020 has been a rough year for everyone. The global population is suffering from the pandemic, unemployment and social injustice. Fortunately, 2020 was kinder to me than to many others. Most notably, I was admitted to Purdue ECE as a PhD student, which enabled me to continue pursuing what I truly love. Being separated from my family across the ocean and kept out of my office due to social distancing restrictions, I experienced slight depression and reduced productivity from time to time. Hopefully, 2021 will be a better year for us all.

A TikZ vector animation in PDF. See more in the post.

Reveal Thyself! Visualizing LuaTeX Node Structure

LuaTeX is a wonderful extension to the traditional family of TeX engines (e.g. pdfTeX, XeTeX), for it provides a powerful programming back-end for LaTeX. With the help of Lua, it is easier to work with the file system, carry out numerical computations, apply simple data processing, handle verbatim content and so on. LuaTeX is the dream tool for automatic high-quality PDF document generation.

The TeX compiler stores the document structure in terms of nodes, which are kept in linked lists that contain data and metadata about each character, paragraph and page. With LuaTeX, these internal data structures become visible to TeX users for the first time, which means it is possible to view or even manipulate a document by accessing the internals of a TeX compiler. This feature sounds very promising. Sadly, documentation about LuaTeX is scarce. The most comprehensive document about LuaTeX is its manual, which spends no more than a few paragraphs, if not a few sentences, on most of its functionality. With this manual alone, it is very difficult to understand thoroughly how LuaTeX nodes work. In this post, I provide a visualization of LuaTeX’s node structure that revolutionizes all existing presentations of LuaTeX nodes, which are usually faulty and text-based.

GitHub: https://github.com/xziyue/luatex-node-inspect

Gaming with LaTeX: PDF-based Tic-tac-toe

Why aren’t there games written in LaTeX? In this post, I explore ways to include a (rather weak) AI in a PDF-based tic-tac-toe game, with the help of LaTeX. The results can be found in the following repo:

https://github.com/xziyue/tictactoe-pdf

LaTeX3: Programming in LaTeX with Ease

Many people view LaTeX as a typesetting language and overlook the importance of programming in the document generation process. As a matter of fact, many large and structured documents can benefit from a programming backend, which enhances layout standardization, symbol coherence, editing speed and many other aspects. Although standard LaTeX (LaTeX2e) is already Turing complete, which means it is capable of expressing any computation, the design of many programming interfaces is highly inconsistent due to compatibility considerations. This makes programming with LaTeX2e very challenging and tedious, even for seasoned computer programmers.

To make programming in LaTeX easier, the LaTeX3 interface was introduced, which aims to provide LaTeX programmers with a syntax and library similar to those of modern programming languages. Unfortunately, there is little material on this wonderful language. When I started learning it, I had to go through its complex technical manual, which was time-consuming. Therefore, I decided to write a LaTeX3 tutorial that is easy to understand for general programmers.

The Accuracy of l3fp: a Series of Test Cases

I was reading LaTeX3’s documentation for \dim_to_fp:n, where I encountered the following description:

Expands to an internal floating point number equal to the value of the <dimexpr> in pt. Since dimension expressions are evaluated much faster than their floating point equivalent, \dim_to_fp:n can be used to speed up parts of a computation where a low precision and a smaller range are acceptable.

At first, I was confused by the subject of this paragraph and thought that the precision of l3fp was not ideal. Because many data processing packages (e.g. datatool) rely on l3fp for floating point arithmetic, I had the idea of testing l3fp’s accuracy against IEEE 754 double precision floating point arithmetic. The results show that the error of l3fp is very small compared to IEEE 754 and is negligible in everyday applications. However, the trigonometric functions of l3fp seem to be significantly less precise than other operations.
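The comparison can be imitated outside of TeX with a small sketch. It does not run l3fp itself: it assumes that a 16-significant-digit decimal context in Python's decimal module is a reasonable stand-in for l3fp's 16-digit decimal floats, and that rounding to the nearest scaled point (1/65536 pt) approximates how a TeX dimension stores a length. The helper names relative_error and to_scaled_points are hypothetical.

```python
from decimal import Decimal, getcontext
from fractions import Fraction

# Assumption: 16 significant decimal digits stands in for l3fp's precision.
getcontext().prec = 16


def relative_error(approx: Fraction, exact: Fraction) -> float:
    """Relative error of approx with respect to a non-zero exact value."""
    return float(abs(approx - exact) / abs(exact))


def to_scaled_points(length_pt: Fraction) -> Fraction:
    """Round a length in pt to the nearest multiple of 1/65536 pt.

    TeX stores dimensions as integer multiples of 1 sp = 1/65536 pt;
    nearest rounding is a simplification of TeX's actual conversion.
    """
    return Fraction(round(length_pt * 65536), 65536)


exact_third = Fraction(1, 3)

# 1/3 computed with 16-digit decimal arithmetic (the l3fp stand-in).
err_decimal16 = relative_error(Fraction(Decimal(1) / Decimal(3)), exact_third)

# 1/3 computed with IEEE 754 double precision.
err_double = relative_error(Fraction(1.0 / 3.0), exact_third)

# 0.1 pt stored as a dimension, i.e. with fixed-point resolution.
exact_tenth = Fraction(1, 10)
err_dimension = relative_error(to_scaled_points(exact_tenth), exact_tenth)

print(f"16-digit decimal: {err_decimal16:.3e}")
print(f"IEEE 754 double:  {err_double:.3e}")
print(f"TeX dimension:    {err_dimension:.3e}")
```

In this sketch the two floating point formats land within about an order of magnitude of each other, while the fixed-point dimension is many orders of magnitude coarser, which matches the "low precision" wording in the \dim_to_fp:n description.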

LuaTeX: Mimicking File System Input

In LaTeX, verbatim environments are extremely tricky. Different verbatim environments are based on distinct LaTeX magic, which makes their behavior inconsistent. The only reliable way to use a verbatim environment is to write it in the TeX source as-is, for any attempt to construct such environments programmatically usually fails. Existing packages avoid this problem by first saving the constructed environments into files and then using the \input command to read them as TeX source files. I really dislike this solution as it induces significant I/O overhead (despite the fact that other I/O bottlenecks may be more dominant). In this post, I provide a LuaTeX-based method that allows \input from Lua strings.

(This post corresponds to my question on TeX.SE.)

