Recent Posts

A Better Guide On Producing High-Quality Figures in LaTeX Using matplotlib

In a previous post, I briefly introduced using matplotlib to generate vector graphics for scientific papers. Some of the steps in that post were unclear. This newer version is a better guide, with more concrete instructions and some additional updates.

Using Huge, Heterogenous Datasets in TensorFlow

When using TensorFlow, a dataset is sometimes too large to fit in main memory. TensorFlow provides the tf.data.Dataset API to reduce the memory footprint and improve efficiency when working with large datasets. However, the examples in the documentation are built around common data types such as text and images, and it is unclear how to adapt these approaches to other kinds of huge custom datasets. In this post, I discuss a method that I developed for huge datasets containing heterogenous data types (based on https://www.tensorflow.org/api_docs/python/tf/data/TFRecordDataset). This technique stores the dataset on the hard drive, which reduces memory consumption, and it still lets one use the dataset APIs for efficient dataset transformations and multi-GPU training.

Improving Pygments for Arbitrary LaTeX Escaping and Custom Highlighting

In LaTeX, the minted syntax highlighting package does not provide a way to highlight or gray out selected code segments. The limitations of the LaTeX formatter in Pygments also make it difficult to use escaped command sequences inside strings or comments. In this post, I share my solutions to these two problems. It is based on my answer to a TeX.SE question: Resources for “beautifying” beamer presentations with source code?
