I recently became interested in how we can programmatically solve the 15 puzzle. The 15 puzzle is a sliding puzzle consisting of a 4×4 board of tiles numbered from 1 to 15, with one empty space. The tiles are shuffled, and the goal is to slide them around until they are in order, i.e. reading from left to right and top to bottom, the tiles run from 1 to 15 starting from the top-left corner, with the empty space at the bottom-right corner.
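As a minimal illustration (not the solver itself, and the actual post may represent the state differently), the board can be modelled as a flat list with 0 standing for the empty space, which makes checking for the goal state a simple equality test:

```python
# A minimal sketch of the board representation (an assumption for
# illustration). The board is a flat list of 16 entries, with 0
# standing for the empty space at the bottom-right corner.
GOAL = list(range(1, 16)) + [0]

def is_solved(board):
    """Return True if the tiles run from 1 to 15 with the blank last."""
    return board == GOAL

print(is_solved([1, 2, 3, 4, 5, 6, 7, 8,
                 9, 10, 11, 12, 13, 14, 15, 0]))  # True
```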
It is common for the different projects you are working on to depend on different versions of Python. That is why pyenv is very handy for Python developers: it lets you switch between different Python versions easily. Combined with pyenv-virtualenv, it can also be used together with virtualenv to create isolated development environments for different projects with different dependencies.
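For instance, a typical workflow (assuming pyenv and pyenv-virtualenv are already installed; the version number and project name are just examples) might look like this:

```bash
# Install a specific Python version (the version here is just an example)
pyenv install 3.8.2

# Create a virtualenv based on that version for a project
pyenv virtualenv 3.8.2 my-project

# Activate it automatically whenever you enter the project directory
cd my-project/
pyenv local my-project
```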
To use a pre-trained BERT model, we need to convert the input data into an appropriate format, so that each sentence can be sent to the pre-trained model to obtain the corresponding embedding. This article introduces how this can be done using the modules and functions available in Hugging Face's transformers package (https://huggingface.co/transformers/index.html).
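As a rough sketch of what this looks like (the model name and the use of the [CLS] token as a sentence embedding are illustrative assumptions, not necessarily what the article settles on):

```python
# A minimal sketch using the transformers package; the model name and
# the choice of the [CLS] token as the sentence embedding are assumptions.
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("This is a sample sentence.", return_tensors="pt")
outputs = model(**inputs)

# last_hidden_state has shape (batch_size, sequence_length, hidden_size);
# take the vector at position 0, i.e. the [CLS] token, as the embedding.
embedding = outputs.last_hidden_state[:, 0, :]
```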
A trie is a very useful data structure. It is commonly used to represent a dictionary for looking up words in a vocabulary.
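A minimal sketch of a trie built from nested dicts (the end-of-word marker is just one possible convention):

```python
# A minimal trie sketch using nested dicts; the "_end" marker is just
# one possible convention for flagging complete words.
_end = "$"

def insert(trie, word):
    node = trie
    for ch in word:
        node = node.setdefault(ch, {})
    node[_end] = True

def contains(trie, word):
    node = trie
    for ch in word:
        if ch not in node:
            return False
        node = node[ch]
    return _end in node

trie = {}
insert(trie, "tea")
insert(trie, "ten")
print(contains(trie, "tea"))  # True
print(contains(trie, "te"))   # False (a prefix, not a complete word)
```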
Matplotlib's default fonts do not include glyphs for Unicode characters such as Chinese, Japanese and Korean characters, so out of the box these cannot be displayed in plots. This post introduces two different methods to allow these characters to be shown in the graphs.
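One common method (which may or may not match the two covered in the post) is to point matplotlib at a CJK-capable font explicitly; the font path below is an assumption and should be replaced with a font available on your system:

```python
# A sketch of one approach: load a CJK-capable font explicitly.
# The font path is an assumption; substitute a font on your system.
import matplotlib.pyplot as plt
from matplotlib.font_manager import FontProperties

font = FontProperties(fname="/path/to/NotoSansCJK-Regular.ttc")

fig, ax = plt.subplots()
ax.plot([1, 2, 3], [4, 5, 6])
ax.set_title("中文標題", fontproperties=font)  # Chinese title rendered with this font
plt.show()
```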
PyCon HK 2018 was held on 23rd–24th November 2018 at Cyberport. I gave a talk on how to deploy machine learning models in Python. The slides of the talk can be found at http://talks.albertauyeung.com/pycon2018-deploy-ml-models/.
N-grams are contiguous sequences of n items in a sentence. N can be 1, 2 or any other positive integer, although usually we do not consider very large values of N because such n-grams rarely appear in many different places.
When performing machine learning tasks related to natural language processing, we usually need to generate n-grams from input sentences. For example, in text classification tasks, in addition to using each individual token found in the corpus, we may want to add bi-grams or tri-grams as features to represent our documents. This post describes several different ways to generate n-grams quickly from input sentences in Python.
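One quick approach (just one of several such a post could cover) uses `zip` over shifted copies of the token list:

```python
# One quick way to generate n-grams from a tokenized sentence using zip;
# this is one possible approach, not necessarily the post's fastest one.
def ngrams(tokens, n):
    return list(zip(*(tokens[i:] for i in range(n))))

tokens = "the quick brown fox".split()
print(ngrams(tokens, 2))
# [('the', 'quick'), ('quick', 'brown'), ('brown', 'fox')]
```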
PyCon HK 2017 was held on 3rd–4th November 2017 at the City University of Hong Kong. I gave a talk on using gradient boosting machines in Python to perform machine learning. The slides of the talk can be found at http://talks.albertauyeung.com/pycon2017-gradient-boosting/.
pandas is one of the most commonly used Python libraries in data analysis and machine learning. It is versatile and can be used to handle many different types of data. Before feeding a model with training data, one would most probably pre-process the data and perform feature extraction on data stored as a pandas DataFrame. I have been using pandas extensively in my work, and have recently discovered that the time required to manipulate data stored in a DataFrame can vary hugely depending on the method you use.
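As a small illustration of the kind of difference involved (the data and column name are made up for this sketch, and the timings are not taken from the post), a vectorized operation is typically far faster than a row-wise `apply`:

```python
# A sketch comparing two ways of doing the same transformation;
# the data and timings are illustrative only.
import time

import numpy as np
import pandas as pd

df = pd.DataFrame({"x": np.random.rand(1_000_000)})

start = time.time()
slow = df["x"].apply(lambda v: v * 2)  # calls a Python function per row
print(f"apply:      {time.time() - start:.3f}s")

start = time.time()
fast = df["x"] * 2                     # vectorized, runs in C under the hood
print(f"vectorized: {time.time() - start:.3f}s")
```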
In natural language processing, it is a common task to extract words or phrases of particular types from a given sentence or paragraph. For example, when performing analysis of a corpus of news articles, we may want to know which countries are mentioned in the articles, and how many articles are related to each of these countries.
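One way to do this kind of extraction (the post itself may use a different method) is with a named-entity recognizer such as the one in spaCy:

```python
# A sketch using spaCy's named-entity recognizer; this is one possible
# approach and may differ from the method the post describes.
# Requires: python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("France and Germany signed a trade agreement with Japan.")

# "GPE" covers countries, cities and states in spaCy's label scheme
countries = [ent.text for ent in doc.ents if ent.label_ == "GPE"]
print(countries)  # e.g. ['France', 'Germany', 'Japan']
```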
It probably goes without saying that there is too much information on the Web nowadays. Search engines help us a little bit, but what is better is to have something interesting recommended to us automatically, without asking. Indeed, from something as simple as a list of the most popular questions and answers on Quora to the more personalized recommendations we receive on Amazon, we are offered recommendations all over the Web.