Kian Zohoury

Welcome to my personal blog, where I showcase my projects and talk about AI & machine learning.

Projects

Below is a list of recent projects. The code for each entry can be accessed through the provided GitHub link.

conditional VAEs for generating handwritten digits (MNIST)

Exploring conditional variational autoencoders (CVAEs) for guided image generation of handwritten digits, using the famous MNIST dataset.

Figure

While VAEs are not new, they are pivotal for understanding how variational inference allows us to approximate posterior distributions. For this reason, VAEs are widely used (over vanilla autoencoders) in latent diffusion models. I provide an in-depth comparative analysis of vanilla autoencoders, VAEs, and CVAEs, observing properties of the latent spaces generated by deterministic and probabilistic encoders. Dimensionality reduction techniques like principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) are used to visualize both the d-dimensional latent spaces and the high-dimensional pixel space. Kernel density estimates (KDEs) are also used to visually assess whether latent distributions appear univariate/bivariate normal.
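To make the difference between a deterministic and a probabilistic encoder concrete, here is a minimal sketch of the two ingredients a VAE adds on top of a vanilla autoencoder: the reparameterized sample and the closed-form KL term against a standard normal prior. It is written in plain Python rather than PyTorch so that it runs on its own. The helper names (`reparameterize`, `kl_to_standard_normal`, `one_hot`) and the one-hot concatenation used for conditioning are illustrative assumptions, not necessarily what the linked project does.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps, with eps ~ N(0, 1) drawn per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) ) for a diagonal Gaussian."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def one_hot(label, num_classes=10):
    """One-hot encode a digit label (assumed conditioning signal for a CVAE)."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


rng = random.Random(0)

# Hypothetical encoder outputs for a 2-dimensional latent space.
mu = [0.5, -1.0]
log_var = [0.0, -2.0]

z = reparameterize(mu, log_var, rng)

# A common way to condition the decoder: concatenate the latent code with the label.
decoder_input = z + one_hot(7)

kl = kl_to_standard_normal(mu, log_var)
```

A vanilla autoencoder would pass `mu` straight to the decoder. The sampling step and the KL penalty are what push the latent distribution toward the roughly normal shape that the KDE plots are meant to check.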

Figure

Figure

Source code can be found here.

topics: deep learning, image processing, variational inference


auralflow 🔊🎵

PyTorch-based application for training/evaluating source separation models to extract stems (e.g. vocals) from audio for music remixing.

Figure

As someone who previously pursued music production for a living, I was constantly searching for studio-quality acapellas (vocals only) for song remixing. While several models existed at the time (e.g. DEMUCS, Open-Unmix), there weren’t any modeling toolkits available for customizing such models. To aid my own model development and training/evaluation workflows, I designed a package with the following:

Figure

The project is not currently being maintained, but I plan on resuming development soon. For the source code, go here.

topics: deep learning, signal processing, audio


neural style transfer 🎨 📷

PyTorch-based implementations (from scratch) of several distinct deep learning approaches (optimization, transformation networks, CycleGAN) that aim to solve a popular problem in computer vision called style transfer.

Figure

Put simply, the task in style transfer is to generate an image that preserves the content of image x (i.e. semantics, shapes, edges, etc.) while matching the style of image y (i.e. textures, patterns, color, etc.). One may ask: what is the correct balance between content and style? As it turns out, the answer is more subjective than in typical optimization/ML problems - “beauty is in the eye of the beholder”, as they say.
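The content/style trade-off can be illustrated with a small sketch of the loss used in the optimization-based approach, where style is usually summarized by Gram matrices of feature maps. This is a plain-Python toy, not the project's PyTorch code. The function names, the normalization by the number of spatial positions, and the default weights are assumptions chosen for illustration.

```python
def gram_matrix(features):
    """Channel-by-channel correlations of a feature map.

    `features` is a list of C channels, each a flat list of N activations.
    Dividing by N is one common normalization; implementations differ.
    """
    n = len(features[0])
    return [[sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
            for fi in features]


def mse(a, b):
    """Mean squared error between two equally shaped 2-D lists."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    return sum((x - y) ** 2 for x, y in zip(flat_a, flat_b)) / len(flat_a)


def style_transfer_loss(generated, content, style,
                        content_weight=1.0, style_weight=10.0):
    """Weighted sum of a content term (features) and a style term (Gram matrices)."""
    content_loss = mse(generated, content)
    style_loss = mse(gram_matrix(generated), gram_matrix(style))
    return content_weight * content_loss + style_weight * style_loss


# Toy feature maps: 2 channels, 2 spatial positions each.
content_feats = [[1.0, 2.0], [3.0, 4.0]]
# Same activations with the spatial positions swapped.
style_feats = [[2.0, 1.0], [4.0, 3.0]]

loss_at_content = style_transfer_loss(content_feats, content_feats, style_feats)
loss_at_style = style_transfer_loss(style_feats, content_feats, style_feats)
```

In this toy the two feature maps differ only in spatial arrangement, so their Gram matrices are identical and the style term vanishes. That is the sense in which the Gram matrix captures texture statistics rather than layout. The ratio of `style_weight` to `content_weight` is the subjective knob mentioned above.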

Figure

For the source code, go here.

topics: deep learning, image processing, generative adversarial networks


ecoshopper ♻️ 🛒

AI-powered recycling assistant.

Ecoshopper was a prototype mobile web application designed as a final project for the Summer 2021 iteration of UC Berkeley’s CS 160. The core ML design of the app involved a pretrained VGG16 backbone that served as a rich feature extractor for transfer learning. Using Stanford’s TrashNet dataset along with human annotation, a downstream classifier was trained to distinguish recyclable goods from non-recyclable goods.

Figure

Model implementation and deployment were done with PyTorch and Django, and front-end development was done with React Native. For the source code, go here.

topics: deep learning, image classification, app development