Selected Personal Projects

Multimodal Web Agent


This project introduces a multimodal web agent capable of understanding, interpreting and generating both natural language and visual content. It's built on Microsoft's AutoGen framework and leverages OpenAI's Assistants API, drawing inspiration from the "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models" study. This project is also featured in the AutoGen project gallery.


Self Rewarding Language Model


This project explores "Self-Rewarding Language Models" from Yuan et al., 2024, using LLM-as-a-Judge so that a model can score its own outputs and self-improve. It combines Low-Rank Adaptation (LoRA) from Hu et al., 2021 with Meta's Llama 3 model, so the model can be adapted without full fine-tuning.
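As a rough illustration of the self-rewarding loop, the sketch below turns judge scores into a preference pair (best versus worst response), which is the kind of data a preference-optimization step would consume. It is a minimal sketch, not this project's code: `build_preference_pair` and `toy_judge` are hypothetical names, and the toy judge simply stands in for a real LLM-as-a-Judge call.

```python
from typing import Callable, List, Optional, Tuple


def build_preference_pair(
    prompt: str,
    candidates: List[str],
    judge: Callable[[str, str], float],
) -> Optional[Tuple[str, str]]:
    """Return (chosen, rejected) from judge scores, or None if all scores tie."""
    scored = [(judge(prompt, c), c) for c in candidates]
    best = max(scored, key=lambda s: s[0])
    worst = min(scored, key=lambda s: s[0])
    if best[0] == worst[0]:
        # No preference signal when every candidate gets the same score.
        return None
    return best[1], worst[1]


def toy_judge(prompt: str, response: str) -> float:
    """Hypothetical stand-in for an LLM judge: word count, capped at 5."""
    return min(5.0, float(len(response.split())))


pair = build_preference_pair("q", ["a", "a b c", "a b"], toy_judge)
print(pair)
```

In a real setup the judge would be the model itself, prompted with a scoring rubric, and the resulting pairs would feed the next training iteration.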


nanoGPT using xLSTM Architecture


This project aims to implement "xLSTM: Extended Long Short-Term Memory" from Beck et al., 2024, focusing on the sLSTM and mLSTM cells and the full xLSTM block. Training follows Andrej Karpathy's well-known nanoGPT repository on GitHub and is implemented in PyTorch. Note: Heavily in progress!
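For orientation, here is a minimal scalar sketch of one sLSTM step with exponential gating and the log-space stabilizer state described in the xLSTM paper. It assumes the exponential forget-gate variant and takes the gate pre-activations as plain numbers; the function name `slstm_step` and its argument names are hypothetical, and a real implementation would compute the pre-activations from learned weights on tensors.

```python
import math


def slstm_step(z_pre, i_pre, f_pre, o_pre, c_prev, n_prev, m_prev):
    """One scalar sLSTM step. Returns (h, c, n, m)."""
    z = math.tanh(z_pre)                   # cell input
    o = 1.0 / (1.0 + math.exp(-o_pre))     # sigmoid output gate
    log_i = i_pre                          # input gate is exp(i_pre)
    log_f = f_pre                          # forget gate is exp(f_pre) (assumed variant)
    # Stabilizer state: keeps the exponentials below from overflowing.
    m = max(log_f + m_prev, log_i)
    i_stab = math.exp(log_i - m)
    f_stab = math.exp(log_f + m_prev - m)
    c = f_stab * c_prev + i_stab * z       # cell state
    n = f_stab * n_prev + i_stab           # normalizer state
    h = o * c / n                          # normalized hidden state
    return h, c, n, m


# Start from zero states and run a few steps with arbitrary pre-activations.
state = (0.0, 0.0, 0.0)  # (c, n, m)
for z_pre, i_pre, f_pre, o_pre in [(0.5, 1000.0, 0.0, 0.0), (-1.0, 2.0, 1.0, 0.3)]:
    h, *state = slstm_step(z_pre, i_pre, f_pre, o_pre, *state)
    print(h)
```

Because the cell and normalizer states are rescaled by the same factor, the stabilizer leaves the hidden state unchanged while avoiding overflow for large gate pre-activations.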


Construction Surveillance System


This project establishes a surveillance system using Reolink cameras and NVIDIA Jetson platforms. The system includes a Flask API for camera control, image storage on an external NAS, and system monitoring through a Telegram bot. Object detection is served by a Flask detection API on the Jetson Xavier, using a custom object detection model based on Wang et al., 2022 and optimized with TensorRT. The architecture supports real-time surveillance with efficient storage and remote management.


Public Company Projects

Store Analytics


This project elevates in-store analytics by utilizing NVIDIA Jetson Nano edge devices and Stereolabs ZED 2 cameras for GDPR-compliant people tracking. By integrating data from multiple cameras to monitor the entire customer journey within stores, we transform brick-and-mortar retail spaces into analytically rich environments.


Construction Progress Tracking


This project introduces a 3D vision system that tracks construction progress and syncs it with the BIM model for better project oversight. Crane-mounted cameras provide complete site coverage, and privacy is protected through AI-driven pixelation and anonymization. The system's AI identifies site elements and updates the BIM model in real time, which enables remote progress monitoring, streamlines management, and helps optimize future processes.
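The pixelation step can be pictured with a small sketch: given a bounding box (which in practice would come from a person detector), each block of pixels inside the box is replaced by its mean value. This is an illustrative assumption, not the system's actual code; `pixelate_region` is a hypothetical helper, and the image is a plain grayscale grid of integers to keep the example dependency-free.

```python
def pixelate_region(image, top, left, height, width, block=2):
    """Return a copy of `image` with the given box pixelated in block-sized tiles."""
    out = [row[:] for row in image]  # leave the input untouched
    bottom = min(top + height, len(image))
    right = min(left + width, len(image[0]))
    for by in range(top, bottom, block):
        for bx in range(left, right, block):
            ys = range(by, min(by + block, bottom))
            xs = range(bx, min(bx + block, right))
            cells = [image[y][x] for y in ys for x in xs]
            mean = sum(cells) // len(cells)  # integer mean of the tile
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


frame = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
# Pixelate a 2x2 box in the top-left corner.
for row in pixelate_region(frame, top=0, left=0, height=2, width=2):
    print(row)
```

Larger block sizes discard more detail, so the block size would be chosen relative to how large faces or people appear in the camera view.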
