Machine Learning / Computer Science Engineer

Alex Kameni

Machine Learning Engineer | Data Scientist | AI Researcher

Passionate about leveraging data science and machine learning to solve complex problems, I thrive at the intersection of innovation and practical implementation. With a strong academic foundation and hands-on industry experience, I specialize in developing scalable AI solutions, from self-supervised learning to NLP and computer vision. My collaborative mindset and dedication to continuous learning drive me to push boundaries in both research and real-world applications.

About Me

I hold a Master’s in Complex Systems Engineering (2022) from CY Cergy Paris University, specializing in Machine Learning and Data Science, where my research at ETIS Laboratory (ENSEA) focused on continual learning for self-supervised computer vision models. Prior to this, I earned an Engineering Degree in Computer Science from the Polytechnic School of Yaoundé, Cameroon, completing thesis projects on cutting-edge ML topics.

Currently, as a Data Scientist at Ivalua, I enhance AI-driven invoice processing by integrating domain-specific language models and multimodal AI (LayoutLM). My work has optimized data pipelines, improved model efficiency, and accelerated AI adoption across the organization.

Beyond my professional role, I actively contribute to open-source AI projects spanning NLP for African history, self-supervised learning, and computer vision, reflecting my versatility and commitment to impactful technology.

Projects & Research

📂 LLM Output Parser: Effortless JSON/XML Extraction (Mar 2025)

Developed a Python tool to reliably parse JSON/XML from unstructured LLM outputs, streamlining data extraction for downstream applications.

🤖 LangGraph AgentFlow: Orchestrating Complex AI Agent Workflows (Mar 2025)

Created a Python library for automating multi-step AI agent workflows using LangGraph, enabling scalable and modular AI task automation.

🌍 Lightweight Language Identification Model (200 Languages) (Feb 2025)

Built a BERT-based model optimized for CPU that accurately detects 200+ languages, balancing performance and efficiency for edge deployment.

🔖 GraphRAG-Tagger: Topic Extraction & Graph Visualization Toolkit (Feb 2025)

Designed an end-to-end system for extracting topics from PDFs and visualizing knowledge graphs, enhancing GraphRAG-based retrieval.

🔍 Medivocate: AI-Powered Exploration of African History & Culture (Jan 2025)

An interactive platform using AI to uncover and share insights on African heritage, traditional medicine, and cultural narratives, fostering global appreciation.

📜 Dikoka: AI-Powered Document Analyzer for Historical Records (Dec 2024)

An LLM- and RAG-driven tool to analyze and contextualize historical documents, making archival research more accessible.

📍 Fine-Tuning GLiNER for Location Mention Recognition (LMR) (Sep 2024)

Enhanced GLiNER’s geospatial understanding for disaster response and location-based applications through targeted model optimization.

📡 Specializing LLMs for Telecom with RAG & Prompt Engineering (Jul 2024)

Adapted Falcon 7.5B and Phi-2 on the TeleQnA dataset, improving telecom-specific knowledge retrieval via Retrieval-Augmented Generation (RAG).

🛰️ Transformer-Based Object Detection for Aerial Imagery (May 2024)

Built a vision transformer model to classify roof types in rural Malawi, aiding infrastructure planning using satellite data.

🔬 Semi-Supervised Learning with Minimal Labels (Jan 2023)

Improved self-supervised models’ robustness in low-label scenarios, enhancing adaptability for real-world deployment.

🔄 Continual Self-Supervised Learning via Distillation & Replay (Sep 2022)

Tackled catastrophic forgetting in self-supervised models using knowledge distillation and memory replay techniques.

📊 Synthetic Financial Data Generation for ML (Dec 2021)

Developed methods to generate high-fidelity synthetic financial datasets, enabling secure and scalable ML training.

⚙️ Optimizing Workflow Resource Allocation (Sep 2021)

Designed a constraint-based optimization system to balance computational efficiency and task performance in workflows.

🗣️ NER for Automated Command Extraction (Jun 2021)

Built a Named Entity Recognition (NER) pipeline to extract executable commands from unstructured text, advancing task automation.

🤖 Semantic Query Understanding for NLP Systems (Sep 2020)

Enhanced intent detection and semantic parsing in NLP models to improve user query comprehension.